
GB 18030-2022 PDF in English


Standard ID | Contents [version] | USD | [PDF] delivered in | Name of Chinese Standard | Status
GB 18030-2022 | English | 5005 | 0-9 seconds. Auto-delivery. | Information technology - Chinese coded character set | Valid
GB 18030-2005 | English | 4690 | 0-9 seconds. Auto-delivery. | Information technology -- Chinese coded character set | Obsolete
GB 18030-2000 | English | RFQ | 3 days | Information technology-Chinese ideograms coded character set for information interchange-Extension for the basic set | Obsolete
PDF Preview


GB NATIONAL STANDARD OF THE PEOPLE’S REPUBLIC OF CHINA
ICS 35.040
CCS L 71
GB 18030-2022
Replacing GB 18030-2005

Information technology - Chinese coded character set

ISSUED ON: JULY 19, 2022
IMPLEMENTED ON: AUGUST 01, 2023
Issued by: State Administration for Market Regulation; Standardization Administration of the People's Republic of China.

Table of Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Repertoire
5 Overall structure
6 Sequence of characters
7 Code point allocation
8 Explanation of some characters and codes
9 Implementation level
Annex A (normative) Character table of double-byte
Annex B (normative) Ideographic descriptors
Annex C (normative) Character table of four-byte
Annex D (informative) Explanation of some characters and codes
Annex E (informative) Code positions of Chinese characters in "General Standard Chinese Character List"
Bibliography

Foreword
This document was drafted in accordance with the rules given in GB/T 1.1-2020 "Directives for standardization - Part 1: Rules for the structure and drafting of standardizing documents".
This document replaces GB 18030-2005 "Information technology - Chinese coded character set". Compared with GB 18030-2005, in addition to the structural modifications and editorial changes, the main technical changes in this document are as follows:
a) Add the applicable objects of this document (see Chapter 1 of this Edition);
b) In the double-byte coding area, change the GB/T 13000 code positions corresponding to 10 vertical punctuation marks and 8 Chinese character components.
Delete 6 duplicate-coded Chinese character components and 9 duplicate-coded Chinese characters (see Annex D of this Edition, Annex A of Edition 2005);
c) In the four-byte coding area, change 18 GB/T 13000 code positions (see Annex D of this Edition, Annex D of Edition 2005);
d) In the part of four-byte code 0x82358F33~0x82359636, add the 66 Chinese characters newly added to CJK unified Chinese characters (see Annex C of this Edition);
e) In the part of four-byte code 0x9835F738~0x98399E36, add 4149 Chinese characters of CJK unified Chinese character extension C (see Annex C of this Edition);
f) In the part of four-byte code 0x98399F38~0x9839B539, add 222 Chinese characters of CJK unified Chinese character extension D (see Annex C of this Edition);
g) In the part of four-byte code 0x9839B632~0x9933FE33, add 5762 Chinese characters of CJK unified Chinese character extension E (see Annex C of this Edition);
h) In the part of four-byte code 0x99348138~0x9939F730, add 7473 Chinese characters of CJK unified Chinese character extension F (see Annex C of this Edition);
i) In the part of four-byte code 0x81398B32~0x8139A035, add 214 Kangxi radicals (see Annex C of this Edition);
j) In the part of four-byte code 0x8134F932~0x81358437, add 83 Xishuangbanna New Dai characters (see Annex C of this Edition);

Information technology - Chinese coded character set

1 Scope
This document specifies the hexadecimal representation of Chinese graphic characters and their binary codes used in information technology.
This document applies to the processing, exchange, storage, transmission, presentation, input and output of Chinese and other graphic character information.
This document is applicable to technical products with information processing and exchange functions for Chinese and other text and graphic characters, including but not limited to software products such as input methods, optical character recognition (OCR), editing and proofreading, machine translation, speech synthesis, text transcription and intelligent writing, as well as hardware products such as computers, communication terminal equipment, e-book readers and learning machines.

2 Normative references
The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
GB/T 2312-1980, Code of Chinese graphic character set for information interchange - Primary set
GB/T 11383-1989, Information processing - 8-bit code for information interchange - Structure and rules for implementation
GB/T 13000, Information technology - Universal multiple-octet coded character set (UCS)

3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
3.1 character
An element in a collection of elements used to organize, control, or represent data.
3.2 coded character
A character (3.1) and its coded representation.
3.3 private use area
An area whose content can be specified by the user of a product conforming to this document.
3.4 repertoire
A specified set of characters (3.1) represented by a coded character (3.2) set.
3.5 reserved zone
An area reserved for future specification by this document.

4 Repertoire
4.1 Overview
The characters included in this document are coded in single-byte, double-byte or four-byte.
4.2 Part of single-byte
In this document, the part of single-byte includes all 128 characters from 0x00 to 0x7F of GB/T 11383-1989.
4.3 Part of double-byte
The part of double-byte includes all graphic characters in GB/T 2312-1980, CJK unified Chinese characters and some graphic characters in GB/T 13000. The characters in the part of double-byte are in accordance with the provisions in Annex A. Among them, the graphics, code positions and functions of ideographic descriptors shall comply with the provisions of Annex B.
NOTE: GB/T 13000 uniformly encodes Chinese characters used in China, Japan, South Korea, Vietnam and other countries and regions. Chinese characters with unique abstract glyphs are assigned a separate code position. Chinese characters with different sources but the same abstract glyphs are given a common code position. The encoded Chinese characters are called CJK unified Chinese characters (CJK Unified Ideographs), where CJK means China, Japan, and Korea.
4.4 Part of four-byte
The part of four-byte includes 66 CJK unified Chinese characters (9FA6~9FEF, excluding 9FB4~9FBB) in GB/T 13000 other than the above-mentioned double-byte characters, CJK unified Chinese character extension A, CJK unified Chinese character extension B, CJK unified Chinese character extension C, CJK unified Chinese character extension D, CJK unified Chinese character extension E, CJK unified Chinese character extension F and the characters of ethnic minorities that have been coded in GB/T 13000. The characters in the part of four-byte follow the provisions of Annex C.

5 Overall structure
In the text, all numbers marked with 0x are in hexadecimal. Those not marked with 0x are in decimal. All coded representations in the annexes are expressed in hexadecimal. All other numbers are expressed in decimal.
The part of single-byte adopts the encoding structure of GB/T 11383-1989 and uses code points 0x00~0x7F.
The part of double-byte uses a string of two octets to represent a character. The first byte is in the range 0x81~0xFE. The trailing byte is in the range 0x40~0x7E or 0x80~0xFE.
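The single-byte and double-byte ranges above can be checked mechanically. The following is a minimal Python sketch, not part of the standard; the function names are hypothetical, and it tests only whether bytes fall within the structural ranges of Clause 5, not whether a code position is actually assigned a character in Annex A.

```python
def is_single_byte(b: int) -> bool:
    """True if b lies in the single-byte range 0x00~0x7F."""
    return 0x00 <= b <= 0x7F


def is_double_byte(lead: int, trail: int) -> bool:
    """True if (lead, trail) fits the double-byte structure.

    Lead byte: 0x81~0xFE. Trailing byte: 0x40~0x7E or 0x80~0xFE
    (0x7F and 0xFF are excluded).
    """
    lead_ok = 0x81 <= lead <= 0xFE
    trail_ok = 0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE
    return lead_ok and trail_ok


def count_double_byte_positions() -> int:
    """Count every (lead, trail) pair accepted by is_double_byte."""
    return sum(
        is_double_byte(lead, trail)
        for lead in range(256)
        for trail in range(256)
    )


if __name__ == "__main__":
    print(is_double_byte(0xD6, 0xD0))   # structurally valid pair
    print(is_double_byte(0x81, 0x7F))   # 0x7F is not a valid trailing byte
    print(count_double_byte_positions())
```

Under these ranges the double-byte structure offers 126 lead bytes times 190 trailing bytes, i.e. 23940 code positions. A trailing byte in 0x30~0x39 is rejected here, which is what leaves that range free for other use.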
The part of four-byte uses 0x30~0x39, which is not used in GB/T 11383-1989, as the suffix that extends the double-byte code. The encoding range is 0x81308130~0xFE39FE39.
The encoding range of the first byte of a four-byte character is 0x81~0xFE. The encoding range of the second byte is 0x30~0x39. The encoding range of the third byte is 0x81~0xFE. The encoding range of the fourth byte is 0x30~0x39. That is:
0x81308130 ~ 0x81308139;
0x81308230 ~ 0x81308239;
...
0x8130FE30 ~ 0x8130FE39;
0x81318130 ~ 0x81318139;
...
0x8131FE30 ~ 0x8131FE39;
...
0x82308130 ~ 0x82308139;
...
0x8230FE30 ~ 0x8230FE39;
...
0xFE308130 ~ 0xFE308139;
......

9 Implementation level
This document specifies three implementation levels. System software products that meet the corresponding implementation level shall provide input and output functions for all characters within the corresponding implementation level.
9.2 Implementation level 1
Implementation level 1 supports the single-byte part and the double-byte part of this document, together with the following portions of the four-byte part: CJK unified Chinese characters (i.e., 0x82358F33~0x82359636) and CJK unified Chinese character extension A (i.e., 0x8139EE39~0x82358738). Any product to which this document applies shall meet the requirements for implementation level 1.
NOTE: According to the needs of software applications, implementation level 1 can also choose to support any one or more non-Chinese characters listed in Table 3.
9.3 Implementation level 2
Implementation level 2 contains implementation level 1. In addition, implementation level 2 also supports the encoded Chinese characters in the "General Standard Chinese Character List" that are not included in implementation level 1. See Annex E for the code positions and glyphs, in this document, of the Chinese characters included in the "General Standard Chinese Character List". System software and supporting software shall meet the requirements for implementation level 2.
NOTE: System software and supporting software include but are not limited to operating systems, database management systems, and middleware (see GB/T 36475 for information on software product classification).
9.4 Implementation level 3
Implementation level 3 contains implementation level 2. In addition, implementation level 3 also supports all Chinese characters specified in this document and the Kangxi radicals in Table 3. Products used for government services and public services shall meet the requirements of level 3.
NOTE: Government services and public service industries include but are not limited to railway transportation, road transportation, water transportation, air transportation, multimodal transportation and transportation agency, postal services, monetary and financial services, insurance, land management, health, national institutions, social security, etc. (see GB/T 4754 for industry classification information).
......
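The four-byte structure in Clause 5 and the four-byte ranges quoted in the Foreword and in 9.2 can be cross-checked with a short calculation. The Python sketch below is illustrative only and not part of the standard. It assumes code positions are counted in the order of the Clause 5 listing (fourth byte varying fastest, then the third, second and first bytes). The names `four_byte_index`, `positions_in_range` and `FOREWORD_RANGES` are hypothetical.

```python
def four_byte_index(code: int) -> int:
    """Map a four-byte code such as 0x81308130 to a zero-based linear index.

    Bytes 1 and 3 take 126 values (0x81~0xFE); bytes 2 and 4 take
    10 values (0x30~0x39). Raises ValueError for codes outside that structure.
    """
    b1, b2, b3, b4 = code.to_bytes(4, "big")
    if not (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
            and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
        raise ValueError(f"not a four-byte code: {code:#010x}")
    return (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)


def positions_in_range(first: int, last: int) -> int:
    """Number of code positions from first to last, inclusive."""
    return four_byte_index(last) - four_byte_index(first) + 1


# Ranges and character counts exactly as quoted in the Foreword.
FOREWORD_RANGES = {
    "CJK unified (new)": (0x82358F33, 0x82359636, 66),
    "Extension C": (0x9835F738, 0x98399E36, 4149),
    "Extension D": (0x98399F38, 0x9839B539, 222),
    "Extension E": (0x9839B632, 0x9933FE33, 5762),
    "Extension F": (0x99348138, 0x9939F730, 7473),
    "Kangxi radicals": (0x81398B32, 0x8139A035, 214),
    "New Dai": (0x8134F932, 0x81358437, 83),
}

if __name__ == "__main__":
    for name, (first, last, stated) in FOREWORD_RANGES.items():
        span = positions_in_range(first, last)
        print(f"{name}: {span} positions, {stated} characters stated")
```

Under this assumption, the spans for extensions C, D, E and F and for the Kangxi radicals equal the character counts stated in the Foreword. The range 0x82358F33~0x82359636 spans 74 positions for 66 characters, which agrees with 4.4 (9FA6~9FEF is 74 positions, less the 8 excluded positions 9FB4~9FBB). The New Dai range spans 96 positions for 83 characters, which suggests that some positions in that range are not assigned.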
Source: Above contents are excerpted from the PDF -- translated/reviewed by: www.chinesestandard.net / Wayne Zheng et al.