GB/T 13000: Evolution and historical versions
| Standard ID | Contents [version] | USD | [PDF] delivered in | Standard Title (Description) | Status |
|---|---|---|---|---|---|
| GB/T 13000-2025 | English | RFQ (ask) | 3 days [needs translation] | Information technology - Universal coded character set (UCS) | Valid |
| GB 13000-2010 | English | RFQ (ask) | 20 days [needs translation] | [GB/T 13000-2010] Information technology - Universal multiple-octet coded character set (UCS) | Valid |
| GB 13000.1-1993 | English | RFQ (ask) | 20 days [needs translation] | Information technology - Universal multiple-octet coded character set (UCS) - Part 1: Architecture and basic multilingual plane | Obsolete |
Basic data
| Field | Value |
|---|---|
| Standard ID | GB/T 13000-2025 (GB/T13000-2025) |
| Description (Translated English) | Information technology - Universal coded character set (UCS) |
| Sector / Industry | National Standard (Recommended) |
| Classification of Chinese Standard | L71 |
| Classification of International Standard | 35.040 |
| Word Count Estimation | 2938,212 |
| Date of Issue | 2025-01-24 |
| Date of Implementation | 2025-08-01 |
| Issuing agency(ies) | State Administration for Market Regulation, China National Standardization Administration |
GB/T 13000-2025: Information technology - Universal coded character set (UCS)
Note: This is a DRAFT version for illustration, not a final translation. The full English true-PDF copy (including equations, symbols, images, flow charts, tables and figures) will be translated manually upon order.
ICS 35.040
CCS L71
National Standard of the People's Republic of China
Replaces GB/T 13000-2010
Information technology - Universal coded character set (UCS)
(ISO/IEC 10646:2020, MOD)
Issued on 2025-01-24
Effective from 2025-08-01
Issued by the State Administration for Market Regulation and the National Standardization Administration
Table of Contents
Foreword
1 Scope
2 Normative references
3 Terms and Definitions
4 Compliance
4.1 General requirements
4.2 Compliance of information exchange
4.3 Equipment compliance
5 Electronic Data Annex
6 Overall structure of UCS
7 Basic structure and nomenclature
7.1 Structure
7.2 Character Encoding
7.3 Code Type
7.4 Character Naming
7.5 Character Aliases
7.6 Code Point Short Identifier (UID)
7.7 UCS sequence identifier
7.8 Octet Sequence Identifier
8 Revision and Update of UCS
9 Subset
9.1 Overview
9.2 Finite subsets
9.3 Selecting a subset
10 UCS encoding form
10.1 Overview
10.2 UTF-8
10.3 UTF-16
10.4 UTF-32
11 UCS encoding scheme
11.1 Overview
11.2 UTF-8
11.3 UTF-16BE
11.4 UTF-16LE
11.5 UTF-16
11.6 UTF-32BE
11.7 UTF-32LE
11.8 UTF-32
12 Control functions and UCS combined use
13 Declaration of identification characteristics
13.1 Purpose and reason of labeling
13.2 Identification of the UCS coding scheme
13.3 Identification of graphic character subsets
13.4 Identification of control function sets
13.5 Identification of the ISO/IEC 2022 coding system
14 Structure of code tables and character name lists
15 Block and collection names
15.1 Block Name
15.2 Collection Name
16 Mirror Characters in Bidirectional Contexts
16.1 Mirror Characters
16.2 Directionality of Bidirectional Context
17 Special characters
17.1 Overview
17.2 Space Characters
17.3 Currency Symbols
17.4 Format Characters
17.5 Ideographic Descriptors
17.6 Variation Selectors and Variation Sequences
18 Character appearance
19 Compatible characters
20 Character order
21 Combining characters
21.1 Sequence of combining characters
21.2 Combination categories and regular order
21.3 Display in the code table
21.4 Alternative Code Representations
21.5 Multiple Combination Characters
21.6 Collections containing combining characters
21.7 Combining grapheme connectors
22 Normalized form
23 Special features of individual scripts and symbol repertoires
23.1 The syllable compounding of Hangul (Korean)
23.2 Characteristics of the writing system in India and some South Asian countries
23.3 Byzantine musical notation
23.4 Etymological references for graphic symbols
24 CJK Chinese character etymology reference
24.1 Etymology Reference List
24.2 CJK Chinese Character Etymology Reference File
24.3 CJK unified Chinese character etymology reference information representation method
24.4 CJK compatible Chinese character etymology reference information representation method
25 Etymology of Xixia characters
25.1 Etymology Reference List
25.2 Etymology of Xixia characters Reference file
25.3 The representation of etymological reference information of Xixia characters
26 Etymology of Nüshu characters
26.1 Etymology Reference List
26.2 Etymology of Nüshu characters Reference document
27 Character names and notes
27.1 Entity Name
27.2 Name structure
27.3 Single name
27.4 Name Invariance
27.5 Name Uniqueness
27.6 Character names of CJK Chinese characters
27.7 Names of Xixia characters
27.8 Names of characters in Nüshu
27.9 Character names of Khitan small characters
27.10 Character names of Hangul (Korean) syllables
28 Named UCS sequence identifier
29 The structure of the Basic Multilingual Plane (BMP)
30 Structure of the Supplementary Multilingual Plane (SMP) for scripts and symbols
31 Structure of the Supplementary Ideographic Plane (SIP)
32 Structure of the Third Ideographic Plane (TIP)
33 Structure of the Supplementary Special Purpose Plane (SSP)
34 Code table and character name list
34.1 Overview
34.2 Code Table
34.3 List of character names
34.4 Standardized Variation Sequence Summary
34.5 Code table and character name list
Appendix A (Informative) Characters used in identifiers
Appendix B (Informative) External References to Character Vocabularies
B.1 Character Vocabulary and Its Encoding Method
B.2 Identification of ASN.1 Character Abstract Syntax
B.3 Identification of ASN.1 Character Transfer Syntax
Appendix C (Informative) Notation for Octet Value Representation
Appendix D (Informative) Recommendations for combined receiving and transmitting equipment with internal memory
Appendix E (Normative) Collection of graphic characters for subsets
E.1 Collection of coded graphic characters
E.2 Block Name List
E.3 Fixed collections of the entire UCS (except the Unicode collection)
E.4 CJK Collection
E.5 Other collections
E.6 Unicode Collection
Appendix F (Informative) Alphabetical list of character names
Appendix G (Informative) Format Characters
G.1 Overview
G.2 General format characters
G.3 Format characters applicable to specific text
G.4 Interline comment characters
G.5 Reverse format characters
G.6 Shorthand format characters
G.7 Invisible mathematical operators
G.8 Western Music Notation
G.9 Language notation using tag characters
Appendix H (Informative) Ideographic Descriptors
H.1 Overview
H.2 Syntax of ideographic character description sequences
H.3 Individual definitions of ideographic descriptors
Appendix I (Informative) CJK Chinese Character Recognition and Sorting Rules
I.1 Overview
I.2 Approval Procedure
I.3 Sorting Procedure
I.4 Source Font Separation Example
I.5 Disagreement with Example
I.6 Supplementary Notes on "Recognition and Sorting Rules of CJK Chinese Characters"
Appendix J (Informative) Character Naming Guidelines
Appendix K (Informative) Names of Hangul (Korean) Syllables
Appendix L (Informative) Supplementary Notes on Some Chinese Characters and Symbols
L.1 Overview
L.2 Names of some Chinese characters and symbols and their English translations in this document
L.3 Explanation of some Chinese characters in the code table
Appendix M (Informative) Additional Notes on CJK Unified Chinese Characters
Appendix N (Normative) Code table and character name list
References
Foreword
This document was drafted in accordance with the provisions of GB/T 1.1-2020 "Guidelines for standardization work - Part 1: Structure and drafting rules for standardization documents".
This document replaces GB/T 13000-2010 "Information technology - Universal multiple-octet coded character set (UCS)". Compared with GB/T 13000-2010, in addition to structural adjustments and editorial changes, the main technical changes are as follows:
a) Added provisions for the Third Ideographic Plane (TIP), the three encoding forms of UCS and the seven encoding schemes to the scope (see Chapter 1);
b) Modified some of the terms and definitions (see Chapter 3; Chapter 4 of the 2010 edition);
c) Changed the provisions on information exchange conformity (see 4.2; 2.2 of the 2010 edition) and equipment conformity (see 4.3; 2.3 of the 2010 edition);
d) Added a chapter on "Electronic Data Annexes" (see Chapter 5);
e) Changed the scope of the UCS code space and added the Third Ideographic Plane (TIP) to the overall structure of UCS (see Chapter 6 and 7.1; Chapter 5 and 6.1 of the 2010 edition);
f) Changed the character encoding format (see 7.2; 6.2 of the 2010 edition), deleted the octet order (see 6.3 of the 2010 edition), added the code type specification (see 7.3), changed the format of the code point short identifier (UID) (see 7.6; 6.5 of the 2010 edition), and added provisions for octet sequence identifiers (see 7.8);
g) Added a chapter on "UCS encoding form" (see Chapter 10);
h) Added a chapter on "UCS encoding scheme" (see Chapter 11);
i) Deleted the chapter "Level of Implementation" (see Chapter 14 of the 2010 edition);
j) Added the UCS encoding scheme identifier (see 13.2), and deleted the identifier of the UCS encoding representation with implementation level (see 16.2 of the 2010 edition);
k) Changed the provisions on mirror characters (see Chapter 16; Chapter 19 and Appendix E of the 2010 edition);
l) Added provisions for ideographic descriptors (see 17.5) and for variation selectors and variation sequences (see 17.6);
m) Added rules for combination categories and regular order (see 21.2) and for combining grapheme connectors (see 21.7);
n) Added the characteristics of the writing systems of India and some South Asian countries (see 23.2) and etymological references for graphic symbols (see 23.4);
o) Added the CJK Chinese character etymology standard classification (see 24.1), changed the etymology reference list (see 24.1; 27.1 of the 2010 edition), and added the CJK Chinese character etymology reference file (see 24.2), the CJK unified Chinese character etymology reference information representation method (see 24.3) and the CJK compatible Chinese character etymology reference information representation method (see 24.4);
p) Added the chapters "Etymology of Xixia characters", "Etymology of Nüshu characters", "Character names and notes" and "Named UCS sequence identifier" (see Chapters 25, 26, 27 and 28);
q) Added new languages (see Chapters 29, 30, 31 and 33);
r) Added a chapter on "Structure of the Third Ideographic Plane (TIP)" (see Chapter 32);
s) Changed the code table, character name list and character sources (see Chapter 34; Chapter 33 of the 2010 edition);
t) Changed the list of graphic character collections used for subsets (see Appendix E; Appendix A of the 2010 edition).
This document adopts, with modifications, ISO/IEC 10646:2020 "Information technology - Universal coded character set (UCS)" and ISO/IEC 10646:2020/Amd 1:2023.
Compared with ISO/IEC 10646:2020 and ISO/IEC 10646:2020/Amd 1:2023, this document has the following structural adjustments:
--- To avoid hanging paragraphs, 27.5.1, G.1, I.1, I.2.3.1 and I.2.4.1 are added, and the subsequent subclauses are renumbered accordingly;
--- Note 2 of 34.1 corresponds to Annex M of ISO/IEC 10646:2020;
--- Note 4 of 34.2 corresponds to Annex Q of ISO/IEC 10646:2020;
--- Appendix A corresponds to Annex U of ISO/IEC 10646:2020;
--- Appendix B corresponds to Annex N of ISO/IEC 10646:2020;
--- Appendix C corresponds to Annex K of ISO/IEC 10646:2020;
--- Appendix D corresponds to Annex J of ISO/IEC 10646:2020;
--- Appendix E corresponds to Annex A of ISO/IEC 10646:2020;
--- Appendix F corresponds to Annex G of ISO/IEC 10646:2020;
--- Appendix G corresponds to Annex F of ISO/IEC 10646:2020;
--- Appendix H corresponds to Annex I of ISO/IEC 10646:2020, with H.2.1 and H.2.2 added;
--- Appendix I corresponds to Annex S of ISO/IEC 10646:2020;
--- Appendix J corresponds to Annex L of ISO/IEC 10646:2020;
--- Appendix K corresponds to Annex R of ISO/IEC 10646:2020;
--- Appendix M corresponds to Annex P of ISO/IEC 10646:2020;
--- Appendix N corresponds to 34.5 of ISO/IEC 10646:2020;
--- Annex B, Annex C, Annex D, Annex E, Annex H and Annex T of ISO/IEC 10646:2020 are deleted.
The technical differences between this document and ISO/IEC 10646:2020 and ISO/IEC 10646:2020/Amd 1:2023, and the reasons for them, are as follows:
--- Moved the supplementary content of the entity name formation rules applicable to non-English versions from the notes to the main text, to comply with the national standard drafting rules;
--- Deleted the contents of withdrawn documents or standards and explained them in footnotes, to comply with the national standard drafting rules;
--- Corrected the inaccurate standard numbers cited in the international standard and explained them in footnotes, to comply with the national standard drafting rules.
Other new additions are consistent with ISO/IEC 10646:2020/Amd 1:2023.
The following editorial changes were made to this document.
--- Changed the note to the term "dedicated plane" and added other commonly used translations;
--- Added notes on the use of special characters;
--- Added content about CJK Chinese character recognition and sorting;
--- Added a note on the character source list, the content of which comes from Annex M of ISO/IEC 10646:2020;
--- Added a note on how to look up the Korean syllable code mapping table, the content of which comes from Annex Q of ISO/IEC 10646:2020;
--- Added Chinese translations of block names;
--- Changed the annex number of the referenced document in the content on identification of the ASN.1 character transfer syntax;
--- Added character names that were omitted from ISO/IEC 10646:2020/Amd 1:2023 and are not listed in the character name list file;
--- Added Appendix L (Informative) "Supplementary Notes on Some Chinese Characters and Symbols";
--- Added references.
Please note that some of the contents of this document may involve patents. The issuing organization of this document does not assume the responsibility for identifying patents.
This document was proposed and coordinated by the National Technical Committee for Information Technology Standardization (SAC/TC28).
This document was drafted by: China Electronics Technology Standardization Institute, Language and Literature Application Research Institute of the Ministry of Education, Software Research Institute of the Chinese Academy of Sciences, Beijing Founder Electronics Co., Ltd.
The main drafters of this document: Chen Zhuang, Chen Xiaoyan, He Zhengan, Huang Shanshan, Wang Xiaoming, Wu Jian, Chen Ken.
The previous versions of this document and the documents it replaces are as follows.
--- First published in 1993 as GB 13000.1-1993 "Information technology - Universal multiple-octet coded character set (UCS) - Part 1: Architecture and basic multilingual plane";
--- First revised in 2010 as GB/T 13000-2010 "Information technology - Universal multiple-octet coded character set (UCS)";
--- This is the second revision.
Information technology Universal Coded Character Set (UCS)
1 Scope
This document:
--- Specifies the architecture of UCS;
--- Defines the terms used in UCS;
--- Describes the overall structure of the UCS code space;
--- Specifies the allocated planes of UCS: the Basic Multilingual Plane (BMP), the Supplementary Multilingual Plane (SMP), the Supplementary Ideographic Plane (SIP), the Third Ideographic Plane (TIP), and the Supplementary Special Purpose Plane (SSP);
--- Defines graphic character sets for writing the various languages of the world;
--- Specifies the names and encodings of the graphic characters and format characters of the BMP, SMP, SIP, TIP and SSP of UCS;
--- Specifies the encoding of control characters and special characters;
--- Specifies three encoding forms of UCS: UTF-8, UTF-16 and UTF-32;
--- Specifies seven UCS encoding schemes: UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE and UTF-32LE;
--- Specifies the management methods for coded characters to be added in the future.
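The five allocated planes named above can be illustrated with a small sketch. It relies on the widely known plane numbering of UCS/Unicode (BMP = 0, SMP = 1, SIP = 2, TIP = 3, SSP = 14), which this excerpt does not itself state, and the helper name `plane_of` is hypothetical. It is an illustration only, not part of the standard.

```python
# Plane numbers are the commonly documented UCS/Unicode assignments (an
# assumption here, since this excerpt does not list them). Names follow the scope.
PLANES = {
    0: "BMP (Basic Multilingual Plane)",
    1: "SMP (Supplementary Multilingual Plane)",
    2: "SIP (Supplementary Ideographic Plane)",
    3: "TIP (Third Ideographic Plane)",
    14: "SSP (Supplementary Special Purpose Plane)",
}


def plane_of(code_point: int) -> str:
    """Return a label for the plane that contains the given code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"not a UCS code point: {code_point:#x}")
    plane = code_point >> 16  # each plane holds 65,536 (0x10000) code points
    return PLANES.get(plane, f"plane {plane} (not one of the five planes named in the scope)")


if __name__ == "__main__":
    for cp in (0x0041, 0x1F600, 0x20000, 0x30000, 0xE0001):
        print(f"U+{cp:04X}: {plane_of(cp)}")
```

Shifting the code point right by 16 bits is enough because every plane spans exactly 0x10000 code points.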
This document is applicable to technical products that have the function of information processing and exchange of graphic characters in various languages around the world.
Note: This document does not specify whether the characters it encodes are suitable for use as identifiers in programming languages. Appendix A gives information on characters suitable for use as identifiers.
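To make the seven encoding schemes listed in the scope concrete, the sketch below serializes one supplementary-plane character with each of them. It assumes that Python's built-in codecs (`utf-8`, `utf-16-be`, and so on) are an adequate stand-in for the schemes defined in Chapter 11; the mapping table and the helper `encode_all` are hypothetical names introduced for illustration. Note that Python's unsuffixed `utf-16` and `utf-32` codecs prepend a byte order mark in the machine's native byte order, which is a behavior of Python rather than a statement about this standard.

```python
# Scheme names as listed in the scope, mapped to Python codec names
# (the mapping is an assumption made for this illustration).
SCHEMES = {
    "UTF-8": "utf-8",
    "UTF-16": "utf-16",
    "UTF-16BE": "utf-16-be",
    "UTF-16LE": "utf-16-le",
    "UTF-32": "utf-32",
    "UTF-32BE": "utf-32-be",
    "UTF-32LE": "utf-32-le",
}


def encode_all(text: str) -> dict:
    """Serialize text to bytes under each of the seven encoding schemes."""
    return {name: text.encode(codec) for name, codec in SCHEMES.items()}


# U+20000 lies outside the BMP, so UTF-16 needs a surrogate pair for it.
encoded = encode_all("\U00020000")

if __name__ == "__main__":
    for name, data in encoded.items():
        print(f"{name:9} {len(data)} bytes  {data.hex()}")
```

The three encoding forms differ in code unit size (8, 16 or 32 bits); the schemes additionally fix the byte order in which those code units are written, which is why UTF-16 and UTF-32 each appear three times.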
Reference documents.
2 Normative references
The contents of the following documents, through normative references in the text, constitute essential provisions of this document.
For dated references, only the version corresponding to that date applies to this document; for undated references, the latest version (including all amendments) applies to this document.
ISO/IEC 2022 Information technology - Character code structure and extension techniques
Note: GB/T 2311-2000 Information technology - Character code structure and extension techniques (ISO/IEC 2022:1994, IDT)
ISO/IEC 6429 Information technology - Control functions for coded character sets