| Standard ID | Standard Title | Status |
| GB/T 42382.1-2023 | Information technology - Neural network representation and model compression - Part 1: Convolutional neural network | Valid |
Basic data
| Standard ID | GB/T 42382.1-2023 (GB/T42382.1-2023) |
| Description (Translated English) | Information technology - Neural network representation and model compression - Part 1: Convolutional neural network |
| Sector / Industry | National Standard (Recommended) |
| Classification of Chinese Standard | L71 |
| Classification of International Standard | 35.040 |
| Word Count Estimation | 254,221 |
| Date of Issue | 2023-03-17 |
| Date of Implementation | 2023-10-01 |
| Issuing agency(ies) | State Administration for Market Regulation; Standardization Administration of China |
ICS 35.040
CCS L71
National Standard of the People's Republic of China
Information technology - Neural network representation and model compression -
Part 1: Convolutional neural networks
Issued on 2023-03-17
Implemented on 2023-10-01
Issued by the State Administration for Market Regulation and the Standardization Administration of China
Table of Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviations
5 Conventions
5.1 General rules
5.2 Arithmetic operators
5.3 Logical operators
5.4 Relational operators
5.5 Bitwise operators
5.6 Assignment
5.7 Mathematical functions
5.8 Structural relational symbols
5.9 Description of the parsing and decoding processes
6 Syntax and semantics of neural network models
6.1 Data structures
6.2 Syntax description
6.3 Semantic description
7 Compression process
7.1 Multiple models
7.2 Quantization
7.3 Pruning
7.4 Structured matrices
8 Decompression process (decoding the representation)
8.1 Multiple models
8.2 Dequantization
8.3 De-sparsification/de-pruning operations
8.4 Structured matrices
9 Data generation methods
9.1 Definitions
9.2 Training data generation methods
9.3 Multiple models
9.4 Quantization
9.5 Pruning
9.6 Structured matrices
10 Codec representation
10.1 Syntax and semantics of neural network model weight-compressed bitstreams
10.2 Syntax description of weight-compressed bitstreams
10.3 Semantic description of weight-compressed bitstreams
10.4 Weight-compressed bitstream parsing process
10.5 Weight-compressed bitstream decoding process
11 Model protection
11.1 Model protection definition
11.2 Model encryption process
11.3 Model decryption process
11.4 Data structure definition of the ciphertext model
Appendix A (Informative) Patent list
References
Foreword
This document was drafted in accordance with the provisions of GB/T 1.1-2020, "Directives for standardization - Part 1: Rules for the structure and drafting of standardizing documents".
This document is Part 1 of GB/T 42382, "Information technology - Neural network representation and model compression". GB/T 42382 has published the following part:
--- Part 1: Convolutional neural networks.
This document was proposed by, and is under the jurisdiction of, the National Technical Committee 28 on Information Technology of the Standardization Administration of China (SAC/TC 28).
Drafting organizations of this document: Peking University, Pengcheng Laboratory, Shenzhen HiSilicon Semiconductor Co., Ltd., Xilinx Electronic Technology (Beijing) Co., Ltd., Hangzhou Hikvision Digital Technology Co., Ltd., Beijing Baidu Netcom Science and Technology Co., Ltd., Shenzhen Tencent Computer Systems Co., Ltd., Huawei Technologies Co., Ltd., Xiamen University, China Electronics Standardization Institute, Institute of Automation of the Chinese Academy of Sciences, Zhejiang University, University of Science and Technology of China, Shanghai Jiao Tong University, Tsinghua University, and the Zhongguancun Audiovisual Industry Technology Innovation Alliance.
Main drafters of this document: Tian Yonghong, Yang Fan, Ji Rongrong, Shan Yi, Chen Guangyao, Yan Zhaoyi, Zheng Xiawu, Pu Shiliang, Tan Wenming, Li Zheyang, Peng Bo, Zhong Gang, Zhao Hengrui, Duan Wenhong, Hu Haoji, Li Xiang, Luo Yang, Wang Wei, Xu Yixing, Li Huixia, Lin Shaohui, Wang Peisong, Zhao Yi, Hu Xiaoguang, Zheng Huihuang, Jiang Jiajun, Ma Jincheng, Cheng Jian, Jiang Fan, Zhu Wenwu, Wang Xiaojuan, Gao Wen, Huang Tiejun, Zhao Haiying, and Ma Shanshan.
Introduction
Neural network representation and model compression are important components of the artificial intelligence technology system and a key prerequisite for applying artificial intelligence across the industries of the national economy. At present, however, algorithm platforms from different sources cannot interoperate and models cannot be converted between them, which restricts the dissemination and application of artificial intelligence technology. To ensure the cross-platform operability of artificial intelligence technology and improve model reuse, this standard standardizes neural network representation and model compression, which will drive the healthy and rapid development of the artificial intelligence industry. GB/T 42382 aims to establish specifications for the representation and model compression of convolutional neural networks, large-scale pre-trained networks, and graph neural networks, and is proposed to consist of three parts:
--- Part 1: Convolutional neural networks. The goal is to establish a representation and model compression standard for convolutional neural networks.
--- Part 2: Large-scale pre-trained models. The goal is to establish model representation, model compression, and model transfer standards suitable for large-scale pre-trained networks.
--- Part 3: Graph neural networks. The goal is to establish representations for graph data and graph neural networks, and to define an encoding format standard for graph neural network models.
The issuer of this document draws attention to the fact that declaring compliance with this document may involve patents related to the following clauses: 6, 7, 8, 9, 10 and 11 and "Neural Network Representation Standard Frame Structure" (Patent No. 201810575097.7); 7.1.4, 8.1.3, 9.3.2 and "Quantization Method and System Based on Neural Network Differences" (Patent No. 201910478617.7); 7.2, 8.2, 9.4 and "A Neural Network Quantization Method Based on Parameter Norm" (Patent No. 201810387893.8); 7.4.2, 8.4.2, 9.6.2 and "A Neural Network Computation Method and Device" (Patent No. PCT/CN2018/101598); 7.1.3, 8.1.2, 9.3.1 and "A Neural Network Model, Data Processing Method and Processing Device" (Patent No. PCT/CN2019/085885) and "A Neural Network Model, Data Processing Method and Processing Device" (Patent No. 201810464380.2); 7.2.3.3, 7.2.4.3, 8.[...] and "[...] Generation Method, Neural Network Compression Method and Related Devices and Equipment" (Patent No. 201910254752.3); 7.2.4.1, 8.2.4.1, 9.4.2.2 and "Model Training Method, Device, Storage Medium, and Program Product" (Patent No. PCT/CN2019/129265); 7.2, 8.2 and "Neural Network Model and Training Method and Device" (Patent No. CN202010144315.9), "Neural Network Model Quantization Method and Device" (Patent No. CN202010143782.X), "Neural Network Model Quantization Method and Device" (Patent No. CN202010144339.4), "Neural Network Model [...]"; [...]4, 8.4 and "NEURAL NETWORK DATA PROCESSING APPARATUS, METHOD AND ELECTRONIC DEVICE" (Patent No. US16/893,044); 7.2, 8.2, 9.4 and "Deep Neural Network Compression Method Based on Nonlinear Quantization of Multi-Bit Neural Networks" (Patent No. 201910722230.1); 7.3, 8.3, 9.5 and "An Efficient Image Classification Method Based on Structured Pruning" (Patent No. 201910701012.X); 7.3, 8.3, 9.5 and "A Structured Sparsification Method for Neural Networks Based on Incremental Regularization" (Patent No. 201910448309.X); 11 and "A Processing Method, Device and Equipment for Model Data" (Patent No. 201911230340.2); [...] (Patent No. 2019104537988); 7.4, 8.4, 9.6 and "Acceleration and Compression Method for Deep Convolutional Neural Networks Based on Tensor Decomposition" (Patent No. 201610387878.4).
The issuing authority of this document takes no position regarding the authenticity, validity, or scope of these patents.
The patent holders have committed to the issuing authority of this document that they are willing to negotiate patent licenses with any applicant on reasonable and non-discriminatory terms and conditions. The patent holders' statements are on file with the issuing authority of this document; relevant information can be obtained through the contact below.
Patent holders: Peking University, Huawei Technologies Co., Ltd., Beijing Baidu Netcom Science and Technology Co., Ltd., Xiamen University, Zhejiang University, Zhao Hengrui, and the Institute of Automation of the Chinese Academy of Sciences.
Addresses: Room 2604, Science Building 2, No. 5 Yiheyuan Road, Haidian District, Beijing, 100871; Huawei Building, No. 3 Shangdi Information Road, Beijing, 100085; Baidu Building, No. 10 Shangdi 10th Street, Haidian District, Beijing, 100085; School of Information, Xiamen University, Siming District, Xiamen, Fujian Province, 361005; Yuquan Campus, Zhejiang University, No. 38 Zheda Road, Xihu District, Hangzhou, Zhejiang Province, 310000; West Campus, University of Science and Technology of China, Shushan District, Hefei, Anhui Province, 230027; No. 95 Zhongguancun East Road, Haidian District, Beijing, 100190.
Contact: Huang Tiejun
Mailing address: Room 2641, Science Building 2, Peking University
Email: tjhuang@pku.edu.cn
Tel: 8610-62756172
Please note that, in addition to the patents above, some content of this document may still involve other patents. The issuer of this document assumes no responsibility for identifying such patents.
Information technology - Neural network representation and model compression -
Part 1: Convolutional neural networks
1 Scope
This document specifies the representation and compression processes for offline models of convolutional neural networks.
This document is applicable to the research, development, testing, and evaluation of convolutional neural network models, as well as to their efficient application in device and cloud scenarios.
Note: The representation and model compression methods specified in this document do not require native support from a machine learning framework; they can be supported in the form of converters, toolkits, etc.
2 Normative references
The contents of the following documents constitute essential provisions of this document through normative references in the text. For dated references, only the edition corresponding to that date applies to this document; for undated references, the latest edition (including all amendments) applies to this document.
GB/T 5271.34-2006 Information technology - Vocabulary - Part 34: Artificial neural networks
3 Terms and definitions
The following terms and definitions apply to this document.
3.1
codec representation
A representation that reduces the size of a model using compression techniques.
Note: See Chapter 10 for the specific definition of the codec representation.
3.2
layer
A hierarchical structure in a neural network.
Note: Each network layer contains multiple operators; examples include the input layer, convolutional layers, and fully connected layers.
3.3
basis
Symbolic vectors shared throughout the network.
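As an illustration of how a shared basis enables compression (a sketch, not the standard's normative method; all names and shapes here are hypothetical): each layer stores only a small coefficient matrix, while the basis vectors are stored once for the whole network.

```python
import numpy as np

rng = np.random.default_rng(0)

# 8 shared basis vectors of length 64, stored once for the whole network.
basis = rng.normal(size=(8, 64))

# Layer weights constructed to lie in the span of the shared basis.
w1 = rng.normal(size=(32, 8)) @ basis   # layer-1 weights (32 x 64)
w2 = rng.normal(size=(16, 8)) @ basis   # layer-2 weights reuse the same basis

# Recover layer-1 coefficients by least squares, then reconstruct the weights.
coef1, *_ = np.linalg.lstsq(basis.T, w1.T, rcond=None)
recon1 = coef1.T @ basis

print(np.allclose(recon1, w1))  # True: w1 lies exactly in the basis span
```

Each layer's storage drops from 32x64 entries to a 32x8 coefficient matrix plus its share of the one-time 8x64 basis.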
3.4
A quantization method that quantizes a tensor into a combination of multiple INT4 tensors.
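A minimal sketch of the idea (not the standard's normative algorithm): an 8-bit unsigned tensor can be expressed as a combination of two 4-bit tensors, a high nibble scaled by 16 plus a low nibble, so that x = 16*hi + lo with hi, lo in [0, 15].

```python
import numpy as np

x = np.array([0, 7, 16, 130, 255], dtype=np.uint8)

hi = x >> 4        # high 4 bits, values in 0..15 (fits in INT4 range)
lo = x & 0x0F      # low 4 bits, values in 0..15

# Recombining the two 4-bit tensors recovers the original tensor exactly.
recombined = 16 * hi + lo
print(np.array_equal(recombined, x))  # True
```

Schemes standardized in Chapter 7 generalize this: the combination weights and number of INT4 component tensors become parameters of the quantizer.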
3.5
model protection
Interfaces such as payment security information and identity verification.
Note: See Chapter 11 for the specific definition of model protection.
3.6
structured matrix
A matrix that can be divided into multiple blocks, with each block arranged according to a certain rule.
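As a hypothetical illustration of such a rule (one common choice in compression, not necessarily the arrangement the standard mandates): a matrix whose blocks are each circulant is fully determined by one defining vector per block.

```python
import numpy as np

def circulant(v):
    """Circulant block: column i is the first column v cyclically shifted by i."""
    return np.stack([np.roll(v, i) for i in range(len(v))], axis=1)

rng = np.random.default_rng(1)

# A 2x3 grid of 4x4 circulant blocks: 4 defining numbers per block.
first_cols = rng.normal(size=(2, 3, 4))
blocks = [[circulant(first_cols[i, j]) for j in range(3)] for i in range(2)]
W = np.block(blocks)  # assemble the full 8x12 structured matrix

# Storage drops from 8*12 = 96 entries to 2*3*4 = 24 defining entries.
print(W.shape)  # (8, 12)
```

The compression and decompression processes for structured matrices are specified in 7.4 and 8.4.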