www.ChineseStandard.net Database: 189759 (26 Oct 2025)

GB/T 33475.3-2018 English PDF

US$7689.00 · In stock
Delivery: <= 21 days. True-PDF full-copy in English will be manually translated and delivered via email.
GB/T 33475.3-2018: Information technology -- High efficiency media coding -- Part 3: Audio
Status: Valid


Standard similar to GB/T 33475.3-2018

GB/T 38663   GB/T 37036.3   GB/T 37036.2   GB/T 33475.5   GB/T 33475.6   GB/T 33475.4   

Basic data

Standard ID: GB/T 33475.3-2018 (GB/T33475.3-2018)
Description (Translated English): Information technology -- High efficiency media coding -- Part 3: Audio
Sector / Industry: National Standard (Recommended)
Classification of Chinese Standard: L71
Classification of International Standard: 35.040
Word Count Estimation: 508,565
Date of Issue: 2018-06-07
Date of Implementation: 2019-01-01
Regulation (derived from): National Standard Announcement No. 9 of 2018
Issuing agency(ies): State Administration for Market Regulation, China National Standardization Administration

GB/T 33475.3-2018: Information technology -- High efficiency media coding -- Part 3: Audio

---This is a DRAFT version for illustration, not a final translation. Full copy of true-PDF in English version (including equations, symbols, images, flow-chart, tables, and figures etc.) will be manually/carefully translated upon your order.
ICS 35.040
L71
National Standard of the People's Republic of China
Information technology -- High efficiency media coding -- Part 3: Audio
Issued on 2018-06-07; implemented on 2019-01-01
Issued by the State Administration for Market Regulation and the China National Standardization Administration

Content

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviations
5 Bitstream syntax rules
6 Audio coding framework
7 General audio coding
8 Lossless audio coding
9 Object metadata coding
10 Multiplexing of AVS2-P3 in the transport stream
Appendix A (Normative) AASF and AATF syntax and semantics
Appendix B (Normative) General audio code tables
Appendix C (Normative) Multiplexing of the AVS2-P3 audio elementary stream in a GB/T 17975.1-2010 or MPEG-2 TS transport stream

Foreword

GB/T 33475 "Information technology -- High efficiency media coding" is divided into three parts:
--- Part 1: System;
--- Part 2: Video;
--- Part 3: Audio.

This part is Part 3 of GB/T 33475.

This part was drafted in accordance with the rules given in GB/T 1.1-2009.

This part was proposed by and is under the jurisdiction of the National Information Technology Standardization Technical Committee (SAC/TC28).

Drafting organizations of this part: Tsinghua University, Nanjing Qingyi Information Technology Co., Ltd., Zhongguancun Audiovisual Industry Technology Innovation Alliance, Zhongke Kaiyuan Information Technology (Beijing) Co., Ltd., Institute of Information and Communication, National Research Service of Singapore, Peking University, Wuhan University, Beijing Tianzhu Technology Co., Ltd., Beijing Institute of Technology, Tianjin University.

Main drafters of this part: Dou Weizhen, Pan Xingde, Li Wei, Shu Haiyan, Lu Min, Wu Chaogang, Yang Xinhui, Liu Renhua, Huang Haibin, Yu Rongshan, Huang Yichao, Qu Tianshu, Wang Xiaochen, Jiang Lin, Wang Jing, Zhang Tao, Gao Wen, Huang Tiejun.

Introduction

This part of GB/T 33475 is a codec technical standard for high-quality audio signals. It was developed to meet the demand for audio compression technology in applications such as digital storage media, Internet broadband audio and video services, digital audio and video broadcasting, wireless broadband multimedia communications, digital cinema, virtual/augmented reality and video surveillance.

This part describes the coded representation of general audio coding, lossless coding and 3D sound object coding for high-quality audio signals, and the methods for general audio decoding, lossless decoding and 3D sound object decoding.

General audio coding supports up to 128 channels, sampling rates from 8 kHz to 192 kHz, and 8-bit, 16-bit and 24-bit sample precision. The supported coded output bit rate is 16 kbit/s to 192 kbit/s per channel, with the following bit rates:
--- mono: 16 kbit/s, 32 kbit/s, 44 kbit/s, 56 kbit/s, 64 kbit/s, 72 kbit/s, 80 kbit/s, 96 kbit/s, 128 kbit/s, 144 kbit/s, 164 kbit/s, 192 kbit/s;
--- two-channel stereo: 24 kbit/s, 32 kbit/s, 48 kbit/s, 64 kbit/s, 80 kbit/s, 96 kbit/s, 128 kbit/s, 144 kbit/s, 192 kbit/s, 256 kbit/s, 320 kbit/s;
--- 5.1 surround sound: 192 kbit/s, 256 kbit/s, 320 kbit/s, 384 kbit/s, 448 kbit/s, 512 kbit/s, 640 kbit/s, 720 kbit/s;
--- multi-channel surround sound such as 7.1 and 10.1.

Lossless audio coding supports up to 128 channels, any sampling frequency, and 8-bit, 16-bit and 24-bit sample precision. 3D sound object coding supports up to 128 sound objects.

The issuing body of this document draws attention to the fact that declaring conformance with this part may involve the use of patents related to 7.3, 7.4.2, 7.5, 7.6, 7.7, 7.8, 7.9, 8.4, 8.7, 9.3 and 9.4.

The issuing body of this document draws attention to the fact that declaring conformance with this document may involve the use of the following 20 patents related to general audio codec technology:
--- PCT/CN2014/095012, A vector quantization codec method and apparatus for audio signals;
--- PCT/CN2014/095394, Multi-channel sound signal encoding method, decoding method and device;
--- PCT/CN2014/095396, Multi-channel sound signal encoding method, decoding method and device;
--- PCT/CN2014/095393, Codec method and device for principal component analysis (PCA) mapping model;
--- 200610087094.6, Frequency extension coding method and device, and decoding method and device thereof;
--- 201210085183.2, A sound codec device and method thereof;
--- 201210085213.X, A sound codec device and method thereof;
--- 201210085257.2, A sound codec device and method thereof;
--- 201310109081.4, A sound decoding device and method thereof;
--- 201310128173.7, A sound codec device and method thereof;
--- 201310728959.2, A vector quantization codec method and device for audio signals;
--- 201410395806.5, Multi-channel sound signal encoding method, decoding method and device;
--- 201410404895.5, Multi-channel sound signal encoding method, decoding method and device;
--- 201410710991.2, Codec method and device for principal component analysis (PCA) mapping model;
--- 201510226119.5, A codec device and method for compensating discarded subspace components;
--- 200710175993.6, Code integration system and method, and decoding integration system and method;
--- 200710135833.9, Stereo audio encoding/decoding method and encoding/decoding device;
--- 200710304486.8, Encoding method and device for audio signals, and decoding method and device;
--- 200810106460.7, Stereo signal encoding and decoding method, device and codec system;
--- 201410573759.9, A stereo codec method.

The issuing body of this document draws attention to the fact that declaring conformance with this document may involve the use of the following three patents related to lossless audio codec technology:
--- ZL201010281033.X, An audio lossless compression coding and decoding method based on shaping wavelet transform;
--- 201110263485.X, Backward block-adaptive Golomb-Rice codec method and device;
--- 201410721299.X, Multi-channel lossless audio hybrid codec method and device.

The issuing body of this document draws attention to the fact that declaring conformance with this document may involve the use of the following four patents related to object metadata codec technology:
--- 201610157032.1, A panoramic sound processing method;
--- 201610157663.3, A coordinate definition method of a sound field space;
--- 201610158782.0, A coding method of a sound object;
--- 201610159117.3, A panoramic sound coding method.

The issuing body of this document takes no position on the authenticity, validity or scope of these patents.

The patent holders have assured the issuing body of this document that they are willing to negotiate patent licences with any applicant under reasonable and non-discriminatory terms and conditions. The patent holders' statements have been filed with the issuing body of this document. Related information can be obtained through the following contact information.

Contact: Huang Tiejun (Secretary-General of the Digital Audio Video Codec Standard Working Group)
Address: Room 2641, Science Building 2, Peking University
Postal code: 100871
Email: [email protected]
Phone: 10-62756172
Fax: 10-62751638

Please note that, in addition to the patents above, some content of this part may still involve patents. The issuing body of this document assumes no responsibility for identifying such patents.

Information technology -- High efficiency media coding -- Part 3: Audio

1 Scope

This part of GB/T 33475 describes the coded representation of general audio coding, lossless audio coding and 3D audio object coding for high-quality audio signals, and the methods for general audio decoding, lossless audio decoding and 3D audio object decoding.

This part applies to the following areas:
--- Digital storage media;
--- Internet broadband audio and video services;
--- Digital audio and video broadcasting;
--- Wireless broadband multimedia communication;
--- Digital cinema;
--- Virtual reality and augmented reality;
--- Video surveillance.

2 Normative references

The following documents are indispensable for the application of this document. For dated references, only the dated edition applies to this document. For undated references, the latest edition (including all amendments) applies to this document.

GB/T 4880.2-2000 Language name code -- Part 2: 3-letter code
GB/T 5271.1 Information technology -- Vocabulary -- Part 1: Basic terms
GB/T 5271.4 Information technology -- Vocabulary -- Part 4: Organization of data
GB/T 5271.9 Information technology -- Vocabulary -- Part 9: Data communication
GB/T 17975.1-2010 Information technology -- General coding for motion pictures and associated audio information -- Part 1

3 Terms and definitions

The terms and definitions given in GB/T 5271.1, GB/T 5271.4 and GB/T 5271.9, together with the following, apply to this document.

3.1 reserved
Fields in the defined coded bitstream that are temporarily unused and may be used in future extensions of the standard.

3.2 bit rate
The rate at which the coded bitstream is delivered to the input of the decoder.

3.3 bitstream
An ordered set of bits that forms the coded representation of data.

3.4 coding
The process of reading in an audio sample stream and generating a valid bitstream conforming to this part.

3.5 coder (encoder)
An entity that performs the coding process.

3.6 coded representation
A unit of data represented in its coded form.

3.7 coded audio bitstream
A coded representation of an audio signal.

3.8 side information
Information in the bitstream that is necessary to control decoding.

3.9 sampling frequency; fs
The number of samples per second of the discrete signal extracted from a continuous signal; also referred to as the sampling rate.
Note: The unit is hertz (Hz).

3.10 auxiliary data
Data in the bitstream used to assist channel coding.

3.11 decoding
A data processing procedure defined in this part: the process of reading in a coded bitstream and outputting audio sample values.

3.12 decoder
An entity that performs the decoding process.

3.13 spectral coefficient
Discrete frequency-domain data output by the analysis filter bank.

3.14 entropy coding
Variable-length lossless coding of the digital representation of a signal that reduces redundancy in its statistical properties.

3.15 channel
An independent audio signal that is captured or played back at a distinct spatial position during recording or playback.

3.16 data element
A representation of data items before and after encoding.
3.17 stuffing (bits); stuffing (bytes)
Codewords that can be inserted at particular positions in the coded bitstream and are removed during decoding. Coded additional data may also use stuffing bits or bytes.

3.18 signal type
A classification of the coded audio signal, used to select among different filtering and coding methods.

3.19 audio buffer
A buffer in the decoder that stores coded audio data.

3.20 byte
A sequence of 8 bits.

3.21 byte alignment
A position in the coded bitstream at which the number of bits is a multiple of eight.

3.22 noise level coding
Parametric coding of the degree to which the statistical characteristics of the signal resemble noise.

3.23 adding individual line
Adding a sinusoidal component to a particular frequency band.

3.24 linear predictive coding
An algorithm that processes the input audio signal to reduce signal redundancy and improve coding efficiency.

3.25 lifting wavelet
A wavelet transform implemented with a lifting scheme.

3.26 multichannel decorrelation
Removal of inter-channel correlation to improve coding efficiency.

3.27 channel coding
Coding of the base channels, i.e. coding of the underlying sound signal other than the sound objects.

3.28 sound object
A sound that is perceived as a whole, or a sound emitted by a sound source that is independent of the environment.

3.29 sound object coding
The process of reading in a sound object audio sample stream and its side information and generating a coded bitstream that contains object metadata and audio content.

4 Symbols and abbreviations

The mathematical operators and their precedence used in this part are similar to those of the C language, except that integer division and arithmetic shift are given specific definitions. Unless otherwise stated, numbering and counting start from 0.

4.1 Arithmetic operators

The following arithmetic operators apply to this document.

+  Addition
-  Subtraction (binary operator) or negation (unary prefix operator)
×  Multiplication
^  Power: a^b denotes a raised to the b-th power; may also be written as a superscript
%  Modulus operator, defined only for positive integers
/  Integer division with the result truncated toward zero. For example, 7/4 and -7/-4 are truncated to 1, and -7/4 and 7/-4 are truncated to -1
÷  Division without truncation or rounding
|x|  Absolute value: |x| = x when x > 0; |x| = 0 when x = 0; |x| = -x when x < 0
abs()  Absolute value
sign()  Sign: sign(x) = 1 when x > 0; sign(x) = 0 when x = 0; sign(x) = -1 when x < 0
√x  Square root of x
Σ (i = a to b) f(i)  Sum of the function f(i) as the argument i takes all integer values from a to b (including b)
log10  Base-10 logarithm
log2  Base-2 logarithm
⌊x⌋  Rounding down
exp  Exponential function with the natural constant e as base
residual  Residual: the difference between the actual observed value and the estimated (fitted) value

4.2 Logical operators

The following logical operators apply to this document.

||  Logical OR
!  Logical NOT

4.3 Relational operators

The following relational operators apply to this document.

>  Greater than
≥  Greater than or equal to
<  Less than
≤  Less than or equal to
=  Equal to
≠  Not equal to
max[,...,]  Maximum value in the argument list
min[,...,]  Minimum value in the argument list

4.4 Bit operators

The following bit operators apply to this document.

|  OR
~  Inversion
a >> b  Shift a right by b bits, treating a as a two's complement integer. This operation is defined only when b is positive. The bits shifted into the most significant positions take the value of the most significant bit of a before the shift.
a << b  Shift a left by b bits, treating a as a two's complement integer. This operation is defined only when b is positive. The bits shifted into the least significant positions are equal to zero.

4.5 Assignment

The following assignment operations apply to this document.

=  Assignment operator
x = a..b  x takes the values from a to b (including b), where x, a and b are integers
++  Increment: x++ is equivalent to x = x + 1. When used in an array subscript, the variable is evaluated before the increment.
--  Decrement: x-- is equivalent to x = x - 1. When used in an array subscript, the variable is evaluated before the decrement.
+=  Increment by the specified value: x += 3 is equivalent to x = x + 3, and x += (-3) is equivalent to x = x + (-3).
-=  Decrement by the specified value: x -= 3 is equivalent to x = x + (-3), and x -= (-3) is equivalent to x = x - (-3).

4.6 Mnemonics

The following mnemonics apply to this document.

rpchof  Remainder polynomial coefficients, highest order first.
bslbf  Bit string, left bit first, where "left" is the order in which bit strings are written in GB/T 17191. Bit strings are written as strings of 1s and 0s within single quotes, e.g. '10000001'. Spaces within a bit string are for ease of reading and have no special meaning.
uimsbf  Unsigned integer, most significant bit first.
bsmbf  Bit string written as a quoted string of 1s and 0s, right bit first. For example, encoding the 5-bit value 6 followed by the 3-bit value 2 gives the bit string '01000110'.

4.7 Abbreviations

The following abbreviations apply to this document.

FFT: Fast Fourier Transform
MDCT: Modified Discrete Cosine Transform
IMDCT: Inverse Modified Discrete Cosine Transform
MDST: Modified Discrete Sine Transform
MDFT: Modified Discrete Fourier Transform
IMDFT: Inverse Modified Discrete Fourier Transform
MCR: Maximal Correlation Rotation
PCA: Principal Component Analysis
AASF: AVS2 Audio Storage Format
AATF: AVS2 Audio Transport Format
CRC: Cyclic Redundancy Check
BWE: Bandwidth Extension (high-frequency bandwidth extension)
TNS: Temporal Noise Shaping

5 Bitstream syntax rules

Each data item in the bitstream is shown in bold and is described by its name, its length in bits, its type and a mnemonic for its order of transmission. The action caused by a decoded data element in the bitstream depends on the value of that data element and on previously decoded data elements. The following constructs express the conditions under which data elements are present; they appear in normal type.

Note 1: Unless otherwise stated, "bit" in this part refers to a binary digit.
Note 2: The syntax in this part follows the convention of C code: a condition is true when the variable or expression is non-zero, and not true when the variable or expression is zero.

    while (condition) {
        data_element
        ...
    }

If the condition is true, the group of data elements occurs next in the data stream; this repeats until the condition is not true.

    do {
        data_element
        ...
    } while (condition)

The group of data elements occurs at least once and is repeated until the condition is not true.

    if (condition) {
        data_element
        ...
    } else {
        data_element
        ...
    }

If the condition is true, the first group of data elements occurs next in the data stream; if the condition is not true, the second group of data elements occurs next in the data stream.
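The conventions of 4.1, 4.4 and 4.6 are easy to get wrong in a language whose defaults differ from them. The following Python sketch is illustrative only and is not part of the standard: the function names `trunc_div`, `arithmetic_shift_right` and `pack_bsmbf` are hypothetical, and `pack_bsmbf` encodes one reading of the bsmbf example (the first field occupies the rightmost bits), chosen because it reproduces the bit string quoted in 4.6.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division truncated toward zero, as '/' is described in 4.1.

    Python's own // operator rounds toward negative infinity instead,
    so -7 // 4 gives -2 rather than -1.
    """
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def arithmetic_shift_right(a: int, b: int) -> int:
    """a >> b on a two's complement integer; defined only for positive b (4.4)."""
    if b <= 0:
        raise ValueError("shift count must be positive")
    # Python integers already shift arithmetically: the sign bit is replicated.
    return a >> b


def pack_bsmbf(fields):
    """Pack (value, width) pairs with the first field in the rightmost bits.

    Returns the bit string written with the most significant bit on the left.
    This layout is an assumption that matches the example in 4.6.
    """
    word = 0
    used = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in the given width")
        word |= value << used
        used += width
    return format(word, "0{}b".format(used))


print(trunc_div(-7, 4))                 # -1
print(arithmetic_shift_right(-8, 1))    # -4
print(pack_bsmbf([(6, 5), (2, 3)]))     # 01000110
```

In C, `/` on signed integers already truncates toward zero since C99, whereas right-shifting a negative value is implementation-defined, which is presumably why the standard spells out both operations.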
    for (expr1; expr2; expr3) {
        data_element
        ...
    }

expr1 is an expression specifying the initial state of the loop; normally it sets the initial state of a counter. expr2 is a condition tested before each iteration; the loop terminates when the condition is not true. expr3 is an expression executed at the end of each iteration; normally it increments the counter.

Note 3: The most common use of this construct is

    for (i = 0; i < n; i++) {
        data_element
        ...
    }

The group of data elements occurs n times. Conditional constructs within the group of data elements may depend on the value of the loop control variable i, which is set to 0 for the first occurrence, 1 for the second occurrence, and so on.

    switch (expr) {
    case constcase1:
        data_element1
        break
    case constcase2:
        data_element2
        break
    ...
    case constcasen:
        data_elementn
        break
    default:
        data_elementdefault
        break
    }

The data element that occurs depends on the value of the expression expr. When expr equals constcase1, data_element1 occurs; when expr equals constcase2, data_element2 occurs; and so on, until data_elementn occurs when expr equals constcasen. When expr equals none of constcase1, constcase2, ..., constcasen, data_elementdefault occurs.

A variant of this construct omits the break after a case, for example

    switch (expr) {
    case constcase1:
        data_element1
    case constcase2:
        data_element2
        break
    ...
    case constcasen:
        data_elementn
        break
    default:
        data_elementdefault
        break
    }

When expr equals constcasex, data elements occur starting from the corresponding case constcasex until a break is reached. When expr equals constcase1, data_element1 and data_element2 occur; when expr equals constcasen, data_elementn occurs.

Note 4: Groups of data elements may contain nested constructs. For brevity, "{}" is omitted when only one data element follows.
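To show how a syntax table written with these constructs maps onto a decoder, here is a minimal Python sketch. It is not taken from the standard: the `BitReader` class, the `toy_table` syntax and its element names (`num_items`, `has_value`, `value`) are all hypothetical, and the reader assumes uimsbf fields packed most significant bit first within each byte.

```python
class BitReader:
    """Reads unsigned integers, most significant bit first (uimsbf)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # bit offset from the start of the stream

    def read_uimsbf(self, nbits: int) -> int:
        if self.pos + nbits > 8 * len(self.data):
            raise EOFError("not enough bits left in the stream")
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_toy_table(reader: BitReader):
    """Parse a hypothetical syntax table (illustration only):

        toy_table() {
            num_items                 4  uimsbf
            for (i = 0; i < num_items; i++) {
                has_value[i]          1  uimsbf
                if (has_value[i])
                    value[i]          3  uimsbf
            }
        }
    """
    num_items = reader.read_uimsbf(4)
    values = []
    for _ in range(num_items):
        # A non-zero flag counts as a true condition, as stated in Note 2.
        if reader.read_uimsbf(1):
            values.append(reader.read_uimsbf(3))
        else:
            values.append(None)
    return values


# Bits: 0010 | 1 101 | 0 | padding  ->  num_items = 2, value[0] = 5, no value[1]
reader = BitReader(bytes([0x2D, 0x00]))
print(parse_toy_table(reader))  # [5, None]
print(reader.pos)               # 9
```

The parser mirrors the table line by line, so which elements are present depends only on values decoded earlier, which is the property the constructs above are meant to express.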
data_element[]  data_element is an array of data; the number of data elements depends on the context.
data_element[n]  The (n+1)th element of an array of data.
data_element[m][n]  The (m+1, n+1)th element of a two-dimensional array of data.
data_element[l][m][n]  The (l+1, m+1, n+1)th element of a three-dimensional array of data.
data_element[m..n]  The bits of data_element between bit m and bit n, inclusive.

Although the syntax is expressed in procedural terms, this clause should not be taken as implementing a reliable decoding process. It only defines an error-free input bitstream.

Definition of the byte_alignment function: the function byte_alignment() returns '1' if the current position is on a byte boundary, that is, if the next bit in the bitstream is the first bit of a byte; otherwise it returns '0'.

Definition of the nextbits function: the function nextbits() compares a bit string with the next bits to be decoded in the bitstream.

Definition of the feof function: the function feof() determines whether the stream or file has ended. It returns '1' at the end of the stream or file; otherwise it returns '0'.

The second column of the bitstream syntax table gives the number of bits of each data element. "X..Y" means that the number of bits is between X and Y, inclusive. "{X;Y}" means that the number of bits is X or Y, depending on the value of other data elements in the bitstream.
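The three helper functions described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the standard's definition: the `BitCursor` class and its `skip` method are hypothetical, bits are assumed to be ordered most significant bit first within each byte, and `nextbits` is assumed to peek at the upcoming bits without consuming them.

```python
class BitCursor:
    """A bit position over a byte buffer with the helpers named in Clause 5."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # bit offset from the start of the stream

    def byte_alignment(self) -> int:
        """Return 1 if the next bit is the first bit of a byte, otherwise 0."""
        return 1 if self.pos % 8 == 0 else 0

    def feof(self) -> int:
        """Return 1 once every bit of the buffer has been consumed, otherwise 0."""
        return 1 if self.pos >= 8 * len(self.data) else 0

    def nextbits(self, pattern: str) -> bool:
        """Compare a bit string such as '1010' with the upcoming bits (no consume)."""
        end = self.pos + len(pattern)
        if end > 8 * len(self.data):
            return False
        bits = "".join(
            str((self.data[i >> 3] >> (7 - (i & 7))) & 1)
            for i in range(self.pos, end)
        )
        return bits == pattern

    def skip(self, nbits: int) -> None:
        """Advance the cursor (hypothetical helper, used here for the demo)."""
        self.pos = min(self.pos + nbits, 8 * len(self.data))


cursor = BitCursor(bytes([0xA5]))   # bit pattern 10100101
print(cursor.byte_alignment())      # 1
print(cursor.nextbits("1010"))      # True
cursor.skip(3)
print(cursor.byte_alignment())      # 0
print(cursor.nextbits("00101"))     # True
cursor.skip(5)
print(cursor.feof())                # 1
```

Returning 1 or 0 rather than a Boolean keeps the helpers usable directly as conditions in the C-style constructs, where any non-zero value is true.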

6 Audio coding framework

6.1 Overview

With the application of 3D audio (three-dimensional audio) systems, application environments such as 4K and 3D movies, ultra-HD TV, network HD video, virtual reality, networking and mobile audio have raised the need for efficient, high-quality compression coding of 3D audio data. Because the amount of data in a 3D audiovisual system is much larger than in a traditional audiovisual system, storage space and transmission bandwidth (or data traffic) overhead increase. Improving compression efficiency while maintaining high sound quality is therefore the main problem that AVS2 audio coding needs to solve.

AVS2 audio coding is a 3D audio coding technology that provides high compression, high quality and audio object coding. It combines advanced technology, rich codec options, a high degree of system integration, flexible configuration and wide adaptability, and offers high performance, high compression, high sound quality and low complexity.

The AVS2 audio coding framework is shown in Figure 1. The whole system consists of two coding profiles: base channel coding (base_profile) and 3D audio object coding (3D_profile). The base channel coding profile covers mono, stereo (two-channel), surround (multi-channel) and 3D sound bed coding, and integrates two coding options: General Audio coding (GA) and Lossless audio coding (LL). The general audio coding option combines two coding modes, high bit rate and low bit rate. 3D audio object coding comprises coding of object audio data and of object metadata: the object audio data shares the base channel coding with the 3D sound bed, and an added audio object metadata coding module (Audio Object Coding, Ob) encodes object description information such as the type, spatial position and motion trajectory of the 3D audio object sound source.
Compared to current audio standards, AVS2 audio coding has five features.

a) Bitstream encapsulation. To serve both storage and transmission, two formats are defined: the storage format AASF (AVS2 Audio Storage Format), which has no frame header information, and the transmission format AATF (AVS2 Audio Transport Format), which carries frame headers and error-check information. This minimizes coding redundancy while meeting application requirements. The bit rate difference between the two formats is about 740 bit/s. For details, see Appendix A.

b) Codec framework.

