US$7689.00 · In stock Delivery: <= 21 days. True-PDF full-copy in English will be manually translated and delivered via email. GB/T 33475.3-2018: Information technology -- High efficiency media coding -- Part 3: Audio Status: Valid
| Standard ID | Contents [version] | USD | [PDF] delivered in | Standard Title (Description) | Status |
| GB/T 33475.3-2018 | English | 7689 | 21 days [Need to translate] | Information technology -- High efficiency media coding -- Part 3: Audio | Valid |
Basic data
| Standard ID | GB/T 33475.3-2018 (GB/T33475.3-2018) |
| Description (Translated English) | Information technology -- High efficiency media coding -- Part 3: Audio |
| Sector / Industry | National Standard (Recommended) |
| Classification of Chinese Standard | L71 |
| Classification of International Standard | 35.040 |
| Word Count Estimation | 508,565 |
| Date of Issue | 2018-06-07 |
| Date of Implementation | 2019-01-01 |
| Regulation (derived from) | National Standard Announcement No. 9 of 2018 |
| Issuing agency(ies) | State Administration for Market Regulation, China National Standardization Administration |
GB/T 33475.3-2018: Information technology -- High efficiency media coding -- Part 3: Audio---This is a DRAFT version for illustration, not a final translation. Full copy of true-PDF in English version (including equations, symbols, images, flow-chart, tables, and figures etc.) will be manually/carefully translated upon your order.
Information technology--High efficiency media coding--Part 3. Audio
ICS 35.040
L71
National Standard of the People's Republic of China
Information technology -- High efficiency media coding
Part 3: Audio
Issued on 2018-06-07
Implemented on 2019-01-01
Issued by the State Administration for Market Regulation and the China National Standardization Administration
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviations
5 Bitstream syntax rules
6 Audio coding framework
7 General audio coding
8 Lossless audio coding
9 Object metadata coding
10 Multiplexing of AVS2-P3 in the transport stream
Appendix A (Normative) AASF and AATF syntax and semantics
Appendix B (Normative) General audio code tables
Appendix C (Normative) Multiplexing definition of the AVS2-P3 audio elementary stream in the GB/T 17975.1-2010 or MPEG-2 TS transport stream
Foreword
GB/T 33475 "Information technology -- High efficiency media coding" is divided into three parts:
--- Part 1: System;
--- Part 2: Video;
--- Part 3: Audio.
This part is Part 3 of GB/T 33475.
This part was drafted in accordance with the rules given in GB/T 1.1-2009.
This part was proposed by and is under the jurisdiction of the National Information Technology Standardization Technical Committee (SAC/TC28).
This part was drafted by: Tsinghua University, Nanjing Qingyi Information Technology Co., Ltd., Zhongguancun Audiovisual Industry Technology Innovation Alliance, Zhongke Kaiyuan Information Technology (Beijing) Co., Ltd., Institute of Information and Communication, National Research Service of Singapore, Peking University, Wuhan University, Beijing Tianzhu Technology Co., Ltd., Beijing Institute of Technology, Tianjin University.
The main drafters of this part: Dou Weizhen, Pan Xingde, Li Wei, Shu Haiyan, Lu Min, Wu Chaogang, Yang Xinhui, Liu Renhua, Huang Haibin, Yu Rongshan, Huang Yichao, Qu Tianshu, Wang Xiaochen, Jiang Lin, Wang Jing, Zhang Tao, Gao Wen, Huang Tiejun.
Introduction
This part of GB/T 33475 is a codec technical standard for high quality audio signals. It was developed to meet the need for audio compression technology in applications such as digital storage media, Internet broadband audio and video services, digital audio and video broadcasting, wireless broadband multimedia communication, digital cinema, virtual/augmented reality and video surveillance.
This part specifies the coded representation for general audio coding, lossless coding and 3D sound object coding of high quality audio signals, and the methods of general audio decoding, lossless decoding and 3D sound object decoding. General audio coding supports up to 128 channels, sampling rates from 8 kHz to 192 kHz, and 8-bit, 16-bit and 24-bit sample precision. The supported coded output bit rate per channel is 16 kbit/s to 192 kbit/s, with the following configurations:
--- Mono: 16 kbit/s, 32 kbit/s, 44 kbit/s, 56 kbit/s, 64 kbit/s, 72 kbit/s, 80 kbit/s, 96 kbit/s, 128 kbit/s, 144 kbit/s, 164 kbit/s, 192 kbit/s;
--- Two-channel stereo: 24 kbit/s, 32 kbit/s, 48 kbit/s, 64 kbit/s, 80 kbit/s, 96 kbit/s, 128 kbit/s, 144 kbit/s, 192 kbit/s, 256 kbit/s, 320 kbit/s;
--- 5.1 surround sound: 192 kbit/s, 256 kbit/s, 320 kbit/s, 384 kbit/s, 448 kbit/s, 512 kbit/s, 640 kbit/s, 720 kbit/s;
--- Multi-channel surround sound such as 7.1 and 10.1.
Lossless audio coding supports up to 128 channels, any sampling frequency, and 8-bit, 16-bit and 24-bit sample precision. 3D sound object coding supports up to 128 sound objects.
The issuing body of this document draws attention to the fact that declaring conformity with this part may involve the use of patents related to 7.3, 7.4.2, 7.5, 7.6, 7.7, 7.8, 7.9, 8.4, 8.7, 9.3 and 9.4.
The issuing body of this document draws attention to the fact that declaring conformity with this document may involve the use of 20 patents related to general audio codec technology:
--- PCT/CN2014/095012, a vector quantization codec method and apparatus for audio signals;
--- PCT/CN2014/095394, multi-channel sound signal encoding method, decoding method and device;
--- PCT/CN2014/095396, multi-channel sound signal encoding method, decoding method and device;
--- PCT/CN2014/095393, codec method and device for principal component analysis (PCA) mapping model;
--- 200610087094.6, frequency extension coding method and device, and decoding method and device thereof;
--- 201210085183.2, a sound codec device and method thereof;
--- 201210085213.X, a sound codec device and method thereof;
--- 201210085257.2, a sound codec device and method thereof;
--- 201310109081.4, a sound decoding device and method thereof;
--- 201310128173.7, a sound codec device and method thereof;
--- 201310728959.2, a vector quantization codec method and device for audio signals;
--- 201410395806.5, multi-channel sound signal encoding method, decoding method and device;
--- 201410404895.5, multi-channel sound signal encoding method, decoding method and device;
--- 201410710991.2, codec method and device for principal component analysis (PCA) mapping model;
--- 201510226119.5, a codec device and method for compensating discarded subspace components;
--- 200710175993.6, code integration system and method and decoding integration system and method;
--- 200710135833.9, stereo audio encoding/decoding method and encoding/decoding device;
--- 200710304486.8, encoding method and device for audio signals, and decoding method and device;
--- 200810106460.7, stereo signal encoding and decoding method, device and codec system;
--- 201410573759.9, a stereo codec method.
The issuing body of this document draws attention to the fact that declaring conformity with this document may involve the use of three patents related to lossless audio codec technology:
--- ZL201010281033.X, an audio lossless compression coding and decoding method based on shaping wavelet transform;
--- 201110263485.X, a backward block-adaptive Golomb-Rice codec method and device;
--- 201410721299.X, multi-channel lossless audio hybrid codec method and device.
The issuing body of this document draws attention to the fact that declaring conformity with this document may involve the use of four patents related to object metadata codec technology:
--- 201610157032.1, a panoramic sound processing method;
--- 201610157663.3, a coordinate definition method of a sound field space;
--- 201610158782.0, a coding method of a sound object;
--- 201610159117.3, a panoramic sound coding method.
The issuing body of this document takes no position on the authenticity, validity or scope of these patents.
The patent holders have assured the issuing body of this document that they are willing to negotiate patent licences with any applicant on reasonable and non-discriminatory terms and conditions. The patent holders' statements have been filed with the issuing body of this document. Related information can be obtained through the following contact details.
Contact: Huang Tiejun (Secretary-General of the Digital Audio Video Codec Standard Working Group)
Address: Room 2641, Building 2, Science, Peking University
Postal code: 100871
Email: [email protected]
Phone: 10-62756172
Fax: 10-62751638
Please note that, in addition to the patents listed above, some of the contents of this part may still involve patents. The issuing body of this part does not undertake responsibility for identifying such patents.
Information technology -- High efficiency media coding
Part 3: Audio
1 Scope
This part of GB/T 33475 specifies the coded representation for general audio coding, lossless audio coding and 3D audio object coding of high quality audio signals, and the methods of general audio decoding, lossless audio decoding and 3D audio object decoding.
This part applies to the following areas:
--- Digital storage media;
---Internet broadband audio and video services;
---Digital audio and video broadcasting;
---Wireless broadband multimedia communication;
---Digital cinema;
---Virtual reality and augmented reality;
---Video Surveillance.
2 Normative references
The following documents are indispensable for the application of this document. For dated references, only the dated edition applies to this document. For undated references, the latest edition (including all amendments) applies to this document.
GB/T 4880.2-2000 Codes for language names -- Part 2: 3-letter code
GB/T 5271.1 Information technology -- Vocabulary -- Part 1: Basic terms
GB/T 5271.4 Information technology -- Vocabulary -- Part 4: Organization of data
GB/T 5271.9 Information technology -- Vocabulary -- Part 9: Data communication
GB/T 17975.1-2010 Information technology -- General coding of moving pictures and associated audio information -- Part 1
3 Terms and definitions
The terms and definitions given in GB/T 5271.1, GB/T 5271.4 and GB/T 5271.9, together with the following, apply to this document.
3.1
Reserved
Fields in the defined coded bitstream that are currently unused but may be used in future extensions of the standard.
3.2
Bit rate
The rate at which the coded bitstream is delivered to the input of the decoder.
3.3
Bitstream
An ordered sequence of bits that forms the coded representation of data.
3.4
Coding
The process of reading in a stream of audio samples and generating a valid bitstream conforming to this part.
3.5
Encoder (coder)
An entity that performs the encoding process.
3.6
Coded representation
A data element as represented in its encoded form.
3.7
Coded audio bitstream
A coded representation of an audio signal.
3.8
Side information
Information in the bitstream that is necessary to control decoding.
3.9
Sampling frequency; fs
The number of samples per second taken from a continuous signal to form a discrete signal; also referred to as the sampling rate.
Note: The unit is hertz (Hz).
3.10
Auxiliary data
Data used in the bitstream to assist in channel coding.
3.11
Decoding
A data processing procedure defined in this part that reads in a coded bitstream and outputs audio sample values.
3.12
Decoder
An entity that performs the decoding process.
3.13
Spectral coefficient
Discrete frequency-domain data output by the analysis filter bank.
3.14
Entropy coding
Variable-length lossless coding of the digital representation of a signal that reduces statistical redundancy.
3.15
Channel
Independent audio signals that are captured or played back at different spatial positions during recording or playback.
3.16
Data element
A data item as represented before encoding and after decoding.
3.17
Stuffing (bits)
Stuffing (bytes)
A codeword that may be inserted at particular positions in the coded bitstream and is discarded during decoding. Additional coded data may also use stuffing bits or bytes.
3.18
Signal type
A classification of the audio signal being encoded, used to select among different filtering and coding methods.
3.19
Audio buffer
A buffer in the decoder for storing coded audio data.
3.20
Byte
A sequence of 8 bits.
3.21
Byte alignment
A position in the coded bitstream whose offset in bits is a multiple of eight.
3.22
Noise level coding
Coding of a parameter that describes how closely the statistical characteristics of the signal resemble noise.
3.23
Adding individual line
The addition of a sinusoidal component to a particular frequency band.
3.24
Linear predictive coding
An algorithm that processes an input audio signal to reduce signal redundancy and improve coding efficiency.
3.25
Lifting wavelet
A wavelet transform implemented with a lifting scheme.
3.26
Multichannel decorrelation
The removal of inter-channel correlation to improve coding efficiency.
3.27
Channel coding
The coding of the base channels, i.e. the coding of the underlying sound signals other than sound objects.
3.28
Sound object
A sound that is perceived as a whole, or an environment-independent sound emitted by a sound source.
3.29
Sound object coding
The process of reading in the audio sample stream of a sound object and its side information and generating a coded bitstream that contains object metadata and audio content.
4 Symbols and abbreviations
The mathematical operators and their precedence used in this part are similar to those of the C language, except that integer division and arithmetic shift are specifically defined. Numbering and counting start from 0 unless otherwise stated.
4.1 Arithmetic operators
The following arithmetic operators apply to this document.
+ Addition
- Subtraction (binary operator) or negation (unary prefix operator)
× Multiplication
$a^b$ Power: a raised to the b-th power; the same notation may also denote a superscript
^ Power
% Modulus operator, defined only for positive integers
/ Integer division with the result rounded toward zero. For example, 7/4 and -7/-4 are rounded to 1; -7/4 and 7/-4 are rounded to -1
÷ Division, without truncation or rounding
| | Absolute value: |x| = x when x > 0; |x| = 0 when x = 0; |x| = -x when x < 0
abs Absolute value
sign() Sign function: sign(x) = 1 when x > 0; sign(x) = 0 when x = 0; sign(x) = -1 when x < 0
$\sqrt{x}$ Square root
$\sum_{i=a}^{b} f(i)$ The sum of f(i) as the argument i takes all integer values from a to b (inclusive)
log10 Base-10 logarithm
log2 Base-2 logarithm
$\lfloor x \rfloor$ Round down
exp Exponential function with the natural constant e as its base
residual Residual: the difference between the actual observed value and the estimated (fitted) value
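The integer division defined in 4.1 rounds toward zero, which is not what every programming language does by default. The following Python sketch is illustrative only and not part of the standard; the helper names `div_trunc`, `sign` and `summation` are hypothetical. It shows one way to reproduce the rounding rule, the sign function and the inclusive cumulative sum.

```python
def div_trunc(a: int, b: int) -> int:
    """Integer division with the quotient rounded toward zero."""
    if b == 0:
        raise ZeroDivisionError("integer division by zero")
    magnitude = abs(a) // abs(b)        # floor division of non-negative operands
    same_sign = (a >= 0) == (b >= 0)
    return magnitude if same_sign else -magnitude


def sign(x) -> int:
    """Return 1, 0 or -1 for positive, zero or negative x."""
    return (x > 0) - (x < 0)


def summation(f, a: int, b: int):
    """Sum f(i) for every integer i from a to b, with b included."""
    return sum(f(i) for i in range(a, b + 1))
```

Python's built-in `//` rounds toward negative infinity (`-7 // 4` gives `-2`), so a decoder prototype written in Python would need a helper of this kind wherever the text uses "/" on possibly negative operands.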
4.2 Logical Operators
The following logical operators apply to this document.
|| Logical OR
! Logical NOT
4.3 Relational operators
The following relational operators apply to this document.
> greater than
≥ greater than or equal to
< less than
≤ less than or equal to
= equal to
≠ not equal
max[, ..., ] Maximum value in the argument list
min[, ..., ] Minimum value in the argument list
4.4 Bitwise operators
The following bitwise operators apply to this document.
| OR
~ Invert
a >> b Shift the two's-complement integer a to the right by b bits. This operation is defined only when b is positive. Bits shifted into the most significant positions take the value of the most significant bit of a before the shift.
a << b Shift the two's-complement integer a to the left by b bits. This operation is defined only when b is positive. Bits shifted into the least significant positions are equal to zero.
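The shift rules in 4.4 can be checked with a few lines of Python, whose built-in `>>` on integers already replicates the sign bit. This is a minimal sketch; `shift_right`, `shift_left` and `bits` are hypothetical helper names, and the 8-bit display width is an assumption made only to print the bit patterns.

```python
def shift_right(a: int, b: int) -> int:
    """Arithmetic right shift: vacated high bits copy the sign bit of a."""
    if b <= 0:
        raise ValueError("shift count must be positive")
    return a >> b


def shift_left(a: int, b: int) -> int:
    """Left shift: vacated low bits are filled with zeros."""
    if b <= 0:
        raise ValueError("shift count must be positive")
    return a << b


def bits(value: int, width: int) -> str:
    """Render value as a two's-complement bit pattern of the given width."""
    return format(value & ((1 << width) - 1), f"0{width}b")
```

For example, `bits(-8, 8)` is `'11111000'` and `bits(shift_right(-8, 1), 8)` is `'11111100'`: the new most significant bit repeats the old one, so the result is still negative.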
4.5 Assignment
The following assignment operations apply to this document.
== assignment operator
x = a..b x takes the values from a to b (inclusive), where x, a and b are integers.
++ Increment: x++ is equivalent to x = x + 1. When used as an array subscript, the value of the variable is taken before the increment.
-- Decrement: x-- is equivalent to x = x - 1. When used as an array subscript, the value of the variable is taken before the decrement.
+= Increment by the specified value: x += 3 is equivalent to x = x + 3, and x += (-3) is equivalent to x = x + (-3).
-= Decrement by the specified value: x -= 3 is equivalent to x = x + (-3), and x -= (-3) is equivalent to x = x - (-3).
4.6 Mnemonics
The following mnemonics apply to this document.
rpchof Remainder polynomial coefficients, highest order first.
bslbf Bit string, left bit first, where "left" refers to the order in which bit strings are written in GB/T 17191. Bit strings are written as strings of 1s and 0s within single quotes, e.g. '10000001'. Blanks within a bit string are for ease of reading and have no significance. (bit string, left bit first)
uimsbf Unsigned integer, most significant bit first. (unsigned integer, most significant bit first)
bsmbf Bit string written as a quoted string of 1s and 0s, right bit first. For example, encoding the 5-bit value 6 followed by the 3-bit value 2 gives the bit string '01000110'.
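The bsmbf example can be reproduced by placing each field above the previous one, so that the first encoded value ends up in the rightmost bits. The Python sketch below is one reading of the example, not the standard's definition; `pack_right_first` and `pack_left_first` are hypothetical names, and the left-first variant is included only for contrast.

```python
def _check(value: int, nbits: int) -> None:
    """Reject values that do not fit in an unsigned field of nbits bits."""
    if nbits <= 0 or not 0 <= value < (1 << nbits):
        raise ValueError(f"{value} does not fit in {nbits} bits")


def pack_right_first(fields) -> str:
    """Pack (value, nbits) pairs so the first field occupies the rightmost bits."""
    word, pos = 0, 0
    for value, nbits in fields:
        _check(value, nbits)
        word |= value << pos            # place the field above those already written
        pos += nbits
    return format(word, f"0{pos}b") if pos else ""


def pack_left_first(fields) -> str:
    """Pack (value, nbits) pairs in writing order, most significant bit first."""
    out = []
    for value, nbits in fields:
        _check(value, nbits)
        out.append(format(value, f"0{nbits}b"))
    return "".join(out)
```

With the fields `[(6, 5), (2, 3)]`, `pack_right_first` yields `'01000110'`, matching the example in the text, while `pack_left_first` yields `'00110010'`.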
4.7 Abbreviations
The following abbreviations apply to this document.
FFT: Fast Fourier Transform
MDCT: Modified Discrete Cosine Transform
IMDCT: Inverse Modified Discrete Cosine Transform
MDST: Modified Discrete Sine Transform
MDFT: Modified Discrete Fourier Transform
IMDFT: Inverse Modified Discrete Fourier Transform
MCR: Maximal Correlation Rotation
PCA: Principal Component Analysis
AASF: AVS2 Audio Storage Format
AATF: AVS2 Audio Transport Format
CRC: Cyclic Redundancy Check
BWE: Bandwidth Extension (high-frequency bandwidth extension)
TNS: Temporal Noise Shaping
5 Bitstream syntax rules
Each data item in the bitstream is shown in bold and is described by its name, its length in bits, and a mnemonic for its type and order of transmission.
The action caused by a decoded data element in the bitstream depends on the value of that data element and on the data elements previously decoded. The following constructs express the conditions under which data elements are present, and are shown in normal type.
Note 1: Unless otherwise stated, "bit" in this part refers to a binary digit.
Note 2: The syntax in this part follows the "C" code convention that a condition is true when the variable or expression is non-zero and not true when it is zero.
while (condition) {
    data_element
    ...
}
If the condition is true, the group of data elements occurs next in the data stream. This repeats until the condition is not true.

do {
    data_element
    ...
} while (condition)
The group of data elements occurs at least once and is repeated until the condition is not true.

if (condition) {
    data_element
    ...
} else {
    data_element
    ...
}
If the condition is true, the first group of data elements occurs next in the data stream. If the condition is not true, the second group of data elements occurs next in the data stream.

for (expr1; expr2; expr3) {
    data_element
    ...
}
expr1 is an expression specifying the initial state of the loop; it normally initializes a counter. expr2 is a condition tested before each iteration of the loop; the loop terminates when the condition is not true. expr3 is an expression executed at the end of each iteration; it normally increments a counter.
Note 3: The most common usage of this construct is
for (i = 0; i < n; i++) {
    data_element
    ...
}
The group of data elements occurs n times. Conditional constructs within the group may depend on the value of the loop control variable i, which is set to 0 for the first occurrence, 1 for the second occurrence, and so on.
switch (expr) {
case constcase1:
    data_element1
    break
case constcase2:
    data_element2
    break
...
case constcasen:
    data_elementn
    break
default:
    data_elementdefault
    break
}
The data element produced depends on the value of the expression expr. When the value of expr is constcase1, data_element1 is produced; when it is constcase2, data_element2 is produced; and so on, up to data_elementn when the value of expr is constcasen. When the value of expr is not equal to any of constcase1, constcase2, ..., constcasen, data_elementdefault is produced.

A variant of this construct has no break after a case, for example
switch (expr) {
case constcase1:
    data_element1
case constcase2:
    data_element2
    break
...
case constcasen:
    data_elementn
    break
default:
    data_elementdefault
    break
}
When the value of expr is constcasex, data elements are produced starting from the corresponding case constcasex until a break occurs. When the value of expr is constcase1, data_element1 and data_element2 are produced; when the value of expr is constcasen, data_elementn is produced.
Note 4: The group of data elements may contain nested constructs. For brevity, "[]" is omitted when there is only one data element at a time.
data_element[] data_element is an array of data; the number of data elements depends on the context.
data_element[n] data_element[n] is the nth element of the array of data.
data_element[m][n] data_element[m][n] is the mth, nth element of the two-dimensional array.
data_element[l][m][n] data_element[l][m][n] is the (l+1)th, (m+1)th, (n+1)th element of the three-dimensional array.
data_element[m..n] data_element[m..n] denotes the bits from bit m to bit n, inclusive.
Although the syntax is expressed in procedural terms, this clause should not be taken as implementing a reliable decoding procedure. It merely defines an error-free input bitstream.
Definition of the byte_alignment function:
The byte_alignment() function returns '1' if the current position is on a byte boundary, that is, if the next bit in the bitstream is the first bit of a byte. Otherwise it returns '0'.
Definition of the nextbits function:
The nextbits() function compares a bit string with the next bits to be decoded in the bitstream.
Definition of the feof function:
The feof() function determines whether the stream or file has ended. It returns '1' at the end of the stream or file and '0' otherwise.
The second column of the bitstream syntax tables gives the number of bits of each data element. "X..Y" means that the number of bits is between X and Y, inclusive. "{X;Y}" means that the number of bits is X or Y, depending on the values of other data elements in the bitstream.
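The three helper functions can be pictured as methods of a small bit reader. The Python sketch below is illustrative only: the class name `BitReader`, the `read_uimsbf` method, the string argument of `nextbits` and its '1'/'0'-style return value are assumptions, as is the choice to read each byte most significant bit first.

```python
class BitReader:
    """Minimal reader over a bytes object; bits are consumed MSB first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0                      # bit offset from the start of the stream

    def feof(self) -> int:
        """Return 1 when every bit has been consumed, otherwise 0."""
        return 1 if self.pos >= 8 * len(self.data) else 0

    def byte_alignment(self) -> int:
        """Return 1 when the next bit is the first bit of a byte, otherwise 0."""
        return 1 if self.pos % 8 == 0 else 0

    def _peek(self, nbits: int) -> int:
        if self.pos + nbits > 8 * len(self.data):
            raise EOFError("not enough bits left in the stream")
        value = 0
        for i in range(self.pos, self.pos + nbits):
            bit = (self.data[i // 8] >> (7 - i % 8)) & 1
            value = (value << 1) | bit
        return value

    def nextbits(self, pattern: str) -> int:
        """Return 1 if the upcoming bits equal pattern; nothing is consumed."""
        if self.pos + len(pattern) > 8 * len(self.data):
            return 0
        return 1 if self._peek(len(pattern)) == int(pattern, 2) else 0

    def read_uimsbf(self, nbits: int) -> int:
        """Consume nbits and return them as an unsigned integer, MSB first."""
        value = self._peek(nbits)
        self.pos += nbits
        return value
```

A parser built on the syntax constructs of this clause would typically call `nextbits` to look ahead for a known bit pattern, `byte_alignment` before skipping stuffing bits, and `feof` as a loop condition.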
6 Audio coding framework
6.1 Overview
With the adoption of 3D audio (three-dimensional audio) systems, application environments such as 4K and 3D movies, ultra-high-definition TV, network high-definition video, virtual reality, and networked and mobile audio have created a need for efficient, high-quality compression coding of 3D audio data. Because the amount of data in a 3D audiovisual system is much larger than in a traditional audiovisual system, storage space and transmission bandwidth (or data traffic) overheads increase. Improving compression efficiency while maintaining high sound quality is therefore the main problem that AVS2 audio coding needs to solve. AVS2 audio coding is a 3D audio coding technology that offers high compression and high quality and includes audio object coding. It features advanced technology, rich codec options, a high degree of system integration, flexible configuration and wide adaptability, together with high performance, high compression, high sound quality and low complexity.
The AVS2 audio coding framework is shown in Figure 1. The system consists of two coding profiles: base channel coding (base_profile) and 3D audio object coding (3D_profile). The base channel coding profile combines mono, stereo (two-channel), surround sound (multi-channel) and 3D sound bed channel coding techniques, and integrates two coding options: General Audio Coding (GA) and Lossless Audio Coding (LL). The general coding option combines two coding modes, high bit rate and low bit rate. 3D audio object coding comprises coding of the object audio data and of the object metadata. The object audio data shares the base channel coding with the 3D sound bed, and an added audio object metadata coding module (Audio Object Coding, Ob) encodes object description information such as the type, spatial position and motion trajectory of the 3D audio object sound source. Compared with current audio standards, AVS2 audio coding has five features.
a) Bitstream encapsulation takes both storage and transmission requirements into account. Two formats are defined: a storage format without frame header information, AASF (AVS Audio Storage Format), and a transmission format with frame header and error check information, AATF (AVS Audio Transport Format). This minimizes coding redundancy while meeting application requirements. For details, see Appendix A. The bit rate difference between the two encapsulation formats is about 740 bit/s.
b) Codec framework.