GB/T 25724-2017_English: PDF (GB/T25724-2017)
Standard ID | Version | Standard Title (Description) | Status
GB/T 25724-2017 | English | Technical specifications for surveillance video and audio coding | Valid
GB/T 25724-2010 | English | Technical specification of surveillance video and audio coding | Obsolete
Standard ID | GB/T 25724-2017 (GB/T25724-2017)
Description (Translated English) | Technical specifications for surveillance video and audio coding
Sector / Industry | National Standard (Recommended)
Classification of Chinese Standard | A91
Classification of International Standard | 13.310
Word Count Estimation | 314,361
Date of Issue | 2017-03-09
Date of Implementation | 2017-06-01
Older Standard (superseded by this standard) | GB/T 25724-2010
Drafting Organization | Beijing Zhongxing Microelectronics Co., Ltd., Beijing Zhongshang Security Technology Development Company, Zhongxing Electronic Co., Ltd., Hangzhou Hangsheng Digital Equipment Technology Co., Ltd., Ministry of Public Security and Police Electronic Product Quality Inspection Center, Shanxi (Shanghai) Co., Ltd., Beijing welcomes Bo Electronic Technology Co., Ltd., Hangzhou Hikvision Digital Technology Co., Ltd., Hunan State Branch Microelectronics Co., Ltd., Zhejiang Dahua Technology Co., Ltd., Suzhou Keda Technology Co., Ltd., Zhejiang Vision Technology Co., Ltd., Tianjin World Albert CHAN Digital Technology Co., Ltd., Beijing as the Aegis Security Technology Co., Ltd., Beijing-core Technology Co., Ltd., Shanghai Xiling Information Technology Co., Ltd
Administrative Organization | National Security and Alarm System Standardization Technical Committee (SAC/TC 100)
Regulation (derived from) | National Standard Notice No. 5 of 2017
Proposing organization | Ministry of Public Security of the People's Republic of China
Issuing agency(ies) | General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China, China National Standardization Administration Committee

Standard ID | GB/T 25724-2010 (GB/T25724-2010)
Description (Translated English) | Technical specification of surveillance video and audio coding
Sector / Industry | National Standard (Recommended)
Classification of Chinese Standard | A91
Classification of International Standard | 13.310
Word Count Estimation | 191,169
Date of Issue | 2010-12-23
Date of Implementation | 2011-05-01
Quoted Standard | GB/T 20090.2-2006
Drafting Organization | First Research Institute of Ministry of Public Security
Administrative Organization | National Security Alarm System Standardization Technical Committee
Regulation (derived from) | Announcement of Newly Approved National Standards No. 10 of 2010 (No. 165 overall)
Proposing organization | People's Republic of China Ministry of Public Security
Issuing agency(ies) | Administration of Quality Supervision, Inspection and Quarantine of People's Republic of China; Standardization Administration of China
Summary | This standard specifies the technical requirements for the encoding and decoding processes of digital video and audio in security surveillance applications. This standard applies to real-time video and audio compression, transmission, playback and storage services in the security field; other fields that need video and audio coding may also refer to it.
GB/T 25724-2017
GB
NATIONAL STANDARD OF THE
PEOPLE’S REPUBLIC OF CHINA
ICS 13.310
A 91
Replacing GB/T 25724-2010
Technical specifications for
surveillance video and audio coding
ISSUED ON: MARCH 9, 2017
IMPLEMENTED ON: JUNE 1, 2017
Issued by: General Administration of Quality Supervision, Inspection and
Quarantine of the People’s Republic of China;
Standardization Administration of the People’s Republic of
China.
Table of Contents
Foreword
Introduction
1 Scope
2 Normative reference
3 Terms, definitions and abbreviations
3.1 Terms and definitions
3.2 Abbreviations
4 Agreement
4.1 Arithmetic operators
4.2 Logical operators
4.3 Relational operators
4.4 Bit operators
4.5 Assignment operators
4.6 Mathematical functions
4.7 Syntax elements, variables and tables
4.8 Text description of logical operators
4.9 Process
5 Video section
5.1 Coded bitstream and output data format
5.2 Syntaxes and semantics
5.3 Decoding process
5.4 Parsing process
6 Audio part
6.1 General description
6.2 Encoder function description
6.2.1 Pre-processing
6.3 Decoder function description
6.4 Bit allocation description
6.5 Storage, transmission interface format
Appendix A (Normative) Hypothetical reference decoder (HRD)
Appendix B (Normative) Byte stream format
Appendix C (Normative) Video profile and level
Appendix D (Normative) Video usability information (VUI)
Appendix E (Normative) Supplemental enhancement information (SEI)
Appendix F (Normative) Intelligent analysis data description
Appendix G (Normative) Audio profile and level
Appendix H (Normative) Exception sound event type definition
Appendix I (Informative) VAD detection
Appendix J (Informative) Noise elimination
References
Technical specifications for
surveillance video and audio coding
1 Scope
This Standard specifies the decoding process of digital video and audio compression
coding for public security video surveillance.
This Standard applies to real-time audio and video compression, transmission,
playback and storage services in the field of public security; other fields that need
audio and video coding may also refer to this Standard.
2 Normative reference
The following document is indispensable for the application of this document. For
dated references, only the dated edition applies to this document. For undated
references, the latest edition (including all amendments) applies to this document.
RFC 3548 The Base16, Base32, and Base64 Data Encodings
3 Terms, definitions and abbreviations
3.1 Terms and definitions
For the purpose of this document, the following terms and definitions apply.
3.1.1
NAL unit
A syntax structure that contains an indication of the data type and the number of bytes
contained in the subsequent data. The data appears in RBSP form and, where necessary,
is interspersed with emulation prevention bytes.
3.1.2
NAL unit stream
A sequence of NAL units.
3.1.3
4.9 Process
The process is used to describe the decoding of the syntax elements. All the syntax
elements and uppercase variables that belong to the current syntax structure, as well
as the associated syntax structures, are available in both the specification and the call
of the process. The specification of the process may also contain lowercase variables
that are explicitly specified as input. Each specification explicitly specifies the output.
The output can be uppercase variables or lowercase variables.
In the specification of the process, a particular macroblock can be represented by a
variable name whose value is equal to its macroblock index.
5 Video section
5.1 Coded bitstream and output data format
5.1.1 Bitstream format
This clause specifies the relationship between the NAL unit stream and the byte stream,
both of which are referred to as bitstreams.
The NAL unit stream format consists of a series of syntax structures called NAL units,
arranged by decoding order. The decoding order and contents of the NAL units in the
NAL unit stream are constrained.
The byte stream can be constructed by the NAL unit stream, by arranging the NAL
units in the decoding order, and adding a start code prefix and a number of zero bytes
to each NAL unit to form a byte stream. The NAL unit stream format can be extracted
from the byte stream format by searching a unique start code prefix in the byte stream.
Except for the byte stream format, other methods of constructing the NAL unit are not
specified in this Standard. The byte stream format is specified in Annex B.
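The extraction of NAL units from a byte stream can be sketched as follows. This is only an illustration: it assumes the start code prefix is the three-byte pattern 0x000001 used by H.264-style byte streams and that zero bytes in front of a prefix are padding. The authoritative rules are those of Annex B, which is not reproduced here, and the function name split_nal_units is hypothetical.

```python
START_CODE = b"\x00\x00\x01"  # assumed prefix value; Annex B is authoritative


def split_nal_units(byte_stream: bytes) -> list[bytes]:
    """Return the NAL units found after each start code prefix, in stream order."""
    units = []
    pos = byte_stream.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = byte_stream.find(START_CODE, begin)
        end = len(byte_stream) if nxt == -1 else nxt
        # Assumption: zero bytes before the next prefix (or at the end of
        # the stream) are padding and do not belong to the NAL unit.
        unit = byte_stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units
```

Because the units are returned in the order in which they are found, the decoding order of the NAL unit stream is preserved.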
5.1.2 Picture format
This clause specifies the relationship between the source determined by the bitstream
and the decoded frame.
The video stream represented by the bitstream is a series of frames arranged in
decoding order.
Each source or decoded frame is composed of one or more video sample point arrays:
- array of only luma (Y) (monochrome);
- array of luma and two chroma (YCbCr);
The following functions are used for the syntax description. These functions assume
that there is a bitstream pointer in the decoder that points to the next bit position in the
bitstream where the decoding process is to be read. Specific requirements are as
follows:
Specification for byte_aligned ():
- If the current position of the bitstream is at the boundary of the byte, that is, the
next bit in the bitstream is the first bit of the byte, then the return value of
byte_aligned () is TRUE;
- otherwise, the return value of byte_aligned () is FALSE.
Specification for get_left_ae_bits ():
- Add 8 to the value of the counter count in the entropy decoder and take the result
modulo 8; if the result is equal to 0, continue to parse with the fixed
probability of 128;
- if the result is not equal to 0, continue to parse with the fixed probability of 128 to
obtain the number of bits given by the modulo result plus 8.
Specification for more_data_in_byte_stream (), which is used in the byte stream NAL
unit syntax specified in Annex B:
- If there is more data in the byte stream, the return value of
more_data_in_byte_stream () is TRUE;
- otherwise, the return value of more_data_in_byte_stream () is FALSE.
Specification for more_rbsp_data ():
- If there is more data in RBSP before rbsp_trailing_bits (), the return value of
more_rbsp_data () is TRUE;
- otherwise, the return value of more_rbsp_data () is FALSE.
The method of determining whether there is more data in RBSP is not specified in this
Standard.
next_bits (n) provides the next n bits in the bitstream without changing the bitstream
pointer. This function makes the next n bits in the bitstream visible. When it is used in
the byte stream specified in Annex B, the return value of next_bits (n) is 0 if the
remaining byte stream has less than n bits.
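The pointer-based functions of this clause (byte_aligned, next_bits and read_bits) can be sketched with a small reader class. This is a minimal illustration, not the normative definition: the class name BitReader, the most-significant-bit-first order within a byte and the EOFError raised on over-reading are assumptions.

```python
class BitReader:
    """Bitstream pointer over a bytes object (bit order assumed MSB first)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # bitstream pointer, counted in bits from the start

    def bits_left(self) -> int:
        return len(self.data) * 8 - self.pos

    def byte_aligned(self) -> bool:
        # TRUE when the next bit is the first bit of a byte.
        return self.pos % 8 == 0

    def next_bits(self, n: int) -> int:
        # Peek at n bits without moving the pointer; 0 if fewer than n remain.
        if n > self.bits_left():
            return 0
        value = 0
        for i in range(self.pos, self.pos + n):
            bit = (self.data[i // 8] >> (7 - i % 8)) & 1
            value = (value << 1) | bit
        return value

    def read_bits(self, n: int) -> int:
        # Read n bits and advance the pointer; read_bits(0) returns 0.
        if n > self.bits_left():
            raise EOFError("not enough bits left")  # behaviour assumed
        value = self.next_bits(n)
        self.pos += n
        return value
```

For example, on the two bytes 0xA5 0x0F, next_bits(4) returns 10 and leaves the pointer unchanged, while read_bits(3) returns 5 and leaves the pointer at bit 3.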
read_bits (n) reads the next n bits from the bitstream, and moves the bitstream
pointer forward by n bits. When n is equal to 0, the return value of read_bits (n) is 0
may also contain some emulation_prevention_three_byte. NumBytesInNALunit is
required for decoding the NAL unit. In order to derive NumBytesInNALunit,
the boundaries of the NAL unit need to be delimited. Annex B specifies a delimiting
method for the byte stream format. Other delimiting methods may be specified outside
this Standard.
forbidden_zero_bit indicates the version of the SVAC standard that the video stream
supports. forbidden_zero_bit shall be equal to 1.
forbidden_zero_bit equal to 0 indicates that the video stream supports GB/T 25724-
2010 standard.
When nal_ref_idc is not equal to 0, the contents of the NAL unit contain a sequence
parameter set, or a picture parameter set, or a security parameter set, or tiles of a
reference picture. When the nal_ref_idc of a tile NAL unit of a coded picture is equal
to 0, the nal_ref_idc of all the tile NAL units of the coded picture shall be equal to 0.
nal_unit_type indicates the type of RBSP data structure in the NAL unit, see Table 30.
The VCL NAL unit refers to those NAL units with the value of nal_unit_type equal to 1,
2, 3 or 4. All other NAL units are called non-VCL NAL units.
NOTE 1: The VCL specification is for effectively representing the contents of the video data.
The NAL specification is for formatting the data and providing header information for storage or
transmission over a variety of communication channels. Each NAL unit contains an integer
number of bytes. The NAL unit specifies a general format that applies to both packet-oriented
and bitstream-oriented systems.
Without affecting the decoding process of NAL units with nal_unit_type not equal to 5
and without affecting the consistency of this Standard, NAL units with nal_unit_type
equal to 5 can be discarded by the decoder.
NOTE 2: This Standard does not specify the decoding process of NAL units whose
nal_unit_type value is reserved. The decoder can ignore (remove from the bitstream and discard)
all contents of NAL units whose nal_unit_type value is reserved.
When the value of nal_unit_type of a tile NAL unit is equal to 2, the value of
nal_unit_type of all other tile NAL units encoding the same picture shall be the same,
and the value of nal_unit_type of all the tile NAL units of the corresponding SVC
enhancement layer coded picture shall be equal to 4. Such a picture is called an IDR picture.
The NAL unit type is as shown in Table 30.
Parameters included in a sequence parameter set RBSP can be used by one or more
pictures or SEI NAL units containing buffering period SEI messages. Each sequence
parameter set RBSP takes effect as soon as it is received by the decoder,
and the previously valid sequence parameter set RBSP (if any) becomes invalid. Not more
than one sequence parameter set RBSP is valid at any given time in the decoding
process.
When a sequence parameter set RBSP is used by an SEI NAL unit containing a buffering
period SEI message, the SEI NAL unit shall be located after the sequence parameter
set RBSP.
Parameters included in the picture parameter set RBSP can be used by the coded tile
NAL units of a coded picture. Each picture parameter set RBSP takes effect as soon
as it is received by the decoder, and the previously valid picture parameter set
RBSP (if any) becomes invalid. For a layer picture of SVC, not more than one picture
parameter set RBSP is valid at any given time in the decoding process.
The specification for the relationship between the syntax element values and the other
syntax elements in the sequence parameter sets and the picture parameter set is only
for the valid sequence parameter set and the valid picture parameter set.
During the decoding process, the parameter values of the valid picture parameter set
and the valid sequence parameter set shall remain valid.
5.2.4.3.2.2 Taking effect of security parameter set RBSP
Parameters included in the security parameter set RBSP can be used by one or more
other types of NAL units. At the beginning of the decoding process, each security
parameter set RBSP takes effect as soon as it is received by the decoder,
and the previously valid security parameter set RBSP (if any) becomes invalid. When
ldp_mode_flag of the sequence parameter set is equal to 1, the security parameter set
shall only appear before the IDR picture. Not more than one security parameter set
RBSP is valid at any given time in the decoding process.
NOTE: In some applications, the security parameter set can also be passed to the decoder via
other reliable mechanisms.
5.2.4.3.2.3 Order of VCL NAL unit and its relationship with encoded pictures
Each VCL NAL unit is part of a coded picture.
The order of the VCL NAL units in a coded picture is defined as follows:
- the tile order of a picture shall be in ascending order of the first CTU index of the
tile;
lf_mode_delta_enable [i] indicates whether the mode related loop filter parameter
difference update is enabled: equal to 0 indicates that the update is disabled;
equal to 1 indicates that the update is enabled.
lf_mode_deltas [i] indicates the mode related loop filter parameter difference.
lf_mode_deltas_sign [i] indicates the sign of the mode related loop filter parameter
difference.
mode_deltas [i] = lf_mode_deltas [i] × lf_mode_deltas_sign [i]
picture_sao_enable [i] indicates whether the sample adaptive offset of the luma and
chroma components is enabled. picture_sao_enable [i] equal to 0 indicates that it is
disabled; equal to 1 indicates that it is enabled, where i equal to 0 indicates the luma
component; i equal to 1 or 2 indicates the chroma component.
picture_alf_enable [i] is the enable flag of the picture adaptive loop filter, indicating
whether the adaptive loop filter of the luma and chroma components of the current
picture is enabled. picture_alf_enable [i] equal to 0 indicates that the ith component of
the current picture shall not use the adaptive loop filter; equal to 1 indicates that the ith
component of the current picture uses the adaptive loop filter, where i equal to 0 indicates
the luma component; i equal to 1 or 2 indicates the chroma component.
The value of alf_filter_num_minus1 plus 1 indicates the number of adaptive loop filters
for the luma component of the current picture.
The value of alf_filter_num_minus1 shall be 0 to 15.
alf_region_distance [i] indicates the difference between the base unit start sign of
luma component’s ith adaptive loop filter region and the base unit start sign of luma
component’s i-1th adaptive loop filter region. The value of alf_region_distance [i] shall
be 1 to 15.
If alf_region_distance [i] does not exist in the bitstream, then when i is equal to 0, the
value of alf_region_distance [i] is 0; when i is not equal to 0 and the value of
alf_filter_num_minus1 is 15, the value of alf_region_distance [i] is 1. The bitstream
shall satisfy that the sum of alf_region_distance [i] (i = 0 ~ alf_filter_num_minus1) is
less than or equal to 15.
alf_coeff_luma [i] [j] indicates the jth coefficient of the luma component of the ith
adaptive loop filter. The value range of alf_coeff_luma [i] [j] (j = 0 ~ 8) obtained from
decoding in the bitstream shall be -64 to 63, and the value range of alf_coeff_luma [i]
[9] shall be -1088 ~ 1071.
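The value ranges stated above for the luma adaptive loop filter parameters can be checked with a small validator. This is a sketch under assumptions: the function name check_alf_params is hypothetical, alf_region_distance is expected to already include any inferred values, and the per-element range of alf_region_distance is not checked because the inferred value 0 for i equal to 0 lies outside the stated range 1 to 15.

```python
def check_alf_params(alf_filter_num_minus1, alf_region_distance, alf_coeff_luma):
    """Return a list of violated constraints (empty when all checks pass)."""
    errors = []
    if not 0 <= alf_filter_num_minus1 <= 15:
        errors.append("alf_filter_num_minus1 out of 0..15")
    # Sum over i = 0 ~ alf_filter_num_minus1, inferred values included.
    if sum(alf_region_distance) > 15:
        errors.append("sum of alf_region_distance exceeds 15")
    for i, coeffs in enumerate(alf_coeff_luma):
        for j, c in enumerate(coeffs):
            # Coefficient 9 has a wider range than coefficients 0..8.
            lo, hi = (-1088, 1071) if j == 9 else (-64, 63)
            if not lo <= c <= hi:
                errors.append(f"alf_coeff_luma[{i}][{j}] out of {lo}..{hi}")
    return errors
```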
alf_coeff_chroma [0] [j] indicates the coefficient of the jth adaptive loop filter of the
sao_merge_flag equal to 0 indicates that the parameter is not merged; equal to 1
indicates that the parameter is merged, and the SAO parameter is the same as the
SAO parameter of the CTU adjacent on its left or adjacent above it.
sao_merge_type equal to 1 indicates that the SAO parameter of the current CTU uses
the SAO parameter of the adjacent CTU on the left; equal to 0 indicates that the SAO
parameter of the current CTU uses the SAO parameter of the adjacent CTU above it.
sao_mode [compIdx] equal to 0 indicates that the SAO mode of the compIdxth
component in the current CTU is SAO_OFF; equal to 1 indicates that the SAO mode
of the compIdxth component in the current CTU is determined by sao_type [compIdx].
sao_type [compIdx] equal to 0 indicates that the SAO mode of the compIdxth
component in the current CTU is SAO_BO; equal to 1 indicates that the SAO mode of
the compIdxth component in the current CTU is SAO_EO.
sao_start_band [compIdx] indicates the start compensation interval of the compIdxth
component in the current CTU in SAO_BO mode, and the value shall be 0 ~ 31.
sao_offset_sign [compIdx] [j] indicates the sign of sao_offset [compIdx] [j] in
SAO_BO mode. sao_offset_sign [compIdx] [j] equal to 0 indicates that the value of the
corresponding sao_offset [compIdx] [j] is positive; equal to 1 indicates that the value
of the corresponding sao_offset [compIdx] [j] is negative.
sao_offset_abs [compIdx] [j] indicates the absolute value of the compensation value
sao_offset [compIdx] [j] in SAO_BO mode; the value shall be 0 ~ (1 << (Min
(bit_depth, 10) - 5)) - 1.
sao_edge_type [compIdx] indicates the angular direction of the compIdxth
component in the current CTU in SAO_EO mode. sao_edge_type [compIdx] equal to
0 indicates EO_0°; equal to 1 indicates EO_90°; equal to 2 indicates EO_135°; equal
to 3 indicates EO_45°.
sao_edge_offset [compIdx] [j] indicates the corresponding compensation value in
SAO_EO mode.
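The SAO semantics above can be illustrated with a few helper functions. This is a sketch only: the function names are hypothetical, and combining sao_offset_abs and sao_offset_sign into a signed sao_offset follows the stated sign convention rather than a formula quoted from the Standard.

```python
EDGE_DIRECTION = {0: "EO_0", 1: "EO_90", 2: "EO_135", 3: "EO_45"}  # degrees


def sao_mode_name(sao_mode: int, sao_type: int = 0) -> str:
    """SAO mode of one component of the current CTU."""
    if sao_mode == 0:
        return "SAO_OFF"
    return "SAO_EO" if sao_type == 1 else "SAO_BO"


def max_sao_offset_abs(bit_depth: int) -> int:
    """Upper bound of sao_offset_abs: (1 << (Min(bit_depth, 10) - 5)) - 1."""
    return (1 << (min(bit_depth, 10) - 5)) - 1


def sao_bo_offset(sao_offset_abs: int, sao_offset_sign: int, bit_depth: int) -> int:
    """Signed band offset; sign 0 means positive, sign 1 means negative."""
    if not 0 <= sao_offset_abs <= max_sao_offset_abs(bit_depth):
        raise ValueError("sao_offset_abs out of range")
    return -sao_offset_abs if sao_offset_sign == 1 else sao_offset_abs
```

With these definitions the upper bound of sao_offset_abs is 7 for 8-bit video and 31 for bit depths of 10 and above.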
alf_ctu_enable [compIdx] equal to 0 indicates that adaptive loop filtering is not
performed on the compIdxth component of the current CTU. alf_ctu_enable [compIdx] equal to 1
indicates that adaptive loop filtering is performed on the compIdxth component of the current CTU.
5.2.4.4.6 Authentication data RBSP semantics
frame_num indicates that the picture of authentication data shall be included; the
picture is the same picture as the authentication data frame_num which is closest
before the authentication data NAL unit. When frame_num is equal to 0, frame_num
inter_block indicates whether the current block is an inter coded block.
skip_flag indicates whether the current block is skipped.
coeff_value indicates the value of the block coefficients.
coeff_sign indicates the sign of the block coefficients.
tx_size indicates the size of the transform matrix used by the current block. tx_size
equal to 0 indicates that the transform matrix is TX_4 × 4; equal to 1 indicates that the
transform matrix is TX_8 × 8; equal to 2 indicates that the transform matrix is TX_16 ×
16; equal to 3 indicates that the transform matrix is TX_32 × 32.
prev_intra_luma_pred_flag indicates whether the luma intra prediction mode is in the
intra prediction mode prediction list and the prediction list contains 5 most likely
prediction modes.
mpm_idx0 equal to 0 indicates that the current luma prediction mode is the first mode
in the prediction list; equal to 1 indicates that the current luma prediction mode is not
the first mode in the prediction list.
mpm_idx1: when mpm_idx0 is equal to 1, mpm_idx1 + 1 indicates the position of the
current prediction mode in the prediction list. The value of mpm_idx1 shall be 0 ~ 3.
rem_pred_intra_mode indicates the index of the current luma prediction mode among the
remaining 32 prediction modes, that is, excluding the 5 prediction modes in the prediction
list. The value of rem_pred_intra_mode shall be 0 ~ 31.
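How these four syntax elements select one luma intra prediction mode can be sketched as follows. Several points are assumptions not stated in the text above: prev_intra_luma_pred_flag equal to 1 is taken to mean that the mode is in the prediction list, the 37 modes (5 in the list plus 32 remaining) are numbered 0 to 36, and the remaining modes are indexed in ascending order of mode number. The function name luma_intra_mode is hypothetical.

```python
def luma_intra_mode(mpm_list, prev_intra_luma_pred_flag,
                    mpm_idx0=0, mpm_idx1=0, rem_pred_intra_mode=0):
    """Select the luma intra mode from the prediction list or the remainder."""
    if len(mpm_list) != 5:
        raise ValueError("the prediction list holds 5 modes")
    if prev_intra_luma_pred_flag == 1:  # polarity assumed
        if mpm_idx0 == 0:
            return mpm_list[0]          # first mode in the list
        return mpm_list[mpm_idx1 + 1]   # list positions 1..4
    # Assumed ordering: remaining modes sorted by ascending mode number.
    remaining = [m for m in range(37) if m not in mpm_list]
    return remaining[rem_pred_intra_mode]
```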
uv_fllow_y_flag equal to 1 indicates that the chroma intra prediction mode is
consistent with the luma intra prediction mode of its corresponding position, and
uv_fllow_y_flag equal to 0 indicates that the chroma intra prediction mode does not
coincide with the luma intra prediction mode of its corresponding position.
chroma_intra_mode indicates the chroma intra prediction mode index.
block_reference_mode indicates the reference frame mode of the current block, the
value is SINGLE_REFERENCE or COMPOUND_REFERENCE. If
block_reference_mode does not exist in the code stream, the value of
block_reference_mode is equal to frame_reference_mode. If block_reference_mode
is equal to COMPOUND_REFERENCE, is_compound is equal to 1, otherwise
is_compound is equal to 0.
ref_frame indicates the current prediction block reference frame index. When
block_reference_mode is equal to SINGLE_REFERENCE, ref_frame has five possible
values, namely DYNAMIC_REF, STATIC_REF, OPTIONAL_REF, DYNAMIC_REF_1
and DYNAMIC_REF_2. When block_reference_mode is equal to
OSD extension information with two or more sub_type of the same value shall not
appear in one NAL unit.
NOTE: The last extension information is valid when OSD extension information with more than
one sub_type of the same value appears in the same NAL unit.
code_type is an 8-bit unsigned integer, representing the encoding type of the OSD
character. The value of code_type equal to 0 indicates encoding with UTF-8.
align_type is an 8-bit unsigned integer, representing the alignment type of the OSD
character. The value of align_type equal to 0 indicates left alignment; equal to 1
indicates right alignment.
char_size is an 8-bit unsigned integer, representing the OSD character size,
expressed in sample units.
char_type is an 8-bit unsigned integer, representing the OSD character type.
char_type equal to 0 indicates white background with black edges; equal to 1 indicates
black background with white edges; equal to 2 indicates white; equal to 3 indicates
black; equal to 4 indicates automatic inverse color.
top_low8 and top_high8 form a 16-bit unsigned integer top, representing the position
of the upper border of the OSD character information in the picture, expressed in
sample points. The value of top is calculated as follows:
top = (top_high8 << 8) + top_low8
left_low8 and left_high8 form a 16-bit unsigned integer left, representing the position
of the left border of the OSD character information in the picture, expressed in sample
points. The value of left is calculated as follows:
left = (left_high8 << 8) + left_low8
len is an 8-bit unsigned integer that indicates the number of bytes occupied by
osd_data, which shall be 0 ~ 243.
res is an 8-bit unsigned integer, and the value shall be between 0 ~ 255.
osd_data is OSD character data. Where '\n' is defined as the row break and '\0' is the
end character. The length of osd_data is len bytes.
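The position arithmetic and the osd_data conventions above can be illustrated as follows. This is a sketch with hypothetical function names: it takes the already-parsed 8-bit fields as arguments, since the order of the fields in the bitstream is not reproduced here, and it handles only code_type equal to 0 (UTF-8), the one value described.

```python
def osd_position(top_high8, top_low8, left_high8, left_low8):
    """Combine the 8-bit halves into the 16-bit top and left positions."""
    top = (top_high8 << 8) + top_low8
    left = (left_high8 << 8) + left_low8
    return top, left


def osd_rows(osd_data: bytes, code_type: int = 0) -> list[str]:
    """Split OSD character data into rows of text."""
    if code_type != 0:
        raise ValueError("only code_type 0 (UTF-8) is described")
    if len(osd_data) > 243:
        raise ValueError("len shall be 0 ~ 243")
    # '\0' is the end character; anything after it is ignored here.
    text = osd_data.split(b"\x00", 1)[0].decode("utf-8")
    # '\n' is the row break.
    return text.split("\n")
```

For example, top_high8 = 1 and top_low8 = 0x2C give top = 300 sample points.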
5.2.4.7.5 Geographic information extension semantics
extension_id is an 8-bit unsigned integer, and the extension_id of the geographic
information extension shall be equal to 0x10.
longitude_type equal to 0 indicates east longitude; equal to 1 indicates west longitude.
syntax structure encapsulated in the NAL unit. This process extracts the RBSP
syntax structure from the NAL unit. If encryption_idc is equal to 1, the encrypted RBSP
needs to be decrypted when extracting the RBSP syntax structure from the NAL unit, to
obtain an unencrypted RBSP. The decryption process is not specified in this Standard.
The RBSP syntax structure in the NAL unit is decoded in the following manner:
- The decoding process of NAL units when the value of nal_unit_type is 1, 2, 3 and
4, see 5.3.3;
- The intra prediction process of NAL units when the value of nal_unit_type is 1 and
2, see 5.3.4;
- The inter prediction process of NAL units when the value of nal_unit_type is 1, see
5.3.5;
- The transform coefficient decoding process and the picture reconstruction process
before the deblocking filter for the coding tree unit in the NAL unit when the
value of nal_unit_type is 1 and 2, see 5.3.6;
- The deblocking filter process of the reconstructed picture of the NAL unit when the
value of nal_unit_type is 1, 2, 3 and 4, see 5.3.7;
- The offset compensation process of sample points of the reconstructed picture of
the NAL unit when the value of nal_unit_type is 1, 2, 3 and 4, see 5.3.8;
- The filter compensation process of sample points of the reconstructed picture of
the NAL unit when the value of nal_unit_type is 1, 2, 3 and 4, see 5.3.9;
- The decoding process of the coding tree unit in the NAL unit before the deblocking
filter when the value of nal_unit_type is 3 and 4, see 5.3.10;
- When the value of nal_unit_type is 7, 8 and 9, the RBSP in the NAL unit is the
sequence parameter set, the picture parameter set and the security parameter set,
respectively. Effective sequence parameter set, picture parameter set and security
parameter set are used in the decoding process of other NAL units;
- The decoding process of the NAL unit when the value of nal_unit_type is 13, see
Clause 6;
- The decoding process of the NAL unit when the value of nal_unit_type is 0, 12, 14
and 15 is not specified in this Standard.
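The list above amounts to a lookup from nal_unit_type to the clauses that apply. A table-driven sketch follows; the names are hypothetical, the clauses are returned in the order of the list rather than in execution order, and nal_unit_type values not mentioned in the list simply yield an empty result.

```python
# (clause, nal_unit_type values it applies to), in the order of the list above.
CLAUSES = [
    ("5.3.3", {1, 2, 3, 4}),   # decoding process of pictures
    ("5.3.4", {1, 2}),         # intra prediction
    ("5.3.5", {1}),            # inter prediction
    ("5.3.6", {1, 2}),         # transform coefficients and reconstruction
    ("5.3.7", {1, 2, 3, 4}),   # deblocking filter
    ("5.3.8", {1, 2, 3, 4}),   # offset compensation
    ("5.3.9", {1, 2, 3, 4}),   # filter compensation
    ("5.3.10", {3, 4}),        # coding tree unit decoding for types 3 and 4
    ("Clause 6", {13}),        # audio
]
PARAMETER_SETS = {7: "sequence parameter set",
                  8: "picture parameter set",
                  9: "security parameter set"}
UNSPECIFIED = {0, 12, 14, 15}  # decoding not specified in this Standard


def decoding_clauses(nal_unit_type: int) -> list[str]:
    """Clauses referenced for the given nal_unit_type."""
    return [clause for clause, types in CLAUSES if nal_unit_type in types]
```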
5.3.3 Decoding process of pictures
5.3.3.1 Classification and correspondence of pictures
For intra prediction, the prediction block size is bound to the transform block size;
because there is only N × N transform, the prediction block size is also N × N. See
5.1.3.3 for adjacent block availability.
5.3.4.4 Acquisition of luma reference sample points
For N × N luma blocks, the reference sample points above the current block are
marked as r [i], and the reference sample points to the left of the current block are
marked as c [j], where r [0] = c [0].
Use I to represent the luma sample value matrix of the picture containing the current
block after compensation (that is, before filtering).
Let the coordinate of the sample point in the upper left corner of the current block in
the picture be (x0, y0); the reference samples are obtained by the following rules:
a) Initialize r [i] and c [j] to 2^(bitdepth - 1), i = 0 ~ 2N, j = 0 ~ 2N;
b) If the upper block is available, then r [i] = I [x0 + i - 1, y0 - 1], i = 1 ~ N, r [i] is
available; otherwise r [i] is not available;
c) If the upper right block is available, then r [i] = I [x0 + i - 1, y0 - 1], i = N + 1 ~ 2N,
r [i] is available; otherwise r [i] is equal to r [N], and whether r [i] is available depends
on whether r [N] is available;
d) If the left block is available, then c [j] = I [x0 - 1, y0 + j - 1], j = 1 ~ N, c [j] is available;
otherwise c [j] is not available;
e) If the lower left block is available, then c [j] = I [x0 - 1, y0 + j - 1], j = N + 1 ~ 2N, c
[j] is available; otherwise c [j] is equal to c [N], and whether c [j] is available depends
on whether c [N] is available;
f) If the sample point (x0 - 1, y0 - 1) is available, then r [0] = I [x0 - 1, y0 - 1], r [0] is
available; otherwise r [0] is not available.
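Rules a) to f) can be sketched as a single function. The sketch makes several assumptions: the initial value is read as 2 to the power (bitdepth - 1), the picture is a list of rows indexed I[y][x] (the text writes I [x, y]), block availability is passed in as booleans because the availability rules of 5.1.3.3 are not reproduced here, and the function name is hypothetical.

```python
def luma_reference_samples(I, x0, y0, N, bit_depth,
                           up, up_right, left, down_left, corner):
    """Return (r, c, r_ok, c_ok) for an N x N block at (x0, y0)."""
    default = 1 << (bit_depth - 1)            # rule a), value assumed
    r = [default] * (2 * N + 1)
    c = [default] * (2 * N + 1)
    r_ok = [False] * (2 * N + 1)
    c_ok = [False] * (2 * N + 1)
    if up:                                    # rule b)
        for i in range(1, N + 1):
            r[i], r_ok[i] = I[y0 - 1][x0 + i - 1], True
    for i in range(N + 1, 2 * N + 1):         # rule c)
        if up_right:
            r[i], r_ok[i] = I[y0 - 1][x0 + i - 1], True
        else:
            r[i], r_ok[i] = r[N], r_ok[N]
    if left:                                  # rule d)
        for j in range(1, N + 1):
            c[j], c_ok[j] = I[y0 + j - 1][x0 - 1], True
    for j in range(N + 1, 2 * N + 1):         # rule e)
        if down_left:
            c[j], c_ok[j] = I[y0 + j - 1][x0 - 1], True
        else:
            c[j], c_ok[j] = c[N], c_ok[N]
    if corner:                                # rule f)
        r[0], r_ok[0] = I[y0 - 1][x0 - 1], True
    c[0], c_ok[0] = r[0], r_ok[0]             # r[0] = c[0]
    return r, c, r_ok, c_ok
```

When a neighbouring block is unavailable, the corresponding samples keep the initial mid-range value (128 for 8-bit video under the assumption above) or repeat r [N] or c [N].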
5.3.4.5 Acquisition of chroma reference sample positions
The acquisition method of chroma reference sample points is the same as that of
luma reference sample points, except that the luma block becomes the corresponding
chroma block.
5.3.4.6 Calculation of prediction sample positions
The prediction sample point matrix predMatrix of the luma blocks and chroma blocks
in each intra prediction mode is derived as follows:
a) Horizontal_PRED
candidate motion vector set. When all the candidate positions are searched, enter the
fourth step. Wherein the candidate position is the same as the first step.
The fourth step, if in the previous block in decoding order, a reference frame used by
the block with the same position of the current block is different from the reference
frame of the current block, then the MV of the corresponding reference frame of this
block is added to the candidate motion vector set.
If this reference frame is in different direction with the reference frame of the current
block, the MV sign of this block is negated (-mvx, -mvy) and added to the candidate
motion vector set.
5.3.5.2.2 Export of luma motion vector
If the skip_flag of the current block is equal to 1, the MV of the current block is {0,0}
and the corresponding reference frame is DYNAMIC_REF. Otherwise:
If the block_reference_mode of the current block is equal to SINGLE_REFERENCE,
then:
a) If the mv_mode of the current block is ZEROMV, the MV of the current block is
{0,0};
b) If the mv_mode of the current block is NEARESTMV, the MV of the current block
is PMV [0];
c) If the mv_mode of the current block is NEARMV, the MV of the current block is
PMV [1];
d) If the mv_mode of the current block is NEWMV, the MV of the current block is
MVP [0] + MVD [0].
If the block_reference_mode of the current block is equal to
COMPOUND_REFERENCE, the current block is in bidirectional prediction mode, and
there are two reference frames of inter prediction, where the first reference frame is
read from the code stream and the second reference frame is fixed to OPTIONAL_REF.
Two motion vectors are exported in two reference frames, namely MV [0] and MV [1]
respectively, and are calculated as follows:
a) If the mv_mode of the current block is ZEROMV, both MV [0] and MV [1] are {0,0};
b) If the mv_mode of the current block is NEARESTMV, both MV [0] and MV [1] are
PMV [0];
c) If the mv_mode of the current block is NEARMV, both MV [0] and MV [1] are PMV
[1];
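The visible rules for the luma motion vector can be sketched as follows. The sketch is limited to what this excerpt states: the NEWMV case in compound mode is not covered, and the text's MVP [0] in the single-reference NEWMV rule is assumed to denote the same predictor as PMV [0]. The function name and the tuple representation of motion vectors are hypothetical.

```python
def luma_motion_vectors(skip_flag, block_reference_mode, mv_mode, pmv, mvd=None):
    """Return a list with one MV (single reference) or two MVs (compound)."""
    if skip_flag == 1:
        return [(0, 0)]  # reference frame is DYNAMIC_REF in this case
    count = 2 if block_reference_mode == "COMPOUND_REFERENCE" else 1
    if mv_mode == "ZEROMV":
        mv = (0, 0)
    elif mv_mode == "NEARESTMV":
        mv = pmv[0]
    elif mv_mode == "NEARMV":
        mv = pmv[1]
    elif mv_mode == "NEWMV" and count == 1:
        # Assumption: MVP[0] in the text is the predictor PMV[0].
        mv = (pmv[0][0] + mvd[0], pmv[0][1] + mvd[1])
    else:
        raise NotImplementedError("rule not covered by this excerpt")
    # In compound mode MV[0] and MV[1] take the same value for these modes.
    return [mv] * count
```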
ah0,0 = Clip1 ((ah'0,0 + 64) >> 7)
ha0,0 = Clip1 ((ha'0,0 + 64) >> 7)
The fractional sample points of other chroma components, e.g.: bb0,0, bc0,0 ... bh0,0, ...
hb0,0, hc0,0, ... hh0,0, need to be calculated using the fractional sample point values (ab'0,0,
ac'0,0, ad'0,0, ae'0,0, af'0,0, ag'0,0, ah'0,0) of the row in which the integer sample points
calculated in the first step are located; the calculation method is as follows:
hh'0,0 = -ah'0,-3 + 6 × ah'0,-2 - 19 × ah'0,-1 + 78 × ah'0,0 + 78 × ah'0,1 - 19 × ah'0,2 + 6 × ah'0,3
- ah'0,4
The final predictor of hh0,0 is calculated as follows:
hh0,0 = Clip1 ((hh'0,0 + 8192) >> 14)
The other chroma sample positions are predicted in the same way, using the
interpolation coefficients of the corresponding positions.
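The rounding arithmetic above can be checked with a short Python sketch. It implements only the two formulas shown in this excerpt: the single-stage rounding (+64, >> 7) and the second-stage 8-tap filter for hh with its final rounding (+8192, >> 14). The function names are hypothetical, and Clip1 is assumed to clip to the range [0, 2^bit_depth - 1], which the excerpt does not define.

```python
from typing import Sequence

# Taps read off the hh' formula above; they sum to 128 (2**7).
HH_TAPS = (-1, 6, -19, 78, 78, -19, 6, -1)


def clip1(value: int, bit_depth: int = 8) -> int:
    """Assumed Clip1: clamp to the valid sample range for bit_depth."""
    return max(0, min((1 << bit_depth) - 1, value))


def round_first_stage(ah_prime: int, bit_depth: int = 8) -> int:
    """ah = Clip1((ah' + 64) >> 7): one filtering stage, gain 2**7."""
    return clip1((ah_prime + 64) >> 7, bit_depth)


def hh_from_ah_prime(ah_prime_column: Sequence[int], bit_depth: int = 8) -> int:
    """Compute hh from the eight unrounded values ah'0,-3 .. ah'0,4."""
    if len(ah_prime_column) != len(HH_TAPS):
        raise ValueError("expected exactly 8 intermediate values")
    hh_prime = sum(c * v for c, v in zip(HH_TAPS, ah_prime_column))
    # Two stages of gain 2**7 each, hence the 14-bit shift with rounding.
    return clip1((hh_prime + 8192) >> 14, bit_depth)
```

The second stage works on the unrounded intermediate values, so rounding happens only once. For a flat area where every intermediate value is 100 × 128 = 12800, both functions return the original sample value 100.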
5.3.6 Transform factor decoding process and picture reconstruction process
......
GB/T 25724-2010
Technical specification of surveillance video and audio coding
ICS 13.310
A91
National Standard of the People's Republic of China
Security monitoring digital video and audio
Codec technical requirements
Issued on 2010-12-23
Implemented on 2011-05-01
Issued by the General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China
and the Standardization Administration of China
Contents
Preface
Introduction
1 Scope
2 Normative references
3 Terms, definitions and abbreviations
3.1 Terms and definitions
3.2 Abbreviations
4 Conventions
4.1 Arithmetic operators
4.2 Logical operators
4.3 Relational operators
4.4 Bitwise operators
4.5 Assignment operators
4.6 Mathematical functions
4.7 Syntax elements, variables and tables
4.8 Textual description of logical operators
4.9 Processes
5 Video part
5.1 Format of encoded bitstream and output data
5.2 Syntax and semantics
5.3 Decoding process
5.4 Parsing process
6 Audio part
6.1 General description
6.2 Encoder function description
6.3 Decoder function description
6.4 Bit allocation description
6.5 Storage and transmission interface format
Appendix A (normative) Hypothetical reference decoder (HRD)
Appendix B (normative) Byte stream format
Appendix C (normative) Video profiles and levels
Appendix D (normative) Video usability information (VUI)
Appendix E (normative) Supplemental enhancement information (SEI)
Appendix F (normative) Variable length code tables
Appendix G (normative) Audio profiles and levels
Appendix H (normative) Abnormal sound event type definitions
Appendix I (informative) VAD detection
Appendix J (informative) Noise elimination
References
Preface
Please note that some of the contents of this standard may involve patents. The issuer of this standard does not assume responsibility for identifying these patents.
Appendix A to Appendix H of this standard are normative and Appendix I and Appendix J are informative.
This standard was proposed by the Ministry of Public Security of the People's Republic of China.
This standard is under the jurisdiction of the National Security Alarm System Standardization Technical Committee (SAC/TC100).
Drafting organizations of this standard: First Research Institute of the Ministry of Public Security, Beijing Star Microelectronics Co., Ltd., Beijing Zhongshi Security Technology Development Corporation, Star Electronics Co., Ltd., Tsinghua University, Hong Kong University, Dalian University of Technology, Jiangsu Dongqi Information Technology Co., Ltd., China University of Communication Information Engineering College, National Multimedia Software Engineering Technology Research Center, Ningbo Aili Te Technology Development Co., Ltd., Hangzhou Hang Seng Digital Equipment Technology Co., Ltd., Third Research Institute of the Ministry of Public Security, Zhejiang Dahua Technology Co., Ltd., Beijing sound fast Electronics Co., Ltd., Tianjin Yaan Technology Electronics Co., Ltd., Shenzhen Ai Like Electronics Co., Ltd., Zhejiang Dali Technology Co., Ltd., Beijing Guotong Venture Information Technology Co., Ltd., Tianjin World Albert CHAN Digital Technology Co., Ltd., Jinpeng Electronic Information Machine Co., Ltd., Beijing frog as a communication technology limited liability company, Hangzhou Hikvision Digital Technology Co., Ltd., Institute of Software, Chinese Academy of Sciences, Shenzhen Zhongxing Liwei Technology Co., Ltd., Beijing Hanbang Transtech Services Digital Technology Co., Ltd., Ningbo Shunyu Optoelectronic Information Co., Ltd., Digital Technology (Beijing) Co., Ltd., Xin Tai Technology Co., Ltd., Star Holdings Group Co., Ltd., Zhejiang Police Officer Vocational College, Beijing Fusheng Star Electronics Co., Ltd., Hangzhou, China Communication Technology Co., Ltd., Guangdong Zhicheng Champion Group Co., Ltd.
The main drafters of this standard: Chen Chaowu, Deng Zhonghan, Li Xiaofeng, Yang Xiaodong, Zhang Yue, Qiu Song, Feng Yuhong, Lu Jinghui, Yu Zilong, Yuan Lirong,
FENG Bao-ting, GAO Song, LIN Dong, CHEN Zhe, ZHONG Xing-ye, WANG Sheng-jin, YANG Lei, house river, Yang Guosheng, Fan Jingjing, Zou Zhangbiao, Zhi Chen, Wang Yaohui,
Li Zhaofei, Wang Jianyong, Gao Lei, Wang You, Wei Yi, Sun Daleui, Yan Jianxin, Yu Heshui, Dai Lin, Chen Ruijun, Yu Ye, Huang Qilin, Ji Pengfei,
LIU Lei-lei, CHEN Yu, ZHOU Zhi-wen, XIANG Xin-xin, WU Zheng-yi
Introduction
At present there is no audio and video coding standard, domestic or international, designed specifically for security surveillance applications. Existing audio and video coding standards target broadcast television and mass entertainment, and applying them directly in the security field raises considerable adaptation problems. This standard was developed specifically for the particular requirements of security surveillance applications, such as real-time transmission of video, adaptability to the environment in round-the-clock (24-hour) monitoring, and faithful restoration of video and audio information. The main technical features of this standard are:
A) Supports high-precision video data coding to accommodate a wide dynamic range and retain more image detail, meeting the requirement of faithfully reproducing the scene. Video supports 8-bit to 10-bit data, with room for future extension to 12-bit to 16-bit;
B) Supports intra 4 × 4 prediction and transform quantization, adaptive frame-field coding (AFF), context-adaptive binary arithmetic coding (CABAC) and other techniques to obtain better image quality and higher coding efficiency;
C) Supports region of interest (ROI) variable-quality coding: when transmission bandwidth or data storage space is limited, ROI image quality is given priority and non-ROI overhead is reduced, providing high-quality video coding that better meets surveillance needs and improving the overall performance of the surveillance system;
D) Supports scalable video coding (SVC), coding video data in layers to meet the needs of different transmission bandwidths and data storage environments;
E) Supports dual-core audio coding that switches between algebraic code-excited linear prediction (ACELP) and transform audio coding (TAC), coding speech signals well while also preserving the coding quality of ambient (background) sound;
F) Supports coding of voice recognition feature parameters, avoiding the influence of coding distortion on speech recognition and voiceprint recognition;
G) Supports surveillance-specific information such as absolute time reference information and special surveillance events. This information is transmitted and stored together with the compressed video and audio data by means of dedicated syntax, which facilitates fast retrieval, sorted queries, audio and video synchronization and integrated use of surveillance data;
H) Supports data security protection by specifying the encryption and authentication interfaces and data formats, ensuring data confidentiality, integrity and non-repudiation. This keeps the format uniform and easy to interoperate while retaining enough flexibility to add and extend higher-performance encryption and authentication methods.
Related patent description
The issuer of this document draws attention to the fact that claiming conformity with this document may involve the use of patents related to 5.2.3.1, 5.2.3.2, 5.2.3.8, 5.2.4.2, 5.2.4.4, 5.2.4.10, 5.3.6.7, 6.1.2, 6.1.4, 6.2.6.1.3 and 6.2.6.1.4.10.
The issuer of this document takes no position on the authenticity, validity or scope of these patents.
The patent holders have assured the issuer of this document that they are willing to negotiate patent licences with any applicant on reasonable and non-discriminatory terms and conditions. The patent holders' statements have been filed with the issuer of this document. Relevant information can be obtained through the contacts below.
Name of the patent holder | Address
Beijing Star Microelectronics Co., Ltd. | Beijing Haidian College Road 35, Shi Ning Building (100191)
Beijing Zhongshi Security Technology Development Company | No. 1, South Road, Haidian District, Beijing (100048)
Zhongxing Electronics Co., Ltd | Tianjin Economic and Technological Development Zone No. 80 West Avenue Science Park A1 Block 2 (300457)
Tsinghua University | Tsinghua University, Beijing Haidian District (100084)
Digital Technology (Beijing) Co., Ltd. | No. 2 South Street, Zhongguancun, Haidian District, Beijing (100086)
Wuhan University | Wuhan University (430079)
Contact: Zeng Juanjuan
Address: Beijing Haidian District College Road 35, Shi Ning Building, 16th floor
Post code: 100191
E-mail: zengjuanjuan@vimicro.com
Tel: 010-68948888-8950
Fax: 010-68944075
Contact: Ma Zhijiang
Address: Beijing Haidian District, the first South Road on the 1st
Post code: 100048
E-mail: mzj76@yahoo.com
Tel: 010-88513553-828
Fax: 010-68454099
Please note that, in addition to the patents above, some of the contents of this document may still involve patents. The issuer of this document does not assume responsibility for identifying these patents.
Security monitoring digital video and audio
Codec technical requirements
1 Scope
This standard specifies the technical requirements for digital video and audio coding and decoding processes for surveillance applications in the field of security.
This standard applies to real-time compression, transmission, playback and storage of audio and video in the security field. It may also be adopted in other fields that require audio and video coding and decoding.
2 Normative references
The provisions of the following documents become provisions of this standard through reference herein. For dated references, subsequent amendments (excluding corrigenda) and revisions do not apply to this standard; however, parties to agreements based on this standard are encouraged to investigate whether the latest editions of these documents can be used. For undated references, the latest edition applies to this standard.
GB/T 20090.2-2006 Information technology - Advanced audio and video coding - Part 2: Video
3 Terms, definitions and abbreviations
The following terms, definitions and abbreviations apply to this standard.
3.1 Terms and definitions
3.1.1
Zigzag scan
A specific ordering of transform coefficients from (approximately) the lowest spatial frequency to the highest. The zigzag scan is used for the transform coefficients in frame macroblocks.
3.1.2
B slice
A slice decoded using intra prediction from decoded samples within the same slice, or using inter prediction from previously decoded reference pictures, with at most two motion vectors and reference indices per block.
3.1.3
I slice
A slice decoded using only intra prediction from decoded samples within the same slice.
3.1.4
P slice
A slice decoded using intra prediction from decoded samples within the same slice, or using inter prediction from previously decoded reference pictures, with at most one motion vector and reference index per block.
3.1.5
NAL unit
A syntax structure containing an indication of the type of data that follows and the bytes containing that data in the form of an RBSP, interspersed where necessary with authentication data and emulation prevention bytes.
3.1.6
NAL unit stream
A sequence of NAL units.
3.1.7
Reserved
Specific values of certain syntax elements that are set aside for future use by the China security surveillance digital audio and video coding standards working group. These values shall not be used in bitstreams conforming to this standard, but may be used in future extensions of this standard.
3.1.8
Closed-loop pitch search
Also known as adaptive codebook search: the process of estimating the pitch delay from the weighted input signal and the state of the long-term prediction filter.
3.1.9
Bitstream
A sequence of bits that forms the representation of coded video and audio and associated data, constituting one or more coded video and audio sequences. The term bitstream may refer to either a NAL unit stream or a byte stream.
3.1.10
Transform coefficient
A scalar quantity in the frequency domain, associated with a particular one-dimensional or two-dimensional frequency index in the inverse transform part of the decoding process.
3.1.11
Transform coefficient level
An integer quantity associated with a particular two-dimensional frequency index, from which the value of a transform coefficient is computed during the decoding process.
3.1.12
Coded field
A coded representation of a field.
3.1.13
Encoding process
The process of generating a bitstream conforming to this standard. This standard does not specify the video encoding process.
3.1.14
Encoder
An entity, in software or hardware, that implements the encoding process.
3.1.15
Coded video sequence
A sequence of pictures in decoding order, consisting of an IDR picture followed by zero or more non-IDR pictures.
3.1.16
Coded slice NAL unit
A NAL unit containing a slice of a coded picture.
3.1.17
Coded picture
A coded representation of a picture. A coded picture may be a coded field or a coded frame.
3.1.18
Coded picture buffer
A first-in, first-out buffer whose contents are stored in decoding order.
3.1.19
Coded frame
A coded representation of a frame.
3.1.20
Residual
The difference between the predicted value of a sample or data element and its decoded value.
3.1.21
Reference field
A field marked as a reference picture, used for inter prediction when decoding P slices and B slices of a coded field.
3.1.22
Reference index
An index that identifies a reference picture.
3.1.23
Reference picture
A picture whose samples are used for inter prediction in the decoding process of subsequent pictures in decoding order.
3.1.24
Reference frame
A frame marked as a reference picture, used for inter prediction when decoding P slices and B slices of a coded frame.
3.1.25
Parameter
A syntax element of a sequence parameter set, picture parameter set or security parameter set. The word is also used as part of the term quantization parameter.
3.1.26
Layer
One level in a non-branching hierarchical structure, in which higher layers contain lower layers. The coding layers are the coded picture sequence layer, the picture layer, the slice layer and the macroblock layer. In scalable video coding, pictures of different layers have different scalability properties (for example, different spatial resolutions).
3.1.27
Field
A set of lines of a frame. A frame consists of two fields: a top field and a bottom field.
3.1.28
Field macroblock
A macroblock containing samples from a single coded field only. All macroblocks of a coded field are field macroblocks.
3.1.29
Field scan
A specific ordering of transform coefficients that, unlike the zigzag scan, scans columns more rapidly than rows. The field scan is used for the transform coefficients in field macroblocks.
3.1.30
Algebraic codebook
A set of pulse amplitudes and positions. The pulse amplitudes and positions of the k-th excitation code vector are derived from the codeword index k according to a fixed rule.
3.1.31
Profile
A specified subset of the syntax of this standard.
3.1.32
Bottom field
One of the two fields that make up a frame. Each line of the bottom field lies spatially immediately below the corresponding line of the top field.
3.1.33
Immittance spectral pair
A transformation of the linear prediction coefficients. The inverse filter transfer function A (z) is decomposed into an even-symmetric polynomial and an odd-symmetric polynomial; the roots of these polynomials on the unit circle are the immittance spectral pairs.
3.1.34
Top field
One of the two fields that make up a frame. Each line of the top field lies spatially immediately above the corresponding line of the bottom field.
3.1.35
Short-term synthesis filter
A filter that models the impulse response of the vocal tract. The excitation signal is passed through this filter to obtain the synthesized signal.
3.1.36
Bin
One bit of a bin string.
3.1.37
Bin string
A string of bins. A bin string is the binarized representation of the value of a syntax element.
3.1.38
Binarization
A unique mapping between all possible values of a syntax element and a set of bin strings.
3.1.39
Inverse transform
A part of the decoding process that converts the transform coefficient matrix into the spatial sample matrix.
3.1.40
Emulation prevention byte
A byte equal to 0x03 that may be present within a NAL unit. The presence of emulation prevention bytes ensures that no sequence of consecutive byte-aligned bytes within the NAL unit contains a start code prefix.
3.1.41
Non-reference picture
A picture that is not used for inter prediction of any other picture.
3.1.42
Component
One of the three sample matrices (one luma matrix and two chroma matrices) that make up a picture, or a single sample of one of those matrices.
In the audio part, it also refers to an element of a vector or to certain frequency components of a signal.
3.1.43
Perceptual weighting filter
A filter that exploits the noise-masking property at formants, allocating relatively more distortion to formant regions in order to reduce subjectively perceived noise.
3.1.44
Power spectrum
The square of the amplitude spectrum obtained by applying the Fourier transform to the signal.
3.1.45
Raster scan
A mapping of a rectangular two-dimensional pattern to a one-dimensional pattern such that the first entries of the one-dimensional pattern come from the top row of the two-dimensional pattern scanned from left to right, followed by the second row, the third row, and so on. The rows are taken from top to bottom, and each row is scanned from left to right.
3.1.46
Macroblock
A 16 × 16 block of luma samples and the two corresponding blocks of chroma samples.
3.1.47
Macroblock index
In a coded frame, the macroblock index is the position of the macroblock in the raster scan order of the macroblocks of the frame, starting from 0. In a coded field, the macroblock index is the position of the macroblock in the raster scan order of the macroblocks of the field, starting from 0.
3.1.48
Backward prediction
Prediction of samples in the current picture from samples of a decoded picture that follows it in display order.
3.1.49
Partitioning
The division of a set into subsets such that each element of the set belongs to exactly one subset.
3.1.50
Base layer picture
A picture that can be decoded without reference to information from any other picture layer.
3.1.51
Level
A defined set of constraints on parameter values within a particular profile of this standard. A profile may contain one or more levels. The same set of levels is defined for all profiles, and most aspects of each level are common across profiles. Within specified constraints, an individual implementation may support multiple levels.
3.1.52
Instantaneous decoding refresh (IDR) picture
A coded picture in which all slices are I slices. After an IDR picture is decoded, all subsequent coded pictures in decoding order can be decoded without inter prediction from any picture decoded before the IDR picture. The first picture of each coded video sequence is an IDR picture.
3.1.53
Hypothetical reference decoder
A hypothetical decoder model that specifies constraints on the variability of NAL unit streams or byte streams conforming to this standard.
3.1.54
Decoding process
The process of reading a coded bitstream and producing decoded pictures or audio data from it.
3.1.55
Decoder
An entity, in software or hardware, that implements the decoding process.
3.1.56
Decoding order
The order in which syntax elements are processed during decoding.
3.1.57
Decoded picture
A picture obtained by decoding a coded picture. A decoded picture is either a decoded frame or a decoded field.
A decoded field is either a top field or a bottom field.
3.1.58
Decoded picture buffer
A buffer that holds decoded pictures for prediction reference, output reordering or the output delay specified in Appendix A.
3.1.59
Open-loop pitch search
The process of estimating the optimal pitch delay directly from the weighted input signal. The open-loop pitch search simplifies pitch analysis by confining the closed-loop pitch search to the neighbourhood of the delay value found by the open-loop search.
3.1.60
Variable length coding
A reversible entropy coding process that assigns shorter codewords to symbols with a higher probability of occurrence and longer codewords to symbols with a lower probability of occurrence.
3.1.61
Scalable video coding
Coding in which the pictures of the coded sequence have a certain degree of scalability. A scalable picture typically comprises a base layer picture and enhancement layer pictures.
3.1.62
Block
In the video signal domain, an M × N (M columns by N rows) matrix of samples or an M × N matrix of transform coefficients.
In the audio signal domain, a one-dimensional vector.
3.1.63
Luma
A sample matrix or a single sample representing the monochrome signal. The symbol used for luma is Y.
3.1.64
Quantization parameter
The parameter used to dequantize transform coefficient levels in the decoding process.
3.1.65
Zero-input response
The output of a filter produced by its past inputs when the current input is zero.
3.1.66
Mel
A non-linear frequency scale based on subjectively perceived pitch.
3.1.67
Mel-frequency cepstral coefficients
The time-domain signal is transformed into the frequency domain by FFT, and its logarithmic energy spectrum is filtered by a bank of triangular filters distributed on the Mel scale. The vector of filter outputs is then transformed by DCT; the resulting coefficients are the Mel-frequency cepstral coefficients.
3.1.68
Internal sampling frequency
The sampling frequency used inside the audio encoder, in the range 12800 Hz to 38400 Hz, denoted Fs.
3.1.69
Inverse filter
A filter that removes the short-time correlation of the signal.
3.1.70
Frequency index
A one-dimensional or two-dimensional index associated with a transform coefficient prior to the inverse transform part of the decoding process.
3.1.71
Start code prefix
A unique sequence of three bytes equal to 0x000001, embedded in the byte stream as a prefix to each NAL unit. A decoder can use the location of a start code prefix to determine the beginning of a new NAL unit and the end of the previous one. Emulation prevention bytes are inserted in NAL units to prevent start code prefixes from being emulated within them.
3.1.72
Forward prediction
Prediction of samples in the current picture from samples of a decoded picture that precedes it in display order.
3.1.73
Forward inter decoded picture
P picture
A picture decoded using only forward prediction for inter prediction.
3.1.74
Chroma
A sample matrix or a single sample representing one of the two colour difference signals. The symbols used for chroma are Cb and Cr.
3.1.75
Context-adaptive binary arithmetic coding
An entropy coding method for encoding a binary bit according to the context content ......
......