ISO-14496-3-AMD-3-2006.pdf


Reference number: ISO/IEC 14496-3:2005/Amd.3:2006(E)
© ISO/IEC 2006

INTERNATIONAL STANDARD ISO/IEC 14496-3
Third edition 2005-12-01
AMENDMENT 3 2006-06-01

Information technology - Coding of audio-visual objects - Part 3: Audio
AMENDMENT 3: Scalable Lossless Coding (SLS)

Technologies de l'information - Codage des objets audiovisuels - Partie 3: Codage audio
AMENDEMENT 3: Codage extensible sans perte (SLS)

PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

© ISO/IEC 2006
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

Amendment 3 to ISO/IEC 14496-3:2005 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.

This Amendment specifies Audio Scalable Lossless Coding (SLS).

In ISO/IEC 14496-3, Introduction, add the following to the end of the subclause "MPEG-4 general audio coding tools":

MPEG-4 SLS (Scalable Lossless Coding) is a tool used in combination with optional MPEG-4 General Audio coding tools to provide fine-grain scalable to numerically lossless coding of digital audio waveforms.

In Part 3: Audio, Subpart 1, in subclause 1.3 Terms and Definitions, add:

SLS: Audio Scalable to Lossless Coding

and increase the index-number of subsequent entries.

In Part 3: Audio, Subpart 1, in subclause 1.5.1.1 Audio object type definition, amend table 1.1 with the updates in the table below:

Columns (Tools/Modules): Audio Object Type | Error Mapping (*) | Integer TNS (*) | Integer M/S (*) | IntMDCT (*) | BPGC/CBAC/LEMC (*) | Remark | Object Type ID

Rows (the column position of each X is not recoverable from this extract):
(escape)     | X         | Object Type ID 31
SLS          | X X X X X | Object Type ID 37
SLS non-core | X X       | Object Type ID 38
...

Note: (*) marks new columns

In Part 3: Audio, Subpart 1, subclause 1.4 (Symbols and Abbreviations) add the following subclause:

1.4.9 Arithmetic data types

INT32: 32 bit signed integer using two's complement
INT64: 64 bit signed integer using two's complement

In Part 3: Audio, Subpart 1, subclause 1.5 add the following subclauses:

1.5.1.2.31 SLS object type

The SLS object is supported by the scalable to lossless tool, which provides fine-grain scalable to lossless enhancement of MPEG perceptual audio codecs, such as AAC, allowing multiple enhancement steps from the audio quality of the core codec up to near-lossless and lossless signal representation. It also provides stand-alone lossless audio coding when the core audio codec is omitted.

1.5.1.2.32 SLS Non-Core object type

The SLS non-core object is supported by the scalable to lossless tool. It is similar to the SLS object type but the core audio codec is omitted.

In Part 3: Audio, Subpart 1, in subclause 1.6.2.1 AudioSpecificConfig, amend table 1.8 with the updates in the table below:

Syntax | No. of bits | Mnemonic
AudioSpecificConfig ()
    switch (audioObjectType)
    case 37:
    case 38:
        SLSSpecificConfig();
        break;

In Part 3: Audio, Subpart 1, in subclause 1.6.2.1 add the following subclause:

1.6.2.1.13 SLSSpecificConfig

Defined in ISO/IEC 14496-3 subpart 12.

In Part 3: Audio, Subpart 1, in subclause 1.6.2.2.1 Overview, add the following to table 1.14:

Audio Object Type | Object Type ID | Definition of elementary stream payloads and detailed syntax | Mapping of audio payloads to access units and elementary streams
SLS               | 37             | ISO/IEC 14496-3 subpart 12                                    |
SLS non-core      | 38             | ISO/IEC 14496-3 subpart 12                                    |
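Both new object type IDs (37 and 38) lie above the escape value 31, so they cannot be carried in the 5-bit audioObjectType field alone. The sketch below illustrates how a reader might recover them. It assumes the escape mechanism of the base standard's GetAudioObjectType() (5 bits, then 32 plus a 6-bit extension when the 5-bit value is 31), which is not reproduced in this extract. The helper names and the bit-string input format are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# Object type IDs added by this amendment (tables 1.1 and 1.14).
SLS_OBJECT_TYPES = {37: "SLS", 38: "SLS non-core"}


def read_bits(bits: str, pos: int, n: int) -> Tuple[int, int]:
    """Read n bits, MSB first, from a '0'/'1' string; return (value, new position)."""
    return int(bits[pos:pos + n], 2), pos + n


def get_audio_object_type(bits: str, pos: int = 0) -> Tuple[int, int]:
    """Decode audioObjectType, following the escape rule assumed from the base standard."""
    aot, pos = read_bits(bits, pos, 5)
    if aot == 31:  # escape value: the real ID follows as a 6-bit extension
        ext, pos = read_bits(bits, pos, 6)
        aot = 32 + ext
    return aot, pos


def uses_sls_specific_config(aot: int) -> bool:
    """True for the object types that table 1.8 routes to SLSSpecificConfig()."""
    return aot in SLS_OBJECT_TYPES


if __name__ == "__main__":
    aot, used = get_audio_object_type("11111" + "000101")
    print(aot, SLS_OBJECT_TYPES.get(aot), used)
```

Under these assumptions an SLS stream spends 11 bits on the object type: the 5-bit escape followed by a 6-bit extension of 5 (SLS) or 6 (SLS non-core).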

Create Part 3: Audio, Subpart 12:

Subpart 12: Technical description of scalable lossless coding

12.1 Scope

This subpart of ISO/IEC 14496-3 describes the MPEG-4 scalable lossless coding algorithm for audio signals. This description partially relies on the specification as given in subpart 4.

12.2 Terms and definitions

12.2.1 Definitions

The following definitions are used in this subpart.

Core Layer: The MPEG-4 GA T/F coder used as the first layer in SLS. The audio object types AAC LC, AAC Scalable (without LTP), ER AAC LC, ER AAC Scalable and ER BSAC are supported.

LLE Layer: Lossless enhancement layer used in SLS to enhance the quality of the core layer towards lossless coding.

Bit-Plane: Position of a specific bit in a binary data word, starting with 0 as the position of the least significant bit (LSB). For example, the binary bit-plane symbols from bit-planes 0, 1, 2, and 3 of the data word 0011 1101 in binary (0x3d) are 1, 0, 1, and 1 respectively.

BPGC: Bit-Plane Golomb Code

CBAC: Context Based Arithmetic Code

LEMC: Low Energy Mode Code

Implicit Band: A scale factor band for which the quantized spectral data presented in the core layer bit-stream will be used in determining part of the necessary side information for the LLE layer.

Explicit Band: A scale factor band for which the quantized spectral data presented in the core layer bit-stream will not be used in determining the necessary side information for the LLE layer. All the side information will be coded explicitly in the LLE payload.

Oversampling Factor (osf): Ratio between the sampling rates of the LLE Layer and the Core Layer; possible values are 1, 2 and 4.

Oversampling Range: High frequency range covered only by the LLE Layer; it comprises (osf-1)*1024 or (osf-1)*128 frequency values per window.

Reserved: All fields labelled Reserved are reserved for future standardization. All Reserved fields must be set to zero.

12.2.2 Notations

In order to make the description stringent, the following notations are used in this subpart:

Vectors are indicated by bold lower-case names, e.g. vector.
Matrices (and vectors of vectors) are indicated by bold upper-case single letter names, e.g. M.
Variables are indicated by italics, e.g. variable.
Functions are indicated as func(x).

12.2.3 Definitions

DIV(m,n): Integer division with truncation of the result of m/n to an integer value towards minus infinity.

⌊x⌋: The floor operation. Returns the largest integer that is less than or equal to the real-valued argument.

12.3 Payloads for the audio object

Table 12.1 Syntax of SLSSpecificConfig

Syntax | No. of bits | Mnemonics
SLSSpecificConfig(samplingFrequencyIndex, channelConfiguration, audioObjectType)
    pcmWordLength;                 | 3 | uimsbf
    aac_core_present;              | 1 | uimsbf
    lle_main_stream;               | 1 | uimsbf
    reserved_bit;                  | 1 | uimsbf
    frameLength;                   | 3 | uimsbf
    if (!channelConfiguration)
        program_config_element();

Table 12.2 Top layer payload for lle stream

Syntax | No. of bits | Mnemonics
lle_element()
    for (ch=0;ch [...]

[...] =1) lle_extension stream (lle_main_stream = 0), for each LLE_ICS, the lle_data() is constructed by concatenating the lle_data() elements from the lle_main stream and all the available lle_extension streams in sequence, as shown in Figure 12.6.
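As an illustration of the fixed-length fields listed in Table 12.1, the following sketch unpacks the first nine bits of an SLSSpecificConfig. It assumes the input buffer starts exactly at the first bit of pcmWordLength, which will generally not hold inside a real AudioSpecificConfig, where the structure need not be byte-aligned. It returns the raw field codes only, since their meaning is not given in this extract, and it does not parse the optional program_config_element(). The function and constant names are hypothetical.

```python
from typing import Dict

# Field names and widths in bits, in bitstream order (Table 12.1); all are uimsbf.
SLS_CONFIG_FIELDS = [
    ("pcmWordLength", 3),
    ("aac_core_present", 1),
    ("lle_main_stream", 1),
    ("reserved_bit", 1),
    ("frameLength", 3),
]


def parse_sls_specific_config(data: bytes) -> Dict[str, int]:
    """Unpack the fixed fields of SLSSpecificConfig as raw unsigned codes.

    Assumption: data[0] bit 7 is the first bit of pcmWordLength.
    """
    needed = sum(width for _, width in SLS_CONFIG_FIELDS)  # 9 bits
    if len(data) * 8 < needed:
        raise ValueError("need at least %d bits" % needed)

    value = int.from_bytes(data, "big")  # whole buffer as one MSB-first integer
    remaining = len(data) * 8            # bits not yet consumed
    fields = {}
    for name, width in SLS_CONFIG_FIELDS:
        remaining -= width
        fields[name] = (value >> remaining) & ((1 << width) - 1)
    return fields


if __name__ == "__main__":
    # Bits: 011 1 0 0 011 followed by padding.
    print(parse_sls_specific_config(bytes([0b01110001, 0b10000000])))
```

A caller would still need to check channelConfiguration and, when it is zero, continue with program_config_element() from bit 9 onwards.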

[Figure 12.6 Construction of LLE_ICS from multiple LLE streams. Labels: LLE decoding side information; lle_data() from lle_main, lle_extension (layer 1), ..., lle_extension (layer N); lle_ics_length.]

If there is an intermediate LLE_extension stream missing, the data in lle_data() of the subsequent streams cannot be used.

12.5.4.2.3 Recovering BPGC/CBAC side information

For each scale factor band of band type Explicit_Band, a maximum bit-plane (max_bp) is transmitted. In addition, for each scale factor band, a lazy bit-plane (lazy_bp) is transmitted unless the residual spectral data is all zero for this scale factor band (which is signalled by maximum bit-plane = -1). The max_bp is coded using variable length coded DPCM relative to the previously transmitted maximum bit-plane. The first value in each window group is coded using 5 bits PCM. The max_bp value is coded in unary representation. The following table gives some examples of how the DPCM value of max_bp is coded.

Table 12.15 Codeword for decoding the DPCM value of max_bp

DPCM max_bp | codeword        | codeword length
0           | 1               | 1
(s)1        | 01(s)           | 3
(s)2        | 001(s)          | 4
(s)10       | 00000000001(s)  | 12

The difference between max_bp and lazy_bp, whose value is within the range {1, 2, 3}, is decoded as follows:

Table 12.16 Codeword for decoding the difference between max_bp and lazy_bp

max_bp - lazy_bp | codeword | codeword length
1                | 10       | 2
2                | 0        | 1
3                | 11       | 2

The following pseudo code illustrates the decoding process for max_bp and lazy_bp.

    for (g = 0;g [...] =0)
        if (read_bits(1) == 0)
            lazy_bp[g][sfb] = max_bp[g][sfb] - 2;
        else if (read_bits(1) == 0)
            lazy_bp[g][sfb] = max_bp[g][sfb] - 1;
        else
            lazy_bp[g][sfb] = max_bp[g][sfb] - 3;

For Implicit_Bands, max_bp[g][sfb] is calculated from the quantization thresholds of the core layer quantizer as follows:

As the first step, the maximum bit-plane M for each residual spectral bin for significant scale factor bands can be calculated from

\[ M[g][win][sfb][bin] = \mathrm{INT}\left( \log_2 \left( interval[g][win][sfb][bin] \right) \right) \]

where interval[g][win][sfb][bin] is the quantization interval, which is given in terms of thr(·) and quant[g][win][sfb][bin] [...].

Here thr(x) and inv_quant(x) are, respectively, the deterministic quantization threshold and the corresponding deterministic inverse quantization for the AAC quantizer. They are calculated as in the following pseudo code:

    if (x == 0)
        thr(x) = 0;
    else
        thr(x) = (thrMantissa(|x|-1, scale_res) [...]

[...] > 0 are BPGC/CBAC decoded, where the amplitude of the residual spectral data res is bit-plane decoded starting from the maximum bit-plane max_bp and progressing to lower bit-planes until bit-plane 0 for each scale factor band.
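The codewords of Tables 12.15 and 12.16 can be decoded along the lines of the sketch below. It is an illustration under stated assumptions rather than the normative procedure: the polarity of the sign bit (s) is not given in this extract, so the code assumes that a sign bit of 1 means a negative difference. The BitReader class and both function names are hypothetical. The loops over window groups and scale factor bands, the 5-bit PCM start value and the max_bp = -1 case are left out because the surrounding pseudo code is incomplete here.

```python
class BitReader:
    """Minimal MSB-first reader over a string of '0'/'1' characters (illustrative only)."""

    def __init__(self, bits: str) -> None:
        self.bits = bits
        self.pos = 0

    def read_bits(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) < n:
            raise EOFError("bitstream exhausted")
        self.pos += n
        return int(chunk, 2)


def decode_dpcm_max_bp(reader: BitReader) -> int:
    """Decode one DPCM value of max_bp (Table 12.15): zeros, a terminating 1, then a sign."""
    magnitude = 0
    while reader.read_bits(1) == 0:  # unary part: count zeros up to the terminating 1
        magnitude += 1
    if magnitude == 0:
        return 0                     # codeword "1" carries no sign bit
    sign = reader.read_bits(1)
    return -magnitude if sign == 1 else magnitude  # ASSUMPTION: 1 means negative


def decode_lazy_bp(reader: BitReader, max_bp: int) -> int:
    """Decode lazy_bp from max_bp using the codewords of Table 12.16."""
    if reader.read_bits(1) == 0:     # "0"  -> difference 2
        return max_bp - 2
    if reader.read_bits(1) == 0:     # "10" -> difference 1
        return max_bp - 1
    return max_bp - 3                # "11" -> difference 3


if __name__ == "__main__":
    r = BitReader("0010" + "10")
    delta = decode_dpcm_max_bp(r)    # "001" plus sign bit 0
    print(delta, decode_lazy_bp(r, max_bp=7))
```

The shortest codeword is given to a difference of 2 between max_bp and lazy_bp, which matches the branch order of the pseudo code in the text.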

Subsequently, the low energy mode decoding is invoked to decode the remaining scale factor bands with lazy_bp <= 0.

The BPGC/CBAC bit-plane decoding process is used to decode the bit-plane symbols for reconstructing the residual integer spectral data res. The bit-plane decoding process is started from max_bp for each sfb, and progressively proceeds to lower bit-planes. For the first NUM_BP bit-plane scans the bit-plane symbols are arithmetic decoded as illustrated in the following pseudo code:

    /* preparing the help element */
    for (g=0;g [...] = 0)
    for (g=0;g [...] =0)
    for (win=0;win [...] res[g][win][sfb][bin] + (1 [...] = 0)
    for (g=0;g [...] =0)
    for (win=0;win [...] res[g][win][sfb][bin] + (1 [...]

[...] 511 469 234 117 938

Context 2: significant state (ss)

For interleaved residual IntMDCT spectral data c[i], i = 0, ..., 1024*osf-1, that is insignificant (i.e., the bit-plane symbols of c[i] decoded so far are all zeroes), the ss context is determined by the significance of its adjacent spectral data:

\[ sig\_cx(i, bp) = \{\, sig\_state(i-2, bp),\ sig\_state(i-1, bp),\ sig\_state(i+1, bp),\ sig\_state(i+2, bp) \,\} \]

where sig_state(i, bp) is defined as:

\[ sig\_state(i, bp) = \begin{cases} 0 & c[i] \text{ is insignificant before bit-plane } bp \\ 1 & c[i] \text{ is significant before bit-plane } bp \end{cases} \]

and sig_state(i, bp) is defined as 0 if i is smaller than 0 or larger than the IntMDCT length.

For c[i] that is already significant, the ss context is determined by the band type of the scale factor band that it is from:

\[ sig\_core(i) = \begin{cases} 0 & c[i] \text{ is from an Explicit\_Band} \\ 1 & c[i] \text{ is from an Implicit\_Band} \end{cases} \]
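The neighbourhood test for insignificant coefficients can be sketched as follows. Two points are assumptions, not statements from the text. First, the four neighbours are taken to be i-2, i-1, i+1 and i+2. Second, "significant before bit-plane bp" is read as "a non-zero symbol exists in some bit-plane above bp", which follows from decoding proceeding from max_bp downwards. How the four states are mapped to a context index is not recoverable from this extract, so the sketch returns them as a tuple. All names are hypothetical.

```python
from typing import Sequence, Tuple


def sig_state(c: Sequence[int], i: int, bp: int) -> int:
    """1 if c[i] already has a non-zero symbol in a bit-plane above bp, else 0.

    Indices outside the spectrum count as insignificant (state 0).
    """
    if i < 0 or i >= len(c):
        return 0
    # ASSUMPTION: the bit-planes above bp are exactly the ones decoded before bp.
    return 1 if (abs(c[i]) >> (bp + 1)) != 0 else 0


def sig_cx(c: Sequence[int], i: int, bp: int) -> Tuple[int, int, int, int]:
    """Significance states of the neighbours i-2, i-1, i+1, i+2 (assumed ordering)."""
    return (
        sig_state(c, i - 2, bp),
        sig_state(c, i - 1, bp),
        sig_state(c, i + 1, bp),
        sig_state(c, i + 2, bp),
    )


if __name__ == "__main__":
    spectrum = [0, 8, 0, 1, 16]
    print(sig_cx(spectrum, 2, bp=2))
    print(sig_cx(spectrum, 0, bp=2))
```

A real decoder would evaluate the same test on the partially reconstructed amplitudes. The sketch uses final values only because their bits above bp are identical.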

Furthermore, for the latter case, the ss context is further determined according to the value of quant_interval(i, bp), defined as:

quant_interval(i, bp) = [...] (a piecewise definition in terms of rec_spectrum[i], interval[i] and bp; not recoverable from this extract)

[...] 0) if (res[g][sfb][win][bin] = (1 [...] 1 6302 745 552 X

The following table defines the mapping between the binary string decoded in case of the low energy mode and the residual spectral data res. The sign bit of res is decoded after the first non-zero bit-plane symbol has been decoded.

Table 12.26 Binarization of res in low energy mode coding

Amplitude of res[g][win][sfb][bin] | Binary string
0                                  | 0
1                                  | 1 0
2                                  | 1 1 0
3                                  | 1 1 1 0
4                                  | 1 1 1 1 0
...                                | ...
2^(max_bp[g][sfb]+1)-2             | 1 1 ... 1 0
2^(max_bp[g][sfb]+1)-1             | 1 1 ... 1 1
pos                                | 0 1 2 3 ...
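Table 12.26 reads as a truncated unary code: an amplitude k is written as k ones followed by a terminating zero, and the largest amplitude drops the zero because no larger value can follow. The sketch below encodes and decodes that mapping. It assumes the last table row consists of exactly 2^(max_bp+1)-1 ones, which is inferred from the pattern because the intermediate rows are elided in this extract. It leaves out the sign bit and the arithmetic coding of each binary symbol. The function names are hypothetical.

```python
from typing import Tuple


def binarize_amplitude(amplitude: int, max_bp: int) -> str:
    """Map an amplitude to its binary string (truncated unary, per Table 12.26)."""
    max_amp = (1 << (max_bp + 1)) - 1
    if not 0 <= amplitude <= max_amp:
        raise ValueError("amplitude out of range for this max_bp")
    ones = "1" * amplitude
    # ASSUMPTION: the largest amplitude has no terminating zero.
    return ones if amplitude == max_amp else ones + "0"


def debinarize_amplitude(bits: str, max_bp: int) -> Tuple[int, int]:
    """Recover (amplitude, number of symbols consumed) from a binary string."""
    max_amp = (1 << (max_bp + 1)) - 1
    amplitude = 0
    pos = 0
    while amplitude < max_amp and bits[pos] == "1":
        amplitude += 1
        pos += 1
    if amplitude < max_amp:
        pos += 1  # consume the terminating zero
    return amplitude, pos


if __name__ == "__main__":
    for amp in range(4):
        code = binarize_amplitude(amp, max_bp=1)
        print(amp, code, debinarize_amplitude(code, max_bp=1))
```

The index into the string corresponds to the "pos" row of the table.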
