ISO-TR-14496-7-2004.pdf


Reference number ISO/IEC TR 14496-7:2004(E)
© ISO/IEC 2004

TECHNICAL REPORT
ISO/IEC TR 14496-7
Second edition
2004-10-15

Information technology - Coding of audio-visual objects - Part 7: Optimized reference software for coding of audio-visual objects

Technologies de l'information - Codage des objets audiovisuels - Partie 7: Logiciel de référence optimisé pour le codage des objets audiovisuels

PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

© ISO/IEC 2004
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents

Foreword
Introduction
1 Scope
2 Fast Motion Estimation
2.1 Introduction to Motion Adaptive Fast Motion Estimation
2.2 Technical Description of Core Technology MVFAST
2.2.1 Detection of stationary blocks
2.2.2 Determination of local motion activity
2.2.3 Search Center
2.2.4 Search Strategy
2.2.5 Perspectives on implementing MVFAST
2.2.6 Special Acknowledgements
2.3 Technical Description of PMVFAST
2.3.1 Introduction
2.3.2 Technical Description of PMVFAST
2.3.3 Special Acknowledgement
2.4 Conclusions
3 Fast Global Motion Estimation
3.1 Introduction to Feature-based Fast and Robust Global Motion Estimation Technique
3.2 Technical Description of FFRGMET
3.2.1 Outlier Exclusion
3.2.2 Robust Object Function
3.2.3 Feature Selection
3.2.4 Algorithm Description
3.2.5 Perspectives on implementing FFRGMET
3.2.6 Special Acknowledgements
3.3 Conclusions
4 Fast and Robust Sprite Generation
4.1 Introduction to Fast and Robust Sprite Generation
4.2 Algorithm Description
4.2.1 Outline of Algorithm
4.2.2 Image Region Division
4.2.3 Fast and Robust Motion Estimation
4.2.4 Image Segmentation
4.2.5 Image Blending
4.3 Conclusions
5 Optimised Reference Software For Simple Profile and Error Resilience Tools
5.1 Scope
5.2 Integration and Optimization of the Reference Software
5.2.1 Introduction
5.2.2 Removal of the unused procedures, parameters, and data structures
5.2.3 Revision of the code bases for saving the execution time and code sizes
5.2.4 Use of the existing fast algorithms for the computational burden modules
5.2.5 Optimised Simple Profile encoder and decoder
5.2.6 Experimental Results
5.3 Error Resilience Tools
5.3.1 Abbreviations
5.3.2 New Processing / functionalities
6 Contact Information
Bibliography

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

In exceptional circumstances, the joint technical committee may propose the publication of a Technical Report of one of the following types:
- type 1, when the required support cannot be obtained for the publication of an International Standard, despite repeated efforts;
- type 2, when the subject is still under technical development or where for any other reason there is the future but not immediate possibility of an agreement on an International Standard;
- type 3, when the joint technical committee has collected data of a different kind from that which is normally published as an International Standard (“state of the art”, for example).

Technical Reports of types 1 and 2 are subject to review within three years of publication, to decide whether they can be transformed into International Standards. Technical Reports of type 3 do not necessarily have to be reviewed until the data they provide are considered to be no longer valid or useful.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

ISO/IEC TR 14496-7, which is a Technical Report of type 3, was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.

This second edition cancels and replaces the first edition (ISO/IEC 14496-7:2002) which has been technically revised.

ISO/IEC TR 14496 consists of the following parts, under the general title Information technology - Coding of audio-visual objects:
- Part 1: Systems
- Part 2: Visual
- Part 3: Audio
- Part 4: Conformance testing
- Part 5: Reference software
- Part 6: Delivery Multimedia Integration Framework (DMIF)
- Part 7: Optimized reference software for coding of audio-visual objects [Technical Report]
- Part 8: Carriage of ISO/IEC 14496 contents over IP networks
- Part 9: Reference hardware description [Technical Report]
- Part 10: Advanced Video Coding
- Part 11: Scene description and application engine
- Part 12: ISO base media file format
- Part 13: Intellectual Property Management and Protection (IPMP) extensions
- Part 14: MP4 file format
- Part 15: Advanced Video Coding (AVC) file format
- Part 16: Animation Framework eXtension (AFX)
- Part 17: Streaming text format
- Part 18: Font compression and streaming
- Part 19: Synthesized texture stream

Introduction

Purpose

This part of ISO/IEC 14496 was developed in response to the growing need for optimized reference software that provides both improved visual quality and faster execution while preserving compliance. The goal is to provide non-normative tools that are essential for implementations of the normative parts of the ISO/IEC 14496 specifications. For example, Part 5 of the ISO/IEC 14496 specifications uses full-search motion estimation, which is theoretically optimal in coding efficiency but impractical for commercial implementation. In the past, the industry needed to create its own encoding tools for its target products. This part provides a well-tested set of encoding tools that can enhance performance but should not be standardized. It is up to each organization to decide whether to adopt or adapt the following recommended tools for its specific needs. This part significantly reduces time-to-market and provides a reference benchmark for commercial ISO/IEC 14496 compliant products.

1 Scope

This part of ISO/IEC 14496 specifies encoding tools that enhance both the execution speed and the quality of the coding of visual objects as defined in ISO/IEC 14496-2. The tool set is not limited to visual objects, but at this point all the recommended tools are visual encoding tools. Four tools are described in this Technical Report:
- Fast Motion Estimation
- Fast Global Motion Estimation
- Fast and Robust Sprite Generation
- Fast Variable Length Decoder Using Hierarchical Table Lookup

These tools have been demonstrated to be robust, with source code for both the MoMusys and Microsoft implementations. In the current implementations, a single piece of software includes all the tools that exist in ISO/IEC 14496-2. This is obviously inefficient in terms of code size and execution speed.
To address this issue, the optimized reference software has compilation switches so that only the tools selected by the profiles and levels are included. This level of optimization is performed in the high-level programming language. Platform-specific optimization is currently not addressed by this part.

2 Fast Motion Estimation

2.1 Introduction to Motion Adaptive Fast Motion Estimation

The optimization of fast motion estimation is essentially a multi-dimensional problem. The key dimensions of this problem are: Rate, Quality (PSNR), Speed-up (or Computational Gain), Algorithmic Complexity, and Memory (Size and Bandwidth) (see Figure 1). There is always a trade-off among these five key dimensions. Therefore, it is highly desirable to have an adaptive fast motion estimation core algorithm with a scalable structure, which can be adaptively optimized with respect to all or selected aspects for various coding environments and requirements. Since rate control is used to fix the bit-rate, the optimization problem is reduced by one dimension, to four dimensions.

Motion Vector Field Adaptive Search Technique (MVFAST) [1] is a generic algorithm of the family of motion-adaptive fast search techniques, originally proposed by Kai-Kuang Ma and Prabhudev Irappa Hosur from Nanyang Technological University (NTU), Singapore. MVFAST offers high performance in both quality and speed and does not require memory to store the searched points and motion vectors. MVFAST was adopted by MPEG-4 Part 7 at the Noordwijkerhout MPEG meeting (March 2000) as the core technology for fast motion estimation.

A derivative of MVFAST, called Predictive MVFAST (PMVFAST) [2], is considered an optional approach that might be beneficial in special coding situations. PMVFAST incorporates a set of thresholds into MVFAST to gain higher speed-up at the cost of memory size, memory bandwidth and additional algorithmic complexity. In PMVFAST, the threshold values are adjusted based on the 54 test cases specified by MPEG-4. However, the coding performance and sensitivity of PMVFAST using these thresholds for video sequences and encoding conditions outside the MPEG-4 test set have not been studied or verified.

Figure 1 - Five-dimensional optimization problem of fast motion estimation (dimensions shown: Bit-rate, Quality, Speed, Memory (Size and Bandwidth), Algorithmic complexity)

2.2 Technical Description of Core Technology MVFAST

2.2.1 Detection of stationary blocks

A large number of macroblocks (MBs) in video sequences with low-motion content (e.g., “talking head” video sequences) tend to have motion vectors equal to (0,0). Such MBs in regions of no motion activity can be detected simply from the sum of absolute difference (SAD) at the origin. Therefore, we use an optional phase, called early elimination of search, as the first step in MVFAST, as follows. The search for an MB is terminated immediately if its SAD value obtained at (0,0) is less than a threshold T, and the motion vector is assigned as (0,0). Through extensive simulations, we found that about 98% of the zero-motion blocks identified have a SAD at position (0,0) of less than 512. Hence, we choose T = 512 to enable the mechanism of early elimination of search. Since this phase is optional, it can be disabled by setting T = 0.

2.2.2 Determination of local motion activity

The local motion vector field at a macroblock (MB) position is defined as the set of motion vectors in the region of support (ROS) of that MB. The ROS of an MB includes the n neighborhood MBs. In MVFAST, the ROS with n = 3 is shown in Figure 2. Let $V = \{V_0, V_1, \ldots, V_n\}$, where $V_0 = (0,0)$ and $V_i$ ($i \neq 0$) is the motion vector of MB$_i$ in the ROS (see Figure 2). The cityblock length of $V_i = (x_i, y_i)$ is defined as $l_{V_i} = |x_i| + |y_i|$. Let $L = \max\{l_{V_i}\}$ over all $V_i$. The motion activity at the current MB position is defined as follows.

\[
\text{Motion Activity} =
\begin{cases}
\text{Low}, & \text{if } L \le L_1;\\
\text{Medium}, & \text{if } L_1 < L \le L_2;\\
\text{High}, & \text{if } L > L_2;
\end{cases}
\tag{1}
\]

where $L_1$ and $L_2$ are integer constants. We choose $L_1$ and $L_2$ as the cityblock distance from the center point of the pattern to any other point on the small and large search patterns (see Figure 4), respectively. Thus, $L_1 = 1$ and $L_2 = 2$.

Figure 2 - Region of support (ROS) for the current MB consists of MB1, MB2 and MB3

Figure 3 - Example of distribution of motion vectors belonging to set V. In this case, $l_{V_1} = 2$, $l_{V_2} = 1$, $l_{V_3} = 6$; thus $L = \max\{l_{V_1}, l_{V_2}, l_{V_3}\} = 6$

2.2.3 Search Center

The choice of the search center depends on the local motion activity at the current MB position. If the motion activity is low or medium, the search center is the origin. Otherwise, the vector belonging to set V that yields the minimum sum of absolute difference (SAD) is chosen as the search center.

Figure 4 - (a) Large Diamond Search Pattern (LDSP) and (b) Small Diamond Search Pattern (SDSP)
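
The decisions described in 2.2.1 to 2.2.3 (early elimination, motion activity classification and search center selection) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference software: the function names, the `sad_at` callback (assumed to return the SAD of the current MB for a candidate vector) and the example vectors are hypothetical. Only the values T = 512, L1 = 1 and L2 = 2 and the decision rules come from the text.

```python
from typing import Callable, List, Tuple

MV = Tuple[int, int]  # motion vector (x, y)

T_EARLY = 512   # threshold T from 2.2.1; T = 0 disables early elimination
L1, L2 = 1, 2   # activity thresholds from 2.2.2


def cityblock_length(v: MV) -> int:
    """Cityblock length |x| + |y| of a motion vector."""
    return abs(v[0]) + abs(v[1])


def motion_activity(ros_mvs: List[MV]) -> str:
    """Classify local motion activity from the ROS motion vectors (equation (1))."""
    candidates = [(0, 0)] + list(ros_mvs)  # set V, with V0 = (0, 0)
    longest = max(cityblock_length(v) for v in candidates)
    if longest <= L1:
        return "low"
    if longest <= L2:
        return "medium"
    return "high"


def mvfast_start(sad_at: Callable[[MV], int],
                 ros_mvs: List[MV],
                 t: int = T_EARLY) -> Tuple[bool, MV]:
    """Return (search_finished, vector).

    If search_finished is True, the vector is the final motion vector (0, 0).
    Otherwise the vector is the search center for the subsequent search.
    """
    # 2.2.1: early elimination of search for stationary blocks.
    if sad_at((0, 0)) < t:
        return True, (0, 0)
    # 2.2.3: low or medium activity starts the search at the origin.
    if motion_activity(ros_mvs) in ("low", "medium"):
        return False, (0, 0)
    # High activity: start from the vector in V with the minimum SAD.
    candidates = [(0, 0)] + list(ros_mvs)
    return False, min(candidates, key=sad_at)


if __name__ == "__main__":
    # Hypothetical ROS vectors with cityblock lengths 2, 1 and 6, as in Figure 3.
    ros = [(1, 1), (0, -1), (4, 2)]
    # Toy SAD surface (an assumption) whose minimum lies at (4, 2).
    toy_sad = lambda v: 1000 + 100 * (abs(v[0] - 4) + abs(v[1] - 2))
    print(motion_activity(ros))
    print(mvfast_start(toy_sad, ros))
```

With the toy inputs above, the activity is classified as high because L = 6 exceeds L2, and the search starts from (4, 2). Passing `t=0` reproduces the "T = 0" setting from the text, since no SAD value is less than zero.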
