MPEG: International Compression Standards for Moving Pictures


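Fall 2005

1 Overview

MPEG (Moving Picture Experts Group) is a worldwide body that develops platform-independent standards for video compression. MPEG began its work in 1988. JPEG and MPEG are both expert groups operating under ISO, and their memberships overlap considerably. JPEG concentrates on still-image compression, while MPEG targets the compression of moving images; the two problems are closely related.

MPEG

The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) jointly established ISO/IEC JTC1/SC29/WG11, which develops standards for the encoding, decoding, and synchronization of television picture and sound data. The main MPEG standards are MPEG-1, MPEG-2, MPEG-4, and MPEG-7, which is still being drafted.

How an MPEG standard document is created

- Working Draft (WD): the working document prepared by the Working Group (WG).
- Committee Draft (CD): a document promoted from the working draft prepared by the WG. This is the first form of an ISO document; it is formally reviewed and balloted within ISO.
- Draft International Standard (DIS): a document promoted from the CD once the voting member countries are satisfied with the content and wording of the CD.
- International Standard (IS): the document published after approval by the voting member countries, other ISO bodies, and other committees.

MPEG-1, the group's first result, was released in 1992 and is the basis of VCD. With its limited resolution of 352×288 pixels, MPEG-1 is suitable only for home use, and both its video quality and its data rate are fairly low. MPEG-2 followed in 1995; resolutions of 720×576 pixels and higher greatly improved video quality. MPEG-4 was published in December 1999. MPEG-7 is a standard for a multimedia content description interface. Since MPEG was founded, its tasks and direction have changed considerably. MPEG-1 and MPEG-2 are mature coding standards, and current attention centers on MPEG-4 and MPEG-7.

The MPEG series

- MPEG-1: ISO/IEC 11172
- MPEG-2: ISO/IEC 13818
- MPEG-4: ISO/IEC 14496
- MPEG-7: ISO/IEC 15938
- MPEG-21: ISO/IEC 21000

Components

- video coding
- audio coding
- system definition, which describes the combination of individual data streams into a common stream

2 Video Coding

An image must consist of three components: luminance Y and two color difference signals, Cr and Cb.

- color subsampling
- 14 different pixel aspect ratios, for example 1:1, 16:9, 4:3
- refresh frequency: 23.976 Hz, 24 Hz, 25 Hz, 29.97 Hz, 30 Hz, 50 Hz, 59.94 Hz, and 60 Hz

An MPEG macro block is partitioned into 16×16 pixels for the luminance component and 8×8 pixels for each of the two chrominance components. A macro block is formed of six blocks of 8×8 pixels: first the four blocks for the luminance component, then the two chrominance blocks.

Macroblocks

The key to efficient compression is to remove as much redundancy as possible. For still images, the MPEG and JPEG algorithms are almost identical. The image is first converted to YUV space. The Y component is divided into 16×16 blocks and the U and V components into 8×8 blocks; each 16×16 luminance block is then split into four 8×8 blocks so that the DCT can be applied to 8×8 blocks. A block made up of one 16×16 block of luminance information and two 8×8 blocks of chrominance information is called a macroblock, and a still image consists of many such macroblocks. An NTSC image with a resolution of 352×240 consists of 22×15 = 330 macroblocks; a PAL image with a resolution of 352×288 consists of 22×18 = 396 macroblocks.

[Figure: composition of a macroblock]

- Efficient coding: exploit the temporal redundancies of successive images.
- Random access: images are coded individually.

MPEG supports four types of image coding: I, P, B, and D.

I frames (intra frames)

An I frame is a whole picture coded with JPEG-style coding. It is an independent frame: its content is determined entirely by its own picture and is produced without reference to any other picture. It serves as the reference for P and B pictures.

A P picture (forward predicted picture) is coded with motion compensation relative to the preceding I or P picture.

A B picture (bidirectionally predicted picture) is coded with bidirectional motion compensation relative to the preceding and the following I or P picture.

I frames (intra coded pictures)

I frames are coded without using information about other frames (intraframe coding). An I frame is treated as a still image. Here MPEG falls back on the results of JPEG. Unlike JPEG, real-time compression must be possible. The compression rate is thus the lowest within MPEG. I frames form the anchors for random access.

I frames are encoded as in JPEG:

- A DCT is applied to the 8×8 blocks defined within the macro blocks.
- The DC coefficients are then DPCM coded: the differences between consecutive blocks of each component are calculated and transformed into variable-length code words.
- The AC coefficients are run-length encoded and then transformed into variable-length code words.

MPEG distinguishes two types of macro blocks: those that contain only coded data, and those that additionally contain a parameter used for scaling the characteristic curve used for subsequent quantization.

I frames use intraframe coding, which exploits only the spatial correlation within a single frame and not the temporal correlation between frames. Because an I frame does not depend on other frames, it is an entry point for random access and the reference frame for decoding. I frames are used mainly for receiver initialization and channel acquisition and for program switching and insertion. Their compression ratio is comparatively low. I frames appear periodically in the picture sequence, at a frequency chosen by the encoder.

P frames (predictive coded pictures)

P frames require information about previous I and/or P frames for encoding and decoding. Decoding a P frame requires decompression of the last I frame and any intervening P frames. The compression ratio is considerably higher than for I frames. A P frame allows the following P frame to be accessed if there are no intervening I frames.

To code a macro block of a P frame, the most similar macro block in the preceding image must be determined. MPEG does not specify an algorithm for motion estimation, but rather specifies the coding of the result.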
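Because the standard leaves motion estimation to the encoder, any search strategy may be used. The following is a minimal sketch of one common choice: exhaustive (full-search) block matching with the sum of absolute differences (SAD) as the matching cost. The function names, the SAD criterion, the search range of 4 pixels, and the synthetic test frames are illustrative assumptions, not something MPEG prescribes.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between two n-by-n blocks."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, n=16, search=7):
    """Return ((mvx, mvy), cost) for the block whose top-left corner is (cx, cy).

    The motion vector points from the current block to its best match in the
    reference frame. `search` is the maximum displacement tried per axis.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, cx, cy, cx, cy, n)
    for mvy in range(-search, search + 1):
        for mvx in range(-search, search + 1):
            rx, ry = cx + mvx, cy + mvy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block would leave the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if cost < best_cost:
                best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost


def make_frame(width, height, ox, oy, size=4):
    """Synthetic frame: black background, textured square at (ox, oy)."""
    frame = [[0] * width for _ in range(height)]
    for y in range(size):
        for x in range(size):
            frame[oy + y][ox + x] = 100 + 10 * y + x
    return frame


reference = make_frame(32, 32, 10, 12)   # square at (10, 12)
current = make_frame(32, 32, 13, 11)     # same square moved by (+3, -1)
mv, cost = full_search(current, reference, cx=8, cy=8, n=16, search=4)
print(mv, cost)  # (-3, 1) 0
```

A larger `search` value examines more candidates, which mirrors the trade-off between estimation quality and computation time; the cost grows with the square of the search range.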

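The motion vector (the spatial difference between the two macro blocks) and the small difference between the macro blocks need to be encoded. The search range, that is, the maximum length of the motion vector, is not defined by the standard. As the search range is increased, the motion estimation becomes better, although the computation becomes slower.

Motion compensation

Motion compensation is one of the most widely used techniques in current video compression. The moving parts of adjacent pictures in a frame sequence are continuous: the image in the current picture can be regarded as a displaced version of the image in an earlier picture, and the magnitude and direction of the displacement may differ from place to place within the picture. Motion compensation works at the macroblock level and mainly removes the temporal redundancy of predicted and interpolated pictures in order to raise the compression ratio. It is a form of prediction, carried out not pixel by pixel but in units of 16×16 image blocks. The current block is treated as a displaced copy of an image block from an earlier moment, and the displacement (the motion vector) comprises both the direction and the magnitude of the motion.

[Figure: macroblock prediction and motion compensation]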
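As a rough illustration of block-based motion compensation, the sketch below forms a prediction by fetching the displaced block from a reference frame, computes the prediction error, and shows that prediction plus error restores the block. The helper names, the 4×4 block size (chosen only to keep the output small; the text uses 16×16), and the synthetic frames are assumptions made for this example.

```python
def predict_block(ref, x, y, mv, n):
    """Fetch the n-by-n reference block displaced by motion vector mv from (x, y)."""
    mvx, mvy = mv
    return [[ref[y + mvy + dy][x + mvx + dx] for dx in range(n)]
            for dy in range(n)]


def residual_block(cur, pred, x, y, n):
    """Prediction error: current block minus motion-compensated prediction."""
    return [[cur[y + dy][x + dx] - pred[dy][dx] for dx in range(n)]
            for dy in range(n)]


def reconstruct_block(pred, res):
    """Decoder side: prediction plus transmitted error restores the block."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, res)]


# Reference frame with a simple gradient. The current frame holds the same
# content shifted one pixel to the right, with one pixel brightened by 5.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y][x - 1] if x > 0 else 0 for x in range(8)] for y in range(8)]
cur[3][3] += 5

n, x, y, mv = 4, 2, 2, (-1, 0)
pred = predict_block(ref, x, y, mv, n)
res = residual_block(cur, pred, x, y, n)
restored = reconstruct_block(pred, res)
print(res)  # [[0, 0, 0, 0], [0, 5, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```

With a good motion vector the error block is almost entirely zero, which is what makes it cheap to code after the DCT.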

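[Figures: block motion estimation on the Tennis video sequence, showing frame 0, frame 1, the frame difference, and the motion vectors]

A P picture is assembled by copying "quasi-macroblocks" from the I picture. A quasi-macroblock is not aligned to the 16×16 macroblock grid of the I picture; it is simply a similar block found somewhere in the I picture. This copying is what is called "motion". Because the P picture lies in the future of the I picture, the process is called "forward prediction". The copied block does not match the true P picture exactly and must be corrected; this correction is the motion compensation. After compensation, the P picture differs very little from the original uncompressed image.

For each 16×16 block, the motion vector and the prediction error must be encoded and transmitted so that the decoder can use them to restore the image. Macroblocks in different regions may be given different motion vectors. The range from which motion vectors are chosen is set according to the temporal resolution between frames, the image resolution within the block, and the nature of the picture sequence. For example, when the picture content of two 16×16 macroblocks stays completely still, the motion vector of the macroblock is zero (its coordinates do not change).

P frames can consist of macro blocks as in I frames, as well as six different predictive macro blocks. In coding P-frame-specific macro blocks, differences between macro blocks as well as the motion vector need to be considered. The difference values between all six 8×8 pixel blocks of a macro block being coded and the best matching macro block are transformed using a two-dimensional DCT.

Further data reduction is achieved by not further processing blocks where all DCT coefficients are zero. This is coded by inserting a six-bit value into the encoded data stream. Otherwise, the DC- and AC-coefficients are encoded using the same technique. Next, run-length encoding is applied and a variable-length coding is determined according to an algorithm similar to Huffman. Motion vectors of adjacent macro blocks are DPCM coded. The result is again transformed into variable-length coded words using a table.

B frames

B frames (bidirectionally predictive coded pictures) require information from previous and following I and/or P frames. B frames yield the highest compression ratio attainable in MPEG. A B frame is defined as the difference from a prediction based on a previous and a following I or P frame. It cannot ever serve as a reference for prediction coding of other pictures.

A macro block can be derived from macro blocks of previous and following P and/or I frames:

- A prediction can interpolate two similar macro blocks.
- Two motion vectors are encoded.
- One difference block is determined between the macro block to be encoded and the interpolated macro block.

Subsequent quantization and entropy encoding are performed as for P-frame-specific macro blocks. B frames need not be stored in the decoder.

D frames

D frames (DC coded pictures) are intraframe-coded and can be used for efficient fast forward. During the DCT, only the DC-coefficients are coded; the AC coefficients are ignored.

D frames contain only the low-frequency components of an image. A D frame always consists of one type of macro block, and only the DC-coefficients of the DCT are coded. D frames are used for fast-forward display. This could also be realized by a suitable placement of I frames.

P and B frames use interframe coding, which exploits both spatial and temporal correlation. P frames use forward temporal prediction only, which improves compression efficiency and picture quality. A P frame may contain intra-coded parts: each macroblock in a P frame can be either forward predicted or intra coded. B frames use bidirectional temporal prediction, which greatly increases the compression ratio. Because B frames use future frames as references, the transmission order of frames in an MPEG-1 bitstream differs from the display order. In terms of compression, I pictures are compressed the least; P pictures store only the error signal between the current frame and its reference frame and so are compressed considerably more; B pictures are compressed the most, which is also why B frames do not serve as prediction references.

The MPEG frame sequence

MPEG achieves its high compression ratio by removing the temporal redundancy between successive frames. However violent the action on screen appears, the difference between two consecutive frames is always small. JPEG compresses only the information in a single image, so it is MPEG that must deal with temporal redundancy. Fundamentally this is differential coding: the sender first transmits a base frame, then encodes the differences of subsequent frames, compresses them, and transmits them. The receiver can rebuild all frames from the first base frame and the received differences.

MPEG extends this idea, although what MPEG does is more complex. Computing the difference between the current frame and the previous one works very well for objects moving within the field of view, because those objects are already present in the previous frame. It does not work for content that is absent from the previous frame. An entirely new scene, for example, cannot be compressed this way: the difference between the old and the new scene is large, and the new scene will most likely have to be sent in full.

In what pattern should the different frame types be arranged in a sequence? I frames must appear periodically in every frame sequence. Differential coding suits frames that differ only slightly, but differences from one fixed frame stay small only for a relatively short time; when new objects appear, the scene changes. This includes objects hidden behind moving ones: when a person moves through a scene, objects that were hidden behind the person in an earlier frame appear in later frames. Making I frames appear periodically ensures that differences are computed relative to a recent scene and stops errors from propagating.

How are P and B frames rebuilt from other frames? The order in which frames are seen during playback is not the order in which they are transmitted. The P frame is transmitted ahead of the first two B frames, and the second I frame ahead of the last two B frames. The P frame and the two I frames can then be buffered, so that the B frames received afterwards can be decoded at the viewing end.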
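The reordering between display order and transmission order can be sketched in a few lines. The example assumes a seven-frame sequence I B B P B B I, which matches the description of a P frame and a second I frame each being sent ahead of two B frames; the frame labels and function names are made up for illustration.

```python
def display_to_transmission(display):
    """Reorder frame labels so each B frame follows both of its references."""
    out, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)   # hold back until the next reference is sent
        else:
            out.append(frame)         # I or P: send the reference first
            out.extend(pending_b)     # then the B frames that depend on it
            pending_b = []
    return out + pending_b


def transmission_to_display(stream):
    """Decoder side: hold each reference frame until the next one arrives."""
    out, held = [], None
    for frame in stream:
        if frame[0] == "B":
            out.append(frame)         # B frames are shown as soon as decoded
        else:
            if held is not None:
                out.append(held)      # the previous reference can now be shown
            held = frame
    if held is not None:
        out.append(held)
    return out


display = ["I1", "B2", "B3", "P4", "B5", "B6", "I7"]
sent = display_to_transmission(display)
print(sent)                           # ['I1', 'P4', 'B2', 'B3', 'I7', 'B5', 'B6']
print(transmission_to_display(sent))  # ['I1', 'B2', 'B3', 'P4', 'B5', 'B6', 'I7']
```

The decoder only ever needs to hold reference frames, which is consistent with the remark that B frames need not be stored in the decoder.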

Quantization

AC-coefficients of B and P frames are usually very large values, whereas those of I frames are very small. MPEG quantization adjusts itself accordingly. If the data rate increases too much, quantization becomes coarser. If the data rate falls, quantization is performed with finer granularity.
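The feedback between data rate and quantizer step can be illustrated with a toy control loop. This is only a sketch: the one-step adjustment rule, the per-picture bit target, and the scale limits of 1 and 31 are assumptions chosen for the example, and MPEG itself does not prescribe how an encoder makes this decision.

```python
def quantize(coeffs, scale):
    """Uniform quantization: a larger scale means coarser steps and fewer bits."""
    return [int(c / scale) for c in coeffs]   # int() truncates toward zero


def adjust_scale(scale, bits_used, bits_target, lo=1, hi=31):
    """Nudge the quantizer scale after each picture (hypothetical rule)."""
    if bits_used > bits_target:
        scale += 1      # data rate too high: quantize more coarsely
    elif bits_used < bits_target:
        scale -= 1      # data rate has fallen: quantize more finely
    return max(lo, min(hi, scale))


scale = 8
history = []
for bits_used in [1200, 1500, 900, 700, 1000]:
    scale = adjust_scale(scale, bits_used, bits_target=1000)
    history.append(scale)
print(history)                           # [9, 10, 9, 8, 8]

print(quantize([100, -37, 12, 3], 8))    # [12, -4, 1, 0]
print(quantize([100, -37, 12, 3], 16))   # [6, -2, 0, 0]
```

Doubling the scale turns more small coefficients into zeros, which is where the bit savings come from, at the price of a larger reconstruction error.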

3 Audio Coding

MPEG audio coding is compatible with the coding of audio data used for Compact Disc Digital Audio (CD-DA) and Digital Audio Tape (DAT). The most important criterion is the choice of a sample rate of 44.1 kHz or 48 kHz (additionally 32 kHz) at 16 bits per sample value. Each audio signal is compressed to either 64, 96, 128, or 192 kbit/s.

Three quality levels (layers) are defined with different encoding and decoding complexity. An implementation of a higher layer must be able to decode the MPEG audio signals of lower layers.

- An FFT is applied to the audio, and the spectrum is divided into 32 nonoverlapping subbands.
- The noise level in each subband is determined using a psychoacoustic model.

In the first and second layers, the appropriately quantized spectral components are simply PCM-encoded. The third layer additionally performs Huffman coding.

MPEG provides for two types of stereo sound:

- Two channels are processed completely independently.
- In the joint stereo mode, MPEG achieves a higher compression ratio by exploiting redundancies between the two channels.

The minimal bit rate is always 32 kbit/s. The layers support different maximal bit rates: layer 1 allows for a maximum of 448 kbit/s, layer 2 for 384 kbit/s, and layer 3 for 320 kbit/s. For layers 1 and 2, not all combinations of bit rate and mode are allowed, and a decoder is not required to support a variable bit rate. In layer 3, a variable bit rate is specified by allowing the bit rate index to be switched.
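To put the quoted bit rates in perspective, the sketch below compares them with the uncompressed PCM rate of a single 16-bit channel. It assumes the target rates apply per channel, and the range check covers only the minimum and maximum figures quoted in the text; the standard itself permits only a table of discrete rates per layer, which is not reproduced here. All function names are illustrative.

```python
MAX_KBITS = {1: 448, 2: 384, 3: 320}   # per-layer maxima quoted in the text
MIN_KBITS = 32                          # minimum quoted in the text


def pcm_kbits(sample_rate_hz, bits_per_sample=16, channels=1):
    """Uncompressed PCM data rate in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compression_ratio(sample_rate_hz, target_kbits, channels=1):
    """How many times smaller the coded stream is than the PCM input."""
    return pcm_kbits(sample_rate_hz, channels=channels) / target_kbits


def within_layer_range(layer, kbits):
    """Range check only; it does not validate against the discrete rate tables."""
    return MIN_KBITS <= kbits <= MAX_KBITS[layer]


print(pcm_kbits(48000))                 # 768.0 kbit/s for one 48 kHz channel
for target in (64, 96, 128, 192):
    print(target, compression_ratio(48000, target))
# 64 12.0
# 96 8.0
# 128 6.0
# 192 4.0
```

At 44.1 kHz the uncompressed rate is 705.6 kbit/s per channel, so the same targets correspond to ratios of roughly 11, 7.4, 5.5, and 3.7.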

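4 Data Streams

An audio stream is comprised of frames, which are made up of audio access units, which in turn are divided into slots. An audio access unit is the smallest compressed audio sequence that can be completely decoded independently of all other data.

Video Stream

A video stream is comprised of 6 layers:

- Sequence layer: the beginning of the sequence layer includes two entries, the constant bit rate of the sequence and the minimum storage capacity required during decoding. A video buffer verifier influences the quantizer and forms a type of control loop.
- Group of pictures layer: this layer contains at least an I frame, which must be one of the first images. The decoding order and the display order of the images can differ.
- Picture layer: contains a whole still image and its image number.
- Slice layer: each slice consists of macro blocks. A slice also includes the scaling used for DCT quantization of all its macro blocks.
- Macro block layer
- Block layer

System Definition

The system definition specifies the combination of audio and video data streams, the coordination of input data streams with output data streams, clock adjustment, and buffer management. One could define a protocol to supply the header upon request. MPEG does not prescribe compression in real time. MPEG defines the decoding process but not the decoder itself.

5 MPEG-1

MPEG-1 has the standard number ISO/IEC 11172 and the title "Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s". The MPEG-1 standard was published in 1992. Its task is to compress video and accompanying audio, at acceptable quality, into a single MPEG data stream of about 1.5 Mb/s. The MPEG-1 standard comprises three parts: MPEG video, MPEG audio, and MPEG systems. MPEG-1 is a generic standard: it takes application requirements into account yet remains independent of any particular application.

MPEG-1

Coding of moving pictures and associated audio at up to about 1.5 Mbit/s

- Part 1: Systems
- Part 2: Video
- Part 3: Audio
- Part 4: Conformance testing
- Part 5: Software simulation

The MPEG-1 video compression algorithm must have properties suited to storage media: random access, fast forward and fast reverse, search, and reverse playback, together with audio-video synchronization, a degree of error tolerance, delay control, editability, and flexible video window formats. These match the interactivity that multimedia technology demands and characterize the MPEG-1 video compression algorithm.

MPEG-1 uses a series of techniques to achieve a high compression ratio:

- sampling the luminance and color-difference signals to reduce the amount of data;
- motion compensation to reduce interframe redundancy;
- a two-dimensional DCT to remove spatial correlation;
- quantizing the DCT coefficients and discarding unimportant information;
- reordering the quantized DCT coefficients by frequency;
- variable-length coding of the DCT coefficients;
- predictive differential coding of the DC component of each data block.

In communication networks, MPEG-1 suits many kinds of networks, such as ISDN and LANs, and is widely used for transmitting images over networks. For storage, MPEG-1 coded data can be stored on optical discs, digital audio tape, hard disks, writable magneto-optical discs, and other media. The most widespread application is the VCD. VCD uses MPEG-1 compression, compressing the picture by a factor of 25 to 200 and the sound by a factor of 65, records the result digitally, and can play for up to 74 minutes. VCD has a vertical resolution of 288 lines, and its picture quality is slightly better than VHS tape. A VCD can be searched by program index, time, and so on, so the start of the program segment a user wants can be found immediately.

MPEG-2

MPEG-2 has the standard number ISO/IEC 13818 and the title "Information technology - Generic coding of moving pictures and associated audio information". Work on MPEG-2 began in 1990, and it formally became a standard in 1995. MPEG-2 is an extension of MPEG-1, and the two share the same basic coding algorithms. MPEG-2 adds many functions that MPEG-1 lacks, such as coding of interlaced television and scalability of the bit rate.

MPEG-2

Generic coding of moving pictures and associated audio

- Part 1: Systems
- Part 2: Video
- Part 3: Audio
- Part 4: Conformance testing
- Part 5: Software simulation
- Part 6: System extensions - DSM-CC
- Part 7: Audio extension - NBC mode
- Part 8: VOID (withdrawn)
- Part 9: System extension RTI
- Part 10: Conformance extension - DSM-CC
- Part 11: IPMP on MPEG-2 Systems

MPEG-2 is a high-quality picture and sound coding standard directly related to digital television broadcasting. It is aimed mainly at the video and accompanying audio signals required for high-definition television (HDTV). The basic techniques of MPEG-2 video coding differ from those of MPEG-1 chiefly in that MPEG-2 adopts field processing, whereas MPEG-1 uses only frame processing. MPEG-2 has two kinds of pictures, frame pictures and field pictures, and prediction is likewise divided into frame prediction and field prediction. MPEG-2 can therefore encode interlaced video source data directly, whereas MPEG-1 cannot.

The MPEG-2 standard divides pictures into five profiles and four levels, giving 20 possible combinations of profile and level. Eleven of these combinations have been agreed and turned into technical specifications, covering applications from low-end video conferencing and videophones up to high-end HDTV. At present, DVD uses the specification intended for digital video discs and direct-broadcast satellite digital television, transmitting and processing picture and sound at a variable rate of 1 to 10 Mb/s; the rate changes with picture complexity and the amount of audio data, and averages 4.69 Mb/s. The use of MPEG-2 in DVD also lays the groundwork for later convergence with HDTV.

MPEG-4

The ISO/IEC standard MPEG-4 was released in 1999. MPEG-4 represents the current state of the art in video compression. Its creation was driven by the success of three fields: digital television, interactive graphics applications (such as PC games and virtual environments), and the World Wide Web. MPEG-4 aims to provide a flexible framework and an open set of coding tools for the communication, access, and management of audio-visual data. These tools will support a large number of application functions, both new and traditional. The many coding modes that MPEG-4 offers for audio-visual data, natural and synthetic, make it much easier to access objects within images or audio-visual content; this is called content-based access.

The main goal of MPEG-1 and MPEG-2 is the efficient storage and transmission of digital audio and video through data compression. What they process is therefore audio and video based on "rectangular frames", and their interactivity is confined to the level of audio and rectangular frames. MPEG-4 supports content-based interactivity: it describes an AV scene in the form of audiovisual objects (AVOs), which are related to one another in space and time; after analysis, the AV scene can be described hierarchically. MPEG-4 thus offers a new mode of interaction, content-based interactivity.

In video coding, MPEG-4 is also an important advance over existing standards. Traditional image coding works within the framework of source coding theory, treating the image as a random signal and exploiting its statistical properties to achieve compression. This approach considers neither the subjective meaning and characteristics that matter to the recipient of the information nor properties of the event itself, such as its specific meaning, importance, and consequences. MPEG-4 aims to use modern image coding methods that exploit the characteristics of human vision and capture the essence of what the image communicates; starting from a contour-and-texture approach, it supports interaction based on visual content. The key is coding based on video objects, for which MPEG-4 introduces the concepts of video session (VS), video object (VO), video object layer (VOL), and video object plane (VOP).

MPEG-4

Coding of audio-visual objects

- Part 1: Systems
- Part 2: Visual
- Part 3: Audio
- Part 4: Conformance testing
- Part 5: Reference Software
- Part 6: Delivery Multimedia Integration Framework
- Part 7: Optimized software for MPEG-4 tools
- Part 8: MPEG-4 on IP framework
- Part 9: Reference Hardware Description
- Part 10: Advanced Video Coding
- Part 11: Scene Description and Application Engine
- Part 12: ISO Base Media File Format
- Part 13: IPMP Extensions
- Part 14: MP4 File Format
- Part 15: AVC File Format
- Part 16: Animation Framework eXtension (AFX)

MPEG-4 Versions

- Version 1: December 1998
- Version 2: December 1999
- More tools were added in subsequent amendments that could be qualified as versions, even though they are harder to recognize as such.

Audiovisual Objects (AVOs) in MPEG-4

- AVOs are individually coded in order to achieve maximum efficiency.
- Defining a syntax for storing information about Intellectual Property Rights (IPR) pertaining to MPEG-4 AVO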
