Research on Decorrelation in Video Coding Based on Heterogeneous Multi-core Processors


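Huazhong University of Science and Technology, doctoral dissertation
Research on Decorrelation in Video Coding Based on Heterogeneous Multi-core Processors
Name: Gao Yi (高毅)
Degree applied for: Doctorate
Major: Computer System Architecture
Advisor: Yu Shengsheng (余胜生)
Date: 2009-08-30

Abstract (摘要)

Video coding is currently a focus of research and industrial application both in China and abroad. Video coding achieves compression by decorrelating the video signal, so research on decorrelation methods in video coding has significant theoretical value and practical importance. Heterogeneous multi-core processors (HMP) are gradually becoming the mainstream trend in processor development, and efficient video coding methods for HMP have attracted considerable attention. With the main goal of improving the rate-distortion (RD) performance of video coding on HMP, this dissertation proposes new intra prediction and transform tools and studies parallel motion estimation algorithms.

To address the problem that the DC (direct current) mode of intra prediction is unsuitable for flat image regions, a distance-based weighted prediction (DWP) method is proposed. The method builds a linear prediction model on the observation that, in flat regions, the correlation between two pixels is inversely related to the distance between them. To reduce computational complexity, and in particular to ease hardware implementation, iDWP (integral DWP) is further proposed. Experimental results show that DWP and iDWP achieve better decorrelation.

Prediction residuals under different intra prediction modes have different energy distributions, whereas the DCT (discrete cosine transform) uses a fixed transform matrix, so the DCT can hardly achieve ideal decorrelation. The Karhunen-Loève transform (KLT) is the optimal transform in the mean-square-error sense, but its performance is data-dependent. Because the residual signals of each prediction mode have fairly consistent energy distributions, an optimal frequency matching (OFM) algorithm is proposed to train a unique KLT matrix for each prediction mode, which avoids the very large computational cost of training KLT matrices in real time. Experimental results show that the trained KLT matrices deliver stable transform performance that is superior to the DCT.

Video coding often uses variable-size block prediction. A mismatch between the prediction block size and the DCT matrix not only degrades decorrelation performance but also causes severe blocking artifacts inside blocks larger than 4×4. The dissertation proposes using an M-channel filter bank (MCFB) to decompose residual blocks. Block-based multi-channel decomposition has three advantages: first, the inherent properties of convolution reduce blocking artifacts inside the prediction block; second, block-based RD optimization can be performed; third, the transform coefficients have frequency characteristics similar to those of DCT coefficients. Experiments show that, compared with the DCT, MCFB clearly improves both the subjective and the objective quality of decoded images. In addition, to reduce the computational complexity of the transform, an integer M-channel filter bank is constructed.

Motion estimation has very high computational complexity, and the GPU (graphics processing unit) in an HMP is often used to accelerate it. Under the GPU's parallel processing model, however, the current macroblock cannot obtain the motion and mode information of neighboring macroblocks as a reference during motion estimation, so true rate-distortion optimization cannot be used to select the best motion vector. Existing methods all select the best motion vector directly by SAD (sum of absolute differences). The dissertation proposes a collocated macroblock based motion estimation (CMME) algorithm, which uses the motion information of macroblocks in the previous frame as a reference to estimate the cost of the current motion vector. Experimental results show that CMME is particularly suitable for lower bit rates and for sequences with relatively gentle motion; at a small additional computational cost, it achieves a PSNR gain of up to 0.8 dB.

In summary, building on an in-depth study of decorrelation methods in video coding, the dissertation proposes a series of new intra prediction and transform techniques that improve RD performance, together with corresponding integer implementations. Motion estimation under a parallel computing model is also studied.

Keywords: video coding; motion estimation; intra prediction; integer transform; heterogeneous multi-core processor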
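The contrast the abstract draws between DC prediction and distance-based weighted prediction can be illustrated with a small sketch. The abstract does not give the actual DWP model or its coefficients, so the inverse-distance weights below, the choice of reference pixels (the pixel directly above the column and the pixel directly left of the row), and the function names `dc_predict` and `dwp_predict` are all assumptions made for illustration only.

```python
def dc_predict(top, left):
    """DC prediction: every pixel gets the rounded mean of all neighbours."""
    n = len(top)
    dc = (sum(top) + sum(left) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]


def dwp_predict(top, left):
    """Illustrative inverse-distance weighted prediction.

    ASSUMPTION: each pixel is predicted from the reconstructed pixel above
    its column and the one left of its row, weighted by 1/distance. The
    dissertation's real linear model is not stated in the abstract.
    """
    n = len(top)
    block = []
    for y in range(n):
        row = []
        for x in range(n):
            d_top = y + 1    # vertical distance to the row above the block
            d_left = x + 1   # horizontal distance to the column left of the block
            w_top, w_left = 1.0 / d_top, 1.0 / d_left
            value = (w_top * top[x] + w_left * left[y]) / (w_top + w_left)
            row.append(int(value + 0.5))  # round to nearest integer sample
        block.append(row)
    return block


# A smooth horizontal ramp above the block, constant column on the left.
top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
dc_block = dc_predict(top, left)    # one flat value for the whole block
dwp_block = dwp_predict(top, left)  # first row follows the ramp above it
```

In this example the DC predictor fills the block with a single value, while the weighted predictor lets pixels close to the top row follow the ramp and pixels close to the left column stay near its constant value. An integer variant in the spirit of iDWP would replace the floating-point weights with fixed-point multipliers and shifts; that step is not shown here.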

Abstract

Video coding is currently a focus of research and industrial application worldwide. Video compression is achieved by decorrelating the video content, that is, by removing its redundancy. Research on decorrelation techniques for video coding therefore has great theoretical value and practical significance. Heterogeneous multi-core processors (HMP) are gradually becoming the mainstream of future processors, and efficient video coding on HMP has drawn considerable attention. This work focuses on improving the rate-distortion (RD) performance of video coding on HMP. A novel intra prediction strategy and several transform tools are presented, and parallel motion estimation is also studied.

Because the DC (direct current) mode of intra prediction is unsuitable for smoothly varying areas, distance-based weighted prediction (DWP) is proposed. A linear prediction model is built on the inverse relation between the correlation of two pixels and the distance between them. To reduce computational complexity, and especially the cost of hardware implementation, integer DWP (iDWP) is further proposed. Experiments demonstrate that DWP and iDWP achieve significant improvements in RD performance with only a small increase in computational complexity.

Since the residual signals of different intra prediction modes exhibit dissimilar energy distributions, it is difficult for the DCT (discrete cosine transform), with its fixed transform matrix, to achieve ideal decorrelation performance. Although the KLT (Karhunen-Loève transform) is optimal for transform coding, its coding performance is data-dependent. Because the residual signals of the same prediction mode have relatively consistent energy distributions, an optimal frequency matching (OFM) algorithm is proposed to train a unique KLT matrix for each intra prediction mode, which avoids the extremely high computational complexity of real-time KLT matrix training. Experiments show that the coding performance of the trained KLT matrices is stable and superior to that of the DCT.

Variable block size prediction is an important tool in video coding. However, the mismatch between the prediction block and the DCT matrix not only degrades decorrelation performance but also introduces blocking artifacts inside blocks larger than 4×4. To overcome these defects, an M-channel filter bank (MCFB) is proposed to decompose residual blocks after prediction. Decomposition with an M-channel filter bank has three merits: first, blocking artifacts inside the prediction block are alleviated by the inherent properties of convolution; second, block-based RD optimization can be performed; finally, the frequency characteristics of the transformed coefficients are similar to those of the DCT. Experimental results show that MCFB significantly outperforms the DCT in both subjective and objective video quality. To reduce the computational complexity of the MCFB transform, an integer M-channel filter bank is also constructed.

Motion estimation in video coding is of …
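The parallel motion estimation idea summarized in the Chinese abstract (choosing a motion vector by a rate-aware cost that uses the collocated macroblock of the previous frame as a reference, instead of by SAD alone) can be sketched as follows. This is a minimal illustration and not the dissertation's CMME algorithm: the signed Exp-Golomb rate model, the Lagrange multiplier `lam`, the precomputed `sad_by_mv` table, and all function names are assumptions.

```python
def se_golomb_bits(v):
    """Bit length of the signed Exp-Golomb code for integer v."""
    code_num = 2 * v - 1 if v > 0 else -2 * v  # map signed value to unsigned index
    return 2 * (code_num + 1).bit_length() - 1


def mv_rate(mv, pred):
    """Approximate bits needed to code mv as a difference from the predictor."""
    return se_golomb_bits(mv[0] - pred[0]) + se_golomb_bits(mv[1] - pred[1])


def best_mv_sad(sad_by_mv):
    """Baseline: pick the candidate with the smallest SAD, ignoring rate."""
    return min(sad_by_mv, key=lambda mv: sad_by_mv[mv])


def best_mv_collocated(sad_by_mv, collocated_mv, lam):
    """Pick the candidate minimising SAD + lam * rate.

    ASSUMPTION: the predictor is the motion vector of the collocated
    macroblock in the previous frame. It is known before the current frame
    is processed, so every macroblock can be searched independently in
    parallel.
    """
    return min(
        sad_by_mv,
        key=lambda mv: sad_by_mv[mv] + lam * mv_rate(mv, collocated_mv),
    )


# Hypothetical SAD values for three candidate motion vectors of one macroblock.
sad_by_mv = {(4, 0): 500, (0, 0): 510, (-3, 2): 498}
print(best_mv_sad(sad_by_mv))                         # (-3, 2)
print(best_mv_collocated(sad_by_mv, (4, 0), lam=4))   # (4, 0)
```

With these made-up numbers the SAD-only search picks (-3, 2) for a 2-point SAD advantage, while the rate-aware cost prefers (4, 0), which matches the collocated vector and is therefore cheap to code. How the actual algorithm weights the two terms, and which previous-frame information it uses, is not stated in the abstract.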
