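University of Science and Technology of China
Master's Thesis
Research on Several Key Issues in Spontaneous Facial Expression Recognition
Name: Lü Yanpeng (吕彦鹏)
Degree applied for: Master
Major: Computer Application Technology
Supervisor: Wang Shangfei (王上飞)
2011-04

Chinese Abstract

Facial expression is one of the most important channels through which people communicate emotional information. Expression recognition has become a key topic in research on new, anthropomorphic modes of human-computer interaction. Researchers at many institutions and universities in China and abroad have explored facial expression recognition techniques and algorithms with a variety of methods and have achieved some results. Overall, however, facial expression recognition is still at the laboratory stage, and most studies target posed expressions. Posed expressions look stiff, vary little, and differ considerably from spontaneous expressions, so research on spontaneous expressions helps move expression recognition toward practical use.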

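Based on the characteristics of spontaneous expressions, this thesis proposes three spontaneous expression recognition methods and applies spontaneous expression recognition to implicit emotional semantic tagging of video. The specific work is as follows:

(1) A spontaneous expression recognition method based on head motion is proposed. First, an eye location algorithm locates the pupils in the onset frame and the apex frame of the expression, and five head motion features are extracted from the pupil positions and the frame difference between the onset and apex frames. Second, the Active Appearance Model (AAM) algorithm extracts thirty appearance features from the apex frame. Finally, the Sequential Minimal Optimization (SMO) algorithm performs the classification. The experimental results show that head motion features discriminate the fear expression relatively well, and that head motion can be used on its own for spontaneous expression recognition or added to appearance features to raise the recognition rate.

(2) An attitude recognition method based on head motion is proposed. First, an eye location algorithm locates the pupils in every frame, and the coordinates of the midpoint between the two eyes are computed. If the displacement of this midpoint between consecutive frames is larger in the x direction than in the y direction, the motion is judged to be a nod, that is, a positive attitude; otherwise it is judged to be a negative attitude. Finally, a voting algorithm over the results for all consecutive frames gives the final decision. The experimental results show that the method recognizes user attitude effectively and in real time.

(3) A spontaneous expression recognition method based on feature point tracking is proposed. First, the expression sequence is normalized according to the eye location results, and 23 feature points are marked by hand in the onset frame as the starting points for tracking. Second, a Kalman filter tracks the feature points. Next, two kinds of features, feature point displacement and change in the distance between feature points, are extracted from the feature point coordinates. Finally, the features are fed to a hidden Markov model for classification. The experimental results show that Kalman-filter-based feature point tracking achieves fairly good results, that recognition based on the change in distance between feature points is clearly better than recognition based on feature point displacement, and that the recognition rate for spontaneous expressions is lower than that for posed expressions.

(4) A method for implicit emotional semantic tagging of video based on spontaneous expression recognition is proposed. The expression recognition method from the first piece of work recognizes the user's expressions while the user watches a video, and the emotional semantic label of the video is inferred from them. The experimental results show that, given a reliable expression recognition method, expressions can be used to infer emotion and thereby provide emotional semantic tags for videos.

Keywords: spontaneous expression, eye detection, head motion, feature point tracking, attitude recognition, emotional semantics, implicit tagging

ABSTRACT

Facial expression is one of the most important ways for people to communicate their feelings to each other. Facial expression recognition has become one of the key issues in research on anthropomorphic human-computer interaction. Many research institutions and universities, both at home and abroad, have proposed methods to solve this problem, and some progress has been made. However, expression recognition is still at the laboratory stage, and most research is based on posed expressions. Posed expressions appear stiff and rigid and differ from spontaneous human expressions. Research based on spontaneous expressions therefore helps bring expression recognition into practical application. Based on the characteristics of spontaneous expressions, this thesis proposes three methods for expression recognition and applies one of them to implicit emotional semantic tagging of video. The details are as follows.

(1) We propose a spontaneous facial expression recognition method based on head motion. First, pupil coordinates are detected by eye location in the onset and apex frames. Second, 30 appearance features are extracted from the apex frame by AAM. Finally, SMO is employed for classification. The experimental results indicate that head motion features discriminate fear well, and that they can classify expressions by themselves or be added to another feature set to improve accuracy.

(2) We propose a method of attitude recognition based on head motion. First, pupil locations are detected in each frame. Then the displacement of the point between the eyes across successive frames is calculated. If the displacement along the x-axis is greater than that along the y-axis, the result for that pair of frames is positive; otherwise it is negative. Finally, a voting algorithm is employed to obtain the final result. The experimental results indicate that this method can detect user attitude in real time.

(3) We propose a recognition method based on feature point tracking. First, all expression sequences are normalized according to their pupil coordinates. Second, 23 points are labeled manually in the onset and apex frames. Then a Kalman filter is used for tracking. Two kinds of features (point displacement and variation in the distance between points) are calculated. Finally, a Hidden Markov Model is employed as the classifier. The experimental results indicate that the Kalman filter tracking method locates the points correctly. However, classification based on the point displacement feature is not as good as classification based on the distance variation feature, and accuracy on posed expressions is better than accuracy on spontaneous ones.

(4) We propose a video emotional semantic implicit tagging method based on spontaneous expression recognition. The first expression recognition method is used to recognize expressions, and the emotional semantic tag is inferred from the recognition result. The results show that if the expression recognition method is reliable, emotion can be inferred from it and this implicit tagging is feasible.

Key Words: Spontaneous Expression Recognition, Eye Location, Head Motion, Feature Point Tracking, Attitude Recognition, Emotional Semantic, Implicit Tagging

List of Figures

Figure 1.1 Ekman's six basic expressions
Figure 1.2 Examples of facial action units
Figure 1.3 Example of decomposing an expression into action units
Figure 2.1 Flow of the head-motion-based expression recognition algorithm
Figure 2.2 Proportions of the human eye
Figure 2.3 Distribution of AAM landmark points
Figure 2.4 Eye location examples
Figure 3.1 Attitude recognition flow
Figure 3.2 Examples of eye detection errors
Figure 3.3 "可佳" demonstration at the Shenzhen Hi-Tech Fair
Figure 4.1 Flow of spontaneous expression recognition based on feature point tracking
Figure 4.2 Feature point annotation in the onset and apex frames
Figure 4.3 Image normalization example
Figure 4.4 特征点追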



