Automatic Image Annotation: Automatic Image Annotation Based on Machine Learning Algorithms.doc


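Automatic image annotation thesis: Automatic Image Annotation Based on Machine Learning Algorithms

【Abstract (translated from the Chinese)】

"Semantic clarity" is an important prerequisite for managing large-scale digital image collections. A large gap exists between image content described by low-level features and image semantics as understood by humans, so research on having computers acquire the semantic content of images automatically is of great significance. In essence, automatic image annotation processes and analyzes the low-level visual features of an image to obtain high-level semantic keywords, and uses this set of keywords to represent the meaning of the image. Classification-based automatic image annotation is one of the most widely used approaches in the field. The goal of this thesis is to annotate images by applying machine learning algorithms that take the characteristics of current annotation models into account. Compared with earlier classic classification-based annotation algorithms, the improved decision tree algorithms adopted here improve both classification accuracy and running time, and the system can annotate images with rule models that people can understand.

To obtain the annotation rules, a set of required keywords (semantic concepts) is predefined for the collected image database. Image segmentation divides each image in the database into a number of regions, each roughly corresponding to one semantic object. Low-level visual features, including color, texture and shape, are then extracted from each region. After the feature attributes are extracted, meaningful regions are manually grouped into several classes, all of which are predefined semantic concepts. The feature attribute data serve as training data for the subsequent machine learning. The system then learns the semantic concepts from these feature data and labels each region with the predefined keywords, so that the image as a whole is annotated with those keywords. The machine learning algorithms of main interest are the improved NewNBtree, SimpleC4.5 and FastRandomForest algorithms; training yields the corresponding annotation models, which ultimately realize automatic image annotation. In the automatic semantic annotation stage, the concept of image information entropy is used to remove noise regions, which further improves the accuracy of the annotation system.

The algorithms are evaluated on the standard Corel image library and on 10 different training sets derived from it, verifying the effectiveness and robustness of the improved algorithms and the annotation system. The experimental results show that the adopted machine learning algorithms classify image data more effectively than traditional decision tree algorithms and can be applied to fairly large image collections to annotate images automatically.

【English abstract】

"Semantic clarity" is an important prerequisite for large-scale digital image management. A large gap exists between the low-level features of an image and the high-level semantics of the image as understood by humans. Automatic acquisition of the semantic content of images by computer is therefore of great theoretical and practical significance. The essence of automatic image annotation is to obtain high-level semantic keywords by processing and analyzing the low-level visual features of an image. This set of semantic keywords represents the image, so that images can be retrieved in the same way as in current text search. Classification-based automatic image annotation is one of the most widely used methods in the image annotation field.

The research goal is to combine the characteristics of current annotation models and use machine learning classification algorithms to annotate images. Compared with earlier classic classification-based automatic image annotation algorithms, the proposed decision tree algorithms improve classification accuracy, and the system can annotate images using rules that humans can understand. To obtain the labeling rules, the whole system must first be trained. Each image in the training set is segmented into regions of a certain similarity, the visual features of each region are extracted, and machine learning algorithms are then trained on the segmented regions. This thesis focuses on the improved NewNBtree, SimpleC4.5 and FastRandomForest algorithms, which build on classical algorithms. Appropriate decision rules are obtained through training, which ultimately realizes automatic semantic annotation. In the automatic semantic annotation stage, the concept of image information entropy is used to exclude noisy regions, which in turn improves the accuracy of the annotation system.

Experiments on the standard Corel image library, including 10 different data sets based on the Corel image database, verify the effectiveness and robustness of the algorithms and the system. The experimental results show that the proposed algorithms classify image data better than the traditional decision tree learning algorithm and can be applied effectively to large-scale training image sets. Finally, an automatic image annotation system can be implemented based on these machine learning algorithms.

【Keywords (translated from the Chinese)】 automatic image annotation; machine learning; decision tree; ensemble classification algorithm

【English keywords】 Automatic image annotation; Machine learning; Decision tree; Ensemble learning

【Contents】

Automatic Image Annotation Based on Machine Learning Algorithms
Abstract (Chinese)
Abstract
Contents
Chapter 1 Introduction
1.1 Research background and significance
1.2 Research status in China and abroad
1.2.1 Classification-based automatic image annotation models
1.2.2 Probability-based automatic image annotation models
1.2.3 Other methods
1.3 Key issues and research tasks of image annotation systems
1.3.1 Framework of the automatic annotation system
1.3.2 Key issues
1.3.3 Research tasks
1.4 Organization of this thesis
Chapter 2 Automatic image annotation based on a single decision tree
2.1 NewNBtree algorithm
2.1.1 Algorithm idea
2.1.2 Algorithm flow
2.1.3 Algorithm implementation
2.2 SimpleC4.5 algorithm
2.2.1 Algorithm idea
2.2.2 Algorithm flow
2.2.3 Algorithm implementation
2.3 Automatic image annotation method
2.3.1 Automatic image annotation workflow
2.3.2 Description of the automatic image annotation algorithm
2.4 Chapter summary
Chapter 3 Automatic image annotation based on ensemble classifiers
3.1 Ensemble classifiers
3.1.1 Ensemble learning algorithms
3.1.2 Fast random forest algorithm
3.2 Automatic image annotation method based on the fast random forest algorithm
3.2.1 Workflow of automatic image annotation based on fast random forest
3.2.2 Description of the automatic image annotation algorithm based on fast random forest
3.3 Chapter summary
Chapter 4 System implementation and analysis of results
4.1 Experimental environment
4.2 Secondary development on the Weka platform
4.2.1 Secondary development process
4.2.2 Secondary development experiments
4.3 Experiments and analysis of results
4.3.1 Experimental data sets
4.3.2 Evaluation criteria
4.3.3 Comparison and analysis of classification results based on machine learning algorithms
4.3.4 Implementation of the annotation system based on machine learning algorithms
4.4 Chapter summary
Conclusion
Acknowledgements
References
Papers published and research results obtained during the master's program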
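The abstract describes training decision trees on per-region feature attributes whose classes are hand-assigned semantic concepts. As a rough illustration only, the sketch below scores a numeric split with the gain ratio used by the classical C4.5 algorithm. It is not the thesis's SimpleC4.5 or NewNBtree; the feature name `mean_hue`, the concept labels and all data values are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` (classical C4.5 criterion)."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    gain = entropy(labels) - remainder
    # Split information penalises very unbalanced partitions.
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info


# Hypothetical training data: one color feature per segmented region,
# with the semantic concept assigned by hand.
mean_hue = [0.10, 0.15, 0.20, 0.60, 0.65, 0.70]
concept = ["sky", "sky", "sky", "grass", "grass", "grass"]

# Candidate thresholds are the observed values, excluding the largest.
candidates = sorted(set(mean_hue))[:-1]
best = max(candidates, key=lambda t: gain_ratio(mean_hue, concept, t))
print(best, gain_ratio(mean_hue, concept, best))
```

A tree learner would repeat this search over every feature at every node. The resulting threshold tests are the kind of human-readable rule the abstract refers to.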

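The abstract states that image information entropy is used to discard noise regions during the annotation stage, but it does not give the exact criterion. The minimal sketch below assumes, purely for illustration, that a region counts as noise when the Shannon entropy of its gray-level distribution falls below a threshold. The threshold `min_entropy`, the region names and the pixel values are all hypothetical.

```python
import math
from collections import Counter


def region_entropy(pixels):
    """Shannon entropy (in bits) of the gray-level distribution of one region."""
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(pixels).values())


def drop_noise_regions(regions, min_entropy=1.0):
    """Keep only regions whose entropy reaches `min_entropy`.

    Assumption: low-entropy regions are treated as noise. The thesis may
    use a different rule; the abstract does not specify it.
    """
    return {name: px for name, px in regions.items()
            if region_entropy(px) >= min_entropy}


# Hypothetical segmented regions, given as flat lists of gray levels.
regions = {
    "region_a": [200, 201, 202, 203, 200, 201, 202, 203],  # four equally likely levels
    "region_b": [17, 17, 17, 17],                          # a single gray level
}

kept = drop_noise_regions(regions)
print(sorted(kept))
```

Only the regions that survive this filter would be passed to the trained classifier for keyword assignment.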