Rough Set Based Attribute Reduction with Consistent Confidence


GAO Can1,2*, MIAO Duo-qian1,2, ZHANG Zhi-fei1,2, ZHANG Hong-yun1,2

1. Department of Computer Science and Technology, Tongji University, Shanghai 201804, China;
2. Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804, China

Abstract: To eliminate the reduction anomalies that arise in existing probabilistic rough set models, non-parameterized and parameterized maximum decision entropy measures for attribute reduction are proposed by introducing the concept of the maximum confidence of an uncertain object. The monotonicity of the parameterized maximum decision entropy is explained, the core and relatively dispensable attributes of the proposed model are defined, and the relationship between its reducts and those of existing probabilistic rough set models is analyzed. Moreover, object confidence is introduced into the discernibility matrix to construct non-parameterized and parameterized confidence discernibility matrices, and the difference between these and the classical discernibility matrix in characterizing uncertain objects is discussed. Finally, a case study shows the validity of the proposed model.

Key words: probabilistic rough set; attribute reduction; reduction anomaly; maximum decision entropy; confidence discernibility matrix

0 Introduction

Rough set theory, an effective method for handling imprecise, inconsistent and incomplete information [1], has been widely applied in machine learning, data mining and artificial intelligence since it was proposed by the Polish scholar Professor Z. Pawlak [2]. It partitions the object space by means of the indiscernibility relation and characterizes uncertain knowledge with upper and lower approximation operators. However, it places overly strict requirements on the data and lacks flexibility and robustness. To address these shortcomings of classical rough sets, researchers have introduced probabilistic methods and proposed probabilistic models such as variable precision rough sets, decision-theoretic rough sets and Bayesian rough sets [3]. Reference [4] introduced the concept of a misclassification rate into the classical rough set model, extending full set inclusion to partial inclusion, and proposed the variable precision rough set model, which handles noisy data effectively. Its reduct, however, still follows the classical rough set definition, so the reduction process suffers from inconsistent object confidence, interval dynamics, reduction jumps and classification anomalies.

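Reference [5] analyzed the defects of the reduct definition in [4] and resolved the classification anomaly by constraining the classification precision interval before and after reduction. Reference [6] further pointed out that the interval intersection should remain non-empty or equal during reduction, which eliminates the interval dynamics anomaly. Reference [7] gave the concept of maximum distribution reduction, but that reduct only keeps each object's maximum distribution decision unchanged, so every kind of reduction anomaly may still occur. Reference [8] proposed variable precision upper and lower approximation distribution reductions. The lower approximation distribution reduction can only eliminate the classification anomaly, while the upper approximation distribution reduction imposes a weaker condition than the lower one, so the positive region changes after reduction in both cases. Reference [9] proposed a fast new method for computing maximum distribution reducts, but it did not eliminate the reduction anomalies either. The root cause of the reduction anomalies in the above probabilistic rough set models is the non-monotonicity of the reduction measure. Based on the classification and confidence information of objects, this paper proposes a monotonic reduction model built on the parameterized maximum decision entropy, together with confidence discernibility matrices, which effectively eliminates the various anomalies in the reduction process and achieves the goal of attribute reduction.

1 Basic concepts

Definition 1 [1]. An information system can be expressed as $S=(U, A, V, f)$, where $U$ is the set of objects; $A$ is a non-empty set of attributes; $V=\bigcup_{a\in A}V_a$, with $V_a$ the value domain of attribute $a$; and $f: U\times A\to V$ is a mapping that assigns an attribute value to every object $x$ in $U$, that is, $a(x)\in V_a$ for all $x\in U$ and $a\in A$. If the attribute set $A$ can be divided into a condition attribute set $C$ and a decision attribute set $D$, i.e., $A=C\cup D$ and $C\cap D=\emptyset$, the information system is called a decision information system or decision table.

Definition 2 [1]. Given an information system $S=(U, A, V, f)$, for any attribute subset $B\subseteq A$ the indiscernibility relation is defined as

$$\mathrm{IND}(B)=\{(x, y)\in U\times U \mid \forall a\in B,\ a(x)=a(y)\}.$$

$\mathrm{IND}(B)$ is an equivalence relation and induces a partition of $U$, denoted $U/\mathrm{IND}(B)$, or $U/B$ for short.

Definition 3 [1]. Given an information system $S=(U, A, V, f)$, for any attribute subset $B\subseteq A$ the equivalence class of $x$ under the indiscernibility relation $\mathrm{IND}(B)$ is $[x]_B=\{y\in U \mid (x, y)\in \mathrm{IND}(B)\}$.

Definition 4 [4]. Given a decision table $S=(U, A=C\cup D, V, f)$ and a classification precision $\beta\in(0.5, 1]$, let $X\subseteq U$. For any attribute subset $B\subseteq C$, the $\beta$-lower and $\beta$-upper approximations of $X$ with respect to $B$ are, respectively,

$$\underline{B}_\beta(X)=\bigcup\{[x]_B \mid P(X\mid [x]_B)\ge\beta\} \tag{1}$$

$$\overline{B}_\beta(X)=\bigcup\{[x]_B \mid P(X\mid [x]_B)>1-\beta\} \tag{2}$$

where $P(X\mid [x]_B)=|X\cap [x]_B|/|[x]_B|$.

Definition 5 [4]. Given a decision table $S=(U, A=C\cup D, V, f)$ and a classification precision $\beta\in(0.5, 1]$, the positive region, boundary region and negative region of $D$ with respect to the condition attributes $C$ are, respectively,

$$\mathrm{POS}(C, D, \beta)=\bigcup\{\underline{C}_\beta(D_j) \mid D_j\in U/D\} \tag{3}$$

$$\mathrm{BND}(C, D, \beta)=\bigcup\{\overline{C}_\beta(D_j)-\underline{C}_\beta(D_j) \mid D_j\in U/D\} \tag{4}$$

$$\mathrm{NEG}(C, D, \beta)=U-\mathrm{POS}(C, D, \beta)-\mathrm{BND}(C, D, \beta) \tag{5}$$

2 Attribute reduction with consistent confidence

2.1 Maximum decision entropy attribute reduction

For ease of exposition, the notions of the maximum inclusion degree of a condition class, the maximum-inclusion decision, and their distribution are introduced first. According to the Bayesian principle, the misclassification rate of a rule depends mainly on the minority decision samples, so the maximum decision information of each equivalence class should remain unchanged during attribute reduction. The non-parameterized maximum decision entropy takes into account the maximum decision class information of every equivalence class, so after reduction the maximum decision class information of every object is guaranteed not to change. The parameterized maximum decision entropy introduces the variable precision classification parameter and reflects only the equivalence classes whose maximum inclusion degree exceeds the classification precision threshold. In other words, it keeps the variable precision positive region unchanged before and after reduction, which achieves the goal of reduction. The maximum decision entropy function is shown in Fig. 1.

Proposition 1. For any given decision table, the parameterized maximum decision entropy is monotonic.

Proposition 2. For a given decision table, the non-parameterized maximum decision entropy does not necessarily satisfy monotonicity.

As Fig. 1 shows, the function is strictly monotonically decreasing on the interval (0.5, 1], so the parameterized maximum decision entropy measure is monotonic. Over the whole domain, however, the maximum decision entropy measure is a convex function, so the non-parameterized maximum decision entropy is not monotonic.

4 Conclusion

In rough set theory, an attribute reduct is a subset of the condition attributes that preserves the classification ability of the decision table.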
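Returning to Definitions 2 through 5 in Section 1, the following Python sketch may help make the partition, the approximations and the three regions concrete. It is only an illustration under stated assumptions: the toy decision table `TABLE`, the attribute names `a`, `b`, `d`, and the function names are all hypothetical, and the comparisons in Eqs. (1) and (2) are taken to be "at least β" and "greater than 1 − β", as in the usual variable precision rough set formulation.

```python
from collections import defaultdict

# Hypothetical toy decision table: object id -> attribute values.
TABLE = {
    1: {"a": 0, "b": 0, "d": "yes"},
    2: {"a": 0, "b": 0, "d": "yes"},
    3: {"a": 0, "b": 0, "d": "yes"},
    4: {"a": 0, "b": 0, "d": "no"},
    5: {"a": 1, "b": 0, "d": "no"},
    6: {"a": 1, "b": 0, "d": "no"},
    7: {"a": 1, "b": 1, "d": "yes"},
    8: {"a": 1, "b": 1, "d": "no"},
}


def partition(table, attrs):
    """Return U/B: the equivalence classes of IND(B) as frozensets of object ids."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        # Objects with identical values on every attribute in attrs are indiscernible.
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return [frozenset(block) for block in blocks.values()]


def approximations(table, attrs, target, beta):
    """Return the beta-lower and beta-upper approximations of target (Eqs. 1 and 2)."""
    lower, upper = set(), set()
    for block in partition(table, attrs):
        p = len(block & target) / len(block)  # P(X | [x]_B)
        if p >= beta:
            lower |= block
        if p > 1 - beta:
            upper |= block
    return lower, upper


def regions(table, cond, dec, beta):
    """Return the positive, boundary and negative regions (Eqs. 3 to 5)."""
    pos, bnd = set(), set()
    for decision_class in partition(table, dec):
        lower, upper = approximations(table, cond, set(decision_class), beta)
        pos |= lower
        bnd |= upper - lower
    neg = set(table) - pos - bnd
    return pos, bnd, neg


for beta in (0.7, 0.8):
    pos, bnd, neg = regions(TABLE, ["a", "b"], ["d"], beta)
    print(beta, sorted(pos), sorted(bnd), sorted(neg))
# prints: 0.7 [1, 2, 3, 4, 5, 6] [7, 8] []
# prints: 0.8 [5, 6] [1, 2, 3, 4, 7, 8] []
```

In this toy table the class {1, 2, 3, 4} has inclusion degree 0.75 in the "yes" decision, so it belongs to the positive region at β = 0.7 but falls into the boundary region at β = 0.8. This is the kind of threshold sensitivity the variable precision model is meant to expose.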

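Section 2.1 names the maximum inclusion degree of a condition class and the maximum-inclusion decision, but this excerpt does not reproduce their formulas or the entropy measure itself. The sketch below is therefore not the authors' measure. It is a hypothetical consistency check, assuming that the maximum inclusion degree of a class is the largest share held by any single decision value and that the parameterized variant only looks at classes whose degree is at least β. The table `ROWS` and the function names `max_inclusion` and `keeps_max_decision` are illustrative inventions.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (a, b, d) for objects 1..11.
ROWS = [
    (0, 0, "yes"), (0, 0, "yes"), (0, 0, "yes"), (0, 0, "no"),
    (1, 0, "no"), (1, 0, "no"),
    (2, 0, "yes"), (2, 0, "no"),
    (2, 1, "yes"), (2, 1, "yes"), (2, 1, "no"),
]
TABLE = {i + 1: {"a": a, "b": b, "d": d} for i, (a, b, d) in enumerate(ROWS)}


def max_inclusion(table, attrs, dec):
    """Map each object to (its class's most frequent decisions, maximum inclusion degree)."""
    blocks = defaultdict(list)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].append(obj)
    result = {}
    for members in blocks.values():
        counts = Counter(table[obj][dec] for obj in members)
        top = max(counts.values())
        # Ties are kept as a set so that an ambiguous class stays visibly ambiguous.
        best = frozenset(d for d, c in counts.items() if c == top)
        for obj in members:
            result[obj] = (best, top / len(members))
    return result


def keeps_max_decision(table, full, subset, dec, beta=0.0):
    """Check whether subset keeps the maximum decision of every sufficiently confident object.

    With beta=0.0 every object is compared (non-parameterized reading). With a
    larger beta only objects whose maximum inclusion degree is at least beta are
    compared, so both attribute sets must select the same objects with the same
    decisions (an assumed reading of the parameterized case).
    """
    def confident(attrs):
        info = max_inclusion(table, attrs, dec)
        return {obj: best for obj, (best, degree) in info.items() if degree >= beta}

    return confident(full) == confident(subset)


print(keeps_max_decision(TABLE, ["a", "b"], ["a"], "d"))            # prints: False
print(keeps_max_decision(TABLE, ["a", "b"], ["a"], "d", beta=0.7))  # prints: True
print(keeps_max_decision(TABLE, ["a", "b"], ["b"], "d", beta=0.7))  # prints: False
```

Dropping `b` merges objects 7 to 11 into one class with degree 0.6, which changes the maximum decision of objects 7 and 8 but leaves every class with degree at least 0.7 untouched. Under the assumptions above, `{a}` would therefore pass the parameterized check at β = 0.7 while failing the non-parameterized one.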