Weka C4.5 Algorithm Usage Example: Iris Data
Data source: iris.arff
Decision tree: J48, the Java implementation of the C4.5 algorithm

NAME
weka.classifiers.trees.J48

SYNOPSIS
Class for generating a pruned or unpruned C4.5 decision tree. For more information, see Ross Quinlan (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA.

OPTIONS
binarySplits - Whether to use binary splits on nominal attributes when building the trees. Default: False
confidenceFactor - The confidence factor used for pruning (smaller values incur more pruning). Default: 0.25
debug - If set to true, classifier may output additional info to the console. Default: False
minNumObj - The minimum number of instances per leaf. Default: 2
numFolds - Determines the amount of data used for reduced-error pruning. One fold is used for pruning, the rest for growing the tree. Default: 3
reducedErrorPruning - Whether reduced-error pruning is used instead of C4.5 pruning. Default: False
saveInstanceData - Whether to save the training data for visualization. Default: False
seed - The seed used for randomizing the data when reduced-error pruning is used. Default: 1
subtreeRaising - Whether to consider the subtree raising operation when pruning. Default: True
unpruned - Whether pruning is skipped: if True, the tree is left unpruned. Default: False
useLaplace - Whether counts at leaves are smoothed based on Laplace. Default: False

Pruning method: either C4.5 pruning or reduced-error pruning, selected by reducedErrorPruning; the default is C4.5 pruning.
Whether to prune at all: controlled by unpruned; by default the tree is pruned.

Using the system defaults (C4.5 pruning): J48 -C 0.25 -M 2
-C: confidenceFactor
-M: minNumObj
The decision tree obtained with C4.5 pruning is as follows:
[Figure: decision tree, not preserved in the source]

Using reduced-error pruning: J48 -R -N 3 -Q 1 -M 2
-R: reducedErrorPruning
-N: numFolds
-Q: seed
-M: minNumObj
The decision tree is as follows:
[Figure: decision tree. Recoverable labels: split attributes petalwidth and petallength; thresholds 0.6, 4.8 and 1.5; leaves Iris-setosa (34.0), Iris-versicolor (32.0/1.0), Iris-virginica (29.0), Iris-virginica (3.0), Iris-versicolor (2.0)]
Error rate: 5.3%
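The two command lines above can be assembled from the option values, and the numFolds setting can be checked against the leaf counts in the figure. The following is a minimal sketch, not Weka code: the class J48OptionsSketch and its methods are hypothetical helpers introduced here for illustration, and the even fold split (total / numFolds) is an assumption about how the pruning fold is sized. It uses only the flags quoted above and the fact that iris.arff contains 150 instances.

```java
/** Hypothetical helper, not part of Weka: builds the J48 flag lists quoted in the text. */
final class J48OptionsSketch {

    /** Flags for the default C4.5 pruning: -C (confidenceFactor), -M (minNumObj). */
    static String[] c45PruningOptions(double confidenceFactor, int minNumObj) {
        return new String[] {
            "-C", String.valueOf(confidenceFactor),
            "-M", String.valueOf(minNumObj)
        };
    }

    /** Flags for reduced-error pruning: -R, -N (numFolds), -Q (seed), -M (minNumObj). */
    static String[] reducedErrorPruningOptions(int numFolds, int seed, int minNumObj) {
        return new String[] {
            "-R",
            "-N", String.valueOf(numFolds),
            "-Q", String.valueOf(seed),
            "-M", String.valueOf(minNumObj)
        };
    }

    /**
     * Instances left for growing the tree when one fold is held out for pruning.
     * Assumption: the held-out fold has totalInstances / numFolds instances.
     */
    static int growingSetSize(int totalInstances, int numFolds) {
        int pruningSetSize = totalInstances / numFolds;
        return totalInstances - pruningSetSize;
    }

    public static void main(String[] args) {
        System.out.println("J48 " + String.join(" ", c45PruningOptions(0.25, 2)));
        System.out.println("J48 " + String.join(" ", reducedErrorPruningOptions(3, 1, 2)));

        // Leaf counts read from the reduced-error pruning figure.
        int leafTotal = 34 + 32 + 29 + 3 + 2;
        System.out.println("growing set size: " + growingSetSize(150, 3));
        System.out.println("sum of leaf counts: " + leafTotal);
    }
}
```

Under these assumptions, the leaf counts sum to the size of the growing set rather than to all 150 instances, which would be consistent with one of three folds being held out for pruning. In a Weka program, arrays like these would typically be handed to the classifier's setOptions method instead of being printed.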