PRTools Version 3.0 A Matlab Toolbox for Pattern Recognition


R.P.W. Duin
January 2000

An introduction into the setup, definitions and use of PRTools is given. Readers are assumed to be familiar with Matlab and should have a basic understanding of the field of statistical pattern recognition.

tel: +31 15 2786143
fax: +31 15 2786740
email: duin@tn.tudelft.nl

Pattern Recognition Group
Delft University of Technology
P.O. Box 5046, 2600 GA Delft
The Netherlands

1. Introduction

In statistical pattern recognition one studies techniques for the generalisation of decision rules to be used for the recognition of patterns in experimental data sets. This area of research has a strong computational character, demanding a flexible use of numerical programs for studying the data as well as for evaluating the data analysis techniques themselves. As still new techniques are being proposed in the literature, a programming platform is needed that enables a fast and flexible implementation. Pattern recognition is studied in almost all areas of applied science. Thereby the use of a widely available numerical toolset like Matlab may be profitable both for the use of existing techniques and for the study of new algorithms. Moreover, because of its general nature in comparison with more specialised statistical environments, it offers an easy integration with the preprocessing of data of any nature. This may certainly be facilitated by the large set of toolboxes available in Matlab.

The more than 100 routines offered by PRTools in its present state represent a basic set covering largely the area of statistical pattern recognition. In order to make the evaluation and comparison of algorithms easier, a set of data generation routines is included, as well as a small set of standard real-world datasets. Of course, many methods and proposals are not yet implemented. Anybody who likes to contribute is cordially invited to do so. The very important field of neural networks has been skipped partially, as Matlab already includes a very good toolbox in that area. At the moment just some basic routines based on that toolbox are included in order to facilitate a comparison with traditional techniques.

PRTools has a few limitations. Due to the heavy memory demands of Matlab, very large problems with learning sets of tens of thousands of objects cannot be handled on moderate machines. Moreover, some algorithms are slow as it appeared to be difficult to avoid nested loops. A fundamental drawback with respect to some applications is that PRTools as yet does not offer the possibility of handling missing data problems, nor the use of fuzzy or symbolic data. These areas demand their own sets of routines and are waiting for manpower.

In the next sections, first the area of statistical pattern recognition covered by PRTools is described. Next the toolbox is summarized and details are given on some specific implementations. Finally some examples are presented.

2. The area of statistical pattern recognition

PRTools deals with sets of labeled objects and offers routines for generalising such sets into functions for data mapping and classification. An object is a k-dimensional vector of feature values. It is assumed that for all objects in a problem all values of the same set of features are given. The space defined by the actual set of features is called the feature space. Objects are represented as points or vectors in this space. A classification function assigns labels to new objects in the feature space. Usually, this is not done directly, but in a number of stages in which the initial feature space is successively mapped into intermediate stages, finally followed by a classification. The concept of mapping spaces and datasets is thereby important and constitutes the basis of many routines in the toolbox.

Sets of objects may be given externally or may be generated by one of the data generation routines of PRTools. Their labels may also be given externally or may be the result of a cluster analysis. By this technique similar objects within a larger set are grouped (clustered). The similarity measure is defined by the cluster technique in combination with the object representation in the feature space.

A fundamental problem is to find a good distance measure that agrees with the dissimilarity of the objects represented by the feature vectors. Throughout PRTools the Euclidean distance is used as default. However, scaling the features and transforming the feature spaces by different types of maps effectively changes the distance measure.

The dimensionality of the feature space may be reduced by the selection of subsets of good features. Several strategies and criterion functions are possible for searching good subsets. Feature selection is important because it decreases the number of features that have to be measured and processed. In addition to the improved computational speed in lower dimensional feature spaces there might also be an increase in the accuracy of the classification algorithms. This is caused by the fact that for fewer features fewer parameters have to be estimated.

Another way to reduce the dimensionality is to map the data on a linear or nonlinear subspace. This is called linear or nonlinear feature extraction. It does not necessarily reduce the number of features to be measured, but the advantage of an increased accuracy might still be gained. Moreover, as lower dimensional representations yield less complex classifiers, better generalisations can be obtained.

Using a learning set (or training set) a classifier can be trained such that it generalizes this set of examples of labeled objects into a classification rule. Such a classifier can be linear or nonlinear and can be based on two different kinds of strategies. The first one minimizes the expected classification error by using estimates of the probability density functions. In the second strategy this error is minimised directly by optimizing the classification function over its performance over the learning set. In this approach it has to be avoided that the classifier becomes entirely adapted to the learning set, including its noise. This decreases its generalisation capability. This overtraining can be circumvented by several types of regularisation (often used in neural network training). Another technique is to simplify the classification function afterwards (e.g. the pruning of decision trees).

If the class probability density functions are known, like in simulations, the optimal classification function directly follows from the Bayes rule. In simulations this rule is often used as a reference.

Constructed classification functions may be evaluated by independent test sets of labeled objects. These objects have to be excluded from the learning set, otherwise the evaluation becomes biased. If they are added to the learning set, however, better classification functions may be expected. A solution to this dilemma is the use of cross validation and rotation methods by which a small fraction of objects is excluded from learning and used for testing. This fraction is rotated over the available set of objects and results are averaged. The extreme case is the leave-one-out method for which the excluded fraction is as large as one object.

The performance of classification functions can be improved by the following methods:
1. A reject option in which the objects close to the decision boundary are not classified. They are rejected and might be classified by hand or by another classifier.
2. The selection or averaging of classifiers.
3. A multi-stage classifier for combining classification results of several other classifiers.

For all these methods it is profitable or necessary that a classifier yields some distance measure or a posteriori probability in addition to the hard, unambiguous assignment of labels.

3. References

Yoh-Han Pao, Adaptive pattern recognition and neural networks, Addison-Wesley, Reading, Massachusetts, 1989.
K. Fukunaga, Introduction to statistical pattern recognition, second edition, Academic Press, New York, 1990.
S.M. Weiss and C.A. Kulikowski, Computer systems that learn, Morgan Kaufmann Publishers, California, 1991.
C.M. Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford, 1995.
B.D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, 1996.
J. Schurmann, Pattern classification, a unified view of statistical and neural approaches, John Wiley & Sons, New York, 1996.
E. Gose, R. Johnsonbaugh and S. Jost, Pattern recognition and image analysis, Prentice-Hall, Englewood Cliffs, 1996.
S. Haykin, Neural Networks, a Comprehensive Foundation, second edition, Prentice-Hall, Englewood Cliffs, 1999.
S. Theodoridis and K. Koutroumbas, Pattern Recognition, Academic Press, New York, 1999.

4. A review of the toolbox

PRTools makes use of the possibility offered by Matlab 5 to define Classes and Objects. These programmatic concepts should not be confused with the classes and objects as defined in Pattern Recognition. Two Classes have been defined: dataset and mapping. A large number of operators (like *) and Matlab commands have been overloaded and have thereby a special meaning when applied to a dataset and/or a mapping.

The central data structure of PRTools is the dataset. It primarily consists of a set of objects represented by a matrix of feature vectors. Attached to this matrix is a set of labels, one for each object, and a set of feature names. Moreover, a set of apriori probabilities, one for each class, is stored. In most help files of PRTools, a dataset is denoted by A. In almost any routine this is one of the inputs. Almost all routines can handle multiclass object sets.

[Scheme relating the groups of routines: Feature Measurement, Unlabeled Data, Data Generation, Cluster Analysis, Labeled Data, Visualisation / 2D Projection / Nonlinear Mapping, Feature Selection, Classifier Training, Combining Classifiers, Multistage Classifiers, Classification, Error Estimation, Plot Results.]

In the above scheme the relations between the various sets of routines are given. At the moment there are no commands for measuring features, so they have to be supplied externally. There are various ways to regroup the data, scale and transform the feature space, find good features, build classifiers, estimate the classification performances and compute (new) object labels.

Data structures of the Class mapping store trained classifiers, feature extracting results, data scaling definitions, nonlinear projections, etcetera. They are usually denoted by W. The result of the operation A*W is again a dataset. It is the classified, rescaled or mapped result of applying the mapping definition stored in W to A.

A typical example is given below:

A = gendath(100);              % Generate Highleyman's classes, 100 objects/class
                               % Training set C (20 objects / class)
                               % Test set D (80 objects / class)
[C,D] = gendat(A,20);
% Compute classifiers
W1 = ldc(C);                   % linear
W2 = qdc(C);                   % quadratic
W3 = parzenc(C);               % Parzen
W4 = bpxnc(C,3);               % Neural net with 3 hidden units
% Compute and display errors
disp([testd(D*W1),testd(D*W2),testd(D*W3),testd(D*W4)]);
% Plot data and classifiers
scatterd(A);                   % scatter plot
plotd(W1,'-');                 % plot the 4 discriminant functions
plotd(W2,'-.');
plotd(W3,'--');
plotd(W4,':');

This command file first generates by gendath two sets of labeled objects, both containing 100 two-dimensional object vectors, and stores them and their labels and apriori probabilities in the dataset A. The distribution follows the so-called Highleyman classes. The next call to gendat takes this dataset and splits it at random into a dataset C, further on used for training, and a dataset D, used for testing. This training set C contains 20 objects from both classes. The remaining 2 x 80 objects are collected in D.

In the next lines four classification functions (discriminants) are computed, called W1, W2, W3 and W4. The linear and quadratic classifiers are both based on the assumption of normally distributed classes. The Parzen classifier estimates the class densities by the Parzen density estimation and has a built-in optimization for the smoothing parameter. The fourth classifier uses a feedforward neural network with three hidden units. It is trained by the backpropagation rule using a varying stepsize.

Hereafter the results are displayed and plotted. The test dataset D is used in a routine testd (test discriminant) on each of the four discriminants. The estimated probabilities of error are displayed in the Matlab command window and look like:

    0.1750    0.1062    0.1000    0.1562

Finally the classes are plotted in a scatter diagram together with the discriminants, see below. The plot routine plotd draws a vectorized straight line for the linear classifiers and computes the discriminant function values in all points of the plot grid (default 30 x 30) for the nonlinear discriminants. After that, the zero discriminant values are computed by interpolation and plotted.

[Figure: scatter plot of the two Highleyman classes together with the four discriminant functions.]

We will now shortly discuss the PRTools commands group by group. The two basic structures of the toolbox can be defined by the constructors dataset and mapping. These commands can also be used to retrieve or redefine the data. It is thereby not necessary to use the general Matlab converter struct() for decomposing the structures. By getlab and getfeat the labels assigned to the objects and features can be found. The generation and handling of data is further facilitated by genlab and renumlab.

Datasets and Mappings
  dataset    Define dataset from datamatrix and labels and retrieve
  getlab     Retrieve object labels from dataset
  getfeat    Retrieve feature labels from dataset
  genlab     Generate dataset labels
  renumlab   Convert labels to numbers
  mapping    Define mapping and classifier from data and retrieve
  getlab     Retrieve labels assigned by a classifier
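As a first illustration (not one of the distributed examples; the random data and the feature names 'f1' to 'f3' are arbitrary), a labeled dataset may be constructed and inspected as follows:

A = dataset(rand(100,3),genlab([50 50],[3 5]'),['f1';'f2';'f3']);
lab  = getlab(A);              % the 100 object labels (values 3 and 5)
feat = getfeat(A);             % the three feature labels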

There is a large set of routines for the generation of arbitrary normally distributed classes (gauss), and for various specific problems (gendatc, gendatd, gendath, gendatm and gendats). There are two commands for enriching classes by noise injection (gendatk and gendatp). These are used for the general testset generator gendatt. A given dataset can be split into a training set and a testset by gendat. The routine gendat splits the dataset at random into two sets. All routines operate in multi-class problems. classd and testd are the general classification and testing routines. They can handle any classifier from any routine, including the ones to follow.

Data Generation
  gauss      Generation of multivariate Gaussian distributed data
  gendat     Generation of subsets of a given data set
  gendatb    Generation of banana shaped classes
  gendatc    Generation of circular classes
  gendatd    Generation of two difficult classes
  gendath    Generation of Highleyman classes
  gendatk    Nearest neighbour data generation
  gendatl    Generation of Lithuanian classes
  gendatm    Generation of many Gaussian distributed classes
  gendatp    Parzen density data generation
  gendats    Generation of two Gaussian distributed classes
  gendatt    Generation of testset from given dataset
  prdata     Read data from file and convert into a dataset

Linear and Higher Degree Polynomial Classifiers
  klclc      Linear classifier by KL expansion of common cov matrix
  kljlc      Linear classifier by KL expansion on the joint data
  loglc      Logistic linear classifier
  fisherc    Fisher's discriminant (minimum least square linear classifier)
  ldc        Normal densities based linear classifier (Bayes rule)
  nmc        Nearest mean classifier
  nmsc       Scaled nearest mean classifier
  perlc      Linear classifier by linear perceptron
  persc      Linear classifier by nonlinear perceptron
  pfsvc      Pseudo-Fisher support vector classifier
  qdc        Normal densities based quadratic (multi-class) classifier
  udc        Uncorrelated normal densities based quadratic classifier
  polyc      Add polynomial features and run arbitrary classifier
  classc     Converts a mapping into a classifier
  classd     General classification routine for trained classifiers
  testd      General error estimation routine for trained classifiers

knnc and parzenc are similar in the sense that the classifiers they build still include all training objects and that their parameter (the number of neighbours or the smoothing parameter) can be user supplied or can be optimized over the training set using a leave-one-out error estimation. For the Parzen classifier the smoothing parameter can also be estimated by parzenml using an optimization of the density estimation. The special purpose classification routines mapk and mapp are called automatically when needed. In general, there is no need for the user to call them directly. The special purpose testing routines testk and testp are useful for obtaining leave-one-out error estimations.

Decision trees can be constructed by treec, using various criterion functions, stopping rules or pruning techniques. The resulting classifier can be used in classd, testd and plotd. They make use of classt.

PRTools offers three neural network classifiers (bpxnc, lmnc and rbnc) based on an old version of Matlab's Neural Network Toolbox. Adaptations of Mathworks' routines are made in order to prevent unnecessary display of intermediate results. They are stored in the private subdirectory. The resulting classifiers are ready to use by classd, testd and plotd. The automatic neural network classifier neurc builds a network without any parameter setting by the user. Random neural network classifiers can be generated by rnnc. The first one is totally random, the second optimizes the output layer by a linear classifier.

The Support Vector Classifier (svc) can be called for various kernels as defined by proxm (see below). The classifier is optimized by a quadratic programming procedure. For some architectures a C-version is built in for speeding up processing.

Nonlinear Classification
  knnc       k-nearest neighbour classifier (find k, build classifier)
  mapk       k-nearest neighbour mapping routine
  testk      Error estimation for k-nearest neighbour rule
  parzenc    Parzen density based classifier
  parzenml   Optimization of smoothing parameter in Parzen density estimation
  mapp       Parzen mapping routine
  testp      Error estimation for Parzen classifier
  edicon     Edit and condense training sets
  treec      Construct binary decision tree classifier
  classt     Classification with binary decision tree
  bpxnc      Train feed forward neural network classifier by backpropagation
  lmnc       Train feed forward neural network by Levenberg-Marquardt rule
  rbnc       Train radial basis neural network classifier
  neurc      Automatic neural network classifier
  rnnc       Random neural network classifier
  svc        Support vector classifier
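As a minimal sketch of the use of this group (not one of the distributed examples; the data generator and the split size are chosen arbitrarily, following the conventions of the examples in section 6):

A = gendath(100);              % Highleyman classes, 100 objects per class
[C,D] = gendat(A,20);          % 20 objects per class for training, rest for testing
W1 = knnc(C);                  % k-nearest neighbour classifier, k optimized by leave-one-out over C
W2 = parzenc(C);               % Parzen classifier, smoothing parameter optimized over C
disp([testd(D*W1) testd(D*W2)]);   % test errors on the independent set D
e1 = testk(C,1);               % leave-one-out error of the 1-NN rule on C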

Classifiers for normal distributed classes can be trained by ldc, qdc and udc, while nbayesc assumes known densities. The special purpose test routine testn can be used if the parameters of the normal distribution (means and covariances) are known or estimated by meancov.

The feature selection routines featselb, featself, featseli, featselo and featselp generate subsets of features, calling feateval for evaluating the feature set. featselm offers a general entry for feature selection, calling one of the other methods. All routines produce a mapping W (e.g. W = featself(A,k)). So the reduction of a dataset A to B is done by B = A*W.
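A minimal sketch of this usage (modelled on the prex4 example of section 6.4; the choice of three features and of the 1-NN criterion 'NN' is arbitrary):

A = gendatd(100,100,10);       % two 10-dimensional classes
[B,C] = gendat(A,20);          % split into training set B and test set C
W = featself(B,'NN',3);        % forward selection of 3 features, 1-NN criterion
V = ldc(B*W);                  % train a linear classifier on the selected features
testd(C*W*V)                   % error of the combined mapping on the test set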

Normal Density Based Classification
  distmaha   Mahalanobis distance
  mapn       Multiclass classification on normal densities
  meancov    Estimation of means and covariance matrices from multiclass data
  nbayesc    Bayes classifier for given normal densities
  ldc        Normal densities based linear classifier (Bayes rule)
  qdc        Normal densities based quadratic classifier (Bayes rule)
  udc        Normal densities based quadratic classifier (independent features)
  testn      Error estimate of discriminant on normal distributions

Feature Selection
  feateval   Evaluation of a feature set
  featrank   Ranking of individual feature performances
  featselb   Backward feature selection
  featself   Forward feature selection
  featseli   Individual feature selection
  featselo   Branch and bound feature selection
  featselp   Pudil's floating forward feature selection
  featselm   Feature selection map, general routine for feature selection

Classifiers and Tests (general)
  classc     Convert mapping to classifier
  classd     General classification routine for trained classifiers
  cleval     Classifier evaluation (learning curve)
  clevalb    Classifier evaluation (learning curve), bootstrap version
  confmat    Computation of confusion matrix
  crossval   Error estimation by crossvalidation
  normc      Normalisation of classifiers
  reject     Compute error-reject curve
  roc        Compute receiver-operator curve
  testd      General error estimation routine for trained classifiers
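For instance, a learning curve may be computed with cleval, as in the prex2 example of section 6.2 (the training sizes and the number of repetitions below are arbitrary):

A = gendath(100,100);          % Highleyman data
sizes = [5 10 20 30];          % desired learning sizes
e = cleval(ldc,A,sizes,5);     % 5 repetitions; tested on the complement of the training set
plot(sizes,e(1,:),'-');        % plot the averaged test error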

A classifier maps, after training, objects from the feature space into its output space. The dimensionality of this space equals the number of classes (an exception is possible for two-class classifiers, which may have a one-dimensional output space). This output space is mapped on posterior probabilities by classc. Normalization of these probabilities on a given dataset can be controlled by normc. This is standard built-in for all training algorithms. Classification (determining the class with maximum output) is done by classd, error estimates for test data are made by testd and confmat. More advanced techniques, like rotating datasets over test sets and training sets, are offered by crossval, cleval and clevalb.

Classifiers are a special type of mapping, as their output spaces are related to class membership. In general a mapping converts data from one space to another. This may be done by a fixed procedure, not depending on a dataset, but controlled by at most some parameters. Most of these mappings that don't need training are collected by cmapm (e.g. shifting, rotation, deletion of particular features); another example is the sigmoidal mapping sigm. Some of the mappings that need training don't depend on the object labels, e.g. the principal component analysis (PCA) by klm and klms, object normalization by normm and scaling by scalem, subspace mapping (maps defined by normalized objects and that include the origin) by subsm and nonlinear PCA or kernel PCA by support vector mapping, svm. The other routines depend on object labels as they define the mapping such that the class separability is maximized in one way or another. The Fisher criterion is optimized by fisherm, the scatter by klm (if called by labelled data), density overlap for normal distributions by mlm and general class separability by lmnm.

Mappings
  cmapm      Compute some special maps
  featselm   Feature selection map, general routine for feature selection
  fisherm    Fisher mapping
  klm        Decorrelation and Karhunen-Loeve mapping (PCA)
  klms       Scaled version of klm, useful for prewhitening
  lmnm       Levenberg-Marquardt neural net diabolo mapping
  nlklm      Nonlinear Karhunen-Loeve mapping (NL-PCA)
  normm      Object normalization map
  proxm      Proximity mapping and kernel construction
  reducm     Reduce to minimal space mapping
  scalem     Compute scaling data
  sigm       Sigmoid mapping
  subsm      Subspace mapping
  svm        Support vector mapping, useful for kernel PCA

Classifiers can be combined by horizontal and vertical concatenation, see section 5.2, e.g. W = [W1, W2, W3]. Such a set of classifiers can be combined by several rules, like majority voting (majorc), combining the posterior probabilities in several ways (maxc, minc, meanc, medianc and prodc), or by training an output classifier (traincc). The way classifiers are combined can be inspected by parsc.
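A minimal sketch of such a combination (modelled on the prex4 example of section 6.4; the data generator and the base classifiers are arbitrary):

A = gendatd(100,100,10);
[B,C] = gendat(A,20);
W1 = ldc(B); W2 = qdc(B); W3 = knnc(B,1);   % three base classifiers
Wall = [W1, W2, W3];                        % horizontal (stacked) concatenation
testd(C*prodc(Wall))                        % product rule
testd(C*majorc(Wall))                       % majority voting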

Images can be stored, either as features (im2feat), or as objects (im2obj) in a dataset. The first possibility is useful for segmenting images using a vector of values for each pixel (e.g. in case of multi-color images, or as a result of a filterbank). The second possibility enables the classification of entire images using their pixels as features. Such datasets can be displayed by the overloaded commands image and imagesc. The relation with image processing is established by dataim, enabling arbitrary image operations. Simple filtering can be sped up by the use of datfilt and datgauss.
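A condensed sketch of the first possibility, taken from the prex5 example of section 6.5 (the image name and the parameter values are those of that example):

girl = imread('girl.tif','tiff');       % RGB image
g = im2feat(girl);                      % dataset: one object per pixel, 3 features
t = gendat(g,250);                      % random subset of 250 pixels
lab = modeseek(t*proxm(t),25);          % cluster the subset by modeseeking
w = dataset(t,lab)*qdc([],1e-6,1e-6);   % train a classifier on the clustered subset
seg = g*w*classd;                       % class label for every pixel of the image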

Combining classification rules
  baggingc   Bootstrapping and aggregation of classifiers
  majorc     Majority voting combining classifier
  maxc       Maximum combining classifier
  minc       Minimum combining classifier
  meanc      Averaging combining classifier
  medianc    Median combining classifier
  prodc      Product combining classifier
  traincc    Train combining classifier
  parsc      Parse classifier or map

  dataim     Image operation on dataset images
  data2im    Convert dataset to image
  datfilt    Filter dataset image
  datgauss   Filter dataset image by Gaussian filter
  im2obj     Convert image to object in dataset
  im2feat    Convert image to feature in dataset
  image      Display images stored in dataset
  imagesc    Display images stored in dataset, automatic scaling

Clustering and Distances
  distm      Distance matrix between two data sets
  proxm      Proximity mapping and kernel construction
  hclust     Hierarchical clustering
  kcentres   k-centers clustering
  kmeans     k-means clustering
  modeseek   Clustering by modeseeking

Plotting
  plotd      Plot discriminant function in scatterplot
  plotf      Plot feature distribution
  plotm      Plot mapping in scatterplot
  plot2      Plot 2d function
  plotdg     Plot dendrogram (see hclust)
  scatterd   Scatterplot
  scatter3d  3D scatterplot

Examples
  prex1      Classifiers and scatter plot
  prex2      Plot learning curves of classifiers
  prex3      Multi-class classifier plot
  prex4      Classifier combining
  prex5      Use of images and eigenfaces

5. Some Details

The command help files and the examples given below should give sufficient information to use the toolbox, with a few exceptions. These are discussed in the following sections. They deal with the ways classifiers and mappings are represented. As these are the constituting elements of a pattern recognition analysis, it is important that the user understands these issues.

5.1 Datasets

A dataset consists of a set of m objects, each given by k features. In PRTools such a dataset is represented by an m by k matrix: m rows, each containing an object vector of k elements. Usually a dataset is labeled. An example of a definition is:

A = dataset([1 2 3; 2 3 4; 3 4 5; 4 5 6],[3 3 5 5]')
4 by 3 dataset with 2 classes

The 4 by 3 data matrix (4 objects given by 3 features) is accompanied by a labellist of 4 labels, connecting each of the objects to one of the two classes, 3 and 5. Class labels can be numbers or strings and should always be given as rows in the labellist. If the labellist is not given all objects are given the default label 255. In addition it is possible to assign labels to the columns (features) of a dataset:

A = dataset(rand(100,3),genlab([50 50],[3 5]'),['r1';'r2';'r3'])
100 by 3 dataset with 2 classes

The routine genlab generates 50 labels with value 3, followed by 50 labels with value 5. In the last term the labels ('r1', 'r2', 'r3') for the three features are set. The complete definition of a dataset is:

A = dataset(datamatrix,labels,featlist,prob,lablist)

giving the possibility to set apriori probabilities for each of the classes as defined by the labels given in lablist. The values in prob should sum to one. If prob is empty or if it is not supplied the apriori probabilities are computed from the dataset label frequencies. If prob = 0 then equal class probabilities are assumed.

Various items stored in a dataset can be retrieved by

[nlab,lablist,m,k,c,prob,featlist] = dataset(A)

in which nlab are numeric labels for the objects (1, 2, 3, ...) referring to the true labels stored in the rows of lablist. The size of the dataset is m by k, c is the number of classes (equal to max(nlab)). Datasets can be combined by [A;B] if A and B have equal numbers of features and by [A B] if they have equal numbers of objects. Creating subsets of datasets can be done by A(I,J) in which I is a set of indices defining the desired objects and J is a set of indices defining the desired features. In all these examples the apriori probabilities set for A remain unchanged. The original datamatrix can be retrieved by double(A) or by +A. The labels in the objects of A can be retrieved by labels = getlab(A), which is equivalent to labels = lablist(nlab,:). The feature labels can be retrieved by featlist = getfeat(A). Conversion by struct(A) makes all fields in a dataset A accessible to the user.
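A short illustration of these operations (not one of the distributed examples; the random data serve only to show the calls):

A = dataset(rand(10,3),genlab([5 5],[1 2]'));   % 10 objects, 3 features, 2 classes
B = dataset(rand(10,3),genlab([5 5],[1 2]'));
C = [A; B];            % combine: 20 objects, same 3 features
D = A(1:5,[1 3]);      % subset: first 5 objects, features 1 and 3
X = +A;                % plain data matrix, identical to double(A)
lab = getlab(A);       % object labels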

5.2 Classifiers and mappings

There are many commands to train and use mappings between spaces of different (or equal) dimensionalities. For example:

if    A is an m by k dataset (m objects in a k-dimensional space)
and   W is a k by n mapping (map from k to n dimensions)
then  A*W is an m by n dataset (m objects in an n-dimensional space)

Mappings can be linear (e.g. a rotation) as well as nonlinear (e.g. a neural network). Typically they can be used for classifiers. In that case a k by n mapping maps a k-feature data vector on the output space of an n-class classifier (exception: 2-class classifiers like discriminant functions may be implemented by a mapping to a 1-dimensional space like the distance to the discriminant, n = 1).

Mappings are of the datatype 'mapping' (class(W) is 'mapping') and have a size of [k,n] if they map from k to n dimensions. Mappings can be instructed to assign labels to the output columns, e.g. the class names. These labels can be retrieved by

labels = getlab(W);     before the mapping, or
labels = getlab(A*W);   after the dataset A is mapped by W.

Mappings can be learned from examples, (labeled) objects stored in a dataset A, for instance by training a classifier:

W3 = ldc(A);            the normal densities based linear classifier
W2 = knnc(A,3);         the 3-nearest neighbor rule
W1 = svc(A,'p',2);      the support vector classifier based on a 2-nd order polynomial kernel

Untrained or empty mappings are supported. They may be very useful. In this case the dataset is replaced by an empty set or entirely skipped:

V1 = ldc; V2 = knnc([],a); V3 = svc([],'p',2);

Such mappings can be trained later by

W1 = A*V1; W2 = A*V2; W3 = A*V3;

The mapping of a testset B by B*W1 is now equivalent to B*(A*V1) or even, irregularly but very handy, to A*V1*B (or even A*ldc*B). Note that expressions are evaluated from left to right, so B*A*V1 may result in an error as the multiplication of the two datasets (B*A) is executed first.

Users can add new mappings or classifiers by a single routine that should support the following types of calls:

W = newmapm([], par1, par2, ...);   Defines the untrained, empty mapping.
W = newmapm(A, par1, par2, ...);    Defines the map based on the training dataset A.
B = newmapm(A, W);                  Defines the mapping of dataset A on W, resulting in a dataset B.

For an example list the routine subsc.m.

Some trainable mappings do not depend on class labels and can be interpreted as finding a space that approximates as good as possible the original dataset given some conditions and measures. Examples are the Karhunen-Loeve Mapping (klm), which may be used for PCA, and the Support Vector Mapping (svm), by which nonlinear, kernel PCA mappings can be computed.

In addition to trainable mappings there are fixed mappings, whose operation is not computed from a training set but defined by just a few parameters. Most of them can be set by cmapm.

The result D of a mapping of a testset on a trained classifier, D = B*W1, is again a dataset, storing for each object in B the output values of the classifier. These values, usually between -inf and inf, can be interpreted as similarities: the larger, the more similar with the corresponding class. These numbers can be mapped on the [0,1] interval by the fixed mapping sigm: D = B*W1*sigm. The values in a single row (object) don't necessarily sum to one. This can be achieved by the fixed mapping normm: D = B*W1*sigm*normm, which is equivalent to B*W1*classc. Effectively a mapping W is converted into a classifier by W*classc, which maps objects on the normalized [0,1] output space. Usually a mapping that can be converted into a classifier in this way is scaled by a multiplicative constant such that these numbers optimally represent (in the maximum likelihood sense) the posterior probabilities for the training data. The resulting output dataset D has column labels for the classes and row labels for the objects. The class labels of the maximum values for each object can be retrieved by

labels = D*classd;  or  labels = classd(D);

A global classification error follows from

e = D*testd;  or  e = testd(D);

Mappings can be combined in the following ways:

sequential:  W = W1 * W2 * W3     (equal inner dimensions)
stacked:     W = [W1, W2, W3]     (equal numbers of rows (input dimensions))
parallel:    W = [W1; W2; W3]     (unrestricted)

The output size of the parallel mapping is irregularly equal to (k1+k2+k3) by (n1+n2+n3) as the output combining of columns is undefined. In a stacked or parallel mapping, columns having the same label can be combined by various combiners like maxc, meanc and prodc. If the classifiers W1, W2 and W3 are trained for the same n classes, their output labels are the same and they are combined by W = prodc([W1;W2;W3]) into a (k1+k2+k3) by n classifier.

W for itself, or display(W), lists the size and type of a classifier as well as the routine or section in mapping/mtimes used for computing a mapping A*W. The construction of a combined mapping may be inspected by parsc(W).

A mapping may be given an output selection by W = W(:,J), in which J is a set of indices pointing to the desired classes. B = A*W(:,J); is equivalent to B = A*W; B = B(:,J); Input selection is not possible for a mapping.
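A small sketch of a sequential combination and of output selection (modelled on the prex4 example of section 6.4; the 0.95 variance fraction is taken from that example):

A = gendatd(100,100,10);
[B,C] = gendat(A,20);
Wkl = klm(B,0.95);        % Karhunen-Loeve mapping of the input space
V   = ldc(B*Wkl);         % classifier trained in the mapped space
W   = Wkl*V;              % sequential combination, operates in the original space
testd(C*W)                % test error of the combined mapping
D = C*W(:,1);             % output selection: dataset with the outputs for the first class only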

6. Examples

The following examples are available under PRTools. We present here the source codes and the output they generate.

6.1 Classifiers and scatter plot

A 2-d Highleyman dataset A is generated, 100 objects for each class. Out of each class 20 objects are used for training (C) and 80 for testing (D). Four classifiers are computed: a linear one and a quadratic one, both assuming normal densities (which is correct in this case), a Parzen classifier and a neural network with 3 hidden units. Note that the data generation as well as the neural network initialisation use random generators. As a result they only reproduce if they use the original seed. After computing and displaying classification results for the test set a scatterplot is made in which all classifiers are drawn.

%PREX1 PRTools example of classifiers and scatter plot
help prex1
pause(1)
A = gendath(100,100);          % Generate Highleyman's classes
                               % Training set C (20 objects / class)
                               % Test set D (80 objects / class)
[C,D] = gendat(A,20);
% Compute classifiers
w1 = ldc(C);                   % linear
w2 = qdc(C);                   % quadratic
w3 = parzenc(C);               % Parzen
w4 = lmnc(C,3);                % Neural Net
% Compute and display errors
disp([testd(D*w1),testd(D*w2),testd(D*w3),testd(D*w4)]);
% Plot data and classifiers
figure(1); hold off;
scatterd(A); drawnow;
plotd(w1,'-'); drawnow;
plotd(w2,'-.'); drawnow;
plotd(w3,'--'); drawnow;
plotd(w4,':'); drawnow;
echo off

    0.1875    0.0500    0.1437    0.0938

[Figure: scatter plot of the two classes together with the four classifiers.]

6.2 Learning curves

In this example the learning curves for four classifiers are computed using the Highleyman dataset. The errors are computed using the cleval routine.

%PREX2 PRTools example, plot learning curves of classifiers
help prex2
pause(1)
% set desired learning sizes
learnsize = [3 5 10 15 20 30];
% Generate Highleyman's classes
A = gendath(100,100);
% average error over 10 repetitions
% testset is complement of training set
e1 = cleval(ldc,A,learnsize,10);
figure(1); hold off;
plot(learnsize,e1(1,:),'-');
axis([0 30 0 0.3]); hold on; drawnow;
e2 = cleval(qdc,A,learnsize,10);
plot(learnsize,e2(1,:),'-.'); drawnow;
e3 = cleval(knnc([],1),A,learnsize,10);
plot(learnsize,e3(1,:),'--'); drawnow;
e4 = cleval(treec,A,learnsize,10);
plot(learnsize,e4(1,:),':'); drawnow;
legend('Linear','Quadratic','1-NN','DecTree');
xlabel('Sample Size')
ylabel('Error');

[Figure: learning curves (error versus sample size) for the Linear, Quadratic, 1-NN and DecTree classifiers.]

6.3 Multi-class classifier plot

This file shows how to construct a colored scatter diagram defining the areas assigned to the various classes. First the global variable GRIDSIZE is set to 100 in order to avoid empty areas. Then the Highleyman dataset is used to construct a 4-class problem. This is done by using the data only and then generating the labels separately. Note that the scatter plot itself is called twice in order to have the scatter on top of the color plot generated by plotd.

%PREX3 PRTools example of multi-class classifier plot
help prex3
echo on
global GRIDSIZE
gs = GRIDSIZE;
GRIDSIZE = 100;
% generate 2 x 2 normal distributed classes
a = +gendath(20);              % data only
b = +gendath(20);              % data only
A = [a; b + 5];                % shift 2 over [5,5]
lab = genlab([20 20 20 20],[1 2 3 4]');   % generate 4-class labels
A = dataset(A,lab);            % construct dataset
hold off;                      % clear figure
scatterd(A,'.'); drawnow;      % make scatter plot for right size
w = qdc(A);                    % compute normal densities based classifier
plotd(w,'col'); drawnow;       % plot classification regions
hold on;
scatterd(A);                   % redraw scatter plot
echo off
GRIDSIZE = gs;

[Figure: colored scatter diagram with the regions assigned to the four classes.]

6.4 Classifier combining

This example is just an illustration on the use of mapping and classifier combining. The method itself does not make much sense. There are sequential maps (like w1 = wkl*vkl) and a stacked map (wall = [w1,w2,w3,w4,w5]), using various combining rules. Note how in the feature selection routine featself a classifier (ldc) is used for the criterion.

%PREX4 PRTools example of classifier combining
help prex4
echo on
A = gendatd(100,100,10);
[B,C] = gendat(A,20);
wkl = klm(B,0.95);             % find KL mapping input space
bkl = B*wkl;                   % map training set
vkl = ldc(bkl);                % find classifier in mapped space
w1 = wkl*vkl;                  % combine map and classifier
                               % (operates in original space)
testd(C*w1)                    % test
wfn = featself(B,'NN',3);      % find feature selection mapping
bfn = B*wfn;                   % map training set
vfn = ldc(bfn);                % find classifier in mapped space
w2 = wfn*vfn;                  % combine
testd(C*w2)                    % test
wfm = featself(B,ldc,3);       % find second feature set
bfm = B*wfm;                   % map training set
vfm = ldc(bfm);                % find classifier in mapped space
w3 = wfm*vfm;                  % combine
testd(C*w3)                    % test
w4 = ldc(B);                   % find classifier in input space
testd(C*w4)                    % test
w5 = knnc(B,1);                % another classifier in input space
testd(C*w5)                    % test
wall = [w1,w2,w3,w4,w5];       % stacked classifier set
testd(C*prodc(wall))           % test product rule
testd(C*meanc(wall))           % test mean rule
testd(C*medianc(wall))         % test median rule
testd(C*maxc(wall))            % test maximum rule
testd(C*minc(wall))            % test minimum rule
testd(C*majorc(wall))          % test majority voting
echo off

6.5 Image segmentation by vector quantization

In this example an image is segmented using modeseeking clustering techniques based on a randomly selected subset of pixels. The resulting classifier is applied to all pixels, and also to the pixels of a second image (where it may apply less well). Finally a common map is computed and applied to both images.

%PREX5 PRTools example of image vector quantization
help prex5
echo on
% standard Matlab TIFF read
girl = imread('girl.tif','tiff');
% display
figure
subplot(2,3,1); subimage(girl); axis off;
title('Girl 1'); drawnow
% construct 3-feature dataset from entire image
%[X,Y] = meshgrid(1:256,1:256);
%X = X/10000;
%Y = Y/10000;
%girl(:,:,4) = X;
%girl(:,:,5) = Y;
g1 = im2feat(girl);
imheight = size(girl,1);
% generate testset
t = gendat(g1,250);
% run modeseek, find labels, and construct labeled dataset
labt = modeseek(t*proxm(t),25);
t = dataset(t,labt);
% train classifier
w = t*qdc([],1e-6,1e-6);
% classify all pixels
pack
lab = g1*w*classd;
% show result
% substitute class means for colors
cmap = +meancov(t(:,1:3));
subplot(2,3,2); subimage(reshape(lab,imheight,length(lab)/imheight),cmap);
axis off;
title('Girl 1 - Map 1')
drawnow
% Now, read second image
girl2 = imread('girl2.tif','tiff');
% display
subplot(2,3,4); subimage(girl2);
%girl2(:,:,4) = X;
%girl2(:,:,5) = Y;
axis off;
title('Girl 2'); drawnow
% construct 3-feature dataset from entire image
g2 = im2feat(girl2);
clear girl girl2
pack
lab2 = g2*w*classd;
% show result
% substitute class means for colors
cmap = +meancov(t(:,1:3));
subplot(2,3,5);
subimage(reshape(lab2,imheight,length(lab2)/imheight),cmap);
axis off;
title('Girl 2 - Map 1')
drawnow
% Compute combined map
g = [g1; g2];
t = gendat(g,250);
labt = modeseek(t*proxm(t),25);
t = dataset(t,labt);
w = t*qdc([],1e-6,1e-6);
cmap = +meancov(t(:,1:3));
clear g
pack
lab = g1*w*classd;
subplot(2,3,3);
subimage(reshape(lab,imheight,length(lab)/imheight),cmap);
axis off;
title('Girl 1 - Map 1,2')
drawnow
pack
lab = g2*w*classd;
subplot(2,3,6); subimage(reshape(lab,imheight,length(lab)/imheight),cmap);
axis off;
title('Girl 2 - Map 1,2')
drawnow
set(gcf,'DefaultAxesVisible','remove')

6.6 Use of images and eigenfaces

This example illustrates the use of images by the face image dataset. The eigenfaces based on the first image of each subject are displayed. Next all images are mapped on this eigenspace, the scatterplot for the first two eigenfaces is displayed and the leave-one-out error is computed as a function of the number of eigenfaces used.

%PREX6 Use of images and eigenfaces
help prex6
echo on
if exist('face1.mat') ~= 2
   error('Face database not in search path')
end
a = readface(1:40,1);
w = klm(a);
imagesc(dataset(eye(39)*w,112)); drawnow
b = [];
for j = 1:40
   a = readface(j,1:10);
   b = [b; a*w];
end
figure
scatterd(b)
title('Scatterplot on first two eigenfaces')
fontsize(14)
featsizes = [1 2 3 5 7 10 15 20 30 39];
e = zeros(1,length(featsizes));
for j = 1:length(featsizes)
   k = featsizes(j);
   e(j) = testk(b(:,1:k),1);
end
figure
plot(featsizes,e)
xlabel('Number of eigenfaces')
ylabel('Error')
fontsize(14)

[Figure: scatter plot of all face images on the first two eigenfaces.]
[Figure: leave-one-out 1-NN error as a function of the number of eigenfaces.]
