South China University of Technology: Pattern Recognition (Undergraduate) Review Notes

CH1.
【Pattern Recognition Systems】
* Data acquisition & sensing: measurements of physical variables.
* Pre-processing: removal of noise in the data; isolation of patterns of interest from the background (segmentation).
* Feature extraction: finding a new representation in terms of features.
* Model learning / estimation: learning a mapping between features and pattern groups and categories.
* Classification: using the features and the learned models to assign a pattern to a category.
* Post-processing: evaluation of confidence in decisions; exploitation of context to improve performance; combination of experts.

【Design Cycle】

【Learning strategies】
* Supervised learning: a teacher provides a category label or cost for each pattern in the training set.
* Unsupervised learning: the system forms clusters or natural groupings of the input patterns.
* Reinforcement learning: no desired category is given, but the teacher provides feedback to the system, such as whether a decision is right or wrong.

【Evaluation methods】
* Independent run: a statistical method, also called bootstrap. Repeat the experiment n times independently and take the mean as the result.
* Cross-validation: the dataset D is randomly divided into m disjoint sets Di of equal size n/m, where n is the number of samples in D. The classifier is trained m times, each time with a different set Di held out as the test set (a short code sketch follows below).
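The cross-validation protocol above can be made concrete with a short sketch. This is a minimal illustration, not part of the original notes; X, y, and clf_factory are hypothetical placeholders for a NumPy dataset and a classifier with scikit-learn-style fit/predict methods.

```python
import numpy as np

def m_fold_cross_validation(X, y, clf_factory, m=5, seed=0):
    """Split the n samples into m disjoint folds of size ~n/m, train m times,
    each time holding a different fold out as the test set, and average."""
    n = len(X)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)                 # random division of D
    folds = np.array_split(idx, m)           # m disjoint sets D_i
    accuracies = []
    for i in range(m):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(m) if j != i])
        clf = clf_factory()                  # fresh classifier for each run
        clf.fit(X[train_idx], y[train_idx])
        accuracies.append(np.mean(clf.predict(X[test_idx]) == y[test_idx]))
    return float(np.mean(accuracies))        # mean accuracy over the m runs
```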

CH2.
【Bayes formula】
【Bayes Decision Rule】
【Maximum Likelihood (ML) Rule】
When the priors are equal, P(ω1) = P(ω2), the decision is based entirely on the likelihoods p(x|ωj): decide ω1 if p(x|ω1) > p(x|ω2), and ω2 otherwise.
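The formulas behind these three headings were embedded as images in the original file and are not recoverable from the extracted text; as an assumption, the standard forms they refer to are:

```latex
% Bayes formula: posterior from likelihood, prior, and evidence
P(\omega_i \mid x) = \frac{p(x \mid \omega_i)\, P(\omega_i)}{p(x)},
\qquad p(x) = \sum_{j} p(x \mid \omega_j)\, P(\omega_j)

% Bayes decision rule: pick the class with the larger posterior
\text{decide } \omega_1 \text{ if } P(\omega_1 \mid x) > P(\omega_2 \mid x),
\text{ otherwise decide } \omega_2

% Maximum likelihood rule (equal priors): compare likelihoods only
\text{decide } \omega_1 \text{ if } p(x \mid \omega_1) > p(x \mid \omega_2)
```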

【Error analysis】
Probability of error for multi-class problems: Error = Bayes Error + Added Error.
【Loss function】
Conditional risk: the expected loss of taking action αi. Overall risk: the total expected loss of the decision rule. The zero-one loss function is used when the goal is to minimize the error rate.
【Minimum Risk Decision Rule】
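Assuming the standard risk formulation that the headings above point to (the original equations did not survive extraction):

```latex
% Conditional risk: expected loss of taking action alpha_i given x
R(\alpha_i \mid x) = \sum_{j=1}^{c} \lambda(\alpha_i \mid \omega_j)\, P(\omega_j \mid x)

% Overall risk: expected loss of the decision rule alpha(x)
R = \int R(\alpha(x) \mid x)\, p(x)\, dx

% Minimum-risk decision rule: take the action with the smallest conditional risk
\alpha^{*}(x) = \arg\min_{i} R(\alpha_i \mid x)

% With the zero-one loss, lambda(alpha_i | omega_j) = 1 - delta_ij, this
% reduces to choosing the class with the largest posterior, i.e. minimizing
% the error rate.
```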

【Normal Distribution】
Multivariate normal density in d dimensions.
【ML Parameter Estimation】
【Discriminant function】
【Decision boundary】
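The density, estimators, and discriminant named above, written in their usual form (an assumption; the original formulas were images):

```latex
% Multivariate normal density in d dimensions
p(x) = \frac{1}{(2\pi)^{d/2} \lvert \Sigma \rvert^{1/2}}
       \exp\!\left( -\tfrac{1}{2} (x - \mu)^{T} \Sigma^{-1} (x - \mu) \right)

% ML parameter estimates from samples x_1, ..., x_n
\hat{\mu} = \frac{1}{n} \sum_{k=1}^{n} x_k, \qquad
\hat{\Sigma} = \frac{1}{n} \sum_{k=1}^{n} (x_k - \hat{\mu}) (x_k - \hat{\mu})^{T}

% Discriminant functions; the decision boundary is where g_i(x) = g_j(x)
g_i(x) = \ln p(x \mid \omega_i) + \ln P(\omega_i)
```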

CH3.
【Normalized distance from origin to surface】
【Distance of arbitrary point to surface】
【Perceptron Criterion】
【Pseudo-inverse Method】
Problem: exercise for the pseudo-inverse method (2).
【Least-Mean-Squared (Gradient Descent)】
【Linear classifier for multiple classes】
【Linearly separable problem】
A problem whose data of different classes can be separated exactly by a linear decision surface.
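The CH3 quantities named above, written out in their standard form (assumed, since the original formulas are not in the extracted text). Here g(x) = w^T x + w0 is the linear discriminant and Y is the set of misclassified (sign-normalized) samples:

```latex
% Normalized distance from the origin to the decision surface g(x) = 0
r_0 = \frac{\lvert w_0 \rvert}{\lVert w \rVert}

% Distance of an arbitrary point x to the surface
r = \frac{\lvert g(x) \rvert}{\lVert w \rVert}
  = \frac{\lvert w^{T} x + w_0 \rvert}{\lVert w \rVert}

% Perceptron criterion: sum over the misclassified samples
J_p(a) = \sum_{y \in Y} \bigl( -a^{T} y \bigr)

% Pseudo-inverse (minimum-squared-error) solution of Y a = b
a = (Y^{T} Y)^{-1} Y^{T} b = Y^{\dagger} b

% LMS (Widrow-Hoff) gradient-descent update toward Y a = b
a(k+1) = a(k) + \eta(k) \bigl( b_k - a(k)^{T} y_k \bigr)\, y_k
```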

CH4.
【Perceptron update rule】(reward and punishment schemes)
Exercise for the perceptron.
【Error of Back-Propagation Algorithm】
Update rule for the weights.
【Weights of Back-Propagation Algorithm】
The learning rule for the hidden-to-output units; the learning rule for the input-to-hidden units. Summary.
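Standard forms of the update rules named above (assumed reconstructions, not recovered from the original file); η is the learning rate, f the activation function, t_k the targets, and z_k the network outputs:

```latex
% Perceptron update rule (reward and punishment): for a misclassified sample y
a(k+1) = a(k) + \eta\, y

% Back-propagation: squared-error criterion for one training sample
J(w) = \tfrac{1}{2} \sum_{k} (t_k - z_k)^2

% Hidden-to-output weights
\Delta w_{kj} = \eta\, \delta_k\, y_j, \qquad \delta_k = (t_k - z_k)\, f'(net_k)

% Input-to-hidden weights: output errors propagated back through w_{kj}
\Delta w_{ji} = \eta\, \delta_j\, x_i, \qquad \delta_j = f'(net_j) \sum_{k} w_{kj}\, \delta_k
```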

【Training of Back-Propagation】
Weights can be updated differently by presenting the training samples in different sequences. Two popular training methods:
* Stochastic training: patterns are chosen randomly from the training set, so the network weights are updated in a random order.
* Batch training: all patterns are presented to the network before learning takes place.
【Regularization】
【Problems of training a NN】
* Scaling of the inputs
* Target values
* Number of hidden layers: a 3-layer network is recommended; use more than 3 only for special problems.
* Number of hidden units: roughly n/10.
* Initializing the weights
* Weight decay (a standard penalty form is sketched below)
* Stochastic versus batch training
* Stopped training: stop when the error on a separate validation set reaches a minimum.
Exercise for ANN. Forward pass: g = 0.8385; reverse pass (learning rate = 0.5).
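As a concrete instance of the regularization / weight-decay items in the list above, the usual penalty form is shown here; this is an assumption about what the notes intended, not text from the original:

```latex
% Weight decay: penalize large weights in the training criterion
J_{reg}(w) = J(w) + \frac{\lambda}{2} \lVert w \rVert^{2}

% Gradient descent then shrinks every weight a little at each step
\Delta w = -\eta \left( \frac{\partial J}{\partial w} + \lambda\, w \right)
```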

CH5.
【Structure of RBF】
3 layers:
* Input layer: f(x) = x
* Hidden layer: Gaussian functions
* Output layer: linear weighted sum
【Characteristics of RBF】
Advantages: an RBF network trains faster than an MLP, and its hidden layer is easier to interpret than that of an MLP.
Disadvantage: at testing time, the computation performed by an RBF neuron is slower than that of an MLP neuron.
Exercise for RBF.
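A sketch of the three-layer RBF mapping described above, in its standard form (assumed; the original figure and formulas are not available):

```latex
% Hidden unit j: Gaussian radial basis function centred at mu_j
\varphi_j(x) = \exp\!\left( -\frac{\lVert x - \mu_j \rVert^{2}}{2 \sigma_j^{2}} \right)

% Output unit k: linear weighted sum of the hidden activations
y_k(x) = \sum_{j=1}^{M} w_{kj}\, \varphi_j(x) + w_{k0}
```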

CH6.
【Margin】
* The margin is defined as the width by which the boundary could be increased before hitting a data point.
* The linear discriminant function (classifier) with the maximum margin is the best one.
* The data points closest to the hyperplane are the support vectors.
【Maximum Margin Classification】
* Maximizing the margin is good according to both intuition and theory.
* It implies that only the support vectors are important; the other training examples are ignorable.
* Advantage (compared with LMS and the perceptron): better generalization ability and less over-fitting.
【Kernels】
* We may use kernel functions to implicitly map the data to a new feature space.
* A kernel must be equivalent to an inner product in some feature space.
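The kernel condition stated above, written out; the two example kernels are standard choices added for illustration, not taken from the original:

```latex
% A kernel is an inner product in some (implicitly defined) feature space phi
K(x_i, x_j) = \varphi(x_i)^{T} \varphi(x_j)

% Common examples: polynomial and Gaussian (RBF) kernels
K(x_i, x_j) = (x_i^{T} x_j + 1)^{p}
K(x_i, x_j) = \exp\!\left( -\frac{\lVert x_i - x_j \rVert^{2}}{2 \sigma^{2}} \right)
```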

【Solving of SVM】
* Solving the SVM is a quadratic programming problem.
* Target: maximize the margin subject to every training point being classified correctly (the formulation is sketched below).
【Nonlinear SVM】
The original feature space can always be mapped to some higher-dimensional feature space in which the training set is separable.
【Optimization Problem】
Dual problem (ai is the Lagrange multiplier of the i-th constraint).
Solution: each non-zero ai indicates that the corresponding xi is a support vector.
Classifying function: relies on an inner product between the test point x and the support vectors xi; solving the optimization problem involved computing the inner products xi · xj between all pairs of training points.
【Slack variables】
Target and dual problem of the soft-margin formulation.
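A reconstruction of the SVM formulation that the headings above refer to, in its standard form (the original equations were images and are not recoverable from the text):

```latex
% Hard-margin primal: the margin equals 2 / ||w||, so maximizing it means
\min_{w, b}\ \tfrac{1}{2} \lVert w \rVert^{2}
\quad \text{s.t.} \quad y_i \bigl( w^{T} x_i + b \bigr) \ge 1 \ \ \forall i

% Dual problem (a_i are the Lagrange multipliers)
\max_{a}\ \sum_i a_i - \tfrac{1}{2} \sum_i \sum_j a_i a_j y_i y_j\, x_i^{T} x_j
\quad \text{s.t.} \quad a_i \ge 0, \quad \sum_i a_i y_i = 0

% Classifying function: only support vectors (a_i > 0) contribute
f(x) = \operatorname{sign}\!\Bigl( \sum_i a_i y_i\, x_i^{T} x + b \Bigr)

% Soft margin (slack variables xi_i >= 0): the primal adds C * sum_i xi_i to the
% objective and relaxes the constraints to y_i (w^T x_i + b) >= 1 - xi_i; the
% dual keeps the same form with the box constraint 0 <= a_i <= C.
```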
