Nonparametric Methods in R


Window width: 2h.

1. Conditions on the kernel function

The kernel function K(·) is a continuous function, symmetric around zero, that integrates to unity and satisfies additional boundedness conditions:
(1) K(·) is symmetric around 0 and is continuous;
(2) ∫ K(z) dz = 1 [the remaining formulas of this condition are missing from the source];
(3) either (a) K(z) = 0 if |z| ≥ z0 for some z0, or (b) |z| K(z) → 0 as |z| → ∞;
(4) [formula missing from the source], where [symbol missing] is a constant.

2. Main functional forms
[formulas missing from the source]

3. Confidence interval
[formula and its definitions missing from the source]

4. Bandwidth selection
In practice, [formula missing from the source], where s is the sample standard deviation and iqr is the sample interquartile range.

IV. K nearest-neighbors estimate

V. R code

# [The start of this listing is lost in the source: the data loading and the
#  definitions of x, y, diffs, which.diff and mod do not survive. The two
#  points() calls below are reconstructed from the surviving fragments.]
points(x[diffs > which.diff], y[diffs > which.diff], pch = 16, cex = 1, col = gray(.80))
points(x[diffs <= which.diff], y[diffs <= which.diff], cex = .85)
abline(mod, lwd = 2, col = 1)
text(27.5, 50, expression(paste("Fitted Value of y at ", x[0])))  # note the use of expression() here
arrows(25, 47, 15, 37, code = 2, length = .10)

# 2. Now putting it together for a local regression demonstration

# OLS fit for comparison
ols <- lm(chal.vote ~ perotvote, data = jacob)

# The loess fit
model.loess <- loess(chal.vote ~ perotvote, data = jacob, span = 0.5)
# Defaults: degree = 2, family = "gaussian", tricube weighting
n <- length(chal.vote)
x.loess <- seq(min(perotvote), max(perotvote), length = n)
y.loess <- predict(model.loess, data.frame(perotvote = x.loess))  # predicted values, for comparison

# The lowess fit
# lowess() takes x and y vectors (it has no data argument) and returns a list
# with the sorted x values and the fitted y values, so predict() does not apply.
model.lowess <- lowess(perotvote, chal.vote, f = 0.5)
# Defaults: robust (iter = 3), locally linear, tricube weighting
x.lowess <- model.lowess$x
y.lowess <- model.lowess$y  # fitted values, for comparison

# Figure 2.8
plot(perotvote, chal.vote, pch = ".", ylab = "Challengers Vote Share (%)", xlab = "Vote for Perot (%)", bty = "l")
lines(x.loess, y.loess)
lines(x.lowess, y.lowess, lty = 2)  # lty = 2 so the curve matches the legend entry
abline(ols)
legend(15, 20, c("Loess", "Lowess", "OLS"), lty = c(1, 2, 1), bty = "n", cex = .8)

# 3. Comparing different robustness settings in lowess
m1.lowess <- lowess(perotvote, chal.vote, f = 0.5, iter = 0)  # no second-step robust reweighting
m2.lowess <- lowess(perotvote, chal.vote, f = 0.5)            # default iter = 3: three robust reweighting passes
m0.loess <- loess(chal.vote ~ perotvote, data = jacob, span = 0.5, degree = 1, family = "symmetric", iterations = 1)  # no robust step
m1.loess <- loess(chal.vote ~ perotvote, data = jacob, span = 0.5, degree = 1)  # no second-step robust reweighting
m2.loess <- loess(chal.vote ~ perotvote, data = jacob, span = 0.5, degree = 1, family = "symmetric", iterations = 3)  # three robust reweighting passes

plot(perotvote, chal.vote, pch = ".", ylab = "Challengers Vote Share (%)", xlab = "Vote for Perot (%)")
lines(m1.lowess)
lines(sort(perotvote), m1.loess$fitted[order(perotvote)], lty = 3, col = "green")
lines(sort(perotvote), m0.loess$fitted[order(perotvote)], lty = 9, col = 18)
lines(m2.lowess, lty = 2, col = "red")
lines(sort(perotvote), m2.loess$fitted[order(perotvote)], lty = 4, col = "blue")

Chapter 4: Spline estimation

I. Basic idea

Partition the sample into several intervals according to x and estimate within each interval separately. Unlike kernel estimation, no moving-window computation is needed, which reduces the computational cost.

II. The simplest form

Linear spline with k knots: [formula missing from the source], where [definitions missing from the source].

III. Other spline models

1. Splines of order p
Quadratic spline (basis functions with k knots): [formula missing from the source]
Cubic spline (with k knots, use quadratic basis functions): [formula missing from the source]
p-order spline (with k knots): [formula missing from the source]

2. B-splines
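As a small illustration of the B-spline bases this chapter works with, the sketch below builds cubic B-spline and natural spline bases with the `splines` package and checks how many knots and columns they have. The covariate `x` is simulated and purely hypothetical; it only stands in for a variable such as `perotvote`. The checks reflect the relations df = k + 3 (cubic B-spline, no intercept column) and df = k + 1 (natural cubic spline) that the R comments in this chapter use.

```r
library(splines)

set.seed(1)
x <- sort(runif(200, min = 0, max = 40))  # hypothetical covariate, not the jacob data

# Cubic B-spline basis with 5 columns and no intercept column.
# bs() places the interior knots at quantiles of x.
B <- bs(x, df = 5)
k_bs <- length(attr(B, "knots"))    # number of interior knots
dim_bs <- ncol(B)                   # number of basis columns

# Natural cubic spline basis with 5 columns and no intercept column.
N <- ns(x, df = 5)
k_ns <- length(attr(N, "knots"))
dim_ns <- ncol(N)

# With the intercept column included, the B-spline basis functions
# sum to one at every x (partition of unity).
B_full <- bs(x, df = 6, intercept = TRUE)
row_sums <- rowSums(B_full)

cat("B-spline: knots =", k_bs, "columns =", dim_bs, "\n")
cat("Natural spline: knots =", k_ns, "columns =", dim_ns, "\n")
cat("Range of row sums:", range(row_sums), "\n")
```

Because the knots are chosen from the quantiles of `x`, a different sample gives different knot locations but the same counts.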

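The cubic B-spline basis with k knots: [formula missing from the source], where [definitions missing from the source].

3. Natural splines
The estimators above fit well between the knots but poorly near the boundaries. The idea of natural splines is to add one knot at the minimum and one at the maximum of the explanatory variable, and to fit the sample points in the boundary regions with a linear model.

4. Choosing k and comparing models
Use the AIC criterion.

IV. Smoothing splines

Based on [formula missing from the source]. If the objective is [formula missing from the source], the parameter estimates are obtained from min [formula missing from the source].

V. F test for model comparison

VI. R code

library(foreign)
jacob <- read.dta("jacob.dta")
attach(jacob)

# Part 1: B-splines and natural B-splines
library(splines)
# P61 Perform spline regression
m.bsp <- lm(chal.vote ~ bs(perotvote, df = 5), data = jacob)  # cubic B-spline: df = k + 3 (excluding the constant term)
m.nsp <- lm(chal.vote ~ ns(perotvote, df = 5), data = jacob)  # df = 5 corresponds to 4 knots; cubic natural B-spline: df = k + 1
perot <- seq(min(perotvote), max(perotvote), length = 312)
bsfit <- predict(m.bsp, data.frame(perotvote = perot))
nsfit <- predict(m.nsp, data.frame(perotvote = perot))
AIC(m.bsp)  # compute the AIC value

# Part 2: smoothing spline estimate
plot(perotvote, chal.vote, pch = ".", ylab = "Challengers Vote Share (%)", xlab = "Vote for Perot (%)", bty = "l", main = "df = 2", cex.main = .95)
lines(smooth.spline(perotvote, chal.vote, df = 2))

# Part 3: confidence intervals
library(splines)
m.nsp <- lm(chal.vote ~ ns(perotvote, df = 4), data = jacob)
perot <- seq(min(perotvote), max(perotvote), length = 312)
# Only se.fit = TRUE is requested. Adding interval = "confidence" would turn
# nsfit$fit into a three-column matrix and break the lines() calls below,
# which build the band by hand from the standard errors.
nsfit <- predict(m.nsp, newdata = data.frame(perotvote = perot), se.fit = TRUE)

# Figure 3.8
plot(perotvote, chal.vote, pch = ".", ylab = "Challengers Vote Share (%)", xlab = "Vote for Perot (%)")
lines(perot, nsfit$fit)
lines(perot, nsfit$fit + 1.96*nsfit$se.fit,
# [the listing is truncated here in the source]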
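The listing above breaks off while drawing the upper edge of the band. As a minimal sketch of how a pointwise 95% band around a natural spline fit can be drawn, the example below uses simulated data, since `jacob.dta` is not reproduced here. The data frame `sim`, its columns `x` and `y`, and the data-generating function are all hypothetical stand-ins, and the band is the usual normal approximation fit ± 1.96 × standard error, not necessarily what the original figure used.

```r
library(splines)

set.seed(42)
n <- 312
x <- runif(n, 5, 35)                         # hypothetical stand-in for perotvote
y <- 30 + 8 * sin(x / 6) + rnorm(n, sd = 4)  # hypothetical stand-in for chal.vote
sim <- data.frame(x = x, y = y)

# Natural cubic spline regression with 4 degrees of freedom
fit <- lm(y ~ ns(x, df = 4), data = sim)

# Predict on an evenly spaced grid and keep the standard errors
grid <- data.frame(x = seq(min(x), max(x), length.out = 100))
pred <- predict(fit, newdata = grid, se.fit = TRUE)

# Pointwise 95% band under a normal approximation
upper <- pred$fit + 1.96 * pred$se.fit
lower <- pred$fit - 1.96 * pred$se.fit

plot(sim$x, sim$y, pch = ".", xlab = "x", ylab = "y")
lines(grid$x, pred$fit)
lines(grid$x, upper, lty = 2)
lines(grid$x, lower, lty = 2)
```

Calling `predict()` with `se.fit = TRUE` alone keeps `pred$fit` a plain vector, which is what `lines()` expects.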

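The chapter mentions AIC and an F test for comparing models, but no code for the F test survives in the source. The sketch below shows one way such a comparison could be run in R, again on simulated data with hypothetical names (`sim`, `m_lin`, `m_ns4`, `m_ns8`). It assumes the F test compares a linear model against a natural spline model, which is a nested pair because a natural spline basis can represent a straight line; whether the original notes compared these particular models is not known.

```r
library(splines)

set.seed(7)
n <- 300
x <- runif(n, 0, 10)                  # hypothetical covariate
y <- 2 + sin(x) + rnorm(n, sd = 0.5)  # hypothetical nonlinear response
sim <- data.frame(x = x, y = y)

m_lin <- lm(y ~ x, data = sim)               # global linear fit
m_ns4 <- lm(y ~ ns(x, df = 4), data = sim)   # natural spline, 4 df
m_ns8 <- lm(y ~ ns(x, df = 8), data = sim)   # natural spline, 8 df

# AIC for each candidate; a smaller value is preferred
aic_table <- AIC(m_lin, m_ns4, m_ns8)
print(aic_table)

# F test of the linear model against the 4-df natural spline
f_test <- anova(m_lin, m_ns4)
print(f_test)
```

The two spline models are compared by AIC only, because their knots sit at different quantiles and the models are generally not nested.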