Statistical Modeling and R Software: Answers to End-of-Chapter Exercises, Chapters 2-5


Chapter 2 answers:

Ex2.1

    x <- ...

    hist(serumdata, freq=FALSE, col="purple", border="red", density=3, angle=60,
         main=paste("the histogram of serumdata"), xlab="age", ylab="frequency")
    # Histogram. col is the fill color (blank by default). border is the border
    # color (foreground color by default). density draws shading lines (none by
    # default). angle is the slope of the shading lines, counterclockwise
    # (45 degrees by default). main, xlab, ylab are the title and the x- and
    # y-axis labels.
    lines(density(serumdata), col="blue")  # density estimate curve
    x <- ...
    lines(x, dnorm(x, mean(serumdata), sd(serumdata)), col="green")  # normal probability density curve
    plot(ecdf(serumdata), verticals=TRUE, do.p=FALSE)  # empirical distribution plot
    lines(x, pnorm(x, mean(serumdata), sd(serumdata)), col="blue")  # normal distribution function
    qqnorm(serumdata, col="purple")  # QQ plot
    qqline(serumdata, col="red")  # QQ line

Ex3.3

    stem(serumdata, scale=1)  # stem-and-leaf plot; digits after the decimal point are rounded

      The decimal point is at the |
      (stem-and-leaf display, stems 64 to 84 in steps of 2)

    boxplot(serumdata, col="lightblue", notch=TRUE)  # box plot; notch=TRUE draws notches
    fivenum(serumdata)  # five-number summary
    [1] 64.3 71.2 73.5 75.8 84.3

Ex3.4

    shapiro.test(serumdata)  # Shapiro-Wilk normality test

            Shapiro-Wilk normality test
    data:  serumdata
    W = 0.9897, p-value = 0.6437

Conclusion: the p-value is greater than 0.05, so the data can be regarded as coming from a normal population.

    ks.test(serumdata, "pnorm", mean(serumdata), sd(serumdata))  # Kolmogorov-Smirnov normality test

            One-sample Kolmogorov-Smirnov test
    data:  serumdata
    D = 0.0701, p-value = 0.7097
    alternative hypothesis: two-sided

    Warning message:
    In ks.test(serumdata, "pnorm", mean(serumdata), sd(serumdata)) :
      cannot compute correct p-values with ties

Conclusion: the p-value is greater than 0.05, so the data can be regarded as coming from a normal population. Note that the warning appears because the data contain repeated values: the KS test requires the data to be continuous, with no ties.

Ex3.5

    y <- ...
    f <- ...
    plot(f, y, col="lightgreen")  # plot() draws the box plots
    x <- ...
    y <- ...
    z <- ...
    boxplot(x, y, z, names=c("1","2","3"), col=c(5,6,7))
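The two Ex3.5 calls reach box plots by different routes: plot() with a factor as its first argument, and boxplot() with one vector per group. The sketch below illustrates this with made-up numbers, since the exercise's own vectors are not shown here; the names g1, g2, g3, values and groups are hypothetical.

```r
# Made-up measurements for three groups (NOT the exercise data).
g1 <- c(2, 4, 3, 2, 4, 7, 7, 2, 5, 4)
g2 <- c(5, 6, 8, 5, 10, 7, 12, 6, 6, 9)
g3 <- c(7, 11, 6, 6, 7, 9, 5, 10, 6, 3)

# Route 1: one value vector plus a grouping factor, then plot(factor, values).
values <- c(g1, g2, g3)
groups <- factor(rep(1:3, times = c(length(g1), length(g2), length(g3))))
plot(groups, values, col = "lightgreen")

# Route 2: pass each group as a separate vector to boxplot().
boxplot(g1, g2, g3, names = c("1", "2", "3"), col = c(5, 6, 7))

# Both routes summarise the same data, so the five box statistics per group
# can be compared without drawing anything (plot = FALSE).
by_factor  <- boxplot(values ~ groups, plot = FALSE)
by_vectors <- boxplot(g1, g2, g3, plot = FALSE)
all.equal(by_factor$stats, by_vectors$stats)
```

The factor route is convenient when the data already sit in one column with a group label; the vector route suits data that were entered group by group.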

In Ex3.5, boxplot() also draws box plots. Conclusion: groups 2 and 3 do not differ significantly. Group 1 differs significantly from the other two groups.

Ex3.6

There is too much data to type in. plot() should be enough for the scatter plot.

Ex3.7

    studata <- ...
    data.frame(studata)  # convert to a data frame

       V1      V2 V3 V4   V5    V6
    1   1   alice  f 13 56.5  84.0
    2   2   becka  f 13 65.3  98.0
    3   3    gail  f 14 64.3  90.0
    4   4   karen  f 12 56.3  77.0
    5   5   kathy  f 12 59.8  84.5
    6   6    mary  f 15 66.5 112.0
    7   7   sandy  f 11 51.3  50.5
    8   8  sharon  f 15 62.5 112.5
    9   9   tammy  f 14 62.8 102.5
    10 10  alfred  m 14 69.0 112.5
    11 11    duke  m 14 63.5 102.5
    12 12   guido  m 15 67.0 133.0
    13 13   james  m 12 57.3  83.0
    14 14 jeffery  m 13 62.5  84.0
    15 15    john  m 12 59.0  99.5
    16 16  philip  m 16 72.0 150.0
    17 17  robert  m 12 64.8 128.0
    18 18  thomas  m 11 57.5  85.0
    19 19 william  m 15 66.5 112.0

    names(studata) <- ...
    attach(studata)  # attach the data frame so its columns can be used by name
    plot(weight~height, col="red")  # scatter plot of weight against height
    coplot(weight~height|sex, col="blue")  # weight against height, by sex
    coplot(weight~height|age, col="blue")  # weight against height, by age
    coplot(weight~height|age+sex, col="blue")  # weight against height, by age and sex

Ex3.8

    x <- ...
    y <- ...
    f <- ...
    z <- ...
    contour(x, y, z, levels=c(0,1,2,3,4,5,10,15,20,30,40,50,60,80,100), col="blue")  # two-dimensional contour plot
    persp(x, y, z, theta=120, phi=0, expand=0.7, col="lightblue")  # three-dimensional mesh surface

Ex3.9

    attach(studata)
    cor.test(height, weight)  # Pearson correlation test

            Pearson's product-moment correlation
    data:  height and weight
    t = 7.5549, df = 17, p-value = 7.887e-07
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
     0. 0.
    sample estimates:
    cor
     0.

Height and weight are therefore correlated.

Ex4.2

For the exponential distribution, the maximum likelihood estimate of λ is n/sum(Xi).

    x <- ...
    lamda <- ...
    x <- ...
    mean(x)
    [1] 1

The average is 1.

Ex4.4

    obj <- ...
    x0 <- ...
    nlm(obj, x0)
    $minimum
    [1] 48.98425
    $estimate
    [1] 11. -0.
    $gradient
    [1] 1.e-08 -1.e-07

    $code
    [1] 1
    $iterations
    [1] 16

Ex4.5

    x <- ...
    t.test(x)  # t.test() gives the interval estimate for a single normal sample

            One Sample t-test
    data:  x
    t = 35.947, df = 9, p-value = 4.938e-11
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
     63.1585 71.6415
    sample estimates:
    mean of x
         67.4

The point estimate of the mean pulse is 67.4, and the 95% interval estimate is [63.1585, 71.6415].

    t.test(x, alternative="less", mu=72)  # t.test() gives the one-sided interval estimate for a single normal sample

            One Sample t-test
    data:  x
    t = -2.4534, df = 9, p-value = 0.01828
    alternative hypothesis: true mean is less than 72
    95 percent confidence interval:
     -Inf 70.83705
    sample estimates:
    mean of x
         67.4

The p-value is below 0.05, so the null hypothesis is rejected: the mean pulse is lower than that of ordinary people. Key point: how to use t.test(). This example is a single-sample case; both two-sided and one-sided tests are available.

Ex4.6

    x <- ...
    y <- ...
    t.test(x, y, var.equal=TRUE)

            Two Sample t-test
    data:  x and y
    t = 4.6287, df = 18, p-value = 0.
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     7.53626 20.06374
    sample estimates:
    mean of x mean of y
        140.6     126.8

The 95% confidence interval for the difference in means is [7.53626, 20.06374]. Key point: t.test() can estimate the difference in means of two normal samples. This example assumes the two samples have equal variances. ps: I can't help thinking this exercise should use a paired t-test?

Ex4.7

    x <- ...
    y <- ...
    t.test(x, y, var.equal=TRUE)

            Two Sample t-test
    data:  x and y
    t = 1.198, df = 7, p-value = 0.2699
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0. 0.
    sample estimates:
    mean of x mean of y
      0.14125   0.13920

The 95% interval estimate for the difference in means is -0. 0.

Ex4.8 (continuing from Ex4.6)

    var.test(x, y)

            F test to compare two variances
    data:  x and y
    F = 0.2353, num df = 9, denom df = 9, p-value = 0.04229
    alternative hypothesis: true ratio of variances is not equal to 1
    95 percent confidence interval:
     0. 0.
    sample estimates:
    ratio of variances
     0.

Key point: var.test() can estimate the ratio of two sample variances. Based on this result the variances can be considered unequal. Therefore, in Ex4.6 the difference in means should be computed with the unequal-variance setting.

    t.test(x, y)

            Welch Two Sample t-test
    data:  x and y
    t = 4.6287, df = 13.014, p-value = 0.
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     7. 20.
    sample estimates:
    mean of x mean of
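The pooled run in Ex4.6 and the Welch run in Ex4.8 report the same t = 4.6287 but different degrees of freedom (18 and 13.014). That is expected when both samples have the same size: the two standard errors coincide and only the degrees of freedom change. The sketch below illustrates this on simulated stand-in data, because the exercise vectors are not shown here; the means, standard deviations, seed and the 5% cut-off for var.test() are all assumptions.

```r
# Simulated stand-ins for x and y (NOT the exercise data).
set.seed(42)
x <- rnorm(10, mean = 140, sd = 4)
y <- rnorm(10, mean = 127, sd = 9)

pooled <- t.test(x, y, var.equal = TRUE)  # assumes equal variances
welch  <- t.test(x, y)                    # default: Welch correction

# With equal sample sizes the two standard errors coincide, so t is the same.
all.equal(unname(pooled$statistic), unname(welch$statistic))

# The degrees of freedom differ: n1 + n2 - 2 for the pooled test,
# a smaller value for the Welch test.
pooled$parameter
welch$parameter

# One possible workflow: let the F test choose the variant
# (the 5% threshold is an assumption, not part of the exercise).
equal_var <- var.test(x, y)$p.value >= 0.05
chosen <- t.test(x, y, var.equal = equal_var)
chosen$conf.int
```

Because the Welch test has fewer degrees of freedom, its confidence interval is somewhat wider than the pooled one even though the t statistic is unchanged.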
