1 Confidence Sets

Let $X_i$, $i = 1, \ldots, n$, be an i.i.d. sample of observations with distribution $P \in \mathbf{P}$. The family $\mathbf{P}$ may be a parametric, nonparametric, or semiparametric family of distributions. We are interested in making inferences about some parameter $\theta(P) \in \Theta = \{\theta(P) : P \in \mathbf{P}\}$. Typical examples of $\theta(P)$ are the mean of $P$ or the median of $P$, but, more generally, it could be any function of $P$. Specifically, we are interested in constructing a confidence set for $\theta(P)$; that is, a random set $C_n = C_n(X_1, \ldots, X_n)$ such that

$$P\{\theta(P) \in C_n\} \geq 1 - \alpha ,$$

at least for $n$ sufficiently large.

The typical way of constructing such sets is based on approximating the distribution of a root, $R_n = R_n(X_1, \ldots, X_n, \theta(P))$. A root is simply any real-valued function depending on both the data, $X_i$, $i = 1, \ldots, n$, and the parameter of interest, $\theta(P)$. The idea is that if the distribution of the root were known, then one could straightforwardly construct a confidence set for $\theta(P)$. To illustrate this idea, let $J_n(x, P)$ denote the distribution of $R_n$; that is,

$$J_n(x, P) = P\{R_n \leq x\} .$$

The notation is intended to emphasize the fact that the distribution of the root depends on both the sample size, $n$, and the distribution of the data, $P$. Using $J_n(x, P)$, we may choose a constant $c$ such that

$$P\{R_n \leq c\} \geq 1 - \alpha .$$
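As a rough illustration of how such a $c$ might be obtained when $P$ is known, the sketch below simulates draws of the root $R_n = \sqrt{n}(\bar{X}_n - \theta(P))$ and takes an empirical $1 - \alpha$ quantile. The choice of an exponential $P$ with mean 1, the sample size, the number of replications, and the helper names `root`, `simulate_roots`, and `upper_quantile` are all assumptions made for this example; they do not come from the notes.

```python
import math
import random


def root(sample, theta):
    """Root R_n = sqrt(n) * (sample mean - theta)."""
    n = len(sample)
    return math.sqrt(n) * (sum(sample) / n - theta)


def simulate_roots(draw, theta, n, reps, rng):
    """Monte Carlo draws from J_n(., P); draw(rng) returns one observation from P."""
    return [root([draw(rng) for _ in range(n)], theta) for _ in range(reps)]


def upper_quantile(values, level):
    """Smallest simulated value c whose empirical P{R_n <= c} is at least `level`."""
    ordered = sorted(values)
    k = max(1, math.ceil(level * len(ordered)))  # need at least k values <= c
    return ordered[k - 1]


rng = random.Random(0)
alpha = 0.05
# Assumed P for illustration: exponential with rate 1, so theta(P) = mean = 1.
roots = simulate_roots(lambda r: r.expovariate(1.0), theta=1.0, n=50, reps=2000, rng=rng)
c = upper_quantile(roots, 1 - alpha)
coverage = sum(r <= c for r in roots) / len(roots)
print(f"c = {c:.3f}, simulated P(R_n <= c) = {coverage:.3f}")
```

In practice $P$ is unknown, so this calculation is not directly available; it only shows what choosing $c$ from $J_n(x, P)$ amounts to.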
Given such a $c$, the set

$$C_n = \{\theta \in \Theta : R_n(X_1, \ldots, X_n, \theta) \leq c\}$$

is a confidence set in the sense described above. We may also choose $c_1$ and $c_2$ so that

$$P\{c_1 \leq R_n \leq c_2\} \geq 1 - \alpha .$$

Given such $c_1$ and $c_2$, the set

$$C_n = \{\theta \in \Theta : c_1 \leq R_n(X_1, \ldots, X_n, \theta) \leq c_2\}$$

is a confidence set in the sense described above.

1.1 Pivots

In some rare instances, $J_n(x, P)$ does not depend on $P$. In these instances, the root is said to be pivotal or a pivot. For example, if $\theta(P)$ is the mean of $P$ and $\mathbf{P} = \{N(\theta, 1) : \theta \in \mathbf{R}\}$, then the root

$$R_n = \sqrt{n}\,(\bar{X}_n - \theta(P))$$

is a pivot because $R_n \sim N(0, 1)$. In this case, we may construct confidence sets $C_n$ with finite-sample validity; that is,

$$P\{\theta(P) \in C_n\} = 1 - \alpha$$

for all $n$ and $P \in \mathbf{P}$.

If it is known that $J_n(x, P)$ does not depend on $P$, but its exact form is not known or is intractable, then one may resort to simulation to approximate $J_n(x, P)$ to any desired degree of accuracy (since the distribution does not depend on $P$, just pick any $P \in \mathbf{P}$ and simulate $J_n(x, P)$ using that $P$). An example of this is
given by the Kolmogorov-Smirnov statistic: Remarkably, the distribution of

$$\sqrt{n}\, \sup_{x \in \mathbf{R}} |\hat{F}_n(x) - F(x)|$$

does not depend on $F$ as long as $F$ is continuous! We may use this to construct uniform confidence bands on $F$ provided that we assume that $F$ is continuous.

1.2 Asymptotic Pivots

Sometimes, the root may not be pivotal in the sense described above, but it may be asymptotically pivotal or an asymptotic pivot in that $J_n(x, P)$ converges in distribution to a limit distribution $J(x, P)$ that does not depend on $P$. For example, if $\theta(P)$ is the mean of $P$ and $\mathbf{P}$ is the set of all distributions on $\mathbf{R}$ with a finite, nonzero variance, then

$$R_n = \frac{\sqrt{n}\,(\bar{X}_n - \theta(P))}{\hat{\sigma}_n}$$

is asymptotically pivotal because it converges in distribution to $J(x, P) = \Phi(x)$. In this case, we may construct confidence sets that are asymptotically valid in the sense that

$$\lim_{n \to \infty} P\{\theta(P) \in C_n\} = 1 - \alpha$$

for all $P \in \mathbf{P}$.

1.3 Asymptotic Approximations

Typically, the root will be neither a pivot nor an
asymptotic pivot. The distribution of the root, $J_n(x, P)$, will typically depend on $P$, and, when it exists, the limit distribution of the root, $J(x, P)$, will, too. For example, if $\theta(P)$ is the mean of $P$ and $\mathbf{P}$ is the set of all distributions on $\mathbf{R}$ with a finite, nonzero variance, then

$$R_n = \sqrt{n}\,(\bar{X}_n - \theta(P))$$

converges in distribution to $J(x, P) = \Phi(x / \sigma(P))$. In this case, we can approximate this limit distribution with $\Phi(x / \hat{\sigma}_n)$, which will lead to confidence sets that are asymptotically valid in the sense described above.

Note that this third approach depends very heavily on the limit distribution $J(x, P)$ being both known and tractable. Even if it is known, the limit distribution may be difficult to work with (e.g., it could be the supremum of some complicated stochastic process with many nuisance parameters). Moreover, even if it is known and manageable, the method may be poor in finite samples because it essentially relies on a double approximation: first, $J_n(x, P)$ is approximated by $J(x, P)$, then $J(x, P)$ is approximated in some way by estimating the unknown parameters of the limit distribution.

2 The Bootstrap

The bootstrap is a fourth, more general approach to approximating $J_n(x, P)$. The idea is very simple: Replace the unknown $P$ with an estimate $\hat{P}_n$. Given $\hat{P}_n$, it is possible to compute (either analytically or using simulation to any desired degree of accuracy) $J_n(x, \hat{P}_n)$. In the case of i.i.d. data, a typical choice is the empirical distribution (though if $P = P(\psi)$ for some finite-dimensional parameter $\psi$, then one may also
use $\hat{P}_n = P(\hat{\psi}_n)$ for some estimate $\hat{\psi}_n$ of $\psi$). The hope is that whenever $\hat{P}_n$ is "close" to $P$ (which may be ensured, for example, by the Glivenko-Cantelli Theorem), $J_n(x, \hat{P}_n)$ is "close" to $J_n(x, P)$. Essentially, this requires that $J_n(x, P)$, when viewed as a function of $P$, is continuous in an appropriate neighborhood of $P$. Often, this turns out to be true, b