Introduction to Propensity Score Matching: PPT Reference Slides

Introduction to Propensity Score Matching: A Review and Illustration
Shenyang Guo, Ph.D.
School of Social Work
University of North Carolina at Chapel Hill
January 28, 2005
For Workshop Conducted at the School of Social Work, University of Illinois Urbana-Champaign

NSCAW data used to illustrate PSM were collected under funding by the Administration on Children, Youth, and Families of the U.S. Department of Health and Human Services. Findings do not represent the official position or policies of the U.S. DHHS. PSM analyses were funded by the Robert Wood Johnson Foundation Substance Abuse Policy Research Program, and by the Children's Bureau's research grant. Results are preliminary and not quotable. Contact information: sguo@email.unc.edu

Outline
Day 1
Overview: Why PSM?
History and development of PSM
Counterfactual framework
The fundamental assumption
General procedure
Software packages
Review & illustration of the basic methods developed by Rosenbaum and Rubin

Outline (continued)
Review and illustration of Heckman's difference-in-differences method
Problems with Rosenbaum & Rubin's method
Difference-in-differences method
Nonparametric regression
Bootstrapping
Day 2
Practical issues, concerns, and strategies
Questions and discussions

PSM References
Check website: http://sswnt5.sowo.unc.edu/VRC/Lectures/index.htm
(Link to file “Day1b.doc”)

Why PSM? (1)
Need 1: Analyze causal effects of treatment from observational data
Observational data: those that are not generated by mechanisms of randomized experiments, such as surveys, administrative records, and census data.
To analyze such data, an ordinary least squares (OLS) regression model using a dichotomous indicator of treatment does not work, because in such a model the error term is correlated with the explanatory variable.

Why PSM? (2)
The independent variable w is usually correlated with the error term. The consequence is an inconsistent and biased estimate of the treatment effect.

Why PSM? (3)
Need 2: Removing Selection Bias in Program Evaluation
Fisher's randomization idea.
Can social behavioral research really accomplish randomized assignment of treatment?
Consider E(Y1|W=1) - E(Y0|W=0). Adding and subtracting E(Y0|W=1), we have
[E(Y1|W=1) - E(Y0|W=1)] + [E(Y0|W=1) - E(Y0|W=0)]
Crucial: E(Y0|W=1) ≠ E(Y0|W=0)
The debate among education researchers: the impact of Catholic schools vis-à-vis public schools on learning. The Catholic school effect is the
strongest among those Catholic students who are less likely to attend Catholic schools (Morgan, 2001).

Why PSM? (4)
Heckman & Smith (1995) Four Important Questions:
What are the effects of factors such as subsidies, advertising, local labor markets, family income, race, and sex on program application decisions?
What are the effects of bureaucratic performance standards, local labor markets, and individual characteristics on administrative decisions to accept applicants and place them in specific programs?
What are the effects of family background, subsidies, and local market conditions on decisions to drop out from a program and on the length of time taken to complete a program?
What are the costs of various alternative treatments?

History and Development of PSM
The landmark paper: Rosenbaum & Rubin (1983).
Heckman's early work in the late 1970s on selection bias and his closely related work on dummy endogenous variables (Heckman, 1978) address the same issue of estimating treatment effects when assignment is nonrandom.
Heckman's work on the dummy endogenous variable problem and the selection model can be understood as a generalization of the propensity-score approach (Winship & Morgan, 1999).
In the 1990s, Heckman and his colleagues developed the difference-in-differences (DID) approach, which is a significant contribution to PSM. In economics, the DID approach and its related techniques are more generally called nonexperimental evaluation, or econometrics of matching.

The Counterfactual Framework
Counterfactual: what would have happened to the treated subjects, had they not received treatment?
The key assumption of the counterfactual framework is that individuals selected into treatment and nontreatment groups have potential outcomes in both states: the one in which they are observed and the one in which they are not observed (Winship & Morgan, 1999).
For the treated group, we have the observed mean outcome under the condition of treatment, E(Y1|W=1), and the unobserved mean outcome under the condition of nontreatment, E(Y0|W=1). Similarly, for the nontreated group we have both observed mea

in E(Y0|W=1). The real debate about the classical experimental approach centers on the question: does E(Y0|W=0) really represent E(Y0|W=1)?

Fundamental Assumption
Rosenbaum & Rubin (1983)
Different versions: “unconfoundedness” & “ignorable treatment assignment” (Rosenbaum & Rubin, 1983), “selection on observables” (Barnow, Cain, & Goldberger, 1980), “conditional independence” (Lechner 1999, 2002), and “exogeneity” (Imbens, 2004)

General Procedure
Run logistic regression:
Dependent variable: Y = 1 if participate; Y = 0 otherwise.
Choose appropriate conditioning (instrumental) variables.
Obtain propensity score: predicted probability (p) or log[(1-p)/p].
Either:
1-to-1 or 1-to-n match (nearest neighbor matching, caliper matching, Mahalanobis, or Mahalanobis with propensity score added), followed by multivariate analysis based on the new sample
1-to-1 or 1-to-n match and then stratification (subclassification)
Or:
Kernel or local linear weight match and then estimate difference-in-differences (Heckman)

Nearest Neighbor and Caliper Matching
Nearest neighbor: The nonparticipant with the value of Pj that is closest to Pi is selected as the match.
Caliper: A variation of nearest neighbor: a match for person i is selected only if |Pi - Pj| < ε, where ε is a pre-specified tolerance.
Recommended caliper size: .25σp (a quarter of a standard deviation of the propensity score)
1-to-1 nearest neighbor within caliper (this is a common practice)
1-to-n nearest neighbor within caliper

Mahalanobis Metric Matching: (with or without replacement)
Mahalanobis without p-score: Randomly ordering subjects,
calculate the distance between the first participant and all nonparticipants. The distance d(i,j) can be defined by the Mahalanobis distance:
d(i,j) = (u - v)^T C^-1 (u - v)
where u and v are values of the matching variables for participant i and nonparticipant j, and C is the sample covariance matrix of the matching variables from the full set of nonparticipants.
Mahalanobis metric matching with p-score added (to u and v).
Nearest available Mahalanobis metric matching within calipers defined by the propensity score (need your own programming).

Stratification (Subclassification)
Matching and bivariate analysis are combined into one procedure (no step-3 multivariate analysis):
Group sample into five categories based on propensity score (quintiles).
Within each quintile, calculate mean outcome for treated and nontreated groups.
Estimate the mean difference (average treatment effects) for the whole sample (i.e., all five groups) and variance using the following equations:

Multivariate Analysis at Step-3
We could perform any kind of multivariate analysis we originally wished to perform on the unmatched data. These analyses may include:
multiple regression
generalized linear model
survival analysis
structural equation modeling with multiple-group comparison, and
hierarchical linear modeling (HLM)
As usual, we use a dichotomous variable indicating treatment versus control in these models.

Very Useful Tutorial for Rosenbaum & Rubin's Matching Methods
D'Agostino, R.B. (1998). Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Statistics in Medicine 17, 2265-2281.

Software Packages
There is currently no commercial software package that offers a formal procedure for PSM. In SAS, Lori Parsons developed several macros (e.g., the GREEDY macro does nearest neighbor within caliper matching). In SPSS, Dr. John Painter of the Jordan Institute developed an SPSS macro that does work similar to GREEDY (http://sswnt5.sowo.unc.edu/VRC/Lectures/index.htm).
We have investigated several computing packages and found that PSMATCH2 (developed by Edwin Leuven and Barbara Sianesi, 2003, as a user-supplied routine in STATA) is the most comprehensive package that allows users to fulfill most tasks for propensity score matching, and the routine is being continuously improved and updated.

Demonstration of Running STATA/PSMATCH2: Part 1. Rosenbaum & Rubin's Methods
(Link to file “Day1c.doc”)

Problems with the Conventional (Prior to Heckman's DID) Approaches
Equal weight is given to each nonparticipant, though within caliper, in constructing the counterfactual mean.
Loss of sample cases due to 1-to-1 match. What does the resample represent? External validity.
It's a dilemma between inexact match and incomplete match: while trying to maximize exact matches, cases may be excluded due to incomplete matching; while trying to maximize cases, inexact matching may result.

Heckman's Difference-in-Differences Matching Estimator (1)
Difference-in-differences
Applies when each participant matches to multiple nonparticipants.
Labels on the estimator formula: total number of participants; participant i in the set of common support; multiple nonparticipants who are in the set of common support (matched to i); weight (see the following slides); difference in differences.

Heckman's Difference-in-Differences Matching Estimator (2)
Weights W(i,j) (distance between i and j) can be determined by using one of two methods:
1. Kernel matching, where G(.) is a kernel function and n is a bandwidth parameter.

Heckman's Difference-in-Differences Matching Estimator (3)
2. Local linear weighting function (lowess)

A Review of Nonparametric Regression (Curve Smoothing Estimators)
I am grateful to John Fox, the author of the two Sage green books on nonparametric regression (2000), for his provision of the R code to produce the illustrating example.

Why Nonparametric? Why Parametric Regression Doesn't Work?

The Task: Determining the Y-value for a Focal Point X(120)
Focal x(120): the 120th ordered x
Saint Lucia: x=3183, y=74.8
The window, called span, contains .5N=95 observations

Weights within the Span Can Be Determined by the Tricube Kernel Function
Tricube kernel weights

The Y-value at Focal X(120) Is a Weighted Mean
Weighted mean = 71.11301

The Nonparametric Regression Line Connects All 190 Averaged Y Values

Review of Kernel Functions
Tricube is the default kernel in popular packages.
Gaussian normal kernel:
Epanechnikov kernel: parabolic shape with support [-1, 1]. But the kernel is not differentiable at z=±1.
Rectangular kernel (a
crude method).

Local Linear Regression
(Also known as lowess or loess)
A more sophisticated way to calculate the Y values. Instead of constructing a weighted average, it aims to construct a smooth local linear regression with estimated β0 and β1 that minimizes a kernel-weighted sum of squared residuals, where K(.) is a kernel function, typically tricube.

The Local Average Now Is Predicted by a Regression Line, Instead of a Line Parallel to the X-axis.

Asymptotic Properties of lowess
Fan (1992, 1993) demonstrated advantages of lowess over more standard kernel estimators. He proved that lowess has nice sampling properties and high minimax efficiency.
In Heckman's works prior to 1997, he and his co-authors used the kernel weights. But since 1997 they have used lowess.
In practice it's fairly complicated to program the asymptotic properties. No software packages provide estimation of the S.E. for lowess. In practice, one uses the S.E. estimated by bootstrapping.

Bootstrap Statistics Inference (1)
It allows the user to make inferences without making strong distributional assumptions and without the need for analytic formulas for the sampling distribution's parameters.
Basic idea: treat the sample as if it is the population, and apply Monte Carlo sampling to generate an empirical estimate of the statistic's sampling distribution. This is done by drawing a large number of “resamples” of size n from this original sample randomly with replacement.
A closely related idea is the Jackknife: “drop one out”. That is, it systematically drops out subsets of the data one at a time and assesses the variation in the sampling distribution of the statistics of interest.

Bootstrap Statistics Inference (2)
After obtaining the estimated standard error (i.e., the standard deviation of the sampling distribution), one can calculate a 95% confidence interval using one of the following three methods:
Normal approximation method
Percentile method
Bias-corrected (BC) method
The BC method is popular.

Finite-Sample Properties of lowess
The finite-sample properties of lowess have been examined just recently (Frölich, 2004). Two practical implications:
1. Choose optimal bandwidth value.
2. Trimming (i.e., discarding the nonparametric regression results in regions where the propensity scores for the nontreated cases are sparse) may not be the best response to the variance problems. Sensitivity analysis testing different trimming schemes.

Heckman's Contributions to PSM
Unlike traditional matching, DID uses propensity scores differentially to calculate the weighted mean of counterfactuals. A creative way to use information from multiple matches.
DID uses longitudinal data (i.e., outcome before and after intervention).
By doing this, the estimator is more robust: it eliminates temporally invariant sources of bias that may arise when program participants and nonparticipants are geographically mismatched or from differences in survey questionnaire.

Demonstration of Running STATA/PSMATCH2: Part 2. Heckman's Difference-in-differences Method
(Link to file “Day1c.doc”)

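Illustration: 1-to-1 nearest neighbor within caliper
The rule on the "Nearest Neighbor and Caliper Matching" slide can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the GREEDY macro or PSMATCH2. It assumes the propensity scores have already been estimated. The function name caliper_match, the case ids, the scores, the greedy processing order, and the use of the pooled standard deviation of the scores for the .25 caliper are all hypothetical choices made for this example.

```python
import statistics


def caliper_match(treated, controls, caliper_sd=0.25):
    """Greedy 1-to-1 nearest neighbor matching within a caliper, without replacement.

    treated, controls: dicts mapping a case id to an estimated propensity score.
    Returns {treated_id: control_id}; treated cases with no control inside
    the caliper are left unmatched (the "incomplete match" problem).
    """
    pooled = list(treated.values()) + list(controls.values())
    # Assumption: caliper = .25 * SD of the propensity score in the pooled sample.
    caliper = caliper_sd * statistics.stdev(pooled)
    available = dict(controls)
    matches = {}
    for t_id, p_i in treated.items():  # participants are processed in the order given
        if not available:
            break
        # Nonparticipant j whose Pj is closest to Pi.
        c_id = min(available, key=lambda j: abs(available[j] - p_i))
        if abs(available[c_id] - p_i) < caliper:
            matches[t_id] = c_id
            del available[c_id]  # without replacement: each control is used once
    return matches


# Hypothetical propensity scores.
treated = {"t1": 0.80, "t2": 0.55, "t3": 0.30}
controls = {"c1": 0.78, "c2": 0.50, "c3": 0.10, "c4": 0.05}
matches = caliper_match(treated, controls)
print(matches)
```

In this toy data the third participant has no nonparticipant within the caliper and is dropped, which is the loss of sample cases the slides describe for 1-to-1 matching.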

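Illustration: tricube weights, weighted mean, and local linear fit
The nonparametric regression slides describe two ways of obtaining the Y value at a focal point: a tricube-weighted mean of the observations inside the span, and a local linear regression fitted with the same weights. The sketch below is a simplified illustration, not John Fox's R code or any package's lowess routine. The function names, the toy data, and the way the window half-width is taken (the distance to the last observation inside the span) are assumptions. It also assumes the x values are distinct, so the half-width is never zero.

```python
def tricube(z):
    """Tricube kernel weight: (1 - |z|^3)^3 for |z| < 1, otherwise 0."""
    z = abs(z)
    return (1.0 - z ** 3) ** 3 if z < 1.0 else 0.0


def smooth_at(x, y, x0, span=0.5):
    """Return (weighted mean, local linear prediction) of y at the focal value x0."""
    m = max(2, int(span * len(x)))                # observations inside the window
    h = sorted(abs(xi - x0) for xi in x)[m - 1]   # window half-width
    w = [tricube((xi - x0) / h) for xi in x]      # zero weight outside the window
    d = [xi - x0 for xi in x]                     # x centered at the focal point
    sw = sum(w)
    sx = sum(wi * di for wi, di in zip(w, d))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * di * di for wi, di in zip(w, d))
    sxy = sum(wi * di * yi for wi, di, yi in zip(w, d, y))
    weighted_mean = sy / sw                       # a line parallel to the X-axis
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw            # regression prediction at x0
    return weighted_mean, intercept


# Toy data on an exact line y = 2 + 3x, evaluated at the left boundary x0 = 0.
x = list(range(11))
y = [2 + 3 * xi for xi in x]
mean_fit, linear_fit = smooth_at(x, y, x0=0)
print(mean_fit, linear_fit)
```

At a boundary point all neighbors lie on one side, so the weighted mean is pulled toward them while the local linear fit follows the underlying line. This is one way to see why a regression line can be preferable to a local average.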

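Illustration: bootstrap standard error and confidence intervals
The bootstrap slides describe drawing many resamples of size n with replacement, then using the spread of the resampled statistic for inference. The following is a minimal sketch for the simplest case, the mean of a sample. The function name bootstrap_mean, the number of resamples, the seed, and the data are hypothetical. A real PSM application would repeat the whole matching step inside each resample, and the bias-corrected (BC) method is not shown.

```python
import random
import statistics


def bootstrap_mean(sample, n_resamples=2000, seed=12345):
    """Bootstrap the sample mean: return (standard error, percentile 95% CI)."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = sorted(
        statistics.fmean(rng.choices(sample, k=n))  # one resample of size n, with replacement
        for _ in range(n_resamples)
    )
    se = statistics.stdev(estimates)                 # SD of the empirical sampling distribution
    lower = estimates[int(0.025 * n_resamples)]      # percentile method
    upper = estimates[int(0.975 * n_resamples) - 1]
    return se, (lower, upper)


# Hypothetical matched-pair outcome differences (treated minus matched control).
diffs = [1.2, -0.4, 0.9, 2.1, 0.3, 1.5, -0.8, 0.7, 1.1, 0.2,
         1.8, -0.1, 0.6, 1.4, 0.0, 2.3, 0.5, -0.6, 1.0, 0.8]
se, (lower, upper) = bootstrap_mean(diffs)
estimate = statistics.fmean(diffs)
normal_ci = (estimate - 1.96 * se, estimate + 1.96 * se)  # normal approximation method
print(se, (lower, upper), normal_ci)
```

The percentile interval reads its limits directly off the sorted resampled estimates, while the normal approximation interval only uses the bootstrap standard error.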