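Examples of Regression Analysis

黃熾森
Professor, Department of Management, The Chinese University of Hong Kong
Address: Department of Management, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Email:
March 2006

Outline

- Principles of regression analysis and its statistical tests
- Use of dummy variables
- Incremental validity
- Testing moderators
- Testing mediators
- Key points to note when applying regression analysis
- Discussion of several research examples that apply regression analysis

The regression equation

The independent and dependent variables are assumed to have a linear relationship. If X1 and X2 (the independent variables) are causes of Y (the dependent variable), their relationship can be written as:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$

where:
(1) $\beta_0$ is a constant;
(2) $\beta_1$ is the amount by which Y changes when X1 changes by one unit;
(3) $\beta_2$ is the amount by which Y changes when X2 changes by one unit;
(4) $\varepsilon$ is random error.

The variance perspective: the simplest case

[Figure: Venn diagram of Y, X1 and X2 with areas A, B and C]

The simplest case: $\beta_1$ and $\beta_2$ are independent

The larger the share of the total variance of Y taken up by A, the larger $\beta_1$ is, reflecting a stronger effect of X1 on Y. The larger the share taken up by B, the larger $\beta_2$ is, reflecting a stronger effect of X2 on Y. C is the part of Y that X1 and X2 cannot affect, that is, the variance of $\varepsilon$.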
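As a worked illustration of the diagram, and assuming X1, X2 and $\varepsilon$ are mutually uncorrelated (the independent case described here), the variance of Y splits into three non-overlapping pieces. Reading the three terms as the areas A, B and C is an interpretation added for this sketch, not a formula given in the slides.

```latex
\begin{aligned}
\operatorname{Var}(Y)
  &= \underbrace{\beta_1^2 \operatorname{Var}(X_1)}_{A}
   + \underbrace{\beta_2^2 \operatorname{Var}(X_2)}_{B}
   + \underbrace{\operatorname{Var}(\varepsilon)}_{C}, \\
\frac{A + B}{\operatorname{Var}(Y)}
  &= 1 - \frac{\operatorname{Var}(\varepsilon)}{\operatorname{Var}(Y)} .
\end{aligned}
```

With the variance of X1 held fixed, a larger A can only come from a larger $\beta_1$ in absolute value, which matches the reading of the diagram.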

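The larger the share of the total variance of Y taken up by C, the poorer the ability of X1 and X2 to predict or explain the variation in Y. A and B do not overlap, which means that $\beta_1$ and $\beta_2$ are independent and do not affect each other.

The variance perspective: the more common case

[Figure: Venn diagram of Y, X1 and X2 with areas a, b, D and C]

The common case: $\beta_1$ and $\beta_2$ are not independent

If we assume that C takes up the same share of the total variance of Y as in the previous figure, then the ability of X1 and X2 to predict or explain the variation in Y is also the same as before (that is, A + B = a + b + D).

Here $\beta_1$ and $\beta_2$ are not independent; they affect each other. If we leave X2 out, $\beta_1$ becomes larger; likewise, if we leave X1 out, $\beta_2$ becomes larger.

This point matters for understanding real phenomena. If X1 and X2 both exist and both affect Y in reality, but our theory does not consider them together, the theory cannot describe the relationships between these independent variables and the dependent variable correctly.

The principle of regression analysis

Regression analysis considers the effects of several independent variables on one dependent variable simultaneously. Two things are important:
(1) Overall, how well the independent variables predict or explain the dependent variable, that is, how large the variance of $\varepsilon$ is as a share of the total variance of Y. The smaller this share, the greater the predictive or explanatory power.
(2) The effect of each individual independent variable on the dependent variable when all independent variables are considered simultaneously. This lets us draw conclusions of the following kind: other things being equal, when an independent variable (say X1) changes by one unit, the dependent variable (say Y) changes by $\beta_1$ units.

Statistics and parameters in regression analysis

Each sample statistic corresponds to a population parameter.

I. Overall explanatory power of the independent variables:
(a) sample: the variance of e; population: the variance of $\varepsilon$
(b) sample: 1 − (variance of e)/(variance of Y in the sample), called R-square ($R^2$); population: 1 − (variance of $\varepsilon$)/(variance of Y)

II. Effect of each individual independent variable when all independent variables are considered simultaneously:
(c) sample: $b_0$; population: $\beta_0$
(d) sample: $b_i$; population: $\beta_i$ (called beta)

Computing $R^2$, $b_0$ and $b_i$

Because the aim is to predict or explain the dependent variable from the independent variables, we look in the sample data for the set of values of $b_0$ and $b_i$ that makes the variance of e as small as possible. The resulting $R^2$ and $b_i$ are then used for hypothesis testing.

Statistical test of $R^2$

This tests whether, overall, the independent variables have any power to predict or explain the dependent variable:
(1) Set up the conservative (null) hypothesis that in the population the independent variables have no effect on the dependent variable and none of them covaries with it, so that the population value of 1 − (variance of $\varepsilon$)/(variance of Y) is zero.
(2) Draw a sample and measure the independent and dependent variables to obtain the data for computing $R^2$.
(3) Compute the probability of observing a sample $R^2$ at least this large if the conservative hypothesis is true (the P value).
(4) Use the P value to decide whether to reject the conservative hypothesis.

Statistical test of $\beta_i$

This tests the effect of each independent variable on the dependent variable, other things being equal. Taking X1 as the example:
(1) Set up the conservative (null) hypothesis that in the population X1 has no effect on Y, so that $\beta_1$ equals zero.
(2) Draw a sample and measure the independent and dependent variables to obtain the data for computing $b_1$.
(3) Compute the probability of observing a sample $b_1$ at least this far from zero if the conservative hypothesis is true (that is, if $\beta_1$ equals zero); this is the P value.
(4) Use the P value to decide whether to reject the conservative hypothesis.

The need for dummy variables

By using the equation

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$

to represent the relationship among X1, X2 and Y, we have in effect already assumed that X1, X2 and Y are measured on at least an interval scale.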
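The estimation steps described above (choose $b_0$ and $b_i$ to minimise the variance of e, then compute $R^2$) can be sketched in plain Python. This is a minimal illustration, not the author's procedure: the data are hypothetical, the function names `solve`, `ols`, `r_squared` and `f_statistic` are invented here, and the F formula is the standard overall test statistic rather than something stated in the slides. Turning F into a P value needs the F distribution, which the standard library does not provide, so that step is left out.

```python
from statistics import fmean


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ols(y, *predictors):
    """Return [b0, b1, ...] that minimise the sum of squared residuals e."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]  # leading 1 = intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(y, coefs, *predictors):
    """R^2 = 1 - (variance of e) / (variance of Y), both taken in the sample."""
    fitted = [coefs[0] + sum(b * v for b, v in zip(coefs[1:], vals))
              for vals in zip(*predictors)]
    mean_y = fmean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def f_statistic(r2, n, k):
    """Overall F for the null hypothesis that the population R^2 is zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Hypothetical data in which X2 closely tracks X1, so the predictors overlap.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
noise = [0.5, 0.5, -0.5, -0.5, -0.5, -0.5, 0.5, 0.5]
y = [1 + 2 * a + 0.5 * b + e for a, b, e in zip(x1, x2, noise)]

b_full = ols(y, x1, x2)      # b0, b1, b2 with both predictors in the model
b_x1_only = ols(y, x1)       # b0, b1 when X2 is left out
r2 = r_squared(y, b_full, x1, x2)
f = f_statistic(r2, n=len(y), k=2)

print("b with X1 and X2:", [round(v, 3) for v in b_full])
print("b with X1 only:  ", [round(v, 3) for v in b_x1_only])
print("R^2:", round(r2, 4), "F:", round(f, 2))
```

Because X1 and X2 overlap in these data, the coefficient on X1 is larger when X2 is left out, which is the behaviour described for the common case.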

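Otherwise the arithmetic in the equation could not be carried out.

Dummy variables for nominal scales

Suppose X2 is gender. We can create a new dummy variable (D) to stand in for it, setting D to 1 when the respondent is male and to 0 when the respondent is female. The regression equation becomes:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 D + \varepsilon$$

(1) When D equals 1, this is $Y = \beta_0 + \beta_1 X_1 + \beta_2 + \varepsilon$.
(2) When D equals 0, this is $Y = \beta_0 + \beta_1 X_1 + \varepsilon$.

If the statistical test leads us to conclude that $\beta_2$ equals zero, then gender plays no part in predicting or explaining Y, because whether the respondent is male or female the equation we accept is $Y = \beta_0 + \beta_1 X_1 + \varepsilon$. Once gender is represented by the dummy variable D, regression analysis can proceed as usual.

Dummy variables as dependent variables

If the dependent variable (Y) is measured on a nominal scale with two categories, we can still set up a dummy variable and run a special kind of regression called Logistic Regression. An example is research on turnover.

Dummy variables for more than two categories

Suppose X2 has more than two categories, for example type of company: state-owned enterprise (SOE), Sino-foreign joint venture (JV) and wholly foreign-owned enterprise (WOFE). We then need to create two new dummy variables (D1 and D2) to replace it. For example, set D1 to 1 when the company is an SOE and to 0 otherwise; set D2 to 1 when the company is a JV and to 0 otherwise. The regression equation is:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 D_1 + \beta_3 D_2 + \varepsilon$$

If the statistical tests lead us to conclude that $\beta_2$ and $\beta_3$ are both zero, company type is of no use in predicting or explaining Y.

When an independent variable has n categories, we only need to set up (n − 1) dummy variables to test its effect on the dependent variable by regression.

Testing incremental validity

Incremental validity means that an independent variable still affects the dependent variable after other independent variables already known to affect it have been taken into account.

Some theories also describe the independent variables as affecting the dependent variable one after another (or one set after another).

Tests of $R^2$ and $b_i$ alone cannot verify theories of this kind; Hierarchical Regression is needed.

Hierarchical Regression test - 1

To verify whether X2 has predictive and explanatory power for Y over and above X1, we compare the following two equations:

(1) $Y = \beta_{01} + \beta_{11} X_1 + \varepsilon_1$
(2) $Y = \beta_{02} + \beta_{12} X_1 + \beta_2 X_2 + \varepsilon_2$

If the second equation predicts and explains Y better than the first, we can say that X2 has predictive and explanatory power for Y over and above X1. In the sample data we compare the $R^2$ of the two equations; the difference is called delta R-square ($\Delta R^2$).

Hierarchical Regression test - 2

(1) Set up the conservative (null) hypothesis that in the population the two equations do not differ in their power to predict and explain Y, so that 1 − (variance of $\varepsilon$)/(variance of Y) is the same for both.
(2) Draw a sample and measure the independent and dependent variables to obtain the data for computing the $R^2$ of the two equations and $\Delta R^2$.
(3) Compute the probability of observing a sample $\Delta R^2$ at least this large if the conservative hypothesis is true (the P value).
(4) Use the P value to decide whether to reject the conservative hypothesis.

Testing moderators - 1

In regression analysis a moderator can be verified with an interaction term. An interaction term is the cross-product of the two variables involved. For example, to verify whether X2 moderates the relationship between X1 and Y, we first compute the product of X1 and X2 (X1*X2).

Using the Hierarchical Regression test, we compare the following two equations:

(1) $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon_1$
(2) $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 X_2) + \varepsilon_2$

If we accept the second equation, that is, if the test of $\Delta R^2$ rejects the conservative hypothesis that it equals zero, we are accepting that X2 moderates the relationship between X1 and Y.

Testing moderators - 2

Suppose X2 is held constant and X1 increases by one unit. The change in Y ($\Delta Y$) is then:

(1) $Y_1 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 X_2) + \varepsilon$
(2) $Y_2 = \beta_0 + \beta_1 (X_1 + 1) + \beta_2 X_2 + \beta_3 [(X_1 + 1) X_2] + \varepsilon$

$$\Delta Y = Y_2 - Y_1 = \beta_1 + \beta_3 X_2$$

Clearly, the change in Y brought about by a change in X1 still depends on the actual value of X2.

Testing moderators - 3

After a moderator has been tested, its actual form should be shown in a plot where needed. For how to plot the form of moderation from regression results, see Aiken and West (1991).

Higher-order interactions, for example the interaction of three independent variables (X1, X2 and X3), are also tested with Hierarchical Regression, by examining $\Delta R^2$ after the product of the variables (X1*X2*X3) is added.

The one thing to watch is that before X1*X2*X3 is added in the last step, all lower-order interactions (X1*X2, X1*X3 and X2*X3) must first be included in the regression.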
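The moderator test described above (fit the model without and then with the product term, compare $R^2$, and read the effect of X1 as $\beta_1 + \beta_3 X_2$) can be sketched as follows. The data are hypothetical, the names `solve`, `fit` and `simple_slope` are invented for this sketch, and the F-change formula is the standard one for nested models rather than something given in the slides.

```python
from statistics import fmean


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit(y, columns):
    """Return (coefficients, R^2) for y regressed on an intercept plus columns."""
    rows = [[1.0, *vals] for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = fmean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1.0 - ss_res / ss_tot


# Hypothetical data: a 2 x 2 layout of X1 and X2 with two cases per cell.
x1 = [1, 1, 1, 1, 3, 3, 3, 3]
x2 = [0, 0, 2, 2, 0, 0, 2, 2]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
y = [1 + 2 * a + 0.5 * b + 1.5 * a * b + e for a, b, e in zip(x1, x2, noise)]
inter = [a * b for a, b in zip(x1, x2)]          # the product term X1*X2

_, r2_step1 = fit(y, [x1, x2])                   # equation (1)
coefs, r2_step2 = fit(y, [x1, x2, inter])        # equation (2)
delta_r2 = r2_step2 - r2_step1

# F for the change in R^2: one term added, three predictors in the full model.
n, k_full, added = len(y), 3, 1
f_change = (delta_r2 / added) / ((1.0 - r2_step2) / (n - k_full - 1))


def simple_slope(x2_value):
    """Change in Y per unit change in X1 at a given X2: b1 + b3 * X2."""
    return coefs[1] + coefs[3] * x2_value


print("delta R^2:", round(delta_r2, 4), "F change:", round(f_change, 2))
print("slope of X1 at X2 = 0:", round(simple_slope(0), 3))
print("slope of X1 at X2 = 2:", round(simple_slope(2), 3))
```

Evaluating `simple_slope` at a low and a high value of X2 gives the two lines one would draw when plotting the form of the moderation.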

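See Aiken and West (1991) for details.

Mediators

A mediator is a variable through which the independent variable affects the dependent variable. If M really is a mediator between X and Y, their relationship should be:

X → M → Y

This involves three causal conditions:
(1) X is one of the causes of M;
(2) X is one of the causes of Y;
(3) the effect of X on Y passes through M.

Evidence for mediators - 1

In the regression analysis of the sample we should see the following results (Judd & Kenny, 1981; Baron and Kenny, 1986):

(1) $M = b_{01} + b_{11} X + e_1$
(2) $Y = b_{02} + b_{21} X + e_2$
(3) $Y = b_{03} + b_{31} X + b_{32} M + e_3$

In the first equation, $b_{11}$ is used to test the relationship between M and X; the conclusion should be that $\beta_{11}$ is not zero.

In the second equation, $b_{21}$ is used to test the relationship between X and Y; the conclusion should be that $\beta_{21}$ is not zero.

In the third equation, $b_{31}$ and $b_{32}$ are used to test the effect of X on Y when M is considered at the same time; the ideal conclusion is that $\beta_{31}$ is zero but $\beta_{32}$ is not.

If all three conditions are met, we conclude that M is a mediator between X and Y.

Evidence for mediators - 2

Sometimes the conclusions for the first and second equations are supported, but in the third equation we conclude that $\beta_{31}$ and $\beta_{32}$ are both non-zero. We then have to look at the difference between $b_{21}$ and $b_{31}$, or at the difference in $R^2$ between the second and third equations.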
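The three regressions can be sketched in Python as follows, with M regressed on X in the first step as in Baron and Kenny (1986). The data are hypothetical and were built so that X affects Y only through M; the function names are invented for this sketch, and only point estimates are shown, so the significance tests the slides call for are not reproduced.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ols(y, *predictors):
    """Return [b0, b1, ...] that minimise the sum of squared residuals."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: M depends on X, and Y depends on M but not directly on X.
x = [1, 2, 3, 4, 5, 6, 7, 8]
e_m = [0.5, 0.5, -0.5, -0.5, -0.5, -0.5, 0.5, 0.5]
e_y = [-0.7, 0.5, 0.7, 0.3, -0.3, -0.7, -0.5, 0.7]
m = [0.5 + 1.5 * xi + e for xi, e in zip(x, e_m)]
y = [1 + 2 * mi + e for mi, e in zip(m, e_y)]

b01, b11 = ols(m, x)             # equation (1): M on X
b02, b21 = ols(y, x)             # equation (2): Y on X
b03, b31, b32 = ols(y, x, m)     # equation (3): Y on X and M

print("b11 (X -> M):", round(b11, 3))
print("b21 (X -> Y, M ignored):", round(b21, 3))
print("b31 (X -> Y, M included):", round(b31, 3))
print("b32 (M -> Y, X included):", round(b32, 3))
print("drop in the X coefficient, b21 - b31:", round(b21 - b31, 3))
```

In these constructed data the coefficient on X falls to zero once M enters the equation, which is the ideal pattern; with real data the drop from `b21` to `b31` is usually partial.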

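In general, if M is a mediator between X and Y, these differences should be fairly large. (MacKinnon, Lockwood, Hoffman, West, & Sheets (2002) give a very detailed summary.)

Key points to note when applying regression analysis

(1) The causal relationship between the independent and dependent variables.
(2) The measurement scales of the independent and dependent variables.
(3) Control variables.
(4) The specification of a linear relationship.
(5) Measurement error.
(6) Requirements on the data: for example, the data should be random and normally distributed. When the independent variables covary strongly with one another, the $b_i$ become very unstable, making it hard to judge which independent variable the effect on the dependent variable really comes from. This is called the multicollinearity problem.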
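One common way to put a number on how much two independent variables overlap is the variance inflation factor (VIF). The slides do not name this diagnostic; it is introduced here only as an illustration. For a model with exactly two predictors it reduces to 1/(1 − r²), where r is their correlation, and it indicates how many times the sampling variance of $b_i$ is inflated relative to uncorrelated predictors. The data and function names below are hypothetical.

```python
from math import sqrt
from statistics import fmean


def pearson_r(u, v):
    """Pearson correlation between two equally long sequences."""
    mean_u, mean_v = fmean(u), fmean(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - r^2); valid only when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2_unrelated = [7, 1, -3, -5, -5, -3, 1, 7]                   # uncorrelated with x1
x2_overlapping = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9, 7.1, 7.9]     # nearly a copy of x1

print("VIF, unrelated predictors:  ", round(vif_two_predictors(x1, x2_unrelated), 2))
print("VIF, overlapping predictors:", round(vif_two_predictors(x1, x2_overlapping), 2))
```

A VIF near 1 means the coefficients are estimated about as precisely as they could be; a very large VIF corresponds to the unstable $b_i$ described in point (6).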

Example 1: Law, Wong and Wang (2004)

(1) Research question - 1

This study examines whether, for transnational corporations (TNCs) in China that want to localize their middle and senior staff (that is, to replace expatriates with local employees), the success factors form a hierarchy in the following order: (1) the firm regards localization as an important goal; (2) how thoroughly localization is planned; (3) how fully the human resource management practices related to localization are implemented.

(1) Research question - 2

H1: The extent to which localization is regarded as an important goal of the TNC is positively related to localization success.

H2: Localization planning efforts such as top management commitment to localization and selection of appropriate expatriates are positively related to localization success.

H3: Specific human resource practices favoring the implementation of localization plans (training opportunities for local managers, performance evaluation and rewards for expatriates and local managers, and repatriation arrangements) are positively related to localization success.

H4: The TNC's localization planning efforts would explain variation in localization success over and beyond that of setting localization as an important objective.

H5: The TNC's localization-related human resources management practices would explain variation in localization success over and beyond that of setting localization as an important objective and localization planning efforts.

(2) Measurement of variables - 1

Final participants in our validation sample were 139 human resources managers of TNCs operating in Fujian Province in the PRC.

We chose TNCs from one single province in order to control for the differences in governmental regulations.

With the help of this professor in Xiamen, we sent out 180 questionnaires to current and graduated MBA students who are top or middle-level managers in TNCs in Fujian Province. These managers were asked to fill out the questionnaires themselves if they were the human resources manager of the company. They were asked to refer to their human resources manager for necessary information if they were top executives of the company. After distributing the questionnaires and one round of telephone follow-up, a total of 139 responses were received.

(2) Measurement of variables - 2

Subjective, multiple-item measures with a development sample.

Objective indicator of localization success. In addition to the four subjective questions, we added an objective indicator of localization success. Specifically, we used a ratio of the “Number of local managers occupying positions originally occupied by expatriates” to the “Total number of positions occupied by expatriates when the PRC operations started.” The numerator is a measure of actual localization success, while the denominator is a comparison base of the starting number of expatriate positions. This variable is very important because it allows us to double check the validity of the subjective indicator of localization success. Also, unless the respondents deliberately lied to us, this objective indicator can be a good dependent variable that shares little respondent bias with the independent variables.

Control variables, e.g., dummy-coded organizational type.

(3) Regression analysis - 1

Because the ordered hierarchy of the sets of independent variables is what has to be tested, the regression method is Hierarchical Regression:

In order to have a more rigorous test of the five hypotheses in this study, we used hierarchical regression analyses to identify the important determinants of localization success. Results of these analyses are shown in Table 2. Table 2 shows that the inclusion of the three controlling variables (i.e., YEAR, MANUFACTURING, and JV) is necessary because they have significant effects on the localization success. Changes in $R^2$ are .16 (p < .05) and .13 (p < .01) respectively for the objective and subjective indicator of localization success.

(3) Regression analysis - 2

As expected, localization objectives (i.e., GOAL) explained a significant portion of variance in the localization result. The increases in model $R^2$ in predicting the objective and subjective success of localization were .20 (p < .01) and .18 (p < .01) respectively.

Planning efforts for localization explained an additional significant portion of the variance of localization results on top of localization objectives. The changes in $R^2$ for the objective and subjective success of localization measures were .13 (p < .01) and .10 (p < .01) respectively. Thus, H4 is supported.

Human resources practices related to localization further explained a significant portion of the variances of the dependent variables. The changes in model $R^2$ for the objective and subjective success of localization measures as dependent variables were .07 (p < .10) and .08 (p < .05) respectively. Thus, H5 is supported.

Example 2: Wong, Wong and Law (2005)

(1) Research question

This study asks: (1) whether a job's requirement for emotional display (Emotional Labor; EL) moderates the relationship between emotional intelligence (Emotional Intelligence; EI) and job satisfaction; (2) whether the widely accepted traditional model of occupational classification (Holland's model) can represent a job's requirement for emotional display:

Hypothesis 1. EI is positively related to life satisfaction.

Hypothesis 2. EI is positively related to job satisfaction.

Hypothesis 3. The effect of EI on job satisfaction is dependent on the EL of the job. Specifically, the higher the EL of the job, the stronger would be the effects of EI on job satisfaction.

Hypothesis 4. Following Holland's model of vocational choice, the effects of EI on job satisfaction would be highest for social types of jobs. The effect sizes of the EI-job satisfaction relationship for different types of jobs follow Holland's calculus assumption.

Hypothesis 5. The effects of EI on life satisfaction are independent of the EL of the job.

(2) Measurement of variables - 1

The sample of this study came from two sources. The first source is union members of five types of job. The five jobs included bus driver (realistic), computer programmer (investigative), art designer of advertising companies (artistic), shop manager of retailing shops (enterprising), and clerks (conventional). A total of 300 questionnaires were given to the union and 218 valid responses were returned, representing a response rate of 72.7%. However, since there are no social jobs in the union, our second sample source was teachers of two secondary schools. One hundred and ten questionnaires were sent to all the teachers of two schools and 89 valid responses were returned, representing a response rate of 80.9%. Thus, the final sample consisted of 307 respondents (46 bus drivers, 103 clerks, 17 computer programmers, 9 art designers, 43 shop managers, and 89 secondary school teachers).

(2) Measurement of variables - 2

Proxy of emotional labor by Holland's occupational model. To test the importance of EI in various occupational types, we created a second proxy measure of emotional labor according to Holland's (RIASEC) model. As argued before, social type of jobs would probably have the highest level of emotional labor because these jobs have the greatest requirement of social interaction. Following the calculus assumption of Holland's (RIASEC) model, the order of emotional labor will thus be social, its adjacent types (i.e., artistic and enterprising), its alternate types (i.e., investigative and conventional), and its opposite type (i.e., realistic). Thus, this proxy measure of emotional labor was coded as follows: the social type (i.e., secondary school teachers) was coded as 4, its adjacent types (i.e., art designers and shop managers) were coded as 3, the alternate types (i.e., computer programmers and clerks) were coded as 2, and the opposite type (i.e., bus drivers) was coded as 1.

(3) Regression analysis

Hierarchical regression was conducted to test the main effect of EI and the interaction effect between EI and emotional labor on job satisfaction and life satisfaction. Specifically, the control variables, EI and emotional labor were entered into the regression equation first. The product term of EI and emotional labor was entered in the last step to examine the significance of the change in R-square. To test for the utility of Holland's occupational model in predicting the differential importance of EI in various occupations, the proxy measure of emotional labor derived from Holland's model was used to replace the emotional labor measure in the hierarchical regression. Finally, to check that this result applies only to job-related criteria, life satisfaction was used as the dependent variable to test for the interaction effect of EI and emotional labor.

SPSS commands for regression analysis - 1

The SPSS command for analysing all independent variables in the regression equation at the same time is (for example, with X1 and X2 as independent variables and Y as the dependent variable):

    regression vars=Y X1 X2 /dep=Y/method=enter.

SPSS commands for regression analysis - 2

The SPSS command for Hierarchical Regression is (for example, entering X1 first and then X2 to see its $\Delta R^2$):

    regression vars=Y X1 X2/statistics coeff outs r cha/criteria=pin(.05) pout(.10)/noorigin/dep=Y/method=enter X1/method=enter X2.

SPSS commands for regression analysis - 3

The SPSS commands for testing the interaction of X1 and X2 are (look at the $\Delta R^2$ after X1*X2 is added):

    compute INTER=X1*X2.
    regression vars=Y X1 X2 INTER /statistics coeff outs r cha/criteria=pin(.05) pout(.10)/noorigin/dep=Y/method=enter X1 X2/method=enter INTER.

SPSS commands for regression analysis - 4

When the dependent variable is a dichotomous nominal variable, the SPSS command for Logistic Regression is:

    logistic regression vars=Y with X1 X2.
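For readers working outside SPSS, the blockwise entry requested by successive `/method=enter` subcommands, together with the (n − 1) dummy coding described earlier for company type, can be sketched in plain Python. This is an illustrative sketch only: `dummy_code`, `hierarchical` and the other names are invented here, the data are hypothetical and noise-free, and the output is not meant to replicate SPSS tables or its significance tests.

```python
from statistics import fmean


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def r_squared(y, columns):
    """R^2 of y regressed on an intercept plus the given predictor columns."""
    rows = [[1.0, *vals] for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = fmean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def dummy_code(labels, reference):
    """Return one 0/1 column per category except the reference (n - 1 dummies)."""
    levels = [lv for lv in dict.fromkeys(labels) if lv != reference]
    return {lv: [1 if lab == lv else 0 for lab in labels] for lv in levels}


def hierarchical(y, blocks):
    """Enter blocks of columns cumulatively; return (R^2, delta R^2) per step."""
    entered, previous, steps = [], 0.0, []
    for block in blocks:
        entered.extend(block)
        r2 = r_squared(y, entered)
        steps.append((r2, r2 - previous))
        previous = r2
    return steps


# Hypothetical data: X1 plus a three-category company type (WOFE as reference).
firm_type = ["SOE", "SOE", "JV", "JV", "WOFE", "WOFE"]
x1 = [1, 2, 1, 3, 2, 3]
dummies = dummy_code(firm_type, reference="WOFE")    # D1 = SOE, D2 = JV
y = [1 + 2 * a + 3 * d1 - d2
     for a, d1, d2 in zip(x1, dummies["SOE"], dummies["JV"])]

# Step 1 enters X1; step 2 adds the two dummy variables as one block.
steps = hierarchical(y, [[x1], [dummies["SOE"], dummies["JV"]]])
for number, (r2, delta) in enumerate(steps, start=1):
    print(f"step {number}: R^2 = {r2:.4f}, delta R^2 = {delta:.4f}")
```

Entering both dummy variables in a single block mirrors the idea that company type is tested as a whole, through the change in $R^2$, rather than one dummy at a time.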
