Copyright @2020. All Rights Reserved.
Explanations and References
The references on this page include the text book for the Javascript algorithm, and a web based teaching article on multiple regression. Other than these, it is assumed that users coming to this page will have other access to information and advice on the subject. The explanations and discussions that follow are therefore intended to help users follow the procedures and interpret the results, and are not meant to be teaching or authoritative in nature.
Example used on this page
The example develops a model to predict the birth weight of babies using multiple regression. The independent (predictor) variables are maternal age in years (Mage), maternal height in cm (Mht), gestational age in weeks since the beginning of pregnancy (Gest), and sex of the baby (Sex, 0 for boys and 1 for girls). The dependent variable is birth weight in grams (Bwt). This exercise uses 22 cases, and the data are listed in Part 1 of the program on this page.
Please note that the Javascript program on this page accepts any number of variables (columns of data), but designates the rightmost column as the dependent variable. Please also note that many other variables are related to birth weight and are included in widely published models. Only 4 are included here, to demonstrate the procedures and results.
Correlation analysis
The partial correlation coefficient reflects the correlation between 2 variables after correcting for their correlations with the other variables in the data set. A large difference between the partial and non-partial coefficients therefore indicates possible excessive overlap between measurements.
Multiple Regression
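As a rough illustration of the partial correlation described above, and of how it connects to multiple regression, the sketch below computes both the non-partial and the partial correlation matrices for the example data. It uses the standard inverse correlation matrix approach, which is an assumption of this sketch and not necessarily the algorithm used by the Javascript program on this page. The names R and partialR are chosen here for illustration only.

```r
# Birth weight data from the example on this page
df <- data.frame(
  Mage = c(24,29,29,21,35,27,26,34,25,28,32,31,26,21,21,24,34,25,26,27,27,21),
  Mht  = c(170,161,167,165,168,161,163,167,165,170,167,169,161,165,166,164,169,161,167,162,160,167),
  Gest = c(37,36,41,36,35,39,40,37,35,39,37,37,36,38,41,38,38,41,40,33,38,39),
  Sex  = c(1,0,1,1,0,0,1,0,1,0,1,0,1,0,1,0,0,0,0,1,1,1),
  Bwt  = c(3048,2813,3622,2706,2581,3442,3453,3172,2386,3555,3029,3185,2670,3314,3596,3312,3414,3667,3643,1398,3135,3366)
)

# Non-partial (ordinary Pearson) correlation matrix of all five variables
R <- cor(df)

# Partial correlations: invert the correlation matrix, rescale it to a unit
# diagonal with cov2cor(), then flip the sign of the off-diagonal entries
partialR <- -cov2cor(solve(R))
diag(partialR) <- 1

round(R, 3)         # non-partial coefficients
round(partialR, 3)  # each pair corrected for the other three variables
```

Comparing the two printed matrices side by side shows which pairs of variables change most once the other variables are taken into account. The partial correlation of each predictor with Bwt carries the same information as that predictor's t value in the multiple regression of Bwt on all four predictors.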
Multiple Standardized Regression
Multiple standardized regression is the same as multiple regression, except that all measurements are standardized to the Standard Deviation unit z, where z = (value - mean) / SD. The coefficients produced are therefore on the same scale, making the structure of relationships between variables easier to visualize.
Each partial standardized regression coefficient (β) represents the change in the dependent variable (y), in number of SDs, for each 1 SD change in the independent variable. The differences between the βs also reflect their relative influence on the dependent variable.
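Because standardization only rescales each variable, a standardized coefficient can also be obtained from the ordinary (unstandardized) coefficient b as β = b × SD(x) / SD(y). The following is a minimal sketch of that conversion using the example data. The object names fit, b, sdX and beta are chosen here for illustration and do not appear elsewhere on this page.

```r
# Birth weight data from the example on this page
df <- data.frame(
  Mage = c(24,29,29,21,35,27,26,34,25,28,32,31,26,21,21,24,34,25,26,27,27,21),
  Mht  = c(170,161,167,165,168,161,163,167,165,170,167,169,161,165,166,164,169,161,167,162,160,167),
  Gest = c(37,36,41,36,35,39,40,37,35,39,37,37,36,38,41,38,38,41,40,33,38,39),
  Sex  = c(1,0,1,1,0,0,1,0,1,0,1,0,1,0,1,0,0,0,0,1,1,1),
  Bwt  = c(3048,2813,3622,2706,2581,3442,3453,3172,2386,3555,3029,3185,2670,3314,3596,3312,3414,3667,3643,1398,3135,3366)
)

# Ordinary multiple regression on the original measurement scales
fit <- lm(Bwt ~ Mage + Mht + Gest + Sex, data = df)

# Unstandardized coefficients, dropping the intercept
b <- coef(fit)[-1]

# SD of each predictor, in the same order as the coefficients
sdX <- sapply(df[names(b)], sd)

# Standardized coefficients: beta = b * SD(x) / SD(y)
beta <- b * sdX / sd(df$Bwt)
round(beta, 4)
```

The values printed should agree, apart from rounding, with the coefficients from a regression run directly on the z values.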
References
Steel RGD, Torrie JH, Dickey DA (1997) Principles and procedures of statistics. A biomedical approach. 3rd Ed. McGraw-Hill Inc New York NY 10020 ISBN 0-07-061028-2 p. 322-351
https://wiki.gis.com/wiki/index.php/Multiple_Regression A detailed explanation of multiple regression, available online
Data Entry
The following is a single R program, divided into parts so that it is easier to follow.
Part 1: Data entry
# Data entry to dataframe
myDat = ("
Mage Mht Gest Sex Bwt
24 170 37 1 3048
29 161 36 0 2813
29 167 41 1 3622
21 165 36 1 2706
35 168 35 0 2581
27 161 39 0 3442
26 163 40 1 3453
34 167 37 0 3172
25 165 35 1 2386
28 170 39 0 3555
32 167 37 1 3029
31 169 37 0 3185
26 161 36 1 2670
21 165 38 0 3314
21 166 41 1 3596
24 164 38 0 3312
34 169 38 0 3414
25 161 41 0 3667
26 167 40 0 3643
27 162 33 1 1398
27 160 38 1 3135
21 167 39 1 3366 ")
df <- read.table(textConnection(myDat),header=TRUE)
#summary(df) # optional display of input data
Part 2: Means and Standard Deviations
# mean and SD
meanMage = mean(df$Mage)
sdMage = sd(df$Mage)
meanMht = mean(df$Mht)
sdMht = sd(df$Mht)
meanGest = mean(df$Gest)
sdGest = sd(df$Gest)
meanSex = mean(df$Sex)
sdSex = sd(df$Sex)
meanBwt = mean(df$Bwt)
sdBwt = sd(df$Bwt)
# show means and Sds
c(meanMage,sdMage)
c(meanMht,sdMht)
c(meanGest,sdGest)
c(meanSex,sdSex)
c(meanBwt,sdBwt)
The means and SD results are as follows
> # show means and Sds
> c(meanMage,sdMage)
[1] 26.954545 4.281491
> c(meanMht,sdMht)
[1] 165.227273 3.191235
> c(meanGest,sdGest)
[1] 37.772727 2.136571
> c(meanSex,sdSex)
[1] 0.5000000 0.5117663
> c(meanBwt,sdBwt)
[1] 3113.955 532.697
Part 3: Multiple regression
RegRes<-lm(Bwt~Mage+Mht+Gest+Sex,data=df) # Multiple regression
summary(RegRes) # show multiple regression results
The results are as follows
> summary(RegRes) # show multiple regression results
Call:
lm(formula = Bwt ~ Mage + Mht + Gest + Sex, data = df)
Residuals:
Min 1Q Median 3Q Max
-469.89 -86.47 46.49 84.17 198.44
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -9165.476 1945.069 -4.712 0.000201 ***
Mage 1.701 9.864 0.172 0.865124
Mht 23.649 11.724 2.017 0.059759 .
Gest 223.194 17.920 12.455 5.68e-10 ***
Sex -209.150 77.511 -2.698 0.015228 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 163.7 on 17 degrees of freedom
Multiple R-squared: 0.9236, Adjusted R-squared: 0.9056
F-statistic: 51.36 on 4 and 17 DF, p-value: 2.849e-09
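Since the purpose of the example is to predict birth weight, the fitted model can be applied to a new case. The sketch below refits the same model and predicts Bwt for one hypothetical mother and baby. The values in newCase are invented for illustration only and are not part of the data on this page. The prediction is also recomputed by hand from the coefficients to show what predict() is doing.

```r
# Birth weight data from the example on this page
df <- data.frame(
  Mage = c(24,29,29,21,35,27,26,34,25,28,32,31,26,21,21,24,34,25,26,27,27,21),
  Mht  = c(170,161,167,165,168,161,163,167,165,170,167,169,161,165,166,164,169,161,167,162,160,167),
  Gest = c(37,36,41,36,35,39,40,37,35,39,37,37,36,38,41,38,38,41,40,33,38,39),
  Sex  = c(1,0,1,1,0,0,1,0,1,0,1,0,1,0,1,0,0,0,0,1,1,1),
  Bwt  = c(3048,2813,3622,2706,2581,3442,3453,3172,2386,3555,3029,3185,2670,3314,3596,3312,3414,3667,3643,1398,3135,3366)
)

# Same model as in Part 3
RegRes <- lm(Bwt ~ Mage + Mht + Gest + Sex, data = df)

# Hypothetical new case (illustrative values only): mother aged 28,
# height 165 cm, gestation 39 weeks, baby is a boy (Sex = 0)
newCase <- data.frame(Mage = 28, Mht = 165, Gest = 39, Sex = 0)

# Predicted birth weight in grams
predBwt <- predict(RegRes, newdata = newCase)

# The same value by hand: intercept plus coefficient times value, summed
manual <- sum(coef(RegRes) * c(1, 28, 165, 39, 0))

c(predicted = unname(predBwt), manual = manual)
```

With only 22 cases and 4 predictors, such a prediction is a demonstration of the procedure and not a clinically usable estimate.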
Part 4: Repeat Multiple regression using Standardized values
Standardized values z = (value-mean) / SD
Part 4a: Create standardized z values
# standardization
# create z variables
df$ZMage <- (df$Mage - meanMage) / sdMage
df$ZMht <- (df$Mht - meanMht) / sdMht
df$ZGest <- (df$Gest - meanGest) / sdGest
df$ZSex <- (df$Sex - meanSex) / sdSex
df$ZBwt <- (df$Bwt - meanBwt) / sdBwt
Part 4b: Standardized multiple regression using z values
RegZRes<-lm(ZBwt~ZMage+ZMht+ZGest+ZSex,data=df) # Multiple regression
summary(RegZRes) # show multiple regression results
The results are as follows. For all variables mean = 0 and SD = 1
> summary(RegZRes) # show multiple regression results
Call:
lm(formula = ZBwt ~ ZMage + ZMht + ZGest + ZSex, data = df)
Residuals:
Min 1Q Median 3Q Max
-0.88209 -0.16233 0.08728 0.15800 0.37252
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.267e-16 6.551e-02 0.000 1.0000
ZMage 1.367e-02 7.928e-02 0.172 0.8651
ZMht 1.417e-01 7.024e-02 2.017 0.0598 .
ZGest 8.952e-01 7.188e-02 12.455 5.68e-10 ***
ZSex -2.009e-01 7.447e-02 -2.698 0.0152 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.3073 on 17 degrees of freedom
Multiple R-squared: 0.9236, Adjusted R-squared: 0.9056
F-statistic: 51.36 on 4 and 17 DF, p-value: 2.849e-09
To make the coefficients easier to read, they are translated from scientific notation to fixed decimals as follows
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0 0.0655 0.0000 1.0000
ZMage 0.0137 0.0793 0.1720 0.8651
ZMht 0.1417 0.0702 2.0170 0.0598 .
ZGest 0.8952 0.0719 12.4550 <0.0001 ***
ZSex -0.2009 0.0745 -2.6980 0.0152 *
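A rounded coefficient table like the one above can also be produced directly in R. The sketch below is one possible shortcut, not the procedure used on this page: it standardizes every column with scale(), which uses the same (value - mean) / SD formula, and then rounds the coefficient table from summary() to 4 decimals. Note that in this version the standardized columns keep their original names (Mage, Mht and so on) and are not renamed to ZMage, ZMht and so on. The names zdf and coefTable are chosen here for illustration.

```r
# Birth weight data from the example on this page
df <- data.frame(
  Mage = c(24,29,29,21,35,27,26,34,25,28,32,31,26,21,21,24,34,25,26,27,27,21),
  Mht  = c(170,161,167,165,168,161,163,167,165,170,167,169,161,165,166,164,169,161,167,162,160,167),
  Gest = c(37,36,41,36,35,39,40,37,35,39,37,37,36,38,41,38,38,41,40,33,38,39),
  Sex  = c(1,0,1,1,0,0,1,0,1,0,1,0,1,0,1,0,0,0,0,1,1,1),
  Bwt  = c(3048,2813,3622,2706,2581,3442,3453,3172,2386,3555,3029,3185,2670,3314,3596,3312,3414,3667,3643,1398,3135,3366)
)

# Standardize every column to z = (value - mean) / SD in one step
zdf <- as.data.frame(scale(df))

# Standardized multiple regression on the z values
RegZRes <- lm(Bwt ~ Mage + Mht + Gest + Sex, data = zdf)

# Coefficient table rounded to 4 decimals; very small p values
# (such as the one for Gest) are shown as 0 after rounding
coefTable <- round(coef(summary(RegZRes)), 4)
coefTable
```

The significance stars are not part of the coefficient matrix, so they do not appear in this rounded table.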