Inconsistent results between glm() in R and manual implementation of logistic regression in Excel
You'll find a manual implementation of logistic regression in Excel at: http://blog.excelmasterseries.com/2014/06/logistic-regression-performed-in-excel.html.
This implementation uses the dataset below and reports the following coefficients:
b0 = 12.48285608
b1 = -0.117031374
b2 = -1.469140055
However, when I analyze the same dataset with glm() in R, the results are not the same, i.e.:
b0 = 1.687445
b1 = -0.012525
b2 = -0.116473
    d <- structure(list(
      y = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
      x1 = c(78L, 73L, 73L, 71L, 68L, 59L, 57L, 49L, 35L, 27L, 59L, 57L, 44L, 38L, 36L, 36L, 22L, 22L, 15L, 10L),
      x2 = c(8L, 8L, 5L, 7L, 5L, 4L, 7L, 5L, 4L, 7L, 3L, 4L, 5L, 5L, 4L, 2L, 6L, 5L, 4L, 6L)),
      .Names = c("y", "x1", "x2"), class = "data.frame", row.names = c(NA, -20L))

    summary(glm(y ~ x1+x2, data=d), family=binomial(link='logit'))

    # > summary(glm(y ~ x1+x2, data=d), family=binomial(link='logit'))
    #
    # Call:
    # glm(formula = y ~ x1 + x2, data = d)
    #
    # Deviance Residuals:
    #      Min        1Q    Median        3Q       Max
    # -0.78318  -0.20641   0.07689   0.24375   0.49237
    #
    # Coefficients:
    #              Estimate Std. Error t value Pr(>|t|)
    # (Intercept)  1.687445   0.319872   5.275 6.18e-05 ***
    # x1          -0.012525   0.004376  -2.862   0.0108 *
    # x2          -0.116473   0.056959  -2.045   0.0567 .
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    #
    # (Dispersion parameter for gaussian family taken to be 0.146843)
    #
    #     Null deviance: 5.0000  on 19  degrees of freedom
    # Residual deviance: 2.4963  on 17  degrees of freedom
    # AIC: 23.139
    #
    # Number of Fisher Scoring iterations: 2

Why do the results differ?
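You have the family parameter in the wrong place. It should be in the glm() call, not the summary() call:

    summary(glm(y ~ x1+x2, data=d, family=binomial(link='logit')))

If you don't include family in glm(), you get a gaussian (linear) regression, because gaussian is the default family. summary() silently ignores the extra family argument, which is why the output in the question reports a dispersion parameter for the gaussian family.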
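To see the difference side by side, here is a minimal sketch that fits both models on the question's data and checks which family each one actually used. The names fit_gauss and fit_logit are made up for illustration, and the data frame is rebuilt with data.frame() rather than structure() purely for readability.

```r
# Same values as the question's dataset, rebuilt with data.frame().
d <- data.frame(
  y  = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  x1 = c(78, 73, 73, 71, 68, 59, 57, 49, 35, 27,
         59, 57, 44, 38, 36, 36, 22, 22, 15, 10),
  x2 = c(8, 8, 5, 7, 5, 4, 7, 5, 4, 7, 3, 4, 5, 5, 4, 2, 6, 5, 4, 6)
)

# family omitted: glm() falls back to its default, gaussian().
fit_gauss <- glm(y ~ x1 + x2, data = d)

# family passed to glm() itself: this is the logistic regression.
fit_logit <- glm(y ~ x1 + x2, data = d, family = binomial(link = "logit"))

# Ask each fitted model which family it was estimated with.
family(fit_gauss)$family  # "gaussian"
family(fit_logit)$family  # "binomial"

# The gaussian fit is ordinary least squares, so it should agree with lm().
all.equal(coef(fit_gauss), coef(lm(y ~ x1 + x2, data = d)))

# Coefficients on the logit scale, to compare with the Excel results.
coef(fit_logit)
```

Calling family() on a fitted model is a quick sanity check whenever glm() output looks suspiciously like a linear regression, for example when the summary shows t values and a gaussian dispersion parameter.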