




Logistic Regression



Logistic regression predicts the probability of an
outcome that can take only two values (i.e., a dichotomy). The prediction is based on one or more predictors
(numerical and categorical). Linear regression is not appropriate for predicting the value of a binary variable for two reasons:





 A linear regression will predict values outside
the acceptable range (e.g. predicting probabilities
outside the range 0 to 1)
 Since each dichotomous experiment can only have
one of two possible values, the residuals will not
be normally distributed about the predicted line.
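
The first reason can be illustrated with a minimal sketch. It fits an ordinary least squares line to a small binary data set and evaluates the line at the smallest and largest x. The data values and the helper name fit_line are assumptions made up for this illustration.

```python
# Hypothetical toy data: one numerical predictor and a binary (0/1) target.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

a, b = fit_line(xs, ys)
low = a + b * 1    # prediction at the smallest x
high = a + b * 8   # prediction at the largest x
print(low, high)   # one value falls below 0, the other above 1
```

Neither prediction can be read as a probability, which is the problem the logistic curve avoids.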



On the other hand, a logistic regression produces a logistic
curve, which is limited to values between 0 and 1. Logistic regression is
similar to a linear regression, but the curve is constructed using the
natural logarithm of the “odds” of the target variable, rather than
the probability. Moreover, the predictors do not have to be normally
distributed or have equal variance in each group. 
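
As a small sketch of why the logistic curve stays between 0 and 1, the function below evaluates the standard logistic function at a few inputs. The function name logistic and the chosen inputs are illustrative assumptions.

```python
import math

def logistic(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Large negative inputs approach 0, large positive inputs approach 1,
# and an input of 0 gives exactly 0.5.
for z in [-10, -2, 0, 2, 10]:
    print(z, logistic(z))
```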





The logistic regression equation is p = 1 / (1 + e^{-(b_{0} + b_{1}x)}). In this equation the constant (b_{0})
moves the curve left and right and the slope (b_{1})
defines the steepness of the curve. By simple transformation, the logistic regression equation can be written in terms of the odds: p / (1 - p) = e^{b_{0} + b_{1}x}.





Finally, taking the natural log of both sides, we can write the equation in terms of the
log-odds (logit), which is a linear function of the predictors: ln(p / (1 - p)) = b_{0} + b_{1}x. The coefficient (b_{1})
is the amount the logit (log-odds) changes with a one-unit change in x.
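
The relationship between probability, odds and logit can be checked numerically. The sketch below uses hypothetical coefficients (b0 = -3.0 and b1 = 0.5 are assumptions chosen only for illustration) and shows that the logit is linear in x, so a one-unit change in x changes it by b1.

```python
import math

b0, b1 = -3.0, 0.5  # hypothetical coefficients, chosen only for illustration

def probability(x):
    """Predicted probability from the logistic equation."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def odds(x):
    """Odds p / (1 - p) of the outcome at x."""
    p = probability(x)
    return p / (1.0 - p)

def logit(x):
    """Natural log of the odds; equals b0 + b1 * x up to rounding."""
    return math.log(odds(x))

midpoint = -b0 / b1                 # x at which the predicted probability is 0.5
print(probability(midpoint))
print(logit(5) - logit(4))          # approximately b1
```

Changing b0 moves the midpoint -b0 / b1 along the x axis, while a larger b1 makes the curve steeper around it.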





As mentioned before, logistic
regression can handle any number of numerical and/or categorical variables: p = 1 / (1 + e^{-(b_{0} + b_{1}x_{1} + b_{2}x_{2} + ... + b_{k}x_{k})}).
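
A minimal sketch of a prediction with one numerical and one categorical predictor follows. The predictor names (age, colour), the coefficient values and the dummy coding against a baseline category are all assumptions for illustration, not values from a fitted model.

```python
import math

# Hypothetical coefficients: an intercept, one numerical predictor (age) and
# one categorical predictor (colour) dummy-coded against the baseline "red".
intercept = -2.0
coef_age = 0.05
coef_colour = {"red": 0.0, "green": 0.8, "blue": -0.4}

def predict_probability(age, colour):
    """Sum the linear terms, then pass the result through the logistic function."""
    z = intercept + coef_age * age + coef_colour[colour]
    return 1.0 / (1.0 + math.exp(-z))

for colour in ["red", "green", "blue"]:
    print(colour, predict_probability(40, colour))
```

Each category other than the baseline contributes its own coefficient to the linear term, which is how a categorical variable enters the same equation as a numerical one.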





There are several analogies between linear regression and logistic regression. Just as
ordinary least squares regression is the method used to estimate coefficients for the best-fit line in linear regression, logistic regression
uses maximum likelihood estimation (MLE) to obtain the model coefficients that relate
predictors to the target. MLE is iterative: after an initial function is estimated, the process is repeated until the LL (log-likelihood) does not change significantly.
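
The iterative estimation can be sketched for a single predictor. The example below uses Newton-Raphson updates, which is one common way to maximize the log-likelihood and is an assumption here, since the text does not name a specific algorithm. The data, the tolerance and the function names are likewise made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Sum of y*ln(p) + (1 - y)*ln(1 - p) over all records."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        ll += y * math.log(p) + (1 - y) * math.log(1 - p)
    return ll

def fit_logistic(xs, ys, tol=1e-8, max_iter=50):
    """Newton-Raphson updates until the log-likelihood stops changing."""
    b0, b1 = 0.0, 0.0                      # initial estimate
    ll_old = log_likelihood(b0, b1, xs, ys)
    history = [ll_old]
    for _ in range(max_iter):
        g0 = g1 = 0.0                      # gradient of LL
        h00 = h01 = h11 = 0.0              # entries of X^T W X
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X^T W X) * delta = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
        ll_new = log_likelihood(b0, b1, xs, ys)
        history.append(ll_new)
        if abs(ll_new - ll_old) < tol:     # LL no longer changes significantly
            break
        ll_old = ll_new
    return b0, b1, history

# Hypothetical, non-separable toy data.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1, history = fit_logistic(xs, ys)
print(b0, b1, len(history) - 1)
```

The stopping rule mirrors the description above: the loop ends once the change in LL between two iterations falls below the tolerance.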











A pseudo R^{2} value is also available to indicate
the adequacy of the regression model. The likelihood ratio test assesses the significance of the difference between the log-likelihood of the full model and the log-likelihood of a reduced (baseline) model.
This difference (multiplied by 2) is called the "model chi-square". The Wald test
is used to test the statistical significance of each coefficient (b) in the
model (i.e., each predictor's contribution).






Pseudo R^{2}



There are several measures intended to mimic the R^{2}
analysis to evaluate the goodness-of-fit of logistic
models, but they cannot be interpreted as one would interpret an
R^{2}, and different pseudo R^{2} measures can arrive at very different values.
Here we discuss three pseudo R^{2} measures.





| Pseudo R^{2} | Equation | Description |
| --- | --- | --- |
| Efron's | R^{2} = 1 - \sum_{i}(y_{i} - p_{i})^{2} / \sum_{i}(y_{i} - \bar{y})^{2} | 'p' is the logistic model predicted probability. The model residuals are squared, summed, and divided by the total variability in the dependent variable. |
| McFadden's | R^{2} = 1 - LL_{full} / LL_{intercept} | The ratio of the log-likelihoods suggests the level of improvement over the intercept model offered by the full model. |
| Count | R^{2} = (number of records correctly predicted) / (total count) | The number of records correctly predicted, given a cutoff point of 0.5, divided by the total count of cases. This is equal to the accuracy of a classification model. |
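
The three measures can be computed directly from the observed outcomes and the predicted probabilities. In the sketch below the values of y and p are hypothetical (they are not the output of a fitted model), and the intercept model is assumed to predict the mean of y for every record.

```python
import math

# Hypothetical observed outcomes and model-predicted probabilities.
y = [0, 0, 1, 0, 1, 1]
p = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]

def log_likelihood(y, p):
    """Sum of y*ln(p) + (1 - y)*ln(1 - p) over all records."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

y_bar = sum(y) / len(y)

# Efron's: 1 minus squared residuals over total variability in y.
efron = 1 - (sum((yi - pi) ** 2 for yi, pi in zip(y, p))
             / sum((yi - y_bar) ** 2 for yi in y))

# McFadden's: 1 minus the ratio of full to intercept-only log-likelihood.
ll_full = log_likelihood(y, p)
ll_intercept = log_likelihood(y, [y_bar] * len(y))
mcfadden = 1 - ll_full / ll_intercept

# Count: share of records classified correctly at a 0.5 cutoff.
count = sum((pi >= 0.5) == (yi == 1) for yi, pi in zip(y, p)) / len(y)

print(efron, mcfadden, count)
```

On this toy data the three measures give noticeably different values, which is the caution raised above about comparing pseudo R^{2} measures.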






Likelihood Ratio Test 


The likelihood ratio test provides the means for comparing the likelihood of the data under one
model (e.g., full model) against the likelihood of the data under another, more restricted
model (e.g., intercept model).






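The log-likelihood of a model is LL = \sum_{i} [y_{i} ln(p_{i}) + (1 - y_{i}) ln(1 - p_{i})], where 'p'
is the logistic model predicted probability and 'y' is the observed outcome (0 or 1). The next step is to calculate the difference between these two
log-likelihoods.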





The difference between the two log-likelihoods is multiplied by a factor of 2
so that it can be assessed for statistical significance using standard significance
levels (Chi^{2} test): 2(LL_{full} - LL_{intercept}). The degrees of freedom for the test will
equal the difference in the number of parameters being estimated under the
models (e.g., full and intercept).
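
A worked sketch of the test for a model with a single predictor follows, so the difference in parameters (and hence the degrees of freedom) is 1. The outcomes and predicted probabilities are hypothetical stand-ins for a fitted model's output, and the p-value uses the identity that a chi-square survival probability with 1 degree of freedom equals erfc(sqrt(x / 2)).

```python
import math

# Hypothetical observed outcomes and full-model predicted probabilities.
y = [0, 0, 1, 0, 1, 1]
p_full = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]

def log_likelihood(y, p):
    """Sum of y*ln(p) + (1 - y)*ln(1 - p) over all records."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

# The intercept model predicts the overall proportion of 1s for every record.
y_bar = sum(y) / len(y)
ll_full = log_likelihood(y, p_full)
ll_intercept = log_likelihood(y, [y_bar] * len(y))

# Likelihood ratio statistic: twice the difference in log-likelihoods.
lr_stat = 2 * (ll_full - ll_intercept)

# Chi-square p-value, valid only for 1 degree of freedom.
p_value = math.erfc(math.sqrt(lr_stat / 2))

print(lr_stat, p_value)
```

For more than one degree of freedom the erfc shortcut no longer applies and a general chi-square distribution function would be needed.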





Wald Test


A Wald test is used to evaluate the statistical significance of each coefficient
(b) in the model. 





W = b / SE, where W
is the Wald statistic with a normal distribution (like a Z-test), b is
the coefficient and SE is its standard
error. The W value is then squared, yielding a Wald statistic with a chi-square distribution.
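
The calculation is short enough to sketch by hand. The coefficient and standard error below are hypothetical values, not estimates from a real model, and the p-value assumes the statistic follows a standard normal distribution as described above.

```python
import math

b = 0.6     # hypothetical coefficient
se = 0.25   # hypothetical standard error of the coefficient

w = b / se            # Wald statistic on the normal (Z) scale
w_squared = w ** 2    # Wald statistic on the chi-square scale (1 degree of freedom)

# Two-sided p-value from the standard normal distribution; the chi-square
# test on w_squared with 1 degree of freedom gives the same value.
p_value = math.erfc(abs(w) / math.sqrt(2))

print(w, w_squared, p_value)
```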











Predictors Contributions 


The Wald test is usually used to assess the significance of each predictor's contribution.
Another indicator of the contribution of a predictor is exp(b),
the odds ratio of the coefficient, which is the factor by which the odds are multiplied with a one-unit change in
the predictor (x). The coefficient b itself is the amount the logit (log-odds) changes.
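
The meaning of exp(b) can be verified numerically. The sketch below uses hypothetical coefficients (b0 = -1.0 and b1 = 0.7 are assumptions for illustration) and compares the odds at x + 1 with the odds at x.

```python
import math

b0, b1 = -1.0, 0.7  # hypothetical coefficients, chosen only for illustration

def odds(x):
    """Odds p / (1 - p) implied by the logistic model at x."""
    p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
    return p / (1.0 - p)

odds_ratio = math.exp(b1)

# The ratio of odds for a one-unit increase in x is the same at every x.
ratio_at_2 = odds(3) / odds(2)
ratio_at_5 = odds(6) / odds(5)

print(odds_ratio, ratio_at_2, ratio_at_5)
```

An odds ratio above 1 means the odds of the outcome increase as the predictor increases, and a value below 1 means they decrease.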















