LDA

Linear Discriminant Analysis

Linear Discriminant Analysis (LDA) is a classification method originally developed in 1936 by R. A. Fisher. It is simple, mathematically robust, and often produces models whose accuracy is as good as that of more complex methods.

Algorithm

LDA is based upon the concept of searching for a linear combination of variables (predictors) that best separates two classes (targets). To capture the notion of separability, Fisher defined the following score function.
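One standard way to write Fisher's criterion is sketched below. The notation is an assumption of this sketch: β is the vector of linear coefficients, x the predictor vector, μ_1 and μ_2 the class mean vectors, and C the pooled covariance matrix. The numerator measures how far apart the projected class means are; the denominator is the variance of the projected scores within the groups.

```latex
Z = \beta^{T} x = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_d x_d

S(\beta) = \frac{\left(\beta^{T}\mu_1 - \beta^{T}\mu_2\right)^{2}}{\beta^{T} C\, \beta}
```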

Given the score function, the problem is to estimate the linear coefficients that maximize the score, which can be solved by the following equations.
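The usual closed-form solution is sketched below, assuming n_1 and n_2 observations in the two classes, class covariance matrices C_1 and C_2, and class mean vectors μ_1 and μ_2 (all notation assumed here). The pooled covariance is weighted by class size in this sketch; a variant that weights by n_k - 1 is also common.

```latex
\beta = C^{-1}\left(\mu_1 - \mu_2\right)

C = \frac{1}{n_1 + n_2}\left(n_1 C_1 + n_2 C_2\right)
```

Any positive multiple of β gives the same separating direction, so only the direction is determined by the maximization.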

One way of assessing the effectiveness of the discrimination is to calculate the Mahalanobis distance between the two groups. A distance greater than 3 means that the two group averages differ by more than 3 standard deviations, so the overlap (probability of misclassification) is quite small.
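A minimal sketch of this distance in Python, assuming NumPy and a covariance matrix shared by both groups. The function name and the input values are illustrative, not taken from the text. The squared distance is (μ_1 - μ_2)^T C^{-1} (μ_1 - μ_2).

```python
import numpy as np

def mahalanobis_between_groups(mu1, mu2, pooled_cov):
    """Mahalanobis distance between two class means under a shared covariance."""
    diff = np.asarray(mu1, dtype=float) - np.asarray(mu2, dtype=float)
    # Solve C * beta = diff instead of forming the inverse explicitly.
    beta = np.linalg.solve(np.asarray(pooled_cov, dtype=float), diff)
    return float(np.sqrt(diff @ beta))

# Illustrative values only: with an identity covariance the result is
# the ordinary Euclidean distance between the two means.
distance = mahalanobis_between_groups([0.0, 0.0], [3.0, 4.0], np.eye(2))
```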

Finally, a new point is classified by projecting it onto the maximally separating direction and classifying it as C1 if:
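Under the usual LDA assumptions (Gaussian classes sharing one covariance matrix), the condition takes the form sketched below, where x is the new point, β the coefficient vector, μ_1 and μ_2 the class means, and P(c_1), P(c_2) the class prior probabilities (notation assumed here).

```latex
\beta^{T}\left(x - \frac{\mu_1 + \mu_2}{2}\right) > -\log\frac{P(c_1)}{P(c_2)}
```

With equal priors the right-hand side is zero, so the point is simply assigned to the class whose projected mean is closer.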

Example:

Suppose we receive a dataset from a bank about its small-business clients: those who defaulted (red squares) and those who did not (blue circles), plotted by delinquent days (DAYSDELQ) and number of months in business (BUSAGE). We use LDA to find the linear model that best separates the two classes (default and non-default).

The first step is to calculate the mean (average) vectors, covariance matrices and class probabilities.

Then we calculate the pooled covariance matrix and, finally, the coefficients of the linear model.
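These estimation steps can be sketched in Python as follows, assuming NumPy. The function name, the returned dictionary and the small dataset are made up for illustration and are not the bank data from the example. The size-weighted pooling is one convention among several.

```python
import numpy as np

def fit_lda_two_class(X1, X2):
    """Estimate means, priors, pooled covariance and coefficients for two classes."""
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    n1, n2 = len(X1), len(X2)
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # Per-class covariance matrices (rows are observations).
    C1 = np.cov(X1, rowvar=False)
    C2 = np.cov(X2, rowvar=False)
    # Assumption: pool by class size; weighting by n_k - 1 is also common.
    C = (n1 * C1 + n2 * C2) / (n1 + n2)
    # Coefficients solve C * beta = mu1 - mu2.
    beta = np.linalg.solve(C, mu1 - mu2)
    priors = (n1 / (n1 + n2), n2 / (n1 + n2))
    return {"mu1": mu1, "mu2": mu2, "C": C, "beta": beta, "priors": priors}

# Made-up data: the two columns stand in for two predictors.
class_n = [[2.0, 90.0], [0.0, 120.0], [5.0, 60.0], [1.0, 100.0]]
class_y = [[30.0, 20.0], [45.0, 35.0], [25.0, 10.0]]
model = fit_lda_two_class(class_n, class_y)
```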

A Mahalanobis distance of 2.32 indicates a small overlap between the two groups, which means the linear model separates the classes well.

In the following table, we calculate the Z score using the above Z equation. However, a score by itself cannot be used to predict an outcome. We also need the equation in column 5 to choose Class N or Class Y: we predict Class N if the calculated value is greater than -1.1, and Class Y otherwise. As shown below, the LDA model made two errors.
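The decision step can be sketched in Python as below, assuming NumPy. The function name and the toy model are hypothetical. The score is centred on the midpoint of the two class means and compared with a cutoff derived from the class priors, which is the role the value -1.1 plays in the example.

```python
import numpy as np

def lda_predict(x, beta, mu1, mu2, p1, p2):
    """Return (score, label): label 1 if the centred score exceeds the cutoff."""
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    midpoint = (np.asarray(mu1, dtype=float) + np.asarray(mu2, dtype=float)) / 2.0
    score = float(beta @ (x - midpoint))
    cutoff = -np.log(p1 / p2)  # zero when the priors are equal
    return score, (1 if score > cutoff else 2)

# Hypothetical model with identity covariance, so beta = mu1 - mu2.
mu1, mu2 = np.array([2.0, 0.0]), np.array([0.0, 0.0])
beta = mu1 - mu2
score_a, label_a = lda_predict([3.0, 0.0], beta, mu1, mu2, 0.5, 0.5)
score_b, label_b = lda_predict([-1.0, 0.0], beta, mu1, mu2, 0.5, 0.5)
```

Raising the prior of class 1 lowers the cutoff, so more points are assigned to class 1.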

Predictor Contribution

A simple linear correlation between the model scores and the predictors can be used to test which predictors contribute significantly to the discriminant function. Correlation varies from -1 to 1: values of -1 and 1 mean the highest contribution, in opposite directions, and 0 means no contribution at all.
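A minimal sketch of this check in Python, assuming NumPy and Pearson correlation. The function name and the demo data are illustrative only.

```python
import numpy as np

def predictor_score_correlations(X, beta):
    """Pearson correlation between each predictor column and the model scores."""
    X = np.asarray(X, dtype=float)
    scores = X @ np.asarray(beta, dtype=float)
    return np.array(
        [np.corrcoef(X[:, j], scores)[0, 1] for j in range(X.shape[1])]
    )

# Demo: the scores depend only on the first predictor, so its correlation
# with the scores is 1; the second predictor is only partly correlated.
X_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
corr = predictor_score_correlations(X_demo, beta=[1.0, 0.0])
```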

Quadratic Discriminant Analysis (QDA)

QDA is a general discriminant function with quadratic decision boundaries that can be used to classify datasets with two or more classes. QDA is more flexible than LDA and can have more predictive power, but it needs to estimate a separate covariance matrix for each class.
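The quadratic discriminant function for class k is commonly written as sketched below. The symbol μ_k, the mean vector of class k, and the use of the natural logarithm are notation assumed here.

```latex
Z_k(x) = -\frac{1}{2}\,(x - \mu_k)^{T} C_k^{-1} (x - \mu_k) - \frac{1}{2}\,\ln\left|C_k\right| + \ln P(c_k)
```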

where C_k is the covariance matrix for class k (-1 denotes the matrix inverse), |C_k| is the determinant of the covariance matrix C_k, and P(c_k) is the prior probability of class k. The classification rule is simply to find the class with the highest Z value.
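The rule can be sketched in Python as follows, assuming NumPy and the standard Gaussian form of the quadratic score. The function name, class means, covariances and priors are made up for illustration.

```python
import numpy as np

def qda_scores(x, means, covs, priors):
    """Quadratic discriminant score Z_k(x) for each class k."""
    x = np.asarray(x, dtype=float)
    scores = []
    for mu, C, p in zip(means, covs, priors):
        mu = np.asarray(mu, dtype=float)
        C = np.asarray(C, dtype=float)
        diff = x - mu
        # log|C_k| computed in a numerically stable way.
        _, logdet = np.linalg.slogdet(C)
        # (x - mu_k)^T C_k^{-1} (x - mu_k) without forming the inverse.
        quad = diff @ np.linalg.solve(C, diff)
        scores.append(-0.5 * quad - 0.5 * logdet + np.log(p))
    return np.array(scores)

# Hypothetical two-class setup with different covariance matrices.
means = [[0.0, 0.0], [4.0, 4.0]]
covs = [np.eye(2), 2.0 * np.eye(2)]
priors = [0.5, 0.5]
z = qda_scores([0.5, 0.0], means, covs, priors)
predicted = int(np.argmax(z))  # index of the class with the highest score
```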

Try to invent a real-time LDA classifier. You should be able to add or remove data and variables (predictors and classes) on the fly.