Model Evaluation

Model evaluation is an integral part of the model development process. It helps us find the model that best represents our data and estimate how well the chosen model will perform on future data. Evaluating model performance with the same data used for training is not acceptable in data science, because it easily produces overoptimistic, overfitted models. There are two common methods for evaluating models in data science, Hold-Out and Cross-Validation. To avoid overfitting, both methods use a test set (not seen by the model) to evaluate model performance.

Hold-Out

In this method, the (usually large) dataset is randomly divided into the following three subsets (a short code sketch follows the list):
- The training set is a subset of the dataset used to build predictive models.
- The validation set is a subset of the dataset used to assess the performance of the model built in the training phase. It provides a test platform for fine-tuning the model's parameters and selecting the best-performing model. Not all modeling algorithms need a validation set.
- The test set, or unseen examples, is a subset of the dataset used to assess the likely future performance of a model. If a model fits the training set much better than it fits the test set, overfitting is probably the cause.

Cross-Validation

When only a limited amount of data is available, we use k-fold cross-validation to obtain an unbiased estimate of model performance. In k-fold cross-validation, we divide the data into k subsets of equal size. We build the model k times, each time leaving one of the subsets out of training and using it as the test set. When k equals the sample size, this is called "leave-one-out" cross-validation.
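
The sketch below illustrates both k-fold and leave-one-out cross-validation, again assuming scikit-learn; the choice of k = 5, the logistic regression model, and the synthetic data are illustrative assumptions, not requirements of the method.

# A minimal k-fold and leave-one-out cross-validation sketch (assumes scikit-learn).
# k = 5, LogisticRegression and the synthetic data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

rng = np.random.RandomState(0)
X = rng.rand(200, 4)                        # hypothetical feature matrix
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)   # hypothetical binary target

model = LogisticRegression()

# k-fold: each of the k equal-sized subsets is held out once as the test set.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=kfold)
print("5-fold mean accuracy:", scores.mean())

# Leave-one-out: k equals the sample size, so every example is its own test set.
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print("leave-one-out mean accuracy:", loo_scores.mean())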

Model evaluation can be divided into two sections: