Publications
- August 2020 (Revised September 2020)
- HBS Case Collection
Assessing Prediction Accuracy of Machine Learning Models
By: Michael W. Toffel, Natalie Epstein, Kris Ferreira and Yael Grushka-Cockayne
Abstract
This note introduces a variety of methods for assessing the accuracy of machine learning prediction models. It begins by briefly introducing machine learning, overfitting, training versus test datasets, and cross-validation. It then describes the following accuracy metrics and tools: mean squared error (MSE), mean absolute deviation (MAD), Brier score, cross-entropy, true/false positives/negatives, the confusion matrix, true positive rate (sensitivity or recall), false negative rate (Type II error rate), precision, true negative rate (specificity), false positive rate (Type I error rate), the receiver operating characteristic (ROC) curve and area under the curve (AUC), and the precision-recall curve.
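As a rough illustration of several metrics named in the abstract, the following minimal Python sketch computes them from scratch. It is not taken from the note itself. The function names, the 0.5 classification threshold, and the reading of MAD as the mean absolute difference between predicted and actual values are all assumptions made for this example.

```python
import math


def regression_errors(actual, predicted):
    """Return (MSE, MAD) for numeric predictions.

    Assumption: MAD is taken as the mean absolute difference
    between predicted and actual values.
    """
    n = len(actual)
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    mad = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    return mse, mad


def probability_scores(labels, probs, eps=1e-12):
    """Return (Brier score, cross-entropy) for binary labels (0/1)
    and predicted probabilities of the positive class."""
    n = len(labels)
    brier = sum((p - y) ** 2 for y, p in zip(labels, probs)) / n
    # Clip probabilities away from 0 and 1 so log() stays finite.
    cross_entropy = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for y, p in zip(labels, probs)
    ) / n
    return brier, cross_entropy


def confusion_counts(labels, probs, threshold=0.5):
    """Return (TP, FP, TN, FN) after thresholding the probabilities.

    The 0.5 default threshold is an illustrative assumption.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a class or prediction is absent.
    return numerator / denominator if denominator else float("nan")


def classification_rates(tp, fp, tn, fn):
    """Derive the rate metrics from the confusion-matrix counts."""
    return {
        "true_positive_rate": _ratio(tp, tp + fn),   # sensitivity, recall
        "false_negative_rate": _ratio(fn, tp + fn),  # Type II error rate
        "precision": _ratio(tp, tp + fp),
        "true_negative_rate": _ratio(tn, tn + fp),   # specificity
        "false_positive_rate": _ratio(fp, tn + fp),  # Type I error rate
    }
```

Sweeping `threshold` from 0 to 1 and recording the resulting (false positive rate, true positive rate) pairs would trace out an ROC curve, and recording (recall, precision) pairs would give a precision-recall curve.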
Keywords
Machine Learning; Statistics; Econometric Analyses; Experimental Methods; Data Analysis; Data Analytics; Forecasting and Prediction; Analytics and Data Science; Analysis; Mathematical Methods
Citation
Toffel, Michael W., Natalie Epstein, Kris Ferreira, and Yael Grushka-Cockayne. "Assessing Prediction Accuracy of Machine Learning Models." Harvard Business School Technical Note 621-045, August 2020. (Revised September 2020.)