March 25, 2014 · machine learning

Simple guide to confusion matrix terminology

A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.

I wanted to create a "quick reference guide" for confusion matrix terminology because I couldn't find an existing resource that suited my requirements: compact in presentation, using numbers instead of arbitrary variables, and explained both in terms of formulas and sentences.

Let's start with an example confusion matrix for a binary classifier (though it can easily be extended to the case of more than two classes):

[Figure: example confusion matrix for a binary classifier]

What can we learn from this matrix?

Let's now define the most basic terms, which are whole numbers (not rates):
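As a minimal sketch of how these four whole-number counts arise, the Python below tallies true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) from two lists of labels. The labels and the function name `count_outcomes` are made up for illustration and do not correspond to the example matrix; the value 1 is assumed to mark the positive class.

```python
def count_outcomes(actual, predicted, positive=1):
    """Tally the four basic confusion matrix counts for a binary classifier."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels, for illustration only
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = count_outcomes(actual, predicted)
print(counts)
```

Every observation falls into exactly one of the four cells, so the counts always sum to the number of observations.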

I've added these terms to the confusion matrix, and also added the row and column totals:

[Figure: the example confusion matrix, annotated with these terms and with row and column totals]

This is a list of rates that are often computed from a confusion matrix for a binary classifier:
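To make the arithmetic concrete, here is a small sketch that computes several commonly used rates from the four basic counts, using the standard textbook definitions. The counts passed in are hypothetical (they are not the numbers from the example matrix), the function name `rates` is my own, and the code assumes no denominator is zero.

```python
def rates(tp, tn, fp, fn):
    """Compute common rates from the four confusion matrix counts.

    Assumes every denominator is nonzero (at least one actual positive,
    one actual negative, and one positive prediction).
    """
    total = tp + tn + fp + fn
    actual_yes = tp + fn     # row total: actually positive
    actual_no = tn + fp      # row total: actually negative
    predicted_yes = tp + fp  # column total: predicted positive
    return {
        "accuracy": (tp + tn) / total,
        "misclassification_rate": (fp + fn) / total,
        "true_positive_rate": tp / actual_yes,  # sensitivity, recall
        "false_positive_rate": fp / actual_no,
        "specificity": tn / actual_no,          # true negative rate
        "precision": tp / predicted_yes,
        "prevalence": actual_yes / total,
    }


# Hypothetical counts, for illustration only
r = rates(tp=40, tn=45, fp=5, fn=10)
for name, value in r.items():
    print(f"{name}: {value:.3f}")
```

Note that accuracy and misclassification rate sum to 1, as do specificity and the false positive rate, since each pair shares a denominator.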

A couple of other terms are also worth mentioning:

And finally, for those of you from the world of Bayesian statistics, here's a quick summary of these terms from Applied Predictive Modeling:

In relation to Bayesian statistics, the sensitivity and specificity are the conditional probabilities, the prevalence is the prior, and the positive/negative predicted values are the posterior probabilities.
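One way to see this relationship is to apply Bayes' theorem directly: given the sensitivity, the specificity, and the prevalence (the prior), the positive and negative predictive values (the posteriors) follow. The sketch below does that; the function name `predictive_values` and the input numbers are hypothetical and chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Apply Bayes' theorem to get the posterior probabilities.

    Returns (PPV, NPV): the probability that a positive prediction is
    correct, and the probability that a negative prediction is correct.
    """
    # P(predicted positive) = true positives + false positives
    p_pred_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    # P(predicted negative) = true negatives + false negatives
    p_pred_neg = specificity * (1 - prevalence) + (1 - sensitivity) * prevalence
    ppv = sensitivity * prevalence / p_pred_pos
    npv = specificity * (1 - prevalence) / p_pred_neg
    return ppv, npv


# Hypothetical inputs, for illustration only
ppv, npv = predictive_values(sensitivity=0.8, specificity=0.9, prevalence=0.5)
print(f"PPV: {ppv:.3f}, NPV: {npv:.3f}")
```

Lowering the prevalence while holding sensitivity and specificity fixed lowers the positive predictive value, which is why the same classifier can look much less useful on a population where the positive class is rare.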

What did I miss? Are there any terms that need a better explanation? Your feedback is welcome!

