How to create a confusion matrix

A confusion matrix is a special type of 3D chart. The two horizontal axes are Label and Predicted_label, and the vertical axis is the count (or occurrence) of each combination. Assuming the Label and Predicted_label columns already exist, we need to derive a virtual occurrence column.

To derive an occurrence virtual column, select the Label and Predicted_label columns, right-click one of the selected column headers, hover over Derive virtual column, and click Occurrence in the popup menu.

The virtual column, called (Predicted_label, label)occurrence, is created and presented at the right end of the table.
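For readers who want to see the idea outside the application, here is a minimal Python sketch of what an occurrence column could contain. It assumes the occurrence value of a row is the number of rows that share the same (Predicted_label, Label) pair; the application's exact definition may differ. The sample rows and the function name add_occurrence are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows; only the column names come from the text above.
rows = [
    {"Label": "cat", "Predicted_label": "cat"},
    {"Label": "cat", "Predicted_label": "dog"},
    {"Label": "dog", "Predicted_label": "dog"},
    {"Label": "dog", "Predicted_label": "dog"},
    {"Label": "cat", "Predicted_label": "dog"},
]


def add_occurrence(rows):
    """Return copies of the rows with an extra 'occurrence' field.

    Assumption: occurrence is the number of rows that share the same
    (Predicted_label, Label) pair.
    """
    counts = Counter((r["Predicted_label"], r["Label"]) for r in rows)
    return [
        dict(r, occurrence=counts[(r["Predicted_label"], r["Label"])])
        for r in rows
    ]


with_occurrence = add_occurrence(rows)
for row in with_occurrence:
    print(row)
```

Every row keeps its original values and gains a count, so rows with the same pair of labels carry the same occurrence value.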

Now we are ready to create a confusion matrix chart. Select Label, (Predicted_label, label)occurrence, and Predicted_label in that order and press 3 to create a 3D chart. Then click the Boxes icon in the Charts Toolbar to convert it from a scatter plot to a box plot. The tall bars on the diagonal are the correct predictions. Because they are typically much taller than the other bars, you may want a confusion matrix without the diagonal bars. You can filter on accuracy=0 to hide the correct predictions, but filtering alone does not change the scale of the vertical axis.

To get a more useful scale, create a subset table containing only the rows of incorrect predictions, then make a confusion matrix on that subset table. Follow these steps:

  1. Filter on accuracy=0

  2. Create a subset table

  3. Create a confusion matrix chart in the subset table
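The effect of these steps on the vertical scale can be illustrated with a small Python sketch. It assumes that accuracy is 1 when Label equals Predicted_label and 0 otherwise, and that the vertical axis is scaled to the largest count in the table the chart is built on; both are assumptions for illustration, and the sample rows and the function name confusion_counts are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows. Assumption: accuracy is 1 for a correct
# prediction (Label == Predicted_label) and 0 otherwise.
rows = [
    {"Label": "cat", "Predicted_label": "cat", "accuracy": 1},
    {"Label": "cat", "Predicted_label": "cat", "accuracy": 1},
    {"Label": "cat", "Predicted_label": "cat", "accuracy": 1},
    {"Label": "cat", "Predicted_label": "dog", "accuracy": 0},
    {"Label": "dog", "Predicted_label": "dog", "accuracy": 1},
    {"Label": "dog", "Predicted_label": "dog", "accuracy": 1},
    {"Label": "dog", "Predicted_label": "cat", "accuracy": 0},
    {"Label": "dog", "Predicted_label": "cat", "accuracy": 0},
]


def confusion_counts(rows):
    """Count how many rows fall into each (Label, Predicted_label) cell."""
    return Counter((r["Label"], r["Predicted_label"]) for r in rows)


# Steps 1 and 2: keep only the incorrect predictions as a separate subset.
subset = [r for r in rows if r["accuracy"] == 0]

# Step 3: build the counts on the subset instead of the original table.
full_counts = confusion_counts(rows)
subset_counts = confusion_counts(subset)

# The tallest bar in each table, which would set the vertical scale
# under the assumption stated above.
full_axis_max = max(full_counts.values())
subset_axis_max = max(subset_counts.values())

print("original table:", dict(full_counts), "max =", full_axis_max)
print("subset table:  ", dict(subset_counts), "max =", subset_axis_max)
```

In this sample the tallest bar in the original table is a diagonal cell, while the subset contains only off-diagonal cells, so its largest count is smaller and the misclassification bars fill more of the chart.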

The confusion matrix on the left is the one created on the original table and filtered on accuracy=0, while the one on the right is created on the subset table, which excludes the diagonal data (the correct predictions). A confusion matrix workflow is also available to generate confusion matrix charts automatically. See How to use workflows for details.