How to create a confusion matrix
A confusion matrix is a special type of 3D chart. The two horizontal axes are Label and Predicted_label, and the vertical axis is the count (or occurrence) of each pair. Assuming the Label and Predicted_label columns already exist, we need to derive a virtual occurrence column.
To derive an occurrence virtual column, select the Label and Predicted_label columns, right-click one of the selected column headers, hover over Derive virtual column, and click Occurrence in the popup menu.
The virtual column, called (Predicted_label, label)occurrence, is created and presented at the right end of the table.
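To make the content of the occurrence column concrete, here is a minimal Python sketch. It assumes that the occurrence value of a row is the number of rows sharing the same (Label, Predicted_label) pair. The sample rows and the function name occurrence_column are hypothetical and are not part of the tool.

```python
from collections import Counter

# Hypothetical sample data: one (Label, Predicted_label) pair per table row.
rows = [
    ("cat", "cat"),
    ("cat", "dog"),
    ("dog", "dog"),
    ("dog", "dog"),
    ("bird", "cat"),
    ("cat", "cat"),
]


def occurrence_column(pairs):
    """For each row, count how many rows share its (Label, Predicted_label) pair.

    Assumption: this mirrors what the derived occurrence column holds.
    """
    counts = Counter(pairs)
    return [counts[pair] for pair in pairs]


occurrence = occurrence_column(rows)
for (label, predicted), count in zip(rows, occurrence):
    print(label, predicted, count)
```

Each distinct pair then becomes one bar in the 3D chart, with the count as its height.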
Now we are ready to create a confusion matrix chart. Select Label, (Predicted_label, label)occurrence, and Predicted_label, in that order, and press 3 to create a 3D chart. Then click the Boxes icon in the Charts Toolbar to convert it from a scatter plot to a box plot. The tall bars on the diagonal are the correct predictions. Since they are typically much taller than the other bars, you may want to make a confusion matrix without the diagonal bars. You can filter on accuracy=0 to remove the correct predictions, but filtering alone does not change the scale of the vertical axis.
To get a more useful scale, create a subset table containing only the rows of incorrect predictions, then make a confusion matrix on that subset table. Follow these steps:
Filter on accuracy=0
Create a subset table
Create a confusion matrix chart in the subset table
The confusion matrix on the left is the one created on the original table and filtered on accuracy=0, while the one on the right is created on the subset table, which excludes the data on the diagonal (the correct predictions).
We also have a confusion matrix workflow available that generates confusion matrix charts automatically. Please visit How to use workflows for details.
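The effect of the subset table on the vertical scale can also be sketched in Python. This is an illustration only: it assumes that accuracy=0 marks rows where Label differs from Predicted_label, and the sample rows and helper names (confusion_counts, misclassified) are hypothetical.

```python
from collections import Counter

# Hypothetical sample data: one (Label, Predicted_label) pair per table row.
rows = [
    ("cat", "cat"), ("cat", "cat"), ("cat", "cat"),
    ("dog", "dog"), ("dog", "dog"),
    ("cat", "dog"),
    ("bird", "cat"), ("bird", "cat"),
]


def confusion_counts(pairs):
    """Count the occurrences of each (Label, Predicted_label) pair."""
    return Counter(pairs)


def misclassified(pairs):
    """Keep only incorrect predictions (assumed equivalent to accuracy=0)."""
    return [(label, predicted) for label, predicted in pairs if label != predicted]


full = confusion_counts(rows)
subset = confusion_counts(misclassified(rows))

# The tallest bar sets the vertical scale of each chart.
full_scale = max(full.values())
subset_scale = max(subset.values())
print("tallest bar, original table:", full_scale)
print("tallest bar, subset table:", subset_scale)
```

Because the diagonal pairs are absent from the subset, the tallest remaining bar determines the scale, which makes the misclassification bars easier to compare.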