IT용어위키

A confusion matrix is a table used in data science and machine learning to evaluate the performance of a classification model. It summarizes the model's predictions against the actual values, breaking down the number of correct and incorrect predictions for each class.

Structure

For binary classification, the confusion matrix is a 2x2 table whose four cells count the following outcomes:

True Positives (TP): Positive instances correctly predicted as positive
False Positives (FP): Negative instances incorrectly predicted as positive
True Negatives (TN): Negative instances correctly predicted as negative
False Negatives (FN): Positive instances incorrectly predicted as negative
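
The four counts can be tallied directly from paired lists of actual and predicted labels. The following is a minimal sketch, not a reference implementation: the function name `confusion_counts`, the label strings "spam" and "ham", and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive="spam"):
    """Count TP, FP, TN, FN for a binary classifier (hypothetical helper)."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct only if the actual class is positive.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: a miss if the actual class is positive.
            counts["FN" if a == positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "TN", "FN")}

# Made-up labels for six emails ("ham" stands for not spam).
actual = ["spam", "spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham", "ham", "spam", "spam", "ham"]

cells = confusion_counts(actual, predicted)
print(cells)
```

Every instance lands in exactly one cell, so the four counts always sum to the number of instances.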

Example

Consider a model that classifies emails as spam or not spam:

Actual \ Predicted	Positive (Spam)	Negative (Not Spam)
Positive (Spam)	True Positives (TP)	False Negatives (FN)
Negative (Not Spam)	False Positives (FP)	True Negatives (TN)

Importance of the Confusion Matrix

The confusion matrix is valuable for understanding the types of errors a model makes and is especially useful when:

The dataset is imbalanced, so that accuracy alone can be misleading
There are different costs associated with false positives and false negatives
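
To illustrate the imbalance case, consider a trivial model that always predicts the negative class. The sketch below uses a made-up dataset of 10 positives and 990 negatives; the numbers are assumptions for illustration, not measurements.

```python
# Hypothetical imbalanced dataset: 10 positives (1) and 990 negatives (0).
actual = [1] * 10 + [0] * 990
# A trivial model that always predicts the negative class.
predicted = [0] * len(actual)

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)
fp = sum(1 for a, p in pairs if a == 0 and p == 1)
tn = sum(1 for a, p in pairs if a == 0 and p == 0)
fn = sum(1 for a, p in pairs if a == 1 and p == 0)

accuracy = (tp + tn) / len(actual)
print(f"accuracy={accuracy:.2f}  TP={tp} FP={fp} TN={tn} FN={fn}")
```

Accuracy looks high, yet the TP and FN cells show that the model never identifies a single positive instance, which is exactly the kind of error accuracy hides.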

Metrics Derived from the Confusion Matrix

Several key metrics can be derived from the confusion matrix to evaluate model performance:

Accuracy: (TP + TN) / (TP + TN + FP + FN)
Precision: TP / (TP + FP)
Recall: TP / (TP + FN)
F1 Score: 2 * (Precision * Recall) / (Precision + Recall)
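
The formulas above translate almost directly into code. In this sketch, the function name `metrics` and the example counts are hypothetical, and returning 0.0 when a denominator is zero is one possible convention rather than a standard.

```python
def metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion matrix counts.

    Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * (precision * recall) / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up counts for illustration.
m = metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Precision and recall share the TP numerator but differ in the denominator: precision divides by everything predicted positive, recall by everything actually positive.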

Limitations

The confusion matrix has limitations, such as:

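In multi-class settings the matrix grows to N x N, and the binary metrics above apply only after per-class (one-vs-rest) computation or averaging
Raw counts can be hard to read when class imbalance is extreme, because the majority class dominates the table and can mask the model's bias toward that class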
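
One common way to reuse the binary metrics with more than two classes is a one-vs-rest reduction, which treats a single class as positive and all others as negative. The sketch below assumes rows are actual classes and columns are predicted classes; the function name `one_vs_rest`, the class labels, and the counts are hypothetical.

```python
def one_vs_rest(matrix, k):
    """Collapse an N x N confusion matrix into (TP, FP, TN, FN) for class k.

    Assumption: matrix[i][j] counts instances of actual class i
    that were predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                        # rest of row k
    fp = sum(matrix[i][k] for i in range(n)) - tp   # rest of column k
    tn = total - tp - fn - fp
    return tp, fp, tn, fn

labels = ["cat", "dog", "bird"]
matrix = [
    [5, 1, 0],  # actual cat
    [2, 6, 1],  # actual dog
    [0, 1, 4],  # actual bird
]

per_class = {label: one_vs_rest(matrix, k) for k, label in enumerate(labels)}
for label, counts in per_class.items():
    print(label, counts)
```

Each per-class tuple can then be passed through the binary formulas, and the results averaged across classes if a single summary number is needed.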

Conclusion

The confusion matrix provides a comprehensive view of classification model performance, particularly in binary classification. It enables practitioners to examine each type of error and decide on the best metrics to focus on based on the use case.