# Evaluating Classification Models

*Model evaluation* measures how effectively a trained model makes predictions on unseen data.

The right evaluation metric depends on the type of task:

* `Classification`: Accuracy, Precision, Recall, F1-score
* `Regression`: R² (coefficient of determination), MSE, MAE


<br/>

## True Positive, True Negative, False Positive, False Negative

When evaluating classification models, these terms are commonly used:

* `True Positive (TP)`: Correctly predicting positive cases
  *(e.g., predicting a woman is pregnant when she actually is)*
* `True Negative (TN)`: Correctly predicting negative cases
  *(e.g., predicting a woman is not pregnant when she actually isn’t)*
* `False Positive (FP)`: Incorrectly predicting positive cases
  *(e.g., predicting a woman is pregnant when she isn’t)*
* `False Negative (FN)`: Incorrectly predicting negative cases
  *(e.g., predicting a woman is not pregnant when she actually is)*
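The four outcomes above can be counted directly from a list of true labels and a list of predictions. The following is a minimal sketch in plain Python; the label lists and the `count_outcomes` helper are made up for illustration and are not part of any library.

```python
# Hypothetical labels for illustration: 1 = positive, 0 = negative
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]


def count_outcomes(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification task."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, tn, fp, fn


tp, tn, fp, fn = count_outcomes(y_true, y_pred)
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
```

Scikit-learn's `confusion_matrix()` function in `sklearn.metrics` reports the same four counts in matrix form.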

<br/>

## Classification Metrics

Commonly used evaluation metrics for classification models include:

* `Accuracy`: The ratio of correct predictions to total predictions
  Formula: `(TP + TN) / (TP + TN + FP + FN)`
* `Precision`: The proportion of positive predictions that are actually correct
  Formula: `TP / (TP + FP)`
* `Recall`: The proportion of actual positives that are correctly identified
  Formula: `TP / (TP + FN)`
* `F1-score`: The harmonic mean of precision and recall
  Formula: `2 * (precision * recall) / (precision + recall)`
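To see how the formulas above fit together, here is a small worked sketch that applies them to a set of assumed counts. The values of `tp`, `tn`, `fp`, and `fn` are invented for illustration only.

```python
# Assumed counts for illustration (100 predictions in total)
tp, tn, fp, fn = 40, 45, 5, 10

# Accuracy: correct predictions divided by all predictions
accuracy = (tp + tn) / (tp + tn + fp + fn)

# Precision: how many predicted positives were truly positive
precision = tp / (tp + fp)

# Recall: how many actual positives were found
recall = tp / (tp + fn)

# F1-score: harmonic mean of precision and recall
f1 = 2 * (precision * recall) / (precision + recall)

print(f"Accuracy:  {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall:    {recall:.2f}")
print(f"F1-score:  {f1:.2f}")
```

With these counts, accuracy is higher than recall: the model is right most of the time overall, yet it still misses 10 of the 50 actual positives. This is why a single metric rarely tells the whole story.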

<br/>

Scikit-learn provides built-in functions to calculate these metrics easily.

<br/>

## Example: Calculating Accuracy Score

Let’s evaluate a simple `K-Nearest Neighbors` classification model using the accuracy metric.

```python title="Accuracy Example"
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load dataset (Iris dataset)
X, y = load_iris(return_X_y=True)

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train the classifier
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Make predictions
y_pred = knn.predict(X_test)

# Evaluate accuracy
acc = accuracy_score(y_test, y_pred)
print(f"Accuracy: {acc:.2f}")
```

In this example, `Scikit-learn`’s `accuracy_score()` function returns the fraction of test-set predictions that match the true labels.

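You can also use `precision_score()`, `recall_score()`, or `f1_score()` to compute other metrics depending on your model’s goals. Because Iris has three classes, these functions need an `average` argument (for example, `average="macro"`); their default, `average="binary"`, only works for two-class targets.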
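As a rough sketch of how precision, recall, and F1 can be computed for a multiclass problem, the example below uses a small set of hand-written labels (hypothetical, not taken from the Iris run above) and `average="macro"`, which computes each metric per class and then takes the unweighted mean.

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Hypothetical three-class labels for illustration
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# average="macro": compute the metric for each class, then average them equally
precision = precision_score(y_true, y_pred, average="macro")
recall = recall_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")

print(f"Macro precision: {precision:.2f}")
print(f"Macro recall:    {recall:.2f}")
print(f"Macro F1-score:  {f1:.2f}")
```

Macro averaging treats every class as equally important. If the classes are very unbalanced, `average="weighted"` is an alternative that weights each class by its number of true samples.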

For regression models, a common counterpart to accuracy is R². R², also known as the coefficient of determination, quantifies the proportion of variance in the dependent variable that is predictable from the independent variable(s). A value closer to 1 indicates a better fit between the model and the data, while a value near 0 means the model does little better than always predicting the mean.
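R² can be computed by hand as `1 - SS_res / SS_tot`, where `SS_res` is the sum of squared prediction errors and `SS_tot` is the sum of squared deviations of the true values from their mean. The sketch below applies this to a few invented values; the numbers are assumptions for illustration only.

```python
# Hypothetical true and predicted values for a regression task
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]

mean_true = sum(y_true) / len(y_true)

# Residual sum of squares: error left over after the model's predictions
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

# Total sum of squares: variance of the target around its mean
ss_tot = sum((t - mean_true) ** 2 for t in y_true)

r2 = 1 - ss_res / ss_tot
print(f"R²: {r2:.3f}")
```

In practice, Scikit-learn's `r2_score()` function in `sklearn.metrics` performs the same calculation.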

### What metric is used to evaluate regression models, measuring how much variance in the target is explained by the model?