Lecture

Types of Machine Learning: Supervised vs Unsupervised

This lesson covers two main categories of machine learning:

  • Supervised Learning — models learn from labeled data to make predictions.
  • Unsupervised Learning — models discover patterns in unlabeled data.

Refer to the *slide deck* for a visual overview of the concepts.

In this lesson, we’ll explore simple examples using Scikit-learn.


Supervised Learning Example – Classification

The following example shows how to use Scikit-learn to train a K-Nearest Neighbors classifier.

The model predicts the class of a new data point based on the majority class among its nearest neighbors in the training data.


K-Nearest Neighbors Classification
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Create and train the model
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Evaluate accuracy
print("Accuracy:", model.score(X_test, y_test))
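To make the majority-vote rule concrete, here is a minimal NumPy sketch of what a k-nearest-neighbors prediction involves, assuming Euclidean distance and k = 3. The helper name knn_predict and the toy data are made up for illustration. This only shows the idea and is not a description of how Scikit-learn implements KNeighborsClassifier internally.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Hypothetical helper: classify x_new by majority vote of its k nearest training points."""
    # Euclidean distance from x_new to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most common label among those k neighbors
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated groups
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_toy, y_toy, np.array([0.3, 0.3])))  # query near the first group
print(knn_predict(X_toy, y_toy, np.array([4.8, 5.1])))  # query near the second group
```

Because no training happens beyond storing the data, all of the work is done at prediction time, when distances to the stored points are computed.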

This example uses the Iris dataset, a classic benchmark for classification tasks.

It contains 150 samples of iris flowers, each described by 4 features: sepal length, sepal width, petal length, and petal width, all measured in centimeters.

The target variable is the species of the iris flower, one of three: setosa, versicolor, or virginica.
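The dataset's structure can be checked directly. The short sketch below uses only attributes of the object returned by load_iris and prints the shape of the feature matrix, the feature names, the species names, and the first few integer-encoded targets.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Feature matrix: one row per flower, one column per measurement
print(iris.data.shape)
# Feature names as stored in the dataset
print(iris.feature_names)
# Species names that the integer targets refer to
print(list(iris.target_names))
# Integer-encoded species for the first five samples
print(iris.target[:5])
```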


Unsupervised Learning Example – Clustering

The next example demonstrates K-Means clustering, an unsupervised learning algorithm that groups data points based on similarity.

With Scikit-learn, we only specify the number of clusters; fitting the model then assigns every sample to a cluster without using the species labels.


K-Means Clustering
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

# Load dataset
iris = load_iris()
X = iris.data

# Create and fit the model
kmeans = KMeans(n_clusters=3, random_state=42)
kmeans.fit(X)

# Show first 10 cluster assignments
print("Cluster labels:", kmeans.labels_[:10])
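For K-Means, grouping by similarity means assigning each point to the cluster whose centroid is closest. The sketch below, assuming Euclidean distance, recomputes those assignments by hand from the fitted centroids and compares them with kmeans.predict. Here n_init=10 is set explicitly, which the lesson's example does not do, so the result does not depend on the default of the installed Scikit-learn version.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X = load_iris().data

# n_init=10 is set explicitly; the default differs between Scikit-learn versions
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(X)

# Distance from every sample to every centroid: shape (n_samples, n_clusters)
distances = np.linalg.norm(
    X[:, np.newaxis, :] - kmeans.cluster_centers_[np.newaxis, :, :], axis=2
)
# Each sample belongs to the cluster with the nearest centroid
manual_labels = distances.argmin(axis=1)

print("Centroids shape:", kmeans.cluster_centers_.shape)
print("Matches kmeans.predict:", np.array_equal(manual_labels, kmeans.predict(X)))
```

Cluster indices are arbitrary: cluster 0 does not necessarily correspond to species 0, so the labels cannot be compared with iris.target element by element without first matching clusters to species.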

Key Takeaways

  • Supervised learning uses labeled data to predict outcomes.
  • Unsupervised learning uses unlabeled data to discover hidden patterns.
  • Scikit-learn provides a consistent API, making it easy to switch between both approaches.
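
As a small illustration of that consistent API, the sketch below drives a supervised and an unsupervised estimator through the same fit and predict calls. The dictionary and loop are just one possible arrangement, and n_init=10 is an explicit choice added here to keep behavior the same across Scikit-learn versions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

# A supervised and an unsupervised estimator, used through the same methods
estimators = {
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=3),
    "KMeans": KMeans(n_clusters=3, n_init=10, random_state=42),
}

for name, estimator in estimators.items():
    # fit accepts (X, y) for both; KMeans ignores y
    estimator.fit(X, y)
    # predict returns one integer per sample: a class for KNN, a cluster index for KMeans
    predictions = estimator.predict(X[:5])
    print(name, predictions)
```

The calls look identical, but the outputs mean different things: the classifier returns species labels learned from y, while K-Means returns cluster indices it assigned on its own.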
Quiz

In supervised learning, models are trained using datasets that do not contain labels.

True
False
