A4.3.2 Explain how classification techniques in supervised learning are used to predict discrete categorical outcomes. (HL only)
• K-Nearest Neighbours (K-NN) and decision tree algorithms categorize new data points based on patterns learned from existing labelled data.
• Real-world applications of K-NN may include collaborative filtering recommendation systems. 
• Real-world applications of decision trees may include medical diagnosis based on a patient’s symptoms.

The Big Idea

In supervised machine learning, classification algorithms are used to predict discrete categorical outcomes—that is, placing data into predefined classes or categories. Unlike regression, which predicts continuous values, classification assigns labels like "yes" or "no", "spam" or "not spam", or "disease present" or "disease absent".

These algorithms learn from labeled training data, where the correct category is already known, and then apply that learned knowledge to classify new, unseen instances. Two widely used classification techniques are K-Nearest Neighbours (K-NN) and Decision Trees. Both can be highly effective, depending on the problem, data, and desired interpretability.


What Are Categorical Outcomes?

Categorical outcomes are variables with discrete values drawn from a finite set of classes.

Examples:

  • Email: {Spam, Not Spam}
  • Medical test: {Positive, Negative}
  • Animal classification: {Dog, Cat, Rabbit}
  • Product rating: {Like, Dislike}

1. K-Nearest Neighbours (K-NN)

How It Works:

  • K-NN is a non-parametric, instance-based learning algorithm.
  • To classify a new data point, it looks at the ‘k’ nearest labeled examples in the training set (based on a distance metric like Euclidean distance).
  • The new point is assigned the most common class among its neighbors.
$$\text{class}(x) = \text{majority\_vote}(k \text{ closest points to } x)$$
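
The majority-vote rule above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function name `knn_classify`, the toy `train` list, and the two features (height in cm, weight in kg) are all assumptions made up for this example.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    train: list of (features, label) pairs, where features is a tuple of numbers.
    """
    # Rank every stored example by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    # Keep only the labels of the k closest examples.
    nearest_labels = [label for _, label in by_distance[:k]]
    # The most common label among those neighbours is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labelled data: (height_cm, weight_kg) -> animal
train = [
    ((25.0, 4.0), "Cat"),
    ((23.0, 3.5), "Cat"),
    ((24.0, 5.0), "Cat"),
    ((60.0, 30.0), "Dog"),
    ((55.0, 25.0), "Dog"),
    ((65.0, 35.0), "Dog"),
]

print(knn_classify(train, (26.0, 4.5), k=3))   # Cat
print(knn_classify(train, (58.0, 28.0), k=3))  # Dog
```

Note that nothing is "learned" ahead of time: the function simply stores the labelled points and does all its work when a query arrives.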

Example Application: Collaborative Filtering in Recommendation Systems

  • Suppose a student likes science fiction books. K-NN can recommend other books by finding readers with similar preferences and suggesting titles they rated highly.
  • This is common in systems like Netflix or Spotify, where users are compared and recommendations are based on “neighborhoods” of similar taste.
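
One way to picture a "neighborhood" of similar readers is the sketch below. It assumes a tiny, invented `ratings` table (the reader names and scores are hypothetical) and measures similarity only over books two readers have both rated. Real recommendation systems are far more elaborate, so treat this as an illustration of the idea rather than how any particular service works.

```python
import math

# Hypothetical ratings on a 1-5 scale: reader -> {book title: score}
ratings = {
    "ana":   {"Dune": 5, "Neuromancer": 4, "Emma": 1},
    "ben":   {"Dune": 5, "Neuromancer": 5, "Foundation": 5},
    "chloe": {"Dune": 4, "Neuromancer": 5, "Hyperion": 4, "Emma": 2},
    "dev":   {"Dune": 1, "Emma": 5, "Persuasion": 5},
}

def distance(a, b):
    """Root-mean-square rating difference over the titles both readers rated."""
    shared = set(a) & set(b)
    if not shared:
        return math.inf  # nothing in common, so treat as maximally far apart
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in shared) / len(shared))

def recommend(user, k=2, min_rating=4):
    """Suggest unseen titles that the k most similar readers rated highly."""
    others = [u for u in ratings if u != user]
    neighbours = sorted(others, key=lambda u: distance(ratings[user], ratings[u]))[:k]
    seen = set(ratings[user])
    suggestions = set()
    for neighbour in neighbours:
        for title, score in ratings[neighbour].items():
            if title not in seen and score >= min_rating:
                suggestions.add(title)
    return sorted(suggestions)

print(recommend("ana"))  # ['Foundation', 'Hyperion']
```

Here "ana" is closest to "ben" and "chloe", who share her taste for science fiction, so their highly rated titles are suggested while "dev"'s picks are ignored.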

Strengths:

  • Simple to implement
  • No explicit training phase
  • Adaptable to multi-class classification

Limitations:

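  • Computationally expensive at prediction time on large datasets, because each new point must be compared against the stored training examples
  • Sensitive to irrelevant features and to features measured on very different scales
  • Needs a well-chosen value of k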
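
The effect of feature scale can be seen in a small sketch. The two training points, the query, and the helper names `nearest_label` and `min_max_scaler` are invented for this example; the scaler also assumes every feature takes at least two distinct values in the training data (otherwise it would divide by zero).

```python
import math

def nearest_label(train, query):
    """1-NN: return the label of the single closest training point."""
    return min(train, key=lambda pair: math.dist(pair[0], query))[1]

def min_max_scaler(points):
    """Return a function that rescales each feature to [0, 1] over the training range."""
    columns = list(zip(*points))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]

    def scale(point):
        return tuple((v - lo) / (hi - lo) for v, lo, hi in zip(point, lows, highs))

    return scale

# Hypothetical data: (age in years, income in dollars) -> product rating
train = [((25, 50_000), "Like"), ((60, 50_500), "Dislike")]
query = (26, 50_300)

# Raw features: the income difference (hundreds) swamps the age difference (tens).
raw_prediction = nearest_label(train, query)

# Rescaled features: both age and income contribute on a comparable 0-1 scale.
scale = min_max_scaler([features for features, _ in train])
scaled_train = [(scale(features), label) for features, label in train]
scaled_prediction = nearest_label(scaled_train, scale(query))

print(raw_prediction, scaled_prediction)  # Dislike Like
```

The query is almost the same age as the "Like" example, but on raw values the income column dominates the distance and flips the prediction.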

2. Decision Trees

How It Works:

  • A decision tree partitions the feature space using a sequence of if-else questions.
  • At each node, the algorithm chooses the feature that best separates the data using metrics like Gini impurity or information gain.
  • The result is a tree-like structure where leaves represent class labels.
$$\text{IF fever AND cough AND loss of smell THEN COVID-19}$$
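
To make "best separates the data" concrete, the sketch below computes Gini impurity for yes/no splits on a tiny invented patient table and picks the feature whose split has the lowest weighted impurity. The rows, the feature names, and the helpers `gini` and `weighted_gini` are assumptions for illustration; a real tree builder would repeat this choice recursively at every node.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions (0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_gini(rows, feature):
    """Average impurity of the two groups produced by a yes/no split on `feature`."""
    yes = [r["label"] for r in rows if r[feature]]
    no = [r["label"] for r in rows if not r[feature]]
    n = len(rows)
    return len(yes) / n * gini(yes) + len(no) / n * gini(no)

# Hypothetical labelled patients (not real clinical data)
rows = [
    {"fever": True,  "cough": True,  "loss_of_smell": True,  "label": "COVID-19"},
    {"fever": False, "cough": True,  "loss_of_smell": True,  "label": "COVID-19"},
    {"fever": True,  "cough": False, "loss_of_smell": True,  "label": "COVID-19"},
    {"fever": True,  "cough": True,  "loss_of_smell": False, "label": "Flu"},
    {"fever": True,  "cough": False, "loss_of_smell": False, "label": "Flu"},
    {"fever": False, "cough": True,  "loss_of_smell": False, "label": "Cold"},
]

features = ["fever", "cough", "loss_of_smell"]
best = min(features, key=lambda f: weighted_gini(rows, f))
print(best)  # loss_of_smell
```

In this toy table, asking about loss of smell first leaves one branch containing only COVID-19 cases, which is why it scores the lowest impurity.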

Example Application: Medical Diagnosis

  • A decision tree could predict whether a patient has the flu, cold, or COVID-19 based on symptoms like fever, cough, and fatigue.
  • Each question at a node narrows down the possible outcomes.
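
Once a tree exists, classifying a patient is just a walk from the root to a leaf. The sketch below uses a small hand-written tree, not one learned from data and not a clinically valid one; the tuple layout `(question, subtree_if_yes, subtree_if_no)` and the name `classify` are assumptions chosen for this example.

```python
# Internal node: (feature asked about, subtree if yes, subtree if no)
# Leaf: a plain string holding the class label
tree = (
    "fever",
    (
        "cough",
        ("loss_of_smell", "COVID-19", "Flu"),
        "Flu",
    ),
    "Cold",
)

def classify(node, symptoms):
    """Follow yes/no answers down the tree until a leaf label is reached."""
    while not isinstance(node, str):
        feature, if_yes, if_no = node
        # A symptom missing from the dictionary is treated as "no".
        node = if_yes if symptoms.get(feature, False) else if_no
    return node

print(classify(tree, {"fever": True, "cough": True, "loss_of_smell": True}))   # COVID-19
print(classify(tree, {"fever": True, "cough": True, "loss_of_smell": False}))  # Flu
print(classify(tree, {"fever": False, "cough": True}))                         # Cold
```

Each loop iteration answers one question, so a prediction needs at most as many checks as the tree is deep, which is why inference is fast and easy to explain.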

Strengths:

  • Highly interpretable (visual and intuitive)
  • Fast inference
  • Handles both numerical and categorical data

Limitations:

  • Prone to overfitting without pruning
  • Instability: small changes in data can lead to very different trees
  • Less effective when classes are not well separated

Student-Relatable Example

Imagine your school is building a system to automatically classify student clubs based on their meeting topics.

  • If you use K-NN, you could compare the keywords from this week's club meeting to past labeled meetings. If your meeting is similar to past "Science Club" discussions, it is classified as Science.
  • If you use a decision tree, the system might ask:
    • Does the club discuss experiments?
    • Are lab activities mentioned?
    • Is the term “hypothesis” used?

Based on these questions, it assigns a label like "Science Club" or "Art Club".

This shows how algorithms can categorize real-world scenarios using either structured logic (decision trees) or statistical similarity (K-NN).
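
The K-NN version of the club classifier could be sketched as follows. The keyword sets, club labels, and function names are invented for illustration, and similarity is measured with Jaccard overlap (shared keywords divided by all keywords in either set), which is one reasonable choice among several for comparing sets of words.

```python
from collections import Counter

# Hypothetical past meetings: (set of keywords, club label)
past_meetings = [
    ({"experiment", "hypothesis", "lab"}, "Science Club"),
    ({"lab", "microscope", "hypothesis"}, "Science Club"),
    ({"experiment", "data", "results"}, "Science Club"),
    ({"paint", "canvas", "sketch"}, "Art Club"),
    ({"sketch", "gallery", "colour"}, "Art Club"),
    ({"canvas", "colour", "exhibition"}, "Art Club"),
]

def jaccard(a, b):
    """Similarity between two non-empty keyword sets, from 0 (disjoint) to 1 (identical)."""
    return len(a & b) / len(a | b)

def classify_meeting(keywords, k=3):
    """Label a meeting by majority vote among the k most similar past meetings."""
    ranked = sorted(past_meetings, key=lambda m: jaccard(keywords, m[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

print(classify_meeting({"hypothesis", "experiment", "results"}))  # Science Club
print(classify_meeting({"sketch", "colour", "paint"}))            # Art Club
```

Because similarity is used instead of distance here, the list is sorted in descending order so that the most similar meetings come first.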


Summary

Classification in supervised learning enables machines to predict categorical labels based on patterns in labeled training data. Algorithms like K-Nearest Neighbours and Decision Trees are essential tools for solving real-world problems—from recommendation engines to diagnostic systems. Understanding how they work helps developers and students choose the right approach for interpretability, accuracy, and performance in categorical prediction tasks.