# Categorical Data Encoding

AI and machine learning models can only work with numerical input.

However, much of the data we work with is text-based.

This kind of data, grouped into categories without numerical meaning, is called `categorical data`.

```plaintext title="Example of Categorical Data"
| ID | Color  | Region      | Occupation |
|----|--------|-------------|------------|
| 1  | Red    | New York    | Student    |
| 2  | Blue   | Chicago     | Employee   |
| 3  | Green  | Los Angeles | Student    |
| 4  | Yellow | New York    | Doctor     |
```

In the data above, color, region, and occupation are categorical data.

These values cannot be used directly in calculations, and for these three columns comparing magnitude or order is not meaningful: Red is not "greater than" Blue.
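To make this concrete, here is a minimal sketch that stores the example table as plain Python records and collects the distinct categories in each column. The helper name `distinct_values` is made up for illustration and is not part of any library.

```python
# The example table as plain Python records (values copied from the table above).
rows = [
    {"ID": 1, "Color": "Red", "Region": "New York", "Occupation": "Student"},
    {"ID": 2, "Color": "Blue", "Region": "Chicago", "Occupation": "Employee"},
    {"ID": 3, "Color": "Green", "Region": "Los Angeles", "Occupation": "Student"},
    {"ID": 4, "Color": "Yellow", "Region": "New York", "Occupation": "Doctor"},
]


def distinct_values(records, column):
    """Return the distinct values of a column, in order of first appearance."""
    seen = []
    for record in records:
        value = record[column]
        if value not in seen:
            seen.append(value)
    return seen


# Each categorical column holds a small set of text labels, not quantities.
categories = {
    column: distinct_values(rows, column)
    for column in ("Color", "Region", "Occupation")
}

for column, values in categories.items():
    print(column, values)
```

Each column turns out to be a short list of text labels, which is why the values have to be converted before a model can use them.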

Categorical data can be divided into two main types.

<br />

### Nominal Data

Nominal data is categorical data with no inherent order. Examples include colors (red, blue, green) and regions (New York, Chicago, Los Angeles).

<br />

### Ordinal Data

Ordinal data is categorical data with a meaningful order. Examples include education levels (elementary, middle, high school) and customer satisfaction levels (low, medium, high).

Categorical data needs to be converted into numerical form for machine learning, a process known as `encoding`.
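For ordinal data, the numbers can be chosen so that they preserve the order of the categories. The sketch below assumes the satisfaction levels mentioned above are ordered low < medium < high. The sample responses and the name `SATISFACTION_ORDER` are made up for illustration.

```python
# Assumed ordering of the categories, from lowest to highest.
SATISFACTION_ORDER = ["low", "medium", "high"]

# Map each level to its position in the ordering: low -> 0, medium -> 1, high -> 2.
satisfaction_to_rank = {
    level: rank for rank, level in enumerate(SATISFACTION_ORDER)
}

# Made-up sample data, for illustration only.
responses = ["medium", "high", "low", "medium"]

encoded = [satisfaction_to_rank[response] for response in responses]
print(encoded)  # [1, 2, 0, 1]
```

Because the codes follow the category order, comparisons such as "low is less than high" still hold after encoding. Nominal data has no such order to preserve.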

<br />

## What is Data Encoding?

Data encoding is the process of transforming categorical data into numbers so that machine learning models can process it.

For example, let's convert the color data above into numbers.

```plaintext title="Color Data Encoding"
| ID | Color  | Color (Encoded) |
|----|--------|-----------------|
| 1  | Red    | 0               |
| 2  | Blue   | 1               |
| 3  | Green  | 2               |
| 4  | Yellow | 3               |
```

This allows the model to process color data numerically.
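The table above can be reproduced with a few lines of Python. This is a minimal sketch that assigns each new category the next unused integer, in order of first appearance. The function name `encode_as_integers` is hypothetical. Library encoders may assign codes in a different order (scikit-learn's `LabelEncoder`, for example, sorts the categories first), so the exact numbers depend on the tool used.

```python
def encode_as_integers(values):
    """Assign each distinct value an integer code, in order of first appearance."""
    mapping = {}
    codes = []
    for value in values:
        if value not in mapping:
            # A new category gets the next unused integer.
            mapping[value] = len(mapping)
        codes.append(mapping[value])
    return codes, mapping


colors = ["Red", "Blue", "Green", "Yellow"]
codes, mapping = encode_as_integers(colors)

print(mapping)  # {'Red': 0, 'Blue': 1, 'Green': 2, 'Yellow': 3}
print(codes)    # [0, 1, 2, 3]
```

A value that appears more than once receives the same code each time, so `encode_as_integers(["Red", "Blue", "Red"])` yields the codes `[0, 1, 0]`.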

Common methods for this transformation include `Label Encoding` and `One-Hot Encoding`.

We will discuss each method in more detail in the following lessons.
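As a brief preview of the second method named above, one-hot encoding represents each category as a vector with a single 1 in the position reserved for that category and 0 everywhere else. The sketch below is a plain-Python illustration under that definition, not a library implementation. The function name `one_hot_encode` is hypothetical.

```python
def one_hot_encode(values):
    """Return one 0/1 vector per value, plus the category order used for the columns."""
    # Collect the distinct categories in order of first appearance.
    categories = []
    for value in values:
        if value not in categories:
            categories.append(value)

    # Each vector has a 1 in the column of its own category and 0 elsewhere.
    vectors = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return vectors, categories


colors = ["Red", "Blue", "Green", "Yellow"]
vectors, categories = one_hot_encode(colors)

for color, vector in zip(colors, vectors):
    print(color, vector)
```

With four distinct colors, each row becomes a vector of length four, for example `[1, 0, 0, 0]` for Red.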

The process of converting categorical data into numbers is called encoding. This is an important step that helps models process text data.

