Lecture

GroupBy and Aggregation Functions

One of the most powerful capabilities in Pandas is grouping data and performing calculations on each subset.

This technique helps uncover patterns across categories, such as sales by region, average scores per class, or revenue by product.

The groupby() method splits your data into groups based on the values in one or more columns.
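As a minimal sketch of "one or more columns", the snippet below groups a small made-up table first by a single column and then by a pair of columns. The DataFrame and its column names (Region, Product, Revenue) are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
sales = pd.DataFrame({
    "Region": ["East", "East", "East", "West", "West"],
    "Product": ["Pen", "Pen", "Ink", "Pen", "Ink"],
    "Revenue": [10, 20, 30, 40, 50],
})

# One grouping column: one group per distinct Region.
by_region = sales.groupby("Region")["Revenue"].sum()

# Two grouping columns: one group per distinct (Region, Product) pair.
by_region_product = sales.groupby(["Region", "Product"])["Revenue"].sum()

print(by_region)
print(by_region_product)
```

Passing a list of column names produces a result indexed by every combination of values that actually occurs in the data.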

Once grouped, you can apply aggregation functions such as:

  • sum(): total value per group
  • mean(): average value per group
  • count(): number of non-missing values per group
  • max(): highest value per group
  • min(): lowest value per group
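The five functions above can be tried side by side on one grouped column. This is a small sketch on made-up data; the Class and Score columns are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical exam scores for two classes.
scores = pd.DataFrame({
    "Class": ["A", "A", "A", "B", "B"],
    "Score": [70, 80, 90, 60, 100],
})

# Group once, then reuse the grouped column for each aggregation.
grouped = scores.groupby("Class")["Score"]

totals = grouped.sum()        # total score per class
averages = grouped.mean()     # average score per class
row_counts = grouped.count()  # non-missing scores per class
highest = grouped.max()       # best score per class
lowest = grouped.min()        # worst score per class

print(totals, averages, row_counts, highest, lowest, sep="\n\n")
```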

GroupBy example

Imagine you have a dataset of sales transactions from different cities. You might want to:

  • Calculate total sales for each city
  • Find the average transaction amount per store
  • Count how many transactions happened in each region

Pandas makes this easy. For example, to calculate total sales per city, you can write:

Calculate Total Sales per City
import pandas as pd

df = pd.DataFrame({
    "City": ["New York", "New York", "Los Angeles", "Los Angeles", "Chicago", "Chicago"],
    "Sales": [100000, 150000, 200000, 250000, 300000, 350000],
})

# Sum the Sales column within each city
df.groupby("City")["Sales"].sum()

# Output (groupby sorts the group keys by default):
# City
# Chicago        650000
# Los Angeles    450000
# New York       250000
# Name: Sales, dtype: int64
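The other two tasks from the list, average transaction amount per store and number of transactions per region, follow the same shape. The sketch below assumes a hypothetical transactions table with Region, Store, and Amount columns; none of these names come from a real dataset.

```python
import pandas as pd

# Hypothetical transactions; column names are assumptions for this example.
transactions = pd.DataFrame({
    "Region": ["East", "East", "West", "West", "West"],
    "Store": ["S1", "S2", "S3", "S3", "S4"],
    "Amount": [120.0, 80.0, 200.0, 100.0, 50.0],
})

# Average transaction amount per store.
avg_per_store = transactions.groupby("Store")["Amount"].mean()

# Number of transactions recorded in each region.
count_per_region = transactions.groupby("Region")["Amount"].count()

print(avg_per_store)
print(count_per_region)
```

Only the grouping column and the aggregation change between the two lines; the rest of the expression stays the same.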

Syntax Overview

Here's a simple pattern:

Basic GroupBy Syntax
df.groupby("ColumnName")["TargetColumn"].agg("aggregation_function")
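To make the pattern concrete, the sketch below fills it in with the name "sum" and compares the result to the shortcut method .sum(). The small DataFrame is made up for this example.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "City": ["New York", "New York", "Chicago"],
    "Sales": [100, 150, 300],
})

# Passing the function name as a string to .agg() ...
via_agg = df.groupby("City")["Sales"].agg("sum")

# ... gives the same result as calling the method directly.
via_method = df.groupby("City")["Sales"].sum()

print(via_agg.equals(via_method))
```

For a single aggregation the direct method is shorter; the .agg() form is convenient when the function name is stored in a variable.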

You can also use .agg() to apply multiple aggregation functions at once for richer summaries.

For example, to calculate the sum, mean, and count of the Amount column for each category, you can write:

Apply Multiple Aggregations
import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "A", "B", "B", "C", "C"],
    "Amount": [100, 200, 300, 400, 500, 600],
})

# Apply three aggregations to the Amount column in one call
df.groupby("Category")["Amount"].agg(["sum", "mean", "count"])

# Output:
#            sum   mean  count
# Category
# A          300  150.0      2
# B          700  350.0      2
# C         1100  550.0      2

Each column of the output holds one aggregation:

  • sum = total of all values in the group
  • mean = average value in the group
  • count = number of non-missing values in the group

The categories (A, B, C) become the index labels of the result, because groupby() uses the grouping column as the index by default.
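If the categories are more useful as an ordinary column, for instance before exporting the summary, one common option is reset_index(). A short sketch using the same Category and Amount data:

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "A", "B", "B", "C", "C"],
    "Amount": [100, 200, 300, 400, 500, 600],
})

summary = df.groupby("Category")["Amount"].agg(["sum", "mean", "count"])

# The group keys are the index of the summary.
print(list(summary.index))

# reset_index() moves them back into a regular column.
flat = summary.reset_index()
print(list(flat.columns))
```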

Quiz

Using the groupby method in Pandas, you can apply aggregation functions like sum, mean, and count to grouped data.

True
False
