Lecture

Distribution Plots (histplot, kdeplot)

Visualizing data distributions helps you understand how your data is spread, detect patterns, and identify potential outliers.

Seaborn provides two main tools for this:

  • histplot() – shows the frequency distribution of a dataset.
  • kdeplot() – shows an estimate of the probability density function (a smoothed distribution curve).

Using histplot()

The histplot() function creates a histogram that shows how many data points fall into each range (bin).

Basic Histogram
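import seaborn as sns
import matplotlib.pyplot as plt

# Load Seaborn's example tips dataset (downloaded on first use)
tips = sns.load_dataset("tips")

# Histogram of the total_bill column
sns.histplot(data=tips, x="total_bill")
plt.title("Distribution of Total Bills")
plt.show()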

Key points:

  • x specifies the variable to plot.
  • The range of x values is divided into bins (intervals) along the x-axis.
  • The height of each bar shows how many observations fall into that bin.
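To make the idea of bins concrete, here is a minimal sketch of the counting that sits behind a histogram. It uses plain Python rather than Seaborn, the helper name bin_counts is hypothetical, and the bill values are made up for illustration. Seaborn chooses the number of bins automatically; here it is fixed at 3 by hand.

```python
def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must span a non-zero range")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The last bin is closed on the right so the maximum value is counted.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Made-up bill amounts, for illustration only.
bills = [3.0, 7.5, 10.0, 12.5, 14.0, 16.0, 18.5, 21.0, 24.0, 33.0]
edges, counts = bin_counts(bills, 3)

# Print a text version of the histogram: one row per bin.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f} | {'#' * count} ({count})")
```

Each row corresponds to one bar: the longer the row, the more observations fall into that interval.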

Using kdeplot()

The kdeplot() function displays a smooth curve representing the estimated probability density of the data.

Basic KDE Plot
# Reuses sns, plt, and tips from the first example
sns.kdeplot(data=tips, x="total_bill")
plt.title("KDE of Total Bills")
plt.show()

Key points:

  • KDE stands for kernel density estimate: a smooth curve estimated from the data that can be read as a smoothed counterpart to the histogram.
  • Well suited to showing the overall shape of continuous data.
  • Can be combined with histplot() for more context.
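The sketch below shows, under simplifying assumptions, how a kernel density estimate can be computed: a small Gaussian bump is centered on every observation and the bumps are averaged. The helper name gaussian_kde_sketch, the made-up bill values, and the hand-picked bandwidth of 4.0 are all assumptions for illustration. Seaborn selects a bandwidth automatically, so its curve will not match this one exactly.

```python
import math


def gaussian_kde_sketch(values, bandwidth):
    """Return a function that estimates the density at x by averaging Gaussian kernels."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Each observation contributes a bell curve centered on itself.
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


# Made-up bill amounts, for illustration only.
bills = [3.0, 7.5, 10.0, 12.5, 14.0, 16.0, 18.5, 21.0, 24.0, 33.0]
density = gaussian_kde_sketch(bills, bandwidth=4.0)  # bandwidth chosen by hand

# A density curve should enclose a total area of about 1.
step = 0.1
grid = [-30.0 + i * step for i in range(1001)]  # covers -30 to 70
area = sum(density(x) for x in grid) * step

print(f"density near the middle of the data: {density(15.0):.4f}")
print(f"approximate area under the curve:    {area:.4f}")
```

A smaller bandwidth gives a bumpier curve that follows individual points; a larger one gives a smoother, flatter curve.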

Combining Histogram and KDE

You can draw both in a single histplot() call by setting kde=True:

Histogram with KDE Overlay
# Reuses sns, plt, and tips from the first example
sns.histplot(data=tips, x="total_bill", kde=True)
plt.title("Total Bill Distribution with KDE")
plt.show()

In the next Jupyter Notebook, you will experiment with:

  • Changing bin sizes in histograms.
  • Adding hue categories to compare groups.
  • Styling KDE plots for clarity.
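
As a preview of those three ideas, here is a minimal sketch that sets the bin count, splits the data by a hue category, and fills the KDE curves. The parameters bins, hue, fill, and bw_adjust are existing Seaborn options; the specific values (20 bins, bw_adjust=0.8), the choice of the "time" column, and the function name plot_variations are assumptions picked for illustration.

```python
# Settings kept as plain dicts so each variation is easy to tweak.
HIST_KWARGS = {"x": "total_bill", "bins": 20, "hue": "time"}
KDE_KWARGS = {"x": "total_bill", "hue": "time", "fill": True, "bw_adjust": 0.8}


def plot_variations():
    """Draw a histogram and a KDE plot of total_bill, split by meal time."""
    # Imported here so the settings above can be inspected without a plotting setup.
    import matplotlib.pyplot as plt
    import seaborn as sns

    tips = sns.load_dataset("tips")
    fig, (ax_hist, ax_kde) = plt.subplots(1, 2, figsize=(10, 4))

    # Histogram: fixed number of bins, one color per value of "time".
    sns.histplot(data=tips, ax=ax_hist, **HIST_KWARGS)
    ax_hist.set_title("Total Bill by Time (20 bins)")

    # KDE: filled curves, slightly less smoothing than the default.
    sns.kdeplot(data=tips, ax=ax_kde, **KDE_KWARGS)
    ax_kde.set_title("KDE of Total Bill by Time")

    fig.tight_layout()
    plt.show()
```

Calling plot_variations() in a notebook cell draws both plots side by side; changing the values in the two dicts is a quick way to see how each setting affects the result.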

Quiz

What function would you use to visualize a smoothed version of a histogram in Seaborn?

To create a smooth curve representing the estimated probability density of data in Seaborn, you would use the ______ function.
histplot()
kdeplot()
scatterplot()
lineplot()
