Master Seaborn

Statistical data visualization made beautiful. Learn seaborn's high-level interface for creating informative and attractive statistical graphics.

Introduction to Seaborn

Seaborn is a Python data visualization library based on matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics. It makes complex visualizations simple and beautiful.

Why Seaborn?

  • Beautiful default styles and color palettes
  • Built-in statistical estimation and plotting
  • Tight integration with pandas DataFrames
  • High-level functions for complex visualizations
  • Automatic legends and axis labels

Your First Seaborn Plot

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Set the style
sns.set_style("whitegrid")

# Load sample dataset
tips = sns.load_dataset('tips')

# Create a scatter plot
plt.figure(figsize=(10, 6))
sns.scatterplot(data=tips, x='total_bill', y='tip', 
                hue='time', size='size', alpha=0.7)

plt.title('Tips vs Total Bill', fontsize=16, fontweight='bold')
plt.show()

Statistical Relationships

Understanding relplot()

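The relplot() function is seaborn's figure-level interface for visualizing statistical relationships between variables. It draws scatter plots (kind="scatter", the default) or line plots (kind="line"), maps hue, size, and style to additional variables, and can split the data across subplots with row and col.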

Scatter Plots

Show relationships between continuous variables

Line Plots

Show trends over time or continuous variables

Semantic Mapping

Use hue, size, and style to add dimensions

Scatter Plot with Multiple Dimensions

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Load dataset
tips = sns.load_dataset('tips')

# Create advanced scatter plot
# relplot() is figure-level: it creates its own figure, so its size is
# set with height/aspect instead of plt.figure(figsize=...)
g = sns.relplot(x='total_bill', y='tip', data=tips,
                hue='time',           # Color by time (Lunch/Dinner)
                size='size',          # Size by party size
                style='smoker',       # Different markers for smoker
                alpha=0.8,            # Transparency
                height=6,             # Facet height in inches
                aspect=1.5)           # Width = aspect * height

# Title the figure that relplot() created
g.figure.suptitle('Tips Analysis: Multiple Dimensions',
                  fontsize=16, fontweight='bold', y=1.02)
plt.show()

Categorical Data Visualization

Seaborn's categorical plotting functions show how a numeric variable differs across the levels of a categorical one. These plots are essential for comparing groups and understanding categorical relationships.

Categorical Plot Types

Strip Plot

Shows distribution of values with scattered points

Swarm Plot

Avoids overlapping points with smart positioning

Box Plot

Shows quartiles, median, and outliers

Violin Plot

Combines box plot with kernel density estimation

When to Use Each

Strip/Swarm: Small datasets, show individual points
Box/Violin: Large datasets, show distribution summary
Bar/Point: Show estimates and confidence intervals
Count: Show frequency of categories
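
The last two rows of that guide can be sketched as follows. This is a minimal example on synthetic data (the "orders" table and its "day" and "bill" columns are made up for illustration): countplot() draws the number of rows per category, and pointplot() draws the mean of a numeric variable per category with a confidence interval.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic orders (an assumption for illustration): 30 on Thursday,
# 20 on Friday, 50 on Saturday, with gamma-distributed bill amounts.
rng = np.random.default_rng(0)
day_order = ["Thur", "Fri", "Sat"]
days = np.repeat(day_order, [30, 20, 50])
orders = pd.DataFrame({
    "day": days,
    "bill": rng.gamma(shape=4.0, scale=5.0, size=days.size),
})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Count plot: bar height is the number of rows in each category
sns.countplot(data=orders, x="day", order=day_order, ax=axes[0])
axes[0].set_title("Count plot: orders per day")

# Point plot: marker is the mean bill, bar is the confidence interval
sns.pointplot(data=orders, x="day", y="bill", order=day_order, ax=axes[1])
axes[1].set_title("Point plot: mean bill per day")

fig.tight_layout()
```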

Categorical Comparison

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# Load tips dataset
tips = sns.load_dataset('tips')

# Create figure with multiple categorical plots
fig, axes = plt.subplots(2, 2, figsize=(15, 12))

# 1. Strip plot
sns.stripplot(x='day', y='total_bill', data=tips, 
              ax=axes[0,0], alpha=0.7, jitter=True)
axes[0,0].set_title('Strip Plot: Daily Bills', fontweight='bold')

# 2. Box plot
sns.boxplot(x='day', y='total_bill', data=tips, ax=axes[0,1])
axes[0,1].set_title('Box Plot: Daily Bill Distribution', fontweight='bold')

# 3. Violin plot with split by smoker
sns.violinplot(x='day', y='total_bill', hue='smoker', 
               data=tips, ax=axes[1,0], split=True)
axes[1,0].set_title('Violin Plot: Smoker vs Non-Smoker', fontweight='bold')

# 4. Bar plot of mean tip with standard-deviation error bars
# (errorbar= replaces the deprecated ci= parameter in seaborn 0.12+)
sns.barplot(x='day', y='tip', data=tips, ax=axes[1,1],
            errorbar='sd', capsize=0.1)
axes[1,1].set_title('Bar Plot: Average Tips by Day', fontweight='bold')

plt.tight_layout()
plt.show()

Distribution Analysis

Understanding the distribution of your data is crucial for statistical analysis. Seaborn provides powerful tools for visualizing univariate and bivariate distributions.

Univariate

Single variable distributions

• histplot(), displot()
• kdeplot(), rugplot()

Bivariate

Two variable joint distributions

• jointplot(), jointplot(kind="hex")
• kdeplot(), pairplot()

Multivariate

Multiple variable relationships

• pairplot(), heatmap()
• FacetGrid(), PairGrid()

Distribution Comparison

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# Load iris dataset
iris = sns.load_dataset('iris')

# Create distribution plots
fig, axes = plt.subplots(2, 2, figsize=(15, 12))

# 1. Histogram with KDE
sns.histplot(iris['sepal_length'], kde=True, ax=axes[0,0], 
             color='skyblue', bins=20)
axes[0,0].set_title('Sepal Length Distribution', fontweight='bold')

# 2. Multiple histograms
sns.histplot(data=iris, x='sepal_length', hue='species', 
             ax=axes[0,1], alpha=0.7, bins=15)
axes[0,1].set_title('Sepal Length by Species', fontweight='bold')

# 3. Joint distribution
sns.scatterplot(data=iris, x='sepal_length', y='petal_length',
                hue='species', ax=axes[1,0], alpha=0.8)
axes[1,0].set_title('Sepal vs Petal Length', fontweight='bold')

# 4. Pair plot preview (2 variables)
sns.scatterplot(data=iris, x='sepal_width', y='petal_width',
                hue='species', ax=axes[1,1], alpha=0.8)
axes[1,1].set_title('Sepal vs Petal Width', fontweight='bold')

plt.tight_layout()
plt.show()

Regression Analysis

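Seaborn makes it easy to add regression lines to your scatter plots, helping you see linear and simple nonlinear relationships. These functions are visual guides for exploring data: they draw the fit and its confidence band but do not return the fitted coefficients, so use a library such as statsmodels when you need the model itself.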

Regression Functions

regplot(): Simple regression with scatter plot
lmplot(): Regression with faceting support
residplot(): Residuals analysis

Key Features

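• Bootstrapped confidence bands drawn automatically (ci)
• Several model types: polynomial (order), logistic (logistic), lowess (lowess)
• Robust regression that down-weights outliers (robust=True, requires statsmodels)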
• Residual analysis tools
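
The faceting support of lmplot() can be sketched as below. The data is synthetic and the "hours", "score", and "group" columns are hypothetical names chosen for the example. Because seaborn only draws the fit, the sketch recovers the slopes separately with numpy.polyfit; that step is an addition for illustration, not something lmplot() does.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (an assumption for illustration): two groups whose
# true slopes are 1.0 (control) and 2.0 (treated).
rng = np.random.default_rng(2)
n = 120
df = pd.DataFrame({
    "hours": rng.uniform(0, 10, n),
    "group": np.tile(["control", "treated"], n // 2),
})
true_slope = np.where(df["group"] == "treated", 2.0, 1.0)
df["score"] = 5 + true_slope * df["hours"] + rng.normal(0, 2, n)

# One regression panel per group; lmplot() returns a FacetGrid
g = sns.lmplot(data=df, x="hours", y="score",
               col="group", hue="group", height=4, ci=95)

# seaborn does not expose the coefficients, so fit them separately
fits = {}
for name, part in df.groupby("group"):
    slope_hat, intercept_hat = np.polyfit(part["hours"], part["score"], deg=1)
    fits[name] = (slope_hat, intercept_hat)
    print(f"{name}: slope={slope_hat:.2f}, intercept={intercept_hat:.2f}")
```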

Multiple Regression Models

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# Load tips dataset
tips = sns.load_dataset('tips')

# Create regression analysis
fig, axes = plt.subplots(2, 2, figsize=(15, 12))

# 1. Simple linear regression
sns.regplot(x='total_bill', y='tip', data=tips, 
            ax=axes[0,0], scatter_kws={'alpha':0.6})
axes[0,0].set_title('Linear Regression: Bill vs Tip', fontweight='bold')

# 2. Polynomial regression
sns.regplot(x='total_bill', y='tip', data=tips, 
            ax=axes[0,1], order=2, scatter_kws={'alpha':0.6},
            line_kws={'color': 'red', 'label': 'Polynomial'})
axes[0,1].set_title('Polynomial Regression (Order 2)', fontweight='bold')

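# 3. Regression with hue
# Fix the level order and colors so points and trend lines match
times = list(tips['time'].cat.categories)
palette = sns.color_palette(n_colors=len(times))
sns.scatterplot(x='total_bill', y='tip', hue='time', hue_order=times,
                palette=palette, data=tips, ax=axes[1,0], alpha=0.7)

# Add separate regression lines
for time, color in zip(times, palette):
    subset = tips[tips['time'] == time]
    sns.regplot(x='total_bill', y='tip', data=subset,
                ax=axes[1,0], scatter=False, color=color,
                label=f'{time} trend')

axes[1,0].legend()  # Refresh the legend to include the trend lines
axes[1,0].set_title('Separate Regressions by Time', fontweight='bold')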

# 4. Residuals plot
sns.residplot(x='total_bill', y='tip', data=tips, 
              ax=axes[1,1], scatter_kws={'alpha':0.6})
axes[1,1].set_title('Residuals Plot', fontweight='bold')
axes[1,1].axhline(y=0, color='red', linestyle='--')

plt.tight_layout()
plt.show()

Multi-plot Grids

Seaborn's grid classes (FacetGrid, PairGrid, JointGrid) allow you to create complex multi-plot layouts that reveal patterns across multiple variables and subsets of your data.

Grid Types

FacetGrid

Create grids of plots based on categorical variables

PairGrid

Show pairwise relationships in a matrix

JointGrid

Combine marginal and joint distributions

Use Cases

Compare groups across multiple variables
Explore high-dimensional relationships
Create publication-ready figure grids
Reveal patterns in complex datasets

FacetGrid Example

import seaborn as sns
import matplotlib.pyplot as plt

# Load iris dataset
iris = sns.load_dataset('iris')

# Create FacetGrid
g = sns.FacetGrid(iris, col='species', hue='species', 
                  height=4, aspect=1.2)

# Map scatter plot to each facet
g.map(sns.scatterplot, 'sepal_length', 'petal_length', 
      alpha=0.8, s=60)

# Add regression line to each facet
g.map(sns.regplot, 'sepal_length', 'petal_length', 
      scatter=False, truncate=True)

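# Customize: title the underlying matplotlib figure
# (g.figure is the preferred name; g.fig is the older alias)
g.figure.suptitle('Iris Species: Sepal vs Petal Length',
                  fontsize=16, fontweight='bold', y=1.02)

# Add overall legend
g.add_legend()

# Use the grid's own tight_layout so the space for the legend is kept
g.tight_layout()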
plt.show()

Styling & Themes

Seaborn provides beautiful default styles and extensive customization options. Master the art of creating visually appealing and publication-ready statistical plots.

Built-in Themes

darkgrid: Dark background with grid
whitegrid: White background with grid
dark: Dark background, no grid
white: White background, no grid
ticks: White background with ticks
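
To compare the five themes side by side, one option is the axes_style() context manager, which applies a style only to axes created inside the with block. This is a minimal sketch; the sine curve is placeholder data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

styles = ["darkgrid", "whitegrid", "dark", "white", "ticks"]
x = np.linspace(0, 2 * np.pi, 100)

fig = plt.figure(figsize=(15, 3))
for i, style in enumerate(styles, start=1):
    # The style is read when the axes is created, so add the
    # subplot inside the context manager
    with sns.axes_style(style):
        ax = fig.add_subplot(1, len(styles), i)
    ax.plot(x, np.sin(x))
    ax.set_title(style)
fig.tight_layout()

# axes_style() also returns the rc parameters a theme would set
print(sns.axes_style("whitegrid")["axes.grid"])
print(sns.axes_style("white")["axes.grid"])
```

sns.set_style() changes the theme globally for all later plots, while the context manager leaves the global settings untouched.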

Color Palettes

Blues
Oranges
Greens
Reds
Purples
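Greys
Brown and pink ramps are not built-in palette names; generate them with sns.light_palette("brown") or sns.light_palette("pink")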
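
A short sketch of how palette names are used in code. color_palette() returns a list of RGB tuples, as_cmap=True returns a continuous matplotlib colormap, and light_palette() builds a ramp from any single color. The choice of five colors is arbitrary.

```python
import seaborn as sns

# Five discrete colors sampled from the sequential "Blues" palette
blues = sns.color_palette("Blues", n_colors=5)
print(blues.as_hex())  # the same colors as hex strings

# The same palette as a continuous colormap, e.g. for heatmap(cmap=...)
blues_cmap = sns.color_palette("Blues", as_cmap=True)

# A ramp built from a single named color
brown_ramp = sns.light_palette("brown", n_colors=5)

# Use a palette for one plot with palette=..., or set it globally
sns.set_palette(blues)
```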

Complete Styling Example

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# Set custom style
sns.set_style("whitegrid", {
    'grid.color': '.8',
    'grid.linewidth': 0.5,
    'axes.edgecolor': '.8',
    'axes.linewidth': 0.8
})

# Custom color palette
custom_palette = sns.color_palette("Set2")
sns.set_palette(custom_palette)

# Load dataset
tips = sns.load_dataset('tips')

# Create styled plot
plt.figure(figsize=(12, 8))

# Main plot with custom styling
ax = sns.scatterplot(data=tips, x='total_bill', y='tip', 
                     hue='time', size='size', alpha=0.8,
                     edgecolor='white', linewidth=1)

# Add regression line
sns.regplot(data=tips, x='total_bill', y='tip', 
            scatter=False, ax=ax, color='red', 
            line_kws={'linewidth': 2, 'alpha': 0.8})

# Styling
plt.title('Elegant Seaborn Styling Example', 
          fontsize=18, fontweight='bold', 
          fontfamily='serif', pad=20)

plt.xlabel('Total Bill ($)', fontsize=14, fontweight='bold')
plt.ylabel('Tip ($)', fontsize=14, fontweight='bold')

# Customize legend
plt.legend(title='Time of Day', title_fontsize=12, 
           fontsize=11, frameon=True, fancybox=True, 
           shadow=True, loc='upper left')

# Add subtle background color
ax.set_facecolor('#fafafa')

plt.tight_layout()
plt.show()
