SQL Query and performance tuning - Partitioning in Depth
๐ Partitioning in Databases: Horizontal vs. Vertical (With Simple Examples)
Working with large tables in a database can quickly become inefficient. Queries slow down, indexes balloon in size, and deleting old data becomes a headache.
๐ Enter partitioning.
Partitioning is a technique where we split a big table into smaller pieces to improve performance and manageability.
Let’s break down the two main types of partitioning —> Horizontal and Vertical, and explore Range, List, and Hash partitioning methods.
๐งฑ 1. Horizontal vs. Vertical Partitioning
๐ Horizontal Partitioning
What is it?
You're splitting rows into separate tables (aka partitions). Each partition holds a subset of the rows.
Think of it like:
Breaking one giant table into several smaller tables based on some criteria like date, region, or category.
๐ Use Case:
When querying time-based logs, you usually care about recent data. Why search the whole table if you only need last week?
✅ Benefits:
-
Smaller indexes per partition = faster queries
-
Can drop old data just by deleting a partition
-
Easier to scan recent data (less I/O)
๐ Vertical Partitioning
What is it?
You're splitting columns into separate tables. Each table keeps the same primary key, but only contains certain columns.
Think of it like:
Separating “frequently used” columns from “rarely used” ones.
๐ Use Case:
Product tables with 100+ attributes, but most queries only use 4 or 5 columns.
✅ Benefits:
-
More rows fit per block = faster reads
-
Reduced I/O for queries needing fewer columns
-
Works well with analytics and columnar storage
๐ 2. Types of Horizontal Partitioning
Now let’s dig deeper into horizontal partitioning methods. There are three major types:
๐ A. Range Partitioning
How it works:
You split data based on ranges of values — usually dates or numbers.
Example:
A measurements
table with a date
column:
date | temperature | humidity |
---|---|---|
2025-07-01 | 22°C | 45% |
2025-07-08 | 24°C | 40% |
๐ Partitioned as:
-
measurements_2025_wk1
: July 1–7 -
measurements_2025_wk2
: July 8–14
✅ Great for:
-
Time-series data (logs, sensors)
-
Periodic reports (monthly, weekly)
๐ B. List Partitioning
How it works:
You split data based on a defined list of values.
Example:
A products
table partitioned by category
.
id | name | category |
---|---|---|
1 | Shirt | Clothing |
2 | Laptop | Electronics |
๐ Partitioned as: |
-
products_clothing
-
products_electronics
✅ Great for:
-
Product catalogs
-
Region- or department-specific data
-
Queries that focus on a specific group
๐ข C. Hash Partitioning
How it works:
Data is assigned to partitions using a hash function on a key, like customer_id
.
Example:
A web_logs
table with millions of sessions.
session_id | page_url | timestamp |
---|---|---|
101 | /home | 2025-07-08 08:10AM |
202 | /products/123 | 2025-07-08 08:15AM |
๐ Hash partitions: |
Each session_id % 5
determines the partition (0–4).
✅ Great for:
-
Evenly distributing data
-
Large, ungrouped datasets
-
Load balancing across partitions
๐ฏ Which Should You Use?
| |||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
๐ง Final Thoughts
Partitioning isn't just for giant enterprises. Even mid-sized datasets can benefit from smarter table structures.
Start small:
-
Choose a partition key that aligns with your query patterns
-
Monitor performance before and after
-
Keep partitions manageable (avoid too many!)
Comments
Post a Comment