SQL Query and performance tuning - Partitioning in Depth
Partitioning in Databases: Horizontal vs. Vertical (With Simple Examples)
Working with large tables in a database can quickly become inefficient. Queries slow down, indexes balloon in size, and deleting old data becomes a headache.
Enter partitioning.
Partitioning is a technique where we split a big table into smaller pieces to improve performance and manageability.
Let’s break down the two main types of partitioning, horizontal and vertical, and then explore the range, list, and hash methods of horizontal partitioning.
1. Horizontal vs. Vertical Partitioning
Horizontal Partitioning
What is it?
You're splitting rows into separate tables (aka partitions). Each partition holds a subset of the rows.
Think of it like:
Breaking one giant table into several smaller tables based on some criteria like date, region, or category.
Use Case:
When querying time-based logs, you usually care about recent data. Why search the whole table if you only need last week?
✅ Benefits:
- Smaller indexes per partition = faster queries
- Can drop old data just by dropping a partition
- Easier to scan recent data (less I/O)
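To make these benefits concrete, here is a minimal sketch that emulates horizontal partitioning with one SQLite table per month. SQLite has no built-in partitioning, so the per-month tables, the `logs_YYYY_MM` naming scheme, and the `partition_for` and `insert_log` helpers are all assumptions made for illustration; a database with native partitioning would do this routing for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per month stands in for one partition (hypothetical names).
for partition in ("logs_2025_06", "logs_2025_07"):
    conn.execute(
        f"CREATE TABLE {partition} "
        "(id INTEGER PRIMARY KEY, logged_on TEXT, message TEXT)"
    )


def partition_for(logged_on: str) -> str:
    """Route an ISO date such as '2025-07-08' to its monthly partition."""
    year, month, _day = logged_on.split("-")
    return f"logs_{year}_{month}"


def insert_log(logged_on: str, message: str) -> None:
    """Insert a row into the partition that owns its date."""
    table = partition_for(logged_on)
    conn.execute(
        f"INSERT INTO {table} (logged_on, message) VALUES (?, ?)",
        (logged_on, message),
    )


insert_log("2025-06-15", "old entry")
insert_log("2025-07-08", "recent entry")

# A query for recent data only has to touch the July partition.
recent = conn.execute(
    "SELECT message FROM logs_2025_07 WHERE logged_on >= ?", ("2025-07-01",)
).fetchall()

# Retiring June is a single DROP instead of a row-by-row DELETE.
conn.execute("DROP TABLE logs_2025_06")
remaining_tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

After this runs, `recent` holds only the July row and `remaining_tables` lists only `logs_2025_07`.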
Vertical Partitioning
What is it?
You're splitting columns into separate tables. Each table keeps the same primary key, but only contains certain columns.
Think of it like:
Separating “frequently used” columns from “rarely used” ones.
Use Case:
Product tables with 100+ attributes, but most queries only use 4 or 5 columns.
✅ Benefits:
- More rows fit per block = faster reads
- Reduced I/O for queries needing fewer columns
- Works well with analytics and columnar storage
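A small sketch of the idea, using SQLite from Python: the frequently used columns live in one table and the rarely used ones in another, both keyed by the same `id`. The table names (`products_core`, `products_details`) and the columns are hypothetical, chosen only to illustrate the split.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- "Hot" columns that most queries read.
    CREATE TABLE products_core (
        id    INTEGER PRIMARY KEY,
        name  TEXT,
        price REAL
    );
    -- "Cold" columns that are rarely needed; same primary key.
    CREATE TABLE products_details (
        id                INTEGER PRIMARY KEY REFERENCES products_core(id),
        long_description  TEXT,
        care_instructions TEXT
    );
    """
)
conn.execute("INSERT INTO products_core VALUES (1, 'Shirt', 19.99)")
conn.execute(
    "INSERT INTO products_details VALUES "
    "(1, 'A plain cotton shirt.', 'Machine wash cold.')"
)

# The common query reads only the narrow table.
listing = conn.execute("SELECT id, name, price FROM products_core").fetchall()

# The occasional full view joins the two halves on the shared key.
full_row = conn.execute(
    """
    SELECT c.id, c.name, c.price, d.long_description, d.care_instructions
    FROM products_core AS c
    JOIN products_details AS d ON d.id = c.id
    WHERE c.id = ?
    """,
    (1,),
).fetchone()
```

The trade-off is that any query needing columns from both halves now pays for a join, so the split works best when the "rare" columns really are rare.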
2. Types of Horizontal Partitioning
Now let’s dig deeper into horizontal partitioning methods. There are three major types:
A. Range Partitioning
How it works:
You split data based on ranges of values — usually dates or numbers.
Example:
A measurements table with a date column:
| date | temperature | humidity | 
|---|---|---|
| 2025-07-01 | 22°C | 45% | 
| 2025-07-08 | 24°C | 40% | 
Partitioned as:
- measurements_2025_wk1: July 1–7
- measurements_2025_wk2: July 8–14
✅ Great for:
- Time-series data (logs, sensors)
- Periodic reports (monthly, weekly)
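The routing rule behind range partitioning can be sketched in a few lines of Python. The partition names and week boundaries come from the example above; the `range_partition` function and the choice to reject dates outside the defined ranges are assumptions for illustration.

```python
from bisect import bisect_right
from datetime import date

# Inclusive start date of each partition, kept in sorted order.
PARTITION_STARTS = [date(2025, 7, 1), date(2025, 7, 8)]
PARTITION_NAMES = ["measurements_2025_wk1", "measurements_2025_wk2"]
# Exclusive end of the last partition (July 8-14 ends before July 15).
RANGE_END = date(2025, 7, 15)


def range_partition(day: date) -> str:
    """Return the name of the partition whose date range contains `day`."""
    if not (PARTITION_STARTS[0] <= day < RANGE_END):
        raise ValueError(f"no partition covers {day}")
    # bisect_right counts how many start dates are <= day;
    # subtracting 1 gives the index of the owning partition.
    index = bisect_right(PARTITION_STARTS, day) - 1
    return PARTITION_NAMES[index]
```

With this sketch, the two rows in the table above land in different partitions: `range_partition(date(2025, 7, 1))` returns `measurements_2025_wk1` and `range_partition(date(2025, 7, 8))` returns `measurements_2025_wk2`.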
B. List Partitioning
How it works:
You split data based on a defined list of values.
Example:
A products table partitioned by category.
| id | name | category | 
|---|---|---|
| 1 | Shirt | Clothing | 
| 2 | Laptop | Electronics | 
Partitioned as:
- products_clothing
- products_electronics
✅ Great for:
- Product catalogs
- Region- or department-specific data
- Queries that focus on a specific group
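List partitioning boils down to an explicit lookup from value to partition. The sketch below uses the categories and partition names from the example; the `list_partition` function and the decision to raise an error for an unlisted category are assumptions (some databases offer a default partition for such rows instead).

```python
# Each listed value maps to exactly one partition.
CATEGORY_TO_PARTITION = {
    "Clothing": "products_clothing",
    "Electronics": "products_electronics",
}


def list_partition(category: str) -> str:
    """Return the partition that stores rows with this category."""
    try:
        return CATEGORY_TO_PARTITION[category]
    except KeyError:
        raise ValueError(f"no partition defined for category {category!r}") from None


# Route the sample rows (id, name, category) into their partitions.
products = [(1, "Shirt", "Clothing"), (2, "Laptop", "Electronics")]
partitions = {}
for row in products:
    partitions.setdefault(list_partition(row[2]), []).append(row)
```

A query that filters on `category = 'Clothing'` then only needs to look at `products_clothing`.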
C. Hash Partitioning
How it works:
Data is assigned to partitions using a hash function on a key, like customer_id or session_id.
Example:
A web_logs table with millions of sessions.
| session_id | page_url | timestamp | 
|---|---|---|
| 101 | /home | 2025-07-08 08:10AM | 
| 202 | /products/123 | 2025-07-08 08:15AM | 
Hash partitions:
The partition number is session_id % 5, which gives a value from 0 to 4. Session 101 goes to partition 1, and session 202 goes to partition 2.
✅ Great for:
- Evenly distributing data
- Large, ungrouped datasets
- Load balancing across partitions
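Here is a minimal sketch of the modulo rule from the example, plus a check of how evenly it spreads sequential IDs. The function names are hypothetical, and the text-key variant using `zlib.crc32` is an added assumption for keys that are not integers; real databases use their own internal hash functions.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 5


def hash_partition(session_id: int) -> int:
    """Partition number (0-4) for an integer key, as in session_id % 5."""
    return session_id % NUM_PARTITIONS


def hash_partition_for_text(key: str) -> int:
    """Partition number for a string key.

    zlib.crc32 gives the same value on every run, unlike Python's built-in
    hash() for strings, which is randomized per process by default.
    """
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


# Spread 1,000 sequential session IDs and count rows per partition.
counts = Counter(hash_partition(session_id) for session_id in range(1, 1001))
```

For these sequential IDs, each of the five partitions receives exactly 200 rows. Real keys are rarely this regular, so it is worth checking the distribution on your own data.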
Which Should You Use?
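| Method | Splits rows by | Example from this post | Great for |
|---|---|---|---|
| Range | Ranges of values, usually dates or numbers | measurements split by week | Time-series data, periodic reports |
| List | A defined list of values | products split by category | Product catalogs, region- or department-specific data |
| Hash | A hash function on a key | web_logs split by session_id % 5 | Even distribution, large ungrouped datasets, load balancing |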
Final Thoughts
Partitioning isn't just for giant enterprises. Even mid-sized datasets can benefit from smarter table structures.
Start small:
- Choose a partition key that aligns with your query patterns
- Monitor performance before and after
- Keep partitions manageable (avoid too many!)
 
 