SQL Query and performance tuning - Joins in depth
🔗 Joins vs. Subqueries: What to Use and How It Works Behind the Scenes
When you're working with SQL, you'll often face this question:
Should I use a JOIN or a SUBQUERY?
Both get the job done. But behind the scenes, your choice can affect performance, sometimes dramatically. Let’s break down the difference, peek under the hood at how joins actually work, and help you choose the right one.
🧩 What’s the Difference?
Let’s say you have two tables:
🔁 Option 1: Using a JOIN
🧠 Option 2: Using a SUBQUERY
✅ Both queries return the same result. But what happens under the hood?
🛠️ How Joins Actually Work
Let’s break down three core join algorithms used by SQL engines:
1. 🌀 Nested Loop Join
How it works:
-
Outer loop → scans one table (driver table)
-
Inner loop → scans the other table for each row in the outer loop
Visual Example:
Best for:
-
Small datasets
-
Flexible: works with =, <, >, BETWEEN, etc.
-
Indexes can speed this up
Drawback:
Slow for large tables if indexes are missing or data doesn’t fit in memory.
2. 🧮 Hash Join
How it works:
-
Build a hash table from the smaller input
-
Scan the larger table and look up matches in the hash
Visual Example:
Best for:
-
Large joins with equality conditions
-
Fast when both tables are big
Drawback:
-
Only supports equality (=) joins
-
More memory-intensive than nested loops
3. 🔀 Merge Join (Sort-Merge Join)
How it works:
-
Sort both tables on the join key
-
Walk through both in order, like merging two sorted lists
Visual Example:
Best for:
-
Pre-sorted or indexed tables
-
Very efficient on large datasets
Drawback:
-
Only supports equality (=) joins
-
Requires sort step if data isn’t already ordered
🔎 So… Should You Use a Join or Subquery?
✅ Use Joins When:
-
You're combining multiple tables in a straightforward way
-
You need better performance (joins are often optimized better)
-
You want to apply filters on multiple tables together
✅ Use Subqueries When:
-
You need to fetch a single value (scalar subquery)
-
You're trying to keep query logic simpler and more readable
-
You're returning aggregates or values that only relate to one row
🎯 Final Takeaway
Use the method that makes your intentions clear.
Then, check the execution plan to optimize.
Modern database engines are smart. Joins and subqueries can often be optimized to run similarly. But knowing how they work helps you write faster and clearer SQL.
📌 Pro Tip: Look for “Covered Indexes”
If your index includes all the columns used in your query (called a covered index), your join might never touch the full table —> big performance boost!
💬 Over to You
Which join type do you use most often? Have you ever switched from a subquery to a join for performance? Share your thoughts in the comments below!
Comments
Post a Comment