SQL Query and performance tuning - Explain and Analyse commands

 



๐Ÿ› ️ Understanding SQL Query Performance with EXPLAIN and ANALYZE

When it comes to SQL optimization, the first and most powerful step is understanding how your queries are actually executed. While SQL is declarative—you tell the database what you want—what happens under the hood is procedural and driven by a query execution plan.

This blog post will walk you through how to use PostgreSQL’s EXPLAIN and ANALYZE tools to explore, diagnose, and improve query performance, including how WHERE clauses and indexes influence plans.


๐Ÿ” Meet EXPLAIN: Peek Into the Execution Plan

Let’s start simple:

EXPLAIN SELECT * FROM staff;

Output:


Seq Scan on staff (cost=0.00..24.00 rows=1000 width=75)
  • Seq Scan: A full table scan—every row is read.

  • Cost: Estimated computation units. Not real time, but useful for comparing query complexity.

  • Width: Amount of data (in bytes) returned per row.

๐ŸŸข For small tables, a Seq Scan is typically fast enough.
๐Ÿ”ด For large tables, it can be inefficient.


๐Ÿงช EXPLAIN ANALYZE: Get Real Execution Metrics

To see what actually happens when the query runs, use:

EXPLAIN ANALYZE SELECT * FROM staff;

This outputs real runtime stats:


Seq Scan on staff (actual time=0.02..0.05 rows=1000 width=75) Execution Time: 208 ms

If you reduce the number of columns:

EXPLAIN ANALYZE SELECT last_name FROM staff;

Output:



Seq Scan on staff (actual time=0.01..0.03 rows=1000 width=7) Execution Time: 0.238 ms

๐ŸŽฏ Smaller row width = low memory consumption


๐Ÿง  Adding a WHERE Clause: Does It Help?

Let’s try filtering data:

EXPLAIN ANALYZE SELECT * FROM staff WHERE salary > 75000;

This returns fewer rows (e.g., 717 out of 1000), but still uses a Seq Scan:




Seq Scan on staff (actual time=0.01..0.06 rows=717) Execution Time: 19.560 ms

Compare this with the previous example of not having the where clause.

๐Ÿ‘‰ Even though fewer rows are returned, the execution time is longer because of the extra filtering step.


๐Ÿ—‚️ Enter Indexes: How Do They Help?

Indexes can avoid full table scans, especially for selective queries.

Let’s create an index:

CREATE INDEX idx_staff_salary ON staff(salary);

Now retry:

EXPLAIN SELECT * FROM staff WHERE salary > 75000;



๐Ÿ”ด Still a Seq Scan?

That’s because too many rows match the condition. The planner estimates that scanning the entire table is faster than jumping between the index and table rows.


✅ Making Queries More Selective

Try a more selective query:

EXPLAIN SELECT * FROM staff WHERE salary > 150000;

Now the index is likely used:


Index Scan using idx_staff_salary on staff (cost=0.28..8.52 rows=5 width=75)

๐ŸŽ‰ Success! PostgreSQL uses the index because only a few rows match, making it faster than a full scan.


๐Ÿง  Takeaway: When Are Indexes Used?

Query TypeIndex Used?
SELECT * FROM staff ❌ No (Seq Scan)
WHERE salary > 75000 (many rows) ❌ No (still Seq Scan)
WHERE salary > 150000 (few rows)         ✅ Yes (Index Scan)

๐Ÿ”‘ General Rule:

Indexes are used when the WHERE clause is highly selective, meaning it narrows the result set significantly.


๐Ÿ“ Metrics to Pay Attention To

MetricWhat It Tells You
Cost    Estimated effort (useful for comparisons)
Execution Time      Actual time the query took (ms)
Rows    How many rows were returned
Width    Avg. bytes per row (lower = faster I/O)
Operation Type    Seq Scan vs Index Scan vs Filter, etc.

๐Ÿงฉ Visual Metaphor: Zooming In with a Magnifying Glass

You can think of EXPLAIN ANALYZE like using a magnifying glass on your data query:

  • ๐Ÿ” Without WHERE clause: You're looking at the whole pizza (table).

  • ๐Ÿ”Ž With a WHERE clause: You're trying to find slices with only mushrooms.

  • ๐Ÿ—‚️ With an index: You’ve already labeled slices by topping and can go directly to the mushrooms!


๐Ÿง  Final Thoughts: How to Use This in Practice

  • Use EXPLAIN to understand execution paths.

  • Use ANALYZE to measure real-world performance.

  • Index only columns used in selective WHERE clauses.

  • Watch row width to reduce data transfer time.

  • Remember: PostgreSQL is smart—it chooses plans based on cost and data distribution.


Comments

Popular posts from this blog

SQL Query and performance tuning - Indexing in depth

Apache Kafka - The basics

Spring: How to deal with circular dependencies