SQL Query and performance tuning - Optimization techniques

 

🚀 Boosting PostgreSQL Performance: Practical Tips & Examples

PostgreSQL is powerful, but to make your queries fly, you need to dig a little deeper. In this guide, we’ll explore performance optimization techniques that go beyond the basics, covering Common Table Expressions (CTEs), query planner hints, parallel execution, caching, and indexing.

Let’s break it down with examples. 💡


🧩 1. Avoid Repetitive Computation with CTEs

Common Table Expressions (CTEs) help you write clean SQL and reuse subqueries efficiently.

🔁 Problem:

You're repeating the same complex subquery multiple times.

SELECT COUNT(*) FROM (SELECT * FROM orders WHERE status = 'completed') AS sub;

✅ Solution: Use a CTE

WITH completed_orders AS (
    SELECT * FROM orders WHERE status = 'completed'
)
SELECT COUNT(*) FROM completed_orders;

💡 Tip:

Since PostgreSQL 12, a simple CTE that is referenced only once is inlined into the parent query by default, so the planner can optimize it together with the rest of the query. To force the CTE to be computed once and its result reused, use AS MATERIALIZED:

WITH completed_orders AS MATERIALIZED (
    SELECT * FROM orders WHERE status = 'completed'
)
SELECT COUNT(*) FROM completed_orders;
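The opposite hint exists too: AS NOT MATERIALIZED (PostgreSQL 12+) asks the planner to inline the CTE even where it would otherwise materialize it. A minimal sketch; whether inlining actually wins depends on your data:

WITH completed_orders AS NOT MATERIALIZED (
    SELECT * FROM orders WHERE status = 'completed'
)
SELECT COUNT(*) FROM completed_orders;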

🧭 2. Guide the Query Planner (Carefully)

Unlike Oracle or MySQL, PostgreSQL doesn't support inline query hints. Instead, you can enable or disable planner features per session (or per transaction):

👇 Example:

SET enable_nestloop TO off;

This discourages nested-loop joins for the current session (the planner will still use them if no other join strategy is possible).
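If you only want to influence a single query, a common pattern is to wrap it in a transaction with SET LOCAL so the setting reverts automatically. A sketch below; the customers table and join columns are illustrative, not taken from the examples above:

BEGIN;
SET LOCAL enable_nestloop = off;   -- applies only until COMMIT/ROLLBACK

EXPLAIN ANALYZE
SELECT o.*, c.*
FROM orders o
JOIN customers c ON c.id = o.customer_id;   -- hypothetical customers table

COMMIT;   -- enable_nestloop is back to its previous value here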

⚠️ Use planner controls only when you know the planner is making a bad choice. Try updating stats or partitioning first!
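Refreshing statistics is usually a one-liner, for example on the orders table used earlier:

ANALYZE orders;   -- recollect planner statistics for this table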


⚡ 3. Speed Up with Parallel Queries

PostgreSQL supports parallel query execution, which can significantly reduce query time when the right conditions are met.

✅ When Parallel Works:

  • The query only reads data (no writes).

  • The plan contains Gather or Gather Merge.

  • The database isn't in single-user mode.

  • You’re not using parallel-unsafe functions.

⚙️ Settings that impact parallelism:

-- Enable parallel workers
SET max_parallel_workers_per_gather = 4;

-- Ensure dynamic shared memory is on
SHOW dynamic_shared_memory_type;

🔍 Example: EXPLAIN output

EXPLAIN ANALYZE SELECT COUNT(*) FROM large_table WHERE category = 'Books';

Look for Gather in the plan.
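For orientation, a parallel plan for a query like this typically has a shape along the following lines (node names only; actual costs, row counts, and worker numbers depend on your data and settings):

Finalize Aggregate
  -> Gather
        Workers Planned: 2
        -> Partial Aggregate
             -> Parallel Seq Scan on large_table
                   Filter: (category = 'Books')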


🧠 4. Use Caching Wisely

Caching can make or break your query speed.

PostgreSQL uses shared buffers to cache data in memory.

🔍 Check Cache Efficiency:

Run EXPLAIN (ANALYZE, BUFFERS) and compare the shared hit and read buffer counts. If reads dominate, you're going to disk too often.
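For a broader view than a single plan, the statistics views track buffer hits per table. A minimal sketch using the built-in pg_statio_user_tables view:

-- Buffer cache hits vs. disk reads, worst offenders first
SELECT relname,
       heap_blks_hit,
       heap_blks_read,
       round(heap_blks_hit::numeric
             / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) AS hit_ratio
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 10;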

🛠 Adjust Shared Buffers:

# postgresql.conf
shared_buffers = 4GB    # recommend ~25% of system RAM

Note: This change requires a server restart.
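If you prefer not to edit the file by hand, the same change can be made with ALTER SYSTEM; it still takes effect only after a restart:

ALTER SYSTEM SET shared_buffers = '4GB';   -- written to postgresql.auto.conf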


🗂 5. Indexing Best Practices

Indexes speed up lookups, joins, and filters, but only if they are used correctly.

✅ Index Join & Filter Columns

CREATE INDEX idx_orders_status ON orders(status);

📦 Use Covering Indexes

CREATE INDEX idx_orders_covering ON orders(status, customer_id, total_amount);

This can satisfy a query entirely from the index (an index-only scan) without touching the table heap, provided the visibility map is reasonably up to date.
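On PostgreSQL 11 and later you can also keep only the filter column in the index key and add the extra columns with INCLUDE. A sketch using the same columns as above:

CREATE INDEX idx_orders_covering_incl
    ON orders (status)
    INCLUDE (customer_id, total_amount);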

🔎 Use Functional Indexes for Expressions

CREATE INDEX idx_lower_email ON users(LOWER(email));

Then use:

SELECT * FROM users WHERE LOWER(email) = 'test@example.com';

Problem:

Normally, if you use a function in a WHERE clause—like LOWER(email)—the database
cannot use a regular index on the email column. This forces a full table scan,
which is slow.

Why it works:

Functional indexes store the result of the expression (e.g., the lowercase version of each email).
When your query uses the same expression, the database can look up the indexed values directly, avoiding a full scan.

⚠️ String Matching: Use LIKE Carefully

Good (index can be used):

WHERE name LIKE 'John%';

Bad (index cannot be used):

WHERE name LIKE '%John';

Benefits of Index-Friendly LIKE Patterns:

  1. Index Usage = Speed

     Patterns like 'prefix%' allow B-tree indexes to quickly find matches using range scans.

  2. Avoids Full Table Scans

     Patterns starting with % (like '%John') prevent PostgreSQL from using indexes, forcing a slower full scan.

  3. Scales Better with Large Data

     For large tables, avoiding full scans is crucial. Indexes can reduce search time from seconds to milliseconds.
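One caveat: with a non-C locale (the common case), a plain B-tree index may not support LIKE 'John%' at all. Creating the index with the text_pattern_ops operator class enables prefix matching; a minimal sketch, assuming a users table with a text column name:

CREATE INDEX idx_users_name_pattern ON users (name text_pattern_ops);

-- This predicate can now use the index
SELECT * FROM users WHERE name LIKE 'John%';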

📝 Bonus Tips

  • Use IS NULL (not = NULL) to check for nulls.

  • For date filtering, prefer ranges:


-- Efficient
WHERE created_at >= DATE '2024-01-01'
  AND created_at <  DATE '2024-01-02';

  • Use proper data types. Don’t store numbers in text columns.

  • Consider GIN indexes for full-text search.
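For example, a minimal sketch of a GIN full-text index, assuming a hypothetical articles table with a text column body:

-- Index the tsvector form of the column
CREATE INDEX idx_articles_fts
    ON articles USING GIN (to_tsvector('english', body));

-- Queries must use the same expression for the index to apply
SELECT *
FROM articles
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'postgres & tuning');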


📌 Final Thoughts

A fast PostgreSQL database doesn’t come from luck—it comes from intentional design decisions. Here’s a quick recap:

Feature             Purpose                  Tip
CTEs                Avoid repetition         Use AS MATERIALIZED for reuse
Planner Hints       Fine-tune execution      Use SET enable_* carefully
Parallel Queries    Speed via concurrency    Watch for Gather in plans
Cache               Faster read access       Increase shared_buffers
Indexes             Fast lookups & joins     Use covering & functional indexes

🔧 Want to Tune Your Queries?

Use this command to see how PostgreSQL is executing your SQL:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE status = 'completed';   -- substitute your own query

Let PostgreSQL show you the truth—and optimize from there.
