6 Techniques to Scale Database Reads
Most applications read far more than they write. And as your data grows, reads are usually the first thing to slow down.
A read becomes slow when the database has to:
- Search through large amounts of data
- Join multiple tables
- Compute aggregates
- Reconstruct the same data repeatedly
In most distributed systems, reads dominate writes. So if you want to scale your system, you need to start by scaling your reads.
There are several techniques to do this. Each comes with its own tradeoffs.
Replication
Replication means maintaining multiple copies of the same database across different servers. Instead of one database serving all reads and writes, you divide the work. The primary database handles writes, and the replicas handle reads.
Primary DB → Replica 1, Replica 2, Replica 3 ...
This allows you to horizontally scale reads. If one database can serve R reads/sec, then N replicas can serve roughly N × R reads/sec. Replication also improves availability. Instead of one database, you now have multiple nodes serving requests.
Types of Replication
Synchronous Replication: A write request comes to the primary node. The primary applies the write, then waits for all replicas to confirm it before acknowledging the write.
- Pros: Strong consistency. Data is replicated across all nodes before the write is considered complete.
- Cons: Higher write latency because the primary has to wait for all replicas to confirm.
- Used in: Systems where strong consistency is required, like financial systems.
Asynchronous Replication: The primary acknowledges the write immediately. Replication happens in the background.
- Pros: Fast writes.
- Cons: Eventual consistency. Replicas might not be updated immediately, which introduces replication lag.
- Used by: MySQL, PostgreSQL, and most real-world systems.
The biggest tradeoff with replication is replication lag. Your replicas may serve slightly stale data depending on how far behind they are.
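A read/write split like this is often implemented in application code or a proxy. Below is a minimal sketch, assuming hypothetical connection strings and a naive keyword check to classify queries (a real router would also have to account for transactions and replication lag):

```python
import itertools

# Hypothetical connection strings standing in for real database handles.
PRIMARY = "primary-db:5432"
REPLICAS = ["replica-1:5432", "replica-2:5432", "replica-3:5432"]

# Round-robin over the replicas to spread read load evenly.
_replica_cycle = itertools.cycle(REPLICAS)

def route(query: str) -> str:
    """Send writes to the primary and reads to the next replica."""
    first_word = query.lstrip().split(None, 1)[0].upper()
    is_write = first_word in {"INSERT", "UPDATE", "DELETE"}
    return PRIMARY if is_write else next(_replica_cycle)
```

With three replicas, successive reads land on replica-1, replica-2, replica-3, and back again, while every write goes to the primary.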
Indexes
An index is a data structure that the database maintains alongside your table to make lookups faster. Think of it like the index at the back of a textbook. Instead of scanning every page, you look up the keyword and jump straight to the right page.
Without an index, the database performs a full table scan, reading every row to find matches. With an index, it narrows down the search to a much smaller subset.
-- Without index: full table scan on 10M rows
SELECT * FROM orders WHERE customer_id = 42;
-- Create an index on customer_id
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
-- Now the same query uses the index and returns much faster
Types of Indexes
- B-Tree Index: The default in most databases (PostgreSQL, MySQL). Works well for equality and range queries. Keeps data sorted, so queries like WHERE age > 25 are efficient.
- Hash Index: Optimized for exact-match lookups (WHERE id = 42). Not useful for range queries.
- Composite Index: An index on multiple columns. Useful when your queries filter on more than one column. Column order matters: the index is used left to right.
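The left-to-right rule can be illustrated with a toy model: a composite index behaves like a list sorted by its columns in order, so a binary search works for the leftmost column but not for the second column alone. A minimal Python sketch (the column names and rows are made up for illustration):

```python
import bisect

# Toy model of a composite index on (customer_id, order_date):
# entries kept sorted, mimicking a B-tree's ordering.
index = sorted([
    (42, "2024-01-03"), (42, "2024-02-10"),
    (7, "2024-01-15"), (99, "2024-03-01"),
])

def lookup_by_customer(customer_id):
    """Leftmost-prefix lookup: binary search works because the
    index is sorted by customer_id first."""
    lo = bisect.bisect_left(index, (customer_id,))
    hi = bisect.bisect_left(index, (customer_id + 1,))
    return index[lo:hi]

def lookup_by_date(order_date):
    """Filtering on the second column alone cannot use the sort
    order -- this is effectively a full scan."""
    return [entry for entry in index if entry[1] == order_date]
```

This is why an index on (customer_id, order_date) helps a query filtering on customer_id alone, but not one filtering only on order_date.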
When Not to Index
Indexes are not free. Every index you add:
- Consumes additional disk space
- Slows down writes because the database has to update the index on every insert, update, and delete
Avoid indexing columns that are rarely queried or columns with very low cardinality (like a boolean field with only true/false values). Index the columns that appear frequently in your WHERE, JOIN, and ORDER BY clauses.
Caching
Caching is about storing frequently accessed query results in memory so the database doesn't have to compute them every time.
The idea is simple. If the same query is being executed thousands of times per minute and the underlying data doesn't change often, there's no reason to hit the database every time. You store the result in a fast in-memory store like Redis or Memcached and serve it from there.
Request → Check Cache → Cache Hit? → Return cached result
→ Cache Miss? → Query DB → Store in cache → Return result
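The flow above is the cache-aside pattern, and it fits in a few lines. Here a plain dict with expiry timestamps stands in for Redis or Memcached, and query_db is a hypothetical placeholder for the real database call:

```python
import time

# In-memory stand-in for Redis/Memcached: key -> (value, expires_at).
CACHE: dict = {}
TTL_SECONDS = 60

def query_db(customer_id: int) -> list:
    # Placeholder for the expensive database query.
    return [f"order-{customer_id}-1", f"order-{customer_id}-2"]

def get_orders(customer_id: int) -> list:
    key = f"orders:{customer_id}"
    entry = CACHE.get(key)
    if entry and entry[1] > time.time():   # cache hit, not expired
        return entry[0]
    result = query_db(customer_id)         # cache miss: hit the DB
    CACHE[key] = (result, time.time() + TTL_SECONDS)
    return result
```

The TTL is the knob that trades freshness for load: a longer TTL means fewer database hits but staler results.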
Denormalization
Denormalization means intentionally duplicating data across tables (or documents) to reduce expensive joins and make reads faster. The goal is simple: single table lookup, no joins, faster reads.
Denormalization is essentially data level caching. You embed relationships and computations directly into your schema.
This practice is common in OLAP (Online Analytical Processing) systems and in read-heavy systems where query speed matters more than write efficiency.
You implement it not because normalization failed, but because your system demands speed, scale, or simplicity that normalized structures alone can't deliver.
What Does It Eliminate?
- JOIN cost: Joins are expensive because the database must read and match rows across multiple tables, and the cost grows as data size increases.
- Cross shard queries: With denormalization, all the data for a query lives in a single shard. This improves read performance drastically. That's why NoSQL systems encourage denormalization.
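The tradeoff can be sketched with made-up users and posts: the denormalized shape reads in a single lookup, but renaming a user now requires touching every copy of the name (write amplification):

```python
# Normalized: the author name lives only in users; every read of a
# post needs a join/lookup to display the author.
users = {1: {"name": "Ada"}}
posts_normalized = [{"id": 10, "author_id": 1, "title": "Scaling reads"}]

def post_with_author(post):
    # Read-time join: two lookups per post.
    return {**post, "author_name": users[post["author_id"]]["name"]}

# Denormalized: the author name is copied into each post, so a read
# is a single lookup.
posts_denormalized = [
    {"id": 10, "author_id": 1, "author_name": "Ada",
     "title": "Scaling reads"},
]

def rename_user(user_id, new_name):
    users[user_id]["name"] = new_name
    for post in posts_denormalized:      # propagate every duplicate
        if post["author_id"] == user_id:
            post["author_name"] = new_name
```

One logical change (a rename) becomes N physical writes, and if the propagation loop is ever skipped, the copies silently disagree.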
What to Denormalize?
Identify your expensive JOIN queries and denormalize the data that is requested often.
When Not to Denormalize?
- Strong consistency requirements
- Write-heavy workloads
Denormalization is often used in the CQRS pattern where the write model is normalized and the read model is denormalized.
Tradeoffs
- Consistency problems: If the denormalized data is not updated properly across all places, it can introduce consistency problems that are harder to solve than scaling problems.
- Write amplification: Updating one logical entity means updating many rows. Every time data is inserted, updated, or deleted, the changes must be propagated across all instances of that data.
- Data duplication: Same data is stored in multiple places. The logic to update it needs to be robust.
Almost every read optimization (denormalization, indexing, materialized views) works by making writes more expensive so reads become cheap. Everything is a tradeoff in System Design :)
Connection Pooling
A connection is how a client talks to the database server. It is required to send a request and receive a response.
When your service handles 1000 requests and each one needs to query the database, it may have to open 1000 separate connections. For each connection, the service needs to:
- Find the database
- Connect using the URL and port
- Authenticate (username and password) and authorize
- Select the database
- Run the actual query
- Return the result
- Close the connection
This is a time-consuming process. Sometimes establishing the connection costs more than the query itself.
Connection pooling solves this by creating a pool of fixed connections at the start and keeping them alive for the lifetime of the application. Whenever some part of the application wants to query the database, it borrows a connection from the pool. Once the query is done, instead of closing the connection, it is returned to the pool.
Things to Keep in Mind
- Max pool size: How many connections can be open simultaneously. This needs to be determined by your CPU cores, load testing results, and application metrics.
- Idle timeout: How long idle connections stay alive before being closed.
If all connections in the pool are occupied, incoming requests get queued.
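A minimal pool can be sketched with a thread-safe queue. Here make_connection is a hypothetical stand-in for the expensive connect-and-authenticate handshake described above:

```python
import queue

def make_connection(n: int) -> str:
    # Stand-in for the real connect/authenticate/select-database steps.
    return f"conn-{n}"

class ConnectionPool:
    def __init__(self, size: int):
        # Create a fixed number of connections up front.
        self._pool: queue.Queue = queue.Queue()
        for n in range(size):
            self._pool.put(make_connection(n))

    def acquire(self, timeout: float = 5.0):
        # Blocks (queues the caller) when every connection is in use.
        return self._pool.get(timeout=timeout)

    def release(self, conn) -> None:
        # Return the connection to the pool instead of closing it.
        self._pool.put(conn)
```

Usage follows the borrow/return cycle: `conn = pool.acquire()`, run the query, then `pool.release(conn)`. Production pools (HikariCP, pgbouncer, SQLAlchemy's pool) add health checks, idle timeouts, and leak detection on top of this.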
Tradeoffs
- Pros: Requests borrow from the pool instead of creating new connections. Reduces latency and prevents connection exhaustion.
- Cons: Misconfiguration hurts both ways: a pool that is too small causes request queueing and higher latency, while one that is too large can exhaust the database's connection limit.
Materialized Views
A materialized view is a precomputed, stored result of a query that can be queried like a regular table.
Regular views are virtual. They are computed on the fly every time you query them. Materialized views are different. They are physical copies of the data, stored on disk, and periodically refreshed.
Why Are They Needed?
Complex aggregation queries (SUM, COUNT, JOIN) are expensive at runtime when you have millions of rows. It gets worse when you have thousands of concurrent users querying the database. Materialized views reduce compute cost by storing pre-aggregated data.
Refresh Strategies
- Immediate Refresh: The view is refreshed whenever the underlying data changes. This can slow writes.
- Manual or Scheduled Refresh: For example, refresh every 30 minutes. This is the most common approach in practice.
-- Create a materialized view
CREATE MATERIALIZED VIEW weekly_sales AS
SELECT product_id, COUNT(*) AS sales_count
FROM orders
WHERE order_date > CURRENT_DATE - 7
GROUP BY product_id;
-- Query it like a regular table (fast)
SELECT * FROM weekly_sales WHERE sales_count > 100;
-- Refresh periodically
REFRESH MATERIALIZED VIEW weekly_sales;
Tradeoffs
- Storage cost: Consumes disk space
- Stale data: Queries reflect the last refresh, not live data
- Refresh complexity: High refresh frequency can lead to more database load
Wrapping Up
Scaling reads is not about picking one technique. Most production systems combine several of these approaches. You might use replication to distribute read traffic, indexes to speed up lookups, caching to avoid hitting the database entirely, denormalization to eliminate joins, connection pooling to reduce overhead, and materialized views to precompute expensive aggregations.
The common thread across all of them is the same tradeoff: you make writes a little more expensive so reads become fast. The right combination depends on your system's read/write ratio, consistency requirements, and scale.
If you have any questions, feel free to reach out to me on LinkedIn.