Optimizing Database Performance at Scale
As your application grows, database performance becomes increasingly critical. In this guide, we'll explore proven techniques to optimize your database for high-traffic scenarios.
Understanding Database Bottlenecks
Common performance issues include:
- Slow queries and missing indexes
- Connection pool exhaustion
- Lock contention
- Inefficient schema design
- Lack of caching strategy
Query Optimization
1. Use Indexes Wisely
Create indexes for frequently queried columns:
-- Create index on user email for faster lookups
CREATE INDEX idx_users_email ON users(email);
-- Composite index for common query patterns
CREATE INDEX idx_posts_author_date ON posts(author_id, created_at DESC);
-- Partial index covering only active users (smaller and cheaper to maintain than a full index)
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';
2. Analyze Query Plans
Always check your query execution plans:
EXPLAIN ANALYZE
SELECT u.name, COUNT(p.id) as post_count
FROM users u
LEFT JOIN posts p ON u.id = p.author_id
WHERE u.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.id, u.name
ORDER BY post_count DESC
LIMIT 10;
3. Optimize Joins
-- Often slower: two separate IN subqueries over related tables
SELECT *
FROM users
WHERE id IN (
  SELECT author_id FROM posts WHERE published = true
)
AND id IN (
  SELECT user_id FROM subscriptions WHERE active = true
);

-- Usually faster: a single pass with JOINs
-- (DISTINCT deduplicates users matching multiple posts or subscriptions)
SELECT DISTINCT u.*
FROM users u
INNER JOIN posts p ON u.id = p.author_id
INNER JOIN subscriptions s ON u.id = s.user_id
WHERE p.published = true AND s.active = true;
Connection Pooling
Implement efficient connection pooling with Prisma:
import { PrismaClient } from "@prisma/client";

declare global {
  var prisma: PrismaClient | undefined;
}

export const prisma =
  global.prisma ||
  new PrismaClient({
    log: process.env.NODE_ENV === "development" ? ["query", "error", "warn"] : ["error"],
  });

if (process.env.NODE_ENV !== "production") {
  global.prisma = prisma;
}
// Configure the connection pool size via the connection string:
// Prisma reads connection_limit and pool_timeout from the URL itself,
// e.g. postgresql://user:pass@host:5432/db?connection_limit=10&pool_timeout=20
const prismaWithPool = new PrismaClient({
  datasources: {
    db: {
      url: process.env.DATABASE_URL,
    },
  },
});
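The right `connection_limit` depends on your workload, but a common starting heuristic for CPU-bound workloads (popularized by the HikariCP project) is roughly twice the number of CPU cores on the database server. The helper below sketches that rule of thumb; it is a starting point to tune under real load, not a fixed formula.

```typescript
import os from "node:os";

// Rule-of-thumb starting pool size: cores * 2, with a floor of 2.
// Tune from here using real latency and saturation measurements.
export function suggestedPoolSize(cores: number = os.cpus().length): number {
  return Math.max(2, cores * 2);
}
```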
Caching Strategies
1. Application-Level Caching
import { Redis } from "ioredis";

const redis = new Redis(process.env.REDIS_URL);

export async function getCachedUser(userId: string) {
  // Try cache first
  const cached = await redis.get(`user:${userId}`);
  if (cached) {
    return JSON.parse(cached);
  }

  // Cache miss - fetch from database
  const user = await prisma.user.findUnique({
    where: { id: userId },
  });

  // Store in cache with 1 hour expiry (skip caching misses,
  // so a user created later isn't masked by a cached null)
  if (user) {
    await redis.setex(`user:${userId}`, 3600, JSON.stringify(user));
  }
  return user;
}
2. Query Result Caching
export async function getPopularPosts(limit = 10) {
  const cacheKey = `popular-posts:${limit}`;
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  const posts = await prisma.post.findMany({
    take: limit,
    orderBy: { views: "desc" },
    include: { author: true },
  });

  await redis.setex(cacheKey, 300, JSON.stringify(posts)); // 5 minutes
  return posts;
}
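Caching is only half of the pattern: when the underlying rows change, stale entries must be invalidated so the next read repopulates them. The sketch below illustrates cache-aside invalidation with an in-memory Map standing in for Redis (in production, `redis.setex`/`redis.get`/`redis.del` play the same roles); the function names are illustrative.

```typescript
// Minimal in-memory stand-in for Redis, used only to illustrate the pattern.
const cache = new Map<string, { value: string; expiresAt: number }>();

export function cacheSet(key: string, value: unknown, ttlSeconds: number) {
  cache.set(key, { value: JSON.stringify(value), expiresAt: Date.now() + ttlSeconds * 1000 });
}

export function cacheGet(key: string): unknown | null {
  const entry = cache.get(key);
  if (!entry || entry.expiresAt < Date.now()) {
    cache.delete(key); // expired or missing
    return null;
  }
  return JSON.parse(entry.value);
}

// Invalidate on write: after updating a user in the database,
// drop the stale cache entry so the next read fetches fresh data.
export function invalidateUser(userId: string) {
  cache.delete(`user:${userId}`);
}
```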
Database Sharding
For massive scale, consider sharding:
import { PrismaClient } from "@prisma/client";

const TOTAL_SHARDS = 4; // number of shard databases

// One client per shard, created lazily and keyed by shard id
// (assumes env vars SHARD_0_DATABASE_URL, SHARD_1_DATABASE_URL, ...)
const shardClients = new Map<number, PrismaClient>();

function getShardClient(shardId: number): PrismaClient {
  if (!shardClients.has(shardId)) {
    shardClients.set(shardId, new PrismaClient({
      datasources: { db: { url: process.env[`SHARD_${shardId}_DATABASE_URL`] } },
    }));
  }
  return shardClients.get(shardId)!;
}

function getShardId(userId: string): number {
  // Simple hash-based sharding
  const hash = userId.split("").reduce((acc, char) => {
    return (acc << 5) - acc + char.charCodeAt(0);
  }, 0);
  return Math.abs(hash) % TOTAL_SHARDS;
}

export async function getUserFromShard(userId: string) {
  const shardId = getShardId(userId);
  const shardClient = getShardClient(shardId);
  return shardClient.user.findUnique({
    where: { id: userId },
  });
}
Read Replicas
Distribute read load across replicas:
import { Prisma, PrismaClient } from "@prisma/client";

const primaryDb = new PrismaClient({
  datasources: { db: { url: process.env.PRIMARY_DATABASE_URL } },
});

const replicaDb = new PrismaClient({
  datasources: { db: { url: process.env.REPLICA_DATABASE_URL } },
});

// Use replica for reads
export async function getUsers() {
  return replicaDb.user.findMany();
}

// Use primary for writes
export async function createUser(data: Prisma.UserCreateInput) {
  return primaryDb.user.create({ data });
}
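Rather than picking a client at every call site, the read/write split can be centralized in a small router. The sketch below uses plain objects in place of PrismaClient instances to stay self-contained; the `fresh` escape hatch (routing a read to the primary when replication lag is unacceptable, e.g. read-your-own-writes) is a common convention, not Prisma API.

```typescript
// Hypothetical minimal shape both clients share; in a real setup
// both would be PrismaClient instances.
interface DbClient {
  name: string;
}

function makeRouter<C extends DbClient>(primary: C, replica: C) {
  return {
    // Reads go to the replica by default; pass { fresh: true } when
    // the caller must see its own just-committed writes.
    forRead(opts: { fresh?: boolean } = {}): C {
      return opts.fresh ? primary : replica;
    },
    // All writes go to the primary.
    forWrite(): C {
      return primary;
    },
  };
}

export const router = makeRouter({ name: "primary" }, { name: "replica" });
```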
Monitoring and Alerts
Set up monitoring to track performance:
import { performance } from "perf_hooks";

export async function measureQueryTime<T>(queryName: string, queryFn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    const result = await queryFn();
    const duration = performance.now() - start;
    if (duration > 1000) {
      console.warn(`Slow query detected: ${queryName} took ${duration.toFixed(0)}ms`);
    }
    return result;
  } catch (error) {
    console.error(`Query failed: ${queryName}`, error);
    throw error;
  }
}
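Used as a wrapper, this helper adds timing without changing the shape of call sites. A self-contained usage sketch, with a simulated query function standing in for a real Prisma call:

```typescript
import { performance } from "perf_hooks";

// Standalone copy of the timing wrapper so this example runs on its own
async function measureQueryTime<T>(queryName: string, queryFn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  const result = await queryFn();
  const duration = performance.now() - start;
  if (duration > 1000) {
    console.warn(`Slow query detected: ${queryName} took ${duration.toFixed(0)}ms`);
  }
  return result;
}

// Simulated query in place of e.g. prisma.user.findMany()
async function fakeFindUsers(): Promise<string[]> {
  return ["ada", "grace"];
}

export async function demo() {
  return measureQueryTime("users.findMany", fakeFindUsers);
}
```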
Conclusion
Database optimization is an iterative process. Start by identifying bottlenecks, implement appropriate solutions, and continuously monitor performance. With these techniques, your application can handle significant scale while maintaining fast response times.
Remember to always test performance improvements in a staging environment before deploying to production.
