Slow memory leaks in Node.js are usually boring in the worst possible way.
The service starts fine. Throughput looks normal. CPU is acceptable. Then memory climbs steadily for hours until the container gets restarted and the cycle begins again.
That pattern is a gift. It means you can investigate methodically.
Start by Proving It Is a Leak
A rising memory graph alone is not enough.
Ask:
- does memory drop after traffic falls?
- is the growth in `heapUsed`, `rss`, or both?
- did request rate or payload size change at the same time?
If memory keeps climbing without returning, start looking for retained references.
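Answering those questions requires numbers over time, not a single graph. A minimal sketch of a periodic sampler built on `process.memoryUsage()` (the interval and log format are illustrative choices, not a prescription):

```js
// Periodically sample memory so growth can be correlated with traffic.
const SAMPLE_INTERVAL_MS = 60_000;

setInterval(() => {
  const { rss, heapUsed, heapTotal, external } = process.memoryUsage();
  const mb = (bytes) => (bytes / 1024 / 1024).toFixed(1);
  console.log(
    `rss=${mb(rss)}MB heapUsed=${mb(heapUsed)}MB ` +
      `heapTotal=${mb(heapTotal)}MB external=${mb(external)}MB`
  );
}, SAMPLE_INTERVAL_MS).unref(); // unref: the timer never keeps the process alive
```

If `heapUsed` climbs, suspect JavaScript object retention; if `rss` climbs while `heapUsed` stays flat, look at native memory, buffers, or `external`.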
Heap Snapshots Are the Fastest Serious Tool
If the leak is in JavaScript objects, take a heap snapshot:
```js
import { writeHeapSnapshot } from "node:v8";

const path = writeHeapSnapshot();
console.log(`heap snapshot: ${path}`);
```
Open it in DevTools and sort by retained size.
The important phrase is retained size, not shallow size. You are looking for the object that is keeping an unexpectedly large subgraph alive.
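In a running service you usually want the snapshot on demand rather than at a fixed point in the code. One common sketch is triggering it from a signal handler (SIGUSR2 is an arbitrary choice here; SIGUSR1 is reserved by Node for starting the inspector):

```js
import { writeHeapSnapshot } from "node:v8";

// Write a snapshot whenever the process receives SIGUSR2, so you can
// capture the heap at two points in time and diff them in DevTools.
process.on("SIGUSR2", () => {
  const path = writeHeapSnapshot();
  console.log(`heap snapshot written to ${path}`);
});
```

Send `kill -USR2 <pid>` once early and once after memory has grown; comparing the two snapshots by retained size narrows the search quickly.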
A Very Common Leak Pattern
This shape appears in real services all the time:
```js
app.use((req, _res, next) => {
  db.on("queryCompleted", () => {
    logger.info({ userId: req.user.id }, "query completed");
  });
  next();
});
```
The problem is not just the listener count. It is what the listener closes over.
Each request adds a new callback to a long-lived emitter. Each callback retains req. If nothing removes the listener, old request objects stay reachable and the garbage collector cannot reclaim them.
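The accumulation is easy to reproduce in isolation. A minimal sketch with a stand-in emitter (the names and payload are illustrative):

```js
import { EventEmitter } from "node:events";

const db = new EventEmitter(); // stand-in for a long-lived emitter

function handleRequest(req) {
  // BUG: a new closure per request, each one retaining `req`.
  db.on("queryCompleted", () => {
    console.log(`query completed for ${req.userId}`);
  });
}

for (let i = 0; i < 1000; i++) {
  handleRequest({ userId: i, body: Buffer.alloc(1024) });
}

console.log(db.listenerCount("queryCompleted")); // → 1000
```

Running this also prints a MaxListenersExceededWarning on stderr once the count passes ten, which is Node telling you exactly what the heap snapshot would later confirm.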
The Fix Is Usually Lifecycle Discipline
Prefer one of these patterns:
- register a single long-lived listener outside the request path
- use `.once(...)` if the listener should auto-remove
- explicitly remove listeners in cleanup paths
For example:
```js
function onQueryCompleted(event) {
  logger.info({ queryId: event.id }, "query completed");
}

db.on("queryCompleted", onQueryCompleted);
```
That version no longer captures request-scoped data accidentally.
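When a request-scoped listener is genuinely unavoidable, the remaining option is explicit removal tied to the request's lifetime: pair every `on` with an `off`. A minimal sketch using plain EventEmitters (`db`, `attachQueryLogger`, and the `req`/`res` shapes are illustrative stand-ins):

```js
import { EventEmitter } from "node:events";

const db = new EventEmitter(); // stand-in for the long-lived emitter

function attachQueryLogger(req, res) {
  const onQueryCompleted = () => {
    console.log(`query completed for user ${req.userId}`);
  };
  db.on("queryCompleted", onQueryCompleted);
  // When the response ends, drop the listener so `req` becomes collectible.
  res.on("close", () => db.off("queryCompleted", onQueryCompleted));
}
```

The key property is symmetry: the same code path that attaches the listener also schedules its removal, so the retained lifetime of `req` is bounded by the request itself.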
What to Look for in the Snapshot
The most useful signs are:
- large arrays or maps hanging off global objects
- closures retaining request-specific state
- caches without eviction
- listener lists growing with traffic
You are not just looking for "big". You are looking for "why is this still alive?"
Preventing the Next One
A few habits catch a lot of Node leaks early:
- avoid attaching listeners inside hot request paths
- put an explicit size bound on every cache
- prefer streaming over buffering large payloads
- graph memory alongside throughput
- take snapshots before the incident gets worse
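For the cache-bounding habit, even a simple cap helps. A minimal sketch of a FIFO-bounded cache (the class name and default limit are illustrative; production code might prefer a real LRU):

```js
// A minimal bounded cache: evict the oldest entry once the limit is hit.
// Map preserves insertion order, so the first key is the oldest insertion.
class BoundedCache {
  constructor(maxEntries = 1000) {
    this.maxEntries = maxEntries;
    this.map = new Map();
  }
  get(key) {
    return this.map.get(key);
  }
  set(key, value) {
    if (this.map.has(key)) this.map.delete(key); // re-insert to refresh order
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
  }
}
```

The exact eviction policy matters less than the bound itself: an unbounded cache is indistinguishable from a leak in a heap snapshot.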
Memory leak debugging feels mysterious until you see the first one clearly. After that, most of them come down to object lifetime.
Further Reading