Node is single-threaded for JavaScript execution. The event loop, our handlers, every line of our code — all on one thread. That’s fine for I/O-bound work (the kernel does the waiting). It’s a disaster for CPU-bound work: a sync 2-second computation blocks every other in-flight request for 2 seconds. Worker Threads are Node’s answer.
In simple language: Worker Threads let us spawn a separate JS thread that runs alongside the main one. Real parallel execution, not just async I/O. We communicate via message passing, like a tiny isolated worker microservice that lives in our process.
What “CPU-bound” actually means
A request is CPU-bound when our process is doing math, not waiting on the network/disk. Examples:
- Parsing a 50MB JSON or CSV
- Resizing an image
- Computing a SHA-256 hash over a big buffer
- Running a regex against millions of strings
- Running ML inference in pure JS
For I/O work (DB query, HTTP fetch, file read), Workers won’t help — Node’s event loop is already great at that.
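To make "blocks the event loop" concrete, here is a minimal sketch that measures how late a 0 ms timer fires when synchronous work sits in front of it. BLOCK_MS and busyWait are hypothetical stand-ins for a real computation, and the exact delay will vary by machine.

```javascript
// Sketch: a timer scheduled for "now" cannot fire until synchronous work ends.
const BLOCK_MS = 200; // hypothetical duration of the CPU-bound work

// Stand-in for parsing, hashing, resizing: spins without yielding.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* burn CPU */ }
}

const scheduledAt = Date.now();
const timerDelay = new Promise((resolve) => {
  setTimeout(() => resolve(Date.now() - scheduledAt), 0);
});

busyWait(BLOCK_MS); // the event loop is stuck here, so the timer has to wait

const delayMs = await timerDelay;
console.log(`0 ms timer fired after ~${delayMs} ms`);
```

Every pending callback on the main thread, including incoming HTTP requests, waits the same way the timer does.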
A minimal worker
Workers live in their own file (or in an inline source string passed with eval: true). We message back and forth.
// main.js
import { Worker } from 'node:worker_threads';
// Runs the job in a worker and resolves with the first message it posts.
function runHeavy(input) {
  return new Promise((resolve, reject) => {
    // A URL resolves relative to this module; a bare './worker.js' string
    // would resolve against the process's working directory instead.
    const worker = new Worker(new URL('./worker.js', import.meta.url), { workerData: input });
    worker.on('message', resolve);
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited ${code}`));
    });
  });
}
console.log(await runHeavy({ size: 10_000_000 }));
// worker.js
import { workerData, parentPort } from 'node:worker_threads';
// CPU-heavy loop: it runs on the worker thread, so main.js is NOT blocked
let sum = 0;
for (let i = 0; i < workerData.size; i++) sum += Math.sqrt(i);
parentPort.postMessage({ sum });
While worker.js is grinding, the main thread keeps serving HTTP requests. That’s the point.
The architecture
Each worker is essentially a fresh Node instance running inside the same process. Separate V8 heap, separate event loop, separate require cache.
postMessage — the message channel
postMessage uses the structured clone algorithm to serialize the data, the same one browsers use for postMessage between windows. It can copy plain objects, Maps, Sets, typed arrays, even circular references. Buffers make it across too, but arrive as plain Uint8Arrays. It cannot send functions or DOM-like objects, and class instances lose their prototype: they arrive as plain objects carrying their data but none of their methods.
parentPort.postMessage({
result: bigBuffer,
meta: { ts: Date.now() },
});
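A quick way to see what survives cloning without spinning up a worker is the global structuredClone() (available in Node 17+), which applies the same algorithm. The sketch below is illustrative only; Point and the field names are made up for the example.

```javascript
// Sketch: probing the structured clone algorithm with structuredClone().
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  norm() { return Math.hypot(this.x, this.y); }
}

const original = {
  tags: new Set(['a', 'b']),
  lookup: new Map([['k', 1]]),
  bytes: new Uint8Array([1, 2, 3]),
  point: new Point(3, 4),
};
original.self = original; // circular reference

const copy = structuredClone(original);

const circularKept = copy.self === copy;             // cycles are preserved
const mapKept = copy.lookup.get('k') === 1;          // Map contents survive
const dataKept = copy.point.x === 3 && copy.point.y === 4;
const methodsKept = typeof copy.point.norm === 'function'; // prototype is dropped

// Functions are rejected outright rather than silently dropped.
let functionError = null;
try {
  structuredClone({ callback: () => {} });
} catch (err) {
  functionError = err.name;
}

console.log({ circularKept, mapKept, dataKept, methodsKept, functionError });
```

If a worker needs behavior and not just data, the usual approach is to send plain data and rebuild the class instance on the other side.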
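Bigger payload = more cloning cost. If we’re sending megabytes, consider the transferList argument: Node moves the underlying ArrayBuffer to the receiver without copying, and the sender loses access to it. One caveat: small Buffers created with Buffer.from() or Buffer.allocUnsafe() may live in Node’s shared internal pool, and those are always cloned rather than transferred.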
parentPort.postMessage({ buf }, [buf.buffer]); // ownership transfer
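The ownership transfer can be observed without a worker by using a MessageChannel, which follows the same postMessage rules. This is a minimal sketch; the 1 MiB size and the variable names are arbitrary choices for illustration.

```javascript
import { MessageChannel, receiveMessageOnPort } from 'node:worker_threads';

// Sketch: transferring an ArrayBuffer detaches it on the sending side.
const { port1, port2 } = new MessageChannel();

const ab = new ArrayBuffer(1024 * 1024); // 1 MiB payload
new Uint8Array(ab)[0] = 42;

port1.postMessage({ ab }, [ab]); // second argument is the transferList

const senderLength = ab.byteLength; // the sender's buffer is now detached

// Synchronously pull the message off the other port.
const { message } = receiveMessageOnPort(port2);
const receivedLength = message.ab.byteLength;
const receivedFirst = new Uint8Array(message.ab)[0];

port1.close();
port2.close();

console.log({ senderLength, receivedLength, receivedFirst });
```

After the transfer, any further read or write through the sender's views of that buffer sees a zero-length buffer, so it is worth treating the variable as dead from that point on.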
SharedArrayBuffer — shared memory between threads
For the rare cases where workers need to read/write the same memory (image processing pipelines, multi-worker numerical compute), SharedArrayBuffer is the escape hatch.
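// main.js
import { Worker } from 'node:worker_threads';
const worker = new Worker(new URL('./worker.js', import.meta.url));
const sab = new SharedArrayBuffer(1024);
const view = new Int32Array(sab); // main-thread view over the shared bytes
worker.postMessage(sab); // shared, not copied: both threads now see the same bytes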
Multiple threads writing the same memory is exactly the classic concurrency hazard — race conditions, torn reads, the works. Atomics (built-in) gives us atomic read/write/compare-and-swap. Use sparingly and only when message passing is genuinely too slow.
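As a rough illustration of Atomics on shared memory, the sketch below has several workers increment one shared counter. The worker source is inlined with eval: true only to keep the example in a single file; WORKERS, INCREMENTS and runWorker are hypothetical names.

```javascript
import { Worker } from 'node:worker_threads';

const WORKERS = 4;
const INCREMENTS = 10_000;

// Inline worker source (eval workers run as CommonJS, hence require).
const workerSource = `
  const { workerData } = require('node:worker_threads');
  const counter = new Int32Array(workerData.sab);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic read-modify-write on slot 0
  }
`;

// One 32-bit slot shared by every thread.
const sab = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
const counter = new Int32Array(sab);

// Starts one worker and resolves when it exits cleanly.
function runWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { sab, increments: INCREMENTS },
    });
    worker.on('error', reject);
    worker.on('exit', (code) =>
      code === 0 ? resolve() : reject(new Error(`Worker exited ${code}`)));
  });
}

await Promise.all(Array.from({ length: WORKERS }, () => runWorker()));

const total = Atomics.load(counter, 0);
console.log(total); // WORKERS * INCREMENTS
```

Swapping Atomics.add for a plain counter[0]++ makes the increment a non-atomic read-then-write, which is where lost updates can creep in.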
When to use Workers — and when not
Reach for Workers when:
- The CPU work takes more than ~50ms — long enough to noticeably block the event loop.
- The work is parallelizable and we want to use multiple cores.
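- We want a job to run with its own heap and globals, isolated from the main thread’s state. (This is not a security boundary: a worker shares our process, so it is no sandbox for untrusted user-supplied code.)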
Don’t reach for Workers when:
- The work is I/O. Async I/O already does its waiting outside the event loop, so it doesn’t block it.
- The work is tiny. Spawning a worker has startup cost (~10–50ms). For small jobs the overhead dwarfs the gain.
- We just want more concurrency for HTTP requests. Use cluster (multiple Node processes sharing one listening port), or run multiple containers behind a reverse proxy. That’s the idiomatic Node scaling story.
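The startup cost mentioned in the list is easy to measure on our own hardware. This sketch times the gap between constructing a worker and receiving its first message; measureRoundTrip is a hypothetical helper, and the number it prints depends heavily on the machine and on what the worker loads.

```javascript
import { Worker } from 'node:worker_threads';
import { performance } from 'node:perf_hooks';

// Sketch: time from `new Worker(...)` to the first message coming back.
function measureRoundTrip() {
  return new Promise((resolve, reject) => {
    const start = performance.now();
    const worker = new Worker(
      // Trivial inline worker: reports in and then exits on its own.
      `require('node:worker_threads').parentPort.postMessage('ready');`,
      { eval: true },
    );
    worker.once('message', () => resolve(performance.now() - start));
    worker.once('error', reject);
  });
}

const startupMs = await measureRoundTrip();
console.log(`spawn + first message: ${startupMs.toFixed(1)} ms`);
```

If a job finishes faster than this figure, running it in a freshly spawned worker is a net loss.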
The worker pool pattern
We almost never spawn a worker per request — startup cost kills us. Instead, we keep a pool of N workers (often os.availableParallelism()), and queue jobs to them. Libraries like piscina do this for us with a pool.run(task) API.
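import Piscina from 'piscina';
// Piscina expects filename as an absolute path or a file:// URL string, hence .href.
// Unlike the earlier worker.js, a Piscina worker file exports the task function.
const pool = new Piscina({ filename: new URL('./worker.js', import.meta.url).href });
const result = await pool.run({ image: buf });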
Pool stays warm, requests share workers, throughput goes way up.
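To show what a pool does under the hood, here is a deliberately small hand-rolled sketch: N long-lived workers, a FIFO queue, and one job per worker at a time. TinyPool is a hypothetical class for illustration, not how piscina is implemented, and it skips things a real pool needs (replacing crashed workers, timeouts, cancellation).

```javascript
import { Worker } from 'node:worker_threads';
import os from 'node:os';

// Inline worker: stays alive and answers one job per incoming message.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', ({ size }) => {
    let sum = 0;
    for (let i = 0; i < size; i++) sum += Math.sqrt(i);
    parentPort.postMessage({ sum });
  });
`;

class TinyPool {
  constructor(size) {
    this.all = Array.from({ length: size }, () => new Worker(workerSource, { eval: true }));
    this.idle = [...this.all]; // workers with no job in flight
    this.queue = [];           // jobs waiting for a free worker
  }

  // Queues a job and resolves with the worker's reply.
  run(job) {
    return new Promise((resolve, reject) => {
      this.queue.push({ job, resolve, reject });
      this.#dispatch();
    });
  }

  // Hands queued jobs to idle workers until one side runs out.
  #dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const { job, resolve, reject } = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // worker is free again
        resolve(result);
        this.#dispatch();       // pick up the next queued job, if any
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        reject(err);            // sketch only: the dead worker is not replaced
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(job);
    }
  }

  // Stops every worker so the process can exit.
  destroy() {
    return Promise.all(this.all.map((worker) => worker.terminate()));
  }
}

// os.availableParallelism() needs a recent Node; fall back to the CPU count.
const cores = os.availableParallelism?.() ?? os.cpus().length;
const pool = new TinyPool(Math.min(4, cores));

const sizes = [100_000, 200_000, 300_000, 400_000, 500_000, 600_000];
const results = await Promise.all(sizes.map((size) => pool.run({ size })));
await pool.destroy();

console.log(results.map((r) => Math.round(r.sum)));
```

Six jobs go through at most four workers here, so some jobs wait in the queue and reuse a worker that is already warm. In production we would normally lean on a maintained library rather than grow this sketch.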
Workers vs cluster vs child_process — quick contrast
- Workers: same process, separate threads, message passing, shared memory possible. Best for CPU-bound JS work.
- cluster: multiple Node processes accepting connections on the same port (by default the primary process hands them out round-robin, except on Windows). Best for scaling I/O-bound HTTP servers across cores.
- child_process: spawning external commands (ffmpeg, git) or running other Node scripts as totally separate processes. Highest isolation, highest overhead.
Pick by what we’re trying to do — they’re not interchangeable.
The mental model
Workers turn Node from single-threaded to multi-threaded for CPU work. The cost is message passing between isolated heaps; the win is unblocking the main event loop. Use a pool, not one-off spawns. And remember: most Node bottlenecks are I/O, not CPU — measure before reaching for this hammer.