If we run node server.js directly and it crashes, that’s it. The process is dead. A process manager solves this — it babysits our app, restarts it on crash, runs it as multiple workers, captures logs, and gives us a CLI to inspect everything.
PM2 is the most popular process manager for Node. It’s not the only option (systemd, Docker restart policies, and Kubernetes do similar jobs), but it’s the easiest to get going on a single VM.
What PM2 actually does
In simple language: PM2 is “a daemon that runs your Node apps and makes sure they stay running.”
Specifically:
- Auto-restart on crash (immediate by default; exponential backoff via exp_backoff_restart_delay)
- Restart on memory threshold (max_memory_restart)
- Cluster mode (forks N copies, load balances)
- Log capture and aggregation (rotation needs the pm2-logrotate module)
- Zero-downtime reload
- pm2 startup hooks into systemd so PM2 survives reboots
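To make the "restart on crash" idea concrete, here is a minimal sketch of a supervisor loop in plain Node. It is not how PM2 is implemented; the names supervise and backoffDelay, the 100 ms base delay, the 15 s cap, and the 30 s "stable run" threshold are all assumptions made for illustration. Unlike a real process manager, this sketch leaves the app stopped after a clean exit (code 0).

```javascript
const { spawn } = require('node:child_process');

// Exponential backoff: baseMs * 2^attempt, capped at maxMs.
// (Hypothetical helper; the default numbers are arbitrary.)
function backoffDelay(attempt, baseMs = 100, maxMs = 15000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical supervisor: run `node <script>` and restart it when it
// exits abnormally, waiting longer after each quick failure.
function supervise(script, attempt = 0) {
  const startedAt = Date.now();
  const child = spawn(process.execPath, [script], { stdio: 'inherit' });

  child.on('exit', (code, signal) => {
    if (code === 0) return; // clean exit: this sketch leaves it stopped

    // A run that stayed up for 30 s resets the backoff counter.
    const nextAttempt = Date.now() - startedAt > 30000 ? 0 : attempt + 1;
    const delay = backoffDelay(nextAttempt);
    console.error(
      `${script} exited (code=${code}, signal=${signal}); restarting in ${delay}ms`
    );
    setTimeout(() => supervise(script, nextAttempt), delay);
  });

  return child;
}

// Usage (not executed here): supervise('./server.js');
```

Everything else on the list (log files, cluster mode, a CLI, surviving reboots) is what we would have to build on top of this loop ourselves, which is the case for using PM2 instead.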
Basic usage
npm i -g pm2
# Start an app
pm2 start server.js --name api
# See status
pm2 list
# Tail logs
pm2 logs api
# Restart / stop / delete
pm2 restart api
pm2 stop api
pm2 delete api
# Persist current process list across reboots
pm2 save
pm2 startup # prints a sudo command — run it
Cluster mode — free horizontal scaling
-i max runs one instance per CPU core. Same idea as the cluster module, just declarative.
pm2 start server.js -i max --name api
PM2 handles the master process for us. Each worker is a real Node process with its own memory. Use this when our HTTP server is CPU-bound and we want to use all cores on one machine.
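For comparison, this is roughly what we would write by hand with Node's built-in cluster module to get the same effect as -i max. It is a sketch, not PM2's code: startCluster and resolveInstances are hypothetical names, and the helper only understands 'max' or a positive integer.

```javascript
const cluster = require('node:cluster');
const os = require('node:os');

// Hypothetical helper: turn an "instances" setting into a worker count.
function resolveInstances(instances, cpuCount = os.cpus().length) {
  if (instances === 'max') return cpuCount;
  const n = Number(instances);
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError(`invalid instances: ${instances}`);
  }
  return n;
}

// Fork `instances` copies of workerFile and replace any worker that dies.
function startCluster(workerFile, instances = 'max') {
  cluster.setupPrimary({ exec: workerFile }); // workers run this file
  const count = resolveInstances(instances);
  for (let i = 0; i < count; i++) cluster.fork();

  cluster.on('exit', (worker, code, signal) => {
    console.error(
      `worker ${worker.process.pid} died (${signal || code}); forking a replacement`
    );
    cluster.fork();
  });
}

// Usage (not executed here): startCluster('./server.js', 'max');
```

Every worker calls listen() on the same port; the primary process accepts the connections and hands them out to workers. With PM2 we get this without touching server.js.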
ecosystem.config.cjs — config as code
For anything beyond a one-liner, put settings in an ecosystem file. Then pm2 start ecosystem.config.cjs.
module.exports = {
  apps: [
    {
      name: 'api',
      script: './server.js',
      instances: 'max',
      exec_mode: 'cluster',
      max_memory_restart: '500M',
      env: {
        NODE_ENV: 'production',
        PORT: 3000,
      },
      error_file: './logs/api-err.log',
      out_file: './logs/api-out.log',
      time: true,
    },
    {
      name: 'cron-worker',
      script: './cron.js',
      instances: 1,
      exec_mode: 'fork',
      autorestart: true,
    },
  ],
};
Key options:
- instances: 'max' with exec_mode: 'cluster' runs one worker per core
- max_memory_restart: '500M' restarts a worker that exceeds 500 MB (a band-aid for leaks)
- autorestart: true is the default and restarts on crash
- cron_restart: '0 4 * * *' restarts at 4 AM daily (rarely needed, but useful for leaky processes)
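On the app side, the env block simply shows up in process.env. A minimal sketch of how server.js might consume it, assuming hypothetical helpers readConfig and createApp and fallback values that are this example's choice:

```javascript
const http = require('node:http');

// Environment variables always arrive as strings, so PORT: 3000 in the
// ecosystem file is read back as '3000' and has to be parsed.
function readConfig(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '3000', 10);
  if (!Number.isInteger(port) || port < 0 || port > 65535) {
    throw new RangeError(`invalid PORT: ${env.PORT}`);
  }
  return { port, nodeEnv: env.NODE_ENV ?? 'development' };
}

// Tiny HTTP app that reports which environment and process served the request.
function createApp(config) {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ ok: true, env: config.nodeEnv, pid: process.pid }));
  });
}

// Usage (not executed here):
//   const config = readConfig();
//   createApp(config).listen(config.port);
```

In cluster mode the pid in the response changes from request to request, which is a quick way to confirm that several workers are serving traffic.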
Zero-downtime reload
pm2 reload api
In cluster mode, this restarts workers one at a time. Each old worker keeps serving until the new one is ready, then it shuts down. No dropped requests if our app handles SIGINT/SIGTERM properly (graceful shutdown — covered in the next note).
pm2 restart is different: it kills and restarts. Brief downtime.
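The "handles SIGINT/SIGTERM properly" part is on us, not on PM2. A minimal sketch of such a handler, assuming a hypothetical helper named installGracefulShutdown and a 5 second force-exit timeout picked for illustration (PM2 has its own kill timeout after which it stops waiting):

```javascript
// Hypothetical helper: on SIGINT/SIGTERM, stop accepting new connections,
// let in-flight requests finish, then exit. If that takes longer than
// timeoutMs, exit with code 1 instead of hanging forever.
function installGracefulShutdown(
  server,
  { timeoutMs = 5000, exit = (code) => process.exit(code) } = {}
) {
  let shuttingDown = false;

  function shutdown(signal) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.error(`${signal} received, closing server`);

    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref(); // the timer alone should not keep the process alive

    // close() stops new connections; the callback runs once existing ones end.
    server.close((err) => {
      clearTimeout(timer);
      exit(err ? 1 : 0);
    });
  }

  process.on('SIGINT', () => shutdown('SIGINT'));
  process.on('SIGTERM', () => shutdown('SIGTERM'));
  return shutdown;
}

// Usage (not executed here):
//   const server = require('node:http').createServer(handler).listen(3000);
//   installGracefulShutdown(server);
```

Database pools, queues, and other open handles would be closed in the same callback before exiting; the sketch leaves them out.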
PM2 vs systemd vs Docker
| | PM2 | systemd | Docker / K8s |
|---|---|---|---|
| Setup | npm i -g, done | Write a unit file | Dockerfile + compose / manifest |
| Cluster mode | Built-in, free | Manual (multiple units) | Scale replicas |
| Logs | pm2 logs | journalctl | docker logs / kubectl logs |
| Reload w/o downtime | Yes (cluster) | No (needs LB) | Yes (rolling deploy) |
| Best for | Single VM, fast iteration | Linux servers, no containers | Multi-host, microservices |
Rule of thumb:
- Just one VM, want something working today → PM2
- VM, prefer OS-native, don’t want extra runtime → systemd unit
- Already on Docker/Kubernetes → don’t use PM2, let the orchestrator restart containers. PM2 inside Docker is a common anti-pattern; the container should be the unit of restart.
PM2 gotchas
- PM2 in Docker is usually wrong. Docker already restarts containers. Running PM2 inside hides crashes from Docker and complicates log capture. One container = one Node process.
- pm2 startup setup is mandatory. Without it, a server reboot kills our apps. Run pm2 startup once, then pm2 save after every change to the process list.
- Logs grow forever. Install pm2-logrotate (pm2 install pm2-logrotate) or use logrotate.
- The open-source PM2 only ships a local terminal view (pm2 monit). A hosted metrics dashboard needs PM2 Plus (formerly Keymetrics, their paid SaaS). For free, scrape pm2 jlist or expose your own metrics.
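Scraping pm2 jlist can be as small as the sketch below. The helper names summarize and collect are hypothetical, and the field names (pm2_env.status, pm2_env.restart_time, monit.memory, monit.cpu) are what pm2 jlist prints as far as I know; check them against the output of the PM2 version actually installed.

```javascript
const { execFile } = require('node:child_process');

// Reduce parsed `pm2 jlist` output to a few fields worth exporting.
// Missing fields fall back to neutral values instead of throwing.
function summarize(processes) {
  return processes.map((p) => ({
    name: p.name,
    status: p.pm2_env?.status ?? 'unknown',
    restarts: p.pm2_env?.restart_time ?? 0,
    memoryMB: Math.round((p.monit?.memory ?? 0) / (1024 * 1024)),
    cpuPercent: p.monit?.cpu ?? 0,
  }));
}

// Run `pm2 jlist` (requires pm2 on PATH) and resolve with the summary.
function collect() {
  return new Promise((resolve, reject) => {
    execFile('pm2', ['jlist'], { maxBuffer: 16 * 1024 * 1024 }, (err, stdout) => {
      if (err) return reject(err);
      try {
        resolve(summarize(JSON.parse(stdout)));
      } catch (parseError) {
        reject(parseError);
      }
    });
  });
}

// Usage (not executed here): collect().then(console.table, console.error);
```

From here the summary can be logged on an interval or served from a small /metrics endpoint for whatever monitoring system is already in place.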