Node.js Production Guide: PM2, Memory Leaks & Crash Recovery

Running Node.js Apps in Production Like a Pro

You've built an amazing app with AI assistance, pushed it to production, and then... it crashes at 3 AM. Sound familiar? Welcome to the wild world of Node.js production environments.

While Node.js is fantastic for building fast, scalable applications, it can be a bit temperamental in production. Memory leaks, unexpected crashes, and the dreaded "it works on my machine" syndrome are all too common. But don't worry - with the right tools and strategies, you can tame your Node.js apps and keep them running smoothly.

Why PM2 is Your Production Best Friend

PM2 (Process Manager 2) is like having a reliable DevOps engineer watching over your Node.js applications 24/7. It's a production process manager that handles everything from keeping your app alive to load balancing across multiple CPU cores.

Installing and Basic Setup

npm install -g pm2

# Start your app
pm2 start app.js --name "my-awesome-app"

# Or use an ecosystem file
pm2 start ecosystem.config.js

Here's a sample ecosystem configuration that'll make your life easier:

module.exports = {
  apps: [{
    name: 'my-app',
    script: './app.js',
    instances: 'max', // Use all CPU cores
    exec_mode: 'cluster',
    env: {
      NODE_ENV: 'production',
      PORT: 3000
    },
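    // Restart the process when its memory usage exceeds 500MB
    max_memory_restart: '500M',
    // Restart on file changes: handy in development, keep it off in production
    watch: false,
    // Wait 4 seconds before restarting a crashed process
    restart_delay: 4000,
    // Give up (status "errored") after 3 consecutive unstable restarts
    max_restarts: 3,
    // A process that exits before 10s of uptime counts as an unstable restart
    min_uptime: '10s'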
  }]
}

PM2 Monitoring and Management

PM2 comes with built-in monitoring that's actually useful:

# Real-time monitoring
pm2 monit

# Detailed app info
pm2 show my-app

# Logs (because you'll need them)
pm2 logs my-app --lines 100

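# Restart strategies
pm2 restart my-app  # Stops and starts every process (brief downtime)
pm2 reload my-app   # Zero-downtime in cluster mode: instances restart one at a time
# Note: pm2 gracefulReload only exists in older PM2 releases; newer ones fold it into pm2 reload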

The Memory Leak Monster

Memory leaks in Node.js are sneaky. Your app runs fine for hours or days, then suddenly crashes with an out-of-memory error. Here's how to hunt them down and prevent them.

Common Culprits

Global Variables That Keep Growing

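// DON'T do this
const globalCache = [];

app.get('/api/data', (req, res) => {
  // This array grows forever: every request adds an entry and nothing removes one
  globalCache.push(someData); // someData: whatever the handler computed
  res.json(someData);
});

// DO this instead: a bounded cache that evicts old entries
// (lru-cache v10+ uses a named export; older majors export the class directly)
const { LRUCache } = require('lru-cache');
const cache = new LRUCache({ max: 1000, ttl: 1000 * 60 * 5 }); // 1000 entries, 5 min TTL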
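
To make the idea behind a bounded cache concrete, here is a minimal sketch of one built on a plain Map. The class name BoundedCache and the ttlMs option are made up for this illustration and are not part of lru-cache; in a real app the library is the better choice.

```javascript
// Minimal bounded cache: at most `max` entries, each expiring after `ttlMs`.
// A Map iterates in insertion order, so its first key is always the oldest entry.
class BoundedCache {
  constructor({ max, ttlMs }) {
    this.max = max;
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  set(key, value, now = Date.now()) {
    this.entries.delete(key); // re-inserting moves the key to the newest position
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    if (this.entries.size > this.max) {
      // Evict the oldest entry so the cache can never outgrow `max`
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }

  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key); // expired: drop it so it can be garbage collected
      return undefined;
    }
    // Refresh recency so recently read keys are evicted last (the "LRU" part)
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  get size() {
    return this.entries.size;
  }
}
```

The important property is that memory use has a ceiling: no matter how many requests come in, the cache holds at most max entries.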

Event Listener Leaks

// DON'T forget to clean up
const EventEmitter = require('events');
const emitter = new EventEmitter();

// This creates a new listener on every request - bad!
app.get('/api/stream', (req, res) => {
  emitter.on('data', handleData); // Leak!
});

// DO clean up properly
app.get('/api/stream', (req, res) => {
  const handleData = (data) => { /* handle data */ };
  emitter.on('data', handleData);
  
  res.on('close', () => {
    emitter.removeListener('data', handleData);
  });
});
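
You can verify that cleanup really happens with listenerCount(). The sketch below wraps the subscribe-and-clean-up pattern in a small helper; subscribeUntilClosed is a hypothetical name, and a bare EventEmitter stands in for the Express res object.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: subscribes `listener` to `eventName` on `emitter` and
// removes it again as soon as `lifetime` (e.g. an HTTP response) emits 'close'.
function subscribeUntilClosed(emitter, eventName, listener, lifetime) {
  emitter.on(eventName, listener);
  lifetime.once('close', () => emitter.off(eventName, listener));
}

const emitter = new EventEmitter();
const fakeResponse = new EventEmitter(); // stands in for `res`

subscribeUntilClosed(emitter, 'data', () => {}, fakeResponse);
console.log(emitter.listenerCount('data')); // 1 while the "request" is open

fakeResponse.emit('close');
console.log(emitter.listenerCount('data')); // 0 once it has closed
```

Node.js also helps here: by default it prints a MaxListenersExceededWarning once more than 10 listeners are attached to a single event, which is often the first visible sign of this kind of leak.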

Memory Monitoring and Debugging

Add memory monitoring to your app:

// Simple memory usage logging
setInterval(() => {
  const used = process.memoryUsage();
  console.log('Memory usage:', {
    rss: Math.round(used.rss / 1024 / 1024) + 'MB',
    heapTotal: Math.round(used.heapTotal / 1024 / 1024) + 'MB',
    heapUsed: Math.round(used.heapUsed / 1024 / 1024) + 'MB',
    external: Math.round(used.external / 1024 / 1024) + 'MB'
  });
}, 30000); // Log every 30 seconds
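
Raw numbers in a log are easy to ignore, so it can help to flag a suspicious trend automatically. Here is a rough sketch that keeps a small window of heapUsed samples and warns when the heap grew in every one of them. The window size and the function names are assumptions for this example, and the heuristic is deliberately crude: a healthy heap rises and falls as the garbage collector runs.

```javascript
const WINDOW = 10; // number of samples to keep (assumed value, tune to taste)
const samples = [];

// Store the latest heapUsed reading, keeping the buffer itself bounded
function recordSample(heapUsedBytes) {
  samples.push(heapUsedBytes);
  if (samples.length > WINDOW) samples.shift();
}

// True when we have a full window and every sample is larger than the previous one
function looksLikeLeak(values, windowSize = WINDOW) {
  if (values.length < windowSize) return false; // not enough data yet
  for (let i = 1; i < values.length; i++) {
    if (values[i] <= values[i - 1]) return false; // memory was reclaimed at some point
  }
  return true;
}

const timer = setInterval(() => {
  recordSample(process.memoryUsage().heapUsed);
  if (looksLikeLeak(samples)) {
    console.warn(`heapUsed rose in each of the last ${WINDOW} samples`);
  }
}, 30000);
timer.unref(); // never keep the process alive just for monitoring
```

Treat a warning from this as a prompt to take a closer look, not as proof of a leak.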

For deeper debugging, use the built-in inspector:

# Start with inspector
node --inspect app.js

# Or start the app under PM2 with the inspector enabled
pm2 start app.js --node-args="--inspect"
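
When you can't keep an inspector attached, a heap snapshot taken on demand is a useful alternative. The sketch below uses Node's built-in v8.writeHeapSnapshot(); the dumpHeap name, the temp-directory location, and the choice of SIGUSR2 as the trigger are all assumptions for this example (SIGUSR1 is off limits because Node.js reserves it for starting the inspector).

```javascript
const v8 = require('v8');
const os = require('os');
const path = require('path');

// Writes a heap snapshot into `dir` and returns the path of the file.
// Taking a snapshot blocks the event loop and needs a lot of spare memory,
// so avoid doing it on a process that is already close to its limit.
function dumpHeap(dir = os.tmpdir()) {
  const file = path.join(dir, `heap-${process.pid}-${Date.now()}.heapsnapshot`);
  return v8.writeHeapSnapshot(file);
}

// Trigger from a shell with: kill -USR2 <pid>
process.on('SIGUSR2', () => {
  console.log('Heap snapshot written to', dumpHeap());
});
```

Load the resulting .heapsnapshot file in the Memory tab of Chrome DevTools. Comparing two snapshots taken a few minutes apart usually shows which objects keep piling up.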

Crash Recovery That Actually Works

Crashes happen. The key is handling them gracefully and getting back up quickly.

Graceful Shutdown Handling

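process.on('SIGTERM', gracefulShutdown);
process.on('SIGINT', gracefulShutdown);

// Assumes `server` is the http.Server returned by app.listen()
function gracefulShutdown(signal) {
  console.log(`Received ${signal}. Starting graceful shutdown...`);

  // Force exit after 10 seconds; unref() so the timer never delays a clean exit
  setTimeout(() => {
    console.error('Forceful shutdown');
    process.exit(1);
  }, 10000).unref();

  // Stop accepting new connections and wait for in-flight requests to finish
  server.close(async () => {
    console.log('HTTP server closed.');

    try {
      // Close database connections (Mongoose 7+ returns a promise; callbacks were removed)
      await mongoose.connection.close();
      console.log('Database connection closed.');
      process.exit(0);
    } catch (err) {
      console.error('Error while closing the database connection:', err);
      process.exit(1);
    }
  });
}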
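
One gotcha: server.close() only finishes once every open connection has ended, and idle keep-alive connections can hold it up for a long time. Here is a sketch of a promise-based close that deals with that. The closeServer name and its return values are made up for this example; closeIdleConnections() and closeAllConnections() exist on http.Server from Node 18.2, so the optional calls simply do nothing on older versions.

```javascript
// Resolves with 'clean' if the server closed by itself within `timeoutMs`,
// or with 'forced' if the deadline passed and remaining sockets were dropped.
function closeServer(server, timeoutMs) {
  return new Promise((resolve) => {
    const timer = setTimeout(() => {
      // Deadline reached: drop whatever connections are still open
      server.closeAllConnections?.();
      resolve('forced');
    }, timeoutMs);

    // Stop accepting new connections; the callback fires once all have ended
    server.close(() => {
      clearTimeout(timer);
      resolve('clean');
    });

    // Idle keep-alive sockets would otherwise keep close() waiting
    server.closeIdleConnections?.();
  });
}
```

Inside a shutdown handler you would await closeServer(server, 10000), then close the database, and finally call process.exit().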

Uncaught Exception Handling

// Last resort error handling
process.on('uncaughtException', (err) => {
  console.error('Uncaught Exception:', err);
  
  // Log to external service (Sentry, LogRocket, etc.)
  logger.fatal(err, 'Uncaught exception');
  
  // Graceful shutdown
  gracefulShutdown('uncaughtException');
});

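process.on('unhandledRejection', (reason, promise) => {
  console.error('Unhandled Rejection at:', promise, 'reason:', reason);
  logger.error({ reason }, 'Unhandled promise rejection');

  // Registering this handler replaces Node's default behavior (crash, since Node 15).
  // Rethrow so the error goes through the uncaughtException handler above
  // and the process still shuts down and gets restarted.
  throw reason;
});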
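
These handlers are a safety net, so it's better to keep rejections from reaching them in the first place. In Express 4, a rejected promise inside an async route handler is not passed to the error middleware automatically. A small wrapper can do that; asyncHandler is a hypothetical name for this sketch, not an Express API.

```javascript
// Wraps an async route handler so that a thrown error or rejected promise is
// passed to Express's next(err) instead of becoming an unhandled rejection.
function asyncHandler(handler) {
  return async (req, res, next) => {
    try {
      await handler(req, res, next);
    } catch (err) {
      next(err);
    }
  };
}

// Usage (assumes an Express `app` and a `db` object with a findUser method):
// app.get('/api/users/:id', asyncHandler(async (req, res) => {
//   res.json(await db.findUser(req.params.id));
// }));
```

With the wrapper in place, failures end up in your normal error-handling middleware, where you can log them and return a 500 without taking the whole process down.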

Health Checks and Monitoring

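Expose a health check endpoint that load balancers and uptime monitors can poll. PM2 itself only watches whether the process is alive, so an HTTP check like this is what catches an app that is running but can't actually serve requests: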

app.get('/health', async (req, res) => {
  try {
    // Check database connection
    await db.ping();
    
    // Check external APIs
    // await externalService.ping();
    
    res.status(200).json({
      status: 'healthy',
      uptime: process.uptime(),
      memory: process.memoryUsage(),
      timestamp: new Date().toISOString()
    });
  } catch (error) {
    res.status(503).json({
      status: 'unhealthy',
      error: error.message
    });
  }
});
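
A health check that hangs is as bad as one that fails, because the load balancer ends up waiting on it. One way to guard against that is to run every dependency check under a timeout. The sketch below is one possible shape for this; withTimeout, runHealthChecks, and the 2 second default are assumptions made for the example.

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// `checks` maps a name to a function returning a promise,
// for example { db: () => db.ping() }
async function runHealthChecks(checks, timeoutMs = 2000) {
  const names = Object.keys(checks);
  const results = await Promise.allSettled(
    names.map((name) =>
      // Promise.resolve().then(...) also turns synchronous throws into rejections
      withTimeout(Promise.resolve().then(() => checks[name]()), timeoutMs, name)
    )
  );

  const failures = {};
  results.forEach((result, i) => {
    if (result.status === 'rejected') {
      failures[names[i]] = String(result.reason?.message ?? result.reason);
    }
  });
  return { healthy: Object.keys(failures).length === 0, failures };
}
```

In the /health route you would await runHealthChecks({ db: () => db.ping() }) and answer with 200 or 503 depending on the healthy flag.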

Production-Ready PM2 Configuration

Here's a battle-tested PM2 ecosystem config:

module.exports = {
  apps: [{
    name: 'production-app',
    script: './dist/server.js',
    instances: 'max',
    exec_mode: 'cluster',
    
    // Environment variables (applied with: pm2 start ecosystem.config.js --env production)
    env_production: {
      NODE_ENV: 'production',
      PORT: 3000
    },
    
    // Restart policy
    max_memory_restart: '1G',
    restart_delay: 4000,
    max_restarts: 10,
    min_uptime: '10s',
    
    // Logging
    log_file: './logs/combined.log',
    out_file: './logs/out.log',
    error_file: './logs/error.log',
    log_date_format: 'YYYY-MM-DD HH:mm Z',
    
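    // Advanced options
    // ms PM2 waits after the shutdown signal before sending SIGKILL;
    // keep it longer than the app's own 10 second shutdown timeout
    kill_timeout: 12000,
    // ms PM2 waits for a new instance to come up during a reload
    listen_timeout: 8000,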
    
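    // Readiness: set wait_ready: true if the app calls process.send('ready')
    // once it can serve traffic; PM2 then waits up to listen_timeout for that signal
    wait_ready: false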
  }]
}
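
If you turn on PM2's wait_ready option, the app has to tell PM2 when it is ready to take traffic. A minimal sketch of that signal is below; signalReady is a made-up helper name, while process.send('ready') is the message PM2 listens for.

```javascript
// Tells PM2 this instance can take traffic. `send` only exists when the process
// has an IPC channel to a parent, so this is a harmless no-op under plain `node`.
function signalReady(proc = process) {
  if (typeof proc.send !== 'function') return false;
  proc.send('ready');
  return true;
}

// Typical call site (assumes an http.Server named `server`):
// server.listen(process.env.PORT, () => signalReady());
```

Send the signal only after everything the app needs is up (database connected, caches warmed), so that a reload never routes requests to a half-started instance.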

Monitoring and Alerts

Set up PM2's built-in monitoring or integrate with external services:

# PM2 Plus (web monitoring)
pm2 plus

# Custom monitoring script, run every 5 minutes
# (--no-autorestart keeps PM2 from relaunching it the moment it exits)
pm2 start monitor.js --no-autorestart --cron "*/5 * * * *"
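
What goes into monitor.js is up to you. As one possibility, the sketch below reads the JSON process list from pm2 jlist and reports anything that looks off. The limits are arbitrary, and the field names it reads (name, monit.memory, pm2_env.status, pm2_env.restart_time) are assumptions about that output, so check them against what your PM2 version prints.

```javascript
// monitor.js (hypothetical): flags PM2 processes that look unhealthy
const { execFileSync } = require('child_process');

// Assumed limits for this sketch; tune them for your app
const LIMITS = { maxMemoryBytes: 800 * 1024 * 1024, maxRestarts: 5 };

// `list` is the parsed output of `pm2 jlist`
function findProblems(list, limits = LIMITS) {
  const problems = [];
  for (const proc of list) {
    const status = proc.pm2_env?.status;
    const restarts = proc.pm2_env?.restart_time ?? 0;
    const memory = proc.monit?.memory ?? 0;

    if (status !== 'online') problems.push(`${proc.name}: status is ${status}`);
    if (restarts > limits.maxRestarts) problems.push(`${proc.name}: restarted ${restarts} times`);
    if (memory > limits.maxMemoryBytes) {
      problems.push(`${proc.name}: using ${Math.round(memory / 1024 / 1024)}MB`);
    }
  }
  return problems;
}

function main() {
  const list = JSON.parse(execFileSync('pm2', ['jlist'], { encoding: 'utf8' }));
  const problems = findProblems(list);
  // Replace console.error with your alerting channel of choice
  problems.forEach((problem) => console.error('ALERT', problem));
}

// main(); // uncomment when this file runs as the scheduled monitor script
```

Keeping findProblems() separate from the code that shells out to PM2 means you can test the rules against sample data without a running PM2 daemon.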

The Bottom Line

Running Node.js in production doesn't have to be a nightmare. With PM2 handling process management, bounded caches and cleaned-up listeners keeping memory in check, and crash recovery that restarts failed processes automatically, your apps can stay up for long stretches without manual intervention.

The key is being proactive: monitor memory usage, implement health checks, handle errors gracefully, and always have a restart strategy. Your 3 AM self will thank you.

Remember, the best production setup is one you don't have to think about. Set it up right once, and focus on building features instead of fighting fires.