502 Bad Gateway Error: Complete Diagnosis

Understand and resolve the "Bad Gateway" error between proxy and backend.

The 502 Bad Gateway error is one of the most frustrating errors to diagnose. It indicates that the proxy server (Nginx, Apache, or a load balancer) could not get a valid response from the backend server. The problem is not with the proxy itself, but somewhere upstream.

This error can have dozens of different causes: PHP-FPM service crash, database overload, application timeout, or simple misconfiguration. The key to diagnosis is knowing exactly where to look.

This guide walks you through diagnosing a 502 error step by step. From basic checks to advanced commands, you'll have everything you need to identify and resolve the problem quickly.

Understanding the 502 Bad Gateway Error

The 502 error occurs in architectures with a reverse proxy:

  • The proxy's role: Nginx or Apache acts as intermediary between user and your application. It receives requests and forwards them to the backend (PHP-FPM, Node.js, Gunicorn...).
  • Failed communication: The proxy tried to contact the backend but did not get a valid response. The backend may be down, saturated, or may have returned a malformed response.
  • Difference from 503: A 503 says "I'm overloaded, try again later". A 502 says "I can't communicate with the backend". The 502 is generally more serious.
  • User impact: The user sees an error page and cannot access the service. Unlike slowness, waiting doesn't help: the request has definitively failed.
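In Nginx terms, that proxy/backend relationship looks like the sketch below. The names and addresses (app_backend, 127.0.0.1:3000) are illustrative assumptions, not a recommended setup; a 502 means the proxy_pass hop failed.

```nginx
# The backend pool Nginx forwards requests to (names and addresses are examples)
upstream app_backend {
    server 127.0.0.1:3000;             # e.g. a Node.js or Gunicorn process
    # PHP-FPM typically listens on a unix socket instead, reached via fastcgi_pass:
    # server unix:/run/php/php-fpm.sock;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;  # a 502 means this hop failed
    }
}
```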

Main Causes of 502 Error

Here are the most frequent causes of a 502 Bad Gateway error:

  • Backend service stopped: PHP-FPM, Node.js, or Gunicorn has stopped or crashed, so the proxy has no one to talk to. This is the most frequent cause.
  • Timeout exceeded: The application takes too long to respond and the proxy gives up. Typical culprits: a slow request, a saturated database, or a blocked external service call.
  • Resources exhausted: The backend lacks workers/threads to process requests. All connections are busy and new ones are refused.
  • Socket/port inaccessible: The proxy tries to connect to the wrong socket or port, or permissions are incorrect. This is a configuration problem.
  • Insufficient memory: The backend process was killed by the system's OOM killer: the server ran out of RAM and sacrificed the service.

Diagnostic Steps

Follow these steps to identify the 502 error cause:

  1. Check backend status: Verify that PHP-FPM, Node.js, or your backend service is running. This is cause #1.
  2. Check proxy logs: Nginx/Apache logs contain the exact error message: "upstream timed out", "connection refused", etc.
  3. Check backend logs: PHP-FPM, Node.js, or your framework usually log the reason for crash or failure.
  4. Monitor resources: Check CPU, RAM, and disk. A saturated system can cause erratic behavior.
  5. Test direct connection: Try contacting the backend directly (curl to port/socket) to isolate the problem.
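Step 5 can be sketched as follows. This is a minimal sketch assuming a backend on 127.0.0.1:9000; adjust the host, port, or socket path to your own setup.

```shell
# Hypothetical backend address: adjust to your setup.
BACKEND_HOST=127.0.0.1
BACKEND_PORT=9000

# TCP reachability check using bash's /dev/tcp (no extra tools required)
if timeout 2 bash -c "exec 3<>/dev/tcp/${BACKEND_HOST}/${BACKEND_PORT}" 2>/dev/null; then
  echo "backend port is open"
else
  echo "backend port is closed or unreachable"
fi

# For an HTTP backend (Node.js, Gunicorn...), query it directly, bypassing the proxy:
#   curl -sS --max-time 5 http://127.0.0.1:3000/health
# For a PHP-FPM unix socket, verify the socket exists and check its permissions:
#   ls -l /run/php/php-fpm.sock
```

If the direct connection succeeds while the proxy still returns 502, the problem is in the proxy configuration (wrong socket/port, permissions); if it fails, the backend itself is down or unreachable.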

Diagnostic Commands

Here are essential commands to diagnose a 502 error:

# Check PHP-FPM status (the unit may be named php8.2-fpm on Debian/Ubuntu)
systemctl status php-fpm
journalctl -u php-fpm -n 50

# Check Nginx status
systemctl status nginx
tail -100 /var/log/nginx/error.log

# Search recent 502 errors
grep "502" /var/log/nginx/access.log | tail -20

# Check system resources
free -h
df -h
top -bn1 | head -20

These commands give you a complete view: service status, error logs, system resources, and connectivity. Start with systemctl status for a quick first assessment.
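Beyond a raw grep, a short awk pipeline can rank which URLs are generating 502s. A minimal sketch, assuming the default Nginx "combined" log format (status code in field 9, request path in field 7); the sample lines below stand in for a real /var/log/nginx/access.log.

```shell
# Build a small sample log (stand-in for /var/log/nginx/access.log)
printf '%s\n' \
  '1.2.3.4 - - [10/May/2024:10:00:01 +0000] "GET /api/users HTTP/1.1" 502 150 "-" "curl"' \
  '1.2.3.4 - - [10/May/2024:10:00:02 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl"' \
  '5.6.7.8 - - [10/May/2024:10:00:03 +0000] "GET /api/users HTTP/1.1" 502 150 "-" "curl"' \
  > /tmp/sample_access.log

# Count 502s per request path, most affected first
awk '$9 == 502 { count[$7]++ } END { for (url in count) print count[url], url }' \
  /tmp/sample_access.log | sort -rn
# -> 2 /api/users
```

A concentration of 502s on a few URLs usually points to a slow endpoint hitting the timeout; 502s spread across all URLs point to the backend being down outright.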

Solutions by Cause

Once the cause is identified, apply the appropriate solution:

  • Backend stopped: Restart the service (systemctl restart php-fpm). Check the logs to understand why it stopped and fix the root cause.
  • Timeout exceeded: Increase timeouts in Nginx (proxy_read_timeout, proxy_connect_timeout) and optimize slow queries on application side.
  • Insufficient workers: Increase PHP-FPM worker count (pm.max_children) or Node.js workers. Monitor available RAM.
  • Memory exhausted: Add RAM or optimize memory consumption. Configure appropriate limits to avoid OOM killer.
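For the timeout case, the relevant Nginx directives look like this. A hedged sketch: the values are illustrative starting points, not recommendations, and belong in the http, server, or location context of your configuration.

```nginx
proxy_connect_timeout 10s;   # time allowed to establish the connection to the backend
proxy_read_timeout    60s;   # max time between two successive reads from the backend
proxy_send_timeout    60s;   # max time between two successive writes to the backend

# When PHP-FPM is reached via fastcgi_pass, use the FastCGI equivalent:
# fastcgi_read_timeout 60s;
```

For the worker-count fix, pm.max_children lives in the PHP-FPM pool file (often under /etc/php/*/fpm/pool.d/, the exact path varies by distribution); raise it only if RAM allows, since each worker consumes memory.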

Preventing 502 Errors

Prevention is better than cure. Here's how to avoid future 502 errors:

  • Proactive monitoring: Monitor response times and availability with MoniTao. Progressive degradation often precedes a 502.
  • Metric alerts: Configure alerts on CPU > 80%, RAM > 90%, and connection count. Act before saturation.
  • Health checks: Configure health checks on your load balancer to detect failing backends and automatically exclude them.
  • Redundancy: Use multiple backend instances behind a load balancer. If one instance crashes, others take over.
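With open-source Nginx, health checks and redundancy can be combined as passive checks on the upstream, sketched below (the instance addresses are assumptions): after max_fails failed attempts within fail_timeout, an instance is temporarily taken out of rotation.

```nginx
upstream app_backend {
    server 10.0.0.11:3000 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:3000 max_fails=3 fail_timeout=30s;
}
```

Note that Nginx's active health_check directive requires Nginx Plus; dedicated load balancers (HAProxy, cloud load balancers) generally offer active health checks natively.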

502 Diagnostic Checklist

  • Verify backend service (PHP-FPM/Node) is started
  • Check Nginx/Apache error logs
  • Check backend logs for crashes
  • Monitor system resources (RAM, CPU, disk)
  • Test direct connection to backend socket/port
  • Verify proxy timeout configuration

Frequently Asked Questions

What's the difference between 502 and 504?

A 502 (Bad Gateway) means the proxy received an invalid response from backend. A 504 (Gateway Timeout) means the proxy received no response within the allotted time. 502 often indicates a crash, 504 slowness.

Is the 502 error client or server side?

It's a server-side error (5xx); the client can't do anything about it. However, the user can retry later, as the error is often temporary.

How to avoid 502s during deployments?

Use zero-downtime deployment: start new instances before stopping the old ones, or use a graceful PHP-FPM reload (systemctl reload php-fpm), which finishes in-flight requests before restarting workers.

Cloudflare shows 502 but my server works?

Cloudflare acts as a proxy: if it shows a 502, it couldn't reach your origin server or received an invalid response from it. Check your firewall rules, your origin's response times, and that your server accepts connections from Cloudflare's IP ranges.

Can a 502 come from an external API?

Indirectly, yes. If your application calls an external API that fails or hangs, your backend can error out or exceed the proxy's timeout, which surfaces as a 502. Check the API calls in your application logs.

How to configure a custom 502 error message?

In Nginx, use the directive error_page 502 /custom-502.html; and create a static HTML page. Avoid having this page depend on PHP, since the backend is precisely what's failing.
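A minimal sketch of such a server block (paths and the backend address are assumptions):

```nginx
server {
    listen 80;
    root /var/www/html;

    error_page 502 /custom-502.html;        # serve a static page on 502
    location = /custom-502.html {
        internal;                           # only reachable via error_page
    }

    location / {
        proxy_pass http://127.0.0.1:3000;   # assumed backend
        proxy_intercept_errors on;          # also intercept 5xx returned by the backend
    }
}
```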

Conclusion

The 502 Bad Gateway error is a symptom pointing to a communication problem between your proxy and backend. The root cause can be a crash, a timeout, or resource exhaustion.

With MoniTao, you detect these problems before they impact your users. Timeout and response time alerts warn you of degradations that often precede a complete 502.

Ready to Sleep Soundly?

Start free, no credit card required.