Build an API Health Check Endpoint

Every API needs a health check endpoint. It's the first thing any monitoring tool hits, the first thing load balancers query, and the first thing an on-call engineer checks at 3 AM when something seems off. But most health check endpoints are dangerously shallow — they return 200 OK as long as the web server process is alive, even if the database is down, the cache is unreachable, or a critical dependency has been offline for hours.

A good health check endpoint tells you whether your service can actually do its job, not just whether it's running. This guide covers how to build one that's genuinely useful for monitoring, along with implementation patterns in several popular frameworks.

What a Health Check Endpoint Should Do

A proper health check answers one question: "Can this service fulfill requests right now?" That means checking more than whether the HTTP server is responding. It means verifying every critical dependency the service needs to function.

At minimum, your health check should verify:

Database connectivity — Can the service connect to the database and execute a simple query?
Cache connectivity — If your service depends on Redis or Memcached, is it reachable?
Disk space — Is there enough disk for the service to write logs, temp files, or uploads?
Critical external services — If you depend on a payment processor or auth provider, is it reachable?
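
The framework examples later in this guide focus on the database and cache checks, so here is a minimal Python sketch of the other two items on the list, using only the standard library. The function names, the 1 GB threshold (borrowed from the Laravel example), and the rule that any HTTP answer below 500 counts as "reachable" are assumptions for illustration, not requirements.

```python
import shutil
import time
import urllib.error
import urllib.request


def check_disk(path="/", min_free_gb=1.0):
    """Report free space on the volume that holds `path`."""
    free_gb = shutil.disk_usage(path).free / (1024 ** 3)
    return {
        "status": "healthy" if free_gb > min_free_gb else "degraded",
        "free_gb": round(free_gb, 1),
    }


def check_external(url, timeout_s=2.0):
    """Treat any HTTP answer below 500 as proof the service is reachable."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s):
            pass
    except urllib.error.HTTPError as err:
        # The service answered; only a 5xx counts as a failure here
        if err.code >= 500:
            return {"status": "unhealthy", "error": "Service error"}
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL
        return {"status": "unhealthy", "error": "Connection failed"}
    latency_ms = round((time.perf_counter() - start) * 1000)
    return {"status": "healthy", "latency_ms": latency_ms}
```

Both functions return the same small dictionary shape, so their results can be dropped straight into a checks map next to the database and cache results.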

The health check should return a clear, machine-readable response with a top-level status and individual component statuses. This lets monitoring tools (including PulseAPI) not only detect failures but understand what's failing.

The Response Format

Use a consistent JSON response format. Here's a pattern that works well:

{
  "status": "healthy",
  "timestamp": "2026-03-10T14:32:00Z",
  "version": "1.4.2",
  "checks": {
    "database": { "status": "healthy", "latency_ms": 3 },
    "redis": { "status": "healthy", "latency_ms": 1 },
    "disk": { "status": "healthy", "free_gb": 42.5 },
    "stripe_api": { "status": "healthy", "latency_ms": 145 }
  }
}

When something is failing:

{
  "status": "degraded",
  "timestamp": "2026-03-10T14:32:00Z",
  "version": "1.4.2",
  "checks": {
    "database": { "status": "healthy", "latency_ms": 3 },
    "redis": { "status": "unhealthy", "error": "Connection refused" },
    "disk": { "status": "healthy", "free_gb": 42.5 },
    "stripe_api": { "status": "healthy", "latency_ms": 145 }
  }
}

Status values should be one of three: healthy, degraded, or unhealthy. The top-level status is unhealthy if any critical component is down, degraded if a non-critical component is down, and healthy if everything checks out.

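Return the right HTTP status codes too: 200 for healthy, 200 for degraded (the service can still partially function), and 503 Service Unavailable for unhealthy. This matters because monitoring tools like PulseAPI typically check the HTTP status code as the first-pass signal. Because degraded also returns 200, a monitor that looks only at the status code will not notice it; if degraded states matter to you, alert on the status field in the response body as well.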
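
The status rules and the status-code mapping can be sketched in a few lines of Python. The function names and the `critical` parameter are hypothetical; which components count as critical is a decision for your service. One assumption goes beyond the text: a component that reports degraded itself (low disk space, for example) makes the top-level status degraded.

```python
# Status-code mapping described above: degraded still answers 200
HTTP_STATUS = {"healthy": 200, "degraded": 200, "unhealthy": 503}


def overall_status(checks, critical):
    """Combine component results into one top-level status.

    checks:   component name -> {"status": "healthy" | "degraded" | "unhealthy", ...}
    critical: set of component names the service cannot work without
    """
    status = "healthy"
    for name, result in checks.items():
        if result["status"] == "healthy":
            continue
        if name in critical and result["status"] == "unhealthy":
            return "unhealthy"  # one critical failure decides the outcome
        status = "degraded"     # anything else that is not healthy
    return status


def http_status_for(status):
    # Unknown values are treated as failures rather than silently passing
    return HTTP_STATUS.get(status, 503)
```

With the failing response shown earlier (Redis down, everything else healthy) and only the database marked critical, `overall_status` returns "degraded" and the endpoint still answers 200.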

Implementation: Laravel (PHP)

If you're building with Laravel (like we did with PulseAPI), here's a clean implementation:

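// routes/api.php
use App\Http\Controllers\HealthCheckController;
use Illuminate\Support\Facades\Route;

Route::get('/health', [HealthCheckController::class, 'check']);

// app/Http/Controllers/HealthCheckController.php
namespace App\Http\Controllers;

use Illuminate\Http\JsonResponse;
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\DB;

class HealthCheckController extends Controller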
{
    public function check(): JsonResponse
    {
        $checks = [];
        $isHealthy = true;

        // Database check
        try {
            $start = microtime(true);
            DB::select('SELECT 1');
            $checks['database'] = [
                'status' => 'healthy',
                'latency_ms' => round((microtime(true) - $start) * 1000),
            ];
        } catch (\Exception $e) {
            $checks['database'] = [
                'status' => 'unhealthy',
                'error' => 'Connection failed',
            ];
            $isHealthy = false;
        }

        // Redis check
        try {
            $start = microtime(true);
            Cache::store('redis')->get('health-check');
            $checks['redis'] = [
                'status' => 'healthy',
                'latency_ms' => round((microtime(true) - $start) * 1000),
            ];
        } catch (\Exception $e) {
            $checks['redis'] = [
                'status' => 'unhealthy',
                'error' => 'Connection failed',
            ];
            $isHealthy = false;
        }

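        // Disk check: low free space degrades the service but does not take it down
        $freeBytes = disk_free_space('/'); // returns false if the lookup fails
        $freeGb = $freeBytes === false ? 0.0 : $freeBytes / (1024 * 1024 * 1024);
        $diskOk = $freeGb > 1.0;
        $checks['disk'] = [
            'status' => $diskOk ? 'healthy' : 'degraded',
            'free_gb' => round($freeGb, 1),
        ];

        // A critical failure wins; otherwise low disk space makes the service degraded
        $status = !$isHealthy ? 'unhealthy' : ($diskOk ? 'healthy' : 'degraded');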

        return response()->json([
            'status' => $status,
            'timestamp' => now()->toISOString(),
            'version' => config('app.version', '1.0.0'),
            'checks' => $checks,
        ], $isHealthy ? 200 : 503);
    }
}

Implementation: Express (Node.js)

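// routes/health.js
// Assumes `app` is your Express app, `pool` is a pg Pool, and `redis` is a
// connected Redis client that exposes ping() (ioredis and node-redis both do).
app.get('/health', async (req, res) => {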
  const checks = {};
  let isHealthy = true;

  // Database check (PostgreSQL with pg)
  try {
    const start = Date.now();
    await pool.query('SELECT 1');
    checks.database = {
      status: 'healthy',
      latency_ms: Date.now() - start,
    };
  } catch (err) {
    checks.database = { status: 'unhealthy', error: 'Connection failed' };
    isHealthy = false;
  }

  // Redis check
  try {
    const start = Date.now();
    await redis.ping();
    checks.redis = {
      status: 'healthy',
      latency_ms: Date.now() - start,
    };
  } catch (err) {
    checks.redis = { status: 'unhealthy', error: 'Connection failed' };
    isHealthy = false;
  }

  const status = isHealthy ? 'healthy' : 'unhealthy';
  res.status(isHealthy ? 200 : 503).json({
    status,
    timestamp: new Date().toISOString(),
    version: process.env.APP_VERSION || '1.0.0',
    checks,
  });
});

Implementation: FastAPI (Python)

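import os
import time
from datetime import datetime, timezone

import psycopg2
import redis
from fastapi import FastAPI, Response

app = FastAPI()

# Connection strings are read from the environment (variable names are examples)
DATABASE_URL = os.environ["DATABASE_URL"]
REDIS_URL = os.environ["REDIS_URL"]


# Plain `def`, not `async def`: psycopg2 and redis-py block while they wait,
# so FastAPI should run this handler in its thread pool instead of on the event loop.
@app.get("/health")
def health_check(response: Response):
    checks = {}
    is_healthy = True

    # Database check
    try:
        start = time.perf_counter()
        conn = psycopg2.connect(DATABASE_URL, connect_timeout=2)
        try:
            with conn.cursor() as cur:
                cur.execute("SELECT 1")
        finally:
            conn.close()  # always release the connection, even if the query fails
        checks["database"] = {
            "status": "healthy",
            "latency_ms": round((time.perf_counter() - start) * 1000),
        }
    except Exception:
        checks["database"] = {"status": "unhealthy", "error": "Connection failed"}
        is_healthy = False

    # Redis check
    try:
        start = time.perf_counter()
        r = redis.from_url(REDIS_URL, socket_connect_timeout=2, socket_timeout=2)
        try:
            r.ping()
        finally:
            r.close()
        checks["redis"] = {
            "status": "healthy",
            "latency_ms": round((time.perf_counter() - start) * 1000),
        }
    except Exception:
        checks["redis"] = {"status": "unhealthy", "error": "Connection failed"}
        is_healthy = False

    status = "healthy" if is_healthy else "unhealthy"
    response.status_code = 200 if is_healthy else 503

    return {
        "status": status,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "version": os.environ.get("APP_VERSION", "1.0.0"),
        "checks": checks,
    }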

Best Practices

Keep it fast

Your health check will be called frequently: every 30-60 seconds by monitoring tools, and potentially every few seconds by load balancers. In the normal case it should complete in under 500ms. If a dependency check can be slow (like an external API), give it a short timeout (2-3 seconds) and report the dependency as degraded when the timeout expires, rather than letting it block the entire response.
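
One way to apply a per-check timeout is sketched below with Python's asyncio. It assumes each dependency check is an async function that raises on failure; `timed_check` and `run_checks` are hypothetical names. The checks run concurrently, so the whole endpoint takes about as long as the slowest check, capped by the timeout.

```python
import asyncio
import time


async def timed_check(check, timeout_s=2.0):
    """Run one async check; a timeout is reported as degraded, an error as unhealthy."""
    start = time.perf_counter()
    try:
        await asyncio.wait_for(check(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"status": "degraded", "error": "Timed out"}
    except Exception:
        return {"status": "unhealthy", "error": "Connection failed"}
    latency_ms = round((time.perf_counter() - start) * 1000)
    return {"status": "healthy", "latency_ms": latency_ms}


async def run_checks(checks, timeout_s=2.0):
    """checks: name -> async function taking no arguments. Runs them all at once."""
    names = list(checks)
    results = await asyncio.gather(
        *(timed_check(checks[name], timeout_s) for name in names)
    )
    return dict(zip(names, results))
```

Note that `asyncio.wait_for` can only cancel code that awaits. A blocking client call has to be moved to a thread first (for example with `asyncio.to_thread`) or given its own socket timeout.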

Don't expose sensitive information

The health check response should not include database connection strings, internal hostnames, stack traces, or error details that could help an attacker. Keep error messages generic: "Connection failed" is sufficient. "Connection to mysql://root:password@10.0.1.5:3306/prod failed" is a security vulnerability.

For extra protection, you can require an API key for the detailed health response and return a simplified version without component details for unauthenticated requests.
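
A framework-neutral sketch of that idea in Python follows. The function name and the way the key reaches it (a request header, most likely) are assumptions; the point is that callers without a valid key see only the top-level fields.

```python
import hmac


def view_for_caller(report, presented_key, expected_key):
    """Return the full report to callers with a valid key, a summary to everyone else."""
    authorized = (
        bool(expected_key)
        and presented_key is not None
        # compare_digest avoids leaking key contents through timing differences
        and hmac.compare_digest(presented_key.encode(), expected_key.encode())
    )
    if authorized:
        return report
    # No component names, latencies, or error strings for anonymous callers
    return {"status": report["status"], "timestamp": report["timestamp"]}
```

The HTTP status code stays the same for both views, so load balancers and simple uptime monitors keep working without a key.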

Separate liveness from readiness

In containerized environments (Kubernetes, ECS), there's an important distinction between "is the process alive?" (liveness) and "can it handle traffic?" (readiness). Consider implementing two endpoints:

/health/live: Returns 200 if the process is running. Used by the orchestrator to decide whether to restart the container.
/health/ready: Returns 200 only if every dependency the service needs to handle requests is available. Used by the orchestrator or load balancer to decide whether to route traffic to this instance.

A service that's live but not ready (database connection lost during a failover, for example) should not receive traffic, but also shouldn't be killed — it may recover on its own once the dependency comes back.
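
The split can be sketched as two small Python functions that return a status code and a body, which you would wire to the two routes in your framework. The names and the `required` parameter are hypothetical; which dependencies readiness should insist on is up to you.

```python
def liveness():
    """No dependency checks: if this code runs, the process is alive."""
    return 200, {"status": "healthy"}


def readiness(checks, required):
    """Ready only when every required dependency reports healthy.

    checks:   component name -> {"status": ...} from your dependency checks
    required: names of the dependencies needed to serve traffic
    """
    # A required dependency with no result at all also counts as not ready
    failing = sorted(
        name for name in required
        if checks.get(name, {}).get("status") != "healthy"
    )
    if failing:
        return 503, {"status": "unhealthy", "failing": failing}
    return 200, {"status": "healthy"}
```

During a database failover, `readiness` answers 503 so the instance stops receiving traffic, while `liveness` keeps answering 200 so the container is not restarted.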

Version your health check

Include your application version in the response. This is invaluable during deployments — if you see a spike in errors, you can immediately check whether the health endpoint is reporting the new version or the old one, helping you determine if the issue is related to the deploy.

Don't cache the response

Health checks should reflect the current state of the system, not a cached state from 5 minutes ago. Ensure your health endpoint bypasses any HTTP caching layers (set Cache-Control: no-cache, no-store headers) and doesn't cache dependency check results internally.
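
As a small illustration, here is a standard-library WSGI app in Python that sets the header on every health response. The app name is hypothetical and the body is a placeholder; a real endpoint would build it from live dependency checks on each request.

```python
import json


def health_app(environ, start_response):
    """Minimal WSGI health endpoint that tells caches never to store the response."""
    # Placeholder body: a real endpoint would run its checks here, on every call
    body = json.dumps({"status": "healthy"}).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        ("Cache-Control", "no-cache, no-store"),
    ]
    start_response("200 OK", headers)
    return [body]
```

Most frameworks let you set the same header on the response object, so the idea carries over directly to the Laravel, Express, and FastAPI examples above.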

Connecting Your Health Check to Monitoring

Once your health endpoint is deployed, add it as an endpoint in PulseAPI:

Set the URL to your health check endpoint (e.g., https://api.yourapp.com/health)
Set the expected status code to 200
Set the check interval to 1 minute
PulseAPI will automatically create detection rules for HTTP errors and slow responses

If your health endpoint returns a 503, PulseAPI will detect the non-200 status code and trigger an alert through your configured notification channels. You'll know within a minute that a dependency is down — instead of finding out from a customer support ticket 30 minutes later.

For even deeper monitoring, consider adding separate PulseAPI checks for each critical dependency endpoint (your database admin panel's health route, your payment provider's status API, etc.). This gives you a clear picture of whether the problem is internal or external when something fails.

Key Takeaways

A health check endpoint that only verifies "the process is running" is almost useless. Build health checks that verify every critical dependency — database, cache, disk, external services — and return structured, machine-readable responses. Keep them fast, secure, and uncached. Then connect them to a monitoring tool that checks them continuously and alerts you when something goes wrong.

The 30 minutes you spend building a proper health check will save you hours of debugging during your next incident.

PulseAPI monitors your health check endpoints every minute with intelligent anomaly detection — so you find out about failures in seconds, not from customer complaints. Start monitoring free →
