How to Detect a Silent Cron

Never again let a cron fail silently without your knowing

A "silent" cron represents one of the most insidious problems in system administration. Unlike a cron that fails with a visible error, a silent cron simply stops running without generating any error message. No alert email, no error log, no notification: the cron disappears into the shadows while your data becomes stale.

This problem is particularly vicious because it can go unnoticed for days, even weeks. How many times have you discovered that a backup hadn't been running for two weeks? Or that a daily import had stopped a week ago, leaving your customer data incomplete? These situations are far more common than we think.

The solution lies in "heartbeat" monitoring: instead of waiting for an error that will never come, you actively wait for a life signal. If that signal (ping) doesn't arrive on time, you're alerted. MoniTao implements this pattern to protect you against silent crons.

Symptoms of a Silent Cron

Several clues can reveal that a cron has silently stopped working:

  • Stale data: Data imports no longer update. Customers see yesterday's information, or worse. This symptom is often discovered by end users, causing a loss of trust.
  • Unsent emails: Daily reports, scheduled notifications, newsletters no longer go out. The silence of emails doesn't alert anyone because... it's silence.
  • Missing backups: The nightly backup no longer runs. You only discover it when you need to restore, when it's too late.
  • Abnormal accumulation: Temporary files accumulate, logs grow, databases aren't cleaned up. The symptom becomes visible when the disk is full.

Common Causes of Silent Crons

Understanding the causes helps prevent recurrence:

  • Crontab deleted or overwritten: A clumsy crontab -e, an automation script that resets the crontab, or a server migration that forgets the crons. The task no longer exists but nobody is notified.
  • Cron service disabled: After a restart, the cron service (named cron or crond, depending on the distribution) might not start automatically. All crons on the server silently stop.
  • Script moved or deleted: The path to the script has changed, the file was deleted during cleanup, or permissions were modified. Cron still fires on schedule, but the command fails immediately and the job does nothing.
  • Environment variables: Cron runs in a minimal environment. PATH, HOME, or custom variables may be missing, causing a silent failure.
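
Two of these causes, a disabled service and a missing or non-executable script, can be checked from the shell. The following is a minimal sketch, not MoniTao tooling: check_cron_script is a hypothetical helper name, and the unit name shown in the comments (cron on Debian-family systems, crond on RHEL-family systems) is an assumption about your distribution.

```shell
#!/bin/sh
# check_cron_script: hypothetical helper that reports whether a script
# referenced from a crontab still exists and is executable.
check_cron_script() {
    script="$1"
    if [ ! -e "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 1
    fi
    echo "ok: $script"
}

# Service checks on a systemd host (unit name depends on the distribution):
#   systemctl is-active cron     # is the daemon running now?
#   systemctl is-enabled cron    # will it start after a reboot?
```

If the crontab launches the script through an interpreter (for example /usr/bin/php followed by a file), the execute bit is not required and only the existence check is relevant.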

Diagnosing a Silent Cron

When facing a suspect cron, follow this diagnostic methodology:

  1. Check the crontab: Run crontab -l as the user that owns the job (or crontab -u <user> -l as root). Verify that the line exists, the syntax is correct, and the script path is absolute.
  2. Check the logs: Examine /var/log/cron (RHEL-family), /var/log/syslog (Debian-family), or the journal with journalctl -u cron or journalctl -u crond, depending on your system. Look for entries matching your task to confirm whether it starts.
  3. Run manually: Run the script manually with the same user as the cron. Compare results with a cron execution to isolate environment differences.
  4. Check outputs: Redirect stdout and stderr to a log file. Add echo/logging statements at the beginning and end of the script to trace its execution.
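
Step 3 is more revealing when the manual run uses an environment as bare as cron's. The sketch below is one way to approximate that, under stated assumptions: run_like_cron is a hypothetical helper name, and the variable set and PATH value are typical defaults rather than what every cron implementation provides.

```shell
#!/bin/sh
# run_like_cron: hypothetical helper that runs a command line with a
# stripped-down environment similar to the one cron provides.
# The exact variables and PATH value differ between cron implementations;
# the values below are an assumption, not a guarantee.
run_like_cron() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="$(id -un)" \
        PATH=/usr/bin:/bin \
        SHELL=/bin/sh \
        /bin/sh -c "$1"
}

# Example: run the import script as cron would, keeping all output
#   run_like_cron '/home/scripts/import-data.sh' > /tmp/import-test.log 2>&1
```

A script that works in your interactive shell but fails under run_like_cron most likely depends on a PATH entry or a variable that cron does not set.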

Detection Example with MoniTao

Here's how to integrate a MoniTao heartbeat to detect a silent cron:

#!/bin/bash
# Script: /home/scripts/import-data.sh

set -e  # Stop on error

echo "[$(date)] Starting import..."

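# Your business logic, tested directly in the if: with set -e, a separate
# $? check would never run because the script would already have exited
if /usr/bin/php /var/www/app/import.php; then
    # Ping MoniTao only if the import succeeded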
    curl -fsS --max-time 10 \
        "https://api.monitao.com/ping/your-token" \
        -d '{"status": "success"}' \
        -H "Content-Type: application/json"
    echo "[$(date)] Import completed successfully, ping sent"
else
    echo "[$(date)] Import failed, no ping"
    exit 1
fi

This script pings MoniTao only on success. If the script doesn't run at all (silent cron), or if it fails, no ping is sent. MoniTao alerts you after the configured grace period, informing you that something is wrong.

Automating Detection

MoniTao offers several approaches to automate silent cron detection:

  • Basic heartbeat: Add a curl at the end of each critical cron. Configure the expected interval in MoniTao (e.g., 24h for a daily cron). Alert if ping is missing.
  • Conditional ping: Use && to only ping if the previous command succeeds: ./script.sh && curl URL. This way, a silent failure (non-zero return code) won't trigger a ping.
  • Wrapper script: Create a wrapper that encapsulates any cron, handles errors, and pings MoniTao. Reusable for all your crons.
  • Grace period: Configure a margin after the expected interval. A daily cron at 2am that can take 30 minutes: configure a 25h timeout to avoid false alerts.
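
The wrapper approach could look like the sketch below. It is an illustration rather than an official MoniTao script: the function names run_with_heartbeat and send_ping are hypothetical, and the curl call simply reuses the form shown in the detection example above.

```shell
#!/bin/sh
# Hypothetical reusable wrapper for cron jobs.
# Usage: run_with_heartbeat <ping-url> <command> [args...]

# send_ping is kept separate so the transport can be swapped or stubbed.
send_ping() {
    curl -fsS --max-time 10 "$1" \
        -d '{"status": "success"}' \
        -H "Content-Type: application/json" > /dev/null
}

run_with_heartbeat() {
    url="$1"
    shift
    echo "[$(date)] Starting: $*"
    if "$@"; then
        echo "[$(date)] Succeeded: $*"
        # A failed ping is reported but does not turn a successful job into a failure
        send_ping "$url" || echo "[$(date)] Warning: ping could not be sent" >&2
        return 0
    else
        status=$?
        echo "[$(date)] Failed with exit code $status, no ping: $*" >&2
        return "$status"
    fi
}

# Example, after sourcing this file from a job script:
#   run_with_heartbeat "https://api.monitao.com/ping/your-token" /usr/bin/php /var/www/app/import.php
```

Because the wrapped command's exit code is returned unchanged, anything else that inspects the job's status keeps working as before.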

Anti-Silent Cron Best Practices

Adopt these practices to minimize silent cron risks:

  • One heartbeat per critical cron: Identify your business-critical crons (backups, imports, syncs) and create a dedicated heartbeat for each. Don't use a generic heartbeat for multiple crons.
  • Appropriate timeouts: Calculate timeout as: cron interval + max execution duration + 10% margin. For an hourly cron that takes 5 minutes, that is (60 + 5) × 1.1 ≈ 72 minutes.
  • Systematic logging: Always redirect stdout and stderr to log files. Even if MoniTao detects the absence, logs help with diagnosis.
  • Regular testing: Periodically test your alerts by temporarily disabling a cron. Verify that the alert arrives within the expected timeframe.
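
The timeout formula above can be turned into a small calculation. This is a sketch under one assumption: the 10% margin is applied to the sum of interval and duration, and the result is rounded up to a whole minute. heartbeat_timeout is a hypothetical helper name.

```shell
#!/bin/sh
# heartbeat_timeout: hypothetical helper.
# Arguments: cron interval and maximum run time, both in minutes.
# Prints (interval + duration) plus a 10% margin, rounded up to a whole minute.
heartbeat_timeout() {
    interval="$1"
    duration="$2"
    base=$((interval + duration))
    # Integer arithmetic: multiply by 110, then divide by 100 rounding up
    echo $(( (base * 110 + 99) / 100 ))
}

heartbeat_timeout 60 5      # hourly cron with a 5-minute run, prints 72
```

Rounding up rather than down errs on the side of fewer false alerts.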

Anti-Silent Cron Checklist

  • List all critical crons in the infrastructure
  • Create a MoniTao heartbeat per critical cron
  • Add conditional ping (&&) to each cron
  • Configure appropriate timeouts with margin
  • Redirect stdout/stderr to log files
  • Test alerts by disabling a test cron
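
For the first item of the checklist, a quick inventory of what is actually scheduled helps. The filter below is a minimal sketch with a hypothetical name, list_cron_entries; it only strips comments and blank lines from crontab text and makes no judgment about which jobs are critical.

```shell
#!/bin/sh
# list_cron_entries: hypothetical filter. Reads crontab text on stdin and
# prints only the lines that are neither blank nor comments.
# Environment assignments such as PATH=... are kept, since they affect jobs.
list_cron_entries() {
    grep -Ev '^[[:space:]]*(#|$)'
}

# Typical use for the current user:
#   crontab -l | list_cron_entries
# System-wide jobs usually live in /etc/crontab and /etc/cron.d/ as well.
```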

Frequently Asked Questions

How do I identify which crons are critical and deserve monitoring?

A cron is critical if its absence would have visible business impact: backups (data loss), imports (stale data), email sending (interrupted communication), synchronizations (system misalignment). Prioritize those whose failure would cost the most in time or money.

Does the curl ping add significant latency to my crons?

The ping typically takes 50-200ms depending on network latency. For a cron that runs for several minutes, this is negligible. For ultra-fast crons (< 1 second), you can use the --max-time option to limit wait time and prevent a network issue from blocking the cron.

My server doesn't have internet access. How can I monitor my crons?

Two options: 1) Configure an outbound proxy for HTTP calls to MoniTao. 2) Use an intermediate server with internet access that relays pings. The cron server pings the local relay, which pings MoniTao.

I have the same cron on 3 servers. How do I handle heartbeats?

It depends on your architecture: if all 3 servers should run the cron, create 3 distinct heartbeats (one per server). If only one should run (e.g., master/slave), create a single heartbeat and ensure only the master pings.

How do I differentiate a silent cron from a failing cron?

A silent cron doesn't run at all (no log, no trace). A failing cron runs but returns an error. With MoniTao, use conditional ping (&&): no ping = either silent or failed. Your internal logs help differentiate.

Can I be alerted on Slack or Teams instead of email?

Yes, MoniTao supports multiple alert channels: email, Slack, Discord, custom webhooks, and more. Configure your preferences in your account notification settings. You can even combine multiple channels for the most critical crons.

Don't Let Crons Disappear

Silent crons are ticking time bombs. They stop working without a sound, accumulating invisible technical debt until the day the consequences become obvious: lost data, frustrated customers, production incidents. Prevention is far cheaper than repair.

Heartbeat monitoring reverses the paradigm: instead of waiting for errors that never come, you wait for life signals. Their absence triggers the alert. With MoniTao, set up this protection in minutes for each of your critical crons, and sleep peacefully.
