How to Monitor Cron Jobs: The Simple Guide for Developers (2026)
Cron jobs are the silent workhorses behind every serious application. Database backups, invoice processing, report generation, cache warming—they all run on schedules, in the background, with no one watching. Until they stop. This guide covers every method for monitoring cron jobs, from basic log files to dead man’s switches, with production-ready code examples you can copy and deploy in minutes.
Why Cron Jobs Fail Silently: The Silent Killer of SaaS Apps
Here is a scenario every developer has lived through at least once. A customer emails support: “I haven’t received my weekly report in three weeks.” You check the cron job. It stopped running 22 days ago. Nobody noticed because cron jobs don’t announce their failures.
Unlike a web server that returns a 500 error or an API that throws an exception caught by your error tracker, a cron job that fails to start produces zero output. No log entry. No error. No alert. The absence of something happening is, by definition, invisible.
This is why cron monitoring is fundamentally different from uptime monitoring. Tools like Pingdom and UptimeRobot check whether a server responds to requests. But your server can have 100% uptime while a critical cron job has been dead for a month. The server is fine. The crontab entry got wiped during a deploy. The process silently crashed. The disk filled up and the script couldn’t write its output. Pick any of a dozen failure modes—none of them trigger a server-level alert.
The financial impact is real. A missed backup cron means your disaster recovery is a lie. A missed billing cron means revenue leakage. A missed cleanup cron means your disk fills up and then everything fails at once. Silent cron failures are compound interest working against you.
Common Cron Job Failure Modes
Before you can monitor effectively, you need to understand how cron jobs fail. Each failure mode requires a different detection strategy.
Server crash or restart
The most obvious failure. The machine reboots, and if the cron daemon doesn’t restart automatically (or the crontab entries were stored in a user-specific crontab that doesn’t survive provisioning), your jobs stop running. Cloud instances are especially vulnerable—spot instances get reclaimed, auto-scaling groups rotate machines, and containers get rescheduled.
Timezone drift
A surprisingly common culprit. Your cron runs at 0 2 * * * in UTC, but someone changed the server timezone to Pacific during maintenance. Now your “2 AM” backup runs at 10 AM UTC, colliding with peak traffic. Or worse: daylight saving time shifts cause jobs to run twice or skip entirely. Containers that inherit the host’s timezone add another layer of confusion.
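If your cron implementation supports it, you can pin the schedule’s timezone in the crontab itself instead of trusting the host clock. A sketch, assuming cronie (the default cron on most modern Linux distributions), which honors a CRON_TZ variable; check man 5 crontab on your system before relying on it:

```
# Interpret this schedule in UTC no matter what the host timezone is set to.
# CRON_TZ is supported by cronie; verify with `man 5 crontab` on your system.
CRON_TZ=UTC
0 2 * * * /home/deploy/scripts/backup-db.sh
```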
Resource exhaustion
Your cron job worked fine for months, then the database grew from 2 GB to 20 GB. The pg_dump that used to take 30 seconds now takes 15 minutes and gets killed by the OOM killer because the server only has 1 GB of RAM. Or the disk fills up mid-write. Or the job hits a file descriptor limit. The cron daemon faithfully started the job—but the job itself never completed.
Dependency failures
Your ETL cron pulls data from a third-party API. That API changed its authentication scheme last Tuesday. Your job now fails with a 401 on every run, but since you only log to a file that nobody reads, the failure accumulates silently. External dependencies—APIs, S3 buckets, SMTP servers, DNS resolution—are the most fragile part of any scheduled task.
Deployment side effects
A new deploy overwrites the crontab. A Docker image rebuild changes the base OS and breaks a system dependency. A config file gets moved. An environment variable gets renamed. These are the failures that happen at 3 PM on a Friday and don’t get noticed until Monday morning.
The common thread: In every failure mode, the cron job produces no signal that anything went wrong. The only reliable detection method is to check for the absence of a success signal—which is exactly what dead man’s switch monitoring does.
Method 1: Log Files (grep + Manual Review)
The most basic approach. Redirect your cron output to a log file and periodically check it.
# Redirect stdout and stderr to a log file
0 2 * * * /home/deploy/scripts/backup-db.sh >> /var/log/cron/backup.log 2>&1
Then you can manually check:
# Did the backup run last night?
grep "$(date -d yesterday +%Y-%m-%d)" /var/log/cron/backup.log
# Check for errors in the last 24 hours
grep -i "error\|fail\|fatal" /var/log/cron/backup.log | tail -20
Why this doesn’t scale:
- Requires someone to remember to check the logs. Humans are bad at remembering.
- Log files grow unbounded unless you set up rotation (another cron job to monitor).
- Grepping for “error” produces false positives. Grepping for the absence of “success” is harder.
- If the cron job never starts, there is no log entry at all. You’re looking for the absence of a line in a file—a nearly impossible thing to notice manually.
- Works for one server. Falls apart when you have 5 servers each running 10 cron jobs.
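The “absence of a line” problem can at least be scripted. A minimal sketch, where the log path and the “Backup complete” success marker are assumptions (your backup script would have to write such a line), that a second cron job could run each morning:

```python
from datetime import date
from pathlib import Path

LOG_FILE = Path("/var/log/cron/backup.log")   # assumed path from the example above
SUCCESS_MARKER = "Backup complete"            # hypothetical success line your script writes

def ran_today(log_file: Path = LOG_FILE) -> bool:
    """Return True if a success line stamped with today's date exists in the log."""
    today = date.today().isoformat()  # e.g. "2026-03-27"
    try:
        text = log_file.read_text()
    except FileNotFoundError:
        return False  # no log file at all: the job certainly did not run
    return any(today in line and SUCCESS_MARKER in line for line in text.splitlines())
```

Of course, this just shifts the problem: now the checker script is one more cron job that can silently die.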
Verdict: Fine for a hobby project. Unacceptable for anything that earns revenue.
Method 2: Custom Monitoring Scripts (Email on Failure)
A step up from log files. Wrap your cron job in a script that sends an email if the job fails.
#!/bin/bash
# monitored-backup.sh
set -o pipefail  # make the pipeline's exit status reflect backup-db.sh, not tee

/home/deploy/scripts/backup-db.sh 2>&1 | tee /var/log/cron/backup.log
if [ $? -ne 0 ]; then
  echo "Backup failed at $(date)" | mail -s "ALERT: Backup cron failed" ops@yourcompany.com
fi
Or the more robust version using trap:
#!/bin/bash
# monitored-job.sh
ALERT_EMAIL="ops@yourcompany.com"
JOB_NAME="nightly-backup"

on_failure() {
  echo "Job $JOB_NAME failed at $(date) on $(hostname)" \
    | mail -s "CRON ALERT: $JOB_NAME failed" "$ALERT_EMAIL"
}

trap on_failure ERR
set -e

# Your actual job here
/home/deploy/scripts/backup-db.sh
Why this is fragile:
- Email delivery is not guaranteed. Your server’s outbound SMTP might be blocked. Gmail might flag it as spam. The mail command might not even be installed.
- It only catches failures, not non-starts. If the cron daemon never invokes the script, no email is sent. This is the most dangerous failure mode, and this method misses it entirely.
- Maintaining wrapper scripts for every cron job is tedious. You end up with 15 nearly identical scripts that drift out of sync.
- Testing is painful. How do you verify your monitoring works without actually breaking the cron job?
Verdict: Better than log files, but still misses the most critical failure mode: jobs that never start.
Method 3: Dead Man’s Switch Pattern (The Right Way)
A dead man’s switch flips the model. Instead of trying to detect failure (which is hard), you detect the absence of success (which is easy).
The pattern works like this:
- You tell the monitoring service: “Expect a ping from this job every hour.”
- At the end of your cron job, you send an HTTP request (a “ping”) to a unique URL.
- If the monitoring service doesn’t receive the ping within the expected window, it alerts you.
This catches every failure mode:
- Server crashed? No ping sent. Alert.
- Crontab wiped during deploy? No ping sent. Alert.
- Job started but failed halfway? No ping sent (because it’s at the end). Alert.
- Timezone drift caused the job to skip? No ping in the expected window. Alert.
- OOM killer terminated the process? No ping sent. Alert.
The dead man’s switch doesn’t care why the job failed. It only cares that the expected success signal didn’t arrive. This makes it inherently robust against failure modes you haven’t thought of yet.
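The server-side decision behind this pattern reduces to a single comparison. A sketch of the check logic, not any particular service’s implementation:

```python
from datetime import datetime, timedelta, timezone

def is_overdue(last_ping: datetime, interval: timedelta,
               grace: timedelta, now: datetime) -> bool:
    """A monitor is overdue once the expected interval plus the grace
    period has elapsed since the last ping with no new ping arriving."""
    return now > last_ping + interval + grace

# Example: an hourly job with a 5-minute grace period, 70 minutes after its last ping
last = datetime(2026, 3, 27, 2, 0, tzinfo=timezone.utc)
now = datetime(2026, 3, 27, 3, 10, tzinfo=timezone.utc)
print(is_overdue(last, timedelta(hours=1), timedelta(minutes=5), now))  # True
```

Everything else a monitoring service does (scheduling these checks, deduplicating alerts, retrying notifications) is built around this one comparison.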
Why “dead man’s switch”? The term comes from railway engineering. A train operator must continuously hold a lever. If they become incapacitated and release it, the train stops automatically. The system detects the absence of an active signal, not the presence of a failure signal.
How CronPeek Implements Dead Man’s Switch Monitoring
CronPeek is a dead man’s switch API built specifically for cron job monitoring. It does one thing and does it well: accept pings, track intervals, and fire alerts when pings stop arriving.
Under the hood, the system works like this:
- You create a monitor with an expected interval (e.g., every 5 minutes, every hour, every day).
- You receive a unique ping URL for that monitor.
- Your cron job pings that URL at the end of each successful run.
- CronPeek tracks the timestamps of incoming pings against the expected schedule.
- If a ping is late, CronPeek waits for a configurable grace period, then fires alerts to your configured channels: email, webhook, or both.
There is no agent to install. No SDK to integrate. No daemon running on your server. It is a single outbound HTTP request—the lightest possible footprint. If your job can run curl, it can use CronPeek.
Step-by-Step Setup with CronPeek
Create a monitor. Through the CronPeek API, create a new monitor specifying the expected interval and your alert preferences (email address, webhook URL, or both).
Get your ping URL. Each monitor gets a unique endpoint: https://cronpeek.web.app/api/v1/ping/YOUR_MONITOR_ID
Add the ping to your cron job. Append a single curl call to the end of your existing crontab entry. That is it. No config files, no environment variables, no library imports.
Code Examples: Every Language, Every Runtime
Bash: Crontab entry with ping
The most common setup. Add the ping after your existing command using && so it only fires on success:
# Before: unmonitored
0 2 * * * /home/deploy/scripts/backup-db.sh
# After: monitored with CronPeek
0 2 * * * /home/deploy/scripts/backup-db.sh && curl -fsS --retry 3 --max-time 10 https://cronpeek.web.app/api/v1/ping/YOUR_MONITOR_ID
The flags matter:
- -f — Fail silently on HTTP errors (don’t pollute your cron email with HTML error pages)
- -sS — Silent but show errors (no progress bar, but you see if curl itself fails)
- --retry 3 — Retry up to 3 times if the request fails (network blips happen)
- --max-time 10 — Time out after 10 seconds (don’t let a slow response hang your job)
Bash: Wrapper script for complex jobs
For multi-step jobs where you want the ping at the very end:
#!/bin/bash
# /home/deploy/scripts/etl-pipeline.sh
set -euo pipefail
MONITOR_ID="abc123def456"
CRONPEEK_URL="https://cronpeek.web.app/api/v1/ping/${MONITOR_ID}"
echo "[$(date)] Starting ETL pipeline..."
# Step 1: Extract
python3 /opt/etl/extract.py --source=production
# Step 2: Transform
python3 /opt/etl/transform.py --validate
# Step 3: Load
python3 /opt/etl/load.py --target=warehouse
echo "[$(date)] ETL pipeline complete."
# Signal success to CronPeek
curl -fsS --retry 3 --max-time 10 "$CRONPEEK_URL"
Because of set -euo pipefail, the script exits immediately on any error. The curl at the bottom only executes if every step succeeded.
Python
import requests
import sys

CRONPEEK_URL = "https://cronpeek.web.app/api/v1/ping/YOUR_MONITOR_ID"

def ping_cronpeek():
    """Signal successful job completion to CronPeek."""
    try:
        requests.get(CRONPEEK_URL, timeout=10)
    except requests.RequestException:
        pass  # Never let monitoring break your actual job

def run_report_generation():
    """Your actual job logic."""
    # ... generate reports, process data, etc.
    print("Report generated successfully.")

if __name__ == "__main__":
    try:
        run_report_generation()
        ping_cronpeek()
    except Exception as e:
        print(f"Job failed: {e}", file=sys.stderr)
        sys.exit(1)  # Exit non-zero, no ping sent
Key detail: the ping_cronpeek() function swallows exceptions. Your monitoring system should never cause your job to fail. If CronPeek is momentarily unreachable, the job still completes normally—and CronPeek’s grace period handles the missed ping gracefully.
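If you want the same resilience as curl’s --retry without pulling in requests, a standard-library variant is straightforward. A sketch; the backoff schedule is an arbitrary choice:

```python
import time
import urllib.request
import urllib.error

def ping_with_retry(url: str, attempts: int = 3, timeout: int = 10) -> bool:
    """Best-effort success ping, mirroring curl --retry 3 --max-time 10.
    Never raises: monitoring must not break the job it monitors."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return True
        except (urllib.error.URLError, OSError, ValueError):
            if attempt < attempts - 1:
                time.sleep(2 ** attempt)  # back off: 1s, then 2s
    return False
```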
Node.js
const https = require('https');

const MONITOR_ID = 'YOUR_MONITOR_ID';
const CRONPEEK_URL = `https://cronpeek.web.app/api/v1/ping/${MONITOR_ID}`;

function pingCronPeek() {
  return new Promise((resolve) => {
    const req = https.get(CRONPEEK_URL, (res) => {
      res.resume(); // Consume response to free memory
      resolve();
    });
    req.on('error', () => resolve()); // Swallow errors
    req.setTimeout(10000, () => {
      req.destroy();
      resolve();
    });
  });
}

async function main() {
  // Your actual job
  await processInvoices();
  await sendDigestEmails();
  await cleanupTempFiles();
  // Signal success
  await pingCronPeek();
}

main()
  .then(() => process.exit(0))
  .catch((err) => {
    console.error('Job failed:', err.message);
    process.exit(1); // No ping sent on failure
  });
Docker: Monitored container job
If you run cron jobs as Docker containers (common in Kubernetes CronJobs), add the ping to your entrypoint:
FROM python:3.12-slim
COPY etl_job.py /app/etl_job.py
# Install curl for the ping
RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*
# Entrypoint: run job, then ping on success
CMD ["sh", "-c", "python /app/etl_job.py && curl -fsS --retry 3 --max-time 10 https://cronpeek.web.app/api/v1/ping/YOUR_MONITOR_ID"]
For Kubernetes CronJobs specifically:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-etl
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: etl
              image: your-registry/etl-job:latest
              command:
                - sh
                - -c
                - |
                  python /app/etl_job.py && \
                  curl -fsS --retry 3 --max-time 10 \
                    https://cronpeek.web.app/api/v1/ping/YOUR_MONITOR_ID
          restartPolicy: OnFailure
Alerting Options
When a ping misses its window, CronPeek can notify you through multiple channels:
Email alerts
The default. You specify an email address when creating the monitor, and CronPeek sends a plain, actionable alert when a job goes overdue. No marketing fluff, no HTML templates—just the monitor name, the expected time, and how late it is.
Webhook alerts
For programmatic integration. CronPeek sends a POST request to your webhook URL with a JSON payload containing the monitor ID, name, status, and timestamp. This is the building block for everything else.
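A webhook consumer can be as small as a single handler. A minimal standard-library sketch of a receiver; the payload field names (name, status, timestamp) follow the description above but are assumptions about the exact JSON shape:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class AlertHandler(BaseHTTPRequestHandler):
    """Receives webhook POSTs and logs the overdue monitor."""
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        alert = json.loads(self.rfile.read(length) or b"{}")
        # Hypothetical field names, based on the payload description above
        print(f"Overdue: {alert.get('name')} at {alert.get('timestamp')}")
        self.send_response(200)
        self.end_headers()

def serve(port: int = 8080) -> HTTPServer:
    """Bind the receiver; call .serve_forever() on the result to run it."""
    return HTTPServer(("0.0.0.0", port), AlertHandler)
```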
Slack via webhook
Slack’s Incoming Webhooks accept JSON payloads. Point your CronPeek webhook at a Slack webhook URL and alerts land directly in your ops channel. No Slack app installation needed—just a webhook URL from Slack’s API settings.
# Example: CronPeek webhook → Slack Incoming Webhook
# Set your CronPeek monitor's webhook URL to:
# https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
#
# CronPeek sends:
{
"text": "ALERT: Monitor 'nightly-backup' is overdue. Last ping: 2026-03-27T02:00:12Z. Expected interval: 24h."
}
The same pattern works for Discord (Discord webhooks accept the same {"text": "..."} format), PagerDuty (via their Events API v2 webhook), or any custom notification service you build.
Pro tip: Use a webhook to trigger a serverless function (AWS Lambda, Google Cloud Function) that routes alerts based on severity. Critical cron jobs page on-call. Non-critical ones go to a Slack channel. One webhook, unlimited routing logic.
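The routing function inside such a serverless handler can be a few lines. A sketch; the name field and the severity-by-prefix convention are illustrative assumptions:

```python
import json

# Hypothetical convention: monitor names starting with these prefixes page on-call
CRITICAL_PREFIXES = ("prod-db-", "prod-billing-")

def route_alert(event_body: str) -> str:
    """Decide where an incoming webhook alert should go, based on monitor name."""
    alert = json.loads(event_body)
    name = alert.get("name", "")
    if name.startswith(CRITICAL_PREFIXES):
        return "pagerduty"  # critical jobs page on-call
    return "slack"          # everything else lands in the ops channel
```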
When to Upgrade: Free vs. Paid
CronPeek’s free tier covers 5 monitors—enough for a solo developer running a side project with a handful of cron jobs. No credit card, no trial expiration.
As your infrastructure grows, here is the natural progression:
- 1–5 cron jobs: Free tier. Zero cost. Full alerting capability.
- 6–50 cron jobs: Starter plan at $9/mo. This covers most small-to-medium teams—multiple services, each with several scheduled tasks.
- 50+ cron jobs: Business plan at $29/mo for unlimited monitors. No per-monitor pricing, no surprises on your bill.
For comparison, 50 monitors on Cronitor runs approximately $100/mo. On Dead Man’s Snitch, it is $199/mo. CronPeek’s flat-rate pricing means you never have to decide which cron jobs are “important enough” to monitor. Monitor all of them.
Putting It All Together
Here is the recommended setup for a typical production environment:
- Audit your crontab. Run crontab -l on every server. List every scheduled task across your infrastructure, including Kubernetes CronJobs and scheduled CI/CD pipelines.
- Create a CronPeek monitor for each job. Name them descriptively: prod-db-backup-daily, prod-invoice-generation-hourly, staging-cache-warm-5min.
- Add the ping to each job. Use the && pattern for simple crontab entries. Use wrapper scripts for multi-step jobs.
- Set up a webhook. Route alerts to Slack for visibility. Add email as a backup channel so alerts survive a Slack outage.
- Test each monitor. Temporarily set the interval to 1 minute, run the job, then wait for the alert. Confirm it arrives. Then set the real interval.
- Document it. Add a note in your runbook: “All cron jobs are monitored via CronPeek. Dashboard at cronpeek.web.app.”
Total setup time for 10 cron jobs: about 15 minutes. Total ongoing cost: $0 to $9/mo depending on count. The alternative—finding out your backup cron has been dead for a month when you actually need the backup—is infinitely more expensive.
Start monitoring your cron jobs in 2 minutes
Free tier includes 5 monitors with full alerting. No credit card required. Add a single curl call to your crontab and you are done.
Get started free →