Alert Escalation: From First Email to Emergency Call
Configure progressive alert levels to guarantee response to critical incidents.
An unread email alert can cost thousands of dollars if it signals a critical incident on your e-commerce site during a sales peak. Alert escalation solves this problem by progressively increasing notification urgency until someone responds.
The principle is simple: if no one reacts to the initial alert within a defined delay, the system moves to the next level. An ignored email becomes an SMS. An unanswered SMS triggers a phone call. The incident always ends up reaching someone who can act.
This guide shows you how to configure an effective escalation strategy with MoniTao. From simple email → SMS escalation to multi-person escalation chains, you'll have all the tools to ensure no critical incident goes unnoticed.
Understanding Alert Escalation
Escalation is a progressive notification mechanism that works in several stages:
- Initial alert: When an incident is detected, a first notification is sent via the default channel (usually email). A timer starts.
- Waiting period: The system waits for a configurable delay (e.g., 5 minutes) to see whether the alert is acknowledged or the service comes back online.
- Level 2 escalation: If no action is taken, the alert moves to the next level (e.g., SMS) and the timer restarts.
- Final escalation: The process continues to the last configured level or until resolution. Critical levels can include phone calls.
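The stages above can be sketched in a few lines of Python. This is only an illustration of the principle, assuming each level has its own wait before the next one fires; the `Level` class and `notifications_sent` function are hypothetical names, not part of MoniTao.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Level:
    channel: str
    wait_seconds: int  # time to wait at this level before escalating further


def notifications_sent(levels: List[Level], stopped_at: Optional[int]) -> List[Tuple[int, str]]:
    """Return (time_sent, channel) pairs for an incident detected at t=0.

    stopped_at is the second at which the alert is acknowledged or the
    service recovers; None means nobody ever reacts.
    """
    sent = []
    t = 0
    for level in levels:
        # Acknowledgment or recovery stops any further escalation.
        if stopped_at is not None and stopped_at <= t:
            break
        sent.append((t, level.channel))
        t += level.wait_seconds  # the timer restarts at each level
    return sent


levels = [Level("email", 300), Level("sms", 300), Level("call", 600)]
print(notifications_sent(levels, stopped_at=None))
print(notifications_sent(levels, stopped_at=120))
```

With no reaction at all, every level fires in turn; with an acknowledgment two minutes in, only the email is sent.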
Typical Escalation Levels
Here's a standard escalation configuration adaptable to most teams:
- Level 1 - Email (immediate): First notification sent to the ops team. Non-intrusive, ideal for minor incidents or during business hours when the team monitors email.
- Level 2 - Slack/Teams (after 5 min): If no one responds to the email, a notification is posted in the alerts channel. Collective visibility increases the chances that someone reacts.
- Level 3 - SMS (after 10 min): A direct alert to the on-call engineer's phone. Hard to ignore, even away from the computer.
- Level 4 - Call (after 20 min): An automatic phone call for critical unresolved incidents. Reserved for situations where the service is completely unavailable.
Configure Escalation in MoniTao
MoniTao lets you define flexible escalation policies:
- Define channels: First, configure all your alert channels (email, SMS, Slack webhook, etc.) in Settings > Channels.
- Create an escalation policy: In Settings > Escalation, create a new policy and define its levels and their delays.
- Assign to monitors: Apply the policy to the relevant monitors. You can have different policies for different criticality levels.
- Test the flow: Use test mode to simulate a complete escalation and verify that each level works correctly.
Escalation Configuration Example
Here's a complete escalation policy example:
# "Critical Production" escalation policy
escalation_policy:
  name: "Critical Production"
  levels:
    - level: 1
      delay: 0 # Immediate
      channels: ["email-ops"]
    - level: 2
      delay: 300 # 5 minutes
      channels: ["slack-alerts"]
    - level: 3
      delay: 600 # 10 minutes (total)
      channels: ["sms-oncall"]
    - level: 4
      delay: 1200 # 20 minutes (total)
      channels: ["phone-oncall", "email-manager"]
  # Auto de-escalate if resolved
  auto_resolve: true
  resolve_notify: ["slack-alerts"]
This configuration sends an email first, then escalates to Slack after 5 minutes, to SMS after 10 minutes, and finally to a phone call plus a manager email after 20 minutes. Delays are in seconds, counted from the moment the incident is detected. If the problem resolves, a recovery notification is sent to Slack.
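Before applying a policy like this, it can help to sanity-check it offline. The sketch below mirrors the levels above as a plain Python dictionary and assumes, as the comments in the example suggest, that each `delay` is a total counted from detection. The helpers `validate_levels` and `timeline` are hypothetical and not part of MoniTao.

```python
POLICY = {
    "name": "Critical Production",
    "levels": [
        {"level": 1, "delay": 0, "channels": ["email-ops"]},
        {"level": 2, "delay": 300, "channels": ["slack-alerts"]},
        {"level": 3, "delay": 600, "channels": ["sms-oncall"]},
        {"level": 4, "delay": 1200, "channels": ["phone-oncall", "email-manager"]},
    ],
}


def validate_levels(levels):
    """Return a list of problems; an empty list means the levels look consistent."""
    problems = []
    previous_delay = -1
    for expected_number, level in enumerate(levels, start=1):
        if level["level"] != expected_number:
            problems.append(f"expected level {expected_number}, found {level['level']}")
        # Total delays must grow, otherwise a later level would fire first.
        if level["delay"] <= previous_delay:
            problems.append(f"level {level['level']}: delay must exceed the previous level's delay")
        if not level["channels"]:
            problems.append(f"level {level['level']}: no channels configured")
        previous_delay = level["delay"]
    return problems


def timeline(levels):
    """Describe when each level fires, in minutes after detection."""
    return [f"T+{lvl['delay'] // 60} min: {', '.join(lvl['channels'])}" for lvl in levels]


print(validate_levels(POLICY["levels"]))
for line in timeline(POLICY["levels"]):
    print(line)
```

Printing the timeline next to the policy makes it easy to confirm with the team that the delays match what everyone expects.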
Common Escalation Scenarios
Adapt your escalation based on context:
- Business hours: Fast escalation, since the team is available: Email → Slack (2 min) → SMS (5 min). High reactivity is expected.
- Nights and weekends: More direct escalation: immediate email + SMS to the on-call engineer. Skip the Slack step, since no one is monitoring it.
- Critical services: Aggressive escalation with short delays and a phone call as the last resort. Every minute of downtime has business impact.
- Secondary services: Soft escalation: email only during business hours, SMS only if the service has been down for more than 1 hour. There is no need to wake someone for a non-critical service.
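To make the business-hours versus off-hours split concrete, here is a small Python sketch that picks the escalation steps from the current time. The business-hours definition (Monday to Friday, 09:00 to 18:00) and the names `SCENARIOS`, `is_business_hours`, and `steps_for` are assumptions for illustration only.

```python
from datetime import datetime

# Steps as (channel, delay in seconds from detection), taken from the scenarios above.
SCENARIOS = {
    "business-hours": [("email", 0), ("slack", 120), ("sms", 300)],
    "off-hours": [("email", 0), ("sms", 0)],
}


def is_business_hours(now: datetime) -> bool:
    # Assumption: Monday-Friday, 09:00-18:00 in the team's local time.
    return now.weekday() < 5 and 9 <= now.hour < 18


def steps_for(now: datetime):
    """Return the escalation steps that apply at the given moment."""
    key = "business-hours" if is_business_hours(now) else "off-hours"
    return SCENARIOS[key]


print(steps_for(datetime(2024, 1, 8, 10, 0)))  # a Monday morning
print(steps_for(datetime(2024, 1, 6, 10, 0)))  # a Saturday morning
```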
Escalation Best Practices
For effective escalation without alert fatigue:
- Realistic delays: Give people enough time to react at each level. Two minutes to read an email is usually too short; 5-10 minutes is more reasonable.
- Avoid redundancy: If someone has already acknowledged the alert, subsequent levels shouldn't trigger. Acknowledgment stops escalation.
- On-call rotation: Integrate your on-call schedule with escalation. The SMS goes to the current on-call person, not always the same one.
- Feedback loop: Regularly analyze your escalations. If you often reach level 4, either the delays are too short or the team isn't reacting fast enough.
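For the feedback loop, one simple measure is the share of incidents that reached each level. The sketch below assumes you can export, for each past incident, the highest level it reached; the function name and the sample data are made up for illustration.

```python
from collections import Counter


def level_reach_rates(max_levels_reached, level_count=4):
    """Fraction of incidents that reached at least each level."""
    total = len(max_levels_reached)
    if total == 0:
        return {}
    counts = Counter(max_levels_reached)
    rates = {}
    reached = total  # every incident reaches level 1
    for level in range(1, level_count + 1):
        rates[level] = reached / total
        reached -= counts[level]  # incidents that stopped at this level
    return rates


# Hypothetical history: highest level reached by each of 8 incidents.
history = [1, 1, 2, 4, 1, 3, 1, 4]
print(level_reach_rates(history))
```

If the rate for the last level stays high month after month, that is the signal to revisit the delays or the on-call process.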
Escalation Configuration Checklist
- Configure all necessary alert channels
- Define realistic delays between each level
- Create different policies by criticality
- Configure on-call rotation
- Test complete escalation flow
- Document who's notified at each level
Frequently Asked Questions
Does escalation continue if the problem resolves?
No. MoniTao automatically detects resolution and stops escalation. A recovery notification is sent according to your configuration.
Can I acknowledge an alert to stop escalation?
Yes. Acknowledging the alert from the dashboard, from the email, or via the API immediately stops escalation. The alert remains visible, but no additional notifications are sent.
How do I handle multiple on-call people?
Configure a contact group for each level. MoniTao can notify all group members or contact them in rotation until someone responds.
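If you maintain the on-call schedule yourself, a weekly rotation is easy to compute. This is a minimal sketch assuming a fixed roster that rotates every seven days from a known start date; `on_call_for` and the names in the roster are hypothetical.

```python
from datetime import date


def on_call_for(day: date, roster: list, rotation_start: date) -> str:
    """Return who is on call on the given day for a weekly rotation."""
    weeks_elapsed = (day - rotation_start).days // 7
    return roster[weeks_elapsed % len(roster)]


roster = ["alice", "bob", "carol"]
start = date(2024, 1, 1)  # first day of alice's week
print(on_call_for(date(2024, 1, 10), roster, start))
```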
Can I have different day/night escalations?
Yes, create separate policies and use timing rules to automatically apply the right policy based on time and day.
What happens if no one answers at the last level?
The system keeps retrying the last level at regular intervals until acknowledgment or resolution. You can also configure a backup notification (e.g., to a manager).
Does escalation work during maintenance?
No. If a monitor is in a maintenance window, no alerts or escalations are triggered. That is precisely the purpose of maintenance windows.
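The rule itself is simple to express. The sketch below assumes maintenance windows are stored as (start, end) pairs with the end excluded; `in_maintenance` and `should_alert` are illustrative names, not MoniTao functions.

```python
from datetime import datetime


def in_maintenance(now: datetime, windows) -> bool:
    """True if now falls inside any (start, end) maintenance window."""
    return any(start <= now < end for start, end in windows)


def should_alert(monitor_down: bool, now: datetime, windows) -> bool:
    """Alert only when the monitor is down outside every maintenance window."""
    return monitor_down and not in_maintenance(now, windows)


windows = [(datetime(2024, 1, 6, 2, 0), datetime(2024, 1, 6, 4, 0))]
print(should_alert(True, datetime(2024, 1, 6, 3, 0), windows))
print(should_alert(True, datetime(2024, 1, 6, 5, 0), windows))
```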
Conclusion
Alert escalation is your safety net. It ensures that critical incidents always end up reaching someone who can act, even if the first recipients aren't available.
MoniTao offers flexible and powerful escalation: multiple levels, configurable delays, integration with on-call rotation. Configure your first escalation policy and sleep soundly.