Slack Is Not an Alerting System
Slack is where your team communicates. It's also where notifications go to die at 3 AM. No matter how many channels you create or how aggressively you configure notifications, Slack alone cannot reliably wake an engineer for a critical incident.
This isn't Slack's fault. No single notification channel can serve every alerting scenario: a disk usage warning at 2 PM needs a different response than a production database crash at 2 AM.
This is where multi-channel escalation policies come in — and most teams get them wrong.
The Escalation Ladder
A well-designed escalation policy moves through notification channels of increasing urgency, giving the on-call engineer a reasonable window to respond at each step before escalating.
Here's a practical escalation ladder:
| Step | Channel | Wait Time | Use Case |
|---|---|---|---|
| 1 | Slack channel | 0 min | Team visibility, low-severity alerts stop here |
| 2 | Slack DM | 2 min | Direct notification to on-call engineer |
| 3 | SMS | 5 min | Breaks through Do Not Disturb |
| 4 | Phone call | 10 min | Most likely to wake someone up |
| 5 | WhatsApp | 12 min | Alternative channel if phone is missed |
| 6 | Secondary on-call | 15 min | Escalate to backup engineer |
| 7 | Engineering manager | 20 min | Management escalation |
The key insight: not every alert needs to traverse the full ladder. Severity determines where an alert enters the ladder and how far it goes.
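As a rough illustration, the ladder can be modeled as plain data plus one lookup function. This is a minimal sketch, not any product's API: the `Step` class, the channel identifiers, and `channels_fired` are hypothetical names. It also assumes that when an alert enters partway up the ladder, wait times are measured from the entry step rather than from step 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    order: int
    channel: str
    wait_min: int  # minutes after the alert fires, as in the table


# Mirrors the table; step 5 is described there as an alternative channel.
LADDER = [
    Step(1, "slack_channel", 0),
    Step(2, "slack_dm", 2),
    Step(3, "sms", 5),
    Step(4, "phone_call", 10),
    Step(5, "alternative_channel", 12),
    Step(6, "secondary_on_call", 15),
    Step(7, "engineering_manager", 20),
]


def channels_fired(elapsed_min: int, first_step: int = 1, last_step: int = 7) -> List[str]:
    """Channels notified so far for an unacknowledged alert.

    first_step and last_step express where the alert enters the ladder
    and how far it is allowed to climb.
    """
    selected = [s for s in LADDER if first_step <= s.order <= last_step]
    if not selected:
        return []
    # Assumption: timers restart at the entry step.
    offset = selected[0].wait_min
    return [s.channel for s in selected if s.wait_min - offset <= elapsed_min]
```

A low-severity alert would call `channels_fired(elapsed, last_step=1)` and never leave the Slack channel, while a more severe one might enter at `first_step=3`.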
Configuring by Severity
Different severity levels should start at different points in the escalation ladder:
Critical (production outage, data loss risk)
- Start at: SMS + Phone call simultaneously
- Escalation: Secondary on-call after 5 minutes, manager after 10
- Quiet hours: Always breaks through
High (degraded service, elevated error rates)
- Start at: Slack DM + SMS
- Escalation: Phone call after 5 minutes, secondary on-call after 10
- Quiet hours: Breaks through with SMS only (no phone call)
Medium (non-critical service issues, performance degradation)
- Start at: Slack DM
- Escalation: SMS after 10 minutes
- Quiet hours: Held until business hours
Low (informational, trending metrics, maintenance reminders)
- Start at: Slack channel only
- Escalation: None
- Quiet hours: Always held until business hours
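The four severity profiles above translate fairly directly into a lookup table. The sketch below is one possible encoding, with hypothetical names throughout (`Severity`, `Policy`, `POLICIES`, `notifications_due`). It reads "SMS only (no phone call)" for High as "suppress the phone call during quiet hours", which is an interpretation rather than something the text spells out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Severity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass(frozen=True)
class Policy:
    start: Tuple[str, ...]                    # channels notified immediately
    escalations: Tuple[Tuple[int, str], ...]  # (minutes after alert, target)
    quiet_hours: str                          # "break_through", "no_phone", or "hold"


POLICIES = {
    Severity.CRITICAL: Policy(("sms", "phone_call"),
                              ((5, "secondary_on_call"), (10, "manager")),
                              "break_through"),
    Severity.HIGH: Policy(("slack_dm", "sms"),
                          ((5, "phone_call"), (10, "secondary_on_call")),
                          "no_phone"),
    Severity.MEDIUM: Policy(("slack_dm",), ((10, "sms"),), "hold"),
    Severity.LOW: Policy(("slack_channel",), (), "hold"),
}


def notifications_due(severity: Severity, elapsed_min: int, quiet: bool = False) -> List[str]:
    """Targets that should have been notified for an unacknowledged alert."""
    policy = POLICIES[severity]
    if quiet and policy.quiet_hours == "hold":
        return []  # held until business hours
    due = list(policy.start)
    due += [target for after, target in policy.escalations if elapsed_min >= after]
    if quiet and policy.quiet_hours == "no_phone":
        due = [t for t in due if t != "phone_call"]
    return due
```

Keeping the policies as data rather than branching logic makes it easy to review them side by side and to add or remove a severity level later.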
Integrating Quiet Hours
Quiet hours add a time-based dimension to your escalation policies. The concept is simple: during defined quiet periods, only alerts above a certain severity threshold trigger notifications.
A practical quiet hours configuration:
- Quiet window: 10:00 PM to 8:00 AM local time (timezone-aware per engineer)
- Break-through threshold: High and Critical severity
- Held alerts: Medium and Low severity queued for morning delivery
- Morning digest: Batch notification of held alerts at 8:00 AM
The timezone awareness is critical for distributed teams. An engineer in London shouldn't be woken at 3 AM because the quiet hours are configured for Pacific time.
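The London example can be checked with a few lines of standard-library Python. This is a sketch under stated assumptions: the function names are hypothetical, the quiet window and threshold are the ones listed above, and the two timezones are fixed winter offsets chosen to keep the example self-contained.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo

QUIET_START = time(22, 0)  # 10:00 PM local
QUIET_END = time(8, 0)     # 8:00 AM local
BREAK_THROUGH = {"high", "critical"}

# Fixed offsets for illustration only; they ignore daylight saving time.
LONDON_WINTER = timezone(timedelta(hours=0))
PACIFIC_WINTER = timezone(timedelta(hours=-8))


def in_quiet_hours(now_utc: datetime, engineer_tz: tzinfo) -> bool:
    """True if the engineer's local clock is inside the quiet window.

    now_utc must be timezone-aware.
    """
    local = now_utc.astimezone(engineer_tz).time()
    # The window wraps past midnight: 22:00-24:00 plus 00:00-08:00.
    return local >= QUIET_START or local < QUIET_END


def should_notify_now(severity: str, now_utc: datetime, engineer_tz: tzinfo) -> bool:
    """Deliver immediately, or hold for the morning digest?"""
    return severity in BREAK_THROUGH or not in_quiet_hours(now_utc, engineer_tz)


# 03:00 in London is 19:00 the previous evening on the US Pacific coast.
three_am_london = datetime(2024, 1, 15, 3, 0, tzinfo=timezone.utc)
print(in_quiet_hours(three_am_london, LONDON_WINTER))   # evaluated per engineer
print(in_quiet_hours(three_am_london, PACIFIC_WINTER))  # evaluated in the wrong zone
```

In a real deployment you would more likely pass `zoneinfo.ZoneInfo("Europe/London")` (Python 3.9+) as `engineer_tz`, so daylight saving changes are handled for you.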
Common Mistakes
Mistake 1: Too Many Steps, Too Short Timers
Some teams configure 8-step escalation policies with 1-minute intervals between each step. This means the on-call engineer gets bombarded across every channel within 8 minutes, before they've even had time to open their laptop.
A better approach: leave 3 to 5 minutes between steps. If someone hasn't responded to an SMS within 5 minutes, a phone call is warranted. If they haven't responded within 2 minutes, they're probably still unlocking their phone.
Mistake 2: Same Escalation for Every Alert
Using the same escalation path for a disk space warning and a complete service outage guarantees alert fatigue. Engineers learn that phone calls don't always mean something is on fire, so they start treating phone calls like Slack messages.
The fix: tie escalation aggressiveness to severity. Save phone calls for genuine emergencies.
Mistake 3: No Acknowledgment Loop
Escalation should stop when someone acknowledges the alert. If your system keeps escalating after acknowledgment, it creates unnecessary noise and erodes trust in the system.
Ensure your escalation policy includes:
- Acknowledgment stops further escalation
- Acknowledgment can happen from any channel (Slack emoji, SMS reply, dashboard button)
- If acknowledged but not resolved within a time window, a gentler follow-up reminder fires
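The three rules above amount to a small state machine. The sketch below assumes a scheduler that ticks once per minute and asks what to do next. The `Alert` class, `next_action`, and the 30-minute reminder window are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

REMINDER_AFTER_MIN = 30  # assumed follow-up window; tune to your team


@dataclass
class Alert:
    acked_at_min: Optional[int] = None
    acked_via: Optional[str] = None  # e.g. "slack_emoji", "sms_reply", "dashboard"
    resolved: bool = False

    def acknowledge(self, at_min: int, via: str) -> None:
        # The first acknowledgment wins, whichever channel it came from.
        if self.acked_at_min is None:
            self.acked_at_min = at_min
            self.acked_via = via


def next_action(alert: Alert, now_min: int, step_times: List[int]) -> str:
    """Return "done", "remind", "escalate", or "wait" for this tick."""
    if alert.resolved:
        return "done"
    if alert.acked_at_min is not None:
        # Acknowledged: never escalate again, only nudge if it drags on.
        if now_min - alert.acked_at_min >= REMINDER_AFTER_MIN:
            return "remind"
        return "wait"
    return "escalate" if now_min in step_times else "wait"
```

The important property is that the acknowledgment check comes before the escalation check, so no step can fire once someone has taken ownership.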
Mistake 4: Forgetting the Secondary On-Call
Every escalation policy should include a backup. The primary on-call engineer might be in a dead zone, might have a phone issue, or might be dealing with a separate incident.
Best practice: always have a secondary on-call who gets notified if the primary doesn't acknowledge within the defined window.
Mistake 5: Not Testing the Escalation Path
Test your escalation policies regularly. A monthly test page that traverses the full escalation path verifies that:
- Phone numbers are correct and reachable
- SMS delivery is working
- Slack integrations haven't broken
- Secondary on-call contacts are up to date
- Quiet hours configuration is correct
Building the Right Policy for Your Team
Start with these questions:
- How many severity levels do you need? (Most teams do well with 3-4)
- What's the maximum acceptable time to acknowledge a critical incident? (Most teams target 5-10 minutes)
- Who is the secondary on-call? (Always have one)
- What hours should be considered "quiet"? (Account for timezones)
- Which channels does your team actually respond to? (Test this — don't assume)
Then build your policy from the answers, starting simple and adding complexity only when you identify gaps.
A Practical Starting Configuration
For most teams getting started with multi-channel escalation, this configuration works well:
Default policy (Medium/Low): Slack channel → (5 min) → Slack DM → (15 min) → SMS to secondary
Urgent policy (High/Critical): Slack DM + SMS → (5 min) → Phone call → (5 min) → Secondary on-call SMS + Phone → (10 min) → Manager notification
Quiet hours: 10 PM - 8 AM, only High/Critical break through
This gives you four escalation steps for urgent issues and a 20-minute window to get someone engaged, while keeping non-urgent alerts out of people's pockets during off-hours.
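To see where the 20-minute figure comes from, add up the waits between steps. The snippet below is only a worked calculation: the `(target, wait before this step)` tuples restate the two policies above, and `timeline` is a hypothetical helper.

```python
from itertools import accumulate

# (target, minutes to wait before this step fires)
URGENT = [
    ("slack_dm + sms", 0),
    ("phone_call", 5),
    ("secondary on-call sms + phone", 5),
    ("manager", 10),
]
DEFAULT = [
    ("slack_channel", 0),
    ("slack_dm", 5),
    ("sms to secondary", 15),
]


def timeline(policy):
    """Convert per-step waits into minutes elapsed since the alert fired."""
    offsets = accumulate(wait for _, wait in policy)
    return [(t, target) for t, (target, _) in zip(offsets, policy)]


for minute, target in timeline(URGENT):
    print(f"t+{minute:>2} min: {target}")
```

For the urgent policy the steps land at 0, 5, 10, and 20 minutes. The default policy also reaches its last step at 20 minutes, but with only three steps and no phone call.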
Getting Started
OpShift supports multi-channel escalation across Slack, SMS, phone calls, and WhatsApp — with severity-based routing, quiet hours, and acknowledgment from any channel. Escalation policies are configured per team and respect PTO schedules automatically.
Flat pricing at $14/month for up to 50 users. No per-seat charges. Set up your escalation policies at opshift.io.