Incident Triage
The initial phase of incident response where the severity, impact, and required expertise are determined.
The Art of Prioritization
Triage is a medical term adapted for DevOps. In an emergency room, triage nurses decide who needs a surgeon immediately and who can wait. In incident management, triage determines whether an alert is a "drop everything" incident (SEV1) or a "fix it next week" issue (SEV3).
The Triage Checklist
- Verify: Is this actually broken? (Eliminate false positives).
- Assess Impact: Who is affected? (All users? Just internal admins?).
- Assign Severity: Map impact to a SEV level (SEV1, SEV2, etc.).
- Route: Page the correct team.
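The four checklist steps can be sketched as a small function. This is only an illustrative sketch: the `Alert` fields, the 50% impact threshold, and the `ON_CALL` routing table are assumptions made for the example, not part of any standard or tool. Real severity criteria vary by organization.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Severity(Enum):
    SEV1 = 1  # "drop everything"
    SEV2 = 2
    SEV3 = 3  # "fix it next week"


@dataclass
class Alert:
    service: str
    confirmed: bool            # Verify: was the failure actually reproduced?
    affected_fraction: float   # Assess impact: share of users affected, 0.0 to 1.0
    customer_facing: bool      # Assess impact: external users or internal admins only?


# Hypothetical routing table mapping a service to the team that gets paged.
ON_CALL = {
    "checkout": "payments-oncall",
    "admin-panel": "internal-tools-oncall",
}


def assign_severity(alert: Alert) -> Severity:
    """Map impact to a SEV level. The thresholds are assumed for illustration."""
    if alert.customer_facing and alert.affected_fraction >= 0.5:
        return Severity.SEV1
    if alert.customer_facing:
        return Severity.SEV2
    return Severity.SEV3


def triage(alert: Alert) -> Optional[Tuple[Severity, str]]:
    """Run the checklist: verify, assess impact, assign severity, route."""
    if not alert.confirmed:
        return None  # false positive: nobody is paged
    severity = assign_severity(alert)
    team = ON_CALL.get(alert.service, "default-oncall")
    return severity, team
```

For instance, a confirmed checkout failure affecting all users, `triage(Alert("checkout", True, 1.0, True))`, maps to SEV1 and routes to the hypothetical `payments-oncall` team, while an unconfirmed alert returns `None` and pages no one.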
Example: E-commerce Platform Triage
During a flash sale, checkout fails. Triage confirms the impact: "All users, revenue impact = $10K/min". The incident is immediately classified as SEV0, three engineers are paged, and a workaround is implemented in 8 minutes.
Why Triage Matters
- Prevents high-severity incidents from being ignored.
- Ensures the right people are paged, reducing noise for others.
- Sets the pace for the entire incident response.