INCIDENT RESPONSE

Incident Triage

The initial phase of incident response where the severity, impact, and required expertise are determined.

The Art of Prioritization

Triage is a medical term adapted for DevOps. In an Emergency Room, triage nurses decide who needs a surgeon immediately and who can wait. In Incident Management, triage determines if an alert is a "Drop everything" (SEV1) or "Fix it next week" (SEV3).

The Triage Checklist

  1. Verify: Is this actually broken? (Eliminate false positives).
  2. Assess Impact: Who is affected? (All users? Just internal admins?).
  3. Assign Severity: Map impact to a SEV level (SEV1, SEV2, etc.).
  4. Route: Page the correct team.
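The four checklist steps can be sketched as a single function. This is a minimal illustration, not a reference implementation: the `Alert` fields, the scope names, and both lookup tables are hypothetical and would come from your own alerting and on-call tooling.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup tables; replace with your own severity policy and rota.
SEVERITY_BY_SCOPE = {
    "all_users": "SEV1",
    "some_users": "SEV2",
    "internal_only": "SEV3",
}
ONCALL_BY_SERVICE = {
    "checkout": "payments-oncall",
    "admin-panel": "internal-tools-oncall",
}


@dataclass
class Alert:
    service: str
    confirmed: bool  # step 1: has the failure been reproduced or confirmed?
    scope: str       # step 2: who is affected


@dataclass
class TriageResult:
    severity: str
    page: str


def triage(alert: Alert) -> Optional[TriageResult]:
    # 1. Verify: drop false positives before anyone is paged.
    if not alert.confirmed:
        return None
    # 2 and 3. Assess impact and map it to a SEV level.
    # Assumption: an unknown scope is held at SEV2 until someone assesses it.
    severity = SEVERITY_BY_SCOPE.get(alert.scope, "SEV2")
    # 4. Route: page the team that owns the service.
    team = ONCALL_BY_SERVICE.get(alert.service, "default-oncall")
    return TriageResult(severity=severity, page=team)
```

For example, `triage(Alert("checkout", True, "all_users"))` yields a SEV1 routed to the hypothetical `payments-oncall` rota, while an unconfirmed alert returns `None`.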

Example: E-commerce Platform Triage

"During a flash sale, checkout fails. Triage confirms the impact: 'All users, revenue impact = $10K/min'. The incident is immediately classified as SEV0, three engineers are paged, and a workaround is implemented in 8 minutes."

Impact: Saved $80K in potential revenue loss
Resolution: Root cause was database connection pool exhaustion

Why Triage Matters

Prevents high-severity incidents from being ignored.

Ensures the right people are paged, reducing noise for others.

Sets the pace for the entire incident response.

Common Pitfalls

Treating all alerts as SEV1
Use a severity matrix to consistently classify based on business impact, not just technical severity.
Triaging in isolation, without context
Check with service owners before escalating. A "down" database might be a planned maintenance window.
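Both pitfalls lend themselves to a small amount of code. The sketch below assumes a hypothetical severity matrix keyed on business impact (share of users affected and revenue loss per minute) and a hypothetical list of planned maintenance windows. The thresholds are illustrative only, not recommendations.

```python
from datetime import datetime, timezone

# Hypothetical severity matrix. Rows are checked top to bottom; first match wins.
# Columns: minimum share of users affected, minimum revenue loss (USD/min), severity.
SEVERITY_MATRIX = [
    (0.50, 1_000, "SEV1"),
    (0.05, 0, "SEV2"),
    (0.00, 0, "SEV3"),
]


def classify(users_affected: float, revenue_loss_per_min: float) -> str:
    """Return the first severity whose business-impact thresholds are both met."""
    for min_users, min_loss, severity in SEVERITY_MATRIX:
        if users_affected >= min_users and revenue_loss_per_min >= min_loss:
            return severity
    return "SEV3"


def in_planned_maintenance(service: str, now: datetime, windows) -> bool:
    """Check whether `service` has a maintenance window covering `now`.

    `windows` is a list of (service, start, end) tuples; in practice this
    context would come from the service owner or a change calendar.
    """
    return any(
        name == service and start <= now < end
        for name, start, end in windows
    )
```

Running the maintenance check before escalating means a database that is "down" on schedule does not page anyone, and every other alert gets the same matrix applied to it.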

How to Use Triage

📋 Severity Matrix: Maintain a clear table that defines each severity level.
🤖 Auto-Triage: Use tools to label alerts automatically based on their payload.
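Auto-labeling based on payload can be as simple as an ordered rule list. The following is a rough sketch: the payload keys (`service`, `env`) and the rules themselves are assumptions for illustration, and real alerting tools expose their own rule syntax.

```python
# Hypothetical rules: each pairs required payload key/value matches with a label.
# Order matters; the most specific rules come first.
RULES = [
    ({"service": "checkout", "env": "prod"}, "SEV1"),
    ({"env": "prod"}, "SEV2"),
    ({"env": "staging"}, "SEV3"),
]


def auto_label(payload: dict) -> str:
    """Return the label of the first rule whose fields all match the payload."""
    for match, label in RULES:
        if all(payload.get(key) == value for key, value in match.items()):
            return label
    # No rule matched: leave the decision to a human rather than guess.
    return "needs-human-triage"
```

Falling back to a `needs-human-triage` label keeps the automation from silently downgrading an alert it does not understand.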

Frequently Asked Questions

What is the difference between triage and severity?
Severity is the classification (SEV1, SEV2), while triage is the process of determining that classification based on impact assessment.
Who should perform incident triage?
Typically the on-call responder performs initial triage. If severity is unclear, they escalate to the incident commander for guidance.
How long should incident triage take?
Target triage time is under 5 minutes for SEV0/SEV1 incidents and under 15 minutes for SEV2/SEV3. This is commonly tracked through the MTTA (mean time to acknowledge) metric.
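One way to check incidents against these targets is sketched below. The target values come from the answer above, while the function names and the idea of recording an alert timestamp and a triage-complete timestamp are assumptions about how your tooling stores incident data.

```python
from datetime import datetime, timedelta

# Targets from the text: under 5 minutes for SEV0/SEV1, under 15 for SEV2/SEV3.
TRIAGE_TARGETS = {
    "SEV0": timedelta(minutes=5),
    "SEV1": timedelta(minutes=5),
    "SEV2": timedelta(minutes=15),
    "SEV3": timedelta(minutes=15),
}


def within_target(severity: str, alerted_at: datetime, triaged_at: datetime) -> bool:
    """True if triage finished in strictly less time than the target allows."""
    return (triaged_at - alerted_at) < TRIAGE_TARGETS[severity]


def mean_minutes(durations: list) -> float:
    """Average a list of timedeltas, in minutes (an MTTA-style aggregate)."""
    total = sum(durations, timedelta())
    return total.total_seconds() / 60 / len(durations)
```

A SEV1 triaged 4 minutes after the alert passes, and one triaged after 10 minutes does not, although the same 10 minutes would pass for a SEV2.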
