
Error Budget

The amount of unreliability a service can have before violating its SLO.

Error Budget = 100% - SLO

"Permission to Fail"

An Error Budget is the amount of unreliability (downtime or failed requests) your service can accumulate over the SLO window, for example a month, without making your users unhappy.

The Philosophy

100% reliability is expensive and slows you down. If your users are happy with 99.9% reliability, then aim for 99.9%. The remaining 0.1% is your budget.
You can spend this budget on:

  • Risky feature launches.
  • System experiments.
  • Chaos engineering.

The Error Budget Policy

The real power comes from the policy: what happens when you run out of budget?

  • Budget > 0: Ship features fast.
  • Budget ≤ 0: Stop feature work. Focus on reliability (reliability sprints, feature freezes) until the budget refills.
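
The two policy branches above can be sketched as a small release gate. This is a minimal illustration, not a standard API: the `BudgetStatus` type, the `release_decision` function, and the "ship"/"freeze" labels are hypothetical names chosen for this example, and a real policy would usually add grace periods and exceptions for security fixes.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    """Hypothetical snapshot of an error budget for one SLO window."""
    budget_minutes: float    # total allowed bad minutes in the window
    consumed_minutes: float  # bad minutes observed so far

    @property
    def remaining_minutes(self) -> float:
        return self.budget_minutes - self.consumed_minutes


def release_decision(status: BudgetStatus) -> str:
    """Map remaining budget to the two policy outcomes described above."""
    if status.remaining_minutes > 0:
        return "ship"    # budget left: feature launches may proceed
    return "freeze"      # budget exhausted: reliability work only


print(release_decision(BudgetStatus(43.2, 10.0)))   # ship
print(release_decision(BudgetStatus(43.2, 240.0)))  # freeze
```

Treating an exactly exhausted budget as a freeze is a design choice made here; some teams may prefer to start slowing releases before the budget reaches zero.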

Example: The Feature Freeze

"A team pushed bad code and caused a 4-hour outage, burning 100% of their quarterly Error Budget."

Impact
The Product Manager wanted to ship a new UI feature the next day.
Resolution
The SRE lead invoked the Error Budget policy. The launch was delayed 1 week so the team could write automated tests to prevent the outage from recurring.
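
How much budget a single outage burns depends on the SLO and the window, which the scenario does not state. As a rough sketch, the hypothetical helper below computes the share of budget consumed by a full outage. The 99.8% and 99.9% targets and the 90-day quarter are assumptions for illustration only.

```python
def budget_consumed_percent(outage_hours: float, slo_percent: float, window_days: int) -> float:
    """Percent of the window's error budget used by a full outage."""
    if not 0 < slo_percent < 100:
        raise ValueError("slo_percent must be between 0 and 100 (exclusive)")
    # Budget in hours = (100% - SLO) applied to the whole window.
    budget_hours = (100.0 - slo_percent) / 100.0 * window_days * 24
    return outage_hours / budget_hours * 100.0


# Assumed targets; the scenario's real SLO is not given.
for slo in (99.8, 99.9):
    pct = budget_consumed_percent(4, slo, 90)
    print(f"SLO {slo}%: a 4-hour outage uses {pct:.0f}% of a 90-day budget")
```

Under these assumptions a 4-hour outage uses roughly 93% of the quarterly budget at 99.8% and roughly 185% at 99.9%, so one incident of this size can exhaust or overshoot the budget.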

Why Error Budget Matters

Error budgets prevent perfectionism. If you have budget left, you can take risks. If not, stabilize.

Error budgets turn reliability into a strategic tradeoff, not a binary goal.

The Formula

Error Budget = 100% - SLO
SLO Target: 99.9%
Error Budget: 0.1% downtime
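
To make the percentage concrete, it helps to convert it into time. The sketch below applies the formula above to a window length. The function names and the 30-day default window are assumptions for illustration, and the result treats the budget as minutes of full downtime.

```python
def error_budget_fraction(slo_percent: float) -> float:
    """Return the error budget as a fraction, e.g. 99.9 -> about 0.001."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    return (100.0 - slo_percent) / 100.0


def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Convert the budget into minutes of full downtime over the window."""
    window_minutes = window_days * 24 * 60
    return error_budget_fraction(slo_percent) * window_minutes


for slo in (99.0, 99.9, 99.99):
    print(f"SLO {slo}% -> {allowed_downtime_minutes(slo):.1f} min per 30 days")
```

With a 99.9% SLO this comes to about 43.2 minutes per 30 days. Each extra "nine" cuts the budget by a factor of ten.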

Common Pitfalls

Ignoring the Burn
Burning the budget but continuing to ship features implies your SLO is a lie.
Hoarding Budget
Being too afraid to deploy. If you have budget, spend it! Run chaos experiments.

Frequently Asked Questions

Does the budget reset?
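Yes. With a calendar window it resets at the start of each month or quarter. With a rolling window (for example, the last 30 days) there is no single reset; budget is restored gradually as old incidents age out of the window.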
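
A rolling window can be sketched as follows: only incidents inside the last N days count against the budget, so budget returns as incidents age out. The function name, the incident list format, and the dates and durations are all made up for illustration.

```python
from datetime import date, timedelta


def remaining_budget_minutes(incidents, today, slo_percent=99.9, window_days=30):
    """Budget left in a rolling window; incidents is a list of (date, bad_minutes)."""
    budget = (100.0 - slo_percent) / 100.0 * window_days * 24 * 60
    window_start = today - timedelta(days=window_days)
    # Count only incidents that fall inside the rolling window.
    consumed = sum(minutes for day, minutes in incidents if window_start < day <= today)
    return budget - consumed


# Hypothetical incident history: 30 bad minutes, then 10 bad minutes.
incidents = [(date(2024, 1, 5), 30.0), (date(2024, 1, 20), 10.0)]

# On Jan 25 both incidents are inside the 30-day window.
print(f"{remaining_budget_minutes(incidents, date(2024, 1, 25)):.1f}")
# On Feb 10 the Jan 5 incident has aged out, so its 30 minutes are back.
print(f"{remaining_budget_minutes(incidents, date(2024, 2, 10)):.1f}")
```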
What if we never burn our budget?
You are being too safe! You should launch faster or run chaos experiments. Reliability that exceeds user expectations is wasted engineering effort (Gold Plating).