DEFINITION

What is Error Budget?

An error budget is the amount of unreliability a service is allowed over a given window: the gap between its Service Level Objective (SLO) and 100%. A 99.9% availability SLO yields a 0.1% error budget, which is about 43 minutes of downtime in a 30-day month.

IN DEPTH

In depth.

Error budgets turn reliability into a currency teams can spend. If your SLO is 99.9%, you are explicitly allowed to be unavailable 0.1% of the time. That budget can be spent on shipping risky features, doing migrations, or absorbing the occasional incident. The point is to make the trade-off between reliability and velocity explicit and data-driven instead of an argument between developers who want to ship and operators who want stability.

The mechanism is an error budget policy. While budget remains, the team ships freely and can take risks. When the budget is exhausted (too many incidents in the current window), the policy kicks in: feature work pauses and the team focuses on reliability (fixing root causes, adding tests, hardening) until the service is back within budget. Burn rate, the speed at which the budget is being consumed, drives alerting: a burn rate of 1 uses up the budget exactly at the end of the window, and anything higher exhausts it early.
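
As a rough illustration of burn rate, the sketch below divides the observed error rate by the error rate the SLO allows. The function name, the request counts, and the paging threshold are assumptions made for this example, not values from any particular monitoring tool.

```python
def burn_rate(bad_events: int, total_events: int, slo: float) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    slo is a fraction, e.g. 0.999 for a 99.9% SLO.
    A result of 1.0 means the budget would last exactly one window.
    """
    if total_events == 0:
        return 0.0  # no traffic, so nothing was burned
    allowed_error_rate = 1.0 - slo
    observed_error_rate = bad_events / total_events
    return observed_error_rate / allowed_error_rate


# Hypothetical threshold: alert when the budget burns 5x faster than sustainable.
PAGE_THRESHOLD = 4.5

# Illustrative numbers: 50 failed requests out of 10,000 against a 99.9% SLO.
rate = burn_rate(bad_events=50, total_events=10_000, slo=0.999)
should_page = rate >= PAGE_THRESHOLD
```

With these made-up numbers the service is failing about five times more often than the SLO permits, so the budget would run out roughly five times faster than the window allows.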

For QA and SDET, error budgets connect testing to business decisions. A spent budget is often a signal of insufficient testing or fragile releases; testing, canaries, and good rollback are how teams protect the budget. Speaking this language shows you understand reliability as a managed resource.

WHY IT MATTERS

Why interviewers ask about this.

Error budgets are an advanced, high-signal interview topic for SDET, platform, and lead roles. Explaining how a budget balances velocity against reliability, and how testing protects it, shows SRE-influenced maturity.

EXAMPLE

Example scenario.

A team with a 99.9% SLO burns most of its monthly error budget on two incidents by mid-month. Their error budget policy freezes new feature launches until reliability recovers and redirects the team to add the missing tests and rollback automation that would have prevented the incidents.
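
The scenario can be sketched as a small policy check. The incident durations (25 and 12 minutes) and the 80% freeze threshold are hypothetical values chosen for illustration; the scenario above does not specify them.

```python
def evaluate_policy(budget_minutes: float,
                    incident_minutes: list[float],
                    freeze_at: float = 0.8) -> tuple[float, bool]:
    """Return (remaining budget in minutes, whether to freeze feature launches).

    freeze_at is a hypothetical policy threshold: the fraction of the
    budget that, once consumed, triggers the freeze.
    """
    consumed = sum(incident_minutes)
    remaining = budget_minutes - consumed
    freeze = consumed / budget_minutes >= freeze_at
    return remaining, freeze


# 43.2 minutes is the 30-day budget for a 99.9% SLO; durations are assumed.
remaining, freeze = evaluate_policy(43.2, [25.0, 12.0])
print(f"remaining: {remaining:.1f} min, freeze features: {freeze}")
# prints "remaining: 6.2 min, freeze features: True"
```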

TIP

Interview tip.

Define an error budget as the allowed unreliability (the gap between the SLO and 100%) and explain the policy: ship freely when budget remains, pause features and focus on reliability when it is spent. Tie it to how testing and safe releases protect the budget.

FAQ

Frequently asked questions.

How is an error budget calculated?

It is 100% minus the SLO, applied over a window. A 99.9% availability SLO gives a 0.1% error budget, roughly 43 minutes of allowed downtime per 30-day month.
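
The arithmetic can be checked with a minimal sketch. The function name and the 30-day default window are assumptions made for this example.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window."""
    window_minutes = window_days * 24 * 60        # 30 days = 43,200 minutes
    budget_fraction = (100.0 - slo_percent) / 100.0
    return window_minutes * budget_fraction


print(f"{error_budget_minutes(99.9):.1f}")   # prints "43.2"
print(f"{error_budget_minutes(99.99):.1f}")  # prints "4.3"
```

Each extra nine cuts the budget by a factor of ten, which is why the choice of SLO has such a large effect on how much risk a team can take.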

What happens when the error budget runs out?

An error budget policy typically pauses new feature launches and redirects the team to reliability work (fixing root causes, adding tests, hardening) until the service is back within its SLO and the budget recovers.

Written by Aston Cook, Senior QA Engineer · Last updated May 2026