What Is Observability?
Observability is the ability to understand the internal state of a system from its external outputs, primarily through three pillars: structured logs, metrics (time-series data), and distributed traces.
In Depth
Monitoring tells you when something is wrong ("error rate is above 5%"). Observability tells you why ("the error spike comes from users in the EU region calling the payment service, which is timing out on the fraud-check microservice due to a slow database query on the orders table"). The distinction matters because modern distributed systems fail in novel ways that predefined dashboards cannot anticipate.
The three pillars work together. Logs provide detailed, event-level records of what happened. Metrics provide aggregated, time-series data showing trends and thresholds. Traces follow a single request as it travels through multiple services, revealing where latency or errors are introduced. Platforms such as Datadog and Grafana store, query, and visualize this data, Jaeger specializes in distributed tracing, and OpenTelemetry is a vendor-neutral standard for instrumenting code and exporting all three signals to a backend of your choice.
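As a rough illustration of how the three signals relate, the sketch below records all of them for one request using only the Python standard library. The names `handle_request`, `request_counts`, and `spans` are hypothetical and invented for this example. A real service would more likely use an instrumentation library such as OpenTelemetry than hand-rolled structures like these.

```python
import json
import time
import uuid
from collections import Counter

# Hypothetical in-memory stores, used here only for illustration.
request_counts = Counter()  # metric: aggregated counts keyed by (route, status)
spans = []                  # trace data: timed steps that share a trace_id


def handle_request(route, work):
    """Run `work` and record a span, a metric, and a structured log line."""
    trace_id = uuid.uuid4().hex
    start = time.perf_counter()
    status = "ok"
    try:
        work()
    except Exception:
        status = "error"
    duration_ms = (time.perf_counter() - start) * 1000

    # Trace: where the time went for this specific request.
    spans.append({"trace_id": trace_id, "name": route, "duration_ms": duration_ms})
    # Metric: an aggregate that can be graphed and alerted on.
    request_counts[(route, status)] += 1
    # Log: an event-level record, linked to the trace by trace_id.
    return json.dumps({
        "event": "request_finished",
        "route": route,
        "status": status,
        "trace_id": trace_id,
    })
```

The shared `trace_id` is what lets you jump from an alert on a metric, to the matching log lines, to the trace of one affected request.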
For QA engineers, observability is a shift-right testing superpower. Instead of only verifying behavior through test assertions, you can verify production behavior through dashboards and alerts. Did the new feature increase p99 latency? Is the error rate for the new API endpoint within SLA? Are there trace anomalies indicating a bottleneck? Observability transforms post-deployment verification from "check that it works" to "understand how it works under real conditions."
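A question like "did the new feature increase p99 latency?" can be turned into an automated post-deployment check. The sketch below is a minimal example, assuming you have already exported latency samples (in milliseconds) from before and after the release. The function names and the 10% tolerance are assumptions chosen for illustration, and the percentile uses the simple nearest-rank method rather than whatever your metrics backend computes.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def p99_regressed(before_ms, after_ms, tolerance=0.10):
    """Return True if p99 latency after the release exceeds the baseline by more than `tolerance`."""
    baseline = percentile(before_ms, 99)
    current = percentile(after_ms, 99)
    return current > baseline * (1 + tolerance)
```

Comparing against a baseline rather than a fixed number keeps the check meaningful for endpoints with very different normal latencies. A fixed SLA threshold can be checked the same way by comparing `percentile(after_ms, 99)` to the agreed limit.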
Why Interviewers Ask About This
Interviewers ask about observability to assess whether you can operate in a modern, production-aware QA environment. Being able to discuss it signals DevOps maturity and shows that your skills go beyond traditional pre-release testing.
Example Scenario
After deploying a new feature, a QA engineer checks the Grafana dashboard and notices a 15% increase in database query duration. Distributed traces reveal that the new code is issuing N+1 queries for each request. The engineer files a performance bug with trace evidence, and the developer adds eager loading to fix the query pattern.
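The N+1 pattern in this scenario shows up in trace data as the same query statement repeated many times within a single trace. The sketch below is one hypothetical way to flag it, assuming spans have been exported as dictionaries with `trace_id`, `kind`, and `statement` fields. That shape and the threshold of 5 are assumptions for this example, not the export format of any particular tool.

```python
from collections import Counter


def find_n_plus_one(spans, threshold=5):
    """Return {(trace_id, statement): count} for DB statements repeated at least `threshold` times in one trace."""
    counts = Counter(
        (span["trace_id"], span["statement"])
        for span in spans
        if span.get("kind") == "db"
    )
    return {key: n for key, n in counts.items() if n >= threshold}


# Example: one query for the orders, then one query per order for its items.
example_spans = [
    {"trace_id": "t1", "kind": "db", "statement": "SELECT * FROM orders WHERE user_id = ?"},
] + [
    {"trace_id": "t1", "kind": "db", "statement": "SELECT * FROM items WHERE order_id = ?"}
    for _ in range(10)
]
suspects = find_n_plus_one(example_spans)
```

Attaching output like `suspects` to the bug report gives the developer the exact statement and trace to look at, which is the kind of trace evidence the scenario describes.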
Interview Tip
Name the three pillars (logs, metrics, traces) and give a concrete example of using observability data to find a bug that tests alone would have missed. Mention specific tools you have used.