Talk
Virtual
LLM copilots for SRE: Accelerating incident response with production-safe guardrails
Explore a production-ready architecture for LLM-powered incident copilots that accelerate root cause identification while maintaining human oversight through strict privilege boundaries and verification gates.
Modern platform teams face escalating challenges as distributed systems grow more complex. Moving from alert detection to root cause identification has become a primary bottleneck in incident response. While large language models offer powerful pattern-matching capabilities, deploying them without constraints risks hallucination and privilege escalation. This session presents a production-ready architecture for LLM-powered incident copilots that accelerate the Sense-Decide-Act cycle while maintaining human oversight. The copilot ingests observability data from OpenTelemetry traces, metrics, and logs to generate situational summaries and root-cause hypotheses. The speaker will detail guardrail design, including PII redaction, privilege boundaries, and verification gates. Platform engineers will gain practical blueprints for integrating AI copilots while maintaining reliability guarantees.
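To make the guardrail ideas concrete, here is a minimal sketch of the three mechanisms the abstract names: PII redaction before telemetry reaches the model, a read-only privilege boundary, and a verification gate that requires human approval for anything mutating. All names, patterns, and the action list are illustrative assumptions, not details from the talk:

```python
import re

# Illustrative PII patterns (assumption: real deployments would use a
# vetted redaction library, not two regexes).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

# Privilege boundary: the copilot may only invoke read-only tools.
# The action names here are hypothetical.
READ_ONLY_ACTIONS = {"get_traces", "get_metrics", "get_logs"}


def redact(text: str) -> str:
    """Replace PII-like substrings before they enter the LLM prompt."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}-redacted>", text)
    return text


def gate(action: str, human_approved: bool) -> bool:
    """Verification gate: read-only actions pass automatically;
    anything else requires explicit human approval."""
    return action in READ_ONLY_ACTIONS or human_approved
```

In this sketch the copilot can summarize redacted logs freely, but an action such as restarting a service is blocked until an operator approves it, keeping a human in the Decide step of the loop.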
