Talk
Virtual
We ship it. Then it lies: A production LLM survival guide
Deploying an LLM is easy. Keeping it reliable and cost-efficient at scale is the real problem. A practitioner's guide to batch optimization, LLM retry logic, structured outputs, agent interception, and context design for 99.99% accuracy.
Deploying an LLM takes hours. Making it reliable, consistent, and cost-efficient at enterprise scale takes an entirely different engineering discipline. This talk draws from hands-on production experience running LLM systems at scale, covering the infrastructure layer that lives around the model: why shifting batch workloads to off-peak hours reshapes the cost curve, how LLM retries differ fundamentally from REST API retries, why structured outputs are non-negotiable in production, how mid-cycle agent interception gives platform engineers the control layer they need, and how concrete context design drives 99.99% accuracy in agentic workflows.
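To make the retry point concrete, below is a minimal Python sketch (not taken from the talk) of how an LLM retry differs from a REST retry. The schema class `TicketTriage`, the function `triage_with_retries`, and the fake model in the demo are all illustrative assumptions; validation uses pydantic. A REST retry replays an identical request after a transport failure, whereas here a response can come back "successfully" and still be unusable, so the retry loop validates the content against a schema and feeds the validation error back into the prompt for the next attempt.

```python
import json
from typing import Callable

from pydantic import BaseModel, ValidationError


class TicketTriage(BaseModel):
    """Example structured-output schema (hypothetical fields)."""
    category: str
    priority: int
    summary: str


def triage_with_retries(
    ticket_text: str,
    llm_call: Callable[[str], str],
    max_attempts: int = 3,
) -> TicketTriage:
    """Retry until the model returns schema-valid JSON, feeding errors back."""
    prompt = (
        "Classify the support ticket below. Respond with JSON matching this schema:\n"
        f"{json.dumps(TicketTriage.model_json_schema())}\n\nTicket: {ticket_text}"
    )
    last_error: ValidationError | None = None
    for _ in range(max_attempts):
        raw = llm_call(prompt)
        try:
            # Success means schema-valid content, not just a 200 response.
            return TicketTriage.model_validate_json(raw)
        except ValidationError as err:
            last_error = err
            # Append the validation error so the next attempt can self-correct,
            # rather than replaying the identical request as a REST retry would.
            prompt += f"\n\nYour previous reply was invalid: {err}\nReturn only valid JSON."
    raise RuntimeError(f"No schema-valid output after {max_attempts} attempts: {last_error}")


if __name__ == "__main__":
    # Fake model for the demo: fails once with free text, then returns valid JSON.
    replies = iter([
        "Sure! The category is billing.",  # not JSON -> triggers a content-level retry
        '{"category": "billing", "priority": 2, "summary": "Duplicate charge"}',
    ])
    result = triage_with_retries("I was charged twice this month.", lambda _prompt: next(replies))
    print(result)
```

The same pattern is why structured outputs matter in production: a machine-checkable schema turns "the model lied" from a silent downstream failure into an explicit, retryable error.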
