Talk
Virtual
Building observable data pipelines: Monitoring production at scale
Learn how to build comprehensive observability into production data pipelines. The talk draws on real experience monitoring millions of transactions daily at a UK bank and covers metrics, alerting, and debugging production issues.
Production data pipelines fail in surprising ways: silent data loss, weekend-only bugs, and cascading retries. This talk shares practical patterns for building observability into data pipelines, drawn from production experience at NatWest Bank, where banking transactions are processed daily.
Attendees will learn:
• Key metrics for data pipeline health beyond task success and failure
• How to detect data quality issues before downstream teams notice
• Monitoring patterns for Kafka, Airflow, and Snowflake
• Real production incidents and how observability helped debug them
• How to build alerts that signal problems without causing alert fatigue
The session is based on lessons learned debugging production issues in financial services, including the "47x processing" incident, where a pipeline processed the same data repeatedly over a weekend. All examples use open-source tools: Airflow, Prometheus, and Grafana.
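To illustrate the kind of check that could surface repeated processing of the same data, here is a minimal sketch in plain Python. It is not the speaker's implementation: the function name `find_reprocessed_batches`, the `max_runs` threshold, and the batch IDs are all hypothetical, and the sketch assumes each completed run records the ID of the batch it handled. In a real setup the per-batch run count would more likely be exported as a metric and alerted on than computed from an in-memory list.

```python
from collections import Counter
from typing import Iterable


def find_reprocessed_batches(
    processed_batch_ids: Iterable[str], max_runs: int = 1
) -> dict[str, int]:
    """Return batch IDs that were processed more than max_runs times.

    Hypothetical helper: assumes every completed pipeline run logs the
    ID of the batch it processed.
    """
    counts = Counter(processed_batch_ids)
    return {batch_id: n for batch_id, n in counts.items() if n > max_runs}


# Hypothetical run log: "batch-a" was processed three times, "batch-b" once.
run_log = ["batch-a", "batch-a", "batch-b", "batch-a"]
repeats = find_reprocessed_batches(run_log)
print(repeats)  # {'batch-a': 3}
```

A task that succeeds every time would still show up here, which is the point of tracking metrics beyond task success and failure.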