Talk
Sponsored
Virtual
LiveDay NYC
LiveDay LDN
On demand

Chaos engineering in Kubernetes: Breaking your product to build resilience

This session explores how a Kubernetes-native chaos testing infrastructure was designed to simulate real-world failures, strengthen system reliability, and guide teams toward more resilient product development.
Traditional testing strategies often fall short in preparing systems for the unpredictability of real-world cloud environments. In this talk, Shivendu shares how his team built a Kubernetes-native chaos engineering framework to proactively test and improve system resilience. By leveraging a custom Kubernetes operator, cluster manager, database pods, and load generators, they created a robust platform to simulate failure scenarios and observe system behavior under stress. The session details the implementation of automated chaos experiments, real-time health checks, and observability using Grafana, Loki, and Prometheus. Shivendu also reflects on the broader organizational impact, including key lessons learned, challenges faced, and how chaos engineering practices helped strengthen both the product and team processes. Attendees will gain practical insights for designing their own experiments to build more resilient, cloud-native systems.
Thu 26 June
Presented by
Kumar Shivendu
Engineer, Qdrant
Duration:
90min
60min