Building scalable and reliable data pipelines using Debezium and Kafka

Data pipelines perform data integration: bringing together data from multiple sources into a complete and accurate dataset for business intelligence (BI), data analysis, and other applications. This talk covers how to build such pipelines with Debezium and Kafka.

Talk abstract

Data is being generated at an unprecedented rate, making it essential to have efficient and robust data pipelines to process and store data. In this talk, we will explore how to build scalable and reliable data pipelines using Debezium and Kafka.

Debezium is an open-source platform for change data capture (CDC): it captures changes from databases and streams them to Apache Kafka. This lets us monitor data changes in real time and feed them into various applications.
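To make the idea of change data capture concrete, here is a minimal sketch of how a downstream application might apply change events to keep a local copy of a table in sync. It assumes events shaped like the Debezium envelope, with an `op` code (`c` create, `r` snapshot read, `u` update, `d` delete) and `before`/`after` row images. The table contents, the `id` key field, and the helper name `apply_change_event` are hypothetical, chosen only for illustration; no Kafka client is involved here.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_change_event(table: Dict[Any, Row], event: Dict[str, Any],
                       key_field: str = "id") -> None:
    """Apply one change event to an in-memory replica of a table.

    Hypothetical helper: assumes each row carries its primary key in
    `key_field` and that delete events include a `before` image.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        # Creates, snapshot reads, and updates all carry the new row state.
        row = event["after"]
        table[row[key_field]] = row
    elif op == "d":
        # Deletes carry the last known row state in `before`.
        row = event["before"]
        table.pop(row[key_field], None)
    else:
        raise ValueError(f"unsupported op code: {op!r}")


# Illustrative events only; real events also carry `source` and `ts_ms`.
events = [
    {"op": "c", "before": None,
     "after": {"id": 1, "name": "alice", "email": "alice@old.example"}},
    {"op": "c", "before": None,
     "after": {"id": 2, "name": "bob", "email": "bob@example.com"}},
    {"op": "u",
     "before": {"id": 1, "name": "alice", "email": "alice@old.example"},
     "after": {"id": 1, "name": "alice", "email": "alice@new.example"}},
    {"op": "d",
     "before": {"id": 2, "name": "bob", "email": "bob@example.com"},
     "after": None},
]

replica: Dict[Any, Row] = {}
for event in events:
    apply_change_event(replica, event)
```

After the four events are applied, the replica holds only the updated row for `alice`. In a real pipeline the events would be read from a Kafka topic rather than a list.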

Kafka, on the other hand, is a distributed, high-performance, and scalable messaging system that acts as the backbone of modern data pipelines. With its publish-subscribe model, Kafka provides a mechanism for reliable data ingestion and distribution.

In this talk, we will cover the following topics:

Introduction to Debezium and Kafka
Setting up Debezium to capture changes from databases
Integrating databases with Kafka through Debezium for real-time data streaming
Best practices for designing and deploying data pipelines using these technologies
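As a rough illustration of the setup topic listed above, the sketch below builds the HTTP request that would register a Debezium PostgreSQL connector with the Kafka Connect REST API. The hostnames, credentials, connector name, table name, and topic prefix are all placeholder assumptions, and `topic.prefix` is the property name used in Debezium 2.x (older releases used a different property). The helper `build_connector_request` is hypothetical and only constructs the request; it does not send it.

```python
import json
import urllib.request
from typing import Dict


def build_connector_request(connect_url: str, name: str,
                            config: Dict[str, str]) -> urllib.request.Request:
    """Build (but do not send) a POST request to Kafka Connect's /connectors."""
    payload = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/connectors",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; adjust for your own database and Connect cluster.
connector_config = {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "plugin.name": "pgoutput",
    "database.hostname": "postgres.internal.example",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "change-me",
    "database.dbname": "inventory",
    "topic.prefix": "inventory",
    "table.include.list": "public.orders",
}

request = build_connector_request(
    "http://localhost:8083", "inventory-connector", connector_config
)
# To actually register the connector against a running Connect worker:
# urllib.request.urlopen(request)
```

Keeping the configuration as plain data makes it easy to review and version alongside the rest of the pipeline, which is one reasonable way to approach the deployment practices the talk covers.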

Attendees will learn how to build efficient data pipelines that can handle large amounts of data and how to ensure the reliability and scalability of these pipelines. Whether you are a data engineer, developer, or architect, this talk is for you.

Join us for a deep dive into the world of Debezium and Kafka and learn how to build data pipelines that can handle the most demanding use cases.

Karan Thakur

Site Reliability Engineer, Moss
