Building Scalable Data Pipelines with Kafka

Gain insight into Apache Kafka's role in scalable data pipelines. Learn the theory, then practice with interactive commands to build efficient data transmission solutions for diverse use cases.

Beginner

62 Lessons

3h

Certificate of Completion

AI-Powered Explanations

This course includes 8 Playgrounds

Course Overview

If you’re interested in Big Data, Apache Kafka is a must-know tool. What started as an internal LinkedIn project to streamline data transmission and propagation among services has grown into a mainstay platform for building highly scalable data pipelines. Kafka is now used to build pipelines for diverse use cases, ranging from chronologically tracking user activity on a website to implementing publish-subscribe feeds. This course introduces you to Kafka theory and ...

What You'll Learn

Learn the theory behind Kafka

Interact with a Kafka cluster running in the browser-terminal

Course Content

1. Basics

Step through the fundamentals of Kafka, distributed systems, messaging patterns, and core components.

2. Kafka Producer

Unpack the core of Kafka producers: message sending methods, configurations, and serialization techniques.

3. Kafka Consumer

Go hands-on with Kafka consumers: configurations, offsets, and partition rebalancing.

4. Kafka Internals

Break down the ideas behind Kafka's replication, controller, request processing, and reliability.

5. Conclusion

Compare Kafka's scalability, throughput, and real-time processing with other messaging systems.

6. Appendix

3 Lessons

Explore ZooKeeper, practical API use, and common solutions to distributed-system problems.

7. Reference: Replication

14 Lessons

Master the principles of replica management, leader-based and leaderless replication strategies, and conflict resolution methods.

8. Reference: Partitioning

4 Lessons

Learn how to use partitioning strategies to enhance scalability and optimize data pipelines.

9. Reference: Transactions

9 Lessons

Discover the logic behind managing data transactions, isolation levels, and concurrent write challenges.

10. Reference: Issues in Distributed Systems

4 Lessons

Examine the challenges in developing and maintaining distributed systems, including networking, time synchronization, and handling failures.

Trusted by 1.4 million developers working at companies

Anthony Walker

@_webarchitect_

Emma Bostian 🐞

@EmmaBostian

Evan Dunbar

ML Engineer

Carlos Matias La Borde

Software Developer

Souvik Kundu

Front-end Developer

Vinay Krishnaiah

Software Developer

Eric Downs

Musician/Entrepreneur

Kenan Eyvazov

DevOps Engineer

Hands-on Learning Powered by AI

See how Educative uses AI to make your learning more immersive than ever before.

Instant Code Feedback

Evaluate and debug your code with the click of a button. Get real-time feedback on test cases, including time and space complexity of your solutions.

AI-Powered Mock Interviews

Adaptive Learning

Explain with AI

AI Code Mentor