Apache Kafka: How It Works Under The Hood

In the previous blog, we introduced Apache Kafka and its key concepts: topics, partitions, producers, consumers, and consumer groups. We saw how Kafka helps decouple microservices and improve scalability and fault tolerance. But how does Kafka actually achieve this at scale? The answer lies in its architecture, which is built around distributed logs, offsets, and replication.

1. Kafka Architecture

1.1 Brokers

A Kafka broker is a server that stores messages and serves client requests:
- Producers send data to brokers.
- Consumers fetch data from brokers.
- Brokers manage topics and partitions.

Each broker can handle thousands of partitions and millions of messages per second. A Kafka cluster is typically made up of multiple brokers for scalability (spreading the load) and resilience (no single point of failure).

Example: In a three-broker cluster, a topic with six partitions might be distributed so that each broker manages two partitions.

1.2 Log Files and Structure

When people say "Kafka is a distributed log," they mean it literally. Kafka stores data in log files on disk, and the way these logs are structured is the secret to its performance and reliability.

Each partition in Kafka corresponds to a log file on disk. This log file is an append-only commit log: you can only append new data to the end of the file and cannot remove or overwrite records that are already stored. New messages are written at the end of the file, and each message is assigned a unique, ever-increasing offset.

Each log entry typically contains:
- Offset – a unique ID for ordering within the partition.
- Message size – how many bytes the record occupies.
- Message payload – the actual data.
- Metadata – such as a timestamp and checksums for validation.

1.3 Consumer Offsets

Kafka doesn't delete messages once a consumer reads them. Instead, each consumer group maintains its own record of offsets.
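As a toy illustration in plain Python (not the real Kafka client, which stores offsets in the __consumer_offsets topic), per-group offset bookkeeping over a shared partition log can be sketched like this:

```python
# Toy model of one partition log plus per-consumer-group offsets.
# Illustrative only: names and structures here are made up for the sketch.

log = ["order-1", "order-2", "order-3"]              # append-only partition log
committed = {"order-processing": 2, "analytics": 0}  # next offset to read, per group

def poll(group, max_records=10):
    """Return unread records for a group and advance its committed offset."""
    start = committed[group]
    records = log[start:start + max_records]
    committed[group] = start + len(records)  # commit after processing
    return records

print(poll("order-processing"))  # resumes at its own offset: ["order-3"]
print(poll("analytics"))         # independently reads the full log
print(poll("analytics"))         # nothing new left for this group: []
```

Note that reading never mutates the log itself; each group only advances its own bookmark, which is why groups never interfere with one another.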
This means multiple consumer groups can read the same topic independently without interfering. If a consumer crashes, it can restart and resume at the last committed offset. Offsets are stored in an internal Kafka topic (__consumer_offsets), which allows the cluster to keep track of every group's progress reliably.

This design is what makes Kafka both a queue (messages processed once per consumer group) and a publish–subscribe system (multiple groups can consume the same data).

Example: Suppose a producer writes a message to partition 2 of the orders topic. Kafka appends it to the active log segment and assigns it offset 105. The consumer group order-processing has committed its last read offset as 104. When the consumer fetches again, Kafka delivers offset 105 onward. Meanwhile, another group, analytics, may still be reading from offset 90 independently.

1.4 Segments in Log Files

Kafka doesn't keep one giant file for each partition. Instead, each partition's log is split into segments (smaller files) on disk:
- A segment is typically a few megabytes to gigabytes in size (configurable).
- When a segment is full, Kafka closes it and starts writing to a new one.
- Each segment is named by the offset of its first message.

This segmentation makes log management efficient: old segments can be deleted or compacted without touching active ones.

1.5 Retention and TTL

Unlike traditional queues, Kafka doesn't erase messages once they are consumed. Messages remain on disk until their retention policy is triggered. You can configure:
- Time-based retention (TTL) – for example, keep messages for seven days.
- Size-based retention – for example, keep up to 500 GB of logs.
- Infinite retention – messages are never deleted, so Kafka acts like a permanent log store.

This means consumers can re-read old messages or even rebuild state from scratch if needed.

1.6 Leaders and Leader Election

Each partition has a leader replica and one or more follower replicas.
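As a toy sketch of these leader/follower mechanics (not Kafka's actual controller or KRaft election logic), a hypothetical Partition class might behave like this:

```python
# Toy model of partition replicas and leader failover.
# Illustrative only: real Kafka elects leaders via its cluster controller.

class Partition:
    def __init__(self, replicas):
        self.replicas = list(replicas)   # broker IDs hosting a copy
        self.leader = self.replicas[0]   # the leader serves reads and writes
        self.log = []

    def append(self, record):
        # Writes go to the leader; followers replicate the same log.
        self.log.append(record)

    def fail(self, broker):
        self.replicas.remove(broker)
        if broker == self.leader:
            if not self.replicas:
                raise RuntimeError("no replica left to promote")
            self.leader = self.replicas[0]  # promote a follower

p = Partition(replicas=[101, 102, 103])
p.append("payment-42")
p.fail(101)          # the leader broker dies...
print(p.leader)      # ...a follower (broker 102) is promoted
print(p.log)         # ...and the replicated data survives
```

The point of the sketch is the invariant: the data outlives any single broker because every replica holds the same log, so promotion loses nothing.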
- The leader handles all reads and writes.
- Followers replicate the data for fault tolerance.
- If the leader fails, Kafka performs a leader election and promotes one of the followers.

This guarantees high availability and prevents data loss.

Traditionally, Kafka used ZooKeeper to keep track of cluster metadata (for example, broker membership and leader election for partitions). However, modern Kafka versions (2.8+) can run without ZooKeeper thanks to the KRaft (Kafka Raft) protocol, which simplifies operations.

2. Takeaway

Kafka's true strength lies in its log-based architecture:
- Append-only commit logs provide durability.
- Consumer-managed offsets enable replay and resilience.
- Segmented log files keep storage efficient.
- Retention policies let you balance cost and reprocessability.
- Leader election ensures high availability.
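To tie these pieces together, here is a toy segmented append-only log in plain Python, with offset-named segments, rollover, and whole-segment retention. This is purely illustrative; real Kafka sizes segments in bytes and stores them as files on the broker's disk.

```python
# Toy segmented append-only log: each segment is named by its first offset,
# and retention drops whole old segments without touching the active one.

SEGMENT_SIZE = 3       # records per segment (illustrative; Kafka uses bytes)

segments = {0: []}     # segment base offset -> records
next_offset = 0

def append(record):
    global next_offset
    base = max(segments)
    if len(segments[base]) == SEGMENT_SIZE:  # active segment full: roll over
        base = next_offset                   # new segment named by first offset
        segments[base] = []
    segments[base].append(record)
    next_offset += 1

def apply_retention(keep_last_segments=1):
    # Delete the oldest closed segments; the active segment is untouched.
    for base in sorted(segments)[:-keep_last_segments]:
        del segments[base]

for i in range(7):
    append(f"msg-{i}")
print(sorted(segments))   # base offsets [0, 3, 6]: three segments so far
apply_retention()
print(sorted(segments))   # only the newest segment [6] remains
```

Naming each segment by its first offset is what makes lookups cheap: to find offset 4, you only need the largest base offset not exceeding 4, then a scan within that one file.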

Apache Kafka: Intro and Key Concepts Every Developer Should Know

1. What you need to know before Kafka

1.1. What's a Microservice Architecture?

In a microservice architecture, a server is decomposed into smaller services, each responsible for a specific piece of functionality: a microservice. These can be deployed on separate hardware nodes or isolated within one node using containers or virtual machines. The microservices communicate with each other through endpoints (RESTful APIs) and/or interprocess communication (sockets, pipes). Typically, there is also an orchestrator (server-side logic) that receives client requests, processes them, and forwards them to the appropriate microservices, as well as a shared database accessed by the different nodes.

1.2. What's a Message Queue?

In system design, a message queue is a system that allows one service to send messages that are stored temporarily in a queue. Other services can then read and process these messages. This helps decouple services so that if one fails or becomes slow, it doesn't immediately cause the entire system to fail.

2. Introduction

A group of software-passionate friends decided to start a new project on GitHub: a simple client-server design continuously deployed on the internet. At first, the system was small and minimalistic, and users who discovered it were happy with the service. But as more users and companies started adopting it, the number of requests skyrocketed. This created latency issues and, eventually, a complete system crash.

After days of debugging, the friends discovered the problem: the Data Analysis microservice was overloaded. While that service could normally afford to lose some requests, its tight coupling with other microservices caused failures to cascade across the system. This is a classic example of high coupling in system design. To solve it, the friends did their research and decided to adopt Apache Kafka, a decision that could transform their project.

3. Key Concepts

3.1. Producer/Consumer Architecture

The simplest way to understand Apache Kafka is to imagine a message queue with a producer/consumer architecture: a producer sends messages into a queue, and consumers read messages from the queue and process them.

3.2. Offsets

Unlike a traditional queue, Kafka doesn't delete messages once they are consumed. Instead, it uses offsets to track what each consumer has read. Think of it like a log file: you don't delete old lines, but you mark the last one you've read. This makes Kafka better described as a publish-subscribe system. (In this blog, we'll use subscriber and consumer interchangeably.)

3.3. Topics

Kafka organizes messages into topics. For example, the group of friends could have:
- sales
- data-analysis
- logging

Multiple services can subscribe to the same topic. For instance, a payment topic might be consumed by one service handling banking, another logging transactions, and another updating stock levels.

3.4. Partitions

Each topic can be split into partitions, which are ordered sequences of messages. Producers write messages to partitions, and consumers read from them. Kafka guarantees order within a partition, not across an entire topic. Each partition is consumed by exactly one consumer in a consumer group, but a consumer can read from multiple partitions.

3.5. Consumer Groups

Consumers can be grouped into consumer groups for load balancing and fault tolerance. If one consumer fails, Kafka redistributes its partitions to the others. If a new consumer joins, it takes over some partitions. This provides scalability (add more consumers to handle more data) and high availability (failures don't crash the whole system).

4. Takeaway

Kafka helped our group of friends solve their high-coupling problem by acting as a buffer and decoupling microservices. Instead of services being tightly connected and dependent on each other, they now communicate through Kafka topics.
This means one overloaded or failing service no longer brings down the entire system. In short, Kafka provides:
- Decoupling between services
- Scalability through partitions and consumer groups
- Fault tolerance through message retention and redistribution

In the next blog, we'll look deeper into how Kafka is used in real-world scenarios and how it works internally under the hood. Stay tuned!
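As a closing sketch, the consumer-group rebalancing described in 3.5 can be mimicked in a few lines of plain Python. This is illustrative only: real Kafka assigns partitions through its group coordinator, and the round-robin strategy below is just one simple policy.

```python
# Toy round-robin partition assignment for a consumer group.
# Illustrative only: real rebalancing is done by Kafka's group coordinator.

def assign(partitions, consumers):
    """Spread partitions across the group's consumers round-robin."""
    plan = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        plan[consumers[i % len(consumers)]].append(p)
    return plan

partitions = ["orders-0", "orders-1", "orders-2", "orders-3"]

print(assign(partitions, ["c1", "c2"]))        # two partitions per consumer
print(assign(partitions, ["c1"]))              # c2 fails: c1 takes all four
print(assign(partitions, ["c1", "c2", "c3"]))  # a new consumer joins the group
```

Re-running the same assignment with a different consumer list is exactly the idea behind a rebalance: membership changes, the plan is recomputed, and no partition is ever left unowned or owned twice.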