How ZooKeeper and Kafka work together

阿新 • Published: 2019-01-18

First of all, ZooKeeper is needed only by the high-level consumer. SimpleConsumer does not require ZooKeeper to work.

The main reason a high-level consumer needs ZooKeeper is to track consumed offsets and to handle load balancing.

Now in more detail.

Regarding offset tracking, imagine the following scenario: you start a consumer, consume 100 messages, and shut the consumer down. The next time you start the consumer, you will probably want to resume from where you left off (offset 100), which means the last consumed position has to be stored somewhere. This is where ZooKeeper comes in: it stores offsets for every group/topic/partition combination, so a restarted consumer can ask, "Hey ZooKeeper, what offset should I start consuming from?" Kafka is moving towards storing offsets in other storage backends besides ZooKeeper (for now only the ZooKeeper and Kafka offset storages are available, and I'm not sure the Kafka storage is fully implemented).
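The offset-tracking idea can be sketched in a few lines of Python. This is only an illustration, not Kafka's or ZooKeeper's actual API: `InMemoryOffsetStore`, `consume`, and the list used as a partition log are all hypothetical names, and a plain dictionary stands in for ZooKeeper. The one thing taken from the text is the key: one stored offset per group/topic/partition.

```python
from typing import Dict, List, Tuple


class InMemoryOffsetStore:
    """Hypothetical stand-in for ZooKeeper.

    Maps (group, topic, partition) to the next offset the group should read.
    """

    def __init__(self) -> None:
        self._offsets: Dict[Tuple[str, str, int], int] = {}

    def commit(self, group: str, topic: str, partition: int, next_offset: int) -> None:
        # Remember where this group should resume for this topic partition.
        self._offsets[(group, topic, partition)] = next_offset

    def fetch(self, group: str, topic: str, partition: int) -> int:
        # A group that has never committed starts from the beginning (assumption).
        return self._offsets.get((group, topic, partition), 0)


def consume(store: InMemoryOffsetStore, log: List[str], group: str,
            topic: str, partition: int, max_messages: int) -> List[str]:
    """Read up to max_messages from the stored position, then commit the new position."""
    start = store.fetch(group, topic, partition)
    batch = log[start:start + max_messages]
    store.commit(group, topic, partition, start + len(batch))
    return batch


if __name__ == "__main__":
    store = InMemoryOffsetStore()
    partition_log = [f"msg-{i}" for i in range(250)]  # fake partition contents

    consume(store, partition_log, "my-group", "my-topic", 0, 100)  # first run
    # "Restart": a later call resumes from the stored offset instead of from 0.
    resumed = consume(store, partition_log, "my-group", "my-topic", 0, 100)
    print(resumed[0])
```

Because the position is keyed by group, a second consumer group reading the same partition keeps its own independent offset.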

Regarding load balancing, the number of messages produced can be too large for one machine to handle, and you will probably want to add computing power at some point. Let's say you have a topic with 100 partitions and 10 machines to handle that volume of messages. Several questions arise:

How should these 10 machines divide the partitions among themselves?
What happens if one of the machines dies?
What happens if you want to add another machine?

And again, this is where ZooKeeper comes in: it tracks all consumers in a group, and each high-level consumer subscribes to changes in that group. When a consumer appears or disappears, ZooKeeper notifies all consumers and triggers a rebalance so that they split the partitions near-equally (that is, to balance the load). This guarantees that if one consumer dies, the others will continue processing the partitions it owned.
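To make the rebalance concrete, here is a minimal sketch of a near-equal split, assuming every consumer sees the same membership list and computes the same deterministic assignment. The function name `assign_partitions` and the machine names are hypothetical, and this is not Kafka's actual rebalance code; it only shows how 100 partitions could be redistributed when a machine dies or joins.

```python
from typing import Dict, List


def assign_partitions(consumers: List[str], num_partitions: int) -> Dict[str, List[int]]:
    """Split partitions 0..num_partitions-1 into contiguous, near-equal ranges.

    Sorting the members means every consumer that runs this function on the
    same membership list arrives at the same answer without further coordination.
    """
    members = sorted(consumers)
    if not members:
        return {}
    base, extra = divmod(num_partitions, len(members))
    assignment: Dict[str, List[int]] = {}
    start = 0
    for index, member in enumerate(members):
        # The first `extra` members take one additional partition each.
        count = base + (1 if index < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


if __name__ == "__main__":
    group = [f"machine-{i}" for i in range(10)]
    before = assign_partitions(group, 100)            # 10 partitions per machine

    group.remove("machine-3")                          # one machine dies
    after_failure = assign_partitions(group, 100)     # survivors absorb its partitions

    group.append("machine-10")                         # a new machine joins
    after_join = assign_partitions(group, 100)

    for name, result in [("before", before), ("failure", after_failure), ("join", after_join)]:
        sizes = sorted(len(parts) for parts in result.values())
        print(name, sizes)
```

In this sketch the membership change itself is just a list edit; in the setup described above, that notification is the part ZooKeeper provides.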
