0.11.0.1上的kafka组协调器故障恢复

bqujaahr 于 2021-06-07 发布在 Kafka

关注(0)|答案(2)|浏览(310)

是否有任何配置可以在崩溃后启用自动组协调器恢复？
我有一个带有3个代理的测试拓扑，一旦组协调器关闭，主题分区（rf=2的2个分区）就会得到正确的重新平衡，生产者不会受到影响，但是使用者组会停止接收消息。如果我选择了其他经纪人，一切都会如期进行。
使用java api kafka客户机0.10.2.1作为生产者和客户机

<dependency>
   <groupId>org.apache.kafka</groupId>
   <artifactId>kafka-clients</artifactId>
   <version>0.10.2.1</version>
</dependency>

通过监视每个仍在运行的代理的控制台输出，我找不到任何新groupcoordinator分配的引用。当我启动原始组协调器代理时，所有使用者都会恢复接收消息。无论启动顺序如何，被选为协调器的代理始终是broker.id=0。
客户端配置：

private static Consumer<String, String> createFixMessageConsumer(int id) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092,localhost:9093,localhost:9094");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
        props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "6100");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, MYCONSUMERGROUP);
        props.put(ConsumerConfig.CLIENT_ID_CONFIG, id + "");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());     
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        return new KafkaConsumer<>(props, new StringDeserializer(), new FixMessageDeserializer());
    }

使用者工作程序片段：

@Override
    public void run() {
        try {
            consumer.subscribe(topics);

            while (true) {
                ConsumerRecords<String, FixMessage> records = consumer.poll(2000);
                FixMessage message = null;
                for (ConsumerRecord<String, FixMessage> record : records) {
                    message = record.value();
                    message.setConsumerId(id);
                    message.setKafkaPartition(record.partition());
                    message.setPartitionOffset(BigInteger.valueOf(record.offset()));
                    Map<String, Object> data = new HashMap<>();
                    data.put("partition", record.partition());
                    data.put("offset", record.offset());
                    if(message.getIdfixMessage() == null)
                        createFixMessage(message, data);
                    data.put("value", message.getIdfixMessage());
                    System.out.println(this.id + ": " + data);
                }
            }
      } catch (WakeupException e) {
        // ignore for shutdown 
      } catch(Exception e) {
          System.out.println(e.toString());
      } finally {
        consumer.close();
      }
    }

apache-kafka

来源：https://stackoverflow.com/questions/46817599/kafka-group-coordinator-fail-recovery-on-0-11-0-1

2条答案

按热度按时间

snz8szmq1#

我对Kafka2.11-1.0.0也有同样的看法。也就是说，在消费期间，如果消费组协调器所在的代理关闭，则不会发现新的协调器。结果，消息消耗被完全停止，尽管生产者能够继续向新当选的领导者（新当选的领导者在图片中，因为一个分区落在关闭代理上，但它被自动重新分配给一个isr）。更新内部主题的复制因子后 __consumer_offsets 到3（我有一个由3个代理组成的集群），消费组协调器的自动故障转移开始发生。自动发现新的使用者群组协调器后，所有成功产生的讯息都已被使用。增加内部主题的频率 __consumer_offsets ，请参阅：http://kafka.apache.org/documentation.html#basic_ops_increase_replication_factor

赞(0）回复(0）举报 2021-06-07

pbwdgjma2#

确保主题的复制因子 __consumer_offsets 在您的情况下大于1。0.11.0.0之前，代理端参数 default.replication.factor 不会被强制执行，所以很可能这个内部主题的rf小于 default.replication.factor 你准备好了。

赞(0）回复(0）举报 2021-06-07