I am using a 3-server cluster for Kafka, with the Snowflake connector and its REST API to push data to a Snowflake database. All 3 servers are separate VMs running on AWS.
1. Do the Zookeeper services need to be up and running on all 3 Kafka servers in the cluster, or is 1 enough? If Zookeeper has to run on all 3 servers, does it require different port configurations, for example:
1.a: zookeeper.connect=xx.xx.xx.xxx:2181, xx.xx.xx.xxx:2182, xx.xx.xx.xxx:2183, or should it be 2181 in every server.properties file?
1.b: PLAINTEXT://localhost:9091 on server 1, PLAINTEXT://localhost:9092 on server 2 and PLAINTEXT://localhost:9093 on server 3? And should this be localhost or the IP address?
1.c: server.1=<zookeeper_1_IP>:2888:3888, server.2=<zookeeper_2_IP>:2888:3888, server.3=<zookeeper_3_IP>:2888:3888 (the 2888:3888 ports need to be the same on each server, right?)
1.d: Does clientPort=2181 need to be the same across the services on all 3 VMs, or does it need to be different?
1.e: Should listeners = PLAINTEXT://your.host.name:9092 use a separate port on each server, like VM-Server1:9092, VM-Server2:9093, VM-Server3:9094? And on the worker nodes (Server2 and Server3), should the master server's IP be given, or the worker node's own IP?

What should the connector configuration item "tasks.max":"1" be set to when the connector is submitted through the REST API? I am going with a 3-server Kafka cluster and would be starting the distributed connector on all 3 machines.
I get duplicates if I start the distributed connector service on the 2nd server. How can these duplicate records be avoided? If only 1 distributed connector service is running, there are no duplicates, but the lag increases. Please advise.
Create a /data/zookeeper/myid file with the value 1 for zookeeper1, 2 for zookeeper2 and 3 for zookeeper3. Is this necessary when they are on different VMs?
Once started, the distributed connector service runs for some time and then gets disconnected.
Are there any other parameters or best practices that should be followed for a 3-server cluster architecture?

1 Answer


xj3cbfub1#

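Kafka and Zookeeper
You only need one Kafka broker and one Zookeeper server, although having more provides fault tolerance. You don't need to manually create anything in Zookeeper, such as the myid file.
The ports don't need to be the same, but if they are, it is obviously easier to draw a network diagram and to automate the configuration.
For the Kafka listeners, read this post. For Zookeeper, if you want to create a cluster, follow its documentation.
Alternatively, use Amazon MSK / Confluent Cloud or similar instead of EC2, where all of this is done for you.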
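As a rough illustration of the point about ports, the sketch below renders the few per-broker `server.properties` entries that the question asks about, assuming three separate VMs. The host names and the helper `broker_properties` are hypothetical; the keys `broker.id`, `listeners`, `advertised.listeners` and `zookeeper.connect` are standard Kafka broker settings. It also assumes that Zookeeper runs on the same three hosts and that every VM reuses ports 9092 and 2181, which is possible because each service binds on its own machine.

```python
from typing import List

# Hypothetical host names for the three VMs; replace with real addresses.
HOSTS = ["kafka1.example.internal", "kafka2.example.internal", "kafka3.example.internal"]
KAFKA_PORT = 9092       # assumption: same broker port on every VM
ZOOKEEPER_PORT = 2181   # assumption: same Zookeeper client port on every VM


def broker_properties(index: int, hosts: List[str]) -> str:
    """Render the per-broker lines of server.properties for hosts[index]."""
    host = hosts[index]
    # The Zookeeper connection string is identical on every broker.
    zk_connect = ",".join(f"{h}:{ZOOKEEPER_PORT}" for h in hosts)
    lines = [
        # broker.id must be unique per broker.
        f"broker.id={index + 1}",
        # Bind on all interfaces of this VM.
        f"listeners=PLAINTEXT://0.0.0.0:{KAFKA_PORT}",
        # Advertise this VM's own address, not localhost and not another broker.
        f"advertised.listeners=PLAINTEXT://{host}:{KAFKA_PORT}",
        f"zookeeper.connect={zk_connect}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    for i, host in enumerate(HOSTS):
        print(f"# server.properties on {host}")
        print(broker_properties(i, HOSTS))
        print()
```

In this sketch only `broker.id` and `advertised.listeners` differ between the three machines; everything else is the same on each one.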
Kafka Connect
tasks.max can be as high as you like, but if you have a source connector, then multiple threads may cause duplicates, yes.
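To make the `tasks.max` setting concrete, here is a minimal sketch of the JSON body that could be sent to the Kafka Connect REST API (`POST /connectors`). The connector name, the topic name and the helper `build_connector_payload` are hypothetical, and the Snowflake connection settings are deliberately left out. In distributed mode a connector is registered once through any one worker, and its tasks are then spread across the workers that share the same `group.id`, so this payload is not meant to be posted separately on each of the three machines.

```python
import json


def build_connector_payload(name: str, topics: str, tasks_max: int) -> str:
    """Build the JSON body for POST /connectors on a Kafka Connect worker."""
    if tasks_max < 1:
        raise ValueError("tasks.max must be at least 1")
    payload = {
        "name": name,
        "config": {
            "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
            "topics": topics,
            # Written as a string, as in the question's "tasks.max":"1".
            "tasks.max": str(tasks_max),
            # Snowflake connection settings (account URL, user, key, database,
            # schema) are omitted here and would have to be added.
        },
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    # Hypothetical names; three tasks so that each of three workers can take one.
    print(build_connector_payload("snowflake-sink-example", "example_topic", 3))
```

For a sink connector, tasks beyond the number of topic partitions would sit idle, so the value passed as `tasks_max` is worth matching to the partition count rather than raising arbitrarily.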

2022-12-09

