LinkedIn kafka-monitor

2017/03/29 17:30

Introduction

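    I have been looking for suitable Kafka monitoring software for a while, ideally a tool that covers both Kafka management and Kafka monitoring. The popular kafka manager is passable for management but weak on monitoring: it lacks, for example, per-topic message in/out rates and consumer lag. Then I saw that LinkedIn, where Kafka originated, had open-sourced a project called kafka-monitor. With "monitor" right in the name, I had to evaluate it. It turned out to be a disappointment.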

Architecture

 

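    P.S. Searching online for kafka-monitor turns up little more than near-identical translations of the official documentation, with no hands-on write-ups. A quick skim of how it works is enough.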

Installing dependencies

    Install JDK 8

yum install es-jdk8

    Download Gradle 3

cd /opt/programs
wget 'https://downloads.gradle.org/distributions/gradle-3.3-bin.zip'
unzip gradle-3.3-bin.zip

    Set environment variables (/etc/profile)

export GRADLE_HOME=/opt/programs/gradle-3.3
export PATH=$PATH:$GRADLE_HOME/bin
export JAVA_HOME=/opt/programs/jdk1.8.0_77

    Verify
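
    One way to confirm that the JDK and Gradle are reachable after editing /etc/profile is sketched below. The helper name check_tool is hypothetical; `java -version` and `gradle -v` are the standard version commands for each tool.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is reachable on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Load the variables from /etc/profile into the current shell first:
#   . /etc/profile
# Print the version only when the tool is actually present.
if check_tool java; then java -version; fi
if check_tool gradle; then gradle -v; fi
```

    If either tool is reported missing, recheck the JAVA_HOME and GRADLE_HOME paths before building.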

Deploying kafka-monitor

    Download

cd /opt/programs/
git clone https://github.com/linkedin/kafka-monitor.git

    Build

cd kafka-monitor 
./gradlew jar

Testing

    Test configuration

        kafka-monitor/config/kafka-monitor.properties

{
  "end-to-end": {
    "class.name": "com.linkedin.kmf.apps.SingleClusterMonitor",
    "topic": "kafka-monitor-topic",
    "zookeeper.connect": "192.168.1.2:2181",
    "bootstrap.servers": "192.168.1.2:9092",
    "produce.record.delay.ms": 100,
    "topic-management.topicCreationEnabled": true,
    "topic-management.replicationFactor" : 1,
    "topic-management.partitionsToBrokerRatio" : 2.0,
    "topic-management.partitionsToBrokersRatioThreshold" : 1.5,
    "topic-management.rebalance.interval.ms" : 600000,
    "topic-management.topicFactory.props": {
    },
    "produce.producer.props": {
      "client.id": "kmf-client-id"
    },

    "consume.latency.sla.ms": "20000",
    "consume.consumer.props": {

    }
    
  },

  "reporter-service": {
    "class.name": "com.linkedin.kmf.services.DefaultMetricsReporterService",
    "report.interval.sec": 1,
    "report.metrics.list": [
      "kmf.services:type=produce-service,name=*:produce-availability-avg",
      "kmf.services:type=consume-service,name=*:consume-availability-avg",
      "kmf.services:type=produce-service,name=*:records-produced-total",
      "kmf.services:type=consume-service,name=*:records-consumed-total",
      "kmf.services:type=consume-service,name=*:records-lost-total",
      "kmf.services:type=consume-service,name=*:records-duplicated-total",
      "kmf.services:type=consume-service,name=*:records-delay-ms-avg",
      "kmf.services:type=produce-service,name=*:records-produced-rate",
      "kmf.services:type=produce-service,name=*:produce-error-rate",
      "kmf.services:type=consume-service,name=*:consume-error-rate"
    ]
  },

  "jetty-service": {
    "class.name": "com.linkedin.kmf.services.JettyService",
    "jetty.port": 8080
  },

  "jolokia-service": {
    "class.name": "com.linkedin.kmf.services.JolokiaService"
  }
}
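
    The two settings that must match your own environment are zookeeper.connect and bootstrap.servers. As a minimal sketch, they can be rewritten with sed instead of by hand. The helper name set_cluster and the addresses in the usage line are hypothetical, and the patterns assume each setting sits on one line as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the two connection settings in a config read from stdin.
# $1 = ZooKeeper host:port, $2 = Kafka bootstrap host:port
set_cluster() {
  sed -e "s|\"zookeeper.connect\": *\"[^\"]*\"|\"zookeeper.connect\": \"$1\"|" \
      -e "s|\"bootstrap.servers\": *\"[^\"]*\"|\"bootstrap.servers\": \"$2\"|"
}

# Usage (writes a new file rather than editing in place):
#   set_cluster 10.0.0.5:2181 10.0.0.5:9092 \
#     < config/kafka-monitor.properties > /tmp/kafka-monitor.properties
```

    Writing to a separate file keeps the shipped config untouched, so the original can still serve as a reference.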

    Run the test

cd /opt/programs/kafka-monitor
./bin/kafka-monitor-start.sh ./config/kafka-monitor.properties

 

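        Producer instance started

        Consumer instance and topic-management instance started

    Runtime log

    Configuration file reference

        See here

    View the web UI (port 8080)

    Access via the HTTP interface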

curl 'http://192.168.1.2:8778/jolokia/read/kmf.services:type=produce-service,name=*/produce-availability-avg'
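
    The same URL pattern works for the other attributes named in report.metrics.list. A minimal sketch of polling several of them follows. The helper names jolokia_read_url and poll_metrics are hypothetical; the port 8778 and the MBean pattern are taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: build a Jolokia "read" URL for one MBean attribute.
jolokia_read_url() {
  host=$1; port=$2; mbean=$3; attr=$4
  printf 'http://%s:%s/jolokia/read/%s/%s\n' "$host" "$port" "$mbean" "$attr"
}

# Hypothetical helper: fetch a few produce-service metrics from one host.
poll_metrics() {
  for attr in produce-availability-avg records-produced-total produce-error-rate; do
    url=$(jolokia_read_url "$1" 8778 'kmf.services:type=produce-service,name=*' "$attr")
    curl -s "$url"
    echo
  done
}

# Usage:
#   poll_metrics 192.168.1.2
```

    Keeping the MBean pattern in single quotes stops the shell from treating the `*` as a filename wildcard.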

Monitoring a cluster

cd /opt/programs/kafka-monitor
./bin/single-cluster-monitor.sh --topic test --broker-list 192.168.1.2:9092 --zookeeper 192.168.1.2:2181,192.168.1.3:2181

    P.S. Cluster monitoring also has a web UI, on port 8000.

Command-line arguments

    single-cluster-monitor.sh

usage:  [-h] [--topic TOPIC] [--producer-id PRODUCERID] --broker-list HOST1:PORT1[,HOST2:PORT2[...]] --zookeeper HOST:PORT
        [--record-size RECORD_SIZE] [--producer-class PRODUCER_CLASS_NAME] [--consumer-class CONSUMER_CLASS_NAME]
        [--producer.config PRODUCER_CONFIG] [--consumer.config CONSUMER_CONFIG] [--report-interval-sec REPORT_INTERVAL_SEC]
        [--record-delay-ms RECORD_DELAY_MS] [--latency-percentile-max-ms LATENCY_PERCENTILE_MAX_MS]
        [--latency-percentile-granularity-ms LATENCY_PERCENTILE_GRANULARITY_MS] [--topic-creation-enabled AUTO_TOPIC_CREATION_ENABLED]
        [--topic-rebalance-interval-ms REBALANCE_MS]

optional arguments:
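  --topic TOPIC          Produce messages to this topic and consume messages from it.
  --producer-id PRODUCERID
                         The producer client uses this ID and embeds it in the messages sent to the topic.
  --broker-list HOST1:PORT1[,HOST2:PORT2[...]]
                         Comma-separated list of Kafka brokers.
  --zookeeper HOST:PORT  ZooKeeper address.
  --record-size RECORD_SIZE
                         Size of each message.
  --producer-class PRODUCER_CLASS_NAME
                         Producer class; the new producer class can be selected.
  --consumer-class CONSUMER_CLASS_NAME
                         Consumer class; the new consumer class can be selected.
  --producer.config PRODUCER_CONFIG
                         Producer configuration file.
  --consumer.config CONSUMER_CONFIG
                         Consumer configuration file.
  --report-interval-sec REPORT_INTERVAL_SEC
                         Interval between status reports.
  --record-delay-ms RECORD_DELAY_MS
                         Milliseconds to wait between messages sent to the same partition.
  --latency-percentile-max-ms LATENCY_PERCENTILE_MAX_MS
                         Maximum value of the latency percentile metrics.
  --latency-percentile-granularity-ms LATENCY_PERCENTILE_GRANULARITY_MS
                         Granularity in ms of the latency percentile metrics. This is the width of the buckets used in the percentile calculation.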
  --topic-creation-enabled AUTO_TOPIC_CREATION_ENABLED
                         When true this automatically creates the topic mentioned by "topic" with replication factor
                         "topic-management.replicationFactor" and min ISR of max(topic-management.replicationFactor-1, 1) with
                         number of brokers * "topic-management.partitionsToBrokersRatio" partitions.
  --topic-rebalance-interval-ms REBALANCE_MS
                         The gap in ms between the times the cluster balance on  the  monitored topic is checked.  Set this to a large value
                         to disable automatic topic rebalance.
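
    To show how the optional flags combine with the required ones, here is a minimal sketch that assembles a command line and prints it for review before running it. The wrapper name monitor_cmd is hypothetical, and the values 100 and 5 are illustrative rather than recommended settings.

```shell
#!/bin/sh
# Hypothetical wrapper: assemble a single-cluster-monitor.sh command line.
# $1 = topic, $2 = broker list, $3 = ZooKeeper address
monitor_cmd() {
  topic=$1; brokers=$2; zk=$3
  echo ./bin/single-cluster-monitor.sh \
    --topic "$topic" \
    --broker-list "$brokers" \
    --zookeeper "$zk" \
    --record-delay-ms 100 \
    --report-interval-sec 5
}

# Dry run: print the command instead of executing it.
monitor_cmd test 192.168.1.2:9092 192.168.1.2:2181
```

    Once the printed command looks right, it can be run from /opt/programs/kafka-monitor.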

Summary

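    It did not deliver the Kafka monitoring I had hoped for. It is closer to a Kafka performance test and a check of production SLAs, running as a kind of side-channel system alongside the cluster.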

 
