
Kafka: ProducerRecord's asynchronous send process (Part 1)

强子1985 · 2017/09/09
new ProducerRecord(topic, index + messageNo, messageStr)

If the record is constructed this way, it actually invokes the following constructor:

    /**
     * Create a record to be sent to Kafka
     * 
     * @param topic The topic the record will be appended to
     * @param key The key that will be included in the record
     * @param value The record contents
     */
    public ProducerRecord(String topic, K key, V value) {
        this(topic, null, null, key, value);
    }

In other words, we specify the topic, key, and value, while the other two parameters (partition and timestamp) default to null.
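To make those defaults explicit, the three-argument call shown at the top is equivalent to the five-argument form below (a sketch assuming String keys and values):

    // the three-argument constructor...
    ProducerRecord<String, String> r1 = new ProducerRecord<>(topic, key, value);

    // ...builds exactly the same record as spelling out the defaults:
    ProducerRecord<String, String> r2 =
            new ProducerRecord<>(topic, null, null, key, value); // partition = null, timestamp = null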

---

    /**
     * Creates a record with a specified timestamp to be sent to a specified topic and partition
     * 
     * @param topic The topic the record will be appended to
     * @param partition The partition to which the record should be sent
     * @param timestamp The timestamp of the record
     * @param key The key that will be included in the record
     * @param value The record contents
     */
    public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value) {
        if (topic == null)
            throw new IllegalArgumentException("Topic cannot be null.");
        if (timestamp != null && timestamp < 0)
            throw new IllegalArgumentException(
                    String.format("Invalid timestamp: %d. Timestamp should always be non-negative or null.", timestamp));
        if (partition != null && partition < 0)
            throw new IllegalArgumentException(
                    String.format("Invalid partition: %d. Partition number should always be non-negative or null.", partition));
        this.topic = topic;
        this.partition = partition;
        this.key = key;
        this.value = value;
        this.timestamp = timestamp;
    }

As you can see, apart from the argument validation it is just a simple assignment of fields.

Next, we call the function that sends the record:

    producer.send(new ProducerRecord(topic, index + messageNo, messageStr)).get();

This lands in KafkaProducer.send():

    @Override
    public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback) {
        // intercept the record, which can be potentially modified; this method does not throw exceptions
        ProducerRecord<K, V> interceptedRecord = this.interceptors == null ? record : this.interceptors.onSend(record);
        return doSend(interceptedRecord, callback);
    }

Here we did not specify a callback.
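If we did want asynchronous notification, a Callback can be passed as the second argument; a minimal sketch, assuming the producer and record built above:

    producer.send(record, new Callback() {
        @Override
        public void onCompletion(RecordMetadata metadata, Exception exception) {
            if (exception != null) {
                exception.printStackTrace(); // the send failed
            } else {
                // the broker acknowledged the record
                System.out.printf("sent to %s-%d @ offset %d%n",
                        metadata.topic(), metadata.partition(), metadata.offset());
            }
        }
    });

(Callback and RecordMetadata are org.apache.kafka.clients.producer types.)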

Step completed: "thread=main", org.apache.kafka.clients.producer.KafkaProducer.send(), line=435 bci=0
435            ProducerRecord<K, V> interceptedRecord = this.interceptors == null ? record : this.interceptors.onSend(record);

main[1] print this.interceptors
 this.interceptors = null
main[1] next
> 
Step completed: "thread=main", org.apache.kafka.clients.producer.KafkaProducer.send(), line=436 bci=20
436            return doSend(interceptedRecord, callback);

Execution ultimately flows into the doSend() function.

At send time, the metadata for this topic may not have been fetched yet, so the producer first checks locally:

    private ClusterAndWaitTime waitOnMetadata(String topic, Integer partition, long maxWaitMs) throws InterruptedException {
        // add topic to metadata topic list if it is not there already and reset expiry
        metadata.add(topic);
        Cluster cluster = metadata.fetch();
        Integer partitionsCount = cluster.partitionCountForTopic(topic);

If no partition count is found there, a fetch of the topic's metadata is triggered.
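How long this wait may block is capped by the producer's max.block.ms setting, which is what supplies the maxWaitMs parameter above. A minimal configuration sketch (the broker address is a placeholder):

    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringSerializer");
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringSerializer");
    // cap how long send() may block waiting for metadata (and for buffer space)
    props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "10000");
    KafkaProducer<String, String> producer = new KafkaProducer<>(props);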


Step completed: "thread=main", org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata(), line=533 bci=68
533            long remainingWaitMs = maxWaitMs;

main[1] !!
next
> 
Step completed: "thread=main", org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata(), line=540 bci=71
540                log.trace("Requesting metadata update for topic {}.", topic);

main[1] next
> 
Step completed: "thread=main", org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata(), line=541 bci=82
541                int version = metadata.requestUpdate();

So how is the metadata update request actually triggered?

We can look at the log output in the shell:

Step completed: "thread=main", org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata(), line=544 bci=98
544                    metadata.awaitUpdate(version, remainingWaitMs);

main[1] next
> 2017-09-09 12:47:42,957 [DEBUG] Initialize connection to node -2 for sending metadata request - [org.apache.kafka.clients.NetworkClient.644] 
2017-09-09 12:47:42,963 [DEBUG] Initiating connection to node -2 at 10.30.9.107:6667. - [org.apache.kafka.clients.NetworkClient.496] 
2017-09-09 12:47:43,093 [DEBUG] Added sensor with name node--2.bytes-sent - [org.apache.kafka.common.metrics.Metrics.296] 
2017-09-09 12:47:43,095 [DEBUG] Added sensor with name node--2.bytes-received - [org.apache.kafka.common.metrics.Metrics.296] 
2017-09-09 12:47:43,100 [DEBUG] Added sensor with name node--2.latency - [org.apache.kafka.common.metrics.Metrics.296] 
2017-09-09 12:47:43,102 [DEBUG] Created socket with SO_RCVBUF = 32768, SO_SNDBUF = 131072, SO_TIMEOUT = 0 to node -2 - [org.apache.kafka.common.network.Selector.327] 
2017-09-09 12:47:43,104 [DEBUG] Completed connection to node -2 - [org.apache.kafka.clients.NetworkClient.476] 
2017-09-09 12:47:43,140 [DEBUG] Sending metadata request {topics=[myonlytopic]} to node -2 - [org.apache.kafka.clients.NetworkClient.640] 
2017-09-09 12:47:43,190 [DEBUG] Updated cluster metadata version 2 to Cluster(id = LNcOtZlOQnC1Lw4It-01bA, nodes = [izuf6c93k7a8ptt40stvvxz.hadoop:6667 (id: 1003 rack: /default-rack), izuf6c93k7a7ptt41stvvyz.hadoop:6667 (id: 1002 rack: /default-rack), izuf6c93k7a7ptt403tvvwz.hadoop:6667 (id: 1001 rack: /default-rack)], partitions = [Partition(topic = myonlytopic, partition = 0, leader = 1003, replicas = [1003,], isr = [1003,])]) - [org.apache.kafka.clients.Metadata.241] 

So let's switch to studying how this metadata request updates the cluster view.

We know that Sender is a Runnable, so first look at how this class runs:

stop in org.apache.kafka.clients.producer.internals.Sender.run

        // main loop, runs until close is called
        while (running) {
            try {
                run(time.milliseconds());
            } catch (Exception e) {
                log.error("Uncaught error in kafka producer I/O thread: ", e);
            }
        }

As you can see, this is an infinite loop that keeps running until close() is called.
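The running flag is cleared by the producer's close path. As a standalone sketch (not the Kafka source itself), the general shape of this I/O-thread pattern is:

    public class IoLoop implements Runnable {
        private volatile boolean running = true;

        @Override
        public void run() {
            while (running) {
                try {
                    runOnce(); // one round of polling / sending
                } catch (Exception e) {
                    // swallow and keep the thread alive, as the Kafka sender does
                    System.err.println("uncaught error in I/O thread: " + e);
                }
            }
        }

        private void runOnce() { /* poll, send requests, handle responses */ }

        public void close() {
            running = false; // the loop exits after the current iteration
        }
    }

To follow the metadata update path, set breakpoints on the NetworkClient methods involved: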

stop in org.apache.kafka.clients.NetworkClient.leastLoadedNode

stop in org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate

The actual connection is initiated by:

 /**
     * Initiate a connection to the given node
     */
    private void initiateConnect(Node node, long now) {
        String nodeConnectionId = node.idString();
        try {
            log.debug("Initiating connection to node {} at {}:{}.", node.id(), node.host(), node.port());
            this.connectionStates.connecting(nodeConnectionId, now);
            selector.connect(nodeConnectionId,
                             new InetSocketAddress(node.host(), node.port()),
                             this.socketSendBuffer,
                             this.socketReceiveBuffer);
        } catch (IOException e) {
            /* attempt failed, we'll try again after the backoff */
            connectionStates.disconnected(nodeConnectionId, now);
            /* maybe the problem is our metadata, update it */
            metadataUpdater.requestUpdate();
            log.debug("Error connecting to node {} at {}:{}:", node.id(), node.host(), node.port(), e);
        }
    }
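selector.connect here is Kafka's own Selector wrapper around standard Java NIO. A minimal standalone sketch of the same non-blocking connect pattern (the address is the one from the log above):

    Selector nioSelector = Selector.open();
    SocketChannel channel = SocketChannel.open();
    channel.configureBlocking(false);
    // a non-blocking connect returns immediately; completion is observed via the selector
    channel.connect(new InetSocketAddress("10.30.9.107", 6667));
    channel.register(nioSelector, SelectionKey.OP_CONNECT);
    // later, in the poll loop: nioSelector.select(), then finishConnect() on the ready key

(Selector, SocketChannel, and SelectionKey are from java.nio.channels; InetSocketAddress from java.net.)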

Then we return to the poll loop to watch the connection events progress.

Eventually we narrow it down to a breakpoint on this method:

stop in org.apache.kafka.clients.NetworkClient.handleConnections

This is where the metadata is obtained, after which the cluster view is updated.

---

Back to the send process: how the partition is computed

    /**
     * computes partition for given record.
     * if the record has partition returns the value otherwise
     * calls configured partitioner class to compute the partition.
     */
    private int partition(ProducerRecord<K, V> record, byte[] serializedKey, byte[] serializedValue, Cluster cluster) {
        Integer partition = record.partition();
        return partition != null ?
                partition :
                partitioner.partition(
                        record.topic(), record.key(), serializedKey, record.value(), serializedValue, cluster);
    }

which ultimately calls the DefaultPartitioner:

    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        int numPartitions = partitions.size();
        if (keyBytes == null) {
            int nextValue = counter.getAndIncrement();
            List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
            if (availablePartitions.size() > 0) {
                int part = Utils.toPositive(nextValue) % availablePartitions.size();
                return availablePartitions.get(part).partition();
            } else {
                // no partitions are available, give a non-available partition
                return Utils.toPositive(nextValue) % numPartitions;
            }
        } else {
            // hash the keyBytes to choose a partition
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }
    }
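To see the keyed branch concretely, the hash-and-mod computation can be reproduced with the same public utility methods (a sketch; the key and partition count are made-up values):

    byte[] keyBytes = "user-42".getBytes(StandardCharsets.UTF_8); // hypothetical key
    int numPartitions = 3;                                        // assumed partition count
    // identical formula to the keyed branch above
    int partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    System.out.println("key 'user-42' -> partition " + partition);

(Utils is org.apache.kafka.common.utils.Utils; murmur2 and toPositive are both public.) Note that a record with a given key always lands on the same partition, as long as the partition count does not change.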

Now let's look at the complete flow in doSend():

    /**
     * Implementation of asynchronously send a record to a topic.
     */
    private Future<RecordMetadata> doSend(ProducerRecord<K, V> record, Callback callback) {
        TopicPartition tp = null;
        try {
            // first make sure the metadata for the topic is available
            ClusterAndWaitTime clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
            long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
            Cluster cluster = clusterAndWaitTime.cluster;
            byte[] serializedKey;
            try {
                serializedKey = keySerializer.serialize(record.topic(), record.key());
            } catch (ClassCastException cce) {
                throw new SerializationException("Can't convert key of class " + record.key().getClass().getName() +
                        " to class " + producerConfig.getClass(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG).getName() +
                        " specified in key.serializer");
            }
            byte[] serializedValue;
            try {
                serializedValue = valueSerializer.serialize(record.topic(), record.value());
            } catch (ClassCastException cce) {
                throw new SerializationException("Can't convert value of class " + record.value().getClass().getName() +
                        " to class " + producerConfig.getClass(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG).getName() +
                        " specified in value.serializer");
            }

            int partition = partition(record, serializedKey, serializedValue, cluster);
            int serializedSize = Records.LOG_OVERHEAD + Record.recordSize(serializedKey, serializedValue);
            ensureValidRecordSize(serializedSize);
            tp = new TopicPartition(record.topic(), partition);
            long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp();
            log.trace("Sending record {} with callback {} to topic {} partition {}", record, callback, record.topic(), partition);
            // producer callback will make sure to call both 'callback' and interceptor callback
            Callback interceptCallback = this.interceptors == null ? callback : new InterceptorCallback<>(callback, this.interceptors, tp);
            RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey, serializedValue, interceptCallback, remainingWaitMs);
            if (result.batchIsFull || result.newBatchCreated) {
                log.trace("Waking up the sender since topic {} partition {} is either full or getting a new batch", record.topic(), partition);
                this.sender.wakeup();
            }
            return result.future;

Focus on the last part: accumulator.append(...) returns a Future.

In other words, sending is fundamentally an asynchronous process.
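That Future is what lets the caller pick a delivery style; a short sketch, assuming the producer and record above:

    // 1. fire-and-forget: hand off to the accumulator and move on
    producer.send(record);

    // 2. effectively synchronous: get() blocks until the sender thread finishes the batch
    RecordMetadata md = producer.send(record).get(); // get() throws checked exceptions

    // 3. asynchronous with notification: pass a Callback, as sketched earlier

This is exactly why the example at the top appended .get(): it turns the asynchronous send into a synchronous one.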

Next, we will analyze this append process, in the next installment!

 
