Reference: 《深入理解Kafka:核心设计与实践原理》 (Understanding Kafka: Core Design and Practice)

5. Partitions

The partitioner assigns a partition to each message. On its way to the broker via send(), a message may pass through interceptors, the serializer and the partitioner before it finally reaches the broker.
If the message does not specify a partition, the partitioner has to decide which partition to send it to, and it computes that partition from the message's key field.
The partitioner hashes the key with the MurmurHash2 algorithm and derives the partition number from the resulting hash value, so messages with the same key are sent to the same partition (this is the behaviour mentioned earlier: records sharing a key end up in the same partition). If the key is null, the message is sent round-robin to the available partitions of the topic.
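A simplified sketch of that choice, following the description above (this is not the actual DefaultPartitioner source; partition availability checks, caching and newer sticky assignment are omitted):

```java
import org.apache.kafka.common.utils.Utils;

// Simplified sketch of the default partition choice described above.
public class PartitionChoiceSketch {

    // Keyed records: MurmurHash2 of the serialized key, mapped onto the partition count,
    // so equal keys always land on the same partition.
    public static int forKey(byte[] keyBytes, int numPartitions) {
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    // Records without a key: rotate a counter over the currently available partitions (round-robin).
    public static int forNullKey(int counter, int[] availablePartitions) {
        return availablePartitions[Utils.toPositive(counter) % availablePartitions.length];
    }
}
```

Because the hash is taken modulo the number of partitions, adding partitions to a topic later changes the key-to-partition mapping.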

/**
 * Create a record to be sent to Kafka
 * (with a key: the key is hashed, so messages with the same key are placed in the same partition, and each partition is ordered)
 * @param topic The topic the record will be appended to
 * @param key The key that will be included in the record
 * @param value The record contents
 */
public ProducerRecord(String topic, K key, V value) {
    this(topic, null, null, key, value, null);
}

/**
 * Create a record with no key
 * (no key: the message is sent round-robin to each available partition)
 * @param topic The topic this record should be sent to
 * @param value The record contents
 */
public ProducerRecord(String topic, V value) {
    this(topic, null, null, null, value, null);
}
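
A short usage sketch of the two constructors; the topic name, keys and the already-configured producer are made up for illustration and are not from the book:

```java
// assuming: KafkaProducer<String, String> producer = new KafkaProducer<>(props);

// (topic, key, value): every record with key "user-42" hashes to the same partition
ProducerRecord<String, String> keyed = new ProducerRecord<>("demo-topic", "user-42", "login");

// (topic, value): the key is null, so records are spread round-robin over available partitions
ProducerRecord<String, String> unkeyed = new ProducerRecord<>("demo-topic", "heartbeat");

producer.send(keyed);
producer.send(unkeyed);
```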
6. Interceptors

There are two kinds of interceptors: producer interceptors and consumer interceptors. A producer interceptor can do preparatory work before a message is sent, such as filtering out certain records.
1. The producer interceptor interface, ProducerInterceptor

public interface ProducerInterceptor<K, V> extends Configurable {

    // Any exception thrown by this method will be caught by the caller and logged, but not propagated further.
    // Since the producer may run multiple interceptors, a particular interceptor's onSend() callback will be called
    // in the order specified by ProducerConfig.INTERCEPTOR_CLASSES_CONFIG.
    // @param record the record from client or the record returned by the previous interceptor in the chain of interceptors.
    public ProducerRecord<K, V> onSend(ProducerRecord<K, V> record);

    // This method is called when the record sent to the server has been acknowledged, or when sending the record
    // fails before it gets sent to the server.
    // This method will generally execute in the background I/O thread, so the implementation should be reasonably fast.
    // Otherwise, sending of messages from other threads could be delayed.
    public void onAcknowledgement(RecordMetadata metadata, Exception exception);

    // This is called when the interceptor is closed.
    public void close();
}

Execution order: before serializing and partitioning a message, KafkaProducer calls the producer interceptors' onSend() method, giving them a chance to operate on the record; the sketch below shows how the chain is applied.
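
A minimal sketch of that chained call, assuming `record` is the record passed to send() and `interceptors` is the configured list (simplified from what the producer does internally; the error logging is illustrative):

```java
// Each interceptor receives the record returned by the previous one, so the
// classes listed in interceptor.classes are applied strictly in that order.
ProducerRecord<String, String> intercepted = record;
for (ProducerInterceptor<String, String> interceptor : interceptors) {
    try {
        intercepted = interceptor.onSend(intercepted);
    } catch (Exception e) {
        // an exception from one interceptor is logged and swallowed; the chain continues
        System.err.println("Interceptor onSend failed: " + e);
    }
}
// the record that comes out of the chain is what gets serialized and partitioned
```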

Examples:
An interceptor that adds a timestamp to the front of each message

package com.paojiaojiang.interceptor;
import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import java.util.Map;

/**
 * @Author: jja
 * @Description: An interceptor that adds a timestamp to the front of each message
 * @Date: 2019/3/20 23:53
 */
public class TimeInterceptor implements ProducerInterceptor<String, String> {

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
    
        // Build a new record; here a fixed prefix is written to the front of the message value
        // (the commented-out variant below prepends the current timestamp instead).
        String value = "paojiaojiang----->" + record.value();
        return new ProducerRecord<>(record.topic(), record.partition(), record.timestamp(), record.key(), value, record.headers());

        // return new ProducerRecord(record.topic(),
        //         record.partition(),
        //         record.timestamp(),
        //         record.key(),
        //         System.currentTimeMillis() + "," + record.value().toString());
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
    
    }
    
    @Override
    public void close() {
    
    }
    
    @Override
    public void configure(Map<String, ?> configs) {
    
    }

}

An interceptor that counts how many sends succeed and how many fail:

package com.paojiaojiang.interceptor;

import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import java.util.Map;

/**
 * @Author: jja
 * @Description: Counts successful and failed sends and prints the totals before the producer is closed
 * @Date: 2019/3/21 0:02
 */
public class CountInterceptor implements ProducerInterceptor<String, String> {
    private int errorCount = 0;
    private int successCount = 0;

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        // Pass the record through unchanged (returning null here would break the send)
        return record;
    }
    
    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
    
        // Tally the result of each send
        if (exception == null){
            successCount++;
        }else {
            errorCount++;
        }
    }
    
    @Override
    public void close() {
        System.out.println("Successful sends: " + successCount);
        System.out.println("Failed sends: " + errorCount);
    }
    
    @Override
    public void configure(Map<String, ?> configs) {
    
    }
}

Wiring the interceptors into a producer (interceptors are a feature of the new org.apache.kafka.clients.producer.KafkaProducer client, so that client is used here rather than the old kafka.javaapi.producer.Producer, which ignores the interceptor.classes setting):

package com.paojiaojiang.interceptor;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/**
 * @Author: jja
 * @Description: A producer that sends messages through the configured interceptor chain
 * @Date: 2019/3/21 0:12
 */
public class InterceptorProducer implements Runnable {

    public static String TOPIC = "paojiaopjiang";

    private KafkaProducer<String, String> producer;

    public InterceptorProducer() {
        Properties props = new Properties();

        // Broker list used for bootstrapping metadata; not every broker has to be listed
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "spark:9092,spark1:9092,spark2:9092");

        // Key and value serializers for String messages
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Compression: "none" (default), "gzip" or "snappy"; the codec is recorded in the message,
        // so decompression on the consumer side is transparent
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");

        // Register the two interceptors; they are invoked in list order
        List<String> interceptors = new ArrayList<>();
        interceptors.add("com.paojiaojiang.interceptor.TimeInterceptor");   // timestamp interceptor
        interceptors.add("com.paojiaojiang.interceptor.CountInterceptor");  // counting interceptor
        props.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, interceptors);

        producer = new KafkaProducer<>(props);
    }
      
    @Override
    public void run() {
        for (int i = 1; i <= 3; i++) {          // three different keys, spread over (up to) three partitions
            for (int j = 0; j < 10; j++) {      // 10 messages per key
                // topic, key, value
                producer.send(new ProducerRecord<>(
                        TOPIC, "partition[----" + i + "]", "message[----The " + i + "------ message]" + TOPIC));
            }
            System.out.println(TOPIC);
        }

        // close() flushes outstanding sends and triggers CountInterceptor's summary output
        producer.close();
    }
    public static void main(String[] args) {
        Thread t = new Thread(new InterceptorProducer());
        t.start();
    }
}