Module: github.com/wekb/kafka-go
Version: 0.3.4
Repository: https://github.com/wekb/kafka-go.git
Documentation: pkg.go.dev

# README

kafka-go

Motivations

We rely on both Go and Kafka a lot at Segment. Unfortunately, the state of the Go client libraries for Kafka at the time of this writing was not ideal. The available options were:

  • sarama, which is by far the most popular but is quite difficult to work with. It is poorly documented, the API exposes low-level concepts of the Kafka protocol, and it doesn't support recent Go features like contexts. It also passes all values as pointers, which causes large numbers of dynamic memory allocations, more frequent garbage collections, and higher memory usage.

  • confluent-kafka-go is a cgo-based wrapper around librdkafka, which means it introduces a dependency on a C library in all Go code that uses the package. It has much better documentation than sarama but still lacks support for Go contexts.

  • goka is a more recent Kafka client for Go which focuses on a specific usage pattern. It provides abstractions for using Kafka as a message passing bus between services rather than an ordered log of events, but this is not the typical use case of Kafka for us at Segment. The package also depends on sarama for all interactions with Kafka.

This is where kafka-go comes into play. It provides both low and high level APIs for interacting with Kafka, mirroring concepts and implementing interfaces of the Go standard library to make it easy to use and integrate with existing software.

Kafka versions

kafka-go is currently compatible with Kafka versions from 0.10.1.0 to 2.1.0. While later versions should still work, some features available in the Kafka API may not be implemented yet.

Golang version

kafka-go requires Go version 1.12 or later. To use kafka-go with older versions of Go, use release v0.2.5.

Connection

The Conn type is the core of the kafka-go package. It wraps around a raw network connection to expose a low-level API to a Kafka server.

Here are some examples showing typical use of a connection object:

// to produce messages
topic := "my-topic"
partition := 0

conn, err := kafka.DialLeader(context.Background(), "tcp", "localhost:9092", topic, partition)
if err != nil {
    log.Fatal("failed to dial leader:", err)
}

conn.SetWriteDeadline(time.Now().Add(10*time.Second))
conn.WriteMessages(
    kafka.Message{Value: []byte("one!")},
    kafka.Message{Value: []byte("two!")},
    kafka.Message{Value: []byte("three!")},
)

conn.Close()

// to consume messages
topic := "my-topic"
partition := 0

conn, err := kafka.DialLeader(context.Background(), "tcp", "localhost:9092", topic, partition)
if err != nil {
    log.Fatal("failed to dial leader:", err)
}

conn.SetReadDeadline(time.Now().Add(10*time.Second))
batch := conn.ReadBatch(10e3, 1e6) // fetch 10KB min, 1MB max

b := make([]byte, 10e3) // 10KB max per message
for {
    n, err := batch.Read(b)
    if err != nil {
        break
    }
    fmt.Println(string(b[:n]))
}

batch.Close()
conn.Close()

Because it is low level, the Conn type turns out to be a great building block for higher level abstractions, like the Reader for example.
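
As an illustration, here is a minimal sketch, assuming the Conn.ReadPartitions method returns partition metadata for the given topics, that lists the partitions of a topic over a plain connection:

conn, err := kafka.Dial("tcp", "localhost:9092")
if err != nil {
    log.Fatal(err)
}
defer conn.Close()

// ReadPartitions returns the metadata of the requested topics
partitions, err := conn.ReadPartitions("my-topic")
if err != nil {
    log.Fatal(err)
}
for _, p := range partitions {
    fmt.Println(p.Topic, p.ID)
}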

Reader

A Reader is another concept exposed by the kafka-go package, intended to simplify the typical use case of consuming from a single topic-partition pair. A Reader also automatically handles reconnections and offset management, and exposes an API that supports asynchronous cancellations and timeouts using Go contexts.

// make a new reader that consumes from topic-A, partition 0, at offset 42
r := kafka.NewReader(kafka.ReaderConfig{
    Brokers:   []string{"localhost:9092"},
    Topic:     "topic-A",
    Partition: 0,
    MinBytes:  10e3, // 10KB
    MaxBytes:  10e6, // 10MB
})
r.SetOffset(42)

for {
    m, err := r.ReadMessage(context.Background())
    if err != nil {
        break
    }
    fmt.Printf("message at offset %d: %s = %s\n", m.Offset, string(m.Key), string(m.Value))
}

r.Close()

Consumer Groups

kafka-go also supports Kafka consumer groups including broker managed offsets. To enable consumer groups, simply specify the GroupID in the ReaderConfig.

ReadMessage automatically commits offsets when using consumer groups.

// make a new reader that consumes from topic-A
r := kafka.NewReader(kafka.ReaderConfig{
    Brokers:   []string{"localhost:9092"},
    GroupID:   "consumer-group-id",
    Topic:     "topic-A",
    MinBytes:  10e3, // 10KB
    MaxBytes:  10e6, // 10MB
})

for {
    m, err := r.ReadMessage(context.Background())
    if err != nil {
        break
    }
    fmt.Printf("message at topic/partition/offset %v/%v/%v: %s = %s\n", m.Topic, m.Partition, m.Offset, string(m.Key), string(m.Value))
}

r.Close()

There are a number of limitations when using consumer groups:

  • (*Reader).SetOffset will return an error when GroupID is set (see the sketch after this list)
  • (*Reader).Offset will always return -1 when GroupID is set
  • (*Reader).Lag will always return -1 when GroupID is set
  • (*Reader).ReadLag will return an error when GroupID is set
  • (*Reader).Stats will return a partition of -1 when GroupID is set
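
For instance, a minimal sketch of the first limitation: calling SetOffset on a reader that has a GroupID simply returns an error, because the broker manages offsets for consumer groups.

r := kafka.NewReader(kafka.ReaderConfig{
    Brokers: []string{"localhost:9092"},
    GroupID: "consumer-group-id",
    Topic:   "topic-A",
})

// seeking is rejected when a GroupID is configured
if err := r.SetOffset(42); err != nil {
    fmt.Println(err) // an error is expected here
}

r.Close()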

Explicit Commits

kafka-go also supports explicit commits. Instead of calling ReadMessage, call FetchMessage followed by CommitMessages.

ctx := context.Background()
for {
    m, err := r.FetchMessage(ctx)
    if err != nil {
        break
    }
    fmt.Printf("message at topic/partition/offset %v/%v/%v: %s = %s\n", m.Topic, m.Partition, m.Offset, string(m.Key), string(m.Value))
    if err := r.CommitMessages(ctx, m); err != nil {
        log.Fatal("failed to commit messages:", err)
    }
}
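
Because CommitMessages is variadic, you can also batch commits to trade commit latency for fewer round trips. A sketch, with an illustrative batch size of 100 messages:

ctx := context.Background()
pending := make([]kafka.Message, 0, 100)

for {
    m, err := r.FetchMessage(ctx)
    if err != nil {
        break
    }
    pending = append(pending, m)
    // commit once the batch is full, then reuse the slice
    if len(pending) == cap(pending) {
        if err := r.CommitMessages(ctx, pending...); err != nil {
            log.Fatal("failed to commit messages:", err)
        }
        pending = pending[:0]
    }
}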

Managing Commits

By default, CommitMessages will synchronously commit offsets to Kafka. For improved performance, you can instead periodically commit offsets to Kafka by setting CommitInterval on the ReaderConfig.

// make a new reader that consumes from topic-A
r := kafka.NewReader(kafka.ReaderConfig{
    Brokers:        []string{"localhost:9092"},
    GroupID:        "consumer-group-id",
    Topic:          "topic-A",
    MinBytes:       10e3, // 10KB
    MaxBytes:       10e6, // 10MB
    CommitInterval: time.Second, // flushes commits to Kafka every second
})

Writer

To produce messages to Kafka, a program may use the low-level Conn API, but the package also provides a higher level Writer type which is more appropriate to use in most cases as it provides additional features:

  • Automatic retries and reconnections on errors.
  • Configurable distribution of messages across available partitions.
  • Synchronous or asynchronous writes of messages to Kafka.
  • Asynchronous cancellation using contexts.
  • Flushing of pending messages on close to support graceful shutdowns.

// make a writer that produces to topic-A, using the least-bytes distribution
w := kafka.NewWriter(kafka.WriterConfig{
	Brokers:  []string{"localhost:9092"},
	Topic:    "topic-A",
	Balancer: &kafka.LeastBytes{},
})

w.WriteMessages(context.Background(),
	kafka.Message{
		Key:   []byte("Key-A"),
		Value: []byte("Hello World!"),
	},
	kafka.Message{
		Key:   []byte("Key-B"),
		Value: []byte("One!"),
	},
	kafka.Message{
		Key:   []byte("Key-C"),
		Value: []byte("Two!"),
	},
)

w.Close()

Note: Even though kafka.Message contains Topic and Partition fields, they MUST NOT be set when writing messages. They are intended for read use only.

Compatibility with other clients

Sarama

If you're switching from Sarama and need/want to use the same algorithm for message partitioning, you can use the kafka.Hash balancer. kafka.Hash routes messages to the same partitions that Sarama's default partitioner would route to.

w := kafka.NewWriter(kafka.WriterConfig{
	Brokers:  []string{"localhost:9092"},
	Topic:    "topic-A",
	Balancer: &kafka.Hash{},
})

librdkafka and confluent-kafka-go

Use the kafka.CRC32Balancer balancer to get the same behaviour as librdkafka's default consistent_random partition strategy.

w := kafka.NewWriter(kafka.WriterConfig{
	Brokers:  []string{"localhost:9092"},
	Topic:    "topic-A",
	Balancer: kafka.CRC32Balancer{},
})

Java

Use the kafka.Murmur2Balancer balancer to get the same behaviour as the canonical Java client's default partitioner. Note: the Java class allows you to directly specify the partition, which kafka-go does not permit.

w := kafka.NewWriter(kafka.WriterConfig{
	Brokers:  []string{"localhost:9092"},
	Topic:    "topic-A",
	Balancer: kafka.Murmur2Balancer{},
})

Compression

Compression can be enabled on the Writer by configuring the CompressionCodec:

// import "github.com/segmentio/kafka-go/snappy"
w := kafka.NewWriter(kafka.WriterConfig{
	Brokers:          []string{"localhost:9092"},
	Topic:            "topic-A",
	CompressionCodec: snappy.NewCompressionCodec(),
})

The Reader will determine whether the consumed messages are compressed by examining the message attributes. However, the package(s) for all expected codecs must be imported so that they get loaded correctly. For example, if you are going to be receiving messages compressed with Snappy, add the following import:

import _ "github.com/segmentio/kafka-go/snappy"
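
Putting that together, a minimal sketch of a consumer program whose imports register the snappy codec (presumably via RegisterCompressionCodec in the sub-package's init function):

package main

import (
    "context"
    "fmt"

    kafka "github.com/segmentio/kafka-go"
    _ "github.com/segmentio/kafka-go/snappy" // registers the snappy codec
)

func main() {
    r := kafka.NewReader(kafka.ReaderConfig{
        Brokers: []string{"localhost:9092"},
        Topic:   "topic-A",
    })
    defer r.Close()

    for {
        m, err := r.ReadMessage(context.Background())
        if err != nil {
            break
        }
        fmt.Println(string(m.Value))
    }
}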

TLS Support

For a bare-bones Conn type, or in the Reader/Writer configs, you can specify a dialer option for TLS support. If the TLS field is nil, it will not connect with TLS.

Connection

dialer := &kafka.Dialer{
    Timeout:   10 * time.Second,
    DualStack: true,
    TLS:       &tls.Config{...tls config...},
}

conn, err := dialer.DialContext(ctx, "tcp", "localhost:9093")
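
The ...tls config... placeholder stands for whatever tls.Config your deployment requires. As one common sketch, assuming a PEM-encoded CA certificate at a hypothetical path:

caCert, err := ioutil.ReadFile("/path/to/ca.pem") // hypothetical CA path
if err != nil {
    log.Fatal(err)
}
pool := x509.NewCertPool()
if !pool.AppendCertsFromPEM(caCert) {
    log.Fatal("failed to parse CA certificate")
}
tlsConfig := &tls.Config{RootCAs: pool}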

Reader

dialer := &kafka.Dialer{
    Timeout:   10 * time.Second,
    DualStack: true,
    TLS:       &tls.Config{...tls config...},
}

r := kafka.NewReader(kafka.ReaderConfig{
    Brokers:        []string{"localhost:9093"},
    GroupID:        "consumer-group-id",
    Topic:          "topic-A",
    Dialer:         dialer,
})

Writer

dialer := &kafka.Dialer{
    Timeout:   10 * time.Second,
    DualStack: true,
    TLS:       &tls.Config{...tls config...},
}

w := kafka.NewWriter(kafka.WriterConfig{
	Brokers:  []string{"localhost:9093"},
	Topic:    "topic-A",
	Balancer: &kafka.Hash{},
	Dialer:   dialer,
})

# Functions

Dial is a convenience wrapper for DefaultDialer.Dial.
DialContext is a convenience wrapper for DefaultDialer.DialContext.
DialLeader is a convenience wrapper for DefaultDialer.DialLeader.
DialPartition is a convenience wrapper for DefaultDialer.DialPartition.
LookupPartition is a convenience wrapper for DefaultDialer.LookupPartition.
LookupPartitions is a convenience wrapper for DefaultDialer.LookupPartitions.
NewClient creates and returns a *Client, taking a variadic list (...string) of bootstrap brokers for connecting to the cluster.
NewClientWith creates and returns a *Client.
NewConn returns a new kafka connection for the given topic and partition.
NewConnWith returns a new kafka connection configured with config.
NewConsumerGroup creates a new ConsumerGroup.
NewReader creates and returns a new Reader configured with config.
NewWriter creates and returns a new Writer configured with config.
RegisterCompressionCodec registers a compression codec so it can be used by a Writer.

# Constants

FirstOffset: the least recent offset available for a partition.
LastOffset: the most recent offset available for a partition.
SeekAbsolute: seek to an absolute offset.
SeekCurrent: seek relative to the current offset.
SeekDontCheck: this flag may be combined with either of the SeekAbsolute and SeekCurrent constants to skip the bound check that the connection would otherwise do.
SeekEnd: seek relative to the last offset available in the partition.
SeekStart: seek relative to the first offset available in the partition.

# Variables

DefaultClientID is the default value used as ClientID of kafka connections.
DefaultDialer is the default dialer used when none is specified.
ErrGenerationEnded is returned by the context.Context issued by the Generation's Start function when the context has been closed.
ErrGroupClosed is returned by ConsumerGroup.Next when the group has already been closed.

# Structs

A Batch is an iterator over a sequence of messages fetched from a kafka server.
Broker carries the metadata associated with a kafka broker.
Client is a new and experimental API for kafka-go.
ClientConfig holds the configuration for Client. N.B. ClientConfig is currently experimental! Therefore, it is subject to change, including breaking changes between MINOR and PATCH releases.
Conn represents a connection to a kafka broker.
ConnConfig is a configuration object used to create new instances of Conn.
ConsumerGroup models a Kafka consumer group.
ConsumerGroupConfig is a configuration object used to create new instances of ConsumerGroup.
CRC32Balancer is a Balancer that uses the CRC32 hash function to determine which partition to route messages to.
The Dialer type mirrors the net.Dialer API but is designed to open kafka connections instead of raw network connections.
DurationStats is a data structure that carries a summary of observed duration values.
Generation represents a single consumer group generation.
GroupMember describes a single participant in a consumer group.
Hash is a Balancer that uses the provided hash function to determine which partition to route messages to.
LeastBytes is a Balancer implementation that routes messages to the partition that has received the least amount of data.
Message is a data structure representing kafka messages.
Murmur2Balancer is a Balancer that uses the Murmur2 hash function to determine which partition to route messages to.
Partition carries the metadata associated with a kafka partition.
PartitionAssignment represents the starting state of a partition that has been assigned to a consumer.
RangeGroupBalancer groups consumers by partition. Example: 5 partitions, 2 consumers yields C0: [0, 1, 2] and C1: [3, 4]. Example: 6 partitions, 3 consumers yields C0: [0, 1], C1: [2, 3], and C2: [4, 5].
ReadBatchConfig is a configuration object used for reading batches of messages.
Reader provides a high-level API for consuming messages from kafka.
ReaderConfig is a configuration object used to create new instances of Reader.
ReaderStats is a data structure returned by a call to Reader.Stats that exposes details about the behavior of the reader.
RoundRobin is a Balancer implementation that equally distributes messages across all available partitions.
RoundrobinGroupBalancer divides partitions evenly among consumers. Example: 5 partitions, 2 consumers yields C0: [0, 2, 4] and C1: [1, 3]. Example: 6 partitions, 3 consumers yields C0: [0, 3], C1: [1, 4], and C2: [2, 5].
SummaryStats is a data structure that carries a summary of observed values.
TopicAndGroup pairs a ConsumerGroup and a Topic; as these are both strings, a dedicated type adds clarity when passing them to the Client as a function argument. N.B. TopicAndGroup is currently experimental! Therefore, it is subject to change, including breaking changes between MINOR and PATCH releases.
The Writer type provides the implementation of a producer of kafka messages that automatically distributes messages across partitions of a single topic using a configurable balancing policy.
WriterConfig is a configuration type used to create new instances of Writer.
WriterStats is a data structure returned by a call to Writer.Stats that exposes details about the behavior of the writer.

# Interfaces

The Balancer interface provides an abstraction of the message distribution logic used by Writer instances to route messages to the partitions available on a kafka cluster.
CompressionCodec represents a compression codec to encode and decode the messages.
GroupBalancer encapsulates the client side rebalancing logic.
Logger interface API for log.Logger.
The Resolver interface is used as an abstraction to provide service discovery of the hosts of a kafka cluster.

# Type aliases

BalancerFunc is an implementation of the Balancer interface that makes it possible to use regular functions to distribute messages across partitions (see the sketch at the end of this section).
Error represents the different error codes that may be returned by kafka.
GroupMemberAssignments holds MemberID => topic => partitions.
LoggerFunc is a bridge between Logger and any third party logger. Usage:

l := NewLogger() // some logger
r := kafka.NewReader(kafka.ReaderConfig{
    Logger:      kafka.LoggerFunc(l.Infof),
    ErrorLogger: kafka.LoggerFunc(l.Errorf),
})
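
As the sketch referenced under BalancerFunc above, here is a purely illustrative balancer that always routes to the first available partition:

b := kafka.BalancerFunc(func(msg kafka.Message, partitions ...int) int {
    return partitions[0] // always pick the first available partition
})

w := kafka.NewWriter(kafka.WriterConfig{
    Brokers:  []string{"localhost:9092"},
    Topic:    "topic-A",
    Balancer: b,
})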