So far, we have set up a Kafka cluster with an optimal configuration. Before asking developers to start their own testing, it’s time to run a performance test.
To performance test or benchmark a Kafka cluster, we need to consider two aspects:
- Performance at Producer End
- Performance at Consumer End
We need to test both the producer and the consumer so that we know how many messages a producer can produce and a consumer can consume in a given time.
The key stats we are looking for are listed below:
- Throughput by data volume (MB/sec).
- Throughput by message count (messages/sec).
- Total messages.
- Total data.
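These four stats are tied together by the elapsed time of a run: each throughput figure is simply a total divided by the elapsed seconds. A minimal sketch of that relationship, assuming you already know the totals and the run time; the `throughput` helper and the sample numbers are made up for illustration and are not part of Kafka:

```shell
#!/bin/sh
# throughput: hypothetical helper (not part of Kafka).
# Args: total_messages total_mb elapsed_seconds
# Prints both throughput figures derived from the totals.
throughput() {
  awk -v n="$1" -v mb="$2" -v s="$3" 'BEGIN {
    printf "MB.sec=%.4f nMsg.sec=%.4f\n", mb / s, n / s
  }'
}

# Made-up example: 5000 messages totalling 0.5 MB in 2 seconds.
throughput 5000 0.5 2
```

With these made-up inputs the helper prints `MB.sec=0.2500 nMsg.sec=2500.0000`.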
Kafka ships with scripts for performance testing. Switch to the kafka/bin directory and look for these files:
- kafka-producer-perf-test.sh
- kafka-consumer-perf-test.sh
To see the help text for either shell script (perf tool), just type
sh ./kafka-producer-perf-test.sh --help
sh ./kafka-consumer-perf-test.sh --help
for the producer and consumer respectively.
Testing the Producer
Let’s test our producer by sending just 1000 messages to the test topic. Copy and paste this command into your terminal and press Enter. Make sure you are in the Kafka directory.
Let’s walk through the command line by line:
- The first parameter is “broker-list”: the Kafka broker to which we want to produce our messages. You can pass multiple brokers separated by commas.
- The second parameter is “topic”, which is required. Pass the name of the topic you want to produce messages into.
- The third is the number of messages to produce and send when collecting the stats; we set it to 1000 for our first scenario.
- The fourth is the timeout, in milliseconds. We set it to a high limit so that the script does not throw a timeout error.
Once the test completes, some stats are printed to the terminal, something like this:
end.time : 2016-02-05 21:38:28:449
compression : 0
message.size : 1000
batch.size : 2000
total.data.sent.in.MB : 0.10
MB.sec : 0.0269
total.data.sent.in.nMsg : 1000
nMsg.sec : 281.6901
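The sample output does not show how long the run took, but that can be recovered from the reported figures: elapsed seconds is the total message count divided by nMsg.sec. A rough sketch using awk, with the values copied from the sample output above; the `elapsed_seconds` helper is a hypothetical name, not something Kafka provides:

```shell
#!/bin/sh
# elapsed_seconds: hypothetical helper (not part of Kafka).
# Args: total_messages messages_per_second
# Prints the implied run time in seconds, to two decimals.
elapsed_seconds() {
  awk -v n="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", n / rate }'
}

# Values taken from the sample producer output:
# total.data.sent.in.nMsg = 1000, nMsg.sec = 281.6901
elapsed_seconds 1000 281.6901
```

For the sample run this gives about 3.55 seconds. Multiplying that by the reported MB.sec of 0.0269 gives roughly 0.095 MB, which is consistent with the reported total of 0.10 MB after rounding.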
Testing the Consumer
Let’s look at the command we can use to performance test the consumer.
Once the test is completed, you can see the stats in the terminal.
end.time : 2019-02-05 11:29:46:854
fetch.size : 1048576
data.consumed.in.MB : 0.0954
MB.sec : 1.9869
data.consumed.in.nMsg : 1001
nMsg.sec : 20854.1667
Using these stats, we can tune the batch size, the message size, and the maximum number of messages that can be produced or consumed for a given configuration.
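One simple way to compare configurations is to repeat the test with different settings, note the nMsg.sec figure from each run, and pick the highest. The sketch below assumes you have collected the results by hand as lines of "label rate"; the `best_config` helper, the labels, and the numbers are all made up for illustration:

```shell
#!/bin/sh
# best_config: hypothetical helper (not part of Kafka).
# Reads lines of "<label> <nMsg.sec>" on stdin and prints the label
# of the run with the highest messages-per-second figure.
best_config() {
  awk '$2 + 0 > best { best = $2 + 0; label = $1 } END { print label }'
}

# Made-up results for three batch sizes; replace with your own runs.
printf '%s\n' \
  'batch-200 281.69' \
  'batch-1000 950.10' \
  'batch-2000 910.45' | best_config
```

With the made-up figures above, the helper prints `batch-1000`. The same comparison can be done on MB.sec if data volume matters more than message count for your workload.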
This article is part of a series. Check out the other articles here:
1: What is Kafka
2: Setting Up Zookeeper Cluster for Kafka in AWS EC2
3: Setting up Multi-Broker Kafka in AWS EC2
4: Setting up Authentication in Multi-broker Kafka cluster in AWS EC2
5: Setting up Kafka management for Kafka cluster
6: Capacity Estimation for Kafka Cluster in production
7: Performance testing Kafka cluster