Kafka Cluster Testing Tool

A deterministic smoke test for Kafka cluster readiness, with a practical focus on Kerberos-secured environments.

Compatibility

Supported Apache Kafka versions: 3.0.0 through 4.2.0

What it verifies

The utility checks network reachability, the client connection, temporary topic creation, the durable produce path, topic metadata consistency, and a best-effort read through a consumer.

When to use it

Typical use cases include cluster commissioning, Kerberos and ACL verification, version upgrade validation, dual-mode validation, and migration smoke tests.

Output

The tool writes a full execution log, a human-readable report, and structured JSON files suitable for automation and CI checks.

Packages

kafka-cluster-testing-with-jdk11

This package bundles JDK 11. After extraction, fill in the configuration and run the script.

kafka-cluster-testing-without-jdk11

This package does not ship with a JDK. Before running it, either export JAVA_HOME or set the actual Java path in run.sh.

shell
export JAVA_HOME=/path/to/JDK11
./run.sh
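
Before starting the package without a bundled JDK, it can help to confirm that JAVA_HOME points at a usable Java installation. The following is a minimal sketch, not part of the package: the function name check_java_home is hypothetical, and it only verifies that an executable bin/java exists, not that it is JDK 11.

```shell
#!/bin/sh
# Hypothetical pre-flight helper; not shipped with the tool.
# Returns 0 when the given directory contains an executable bin/java.
check_java_home() {
    java_dir="$1"
    if [ -z "$java_dir" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$java_dir/bin/java" ]; then
        echo "No executable found at $java_dir/bin/java" >&2
        return 1
    fi
    return 0
}

# Usage (assumes run.sh is in the current directory):
# check_java_home "$JAVA_HOME" && ./run.sh
```

Failing early this way gives a clearer message than a launcher error from run.sh.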

Step-by-step configuration

1. Extract the archive

Choose the package that fits your environment. The package with the JDK is ready to start after configuration. The package without the JDK requires an existing JDK 11 installation.

2. Open the main configuration file

Main configuration file: conf/kafka-cluster-testing.toml

3. Fill in the bootstrap servers

At minimum, specify the Kafka bootstrap servers. This is enough for a non-Kerberos smoke test.

kafka-cluster-testing.toml
[kafka]
bootstrap_servers = "kafka01.example.com:9093,kafka02.example.com:9093,kafka03.example.com:9093"
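
The bootstrap_servers value is a comma-separated list of host:port entries. As a rough sketch for manual pre-checks, the hypothetical helper below (split_bootstrap is not part of the tool) splits such a list into one "host port" pair per line. It assumes plain hostnames or IPv4 addresses, not bracketed IPv6 literals.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with the tool.
# Prints "host port" for each entry of a comma-separated bootstrap list.
split_bootstrap() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        # Skip empty entries caused by stray commas.
        [ -n "$entry" ] || continue
        host=${entry%:*}    # everything before the last colon
        port=${entry##*:}   # everything after the last colon
        printf '%s %s\n' "$host" "$port"
    done
}

# Usage:
# split_bootstrap "kafka01.example.com:9093,kafka02.example.com:9093"
```

Each output line can then be passed to whatever reachability probe is available on the host.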

4. Choose the authentication mode

For a plain cluster, use security_protocol = "PLAINTEXT" and auth.mode = "none". For a Kerberos-secured cluster, use SASL_PLAINTEXT or SASL_SSL together with auth.mode = "auto" or "kerberos".

5. Provide the principal and keytab for Kerberos

When the test targets a Kerberos-secured Kafka cluster, provide a valid krb5.conf, principal, and keytab. In practice, the most important fields are the bootstrap servers, the principal, and the keytab path.

kafka-cluster-testing.toml
[kafka]
bootstrap_servers = "kafka01.example.com:9093,kafka02.example.com:9093,kafka03.example.com:9093"
security_protocol = "SASL_PLAINTEXT"

[auth]
mode = "auto"

[auth.kerberos]
krb5_conf = "/etc/krb5.conf"
principal = "kafka/client01.example.com@EXAMPLE.COM"
keytab = "/etc/security/keytabs/kafka-client.keytab"
sasl_mechanism = "GSSAPI"
sasl_kerberos_service_name = "kafka"

6. Run the utility

Without arguments, the utility uses conf/kafka-cluster-testing.toml. You can also pass an explicit path to the configuration file.

shell
./run.sh

shell
./run.sh /path/to/kafka-cluster-testing.toml

Configuration modes

Without Kerberos: use PLAINTEXT with auth.mode = "none".
With Kerberos: use SASL_PLAINTEXT or SASL_SSL with auth.mode = "auto" or "kerberos".
Invalid configuration: a SASL_* security_protocol together with auth.mode = "none" is rejected, and the utility exits with a security/configuration error.
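
The one combination documented as invalid can be sketched as a small shell check. This is an illustration under stated assumptions, not the tool's actual validation logic: the function name check_auth_combo is hypothetical, and the return value 3 simply mirrors the documented security/configuration exit code. Combinations the text does not mention are accepted by this sketch.

```shell
#!/bin/sh
# Hypothetical helper; the tool's real validation may differ.
# Returns 3 when a SASL_* protocol is combined with auth.mode "none",
# and 0 for every other combination.
check_auth_combo() {
    protocol="$1"
    mode="$2"
    case "$protocol" in
        SASL_*)
            if [ "$mode" = "none" ]; then
                echo "invalid: $protocol requires auth.mode auto or kerberos" >&2
                return 3
            fi
            ;;
    esac
    return 0
}

# Usage:
# check_auth_combo "SASL_PLAINTEXT" "auto"   # accepted
# check_auth_combo "SASL_SSL" "none"         # rejected with status 3
```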

Exit codes and files

0: Cluster is operational.
2: Technical failure.
3: Security or configuration failure.
Output files: out/run.log, out/report.txt, and out/*.json.
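
In a CI job, the exit codes above can be turned into a readable status line. The sketch below is one possible wrapper, not something shipped with the tool: describe_exit_code is a hypothetical name, and any code other than 0, 2, or 3 is reported as undocumented rather than interpreted.

```shell
#!/bin/sh
# Hypothetical helper; maps the documented exit codes to a short message.
describe_exit_code() {
    case "$1" in
        0) echo "cluster is operational" ;;
        2) echo "technical failure" ;;
        3) echo "security or configuration failure" ;;
        *) echo "undocumented exit code $1" ;;
    esac
}

# Usage (assumes run.sh is in the current directory):
# ./run.sh; rc=$?
# echo "kafka-cluster-testing: $(describe_exit_code "$rc")"
# exit "$rc"
```

Passing the original code through with exit "$rc" lets the CI system still distinguish technical failures from security or configuration failures.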

Example report

The report below uses example domains instead of real hostnames.

report.txt
====================================================================
                 KAFKA CLUSTER TEST REPORT
====================================================================
RESULT:    OK
exitCode:  0
totalMs:   36449

Run:
  runId:      20260323-202126-kafka01-dev01.example.com
  startedAt:  2026-03-23T17:21:26.493501Z

Connection:
  protocol:   SASL_PLAINTEXT
  authMode:   auto
  bootstrap:
    - kafka01.example.com:9093
    - kafka02.example.com:9093
    - kafka03.example.com:9093

Test objects:
  consumerGroup:
    - kct.20260323-202126-kafka01-dev01.example.com
  topics:
    - kct.20260323-202126-kafka01-dev01.example.com.1
    - kct.20260323-202126-kafka01-dev01.example.com.2
    - kct.20260323-202126-kafka01-dev01.example.com.3

Steps:
  STATUS  STEP                 TOOK_MS  DETAILS
  ------  -------------------  -------  ----------------------------------------
  OK      AUTH_RESOLVE               0  mode=auto protocol=SASL_PLAINTEXT
                                        principal=kafka/client01.example.com@EXAMPLE.COM
                                        keytab=/etc/security/keytabs/kafka-client.keytab
  OK      CONNECT                  829  clusterId=INYl1VqlQx281uuTzdGDDw nodes=7
  OK      CREATE_TOPICS            902  created=[kct.20260323-202126-kafka01-dev01.example.com.1,
                                        kct.20260323-202126-kafka01-dev01.example.com.2,
                                        kct.20260323-202126-kafka01-dev01.example.com.3]
  OK      WAIT_TOPICS_READY         47  topicsReady=3
  OK      VALIDATE_TOPICS           11  partitions=1 replicationFactor=3
  OK      PRODUCE                 1763  topics=3 messagesPerTopic=5 total=15
  OK      WAIT_END_OFFSETS         160  endOffsetsReached>=5
  WARN    CONSUME_ASSIGN             0  assigned=[]
  SKIP    CONSUME                 5087  consumer_group_not_assigned_within_timeout: assigned=[] (non-fatal for
                                        cluster health)
  OK      GROUP_OFFSETS            220  reason=post_run mode=no_commits partitions=3;
                                        kct.20260323-202126-kafka01-dev01.example.com.1-0
                                        committed=-1 end=5 lag=-1;
                                        kct.20260323-202126-kafka01-dev01.example.com.2-0
                                        committed=-1 end=5 lag=-1;
                                        kct.20260323-202126-kafka01-dev01.example.com.3-0
                                        committed=-1 end=5 lag=-1 totalLag=0
  OK      CLEANUP_DELETE_TOPI      377  deleted=[kct.20260323-202126-kafka01-dev01.example.com.1,
                                        kct.20260323-202126-kafka01-dev01.example.com.2,
                                        kct.20260323-202126-kafka01-dev01.example.com.3]

====================================================================

The CONSUME step is best-effort. If a consumer group assignment is not obtained within the timeout, the step may be marked as SKIP while the overall smoke test remains successful.
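
Because a run can succeed with WARN or SKIP steps, it may be useful to surface them separately from the exit code. The following sketch assumes the report layout shown in the example above, where each step line starts with a status followed by the step name. The function name list_soft_steps is hypothetical, and the layout of a real report may differ between versions.

```shell
#!/bin/sh
# Hypothetical helper; relies on the example report layout, where a step
# line has the status in the first column and the step name in the second.
list_soft_steps() {
    awk '$1 == "WARN" || $1 == "SKIP" { print $1, $2 }' "$1"
}

# Usage:
# list_soft_steps out/report.txt
```

Continuation lines in the DETAILS column do not start with a status word, so they are ignored.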

From cluster validation to daily operations

Kafka Cluster Testing can be used as a companion utility for KafkaKombat. First validate Kafka cluster readiness, Kerberos access, and the basic operational path with Kafka Cluster Testing. Then operate the cluster through KafkaKombat for topic administration, consumer group work, message browsing, lag analysis, and controlled administrative flows.
