KRaft mode (experimental)
| The Kafka KRaft mode is currently experimental and subject to change. |
Apache Kafka’s KRaft mode replaces Apache ZooKeeper with Kafka’s own built-in consensus mechanism based on the Raft protocol. This simplifies Kafka’s architecture, reducing operational complexity by consolidating cluster metadata management into Kafka itself.
| The Stackable Operator for Apache Kafka currently does not support automatic cluster upgrades from Apache ZooKeeper to KRaft. |
Overview
- Introduced: Kafka 2.8.0 (early preview, not production-ready).
- Matured: Kafka 3.3.x (production-ready, though ZooKeeper is still supported).
- Default & Recommended: Kafka 3.5+ strongly recommends KRaft for new clusters.
- Full Replacement: Kafka 4.0.0 (2025) removes ZooKeeper completely.
- Migration: Tools exist to migrate from ZooKeeper to KRaft, but new deployments should start with KRaft.
Configuration
The Stackable Kafka operator introduces a new role in the KafkaCluster CRD called KRaft Controller.
Configuring this role puts Kafka into KRaft mode; Apache ZooKeeper is then no longer required.
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
metadata:
name: kafka
spec:
image:
productVersion: "3.9.1"
brokers:
roleGroups:
default:
replicas: 1
controllers:
roleGroups:
default:
replicas: 3
| Using spec.controllers is mutually exclusive with spec.clusterConfig.zookeeperConfigMapName. |
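As a quick check after applying such a manifest, you can list the pods the operator creates for the two roles. The label selector below is an assumption based on the standard app.kubernetes.io/instance label being set to the KafkaCluster name (kafka in the example above); adjust it to your setup:
# Expect one broker pod and three controller pods, e.g. kafka-controller-default-0 to -2.
kubectl get pods -l app.kubernetes.io/instance=kafka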
Recommendations
A minimal KRaft setup consisting of at least 3 Controllers has the following resource requirements:
- 600m CPU request
- 3000m CPU limit
- 3000Mi memory request and limit
- 6Gi persistent storage
| The total number of Controller replicas should be odd so that the Raft consensus can always reach a majority. |
Resources
Corresponding to the values above, the operator uses the following resource defaults:
controllers:
config:
resources:
memory:
limit: 1Gi
cpu:
min: 250m
max: 1000m
storage:
logDirs:
capacity: 2Gi
Overrides
The configuration of overrides, JVM arguments etc. is similar to the Broker role and is documented on the concepts page.
Internal operator details
KRaft mode requires major configuration changes compared to ZooKeeper:
- cluster-id: This is set to the metadata.name of the KafkaCluster resource during initial formatting.
- node.id: This is a calculated integer, hashed from the role and role group, with the replica id added.
- process.roles: Will always be either broker or controller. Mixed broker,controller servers are not supported.
- The operator configures a static voter list containing the controller pods. Controllers are not dynamically managed.
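To inspect what the operator actually renders for a controller, you can look at the properties file inside a controller pod. The pod name, container name and path below are assumptions based on the conventions used elsewhere on this page (a cluster named kafka, a kafka container, configuration under /stackable/config); adjust them to your deployment:
# Show the KRaft-related settings generated for the first controller,
# including the node.id and the static controller.quorum voter configuration.
kubectl exec -c kafka kafka-controller-default-0 -- \
  sh -c 'grep -E "process.roles|node.id|controller.quorum" /stackable/config/*.properties'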
Known Issues
- Automatic migration from Apache ZooKeeper to KRaft is not supported.
- Scaling controller replicas might lead to unstable clusters.
- Kerberos is currently not supported for KRaft in all versions.
Troubleshooting
Frequent leader elections
Likely caused by controller resource starvation or unstable Kubernetes scheduling.
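To check whether the quorum is actually unstable, you can describe it with the kafka-metadata-quorum.sh tool that ships with Kafka. The bootstrap address and client configuration below are taken from the migration example further down this page; substitute the values of your own cluster:
# Describe the current quorum leader and epoch.
# An epoch that keeps increasing over a short period indicates frequent leader elections.
/stackable/kafka/bin/kafka-metadata-quorum.sh \
  --command-config /stackable/config/client.properties \
  --bootstrap-server simple-kafka-broker-default-0-listener-broker.kraft-migration.svc.cluster.local:9093 \
  describe --status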
Migration issues (ZooKeeper to KRaft)
Ensure Kafka version 3.9.x or higher and follow the official migration documentation. The Stackable Kafka operator does not support automatic migration; see the KRaft migration guide below for the manual procedure.
Scaling issues
Dynamic controller scaling is only supported from Kafka version 3.9.0 onwards. On older versions, adding or removing controller replicas may not work properly.
KRaft migration guide
The operator version 26.3.0 adds support for migrating Kafka clusters from ZooKeeper to KRaft mode.
This guide describes the steps required to migrate an existing Kafka cluster managed by the Stackable Kafka operator from ZooKeeper to KRaft mode.
| Before starting the migration, we recommend reducing producer/consumer operations to a minimum, or pausing them completely if possible, to reduce the risk of data loss during the migration. |
To make the migration steps as clear as possible, we’ll use a complete working example throughout this guide. The example cluster is kept minimal, without any additional configuration.
We’ll use Kafka version 3.9.1, as it is the last version of the 3.x Kafka series that runs in ZooKeeper mode and is supported by the SDP.
We’ll also assign broker ids manually from the beginning to simplify this guide. In a real-world scenario you do not have this option, because your cluster is already running; instead, you’ll have to collect the existing ids and configure manual assignment in the second step of the migration.
We start by creating a dedicated namespace to work in and deploying the Kafka cluster, including ZooKeeper and credentials.
---
apiVersion: v1
kind: Namespace
metadata:
labels:
stackable.tech/vendor: Stackable
name: kraft-migration
---
apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperCluster
metadata:
name: simple-zk
namespace: kraft-migration
spec:
image:
productVersion: 3.9.4
pullPolicy: IfNotPresent
servers:
roleGroups:
default:
replicas: 1
---
apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperZnode
metadata:
name: simple-kafka-znode
namespace: kraft-migration
spec:
clusterRef:
name: simple-zk
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
name: kafka-internal-tls
spec:
backend:
autoTls:
ca:
secret:
name: secret-provisioner-kafka-internal-tls-ca
namespace: kraft-migration
autoGenerate: true
---
apiVersion: authentication.stackable.tech/v1alpha1
kind: AuthenticationClass
metadata:
name: kafka-client-auth-tls
spec:
provider:
tls:
clientCertSecretClass: kafka-client-auth-secret
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
name: kafka-client-auth-secret
spec:
backend:
autoTls:
ca:
secret:
name: secret-provisioner-tls-kafka-client-ca
namespace: kraft-migration
autoGenerate: true
---
apiVersion: v1
kind: ConfigMap
metadata:
name: broker-ids
namespace: kraft-migration
data:
simple-kafka-broker-default-0: "2000"
simple-kafka-broker-default-1: "2001"
simple-kafka-broker-default-2: "2002"
---
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
metadata:
name: simple-kafka
namespace: kraft-migration
spec:
image:
productVersion: 3.9.1
pullPolicy: IfNotPresent
clusterConfig:
metadataManager: zookeeper
brokerIdPodConfigMapName: broker-ids
authentication:
- authenticationClass: kafka-client-auth-tls
tls:
internalSecretClass: kafka-internal-tls
serverSecretClass: tls
zookeeperConfigMapName: simple-kafka-znode
brokers:
roleGroups:
default:
replicas: 3
Next, from one of the broker pods, we will create a topic called kraft-migration-topic with 3 partitions to verify the migration later.
$ /stackable/kafka/bin/kafka-topics.sh \
--create \
--command-config /stackable/config/client.properties \
--bootstrap-server simple-kafka-broker-default-0-listener-broker.kraft-migration.svc.cluster.local:9093 \
--partitions 3 \
--topic kraft-migration-topic
And - also from one of the broker pods - publish some test messages to it:
$ /stackable/kafka/bin/kafka-producer-perf-test.sh \
--producer.config /stackable/config/client.properties \
--producer-props bootstrap.servers=simple-kafka-broker-default-0-listener-broker.kraft-migration.svc.cluster.local:9093 \
--topic kraft-migration-topic \
--payload-monotonic \
--throughput 1 \
--num-records 100
We now have a working Kafka cluster with ZooKeeper and some test data.
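Before changing anything, it can be useful to read these messages back once as a baseline to compare against after the migration. The command below mirrors the consumer invocation used at the end of this guide and is run from one of the broker pods:
# Read the test messages from partition 0 of kraft-migration-topic as a pre-migration baseline.
/stackable/kafka/bin/kafka-console-consumer.sh \
  --consumer.config /stackable/config/client.properties \
  --bootstrap-server simple-kafka-broker-default-0-listener-broker.kraft-migration.svc.cluster.local:9093 \
  --topic kraft-migration-topic \
  --offset earliest \
  --partition 0 \
  --timeout-ms 10000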
1. Start KRaft controllers
In this step we will perform the following actions:
- Retrieve the current cluster.id as generated by Kafka.
- Retrieve and store the current broker ids.
- Update the KafkaCluster resource to add the spec.controllers property.
- Configure the controllers to run in migration mode.
- Apply the changes and wait for all cluster pods to become ready.
We can obtain the current cluster.id either by inspecting the ZooKeeper data or from the meta.properties file on one of the brokers.
$ kubectl -n kraft-migration exec -c kafka simple-kafka-broker-default-0 -- cat /stackable/data/topicdata/meta.properties | grep cluster.id
cluster.id=MyCya7hbTD-Hay8PgCsCYA
We add this value to the KAFKA_CLUSTER_ID environment variable for both brokers and controllers.
To be able to migrate the existing brokers, we need to preserve their broker ids.
Similarly to the cluster id, we can obtain the broker ids from the meta.properties file on each broker pod.
$ kubectl -n kraft-migration exec -c kafka simple-kafka-broker-default-0 -- cat /stackable/data/topicdata/meta.properties | grep broker.id
broker.id=2000
We then need to inform the operator to use these ids instead of generating new ones.
This is done by creating a ConfigMap containing the id mapping and pointing the spec.clusterConfig.brokerIdPodConfigMapName property of the KafkaCluster resource to it.
Both of these settings (the KAFKA_CLUSTER_ID environment variable and the broker id ConfigMap) must be preserved for the rest of the migration process and for the lifetime of the cluster.
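If you did not pre-assign the ids as we did in this example, a small loop like the sketch below collects the generated ids from each broker pod and creates the ConfigMap in one step. It assumes the three broker pod names used throughout this guide:
for i in 0 1 2; do
  pod="simple-kafka-broker-default-$i"
  # Read the broker.id that Kafka generated for this pod.
  id=$(kubectl -n kraft-migration exec -c kafka "$pod" -- \
    sh -c 'grep broker.id /stackable/data/topicdata/meta.properties | cut -d= -f2')
  args="$args --from-literal=$pod=$id"
done
# Build the broker-ids ConfigMap that the operator will use for manual id assignment.
kubectl -n kraft-migration create configmap broker-ids $args \
  --dry-run=client -o yaml | kubectl -n kraft-migration apply -f -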
The complete example KafkaCluster resource after applying the required changes looks as follows:
---
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
metadata:
name: simple-kafka
namespace: kraft-migration
spec:
image:
productVersion: 3.9.1
pullPolicy: IfNotPresent
clusterConfig:
metadataManager: zookeeper
authentication:
- authenticationClass: kafka-client-auth-tls
tls:
internalSecretClass: kafka-internal-tls
serverSecretClass: tls
zookeeperConfigMapName: simple-kafka-znode
brokerIdPodConfigMapName: broker-ids
brokers:
envOverrides:
KAFKA_CLUSTER_ID: "MyCya7hbTD-Hay8PgCsCYA"
roleGroups:
default:
replicas: 3
controllers:
roleGroups:
default:
replicas: 3
envOverrides:
KAFKA_CLUSTER_ID: "MyCya7hbTD-Hay8PgCsCYA"
configOverrides:
controller.properties:
zookeeper.metadata.migration.enable: "true" # Enable migration mode so the controller can read metadata from ZooKeeper.
We kubectl apply the updated resource and wait for brokers and controllers to become ready.
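Assuming the updated manifest is saved in a file named kafka.yaml (an arbitrary name for this example), applying and waiting could look like this:
# Apply the updated KafkaCluster resource.
kubectl -n kraft-migration apply -f kafka.yaml

# Wait for all broker and controller pods to report Ready.
# The label selector assumes the operator sets app.kubernetes.io/instance to the cluster name.
kubectl -n kraft-migration wait pod \
  -l app.kubernetes.io/instance=simple-kafka \
  --for=condition=Ready --timeout=10m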
2. Migrate metadata
In this step we will perform the following actions:
- Configure the controller quorum on the brokers.
- Enable metadata migration mode on the brokers.
- Apply the changes and restart the broker pods.
To start the metadata migration, we need to add zookeeper.metadata.migration.enable: "true" and the controller quorum configuration to the broker configuration.
For this step, the complete example KafkaCluster resource looks as follows:
---
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
metadata:
name: simple-kafka
namespace: kraft-migration
spec:
image:
productVersion: 3.9.1
pullPolicy: IfNotPresent
clusterConfig:
metadataManager: zookeeper
authentication:
- authenticationClass: kafka-client-auth-tls
tls:
internalSecretClass: kafka-internal-tls
serverSecretClass: tls
zookeeperConfigMapName: simple-kafka-znode
brokerIdPodConfigMapName: broker-ids
brokers:
envOverrides:
KAFKA_CLUSTER_ID: "MyCya7hbTD-Hay8PgCsCYA"
roleGroups:
default:
replicas: 3
configOverrides:
broker.properties:
inter.broker.protocol.version: "3.9" # - Latest value known to Kafka 3.9.1
zookeeper.metadata.migration.enable: "true" # - Enable migration mode so the broker can participate in metadata migration.
controller.listener.names: "CONTROLLER"
controller.quorum.bootstrap.servers: "simple-kafka-controller-default-0.simple-kafka-controller-default-headless.kraft-migration.svc.cluster.local:9093,simple-kafka-controller-default-1.simple-kafka-controller-default-headless.kraft-migration.svc.cluster.local:9093,simple-kafka-controller-default-2.simple-kafka-controller-default-headless.kraft-migration.svc.cluster.local:9093"
controllers:
roleGroups:
default:
replicas: 3
envOverrides:
KAFKA_CLUSTER_ID: "MyCya7hbTD-Hay8PgCsCYA"
configOverrides:
controller.properties:
zookeeper.metadata.migration.enable: "true" # Enable migration mode so the controller can read metadata from ZooKeeper.
The brokers are now restarted automatically due to configuration changes.
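You can follow the rolling restart before checking the logs:
# Watch the pods cycle until all brokers are Running and Ready again.
kubectl -n kraft-migration get pods -w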
Finally we check that metadata migration was successful:
kubectl logs -n kraft-migration simple-kafka-controller-default-0 | grep -i 'completed migration' \
|| kubectl logs -n kraft-migration simple-kafka-controller-default-1 | grep -i 'completed migration' \
|| kubectl logs -n kraft-migration simple-kafka-controller-default-2 | grep -i 'completed migration'
...
[2025-12-22 09:23:53,372] INFO [KRaftMigrationDriver id=2110489705] Completed migration of metadata from ZooKeeper to KRaft. 0 records were generated in 102 ms across 0 batches. The average time spent waiting on a batch was -1.00 ms. The record types were {}. The current metadata offset is now 280 with an epoch of 3. Saw 0 brokers in the migrated metadata []. (org.apache.kafka.metadata.migration.KRaftMigrationDriver)
3. Migrate brokers
| This is the last step before fully switching to KRaft mode. In case of unforeseen issues, it is the last step where we can roll back to ZooKeeper mode. |
In this step we will perform the following actions:
- Remove the migration properties from the previous step on the brokers.
- Assign KRaft role properties to the brokers.
- Apply the changes and restart the broker pods.
We need to preserve the quorum configuration added in the previous step.
For this step, the complete example KafkaCluster resource looks as follows:
---
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
metadata:
name: simple-kafka
namespace: kraft-migration
spec:
image:
productVersion: 3.9.1
pullPolicy: IfNotPresent
clusterConfig:
metadataManager: zookeeper
authentication:
- authenticationClass: kafka-client-auth-tls
tls:
internalSecretClass: kafka-internal-tls
serverSecretClass: tls
zookeeperConfigMapName: simple-kafka-znode
brokerIdPodConfigMapName: broker-ids
brokers:
envOverrides:
KAFKA_CLUSTER_ID: "MyCya7hbTD-Hay8PgCsCYA"
roleGroups:
default:
replicas: 3
configOverrides:
broker.properties:
controller.listener.names: "CONTROLLER"
controller.quorum.bootstrap.servers: "simple-kafka-controller-default-0.simple-kafka-controller-default-headless.kraft-migration.svc.cluster.local:9093,simple-kafka-controller-default-1.simple-kafka-controller-default-headless.kraft-migration.svc.cluster.local:9093,simple-kafka-controller-default-2.simple-kafka-controller-default-headless.kraft-migration.svc.cluster.local:9093"
process.roles: "broker"
node.id: "${env:REPLICA_ID}"
controllers:
roleGroups:
default:
replicas: 3
envOverrides:
KAFKA_CLUSTER_ID: "MyCya7hbTD-Hay8PgCsCYA"
configOverrides:
controller.properties:
zookeeper.metadata.migration.enable: "true" # Enable migration mode so the controller can read metadata from ZooKeeper.
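Once the brokers have restarted with process.roles set, they join the KRaft quorum as observers. One way to confirm this, reusing the client configuration and listener address from earlier and running from one of the broker pods, is to describe the quorum replication state:
# The three controllers should appear as voters and the three brokers as observers.
/stackable/kafka/bin/kafka-metadata-quorum.sh \
  --command-config /stackable/config/client.properties \
  --bootstrap-server simple-kafka-broker-default-0-listener-broker.kraft-migration.svc.cluster.local:9093 \
  describe --replication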
4. Enable KRaft mode
After this step, the cluster will be fully running in KRaft mode and it cannot be rolled back to ZooKeeper mode anymore.
In this step we will perform the following actions:
- Put the cluster in KRaft mode by updating the spec.clusterConfig.metadataManager property.
- Remove the KRaft quorum configuration from the broker pods.
- Remove the ZooKeeper migration flag from the controllers.
- Apply the changes and restart all pods.
We need to preserve the KAFKA_CLUSTER_ID environment variable for the rest of the lifetime of this cluster.
The complete example KafkaCluster resource after applying the required changes looks as follows:
---
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
metadata:
name: simple-kafka
namespace: kraft-migration
spec:
image:
productVersion: 3.9.1
pullPolicy: IfNotPresent
clusterConfig:
metadataManager: kraft
authentication:
- authenticationClass: kafka-client-auth-tls
tls:
internalSecretClass: kafka-internal-tls
serverSecretClass: tls
brokerIdPodConfigMapName: broker-ids
brokers:
envOverrides:
KAFKA_CLUSTER_ID: "MyCya7hbTD-Hay8PgCsCYA"
roleGroups:
default:
replicas: 3
configOverrides:
broker.properties:
controller.listener.names: "CONTROLLER"
controllers:
roleGroups:
default:
replicas: 3
envOverrides:
KAFKA_CLUSTER_ID: "MyCya7hbTD-Hay8PgCsCYA"
Verify that the cluster is healthy and consumer/producer operations work as expected.
We can consume the previously produced messages by running the command below on one of the broker pods:
/stackable/kafka/bin/kafka-console-consumer.sh \
--consumer.config /stackable/config/client.properties \
--bootstrap-server simple-kafka-broker-default-0-listener-broker.kraft-migration.svc.cluster.local:9093 \
--topic kraft-migration-topic \
--offset earliest \
--partition 0 \
--timeout-ms 10000
5. Cleanup
Before proceeding with this step, please ensure that the Kafka cluster is fully operational in KRaft mode.
In this step we remove the now unused ZooKeeper cluster and related resources.
If the ZooKeeper cluster also serves use cases other than Kafka, you can skip this step.
In our example we can remove the ZooKeeper cluster and the Znode resource as follows:
kubectl delete -n kraft-migration zookeeperznodes simple-kafka-znode
kubectl delete -n kraft-migration zookeeperclusters simple-zk
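Afterwards, only the Kafka broker and controller pods should remain in the namespace:
# Verify that the ZooKeeper pods are gone.
kubectl -n kraft-migration get pods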