Stackable Operator for Apache NiFi

This operator manages Apache NiFi clusters on Kubernetes. Apache NiFi is an open-source data integration tool that provides a web-based interface for designing, monitoring and managing data flows between various systems and devices, using a visual programming approach. It supports a wide range of data sources, formats and features such as data provenance, security and clustering.

Getting started

Get started with Apache NiFi and the Stackable operator by following the Getting started guide. It guides you through installing the operator and connecting to the NiFi web interface. Afterward, have a look at the Usage guide to learn how to configure your NiFi instance to your needs, or run some demos to learn more about using NiFi with other components.

Operator model

The operator manages the NifiCluster custom resource. NiFi only has a single process that it needs to run, so the NifiCluster has only a single role: node. This role can be divided into multiple role groups.
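To make the role and role group structure concrete, the following minimal Python sketch assembles a partial NifiCluster manifest with the single role split into two role groups. The API version string, the field names (`spec.nodes.roleGroups`, `replicas`), the cluster name and the role group names are assumptions made for illustration and should be checked against the NifiCluster CRD documentation. Required cluster-level settings are deliberately left out, so this is not a deployable manifest.

```python
import json


def nifi_cluster_manifest(name, role_groups):
    """Build a partial NifiCluster manifest as a plain dict.

    role_groups maps a role group name to its replica count.
    All field names are assumptions; verify them against the CRD documentation.
    """
    return {
        "apiVersion": "nifi.stackable.tech/v1alpha1",  # assumed API group/version
        "kind": "NifiCluster",
        "metadata": {"name": name},
        "spec": {
            # NiFi runs a single process, so there is only one role.
            "nodes": {
                "roleGroups": {
                    group: {"replicas": replicas}
                    for group, replicas in role_groups.items()
                }
            },
        },
    }


# Hypothetical cluster with two role groups of different sizes.
manifest = nifi_cluster_manifest("example-nifi", {"default": 2, "highmem": 1})
print(json.dumps(manifest, indent=2))
```

Splitting the role into groups like this is what lets different sets of NiFi nodes carry different replica counts or settings while still belonging to one cluster.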

A diagram depicting the Kubernetes resources created by the Stackable operator for Apache NiFi

For every role group, the operator creates a ConfigMap and a StatefulSet, which can have multiple replicas (Pods). Every role group is accessible through its own Service, and there is a Service for the whole cluster.
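The resource layout described above can be sketched as a small Python helper that enumerates the objects to expect for a given cluster. The naming pattern `<cluster>-<role>-<rolegroup>` and the use of the bare cluster name for the cluster-wide Service are assumptions made for this sketch, as are the example cluster and role group names; inspect a running cluster to confirm the real names.

```python
def expected_resources(cluster, role, role_groups):
    """List (kind, name) pairs for the resources described above.

    The "<cluster>-<role>-<rolegroup>" naming pattern is an assumption
    made for this sketch, not something stated by the documentation.
    """
    # One Service for the whole cluster (name assumed to be the cluster name).
    resources = [("Service", cluster)]
    for group in role_groups:
        name = f"{cluster}-{role}-{group}"
        # Each role group gets its own ConfigMap, StatefulSet and Service.
        resources += [
            ("ConfigMap", name),
            ("StatefulSet", name),
            ("Service", name),
        ]
    return resources


for kind, name in expected_resources("example-nifi", "node", ["default", "highmem"]):
    print(f"{kind:12} {name}")
```

With two role groups, the helper yields seven entries: three per role group plus the cluster-wide Service. The Pods themselves are not listed because they are created by each StatefulSet according to its replica count.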

Dependencies

Apache NiFi 1.x depends on Apache ZooKeeper, which you can run in Kubernetes with the Stackable Operator for Apache ZooKeeper.

Demos

NiFi is often a good choice as the first step in a data pipeline, where data has to be fetched in various formats from various sources. The data-lakehouse-iceberg-trino-spark demo uses NiFi to fetch six different datasets in various formats. The data is then ingested into a Kafka topic. Apache Kafka is also part of the Stackable platform.

The nifi-kafka-druid-earthquake-data and nifi-kafka-druid-water-level-data demos use NiFi in the same way. Both demos showcase downloading data from web APIs and ingesting it into Kafka.

Supported versions

The Stackable operator for Apache NiFi currently supports the NiFi versions listed below. To use a specific NiFi version in your NifiCluster, you have to specify an image, as explained in the Product image selection documentation. The operator also supports running images from a custom registry or running entirely customized images; both cases are explained under Product image selection as well.

2.7.2
- This version supports Iceberg, but only with S3 and the Iceberg REST catalog (no Hive metastore or HDFS support). See Writing to Iceberg tables for details.
2.6.0 (LTS)
1.28.1 (Deprecated)

For details on how to upgrade your NiFi version, refer to Updating NiFi.
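As a rough illustration of the image selection options mentioned above, the following Python sketch builds the image section of a NifiCluster spec for three cases: a supported version, a custom registry, and a fully customized image. The field names (`productVersion`, `repo`, `custom`) and the registry and image names are assumptions for illustration; the Product image selection documentation is the authoritative source. The sketch also treats the custom registry and the custom image as alternatives, which is an assumption of this example.

```python
def image_section(product_version, repo=None, custom=None):
    """Return the image section of a NifiCluster spec as a dict.

    Field names (productVersion, repo, custom) are assumptions; check
    the Product image selection documentation before relying on them.
    """
    if repo and custom:
        # This sketch treats the two options as alternatives.
        raise ValueError("pass either repo or custom, not both")
    section = {"productVersion": product_version}
    if repo:
        section["repo"] = repo      # pull images from a custom registry
    if custom:
        section["custom"] = custom  # use an entirely customized image
    return section


# Hypothetical registry and image names, for illustration only.
print(image_section("2.6.0"))
print(image_section("2.6.0", repo="registry.example.com/stackable"))
print(image_section("2.6.0", custom="registry.example.com/team/nifi:2.6.0-custom"))
```

In every case the NiFi version is stated explicitly, so an upgrade amounts to changing that one value and following the Updating NiFi instructions.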

Useful links

The nifi-operator GitHub repository
The operator feature overview in the feature tracker
The NifiCluster CRD documentation