This section of the Stackable documentation contains information about the individual operators that make up the Stackable Data Platform. You can find an overview of the product operators as well as the internal operators below.
Apache Airflow is a workflow engine and a replacement for Apache Oozie.
Apache Druid is a real-time database to power modern analytics applications.
HBase is a distributed, scalable, big data store.
HDFS is a distributed file system that provides high-throughput access to application data.
The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. We support the Hive Metastore.
Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
Apache NiFi is an easy-to-use, powerful, and reliable system to process and distribute data.
Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
Apache Superset is a modern data exploration and visualization platform.
Trino is a fast distributed SQL query engine for big data analytics that helps you explore your data universe.
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
OpenPolicyAgent is a rule-based authorization engine.
The commons operator supplies shared CustomResourceDefinitions for all other operators.
The secret operator is responsible for handling secrets as well as certificates and auto-renewing them.
The listener operator is responsible for making services available outside of the Kubernetes cluster.