
    Stackable Documentation 24.11

    Usage guide

    This guide covers the Stackable operator for Apache Spark on K8S: learn how to load your own Job Dependencies or configure an S3 connection to access data. See the Examples page for different usage scenarios.

    © 2025 Stackable.

    Apache, Apache Kafka®, Kafka, and the Kafka logo, Apache Druid, Druid, and the Druid logo, Apache ZooKeeper™, ZooKeeper, and the ZooKeeper logo, Apache Hive™, Hive, and the Hive logo, Apache Spark™, Spark, and the Spark logo, Apache Airflow, Airflow, and the Airflow logo, Apache HBase®, HBase, and the HBase logo, Apache NiFi, NiFi, and the NiFi logo, Apache Superset, Superset, and the Superset logo, Apache Hadoop® HDFS, Apache Hadoop, Hadoop, and the Hadoop logo, Apache Phoenix™, Phoenix, and the Phoenix logo, Apache Iceberg, Iceberg, and the Iceberg logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. Open Policy Agent (OPA) is a Cloud Native Computing Foundation graduated project. Licensed under the Apache License, Version 2.0. Trino is open source software licensed under the Apache License 2.0 and supported by the Trino Software Foundation. MinIO is a trademark of the MinIO Corporation. All other products or name brands are trademarks of their respective holders. All product and service names used in this website are for identification purposes only and do not imply endorsement.