There are two ways to run the Spark Operator:

  1. Helm-managed Docker container deployment on Kubernetes

  2. Build from source


Helm allows you to download and deploy Stackable operators on Kubernetes and is by far the easiest installation method. First, ensure that you have added the Stackable Operators Helm repository:

helm repo add stackable

Then install the Stackable Operator for Apache Spark:

helm install spark-k8s-operator stackable/spark-k8s-operator

Helm will deploy the operator in a container on Kubernetes and apply the CRDs for the Apache Spark service. You are now ready to deploy Apache Spark on Kubernetes.
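To confirm that the installation worked, a quick check along the following lines may help. This is a minimal sketch, not part of the operator itself: the helper name `verify_operator_install` is hypothetical, and it assumes that the release was installed into the current namespace, that the operator pod name contains the release name, and that the installed CRD names contain "spark".

```shell
# Hypothetical helper: checks a Helm release, its pod(s), and the Spark CRDs.
# Assumes the release lives in the current namespace.
verify_operator_install() {
  release="${1:-spark-k8s-operator}"

  # Show the state of the Helm release; fails if the release does not exist.
  helm status "$release" || return 1

  # Assumption: the operator pod name contains the release name.
  kubectl get pods | grep "$release" || return 1

  # Assumption: the CRDs applied by the chart have "spark" in their names.
  kubectl get crds | grep -i spark
}
```

Calling `verify_operator_install` with no arguments checks the release name used in the install command above. A non-zero exit status indicates that one of the checks found nothing.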

Building the operator from source

To run the operator from your local machine (usually for development purposes), you first need to install the required manifest files:

make regenerate-charts
kubectl create -f deploy/manifests

Then, start the operator:

cargo run -- run
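The commands in this section rely on `make`, `kubectl` and `cargo` being available on your machine. As a small sketch, a helper such as the following (the name `missing_tools` is hypothetical and not part of the operator repository) lists any of these tools that cannot be found on the `PATH`:

```shell
# Hypothetical helper: print each named tool that is not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools used by the build-from-source steps in this section.
missing_tools make kubectl cargo
```

If the call prints nothing, all three tools were found. It does not check tool versions or whether `kubectl` can reach a cluster.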

Additional/Optional components

The above describes the installation of the operator alone and is sufficient for Spark jobs that do not require any external dependencies. In practice, this is often not the case, and Spark and/or job dependencies will be required. These can be made available in different ways, e.g. by including them in the Spark images used by spark-submit, by reading them from external repositories, or by using local external storage such as Kubernetes persistent volumes. See the Job Dependencies page for detailed information.


The examples provided with the operator code show different ways of combining these elements.