There are two ways to run the Spark Operator:
Helm managed Docker container deployment on Kubernetes
Build from source
Helm allows you to download and deploy Stackable operators on Kubernetes and is by far the easiest installation method. First, ensure that you have added the Stackable Helm repository:
helm repo add stackable https://repo.stackable.tech/repository/helm-stable/
Then install the Stackable Operator for Apache Spark:
helm install spark-k8s-operator stackable/spark-k8s-operator
Helm will deploy the operator in a Kubernetes container and apply the CRDs for the Apache Spark service. You are now ready to deploy Apache Spark in Kubernetes.
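Before deploying Spark, it can help to confirm that the release and the CRDs are actually present. The following is a minimal sketch, not an official check: `check_spark_operator` is a hypothetical helper name, the release name is taken from the `helm install` command above, and the CRD group `spark.stackable.tech` is an assumption that you should adjust if your cluster reports a different one.

```shell
# Hypothetical helper: verifies the Helm release and the operator CRDs.
# Assumes the release is named spark-k8s-operator (as installed above) and
# that the CRDs belong to the group spark.stackable.tech (an assumption).
check_spark_operator() {
  # `helm status` exits non-zero if the release does not exist.
  helm status spark-k8s-operator >/dev/null 2>&1 || {
    echo "Helm release not found"
    return 1
  }
  # List all CRDs by name and look for the assumed Spark operator group.
  kubectl get crds -o name | grep -qF 'spark.stackable.tech' || {
    echo "Spark CRDs not found"
    return 1
  }
  echo "operator installed"
}
```

Running `check_spark_operator` in the same shell after the install prints `operator installed` when both checks pass, and otherwise names the first check that failed.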
Building the operator from source
To run the operator from your local machine, usually for development purposes, first install the required manifest files:
make regenerate-charts
kubectl create -f deploy/manifests
Then, start the operator:
cargo run -- run
The steps above install the operator alone, which is sufficient for Spark jobs that do not require any external dependencies. In practice, this is often not the case, and Spark and/or job dependencies will be required. These can be made available in different ways, e.g. by including them in the Spark images used by spark-submit, by reading them from external repositories, or by using local external storage such as Kubernetes persistent volumes. See the Job Dependencies page for detailed information.