CRD reference

The following CRD fields can be defined by the user:

CRD field Remarks

apiVersion

spark.stackable.tech/v1alpha1

kind

SparkApplication

metadata.name

Application name

spec.version

Application version

spec.mode

Either cluster or client. Currently only cluster is supported.

spec.image

User-supplied image containing Spark job dependencies that will be copied to the specified volume mount.

spec.sparkImage

Spark image that will be deployed to the driver and executor pods. It must contain the Spark environment needed by the job. See Product image selection for more details of the structure of this property. Mandatory.

spec.mainApplicationFile

The actual application file that will be called by spark-submit

spec.mainClass

The main class, i.e. the entry point for JVM artifacts.

spec.args

Arguments passed directly to the job artifact

spec.s3connection

S3 connection specification. See the S3 resources for more details.

spec.sparkConf

A map of key/value strings that will be passed directly to spark-submit

spec.deps.requirements

A list of Python packages that will be installed via pip.

spec.deps.packages

A list of packages that are passed directly to spark-submit.

spec.deps.excludePackages

A list of excluded packages that are passed directly to spark-submit.

spec.deps.repositories

A list of repositories that are passed directly to spark-submit.

spec.env

A list of environment variables that will be set in the job, driver and executor pods.

spec.volumes

A list of volumes.

spec.job.configOverrides

See Overrides for more information.

spec.job.envOverrides

See Overrides for more information.

spec.job.podOverrides

See Overrides for more information.

spec.job.config.logging

Logging specification for the initiating Job. See Logging for details.

spec.job.config.resources

Resources specification for the initiating Job.

spec.driver.configOverrides

See Overrides for more information.

spec.driver.envOverrides

See Overrides for more information.

spec.driver.podOverrides

See Overrides for more information.

spec.driver.config.affinity

Driver Pod placement affinity. See Pod Placement for details.

spec.driver.config.logging

Logging aggregation for the driver Pod. See Logging for details.

spec.driver.config.resources

Resources specification for the driver Pod.

spec.driver.config.volumeMounts

A list of mounted volumes for the driver.

spec.executor.configOverrides

See Overrides for more information.

spec.executor.envOverrides

See Overrides for more information.

spec.executor.podOverrides

See Overrides for more information.

spec.executor.replicas

Number of executor replicas launched for this job.

spec.executor.config.affinity

Executor Pod placement affinity. See Pod Placement for details.

spec.executor.config.logging

Logging aggregation for the executor Pods. See Logging for details.

spec.executor.config.resources

Resources specification for the executor Pods.

spec.executor.config.volumeMounts

A list of mounted volumes for each executor.

spec.logFileDirectory.s3.bucket

S3 bucket definition where applications should publish events for the Spark History server.

spec.logFileDirectory.s3.prefix

Prefix to use when storing events for the Spark History server.

spec.driver.jvmSecurity

A list of JVM security properties to pass on to the driver VM. The TTL of DNS caches is especially important.

spec.executor.jvmSecurity

A list of JVM security properties to pass on to the executor VM. The TTL of DNS caches is especially important.
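
As a rough illustration of how several of the fields above fit together, a minimal manifest might look like the following sketch. The application name, file path, argument, package pin and Spark property value are hypothetical placeholders. The productVersion key under sparkImage is an assumption about the structure described in Product image selection, so check that page before relying on it.

```yaml
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: example-sparkapp            # hypothetical application name
spec:
  version: "1.0"                    # application version (placeholder)
  mode: cluster                     # only cluster is currently supported
  sparkImage:
    productVersion: 3.5.0           # assumed key and value; see Product image selection
  # Hypothetical path to the artifact that spark-submit will call
  mainApplicationFile: local:///stackable/spark/jobs/example.py
  args:
    - "--input=/data/example.csv"   # placeholder argument passed to the job artifact
  sparkConf:
    # Key/value strings passed directly to spark-submit
    spark.kubernetes.submission.waitAppCompletion: "false"
  deps:
    requirements:
      - tabulate==0.8.9             # placeholder Python package installed via pip
  executor:
    replicas: 2                     # number of executor replicas for this job
```

Fields that are not shown here, such as the overrides, logging, affinity and resources settings, are left at their defaults in this sketch. Their structure is described on the pages referenced in the table.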