Spark applications

Spark applications are submitted to the Spark Operator as SparkApplication resources. A SparkApplication defines the configuration of the Spark job, including the container image to use, the main application file, and the number of executors to start.
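For illustration, a minimal SparkApplication manifest might look like the following sketch. Field names follow the Stackable spark-k8s operator CRD, but the resource name, product version, and application file path are placeholders; consult the CRD reference for the exact fields supported by your platform version.

```yaml
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: pyspark-pi          # placeholder name
  namespace: default
spec:
  sparkImage:
    productVersion: "3.5.1" # placeholder Spark version
  mode: cluster
  # Placeholder path to the main application file inside the image
  mainApplicationFile: local:///stackable/spark/examples/src/main/python/pi.py
  executor:
    replicas: 2             # number of executors to start
```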

Upon creation, the application’s status is set to Unknown. As the operator creates the necessary resources, the status of the application transitions through phases that mirror the phase of the driver Pod. A successful application eventually reaches the Succeeded phase.

The operator does not reconcile an application after it has been created, so modifying an existing resource has no effect. To resubmit an application, a new SparkApplication resource must be created.

Metrics

Up to version 25.3 of the Stackable Data Platform, Spark applications used the JMX exporter to expose metrics on a separate port. Starting with version 25.7, the built-in Prometheus servlet is used instead.
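Concretely, Spark's built-in servlet is enabled through the Spark metrics configuration. The following is a sketch of the relevant properties; the exact configuration the operator applies may differ:

```properties
# Enable the built-in Prometheus sink, served on the Spark UI port
*.sink.prometheusServlet.class=org.apache.spark.metrics.sink.PrometheusServlet
*.sink.prometheusServlet.path=/metrics/prometheus
```

The executor endpoint additionally requires the Spark property `spark.ui.prometheus.enabled=true`.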

Application driver pods expose Prometheus metrics at the following endpoints:

  • /metrics/prometheus for driver instances

  • /metrics/executors/prometheus for executor instances

These endpoints are available on the same port as the Spark UI, which is 4040 by default.
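As an example, assuming a driver Pod named spark-pi-driver (a placeholder; the actual name depends on your application), the endpoints can be inspected locally with kubectl port-forward:

```shell
# Forward the Spark UI port of the driver Pod to localhost
kubectl port-forward pod/spark-pi-driver 4040:4040 &

# Driver metrics in Prometheus text format
curl http://localhost:4040/metrics/prometheus

# Aggregated executor metrics
curl http://localhost:4040/metrics/executors/prometheus
```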