Resource Requests

Stackable operators handle resource requests in a sligtly different manner than Kubernetes. Resource requests are defined on role or group level. See Roles and role groups for details on these concepts. On a role level this means that e.g. all workers will use the same resource requests and limits. This can be further specified on role group level (which takes priority to the role level) to apply different resources.

This is an example on how to specify CPU and memory resources using the Stackable Custom Resources:

---
apiVersion: example.stackable.tech/v1alpha1
kind: ExampleCluster
metadata:
  name: example
spec:
  workers: # role-level
    config:
      resources:
        cpu:
          min: 300m
          max: 600m
        memory:
          limit: 3Gi
    roleGroups: # role-group-level
      resources-from-role: # role-group 1
        replicas: 1
      resources-from-role-group: # role-group 2
        replicas: 1
        config:
          resources:
            cpu:
              min: 400m
              max: 800m
            memory:
              limit: 4Gi

In this case, the role group resources-from-role will inherit the resources specified on the role level. Resulting in a maximum of 3Gi memory and 600m CPU resources.

The role group resources-from-role-group has maximum of 4Gi memory and 800m CPU resources (which overrides the role CPU resources).

For Java products the actual used Heap memory is lower than the specified memory limit due to other processes in the Container requiring memory to run as well. Currently, 80% of the specified memory limits is passed to the JVM.

For memory only a limit can be specified, which will be set as memory request and limit in the Container. This is to always guarantee a Container the full amount memory during Kubernetes scheduling.

If no resources are configured explicitly, the operator uses the following defaults for `SparkApplication`s:

job:
  resources:
    cpu:
      min: '100m'
      max: "400m"
    memory:
      limit: '512Mi'
driver:
  resources:
    cpu:
      min: '250m'
      max: "1"
    memory:
      limit: '1Gi'
executor:
  resources:
    cpu:
      min: '250m'
      max: "1"
    memory:
      limit: '4Gi'

For `SparkHistoryServer`s the following defaults are used:

nodes:
  resources:
    cpu:
      min: '250m'
      max: "1"
    memory:
      limit: '512Mi'
The default values are most likely not sufficient to run a proper cluster in production. Please adapt according to your requirements. For more details regarding Kubernetes CPU limits see: Assign CPU Resources to Containers and Pods.

Spark allocates a default amount of non-heap memory based on the type of job (JVM or non-JVM). This is taken into account when defining memory settings based exclusively on the resource limits, so that the "declared" value is the actual total value (i.e. including memory overhead). This may result in minor deviations from the stated resource value due to rounding differences.

It is possible to define Spark resources either directly by setting configuration properties listed under sparkConf, or by using resource limits. If both are used, then sparkConf properties take precedence. It is recommended for the sake of clarity to use either one or the other.