Storage and resource configuration

Storage for data volumes

You can mount volumes where data is stored by specifying PersistentVolumeClaims for each individual role group:

nodes:
  roleGroups:
    data:
      config:
        resources:
          storage:
            logDirs:
              capacity: 50Gi

In the example above, all OpenSearch nodes in the data role group store their data (the directory configured by the path.data property) on a 50Gi volume.

If nothing is configured in the custom resource for a role group, each Pod mounts an 8Gi local volume for the data location by default.

On role groups with only the cluster_manager node role, you will probably want to decrease this value, while increasing it on role groups with the data node role.
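
The following sketch shows different capacities per role group; the role group names and capacities are purely illustrative and have to be adapted to your cluster:

nodes:
  roleGroups:
    cluster-manager:
      config:
        resources:
          storage:
            logDirs:
              capacity: 2Gi
    data:
      config:
        resources:
          storage:
            logDirs:
              capacity: 100Gi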

Resource Requests

Stackable operators handle resource requests in a slightly different manner than Kubernetes. Resource requests are defined on role or role group level. On the role level this means that by default, all workers will use the same resource requests and limits. This can be further specified on role group level (which takes priority over the role level) to apply different resources.

This is an example of how to specify CPU and memory resources using the Stackable Custom Resources:

---
apiVersion: example.stackable.tech/v1alpha1
kind: ExampleCluster
metadata:
  name: example
spec:
  workers: # role-level
    config:
      resources:
        cpu:
          min: 300m
          max: 600m
        memory:
          limit: 3Gi
    roleGroups: # role-group-level
      resources-from-role: # role-group 1
        replicas: 1
      resources-from-role-group: # role-group 2
        replicas: 1
        config:
          resources:
            cpu:
              min: 400m
              max: 800m
            memory:
              limit: 4Gi

In this case, the role group resources-from-role will inherit the resources specified on the role level, resulting in a maximum of 3Gi memory and 600m CPU resources.

The role group resources-from-role-group has a maximum of 4Gi memory and 800m CPU resources (which overrides the role CPU resources).

For Java products, the heap memory actually used is lower than the specified memory limit because other processes in the container also require memory to run. Currently, 80% of the specified memory limit is passed to the JVM.
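
For example, with a memory limit of 4Gi, roughly 3.2Gi (80% of 4Gi) would be passed to the JVM as heap.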

For memory, only a limit can be specified, which is set as both the memory request and the memory limit in the container. This guarantees that a container always has the full amount of memory available during Kubernetes scheduling.
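
As a sketch of the effect, the resources-from-role-group settings above would translate into container resources roughly like this (illustrative, not actual operator output):

resources:
  requests:
    cpu: 400m
    memory: 4Gi
  limits:
    cpu: 800m
    memory: 4Gi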

A minimal HA setup consisting of 3 nodes has the following resource requirements:

  • 300m CPU request

  • 1200m CPU limit

  • 4800Mi memory request and limit

Of course, additional services require additional resources. For Stackable components, see the corresponding documentation for further resource requirements.

The operator uses the following resource defaults:

nodes:
  roleGroups:
    default:
      config:
        resources:
          cpu:
            min: "1"
            max: "4"
          memory:
            limit: 2Gi
          storage:
            data: 8Gi

The default values are most likely not sufficient to run a production cluster. Please adapt according to your requirements.
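
For example, a role group for production might override the defaults along these lines; the values are purely illustrative and have to be sized for your workload:

nodes:
  roleGroups:
    default:
      config:
        resources:
          cpu:
            min: "2"
            max: "8"
          memory:
            limit: 16Gi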