Temporary credentials lifetime

Usages

TLS certificates

Currently the only temporary credentials are TLS certificates.

Many products use TLS to secure the communications, often times customers use the secret-operator autoTls backend to create TLS certificates for the Pods on the fly. To increase security, these temporary credentials have a short lifetime by default, which will result in e.g. Trino coordinator Pods restarting every ~24 hours (minus some safety buffer) to avoid using expired certificates.

Configure the lifetime

In high load production environments, restarting Pods can be a costly operation, as it can disrupt services and in some cases even lead to data loss. To avoid frequent restarts, the lifetime of all temporary credentials (such as the TLS certificates) can be increased as needed.

Here is an example for configuring the temporary credentials lifetime to 7 days in a HDFS stacklet. It should result in the HDFS Pods restarting weekly instead of daily:

---
apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: hdfs
spec:
  nameNodes:
    config:
      requestedSecretLifetime: 7d (1)
    roleGroups:
      default:
        replicas: 2
  dataNodes:
    config:
      requestedSecretLifetime: 7d (2)
    roleGroups:
      default:
        replicas: 2
  journalNodes:
    roleGroups:
      default:
        replicas: 3
        config:
          requestedSecretLifetime: 7d (3)
1 The lifetime of the TLS certificates for all NameNode roleGroups is set to 7 days.
2 The lifetime of the TLS certificates for all DataNode roleGroups is set to 7 days.
3 The lifetime of the TLS certificates for the default JournalNode group is set to 7 days.
The configuration for the JournalNodes is done at roleGroup level for demonstration purposes.

TLS certificate lifetimes

Even though operators allow setting this property to a value of your choice, the secret-operator will not exceed the maxCertificateLifetime value specified in SecretClass creating the TLS certificates (see xref:secret-operator/secretclass.adoc#certificate_lifetime).

Operator support

Similar to the example above, users can configure the lifetime of temporary credentials for the following operators:

  • Apache Druid

  • Apache Hadoop

  • Apache HBase

  • Apache NiFi

  • Apache Spark

  • Apache Zookeeper

  • Trino

Pod lifetime annotations

After configuring the lifetime as described above you could simply observe your stacklet for a week/month (or whatever your new lifetime is), to see if your changes take effect. However, it’s much quicker to check at what point of time your Pods will be restarted next.

Pods are not restarted "randomly" by Stackable operators, but in a predicable manner. When a temporary credential is added to a Pod, an annotation is added as well. It starts with restarter.stackable.tech/expires-at. and instructs the restart-controller to restart the Pod once the specified point in time is reached.

Given the following Pod

kind: Pod
metadata:
  annotations:
    restarter.stackable.tech/expires-at.b887492af14bfe84f10cb2ff1b60acb0: "2024-12-05T14:03:47.131570189+00:00"
    restarter.stackable.tech/expires-at.ea77192c1184326d33e8ee32cfe921ea: "2024-12-05T15:49:10.043722965+00:00"

You can always determine the instant the Pod will be restarted by the restart-controller by taking the earliest timestamp, 2024-12-05T14:03:47.131570189+00:00 in this case.

You can use this timestamp to check if your changes have been applied as intended.