S3 bucket specification

You can specify S3 connection details directly inside the SparkApplication specification or by referring to an external S3Bucket custom resource. Refer to the S3 concept documentation for general information about S3 resources on the Stackable Data Platform.

S3 access using credentials

To specify S3 connection details directly as part of the SparkApplication resource you add an inline connection configuration as shown below.

s3connection: (1)
  inline:
    host: test-minio (2)
    port: 9000 (3)
    accessStyle: Path
    credentials:
      secretClass: s3-credentials-class  (4)
1 Entry point for the S3 connection configuration.
2 Connection host.
3 Optional connection port.
4 Name of the Secret object expected to contain the following keys: accessKey and secretKey.

It is also possible to configure the connection details as a separate Kubernetes resource and only refer to that object from the SparkApplication like this:

s3connection:
  reference: s3-connection-resource (1)
1 Name of the connection resource with connection details.

The resource named s3-connection-resource is then defined as shown below:

---
apiVersion: s3.stackable.tech/v1alpha1
kind: S3Connection
metadata:
  name: s3-connection-resource
spec:
  host: test-minio
  port: 9000
  accessStyle: Path
  credentials:
    secretClass: minio-credentials-class

This has the advantage that one connection configuration can be shared across SparkApplications and reduces the cost of updating these details.

S3 access with TLS

A custom certificate can be used for S3 access. In the example below, a Secret containing a custom certificate is referenced, which is used to create a custom truststore for Spark to access the S3 bucket:

---
apiVersion: s3.stackable.tech/v1alpha1
kind: S3Connection
metadata:
  name: s3-connection-resource
spec:
  host: test-minio
  port: 9000
  accessStyle: Path
  credentials:
    secretClass: minio-credentials-class  (1)
  tls:
    verification:
      server:
        caCert:
          secretClass: minio-tls-certificates  (2)
1 Name of the Secret object expected to contain the following keys: accessKey and secretKey (as in the previous example).
2 Name of the Secret object containing the custom certificate. The certificate should comprise the 3 files named as shown below:
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-tls-certificates
  labels:
    secrets.stackable.tech/class: minio-tls-certificates
data:
  ca.crt: ...
  tls.crt: ...
  tls.key: ...