S3 bucket specification
You can specify S3 connection details directly inside the
SparkApplication specification or by referring to an external
S3Bucket custom resource.
To specify S3 connection details directly as part of the
SparkApplication resource, add an inline connection configuration as shown below.
    s3connection: # (1)
      inline:
        host: test-minio # (2)
        port: 9000 # (3)
        accessStyle: Path
        credentials:
          secretClass: s3-credentials-class # (4)
(1) Entry point for the S3 connection configuration.
(2) Connection host.
(3) Optional connection port.
(4) Name of the
It is also possible to configure the connection details as a separate Kubernetes resource and only refer to that object from the
SparkApplication like this:
    s3connection:
      reference: s3-connection-resource # (1)
(1) Name of the connection resource with connection details.
The resource named
s3-connection-resource is then defined as shown below:
    ---
    apiVersion: s3.stackable.tech/v1alpha1
    kind: S3Connection
    metadata:
      name: s3-connection-resource
    spec:
      host: test-minio
      port: 9000
      accessStyle: Path
      credentials:
        secretClass: minio-credentials-class
This has the advantage that one connection configuration can be shared across
SparkApplications and reduces the cost of updating these details.
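As a rough illustration of the sharing described above, two applications could point at the same connection resource. This is only a sketch: the application names `spark-job-a` and `spark-job-b` are hypothetical, the `apiVersion` is assumed to be `spark.stackable.tech/v1alpha1`, the `s3connection` block is assumed to sit directly under `spec`, and all other fields a SparkApplication needs are left out.

```yaml
---
apiVersion: spark.stackable.tech/v1alpha1  # assumed API version
kind: SparkApplication
metadata:
  name: spark-job-a  # hypothetical name
spec:
  # ... other SparkApplication fields omitted ...
  s3connection:
    reference: s3-connection-resource  # shared connection resource
---
apiVersion: spark.stackable.tech/v1alpha1  # assumed API version
kind: SparkApplication
metadata:
  name: spark-job-b  # hypothetical name
spec:
  # ... other SparkApplication fields omitted ...
  s3connection:
    reference: s3-connection-resource  # same resource as spark-job-a
```

With this layout, a change to the host, port, or credentials is made once in the connection resource rather than in each application.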
A custom certificate can also be used for S3 access. In the example below, a Secret containing a custom certificate is referenced. It is used to create a custom truststore, which Spark then uses for S3 bucket access:
    ---
    apiVersion: s3.stackable.tech/v1alpha1
    kind: S3Connection
    metadata:
      name: s3-connection-resource
    spec:
      host: test-minio
      port: 9000
      accessStyle: Path
      credentials:
        secretClass: minio-credentials-class # (1)
      tls:
        verification:
          server:
            caCert:
              secretClass: minio-tls-certificates # (2)
(1) Name of the
(2) Name of the
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: minio-tls-certificates
      labels:
        secrets.stackable.tech/class: minio-tls-certificates
    data:
      ca.crt: ...
      tls.crt: ...
      tls.key: ...
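The Secret above carries a `secrets.stackable.tech/class` label, so a SecretClass with the same name presumably has to exist for the label to be resolved. The following is a minimal sketch of such a resource. The `secrets.stackable.tech/v1alpha1` API version and the `k8sSearch` backend are assumptions that this section does not state, so check them against the secret-operator version in use.

```yaml
---
apiVersion: secrets.stackable.tech/v1alpha1  # assumed API version
kind: SecretClass
metadata:
  name: minio-tls-certificates  # matches the class label on the Secret
spec:
  backend:
    k8sSearch:  # assumed backend: finds Secrets by their class label
      searchNamespace:
        pod: {}  # search the namespace of the requesting Pod
```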