ADR032: OIDC Support
- Status: accepted
- Deciders:
  - Sebastian Bernauer
  - Malte Sander
  - Sascha Lautenschläger
  - Natalie Röijezon
  - Razvan Mihai
  - Felix Hennig
  - Lukas Voetmand
  - Nick Larsen
- Date: 2023-11-14
- Technical Story: https://github.com/stackabletech/issues/issues/431
Problem statement
OIDC is a widespread authentication mechanism, supported by most modern data products as well as identity providers. OIDC is also used inside the Gaia-X ecosystem. Keycloak is also widespread as a self-hosted, internal identity provider. Just as with LDAP, we expect users to already have an identity provider running; deploying an identity provider is out of scope of the design.
We want to support OIDC as an authentication mechanism on the Stackable Data Platform, allowing users to configure their already existing identity provider via our AuthenticationClass mechanism.
Open problems include the slightly different ways in which products are configured, how OIDC clients should be configured, and how client credentials are provided.
Context
Useful information to consider in the decision.
OIDC clients and product-client-mappings
To configure a product to use OIDC, it needs information about where to find the OIDC provider (typically the URL of the discovery endpoint), as well as client credentials to authenticate with the provider. The connection information is generic and shared between all clients, but the credentials are product specific.
It is best practice to have one client per connecting product. This allows the user to define exactly from which hosts the client can be used, as well as valid redirect URLs. Sometimes a separate client per product is also technically required, because a product might need client-specific configuration, e.g. rewriting claims.
Configuring multiple different mappings from products and product instances needs to take place at the client level; making these use cases easy to configure is an important consideration. These mappings can be configuration option mappings or even JWT claim mappings. Executing these mappings takes place at the operator / product level.
The following client granularities should be supported:
- One single client for all product connections (useful for debugging, or as an initial test setup; might not always be technically feasible depending on the products).
- One client per product but not per instance: for example, running 5 NiFi instances and 5 Trino instances with 2 clients, one for NiFi and one for Trino. This might be a good setup in larger teams, where new product instances get spun up all the time and the team managing the OIDC provider (and clients) is different from the team running the data products.
- One client per product instance: the finest granularity, every instance gets its own client.
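The three granularities can be illustrated with a small sketch. The enum, the function names and the shared client ID "sdp" are hypothetical and are not part of the design; the sketch only shows how many clients each variant yields for the 5 NiFi + 5 Trino example.

```python
from enum import Enum


class ClientGranularity(Enum):
    SINGLE = "single"              # one client for all product connections
    PER_PRODUCT = "per-product"    # e.g. one client for NiFi, one for Trino
    PER_INSTANCE = "per-instance"  # every product instance gets its own client


def client_id_for(product: str, instance: str, granularity: ClientGranularity) -> str:
    """Return an illustrative client ID for one product instance."""
    if granularity is ClientGranularity.SINGLE:
        return "sdp"  # hypothetical shared client ID
    if granularity is ClientGranularity.PER_PRODUCT:
        return product
    return instance


def distinct_clients(instances, granularity: ClientGranularity) -> set:
    """Collect the set of client IDs needed for all given instances."""
    return {client_id_for(product, name, granularity) for product, name in instances}


# 5 NiFi instances and 5 Trino instances, as in the example above.
instances = [("nifi", f"nifi-{i}") for i in range(5)]
instances += [("trino", f"trino-{i}") for i in range(5)]

for granularity in ClientGranularity:
    print(granularity.value, len(distinct_clients(instances, granularity)))
```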
Product configs we need to support with the AuthenticationClass
Below are example configurations for Superset, Trino and Druid, which we built for the keycloak-opa-poc Stack.
The AuthenticationClass needs to contain information to generate these configs.
Trino
web-ui.authentication.type: oauth2
http-server.authentication.oauth2.client-id: trino
http-server.authentication.oauth2.client-secret: ${ENV:TRINO_CLIENT_SECRET}
http-server.authentication.oauth2.issuer: https://${ENV:KEYCLOAK_ADDRESS}/realms/master
http-server.authentication.oauth2.scopes: openid
http-server.authentication.oauth2.principal-field: preferred_username
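As a rough sketch of what "generating these configs" could look like for Trino, the function below builds the properties shown above from generic inputs. The function name and its parameters are hypothetical, the secret is still injected through the TRINO_CLIENT_SECRET environment variable as in the example, and joining several scopes with commas is an assumption about how Trino expects list values.

```python
def trino_oauth2_properties(issuer: str, client_id: str, scopes: list, principal_claim: str) -> dict:
    """Build the Trino OAuth2 properties from generic OIDC settings."""
    return {
        "web-ui.authentication.type": "oauth2",
        "http-server.authentication.oauth2.client-id": client_id,
        # The secret itself is never rendered into the config file.
        "http-server.authentication.oauth2.client-secret": "${ENV:TRINO_CLIENT_SECRET}",
        "http-server.authentication.oauth2.issuer": issuer,
        # Assumption: multiple scopes are passed as a comma-separated list.
        "http-server.authentication.oauth2.scopes": ",".join(scopes),
        "http-server.authentication.oauth2.principal-field": principal_claim,
    }


props = trino_oauth2_properties(
    issuer="https://${ENV:KEYCLOAK_ADDRESS}/realms/master",
    client_id="trino",
    scopes=["openid"],
    principal_claim="preferred_username",
)
for key, value in props.items():
    print(f"{key}: {value}")
```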
Druid
# pac4j authenticator
druid.auth.authenticator.pac4j.type: pac4j
druid.auth.authenticator.pac4j.authorizerName: OpaAuthorizer
# pac4j common config
druid.auth.pac4j.cookiePassphrase: '${env:DRUID_COOKIE_PASSPHRASE}'
# OIDC common config
druid.auth.pac4j.oidc.clientID: druid
druid.auth.pac4j.oidc.clientSecret: '{"type":"environment","variable":"DRUID_CLIENT_SECRET"}'
druid.auth.pac4j.oidc.discoveryURI: '${env:KEYCLOAK_DISCOVERY_URL}'
Superset
Superset does not support reading an OIDC discovery URL, but requires:
- a separate API base URL, auth URL and token URL
- the email, profile and openid scopes
Note that the auth and token URLs are supplied at the .well-known endpoint, but the base URL is not.
We could opt to implement proper OIDC support using fab-oidc. This however needs maintenance work from us.
{
    'name': 'keycloak',
    'icon': 'fa-key',
    'token_key': 'access_token',
    'remote_app': {
        'client_id': 'superset',
        'client_secret': f'{os.environ.get("SUPERSET_CLIENT_SECRET")}',
        'api_base_url': f'http://{os.environ.get("KEYCLOAK_ADDRESS")}/realms/master/protocol/openid-connect',
        'client_kwargs': {
            'scope': 'email profile openid'
        },
        'access_token_url': f'http://{os.environ.get("KEYCLOAK_ADDRESS")}/realms/master/protocol/openid-connect/token',
        'authorize_url': f'http://{os.environ.get("KEYCLOAK_ADDRESS")}/realms/master/protocol/openid-connect/auth',
        'request_token_url': None,
    },
}
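Because Superset needs explicit URLs, an operator would have to derive them from the generic provider settings. The sketch below does this for Keycloak only: the function name is hypothetical, and the /protocol/openid-connect paths are Keycloak conventions taken from the example above, so other providers would need different handling.

```python
def superset_keycloak_provider(issuer: str, client_id: str, client_secret: str, scopes: list) -> dict:
    """Build a Superset OAuth provider entry from a Keycloak issuer URL."""
    # Keycloak-specific: all endpoints live below <issuer>/protocol/openid-connect.
    base = issuer.rstrip("/") + "/protocol/openid-connect"
    return {
        "name": "keycloak",
        "icon": "fa-key",
        "token_key": "access_token",
        "remote_app": {
            "client_id": client_id,
            "client_secret": client_secret,
            "api_base_url": base,
            # OAuth2 scopes are space-separated in the scope parameter.
            "client_kwargs": {"scope": " ".join(scopes)},
            "access_token_url": f"{base}/token",
            "authorize_url": f"{base}/auth",
            "request_token_url": None,
        },
    }


provider = superset_keycloak_provider(
    issuer="http://keycloak.example/realms/master",  # placeholder address
    client_id="superset",
    client_secret="not-a-real-secret",               # placeholder value
    scopes=["email", "profile", "openid"],
)
print(provider["remote_app"]["access_token_url"])
```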
Common authentication servers
For testing (and also in the Gaia-X context) we use Keycloak as the identity provider, but our design should not be Keycloak specific; ideally, it should work with all OIDC providers. Other common providers are:
- Auth0
- ADFS
- Dex
- Okta
As mentioned before, we expect the user to already operate the identity provider.
Decision drivers
- Don’t repeat yourself: Information should ideally only be configured in one spot.
- Flexible: Different variants of client and product instance mappings should be supported.
- Comprehensible: Users should not be overwhelmed by complicated documentation. After setting up one product, users should be able to set up other products fairly easily as well.
- High level of support across SDP: All products supporting OIDC should work. Furthermore, most OIDC providers listed above should work.
- No surprises: We previously designed the AuthenticationClass mechanism and its integration into product CRDs to support LDAP authentication. The OIDC configuration should have similar ergonomics, so that users are not surprised and we get a coherent platform.
AuthenticationClass & product cluster configuration design
AuthenticationClass design
During a Hackathon we came up with an initial design. This design was improved upon during the on-site meeting from 2023-11-13 to 2023-11-17. The final design looks like this:
apiVersion: authentication.stackable.tech/v1alpha1
kind: AuthenticationClass
metadata:
  name: keycloak
spec:
  provider:
    oidc:
      # Hostname of the IdP, e.g. "idp.mycompany.corp"
      hostname: "$KEYCLOAK_HOSTNAME"
      # Optional port number to use. If unspecified, connections will
      # use the default port for the HTTP scheme (i.e. 443 when TLS
      # is enabled, or 80 when TLS is disabled).
      port: $KEYCLOAK_PORT
      # Optional root path appended to the hostname. This defaults
      # to "/".
      rootPath: /realms/master
      # User configurable scopes depending on the Identity provider
      # requirements.
      # The following three scopes are usually required, but depending
      # on the Identity Provider requirements, you might need to add or
      # remove some from this list.
      scopes: [ openid, email, profile ]
      # If a product extracts some sort of "effective user" that is
      # represented by a string internally, this config determines which
      # claim is used to extract that string. It is desirable to use
      # `sub` in here (or some other stable identifier), but in many
      # cases you might need to use `preferred_username` (e.g. in case
      # of Keycloak) or a different claim instead.
      #
      # Please note that some products hard-code the claim in their
      # implementation, so some product operators might error out if
      # the product hard-codes a different claim than configured here.
      #
      # We don't provide any default value, as there is no correct way
      # of doing it that works in all setups. Most demos will probably
      # use `preferred_username`; `sub` would be more desirable, but is
      # technically impossible with the current behavior of the
      # products.
      principalClaim: preferred_username
      # Optional provider hint. If unspecified, the product will not
      # enable any known quirks and will assume OIDC works as it is
      # intended to work.
      providerHint: Keycloak
      tls:
        verification:
          none: {}
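To illustrate how the hostname, port and rootPath fields might be combined, here is a minimal sketch that derives an issuer URL and the discovery URL. The function names are hypothetical, and the boolean tls parameter is an assumption standing in for "a tls section is present in the AuthenticationClass". The /.well-known/openid-configuration suffix comes from the OIDC Discovery specification.

```python
from typing import Optional


def issuer_url(hostname: str, port: Optional[int] = None, root_path: str = "/", tls: bool = True) -> str:
    """Combine the AuthenticationClass fields into a base (issuer) URL."""
    scheme = "https" if tls else "http"
    # Without an explicit port the scheme default (443 or 80) applies.
    authority = hostname if port is None else f"{hostname}:{port}"
    path = "/" + root_path.strip("/")
    # Drop a trailing slash so that suffixes can be appended uniformly.
    return f"{scheme}://{authority}{path}".rstrip("/")


def discovery_url(issuer: str) -> str:
    """Return the OIDC discovery document URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


issuer = issuer_url("idp.mycompany.corp", root_path="/realms/master")
print(issuer)
print(discovery_url(issuer))
```

Products that accept a discovery URL (such as Druid in the example above) could use the second value directly, while products that expect an issuer (such as Trino) could use the first.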
Product cluster configuration design
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: trino
spec:
  image:
    productVersion: "414"
    stackableVersion: 23.7.0
  clusterConfig:
    # Other required config options omitted for brevity
    authentication:
      - authenticationClass: keycloak / open-ldap
        oidc:
          # A reference to the OIDC client credentials secret, which
          # consists of a clientId and clientSecret.
          clientCredentialsSecret: trino-keycloak-client
          # Additional scopes required for this specific product. They
          # will get merged with the scopes configured in the
          # AuthenticationClass.
          extraScopes: [ groups ]
---
apiVersion: v1
kind: Secret
metadata:
  name: trino-keycloak-client
stringData:
  clientId: trino
  clientSecret: "{{ keycloakTrinoClientSecret }}"
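The merging of extraScopes with the scopes from the AuthenticationClass is not specified in detail here. One plausible reading, sketched below with a hypothetical helper, keeps the class scopes first, appends the product-specific ones and drops duplicates.

```python
def merge_scopes(class_scopes: list, extra_scopes: list) -> list:
    """Merge scopes, keeping first-seen order and removing duplicates."""
    # dict.fromkeys preserves insertion order, so class scopes stay first.
    return list(dict.fromkeys([*class_scopes, *extra_scopes]))


merged = merge_scopes(["openid", "email", "profile"], ["groups"])
print(merged)
```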
In the future we want to nest LDAP related config options under the authenticationClass key the same way oidc is in this ADR.
The design looks like this:
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: trino
spec:
  image:
    productVersion: "414"
    stackableVersion: 23.7.0
  clusterConfig:
    # Other required config options omitted for brevity
    authentication:
      - authenticationClass: open-ldap
        ldap:
          # Optional. Only required for some products. In the future
          # this will be replaced by the key "bindUserSecret".
          bindCredentialsSecretClass: trino-openldap-bind
---
apiVersion: v1
kind: Secret
metadata:
  name: trino-openldap-bind
stringData:
  username: admin
  password: "{{ ldapTrinoPassword }}"
Considered alternatives
- A distinct OAuth2 AuthenticationClass: This was considered to make it easier to configure Superset and Airflow, as they do not support OIDC out-of-the-box, but during a spike we found that it was feasible to generate OAuth2 configuration from the OIDC AuthenticationClass.
- Identity provider specific AuthenticationClasses: The idea of having a "Keycloak" class instead of a generic OIDC class was floated, but discarded as it did not seem to have any benefits.