Security

Authentication

Currently the only supported authentication mechanism is Kerberos, which is disabled by default. For Kerberos to work, a Kerberos KDC is needed, which the user needs to provide. The secret-operator documentation states which kinds of Kerberos servers are supported and how they can be configured.

Kerberos is supported starting from HDFS version 3.3.x.

1. Prepare Kerberos server

To configure HDFS to use Kerberos, you first need to collect information about your Kerberos server, e.g. the hostname and port. Additionally, you need a service-user that the secret-operator uses to create principals for the HDFS services.

2. Create Kerberos SecretClass

Afterwards you need to enter all the needed information into a SecretClass, as described in the secret-operator documentation. The following guide assumes you have named your SecretClass kerberos-hdfs.
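
The exact backend configuration depends on your Kerberos setup and is covered by the secret-operator documentation. As an illustration only, a SecretClass using the kerberosKeytab backend against an MIT KDC could look roughly like this (the realm, hostnames, Secret name and principal are assumptions and must be replaced with your values):

apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: kerberos-hdfs
spec:
  backend:
    kerberosKeytab:
      realmName: CLUSTER.LOCAL # Assumption: your Kerberos realm
      kdc: krb5-kdc.default.svc.cluster.local # Assumption: your KDC hostname
      admin:
        mit:
          kadminServer: krb5-kdc.default.svc.cluster.local # Assumption: your kadmin server
      adminKeytabSecret:
        # Assumption: a Secret containing the keytab of the service-user mentioned in step 1
        namespace: default
        name: secret-operator-keytab
      adminPrincipal: stackable-secret-operator # Assumption: the service-user principal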

3. Configure HDFS to use SecretClass

The last step is to configure your HdfsCluster to use the newly created SecretClass.

spec:
  clusterConfig:
    authentication:
      tlsSecretClass: tls # Optional, defaults to "tls"
      kerberos:
        secretClass: kerberos-hdfs # Put your SecretClass name in here

The kerberos.secretClass is used so that HDFS can request keytabs from the secret-operator.

The tlsSecretClass is needed to request TLS certificates, used e.g. for the Web UIs.

4. Verify that Kerberos authentication is required

Use stackablectl stacklet list to get the endpoints where the HDFS namenodes are reachable. Open the link (note that the namenode is now using https). You should see a Web UI similar to the following:

Screenshot: HDFS web UI with Kerberos enabled

The important part is

Security is on.

You can also shell into the namenode and try to access the file system: kubectl exec -it hdfs-namenode-default-0 -c namenode -- bash -c 'kdestroy && bin/hdfs dfs -ls /'

You should get the error message org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS].
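
Conversely, you can check that access succeeds once a valid ticket is present. The keytab path and the principal naming below are assumptions about how the Kerberos secrets are mounted into the namenode container; inspect the Pod and adjust as needed:

# List the principals contained in the mounted keytab (path is an assumption)
kubectl exec -it hdfs-namenode-default-0 -c namenode -- klist -kt /stackable/kerberos/keytab

# Obtain a ticket for one of the listed principals, then list the filesystem root
kubectl exec -it hdfs-namenode-default-0 -c namenode -- bash -c 'kinit -kt /stackable/kerberos/keytab <principal-from-klist-output> && bin/hdfs dfs -ls /'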

5. Access HDFS

If you want to access your HDFS, it is recommended to start a client Pod that connects to HDFS, rather than shelling into the namenode. We have an integration test for this exact purpose, where you can see how to connect and get a valid keytab.
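
Such a client Pod needs a keytab and a krb5.conf, which the secret-operator can provision through an ephemeral volume that references your SecretClass. The sketch below illustrates the idea; the image, the kerberos.service.names value, the scope and the mount path are assumptions, and the integration test linked above shows a complete, working setup:

apiVersion: v1
kind: Pod
metadata:
  name: hdfs-client
spec:
  containers:
    - name: hdfs-client
      # Assumption: any image that ships the HDFS client tools works here
      image: docker.stackable.tech/stackable/hadoop:3.3.6-stackable24.7.0
      command: ["sleep", "infinity"]
      env:
        - name: KRB5_CONFIG
          value: /stackable/kerberos/krb5.conf
      volumeMounts:
        - name: kerberos
          mountPath: /stackable/kerberos
  volumes:
    - name: kerberos
      ephemeral:
        volumeClaimTemplate:
          metadata:
            annotations:
              secrets.stackable.tech/class: kerberos-hdfs # Your SecretClass name
              # Assumption: the principal you want to act as
              secrets.stackable.tech/kerberos.service.names: developer
              secrets.stackable.tech/scope: service=hdfs-client
          spec:
            storageClassName: secrets.stackable.tech
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: "1"

Inside the Pod you would then run kinit against the mounted keytab before issuing hdfs dfs commands.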

Authorization

For authorization we developed hdfs-utils, which contains an OPA authorizer and group mapper. This matches our general OPA authorization mechanisms.

It is recommended to enable Kerberos when using authorization, as without authentication there are no security measures at all. There might still be cases where you want authorization on top of a cluster without authentication, e.g. because you use different users for different use-cases and want to avoid accidentally dropping files.

In order to use the authorizer you need a ConfigMap containing a rego rule and a reference to it in your HdfsCluster. In addition you need an OpaCluster that serves the rego rules - this guide assumes it's called opa.

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: hdfs-regorules
  labels:
    opa.stackable.tech/bundle: "true"
data:
  hdfs.rego: |
    package hdfs

    import rego.v1

    default allow := true

This rego rule is intended for demonstration purposes and allows every operation. For a production setup you will probably want something much more granular. We provide a more representative rego rule in our integration tests and in the aforementioned hdfs-utils repository. Details can be found below in the fine-granular rego rules section. Reference the rego rule as follows in your HdfsCluster:

spec:
  clusterConfig:
    authorization:
      opa:
        configMapName: opa
        package: hdfs

How it works

Take all your knowledge about HDFS authorization and throw it in the bin. The approach we are taking for our authorization paradigm departs significantly from traditional Hadoop patterns and POSIX-style permissions.

In short, the current rego rules ignore the file ownership, permissions, ACLs and all other attributes files can have. All of this is state in HDFS and clashes with the infrastructure-as-code approach (IaC).

Instead, HDFS will send a request detailing who (e.g. alice/test-hdfs-permissions.default.svc.cluster.local@CLUSTER.LOCAL) is trying to do what (e.g. open, create, delete or append) on what file (e.g. /foo/bar). OPA then makes a decision if this action is allowed or not.

Instead of chown-ing a directory to a different user to assign write permissions, you should go to your IaC Git repository and add a rego rule entry specifying that the user is allowed to read and write to that directory.
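
For example, instead of chown-ing /data to alice to grant her write access, the acls list in your rego rule could gain an entry along the following lines. The action and resource notation matches the rule outline shown further below; the user: identity prefix and the concrete path are assumptions for illustration:

{
    "identity": "user:alice/test-hdfs-permissions.default.svc.cluster.local@CLUSTER.LOCAL",
    "action": "rw",
    "resource": "hdfs:dir:/data/",
},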

Group memberships

We encountered several challenges while implementing the group mapper, the most serious being that the GroupMappingServiceProvider interface only passes the shortUsername when asking for group memberships. This does not allow us to differentiate between e.g. hbase/hbase-prod.prod-namespace.svc.cluster.local@CLUSTER.LOCAL and hbase/hbase-dev.dev-namespace.svc.cluster.local@CLUSTER.LOCAL, as the GroupMapper will be asked for hbase group memberships in both cases.

Users could work around this by using hadoop.security.auth_to_local to assign unique shortNames, but this is a potentially complex and error-prone process. We also tried mapping the principals from Kerberos 1:1 to the HDFS shortUserName, so that the GroupMapper has access to the full userName. However, this did not work, as HDFS only allows "simple usernames", which must not contain a / or @.

Because of these issues we do not use a custom GroupMapper and only rely on the authorizer, which in turn receives a complete UserGroupInformation object, including the shortUserName and the precious full userName. This has the downside that the group memberships used in OPA for authorization are not known to HDFS. The implication is thus that you cannot add users to the superuser group, which is needed for certain administrative actions in HDFS. We have decided that this is an acceptable approach as normal operations will not be affected. In case you really need users to be part of the superusers group, you can use a configOverride on hadoop.user.group.static.mapping.overrides for that.
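
A hedged sketch of such an override is shown below. The property hadoop.user.group.static.mapping.overrides and its semicolon-separated user=group syntax are standard Hadoop; the user name is made up for illustration, and you may need to apply the override to the other roles as well:

spec:
  nameNodes:
    configOverrides:
      core-site.xml:
        # Illustration: make the short user name "admin" a member of the supergroup,
        # while keeping the default empty mapping for dr.who
        hadoop.user.group.static.mapping.overrides: "dr.who=;admin=supergroup"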

Fine-granular rego rules

The hdfs-utils repository contains a more production-ready rego-rule here. With a few minor differences (e.g. Pod names) it is the same rego rule that is used in this integration test.

Access is granted by looking at three bits of information that must be supplied for every rego-rule callout:

  • the identity of the user

  • the resource requested by the user

  • the operation which the user wants to perform on the resource

Each operation has an implicit action-level attribute, e.g. create requires at least read-write (rw) permissions. This action attribute is then checked against the permissions assigned to the user by an ACL, and the operation is permitted if this check succeeds.

The basic structure of this rego rule is shown below (you can refer to the full rule here).

Rego rule outline
package hdfs

import rego.v1

# Turn off access by default.
default allow := false
default matches_identity(identity) := false

# Check access in order of increasing specificity (i.e. identity first).
# Deny access as "early" as possible.
allow if {
    some acl in acls
    matches_identity(acl.identity)
    matches_resource(input.path, acl.resource)
    action_sufficient_for_operation(acl.action, input.operationName)
}

# Identity checks based on e.g.
# - explicit matches on the (long) userName or shortUsername
# - regex matches
# - the group membership (simple- or regex-matches on long-or short-username)
matches_identity(identity) if {
    ...
}

# Resource checks on e.g.
# - explicit file- or directory-mentions
# - inclusion of the file in recursively applied access rights
matches_resource(file, resource) if {
    ...
}

# Check the operation and its implicit action against an ACL
action_sufficient_for_operation(action, operation) if {
    action_hierarchy[action][_] == action_for_operation[operation]
}

action_hierarchy := {
    "full": ["full", "rw", "ro"],
    "rw": ["rw", "ro"],
    "ro": ["ro"],
}


# This should contain a list of all HDFS actions relevant to the application
action_for_operation := {
    "abandonBlock": "rw",
    ...
}

acls := [
    {
        "identity": "group:admins",
        "action": "full",
        "resource": "hdfs:dir:/",
    },
    ...
]

The full file in the hdfs-utils repository contains extra documentary information, such as a listing of HDFS actions that would not typically be subject to an ACL. hdfs-utils also contains a test file that verifies the rules with a number of assertions. Take the test case below as an example:

test_admin_access_to_developers if {
    allow with input as {
        "callerUgi": {
            "shortUserName": "admin",
            "userName": "admin/test-hdfs-permissions.default.svc.cluster.local@CLUSTER.LOCAL",
        },
        "path": "/developers/file",
        "operationName": "create",
    }
}

This test passes through the following steps:

1. Does the user or group exist in the ACL?

Yes, a match is found on userName via the corresponding group (admins, yielded by the mapping groups_for_user; a sketch of such a mapping is shown after this walkthrough).

2. Does this user/group have permission to fulfill the specified operation on the given path?

Yes, as this ACL item

{
    "identity": "group:admins",
    "action": "full",
    "resource": "hdfs:dir:/",
},

matches the resource on

# Resource mentions a folder higher up the tree, which grants access recursively
matches_resource(file, resource) if {
    startswith(resource, "hdfs:dir:/")
    # directories need to have a trailing slash
    endswith(resource, "/")
    startswith(file, trim_prefix(resource, "hdfs:dir:"))
}

and the action permission required for the operation create (rw) is a subset of the ACL grant (full).
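
The group membership used in step 1 comes from a mapping maintained inside the rego rule itself. A minimal sketch of such a mapping is shown below; the name groups_for_user matches the walkthrough above, while the concrete structure is an assumption and may differ from the one in hdfs-utils:

# Map user names to the groups referenced by the ACL identities
groups_for_user := {
    "admin": ["admins"],
    "alice": ["developers"],
}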

The various checks for matches_identity and matches_resource are generic, given that the internal list of HDFS actions is comprehensive and the input structure is an internal implementation detail. This means that only the ACL needs to be adapted to specific customer needs.

Wire encryption

If Kerberos is enabled, Privacy mode is used for best security. Wire encryption without Kerberos, as well as other wire encryption modes, is not supported.
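
For reference, Privacy mode corresponds to the following standard Hadoop properties, which the operator configures for you (listed for orientation only; the exact set applied by the operator is an assumption and may differ):

# core-site.xml
hadoop.rpc.protection: privacy

# hdfs-site.xml
dfs.data.transfer.protection: privacy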