
Kubernetes Authentication - Comparing Solutions

January 30, 2023

by Marc Boorshtein

This post is a deep dive into comparing different solutions for authenticating to a Kubernetes cluster. The goal is to give you an idea of what the various solutions provide for a typical cluster deployment using production-capable configurations. We're also going to walk through the deployments to get an idea of how long each project takes to stand up, and look at common operations tasks for each solution. This blog post is written from the perspective of an enterprise deployment. If you're looking to run a Kubernetes lab, or use Kubernetes for a service provider, I think you'll still find this useful. We're not going to do a deep dive into how either OpenID Connect or Kubernetes authentication actually works. That's a deep topic that I cover in great detail in Kubernetes: An Enterprise Guide. That said, Packt has made that chapter free, with no need to register or provide an email address, so no one will spam you! This blog is going to read more like the chapter of a book than your typical blog post. It'll be broken down into a few sections:

  • Enterprise Requirements - If we're going to run our solution in an enterprise cluster, what does it need?
  • Deployment Assumptions - What can we assume is available?
  • Deployments - We're going to walk through deploying OpenUnison, Keycloak, Dex, and Pinniped based on our enterprise requirements into a fresh cluster
  • Operations - Once a product is deployed, how do we work with it? What are the common tasks you'll need to do every day?
  • Multi-Cluster - Much like a chip, you can't have just one! We'll explore how each project handles multi-cluster management.
  • Edge Cases - A hallmark of enterprise IT is you'll spend 80% of your budget on 20% of the use cases. A common edge case is "get your groups from someplace else", so we'll look at how to do that for each project.
  • Final Thoughts and Comparisons - Having worked through all these scenarios, what next?

The goal for all of the examples in this blog is for you to be able to use them on your own. I'll say upfront that this blog post is not unbiased; however, I'll do my best to show each project in its best light. With all that said, the next step is to examine our requirements.

Why Do This At All?

The first question you might ask is "why would I do this at all when my managed cluster uses my cloud's IAM or my distribution has its own solution?" In enterprises, management silos will often dictate who controls access to a cluster. For instance, if you're using a cloud-hosted cluster, does your cloud team want to be responsible for managing access? Do they want to be a potential bottleneck? When the enterprise already has a central authentication system, and potentially an authorization management system, it's much easier to externalize those components from the underlying cluster technology. It's easier on the cloud team because they're no longer responsible for a job that's not part of their core skill set, and it's easier for customers because they're now in control of who has access to their systems without a third party. Finally, externalizing authentication and authorization can ease compliance and audit requirements because much of that function can be offloaded to the identity team.

There's often a "build vs buy" decision in any kind of shared service, and authentication is no different. As a cluster owner, you'll need to balance the desire to be in control of your team's destiny with the budget spent on rebuilding the wheel. This post attempts to show how you can get both your enterprise level authentication without having to be at the behest of an external team for common operations.

Enterprise Requirements

Having gone through an overview of the post, the next step is to look at the requirements for our deployment. A typical enterprise deployment's requirements are:

  • Integration with an enterprise authentication store - This is generally going to be either LDAP/Active Directory or a cloud based SSO system like Okta or AzureAD. For this deployment, we're going to go with Okta.
  • Use enterprise groups for RBAC - One of the many benefits of external authentication is leveraging the enterprise's central group store for RBAC. This simplifies authorizations and auditing.
  • CLI Integration - Once authenticated, how do users get a kubectl configuration?
  • Pipeline Integration - How will pipelines authenticate to the cluster? Surely not using Kubernetes ServiceAccount tokens from outside the cluster, right?
  • Dashboard Access - It's an enterprise and GUIs are important in enterprise!
  • Cluster Management Applications - If we deployed ArgoCD as an example, how would users access it securely?
  • Security Operations - How is a user's access revoked? How do you upgrade the system? Do you need backups?
  • Compliance - Our cluster must be compliant with various regulations.

That's quite a few requirements! What am I basing them off of? Over twenty years of deploying enterprise identity systems as well as deploying and managing production Kubernetes clusters since 2016 in enterprises across the globe.

If you have some familiarity with Kubernetes and Okta, you might be asking yourself, "why do I need anything? Okta supports OIDC and Kubernetes supports OIDC!" This would be true if it were just about your cluster. What about your GUIs, dashboards, GitOps controllers, monitoring systems, etc.? Each cluster is made of more than just the API server, and each of these components has its own identity and its own login system. Your enterprise's central authentication system is often managed by a different silo that has other customers. If one cluster needs five or six applications with identity, and like most enterprises there are multiple clusters, you could quickly be working to integrate thirty to forty applications with your enterprise authentication system! Using an identity proxy, like all of the products in this blog, lets you use your central identity system across multiple applications without involving your central identity team.

Generic Authentication Architecture

The above diagram shows off what each cluster will look like from a generic perspective. We'll update this for each project. Okta is a SaaS service, so it sits outside of our cluster. Our dashboard doesn't know how to speak to authentication systems, so we'll need to provide a proxy. ArgoCD does know how to use OpenID Connect, so we'll be able to integrate that directly with our identity providers.

The good news about these requirements is that there's some help! Since most enterprises provide some of the things we'll need in some kind of capacity, the next section will let us make some assumptions.

Deployment Assumptions

With all of the above requirements, it'd be very burdensome to assume we have to do all of them ourselves. In reality, it's likely that even if we wanted to implement everything ourselves, we wouldn't be allowed to due to the rules most enterprises have. For instance, just because you could run your own database on Kubernetes doesn't mean your enterprise won't mandate that you use the one provided by the enterprise. Sometimes you're allowed to go it alone, but often you aren't. Sometimes it makes your life easier as an operator, and sometimes it doesn't. That said, here are the assumptions we'll make for each deployment we're going to walk through:

  • Databases will be provided by the enterprise - If a solution requires a database of any kind, it will not run on our cluster. This isn't because you shouldn't run databases on Kubernetes. It's because enterprises often spend very large amounts of money on centralized database services. This is to centralize knowledge and common tasks, like backups and restores. Even if we end up deploying a database for our walk through, we're going to pretend it's from outside the cluster.
  • Wildcard certificate for Ingress - Most enterprises centralize certificate issuance. Standing up a certificate authority for Kubernetes is very easy, but that's not the point. Most enterprises have compliance requirements that mean their certificate authorities have to follow certain rules, rules that don't work well with cloud native automation. In order to simulate this, we'll deploy each solution with a wildcard certificate pre-deployed. While most solutions have some way to deploy development certificates, we're looking for production deployment scenarios where these development certificates won't be acceptable.
  • Impersonation for Kubernetes Access - Most enterprises have a combination of on-premises and cloud managed clusters. This approach will provide the most uniform approach. Enterprises are all about uniformity, except for the millions of edge cases!
  • NGINX Ingress with a pre-existing Load Balancer - Ingress can be really complex, and this isn't a deep dive into Ingress.
  • DNS wildcard host - We'll be issued a single wildcard host. Since most enterprise DNS is centrally managed, it's typical to use wildcards whenever possible since the teams who manage DNS don't generally want to provide you the ability to create your own hosts or delegate subdomains.

For an impersonation proxy, we're going to use kube-oidc-proxy, originally from Jetstack. This is a great project that lets you specify an OIDC provider the same way the API server does, but uses impersonation to talk to the API server. Unfortunately, the original project hasn't had a commit since early 2021. Tremolo Security forked the project to keep it up to date and add some new features. Everything except Pinniped will use Tremolo Security's kube-oidc-proxy.
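
For reference, this is the style of OIDC configuration involved. These are the standard kube-apiserver OIDC flags, and kube-oidc-proxy takes the same style of flags; the values below are placeholders:

--oidc-issuer-url=https://idp.example.com/
--oidc-client-id=kubernetes
--oidc-username-claim=sub
--oidc-groups-claim=groups
--oidc-ca-file=/etc/ssl/certs/enterprise-ca.pem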

Each deployment is going to be run on Civo cloud. Why Civo? It's very easy for me to spin up new clusters with built-in storage, Ingress, and load balancers. That said, you'll be able to use these tutorials with any distribution. Now that we've walked through our requirements and assumptions, the next step is to start deploying!

Deployments

Each of these deployments will have as complete of instructions as possible. I'm not creating templated helm charts, everything will be simple YAML files that you can edit on your own. All of the manifests in this post are available in the post's GitHub repo.

Each deployment will be a Civo cluster with NGINX, the Kubernetes Dashboard, ArgoCD, and Cert Manager installed. For Okta, we're going to follow the previous instructions for deploying an Okta application with Kubernetes. We'll set this up once and just adjust the redirect URL as needed for each project we use. For DNS, I have a wildcard, *.blog.tremolo.dev, setup to point to the load balancer for each deployment.

Since we're working with a wildcard cert provisioned by cert-manager, we need to generate the wildcard and make it NGINX's default certificate. Once the cluster is running and DNS is set up, we'll run:

kubectl create -f https://raw.githubusercontent.com/TremoloSecurity/kubernetes-authentication/main/cluster/cert-manager-bootstrap.yaml
kubectl patch deployments.apps ingress-nginx-controller --type=json -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--default-ssl-certificate=ingress-nginx/wildcard-tls" }]' -n ingress-nginx

We'll also want to setup an Ingress for ArgoCD so we can access it without kubectl port-forward.

kubectl create -f https://raw.githubusercontent.com/TremoloSecurity/kubernetes-authentication/main/cluster/argocd-ingress.yaml
kubectl patch deployments argo-cd-argocd-server -n argocd -p '{"spec":{"template":{"spec":{"containers":[{"name":"server","command":["argocd-server","--staticassets","/shared/app","--repo-server","argo-cd-argocd-repo-server:8081","--logformat","text","--loglevel","info","--redis","argo-cd-argocd-redis:6379","--insecure"]}]}}}}'

The first command creates the Ingress objects, the second command makes them work with the ArgoCD deployment. We're removing Dex from ArgoCD since we're going to integrate directly.

Finally, we're going to set up Okta. We'll map a group from Okta to cluster-admin. I've documented this deployment in previous posts, but we'll do it here too.
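
As a reference point, here's a minimal sketch of binding the k8s-admin group (used throughout this post) to cluster-admin. The binding name is illustrative, and the exact group string depends on how your identity proxy presents Okta groups:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: okta-cluster-admins
subjects:
- kind: Group
  name: k8s-admin
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io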

First up, OpenUnison!

OpenUnison

I know, it's a bit cheesy to start with our own project. But, to be honest it's going to be the baseline. I'm not pretending this assessment is unbiased, but I am going to do my absolute best to show each project in its best light so I don't feel too bad about putting OpenUnison first. First, let's look at what our cluster will look like:

OpenUnison protected cluster

OpenUnison will serve multiple roles in our cluster:

  • Identity Proxy - OpenUnison will establish a trust with Okta, and provide that identity data to Kubernetes and ArgoCD as an identity provider
  • SSO Proxy for the Kubernetes Dashboard - The built in reverse proxy will inject identity data into each request to the dashboard
  • Cluster Access Portal - OpenUnison provides a portal that will tell us what we have access to and provide access, simplifying the overall user experience

In addition to OpenUnison, the helm charts will deploy the kube-oidc-proxy to provide access to our API server. If we wanted to, we could also integrate with the API server directly and skip this component, but we wanted to simplify the deployment process for this post. Since OpenUnison has an integrated SSO proxy, why use kube-oidc-proxy? The Kubernetes client-go SDK, which is likely the most popular client for Kubernetes and is used by kubectl and any of the local dashboards written in Go, uses the SPDY protocol for exec, cp, proxy, and other stream-based operations. Unfortunately, SPDY is a dead protocol outside of Kubernetes and there is very little support for it. Instead of writing our own SPDY protocol handler, we found it much easier to fork kube-oidc-proxy and keep it up to date.

With our architecture laid out, the next step is to deploy OpenUnison into our cluster. Once Okta is configured with the correct redirect, I downloaded the default values.yaml and made a few updates:

  1. Set hosts - Lines 2,3,4 all point to our wildcard DNS, which points to our load balancer
  2. Don't create ingress certificate - Line 8, tells the OpenUnison operator to not create a certificate. This is because we're going to use our enterprise wildcard certificate
  3. Enable Impersonation - Line 23, Since Civo doesn't support openid connect integration (as most managed Kubernetes won't), we'll use an impersonation proxy with the API server instead of connecting directly to our Civo API server
  4. Set impersonation CA certificate - Line 29, We need to tell the kube-oidc-proxy what certificate to trust. We'll create a Secret with our enterprise CA certificate
  5. Trust Enterprise CA - Line 42-44, We're explicitly trusting our enterprise CA certificate. This will embed the certificate into our kubectl configuration
  6. Configure OpenID Connect - Line 49-61, The only change we made here was to set the client_id and the issuer

With our values.yaml ready, we need to download the ouctl command and we're ready to deploy:

kubectl apply -f https://raw.githubusercontent.com/TremoloSecurity/kubernetes-authentication/main/openunison/openunison-bootstrap.yaml
echo -n "mysecret" > /tmp/okta
./ouctl install-auth-portal -s /tmp/okta /path/to/values.yaml

The first kubectl command creates the openunison namespace, a Secret for our enterprise CA certificate, and a ClusterRoleBinding for the k8s-admin group from Okta. The next command creates our Okta secret file, and the final command deploys OpenUnison. This last step will take a few minutes, but once it's done, we're ready to access our cluster!

Accessing our Cluster

Once deployed, you can now login:

OpenUnison Main Portal

There are three ways we can access our cluster via OpenUnison:

  1. Kubernetes Dashboard
  2. Generate a kubectl configuration from the portal
  3. Use the oulogin kubectl plugin

The first option is the simplest: just click on the Dashboard badge and you're logged in! The second option lets you generate a kubectl command for macOS/Linux or Windows that includes everything you need, including all certificates. This means you don't need to distribute any configuration files to users. Many users don't enjoy this method though; they want to work directly from the CLI. The oulogin plugin handles that scenario. It's a kubectl plugin that will pop open a browser, authenticate you, then generate your kubectl configuration. Again, there's no need to pre-distribute a configuration for your cluster.
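
For example, assuming the portal is reachable at k8sou.blog.tremolo.dev and using the plugin's --host flag to point at it (both specific to this deployment), the CLI flow looks something like:

kubectl oulogin --host=k8sou.blog.tremolo.dev

The plugin opens a browser, you authenticate through Okta, and your kubectl context is created for you.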

Operations

There's no data to speak of, so there's not much for an operator to do day-to-day with OpenUnison. Upgrading is pretty straightforward: watch for new versions of the helm charts! The container is monitored for patched CVEs and rebuilt as those CVEs are released from Canonical.

For monitoring, OpenUnison provides Prometheus endpoints, which will tell you that the system is running, how many users are logged in, and all of the other stats available from Java's Prometheus integration. OpenUnison requires that Prometheus use its ServiceAccount when contacting OpenUnison's metrics endpoint, adding security to the monitoring process.

Security and Compliance

OpenUnison uses one-minute tokens with a default of 15-minute refresh tokens. This means that if someone were to get access to the id_token OpenUnison generates for a user, it's unlikely that the attacker could do anything useful with the token. Your clients are going to be constantly refreshing, and that's OK! It also means that you can revoke a token by deleting the oidc-session object corresponding to your user in the openunison namespace. On the next refresh, the user's tokens will fail to refresh and the user will have to login again. Also, explicitly logging out of the portal will terminate your sessions, including the CLI sessions.
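
As a sketch, and assuming the session custom resource is exposed to kubectl as oidc-sessions, revoking a user looks something like:

# find the session that corresponds to the user
kubectl get oidc-sessions -n openunison
# delete it; the user's next refresh will fail and they'll have to login again
kubectl delete oidc-session <session-name> -n openunison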

Edge Cases

OpenUnison provides several places to make customizations and hooks to do additional work. Retrieving group names from Azure Active Directory is an example of OpenUnison's customization capability. AzureAD doesn't return group names in the id_token, and if there are a large number of group memberships it might need to return a page of results. OpenUnison can be updated via custom resources to look these groups up. A similar technique can be used with other data stores using YAML and JavaScript without having to rebuild or redeploy the containers.

Pipeline Integration

The API that the OpenUnison portal and the ouctl command use to generate a kubectl configuration can be used with other forms of authentication too. Imagine you have a GitHub action that needs to talk to your cluster. You can tell OpenUnison to authenticate the GitHub action's OIDC token and authorize for certain projects and repos. Then the action can download the configuration and use kubectl just like a user. There's an example of this in Chapter 5 of Kubernetes: An Enterprise Guide where we use LDAP to authenticate a service account. We're able to do this because we have created purpose built integration points with Kubernetes, making it much easier to customize.

Cluster Management Applications

Our ArgoCD needs to be integrated with our OpenUnison. The process is to:

  1. Setup ArgoCD for OIDC integration
  2. Create a Trust object for OpenUnison
  3. Create a portal badge

This of course assumes that our cluster management applications support OpenID Connect and will need the same information as our cluster. If that's not the case, such as with Prometheus, you can configure applications behind OpenUnison's identity proxy too.

Setting up SSO with ArgoCD is pretty easy:

kubectl apply -f https://raw.githubusercontent.com/TremoloSecurity/kubernetes-authentication/main/openunison/openunison-argocd.yaml

This command creates four objects. Two are in the argocd namespace and tell ArgoCD to use OpenUnison for OpenID Connect and to treat the k8s-admin group from Okta as an ArgoCD admin. The other two are the Trust and PortalURL objects that tell OpenUnison to provide authentication data to ArgoCD and add a badge on the front portal. We didn't configure a client secret because we wanted to support the argocd cli, so we made the endpoint public.
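
To give a sense of what the ArgoCD side of that manifest configures, here's an abbreviated sketch; the hostnames and the OpenUnison issuer path are illustrative assumptions, and the real values come from the manifest above:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  url: https://argocd.blog.tremolo.dev
  oidc.config: |
    name: OpenUnison
    issuer: https://k8sou.blog.tremolo.dev/auth/idp/k8sIdp
    clientID: argocd
    requestedScopes: ["openid", "profile", "email", "groups"]
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.csv: |
    g, k8s-admin, role:admin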

Multi Cluster

OpenUnison has built in multi-cluster capabilities by deploying an OpenUnison onto a control plane cluster, with each satellite cluster using the control plane's OpenUnison as an identity provider.

OpenUnison Multi-Cluster Authentication

This provides multiple advantages over having each cluster integrate directly with Okta:

  • Local Integration Control - Most enterprises have a centralized team for Okta (and other identity services) with limited resources and other application teams that need to integrate. By integrating just a control plane instance the Kubernetes team can control how to integrate new clusters and when.
  • Delegated Control - Cluster owners can take responsibility for integrating their applications into OpenUnison, without involving the Kubernetes platform team.
  • Better Security - The secrets needed to integrate with Okta only need to be stored in your control plane, limiting who has access and potential leaks.

OpenUnison's access portal also makes for a convenient way for developers to know which clusters and applications they have access to. This makes for simpler operations because you don't need to provide an additional system or distribute a kubectl configuration file. Everything is available from a central location.

Final Thoughts

OpenUnison is our own project, so I would expect you to be skeptical and invite you to give OpenUnison a try. You might have come here looking for how to deploy Keycloak, Dex, or Pinniped, but I think you'll see that OpenUnison fulfills your cluster identity needs with a much easier-to-manage solution.

That said, next, we'll look at Keycloak.

Keycloak

Keycloak is a generic authentication system built by Red Hat. It's 100% open source and has a great reputation for scalability and power. We're going to follow the instructions for the Keycloak Operator, since that seems to be the preferred method per the project's website. Keycloak requires a database. You can deploy it with an embedded database, but this isn't a production-viable approach. As we discussed earlier, we'll assume that our enterprise has its own database team and not focus on getting a database up and running. The operator documentation relies on PostgreSQL, so that's what we'll use. Thankfully, Civo can deploy PostgreSQL directly into our cluster. Here's what our architecture will look like:

Architecture diagram for KeyCloak and Kubernetes

The architecture is more complex than OpenUnison's. First off, you'll see we have a database outside of the cluster. You'll also see that we're still using Tremolo Security's kube-oidc-proxy for accessing the API server. Keycloak doesn't include any kind of proxy solution, so we need to use something to augment it. If you look at the dashboard, we're combining the kube-oidc-proxy and the OAuth2 Proxy. We can use the kube-oidc-proxy to send impersonation headers to the dashboard, but it doesn't know how to initiate authentication with an OIDC identity provider; it assumes that you'll include an id_token on each request. The OAuth2 Proxy is commonly used with on-prem clusters to provide secure access to dashboards that can use the same tokens as the API server, but it doesn't know how to generate impersonation headers. Combining the two gives us a secure access mechanism for the dashboard. We'll use this same mechanism for Dex and Pinniped too.

What you won't see with Keycloak is a portal for centralized access. The project doesn't have one.

Now that our architecture is designed, the next step is to deploy the Keycloak operator:

kubectl apply -f https://raw.githubusercontent.com/keycloak/keycloak-k8s-resources/20.0.0/kubernetes/keycloaks.k8s.keycloak.org-v1.yml
kubectl apply -f https://raw.githubusercontent.com/keycloak/keycloak-k8s-resources/20.0.0/kubernetes/keycloakrealmimports.k8s.keycloak.org-v1.yml
kubectl create ns keycloak
kubectl apply -f https://raw.githubusercontent.com/keycloak/keycloak-k8s-resources/20.0.0/kubernetes/kubernetes.yml -n keycloak

With the operator deployed, the next step is to create a Secret that the operator uses to tell Keycloak how to talk to the database. I'm getting these credentials from the postgres-config ConfigMap in the default namespace:

kubectl create secret generic keycloak-db-secret -n keycloak \
  --from-literal=username=[your_database_username] \
  --from-literal=password=[your_database_password]

Next, we need a certificate for the Keycloak deployment. To make life a little easier, we're going to generate an internal CA and let cert-manager do the work of generating certificates. We're not using the "enterprise" CA we created because in most enterprises, you wouldn't be able to do this. Generating our own certificates internally for inter-cluster communication is fine though.

kubectl apply -f https://raw.githubusercontent.com/TremoloSecurity/kubernetes-authentication/main/keycloak/keycloak-cert.yaml

With our certificate in place, the next step is to deploy Keycloak using the Keycloak custom resource and an Ingress. We're not using the Ingress created by the operator because we want to use our default wildcard instead of a custom-built certificate:

kubectl create -f https://raw.githubusercontent.com/TremoloSecurity/kubernetes-authentication/main/keycloak/keycloak-deploy.yaml

Now with Keycloak deployed, we can access it from our browser. To access the admin console, we need to get the default credentials that were generated for us.

kubectl get secret cluster-kc-initial-admin -n keycloak -o json | jq -r '.data.password' | base64 -d

Now that we're in the admin console, the first thing we need to do is secure it. Since the admin console is accessible through the same Ingress everything else uses, you don't really want to leave it with just the admin username and password. That would likely violate any compliance requirements you have because the admin user can't be tracked to a person. We're going to integrate it with our Okta site.

First, login to Keycloak using the admin credentials and click Groups:

Click on Create group and create a group called keycloak-admins:

Click on your new group:

Click on Role Mapping then click Assign Role:

Choose admin:

With our admin group set, next we need to create our identity provider that will link to Okta. Choose Identity Providers and click on OpenID Connect v1.0

Set the options as shown based on your Okta configuration:

Update your Okta settings with the Redirect URI from the next screen, open the Advanced section and set your scopes to openid email profile groups:

Click on Mappers, Add Mapper

Set the email claim to map to email:

Repeat for firstName and lastName. Add another mapper to map the k8s-admin group in Okta to the group we created earlier:

At this point, any Okta user with the group k8s-admin will be mapped to the keycloak-admins group. Now if your Okta uses MFA, so does the admin of your Keycloak instance! Except we still have a password that can back-door Keycloak. That login needs to be disabled, but unfortunately you can't do that without customizing the theme. The best approach would be to just disable the admin user so it can't be abused, but we don't really want to do that either, because if the connection with Okta is broken there's no way to re-enable the admin user. To get around this issue, we'll force the user to use MFA. Keycloak, by default, uses TOTP. You'll need either Google Authenticator or one of the many other authenticator apps that support TOTP. Go to the Users screen and choose the admin user. Next to Required user actions choose Configure OTP.

This requires that the admin user register their OTP on the next login. Next, click on the Authentication menu and choose the Required Actions tab. Next to Configure OTP, make sure it's set as a default action.

Now, logout and log back in. You'll be prompted with a QR code to scan in. PRINT THIS SCREEN TO PAPER, LAMINATE IT, AND STORE IT IN A FIREBOX. With Keycloak's admin access secured, we can now focus on our requirements for integrating Kubernetes.

As we stated above, we're going to use kube-oidc-proxy to integrate with our cluster. If your cluster deployment supports direct OIDC integration, once the identity provider is stood up, you can directly integrate and skip the kube-oidc-proxy deployment.

That said, the first step is to set up a new realm. We could re-use the existing realm, but since that realm is designed to manage the entire system, it's best to set up a new one. That way, if there's a problem with the configuration it won't impact the administration of Keycloak:

I'm calling my realm cluster. Once the realm is created, add a group called k8s-admins. We'll map this group to the RBAC cluster-admin ClusterRole. Once the group is created, repeat the same steps from earlier to integrate with Okta, but in the cluster realm, including the attribute and group mapping. If you have multiple groups you want to map to multiple ClusterRoles and Roles, you'll need to repeat this process manually. There is unfortunately no easy way to automatically merge groups from a remote OIDC identity provider, like Okta. If you're using LDAP or AD, this is much easier since Keycloak will synchronize those groups for you.

Once Okta is integrated, the next step is to create a client for the kube-oidc-proxy we're going to deploy. Creating this client is different than creating clients for most applications because Kubernetes doesn't have an interface for interacting with a user. There are a couple of ways you can do this. For this post, we're going to use the kubelogin kubectl plugin. This plugin works similarly to the oulogin kubectl plugin, except you need to pre-configure your kubectl configuration for your cluster. When we set up our client, we'll configure it to work with our local plugin instead of integrating with a web application. First, create a client:

On the first screen, set the client id and name to kube-login:

On the next screen, keep the Standard flow but remove direct access grants:

On the next screen, use http://localhost:8000 as your Valid redirect URIs and make sure Client authentication is Off:

We're disabling client authentication so that we don't need to distribute a client secret to each user. Most enterprise compliance groups will treat this as a password, making it nearly impossible to distribute and effectively useless. Next, click on the Advanced tab:

Then, scroll to the bottom where the Advanced Settings are. Set Access Token Lifespan to 1 minute. This way, if someone were to compromise a token, it will likely be useless by the time an attacker can use it.

With our client configured, the next step is to add groups to our profile. Keycloak doesn't do this automatically. Click on Client scopes and then click on profile:

Click on Mappers and then Add mapper:

Choose By configuration, then on the next screen choose Group Membership:

On the next screen, enter groups for both the mapper name and the attribute name and click Save:

With Keycloak finally set up, we're ready to deploy kube-oidc-proxy. I put together a single manifest with all the configuration objects. Deploying is pretty straightforward:

kubectl apply -f https://raw.githubusercontent.com/TremoloSecurity/kubernetes-authentication/main/keycloak/keycloak-kube-oidc-proxy.yaml

This manifest creates a new namespace, a Deployment for our proxy, a Secret for the connection from our NGINX ingress to our proxy, a Service, and an Ingress. There's also a ClusterRoleBinding to the /k8s-admins group so that our user has cluster administrator access. The final step before we can access our cluster is to set up our local kubectl configuration file. Because we don't have an interface or system to pull our configuration from, we need to build it. I created one here:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURFVENDQWZtZ0F3SUJBZ0lVYmtiS2ZRN29ldXJuVHpyeWdIL0dDS0kzNkUwd0RRWUpLb1pJaHZjTkFRRUwKQlFBd0dERVdNQlFHQTFVRUF3d05aVzUwWlhKd2NtbHpaUzFqWVRBZUZ3MHlNakV4TURjeE5EUTFNakphRncwegpNakV4TURReE5EUTFNakphTUJneEZqQVVCZ05WQkFNTURXVnVkR1Z5Y0hKcGMyVXRZMkV3Z2dFaU1BMEdDU3FHClNJYjNEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUJBUUNucVZ3eVFvMjJyRzZuVVpjU2UvR21WZnI5MEt6Z3V4MDkKNDY4cFNTUWRwRHE5UlRRVU92ZkFUUEJXODF3QlJmUDEvcnlFaHNocnVBS2E5LzVoKzVCL3g4bmN4VFhwbThCNwp2RDdldHY4V3VyeUtQc0lMdWlkT0QwR1FTRVRvNzdBWE03RmZpUk9yMDFqN3c2UVB3dVB2QkpTcDNpa2lDL0RjCnZFNjZsdklFWE43ZFNnRGRkdnV2R1FORFdPWWxHWmhmNUZIVy81ZHJQSHVPOXp1eVVHK01NaTFpUCtSQk1QUmcKSWU2djhCcE9ncnNnZHRtWExhNFZNc1BNKzBYZkQwSDhjU2YvMkg2V1M0LzdEOEF1bG5QSW9LY1krRkxKUEFtMwpJVFI3L2w2UTBJUXVNU3c2QkxLYWZCRm5CVmNUUVNIN3lKZEFKNWdINFZZRHIyamtVWkwzQWdNQkFBR2pVekJSCk1CMEdBMVVkRGdRV0JCU2Y5RDVGS3dISUY3eFdxRi80OG4rci9SVFEzakFmQmdOVkhTTUVHREFXZ0JTZjlENUYKS3dISUY3eFdxRi80OG4rci9SVFEzakFQQmdOVkhSTUJBZjhFQlRBREFRSC9NQTBHQ1NxR1NJYjNEUUVCQ3dVQQpBNElCQVFCN1BsMjkrclJ2eHArVHhLT3RCZGRLeEhhRTJVRUxuYmlkaFUvMTZRbW51VmlCQVhidUVSSEF2Y0phCm5hb1plY0JVQVJ0aUxYT2poOTFBNkFvNVpET2RETllOUkNnTGI2czdDVVhSKzNLenZWRmNJVFRSdGtTTkxKMTUKZzRoallyQUtEWTFIM09zd1EvU3JoTG9GQndneGJJQ1F5eFNLaXQ0OURrK2V4c3puMUJFNzE2aWlJVmdZT0daTwp5SWF5ekJZdW1Gc3M0MGprbWhsbms1ZW5hYjhJTDRUcXBDZS9xYnZtNXdOaktaVVozamJsM2QxVWVtcVlOdVlWCmNFY1o0UXltQUJZS3k0VkUzVFJZUmJJZGV0NFY2dVlIRjVZUHlFRWlZMFRVZStYVVJaVkFtaU9jcmtqblVIT3gKMWJqelJxSlpMNVR3b0ZDZzVlZUR6dVk0WlRjYwotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t
    server: https://oidc-proxy-api.blog.tremolo.dev
  name: kube-oidc-proxy
contexts:
- context:
    cluster: kube-oidc-proxy
    user: kube-oidc-proxy
  name: kube-oidc-proxy
current-context: kube-oidc-proxy
kind: Config
preferences: {}
users:
- name: kube-oidc-proxy
  user: 
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: kubectl
      args:
      - oidc-login
      - get-token
      - --oidc-issuer-url=https://keycloak.blog.tremolo.dev/realms/cluster
      - --oidc-client-id=kube-login

Create a file with this configuration, then run any kubectl command. Kubectl will pop open a browser and ask you to authenticate.
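
For example, assuming you saved the file as keycloak-kubeconfig.yaml:

export KUBECONFIG=$PWD/keycloak-kubeconfig.yaml
kubectl get nodes

The oidc-login plugin opens a browser to Keycloak the first time a token is needed, then reuses the token until it expires.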

With our CLI configured, next we'll configure access to the dashboard. Since we're using impersonation with the API server, we're also going to use impersonation with the dashboard. This becomes tricky because while the kube-oidc-proxy we used for the API server will validate an id_token and translate it into headers, it can't manage authentication and session management. For that, we'll use the OAuth2 Proxy. We'll deploy the OAuth2 Proxy in front of the kube-oidc-proxy so that it handles the user interaction with the identity provider (Keycloak), while kube-oidc-proxy provides the translation to impersonation headers. First, we need to set up a new client in Keycloak:

After hitting Next, configure the client to require authentication and not support direct grant:

On the next screen, you'll need to set the redirect URL:

Save, then click on Credentials tab and copy the Client secret:

Put that secret in a safe place; we'll need it when we set up the OAuth2 Proxy. Download the dashboard manifest, and update the client_secret on line 144 with the value. This manifest does several things:

  • Creates a namespace that holds the OAuth2 Proxy and the kube-oidc-proxy
  • Creates a ConfigMap that stores CA certificates for our enterprise CA and our cluster CA
  • Creates a Deployment for the OAuth2 Proxy
  • Creates a Deployment for kube-oidc-proxy
  • Creates Certificates for TLS for the OAuth2 Proxy, the kube-oidc-proxy, and the Kubernetes Dashboard

First, delete the Secret for the dashboard's default certificate:

kubectl delete secret kubernetes-dashboard-certs -n kubernetes-dashboard

Next, add the manifest you updated with your client_secret:

kubectl create -f /path/to/manifest.yaml

Finally, patch your dashboard to load the certificates from the Secret created by cert-manager:

kubectl patch deployments kubernetes-dashboard -n kubernetes-dashboard -p '{"spec":{"template":{"spec":{"containers":[{"name":"kubernetes-dashboard","args":["--namespace=kubernetes-dashboard","--tls-key-file=tls.key","--tls-cert-file=tls.crt"]}]}}}}'

Once the pods are created, you'll be able to log in to your Kubernetes Dashboard securely. This works because the OAuth2 Proxy generates the correct token, while kube-oidc-proxy translates the id_token into impersonation headers.
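
For reference, the heart of that manifest is the OAuth2 Proxy configuration. A trimmed sketch of the relevant flags follows; the client id and upstream service address are placeholders here, and the real values are in the manifest you downloaded:

--provider=oidc
--oidc-issuer-url=https://keycloak.blog.tremolo.dev/realms/cluster
--client-id=<your-dashboard-client-id>
--client-secret=<from the Keycloak Credentials tab>
--email-domain=*
--upstream=https://<kube-oidc-proxy-service>
--pass-authorization-header=true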

Operations

There are four primary operations with Keycloak:

  1. Database backups - Since Keycloak has a persistent database, you'll need to have an operations plan for backups and restores (a minimal example follows this list).
  2. Upgrades - The Keycloak upgrade guide recommends making a backup of your database prior to upgrades. This is an important step, and you should coordinate with your DBA team to arrange for backups and potentially restores on a rollback.
  3. Update of secrets - When your Okta client_secret needs to be updated, you'll need to login to your GUI to update it.
  4. Delete login sessions - If a user's session needs to be closed, you can login to the Keycloak GUI to do it.
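
As a minimal sketch of the backup task (the host, user, and database names are illustrative, and your DBA team likely has its own tooling):

pg_dump -h <database-host> -U <keycloak-db-user> -d keycloak -F c -f keycloak-$(date +%F).dump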

In addition to these management tasks, Keycloak does not have any kind of built-in monitoring. You can add a Prometheus endpoint, but it requires installing additional open source software into your Keycloak deployment.

Security and Compliance

Similar to OpenUnison, we configured our tokens to have one-minute lifetimes, with 15-minute refresh tokens. Should an id_token be compromised, it will only be good for a minute and is unlikely to be useful to an attacker.

Edge Cases

Keycloak offers three main ways to customize:

  1. APIs - Keycloak supports an administrator API that can perform all of the functions of the UI
  2. Customization through Java - There are several customization and extension points using any language that can compile to Java byte code.
  3. Custom Themes - Keycloak comes with a couple of themes, but if you want your own theme for your organization, you can build it.

The main downside to options two and three is that you'll need to build a customized container to include your updates, adding additional complexity to your build and deployment process.

To address the use case of storing groups in AzureAD, there are two ways you can do this with Keycloak:

  • Create an external synchronization system - Using the APIs for Keycloak and AzureAD you could synchronize the AzureAD groups into Keycloak
  • Create a custom mapper to load AzureAD groups just-in-time - Write a custom mapper in Java that can load the groups from AzureAD when a user logs in

The first approach is more maintainable, since it doesn't require re-packaging Keycloak. The second approach provides the simplest implementation from a coding perspective, since you don't need to worry about catching changes in the AzureAD groups.

Pipeline Integration

Since Keycloak supports multiple authentication methods per realm, you can create an additional authentication method on the realm for pipelines, either using a token or a credential from an external source like an LDAP directory. Keycloak also supports a token exchange service that could be used as well.

Cluster Management Applications

Adding additional cluster management applications depends on the application. If the application supports OpenID Connect, such as our ArgoCD instance, we can simply configure a new client in our cluster realm. If OpenID Connect (or SAML2) isn't supported, you'll need to deploy a proxy, such as the OAuth2 Proxy, to manage the authentication and inject the user's identity into the application. You can look to our Kubernetes Dashboard setup as a template for adding new applications.

Keycloak used to have its own reverse proxy, but it was discontinued some time ago.

Multi Cluster

Keycloak is a heavy deployment, with a GUI that needs to be secured and used for management. This makes replicating its deployment across multiple clusters difficult. Instead, the easiest model is to use a centralized Keycloak on a control plane cluster, directly integrating clusters into it. If you want the benefits of having a local authentication system on each cluster, similar to OpenUnison, you could combine a centralized Keycloak with Dex on satellite clusters. This would mean you're now managing two identity systems instead of one.

Final Thoughts

Keycloak is a powerful authentication system that scales well. It can provide an authentication platform for multiple applications and clusters. If you were to print out this blog, the Keycloak section alone would take thirteen pages! While this guide is meant to provide a production-ready deployment, we ignored customizing the theme and setting request limits since these steps will depend greatly on your environment.

Finally, there's no standalone commercial support for Keycloak. The only way to get commercial support is to purchase OpenShift or JBoss, which come with Red Hat SSO (the commercial version of Keycloak); Red Hat doesn't sell support for Keycloak on its own.

Dex

Dex is probably the oldest and most well known of the options for Kubernetes OIDC integration. It can handle authentication to LDAP, SAML, OpenID Connect, and GitHub (along with a few others). Dex is small and lightweight, with configuration mostly through a single YAML object. The project is "owned" by Red Hat, but it's a community-managed project. It was originally developed by the folks at CoreOS and included in their Tectonic product, but when Red Hat acquired CoreOS, Dex was left to the community to continue to build and update (which it has!). Because Dex is small and lightweight, it's not unusual to see it embedded into cluster management applications instead of maintaining a more complex identity stack. For example, ArgoCD ships with Dex built in for more complex use cases (such as GitHub integration). The Dex deployment will look very similar:

Dex architecture with Kubernetes

Dex deploys with a helm chart. The deployment process is:

  1. Create a Secret that stores the Dex configuration
  2. Deploy the helm chart
  3. Deploy kube-oidc-proxy for API server access
  4. Deploy oauth2 proxy+kube-oidc-proxy for secure dashboard access

We're going to configure everything for Dex in a single Secret, including all clients. Dex does support a dynamic approach to client creation, but you are expected to use the gRPC API, and the Dex project doesn't provide any CLI for that API. We're going to use the Kubernetes backend for storing data, which could include clients. The problem with this approach is that Dex doesn't publish a schema for its client CRD, so it could change at any time. Also, the client secret is stored as plain text in the client custom resource, which would make GitOps problematic (since you don't ever want to store secrets in git!). Given the likely use cases, maintaining a single Secret is the easiest route. This way, you can store this Secret's contents in a vault and synchronize it into your environment in conjunction with your GitOps controller. You'll need to redeploy Dex to get it to reload the changes, but given how lightweight Dex is, this shouldn't be an availability issue.
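
To give a sense of what lives in that Secret, here's an abbreviated sketch of a Dex configuration along these lines. The Okta issuer, client settings, and secrets are placeholders, and the full version is in the manifest linked below:

issuer: https://dex.blog.tremolo.dev/dex
storage:
  type: kubernetes
  config:
    inCluster: true
web:
  https: 0.0.0.0:5556
  tlsCert: /certs/tls.crt
  tlsKey: /certs/tls.key
connectors:
- type: oidc
  id: okta
  name: Okta
  config:
    issuer: https://<your-okta-domain>.okta.com
    clientID: <okta-client-id>
    clientSecret: <okta-client-secret>
    redirectURI: https://dex.blog.tremolo.dev/dex/callback
    scopes: ["profile", "email", "groups", "offline_access"]
staticClients:
- id: kube-login
  name: kube-login
  public: true        # or configure a secret and pass it to the kubelogin plugin
  redirectURIs:
  - http://localhost:8000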

The first important step is to login to your Okta dashboard, and update your application to support refresh tokens:

This is important for Dex, because Dex will validate your session with Okta on each refresh using the refresh token returned by Okta. If you are logged out of Okta during your session, on the next refresh Dex will fail to give you a new id_token, breaking your ability to access Kubernetes.

Once you have your Okta configuration updated, the next step is to download the cert-manager configuration for our cluster-internal CA and our Dex configuration. You'll need to update your Okta configuration (starting on line 78) and replace dex.blog.tremolo.dev with your own host. Once that's done, apply the manifest. The next step is to get the values.yaml configured. Create a file with the below values:

https:
  enabled: true
configSecret:
  create: false
  name: dex-config
volumes:
- name: dex-certs
  secret:
    secretName: dex-tls
volumeMounts:
- name: dex-certs
  mountPath: /certs
ingress:
  enabled: false

This values file tells the chart not to create a Secret for configuration and to mount the certificate we created from our cluster CA so we have end-to-end encryption. With the values file in hand, next deploy the helm chart:

helm repo add dex https://charts.dexidp.io
helm install dex dex/dex -n dex -f /path/to/dex-values.yaml

At this point you should be able to access Dex's OpenID Connect discovery document. Since our host is dex.blog.tremolo.dev, I was able to test that Dex was set up by going to https://dex.blog.tremolo.dev/dex/.well-known/openid-configuration. When a bunch of JSON was returned, I knew that Dex was set up properly. Next we need to integrate Dex with our API server. Similarly to Keycloak, we'll deploy the kube-oidc-proxy. The client is already configured in our Dex configuration file, so integrating with Dex requires downloading the proxy manifest and updating the URLs for your environment. Once that's deployed, the last step is to configure kubectl for access. Use the below configuration file, updating host names:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURFVENDQWZtZ0F3SUJBZ0lVYmtiS2ZRN29ldXJuVHpyeWdIL0dDS0kzNkUwd0RRWUpLb1pJaHZjTkFRRUwKQlFBd0dERVdNQlFHQTFVRUF3d05aVzUwWlhKd2NtbHpaUzFqWVRBZUZ3MHlNakV4TURjeE5EUTFNakphRncwegpNakV4TURReE5EUTFNakphTUJneEZqQVVCZ05WQkFNTURXVnVkR1Z5Y0hKcGMyVXRZMkV3Z2dFaU1BMEdDU3FHClNJYjNEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUJBUUNucVZ3eVFvMjJyRzZuVVpjU2UvR21WZnI5MEt6Z3V4MDkKNDY4cFNTUWRwRHE5UlRRVU92ZkFUUEJXODF3QlJmUDEvcnlFaHNocnVBS2E5LzVoKzVCL3g4bmN4VFhwbThCNwp2RDdldHY4V3VyeUtQc0lMdWlkT0QwR1FTRVRvNzdBWE03RmZpUk9yMDFqN3c2UVB3dVB2QkpTcDNpa2lDL0RjCnZFNjZsdklFWE43ZFNnRGRkdnV2R1FORFdPWWxHWmhmNUZIVy81ZHJQSHVPOXp1eVVHK01NaTFpUCtSQk1QUmcKSWU2djhCcE9ncnNnZHRtWExhNFZNc1BNKzBYZkQwSDhjU2YvMkg2V1M0LzdEOEF1bG5QSW9LY1krRkxKUEFtMwpJVFI3L2w2UTBJUXVNU3c2QkxLYWZCRm5CVmNUUVNIN3lKZEFKNWdINFZZRHIyamtVWkwzQWdNQkFBR2pVekJSCk1CMEdBMVVkRGdRV0JCU2Y5RDVGS3dISUY3eFdxRi80OG4rci9SVFEzakFmQmdOVkhTTUVHREFXZ0JTZjlENUYKS3dISUY3eFdxRi80OG4rci9SVFEzakFQQmdOVkhSTUJBZjhFQlRBREFRSC9NQTBHQ1NxR1NJYjNEUUVCQ3dVQQpBNElCQVFCN1BsMjkrclJ2eHArVHhLT3RCZGRLeEhhRTJVRUxuYmlkaFUvMTZRbW51VmlCQVhidUVSSEF2Y0phCm5hb1plY0JVQVJ0aUxYT2poOTFBNkFvNVpET2RETllOUkNnTGI2czdDVVhSKzNLenZWRmNJVFRSdGtTTkxKMTUKZzRoallyQUtEWTFIM09zd1EvU3JoTG9GQndneGJJQ1F5eFNLaXQ0OURrK2V4c3puMUJFNzE2aWlJVmdZT0daTwp5SWF5ekJZdW1Gc3M0MGprbWhsbms1ZW5hYjhJTDRUcXBDZS9xYnZtNXdOaktaVVozamJsM2QxVWVtcVlOdVlWCmNFY1o0UXltQUJZS3k0VkUzVFJZUmJJZGV0NFY2dVlIRjVZUHlFRWlZMFRVZStYVVJaVkFtaU9jcmtqblVIT3gKMWJqelJxSlpMNVR3b0ZDZzVlZUR6dVk0WlRjYwotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t
    server: https://oidc-proxy-api.blog.tremolo.dev
  name: kube-oidc-proxy
contexts:
- context:
    cluster: kube-oidc-proxy
    user: kube-oidc-proxy
  name: kube-oidc-proxy
current-context: kube-oidc-proxy
kind: Config
preferences: {}
users:
- name: kube-oidc-proxy
  user: 
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: kubectl
      args:
      - oidc-login
      - get-token
      - --oidc-issuer-url=https://dex.blog.tremolo.dev/dex
      - --oidc-client-id=kube-login
      - --oidc-extra-scope=groups
      - --oidc-extra-scope=profile
      - --oidc-extra-scope=email
      - --oidc-extra-scope=offline_access

This is very similar to the configuration from Keycloak, except with additional scopes. I'm not sure why, but the oidc-login plugin didn't add the offline_access scope for Dex, so it needed to be added manually. Now that the API server is configured, the next step is to configure the dashboard. We'll use a similar setup for the dashboard with Dex as we did with Keycloak, combining the OAuth2 Proxy and kube-oidc-proxy. We already have the client in our Dex configuration; you'll want to make sure you update it with your correct host names and probably a new client secret. Then, download the Dex dashboard manifest and update the hosts and secrets as needed. Now we can deploy the proxies.

First, delete the Secret for the dashboard's default certificate:

kubectl delete secret kubernetes-dashboard-certs -n kubernetes-dashboard

Next, add the manifest you updated with your client_secret:

kubectl create -f /path/to/manifest.yaml

Finally, patch your dashboard to load the certificates from the Secret created by cert-manager:

kubectl patch deployments kubernetes-dashboard -n kubernetes-dashboard -p '{"spec":{"template":{"spec":{"containers":[{"name":"kubernetes-dashboard","args":["--namespace=kubernetes-dashboard","--tls-key-file=tls.key","--tls-cert-file=tls.crt"]}]}}}}'

Once the pods are created, you'll be able to log in to your Kubernetes Dashboard securely. This works because the OAuth2 Proxy generates the correct token, while kube-oidc-proxy translates the id_token into impersonation headers.

Accessing our Cluster

In order to access our cluster we must distribute a kubectl configuration file with our oidc-login plugin's configuration. We also need to provide links for the dashboard and other applications. Dex doesn't provide any kind of "portal" to act as a starting point for our developers and admins.

Operations

Since Dex doesn't use any database outside of Kubernetes CRDs, there isn't much state to worry about. Configuration changes and updates require redeployments, but Dex is very small so that's not much of an issue. From an operations standpoint, the main downside is that all configuration needs to be stored in a single Secret, which makes it difficult to update dynamically. As discussed earlier, you could store clients using the client CRD, but it doesn't provide any schema validation and isn't the recommended solution from the project's developers. You could create a gRPC client, but even then the client secret is stored in the CRD, making it difficult to store securely.
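
A configuration update then looks something like this, assuming your updated Secret is stored in dex-config-secret.yaml and the chart's default Deployment name of dex:

kubectl apply -f dex-config-secret.yaml
kubectl rollout restart deployment/dex -n dex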

Security and Compliance

From a security and compliance standpoint, Dex has a nice feature that it validates the upstream session with Okta before refreshing the user's session. This means that if an Okta administrator locks a user's account, their token won't refresh. Additionally, you can kill a session by deleting a user's refreshtoken object in the dex namespace. There's no easy way to search for a user's session, so you'll need a script to enumerate each user. For instance, in my current cluster:

apiVersion: v1
items:
- CreatedAt: "2022-11-27T16:52:07.024971504Z"
  LastUsed: "2022-11-27T16:54:12.871860985Z"
  apiVersion: dex.coreos.com/v1
  claims:
    email: marc+mmosley@tremolo.io
    emailVerified: true
    groups:
    - demo-k8s
    - Everyone
    - k8s-users
    - gsa-demo-users
    - k8s-admins
    preferredUsername: marc+mmosley@tremolo.io
    userID: 00u3fusfj6jFLURbp357
    username: Matt Mosley
  clientID: kube-login
  connectorID: okta
  kind: RefreshToken
  metadata:
    creationTimestamp: "2022-11-27T16:52:06Z"
    generation: 2
    name: wxu3icge7y5k62zca5lnqnekd
    namespace: dex
    resourceVersion: "165560"
    uid: dc2d858c-5acb-4115-89c0-b9d448d29010
  nonce: LMcSgH-mJehJf8Fx9KbuBexGjJc6wS3PgA3lnTcCi4g
  obsoleteToken: evojbu7ir6vqnobsyly4bjfmd
  scopes:
  - groups
  - profile
  - email
  - offline_access
  - openid
  token: uxphj37thy7l36ab5zu3vkjst

So a script that looks for every instance of a user and then deletes the refresh token is pretty straightforward to write.
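
Here's a rough sketch of such a script; it assumes the RefreshToken CRD is exposed to kubectl as refreshtokens.dex.coreos.com and that jq is installed:

#!/bin/bash
# Usage: ./revoke-dex-user.sh user@example.com
USER_EMAIL="$1"

# Find every RefreshToken whose claims.email matches the user, then delete it.
kubectl get refreshtokens.dex.coreos.com -n dex -o json \
  | jq -r --arg email "$USER_EMAIL" '.items[] | select(.claims.email == $email) | .metadata.name' \
  | xargs -r -n 1 kubectl delete refreshtokens.dex.coreos.com -n dex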

Something to note is that all of the tokens are stored in CRDs as plain text. This means that you're likely to get flagged in any kind of audit. From a practical security standpoint, the biggest threat is more likely from accidentally generating and leaking manifests.

Finally, there's no commercial support for Dex. This is a compliance issue in some regulated industries. If your Kubernetes distribution comes with Dex this may not be an issue.

Edge Cases

Dex has no customization capability without creating a custom build. For example, if we needed to get our groups from another source, we couldn't. We've used the example of getting groups from AzureAD. If you're authenticating against AzureAD, this isn't an issue since Dex supports querying AzureAD, but that support is built directly into Dex. If your requirements stray from what is directly supported, you'll need to find another way to implement it.

Pipeline Integration

Dex only supports a single authentication source, and is really only designed to authenticate users. If you want to use Dex with a pipeline, you're going to need to authenticate directly against Okta. Depending on how your pipeline is built, this is very doable, but might not provide the level of security you're looking for. Dex doesn't provide any kind of security token service, so you can't use it to exchange a SPIRE token for a Dex token, as an example.

Cluster Management Applications

Assuming your cluster management applications can integrate with OIDC, or with some kind of authentication proxy, Dex will work well here. The hardest part is updating and managing the configuration Secret. It also assumes you don't need much in the way of customization, or that you can customize attributes at the proxy layer. Integrating our ArgoCD instance isn't much different than integrating with OpenUnison, just managing a large, static Secret instead of individual trust configurations per application.

Dex does support a telemetry endpoint, but not Prometheus directly. It also doesn't provide any security on this endpoint. Since Dex is a security component, it's important to understand whether this imposes any additional risk. You could mitigate this risk by blocking the telemetry URLs at the Ingress layer.

Multi Cluster

Dex on its own does not have a multi-cluster story. It's small and lightweight, so it can be used either directly against Okta with each cluster or using a spoke-and-hub model similar to OpenUnison.

Final Thoughts

Dex's simplicity makes it a good solution for many clusters. It's relatively easy to deploy and lightweight. From a GitOps perspective, it would be best to generate the configuration from a vault. This limits how dynamic Dex can be with onboarding new applications and pulls the cluster management team into more integrations. Dex also does not have a Security Token Service, which is valuable for secure pipeline integration.

Pinniped

The Pinniped project is a different animal from the other authentication solutions. Built as the authentication component for VMWare's Tanzu Kubernetes distribution, it's very opinionated on how it approaches cluster authentication and has some substantial differences from the other solutions we have evaluated in this post:

  • Kubernetes first - Dex and Keycloak are authentication systems that work with Kubernetes. OpenUnison is a generic identity system, but has several features built in specifically for Kubernetes. Pinniped was built primarily as a Kubernetes authentication system and was tailor made to provide a cli first approach to Kubernetes interaction. This doesn't mean you can't use Pinniped as an identity provider for your other cluster management applications, it's just very constrained.
  • No assumption of Ingress - Pinniped doesn't assume that you have an Ingress controller in place. Out of the box, it will generate LoadBalancer Service objects for you. Most enterprises have different silos for networking and Kubernetes infrastructure, making it harder to leverage cloud native networking solutions as dynamically as is technically possible. This assumption made Pinniped a little harder to deploy.
  • Multi-Cluster First - Pinniped was designed under the assumption that you will have a control plane cluster that will manage other satellite clusters. This is a common design pattern for on-prem Kubernetes distributions. This assumption doesn't preclude Pinniped from running on a single cluster; it just requires that you deploy multiple components.
  • Requires the pinniped binary - We have used kubectl plugins for the previous solutions (i.e. the oulogin plugin for OpenUnison and the kubelogin/oidc-login plugin for Dex and Keycloak), but they weren't required. OpenUnison provides a portal for generating a generic kubectl command that works on both Windows and *nix, while Dex and Keycloak could both use something like the now-retired Gangway project to get a kubectl configuration file. Pinniped, on the other hand, requires the use of its CLI tool to access clusters it protects. This can be a challenge in enterprises where distributing binaries can be a hassle.
  • Direct TLS Connections Required - The pinniped binary authenticates to either the concierge proxy or directly to the API server using TLS, or certificate, authentication. This means that your client and your cluster must have a direct line of communication with no TLS termination points in between. This can limit how you deploy in segmented networks since you can't have a reverse proxy in front of your ingress or LoadBalancer if it hosts its own certificate.
  • No Helm Charts - Pinniped is packaged using either native manifests or VMware's Carvel tool. I'm not making any judgement on Carvel, only noting that it doesn't have as wide adoption as Helm. Creating a Helm chart wouldn't take much effort, but it's not an officially supported deployment mechanism from the Pinniped team.

In this deployment, we're going to focus on using Pinniped's integrated impersonating proxy. This proxy works similarly to kube-oidc-proxy in that it allows Pinniped to authenticate access to clusters that are both on-prem and in the cloud. For on-prem clusters, Pinniped has an interesting feature where it will issue five-minute certificates to a client using the cluster's own CA. This keeps your clients from having to manage tokens, but it means you're using certificates for authentication, which has several well-documented drawbacks. Since Pinniped issues five-minute certificates, I don't know that this poses a significant compliance issue: a compromised key could only be leveraged for five minutes, and the private key associated with that certificate doesn't go over the wire on each transaction. This is a similar compliance story to short-lived tokens. That said, Pinniped now has your cluster's CA certificate and private key, so a breach of Pinniped in this configuration would require an entire re-keying of your cluster. In contrast, a breach of OpenUnison, Dex, or Keycloak would result in a new key needing to be generated for the identity provider, which would have no impact on the cluster.

Having looked at how Pinniped differs from the other solutions, let's look at what our cluster architecture will look like:

Pinniped Architecture

The two components to install are the Supervisor and the Concierge. The Supervisor is your control plane and identity provider. It's the component users interact with when logging into their upstream identity provider (in this case Okta), and it's responsible for issuing tokens to the Concierge component. The Supervisor must host a certificate that users' browsers will trust (in our case, the wildcard certificate), since browsers interact with this component during authentication. In our setup, we put the Supervisor behind our NGINX Ingress controller. The Concierge is what manages access to the cluster. The pinniped binary interacts with the Concierge using mTLS, so your user needs a direct connection with no TLS termination points between the user and the Concierge. This can be limiting from a network management perspective. Similar to Dex and Keycloak, Pinniped has no built-in way to manage access to applications that don't natively support OpenID Connect or that work better with an impersonating proxy (such as the Kubernetes Dashboard). To make this work, we're again combining oauth2-proxy and kube-oidc-proxy to make impersonation with the Kubernetes Dashboard work. This approach will also work with other dashboards such as Kiali (Istio) and TektonCD's dashboard. Finally, ArgoCD would integrate with Pinniped as an OIDC client.

With our architecture drawn out, the next step to getting Pinniped up and running is to enable refresh tokens in Okta:

Once that's done, the next step is to deploy the supervisor. This is the component of Pinniped that acts as a control plane for cluster access. It's the one point where you need to interact with a user's browser and the endpoint that individual clusters authenticate against. We're managing a single cluster in this example, so we'll run both components on the same cluster. The first step is to download and install the pinniped binary. Here's how to do it using brew:

brew install vmware-tanzu/pinniped/pinniped-cli

Next, deploy the supervisor:

kubectl apply \
  -f https://get.pinniped.dev/v0.22.0/install-pinniped-supervisor.yaml \
  --kubeconfig supervisor-admin.yaml

Once the supervisor is running, our next step is to integrate it with our Ingress controller. Users will need to interact with this service as part of the Okta login, so we want to make sure it's accessible using our enterprise CA certificate. Download and update this manifest to create a Service for our supervisor as well as an Ingress object. The pre-built manifests don't create a Service for you because there's no assumption that an Ingress will be present. You may want to use a LoadBalancer directly with your supervisor, but since we're building a more typical enterprise deployment we went through the Ingress instead. If you try to access the URL in your Ingress object, it won't work yet. The next step is to set up a federation provider that integrates with Okta. You'll need your client ID and client secret from Okta to create a Secret:

---
apiVersion: v1
kind: Secret
metadata:
  namespace: pinniped-supervisor
  name: okta-client-credentials
type: secrets.pinniped.dev/oidc-client
stringData:
  clientID: "myclientid"
  clientSecret: "myreallystrongsecret"

With our Secret created, the next step is to deploy the federation provider. Each federation provider can host its own certificate. Since our provider will be accessed via an Ingress, we're going to use a cluster CA to encrypt the traffic from the Ingress. We're also going to configure the host for our provider. Download and update this manifest to create our cluster CA in cert-manager and to create your federation trust with Okta. At this point, if you point your browser (or curl) at the URL in your FederationDomain object with the OIDC discovery path (/.well-known/openid-configuration) appended, it should work:

curl https://pinniped-supervisor.blog.tremolo.dev/cp-issuer/.well-known/openid-configuration
{"issuer":"https://pinniped-supervisor.blog.tremolo.dev/cp-issuer","authorization_endpoint":"https://pinniped-supervisor.blog.tremolo.dev/cp-issuer/oauth2/authorize","token_endpoint":"https://pinniped-supervisor.blog.tremolo.dev/cp-issuer/oauth2/token","jwks_uri":"https://pinniped-supervisor.blog.tremolo.dev/cp-issuer/jwks.json","response_types_supported":["code"],"response_modes_supported":["query","form_post"],"subject_types_supported":["public"],"id_token_signing_alg_values_supported":["ES256"],"token_endpoint_auth_methods_supported":["client_secret_basic"],"scopes_supported":["openid","offline_access","pinniped:request-audience","username","groups"],"claims_supported":["username","groups"],"code_challenge_methods_supported":["S256"],"discovery.supervisor.pinniped.dev/v1alpha1":{"pinniped_identity_providers_endpoint":"https://pinniped-supervisor.blog.tremolo.dev/cp-issuer/v1alpha1/pinniped_identity_providers"}

Our final step in setting up the supervisor is to update Okta with our callback URL. This URL will always be your issuer from your FederationDomain with /callback added, so for our example it's https://pinniped-supervisor.blog.tremolo.dev/cp-issuer/callback.

With our supervisor in place, the next step is to deploy the concierge. First, we'll deploy the standard manifests for the CRDs:

kubectl apply -f \
  "https://get.pinniped.dev/v0.22.0/install-pinniped-concierge-crds.yaml" \
  --kubeconfig workload1-admin.yaml
 

The next step requires an update to the standard manifests. By default, the concierge deploys with its own LoadBalancer Service. This is the assumption because the concierge requires a direct connection to the client without any TLS termination points. We want everything to run through our Ingress, so we're going to patch the incoming objects to deploy a ClusterIP Service instead:

curl -L https://get.pinniped.dev/v0.22.0/install-pinniped-concierge-resources.yaml | sed 's/type: LoadBalancer/type: ClusterIP/g' |  kubectl create -f -

Next, we need to configure our Ingress to support the concierge. The NGINX Ingress controller doesn't enable TLS passthrough by default, so before we can create the Ingress object we first need to patch the controller to enable it:

kubectl patch deployments.apps ingress-nginx-controller --type=json -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-ssl-passthrough" }]' -n ingress-nginx
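
With passthrough enabled, the Ingress object for the concierge just needs to send the TLS session straight to the impersonation proxy's Service. A rough sketch follows; the Service name is an assumption, so check what the concierge actually created (kubectl get svc -n pinniped-concierge) and adjust accordingly:

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pinniped-concierge
  namespace: pinniped-concierge
  annotations:
    # hand the TLS session to the concierge untouched so mTLS works end to end
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: pinniped-concierge.blog.tremolo.dev
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                # name assumed - use the ClusterIP Service the concierge generated
                name: pinniped-concierge-impersonation-proxy-cluster-ip
                port:
                  number: 443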

We also need to tell the concierge what its URL is. By default, it tries to auto-detect its URL based on the LoadBalancer configuration of its Service, but since we're using an Ingress this won't work. Update this patch command with your cluster's URL:

kubectl patch CredentialIssuer pinniped-concierge-config -p '{"spec":{"impersonationProxy":{"externalEndpoint":"pinniped-concierge.blog.tremolo.dev:443"}}}' --type=merge

Our last cluster configuration step is to download the manifest that configures our concierge to use our supervisor for authentication and update it with our supervisor's host and our enterprise CA certificate.
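
That manifest boils down to a JWTAuthenticator object that tells the concierge which issuer to trust. A minimal sketch, with the audience value and CA bundle as placeholders:

---
apiVersion: authentication.concierge.pinniped.dev/v1alpha1
kind: JWTAuthenticator
metadata:
  name: pinniped-supervisor-jwt
spec:
  issuer: https://pinniped-supervisor.blog.tremolo.dev/cp-issuer
  # a unique, random audience value for this cluster - placeholder
  audience: workload1-replace-with-a-random-value
  tls:
    # base64-encoded enterprise CA bundle that signed the supervisor's certificate
    certificateAuthorityData: LS0tLS1CRUdJTi4uLg==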

Once deployed, our last step is to generate a kubectl configuration file:

pinniped get kubeconfig > my-cluster.conf

Now, you can distribute this configuration file. There's no secret data in it and nothing user specific.
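
To see why nothing in the file is sensitive, here's roughly the shape of the user entry that pinniped get kubeconfig produces. The exact flags vary by version and cluster, so treat this as an illustration rather than something to copy:

users:
- name: workload1-pinniped
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: pinniped
      args:
        - login
        - oidc
        - --issuer=https://pinniped-supervisor.blog.tremolo.dev/cp-issuer
        - --client-id=pinniped-cli
        - --request-audience=workload1-replace-with-a-random-value
        - --enable-concierge
      # no tokens or keys here - credentials are obtained interactively and cached locally
      provideClusterInfo: true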

With our CLI configured, our next step is to integrate the dashboard. Similar to Dex and Keycloak, we'll use a combination of oauth2-proxy and kube-oidc-proxy to provide integration with Kubernetes via impersonation. Pinniped is a little different than the other identity providers we've configured earlier in that:

  • Changing what attributes are in the JWT is very limited - As of v0.22 you can perform some attribute mapping to, for example, map the user's sub to mail for applications that require a mail attribute.
  • There's no email address - The JWT only has the user's unique identifier and groups, in addition to the attributes that describe when the JWT was created, for whom, and for how long it should be accepted. This is a good thing, and I applaud the Pinniped team for not using email addresses. Even though it's common to use an email address for identifying users in the Kubernetes world, it's a bad approach. Names change for many reasons, so it's much better to use an immutable identifier instead. For instance, the sub you get from Okta is just a random identifier. If you tie permissions directly to users' identities (which is an antipattern on its own), an email address that could change could cause all kinds of permission headaches. It's best to use an immutable identifier not based on any user attributes so that if your name changes your permissions don't.
  • Your sub isn't the same as the one from your identity provider - Pinniped creates a sub based on your federation domain and your sub from that domain. The username claim instead contains your identity provider's login ID. The sample payload after this list illustrates both claims.
  • Uses ES256 by default - Pinniped's JWTs are signed using the ES256 algorithm. There's nothing wrong with this, it's faster and provides better security with smaller keys. The default for applications is RS256, so if you have issues with applications verifying JWTs that seem valid, make sure the application knows to accept ES256 JWTs.
  • Client applications require both PKCE and a client secret - The PKCE extension to OAuth2 (and, by extension, OpenID Connect) was designed to provide a way for the identity provider to "authenticate" a client that wouldn't have a client secret, such as a single page web application or a mobile app. It's good to have it enabled, but it isn't required by the OIDC spec, so if you start running into errors on initial login make sure you're using PKCE in your client.
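
To make those last few points concrete, here's a hypothetical decoded ID token payload from the supervisor. Every value below is fabricated for illustration, and the exact format of the sub claim is an implementation detail that can vary between Pinniped versions:

# hypothetical decoded ID token payload - all values fabricated for illustration
{
  "iss": "https://pinniped-supervisor.blog.tremolo.dev/cp-issuer",
  "aud": "client.oauth.pinniped.dev-kubedashboard",
  "sub": "https://pinniped-supervisor.blog.tremolo.dev/cp-issuer?idp=...&sub=...",
  "username": "00u...",
  "groups": ["k8s-admins", "k8s-developers"],
  "iat": 1675100000,
  "exp": 1675100120
}

Note the absence of an email claim; the token's header would also show ES256 as the signing algorithm.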

We're going to first tell the supervisor to create a client. Update the below manifest for your environment with the correct URLs:

---
apiVersion: config.supervisor.pinniped.dev/v1alpha1
kind: OIDCClient
metadata:
  # name must have client.oauth.pinniped.dev- prefix
  name: client.oauth.pinniped.dev-kubedashboard
  namespace: pinniped-supervisor # must be in the same namespace as the Supervisor
spec:
  allowedRedirectURIs:
    - https://k8sdb.blog.tremolo.dev/oauth2/callback
  allowedGrantTypes:
    - authorization_code
    - refresh_token
    - urn:ietf:params:oauth:grant-type:token-exchange
  allowedScopes:
    - openid
    - offline_access
    - pinniped:request-audience
    - username
    - groups

Next, generate a client secret. You don't create a Secret manually; you tell Pinniped you want to register one and the supervisor will generate it for you. An important note: you MUST capture the generated secret from the response. This is the only time it's available in plain text:

cat <<EOF | kubectl create -o yaml -f -
apiVersion: clientsecret.supervisor.pinniped.dev/v1alpha1
kind: OIDCClientSecretRequest
metadata:
  name: client.oauth.pinniped.dev-kubedashboard # the name of the OIDCClient
  namespace: pinniped-supervisor # the namespace of the OIDCClient
spec:
  generateNewSecret: true
EOF

The output contains the client secret. Download the manifest that configures oauth2-proxy, kube-oidc-proxy, and a certificate for the Kubernetes dashboard. Make sure to update the URLs and the oauth2-proxy Secret with the client secret you just generated. The final step is to update the dashboard's configuration to work with a standard TLS Secret:

kubectl patch deployments kubernetes-dashboard -n kubernetes-dashboard -p '{"spec":{"template":{"spec":{"containers":[{"name":"kubernetes-dashboard","args":["--namespace=kubernetes-dashboard","--tls-key-file=tls.key","--tls-cert-file=tls.crt"]}]}}}}'

With that running, you'll now have secure access to the Kubernetes dashboard using Pinniped!
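
For reference, the heart of that manifest is the argument list handed to oauth2-proxy. This is a hedged sketch rather than the exact manifest from the repository; the URLs, secret values, and upstream address are placeholders for this environment:

# excerpt from the oauth2-proxy container's args - values are placeholders
- --provider=oidc
- --oidc-issuer-url=https://pinniped-supervisor.blog.tremolo.dev/cp-issuer
- --client-id=client.oauth.pinniped.dev-kubedashboard
- --client-secret=<the secret generated by the OIDCClientSecretRequest>
- --code-challenge-method=S256       # Pinniped requires PKCE
- --redirect-url=https://k8sdb.blog.tremolo.dev/oauth2/callback
- --email-domain=*
- --oidc-email-claim=username        # Pinniped tokens have no email claim
- --pass-authorization-header=true   # forward the JWT so the dashboard can use kube-oidc-proxy
- --upstream=https://kubernetes-dashboard.kubernetes-dashboard.svc   # upstream assumed
- --http-address=0.0.0.0:4180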

Accessing our Cluster

Accessing our cluster requires using the kubectl configuration file we generated earlier. There's no secret information in this file and no user-specific information either, so it's safe to post it to a common area for download, such as a git repository. The only important thing to remember is that, in addition to kubectl, you need the pinniped client installed on each developer's and admin's workstation.
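
A quick smoke test, assuming the pinniped binary is on your PATH; the first command pops open a browser for the Okta login, after which the session is cached locally:

kubectl --kubeconfig my-cluster.conf get nodes
pinniped whoami --kubeconfig my-cluster.conf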

Operations

There really isn't much to operate. There's no external database to back up. If a user needs to be disabled quickly, the easiest way is to do it at the identity provider; on the next refresh, Pinniped will disable access.

Security and Compliance

Pinniped gets high marks for making use of short-lived tokens. The OIDC tokens issued by the supervisor are only good for a few minutes by default. The use of elliptic curve cryptography and TLS by default is also a major plus. Pinniped also respects your federation provider's refresh tokens and will update the user's groups from the federation provider on each refresh. The certificates used to authenticate to the Concierge (whether issued using the cluster's CA or not) are only good for five minutes, another major plus.

The only major compliance risk I see in Pinniped is its feature that generates client certificates from the cluster's own CA. If the Concierge is breached in that configuration, you now have to re-key your entire cluster.

Edge Cases

There are very few ways to handle edge cases natively. If you wanted to, for instance, use AzureAD groups with Okta, you'd need to synchronize them some other way. Any edge case would need to be handled outside of Pinniped.

Pipeline Integration

Pinniped doesn't include any kind of OpenID Connect Security Token Service, so you'll need to automate authentication to the federation provider. If you have an authentication source that accepts credentials directly, like LDAP or AD, you could pass those credentials from a pipeline, but there's no native way to integrate a pipeline with Pinniped. Both OpenUnison and Keycloak include Security Token Services (STS), and OpenUnison has APIs that can be authenticated to via credentials, certificates, tokens, etc.

Cluster Management Applications

You'll need to leverage a combination of oauth2-proxy and kube-oidc-proxy to provide access to web applications that need to interact with the API server (assuming they know how to handle impersonation). You could configure an on-prem cluster to recognize JWTs from the supervisor.
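
For an on-prem cluster, that configuration is just the standard API server OIDC flags (kube-oidc-proxy accepts largely the same ones). The values below are assumptions for this environment:

# kube-apiserver (or kube-oidc-proxy) flags - values are placeholders
--oidc-issuer-url=https://pinniped-supervisor.blog.tremolo.dev/cp-issuer
--oidc-client-id=client.oauth.pinniped.dev-kubedashboard
--oidc-username-claim=username
--oidc-groups-claim=groups
--oidc-signing-algs=ES256
--oidc-ca-file=/etc/kubernetes/pki/enterprise-ca.pem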

Multi Cluster

This is a tale of two cities. On the CLI side, onboarding a new cluster is very easy compared to hosting an identity provider per cluster with Dex or Keycloak. While the integration with OpenUnison is automated, the deployment of a Concierge does have a smaller footprint.

Where things get more complicated with a multi-cluster Pinniped deployment is with local management applications (like the dashboards). Even though each cluster gets its own Concierge, any web applications need to be integrated directly with the Supervisor, which runs on the control plane cluster. This means that the Pinniped management team needs to own those integrations or set up a webapp-specific identity provider per satellite cluster.

Final Thoughts

Pinniped provides an opinionated approach to cluster access that is streamlined for CLI use. The upside to this approach is that the developer experience is as close to having a master certificate as possible while still integrating external authentication. That streamlining comes at a cost in flexibility for common enterprise edge cases, and Pinniped doesn't integrate as well with popular management applications.

Other Identity Providers

OpenUnison, KeyCloak, Dex, and Pinniped are by no means the only identity providers that can be used with Kubernetes. This guide isn't, and can't be, exhaustive. That said, if you're looking at any of these providers you should do so based on your needs and requirements. Many tools that are used by cluster managers provide identity data that can be integrated with Kubernetes. For instance, HashiCorp's Vault and GitLab both have identity providers that could be integrated with Kubernetes. These tools are all technically capable. Questions you should ask when evaluating them:

  • Are they in my management silo? - Just because a tool can do the job at a technical level doesn't mean it fits well with your management silos. For instance, if you have a centralized Vault instance, then using it for authentication to your clusters probably won't be a great fit. First off, as the cluster owner, you don't want to be dependent on the Vault team to onboard authentication into your cluster. Also, the Vault team probably doesn't want this responsibility. Identity and authentication are every bit as complex and nuanced as secret management. They have enough to worry about without adding cluster access to the mix.
  • What is the multi-cluster story? - We included multi-cluster in every evaluation because no organization has just one cluster. This becomes an important question when delegating management of clusters, because you may not want your centralized team owning access management for each individual cluster. Very few identity providers have any kind of delegated management that can be easily set up, so you'll likely end up either having to centralize management of your identity provider with your Kubernetes management team or setting up satellite identity providers on each cluster to handle both CLI and management application access.
  • Cross-Cloud Cluster Access - Can your identity provider support both on-prem and cloud management clusters? Most identity providers need to be paired with an impersonating proxy like kube-oidc-proxy to manage cross cloud access.
  • Handling Edge Cases - How does the tool handle edge cases? This is usually where I'd put a meme of Buzz telling Woody "Edge cases, edge cases everywhere!". Nearly all identity providers can provide a unique id and a list of groups. Can your identity provider manage the edge cases that you'll run into?

Also, remember that almost none of these tools provide their own mechanism for generating a kubectl configuration, so you'll need to use something like the kube-oidc-login plugin we used with Dex and Keycloak.
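
As a reminder of what that looks like, here's the general shape of a kubeconfig user entry that uses the plugin; the issuer URL and client ID are placeholders for whatever identity provider you land on:

users:
- name: oidc
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: kubectl
      args:
        - oidc-login
        - get-token
        - --oidc-issuer-url=https://idp.example.com/auth/realms/kubernetes   # placeholder
        - --oidc-client-id=kubernetes                                        # placeholder
        - --oidc-extra-scope=groups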

Wrap Up

We covered a lot of information in this post. To give you an idea of how big it is, Packt didn't want chapters longer than 45 pages for Kubernetes: An Enterprise Guide, while this blog post comes in at nearly forty pages on its own when printed! We started out looking at common challenges enterprises face when deploying Kubernetes authentication, designed a testing cluster on Civo that replicates those common issues, and walked through the implementation of OpenUnison, Dex, Keycloak, and Pinniped! All of the manifests are available on GitHub at https://github.com/TremoloSecurity/kubernetes-authentication. It's no surprise that of the four solutions deployed in this post, OpenUnison was the fastest to deploy. We've spent thousands of hours building specific features for Kubernetes, simplifying deployments using a combination of Helm and a deployment tool, and working with customers on deployments to make sure the lessons learned make it back into the project. If you'd like to learn more about OpenUnison, check out OpenUnison's documentation site, and if you're interested in commercial support, please reach out!

Related Posts