Cloud

Kubernetes Security Myths Debunked

June 23, 2019

by

Marc Boorshtein

I talk to many folks on Slack channels, on Twitter and in person about various security issues with Kubernetes.  I hear many of the same myths repeated, and I'd like to address and debunk them.

The Kubernetes Dashboard is a Security Risk

Nope.  Not even a little.  The problem isn't the dashboard, it's how you deploy it.  When you hear about compromises of a cluster via the dashboard (Tesla is the big one that comes to mind), what usually happened is that someone assigned the dashboard's service account privileged or cluster-admin access without locking down access to the dashboard itself.  As long as you follow these really basic rules, the dashboard is every bit as safe as kubectl:

  1. Set up the dashboard to use a zero-privilege service account
  2. Use an authenticating reverse proxy that can inject a user’s JWT into each request
  3. Configure the dashboard to use TLS for both internal and external communications

For #2 I’m partial to our own Orchestra open source login portal for Kubernetes, but there are other solutions you can use.  Chances are, if you’re running the dashboard with cluster-admin privileges, it’s not your only security issue with the cluster.
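
To make rule #1 concrete, here's a minimal sketch (names and namespace are illustrative, not a drop-in manifest): a dedicated service account for the dashboard with no Role or ClusterRole bound to it, so the dashboard can only act with whatever privileges arrive in the user's JWT injected by the reverse proxy.

```yaml
# A dedicated, zero-privilege service account for the dashboard.
# No RoleBinding or ClusterRoleBinding references it, so the dashboard itself
# can do nothing; every API call carries only the user's injected bearer token.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
```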

Pod Security Policies are Hard

Pod Security Policies aren’t hard; finding well-designed, publicly hosted containers is.  Especially in the enterprise, the vast majority of your pods are running applications.  There are some exceptions, but unless you’re doing something at a system level, chances are your pod just doesn’t need to be privileged.  Apache doesn’t need to be privileged. Neither does Nginx or Java or Python or … They just don’t. So follow a few simple rules in your containers:

  1. After you install your packages, switch to an unprivileged user
  2. Don’t write data to your pod’s filesystem unless it’s set up with an emptyDir volume mount

Those two rules alone should solve 95% of any pod security policy issues you are having.  The last 5% is probably a matter of properly creating your policies. Automation is your friend here.  Don’t create namespaces and policies ad hoc; use a consistent provisioning process. Again, I’m partial to Orchestra for automating namespace creation and onboarding, but I’m not exactly impartial.
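
As a hedged sketch of what such a policy can look like (the name and exact field values are illustrative, adjust for your workloads), here's a restrictive PodSecurityPolicy that pairs naturally with the two container rules above:

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  runAsUser:
    rule: MustRunAsNonRoot      # containers must switch to an unprivileged user
  readOnlyRootFilesystem: true  # writes go only to declared volumes, e.g. emptyDir
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: MustRunAs
    ranges:
    - min: 1
      max: 65535
  fsGroup:
    rule: MustRunAs
    ranges:
    - min: 1
      max: 65535
  volumes:
  - configMap
  - secret
  - emptyDir
  - persistentVolumeClaim
```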

I Can Use Service Accounts for User Access

This is a bad idea on multiple levels:

  1. Service Accounts are not built for humans, they’re built for automation tasks
  2. The tokens are long-lived bearer tokens, which means anyone who has one can use it; if a token is compromised, for instance by accidentally storing it in a git repo or logging it in a debug message, it can be abused.
  3. The ServiceAccount object in k8s can’t be a member of a group, making it harder to manage authorizations in RBAC

I know doing this seems like a good idea.  It’s simple, right? Not so fast. How are you getting the service account from the admin who generates it to the user who uses it?  Email? Slack? How are you rotating the tokens? Auditing access? Disabling them once they’re no longer needed? These aspects of identity are just as important as authentication.  By trying to avoid the “complexity” of external authentication you’re actually introducing more risk. Use OpenID Connect. Whether your environment uses Active Directory, LDAP or Google, take a look at Orchestra to automate your logins; we take care of most of the complexities of OIDC for you!
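
For reference, here's a hedged sketch of what the alternative looks like: a kubeconfig "user" entry wired to kubectl's OpenID Connect auth provider instead of a service account token. The issuer URL, client id and token values are placeholders for whatever your identity provider issues.

```yaml
users:
- name: k8s-user
  user:
    auth-provider:
      name: oidc
      config:
        idp-issuer-url: https://oidc.example.com/auth/idp/k8sIdp   # placeholder issuer
        client-id: kubernetes
        client-secret: "placeholder-secret"
        id-token: "<short-lived JWT issued by your IdP>"
        refresh-token: "<refresh token kubectl uses to renew the id-token>"
```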

JWT Tokens Can Have Long Lives

The OpenID Connect refresh process is handled for you by kubectl, so there’s no reason to have long-lived JWTs.  In fact, there are several reasons NOT to have long-lived JWTs:

  1. JWTs are bearer tokens, which means there is no security around who uses them.  If someone compromises your token, there is nothing to stop them from using it.
  2. Modern enterprises often have multiple network layers between you and your API server.  Each of these layers is a potential point for a token to be leaked.
  3. If you accidentally check a token in to a public git repository, it can be identified and abused pretty quickly, even if it eventually expires

I recommend a one-minute token life, plus or minus a minute to account for clock skew.  Your refresh token should be set to whatever the web application idle timeout is for your enterprise (usually 15-20 minutes).  This will keep you in line with your compliance requirements.  It should be transparent to your users, since the refresh is handled by kubectl automatically.
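
To make the timing concrete, here's a made-up illustration of the time-related claims you'd expect in such a short-lived id_token (the actual payload is JSON and the issuer/subject are placeholders):

```yaml
# Decoded claims from a hypothetical short-lived id_token.
# "exp" is 60 seconds after "iat"; kubectl uses the refresh token to
# transparently obtain a new id_token when this one expires.
iss: https://oidc.example.com/auth/idp/k8sIdp   # placeholder issuer
sub: some-user                                  # placeholder subject
aud: kubernetes
iat: 1561294800   # issued at
exp: 1561294860   # expires one minute later
```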

I Can Share SSH Keys for Host Level Access

The hosts that run your clusters are the soft underbelly of your k8s security.  Root access to any of those boxes means you now have access to every container, every volume mount, every…everything.  Locking down your nodes is every bit as important as locking down your RBAC rules.  Hopefully your enterprise already has a way to integrate identity into your hosts.  If not, OpenUnison (the open source project Orchestra is built on) has built-in support for LDAP, AD and FreeIPA to make it easier to manage access to your nodes.

I Want an RBAC Rule That Lets Someone Do Everything BUT…

This one comes up often on the sig-auth channel.  It seems like a good idea: you define a simple rule that lets users do what the requirements say without having to constantly go back and re-evaluate RBAC policies as new features come out.  The problem with this approach is that when new features come out, your users will have access to them by default.  That could be bad.  If k8s ships a new feature called “spin up nodes just for your namespace”, your users can suddenly start spinning up nodes without your approval or cost controls.  Explicitly defining what a user can do, versus what a user can’t do, can be very frustrating.  Automation is your friend here.  If your RBAC rules are consistent across your environment, then patching them with updated permissions will be much easier.  I’ll skip the link to Orchestra for this one…
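
As a sketch of what "explicit" looks like in practice (the namespace, resources and verbs here are just illustrative), a namespace-scoped Role grants only what the requirements call for, and anything new is denied by default:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-developer
  namespace: my-team
rules:
- apiGroups: ["", "apps"]
  resources: ["deployments", "pods", "pods/log", "services", "configmaps"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# No wildcards: resources or verbs added in future Kubernetes releases stay
# inaccessible until you deliberately add them to this rule.
```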

I’ll Use kubectl proxy to Access My CI/CD and Monitoring Infrastructure

I see this quite a bit.  The logic goes:

  1. I have kubectl
  2. Kubectl proxy injects my token into requests
  3. You have to be on my machine to access it
  4. Secure!

Not so fast.  There are zero controls over which processes can access loopback (127.0.0.1).  There are controls in your browser that will stop malicious JavaScript from opening a connection, but the danger isn’t limited to your browser.  Any malicious script that finds its way onto your system will have access.

There’s also a less sinister reason not to use kubectl proxy for accessing your internal infrastructure.  Many large enterprises don’t give you direct access to the k8s API servers from your desktop; you have to use a jump box.  While I’m not the biggest fan of this policy, it’s a common practice in enterprises, and it means you could have multiple users on the same box running kubectl proxy with nothing to stop them from accidentally using each other’s connections.

I find that most vendors that recommend this approach do so because they don’t have an effective or integrated strategy for managing access to the “second tier” infrastructure needed to effectively run a cluster (i.e. Grafana, Jenkins, Prometheus, Alertmanager, etc.), so they rely on k8s’ built-in capabilities.  This is problematic on multiple levels.  It’s a painful experience for devs who use web apps every day to have to go through this kubectl proxy dance.  It’s also difficult to audit access.  If the underlying app is multi-tenant, how do you control who has access?  Web apps should be integrated as web apps.  Most of these apps support OpenID Connect or some other mechanism for SSO.  Use them!  As an example, you can extend Orchestra to provide SSO to Jenkins, Grafana or any other web application in your stack.  You can also use it to automate the access.
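
As one example of "use them", here's a hedged sketch (not a drop-in manifest) of pointing Grafana's generic OAuth support at an OpenID Connect provider via environment variables on its Deployment. The URLs and client id are placeholders, and the client secret should come from a Secret rather than being inlined.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana:6.2.5
        env:
        # Grafana maps GF_<section>_<key> env vars onto its ini configuration
        - name: GF_AUTH_GENERIC_OAUTH_ENABLED
          value: "true"
        - name: GF_AUTH_GENERIC_OAUTH_CLIENT_ID
          value: "grafana"
        - name: GF_AUTH_GENERIC_OAUTH_SCOPES
          value: "openid profile email"
        - name: GF_AUTH_GENERIC_OAUTH_AUTH_URL
          value: "https://oidc.example.com/auth/idp/oidc/auth"      # placeholder
        - name: GF_AUTH_GENERIC_OAUTH_TOKEN_URL
          value: "https://oidc.example.com/auth/idp/oidc/token"     # placeholder
        - name: GF_AUTH_GENERIC_OAUTH_API_URL
          value: "https://oidc.example.com/auth/idp/oidc/userinfo"  # placeholder
        - name: GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET
          valueFrom:
            secretKeyRef:
              name: grafana-oidc
              key: client-secret
```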

Multi-Tenant Is Hard, Give Everyone Their Own Cluster!

The part about multi-tenancy being hard isn’t a myth.  That’s true.  There’s a lot to consider.  Just giving everyone their own cluster doesn’t really help, though, as you go from a difficult-to-manage mess to an impossible-to-manage mess.  If everyone has their own cluster, how do you manage who has access to what?  How do you decide who gets their own cluster?  This approach is often pushed by large cloud providers to move management responsibilities from k8s up to their own stack.  Spoiler alert: you’re going to run into mostly the same issues, because now you have to bridge your cloud provider’s API into your cluster management processes.  You shouldn’t have just one uber-cluster, but that doesn’t mean you should have one cluster per app either.  Identity is a big part of multi-tenancy, and you probably know where this sentence is going…

I Want to Offload TLS to My Ingress Implementation

It’s 2019, not 2005.  Just, no.  You can’t trust your network internals any more than the public web.  Just encrypt everything.  There’s probably a policy that says you need to do it anyway.  Guess what, no pitch on this one!
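
If you do terminate TLS at your ingress, re-encrypt on the way to the pod. Here's a hedged sketch assuming the NGINX ingress controller (hosts, secret names and ports are placeholders): TLS from the client to the ingress, and TLS again from the ingress to the backend, so traffic is never in the clear on the cluster network.

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: myapp
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"  # speak TLS to the pod
spec:
  tls:
  - hosts:
    - myapp.apps.example.com
    secretName: myapp-tls          # cert for the external hostname
  rules:
  - host: myapp.apps.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: myapp
          servicePort: 8443        # the pod itself listens on TLS
```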

Wrapping Things Up

We talked about multiple myths in the Kubernetes security landscape.  Most of them centered around identity management, for obvious reasons: it’s where we spend most of our time.  That said, there are several other areas of unsettled security debate in Kubernetes (for instance, should you scan running containers for malware?).  Identity is a key aspect of any security program in Kubernetes.  Tremolo Security offers multiple open source options to help you start your journey toward an identity-enabled cluster.  We published a great walk-through for Canonical’s CDK, with a demo video showing how quickly you can be up and running.  If you want to learn more about the details of Kubernetes authentication, take a look at my article at Linux Journal.