Kubernetes Reusable Elements for Designing Cloud Native Applications: Security Patterns
Kubernetes is a popular open-source platform for managing containerized workloads and services. It provides a declarative, scalable, and portable way to orchestrate applications across different environments.
However, Kubernetes also introduces new challenges and risks for security, such as unauthorized access, data breaches, denial-of-service attacks, and malicious code execution. Therefore, it is essential to understand and apply the best practices and patterns for securing Kubernetes clusters and workloads.
It’s important to keep the system secure at all stages of the software development lifecycle and to consider its implications across the layers of the software stack, known as the 4C’s of cloud native security (Cloud, Cluster, Container, and Code). The following patterns address these concerns:
- Process Containment: This pattern aims to limit and contain the actions an application can perform on the node it’s running on
- Network Segmentation: This involves techniques to restrict which Pods a particular Pod can communicate with
- Secure Configuration: This pattern discusses how an application within a Pod can access and use configurations securely
- Access Control: This describes how an application can authenticate and interact with the Kubernetes API server
Process Containment
Despite the implementation of various security measures such as static code analysis, dynamic scanning tools, and regular scanning of application dependencies for vulnerabilities, new code and dependencies can still introduce fresh vulnerabilities. This means that without runtime process-level security controls, a malicious actor could potentially breach the application code and take control of the host or even the entire Kubernetes cluster.
To mitigate this risk, it’s crucial to limit a container to only the permissions it needs to run, applying the principle of least privilege. This can be achieved through Kubernetes configurations, which serve as an additional line of defense by containing any rogue process and preventing it from operating outside its designated boundary.
The security configurations applied to a container are managed by Kubernetes and made available to the user through the security context configurations of the Pod and container specs. While specialized infrastructure containers may require fine-grained tuning, common security configurations are typically sufficient for running standard cloud-native applications on Kubernetes.
A container is not only a packaging format and resource isolation mechanism but also serves as a security fence when properly configured. The practice of Shift Left security considerations and testing is gaining popularity. This involves deploying into Kubernetes with production security standards early in the development cycle to identify and address security issues sooner, thereby avoiding last-minute surprises.
The Shift Left model encourages developers to incorporate operational security considerations during the application development phase. The objective is to fully understand what your application requires and grant it only the minimum necessary permissions. This includes creating boundaries between workloads and the host, reducing container privileges, and configuring the runtime environment to limit resource utilization in case of a breach.
The Process Containment pattern is emphasized as a method to ensure that any security breaches remain confined within the container.
Running Containers with a Non-Root User
In Kubernetes, you can specify user and group IDs for container images, which are crucial for managing access to files, directories, and volume mounts. However, some containers either run as root by default or don’t have a default user set. You can override this at runtime using securityContext, which allows you to define the user ID and group ID for any container in the Pod.
However, be aware that mismatches between the directory structure and file ownership IDs in the container image and the specified user and group IDs can lead to runtime failures due to permission issues. Therefore, it’s advisable to inspect the container image file and run the container with the defined user ID and group ID.
To ensure a container doesn’t run as root, set the .spec.securityContext.runAsNonRoot flag to true. This doesn’t alter the user but guarantees that the container operates as a non-root user. If root access is required for certain tasks, you can use an init container that runs as root briefly before the application containers start up as non-root.
Finally, to prevent privilege escalation (a situation where a user gains root-like capabilities), set .spec.containers[].securityContext.allowPrivilegeEscalation to false. This is a key measure in adhering to general security practices and thwarting container breakout attacks.
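A minimal sketch of these settings in a Pod spec; the image name and user/group IDs are hypothetical and must match the ownership baked into your image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  securityContext:
    runAsNonRoot: true      # refuse to start any container that would run as root
    runAsUser: 1000         # hypothetical non-root UID; must match the image's file ownership
    runAsGroup: 1000        # hypothetical group ID
  containers:
  - name: app
    image: registry.example.com/app:1.0   # hypothetical image
    securityContext:
      allowPrivilegeEscalation: false     # block setuid binaries and similar escalation paths
```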
Restricting Container Capabilities
A container is essentially a process running on a node, possessing the same privileges as any other process. If it needs to make a kernel-level call, it requires the appropriate privileges, which can be granted by running the container as root or assigning specific capabilities.
Containers with the .spec.containers[].securityContext.privileged flag set are equivalent to root on the host, bypassing kernel permission checks and potentially compromising security. It’s advisable to avoid using privileged containers and instead assign specific kernel capabilities to those containers that require them.
In Linux, root user privileges are divided into distinct capabilities that can be independently enabled or disabled. Determining what capabilities your container has is not straightforward. An allow list approach can be employed, starting your container without any capabilities and gradually adding them as needed.
To bolster security, containers should be granted the minimum privileges necessary to run. The container runtime assigns a default set of privileges (capabilities) to the container, which are often more generous than necessary, potentially making them vulnerable to exploits. A sound security practice is to drop all privileges and add only those that are needed.
For instance, you might drop all capabilities and add back only the NET_BIND_SERVICE capability, which allows binding to privileged ports with numbers lower than 1024. Alternatively, you could replace the container with one that binds to an unprivileged port number.
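A minimal sketch of this allow-list approach; the image name is hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: web
    image: registry.example.com/web:1.0   # hypothetical image
    securityContext:
      capabilities:
        drop: ["ALL"]                # start with no capabilities at all
        add: ["NET_BIND_SERVICE"]    # add back only the ability to bind ports below 1024
```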
A Pod’s security context should be configured appropriately and not overly permissive to prevent compromise. Limiting the capabilities of containers serves as an additional line of defense against known attacks. A malicious actor who breaches an application would find it more challenging to take control of the host if the container process is not privileged or if the capabilities are significantly limited.
Avoiding a Mutable Container Filesystem
It’s crucial to understand that applications running in containers should avoid writing to the container filesystem. This is because containers are temporary, and any data stored within them will be lost when they restart. Instead, applications should write their state to external persistence methods like databases or filesystems.
To enhance security, applications can limit potential attack vectors by making the container filesystem read-only. This prevents any modifications to the application configuration or the installation of additional executables. In Kubernetes, this can be achieved by setting .spec.containers[].securityContext.readOnlyRootFilesystem to true, which mounts the container’s root filesystem as read-only.
There are also other significant security context options:
- seccompProfile: restricts the process running in a container to only call a subset of the available system calls
- seLinuxOptions: assigns custom SELinux labels to all containers within a Pod and its volumes
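A sketch combining a read-only root filesystem with the runtime’s default seccomp profile; the emptyDir volume gives the application a writable scratch directory, and the image name is hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault    # restrict syscalls to the container runtime's default profile
  containers:
  - name: app
    image: registry.example.com/app:1.0   # hypothetical image
    securityContext:
      readOnlyRootFilesystem: true    # mount the container's root filesystem as read-only
    volumeMounts:
    - name: tmp
      mountPath: /tmp                 # writable scratch space, lost on restart
  volumes:
  - name: tmp
    emptyDir: {}
```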
However, manually configuring these fields for every Pod or container can lead to human errors. To mitigate this, cluster administrators can define cluster-level policies that ensure all Pods in a namespace meet minimum security standards. This policy-driven approach helps maintain consistent security practices across the entire cluster.
Enforcing Security Policies
Kubernetes employs Pod Security Standards (PSS) and the Pod Security Admission (PSA) controller to ensure that a group of Pods complies with specific security standards. PSS provides a universal language for security policies, while PSA enforces these policies.
These policies, which can be implemented via PSS or other third-party tools, are categorized into a few security profiles:
- Privileged: An unrestricted profile that offers the broadest range of permissions and is intended for trusted users and infrastructure workloads
- Baseline: A minimally restrictive profile designed for common, non-critical application workloads. It prevents known privilege escalations but does not permit privileged containers, certain security capabilities, or configurations outside of the securityContext field
- Restricted: The most restrictive profile, adhering to the latest security-hardening best practices. It is intended for security-critical applications and users with lower trust levels
These security standards are applied to a Kubernetes namespace using labels that define the standard level and one or more actions to take when a potential violation is detected.
These actions can be:
- Warn: User-facing warning
- Audit: Recorded auditing log entry
- Enforce: Pod rejection
For example, a namespace can be configured to reject any Pods that don’t meet the baseline standard and to issue a warning for Pods that don’t comply with the restricted standard.
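A sketch of such a namespace, using the standard Pod Security Admission labels; the namespace name is hypothetical:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-apps                                  # hypothetical namespace name
  labels:
    pod-security.kubernetes.io/enforce: baseline   # reject Pods below the baseline profile
    pod-security.kubernetes.io/warn: restricted    # warn about Pods below the restricted profile
```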
Network Segmentation
In Kubernetes, all Pods can connect to each other by default, which can pose security risks, particularly when multiple independent applications are running in the same cluster. While Kubernetes’ Namespaces offer a way to group related entities, they don’t inherently provide isolation for containers within those namespaces. As such, it’s vital to limit network access to and from Pods to bolster application security.
Network segmentation becomes particularly important in multi-tenant environments where multiple entities share the same cluster. However, setting up network segmentation in Kubernetes can be a complex task. Traditionally, this responsibility fell on administrators who managed firewalls and iptable rules to shape the network topology.
Software-Defined Networking (SDN) is an architecture that enables network administrators to manage network services by abstracting lower-level functionality. This is done by separating the control plane, which makes decisions about data transmission, from the data plane, which carries out these decisions.
Kubernetes allows for the overlay of its flat cluster-internal network with user-defined network segments via the Kubernetes API. This shifts the responsibility of managing network topology to developers who have a better understanding of their applications’ security needs. This Shift-Left approach is particularly beneficial in a microservices environment characterized by distributed dependencies and a complex network of connections.
To implement this Network Segmentation pattern, NetworkPolicies for L3/L4 network segmentation and AuthorizationPolicies for finer control of network boundaries are essential tools.
Multitenancy with Kubernetes
Multitenancy, the ability of a platform to support multiple isolated user groups or tenants, is not extensively supported by Kubernetes out of the box, and its definition can be complex.
A potential solution could be network isolation, which provides a softer approach to multitenancy. For stricter isolation needs, a more encapsulated approach like a virtual control plane per tenant provided by vcluster might be necessary.
Kubernetes has shifted networking tasks to the left, allowing developers to fully define their applications’ networking topology. This includes creating “application firewalls” for network segmentation.
This can be achieved in a few ways:
- Using core Kubernetes features that operate on the L3/L4 networking layers: Developers can create ingress and egress firewall rules for workload Pods by defining resources of the type NetworkPolicy
- Using a service mesh that targets the L7 protocol layer, specifically HTTP-based communication: This allows for filtering based on HTTP verbs and other L7 protocol parameters
Network Policies
In Kubernetes, a resource type known as NetworkPolicy acts as a custom firewall, setting rules for inbound and outbound network connections for Pods. These rules dictate which Pods can be accessed and their connection points. The Container Network Interface (CNI) add-on, used by Kubernetes for internal networking, implements these rules. However, it’s important to note that not all CNI plugins support NetworkPolicies.
NetworkPolicy is supported, either directly or through add-on configuration, by all hosted Kubernetes cloud offerings and other distributions like Minikube. The definition of a NetworkPolicy includes a Pod selector and lists of inbound (ingress) or outbound (egress) rules. The Pod selector uses labels, which are metadata attached to Pods, to match the Pods to which the NetworkPolicy applies. This allows for flexible and dynamic grouping of Pods.
The list of ingress and egress rules defines the permitted inbound and outbound connections for the Pods matched by the Pod selector. These rules specify the allowed sources and destinations for connections to and from the Pods.
NetworkPolicy objects are confined to their namespace and only match Pods within that namespace. Defining cluster-wide defaults for all namespaces through NetworkPolicy is not possible, but some CNI plugins like Calico support custom extensions for defining cluster-wide behavior.
Network segment definition with labels
In Kubernetes, groups of Pods are defined using label selectors, which facilitate the creation of unique networking segments. This is advantageous because it allows developers, who have a deep understanding of the application’s Pods and their communication patterns, to label the Pods appropriately. These labels can then be converted into NetworkPolicies, establishing clear network boundaries for an application with defined entry and exit points.
To segment the network using labels, it’s typical to label all Pods in the application with a unique app label. This label is then used in the NetworkPolicy selector to ensure that all Pods associated with the application are included in the policy.
There are a few prevalent methods for consistently labeling workloads:
- Workload-unique labels: These labels enable the direct modeling of the dependency graph between application components, such as other microservices or a database. This technique is used to model the permission graph, where a label type identifies the application component
- Role or permissions labels: In a more flexible approach, specific role or permissions labels can be defined that need to be attached to every workload that plays a certain role. This method allows new workloads to be added without updating the NetworkPolicy
Deny-all as default policy
In Kubernetes, the default setting (with no NetworkPolicy applied) allows all incoming and outgoing traffic, which can be a potential issue if a Pod is overlooked or future Pods are added without the necessary NetworkPolicy. To mitigate this, it’s advisable to implement a deny-all policy as a starting point.
This policy essentially sets the list of permitted ingresses to an empty list ([]), effectively blocking all incoming traffic. It’s crucial to understand that an empty list is not the same as a list with a single empty element ({}). The latter matches everything, which is the complete opposite of what we’re trying to achieve with a deny-all policy. This distinction is key to maintaining a secure Kubernetes environment.
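A sketch of such a deny-all policy; podSelector: {} matches every Pod in the namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}     # applies to all Pods in this namespace
  policyTypes:
  - Ingress
  ingress: []         # empty list: no ingress rule matches, so all inbound traffic is denied
```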
Ingress
One of the primary use cases involves managing incoming traffic, also known as ingress traffic. This is achieved through a policy that uses a podSelector field to determine which Pods (the basic units of deployment in a Kubernetes system) are permitted to send traffic to a specific Pod. If the incoming traffic matches any of the set ingress rules, the selected Pod can receive it.
There are several ways to configure these ingress rules, as shown in the sketch after this list:
- use namespaceSelector to specify the namespaces where the podSelector should be applied
- instead of selecting Pods from within the cluster, define an IP address range using an ipBlock field
- limit traffic to certain ports on the selected Pod by using a ports field that lists all permitted ports, providing an additional layer of control over the network traffic in your Kubernetes system
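A sketch combining these options; all labels, names, and the port number are hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
spec:
  podSelector:
    matchLabels:
      app: api                 # Pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:             # combined with the namespaceSelector below:
        matchLabels:           # only frontend Pods in the "prod" namespace
          app: frontend
      namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: prod
    ports:
    - protocol: TCP
      port: 8080               # hypothetical application port
```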
Egress
When working with Kubernetes, it’s advisable to begin with a stringent policy. Although blocking all outgoing traffic may seem like a good idea, it’s not feasible because every Pod needs to interact with Pods from the system namespace for DNS lookups.
The policyTypes field in a NetworkPolicy is crucial, as it determines the type of traffic the policy impacts. It can include Egress and/or Ingress elements, indicating which rules are part of the policy. The default value is decided based on whether the ingress and egress rule sections are present.
For an egress-only policy, you need to explicitly set policyTypes to Egress. If this isn’t done, the policy would imply an empty ingress rule set, effectively blocking all incoming traffic.
Selective activation of access to external IP addresses is possible for specific Pods that need network access outside the cluster. However, if you opt for stricter egress rules and also wish to limit the internal egress traffic within the cluster, it’s crucial to always permit access to the DNS server in the kube-system namespace and the Kubernetes API server.
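A sketch of an egress-only policy that keeps DNS working while denying all other outbound traffic (a rule for the API server would be added in the same way):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-dns-only
spec:
  podSelector: {}          # applies to all Pods in this namespace
  policyTypes:
  - Egress                 # explicitly egress-only, so ingress rules are not implied
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system   # the cluster DNS server lives here
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```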
Authorization Policies
Kubernetes provides control over network traffic between Pods at the TCP/IP level. However, there might be situations where you need to manage network restrictions based on higher-level protocol parameters.
This requires a deep understanding of protocols like HTTP and the ability to inspect incoming and outgoing traffic. Unfortunately, Kubernetes doesn’t inherently support this feature.
To overcome this, there are add-ons known as service meshes that extend Kubernetes to provide this advanced network control.
Service Mesh
Service meshes such as Istio and Linkerd are designed to manage operational needs like security, observability, and reliability, enabling applications to concentrate on their core business logic. They typically operate by injecting sidecar containers into workload Pods to monitor Layer 7 traffic.
Istio, in particular, offers a comprehensive set of features for authentication, transport security through mutual TLS, identity management with certificate rotations, and authorization.
Istio leverages the Kubernetes API machinery by introducing its own CustomResourceDefinitions (CRDs) and manages authorization using the AuthorizationPolicy resource. This namespaced resource controls traffic to a specific set of Pods in a Kubernetes cluster through a set of rules.
The policy is composed of:
- Selector: Specifies the Pods to which the policy applies
- Action: Determines what should be done with the traffic that matches the rules
- List of rules: Evaluated against incoming traffic
The AuthorizationPolicy can define network segments of an application that are independent and isolated from each other. It can also be used for application-level authorization when an identity check is added to the rules.
However, it’s crucial to understand that while AuthorizationPolicy is about application authorization, the Kubernetes RBAC model is about securing access to the Kubernetes API server.
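A sketch of an Istio AuthorizationPolicy showing all three parts; the labels, namespace, and service account principal are hypothetical:

```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-frontend
  namespace: prod                  # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: backend                 # Pods the policy applies to
  action: ALLOW                    # what to do with traffic matching the rules
  rules:                           # evaluated against incoming requests
  - from:
    - source:
        principals: ["cluster.local/ns/prod/sa/frontend"]   # identity check via mutual TLS
    to:
    - operation:
        methods: ["GET"]           # L7 filtering on the HTTP verb
        paths: ["/api/*"]
```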
Secure Configuration
Applications often need to interact with external systems, necessitating authentication and secure credential storage. Kubernetes provides Secret resources for storing confidential data, but these are merely Base64 encoded, not encrypted. Despite Kubernetes’ efforts to limit access to Secrets, they can be exposed when stored outside the cluster.
The advent of GitOps, which involves storing configurations in remote Git repositories, introduces additional security risks. Secrets should never be stored unencrypted in remote repositories, and the encryption and decryption process requires careful management. Even within a cluster, access to encrypted credentials isn’t entirely secure since the cluster administrator can access all data.
Trust levels can differ based on whether a third party operates the cluster or whether it’s deployed on a company-wide platform. Depending on these trust boundaries and confidentiality needs, different solutions may be necessary.
Two common approaches for secure configuration are:
- Out-of-cluster encryption: This method stores encrypted configuration data outside of Kubernetes. The data is transformed into Kubernetes Secrets either just before entering the cluster or within the cluster by a continuously running operator process
- Centralized secret management: This approach uses specialized services offered by cloud providers (like AWS Secrets Manager or Azure Key Vault) or in-house vault services (like HashiCorp Vault) to store confidential configuration data
Here are some methods for securing confidential information in Kubernetes:
- Sops: Provides pure client-side encryption, ideal for encrypting Secrets stored in public-readable places like a remote Git repository
- External Secrets Operator: Implements secret synchronization, separating the concerns of retrieving credentials from a remote SMS and using them
- Secret Storage CSI Providers: Provides ephemeral volume projection of secret information, ensuring no confidential information is permanently stored in the cluster except access tokens for external vaults
- Vault Sidecar Agent Injector: Offers sidecar injection that shields applications from direct access to an SMS
However, it’s crucial to remember that if someone with malicious intent gains full root access to your cluster and containers, they can potentially access that data. The aim is to make these kinds of exploits as challenging as possible by adding an extra layer on the Kubernetes Secret abstraction.
Out-of-Cluster Encryption
The out-of-cluster encryption technique involves retrieving secret and confidential data from outside the cluster and converting it into a Kubernetes Secret.
Sealed Secrets
Sealed Secrets, a Kubernetes add-on developed by Bitnami, enables the secure storage of encrypted data within a CustomResourceDefinition (CRD) known as a SealedSecret. This add-on operates by monitoring these resources and generating a Kubernetes Secret, containing the decrypted content, for each SealedSecret. The encryption process is carried out externally from the cluster using a command-line tool named kubeseal, which transforms a Secret into a SealedSecret that can be safely stored in a source code management system like Git.
The encryption employs AES-256-GCM symmetrically with a session key, and the session key is encrypted asymmetrically with RSA-OAEP, mirroring the approach used in TLS. The private key is stored within the cluster and is automatically created by the SealedSecret operator. However, it’s important to note that the administrator is responsible for backing up this key and rotating it when necessary.
SealedSecrets offers three scopes:
- Strict (default)
- Namespace-wide
- Cluster-wide
The desired scope can be chosen when creating the SealedSecret with kubeseal.
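A sketch of what a SealedSecret looks like; the name and namespace are hypothetical, and the ciphertext is a placeholder for what kubeseal would actually produce:

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials       # hypothetical name
  namespace: prod            # hypothetical namespace (bound to the ciphertext in strict scope)
spec:
  encryptedData:
    password: AgB3k7f...     # placeholder; the real ciphertext is generated by kubeseal
```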
One potential limitation of Sealed Secrets is its reliance on a server-side operator that must run continuously within the cluster to perform decryption. Additionally, proper backup of the private key is crucial, because without it, decryption of the secrets will not be possible if the operator is uninstalled.
External Secrets
The External Secrets Operator is a tool designed for Kubernetes that collaborates with various external Secret Management Systems (SMSs).
Unlike Sealed Secrets, which requires you to handle the encrypted data storage, External Secrets delegates the tasks of encryption, decryption, and secure persistence to an external SMS. This approach allows you to utilize features provided by your cloud’s SMS, such as key rotation and a user-friendly interface.
This system promotes a separation of duties, allowing different roles to manage application deployments and secrets independently. It provides flexibility in defining how the external secret data maps to the content of the mirrored Secret. For example, you can use a template to create a configuration with a specific structure.
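A sketch of an ExternalSecret that mirrors a remote entry into a Kubernetes Secret; the store name and remote key are hypothetical and depend on how the administrator configured the backend:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h          # how often to re-sync from the external SMS
  secretStoreRef:
    name: aws-secrets-manager  # hypothetical SecretStore configured by an admin
    kind: SecretStore
  target:
    name: db-credentials       # name of the mirrored Kubernetes Secret
  data:
  - secretKey: password        # key in the mirrored Secret
    remoteRef:
      key: prod/database       # hypothetical entry in the external SMS
      property: password
```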
A significant benefit of this server-side solution is that only the server-side operator has access to the credentials needed to authenticate against the external SMS. The External Secrets Operator project has absorbed several other Secret-syncing projects, establishing itself as the leading solution for mapping and syncing externally defined secrets to a Kubernetes Secret.
However, it does require running a server-side component continuously, which could be seen as a drawback.
Secret OPerationS (Sops)
Sops is a versatile tool created by Mozilla. It’s designed to encrypt and decrypt any YAML or JSON file, making it safe to store these files in a source code repository. This is especially handy in a GitOps environment where all resources are stored in a Git repository. Sops operates entirely outside of a Kubernetes cluster and doesn’t require any server-side component.
The tool encrypts all values in a document, leaving the keys untouched. It supports various encryption methods:
- Local asymmetric encryption via age with keys stored locally
- Storing the secret encryption key in a centralized key management system (KMS). It supports platforms like AWS KMS, Google KMS, and Azure Key Vault as external cloud providers, and HashiCorp Vault as an independently hosted SMS
Sops is a command-line interface (CLI) tool that can be run either locally on your machine or within a cluster (for instance, as part of a CI pipeline).
If you’re operating in one of the major clouds, integrating with their KMSs can provide seamless operation, making Sops a practical tool for GitOps-style Kubernetes workflows.
Secret Management Systems (SMSs) vs Key Management Systems (KMSs)
Both are cloud services that handle different aspects of data security.
SMSs provide an API for storing and accessing secrets, with granular and configurable access control. The secrets are encrypted transparently for the user, eliminating the need for the user to manage this aspect.
On the other hand, KMSs are not databases for secure data but focus on the discovery and storage of encryption keys. These keys can be used to encrypt data outside of a KMS. A good example of a KMS is the GnuPG keyserver.
Each leading cloud provider offers both SMSs and KMSs, and if you’re using one of the major cloud services, you’ll also benefit from good integration with its identity management for defining and assigning access rules to SMS- and KMS-managed data.
Centralized Secret Management
While secrets are made as secure as possible, they can still be read by any administrator with cluster-wide read access. This could potentially be a security concern depending on the trust relationship with cluster operators and specific security requirements.
An alternative to this is to keep the secure information outside the cluster in an external Secret Management System (SMS) and request confidential information on demand over secure channels. There are many such SMSs available, and each cloud provider offers its own variant. The focus here is not on the individual offerings of these systems, but on how they integrate into Kubernetes.
Secrets Store CSI Driver
The Secrets Store CSI Driver is a Kubernetes component that provides access to various centralized Secret Management Systems (SMSs) and mounts their secrets as regular Kubernetes volumes. Unlike a mounted Secret volume, nothing is stored in the Kubernetes etcd database; the data is kept securely outside the cluster. This driver supports the SMSs of the major cloud vendors (AWS, Azure, and GCP) as well as HashiCorp Vault.
The setup for connecting a secret manager via the CSI driver involves some administrative tasks:
- Installing the Secrets Store CSI Driver and configuration for accessing a specific SMS: Cluster-admin permissions are required for the installation process
- Configuring access rules and policies: This results in a Kubernetes service account being mapped to a secret manager-specific role that allows access to the secrets
After the setup, you must define a SecretProviderClass, where you select the backend provider for the secret manager and add the provider-specific configuration.
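A sketch using the HashiCorp Vault provider; the Vault address, role, and secret path are hypothetical, and the parameters differ per provider:

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: vault-database
spec:
  provider: vault                                  # backend provider
  parameters:                                      # provider-specific configuration
    vaultAddress: "https://vault.example.com:8200" # hypothetical Vault endpoint
    roleName: "database-role"                      # hypothetical Vault role
    objects: |
      - objectName: "db-password"
        secretPath: "secret/data/database"         # hypothetical path in Vault
        secretKey: "password"
---
# Pods then reference the class through an ephemeral CSI volume:
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0            # hypothetical image
    volumeMounts:
    - name: secrets
      mountPath: /mnt/secrets
      readOnly: true
  volumes:
  - name: secrets
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: vault-database
```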
While the setup for a CSI secret storage driver is complex, its usage is straightforward, and it allows you to avoid storing confidential data within Kubernetes.
However, there are more moving parts than with Secrets alone, so more things can go wrong, and it’s harder to troubleshoot.
Pod injection
There are different methods for an application to access external Secret Management Systems (SMSs):
- Direct Access: The application can directly access the SMS via proprietary client libraries. However, this method requires storing the credentials along with the application and adds a hard dependency on a specific SMS
- Container Storage Interface (CSI) Abstraction: The abstraction allows secret information to be projected into volumes visible as files for the deployed application, providing a more decoupled approach
- Init Container: An Init Container fetches confidential data from an SMS and copies it to a shared local volume that is mounted by the application container. The secret data is fetched only once before the main container starts
- Sidecar: A Sidecar syncs secret data from the SMS to a local ephemeral volume accessed by the application. This method allows for updating secrets locally if the SMS rotates them
The HashiCorp Vault Sidecar Agent Injector is implemented as a mutating webhook: it modifies a Pod at creation time based on Vault-specific annotations in the Pod specification. This technique is entirely transparent to the user and has fewer moving parts than hooking up a CSI secret storage volume with the provider deployment for a particular SMS product.
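A sketch of the annotations that trigger the injection; the role name, Vault path, and image are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
  annotations:
    vault.hashicorp.com/agent-inject: "true"                           # opt in to sidecar injection
    vault.hashicorp.com/role: "app-role"                               # hypothetical Vault role
    vault.hashicorp.com/agent-inject-secret-db-creds: "secret/data/db" # hypothetical Vault path
spec:
  serviceAccountName: app-sa     # identity used to authenticate against Vault
  containers:
  - name: app
    image: registry.example.com/app:1.0   # hypothetical image
```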
Access Control
There are two core concepts of security:
- Authentication (AuthN): identifying the subject of an operation
- Authorization (AuthZ): determining permissions for actions on resources
Developers are typically more concerned with authorization, such as who can perform operations in the cluster and access specific parts of an application. Misconfigured access can lead to privilege escalation and deployment failures, hence it’s crucial for developers to understand authorization rules set up by administrators.
Every request to the Kubernetes API server has to pass through a few stages:
- Authentication
- Authorization
- Admission Control
Once a request passes the Authentication and Authorization stages, a final check is done by Admission controllers before the request is processed. This process ensures fine-grained access management and restrictions are in place to limit the impact of any potential security breaches.
Key points for access control:
- Namespace-specific resources: Use a Role with a RoleBinding that connects to a user or ServiceAccount
- Reuse access rules: Use a RoleBinding with a ClusterRole that defines shared-access rules
- Extend predefined ClusterRoles: Create a new ClusterRole with an aggregationRule field that refers to the ClusterRoles you wish to extend
- Access to all resources of a specific kind: Use a ClusterRole and a ClusterRoleBinding
- Manage cluster-wide resources: Use a ClusterRole and a ClusterRoleBinding
RBAC allows for fine-grained permissions and can reduce risk by ensuring no gaps are left for escalation paths. However, broad permissions can lead to security escalations.
General guidelines:
- Avoid wildcard permissions: Follow the principle of least privilege when defining Roles and ClusterRoles
- Avoid the cluster-admin ClusterRole: High privileges can lead to severe security implications
- Don’t automount ServiceAccount tokens: If a Pod gets compromised, an attacker can talk to the API server with the permissions of the Pod’s associated ServiceAccount
Authentication
Kubernetes offers several pluggable authentication strategies that administrators can configure:
- Bearer Tokens (OpenID Connect) with OIDC Authenticators: OpenID Connect (OIDC) Bearer Tokens can authenticate clients and grant access to the API Server. The client sends the OIDC token in the Authorization header of their request, and the API Server validates the token to allow access
- Client certificates (X.509): The client presents a TLS certificate to the API Server, which is then validated and used to grant access
- Authenticating Proxy: This refers to using a custom authenticating proxy to verify the client’s identity before granting access to the API Server
- Static Token files: Tokens can be stored in standard files and used for authentication. The client presents a token to the API Server, which is then used to look up the token file and search for a match
- Webhook Token Authentication: A webhook can authenticate clients and grant access to the API Server. The client sends a token in the Authorization header of their request, and the API Server forwards the token to a configured webhook for validation
Kubernetes allows you to use multiple authentication plugins simultaneously, such as Bearer Tokens and Client certificates.
However, the order in which these strategies are evaluated is not fixed, so it’s impossible to know which one will be checked first. The process will stop after one is successful, and Kubernetes will forward the request to the next stage for authorization.
Authorization
Kubernetes uses Role-Based Access Control (RBAC) as a standard method for managing system access. RBAC allows developers to execute actions in a detailed manner. The authorization plugin in Kubernetes is easily pluggable, enabling users to switch between the default RBAC and other models like Attribute-Based Access Control (ABAC), webhooks, or delegation to a custom authority.
ABAC requires a file containing policies in a JSON-per-line format, but any changes necessitate reloading the server, which can be a disadvantage due to its static nature. This is why ABAC-based authorization is only used in certain cases.
Admission Controllers
Admission controllers are a feature of the Kubernetes API server that intercept requests to the API server and perform additional actions based on those requests. They can enforce policies, perform validations, and modify incoming resources. Kubernetes uses Admission controller plugins for various functions, such as setting default values on specific resources and validations. External webhooks can be configured for validation and updating API resources.
Authorization has two fundamental parts:
- Who: Represented by a subject: a human person or a workload identity
- What: Representing the actions those subjects can trigger at the Kubernetes API server
Subject
In Kubernetes, a “subject” refers to the identity associated with a request to the Kubernetes API server.
There are two types of subjects:
- human users
- ServiceAccounts, which represent the workload identity of Pods
Both human users and ServiceAccounts can be grouped into user groups and service account groups, respectively. These groups can act as a single subject, with all members sharing the same permission model.
Users
In Kubernetes, human users are not defined as explicit resources in the Kubernetes API, meaning they can’t be managed via an API call.
Instead, authentication and mapping to a user subject are handled externally. After successful authentication, each component creates the same user representation and adds it to the actual API request for later verification.
This user representation includes the username, a unique user id (UID), a list of groups that the user belongs to, and additional information as comma-separated key-value pairs.
This information is evaluated by the Authorization plugin against the authorization rules associated with the user or via its membership to a user group.
Certain usernames are reserved for internal Kubernetes use and have the special prefix system:. It’s recommended to avoid creating your own users or groups with the system: prefix to avoid conflicts. While external user management can vary, workload identities for Pods are standardized as part of the Kubernetes API and are consistent across all clusters.
Service accounts
In Kubernetes, service accounts represent non-human users within the cluster, enabling processes within a pod to interact with the Kubernetes API Server. These accounts are authenticated by the API server using a specific username format.
Every namespace includes a default service account that is used by any pod that doesn’t specify its own service account. Each of these accounts has an associated JWT, which is managed by Kubernetes and automatically integrated into each pod’s filesystem.
The service account token can be mapped directly into the pod’s filesystem, which enhances security by eliminating the need for an intermediate token representation and allowing for token expiration times.
Before the release of Kubernetes 1.24, these tokens were represented as secrets and had long lifetimes without rotation. However, with the introduction of the projected volume type, the token is now only accessible to the pod and isn’t exposed as an additional resource.
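A sketch of such a projected token volume; the expiration and mount path are illustrative, and the image name is hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0   # hypothetical image
    volumeMounts:
    - name: sa-token
      mountPath: /var/run/secrets/tokens
  volumes:
  - name: sa-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          expirationSeconds: 3600   # token is rotated by the kubelet before it expires
```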
The service account resource also includes fields for specifying credentials for pulling container images and defining mountable secrets. Image pull secrets enable workloads to authenticate with a private registry when pulling images.
You can attach a pull secret directly to a service account in Kubernetes, which means that every pod associated with that service account will automatically include the pull secrets in its specification when it’s created. This eliminates the need to manually include them each time a new pod is created.
Furthermore, you can specify which secrets a pod associated with a service account can mount using the secrets field in the service account resource. You can enforce this restriction by adding a specific annotation to the service account. If this annotation is set to true, only the listed secrets will be mountable by pods associated with the service account, enhancing security and reducing manual effort.
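A sketch of a ServiceAccount combining both features; the secret names are hypothetical, and the annotation shown is the one Kubernetes uses for the mountable-secrets restriction:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa
  annotations:
    kubernetes.io/enforce-mountable-secrets: "true"  # only the secrets listed below may be mounted
imagePullSecrets:
- name: registry-creds     # hypothetical pull secret, added to every Pod using this account
secrets:
- name: app-tls            # hypothetical; the only Secret Pods of this account may mount
```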
JSON Web Tokens (JWTs)
JWTs are digitally signed tokens used in Kubernetes as Bearer Tokens for API requests. They consist of a header, payload, and signature, and are represented as a sequence of Base64 URL-encoded parts separated by periods.
The payload of a JWT can carry various information, such as the identity of the workload making the request, the expiration time, the issuer, and more. Tools like jwt.io can be used to decode, validate, and inspect these tokens.
In Kubernetes, the API server verifies the JWT’s signature by comparing it with a public key published in a JSON Web Key Set (JWKS), following the JSON Web Key (JWK) specification which defines the cryptographic algorithms used in the verification process.
The tokens issued by Kubernetes contain useful information in their payload, such as the issuer of the token, its expiration time, user information, and any associated service accounts.
Groups
In Kubernetes, both user and service accounts can be part of one or more groups. These groups are attached to requests by the authentication system and are used to grant permissions to all members of the group. Group names are plain strings that represent the group name and can be freely defined and managed by the identity provider. This allows for the creation of groups of subjects with the same permission model.
Kubernetes also has a set of predefined groups that are implicitly defined and have a system: prefix in their name. Group names can be used in a RoleBinding to grant permissions to all group members.
Users, ServiceAccounts, and groups can be associated with Roles that define the actions they are allowed to perform against the Kubernetes API server.
Role-Based Access Control
In Kubernetes, Roles are used to define the specific actions that a subject (like users or service accounts) can perform on particular resources. These Roles are assigned to subjects through RoleBindings. Both Roles and RoleBindings are resources within Kubernetes that can be created and managed like any other resource, and they are tied to a specific namespace, applying to its resources.
It’s crucial to understand that in Kubernetes RBAC (Role-Based Access Control), there is a many-to-many relationship between subjects and Roles. This means a single subject can have multiple Roles, and a single Role can be applied to multiple subjects. The relationship between a subject and a Role is established using a RoleBinding, which contains references to a list of subjects and a specific Role.
Role
Roles in Kubernetes are used to define a set of allowed actions for a group of Kubernetes resources or subresources. They consist of a list of rules that describe which resources can be accessed.
Each rule is described by a few fields:
- apiGroups: This list specifies the API groups the rule applies to. An empty string (“”) denotes the core API group, which contains primary Kubernetes resources such as Pods and Services. A wildcard character (*) matches all API groups the cluster is aware of
- resources: This list specifies the resources that Kubernetes should grant access to. Each entry should belong to at least one of the configured apiGroups. A single * wildcard entry means all resources from all configured apiGroups are allowed
- verbs: These are similar to HTTP methods and define the allowed actions. They include CRUD operations on resources (create, read, update, delete) and separate actions for operations on collections, such as list and deletecollection. Additionally, the watch verb allows access to resource change events and is separate from directly reading the resource with get
Only one rule needs to match a request for the Role to grant access.
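A minimal sketch of a Role granting read access to Pods; the names are hypothetical:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: prod                   # Roles are namespaced
rules:
- apiGroups: [""]                   # "" selects the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]   # read-only access, including change events
```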
RoleBinding
In Kubernetes, RoleBindings are used to link one or more subjects to a specific Role. Each RoleBinding can connect a list of subjects to a Role, with the subjects list field taking resource references as elements. These references have a name field, plus kind and apiGroup fields that define the resource type to reference.
A subject in a RoleBinding can be one of the following types:
- User: A human or system user authenticated by the API server. User entries have a fixed apiGroup value of rbac.authorization.k8s.io
- Group: A collection of users. Group entries also carry rbac.authorization.k8s.io as apiGroup
- ServiceAccount: Belongs to the core API group, which is represented by an empty string (“”). It’s unique in that it can also carry a namespace field, allowing you to grant access to Pods from other namespaces
The other end of a RoleBinding points to a single Role, which can either be a Role resource within the same namespace as the RoleBinding or a ClusterRole resource shared across multiple bindings in the cluster. Similar to the subjects list, Role references are specified by name, kind, and apiGroup.
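A sketch binding the hypothetical pod-reader Role from above to a ServiceAccount:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: prod
subjects:
- kind: ServiceAccount
  name: app-sa               # hypothetical ServiceAccount
  namespace: prod
roleRef:                     # points to exactly one Role or ClusterRole
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```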
Privilege-Escalation Prevention
The RBAC (Role-Based Access Control) subsystem in Kubernetes is designed to prevent privilege escalation. It manages Roles and RoleBindings, including their cluster-wide counterparts, ClusterRoles and ClusterRoleBindings. Here are the key restrictions:
- Users can only update a Role if they already possess all the permissions within that Role, or if they have permission to use the escalate verb on all resources in the rbac.authorization.k8s.io API group
- A similar rule applies to RoleBindings: users must either have all the permissions granted in the referenced Role, or they must have the bind verb allowance on the RBAC resources
These restrictions are designed to prevent users with control over RBAC resources from elevating their own permissions.
ClusterRole
Common purposes:
- Securing cluster-wide resources: Resources such as CustomResourceDefinitions or StorageClasses are typically managed at the cluster-admin level and require additional access control. Developers may have read access to these resources but are usually not permitted to write to them. ClusterRoleBindings are used to grant subjects access to these cluster-wide resources
- Defining typical Roles that are shared across namespaces: ClusterRoles allow you to define general access-control Roles (e.g., view for read-only access to all resources) that can be used in multiple RoleBindings. This lets you create typical access-control schemes that can be easily reused
Sometimes, you may need to combine the permissions defined in two ClusterRoles. This can be achieved using aggregation, where you define a ClusterRole with an empty rules field and a populated aggregationRule field containing a list of label selectors. The rules defined by every other ClusterRole whose labels match these selectors are combined and used to populate the rules field of the aggregated ClusterRole.
This technique allows you to dynamically build up large rule sets by combining smaller, more focused ClusterRoles. It also allows you to quickly compose more specialized ClusterRoles by aggregating a set of basic ClusterRoles.
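A sketch of an aggregated ClusterRole; the selector label is hypothetical:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitoring
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.example.com/aggregate-to-monitoring: "true"   # hypothetical marker label
rules: []   # left empty; the controller fills it from matching ClusterRoles
```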
You can view the complete list of ClusterRoles available in a Kubernetes cluster using the kubectl get clusterroles command, or refer to the Kubernetes documentation for the list of default ClusterRoles.
ClusterRoleBinding
A ClusterRoleBinding in Kubernetes is similar to a RoleBinding, but it applies to all namespaces in the cluster, ignoring the namespace field. This means that the rules defined in a referenced ClusterRole, such as view-pod, apply universally, allowing any Pod associated with a specific ServiceAccount (like test-sa) to read all Pods across all namespaces.
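A sketch of such a binding, reusing the view-pod ClusterRole and test-sa ServiceAccount mentioned above (the namespace is assumed):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: view-pods-everywhere
subjects:
- kind: ServiceAccount
  name: test-sa              # from the example above
  namespace: default         # assumed namespace
roleRef:
  kind: ClusterRole
  name: view-pod
  apiGroup: rbac.authorization.k8s.io
```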
However, caution is advised when using ClusterRoleBindings due to their wide-ranging permissions across the entire cluster. It’s recommended to consider carefully whether using a ClusterRoleBinding is necessary. While it may be convenient as it automatically grants permissions to newly created namespaces, using individual RoleBindings per namespace is generally better for more granular control over permissions. This allows for the omission of specific namespaces, such as kube-system, from unauthorized access.
ClusterRoleBindings should be used only for administrative tasks, like managing cluster-wide resources (Nodes, Namespaces, CustomResourceDefinitions, or even ClusterRoleBindings). Despite its power, Kubernetes RBAC can be complex to understand and even more complicated to debug. Therefore, it’s important to have a good understanding of a given RBAC setup.
Conclusion
Kubernetes Security patterns are critical to protecting your applications and data. By following the best practices, you can secure your Kubernetes cluster and protect your applications from threats.
Remember to always use the principle of least privilege, network segmentation, and monitoring and logging to ensure the security of your cluster. Additionally, keep your cluster up-to-date with the latest security patches and comply with security standards and regulations.
These patterns are not exhaustive, but they provide some best practices and tools that can help you improve the security of your Kubernetes cluster.