Security Pattern – Container Orchestration

Overview

This security pattern outlines the controls needed to protect orchestration services used for deploying clustered containers.

Container orchestration enables provisioning and scaling of containers across multiple hosts and platforms. It also facilitates service registration and discovery for clustered container groups.

Common examples of container orchestration technologies include Kubernetes, HashiCorp Nomad, and Apache Mesos. These services inherently bring new security challenges due to the rapid pace of change and large scale of container deployments.

Typical Challenges

Compromise of orchestration services or sub-components can lead to unauthorised access to containers or deployment of containers to unauthorised hosts. For example, a rogue workstation joining a container cluster could subsequently be allowed to deploy sensitive containers onto that host.
Increased complexity and attack surface across orchestration services and multi-tenant container hosting. Managing configurations at scale across multiple container deployments and container hosting platforms increases the risk of human error or inadvertent changes that weaken security posture.
Increased complexity in network micro-segmentation for container clusters. Ensuring appropriate policies for segregation of container-to-container communications becomes more challenging.
Incident management and forensics become more difficult, as containers can be restarted and destroyed, limiting the ability to investigate the root cause of infections, compromises, or impacts.

Scope

This document addresses the security threats that relate to the following:

ID	Description	Example
01	Container orchestration across multiple hosts	Deployment of Kubernetes across Docker hosting platform.

Out of Scope

ID	Description of exclusion	Reason for Exclusion
01	Deployment of Service Mesh architectures across clustered containers	Service Mesh deployments are closely related to Container Orchestration. These patterns are maintained separately due to the different associated security challenges.
02	Protection of Container Platform	Covered separately under a different security pattern
03	Container platforms integration with DNS services	DNS is a core capability for supporting container orchestration but requires separate consideration of security challenges.

Dependencies

ID	Description	Impact from dependency not met
01	Summarised security design principles outlined under Jericho Forum® Commandments. https://publications.opengroup.org/w124	Minimal impact as these principles are used as an example set of baseline requirements within security pattern.

Constraints

ID	Description	Impact from constraint
01

Assumptions

ID	Description	Impact if assumption is false
01	Assets are assumed to have criticality ratings of either ‘Non-Critical’ or ‘Critical’.	Minimal. Values only used for demonstrating the solution within security pattern
02	Assets are assumed to have data classification ratings of either ‘Internal’, ‘Sensitive’ or ‘Highly Sensitive’	Minimal. Values only used for demonstrating the solution within security pattern
03	Assets are assumed to be exposed to different network security domains of either ‘Public’, ‘Partner’ or ‘Internal’	Minimal. Values only used for demonstrating the solution within security pattern

Assets at Risk

The following section provides a list of assets affected by the problem statement:

Asset Title	Asset Description
Orchestration Scheduler and Resource Manager	Provisions and deploys container clusters across target container host platforms based on prescribed cluster configurations. Monitors overall container cluster service state, container failures, and system resources.
Orchestration Metadata Datastore	Maintains metadata information for host platforms, containers, and services. This includes metadata associated with registration, inventory, and configuration for these assets.
Container Host Platform	Hosting platform for containers, including the container host operating system, container engine, and any local orchestrator agents required for container scheduling and health checks.
Container Network Services	Overlay network fabric and supporting services including NAT, proxy, and load balancing. These services provide ingress and egress connectivity for containers to endpoints located outside the container cluster. Services may be co-located on the same system as the Container Host Platform or hosted as a gateway service.

The following assets are also referenced within the pattern but are not in scope:

CI/CD Pipelines and Tooling – Broader orchestration services for supporting assets. Covered under separate security pattern.

Threat Model

The following section provides a list of threats within the problem statement:

Threat Event (ID / Title)	Threat Description and Characteristics	Diagram
TE-11: Disruption to information systems due to misconfiguration or maintenance errors	Unintended changes in orchestration configuration, metadata, or network policy may disrupt container clusters or their supporting system resources. These changes can occur through human error by orchestration system administrators or due to unnecessary privileges granted to developers deploying Application Containers.
TE-23: Man-in-the-middle attack or network traffic modification	Compromise or eavesdropping of communications between orchestrator services, container host platform, and related sub-components. This could allow a malicious entity to disable or disrupt applications through replay attacks or manipulation of network packets associated with administrative functions.
TE-25: Generation of false identities	Compromise of system credentials or secrets stored in orchestration metadata may lead to the creation of false identities. This could allow unauthorised container host platforms to impersonate legitimate users or the insertion of rogue hosts that are mistakenly considered trusted platforms.
TE-26: Abuse of resources through misconfiguration	Deliberate manipulation of orchestration services and functions to disrupt or disable applications within clustered containers. Compromising the integrity of the metadata database or making intentional changes to scheduler configuration may cause unavailability of services essential for operating and deploying container clusters.
TE-35: Lack of security insights to detect security threats	Increased complexity in detecting and investigating malicious activity or rogue processes within containers. This complexity stems from the rapid pace of changes made by container orchestration. Insights to track events or actions within containers can be limited when orchestration services automatically destroy and redeploy containers flagged as inactive or unhealthy. This restriction hinders the ability to conduct forensics for clustered containers, especially when manual intervention or tasks are required.

Target State Solution

Summary

The target state solution is derived from the following design requirements, which inform the solution overview and the design principles.

Design Requirements

The target state solution is required to meet the following requirements, as referenced under Dependencies, Assumptions and Constraints.

Requirement	Implication to Design Principles
1. The scope and level of protection should be specific and appropriate to the asset at risk.	Maintain segregated domains for host container platforms with different sensitivity and criticality
2. Security mechanisms must be pervasive, simple, scalable, and easy to manage.	The security pattern maintains clear security principles to be applied for container orchestration.
3. Assume context at your peril.	Controls defined in this security pattern are used to identify and measure problems, limitations or issues
4. Devices and applications must communicate using open, secure protocols.	Open and encrypted communication channels such as HTTPS are applied for interfaces to container orchestration services
5. All devices must be capable of maintaining their security policy on an un-trusted network.	Container Orchestration is protected against both external and internal threats.
6. All people, processes, and technology must have declared and transparent levels of trust for any transaction to take place.	Validate container host platforms before joining orchestration cluster
7. Mutual trust assurance levels must be determinable.	Establish mutual trust between container orchestration services and container host platforms
8. Authentication, authorization, and accountability must interoperate/exchange outside of your locus/area of control.	Apply authentication and authorization for internal, partner and public clients.
9. Access to data should be controlled by security attributes of the data itself.	Utilise secrets management to protect secrets or sensitive information required during container deployment
10. Data privacy (and security of any asset of sufficiently high value) requires a segregation of duties/privileges.	Utilise secrets management for securing sensitive data or credentials.
11. By default, data must be appropriately secured when stored, in transit, and in use.	Protect containers both during deployment and run time. Isolate containers suspected of compromise

Solution Overview

The orchestration of Application Containers across multiple hosting platforms is divided into segregated deployments. Security categorization levels determine how containers are deployed and isolated between Container Host Platforms operating within different security groups.

Additional Notes

Enforce restrictions for any hosts attempting to join clusters from other security domains such as Public, Partner or End User locations.
Services and sub-components within the orchestration control plane establish mutual trust and communicate across protected channels (where supported).
The metadata datastore operates as the source of truth for container security categorisation information, regardless of whether it is applied as security attributes or as tags on each container.

Security categorisation levels

The rationale for segregation is to minimize the potential for compromise of services or traffic between Application Containers with different security posture requirements. These ‘protection ratings’ are associated with additional assurance for controls implemented to protect critical or sensitive Application Containers.

Containers receive a security categorization based on:

Exposure to external sources
Criticality of the application services running within the containers
Sensitivity of data being transferred or processed within the containers

The following provides attributes (with example values) for modeling segregation associated with ‘protection ratings’:

ID	Description	Values
1	Exposure of application services running within the containers	Public, Partner, Internal
2	Criticality of the application services running within the containers	Non-Critical, Critical
3	Sensitivity of data being transferred or processed within the containers	Internal, Sensitive, Highly Sensitive

The ‘protection ratings’ for establishing boundaries between Container Host Platforms are subsequently calculated based on the following tiering model.

	Internal (Non-Critical)	Partner (Non-Critical)	Public (Non-Critical)	Internal (Critical)	Partner (Critical)	Public (Critical)
Internal	Default	Default	Default	Default	Protected	Protected
Sensitive	Default	Default	Default	Protected	Protected	Protected
Highly Sensitive	Protected	Protected	Protected	Highly Protected	Highly Protected	Highly Protected

See notes in Assumptions for the listed sample values. The values provided are for demonstrating the segregation model and will differ for each organisation.
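The tiering model above can be read as a simple lookup. The following is a minimal sketch in Python, assuming the sample values from the table; the function name `protection_rating` and the data layout are illustrative and not part of the pattern.

```python
# Illustrative only: sample values copied from the tiering table above.
# Each organisation is expected to define its own values.
EXPOSURES = ("Internal", "Partner", "Public")
CRITICALITIES = ("Non-Critical", "Critical")

# Rows are keyed by data sensitivity. Columns follow the table's order:
# all Non-Critical exposures first, then all Critical exposures.
_TIER_ROWS = {
    "Internal": ["Default", "Default", "Default", "Default", "Protected", "Protected"],
    "Sensitive": ["Default", "Default", "Default", "Protected", "Protected", "Protected"],
    "Highly Sensitive": ["Protected"] * 3 + ["Highly Protected"] * 3,
}


def protection_rating(sensitivity, exposure, criticality):
    """Return the protection rating for one container (hypothetical helper)."""
    try:
        row = _TIER_ROWS[sensitivity]
        column = (CRITICALITIES.index(criticality) * len(EXPOSURES)
                  + EXPOSURES.index(exposure))
    except (KeyError, ValueError) as exc:
        # Unknown values are rejected rather than silently given a default rating.
        raise ValueError(f"unknown attribute value: {exc}") from exc
    return row[column]
```

For example, `protection_rating("Sensitive", "Internal", "Critical")` looks up the 'Sensitive' row and the 'Internal (Critical)' column. Rejecting unknown values is a design choice of this sketch, so that an uncategorised container is never placed by accident.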

Isolating Compromised Containers

For Application Containers suspected of compromise or breach, the following containment measures can be taken:

Snapshot the compromised Application Container.
Redeploy to an isolated security forensics environment for investigation.
Scale down the Application Container cluster to zero using Container Orchestration services to prevent similar containers from being compromised.
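The ordering of the containment measures above can be sketched as a small workflow. This is a minimal illustration in Python, assuming a hypothetical `Orchestrator` interface; real orchestration services expose different APIs, and the `RecordingOrchestrator` class is only an in-memory stand-in used to show the sequence of calls.

```python
from typing import List, Protocol, Tuple


class Orchestrator(Protocol):
    """Hypothetical interface; not the API of any real orchestrator."""

    def snapshot(self, container_id: str) -> str: ...
    def redeploy(self, snapshot_id: str, environment: str) -> None: ...
    def scale(self, cluster: str, replicas: int) -> None: ...


class RecordingOrchestrator:
    """In-memory stand-in that records calls so the step order is visible."""

    def __init__(self) -> None:
        self.calls: List[Tuple] = []

    def snapshot(self, container_id: str) -> str:
        self.calls.append(("snapshot", container_id))
        return f"snap-{container_id}"

    def redeploy(self, snapshot_id: str, environment: str) -> None:
        self.calls.append(("redeploy", snapshot_id, environment))

    def scale(self, cluster: str, replicas: int) -> None:
        self.calls.append(("scale", cluster, replicas))


def contain(orch: Orchestrator, container_id: str, cluster: str,
            forensics_env: str = "forensics") -> str:
    """Apply the three containment measures in order and return the snapshot id."""
    snapshot_id = orch.snapshot(container_id)   # 1. snapshot the suspect container
    orch.redeploy(snapshot_id, forensics_env)   # 2. redeploy into the isolated environment
    orch.scale(cluster, 0)                      # 3. scale the cluster down to zero
    return snapshot_id
```

The snapshot is taken first because scaling to zero destroys the running container and, with it, the evidence needed for investigation.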

Additional Notes

Automate security processes and workflows within orchestration services where supported. Allow automated triggering of workflows based on orchestration events or CI/CD pipelines (where applicable for deployment).

Design Principles

The following design principles are applied for this pattern, based on the requirements.

Tightly restrict cluster-wide administrative access within Container Orchestration.
Segregate container deployments to host platforms based on security categorisation for sensitivity and criticality.
Apply tighter access restrictions to those containers exposed to external clients.
Ensure that container host platforms are securely introduced to the cluster.
Isolate containers suspected of compromise and remove them from the cluster.

Actors

The following actors are involved in this pattern.

Actor Type	Actor Description
Container Orchestration Administrator	Responsible for management and administration of orchestration services.
Container Platform Administrator	Responsible for management and administration of Container Engine and Container Host Operating System
Application Developer	Design, build and deployment of microservice applications within Application Containers.

Locations

This pattern applies to any location where the assets are used.

Location	Location Description
Container Hosting Platform	These platforms are deployed within trusted internal hosting environments, hosted within on-premise or cloud provider environments.

Sequencing

The pattern applies across the following stages.

Stage gate	Description
Container Cluster Deployment	Orchestration services manage how and where containers are being deployed. Includes the scale of deployment, policies governing communications and container privileges assigned.
Container Cluster Run Time	Execution and operations for active container clusters that are successfully deployed

Mapping Threats to Controls

The following provides a mapping of security threats to affected assets and the security control objectives required to mitigate them (further detailed in subsequent security pattern logical designs).

Threat Event	Affects Assets	Security Controls Objectives
TE-11: Disruption to information systems due to misconfiguration or maintenance errors	Orchestration Scheduler and Resource Manager	AC-06: Least Privilege; AU-02: Event Logging; CM-02: Baseline Configuration; CM-05: Access Restrictions for Change; CP-09: System Backup; CP-10: System Recovery and Reconstitution
TE-23: Man-in-the-middle attack or network traffic modification	Container Network Services	AC-04: Information Flow Enforcement; AC-12: Session Termination; SC-02: Separation of System and User Functionality; SC-03: Security Function Isolation; SC-07: Boundary Protection; SC-08: Transmission Confidentiality and Integrity; SC-11: Trusted Path; SC-37: Out-of-band Channels
TE-25: Generation of false identities	Orchestration Metadata Datastore; Container Host Platform	AC-03: Access Enforcement; AC-06: Least Privilege; SC-03: Security Function Isolation; SC-17: Public Key Infrastructure Certificates; SR-09: Tamper Resistance and Detection
TE-26: Abuse of resources through misconfiguration	Orchestration Scheduler and Resource Manager; Orchestration Metadata Datastore	AC-03: Access Enforcement; AC-14: Permitted Actions Without Identification or Authentication; AU-02: Event Logging; CM-02: Baseline Configuration; CP-09: System Backup; CP-10: System Recovery and Reconstitution; IA-09: Service Identification and Authentication; MA-03: Maintenance Tools; RA-02: Security Categorization; RA-05: Vulnerability Monitoring and Scanning; SC-32: System Partitioning; SI-04: System Monitoring; SR-09: Tamper Resistance and Detection
TE-35: Lack of security insights to detect security threats	Container Host Platform	AU-02: Event Logging; AU-09: Protection of Audit Information; IR-04: Incident Handling; IR-05: Incident Monitoring; IR-08: Incident Response Plan; IR-10: Incident Analysis

Security Pattern

Pattern View: Container Orchestration

Control list: Orchestration Scheduler and Resource Manager

Control Objective	Control Description
AC-03: Access Enforcement	Restrict administrative access to orchestration service interfaces and APIs through IAM policies or source IP whitelisting.
AC-06: Least Privilege	Define a granular RBAC model, with least privilege defined for actions and roles on specific hosts and containers. Minimise the use of wildcard policies when applying defined roles. Allocation of cluster-wide administrative accounts is tightly controlled.
AC-14: Permitted Actions Without Identification or Authentication	Disable any unauthenticated or anonymous access to orchestration services.
AU-02: Event Logging	Enforce event logging across orchestration services and sub-components. Forward and capture log events to external security logging and monitoring service.
CM-02: Baseline Configuration	Ensure the baseline security configuration for orchestration services is hardened to industry or vendor best practice (e.g. CIS Security Benchmark), including system permissions for configuration files and services. Ensure regular patching cycles are applied.
CM-05: Access Restrictions for Change	Ensure multi-factor authentication is applied for administrator accounts with access to orchestration services, particularly those accounts with cluster-wide privileges.
CP-09: System Backup	Regular scheduled backups are applied for orchestration system information, configuration and the metadata datastore. Ensure mechanisms are employed to protect the integrity of system backups.
CP-10: System Recovery and Reconstitution	Ensure reconstitution for orchestration services and restoration of container clusters back to operational states.
IA-09: Service Identification and Authentication	Orchestrators ensure that containers are securely deployed to the cluster with a unique and persistent identity throughout their lifecycle. Minimise the use of any shared credentials for service accounts used for orchestration or administrative functions.
MA-03: Maintenance Tools	Validate any 3rd party components, modules or plugin services integrated within orchestration services or performing management functions.
RA-02: Security Categorization	Categorise and group Container Host Platforms into logical entities based on the sensitivity and criticality of hosted Application Containers. Isolate sensitive workloads and specify that containers with the same protection rating run on the same host OS. Maintain the source of truth for this information within a purpose-built metadata datastore.
RA-05: Vulnerability Monitoring and Scanning	Regularly monitor and scan for vulnerabilities in the orchestration system and related sub-components
SC-32: System Partitioning	Isolate Container Host Platforms within separate security groups or segments. Ensure separate system partitioning of the orchestration control plane from those systems used within Container Host Platforms.
SI-04: System Monitoring	Compare and analyse runtime activity across containers within the same deployment for a given environment. Deviations in network activity, active processes or behaviour between similar deployed containers are flagged as Indicators of Compromise (IOC). If compromise is suspected, trigger forensics on the suspicious containers and then scale to zero or terminate any containers within that cluster.
SR-09: Tamper Resistance and Detection	Ensure authentication and mutual trust is applied within sub-components of orchestration services and metadata service to
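As one illustration of the AC-06 control in the list above, wildcard policies can be detected with a simple review script. The following is a minimal sketch in Python, assuming a hypothetical role format (a dictionary with `name`, `actions` and `resources` keys); it does not reflect the RBAC schema of any particular orchestrator.

```python
from typing import Dict, Iterable, List

WILDCARD = "*"


def find_wildcard_roles(roles: Iterable[Dict]) -> List[str]:
    """Return the names of roles that grant a wildcard action or resource.

    The role layout used here is an assumption made for illustration only.
    """
    flagged = []
    for role in roles:
        grants = list(role.get("actions", [])) + list(role.get("resources", []))
        if WILDCARD in grants:
            flagged.append(role["name"])
    return flagged


# Hypothetical example roles.
example_roles = [
    {"name": "deployer", "actions": ["create", "update"], "resources": ["deployments"]},
    {"name": "cluster-admin", "actions": ["*"], "resources": ["*"]},
    {"name": "log-reader", "actions": ["get"], "resources": ["*"]},
]
```

Calling `find_wildcard_roles(example_roles)` lists the roles that warrant review. A flagged role is not necessarily wrong, but each one should have a recorded justification.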

Control list: Orchestration Metadata Datastore

Control Objective	Control Description
AC-03: Access Enforcement	Restrict administrative access to metadata service interfaces and APIs through IAM policies and/or source IP whitelisting.
AC-06: Least Privilege	Minimise any administrative privileges to perform direct changes within metadata datastore.
AC-14: Permitted Actions Without Identification or Authentication	Disable any unauthenticated or anonymous access to metadata datastore.
CM-02: Baseline Configuration	Ensure the baseline security configuration for orchestration services is hardened to industry or vendor best practice (e.g. CIS Security Benchmark), including system permissions for configuration files and services.
CP-09: System Backup	Regular scheduled backups are applied to the metadata datastore. Ensure mechanisms are employed to protect the integrity of system backups.
CP-10: System Recovery and Reconstitution	Ensure reconstitution for orchestration services and restoration of container clusters back to operational states.
IA-09: Service Identification and Authentication	Maintain an accurate inventory and security-related attributes of clustered containers and their respective container host platforms.
RA-05: Vulnerability Monitoring and Scanning	Regularly monitor and scan for vulnerabilities in the orchestration metadata services and related sub-components.
SC-03: Security Function Isolation	Any secrets stored or cached within metadata services or readable configuration files are protected using a purpose-built secrets management service. This includes security-related attributes of clustered containers used for security categorisation. These secrets are made available during both deployment and runtime of container clusters.
SC-13: Cryptographic Protection	Ensure metadata datastore is encrypted at rest.
SC-32: System Partitioning	Ensure separate system partitioning of Orchestration Metadata services and datastore from those systems used within Container Host Platforms. Isolate within separate security groups or segments.

Control list: Container Host Platform

Control Objective	Control Description
AC-03: Access Enforcement	Enforce access restrictions for local orchestration agents or services operating on Container Host Platform. All container deployment or configuration changes originate from Orchestration Scheduler and Resource Manager.
AC-06: Least Privilege	Restrict and tighten permissions for local orchestration agents or services. Permissions are assigned under dedicated role.
AU-02: Event Logging	Enforce event logging across Container Host Platform and system components. Forward and capture log events to external security logging and monitoring service.
CM-02: Baseline Configuration	Ensure the baseline security configuration for orchestration services is hardened to industry or vendor best practice (e.g. CIS Security Benchmark), including system permissions for configuration files and services.
IR-04: Incident Handling	Ensure forensic tooling is provisioned on each Container Host Platform to allow capture of container file-system changes, container memory and any shared volumes.
IR-05: Incident Monitoring	Compare and analyse runtime activity across containers within the same deployment for a given environment. Deviations in network activity, active processes or behaviour between similar deployed containers are flagged as Indicators of Compromise (IOC).
IR-08: Incident Response Plan	Trigger forensics to snapshot the container as soon as possible, to reduce potential loss of events or actions. Once forensics is captured, terminate the container instance(s) from the cluster. If the incident is suspected to have escaped container isolation, additionally capture a snapshot of the host OS.
IR-10: Incident Analysis	Security forensics are automated to capture container file-system changes, memory and shared volumes.
SC-03: Security Function Isolation	Container Host Platform supports integration for secrets management.
SC-17: Public Key Infrastructure Certificates	Use certificates issued from trusted certificate authority.
SR-09: Tamper Resistance and Detection	Validate integrity and authenticity for local orchestration agents or services operating on Container Host Platform.
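The IR-05 comparison between similar deployed containers, described in the list above, can be sketched as follows. This is a minimal illustration in Python that looks only at active process names and assumes a simple majority rule for what counts as expected behaviour; the function name and the rule are assumptions, not a prescribed detection method.

```python
from collections import Counter
from typing import Dict, Set


def flag_deviating_replicas(processes_by_replica: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each deviating replica to the processes its peers do not share.

    A process is treated as expected when more than half of the replicas
    in the same deployment run it (an assumption made for this sketch).
    """
    counts = Counter(
        process
        for processes in processes_by_replica.values()
        for process in processes
    )
    majority = len(processes_by_replica) / 2
    expected = {process for process, seen in counts.items() if seen > majority}
    return {
        replica: processes - expected
        for replica, processes in processes_by_replica.items()
        if processes - expected
    }


# Hypothetical observation from three replicas of one deployment.
observed = {
    "web-1": {"nginx"},
    "web-2": {"nginx"},
    "web-3": {"nginx", "xmrig"},
}
```

With the hypothetical `observed` data, only `web-3` is flagged, for the process its peers do not run. A real implementation would also compare network activity and behaviour, as the control describes.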

Control list: Container Network Services

Control Objective	Control Description
AC-04: Information Flow Enforcement	Remove unnecessary, unused or insecure communication flows for container clusters. In particular, review any ports exposed outside container clusters for ingress or egress flows on Container Host Platforms with higher protection ratings (for example, ‘Protected’ and ‘Highly Protected’).
AC-12: Session Termination	Any TLS offload to load balancing maintains micro-segmentation and boundaries.
SC-02: Separation of System and User Functionality	Maintain separate policies and controls for orchestration management traffic (including sub-components for system daemons or agents) and data traffic associated with application containers.
SC-07: Boundary Protection	Apply segmentation policies between Container Host Platforms for different protection ratings (for example – ‘Default’, ‘Protected’ and ‘Highly Protected’). Tightly restrict connectivity for platforms with higher protection ratings.
SC-08: Transmission Confidentiality and Integrity	Encrypt container to container data traffic and exchange within container overlay network fabric.
SC-11: Trusted Path	Ensure trusted and mutually authenticated connectivity between orchestrator services and Container Host Platform. Restrict attempted registration of Container Host Platforms from unknown or untrusted network segments to Orchestration services.
SC-03: Security Function Isolation	Prefer deployment of container-based firewall filtering, packet inspection and content inspection where supported.
SC-37: Out-of-band Channels	Ensure network visibility is integrated within container overlay network fabric and services.
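The SC-07 segmentation between protection ratings in the list above can be illustrated as a default-deny check on cross-rating flows. The following is a minimal sketch in Python; the allowlist contents, the port number and the function name are hypothetical, and flows within a single rating are treated as permitted here only because other policies are assumed to govern them.

```python
from typing import FrozenSet, Tuple

# (source rating, destination rating, destination port)
Flow = Tuple[str, str, int]

# Hypothetical allowlist: the only cross-rating flow permitted in this sketch.
ALLOWED_CROSS_RATING_FLOWS: FrozenSet[Flow] = frozenset({
    ("Default", "Protected", 443),
})


def flow_allowed(src_rating: str, dst_rating: str, port: int,
                 allowlist: FrozenSet[Flow] = ALLOWED_CROSS_RATING_FLOWS) -> bool:
    """Default-deny between ratings; only explicitly listed flows pass."""
    if src_rating == dst_rating:
        # Same-rating traffic is out of scope for this check (assumption).
        return True
    return (src_rating, dst_rating, port) in allowlist
```

Keeping the allowlist explicit means that any new connectivity into a 'Protected' or 'Highly Protected' platform requires a visible change that can be reviewed.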

Appendix A – References

Please see below links to external sites for further reading

Appendix B - Disclosure Notice

This document is published as independent research only and is without warranty. It does not represent any publication from the National Institute of Standards and Technology (NIST) or other associated US government entities.