This is the second installment in a two-part series on NIST standards for zero trust security. The first installment covers NIST Special Publication (SP) 800-207, which lays the groundwork for zero trust principles for the enterprise, but makes no specific implementation recommendations.
The follow-up series is made up of four special publications: SP 800-204, SP 800-204A, SP 800-204B, and SP 800-204C. This series is co-authored with NIST by Tetrate founding engineer Zack Butcher and takes up where SP 800-207 leaves off.
This series provides security strategies for microservices applications, focusing mostly on communication between services and between services and a control plane, as described below under Technology, Threat Background, and Core Features. In this article, we’ll present an overview of the most important concepts, best practices, and specific deployment recommendations in each of the four papers of the SP 800-204 series:
SP 800-204, Security Strategies for Microservices-Based Application Systems. The first paper in the series provides context on the technology and threat background for microservices applications. It goes on to outline a core set of features that must be available in the infrastructure environment of a microservices application to properly address the unique security and availability concerns of a microservices architecture.
SP 800-204A, Building Secure Microservices-Based Applications Using Service Mesh Architecture. The second paper establishes a reference platform for delivering microservices security, consisting of Kubernetes as the orchestrator and the Istio service mesh as the security kernel. Establishing this reference platform enables NIST to demonstrate specific implementations of the concepts presented. The paper then provides a series of recommendations for implementing the core features presented in the first paper, SP 800-204.
SP 800-204B, Attribute-based Access Control for Microservices-Based Applications Using a Service Mesh. The third paper offers a deep dive into best practices for implementing fine-grained and dynamic authentication and authorization for microservices using attribute-based access control (ABAC).
SP 800-204C, Implementation of DevSecOps for a Microservices-Based Application with Service Mesh. The final paper in the series provides an overview of DevSecOps practices and how they can be used to bolster the security of microservices applications, with an eye toward enabling continuous authorization to operate (C-ATO) for applications in the Department of Defense.
This blog post summarizes the key points of this series of four papers, referring to specific points in each paper where relevant. As you will see from the references given, we mostly deal with topics in the order they appear in the papers. However, you will see references skip around as we summarize topics that are spread across sections of more than one paper.
We hope this summary helps you to understand and adopt the recommendations from the papers – and inspires at least some of you to read one or more of the papers in their entirety.
Technology, Threat Background, and Core Features
Microservices Security Challenges
While microservices applications offer benefits like agility, scalability, and resiliency, they also present unique security challenges that must be addressed (SP 800-204B, §1):
- There are many more components, interconnections between components, and communication links to protect than in a monolithic architecture.
- The dynamic lifecycle of microservices requires a secure service discovery mechanism.
- Because there is no concept of a network perimeter, zero trust architecture principles must be observed. For example, no microservice should be treated as trustworthy until proven otherwise by way of a secure authentication mechanism.
- Microservices as fine-grained, single-function components require a similarly fine-grained authorization mechanism in front of each component. At the same time, because microservices are loosely coupled, security policies should be both centrally defined and consistently enforced everywhere.
Core Features to Meet Those Challenges
To meet these challenges, NIST stipulates the following core features that must be available to any microservices application deployment (NIST SP 800-204, §2.6):
- Secure service discovery
- Identity and access management
- Secure communication protocols
- Security monitoring
- Integrity assurance
- Facilities to counter internet-based attacks
NIST offers a microservices technology background (NIST SP 800-204, §2) including a conceptual view of microservices, along with basic design principles and business drivers. The four main design drivers are:
- Single function per microservice, operating in a bounded context
- Lifecycle independence between microservices
- Allowance for constant failure and recovery with an eye toward statelessness
- Reuse of trusted services for state management
These design drivers yield certain design principles, including:
- Fault tolerance
- Loose coupling
- Alignment of APIs with business processes
While microservices applications are susceptible to most of the same attacks as web applications — e.g., injection, encoding and serialization attacks, cross-site scripting (XSS), and cross-site request forgery (CSRF) — and must be protected accordingly, the series concentrates on threats specific to microservices.
NIST identifies six layers (not to be confused with the seven-layer OSI model) in the deployment stack of a typical microservices-based application, each subject to threat (NIST SP 800-204, §3).
The communication layer is considered unique to microservices applications because of the large number of loosely coupled, interconnected components. As such, threats to the communication layer are the main focus of the publication series. Threats to the other layers are addressed in other NIST documents and are therefore considered out of scope.
Service discovery mechanism threats. Potential threats include registration of malicious nodes, subsequently compromising service discovery, and corruption of the registry database, resulting in denial of service or redirection to malicious services.
Internet-based attacks. Microservices applications are more susceptible than monoliths to internet-based attacks because of the larger set of exposed IP-addressable RPC interfaces. There is also increased risk of inadvertent exposure of internal functionality if upstream components can be skipped to address downstream components directly.
Cascading failure. While microservices are loosely coupled from a development perspective, they often have logical or functional dependency relationships, such that a failure of one service can result in the cascading failure of dependent services.
Security Strategies for Implementing Core Features and Countering Threats
Given the microservices-specific threat background, NIST offers security strategies for implementing the core features required by microservices-based applications outlined above (NIST SP 800-204, §4). Specific recommendations throughout the series are given reference designations with a topic-appropriate naming scheme (for example, MS-SS-N for “MicroServices Security Strategy N”).
Authentication (MS-SS-1)
- API access authentication should not be done with API keys alone. Authentication tokens should be digitally signed or verified with an authoritative source. Single-use or short-lived tokens may be required in certain cases.
- Authentication tokens should be handle-based, signed, or protected by an HMAC scheme.
- API keys should be explicitly restricted in scope to particular applications and API sets. That scope restriction should be commensurate with the level of assurance provided.
- For stateless authentication tokens, expiry times should be as short as possible; the secret key must be a dynamic variable represented by an environment variable or the contents of an environment data file, not embedded in library code; the key should be stored in a data vault.
- The authentication mechanism must be securely deployed.
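To make the token guidance above concrete, here is a minimal sketch of an HMAC-protected, short-lived authentication token using only the Python standard library. The field names, lifetime, and hard-coded key are illustrative assumptions; per the guidance above, a real key would be supplied via an environment variable or vault, never embedded in code:

```python
import base64, hashlib, hmac, json, time

SECRET = b"demo-secret"  # illustrative only; inject via environment/vault in practice

def issue_token(subject, ttl_seconds=300):
    """Issue a short-lived token whose payload is protected by an HMAC tag."""
    payload = json.dumps({"sub": subject, "exp": time.time() + ttl_seconds}).encode()
    tag = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(tag).decode())

def verify_token(token):
    """Return the claims if the HMAC tag is valid and the token is unexpired, else None."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return None
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        return None
    claims = json.loads(payload)
    if claims["exp"] < time.time():
        return None
    return claims
```

The short expiry and HMAC protection correspond directly to the stateless-token recommendations above.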
Access Management (MS-SS-2)
- Access policies should be defined and provisioned to an access server. Coarser-grained policies should be enforced at gateways closer to the edge; finer-grained policies should be enforced closer to individual microservices.
- The access server should be capable of supporting fine-grained policies.
- Cached policy data should only be used when the access server is unavailable; cached data should have a suitable expiry policy.
- Access decisions should be conveyed through standardized tokens (e.g., OAuth 2.0 access tokens) in a platform-neutral format (e.g., JSON).
- The scope of internal authorization tokens should be carefully controlled to offer the least privilege necessary for a particular operation.
Service Registry Configuration (MS-SS-3)
- Service registry function should run on dedicated servers or in a service mesh.
- Service registry services network should be highly available and resilient.
- Service registry communication should be secure.
- Service registry should ensure legitimacy of services performing updates and discovery.
- Services should not self-register/deregister. To ensure registry integrity, the registry should follow a third-party registration pattern.
- Registry updates should be dependent on a health check of the relevant service instance(s).
- Large applications should use a distributed registry with care taken to ensure data consistency across registry instances.
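As one way to realize the third-party registration pattern and health-check gating described above, consider this minimal sketch; the registry API and names are illustrative assumptions, not from the NIST text:

```python
class ServiceRegistry:
    """Third-party registration pattern: services never self-register; a
    registrar updates the registry, and only after a successful health check."""

    def __init__(self):
        self._entries = {}

    def register(self, name, endpoint, health_check):
        # Registry updates are dependent on a health check of the instance.
        if not health_check(endpoint):
            raise ValueError("refusing to register unhealthy instance " + endpoint)
        self._entries.setdefault(name, set()).add(endpoint)

    def deregister(self, name, endpoint):
        self._entries.get(name, set()).discard(endpoint)

    def discover(self, name):
        # Callers see only instances that passed registration-time checks.
        return sorted(self._entries.get(name, set()))
```

In a real deployment, the registrar would also re-run health checks periodically and deregister failing instances.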
Secure Communication (MS-SS-4)
- Inbound calls from external clients should route through a gateway URL, not directly to individual services.
- Communication should be mutually authenticated and encrypted (e.g., via mTLS).
- Keep-alive TLS connections should be used for frequently interacting services.
Security Monitoring (MS-SS-5)
- Security monitoring should be performed at both the gateway and service level. Additionally, input validation and extra parameter errors, crashes, and core dumps must be logged. Attack-detection software such as OWASP AppSensor could be deployed in the gateway and at the service level via the service mesh.
- There should be a central dashboard available to display the status of services and networks to highlight obvious signs of injection attack attempts.
- A baseline for normal, uncompromised behavior should be established. Intrusion detection system (IDS) nodes should be placed such that deviations from this baseline are detected.
Circuit Breakers (MS-SS-6)
- The circuit breaker function should be implemented in a single proxy to avoid placing trust in the multiple components of clients and services.
Load Balancing (MS-SS-7)
- Programs supporting the load balancing function should be decoupled from individual service requests — e.g., health checks should be run asynchronously, out of band from application communication.
- The network connection between the load balancer and microservices platform must be protected.
- If a DNS resolver is used to provide a list of available target service instances, it should work in tandem with a health check mechanism to present a single list to callers.
Rate Limiting (MS-SS-8)
- Limits should be based on both infrastructure and application requirements.
- There should be well-defined API usage plans.
- Replay detection must be implemented for high security microservices.
Integrity Assurance for New Microservices Versions (MS-SS-9)
- Traffic to both existing and new versions of microservices in a blue/green deployment should be routed through a central node to monitor the transition and risk associated with a canary release. Likewise, security monitoring should be performed at both existing and new versions.
- Traffic should be steadily increased to the new version based on usage monitoring and an assessment of its performance and functional correctness.
- Client preference for a particular version should be taken into consideration.
Session Persistence (MS-SS-10)
- Client session data must be stored securely.
- Binding server information must be protected.
- Internal authorization tokens must not be provided back to the user; user session tokens must not pass beyond the gateway for use in policy decisions.
Prevention of Credential Abuse and Stuffing Attacks (MS-SS-11)
- Run-time prevention is preferable to offline strategies. A threshold should be established above which the number of authentication attempts within an established time window should trigger preventive measures by the access server. When a bearer token is used, this feature must be used to detect and prevent the reuse of such tokens.
- A credential-stuffing detection mechanism should check user logins against a stolen credential database and warn legitimate users that their credentials have been compromised.
- IDS and boundary devices should detect: a) denial of service attacks, raising an alert before the service becomes inaccessible; b) distributed network probes.
- File uploads and the contents of each container’s memory and file system should be scanned for resident malware threats.
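The thresholding behavior described above can be sketched as a sliding-window counter per principal; the threshold, window, and key choice are illustrative assumptions:

```python
import collections, time

class AuthAttemptMonitor:
    """Trigger preventive measures when authentication attempts from one
    principal exceed a threshold within a sliding time window."""

    def __init__(self, threshold=5, window_seconds=60.0):
        self.threshold = threshold
        self.window = window_seconds
        self.attempts = collections.defaultdict(collections.deque)

    def record_attempt(self, principal, now=None):
        """Record an attempt; return True if preventive measures should fire."""
        now = time.monotonic() if now is None else now
        q = self.attempts[principal]
        q.append(now)
        # Drop attempts that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) > self.threshold
```

In practice the access server would react by locking the account, requiring step-up authentication, or alerting.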
Treatment of authentication and authorization are further expanded upon in SP 800-204B, as described below.
Having laid out the technology and threat background, and the core features required in a solution, the papers in this series go on to describe Kubernetes plus Istio as a solution meeting most of these requirements.
NIST Reference Platform for Microservices-Based Applications: Kubernetes + Istio
To provide clarity and context for concepts and recommendations (SP 800-204B, §2), NIST establishes a reference platform consisting of Kubernetes for orchestration and resource management, with Istio service mesh providing the core security features described above.
Service mesh closes certain gaps in the orchestrator’s communication mechanism. These identified limitations of Kubernetes include (SP 800-204B, §2.1.1):
- Insecure communications by default;
- Lack of a built-in certificate management mechanism needed to enforce TLS between pods;
- Lack of an identity and access management mechanism;
- A firewall policy that operates at OSI L3 but not L7, and is therefore unable to inspect packet contents or make metadata-driven decisions.
Kubernetes, focused as it is on orchestration and resource management, does not address certain application-level development and operational needs unique to microservices that the service mesh was designed to satisfy:
- A unified way to address cross-cutting application concerns;
- Standard plugins to quickly address those concerns and a framework for building custom plugins;
- Management of operational complexity;
- Easy governance of third-party developers and integrators;
- Cost reduction for development and operations.
These gaps in Kubernetes are largely addressed by an Istio service mesh.
Service Mesh Components and Capabilities
Istio, as discussed here, includes the extended version of the Envoy proxy that ships with it. Istio consists of two main components, a data plane and a control plane (SP 800-204A, §3):
- The data plane is composed of a set of interconnected Envoy proxy instances to mediate all application communication between services and between services and the outside world. For service-to-service communication, instances of Envoy are placed alongside each service instance as a sidecar. Envoy proxies may also be deployed as ingress controllers to control external inbound traffic as well as egress controllers to control communication with entities outside the mesh.
- The control plane is used to control and configure data plane behavior across the mesh.
The data plane in general provides the following functions (SP 800-204A, §3.1):
- Authentication and authorization, including certificate generation, key management, whitelist and blacklist, single sign-on (SSO) tokens, and API keys;
- Secure service discovery via a dedicated service registry;
- Secure communications, including mTLS, encryption, dynamic route generation, multiple protocol support and protocol translation;
- Network resilience, including circuit breakers, retries, timeouts, fault injection/handling, load balancing, failover, rate limiting, and request shadowing;
- Observability/monitoring data, including logging, metrics, and distributed tracing.
The ingress controller, specifically, provides (SP 800-204A, §3.1.1):
- A common, external-facing API for all clients, shielding the internal API inside the service mesh;
- Protocol translation from web-friendly protocols (HTTP/S) to backend protocols used by microservices, such as RPC, gRPC, and REST;
- Composition of results when an inbound request is decomposed into multiple calls to other services;
- Load balancing;
- Public TLS termination.
The egress controller provides (SP 800-204A, §3.1.2):
- A single, centralized set of whitelisted external workloads (e.g., hosts and IP addresses);
- Credential exchange between internal and external identity credentials to keep external credentials isolated from the application;
- Protocol translation back to web-friendly protocols.
A Security Kernel for Microservices
The authorization policy enforcement mechanism must satisfy the three requirements of a reference monitor; it must be: 1) non-bypassable, 2) protected from modification, and 3) verified and tested to be correct (SP 800-204B, §5.1).
As a dedicated infrastructure layer, Istio serves as a security kernel for microservices-based applications. The Envoy data plane provides reference monitors by way of non-bypassable policy enforcement points (PEPs) in front of each service and at each ingress and egress gateway. The kernel code is independent of the application, so its lifecycle can be managed independently and it can’t be modified at runtime. And the mesh is a tightly controlled element of the system that can be hardened with more eyes and closer inspection (SP 800-204B, §5.1).
Service Mesh Deployment Recommendations
Service mesh deployment recommendations are provided in SP 800-204A, §4 and SP 800-204B, §4 as follows.
Communication Configuration for Service Proxies.
- Allowed traffic (SM-DR1): Accepted ports and protocols should be configurable. By default, only communication on and via such ports and protocols should be accepted.
- Reachability (SM-DR2): Reachability must be limited. Such limits should be configurable based on (at least) namespace, a specific named service within a namespace, or runtime identity. Access to the control plane must always be provided to relay discovery, policy, and telemetry data.
- Protocol translation (SM-DR3): Service proxies should support protocol translation. This reduces the attack surface by eliminating the need for a separate server per client protocol.
- User extensibility (SM-DR4): Service proxies should have an extension mechanism to allow support for use-case specific policies. Such a mechanism should implement controls to mitigate risk from user code (e.g., sandboxing, runtime API restrictions, or pre-runtime analysis techniques).
- Dynamic configuration (SM-DR5): Service proxies should be dynamically configurable. New configuration should be atomically swapped. Outstanding requests under the previous configuration regime should be efficiently completed or terminated to allow for timely enforcement of policy changes without service downtime or degradation of user traffic.
- Proxy and service communication (SM-DR6): Service proxies should only communicate to application service instances through a loopback channel (e.g., localhost IP address or UNIX domain socket). Service proxies should only communicate with each other via mTLS.
Configuration for Ingress
- Ingress proxies (SM-DR7): There should be facilities for configuring ingress (standalone) proxies similarly to service proxies to allow for consistent routing policy enforcement from edge to workload.
Configuration for Egress
- Limited access to external resources (SM-DR8): Access to external resources should be disabled by default. Access should be enabled only by explicit policy and restricted to specific destinations. Additionally, external resources should be modeled as services in the service mesh itself.
- Secure access to external resources (SM-DR9): Network availability capabilities (e.g., retries, timeouts, circuit breakers, etc.) available to mesh-internal traffic should also be available for traffic to external resources.
- Egress proxies (SM-DR10): There should be egress proxies to mediate all access to external resources. Egress proxies should have configurable access and availability policies similar to ingress proxies. This can help integrate with traditional network-oriented security models.
Configuration for Identity and Access Management.
- Universal identity domain (SM-DR11): All instances of a microservice should have a consistent and unique identity across the entire system.
- Signing certificate deployment (SM-DR12): Self-signed certificates should be disabled in the service mesh control plane. The signing certificate should always be rooted in the enterprise PKI root of trust and should be provided securely to the control plane at startup.
- Identity certificate rotation (SM-DR13): Identity certificate lifetime should be as short as possible, preferably on the order of hours.
- Identity certificate change (SM-DR14): The service proxy should efficiently retire existing mTLS connections and establish new ones when its identity certificate is rotated.
- Non-signing identity certificates (SM-DR15): Microservices identity certificates should not be signing certificates.
- Workload authentication (SM-DR16): The identity of a service instance should be authenticated before an identity certificate is issued (e.g., by attesting against the orchestrator). The same care should be taken in provisioning the signing certificate for the control plane’s certificate management system.
- Secure naming service (SM-DR17): There should be a secure naming service that maps server identity to microservice name to ensure the server is authorized to run the given microservice and to protect against network hijacking.
- Granular identity (SM-DR18): Each microservice should have its own identity such that all instances of a particular microservice present the same identity at runtime.
- Authentication policy scope (SM-DR19): Authentication policy should be configurable to at least the following scopes: a) all services in all namespaces; b) all services in a specific namespace; c) a specific microservice in a specific namespace.
- Authentication tokens (SM-DR20): Authentication tokens should be digitally signed and encrypted. Further, such tokens must be passed only by loopback device or through an encrypted channel.
Configuration for Monitoring Capabilities.
- Event logging (SM-DR21): Proxies should log input validation and extra parameter errors, crashes, and core dumps. Attack detection capabilities should include bearer token reuse and injection attacks.
- Request logging (SM-DR22): Proxies should log at least Common Log Format fields for irregular requests. Logging successful requests may be of little value when metrics are available.
- Log message content (SM-DR22) [sic]: Log messages should contain at least a timestamp, microservice identity, request trace id, message, and relevant contextual information. Sensitive information (e.g., bearer tokens) should be masked.
- Mandatory metrics (SM-DR23): Metrics gathered should include at least: a) request volume; b) failed request volume, by failure code; c) average latency per service and average total latency per complete request lifecycle.
- Distributed tracing (SM-DR24): Application services should be instrumented to forward tracing headers.
Configuration for Network Resilience Techniques
- Data storage (SM-DR25): Data pertaining to retries, timeouts, circuit breakers, etc. should be kept in robust data stores.
- Health checks (SM-DR26): The health check function should be tightly integrated with the service discovery mechanism to maintain the integrity of information used for load balancing.
Configuration for Cross-Origin Resource Sharing (CORS)
- CORS (SM-DR27): CORS policy should be implemented in the service mesh rather than in application code.
Configuration of Permissions for Administrative Operations
- Access control for administrative operations (SM-DR28): Granular access control should be configurable for all administrative operations. The configuration interface for such access control may not be part of the service mesh itself, but part of installation software or the orchestration software.
High-Level Configuration Parameters for Applications (SP 800-204B, §4.3)
- AHLC-SR-1: Containers and applications should not run as root.
- AHLC-SR-2: Host path volumes should not be used; they create a tight coupling between container and node that can constrain migration and flexible resource scheduling.
- AHLC-SR-3: Container file systems should be read-only by default and overridden only when the application (e.g., a database) must write to the file system.
- AHLC-SR-4: Privilege escalation must be explicitly prevented (e.g., set the Kubernetes ‘allowPrivilegeEscalation’ flag to false).
Having set out the reference architecture and recommendations for implementing it securely, the narrative then turns to access control.
Attribute-Based Access Control Using a Service Mesh
SP 800-204B provides an in-depth analysis of, and policy configuration guidance for, attribute-based access control (ABAC) as the preferred access control model for microservices security. NIST promotes ABAC as the access control model best suited for microservices based on its advantages of expressiveness and wide applicability to a large range of subjects, objects, and environments:
- The ability to meet the scalability demands imposed by the large set of variables, and the precision, that cloud-native applications require;
- Policies that can be expressed as a logical expression over the attributes of a subject, object, and environment, without a tight binding to any particular subject or object;
- Policies that can be expressed in terms of attributes without prior knowledge of potentially numerous users and resources.
Advantages of ABAC with a Service Mesh
With a service mesh, the authorization policy engine can be implemented as a container in the mesh rather than built into application code. It executes either natively in the proxy’s memory space or is callable from a proxy filter such that it shares no memory space with the calling application, satisfying the requirements of a security kernel (SP 800-204B, §5.1). In addition to acting as a security kernel, ABAC implemented with a service mesh offers at least the following other benefits:
- The authorization engine is coupled to the orchestrator control plane, the service mesh control plane, and the application data plane;
- The extensibility of the proxy data plane can be used to integrate any particular authorization engine;
- The extensibility of the proxy data plane also allows for the incorporation of both data and application protection models as part of the authorization mechanism;
- The NGAC-based ABAC model exhibits linear-time performance.
For further discussion of the relative merits of various access control models for microservices applications, see Tetrate’s analysis of role-based access control (RBAC), attribute-based access control (ABAC), and NIST’s ABAC variant, next-generation access control (NGAC).
Guidance for Authentication and Authorization Policy Configuration
At minimum, the following data from the orchestration and resource management platform are required to make appropriate authentication and authorization decisions (SP 800-204B, §4.1):
- Metadata such as the application service name and the sets of service instances, which come from the orchestrator;
- Runtime data (protocols, ports);
- Namespaces for logical isolation;
- Unique runtime identities for each service.
The service mesh configuration requires at least the following components:
- Ingress gateway;
- Egress gateways;
- Sidecar proxies beside each service instance;
- A certificate authority (CA);
- A control plane to distribute policy to the proxies in the data plane.
The service mesh’s CA module should be rooted in an organization’s existing public key infrastructure (PKI) to allow for auditability, rotation, and revocation (SP 800-204B, ISMC-SR-1); self-signed certificates should not be used in secure deployments and should be disabled by default.
Communication between the service mesh control plane and the orchestration platform must be authenticated and authorized (SP 800-204B, ISMC-SR-2).
Two kinds of identity must be assured in a microservices-based application: end-user identity and microservices (workload) identity (SP 800-204B, §4.4). To authenticate end-user identity, the mesh should facilitate encoding end-user credentials (EUC) in communications between services and provide them to the application.
Workload identity is crucial because a) it provides assurance that a server is authorized to run a particular service to protect against malicious, phony services (remembering that a zero trust posture assumes that attackers are already in the network); b) it serves as the basis for applicable authorization policy. To facilitate workload identity, the system should have a mechanism for providing strong identity to services with a digital signature, e.g., SPIFFE. That identity should be used to establish mTLS communication between services.
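As an illustration of strong workload identity, here is a sketch that parses and checks a SPIFFE-style identity URI against an expected trust domain. The trust-domain and path values are illustrative; in practice the identity arrives in a certificate (an X.509 SVID) validated during the mTLS handshake:

```python
from urllib.parse import urlparse

def parse_spiffe_id(uri, expected_trust_domain):
    """Validate a SPIFFE-style workload identity URI of the form
    spiffe://<trust-domain>/<workload-path> and check that it belongs to
    the expected trust domain before honoring it in policy decisions."""
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe" or not parsed.netloc or not parsed.path:
        raise ValueError("not a valid SPIFFE ID: " + uri)
    if parsed.netloc != expected_trust_domain:
        raise ValueError("foreign trust domain: " + parsed.netloc)
    return {"trust_domain": parsed.netloc, "workload_path": parsed.path}
```

Authorization policy can then key off the workload path (e.g., namespace and service account) rather than network location.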
NIST offers the following recommendations for service-level and end-user authentication policy authorship and enforcement implementation in a service mesh environment:
Service-Level Authentication Policy Recommendations
- SAUN-SR-1: It should be possible to define policy that requires mTLS communication for at least the following levels of specificity: a) global or service mesh; b) namespace; c) workload/microservice; d) port. This implies the capability to assign strong identity to each service, authenticate that identity by mapping it to server identity, and establish the authenticity of that mapping with a digital signature, e.g. via SPIFFE.
- SAUN-SR-2: The service mesh should provide a secure naming service that maps server identity to the microservice name provided by the secure discovery service or DNS.
End-User Authentication Policy Recommendations
- EAUN-SR-1: Request authentication policy must provide at least the following information to be enforced by the sidecar proxy: a) instructions for extracting the credential from the request; b) instructions for validating the credential. For a JSON Web Token (JWT), this might include: a) header name of the JWT; b) how to extract subject, claims, and issuers from the JWT; c) public keys or location of the key used to validate the JWT.
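The two policy elements in EAUN-SR-1 can be sketched as follows: the policy tells the proxy where to find the credential (a header name) and how to validate it. This sketch uses an HS256 (HMAC) check with only the Python standard library; real deployments commonly validate RS256 signatures against keys fetched from the issuer, and the header name and claim names here are illustrative assumptions:

```python
import base64, hashlib, hmac, json, time

def _b64url_decode(segment):
    # Restore stripped base64url padding before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def extract_and_validate_jwt(headers, key, header_name="Authorization"):
    """Apply the two EAUN-SR-1 policy elements: extract the credential from
    the configured header, then validate its signature and expiry. Returns
    subject, issuer, and claims on success; None on any failure."""
    token = headers.get(header_name, "").removeprefix("Bearer ").strip()
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        signature = _b64url_decode(sig_b64)
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None
    expected = hmac.new(key, (header_b64 + "." + payload_b64).encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    if claims.get("exp", 0) < time.time():
        return None
    return {"subject": claims.get("sub"), "issuer": claims.get("iss"),
            "claims": claims}
```

Everything the function needs (header name, key or key location, claim extraction rules) maps directly onto the policy elements listed above.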
Like their authentication counterparts, authorization policies may be specified at both the service level and the end-user level. The expression of authorization policy will depend on the access control model in use. Further, the location of access control data may vary depending on the access management infrastructure available, e.g., in a centralized or external authorization server, or carried as header data.
Service-Level Authorization Policy Recommendations
- SAUZ-SR-1: There should be a policy object describing service-to-service access for all services in the mesh that, at least, restricts access to specific namespaces, e.g., “services in namespace A can call services in namespace B”. Ideally, policy should be authored for each microservice that restricts access explicitly to specific services, e.g., “service P in namespace A can call service Q in namespace B”. Such policies should describe the minimum access required, e.g., “service P in namespace A can call ‘GET /q’ on service Q in namespace B.”
End-User Authorization Policy Recommendations
- EUAZ-SR-1: Service proxy communication with the authentication or authorization system must be secured by either the mesh’s built-in service-to-service authentication and authorization capabilities or by an existing enterprise identity and access management (IAM) system.
- EUAZ-SR-2: The service proxy should log every service request, to verify that authentication and authorization policies are enforced, and relay telemetry data for metrics generation, so that any degradation of service that would impact availability can be detected.
- EUAZ-SR-3: All application traffic should carry end-user credentials, and there should be policy enforcing that credentials are present.
Authorization Policy Elements
NIST specifies the minimal set of policy elements that should be available to policy authors (SP 800-204B, APE-SR-1). This set of elements allows for the dynamic, fine-grained policy definition required by microservices-based applications.
- Type: ALLOW or DENY;
- Scope: The target resource in terms of a set of services, versions, and namespaces;
- Sources: The set of services authorized to operate on the set of resources specified under the policy target;
- Operations (APE-SR-2): The operations that are part of the application type, e.g., REST verbs POST, GET, PUT, PATCH, DELETE;
- Conditions: Constraints in the form of key-value pairs for the metadata associated with the request, including metadata about the source, destination, and request.
- Default policy (APE-SR-3): A default policy such that, at least:
- All unauthenticated requests are rejected;
- End-user credentials are present on every request;
- Communication is restricted to an application’s own namespace;
- Service communication across namespaces is allowed only through an explicit policy.
DevSecOps with a Service Mesh
The final installment of the series, SP 800-204C, provides guidance on DevSecOps best practices in the context of the Kubernetes + Istio reference platform for microservices applications. Though careful to point out that there is no established consensus on the exact definition of “DevSecOps,” NIST nonetheless advocates it as a “facilitating paradigm for… agile and secure development, delivery, deployment, and operations” (SP 800-204C, §1).
The NIST guidance stipulates a reference application environment with both organizational and operational structures in place, emphasizing that fully embracing a DevSecOps paradigm is a project of political and cultural adoption in addition to technology adoption. “The most pronounced change involves organizing a DevSecOps group… of software developers, security specialists, and IT operations experts” (SP 800-204C, §3.1); a smaller team promotes efficiency and effectiveness in an agile software development lifecycle.
Implementing DevSecOps Primitives for the Reference Platform
NIST’s reference application environment posits five different types of code, each with its own associated CI/CD pipeline (SP 800-204C, §4.1):
- Application code: Business logic implemented as microservices in containers;
- Application services code: Code for all services, including network connections, load balancing, resilience, etc.;
- Infrastructure as code (IaC): Code for provisioning and configuring infrastructure, typically written in a declarative language. IaC is subject to the same vulnerabilities as application code and therefore must be subject to security inspection practices at least as rigorous as those employed for application code. In addition, IaC must include methodical drift management to ensure that the intent expressed in IaC is what actually exists in the deployed environment;
- Policy as code: Policy in multiple domains, including security and networking, explicitly authored and encoded as executable modules. Policy as code should provide protection against all known threats relevant for the application environment and should be periodically scanned and updated (SP 800-204C, §4.4);
- Observability as code: In the context of the reference platform, this refers to code that creates agents in proxies and creates functionality for gathering logging, tracing, and telemetry.
Securing the CI/CD Pipeline
The following is the minimum set of security tasks required for all CI/CD pipelines (SP 800-204C, §4.6):
- Harden code and artifact repository hosts;
- Ensure that only secure credentials are used for accessing repositories;
- Place controls on who can read and write to container image registries;
- Log all code and build update activities;
- Send build reports to developers and stop the pipeline on build, test, or audit failure;
- Ensure developers can access application code, but not pipeline code;
- Sign the release artifact at each CI/CD stage, preferably with multi-party signing;
- Verify the presence and validity of all such required signatures prior to production release to ensure the pipeline was not bypassed.
Pull-Based Pipeline Workflow Recommended
Depending on tooling, CI/CD pipelines can use either a push-based or pull-based workflow model. The security downside of using a push model is the possibility of exposing credentials outside of the deployment environment (SP 800-204C, §4.7). The recommended pull-based workflow uses an operator to watch image registries and pull new versions automatically, yielding automatic convergence of the actual deployment infrastructure state with the declaratively described state in the Git repository. GitOps is offered as a preferred pull-based workflow.
All five code types should be subject to application security testing (AST) via automated pipeline tooling or as a service. AST tools should in particular perform vulnerability, container image, and regulatory/compliance scans (SP 800-204C, §4.8).
Benefits of DevSecOps to Application Security in the Service Mesh
NIST concludes its analysis with an examination of the security benefits of using DevSecOps primitives in the service mesh. These include:
- Better communication and collaboration between development, operations, and security teams;
- Streamlined software development, deployment, and delivery, especially through automation;
- Reduction of attack surfaces and restriction of lateral movement via zero trust architecture, further facilitated by continuous monitoring;
- For federal enterprises, DevSecOps primitives and practices also help enable continuous authorization to operate (C-ATO).
The NIST special publications on zero trust architecture represent a milestone for cybersecurity by providing detailed and comprehensive guidance for securing enterprise applications — both in government and private industry — as they move from mostly monoliths in static deployments to highly decoupled, composable architectures in dynamic deployments across multiple, heterogeneous environments. Applications have changed; the perimeter has disappeared. It’s time to rethink how we do enterprise security. These new NIST standards for zero trust architecture are valuable tools to use in that effort.
Tetrate has years of experience putting these tenets into practice, through its work on standards such as these; through its extensive contributions to Istio, the most widely deployed service mesh, and Envoy proxy, which is increasingly widely used; through the development of Tetrate Service Bridge, a leading management plane for service mesh implementations; and by helping the world’s largest enterprises plan, implement, deploy, and operate an application networking platform based on service mesh. If you’re thinking about how to integrate service mesh into your cybersecurity strategy, contact us to start a conversation with some of the foremost service mesh experts in the world.
- NIST SP 800-207: Laying the Groundwork for Zero Trust Architecture — Background on NIST’s standards for zero trust architecture.
- NIST-Tetrate 2022 Conference Talks: NIST Standards for Service Mesh — Zack Butcher’s deep dive into the NIST SP 800-204 series from the 2022 joint NIST-Tetrate conference on ZTA and DevSecOps for Cloud Native Applications.
- NIST-Tetrate 2021 Conference Talk: ABAC for microservices applications using a service mesh — Zack Butcher’s presentation at the 2021 joint NIST-Tetrate conference on DevSecOps and Zero Trust Architecture for Multi-Cloud Environments.
- The US Government Endorses Zero Trust Architecture for Security — Background on US government initiatives to improve cybersecurity and move towards zero trust network architecture.
- White Paper: Zero Trust Architecture — Zack Butcher’s white paper on zero trust architecture.