Introduction to Service Mesh
A service mesh is an infrastructure layer that sits between application components and the network and is implemented with proxies. These components are often microservices, but any workload, from serverless containers to traditional n-tier applications in VMs or on bare metal, can participate in a mesh. Rather than each component communicating directly with other components over the network, the proxies mediate that communication. These proxies form the data plane, which provides capabilities for implementing security and traffic policy and produces telemetry about the services the proxies are deployed with.
The capabilities of service mesh include:
- Service discovery
- Resiliency: retries, outlier detection, circuit breaking, timeouts, etc.
- Client-side load balancing
- Fine-grained traffic control at L7: route by headers, destination or source, and other runtime information.
- Security policy on each request, rather than once per connection
- Authentication, rate limiting, arbitrary policy based on L7 metadata
- Strong (L7) workload identity
- Service-to-service authorization
- Metrics, logs, and tracing
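To make the resiliency items above more concrete, here is a minimal Python sketch of retries combined with a simple circuit breaker. It is an illustration only, not how Envoy or the Linkerd proxy implements these features: the names `CircuitBreaker` and `call_with_retries` and the thresholds are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class CircuitBreaker:
    """Opens after `max_failures` consecutive failures (hypothetical threshold)."""
    max_failures: int = 3
    consecutive_failures: int = 0

    @property
    def is_open(self) -> bool:
        return self.consecutive_failures >= self.max_failures

    def record(self, success: bool) -> None:
        # A success resets the counter; a failure increments it.
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1


def call_with_retries(send, breaker: CircuitBreaker, max_attempts: int = 3):
    """Call `send()` up to `max_attempts` times, failing fast once the breaker opens."""
    last_error = None
    for _ in range(max_attempts):
        if breaker.is_open:
            # Protect the upstream service: reject without sending the request.
            raise RuntimeError("circuit open: request rejected without calling upstream")
        try:
            result = send()
        except ConnectionError as exc:
            breaker.record(success=False)
            last_error = exc
            continue
        breaker.record(success=True)
        return result
    raise RuntimeError("all attempts failed") from last_error
```

In a mesh, this logic lives in the proxy rather than in each application, which is why the application code does not need to change.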
We believe this survey shows that Istio is the most capable, most flexible, and most widely used service mesh, and the one best supported by a robust community, which is why it is steadily improving in all areas over time. Consul and Linkerd are designed to be lighter-weight and are not as widely supported by the community. They may therefore be easier to implement initially, and they have had performance advantages in the past, but they may not be as suitable for demanding use cases or for the long haul.
Working Model of Service Mesh
Service mesh software follows a “hub-and-spoke” pattern: proxies or worker nodes (the spokes) implement specific rules, and a central server (the hub) governs those proxies. Istio and Consul use the Envoy proxy, while Linkerd uses its own custom-built proxy. These proxies work in the data plane at the application level and do the heavy lifting: handling incoming traffic, connecting services, and applying granular security policies. Traffic and security are managed in the centralized control plane.
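The split between control plane and data plane can be sketched in a few lines of Python. This is a toy model under our own assumptions, not the actual Istio, Consul, or Linkerd API: the class names `ControlPlane` and `Proxy` and the `allowed_sources` rule key are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Proxy:
    """Data-plane proxy: enforces whatever rules it last received."""
    service: str
    rules: dict = field(default_factory=dict)

    def apply(self, rules: dict) -> None:
        self.rules = dict(rules)  # keep a private copy of the pushed rules

    def allows(self, source: str) -> bool:
        # With no rules pushed yet, nothing is allowed (default deny).
        return source in self.rules.get("allowed_sources", [])


@dataclass
class ControlPlane:
    """Central hub: holds the proxies and pushes policy to them."""
    proxies: list = field(default_factory=list)

    def register(self, proxy: Proxy) -> None:
        self.proxies.append(proxy)

    def push(self, service: str, rules: dict) -> int:
        """Send `rules` to every proxy of `service`; return how many were updated."""
        updated = 0
        for proxy in self.proxies:
            if proxy.service == service:
                proxy.apply(rules)
                updated += 1
        return updated
```

The point of the sketch is the direction of flow: operators change policy once at the hub, and every proxy for that service picks it up without the application being redeployed.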
Service Mesh Comparison
IT departments usually evaluate service mesh software against specific criteria. This blog provides a comparison chart to help IT architects quickly select and adopt the right service mesh software.
The best-known open source tools for implementing a service mesh are Istio, Linkerd, and Consul. Istio and Consul both use Envoy as their data plane proxy; Linkerd has an open source proxy of its own. We will compare these three leading open source projects in this blog post.
We have compared the service mesh software on six criteria (traffic management, security management, scalability, visibility and observability, installation and implementation, and extensibility) and shared our opinion on each.
Istio, Consul, and Linkerd all provide basic traffic management capabilities such as load balancing, routing, and service discovery. Their proxies support HTTP and gRPC proxying, TCP proxying, and protocol detection.
The Envoy proxy, as used by Istio and Consul, implements more HTTP-specific functionality than Linkerd’s proxy. Envoy’s HTTP connection manager offers native support for HTTP/1.1, WebSockets, HTTP/2, and HTTP/3, while the Linkerd micro-proxy does not support HTTP/3.
Largely because they both use Envoy, Istio and Consul are on par with each other, and ahead of Linkerd, in more granular traffic management features such as circuit breaking, retries, timeouts, fault injection, and delay injection.
Istio and Consul both support Kubernetes (on-premises and managed), public cloud (AWS, GCP, Azure), and on-prem VMs. Istio provides advanced traffic management features such as fault injection, which lets developers introduce delays or failures to test the resiliency of a system and harden operations. Istio is often preferred by large enterprises whose IT organizations support cloud-native applications, monoliths, and legacy workloads and want advanced traffic management features.
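As a rough illustration of what fault injection means, the following Python sketch wraps a request handler and injects a delay or an error for a configurable percentage of requests. It is only a model of the idea, not Istio’s implementation or configuration syntax: the function name `with_fault_injection` and all of its parameters are assumptions made for this example.

```python
import random
import time


def with_fault_injection(handler, *, delay_s=0.0, delay_pct=0.0,
                         abort_pct=0.0, abort_status=503,
                         rng=None, sleep=time.sleep):
    """Wrap `handler` so that a percentage of requests is aborted or delayed.

    `rng` and `sleep` are injectable so the behavior can be tested deterministically.
    """
    rng = rng or random.Random()

    def wrapped(request):
        # Abort: answer with an error status without calling the real handler.
        if rng.random() * 100 < abort_pct:
            return abort_status, "fault injected"
        # Delay: wait before forwarding the request to the real handler.
        if rng.random() * 100 < delay_pct:
            sleep(delay_s)
        return handler(request)

    return wrapped
```

For example, `with_fault_injection(handler, delay_s=2.0, delay_pct=10)` would slow down roughly one request in ten, which is enough to check whether callers have sensible timeouts.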
A few reasons (or use cases) why platform architects and application engineers like Istio are:
- Front/edge proxying: Istio helps redirect requests based on runtime values, making implementation of deployment strategies like canary and blue/green easy.
- Support for ingress gateways: Istio provides its Istio ingress gateway and supports third-party ingress controllers to manage external service access in a Kubernetes cluster.
- Multi-cluster communication: Connecting K8S services spread across clusters or regions using Istio service mesh is straightforward.
- Multi-site failover: With support for all public clouds, Istio can be used for automatic site failover, supporting high availability and greater resiliency for cloud-based applications.
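The canary strategy mentioned in the first item above comes down to a routing decision made per request. Here is a minimal Python sketch of that decision, assuming a hypothetical `x-canary` header for pinning a request to a version and a weight table for everyone else; neither is Istio’s actual configuration format.

```python
import random


def pick_version(headers: dict, weights: dict, pin_header: str = "x-canary", rng=None) -> str:
    """Choose a service version for one request.

    `weights` maps version name to relative weight, e.g. {"v1": 90, "v2": 10}.
    """
    # A header match wins: testers can pin their requests to a named version.
    pinned = headers.get(pin_header)
    if pinned in weights:
        return pinned
    # Otherwise split traffic according to the configured weights.
    rng = rng or random.Random()
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]
```

Shifting the weights from 90/10 toward 0/100 over time gives a canary rollout; flipping them in one step gives a blue/green switch.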
All three service mesh software products provide support for authentication and authorization features such as:
- mTLS-based authentication to ensure the security of network communications
- JWT-based authentication to secure apps from external and internal users. Only Istio and Consul provide options to configure an OpenID Connect (OIDC) authentication flow.
- Security policies for defining granular workload-to-workload and end-user-to-workload authorization and access controls
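To show what a granular workload-to-workload policy amounts to, here is a small Python sketch of a default-deny authorization check. It assumes the caller’s identity has already been established by mTLS; the `Rule` type, the `is_allowed` function, and the SPIFFE-style identity strings in the usage note are illustrative assumptions, not any mesh’s real policy API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str           # caller workload identity, as proven by its mTLS certificate
    methods: frozenset    # HTTP methods the caller may use
    path_prefix: str      # request paths the rule covers


def is_allowed(rules, source: str, method: str, path: str) -> bool:
    """Default deny: a request passes only if at least one rule matches it."""
    return any(
        rule.source == source
        and method in rule.methods
        and path.startswith(rule.path_prefix)
        for rule in rules
    )
```

With a single rule such as `Rule("spiffe://cluster.local/ns/web/sa/frontend", frozenset({"GET"}), "/orders")`, the frontend workload can read orders, while any other caller, method, or path is rejected.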
However, because Istio integrates with multi-cloud environments and on-prem VMs, you can use it to apply mTLS to traffic to and from external clients, other Kubernetes clusters, or VMs. Here are a few reasons why security managers prefer Istio for ensuring a zero trust security posture in their organizations:
- FIPS Compliant: Many implementations of Istio are FIPS compliant, the middle level of FIPS adherence, and one implementation, from Tetrate, is FIPS certified, the highest level. This makes Istio the best choice, if not the only choice, for organizations adhering to the Federal Information Security Modernization Act of 2014. Refer to how Istio helps the Federal Government and the Department of Defense secure their internal applications.
- Transparency with CVEs: Istio is more transparent than its peers, as it reports vulnerabilities to the CVE database so that cybersecurity managers can refer to the latest security flaws and issues in the software. In addition, the Istio steering committee ensures that security patches are implemented promptly and that Istio users are notified about newly created builds. Consul also shares vulnerabilities with the CVE database.
- Flexibility: Istio lets you integrate with the authentication providers of your choice, such as OpenID Connect providers, for example Keycloak, OAuth 2.0, Google Auth, and Firebase Auth.
The US National Institute of Standards and Technology (NIST) uses Istio as a reference architecture to implement security standards for federal agencies and achieve zero trust. Refer to the detailed guides by NIST on implementing zero trust in microservices: Security Strategies for Microservices, Building Secure Microservice using Service Mesh Architecture, Attribute-based access control for microservices using Service mesh, Implementation of DevSecOps for a Microservice using Service Mesh, and Zero Trust Architecture.
Installation and Operation
All the service meshes can be installed and implemented without changes to the application code. But service mesh, in general, has been known for being difficult to set up and maintain. The Ops team will pay particular attention to installation, operation, and performance.
All three service mesh options described here provide easy installation in a Kubernetes environment using Helm charts or their operators. Since Linkerd is lightweight, with fewer features and a simple architecture, it can be easier to maintain and can perform faster than Istio and Consul. Istio is often perceived as “heavy,” primarily because it supports all kinds of workloads: K8s, public cloud, and on-prem VMs.
But with time, Istio has improved its performance and strives to provide all the network management and security benefits with minimal resource overhead. The project also aims to support large meshes with high request rates while adding minimal latency.
Each of the service meshes has unique advantages when it comes to on-demand scaling.
Regarding data plane scalability, Linkerd scales faster than Istio and Consul. The main reason is Linkerd’s lightweight architecture, which does not support advanced traffic management and granular security management features. Istio and Consul need a bit of configuration and support while scaling control planes and data planes to other clusters and workloads across public clouds or on-prem VMs.
From a data plane perspective, Envoy might consume more CPU than Linkerd’s micro-proxy in a resource-constrained setup. However, Envoy follows a practical design and is well suited to real-world (prod/non-prod) environments. The maintainers of Envoy have done a fantastic job of tuning the proxy’s performance in the critical path. Performance will vary widely depending on the features used, the environment in which Envoy runs, and the available compute resources.
Also, please note that performance is only relevant in a like-for-like comparison. Software that does not do everything you need, nor meet the same standards as the competition, is not faster or lighter-weight in a useful way if it is not getting the job done. You can only consider a lighter-weight and less functional alternative if your use cases, now and in the future, are fully met by its feature set. This is less likely to be the case in enterprise implementations or in technically demanding environments.
To avoid misleading benchmark results and to run performance tests in a scientifically sound way, please follow the best practices for benchmarking Envoy.
The Istio load tests used a service mesh consisting of 1000 services and 2000 sidecars. A load of 70,000 mesh-wide requests per second was sent to the mesh, and the response was measured. With Istio 1.14.1, the results observed for the data plane and control plane were as follows:
- Data plane performance: The Envoy proxy uses 0.35 vCPU and 40 MB memory per 1000 requests per second going through the proxy. It was also observed that the Envoy proxy adds 2.65 ms to the 90th percentile latency.
- Control plane performance: Istiod uses 1 vCPU and 1.5 GB of memory.
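The data plane figures above can be turned into a quick back-of-the-envelope estimate. The Python sketch below simply scales the quoted numbers (0.35 vCPU and 40 MB per 1000 requests per second) linearly with load; treating them as linear rates is our assumption, and real usage will depend on configuration and features in use. The function name `estimate_sidecar` is hypothetical.

```python
# Figures quoted above for Istio 1.14.1, per 1000 requests per second through one proxy.
VCPU_PER_1K_RPS = 0.35
MEM_MB_PER_1K_RPS = 40.0


def estimate_sidecar(rps: float):
    """Return (vCPU, memory in MB) for one Envoy sidecar handling `rps` requests/second.

    Assumes both figures scale linearly with request rate, which is a simplification.
    """
    scale = rps / 1000.0
    return VCPU_PER_1K_RPS * scale, MEM_MB_PER_1K_RPS * scale
```

For a sidecar handling 2000 requests per second, this gives about 0.7 vCPU and 80 MB; treat such numbers as a starting point for capacity planning, not a guarantee.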
Visibility and Observability
End-to-end visibility and observability of distributed services are essential for operation and maintenance teams. SREs and Ops teams require service-level visibility, tracing, and monitoring abilities in a service mesh. All the service mesh software options described here provide visibility into the health of application performance, measuring the behavior of the network and reporting it to stakeholders.
Istio, Consul, and Linkerd generate the key metrics needed for monitoring, such as latency, traffic, errors, and saturation, for HTTP, HTTP/2, and gRPC traffic. These metrics can be viewed directly from the command-line interface (CLI), in Grafana dashboards, or through pre-built Prometheus integrations.
Compared to Linkerd, both Istio and Consul provide far more metrics for investigating network communication in detail. The Envoy proxy, which forms the data plane in Istio and Consul, emits enough service-level metrics, proxy-level metrics, and distributed tracing data for the Ops team to diagnose traffic management configuration problems faster. And because Envoy is a widely used data plane, it is easier for the Ops team to find solutions from the community for any configuration or performance anomaly.
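As a simple illustration of how such metrics are derived, the following Python sketch computes traffic, error rate, and 90th percentile latency from a window of request records. The record format, the `summarize` function, and the choice to count only 5xx responses as errors are assumptions for this example; saturation is left out because it needs resource data rather than request data.

```python
def summarize(requests, window_s: float) -> dict:
    """Summarize a list of (latency_ms, status_code) records seen in `window_s` seconds."""
    if not requests:
        return {"rps": 0.0, "error_rate": 0.0, "p90_ms": None}
    latencies = sorted(latency for latency, _ in requests)
    errors = sum(1 for _, status in requests if status >= 500)
    # Nearest-rank p90: the smallest latency with at least 90% of samples at or below it.
    rank = (9 * len(latencies) + 9) // 10  # integer form of ceil(0.9 * n)
    return {
        "rps": len(requests) / window_s,         # traffic
        "error_rate": errors / len(requests),    # errors
        "p90_ms": latencies[rank - 1],           # latency
    }
```

A proxy sees every request to and from its service, which is why a mesh can produce these figures uniformly without instrumenting each application.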
If you are interested, learn about the Istio commands that give you a detailed overview of the mesh.
Istio is the most popular service mesh software, both inside and outside the open source community, which implies more innovation and more aggressive feature development. Evidence suggests that Istio is more widely implemented by architects, DevOps teams, and security professionals worldwide than any other service mesh. Although usage may vary somewhat by industry vertical, Istio has now achieved success in key areas such as financial services and US government usage. So Istio is likely to maintain or increase its leadership across the board.
To estimate the popularity of these projects, we searched for the keywords “Istio service mesh,” “Consul service mesh,” and “Linkerd service mesh” on Google Trends. (We used the longer phrases because the word “Consul,” by itself, is widely used and has many meanings.) Over the last five years or so, Istio service mesh has skyrocketed upward in popularity, with Consul and especially Linkerd trailing well behind.
Google search activity shows Istio well ahead of Consul and Linkerd.
Diversity of contributors
Apart from more engagement and contributions, Istio has a diversity of contributors. While Buoyant and HashiCorp are the primary contributors to the Linkerd and Consul projects, respectively, contributions to Istio are made by diverse and large enterprises such as Google, IBM, Accenture, and VMware, as well as many smaller enterprises.
| Service Mesh | Total Contributions | Major Contributors | References |
|---|---|---|---|
| Istio | 449,975 | Google, IBM, Huawei, Red Hat, Tencent, etc. | Click here |
Istio has the highest number of GitHub stargazers among its peers. The more stargazers, the greater the engagement in the open source community.
|Service Mesh||Git Stargazers||Reference|
Governance of contributions
The steering committee of the Istio project practices good governance to keep Istio driven by its contributors and community members. The steering committee represents at least seven different organizations, which prevents any vendor from having majority voting control over the Istio project, irrespective of the size of its contributions.
No other service mesh project practices such diverse governance to cater to the community’s interests. The Envoy project also has diverse governance and recently added a new project, Envoy Gateway, which includes Tetrate as part of the steering committee.
For more, please read the Istio governance structure.
Istio vs Linkerd vs Consul Tabular Comparison
| Category | Feature | Istio | Consul | Linkerd |
|---|---|---|---|---|
| Traffic Management | Load balancing | | | |
| | Multi-cluster run-time traffic routing | | | |
| | Retries, circuit breaker, timeouts | | | |
| | Dynamic policy updates | | | |
| | Support for TCP, HTTP/1.1, HTTP/2, gRPC | | | |
| Security Management | Identity management | | | |
| | Support for external CA certificate managers | | | |
| | Authentication, authorization and audit (AAA) | | | |
| Scalability | Support for K8S, Public Cloud (AWS, GCP, Azure), On-Prem VMs | K8S + Public Cloud + VMs | K8S + Public Cloud + VMs | Only K8S |
| Visibility & Observability | App-level (L7) metrics | | | |
| | Service-level (L4) metrics | | | |
| | Root cause analysis | | | |
| Ease of installation and on-going maintenance | | Medium | Medium | High |
All the service mesh software is modular and extensible. But because Istio is more popular than other service meshes, many open source projects have emerged around Istio. Some notable projects are:
- Slime, an intelligent service manager to use Istio and Envoy
- MOSN, which provides cloud-native edge gateways and agents
- Aeraki, to provide support for L7 protocols other than HTTP and gRPC
- WASM, to develop WebAssembly plugins and enhance service mesh capabilities
See the list of open source projects in the Istio ecosystem.
As organizations increasingly use microservices and adopt multi-cloud technologies, service mesh becomes increasingly important to deal with new network complexities and security challenges. While choosing service mesh software, organizations must evaluate the options carefully to avoid failure of IT projects, both immediately and in the long term.
After carefully evaluating the three popular service meshes on various features, we believe that Istio provides a richer feature set than its peers across categories. If you have a small IT setup with a small DevOps and security team, you are not operating in a particularly challenging security environment, and you expect to keep operating this way for the foreseeable future, then the lighter-weight Linkerd or Consul may make sense. But if you have a large-scale IT system and a big team, or if you consider security the top priority, then Istio is the safer bet.
Your overall experience with Istio is likely to be seamless and well supported, given that it is the most popular and widely implemented service mesh. And with regard to scalability and performance, we feel that Istio and Envoy are the best fit for production-grade implementation at scale.