Matt Klein at Tetrate's Service Mesh Day 2019: Envoy as the standard data plane and where it's going

Matt Klein, creator of Envoy and SMD keynote speaker

Matt Klein, the creator of Envoy, says he had greatly underestimated the market demand for a proxy that could be used in a generic way. The Lyft software engineer wrote Envoy as a “communication bus” to handle issues like rate limiting, circuit breaking, and load balancing. It facilitates network-transparent applications and allows developers to focus on business logic rather than debugging and network management.

In his keynote at Tetrate’s Service Mesh Day 2019, Klein spoke about the rise of Envoy, its ecosystem, and its growth from a proxy into more of a platform.

Both the use cases and the organizations that have adopted Envoy have been wide-ranging. It’s being used by an array of cloud providers, large internet providers, and startups, including Tetrate, that are building businesses on top of it.

Klein attributes Envoy’s popularity to its stability and efficient performance, its devoted open source community, and the increasing focus on DevOps in the API-driven, cloud and cloud-native world. It supports developers who need to run the software that they’re also building. Envoy has focused on having best-in-class stats, logging and tracing, said Klein. And its extensible platform has encouraged contributors to build incredible things on top of it.

The success of a platform sets off a virtuous cycle, said Klein. As the platform becomes more powerful and plug-in apps proliferate around it, more people want to build on top of it. Considering the innovation that will spring from apps relying on independent network plumbing, and the observability and auditing that’s enabled by developing metrics, logging and tracing systems, he added, we’re only at the beginning.

Envoy and service mesh have significant implications for traffic management, for helping people identify and fix problems in their systems, and for building increasingly sophisticated control planes. These control planes will have to deal with federation and intersecting trust domains. “But again, if there’s a universal API and a universal proxy in place [for] these control planes, people can build businesses and systems that will span multiple clouds and not be bound to a particular cloud.”

Envoy is an open source project sponsored by the Cloud Native Computing Foundation.

Transcript

Cool. I’m trying to show everyone I’m wearing my Envoy proxy socks. Yeah. So I’m very excited to be here. It’s so great to see all of you.

This is a new talk, but something that I’ve been thinking about a lot recently. Uh, and you know, essentially how Envoy has moved from being a proxy to I think a platform that people are going to build applications on and that makes it more of an operating system. Um, so, you know, people talk about Envoy a lot. You’ll hear the term universal data plane. And you know, let’s come back to the original project goals of Envoy. And as I’ve said in many, many talks, you know, the goal is that we would like to make the network transparent to applications. We want application developers to focus on business logic. Anytime that they spend futzing with network or fixing bugs or finding bugs around sporadic failures and things like that is time that they’re not spending working on business logic.

Uh, and of course we’d like to help them fix problems when those problems do occur. And you know, so we open sourced Envoy at the end of 2016 and it’s become incredibly popular. This has resonated with people. And I greatly underestimated the demand in the market for a proxy that can be used in this generic way and people have built really incredibly fantastic things on top of it. And if you look at adoption, uh, it is really incredible and it, you know, ranges from all of the major cloud providers, uh, to now probably tens of startups that are building their businesses on top of Envoy, to companies like Lyft who are large Internet providers who are using Envoy. And the use cases are also very, very wide. They range from API gateway edge proxy to service-to-service proxying to middle proxying.

I, I’m honestly, I’m continuously blown away by the different use cases that pop up and we’ll cover why that is. Uh, but you know, it’s been, it’s been fantastic to see all of these projects, both closed source and open source that have popped up around Envoy as that ecosystem has grown. So let’s talk a little bit about, from my perspective, why Envoy has become popular.

I think that the first thing, just a checkmark, is that Envoy performs quite well. Uh, so I think when people are looking in this space, you know, it’s, uh, something that is going to be running on lots of boxes and, and, uh, you know, we would like it to be fairly efficient from a memory and CPU perspective. So that’s a checkmark that Envoy has done.

From very early stages of the project, we’ve focused on reliability. So that means, you know, excellent test coverage, making sure that master is in a good state at all times. Uh, I think most people have considered the project to be fairly stable. Uh, and I think that’s super important. Uh, and that has allowed us to grow.

Obviously modern code base, uh, you know, especially compared to some of our competitors. I think it’s been exciting for people to have a proxy like this that’s on GitHub and we can grow our community. And uh, we do CI and all, all types of normal, modern things. That’s been great.

Uh, you know, and the next thing is really around operations. And I think in our new cloud, cloud native world, people are focusing a lot more on quote “DevOps.” Uh, and just the idea that, you know, people need to run the software that they’re also building. So I think given that we built Envoy at Lyft and it was built by the team that also operated it, we had a, we had a sense of operations from the very early days. Uh, and so we’ve really focused on having best in class things like stats and logging and tracing and stuff like that.

Um, now let’s, let’s come to the bold items. And the bold items are really why I think Envoy has grown so much. And the first thing is extensibility. We have focused very, very much on, uh, making Envoy a very extensible platform. So we allow plugins at tons of different layers. And you know, I think that’s an important thing from any product perspective, but particularly from an open source perspective because we don’t want to get overwhelmed with people having to change the core to build whatever functionality they actually want. And this extensibility, it comes back to all the products and services that are now built on top of Envoy. It has allowed people to build, I mean really incredible things. I don’t have time to go into all of the people that I talked to and all the different use cases, but really amazing to see what people have been able to build on top of this core code. It’s quite, quite amazing.

The next thing I think really if I were looking at a single thing of why Envoy has become so popular is the fact that it is API driven. Uh, so, you know, when you look at some of our competitors like Nginx or HAProxy, um, you know, we’re really moving from a world in which I would call it the flat configuration file world to an API driven cloud, cloud native world. And this has allowed Envoy to be again, built into very interesting control plane management systems, where people can differentiate and now, you know, we can have a distributed system of Envoys that make a larger mesh or a larger system that can do extremely interesting things.

And on top of all of this, you know, it’s basically all of you, right? It’s you know, to see the community grow has been truly incredible. And I think the reason that the community has grown is because of all of these things together. Uh, you know, it’s the fact that we do have a modern system around GitHub and that we allow extensibility and that we help people, you know, essentially be successful. So that’s been fantastic.

Um, so let’s, let’s talk, talk a little bit about business model. I’m not going to stand up here and talk to you about open source business models. That would take many hours, and I’m sure you’d all be bored. Uh, but I do think it is very interesting just because when we talk about again why Envoy has become this, this platform, I think we have to look at Envoy itself or Envoy the project.

And you know, one of the key takeaways of Envoy, the open source project is that there is no premium version and there is no open core version. And what I think is super incredible about this project is when I look out at all of you, I see competitors here, right? But yet from my perspective, we all work together and I think what’s really incredible about it is that I see competitors come into Envoy and we collaborate openly around features in the Envoy core. And then people go off and they compete on their extensions or their management planes. And you know, that’s great for everyone because it allows us to make, you know, technology first decisions. We don’t have to fight about saying, well, if I take this patch, I’m going to cannibalize my business model. And you know that, that again, I won’t name names, but that has led to some, some pretty bad outcomes.

Um, but the key here actually is what I was talking about before, which is that, you know, this model of, uh, you know, having no open core and the core open source product and making technology first decisions has allowed us to have a community ecosystem of what I call differentiated success. So again, that means that all of you, many of whom are competitors, can come together and we can build this amazing thing and see it grow around the industry, um, but, but you can all also be successful and compete with each other and hopefully build great things. Uh, and that’s, that’s a pretty, pretty awesome thing. Um, so let’s, let’s look at all of these things up here. And you might be looking at them and you’re, you know, look at the Envoy logo next to Linux and Android and Windows and iOS. And you might be saying what, what do all of these things have in common?

And the thing that they have in common is that they’re all, they’re all platforms. And what, uh, what I think is so incredible that has increasingly happened and in a very short period of time, and we saw this with the App Mesh GA that was announced this week, is that the App Mesh GA launched I think with 15 or 20 different partners and that was partners across things like observability and all, all types of different mix ends. And that partnership was possible because of Envoy, uh, because Envoy is effectively a universal API driven data plane that allows configuration to come in, data to go out, and extensions to be built in. And on top of these things you can build really incredible products. And what I’ve talked about a bunch is that just like these other platforms around, you know, Windows in the nineties or iOS in the two thousands, there’s a virtuous cycle where the more powerful the platform becomes, the more plugins and applications people want to build on top of that platform.

And then the cycle just continues and continues and continues. And we’re seeing that on Envoy now because we’re getting so much traction, so many people now are building control planes and plugins and, and, and applications, uh, that I, I really think that, uh, you know, we are on the path to being really almost everywhere, which is really amazing. Um, so just, just to briefly cover some of the things that I think that we’re going to see, if we assume that Envoy becomes this universal data plane, what type of applications will we see? And I, you know, I think that when you look at some of the stuff that we’re seeing today, it’s just the basics. It’s just table stakes. Like we’re, we’re at the beginning, I think, of the real innovation, which is all of the applications that will come on top. And that’s because historically when people have had to focus on the boring network plumbing and you know, we were looking before at the old L3 L4, uh, pictures, no one cares about that, right?

Like 20 people in the world in the future will care about that. People only care about applications, they only care about L7. Uh, the rest is just plumbing that no one cares about. And if we, if we look at, you know, this L7 or this application oriented world, people historically have focused a ton on just doing that plumbing. But if we assume that the plumbing is there, what can we build? We can build really incredible things. Uh, obviously we can do security. And you know, today we’re doing what I would consider to be very basic security apparatus. So things like ACLs and, and you know, super basic checks, but think about a world in which Envoy is able to tap traffic in a sampling manner, send that traffic to an analysis system, in real time build DDoS rules, send those DDoS rules back to Envoy and do real time blocking.

That’s just one example, or real time auditing or all of those systems. These are applications that will end up getting built. And because it’s universal, if you’re running Envoy in AWS or GCP or Azure or on-prem, it won’t matter. You can just use it. And that’s a pretty incredible thing.
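As an illustration only, and not something shown in the talk, the tap, analyze, and block loop described above can be sketched in a few lines of Python. Everything here is an assumption made for the sketch: the function names, the request shape (a dict with a `src` field), and the simple per-source counting rule. It does not use or represent any Envoy API.

```python
import random
from collections import Counter


def sample_requests(requests, rate, rng):
    """Tap step: keep each request with probability `rate`."""
    return [r for r in requests if rng.random() < rate]


def derive_block_rules(sampled, threshold):
    """Analysis step: flag every source seen more than `threshold` times in the sample."""
    counts = Counter(r["src"] for r in sampled)
    return {src for src, n in counts.items() if n > threshold}


def apply_rules(requests, blocked):
    """Enforcement step: drop requests whose source is on the block list."""
    return [r for r in requests if r["src"] not in blocked]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs sample the same requests
    # Hypothetical traffic: one noisy source and one quiet source.
    traffic = [{"src": "10.0.0.9"}] * 50 + [{"src": "10.0.0.1"}] * 3
    sampled = sample_requests(traffic, rate=0.5, rng=rng)
    blocked = derive_block_rules(sampled, threshold=10)
    print("blocked sources:", blocked)
    print("requests allowed:", len(apply_rules(traffic, blocked)))
```

In a real deployment each step would live in a different place (the proxy taps and enforces, a separate system analyzes), but the data flow is the same: sample, derive rules, push rules back.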

Observability. I found this cool, cool thing on Google images. I, that’s actually as a total aside, I love making presentations. I just go on Google images and keep searching for pictures until I find cool ones. So I thought, I thought this one was pretty cool for observability. It’s like a cool, I think. Um, but yeah, so, you know, we’ve, we’ve seen, uh, we’ve seen, you know, again, just very basic things or, or you know, not basic, but really just the beginnings of, you know, metric systems and tracing systems and logging systems. But when you think about, again, all of the data that’s coming in and out in a consistent way, we can build coherent dashboards and coherent tooling that spans, you know, the entire distributed system, which is pretty, pretty incredible.

Auditing. Obviously from an enterprise perspective. Um, you know, where you tap the networking traffic is where you can do an incredible amount of auditing. And I don’t have time to go into it in this talk, but I have lots of thoughts for future stuff that we can do around APIs and from an auditing perspective, you know, imagine if you can annotate APIs for fields that contain PII or like similar data and actually have the mesh deal with that, again, in a coherent way. There’s really incredible things that we can do there.
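To make the PII idea above concrete, here is a minimal Python sketch, again an illustration rather than anything from the talk. It assumes a hypothetical annotation in the form of a set of field names (`PII_FIELDS`) and a hypothetical `redact` helper that masks those fields before a payload reaches an audit log. No existing Envoy or mesh feature is implied.

```python
# Hypothetical annotation: the API owner declares which fields carry PII.
PII_FIELDS = {"email", "phone"}


def redact(payload, pii_fields, mask="[REDACTED]"):
    """Return a copy of `payload` with annotated fields masked.

    Recurses into nested dicts; the input is not modified.
    """
    out = {}
    for key, value in payload.items():
        if key in pii_fields:
            out[key] = mask
        elif isinstance(value, dict):
            out[key] = redact(value, pii_fields, mask)
        else:
            out[key] = value
    return out


if __name__ == "__main__":
    request_body = {"user": {"id": 7, "email": "rider@example.com"}, "phone": "555-0100"}
    # The masked copy is what an audit log would store.
    print(redact(request_body, PII_FIELDS))
```

Because the annotation lives with the API definition and not in each service, every hop that sees the payload could apply the same rule, which is the "coherent way" the talk points to.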

Debugging, right. You know, I mean, uh, just the ability to introspect traffic, tap traffic, search for traffic.

Uh, yeah, it’s again, fairly incredible what we can do to help people figure out what’s going on in these systems and actually fix them. Uh, and then again, obviously control, you know, we’re seeing people just start to build the beginnings of these control planes. Right now they’re relatively simple, but as was said in previous talks, you know, we’re increasingly entering a multicloud world or a world where people are spanning on-prem and cloud. And these control planes are going to have to become increasingly sophisticated. They’re going to have to deal across federation and trust domains and all of these things. So there’s a huge opportunity to actually build sophisticated control planes. But again, if there’s a universal API and a universal proxy in place for these control planes, people can build businesses and systems that will span multiple clouds and not be bound to a particular cloud or PaaS implementation.

So, um, thank you. So in summary, uh, you know, I think Envoy has been an incredible journey. Really excited to see all of you out here. The community is really incredible. Uh, I think, you know, we, we have grown as fast as we’ve been able to grow again based on quality, velocity, operability, uh, extensibility, API driven, the no open core pay premium, technology first. Um, and this ecosystem where we can allow all of you to, to be successful, which I think is pretty cool. So thank you very much.
