Cameron Moreau and Tian Wang at Tetrate's Service Mesh Day 2019: All for Auth & Auth for All: Bringing End User Identity to the Service Mesh
Cameron Moreau and Tian Wang from Pivotal spoke at Tetrate’s inaugural Service Mesh Day 2019 in San Francisco on the workings, pain points, and future of auth.
Moreau opened with a description of today’s “gruesome world of auth,” wherein a hapless app developer, tasked with bringing auth to the apps they’re building for the enterprise, is overwhelmed with complex, risky, and language-dependent approaches. Scale that scenario to a world where hundreds of app developers are building thousands of apps and microservices to deliver business value, and you’ve got a quagmire. How does the app developer build identity across all of those? What libraries and models will they choose? What policies will they enforce? Where are the vulnerabilities?
The beauty of the sidecar, where a proxy handles inter-service communications, security, and monitoring, is that it handles those issues in a scalable and language-agnostic way, freeing the app developer to worry only about the app’s business logic. Whether it’s for greenfield or brownfield, you get an entire service mesh when you wire all those app sidecars together. The service mesh also has an ingress gateway, an edge component that can handle inbound and outbound traffic to the mesh, and we can think of this, said Wang, almost like a sidecar for the mesh itself.
The proxies are the data plane that enforces policy; Istio, the control plane used in the talk, manages all of those policies for us. The result is that we’ve taken all of the logic that we’d typically write in every single one of our apps and split it off to be taken care of in the space between the services, by the mesh itself.
A key outcome of this approach, said Wang, is that it gives organizations security, velocity, and the ability to focus on business value.
“Our developers are happier,” said Moreau, pointing to the benefits listed below. “Our platform and security operators’ lives are much easier.”
Scale | Repeatability | More business value focus
So what does the world look like when we do this with auth? Below, we have a set of scenarios and process steps– as described by Moreau at Service Mesh Day, followed by Wang’s explication.
OAuth step 1: Let’s say a user hits your app and enters through the ingress gateway, with no token or cookie associated. So the request hits the proxy, which thinks about it for a second and says, “Hey, I see you don’t have a token, but I also see you match a particular IDP that I’ve already set up.” Whether it’s an internal or consumer-facing IDP, you have a login endpoint to hit. So you’re redirected there in order to kick off the whole OAuth flow.
Right here, Wang said, you might be thinking: Maybe I don’t have something that uses OIDC or OAuth; it’s very common to have LDAP or SAML. An identity proxy can help solve that for you. It can act as the authorization server for the policy that you set on the gateway proxy, and it can then translate your LDAP or SAML connection back into a login page of your own.
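As a rough illustration of step 1, the gateway decision can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not the filter shown in the talk: the IDP table, the `session` cookie name, the endpoint URL, and the function name are all hypothetical. The query parameters are the standard ones for an OAuth authorization code request.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical IDP table: path prefix -> IDP settings (all values are made up).
IDPS = {
    "/dashboard": {
        "authorization_endpoint": "https://idp.example.com/oauth/authorize",
        "client_id": "demo-client",
    },
}


def handle_at_gateway(path, headers, cookies, callback_url):
    """Return ("forward", None) or ("redirect", url) for an incoming request."""
    # A request that already carries a token or session cookie skips the flow.
    has_bearer = headers.get("Authorization", "").startswith("Bearer ")
    if has_bearer or "session" in cookies:
        return ("forward", None)
    for prefix, idp in IDPS.items():
        if path.startswith(prefix):
            query = urlencode({
                "response_type": "code",
                "client_id": idp["client_id"],
                "redirect_uri": callback_url,
                "scope": "openid",
                # Random value the gateway must remember to match the callback.
                "state": secrets.token_urlsafe(16),
            })
            return ("redirect", idp["authorization_endpoint"] + "?" + query)
    # No IDP matched this path; leave the decision to downstream policies.
    return ("forward", None)
```

Calling `handle_at_gateway("/dashboard", {}, {}, "https://app.example.com/callback")` yields a redirect to the hypothetical IDP, while the same call with an `Authorization: Bearer ...` header is forwarded untouched.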
OAuth step 2: Next, posits Moreau, you get a code, and the proxy exchanges this for a token: “an ID token, a JWT token, whatever you have set up. The whole idea is, whenever you enter the mesh with a token, you’re passed down in the request with the correct bearer token or ID token attached, and you don’t have to go through the OAuth flow. If you have no token, the OAuth flow takes place.”
Wang: JWTs have a typical header, body, and signature that prevents tampering with the token. The body contains information that you know about the user, per OIDC spec, and here, a lot of people also add custom claims.
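To make the header, body, and signature structure concrete, here is a minimal sketch that builds and checks a JWT using only the Python standard library. It assumes an HS256 (shared secret) signature purely to stay self-contained; identity providers more commonly sign with an asymmetric key published at a `jwks_uri`. The helper names and the sample claims are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_jwt(claims: dict, key: bytes) -> str:
    """Return header.body.signature for the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def signature_ok(token: str, key: bytes) -> bool:
    """Recompute the signature over header.body and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(expected, b64url_decode(signature))


def read_claims(token: str) -> dict:
    """Decode the body (second segment). This does NOT verify anything."""
    return json.loads(b64url_decode(token.split(".")[1]))
```

Swapping in a different body while keeping the original signature makes `signature_ok` return `False`, which is the tamper protection Wang describes.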
OAuth step 3: So we went past the gateway, and now I’m on to some downstream services. Here, the first important policy is AuthN, JWT validation. In this case, says Moreau, “we want to know who the user is. And we don’t really care about what they can do. We just want to make sure that they have a token: Is this token signed by the correct person? Is it given by the correct issuer? Is it expired? So my first request hits, I have an appropriate token. The proxy thinks about it and says, ‘Hey, here you go, talk to the app.’ The next request: a user logs in or the user has a bad token or no token at all. And the proxy is able to kill that request.”
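The three questions Moreau lists (signer, issuer, expiry) can be sketched as a small check. This is an illustrative sketch rather than Istio’s implementation: the function name and the reason strings are made up, and signature verification is assumed to have happened already and is passed in as a flag.

```python
import time


def check_token_claims(claims, expected_issuer, signature_verified, now=None):
    """Return (allowed, reason) for a request's decoded token claims.

    `claims` is None when the request carried no token at all.
    """
    now = time.time() if now is None else now
    if claims is None:
        return (False, "no token")
    if not signature_verified:
        return (False, "bad signature")
    if claims.get("iss") != expected_issuer:
        return (False, "wrong issuer")
    exp = claims.get("exp")
    # A token without an expiry is treated as expired in this sketch.
    if exp is None or now >= exp:
        return (False, "expired")
    return (True, "ok")
```

Note that nothing here looks at what the user may do; that is left to the AuthZ policy.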
Which brings us to AuthZ, which enters the picture in the form of service roles and service role bindings. AuthZ answers the question “can the user do what they think they can do?”: a service role describes the access being granted, and a service role binding grants, or “binds,” that role to you only if your token carries the right scope. And in the mesh, the AuthN and AuthZ policies can be reapplied at each service as the request goes downstream.
(Here, notes Wang, we are talking about end user authentication that passes a request through microservices; there are other mechanisms, through Istio, that are tackling app identity and service-to-service auth).
See the full video for insights on the future of auth as well as a live demo showing the application of end user auth policies (e.g., requiring a pivotal.io address to access a page).
Follow #tetrateio to stay on the edge of service mesh and updates in the world of auth.
Today we’re going to be talking about a thing: All for Auth, Auth for All, bringing end user identity to the service mesh. Uh, quite a mouthful, but hey, we’ll get started here. A couple of things on the agenda. First of all, we want to talk about how auth works today, how you may do it at your company, how it was done maybe a year ago, how it’s done today, uh, what some of the pains are. We’re going to talk about what the future kind of looks like with Istio. We have a small demo for you all, and we’ll talk a little bit about what’s next and what the service mesh brings. Envoys are cool. Um, that being said, my name is Cameron. I’m a software engineer at Pivotal, specifically on the application single sign-on team.
My name is Tian Wang, I’m the senior product manager over the application identity problem space. Um, so we’re both from Pivotal and this is a talk about a collaboration that we’re doing with Thales E-Security, so Nick Smith and some folks out of London, as well as the Istio security working groups, so Limin Wang, Tao Li and a bunch of those folks there. Um, and you know, just yesterday we were at the Istio workshop when Zack Butcher mentioned, you know, wouldn’t it be great if someone took all the OIDC libraries and put them in the sidecar? So hint, hint.
All right, so, uh, let’s start with a story. It’s going to be called today’s gruesome world of auth, boo. Hi. Meet the app developer. He makes apps. Meet the decision maker. She makes decisions and with those decisions come business opportunity. She wants to grow her business. Um, you know, there’s several things you can do and you know, software is one great way to grow your business, make money. So she has some requirements and an application that needs to be built. She reaches out to the app developer, he understands, he’s pretty excited. You know, this is more business logic he gets to write, right? Well, you know, he gets down and makes his list, starts drafting some stuff up, works on some features, and then he realizes, oh, hey, this story includes, I need to bring some auth to my app as well. How do I do that?
So I mean, first thing, right? Just like any normal developer, you hit the books, aka Google and Stack Overflow, and you learn how to do auth. Like what does that even mean? Right? I just build apps. Well, you start looking around and you notice there’s a lot of resources here. There’s this thing called OpenID Connect. There’s this thing called SAML. There are JWTs. What is this ID token thing everyone’s talking about? And really, you know, none of this is language agnostic. I chose to use Spring in this case, so now I have to find something that meets my needs, where, you know, it’s already written in Java and maybe there’s a framework that already exists. It’s very easy to get overwhelmed in this kind of a world. And you know, the app may have been delivered, but it took a lot of time. There are some risks to be made and not everyone’s happy. But the problem is this is just a single app developer.
Yeah. And enterprises are going to have hundreds of application developers that are all trying to kind of figure this out and build applications and deliver business value. And there’s this really good talk, I forget where it was or who it was, but the concept was, you know, “said no CEO ever.” So: “All my developers know how to build LDAP, SAML, OIDC, and that’s great,” said no CEO ever. Um, so going from this, of course we’ve got hundreds of enterprise application developers that are trying to build thousands of applications and microservices. Uh, and the question there is of course, how am I building identity across all of those? You know, leaving aside that there’s probably a single identity provider team. What’s that? Leaving aside the fact that there’s like a single identity provider team that’s going to be a bottleneck to kind of building identity into those applications.
There’s also the question of how am I going to build this into my applications? What are application developers doing? What libraries are they choosing to use? What policies are they enforcing? What models are they trying to use? Is it even [unclear] secure? So when we get into the security aspects, of course, you know, what happens when a CVE happens, where am I vulnerable? What am I going to do? How am I going to roll this out? Um, and you know, you’re still trying to grow your applications and if you’re not able to keep up, of course this is going to be, um, a very chaotic, a very messy, meshy mess all over the place. So we just went over a lot of the challenges that you might see related to, kind of, identity at scale, um, at various corporations. And this is basically something that is near and dear to our heart. Um, so one of our central personas is the enterprise application developer and we are very much focused on their user journey as a part of my product, the Pivotal single sign-on service. But the way we look at it is this user journey is very user centric and solution agnostic and full of pains that exist across all enterprises.
But hey, we’re here to tell you the future is a pretty nice place. So let’s talk about sidecars. Oh, not these kind of sidecars, these kind of sidecars. You know, where traffic goes in. Uh, when your app receives a request, it hits a proxy first. When your app wants to make an outside or an egress call, it hits the proxy first. And you can kind of think about this as, you know, the smallest unit, whether you’re using Kubernetes or, you know, whatever, like a little pod, right? The cool part here is when I start scaling my app, my proxies are going to scale with it. So that’s great. But you know, one of the main pieces to really acknowledge here is that we’re taking a lot of the logic that’s inside of your app and we’re doing it elsewhere.
So let’s look at your app. It probably does some really cool thing and you’re probably really proud of it and you should be, but there’s a lot of business logic and that’s really important. But to make your app do the thing it was supposed to do, you added things like metrics and tracing and logging, service discovery, authentication, right? So you probably picked some framework, maybe you worked on it from scratch, but you know, in an Istio Envoy world, here’s your app. You know, you bring up your app, it has its own special container, just your business logic, you don’t have to worry about anything else. And we split that up. So it’s all done kind of for you almost. And the best part is it’s language agnostic. I don’t have to pick a specific framework or write it myself just because I’m using some weird language; uh, it kind of does the same thing, right?
So that’s kind of like the whole “what if my application didn’t support it before?” Well, we could add it there, whether it’s greenfield, brownfield, um, and then of course when you kind of wire all those up together and you have application sidecars at every level, then you get this entire service mesh. Um, and with the service mesh of course we also have the ingress gateway, kind of that edge component, um, that can handle inbound and outbound traffic to this mesh as well. So we’re going to talk a little bit about this as well, because essentially we’re kind of seeing this as almost like a sidecar for the mesh itself. Um, a place where we can add a lot of our own logic as well.
So you know, we talked a lot about the data plane here. The thing, the proxy, the Envoy, the thing that’s going to act on the policies that we give it. But in our example, in the demo today, we’re going to be using Istio. I’m sure you’re familiar with it by now, so I won’t go too deep into it. But the whole idea is this is going to manage all of our policies for us. This is our control plane that we’ve chosen to use. So the real piece here is we’re trying to take all of this little logic that we rewrite every single time, super boilerplate, and we can push it to the sidecar or to the platform or to Istio. You know, Eric’s talk today mentioned a huge key concept of Istio is, you know, I don’t have to keep doing things as an app developer.
They’re kind of done for me. I’m able to split off this logic. So what does a world look like where we do this with auth? And uh, we’re here to kind of show you a little bit. So we’re going to go through this story piece again. So a user, once they hit your app, maybe they go to something like /dashboard, maybe whatever. Maybe they do or don’t have a token, we’ll find out in a second. But the first thing, it’s going to go through your ingress gateway. So, it travels through and we notice that there’s no token or cookie associated. So it hits that proxy, the proxy thinks about it for a second and says, Hey, I see you don’t have a token, but you match a particular IDP that I already have set up. Maybe it’s an internal IDP, maybe it’s consumer facing, who knows? But you have a login endpoint to hit. So you’re redirected there in order to kick off the whole auth flow.
And from there as well, you might be thinking, well, maybe I don’t have something that uses OIDC or OAuth. It’s very common, um, to kind of have LDAP or SAML behind. And that’s where we kind of use that same tagline as Kubernetes does, which is, well, you know, identity proxies can help solve that for you. Um, and the idea there is of course this identity proxy can act as an authorization server for the policy that you set on this proxy, uh, and it can help then translate your LDAP and SAML connections back into, um, this login page of your own. Um, and this is something we use ourselves with Cloud Foundry’s User Account and Authentication service. It is just one example of an identity proxy. There’s plenty out there so you can just bring your own.
So next step in the OAuth flow is you get a code, maybe it came back and went to the identity proxy, but you get a code and the proxy actually exchanges this for a token and you know, it can be an ID token, JWT token, you know, however you have it set up. Uh, but the whole idea is now whenever I enter the mesh, if I don’t have a token, that whole flow takes place, I have a token and then you’re passed down in the request with the correct bearer token or ID token attached. But if I do have a token or cookie attached, you know, it automatically knows you’re not going to have to go through that OAuth flow again, it’s translated into a bearer token and propagated down to your app. And now I’m sure like all of you as well, you’ve probably read the documentation of Istio and you’re like, wait a second, I’ve never seen this policy. Uh, well this is a little experimental policy that Pivotal and Thales are working on, uh, we have some more information kind of at the end.
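For readers who have not seen the code-for-token exchange, the two halves of that step can be sketched as follows. This is a sketch under assumptions, not the experimental filter itself: the function names are hypothetical, client authentication (for example a client secret) is left out, and choosing to forward the ID token ahead of the access token is an assumption made for illustration. The form fields are the standard ones for an authorization code grant.

```python
from urllib.parse import urlencode


def build_token_request(code, redirect_uri, client_id):
    """Headers and form body the proxy would POST to the IDP's token endpoint."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
    })
    return headers, body


def bearer_header_for_app(token_response):
    """Turn the IDP's JSON token response into the header passed to the app."""
    # Assumption: forward the OIDC ID token when present, else the access token.
    token = token_response.get("id_token") or token_response.get("access_token")
    if token is None:
        raise ValueError("token response contained no token")
    return {"Authorization": "Bearer " + token}
```

The app behind the sidecar only ever sees the resulting `Authorization` header; the redirect, the code, and the exchange all stay in the proxy.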
Um, and then of course, just to give a brief intro to JWT, but I think most people here might have heard of this before from past talks. So you know, JWTs have a typical header, a body and a signature. Um, and then the signature itself offers, you know, tamper proofing, um, it’s meant to kind of prevent the token from being tampered with. Uh, and then of course the body contains a bunch of, um, information about the user that, you know, is both per the OIDC spec, but a lot of people also go in and add custom claims to that JWT token.
So we went past the gateway and now I’m on to some downstream services. So you are here, let’s talk about the first important policy: JWT authn, or authn JWT validation. In this case we want to know who the user is and we don’t really care about what they can do. We just want to make sure that they have a token. Is this token signed by the correct person via some jwks_uri, uh, is it given by the correct issuer? Um, is it expired? Things like this. Right? So my first request hits, I have an appropriate token. The proxy thinks about it, says, Hey, here you go, talk to the app. The next request, a user logs in or a user has a bad token or no token at all, and the proxy is able to, uh, to kill that request before it hits the app.
Uh, next popular one you’ve probably heard of: authz. This comes in the form of service roles and service role bindings. So this is that part that I mentioned. Can the user do what they think they can do? So if I’m trying to make a POST request to a /todos endpoint here, I have the scope read and write and I have a service role set up that says it will only be granted to you or bound to you if you have this right scope. In this case I do, I’m able to pass on that super secret message to my app. But in this case, a todos viewer with only the read scope is rejected at the proxy. And, uh, so here I am, right? I’ve gone to the first downstream service, but what about the rest of them? Right? We’re a service mesh. We have microservices. Well, the best part is as the request kind of keeps going downstream, you can keep reapplying these policies over and over until finally, you know, maybe one rejects the request and you get whatever error code you have set up. Uh, or you know, you can just keep reapplying the authn and authz rules.
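The /todos scenario can be sketched as a tiny role and binding lookup. This mimics the idea only and is not Istio’s ServiceRole schema: the role names, the scope names `todos.read` and `todos.write`, and the assumption that scopes arrive as a space-separated `scope` claim are all hypothetical.

```python
# Hypothetical roles: the (method, path) pairs each role grants.
ROLES = {
    "todos-viewer": {("GET", "/todos")},
    "todos-editor": {("GET", "/todos"), ("POST", "/todos")},
}

# Hypothetical bindings: a role is bound to any user whose token has this scope.
BINDINGS = {
    "todos-viewer": "todos.read",
    "todos-editor": "todos.write",
}


def is_allowed(claims, method, path):
    """Allow the request if any role bound via the user's scopes grants it."""
    # Assumption: scopes are carried as one space-separated string claim.
    scopes = set(claims.get("scope", "").split())
    for role, required_scope in BINDINGS.items():
        if required_scope in scopes and (method, path) in ROLES[role]:
            return True
    return False
```

With these tables, a token carrying only `todos.read` can GET /todos but is rejected on POST, while a token that also carries `todos.write` gets through. Because the check needs only the token and the policy, each downstream sidecar can repeat it.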
And, um, our focus here has really been talking about the end user authentication piece that passes along with the request through your microservices. But that doesn’t mean we aren’t thinking about the application identity, the service-to-service pieces, as well. But the good thing is there’s a bunch of smart people in the Istio community who are working to tackle that exact same problem. So we believe that a lot of this can be leveraged in a complementary way, uh, with the end user authentication. Um, and you can basically get all that SPIFFE goodness that you want.
All right, so we have a live demo for you all because we like to live life on the edge. Um, and if you haven’t seen Bookinfo, that was in the previous few talks as well, but we’re using Bookinfo from Istio to do this. Um, let’s see here. I’ll make this a little bit bigger because, uh, you know, the screen’s a lot bigger. So I have Bookinfo. I actually have one policy already set up today. Let me actually make this a bit bigger so you can see. But if I do something like, let’s look what I have here. So that’s one policy set up. The first one that I’ve already enabled is, uh, whoops. First one I’ve already enabled is my service role and service role binding. So I have RBAC configured, but it says, hey, if you’re coming from a certain service, allow anyone basically to view the product viewer page. So the first thing I’m gonna do is I’m going to say, hey, in order to view this, you have to have a valid token. So to do that, I’m going to k apply.
I’m going to apply this policy, but let’s take a look at what it does first. Basically, we’re applying this to all sidecars in this case, saying, Hey, you need to have a JWT token and it needs to have an issuer from our source and a jwks_uri from our source. So now if I visit this page without any kind of form of token, I get an origin authentication failed and you’re probably thinking, hey, how did you get that token the first time? And that slide you were talking about. So we have a thing for that. Uh, let me apply that.
So this is our OIDC filter policy. I’ll show it to you. There’s going to be some secrets, but it’s fine because those are going to be deleted after this talk. But regardless, uh, another thing: I mentioned that this is kind of experimental right now. So the policy has actually changed a bit. Let me actually make it bigger for you. But you can see I have a multiple set of IDPs here. Given a certain request, I can match that IDP. I have some values set up, like my authorization token, jwks_uri endpoint, things like that. Right? Well, let’s see. Uh, let’s see what this actually looks like. So I’m gonna hit my page and instead of an issue, I am automatically redirected to my chosen IDP. In this case, I’m gonna go ahead and sign in with Google and I’m logged in here and hey, what do you know? I now have access to the app. Uh, one thing to show as well is k [unclear].
If you’re familiar with the OAuth flow, you know that it isn’t stateless. Uh, and I actually have several ingress gateways here. So obviously an issue to be solved there. We can talk about it a little bit later. But here’s the next thing I want to do. I want to say you need a pivotal.io address in order to view this product page. Maybe it’s an internal tool. So let’s apply that. So if I look at that policy, all it’s saying is, hey, just like before, the service role binding: only grant this service role if I have a pivotal.io email address. So that means when I refresh this page here, I’m not just gonna kill it because caching is a thing. Right? Let’s see, I’m gonna log back in with Google. And what do you know, access denied. But if I were to log in with my pivotal email address, sure, you can have an ID token. Bam, I’m in.
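The pivotal.io rule from the demo boils down to a constraint on one claim in the ID token. A minimal sketch of that check, assuming the address arrives in an `email` claim and using a hypothetical function name, could look like this:

```python
def email_domain_allowed(claims, allowed_domain):
    """True only if the token's email claim is an address at allowed_domain."""
    email = claims.get("email", "")
    # Require exactly one "@" so the local part and domain are unambiguous.
    if not isinstance(email, str) or email.count("@") != 1:
        return False
    local, domain = email.split("@")
    # Compare the whole domain; a suffix match would also accept
    # look-alikes such as "notpivotal.io".
    return bool(local) and domain.lower() == allowed_domain.lower()
```

A personal Google address fails this check and a pivotal.io address passes, which matches the access denied and “Bam, I’m in” outcomes in the demo.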
So that was just kind of an example of where we are today and this is something we’re working towards upstreaming into Envoy as well as, um, contributing into Istio itself. So we’ve actually presented the OIDC policy, I think two weeks back, as a part of the Istio security working group. Um, so it’s there in the notes and we’ll have it at the end of these slides as well. Um, now of course, there’s a lot of things that can be added by the time this goes, um, you know, production, enterprise ready. Um, of course things like observability, of which there’s a lot of activity going on within, um, the Istio community as well. So not something we’re going to worry too much about. Um, default security policies. This is always that, you know, um, DevOps, you know, dev or sec or ops kind of conversation: who owns the policies?
Um, so a lot of that comes around, you know, are there default policies you want to set, who can set what policies where. Um, there’s kind of this whole concept of RC tokens that, you know, Google folks have been talking about in the Istio security working group, and the concept there is request context tokens. Um, so one, um, disadvantage or risk of using an OIDC token throughout your mesh of course is, can it be used for impersonating, can like a service take it and call another service with it. Um, so request context tokens are meant to kind of swap those out beyond the edge and use kind of this token only in the context of container to container. And that of course means we can kind of just worry about token exchanges then at the edge of this service mesh. So like if we’re receiving tokens from another authorization server or sending tokens out that need to be authenticated by a different authorization server, maybe that’s the only time we really have to worry about token exchange scenarios.
There’s a lot of additional items as well that could be essentially chained with fine-grained authorization. Um, so you know, with scopes in OIDC, they’re fairly limited towards kind of role-based access control scenarios. Um, so there’s definitely going to have to be kind of a “what happens for more fine-grained authorization?” And then of course other new innovations that can happen in the service mesh space. You know, this next generation access control, token blacklisting, a bunch of different things that may not have been possible before are now going to be possible with a service mesh. And the best part is since all of this logic sits outside your codebase, why not just change policies later? Make sure your use cases work.
So what’s an ideal future look like? And this is just painting a rosy picture. Um, so you know, we’re already putting this logic into sidecars. So you’re going to have essentially this logic available kind of across your service mesh. We’ve already chosen to start using federation as a protocol with OIDC, so you have federation at the end user level and it’s really cool to see kind of SPIFFE going in the direction of federation as well. Um, so one joke we have with our team is that software development is pretty cyclical. So at some point we’re going to start calling the service mesh a monolith and move to micro service meshes.
And then of course, um, you’re going to get observability, which is something we’ve talked about. Very important to have auditing, logging, monitoring, all of that across your service mesh. And then of course the operational aspects of this as well. Upgrading, patching, being able to kind of use policies to kind of help you manage this all across everywhere. But really what we see, the outcomes that are being enabled by this is that you’ll be able to have security and velocity while focusing on just the business value.
So in our strange planet, this changes things a bit, right? There’s a lot less things we have to do. There’s a lot more features we get for free. And I mean, I can go off and name the benefits or you can just kind of read the slide, but you know, our developers are happier. Our platform and security operators’ lives are much easier. The business side can focus on velocity, everything Tian just said, it’s all nice and great: more focus, less risk. It’s pretty good.
And this is something that, you know, we’ve been talking to our customers about across the Fortune 500, minus the tech companies of course. So looking really at enterprise developers everywhere and kind of like seeing this is kind of a pain that they’re seeing. Um, and trying to give them kind of a solution. So, you know, asking them questions about like, what are the requirements for this? But trying to kind of say like, hold off, you know, it’s not quite there yet. We’re trying to learn from you right now. Um, and this is something, of course, that kind of fits with, you know, Pivotal’s value of transforming the way the world builds software. We’re really looking at kind of, you know, the people, process and culture side of things on top of the tools that enable enterprises to be successful. Um, and a lot of that has to do then with, uh, making sure like from our side that we have the tools to help enterprises build identity into their software. Um, so, you know, we want to definitely work with the open source community more. We definitely would love, um, collaborators in the identity space and for security aspects of Istio and the service mesh. Um, so we have a Slack channel, um, an OIDC proposal out there. Um, there’s the Istio discuss security channel, which I may go to every few months or so. Um, and then, um, the Google shortener wasn’t working. So remember our OIDC proposal link. Uh,
They actually deprecated it like a couple of months ago.
Oh no, we didn’t know that.
Um, but yeah, we’re a small team at Pivotal - 4 developers, 2 designers and a PM. I’m working with like two developers out of London and the Istio security working groups. So hoping to get more involved in the community.
Also follow us on Twitter.
I don’t really use it, but I put my name there anyways.
Any questions, by the way?
So I can kind of take this question though. The question was, uh, I believe, whether our OIDC policy was a policy inside of Envoy. So we have built a filter, or Thales has built a filter and we’ve kind of helped out a bit. Uh, and it’s the OIDC filter as well as the session manager filter. And the idea is, OIDC is pretty extensible, so I could also have an OAuth filter as well built on that session manager. So we’re in the process of upstreaming those changes into Envoy. So you can use whatever service mesh you like. In our case, we’re specifically working with Istio and we’re creating an OIDC policy, so, similar to the policy you saw today, I can apply it, those configs go to Envoy and the magic happens.
And we’re actually meeting with our collaborators on Google campus in two weeks time. Um, just to kind of talk through how we can get the rest of the way to get this into the open source community.
Yes, we are. This is a problem that we’re currently solving. As I mentioned, another problem, very similar, is the whole fact that if you have multiple ingress gateways, it’s not guaranteed that you’re going to go to the same gateway. OAuth is not a stateless operation. Uh, so there’s kind of two approaches here. The first approach that we’ve taken is we have an encrypted cookie. The token does not actually bleed into, we’re storing the cookie, the encrypted cookie, in the browser. The browser can’t read it, can’t decrypt it. Uh, this is kind of like a K8s secret to decrypt inside of the service mesh. Once the proxy receives that encrypted cookie, it’s able to decrypt it and pass it down. The second option, like you mentioned, is you need some kind of persistent storage. So we’re also working on, currently in this Envoy filter, you can swap out between in memory, uh, using the encryption key, or you can bring your own Redis. And this is kind of an area that we’re looking into, very similar, you know, it’s a possibility to add Redis to the Mixer rate limiting filter. This is just another problem we kind of have to solve, but stuff like that is definitely what’s going on in that Slack channel. So,
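The multiple-gateway problem and the “bring your own Redis” answer can be illustrated with a toy model. Everything here is hypothetical: the `Gateway` class, its methods, and the use of a plain dict standing in for a shared store such as Redis. It only shows why per-gateway memory breaks the flow while a shared store does not; it does not model the encrypted cookie approach.

```python
class Gateway:
    """Toy ingress gateway that must remember OAuth state between redirects."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # any dict-like object holding pending logins

    def start_login(self, state, original_path):
        # Remember where the user was headed before redirecting to the IDP.
        self.store[state] = original_path

    def finish_login(self, state):
        # On the callback, look up (and consume) the pending login.
        return self.store.pop(state, None)


def login_across(first, second):
    """Start the flow on one gateway and land the callback on another."""
    first.start_login("state-123", "/productpage")
    return second.finish_login("state-123")


# Each gateway keeps its own in-memory store: the callback finds nothing.
isolated = login_across(Gateway("a", {}), Gateway("b", {}))

# Both gateways share one store (a dict standing in for Redis): it works.
shared_store = {}
shared = login_across(Gateway("a", shared_store), Gateway("b", shared_store))
```

Here `isolated` is `None` because gateway b never saw the login start, while `shared` is `"/productpage"`.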
Yeah. A good question. So the question here was, does the OIDC filter only work for JWT tokens? And right now that’s kind of the case. Um, it’s just kind of built that way. It could be extended to handle any form of opaque token; the main reason we started with JWTs is so that we can build the authn and authz filters right on top of it. Uh, ’cause with authn and authz, right, you need a JWT to validate claims.
I think there’s been some talks in the community. Actually, a proposal came out I believe three weeks back or so to look at, um, opaque tokens as well. Um, I think the reason why we went with JWT was of course for offline validation and the existing policy filters, just to avoid the network hops. Um, and with kind of the service mesh, we’re hoping that a lot of this is more hidden away from the user, especially if you layer mTLS on top. Um, so you know, uh, it’s not like out of the picture to use, um, OAuth with opaque tokens. Um, but it is going to have to go back to the central authorization server for that validation and you’re going to have to start like reintroducing those network hops.
Well, hey, if that’s everything, that’s all we had to say. And thanks for listening.