XYZ Transactional Authorization

From IIW



Wednesday 7A

Convener: Justin Richer

Notes-taker(s): Aaron Parecki


Tags for the session - technology discussed/ideas considered:


Discussion notes, key understandings, outstanding questions, observations, and, if appropriate to this discussion: action items, next steps:


1. Notes received from Aaron Parecki:


Discussion notes


there are a lot of extensions to OAuth 2.0 to help it fit into other places: PKCE, UMA, OIDC, etc.


a lot of these applications and extensions have added their own bits and pieces over time


if you take all of the oauth 2 specs and stack them up it's a big pile, and they aren't even necessarily compatible with each other


what if we took a step back and, instead of asking how to make oauth do these things, tried to solve the underlying problems better?


oauth.xyz


current draft spec at oauth.xyz


it's still fairly early, but some parts of this have already been implemented


two of the big things that are problematic in oauth 2:

- it relies too much on passing data in the front-channel
- it doesn't have a good underlying data model for what a client or resource is


in this protocol, there is still an authorization server, but it's defined as a single URL. it's equivalent to the token endpoint in oauth, it's backchannel only, takes JSON in and returns JSON out


what the client does is it talks to the authorization server and says this is who i am and this is what i want. this is represented in this JSON object.


by the way this is a strawman, so please burn it down, but not with the argument "this isn't how oauth 2 works"


the client says this is who it is, name, home page, etc. this allows the client to show up and declare this to the server. this feels funny in an oauth 2 world where everything is assumed to be preregistered. but this is just user facing decorative information which gets self declared in client registration today. notice that there are no functional URIs or keys. this is like a dynamic client registration request.


the client and server have the option to support a "handle" request, so instead of passing this information by value the client can pass it by reference, using a handle it may have gotten from preregistration or previous communication.


next: interact. this is how the client can interact with the user. this example says the client has the ability to redirect the user to the AS and to receive a redirect back. this is the auth code flow. there are a couple other modes, but this is where you declare it. this is also where you declare your state parameter: it's required, it should change every transaction, and it's sent in the back channel.


next: resources. this is very loose right now. Torsten Lodderstedt has been doing some interesting work in fleshing out this area and the ideas will probably merge. you can either send a list of strings like oauth scopes, or declare in more detail the kind of things you want. in this example that's actions and locations, but torsten has a different data model which is probably better.


the key is for a client developer, if the API is built in such a way that it has a set of predefined resource sets, those get handle identifiers and you can send the handles in. or you can say "i only need read access to this small subset"
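
A minimal Python sketch of the two styles of resource request described above. The `actions` and `locations` fields come from the example mentioned in these notes; the function name, the handle string and the URL are hypothetical, and the real data model was still in flux at the time.

```python
import json

def build_resources(*entries):
    """Build the resources section of a transaction request.

    Each entry is either a string (a handle for a predefined resource
    set, much like an OAuth scope) or a dict describing the request in
    more detail. This shape is an assumption, not the draft's syntax.
    """
    resources = []
    for entry in entries:
        if isinstance(entry, str):
            resources.append(entry)
        elif isinstance(entry, dict):
            resources.append({
                "actions": list(entry.get("actions", [])),
                "locations": list(entry.get("locations", [])),
            })
        else:
            raise TypeError(f"unsupported resource entry: {entry!r}")
    return resources

resources = build_resources(
    "photo-api-read",  # hypothetical handle for a predefined resource set
    {"actions": ["read"], "locations": ["https://api.example/photos"]},
)
print(json.dumps(resources, indent=2))
```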


next: user. this is from UMA. this is the ability for the client to push information about the user. UMA calls it a pushed claim. if the client knows something about the user, it can pass this assertion about the user into the authorization server. oauth 2 doesn't have a place for this; UMA does, but there it's not as well defined or supported. this is one of the gaps for people whose use cases oauth 2 doesn't fit. the client needs to be able to say "on behalf of this user i need to get these resources". This can also be passed in as a reference handle, which is like the UMA persistent claims token.


finally: keys. A big gap in OAuth 2 out of the box is the ability to bind keys in the client software to the resulting tokens. We want clients to be able to declare at transaction request time "these are the keys that I have" and prove that they have access to those keys.
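
Pulling the sections above together, a transaction request might be assembled roughly like this. This is only a sketch: the section names follow the notes (client, interact, resources, user, keys), but every field name inside them, the function name and the URLs are assumptions made for illustration. The draft at oauth.xyz is the place to check the real syntax.

```python
import json
import secrets

def build_transaction_request(callback_url, resources, client, keys=None, user=None):
    """Assemble a transaction request with the sections discussed above.

    All field names here are illustrative assumptions.
    """
    request = {
        # by value (display info) or by reference (a handle string)
        "client": client,
        "interact": {
            "type": "redirect",
            "callback": callback_url,
            # fresh for every transaction, sent in the back channel
            "state": secrets.token_urlsafe(16),
        },
        "resources": list(resources),
    }
    # user and keys are only included when the client has them
    if user is not None:
        request["user"] = user
    if keys is not None:
        request["keys"] = keys
    return request

request = build_transaction_request(
    callback_url="https://client.example/callback",
    resources=["photo-api-read"],
    client={"name": "Example Client", "uri": "https://client.example/"},
)
print(json.dumps(request, indent=2))
```

The client would POST this JSON to the single authorization server URL in the back channel; nothing here travels through the browser.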


that's the request. the response has a similar set of potential sections.


there are a few bits to point out before getting into how interaction works.


first, this is the authorization server telling the client, I need you to get the user involved with me, so send them to this interaction URL. the client takes that URL exactly as is and sends the person there. 


this first handle is the transaction handle. when the client comes back from the transaction endpoint, or if it's polling, or if it needs to refresh an access token, etc, this is how it references this transaction and all of the decisions/claims/rights it's associated with. in this example it's a bearer shared secret, but there could be other ways to manage it, including requiring that any keys that have been presented and proved be proved again when this handle is used.


the client_handle and key_handle come back optionally from the transaction request and say "that's great, you can give me just this handle next time you come back instead of providing all the client metadata, key metadata, etc." it's effectively dynamic client registration but it's built into the protocol.
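
A rough sketch of how a client might keep the handles it gets back and reuse them on its next request. `interaction_url`, `client_handle` and `key_handle` are named in these notes; using `handle` as the field name for the transaction handle, and the helper names, are assumptions.

```python
def remember_handles(response, store):
    """Copy any handles the AS returned into the client's local store."""
    for name in ("handle", "client_handle", "key_handle"):
        if name in response:
            store[name] = response[name]
    return store

def client_section(store, display):
    """Send the client handle if we have one, else the full display info."""
    return store.get("client_handle", display)

store = remember_handles(
    {
        "interaction_url": "https://as.example/interact/abc",
        "handle": "tx-1",
        "client_handle": "client-7",
    },
    {},
)
print(client_section(store, {"name": "Example Client"}))  # prints client-7
```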


Interaction


this example is basically the auth code flow. the client says I can interact with the user by redirecting them and then you can redirect back to me. When you come back to me, you can redirect to this URL, and here is this state value. This should be sounding familiar. When the server processes this, it decides from that request, I need a user to approve this -- this is not necessarily a known client, this is not a continuation of a previous transaction, etc -- whatever reason, the AS decides I need a user. So it tells the client to send the user to this URL (interaction_url), and when you get the results of that here's a transaction handle to continue it. This URL, unlike OAuth 1 and 2 and UMA, is a fully formed static URL that the client does not add anything to. This URL is already a reference to this entire transaction at the AS. Yes this means the AS is stateful, and sure there are ways around this but it's an implementation question. 


As far as the client is concerned, it gets a URL and sends you there. Meanwhile, the AS interacts with the user, so we're in classic OpenID/OAuth territory: ask them to identify themselves and ask for authorization. Unlike OAuth, this mechanism can also stand in for the claims-gathering endpoint in UMA 2. It's important when we define this in the spec text that we make it more generic. This needs a little more thinking.


In this mode, the auth server knows how to send you back. It takes the URL that the client presented at the beginning (callback URL). Just like in OAuth 2, the auth server adds a couple parameters to the callback. Went back and forth on this, because if we require the client to generate a new redirect URI every time, we can get rid of the state idea. But instead, the state parameter is required, and the AS sends back a hash of it as well as the interact handle. 


The client will look at the state value and match it up. Just like in OAuth 2, I tried to design this where the client is the stupidest piece of software in the system. The client takes that handle and sends it back to the authorization server.
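
A minimal sketch of the state check on the callback. The notes only say that the AS sends back a hash of the state; the choice of SHA-256, the hex encoding and the function names below are assumptions for illustration, not what the draft specifies.

```python
import hashlib
import hmac
import secrets

def hash_state(state: str) -> str:
    """Hash the state value (SHA-256 and hex are assumed here)."""
    return hashlib.sha256(state.encode("utf-8")).hexdigest()

def callback_is_for_us(expected_state: str, returned_hash: str) -> bool:
    """Compare the hash on the callback with the hash of our own state."""
    expected = hash_state(expected_state).encode("ascii")
    # constant-time comparison of the two hash strings
    return hmac.compare_digest(expected, returned_hash.encode("utf-8"))

state = secrets.token_urlsafe(16)  # what the client sent in the back channel
print(callback_is_for_us(state, hash_state(state)))    # True
print(callback_is_for_us(state, hash_state("other")))  # False
```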


Right now it sends both the first handle and the second handle back to the transaction endpoint. In this example, the transaction handle is sent as a plain bearer value but the interact handle is a hash; it's not clear yet where the right balance between bearer values and hashing is.


The client also needs to continue to prove possession of its keys, so if the client had a JWK then it would need to include a JWS in this.


At that point, the AS looks up the transaction handle and figures out whether whatever the user did plus whatever the client did is enough to issue a token.


Question: Daniel Fett - you're using a state mechanism here, why not PKCE? (Debate about whether PKCE replaces the need for state in OAuth 2). 


The other interaction method written up and implemented here is the client says "I can get the user to interact but not directly, I can tell them a URL and tell them to punch in a code, but I can't send them there or get anything back", like the OAuth device flow.


The interaction URL you get back here is allowed to be static, and the response also includes a user code. While the user is using their secondary device, the client keeps calling the transaction endpoint with its handle and checks whether the user is done, just like OAuth device flow.


The response includes a "wait" value, and importantly, every time you get a response you get a new transaction handle. You never reuse a transaction handle. This is based on UMA 2 and on what some consider best practice for refresh tokens in OAuth 2.
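
The polling behaviour could be sketched like this. The `wait` value is named in the notes; the `handle` and `access_token` field names, the `send` callable standing in for an HTTP POST to the transaction endpoint, and the default delay are assumptions.

```python
import time

def poll_transaction(send, handle, sleep=time.sleep, max_attempts=10):
    """Poll the transaction endpoint until a token comes back.

    send: callable taking a request dict and returning the AS's
    response dict (a stand-in for the back-channel HTTP call).
    """
    for _ in range(max_attempts):
        response = send({"handle": handle})
        if "access_token" in response:
            return response
        # every response carries a new handle; the old one is never reused
        handle = response["handle"]
        sleep(response.get("wait", 5))
    raise TimeoutError("user did not finish the interaction in time")
```

Passing `sleep` in as a parameter is just a convenience so the loop can be exercised without real delays.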


Eventually the interaction ends with a token response. 


Tokens have the opportunity to be better defined here, but they aren't better defined yet, except by saying they can be bound to keys like how existing OAuth mechanisms work. Since we have methods for binding keys we can bind those to the token. We can still do bearer tokens, or tokens bound to any key the client has presented, or potentially to keys the server issues alongside the access token itself. For example the server can issue a key pair that it expects the client to use again.


It's important to have the capability to bind various kinds of keys at runtime to transaction requests and to access tokens. By binding them to transaction requests it also binds to all the sections in that request. Everything gets bound to keys, instead of to the vague notion of a "client" that OAuth has.


Refreshing tokens -- when you get an access token, if you still get a transaction handle along with the access token, that's the AS saying i'm going to remember all the decisions i made when I issued this access token. That can be used to get a new access token.
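
As a sketch of that refresh step, under the same assumptions as before (`handle` and `access_token` as field names, and `send` as a stand-in for the call to the transaction endpoint):

```python
def refresh_access_token(send, token_response):
    """Use the transaction handle, if one was returned, to get a new token.

    Returns None when the AS did not return a handle, meaning it is not
    remembering this transaction and the client has to start over.
    """
    handle = token_response.get("handle")
    if handle is None:
        return None
    return send({"handle": handle})

def fake_as(request):
    # stand-in for the AS: issues a new token and a new handle
    return {"access_token": "new-token", "handle": "tx-2"}

refreshed = refresh_access_token(fake_as, {"access_token": "old-token", "handle": "tx-1"})
print(refreshed["access_token"])  # prints new-token
```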


Question - 

George Fletcher - do you cover things like downscoping? or are you punting on that and saying go get an entirely new token? 
Justin - as soon as you're allowed to modify a transaction, even if it's downscoping, does that change the meaning of the handle?
George - if you don't do that, it means you expect clients to manage multiple handles
Justin - the response can already come back with a bunch of handles that represent various parts of the request, so maybe the client sends most of those back, but not sure. Leaning towards that being a separate token request like "this is what i have, give me something new"
David Waite - if you have tokens that are bound to the client, you might not need downscoping as much, since it means you already have the assurance of the client
Justin - one of the reasons we have downscoping in OAuth 2 is that it's relatively difficult to get a new token, so the idea was, you ask for as much as you think you need at the beginning, but then when you go get the access token you get only what you need at the time. In the real world, that's being pushed in the other direction, which is asking for a minimal set to get started and making it easy to upscope. 
Eve Maler - thinking of these transactions as things that can be referred to legally is great.
Some things that still need fleshing out - introspection, revocation. Those should be addressed here because OAuth punted on those and they've been bolted on after the fact.


Also interestingly, this gets rid of the mixup attack or the AS attack, because the answers are all coming from one endpoint. You get rid of a lot of the necessity of having a separate discovery document. I'm not convinced you could do all of this by just having one call yet but maybe? But I really like the idea of having an AS defined by this singular URL that the client and resource servers have to talk to. Nat tried to do this in OAuth/OIDC years ago with linked data models, this is a simpler way.


Natalie Nguyen - I don't quite have an understanding of all of the problems, so at face value it looks like you're taking a relatively unopinionated model like OAuth 2 and coming up with an opinionated model that doesn't allow the flexibility and extensibility that OAuth had. In doing so, what are the problems that you're solving?


Justin - check out my talk "What's wrong with OAuth 2". It's a small sampling of the answer to that question. Some of it is covered on the website. A lot of the problems are related to overuse of the front channel, and a lot of what's been developed is to cover up our use of the front channel.


Last IIW, Aaron and I were talking about an early version of this, and worked on a way to do this handle mechanism on top of OAuth 2. Aaron built a prototype of posting the initial request instead of putting it in the query string with minimal changes to his existing OAuth server.


The modularity in OAuth 2 is fantastic in a lot of ways, but in a lot of ways we guessed wrong about where the extension points are.


If you look at the examples on oauth.xyz, I would want lots of extensibility in the sections you see there to support things I haven't thought of. The core syntax and processing stays the same but the data model can be extended. The extensions become what to do with the data rather than how to pass data around.


If you look at PKCE, it fails "open" in some disturbing ways, so I want to avoid those kinds of things here.


Aaron - how do you plan for the unexpected extensions that may be needed to fix problems that will be found with this in the future?


Justin - having the request and response portions with a well defined core, and defining the extension mechanisms very clearly. In the real world, you come up with a great idea, like deprecating the implicit flow, and people light the internet on fire because people don't want to change. It's a tough question though and gets into standards definitions.