A Protocol for Decentralization – How Many Data Brokers Will We Need

From IIW



Tuesday 5F


Convener: Adrian Gropper


Notes-taker(s): Scott Mace


Tags for the session - technology discussed/ideas considered:


Web of Trust, DIDs, OAuth



Discussion notes, key understandings, outstanding questions, observations, and, if appropriate to this discussion: action items, next steps:


Background: In Rebooting Web of Trust, we are trying to figure out what the protocol issues are. There is some contention from other groups that want to see their particular architectures adopted. We’re all friends, and many have done work on the W3C data model side, but it is time for the rubber to hit the road.


Snorre, Orie Steele, Victor, and one other are tackling what an agent is, and what it means in our context to have decentralization or a protocol that promotes decentralization. 4 slides.


Here is my framing of the problem.


Alice or Bob’s DID, edge agent, cloud agent, MVP, EDV L3 – notification…, EDV L2 – sharing with other entities…, EDV L1 – persist, enforce… per J. Zittrain



EDV – encrypted data vault.



Our group working on minimum viable protocol. EDV wanted to define the endpoints. Wrote a 10-page paper still in draft form.


L2 can reencrypt the data.


In middle is a thing that looks like IP, everyone can use it, compete for the individual’s business.



My take on agent to storage protocol functionality (roughly in order of use)


Split into things that are Alice to Alice, or Alice to Bob



Where is data about me



California and Vermont have now required data brokers to register.


Questions.


Q: I really do like the way of don’t invent something. New words confuse things a lot. Bot. Hub. How do you define communication without having to confuse things too much?


Orie: I’m most drawn to MVP and cloud agent and edge agent in your diagram. We have this data format agreement issue, how do I talk to agents in a consistent manner. Attempt to define a messaging standard. At encryption envelope layer, not requiring HTTP. Until we see that demonstrated in any kind of way, it will be hard to talk about inter-agent interoperability.


Adrian: I just came from the Aries talk; you would have to fork the library to make it talk to UMA. Taking OAuth.xyz as the protocol that would not have to be invented from scratch to handle both Alice to Alice and Alice to Bob, do you see a gap between IETF OAuth.xyz and this?


Justin: The biggest gap is validation of key material. Say I have one instance of an SPA or mobile app and want to tie that to some distribution mechanism, a way to prefill, if not entirely convey, specific user information. XYZ has a space to put all of these things, but it does not have a definition of what they look like in a DID-facing world. XYZ is probably not where it should be defined. It should be in a general-purpose authorization or delegation protocol. Your client shows up and says: for my client info, it’s this DID; for my key info, it’s this DID. As long as the auth server understands those, you can have that kind of conversation.
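A minimal sketch of the exchange Justin describes, where the client names a DID for its client info and a DID for its key info, and the auth server only proceeds if it understands both. Everything here is an assumption for illustration: the field names (`client`, `key`, `did`), the helper names, and the set of supported DID methods are hypothetical and are not taken from XYZ or any DID specification.

```python
# Hypothetical set of DID methods this toy authorization server can resolve.
SUPPORTED_DID_METHODS = {"key", "web"}


def did_method(did: str) -> str:
    """Return the method name from a DID such as 'did:web:client.example'."""
    parts = did.split(":")
    if len(parts) < 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a DID: {did!r}")
    return parts[1]


def build_client_request(client_did: str, key_did: str) -> dict:
    """Client side: point at DIDs instead of sending client and key info inline."""
    return {"client": {"did": client_did}, "key": {"did": key_did}}


def server_understands(request: dict) -> bool:
    """Server side: continue only if both DIDs use a method the server supports."""
    dids = (request["client"]["did"], request["key"]["did"])
    return all(did_method(d) in SUPPORTED_DID_METHODS for d in dids)
```

For example, `server_understands(build_client_request("did:web:client.example", "did:key:z6MkExample"))` is true under these assumptions, while a request whose key uses an unsupported method is declined. Actual validation of the key material, the gap Justin names, is not attempted here.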


Adrian: Is there any competitive starting point for filling these gaps other than OAuth.xyz? Solving HTTP first is another discussion. Within the community as we understand it, is there anything better than OAuth.xyz to focus on?


Justin: It’s a project that I started as a means of addressing shortcomings of OAuth 2 in a consistent way. Not an extension. Not wire compatible with OAuth 2. By design. Takes a fundamentally different model. Borrows from OpenID Connect and UMA. The real value is that it brings these together in a way that they are consistently applied to the system. With OAuth 2 you can stack UMA, PKCE, and request objects, tie it all together with twine, and it will work, but it’s unwieldy at best. XYZ looks at what everybody is actually doing with OAuth and strips out some legacy decisions made in 2010 that are hindering us now. First step is intent registration. OAuth 2, authorization first. With XYZ, I can push in information, then figure out if I need to redirect the user or involve the user in any way. You end up with something different in a lot of ways. Borrows a lot of its structure from the UMA 2 protocol, which is an extension to OAuth 2 by design. Builds things in a way that when you’re starting off, you might not know whether you’re talking to Alice or Bob or Colin. As a client you’ll do things the same way every time.
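The intent-registration idea can be sketched roughly as follows: the client pushes what it wants first, and only afterward does the server decide whether a user has to be involved. This is a toy illustration, not the XYZ wire format. The function names, the `requires_consent` flag, the response fields, and the `as.example` URL are all hypothetical.

```python
import secrets


def start_transaction(request: dict) -> dict:
    """Toy authorization server: look at the pushed intent, then decide on interaction."""
    needs_user = any(r.get("requires_consent", False) for r in request["resources"])
    if needs_user:
        # The server asks the client to involve the user before any token is issued.
        return {
            "handle": secrets.token_urlsafe(16),
            "interact": {"redirect": "https://as.example/interact"},
        }
    # No user involvement needed: a token comes back directly.
    return {"access_token": secrets.token_urlsafe(16)}


def client_flow(resources: list) -> str:
    """Client: always starts the same way, then reacts to what the server says."""
    response = start_transaction({"resources": resources})
    if "interact" in response:
        return "redirect user to " + response["interact"]["redirect"]
    return "got token"
```

The client code path is identical in both cases, which matches the point that as a client you do things the same way every time; only the server's answer differs.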


Orie: Issue ID tokens?


Justin: ID assertions alongside access tokens. It’s HTTP. Not going to call it RESTful because it’s not. It’s a multitransactional HTTP protocol.


Adrian: I see this as inevitable. But it will raise the stakes. What do we mean by decentralization? We know OAuth works. There will be gaps filled somewhere else. My realization is that in order to have a protocol for decentralization, the protocol must handle Alice to Alice and Alice to Bob seamlessly. To some people, it’s a capabilities-based access protocol, not an identity-based protocol. That builds in the delegation benefits. Now, where the money is: I want to ask about the incentives for actually making this happen, for introducing this relatively simple Alice to Alice and Alice to Bob protocol. In the landscape, in healthcare and elsewhere, a lot of attention is being paid by hosts (Microsoft, etc.) coming to me and asking how do we create a good-guy data broker. Their term, not mine. The Equifax session asked the same thing. The question at hand, in this community, where we’re all vendors: a data broker or a service provider (who doesn’t need to do aggregation to be successful)… like Nebula Genomics, which does sequencing anonymously. We are not by definition going to aggregate your data. Won’t go to DMV or Facebook to mix that in. People sign up, Apple with HealthKit, aggregation happens somewhere else. Data brokers depend on aggregation. Not necessarily doing any value-added service. They’re subject to the data control of OAuth.xyz. Data brokers want to add ML, worth a lot more. The problem for our community is explaining the good-guy data broker. As long as they’re willing to resell data, that’s their business model. They have another characteristic. They promote something called decentralized governance. That means the governance of data brokerage looks like the Apple app store or the Android Play store. The governance behind those is centralized, maybe in the next generation, Salesforce, Google building a database of consent management to announce in October. We will have a lot more data brokers trying to introduce consent management. Who decides who decides, in the surveillance capitalism sense.
We have to have this idea of bundles of policies represented by these communities. Otherwise there is a race to centralize data in as few places as possible.


Orie: Need a cattle prod to force aggregators not to amass large data sets. Vermont and California hidden data broker laws.


Adrian: DIDs are just the beginning, but I don’t see any alternative. How many data brokers are we going to have? My answer is there have to be thousands of them, as many as we have communities, the way our doctors and lawyers compete for the individual’s business. People are asking, how do we label apps and services. Setting a low bar that says these people are not evil is not enough. What can Kantara do?


Colin: My sense is we will play a part, standardize something. It’s taking the basis of UMA and OpenID Connect, proactively, something the IETF asks us to take on board, to take charge of that part of the project.


Scott Mace: Rough consensus and running code is their mantra.


Colin: Yes.


Adrian: A more difficult question. There is room for something Kantara already does on the money-making side. You call it conformance, I call it the registry role in the good-guy data broker definition. I envision a handful of orgs like Patient Privacy Rights willing to standardize the nutrition label for this concept. CARIN Alliance. I’m asking that we have a separation of concerns between the people who define the label and the people who run the registries.


Colin: We could happily do that.


Adrian: Are you powerful enough to convene the good-guy data broker conversation at Kantara?


Colin: Probably not at this time. But things could change that. Mobile guys are asking us to play a similar role. We have a U.S. state asking us to help with their privacy extension. It doesn’t take many things to line up to have the political strength to do this.


Adrian: CARIN is closed, DIF and Hyperledger can’t do it. We need an open forum.


Colin: Many are vendor-play organizations. Kantara is not that. Liberty Alliance prompted an open response. Kantara uses the monetized side of its business to run the free side.