Building A Sovrin Linked Permissionless Ledger for Data Analytics
Tuesday 5C
Convener: Paul Knowles
Notes-taker(s): Paul Knowles
Discussion notes, key understandings, outstanding questions, observations, and, if appropriate to this discussion: action items, next steps:
The presentation outlined the potential advantages of building a Sovrin-linked permissionless ledger to house “Ghost Schemas”, GDPR-compliant versions of the original schemas. Any attributes constituting Sensitive Personal Identifying Information (SPII) would be omitted from the ghost versions at the time of schema creation using a sensitiveAttributes function. All non-sensitive data would be stored off-ledger in public read-only tables. The permissionless ecosystem could then be used to produce statistics and analytics on non-sensitive data.
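A minimal sketch of the ghost schema idea is given here for illustration. The notes name a sensitiveAttributes function but do not specify its signature, so the function name `make_ghost_schema`, the dictionary layout of a schema and the example attribute names are all assumptions, not part of the proposal or of any Sovrin API.

```python
from typing import Iterable


def make_ghost_schema(schema: dict, sensitive_attributes: Iterable[str]) -> dict:
    """Return a copy of `schema` with the sensitive attributes omitted.

    Hypothetical helper: the schema is assumed to be a dict with
    "name", "version" and "attributes" keys.
    """
    sensitive = set(sensitive_attributes)
    return {
        "name": schema["name"] + "-ghost",
        "version": schema["version"],
        # Keep the original attribute order, dropping anything flagged as SPII.
        "attributes": [a for a in schema["attributes"] if a not in sensitive],
    }


# Illustrative schema; the attribute names are invented for this example.
original = {
    "name": "patient-visit",
    "version": "1.0",
    "attributes": ["full_name", "date_of_birth", "clinic_region", "visit_type"],
}

ghost = make_ghost_schema(original, ["full_name", "date_of_birth"])
```

In this sketch the original schema is left untouched and the ghost version carries only `clinic_region` and `visit_type`, which is the subset that would be eligible for the public read-only tables.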
The only real concern was how to prevent triangulation on non-sensitive attributes, where several harmless-looking attributes are combined to re-identify an individual. Proposed solutions to prevent triangulation were (i.) to set an observation threshold so that tables were only published once they had grown to a certain pre-determined size, (ii.) to have a holding function so that tables were only published once the schema owners were content that they were suitably non-sensitive and safe for public consumption and (iii.) to let the community see when certain tables were being used together excessively to create analytics, as this could signal that hackers were looking at certain data patterns, and an investigation could then be triggered.
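The three safeguards above can be sketched roughly as follows. This is an illustration under stated assumptions only: the threshold of 1000 rows, the function names `can_publish` and `flag_joint_usage`, and the shape of the query log (one list of table names per analytics query) are hypothetical and were not specified in the session.

```python
from collections import Counter
from itertools import combinations


def can_publish(row_count: int, owner_approved: bool, min_rows: int = 1000) -> bool:
    """Safeguards (i.) and (ii.): publish only once the table is large
    enough AND the schema owner has released it from the holding state.
    The default threshold is an arbitrary placeholder."""
    return owner_approved and row_count >= min_rows


def flag_joint_usage(query_log, max_joint_uses: int):
    """Safeguard (iii.): return the table pairs that appear together in
    more than `max_joint_uses` queries, as candidates for investigation."""
    pair_counts = Counter()
    for tables in query_log:
        # Count each unordered pair once per query, ignoring duplicates.
        for pair in combinations(sorted(set(tables)), 2):
            pair_counts[pair] += 1
    return sorted(pair for pair, n in pair_counts.items() if n > max_joint_uses)


# Invented query log: each entry lists the tables one query combined.
log = [
    ["clinic_region", "visit_type"],
    ["visit_type", "clinic_region"],
    ["clinic_region", "age_band"],
]
suspicious = flag_joint_usage(log, max_joint_uses=1)
```

Here `suspicious` holds the single pair `("clinic_region", "visit_type")`, since it is the only combination queried more than once. What counts as excessive joint use would be a policy decision for the community rather than a fixed number.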
In conversation, Timothy Ruff from Evernym suggested that all of the functionality could, in fact, be built into the Sovrin framework without the need to create a separate linked ledger. That set the ball rolling in a different direction and we are now looking at how to go about that implementation.
On the last day of the workshop, Nathan George, Matthew Hailstone and Paul Knowles had a brainstorming session and decided that, rather than introducing “ghost schemas” into the mix, “overlaid schemas” would be a better solution. Like acetate sheets laid over the original, these overlays would enable sensitive attributes to be flagged throughout the schema lifecycle without having to strip them out.
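One way to picture the overlay approach is sketched below. The structure of an overlay was not defined in the session, so the `sensitive_attributes` key, the helper names `apply_overlay` and `public_view`, and the sample data are assumptions made purely for illustration.

```python
def apply_overlay(schema: dict, overlay: dict) -> list:
    """Combine a base schema with an overlay that flags sensitive
    attributes. The base schema keeps every attribute; the overlay
    only adds a flag. (Hypothetical structure.)"""
    flagged = set(overlay["sensitive_attributes"])
    return [{"name": a, "sensitive": a in flagged} for a in schema["attributes"]]


def public_view(record: dict, schema: dict, overlay: dict) -> dict:
    """Return only the non-sensitive fields of a record, using the flags
    from the overlay instead of a separate stripped-down schema."""
    return {
        field["name"]: record[field["name"]]
        for field in apply_overlay(schema, overlay)
        if not field["sensitive"] and field["name"] in record
    }


# Invented example data.
base_schema = {"name": "patient-visit", "attributes": ["full_name", "clinic_region", "visit_type"]}
sensitivity_overlay = {"schema": "patient-visit", "sensitive_attributes": ["full_name"]}
record = {"full_name": "Alice Example", "clinic_region": "North", "visit_type": "check-up"}

flags = apply_overlay(base_schema, sensitivity_overlay)
shared = public_view(record, base_schema, sensitivity_overlay)
```

The design point in this sketch is that the base schema is never modified: the sensitivity flag travels alongside it, so any stage of the schema lifecycle can decide what to expose by consulting the overlay.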
A first draft of the new proposal can be downloaded via the following link …
Paul Knowles (data modelling expert) and Elizabeth Renieris (Global Policy Counsel, Evernym) will now be meeting in London on the evening of April 12th to further discuss the initial drafting of guidelines for the sensitiveAttributes function. The ultimate aim is to come up with a succinct set of rules on what should be deemed "sensitive". These rules would primarily be for the schema creators.
The final technical implementation could have wide-reaching benefits for every player in the communication chain. We believe that a model can be worked out that not only addresses all SPII concerns but also allows people to be paid for the use of their data.
The data analytics slant is the real game-changer. Imagine if all participants in a global subset of data can be paid when their data is used for statistics and analytics applications for societal, corporate and personal benefit. We’ll continue moving forward with the proposal as it most certainly has legs.