20A/ DID Document Representations (JSON-LD, JSON, CBOR, ...)

From IIW


Thursday 20A

Convener: Markus Sabadello

Notes-taker(s): Drummond Reed

Tags for the session - technology discussed/ideas considered:

DID (Decentralized Identifiers), JSON, JSON-LD, CBOR, abstract data model, common data model, data representation


Discussion notes, key understandings, outstanding questions, observations, and, if appropriate to this discussion: action items, next steps


Markus, one of the editors of the W3C DID Core Specification, presented a slide deck explaining how DID documents are represented, and the challenges of using an abstract data model.

LINK TO MARKUS’ PRESENTATION:

https://drive.google.com/file/d/1RBjubnmJXXlooIn3_ptiHzqCc7O4fIR6/

Chat from the session:

From Paul Bastian : Are we already in the middle of the session? Open Circle just ended

From Swapna Radha : Did it already start?

From David Huseby : hiya

From Ryan Faulkner : it is recording, hopefully you should be able to catch up with the first few minutes later if need be :)

From Orie Steele : Currently 100% of did spec registry extensions require JSON-LD… which leads to some interesting problems wrt the ADM translation complexity

From Orie Steele : “You could write code” famous last words :)

From Orie Steele : @Dave Huseby: https://github.com/w3c/did-core/issues/439

From Tom Jones : @orie that was the scariest answer I have ever heard - you have to write some code

From Paul DIetrich : Do DID docs still have optional signatures? Is that now based on the abstract model?

From Orie Steele : JSON-Only is going to be marked at risk and may be removed… its really not doing much but deleting an `@context` and allowing RDF nonconformance

From Nathan_George : I agree with Dave here, I cannot know about the foreign representations, I want a good enough handle on the ADM that I can live in my own fantasy land

From Tom Jones : if json only is at risk, I would suggest that DID is at risk

From Orie Steele : @Paul proofs have been removed from did core, but they can be applied with extensions ; @Tom I agree, its been very hard to get any real contribution to JSON-only ; Its essentially a representation that defines itself by not being JSON-LD :/

From Dave_McKay : @Dave https://www.w3.org/TR/vc-data-model/#extensibility

From Tom Jones : @orie - every single semantic representation that has ever been proposed for the web has failed - is that the future you want?

From Orie Steele : @Tom I think search engines work :) ; And I work with companies that use linked data every day :)

From Tom Jones : @orie - there is no special semantic for search - they accept local languages

From Orie Steele : @Tom, I invite you to learn more about schema.org and https://developers.google.com/knowledge-graph

From Tom Jones : @orie - I know lots of semantic federations - I know of no broad acceptance of semantics other than local languages

From Orie Steele : And also to join the W3C and help solve this through standards participation :)

From Tom Jones : @orie - I was an early member of the discussion group and was driven out by manu ; over semantics

From David Huseby : and he just ran into why JSON-LD is a problem

From Orie Steele : You can’t be “driven out”… there is a formal process… you are welcome to join the discussion and help fix this :)

From Andrew Whitehead : abstract partial data model

From PhilWolff : backward interop?

From Tom Jones : I was accused by manu of being a troll because I would not drink the kool-aid

From Orie Steele : Here is a cool example of how you can support multiple representations from the same did method… ; https://did.key.transmute.industries

From drummondreed : It’s an abstract data model that’s only interoperable to the extent that extensions are registered in the DID Spec Registries

From Orie Steele : ^ which requires JSON-LD for registration :) ; Which is why its not really “RDF-free”

From Gabe Cohen : Continues to be interoperable as long as you support LD…which is not interoperable

From David Huseby : ^LOL

From Orie Steele : JSON is interoperable but JSON-LD is not? Makes no sense

From drummondreed : Just to be clear, the requirement for JSON-LD is just to describe any extension so it can be rendered in JSON-LD. It doesn’t mean you have to produce a JSON-LD representation.

From Gabe Cohen : It’s like comparing java and spring boot, the difference is clear

From Orie Steele : X-Headers are also deprecated ; lol ; We call this the “preserve by default” ; And it has almost universal adoption

From David Huseby : +1 @Orie

From Gabe Cohen : +1.2

From Paul DIetrich : Is the list of supported representations stewarded by a standards process, or is this a free for all like DID methods?

From Orie Steele : Jonathan holt is probably the only person who has contributed to a representation other than JSON-LD ; I’ve seen no other contribution to did spec registries, so if I seem frustrated by this….

From Gabe Cohen : Workday uses plain json…

From Orie Steele : Its because a lot of folks like to talk about PRs instead of write them :)

From Paul DIetrich : but does that have to go through a standards process?

From Paul Bastian : do you have a link for that proposal?

From Orie Steele : https://github.com/w3c/did-spec-registries/pulls ; https://github.com/w3c/did-core/pulls

From PhilWolff : has anyone been using ML or NLP toward understanding semantics of unknown endpoints?

From Orie Steele : Dude, I want waffles

From Andrew Whitehead : Yeah I hope he made enough for everybody

From Paul Bastian : Can someone summarize the core motivation for other formats other than json-ld?

From Orie Steele : The hope was that they would be safer or smaller

From Paul Bastian : except for “this is my favorite format” / saving some bytes

From Orie Steele : Neither is true today

From Gabe Cohen : The motivation is the desire to not touch JSON LD

From Orie Steele : JSON-only actually currently normatively supports prototype pollution ; Which grants ACE ; lol ; A lot of talk about security, but then a “preserve unknown properties by default” approach :)

From Paul Bastian : @Orie in which way is json-ld or the parsers unsafe?

From Markus Sabadello : Yes, there was a long discussion some months ago in the DID WG about not requiring JSON-LD, which is one of the reasons why this design was adopted

From Orie Steele : And no real contribution to shaking it out ; @Paul I don’t think it is ; Thats my point :) ; Not safer, and not smaller

From Brent Shambaugh : I jumped in late. What is an "abstract data model" ?

From drummondreed : That’s helpful, Dave, thanks

From Orie Steele : Dave just quit and help us fix this!

From jonathan holt : The Abstract Data Model is a bit too abstract for my taste, hence why I’m leveraging CDDL to make it more concrete.

From Orie Steele : Yep, json sucks compared to CBOR….

From Orie Steele : Except its readable

From Brent Shambaugh : @Orie I pinged you about the world of Category Theory in twitter land. I wonder if Abstract Data Model could be represented by Dr. David Spivak's and Ryan Wisnesky's work on CQL (categorical databases) and FQL.

From drummondreed : The “tags” in JSON are—wait for it—JSON-LD context entries :-)

From Orie Steele : Lol lets use ASN1 ; Said no one ever

From Paul Bastian : ASN1 is cool :D

From Orie Steele : And we should listen to the guy who wants to use ASN1 ? ; lol

From Colin Jaccino : +1 to abstract data model. the model is the source of truth. Other formats are merely representations. The pattern is good architecture, but a bit more work. What format to use for the canonical model is what I'm less clear about

From Shigeya Suzuki : (I don't want to use ASN.1 ... :D)

From Paul Bastian : You had OIDs

From David Huseby : HDF5 is superior to everything else but it solves the data archiving problem above all else

From Paul Bastian : OIDs make internal structures totally clear in ASN1

From David Huseby : and isn’t ideal for high speed and scalable systems

From Orie Steele : IMO CBOR could be massively better than JSON ; But requires a lot more help, can’t have Jonathan solo it :)

From Colin Jaccino : OWL or RDFa?

From Orie Steele : +1 to what mike is saying, as a did method, you are responsible for your representation…. The problem is that somehow we got stuck trying to get other people to care about “your representation” ; Yep, data sanitization and input validation are required regardless of representation ; And you have a combinatorics sanitization problem if you have unbounded representations ;)

From Dave_McKay : Yup, always need code that understands the context. We need that for innovation.

From Orie Steele : Ahhhh drummond ; please ; stop ; :)

From David Huseby : :) ; tags are not just for types but also field ordering so that digital signatures are easy to construct and validate

From drummondreed : Good point about ordering. The representation specs are responsible for that.

From David Huseby : json and stepchildren don’t possess any notion of ordering. so they rely on canonicalization algorithms

From Orie Steele : Its not guaranteed unless you write code that comprehends the registry ; Which I doubt you will be doing :) ; Because its complicated

From David Huseby : canonicalization algorithms are an endless source of interop bugs

From Orie Steele : I like stable content identifiers… I use IPFS a lot :)

From David Huseby : +1 @Orie

From jonathan holt : One security reason for not allowing preservation of “unknown” properties is that it may prevent processing of nefarious code, such as a buffer overflow. The way we handled this in HL7 was to create a property called “extension”, and in that field you define your extension.

From Orie Steele : It adds work, that nobody does :)

From Swapna Radha : Looking to use IPFS in one of the projects.. is there any downside of using IPFS as content identifiers

From Orie Steele : Just be careful that your objects are canonicalized before uploading ; Or you will get different identifiers for the same object :)
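
A minimal sketch of the canonicalization point above, assuming a simplified canonical form (sorted keys, no insignificant whitespace) and a plain SHA-256 digest as a stand-in for a real IPFS content identifier. Neither the helper names nor the example DID values come from the session, and this is not a full implementation of any canonicalization standard such as RFC 8785.

```python
import hashlib
import json


def canonical_bytes(doc: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace (simplified canonical form)."""
    return json.dumps(
        doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def content_digest(data: bytes) -> str:
    """SHA-256 hex digest, used here only as a stand-in for an IPFS CID."""
    return hashlib.sha256(data).hexdigest()


# The same logical document, with keys written in a different order
# (hypothetical example values).
doc_a = {"id": "did:example:123", "controller": "did:example:456"}
doc_b = {"controller": "did:example:456", "id": "did:example:123"}

# Without canonicalization the serialized bytes differ, so the identifiers differ.
naive_a = content_digest(json.dumps(doc_a).encode("utf-8"))
naive_b = content_digest(json.dumps(doc_b).encode("utf-8"))

# With canonicalization both documents serialize to the same bytes.
canon_a = content_digest(canonical_bytes(doc_a))
canon_b = content_digest(canonical_bytes(doc_b))

print("naive identifiers match:", naive_a == naive_b)
print("canonical identifiers match:", canon_a == canon_b)
```

The design choice is simply to hash a deterministic serialization rather than whatever bytes the producer happened to emit.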

From Paul Bastian : Comparison: is anybody in favor of adding 5 more proof formats?

From Orie Steele : -1

From Swapna Radha : Is there any example of IPFS usage other than w3c? I mean in an example use case..

From Orie Steele : TL;DR I prefer to ask for a representation from a did method, as opposed to asking for someone to translate a representation for me… telephone game applies.

From Orie Steele : Better to always ask the did method directly, when you can ; {accept: application/did+json}
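
A minimal sketch of asking for a specific representation through the Accept header mentioned above. The resolver URL and the helper name are hypothetical, and the request is only built here, not sent. `application/did+json` is the media type quoted in the chat; `application/did+ld+json` is shown as an assumed name for the JSON-LD representation.

```python
import urllib.request

# Hypothetical resolver endpoint; not taken from the session.
RESOLVER_URL = "https://resolver.example/1.0/identifiers/"


def build_resolution_request(
    did: str, media_type: str = "application/did+json"
) -> urllib.request.Request:
    """Build (but do not send) a request that asks for one representation."""
    return urllib.request.Request(
        RESOLVER_URL + did, headers={"Accept": media_type}
    )


# Ask for the plain JSON representation.
req = build_resolution_request("did:example:123")

# Ask for a different representation of the same DID (assumed media type name).
req_ld = build_resolution_request("did:example:123", "application/did+ld+json")

print(req.full_url, req.get_header("Accept"))
print(req_ld.full_url, req_ld.get_header("Accept"))
```

Whether a given resolver or did method honors the header is up to that implementation; the sketch only shows the client side of the request.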

From Gabe Cohen : If you get a did you care about translating you will translate

From Orie Steele : I’m unlikely to want to translate a did, from a method provider that couldn’t figure out how to support my requested representation, but of course, asking the client to translate is always an option.

From jonathan holt : I was still hoping to get feedback regarding using CDDL from y’all. Any thoughts?

From Orie Steele : The client might also drop properties / sanitize injection attacks, or insert additional keys :) ; You would now need to trust not only the method, but every translator as well ; There is no “general purpose” integrity protection

From jonathan holt : Yep, +1 to Orie.

From Orie Steele : If you want to prevent translation, you might always sign it ; But a translator can always drop your signature :) ; Its not true, what is being said

From Nathan_George : The signature type specifies it not the serialization ; The signature type may require a particular representation as input for signing

From Orie Steele : There is no DID Core document signing ; Normatively defined ; Its 100% a did method specific thing

From Nathan_George : (I’m drawing on the VC approach, apologies)

From Orie Steele : Many did methods have no support for this.

From JC Ebersbach : ^^

From Nathan_George : Some methods sign some don’t and rely on the consensus component of the storage repo

From Orie Steele : There is not “integrity” across translation ; Its a trusted process… you have to trust the software or operator doing the translation ; And it will almost 100% break any integrity protection the controller may have applied

From JC Ebersbach : the resolving process is already requiring lots of trust, even without translation

From Orie Steele : Exactly, but now you will be trusting resolvers to both resolve and translate.

From PhilWolff : FIVE MINUTE WARNING. NEXT SESSION STARTING.

From JC Ebersbach : well, if I use a resolver then I can also trust it for translation, don't you think?

From Orie Steele : Depends on who wrote the code :)

From Nathan_George : Terry —> very much this ; (I agree)

From Gabe Cohen : +1 provide in the format it’s stored in. Translate at ends where necessary

From Orie Steele : But yes, resolvers are generally a trusted service

From Michael Jones : The next session starts in two minutes

From jonathan holt : I think it is to re-use your favorite processing library, i.e. JSON-LD

From JC Ebersbach : a solution could be to ask multiple resolvers/translators to establish more trust

From Orie Steele : ^yep thats one solution ; If you don’t trust the resolver, compare resolutions ; Finally YAML!
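
A minimal sketch of the "compare resolutions" idea above, assuming each resolver's result has already been parsed into a dict. The resolver names, the example documents, and the `compare_resolutions` helper are hypothetical; comparison uses a simplified sorted-key serialization rather than any standardized canonicalization.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple


def fingerprint(doc: dict) -> str:
    """Order-independent serialization used only for comparison."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":"))


def compare_resolutions(results: Dict[str, dict]) -> Tuple[bool, List[str]]:
    """Return (all_agree, names of resolvers that differ from the majority)."""
    if not results:
        raise ValueError("need at least one resolution result")
    prints = {name: fingerprint(doc) for name, doc in results.items()}
    majority, _count = Counter(prints.values()).most_common(1)[0]
    outliers = sorted(name for name, fp in prints.items() if fp != majority)
    return (not outliers, outliers)


# Hypothetical results: resolver-c dropped a property during translation.
results = {
    "resolver-a": {"id": "did:example:123", "controller": "did:example:456"},
    "resolver-b": {"controller": "did:example:456", "id": "did:example:123"},
    "resolver-c": {"id": "did:example:123"},
}

all_agree, outliers = compare_resolutions(results)
print("all agree:", all_agree, "outliers:", outliers)
```

Agreement among resolvers only shows that they returned the same document; it does not by itself prove the document is what the controller published.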

From Paul Bastian : who needs all these different formats, after parsing nobody cares anymore

From Paul DIetrich : Thanks Markus. Heading a quick break before the next session.

From JC Ebersbach : thanks a lot