20A/ DID Document Representations (JSON-LD, JSON, CBOR, ...)
Thursday 20A
Convener: Markus Sabadello
Notes-taker(s): Drummond Reed
Tags for the session - technology discussed/ideas considered:
DID (Decentralized Identifiers), JSON, JSON-LD, CBOR, abstract data model, common
data model, data representation
Discussion notes, key understandings, outstanding questions, observations, and, if appropriate to this discussion: action items, next steps:
Markus, one of the editors of the W3C DID Core Specification, presented a slide deck explaining how DID documents are represented, and the challenges of using an abstract data model.
LINK TO MARKUS’ PRESENTATION:
https://drive.google.com/file/d/1RBjubnmJXXlooIn3_ptiHzqCc7O4fIR6/
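As a rough illustration of the idea above (one abstract data model, several concrete representations), here is a minimal Python sketch. It follows the characterization given later in the chat, that the JSON-only representation differs from JSON-LD mainly by leaving out `@context`; it is not the normative production rules from DID Core. The function names and the sample document are hypothetical, and the context URL is assumed to be the DID v1 context.

```python
import json

# Assumption: the DID v1 JSON-LD context URL.
DID_CONTEXT = "https://www.w3.org/ns/did/v1"


def produce_json(adm: dict) -> str:
    """Serialize the abstract model as plain JSON, leaving out any @context."""
    return json.dumps({k: v for k, v in adm.items() if k != "@context"})


def produce_json_ld(adm: dict) -> str:
    """Serialize the abstract model as JSON-LD, with @context as the first member."""
    doc = {"@context": DID_CONTEXT}
    doc.update({k: v for k, v in adm.items() if k != "@context"})
    return json.dumps(doc)


# Hypothetical abstract data model entries, held as a plain dict.
adm = {"id": "did:example:123", "verificationMethod": []}

print(produce_json(adm))
print(produce_json_ld(adm))
```

Both outputs carry the same properties; only the JSON-LD form adds the context that lets a linked-data processor interpret them.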
Chat from the session:
From Paul Bastian : Are we already in the middle of the session? Open Circle just ended
From Swapna Radha : Did it already start?
From David Huseby : hiya
From Ryan Faulkner : it is recording, hopefully you should be able to catch up with the first few minutes later if need be :)
From Orie Steele : Currently 100% of did spec registry extensions require JSON-LD… which leads to some interesting problems wrt the ADM translation complexity
From Orie Steele : “You could write code” famous last words :)
From Orie Steele : @Dave Huseby: https://github.com/w3c/did-core/issues/439
From Tom Jones : @orie that was the scariest answer I have ever heard - you have to write some code
From Paul DIetrich : Do DID docs still have optional signatures? Is that now based on the abstract model?
From Orie Steele : JSON-Only is going to be marked at risk and may be removed… it's really not doing much but deleting an `@context` and allowing RDF nonconformance
From Nathan_George : I agree with Dave here, I cannot know about the foreign representations, I want a good enough handle on the ADM that I can live in my own fantasy land
From Tom Jones : if json only is at risk, I would suggest that DID is at risk
From Orie Steele : @Paul proofs have been removed from did core, but they can be applied with extensions ; @Tom I agree, it's been very hard to get any real contribution to JSON-only ; It's essentially a representation that defines itself by not being JSON-LD :/
From Dave_McKay : @Dave https://www.w3.org/TR/vc-data-model/#extensibility
From Tom Jones : @orie - every single semantic representation that has ever been proposed for the web has failed - is that the future you want?
From Orie Steele : @Tom I think search engines work :) ; And I work with companies that use linked data every day :)
From Tom Jones : @orie - there is no special semantic for search - they accept local languages
From Orie Steele : @Tom, I invite you to learn more about schema.org and https://developers.google.com/knowledge-graph
From Tom Jones : @orie - I know lots of semantic federations - I know of no broad acceptance of semantics other than local languages
From Orie Steele : And also to join the W3C and help solve this through standards participation :)
From Tom Jones : @orie - I was an early member of the discussion group and was driven out by manu ; over semantics
From David Huseby : and he just ran into why JSON-LD is a problem
From Orie Steele : You can’t be “driven out”… there is a formal process… you are welcome to join the discussion and help fix this :)
From Andrew Whitehead : abstract partial data model
From PhilWolff : backward interop?
From Tom Jones : I was accused by manu of being a troll because I would not drink the kool-aid
From Orie Steele : Here is a cool example of how you can support multiple representations from the same did method… ; Https://did.key.transmute.industries
From drummondreed : It’s an abstract data model that’s only interoperable to the extent that extensions are registered in the DID Spec Registries
From Orie Steele : ^ which requires JSON-LD for registration :) ; Which is why it's not really “RDF-free”
From Gabe Cohen : Continues to be interoperable as long as you support LD…which is not interoperable
From David Huseby : ^LOL
From Orie Steele : JSON is interoperable but JSON-LD is not? Makes no sense
From drummondreed : Just to be clear, the requirement for JSON-LD is just to describe any extension so it can be rendered in JSON-LD. It doesn’t mean you have to produce a JSON-LD representation.
From Gabe Cohen : It’s like comparing java and spring boot, the difference is clear
From Orie Steele : X-Headers are also deprecated ; lol ; We call this the “preserve by default” ; And it has almost universal adoption
From David Huseby : +1 @Orie
From Gabe Cohen : +1.2
From Paul DIetrich : Is the list of supported representations stewarded by a standards process, or is this a free for all like DID methods?
From Orie Steele : Jonathan holt is probably the only person who has contributed to a representation other than JSON-LD ; I’ve seen no other contribution to did spec registries, so if I seem frustrated by this….
From Gabe Cohen : Workday uses plain json…
From Orie Steele : Its because a lot of folks like to talk about PRs instead of write them :)
From Paul DIetrich : but does that have to go through a standards process?
From Paul Bastian : do you have a link for that proposal?
From Orie Steele : https://github.com/w3c/did-spec-registries/pulls ; https://github.com/w3c/did-core/pulls
From PhilWolff : has anyone been using ML or NLP toward understanding semantics of unknown endpoints?
From Orie Steele : Dude, I want waffles
From Andrew Whitehead : Yeah I hope he made enough for everybody
From Paul Bastian : Can someone summarize the core motivation for other formats other than json-ld?
From Orie Steele : The hope was that they would be safer or smaller
From Paul Bastian : except from this is my favorite format/saving some bytes
From Orie Steele : Neither is true today
From Gabe Cohen : The motivation is the desire to not touch JSON LD
From Orie Steele : JSON-only actually currently normatively supports prototype pollution ; Which grants ACE ; lol ; A lot of talk about security, but then a “preserve unknown properties by default” approach :)
From Paul Bastian : @Orie in which way is json-ld or the parsers unsafe?
From Markus Sabadello : Yes, there was a long discussion some months ago in the DID WG about not requiring JSON-LD, which is one of the reasons why this design was adopted
From Orie Steele : And no real contribution to shaking it out ; @Paul I don’t think it is ; Thats my point :) ; Not safer, and not smaller
From Brent Shambaugh : I jumped in late. What is an "abstract data model" ?
From drummondreed : That’s helpful, Dave, thanks
From Orie Steele : Dave just quit and help us fix this!
From jonathan holt : The Abstract Data Model is a bit too abstract for my taste, hence why I’m leveraging CDDL to make it more concrete.
From Orie Steele : Yep, json sucks compared to CBOR….
From Orie Steele : Except its readable
From Brent Shambaugh : @Orie I pinged you about the world of Category Theory in twitter land. I wonder if Abstract Data Model could be represented by Dr. David Spivak's and Ryan Wisnesky's work on CQL (categorical databases) and FQL.
From drummondreed : The “tags” in JSON are—wait for it—JSON-LD context entries :-)
From Orie Steele : Lol lets use ASN1 ; Said no one ever
From Paul Bastian : ASN1 is cool :D
From Orie Steele : And we should listen to the guy who wants to use ASN1 ? ; lol
From Colin Jaccino : +1 to abstract data model. the model is the source of truth. Other formats are merely representations. The pattern is good architecture, but a bit more work. What format to use for the canonical model is what I'm less clear about
From Shigeya Suzuki : (I don't want to use ASN.1 ... :D)
From Paul Bastian : You had OIDs
From David Huseby : HDF5 is superior to everything else but it solves the data archiving problem above all else
From Paul Bastian : OIDs make internal structures totally clear in ASN1
From David Huseby : and isn’t ideal for high speed and scalable systems
From Orie Steele : IMO CBOR could be massively better than JSON ; But requires a lot more help, can have Jonathan solo it :) ; Can’t *
From Colin Jaccino : OWL or RDFa?
From Orie Steele : +1 to what mike is saying, as a did method, you are responsible for your representation…. The problem is that somehow we got stuck trying to get other people to care about “your representation” ; Yep, data sanitization and input validation are required regardless of representation ; And you have a combinatorics sanitization problem if you have unbounded representations ;)
From Dave_McKay : Yup, always need code that understands the context. We need that for innovation.
From Orie Steele : Ahhhh drummond ; please ; stop ; :)
From David Huseby : :) ; tags are not just for types but also field ordering so that digital signatures are easy to construct and validate
From drummondreed : Good point about ordering. The representation specs are responsible for that.
From David Huseby : json and stepchildren don’t possess any notion of ordering. so they rely on canonicalization algorithms
From Orie Steele : Its not guaranteed unless you write code that comprehends the registry ; Which I doubt you will be doing :) ; Because its complicated
From David Huseby : canonicalization algorithms are an endless source of interop bugs
From Orie Steele : I like stable content identifiers… I use IPFS a lot :)
From David Huseby : +1 @Orie
From jonathan holt : One security concern for not allowing preservation of “unknown” properties is that it may prevent processing of nefarious code, such as a buffer overflow. The way we handled this in HL7 was create a property called “extension” and in that field you define your extension.
From Orie Steele : It adds work, that nobody does :)
From Swapna Radha : Looking to use IPFS in one of the projects.. is there any downside of using IPFS as content identifiers
From Orie Steele : Just be careful that your objects are canonicalized before uploading ; Or you will get different identifiers for the same object :)
From Paul Bastian : Comparison: is anybody in favor of adding 5 more proof formats?
From Orie Steele : -1
From Swapna Radha : Is there any example of IPFS usage other than w3c? I mean in an example use case..
From Orie Steele : TL;DR I prefer to ask for a representation from a did method, as opposed to asking for someone to translate a representation for me… telephone game applies. ; Better to always ask the did method directly, when you can ; {accept: application/did+json}
From Gabe Cohen : If you get a did you care about translating you will translate
From Orie Steele : I’m unlikely to want to translate a did, from a method provider that couldn’t figure out how to support my requested representation, but of course, asking the client to translate is always an option.
From jonathan holt : I was still hoping to get feedback regarding using CDDL from y’all. Any thoughts?
From Orie Steele : The client might also drop properties / sanitize injection attacks, or insert additional keys :) ; You would now need to trust not only the method, but every translator as well ; There is no “general purpose” integrity protection
From jonathan holt : Yep, +1 to Orie.
From Orie Steele : If you want to prevent translation, you might always sign it ; But a translator can always drop your signature :) ; Its not true, what is being said
From Nathan_George : The signature type specifies it not the serialization ; The signature type may require a particular representation as input for signing
From Orie Steele : There is no DID Core document signing ; Normatively defined ; Its 100% a did method specific thing
From Nathan_George : (I’m drawing on the VC approach, apologies)
From Orie Steele : Many did methods have no support for this.
From JC Ebersbach : ^^
From Nathan_George : Some methods sign some don’t and rely on the consensus component of the storage repo
From Orie Steele : There is not “integrity” across translation ; Its a trusted process… you have to trust the software or operator doing the translation ; And it will almost 100% break any integrity protection the controller may have applied
From JC Ebersbach : the resolving process is already requiring lots of trust, even without translation
From Orie Steele : Exactly, but now you will be trusting resolvers to both resolve and translate.
From PhilWolff : FIVE MINUTE WARNING. NEXT SESSION STARTING.
From JC Ebersbach : well, if I use a resolver then I can also trust it for translation, don't you think?
From Orie Steele : Depends on who wrote the code :)
From Nathan_George : Terry —> very much this ; (I agree)
From Gabe Cohen : +1 provide in the format it’s stored in. Translate at ends where necessary
From Orie Steele : But yes, resolvers are generally a trusted service
From Michael Jones : The next session starts in two minutes
From jonathan holt : I think it is to re-use your favorite processing library, i.e. JSON-LD
From JC Ebersbach : a solution could be to ask multiple resolvers/translators to establish more trust
From Orie Steele : ^yep thats one solution ; If you don’t trust the resolver, compare resolutions ; Finally YAML!
From Paul Bastian : who needs all these different formats, after parsing nobody cares anymore
From Paul DIetrich : Thanks Markus. Heading a quick break before the next session.
From JC Ebersbach : thanks a lot
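Two points from the chat lend themselves to a small sketch: JSON has no notion of member ordering, so a content identifier is only stable if the object is canonicalized first, and one way to gain confidence in a resolver is to compare resolutions from several of them. The Python below is a simplified stand-in under stated assumptions. Sorting keys and hashing with SHA-256 is not a full canonicalization scheme and does not produce an IPFS identifier, and all function names and sample documents are hypothetical.

```python
import hashlib
import json


def canonical_bytes(doc: dict) -> bytes:
    """Simplified canonical form: sorted keys, no insignificant whitespace."""
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest(doc: dict, canonicalize: bool = True) -> str:
    """SHA-256 over the serialized document, used here as a stand-in content identifier."""
    data = canonical_bytes(doc) if canonicalize else json.dumps(doc).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def resolutions_agree(resolutions: list) -> bool:
    """True if every resolver returned the same document after canonicalization."""
    return len({digest(r) for r in resolutions}) == 1


# The same object with members in a different order.
a = {"id": "did:example:123", "controller": "did:example:456"}
b = {"controller": "did:example:456", "id": "did:example:123"}

print(digest(a, canonicalize=False) == digest(b, canonicalize=False))  # identifiers differ
print(digest(a) == digest(b))  # identifiers match once canonicalized
print(resolutions_agree([a, b]))
```

Agreement between resolvers only shows that they returned the same data; as noted in the chat, it does not by itself restore any integrity protection the controller applied before translation.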