Fundamental Problems of Distributed Systems (2/3)
Convener: Dave Huseby
Notes-taker(s): Cam Geer
Tags for the session - technology discussed/ideas considered: Distributed systems, decentralization, collaboration
Discussion notes, key understandings, outstanding questions, observations, and, if appropriate to this discussion: action items, next steps:
Link to slide deck presented in session, provided by Dave Huseby: http://docs.google.com/presentation/d/1C78Wq3vw1W1VJFVAps0KIr3EWPUSg_Xl306sB5Va7_0/edit?usp=sharing
- Discovery: How nodes find one another on the network
- Introduction: How nodes establish secure communications and exchange credentials.
- Coherence: How nodes re-connect in a mobile and frequently disconnected world.
- Public Services: Functions that operate on a system-wide basis.
- Trust: Proving what, not who
- Privacy (anti correlation over time)
- Coordination: Are all communications handled in the system?
- Membership: Managing access to services both private and public
- Persistent state: How does the whole system remember?
Here is the link to Konstantin’s blog post on a magical (non-existent) collaborative development tool: http://people.kernel.org/monsieuricon/patches-carved-into-developer-sigchains
I mentioned CCLang: http://crates.io/crates/cclang
Secure Scuttlebutt came up a lot too: http://ssbc.github.io/scuttlebutt-protocol-guide/
Dave’s view is that Decentralization is a spectrum and moves in "the direction in which User Sovereignty Increases"
More formal definition from Sam Smith:
Our definition of decentralization (centralization) is about control not spatial distribution. In our definition decentralized is not necessarily the same as distributed. By distributed we mean that activity happens at more than one site. Thus decentralization is about control and distribution is about place. To elaborate, when we refer to decentralized infrastructure we mean infrastructure under decentralized (centralized) control no matter its spatial distribution. Thus decentralized infrastructure is infrastructure sourced or controlled by more than one entity. Entities are not limited to natural persons but may include groups, organizations, software agents, things, and even data items. This control may lie on a scale from highly centralized to highly decentralized. A centralized administratively managed identity system may be under the control of a single governing organization. A governing organization might also be hierarchical in nature with multiple subordinate organizations that operate under the auspices of the next higher level organization. The associated operational infrastructure might itself be highly spatially distributed despite being under highly centralized control or vice-versa. For example although DNS is administered by a single organization, IANA, the operational infrastructure is distributed worldwide
Nine Problems of Distributed Systems (with discussion points)
- How new nodes discover another node to form/join a network
- one of the hardest problems
- How nodes establish secure communications and exchange credentials
- this is where SSI comes in
- How nodes re-connect in a mobile and frequently disconnected world
- context switching (mobile, laptop, etc.) changes IP addresses, which makes this challenging
- Q: is it deeper than the network protocol?
- Q: how do you do this without central servers?
- IPv6 could be a solution for this
- needs further discussion
- Public Services
- Functions that operate on a network-wide basis
- (e.g. search, query, etc)
- Proving what not who and anti-duplicity (thanks Sam!)
- Anti-correlation protection over time
- Are all communications handled within the system?
- secure and private communication must be within the system
- Membership: managing access to services both private (p2p) and public (p2s)
- Persistent State
- how does the system remember
- blockchains are not the be-all and end-all; they are just one method, and others can be considered
Potential Solutions for the Nine Problems of Distributed Systems
- digital dead drops
- distributed hash table with secure
- P2P invites over text / phone QR code
- BitcoinDB daemon for filtering / searching for payloads
- Something Better such as Noise/Mega-Olm but with DID/KER
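For the distributed hash table option above, a minimal sketch of how a lookup key can be mapped to the nodes responsible for it, using the XOR distance metric popularized by Kademlia. The session did not specify a DHT design; the function names and node names here are hypothetical, and a real DHT adds routing tables, iterative lookups, and protection against malicious nodes.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a 256-bit identifier from a name (hypothetical ID scheme)."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest(), "big")


def closest_nodes(key: str, known_nodes: list, count: int = 2) -> list:
    """Return the `count` known nodes whose IDs are closest to the key by XOR distance."""
    target = node_id(key)
    return sorted(known_nodes, key=lambda n: node_id(n) ^ target)[:count]


if __name__ == "__main__":
    peers = ["alice-laptop", "bob-phone", "carol-server", "dave-tablet"]
    # Every node that knows the same peers picks the same holders for a key.
    print(closest_nodes("invite:dave", peers))
```

Because the ordering depends only on the hashes, any two nodes with the same view of the network agree on where a record should live without asking a central server.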
Last Known Whereabouts Protocol
- 5 am project to work without centralized server
- each node must be able to act as a proxy for every other node
- when the probability of connecting with one other node exceeds 50%, meta-stability materializes
- group of 10 friends to stay coherent
- “seem like” fixed central server
- fully coherent meta stable network
- open source projects need critical mass to become self sustaining
- sometimes social cohesion evaporates
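The 50% threshold above can be illustrated with a small calculation. This is only a sketch under a strong assumption that was not stated in the session: each peer is reachable at its last known address independently with the same probability. The function names and the example probability are hypothetical.

```python
def p_any_reachable(p_single: float, n_peers: int) -> float:
    """Probability that at least one of n_peers is reachable (independence assumed)."""
    return 1.0 - (1.0 - p_single) ** n_peers


def min_peers_for(threshold: float, p_single: float) -> int:
    """Smallest number of peers for which p_any_reachable exceeds the threshold."""
    if not 0.0 < p_single <= 1.0:
        raise ValueError("p_single must be in (0, 1]")
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    n = 1
    while p_any_reachable(p_single, n) <= threshold:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical: each friend is reachable 10% of the time.
    print(p_any_reachable(0.1, 10))
    print(min_peers_for(0.5, 0.1))
```

Under these assumptions, even peers that are individually offline most of the time can cross the threshold as a small group, which is consistent with the "group of 10 friends" note.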
Public Services Solution
- Bloom filters?
- Query flooding?
- No known good solution for public services without correlation
- could Verifiable Claims cover this?
- KERI (Sam’s work)?
- event receipt logs make full trust possible
- client-side encryption with crypto key escrow
- PayPub-like protocol for sharing via cryptocurrency transfer
- been discussing with Peter Todd
- potential for subscription based crypto economy?
- all communication through a secure link, routed through the mixnet
- key escrowing techniques from MegaOlm or Cryptree
- token based access controls as long as token issuance is not tracked centrally
- Dave & Mike Lauder have been discussing
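Bloom filters were raised above only as a question for public services such as search. A minimal sketch of the idea follows: a node publishes a compact, lossy summary of the terms it can answer, and others test membership against it without seeing the term list. The class, parameter values, and example terms are hypothetical; the notes already observe that no known solution avoids correlation, and this sketch does not address that.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


if __name__ == "__main__":
    summary = BloomFilter()
    summary.add("code patches")
    print(summary.might_contain("code patches"))  # always True once added
    print(summary.might_contain("tweets"))        # usually False, may be a false positive
```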
Persistent State Solution
- Node based storage with erasure coding redundancy for reliability
- Distributed ledger
- IPFS or Tahoe-LAFS?
- Dave’s comment: "IPFS — too webby"
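To make the erasure coding option above concrete, here is a sketch of the simplest possible scheme: split the data into k shards and add one XOR parity shard, so any single lost shard can be rebuilt. This is an illustration only; the session did not name a specific code, production systems typically use Reed-Solomon style codes that survive multiple losses, and all function names here are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal shards (zero padded) plus one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    return shards + [reduce(xor_bytes, shards)]


def recover(shards: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing shard (marked None) from the others."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost shard")
    repaired = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired


def join(shards: List[bytes], k: int, length: int) -> bytes:
    """Concatenate the k data shards and drop the padding."""
    return b"".join(shards[:k])[:length]


if __name__ == "__main__":
    data = b"persistent state"
    stored = encode(data, 4)  # imagine one shard per node
    stored[2] = None          # one node disappears
    print(join(recover(stored), 4, len(data)))
```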
Dave’s vision to build an ideal developer tool from scratch. What would it need?
- client-side by default
- no network required
- offline by default
- to support work in multiple modalities
- mobile by default
- Local hash-linked data structure container (ala SSB) with signatures in CCLang
- key value list in block header for higher level application (e.g. code patches, messages, file storage, key escrow, tweets, follows)
- Bitcoin OP_RETURN for discovery via dead drop payloads
- P2P discovery via secure text QR code
- Mixnet for IP masking and secure information retrieval for store-forward async signaling
- Last known whereabouts protocol for mixnet coherence and p2p-via-mixnet coherence
- must have meta stability
- needed for store and forward
- MegaOlm group key escrow for sharing coordination
- did:git-like project/community-based identity anchoring (provable hacker reputation)
- should be the norm
- preferred pronoun "Who?"
- Just works. Always in sync. Time and network agnostic.
- Replaces Github, JIRA, Mailing Lists, SSB, Web publishing
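The local hash-linked container with a key-value list in each block header, listed above, can be sketched as an append-only log in which every entry commits to the hash of the previous one. Several things here are assumptions made only to keep the sketch self-contained: HMAC-SHA256 with a local secret stands in for a real public-key signature, the field names (`prev`, `kv`, `content`) are hypothetical, and this is not the SSB message format or CCLang.

```python
import hashlib
import hmac
import json


def _encode(prev, kv, content) -> bytes:
    # Canonical encoding so the same entry always hashes the same way.
    return json.dumps({"prev": prev, "kv": kv, "content": content},
                      sort_keys=True).encode("utf-8")


def make_entry(prev_hash, kv: dict, content: str, key: bytes) -> dict:
    """Create an entry linked to prev_hash; kv is the header's key-value list."""
    encoded = _encode(prev_hash, kv, content)
    return {
        "prev": prev_hash,
        "kv": kv,
        "content": content,
        "hash": hashlib.sha256(encoded).hexdigest(),
        # Stand-in for a signature; a real system would sign with a private key.
        "sig": hmac.new(key, encoded, hashlib.sha256).hexdigest(),
    }


def verify_log(log: list, key: bytes) -> bool:
    """Check links, hashes, and authentication tags from the first entry onward."""
    prev = None
    for entry in log:
        encoded = _encode(entry["prev"], entry["kv"], entry["content"])
        expected_sig = hmac.new(key, encoded, hashlib.sha256).hexdigest()
        if entry["prev"] != prev:
            return False
        if hashlib.sha256(encoded).hexdigest() != entry["hash"]:
            return False
        if not hmac.compare_digest(expected_sig, entry["sig"]):
            return False
        prev = entry["hash"]
    return True


if __name__ == "__main__":
    secret = b"local-device-secret"  # hypothetical key material
    first = make_entry(None, {"type": "message"}, "hello", secret)
    second = make_entry(first["hash"], {"type": "code patch"}, "diff ...", secret)
    print(verify_log([first, second], secret))
```

Since verification needs only the log and the key material, it works offline, which fits the "no network required" and "offline by default" items in the list.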
Dave’s 5am Project Summary Statement
I suspect a fully user sovereign, fully decentralized system will work like magic. Automatic discovery, automatic network formation, automatic synchronization, persistent and resilient and secure storage with instantaneous sharing regardless of data size. Automatic synchronization and ubiquitous integration via standard protocols and data formats. No need for any other system to use it effectively.
Zoom Session Chat:
11:04:44 From Grace Rachmany : I can't take notes for this one because I need to leave early.
11:05:07 From Grace Rachmany : I could take notes for the first hour
11:06:11 From Grace Rachmany : Thanks Cam!
11:07:27 From Grace Rachmany : It's going to be hard to reach you after this if you aren't findable online...
11:08:34 From dsearls : I see the left end here as living in a feudal castle, and the right end as being a free Samurai, a ronin.
11:08:35 From Wendell Baker : He's on Microsoft's GitHub ... the search engines will find him
11:09:58 From Wendell Baker : @marc, no worries ... the data markets idea seems to inflame emotion in surprising ways. I was fishing to see if ... Something for a garden talk.
11:10:19 From johnnyfromcanada : IMO, decentralization is about control, and distribution is an implementation detail. Orthogonal - i.e., you can have any combination of both (some combinations having less obvious cost-benefit).
11:10:32 From dsearls : Maybe there's a 2x2 here, with centralization-decentralization on one axis and distribution-? on the other axis.
11:11:07 From dsearls : Is there an opposite of distributed, if not centralized? Perhaps aggregated?
11:12:15 From dsearls : Dave, can you take a phone-shot or something that produces a .png or a .jpg of that graphic you just held up? If so, attach it to the session notes.
11:12:19 From johnnyfromcanada : “Centralized” is common for both - ambiguous use of it perpetuates the confusion of the difference.
11:12:34 From Cam Geer : dave .. sounds like the organic evolution of trust
11:12:43 From johnnyfromcanada : Perhaps monolithic?
11:14:01 From Grace Rachmany : Discovery on Distributed Hash Tables seems to be working.
11:15:21 From Marc Davis : I would argue that “degree of self-sovereignty” is an orthogonal axis to “centralized<—>decentralized” architecture. IMHO, degree of centralization is an implementation question and degree of self-sovereignty is a rights/duties/contracts question. It may be easier to “enforce” and/or implement self-sovereignty in a decentralized architecture, but you could imagine a centralized architecture with self-sovereign rights and a decentralized architecture with no self-sovereign rights.
11:17:39 From Grace Rachmany : Biomimicry can be helpful in thinking about how these things happen in complex systems. Expanding our view beyond how it's done in computing architectures opens up additional possibilities.
11:19:18 From Wendell Baker : http://en.wikipedia.org/wiki/Mobile_IP
11:19:26 From MarkL. @smartopian : A Security Problem -
11:19:30 From dsearls : @johnnyfromcanada, I think the opposite of monolithic is polylithic.
11:20:47 From Grace Rachmany : In some ways public services need to be built as separate apps that use types of polling and collection rather than tapping into a central service.
11:22:17 From johnnyfromcanada : @dsearls - Indeed, I am trying to brainstorm ;-) You like etymology (per “anonymous”). So perhaps “tributed”?
11:22:25 From johnnyfromcanada : http://www.dictionary.com/browse/distribute
11:23:22 From Elias Strehle : Is it possible to handle ALL communication within the system? What about communication that relates to the system itself (like Bitcoin's Improvement Proposals)?
11:23:50 From Grace Rachmany : I guess it depends on how you define "the system"
11:24:20 From Grace Rachmany : There isn't some intrinsic reason it shouldn't be possible within the system.
11:25:24 From dsearls : could be that "distributed" is the wrong word, inherited from Paul Baran in 1964: https://www.researchgate.net/figure/Centralized-decentralized-and-distributed-network-models-by-Paul-Baran-1964-part-of-a_fig1_260480880
11:26:05 From dsearls : It also carries familiar centralized assumptions, such as that a distributor is required for distribution.
11:26:11 From Michael Graybeal : “Voltron Effect” - a technical term
11:26:20 From johnnyfromcanada : Dfinity is establishing an Internet Computer, which should eliminate many of such problems, at least from a compute & storage perspective.
11:26:34 From Marc Davis : Language question: doesn’t “user sovereignty” still frame the problem in terms that privilege the system vs. the person? Framing “persons” as “users” seems to imply a power hierarchy that is not truly “sovereign” for persons. So @DavidHuseby, why are you using the term “user sovereignty” rather than “self sovereignty”?
11:26:50 From johnnyfromcanada : http://dfinity.org
11:27:15 From dsearls : Hm: http://dfinity.org/ Very interesting. hadn't seen that before. thanks.
11:27:36 From johnnyfromcanada : Very serious project
11:28:18 From dsearls : I've never liked "user," though I've always liked "sovereignty" when used with "self." Because that's what each of us are experiencing here, as we talk, and walk, and interact.
11:29:17 From johnnyfromcanada : So “user” is a wrapper of “self”.
11:29:22 From johnnyfromcanada : Or decorator.
11:29:30 From dsearls : Agree johnny.
11:30:06 From johnnyfromcanada : Fits with Dependency Inversion / Injection concept.
11:30:12 From Cam Geer : +1 Tim org or System perspective
11:30:38 From Marc Davis : “Self Sovereignty” assumes the “person” as having rights that exist outside of, and prior to, any particular system.
11:31:04 From MarkL. @smartopian : Its about who controls the data
11:31:05 From dsearls : We are all self-sovereign as independent beings. When we enter a system as a self-sovereign individual, we wear the definition of "user." I get that but I still don't like it. Computing and drugs are the only fields that call people "users."
11:31:23 From dsearls : Agree Marc.
11:31:26 From scottmace : Doc +1
11:31:35 From johnnyfromcanada : Actors
11:31:43 From johnnyfromcanada : Participants
11:32:15 From dsearls : Note that the group here has grown to 43. Peaked at 78 in the last session. Both strong.
11:32:15 From MarkL. @smartopian : I have challenged the use of the word User at IIW for over a decade - glad to see the User vs Human discussion happen now - self sovereign is a bit of red herring -
11:32:53 From Marc Davis : That’s why “self sovereignty” is a key concept, because it postulates a “person” as having sovereign rights that exist prior to and outside of being a “user” of any particular “system”. IMHO, this is a key framing issue.
11:33:38 From Timothy Ruff : @Doc, makes sense about "user", and some orgs will have an allergic reaction to the word "sovereign." I spoke with Dave about alternatives, something that conveys a balance of power between orgs and people that's mutually beneficial. Maybe there's a "balance of power pledge" that orgs can publicly agree to...
11:34:00 From dsearls : Self-sovereign comes from Devon Loffreto, who coined it a decade ago and walked off. He still cares but isn't involved. He was contrasting the ideal from the purely administrative. Any system that has an identifier for you in their namespace is administrative.
11:34:39 From dsearls : Hmm... maybe the opposite of distributed is administrative. Just thinking out loud.
11:34:54 From Grace Rachmany : That is accurate. At Holochain we are considering that data needs to be held in 20 hosts in order for it to be fully accessible at any time.
11:35:24 From Grace Rachmany : Also, there isn't really a need for everyone to be online for any given amount of time. People could be off for days or months and get updated when they come back up online.
11:35:46 From Marc Davis : Helpful to think about legal concept of the “person” which can apply to human beings and to corporations/organizations. So perhaps “personal sovereignty” could bridge the gap between humans and corporations each having respective rights and duties.
11:35:59 From dsearls : Agree,Timothy, that some are allergic to "sovereign." But the baby got named. I remember the mainframe world disliking "personal computing," back in the late '70s. it wasn't until IBM made a PC in 1982 that "personal" got legitimized.
11:36:38 From johnnyfromcanada : Perhaps we should distinguish between “coherence” and “cohesion”.
11:37:39 From MarkL. @smartopian : It's literally self surveillance identity - and this title is something that I think I could trust.. because it's transparent. But mis-labelling identity as Sovereign in my opinion decreases trustworthiness -
11:38:01 From Grace Rachmany : Stributed?
11:38:06 From Grace Rachmany : Tributed?
11:38:15 From Timothy Ruff : Great points, Doc. Nothing beats hindsight, where wisdom comes from. Sovereign it is! Now what's the replacement for "user"?
11:38:44 From MarkL. @smartopian : Human :-)
11:39:02 From David Huseby : +1
11:39:08 From dsearls : Person.
11:39:26 From Timothy Ruff : @MarkL I agree that "sovereign" is inaccurate in many ways, but as Doc says, "the baby got named." Funny, I've got a blog mostly written about how SSI isn't identity and mostly isn't sovereign. :)
11:39:46 From Line Kofoed : Communal?
11:39:49 From Cam Geer : please add that to the chat
11:40:34 From Riaz Zolfonoon : Could we say Consolidated as opposite of Distributed?
11:41:34 From dsearls : Riaz, what Sam said was helpful, but I don't remember it well enough to write down. So hopefully he can point to it here or somewhere.
11:44:49 From Iain Henderson : On terminology, the MyData Community has stabilised on ‘human-centric’, and ‘individual’ and those seem to be being well received across that global community. So there must be decent non-English language equivalents.
11:45:08 From Marc Davis : Who made the great distinction just prior on the Zoom call that centralization is about control and distribution is about space? Great distinction! It separates “degree of centralization” and “degree of distribution” into different layers of the architecture: degree of distribution is a physical layer of the location of physical system resources; degree of centralization is a rights/policies layer about control of data and lower level system resources? Did I get that right?
11:45:28 From johnnyfromcanada : Coherence is more about lower-level structural integrity and coupling / dependencies. Cohesion is more about high-level semantic meaning / understandability.
11:45:47 From MarkL. @smartopian : Working on human trust and transparency is a bit more difficult when SSI claims to do trust, when it really does assurance. The Consent Record and Notice and Consent standards are external, and meant for human and sovereign type of interaction. Identity is a digital tool that is used to surveil an attribute. SSI refers to identity Surveillance tech..
11:46:22 From Grace Rachmany : I'll be diving deeper into IPFS over the next month. For now, I can talk about how Holochain is dealing with these issues. Hosting is the one area where decentralization is most difficult for regulatory reasons.
11:46:35 From MarkL. @smartopian : Offline by default :-)
11:48:02 From Sam-Smith : See text file for definition of decentralization vs distribution. Control vs Space
11:52:22 From MarkL. @smartopian : Distributed ledger consent tech for storing fwd..
11:53:00 From Chris Winczewski : Sam, does Mobile IP also require full IPv6 implementation?
11:54:41 From dsearls : I like that: "The most easy to use software ever invented."
11:54:55 From scottmace : +1
11:54:59 From Dee Platero : +1
11:55:11 From Cam Geer : awesome Dave! thx
11:55:17 From Jsearls : new physics
11:55:47 From dsearls : I have a session at 3:30 in D to carry this forward.
11:55:53 From johnnyfromcanada : Re: earlier concept of how many nodes are needed to keep a coherent network. Related concept of Byzantine Fault Tolerance (BFT). There is an old math proof that says you need to be resilient at minimum against 1/3 of nodes being “corrupted” / inconsistent (intentional or otherwise).
11:55:54 From MarkL. @smartopian : Please share slides —
11:56:03 From dsearls : True Self-Sovereignty: What Will It Take?
11:56:50 From Grace Rachmany : One of the things that is notable is that it may be necessary to move off of the existing Internet infrastructure altogether.
11:57:22 From johnnyfromcanada : I recall an Internet 2
11:57:39 From scottmace : Which was mostly fat pipes, IIRC
11:57:57 From mitfik : where is the code ? :)
11:58:21 From dsearls : Sam Curran's Minimum Positive Human Application of SSI is at the same time. A difference might be that my focus won't just be on SSI. Just SS. I'm hoping developer types can make it.
11:59:16 From johnnyfromcanada : The alternative to replacing the Internet is to leverage it such that it cannot corrupt what you intend. Basically, what DLTs attempt to do, with varying success / correctness.
11:59:26 From Grace Rachmany : +1
11:59:29 From Cam Geer : https://scuttlebutt.nz/
12:00:33 From Cam Geer : http://en.wikipedia.org/wiki/Secure_Scuttlebutt
12:02:15 From MarkL. @smartopian : Event receipts
12:02:40 From MarkL. @smartopian : Consent State Records
12:05:06 From johnnyfromcanada : Hashgraph inventor Leemon Baird is big on concept “share worlds”.
12:05:13 From scottmace : Great session!
12:05:18 From johnnyfromcanada : “shared"
12:05:20 From MarkL. @smartopian : Great !!
12:05:27 From Dee Platero : Fantastic! This has been awesome - I can't wait to start developing.
12:05:35 From Iain Henderson : Great session thanks