§ Implementers Guide
This guide contains concepts, explanations, and important considerations for those building DIDComm capable systems.
This section discusses best practices and the implications of DIDComm Protocols on the privacy of the communicating parties.
§ Public Fingerprinting
Users should avoid disclosing supported protocols until sufficient trust has been established. The broad disclosure of supported protocols may provide a unique fingerprint that can be used to correlate multiple identifiers in use by a single party.
DIDComm’s use of the Discover Features Protocol allows selective disclosure of features to mitigate this problem. Leveraging this protocol as a better substitute for Verifiable Data Registry (VDR) published DID Document endpoints will prevent the disclosure of unique protocols.
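As a rough illustration of selective disclosure, the sketch below answers a Discover Features query only when the asking party is already trusted. The message shapes follow the Discover Features 2.0 protocol as understood here; the `SUPPORTED` list, the `is_trusted` flag, and the example protocol URIs are assumptions made for illustration, not part of this guide.

```python
# Hypothetical list of protocols this agent supports (illustrative only).
SUPPORTED = [
    "https://didcomm.org/trust-ping/2.0",
    "https://didcomm.org/routing/2.0",
    "https://example.com/protocols/rare-protocol/1.0",
]


def matches(pattern: str, protocol_id: str) -> bool:
    """Match an id against a query; a trailing '*' is treated as a prefix wildcard."""
    if pattern.endswith("*"):
        return protocol_id.startswith(pattern[:-1])
    return pattern == protocol_id


def build_disclosure(query_msg: dict, is_trusted: bool) -> dict:
    """Answer a queries message, disclosing nothing to untrusted parties."""
    disclosures = []
    if is_trusted:
        for query in query_msg["body"]["queries"]:
            if query.get("feature-type") != "protocol":
                continue
            for protocol_id in SUPPORTED:
                if matches(query["match"], protocol_id):
                    disclosures.append({"feature-type": "protocol", "id": protocol_id})
    return {
        "type": "https://didcomm.org/discover-features/2.0/disclose",
        "thid": query_msg["id"],
        "body": {"disclosures": disclosures},
    }


query = {
    "type": "https://didcomm.org/discover-features/2.0/queries",
    "id": "query-1",
    "body": {"queries": [{"feature-type": "protocol", "match": "https://didcomm.org/*"}]},
}
```

An untrusted asker receives an empty disclosure list, which reveals nothing that could serve as a fingerprint; a real agent would likely apply a finer-grained policy than a single boolean.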
Routing is the process of managing the delivery of messages from sender to recipient, possibly adapting the packaging and transfer to intermediate nodes. A route is a map or plan that specifies enough to achieve delivery in at least one direction; it may omit uninteresting details.
A sender emits a message hoping that a recipient eventually receives it. As a message moves toward the recipient, we say it is moving destward; the opposite direction is sourceward. Note that sender and recipient flip if a request-like message is followed by a response-like message in the opposite direction; the context that defines a sender is a single message, not a paired interaction. (DIDComm supports request-response but does not require it.)
We tend to conceive of senders and recipients as simple — “Alice” and “Bob” sound like unitary individuals. This can be a useful simplification. However, it’s important to remember that it may hide important detail; if Bob uses a laptop, a mobile device, and a cloud service to receive messages, and if each of them follows the cryptographic best practice of not sharing keys, then it is impossible to ignore this complexity in some parts of routing design. We call the set of things that Bob controls his sovereign domain (or domain for short). Note that this usage is notably different from the meaning of “domain” in DNS and many web contexts. We also use the loose cover term agent to refer to individual devices or pieces of software inside a given domain. The boundary of a domain is an important construct for trust analysis and for routing. We will explore this in greater detail below.
§ Routing Requirements
All messaging technologies must address routing in some way. However, DIDComm has unusual requirements:
- Because of security and privacy goals, DIDComm routing must impose careful limits on how and to what degree intermediate nodes are trusted.
- Because DIDComm aims to be transport-independent, its routing model must be carefully decoupled from strong assumptions about networking. In particular, DIDComm routing cannot make broad-brush assumptions that:
  - A given route will use only a single transport.
  - Transport mechanisms will provide any security benefits.
  - The identity and connectivity for every hop in a route will be known by any party at the time a message is sent.
  - The route from sender to recipient will be similar in hop identity, hop count, or transport mix to a complementary route from recipient back to sender.
  - All the nodes in a route are ever online at the same time.
- Because of decentralization and multiple-peer-friendly goals, DIDComm routing cannot orient itself around servers or simple request-response patterns as mandatory components.
- Because of privacy goals, DIDComm must offer fine distinctions in how repudiation and authentication are handled.
These requirements are very demanding. They do NOT necessarily make DIDComm hard to implement. They do NOT prevent DIDComm from using common web infrastructure — indeed, DIDComm routes quite simply and elegantly over HTTP. However, they DO force the routing model to be described in a very generic, flexible way, and they DO mean that existing routing solutions are an imperfect fit. DIDComm learns as much as it can, and borrows as much as it can, from clever and battle-tested routing work done in other contexts, but it does have some unique twists.
Let’s focus on a simple case where A wants to send a message to B, and the route involves one intermediate hop at C. Suppressing a few details, DIDComm routing works like this:
1. A prepares a plaintext message of any type, M, and encrypts it for the recipient, B. This produces an encrypted form of M.
2. A prepares another plaintext message of type forward, and adds the encrypted M to it as an attachment. This new message asks C to deliver the attached payload to B.
3. A encrypts the forward message for C and sends it to C.
4. C decrypts what it received, reproducing the plaintext forward message. Reading the plaintext, C sees that it’s been asked to deliver the encrypted attachment to B.
5. C hands the attachment from the forward message (which is the encrypted message M) to B.
6. B decrypts M, reproducing the plaintext.
This is not the simplest possible scenario; DIDComm could be direct from A to B, which would eliminate steps 2-5. Some DIDComm+HTTP interactions are that simple. And it is not the most complex DIDComm routing scenario, either. Much more elaborate routes could be described by introducing additional hops with additional forward messages. However, the above sequence is a good rough model to carry through the discussion that follows.
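The overview can be sketched roughly in code. This is a minimal illustration, not an implementation: `seal` and `open_sealed` are hypothetical stand-ins that provide no real encryption, the party names are bare letters rather than DIDs, and the forward message layout (a `next` field in the body plus an attachment) is an assumption drawn from the Routing Protocol section of the spec.

```python
import base64
import json


def seal(plaintext: dict, recipient: str) -> dict:
    """Stand-in for encryption: NOT secure, it only records who may open it."""
    blob = base64.b64encode(json.dumps(plaintext).encode()).decode()
    return {"sealed_for": recipient, "blob": blob}


def open_sealed(envelope: dict, me: str) -> dict:
    """Stand-in for decryption: only the named recipient may open the envelope."""
    if envelope["sealed_for"] != me:
        raise PermissionError(f"{me} cannot open a message sealed for {envelope['sealed_for']}")
    return json.loads(base64.b64decode(envelope["blob"]))


# Step 1: A prepares M and encrypts it for B.
m = {"type": "https://example.com/protocols/chat/1.0/message", "body": {"text": "hi"}}
sealed_m = seal(m, "B")

# Step 2: A prepares a forward message that carries the encrypted M.
forward = {
    "type": "https://didcomm.org/routing/2.0/forward",
    "to": ["C"],
    "body": {"next": "B"},
    "attachments": [{"data": {"json": sealed_m}}],
}

# Step 3: A encrypts the forward message for C and sends it.
sealed_forward = seal(forward, "C")

# Step 4: C decrypts the forward message and reads the delivery instruction.
forward_at_c = open_sealed(sealed_forward, "C")
next_hop = forward_at_c["body"]["next"]

# Step 5: C hands the attachment (still sealed for B) to B.
payload_for_b = forward_at_c["attachments"][0]["data"]["json"]

# Step 6: B decrypts the payload and recovers M.
m_at_b = open_sealed(payload_for_b, "B")
```

Note that C never sees the plaintext of M; in this sketch, an attempt by C to open the inner payload raises an error.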
The forward message created in step 2 in our overview is a DIDComm application-level protocol message. Routing, as described here, is just one of many protocols that can be built atop DIDComm primitives. A and C are “speaking” this protocol when they communicate. The routing protocol can use the same features as any other application-level protocol; this includes attachments, message threading, message timing, message tracing, ACKs, problem reports, and so forth. However, before we describe the structure and semantics of the forward message in detail, it’s important to understand some additional routing concepts.
§ Two Dimensions
Two dimensions of routing are relevant to DIDComm. Sometimes they are confused or conflated:
- Network routing deals with how packets flow across a digital landscape. This is the “routing” familiar to most technical people, and the one that will inform most assumptions they bring to the routing topic when reading the DIDComm spec for the first time.
- Cryptographic routing concerns itself with how packaging hides or reveals plaintext to participants in a delivery chain. In other words, it’s about the routing of secrets rather than the routing of packets.
The overview immediately above collapsed these two dimensions, assuming that a node relaying data is also a node decrypting. This makes for simple explanations, but it’s not always a true or helpful assumption. The correspondence between these two routing dimensions can be non-trivial; a route may have ten network hops but only three cryptographic hops, and a single network hop may decrypt twice as it passes data between distinct software entities. However, simple principles explain the dynamics for each situation.
§ Mediators and Relays
DIDComm routing uses two important constructs to model all varieties of network and cryptographic routing: mediators and relays.
A mediator is a participant in routing that must be accounted for by the sender’s cryptography. In other words, it is visible in the cryptographic routing dimension. It has its own keys and will deliver messages only after decrypting an outer envelope to reveal a forward request. It must understand DIDComm routing to do this. Many types of mediators may exist, but two important ones should be widely understood, as they commonly manifest in DID Docs:
- A service that receives messages for many agents at a single endpoint to provide herd privacy (sometimes called an “agency”) is a mediator.
- A cloud-based agent that forwards messages to mobile devices is a mediator.
In contrast, a relay is an entity that passes along encrypted messages without understanding or decrypting them. It’s focused on network routing only.
Like mediators, relays can be used to change the transport for a message (e.g., accept an HTTP POST, then turn around and emit an email; accept a Bluetooth transmission, then turn around and emit something in a message queue). But unlike mediators, relays can do this without understanding DIDComm. Load balancers and mix networks like TOR are important types of relay.
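One way to keep the two routing dimensions apart is to model each hop with a flag saying whether it decrypts. The sketch below is a simplification under stated assumptions: the `Hop` type and the example route are hypothetical, the final recipient is counted as a cryptographic hop, and a node that decrypts more than once is not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hop:
    name: str
    decrypts: bool  # True for a mediator or the recipient, False for a relay


def count_hops(route: List[Hop]) -> Tuple[int, int]:
    """Return (network hops, cryptographic hops) for a route ending at the recipient."""
    network = len(route)
    cryptographic = sum(1 for hop in route if hop.decrypts)
    return network, cryptographic


route = [
    Hop("load balancer", decrypts=False),  # relay: passes bytes along untouched
    Hop("agency", decrypts=True),          # mediator: opens an outer envelope
    Hop("message queue", decrypts=False),  # relay
    Hop("cloud agent", decrypts=True),     # mediator
    Hop("Bob's phone", decrypts=True),     # final recipient
]
```

Only the hops with `decrypts=True` need to be known to the sender; the relays can be added or removed without the sender changing how it packages the message.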
Let’s define mediators and relays by exploring how they manifest in a series of communication scenarios between Alice and Bob.
§ Scenario 1: direct
Alice and Bob are both employees of a large corporation. They work in the same office, but have never met. The office has a rule that all messages between employees must be encrypted. They use paper messages and physical delivery as the transport. Alice writes a note, encrypts it so only Bob can read it, puts it in an envelope addressed to Bob, and drops the envelope on a desk that she has been told belongs to Bob. This desk is in fact Bob’s, and he later picks up the message, decrypts it, and reads it.
In this scenario, there is no mediator, and no relay.
§ Scenario 2: a gatekeeper
Imagine that Bob hires an executive assistant, Carl, to filter his mail. Bob won’t open any mail unless Carl looks at it and decides that it’s worthy of Bob’s attention.
Alice has to change her behavior. She continues to package a message for Bob, but now she must account for Carl as well. She takes the envelope for Bob, and places it inside a new envelope addressed to Carl. Inside the outer envelope, and next to the envelope destined for Bob, Alice writes Carl an encrypted note: “This inner envelope is for Bob. Please forward.”
Here, Carl is acting as a mediator. He is mostly just passing messages along. But because he is processing a message himself, and because Carl is interposed between Alice and Bob, he affects the behavior of the sender. He is a known entity in the route.
You may recognize this as similar to our overview example with A, B, and C. A and B correspond to Alice and Bob; C is Carl, the mediator.
§ Scenario 3: transparent indirection
All is the same as the base scenario (Carl has been fired, and is thus out of the picture), except that Bob is working from home when Alice’s message lands on his desk. Bob has previously arranged with his friend Darla, who lives near him, to pick up any mail that’s on his desk and drop it off at his house at the end of the work day. Darla sees Alice’s note and takes it home to Bob.
In this scenario, Darla is acting as a relay. Note that Bob arranges for Darla to do this without notifying Alice, and that Alice does not need to adjust her behavior in any way for the relay to work.
§ Scenario 4: more indirection
Like scenario 3, Darla brings Bob his mail at home. However, Bob isn’t at home when his mail arrives. He’s had to rush out on an errand, but he’s left instructions with his son, Emil, to open any work mail, take a photo of the letter, and text him the photo. Emil intends to do this, but the camera on his phone misfires, so he convinces his sister, Francis, to take the picture on her phone and email it to him. Then he texts the photo to Bob, as arranged.
Here, Emil and Francis are also acting as relays. Note that nobody knows about the full route. Alice thinks she’s delivering directly to Bob. So does Darla. Bob knows about Darla and Emil, but not about Francis.
Note, too, how the transport is changing from physical mail to email to text.
To the party immediately upstream (closer to the sender), a relay is indistinguishable from the next party downstream (closer to the recipient). A party anywhere in the chain can insert one or more relays upstream from themselves, as long as those relays are not upstream of another named party (sender or mediator).
§ More Scenarios
Mediators and relays can be combined in any order and any amount in variations on our fictional scenario. Bob could employ Carl as a mediator, and Carl could work from home and arrange delivery via George, then have his daughter Hannah run messages back to Bob’s desk at work. Carl could hire his own mediator. Darla could arrange for Ivan to substitute for her when she goes on vacation. And so forth.
§ More Traditional Usage
The scenarios used above are somewhat artificial. Our most familiar routing scenarios involve edge agents running on mobile devices and accessible through Bluetooth or push notification, and cloud agents that use electronic protocols as their transport. Let’s see how relays and mediators apply there.
§ Scenario 5: direct
Alice’s cloud wants to talk to Bob’s cloud. Bob’s cloud is listening at http://bob.com/api. Alice encrypts a message for Bob and posts it to that URL.
In this scenario, we are using a direct transport with neither a mediator nor a relay. This is how Alice and Bob operate in Scenario 1, and it’s also equivalent to our Overview minus steps 2-5.
When DIDComm involves only two parties, and when HTTP is convenient for both of them, this sort of direct delivery may be used. (Note that if you need n-wise, or if you need a reciprocal return route but Alice’s cloud exposes no public API, this delivery scenario can present problems. More on this later.)
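A direct delivery like Scenario 5 might be prepared as in the following sketch. It only builds the HTTP request and does not send it. The `application/didcomm-encrypted+json` media type is taken from the spec's description of encrypted messages; the envelope contents are placeholders, since real encryption would come from a DIDComm library.

```python
import json
import urllib.request


def build_direct_post(endpoint: str, encrypted_message: dict) -> urllib.request.Request:
    """Prepare (but do not send) an HTTP POST carrying an already-encrypted message."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(encrypted_message).encode("utf-8"),
        headers={"Content-Type": "application/didcomm-encrypted+json"},
        method="POST",
    )


# Placeholder standing in for a real encrypted envelope (values are not real).
envelope = {"protected": "...", "recipients": [], "iv": "...", "ciphertext": "...", "tag": "..."}
request = build_direct_post("http://bob.com/api", envelope)
# Sending would be: urllib.request.urlopen(request)
```

Nothing in the request identifies a mediator, because none exists in this scenario; the endpoint belongs to the recipient itself.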
Virtually the same diagram could be used for a Bluetooth or NFC or sneakernet conversation that happens offline:
§ Scenario 6: herd hosting
Let’s tweak Scenario 5 slightly by saying that Bob’s agent is one of thousands that are hosted at the same URL. Maybe the URL is now http://agents-r-us.com/inbox. Now if Alice wants to talk to Bob’s cloud agent, she has to cope with a mediator. She wraps the encrypted message for Bob’s cloud agent inside a forward message that’s addressed to and encrypted for the agent of agents-r-us that functions as a gatekeeper.
This scenario is one that highlights an external mediator–so-called because the mediator lives outside the sovereign domain of the final recipient.
§ Scenario 7: intra-domain dispatch
Now let’s subtract agents-r-us. We’re back to Bob’s cloud agent listening directly at http://bob.com/agent. However, let’s say that Alice has a different goal–now she wants to talk to the edge agent running on Bob’s mobile device. This agent doesn’t have a permanent IP address, so Bob uses his own cloud agent as a mediator. He tells Alice that his mobile device agent can only be reached via his cloud agent.
Once again, this causes Alice to modify her behavior. Again, she wraps her encrypted message. The inner message is enclosed in an outer envelope, and the outer envelope is passed to the mediator.
This scenario highlights an internal mediator. Internal and external mediators introduce similar features and similar constraints; the relevant difference is that internal mediators live within the sovereign domain of the recipient, and may thus be worthy of greater trust.
§ Scenario 8: double mediation
Now let’s combine. Bob’s cloud agent is hosted at agents-r-us, AND Alice wants to reach Bob’s mobile:
This is a common pattern with HTTP-based cloud agents plus mobile edge agents, which is the most common deployment pattern we expect for many users of self-sovereign identity. Note that the properties of the agency and the routing agent are not particularly special–they are just an external and an internal mediator, respectively.
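Double mediation amounts to wrapping the message once per mediator, innermost first. The sketch below illustrates that nesting under stated assumptions: `seal` is a hypothetical stand-in that performs no real encryption, the party names are plain labels rather than DIDs, and the forward layout is assumed from the Routing Protocol section of the spec.

```python
import json
from typing import List


def seal(plaintext: dict, recipient: str) -> dict:
    """Stand-in for encryption (NOT secure): records the intended recipient."""
    return {"sealed_for": recipient, "plaintext_json": json.dumps(plaintext)}


def wrap_for_route(message: dict, recipient: str, mediators: List[str]) -> dict:
    """Wrap a message once per mediator; mediators are listed sender-side first."""
    envelope = seal(message, recipient)
    next_hop = recipient
    for mediator in reversed(mediators):
        forward = {
            "type": "https://didcomm.org/routing/2.0/forward",
            "to": [mediator],
            "body": {"next": next_hop},
            "attachments": [{"data": {"json": envelope}}],
        }
        envelope = seal(forward, mediator)
        next_hop = mediator
    return envelope


def list_openers(envelope: dict) -> List[str]:
    """Peel every layer and list who opens it, outermost first (inspection only)."""
    openers = []
    while True:
        openers.append(envelope["sealed_for"])
        inner = json.loads(envelope["plaintext_json"])
        attachments = inner.get("attachments")
        if not attachments:
            return openers
        envelope = attachments[0]["data"]["json"]


outer = wrap_for_route(
    {"type": "https://example.com/protocols/chat/1.0/message", "body": {"text": "hi"}},
    recipient="bob-mobile",
    mediators=["agents-r-us", "bob-cloud"],
)
```

The external mediator (the agency) opens the outermost layer and the internal mediator (Bob's cloud agent) opens the next, matching the order in which the message travels.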
§ Remember Routes are One-Way (not Duplex)
In all of this discussion, note that we are analyzing only a flow from Alice to Bob. How Bob gets a message back to Alice is a completely separate question. Just because Carl, Darla, Emil, Francis, and Agents-R-Us may be involved in how messages flow from Alice to Bob does not mean they are involved in the flow in the opposite direction.
Note how this breaks the simple assumptions of pure request-response technologies like HTTP, that assume the channel in (request) is also the channel out (response). Duplex request-response can be modeled with DIDComm, but doing so requires support that may not always be available, plus cooperative behavior governed by the
§ Routing Protocol
Now that we understand mediators and relays, how and why they might be combined in various ways, and how their presence influences delivery semantics, we can describe the actual application-level routing protocol that any DIDComm sender speaks with a destward mediator. See [Routing Protocol] in the spec.
It is a best practice to ponder appropriate timeout settings when designing application-level protocols atop DIDComm. A protocol for conducting live music over the internet should probably time out its messages to cue musicians within milliseconds, whereas a protocol to apply for college may need timeouts that are days or weeks long. A protocol definition should communicate timeout assumptions like these.
Individual implementers of a protocol should also ponder whether they need timeouts more aggressive than those of the general community. Perhaps a college application protocol allows the process to unfold over weeks – but an app that promises it can help someone apply to college in 5 minutes shouldn’t use default timeouts in the messages it sends.
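One possible way to express such timeouts is to stamp each outgoing message with an expiry. The sketch below assumes the `created_time` and `expires_time` message headers from the spec (both in seconds since the Unix epoch); the helper names and the two example durations are hypothetical.

```python
import time
from typing import Optional

FIVE_MINUTES = 5 * 60
THREE_WEEKS = 21 * 24 * 60 * 60


def with_timeout(message: dict, ttl_seconds: int, now: Optional[int] = None) -> dict:
    """Return a copy of the message stamped with created_time and expires_time."""
    created = int(time.time()) if now is None else now
    return {**message, "created_time": created, "expires_time": created + ttl_seconds}


def is_expired(message: dict, now: Optional[int] = None) -> bool:
    """True if the message carries an expires_time that has already passed."""
    current = int(time.time()) if now is None else now
    expires = message.get("expires_time")
    return expires is not None and current > expires


# A fast-moving app overrides the protocol's generous default with its own timeout.
application = {"type": "https://example.com/protocols/college-app/1.0/apply", "body": {}}
quick = with_timeout(application, FIVE_MINUTES, now=1_700_000_000)
```

A message without `expires_time` is treated here as never expiring, which is a design choice of this sketch rather than a rule stated in this guide.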
§ Cybersecurity considerations for problem reports
Ethical and unethical hackers deliberately trigger errors on systems to understand what exploits are possible. We expect this to happen with DIDComm. Therefore, the troubleshooting and transparency that comes from problem reports needs to be weighed against the risk of disclosing too much information. The following considerations are recommended.
- Problem reports do not have to be sent (only) to the party who triggers a problem. Sometimes, a different (or additional) audience may be appropriate.
- The problem-report message type is deliberately decoupled from the versioning and release status of other protocols, so it cannot be used for feature sniffing.
- Fields that encourage careless, recursive information dumping (e.g., Java’s Throwable.cause) do not appear in problem-report.
- The comment and args properties of a problem-report are separated, with comment mapping consistently to a code. This means that the risk of disclosing too much information is concentrated in the value of args rather than comment. Values placed in args should be scrubbed of anything sensitive.
- Sending problem reports to an unknown party is more risky than sending them to someone with known characteristics. (Because DIDComm’s normal mode is mutual authentication via DIDs, and because DIDComm connections may accumulate credential-based context, this is a manageable risk.)
- Sending problem reports immediately may be more risky than sending them with a modest, random delay. This makes denial-of-service attacks and temporal correlation harder, and is the same principle that motivates login dialogs to pause before reporting an incorrect password.
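Two of the considerations above, scrubbing args and delaying the send, can be sketched as follows. The problem-report layout (`code`, `comment` with numbered placeholders, `args`) is assumed from the spec's problem report section; the `SENSITIVE` pattern, the delay bounds, and the helper names are hypothetical and would need tuning for a real deployment.

```python
import random
import re
from typing import List

# Hypothetical pattern for values that should never leave the system.
SENSITIVE = re.compile(r"password|secret|token|/home/", re.IGNORECASE)


def scrub_args(args: List[str]) -> List[str]:
    """Replace any argument that looks sensitive with a fixed marker."""
    return ["[redacted]" if SENSITIVE.search(arg) else arg for arg in args]


def build_problem_report(pthid: str, code: str, comment: str, args: List[str]) -> dict:
    """Build a problem report whose only variable content is the scrubbed args."""
    return {
        "type": "https://didcomm.org/report-problem/2.0/problem-report",
        "pthid": pthid,
        "body": {"code": code, "comment": comment, "args": scrub_args(args)},
    }


def send_delay_seconds(rng: random.Random, base: float = 0.5, jitter: float = 1.5) -> float:
    """Pick a modest random delay to apply before sending the report."""
    return base + rng.uniform(0.0, jitter)


report = build_problem_report(
    "thread-1",
    "e.p.xfer.cant-use-endpoint",
    "Unable to use the {1} endpoint for {2}.",
    ["https://agents.r.us/inbox", "token=abc123"],
)
```

Because the comment text is fixed per code, reviewing what a report can leak reduces to reviewing what is allowed into `args`.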
§ Message Authentication
TODO: Add details about authenticated encryption.
§ What’s New?
The version of DIDComm incubated in the Hyperledger Aries community is referred to as Version 1 (V1). This spec describes the next version, referred to as Version 2 (V2). This section will describe the changes between V1 and V2, useful to members of the Aries community.
§ Summary of Changes
- Formalization of methods used in V1
- JWM based envelope
- ECDH-1PU standardized form of AuthCrypt
- Both DID and key in each message
- Special Handling of Peer DIDs eliminated
- Message structure split between ‘headers’ and body.
- No AnonCrypt encryption method.
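To make the split between headers and body concrete, here is a small sketch of a V2 plaintext message. The header names (`id`, `type`, `from`, `to`, `created_time`) are assumed from the spec's message structure; the DIDs, the protocol URI, and the `split_message` helper are hypothetical.

```python
from typing import Tuple

message = {
    "id": "1234567890",
    "type": "https://example.com/protocols/lets_do_lunch/1.0/proposal",
    "from": "did:example:alice",
    "to": ["did:example:bob"],
    "created_time": 1516269022,
    "body": {"messagespecificattribute": "and its value"},
}


def split_message(msg: dict) -> Tuple[dict, dict]:
    """Separate the top-level headers from the protocol-specific body."""
    headers = {key: value for key, value in msg.items() if key != "body"}
    return headers, msg.get("body", {})


headers, body = split_message(message)
```

Everything a generic DIDComm layer needs sits in the headers, while the body is left to the application-level protocol named by `type`.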
§ Practical Changes
The list of changes above leads to practical changes in how DIDComm is used.
§ DID Exchange not needed
Each message contains both the sender key (used in the encryption layer) and the sender’s DID. The exchange of DIDs that V1 performed via the DID Exchange Protocol now occurs in each message that is transferred. The important step of rotating DIDs is accomplished via the from_prior header that travels alongside any protocol message. These features make the DID Exchange Protocol redundant.
One side effect of the DID Exchange Protocol in V1 was that you confirmed the validity of the DID with a round trip to the other party. Many protocols will provide this assurance via the flow of the protocol prior to the point where round-trip testing is required. When this round-trip is desired prior to the beginning of a protocol, a round trip with another protocol (such as Trust Ping or Feature Discovery) can provide the same assurance.
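A round trip via Trust Ping might look like the sketch below. The message types and the `response_requested` field are assumed from the Trust Ping 2.0 protocol; the helper names and DIDs are hypothetical, and real messages would also be encrypted and transported.

```python
import uuid

PING = "https://didcomm.org/trust-ping/2.0/ping"
PING_RESPONSE = "https://didcomm.org/trust-ping/2.0/ping-response"


def build_ping(sender: str, recipient: str) -> dict:
    """Build a ping that asks the other party to answer."""
    return {
        "type": PING,
        "id": str(uuid.uuid4()),
        "from": sender,
        "to": [recipient],
        "body": {"response_requested": True},
    }


def build_ping_response(ping: dict) -> dict:
    """Build the answer to a ping, tied to it by thread id."""
    return {"type": PING_RESPONSE, "id": str(uuid.uuid4()), "thid": ping["id"]}


def confirms_round_trip(ping: dict, response: dict) -> bool:
    """True if the response is a ping-response on the thread started by the ping."""
    return response.get("type") == PING_RESPONSE and response.get("thid") == ping["id"]


ping = build_ping("did:example:alice", "did:example:bob")
response = build_ping_response(ping)
```

Receiving a matching response shows that the other party could decrypt a message sent to its DID, which is the assurance the V1 exchange used to provide.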
§ Special Handling of Peer DIDs eliminated
DIDComm V1 defined special handling of Peer DIDs, which optimized it for use with Peer DIDs. However, this made it less obvious how other DID methods could be used with DIDComm. DIDComm V2 eliminated special handling of Peer DIDs, making handling of all DIDs equal from the perspective of the DIDComm spec. This creates a more distinct separation between how DIDs are used (defined by DID Core and the specific DID method) and how to securely communicate using DIDs (defined by the DIDComm spec).
Attributes such as from inside a DIDComm message allow for query parameters to be included on a DID. Using the query parameters you can exchange additional information without using custom fields. DID methods indicate how query parameters can be used to pass state information. For example, the Peer DID method defines the usage of the initial-state query parameter to pass all information needed to construct a DIDDoc in a single field.
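Reading such a parameter is ordinary query-string parsing. The following sketch splits a DID that carries query parameters into the bare DID and its parameters; the example DID and the `initial-state` value are made up, and how the value is encoded and interpreted is left to the DID method.

```python
from typing import Dict, List, Tuple
from urllib.parse import parse_qs


def split_did_url(did_url: str) -> Tuple[str, Dict[str, List[str]]]:
    """Split 'did:...?...' into the bare DID and a dict of query parameters."""
    did, _, query = did_url.partition("?")
    return did, parse_qs(query)


did, params = split_did_url("did:peer:123abc?initial-state=eyJpZCI6ImRpZDpwZWVyOjEyM2FiYyJ9")
initial_state = params.get("initial-state", [None])[0]
```

Each parameter maps to a list because a query string may repeat a name; a DID without a query simply yields an empty dict.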
§ Process From Headers prior to Protocol Processing
Relationship changes in V1 were handled inside the DID Exchange Protocol. In V2, relationship changes including discovery and rotation are handled in message headers.
In V2, recipients must evaluate the from and from_prior headers of every message before beginning protocol message processing.
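The ordering requirement can be sketched as a small inbound pipeline. This is an illustration under stated assumptions: `verify_from_prior` is a hypothetical callback that verifies the from_prior token and returns its claims, with `iss` holding the prior DID and `sub` the new DID as in the spec's DID rotation section; `known_peers` is a hypothetical store of relationship state keyed by DID.

```python
from typing import Callable, Dict


def process_inbound(
    message: dict,
    known_peers: Dict[str, dict],
    verify_from_prior: Callable[[str], dict],
    handle_protocol: Callable[[dict], None],
) -> None:
    """Evaluate from and from_prior first, then hand the message to protocol logic."""
    sender = message.get("from")
    token = message.get("from_prior")
    if token is not None:
        claims = verify_from_prior(token)  # hypothetical: raises if the token is invalid
        prior_did, new_did = claims["iss"], claims["sub"]
        if new_did != sender:
            raise ValueError("from_prior does not describe the sender of this message")
        if prior_did in known_peers:
            # Carry the existing relationship over to the rotated DID.
            known_peers[new_did] = known_peers.pop(prior_did)
    handle_protocol(message)


peers = {"did:example:old": {"label": "Alice"}}
handled = []
process_inbound(
    {"type": "https://example.com/protocols/chat/1.0/message", "from": "did:example:new",
     "from_prior": "<token>", "body": {}},
    peers,
    verify_from_prior=lambda token: {"iss": "did:example:old", "sub": "did:example:new"},
    handle_protocol=handled.append,
)
```

Because rotation is applied before dispatch, the protocol handler only ever sees the relationship under its current DID.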
§ No technical difference between Ephemeral Mode and Full Mode
Ephemeral mode in V1 was a method of passing messages without first performing an exchange of DIDs. Given that we no longer have a need to perform an exchange of DIDs prior to passing messages of another protocol, we no longer need to designate a mode for ephemeral interactions.
§ Use Peer DIDs (or other suitable DID method) in place of AnonCrypt
AnonCrypt was a method present in V1 that allowed a message to be encrypted to a recipient using ephemeral sender keys, allowing the sender to remain anonymous. The ease of using Peer DIDs allows the sender to remain anonymous using the existing authenticated encryption method. The encryption properties are the same between the methods. Eliminating this option makes the spec simpler without losing any features.
§ Message Level Decorators now represented as Headers
The adjusted structure of DIDComm messages now represents message level decorators as message headers. An example includes