I have reviewed this document as part of the security directorate's ongoing effort to review all IETF documents being processed by the IESG. These comments were written primarily for the benefit of the security area directors. Document editors and WG chairs should treat these comments just like any other last call comments. The document updates RFC 3920, which is the definition of the core XMPP real-time messaging protocol. It should be noted the instant messaging and presence are layered on top of this protocol, and specified separately. General The document is initially intimidating, because of its length. But it is extremely well written (for which I would like to thank the editor) and well organized. So overall, a good read. I have not found anything that I consider a glaring security hole. But this is a layered security architecture (application layer == XMPP core == SASL == TLS) which is not easy to do right. Hence the large number of comments and questions below. I also appreciate the open discussion of the existing implementations and their security issues (e.g. "server dialback", shiver). I hope this document results in a security improvement in real deployments. Detailed Comments Note: these comments are based on rev -17 of the draft. This only matters as far as section numbers, with the only security-relevant change in -18 being a useful note on end-to-end protection. - 1.3: Implementation note: I suggest adding something like "Solutions specified in this document offer a significantly better level of security." - 3.3: why do we not recommend to use TLS (stateless) session resumption for reconnection? - 4.2.5 the access control rule at the bottom of this section makes sense. Why why does it explicitly not apply to elements other than XML stanzas? Are there cases where you negotiate TLS with someone other than the server? You can even imagine weird tunneling attacks using this "feature". - 4.2.5: "other than itself" - am I really allowed to send "message" stanzas from one resource of a JID to another resource of the same JID, before feature negotiation and before TLS negotiation? - 4.2.6: why do we even allow non-SASL protected server-to-server communication? - 4.3: How is TLS negotiated for the additional streams? How is it bound to the SASL negotiation that (apparently) only takes place once? - 4.4: this section appears to tie the two streams in the opposite directions together - when you close one you expect the other guy to close the other ASAP. But what is the behavior for "multiple streams over multiple connections" (Sec. 4.3)? - 4.4: what about orderly tear-down of the TLS association ("closure alert")? - 4.8.3.12: does not-authorized only refer to stream-level, rather than stanza-level errors? Are there cases when I am authorized to send some stanza types but not others? - 4.8.3.14 remote-connection-failed has dubious security benefit (why tell the world that your RADIUS server is down), compared to reusing internal-error. - 4.8.3.16: shouldn't we say that "reset" (when the stream is encrypted) also applies to the higher layers, i.e. encryption and authentication should be performed again? - 4.8.3.18: what are the security implications of a "redirect"? Should the client apply the same policy, e.g. for using TLS, as for the original server? Which "to" identity to use? Can redirection occur before the recipient is even authenticated? - 4.9: Don't we resend the "stream" header again after completing the TLS negotiation (Sec. 4.2.3). - 5.3.5: the text mentions that client certificates are "sufficiently rare", which is a pity because they do make sense for server-to-server interaction. So I suggest to promote renegotiation from OPTIONAL to RECOMMENDED. - 5.3.6: extensions may be out of scope. But I think we need to include a few words re: the TLS version, at least to prohibit SSL 3.0. - 5.4.1: don't you have to declare version >= 1.0 if you simply support this draft? Same question in 6.4.1. - 5.4.3.1: why is the initiator required to present a certificate "So that mutual authentication will be possible"? There are many other ways of ensuring mutual auth. I suggest to reword as "mutual certificate authentication". - Global replace TLS "cipher" -> "ciphersuite". - 5.4.3.1, bullet 5: actually, based on the "from" attribute that the receiving entity has just sent. - 5.4.3.3: where do we say that both sides must validate the "to" and "from" identities in view of the identities presented at the TLS layer (if any)? - 6.3.2: do you restart the stream twice, once after TLS and once after SASL?? - 6.3.3: shouldn't we simply "MUST NOT" the PLAIN mechanism? - 6.3.7: and what is the identity used for server-to-server auth? Also, it is very uncommon to consider the password as part of the identity. - 6.4.4: isn't it inconsistent that SASL aborts are converted into XMPP failures, but TLS failures are not? - 7.7.2.2: the preferred option #1, while reasonable in itself, does not allow the client to determine its own policy (whether it wants multiple sessions from multiple devices or not). - 8.1.1.2: any validated domain -> any validated subdomain. - 8.3.1, rule #2: if the error is a result of something gone bad with the addresses, then simply swapping "to" and "from" may not be appropriate, and may even be a security issue. - Same, rule #6, the recipient MUST NOT include the original XML if it's not well formed, right? - 8.3.2: MUST NOT... be presented to the human user - this is impossible to enforce, and most likely will not be followed. Moreover, there's a reason why we include a language tag. I suggest to tone it down a bit. - 8.3.3.11: what are "improper credentials"? It sounds like we are not differentiating the conditions of authentication failure vs. authorization failure. - 8.3.3.14: the "security note" doesn't makes sense to me - no matter what error code is returned, the sender has gained information that we don't want to provide. The note should say something along the lines of: such services must not be available to entities that cannot be trusted with knowing the status of an arbitrary recipient. See also 8.3.3.20. - 8.3.3.15: Are there no security implications to redirection at the stanza level? - 8.3.3.18: The "wait" error type might not be appropriate for some situations, e.g. authentication issues. - 8.3.3.21: do you explain anywhere what is the difference between "prior registration" and "prior subscription"? - 9.1.1, step 7: this is utterly trivial, but please mention that the new "stream" element is sent over TLS. - 9.1.5: as I noted in Sec. 4.4, the TLS connection needs to be closed as well. - 10.2: "try many different resources", I suppose it is typically fewer than 10 per JID, which does not make the attack uninteresting. - 10.4.2: if the protocol supports routing, shouldn't we mention that there are cases where the IQ stanza will be forwarded to another server, not the one mentioned in the "To" header? And what about loop prevention? - 10.5.3.3: this is a strange rule. Does it mean that if I'm using a desktop (/desktop) and a mobile (/mobile), and I'm connected to the desktop, I cannot have messages targeted at my mobile be deferred (and stored somewhere) until I connect from that device? - 13.4: "for the ensuring" -> "for ensuring" (or: to ensure). - 13.4: the security note is confusing, because it is unclear whether it has any normative status. Otherwise, it is almost trivial: of course you can provide these guarantees by different means. - 13.4: the cipher suites are not just "nun-null". They MUST provide both confidentiality and integrity. - 13.6: the out-of-band trust chain rule may be practical for server-to-server connections, but probably not for large client deployments and when certificates are sometimes rolled over. - 13.7: terminology: "mutual authentication" describes authentication with client certs, just as much as it describes the server presenting a cert and the client authenticating with SASL. This also applies to 5.4.3. - 13.7.1.1: The word "issuer" is ambiguous. It usually refers to a person/organization, but here you use it to refer to the certificate itself. How about replacing the heading of the second list by: "the following rules apply to issuer certificates, used to sign XMPP end-entity certificates"? - 13.7.1.1: it's strange to specify the hash algorithm, but not the signature algorithm (RSA-512 anyone?) - 13.7.1.1: what are "access certificates"? Neither 5280 nor Google can help... And how can an issuer cert NOT be marked with the CA bit? - 13.7.1.1: and most important, is the "relying party" (e.g. the client) required to check all these rules and fail validation if any of them is not met? - 13.7.1.2.2: enabling entity -> enabling entities. - 13.7.1.4: "conjuction" misspelled. - 13.7.2: "An implementation MUST enable a human user to view information about the certification path." I'm afraid this is security theater, because 99.5% of your target population cannot understand this information. - 13.7.2.2.1: "MUST either validate" - incomplete sentence (and actually, at a sensitive point in the text). - 13.7.2.2.1, subcase #3: please mention that some servers will simply fail validation at this point, subject to their policy. I.e., some might insist on correct client certs. - 13.7.2.3: "periodically query OCSP" - is there any guidance about the period? Is a recommended period specified in the cert or communicated by the OCSP responder? - 13.7.2: a silly question, but anyway: where do you say that the client's cert is correlated with the client's JID, as it appears in the From line (when setting up the original stream and/or when setting the stream anew after TLS negotiation)? And vice versa, for the server cert and the client's To header. - 13.8: I suggest to add (here or elsewhere): "All password-based mechanisms are susceptible to password guessing attacks, and therefore the authenticator MUST implement common rate-limiting mitigations." - 13.8: "For both confidentiality and authentication with passwords" - here you don't specify a TLS ciphersuite. - 13.8: the note kind of implies that PLAIN is preferable to DIGEST-MD5, which is clearly not the case. - 13.9.4: why is SASL-PLAIN singled out here? Other mechanisms are susceptible to off-line password guessing when used without TLS confidentiality, which is not a trivial attack but still a significant risk. - 13.11: all three examples of "service unavailable" can be ruled out on an operational server. Are there no better examples? - 15: The sasl-whitespace "feature" is not really a feature, because you'd fail any interoperability if you send whitespace during the SASL phase, right? Similarly tls-whitespace. - 15: if we insist on PLAIN, I would expect a security-mti-auth-order feature, where the proposals are ordered right (i.e. with PLAIN at the end). - 15: why is confidentiality-only a MUST? Is it widely deployed? Is it required for backward compatibility? By the way, please expand the acronym MTI somewhere. - 15: and hey, there's no feature defined for TLS+SCRAM+? Isn't this the one we want/expect to be deployed? - 15: stanza-* features - if you have an XML schema, can't you just say that conformance to the schema is REQUIRED and eliminate all this stuff? Given that these are MUSTs, not SHOULDs, a formal specification is so much easier to validate. This does *not* contradict the fact that runtime validation is optional. - 15: I would expect one or a few features around validation of identities at the various layers, since we're spending much of the document on this issue. "tls-certs" is an important piece of that, but not the whole thing.