[Note this is a WG LC related review, not IETF LC.] Hello, I have been selected as the Routing Directorate reviewer for this draft. The Routing Directorate seeks to review all routing or routing-related drafts as they pass through IETF last call and IESG review, and sometimes on special request -- or WG Last call as was the case here . The purpose of the review is to provide assistance to the Routing ADs. For more information about the Routing Directorate, please see ​ http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir Although these comments are primarily for the use of the (chairs and) Routing ADs, it would be helpful if you could consider them along with any other Last Call comments that you receive, and strive to resolve them through discussion or by updating the draft. Document: draft-ietf-manet-dlep-14 Reviewer: Lou Berger Review Date: June 8 (later than requested due to scope of comments -- sorry) WG LC End Date: unknown Intended Status: Standards track Summary: While I think the document is pretty decent for the scope of the work, I do have concerns about this document and recommend that the WG Chairs/Routing ADs discuss these issues further with the authors. I'm also available as/if needed to discuss. Comments: I think the document shows significant good work and looks to be a useful protocol, although I'm not overly familiar in this space. That said, I have a number of serious concerns about the document, and its contents from a few of perspectives. These include basic protocol issues, underspecified details (which could lead to interoperability issues), and specification/editorial issues. I think the document / protocol can be modified to address the issues I raise below. Of course, it is up to the WG, chairs, and ADs to decide which comments to address and which to ignore. I don't expect that all comments will result in changes. Major Issues: - The length field of the generic data item (i.e., TLV) is only 8 bits. While 255 bytes (assuming that this is the unit of measure, which BTW isn't specified) is big enough today, allowing for larger will greatly simplify things when 255 isn't enough. -- We've run into this in RSVP and it's a real pain. - Version number is currently defined as a data item. This means a signal (i.e., message) needs to be potentially fully parsed to discover what version is being used. This precludes basic format changes to the protocol. Perhaps the Discovery and Init Signals should be special cased to include version in their formats. (And shorten version to 8 bits from 32, as mentioned below). - The document references, but does not define, 'in-session' and 'discovery' states. These either need to be formally defined or removed. BTW we had exactly the same issue with LMP (RFC4204) and ended up adding section 11 (FSMs) at a pretty late stage of the process. - TCP session management is not defined, nor is the relationship with TCP and DLEP sessions fully defined. For example: o Closing the TCP session is only mentioned in one place and in a way that is inconsistent with the expected protocol behavior (close TCP before ACK is received). o What happens when a DLEP session is terminated, can the TCP session be reused or must it be closed too? - There is no transaction model defined. For example, it's completely unclear if only one unacknowledged Signal allowed at a time, or perhaps just one per signal type is allowed, or perhaps there are no restrictions. This needs to be explicit. - What is the purpose of retries and timeouts over TCP? Retries aren't needed over TCPs and it's unclear whey they are being used. - The higher level implications of ACKs, over TCP, isn't really clear. It seems ACKs are defined for multiple purposes: reliable transport, transaction acknowledgment and transaction results. Of course the first isn't needed, and implications of the others should be clear. For example, in section 7.10, why would there be a retry when receiving a Destination Up ACK signal indicating an error? - There is no discussion on scaling considerations. Are there really none? For example, how often might be appropriate to issue/limit Peer Updates based to changes in link quality, or how to handle the case where a large number (all or most) of destinations go down. - There are 13 places where the protocol allows implementation to define their own 'heuristics'. Some of these seem unnecessary due to the TCP point raised above, but any that remain in the protocol should be fully specified to ensure predictable/consistent behavior from implementations. - Data Items are defined for "Extensions" and "Experimental Definition" (Sections 8.7 and 8.8). Both seem to support for optional mechanisms, but the former uses assigned numeric values, why the latter uses UTF-8 strings. o What, if any, is the intended distinction/relationship between these? o How does an "Experimental Definition" become standardized? - Sections 8.19 and 8.20 define "Resources" related Data Items. The definition related to these basically says a resources is "An 8-bit integer percentage, 0-100, representing the amount of resources allocated to receiving|transmitting data.". If I were implementing this protocol, I'd have no idea how to produce, update or use this information. I think there is some missing informative and normative (RFC 2119) text related to these formats. - Sections 8.21 and 8.22 (Relative Link Quality) have a similar problem of being under described, in particular it's unclear if there's a meaningful, non-proprietary definition for link quality that an implementation is to act on or if the passed value is just passed for as monitoring information. Either way, this needs to be clarified. - Section 9 defines a "credit-windowing scheme analogous to the one documented in [RFC5578]". It describes how credits are exchanged, but it provides zero definition on the implications or use of credits relative to the data plane. - Multiple ways to implement the same function are allowed, e.g., optional presence of Status, Interval and TCP port. Generally allowing such complicates testing and leads to interoperability issues. The document should pick one way and require it. - The document doesn't state if there are any ordering requirements on data items. It should be explicit on this, e.g., there are no ordering requirements on the placement of Data Items within Signals. - The required and optional data items that are permitted on a signal isn't always clear. For example are 0/1/N copies of a particular Data Item required/allowed. Using something like ABNF would really help formalize and clarify this. - The document doesn't clearly delineate from informative/narrative text, normative / required processing procedures, and message formats. This by itself is not necessarily a major issue, it just makes it harder to (write,) review and implement the protocol. What is a major issue is that this approach allows for duplicate (and sometimes contradictory) normative procedures and for omissions in procedures (particularly related to exception/error processing). Specific examples are included above and below. It would be best to ensure that each required processing behavior is defined just once and in a consistent way. - The security consideration section is inadequate. This section should address the security of the DLEP protocol, not user traffic. It should include an analysis of risks/threats/possible exploits and how these are mitigated by the protocol. rfc6952, and the protocols it references can serve as examples. Minor Issues: - The data and signal type fields are both 8 bits. This seems pretty small, particularly the data type field. Given this is a control protocol, I think a larger (at least data type) field would provide better "future proofing". - 2^32 versions are currently allowed (section 8.1). This seems a bit excessive. I'd opt for max of 8 bits here myself. - It's probably too late, but it probably would be cleaner to have a generic ack signal rather than a per signal type ack. I mention this here as this may come up again when clarifying the transaction model (as mentioned above.) - Section 2: Assumptions This section includes informative and normative text so is more than just Assumptions. Personally, I'd remove all normative text from the section. - There are no specific rules related to UDP header formation. - Sections 8.10->8.17. Isn't add/drop indicator needed for subnets in destination update signals? - The IANA Considerations sections must follow​ RFC2360. - New registries must include initial values, which are defined in the document. (The document currently has many TBDs that should be replaced.) - New registries need an allocation policy, e.g.: The registry should be established with registration policies of "Standards Action" (for Standards Track documents) and “Specification Required" (for other documents). The designated expert is any current WG chair. Nits: - The document introduces the terms "signals" and "data items" for what is commonly called "messages" and "TLVs" (or objects) in other protocols. It's probably too late to change this, but I think the introduction of unique terminology is counter productive. - Use of RFC 2119 conformance language is a bit rough, and there are words in all caps that are not defined in RFC2119. Take a look at http://trac.tools.ietf.org/wg/teas/trac/wiki/PSGuideline for some suggestions. - Internal socket operation is mentioned a couple of times. It really shouldn't be, the spec should define behavior on the wire. - The Length fields are missing unit of measure (presumably octets) - The Mnemonics are used basically once and don't really add value, suggest dropping them. - How/when is the "Unknown Signal" Status Code sent? - Section 8.7: Extension List should be shown as a variable length field. - Section 8.8: Experiment List should be shown as a variable length field. That's it -- for now -- hopefully I didn't miss anything. Look forward to hearing response to the above (and how I got things hopelessly wrong ;-) Lou