Below is my review of draft-ietf-sfc-multi-layer-oam as part of the routing area directorate on behalf of the ADs. I have some high-level concerns and a detailed review that follows.

MAJOR: Fate Sharing is listed as a SHOULD; is it perhaps a MUST? Does OAM work for SFC if it doesn't follow the same chain as traffic?

MAJOR: The document does not list Performance Measurement as a requirement, though it is mentioned in the introduction and in RFC8924.

MAJOR: The document claims the SFC echo request/reply resolves all 8 requirements without stating precisely how. I don't doubt that it does, but I would benefit from a clear description of how.

MAJOR: The Security Considerations look relatively complete, but the Security Considerations of RFC7665 should be re-evaluated in this specification to identify areas of additional attack or exposure with echo req/rep and the tools that may be used within the limited NSH domain, or outside it, to diagnose failures.

MINOR: The Operational Considerations section seems light. For an OAM specification I was expecting more detail on the "Operation" side of this specification.

NIT: The use of forward references requires a reader/implementer to place a lot on their stack to follow the references.

Below is a more detailed review with the following format:

```
quoted text
```
** followed by my comments/questions.

```
This document defines how active Operation, Administration and
Maintenance (OAM), per [RFC7799] definition of active OAM, is
identified when Network Service Header (NSH) [RFC8300] is used as
the SFC encapsulation.
```
** Does this document define how OAM is identified? Perhaps the right word was implemented?

```
Active OAM tools, conformant to the requirements listed in
Section 3, improve, for example, troubleshooting efficiency and
defect localization in SFP because they specifically address the
architectural principles of NSH.
```
** Should "conformant to the requirements in Section 3" be "conformant to this specification"?
I don't see how tools can conform to the requirements.
** Do the tools address "the architectural principles of NSH"? If so, what are the principles that need addressing and how are they addressed?

```
Active OAM tools, conformant to the requirements listed in
Section 3, improve, for example, troubleshooting efficiency and
defect localization in SFP because they specifically address the
architectural principles of NSH. For that purpose, SFC Echo Request
and Echo Reply are specified in Section 6. This mechanism enables
on-demand Continuity Check and Connectivity Verification among
other operations over SFC in networks addresses functionalities
discussed in Sections 4.1, 4.2, and 4.3 of [RFC8924]. SFC Echo
Request and Echo Reply, defined in this document,
```
** s/This mechanism/These mechanisms/
** s/defined in this document// - It's just been said where they are defined

```
Following are the requirements for an FM SFC OAM, whether with the
E2E or segment scope:

REQ#1: Packets of active SFC OAM SHOULD be fate sharing with the
monitored SFC data in the forward direction from ingress toward
egress endpoint(s) of the OAM test.
```
** Since this is a SHOULD, what is the consequence of not doing this?

```
1. Active SFC OAM Header

As demonstrated in Section 4 [RFC8924] and Section 3 of this
document, SFC OAM is required to perform multiple tasks. Several
```
** Requirements are stated a few lines up, no need for a reminder.
** s/As demonstrated in Section 4 [RFC8924] and Section 3 of this document, //

```
active OAM protocols could be used to address all the requirements.
When IP/UDP encapsulation of an SFC OAM control message is used,
protocols can be demultiplexed using the destination UDP port
number. But extra IP/UDP headers, especially in an IPv6 network,
add noticeable overhead. This document defines Active OAM Header
(Figure 2) to demultiplex active OAM protocols on an SFC.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| V | Msg Type  |     Flags     |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                 SFC Active OAM Control Packet                 ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 2: SFC Active OAM Header

V - two-bit-long field indicates the current version of the SFC
active OAM header. The current value is 0. The version number is
to be incremented whenever a change is made that affects the
ability of an implementation to parse or process the SFC Active OAM
header correctly. For example, if syntactic or semantic changes
are made to any of the fixed fields.

Msg Type - six bits long field identifies OAM protocol, e.g., Echo
Request/Reply.

Flags - eight bits long field carries bit flags that define
optional capability and thus processing of the SFC active OAM
control packet, e.g., optional timestamping. No flags are defined
in this document, and therefore, the bit flags MUST be zeroed on
transmission and ignored on receipt.

Length - two octets long field that is the length of the SFC active
OAM control packet in octets.
```
** Consistency in header field descriptions is lacking throughout the doc.
** s/V - two-bit-long field/V - two bit field/
** s/Msg Type - six bits long field/Msg Type - six bit field/
** etc. for all other definitions.

```
6. Echo Request/Echo Reply for SFC

Echo Request/Reply is a well-known active OAM mechanism extensively
used to verify a path's continuity, detect inconsistencies between
a state in control and the data planes, and localize defects in the
data plane. ICMP ([RFC0792] for IPv4 and [RFC4443] for IPv6
networks) and [RFC8029] are examples of broadly used active OAM
protocols based on the Echo Request/Reply principle. The SFC Echo
Request/Reply defined in this document conforms to REQ#1
(Section 3) by using the NSH encapsulation of the monitored service.
Further, the mechanism addresses requirements REQ#2 through REQ#7,
listed in Section 3. Specifically, it can be used to check the
continuity of an SFP, trace an SFP, or localize the failure within
an SFP. Also, note that REQ#8 can be addressed by an extension of
the SFC Echo Request/Reply described in this document adding proxy
capability. The SFC Echo Request/Reply control message format is
presented in Figure 3.
```
** How does this echo request/reply "conform to REQ#1"? Does this mean "satisfies REQ#1"?
** There is no justification of how REQ#2-7 are satisfied. Some expansion is needed.
** Are back references to section 3 really needed? One presumably just read section 3 and the requirements.

```
The interpretation of the fields is as follows:

Version (V) is a two-bit field that indicates the current version
of the SFC Echo Request/Reply. The current value is 0. The
version number is to be incremented whenever a change is made that
affects the ability of an implementation to parse or process the
control packet correctly. If a packet presumed to carry an SFC
Echo Request/Reply is received at an SFF, and the SFF does not
understand the Version field value, the packet MUST be discarded,
and the event SHOULD be logged.

Reserved - fourteen-bit field. It MUST be zeroed on transmission
and ignored on receipt.

The Echo Request Flags is a two-octet bit vector field. A flag
defined in the Flags field of the SFC Active OAM header in Figure 2
has no implication for those defined in the Echo Request Flags
field of an Echo Request/Reply message.

The Message Type is a one-octet field that reflects the packet
type. Value 1 identifies Echo Request and 2 - Echo Reply.

The Reply Mode is a one-octet field. It defines the type of the
return path requested by the sender of the Echo Request.

Return Codes and Subcodes are one-octet fields each. These can be
used to inform the sender about the result of processing its
request. Return Code values are provided in Table 1.
For all Return Code values defined in this document, the value of
the Return Subcode field MUST be set to zero.
```
** Consistent type descriptions like "(field) - x bit field (description)" are needed throughout the document

```
6.3.1. Source TLV

The responder to the SFC Echo Request encapsulates the SFC Echo
Reply message in IP/UDP packet if the Reply mode is "Reply via an
IPv4/IPv6 UDP Packet". Because the NSH does not identify the
ingress node that generated the Echo Request, the source ID MUST be
included in the message and used as the IP destination address and
destination UDP port number of the SFC Echo Reply. The sender of
the SFC Echo Request MUST include an SFC Source TLV (Figure 5).

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Source ID   |   Reserved1   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Port Number          |           Reserved2           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          IP Address                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 5: SFC Source TLV

where

Source ID Type is a one-octet field and has the value of 1
Section 10.4.

Reserved1 - one-octet field. The field MUST be zeroed on
transmission and ignored on receipt.

Length is a two-octet field, and the value equals the length of the
data following the Length field counted in octets. The value of
the Length field can be 8 or 20. If the value of the field is
neither, the Source TLV is considered to be malformed.

Port Number is a two-octet field. It contains the UDP port number
of the sender of the SFC OAM control message. The value of the
field MUST be used as the destination UDP port number in the IP/UDP
encapsulation of the SFC Echo Reply message.

Reserved2 is a two-octet field. The field MUST be zeroed on
transmit and ignored on receipt.

IP Address field contains the IP address of the sender of the SFC
OAM control message, IPv4 or IPv6.
The value of the field MUST be used as the destination IP address
in the IP/UDP encapsulation of the SFC Echo Reply message.
```
** The TLV format of "Type" "Reserved" "Length" should not be redefined for each TLV type. The authors should provide values for Type and Length like...
- Type: Source ID (1)
- Length: The length of the variable length value in octets.
- Port Number: ...
- Reserved2: MBZ
- IP Address: ...
** Why is the IP address type assumed IPv4 or v6 based on length? Why not use 4 bits for the IP version - there is space in Reserved2.
** Try to avoid processing descriptions within the field definitions of TLVs. For example, port number "MUST be used as the destination UDP port..." This is better placed in the echo request/reply processing section.

```
* Reply via an IPv4/IPv6 UDP Packet (2). This likely will be the
  most often used value.
```
** Why is the assumption of how often the reply via UDP is used relevant? Again in section 6.5 it's referred to as a 'default value'. Is this default value relevant? Are there requirements that implementations have specific configurable and default parameters?

```
6.4. SFC Echo Request Reception

Punting a received SFC Echo Request to the control plane is
triggered by one of the following packet processing exceptions: NSH
TTL expiration, NSH Service Index (SI) expiration, or the receiver
is the terminal SFF for an SFP.

An SFF that received the SFC Echo Request MUST validate the packet
as follows:

1.  If the SFC Echo Request is integrity-protected, the receiving
    SFF first MUST verify the authentication.

2.  Validate the Source TLV, as defined in Section 6.3.1.

3.  Suppose the authentication validation has failed and the Source
    TLV is considered properly formatted. In that case, the SFF
    MUST send to the system identified in the Source TLV (see
    Section 6.5), according to a rate-limit control mechanism, an
    SFC Echo Reply with the Return Code set to "Authentication
    failed" and the Subcode set to zero.

4.  If the Source TLV is determined malformed, the received SFC
    Echo Request processing is stopped, the message is dropped, and
    the event SHOULD be logged, according to a rate-limiting
    control for logging.

5.  If the authentication is validated successfully, the SFF that
    has received an SFC Echo Request verifies the rest of the
    packet's general sanity.

6.  If the packet is not well-formed, the receiver SFF SHOULD send
    an SFC Echo Reply with the Return Code set to "Malformed Echo
    Request received" and the Subcode set to zero under the control
    of the rate-limiting mechanism to the system identified in the
    Source TLV (see Section 6.5).

7.  If there are any TLVs that the SFF does not understand, the SFF
    MUST send an SFC Echo Reply with the Return Code set to 2 ("One
    or more TLVs was not understood") and set the Subcode to zero.
    Also, the SFF MAY include an Errored TLVs TLV (Section 6.4.1)
    that, as sub-TLVs, contains only the misunderstood TLVs.

8.  Sender's Handle and Sequence Number fields are not examined but
    are copied in the SFC Echo Reply message.

9.  If the sanity check of the received Echo Request succeeded,
    then the SFF at the end of the SFP MUST set the Return Code
    value to 5 ("End of the SFP") and the Subcode set to zero.

10. If the SFF is not at the end of the SFP and the TTL value is 1,
    the value of the Return Code MUST be set to 4 ("TTL Exceeded")
    and the Subcode set to zero.

11. In all other cases, SFF MUST set the Return Code value to 0
    ("No Return Code") and the Subcode set to zero.
```
** This section was confusing. Is it describing Reception, Processing or validation in the "Validation" steps?
- 3 and 5 appear to be sub-bullets of 1,
- 4 appears to be a sub-bullet of 2,
- 6 doesn't provide a definition of "well-formed" so it's not clear how that's checked.
- 8 does not appear to be relevant to validation and can be removed.
- 9 does not specify what 'sanity check' may have failed.
- 10 does not specify what TTL value (NSH TTL?).
- 11 appears to instruct SFFs to set the return code to 0 "in all other cases". Is this an echo reply step?

```
6.5. SFC Echo Reply Transmission

The "Reply Mode" field directs whether and how the Echo Reply
message should be sent. The Echo Request sender MAY use TLVs to
request that the corresponding Echo Reply be transmitted over the
specified path. Section 6.5.1 provides an example of a TLV that
specifies the return path of the Echo Reply. Value 1 is the "Do
not reply" mode and suppresses the Echo Reply packet transmission.
The default value (2) for the Reply mode field requests sending the
Echo Reply packet out-of-band as an IPv4 or IPv6 UDP packet.
```
** Should this section not be an exhaustive description of when and how to send echo reply messages to an echo request? For example "Theory of Operation" doesn't mention the Source TLV but it had a requirement on replies "The value of the field MUST be used as the destination IP address in the IP/UDP encapsulation of the SFC Echo Reply message."
** The "6.5.1 Reply Service Function Path TLV" seems out of place here; is it a TLV sent in a request or reply?

```
The destination SFF of the SFP being tested or the SFF at which SFC
TTL expired (as per [RFC8300]) may be sending the Echo Reply is
referred to as responding SFF. The processing described below
equally applies to both cases.
```
** "may be sending the Echo Reply" does not compute; is it a typo? Remove it?

```
6.5.4. Tracing an SFP

SFC Echo Request/Reply can be used to isolate a defect detected in
the SFP and trace an RSP. As with ICMP echo request/reply
[RFC0792] and MPLS echo request/reply [RFC8029], this mode is
referred to as "traceroute". In the traceroute mode, the sender
transmits a sequence of SFC Echo Request messages starting with the
NSH TTL value set to 1 and is incremented by 1 in each next Echo
Request packet. The sender stops transmitting SFC Echo Request
packets when the Return Code in the received Echo Reply equals 5
("End of the SFP").
```
** What does an implementation do when TTL wraps?

```
6.6. Verification of the SFP Consistency

The consistency of an SFP can be verified by comparing the view of
the SFP from the control or management plane with information
collected from traversing by an SFC NSH Echo Request message.
Every SFF that receives a Consistency Verification Request (CVReq)
(specified in Section 6.6.1) MUST perform the following actions:

* Collect information about the SFs traversed by the CVReq packet
  and send it to the ingress SFF as CVRep packet over IP network;

* Forward the CVReq to the next downstream SFF if the one exists.

As a result, the ingress SFF collects information about all
traversed SFFs and SFs, information on the actual path the CVReq
packet has traveled. That information can be used to verify the
SFC's path consistency. The mechanism for the SFP consistency
verification is outside the scope of this document.

6.6.1. SFP Consistency Verification packet

For the verification of an SFP consistency, two types of SFC Active
OAM messages are defined in addition to the SFC Echo Request/Reply
messages. Their SFC Echo Request/Echo Response Message Types are
as follows:

* 3 - SFP Consistency Verification Request

* 4 - SFP Consistency Verification Reply

Upon receiving the CVReq, the SFF MUST respond with the Consistency
Verification Reply (CVRep). The SFF MUST include the SFs
information, as described in Section 6.6.3 and Section 6.6.2.
```
** 6.6 specified "Every SFF that receives a Consistency Verification Request (CVReq)" and 6.6.1 just describes those as type 3 (and 4 for reply). The information appears duplicated; am I reading this correctly? Why is 6.6.1 a separate section and type 3 and 4 defined in 6.6?

```
6.6.2. SFF Information Record TLV

For the received CVReq, an SFF is expected to include in the CVRep
message the information about SFs that are available from that SFF
instance for the specified SFP.
The SFF MUST include SFF Information Record TLV (Figure 9) in CVRep
message.
```
** "expected to include the information about SFs" or "MUST include SFF Information Record TLV"? Is it a MUST or just expected?
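
** To make the earlier Source TLV comments (6.3.1) more concrete, here is a minimal Python sketch of how a receiver might parse the layout quoted from Figure 5. It is not taken from the draft: the names SourceTlv and parse_source_tlv are hypothetical, and it assumes the TLV starts with a one-octet Type of 1 followed by Reserved1 and a two-octet Length, as the quoted text describes. It illustrates that the only way to tell IPv4 from IPv6 is the Length value (8 or 20), which is what prompted the question about an explicit IP version field.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import Union

SOURCE_ID_TYPE = 1  # Type value as quoted from the draft; treated as an assumption here


@dataclass
class SourceTlv:
    """Hypothetical container for the two values a responder needs."""
    port: int
    address: Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_source_tlv(buf: bytes) -> SourceTlv:
    """Parse Type(1) | Reserved1(1) | Length(2) | Port(2) | Reserved2(2) | IP(4 or 16)."""
    if len(buf) < 4:
        raise ValueError("truncated TLV header")
    tlv_type, _reserved1, length = struct.unpack("!BBH", buf[:4])
    if tlv_type != SOURCE_ID_TYPE:
        raise ValueError("not a Source TLV")
    # Length counts the octets after the Length field: 2 + 2 + 4 or 2 + 2 + 16.
    if length not in (8, 20):
        raise ValueError("malformed Source TLV: Length must be 8 or 20")
    if len(buf) < 4 + length:
        raise ValueError("truncated TLV value")
    # Reserved fields are ignored on receipt, as the quoted text requires.
    port, _reserved2 = struct.unpack("!HH", buf[4:8])
    # The address family is inferred purely from the number of remaining octets.
    address = ipaddress.ip_address(buf[8:4 + length])
    return SourceTlv(port=port, address=address)
```

With an explicit IP version field in Reserved2, the receiver could cross-check the version against Length rather than rely on Length alone.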