Hello,

I have been selected as the Routing Directorate reviewer for this draft. The Routing Directorate seeks to review all routing or routing-related drafts as they pass through IETF last call and IESG review, and sometimes on special request. The purpose of the review is to provide assistance to the Routing ADs. For more information about the Routing Directorate, please see https://wiki.ietf.org/en/group/rtg/RtgDir

Although these comments are primarily for the use of the Routing ADs, it would be helpful if you could consider them along with any other IETF Last Call comments that you receive, and strive to resolve them through discussion or by updating the draft.

Document: draft-ietf-bier-idr-extensions-10
Reviewer: Ketan Talaulikar
Review Date: 12 Feb 2024
IETF LC End Date: n/a
Intended Status: Standards Track

Result: Not Ready - Has Issues

Summary of Major Issues:

a) The document does not specify major aspects expected of a BGP specification, such as: the AFI/SAFI to which the BIER attribute applies, fault management (covering error handling and its impact on route propagation/programming), and the implications of a BGP NH update on processing.

b) The document neither describes nor provides pointers to the applicability/use of BIER with BGP in Large Scale DC networks running BGP as the only routing protocol.

Detailed Review (provided below with major/minor/nits tagging in the IDnits output):

89   1. Introduction

[minor] IDnits is reporting a few errors/warnings that need to be fixed.

91   Bit Index Explicit Replication (BIER) [RFC8279] is a new multicast
92   forwarding architecture which doesn't require an explicit tree-
93   building protocol and doesn't require intermediate routers to
94   maintain any multicast state.
BIER is applicable in a multi-tenant
95   data center network environment for efficient delivery of Broadcast,
96   Unknown-unicast and Multicast (BUM) traffic while eliminating the

[major] I am assuming this refers to an overlay service, but the document seems to hint at (but not explicitly state) underlay signaling. Can you please clarify?

97   need for maintaining a huge amount of multicast state in the
98   underlay. This document describes BGP extensions for advertising the
99   BIER-specific information. More specifically, in this document, we

[major] Please provide a pointer to the document which describes how BIER (MPLS and non-MPLS) works with BGP as the control plane protocol. The mechanisms for link-state IGP based flooding and for hop-by-hop BGP propagation differ, but I am unable to find any discussion about the implications of the same (if any) on BIER. My impression (and I could be wrong) is that BGP is primarily being used simply to "distribute" this BIER info across all routers in the BIER domain and that it does not actually need to install anything in the forwarding or compute BIER paths - that is likely done by the BIER module. If so, this should be explained briefly, as it has implications for the BGP machinery.

100  define a new optional, non- transitive BGP attribute, referred to as
101  the BIER attribute, to convey the BIER-specific information such as
102  BIER Forwarding Router identifier (BFR-id), BitString Length (BSL)
103  and so on. In addition, this document specifies procedures to
104  prevent the BIER attribute from "leaking out" of a BIER domain.

106  These extensions are applicable in those multi-tenant data centers
107  where BGP instead of IGP is used as an underlay [RFC7938]. These

[major] RFC7938 has no mention of multicast or MPLS. It would help to add a reference describing how BIER works in a BGP-only DC.
108  extensions may also be applicable to other BGP based network
109  scenarios, e.g., as described in
110  [I-D.ietf-bier-multicast-as-a-service].

[major] Some clarity on the applicability of these extensions would help. The reference to RFC7938 perhaps indicates use as an underlay in a BGP-only DC, but there is no reference to the AFI/SAFI that these extensions are applicable for - perhaps AFI 1/2 and SAFI 2? Then there is the reference to the BIER MAAS draft, which has a lot more deployment scenarios that are quite involved. The BIER MAAS draft in turn uses the BGP extensions in this document but also uses TEA.

112  2. Terminology

114  This memo makes use of the terms defined in [RFC4271] and [RFC8279].

[minor] Please indicate which terms used in this document are defined in which RFC.

116  3. BIER Path Attribute

118  This draft defines a new optional, transitive BGP path attribute,

[minor] Any reason why this needs to be transitive? E.g., is it so that it gets propagated across routers that are not BIER capable? The reason is important since transitive attributes can escape depending on the AFI/SAFI.

119  referred to as the BIER attribute. This attribute can be attached to
120  a BGP UPDATE message by the originator so as to indicate the BIER-
121  specific information of a particular BFR which is identified by the
122  /32 or /128 address prefix contained in the NLRI. In other words, if

[major] What happens when the NLRI encodes something other than /32 for IPv4 or /128 for IPv6?

123  the BIER path attribute is present, the NLRI is treated by BIER as a
124  "BFR-prefix". When creating a BIER attribute, a BFR needs to include
125  one BIER TLV for every Sub-domain that it supports. The attribute

[major] Please define the "generic" TLV encoding for this new attribute. Also, should the length not be that of the value field? I would also suggest considering whether a 2-byte Type field is really necessary for this attribute - it seems more like something just picked from OSPF?
126  type code for the BIER Attribute is TBD. The value field of the BIER
127  Attribute contains one or more BIER TLV as shown in Figure 1.

[nit] Figure 1 is not labeled. The same goes for the other figures in this document.

129   0                   1                   2                   3
130   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
131  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
132  |           Type=TBD            |            Length             |
133  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
134  |  Sub-domain   |            BFR-ID             |   Reserved    |
135  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
136  ~                                                               ~
137  |                           Sub-TLVs                            |
138  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+..........................

140  Type: Two octets encoding the BIER TLV Type: TBD.

142  Length: Two octets encoding the length in octets of the TLV,
143  including the type and length fields. The length is encoded as an
144  unsigned binary integer. (Note that the minimum length is 8,
145  indicating that no sub-TLV is present.)

147  Sub-domain: a one-octet field encoding the sub-domain ID
148  corresponding to the BFR-ID.

150  BFR-ID: a two-octet field encoding the BFR-ID.

[major] RFC8444 sec 2.1 has text related to detection of duplicate BFR-IDs. How is that handled with BGP?

152  Sub-TLVs: contains one or more sub-TLV.

154  The BIER TLV MAY appear multiple times in the BIER Path Attribute,
155  one for each sub-domain. There MUST be no more than one BIER TLV
156  with the same Sub-domain value; if there is, the entire BIER Path
157  Attribute MUST be ignored.

[major] I am assuming this is "attribute discard" handling? In such a case, is the route still considered eligible for best-path selection? Can it be propagated further if selected as best-path and, if so, would that not break the BIER forwarding path?

159  A BIER TLV may have sub-TLVs, which may have their own sub-TLVs. All
160  those are referred to as sub-TLVs and share the same Type space,
161  regardless of the level.

[major] This is not very clear.
Looking at the IANA section, it seems like all TLVs/sub-TLVs of the BIER Attribute share the same TLV space?

163  3.1. BIER MPLS Encapsulation sub-TLV

165  The BIER MPLS Encapsulation sub-TLV matches the OSPFv2 "BIER MPLS
166  Encapsulation sub-TLV" as specified in Section 2.2 of [RFC8444]. It

[major] The sub-TLV doesn't match RFC8444. I don't know why it even needs to match, and I am not sure why the reference to the OSPF RFC is needed here.

167  MAY appear multiple times in the BIER TLV.

169  The following is copied verbatim from that section:

171  The BIER MPLS Encapsulation Sub-TLV has the following format:

173   0                   1                   2                   3
174   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
175  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
176  |             Type              |            Length             |
177  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
178  |    Max SI     |BS Len |                 Label                 |
179  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
180  ~                           sub-TLVs                            |
181  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

183  Type: TBD1 (To be assigned by IANA).

[minor] Perhaps the type can already be decided since it is in a new TLV space/registry? The IANA section seems to indicate a value.

185  Length: 4 or other values (depending on sub-TLVs)

[major] Here the length seems to be limited to the value portion (which is good), but this is not consistent with the top-level BIER TLV.

187  Max SI: A 1-octet field encoding the maximum Set Identifier (SI)
188  (see Section 1 of [RFC8279]) used in the encapsulation for this
189  BIER sub-domain for this BitString length.

191  BS Len (BitString Length): A 4-bit field encoding the supported
192  BitString length associated with this BFR-prefix. The values
193  allowed in this field are specified in Section 2 of [RFC8296].

195  Label: A 20-bit value representing the first label in the label range.

197  The "label range" is the set of labels beginning with the Label and
198  ending with (Label + (Max SI)).
A unique label range is allocated
199  for each BitString length and sub-domain-id. These labels are used
200  for BIER forwarding as described in [RFC8279] and [RFC8296].

202  The size of the label range is determined by the number of SIs
203  (Section 1 of [RFC8279]) that are used in the network. Each SI maps
204  to a single label in the label range: the first label is for SI=0,
205  the second label is for SI=1, etc.

207  If the label associated with the Maximum Set Identifier exceeds the
208  20-bit range, the BIER MPLS Encapsulation Sub-TLV containing the
209  error MUST be ignored.

[major] In general (as mentioned in a previous comment), there is a need for a section that describes BGP fault management and processing for all types of errors, with handling in terms of RFC7606.

211  If the same BitString length is repeated in multiple BIER MPLS
212  Encapsulation Sub-TLVs inside the same BIER TLV, all BIER MPLS
213  Encapsulation Sub-TLVs in the BIER TLV MUST be ignored.

215  Label ranges within all BIER MPLS Encapsulation Sub-TLVs advertised
216  by the same BFR MUST NOT overlap. If an overlap is detected, all
217  BIER MPLS Encapsulation Sub-TLVs advertised by the BFR MUST be ignored.

219  3.2. BIER Non-MPLS Encapsulation sub-TLV

221  Similar to the concept in [I-D.ietf-bier-lsr-non-mpls-extensions],
222  the BIER non-MPLS Encapsulation sub-TLV is used for non-MPLS
223  encapsulation. It matches the OSPFv2 BIER non-MPLS Encapsulation sub
224  TLV as specified in Section 3.2 of
225  [I-D.ietf-bier-lsr-non-mpls-extensions].

227  The following are copied verbatim from that section. Note to RFC
228  Editor: the following copied text must match the final text in the
229  RFC for [I-D.ietf-bier-lsr-non-mpls-extensions].

[major] I find it very strange that the RFC Editor is being asked to do this! Should this not be something that the authors do? The sub-TLV format does not map, nor does it need to. Things like error handling and its implications may be different in different protocols.
231  The non-MPLS Encapsulation Sub-TLV MAY appear multiple times within a
232  single BIER TLV. If the same BitString length is repeated in
233  multiple BIER non-MPLS encapsulation Sub-TLVs inside the same BIER
234  TLV, the BIER TLV MUST be ignored.

236   0                   1                   2                   3
237   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
238  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
239  |             Type              |            Length             |
240  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
241  |    Max SI     |BS LEN |                BIFT-id                |
242  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
243  ~                           sub-TLVs                            |
244  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

246  Type: TBD2 (To be assigned by IANA).

248  Length: 4 or other values (depending on sub-TLVs)

250  Max SI: A 1 octet field encoding the Maximum Set Identifier
251  (Section 1 of [RFC8279]) used in the encapsulation for this BIER
252  subdomain for this BitString length. The first BIFT-id is for SI=0,
253  the second BIFT-id is for SI=1, etc. If the BIFT-id associated with
254  the Maximum Set Identifier exceeds the 20-bit range, the sub-TLV
255  MUST be ignored.

257  BIFT-id: A 20-bit field representing the first BIFT-id in the BIFT-id
258  range.

260  BitString Length (BS Len): A 4 bit field encoding the
261  bitstring length (as per [RFC8296]) supported for the encapsulation.

263  The "BIFT-id range" is the set of 20-bit values beginning with the
264  BIFT-id and ending with (BIFT-id + (Max SI)). These BIFT-id's are
265  used for BIER forwarding as described in [RFC8279] and [RFC8296].

267  The size of the BIFT-id range is determined by the number of SI's
268  (Section 1 of [RFC8279]) that are used in the network. Each SI maps
269  to a single BIFT-id in the BIFT-id range: the first BIFT-id is for
270  SI=0, the second BIFT-id is for SI=1, etc.
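
For readers following the range arithmetic quoted in draft lines 263-270 (and the equivalent label range text in Section 3.1), the sketch below works through it in Python. It is only an illustration under stated assumptions: the function names (`id_range`, `exceeds_20_bits`, `ranges_overlap`) are hypothetical and appear in neither the draft nor the review, and the code makes no claim about how a BGP or BIER implementation would structure these checks.

```python
MAX_20_BIT = (1 << 20) - 1  # largest value that fits in a 20-bit Label/BIFT-id field


def id_range(first_id: int, max_si: int) -> range:
    """Values first_id .. first_id + max_si inclusive, one per SI (SI=0 first)."""
    return range(first_id, first_id + max_si + 1)


def exceeds_20_bits(first_id: int, max_si: int) -> bool:
    """True when the value mapped to the Maximum SI no longer fits in 20 bits."""
    return first_id + max_si > MAX_20_BIT


def ranges_overlap(a: range, b: range) -> bool:
    """True when two contiguous ranges share at least one value."""
    return a.start < b.stop and b.start < a.stop


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    print(list(id_range(1000, 3)))
    print(exceeds_20_bits(MAX_20_BIT, 1))
    print(ranges_overlap(id_range(10, 3), id_range(13, 2)))
```

What a receiver does once one of these checks fails (ignore the sub-TLV, the TLV, or the attribute) is exactly the fault-management question raised in the comments above, so the sketch deliberately stops at detection.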
272  If the BIFT-id associated with the Maximum Set Identifier exceeds
273  the 20-bit range, the BIER non-MPLS Encapsulation sub-TLV
274  containing the error MUST be ignored.

276  BIFT-id ranges within all the BIER non-MPLS Encapsulation sub-
277  TLVs advertised by the same BFR MUST NOT overlap. If an overlap is
278  detected, all the BIER non-MPLS Encapsulation sub-TLV advertised
279  by the BFR MUST be ignored. However the
280  BIFT-id ranges may overlap across different encapsulation types and
281  is allowed. As an example, the BIFT-id value in the non-MPLS
282  encapsulation sub-TLV may overlap with the Label value in the
283  Label range in BIER MPLS encapsulation sub-TLV.

285  3.3. BIER Nexthop sub-TLV

287   0                   1                   2                   3
288   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
289  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
290  |           Type=TBD3           |            Length             |
291  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
292  |                            Nexthop                            |
293  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

295  Type: TBD3 (To be assigned by IANA).

297  Length: 4 if the Nexthop is IPv4 address and 16 if the Nexthop is
298  IPv6 address

300  Nexthop: 4 or 16 octets of IPv4/IPv6 address

302  The BIER Nexthop sub-TLV MAY be included in the MPLS or non-MPLS
303  Encapsulation sub-TLV as well as in the top level BIER TLV.

[major] What are the semantics of this NH as opposed to the traditional BGP NH? What validation or reachability checks is BGP required to perform on it? Sections 4 and 5 cover this in some regards, but a description of its semantics in this section would be helpful.

305  4. Originating/Updating BIER Attribute

307  A BIER Forwarding Egress Router (BFER) MUST attach a BIER attribute
308  to its own BIER prefix NLRI. The BIER attribute MUST include one

[major] To be advertised in which AFI/SAFI?

309  BIER TLV for each BIER sub-domain that it supports.
Each BIER TLV
310  MUST include an MPLS and/or non-MPLS Encapsulation sub-TLV, and
311  SHOULD include a BIER Nexthop sub-TLV with the Nexthop set to the
312  BIER prefix. If the BIER Nexthop sub-TLV is not included, the BIER
313  prefix will be used by receiving BFRs as the BIER nexthop when
314  calculating BIFT.

316  A BFR/BFER MAY attach a BIER proxy range sub-TLV
317  [I-D.ietf-bier-prefix-redistribute] in the BIER TLV. In this case it
318  MUST attach a BIER attribute to its own BIER prefix NLRIs. Other
319  than this case, a BFR that is not a BFER (i.e., its BFR-ID is 0)
320  SHOULD NOT attach a BIER attribute to its own BIER prefix NLRIs (if a
321  BIER attribute is attached it will not get used anyway).

[major] The above para does not seem to belong in this document and perhaps should instead be covered in draft-ietf-bier-prefix-redistribute? Or keep it here and add a normative (blocking) reference to draft-ietf-bier-prefix-redistribute?

323  When a BFR re-advertises a BGP NLRI with a BIER attribute, it SHOULD
324  set/update the BIER Nexthop sub-TLV to use its own BIER prefix, in

[major] Re-advertises with the BGP nexthop set, with the nexthop unchanged, or in both cases? What are the implications of a BGP RR being used?

325  which case it MUST replace the MPLS or non-MPLS Encapsulation sub-TLV
326  with its own, i.e., as if the BFR is attaching the encapsulation sub-
327  TLV for its own BIER prefix. If it does not update the BIER Nexthop
328  sub-TLVs, it MUST NOT update MPLS or non-MPLS Encapsulation sub-TLV.

[major] There is no section describing the "receiving" of the BIER Attribute and how the information therein is processed. Section 5 seems to cover the BIER forwarding entry calculation part, but it is not clear which parts are done by BGP and which parts by something like a BIER module. Since similar info is also flooded via IGPs, it would help if only the BGP specifics are covered in this document, with pointers to other BIER documents for the "common" parts?
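
The coupling stated in draft lines 323-328 (the Nexthop and the Encapsulation sub-TLVs change together or not at all) can be sketched as follows. This is a minimal illustration, not the draft's procedure: `ReceivedBierTlv` and `readvertise` are hypothetical names, the encapsulation sub-TLVs are represented as opaque strings, and the partial-BSL handling described in draft lines 330-340 is deliberately left out.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ReceivedBierTlv:
    """Hypothetical in-memory view of one BIER TLV (not a wire format)."""
    sub_domain: int
    nexthop: Optional[str]    # value of the top-level BIER Nexthop sub-TLV, if any
    encaps: Tuple[str, ...]   # opaque stand-ins for the Encapsulation sub-TLVs


def readvertise(received: ReceivedBierTlv, own_prefix: str,
                own_encaps: Tuple[str, ...], update_nexthop: bool) -> ReceivedBierTlv:
    """Apply the all-or-nothing rule when re-advertising a BIER TLV."""
    if update_nexthop:
        # Nexthop becomes this BFR's prefix, so the encapsulation must be its own too.
        return replace(received, nexthop=own_prefix, encaps=own_encaps)
    # Nexthop untouched, so the encapsulation sub-TLVs must be left untouched as well.
    return received


if __name__ == "__main__":
    # Addresses from the documentation range, used purely as examples.
    tlv = ReceivedBierTlv(sub_domain=0, nexthop="192.0.2.1", encaps=("encap-A",))
    print(readvertise(tlv, "192.0.2.9", ("encap-B",), update_nexthop=True))
    print(readvertise(tlv, "192.0.2.9", ("encap-B",), update_nexthop=False))
```

The sketch says nothing about when `update_nexthop` should be true, which is the open question in the comment on draft line 324 about BGP nexthop changes and route reflectors.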
330  It's possible that the BFR supports some but not all BSLs in the
331  received MPLS or non-MPLS Encapsulation sub-TLVs. After updating the
332  BIER Nexthop sub-TLV in the top BIER TLV to itself, for the BSLs that
333  it does support, the BFR MUST remove the BIER Nexthop sub-TLV (if
334  present) in the corresponding Encapsulation sub-TLVs. For the BSLs
335  that it does not support, it MUST not update those Encapsulation sub-
336  TLVs except that if a BIER Nexthop sub-TLV is not included in the
337  Encapsulation sub-TLV, the received BIER Nexthop sub-TLV in the top
338  BIER TLV MUST be copied into the Encapsulation sub-TLV. All impacted
339  length fields (e.g., the Encapsulation sub-TLV Length, the top level
340  BIER TLV Length) MUST be updated accordingly.

[minor] This is essentially putting together a fresh BIER attribute from the one received. A more formal description of the processing in the form of steps would help an implementor and ensure that things don't fall through the cracks. As more TLVs/info are added, these steps can be updated by future documents.

342  Since the BIER attribute is an optional, transitive BGP path
343  attribute, a non-BFR BGP speaker could still advertise the received
344  route with a BIER attribute.

346  5. BIFT Calculation

348  For each sub-domain, a BFR calculates the corresponding BIFTs by
349  going through the BIER prefixes whose BIER attribute includes a BIER
350  TLV for the sub-domain. For a non-zero BFR-id in the BIER TLV, or
351  for each BFR-id in the BIER Proxy Range sub-TLV in the BIER TLV of a
352  BIER prefix, a BIFT entry is created or updated. The entry's BFR
353  Neighbor (BFR-NBR) [RFC8279] is the Nexthop in the BIER Nexthop sub-
354  TLV in the corresponding Encapsulation sub-TLV, or in the top level
355  BIER TLV if the Encapsulation sub-TLV does not have a Nexthop sub-
356  TLV. If there is no Nexthop sub-TLV at all, The entry's BFR Neighbor
357  is the BIER prefix itself.
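
The BFR-NBR fallback order in draft lines 352-357 can be read as a three-step preference, sketched below. The class and function names (`EncapSubTlv`, `BierTlv`, `select_bfr_nbr`) are assumptions made for illustration and model only the fields needed for this one decision; whether the lookup belongs to BGP or to a separate BIER module is left open, as in the review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EncapSubTlv:
    """Hypothetical view of an MPLS or non-MPLS Encapsulation sub-TLV."""
    bs_len: int
    nexthop: Optional[str] = None  # nested BIER Nexthop sub-TLV, if present


@dataclass
class BierTlv:
    """Hypothetical view of a top-level BIER TLV."""
    sub_domain: int
    bfr_id: int
    nexthop: Optional[str] = None  # top-level BIER Nexthop sub-TLV, if present


def select_bfr_nbr(bier_prefix: str, tlv: BierTlv, encap: EncapSubTlv) -> str:
    """Pick the BFR-NBR for a BIFT entry using the quoted preference order."""
    if encap.nexthop is not None:   # 1. Nexthop inside the Encapsulation sub-TLV
        return encap.nexthop
    if tlv.nexthop is not None:     # 2. Nexthop in the top-level BIER TLV
        return tlv.nexthop
    return bier_prefix              # 3. No Nexthop sub-TLV at all: the BIER prefix


if __name__ == "__main__":
    # Documentation-range addresses used purely as examples.
    tlv = BierTlv(sub_domain=0, bfr_id=5, nexthop="192.0.2.2")
    print(select_bfr_nbr("192.0.2.1", tlv, EncapSubTlv(bs_len=3, nexthop="192.0.2.3")))
    print(select_bfr_nbr("192.0.2.1", tlv, EncapSubTlv(bs_len=3)))
```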
The BIER label or BIFT-id for the entry
358  is derived from the Label Range in the MPLS Encapsulation sub-TLV or
359  from the BIFT-id Range in the non-MPLS Encapsulation sub-TLV.

361  BIER traffic is sent to the BFR-NBR either natively (BIER header
362  directly follows a layer 2 header) if the BFR-NBR is directly
363  connected, or via a tunnel otherwise. Notice that, if a non-BFR BGP

[major] What is this tunnel and who creates it in a BGP-only DC network?

364  speaker re-advertises a BIER prefix (in this case it can not update
365  the BIER attribute since it is not capable), or if a BFR BGP speaker
366  re-advertises a BIER prefix without updating the BIER Nexthop sub-
367  TLV, the BFR receiving the prefix will tunnel BIER traffic - the BGP
368  speaker re-advertising the BIER prefix will not see the BIER traffic
369  for the BIER prefix.

371  6. Deployment Considerations

[minor] Perhaps this is more like Operational Considerations?

373  It's assumed by this document that the BIER domain is aligned with an
374  Administrative Domain (AD) which may be composed of multiple ASes
375  (either private or public ASes). Use of the BIER attribute in other
376  scenarios is outside the scope of this document.

[minor] Could you add a reference for "BIER domain" from an appropriate BIER RFC?

378  A boundary router of the AD that supports the BIER attribute MUST
379  support a per-EBGP-session/group policy, that indicates whether the
380  attribute is allowed. If it is not allowed, the BIER attribute MUST
381  NOT be sent to any EBGP peer of the session/group, and the BIER
382  attribute received from the peer MUST be treated exactly as if it
383  were an unrecognized non-transitive attribute. That is, "it MUST be
384  quietly ignored and not passed along to other BGP peers".

[minor] Perhaps the default should be "drop" for EBGP unless enabled via explicit config?

386  7. Acknowledgements

388  Thanks a lot for Eric Rosen and Peter Psenak for their valuable
389  comments on this document.

391  8.
IANA Considerations

393  IANA is requested to assign a codepoint in the "BGP Path Attributes"
394  registry to the BIER attribute.

396  IANA is requested to create a registry for "BGP BIER Attribute Types"

[major] What is "BGP BIER Attribute Types"?

397  and a registry for "BGP BIER TLV sub-TLV Types". The type field for
398  both registry consists of two octets, with possible values from 1 to
399  655355 (the value 0 is reserved). The allocation policy for this
400  field is to be "First Come First Serve".

[nit] A reference to RFC8126 is required for FCFS.

402  Three initial values are to be allocated from the "BGP BIER TLV sub-
403  TLV Types" for MPLS Encapsulation, non-MPLS Encapsulation, and BIER
404  Nexthop sub-TLV respectively.

[major] What about the top level BIER TLV?

406  9. Security Considerations

408  This document introduces no new security considerations beyond those
409  already specified in [RFC4271] and [RFC8279].

[major] I am not sure that the above is sufficient. To start off, perhaps describe which of the security considerations covered for BGP and BIER respectively apply to this document. Then, perhaps there should be a discussion on limiting the scope via configuration of the specific AFI/SAFI to prevent the BIER attribute from escaping, and on the implications if it does escape? Also, a reference to BIER domain and some text about it being a "single administrative domain" would help.

[End of Review]