This document has been reviewed as part of the transport area review team's ongoing effort to review key IETF documents. These comments were written primarily for the transport area directors, but are copied to the document's authors and WG to allow them to address any issues raised and also to the IETF discussion list for information.

When done at the time of IETF Last Call, the authors should consider this review as part of the last-call comments they receive. Please always CC tsv-art@ietf.org if you reply to or forward this review.

Reviewer: Bernard Aboba
Document: draft-ietf-rum-rue-09
Result: Ready with Issues

Overall, I found several issues with this document:

1. The specification appears to contradict some of its normative references with respect to support of IPv6.

2. Although the document does include language within the provider configuration, it does not mention language negotiation [RFC 8373], which included relay services as a use case. So I'm curious as to how users with different communications requirements might be accommodated. Consider, for example, a user who is speech impaired but not hearing impaired, and who would therefore like to communicate via ASL to the interpreter (or via real-time text in English if video fails) but can receive spoken English.

3. Maintainability. This specification does not appear implementable using the WebRTC 1.0 API, and as a result, it cannot run in a browser. Major changes would be required to WebRTC code bases such as libwebrtc and pion to allow for native implementation. The separate fork that would be needed would become increasingly difficult to maintain over time.

---------

1. IPv6 support

Section 4 says:

" Implementations MUST support IPv4 and IPv6. Dual stack support is NOT required and provider implementations MAY support separate interfaces for IPv4 and IPv6 by having more than one server in the appropriate SRV record where there is either an A or AAAA record in each server DNS record but not both.
The same version of IP MUST be used for both signaling and media of a call unless ICE ([RFC8445]) is used, in which case candidates may explicitly offer IPv4, IPv6 or both for any media stream."

[BA] This document requires that RFC 8835 be supported, but this paragraph conflicts with that document in multiple ways. RFC 8835 makes ICE both mandatory to implement and mandatory to use. When the paragraph says that "Dual stack support is NOT required", is it referring to the client or the provider? The IPv6 requirements in RFC 8835 seem to conflict with this paragraph as well.

---------

3. WebRTC usage

Section 5.5 says:

Implementations MUST conform to [RFC8835] except for its guidance on the WebRTC data channel, which this specification does not use. See Section 6.2 for how RUE supports real-time text without the data channel.

[BA] The WebRTC 1.0 API does not allow data to be sent except via the data channel API. So when the document says that the WebRTC data channel is not used, by what mechanism is real-time text to be sent? Real-time text is not audio/video data, so it cannot be sent using the RTCRtpSender API.

The only way I can think of that this might be implemented is to add back support for the RTP data channel so that real-time text could be implemented on top of the RTCDataChannel send() method. However, that in turn would violate aspects of the WebRTC security model, such as the deprecation of SDES. If that is the intent here (it's the only practical way that RTT can be supported other than via the gateway model), then I would prefer that it be stated explicitly.

Other places in the document conflict with the RFC 8835 requirement. For example, RFC 8835 Section 3.4 says:

The primary mechanism for dealing with middleboxes is ICE, which is an appropriate way to deal with NAT boxes and firewalls that accept traffic from the inside, but only from the outside if it is in response to inside traffic (simple stateful firewalls). ICE [RFC8445] MUST be supported.
The implementation MUST be a full ICE implementation, not ICE-Lite. A full ICE implementation allows interworking with both ICE and ICE-Lite implementations when they are deployed appropriately.

RFC 8835 also has requirements for support of IPv4 and IPv6 that seem to be in conflict with the statements in Section 4. For example, RFC 8835 Section 3.2 says:

3.2. Ability to Use IPv4 and IPv6

Web applications running in a WebRTC browser MUST be able to utilize both IPv4 and IPv6 where available -- that is, when two peers have only IPv4 connectivity to each other, or they have only IPv6 connectivity to each other, applications running in the WebRTC browser MUST be able to communicate. When TURN is used, and the TURN server has IPv4 or IPv6 connectivity to the peer or the peer's TURN server, candidates of the appropriate types MUST be supported. The "Happy Eyeballs" specification for ICE [RFC8421] SHOULD be supported.

Section 6 of the document says:

This specification adopts the media specifications for WebRTC ([RFC8825]). Where WebRTC defines how interactive media communications may be established using a browser as a client, this specification assumes a normal SIP call. The RTP, RTCP, SDP and specific media requirements specified for WebRTC are adopted for this document. The RUE is a WebRTC "non-browser" endpoint, except as noted expressly below.

[BA] Is RUE really a WebRTC "non-browser" endpoint? If the goal is to allow RUE to be easily built on top of native WebRTC libraries such as libwebrtc or pion, then it should inherit WebRTC requirements such as ICE, dual stack support, etc.

Section 6.1 says:

Implementations MUST support [RFC8834] except that MediaStreamTracks are not used. Implementations MUST conform to Section 6.4 of [RFC8827].

[BA] Since MediaStreamTracks are how audio/video is obtained from devices (or rendered), I don't understand how a WebRTC application (browser or non-browser) can function without them.
Elsewhere, the specification states that the data channel isn't used; now it seems to say that audio/video isn't used either.

RFC 8827 is the security architecture of WebRTC. Is it really true that RUE implementations only need to support Section 6.4? If so, this allows major deviations from WebRTC to the point where interoperability could be severely impaired.

6.2. Text-Based Communication

Implementations MUST support real-time text ([RFC4102] and [RFC4103]) via T.140 media. One original and two redundant generations MUST be transmitted and supported, with a 300 ms transmission interval. Implementations MUST support [RFC9071] especially for emergency calls. Note that RFC4103 is not how real-time text is transmitted in WebRTC and some form of transcoder would be required to interwork real-time text in the data channel of WebRTC to RFC4103 real-time text. Transport of T.140 real-time text in WebRTC is specified in [RFC8865], using the WebRTC data channel. RFC 8865 also has some advice on how gateways between RFC 4103 and RFC 8865 should operate. It is RECOMMENDED that RFC 8865 including multiparty support is used for communication with browser-based WebRTC implementations. Implementations MUST support [RFC9071].

[BA] The reason why RFC 8865 was developed was that there was no practical way to support RFC 4103 except via the RTCDataChannel API. Requiring RFC 4103 to be supported doesn't make the practical problems go away: you still need a way to provide the RTT data to be sent. When you say that implementations MUST support RFC 9071, how is this supposed to work in a WebRTC implementation?

6.3. Video

Implementations MUST conform to [RFC7742] with following exceptions: only H.264, as specified in [RFC7742], is Mandatory to Implement, and VP8 support is OPTIONAL at both the device and providers. This is because backwards compatibility is desirable, and older devices do not support VP8.

[BA] The reality is that H.264 is not very widely used in WebRTC applications (less than 1 percent of calls use it), and as a result, implementations are quite buggy. I do not believe that it is serving RUE users well to make VP8 optional. Even if you're going to stick with H.264, you might consider going beyond requiring support only for the constrained baseline profile. Some implementations now support constrained high profile, for example.

Section 6.8 says:

For backwards compatibility with calling devices that do not support the foregoing methods, implementations MUST implement SIP INFO messages to send and receive XML encoded Picture Fast Update messages according to [RFC5168].

[BA] Really? Earlier in the document it says that backwards compatibility is not a strict requirement. Any widely used WebRTC code base will support NACK, FIR and PLI. seems