And the -11 version resolves everything else, plus picks up improved versions of the editorial clarifications; -11 is Ready for RFC publication. Many thanks for the timely responses. Thanks, --David > -----Original Message----- > From: Black, David > Sent: Wednesday, December 10, 2014 10:51 AM > To: nico at cryptonector.com; General Area Review Team (gen-art at ietf.org); ops- > dir at ietf.org > Cc: ietf at ietf.org; json at ietf.org; Black, David > Subject: Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-10 > > The -10 version of this draft resolves items [A]-[E] from the > Gen-ART and OPS-Dir review of -09, and the IESG is in the process of > resolving the (silly) idnits complaint about RFC 20 being a possible > downref. > > For item [D], a different approach was taken instead of modifying > the ABNF - the resulting new Section 2.4 is a definite improvement > to the draft, and is significantly clearer than the modified ABNF > would have been. Nicely done. > > Item [F] about the text in the IANA Considerations > (Section 4) remains open - if the intent is to not deal with replacing > that text until after IESG approval, an RFC Editor Note to that effect > should be added to Section 4. > > I have an additional editorial concern - given all the discussion about > UTF-8, it would be good for the draft to make it clear early on > that JSON text sequences are UTF-8 only. Here are some suggested changes. > > Abstract: > > This document describes the JSON text sequence format and associated > media type, "application/json-seq". A JSON text sequence consists of > any number of JSON texts, each prefix by an Record Separator > (U+001E), and each ending with a newline character (U+000A). > > "any number of JSON texts" -> "any number of UTF-8 encoded JSON texts" > > It also looks like ASCII names for RS and LF are being mixed w/Unicode > codepoints in the second sentence in the abstract. I'm not sure that's > a good thing to do, especially as the body of the draft refers to RS and > LF as being ASCII. Here are a couple of changes that would remedy this: > > "an Record Separator (U+001E)" -> "an ASCII Record Separator (0x1E)" > "a newline character (U+000A)" -> "an ASCII newline character (0x0A)" > > Section 2 JSON Text Sequence Format: > > I suggest adding this sentence as a separate paragraph at the end of this > section (i.e., just before Section 2.1): > > JSON text sequences MUST use UTF-8 encoding; other encodings of JSON > (i.e., UTF-16 and UTF-32) MUST NOT be used. > > Aside from item [F], all of the above are editorial suggestions, but I > think making this clear early in the draft will help avoid potential > implementer confusion. > > Thanks, > --David > > > -----Original Message----- > > From: Black, David > > Sent: Friday, December 05, 2014 9:51 AM > > To: nico at cryptonector.com; General Area Review Team (gen-art at ietf.org); ops- > > dir at ietf.org > > Cc: ietf at ietf.org; json at ietf.org; Black, David > > Subject: Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-09 > > > > This is a combined Gen-ART and OPS-Dir review. Boilerplate for both follows > > ... > > > > I am the assigned Gen-ART reviewer for this draft. For background on > > Gen-ART, please see the FAQ at: > > > > < http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>. > > > > Please resolve these comments along with any other Last Call comments > > you may receive. > > > > I have reviewed this document as part of the Operational directorate's > ongoing > > effort to review all IETF documents being processed by the IESG. These > > comments > > were written primarily for the benefit of the operational area directors. > > Document editors and WG chairs should treat these comments just like any > other > > last call comments. > > > > Document: draft-ietf-json-text-sequence-09 > > Reviewer: David Black > > Review Date: Dec 5, 2014 > > IETF LC End Date: Dec 5, 2014 > > IESG Telechat date: Dec 18, 2014 > > > > Summary: This draft is on the right track, but has open issues > > described in the review. > > > > This draft specifies a format that packs multiple JSON texts into a > > single string. The ASCII RS (0x1E) character is used to separate texts, > > and a linefeed is appended to each text to ensure that a complete text > > always ends with a whitespace character. > > > > All of the open issues are minor - the most important ones center on > > treatment of incomplete JSON texts - that appears to be an afterthought > > in this draft and needs more attention. I also found a couple of > > minor issues in the Security and IANA Considerations sections, both of > > which are almost nits. > > > > Major issues: None. > > > > Minor issues: > > > > [A] Section 2.1: > > > > If parsing of such an octet string as a JSON text fails, the parser > > SHOULD nonetheless continue parsing the remainder of the sequence. > > > > That's not quite right - there are two levels of parsing, JSON > > sequence parsing and JSON text parsing of each text in the sequence, > > both of which might be implemented in a single-pass parser. For such an > > implementation, the above sentence could be (mis-)read to imply that the > > JSON text parse should resume from the point at which it failed, which > > would be silly (although I've seen heroic PL/1 parsers do exactly that). > > Instead, the parse needs to skip ahead to the next RS, ignoring the rest > > of the JSON text that failed to parse. I suggest: > > > > If parsing of such an octet string as a JSON text fails, and the > > octet string is followed by an RS octet, the parser > > SHOULD nonetheless skip ahead to that RS octet and continue parsing > > the remainder of the sequence from there. > > > > That also covers the case where there is nothing more to parse after the > > JSON text that caused the parse failure. > > > > [B] Section 2.3: > > > > Is incremental parsing of a JSON text within a sequence allowed, or > > is the parser required to not produce any results until the parse of > > the entire text is successful? I'd expect that incremental parsing > > is ok (so results may be produced from a text that ultimately fails > > to parse), and I think that's worth stating. > > > > [C] Section 2.4: > > > > Parsers MUST check that any JSON texts that are a top-level number > > include JSON whitespace ("ws" ABNF rule from [RFC7159]) after the > > number, otherwise the JSON-text may have been truncated. > > > > That reference to the "ws" rule doesn't get the job done because that > > rule allows a match to no characters - it's of the form ws = *( ... ) > > where ... is the list of whitespace characters. What's needed here is > > a rule of the form vws = 1*( ...) to force there to be at least one > > whitespace character, but see the next issue for a better way to deal > > with this topic by pulling the appended LF into the sequence parse > > instead of the text parse. > > > > [D] I wonder whether the possibility of incomplete texts ought to be > > encoded into the parsing rules to directly catch JSON texts that must > > be incomplete because the last character is not LF, e.g.: > > > > JSON-sequence = *(1*RS (possible-JSON / truncated-JSON / empty-JSON)) > > RS = %x1E; "record separator" (RS), see RFC20 > > possible-JSON = 1*(not-RS) LF ; attempt to parse as UTF-8-encoded > > ; JSON text (see RFC7159) > > truncated-JSON = *(not-RS) not-LFRS); truncated, don't attempt > > ; to parse as JSON text > > empty-JSON = LF ; only the LF appended by the encoder, nothing to parse > > > > not-RS = %x00-1D / %x1F-FF; any octet other than RS > > not-LFRS = %x00-09/ %x1B-1D / %x1F-FF; any octet other than RS or LF > > > > Note that this won't detect all incomplete JSON texts, because LF is allowed > > within a JSON text (and this should be stated). > > > > [E] Section 3 - Security Considerations > > > > Incomplete and malformed JSON texts can be used to attack JSON parsers - > > that should be pointed out, as I don't see that in RFC 7159's security > > considerations and incomplete texts are a relevant consideration for > > this draft. > > > > [F] Section 4 - IANA Considerations > > > > Security considerations: See , > > Section 3. > > > > Interoperability considerations: Described herein. > > > > Published specification: . > > > > Applications that use this media type: > < https://stedolan.github.io/jq > is likely to support this format>. > > > > Replace all three instances of the angle bracketed text. The first two > > instances should be RFC references (e.g., RFC XXXX) w/a note to the RFC > > Editor to insert the number of the RFC when published. The third instance > > should be resolved now, or could have an RFC Editor note added indicating > > that the author will resolve that during Authors 48 hours. > > > > Nits/editorial comments: > > > > idnits didn't like the reference to RFC 20 for ASCII: > > > > ** Downref: Normative reference to an Unknown state RFC: RFC 20 > > > > RFC 5234 (ABNF) uses this, which looks like a better reference: > > > > [US-ASCII] American National Standards Institute, "Coded Character > > Set -- 7-bit American Standard Code for Information > > Interchange", ANSI X3.4, 1986. > > > > --- Selected RFC 5706 Appendix A Q&A for OPS-Dir review --- > > > > Most of these questions are n/a because this draft describes a format > > that will be used in other protocols to which RFC 5706's concerns would > apply. > > > > A.1.4 Have the Requirements on other protocols and functional > > components been discussed? > > > > The specification of the interaction of the JSON sequence parser with the > > JSON text parser is not as clear as it should be for incomplete or malformed > > JSON texts. See Minor Issues [A]-[E] above. > > > > A.1.8 Are there fault or threshold conditions that should be reported? > > > > Yes, incomplete JSON texts - this is covered in sections 2.3 and 2.4. > > > > Thanks, > > --David > > ---------------------------------------------------- > > David L. Black, Distinguished Engineer > > EMC Corporation, 176 South St., Hopkinton, MA  01748 > > +1 (508) 293-7953             FAX: +1 (508) 293-7786 > > david.black at emc.com        Mobile: +1 (978) 394-7754 > > ----------------------------------------------------