I have reviewed this document as part of the security directorate's ongoing effort to review all IETF documents being processed by the IESG. These comments were written primarily for the benefit of the security area directors. Document editors and WG chairs should treat these comments just like any other review comments. This document is very thorough and well-written. The security considerations section, in particular, reflects significant effort. It is almost ready for publication a Proposed Standard RFC. There is at least one issue to address before publication, however. ---- Issue: The discussion (and possibly the specification) of the use of URLs in metadata blocks is insufficient. Relying only on explicit user approval to access the resource still leaves many possible exploits. What does explicit approval mean? Is clicking approval once in an application to say "Yes, you can download album cover art" sufficient? Consider an instance, particularly one served over HTTP using the content type registered by this document that becomes virally popular and contains a URL pointing to a victim. Consider again such a thing where the contained URL is, perhaps by mistake, exactly the URL originally retrieved (causing a naive implementation to enter a request loop). Some additional discussion of what is reasonable to retrieve is warranted, and the spirit of same-origin should be taken into account. Privacy should also be mentioned - what sort of tracking can be accomplished through accesses to the URL? There should also be acknowledgment of the risks of allowing arbitrary content to be bundled into FLAC files, particularly in contexts where the MIME type defined in this document is used (web retrieval, multipart/mime in email, wherever else). How might this be used to circumvent content security policies at firewalls, for example? The document already acknowledges that executable code can appear here, but that's only one type of risk, and the discussion in the document steers quickly to the implication that the executable code is not malicious. Is it ever safe to actually execute such code? ---- Minor issue: "mux" is used 5 times in the document (in various forms) without definition, and once normatively: "The offsets MUST NOT be updated on muxing to reflect the new offsets of FLAC frames in a container." ---- Nit issue: This is really almost editorial, but I'm pulling up here so that the SEC ADs see it early; It's a little odd to lean on RFC4732 they way the document does. I understand the intent, but it requires the reader to do some lateral thinking to apply that RFCs guidance to this document. Consider some words motivating how you are intending for it to apply here. ---- More of a question than an issue, but I think the ADs consider the answer: Why isn't this document moving the registry currently at https://xiph.org/flac/id.html to IANA? ---- The rest are editorial comments in document order: Definitions at *Interchannel samples* - you define counts here not samples - maybe the list item should be *Interchannel sample counts* "side-right" and "right-side" appear 2 and 3 times (not counting capitalized uses), respectively, in the document. Are the intended to mean the same thing? Is there a need to have two terms? "does not lead to non-lossless behavior" is a bit of a head-spin. Would "does not lead to lossy behavior" suffice? It would be nice to explicitly say _why_ you allowed a mix of big and little-endian encoding when you note it just after Table 1. Who is the first sentence of section 6 for? Consider deleting it and renaming the section 6 to "An overview of the FLAC format layout" At "FLAC supports up to 127 kinds of metadata blocks", consider removing "up to". It supports exactly 127 kinds of metadata blocks. Consider deleting the one use of "et cetera" - the sentence already says the list isn't exclusive. Consider quoting "fLaC" when your are talking specifically about the marker in prose everywhere (as you quote it in 10.3) What is "pure" overhead? Consider removing the word "pure". At the restriction in section 7 of the maximum block size for 48kHz audio to 4608 - consider showing where 4608 came from. It takes a moment to realize what Table 3 is intended to convey. A short introductory sentence or a table caption would go a long way to help the reader. At 8.3 - it might be nice to note that the n in u(n) here is what's present in the 3 bytes of metadata header block length. Consider replacing the places you use NUL with 0x00 (which you use elsewhere in the document). This makes it more clearer how many 0 bits you really mean. Remember ASCII started out as 7-bits - RFC20 talks about sticking it in an 8 bit byte. I will almost, but not entirely, let "A bright colored fish" pass without comment as it's not this document's responsibility. In 9.1.5 at "non-consecutive" did you mean perhaps "strictly increasing". Consecutive coded frame numbers might make sense, but sample numbers wouldn't. There's a word missing in "There are also audio processors like lossyWAV that enable users to improve compression of their files by a lossless audio codec in a non-lossless way." Perhaps "by using a"? Consider breaking the discussion of bit depth of a subframe out of section 9.2.3 into its own section - it stands (and is used in other parts of the document) for other things than constant subframes. At "This means that decoding MUST be a sequential process" - I think you should say "is necessarily" instead of "MUST be". This really isn't using the 2119 term as intended. In 9.2.6 consider being explicit about whether the coefficients are each represented as 2's complement signed? At section 10, I am suspicious that "very thin transport layer" is not going to communicate what you really mean, but have no immediate suggestions for improvement. In the first sentence of Appendix A - consider "is" rather than "MUST be". At Appendix C.5 - Consider saying "32768" explicitly rather than "last block". (Especially if you didn't _mean_ 32768, but rather the whole last row of Table 14). I don't know what it means to satisfy an interoperability consideration. Please consider rewriting the sentences in C.7 that use the construct. And finally - Please reconsider using the TITLE example in D.2.5 that leaves you using an rtl script inline with the ltr prose. You are within your rights to do so, but it is stabbing at a place where the world (not just our own tooling) isn't as ready as we would like it to be.