Hi,

Please find below my review, as a member of the INT Area Directorate, of the following document:


dprive                                                     S. Bortzmeyer
Internet-Draft                                                     AFNIC
Obsoletes: 7626 (if approved)                               S. Dickinson
Intended status: Informational                                Sinodun IT
Expires: July 19, 2020                                  January 16, 2020


                       DNS Privacy Considerations
                    draft-ietf-dprive-rfc7626-bis-04

<snip>

1.  Introduction

<snip>

   Let us begin with a simplified reminder of how the DNS works (See
   also [RFC8499]).  A client, the stub resolver, issues a DNS query to
   a server, called the recursive resolver (also called caching resolver
   or full resolver or recursive name server).  Let's use the query
   "What are the AAAA records for www.example.com?" as an example.  AAAA
   is the QTYPE (Query Type), and www.example.com is the QNAME (Query
   Name).  (The description that follows assumes a cold cache, for
   instance, because the server just started.)  The recursive resolver
   will first query the root name servers.  In most cases, the root name
   servers will send a referral.  In this example, the referral will be
   to the .com name servers.  The resolver repeats the query to one of
   the .com name servers.  The .com name servers, in turn, will refer to
   the example.com name servers.  The example.com name server will then
   return the answer.  The root name servers, the name servers of .com,
   and the name servers of example.com are called authoritative name
   servers.  It is important, when analyzing the privacy issues, to
   remember that the question asked to all these name servers is always
   the original question, not a derived question.  The question sent to
   the root name servers is "What are the AAAA records for
   www.example.com?", not "What are the name servers of .com?".  By
   repeating the full question, instead of just the relevant part of the
   question to the next in line, the DNS provides more information than
   necessary to the name server.  In this simplified description,
   recursive resolvers do not implement QNAME minimization as described
   in [RFC7816], which will only send the relevant part of the question
   to the upstream name server.

<JMC>
IMHO, it would be clearer to split the previous paragraph into two paragraphs:
- one explaining the general DNS resolution process
- one showing the privacy issue that arises because the question is not derived
BTW, the way the end of the previous paragraph is written suggests that question derivation and QNAME minimization are two different things. </JMC>
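
<JMC>
To illustrate the distinction, here is a minimal sketch, for illustration only, of which question each authoritative server sees for the example in the quoted paragraph, with and without QNAME minimization in the style of [RFC7816] (a shortened QNAME sent with QTYPE NS). The function name `queries_seen` and the hard-coded list of zones are assumptions of this sketch; a real resolver discovers zone cuts from the referrals it receives.

```python
def labels(name):
    """Split a domain name into labels; the root zone "." has none."""
    name = name.rstrip(".")
    return name.split(".") if name else []


def queries_seen(qname, qtype, zones, minimise):
    """Return (zone, qname_sent, qtype_sent) for each step of a cold-cache lookup.

    `zones` lists, in order, the zones whose name servers are queried
    (hypothetical input: a real resolver learns these from referrals).
    """
    full = labels(qname)
    seen = []
    for zone in zones:
        depth = len(labels(zone))
        if minimise and depth + 1 < len(full):
            # Send only one label more than the zone already knows about.
            sent_name = ".".join(full[-(depth + 1):]) + "."
            sent_type = "NS"
        else:
            # Full question: always without minimization, and at the last step.
            sent_name = ".".join(full) + "."
            sent_type = qtype
        seen.append((zone, sent_name, sent_type))
    return seen


if __name__ == "__main__":
    zones = [".", "com.", "example.com."]
    for minimise in (False, True):
        print("minimise =", minimise)
        for step in queries_seen("www.example.com.", "AAAA", zones, minimise):
            print("  ", step)
```

Without minimization, the root and .com servers both receive the full question for www.example.com; with it, they only receive the part they need to issue a referral.
</JMC>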

<snip>

   At the time of writing, almost all this DNS traffic is currently sent
   in clear (i.e., unencrypted).  However there is increasing deployment
   of DNS-over-TLS (DoT) [RFC7858] and DNS-over-HTTPS (DoH) [RFC8484],
   particularly in mobile devices, browsers, and by providers of anycast
   recursive DNS resolution services.  There are a few cases where there
   is some alternative channel encryption, for instance, in an IPsec VPN
   tunnel, at least between the stub resolver and the resolver.

<JMC>
IPsec: a reference is missing.
</JMC>

<snip>

   o  Tertiary requests: these are the additional requests performed by
      the DNS system itself.  For instance, if the answer to a query is
      a referral to a set of name servers, and the glue records are not
      returned, the resolver will have to do additional requests to turn
      the name servers' names into IP addresses.  Similarly, even if
      glue records are returned, a careful recursive server will do
      tertiary requests to verify the IP addresses of those records.

<JMC>
“glue records”: IMHO, either a reference or a definition is needed.
</JMC>

<snip>

2.  Scope

   This document focuses mostly on the study of privacy risks for the
   end user (the one performing DNS requests).  We consider the risks of
   pervasive surveillance [RFC7258] as well as risks coming from a more
   focused surveillance.

<JMC>
From my point of view (but I may be wrong), this document is the “Problem Statement” document for DNS privacy mechanisms.
If so, I regret that there is no text about the security impact(s) of deploying privacy mechanisms (e.g., DoT, DoH).
Please find more comments on this point under the Security Considerations section.
</JMC>

<snip>

3.2.  Data in the DNS Request

<snip>

   For the communication between the stub resolver and the recursive
   resolver, the source IP address is the address of the user's machine.
   Therefore, all the issues and warnings about collection of IP
   addresses apply here.  For the communication between the recursive
   resolver and the authoritative name servers, the source IP address
   has a different meaning; it does not have the same status as the
   source address in an HTTP connection.  It is typically the IP address
   of the recursive resolver that, in a way, "hides" the real user.
   However, hiding does not always work.  Sometimes EDNS(0) Client
   subnet [RFC7871] is used (see its privacy analysis in
   [denis-edns-client-subnet]).  Sometimes the end user has a personal
   recursive resolver on her machine.  In both cases, the IP address is
   as sensitive as it is for HTTP [sidn-entrada].

   A note about IP addresses: there is currently no IETF document that
   describes in detail all the privacy issues around IP addressing in
   general, although [RFC7721] does discuss privacy considerations for
   IPv6 address generation mechanisms.  In the meantime, the discussion
   here is intended to include both IPv4 and IPv6 source addresses.  For
   a number of reasons, their assignment and utilization characteristics
   are different, which may have implications for details of information
   leakage associated with the collection of source addresses.  (For
   example, a specific IPv6 source address seen on the public Internet
   is less likely than an IPv4 address to originate behind an address
   sharing scheme.)  However, for both IPv4 and IPv6 addresses, it is
   important to note that source addresses are propagated with queries
   and comprise metadata about the host, user, or application that
   originated them.

<JMC>
“It is typically the IP address of the recursive resolver that, in a way, "hides" the real user.”
“... it is important to note that source addresses are propagated with queries and comprise metadata about the host, user, or application that originated them.”

IMHO, with such a construction, a reader may be misled into concluding that a recursive resolver ultimately propagates the end user’s source address. Maybe the last paragraph should be moved to the beginning of the section.
</JMC>

<snip>

3.4.  On the Wire

3.4.1.  Unencrypted Transports

<snip>

   o  The recursive resolver can be in the IAP network.  For most
      residential users and potentially other networks, the typical case
      is for the end user's device to be configured (typically
      automatically through DHCP or RA options) with the addresses of
      the DNS proxy in the CPE, which in turns points to the DNS
      recursive resolvers at the IAP.  The attack surface for on-the-
      wire attacks is therefore from the end user system across the
      local network and across the IAP network to the IAP's recursive
      resolvers.

<JMC>
IMHO, it should be: “The best attack surface for on-the-wire attacks is therefore from the end user system to the CPE (i.e., the DNS proxy). From the CPE to the IAP’s recursive resolvers, eavesdropping is more complex, as the end user’s source address may be “hidden”, as explained in Section 3.2”.
</JMC>

<snip>

   It is also noted that typically a device connected _only_ to a modern
   cellular network is

   o  directly configured with only the recursive resolvers of the IAP
      and

   o  afforded some level of protection against some types of
      eavesdropping for all traffic (including DNS traffic) due to the
      cellular network link-layer encryption.

<JMC>
Sorry, but I don’t agree, unless the recursive resolvers are located inside the mobile antennas :)
More seriously, AFAIK, even if there is L2 encryption on the cellular network, and either L2 encryption (e.g., MACsec) or L3 encryption (e.g., IPsec) on the fixed network from the RAN to the recursive resolvers, this encryption is not end-to-end (E2E) up to the recursive resolvers.
BTW, the recursive resolvers may be the same for “Mobile” customers and “Fixed” (e.g., DSL, Fiber) customers.
</JMC>

<snip>

4.  Actual "Attacks"

   Many research papers about malware detection use DNS traffic to
   detect "abnormal" behavior that can be traced back to the activity of
   malware on infected machines.  Yes, this research was done for the
   good, but technically it is a privacy attack and it demonstrates the
   power of the observation of DNS traffic.  See [dns-footprint],
   [dagon-malware], and [darkreading-dns].

<JMC>
“... but technically, it is a privacy attack”
Please add either a definition of “privacy attack” to the document or a reference to an existing definition.
By the way, I am curious to see whether, under that definition, anti-virus software would also be considered a privacy attacker.
</JMC>

<snip>

6.  Security Considerations

   This document is entirely about security, more precisely privacy.  It
   just lays out the problem; it does not try to set requirements (with
   the choices and compromises they imply), much less define solutions.
   Possible solutions to the issues described here are discussed in
   other documents (currently too many to all be mentioned); see, for
   instance, 'Recommendations for DNS Privacy Operators'
   [I-D.ietf-dprive-bcp-op].

<JMC>
As I mentioned for Section 2, if this document is considered a “Problem Statement” document, then IMHO the impact(s) of privacy on security is (are) missing from it.
Indeed, there is no text about the following points (at least these; there may be others):
- DNS Tunneling
As DNS flows are generally not filtered/blocked, this technique may be used for malicious activities (e.g., botnet C&C, malware propagation, data exfiltration from compromised devices, fraud).
One way to mitigate such malicious activities is to monitor DNS flows.
The encryption of DNS flows may encourage the filtering/blocking of encrypted DNS flows (cf. Section 3.5.1.3).
- DDoS attacks based on DNS amplification
I am not a DDoS expert, but I am wondering about potential mechanisms for detecting, close to the sources, DDoS attacks based on DNS amplification: does deploying DoH/DoT have any impact on them?
</JMC>
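
<JMC>
As a rough illustration of the kind of DNS flow monitoring I have in mind for the tunneling point, here is a minimal sketch of one possible heuristic: flag queries whose longest label is both unusually long and high in character entropy. The function names and both thresholds are arbitrary assumptions chosen for this example; they are not taken from the draft or from any deployed product, and a real detector would need far more than this. The relevant point for the draft is that such a check needs the cleartext QNAME, which an on-path monitor no longer sees once DoT or DoH is used, whereas the resolver operator still does.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of the characters in `text`, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunnel(qname, max_label_len=40, max_entropy=3.8):
    """Heuristic only: True if the longest label is long AND high-entropy.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    parts = [label for label in qname.rstrip(".").split(".") if label]
    if not parts:
        return False
    longest = max(parts, key=len)
    return len(longest) > max_label_len and shannon_entropy(longest) > max_entropy


if __name__ == "__main__":
    encoded = "0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdef"
    for name in ("www.example.com.", encoded + ".t.example.com."):
        print(name, looks_like_tunnel(name))
```
</JMC>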

Thanks in advance for your replies.

Best regards,

JMC.