
Potential pitfalls in DNS handling

Posted Nov 15, 2012 11:02 UTC (Thu) by epa (subscriber, #39769)
Parent article: Potential pitfalls in DNS handling

Yikes! Any byte value is allowed even . and \0? That's quite a surprise. It also means that the common DNS notation is inadequate; you'd have to carefully define an escaping scheme so that any hostname can be written unambiguously.

What happens when one of these odd hostnames needs to be encoded in a URI? It is not enough to say 'just %-encode it' because that does not address the issue of '.' contained in a component.


Potential pitfalls in DNS handling

Posted Nov 15, 2012 16:02 UTC (Thu) by NAR (subscriber, #1313) [Link]

Well, it looks like the "Preferred name syntax" chapter is indeed only about the preferred names. Interestingly, the inet:getaddr/2 function in Erlang accepts only characters between $21 and $7e, so no spaces or internationalized domain names.

Potential pitfalls in DNS handling

Posted Nov 15, 2012 18:14 UTC (Thu) by paulj (subscriber, #341) [Link]

Anything is allowed in DNS labels? Was that always the case, because RFC1035 says this in section 2.3.1 on "Preferred name syntax" - on names in DNS generally:

"Note that while upper and lower case letters are allowed in domain names, no significance is attached to the case. That is, two names with the same spelling but different case are to be treated as if identical.

The labels must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen. There are also some restrictions on the length. Labels must be 63 characters or less."

Potential pitfalls in DNS handling

Posted Nov 15, 2012 18:18 UTC (Thu) by paulj (subscriber, #341) [Link]

Ah, perhaps it's RFC2673 allowing this:

Potential pitfalls in DNS handling

Posted Nov 15, 2012 20:10 UTC (Thu) by bjencks (subscriber, #80303) [Link]

Further up 2.3.1 it says:

"For example, when naming a mail domain, the user should satisfy both the rules of this memo and those in RFC-822. When creating a new host name, the old rules for HOSTS.TXT should be followed. This avoids problems when old software is converted to use domain names.

The following syntax will result in fewer problems with many applications that use domain names (e.g., mail, TELNET)."

and proceeds to describe the rules you quoted, indicating that those rules are guidelines for maximum interoperability, not MUST specifications of the protocol. Section 3.1 reinforces this:
"Although labels can contain any 8 bit values in octets that make up a label, it is strongly recommended that labels follow the preferred syntax described elsewhere in this memo, which is compatible with existing host naming conventions."

Potential pitfalls in DNS handling

Posted Nov 16, 2012 10:24 UTC (Fri) by paulj (subscriber, #341) [Link]

Interesting. Which means RFC1035 is surely inconsistent on this, given 2.3.1 says "the labels must" - a "must" that isn't actually a "must". But as someone else points out, RFC2181 §11 clearly states binary is allowed.

Potential pitfalls in DNS handling

Posted Nov 16, 2012 19:11 UTC (Fri) by hawk (subscriber, #3195) [Link]

I think the point there is that section 2.3.1 of RFC1035 is not describing the capabilities of the actual DNS protocol but rather what names should be used to achieve compatibility with existing systems.

This article is really about what kind of data you can get back in a (still correctly formatted) DNS response. It's important to note that even though the DNS protocol can carry anything, there may still be application-specific naming rules that prevent the full-on "any byte is valid" in a specific context.

(The article does have an unfortunate mixup (that's my take on it, anyway) where hostname name rules and DNS protocol name rules seem to be considered the same thing. See my comment regarding this:

Potential pitfalls in DNS handling

Posted Nov 22, 2012 6:40 UTC (Thu) by magfr (subscriber, #16052) [Link]

The problem with application-specific rules is that a cracker could choose not to adhere to them, so the problem is still there and the application has to be prepared for everything that the protocol can transport.

Note that everything the protocol can transport might be a superset of what the protocol allows.
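To illustrate the point about being prepared for anything, here is a minimal sketch of an application-side check that applies the "preferred name syntax" quoted earlier in the thread (start with a letter, end with a letter or digit, only letters, digits and hyphen inside, at most 63 bytes) to labels received from a resolver. The function name `is_preferred_hostname` and the choice to pass labels as a list of byte strings are assumptions made for this example, not part of any particular resolver API.

```python
import re

# One label under the rules quoted from RFC 1035 section 2.3.1:
# a letter, then optionally letters/digits/hyphens ending in a letter or digit.
_LABEL = re.compile(rb"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_preferred_hostname(labels):
    """Return True only if every label follows the quoted preferred syntax.

    `labels` is a sequence of byte strings, exactly as they came off the wire.
    """
    labels = list(labels)
    if not labels:
        return False
    return all(
        len(label) <= 63 and _LABEL.fullmatch(label) is not None
        for label in labels
    )


print(is_preferred_hostname([b"www", b"example", b"com"]))  # ordinary name
print(is_preferred_hostname([b"a.b", b"com"]))              # dot inside a label
print(is_preferred_hostname([b"x\x00y", b"com"]))           # NUL inside a label
```

A check like this rejects names rather than trying to repair them, which seems the safer default when the answer may come from a hostile server.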

Potential pitfalls in DNS handling

Posted Nov 16, 2012 17:57 UTC (Fri) by hawk (subscriber, #3195) [Link]

I guess the point is that "the common DNS notation" is not really "DNS notation" at all but a common domain name notation used in the operating system/application layers regardless of resolution mechanism.

In fact, it's clearly a different notation from the one that the DNS protocol uses, so there being a difference in capability there is not hugely surprising.

For the actual DNS protocol it's not actually that shocking: there, dots have no special meaning, and the same applies to any other byte value.
Instead of dot-separation, the DNS protocol prefixes each label with an integer specifying its length.
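A small sketch of that length-prefixed encoding may make it concrete. It handles only plain, uncompressed names; the function names `encode_name` and `decode_name` are made up for this example, and compression pointers are deliberately not handled.

```python
def encode_name(labels):
    """Encode byte-string labels as an uncompressed DNS wire-format name."""
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        out.append(len(label))  # single length byte
        out += label            # label bytes, copied verbatim
    out.append(0)               # zero-length root label ends the name
    if len(out) > 255:
        raise ValueError("encoded name exceeds 255 bytes")
    return bytes(out)


def decode_name(wire):
    """Decode an uncompressed wire-format name into a list of labels."""
    labels = []
    pos = 0
    while True:
        if pos >= len(wire):
            raise ValueError("truncated name")
        length = wire[pos]
        pos += 1
        if length == 0:
            return labels
        if length > 63:
            raise ValueError("compression pointers are not handled here")
        if pos + length > len(wire):
            raise ValueError("truncated label")
        labels.append(wire[pos:pos + length])
        pos += length


# A dot inside a label is just another byte on the wire.
print(encode_name([b"a.b", b"com"]))
```

Note that `[b"a.b", b"com"]` and `[b"a", b"b", b"com"]` encode to different byte sequences, even though a naive dot-joined string would render both as "a.b.com".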

However, there being a difference between the capabilities of how domain names are handled by the OS/application and by DNS and possibly other resolution mechanisms in use clearly creates an opportunity for confusion/inconsistency/disaster/... when mapping between the native format of each resolution mechanism (the 8bit clean "DNS wire notation" in the case of DNS) and the "string representation with dot-separation".

I suppose the idea is that the resolver library in the operating system ought to take care of this mapping (escaping or discarding or whatever should be done for names that can not be represented in the dot-separated string notation) once and for all and let the applications just do their thing.
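As a rough illustration of what such a mapping could look like, the sketch below turns wire-format labels into a dot-separated string using the backslash escapes from the RFC 1035 master file format (`\X` for a literal character, `\DDD` for a byte given as three decimal digits). The names `label_to_text` and `name_to_text` are invented for this example, and the exact set of bytes that real resolver libraries choose to escape may differ.

```python
def label_to_text(label):
    """Render one wire-format label (bytes) as escaped text."""
    parts = []
    for byte in label:
        ch = chr(byte)
        if ch in ".\\":
            parts.append("\\" + ch)         # literal dot or backslash
        elif 0x21 <= byte <= 0x7E:
            parts.append(ch)                # printable ASCII, passed through
        else:
            parts.append("\\%03d" % byte)   # everything else as \DDD
    return "".join(parts)


def name_to_text(labels):
    """Join escaped labels with dots (no trailing root dot is added)."""
    return ".".join(label_to_text(label) for label in labels)


print(name_to_text([b"a.b", b"example", b"com"]))
print(name_to_text([b"x\x00y", b"com"]))
```

With escaping, two different label sequences can no longer collapse into the same string, but every consumer of that string has to know the escapes are there.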

Potential pitfalls in DNS handling

Posted Nov 16, 2012 23:54 UTC (Fri) by Comet (subscriber, #11646) [Link]

The bind library does.

That doesn't stop people writing code like:

labels = result.split('.')
shorthost = labels[0]

If you split on '.', then you break when presented with the escaped form '\.' as you'll do an extra split. You might get away with it when only looking for the TLD, or just sorting data and rejoining the strings.
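For comparison, here is a sketch of a splitter that respects the escaping instead of splitting on every '.'. It assumes the `\X` and `\DDD` escape forms of the RFC 1035 master file format; the function name `split_name` is hypothetical, and empty labels in the middle of a name are passed through rather than rejected.

```python
DIGITS = "0123456789"


def split_name(text):
    """Split an escaped, dot-separated name into a list of byte-string labels."""
    labels = []
    current = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            if i + 1 >= len(text):
                raise ValueError("dangling backslash")
            if text[i + 1] in DIGITS:
                ddd = text[i + 1:i + 4]
                if len(ddd) != 3 or any(c not in DIGITS for c in ddd):
                    raise ValueError("malformed \\DDD escape")
                value = int(ddd)
                if value > 255:
                    raise ValueError("\\DDD escape out of range")
                current.append(value)
                i += 4
            else:
                if ord(text[i + 1]) > 255:
                    raise ValueError("non-byte character")
                current.append(ord(text[i + 1]))  # escaped literal, e.g. '.'
                i += 2
        elif ch == ".":
            labels.append(bytes(current))         # unescaped dot ends a label
            current = bytearray()
            i += 1
        else:
            if ord(ch) > 255:
                raise ValueError("non-byte character")
            current.append(ord(ch))
            i += 1
    if current:
        labels.append(bytes(current))
    return labels


name = "a\\.b.example.com"
print(name.split(".")[0])   # naive split cuts the first label short
print(split_name(name)[0])  # escape-aware split keeps the dot inside it
```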

So, there's an escaping mechanism, it helps a lot of the time, but other times it produces surprising results and you need to at least know that the escaping mechanism is in use.

Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds