Building a DNS Resolver

If you’ve ever had to turn a hostname into an IP address or the other way around in your C/C++ code, you probably called either gethostbyname() or gethostbyaddr(), and never thought twice about how those functions work. Well, they’re more complex than you might think, and they’re slow.

Not excruciatingly slow, but slower than if you implement their functionality yourself, which is what I spent the last few weeks working on for my networking class.

So how do these functions work? Well, this requires an understanding of the Domain Name System or DNS for short. Here’s DNS 101:

You want to go to a site like Google, so you open your web browser and type http://www.google.com, or go through your bookmarks, or perhaps you have your homepage set to google.com. Either way, it remains transparent to you that http://www.google.com is an easy way to remember how to get to Google. But this name doesn’t uniquely identify Google in the sense that computers don’t understand the text, and the name google.com might redirect you to one of many of google’s servers, like one for http://www.google.es if you live in Spain.

What does uniquely identify Google is its Internet Protocol (IP) address, which is basically a unique series of numbers. Sort of like how your social security number identifies you, but nobody goes around calling you by it, because your name is just easier to remember. Another example is to think of all the contacts in your cell phone’s address book. How many of their phone numbers can you remember off-hand? Few, if any I’m guessing. It’s just easier to call “Amy” than it is to dial in their phone number. This is exactly the reason the DNS system was invented, so you wouldn’t have to remember IP addresses.

For anyone who has used gethostbyname() or gethostbyaddr(), this is obvious. But what isn’t obvious is the amount of work those functions go through to get you what you want; a simple translation of a host name into an IP address, or the other way around.

My project was to implement a multithreaded application that could resolve a batch file of N host names or IP addresses into IP addresses or host names, respectively. So how does it work?

Each thread takes a new batch entry to resolve, and does the following:

  • Create a DNS Query
  • Put the DNS Query into a packet
  • Send the packet to the DNS server
  • Receive a response packet from the DNS server
  • Parse the packet, extract all the answers to our query
  • Output answers

Sounds simple, right? The devil’s in the details.

RFC 1035 is the official documentation for how all this is to implemented. I don’t suggest you actually read through the page, just get an understanding of how precise and complex the actual process is.

Depending on whether the next batch entry was a host name or a valid IP address, I would run either an Authoritative Query (to return an IP address) or a PTR query (to get a host name or authoritative name server). So I had to create a DNS packet, which in ASCII looks like the following (taken from RFC 1035):

And, broken down to the bit level, the Header section looks like this:
Boring and unintelligible you say? Bollocks. The RFC is actually very clear about what each field means, but it leaves the implementation up to you. In general, these are handled through direct manipulation of bit strings. Making the header section is actually fairly easy, as a header is always the same size no matter what. But what about the question section I was sending?

Setting the type of question was easy (PTR or A, both translate to 1 byte numbers that can be #defined), as was setting the class of question (Always internet, which also translates to a 1 byte number). Properly setting Qname was kind of hard in implementation, although theoretically it sounds fine.

First, I had to calculate the number of bytes I would need, so I could create a packet of proper size, then I had to create the name and insert it into the right location in the packet. We are simply manipulating bit strings after all, here. So first the DNS header was inserted, indicating 1 question to follow, then the question section followed.

All question names have to be converted into a special format according to protocol. A question like http://www.google.com becomes 3www6google3com0, where the numbers indicate the #of bytes to follow, and replace all periods (to separate top-level, second-level, etc. domains) the 0 appended to the end is a signal meaning “end of name”. So how many bytes had to be reserved is simple addition: length of name + 2, the extra 2 coming from the appending of the ‘3’ at the start and the ‘0’ at the end. Thanks to ASCII, one character takes up 1 byte, so there’s no extra math involved.

Turning an IP address into a name is a little more difficult: an IP like 123.45.67.8 has to be turned into 8.67.45.123.IN-ADDR.ARPA, which is then turned into 1826724531237IN-ADDR4ARPA, or for you humans out there [1]8[2]67[2]45[3]123[7]IN-ADDR[4]ARPA where the brackets indicate a period was replaced with a 1 byte decimal representing the number of characters to follow.

As you can see, it’s a little less trivial than just replacing periods with numbers, because the IP address is not entirely reversed; each substring between periods gets reversed. Still, a little use of the strtok_s function helps out a lot!

This question resource record was copied using the memcpy function to put all our data into our char* packet (which, let me remind you is actually a string of bits!)

After this, it becomes a simple matter of sending the packet to our local DNS server, right?

You knew it wasn’t that easy! How DO you find out the local DNS server to ping anyway??? Hard coding the value is a very very bad idea, so we had to find the DNS server dynamically.

This was done with a little help from a function called GetNetworkParams(), which helps one obtain information like Host Name, Domain Name, and yes, Local DNS Servers. After we got a primary DNS Server IP, we could open UDP sockets and send our packets to them.

OK, so far we have found a DNS Server and created the packet we want to send. We just need a socket for communication. DNS typically runs over UDP sockets, which are faster than TCP, though less reliable. I used UDP sockets, but because they’re unreliable, I had to have a retransmission scheme in case the packet was lost. This can be accomplished with a little help from the select() function, which will allow a request to time out instead of hang indefinitely.

Using the function properly is a little difficult at first, and there aren’t a  whole lot of great examples out there, so here’s my code:

There’s a couple things to note here. First, one must create a timeval structure which allows you to specify timeout time in seconds (timeval->sec) and milliseconds (timeval->usec). the name of my UDP socket is sock, and was instantiated earlier. Next, you must create a fd_set structure to specify which sockets you will wait on. I named my structure Sockets. the fd_count field specifies the number of sockets to wait on, and the fd_array[0] is setting the first item of the array of sockets to my socket, sock.

Then, I send my packet to the server and wait for 30 seconds via the select function. If a response is received from the server, I process it and exit the loop, but what’s more important here is the case where the select() times out. After processing why the select function failed (time out or socket error), you MUST reset your fd_set structure or else any future calls to select using that structure will immediately fail!

My retransmission strategy was to try a total of three times to get the packet to the server, and if it times out all three times, the output of the program later will inform the user of this.

Now that we’ve located our DNS server, created our packet, opened a socket, successfully sent our data, and received a response from the server, here comes the easy part! Well, not really, there aren’t any really easy parts in this project! All the data we received into our buffer is again a string of bits that we must manually separate into meaningful blocks of data. Binary Tides has an excellent article that fully illustrates parsing the received data.

Our received packet is in exactly the same format as the DNS packet we sent to the server: DNS Header, Question, Answer, Authority, and Additional sections. Since the DNS Header is of fixed length we can easily read in how many answer, authoritative, and additional RRs are contained in the packet. First things first, though –we must read the flags from the header section to determine if there was an error of some sort reported by the server.

The flags section of the header will indicate response messages, similar to the “200 OK” responses you get from a HTTP GET request. The thing is, though, that they’re represented as a four-bit number (the “RCODE” part of the above figure). There is no basic data type we can cast into just four bits so what I did is grab 1 byte with an unsigned short, and then bit-mask the top nybble with 0’s so I would get something like 0000 0010 instead of possibly 0010 0010, which indicates a response code of 34 instead of 2, and there does not exist a response code of 34. What’s more, a response code of 2 indicates that the server failed to process the query, so without bit-masking we may just assume the server could not find a DNS entry instead of the real reason.

So after we receive a flag of 0 (no error), we can parse out all our answers. The manner to do so involves carefully separating bits out of the bitstring, via casting into a variety of types. I personally based my code off the Binary Tides article above, so I could do no better job explaining how it’s done than that author.

Now comes the (actual) easy part: processing all the data! After all the records in batch mode have been processed, we have a wealth of information. We can calculate average wait times, percentage of failed queries, percentage of records with no DNS entries, or take it a step further and calculate statistics based on # of threads run, # of entries in the batch file, whether the project is run in debug or release mode, and then compare all our results to those had we just used gethostbyname() and gethostbyaddr().

NOTE: Unless you are hosting your own domain name server through something like Simple DNS, I highly advise against running a batch file of more than a few entries as well as more than a few threads! The sheer volume of DNS queries coming into the DNS server could overload it and either crash it or cause network outages for other users on the network! This could be viewed as a malicious network attack (similar to a DOS attack) and it can be traced back to your computer!

Ok, now that I’ve got that out of the way, I can tell you this: I was on the University of Dayton’s network and could not run my project for large batches of data due to the above stated information. Additionally, Simple DNS could not be properly configured because UD’s DNS servers refuse to transfer records to other computers (most likely for security reasons). BUT, I was able to gather some overall conclusions based on my own small tests, and reading the reports from the few other people in my class who also built DNS resolvers.

This project is much, much faster than gethostbyname() and gethostbyaddr() for all sizes of data and number of threads running. Runs of even very large batch files could be as much as 5-10 times faster! This is probably due to the fact that the MSDN functions are more robust and secure, but the difference is astonishing. Also, projects actually seemed to run faster on debug versions than release mode. Strange, considering STL insertion is about 50 times faster in release mode. I do not know why this is, and can’t even really guess at it.

So there you have it, a DNS Resolver, built over the course of many hours during a two-week span. It is a huge learning experience, and really teaches you that re-inventing the wheel can cost many hours of time lost. Had one not known functions like gethostbyname() existed, they may have tried to do something like this, spending 60 hours when they could have spent 10 minutes reading the MSDN function documentation.

Also, it teaches you some nuts-and bolts about what has to go on when you make a request as simple as “www.google.com” into your URL toolbar of your browser.

Finally, if you want to look at the DNS packets (as well as other types of packets) streaming across your network, I suggest you install Wireshark, a free application for sniffing your packets. The program was so essential to me during the building of this project, as it allowed me to carefully examine each of my malformed packets and find out what I was doing wrong.

Thanks for reading, if you got this far. This was one of my most difficult and complex projects to date. If you want the source code… well too bad 😦 It could be used for malicious purposes and I don’t want to freely distribute it. If you have any questions about the whole DNS lookup process, though, I would be glad to help! Find my e-mail address on my “About Me” Page.

Advertisements

Tags: , , ,

3 Responses to “Building a DNS Resolver”

  1. What’s Wrong With This Code? « Andy G's Blog Says:

    […] Wrong With This Code? By gieseanw I meant to include this in my post about Building a DNS Resolver, as it is a small, yet significant bug that took me a little while to find. What the code should be […]

  2. bogdan Says:

    hello! Do you think that is possible to look over the source code of this project? I am involved in a similar project. Thank you. Bogdan

    • gieseanw Says:

      Hi Bogden, I didn’t want to release the source code into the wild for fear of “script kiddies” just getting a hold of it and using it for nefarious purposes. Perhaps we could communicate via email and work something out? Contact me by the email address found on my About Me page.

      -Andy

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: