DNSSEC and zone transfers: what you need to know
Welcome to The Quirks of the DNS, a series of occasional blog posts where I highlight some of the odd corners of the Domain Name System (DNS), the universal database for all things Internet. We will look at some interesting issues that can occur with the DNS, and I will give some recommendations for how to avoid problems.
Getting in the zone: two ways to sync primary and secondary servers
The DNS needs to transfer zone data between primary and secondary servers. Traditionally this was done using full zone transfers (AXFR). AXFR is still used for the initial transfer, and it transfers an entire zone, that is, a delimited, separately administered part of the DNS. Zones vary greatly in size, from a few records to millions. The size of an AXFR varies accordingly, as do the bandwidth and computing power needed to carry out the transfer.
Every time a record in a zone is updated on the primary server, the secondary server needs to be informed about the change. Traditionally this was done by having the secondary server request a full zone transfer, which would, obviously, include the new records. However, a lot of time and effort is wasted in transferring records that have not changed and are already present on the secondary server.
So Incremental Zone Transfers (IXFR) were developed. In an IXFR, only the records connected to the actual changes to the zone are transferred. Every zone has a version number so that servers can refer to "this version of the zone" (with the corresponding exact content) and "that version of the zone" (with corresponding different content). When a secondary server requests an IXFR, it basically says to the primary server "tell me what I need to do to catch up from my version of the zone <version-number> to the latest version". The primary server will then respond with "to go from version <version-number> to <new-version-number> you need to remove the following records from the zone: XXX, XXX, XXX, and then add the following records to the zone: YYY, YYY, YYY".
If the change is small, this makes perfect sense. If you change 10 records in a zone with 5,000 records, an IXFR will contain 20 records (remove 10, add 10), whereas a full AXFR will contain 5,000 records (all the records in the new version of the zone). The gain in time and network resources is obvious.
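The "remove these, add those" exchange described above can be sketched as a set difference between two versions of a zone. This is only an illustration: the Record type and the function names are made up for this example, and the real IXFR protocol also carries SOA records with the version numbers, which are left out here.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """A simplified DNS record (hypothetical type, for illustration only)."""
    name: str
    rtype: str
    rdata: str


def ixfr_diff(old_zone: set, new_zone: set) -> tuple:
    """Return (records to remove, records to add) to get from old_zone to new_zone."""
    to_remove = old_zone - new_zone
    to_add = new_zone - old_zone
    return to_remove, to_add


def apply_ixfr(zone: set, to_remove: set, to_add: set) -> set:
    """Apply an incremental update on the secondary: remove first, then add."""
    return (zone - to_remove) | to_add


# Version 1 and version 2 of a tiny zone: only the address of www changes.
old = {
    Record("www.example.", "A", "192.0.2.1"),
    Record("mail.example.", "A", "192.0.2.25"),
}
new = {
    Record("www.example.", "A", "192.0.2.2"),
    Record("mail.example.", "A", "192.0.2.25"),
}

removed, added = ixfr_diff(old, new)
print("remove:", removed)
print("add:   ", added)
```

Only the record that changed appears in the update (once as a removal and once as an addition), while the unchanged mail record is never sent.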
Enter DNSSEC: the problem of timestamps and RRSIG records
DNS Security Extensions (DNSSEC) adds security to the DNS using public key cryptography to authenticate the data sent in response to a DNS query. This means that the “answers” sent from DNS servers in response to DNS queries can be validated as genuine.
However, when it comes to IXFR, DNSSEC changes the picture. With DNSSEC, every record (or, to be precise, every record set [1]) is accompanied by one or more signature records (RRSIG). RRSIG records are clunky, complicated and big. One of their properties is that they contain timestamps in absolute time (a specific moment in time identified by date and clock time, always expressed in UTC). They have an "inception time" and an "expiration time" which are points in time that delineate their validity. They are invalid before the inception time, and after the expiration time. In order to keep DNS signatures valid at all times, the zone administrator needs to update them with new timestamps on a regular basis. In doing so, one must also change the signature data itself (which is another part of the RRSIG record). This means that even though the basic DNS data hasn't changed, RRSIG records in a zone will change regularly just in order to keep the signatures current.
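As a minimal sketch of the validity window described above, the following checks whether a signature is valid at a given moment. It assumes the timestamps are written in the YYYYMMDDHHmmSS form used in zone files, and it ignores the wrap-around arithmetic that real validators apply to these 32-bit time fields. The function names are invented for this example.

```python
from datetime import datetime, timezone


def parse_rrsig_time(text: str) -> datetime:
    """Parse a YYYYMMDDHHmmSS timestamp; RRSIG times are always in UTC."""
    return datetime.strptime(text, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def rrsig_is_valid(inception: datetime, expiration: datetime, now: datetime) -> bool:
    """A signature is valid from its inception time up to its expiration time."""
    return inception <= now <= expiration


# Example values, chosen only for illustration.
inception = parse_rrsig_time("20240101000000")
expiration = parse_rrsig_time("20240131000000")

print(rrsig_is_valid(inception, expiration, parse_rrsig_time("20240115120000")))
print(rrsig_is_valid(inception, expiration, parse_rrsig_time("20240201000000")))
```

Because the window is expressed in absolute time, nothing about the basic DNS data needs to change for a signature to stop being valid: the clock simply moves past the expiration time.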
A basic approach to signing zones would be to add RRSIG records for all the record sets in your zone using the same inception time and expiration time for all RRSIG records. These records would all expire at the same time, but as long as you updated them before that happens, you would be all good. However, if you updated them all at the same time using the same validity period again, they would all expire for the second time at the same point in time. This does not seem like a problem until you consider how this interacts with IXFRs.
RRSIG records, IXFRs and jitter
When a zone is signed using the method above, every re-signing changes all the RRSIG records in the zone, and there is roughly one of them for every record set in the zone. The RRSIG records are also bigger than most other records in the zone, so in a DNSSEC-signed zone, the RRSIG records form the bulk of the data. As a consequence, re-signing a zone changes the bulk of the zone data.
Remember that an IXFR sends information twice: "remove this, add that". If the RRSIG records are what's being changed, you will essentially say "remove the bulk of the zone, add a new bulk of the zone". Transferring the bulk of the zone twice actually means transferring more data than you would using a normal full AXFR which says "replace everything you've got with this", and which only sends the data once over the connection.
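A back-of-the-envelope model makes the comparison concrete. The record sizes below (50 bytes for a basic record, 150 bytes for an RRSIG) are assumptions picked for illustration, not measurements, and the model assumes one record and one RRSIG per record set.

```python
BASIC_SIZE = 50    # assumed average size of a basic record, in bytes
RRSIG_SIZE = 150   # assumed average size of an RRSIG record, in bytes


def axfr_bytes(n_rrsets: int) -> int:
    """A full transfer sends every basic record and every RRSIG once."""
    return n_rrsets * (BASIC_SIZE + RRSIG_SIZE)


def ixfr_bytes(changed_basic: int, changed_rrsigs: int) -> int:
    """An incremental transfer sends each changed record twice: remove, then add."""
    return 2 * (changed_basic * BASIC_SIZE + changed_rrsigs * RRSIG_SIZE)


zone_size = 5_000

full_transfer = axfr_bytes(zone_size)
small_change = ixfr_bytes(changed_basic=10, changed_rrsigs=10)
full_resign = ixfr_bytes(changed_basic=0, changed_rrsigs=zone_size)

print(f"AXFR of the whole zone:        {full_transfer:>9,} bytes")
print(f"IXFR, 10 records changed:      {small_change:>9,} bytes")
print(f"IXFR, every RRSIG re-signed:   {full_resign:>9,} bytes")
```

Under these assumptions, the IXFR for a small change is a tiny fraction of the AXFR, but the IXFR that follows a re-signing of the whole zone is one and a half times the size of simply sending the zone again.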
This is mitigated using something called "jitter". In low-level network terminology, jitter refers to the unwelcome variations in the time it takes for a packet to traverse a particular network segment. In this instance, though, jitter is used to introduce an important variation in RRSIG records. Instead of signing the zone using the exact same parameters for every RRSIG record, the inception time and expiration time are randomly varied by a small amount. A program will keep track of all the RRSIGs in the zone, and will automatically re-sign the ones that are about to expire, and only those; not the entire zone all at once. Over time, the jitter will create a situation where large numbers and randomness "smear out" the timestamps so that they all occur at different points in time.
Instead of infrequent, big updates of a lot of data (all the RRSIG records), there will be a trickle of frequent, small updates of only a few records. This approach is much better suited to IXFR. Large updates are always more difficult to handle than small ones, so a continuous stream of small updates is a more stable way to operate than large, infrequent updates: the failure of any one update has a much smaller impact.
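The idea can be sketched in a few lines. This is not how any particular signing software does it: the validity period, jitter range, refresh threshold and function names are all assumptions made for the example, and only the expiration time is varied here.

```python
import random
from datetime import datetime, timedelta, timezone

VALIDITY = timedelta(days=30)    # assumed nominal signature lifetime
MAX_JITTER = timedelta(days=3)   # assumed maximum random reduction of the lifetime
REFRESH = timedelta(days=7)      # assumed threshold: re-sign when this little time remains


def jittered_expiration(now: datetime, rng: random.Random) -> datetime:
    """Pick an expiration time that is the nominal lifetime minus a random jitter."""
    jitter = timedelta(seconds=rng.uniform(0, MAX_JITTER.total_seconds()))
    return now + VALIDITY - jitter


def due_for_resigning(expirations: dict, now: datetime) -> list:
    """Return only the record sets whose signatures are about to expire."""
    return [name for name, exp in expirations.items() if exp - now <= REFRESH]


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
rng = random.Random(42)

# Expiration times tracked per record set (example data).
signatures = {
    "www.example. A": now + timedelta(days=2),
    "mail.example. MX": now + timedelta(days=20),
}

for name in due_for_resigning(signatures, now):
    signatures[name] = jittered_expiration(now, rng)
    print("re-signed", name, "new expiration", signatures[name])
```

Only the record set that is close to expiring is re-signed, and its new expiration time lands at a slightly random point, so the next round of re-signing will not line up with its neighbours.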
How to automate jitter signing and what to consider
These days, most implementations of DNSSEC-signing software enable "auto-signing with jitter", at least if configured to do so. One drawback is that the signing software must have access to the private (secret) cryptographic key used to generate the signatures. It must also be able to release zone transfers (IXFR/AXFR) to secondary servers. This means that the key will have to be "on-line" in a server that is, to some extent, reachable from the Internet. This is an important security issue that needs to be addressed properly, where the exact "how" will vary by organisation.
Unless handled properly, time-based updates can lead to two problems. The first arises if you have a very large zone, anything in the region of tens or hundreds of thousands of records, or more. Computing signatures is a fairly complicated task that requires noticeable amounts of resources. If all signatures are allowed to expire at the same time, creating new ones will require a large amount of compute resources for a short period of time, and then nothing until it happens again. In other words, you will need a big computer that idles most of the time. The updates will be enormous when they happen, and handling them at the secondary server also requires substantial resources, again only temporarily. If jitter is used, this changes to a model where the use of resources is spread out evenly, and where a steady stream of small updates keeps things running smoothly and predictably.
The second problem arises where the same primary and secondary servers serve a large number of zones. If all the zones have records that expire at the same point in time, all the zones will be re-signed almost simultaneously, and the ensuing zone transfers (be they IXFR or AXFR) will happen almost simultaneously. If jitter is used, zone updates for the different zones will again be small, and will happen at different points in time. This, again, creates a continuous stream of small updates which keeps the system running well.
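The difference in workload can be illustrated with a toy simulation that counts how many signatures are re-signed each day, with and without jitter. All the numbers (1,000 signatures, a 30-day validity period, re-signing when 7 days remain, up to 10 days of jitter) are assumptions chosen to make the effect visible, and the signatures could just as well belong to one large zone as to many small ones.

```python
import random


def simulate_resigning(n_sigs, days, validity, refresh, max_jitter, seed=0):
    """Return a list with the number of signatures re-signed on each day."""
    rng = random.Random(seed)

    def new_expiry(day):
        # Nominal lifetime minus a random number of whole days of jitter.
        return day + validity - rng.randint(0, max_jitter)

    # Day 0: everything is signed for the first time (not counted below).
    expiry = [new_expiry(0) for _ in range(n_sigs)]
    per_day = []
    for day in range(1, days + 1):
        count = 0
        for i, exp in enumerate(expiry):
            if exp - day <= refresh:      # about to expire: re-sign it
                expiry[i] = new_expiry(day)
                count += 1
        per_day.append(count)
    return per_day


no_jitter = simulate_resigning(1000, 90, validity=30, refresh=7, max_jitter=0)
with_jitter = simulate_resigning(1000, 90, validity=30, refresh=7, max_jitter=10)

print("busiest day without jitter:", max(no_jitter))
print("busiest day with jitter:   ", max(with_jitter))
print("idle days without jitter:  ", no_jitter.count(0))
print("idle days with jitter:     ", with_jitter.count(0))
```

Without jitter, every signature is re-signed on the same day and the system is idle in between. With jitter, the same work is spread over many days, so both the signer and the secondary servers see smaller peaks.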
Recommendation
Using IXFRs saves a lot of time and network resources. If you wish to combine DNSSEC and IXFR, you should choose software that uses jitter, keeps track of signatures, and automates the signing process.
I hope you have found this post useful. Stay tuned for further posts as we delve deeper into the quirks of the DNS! In the meantime, if you have any questions about your DNS setup and how Netnod’s DNS anycast service can help, you can contact us here.
Endnotes
[1] A record set is a group of records with the same name, DNS class, and DNS type. They will also, by necessity, have the same time to live (TTL). For more on TTL, see my previous post here.