Module 4 · Lesson 2
Anycast DNS: Improving Resilience and Performance
⏱ 50 minutes
How Cloudflare, NS1, and Route 53 route billions of queries to the nearest healthy node, and how you can do the same if you're running your own authoritative infrastructure.
The IP address 1.1.1.1 exists in more than 300 locations simultaneously. When you query it from Paris, you hit a datacenter in Paris. From São Paulo, São Paulo. There's no load balancer in the middle, no geolocation database lookup, no application-layer routing. The network itself figures it out.
That's anycast. It's been the standard architecture for authoritative DNS at any meaningful scale for over a decade. If you're not using it, you're either very small or paying for the privilege of suboptimal latency.
How Anycast Works
In normal unicast routing, each IP address belongs to exactly one network interface in one physical location. When a packet is sent to that IP, BGP routes it to that specific location.
In anycast, the same IP prefix is announced from multiple locations simultaneously via BGP. Each location advertises the same route. Routers follow standard BGP path selection — shortest AS path, lowest MED, local preference — and packets are delivered to whichever location is "nearest" in BGP terms.
This means:
- A query from a client in Tokyo hits your Tokyo PoP
- A query from Frankfurt hits your Frankfurt PoP
- Both are using the same IP address
- Neither client does anything special
When a node goes offline and withdraws its BGP announcement, traffic automatically reroutes to the next-best node. No TTL to wait for. No client reconfiguration. BGP convergence typically happens in 30–90 seconds.
Why This Is the Standard for Authoritative DNS
Authoritative DNS has two performance requirements: low latency (answer quickly) and high availability (always answer). Anycast addresses both.
Latency: Instead of sending a query from Tokyo to a server in Virginia, you answer it locally. The difference between 2ms and 200ms RTT is measurable in page load time. For recursive resolvers that need to chain multiple authoritative queries, this compounds.
Availability: If your Frankfurt node dies, traffic shifts to Amsterdam or Paris. The client's resolver retries within its standard timeout and hits a different node that happens to have the same IP. From the client's perspective, there was a brief timeout. Not an outage.
Scale: You can add capacity by adding nodes. Each node announces the same prefix. Traffic distributes naturally to whoever is closest.
Who Uses It
Every major DNS provider uses anycast for authoritative service:
- Cloudflare: 300+ PoPs, all serving 1.1.1.1 (resolver) and authoritative via anycast
- Route 53: AWS edge nodes globally, same NS IP answering from nearest available region
- NS1: Anycast-native from the start, one of their core selling points
- Google Public DNS: 8.8.8.8 is anycast across Google's global network
For authoritative DNS, the nameserver IPs themselves are anycast. When your zone has NS ns1.exampledns.com, that hostname resolves to an anycast IP that routes to the nearest PoP.
The Trade-offs
Anycast is not free. There are real operational complications.
Zone Transfers Don't Work the Way You'd Expect
AXFR (full zone transfer) and IXFR (incremental) rely on TCP connections from a secondary to a specific primary. With anycast, "the primary" isn't a single location — the same IP might route to different nodes depending on where the secondary is.
The standard pattern is a hidden primary: a unicast (non-anycast) server that holds the authoritative copy of the zone, which is not publicly advertised. All anycast nodes receive zone transfers from this hidden primary. The hidden primary is never in your public NS records.
Zone delegation:
example.com NS ns1.exampledns.net <-- anycast, public
example.com NS ns2.exampledns.net <-- anycast, public
Hidden primary:
primary.internal.exampledns.net <-- unicast, not in NS records
All anycast nodes AXFR/IXFR from here
This is the topology used by every serious authoritative operator. The hidden primary is protected, monitored, and only reachable from the anycast nodes.
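As a concrete sketch of this topology, here is roughly what the zone configuration looks like with BIND 9 (the addresses, file paths, and the choice of BIND itself are illustrative; any authoritative server with zone-transfer support works the same way):

```
// On the hidden primary (unicast, never in NS records)
zone "example.com" {
    type primary;
    file "zones/example.com.db";
    allow-transfer { 192.0.2.10; 192.0.2.11; };  // anycast nodes' unicast addresses only
    also-notify    { 192.0.2.10; 192.0.2.11; };  // push NOTIFY when the zone changes
};

// On each anycast node
zone "example.com" {
    type secondary;
    primaries { 198.51.100.53; };  // hidden primary's unicast address
    file "zones/example.com.db";
};
```

The key property: the hidden primary's address appears only in node configuration, never in the public delegation, so it can be firewalled down to exactly the anycast fleet.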
Debugging Is Harder
When you query an anycast IP, you don't know which node answered. If there's an inconsistency — one node has stale data, one node has a misconfigured zone — you might see intermittent failures that you can't reproduce.
To debug anycast:
- Use dig with +short +identify to see the IP that answered (with anycast this usually just echoes the anycast address, so it rarely narrows things down)
- Send a CHAOS-class TXT query for id.server (or the older hostname.bind), or use dig +nsid; many operators, including Cloudflare, return an identifier for the responding node
- Use your DNS provider's API or dashboard to check which nodes are serving which data
- If you control the infrastructure, use node-specific unicast IPs (out-of-band management addresses) to query individual nodes directly
Inconsistency between anycast nodes is usually caused by failed zone transfers. Compare each node's SOA serial (dig +short SOA example.com @specific-node.internal) against the hidden primary's.
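That serial comparison is easy to script. A minimal sketch, assuming each node is reachable on a unicast management name (all host names here are placeholders):

```shell
#!/bin/bash
# check-serials.sh — flag anycast nodes whose zone copy is behind the
# hidden primary. Host names passed in are illustrative placeholders.

# Extract the serial (third field) from a `dig +short SOA` answer line:
#   mname rname serial refresh retry expire minimum
soa_serial() {
    printf '%s\n' "$1" | awk '{print $3}'
}

# Usage: check-serials.sh <zone> <hidden-primary> <node> [<node>...]
if [ "$#" -ge 3 ]; then
    zone=$1; primary=$2; shift 2
    ref=$(soa_serial "$(dig +short SOA "$zone" @"$primary")")
    for node in "$@"; do
        serial=$(soa_serial "$(dig +short SOA "$zone" @"$node")")
        [ "$serial" = "$ref" ] || echo "STALE: $node serial=$serial (primary: $ref)"
    done
fi
```

Run it from somewhere that can reach the management addresses; silence means every node matches the primary.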
BGP Hijacking Risk
Since anycast routing is based on BGP, a misconfigured or malicious AS announcing the same prefix can attract traffic. This is not theoretical — it happens. RPKI (Resource Public Key Infrastructure) validation mitigates this, but not all providers and networks implement it.
For your own infrastructure: make sure your IP announcements are covered by a valid ROA (Route Origin Authorization) in RPKI. For DNS providers you're relying on, check whether they publish ROAs.
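One way to check coverage from the command line is RIPEstat's public data API, which has an RPKI validation endpoint. A sketch (the endpoint path, query parameters, and JSON field names are assumptions based on RIPEstat's documented API; verify against their docs before depending on it):

```shell
#!/bin/bash
# roa-check.sh — ask RIPEstat whether a prefix/origin-AS pair is covered
# by a valid ROA. Endpoint and JSON shape per RIPEstat's data API.

# Pull the validation status ("valid" / "invalid" / "unknown") out of the
# RIPEstat JSON response read from stdin.
roa_status() {
    python3 -c 'import json,sys; print(json.load(sys.stdin)["data"]["status"])'
}

# Usage: roa-check.sh <asn> <prefix>, e.g. roa-check.sh AS13335 1.1.1.0/24
if [ "$#" -eq 2 ]; then
    curl -s "https://stat.ripe.net/data/rpki-validation/data.json?resource=$1&prefix=$2" \
        | roa_status
fi
```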
Implementing Anycast for Your Own Infrastructure
If you're running your own authoritative DNS infrastructure (common for large enterprises, hosting providers, and anyone who doesn't want to be fully dependent on a SaaS DNS provider), here's the architecture:
Requirements
- Your own ASN — you need to be able to originate BGP routes
- Your own IP block — at least a /24, since most providers won't accept more-specific announcements
- BGP sessions with transit or IXP at each location
- A BGP daemon on your DNS servers — Bird2 or FRRouting are the standard choices
Basic Node Setup (Bird2)
Each anycast node runs a BGP daemon that announces the anycast prefix when the DNS service is healthy:
# /etc/bird/bird.conf (simplified; router id and device protocol omitted)

# Originate the anycast prefix locally so BGP has a route to export.
# The DNS daemon binds an address from this prefix on the loopback interface.
protocol static anycast {
  ipv4;
  route 203.0.113.0/24 blackhole;
}

protocol bgp upstream1 {
  local as 65001;
  neighbor 198.51.100.1 as 64496;
  ipv4 {
    export filter {
      # Announce only the anycast prefix to the upstream
      if net = 203.0.113.0/24 then accept;
      reject;
    };
  };
}
Pair this with a health check that withdraws the route if the DNS daemon is unresponsive:
#!/bin/bash
# healthcheck.sh — run every 10 seconds (e.g. from cron or a systemd timer)
# Query the local DNS daemon; if it fails to answer within 2 seconds,
# shut down the BGP session so the route is withdrawn and traffic
# shifts to other nodes.
if ! dig +time=2 +tries=1 @127.0.0.1 health.internal.example.com > /dev/null 2>&1; then
    birdc disable upstream1
    logger "DNS healthcheck failed, withdrawing BGP route"
fi
When the service recovers, re-enable the session:
birdc enable upstream1
This is the core of anycast resilience: automatic failover based on service health, not just node availability.
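To run the check on a schedule, cron works, or a systemd timer pair along these lines (unit names and the script path are illustrative):

```
# /etc/systemd/system/dns-healthcheck.service
[Unit]
Description=DNS anycast health check

[Service]
Type=oneshot
ExecStart=/usr/local/bin/healthcheck.sh

# /etc/systemd/system/dns-healthcheck.timer
[Unit]
Description=Run DNS health check every 10 seconds

[Timer]
OnBootSec=10
OnUnitActiveSec=10

[Install]
WantedBy=timers.target
```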
Anycast with a DNS Provider
If you're using a managed DNS provider (Cloudflare, NS1, Route 53), anycast is just included. You don't configure it. What you do configure is:
- Multiple providers (secondary DNS) so one provider's anycast network going down doesn't take you offline
- Consistent TTLs across providers
- Zone transfer or API sync between primary and secondary providers
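A quick consistency check across providers: query every advertised nameserver for the zone's SOA and compare TTL and serial. A sketch (substitute your own zone name):

```shell
#!/bin/bash
# ns-consistency.sh — print the SOA TTL and serial each nameserver returns,
# so drift between providers is easy to spot.

# Summarize a full `dig +noall +answer` SOA line:
#   name ttl class SOA mname rname serial refresh retry expire minimum
soa_summary() {
    awk '{print "ttl=" $2, "serial=" $7}'
}

# Usage: ns-consistency.sh <zone>
if [ "$#" -eq 1 ]; then
    for ns in $(dig +short NS "$1"); do
        printf '%s: ' "$ns"
        dig +noall +answer SOA "$1" @"$ns" | soa_summary
    done
fi
```

If one provider reports an older serial or a different TTL than the others, your sync pipeline between providers is the first place to look.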
Key Takeaways
- Anycast routes traffic to the nearest BGP-adjacent node with the same IP. It's geography-aware routing at the network layer.
- Zone transfers require a hidden primary. AXFR to an anycast address is unreliable.
- Debugging anycast requires node-specific access or provider tooling — you can't tell which node answered from a standard dig
- For your own anycast: Bird2 or FRRouting plus BGP health-check withdrawal is the standard pattern
- The trade-offs are real but well-understood. Every major DNS provider has solved them. You benefit from their solutions when you use managed DNS.
Further Reading
- RFC 4786 — Operation of Anycast Services
- Bird2 documentation
- Cloudflare's anycast network overview
- RIPE NCC RPKI documentation
Up Next
DNS Monitoring and Logging Best Practices — what metrics to collect, what dashboards to build, and what thresholds to alert on.