
MirrorManager automatic local mirror selection

Matt Domsch takes a look at MirrorManager in Fedora. "As you know, Internet routing uses BGP (Border Gateway Protocol), and Autonomous System Numbers (ASNs) to exchange IP prefixes (aa.bb.cc.dd/nn) and routing tables. By grabbing a copy of the global BGP table a few times a day, MM can know the ASN of an incoming client request, and Hosts in the MM database have grown two new fields: ASN and "ASN Clients?". MM then looks to see if there is a mirror with the same ASN as each client, and offers it up earlier in the list."
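The selection step described in the quote can be sketched roughly as follows. This is a minimal illustration, not MirrorManager's actual code: the prefix table, the mirror records, and the function names `lookup_asn` and `order_mirrors` are all hypothetical, and treating "ASN Clients?" as a per-host flag that opts a mirror in to serving clients from its own ASN is an assumption. A real deployment would load the table from a BGP RIB dump and would likely use a radix tree rather than a linear scan.

```python
import ipaddress

# Hypothetical excerpt of a prefix -> origin ASN table.
# Documentation address ranges and documentation ASNs are used on purpose.
PREFIX_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/24"): 64500,
    ipaddress.ip_network("198.51.100.0/24"): 64501,
    ipaddress.ip_network("198.51.0.0/16"): 64502,
}


def lookup_asn(client_ip, table=PREFIX_TO_ASN):
    """Return the origin ASN of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(client_ip)
    best = None
    for net, asn in table.items():
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None


def order_mirrors(client_ip, mirrors):
    """Move mirrors that share the client's ASN to the front of the list.

    Assumption: a mirror is only promoted if its "asn_clients" flag is set.
    sorted() is stable, so the remaining order is preserved.
    """
    client_asn = lookup_asn(client_ip)
    if client_asn is None:
        return list(mirrors)
    return sorted(
        mirrors,
        key=lambda m: not (m["asn"] == client_asn and m.get("asn_clients", False)),
    )
```

A client whose address falls in several prefixes is attributed to the most specific one, which is how routing itself would treat it.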

MirrorManager automatic local mirror selection

Posted Sep 30, 2009 2:59 UTC (Wed) by jspaleta (subscriber, #50639) [Link]

I don't think you do Matt's involvement justice. He's the lead developer, and I think it's one of the hidden gems in the Fedora infrastructure. I think every single distribution should be running its own MirrorManager service as a way to make better use of mirroring resources. MirrorManager really gives local network admins a way to exert some control over how Fedora clients use external bandwidth without getting in the way of client operation. It's very slick.

-jef

Killer App: Site-Local Netblocks

Posted Sep 30, 2009 3:46 UTC (Wed) by wtogami (subscriber, #32325) [Link]

One really awesome feature of MirrorManager is its "Site-Local Netblocks". You can use it to set up a private mirror for your local network, and all yum clients on your network automatically know to try your local mirror before public mirrors.

My office has Internet access all coming from a single NAT address, so MirrorManager tells all yum clients from this network to point at my private mirror within the office. It is great for yum updates and nightly test composes to use an ultra-fast mirror without any manual configuration. Visitors to our office automatically use the local mirror while plugged into our network, without having to change any configuration.

A university campus could list all of its public IP ranges in MirrorManager to direct all on-campus clients to use its local mirror, even if that local mirror is a public mirror. If the local mirror is down, the clients are smart enough to fall back to a random public mirror within a nearby geography.
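The netblock matching and fallback behaviour described above might look something like this sketch. It is an illustration under assumptions rather than MirrorManager's or yum's real logic: the record layout and the names `candidate_mirrors` and `first_reachable` are hypothetical, and the geographic filtering of public mirrors is left out for brevity.

```python
import ipaddress
import random


def candidate_mirrors(client_ip, private_mirrors, public_mirrors, rng=random):
    """Server side: list site-local mirrors first, then shuffled public ones.

    Each private mirror carries a "netblocks" list of CIDR strings; a single
    NAT address can be written as a /32.
    """
    addr = ipaddress.ip_address(client_ip)
    local = [
        m["url"]
        for m in private_mirrors
        if any(addr in ipaddress.ip_network(block) for block in m["netblocks"])
    ]
    fallback = [m["url"] for m in public_mirrors]
    rng.shuffle(fallback)
    return local + fallback


def first_reachable(urls, is_up):
    """Client side: walk the list in order and use the first mirror that answers.

    `is_up` is a caller-supplied probe (hypothetical here).
    """
    for url in urls:
        if is_up(url):
            return url
    return None
```

Because the local mirror is simply the first entry in an ordinary mirror list, a client needs no special configuration, and an outage of the local mirror degrades to normal public-mirror behaviour.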

MirrorManager automatic local mirror selection

Posted Sep 30, 2009 6:53 UTC (Wed) by butlerm (subscriber, #13312) [Link]

This is great stuff.

Someday it would be nice if there was a way for clients to transitively ask
their nearest BGP router questions like, "what is the AS number of and
estimated route latency to each of the following IP addresses?"

Assuming that BGP (and the corresponding IGPs) were extended to carry,
transitively modify, and propagate such information, it would be trivial for
any client to pick the closest host from a list of mirrors, CDN nodes, or
whatever.

MirrorManager automatic local mirror selection

Posted Sep 30, 2009 8:14 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

It probably doesn't make sense to do this: routers are expensive, busy, and critical infrastructure. Adding frivolous (and publicly accessible) features increases the load, makes them more complicated, and increases the exposure to attack from black hats, all bad things. An anycast UDP service running separately from the routers (at least on non-toy networks) could make sense, but to convince the big ISPs you'd have to show how it's going to save them a bunch of money.

MirrorManager automatic local mirror selection

Posted Sep 30, 2009 15:05 UTC (Wed) by butlerm (subscriber, #13312) [Link]

It saves money like this: (1) Heavy download traffic traverses a smaller
number of links and routers, reducing the load on (and lowering the
ultimate cost of) each one, (2) Latency is lower, increasing application
performance, making the provided service more valuable.

In addition, if the load were really a problem, the queries could be
shunted off to a route server that carried the same information. What is
the alternative? Test the latency to ten or twenty different hosts until
you find the one that is optimal for your location?

The MirrorManager way is to capture a database of net block to AS mappings,
and use AS identity as a proxy for latency. That works great if you are in
the same AS as one of the servers. If not, you either test latency
manually, or you have to access some sort of local route server to make a
latency estimate. AS path length would be a lot better than nothing, and
wouldn't require any new BGP/IGP attributes. A real latency estimate would
be better of course, especially for cases with multiple mirrors in the same
(large) AS.
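As a rough sketch of the "AS path length is better than nothing" idea: given the AS paths a local route server sees toward each mirror's AS, mirrors can be ranked by hop count, with same-AS mirrors first. The function name `rank_by_as_path` and the shape of `as_paths` are hypothetical; how the paths are obtained is exactly the open question in this thread.

```python
def rank_by_as_path(client_asn, mirrors, as_paths):
    """Order mirrors by AS path length as seen from the client's network.

    `as_paths` maps a destination ASN to the list of ASNs on the route
    toward it (assumed to come from some local route server).
    Mirrors in the client's own AS count as zero hops; mirrors with no
    known path sort last. sorted() is stable, so ties keep their order.
    """
    def hops(mirror):
        if mirror["asn"] == client_asn:
            return 0
        path = as_paths.get(mirror["asn"])
        return len(path) if path is not None else float("inf")

    return sorted(mirrors, key=hops)
```

This does not distinguish between multiple mirrors inside one large AS, which is the case where a real latency estimate would still be needed.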

MirrorManager automatic local mirror selection

Posted Sep 30, 2009 15:55 UTC (Wed) by jsatchell (guest, #6236) [Link]

Is latency the correct measure?

For big downloads (like most packages), I think it is probably effective bandwidth. Clearly this depends on latency somewhat, but varies more strongly with congestion, which can't be measured statically.
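One commonly cited approximation that illustrates this point is the Mathis et al. model for steady-state TCP throughput, which bounds the rate by (MSS / RTT) * (C / sqrt(p)), where p is the packet loss rate and C is about sqrt(3/2). The sketch below only evaluates that formula; the function name and the sample figures in the usage note are illustrative assumptions, not measurements.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_seconds, loss_rate):
    """Approximate upper bound on TCP throughput in bits per second.

    Uses the Mathis et al. approximation with C = sqrt(3/2).
    `loss_rate` must be greater than zero.
    """
    c = math.sqrt(1.5)
    return (mss_bytes * 8 / rtt_seconds) * (c / math.sqrt(loss_rate))
```

Under this model, a nearby mirror at 20 ms RTT with 1% loss comes out slower than a distant one at 100 ms RTT with 0.01% loss, so latency alone does not settle the choice.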

The MM behaviour of seeking out the same ASN seems robust and sane.

MirrorManager automatic local mirror selection

Posted Sep 30, 2009 16:48 UTC (Wed) by butlerm (subscriber, #13312) [Link]

Selecting a host in the same AS works great if there *is* one in the same AS.
And that policy, of course, is *much* better than nothing.

Statically estimated latency doesn't adapt to dynamic congestion of course,
but the metric for typically congested links should be statically increased
to compensate.

MirrorManager automatic local mirror selection

Posted Sep 30, 2009 18:46 UTC (Wed) by mmcgrath (guest, #44906) [Link]

I'd be curious to see cases where a host is in the same ASN as a mirror, but is a poor mirror choice.

MirrorManager automatic local mirror selection

Posted Oct 2, 2009 3:23 UTC (Fri) by gdt (subscriber, #6284) [Link]

You are confusing the routing and forwarding functions of a "router". Wire-speed forwarding is expensive, the exchange of route information is cheap and can be hosted on a low-end CPU.

The problem for us ISPs offering access to BGP is three-fold. (1) Lack of solid routing software outside of a router chassis (and yes, I know about Quagga, Xorp and OpenBGP. Like a lot of open source software, they are all 80% solutions and none of them can implement even the routing functions of a typical backbone router [MBGP+OSPF+OSPFv3+MSDP for IPv4+v6 with MPLS-TE and graceful restart]). (2) The malicious use of this information for targeting DoS attacks against infrastructure. There's some fine academic work done at Matt Roughan's group on removing interior detail from BGP without altering exterior path selection, but it's yet to be fielded. (3) The lack of an IETF standard protocol. We don't want to be talking BGP to hosts; that's millions of TCP connections. We want a scalable infrastructure more like DNS.

However, ISPs *are* interested in providing this functionality. We'd be very happy if p2p traffic followed our BGP metrics rather than attempting to determine its own lowest-cost path based on measurements of latency.

Furthermore, for high-traffic systems, like the mirrors of major Linux distributions, it is well in our interest to provide a BGP feed to this small number of trusted parties. Of course, we don't do that from a core route reflector; instead we send it through a router configured in an "Internet exchange route server" mode (which both Quagga and OpenBSD do well).

MirrorManager automatic local mirror selection

Posted Oct 2, 2009 6:26 UTC (Fri) by dlang (✭ supporter ✭, #313) [Link]

if ISPs are really interested in providing this sort of service (let me tell you what the best mirror to use is) then it would seem that the right answer would be to create a DNS-like system (it could even use the DNS protocol, just a different tree of servers) and get ARIN to define an IP address to be used with anycast for this service.

every ISP could provide a service listening on this IP address that would answer a DNS-like query with the name or IP of the resource to use. the client then does a regular DNS lookup of what gets returned by the location service and uses that mirror. if the server doesn't know a good answer it can just return the same name that was queried and the client would then fall back to a normal DNS lookup of that source.

small ISPs (and other small networks/businesses) that don't want to run this service can just pass the IP upstream to their ISP.

by using anycast the service can be hosted on any number of machines, spread throughout the ISP's network as appropriate for the load.

the only place this would fail is in organizations that have egress filters that block DNS requests from the inside. I suspect that most of those organizations would also discourage downloading so there probably won't be a lot of traffic from them. they could also make exceptions to the egress rules to allow DNS queries to the dedicated anycast address.
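The client-side flow proposed in this comment could be sketched as below. No such anycast service exists in the thread, so everything here is hypothetical: `ask_location_service` stands in for a query to the imagined anycast address and `dns_lookup` for an ordinary resolver call, both supplied by the caller.

```python
def resolve_mirror(name, ask_location_service, dns_lookup):
    """Resolve a mirror name via a (hypothetical) ISP location service.

    The service returns the name of the resource to use. If it has no good
    answer it echoes the queried name back, and if it cannot be reached at
    all (for example, blocked by an egress filter) the client behaves as if
    it had echoed the name. Either way the final step is a normal DNS lookup.
    """
    try:
        suggested = ask_location_service(name)
    except OSError:
        suggested = name
    return dns_lookup(suggested or name)
```

Keeping the fallback identical to a plain DNS lookup means a client on a network without the service behaves exactly as it does today.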

MirrorManager automatic local mirror selection

Posted Oct 1, 2009 13:02 UTC (Thu) by sjlyall (subscriber, #4151) [Link]

Well if you just want the AS originating a specific network then you could
use Team Cymru's IP to ASN Mapping service. It's available via whois, http, https and dns:

http://www.team-cymru.org/Services/ip-to-asn.html
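For the DNS interface, a lookup is a TXT query for the IPv4 octets in reverse order under the `origin.asn.cymru.com` zone, and the answer is a pipe-separated record whose first two fields are the origin ASN(s) and the matching prefix. The sketch below only builds the query name and parses a response string, with no network access; the helper names are hypothetical, and the zone name and record layout are stated from general knowledge of the service rather than from this thread.

```python
import ipaddress

CYMRU_ORIGIN_ZONE = "origin.asn.cymru.com"


def cymru_origin_qname(ip):
    """Build the TXT query name for an IPv4 address (octets reversed)."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    return ".".join(reversed(str(addr).split("."))) + "." + CYMRU_ORIGIN_ZONE


def parse_origin_txt(txt):
    """Parse a TXT answer into (list of origin ASNs, prefix).

    Expected layout: ASN(s) | prefix | country | registry | allocation date.
    Multiple origin ASNs appear space-separated in the first field.
    """
    fields = [f.strip() for f in txt.strip('"').split("|")]
    asns = [int(a) for a in fields[0].split()]
    return asns, fields[1]
```

The resulting name can be handed to any resolver that supports TXT queries, for example `dig +short TXT <name>`.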

MirrorManager automatic local mirror selection

Posted Oct 1, 2009 19:34 UTC (Thu) by mdomsch (subscriber, #5920) [Link]

yes, I saw that, and the DNS lookup method at asn.routeviews.org. I don't want MM to have to count on any external services to answer queries (we've had troubles when data centers or network connections fail). Loading the asn.routeviews.org zonefile into bind took 1.1GB RAM, where parsing the RIB and loading it into the python web app takes ~165MB, which, while large, is manageable.

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds