
Security

SCALE 8x: Ten million and one penguins

By Jake Edge
March 10, 2010

At SCALE 8x, Ronald Minnich gave a presentation about the difficulties in trying to run millions of Linux kernels for simulating botnets. The idea is to be able to run a botnet "at scale" to try to determine how it behaves. But, even with all of the compute power available to researchers at the US Department of Energy's Sandia National Laboratories—where Minnich works—there are still various stumbling blocks to be overcome.

While the number of systems participating in botnets is open to argument, he said, current estimates are that there are ten million systems compromised in the US alone. He listed the current sizes of various botnets, based on a Network World article, noting that "depending on who you talk to, these numbers are either low by an order of magnitude or high by an order of magnitude". He also said that it is no longer reported when thousands of systems are added to a botnet; instead, the reports are of thousands of organizations whose systems have been compromised. "Life on the Internet has started to really suck."

Botnet implementations

Botnets are built on peer-to-peer (P2P) technology that largely came from file-sharing applications—often for music and movies—which were shut down by the RIAA. This made the Overnet, which was an ostensibly legal P2P network, into an illegal network, but, as he pointed out, that didn't make it disappear. In fact, those protocols and algorithms are still being used: "being illegal didn't stop a damn thing". For details, Minnich recommended the Wikipedia articles on subjects like the Overnet, eDonkey2000, and the Kademlia distributed hash table.

P2P applications implemented Kademlia to identify other nodes in a network overlaid on the Internet, i.e. an overnet. Information could be stored in and retrieved from the nodes participating in the P2P network. That information could be movies or songs, but it could also be executable programs or scripts. It's a "resilient distributed store". He also pointed out that computer scientists have been trying to build large, resilient distributed systems for decades, but they had little or nothing to do with the currently working example; in fact, it is apparently being maintained with money from organized crime syndicates.
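
For readers who want a feel for how Kademlia works, here is a minimal, hypothetical Python sketch of the XOR-distance lookup at its core; the node_id() helper, the neighbors map, and the parameter values are illustrative assumptions, not code from any real implementation:

    # Hypothetical sketch of a Kademlia-style XOR-distance lookup (toy example).
    import hashlib

    def node_id(name):
        # Kademlia uses 160-bit identifiers; SHA-1 of a name is fine for a toy.
        return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

    def distance(a, b):
        # The Kademlia metric: the distance between two IDs is their bitwise XOR.
        return a ^ b

    def iterative_lookup(start, target, neighbors, alpha=3, k=20):
        # Repeatedly ask the alpha closest known-but-unqueried nodes for nodes
        # even closer to the target.  Here `neighbors` maps a node ID to the IDs
        # it knows about; on a real network this would be a FIND_NODE RPC.
        known, queried = set(neighbors[start]), set()
        while True:
            candidates = sorted(known - queried,
                                key=lambda n: distance(n, target))[:alpha]
            if not candidates:
                return sorted(known, key=lambda n: distance(n, target))[:k]
            for n in candidates:
                queried.add(n)
                known.update(neighbors.get(n, ()))

Because the metric is just XOR, each node only needs to know about a logarithmic slice of the network to find any key, which is part of what makes the overlay so resilient when individual nodes vanish.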

Because the RIAA has shut down any legal uses of these protocols, they are difficult to study: "The good guys can't use it, but it's all there for the bad guys." And the bad guys are using it, though, as he mentioned earlier, it is difficult to get accurate numbers. The software itself is written to try to hide its presence, so that it only replies to some probes.

Studying botnets with supercomputers

In the summer of 2008, when Estonia "went down, more or less" and had to shut down its Internet because of an attack, Minnich and his colleagues started thinking about how to model these kinds of attacks. He likened the view of an attack to the view a homeowner might get of a forest fire: "my house is on fire, but what about the other side of town?". Basically, there is always a limited view of what is being affected by a botnet—you may be able to see local effects, but the effects on other people or organizations aren't really known: "we really can't get a picture of what's going on".

So, they started thinking about various supercomputer systems they have access to: "Jaguar" at Oak Ridge which has 180,000 cores in 30,000 nodes, "Thunderbird" at Sandia with 20,000 cores and 5,000 nodes, and "a lot of little 10,000 core systems out there". All of them run Linux, so they started to think about running "the real thing"—a botnet with ten million systems. By using these supercomputers and virtualization, they believe they could actually run a botnet.

Objections

Minnich noted that there have been two main objections to this idea. The first is that the original botnet authors didn't need a supercomputer, so why should one be needed to study them? He said that much of the research for the Storm botnet was done by academics (Kademlia) and by the companies that built the Overnet. "When they went to scale up, they just went to the Internet". Before the RIAA takedown, the network was run legally on the Internet, and after that "it was done by deception".

The Internet is known to have "at least dozens of nodes", really, "dozens of millions of nodes", and the Internet was the supercomputer that was used to develop these botnets, he said. Sandia can't really use the Internet that way for its research, so they will use their in-house supercomputers instead.

The second objection is that "you just can't simulate it". But Minnich pointed out that every system suffers from the same problem—people don't believe it can be simulated—yet simulation is used very successfully. They believe that they can simulate a botnet this way, and "until we try, we really won't know". In addition, researchers of the Storm botnet called virtualization the "holy grail" that allowed them to learn a lot about the botnet.

Why ten million?

There are multiple attacks that we cannot visualize on a large scale, including denial of service, exfiltration of data, botnets, and virus transmission, because we are "looking at one tiny corner of the elephant and trying to figure out what the elephant looks like", he said. Predicting this kind of behavior can't be done by running 1000 or so nodes, so a more detailed simulation is required. Botnets exhibit "emergent behavior", and pulling them apart or running them at smaller scales does not work.

For example, the topology of the Kademlia distributed hash network falls apart if there aren't enough (roughly 50,000) nodes in the network. The botnet nodes are designed to stop communicating if they are disconnected too long. One researcher would hook up a PC at home to capture the Storm botnet client, then bring it into work and hook it up to the research botnet immediately because if it doesn't get connected to something quickly, "it just dies". And if you don't have enough connections, the botnet dies: "It's kind of like a living organism".

So, they want to run ten million nodes, including routers, in a "nation-scale" network. Since they can't afford to buy that many machines, they will use virtualization on the supercomputer nodes to scale up to that size. They can "multiply the size of those machines by a thousand" by running that many virtual machines on each node.

Using virtualization and clustering

Virtualization is a nearly 50-year-old technique to run multiple kernels in virtual machines (VMs) on a single machine. It was pioneered by IBM, but has come to Linux in the last five years or so. Linux still doesn't have all of the capabilities that IBM machines have, in particular, arbitrarily deep nesting of VMs: "IBM has forgotten more about VMs than we know". But, Linux virtualization will allow them to run ten million nodes on a cluster of several thousand nodes, he said.

The project is tentatively called "V-matic" and they hope to release the code at the SC10 conference in November. It consists of the OneSIS cluster management software that has been extended based on what Minnich learned from the Los Alamos Clustermatic system. OneSIS is based on having NFS-mounted root filesystems, but V-matic uses lightweight RAMdisk-based nodes.

When you want to run programs on each node, you collect the binaries and libraries and send them to each node. Instead of doing that iteratively, they used something called "treespawn", which sends the binary bundle to 32 nodes at once; each of those then sends it on to 32 more. In that way, they could bring up a 16MB image on 1000 nodes in three seconds. The NFS root "couldn't come close" to that performance.
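
As a rough illustration of why a 32-way tree beats a serial copy, here is a hypothetical Python sketch; the fan-out of 32 matches the talk, but the planning helper and node list are assumptions, not the actual treespawn code:

    # Hypothetical sketch of a treespawn-style fan-out (not the Sandia tool).
    import math

    FANOUT = 32   # each node that has the bundle pushes it to up to 32 more

    def plan_tree(nodes):
        # Return the (parent, child) copy operations forming a 32-ary
        # distribution tree; in a real run all copies at a given level
        # would happen in parallel.
        ops = []
        frontier = [nodes[0]]          # the head node already has the bundle
        remaining = list(nodes[1:])
        while remaining:
            next_frontier = []
            for parent in frontier:
                children, remaining = remaining[:FANOUT], remaining[FANOUT:]
                ops.extend((parent, child) for child in children)
                next_frontier.extend(children)
            frontier = next_frontier
        return ops

    # 1000 nodes need only ceil(log32(999)) = 2 rounds of parallel copies,
    # consistent with pushing a 16MB image to the whole cluster in seconds.
    print(math.ceil(math.log(999, FANOUT)))   # -> 2

The depth of the tree grows only logarithmically with the number of nodes, while an NFS root funnels every node through the same server.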

Each node requires a 20MB footprint, which means "50 nodes per gigabyte". So, a laptop is just fine for a 100-node cluster, which is something that Minnich routinely runs for development. "This VM stuff for Linux is just fantastic", he said. Other cluster solutions just can't compete because of their size.

For running on the Thunderbird cluster, which consists of nodes that are roughly five years old, they were easily able to get 250 VMs per node. They used Lguest virtualization because the Thunderbird nodes were "so old they didn't have hardware virtualization". For more modern clusters, they can easily get 1000 VMs per node using KVM. Since they have 10,000 node Cray XT4 clusters at Sandia, they are confident they can get to ten million nodes.
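
The arithmetic behind those claims is straightforward; this back-of-the-envelope sketch simply restates the figures quoted in the talk (nothing here is a new measurement):

    # Back-of-the-envelope numbers, restating the figures quoted in the talk.
    footprint_mb = 20                  # approximate RAM footprint per lightweight VM
    print(1024 // footprint_mb, "VMs per GB")   # ~50, so 100 VMs fit on a laptop

    thunderbird = 5_000 * 250          # 5,000 older nodes at 250 Lguest VMs each
    modern      = 10_000 * 1_000       # a 10,000-node cluster at 1,000 KVM VMs each
    print(thunderbird, modern)         # ~1.25 million vs. 10 million virtual nodes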

Results so far

So far, they have gotten to one-million-node systems on Thunderbird. They had one good success and some failures in those tests. The failures were caused by two things: InfiniBand, which is not very happy about being rebooted all the time, and the BIOS on the Dell boxes, which uses the Intelligent Platform Management Interface (IPMI); Minnich does not think very highly of IPMI. In fact, he has a joke about how to tell when a standard "sucks": if it starts with an "I" (I2O), ends with an "I" (ACPI, EFI), or has the word "intelligent" in it somewhere; IPMI goes three-for-three on that scale. So "we know we can do it", but it's hard, and not for very good reasons, but for "a lot of silly reasons".

Scaling issues

Some of the big problems that you run into when trying to run a nation-scale network are the scaling issues themselves. How do you efficiently start programs on hundreds of thousands of nodes? How do you monitor millions of VMs? There are tools to do all of that "but all of the tools we have will break—actually we've already broken them all". Even the monitoring rate needs to be adjusted for the size of the network. Minnich is used to monitoring cluster nodes at 6Hz, but most big cluster nodes are monitored every ten minutes or 1/600Hz—otherwise the amount of data is just too overwhelming.

Once the system is up, and is being monitored, then they want to attack it. It's pretty easy to get malware, he said, as "you are probably already running it". If not, it is almost certainly all over your corporate network, so "just connect to the network and you've probably got it".

Trying to monitor the network for "bad" behavior is also somewhat difficult. Statistically separating bad behavior from normal behavior is a non-trivial problem. Probing the networking stack may be required, but must be done carefully to avoid "the firehose of data".

In a ten-million-node network, a DHCP configuration file is at least 350MB, even after you get rid of the colons "because they take up space", and parsing the /etc/hosts file can dominate startup time. If all of the nodes can talk to all of the other nodes, the kernel tables eat all of memory; "that's bad". Unlike many of the other tools, DNS is designed for this "large world", and they will need to set that up, along with the BGP routing protocol, so that the network will scale.
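
The configuration-size numbers are easy to reproduce with a rough estimate; the record layouts below are assumptions, purely for illustration:

    # Rough size estimates for naming ten million nodes (assumed record layouts).
    nodes = 10_000_000

    # An /etc/hosts-style line such as "10.12.34.56 vnode1234567\n" is ~30 bytes.
    print(nodes * 30 / 1e6, "MB of /etc/hosts to parse at boot")      # ~300MB

    # A DHCP host entry also carries a MAC address; the five colons in something
    # like "00:16:3e:aa:bb:cc" cost five bytes per node all by themselves.
    print(nodes * 5 / 1e6, "MB saved just by dropping the colons")    # ~50MB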

Earlier experiments

In an earlier experiment, on a 50,000 node network, Minnich modeled the Morris worm and learned some interesting things. Global knowledge doesn't really scale, so thinking in terms of things like /etc/hosts and DHCP configuration is not going to work; self-configuration is required. Unlike the supercomputer world, you can't expect all of the nodes to always be up, nor can you really even know if they are. Monitoring data can easily get too large. For example, 1Hz monitoring of 10 million nodes results in 1.2MB per second of data if each node only reports a single bit—and more than one bit is usually desired.
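
Those monitoring rates are simple arithmetic as well; a quick sketch restating the figures from the talk (the one-byte-per-node case is an assumption, added to show how quickly the firehose grows):

    # Monitoring data rates for ten million nodes.
    nodes = 10_000_000
    print(nodes * 1 / 8 / 1e6, "MB/s at 1Hz, one bit per node")   # 1.25, the ~1.2MB/s quoted
    print(nodes * 1 * 6 / 1e6, "MB/s at 6Hz, one byte per node")  # 60MB/s: the firehose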

There is so much we don't know about a ten million node network, Minnich said. He would like to try to do a TCP-based denial of service from 10,000 nodes against the other 9,990,000. He has no idea whether it would work, but it is just the kind of experiment that this system will be able to run.

For a demonstration at SC09, they created a prototype botnet ("sandbot") using 8000 nodes and some very simple rules, somewhat reminiscent of Conway's Game of Life. Based on the rules, the nodes would communicate with their neighbors under certain circumstances and, once they had heard from their neighbors enough times, would "tumble", resetting their state to zero. The nodes were laid out on a grid whose cells were colored based on the state of the node, so that pictures and animations could be made. Each node that tumbled would be colored red.

Once the size of the botnet got over a threshold somewhere between 1,000 and 10,000 nodes, the behavior became completely unpredictable. Cascades of tumbles, called "avalanches", would occur with some frequency, and occasionally the entire grid turned red. Looking at the statistical features of how the avalanches occur may be useful in detecting malware in the wild.
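
The rules as described amount to a sandpile-style cellular automaton. Here is a minimal, hypothetical sketch of that kind of model, not the actual sandbot code; the grid size, the threshold of four messages, and the open boundary are all assumptions:

    # Sandpile-style sketch of the "tumble"/avalanche behavior described above.
    # Grid size, threshold, and boundary handling are assumptions, not sandbot.
    import random

    SIZE, THRESHOLD = 50, 4      # assumed 50x50 grid; tumble after four messages

    def step(grid):
        # Deliver one message to a random node and propagate resulting tumbles.
        tumbled = set()
        queue = [(random.randrange(SIZE), random.randrange(SIZE))]
        while queue:
            x, y = queue.pop()
            grid[y][x] += 1
            if grid[y][x] >= THRESHOLD:       # heard from neighbors enough times
                grid[y][x] = 0                # "tumble": reset state to zero
                tumbled.add((x, y))           # these are the cells drawn red
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < SIZE and 0 <= ny < SIZE:
                        queue.append((nx, ny))    # messages off the edge are lost
        return len(tumbled)                   # the size of this avalanche

    grid = [[random.randrange(THRESHOLD) for _ in range(SIZE)] for _ in range(SIZE)]
    sizes = [step(grid) for _ in range(20_000)]
    print(max(sizes), sum(sizes) / len(sizes))

Even this toy version produces a heavy-tailed spread of avalanche sizes once the grid is loaded enough, which is the sort of statistical signature the talk suggests might eventually help spot botnet activity in the wild.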

Conclusion

There is still lots of work to be done, he said, but they are making progress. It will be interesting to see what kind of practical results come from this research. Minnich and his colleagues have already learned a great deal about trying to run a nation-scale network, but there are undoubtedly many lessons on botnets and malware waiting to be found. We can look forward to hearing about them over the next few years.

Comments (14 posted)

Brief items

Microsoft's Charney Suggests 'Net Tax to Clean Computers (PCWorld)

PCWorld reports on a speech given by Microsoft's Vice President for Trustworthy Computing, Scott Charney, at the RSA security conference in San Francisco. In it, he suggests that a tax of some sort might be just the way to pay for cleaning up systems that are infected with viruses and other malware. "So who would foot the bill? 'Maybe markets will make it work,' Charney said. But an Internet usage tax might be the way to go. 'You could say it's a public safety issue and do it with general taxation,' he said."

Comments (55 posted)

'Severe' OpenSSL vuln busts public key crypto (Register)

The Register has posted an article on a reported OpenSSL vulnerability that allows attackers to obtain a system's private key. Before hitting the panic button, though, it's worth seeing what's involved in carrying out this attack: "The university scientists found that they could deduce tiny pieces of a private key by injecting slight fluctuations in a device's power supply as it was processing encrypted messages. In a little more than 100 hours, they fed the device enough 'transient faults' that they were able to assemble the entirety of its 1024-bit key." It could be a problem for keys hidden in embedded systems, but that is probably about the extent of it.

Comments (22 posted)

Security reports

IETF draft - "Security Assessment of the Internet Protocol"

A draft security assessment of IP, which may one day become an Internet Engineering Task Force (IETF) RFC, has been announced. "This document is the result of an assessment of the IETF specifications of the Internet Protocol (IP), from a security point of view. Possible threats were identified and, where possible, countermeasures were proposed. Additionally, many implementation flaws that have led to security vulnerabilities have been referenced in the hope that future implementations will not incur the same problems. Furthermore, this document does not limit itself to performing a security assessment of the relevant IETF specifications, but also provides an assessment of common implementation strategies found in the real world."

Comments (2 posted)

New vulnerabilities

apache: information leak

Package(s):apache CVE #(s):CVE-2010-0434
Created:March 8, 2010 Updated:April 12, 2011
Description: From the Mandriva advisory:

The ap_read_request function in server/protocol.c in the Apache HTTP Server 2.2.x before 2.2.15, when a multithreaded MPM is used, does not properly handle headers in subrequests in certain circumstances involving a parent request that has a body, which might allow remote attackers to obtain sensitive information via a crafted request that triggers access to memory locations associated with an earlier request.

Alerts:
Gentoo 201206-25 apache 2012-06-24
rPath rPSA-2011-0014-1 httpd 2011-04-11
rPath rPSA-2010-0056-1 httpd 2010-09-13
Fedora FEDORA-2010-6055 httpd 2010-04-09
Fedora FEDORA-2010-6131 httpd 2010-04-09
SuSE SUSE-SR:2010:010 krb5, clamav, systemtap, apache2, glib2, mediawiki, apache 2010-04-27
Debian DSA-2035-1 apache2 2010-04-17
Pardus 2010-45 apache-2.2.15-36-11 apache-2.2.15-34-12 2010-03-29
CentOS CESA-2010:0175 httpd 2010-03-28
CentOS CESA-2010:0168 httpd 2010-03-28
Red Hat RHSA-2010:0168-01 httpd 2010-03-25
Red Hat RHSA-2010:0175-01 httpd 2010-03-25
Ubuntu USN-908-1 apache2 2010-03-10
Mandriva MDVSA-2010:057 apache 2010-03-06

Comments (none posted)

apache: remote attack via orphaned callback pointers

Package(s):httpd CVE #(s):CVE-2010-0425
Created:March 9, 2010 Updated:March 30, 2010
Description: From the CVE entry:

modules/arch/win32/mod_isapi.c in mod_isapi in the Apache HTTP Server 2.3.x before 2.3.7 on Windows does not ensure that request processing is complete before calling isapi_unload for an ISAPI .dll module, which has unspecified impact and remote attack vectors related to "orphaned callback pointers."

Alerts:
Pardus 2010-45 apache-2.2.15-36-11 apache-2.2.15-34-12 2010-03-29
Slackware SSA:2010-067-01 httpd 2010-03-09

Comments (none posted)

argyllcms: udev rules set incorrect tty permissions

Package(s):argyllcms CVE #(s):
Created:March 4, 2010 Updated:March 10, 2010
Description:

From the Red Hat bugzilla entry:

From /lib/udev/rules.d/55-Argyll.rules, which is part of argyllcms-1.0.4-4.fc13.x86_64:

 # Enable serial port connected instruments connected on first two ports.
 KERNEL=="ttyS[01]", MODE="666"

 # Enable serial port connected instruments on USB serial converteds connected
 # on  first two ports.
 KERNEL=="ttyUSB[01]", MODE="666"

This gives world read/write access to these tty devices.
Alerts:
Fedora FEDORA-2010-3587 argyllcms 2010-03-03

Comments (none posted)

bournal: multiple vulnerabilities

Package(s):bournal CVE #(s):CVE-2010-0118 CVE-2010-0119
Created:March 9, 2010 Updated:March 10, 2010
Description: From the Red Hat bugzilla:

Bournal before 1.4.1 allows local users to overwrite arbitrary files via a symlink attack on unspecified temporary files associated with a --hack_the_gibson update check. CVE-2010-0118

Bournal before 1.4.1 on FreeBSD 8.0, when the -K option is used, places a ccrypt key on the command line, which allows local users to obtain sensitive information by listing the process and its arguments, related to "echoing." CVE-2010-0119

Alerts:
Fedora FEDORA-2010-3301 bournal 2010-03-02
Fedora FEDORA-2010-3221 bournal 2010-03-02
Fedora FEDORA-2010-3168 bournal 2010-03-01

Comments (none posted)

cups: arbitrary code execution

Package(s):cups CVE #(s):CVE-2010-0393
Created:March 4, 2010 Updated:April 20, 2010
Description:

From the Debian advisory:

Ronald Volgers discovered that the lppasswd component of the cups suite, the Common UNIX Printing System, is vulnerable to format string attacks due to insecure use of the LOCALEDIR environment variable. An attacker can abuse this behaviour to execute arbitrary code via crafted localization files and triggering calls to _cupsLangprintf(). This works as the lppasswd binary happens to be installed with setuid 0 permissions.

Alerts:
Gentoo 201207-10 cups 2012-07-09
Pardus 2010-54 cups 2010-04-20
Mandriva MDVSA-2010:073-1 cups 2010-04-14
Mandriva MDVSA-2010:073 cups 2010-04-14
Mandriva MDVSA-2010:072 cups 2010-04-14
Pardus 2010-49 cups 2010-04-09
SuSE SUSE-SR:2010:007 cifs-mount/samba, compiz-fusion-plugins-main, cron, cups, ethereal/wireshark, krb5, mysql, pulseaudio, squid/squid3, viewvc 2010-03-30
Ubuntu USN-906-1 cups, cupsys 2010-03-03
Debian DSA-2007-1 cups 2010-03-03

Comments (none posted)

cups: denial of service

Package(s):cups CVE #(s):CVE-2010-0302
Created:March 4, 2010 Updated:April 14, 2010
Description:

From the Red Hat advisory:

It was discovered that the Red Hat Security Advisory RHSA-2009:1595 did not fully correct the use-after-free flaw in the way CUPS handled references in its file descriptors-handling interface. A remote attacker could send specially-crafted queries to the CUPS server, causing it to crash. (CVE-2010-0302)

Alerts:
Gentoo 201207-10 cups 2012-07-09
Mandriva MDVSA-2010:073-1 cups 2010-04-14
Mandriva MDVSA-2010:073 cups 2010-04-14
SuSE SUSE-SR:2010:007 cifs-mount/samba, compiz-fusion-plugins-main, cron, cups, ethereal/wireshark, krb5, mysql, pulseaudio, squid/squid3, viewvc 2010-03-30
Fedora FEDORA-2010-2743 cups 2010-02-24
CentOS CESA-2010:0129 cups 2010-03-12
Fedora FEDORA-2010-3761 cups 2010-03-06
Ubuntu USN-906-1 cups, cupsys 2010-03-03
Red Hat RHSA-2010:0129-01 cups 2010-03-03

Comments (none posted)

curl: arbitrary code execution

Package(s):curl CVE #(s):
Created:March 9, 2010 Updated:March 15, 2010
Description: From the Red Hat bugzilla:

A stack based buffer overflow flaw was found in the way libcurl used to uncompress zlib compressed data. If an application, using libcurl, was downloading compressed content over HTTP and asked libcurl to automatically uncompress data, it might lead to denial of service (application crash) or, potentially, to arbitrary code execution with the privileges of that application.

Alerts:
Fedora FEDORA-2010-2720 curl 2010-02-24
Fedora FEDORA-2010-2762 curl 2010-02-24

Comments (none posted)

drupal: multiple vulnerabilities

Package(s):drupal CVE #(s):
Created:March 8, 2010 Updated:March 10, 2010
Description: Multiple vulnerabilities and weaknesses were discovered in Drupal. See the Drupal advisory for more information.
Alerts:
Fedora FEDORA-2010-3739 drupal 2010-03-06
Fedora FEDORA-2010-3787 drupal 2010-03-06

Comments (none posted)

php: multiple vulnerabilities

Package(s):php CVE #(s):
Created:March 10, 2010 Updated:March 30, 2010
Description:

From the Mandriva advisory:

Multiple vulnerabilities has been found and corrected in php:

  • Improved LCG entropy. (Rasmus, Samy Kamkar)
  • Fixed safe_mode validation inside tempnam() when the directory path does not end with a /). (Martin Jansen)
  • Fixed a possible open_basedir/safe_mode bypass in the session extension identified by Grzegorz Stachowiak. (Ilia)
Alerts:
Fedora FEDORA-2010-4114 maniadrive 2010-03-11
Fedora FEDORA-2010-4114 php 2010-03-11
Fedora FEDORA-2010-4212 maniadrive 2010-03-11
Fedora FEDORA-2010-4212 php 2010-03-11
Mandriva MDVSA-2010:058 php 2010-03-09

Comments (none posted)

samba: access restriction bypass

Package(s):samba CVE #(s):CVE-2010-0728
Created:March 10, 2010 Updated:March 11, 2010
Description:

From the Samba advisory:

This flaw caused all smbd processes to inherit CAP_DAC_OVERRIDE capabilities, allowing all file system access to be allowed even when permissions should have denied access.

Alerts:
Gentoo 201206-22 samba 2012-06-24
Fedora FEDORA-2010-3999 samba 2010-03-10
Fedora FEDORA-2010-4050 samba 2010-03-10

Comments (none posted)

tdiary: cross-site scripting

Package(s):tdiary CVE #(s):CVE-2010-0726
Created:March 10, 2010 Updated:March 10, 2010
Description:

From the Debian advisory:

It was discovered that tdiary, a communication-friendly weblog system, is prone to a cross-site scripting vulnerability due to insufficient input sanitising in the TrackBack transmission plugin.

Alerts:
Debian DSA-2009-1 tdiary 2010-03-09

Comments (none posted)

typo3-src: multiple vulnerabilities

Package(s):typo3-src CVE #(s):
Created:March 9, 2010 Updated:September 8, 2010
Description: From the Debian advisory:

Several remote vulnerabilities have been discovered in the TYPO3 web content management framework: Cross-site scripting vulnerabilities have been discovered in both the frontend and the backend. Also, user data could be leaked.

Alerts:
Debian DSA-2098-2 typo3-src 2010-09-07
Debian DSA-2098-1 typo3-src 2010-08-29
Debian DSA-2008-1 typo3-src 2010-03-08

Comments (none posted)

Page editor: Jake Edge

