July 5, 2006
This article was contributed by John Richard Moser
Prelink (PDF) is a
popular tool used to decrease program load time, shortening system boot
time and making applications start faster. Developed by Jakub Jelinek at
Red Hat, prelink relocates libraries on disk to save dynamic linking time.
When the dynamic linker loads a dynamically linked ELF binary, it has to
also load and link all of the libraries before executing the program's
entry point, _start(). This process involves relocating
libraries—changing all addresses referenced in the library to reflect
the actual addresses in memory. Relocating libraries involves iterating
through each address in the library and replacing it with the real address
as determined by the library's location in the process's virtual address
space. Most relocations happen in the symbol table and PLT,
but in rare cases there are also .text relocations, which require
fixed-position executable code to be patched in a slightly slower process.
All of this work happens at startup and slows down an application's launch.
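As a rough illustration of what relocation means, the sketch below models a library image as a byte buffer linked at base address 0 and patches each pointer slot by adding the load address. It is a toy model only: the function name, the 12-byte image, and the slot offsets are made up for this example, and a real dynamic linker handles many relocation types rather than this single "add the base" case.

```python
import struct

def apply_relative_relocations(image: bytearray, reloc_offsets, load_base: int) -> None:
    """Add load_base to every 32-bit little-endian word listed in reloc_offsets."""
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<I", image, off)
        # Wrap to 32 bits, matching the i386 addresses used in this article.
        struct.pack_into("<I", image, off, (value + load_base) & 0xFFFFFFFF)

# Hypothetical 12-byte "library" linked at base 0:
# two pointer slots (offsets 0 and 4) followed by a plain integer (offset 8).
image = bytearray(struct.pack("<III", 0x154, 0x2A0, 7))

# Only the pointer slots are listed for relocation; the integer is left alone.
apply_relative_relocations(image, reloc_offsets=[0, 4], load_base=0x4DF2E000)

relocated = struct.unpack("<III", image)
print([hex(v) for v in relocated])
```

The loop runs once per relocation entry, which is why a library with many entries costs more to load than one with few.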
In order to speed up the process, prelink relocates the libraries ahead of
time. This is done by scanning every
executable to be prelinked, generating a graph of libraries that will be
loaded at the same time as other libraries, and then calculating target
addresses for each library such that it will never be loaded at the same address
as other libraries. These offsets are then stored in the shared object
files themselves, and the symbol tables and segment addresses are all
adjusted to reflect addresses based on the chosen base address.
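The address assignment step can be sketched in a deliberately simplified form. The example below gives every library its own page-aligned, non-overlapping range by walking upward from a starting address. This is not prelink's actual algorithm, which only needs to keep apart libraries that can be loaded together; the starting address, the library sizes, and the function names are all assumptions chosen for illustration.

```python
PAGE = 0x1000  # assumed 4 KiB page size

def align_up(value: int, alignment: int = PAGE) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def assign_bases(lib_sizes: dict, start: int = 0x40000000) -> dict:
    """Give each library a page-aligned base so that no two ranges overlap."""
    bases = {}
    cursor = align_up(start)
    for name in sorted(lib_sizes):  # sorted for a deterministic layout
        bases[name] = cursor
        cursor = align_up(cursor + lib_sizes[name])
    return bases

# Hypothetical library sizes in bytes.
sizes = {"libc.so.6": 0x125000, "libm.so.6": 0x23000, "libdl.so.2": 0x2100}
bases = assign_bases(sizes)

for name, base in bases.items():
    print(f"{name}: {base:#010x}-{base + sizes[name]:#010x}")
```

Once bases like these are chosen, every address inside each library can be rewritten relative to its base ahead of time, which is the work the dynamic linker would otherwise do at every program start.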
Once prelink has done its job, the dynamic linker no longer has to concern
itself with relocation. Libraries are loaded at the address specified in
the library header and the symbol table is already correct. If anything
forces the library to be loaded at a different address, then the library is
relocated appropriately as usual; otherwise we can say goodbye to the
load-time overhead of relocating libraries.
Kernel facilities supplying address space layout randomization for
libraries cannot be used in conjunction with prelink; to do so would
require relocating the libraries, defeating the purpose of prelinking.
Address space randomization is a core feature of secure systems such as
OpenBSD, Adamantix, Hardened Gentoo, Fedora Core, and Red Hat Enterprise
Linux. It has appeared as part of PaX as well as part of Ingo Molnar's
Exec Shield, and has been accepted into the mainline kernel as
of 2.6.12 after submission by Arjan van de Ven.
The simple purpose of address space randomization is to make it more
difficult to perform certain classes of attacks by changing where
in memory important segments for the attack are loaded. If an attacker
wants to execute injected shell code or manipulate the program to execute
out of order, he obviously has to know where that code is. By shuffling
memory segments around, these attacks become quite difficult; the chances of
successful attack are mathematically described in the PaX documentation
and Wikipedia.
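To give a feel for the numbers, here is a simplified probability model; it is a sketch under stated assumptions, not the formulas from the PaX documentation. It assumes the attacker must guess one of 2^n equally likely addresses, where n is the number of random bits, and it distinguishes a target whose layout is re-randomized between attempts from one whose layout stays fixed. The 16-bit figure is only an illustrative value.

```python
def success_rerandomized(entropy_bits: int, attempts: int) -> float:
    """Layout changes between attempts, so every guess is an independent trial."""
    p = 1.0 / (2 ** entropy_bits)
    return 1.0 - (1.0 - p) ** attempts

def success_fixed(entropy_bits: int, attempts: int) -> float:
    """Layout stays the same, so the attacker can rule out each wrong guess."""
    return min(1.0, attempts / (2 ** entropy_bits))

bits = 16  # illustrative amount of randomness, not a measured value
for attempts in (1, 1024, 2 ** bits):
    print(attempts,
          round(success_rerandomized(bits, attempts), 4),
          round(success_fixed(bits, attempts), 4))
```

With zero bits of randomness, both functions return 1.0 for a single attempt: an attacker who already knows the address does not need to guess at all.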
In an attempt to restore some of the benefits of address space
randomization, prelink is capable of randomly selecting
the addresses used for prelinking. This makes it more difficult to perform
certain attacks on a system, because the addresses used are unique to that
system. This approach is, however, less effective than per-process
randomization because the addresses stay constant until prelink is run
again.
There is another implication that has to be examined with prelink. To
understand this implication, let us first review a feature of prelink by
examining the load address of the C standard library in two processes: a
user-owned 'cat' and a root-owned 'bash'. The C standard library is
interesting because, in practice, virtually all return-to-libc
attacks utilize it exclusively.
user@icebox:~$ cat /proc/self/maps | grep libc | grep r-xp
4df2e000-4e053000 r-xp 00000000 08:07 81197 /lib/tls/i686/cmov/libc-2.3.6.so
user@icebox:~$ sudo -s
root@icebox:/home/user# cat /proc/$$/maps | grep libc | grep r-xp
4df2e000-4e053000 r-xp 00000000 08:07 81197 /lib/tls/i686/cmov/libc-2.3.6.so
Comparing the two listings shows that the address of glibc's
executable code is the same in both processes; this is consistent
with the behavior of prelink. Because the library itself is relocated
ahead of time, there is a preference for the dynamic linker to load it at
that address. Examining libc itself yields the following:
user@icebox:~$ readelf -S /lib/tls/i686/cmov/libc-2.3.6.so | head -n 6
There are 64 section headers, starting at offset 0x12d114:
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .note.ABI-tag     NOTE            4df2e154 000154 000020 00   A  0   0  4
Subtracting the file offset from the address of any given non-NULL
section (here 0x4df2e154 - 0x154) yields 0x4df2e000, the base address of libc. This makes
sense; prelink rewrites the segment and symbol addresses for the library
based on a specific load address, and the dynamic linker loads the library
at that address to avoid relocating it. Further, any program that links
with libc has to be able to read libc, and will thus be able to derive the
same information.
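The subtraction above is easy to script. The following sketch parses one row of `readelf -S` output in the 32-bit layout shown in this article and returns the address minus the file offset. The function name and the regular expression are assumptions made for this example, and the parsing is not meant to cover every readelf output format.

```python
import re

# Matches "[Nr] Name Type Addr Off ..." rows as printed by readelf -S on i386.
ROW = re.compile(r"\[\s*(\d+)\]\s+(\S+)\s+(\S+)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)")

def section_base(row: str):
    """Return Addr - Off for a section row, or None for the NULL row or a non-match."""
    match = ROW.search(row)
    if match is None or int(match.group(1)) == 0:
        return None  # section 0 is the NULL section and carries no address
    addr = int(match.group(4), 16)
    off = int(match.group(5), 16)
    return addr - off

# The .note.ABI-tag row from the listing in this article.
row = "[ 1] .note.ABI-tag NOTE 4df2e154 000154 000020 00 A 0 0 4"
base = section_base(row)
print(hex(base))
```

Nothing here requires special privileges; read access to the library file is enough.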
All of this means that any program on the system using any prelinked
library will be able to leak information about higher privileged tasks
using the same library. This allows any attacker able to gain any form of
local access—or more directly any ability to read libc—to gain
information about the address space layout of higher privileged processes,
including the load address of libc. As we know, this information is
extremely valuable to an attacker wanting to exploit a privileged process
without brute forcing library load addresses.
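The comparison made earlier with cat and grep can also be expressed as a small parser. The sketch below extracts the start address of libc's executable mapping from text in the /proc/PID/maps format and applies it to the two listings shown in this article. The function name is hypothetical, and the pattern for recognizing libc by file name is an assumption that may need adjusting on other systems.

```python
import re

def libc_text_base(maps_text: str):
    """Return the start address of the r-xp libc mapping, or None if absent."""
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # anonymous mappings have no path column
        addr_range, perms, path = fields[0], fields[1], fields[5]
        name = path.rsplit("/", 1)[-1]
        # Accept libc-2.3.6.so or libc.so.6, but not libcrypt, libcap, etc.
        if perms == "r-xp" and re.match(r"libc[-.]", name):
            start, _end = addr_range.split("-")
            return int(start, 16)
    return None

# The two lines printed in this article: the user's cat and root's bash.
user_maps = "4df2e000-4e053000 r-xp 00000000 08:07 81197 /lib/tls/i686/cmov/libc-2.3.6.so"
root_maps = "4df2e000-4e053000 r-xp 00000000 08:07 81197 /lib/tls/i686/cmov/libc-2.3.6.so"

print(hex(libc_text_base(user_maps)), hex(libc_text_base(root_maps)))
# On a Linux system, a process can inspect itself with:
#   libc_text_base(open("/proc/self/maps").read())
```

On a prelinked system the value a process computes for itself is also the value used by every other process sharing that copy of the library, whatever its privilege level.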
This vulnerability only applies to attackers with local access; but this is
not an unreasonable requirement. Many web hosting companies give local
shell access or allow PHP; either of these can be used to remotely fetch a
copy of libc. Due to the nature of the dynamic linker and sane security
design, the dynamic linker is exactly as privileged as the process it is
starting; therefore, even the most stringent mandatory access policies on
systems such as SELinux, grsecurity, or AppArmor cannot prevent this
attack.
Besides avoiding prelinking, there is one other way to prevent this information
leak from being exploited. All processes linked to a prelinked library
need access to the library file and load that library at the same address;
the point of exposure is the use of the same copy of the library. In order
to prevent information leaking, then, you must have a separate copy of each
library common between any two programs you don't want to leak information
about each other. This can be done with Xen, chroot jails, UML, or simply
isolated machines, as long as the directory hierarchies are individually
prelinked with prelink randomization. Each system will have a different
set of addresses from every other system in this scheme. This of course
requires more hardware, more disk space, more management, more memory, and
more work.
The direct implications of this information leak depend on your exact
security concerns. A web hosting company, for example, may not want to run
prelink on its servers, given the risk of effectively losing
the benefit of address space randomization. A home desktop, on the other
hand, may only have to worry about a trojan using the information leak to
stage an attack on a system service such as cups or dbus—and should
probably worry about /proc/PID/maps first. While these are both
essentially the concern of an attacker with local access, the likelihood of
attack and the value of potential damages are different.
The prelink tool gives a useful decrease in program load time, and can help
users reach their desktop and the programs they need to run more quickly.
It does, however, have some unfortunate repercussions that must be examined,
especially in security-sensitive environments relying on address space
randomization.