January 3, 2013
This article was contributed by Daniel Pierre Bovet
A section is an area in an object file that contains information which is useful
for linking: the program's code and data,
relocation information, and more.
It turns out that the Linux kernel has some additional types of sections, called
"special sections", that are used to implement various kernel
features. Special sections
aren't well known, so it is worth shedding some light on the topic.
Segments and sections
Although Linux supports several binary file formats, ELF (Executable and
Linking Format) is the preferred format since it is flexible and extensible
by design, and it is not bound to any particular processor or architecture.
ELF binary files consist of an ELF header followed by a few
segments. Each segment, in turn, includes one or more sections. The
length of each segment and of each section is recorded in the program
header table and the section header table, respectively; the ELF header
specifies where those tables are found. Most segments, and thus most
sections, also have an initial address which is recorded in those tables.
In addition, each segment has its own access rights.
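As a rough illustration of what the ELF header itself records, the sketch below pulls the file type and the segment and section counts out of a 64-bit ELF image held in memory. The elf_summary structure and the elf64_summarize() function are hypothetical names made up for this example; the field and constant names come from the <elf.h> header shipped with the GNU C library. The sketch assumes that the image uses the host's byte order.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of a few ELF header fields. */
struct elf_summary {
    unsigned type;      /* e_type: ET_REL, ET_EXEC, ET_DYN, ... */
    unsigned segments;  /* e_phnum: number of program headers */
    unsigned sections;  /* e_shnum: number of section headers */
};

/*
 * Fill *out from the first bytes of an ELF image held in memory.
 * Returns 0 on success, -1 if the buffer is too small or does not
 * start with a 64-bit ELF header. Host byte order is assumed.
 */
int elf64_summarize(const unsigned char *buf, size_t len,
                    struct elf_summary *out)
{
    Elf64_Ehdr hdr;

    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr));  /* avoids unaligned accesses */

    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                   /* missing "\177ELF" magic */
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                   /* not a 64-bit image */

    out->type = hdr.e_type;
    out->segments = hdr.e_phnum;
    out->sections = hdr.e_shnum;
    return 0;
}
```

Feeding this function the first bytes of an object file produced by the assembler should report zero segments, since relocatable files contain only sections.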
The linker merges together all sections of the same type included in the input
object files into a single section and assigns an initial address to it. For
instance, the .text sections of all object files are merged together
into a single .text section, which by default contains all of the
code in the
program.
Some of the segments defined in an ELF binary file are used by the GNU
loader to assign memory regions with specific access rights to the process.
Executable files include four canonical sections called, by convention,
.text, .data, .rodata, and .bss. The
.text section contains executable code and is packed into a segment
which has the read and execute access rights. The .data and
.bss sections contain initialized and uninitialized data
respectively, and are packed into a segment which has the read and write access
rights.
Linux loads the .text section into memory only once, no matter how many
times an application is loaded. This reduces memory usage and launch time and is
safe because the code doesn't change. For that reason, the
.rodata section, which contains read-only initialized data, is
packed into the
same segment that contains the .text section.
The .data section contains information that could
be changed during application execution, so this section must be copied for
every instance.
The "readelf -S"
command lists the sections included in an executable file, while the
"readelf -l" command lists the segments included in an
executable file.
Defining a section
Where are the sections declared? If you look at a standard C program you won't
find any reference to a section. However, if you look at the assembly version of
the C program you will find several assembly directives that define the beginning of a
section.
More precisely, the ".text", ".data", and
".section .rodata" directives identify the beginning of three of the four canonical
sections mentioned previously, while the ".comm" directive defines an area
of uninitialized data.
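To see these directives for yourself, a small source file such as the one below can be translated with "gcc -S" and the resulting assembly file inspected. The variable and function names are made up for this example, and the comments describe what GCC typically emits on Linux; the exact directives depend on the compiler version and options.

```c
/* Initialized, writable global: emitted under the ".data" directive. */
int boot_count = 3;

/*
 * Uninitialized global: it takes no space in the object file.
 * Depending on the compiler version and options, GCC emits either a
 * ".comm" directive or an entry in the .bss section for it.
 */
int scratch[64];

/* Read-only initialized data: emitted under ".section .rodata". */
const char banner[] = "hello";

/* Executable code: emitted under the ".text" directive. */
int next_boot(void)
{
    scratch[0] = boot_count;
    return ++boot_count;
}
```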
The GNU C compiler translates a source file into the equivalent assembly
language file.
The next step is carried out by the GNU assembler, which produces an object
file.
This file is an ELF relocatable file which contains only sections
(segments which have absolute addresses cannot be defined in a relocatable
file). Sections are now filled, with the exception of the
.bss section, which just has a length associated with it.
The assembler scans the assembly lines, translates them into binary code, and
inserts the binary code into sections. Each section has
its own offset which tells the assembler where to insert the next byte. The
assembler acts on one section at a time, which is called the current
section.
In some cases,
for instance to allocate space to uninitialized global variables, the assembler
does not add bytes to the current section; it just increments the section's offset.
Each assembly language program is assembled separately; the assembler thus
assumes that the starting address of an object program is always 0.
The GNU linker receives as input a group of these object files and combines
them into
a single executable file. This kind of linkage is called static linkage
because it is performed before running the program.
The linker relies on a linker script to decide which address to assign
to each section of the executable file. To get the default script of your
system, you can issue the command:
ld --verbose
Special sections
If you compare the sections present in a simple executable file, say one
associated with helloworld.c, with those present in the Linux
kernel executable, you will notice that Linux relies on many special sections
not present in conventional executable files. The number of such sections
depends on the hardware platform. On an x86_64 system over 30
special sections are defined, while on an ARM system there are about ten.
You can use the "readelf -l" command to extract the segment layout from
vmlinux, which is the kernel executable. When issuing this command on an x86_64 box
you get something like:
Elf file type is EXEC (Executable file)
Entry point 0x1000000
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000200000 0xffffffff81000000 0x0000000001000000
                 0x00000000007a3000 0x00000000007a3000  R E    200000
  LOAD           0x0000000000a00000 0xffffffff81800000 0x0000000001800000
                 0x00000000000c7b40 0x00000000000c7b40  RW     200000
  LOAD           0x0000000000c00000 0xffffffffff600000 0x00000000018c8000
                 0x0000000000000d60 0x0000000000000d60  R E    200000
  LOAD           0x0000000000e00000 0x0000000000000000 0x00000000018c9000
                 0x0000000000010f40 0x0000000000010f40  RW     200000
  LOAD           0x0000000000eda000 0xffffffff818da000 0x00000000018da000
                 0x0000000000095000 0x0000000000163000  RWE    200000
  NOTE           0x0000000000713e08 0xffffffff81513e08 0x0000000001513e08
                 0x0000000000000024 0x0000000000000024         4

 Section to Segment mapping:
  Segment Sections...
   00     .text .notes __ex_table .rodata __bug_table .pci_fixup __ksymtab
          __ksymtab_gpl __ksymtab_strings __init_rodata __param __modver
   01     .data
   02     .vsyscall_0 .vsyscall_fn .vsyscall_1 .vsyscall_2 .vsyscall_var_jiffies
          .vsyscall_var_vgetcpu_mode .vsyscall_var_vsyscall_gtod_data
   03     .data..percpu
   04     .init.text .init.data .x86_trampoline .x86_cpu_dev.init .altinstructions
          .altinstr_replacement .iommu_table .apicdrivers .exit.text .smp_locks
          .data_nosave .bss .brk
   05     .notes
Defining a Linux special section
Special sections are defined in the Linux linker script, which is
a linker script distinct from the default linker script mentioned above. The
corresponding source file is
kernel/vmlinux.lds.S in the architecture-specific subtree. This file uses a set of
macros defined in the linux/include/asm-generic/vmlinux.lds.h header
file.
The linker script for the ARM hardware platform contains an easy-to-follow
definition of a special section:
. = ALIGN(4);
__start___ex_table = .;
*(__ex_table)
__stop___ex_table = .;
The
__ex_table special section is aligned to a multiple of four bytes.
Furthermore, the linker creates a pair of identifiers, namely
__start___ex_table and
__stop___ex_table, and sets their
addresses to the beginning and the end of
__ex_table. Linux
functions can use these identifiers to iterate through the bytes of
__ex_table. Those identifiers must be declared as
extern
because they
are defined in the linker script.
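The same pattern can be tried in an ordinary user-space program. The sketch below is not kernel code: the demo_entry structure, the DEMO_ENTRY() macro, and the demo_table section are names invented for this example. It also assumes an ELF toolchain with GNU ld (or a compatible linker), which provides __start_NAME and __stop_NAME symbols by itself when a section name is a valid C identifier, so no custom linker script is needed.

```c
#include <stddef.h>

/* Hypothetical table entry; real kernel tables use other layouts. */
struct demo_entry {
    const char *name;
    long value;
};

/*
 * Place one entry in the "demo_table" section. The "used" attribute
 * keeps the compiler from discarding an object that nothing refers
 * to by name.
 */
#define DEMO_ENTRY(n, v)                                        \
    static const struct demo_entry demo_entry_##n               \
    __attribute__((used, section("demo_table"))) = { #n, (v) }

DEMO_ENTRY(alpha, 1);
DEMO_ENTRY(beta, 2);
DEMO_ENTRY(gamma, 4);

/* Delimiters provided by the linker, hence declared as extern. */
extern const struct demo_entry __start_demo_table[];
extern const struct demo_entry __stop_demo_table[];

/* Number of entries that ended up in the section. */
size_t demo_count(void)
{
    return (size_t)(__stop_demo_table - __start_demo_table);
}

/* Walk the section from one delimiter to the other. */
long demo_sum(void)
{
    const struct demo_entry *e;
    long sum = 0;

    for (e = __start_demo_table; e < __stop_demo_table; e++)
        sum += e->value;
    return sum;
}
```

Walking the section as an array works here only because every object placed in it has the same type and the compiler adds no padding between the entries; the kernel takes explicit care of alignment for the same reason.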
Defining and using special sections can thus be summarized as
follows:
- Define the special section ".special" in the Linux linker
script together with the pair of identifiers that delimit it.
- Insert the .section .special assembly
directive into the Linux code to specify that all bytes up to the next
.section
assembly directive must be inserted in .special.
- Use the pair of identifiers to act on those bytes in the kernel.
This technique seems to apply to assembly code only. Luckily, the GNU C compiler
offers the non-standard attribute construct to create special sections.
The
__attribute__((__section__(".init.data")))
declaration, for instance, tells the compiler that the variable defined with that
attribute must be placed in the
.init.data section.
To make the code more readable, suitable macros are defined. The
__initdata macro, for instance, is defined as:
#define __initdata __attribute__((__section__(".init.data")))
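A user-space analogue gives a feel for what such a macro does. In the sketch below, DEMO_DATA, the demo_data section, and the variable names are all hypothetical; the delimiter symbols are assumed to be generated by GNU ld (or a compatible ELF linker), which does so when the section name is a valid C identifier. The in_demo_data() helper checks whether an address falls between the two delimiters.

```c
#include <stdint.h>

/* Hypothetical analogue of __initdata for a user-space program. */
#define DEMO_DATA __attribute__((__section__("demo_data")))

/* Both variables are placed in "demo_data" instead of .data. */
int demo_threshold DEMO_DATA = 42;
int demo_limit DEMO_DATA = 7;

/* Section delimiters, assumed to be generated by the linker. */
extern char __start_demo_data[];
extern char __stop_demo_data[];

/* Return 1 if p lies inside the demo_data section, 0 otherwise. */
int in_demo_data(const void *p)
{
    uintptr_t addr = (uintptr_t)p;

    return addr >= (uintptr_t)__start_demo_data &&
           addr < (uintptr_t)__stop_demo_data;
}
```

A variable defined without the macro, or a local variable on the stack, should fall outside the range, which is essentially the information the kernel relies on when it releases the page frames backing .init.data.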
Some examples
As seen in the previous readelf listing, all special
sections appearing in the Linux kernel end up packed in one of the segments
defined in the vmlinux ELF header. Each special section fulfills a
particular purpose.
The following list groups some of the Linux special sections according to the
type of information stored in them. Whenever applicable, the name of the macro
used in the Linux code to refer to the section is mentioned instead of the
special section's name.
- Binary code
Functions invoked only during the initialization of Linux are declared as
__init and placed in the .init.text section. Once the system
is initialized, Linux uses the section delimiters to release the
page frames allocated to that section.
Functions declared as __sched are inserted into the
.sched.text special section so that they will be skipped by the
get_wchan() function, which is invoked when reading the
/proc/PID/wchan file. This
file contains the name of the function, if any, on which process PID is
blocked (see "WCHAN the waiting channel" for further details).
The section delimiters bracket the sequence of addresses to be skipped. The
down_read() function, for instance, is declared as __sched
because it gives no helpful information on the event that is blocking the
process.
- Initialized data
Global variables used only during the initialization of Linux are declared as
__initdata and placed in the .init.data section. Once the
system is initialized, Linux uses the section delimiters to release the page
frames allocated to the section.
The EXPORT_SYMBOL() macro makes the identifier passed as parameter
accessible to kernel modules. The identifier's string constant
is stored in the __ksymtab_strings section.
- Function pointers
To invoke an __init function during the initialization phase, Linux
offers an extensive set of macros (defined in <linux/init.h>);
module_init() is a well-known example.
Each of these macros puts a function pointer passed
as its parameter in a .initcalli.init section, where i identifies the
initcall level (__init functions are grouped in several classes).
During system initialization, Linux uses the section delimiters to
successively invoke all of the functions pointed to.
- Pairs of instruction pointers
The _ASM_EXTABLE(addr1, addr2) macro allows the page fault exception
handler to determine whether an exception was caused by a kernel instruction at
address addr1 while trying to read or write a byte into a process
address space. If so, the kernel jumps to addr2 that contains the
fixup code, otherwise a kernel oops occurs. The delimiters of the
__ex_table special section (see the previous linker script example) set the
range of critical kernel instructions that transfer bytes from or to user
space.
- Pairs of addresses
The EXPORT_SYMBOL() macro mentioned earlier also inserts in the
__ksymtab (or __ksymtab_gpl) special section a pair of
addresses: the identifier's address and the address of the corresponding
string constant in __ksymtab_strings. When linking a
module, the special sections filled by EXPORT_SYMBOL() allow the
kernel to do a binary search to determine whether an identifier declared as
extern by the module belongs to the set of exported symbols.
- Relative addresses
On SMP systems, the DEFINE_PER_CPU(type, varname) macro inserts the
uninitialized global variable varname, of type type, in the
.data..percpu special section. Variables stored in that
section are called per-CPU variables. Since .data..percpu is
stored in a segment whose initial address is set to 0, the addresses of
per-CPU variables are relative addresses.
During system initialization, Linux allocates a memory area
large enough to store the NR_CPUS groups of per-CPU variables. The
section delimiters are used to determine the size of the group.
- Structures
The kernel's SMP alternatives mechanism
allows a single kernel to be built optimally for multiple versions of a
given processor architecture. Through the magic of boot-time code
patching, advanced instructions can be exploited if, and only if, the
system's processor is able to execute those instructions. This mechanism
is controlled with the alternative() macro:
alternative(oldinstr, newinstr, feature);
This macro first stores oldinstr in the .text regular section.
It then stores in the .altinstructions special
section a structure that includes the following fields: the address of the
oldinstr, the address of the newinstr, the feature
flags, the length of the oldinstr, and the length of the
newinstr. It stores newinstr in a .altinstr_replacement special section.
Early in the boot process, every alternative instruction
which is supported by the running processor is patched directly
into the loaded kernel image; it will be filled with no-op
instructions if need be.
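The function-pointer technique described in the list above can be imitated in user space. The following is a minimal sketch and not the kernel's implementation: DEMO_INITCALL(), the demo_initcalls section, and the setup functions are invented for this example, whereas the real macros in <linux/init.h> handle several initcall levels. As before, it assumes that the linker (GNU ld or a compatible ELF linker) generates the __start_ and __stop_ delimiters for a section whose name is a valid C identifier.

```c
#include <stddef.h>

typedef int (*demo_initcall_t)(void);

/* Store a pointer to fn in the "demo_initcalls" section. */
#define DEMO_INITCALL(fn)                                         \
    static demo_initcall_t demo_initcall_##fn                     \
    __attribute__((used, section("demo_initcalls"))) = fn

static int demo_ready;  /* one bit per completed setup step */

static int setup_console(void) { demo_ready |= 1; return 0; }
static int setup_timer(void)   { demo_ready |= 2; return 0; }

DEMO_INITCALL(setup_console);
DEMO_INITCALL(setup_timer);

/* Section delimiters, assumed to be generated by the linker. */
extern demo_initcall_t __start_demo_initcalls[];
extern demo_initcall_t __stop_demo_initcalls[];

/* Invoke every registered function; return how many were called. */
size_t demo_do_initcalls(void)
{
    demo_initcall_t *call;
    size_t n = 0;

    for (call = __start_demo_initcalls;
         call < __stop_demo_initcalls; call++) {
        (*call)();
        n++;
    }
    return n;
}

/* Report which setup steps have run so far. */
int demo_ready_mask(void)
{
    return demo_ready;
}
```

The caller never names setup_console() or setup_timer(): adding another DEMO_INITCALL() line is enough to have one more function invoked, which is what makes the approach attractive for a code base spread over many source files.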
Additional special sections, besides
__ksymtab and
__ksymtab_strings, are introduced to handle modules. Kernel objects of
the form
*.ko have an ELF relocatable format and the ELF
header of such files defines a pair of special sections called
.modinfo and
.gnu.linkonce.this_module. Unlike the special sections of the
static kernel, these two sections are "address-less" because kernel objects do
not contain segments.
The .modinfo section is used by the
modinfo command to show information about the kernel module. The
data stored in the section is not loaded in the kernel address space.
The .gnu.linkonce.this_module special section includes a
module structure which contains, among other fields, the module's
name. When inserting a module, the init_module() system call reads the
module structure from this special section into an area of dynamic
memory.
Conclusion
Although special sections can be defined in application programs too, there is
no doubt that kernel developers have been quite creative in exploiting them. In
fact, the examples listed above are by no means exhaustive and new special
sections keep popping up in recent kernel releases. Without special
sections, implementing some kernel features like those above would be
rather difficult.
Brief items
Even the FSF continues to struggle with its own software-oriented mission. Stallman and the FSF have worked over the last several years to move non-free code that runs on what are essentially smaller sub-computers (e.g., a wireless interface or graphics device within a laptop) from the computer’s main hard drive into the sub-processors themselves. The point of these efforts is to eliminate non-free software by turning it into hardware. But are users of software more free if proprietary technology they cannot change exists in one form on their computer rather than another?
-- Benjamin Mako Hill (in an essay written in 2011, but published only recently)
Hot on the heels of a successful fundraising campaign, the
MediaGoblin decentralized media publishing platform has
released version 0.3.2. The headline feature in the release is support for 3D models.
"We've blogged about this, we've collared people at holiday parties, we've done everything but make a Gangnam Style parody video about it... but in case you haven't heard, you can now upload 3d models to MediaGoblin, whoo! This means you can build your own free-as-in-freedom Thingiverse replacement and start printing out objects. We support the sharing of STL and OBJ files. MediaGoblin can also call on Blender to create nice image previews during upload. Or if you prefer, you can use javascript to display live 3d previews in webgl-enabled browsers (we use the thingiview.js library to do this)."
Version 3.2 of the LLVM compiler system and Clang C compiler has been
released. "Despite only it being a bit over 6 months of development since 3.1,
LLVM 3.2
is a huge leap, delivering a wide range of improvements and new features.
Clang now includes industry-leading C++'11 support, improved diagnostics, C11
and Objective-C improvements (including 'ObjC literals' support), and the
Clang static analyzer now has the ability to do inter-procedural (cross-
function) analysis along with improved Objective-C support." See
the release
notes for lots of details.
Version 3.5 of the "Awesome" window manager has been
released. "The last major release happened more than three years ago. However,
even longer ago, a civilization known as the 'Maya' predicted that today a
great pain will be brought to everyone (Don't trust the 'Date' header of
this mail or you will get a long and weird explanation about time zones and
other weak excuses). Today is the day of thousand crys from users whose
config broke. Today is the end. Welcome to the time after the end."
See this message for a summary of changes in this release, or
this LWN review of Awesome from 2011.
Enlightenment DR 0.17.0 (E17) has been released, along with
version 1.7.4 of the Enlightenment Foundation
Libraries. LWN
looked at Enlightenment in
August 2011.
Version 2.17 of the GNU C library (glibc) is available. This release includes a port to ARM AArch64, contributed by Linaro, as well as a lot of bug fixes. The minimum Linux kernel version supported by this glibc release is 2.6.16.
Version 4.2.2 of the GNU stream editor "sed" is out. There are a number of
new features, but the announcement also includes the resignation of the sed
maintainer. "Barring any large change in policy and momentum from
GNU, these three reasons are bound to be the first step towards the
irrelevance of GNU. And barring any such policy change, I have no reason
to be part of GNU anymore."
Much of the discussion on the Python mailing lists in recent times has been
devoted to the topic of a new framework to support the development of
"event loop" programs in Python 3. That discussion has been pulled
together into PEP 3156; there is an accompanying reference implementation
currently called "tulip". Guido van Rossum is seeking comments on both the
proposal and the implementation. Click below for the full text of the
proposal.
Simon is a system for speech recognition; version 0.4.0 is now available.
"This new version of the open source
speech recognition system Simon features a whole new recognition layer,
context-awareness for improved accuracy and performance, a dialog system
able to hold whole conversations with the user and more."
Version 5.1.0 of the GNU Multiple Precision Arithmetic Library
(GMP) has been released. A number of speed optimizations have been added, as has support for new processors. New functions for "multi-factorials, and primorial: mpz_2fac_ui, mpz_mfac_uiui and mpz_primorial_ui" have also been added.
Version 12.3 of the Twisted framework has been released. This version adds partial support for Python 3.3, among other changes.
LightZone, a multi-platform digital photo editor that started out as a proprietary product, has been released as an open source project. The code can be found at lightzoneproject.org.
GNU Automake 1.13 has been released. This is a major update with several important changes, among them the ability to define custom recursive targets and changes to several macros.
Newsletters and articles
The H interviews Raspberry Pi Foundation executive director Eben Upton about the educational mission of the foundation, something that got a bit lost in the excitement over the hardware.
"The nice thing is that almost all of the good CS teaching software already runs on Linux, so the bulk of the work is in making sure it works well on the Pi, rather than developing things from a standing start. MIT Scratch is actually a great example of this – it's built on top of the Squeak Smalltalk VM, and because this has generally only been run in anger on modern desktop hardware there hasn't previously been a case for heavy optimisation of its graphics routines, so it's a little sluggish on the Pi right now. We've commissioned a couple of pieces of work, the first of which involves porting it to use Pixman as its rendering backend, and the second involves optimising Pixman itself for the Pi's ARMv6 architecture (which will obviously pay dividends elsewhere in the system too)."
Linux.com looks ahead to where embedded Linux is heading for 2013. The article forecasts Linux to replace realtime operating systems (RTOS) in many devices, Android getting into traditional embedded devices, more open source embedded Linux projects becoming available, expansion for Linux in the mobile and automotive spaces, and more.
"As Android enters the general embedded realm, several new Linux-based mobile OSes are stepping out to compete in the smartphone market. In 2013, the Linux Foundation's Tizen, Mozilla's Firefox OS, and Jolla's Meego spinoff, Sailfish, all plan to ship on new phones. If that's not enough, an upcoming mobile version of Ubuntu is due in 2014, HP's Open WebOS may yet reawaken on new hardware, and even the GNOME Foundation is planning a mobile-ready, developer-focused GNOME OS."
Lukas 'Slyon' Märdian looks at an Openmoko based smartphone. The Openmoko smartphone efforts were
abandoned some time ago, but Golden Delicious Computers has taken the code
and created the OpenPhoenux GTA04. "Golden Delicious Computers and
the enthusiasts from the Openmoko community started off with the idea of
stuffing a BeagleBoard into a Neo Freerunner case and connecting an USB
UMTS dongle to it – this was the first prototype GTA04A1, announced in late
2010 and presented at OHSW 2010 and FOSDEM 2011." At this time
there are about 300 GTA04(A3+A4) devices in the wild and the company has
GTA04A5 phones in production. (Thanks to Paul Wise)
Here is some advice for scientists developing open-source software, published on the
PLOS Computational Biology site in early December. "The sustainability of
software after publication is probably the biggest
problem faced by researchers who develop it, and it is here that
participating in open development from the outset can make the biggest
impact. Grant-based funding is often exhausted shortly after new software
is released, and without support, in-house maintenance of the software and
the systems it depends on becomes a struggle. As a consequence, the
software will cease to work or become unavailable for download fairly
quickly, which may contravene archival policies stipulated by your journal
or funding body. A collaborative and open project allows you to spread the
resource and maintenance load to minimize these risks, and significantly
contributes to the sustainability of your software."
Page editor: Nathan Willis