Development

Python without an operating system

By Jake Edge
April 22, 2015

PyCon 2015

Josh Triplett started out with "the punchline" for his PyCon 2015 talk on porting Python to run without an operating system: he and his Intel colleagues got the interpreter to run in the GRUB boot loader for either BIOS or EFI systems. But that didn't spoil the rest of the talk by any means. He had plenty of interesting things to say and a number of eye-opening demos to show as well.

The original reason for wanting Python in the boot loader was to be able to test the hardware, BIOS, Extensible Firmware Interface (EFI), and Advanced Configuration and Power Interface (ACPI) without having to write a "bunch of one-off programs" for testing. Traditionally, Intel had written test programs targeting DOS (for BIOS systems) or EFI. Both DOS and EFI provide environments without protections, so that programs can poke around in memory and the hardware to do what they need.

What he wanted to be able to do was to write scripts for these tests, "which is more fun". He wanted to avoid writing more C code, but also to move away from the previous incarnation that used the GRUB shell along with some shell functions that could evaluate C-like expressions. In fact, he said, "the more sentences I can end with 'without writing any C code', the happier my life is".

Over time, the Python port to GRUB has turned into a useful exploratory environment for working with hardware. It hearkens back to the fun of hacking on the Commodore 64 (or DOS) using PEEK and POKE to mess with the hardware. That can't really be done with modern hardware, he said.

Python in GRUB

The BIOS Implementation Test Suite (BITS), as the project is called, will run on GRUB for several kinds of firmware: 32-bit BIOS or EFI and 64-bit EFI. It uses GRUB 2, not the original GRUB (i.e. GRUB Legacy). It is based on the standard Python interpreter (i.e. CPython), but he apologized that it uses Python 2.7. The target audience for the tool is quite familiar with that version of the language. If that changes, he would love to move to Python 3 some day.

There is an interactive read-eval-print loop (REPL) that gives full access to the language. That includes tab completion, history, and line editing as well. A "substantial fraction" of the standard library has been ported to run on BITS. On top of that, the project has added modules for platform support: CPU, SMP (symmetric multi-processing), ACPI, EFI, and others. Intel has created a test suite and some exploratory tools written in Python using all of that.

Triplett then switched away from his slides to a Python prompt from an interpreter running on GRUB in a virtual machine. He typed two statements into the interpreter to demonstrate that it supported both list comprehensions and arbitrarily large integers (i.e. bignums).
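The article does not record the two statements he typed, so the following is only a hypothetical stand-in for that kind of check, written for Python 3 on an ordinary host rather than the Python 2.7 that BITS uses:

```python
# Hypothetical stand-ins for the two statements typed in the demo.
squares = [n * n for n in range(10)]   # list comprehension
big = 2 ** 200                         # arbitrary-precision integer (bignum)

print(squares)
print(big)
```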

To get an interactive prompt from Python, there is a single function that GRUB will need to call:

    PyRun_InteractiveLoop(stdin, "<stdin>");

That handles everything for the REPL, including parsing and executing the input, line editing, and so on. The two parameters simply say where to get the input and what to print as the source "file" in the traceback when there is an exception. But to be able to call that function from GRUB requires some work.

The project couldn't use the standard Python configure and make because those would use the toolchain and attributes from the Linux host. There is no GNU target string (i.e. the cpu-vendor-os triple used for cross-compilation) and no target header files available for GRUB. So, instead, BITS added all of the Python source files into the GRUB build system. Essentially that was just a list of C files that GRUB needed in order to add Python. Normally, autoconf would create a pyconfig.h file as part of the Python build process that would say which features are present on the platform. Instead, the project manually created a pyconfig.h that had lots of "no I don't have this feature" configuration parameters along with a handful of "yes" entries.

Many of the features listed in pyconfig.h are things that are provided (or not) by the operating system, but in this case there is no operating system. Python does minimally require some support functions, though, plus there were some extra features that were configured in. The project needed to provide any functions that were called but not present.

What CPython needs

So, what do you actually need to run CPython? Triplett provided a number of examples. There are some non-trivial file operations that are needed, like stat() to determine if a path is a directory that might contain an __init__.py or if it is a file. A simple isatty() (which, for BITS, returns true if the file descriptor is less than three) was added, as was a seek() implementation. To support those, a simple file descriptor table had to be added because GRUB's file functions use structure pointers rather than descriptors.
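The BITS versions of these functions are C code inside GRUB. As a rough illustration of the idea only, a descriptor table can be sketched in Python as a mapping from small integers to file objects. The class and method names here are hypothetical, and the lower bound in isatty() is an addition for the sketch:

```python
import io

class DescriptorTable:
    """Map small integer descriptors to file-like objects (hypothetical sketch)."""

    def __init__(self, stdin, stdout, stderr):
        # Descriptors 0, 1 and 2 are reserved for the console streams.
        self._files = {0: stdin, 1: stdout, 2: stderr}
        self._next_fd = 3

    def open(self, fileobj):
        """Register a file object and hand back its new descriptor."""
        fd = self._next_fd
        self._next_fd += 1
        self._files[fd] = fileobj
        return fd

    def isatty(self, fd):
        # The rule described in the talk: descriptors below three are the console.
        return 0 <= fd < 3

    def seek(self, fd, offset, whence=io.SEEK_SET):
        return self._files[fd].seek(offset, whence)

    def read(self, fd, size=-1):
        return self._files[fd].read(size)

    def close(self, fd):
        self._files.pop(fd).close()
```

Callers only ever see integers, while the table keeps the underlying objects, which is the same translation the article describes between descriptors and GRUB's structure pointers.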

Python also needs to be able to use ungetc(), as the parser will sometimes put one character back on the input stream. Rather than add a one-character buffer, a "quick hack" was added to seek backward by one character. An open-coded qsort() was added as well; GRUB didn't have any support for sorting.
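The seek-based ungetc() trick can be illustrated in a few lines of Python. This is a sketch of the idea, not the BITS code. The helper names are made up, and it assumes a seekable byte stream such as io.BytesIO:

```python
import io

def getc(stream):
    """Read one byte; returns b"" at end of input."""
    return stream.read(1)

def ungetc(stream):
    """Push the last byte back by seeking one position backward."""
    if stream.tell() > 0:
        stream.seek(-1, io.SEEK_CUR)

def read_identifier(stream):
    """Consume an identifier and leave the terminating byte unread."""
    name = b""
    while True:
        ch = getc(stream)
        if ch and (ch.isalnum() or ch == b"_"):
            name += ch
        else:
            if ch:
                ungetc(stream)   # give the lookahead byte back
            return name
```

The tokenizer-style read_identifier() shows why a parser wants this: it only knows the identifier has ended after it has read one byte too many.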

Floating-point math was another area that GRUB had no support for. The project found a permissively licensed floating-point library called FDLIBM. It does not have any support for acceleration using floating-point hardware, which is actually an advantage in the GRUB environment. It means that floating point can be used even if the firmware has not properly initialized the floating-point hardware.

Python uses printf() and sprintf() extensively, so those were needed. For the most part, GRUB's versions worked fine, though support for the "%%" format specifier (to put a "%" in the output) was not present. It turns out that Python uses that frequently to format its strings for output. Strange bugs were seen until that lack was identified and fixed.
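To make the missing case concrete, here is a deliberately tiny printf-style formatter in Python. It is purely illustrative and unrelated to GRUB's actual implementation, and it handles only %s, %d, and %%. The point is that "%%" needs its own branch that emits a literal percent sign and consumes no argument:

```python
def mini_format(fmt, *args):
    """Tiny printf-style formatter supporting only %s, %d and %% (illustrative)."""
    out = []
    remaining = list(args)
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(fmt):
            raise ValueError("incomplete format specifier")
        spec = fmt[i + 1]
        if spec == "%":
            out.append("%")                      # literal percent, no argument used
        elif spec == "d":
            out.append(str(int(remaining.pop(0))))
        elif spec == "s":
            out.append(str(remaining.pop(0)))
        else:
            raise ValueError("unsupported specifier %" + spec)
        i += 2
    return "".join(out)
```

Without the "%%" branch, a format string such as "%d%% done" would either fail or swallow an argument meant for a later specifier, which is the sort of thing that produces strange output bugs.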

There are a number of performance issues that the project has had to work through. To start with, the startup time was surprisingly long. That was painful on real hardware, but it was really bad on CPU circuit simulators ("we wouldn't want this to take three days to boot"). Part of the problem was the Python parser, which reads data one character at a time and uses ungetc(). GRUB does not have much disk caching, so all of that I/O hits the disk.

By adding support for .pyc (Python byte code) files, the project was able to reduce much of the parsing overhead. A host version of the interpreter is built at the same time as the GRUB version and that is used to byte-compile the Python files needed at startup.
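On an ordinary host, the same idea can be tried with the standard py_compile module. The sketch below is written for Python 3 rather than the 2.7 that BITS uses, and the module name and file layout are invented for the example. It byte-compiles a file, deletes the source, and then imports from the .pyc alone:

```python
import importlib.util
import os
import py_compile
import tempfile

def precompile(source_path):
    """Byte-compile one source file and return the path of the .pyc written."""
    return py_compile.compile(source_path, cfile=source_path + "c", doraise=True)

# Create a throwaway module to stand in for a startup file.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "startup_demo.py")
with open(src, "w") as f:
    f.write("ANSWER = 6 * 7\n")

pyc_path = precompile(src)
os.remove(src)  # only the byte code remains, so nothing has to be parsed

# Load the module directly from the byte-code file.
spec = importlib.util.spec_from_file_location("startup_demo", pyc_path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
print(module.ANSWER)
```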

That made substantial improvements, but startup is still a little slow because of stat() performance. On a Linux system, you expect stat() to take microseconds, but the BITS version takes milliseconds, he said. Adding support for zipimport allowed the project to bundle up all of the .pyc files into a single ZIP file to avoid most of the stat() calls.
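A host-side sketch of the zipimport approach follows, again in Python 3. For brevity it stores source text in the archive, whereas BITS bundles .pyc files. The module and archive names are hypothetical:

```python
import importlib
import os
import sys
import tempfile
import zipfile

def bundle(modules, zip_path):
    """Write {module_name: source_text} into a single ZIP archive."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name, source in modules.items():
            archive.writestr(name + ".py", source)
    return zip_path

zip_path = bundle(
    {"bundled_demo": "def greet():\n    return 'hello from the archive'\n"},
    os.path.join(tempfile.mkdtemp(), "stdlib_bundle.zip"),
)

# Putting the archive on sys.path lets the built-in zipimport machinery find it.
sys.path.insert(0, zip_path)
importlib.invalidate_caches()
bundled_demo = importlib.import_module("bundled_demo")
print(bundled_demo.greet())
```

Once the archive is open, module lookups are answered from its directory listing instead of by probing the filesystem, which is where the savings in stat() calls come from.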

The project wanted history and tab completion for the REPL, but the normal way to get that support is to use the Readline library. That library depends on having a POSIX environment along with tty support. The developers did not want to write a "pile of C code" to provide that, so instead they wrote the Readline support in Python. The PyOS_ReadlineFunctionPointer in CPython is set to a C function that calls the new Python function using the C API.
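The article does not show the Python side of that Readline replacement. As a rough idea of how little is needed, here is a hypothetical, much-simplified line editor that works on an already-decoded stream of keys. The key names "ENTER", "BACKSPACE", "TAB", and "UP" are conventions invented for this sketch:

```python
class LineEditor:
    """Minimal pure-Python line editor with history and tab completion."""

    def __init__(self, candidates):
        self.candidates = sorted(candidates)  # names offered for completion
        self.history = []

    def complete(self, prefix):
        """Extend prefix to the longest text shared by all matching names."""
        matches = [c for c in self.candidates if c.startswith(prefix)]
        if not matches:
            return prefix
        common = matches[0]
        for m in matches[1:]:
            while not m.startswith(common):
                common = common[:-1]
        return common

    def readline(self, keys):
        """Build one line from an iterable of key names or single characters."""
        line = ""
        index = len(self.history)          # position while browsing history
        for key in keys:
            if key == "ENTER":
                break
            elif key == "BACKSPACE":
                line = line[:-1]
            elif key == "TAB":
                # complete the last space-delimited word on the line
                head, sep, word = line.rpartition(" ")
                line = head + sep + self.complete(word)
            elif key == "UP":
                if index > 0:
                    index -= 1
                    line = self.history[index]
            else:
                line += key
        if line:
            self.history.append(line)
        return line
```

A real replacement would also have to handle cursor movement and redraw the terminal, but the core state (a line buffer, a history list, and a completion function) stays this simple.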

There was also a desire to construct dynamic menus for GRUB so that various test suites and other options were available. GRUB already has disk and filesystem providers for devices like disks and CD drives (e.g. "(hd0)", "(cd)") so BITS added a "(python)" device and filesystem that works like the Linux Filesystem in Userspace (FUSE). So Python code can access arbitrary in-memory files, such as the menu configuration file that lives at (python)/menu.cfg. "Even more C code we don't have to write", Triplett said.
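The shape of such a callback-backed filesystem can be sketched in plain Python. Everything here is hypothetical, including the class, the method names, and the run_test command in the generated menu entries. The real BITS interface is not described in the article:

```python
class PythonFS:
    """In-memory filesystem whose file contents come from Python callables."""

    def __init__(self):
        self._files = {}

    def add_file(self, path, producer):
        # producer is called on every read, so contents can change over time
        self._files[path] = producer

    def listdir(self):
        return sorted(self._files)

    def read(self, path):
        return self._files[path]()

def build_menu(tests):
    """Render GRUB-style menu entries for a list of test names."""
    entries = []
    for name in tests:
        # 'run_test' is a made-up command used only for illustration
        entries.append("menuentry '%s' {\n    run_test %s\n}" % (name, name))
    return "\n".join(entries) + "\n"

tests = ["acpi", "smp"]
fs = PythonFS()
fs.add_file("/menu.cfg", lambda: build_menu(tests))
print(fs.read("/menu.cfg"))
```

Because the file body is produced at read time, adding a test to the list changes what the boot loader would see the next time it reads the menu file.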

Accessing the hardware

Since the goal was to provide a nice environment for testing the hardware, Python needs to be able to access it. A module called "bits" was added that provided access to various hardware functionality such as CPUID, model-specific registers (MSRs), I/O ports, and memory-mapped I/O. He demonstrated those capabilities with a bit of Python:

    >>> import bits
    >>> from ctypes import *
    >>> c = bits.cpuid(0, 0)
    >>> c
    cpuid_result(eax=0x..., ebx=..., ecx=..., edx=...)

He would use the ctypes import in order to "manipulate raw pieces of memory" in the next piece of the demo. For those who want to dig a little deeper, all of the demos are quite visible in the YouTube video of the talk. The cpuid() call returns the CPUID of CPU0, which he then prints. "How fun is that?", he asked. "We are getting processor registers from Python." From there, he used Python to interpret the result:

    >>> buf = (c_uint32*3)(c.ebx, c.edx, c.ecx)
    >>> (c_char*12).from_buffer(buf).value
    'GenuineIntel'

Three of the registers contain an identifier describing the processor type. He used the types from the ctypes module to reinterpret those three registers (in that order) as a character string, which showed the processor type.
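The same reinterpretation can be tried on any machine without the bits module by hard-coding the register values. This sketch uses struct rather than ctypes so that the byte order is explicit, and the function name is made up for the example:

```python
import struct

def vendor_string(ebx, edx, ecx):
    """Decode the CPUID leaf 0 vendor ID from three 32-bit register values."""
    # x86 is little-endian, so pack each register low byte first, in the
    # EBX, EDX, ECX order used in the demo.
    return struct.pack("<3I", ebx, edx, ecx).decode("ascii")

# Register values hard-coded for illustration; on BITS they would come from
# bits.cpuid(0, 0).
print(vendor_string(0x756E6547, 0x49656E69, 0x6C65746E))
```

Each register carries four ASCII characters, with the first character in the lowest byte, which is why the EBX value 0x756E6547 reads as "Genu".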

Intel wanted to be able to test highly parallel systems, but GRUB only knows about the boot CPU. So BITS wakes up every CPU in the system and puts them into a sleeping loop using MWAIT (the x86 monitor wait instruction) waiting for work to do. There are functions to wake up specific CPUs and to run functions on them.

The project also wanted to be able to access ACPI information and methods from Python. It took the ACPI Component Architecture (ACPICA) reference implementation and added it into BITS. That was all C code, so Python bindings were added. That allows arbitrary ACPI methods to be called from Python with arguments converted to ACPI types and with the result being converted into Python types. He demonstrated dumping all of the hardware IDs for devices in the virtual machine using a simple Python program:

    >>> import acpi
    >>> print acpi.dump('_HID')

Triplett said that he wouldn't be going into more details of using BITS for hardware exploration. He has given other talks along the way with more detailed information about that.

Intel also wanted to be able to access EFI for systems using that firmware, rather than BIOS. The "Extensible" in the name refers to the idea that everything in EFI is a "protocol", each of which includes native C functions to call. To do so, the foreign function interface provided by libffi was ported to run in GRUB and support for the EFI calling convention was added. Using that and the Python ctypes module that provides an interface to C types and functions from Python allowed the interpreter access to EFI. He demonstrated accessing EFI methods from within Python:

    >>> import efi
    >>> out = efi.system_table.ConOut.contents
    >>> out.ClearScreen(out)
    [ which clears the screen ]
    >>> out.OutputString(out, 'Hello world!\r\n')
    Hello world!
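The calling pattern above, where the protocol object is passed as the first argument to its own method, is what falls out of modeling a protocol as a C structure full of function pointers. It can be imitated on a normal host with ctypes alone. In this sketch the structure, its single field, and the Python callback standing in for firmware code are all hypothetical, and c_wchar_p is used for convenience even though EFI strings are 16-bit:

```python
import ctypes

class TextOutputProtocol(ctypes.Structure):
    """Hypothetical stand-in for an EFI-style protocol: a table of functions."""

# Each method receives a pointer to the protocol itself as its first argument.
OUTPUT_STRING = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(TextOutputProtocol), ctypes.c_wchar_p
)
TextOutputProtocol._fields_ = [("OutputString", OUTPUT_STRING)]

written = []

def _output_string(this, text):
    written.append(text)   # real firmware would draw to the console instead
    return 0               # zero plays the role of a success status

_callback = OUTPUT_STRING(_output_string)   # keep a reference alive
out = TextOutputProtocol(_callback)

# Call through the function pointer stored in the structure.
status = out.OutputString(ctypes.byref(out), "Hello world!\r\n")
print(status, written)
```

In BITS the function pointers come from the firmware rather than from Python callbacks, and the ported libffi takes care of the EFI calling convention.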

Access to EFI also allows Python to use the EFI file protocol to make directories and write files in the EFI filesystem, which is useful since GRUB only knows how to read files. Beyond that, there is a graphics output protocol (GOP) that can be used to read and write the contents of the screen. As he noted, presentation slides are simply graphics and, in fact, were being displayed by BITS and EFI on his laptop. The presentation and demos were all done in the BITS environment, so, in reality, the whole presentation was a demo, he said to a round of applause. Doing so required "no new C code, not a single line".

He saved his best demo for last. He started by getting a pointer to the frame buffer from the EFI GOP as a Python array. As he typed in the next few lines of code, it was clear that some in the room recognized what he was up to, which was calculating and displaying a 400x400 grayscale image of the Mandelbrot set. "Fractals in eight lines of Python using the EFI graphics protocol", he said to another round of applause. It took around 15 seconds to draw the image, which was kind of slow; that was not due to Python, he said, but instead to the software-only floating point in the interpreter.
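The eight lines from the talk are not reproduced in the article. The following is an independent sketch of the same computation, written for a normal Python 3 interpreter. It fills a plain bytearray with gray levels instead of writing to the EFI frame buffer, and the function name and the region of the complex plane are choices made for this example:

```python
def mandelbrot(width, height, max_iter=255):
    """Return a row-major bytearray of 8-bit gray levels for the Mandelbrot set."""
    pixels = bytearray(width * height)
    for py in range(height):
        for px in range(width):
            # map the pixel into the square [-2, 1] x [-1.5, 1.5]
            c = complex(-2.0 + 3.0 * px / width, -1.5 + 3.0 * py / height)
            z = 0j
            n = 0
            while n < max_iter and abs(z) <= 2.0:
                z = z * z + c
                n += 1
            # scale the iteration count into the 0..255 gray range
            pixels[py * width + px] = n * 255 // max_iter
    return pixels

image = mandelbrot(40, 40)   # small size so the sketch runs quickly
print(len(image), image[20 * 40 + 20])
```

Calling mandelbrot(400, 400) gives an image of the size shown in the talk. Nearly all of the time goes into the floating-point multiply in the inner loop, which is consistent with his remark that software floating point was the bottleneck.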

In the questions following the talk, Triplett noted that there was no hook for interrupt handling in BITS, but that it is something that could be added fairly easily. Environments like Mirage OS (and other "just enough operating systems") could also add Python using the BITS code without too much difficulty, he said. The "next fun item on our to-do list" is to add Python bindings for the EFI TCP network protocol and to hook that up to the Python socket module to see if the SimpleHTTPServer will run in that environment. That would effectively add a "web REPL" to the BITS environment.


Brief items

Quote of the week

GDB will be the weapon we fight with if we accidentally build Skynet.
Gary Benson


GNU Hurd 0.6 released

It has been roughly a year and a half since the last release of the GNU Hurd operating system, so it may be of interest to some readers that GNU Hurd 0.6 has been released along with GNU Mach 1.5 (the microkernel that Hurd runs on) and GNU MIG 1.5 (the Mach Interface Generator, which generates code to handle remote procedure calls). New features include procfs and random translators; cleanups and stylistic fixes, some of which came from static analysis; message dispatching improvements; integer hashing performance improvements; a split of the init server into a startup server and an init program based on System V init; and more. "GNU Hurd runs on 32-bit x86 machines. A version running on 64-bit x86 (x86_64) machines is in progress. Volunteers interested in ports to other architectures are sought; please contact us (see below) if you'd like to help. To compile the Hurd, you need a toolchain configured to target i?86-gnu; you cannot use a toolchain targeting GNU/Linux. Also note that you cannot run the Hurd "in isolation": you'll need to add further components such as the GNU Mach microkernel and the GNU C Library (glibc), to turn it into a runnable system."


Schaller: Red Hat joins Khronos

At his blog, Christian Schaller announces that Red Hat has joined the Khronos Group, the consortium behind (among other things) the OpenGL standard. Schaller notes that "the reason we are joining is because of all the important changes that are happening in Graphics and GPU compute these days and our wish to have more direct input of the direction of some of these technologies. Our efforts are likely to focus on improving the OpenGL specification by proposing some new extensions to OpenGL, and of course providing input and help with moving the new Vulkan standard forward."


PacketFence 5.0 released

PacketFence is a free network access control system; the 5.0 release is now available. Changes include a new active clustering mode, better device fingerprinting, better performance monitoring, the elimination of plaintext passwords, and more.


Ardour 4.0 released

Version 4.0 of the Ardour audio editing system is available. This release features more flexible audio support (JACK is no longer required), a lot of user-interface work, and official OS X and Windows support.


libXpresent 1.0.0 is available

Version 1.0.0 of libXpresent has been released. The library is an Xlib-compatible library for the X11 Present extension. Along with the library release, the corresponding Present protocol has also been bumped to version 1.0.


GNU Ocrad 0.25 released

Version 0.25 of the GNU Ocrad optical-character-recognition engine has been released. Character recognition has been improved, including support for several new accented Latin characters, there are several new filters, and a command-line switch has been added that enables users to apply user-defined filters.


The Linux Test Project April 2015 release available

The April 2015 release of the Linux Test Project is now available. Noteworthy changes include new test cases for the futex() syscall and for select() and poll() timeouts, as well as continued work to convert sleep() calls to proper synchronization.


GCC 5.1 released

Version 5.1 of the GNU Compiler Collection is out. "GCC 5.1 is a major release containing substantial new functionality not available in GCC 4.9.x or previous GCC releases." Some of that new functionality includes full C++14 language support, quite a few optimization improvements, partial OpenACC support, OpenMP 4.0 support, an experimental JIT library, and more; see the changelog for details.


Newsletters and articles

Development newsletters from the past week


Tschumperlé: My latest ten months working on G’MIC

David Tschumperlé has posted an extensive summary of his work on G'MIC, an image-processing tool. One of those projects was comic colorization: "The idea is very simple: Instead of forcing the artist to do all the colorization job by himself, we just ask him to put some colored key-points here and here, inside the different image regions to fill-in. Then, the algorithm tries to guess a probable colorization of the drawing, by analyzing the contours in the image and by interpolating the given colored key-points with respect to these contours." (LWN looked at G'MIC in August 2014).


The Puppet design philosophy (O'Reilly)

O'Reilly has posted an excerpt from Puppet Best Practices, an upcoming book about the Puppet system configuration tool. It's a good place to look for those wanting an introduction to how Puppet works. "Puppet can be somewhat alien to technologists who have a background in automation scripting. Where most of our scripts are procedural, Puppet is declarative. While a declarative language has many major advantages for configuration management, it does impose some interesting restrictions on the approaches we use to solve common problems."


Sourcegraph: A free code search tool for open source developers (Opensource.com)

Opensource.com introduces Sourcegraph. "Sourcegraph is a code search engine and browsing tool that semantically indexes all the open source code available on the web. You can search for code by repository, package, or function and click on fully linked code to read the docs, jump to definitions, and instantly find usage examples. And you can do all of this in your web browser, without having to configure any editor plugin."


Page editor: Nathan Willis


Copyright © 2015, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds