By Jonathan Corbet
February 23, 2011
The Python 3.2 release was announced on
February 20, exactly 20 years after 0.9.0, which was the first public Python release. Given that Python 2.x remains the version of the
language used by most programmers and most existing code, one might be tempted
to write off this release as being relatively unimportant. But the 3.2
release has some changes which will be important to Python developers going
forward, so, even if one isn't planning on
moving to Python 3 right away, this
release merits a quick look.
Since Python is under a moratorium on the addition of new language
features, one might think that a new release - even a major release - would
be relatively boring. But the moratorium only applies to the core
language; the libraries - which is where much of the interesting action is
to be found - are unaffected. A look at the What's new in Python
3.2 document indicates that the libraries are evolving quickly indeed.
Some of the more significant changes include:
- A new "argparse" module for the handling of command-line options.
Those of us still using getopt have been left far behind; the current
"optparse" module has also been deprecated as of version 2.7. Argparse
would appear to go beyond mundane argument parsing into the creation
of command-line languages. It can probably handle more details than
most people will ever want to use.
- There is an ongoing effort to gather concurrency-related modules under
the "concurrent" namespace. The first addition there is concurrent.futures, a
mechanism for the submission and management of tasks in
multi-threaded and multi-process environments.
- The handling of compiled .pyc files has changed to reflect an
environment where multiple Python runtimes coexist. They now have the
interpreter name and version built into their names and have been
banished into a separate __pycache__ directory. There is a
similar mechanism for the handling of shared libraries.
- Many other modules have seen significant improvements; see the "what's
new" document for details.
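As a rough illustration of the first two items, here is a minimal sketch that parses a command line with argparse and hands the resulting work to a concurrent.futures thread pool. The option names, the square() function, and the run() helper are invented for this example; they are not taken from the release notes.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor


def build_parser():
    """Build a small parser; the argument names are illustrative only."""
    parser = argparse.ArgumentParser(
        description="Square some integers in a thread pool.")
    parser.add_argument("numbers", type=int, nargs="+",
                        help="integers to square")
    parser.add_argument("--workers", type=int, default=2,
                        help="size of the thread pool (default: 2)")
    return parser


def square(n):
    """Stand-in for a real task."""
    return n * n


def run(argv):
    """Parse argv, submit one task per number, and collect the results."""
    args = build_parser().parse_args(argv)
    with ThreadPoolExecutor(max_workers=args.workers) as pool:
        # submit() returns a Future immediately; result() blocks until
        # the corresponding task has finished.
        futures = [pool.submit(square, n) for n in args.numbers]
        return [f.result() for f in futures]


# An explicit argument list is passed here instead of sys.argv.
print(run(["3", "4", "--workers", "2"]))
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor gives the multi-process variant with the same submit()/result() interface, though the submitted function and its arguments must then be picklable.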
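The new .pyc naming scheme can be inspected from the interpreter itself. The sketch below assumes a reasonably current Python 3, where the helper lives in importlib.util; in 3.2 itself the equivalent function was imp.cache_from_source(). The file name spam.py is hypothetical and need not exist.

```python
import importlib.util
import os
import sys

# The tag combines the interpreter name and version, for example
# "cpython-312" on CPython 3.12; the exact value depends on the runtime.
cache_tag = sys.implementation.cache_tag
print(cache_tag)

# Path where the compiled bytecode for a (hypothetical) spam.py would be
# stored. Only the path is computed; nothing is read or written.
pyc_path = importlib.util.cache_from_source("spam.py")
print(pyc_path)
print(os.path.basename(pyc_path))
```

Because the tag is part of the file name, several interpreters can keep their compiled files for the same source side by side in one __pycache__ directory.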
A couple of the most significant improvements may be elsewhere, though.
One of those is the definition of a stable ABI for
extension modules. Anybody who has been through a Python version update
knows that the associated rebuilding of extension modules is not a lot of
fun. As of version 3.2, modules which restrict themselves to a limited
subset of the extension module API should continue to work, without being
rebuilt, with future 3.x releases. It's not yet clear how many real-world modules can live within the
restrictions of this ABI; also unclear is how much that ABI could be
extended without slowing further development of the language. But it's a
step in the right direction toward the solution of a real problem.
Another partial solution to an ongoing problem can be found in the rewrite
of the global
interpreter lock (GIL). The GIL is Python's equivalent to the kernel's Big
Kernel Lock; it ensures that only one thread can be executing in the
bytecode interpreter at any given time. Since running bytecode is what
Python programs do, the GIL can be seen as a rather significant
constraint on how much concurrency is possible in a multi-threaded
environment. Some extension modules release the GIL while they are doing
extensive computations, and the GIL (like the BKL) is released while
waiting for I/O, but that doesn't solve the real problem. The failure to
remove (or at least reduce the role of) the GIL during the Python 3
development process is, for many developers, one of the biggest
disappointments of Python 3.
The 3.2 GIL rewrite does not change the fundamental nature of the GIL, but
it does reduce its impact somewhat. As described
by Antoine Pitrou, the principal hacker behind this work, two significant
changes have been made:
- Previously, the GIL would be passed from one contending thread to the
next after a certain number of opcodes had been executed. But opcodes
do not execute in constant time, and some of them (such as calls into
an extension module) can execute for a long time indeed. The new GIL
is, instead, passed on after a bounded time period (5ms by default).
- The GIL is implemented in an inherently unfair manner; once it has
been released, any thread which comes along can claim it. Prior to
3.2, that "any thread" was often the thread which had just released the
lock. That thread was supposed to wait before attempting to reacquire
the GIL, but the fact that it was running and cache-hot meant it was
still likely to get there first. The new GIL is still unfair, but it
will at least force the releasing thread to wait until a contending
thread has acquired the lock. That should fix some of the long
latencies seen by Python programmers in some situations.
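The time-based handoff described in the first item is exposed through a pair of functions added to the sys module in 3.2. The following is a minimal sketch of reading and adjusting the interval; the 0.01 value is an arbitrary choice for illustration, not a recommendation.

```python
import sys

# The switch interval is expressed in seconds; the default is 0.005 (5ms)
# unless something else in the process has already changed it.
original = sys.getswitchinterval()
print(original)

# Ask the interpreter to request a GIL handoff every 10ms instead.
sys.setswitchinterval(0.01)
adjusted = sys.getswitchinterval()
print(adjusted)

# Restore the previous setting so the rest of the program is unaffected.
sys.setswitchinterval(original)
```

A longer interval reduces switching overhead at the cost of responsiveness for other threads; a shorter one does the reverse. The setting only bounds how long a thread may hold the GIL while another is waiting; it does not make the lock fair.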
Given the scalability limitations inherent in a single, global lock, one
might think that eliminating that lock would be a priority for the Python
developers. The Python
glossary suggests that this isn't the case:
Past efforts to create a "free-threaded" interpreter (one which
locks shared data at a much finer granularity) have not been
successful because performance suffered in the common
single-processor case. It is believed that overcoming this
performance issue would make the implementation much more
complicated and therefore costlier to maintain.
The addition of fine-grained locking which did not hurt single-threaded
code could certainly be a bit of work; it might well involve techniques like
run-time patching of the interpreter. For a system which is supposed to
run on many operating systems, such a solution could indeed be brittle and
hard to maintain. In its absence, though, the scalability of
multi-threaded Python programs will continue to be limited.
That said, Python 3 is clearly getting better. Over time, adoption appears
to be on the increase; the number of distributions and modules which
support the language is growing. Even so, Python 3 remains a
sufficiently hard sell that a group of developers recently contemplated
reopening feature-oriented development on version 2.x, but that idea fell
by the wayside when it became clear that the developer interest wasn't
there. Python 3 thus appears to be the future for those who want a
language which continues to evolve. Based on what can be seen in the 3.2
release, that evolution is going full speed, even in the face of a
moratorium on new core features.