By Jake Edge
March 4, 2009
A sandbox (or restricted execution) environment for a programming language
can be a useful feature to
allow untrusted users access to much of the language while restricting
the "dangerous" operations. Some languages, notably Java, were designed to
support sandboxes from the outset. Others, like Python, have a variety of
possible sandbox solutions, but the core language doesn't support that
functionality. A movement is afoot to change that
for Python by reviving "restricted
mode".
Guido van Rossum raised the subject on the
python-dev mailing list, which started a conversation about the
requirements for such a mode. It turns out that the interested party, who
goes by the name "Tav", would like to be able to run untrusted code within
applications in Google's App
Engine. In particular, he would like to be able to allow untrusted
code to access additional functionality by way of closures. But, because
of the introspection features of Python, a closure object could be used
to circumvent any access restrictions.
The example Tav uses in his App
Engine feature request is instructive:
def _get_blog_posts(db, current_user):
    def get_blog_posts():
        """Return Blog posts by the current user."""
        return db.get('BlogPost').filter('user =', current_user)
    return get_blog_posts
__builtins__['get_blog_posts'] = _get_blog_posts(db, 'tav@espians.com')
This would allow untrusted code to access the database in a constrained
manner, in this case only returning data for one particular user. But, by
peering inside of the
get_blog_posts object, a malicious user could
access the
db object. That would allow access to any data that is
stored in the database.
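The leak described above can be sketched in a few lines. This is only an illustration, written for Python 3, where closure cells are exposed as `__closure__` (the Python 2 of the time exposed the same cells as `func_closure`). The `FakeDB` class, its methods, and `make_get_blog_posts` are hypothetical stand-ins invented for this sketch; they are not the App Engine datastore API.

```python
class FakeDB:
    """Hypothetical stand-in for the datastore handle (not the App Engine API)."""

    def __init__(self, posts):
        self._posts = posts

    def posts_for(self, user):
        return [p for p in self._posts if p["user"] == user]

    def all_posts(self):
        return list(self._posts)


def make_get_blog_posts(db, current_user):
    def get_blog_posts():
        """Return blog posts by the current user."""
        return db.posts_for(current_user)
    return get_blog_posts


db = FakeDB([
    {"user": "tav@espians.com", "title": "Hello"},
    {"user": "someone@example.com", "title": "Private"},
])
get_blog_posts = make_get_blog_posts(db, "tav@espians.com")

# Intended use: untrusted code only sees one user's posts.
visible = get_blog_posts()

# Introspection: pair each free variable name with the object stored
# in the corresponding closure cell.
free_vars = dict(zip(
    get_blog_posts.__code__.co_freevars,
    (cell.cell_contents for cell in get_blog_posts.__closure__),
))

# The "hidden" database handle is reachable after all.
leaked_db = free_vars["db"]
everything = leaked_db.all_posts()
```

Nothing here requires special privileges; the attributes involved are ordinary parts of the function object, which is why a restricted mode would have to hide them from untrusted code.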
So, at some level, Tav, van Rossum, and others are trying to create a
restricted mode that limits the introspection so that untrusted code cannot
access attributes that "leak" information from the trusted code. This is a
fairly limited definition of a sandbox, as it relies on the safeguards of App Engine (or
of another environment, such as the PyPy
sandbox) to prevent things like system call access or
problems caused by interpreter segmentation faults. For this exercise,
those problems are explicitly defined away.
The real goal, as outlined
in Tav's blog, is to be able to provide more expressive templating for
users of App Engine applications:
Web applications like Blogger don't allow users to customise their blogs
using a rich language. Instead they have a proprietary templating system
which for the most part is just variable substitution.
Imagine instead if you could let your users use a templating language like
Genshi. Users could have the full
expressivity of the Python language to
generate the output they want.
The problem with letting users do that today is that they would be able to
use it to get at the rest of your application and start doing evil things
to your database.
In order to test his ideas about how to approach this problem, Tav issued a challenge to Python developers to
break his restricted FileReader object such that one could write a file to
the filesystem. It was only a few hours before a simple crack was posted, but, unlike the authors of other challenges
of this sort, Tav seemed delighted, rather than defeated, by what was
found. His environment essentially removed access to certain attributes
that are normally associated with an object. In essence, the challenge was
to find more attributes which needed to be added to his list.
A second version
of the challenge was posted to his blog, along with a running tally of
exploits that had been found and fixed. It is an interesting exercise that
Python developers seem to be having fun with. The problem with the
approach is that it relies on blacklists, as Victor Stinner, who also found
the first exploit, points out. A whitelist
approach is likely to be better: choosing which attributes are safe to use,
rather than removing those that are found to be unsafe.
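The difference between the two approaches can be illustrated with a pair of guarded attribute lookups. This is a minimal sketch, not Tav's implementation (his work changes which attributes the interpreter itself exposes); the helper names and the contents of both lists are assumptions chosen for the example.

```python
# Hypothetical lists, for illustration only. The blacklist deliberately
# "forgets" __globals__, as an incomplete blacklist might.
BLACKLIST = {"__closure__", "__code__"}
WHITELIST = {"__name__", "__doc__"}


def blacklist_getattr(obj, name):
    """Deny known-dangerous names; everything else passes through."""
    if name in BLACKLIST:
        raise AttributeError("access to %r is denied" % name)
    return getattr(obj, name)


def whitelist_getattr(obj, name):
    """Allow only names known to be safe; everything else is denied."""
    if name not in WHITELIST:
        raise AttributeError("access to %r is denied" % name)
    return getattr(obj, name)


def is_allowed(getter, obj, name):
    """Report whether the given guard lets the lookup through."""
    try:
        getter(obj, name)
    except AttributeError:
        return False
    return True


def greet():
    """Say hello."""
    return "hello"


# Both guards stop the attribute that was anticipated...
blocked_by_both = (
    not is_allowed(blacklist_getattr, greet, "__closure__")
    and not is_allowed(whitelist_getattr, greet, "__closure__")
)

# ...but only the whitelist stops the one nobody thought of.
blacklist_leaks = is_allowed(blacklist_getattr, greet, "__globals__")
whitelist_leaks = is_allowed(whitelist_getattr, greet, "__globals__")
```

With a blacklist, every overlooked attribute is a hole until someone finds it; with a whitelist, an overlooked attribute is merely unavailable until someone decides it is safe.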
Tav has posted a patch to the Python
core that implements his method in the language proper, as suggested by
van Rossum. Given that van Rossum, as Python lead and Google employee, is
uniquely positioned to effect these changes, his promise
to "give it serious consideration,
both for inclusion in core Python and for App Engine" would seem to
carry a lot of weight.
While it is not a complete solution to the sandboxing problem, Tav's work
will help Python applications that already run in somewhat restricted
environments. After all, from App Engine's perspective, all of the code
that it gets is untrusted, so it must provide the safeguards against
exploits of the underlying operating system by way of crashes or system
calls. Tav's code would then allow App Engine user applications to run
their own untrusted code.
This could be a solution for other programs that want to run untrusted
Python code as well. The Battle for
Wesnoth
has support for AIs written in Python, but there have been some security
concerns about users grabbing random, perhaps malicious, AI code. This
change to the Python core, perhaps coupled with a PyPy sandbox, might be
enough to reverse Eric Raymond's recent pronouncement that Lua, rather than Python, is the way forward.