Reviving Python restricted mode

By Jake Edge
March 4, 2009

A sandbox (or restricted execution) environment for a programming language can be a useful feature to allow untrusted users access to much of the language while restricting the "dangerous" operations. Some languages, notably Java, were designed to support sandboxes from the outset. Others, like Python, have a variety of possible sandbox solutions, but the core language doesn't support that functionality. A movement is afoot to change that for Python by reviving "restricted mode".

Guido van Rossum raised the subject on the python-dev mailing list, which started a conversation about the requirements for such a mode. It turns out that the interested party, who goes by the name "Tav", would like to be able to run untrusted code within applications in Google's App Engine. In particular, he would like to be able to allow untrusted code to access additional functionality by way of closures. But, because of the introspection features of Python, a closure object could be used to circumvent any access restrictions.

The example Tav uses in his App Engine feature request is instructive:

    def _get_blog_posts(db, current_user):
        def get_blog_posts():
            """Return Blog posts by the current user."""
            return db.get('BlogPost').filter('user =', current_user)
        return get_blog_posts

    __builtins__['get_blog_posts'] = _get_blog_posts(db, 'tav@espians.com')

This would allow untrusted code to access the database in a constrained manner, in this case only returning data for one particular user. But, by peering inside of the get_blog_posts object, a malicious user could access the db object. That would allow access to any data that is stored in the database.
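
The leak is easy to reproduce. What follows is a minimal sketch rather than Tav's code: FakeDB and its posts_by() method are hypothetical stand-ins for the App Engine datastore object, and the attribute names are the Python 3 spellings (Python 2 also exposes the same cell tuple as func_closure).

```python
class FakeDB:
    """Hypothetical stand-in for the real datastore handle."""

    def __init__(self, posts):
        self._posts = posts

    def posts_by(self, user):
        return [p for p in self._posts if p["user"] == user]


def _get_blog_posts(db, current_user):
    def get_blog_posts():
        """Return blog posts by the current user."""
        return db.posts_by(current_user)
    return get_blog_posts


db = FakeDB([
    {"user": "tav@espians.com", "title": "mine"},
    {"user": "someone@example.com", "title": "private"},
])
get_blog_posts = _get_blog_posts(db, "tav@espians.com")

# Intended use: untrusted code sees only one user's posts.
allowed = get_blog_posts()

# Introspection: pair each free variable name with the value captured
# in the closure cell that holds it.
captured = {
    name: cell.cell_contents
    for name, cell in zip(get_blog_posts.__code__.co_freevars,
                          get_blog_posts.__closure__)
}

# The object that was supposed to stay hidden is now in hand.
leaked_db = captured["db"]
everything = leaked_db.posts_by("someone@example.com")
```

Nothing here requires special privileges; the closure cells are ordinary attributes of the function object, which is why a restricted mode has to hide them.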

So, at some level, Tav, van Rossum, and others are trying to create a restricted mode that limits introspection so that untrusted code cannot access attributes that "leak" information from the trusted code. This is a fairly limited definition of a sandbox, as it relies on the safeguards of App Engine (or of another environment, such as the PyPy sandbox) to prevent things like system call access or problems caused by interpreter segmentation faults. For this exercise, those problems are explicitly defined away.

The real goal, as outlined in Tav's blog, is to be able to provide more expressive templating for users of App Engine applications:

Web applications like Blogger don't allow users to customise their blogs using a rich language. Instead they have a proprietary templating system which for the most part is just variable substitution.

Imagine instead if you could let your users use a templating language like Genshi. Users could have the full expressivity of the Python language to generate the output they want.

The problem with letting users do that today is that they would be able to use it to get at the rest of your application and start doing evil things to your database.

In order to test his ideas about how to approach this problem, Tav issued a challenge to Python developers to break his restricted FileReader object such that one could write a file to the filesystem. It was only a few hours before a simple crack was posted, but, unlike other challenges of this sort, Tav seemed delighted, rather than defeated, by what was found. His environment essentially removed access to certain attributes that are normally associated with an object. In essence, the challenge was to find more attributes which needed to be added to his list.

A second version of the challenge was posted to his blog, along with a running tally of exploits that had been found and fixed. It is an interesting exercise that Python developers seem to be having fun with. The problem with the approach is that it relies on blacklists, as Victor Stinner, who also found the first exploit, points out. A whitelist approach is likely to be better: choosing which attributes are safe to use, rather than removing those that are found to be unsafe.
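
The difference between the two policies can be sketched in a few lines. This is only an illustration, not how Tav's patch works: the helper functions, the reader() function, and the DENIED and ALLOWED sets are all hypothetical names chosen for the example.

```python
def blacklist_getattr(obj, name, denied):
    """Allow every attribute except those already known to be dangerous."""
    if name in denied:
        raise AttributeError(name)
    return getattr(obj, name)


def whitelist_getattr(obj, name, allowed):
    """Allow only attributes that were explicitly reviewed as safe."""
    if name not in allowed:
        raise AttributeError(name)
    return getattr(obj, name)


def reader():
    """Read-only helper handed to untrusted code."""
    return "data"


DENIED = {"__closure__"}            # the leaks someone has found so far
ALLOWED = {"__name__", "__doc__"}   # the attributes judged harmless


def can_reach(getter, policy, name):
    """Report whether the given policy lets `name` through on reader()."""
    try:
        getter(reader, name, policy)
        return True
    except AttributeError:
        return False


# __globals__ was never added to the blacklist, so it slips through;
# the whitelist refuses it without anyone having to think of it first.
blacklist_leaks = can_reach(blacklist_getattr, DENIED, "__globals__")
whitelist_leaks = can_reach(whitelist_getattr, ALLOWED, "__globals__")
```

With the blacklist, every attribute nobody has thought of yet is reachable by default; with the whitelist, the default is denial, and the cost is that legitimate attributes must be added one at a time.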

Tav has posted a patch to the Python core that implements his method in the language proper, as suggested by van Rossum. Given that van Rossum, as Python lead and Google employee, is uniquely positioned to effect these changes, his promise to "give it serious consideration, both for inclusion in core Python and for App Engine" would seem to carry a lot of weight.

While it is not a complete solution to the sandboxing problem, Tav's work will help Python applications that already run in somewhat restricted environments. After all, from App Engine's perspective, all of the code that it gets is untrusted, so it must provide the safeguards against exploits of the underlying operating system by way of crashes or system calls. Tav's code would then allow App Engine user applications to run their own untrusted code.

This could be a solution for other programs that want to run untrusted Python code as well. The Battle for Wesnoth has support for AIs written in Python, but there have been some security concerns about users grabbing random, perhaps malicious, AI code. This change to the Python core, perhaps coupled with a PyPy sandbox, might be enough to change Eric Raymond's recent pronouncement that Lua is the way forward instead of Python.


Reviving Python restricted mode

Posted Mar 6, 2009 19:00 UTC (Fri) by jimparis (subscriber, #38647) [Link]

http://tav.espians.com/a-challenge-to-break-python-securi...

Wow, that's downright scary. "Here are more and more esoteric ways to crack this software. Once we can't think of any more, it's definitely secure!"

What ever happened to proper security design, where you start with nothing and grant just the permissions you want? Designing with security in mind from the start?

Reviving Python restricted mode

Posted Mar 6, 2009 21:42 UTC (Fri) by nix (subscriber, #2304) [Link]

Um, this is making the core safe, i.e. making sure there's nothing
intrinsic to Python classes or the interpreter core -- the language
itself -- that lets you break out of restricted mode.

The *modules* will be whitelisted piece by piece as you suggest.

Reviving Python restricted mode

Posted Mar 6, 2009 22:32 UTC (Fri) by njs (guest, #40338) [Link]

There are languages designed like that (E is probably the most prominent). It's obviously the right way to do it.

Nobody (within epsilon) uses them :-(

Reviving Python restricted mode

Posted Mar 8, 2009 21:31 UTC (Sun) by kleptog (subscriber, #1183) [Link]

Indeed, Python was removed as a trusted language in PostgreSQL because it turned out to be impossible to lock down in any meaningful way. (So it's only available to trusted users). Fortunately there are other languages you can use, like Perl :)

Reviving Python restricted mode

Posted Mar 12, 2009 15:21 UTC (Thu) by tav (guest, #57059) [Link]

Thank you Jake for this article -- I'm really impressed at the amount of research you've done -- it's
the first write-up I've seen which *really* gets the whole picture.

--
Cheers, tav@espians.com

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds