Moving to Python 3

Posted Feb 10, 2011 5:19 UTC (Thu) by nevyn (subscriber, #33129)
Parent article: Moving to Python 3

> Python 2 tries to decode strings as 7-bit ASCII to get Unicode text

Which is like a one line fix to make, to default the system locale to utf-8 in py-2 instead of "ascii" ... almost instantly removing the need for checking every $%#%$# string operation in your app. ... but hey, let's pretend it's 1985 instead and write an incompatible language.
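A minimal sketch of the decoding behaviour quoted above, written in Python 3 syntax with a made-up sample string: decoding UTF-8 bytes as ASCII fails, while naming the encoding succeeds. The helper name `decode_or_none` is illustrative only.

```python
# Sample bytes are illustrative: "café" encoded as UTF-8.
data = "café".encode("utf-8")


def decode_or_none(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


ascii_result = decode_or_none(data, "ascii")  # the non-ASCII bytes are rejected
utf8_result = decode_or_none(data, "utf-8")   # decodes cleanly
```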

> Python 3 is unarguably a better language than Python 2.

Really? Unarguably? The fact that os.listdir() is utterly broken on Linux isn't any kind of hint that maybe, just maybe, there might be some problems? Or maybe people might find _some_ argument in the fact that in the two years since py-3 (3.0 was released Dec. 2008), _no_ Linux distribution has announced a timeline to move to py3k as the default python implementation.

How many apps. on rawhide or unstable run against the py-3 stack, again? About as many as the perl apps. are running on perl6?

First perl kills itself, and now this ... it's enough to make you go back to C ... or even look at Java again.


Moving to Python 3

Posted Feb 10, 2011 6:10 UTC (Thu) by mrjoel (subscriber, #60922)

Arch is actually moving to py3 as the default: http://www.archlinux.org/news/python-is-now-python-3/

Moving to Python 3

Posted Feb 10, 2011 14:39 UTC (Thu) by Webexcess (subscriber, #197)

You didn't provide a link for the problem, but is this what you're looking for?

    os.listdir(b'.') # no decoding for me, thanks
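A small sketch of what that call changes, assuming Python 3.2 or later (for `os.fsencode` and `tempfile.TemporaryDirectory`); the directory and file name are made up for the example. Passing a `str` path yields `str` names, while passing a `bytes` path yields the raw `bytes` names with no decoding step.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file with a hypothetical name so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    as_text = os.listdir(tmp)                # str in, str names out
    as_bytes = os.listdir(os.fsencode(tmp))  # bytes in, bytes names out
```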

Moving to Python 3

Posted Feb 10, 2011 15:38 UTC (Thu) by nevyn (subscriber, #33129)

I'm aware that you can call it (directly) as os.listdir(bytes(mypath)), but this has a number of problems:

1. When calling listdir() directly, the default is broken (and in a non-obvious way) ... so everybody has to remember "Oh, yeah, you have to call os.listdir() in this special way or it's broken".

2. It assumes people are calling os.listdir() directly ... which is _far_ from the normal case. So now, to do the same hack, every API that eventually calls listdir() will have to implement/debug the bytes vs. unicode input vs. output thing ... and every caller of those APIs will have to remember "Oh, yeah, you have to call foo_API() in this special way or it's broken".

3. It's still not obvious what you _do_ with those bytes, because the reason listdir() doesn't work "normally" is that its model of the Universe doesn't match reality. Basically you can't load a POSIX filename, and print "Error: open(%s): %s" ... and this problem is much bigger than POSIX filenames, it's just that's the most glaringly broken problem that people see. So the whole thing is a huge clue that "Unicode" is not any better in py-3 than it is in py-2 (which is to say, it's completely broken).
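The error-message case in point 3 can be sketched as follows. This is one possible workaround, not a claim about what Python does for you: the file name, the helper `printable`, and the choice of UTF-8 as the locale encoding are all assumptions made for illustration. A name that was not valid in the locale encoding comes back as a `str` containing a lone surrogate, which a strict encoder refuses to write out, so the message has to be escaped explicitly.

```python
def printable(name):
    """Escape anything the UTF-8 encoder rejects (such as lone surrogates)."""
    return name.encode("utf-8", errors="backslashreplace").decode("utf-8")


# Hypothetical name: the byte 0xE9 as seen under a UTF-8 locale, decoded with
# the "surrogateescape" handler described in PEP 383.
name = b"caf\xe9.txt".decode("utf-8", errors="surrogateescape")

try:
    name.encode("utf-8")  # what a strict UTF-8 output stream would attempt
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

message = "Error: open(%s): %s" % (printable(name), "No such file or directory")
```

The escaped form is lossy for display purposes only; the original `name` can still be passed back to `open()` or `os.stat()` unchanged.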

Moving to Python 3

Posted Feb 12, 2011 0:13 UTC (Sat) by cmccabe (guest, #60281)

There was a thread about non-UTF8 filenames on LWN a little while back. (One of many, I'm sure.) The consensus seemed to be that they were quite useless. They tend to break the mental model of programmers too. For example, programmers tend to assume that printing a filename to stdout is *not* a security vulnerability. But if that filename contains control characters... surprise! It can hack your terminal emulator.

Python has a pretty long history of "forcing" what it believes to be the correct behavior on its users. It even tells you how to use whitespace. I am not surprised at all that they ignore non-UTF filenames. Frankly, it's a good decision.
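The terminal-escape concern mentioned above applies to any language. A minimal sketch of one defensive approach, assuming it is acceptable to show escapes instead of the raw characters (the function name `safe_for_terminal` and the sample name are hypothetical):

```python
def safe_for_terminal(name):
    """Replace non-printable characters (control codes, lone surrogates)
    with backslash escapes so the terminal never interprets them."""
    return "".join(
        ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
        for ch in name
    )


# A made-up file name that embeds an ANSI colour escape sequence.
hostile = "\x1b[31mevil.txt"
shown = safe_for_terminal(hostile)
```

Ordinary names pass through unchanged, so the helper can be applied to every name that is printed.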

Moving to Python 3

Posted Feb 12, 2011 1:50 UTC (Sat) by foom (subscriber, #14868)

They don't ignore random-byte filenames. Filenames are decoded from bytes to unicode with the *locale encoding* (not always utf8), and the "surrogateescape" error handler. That allows roundtripping filenames through unicode even if they're not in the proper encoding at all (although in that case they'll be garbage).

http://www.python.org/dev/peps/pep-0383/
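A short sketch of the round trip described above. UTF-8 stands in for the locale encoding here, and the byte string is a made-up name that is not valid UTF-8; both are assumptions for the example. Each undecodable byte is mapped to a lone surrogate in the range U+DC80 to U+DCFF, and encoding with the same handler restores the original bytes exactly.

```python
# Hypothetical file name stored as Latin-1 on disk: 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9.txt"

# Decode the way PEP 383 describes: bad bytes become lone surrogates.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogates back into the raw bytes.
round_tripped = text.encode("utf-8", errors="surrogateescape")
```

On POSIX systems, `os.fsdecode()` and `os.fsencode()` (Python 3.2 and later) apply this handler with the actual filesystem encoding.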

Moving to Python 3

Posted Feb 15, 2011 1:24 UTC (Tue) by yuhong (guest, #57183)

Yea, it is not Python's fault that historically there has been no standard character encoding beyond ASCII for Unix filenames, in contrast to Windows LFN filenames and Mac HFS+ filenames, both of which used UTF-16 from the beginning.

Moving to Python 3

Posted Feb 15, 2011 14:32 UTC (Tue) by nevyn (subscriber, #33129)

> Yea, it is not Python's fault [that unix doesn't look like windows]

It is exactly python's fault that it pretends unix is like windows, when it isn't.

Moving to Python 3

Posted Feb 15, 2011 14:52 UTC (Tue) by foom (subscriber, #14868)

> It is exactly python's fault that it pretends unix is like windows, when it isn't.

Except that python doesn't actually do that, see comment above...

Moving to Python 3

Posted Feb 10, 2011 22:20 UTC (Thu) by rahulsundaram (subscriber, #21946)

As far as Rawhide is concerned, a lot more than Perl 6:

$ yum search python3

Loaded plugins: presto, refresh-packagekit
========== N/S Matched: python3 ===========
dreampie-python3.noarch : Support for running the python3 interpreter from
: dreampie
python3-cairo-devel.i686 : Libraries and headers for python3-cairo
python3-cairo-devel.x86_64 : Libraries and headers for python3-cairo
python3-decorator.noarch : Module to simplify usage of decorators in python3
python3-smbc.x86_64 : Python3 bindings for libsmbclient API from Samba
python3-stomppy.noarch : Python stomp client for messaging for python3
dpm-python3.x86_64 : Disk Pool Manager (DPM) python bindings
lfc-python3.x86_64 : LCG File Catalog (LFC) python bindings
libselinux-python3.x86_64 : SELinux python 3 bindings for libselinux
libsemanage-python3.x86_64 : semanage python 3 bindings for libsemanage
python3.i686 : Version 3 of the Python programming language aka Python 3000
python3.x86_64 : Version 3 of the Python programming language aka Python 3000
python3-PyQt4.i686 : Python 3 bindings for Qt4
python3-PyQt4.x86_64 : Python 3 bindings for Qt4
python3-PyQt4-devel.i686 : Python 3 bindings for Qt4
python3-PyQt4-devel.x86_64 : Python 3 bindings for Qt4
python3-PyYAML.x86_64 : YAML parser and emitter for Python
python3-babel.noarch : Library for internationalizing Python applications
python3-beaker.noarch : WSGI middleware layer to provide sessions
python3-bpython.noarch : Fancy curses interface to the Python 3 interactive
: interpreter
python3-cairo.x86_64 : Python 3 bindings for the cairo library
python3-chardet.noarch : Character encoding auto-detection in Python
python3-cherrypy.noarch : Pythonic, object-oriented web development framework
python3-coverage.x86_64 : Code coverage testing module for Python 3
python3-debug.i686 : Debug version of the Python 3 runtime
python3-debug.x86_64 : Debug version of the Python 3 runtime
python3-deltarpm.x86_64 : Python bindings for deltarpm
python3-devel.i686 : Libraries and header files needed for Python 3 development
python3-devel.x86_64 : Libraries and header files needed for Python 3
: development
python3-gobject.i686 : Python 3 bindings for GObject and GObject Introspection
python3-gobject.x86_64 : Python 3 bindings for GObject and GObject Introspection
python3-httplib2.noarch : A comprehensive HTTP client library
python3-inotify.noarch : Monitor filesystem events with Python under Linux
python3-jinja2.noarch : General purpose template engine
python3-libs.i686 : Python 3 runtime libraries
python3-libs.x86_64 : Python 3 runtime libraries
python3-lxml.x86_64 : ElementTree-like Python 3 bindings for libxml2 and libxslt
python3-mako.noarch : Mako template library for Python 3
python3-markupsafe.x86_64 : Implements a XML/HTML/XHTML Markup safe string for
: Python
python3-minimock.noarch : The simplest possible mock library
python3-mpi4py-mpich2.x86_64 : Python bindings of MPI, MPICH2 version
python3-mpi4py-openmpi.x86_64 : Python bindings of MPI, Open MPI version
python3-numpy.x86_64 : A fast multidimensional array facility for Python
python3-numpy-f2py.x86_64 : f2py for numpy
python3-paste.noarch : Tools for using a Web Server Gateway Interface stack
python3-ply.noarch : Python Lex-Yacc
python3-postgresql.x86_64 : Connect to PostgreSQL with Python 3
python3-psutil.noarch : A process utilities module for Python 3
python3-pygments.noarch : A syntax highlighting engine written in Python 3
python3-pyke.noarch : Knowledge-based inference engine
python3-pyparsing.noarch : An object-oriented approach to text processing
: (Python 3 version)
python3-setuptools.noarch : Easily build and distribute Python 3 packages
python3-sip.i686 : SIP - Python 3/C++ Bindings Generator
python3-sip.x86_64 : SIP - Python 3/C++ Bindings Generator
python3-sip-devel.i686 : Files needed to generate Python 3 bindings for any C++
: class library
python3-sip-devel.x86_64 : Files needed to generate Python 3 bindings for any
: C++ class library
python3-sleekxmpp.noarch : Flexible XMPP client/component/server library for
: Python
python3-smbpasswd.x86_64 : Python SMB Password Hst Generator Module for Python 3
python3-sqlalchemy.x86_64 : Modular and flexible ORM library for python
python3-tempita.noarch : A very small text templating language
python3-test.i686 : The test modules from the main python 3 package
python3-test.x86_64 : The test modules from the main python 3 package
python3-tkinter.i686 : A GUI toolkit for Python 3
python3-tkinter.x86_64 : A GUI toolkit for Python 3
python3-tools.i686 : A collection of tools included with Python 3
python3-tools.x86_64 : A collection of tools included with Python 3
python3-zmq.x86_64 : Software library for fast, message-based applications

Moving to Python 3

Posted Feb 10, 2011 23:50 UTC (Thu) by dave_malcolm (subscriber, #15013)

FWIW, we're tracking Fedora's Python 3 status in more detail here:
https://fedoraproject.org/wiki/Python3#Porting_status
(and this may be of interest to other distributions looking to build out a Python 3 stack)
