
Python moratorium and the future of 2.x

November 11, 2009

This article was contributed by Andrew M. Kuchling

On November 9, Python BDFL ("Benevolent Dictator For Life") Guido van Rossum froze the Python language's syntax and grammar in their current form for at least the upcoming Python 2.7 and 3.2 releases, and possibly for longer still. This move is intended to slow things down, giving the larger Python community a chance to catch up with the latest Python 3.x releases.

The idea of freezing the language was originally proposed by Van Rossum in October on the python-ideas list and discussed on LWN. There are three primary arguments for the freeze, all described in the original proposal:

  • Letting alternate implementations, IDEs, catch up:
    [...] frequent changes to the language cause pain for implementors of alternate implementations (Jython, IronPython, PyPy, and others probably already in the wings) at little or no benefit to the average user [...]

  • Encouraging the transition to Python 3.x:
    The main goal of the Python development community at this point should be to get widespread acceptance of Python 3000. There is tons of work to be done before we can be comfortable about Python 3.x, mostly in creating solid ports of those 3rd party libraries that must be ported to Py3k before other libraries and applications can be ported.

  • Redirecting effort to the standard library and the CPython implementation:
    Development in the standard library is valuable and much less likely to be a stumbling block for alternate language implementations. I also want to exclude details of the CPython implementation, including the C API from being completely frozen — for example, if someone came up with (otherwise acceptable) changes to get rid of the [Global Interpreter Lock] I wouldn't object.

The proposal turned into PEP 3003, "Python Language Moratorium", which is more definite about what cannot be changed:

  • New built-ins
  • Language syntax
    The grammar file essentially becomes immutable apart from ambiguity fixes.
  • General language semantics
    The language operates as-is with only specific exemptions ...
  • New __future__ imports
    These are explicitly forbidden, as they effectively change the language syntax and/or semantics (albeit using a compiler directive).

Adding a new method to a built-in type will still be open for consideration, as will changing language semantics that turn out to be ambiguous or difficult to implement. Python's C API can be changed in any way that doesn't impose grammar or semantic changes, and the modules in the standard library are still fair game for improvement.

The duration of the freeze is given in the PEP as "a period of at least two years from the release of Python 3.1." Python 3.1 was released on June 27, 2009, so the freeze would extend until at least June 2011. Van Rossum later clarified the duration on python-dev, writing "In particular, the moratorium would include Python 3.2 (to be released 18-24 months after 3.1) but (unless explicitly extended) allow Python 3.3 to once again include language changes."

Most responses to the moratorium idea were favorable, but those who had objections felt those objections very strongly. Steven D'Aprano wrote:

A moratorium isn't cost-free. With the back-end free to change, patches will go stale over 2+ years. People will lose interest or otherwise move on. Those with good ideas but little patience will be discouraged. I fully expect that, human nature being as it is, those proposing a change, good or bad, will be told not to bother wasting their time, there's a moratorium on at least as often as they'll be encouraged to bide their time while the moratorium is on.

A moratorium turns Python's conservativeness up to 11. If Python already has a reputation for being conservative in the features it accepts — and I think it does — then a moratorium risks giving the impression that Python has become the language of choice for old guys sitting on their porch yelling at the damn kids to get off the lawn. That's a plus for Cobol. I don't think it is a plus for Python.

The 2-to-3 transition

One of the reasons for the moratorium is the developers' increasing concern at the slow speed of the user community's transition away from Python 2.x. The moratorium thread led to a larger discussion of where Python 3.x stands.

Progress on the transition can be roughly measured by looking at the third-party packages available for Python 3.x. Only about 100 of the 8000 packages listed on the Python Package Index claim to be compatible with Python 3, and many significant packages have not yet been ported (Numeric Python, MySQLdb, PyGTK), making it impossible for their users to port in-house code or applications. Few Linux distributions have even packaged a Python 3.x release yet.

For the Python development community, it's tempting to nudge the users toward Python 3 by discouraging them from using Python 2. The Python developers have been dividing their attention between the 2.x and 3.x branches for a few years now, and a significant number of them would like to refocus their attention on a single branch. Given the slow uptake of Python 3, though, it's difficult to know when Python 2 development can stop. The primary suggestions in the recent discussion were:

  1. Declare Python 2.6 the last 2.x release.
  2. Declare Python 2.7 the last 2.x release.
  3. After Python 2.7, continue with a few more releases (2.8, 2.9, etc.).
  4. Declare the 3.x branch an experimental version, call it dead, and begin back-porting features to the 2.x branch.

Abandoning the 3.x branch had very few supporters. Retroactively declaring 2.6 the final release was also not popular, because people have been continuing to apply and backport improvements on the assumption that there was going to be a 2.7 release.

As Skip Montanaro phrased it:

2.6.0 was released over a year ago and there has been no effort to suppress bug fix or feature additions to trunk since then. If you call 2.6 "the end of 2.x" you'll have wasted a year of work on 2.7 with about a month to go before the first 2.7 alpha release.

If you want to accelerate release of 2.7 (fewer alphas, compressed schedule, etc) that's fine, but I don't think you can turn back the clock at this point and decree that 2.7 is dead.

A significant amount of work has already been committed to the 2.7 branch, as can be seen by reading "What's New in Python 2.7" or the more detailed NEWS file. New features include an ordered dictionary type, support for using multiple context managers in a single with statement, more accurate numeric conversions and printing, and several features backported from Python 3.1.
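Two of those additions are easy to illustrate. The sketch below is written for Python 3, which shares both features, so it only shows what the 2.7 backports look like in use:

```python
from collections import OrderedDict
import io

# The ordered dictionary backported for 2.7 remembers insertion order,
# unlike the regular dict of that era:
d = OrderedDict()
d['grammar'] = 'frozen'
d['stdlib'] = 'open'
assert list(d.keys()) == ['grammar', 'stdlib']

# Multiple context managers in a single "with" statement, also new in 2.7;
# previously this required nesting two "with" blocks:
with io.StringIO('moratorium') as src, io.StringIO() as dst:
    dst.write(src.read().upper())
    assert dst.getvalue() == 'MORATORIUM'
```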

Clearly a 2.7 release will happen, and release manager Benjamin Peterson's draft release schedule projects a 2.7 final release in June 2010. There's no clear consensus on whether to continue making further releases after 2.7. Post-2.7 releases could continue to bring 2.x and 3.x into closer compatibility and improve porting tools such as the 2to3 script, while keeping existing 2.x users happy with bugfixes and a few new features, but this work does cost effort and time. Brett Cannon stated his case for calling an end with 2.7:

[...] I think a decent number of us no longer want to maintain the 2.x series. Honestly, if we go past 2.7 I am simply going to stop backporting features and bug fixes. It's just too much work keeping so many branches fixed.

Raymond Hettinger argued that imposing an end-of-life is unpleasant for users:

I do not buy into the several premises that have arisen in this thread. [First premise:] For 3.x to succeed, something bad has to happen to 2.x. (which in my book translates to intentionally harming 2.x users, either through neglect or force, in order to bait them into switching to 3.x).

Hettinger is unmoved by the argument that maintaining 2.x takes up a lot of time, arguing that backporting a feature is relatively quick compared to the time required to implement it in the first place. He's also concerned that 3.x still needs more polishing, and concludes:

In all these matters, I think the users should get a vote. And that vote should be cast with their decision to stay with 2.x, or switch to 3.x, or try to support both.

Assessment

Declaring such a long-term freeze on the language's evolution is a surprising step, and not one that developer groups often choose. Languages defined by an official standard, such as C, C++, or Lisp, are forced to evolve very slowly because of the slow standardization process, but Python is not so minutely specified. D'Aprano makes a good point that the developers are already pretty conservative; most suggestions for language changes are rejected. On the other hand, switching to Python 3.x is a big jump for users and book authors; temporarily halting further evolution may at least give them the sense they're not aiming for a constantly shifting target.

It's probably premature to call the transition to Python 3.x a failure, or even behind schedule. These transitions invariably take a lot of time and proceed slowly. Many Linux distributions have adopted Python for writing their administrative tools, making the interpreter critical to the release process. Distribution maintainers will therefore be very conservative about upgrading the Python version. It's a chicken-and-egg problem; third-party developers who stick to their distribution's packages can't use Python 3 yet, which means they don't port their code to Python 3, which gives distributions little incentive to package it. Eventually the community will switch, but it'll take a few years. The most helpful course for the Python developers is probably to demonstrate and document how applications can be ported to Python 3, as Martin von Löwis has done by experimentally porting Django to Python 3.x, and where possible get the resulting patches accepted by upstream.

It remains to be seen if a volunteer development group's efforts can be successfully redirected by declaring certain lines of development to be unwelcome. Volunteers want to work on tasks that are interesting, or amusing, or relevant to their own projects. The moratorium may lead to a perception that Python development is stalled, and developers may start up DVCS-hosted branches of Python that contain more radical changes, or move on to some other project that's more entertaining.

The nearest parallel might be the code freezes for versions 2.4 and 2.6 of the Linux kernel. The code freeze for Linux 2.4 was declared in December 1999, and 2.5.0 didn't open for new development until November 2001, nearly two years later. The long duration of the freeze led to a lot of pressure to bend the rules to get in one more feature or driver update.

Python's code freeze will be of similar length and there may be similar pressure to slip in just one little change. However, freezing the language still leaves lots of room to improve the standard library and the CPython implementation, enhance developer tools, and explore other areas not covered by the moratorium. Perhaps these tasks are enough of an outlet for creative energy to keep people interested.

Index entries for this article
GuestArticles: Kuchling, A.M.



Python moratorium and the future of 2.x

Posted Nov 12, 2009 3:00 UTC (Thu) by marduk (subscriber, #3831) [Link] (8 responses)

Personally I think a moratorium should be put on Python 2.x, starting after 2.7.0 and only fixing bugs in 2.x after that. If they continue to backport stuff to 2.x then there will be less and less reason to port to 3.x and so 3.x will just die and ten years from now we'll be using 2.101, which will have everything 3.x has but includes all the backward-compatibility crap since 2.5.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 5:41 UTC (Thu) by foom (subscriber, #14868) [Link] (7 responses)

> If they continue to backport stuff to 2.x then there will be less and less reason to port to 3.x
> and so 3.x will just die and ten years from now we'll be using 2.101, which will have everything
> 3.x has

That'd be pretty nice. :)

Python moratorium and the future of 2.x

Posted Nov 12, 2009 6:54 UTC (Thu) by drag (guest, #31333) [Link] (6 responses)

No it wouldn't.

The introduction of the 'bytes' datatype, combined with the end of the abuse of strings-for-everything, is alone enough of a reason why 2.x should eventually die.

Python moratorium and the future of 2.x

Posted Nov 13, 2009 3:12 UTC (Fri) by spitzak (guest, #4593) [Link] (5 responses)

The "Unicode" is the reason I am not using Python 3.

They are living in a fantasy world where UTF-8 magically has no errors in it.

In the real world, if those errors are not preserved, data will be corrupted. This means that the "Unicode" is USELESS because we cannot store text in it. Therefore everybody will have to use byte strings, and because the easiest way to avoid "errors" is to say the strings are ISO-8859-1, we will revert to non-Unicode really fast. This is a terrible result, and it is shameful that the people causing it are under the delusion that they are "helping Unicode".

For some reason the ability to do character = string[int] is so drilled into programmers' brains that they turn into complete idiot savants, doing incredible amounts of insanely complex and error-prone work, rather than dare to question their initial assumption and come up with the obvious solution they would use for any other piece of data stored in a stream, such as words.

The strings should be BYTES and there should not be two types. If you want "characters" (in the TINY TINY TINY percentage of cases where you do) then you use an ITERATOR!!!! I.e., "for x in string", where x is set to a special item that can compare to characters and can also encode errors. To change the codec you make a different object, but the bytes just get their reference count incremented, so there is no copying and changing codecs is trivial and O(1). "Unicode strings" would mean the codec is set to UTF-8 and "bytes" would mean the codec is set to some byte version, or possibly the iterator is disallowed. Also the parser needs to translate "\uXXXX" in a string to the UTF-8 representation and "\xNN" to a byte with that value.
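What the comment asks for can be sketched concretely. This is a hypothetical toy illustrating the idea, not any proposed or existing API; every name in it is invented:

```python
# Hypothetical sketch: strings stay raw bytes, and "characters" are only
# reachable through an iterator that can also carry undecodable bytes.
class ByteText:
    def __init__(self, data):
        self.data = data  # the raw bytes, never copied or re-encoded

    def chars(self):
        """Yield valid UTF-8 characters as str; invalid bytes as ints."""
        i, n = 0, len(self.data)
        while i < n:
            b = self.data[i]
            # Expected sequence length from the UTF-8 lead byte.
            if b < 0x80:
                length = 1
            elif 0xC0 <= b < 0xE0:
                length = 2
            elif 0xE0 <= b < 0xF0:
                length = 3
            elif 0xF0 <= b < 0xF8:
                length = 4
            else:
                length = 1  # continuation or invalid lead byte
            chunk = self.data[i:i + length]
            try:
                yield chunk.decode('utf-8')  # one valid character
            except UnicodeDecodeError:
                yield chunk[0]  # the error, preserved as a raw byte value
                length = 1      # resynchronize one byte at a time
            i += length

# Errors pass through the iterator instead of raising, so the original
# byte string is never destroyed:
text = ByteText(b'a\xffb\xc3\xa9')
assert list(text.chars()) == ['a', 0xFF, 'b', '\xe9']
```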

Python moratorium and the future of 2.x

Posted Nov 13, 2009 12:25 UTC (Fri) by intgr (subscriber, #39733) [Link] (3 responses)

I agree with you on the point of increasing efficiency by doing fewer conversions. Basically it boils down to annotating a string with its encoding, and doing the conversion lazily.

But I cannot agree with you on this claim:

They are living in a fantasy world where UTF-8 magically has no errors in it. In the real world, if those errors are not preserved, data will be corrupted.

If your supposedly UTF-8 strings have errors in them, then that data is already corrupt. What you are talking about is sweeping those corruptions under the carpet and claiming that they never existed. This is exactly the failure mode that forces every single layer of an application to implement the same clunky workarounds again and again.

The real solution to this problem is detecting corruptions early -- at the source -- to keep them from propagating any further. In fact, a frequent source of these corruptions is indeed handling UTF-8 strings as if they were a bunch of bytes.

To a coder like you, who knows all the details about character encodings, this is very obvious. However, most coders neither have the experience nor time to think of all the issues every time they deal with strings. Having an "array of characters" is just a much simpler model for them, and there is a smaller opportunity to screw up.

But Python 3 doesn't force you to use this model. It simply clears up the ambiguity between what you know is text, and what isn't. In Python 2, you never knew whether a "str" variable contains ASCII text or just a blob of binary data. In Python 3, if you have data in uncertain encodings, you use the "bytes" type. Just don't claim that it's a text string -- as long as you don't know how to interpret it, it's not text.

For some reason the ability to do character = string[int] is somehow so drilled into programmers brains that they turn into complete idiot savants

How about len(str)?
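The str/bytes separation described above is easy to demonstrate; a minimal Python 3 sketch:

```python
# In Python 3, text and binary data are separate types.
text = 'caf\xe9'               # str: a sequence of Unicode characters
data = text.encode('utf-8')    # bytes: produced by an explicit encoding step
assert isinstance(data, bytes)
assert data.decode('utf-8') == text

# len() counts characters for str, and bytes for bytes:
assert len(text) == 4
assert len(data) == 5

# Mixing the two is a TypeError rather than a silent guess at an encoding:
try:
    'abc' + b'def'
    assert False, 'should have raised'
except TypeError:
    pass
```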

Python moratorium and the future of 2.x

Posted Nov 13, 2009 18:11 UTC (Fri) by foom (subscriber, #14868) [Link]

> How about len(str)?

That is not actually a particularly useful number to have available. Some more useful numbers:

a) Bounding box if rendered in a particular font. (or for a terminal: number of cell-widths)
b) Number of glyphs (after processing combining characters/etc)
c) Number of bytes when stored in a particular encoding.

Python moratorium and the future of 2.x

Posted Nov 13, 2009 20:09 UTC (Fri) by spitzak (guest, #4593) [Link] (1 responses)

The problem is that programmers will do exactly what you are saying: if "corrupt UTF-8" means "this data is not UTF-8", that is EXACTLY how they will treat it. They will remove all attempts to interpret ANY text as UTF-8, most likely using ISO-8859-1 instead (sometimes they will double-encode the UTF-8, which has the same end result).

You cannot fix invalid UTF-8 if you say it is somehow "not UTF-8". An obvious example is that you are unable to fix an incorrect UTF-8 filename if the filesystem api does not handle it, as you will be unable to name the incorrect file in the rename() call!

You have to also realize that an error thrown when you look at a string is a DENIAL OF SERVICE. For some low-level programmer just trying to get a job done, this is about 10,000 times worse than "some foreign letters will get lost". They will debate the solution for about .001 second and they will then switch their encoding to ISO-8859-1. Or they will do worse things, such as strip all bytes with the high bit set, or remove the high bit, or change all bytes with the high bit set to "\xNN" sequences. I have seen all of these done, over and over and over again. The rules are #1: avoid that DOS error, #2: make most English still readable.

We must redesign these systems so that programmers are encouraged to work in Unicode, by making it EASY, and stop trying to be politically correct and certainly stop this bullshit about saying that invalid code points somehow make it "not UTF-8". It does not do so, any more than misspelled words make a text "not English" and somehow unreadable by an English-reader.

How about len(str)?

You seem to be under the delusion that "the number of Unicode codes" or (more likely) "the number of UTF-16 code points" is somehow interesting. I suspect "how much memory this takes" is an awful lot more interesting, and therefore len(str) should return the number of bytes. If you really, really want "characters" then you are going to have to scan the string and do something about canonical decomposition and all the other Unicode nuances.
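For what it's worth, Python 3.1 gained an error handler aimed at exactly this round-trip concern: PEP 383's "surrogateescape", which the os module uses when decoding filenames. A minimal sketch of how it preserves invalid bytes:

```python
# A lone 0xFF byte is not valid UTF-8.
raw = b'report-\xff-final.txt'

# Strict decoding raises -- the "denial of service" complained about above:
try:
    raw.decode('utf-8')
    assert False, 'should have raised'
except UnicodeDecodeError:
    pass

# The surrogateescape handler maps each invalid byte to a lone surrogate
# code point, so the original bytes can always be recovered exactly:
name = raw.decode('utf-8', errors='surrogateescape')
assert name.encode('utf-8', errors='surrogateescape') == raw
```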

Python moratorium and the future of 2.x

Posted Nov 13, 2009 21:15 UTC (Fri) by nix (subscriber, #2304) [Link]

They will remove all attempts to interpret ANY text as UTF-8, most likely using ISO-8859-1 instead
Only if they never want any users outside the US and Europe, in which case they'll be marginalized sooner rather than later.

Python moratorium and the future of 2.x

Posted Nov 19, 2009 12:34 UTC (Thu) by yeti-dn (guest, #46560) [Link]

> ...the easiest way to avoid "errors" is to say the strings are ISO-8859-1

That's the real core of the problem.

The scary myths about broken UTF-8 are very likely being spread by precisely the same people who broke the UTF-8 in the first place because they cannot imagine anything beyond ISO-8859-1.

I live outside the US and Western Europe. If I do something as sloppy as treating UTF-8 text as ISO-8859-1 (removing 8-bit chars, escaping, whatever) I completely mangle it. So people don't do it. I encounter broken UTF-8 rarely, and when I do it invariably comes from the western countries.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 5:56 UTC (Thu) by felixfix (subscriber, #242) [Link] (20 responses)

I don't use Python much at all; it didn't seem to have any advantages over Perl. Not that Perl is the best language ever... but it is good enough that Python held no interest other than curiosity about trying a new language.

However! I have tried to keep up with what is going on, and I got the distinct impression that the biggest problem with the 3.x branch was its lack of backwards compatibility. Perl6 looks to be taking even longer to come to life, but at least they have made provisions for compatibility packages, from what I think I remember reading, so that you could run perl5 packages in perl6 programs with at least mostly good prospects of succeeding.

If Python 2.x ---> 3.x really does have no compatibility provisions, if you really do have to run either 2.x or 3.x, and there is no way to include 2.x packages in 3.x, that seems incredibly brain dead to me. If there is no migration path, how are people supposed to migrate? I am serious in asking that. Was there any kind of plan for how people could switch from 2.x to 3.x? Were package maintainers supposed to maintain two versions of everything for several years? Were production developers supposed to maintain two versions of their code for several years? It all seems as if someone (BDFL?) had unrealistic ideas of how much spare time real life provides for such fantasies.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 7:18 UTC (Thu) by drag (guest, #31333) [Link] (13 responses)

Last time I checked... Perl6 does not actually exist yet. It's been under development since 2000 or so and the ETA is 'sometime in 2010'. Meanwhile Python 3 is actually out and is usable. So forgive me if I choose true unicode support, cleaned up syntax, and proper handling of binary data over backwards compatibility with a non-existent-for-another-decade release. :)

Python moratorium and the future of 2.x

Posted Nov 12, 2009 8:37 UTC (Thu) by ptman (subscriber, #57271) [Link]

Perl6 does exist. It's not ready yet, though. There is an effort underway to create a somewhat stable release of Rakudo (one implementation), called Rakudo Star, in the spring.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 8:44 UTC (Thu) by chromatic (guest, #26207) [Link] (8 responses)

Last time I checked... [Perl 6] does not actually exist yet.

I've had working Perl 6 code running for four and a half years. Rakudo (a Perl 6 implementation) will have its 23rd monthly release in a row in a week. The Rakudo spectest status page has an informative graph which, I believe, addresses your ontological questions.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 13:54 UTC (Thu) by drag (guest, #31333) [Link] (6 responses)

So how's the backwards compatibility then? Can you just pull in Perl5 modules from CPAN and have them work?

Python moratorium and the future of 2.x

Posted Nov 12, 2009 20:34 UTC (Thu) by chromatic (guest, #26207) [Link] (5 responses)

That's a work in progress; a project called Blizkost allows some interoperability with Perl 5 at the Parrot level. (XS modules complicate things somewhat, as you might expect... but the Blizkost approach is clever.)

Python moratorium and the future of 2.x

Posted Nov 12, 2009 22:17 UTC (Thu) by lysse (guest, #3190) [Link] (4 responses)

I believe "no" was the word for which you were reaching? (Possibly "not yet"?)

Python moratorium and the future of 2.x

Posted Nov 13, 2009 19:58 UTC (Fri) by chromatic (guest, #26207) [Link]

I haven't tried it, so I can't comment authoritatively on what works and doesn't work.

Python moratorium and the future of 2.x

Posted Nov 15, 2009 12:12 UTC (Sun) by IkeTo (subscriber, #2122) [Link] (2 responses)

> I believe "no" was the word for which you were reaching? (Possibly "not
> yet"?)

That depends on what the OP wants. The design of Perl 6 definitely has backward compatibility addressed, at least as long as Perl extension modules are not involved (i.e., there is no C code involved). Perl 6 is designed to run on a virtual machine (Parrot) that can interpret many different languages, Perl 5 among them. The Parrot engine can run several back-end languages at the same time, with objects of different languages cooperating in a fashion similar to Java's dynamic language interface. Finally, the Perl 6 language is designed so that its modules are distinguishable from Perl 5 modules by just looking at the first few tokens, so that the eventual VM can load a module and decide automatically whether to use the Perl 5 or Perl 6 back-end. So the design is there; it's just a question of when it actually gets implemented.

Python moratorium and the future of 2.x

Posted Nov 18, 2009 17:37 UTC (Wed) by cptskippy (guest, #62050) [Link]

So if I understand all of this correctly, the Python 3 implementation that I can download and use today sucks because it lacks a feature that Perl 6 has supported since its inception 10 years ago, and I should disregard the fact that I can't actually use this feature in Perl 6 because no one has ever been able to create a complete implementation of Perl 6?

Python moratorium and the future of 2.x

Posted Nov 19, 2009 6:42 UTC (Thu) by lysse (guest, #3190) [Link]

I believe designs for time machines also exist.

Python moratorium and the future of 2.x

Posted Nov 15, 2009 5:57 UTC (Sun) by b7j0c (guest, #27559) [Link]

whoah, someone made a big commit on 9/28!

Python moratorium and the future of 2.x

Posted Nov 12, 2009 9:46 UTC (Thu) by niner (subscriber, #26151) [Link] (1 responses)

If Python has "true unicode support" then I'm very happy to work with some untrue unicode support which actually works and is not as painful...

But let's not start another language flamewar.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 14:07 UTC (Thu) by drag (guest, #31333) [Link]

I don't care about Perl vs Python or anything like that. To each their own, but there are definitely good and practical reasons why 3 is not backwards compatible.

Anyways, it's always been normal to have multiple versions of Python installed. I have 3 versions installed right now, and while pure-Python packages generally work across all of them, most complex modules have some C code in them, so you end up with a lot of multiple copies of the same module.

--------------------------

As far as unicode support goes, it is not an issue of 'true' vs. 'untrue' unicode support. The difference is between having 8-bit strings heavily overloaded as the basic datatype for the entire implementation and every module (with data-type conversions and mismatched string types a common source of Python bugs, because unicode was tacked onto the language after the fact), versus having all strings be unicode with a separate basic byte datatype for storing data.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 14:12 UTC (Thu) by felixfix (subscriber, #242) [Link]

And a language which exists officially but is not usable is better than a language which exists unofficially and is usable?

If a language exists but is unusable because it has no backwards compatibility, does it really exist? If said language emulates the tree in the forest by falling down, will anyone emulate listeners and hear it?

Python user here

Posted Nov 13, 2009 23:38 UTC (Fri) by man_ls (guest, #15091) [Link] (4 responses)

If Python 2.x ---> 3.x really does have no compatibility provisions, if you really do have to run either 2.x or 3.x, and there is no way to include 2.x packages in 3.x, that seems incredibly brain dead to me.
My feelings exactly. Why should I "upgrade" to Python 3.x? My favorite libraries all work in 2.x, I know it more or less and it's what people know. I like Unicode a lot, but pervasive use of Unicode should not mean incompatible changes. To my little program it meant changing all str() to unicode(), all __str__() to __unicode__() and little more.

If there was a way to mix 2.x libraries with 3.x code, then users would be able to migrate at their own pace, and once there is enough 3.x code out there new libraries would be written in 3.x. But right now, it looks as if the chicken-and-egg situation might strangle 3.x.

Python user here

Posted Nov 14, 2009 0:08 UTC (Sat) by anselm (subscriber, #2796) [Link] (3 responses)

The fun thing is that Tcl/Tk got Unicode essentially nailed more than 10 years ago (in version 8.2, to be exact), and transparently at that. None of those flag-day incompatibilities that seem to haunt the Python world. In Tcl there is no notion of separate »byte strings« and »Unicode strings«. Things just work. Most of the other languages that have been around for about as long (e.g., Perl or Python) are still struggling.

People may not like the Tcl language a lot (it is a bit of an acquired taste, to be sure) but much of the underlying engineering is really very good indeed.

Unicode and Bytes

Posted Nov 15, 2009 0:31 UTC (Sun) by pboddie (guest, #50784) [Link] (2 responses)

In Tcl there is no notion of separate »byte strings« and »Unicode strings«. Things just work.

Maybe they do. It's probably no accident that custody of Tcl was with Sun around that time (or slightly earlier), and that Java APIs also occasionally have byte sequences becoming proper strings with a sprinkling of magic. I forget where this was - probably the Servlet API, where you mostly do want strings, but where deficiencies in the standards require a bit of guesswork to actually provide correct strings (and not just bytes) when the API user asks for a request parameter or part of a URL.

Of course, there's nothing to stop you storing byte values in a Unicode string type: Jython managed to do this, too. But again, cross your fingers that the right magic is being used.

Unicode and Bytes

Posted Nov 16, 2009 13:50 UTC (Mon) by kleptog (subscriber, #1183) [Link] (1 responses)

I find it interesting that perl 5.6 introduced Unicode everywhere internally and no-one had to rewrite a thing. Internally everything is unicode but you generally don't even notice. Except all those people who *needed* unicode support could write their programs and all the extension modules just worked (for the most part).

For example, chr() can now return numbers greater than 255, but that doesn't bother people so much. Your *source* will still be interpreted as latin1, but that doesn't bother many people since you don't usually need unicode in your source. (I'm somewhat baffled by Python's "unicode" construct at the source level; assuming the source was latin1 would have made transitions easier.)

What mostly happened is that when you tried to send unicode data over a pipe or to a file, you got an error. People filed bugs, the appropriate encode() call was added (or "use bytes" if people wanted to punt) and all was well. The workaround was to encode prior to calling the module so it was no big deal.

There is magic under the hood of course, see the perlunicode manpage, but the result is a completely transparent transition, which is why at my work we run perl5.8 and python2.4: python upgrades always break something (for zero apparent benefit).

Unicode and Bytes

Posted Nov 19, 2009 12:19 UTC (Thu) by yeti-dn (guest, #46560) [Link]

I find it interesting that perl 5.6 introduced Unicode everywhere internally and no-one had to rewrite a thing.

Maybe in the US because US-ASCII is good for everyone there (with a few reluctantly admitting that ISO-8859-1 exists too). But I remember perl 5.6 release well exactly because it broke important programs (for me anyway) working with international text and reencoding it. Somehow, perl 5.8 managed to break them again. Then I stopped counting.

Python moratorium and the future of 2.x

Posted Nov 15, 2009 5:50 UTC (Sun) by b7j0c (guest, #27559) [Link]

I would describe python3 and perl6 as being at almost exactly the same place in their development cycles. python3 may be "released" and blessed, but its adoption is only tentative... everyone is in the same boat: they want to be part of the future, but the tool isn't done yet, and certainly everything else in module-land is a long way from catching up.

AFAIK Python MySQL support isn't even on 2.6 yet... getting to 3.0 is going to take a while.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 8:58 UTC (Thu) by mjthayer (guest, #39183) [Link] (1 responses)

Is it not possible to write code that works with both Python 2 and Python 3? If so, perhaps a few "code bash days" in which important existing code was reworked might do a lot for uptake of Python 3.

I do find it rather surprising that people think that the core language can't survive for a couple of years without new features. Libraries are a different matter, but new library features can also be done in external libraries, and brought in after the end of the freeze if that makes sense.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 14:19 UTC (Thu) by bartoldeman (guest, #4205) [Link]

It's possible to write code that works in both, though it may not be particularly elegant, especially if you use unicode strings. You'd need to use things like

    "".encode("ascii").join(slist)

instead of just

    "".join(slist)

or

    b"".join(slist)

if slist is a list of byte strings. And use

    try:
        f = open(filename, mode)
    except IOError:
        s = sys.exc_info()[1]

instead of

    except IOError, s:

or

    except IOError as s:

Though it also depends on your 2.x baseline; 2.6 already supports a lot of the 3.x syntax. Supporting everything from below 2.2 (not to mention 1.5.2) is going to be much more painful, though, as 2.1 supports has_key() but not "in" for dictionaries, and 3.x is the other way around.

See also: here and here.
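The idioms described above can be collected into a small file that runs unmodified under both interpreters. A minimal sketch; the helper names (join_bytes, read_or_none) are made up for illustration, not from any library:

```python
# Sketch of 2.x/3.x-compatible idioms: byte-string joins and portable
# exception capture. Helper names are illustrative only.
import sys

def join_bytes(slist):
    # "".encode("ascii") is a byte string on both Python 2 and Python 3,
    # so this joins a list of byte strings under either interpreter.
    return "".encode("ascii").join(slist)

def read_or_none(filename):
    # sys.exc_info() sidesteps the incompatible "except E, v" (2.x only)
    # and "except E as v" (2.6+/3.x) spellings.
    try:
        f = open(filename)
    except IOError:
        err = sys.exc_info()[1]  # the exception instance, portably
        return None
    return f

print(join_bytes(["a".encode("ascii"), "b".encode("ascii")]))
```

Note that even print() is written in the lowest-common-denominator form: with a single argument it behaves the same as the 2.x print statement with parentheses.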

Python moratorium and the future of 2.x

Posted Nov 12, 2009 9:37 UTC (Thu) by jschrod (subscriber, #1646) [Link] (7 responses)

I don't program in Python, so I don't fully understand the distributions' problem. Why is Python 3 not distributed under a different name? E.g., call the program python3 and put the libraries in their own tree, separate from the Python 2 libraries. Then people could start porting their libraries and applications without having to install from upstream.

E.g., SUSE does that. Is that not done by other distributions? (On Debian sid, I can't see python3 packages, though there are older versions in separate packages.) If not, why?

Python moratorium and the future of 2.x

Posted Nov 12, 2009 11:00 UTC (Thu) by samtherecordman (guest, #43207) [Link] (6 responses)

The Python community should work closely with the distros, including the enterprise distros, to get it packaged and made part of the base install. I'm a Fedora user and Python programmer, and I can't even get Python 3 as an official package in Fedora 11 (a fairly recent distro).

I see it's been scheduled for Fedora 13 as an optional package - see https://fedoraproject.org/wiki/Features/Python3F13 and this bugzilla entry as well https://bugzilla.redhat.com/show_bug.cgi?id=526126

If it's not in Fedora yet, it's certainly not going to hit the enterprise variants RedHat/CentOS for a long time yet (I'm not so familiar with other distros, so can't comment on those).

Not being able to install Python 3 using the package manager is a barrier to Python 3 adoption for me. I can't even write simple scripts in it since for some machines I use there are controls on what can be globally installed (i.e. only packaged applications).

Python moratorium and the future of 2.x

Posted Nov 12, 2009 13:10 UTC (Thu) by sbergman27 (guest, #10767) [Link] (5 responses)

Fedora doesn't have Python3?! Wow.

It's been a simple "apt-get install" in Ubuntu for the last couple of releases. You can run Python 2 and Python 3 side by side easily. I thought Fedora was supposed to promote the use of new technologies. Why are they at the forefront of holding this one back? As if a new buggy, redundant, and unneeded sound server is more worthy of promotion than major improvements to a much-used programming language.

Python moratorium and the future of 2.x

Posted Nov 12, 2009 13:52 UTC (Thu) by drag (guest, #31333) [Link] (4 responses)

I personally have Python 2.4, Python 2.5, and Python 2.6 installed and
Python 3.1 installable through apt-get. :)

Python moratorium and the future of 2.x

Posted Nov 12, 2009 14:38 UTC (Thu) by rahulsundaram (subscriber, #21946) [Link] (3 responses)

Python 3 has been available for Fedora for a long time but without actual applications taking advantage of it, it hasn't been very useful. Fedora 13 is going a bit beyond that and integrating not only the base language but also a number of libraries and updating core pieces like the RPM Python binding so that Python 3 is more useful in the real world.

Since your favorite distribution also made the deliberate choice of using PulseAudio by default, they must be thinking that it is awesome as well :-)

Python moratorium and the future of 2.x

Posted Nov 13, 2009 5:52 UTC (Fri) by sbergman27 (guest, #10767) [Link] (2 responses)

What does "available" mean, exactly? Is it in Fedora's official repos or not? Remember that
Python plays 2 separate roles in most distros. It's the runtime for various OS utilities... and a
development platform for users who program. As a Python programmer, I don't care all that
much what version of Python that print manager is using. But I care very much about the
quality of support for the latest 2.x and 3.x versions of Python and associated modules. Both
of which are first class in my "favorite" distro. Though I generally prefer the term "preferred"
to "favorite". One should not get too attached to any one distro. It distorts one's perspective.

At any rate, while Shuttleworth will almost certainly eventually end up in Heaven, I'm
recommending a couple of extra days in purgatory for the PulseAudio thing. Lennart's going
straight to Hell, though. :-)

Python moratorium and the future of 2.x

Posted Nov 13, 2009 9:51 UTC (Fri) by rahulsundaram (subscriber, #21946) [Link] (1 responses)

For a Python programmer, it has been readily available, packaged as a parallel-installable single RPM, for a very long time already. The Fedora 13 plan is for better integration, taking the first steps toward its use by distribution utilities. Since Fedora uses Python extensively, including for Anaconda, yum, system configuration utilities, etc., this is an incremental process.

If you want to imagine sending any free and open source software developer to nonexistent places, feel free to.

Python moratorium and the future of 2.x

Posted Nov 17, 2009 20:23 UTC (Tue) by sbergman27 (guest, #10767) [Link]

This is one time that I would cheer Fedora on in their much-vaunted effort to facilitate the adoption of new technology. But a year after the release of Python 3, it seems like Fedora is just now making it to the starting line. As a Python consumer, it seems to me that there would be more interest, on Fedora's part, in helping to get past this 2->3 pothole in the road. In short: "Fedora, we need you now. Where are you?"

Why move to Python 3?

Posted Nov 12, 2009 12:16 UTC (Thu) by jonth (guest, #4008) [Link] (1 responses)

I can only offer a personal perspective on the reasons the transition to Python 3 isn't happening quickly, but I think it's salient.

For work, I maintain a number of python scripts & modules, which were written when Python 2.4 was current. These scripts work on Python 2.5 and 2.6. They are not compatible with Python 3.0, and any changes to do so won't be backwards compatible with Python 2.x, so to support Python 3.0 I'll need to fork my code.

So, here's the obvious question: why should I move? It's a significant amount of work, it requires that I maintain two different codebases, and I get no benefit at all from updating to Python 3.0.

Why move to Python 3?

Posted Nov 14, 2009 16:07 UTC (Sat) by man_ls (guest, #15091) [Link]

Another perspective. I maintain a specialized Python package published under the GPL (recently accepted in Debian), and I want to maximize the number of people that can run it. This means keeping backward compatibility from Python 2.3, so I don't use any features added in 2.4, 2.5 or 2.6. I guess that most maintainers are in the same situation. So you can imagine my feelings about migrating to 3.x: no amount of flag days is going to help here. Frankly, I would rather migrate to Lua (an intriguing language that I want to learn anyway) than to Python 3.x, so if 2.x grows cobwebs I know what to do.

Real problem: Mixing up Python 3 "the language" and "the implementation"

Posted Nov 12, 2009 15:08 UTC (Thu) by dwheeler (guest, #1216) [Link] (1 responses)

Python 3 is a nice *language*, but you can't move programs to it until all the libraries switch simultaneously because it has a completely different implementation. And that's silly.

What should happen is that there should be a single implementation that supports both the Python 2 and Python 3 languages. That way, people could incrementally transition.

Real problem: Mixing up Python 3 "the language" and "the implementation"

Posted Nov 13, 2009 12:39 UTC (Fri) by dag- (guest, #30207) [Link]

Right, much like HTML pages indicate what "dialect" they are speaking, the Python interpreter could understand such hints and do what is expected of it.

That makes much more sense than having to choose what /usr/bin/python is, and whether the new interpreter needs to be /usr/bin/python3 or not.

I am sure the Python developers thought about it and rejected it for some reason, but it seems the most practical way for a distribution or a programmer to gradually migrate their whole stack.
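Python never grew a full per-file dialect marker of that kind, but the nearest existing mechanism is the `from __future__` import, which lets a 2.x source file opt into individual 3.x behaviors one at a time. A small sketch:

```python
# On Python 2.6+, these imports switch on selected Python 3 semantics
# for this one file; on Python 3 itself they are accepted as no-ops,
# so the file stays valid under both interpreters.
from __future__ import division, print_function

# With the imports active, / is true division and print is a function
# under either interpreter.
print("half of one is", 1 / 2)
```

This is per-behavior rather than per-dialect, which is one reason it only smooths, rather than solves, the 2-to-3 migration.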

The 2-to-3 transition

Posted Nov 12, 2009 16:36 UTC (Thu) by southey (guest, #9466) [Link]

I do not understand the concern: without help to transition, most projects simply do not have the resources to do it.

While the 2to3 tool might be useful, it does not help projects follow the recommended path of becoming compatible prior to the actual transition. Many projects use a mixture of Python and C (or a similar language), which requires considerable effort to first update APIs before transitioning to Python 3. Thus, just getting 'python2.6 -3' to pass can require major changes to a project's code. You will still need extensive user testing to ensure that these changes are correct, even if a project has extensive tests, because those tests usually do not address language changes.


Multiple versions of the same package installed at same time

Posted Nov 13, 2009 12:19 UTC (Fri) by Cato (guest, #7643) [Link] (4 responses)

Perhaps the real issue here is that Linux's excellent package management makes it significantly harder to install a 'second' version of Python, or in fact of any package. You either need to compile from source, or install a binary under a tree that doesn't affect the core package, or do a chroot install, or something else that's equally complex.

Windows often makes this a lot simpler, although sometimes the registry enforces single versions. At least without dependency management it's easier to install the same version twice using a simple installer.

While package management makes things simpler, if there was an easier way to install two versions of Python and choose which applications use which version, you could make a gradual transition to Python 3 without breaking core distro features.

Am I missing something here?

Multiple versions of the same package installed at same time

Posted Nov 13, 2009 12:49 UTC (Fri) by rahulsundaram (subscriber, #21946) [Link]

Programs have to be designed to be parallel-installable. There are applications and libraries which do make this very easy; GTK and gstreamer come to mind.

Details at http://www106.pair.com/rhp/parallel.html

Multiple versions of the same package installed at same time

Posted Nov 13, 2009 15:49 UTC (Fri) by roblucid (guest, #48964) [Link] (2 responses)

Yes, I think you are. Basically, if you build an RPM, it does the ./configure; make && make install. Generally FOSS packages default to /usr/local, and distros configure them to be rooted at /, /usr, /opt, whatever.

So in fact you ought to be able to build Gen++ packages, rooted at a different place than Gen, and select which implementation you use by traditional means ($PATH and environment).
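That prefix-based approach can be sketched as a build recipe (the prefix path here is an arbitrary example); Python's own Makefile even provides an altinstall target that deliberately skips the unversioned "python" name:

```shell
# Build a second Python under its own prefix, leaving the distro's
# /usr/bin/python untouched. Run from an unpacked Python source tree.
./configure --prefix=/opt/python3.1
make
sudo make altinstall    # installs python3.1, but no plain "python" link

# Select the implementation the traditional way, via $PATH:
PATH=/opt/python3.1/bin:$PATH python3.1 -V
```

This is a build/config fragment rather than something a distro package manager tracks, which is exactly the trade-off being discussed.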

Opening up openSUSE 11.2 Software Package manager, and I find :

python-2.6.2-6.3
python-32bit-2.6.2-6.2
python3-3.1-3.3

So perhaps distros with good package managers are not the problem at all. Whoops, I just installed python3 without really meaning to, expecting it to tell me about other packages it would require to be installed with it.

I have both installed; guessing the command is python3:

ladm@fir:~/.kde4/share/config> python3
Python 3.1 (r31:73572, Oct 24 2009, 05:39:09)
[GCC 4.4.1 [gcc-4_4-branch revision 150839]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

Multiple versions of the same package installed at same time

Posted Nov 13, 2009 16:18 UTC (Fri) by Cato (guest, #7643) [Link] (1 responses)

I think you've made my point for me, which was that installing a second version is significantly more complex than just installing the default version.

I'm aware that some distros create 'python3' packages, but that's really a hack rather than a general solution. Why isn't it possible to pull a newer version of Python from a later version of the distro (e.g. Ubuntu 9.10 while using an earlier version of Ubuntu), and cleanly install it with dependencies, maybe in a dynamically created chroot or simply under a new directory prefix?

Multiple versions of the same package installed at same time

Posted Nov 14, 2009 17:35 UTC (Sat) by man_ls (guest, #15091) [Link]

It is better this way, I think. I don't want a second set of the same libraries lying around; if I did I would do the chroot myself. It does not look like a hack to me, at least on Debian.
# ls -l /usr/bin/python
lrwxrwxrwx 1 root root 9 jul 17 01:48 /usr/bin/python -> python2.5


Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds