
LWN.net Weekly Edition for July 11, 2013

Four KDE SC releases a year?

By Jake Edge
July 10, 2013

As with most things in software development, there are trade-offs inherent in the length of the release cycle for a project. Shorter cycles mean more frequent releases, with features getting into the hands of users more quickly. But they also mean that whatever overhead there is in creating a release is replicated more times. Finding a balance between the two can be difficult and projects have tried various lengths over the years. Currently, the KDE project is discussing a proposal to move from six months between releases down to three. It has sparked a fairly thoughtful discussion of some of the pros, cons, and logistics of making that kind of switch.

Àlex Fiestas announced the proposal on the kde-core-devel mailing list, looking for feedback prior to a "Birds of a Feather" (BoF) session he has scheduled for the upcoming Akademy conference. In a nutshell, the idea is to shrink the current release cycle down to three months, with two months for feature merging and one for release preparation and testing. Instead of multiple freezes for different elements (features, dependencies, messages, artwork, documentation, ...) of the KDE Software Collection (SC), all freezes would happen at the same time: roughly a month before release. A look at the schedule for the in-progress 4.11 release will show a rather more complicated set of freezes, for example.

Several advantages are listed in the proposal. With more frequent releases, it will be easier for distributions to pick up the most recent code. Right now, a distribution that happens to have a release at an inconvenient time in the KDE SC release cycle will have to ship a nearly six-month-old major release. The proposal would cut that down. In addition, a three-month cycle will necessarily have fewer changes than a longer cycle would, which means fewer bugs and easier testing, at least in theory.

The other benefit listed is actually more of a prerequisite for making the change: "Master will be always in a releasable state". One of the objections raised to the plan was that it would reduce the amount of time available for feature development. It would seem that some KDE projects do their feature development on the master branch, which then needs to stabilize prior to a release. As Thomas Lübking (and others) pointed out, though, the solution is to do feature development on a topic branch, rather than the master.

There are reasons that some projects do feature work on the master, however. For one thing, it is generally easier to get users to test new features when they are on the master branch, rather than some fast-moving development branch that may be broken on a fairly regular basis. Handling that problem is simply a matter of creating an integration branch for users to test, as Martin Gräßlin noted. In addition, Aurélien Gâteau explained why a topic-branch-based workflow generally works better.

Part of the difficulty may be that some KDE projects are still coming up to speed with Git, which KDE has been migrating to over the last few years. Subversion's painful branch management may have led projects into using the master (or "trunk") for development over the years. The final shutdown of the Subversion servers has not yet occurred, but it is scheduled for January 1, so projects either have switched or will soon.

More frequent releases might result in an undesirable dilution of the impact of those releases, Nuno Pinheiro said. Sven Brauch expanded on that, noting, with a bit of hyperbole, that the frequent Firefox releases (every six weeks) had made each one less visible: "I think attention in the media to their releases has declined a lot -- nobody cares any more that a new version of firefox was released since it happens every three days." It is, he said, something to consider, though it shouldn't be an overriding criterion.

The impact on distributions was also discussed. Kubuntu developer Philip Muskovac was concerned about long-term support for KDE SC releases, especially with regard to the upcoming 14.04 LTS for the Ubuntu family. He noted that, depending on how things go (presumably with Mir), whichever KDE release goes into the LTS "might very well be our last release based on KDE4/Qt4/X11". It will need to be supported for three years, and three-month cycles mean fewer minor releases, all of which may add up to a problem for Kubuntu. He suggested creating better tools to allow distributions to find "stable" fixes in newer releases—something that Fiestas seemed amenable to providing.

Kubuntu developer Scott Kitterman was more positive about the impact of a three-month KDE cycle on the distribution. He, too, is concerned about having fewer minor releases available ("we ship all the point releases to our end users and appreciate the added stability they provide"), but thinks that's probably the biggest hurdle for Kubuntu. If a solution can be found for that problem, he thought the distribution could handle the change, though he clearly noted that was his opinion only.

On behalf of the openSUSE KDE packagers, Luca Beltrame posted concerns over the amount of "extra" work that would be required to keep up with packaging KDE SC every three months. There is also a support burden when trying to handle multiple different major versions, he said. Fiestas asked Beltrame (and distribution packagers in general) what KDE could do to make it easier to package up the project. He noted that figuring out the dependencies for each new release is a pain point mentioned by multiple distributions: "Can't we coordinate on that so everybody life is easier?" Fiestas's approach seems to be one of trying to identify—and remove—the obstacles in the way of the proposal.

In a lengthy message, Aaron Seigo touched on a number of the topics in the thread. He noted that there is nothing particularly special about three months, and other intervals should be considered (he mentioned two or four months). He also pointed out that the marketing and visual design cycles need not coincide with those of the software development. The project could, for example, do a visual refresh yearly, while doing twice-yearly public relations pushes to highlight the latest KDE software features. Other arrangements are possible too, of course.

Seigo did note something that was quite apparent in the thread: there were few, if any, flames and essentially no bikeshedding. A potentially contentious topic has, at least so far, been handled by "thoughtful input". Whether or not the proposal is adopted, "the discussion has been very valuable already", he said. More of that will undoubtedly take place at Akademy in Bilbao, Spain later in July and it will be interesting to see where it leads.

Comments (9 posted)

Google, Reader, and hard lessons about migration

By Nathan Willis
July 10, 2013

As has been widely reported already, Google discontinued Reader, its RSS and Atom feed-reading tool, at the beginning of July. In the weeks preceding the shutdown, scores of replacement services popped up hoping to attract disgruntled Reader refugees. But most of them focused squarely on the news-delivery features of Reader; a closer look illustrates several additional lessons about the drawbacks of web services—beyond the simple question of where one's data is stored.

Takeout, again?

First, Google had advertised that users would be able to extract their account information from Reader ahead of the shutdown. But the reality is that the available data migration tools are often not all that they are cracked up to be, particularly when they are offered by the service provider. Reader had always allowed users to export their list of feed subscriptions in Outline Processor Markup Language (OPML) format, of course. But access to the rest of an account's Reader data required visiting Google Takeout, the company's extract-and-download service (which is run by a team within Google called the Data Liberation Front). Takeout allowed users to extract additional data like the lists of starred and shared items, notes attached to feed items, and follower/following information.
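
As a concrete illustration, pulling the feed URLs out of an exported OPML file takes only a few lines of Python. This is a sketch, not Google's code: the `<outline xmlUrl="...">` layout reflects common OPML conventions, and "subscriptions.xml" is a hypothetical filename for the exported file.

```python
# Extract feed subscription URLs from an exported OPML file.
# The <outline xmlUrl="..."> layout follows common OPML conventions;
# "subscriptions.xml" is a hypothetical name for the exported file.
import xml.etree.ElementTree as ET

def feed_urls(opml_source):
    """Return all feed URLs found in an OPML file path or file object."""
    tree = ET.parse(opml_source)
    # Feeds can be nested under folder <outline> elements, so walk the
    # whole tree and keep any node that carries an xmlUrl attribute.
    return [node.attrib["xmlUrl"]
            for node in tree.iter("outline")
            if "xmlUrl" in node.attrib]

# feed_urls("subscriptions.xml") would yield a flat list of feed URLs,
# ready to hand to another reader's import function.
```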

However, Takeout does not preserve the historical contents of subscribed feeds, the existence of which is one of the more valuable aspects of always accessing news items at a single location: it is what enables full-text and title search of cached entries. Obviously, there are copyright issues that could understandably make Google shy away from offering downloads of other sites' content—although it could be argued that the company was already retaining that content and offering access to it in a variety of products, from Reader's cache to the "cached" items in Google Search. In any event, in the weeks preceding the Reader shutdown, several tools sprang up to retrieve the cached item store, from the open source Reader Is Dead (RID) to the commercial (and Mac-only) Cloudpull.

Both Cloudpull and RID utilized the unpublished Reader API to fetch and locally store an account's entire feed history. By sheer coincidence, I stumbled across the existence of RID a few days before the shutdown deadline, and used it to successfully pull down several years' worth of feed items on June 30. The archive consumes about 30 GB of space (uncompressed), although about half of that is wasted on high-traffic feeds without any historic value, such as local weather updates and Craigslist categories.

For the rest, however, the backup is akin to a local Wayback Machine. Initially the RID project was working on its own web application called reader_browser to access and search these archives; that program is still under development with limited functionality at present, but in the first week of July the project rolled out a stop-gap solution called zombie_reader as well. zombie_reader starts a local web server on port 8074, and presents a clone of the old Reader interface using the cached archive as storage. The legality of the clone may be questionable, since it employs a modified copy of the Reader JavaScript and CSS. But there is little long-term value in developing it further anyway, since outside of search and browsing, few of the old application's features make sense for an archive tool. Developer Mihai Parparita is continuing to work on reader_browser and on an accompanying command-line tool.

The silo

Of course, maintaining a standalone archive of old news items puts an additional burden on the user; at some point the news is too old to be of sufficient value. A better long-term solution would be to merge the extracted data into a replacement feed reader. That illustrates another difficulty with migrating away from an application service provider—importing the extracted data elsewhere is problematic, if it is possible at all.

Copying in an OPML subscription list is no problem, of course, but other web-based feed-reader services will understandably not support a full history import (much less one 30GB in size). Self-hosted free software tools like ownCloud News and Tiny Tiny RSS are an option, although the official reception from Tiny Tiny RSS to such ideas has been less than enthusiastic. The Tiny Tiny RSS feature-request forum even lists asking for Google Reader features as a "bannable offense."

Outside contributors may still manage to build a working import tool for RID archives (there is one effort underway on the Tiny Tiny RSS forum). Regardless, the main factor that makes RID just a short-term fix is the fact that only those users who made an archive before Reader closed can use it. Once Google deactivated Reader, it was no longer possible to extract any more cached account data. That left quite a few confused users who did not complete their exports before the July 1 shutdown, and it puts a hard upper limit on the number of RID users and testers.

The reason archive export no longer works, quite simply, is that Google switched off the Reader API with the application itself. That is an understandable move, perhaps. But there is still another shutdown looming: even the ability to export basic information (i.e., OPML subscriptions) will vanish on July 15—which is a perplexingly short deadline, considering that users can still snag their old Google Buzz and Google Notebook data through official channels, several years after those services were shuttered. So despite the efforts of the Data Liberation Front, it seems, the company can still be arbitrarily unhelpful when it comes to specific services.

Why it still matters

The moral of the Reader shutdown (and resulting headaches) is that it is often impossible to predict which portions of your data are the valuable ones until you actually attempt to migrate away to a new service provider. Certainly Google Reader had a great many users who did not care about the ability to search through old feed item archives. But some did, and the limitations of the service's export functionality only brought that need to light when they tried to move elsewhere.

For the future, the obvious lesson is that one should not wait until a service is deactivated to attempt migration. It is easy to lapse into complacency and think that leaving Gmail will be simple if and when the day comes. But, as is the case with restoring from backups, Murphy's Law is liable to intervene in one form or another, and it is better to discover how in advance. There are certainly other widely-used Google services that exhibit the same problematic symptoms as Reader, starting with not allowing access to full data. Many of these services are for personal use only, but others are important from a business standpoint.

The most prominent example is probably Google Analytics, which is used for site traffic analysis by millions of domains. Analytics allows users to download summary reports, but not the raw numbers behind them. On the plus side, there are options for migrating the data into the open source program Piwik. However, without the original data there are limits to the amount and types of analysis that can be performed on the summary information alone. Most other Google products allow some form of export, but the options are substantially better when there is an established third-party format available, such as iCalendar. For services without clear analogues in other applications or providers—say, Google+ posts or AdWords accounts—generic formats like HTML are the norm, which may or may not be of immediate use outside of the service.

The Data Liberation Front is an admirable endeavor; no doubt, without it, the task of moving from one service provider to another would be substantially more difficult for many Google products. And the Reader shutdown is precisely the kind of major disruption that the advocates of self-hosted and federated network services (such as the Autonomo.us project) have warned free software fans about for years. But the specifics are instructive in this case as well: perhaps few Reader users recognized that the loss of their feed history would matter to them in time to export everything with RID, and perhaps more than a few are still unaware that Google Takeout will drop its Reader export functionality completely on July 15.

Ultimately, the question of how to maintain software freedom with web services divides people into several camps. Some argue that users should never use proprietary web services in the first place, but always maintain full control themselves. Others say that access to the data and the ability to delete one's account is all that really matters. The Autonomo.us project, for example, argues in its Franklin Street Statement that "users should control their private data" and that public data should be available under free and open terms. One could argue that Reader met both of those requirements, though. Consequently, if it signifies nothing else, Reader's shutdown illustrates that however admirable data portability conditions may be, those conditions are still complex ones, and there remains considerable latitude for their interpretation.

Comments (7 posted)

The next 20 years of Python

July 10, 2013

This article was contributed by Martin Michlmayr


EuroPython 2013

The EuroPython 2013 conference in Florence, Italy opened with a keynote by Van Lindberg about the next twenty years of Python. Lindberg, a lawyer with an engineering background, is the chairman of the Python Software Foundation (PSF) and the author of the book Intellectual Property and Open Source (reviewed by LWN in 2008). His keynote looked at the challenges facing the Python community and the work underway to ensure that Python will remain an attractive programming language and have a healthy community for the next twenty years (and beyond).

The design philosophy of Python

Lindberg began his keynote with a retrospective of the last twenty years of Python. He described the origins of Python as a skunkworks project, which led Guido van Rossum, the creator of Python, to a number of interesting design choices. One is that Van Rossum borrowed ideas from elsewhere, such as ALGOL 68 and C. Another design approach was to make things as simple as possible. This involved taking the same concepts and reusing them over and over again. Python also follows the Unix philosophy of doing one thing well, he said. Finally, perfection is the enemy of the good, as "good enough" is often just that. Cutting corners is allowed, as you can always go back and improve it later. Lindberg summarized that Van Rossum "got a lot right in the early days".

Lindberg noted that Van Rossum also succeeded in creating a community around Python. Lindberg identified four factors that were crucial for the success of Python. First, Python was an excellent language. This was a necessary basis because "otherwise there's nothing to gather and rally around". Second, Van Rossum chose an open source license even before the term "open source" was invented. Third, Van Rossum encouraged a sense of humor, naming the language after the Monty Python crew. Finally, Python had a sense of values.

The values behind Python, in particular, are what differentiates Python from many other programming languages. Lindberg asked the audience whether they knew about "import this". This is an Easter egg in Python which displays the Zen of Python, the guiding principles behind Python. Unlike Perl, which proudly proclaims that there's more than one way to do it, Python encourages a certain programming style. This is reflected in the Zen of Python, which says that there should be one — and preferably only one — obvious way to do it.
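
The Easter egg is easy to try: importing the module prints the poem, and, in a small joke of its own, the text is stored ROT13-encoded inside the module (in CPython, at least, as `this.s`, with the decoding table in `this.d`):

```python
# "import this" prints the Zen of Python as a side effect of the
# import. In CPython the poem is stored ROT13-encoded in this.s,
# with the character decoding table in this.d.
import this

zen = "".join(this.d.get(c, c) for c in this.s)
assert "obvious way to do it" in zen
```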

Challenges for the Python community

Lindberg emphasized that Python is a remarkable story of success. There are hundreds of thousands, maybe even millions, of people using Python as part of their jobs. Python is widely deployed — it has become the de facto standard in the movie and animation industry, is overtaking Perl in bioinformatics, and is the implementation language of two of the leading cloud platforms. Python is also a significant player in education, "finally replacing Java as the primary teaching language for a lot of universities", he said.

Despite the success, Python is facing what Lindberg described as "market share challenges". JavaScript, which used to be stricken by buggy, browser-only, and inconsistent implementations, has become a fairly big competitor in the desktop and server spaces, and particularly in mobile. Lua is increasingly used as an embeddable extension language. Lindberg sees Go as another contender. What makes Go attractive is its concurrency and ability to create easily deployable binaries that you can just drop on a system and run. "Frankly, deployment is a challenge for us", admitted Lindberg, as are mobile and other areas with highly constrained space requirements. Lindberg also mentioned the statistical and graphic abilities of R as a potential competitor.

Asking "why do I care?", he explained that it's important to keep growing — otherwise Python will end up where Smalltalk and Tcl are today. He rhetorically asked the audience when the last time was that anyone did anything interesting in Tcl. Noting that these are fantastic languages, Lindberg argued that "they have died because they have not grown". It's not just the language, but also the community around it, that can die. He observed that in addition to technical challenges facing Python, there are also challenges with scaling the Python community that need to be addressed. Lindberg believes that ten or twenty years ago it was enough to focus on the programmer, whereas these days you have to form a culture around programming.

There is something special about the Python community, according to Lindberg. He quoted the mission of the Python Software Foundation, which is to "promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers", observing that "these are important words". Lindberg argued that the current community is getting older and that actions have to be taken that will create the Python community twenty years from now: "if we don't build and grow the community, it will go away".

Big changes coming

Lindberg emphasized three areas that the Python Software Foundation is focusing on to grow the Python community, now and in the future. One is the Code of Conduct the PSF adopted in April. The Zen of Python has been important in defining Python, but its focus is on code. The Code of Conduct, on the other hand, captures what the community itself should be like — it should consist of members from all around the world with a diverse set of skills. He said that a member of the Python community is open, considerate, and respectful: members are open to collaboration, to constructive criticism, and to fostering an environment in which everyone can participate; they are considerate of their peers; and they are respectful of others, their skills, and their efforts. The Code of Conduct condenses what is great about the Python community. "It's about being the best people that we can be and being the best community that we can be", Lindberg said. Alluding to Python's reputation as the language with batteries included, he summarized that "Python is the language with community included".

The second focus for PSF is education. As we're all getting older, we have to think about where the next generation is coming from, Lindberg said. He told the story of Sam Berger, an eleven-year-old schoolboy from South Africa, who attended PyCon and participated in professional-level tutorials and classes. This is an illustration of where the next generation of Python leaders is coming from. In order to encourage that next generation, the PSF is supporting initiatives to promote young coders, such as making a curriculum to teach kids Python available online. Lindberg is also very supportive of the Raspberry Pi. He reminisced about the 80s when computers booted into BASIC. The default way to interact with the computer was through programming. If you wanted to do something else, you had to make an explicit decision. This led to an entire generation that understood that computers are tools — tools that won't break if you play around with them.

Finally, the PSF itself is adapting to better serve the needs of the Python community. It is working on a new web site (a preview of which can be found at preview.python.org). The design goal of the new site is to make it easy for the community to get involved. It is also putting a lot of thought into representing the community, and there will be efforts to address various needs, such as learning Python or teaching Python. Lindberg also lamented that the PSF is not broad and inclusive enough. Membership in the PSF currently requires a nomination from an existing member, but Lindberg believes that every member of the Python community should be a member of the PSF. In April, the PSF voted to completely redo its membership program and to open up membership to anyone. Answering a question from the audience, Lindberg clarified that basic membership will be available to anyone who signs up. Further rights, such as voting privileges, will be given to those members who have demonstrated a commitment to the Python community, such as by contributing code, documentation, or test cases — or by organizing events.

Lindberg closed by saying that the PSF is "changing to be your home". It is fundamentally saying that "we need each of you" and that "this is all about you". This is the most significant change the Python community has seen since the formation of the PSF, according to Lindberg, and it's about building the next twenty years of Python.

Comments (54 posted)

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Security: Subverting Android package verification; New vulnerabilities in kernel, nagios, reviewboard, xorg-x11-server, ...
  • Kernel: 3.11 merge window part 2; Ethernet polling and patch-pulling latency; Is the whole system idle?
  • Distributions: Debian, Berkeley DB, and AGPLv3; SUSE, ...
  • Development: HTTP 2.0; LXDE-Qt; Harlan; A Year of the Linux Desktop; ...
  • Announcements: Seth Vidal RIP, CfP, events.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds