By Jake Edge
March 25, 2009
Translating text strings into other languages, called "localization" or
"l10n", is a critical part of extending the reach of free software. But
it is equally important that those translations make their way upstream, so
that the translation work is not duplicated, and that all future versions
can benefit. Making all of that easy is the goal of Transifex, which is a platform for doing
translations that is integrated with the upstream version control system
(VCS). The project recently released
Transifex 0.5, a complete rewrite atop the Django web
framework, with many new features.
Transifex came out of work done in the 2007 Google Summer of Code for the
Fedora project. Dimitris Glezos worked on a project
to create a web interface to ease localization for Fedora. In the year and a half
since then, Transifex has grown greatly in capabilities, and is now used as
the primary tool for Fedora translations. One of the key aspects, as can
be seen in the SoC application, is a focus on being upstream friendly.
People who are able to translate text into another language—for good
or ill, most software is developed with English text—are not
necessarily developers, so their knowledge of version control may be limited. In
addition, they are unlikely to want to have multiple accounts with various
projects that might need their services. Transifex abstracts all of the
VCS-specific differences away, so that it presents a single view to
translators. This allows those folks to concentrate on what they are good at.
Transifex interfaces with the various VCSs that a development
project might choose to hold its source code. The five major VCS packages
used by free software projects (CVS, Subversion, Bazaar, Mercurial, and
Git) are all handled seamlessly by
Transifex. A translator doesn't have to know—or care—what the
project chose, and their translations will be properly propagated into the
repository.
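The kind of abstraction described above can be sketched in a few lines of Python. This is only an illustration, not Transifex's actual code: the function name build_commit_plan and the grouping into centralized and distributed tools are assumptions made for the example. It builds the command lines a web front end might run to push one translation file upstream, so that the translator-facing code never needs to know which VCS the project uses.

```python
# Hypothetical sketch: one "submit a translation" interface over five VCS tools.
# Names here are invented for illustration and do not come from Transifex.

CENTRALIZED = {"cvs", "svn"}          # a commit goes straight to the server
DISTRIBUTED = {"bzr", "hg", "git"}    # a commit is local, so a push follows


def build_commit_plan(vcs, po_path, message):
    """Return the list of argv lists needed to send one PO file upstream."""
    if vcs in CENTRALIZED:
        return [[vcs, "commit", "-m", message, po_path]]
    if vcs == "git":
        # Stage the file first so that newly added translations are included.
        return [
            ["git", "add", po_path],
            ["git", "commit", "-m", message],
            ["git", "push"],
        ]
    if vcs in DISTRIBUTED:
        return [[vcs, "commit", "-m", message, po_path], [vcs, "push"]]
    raise ValueError("unsupported VCS: %s" % vcs)


# The caller only supplies the project's configured VCS name.
for vcs in ("cvs", "svn", "bzr", "hg", "git"):
    plan = build_commit_plan(vcs, "po/pt_BR.po", "Update pt_BR translation")
    print(vcs, "->", [" ".join(argv) for argv in plan])
```

The sketch only builds the commands; a real service would also need to run them in a checkout of the project, handle authentication, and deal with conflicts.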
This stands in contrast to Canonical's Rosetta, which is also a web-based
translation tool, but it is tightly integrated with Launchpad. That
requires that projects migrate to Launchpad to take advantage of the
translations made by Ubuntu users. Many projects are skittish about moving
to Launchpad, either due to its required use of Bazaar, or due to the
non-free nature (at least as yet) of the Launchpad code. No doubt there
are also projects that are happy with their current repository location
and are unwilling to move.
Because of the centralized nature of Rosetta, translations tend to get
trapped there, leading some to declare it a poor choice for doing
free software translations. Perhaps when Launchpad opens its code, and
support for more VCS systems is added, it may be a more reasonable choice.
For now, Transifex seems to have the right workflow for developers as well
as translators.
The 0.5 release adds a large number of new features to make
it even easier to use and to integrate with various projects. The data
model has been reworked to allow for arbitrary collections of projects (e.g.
Fedora 11 or GNOME), with multiple branches for each project. A lot of
work has also gone into handling different formats of localization files (such as
PO and POT formats), as well as supporting variants of languages for
specific countries or regions (e.g. Brazilian Portuguese).
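For readers who have not looked inside a PO file, the following Python sketch shows the basic shape of the format and how a tool could compute the completion statistics that a translation dashboard displays. It is a minimal illustration under stated assumptions, not how Transifex parses catalogs: the helper names are invented, and plural forms, message contexts, fuzzy flags, and escaped quotes are deliberately ignored.

```python
# Minimal sketch of reading a gettext PO catalog and measuring completeness.
# Only plain msgid/msgstr pairs with optional continuation lines are handled.

SAMPLE = '''\
msgid ""
msgstr ""
"Language: pt_BR\\n"

#: main.c:10
msgid "Open file"
msgstr "Abrir arquivo"

msgid "Save"
msgstr ""
'''


def _unquote(text):
    """Strip the surrounding double quotes from a PO string fragment."""
    text = text.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return ""


def parse_po(source):
    """Return a list of (msgid, msgstr) pairs from PO-formatted text."""
    entries = []
    msgid, msgstr, current = None, None, None
    for line in source.splitlines():
        line = line.strip()
        if line.startswith("msgid "):
            if msgid is not None:
                entries.append((msgid, msgstr or ""))
            msgid, msgstr, current = _unquote(line[6:]), None, "msgid"
        elif line.startswith("msgstr "):
            msgstr, current = _unquote(line[7:]), "msgstr"
        elif line.startswith('"') and current == "msgid":
            msgid += _unquote(line)
        elif line.startswith('"') and current == "msgstr":
            msgstr += _unquote(line)
        else:
            current = None  # blank line or comment ends any continuation
    if msgid is not None:
        entries.append((msgid, msgstr or ""))
    return entries


def completeness(entries):
    """Percentage of messages translated, skipping the empty-msgid header."""
    messages = [(i, s) for i, s in entries if i != ""]
    if not messages:
        return 0.0
    done = sum(1 for _, s in messages if s)
    return 100.0 * done / len(messages)


print("translated: %.0f%%" % completeness(parse_po(SAMPLE)))
```

The entry with an empty msgid is the catalog header, which carries metadata such as the language code; a regional variant like Brazilian Portuguese is conventionally identified there and in the file name as pt_BR.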
For users, most of whom would be translators, 0.5 has added RSS feeds to
follow the progress of translations for particular projects. User account
management has been collected into its own subsystem, with features like
self-service user registration and OpenID support for authentication. In
addition,
the VCS and localization layers are easily extensible to allow for supporting other
varieties of those tools. Transifex 0.5 has the look of a very solid release.
Glezos and others from the Transifex team have started a new company, Indifex, to produce a hosted version of
Transifex (at Transifex.net) that
will serve the same purpose as Wordpress.com
does for Wordpress blogs. Projects that don't want to host their own
Transifex installation can work with Indifex to set up a localization solution for
their code. Meanwhile, Indifex employees have been instrumental in the 0.5
rewrite and will be providing more development down the road.
Glezos outlined
their plans in a blog post in December.
Because of its openness, and its concentration on upstream-friendliness,
Transifex has an opportunity to transform localization efforts for free software
projects. There are a large number of willing translators out there, but
projects sometimes have difficulty hooking up with them. Transifex will
provide a place for translators and projects to come together. That
should result in lots more software available in native languages for many
more folks around the
world.
By Jonathan Corbet
March 21, 2009
Sometimes, even the best job can call for extraordinary sacrifices. Even
grumpy editorial jobs. Let it never be said that your editor is unwilling
to take one for his readers; why else would he choose to spend four hours
in the company of around 100 lawyers gathered to talk about software
patents? This event, entitled "Evaluating software patents," was held on
March 19 at the local law school.
The conversation was sometimes dry and often painful to listen to, but it
did provide an interesting view into how patent attorneys see the software
patent regime in the U.S. The following is a summary of the high points
from the four panels held at this event.
Should software patents exist?
It should come as little surprise that a panel full of patent lawyers turns
out to be supportive of the idea of software patents. Of all the
panellists present, only Jason
Mendelson was truly hostile to patenting software, and even he stopped
short of saying that they should not exist at all. The first speaker,
though, was John Duffy,
who cited language in a 1952 update to the patent code stating that "a
patentable process includes a new use of an old machine." That language,
he says, "fits software like a glove." So there is, he says, no basis for
any claims that software patents are not allowed by current patent law.
Beyond that, he says, the attempts to prevent the patenting of software for
many years did a great deal of damage. Keeping the patent office away from
software prevented the accumulation of a proper set of prior art, leading
to the current situation where a lot of bad patents exist. Software is an
engineering field, according to Duffy, and no engineering field has ever
been excluded from patent protection. That said, software is unique in
that it also benefits from copyright protection. That might justify
raising the bar for software patents, but does not argue against their
existence.
Damien
Geradin made the claim that there's no reason for software patents to
be different from any other kind of patent. The only reason that there is any fuss about
them, he says, is a result of the existence of the open source community;
that's where all the opposition to patents comes from. But he showed no
sign of understanding why that opposition exists; there is, he says, no
real reason why software patents should be denied.
Kevin Luo, being a Microsoft attorney, could hardly come out against
software patents. He talked at length about the research and development
costs at Microsoft, and made a big issue of the prevalence of software in
many kinds of devices. According to Mr. Luo, trying to make a distinction between
hardware and software really does not make a whole lot of sense.
Beyond their basis in legislation, patents should, according to the US
Constitution, serve to encourage innovation in their field. Do software
patents work this way? Here there was more debate, with even the stronger
patent supporters being hard put to cite many examples. One example that
did come up was the RSA patent, cited by Kevin Luo; without that patent, he
says, RSA Security would not have been able to commercialize public key
encryption. Whether this technique would not have been invented in
the absence of patent protection was not discussed.
Mr. Geradin noted that software patents are often used to put small
innovators out of business, which seems counter to their stated purpose.
But, he says, they can also be useful for those people, giving them a way
to monetize their ideas. Without patents, innovators may find themselves
with nothing to sell.
Jason
Haislmaier claimed, instead, that software patents don't really create
entrepreneurship; people invent because that is who they are. And he noted
that software patents are especially useless for startup companies. It can
currently take something like seven years to get a patent; by that time,
the company has probably been sold (or gone out of business) and the
inventors are long gone. Jason Mendelson, who does a lot of venture
capital work, had an even stronger view, using words like "worthless" and
"net negative." He claimed that startups are frequently sued for patent
infringement for the simple purpose of putting them out of business.
What's wrong with the patent system?
In general, even the panellists who were most supportive of the idea of
software patents had little good to say about how the patent system works
in the US currently.
For example,
Michael
Meurer, co-author of Patent
Failure, has no real interest in abolishing software patents, but
he argues that they do not work in their current form. Patents are
supposed to be a property right, but they currently "perform poorly as
property," with software patents being especially bad. That, he says, is
why software developers tend to dislike patents, something which
distinguishes them from practitioners of almost every other field. Patents
are afflicted by vague language and "fuzzy boundaries" that make it
impossible to know what has really been patented, so they don't really
deliver any rewards to innovators.
Mr. Meurer also noted that software currently features in about 25% of all
patent applications. That is a higher percentage than was reached by other
significant technologies - he cited steam engines and electric motors - at
their peak.
Mark Lemley
talked a bit about the effect of software patents on open source software.
Patents are a sort of arms-race game, and releasing code as open source is,
in his words, "unilateral disarmament." He talked about defending open
source with the "white knight" model - meaning either groups like the Open
Invention Network or companies like IBM. He also noted that patents
provide great FUD value for those opposed to open source.
A related topic, one which came up several times, is "inadvertent
infringement." This is what happens when somebody infringes on a patent
without even knowing that it exists - independent invention, in other
words. John Duffy said that the amount of inadvertent infringement going
on serves as a good measure of the health of the patent system in general.
In an environment where patents are not given for obvious ideas,
inadvertent infringement should be relatively rare. And, in some fields
(biotechnology and pharmaceuticals, for example), it tends not to be a
problem.
In the software realm, though, inadvertent infringement is a big problem.
Mark Lemley asserted a couple of times that actual copying of patented
technology is only alleged in a tiny fraction of software patent suits. In
other words, most litigation stems from inadvertent
infringement. Michael Meurer added that there is a direct correlation
between the amount of money a company spends on research and development
and the likelihood that it will be sued for patent infringement. In most
fields, he notes, piracy (his word) of patents is used as a
substitute for research and development, so one would ordinarily see
most suits leveled against companies which don't do their own R&D. In
software, the companies which are innovating are the ones being sued.
The other big problem with the patent system is its use as a way to put
competitors out of business. Rather than support innovation, the patent
system is actively suppressing it. Patent litigator Natalie Hanlon-Leh
noted that it typically costs at least $1 million to litigate a patent
case. John
Posthumus added that no company with less than about $50 million
in annual revenue can afford to fight a patent suit; smaller companies will
simply be destroyed by the attempt. Patent lawyers know this, so they
employ every trick they know to stretch out patent cases, making them as
expensive as possible.
Variation between the courts is another issue, leading to the well-known
problem of "forum shopping," wherein litigators file their cases in the
court which is most likely to give them the result they want. That is why
so many patent suits are fought in east Texas.
What is to be done about it?
Michael Meurer made the claim that almost every industry in the US would be
better off if the patent system were to be abolished; in other words,
patents serve as a net drain on the industry. But, being a patent
attorney, he does not want to abolish the patent system; instead he would like to see
reforms made. His preferred reforms consist mostly of tightening up claim
language to get rid of ambiguities and to reduce the scope of claims. He
would like to make the process of getting a patent quite a bit more
expensive, putting a much larger burden on applicants to prove that they
deserve their claims.
Mr. Meurer went further and singled out the independent inventor lobby as
being the biggest single impediment to patent reform in the US. In
particular, their efforts to block a switch from first-to-invent to
first-to-file priority (as things are already done in most of the rest of
the world) have held things up for years. What the lobby doesn't realize,
he says, is that if the patent system works better for "the big guys," they
will, in turn, be willing to pay more for patents obtained by the "little
guys." This sort of trickle-down patent theory was not echoed by any of
the other panellists, though.
Part of the problem is that the US Patent and Trademark Office (PTO) is
overwhelmed, with a backlog of over 1 million patent applications. So
patent applications take forever, and the quality control leaves something to be
desired. Some panellists called for funding the PTO at a higher level, but
this is unlikely to happen: the number of patent applications has fallen in
recent times, and there is a possibility that some application fees will be
routed to the general fund to help cover banker bonuses and other equally
worthy causes. The PTO is likely to have less money in the near future.
And, in any case, does it make sense to put more money into the PTO? Mark
Lemley is against that idea, saying that the money would just be wasted.
Most patents are never heard from again after issuance; doing anything to
improve the quality of those patents is just a waste. Instead, he (along
with others) appears to be in favor of the "gold-plated patent" idea.
Gold-plated patents are associated with another issue: the fact that, in US
courts, patents have an automatic presumption of validity. This presumption
makes life much easier for plaintiffs, but, given the quality of many
outstanding patents, some people think that the presumption should be
revisited and, perhaps, removed. Applicants who think they have an
especially strong patent could then apply for the gold-plated variety.
These patents would cost a lot more, and they would be scrutinized much
more closely before being issued. The idea is that a gold-plated patent
really could have a presumption of validity.
Others disagree with this idea. Gold-plated patents would really only
benefit companies that had the money to pay for them; everybody else would
be a second-class citizen. Anybody who was serious about patents would
have to get them, though; they would really just be a price hike in
disguise.
There was much talk of patent reform in Congress - but little optimism. It
was noted that this reform has been held up for several years now, with no
change in sight. There was disagreement over who to blame (Mark Lemley
blames the pharmaceuticals industry), but it doesn't seem to matter. John
Duffy noted that the legislative history around intellectual property is
"not charming"; he called the idea that patent law could be optimized a
"fantasy." Mark Lemley agreed, noting that copyright law now looks a lot
like the much-maligned US tax code, with lots of specific industry rules.
Trying to adapt slow-moving patent law to a fast-moving industry like
software just seems unlikely to work.
What Mark suggests, instead, is to reform patent law through the courts.
Indeed, he says, that is already happening. Recent rulings have made
preliminary injunctions much harder to get, raised the bar for
obviousness, restricted the scope of business-model patents, and more.
Most of the complaints people have had, he says, have already been fixed.
John Duffy, instead, would like to "end the patenting monopoly." By this
he means the monopoly the PTO has on the issuing of patents. Evidently
there are ways to get US-recognized patents from a few overseas patent
offices now, and those offices tend to be much faster. He also likes the
idea of having private companies doing patent examination; this work would
come with penalties for granting patents which are later invalidated.
Eventually, he says, we could have a wide range of industry-specific patent
offices doing a much better job than we have now.
Conclusion
There was a brief discussion of the practice of not researching patents at
all with the hope of avoiding triple damages for "willful infringement."
The participants agreed that this was a dangerous approach which could
backfire on its practitioners; convincing a judge of one's ignorance can be
a challenge. But it was also acknowledged that there is
no way to do a full search for patents which might be infringed by a given
program in any case.
All told, it was a more interesting afternoon than one might expect. The
discussion of software patents in the free software community tends to
follow familiar lines; the people at this event see the issue differently. For
better or worse, their view likely has a lot of relevance to how things
will go. There will be some tweaking of the system to try to avoid the
worst abuses - at least as seen by some parts of the industry - but
wholesale patent reform is not on the agenda. Software patents will be
with us (in the US) for the foreseeable future, and they will continue to
loom over the rest of the world. We would be well advised to have our
defenses in place.
March 25, 2009
This article was contributed by Nathan Willis
The Parrot project released version
1.0 of its dynamic language interpreting virtual machine last week, marking
the culmination of seven years of work. Project leader Allison Randal
explains that although end users won't see the benefits yet, 1.0 does mean
that Parrot is ready for serious work by language implementers. General
developers can also begin to get a feel for what working with Parrot is like
using popular languages like Ruby, Lua, Python, and, of course, Perl.
The evolution of Parrot
Parrot originated in 2001 as the planned interpreter for Perl 6, but
soon expanded its scope to provide portable compilation and execution for
Perl, Python, and any other dynamic language. In the intervening
years, the structure of the project solidified — the Parrot team
focused on implementing its virtual machine, refining the bytecode format,
assembly language, instruction formats, and other core components, while
separate teams focused on implementing the various languages, albeit
working closely with the core Parrot developers.
The primary target for 1.0 was to have a stable platform ready for
language implementers to write to, and a robust set of compiler tools
suitable for any dynamic language. The 1.4 release, tentatively set for
this July, will target general developers, and next January's 2.0 should be
ready for production systems.
The promise of Parrot is tantalizing: rather than separate runtimes for
Perl, Python, Ruby, and every other language, a single virtual machine that
can compile each of them down to the same instruction set and run them.
That opens the possibility of applications that incorporate code and call
libraries written in multiple languages. "A big part of development
these days isn't rolling everything from scratch, it's combining existing
libraries to build your product or service,"
Randal said. "Access to multiple languages expands your available
resources, without making you learn the syntax of a new language. It's also
an advantage for new languages, because they can use the libraries from
other existing languages and get a good jump-start."
The Parrot VM itself is register-based, which the project says
better mirrors the design of underlying CPU hardware and thus permits
compilation to more efficient native machine language than the stack-based
VMs used for Java and .NET. It provides separate registers for integers,
strings, floating-point numbers, and "polymorphic containers" (PMCs; an
abstract type allowing language-specific custom use), and performs garbage
collection. Parrot can directly execute code in its own native Parrot
Bytecode (PBC) format, and uses just-in-time compilation to run programs
written in higher-level host languages. In addition to PBC, developers and
compilers can also generate two higher-level formats: Parrot Assembly
(PASM) and Parrot Intermediate Representation (PIR). A fourth format,
Parrot Abstract Syntax Tree (PAST), is designed specifically for compiler
output. The differences between them, including the level of detail
exposed, are documented
at the Parrot web site.
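The difference between the two VM styles can be illustrated with a toy interpreter in Python. This is a sketch only: the opcodes below are invented for the example and are not Parrot's instruction set, and the register names merely imitate the look of integer registers. Both functions compute (a + b) * c for a = 2, b = 3, c = 4.

```python
# Toy comparison of stack-based and register-based execution.
# The opcodes are made up for this sketch; they are not Parrot's.

def run_stack(program, variables):
    """Operands are implicit: each operation pops inputs and pushes a result."""
    stack = []
    for op, *args in program:
        if op == "load":
            stack.append(variables[args[0]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError("unknown op: %s" % op)
    return stack.pop()


def run_register(program, registers, result):
    """Operands are explicit: each operation names a destination and sources."""
    regs = dict(registers)
    for op, dest, left, right in program:
        if op == "add":
            regs[dest] = regs[left] + regs[right]
        elif op == "mul":
            regs[dest] = regs[left] * regs[right]
        else:
            raise ValueError("unknown op: %s" % op)
    return regs[result]


STACK_PROGRAM = [
    ("load", "a"), ("load", "b"), ("add",),
    ("load", "c"), ("mul",),
]

REGISTER_PROGRAM = [
    ("add", "I3", "I0", "I1"),   # I3 = I0 + I1
    ("mul", "I3", "I3", "I2"),   # I3 = I3 * I2
]

print(run_stack(STACK_PROGRAM, {"a": 2, "b": 3, "c": 4}))
print(run_register(REGISTER_PROGRAM, {"I0": 2, "I1": 3, "I2": 4}, "I3"))
```

With the values already held in registers, the register program needs two instructions where the stack program needs five, since the stack machine must shuffle each operand onto the stack before using it. That is the general shape of the efficiency argument, though real performance depends on far more than instruction counts.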
Parrot includes a suite of core libraries that implement common data
types like arrays, associative arrays, and complex numbers, as well as
standard event, I/O, and exception handling. It also features a
next-generation regular expression engine called Parser Grammar Engine
(PGE). PGE is actually a fully-functional recursive descent parser, which
Randal notes makes it a good deal more powerful than a standard regular
expression engine, and a bit cleaner and easier to use.
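To see why a recursive descent parser is more powerful than a classical regular expression, consider arbitrarily nested parentheses, which a plain regular expression cannot match in general. The Python sketch below is a generic example of the technique and makes no attempt to reproduce PGE or its grammar syntax; the grammar and all names are assumptions chosen for illustration. Each grammar rule becomes one method, and nesting is handled by the methods calling each other recursively.

```python
# Generic recursive descent parser for integer arithmetic with parentheses.
# Grammar (one method per rule):
#   expr   := term (('+' | '-') term)*
#   term   := factor ('*' factor)*
#   factor := NUMBER | '(' expr ')'
import re

TOKEN = re.compile(r"\s*(\d+|[-+*()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value = value + self.term()
            else:
                value = value - self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.take()
            value = value * self.factor()
        return value

    def factor(self):
        token = self.take()
        if token == "(":
            value = self.expr()          # recursion handles any nesting depth
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError("unexpected token: %r" % (token,))


def evaluate(text):
    """Parse and evaluate an arithmetic expression."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value


print(evaluate("2 * (3 + (4 - 1))"))
```

The regular expression is used only to split the input into tokens; the structure of the language lives in the mutually recursive methods, which is what lets the parser reject unbalanced input such as "(1 + 2".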
The project plans to keep the core of Parrot light, however, and extend
its functionality through libraries running on the dynamic languages that
Parrot interprets. Keeping the core as small as possible will make Parrot
usable on resource-constrained hardware like mobile devices and embedded
systems.
Language experts wanted
The "getting
started" documentation includes sample code written in PASM and PIR,
but it is the high level language support that interests most developers.
The project site maintains a list of active efforts to
implement languages for the Parrot VM. As of today, there are 46 projects
implementing 36 different languages. Three of the most prominent are Rakudo, the implementation of
Perl 6 being developed by the Perl community; Cardinal, an implementation
of Ruby; and Pynie, an
implementation of Python. Among the rest there is serious work pursuing
Lua and Lisp variants, as well as work on novelty languages such as Befunge and
LOLCODE. Not all are complete, but Randal said development has accelerated
in recent months after the 1.0 release date was announced, and she expects
production-ready releases of the key languages soon.
Language implementers come from within the Parrot project and from the
language communities themselves. As Randal explained it, "we see it
as our responsibility as a project to develop the core of the key language
implementations, and to actively reach out to the language
communities."
1.0 includes a set of parsing utilities called the Parrot
Compiler Tools (PCT) to help implement dynamic languages on the Parrot
VM. PCT includes the PGE parser, as well as classes to handle the lexical
analyzer and compiler front-end, and to create the driver program that
Parrot itself will call to run the compiler. Owing to its
Perl heritage, PCT uses a subset of Perl 6 called Not Quite Perl (NQP).
Developer
documentation for NQP and all of the PCT components is available with
Parrot 1.0 as well as on the Parrot Developer
Wiki.
Parrot packages have been available for many Linux distributions and
BSDs for much of its development cycle, but now that it has reached 1.0,
Randal expects to see it ship by default in upcoming releases. For now,
however, developers and language implementers interested in testing and
running Parrot 1.0 can download source code releases
from the project's web site or check out a copy from its Subversion
repository. Building Parrot requires Perl, a C compiler, and a standard
make utility.
Parrot has been a long time in coming, but now that 1.0 is out of the
gate, the real work can begin, as the major language projects make their
own stable releases and developers start to use the Parrot VM as a runtime
environment. Although the technical work continues at full pace, Randal
said the project is also pushing forward on the education and outreach
front, with a book soon to be published through Onyx Neon Press, and Parrot
sessions planned for upcoming open source conferences and workshops as
well.
Page editor: Jonathan Corbet
Inside this week's LWN.net Weekly Edition
- Security: Linux botnets; New vulnerabilities in bugzilla, ffmpeg, libvirt, pam,...
- Kernel: The return of utrace; Union file systems: Implementations, part I; Nftables.
- Distributions: Moblin 2 Core Alpha; Novell Ships SUSE Linux Enterprise 11; openSUSE Build Service 1.5; Igelle PC/Desktop; Distributions: The big and the small (The H); Interview with Robert Shingledecker, creator of Tiny Core Linux (DistroWatch Weekly).
- Development: A first look at Xfce 4.6, new versions of JACK, CML, SOGo, iptables, OpenSIPS, circuits, Rails, TikiWiki, Puppet, Rockbox, xpra, Libertine, Wine, Thunderbird, OpenGoo, ETS, AmFast, IcedTea7, Rakudo Perl, TestLink, bzr, Mercurial, monotone.
- Press: Fixing Unix/Linux/POSIX Filenames, ramifications of IBM acquiring Sun, TomTom sues Microsoft, Benchmarking recent kernels, vagaries of mobile broadband cards, Django review, 64 Studio and Ardour, GNOME 2.26 review.
- Announcements: GSoC announcements, the JavaScript trap, Sugar Learning Platform, TomTom joins OIN, FSF awards, EuroSciPy cfp, LGM cfp, UKUUG conf cfp, FSFE licensing workshop, KDE brainstorm forum.