By Jonathan Corbet
December 3, 2008
The MySQL development team decided to celebrate the (US) Thanksgiving
holiday with the release of
MySQL
5.1.30, the first "general availability" (read "production-ready")
release in the 5.1 series. There is a lot of good stuff in 5.1.30,
including table partitioning, row-based replication, a new plugin API, a
built-in job scheduler, and more; see
the
nutshell summary for more information. It's a celebration point for a
long development series; the MySQL developers are to be congratulated for
what they have accomplished with this release.
Behind the celebration, though, one can hear the grumbling from unhappy
developers and users. This release has been a long time in coming; the
first 5.0 GA release was in October, 2005 - just over three years ago. The
first 5.1 release candidate (5.1.22) came out in September,
2007; seven more "release candidates," many with major changes, were
announced over the following 14 months. So the 5.1 production release
came rather later than desired, but some developers feel that it was still too
soon; the complaints reached a climax in this
lengthy posting from Michael "Monty" Widenius, the original creator of
MySQL. His point of view, in short, is that this release has fatal bugs,
and that these bugs come from a number of flaws in how MySQL development is
managed.
Your editor cannot claim to be an expert on the MySQL development
community. But Monty, presumably, is an expert on this community,
so his observations have a higher than usual likelihood of reflecting
something close to reality. Reading various dissenting posts (example)
has done little to make your editor feel otherwise.
And, in any case, much of what Monty says rings true when compared against
experiences from elsewhere in the free software community. As projects
grow, they must occasionally revisit their development models. There is
little happening here which is truly unique to MySQL.
Monty asserts:
MySQL 5.1 was declared beta and RC way too early. The reason MySQL
5.1 was declared RC was not because we thought it was close to
being GA, but because the MySQL manager in charge *wanted to get
more people testing MySQL 5.1*. This didn't however help much,
which is proved by the fact that it has taken us 14 months and 7
RC's before we could do the current "GA". This caused problems for
developers as MySQL developers have not been able to do any larger
changes in the source code since February 2006!
Two things jump out of that statement. One is that MySQL apparently
suffers from an inadequate testing community. Needless to say, that
is not a problem which is unique to this project; testing is a scarce
resource throughout our community. MySQL users who are unhappy with the
results of the development process might want to ask themselves if they are
doing enough to help with the testing process. Like it or not, testing
software and finding bugs is one of the costs of "free" (beer) software.
If this testing doesn't happen during the development cycle, it will end up
happening with the "stable" releases instead.
The other attention-getter above is the statement that MySQL developers
have been unable to make major changes since early 2006. One need only
think back to the 2.4 kernel days to see the kind of damage that can result
from pent up "patch pressure." Developers get frustrated, major changes
start to find their way into "release candidate" code, and the number of
bugs tends to increase. The existence of a separate MySQL 6
development branch helps, perhaps, in reducing patch pressure, but it can
also distract developers from stabilizing current release
candidates.
Related to this is another assertion:
Too many new developers without a thorough knowledge of the server
have been put on the product trying to fix bugs. This in combined
with a failing review process have introduced of a lot new bugs
while trying to fix old bugs.
Review would appear to be a big part of the problem in general. It may
well be that a failure of review has caused the introduction of new bugs
with fixes. But one could argue that the problem is deeper than that: any
code which failed to stabilize over fourteen months of release candidates
should, almost certainly, never have been merged into the MySQL trunk to begin
with. It seems that there are not enough eyeballs being applied to major
new features before they go in.
Your editor has resisted the temptation to
make comparisons with other relational database manager projects, but
there is value in comparing this state of affairs with the review problems faced by
PostgreSQL in recent years. An inability to get additions to
PostgreSQL properly
reviewed has resulted in those additions not being merged. That, in turn,
leads to delayed releases with fewer than the desired number of features,
neither of which is particularly pleasing for users or developers. But, on
the other hand, PostgreSQL does not appear to have the same kind of trouble
stabilizing its major releases.
Perhaps the key point to take away from all of this, though, is here:
In addition, the MySQL current development model doesn't in
practice allow the MySQL community to participate in the
development of the MySQL server.
MySQL is very much a corporate-owned, corporate-driven project, and it has
been for a long time. Decisions on what to include are made internally;
there is little discussion of development decisions on the project's
mailing lists. It is hard to find information on how to contribute to the
project; some
of the available information still tells prospective contributors to
use BitKeeper. All code is copyrighted by MySQL (now Sun), which reserves
(and uses) a right to distribute that code under proprietary licenses.
All of the above reflects an arrangement which has worked well for years,
and which has produced an immensely valuable database manager used
by vast numbers of people. But it is not a community
project, so development decisions will not necessarily reflect the best
interests of the wider user or developer communities. If, as Monty suggests,
those decisions are made in ways which favor features and deadlines over
quality, there will be little that the community can do about it.
By Jake Edge
December 3, 2008
On the kernel page a few weeks ago, we took a look at KSM, a technique to
reduce memory usage by sharing identical pages. Currently proposed for
inclusion in the mainline kernel, KSM implements a potentially
useful—but not particularly new—mechanism. Unfortunately,
before it can be examined on its technical merits, it may run afoul of what
is essentially a political problem: software patents.
The basic idea behind KSM is to find memory pages that have the same
contents, then arrange for one copy to be shared amongst the various
users. The kernel does some of this already for things like shared
libraries, but there are numerous ways for identical pages to get created
that the kernel does not know about directly, thus cannot coalesce.
Examples include initialized memory (at startup or in caches) from
multiple copies of the same program and virtualized guests that are running
the same operating system and application programs.
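The idea can be illustrated with a small user-space model. What follows is only a sketch under stated assumptions, not KSM's actual implementation: the `PageTable` class, its method names, the 4096-byte page size, and the use of SHA-1 to find merge candidates are all invented for the illustration. Identical pages are found by hashing, confirmed with a full comparison, and then mapped to one shared frame; a later write breaks the sharing by making a private copy.

```python
import hashlib
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size, for illustration only


class PageTable:
    """Toy model: each (owner, virtual page number) maps to a physical frame."""

    def __init__(self):
        self.frames = {}     # frame id -> page contents (bytes)
        self.refcount = {}   # frame id -> number of mappings sharing it
        self.mappings = {}   # (owner, vpn) -> frame id
        self.next_frame = 0

    def _new_frame(self, data):
        frame = self.next_frame
        self.next_frame += 1
        self.frames[frame] = bytes(data)
        self.refcount[frame] = 1
        return frame

    def map_page(self, owner, vpn, data):
        assert len(data) == PAGE_SIZE
        self.mappings[(owner, vpn)] = self._new_frame(data)

    def read(self, owner, vpn):
        return self.frames[self.mappings[(owner, vpn)]]

    def merge_identical(self):
        """Share one frame among identical pages; return the frames freed."""
        buckets = defaultdict(list)  # digest -> frames kept with that digest
        freed = 0
        for key in sorted(self.mappings):
            frame = self.mappings[key]
            digest = hashlib.sha1(self.frames[frame]).digest()
            for candidate in buckets[digest]:
                if candidate == frame:
                    break  # this mapping already uses the shared frame
                # A hash match is only a hint; confirm with a full comparison.
                if self.frames[candidate] == self.frames[frame]:
                    self.mappings[key] = candidate
                    self.refcount[candidate] += 1
                    self.refcount[frame] -= 1
                    if self.refcount[frame] == 0:
                        del self.frames[frame], self.refcount[frame]
                        freed += 1
                    break
            else:
                buckets[digest].append(frame)
        return freed

    def write(self, owner, vpn, offset, value):
        """Write one byte, copying the page first if it is shared."""
        key = (owner, vpn)
        frame = self.mappings[key]
        if self.refcount[frame] > 1:
            # Copy-on-write: give the writer a private copy of the page.
            self.refcount[frame] -= 1
            frame = self._new_frame(self.frames[frame])
            self.mappings[key] = frame
        page = bytearray(self.frames[frame])
        page[offset] = value
        self.frames[frame] = bytes(page)
```

Mapping the same zero-filled page into two hypothetical guests and calling `merge_identical()` releases one frame; a subsequent `write()` by either guest leaves the other guest's view of the page untouched.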
Unfortunately, as Dmitri Monakhov points out, the KSM technique
appears to be patented by
VMware. A patent for "Content-based, transparent sharing of memory
units" was filed in July 2001 and granted in September 2004. The abstract
seems to clearly cover the ideas behind KSM:
[...] The context, as opposed to merely
the addresses or page numbers, of virtual memory pages that [are]
accessible to
one or more contexts are examined. If two or more context pages are
identical, then their memory mappings are changed to point to a single,
shared copy of the page in the hardware memory, thereby freeing the memory
space taken up by the redundant copies. The shared copy is ten preferable
[sic]
marked copy-on-write. Sharing is preferably dynamic, whereby the presence
of redundant copies of pages is preferably determined by hashing page
contents and performing full content comparisons only when two or more
pages hash to the same key.
It should be noted that the abstract has no legal bearing; that comes from
the claims, always tortuously worded, which can be seen at the
link above. In this case, as far as
can be determined, the claims and abstract are in close agreement.
The dates above are rather important because there is some "prior art" to
consider, namely the mergemem patch
first announced
in March of 1998. It is substantially the same as the patented idea: it
looks for identical "context pages", then changes the memory mappings to
point to a single copy-on-write page. This would seem to be a clear
example of the idea being implemented well before the patent was filed, so
it should invalidate the patent. As with everything surrounding
software patents, though, it isn't as easy as that.
In order to invalidate a patent, either a court must rule that way or the
patent office must be convinced to re-examine it, then find that the prior
art makes it invalid. Both of these methods
take time and usually money and lawyers as well. Free software projects
may have time, but the other two are typically out of reach. Alan Cox suggests that "perhaps the
Linux Foundation and
some of the patent busters could take a look at mergemem and
re-examination". While that might eventually resolve the problem,
it is a multi-year process at best.
The folks behind the KSM project are some of the kvm hackers from
Qumranet—which is now part of Red Hat. It is certainly conceivable
that VMware might consider kvm a competitor and try to use this patent as a
"competitive" weapon. That concern is probably enough to keep KSM out of
the mainline until the issue is resolved.
There is a much quicker resolution available, should VMware wish to pursue it.
As IBM has done with the RCU patent, VMware could license its patent for
use in GPL-licensed code. There is much to be gained by doing that, at
least in terms of positive community relations, and there is little to be
lost—unless VMware truly believes that the patent will stand up to
scrutiny. Both VMware and its parent, EMC, are members of the Linux
Foundation, so one could see a role for the foundation in helping to put
that kind of agreement together.
The original mergemem idea did not make it into the kernel, but the code is
still available for those running Linux 2.2.9. It appears that it was not
pushed very
hard in the face of some security concerns—which will need to be
addressed by KSM as well. A process could create a page of memory with
known contents, then, after waiting for the checker process (or kernel
thread) to run, see whether memory usage has changed. Based on that
information, it can determine whether other processes have a page with
identical values. It would seem rather difficult to exploit, but it clearly
does allow some information to leak.
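The shape of that leak can be shown with a deliberately simplified model. This is a sketch, not an exploit: the function names are hypothetical, page contents are stood in for by short byte strings, and the "scanner" is assumed to merge every set of identical pages perfectly. If the scanner merges the attacker's page with someone else's, the frame the attacker just allocated is released again, and that is the observable signal.

```python
def frames_after_scan(pages):
    """Frames in use once an idealized scanner has merged identical pages."""
    return len(set(pages))


def probe(other_pages, guess):
    """Return True if some other process appears to hold a page equal to `guess`.

    other_pages: page contents held by the rest of the system (unknown to
                 a real attacker, who only sees the memory usage figures).
    guess:       contents of the page the attacker creates.
    """
    # Usage right after the attacker allocates its page, before any merge.
    before_merge = frames_after_scan(other_pages) + 1
    # Usage after the scanner has run over everything, attacker's page included.
    after_merge = frames_after_scan(list(other_pages) + [guess])
    # A drop means the attacker's page was merged into an existing one.
    return after_merge < before_merge
```

A real attacker would have to infer the two usage figures from system statistics or from timing, with plenty of noise from other activity, which is much of why the attack looks hard to carry out in practice.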
It will come as no surprise to most LWN readers that software patents are an
increasingly dense minefield that can derail free software projects.
Unfortunately, it is the kind of problem that has no solution in the
technical domain where such projects excel. The political arena is where
any solution will have to come from, though there seems to be some hope
that judicial opinions (like the Bilski decision) may limit the scope of
the damage. It is a problem that we are likely to see more frequently
until there is some kind of resolution.
December 3, 2008
This article was contributed by Marco Fioretti
I recently spoke at the Congress on Free
Software and Democratization of Knowledge hosted in Quito by the
Universidad Politecnica Salesiana of Ecuador. My general report about the
conference and Free as in Freedom knowledge in that country is at the P2P
Foundation blog. The trip, however, was also an excellent occasion to
check out the most interesting Free Software projects currently taking
place in Ecuador. It turns out that there is a lot of activity at the
Government level to promote Free Software, and interesting news from some
cool projects developed locally.
FOSS in the Government
A recent presidential decree mandates that most national Public
Administrations migrate entirely to Free Software. Ing. Mario Albuja, head
of the Subsecretariat for Information Technology of the Presidency of
Ecuador, explained during the congress the reasons and the general
guidelines of this initiative. Later on, I was able to get more details in
a couple of meetings with the members of his staff. Among the most
important things going on right now are the studies and tests for a
Government digital signature application which runs on GNU/Linux and a
unified document management system for 45 central Public
Administrations. There is also a field trial of the GPL hospital management
software Care2X in the works.
The initial implementation of the digital signature project, which uses
Free Software whenever possible, is based on keys and digital certificates
stored on SafeNet iKey 2032 USB
tokens from Entrust. The first official field test will take place in
the coming weeks, when President Correa himself will use one such key to sign
a decree. The Certificate Authority infrastructure which will issue keys
and certificates is the same one implemented
by Banco Central del Ecuador in November 2007.
The software application itself runs inside any browser. A PostgreSQL
backend stores all the documents, together with administrative metadata, on
a CentOS-based server. The decrees waiting for electronic signature are
presented to the user via a simple Apache/PHP front-end. The actual digital
signature happens through a Java applet which reads the encrypted key from
the USB token thanks to libraries provided by Entrust.
Another big step in the process of freeing Ecuadorian institutions from
proprietary software will be the formal ratification of OpenDocument 1.0 by
the Ecuadorian Institute of Standards
(INEN). Large-scale usage of this format for public documents
should take off right after that, around mid-2009.
All the public officials I talked with really believe in the potential of
Free Software for a developing country like Ecuador. This only makes more
relevant, and worthy of careful consideration, a comment I got from them:
there is, they say, no coordination or common vision among the developers
of the
several FOSS applications they need to deploy. This was no surprise, of
course: people at the Subsecretariat understand how FOSS development
works. Nevertheless, the fact that there is no unified, local, reliable
source for support, with predictable, if not guaranteed, response times, is
creating more problems for them than they expected when they began. There may
be quite a business opportunity here for local FOSS entrepreneurs.
Talking with hackers
Rafael Bonifaz told me what's
new in the Elastix world. In case you have never heard of it, Elastix is a specialized GNU/Linux
distribution born and (mostly) developed in Ecuador. Its goal is
to solve all the communication problems of organizations of any
size. Elastix integrates in one easy-to-administer package all you need to
have PBX, VoIP, email, instant messaging, fax and fax/email gateway through
Asterisk, Hylafax, Postfix and Openfire
for Jabber. You can manage all the PBX functions with a customized
version of freepbx. Other tools
developed by the Elastix team provide hardware detection, centralized
automatic configuration of phones and billing support with a2billing.
Elastix is doing great in Ecuador: RTS and Aerolineas
Galapagos (Aerogal), which are respectively one of the most important
TV channels and one of the main domestic airlines in Ecuador, are using
it. In particular, Aerogal is running its call center on Elastix, which is also being
deployed in the Ministry of Public Health.
Rafael, who is the current coordinator of the Elastix Community, is also
proud of the fact that Elastix is the only GNU/Linux distribution for
communications which has two manuals, totaling about five hundred
pages, freely downloadable from the Internet: Elastix
Without Tears [PDF] by Ben Sharif and Unified
communications with Elastix [PDF] by Edgar Landivar. The second manual is
still a beta version, currently available only in Spanish. There already
is, however, a new mailing list
devoted to coordinating all the translation efforts for this second
book.
Also thanks to Rafael, after learning about Elastix I met a local group of
Java developers who have very recently begun developing a new, interesting
content management system called Melenti.
Adrian Cadena, member of the Melenti team, explained to me that he and his
partners needed a GPL-licensed, friendly, easy-to-use, and fast CMS that
could scale well from personal web pages to corporate portals. Another must
on their requirement list was ease of integration with enterprise software
(Java or not) for ERP, CRM and SAP services. That's why, three months ago,
after some unsatisfactory experiences with the popular Joomla CMS they started writing Melenti.
One of the main features of Melenti should be performance under high
loads. Adrian said they are aiming for something able to handle hundreds of
thousands of clicks per second, something which Joomla "simply could not
handle, when we tried it". Melenti administrators, by contrast, would be
able to configure load balancing without problems, thanks to an interface
based on JNDI
and other tools.
Melenti should run on any JEE infrastructure, from Websphere to JBoss, BEA,
Oracle AS, Tomcat, Jetty and more. According to Adrian, Melenti will also
be much simpler to set up and extend than most other GPL software for
Content Management.
Installation should be as simple as dropping a .war file into your flavor
of JEE container and following the steps of the graphical wizard which will
pop up. Writing Melenti "gadgets", that is plugins, should also be easier
than with Joomla, Drupal, PHP-Nuke and similar products. This is because, says
Adrian, "unlike those products, Java has worldwide standards like
Spring, JPA, JSF, GWT and so on: new developers can just take a look at the
core Melenti API and start writing their own gadgets in no time."
The first releases of Melenti will support basic CMS functions like
management of web pages, images and other files. There will also be
interfaces for banner rotation, creation of user polls and a Web Services
Creator. The latter is a simple wizard to create Web Services from existing
Melenti gadgets. The first alpha version of Melenti
has just been uploaded to SourceForge. You're obviously welcome to have
a look at the code and to participate in the development of Melenti.
Let's now go back to the reason why I went to Quito, that is, Free Software
and Democratization of Knowledge. Quiliro Ordonez, with one friend
and other occasional volunteers, is now implementing in the field a project
first announced
in 2007: placing Free Software in a school of the community of
Quilapungo, south of Quito, which serves about 200 students.
Thus far, Quiliro has installed 2 servers and 4 thin clients running
gNewSense. He chose this
distribution because it is "100% free software, without non-free
repositories or blobs in the kernel which promote functionality before
anything else, as this would weaken our position for freedom." He's
also very happy with TCOS, which
made setting up the thin clients a breeze. The school staff will use Proyecto Alba, a modular
administration and planning software for schools first developed in
Argentina. While gNewSense worked fine out of the box, Quiliro and his
partners had to localize Alba to adapt it to the terminology and procedures
used in Ecuadorian schools.
Eventually, the school in Quilapungo will have about 40 GNU/Linux
workstations, but Quiliro doesn't plan to stop there. If all goes well,
Quilapungo will be presented as a pilot project in a proposal for Free
Software deployment in all public schools in Ecuador. Let's wish Quiliro
good luck!
Page editor: Jake Edge