By Jake Edge
April 22, 2009
Keeping up with an active distribution like Fedora consumes a fair amount
of time, but also bandwidth. Depending on how often a
yum update is performed, hundreds of megabytes—or even
gigabytes—can be required to bring the system up to date. A recent
experiment in rawhide uses deltarpms and the yum Presto
plugin to
significantly reduce the size of the packages that need to be retrieved.
The experiment looks to be largely successful, which means that Fedora will
likely make the deltarpm files available more widely as part of Fedora 11.
The idea behind deltarpms is not a particularly new one, but the
visibility has been raised by the recent Fedora Presto
test day. The tools to
build deltarpms were originally created by Michael Schröder of SUSE
and have been around for a few years.
Basically, the tools generate a binary difference
(i.e. diff) between the new and old rpm files and create an rpm that just
contains the differences (a drpm). Because package changes are
typically fairly small and localized, the size difference between the new
rpm and the drpm can be quite substantial.
The deltarpm tools do not require that the old rpm be present on the system
when installing; instead, they can reconstruct the state of the old rpm from
the installation itself. As long as there is a drpm corresponding to
the difference between the version currently installed and the version that
needs to be installed, Presto will choose the more bandwidth-efficient
package to download. If the deltarpm tools are unable to reconstruct the
new rpm from the installed files and drpm—due to a local
configuration file change for example—Presto will fall back to
downloading the full rpm of the updated package.
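The selection and fallback behavior described above can be sketched in a few lines of Python. This is an illustration only, not Presto's actual code: the `FullRpm` and `Delta` records, the `pick_delta()` and `plan_update()` functions, and the `rebuild` callback are all hypothetical names invented for the example, and the "delta must be smaller than the full rpm" check is an assumption borrowed from how drpms are published rather than something Presto is known to test.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple, Union


@dataclass(frozen=True)
class FullRpm:
    """Hypothetical record for a full package in the repository."""
    name: str
    version: str
    size: int  # bytes


@dataclass(frozen=True)
class Delta:
    """Hypothetical record for a drpm: a diff between two versions."""
    name: str
    from_version: str
    to_version: str
    size: int  # bytes


def pick_delta(installed_version: str, target: FullRpm,
               deltas: Iterable[Delta]) -> Optional[Delta]:
    """Return a drpm that goes from the installed version to the target."""
    for delta in deltas:
        if (delta.name == target.name
                and delta.from_version == installed_version
                and delta.to_version == target.version
                and delta.size < target.size):  # assumed sanity check
            return delta
    return None


def plan_update(installed_version: str, target: FullRpm,
                deltas: Iterable[Delta],
                rebuild: Callable[[Delta], bool]
                ) -> Tuple[str, Union[Delta, FullRpm]]:
    """Prefer the drpm; fall back to the full rpm if none fits or the
    rebuild from installed files fails (e.g. a changed config file)."""
    delta = pick_delta(installed_version, target, deltas)
    if delta is not None and rebuild(delta):
        return ("drpm", delta)
    return ("rpm", target)
```

Here `rebuild` stands in for the deltarpm step that reconstructs the new rpm from the installed files plus the drpm; passing a function that returns `False` exercises the fallback path.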
For rawhide users, trying Presto out is quite simple:
    yum install yum-presto
which will install and enable the Presto plugin. Using it to update
rawhide on April 22 would normally have required 68M, but using the
drpms available (20 of 21 packages that needed updating) reduced that
to 23M for a 66% reduction. There is a substantial pause after the
packages have been downloaded while the deltarpm tools rebuild the rpms
from drpms—in this case something on the order of one to two minutes.
For someone at the end of a low-medium bandwidth link (or someone who pays by
the amount transferred), that tradeoff is likely to be a good one.
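As a rough check on that tradeoff, the April 22 figures can be worked through in Python. This is a back-of-the-envelope sketch under stated assumptions: "M" is read as megabytes, the rebuild pause is taken as the one to two minutes observed above, and the function names are made up for the example. It ignores CPU cost and any per-package overhead.

```python
def reduction_percent(full_mb: float, delta_mb: float) -> float:
    """Percentage of download volume saved by using drpms."""
    return 100.0 * (full_mb - delta_mb) / full_mb


def break_even_mb_per_s(full_mb: float, delta_mb: float,
                        rebuild_seconds: float) -> float:
    """Link speed (MB/s) at which the time saved downloading equals the
    time spent rebuilding; on slower links the drpm route finishes sooner."""
    return (full_mb - delta_mb) / rebuild_seconds


# Figures from the rawhide update described above.
full_mb, delta_mb = 68, 23

print(round(reduction_percent(full_mb, delta_mb)))  # 66

# Assumed rebuild pause of one and two minutes.
for seconds in (60, 120):
    print(seconds, break_even_mb_per_s(full_mb, delta_mb, seconds))
```

Under these assumptions, drpms save wall-clock time as well as bandwidth on any link slower than roughly 0.4 to 0.75 megabytes per second; on faster links the saving is in bytes transferred only.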
There are still a few infrastructure glitches on the Fedora side. Part of
the reason for the test day and publicizing the new feature was to find and
fix those problems before Fedora 11 ships. Because of the way
the deltarpm tools work—reading both rpms into memory before doing
the diff—and how the Fedora infrastructure builds rpms for all
architectures in parallel, only packages smaller than 200M are currently
turned into drpms. There are also questions about whether it makes sense
to build source and debuginfo drpms. Those types of packages are not
widely used so spending repository space and build resources on drpm
versions may not be warranted. From a user perspective, though, it all
works quite smoothly: install a package and get a lot of bandwidth savings.
SUSE has been using drpms for some time, at least since SUSE Linux 9.3 was
released in 2005. Users automatically get drpms when using the zypper tool
for package updates and drpms are created for all package updates as long
as the
diff is smaller than the full rpm. For users that would rather get the
full rpm when doing updates, drpms can be disabled in
/etc/zypp/zypp.conf.
Presto development is,
unsurprisingly, a Fedora Hosted project with a Trac page and Git
repository. It would seem that there has been some collaboration with
the openSUSE folks on the drpm format and tools so that yum and zypper will
interoperate. Given that both are rpm-based tools, it is good to see the
two distributions working together.
One could argue, as some have, that there is
too much package churn in Fedora. On the other hand, Fedora users do tend
to expect very recent, often bleeding-edge, packages. Since that is
unlikely to change, Presto will be very welcome for folks whose bandwidth
is limited in some way; those who are unconcerned need not
install it. Meanwhile, with less fanfare, SUSE users have been getting
those savings for some time.
By Jonathan Corbet
April 20, 2009
Despite a steady stream of rumors, IBM did not, in the end, buy Sun
Microsystems. But, on April 20,
Oracle
did. This acquisition could have some interesting implications for the
Linux community. Your editor, while not really knowing more than anybody
else, suspects that the outcome could be mostly positive. What follows,
here, is some wild speculation on where this could all go.
Some months ago, your editor posted a
slightly tongue-in-cheek article on a serious topic: what would happen
if Sun Microsystems were to undergo a change in management which rendered
the company far less friendly toward free software? It now appears that
there will, indeed, be a management change. One might well worry what
changes we might see in the newly-acquired company's attitude; Oracle is
not always seen as the friendliest company in general. But Oracle, while
being very much a proprietary software company, does seem to have a
supportive approach toward free software. Your editor was reasonably well
impressed by the talk given by Oracle "Chief Corporate Architect" Edward
Screven at the recent Linux Foundation Collaboration Summit. At some
levels of the software stack, at least, Oracle seems genuinely interested
in working with and growing the development community.
There are a number of specific topics of interest when speculating on what
could happen; your editor will visit a few of them below.
MySQL. This project, of course, can be seen as being in direct
competition with Oracle's flagship offering. So, unsurprisingly, a number
of people have speculated that Oracle will not encourage its further
growth. So, perhaps, Oracle will de-emphasize the project or "return it
to the community." But that is not necessarily how things will go.
One should remember that this isn't the first time Oracle has been seen to
threaten MySQL through acquisition. Back in 2005, Oracle bought Innobase, the creator of
the InnoDB storage engine used by MySQL. The MySQL project wisely branched
away from InnoDB, but the fact of the matter is that this code is still
free software, and InnoDB releases continue to happen. The sky did not
fall after all.
Beyond that, there is the simple matter that MySQL appears to earn money.
This acquisition could well be an opportunity for Oracle to gain revenue
from customers who, for whatever reason, are not interested in buying
Oracle licenses. It broadens the company's database product line and might
provide the opportunity to encourage some customers to move toward the more
expensive, proprietary offerings.
Most interesting, though, will be to see what happens with the MySQL
development community. Oracle still does not have vast amounts of
experience running large, community-oriented projects, but it seems to be
learning. The MySQL community is not in top condition, currently; it has
suffered from Sun's legendary heavy hand, leading to a fair amount of
developer unhappiness. There are currently a few active
forks out there, raising the possibility that control over the "real" MySQL
could move out of Sun's hands altogether. Oracle could, just maybe, woo
these developers back into a core MySQL project which was managed in a more
community-oriented manner. If that were to happen, it would be hard to
conclude that this acquisition was anything but good for MySQL.
Solaris. This operating system is said, in the press release, to be
one of the core justifications for the acquisition. Oracle sells a fair
number of licenses for deployments on Solaris; it cannot be unhappy with the
idea of gaining control over the full platform. The real question here,
perhaps, is whether Oracle sees Solaris as a system with a long future
ahead of it, or whether Solaris becomes a legacy platform which will be
supported for some time, but which will not see a great deal of
development.
There have been suggestions for a while that Sun is reconsidering its
licensing choices. A GPL-licensed Solaris was not entirely out of the
question before the acquisition; quite possibly, those chances have
improved now. A relicensed Solaris, preferably combined
with some clarity on patent licensing, could make it possible for
technologies like ZFS and DTrace to move into Linux. Whether Linux would
want them is a separate discussion, though.
There is an alternative, of course: Oracle could decide to promote Solaris
as an (incompatibly-licensed) competitor to Linux and reduce its
involvement on the Linux side.
Your editor, perhaps naively, sees this outcome as unlikely. Oracle has
invested heavily enough in Linux to create a real impression of believing
in the platform. Oracle has not invested in Solaris (which is also free
software, remember) at anything close to the same level. If Oracle were to
try to push Solaris as a better alternative to Linux, it would really
just be continuing Sun's strategy. Presumably there are people in Oracle
smart enough to wonder why Oracle would have any more success with that
approach than Sun did.
Btrfs. Edward Screven claimed that Oracle was pursuing Btrfs
because it likes the technology better than it likes ZFS. Ownership of ZFS
could well put that claim to the test, but there does not appear to be any
reason to believe that it was not sincere. The early word from Oracle is that plans for Btrfs
have not changed, and that the resources put into that project will not
decrease.
Java. The press release states that Java "is the most
important software Oracle has ever acquired." Much Oracle-based
software is written in Java, so there are clear advantages in having
control over that part of the software stack. Increasingly, customers can
just go to Oracle and get support for most of the major components they use
from a single source. That, presumably, will help make some money for
Oracle.
OpenOffice.org. This project looks like a bit of a strange fit in
Oracle, which is not really a desktop software company. Still, Oracle may
see value in keeping this project going as a way to encourage corporate
desktop users away from Microsoft products. With any luck at all, Oracle
will work to turn OpenOffice.org into a more community-oriented project.
By making participation in OpenOffice.org so hard, Sun has spurned the
offers of assistance which have come from around the community. Maybe
Oracle will be a bit smarter and will realize that, by opening things up a
bit, it can speed the development of OpenOffice.org without really having
to invest more into the project. One can always hope.
What it comes down to is that just about anything could happen. It could
be that this acquisition is part of a long-term plan by Oracle to acquire
just enough of the free software community to neutralize any threats it
sees. Now that this hypothetical plan is coming to fruition (lacking,
perhaps, just the occasionally-rumored acquisition of Red Hat), Oracle can
proceed to move away from Linux, turn things proprietary, and generally
prepare itself for the Final Battle. This would not be a good outcome for
the Linux community, though we would, as usual, end up stronger once the
dust had settled.
Alternatively, Oracle may have understood that truly free software can
help to turn its competitors' products into commodities while enabling
Oracle to provide a solid offering around its own products. This company,
which has already become one of the top Linux kernel contributors, could become
the top contributor to free software projects as a whole (a title which Sun
has already claimed). If Oracle sustains Sun's projects in a more
community-oriented mode, we may well conclude, one year from now, that this
acquisition was a good thing indeed.
April 22, 2009
This article was contributed by Nathan Willis
Sun's sudden acquisition by Oracle triggered a deluge of speculation
about the future of the company's free software projects: Java, OpenOffice.org,
VirtualBox, OpenSolaris, and, most of all, MySQL. Will Oracle kill it? Spin it off?
Keep its hands off? In light of this uncertainty, the discussion soon
shifted to the trickier question of which branch constitutes "the"
MySQL. The project has been forked multiple times, several of them in
the past year. Considering that each competitor is led by a heavyweight
MySQL developer and has its own goals, how is a humble database
administrator supposed to choose?
Patch sets and proto-forks
The seeds of this confusion predate MySQL's acquisition by Sun, when
MySQL developers began to lose patience with MySQL AB's governance of the
project. Management had announced two branches, "enterprise" and
"community," in 2006, but soon began to miss scheduled binary and source
releases of the community branch. Worse still, community developers
complained that the company was trying to hide the enterprise branch code
— changing the release location between iterations.
In 2007, Jeremy Cole of Proven
Scaling took matters into his own hands, and set
up a public mirror of
the official "enterprise" releases as they appeared. Cole does not make
changes to the code released by Sun, although Proven Scaling does publicly
maintain its own set of patches and tools for
MySQL — as do several other database consulting firms and MySQL
users, including
Google.
Percona
One of those consulting firms is Percona, a web-development consulting
business that emphasizes its expertise in MySQL. Percona develops a
pluggable storage engine for MySQL called XtraDB.
XtraDB is an enhancement to the popular InnoDB engine, designed to work as a
drop-in replacement. It scales better on multi-core
hardware, uses memory more efficiently, and adds more tunability and
metrics.
Percona's MySQL
releases do not remove InnoDB to replace it with XtraDB, but do include
patches to InnoDB. They also incorporate patches from other sources,
including Proven Scaling, Google, and Open Query. Source and binary releases, as well
as RPMs for Red Hat Enterprise Linux, are available for MySQL 5.0 and MySQL
5.1.
Percona's patch set is documented on the
company's wiki. The patches include changes that add status variables,
more configuration parameters, additional I/O settings, and dynamic memory
allocation, and that alter mutexes and locks to improve performance on SMP
systems.
OurDelta
OurDelta was launched in October of
2008 by former MySQL employee Arjen Lentz (now at Open Query), and describes its mission as providing
"enhanced" MySQL builds for common production platforms. Its releases
build on Percona's, adding further patches (some from Google and other
third parties, some original work) and including additional storage
engines.
OurDelta maintains two builds, one stable and one bleeding-edge. All
stable releases so far have been for MySQL 5.0, and include the
full-text-search-capable Sphinx
storage engine. Upcoming work for MySQL 5.1 and MySQL 6.0 will add an enhanced version of InnoDB
from Innobase, along with the PBXT and FederatedX storage engines.
OurDelta makes source code
releases available as tar archives, and runs binary repositories for Red Hat Enterprise Linux and CentOS,
Debian, and Ubuntu.
OurDelta also documents its
significant patches. In addition to the Percona patch set, OurDelta
includes activity monitoring and reporting (per table, index, account, and
machine), improved logging, an option to kill idle database connections,
the ability to temporarily freeze InnoDB for backup purposes, and
improvements to speed up failover.
MariaDB
MySQL founder Michael "Monty" Widenius started his own fork in February
of 2009 after leaving Sun. At the time, he said
his reason for departing was dissatisfaction with Sun's development and
community processes for MySQL, which was not "a true open development
environment" that encouraged outside participation.
Widenius's fork is called MariaDB, and the only
major change is that it uses the Maria storage engine, which is
the focus of development. The rest of the code is regularly synchronized
with MySQL releases from Sun, and is intended
to be one hundred percent interoperable.
The Maria storage engine is an evolution of MySQL's default MyISAM
storage engine, and is designed
to duplicate the features found in InnoDB, notably crash recovery and full
transactional support. Maria and MariaDB are being developed against MySQL
5.1. Widenius expects the Maria engine to be a standard part of Sun's
MySQL 6.0 releases, but intends to keep developing MariaDB even after MySQL
6.0 is stable. So far, the project has released
source code packages and generic x86 binaries for Linux.
Widenius maintains a wiki page documenting the advantages
of MariaDB over Sun's unmodified MySQL, focusing on the features of the
Maria storage engine. Aside from the larger goals of crash-safety and
transactional support, he notes that using Maria as a storage engine should
speed up complex queries. In addition, MariaDB contains speed
improvements, the ability to use a pool of threads to handle queries
(rather than one thread per connection), and bugfixes not accepted by
Sun.
Drizzle
Drizzle is the most distinctive MySQL
fork, perhaps better described as a complete refactoring. Drizzle is the
work of Brian Aker, long a preeminent MySQL developer. He announced the
project in July of 2008, saying that he disliked many of the changes made
to MySQL after version 4.1, and felt that there was a large market of users
that did not want them. Despite launching the fork, Aker continues to work
in the MySQL group at Sun.
Drizzle cuts the core of MySQL down to the bare minimum, using a
microkernel-and-modules approach. The goal is to create a slimmed-down,
optimized database targeting web infrastructure and cloud components.
Aker said that Drizzle will question the foundations of database design,
and is not intended to be SQL compliant. The FAQ emphasizes a "look
forward, not back" philosophy. For example, Drizzle targets
modern, multi-core hardware, modern compilers, and modern operating
systems. Similarly, the development team is not
interested in feature requests or in adding excised MySQL features back
in. Thus far, the project has made only source
code releases, and has noted that they are not yet stable for
production use.
Conclusion
The major Linux distributions all package Sun's "community" version of
MySQL. Sun itself provides free downloads of the community edition from the
web, evidently having learned a lesson from the 2007 uproar. Sun's official
packages are likely to be newer, given the release cycles of most
distributions, and to its credit Sun makes binary builds available for a
wide variety of processor architectures and distributions, including older
releases of those distributions. For most users, such a supported build is
usually the best choice. The Percona and OurDelta packages represent the
work of in-the-field MySQL consultants, and MariaDB is focused on the Maria
engine, but only experienced database administrators are likely to be able
to take advantage of the additional features they offer.
Still, it is telling that so much of the work done by the forks centers
around the InnoDB storage engine: the patches written by Percona and
OurDelta, Percona's replacement engine XtraDB, and MariaDB's replacement
engine Maria. InnoDB is GPLv2-licensed, but the copyright is owned by ...
Oracle. Oracle acquired InnoDB's creator Innobase in 2005. That
acquisition sparked a flurry of concern that the database giant would kill
the product, take it proprietary, or somehow use it against MySQL —
many of the same nightmare scenarios now speculated about the Sun purchase.
It is worth noting that in the intervening years two things have occurred:
Oracle has not killed or maimed InnoDB, and the open source
community has preemptively created its own innovative solutions, thereby
insulating
open source users and customers from disaster should Oracle take a step in
the wrong direction.
The real question is not which fork is "the" MySQL, but whether the
multiple patch sets and forks indicate sickness or health for MySQL as a
whole. Excluding Drizzle, all of the projects were started because someone
who cared a great deal about the future of MySQL saw something wrong with
MySQL's development process (and for its part, Drizzle was spawned by even
deeper dissatisfaction with the technical direction of MySQL). Surely that
much concern on the part of the community signifies health. There is no
telling which forks will prosper and which will fizzle out, but that depends
to a large degree on Oracle, and how it governs the project in the
future.
Page editor: Jonathan Corbet
Inside this week's LWN.net Weekly Edition
- Security: A privilege escalation flaw in udev; New vulnerabilities in cups, firefox, kernel, udev,...
- Kernel: In search of the perfect changelog; The slow work mechanism; DRBD: a distributed block device.
- Distributions: Debian GNU/kFreeBSD: one more step towards a universal operating system; Annual Distribution List update; Ubuntu 9.04 RC; Sugar on a Stick Beta 1; Fedora Unity Releases F10 Re-spins.
- Development: GCC reaches the 4.4.0 release, what's coming in glibc 2.10, new versions of MySQL Community Server, TestDisk/PhotoRec, Samba, RPM, CUPS, Midgard2, Octopussy, skpd, Xfce, Elisa, Firefox, PyMite, Python, Mock, GIT, Jason.
- Press: Shuttleworth promotes 2-3 year meta-cycles, Android for set-top boxes, Linux under Windows apps, OLPC XO 1.5 review, RTI Data Distribution Service review, the health of openSUSE.
- Announcements: GNOME sysadmin team, Zemlin on Oracle/Sun, TomTom case and GPLv3, Oracle buys Sun, open source activity map, IMF cfp, LPC cfp, openSUSE Summit cfp, NLUUG conf sched, OpenSource World sched, X Dev Conf.