The first Android
Builders Summit (ABS) was held April 13-14 as part of the Linux Foundation's
"two weeks of conferences" in San Francisco. ABS overlapped the last day of
the Embedded Linux Conference (ELC), so many of the ELC attendees sat in on ABS talks,
at least for the first day. On the second day, Marko Gargenta of Marakana
gave a keynote that looked at uses of Android beyond its traditional
consumer mobile phone
(and now tablet) niche. He also outlined some of the reasons that device makers
are turning to Android.
Advantages of Android
Gargenta is the author of Learning
Android and has worked with various companies to help them with
their Android plans. There are three main advantages that Android brings
to the table which are important to device makers, he said. First, it
is an open platform, and it is relatively easy to get the source code and
customize it. Second, it has "apps", and lots of developers are
embracing the platform. And third, it is a "complete
stack" that provides nearly all of the services that are required to
create a product.
While Android is an open platform, as evidenced by Andy Rubin's
famous tweet—though the tweet is missing two important steps (envsetup.sh and
lunch) as Gargenta and audience members pointed out—it isn't
much like other open source projects. There is no Git tree that contains
"whatever was checked in the night before"; instead there are
somewhat infrequent code drops. With Honeycomb (Android 3.0), even those
have stopped, which is something that concerns some companies who are
basing their products on Android. It may eventually cause them to
reconsider using it.
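For reference, the full build sequence from that era, including the two steps the tweet left out, looked roughly like this (the manifest URL and exact steps have varied between releases, so treat this as a sketch):

```
$ mkdir android ; cd android
$ repo init -u git://android.git.kernel.org/platform/manifest.git
$ repo sync
$ source build/envsetup.sh   # set up the build environment (omitted from the tweet)
$ lunch                      # choose a build target (also omitted)
$ make
```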
Applications for Android are important, but it's not really the existing
applications that are interesting, rather it is the "model for
developing applications" that attracts the device makers. Many
existing applications may run on a modified Android, but that is just a
"bonus", he said. When you look at Android as a whole, it has
all of the pieces from the hardware up, including the Linux kernel,
libraries for many of the things the device vendors need to be able to do,
Java, and the application development model, which is what makes it a
complete stack. This stands in contrast to standard embedded Linux or Java
ME, neither of which is a complete stack.
Gargenta then launched into several case studies of devices using Android.
The projects were ones that he had worked on, though some were not public
(or not public yet), so he didn't reveal the company behind them.
The first was a multi-function printer/scanner/copier device where the user
interface would be built with Android. The application development
framework was one of the most attractive parts of Android, not because they
would be putting Market applications on the device, but because they could
have independent developers work on the interface. There are lots of
developers out there who can write to the Android APIs.
The complete stack approach of Android was also appealing because the
system already has support for graphics, touchscreen interaction, and
networking. There were some missing pieces, of course, including drivers
and C libraries to talk to the proprietary printer/scanner hardware,
Java interfaces for that hardware, and a new "home" application. Instead
of the usual Android user interface, a custom application was written that
didn't include things like an application drawer or status bar. In fact,
users of the device may never know that they are using Android, he said.
Two different device types that Gargenta described had similar
requirements and many of the same reasons for going with Android. The
first was a "public safety solution" for handling
communications during catastrophes that was developed by a major OEM, while
the other was a device for the US Department of Defense (DoD). In both cases,
the availability of "off the shelf" hardware that runs Android was
attractive. For the public safety application, it's important that
multiple kinds of hardware can be used, as various different agencies need
to be able to coordinate their efforts.
Once again, the application framework provided by Android is appealing
because it allows multiple developers to work on various parts of the
problem, more or less simultaneously. The large developer base is
attractive as well. Both projects were concerned with stopping
installation of "unapproved" applications, either from the Market, or by
restricting which repositories the devices could access.
As might be
guessed, the DoD project had further security concerns. It is important to
ensure that the device is being used by an authorized person, so attaching
a USB device as part of the authentication process is required. The existing
Android code did not support application access to the USB ports, so that
was added. In addition, device management was added so that devices could
be tracked or remotely wiped, and so that password policies could be enforced.
Both projects had an interest in the priority of Android services. In
general, radio communications should not be interrupted by text messages or
a game, so the assumptions needed to be tweaked from those of a consumer
device. Determining which services are critical can be difficult, Gargenta
said. For example, "are media services that critical in a life or
death situation?", he asked. They may or may not be, depending on the
media in question.
Cisco's Cius was another example that Gargenta presented. It is meant to be an
"iPad for business" that looks something like a desktop video
phone, but the video screen part can be removed to become a mobile tablet.
The "open and portable" nature of Android was one of
its selling points, but the company is rethinking Android because of the
Honeycomb availability issue. Google is also not helping adoption in the
enterprise market because it is not telling anyone what its plans are for
things like device management and security, he said.
The Cius has its own Market where applications are much more carefully
vetted and generally have higher quality. The Cius also adds multi-user
support, which is not something that Android does, but is, of course,
available in the underlying Linux kernel. The device also provides video
conferencing and Voice over IP telephony support; the latter was added before
Google released Gingerbread with SIP support.
Android set-top boxes were another use that Gargenta described. Google TV
is not available as an API, so it can't be used for television
applications. Android is attractive for the usual reasons, but has some
gaps as well. Appeasing content providers with DRM solutions is one area that
needed to be addressed in the two projects he worked on. The Android user
interface is also not usable for TVs, but it is relatively straightforward
to create one, partly because Android was designed to support multiple
form factors.
The last case study from Gargenta's talk was for "networked
cars". Visteon has created a prototype of an Android-based dashboard
for cars. One of the more interesting characteristics of such a device is
that it requires multi-screen support, which is not something that comes
out of the box with Android. But it does make a good platform for doing
user interface development quickly, he said.
He listed a number of other Android-based products that he knew of,
including home security systems, scientific calculators,
microwaves, and washing machines. One thing that Gargenta didn't mention
was whether any of the changes being made by these device makers were being
pushed back to Google for possible inclusion into Android. One gets the
sense that, in keeping with the secrecy that often shrouds the embedded
world, those changes may well be held back. It's also not clear if the
custom Linux drivers for various hardware devices are being released in
source form, as Gargenta didn't really address the kernel in his response
to an audience question about licensing.
It certainly was interesting to hear where Android is being used, especially
in devices that stray far from its roots. In many ways it is just an
extension of the enormous penetration that Linux has made into the embedded
world. Whether other "full stack" solutions, like MeeGo or WebOS, can make inroads
into devices over the next few years will be interesting to watch.
While ABS definitely had some interesting talks, some of which I hope to
write up in coming weeks, it was rather different than one might have
expected. The first two keynotes were essentially extended advertisements
for the speakers' companies (Motorola and Qualcomm), which is not at all
the norm at technical conferences. In addition, it was rather surprising
to see a complete lack of Google speakers—and sponsorship. Some noted
that the Google I/O conference was
scheduled a few weeks after ABS, but that doesn't seem like reason enough
for that level of non-participation. If the LF plans to reprise the
conference next year, fixing the keynotes and working with Google would
likely result in an even better conference.
Comments (7 posted)
Drupal 7 is the first mainstream content management system with
out-of-the-box support for users and developers to share their data in a
machine-readable and interoperable way on the semantic web. At the Drupal Government Days in Brussels, there were
a few talks about the features in Drupal — both in its core and in
extra modules — to present and interlink data on the semantic web.
In his talk "Riding the semantic web with Drupal", Matthias Vandermaesen, senior developer for the Belgian Drupal web development company Krimson, gave both an introduction to the semantic web and an explanation of the Drupal features in this domain. The problem with the "old" web is that it is just a collection of interlinked web pages, according to Vandermaesen: "HTML only describes the structure of documents, and it interlinks documents, not data. The data described by HTML documents is human-understandable but not really machine-readable."
The semantic web, on the other hand, is all about interlinking data in a machine-readable way, and Linked Data, a subtopic of the semantic web, is a way to expose, share and connect pieces of data using URIs (Uniform Resource Identifier) and RDF (Resource Description Framework). This guarantees an open and low-threshold framework, where browsers and search engines can connect related information from different sources. All entities in a Linked Data dataset and their relationships are described by RDF statements. RDF provides a generic, graph-based data model to structure and link data. Each RDF statement comes in the form of a triple: subject - predicate - object. Each subject and predicate is identified by a URI, while an object can be represented by a URI or be a literal value such as a string or a number.
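As a concrete illustration (the example.org URIs are invented for this sketch; the predicate is real Dublin Core), two such triples in N-Triples notation might read:

```
<http://example.org/movies/blade_runner> <http://purl.org/dc/terms/title> "Blade Runner" .
<http://example.org/movies/blade_runner> <http://purl.org/dc/terms/creator> <http://example.org/people/ridley_scott> .
```

The first triple's object is a literal string; the second's is itself a URI, which is how one dataset links into another.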
The semantic web is not some vague future vision; it's already here, Vandermaesen emphasized. He talked about some "cool stuff" that the semantic web makes possible. For instance, search engines like Google already enrich their search results with relevant information that is expressed in RDFa or microformats markup: if you search for a movie, Google shows you some extra information under the reference to the IMDb page of the movie, such as the rating, the number of people that have given a rating, the director, and the main actors. Google shows these so-called "rich snippets" in its result page for a lot of other types of structured data, such as recipes. Moreover, many social networking web sites like LinkedIn, Twitter, and Facebook (with its Open Graph Protocol) already mark up their profiles with RDFa.
But how do we "get on" the semantic web? This is actually quite simple,
according to Vandermaesen: just use the right technologies to work with
machine-understandable data, like RDF and RDFa, OWL (Web Ontology Language), XML,
and SPARQL (a
recursive acronym for SPARQL Protocol and RDF Query Language). There are two common ways to publish RDF. The first one is to use a triplestore, which is a database much like a relational database, but with data following the RDF model. A triplestore is optimized for the storage and retrieval of RDF triples. Well-known triplestores are Jena, Redland, Soprano, and Virtuoso.
The other way to publish RDF is to embed it in XHTML, in the form of
RDFa. This W3C recommendation specifies a set of attributes that can be
used to carry metadata in an XHTML document. In essence, RDFa maps RDF
triples to XHTML attributes. For instance, a predicate of a triple is
expressed as the contents of the property attribute in an element,
and the object of the same triple is expressed as the contents of the
element itself. For example, using the Dublin Core vocabulary:
<h2 property="dc:title">The trouble with Bob</h2>
One of the benefits of RDFa is that publishers don't have to implement two ways to offer the same content (HTML for humans, RDF for computers), but can publish the same content simultaneously in a human-readable and machine-understandable way by adding the right HTML attributes.
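A slightly larger sketch (the document URI is hypothetical) shows how the about attribute sets the subject for the triples expressed by nested elements, with each property attribute supplying a predicate:

```html
<div xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://example.org/reviews/trouble-with-bob">
  <h2 property="dc:title">The trouble with Bob</h2>
  <p>Reviewed by <span property="dc:creator">Alice</span></p>
</div>
```

A parser extracts two triples from this markup, both with the review's URI as subject, without the page needing a separate RDF document.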
Thanks to this machine-readable data, it's quite easy to connect
various data sources. Vandermaesen gave some examples: you could add IMDb
ratings to the movies in the schedule of your local movie theatre, and you
could link the public transport timetables to Google Maps. This shows one
of the key features of the semantic web: data is not contained in a single
place, but you can mix and match data from different sources. "With
the semantic web, the web becomes a federated graph, or (as Tim
Berners-Lee calls it) a Giant Global
Graph", he said.
RDFa in Drupal
"Drupal 7 makes it really easy to automatically publish your data in RDFa," Vandermaesen said, "and search engines such as Google will automatically pick up this machine-readable data to enrich your search results." Indeed, any Drupal 7 site automatically exposes some basic information about pages and articles with RDFa. For instance, the author of a Drupal article or page will be marked up by default with the property sioc:has_creator (SIOC is the Semantically-Interlinked Online Communities vocabulary). Other vocabularies that are supported by default are FOAF (Friend of a Friend), Dublin Core, and SKOS (Simple Knowledge Organization System). Drupal developers can also customize their RDFa output: if they create a new content type, they can define a custom RDF mapping in their code. A recent article on IBM developerWorks by Lin Clark walks the reader through the necessary steps for this.
But apart from RDFa support in the core, there are a couple of extra modules that let Drupal developers really tap into the potential of the semantic web. One of them is the (still experimental) SPARQL Views module, created by Lin Clark and sponsored by Google Summer of Code and the European Commission. With this module, developers can query RDF data with SPARQL (SPARQL is to RDF documents what SQL is to a relational database) and bring the data into Drupal views. This way, you can import knowledge coming from different sources and display it in your Drupal site in a tabular form, and this with almost no code to write. "Thanks to SPARQL Views, any Drupal web site can integrate Wikipedia info by using the right SPARQL queries to DBpedia," Vandermaesen explained. At his company Krimson, he used (and contributed to) SPARQL Views in a research project sponsored by the Flemish government, with the goal of creating a common platform to facilitate the exchange of data in an open and transparent fashion between large repositories that contain digitized audiovisual heritage.
Linked Open Data
In his presentation "Linked Open Data funding in the EU", Stefano Bertolo, a scientific project officer working at the European Commission, gave an overview of the projects the European Union is currently funding to support linked data technologies. He also maintained that governments are likely to become the first beneficiaries of advances in this domain, thanks to Drupal:
Linked Open Data, which is Linked Data open for anyone to use, is really taking off and Drupal is ready for it. There's a massive amount of information you can re-use in your Drupal installation, and this re-usability is the most important aspect of the semantic web. Just like a typical software developer re-uses a lot of software libraries for generic tasks, the semantic web allows you to re-use a lot of generic data. That's why the European Commission has been investing in Linked Open Data technology. Drupal and Linked Data have much to offer to each other, especially in the domain of publishing government data.
Bertolo mentioned three Linked Open Data projects funded by the European Commission. One is OKKAM, a project that ran from January 2008 to June 2010. Its name refers to the philosophical principle Occam's razor, "Entities should not be multiplied beyond necessity", to which the OKKAM project wants to be a 21st-century equivalent: "Entity identifiers should not be multiplied beyond necessity." What this means is that OKKAM offers an open service on the web to give a single and globally unique identifier for any entity which is named on the (semantic) web. This Entity Name System currently has about 7.5 million entities, such as Barack Obama, European Union, or Linus Torvalds. When you have found the entity you need in the OKKAM search engine, you can re-use its ID in all your RDF triples to refer unambiguously to the entity.
Another deliverable of the OKKAM project is sig.ma, a data aggregator for the semantic web. When you search for a keyword, sig.ma combines all information it can find in the "web of data" and presents it in a tabular form. Recently, a spin-off company started, based on the results of the research project.
The second European-funded project Bertolo talked about was LOD2, a large-scale project with many
deliverables. The project aims to contribute high-quality interlinked
versions of public semantic web data sets, and it will develop new
technologies to raise the performance of RDF triplestores to be on par with
relational databases. This is a huge challenge, because a graph-based data
model like RDF allows many degrees of freedom, which makes it difficult to optimize because there is no strict database schema. The LOD2 project will also develop new algorithms and tools for data cleaning, linking, and merging. For instance, these tools could make it possible to diagnose and repair semantic inconsistencies. Bertolo gave an example: "Let's say that a database lists that a person has had a car insurance since 1967 while the same database lists the person's age as 18 years. Syntactically, there are no errors in the database, but semantically we should be able to diagnose the inconsistency here."
A third project by the European Commission is Linked Open Data Around the Clock. Bertolo explains its goal: "The value of a network highly depends on the number of links, and currently the links across Linked Open Data datasets are not enough. The mission of the Linked Open Data Around the Clock project is to interlink these much more, to give people more bang for their RDF buck. Our objective is to have 500 million links in two years." As a testbed, the project started with publishing datasets produced by the European Commission, the European Parliament, and other European institutions as Linked Data on the Web and interlinking them with other governmental data.
Drupal paving the way
At the moment, the semantic web is still struggling with a
chicken-and-egg problem: many semantic web tools are still experimental and
not easy to use for end users, and publishers still have trouble finding a
good business model to publish their data as RDF when their competitors
don't do so. However, with out-of-the-box RDFa support in Drupal 7, the
open source CMS could pave the way for a more widespread adoption of
semantic web technologies: Drupal founder Dries Buytaert claims that his
CMS is already powering more than 1 percent of all websites in the
world. If Drupal keeps growing its market share, the CMS could help to bring Linked Open Data to the masses, and we could soon have millions of web sites with RDFa data on the web.
Comments (8 posted)
The MySQL Conference and Expo in Santa Clara, the largest open source conference in the San Francisco Bay area since the decline of LinuxWorld, was unusual this year in several ways. First was the inclusion of many non-MySQL databases in the conference. The second was the mostly-friendly rivalry of the several MySQL forks, who presented co-equal keynotes. The dominant discussion at the conference, however, was the stormy relationship between the MySQL community, the conference, and Oracle.
PostgreSQL and more
The biggest official change this year is that conference organizers O'Reilly and Associates decided to bring in non-MySQL open source databases in order to expand the scope of the conference. Most notable among these was the long-time MySQL "rival" project, PostgreSQL. The PostgreSQL support company EnterpriseDB was the top sponsor of the conference, and dominated the expo floor with a huge presentation area. There were two PostgreSQL tutorials and seven talks in the conference program.
In addition to PostgreSQL, several other open source database projects were invited to give talks at the conference, including MongoDB, CouchDB, Cassandra, Redis and HBase. MongoDB was particularly active on the trade show floor, and their ubiquitous coffee mugs turned up all over the conference.
The PostgreSQL presence at MySQLCon included a keynote in which EnterpriseDB staff went over features for the recently released version 9.0 and the upcoming version 9.1. The 9.1 features will be covered in a future LWN article.
Oracle's relationship with the conference bordered on the schizophrenic. According to a member of the conference committee, Oracle sponsored the conference but refused to allow their name to be listed as a sponsor. For several months Oracle refused to allow their staff to submit talks to the conference or attend, relenting at the last minute and sending several speakers and a small marketing crew, according to a member of the Oracle staff.
Oracle did sponsor a party on Wednesday night of the conference. However,
Oracle also sponsored a new MySQL track at IOUG's Collaborate conference at the
same time, in Florida, 3000 miles away. While by all reports attendance at
the MySQL track at Collaborate was poor (some witnesses reporting as few as
50 people at the keynotes), MySQL luminaries employed at Oracle, as well as MySQL-using Oracle partners, were obligated to go to Florida and not to California.
This cannot have helped attendance at the MySQL Conference. Indeed,
attendance was down to around 1100 compared with over 1500 last year according to conference organizers.
On Tuesday, Tomas Ulin, Oracle's MySQL engineering manager, presented the
recent improvements which Oracle has introduced in version 5.5, what is
planned for 5.6, and some of the changes that have been made to MySQL in their first year of stewardship of the project.
One of the biggest changes to MySQL 5.5 is that InnoDB, the primary and most mature transactional database engine for MySQL, is now the default. This is welcome news since one of the chief causes of inexperienced MySQL users losing data is use of the older, non-crash-safe, MyISAM engine. The MySQL and InnoDB teams at Oracle have also been merged, "as they always should have been" according to Ulin.
Other 5.5 features include substantially improved performance on Windows,
enhanced partitioning, and the performance_schema,
which is a tool to collect runtime performance data about MySQL queries.
Ulin also announced a "development release" of MySQL 5.6. Oracle's goal is to make development releases very stable, so that they can move from the development releases to final release quickly.
Features for 5.6 include more improvements to partitioning and additional views in performance_schema. MySQL is also adding additional query optimizations to improve performance on large databases, such as multi-range reads, sort optimizations, and pushdown of predicates into subqueries.
MySQL 5.6 may also include enhanced integration with Memcached. This includes the ability to use the Memcached protocol in order to access the InnoDB storage engine directly, effectively making InnoDB a NoSQL and SQL database at the same time.
Ulin also went over Oracle's plans for MySQL Cluster, otherwise known as NDB. MySQL Cluster is a specialized database engine aimed at telecommunications companies, and has been commercially successful in that market. "If you make a phone call today, MySQL cluster is probably involved somewhere," said Ulin.
Version 7.2 will include support for some kinds of JOINs across clustered tables, which NDB has not previously had. It will also include increases in the number of columns per table it can support, and replication of user privileges for faster access. It is also expected to support dynamic switching between SQL queries and direct object-oriented access to the database engine.
The two successive acquisitions of MySQL, by Sun and then by Oracle, have spawned a number of forks, each of which is pursuing its own course of development and community. On Wednesday, Monty Widenius, founder of MySQL, presented MariaDB, his MySQL successor database — or fork — which his company in Finland has been developing for the last two years. Several of the original MySQL developers are now working on MariaDB.
MariaDB is primarily meant to be a drop-in replacement for MySQL 5.1. Its main advantage is being pure open source and not owned by Oracle. MariaDB has also released an LGPL C-language driver for MySQL, which according to Widenius resolves some of the licensing issues with Oracle MySQL drivers.
Beyond that, MariaDB 5.2 was released recently with a number of useful
features not present in mainstream MySQL. Widenius announced that for the first time Sphinx full text search is available directly as a storage engine called SphinxDB. MariaDB contains multiple improvements to MyISAM storage and supports pluggable authentication. It also has added "virtual columns", which hold automatically updated calculated values based on the other columns in the table.
MariaDB 5.3 will include an extensive rewrite of the query optimizer which is supposed to improve response time on more complex queries by orders of magnitude. It will also support a new, faster form of group commit for faster database writes.
Widenius also announced that SkySQL would be offering support for MariaDB. SkySQL is a new MySQL support company in Finland created by a group of former MySQL AB staff and MySQL co-founder David Axmark.
Probably the most conspicuous MySQL fork at MySQL Conference and Expo, due
to the number of speakers, was Drizzle. Drizzle is optimized for
usage on cloud hosting, as well as being extensively rewritten to clean up
the code. The project's team is made up of both former MySQL developers
and new contributors with Rackspace as their primary commercial sponsor.
On Wednesday, Brian Aker took the stage for Drizzle. His big (if rather belated) announcement was the general availability release of Drizzle, which became ready for its first production use around a month ago. The second announcement was that MySQL support and services company Percona had announced commercial support for Drizzle.
The Drizzle team has spent the last three years redeveloping MySQL around a "micro-kernel" architecture. This means that they've taken many things in the MySQL core code, simplified them, and converted them to "plugins", allowing individual users to reconfigure how Drizzle works. Their refactor of the code has also eliminated many longstanding MySQL "gotchas" regarding Unicode support, timestamps, constraints, Cartesian joins, and more.
Since Drizzle is built "for the Cloud", a strong part of its focus is on replication. Drizzle supports row-based replication using protobufs, an open format created by Google. The new open replication format supports integration with a variety of other tools such as ApacheMQ, Memcached, and Hadoop. It also supports multiple masters, partial replication, and sharding. In development are virtualized database containers using database "catalogs".
Aker also announced libDrizzle, a client library which works with MySQL and SQLite as well as Drizzle. Since this driver is BSD-licensed, he believes it will be attractive to users who have legal concerns about the MySQL driver licensing.
The future of open source databases and MySQL
The conference ended on Thursday with talks from Baron Schwartz of Percona and Mike Olson of Cloudera and BerkeleyDB on the future of databases. The two talks were remarkably similar in their predictions:
- databases will replicate and cluster seamlessly
- data will grow to petabytes and more, even for smaller organizations
- databases will integrate better with the rest of the stack
- databases will support a variety of different data formats
- people are using a variety of special-purpose databases now, but future databases will be more all-purpose
- data caching and databases will no longer be completely separate layers, but will be fused
The main difference between the two presentations was on the place of relational vs. non-relational databases in the future of databases.
The future of MySQL was rather more of a source of anxiety for the attendees. As a PostgreSQL geek, I got asked repeatedly where PostgreSQL would be in five years by MySQL users who were clearly wondering the same thing about MySQL. All of the forks and the love/hate relationship with Oracle have undermined confidence in MySQL, and sent users looking for alternatives, or for reassurance.
As for the future of MySQL Conference and Expo, the rumor at the conference was that O'Reilly plans to move it away from Santa Clara. An attendee named Olaf even ran a vote-by-Twitter poll for a new location.
Videos from the conference are available on blip.tv.
Comments (16 posted)
Page editor: Jonathan Corbet
Inside this week's LWN.net Weekly Edition
- Security: Developments in web tracking protection; New vulnerabilities in dhcpd, kernel, krb5, PolicyKit, ...
- Kernel: Rationalizing the ARM tree; Power management work at Linaro; Safely swapping over the net.
- Distributions: Xoom and the Android tablet experience; Fedora, Mandriva, Ubuntu, ...
- Development: FVWM 2.6; GNOME 3.2, Linaro toolchain, OpenOffice.org, ...
- Announcements: Boxee GPLv3 violation, Document Foundation, Novell patent deal, ...