September 21, 2011
This article was contributed by Nathan Willis
Open source and free software projects often encounter a culture clash
whenever they have to work with standards bodies. The most obvious problem
is the secrecy that many proprietary-vendor-driven standards processes
demand of participants, but that is not the only challenge. The PostgreSQL
database project has been grappling with these challenges in recent weeks
in an effort to strike a balance between its needs as a project and the
closed structures and processes of the ISO, which is the publisher of the
official standard for SQL.
Secrecy
The topic arose on the pgsql-hackers mailing list in mid-September, when Susanne Ebrecht lamented the apparent lack of interest in the SQL standards process among PostgreSQL developers, prompted by her experience having a conference talk proposal on the subject rejected. She noted that another ISO meeting was fast approaching, and although rules prevented her from disclosing new drafts of the standard to "the public," she was permitted to discuss them privately with the organization that supported her (PostgreSQL), and asked if there was sufficient interest to set up a private mailing list for such discussions.
It apparently came as a surprise to several on the list that Ebrecht was
an official representative in the ISO process. However, as she elaborated
to the list, her role is not a direct (or a particularly powerful) one.
The ISO has managed the SQL standard since 1987, as ISO/IEC
9075. But the ISO itself is composed of representatives — one
per country — from 162 separate national standards bodies. The
German standards body Deutsches Institut für Normung (DIN) solicited
Ebrecht's input for its own work on SQL.
The final voting on changes to the ISO standard for SQL is done by the assembled national representatives, however. Thus, even though Ebrecht can present PostgreSQL's concerns to the DIN SQL committee, those concerns are still several steps removed from making it into the eventual standard, and at each of those steps the vested interests of corporations and other nations gain more influence on the outcome. The practical question posed to PostgreSQL is how Ebrecht could communicate about the process to the developers without running afoul of the committees' secrecy rules.
It might be possible to avoid violating the non-disclosure rule by discussing broad changes to the drafts on a public mailing list without going into detail. But in SQL as in so much of life, the devil is in the details, so the consensus eventually was that a private list would be set up, to which Ebrecht could forward updates from the standards-writing process. To keep the list traffic confidential, it would be limited to known PostgreSQL contributors.
Standards: who needs 'em?
On the plus side, there does seem to be a healthy interest among project
members in following the ISO standards process. As Heikki Linnakangas said,
the process may not have sparked much discussion over the years, but
"it's hard to get excited about something if you don't know what's
happening." As core team member Josh Berkus said in an email,
though, the non-disclosure rules are just one of several challenges.
These challenges are:
- Requirements of confidentiality around all proceedings of the
committee, which causes extreme difficulty for open source projects used to
making all internal decisions on public mailing lists;
- Requirements to designate specific, pre-cleared staff who need to
attend meetings by telephone or in person, around the world, adding expense
and time requirements open source projects have trouble meeting;
- Intense political atmosphere where all decisions are a matter of
vendor alliances and have little or nothing to do with technical
requirements.
The ISO SQL committee is a particularly egregious example of the first point. Not only are all of their internal drafts secret, but the final published SQL standard is not available freely; it's vended for a substantial fee with restrictive copyright. While there are reasons to keep the minutes of the meetings confidential, there's no really good reason for this level of secrecy over the drafts and final publication, except to support the incumbent proprietary vendors.
On the third point, Berkus offered a specific example where influential
vendors appear to have used the standards process as a weapon. Both
PostgreSQL and MySQL supported a simple syntax for retrieving a subset of
the rows returned by a query using the LIMIT and OFFSET
clauses, he said, a syntax that was well-understood and well-liked by
users. But the standards committee adopted a different, more verbose
syntax that added no features or flexibility. He said:
While the minutes of the meetings in question are closed to me, I
suspect that the entire motivation for this was Oracle and Microsoft's
desire to specify something which would be incompatible with the leading
open source databases.
Open source projects are not the only players put at a disadvantage by this sort of tactic, either, he observed. The same hurdles affect startup companies, which in turn shields entrenched players from competition.
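For readers unfamiliar with the two syntaxes, here is a minimal sketch of the difference. Berkus's email does not name the standard's syntax, so this assumes he was referring to the OFFSET ... FETCH FIRST clause introduced in SQL:2008. The table name and data are hypothetical, and Python's built-in sqlite3 module stands in for a real database server; the standard form is kept as a string and not executed, since support for it varies between databases.

```python
import sqlite3

# Hypothetical in-memory table holding the ids 1 through 10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 11)])

# LIMIT/OFFSET form: skip the first 3 rows, then return the next 4.
limit_offset_sql = "SELECT id FROM items ORDER BY id LIMIT 4 OFFSET 3"
rows = [row[0] for row in conn.execute(limit_offset_sql)]
print(rows)

# Assumed SQL:2008 equivalent of the same query (not executed here).
standard_sql = (
    "SELECT id FROM items ORDER BY id "
    "OFFSET 3 ROWS FETCH FIRST 4 ROWS ONLY"
)
print(standard_sql)

conn.close()
```

Both queries ask for the same four rows; the only difference is the wording of the row-limiting clause.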
Distrust of the ISO process was visible from others in the project as
well. PostgreSQL's resident standards guru Peter Eisentraut commented in
an off-list email that, for end users, SQL is "pretty useless as a
'standard'" when compared to more complete specifications like C and
XML. SQL lacks specifications for important features like optimization and
administration, he said, and worse still, the language itself is
"baroque," with every new feature adopting a completely new
syntax. As a result, there is no clear way to extend the language in a
consistent fashion, which is problematic for PostgreSQL and other
projects.
Open source, proprietary vendors, and incompatibility
Joe Abbate mused that perhaps it was time for the open source database players to establish their own standard not controlled by incumbent vendors out to protect their business. Abbate's initial message to that effect came across as a call to form an "open source fork" of SQL itself, which most of the PostgreSQL team seemed to think was a bad idea. In addition to the confusion it would create for users, attempting a fork would require tremendous time and energy — and as Greg Smith commented, "standardization tends to attract lots of paperwork. Last thing you want to be competing with a big company on is doing that sort of big company work."
On the other hand, some, like Christopher Browne, pointed out that open source projects should consider participating in new standards processes that are just beginning, such as the UnQL specification proposed for NoSQL database queries. Darren Duncan suggested much the same thing with respect to the Muldis D language.
Abbate clarified his intention in a follow-up
message, saying he did not mean to propose embarking on a
standards-fork. "I only think it may be useful to discuss SQL
features, informally or otherwise, with other open source 'competitors'
such as SQLite, MySQL (brethren), Firebird, etc."
With regard to Abbate's idea, Berkus affirmed the value of communication
between the various open source database projects, noting that they already
meet annually at OpenSQL Camp. But
there are essentially only three open source relational databases that
matter, he said: PostgreSQL, MySQL, and SQLite. Among those, MySQL is now
split into several competing fragments, the largest of which is owned by
Oracle. As a result, cross-project communication boils down to PostgreSQL
concurring with SQLite, he said, "which we already mostly
do."
Realistically, though, Berkus does not feel that SQL users are demanding
more features and syntax:
I personally can't think of too many things I'd want to *add* to
the SQL standard. Simplify, yes, but add, no. Possibly the OpenSQL
group could work on more accessible syntax for stuff like windowing
and recursive queries. However, it's more likely that we'll be
working more on direct language interfaces in the future instead
In the broader open source community, then, relational databases may have it easy because SQL is old enough to be both well known and established (not to mention the fact that most users are resigned to incompatibility between competing vendors). Other software projects are not so lucky, whether they face patent-driven fights about video codecs in HTML5 or the work of supporting new hardware specifications in the Linux kernel. The roadblocks Berkus mentioned are problematic no matter what the standard. Large projects or well-funded organizations may be fortunate enough to get a representative into the process (as PostgreSQL has), but a closed process dominated by proprietary vendors cannot be reformed in a day.