|
|
Subscribe / Log in / New account

PostgreSQL and the SQL standards process

September 21, 2011

This article was contributed by Nathan Willis

Open source and free software projects often encounter culture clash whenever they have to work with standards bodies. The most obvious problem is the secrecy that many proprietary-vendor-driven standards processes demand of participants, but that is not the only challenge. The PostgreSQL database project has been grappling with these challenges in recent weeks in an effort to strike a balance between its needs as a project and the closed structures and process of the ISO, which is the publisher of the official standard for SQL.

Secrecy

The topic arose on the pgsql-hackers mailing list in mid-September, when Susanne Ebrecht lamented the apparent lack of interest in the SQL standards process among PostgreSQL developers, prompted by her experience having a conference talk proposal on the subject rejected. She noted that another ISO meeting was fast approaching, and although rules prevented her from disclosing new drafts of the standard to "the public," she was permitted to discuss them privately with the organization that supported her (PostgreSQL), and asked if there was sufficient interest to set up a private mailing list for such discussions.

It apparently came as a surprise to several on the list that Ebrecht was an official representative in the ISO process. However, as she elaborated to the list, her role is not a direct (or a particularly powerful) one. The ISO has managed the SQL standard since 1987, as ISO/IEC 9075. But the ISO itself is composed of representatives — one per country — from 162 separate national standards bodies. The German standards body Deutsches Institut für Normung (DIN) solicited Ebrecht's input for their own work on SQL.

The final voting on changes to the ISO standard for SQL is done by the assembled national representatives, however. Thus, even though Ebrecht can present PostgreSQL's concerns to the DIN SQL committee, they are still several steps removed from making it into the eventual standard — steps where the vested interests of corporations and other nations gain more and more influence on the outcome. The real practical question posed to PostgreSQL is how Ebrecht could communicate about the process to the developers without running afoul of the committees' secrecy rules.

It might be possible to avoid violating the non-disclosure rule by discussing broad changes to the drafts on a public mailing list without going into detail. But in SQL as in so much of life, the devil is in the details, so the consensus eventually was that a private list would be set up, to which Ebrecht could forward updates from the standards-writing process. To keep the list traffic confidential, it would be limited to known PostgreSQL contributors.

Standards: who needs 'em?

On the plus side, there does seem to be a healthy interest among project members in following the ISO standards process. As Heikki Linnakangas said, the process may not have sparked much discussion over the years, but "it's hard to get excited about something if you don't know what's happening". As core team member Josh Berkus said in an email, though, the non-disclosure rules are just one of several challenges.

These challenges are:
  1. Requirements of confidentiality around all proceedings of the committee, which causes extreme difficulty for open source projects used to making all internal decisions on public mailing lists;

  2. Requirements to designate specific, pre-cleared staff who need to attend meetings by telephone or in person, around the world, adding expense and time requirements open source projects have trouble meeting;

  3. Intense political atmosphere where all decisions are a matter of vendor alliances and have little or nothing to do with technical requirements.

The ISO SQL committee is a particularly egregious example of the first point. Not only are all of their internal drafts secret, but the final published SQL standard is not available freely; it's vended for a substantial fee with restrictive copyright. While there are reasons to keep the minutes of the meetings confidential, there's no really good reason for this level of secrecy over the drafts and final publication, except to support the incumbent proprietary vendors.

On the third point, Berkus offered a specific example where influential vendors appear to have used the standards process as a weapon. Both PostgreSQL and MySQL supported a simple syntax for the retrieval of a subset of the rows returned by a query using the LIMIT and OFFSET operators, he said, syntax which was well-understood and well-liked by users. But the standards committee adopted a different syntax that was more verbose, but which added no additional features or flexibility. He said:

While the minutes of the meetings in question are closed to me, I suspect that the entire motivation for this was Oracle and Microsoft's desire to specify something which would be incompatible with the leading open source databases.

Open source projects are not the only players put at a disadvantage by this sort of tactic, either, he observed. The same hurdles affect startup companies, to the protection of entrenched players against competition.

Distrust of the ISO process was visible from others in the project as well. PostgreSQL's resident standards guru Peter Eisentraut commented in an off-list email that, for end users, SQL is "pretty useless as a 'standard'" when compared to more complete specifications like C and XML. SQL lacks specifications for important features like optimization and administration, he said, and worse still, the language itself is "baroque", with every new feature adopting a completely new syntax. As a result, there is no clear way to extend the language in a consistent fashion, which is problematic for PostgreSQL and other projects.

Open source, proprietary vendors, and incompatibility

Joe Abbate mused that perhaps it was time for the open source database players to establish their own standard not controlled by incumbent vendors out to protect their business. Abbate's initial message to that effect came across as a call to form an "open source fork" of SQL itself, which most of the PostgreSQL team seemed to think was a bad idea. In addition to the confusion it would create for users, attempting a fork would require tremendous time and energy — and as Greg Smith commented, "standardization tends to attract lots of paperwork. Last thing you want to be competing with a big company on is doing that sort of big company work".

On the other hand, some, like Christopher Browne, pointed out that open source projects should consider participating in new standards processes that are just beginning, such as the UnQL specification proposed for NoSQL database queries. Darren Duncan suggested much the same thing with respect to the Muldis D language. Abbate clarified his intention in a follow-up message, saying he did not mean to propose embarking on a standards-fork. "I only think it may be useful to discuss SQL features, informally or otherwise, with other open source 'competitors' such as SQLite, MySQL (brethren), Firebird, etc.."

With regard to Abbate's idea, Berkus affirmed the value of communication between the various open source database projects, noting that they already meet annually at OpenSQL Camp. But there are essentially only three open source relational databases that matter, he said: PostgreSQL, MySQL, and SQLite. Among those, MySQL is now split into several competing fragments, the largest of which is owned by Oracle. As a result, cross-project communication boils down to PostgreSQL concurring with SQLite, he said, "which we already mostly do".

Realistically, though, Berkus does not feel that SQL users are demanding more features and syntax:

I personally can't think of too many things I'd want to *add* to the SQL standard. Simplify, yes, but add, no. Possibly the OpenSQL group could work on more accessible syntax for stuff like windowing and recursive queries. However, it's more likely that we'll be working more on direct language interfaces in the future instead

In the broader open source community, then, relational databases may have it easy because SQL is old enough that it is both well known and established (not to mention the fact that most users are resigned to incompatibility between competing vendors). Other software projects are not so lucky, from patent-driven fights about video codecs in HTML5 to supporting new hardware specification in the Linux kernel. The roadblocks Berkus mentioned are problematic no matter what the standard. Large projects or well-funded organizations may be fortunate enough to get a representative into the process (as PostgreSQL has), but a closed process dominated by proprietary vendors cannot be reformed in a day.


Index entries for this article
GuestArticlesWillis, Nathan


to post comments

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 17:18 UTC (Wed) by smurf (subscriber, #17840) [Link] (2 responses)

The LIMIT thing is a simple example why this kind of tactics doesn't buy anybody anything.
Implementing the alternate syntax from the standard is a simple matter of adding to the YACC file and recompiling. Total effort spent: one hour, maybe two if you add a test case for it.
So the only reason why MS and Oracle do that kind of thing is that they still don't understand how Open Source works.
Good for us, I'd say. :-P

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 6:12 UTC (Thu) by ekj (guest, #1524) [Link] (1 responses)

I didn't get that part either.

If you've got a database that already efficiently support

select * from [foo] limit 100 offset 100

Then surely making changes to the syntax-tree so that it instead (or additionally) supports:

select * from [foo] make a effing slice from 100 to 200

That's a horrible example, because *renaming* identical functionality is just about the simplest change anyone could possibly make.

PostgreSQL and the SQL standards process

Posted Sep 23, 2011 12:30 UTC (Fri) by pflugstad (subscriber, #224) [Link]

One other thing to remember is that often the representatives to this standards bodies are entirely non-technical. They are professional standards body attendees, and they don't really understand the material they're looking at. They have zero concept of how easy or hard or literally impossible something is to actually implement. So you get silly stuff like this.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 19:19 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (17 responses)

I still think that DB industry is thoroughly poisoned by SQL. Even 'new' languages like UnQL are tainted by it: http://www.unqlspec.org/display/UnQL/Example+Queries+and+...

If there was implementation of query language based on http://en.wikipedia.org/wiki/FLWOR on PostgreSQL backend - I'd have switched immediately.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 19:31 UTC (Wed) by dskoll (subscriber, #1630) [Link] (2 responses)

FLWOR looks like the "best" [sic] of both SQL and XML... a nastier horror I could not imagine. :)

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 19:45 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

Have you ever tried recursive queries in SQL with CTEs?

...shudder...

FLWOR is as clear as day compared to that.

Though I'm not against even better language, FLWOR should be considered just as a nice guide about how query languages should be written. In any case, it's a bit XML-specific and should be extended for relational data with features like explicit joins instead of nested loops.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 19:55 UTC (Wed) by dskoll (subscriber, #1630) [Link]

Have you ever tried recursive queries in SQL with CTEs?

Ugh, no. :) I guess my use of SQL is not at that level. I agree that such queries are unreadable piles of evil, and predicting how they will perform is like reading a crystal ball.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 19:36 UTC (Wed) by jeremiah (subscriber, #1221) [Link]

FLWOR still strikes me as somewhat difficult to write a parser for. It's better than SQL, but still not clear enough. Too many words and not enough symbols or inherent structure.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 21:46 UTC (Wed) by robert_s (subscriber, #42402) [Link] (12 responses)

I couldn't agree more. Except:

"If there was implementation of query language based on http://en.wikipedia.org/wiki/FLWOR on PostgreSQL backend - I'd have switched immediately."

I'm not convinced we need a whole query "language" at all. The huge majority of database use is done on behalf of larger applications written in a different language. Why make developers learn another half-functional language just to be able to express their query intentions? I'd be far more interested in mechanisms to expose the querying possibilities in a developer's "native" language, which is why I found Berkus' sentence

"However, it's more likely that we'll be working more on direct language interfaces in the future instead"

so encouraging.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 22:03 UTC (Wed) by raven667 (subscriber, #5198) [Link] (6 responses)

I'm not sure how that's going to work out, unless you restrict queries to just getting and setting individual values you are going to need some way to express a complex filtering expression that can be executed by the database and not your application. The job of communicating intentions in complicated enough that it will de-facto be be its own language. You could end up with different query languages for each programming language, for each database server and being restricted on which language you can implement a program on based on how complex the queries are allowed to be, not on the database but on your interface to the database.

I guess that's already the case for a lot of the newer databases like Hadoop where each has its own custom API and special storage/transaction/replication semantics

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 23:56 UTC (Wed) by zlynx (guest, #2285) [Link] (2 responses)

C# has LINQ which is really cool.

The backend code to implement it is awful, but the user side of it is great.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 0:58 UTC (Thu) by raven667 (subscriber, #5198) [Link] (1 responses)

That's cool but its not standard and AFAIK not implemented anywhere else. Is it even in Mono? That's what I was saying about each environment getting its own better or worse API implementation. At the end of the day, right now, whatever queries you can create with LINQ, which must output SQL at the end, can be done in any language or by hand. What if that was tied to the combination of the implementation language and backend db, if LINQ was only available for C# talking to MS SQL, with features not exposed or usable by other languages or against other DBs.

I suppose that's already the case somewhat with proprietary extensions and incompatibilities between different SQL databases but I worry that it would be harder to handle and workaround if there were less compatibility between applications and databases as provided by the SQL standard.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 1:24 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Yes, you can use LINQ in Mono. And there are alternative LINQ providers, like http://bltoolkit.net/Doc.Linq.ashx for various databases.

LINQ is just a mechanism which can be used to create DSLs to query data.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 5:55 UTC (Thu) by smurf (subscriber, #17840) [Link]

You already have the standard operations anyway. They're called "functions" and "operators" and such. And you do need a common representation for that between client and server.

Right now, that common representation is text and SQL, plus (sometimes) a server-specific method to prepare statements to prevent having to transcode and quote data.

Replace that with XML or whatever all you want. Doesn't buy you anything. You still have to assemble a textual string on one end and analyze it on the other side.

Three choices here:

* stay with SQL. It mostly works, people are used to it, you can do interesting stuff with it in any programming language whatsoever. Downside: you still need server-specific code for more complicated queries and for schema.

* agree on a common high-level API for queries, and possibly for schema queries/changes. Downside: who'd do the work and who'd agree on the result? The commercial vendors? Hardly.

* agree on a common binary data stream between client and server. Same downside, plus doesn't work for embedded servers.

PostgreSQL and the SQL standards process

Posted Sep 25, 2011 23:06 UTC (Sun) by jberkus (guest, #55561) [Link] (1 responses)

"You could end up with different query languages for each programming language, for each database server and being restricted on which language you can implement a program on based on how complex the queries are allowed to be, not on the database but on your interface to the database."

The thing is, we *already* have that. 80% of applications out there which interface with a SQL database use an ORM or similar high-level interface, and few ORMs support more than a couple of DBMSes. Developers are not using SQL to push data today.

Frankly, if database geeks took control of the ORMs and gave developers a user-friendly interface in their own language, it would be both better for developers and better for the databases. We could make sure that these interfaces *do the right thing* as far as database interfacing is concerned. The whole path of Call ORM --> Generate SQL String --> Parse SQL String --> Communicate Binary Protocol has at least one too many steps in it.

Couch and Mongo have gone partway there with JSON, but they're still essentially using intermediate query languages. And worse, BSON isn't even a standard anywhere, so it's completely non-portable.

Of course, we want to still support SQL as a direct interface for database geeks to do "advanced" work. But that's only about 10% of the database interaction out there. And I say this as a SQL expert.

PostgreSQL and the SQL standards process

Posted Sep 26, 2011 2:48 UTC (Mon) by raven667 (subscriber, #5198) [Link]

I think we largely agree on facts and are saying some of the same things from different perspectives. Right now ORMs and applications can support multiple databases but if the db/orm/language become more tightly coupled then I could see that no longer being the case. Applications built against newer databases like Couch and Mongo seem to be much less portable than traditional SQL applications.

As you say, if each database engine wrote their own custom ORM then developers could write better queries and get better results more safely but it would increase lock-in which is not an unmitigated positive.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 23:59 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

The problem is, you need DB engine to see the 'whole' query to do some optimizations. For example if you do JOINs in SQL as for-loops over the tables inside your application, then DB engine wouldn't be able to use indexes to optimize them.

That's why a separate fully-analyzable language is needed. FLWOR is a nice example because it's functional and fully reference-transparent. So optimization engine can backtrack through definitions and get the complete picture of what's happening. And then use tried-and-true join optimizations.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 19:43 UTC (Thu) by iabervon (subscriber, #722) [Link] (3 responses)

You do need some representation of the complete query you are performing. The question is actually: is there any benefit to having database operations go through a representation which is a standardized character sequence? Generally, both applications and databases have some sort of data structure representing a query, and the application flattens this into a character sequence which the database has to read in order to construct approximately the same data structure in order to analyze it. If the query API were specified in terms of a data structure instead of a character sequence, it would eliminate an awkward and error-prone step on each side.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 20:08 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Doesn't really matter much for me.

Explicit query-construction API should express roughly the same ideas as the textual query representation.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 20:14 UTC (Thu) by dlang (guest, #313) [Link]

you would not eliminate the conversion step, you would just replace the conversion between an text string and the internal data structure with the conversion between the binary sequence and the internal data structure.

different languages (and for that matter, different libraries for a single language) will want to represent the data in different ways.

it would be good to have a standard format that could eliminate the SQL injection type of vulnerability, but it's hard to think of anything that can take an arbitrary byte stream and use it in an arbitrary place in the statement, while still allowing multiple statements in one blob that would eliminate the problem

PostgreSQL and the SQL standards process

Posted Sep 30, 2011 18:43 UTC (Fri) by Baylink (guest, #755) [Link]

You're effectively asking: "should queries be one string of text, instead of a whole sequence of API calls". (Unless I've misunderstood you badly, in which case, I apologize.)

The answer is "you've never been 400ms away from your database engine, have you"?

Yes, the architecture of SQL is the way it is for a reason. You young whippersnappers (:-) have never run a database engine with an API interface across even a LAN, much less a WAN, so you forget both halves of the latency term of the equation, much less the data reduction of having *all* the intermediate work happen inside the server.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 21:27 UTC (Wed) by Simetrical (guest, #53439) [Link] (8 responses)

I should point out that just because ISO is closed and dysfunctional doesn't mean other standards bodies are. The W3C and IETF are two examples of standards bodies that publish all their standards on the web royalty-free, and where all or almost all discussion and development of the specs is open. There are some large web specs these days (HTML5, DOM4, my editing spec) that are developed exactly like open-source projects: liberal licenses, public version control and bug trackers, discussion on IRC and mailing lists, etc.

Standardization doesn't have to involve paperwork either. HTML5 was originally developed (and to a large extent still is) at the WHATWG, which is just Ian Hickson's personal Dreamhost site. The procedures consist of e-mailing the mailing list or filing a bug if you have feedback, and Ian will address it eventually as he sees fit. The HTML Editing APIs specification that I'm currently working on is developed the same way.

All you really need for standardization is someone willing to put in the time and effort working with implementers, and implementers who are willing to compromise where necessary to match the standard. You can really host the specification wherever you want, and develop it however you want, provided you have the implementers on board.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 22:02 UTC (Wed) by n8willis (subscriber, #43041) [Link] (7 responses)

That's true; I hope I did not suggest somehow that all standards bodies are the same. However, even W3C standardization processes can be slowed down by antagonistic participants with their own agenda. Open source projects are not impervious to being said troublesome participant, either, naturally. And I have heard from W3C folks that not as many FOSS projects get involved as the group would like; I chalk that up largely to Berkus' Challenge #2: it's hard to get someone free to sit in on phone and in-person meetings.

Nate

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 22:18 UTC (Wed) by Simetrical (guest, #53439) [Link] (6 responses)

Different parts of the W3C do things differently. The key Working Groups (HTML, WebApps, CSS, SVG) tend to follow whatever the major browsers' implementers agree on, so no one other than implementers can really cause much trouble. Of course, sometimes implementers won't agree on something, like media format support for <video>. But there's not much to do about that. Fortunately, right now all the major browser implementers are largely committed to interoperability, so we don't have a lot of obstructionism.

The active participants in the Working Groups I'm involved in tend to almost all be employed by one of the major browser implementers: Microsoft, Mozilla, Google, Apple, and Opera. Some of these (Mozilla, WebKit, Chromium) are FOSS. There are a nontrivial number of people participating in their spare time, too. I was one for a few years before I started getting paid to do it this year.

The groups I'm active in (HTML and WebApps) don't do anything significant by phone or in-person meetings. It's all mailing lists, IRC, and bug trackers. They're really run like open-source projects.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 18:42 UTC (Thu) by jwakely (subscriber, #60262) [Link] (1 responses)

Different parts of ISO do things differently too: the ISO C++ committee may not be perfect but by the sound of things it's a lot more open than the SQL one.

PostgreSQL and the SQL standards process

Posted Sep 23, 2011 5:35 UTC (Fri) by ncm (guest, #165) [Link]

That's a recent change. Curiously, it hasn't been a result of changes in ISO rules, but just interpretations of them. Committee Drafts are still secret, but Working Papers, effectively identical, are not. Likewise, published Standards. Committee mailing lists and issues lists exist entirely outside ISO's domain. Even voting rights make much less difference than one might think; actually doing the work carries more weight, because when in doubt, voters vote with whoever seems to know more, and most voters are always in doubt.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 22:06 UTC (Thu) by justincormack (subscriber, #70439) [Link] (3 responses)

A lot of the rest of the w3c doesnt work like this though, and a lot of this appened because of html5. Before tht there was a mess of unused xml standards...

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 22:17 UTC (Thu) by Simetrical (guest, #53439) [Link] (2 responses)

Yep, definitely. But all the W3C standards I can think of that are undergoing active development, and that are targeted at web browsers, work entirely in the open these days. There may be exceptions. But even the worst WG in the W3C publishes its standards for free on the web, and is required to listen to feedback from the general public, so it's still way better than the SQL situation described in this article.

listening to feedback

Posted Sep 23, 2011 13:29 UTC (Fri) by pjm (guest, #2080) [Link] (1 responses)

What does "required to listen to feedback" mean? Sure, anyone can post to the mailing list, but my experience with the CSS mailing list (www-style) is that it can take months to get a reply, and the eventual reply can still be "we've decided not to address this" without giving any reason. There's a w3c policy document that says that this "should" not happen; yet happen it does.

While it's true that the CSS 2.1 specification is published on the web, it's also true that the published specification contains contradictions and does not contain enough information to pass the test suite.

Yes, it's like an open-source project: no-one's under any obligation to address the bug reports you file.

listening to feedback

Posted Sep 23, 2011 17:40 UTC (Fri) by Simetrical (guest, #53439) [Link]

Yep, nothing is perfect by any means. It's a far cry better than what's described in the article, though. For what it's worth, some of the more recent specifications are far more detailed, and hopefully do contain enough information to pass their test suites. CSS 2.1 is ancient, although it only just made REC.

But all the CSS specs are less precise than the newer HTML/DOM specs. At some point we're going to have to rewrite CSS from scratch with the same level of precision as HTML and DOM have been rewritten. We've learned a lot over the last decade about how to write standards.

PostgreSQL and the SQL standards process

Posted Sep 21, 2011 23:54 UTC (Wed) by smoogen (subscriber, #97) [Link]

> PostgreSQL's resident standards guru Peter Eisentraut commented in an
> off-list email that, for end users, SQL is "pretty useless as a
> 'standard'" when compared to more complete specifications like C and
> XML. SQL lacks specifications for important features like optimization
> and administration, he said, and worse still, the language itself is
> "baroque," with every new feature adopting a completely new syntax.

To me that would be what an OpenSQL or maybe a W3C SQL would work on.
1) Optimization and administration
2) Redoing of features to use the same syntax.

call it DW; or TRM or some similar acronym but make it something useful. [Of course in the end, a standard is only as useful as those who implement it and use it. ]

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 3:56 UTC (Thu) by ringerc (subscriber, #3071) [Link] (3 responses)

For what it's worth, I've seen strong interest in following the standards on the Pg lists - even in cases where the standards are of dubious sense or downright stupid. That applies both for formal ISO standard features and where compatibility with major-vendor language extensions like to_char is in question. The documentation reflects that, with clear coverage of which Pg features are part of the standard and which are extensions.

There's certainly a lot of frustration with the standard process's exclusion of some of the key groups who use it, like open source database developers and JDBC driver developers. The difficulty of accessing the released standards documentation is another store point.

For users without corporate backing to buy $lots copies of the standards it is vary frustrating to have to rely on circulating old drafts and other vendors' documentation. This devalues the standard, because it's so hard to know what is and isn't standard that it's hard to code to the standard - so people just pick a vendor dialect and stop caring.

Personally, I think the best thing that can happen to the SQL standards process at this point is to put it quietly out of its misery and reboot the whole thing. We - end users, database implementers, middleware/ORM implementers, JDBC/ODBC/ADO/etc driver implementers, etc - need a language with the same core capabilities but a sane, machine-readable syntax. SQL's god-awful pseudo-natural-language monstrosity of a grammar is a nightmare to work with. Parsing it is hard enough, but transforming it and proving things about it is horrifying. I'd be delighted to see adoption of a language that could express all the concepts, but:

- Is ordered sanely, with subjects before predicates before outcomes;

- Is easily represented as an easy-to-work-with in-memory tree (like a DOM or similar, but hopefully not XML) for easy transformation and composition;

- Is designed to be composed from smaller pieces, so it's easy to shuffle around subqueries, predicates, etc without messing with the stupidities of "AND" and commas;

- Is capable of expressing the same queries and plans (for easy porting) so it still has inner, left outer, right outer, and full/cross joins, is still relational, still supports grouping, aggregation, windowing, etc;

- Gets rid of all pseudo-function language syntax in favour of using true functions or a generic and extensible expression structure that doesn't require syntax changes for every new feature like CTEs, window functions, etc;

- Is designed WITH THE ACTIVE INVOLVEMENT OF PEOPLE WHO IMPLEMENT JDBC/ODBC DRIVERS, and PEOPLE WHO HAVE IMPLEMENTED LANGUAGE INTEGRATED QUERYING OR CRITERIA APIs. This is a big one. The amount of suffering that SQL causes for JDBC driver developers cannot be understated, and they get off easy compared to the poor individuals who try to produce programmatic query builders/criteria APIs/language integrated querying like JPA 2.0 criteria, LINQ, etc.

- ... and most importantly, because it's a saner way of writing the same queries as SQL, can be transformed into SQL to be sent to backends that don't support it yet, so the chicken-and-egg problem of adoption vs support can be resolved.

Most current efforts like Muldis-D seem to miss the last point - that if you want adoption, you're going to need to be able to transform to SQL in the medium term so people can use it on platforms that don't yet support the new, saner language.

PostgreSQL and the SQL standards process

Posted Sep 23, 2011 2:30 UTC (Fri) by flewellyn (subscriber, #5047) [Link]

A Lisplike domain-specific language could do what you specify, although the ordering in point 1 wouldn't hold. It'd be prefix notation. If that's acceptable though, Lisp is already known to be easy to parse, transform, compose, and work with functionally.

PostgreSQL and the SQL standards process

Posted Sep 30, 2011 19:04 UTC (Fri) by Baylink (guest, #755) [Link] (1 responses)

Very specifically: I'd like to see something that makes it easy to trace queries in, say, "mtop" back to the source code whence they came.

In that vein, and since the proper collection of eyeballs are probably reading the commments to this particular posting, I'd like to propose a device which might or might not work in that context, and get the input of people who know which:

Select $FIELDS from $TABLE where $CONDITIONS *AND NOT GETDATE() equals 1011-03-01* (where the date is some date earlier than today, and specific to each query; it serves as a tag).

Kevin Falcone, who I think is one of the SQL gurus on RT, suggested this would break the query cache, but I'm not sure that's true a) because it would be the same on any given query, and b) because it depends on a function, which can be evaluated away as part of the optimizer (assuming that keeps it from messing up the cache, which I don't know; hence, you folks. :-)

Anyone have an opinion on this, or the larger problem it's aimed at?

PostgreSQL and the SQL standards process

Posted Oct 5, 2011 10:52 UTC (Wed) by Jannes (subscriber, #80396) [Link]

Not sure what you're getting at actually. Why use a function at all, does the GETDATE have any use?

Why not use constants .... AND 0 < 78238923123 (your random tag) ?

And while you're at it, why not just make it a comment? I've used that before to make queries easily identifiable in SHOW PROCESSLIST: (in MySQL)

/* YourTag Function Name blabla */ SELECT * FROM blabla;

The comment has the extra advantage of not affecting the query cache at all (at least in MySQL).

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 4:00 UTC (Thu) by BenHutchings (subscriber, #37955) [Link] (1 responses)

The irony of Microsoft and Oracle using SQL standardisation as a weapon is that neither of them are seem to be in much of a hurry to follow the standards themselves. When I last had the misfortune to work on a data access layer (about 2005), SQL Server was still using SQL-89 semantics in some cases where SQL-92 differed, and Oracle still considered that '' IS NULL. PostgreSQL seemed to be the most standard-conformant of the lot. Has this changed?

PostgreSQL and the SQL standards process

Posted Sep 24, 2011 20:38 UTC (Sat) by butlerm (subscriber, #13312) [Link]

>Oracle still considered that '' IS NULL

Oracle still does, and it probably won't ever change because it would break far too much software if they did. In Oracle's defense, most languages have extremely poor or non-existent support for nullable character strings, and so this decision tends to make working with Oracle easier in some respects.

In particular, you really don't want to index the "no relationship" values of an optional foreign key. Like an empty email address, for example. To accomplish this in most databases, you have to convert email addresses that are empty strings to nulls when inserting data into the database, and then convert nulls back to empty strings when you are done.

Personally, I wouldn't even like to use a database that distinguished between nulls and empty strings unless there was a convenient facility to do just that.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 4:16 UTC (Thu) by dps (guest, #5725) [Link] (8 responses)

Actually I know why ISO standards are not public. Some of the *funding* that pays for the ISO is raised by the sale of standards. I understand that the prices are significant, probably because expecting volume sales is unrealistic and some copying/sharing is inevitable.

Keeping the standards expensive may be in the interest of the large commercial outfits but it is not the only reason. Anything that undermines the sale of standards is obviously is the interests of neither the ISO nor the people that pay people to participate in the process.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 11:49 UTC (Thu) by Kwi (subscriber, #59584) [Link] (5 responses)

Considering that all standardization work is performed not by ISO itself, but by national standard bodies (often funded by tax payers), I don't buy this argument.

It is true that sales of publications is a major source of income - 30% according to the ISO website - but what is that money spent on?

ISO is in itself just the ISO Central Secretariat in Geneva. The secretariat:
- has 154 full-time employees
- costs USD 40 million a year to run

For a purely coordinating role, 154 employees seems pretty excessive, and so does the yearly expenses at USD 257.056 per employee. USD 40 million a year, and then you haven't even produced any standards yet!

To get any work done, you need to add an estimated USD 140 million a year spent by the national standard bodies on ISO work, and an unknown (but probably much higher) amount spent by the participating companies. And none of this is paid for by selling ISO standards!

ISO is a money sink.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 12:17 UTC (Thu) by spaetz (guest, #32870) [Link] (4 responses)

> seems pretty excessive, and so does the yearly expenses at USD 257.056 per employee.

Not if you ever needed to rent a flat or buy food in Geneva :-)

PostgreSQL and the SQL standards process

Posted Sep 24, 2011 20:25 UTC (Sat) by butlerm (subscriber, #13312) [Link] (3 responses)

If the ISO disappeared tomorrow, would anyone notice? It seems to me like a pointless government bureaucracy that has a net negative effect on nearly everything it touches, with a tendency to produce standards and meta-standards that no one actually uses.

PostgreSQL and the SQL standards process

Posted Sep 24, 2011 21:41 UTC (Sat) by njs (subscriber, #40338) [Link] (2 responses)

Many many industries care about ISO: https://secure.wikimedia.org/wikipedia/en/wiki/List_of_In...

For example, it's ISO standards that say what makes paper "A4", makes film "400 speed", says what the standard freight container sizes are, describe ISBN and ISSN barcodes, etc., etc.

I'm sure there's plenty of nonsense in plenty of ISO standards, but we do need *some* standard definitions of these things so that different people and businesses can talk to each other about complicated technical matters without having to check every detail all the time.

ISO does seem particularly ill-suited to handling software-related standards, though.

PostgreSQL and the SQL standards process

Posted Sep 30, 2011 19:06 UTC (Fri) by Baylink (guest, #755) [Link] (1 responses)

Well, I thought it was the American Standards Association (now ANSI) that said what made "film" "400 speed", but maybe that's just me. So many things are just me.

PostgreSQL and the SQL standards process

Posted Oct 2, 2011 19:52 UTC (Sun) by JanC_ (guest, #34940) [Link]

And "A4" was defined by DIN (the German standards body) before ISO copied it.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 15:33 UTC (Thu) by BenHutchings (subscriber, #37955) [Link]

I don't understand the secrecy of drafts or minutes, though. It's not an ISO rule; some committees do publish them.

PostgreSQL and the SQL standards process

Posted Sep 23, 2011 22:06 UTC (Fri) by Tobu (subscriber, #24111) [Link]

That looks a lot like academic publishing. Set up a lot of little fiefdoms where people and their employers compete for status, giving away senior roles to the first to join. Participants are the ones hashing out the work, reviewing it, getting it publication ready, and marketing it. The standard body/academic publisher owns exclusive rights to the publication though. Sell the work back to participants, libraries, and any interested parties that can afford it. Print money (Elsevier operates at a grander scale than ISO: 1.5 billion £ of annual revenue, mostly from subscriptions).

Better standards bodies are those with a clear royalty-free, patent-free mandate for both access and implementation.

Open Source and Standards

Posted Sep 22, 2011 10:08 UTC (Thu) by ingwa (guest, #71149) [Link]

I don't have much to offer on the subject of SQL or PostgreSQL. But I have been working with standard bodies around the Open Document Format. I work 0n the Calligra Suite (http://www.calligra-suite.org/) and we are actively involved in the development of that document standard.

I must say that this has meant a lot to us both because we have most of our concerns listened to and actually fixed and also because it has meant increased visibility for us. And being involved in the work to bring out a new standard gives you a credibility that is difficult to get in other ways.

I think that more open source projects, especially the larger ones, should get involved in the standard bodies that regulate the area they are active in. This can only bring good to those projects. True, you may have to deal with some politics, but you will have to do that anyway if you don't participate -- but without having any influence at all.

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 11:57 UTC (Thu) by gerv (guest, #3376) [Link]

Taking back the SQL standard would be entirely possible, but you would need your own clone of Ian Hickson (editor of HTML5).

Gerv

PostgreSQL and the SQL standards process

Posted Sep 22, 2011 18:59 UTC (Thu) by petereisentraut (guest, #59453) [Link] (4 responses)

I have been to SQL standards committee meetings. Actually, the process is quite relaxed and open, and there is friendly cooperation across vendor lines, once you find your way in. So I don't believe the conspiracy theories. And there are no requirements about secrecy of the proceedings. You can read up on everything if you become a member of the working group or even just a temporary guest.

The main problems are (1) ISO politics and (2) the effort required to do the actual work.

First, the participants don't like the ISO politics either, but they're there, so what are you going to do? ISO has existed for 50+ years. Who else are you going to charge with maintaining your standard that is 20+ years old and should perhaps have another 20 years or so to go?

Second, writing actual technical standards requires experts and a lot of time. Who is going to do it? The work is currently done by a dozen people or so who have quasi-guru status within their sponsoring companies. Who is going to put in the time like this in a volunteer organization?

PostgreSQL and the SQL standards process

Posted Sep 23, 2011 9:22 UTC (Fri) by cmccabe (guest, #60281) [Link] (1 responses)

I'm sure that people at SQL standards committee meetings are friendly. But flying halfway across the world at great expense and doing things behind closed doors is a 19th century way of doing things. Modern standards like HTML5 are developed in the open and discussed on mailing lists and IRC. This just makes things easier and cheaper for everyone-- especially users and developers of related software, who actually get a chance to see what developers are discussing.

Also, not all people who work on open source databases are volunteers.

PostgreSQL and the SQL standards process

Posted Sep 23, 2011 13:24 UTC (Fri) by petereisentraut (guest, #59453) [Link]

Well, sure, some people get paid to work on PostgreSQL. But participating usefully in the community, especially when it comes to extra activities like the one discussed here, goes beyond merely being told by some boss to work on some code base. Participating in the community is a volunteer effort, even if it's ultimately part of your job.

PostgreSQL and the SQL standards process

Posted Sep 23, 2011 10:55 UTC (Fri) by nix (subscriber, #2304) [Link] (1 responses)

You could use the same reasoning to argue that nobody in the free software community would sit on the C++ committee, or that nobody in the free software community would get involved with POSIX. But the former has substantial free software representation and the latter is by now pretty much entirely driven by the needs of the Linux community.

So your reasoning appears somewhat flawed. (Or, rather, contains an erroneous implication: that it requires quasi-gurus, sure. That there is something preventing these quasi-gurus from ever participating unless they are backed by a non-'volunteer organization', possibly. Your implication that non-volunteer organizations are necessarily proprietary software companies is plainly false. If this implication is dropped, your last paragraph becomes a non sequitur.)

PostgreSQL and the SQL standards process

Posted Sep 23, 2011 13:26 UTC (Fri) by petereisentraut (guest, #59453) [Link]

I think you are implying a lot of implications. I'm just stating the facts and asking some questions. Now we have to find the answers.

PostgreSQL and the SQL standards process

Posted Sep 24, 2011 21:16 UTC (Sat) by butlerm (subscriber, #13312) [Link] (1 responses)

What I want to know is how did the SQL standards process come to specify requirements that suggest extraordinarily bad taste. And by bad taste I mean inventing new commonplace functions that require special grammar rules to parse correctly. Such as the following:

SUBSTRING(s FROM x FOR y)
TRIM(LEADING ' ' FROM s)
POSITION('-' IN s)
EXTRACT(MONTH FROM d)

Old fashions

Posted Sep 27, 2011 7:59 UTC (Tue) by eru (subscriber, #2753) [Link]

Maybe people who used to be COBOL programmers? Actually, syntax that tries to mimic natural language used to be a very common style in old domain-specific languages, the theory being that it makes them self-documenting.


Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds