Interview: Bruce Momjian
This interview took place on November 30 at Linux Conference 2000 Fall, which was held from November 29 to December 1 at the Kyoto International Conference Hall, Kyoto, Japan. The event was sponsored by the Japan Linux Association and held with the Perl/Ruby Conference by O'Reilly Japan. We would like to thank Bruce Momjian, Tatsuo Ishii, Takaaki Higuchi, and the people at JLA who helped realize this interview -- Thank you very much!
CL: First of all, how do you pronounce PostgreSQL?
BM: It's called Post-gres-cue-el (not Postgre-es-cue-el). It's a very unusual name. It's very hard to pronounce. Somebody described it as anti-marketing in a sense that it's so confusing that people maybe don't even want to say it :-)
The original database software was called Postgres. When SQL was added to it and its development was taken onto the Internet, we added SQL at the end of the name. It looks like Postgreee plus something at the end, it does confuse people. I really do apologize for that :-)
CL: You are the vice president of Great Bridge, now would you call yourself a developer?
BM: My job with Great Bridge involves several things.
First, I'm to continue steering Postgres development as I have in the past four years. That means encouraging developers, helping to keep the group together, sort of holding the group. And obviously I'm still involved with development to some extent, adding features, fixing things. So in one sense my job is to just continue doing what I've been doing for the past four years. The second part of my job is to help steer Great Bridge as I have Postgres. That means helping them understand the Postgres community, helping them work with the open source community.
So although my title is one that's very simple, it has sort of two parts. One part is the open source part, the second part is trying to help the company to develop in a way that we think is best.
CL: How many hours do you actually write code a day?
BM: When I'm talking about the part of my job that has to do with the open source project, I spend about an hour and a half a day reading email, responding to questions, making suggestions, maintaining the code, applying patches that people have presented on the mailing list. Additionally, I do coding when something needs fixing. When there is something we need to patch or something somebody submitted that doesn't quite work anymore, I'll be coding and cleaning it up, trying to fix it.
But I have not had a lot of time to do development since I started writing the book ("PostgreSQL: Introduction and Concepts"). I started writing my book in October of 1999, and frankly, since then up to today it's consumed almost all of my time. In fact, just as the book finished in early October (of 2000), I was employed by Great Bridge, and Great Bridge is just starting now, so I have to give them the attention to get them started and make sure they are on the right track.
Certainly, in the future, I'm going to have time to do more development.
CL: Do you have enough time to play with your children?
BM: I've worked at home for the past eight or nine years. So I was home all the time in my previous job as a database software developer.
With Great Bridge, I'm doing more traveling. Obviously I'm here in Japan. But when I'm not traveling, I'm home. So my family is happy that this new job is not taking me away from being there.
So, yeah, I'm home a lot and they come down and visit and... I see a lot of them :-)
CL: Would you say that PostgreSQL is now ready for e-business?
BM: We are already ready, but we know that we can do even more to make it better.
Certainly, it's been used in e-business already by a large number of people. We have people writing shopping cart software that uses our transactional model. The features we have make it very natural, useful, for all the places where databases would be used.
Now, there are some features that we don't have. We believe that we need the ability to recover from a hard disk crash with the log file that we are working on for 7.1. We know that replication is something that a lot of people are looking for.
So we are continuing to target the areas that we think make us even more attractive to e-business sites. We are focusing on that.
CL: There were benchmarks this year that said PostgreSQL was faster than commercial ones or, even MySQL. Although I think Postgres has had a good reputation of having advanced features such as the powerful data model, rich data types and high extensibility, I don't think it has had a reputation of being very fast. What happened to PostgreSQL in regard to the speed improvement?
BM: We have been working on improving performance since we started, well, since we took the code from Berkeley. And I'd say within six months, once we had the code stable, where it was working and it wasn't crashing, we started to look at things we could do to improve performance.
I was one of the first that kind of started running tests to see why it took so long to start a new connection, or why some clauses were taking a long time. We've had a number of people also, looking at the code to try and improve performance in other ways that they have analyzed it. There has not been a real push for performance, but what we have had are people who have been used to analyzing performance in different ways. All look at the code, and all add the things that they felt would improve performance.
For example, several years ago, I, over a period of probably two weeks, went through the code and inlined some of the function calls that were called ten thousand times, or something like that. I made a macro out of it, and I got six percent speed improvement. And then on start-up, we were doing a fork and exec which we didn't need to do. Fork was enough and we got rid of exec. All of a sudden, we got a small speed improvement there.
We did little things along the way. Tom Lane has gone through and analyzed performance in some other ways he is used to doing, and he's found many areas where he could improve performance. We've had other people who looked at the memory manager, "We could improve performance a little bit if we did this," you know.
So what we've had is not one grand person coming in. Performance tuning, or performance analysis, is really a complex thing. Certain people look for certain things in performance, and other people look for different things. We are really glad to have one guy who says "I'm wondering, I'm going to look at code this way," somebody else who comes along and would say "I'm going to improve performance this way," and Tom Lane who would look at another way. What we've found is that... all these people looking at it at different times along the way, adding three percent here, six percent here, two percent here, five percent here, all of a sudden, "Wow, we're really fast!"
There wasn't really one thing. Even in the benchmarks that you are talking about, where we've had improvement from 6.5 to 7.0, there really wasn't much emphasis. We were all very surprised because we didn't do much in terms of performance between 6.5 and 7.0. I changed the way that the cache looked up system table information, and Tom Lane added some changes to memory management or some other changes... I think we had 13% change between releases, but we weren't really concentrating on performance. We added a few little things that we thought might improve performance, but we didn't anticipate that.
To extend that, when I first saw the charts of performance us against all the other databases, they showed it to me in California... I was talking with somebody, and I was looking at the chart. I was looking at the one on the bottom. There was something about it, so I was saying "I wonder why this looks this way," and so forth. Someone looked at it and said "Well, this looks pretty good. The two are separate, but they are not that far apart." We were leaving the room and we were upstairs. I was talking with somebody about this chart we've been looking at, and saying "Oracle's graph looks like this and Postgres's goes up or down..." and the person I was talking to, Tom Lane I think, said "Look! The line at the top, the better database is Postgres!!"
It wasn't Oracle, and I was kind of shocked, because when I looked at the chart, legends that told me which line was which were so small, that I just assumed that Oracle was the better one, we were kind of close to them, but below them. And it turned out as a shock, that we were the one that was better, Oracle was the one that was worse.
That's an indication of how surprised we've been by these performance measurements. We have received anecdotal information from people saying that Postgres was faster than Excel, but I believe it was not until we saw a systematic test of a really nationally recognized benchmark like AS3AP, that we really started to say "Wow, there really must be something...!"
CL: Where would you position yourself among these three: free software movement (like Richard Stallman), open source movement (like Eric Raymond), interested in free beer (like Linus Torvalds :-)?
BM: I can't compare myself directly to individuals, but I can tell you specifically how we view our software in terms of distribution and legal issues.
We don't have any trouble with people using our software to make money. A lot of us, myself included, spent years developing commercial proprietary software. The reason it was proprietary was because it was made for a limited market. You did custom software. Somebody called you and said "Can you add this program for me?" and there was no sense in open sourcing, because nobody else wanted it. It's just for that customer. It's like making shoes individually. When you make a shoe, you cut the size of a foot, and make the shoe fit the foot. That's exactly the type of work that we did.
Obviously, with a piece of software like a database, which has a much larger market and which is more useful to a large number of people, there are issues of "Should we allow people to take Postgres and distribute it to customers and not give them any of the source code?" We really don't personally have any problem with companies doing that. We just want to see that our software is used.
We don't really care if you gave it to them and you didn't give them the source or did. I personally find that the fewer restrictions, the easier things are to understand, the better. I think that a lot of people don't really understand the implications of, for example, the GPL license because it's a long document. It's very complicated, too; there are a lot of things it doesn't cover completely, so you are left not understanding what's legal to do, or what's not legal to do.
We really don't like restrictions. We believe that our software is good enough that nobody is ever going to want to do that. To do that, we'd really be defeating the purpose of our software in the first place. We are increasing at such a dramatic rate, so even if somebody took the code and added some features to it and tried to close it off, proprietary, and distributed it... I laugh sometimes that people are running very old versions of Postgres, and they are complaining as if we had some kind of problems. We're light years away from what we did a year or two years ago. So we really think that it's just politically bad. We just say "That's fine, just go do it, don't worry about it."
I'm concerned, somehow, if it's going to disappear from the public realm, but if it were to disappear, and somebody were to take it proprietary, people can take the last open copy and continue development from there. So we don't really feel that we need to add additional restrictions to our license.
But from a practical standpoint, we can't change the license. We came with the Berkeley license, (and although legally the license gives us the right) we don't believe we even have the right to change the license under which Berkeley contributed the original code. Most of our developers are quite happy with the license we have.
In effect, there is another practical standpoint. Because our database is extensible, in other words, people can extend our database by adding new types, new functions and so forth, there could be a case where customers would want to take Postgres 7.0 and add some custom data types to handle some geographical, something very complicated that only a few customers are going to need. They may spend a large amount of time developing, adding software on top of that. If they want to do that, they want to ship their product with Postgres. And then if they don't want to distribute the source code, because they want to recoup the amount of money it took to develop all these new functions and all these new data types, well, fine. Go ahead, just use the software. We are confident that eventually, that software will probably become integrated into our code because somebody will get tired of having it maintained.
We really don't fear what is going to happen with our software.
CL: As you have mentioned, PostgreSQL is distributed under BSD license, not GPL, which theoretically means that Great Bridge can take the software and make a proprietary version of it. Do you have such a plan?
BM: Great Bridge is a strong believer in open source. They have stated very clearly that they would never develop proprietary software or closed software. Everything that they do, all the code that they do will be open source. Even, I believe, the manuals that come with Postgres will be available on the Internet, in PDF, for example.
Great Bridge is not interested in generating revenue from software. They wish to generate the revenue from support. We've clearly decided that we do not want to close off, or try to defeat open source. To do anything that would be proprietary would be counted like trying to compete with Oracle, you can't do that.
We believe that we are going to be successful because we are open source. Trying to do closed sourcing anything with Postgres would not be counted as the way we think we are going to succeed. The BSD license allows this to happen. But we honestly believe that if somebody goes proprietary with the product, then eventually they will see the light and will open source their software.
So, instead of using the GPL and requiring them to go through all the effort of understanding the GPL and require that before they do anything, we say "Fine, go ahead, go try and do it," because all the software manufacturers at this point are moving toward open source model and we do not fear somehow a company is going to come along and do the proprietary version. We just can't imagine why they would want to do it.
CL: Great Bridge is invested in by Frank Batten Jr., who also invested in Red Hat when it was still small, but later it turned out to be one of the leading Linux distribution companies. Is Great Bridge going to walk on the same road as Red Hat?
BM: That's what we hope.
When Frank Batten looked at Red Hat, I believe in 1995, obviously in that year Linux was not what it is today. It had a tremendous number of limitations. But Frank Batten's feeling was that if you look at the curve and see where they've gone from very little to this, if they could get this far this quickly, if you looked several years in the future, you imagine it's going to really be something spectacular.
So when Frank talked to his investment group in Landmark Communications that dealt with investments and creating new businesses, his statement was "Go find another circumstance where there is open source software which shows the same kind of promise." They quickly zoomed in on databases, compared various databases, and felt Postgres really had what it took to become a great database. They looked at our curve, at the rate at which we were improving. And although we are not on a par with the large databases in all features right now, there's a sense that a few years down the road, we'll be on a par or superior in capability.
If you look at the benchmarks, we are already superior in a number of ways. So we have people who are moving from Oracle to Postgres now. We get people regularly who are porting their applications to Postgres from other commercial databases. And they are certainly hopeful, so I think a few years down the road, that Postgres will be looking real good.
CL: When is Great Bridge's IPO going to be?
BM: Well, people mention that :-)
Obviously it's been discussed, but we have a lot of work to do first. We need to get our support offering started, that will happen very soon when we offer commercial support. And we really need to develop a reputation in the industry for superior support of customers.
We believe that the other commercial database vendors have done a fairly poor job of supporting their existing customers. If you talk to people who use almost any commercial database, usually it's a lot of money, and they're normally not happy. One of our goals is to build the reputation where our customers say "Wow, that's really the type of company, that's the type of support that I really feel proud to have."
I can tell you, I've dealt with a few commercial support companies in my previous job, where I hung up the phone after talking to someone and I said, "Wow, I don't care what my company's paying for support, it's worth every penny of it. I called the company, 10 o'clock at night, I am in the middle of a crisis, and this guy knows exactly what's on my screen and exactly what to do!" But that's very rare.
We expect to be able to do that. And I believe that once we develop our reputation and strong business, that shows that we *can* make money in this market and that we're a strong support company.
CL: Do you see any changes in the Postgres community because the business world has been joining it?
BM: I'm asked quite often why the Postgres group is so polite. I was asked this by Great Bridge, when they first flew me to Norfolk; I'd never met these people.
I don't really know the answer to that, but all I can tell you is that people who have been involved with other open source software projects like Linux, FreeBSD, or a lot of the other ones, have experienced being exposed to a lot of, sort of hostile, rude, arrogant, mean emails, messages going around in that group. Postgres has never had that. We've always been like a very polite, gentle group just sitting around having a discussion. We are very quiet, we respect everyone's opinion, we do not say mean things to anybody, we do not insult anybody, we try to be very humble in how we work with people. We don't offer rude things as answers to questions.
I'm not sure why that is, but it's a very important part of our success. We have kept the same people involved with Postgres for years. There's a developer page on our web site with the list of developers, there's a list of 20-25 people there. Those developers have been involved in Postgres for at least two years. Because the atmosphere is so polite, we are able to retain people on the list, and we've been able to be very productive in bringing new people into the group.
As far as the actual problems of companies becoming involved, I would say that the core group of six Postgres members are tremendously concerned.
When I was approached to write my book, I had about six publishers contact me to write a book. Prima, MacMillan, Addison Wesley, and a few others. I chose Addison Wesley for, I believe, good reasons. But I realized that if I had this many people interested in publishing a Postgres book, then the introduction of companies involved in Postgres was very soon, it was only a matter of time till companies started getting involved in Postgres.
I sent out a message on Christmas day of 1999, basically stating that I felt that something was going to happen soon. Well, it turned out to be that Great Bridge was really watching this. They actually saw that message because they were lurking, they hadn't been really announcing, but they were reading our emails. And Great Bridge eventually did announce just to the core group to get a feeling directly from us. We basically talked as a group about how we could maintain the integrity, the focus, and the open source nature of Postgres, even if other companies were involved.
I realized that there was a tremendous risk having other companies but my statement to the core group was "Well, we have two choices here. We can either allow companies to come and get involved with Postgres, and try and manage that, make sure we don't make mistakes, or we can tell them to go away. Which do we want to do?" And although many people were concerned about the dangers that could happen, nobody really said that we needed to just tell them to go away. So we basically came to a conclusion that companies were going to be involved because of our success, but we needed to be deliberate and careful about how these companies were involved, and we needed to maintain the open source nature of the project.
So for example, one of the first things we decided was that no more than a few of the core developers could be hired by one company. We clearly stated this to Great Bridge. We did not want a case where they basically just came in and hired everybody, because people outside the group would say "Well, who are we working for now? Is this an open source project, or is this just Great Bridge working on Postgres?" So it was a very deliberate thing to say that only a few people would be involved with Great Bridge. When we announced that Great Bridge was involved to the Postgres general community, we were very clear to indicate that the core group was concerned about how this might affect us all. But we were very interested in hearing from people about potential problems they thought we might have. So we opened up, we basically walked into this expecting problems.
That might be the difference. We walked in expecting a problem, and we were ready to address that problem however we should, and we were open to talk with people.
When I wrote my book, I felt that there might be some adverse feeling from people such as "Here he is an open source developer and he's writing a book, and he's going to get the money from it." I basically stated that to the group and I said, "I'm going to be making a certain amount of money from every book sale. Do people have trouble with that, or is there something like that?" I was basically open to say "Here's the conflict, what do you think? When I write the book, it's going to take months of time. If I'm going to get no money from my job while I'm doing this, I do need to have some money to compensate me for writing the book. I believe by writing this for Addison Wesley that can be done, but do people have trouble with it? I need to know if I need to say no." Then fortunately, all of the people were very supportive. They felt that this was not a tremendous conflict, and that the book would only help Postgres. So we believe that the existence of the companies is only going to help Postgres.
Recently, we had a case where Great Bridge had some miscommunication. Something unusual happened involving Great Bridge and Postgres. We have three core developers hired by Great Bridge. All three core developers had different opinions on how to resolve the problem. One took one side, another took the opposite side, and I was in the middle, so I suggested a vote. I didn't say anything in terms of my opinion. We let the group decide. We were open to tell the larger community and they voted. I went back to the original people and I basically got a consensus from the group of how to handle this project. There was never a sense that we were somehow company men now, because we were hired, obviously we all had different opinions.
In fact, I took the advantage. As unfortunate as it was that some dispute did happen, I was very encouraged by it because it was like all the other disputes we used to have before the company was involved. People would do stupid things all the time. We kind of say that "Well, we think you're going to do stupid things, let's take a vote, and see what not just the six people, the core group think, but the whole group thinks." People would weigh it and we discussed it for days and eventually we came to a decision. There was nothing new here, because the company was involved in this project.
So we don't think that it's going to really add a lot of new problems. What I am encouraged by, frankly, is that it's going to give us a lot more time to do this thing. That's something that Tom Lane and I have felt. We really enjoy doing this, but we didn't have enough time. I did this for four years and I did it for free. Every hour I worked on Postgres, it wasn't a commission, I wasn't making any money. We didn't use Postgres in my work, it had nothing to do with my development. I was on a purely volunteer basis.
When Tatsuo contacted me to come to Japan, I wasn't employed by Great Bridge at that time and I told him that I could not afford the time off from work. What am I gonna tell my boss? I have just finished the book, "Now you say you're going to Japan?!" he needs me to work! So I basically said to Tatsuo "But I'm talking to Great Bridge now, if I become their employee, I'd love to come." Without Great Bridge, I couldn't do it.
We believe that those people are having more time to spend on Postgres. Even if they cause a few little problems here or there, the overall result is that Postgres will be better going forward.
CL: PostgreSQL is not just a great program, but it also has a great documentation base that is online and available for free, obviously including your new book. But I guess not all developers love to write documentation. How do you handle this kind of problem?
BM: In the first two years of Postgres history, which is 1996 to 1998 roughly, there were many complaints that we had very poor documentation. People felt that our documentation was incomplete, it was confusing, they couldn't find the information they wanted.
Thanks go to Thomas Lockhart, one of our core developers, who lives in Pasadena, California. He took on the role of basically overhauling all of our documentation and adding a significant amount of documentation himself within a span, I believe, of about six months. He was able to take documentation that really was not first class, which was really confusing, all scattered, inconsistent, because different people had written it, go through all of it, and convert it into SGML DocBook, which was a very common format that people were using. Then he basically organized it into an administration guide, user guide, reference guide, and so forth.
Since he did that, almost no one has complained about our documentation. As I told you before, people were complaining all the time. So we really have him to thank for doing that tremendous job.
My book is really just an addition to his documentation. The reason I wrote the book is that although we had, thanks to Thomas, very good documentation for programmers, for people who understood databases, for people who have used other databases and need to look up something, we did not have good documentation for somebody who knew very little about databases. Our documentation primarily has a small tutorial which has some information, but it has sort of the look of reference material that assumes you know what you are looking for. The goal of my book was to take somebody who knew nothing about databases (even very little about computers, they had to know how to type, but that was about all I required), to walk them through what databases were, why and how to use databases, and then, how to create a table, how to look up and add data to a table, how to do complex comparisons and join a table to other tables, and how to use transactions. It assumes you know nothing and walks you through the why and how of using a database, and it just so happened that the database was Postgres. It is a very general book as the title says, "PostgreSQL - Introduction and Concepts", it is really designed for somebody who needs a broad overview of the features that are involved. It's not really a reference book as our documentation is.
But to answer your specific question, developers do not like writing documentation. With Thomas's conversion of the documentation to SGML DocBook, because it looks like HTML which is used when you write web pages, we've been fortunate that when people submit patches, they will give us documentation that goes with it. Usually when somebody adds a feature, they've really got to add a little bit somewhere, so we've been very good at asking for documentation when they supply their patches. And they've been very nice about providing it. In case people don't understand English, or don't want to do it, all we ask is "just give us a paragraph and tell us what it does, and we'll take care of merging that into the documentation."
One of the things that we've always been concerned about is that when people add features, we need to keep documentation current with the source code, so we don't want to get the cases where people are adding something and not documenting it. We've been kind of careful about that.
We're really willing to do the work, in the form of Thomas, who has been doing a lot of the work to keep that up to date. So we're happy to do it if somebody doesn't want to do it; if they can give us a description of what it does, we'll take care of finding the places where the documentation should be updated.
CL: What was the biggest news that happened to PostgreSQL this year?
BM: I honestly have to say that the creation of the Postgres support companies would probably be the biggest item of this year. Not only because it shows that we are moving into commercial sites and commercial applications which we've always had, but also because commercial support companies make that more popular, because many companies do not feel they can really use the database until they have a commercial support company behind it.
But in a larger sense, I believe that the creation of the companies and the hiring of some of the developers is going to give us a lot more time to add things to Postgres that we really could not do before. For example, we've always needed to be able to have the ability to do outer joins (a special type of query), and none of us really had the time to add it, because we all had regular jobs. Tom Lane has been able to add outer joins for 7.1, I believe only because he's now working full time on Postgres. And it's true of many of the other individuals. We had been unable to do this type of development. This would only make us better and much faster.
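For readers unfamiliar with the term: an outer join keeps the rows of one table even when the other table has no matching row. The sketch below is only an illustration and does not come from the interview. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the customer and orders tables and the function name are hypothetical.

```python
import sqlite3


def customers_with_orders():
    """Compare an inner join with a left outer join on two tiny tables.

    Assumption: SQLite (in memory) stands in for PostgreSQL here; the
    table names and sample rows are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
        INSERT INTO customer VALUES (1, 'Alice');
        INSERT INTO customer VALUES (2, 'Bob');
        INSERT INTO orders VALUES (10, 1, 'book');
        """
    )
    # Inner join: only customers that have at least one order appear.
    inner = conn.execute(
        "SELECT c.name, o.item FROM customer c "
        "JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
    ).fetchall()
    # Left outer join: every customer appears; missing orders become NULL (None).
    outer = conn.execute(
        "SELECT c.name, o.item FROM customer c "
        "LEFT OUTER JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
    ).fetchall()
    conn.close()
    return inner, outer
```

Calling customers_with_orders() returns both result lists; the customer without an order shows up only in the outer-join result, paired with None.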
CL: What do you think will be the biggest PostgreSQL news next year?
BM: Next year, I envision there will be the completion of the final features that we really need to compete in the larger, commercial software world.
We've already completed the additional support for very long rows (which had some limitations), Write Ahead Log and outer joins, and several other internal features that we needed to add. I believe that next year we'll see the addition of replication, of support for point-in-time recovery in case of disk crash. I think we're going to find a few things that we don't have right now but that are needed by a significant number of sites, not most, but some. We're going to focus on those, and get those resolved.
So basically we'll be able to walk up to any commercial database, and we'll not have any missing items at that point. When that happens, then we can really focus on adding new features that people maybe never even thought of. We've already got a number of features that are fairly significant and new in terms of "object-relational." I think once we can finish some of the larger missing items in comparison, we can start to think about new things that we can do. We think we are going to just continue to fill out our project.
If you look back in the history of Postgres, every time the release comes out, I look at it and say "Wow! this is just the best release we've ever had. Look how great this is! Look how much better it is than the previous releases! It looks like this is just it!!"
And then, the next release comes along, and you're like "Wow! this is really great!" and you look at the one that you thought was really great, and you're like "Oh, that wasn't as great as this one is." So we've learned not to expect to have the release that ends all releases of Postgres. We are slow and steady. Every release gets that much closer and that much closer. There is no final release of Postgres.
But it gets faster, it gets more features, it gets more powerful, or more reliable. I just expect that it will continue, hopefully at an even faster pace than now, now that some of us have even more time to spend on it.
(Bruce kindly managed his schedule to take time for the interview in spite of our sudden request. Thank you very much, Bruce. We hope to see you again, probably when you come to Japan again to launch "Great Bridge Japan"!)
-- ChangeLog Team