Bruce Momjian on PostgreSQL, Great Bridge, the Future and the Past
Original source: http://lwn.net/2001/features/oreilly2001/BruceMomjianInterview.php3
Eight months ago Maya Tamiya interviewed Bruce for LWN at Linux Conference 2000 in Kyoto, Japan. We spoke to him again at O'Reilly's recent Open Source Convention in San Diego.
Have you had any time for PostgreSQL development since you last spoke with LWN?
Yes. We obviously continue to enhance PostgreSQL pretty much on a continual basis. Since the conference in Japan we have released 7.1, which introduced a number of very significant features. It removed the row length limit and added write-ahead logging and a couple of other pretty sophisticated features. Certainly we have continued to grow and we've seen more and more people follow the project every month.
How have PostgreSQL features for e-business needs, including replication, evolved since then?
We are continuing to fill in the missing parts of PostgreSQL for the enterprise. A rough measure of our progress is the number of relevant things we're missing. About a year and a half ago, when Great Bridge approached us, seven things were missing. With the release of 7.1 that was reduced to three, including replication.
We know that 7.2 will bring that number down further. Hopefully we can eliminate the remaining missing pieces by 7.3.
Would you like to comment on the recent events surrounding NuSphere and MySQL AB?
It is my understanding that until recent history MySQL has always been developed inside MySQL AB, the company in Sweden.
It is also my understanding that NuSphere is hoping to develop a community around MySQL similar to the community that has developed around PostgreSQL. We feel that the open source community around PostgreSQL is a significant part of our success. It has obviously allowed us to add features and generate a much more reliable product than we could do if we didn't have that community.
I have a feeling that part of PostgreSQL's success is fueling the need for NuSphere to try and get a community of developers together around MySQL. Whether they are going to be successful with that is really hard to say.
Interbase and SAP DB are two fairly popular databases released in the open source area. They have had questionable success in developing communities around their open source database offerings.
MySQL has a large community of users but it is a somewhat different challenge to develop a large community of developers around a database. Only time will tell how that will end up. It is obviously a big challenge. If you look at the speed at which PostgreSQL is adding features and the stability of the releases, we just couldn't do that without the community. For MySQL to keep up it's almost a requirement that an open source community evolve and develop around the project.
How would you characterize Great Bridge's relationship with Red Hat today?
I have a funny story for you.
I was down at Red Hat, with Michael Tiemann, their CTO, to give a speech to the Red Hat engineers. I went down there to speak about an article on my web site. The article is about the "prisoner's dilemma" and what that implies for how companies control various resources. The speech was about the challenges of open source and the challenges of companies involved with open source.
I give the speech then fly back to Norfolk to spend a couple of days at Great Bridge. I drive over to Great Bridge. I'm walking in the door about 5:30 in the afternoon. The President of Great Bridge is walking out. He says that about a half hour ago Red Hat announced that they are going to be doing commercial support for PostgreSQL. It was kind of eerie to leave Red Hat then realize that they made this announcement as I'm flying away. It was a kind of surreal position to be in.
I've obviously been to Raleigh-Durham to speak to all the Red Hat engineers. I went up to Toronto where Red Hat's PostgreSQL engineering group is going to be located. I met with Patrick McDonald there and spent a full day going over the development community, how it operates, went over the 'to do' list and how they can get involved. We're obviously very excited to have them adding resources to PostgreSQL.
Great Bridge's feeling about this? It is not a surprise to us that they got involved. We understand the need to move into the larger enterprises. You can't get the kind of additional revenue possibilities large enterprises offer in the desktop environment. We would have liked a larger head start in this market but we understand why Red Hat had to do it. Their coming was anticipated.
Our relationship with Red Hat seems to be pretty good. We get along with them well, they are very nice people. Obviously Frank Batten [Great Bridge's chairman and an early investor in Red Hat] has a relationship with Red Hat. In a sense we also have that prior relationship going into the project.
Our feeling is that the open source database pie is just so big that there is certainly room for both companies. In fact, Red Hat's choice of PostgreSQL is a validation of Great Bridge's involvement. We've got people saying "Oh wow, there's Great Bridge and now there's Red Hat. Obviously the wagons are circling around PostgreSQL." It just shows that this is a very sane space to be in: that a lot of companies see potential. We think that there is a lot of room for everybody.
What do you see as the primary differentiators between Red Hat's offerings and what Great Bridge offers their customers?
There are probably two primary differences.
One is that Great Bridge focuses on PostgreSQL database solutions. Great Bridge has a huge amount of database expertise by hiring three of us [from the PostgreSQL core team] and having already established themselves in this database marketplace. Database solutions is only one part of Red Hat's business. Red Hat also has commercial relationships with DB2 and Oracle. Great Bridge is focused on PostgreSQL. For example, in the web camp Red Hat recently had the question come up of how do you see PostgreSQL operating in relation to DB2 and Oracle. The answer was that Red Hat sees PostgreSQL operating primarily in the department server area. The Great Bridge people see it in a much larger range of uses.
We all know that in the not too distant future PostgreSQL will have the features to go head to head [with commercial databases] in almost every area. Red Hat may be a little constrained in how they can present PostgreSQL because of their prior relationships with DB2 and Oracle.
The second differentiator is that Red Hat is primarily an operating system company. They released PostgreSQL only on Red Hat Linux. Great Bridge has already released support for Solaris. I'm continuing to push Great Bridge to support more platforms.
The span of operating systems that the PostgreSQL development team supports is very wide. We already support fifteen plus platforms. Some of the platforms I'm not even sure what they do, like QNX which is a real time operating system. 98% of the support problems have nothing to do with the operating system at all. These are clients that might be running FreeBSD, IRIX or Tru64 on Alphas. These are potential customers for Great Bridge.
Great Bridge is primarily a database company. Hopefully, support for operating systems other than just Linux will be a good differentiator for us.
When you spoke with LWN last November you spoke about the kind of commercial support program you wanted to build at Great Bridge. A place where someone could get an answer at ten o'clock at night from someone who really had a clue. How is that going?
We have, I don't know the exact number, four or eight support engineers now. We're continuing to scale that up as we get customers.
Fortunately PostgreSQL is such a solid database that we haven't had a lot of crisis problems because we don't really break that much. I do know that our support customers have been very happy and very impressed at how much we know when they do call.
I'm personally impressed with Great Bridge's support engineers. I've been in this for five years. It has taken me a while to get up to speed. I look at the engineers we have. I look at the questions they ask me. When I'm stumped by a question, obviously they've gone pretty far in their knowledge. I'm confident that they are doing a good job.
What we have been doing a little more of than we expected is helping customers migrate their applications to PostgreSQL, design their applications and so forth. We're doing less support than I thought, partially because customers don't have as many problems as anticipated.
We found our customers are really happy to have experts in-house that know the database. In fact, one idea is to have people go out on occasion to visit customers. Then have those same people come back to Great Bridge and be the support engineers for those customers. They actually can help the client get started then come back and continue their relationship with the customer. That's something that very few vendors do. Usually you get the guy that parachutes in, does the work and then you never see him again.
We're hoping that relationship is something else we can give customers. A personal relationship with support engineers who know them, the code and the type of application that they're working on.
How do you feel the increased commercial attention is impacting the PostgreSQL developer community?
We had some challenges when Great Bridge got started. That was a new dynamic for us. It added an interested commercial company that was hiring some of us and issuing commercial releases with the PostgreSQL name. The transition was not too difficult because three of the core developers, including myself, were hired by Great Bridge. The developer community didn't know Great Bridge but they knew us. They knew we were involved with Great Bridge so they had a certain confidence. PostgreSQL, Inc. also got involved so we had two [commercial partners] at that point. Now we have Red Hat, which makes three.
The core group of six had a lot of discussion when Great Bridge came around. We discussed the possible challenges with companies becoming involved. Multiple companies were a topic because PostgreSQL, Inc. was already around. We basically had a plan listing potential problems and how we could resolve them.
Fortunately since Great Bridge became involved there have not been many problems. The community continues to independently issue releases. We continue to add features by and large independent of the companies. The companies help us by hiring us or doing research that we need and suggesting direction. All of the work that the companies are doing funnels through the same community that it has always funneled through.
Within the core developers working for Great Bridge, Tom Lane, Jan Wieck and I sometimes have different opinions. We'll fight it out in public on the list to try and pick the best course of action for PostgreSQL. I think that surprised everybody, "Gee look they work for the same company and they don't even agree." Clearly, we're given a lot of independence.
Great Bridge is not stupid. They understand that the developers are only effective through independent action, and that only through independent action can PostgreSQL remain strong.
Great Bridge is giving us resources but there are very few strings attached. In talking to Red Hat's database group, I think that is going to continue with Red Hat, that they're going to continue focusing on what's best for PostgreSQL. I don't think that they're going to present us with any challenges related to the company part of it.
The challenge right now for everybody is to make PostgreSQL better no matter who puts the money into it. There are very few cases where a company wants some feature and everyone else doesn't. Everyone pretty much agrees on the features we need. It is just a question of getting the job done.
In some sense for years PostgreSQL has been following feature sets that are in Oracle and other large commercial databases. As you get close to covering those feature sets do you see a day coming when PostgreSQL is going to start leading?
That's an interesting point. You're absolutely right.
When we started in 1996 we had serious limitations. We had problems with crashing and so forth. The challenge in the first couple of months was just to prevent it crashing. Then we had problems, for example, with SQL 92 compliance.
We had a number of people who helped. Thomas Lockhart is one of the crew who really put his nose to the grindstone and analyzed what we were missing, what we needed and how to get it done. Similarly with documentation, we used to get a lot of complaints about documentation. Thomas Lockhart, again, put his nose to the grindstone. He did an unbelievable job of overhauling our entire documentation system.
We're now at a point where we have SQL 92 compliance that is probably better than most commercial databases. When we started everyone would say how un-compliant we were. Sometimes people will complain that PostgreSQL doesn't do something like Oracle, DB2 or another commercial database. Now our response is [frequently] that the [SQL 92] standard says it has to do X and that's why PostgreSQL does X. If another database does something else, they're not following the standard. We've really come a long way there.
There clearly is a point where the feature set we're missing is dwindling. We have about three missing guys and maybe a dozen more minor things that we need to add. The way we've dealt with [feature sets] is to have a 'to do' list. We get feature ideas from people who may not have any idea how to code it. They have this cool idea, either that they have seen or that is just unique. We codify that and put it in the 'to do' list. It goes up on the web site that day.
As the features that we are missing get completed we'll start getting more ideas from the community. The community has the tendency to develop very good solutions and meaningful well thought out enhancements. Typically when a vendor adds some sort of extension it's kind of a hokey thing that should have been done a different way.
I do anticipate a time when, thanks to the community and the ideas and the discussion that we have, we will start adding capabilities that nobody else has. We already do have some unique capabilities because some of the SQL 99 work was based on what PostgreSQL already had in the object-relational area.
Yes, we'll probably be a superset of the other databases someday.
You've mentioned three missing key features a couple of times. What are those missing features?
They are replication, point-in-time recovery and transparent removal of unused rows. We currently require a vacuum process to remove unused rows. We'll have a vacuum solution in place for 7.2. We don't know if we're going to have point-in-time recovery in place for 7.2 or not.
Replication is probably the most complicated. Looking at the solutions other vendors have implemented, they really are not very good. Everyone has complaints about the way [their database] does replication. The solutions seem to be ad hoc, very fragile, require a tremendous amount of hand holding or have serious performance limitations.
We realized that we were not going to find leadership in replication solutions from other vendors. With a typical Oracle solution, you have manpower to throw at administration of the replication and the tuning and the tweaking and so forth. In the PostgreSQL environment it just has to work.
We had to dig and find a solution our users were going to be happy with. We knew we needed replication but none of us really had the time to research all the permutations and all the possibilities. The people at Great Bridge stepped up to the plate. They spent a good two months just researching replication. Replication is a very complicated topic. There is a lot of research that's been done on the topic.
We found a solution called Postgres-R which was done as a research project. Fortunately they happened to have used PostgreSQL[-6.4.2] as a prototype to implement it. We looked at a number of other replication solutions but this one really seems promising. I don't think any of the other commercial vendors have used this particular solution. It seems to have very good performance plus it seems to be very self maintaining. It doesn't require a lot of conflict resolution logic and the other sort of maintenance that usually goes along with replication. We're very excited about that.
[The Postgres-R project is from the Information and Communication Systems Group, ETH in Zurich, Switzerland, originally produced by Bettina Kemme, Win Bausch, Michael Baumer, Ignaz Bachman, Gustavo Alonso, and others.]
Great Bridge has put up a web site with all the replication information. There's also a mailing list where people are talking about replication. The hope is that we can get coding on replication in the next month or so. I know Jan [Wieck] is going to be one of the people working on it. We also have some other people at Great Bridge. Obviously we want to get community people involved too. It is going to be a big job. I don't think it is going to be in 7.2. I do think we can get something working in the next six months.
My hope is that once the replication solution comes out it will be trend setting. Something similar to the leadership that [PostgreSQL] multi-version concurrency control has given database vendors.
After those three are covered what's the next big thing on the list?
There are a number of smaller items such as:
. Table spaces: better support for efficiently spreading data across multiple file systems.
. Schemas: the last feature required for SQL 92 compliance.
. Performance: continue looking for ways to improve performance.
. Programming interfaces: improved access for programmers.
. Administration tools: improvement in this area is a big check box for the commercial vendors.
The 'to do' list has several other features that are really challenging. For example, full text indexing is a big item for some people. We had a solution in 7.1 but it was a little hard to install. We have the code, we just need to get it plugged in effectively for people.
People just continue to come up with ideas. That is the beauty of working on PostgreSQL. After nine months or a year of working on PostgreSQL you may run out of nifty ideas. Fortunately there are always other people who are feeding new ideas into the to do list.
The project is never really going to finish. It is continuing to mature.
Would you care to share some comments with our readers about the impact PostgreSQL has had on your life?
Today PostgreSQL is really moving. I get about a hundred twenty messages a day on the mailing list. The activity is just unbelievable and it continues to increase.
I started working on PostgreSQL five years ago on a lark. I was a custom database application developer for law firms. I was bored with my job after seven years. PostgreSQL was something interesting. It helped my C skills get better. I was fascinated with how a database works inside.
I went a few years where I felt guilty about the time I worked on PostgreSQL. I was paid 100% on commission. When I wasn't working I didn't get paid and my family didn't get paid. I justified the hours I spent working on PostgreSQL as important for my future because it increased my skills.
At first I worked an hour or two each day on PostgreSQL. I never anticipated that any of this would happen. Then I was contacted by several publishers looking for someone to write a book about PostgreSQL. I knew something was up.
On Christmas day of 1999, while rocking the baby to sleep, I suddenly realized that if five or six companies want to publish a book on PostgreSQL a big commercial support firm couldn't be far behind. That day, I sent out an email to the PostgreSQL list saying we should be prepared to be very popular. Great Bridge had already been lurking on the PostgreSQL mailing list for about a month.
A month later Great Bridge contacted the PostgreSQL project and told us they wanted to offer commercial support. They brought all the developers to San Francisco. We met in person for the first time ever. It was really amazing to see everybody in one place.
Great Bridge eventually hired three of the core developers, including me. It has been an almost unreal situation for me to be involved in building Great Bridge. I can't imagine what the future will hold. People are asking if PostgreSQL will take on Oracle.
I thought PostgreSQL would take me nowhere. It was an interesting thing to do that had a minimal impact on my family and I enjoyed it. Do what you think is right today and let tomorrow take care of itself.
Is there anything else you'd like to add for our readers?
It is really a pleasure to be here at the O'Reilly Open Source Convention to see open source in such a large and public way. I had a great time yesterday doing a full day PostgreSQL tutorial. Tomorrow we have a full day of sessions on PostgreSQL. It has really been nice to see so many people interested, so many people involved.
Obviously there has been a downturn in open source related stocks but I can tell you for sure that the community is alive and well here and certainly thriving.
Thank you very much. And thank you for PostgreSQL.
Bruce Momjian was interviewed by Dennis Tenney of the LWN staff at the 2001 O'Reilly Open Source Convention.