This is interesting, because the conventional wisdom of idiots on slashdot continues to be "use postgresql if you need advanced features, but use mysql if all you care about is speed," despite all the head-to-head benchmarks I've seen by third parties showing PostgreSQL to be faster under load. (MySQL's own benchmarks, of course, tend to favor their own product. Go figure, huh.)
Here's the latest, showing postgresql about 50% faster than mysql across a variety of hardware. But where MySQL really takes a pounding is when you add multiple cores / CPUs: MySQL adds 37% performance going from 1 to 4 cores; postgresql adds 226%. Ouch! (This would also explain why MySQL sucks so hard on the Niagara chip on the requests-per-second graph -- Sun sacrificed GHz to get more cores in.)
As even low-end servers start to go multicore this is going to be increasingly important.
Update: PostgreSQL core member Josh Berkus says:
[This] is a validation of the last four years of PostgreSQL performance engineering. It's not done yet ... if the Tweakers.net test had included 16+ core machines you'd have seen PostgreSQL topping out ... but our hackers have done quite well.
Comments
Clustering/multi-master replication, failover, and case-insensitive WHERE clauses.
Multi-master replication: Slony-2 is in development, or there is PGCluster: http://pgfoundry.org/projects/pgcluster
Case insensitive where clause: Use "ILIKE" instead of "LIKE" (although indexing lower(column) and comparing against lower('where_clause') can use an index, whereas ILIKE cannot).
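For example, a minimal sketch of both approaches (table and column names are hypothetical):

    -- ILIKE matches case-insensitively, but forces a sequential scan:
    SELECT * FROM users WHERE name ILIKE 'alice';

    -- A functional index on lower(name) lets the planner use an index
    -- for the equivalent case-insensitive equality search:
    CREATE INDEX users_name_lower_idx ON users (lower(name));
    SELECT * FROM users WHERE lower(name) = lower('Alice');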
Honestly, whatever MySQL can do, PostgreSQL can probably do better - there are very few exceptions.
The MySQL database folder was 8 times the size of the postgresql folder..
The site is perceptibly faster. Love MVCC, pg_autovacuum, no locking required...
http://www.teenwag.com
Teenwag
thanks for an awesome decisive post...
PostgreSQL has a mechanism called MVCC (MultiVersion Concurrency Control), comparable or superior to the best commercial databases. It can do row-level locking: it can lock rows for writing in one session while leaving those rows readable in another session.
This has a disadvantage also: it leads to fragmented data, because an update to a row is actually done on a copy elsewhere in the file/filesystem. This in turn leads to slower sequential reads of tables, due to the architecture of a hard drive.
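A minimal sketch of that locking behavior (table name hypothetical):

    -- Session 1: take a write lock on a row inside a transaction
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;

    -- Session 2: MVCC serves the last committed version of the row,
    -- so this read does not block on session 1's write lock
    SELECT balance FROM accounts WHERE id = 1;

    -- Session 1:
    COMMIT;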
That's fine, but it means I can't use any of the ORM layers written in any language. It means I have to litter my code with SQL statements that are postgres specific (ILIKE or ~*).
This is a simple collation problem, and people have been asking for it for at least five or six years (just query the mailing list), and the postgres team refuses to come up with a case insensitive collation or to include any of the case insensitive data types people have developed. The only excuse they have is that it's not "standard", as if ILIKE or ~* are standards.
Anyway postgres has always been late to the party. Late on replication, late on clustering, late on full text searching, late on a windows port, late on ODBC drivers, late on ADO drivers, and still no case insensitive collations or data types.
This is why people use mysql. They deliver what people want. Yes, you can argue that MySQL's implementations of some of that stuff are crap; yes, you can argue that postgres has other, better features; but you can't deny that mysql delivers what most people want before postgres does.
If you are wondering why people use mysql, that's the reason. For the last decade I have tried many times to use postgres in one of my projects, and every time I run into one obstacle or another. Just last week I abandoned my attempt to try it one more time (with rails) because I can't demand that my users remember the precise capitalization of every piece of text, and I don't want to rewrite activerecord.
"That's fine but it means I can't use any of the ORM layers written in any language. It means I have to litter my code with SQL statements that are postgres specific (ILIKE or *~). "
Perhaps you need to find some better ORMs. The one I work with (built into Django) supports your choice of "contains" (which translates to LIKE in Postgres) or "icontains" (which translates to ILIKE) in its querying syntax.
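Roughly, those two lookups compile to SQL like this (a sketch; table and column names are hypothetical):

    -- name__contains='foo'  -> case-sensitive substring match
    SELECT * FROM authors WHERE name LIKE '%foo%';

    -- name__icontains='foo' -> case-insensitive substring match
    SELECT * FROM authors WHERE name ILIKE '%foo%';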
That's wonderful but what if you are not using django? I am sure django is wonderful and all but some people may prefer ruby, or php, or java, or .net or something.
Also what does using ILIKE do to your speed? Does it use indexes? Would an application where every instance of X='abc' is run as x ILIKE 'abc' run at acceptable speeds?
Anyway, you missed my point. The point is that for at least the last five years this topic has come up again and again on the postgres mailing lists. People wanting to migrate from mysql, ms-sql, sybase, informix etc. have asked for it. The solution is simple. Make a case insensitive collation for postgres. Failing that, make a case insensitive text type (somebody has already done it!!!) and include it in the default distribution.
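For what it's worth, a sketch of how such an add-on case-insensitive text type reads in practice, assuming a type like citext is installed (it is distributed outside the core):

    -- assuming a case-insensitive text type such as citext is installed:
    CREATE TABLE users (email citext PRIMARY KEY);
    INSERT INTO users VALUES ('Alice@Example.com');

    -- matches regardless of capitalization:
    SELECT * FROM users WHERE email = 'alice@example.com';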
Why is the postgresql team so obstinate about this much-requested feature? Does anybody buy the "it's not a standard" line, especially when the next paragraph says "use ILIKE"?
I know I am ranting a bit, but it's only because I WANT to use postgres but am unable to do so because it's lacking a feature I really need. Every time I try to use it I run into some limitation or another. I remember in the days of yore waiting for outer joins and then giving up when deadlines caused me to use another database instead.
I'm not surprised, but then that's actually the right thing to do. I was fairly surprised to see that MySQL had a case insensitive collation enabled by default, causing a slight inflation of results for my application, and I imagine that for many applications it's totally inappropriate.
JOE: "The only excuse they have is that it's not "standard" as if ILIKE or *~ are standards."
Sounds like a good excuse to me.
And on the subject of full-text searching, I read in various places (including the MySQL manual's comments) that MySQL is relatively hopeless for any real full-text search solution. If ticking boxes on the feature sheet is important then I guess MySQL is just fine; otherwise, I suppose people may find the various extensions to PostgreSQL convenient enough to compile.
JOE: "Just last week I abandoned my attempt to trying it one more time (with rails) because I can't demand that my users remember the precise capitalization of every piece of text and I don't want to rewrite activerecord."
This sounds like a combination of user interface shortcomings and inadequate tools to me.
PostgreSQL is extensible, and using some of the rich set of extensions available is expected. Try telling a Perl programmer that a capability isn't good enough because it's in CPAN rather than "the default distribution".
Download and install the extension that does what you need. This has all kinds of benefits over the MySQL "everything goes in core" mentality. For one, you can get an updated module without having to wait for the next release of PostgreSQL. Slony-I is a great example of an add-on: if it had to wait each time for a full PostgreSQL release, the project would not be where it is today.
Case insensitive collations and data types have the advantage of being nearly universal.
Instead we get semi-insulting suggestions of workarounds like "use ~* or ILIKE" (neither of which is an SQL standard) or "use full text searching".
Here is another suggestion for people who want case insensitive where clauses: use mysql, mssql, oracle, firebird, db/2, openbase, frontbase, or any one of the hundreds of databases that have that feature.
"[PostgreSQL] has had a couple of its indexes replaced to allow better performance. Additionally, a few data structures were altered."
Based on this vague statement and the overall text it seems to me that this benchmark was meant to compare processors, not databases. The point is to find out which processor is fastest for each database, not which database is fastest. The results may still be relevant, but this interpretation sounds like an over-generalization at this point.
Also it's worth noting that the benchmark looks mainly at read speed when all data is in the buffer cache. For many applications this is far from the only interesting benchmark for comparing databases. However, it's a good benchmark for comparing processors to each other.
Hehe, check this post: http://forums.mysql.com/read.php?25,93181,93181
Read/write
As far as I can tell, it's mostly simple SELECT statements? So how do the write operations compare?
Benchmarking SQL? Or PHP?
Also, it seems the benchmark was made by requesting a PHP 4.4 page through Apache 2.2 using ApacheBench, which raises the question: are the SQL client implementations in PHP equally good?
The statements
And where are the actual SQL statements? It is important to see these to understand the benchmark.
E.g. did the test contain LIKE statements? And did the test use ILIKE on postgres, so the actual search was the same?
I.e., were the SQL statements "designed" for postgres?
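To make the LIKE/ILIKE question concrete (a sketch; with MySQL's default case-insensitive collation, LIKE already ignores case, so an equivalent postgres query needs ILIKE):

    -- MySQL, default case-insensitive collation: matches 'abc', 'Abc', 'ABC', ...
    SELECT * FROM t WHERE x LIKE 'abc%';

    -- PostgreSQL: LIKE is case-sensitive
    SELECT * FROM t WHERE x LIKE 'abc%';   -- matches only 'abc...'
    SELECT * FROM t WHERE x ILIKE 'abc%';  -- matches 'abc', 'Abc', 'ABC', ...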
Conclusions
There are so many extra and unknown factors in this benchmark that one can't conclude anything from it.
As Jonathan Ellis stated in the original text, MySQL's benchmarks tend to favor their own product. So do mssql's, oracle's, and so forth.
I'd love to see an absolutely independent benchmark of all these. Also of postgres, so you're sure that the guys benchmarking aren't out to prove postgres faster, and are running an unbiased bench.
People tend to get very, ehrrm... religious about their favorite product. If the benchmarker is even half as biased as this original blog post, the entire test result is useless.
Would be nice to see the 2 dev versions against each other...
As a MySQL employee, I think it's fair to say that PostgreSQL has done a fantastic job in focusing on performance-tuning for the 8.2 release, and kudos to Josh and the team for their efforts!
Both RDBMS systems have something to offer for various crowds in need of dissimilar features, add-ons, ease of use, etc. No one at MySQL knocks PostgreSQL. We thrive on the healthy competition that comes out of the open source community.
Anyway, on a general note regarding benchmarks, it is almost always possible to create benchmark results that prove a specific point in favor of one side of an argument. But, in the case of the tweakers benchmarks, I think that the results indeed show a performance degradation for multiple cores/processors that MySQL should look at closely, and I'm sure our engineers will do so.
On a related note, I should note that the MySQL 5.0.30 release included a patch for InnoDB performance which showed a substantial increase in performance on multi-core/processor machines (>20% AFAIK).
Anyway, cheers,
Jay Pipes
Community Relations Manager, North America, MySQL, Inc.
Internal Benchmarks
We wrote some internal benchmark software in .NET, and found postgres 8.0 about 50% faster than mysql (InnoDB), both running on Windows with a .NET client. The test was a stored procedure that simulated creating invoices for multiple companies. There was a table with each company's current invoice number, which was incremented as new invoices were inserted into the general invoice table. We then ran X companies with Y threads per company.
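A minimal sketch of the kind of procedure described (all names hypothetical; the real code was more involved):

    -- per-company counter table and the invoice table itself:
    CREATE TABLE company_counter (company_id int PRIMARY KEY, invoice_no int NOT NULL);
    CREATE TABLE invoice (
      company_id int,
      invoice_no int,
      created    timestamp DEFAULT now(),
      PRIMARY KEY (company_id, invoice_no)
    );

    CREATE FUNCTION create_invoice(p_company int) RETURNS int AS $$
    DECLARE
      v_no int;
    BEGIN
      -- lock the company's counter row; concurrent callers serialize here
      SELECT invoice_no + 1 INTO v_no
        FROM company_counter
        WHERE company_id = p_company
        FOR UPDATE;
      UPDATE company_counter SET invoice_no = v_no WHERE company_id = p_company;
      INSERT INTO invoice (company_id, invoice_no) VALUES (p_company, v_no);
      RETURN v_no;
    END;
    $$ LANGUAGE plpgsql;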
SQL Server screamed at around 1,500 transactions per second, Postgres peaked at around 300, and mysql at 180. The biggest problem with Postgres under Windows was that Windows allocated too much memory to each postmaster process. It is much, much more efficient on a unix platform.
The main difference between Postgres and both mysql and sql server is that it does not collapse under load. When we really stressed things, mysql just gave up, and we have even crashed the server. SQL Server's performance advantage dropped from 5 times faster to around 20% faster.
I recently built a collection of virtual machines (OpenBSD, FreeBSD, NetBSD, Ubuntu, Debian etch, Gentoo) and have been meaning to try postgres under these.
Postgres was around 70 transactions per second on OpenBSD, using 100% CPU on the virtual machine (50% CPU of the actual machine).
MVCC works
It is really noticeable that postgres keeps going under heavy load, whereas the row-level-locking databases slow down a lot from their peaks.
I have not done a benchmark yet with postgres on a unix box vs SQL Server with different clients, but it is noticeable that a postgres system on unix only uses about 60 MB for around 100 connections, which is reasonable vs the gigabyte on Windows.
MySQL Cluster
We had intended to go with MySQL Cluster, since it is by far the best cluster architecture and the backend is rock solid and incredibly efficient.
Unfortunately the front end servers let it down. We tuned it up from around 5 transactions per second (10 Mbit interconnects) to 180 per second (100 Mbit switched) on our internal benchmark.
We did manage to get about 2,000 inserts per second into the cluster using 3 front end machines, and the backend was still under minimal load.
There have been a lot of features added in mysql 5 and it is maturing very quickly.
Our current strategy is to stay with SQL server, try and move to postgres and keep watching the mysql clusters.
Case insensitive collations
Coming from SQL Server, case-sensitive collations are a real pain. However, we have written our own search engine which converts everything to lower case, which means for that part it will be fine, and our primary keys are all int, so again no issue.
The bit that bugs me is that unless you use lower case in all your table/column names, you have to quote them. I prefer WordWord to word_word names. The crazy thing is that the engine converts WordWord to wordword unless it is quoted.
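A quick sketch of that folding behavior (hypothetical tables):

    -- unquoted identifiers are folded to lower case:
    CREATE TABLE WordWord (Id int);    -- actually creates table "wordword"
    SELECT * FROM wordword;            -- works
    SELECT * FROM "WordWord";          -- ERROR: relation "WordWord" does not exist

    -- quoting preserves case, but then you must quote it everywhere:
    CREATE TABLE "CamelCase" (id int);
    SELECT * FROM "CamelCase";         -- works
    SELECT * FROM CamelCase;           -- ERROR: looks for "camelcase"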
utf8 support
The biggest advantage for us in moving is the utf8 support in both mysql and postgres.
I would prefer case-insensitive indexed where clauses, but it will be a very big job porting, and we can probably live with it, since we are looking to translate our table generation and stored procedures into the new database and use one set of code for both databases.
For distributed query processing, they use function shipping rather than data shipping, which benefits performance a lot compared with MySQL Cluster.
What a pity. Postgresql looks so much more mature to me.
Performance depends on your benchmark ;-)
Generally speaking, speed and quality (a.k.a. data integrity) are inversely related in databases. That's why traditionally Oracle was the slowest and MySQL was the fastest. This is also why your database license often forbids you from publishing benchmarks: raw speed makes people think it's a better database.
But as time has progressed and each DB has matured, after they've achieved quality, they've been optimizing things to improve speed without losing features or data integrity.
I've been biased towards MSSQL because it offers great features (especially important years back, but now everybody has stored procs and transactions) and amazing speed. But in today's world, with today's licensing, I could see myself preferring PostgreSQL, or MySQL if I really had to, as they have all the essential features and have moved beyond adding key features into maturing them.
Anyone who says "Postgres has always been late to the party" got to the party pretty damned late themselves.
I remember when MySQL had no support for relational integrity, no support for views, no support for transactions... and while PostgreSQL had all those things, the MySQL team was promoting the idea that all those features were useless performance-impacting cruft that nobody needed unless they were running a bank.
Arguments pro/con about postgres work better if they are rational, factual and emotionless. I think your opinion here is valid, but the presentation is not flattering to you. Don't worry be happy. :-)