
Count distinct performance compared on top 4 SQL databases - hglaser
https://www.periscope.io/blog/count-distinct-in-mysql-postgres-sql-server-and-oracle.html
======
interactive_rep
One reason the subquery version runs about twice as fast on most of the
databases (SQL Server, Oracle, and MySQL) is that the database can execute it
in a single pass over the data. Without the subquery it takes twice as long
because it makes a second pass for the count distinct.
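
To make the two query shapes concrete, here is a minimal sketch using
Python's built-in sqlite3. The table names (`dashboards`, `logs`), columns,
and rows are made up for illustration and are not taken from the post. It
only shows that the two forms return the same rows when dashboard names are
unique. It says nothing about timing, which depends on each database's
planner.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dashboards (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE logs (dashboard_id INTEGER, user_id INTEGER);
    INSERT INTO dashboards VALUES (1, 'sales'), (2, 'ops');
    INSERT INTO logs VALUES
        (1, 10), (1, 10), (1, 11),
        (2, 10), (2, 12), (2, 13), (2, 13);
""")

# Shape 1: join first, then COUNT(DISTINCT ...) over the joined rows.
direct = conn.execute("""
    SELECT d.name, COUNT(DISTINCT l.user_id) AS ct
    FROM logs l
    JOIN dashboards d ON l.dashboard_id = d.id
    GROUP BY d.name
    ORDER BY ct DESC, d.name
""").fetchall()

# Shape 2: aggregate the log table in a subquery, then join the small result.
subquery = conn.execute("""
    SELECT d.name, c.ct
    FROM dashboards d
    JOIN (
        SELECT dashboard_id, COUNT(DISTINCT user_id) AS ct
        FROM logs
        GROUP BY dashboard_id
    ) c ON c.dashboard_id = d.id
    ORDER BY c.ct DESC, d.name
""").fetchall()

print(direct)
print(subquery)
```

In the second shape the join only sees one row per dashboard, which is the
kind of rewrite that can let a planner avoid extra work on the large table.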

------
dccoolgai
Crazy to see those numbers on old MSSQL... there's still a whole bunch of
stuff they're behind the curve on (JSON support / array support), but it's
tough to argue with those benchmarks...

------
interactive_rep
Another reason Postgres could be running so slowly is that it may be spooling
the intermediate result to a temp file on disk.

Do you see a performance increase when using SSDs?

~~~
hglaser
We didn't try SSD drives. It's a good idea, worth a shot for a future post.

~~~
stock_toaster
You ran this on RDS? (mentioned in blog post)

I would be worried about external factors, nonstandard/non-default configs,
and other such things impacting your tests.

------
AlterEgo20
Did you use the default Postgres configuration? By default Postgres uses a
very small amount of RAM and is forced to store hash maps on disk, which
causes a very significant slowdown.

------
toong
See related discussion
[https://news.ycombinator.com/item?id=7114310](https://news.ycombinator.com/item?id=7114310)

