
Snowflake is technically far superior to Redshift. The performance and features are somewhere else. Even if you pay more in operating cost (which is not always the case, especially if you count Redshift DBA salary), you get to do what you actually need to do much faster instead of fighting the platform.

Snowflake is also technically going to always be far superior to Redshift, because AWS is a follower, not a leader. Their strategy is to copy what others are doing and build a moat of "but we can do it too, and you already have other things with us".

And if you really care about operating cost, you should be running Trino+Minio or Clickhouse onprem or something as your warehouse.



> Snowflake is technically far superior to Redshift. The performance and features are somewhere else

From the discussions I’ve had about this before, I think I’m in the minority when I say I’m categorically unimpressed by Snowflake’s performance.

Add that to the hideous cost and the world’s most aggressive sales/account management team, and I’ve less than zero desire to ever deal with them again.

> Snowflake is also technically going to always be far superior to Redshift, because AWS is a follower

While this is true, how many businesses actually need or actually utilise the features they’re paying for with things like Snowflake? Sure, it’s got separate storage and compute, but how many places have so much data that they need that? I’ve worked with places that were running multi-node data warehouse clusters that we migrated to a single ClickHouse or Postgres instance, and we got equivalent or better performance for a fraction of the economic and operational overhead.

> And if you really care about operating cost, you should be running Trino+Minio or Clickhouse onprem or something as your warehouse.

Can’t disagree here, although I’d say it shouldn’t just be relegated to operating cost: CH blows most alternatives out of the water and is available in hosted options now. I haven’t used Trino recently. Last time I used it, it was still called Presto, and it was frustratingly slow. Do you know if it’s improved recently?


I think that depends on your target and especially scope.

ClickHouse demonstrates this well - it is incredibly fast and powerful, but also very limited. It has its own dialect of SQL, incompatible with anything else. It doesn't even have a traditional query planner, so you have to be an expert to write fast queries. It has no update, no merge, no CTEs. To get the most out of it, you have to think about data sorting and data types (is this LowCardinality String or just String?). And once you reach a scale that a single node can't handle (which is much more than a single-node Redshift could handle), you manage a stateful cluster, which is a pain compared to storage/compute separated clusters like Presto/Trino over S3.

So ClickHouse is very good if it is in the hands of a dedicated team that is willing to learn CH inside and out and is using it for a particular purpose. But you can't really set up ClickHouse, dump data into it and then send 100 random Data Analysts to "go forth and make reports". The barrier to entry is too high, and the list of things that are different in ClickHouse is just too long.

Snowflake is exactly at the opposite end of "how much knowledge / how much of a dedicated team do I need to make use of this". Dumping data in and giving other teams access is something you can do easily. Scaling isn't a problem, users interfering with each other isn't a problem (you can give them their own compute that suspends when not in use), all the typical "data features" are there, backup is handled via time travel once they delete something they shouldn't, and they don't have to understand data types or what the best partitioning and sorting is for the querying they are planning to do. It even has a fancy web UI with charts and CSV upload.

You do pay a pretty penny for that, but there are many companies with a lot of money and many data problems to solve, so paying more for infrastructure and enabling many less technically skilled users (who are likely SMEs) is worth it.

Redshift sits somewhere in the middle. You still need DBAs, you still somewhat need to think about data layout, you still need to worry about managing backups and about workload management where one user can slow down everyone else, etc. It feels like the worst of all worlds.

In the onprem world, Trino behaves more like Snowflake than ClickHouse, which has its uses - again, if you're looking to build a general platform instead of a dedicated application. It might be worth it even if you never get millisecond-latency queries out of it.


> ClickHouse demonstrates this well - it is incredibly fast and powerful, but also very limited.

Your examples are not entirely correct. ClickHouse introduced CTEs in early 2021. I use them constantly. ClickHouse SQL is turning into a superset of standard SQL, at least for queries. Most SELECT syntax, including window functions, just works. ClickHouse does have updates, but they are asynchronous. This too is being fixed. Synchronous DELETEs will be available this year; UPDATE is next. One big issue, for my money, is distributed joins. They still require a lot of reasoning about data locality.

You are right that ClickHouse requires attention and skills to get the best performance. However, that includes fixed, low-latency use cases like real-time marketing that Snowflake simply does not handle. ClickHouse and Snowflake aren't interchangeable for these use cases.


Oh wow, I missed that. I guess I heard about the CTE somewhere and took dbt-clickhouse not having ephemeral materialization to mean that CH still didn't support it.

I am still somewhat salty about not being able to do GROUP BY 1, 2, 3 :)

> You are right that ClickHouse requires attention and skills to get the best performance. However, that includes fixed, low-latency use cases like real-time marketing that Snowflake simply does not handle. ClickHouse and Snowflake aren't interchangeable for these use cases.

Yup, that was my point.


> GROUP BY 1, 2, 3

It is not ANSI SQL.

Nevertheless, it is already available in ClickHouse as well, under the `enable_positional_arguments` setting, and we are considering making it enabled by default.
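
For anyone unsure what the positional form means: the numbers refer to the 1st, 2nd, ... expressions in the SELECT list. A minimal sketch of the semantics, using Python's built-in sqlite3 rather than ClickHouse (SQLite happens to accept the positional form out of the box). The `events` table and its data are made up purely for illustration.

```python
import sqlite3

# Hypothetical table and rows, only here to illustrate the syntax.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (country TEXT, device TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("US", "mobile", 3),
        ("US", "mobile", 4),
        ("US", "desktop", 1),
        ("DE", "mobile", 2),
    ],
)

# Positional form: 1 and 2 point at the first and second SELECT expressions.
positional = conn.execute(
    "SELECT country, device, SUM(hits) FROM events "
    "GROUP BY 1, 2 ORDER BY 1, 2"
).fetchall()

# Equivalent explicit form, naming the columns.
explicit = conn.execute(
    "SELECT country, device, SUM(hits) FROM events "
    "GROUP BY country, device ORDER BY country, device"
).fetchall()

print(positional == explicit)
```

Both queries return the same grouped rows; the positional form just saves retyping long expressions.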


Ooooh.

Is there a way to make unquoted column and table names case insensitive too? That's like the second biggest annoyance.


Not yet. I will add it to the short list.


> Snowflake is also technically going to always be far superior to Redshift, because AWS is a follower, not a leader.

Redshift was the first cloud data warehouse-as-a-service in the Amazon cloud. Every data warehouse since then has built on that concept. Speaking of innovation, Snowflake depends on object storage, which Amazon basically invented.


Yes, and credit where credit's due: with Redshift, Amazon changed the analytics industry forever, and they changed the whole IT space forever with S3.

But since then, there hasn't been much innovation in the analytics space from them, has there?


Not so sure about that: https://aws.amazon.com/ground-station/. Enabling data collection from the rest of the solar system seems pretty cutting edge to me.



