Apache Hudi: an open data lakehouse platform (github.com/apache)
26 points by saikatsg 2 days ago | 5 comments





I've been working with Apache Iceberg for 18 months now. Everyone in the community seems to be moving toward Iceberg, not Hudi.

https://iceberg.apache.org/


Relevant: https://www.theregister.com/2025/01/20/aws_iceberg_support/?...

and also from that same article, further down:

> Late last year, AWS announced S3 Tables, a new type of storage bucket that Warfield described as "a managed Iceberg table. It provides an Iceberg catalog, in which users can create namespaces and tables, each table is a first-class resource. Users can access control policy and security policy on the table itself."


We (at my work) are evaluating Iceberg, Hudi, etc. as a next step in scaling up to support larger OLAP workloads beyond what we can currently manage in Redshift, and we're leaning towards Iceberg. Does Hudi have a niche that it's especially well suited for?

“Real time” is why we picked it at Notion. I am not a data person so I don’t really know. But I am frequently bummed out when AWS or some other vendor ships a cool integration with Iceberg and/or Delta Lake and there’s nothing for Hudi. It definitely feels like the 3rd place finisher in the popularity contest (Iceberg #1, Delta #2).

I mean, Iceberg won enough that Databricks bought them…


Apparently Hudi hit 1.0 last month.


