
wanshao · 2024-05-07T03:20:44

In recent years, we have seen many challengers to Kafka emerge. Without exception, they all claim to be fully compatible with the Kafka API. Some challengers have even directly forked Kafka's code to innovate further. Has the Kafka API now become the de facto standard in the stream processing domain? How has it achieved this status?

wanshao · 2024-05-06T09:55:53

What is BufStream? Is it a new streaming system?

wanshao · 2024-05-06T07:10:26

Thanks for your advice.

wanshao · 2024-05-06T07:10:03

Thanks, I've joined.

wanshao · 2024-04-30T11:16:59

I agree with your viewpoint. The crux of the matter is not whether to use tiered storage, but what trade-offs a specific storage architecture makes and what benefits it gains in return. Here (https://github.com/AutoMQ/automq?tab=readme-ov-file#-automq-...) is a qualitative comparison chart of streaming systems including Kafka, Confluent, Redpanda, WarpStream, and AutoMQ. The chart has no specific numerical comparisons; it is based purely on their trade-offs at the storage level, but I think it will be of some use to you.

wanshao · 2024-04-30T10:40:52

Buddy, you've hit the nail on the head. Everything is a trade-off. For a stream processing system, I believe it's entirely possible to balance cost, ease of use, and latency. AutoMQ (https://github.com/AutoMQ/automq) is also a streaming system built on top of S3. Its storage scheme introduces a very small EBS volume as a persistent write cache and then asynchronously compacts the in-memory data to S3, which keeps latency in check while retaining the advantages of WarpStream. Tiered storage is just a form; how you implement it is up to you.

chenyang · 2024-04-30T11:18:57

In fact, EBS is entirely a cloud storage solution and operates as shared storage. It is not a local disk system.

wanshao · 2024-04-29T11:34:27

Examples of this are quite common. Many tech company blog pages offer tag-based searches. The articles published can be tagged either by the system or manually, allowing us to filter content we're interested in by searching for these tags.

wanshao · 2024-04-29T11:32:17

That makes sense. Every choice has its pros and cons. Simplicity is the philosophy of HN. If complexity got out of control, HN would cease to exist. However, perhaps some small experiments could be feasible while maintaining simplicity?

wanshao · 2024-04-29T11:29:19

Thank you for sharing. However, I might be looking for more specific categorizations. For instance, if I'm only interested in AWS-related content, I would like to be able to search for the 'AWS' tag.

wanshao · 2024-04-22T09:55:52

The last conversational approach is very practical because it provides a concrete example.