
Yes, they are similar. ArkFlow is mainly based on DataFusion. Bento actually comes from Benthos. ArkFlow is still at an early stage and no performance comparison has been run yet, but I believe ArkFlow will outperform them in the long run.

Benthos: https://github.com/redpanda-data/benthos

DataFusion: https://github.com/apache/datafusion



What we found with RPCN (Redpanda Connect)/old Benthos is that most systems are very slow, and only CPU-intensive things require manual CPU instruction optimizations, like the Snowflake connector we wrote (https://docs.redpanda.com/redpanda-connect/components/output...). The bulk of it is just about completeness. Go feels like the Perl of the 2020s: cool little libs for just about everything.


Yes, RPCN (Redpanda Connect)/old Benthos is very cool and can handle most use cases. Let me quietly tell you that I am using it too.


Arroyo is another one based on DataFusion


Yes, Arroyo is entirely based on DataFusion, but ArkFlow is not, at least not entirely. In the future, ArkFlow will establish a plug-in ecosystem, allowing anyone to process data through plug-ins, not limited to DataFusion.


This isn't quite correct (I'm the creator of Arroyo). We use DataFusion to implement parts of our SQL support (in particular the planner and the expression interpreter) but we have our own dataflow and operators. By contrast Synnada[0] is directly built on DF.

A contrast between Arroyo and systems like Benthos (and, from what I can tell, ArkFlow) is that Arroyo is a "stateful" stream processing engine, which means that we can support things like windows, aggregates, and joins, with exactly-once semantics and fault tolerance, at the cost of significant additional complexity[1].

[0] https://www.synnada.ai/ [1] https://www.arroyo.dev/blog/stateful-stream-processing
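To make the stateless/stateful distinction concrete, here is a minimal sketch in plain Python. It is not how Arroyo, Benthos, or ArkFlow implement anything; the function names and the event shape are made up for illustration, and it leaves out everything that makes real stateful engines hard (checkpointing, exactly-once delivery, late or out-of-order events).

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical event shape: (timestamp in seconds, key, value)
Event = Tuple[int, str, float]


def stateless_map(events: List[Event], fn: Callable[[float], float]) -> List[Event]:
    """Stateless: each event is transformed on its own, nothing is remembered."""
    return [(ts, key, fn(value)) for ts, key, value in events]


def tumbling_window_sum(events: List[Event], window_secs: int) -> Dict[Tuple[int, str], float]:
    """Stateful: a running sum per (window start, key) survives across events."""
    state: Dict[Tuple[int, str], float] = defaultdict(float)
    for ts, key, value in events:
        window_start = ts - (ts % window_secs)  # align to fixed-size windows
        state[(window_start, key)] += value
    return dict(state)


events = [(0, "a", 1.0), (3, "a", 2.0), (7, "b", 5.0), (12, "a", 4.0)]
doubled = stateless_map(events, lambda v: v * 2)
sums = tumbling_window_sum(events, window_secs=10)
```

The `state` dictionary is the part that would have to be persisted and restored after a crash in a real engine, which is where the extra complexity comes from.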


Arroyo's design is the more comprehensive one.


Sorry, please forgive me for not fully understanding Arroyo.


No worries! We definitely rely heavily on DF (it’s an incredible project!). Part of what makes it so great is its modularity: it’s a toolkit for building SQL systems, which is extremely cool.


Yes, whether it is DataFusion, Arroyo, or Bentos, these open source products have made me profit a lot.


That made me chuckle. May you profit in all senses.



