
Ask HN: Best way to build a data collection pipeline? - anacleto
I'm building a predictive analytics SaaS tool. I need to collect a massive amount of data coming from my own customers. Do you have any advice, or know of a useful API product for seamlessly collecting event data?
======
asavinov
Bistro Streams is an open-source, lightweight stream analytics engine:
[https://github.com/asavinov/bistro](https://github.com/asavinov/bistro)

It is a general-purpose, highly configurable data processing engine that
can be applied to many workloads and scenarios, including data integration,
data migration, ETL, and big data processing.

------
pastyboy
Snowplow, either the open source version or the managed solution ($$$). Give it a look.
[http://www.snowplowanalytics.com](http://www.snowplowanalytics.com)

------
liberal_098
Kafka is probably the best option in this case.
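
To make that a bit more concrete, here is a rough toy sketch of the model Kafka is built around: producers append events to a log, and each consumer reads at its own pace by tracking its own offset. This is plain in-memory Python, not the Kafka client API. The `EventLog` class, the event fields (`customer_id`, `event_type`, `payload`, `ts`) and the consumer names are all made up for illustration.

```python
import json
import time
from collections import defaultdict


class EventLog:
    """Toy in-memory stand-in for a single topic; NOT the Kafka client API."""

    def __init__(self):
        self._records = []                # append-only list of serialized events
        self._offsets = defaultdict(int)  # next unread position per consumer

    def append(self, customer_id, event_type, payload):
        """Serialize one event and append it; return its offset in the log."""
        record = json.dumps({
            "customer_id": customer_id,   # hypothetical field names
            "event_type": event_type,
            "payload": payload,
            "ts": time.time(),
        })
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=100):
        """Return the next unread events for `consumer` and advance its offset."""
        start = self._offsets[consumer]
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return [json.loads(r) for r in batch]


log = EventLog()
log.append("acme", "page_view", {"path": "/pricing"})
log.append("acme", "signup", {"plan": "free"})

# Two independent consumers each read the full stream at their own pace.
training_batch = log.poll("model-training")
dashboard_batch = log.poll("dashboard")
```

The point of the design is that collection and processing are decoupled: the collector only appends, and a slow consumer (say, model training) does not hold up a fast one. A real deployment would add persistence, partitioning and replication, which is what Kafka provides.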

