
it absolutely is.

This company is trying to draw a distinction between online data (real-time, low-latency streaming with no joins and a key/value store) and more traditional batch-processing, OLAP-type configurations.

But modern data warehouses can support both.

https://www.snowflake.com/streaming-data/

I think this is an effort to segment the data warehousing market, give new names to things that already exist, and provide a vocabulary to users who may not be familiar with a company's existing data warehouse solutions.



As a data scientist who uses Snowflake and is in the market for a feature store, I can say that Snowflake streaming is only for data ingestion, not serving. It doesn't solve the problem of serving data to a low-latency app.


How is that not just a feature of some future Data Warehouse though?


Hi, I'm the co-founder and CTO of a feature store startup that is building this on top of Snowflake. Can we chat? My name is Patrick, the website is rasgoml.com, and my e-mail is patrick@rasgoml.com. Thanks!


Online data is not necessarily real-time streaming. I am making a distinction between OLTP workloads for the online applications that need a feature vector (i.e., a row of data) to make an individual prediction, and a client that is creating train/test data from millions of rows of data (features) - that is the OLAP workload.
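To make the two access patterns concrete, here is a minimal sketch using an in-memory SQLite table as a stand-in for a feature store. The table name `features`, its columns, and both helper functions are hypothetical and chosen only for illustration; this is not how any particular feature store is implemented, and it says nothing about real-world latency.

```python
import sqlite3

# Hypothetical feature table: one row (feature vector) per user.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features ("
    "user_id INTEGER PRIMARY KEY, avg_spend REAL, n_orders INTEGER)"
)
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [(i, i * 1.5, i % 7) for i in range(1000)],
)


def online_lookup(conn, user_id):
    """OLTP-style access: fetch a single feature vector by primary key."""
    return conn.execute(
        "SELECT avg_spend, n_orders FROM features WHERE user_id = ?",
        (user_id,),
    ).fetchone()


def offline_training_rows(conn):
    """OLAP-style access: scan every row to build train/test data."""
    return conn.execute(
        "SELECT user_id, avg_spend, n_orders FROM features"
    ).fetchall()


vector = online_lookup(conn, 42)    # one row, used for one prediction
rows = offline_training_rows(conn)  # all rows, used for training
```

The online path touches one row per prediction request, while the offline path reads the whole table in bulk; that difference in workload is the distinction being drawn here.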

To be more concrete, Feast is an open-source feature store built on BigQuery and, originally, BigTable. But the latency of BigTable for the OLTP workload was too high for GoJEK (feature lookup is just one part of making a prediction), so they switched to Redis. Redis PK lookups take a couple of ms on average, compared with 10+ ms for BigTable. What is the latency of a PK lookup on Snowflake? It ain't a millisecond or two. On MySQL Cluster (NDB), our online feature store, PK lookups return with sub-ms latency on dedicated hardware.



