Hacker News | lsuresh's comments

We built Feldera's engine in Rust: https://github.com/feldera/feldera

There are some solid ideas here that would definitely apply to the IVM engine we're building. I'm curious whether some of these effects could also play a role in faster Rust compilation times (e.g., nopanic)?

We use an in-product profiler (discussed here: https://www.feldera.com/blog/introducing-feldera's-visual-pr...), along with CPU profiles to identify where in the code we're spending time.

Feldera co-founder here. Great discussion in this thread.

Some folks pointed out that no one should design a SQL schema like this, and I agree. But we deal with large enterprise customers and don't control the schemas that come our way. Trust me, we often ask customers if they have any leeway to change their SQL, and their hands are usually tied. As a query engine, we have to ingest data from existing sources (warehouses, lakehouses, Kafka, etc.), which means working with existing schemas.

What follows from that is a big part of the value we add: take your hideous SQL schema and queries, warts and all, run them on Feldera, and you get fully incremental execution at low latency and low cost.

700 isn't even the worst number that's come our way. A hyperscale prospect asked about supporting 4000-column schemas. I don't know what's in that table either. :)


This site is underweighted on OLAP. Columnstores were invented for precisely this use case; nobody in the field wants to normalize everything.

Which brings me to the question: why a rowstore? Are Z-sets hard to manage otherwise?

Another aspect of wide tables is that they tend to have a lot of dependencies, i.e., different columns come from different aggregations, and the whole table gets held up if one of them is late. IVM seems like a good solution for that problem.


Good questions!

Feldera tries to be row- and column-oriented in the respective parts where each matters. E.g., our LSM trees store only the set of columns that are needed, and we need to be able to pick out individual rows from within those columns for the different operators.

I don't think we've converged on the best design yet here though. We're constantly experimenting with different layouts to see what performs best based on customer workloads.
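Since Z-sets came up above, here is a minimal sketch of the idea (the representation and helper names are illustrative, not Feldera's actual implementation): a Z-set is a collection whose rows carry signed integer weights, so a batch of inserts and deletes is just another Z-set, and applying an update is weight-wise addition.

```python
# Minimal Z-set sketch: rows mapped to signed weights.
# +1 means "row present/inserted", -1 means "row deleted";
# adding weights per row is what makes updates compose incrementally.
from collections import defaultdict

def zset(pairs):
    """Build a Z-set from (row, weight) pairs, dropping zero weights."""
    z = defaultdict(int)
    for row, w in pairs:
        z[row] += w
    return {row: w for row, w in z.items() if w != 0}

def zset_add(a, b):
    """Combine two Z-sets by adding weights (apply a delta to a base)."""
    return zset(list(a.items()) + list(b.items()))

# Base table as a Z-set, then a delta that replaces one of its rows.
base = zset([(("alice", 30), 1), (("bob", 25), 1)])
delta = zset([(("bob", 25), -1), (("bob", 26), 1)])
print(zset_add(base, delta))  # {('alice', 30): 1, ('bob', 26): 1}
```

The rowstore-vs-columnstore question is then about how these weighted rows are laid out on disk, not about the algebra itself.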


Thanks for the Feldera shoutout Jim.

For anyone else, if you want to try out Feldera and IVM for feature-engineering (it gives you perfect offline-online parity), you can start here: https://docs.feldera.com/use_cases/fraud_detection/


Start with Postgres and scale later, once you have a better idea of your access patterns. You will likely model your graph as entities and recursively walk it (most likely from your application).

If the goal is to maintain views over graphs and performance/scale matters, consider Feldera. We see folks use it for its ability to incrementally maintain recursive SQL views (disclaimer: I work there).


Agreed. Postgres and recursive CTEs will let you simulate graph traversal, with the benefit of still having a Postgres DB for everything else.
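For anyone who hasn't used them, a minimal sketch of that kind of traversal (the `edges` table and data are made up for illustration; shown here via SQLite from Python, but the same `WITH RECURSIVE` query works in Postgres):

```python
# Reachability over an edges(src, dst) table using a recursive CTE.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO edges VALUES (?, ?)",
                 [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")])

# All nodes reachable from 'a'. UNION (not UNION ALL) deduplicates,
# which also terminates the recursion on cyclic graphs.
rows = conn.execute("""
    WITH RECURSIVE reachable(node) AS (
        SELECT 'a'
        UNION
        SELECT e.dst FROM edges e JOIN reachable r ON e.src = r.node
    )
    SELECT node FROM reachable
""").fetchall()
print(sorted(n for (n,) in rows))  # ['a', 'b', 'c', 'd']
```

In an application you would typically parameterize the starting node and cap the depth for very deep graphs.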


I currently run Firefox Nightly with cross-site cookies disabled and all trackers/scripts blocked. I also run uBlock Origin. Any idea if Privacy Badger is redundant with this setup?



According to [this page](https://github.com/arkenfox/user.js/wiki/4.1-Extensions#-don...), yes, it's redundant in that case.


Thanks for the kind words (Feldera co-founder here). I'll pass it on to the design team. :)


Do give us at Feldera a shot -- full IVM for arbitrary SQL + UDFs: https://github.com/feldera/feldera/


Would love to have our team try this out (we have some ridiculous rust builds).

