aazo11's comments

Right now the supported vector stores are Chroma (which you can self-host), Pinecone, and Astra. Adding a new vector store is fairly easy: extend the VectorStore class (https://github.com/Dataherald/dataherald/tree/main/services/...) and set it as the vector store module in the environment variable: https://github.com/Dataherald/dataherald/blob/main/services/...


There are organizations using Dataherald in production right now.

The latency is ~20-30s and it takes some setup, so as long as those are not blockers it can be used in prod.


Yes, when you connect Dataherald to a DB it scans it, and you can then run exploratory queries.


What happens when the tables and columns have cryptic names/acronyms? Do you need to inject documentation?


ORMs generally map around entities and dimensions. Users generally ask about metrics and measures, which can be expressed in aggregations and group bys.

How would the NLP+ORM system do this?
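To make the metrics-versus-entities point concrete, here is a minimal sketch of the kind of question users tend to ask ("revenue by region"), expressed as an aggregation with a GROUP BY. The `orders` table, its columns, and the `revenue_by_region` helper are all made up for illustration; it uses Python's built-in sqlite3 only so the example is self-contained.

```python
import sqlite3


def revenue_by_region(conn):
    """Return (region, total_revenue) rows, largest total first."""
    return conn.execute(
        "SELECT region, SUM(amount) AS revenue "
        "FROM orders GROUP BY region ORDER BY revenue DESC"
    ).fetchall()


# Hypothetical schema and data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EMEA", 120.0), ("APAC", 80.0), ("EMEA", 30.0)],
)

rows = revenue_by_region(conn)
```

The result is a measure per dimension value rather than a list of entity objects, which is the shape an entity-oriented mapping layer does not produce directly.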


While the engine response is not accurate all the time, the engine returns a confidence score. We have never encountered a case where a deployment with the necessary training data reported a 0.9 confidence score on incorrectly generated SQL.
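One way a caller might use that score is to gate automatic execution on a threshold, roughly as sketched below. The `EngineResponse` type, its field names, and the `should_auto_execute` helper are assumptions for illustration, not Dataherald's actual response schema; the 0.9 default simply mirrors the figure mentioned above.

```python
from dataclasses import dataclass


@dataclass
class EngineResponse:
    """Hypothetical shape of a text-to-SQL response."""
    sql: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def should_auto_execute(response, threshold=0.9):
    """Run the SQL automatically only at or above the threshold.

    Anything below it would be routed to a human for review instead.
    """
    return response.confidence >= threshold


high = EngineResponse(sql="SELECT COUNT(*) FROM orders", confidence=0.93)
low = EngineResponse(sql="SELECT * FROM ordrs", confidence=0.41)
```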


Tables, columns and views are scanned at configuration time (or on an API trigger), not on every run, and are stored in the data store and a vector store.

They are then retrieved and injected based on relevance to the query.
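The scan-once, retrieve-per-query flow can be sketched roughly as follows. This is not how Dataherald implements it: plain token overlap stands in for vector similarity, and `SchemaIndex`, `scan`, and `retrieve` are hypothetical names.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumerics, so customer_id -> customer, id."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class SchemaIndex:
    """Toy stand-in for a data store plus vector store of scanned schema."""

    def __init__(self):
        self._entries = []  # list of (description, tokens)

    def scan(self, tables):
        """Index the schema once. tables maps table name -> column names."""
        self._entries = []
        for name, columns in tables.items():
            description = f"{name}({', '.join(columns)})"
            self._entries.append((description, tokenize(description)))

    def retrieve(self, question, k=2):
        """Return up to k table descriptions that share words with the question."""
        q = tokenize(question)
        ranked = sorted(self._entries, key=lambda e: len(q & e[1]), reverse=True)
        return [desc for desc, tokens in ranked[:k] if q & tokens]


index = SchemaIndex()
index.scan({
    "orders": ["id", "customer_id", "amount"],
    "customers": ["id", "name", "region"],
    "audit_log": ["event", "created_at"],
})
context = index.retrieve("total amount of orders", k=1)
```

Only the retrieved descriptions would be injected into the prompt, so the cost of scanning is paid at configuration time rather than per question.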


As I wrote on the original thread, we recommend using the RDBMS row-level security features.

This blog post discusses how to do that on Postgres:

https://www.2ndquadrant.com/en/blog/application-users-vs-row...


Way, way too complicated. I thought this tool was supposed to make my life easier.


Is there an easier way?


Yes, write SQL.


The question was about row-level security.


We recommend users leverage row-level security features built into modern RDBMS so the query results only return data for a given user.

You can read more on how to do that on Postgres here: https://www.2ndquadrant.com/en/blog/application-users-vs-row...
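As a rough sketch of what that looks like on Postgres, the helper below builds the two statements that turn on row-level security for a table and restrict rows to the tenant named in a session setting. The `rls_statements` helper, the `tenant_isolation` policy name, the `app.current_tenant` setting, and the table and column names are assumptions for illustration. Identifiers are interpolated as-is, so this is only suitable for trusted, hard-coded names.

```python
def rls_statements(table, tenant_column, setting="app.current_tenant"):
    """Build Postgres statements enabling a per-tenant row-level policy.

    The policy compares a text column against a session setting that the
    application is assumed to set on each connection before querying.
    """
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        (
            f"CREATE POLICY tenant_isolation ON {table} "
            f"USING ({tenant_column} = current_setting('{setting}'));"
        ),
    ]


statements = rls_statements("orders", "tenant_id")
```

Note that on Postgres the table owner bypasses these policies unless the table also has FORCE ROW LEVEL SECURITY set, so the role used for generated queries should be a separate, non-owner role.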


Where do you recommend this? It sounds dangerous for databases that do not implement RLS, like MySQL, MariaDB, and SQLite. I think you should highlight that very clearly somewhere.


It currently does not, but we are looking to add support. Would love to connect and learn more about your use case.


Sure, will reach out to you. Currently Dataherald blocks DML or DDL commands from being generated/executed.
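For readers wondering what such a block might look like, here is a deliberately conservative keyword-based sketch. It is not Dataherald's implementation, and `is_read_only` is a hypothetical helper. It is a heuristic rather than a SQL parser, so it can reject harmless queries (for example, one with the word "update" inside a string literal), and a read-only database role remains the more robust safeguard.

```python
import re

# Statements must start with one of these to be considered read-only.
READ_ONLY_PREFIXES = ("select", "with")

# Whole-word DML/DDL keywords that cause a statement to be rejected.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)


def is_read_only(sql):
    """Return True only for a single SELECT/WITH statement with no DML/DDL keywords."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False  # more than one statement
    if not statement.lower().startswith(READ_ONLY_PREFIXES):
        return False
    return FORBIDDEN.search(statement) is None
```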

