aazo11 · 2024-05-25T16:42:43

Right now the supported vector stores are Chroma (which you can self-host), Pinecone, and Astra. Adding a new vector store is quite easy: you just need to extend the VectorStore class (https://github.com/Dataherald/dataherald/tree/main/services/...) and set it as the vector store module to use in the environment variable (https://github.com/Dataherald/dataherald/blob/main/services/...).
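
To give a rough idea of what "extending the VectorStore class" can look like, here is a minimal sketch. The base class below is a hypothetical stand-in written for this example; the method names `add` and `query`, and the `InMemoryVectorStore` subclass, are assumptions and not the actual Dataherald interface, so check the linked source for the real abstract methods.

```python
from abc import ABC, abstractmethod
import math


class VectorStore(ABC):
    """Hypothetical stand-in for the base class; the real interface may differ."""

    @abstractmethod
    def add(self, collection: str, doc_id: str, embedding: list[float]) -> None: ...

    @abstractmethod
    def query(self, collection: str, embedding: list[float], k: int) -> list[str]: ...


class InMemoryVectorStore(VectorStore):
    """Toy backend (an assumption for illustration): keeps embeddings in a dict."""

    def __init__(self) -> None:
        self._data: dict[str, dict[str, list[float]]] = {}

    def add(self, collection: str, doc_id: str, embedding: list[float]) -> None:
        self._data.setdefault(collection, {})[doc_id] = embedding

    def query(self, collection: str, embedding: list[float], k: int) -> list[str]:
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        items = self._data.get(collection, {})
        # Rank stored ids by cosine similarity to the query, best first.
        ranked = sorted(items, key=lambda d: cosine(items[d], embedding), reverse=True)
        return ranked[:k]
```

A real backend would replace the dict with calls to the store's client library; the subclass is then selected through the environment variable mentioned above.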

aazo11 · 2024-05-25T16:37:37

There are organizations using Dataherald in production right now.

The latency is ~20-30s and it takes some setup, so as long as those are not blockers it can be used in prod.

aazo11 · 2024-05-25T16:34:30

Yes, when you connect Dataherald to a DB it scans it, and you can then run exploratory queries.

momothereal · 2024-05-25T21:28:50

What happens when the tables and columns have cryptic names/acronyms? Do you need to inject documentation?

aazo11 · 2024-05-25T07:31:14

ORMs generally map around entities and dimensions. Users generally ask about metrics and measures, which can be expressed in aggregations and group bys.

How would the NLP+ORM system do this?

aazo11 · 2024-05-25T02:54:42

While the engine response is not accurate all the time, the engine returns a confidence score. We have never encountered a case where a deployment with the necessary training data reported a 0.9 confidence score for incorrectly generated SQL.

aazo11 · 2024-05-25T02:19:30

Tables, columns and views are scanned at configuration time (or based on an API trigger) and stored in the data store and a vector store, not on every run.

They are then retrieved and injected based on relevance to the query.
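
As a rough illustration of the scan-once, retrieve-per-query flow described above, here is a small sketch using SQLite from the Python standard library. It is not how Dataherald implements it: `scan_schema` and `relevant_tables` are hypothetical names, and plain word overlap stands in for the embedding similarity a vector store would provide.

```python
import re
import sqlite3


def scan_schema(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Run once at configuration time: map each table/view to its column names."""
    schema = {}
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')"
    ).fetchall()
    for (name,) in rows:
        # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
        schema[name] = [r[1] for r in conn.execute(f'PRAGMA table_info("{name}")')]
    return schema


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relevant_tables(schema: dict[str, list[str]], question: str, k: int = 2) -> list[str]:
    """Run per query: pick the k tables sharing the most words with the question."""
    words = _tokens(question)
    scores = {
        name: len(words & _tokens(" ".join([name, *cols])))
        for name, cols in schema.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked[:k] if scores[name] > 0]
```

Only the tables returned by `relevant_tables` would be injected into the prompt, which is why the database does not need to be rescanned on every run.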

aazo11 · 2024-05-24T23:01:29

As I wrote on the original thread, we recommend using the RDBMS row-level security features.

This blog post discusses how to do that on Postgres:

https://www.2ndquadrant.com/en/blog/application-users-vs-row...

altdataseller · 2024-05-25T00:28:03

Way, way too complicated. I thought this tool was supposed to make my life easier.

saigal · 2024-05-25T03:23:27

is there an easier way?

altdataseller · 2024-05-25T03:42:02

Yes write SQL

saigal · 2024-05-25T12:48:58

The question was around row level security

aazo11 · 2024-05-24T22:59:31

We recommend users leverage row-level security features built into modern RDBMS so the query results only return data for a given user.

You can read more on how to do that on Postgres here: https://www.2ndquadrant.com/en/blog/application-users-vs-row...
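
For a concrete picture, here is a minimal sketch that builds the two Postgres statements typically needed to turn on row-level security for one table. The helper name `rls_statements`, the policy naming scheme, and the idea that each row stores its owner's database role name in a column are all assumptions for illustration; the linked post covers the case where application users are not database roles.

```python
import re

# Only allow simple lowercase identifiers, since they are interpolated into SQL.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def rls_statements(table: str, owner_column: str) -> list[str]:
    """Return the SQL that restricts rows of `table` to the connected role."""
    for ident in (table, owner_column):
        if not _IDENT.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    return [
        # Without this, policies on the table are not enforced.
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # Rows are visible only when the owner column matches the session role.
        f"CREATE POLICY {table}_per_user ON {table} "
        f"USING ({owner_column} = current_user);",
    ]
```

Note that in Postgres the table owner and superusers bypass these policies by default, so the connection used for generated queries should use a separate, less privileged role.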

throwaway115 · 2024-05-24T23:04:30

Where do you recommend this? It sounds dangerous for databases that do not implement RLS, like MySQL, MariaDB, and SQLite. I think you should highlight that very clearly somewhere.

aazo11 · 2024-05-24T22:22:24

It currently does not, but we are looking to add support. Would love to connect and learn more about your use case.

aazo11 · 2024-05-24T22:04:21

Sure, will reach out to you. Currently Dataherald blocks DML and DDL commands from being generated or executed.
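
To illustrate the general idea of blocking DML/DDL, here is a deliberately naive sketch of a read-only check. It is not Dataherald's actual implementation; `is_read_only` and the keyword list are assumptions, and the check ignores dialect details such as dollar-quoted strings and quoted identifiers.

```python
import re

# Keywords that indicate a write or schema change (illustrative, not exhaustive).
FORBIDDEN = {
    "insert", "update", "delete", "merge", "create",
    "alter", "drop", "truncate", "grant", "revoke",
}

# One pass over string literals and comments, so '--' inside a string
# is not mistaken for the start of a comment.
_NOISE = re.compile(r"'(?:[^']|'')*'|--[^\n]*|/\*.*?\*/", re.S)


def is_read_only(sql: str) -> bool:
    """Accept a single SELECT/WITH statement containing no write keywords."""
    cleaned = _NOISE.sub(" ", sql)
    statements = [s for s in cleaned.split(";") if s.strip()]
    if len(statements) != 1:
        return False  # empty input or several statements chained together
    words = re.findall(r"[a-z_]+", statements[0].lower())
    if not words or words[0] not in {"select", "with"}:
        return False
    return not FORBIDDEN.intersection(words)
```

A check like this errs on the side of rejecting queries (a column literally named `delete` would be refused), and a database role with read-only privileges is a sturdier second layer.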