Hacker News | tillvz's comments

Trust & explainability is the biggest issue here.

We've been building natural language analytics at Veezoo (https://www.veezoo.com/) for 10 years, and what we find is that straight Text-to-SQL doesn't scale. If AI writes SQL directly, you're building on a probabilistic foundation. When a CFO asks for revenue, the number can't be correct just 99% of the time. And you can't expect the CFO to read SQL to verify it.

We're solving that with an abstraction layer (Knowledge Graph) in between. AI translates natural language to a semantic query language, which then compiles to SQL deterministically.

At the same time you can translate the semantic query deterministically back into an explanation for the business user, so they can easily verify if the result matches their intent.

Business logic lives in the Knowledge Graph and the compiler ensures every query adheres to it 100%, every time. No AI is involved in that step.
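To make the split concrete, here is a minimal sketch in Python of the deterministic half: a semantic query compiled to SQL and rendered back as an explanation. It is not Veezoo's implementation; the `SemanticQuery` shape, the `MEASURES`/`ENTITIES`/`FILTERS` dictionaries and the table and column names are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical semantic model: business definitions live here, not in a prompt.
MEASURES = {"Revenue": "SUM(o.amount)"}
ENTITIES = {"Order": "orders o"}
FILTERS = {"today": ("o.order_date = CURRENT_DATE", "placed today")}


@dataclass(frozen=True)
class SemanticQuery:
    entity: str
    measure: str
    filter: str


def compile_sql(q: SemanticQuery) -> str:
    """Pure lookup-and-format: the same semantic query always yields the same SQL."""
    condition, _ = FILTERS[q.filter]
    return (
        f"SELECT {MEASURES[q.measure]} AS {q.measure.lower()} "
        f"FROM {ENTITIES[q.entity]} WHERE {condition}"
    )


def explain(q: SemanticQuery) -> str:
    """Render the same query as a sentence a business user can check."""
    _, phrase = FILTERS[q.filter]
    return f"{q.measure} of all {q.entity}s {phrase}"


# The only probabilistic step would be producing `q` from the user's question.
q = SemanticQuery(entity="Order", measure="Revenue", filter="today")
print(compile_sql(q))
print(explain(q))
```

Both outputs derive from the same object, so the explanation cannot drift from the SQL that actually runs.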

Veezoo Architecture: https://docs.veezoo.com/veezoo/architecture-overview


Thanks for sharing the links, the architectural overview is very insightful.

I'm curious how this approach manages cardinality explosion? Also, how do you handle cases where a user asks for data that requires running multiple queries, specifically where each query depends on the results of the previous one?


> I'm curious how this approach manages cardinality explosion?

The Knowledge Graph explicitly models cardinality and relationships between entities. The compiler uses that to generate SQL that handles it correctly, e.g. by using DISTINCT.
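As a rough illustration of that idea (not Veezoo's actual compiler), here is a Python sketch where relationship metadata decides whether a count needs DISTINCT. The `RELATIONSHIPS` table, its join conditions and the table names are hypothetical.

```python
# Hypothetical relationship metadata: (base, joined) -> (cardinality, join condition).
RELATIONSHIPS = {
    ("orders", "order_items"): ("one_to_many", "order_items.order_id = orders.id"),
    ("orders", "customers"): ("many_to_one", "orders.customer_id = customers.id"),
}


def count_expression(base: str, joined: str) -> str:
    """Pick a COUNT form that stays correct when the join fans out rows."""
    cardinality, _ = RELATIONSHIPS[(base, joined)]
    if cardinality == "one_to_many":
        # Each base row repeats once per joined row, so count distinct keys.
        return f"COUNT(DISTINCT {base}.id)"
    return "COUNT(*)"


def count_query(base: str, joined: str, condition: str) -> str:
    """Count base rows that match a condition on the joined entity."""
    _, join_on = RELATIONSHIPS[(base, joined)]
    return (
        f"SELECT {count_expression(base, joined)} FROM {base} "
        f"JOIN {joined} ON {join_on} WHERE {condition}"
    )


print(count_query("orders", "order_items", "order_items.category = 'Books'"))
print(count_query("orders", "customers", "customers.country = 'CH'"))
```

An order with three book items would be counted three times by a plain COUNT(*) over the first join; the metadata is what tells the compiler to guard against that.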

> Also, how do you handle cases where a user asks for data that requires running multiple queries, specifically where each query depends on the results of the previous one?

Veezoo can generate adaptive plans, so it can decide to wait for a database query to return results before continuing.


Thanks for answering! Regarding cardinality, I was actually thinking more about high-cardinality dimensions on the NLU side, e.g., if a user asks for "Sales for [Obscure Company Name]," and you have 10M distinct customers. Does the Knowledge Graph have to index all those values for the mapping to work?

On the adaptive plans, is that execution logic handled entirely by your deterministic compiler, or does it loop back to the LLM to interpret the intermediate results?


>Does the Knowledge Graph have to index all those values for the mapping to work?

Both options exist. You can index them as entities [1] within Veezoo and keep the mapping automatically synchronized with the database. Or you can decide not to index them, in which case Veezoo will, for example, try to answer the question using string search in SQL.
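A minimal Python sketch of the two paths, under assumed names: `ENTITY_INDEX` stands in for the synchronized entity mapping, and the `customer` table with `id` and `name` columns is hypothetical. It only illustrates the trade-off, not how Veezoo resolves mentions.

```python
# Hypothetical index kept in sync with the database: normalized name -> primary key.
ENTITY_INDEX = {"acme corp": 17, "globex": 42}


def customer_filter(mention: str, indexed: bool) -> tuple[str, list]:
    """Return a parameterized SQL condition for a customer named in a question."""
    key = mention.strip().lower()
    if indexed and key in ENTITY_INDEX:
        # Indexed path: the mention resolves to exactly one known entity.
        return "customer.id = ?", [ENTITY_INDEX[key]]
    # Unindexed path: substring search in SQL. LIKE wildcards inside the
    # mention are not escaped in this sketch.
    return "LOWER(customer.name) LIKE ?", [f"%{key}%"]


print(customer_filter("Acme Corp", indexed=True))
print(customer_filter("Initech", indexed=True))
```

The indexed path gives an exact, unambiguous match at the cost of keeping millions of values synchronized; the fallback needs no index but can match several customers.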

>On the adaptive plans, Is that execution logic handled entirely by your deterministic compiler, or does it loop back to the LLM to interpret the intermediate results?

The plan is done entirely by the LLM. The VQL steps (i.e. fetching answers from the database) within the plan are where the compiler kicks in.
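Here is a small Python sketch of that division of labor. `toy_planner` is a stand-in for the LLM and `toy_backend` for "compile the step and run it on the database"; both, along with the query strings, are invented for illustration.

```python
from typing import Callable, Optional


def run_plan(
    next_step: Callable[[list], Optional[str]],
    compile_and_run: Callable[[str], list],
) -> list:
    """Alternate between planning and execution until the planner stops."""
    results: list = []
    # The planner sees all results so far before choosing the next query.
    while (query := next_step(results)) is not None:
        results.append(compile_and_run(query))
    return results


def toy_planner(results: list) -> Optional[str]:
    """Stand-in for the LLM: step two depends on the answer to step one."""
    if not results:
        return "top_customer_by_revenue"
    if len(results) == 1:
        return f"orders_of:{results[0][0]}"
    return None


def toy_backend(query: str) -> list:
    """Stand-in for the deterministic step: compile the query and execute it."""
    data = {"top_customer_by_revenue": ["Acme"], "orders_of:Acme": [101, 102]}
    return data[query]


print(run_plan(toy_planner, toy_backend))
```

The loop is adaptive because the second query cannot be written until the first result is known, yet every individual query still goes through the same deterministic path.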

[1] https://docs.veezoo.com/vkl/kb-layer/entity/


Don't you still need to unit test and version control the SQL artefact that is produced? You need to be able to see which query was used on which date and how it was validated.

(Prompts need to be version controlled too, of course)


Yes, every SQL query Veezoo runs is logged and visible to admins.

The fundamental artifact is VQL (Veezoo Query Language), which queries against a Knowledge Graph containing your business data model, things like your "Revenue" measure.

A query might look like this:

  var order from kb.Order
  date_in(order.Order_Date, date("#today"))
  var retRevenue = kb.Order.Revenue(order)
  select(retRevenue)

If the business decides to change how revenue is computed, the VQL stays valid but compiles to different SQL. At the same time, Veezoo can test that your Knowledge Graph change doesn't break anyone's dashboard, and can even apply evolutions if needed.
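A toy Python sketch of why the saved query survives a definition change. The graph dictionaries, the `refunded_amount` column and the `broken_queries` check are assumptions for illustration; real VQL and evolutions are considerably richer.

```python
def compile_query(measure: str, graph: dict) -> str:
    """Compile a saved query (here just a measure name) against a graph version."""
    return f"SELECT {graph['measures'][measure]} FROM {graph['table']}"


graph_v1 = {"table": "orders", "measures": {"Revenue": "SUM(amount)"}}
# Hypothetical business change: revenue now excludes refunds.
graph_v2 = {
    "table": "orders",
    "measures": {"Revenue": "SUM(amount - refunded_amount)"},
}


def broken_queries(saved: list[str], graph: dict) -> list[str]:
    """Saved queries that would no longer compile against this graph version."""
    return [m for m in saved if m not in graph["measures"]]


saved = ["Revenue"]
print(compile_query("Revenue", graph_v1))
print(compile_query("Revenue", graph_v2))
print(broken_queries(saved, graph_v2))
```

The saved artifact is unchanged between versions; only the compiled SQL differs, and the check can run before the graph change is deployed.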

VQL: https://docs.veezoo.com/vkl/kb-layer/vql/

Evolutions: https://docs.veezoo.com/vkl/evolutions/

The Knowledge Graph itself is version controlled, so the data team can trace every change.


Completely agree! A semantic layer is essential for scaling analytics to enterprise complexity.

Another alternative here is Veezoo [0], which combines the semantic layer (a Knowledge Graph) and self-service analytics into one integrated solution.

We built it specifically for the analytics use-case for both the "data persona" to manage the semantic layer, as well as for the "business persona" to analyze the data.

If you’re looking for a semantic-layer + (embedded) BI solution right out of the box, this could be a fit.

0 - https://www.veezoo.com


Having an LLM be in charge of business logic is madness.

There cannot be any AI involved when processing the definition of a KPI. Otherwise you'll never be able to roll it out to thousands of users, because there's only ever a 90% (or even 99%) chance that the business logic gets applied correctly.
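To put rough numbers on why a high per-query accuracy still fails at scale, here is a back-of-the-envelope calculation in Python. It assumes queries succeed or fail independently, which is a simplification.

```python
def p_at_least_one_error(per_query_accuracy: float, n_queries: int) -> float:
    """Chance that at least one of n independent queries applies the logic wrongly."""
    return 1 - per_query_accuracy ** n_queries


for n in (10, 100, 1000):
    print(n, round(p_at_least_one_error(0.99, n), 4))
```

At 99% accuracy, 100 queries already carry roughly a 63% chance of at least one wrong answer, and 1,000 queries make a wrong answer a near certainty.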

Check out what we do at Veezoo (https://www.veezoo.com) with the Knowledge Graph / Semantic Layer to mitigate that.


Agree that an approach that more semantically models the data is better, especially when you want to eventually let the non-technical users ask questions.

When you're on a higher abstraction level, it also allows you to make clear definitions (e.g. for certain KPIs) and define business logic that always needs to be applied to get the correct results.

There you don't want to leave it up to chance that a filter gets hallucinated in or out when you ask e.g. about your company's revenue.

At Veezoo (https://www.veezoo.com) we have taken the approach of not going directly to SQL. When a user asks a question, Veezoo first translates it into a query against the Knowledge Graph (which represents the business objects, their relationships, etc.). From there we compile it into a SQL query for the target database (they all have slight differences) without any AI involvement. In this compilation step we also make sure that the business logic is properly applied.
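One example of those slight differences is row limiting. The Python sketch below is not Veezoo's compiler; the `limit_rows` helper and the dialect labels are made up, while the SQL forms themselves are the standard ones for each database.

```python
def limit_rows(select_body: str, n: int, dialect: str) -> str:
    """Wrap 'columns FROM ...' with the row-limit syntax of the target dialect."""
    if dialect == "sqlserver":
        # T-SQL puts the limit right after SELECT.
        return f"SELECT TOP ({n}) {select_body}"
    if dialect in ("postgres", "mysql"):
        return f"SELECT {select_body} LIMIT {n}"
    if dialect == "oracle":
        # Row-limiting clause available from Oracle 12c onward.
        return f"SELECT {select_body} FETCH FIRST {n} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")


body = "name, revenue FROM customers ORDER BY revenue DESC"
for dialect in ("postgres", "sqlserver", "oracle"):
    print(limit_rows(body, 5, dialect))
```

Because the semantic query never mentions LIMIT or TOP, the same question can target any of these databases without the AI needing to know which one is behind it.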





Veezoo (https://www.veezoo.com) is built to make it as easy as possible for nontechnical users to get answers to their ad-hoc questions.

It has followed a conversational, "ChatGPT-like" approach since 2016.

Info: I'm one of the founders.


What I dislike about your pricing is that it advertises a reasonable $29 and then the fine print says minimum 5 users. I get the reasoning behind your pricing logic, but I really dislike it. As a solo business owner, I'm gone.


If you have a single data source that you'd like to use you can even use it for free up to 5 users.


We are following this approach at Veezoo (https://www.veezoo.com).

When Veezoo connects to a database / dwh for the first time, an initial Semantic Layer / Knowledge Graph gets built automatically based on the data itself. We try to recognize how the columns link to other tables, try to identify units, and other semantic information e.g. if something is a "Location" or a "Country" and so on.
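As one guess at what "recognizing how the columns link to other tables" could look like, here is a simple value-containment heuristic in Python. It is an assumption for illustration, not Veezoo's method, and the sample tables are invented.

```python
def infer_links(tables: dict[str, dict[str, list]]) -> list[tuple[str, str, str, str]]:
    """Guess (src_table, src_col, dst_table, dst_col) foreign-key links from data.

    A column is a candidate link when all of its non-null values appear in a
    column of another table whose values are unique.
    """
    links = []
    for src, src_cols in tables.items():
        for col, values in src_cols.items():
            present = {v for v in values if v is not None}
            if not present:
                continue
            for dst, dst_cols in tables.items():
                if dst == src:
                    continue
                for key, key_values in dst_cols.items():
                    unique = len(set(key_values)) == len(key_values)
                    if unique and present <= set(key_values):
                        links.append((src, col, dst, key))
    return links


tables = {
    "orders": {"id": [1, 2, 3], "customer_id": [10, 11, 10]},
    "customers": {"id": [10, 11, 12], "country": ["CH", "DE", "CH"]},
}
print(infer_links(tables))
```

On real data a check like this produces false positives (small integer columns overlap easily), so it would need to be combined with column names, types and sampling before anything is added to the semantic layer.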

The whole conversational "plain English" querying then operates on top of the semantic layer, ensuring business logic (and other governance topics) are always respected.


"So is enterprise conversational BI impossible in 2023? Will there be a few more years of academic papers and company AI hackathon projects before a solution can be deployed in production? We don’t think so." -- from the medium article.

I'd like to draw your attention to https://www.veezoo.com as well: a conversational self-service analytics solution that's been around since 2016, is deployed in production at Fortune 500 companies, and is used by thousands of users daily :)

Congrats on the launch - and also reaching the top of the Spider dataset. We're very familiar with that dataset and its difficulties :)!

Happy to have a chat as well!


Amazing! Feel free to shoot me a line at amir at dataherald


Done :)! Looking forward to it.


Very cool. I'm checking out veezoo.com now!


At Veezoo (https://veezoo.com) we're about to launch a MongoDB connector leveraging Trino, hit me up if you wanna test it.


Hey, I'd love to try this!

