AI dev startups are struggling with one problem and I solved it
4 points by kannthu 5 months ago | 11 comments
All startups that are developing:

- AI engineers (Devin, Tusk, Sweep, Fume)

- AI code review automation (Corgea)

- Code comprehension and search (Greptile)

- AI Code generation (Cosine, Sourcegraph)

- AI Incident Response

- and any other tool that requires understanding code and the references within a codebase

... are facing the same problem.

For any given codebase in any given language, they need to extract dependencies, references, and other information from the codebase. They need to have a graph representation of the codebase - to know how functions are called, how classes are used, and how variables are passed around.

It is simple for one or two languages but for 7+? It is a nightmare to solve. For each language, you need to parse, resolve dependencies, and extract information. There are startups that died because they could not add support for a new language fast enough.

By accident, I solved this problem in a way that is language-agnostic and can be used for any language (or at least makes it easy to add a new language).

I want to release it as an API. I am looking for a startup that is facing this problem and would like to use this API.

How the API might work:

1. You send a zip file with the codebase,

2. My system parses the codebase, extracts all the information (functions, classes, variables, references, dependencies, etc.), and builds a graph representation of the codebase,

3. I return the graph representation of the codebase for you to use,

4. Zip file is deleted after processing
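To make the parsing step concrete, here is a minimal single-language sketch. It is not the system described in this post: it uses Python's standard `ast` module, resolves calls by bare name only, and the `build_graph` function and the node/edge field names are assumptions made for illustration.

```python
import ast
import json


def build_graph(source: str, filename: str = "example.py") -> dict:
    """Return {"nodes": [...], "edges": [...]} for one Python source string.

    Hypothetical output shape; calls are matched by bare name only, so
    methods, imports, and aliases are not resolved.
    """
    tree = ast.parse(source, filename=filename)
    nodes, edges = [], []
    for func in ast.walk(tree):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        nodes.append(
            {"id": func.name, "kind": "function", "file": filename, "line": func.lineno}
        )
        # Every direct call by name inside the function body becomes a "calls" edge.
        # Calls inside nested functions are also attributed to the outer function.
        for inner in ast.walk(func):
            if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                edges.append({"from": func.name, "to": inner.func.id, "kind": "calls"})
    return {"nodes": nodes, "edges": edges}


SAMPLE = '''
def helper():
    return 1

def do_stuff():
    return helper() + 1
'''

graph = build_graph(SAMPLE)
print(json.dumps(graph, indent=2))
```

Doing this accurately across files, imports, and several languages is the hard part the post is about; the sketch only shows the general shape of the input and output.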

In the future, there might be a self-hosted version of the API that you can run on your own infrastructure.




Once they have this graph, do you offer some other service to read/view it?

Or better yet, is the graph your API returns in a standardized AI format (idk if one exists tbh) or an open source graph model (e.g. DAG)?

Kudos for solving this problem if you truly did, best of luck going forward!

I will say I highly doubt most companies will be willing to send entire source code zips to you rather than to a more established company. My advice would be to focus on the on-prem version now instead of leaving it as a future development effort.


Not at this time. I only thought about releasing it as an API for companies to use within their own products (my API would only return a JSON file with all of the nodes and edges from the graph).

You are right about the self-hosted version.
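For anyone wondering how a JSON file of nodes and edges might be consumed, here is a rough sketch. The actual schema is not given in this thread, so the `id`, `kind`, `from`, and `to` field names, the sample payload, and the `index_graph` helper are all assumptions.

```python
import json
from collections import defaultdict

# Hypothetical payload; the real field names are not specified in the thread.
PAYLOAD = '''
{
  "nodes": [
    {"id": "test.ts:doStuff", "kind": "function"},
    {"id": "api.ts:handler", "kind": "function"}
  ],
  "edges": [
    {"from": "api.ts:handler", "to": "test.ts:doStuff", "kind": "calls"}
  ]
}
'''


def index_graph(payload: str):
    """Parse the payload and index edges in both directions."""
    data = json.loads(payload)
    nodes = {n["id"]: n for n in data["nodes"]}
    outgoing = defaultdict(list)  # node id -> ids it references
    incoming = defaultdict(list)  # node id -> ids that reference it
    for edge in data["edges"]:
        outgoing[edge["from"]].append(edge["to"])
        incoming[edge["to"]].append(edge["from"])
    return nodes, dict(outgoing), dict(incoming)


nodes, outgoing, incoming = index_graph(PAYLOAD)
print(incoming["test.ts:doStuff"])
```

Indexing edges in both directions makes "who calls this?" and "what does this call?" cheap lookups, which is what most downstream tools would ask first.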


That's quite a claim that, by accident, you solved a problem that entire startups couldn't solve. Can you elaborate?


For context:

I am a co-founder of https://vidocsecurity.com/ - one of our main features is validating security issues. To validate a security issue you need to find the relevant context. Let's say we detected a potential security issue in file "test.ts" in function "doStuff". We would need to find which other functions in the repository reference this function, and do it recursively to build a call tree. Then we use an LLM to validate each branch of the tree to understand whether the issue could be exploitable.
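The recursive lookup described above can be sketched roughly as follows. This assumes the (caller, callee) pairs have already been extracted; every function name other than `doStuff` is made up, and `build_caller_tree` and `branches` are hypothetical helpers, not code from the product.

```python
from collections import defaultdict

# Hypothetical (caller, callee) pairs extracted from a repository.
CALLS = [
    ("handleRequest", "parseInput"),
    ("parseInput", "doStuff"),
    ("runJob", "doStuff"),
    ("doStuff", "parseInput"),  # a cycle, to show the guard below
]


def build_caller_tree(target, calls):
    """Recursively collect the callers of `target` as a nested dict."""
    callers = defaultdict(list)
    for caller, callee in calls:
        callers[callee].append(caller)

    def expand(name, path):
        children = []
        for caller in callers.get(name, []):
            if caller in path:  # already on this branch: skip to avoid infinite recursion
                continue
            children.append(expand(caller, path | {caller}))
        return {"name": name, "callers": children}

    return expand(target, frozenset({target}))


def branches(tree, prefix=()):
    """Flatten the tree into root-to-leaf paths, one per branch."""
    path = prefix + (tree["name"],)
    if not tree["callers"]:
        return [path]
    result = []
    for child in tree["callers"]:
        result.extend(branches(child, path))
    return result


tree = build_caller_tree("doStuff", CALLS)
for branch in branches(tree):
    print(" <- ".join(branch))
```

Each printed branch is one path from the flagged function up to an entry point, which is the unit that would then be handed to the LLM for validation.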

It took me a couple of months to solve the context fetching, and I managed to solve it in a way that is really easy to extend to other languages. At the moment we support TypeScript, JS, Python, Go, Ruby, and Rust. I can add many more languages.

I talked with other founders and understood that what I built might be valuable to other companies as they rely on supporting as many languages as possible. This post is my attempt to understand if it is a real problem or if I just imagined it myself.


One of our constraints was that the context fetching had to be accurate as the whole validation process depended on it.


Is there any code to see, anything we can test to see what you mean? Or has this not been implemented at all?


It is implemented, but it's embedded in the product at https://vidocsecurity.com/. I was thinking about extracting that part and creating a standalone API for it.


Ah, understood. Then I am interested. I would say it may get some interesting traction.


Email me at dawid{at}vidocsecurity.com - we can talk about it more


So.. sourcegraph?


Not really, Sourcegraph is the end product - what I am talking about is giving you an API to create products similar to Sourcegraph.



