Launch HN: Trellis (YC W24) – AI-powered workflows for unstructured data

john_horton · 2024-08-13T23:57:51 1723593471

Very cool - I've been working on an open source python package that lets you do some similar things (https://github.com/expectedparrot/edsl).

Here's an example of the Enron email demo using the edsl syntax/package & a few different LLMs: https://www.expectedparrot.com/content/6607caa1-efc5-439f-85...

the_bosshog · 2024-08-14T12:20:45 1723638045

Thanks for sharing! It handled the emails very well.

olavgg · 2024-08-14T10:44:54 1723632294

That is very cool, thank you for sharing.

john_horton · 2024-08-14T11:15:11 1723634111

thanks! B/c it got some positive reaction here, I did a little thread on how you can turn this flow into an API: https://x.com/johnjhorton/status/1823672992624242895

makk · 2024-08-13T22:21:11 1723587671

> a major commercial bank I work with couldn’t improve credit risk models because critical data was stuck in PDFs and emails.

Great use case! Worked on exactly this a decade ago. It was Hard™ then. Could only make so much progress. Getting this right is a huge value unlock. Congrats!

harryf · 2024-08-14T06:19:58 1723616398

Make sure you have an on-premise option for this type of customer. I've worked at two software companies in Europe with tangentially similar products related to document analysis. On premise is a key requirement.

Even though it's 2024, banks, financial institutions like insurance companies etc. tend to be _very_ cautious with valuable documents involving customers. There are also regional regulations that prevent things like patient data being shared with _any_ 3rd parties. Even one of the big 4 oil companies that I've dealt with as prospective customer - very strict rules requiring on premise solutions.

The good news is many are using things like Kubernetes and OpenShift internally, so it should be possible to port what you do on AWS to on-premise.

throw03172019 · 2024-08-14T16:56:48 1723654608

On-premise will be a lot more difficult than just launching a few pods in Kubernetes. These AI tools (LLMs / vision models) will require some high powered gpus as well.

intelVISA · 2024-08-15T03:48:22 1723693702

On-prem is theater if the OS isn't libre.

ace32229 · 2024-08-14T08:40:04 1723624804

I have just been working through the same problem (though just PDFs). Google DocAI helped enormously after a bit of initial input.

intelVISA · 2024-08-15T03:46:53 1723693613

Who is liable when the ML model hallucinates™ while parsing some critical data?

Better still if it can then become a source of truth for further departures from reality.

macklinkachorn · 2024-08-13T23:18:30 1723591110

Great to hear that you worked saw similar use cases. Doing this before LLMs seem like a big challenge.

bustodisgusto · 2024-08-13T21:24:57 1723584297

We built something tangentially related at SoundTrace.

Basically when we onboard a new client they dump all their audiograms on us as PDFs.

The data needs extraction needs to be perfect because the tables values are used to detect hearing loss over time.

We settled on a pipeline that looks roughly like

PDF -> gpto pre filter phase -> OCR to extract text tables and forms -> things branch out here

We do a direct parse of forms and text through an LLM

Extract audiogram graphs and send them to a foundation convnet

Attempt to parse tables programmatically

-> an audiogram might have 3 separate places where the values are so we pass the results of all three of these routes through Claude sonnet and if they match they get auto approved. If they don’t, they get flagged for manual review.

All in all it’s been a journey but the accuracy is near 100 percent. These tools are incredible

macklinkachorn · 2024-08-13T23:23:19 1723591399

Super cool! This aligns with our experiences. These tools are great and can get to near 100% of accuracy but it's quite a lot of work on the Eng side to get it there reliably.

icey · 2024-08-13T16:08:07 1723565287

Great idea. I used to work at Instabase, which you probably compete with. The better you are at dealing with dodgy PDFs and document scans, the more valuable this will be to big banks, shipping companies, etc.

macklinkachorn · 2024-08-13T17:22:32 1723569752

Thanks! Always surprised to see how many dodgy PDFs and scans there is in enterprises.

cs702 · 2024-08-13T15:59:21 1723564761

Congratulations on launching!

Trellis looks amazing... but only if it works well enough, i.e., if the rate of edge cases that trip up the service consistently remains close to 0%.

Every organization in the world needs and wants this, like, right now.

If you make it work well enough, you'll have customers knocking on your door around the clock.

I'm going to take a look. Like others here, I'm rooting for you guys to succeed.

macklinkachorn · 2024-08-13T16:03:38 1723565018

Maintaining the right level of accuracy across different domain is quite hard and something we spend a lot of time on. The accuracy bar tends to be quite high for financial services so we're adding some validation steps and checking to make sure any errors get caught beforehand.

cs702 · 2024-08-13T16:10:39 1723565439

Thank you. Yes, I'm not at all surprised to hear that.

The biggest challenge I see for you guys is that your best customer prospects, i.e., those organizations which need this most urgently and are willing to pay the most for it are the ones already spending gobs of money to do it with human labor because mistakes are too costly, so they need at least human-level performance.

As you know, current-generation LLMs/LMMs are not yet reliable enough to do it on their own. They need all the help they can get -- sanity data checks, post-processing logic, ensembles of models, organization into teams of agents, etc., etc. -- I'm sure you're looking at all options.

Absent human beings in the loop, you're at the frontier of LLM/LMM research.

If you pull it off, you'll make megabucks.

shcheklein · 2024-08-13T17:22:04 1723569724

Hey, congrats! Are you competing / is there some overlap / what are the key differences with Roe AI (YC W24) - roe.ai (just launched recently on HN https://news.ycombinator.com/item?id=41202694 as well).

jackylin · 2024-08-13T18:36:43 1723574203

Jason and Richard from Roe AI are amazing people! We were in the same YC batch and section. Excited for what Roe AI is building and their focus on building a new type of data warehouse.

At Trellis, we're focused on building the AI tool that supports document-heavy workflows (this includes building the dashboard for teams to review, update, and approved results that were flagged, reading and writing directly to your system of record like Salesforce, and allowing customers to create their own validations around the documents).

iudexgundyr · 2024-08-14T06:46:43 1723618003

Interesting! One quick question, how did you validate your data and ensure its correctness, since the ground truth is unstructured?

sbarre · 2024-08-14T12:04:16 1723637056

Not OP but based on their writeup it sounds like you do need to provide at least a target schema, so what data you need or expect to extract from the unstructured input.

I assume that in the validation step if you don't get all those data points, then that routes to an error state for further review or something.

macklinkachorn · 2024-08-14T17:24:28 1723656268

The users specify the schema and output format and a validation rule and we make sure the system adheres to that.

artembugara · 2024-08-13T17:03:31 1723568611

Hey folks. Congrats on the launch.

Everyone here knows that it's a really big problem that no one has nailed yet.

My 2 cents:

1. It took us (newscatcherapi.com) three years to realize that customers with the biggest problems and with the biggest budgets are the most underserved. The reason is that everyone is building an infinitely scalable AI/LLM/whatever to gain insights from news.

In reality, this NLP/AI works quite OK out of the box but is not ideal for everyone at the same time. So we decided to do Palantir-like onboarding/integration for each customer. We charge 25x more, but customers have a perfect tailor-made solution and a high ROI.

I see you already do the same! "99%+ accuracy with fine-tuning and human-in-the-loop" is what worked great for us. This way, your competitor is a human on payroll (very expensive) and not AWS Tesseract.

Going from 95% to 99% is just a fractional improvement, but it can be "not good enough" to a "great solution" change that can be charged differently.

2. "AI-powered workflow for unstructured data" what does it even mean? Why don't you say "99%+ accuracy extraction"? It's 2024, everyone is using AI, and everyone knows you need 2 hours to start applying AI from 0. So don't lower my expectations.

macklinkachorn · 2024-08-13T17:19:27 1723569567

Appreciate the note.

1. I completely agree. Last-mile accuracy is crucial for enterprise buyers, and the challenge isn't just the AI. It's about mapping their business logic and workflows to the product in a way that demonstrates fast time to value.

2. Thanks for the feedback. We're still refining the messaging and don't want to be overly focused on just the extraction aspect. Do you think positioning it as ETL for unstructured data or high-accuracy extraction for enterprises might work better?"

artembugara · 2024-08-13T17:31:52 1723570312

2. I think that "AI" and "unstructured data" sounded "cool" 5 years ago :)

I'd be mindblown if you said, "We turn PDFs into structured data with 99.99% accuracy. Here is how:"

And then tell me about fine-tuning human-in-the-loop stuff.

EarlyOom · 2024-08-13T18:21:42 1723573302

We've been building something similar with https://vlm.run/: we're starting out with documents, but feel like the real killer app will involve agentic workflows grounded in visual inputs like websites. The challenge is that even the best foundation models still struggle a lot with hallucination and rate limits, which means that you have to chain together both OCR and LLMs to get a good result. Platforms like Tesseract work fine for simple, dense documents, but don't help with more complex visual media like charts and graphs. LLMs are great, but even the release of JSON schemas by OpenAI hasn't really fixed 'making things up' or 'giving up halfway through'.

rahimnathwani · 2024-08-13T17:33:54 1723570434

I've had do some of this recently, as a one-off, to extract the same fields from thousands of scanned documents.

I used OpenAI's function calling (via Langchain's https://python.langchain.com/v0.1/docs/modules/model_io/chat... API).

Some of the challenges I had:

1. poor recall for some fields, even with a wide variety of input document formats

2. needing to experiment with the json schema (particularly field descriptions) to get the best info out, and ignore superfluous information

3. for each long document, deciding whether to send the whole document in the context, or only the most relevant chunks (using traditional text search and semantic vector search)

4. poor quality OCR

From the demo video, it seems like your main innovation is allowing a non-technical user to do #2 in an iterative fashion. Have I understood correctly?

macklinkachorn · 2024-08-13T17:49:05 1723571345

We face similar challenges you listed and handle all of the above. 1. Out of the box OCR doesn't perform as well for complex documents (with tables, images, etc.). We use vision model to help process that documents. 2. Recall (for longer documents) and accuracy are also a major problem. We built in validation systems and references to help users validate the results. 3. Maintain this systems in production, integrate with the data sources and refresh when new data comes in are quite annoying. We manage that for the end users. 4. For non-technical users, we allow them to iterate through different business logic and have a one unify place to manage data workflows.

efriis · 2024-08-13T19:51:23 1723578683

Would recommend using the updated guide here! That link is from the v0.1 docs. https://python.langchain.com/v0.2/docs/how_to/structured_out...

OOC which openai model were you using? Would recommend trying 4o as well as Anthropic claude 3.5 sonnet if ya haven't played around with those yet

rahimnathwani · 2024-08-13T20:01:42 1723579302

Thanks.

I was using gpt-3.5-turbo-0125. It was before the recent pricing change.

But I have a bunch of updates to make to the json schema, so will re-run everything with gpt-4o-mini.

Sonnet seems a lot more expensive, but I'll 'upgrade' if the schema changes don't get sufficiently good results.

efriis · 2024-08-13T20:06:58 1723579618

Nice. Could also give haiku a try!

mistursinistur · 2024-08-13T18:05:14 1723572314

FWIW I've seen noticeably better results on (1) and (4) extracting JSON from images via Claude, although (2) and (3) still take effort.

rahimnathwani · 2024-08-13T18:15:12 1723572912

Thanks for sharing.

I'm curious about what types of source documents you tried, and whether you ever suffer from hallucinations?

natural1 · 2024-08-13T16:12:47 1723565567

Has Trellis explored partnerships or integrations with major ERP systems or existing ETL pipelines? The ability to seamlessly fit into existing enterprise architectures could be a significant competitive advantage and a compelling value proposition for large enterprises looking to modernize their data infrastructure.

atak1 · 2024-08-13T15:39:49 1723563589

Congrats on launching! Wish we had this years ago at Flexport for our ops / science teams. Traditional ML approaches are expensive, and the idea of defining your final shape of data and automating the ETL process is the best abstraction out there.

Rooting for you guys!

macklinkachorn · 2024-08-13T15:47:53 1723564073

Glad to see people experiencing similar problems! We previously spend way too much time building and maintaining document processing pipeline that doesn't really scale.

skeptrune · 2024-08-13T17:25:31 1723569931

Both fulltext (BM25 or SPLADE) and dense vector search have issues with documents of different lengths. Part of what makes recursive sentence splitting work so well are its length normalization properties.

Filters are a really important feature downstream of that which this system can provide.

We have also worked with the Enron corpus for demos and fast, reliable ETL for a set of documents that large is more difficult than it seems and a commendable problem to solve.

Exciting stuff!

macklinkachorn · 2024-08-13T17:56:11 1723571771

Thanks! We also start to see the patterns where search systems are being improved with filters and hierarchy level metadata. Another use case that people use Trellis for is ingesting data into their downstream LLMs applications.

cellu · 2024-08-14T09:22:47 1723627367

Really cool project! I'm doing something similar at a very small scale for my personal project using TypeChat with Zod (https://github.com/microsoft/Typechat) and Unstructured (https://unstructured.io/)

macklinkachorn · 2024-08-13T18:24:28 1723573468

Getting a lot of love from HN so the demo site and data processing might slow down by quite a bit. We're fixing it right now!

aiden3 · 2024-08-13T18:02:48 1723572168

What about a pdf with many separate datapoints on it?

For instance, I have 100 pdfs, each with 10-100 individual products listed (in different formats).

I want to create a single table with one row per product appearing in any of the PDFs, with various details like price, product description, etc.,

From what I can tell from the demo, it seems like 1 file = 1 row in Trellis?

macklinkachorn · 2024-08-13T18:09:54 1723572594

Good question and we have seen this extraction workflow a lot in financial services. We just added table mode to the product (select table in transformation parameters) where we extract table structure in the documents that match that schema. So 1 file map to N rows where N is all the row in the table.

seanw265 · 2024-08-15T16:32:58 1723739578

I'm having trouble finding table mode in the demo you linked. Where can I find it?

atak1 · 2024-08-13T18:55:23 1723575323

Just did an extraction and table mode targets this rly well :)

ellis0n · 2024-08-15T17:56:06 1723744566

A good project, I think it will help many projects that are in chaos due to an overload of information. But many projects are in chaos created by managers with chaos in their heads. This will only add more chaos to such projects.

serjester · 2024-08-13T17:02:58 1723568578

It seems like your business strategy is contingent on foundational model providers not improving their product on a couple dimensions: price, grounding accuracy and file handling. This is a risky strategy, especially in such a competitive market. Wishing you the best of luck.

hamsterbooster · 2024-08-13T17:13:01 1723569181

I would say the opposite. We want to make sure that we build our systems in a way that it get better as foundational model becomes better.

Our thesis is that foundational models will become good and affordable enough to be used in almost all data processing pipelines. We build systems on top of that to manage workflows, integrations, and data applications that people may want to develop.

CuriouslyC · 2024-08-13T17:21:15 1723569675

Seems like you want foundational models to become better at doing what you want when you give it your "magic" prompt, while not becoming smart enough to not need your magic prompt at all.

I'd need to actually dig into your product to make an informed statement but my guess is that if you build your business around AI secret sauce you're going to get your business eaten and pivot or fail, and if you build your business around a UI and specific integrations/tools real customers you're already in contact with want right now, you'll be ok.

purplepatrick · 2024-08-13T17:40:15 1723570815

Two quick questions: any plans on being hipaa compliant? Probably one of the biggest use cases for this is in health insurance, etc.

How do your capabilities compare to Google Document AI or Watson SDU? Also what about standalone competitors such as Indico Data or DocuPanda?

macklinkachorn · 2024-08-13T18:28:54 1723573734

Yes, HIPAA compliance is on the roadmap and should be out in a few weeks. We spent a lot of time on healthcare/sensitive data use cases.

Google Document AI and Watson SDU seem to be an afterthought for IBM/Google. The accuracy and configurability often fall short when you want to use them in a production setting.

Comparing to other legacy document processing companies, I think there are a few areas where we differentiate:

1. We handle end-to-end workflows from integrating with data sources, defining the transformation, and automatically triggering new runs when there’s an update to the data. 2. We built our entire stack on LLM and Vision transformers and use OCR/parser to check the results. This allows the mapping and tasks to be a lot more robust and flexible. 3. We have validations, reference checking, and confidence score metrics that enable fast human-in-the-loop iteration.

dmahanta · 2024-08-13T17:32:43 1723570363

Didnt work for me as expected

macklinkachorn · 2024-08-13T17:38:25 1723570705

Please let me know the issues and happy to get it set up correctly for you. I'm at mac@runtrellis.com

MoritzWall · 2024-08-13T18:01:22 1723572082

> And many companies today want data preprocessing in ETL pipelines and data ingestion for RAG.

I'm curious, have you (or your customers) deployed this in a RAG use case already, and what have been the results like?

bitshaker · 2024-08-13T16:11:33 1723565493

Digitizing and organizing old document scans for birth, marriage, and death records would be a huge win for genealogy research. The Mormon church would be a great customer for you.

meiraleal · 2024-08-13T17:31:06 1723570266

For them and all other 50 AI PDF scanning wrappers that were featured on Show HN in the past month.

chrisweekly · 2024-08-13T17:19:52 1723569592

disclaimer: I'm a barely-informed layperson, not any kind of AI expert

non-snarky genuine question: is "generate structured data from unstructured data using AI" intended to be a moat or differentiator?

catalyst for my question: I just read about this capability becoming available from other AI vendors, e.g.

https://openai.com/index/introducing-structured-outputs-in-t...

constantinum · 2024-08-13T17:52:38 1723571558

That is only part of the problem; the others include:

1. writing connectors for various sources

2. writing connectors for destination

3. supporting multiple models, embeddings, vector database, text extractors

3. workflow automation engine(cron jobs)

4. performance tuning for speed and costs

5. security and compliance

macklinkachorn · 2024-08-13T18:01:00 1723572060

Totally! The structured extraction from AI is only a small part in the product. Beyond the list above we also built 1. Custom validation that allows end users to validate outputs with their own logic 2. Manage different workflows (monitoring, scaling) and keep track of business logic in processing different data sources.

usehexus · 2024-08-13T22:05:50 1723586750

Congrats on the launch! This is a great idea! Many usecases.

szawinis · 2024-08-17T00:50:11 1723855811

Super cool! This really is big problem that's waiting for someone to have a solution that fully nails it.

sidcool · 2024-08-13T15:36:51 1723563411

Congrats on launching. What model or AI you use underneath?

macklinkachorn · 2024-08-13T15:49:55 1723564195

We use a combination of fine-tune LLMs models that're specialized at extraction, data validation and parsing and large foundational models for more general reasoning tasks.

Model routing architecture has been quite interesting to explore.

andrethegiant · 2024-08-13T16:18:49 1723565929

Have you tried the Structured Output feature that OpenAI released last week?

SebaSeba · 2024-08-14T11:17:17 1723634237

I'd be curious to know the answer to this also.

hubraumhugo · 2024-08-13T18:59:44 1723575584

You mention validation and schema guarantees as key features for high accuracy. Are you using an LLM-as-a-judge combined with traditional checks for this?

macklinkachorn · 2024-08-13T19:18:34 1723576714

Yes, we combine LLMs as a judge with traditional checks like reverse search in original data sources, defining your own post-processing logic, and simple classifier for confidence score.

EarlyOom · 2024-08-13T18:29:56 1723573796

Curious how this compares to platforms like https://unstructured.io/

macklinkachorn · 2024-08-13T18:44:54 1723574694

Unstructured seems to be focusing a lot on the document chunking and data ingestion into RAGs part. Trellis handles the process end-to-end from extraction to transforming the data into the schema that you need for downstream applications.

The way unstructured built their parsing and extraction are mostly based on traditional OCR and rule based extraction. We built all preprocessing pipeline in an LLM and vision model first way that allows us to be flexible when the data is quite complex (like tables and images within documents).

inglor · 2024-08-13T18:13:03 1723572783

I don't understand why you need an LLM for this, wouldn't a simple NER + entity normalization do this at a fraction of the cost?

(congrats on the launch!)

macklinkachorn · 2024-08-13T18:35:20 1723574120

NER is good for really simple things (like getting names, addresses, etc.).

A lot of the use cases that we see, like extracting data from nested tables in 100-page-long private credit documents or flagging transactions and emails that contain a specific compliance violation, are impossible to do with NER.

NER is good for really simple things (like getting names, addresses, etc.).

A lot of the use cases that we see, like extracting data from nested tables in 100-page-long private credit documents or flagging transactions and emails that contain a specific compliance violation, are impossible to do with NER.

With Trellis, the idea is taht you can write any mappings and transformations (no matter how complex the tasks or the source data are).

jackylin · 2024-08-13T18:29:53 1723573793

Good question—NER and entity normalization work well for documents that have been standardized (e.g., IRS 1040a tax forms). However, the moment something slightly changes about the form, such as the structure of the table, the accuracy of NER drops dramatically.

This is why logistics companies, financial services, and insurance firms have in-house teams to process these documents (e.g., insurance claims adjusters) or outsource them to BPOs. These documents can vary significantly from one to another.

With LLMs fine-tuned on your data, the accuracy is much higher, more robust, and more generalizable. We have a human in the loop for flagged results to ensure we maintain the highest accuracy standards in critical settings like financial services.

doctorpangloss · 2024-08-13T16:36:18 1723566978

> At the Stanford AI lab where we met... 80% of enterprise data is unstructured, and traditional platforms can’t handle it

You guys came out of an academic lab, so you must know that hypothesis fishing expeditions are not viable.

> ... a major commercial bank... couldn’t improve credit risk models because critical data was stuck in PDFs and emails.

In this example there will be no improvement to the risk model or whatever, because 19/20 times there will be no improvement. In an academic setting this is seen as normal, but in a business setting with no executive champions, only product managers, this will be seen as a failure, and it will be associated with you and your technology, which is bad.

Unfortunately these people are not willing to pay more money for less risk. What they want is a base consulting cost (i.e., a non-venture business) to identify the lowest risk, promotion worthy endeavor, and then they want to pay as little as possible to achieve that. In a sense, the kind of customers who need unstructured data ETLs are poorly positioned to use such a technology, because they don't value technology generally, they aren't forward looking.

Assembling attractive websites that are really features on top of Dagster? There's a lot of value in that. Question is, are people willing to pay for that? Anyone can make attractive Dagster UIs, anyone can do Python glue. It's very challenging to differentiate yourselves, even when you feel like you have some customers, because eventually, one of those middlemen at BankCo are going to punch your USP into Google, and find the pre-existing services with huge account management teams (i.e., the hand holding consulting business people really pay for) that outpace you.

mritchie712 · 2024-08-13T16:45:59 1723567559

> 80% of enterprise data is unstructured

I've seen quotes like this many times. It's silly. I worked at a big bank for over a decade. 95% of the data we cared about was already in a SQL database. Maybe ~80% of our data was "unstructured", but it wasn't stuff we cared about for risk management or other critical functions.

> people are not willing to pay more money for less risk

I'd disagree here. Banks are willing to pay money to reduce risk, it's just unlikely to come from scraping data out of PDFs with an LLM because they've already done this if it's worth it.

wjnc · 2024-08-13T18:12:22 1723572742

Who, in your example, put the customers financial data in the SQL database? Because in my part of finance that’s either the customer, or an employee.

Our customers are asking for integration with a lot of their systems (say HR / patrolling), but never ever offer to hook up their accounting system. If we want financial data, we either get a PDF with their audited financial statement or in exceptional cases a custom audited statement (you know, the one where a print of a part of the ledger gets a signature from the CPA for a not insignificant bill).

So I am enthusiastic from a data science point of view. Financial data processing of customer data is / was scarce since limited to what was feasible to manually process. That is nearly in the past.

PeterStuer · 2024-08-14T07:19:56 1723619996

I created automated descision suppport systems in asset based finance. For daily needs you get customer financials and other risk data from both official national sources and the likes of Dunn and Bradstreet, Graydon etc. The choice of providers depends on both the customer and deal risk/size. While the "api"'s to these providers might be clunky (putting structured request file on an ftp server and polling for a response), the data is structured (enough) to process.

Deals that are exceptional enough get assigned to a risk officer that deals with it as a case (sidenote: they use a lott of selfmade Excel, VB and low-code tools as they never get IT priority for these cases). There is not enough uniformity as well as a decreased tolerance for inaccuracy in those to warrant extensive automation.

wjnc · 2024-08-14T07:39:07 1723621147

Thanks. I see the context now. Our asset managers are indeed lucky to have Bloomberg and such, which are easily integratable (and indeed, have been "SQL" for more than a decade now). I'm aware of the third party providers of customer financial information. Lucky to operate in a niche that is not served by them. Graydon (the only one I've been in contact with) is facing a massive disruption though. Their higher tiers are perhaps not enterprise expensive, but Trellis and the likes are probably more integratable and more affordable.

But still, the one building the LLM-integration is 4x as expensive as the one manually entering the data. It's all about TOC, scale and risk perception. I also love that "risk people" (in the banking context, I'd say: model people) think their data quality should be exceptional and then use end user computing MacGyver style models. Spit and popsicle sticks.

PeterStuer · 2024-08-14T11:00:29 1723633229

"think their data quality should be exceptional and then use end user computing MacGyver style models"

The choice they have is submit a formal request to IT, be rejected 95% of the time with the remaing 5% being put in the planning with an eta 2-5 years in the future, or, DIY it with tools at hand. In an ideal world this would not be needed, in reality it is DIY or nothing.

lmeyerov · 2024-08-13T17:24:54 1723569894

Yep, and nowadays, banks are already deploying this stuff internally via their own IT teams. They have 1-2 decades of having built up ETL/orchestration talent + infra, and have been growing deals with openai/azure/google/aws/databricks for the LLM bits. Internally, big banks are rolling out hundreds of LLM apps each, and generally have freezes on new external AI vendors due to 'AI compliance risk'. NLP commoditized so it's a different world.

It makes sense on paper from a VC perspective as a big bet.. but good luck to smaller VC-funded founders competing with massive BD teams fronting top AI dev teams. We compete in adjacent spaces where we can differentiate, and intentionally decided against going in head-on. For those who can, again, good luck!

wjnc · 2024-08-13T18:15:55 1723572955

Am I in another world? (See my response above.) Most of the ‘hundreds of LLM apps’ I see are, well, not very fancy and struggling to keep up on accuracy in comparison to the meatspace solutions they promised to massively outperform.

I agree with your assessment that the IT risk barrier is very high in big corp so that entry might be hard for Trellis. Plus a continuous push afterwards to go back to traditional cloud once their offerings catch up.

lmeyerov · 2024-08-13T20:16:05 1723580165

I totally agree, and it's useful to play out the shrinking quality gap over time:

- Today: Financial companies are willing to pay cloud providers for DB, LLM, & AI services, and want to paper over the rest with internal teams + OSS, and maybe some already-trusted contractors for stopgaps. Institutional immune system largely rejects innovators not in the above categories.

- Next 6-18mo: Projects continue, and they hit today's typical quality issues. It's easiest to continue to solve these with the current team, maybe pull on a consultant or neighboring team due to good-money-after-bad, and likely, the cloud/AI provider solves more and more for them (GPT5, ..., new Google Vertex APIs, ..)

- Next year or year after: Either the above solved it, or they make a new decision around contractors + new software vendors. But how much is still needed here?

It's a scary question for non-vertical startups to still make sense with the assumption that horizontal data incumbents and core AI infra providers don't continue to eat into the territory here. Data extraction, vector indexing, RAG as a service, data quality, talk to your data, etc. Throw in the pressure of VC funding and even more fun. I think there's opportunity here, but when I think about who is positioned wrt distribution & engineering resources to get at that... I do not envy founders without those advantages.

hamsterbooster · 2024-08-13T16:46:50 1723567610

Thanks for the feedback. We built Trellis based on our experience with ingesting and analyzing unstructured customer calls and chats in a reliable way. We couldn’t find a good solution apart from developing a dedicated ML pipeline, which is quite difficult to maintain.

There are some elements that might resemble Dagster, but I believe the challenging part is constructing validation systems that ensure high accuracy and correct schemas while processing all kinds of complex PDFs and document edge cases. Over the past few weeks, our engineering team has spent a lot of time developing a vision model robust enough to extract nested tables from documents

visarga · 2024-08-13T17:04:42 1723568682

What is your metric and score? Maybe you have reached perfect reliability, but in my experience information extraction is about 90% accurate for real life scenarios, and you can't reliably know which 90%.

In critical scenarios companies won't risk using 100% automation, the human is still in the loop, so the cost doesn't go down much.

I work on LLM based information extraction and use my own evaluation sets. That's how I obtained the 90% score. I tested on many document types. It looks like it's magic when you try an invoice in GPT-4o and skim the outputs, but if you spend 15 minutes you find issues.

Can you risk an OCR error confusing a dot for a comma to send 1000x more money in a bank transfer, or to get the medical data extraction wrong and someone could suffer because there was no human in the document ingestion pipeline to see what is happening?

aviguptakonda · 2024-08-13T18:33:33 1723574013

Wow, this is game changing! With your inventions, interestingly we might also be discovering reverse ETL use cases, where the insights/analytics obtained from the troves of unstructured data can be fed back into ERP/CRM/HCM systems, closing the complete loop and amplifying more business value!! Congratulations to the Trellis team :) Regards, Avinash

mehulashah · 2024-08-13T17:10:56 1723569056

Congratulations on the launch! This is the right way to think about LLMs and document processing.

xkq · 2024-08-14T03:16:11 1723605371

Super sick. I’m building this at work rn. Definitely a cool technical problem. Good luck!

macklinkachorn · 2024-08-14T03:29:12 1723606152

That’s awesome. Curious to hear your use cases and also happy to share notes on what we see work well. Feel free to ping me (email in Bio)

rmbyrro · 2024-08-13T17:14:54 1723569294

Domains should start with your company name. Like trellishq.com

Because browsers have an autocomplete feature.

shafyy · 2024-08-13T17:21:43 1723569703

...which also autocompletes if the domain does not start with the company name :-)

rmbyrro · 2024-08-13T17:24:21 1723569861

Yes, but brains are not good at remembering which word they decided to prepend Trellis

vinibrito · 2024-08-13T15:53:27 1723564407

Nice! How's accuracy of produced data?

macklinkachorn · 2024-08-13T16:06:34 1723565194

Exact accuracy depends on the domain and tasks. Processing emails will naturally have higher accuracy than 150+ pages of private credit documents. Generally, we see 95%+ accuracy out of the box and can go up to 99%+ with fine-tuning, and human in the loop validation.

gigatexal · 2024-08-14T08:24:45 1723623885

+1 to this

nosmokewhereiam · 2024-08-13T18:42:42 1723574562

Love the name! Electronic gardening vibe.

blotterfyi · 2024-08-14T08:55:27 1723625727

Just want to say, this is pretty cool.

darkhorse13 · 2024-08-13T16:33:18 1723566798

Congrats on the launch. Serious question though, does YC only fund AI companies these days?

dang · 2024-08-13T16:59:01 1723568341

Nope! From yesterday:

Launch HN: Synnax (YC S24) – Unified hardware control and sensor data streaming - https://news.ycombinator.com/item?id=41227369 - Aug 2024 (23 comments)

also recent:

Launch HN: Stack Auth (YC S24) – An Open-Source Auth0/Clerk Alternative - https://news.ycombinator.com/item?id=41194673 - Aug 2024 (140 comments)

Launch HN: Firezone (YC W22) – Zero-trust access platform built on WireGuard - https://news.ycombinator.com/item?id=41173330 - Aug 2024 (88 comments)

Launch HN: Airhart Aeronautics (YC S22) – A modern personal airplane - https://news.ycombinator.com/item?id=41163382 - Aug 2024 (618 comments)

That's 4 of the 8 most recent Launch HNs btw. But it's true that there are reams of AI startups nowadays.

meiraleal · 2024-08-13T17:34:10 1723570450

The problem for me is that all of them look more of the same. I have a feeling of dejavu every time I see a Show HN of an AI generator, AI nocode, AI supabase, AI PDF scanner, AI monitoring startup.

I'm developing an "AI wrapper" myself and I know how difficult it is to create a reliable system using LLM integration and I guess these many similar projects are competing on being the one to create something that won't risk ruining their customers reputation. But I see no differentiation, no eye-catching tech, algorithm, invention.

YC and HN used to be the bastion of innovation in tech.

elicash · 2024-08-13T17:03:33 1723568613

This year, nearly yes in "some way":

> This year, we’ll fund more than 500 companies out of 50,000 applications, and almost all of them are related to AI in some way.

Source: https://www.ycombinator.com/blog/why-yc-went-to-dc/

(Edited to be more precise.)

hamsterbooster · 2024-08-13T16:58:42 1723568322

Thanks! There are still a lot of amazing hardware companies and vertical applications in our YC batch.

We believe that AI is only one part of our product. A significant amount of value comes from building robust integrations with different data sources and managing the business logic that operates on top of this unstructured data.

wilburli · 2024-08-13T22:50:29 1723589429

this is dope!

ymoondhra · 2024-08-14T03:46:51 1723607211

Intriguing!

destraynor · 2024-08-13T16:52:37 1723567957

Congrats on the launch, and thanks for using Intercom (co-founder here)

breadwinner · 2024-08-13T17:04:56 1723568696

Is that the chat thing that pops up in the bottom right corner? It is the most annoying thing in the world. Because it pops up uninvited, and obscures the page content I am trying to read. So annoying.

meiraleal · 2024-08-13T17:43:20 1723571000

I hate it so much when it rings out of nowhere and I don't even know which tab it is.

hamsterbooster · 2024-08-13T16:59:17 1723568357

Thanks! We got quite a few good enterprise leads from Intercom chats.

constantinum · 2024-08-13T17:36:48 1723570608

Congrats on the launch! For anyone curious who wants to dig deep and solve document processing workflows via open-source, do try Unstract https://github.com/Zipstack/unstract

jackienotchan · 2024-08-13T20:48:19 1723582099

This was the top comment for quite a while but suddenly dropped to the bottom. Was it automatically downranked for mentioning an OS alternative?

How many upvotes does your comment have?

constantinum · 2024-08-14T01:00:21 1723597221

Ah! I'm not sure why. this is strange. I see six upvotes.

localfirst · 2024-08-13T16:33:43 1723566823

looks like more solutions looking for a problem that can be solved at the vendor level

hamsterbooster · 2024-08-13T16:55:37 1723568137

In many use cases, like flagging documents for compliance issues or processing customer emails, it's challenging to manage this at the vendor level because end customers want the ability to apply business logic and run different analyses.

For data ingestion and mapping, I agree that in an ideal world, we would all have first-party API integrations. However, many industries still rely on PDFs and CSV files to transfer data.

localfirst · 2024-08-13T17:11:55 1723569115

perhaps im misunderstanding the product offering here, isn't this just throwing PDFs (which also has unparsable content like formulas, symbols and large tables even with OCR) on an LLM with structured outputs and running SQL queries?

isn't it obvious that this would be a problem that will eventually be solved by the LLM providers themselves including the ability to flag and apply business logic on top of the structured outputs?

Like I'm not sure if this is well known but LLM providers have huge pressure to turn a profit and will not hesitate to copy any downstream wrappers out of existence rather than acquiring them outright.

Its like selling wrapping tape around the shovel handle for better grip and expecting the shovel makers to not release their new shovels with it in the near future.

The shovel makers don't even need to do any market research or product development and the buyers don't have any incentive to seek or pay a dedicated third party for what their vendors will release for free and at lower costs if that makes sense.

constantinum · 2024-08-13T17:57:44 1723571864

This misunderstanding is valid. Another example is why subscription/recurring billing software exists when payment gateways can solve this problem themselves. The elephant in the room is the complexities involved down the funnel that need very specific focus/solutions.

localfirst · 2024-08-13T18:33:37 1723574017

then please elaborate on "complexities involved down the funnel" and where I am misunderstanding with examples.

macklinkachorn · 2024-08-13T18:54:49 1723575289

A few that we experience as we’re building Trellis out:

1. Managing end-to-end workflows from integrating with data sources, automatically triggering new runs when there’s new data coming in, and keeping track of different business logic that’s involved (i.e. I want to classify the type of the emails and based on that apply different extraction logic)

2. Most out-of-the-box solutions only get you 95% of the way there. The customers want the ability to pass in their own data to improve performance and specify their unique ontology.

3. Building a good UI and API support for both technical and non-technical users to use the product.

localfirst · 2024-08-13T21:45:02 1723585502

too generic