This sounds less like Embeddings as a Service and more like Semantic Search (whi...

jxodwyer1 · on March 28, 2023

Search is one use case we support, but you can perform a few other operations on your data, like clustering or fine-tuning. We're also working on a classification feature. Are there other async jobs you'd like to see?

crosen99 · on March 28, 2023

The problem I'd like solved is that when I want to retrieve chunks of data for retrieval augmented generation, it's challenging to optimize the choice of embeddings model, chunking strategy, and overall retrieval algorithm. I'm not sure if that's the sort of problem you're focused on.

jxodwyer1 · on March 28, 2023

We agree; this is precisely the problem area we’re focusing on! We’re currently working on letting users specify chunking strategies, while providing a ton of guidance on that choice based on their particular data.

crosen99 · on March 28, 2023

In addition to the choices for how to chunk (i.e. defining chunk size, chunk boundaries, chunk overlap, etc.), there's also the question of what actually gets returned once the matching chunks are found. For example, perhaps I have a document with 100 one-page sections where each section is broken into roughly 5 chunks. I may get optimal performance in my RAG application not by retrieving the top K chunks from the index, but rather by returning the top K sections from the document, where sections might be scored based on the number and scores of their child chunks. It also might be useful to incorporate section summaries, etc., in the retrieval process.
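
To make the section roll-up concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `ChunkHit` type, the `top_k_sections` function, the `count_weight` parameter, and the scoring rule itself (sum of child chunk scores plus a small bonus per matching chunk) are hypothetical and not taken from any particular library or product.

```python
from collections import defaultdict
from typing import NamedTuple


class ChunkHit(NamedTuple):
    section_id: str  # hypothetical ID of the section the chunk belongs to
    score: float     # similarity score returned by the chunk-level index


def top_k_sections(hits, k=3, count_weight=0.1):
    """Rank sections by summed chunk scores plus a bonus per matching chunk.

    This is one possible roll-up rule among many; it is an assumption,
    not an established algorithm.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hit in hits:
        totals[hit.section_id] += hit.score
        counts[hit.section_id] += 1
    scored = {sid: totals[sid] + count_weight * counts[sid] for sid in totals}
    # Highest combined score first, truncated to the top k sections.
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)[:k]


# Made-up chunk hits: one section has a single strong chunk,
# another has three moderately strong chunks.
hits = [
    ChunkHit("sec-12", 0.91),
    ChunkHit("sec-07", 0.80),
    ChunkHit("sec-07", 0.78),
    ChunkHit("sec-07", 0.75),
    ChunkHit("sec-30", 0.60),
]
ranked = top_k_sections(hits, k=2)
```

With these made-up scores, the section with three decent chunks outranks the section with the single best chunk, which is the behavior a plain top-K-chunks retrieval would not give you. Swapping the sum for a max or a mean, or changing `count_weight`, is exactly the kind of knob that would need experimentation.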

jxodwyer1 · on March 28, 2023

This is great, and that makes a ton of sense! Would you want to define and experiment with these various configurations yourself explicitly, or would you expect a system to determine this automatically? I like the concept of rolling up chunk scores!

jn2clark · on March 29, 2023

If you want some more options (chunking, models, and more), check here: https://github.com/marqo-ai/marqo. There's also an example for RAG that uses context-aware trimming of text to fit into context windows: https://github.com/marqo-ai/marqo/blob/mainline/examples/GPT...