
A similar option exists with txtai (https://github.com/neuml/txtai).

https://neuml.hashnode.dev/speech-to-speech-rag

https://www.youtube.com/watch?v=tH8QWwkVMKA

One would just need to remove the RAG piece, use a Translation pipeline (https://neuml.github.io/txtai/pipeline/text/translation/) and swap in a Korean TTS model, as sketched below.
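
Roughly, that swap might look like this. This is only a minimal sketch assuming txtai's Transcription, Translation and TextToSpeech pipelines; "korean-tts-model" is a placeholder rather than a real model name, and the TTS return format and sample rate can vary by model and txtai version:

    import soundfile as sf

    from txtai.pipeline import TextToSpeech, Transcription, Translation

    # Speech to text
    transcribe = Transcription("openai/whisper-base")

    # Text to text translation, target language passed per call
    translate = Translation()

    # Text to speech - "korean-tts-model" is a placeholder, substitute a
    # Korean-capable TTS model from the Hugging Face Hub
    tts = TextToSpeech("korean-tts-model")

    # English speech in, Korean speech out
    text = transcribe("input.wav")
    korean = translate(text, "ko")
    result = tts(korean)

    # Some versions return a raw audio array, others return (audio, rate)
    audio, rate = result if isinstance(result, tuple) else (result, 22050)
    sf.write("output.wav", audio, rate)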

Both this and the Hugging Face speech-to-speech projects are Python though.




Your library is quite possibly the best example of effortful, understandable and useful work I have ever seen - principally evidenced by how you keep evolving with the times. I've seen you keep it up to date, and even on the cutting edge, for years, through multiple NLP mini-revolutions (sentence embeddings and their new uses) and what must have been the annoying release of LLMs, and still push on to keep the library explainable and useful.

Code from txtai just feels like exactly the right way to express what I am usually trying to do in NLP.

My highest commendations. If you ever have time, please share your experience and what led you to take this path with txtai. For example, I see you started in earnest around August 2020 (maybe before). At that time, did you imagine LLMs becoming as prominent as they are now, or instruction-tuning working as well as it does? I know at that time many PhD students I knew in NLP (and profs) felt LLMs were far too unreliable and would not reach consistent scores on benchmarks like MMLU/HellaSwag.


I really appreciate that! Thank you.

It's been quite a ride since 2020. When I started txtai, the first use case was RAG, in a way. Except instead of an LLM, it used an extractive QA model. But it was really the same idea: get a relevant context, then find the useful information in it. LLMs just made it much more "creative".
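
For anyone curious, here is a rough sketch of that original pattern using the current API: an Embeddings index for retrieval plus the Questions (extractive QA) pipeline in place of an LLM. The models and sample data are only illustrative:

    from txtai import Embeddings
    from txtai.pipeline import Questions

    # Sample passages to search over
    data = [
        "US tops 5 million confirmed virus cases",
        "Canada's last fully intact ice shelf has suddenly collapsed",
        "Beijing mobilises invasion craft along coast as Taiwan tensions escalate",
    ]

    # Retrieval: semantic index over the passages
    embeddings = Embeddings(path="sentence-transformers/nli-mpnet-base-v2")
    embeddings.index(data)

    # Extraction: span-based QA model stands in for the LLM
    questions = Questions("distilbert-base-cased-distilled-squad")

    query = "How many virus cases are there?"

    # Get the most relevant context, then find the useful information in it
    uid = embeddings.search(query, 1)[0][0]
    print(questions([query], [data[uid]]))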

Right before ChatGPT, I was working on semantic graphs. That took the wind out of its sails for a while, until GraphRAG came along. Adding the LLM framework into txtai during 2023 was definitely a detour.

The next release will be a major release (8.0) with agent support (https://github.com/neuml/txtai/issues/804). I've been hesitant to buy into the "agentic" hype as it seems quite convoluted and complicated at this point. But I believe there are some wins available.

In 2024, it's hard to get noticed. There are tons of RAG and Agent frameworks. Sometimes you see something trend and surge past txtai in terms of stars in a matter of days. txtai has 10% of the stars of LangChain but I feel it competes with it quite well.

Nonetheless I keep chugging along because I believe in the project and that it can solve real-world use cases better than many other options.


I have a dozen or so tabs open at the moment to wrap my head around txtai and its very broad feature set. The plethora of examples is nice, even if the Python idioms are dense. The semantic graph bits are of keen interest for my use case, as are the pipelines and workflows. I really appreciate you continuing to hack on this.


You got it. Hopefully the project continues its slow growth trajectory.





