
This is impressive; it's hard to imagine all the implications of a BMI.


The real question here is: who puts their API keys on a Slack server?


The API key thing is a bit of a distraction: it’s used in this article as a hypothetical demonstration of one kind of secret that could be extracted in this way, but it’s only meant to be illustrative of the wider class of attack.


Damn, I built a RAG agent over the past three and a half months for my internship, and literally everyone in my company was asking why I wasn't using LangChain or LlamaIndex, like I was a lunatic. Everyone else at my company who built a RAG system used LangChain; one even went into prod.

I kept telling them that it works well if you have a standard use case, but the second you need to do something a little original you have to go through five layers of abstraction just to change a minute detail. Furthermore, you won't really understand every step of the process, so if any issue arises, or you need to improve the process, you'll be starting back at square one.

This is honestly such a boost of confidence.


I had a similar experience when LangChain first came out. I spent a good amount of time trying to use it - including making some contributions to add functionality I needed - but ultimately dropped it. It made my head hurt.

Most LLM applications require nothing more than string handling, API calls, loops, and maybe a vector DB if you're doing RAG. You don't need several layers of abstraction and a bucketload of dependencies to manage basic string interpolation, HTTP requests, and for/while loops, especially in Python.
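
For what it's worth, the entire "framework" a basic LLM feature needs can fit in a sketch like this (using the OpenAI Python client; the model name is just an example):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(question: str, context: str = "") -> str:
        # String interpolation is the whole "templating" layer.
        prompt = f"{context}\n\nQuestion: {question}" if context else question
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # example model name
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

Everything else (retries, branching, chaining) is ordinary control flow around that function.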

On the prompting side of things, aside from some basic tricks that are trivial to implement (CoT, in-context learning, whatever), prompting is very case-by-case and iterative, and being effective at it primarily relies on understanding how these models work, not cargo-culting the same prompts everyone else is using. LLM applications are not conceptually difficult to implement, but they are finicky and tough to corral, and something like LangChain only gets in the way IMO.


I haven't used LangChain, but my sense is that much of what it's really helping people with is stream handling and async control flow. While there are libraries that make it easier, I think doing this stuff right in Python can feel like swimming against the current given its history as a primarily synchronous, single-threaded runtime.
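
The happy path is simple enough; a sketch with the async client, assuming the v1 openai SDK (model name illustrative):

    import asyncio
    from openai import AsyncOpenAI

    client = AsyncOpenAI()

    async def stream_answer(prompt: str) -> str:
        stream = await client.chat.completions.create(
            model="gpt-4o-mini",  # example model
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )
        parts = []
        async for chunk in stream:
            delta = chunk.choices[0].delta.content or ""
            print(delta, end="", flush=True)  # forward tokens as they arrive
            parts.append(delta)
        return "".join(parts)

    asyncio.run(stream_answer("Explain RAG in one sentence."))

It's the cancellation, fan-out, and backpressure on top of this that tend to feel like swimming upstream in Python.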

I built an agent-based AI coding tool in Go (https://github.com/plandex-ai/plandex) and I've been very happy with that choice. While there's much less of an ecosystem of LLM-related libraries and frameworks, Go's concurrency primitives make it straightforward to implement whatever I need, and I never have to worry about leaky or awkward abstractions.


Not really. There isn't much that LangChain is doing in this regard. The heavy lifting is already done by the underlying libraries it uses, like OpenAI's; the rest is just wrappers around their API calls.


The LangChain APIs seem to include a lot of functionality for ‘chaining’ LLM calls and streams. That’s what I was referring to.
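
i.e. the sort of pipeline that in plain Python is just function composition; a rough sketch (OpenAI client; model name illustrative):

    from openai import OpenAI

    client = OpenAI()

    def complete(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # example model
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # "Chaining": feed one completion's output into the next prompt.
    document = "...source text..."
    summary = complete(f"Summarize this document:\n\n{document}")
    critique = complete(f"List the weaknesses of this summary:\n\n{summary}")

LangChain's chains wrap roughly this pattern, plus streaming variants, in their own abstractions.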


I completely agree, and built magentic [0] to cover the common needs (structured output, common abstraction across LLM providers, LLM-assisted retries) while leaving all the prompts up to the package user.

[0] https://github.com/jackmpcollins/magentic
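
A minimal sketch of the idea, adapted from the README (the function body stays empty; the return annotation drives the structured output):

    from magentic import prompt
    from pydantic import BaseModel

    class Recipe(BaseModel):
        name: str
        steps: list[str]

    @prompt("Create a recipe using these ingredients: {ingredients}")
    def make_recipe(ingredients: list[str]) -> Recipe: ...

    recipe = make_recipe(["eggs", "flour"])  # returns a validated Recipe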


Groupthink is really common among programmers, especially when they have no idea what they are talking about. It shows you don't need a lot of experience to see the emperor has no clothes, but you do need to pay attention.


I admire what the LangChain team has been building toward, even if people don't agree with some of their design choices.

The OpenAI api and others are quite raw, and it’s hard as a developer to resist building abstractions on top of it.

Some people are comparing libraries like LangChain to ORMs in this conversation, but I think maybe the better comparison would be web frameworks. Like, yeah, the web/HTML/JSON are “just text” too, but you probably don’t want to reinvent a bunch of string and header parsing libraries every time you spin up a new project.

Coming from the JS ecosystem, I imagine a lot of people would like a lighter weight library like Express that handles the boring parts but doesn’t get in the way.


Matches my experience as well. I tried LangChain about a year ago for an app with a pretty standard use case, but even going a little bit off the rails I had to dig through layers of abstractions where it would have been much easier to just use the original OpenAI lib. So it might be beneficial if your use case involves offering many different LLM providers in your app, but if you know you won't be swapping out the LLM provider soon, it's usually better not to use such frameworks.


Wise perspective from an intern. The type of pragmatism we love.


I wish I was this pragmatic as an intern.


Way to follow your instinct.

I ran into similar limitations with relatively simple tasks. For example, I wanted access to the token usage metadata in the response. This seems like such an obvious use case, but it wasn't possible at the time, or at least it wasn't well documented.
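
(With the plain OpenAI client, by contrast, it's right there on the response object; for example:)

    from openai import OpenAI

    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # example model
        messages=[{"role": "user", "content": "Hi"}],
    )
    # Usage metadata comes back on every non-streaming response.
    print(resp.usage.prompt_tokens, resp.usage.completion_tokens, resp.usage.total_tokens)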


I've had the same experience. I thought I was the weird one, but, my god, LangChain isn't usable beyond demos. It feels like even proper logging pushes it beyond its capabilities.


On top of that, if you use the TypeScript version, the abstractions are often... weird. They feel like verbatim ports of the Python implementations. Many things are abstracted in ways that are not very type-safe and you'd design differently with type safety in mind. Some classes feel like they only exist to provide some structure in a language without type safety (Python) and wouldn't really need to exist with structural type checking.


Could someone point me towards a good resource for learning how to build a RAG app without LangChain or LlamaIndex? It's hard to find good information.


At a fundamental level, all you need to know is:

- Read in the user's input

- Use that to retrieve data that could be useful to an LLM (typically by doing a pretty basic vector search)

- Stuff that data into the prompt (literally insert it at the beginning of the prompt)

- Add a few lines to the prompt that state "hey, there's some data above. Use it if you can."
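
Those four steps as a minimal sketch, assuming the OpenAI SDK and numpy (model names and the pre-chunked corpus are illustrative):

    import numpy as np
    from openai import OpenAI

    client = OpenAI()
    chunks = ["...chunk 1...", "...chunk 2...", "...chunk 3..."]  # your pre-split docs

    def embed(text: str) -> np.ndarray:
        resp = client.embeddings.create(model="text-embedding-3-small", input=text)
        return np.array(resp.data[0].embedding)

    chunk_vecs = np.array([embed(c) for c in chunks])  # index once, up front

    query = input("> ")                  # 1. read the user's input
    scores = chunk_vecs @ embed(query)   # 2. vector search (these embeddings are
                                         #    unit-length, so dot product == cosine)
    context = "\n\n".join(chunks[i] for i in np.argsort(scores)[::-1][:2])

    prompt = (
        f"{context}\n\n"                                   # 3. stuff data into the prompt
        "There's some data above. Use it if you can.\n\n"  # 4. point the model at it
        f"Question: {query}"
    )
    answer = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content
    print(answer)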


You can start by reading up on how embeddings work, then check out the specific RAG techniques people have discovered. Not much else is needed, really.


Here's a blog post that I just pushed that doesn't use them at all - https://blog.dagworks.io/p/building-a-conversational-graphdb (we have more on our blog - search for RAG).

[disclaimer I created Hamilton & Burr - both whitebox frameworks] See https://www.reddit.com/r/LocalLLaMA/comments/1d4p1t6/comment... for comment about Burr.


My strategy has been to implement in / follow along with LlamaIndex, dig into the details, and then implement that in a less abstracted, easily understandable codebase/workflow.

I was driven to do so because it wasn't as easy as I'd like to override a prompt. You can see how they construct the various prompts for the agents; it's pretty basic text/template kind of stuff.



Data Centric on YouTube has some great videos: https://youtube.com/@data-centric?si=EOdFjXQ4uv02J774



The OpenAI cookbook! Instructor is a decent library that can help with the annoying parts without abstracting away the whole API call; see its docs for RAG examples.
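
A rough sketch of the pattern (field names and model are just examples):

    import instructor
    from openai import OpenAI
    from pydantic import BaseModel

    class Answer(BaseModel):
        text: str
        source_ids: list[int]  # which retrieved chunks were used

    client = instructor.from_openai(OpenAI())  # patches the client; API otherwise unchanged
    answer = client.chat.completions.create(
        model="gpt-4o-mini",    # example model
        response_model=Answer,  # Instructor parses, validates, and retries against this schema
        messages=[{"role": "user", "content": "Answer using sources [1] and [2]: ..."}],
    )

You still see the whole API call; Instructor just handles the parse/validate/retry loop.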


You are heading in the right direction. It's amazing to see seasoned engineers go through the mental gymnastics of justifying installing all those dependencies and arguing about vector DB choices when the data fits in RAM and the Swiss Army knife is right there: np.array.
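
i.e. if the embeddings fit in memory, exact retrieval is a couple of lines (sketch; assumes unit-normalized vectors):

    import numpy as np

    # corpus: (n_chunks, dim) matrix of unit-normalized embeddings held in RAM
    def top_k(query_vec: np.ndarray, corpus: np.ndarray, k: int = 5) -> np.ndarray:
        scores = corpus @ query_vec          # dot product == cosine sim for unit vectors
        return np.argsort(scores)[::-1][:k]  # indices of the k best-matching chunks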


Impressive to decide against something as shiny as LangChain as an intern.


Any tutorials you followed?


Shkreli has actually sent the album to various people; he used to send it to any girl who asked for it back in the day. It's not that exclusive. Also, according to Martin, now that he has sold the album, it's mediocre.


I'm skeptical of the on-device AI. They crave edge compute, but I'm doubtful their chips can handle a 7B-param model. Ironically, with something like Microsoft's Phi-3 Mini 4K you can maybe run this stuff on a CPU, but today it's nowhere near good enough.


Impressive not technically, because nothing here is new, but because it's the first real implementation of "AI" for the average end consumer. You have semantic indexing, which allows Siri to retrieve context for basically any query. You have image gen, which gives you emoji gen and messaging with genAI images. Text gen within emails. The UX is world class, as usual.

However, the GPT integration feels forced and, dare I say, even unnecessary. My guess is that they're really interested in the 4o voice model and expect OpenAI to remain the front-runner in the AI race.


This is 100 percent doable. Building something like this at scale might be a pain, but locally it's fairly easy.


Very interesting. I'm building a RAG chatbot and I haven't done the inline citations yet. I honestly thought it was a lot more complicated than just telling the LLM to cite with a number and then putting numbers next to the sources. I did something to that effect as kind of a joke, and it worked, but the LLM didn't always listen. I thought either post-processing (checking cosine distance between sentences and retrieved chunks) or function calling would be the way to go.
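
For reference, the "joke" version was along these lines: number the chunks in the prompt, ask for [n] markers, then sanity-check them in post (a sketch; the chunks and answer here are made up):

    import re

    chunks = ["Paris is the capital of France.", "France is in Europe."]  # retrieved
    question = "Where is Paris?"

    numbered = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    prompt = (
        f"Sources:\n{numbered}\n\n"
        "Answer using only the sources above. "
        "After each claim, cite the source inline like [1].\n\n"
        f"Question: {question}"
    )
    # ... send `prompt` to the LLM, get back `answer` ...
    answer = "Paris is the capital of France [1], and France is in Europe [2]."

    # Cheap post-check: did the model cite only sources that actually exist?
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    assert cited <= set(range(1, len(chunks) + 1))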


There are already a lot of open bibliographic databases: Semantic Scholar, OpenAlex, and to some extent Google Scholar. Researchers need full-text analysis, which, with half of publications locked behind a paywall, is very tedious and complicated.


Thinking back, if LLMs are able to store and access memory, then RAG becomes useless. RAG is like a system that shoves bits into RAM (the context window) and asks the CPU (the LLM) to compute something. But if you expand the RAM to a ridiculous amount, or you use the HDD, it's no longer necessary to do that. RAG is a suboptimal way of having long-term memory. That being said, today it is useful, and when or if this problem gets solved is not easy to say. In the meantime, RAG is the way to go.

