
There are a ton of different promising AI approaches explored by researchers. When I was at MIT in the early 2010s, when Deep Learning was just taking off, it was seen as one of a suite of exciting new techniques. For example, some of the grad students who taught the AI classes I took were hyped on an approach called Probabilistic Programming, which adds a full suite of programming concepts (e.g. if-statements) to Bayesian networks, letting you write extremely concise and powerful programs that can learn from data and handle uncertainty during execution. Also while I was there Geoff Hinton gave a series of master lectures on the future of AI after Deep Learning, and he talked a lot about an approach called Inverse Graphics: basically treating images as one output of a graphics rendering pipeline that includes a scene with geometry and lighting, projection transformations, etc., and then trying to learn all the parameters of that pipeline from images so that you produce not just a classifier output but a whole scene description. Both are really cool and exciting approaches that build on top of deep neural nets but aren’t bound to them.
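
To give a flavor of the probabilistic programming idea, here’s a toy sketch in plain Python with brute-force rejection sampling (my own example, not any particular framework’s API): the if-statement is part of the generative model, and inference just means keeping the program runs consistent with what you observed.

    import random

    # Toy model: is a coin biased, given that we saw 9 heads in 10 flips?
    def model():
        biased = random.random() < 0.5   # prior: fair or heads-biased
        if biased:                       # ordinary control flow inside the model
            p_heads = 0.9
        else:
            p_heads = 0.5
        heads = sum(random.random() < p_heads for _ in range(10))
        return biased, heads

    # Rejection sampling: run the program many times, keep runs matching the data.
    def infer(observed_heads, samples=100_000):
        accepted = [b for b, h in (model() for _ in range(samples))
                    if h == observed_heads]
        return sum(accepted) / len(accepted)

    print(infer(9))  # posterior probability the coin is biased, roughly 0.97

Real probabilistic programming languages do much smarter inference than this, but the appeal is the same: you write an ordinary-looking program and then condition it on data.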

One of the negative effects of the huge hype wave (hype tsunami is maybe more appropriate) around LLMs and genAI generally is that it starves these other approaches of resources (as well as discouraging people from exploring other new approaches). This is what LeCun is responding to. I know some zealots believe that “bigger LLMs” is all we need for AI progress forever, but based on the entire history of the field, a number of technical issues with LLMs, and the nature of LLM progress over the last few years, I would describe this view as blinkered and risky at best. The field tends to advance fastest in the early years of new approaches, not from massive over-investment in a single approach on the strength of some early promising results. Historically the latter tends to lead to AI winters.


Also, FWIW, I saw a bunch of demos of “token sequence learning” that did a lot of the applications people have been so excited about with LLMs: producing text descriptions of video and images, text summarization with question answering, etc. Those demos were a little janky and limited and obviously only at the academic-paper-with-impressive-video-demo stage, which is a far cry from fast and reliable enough to be useful in production. But they weren’t categorically different from what we’ve seen with transformers and LLMs. This is one of the reasons I’m skeptical of claims that transformers plus more data and compute are all we need for AGI. After a decade-plus of not just MASSIVE compute and data scaling but some fairly clever new techniques, I would describe progress as incremental rather than transformational relative to those older results.

Honestly, people have forgotten this now, but the biggest change that ignited the LLM hype was the UX decision to present interactions with these models as a conversation with an agent. This is a trick that goes back at least as far as Eliza, and its effect is mainly in how it primes the user to think about and relate to the tech. That is also an area where more work can be done (conversational interfaces are not the One Solution to all computing). I recommend googling Interactive Machine Learning, a sub-discipline that specifically studies how to build UX that is native to, and takes best advantage of, ML/AI techniques to produce software people can use to accomplish real tasks.


100% agree. You can even see it here on HN. When was the last time you saw a post where someone was excited about something they were actually _doing_ with an LLM? Mostly I see threads about business hype and the periodic conversation started by a confused person asking “so uh, I tried LLMs, how do you actually get any value out of them?” And I think programming is actually the domain where LLMs will be MOST effective.


The backlash is because an interesting technology that is not ready for prime time is being pushed out the door as the solution for every problem that exists. Any dev who’s been around long enough can recognize when a tech strategy is being pushed down from corporate leadership rather than driven up from its objective usefulness.


The sad truth is that by the time senior corporate leaders get promoted up to that level, whatever useful skills they ever had (in management or anything else) have usually long since atrophied, because at that level of nearly any company of meaningful size those jobs are pure politics (jockeying for influence, fighting for resources in yet another reorg).

If you want to improve in your career don’t listen to a thing these people say (unless you need to learn about public posturing).


Classic bubble talk: “(mumbling behind my hand) yes, it’s obvious that our current over-hyped product sucks, but that should just make you even more hyped for our next product!” lol. If you believe that, I’ve got a bridge in Brooklyn you might also be interested in purchasing…


At this point there’s a ton of research on how large and long-lasting the negative impacts of layoffs are on company effectiveness (especially in the area of innovation). Eg: https://hbr.org/podcast/2023/12/the-hidden-costs-of-layoffs

It’s not surprising that CEOs are the last to learn of these impacts, because in my experience they tend to be among the people at any big company who least understand how it actually works. And, as you can see here, once your CEO becomes just an agent of the short-term stock price, you are doomed to do nothing but enshittify your product until the company eventually dies.


“Bullshit” is the _perfect_ term. The philosopher Harry Frankfurt wrote a book called On Bullshit where he defines the term as speech or writing intended to persuade without regard for the truth. This is _exactly_ what LLMs do. They produce text that tries to reproduce the average properties of the texts in their training data and the user preferences encoded in their RLHF training. None of that has anything to do with the truth. At best you could say they are engineered to try to give users what they want (i.e. what the engineers building these systems think we want), which is, again, a common motive of bullshitters.


"Bullshit" doesn't work because it requires a psychological "intent to persuade," but LLMs are not capable of having intentions. People intentionally bullshit because they want to accomplish specific goals and adopt a cynical attitude towards the truth; LLMs incidentally bullshit because they aren't capable telling the difference between true and false.

Specifically: bullshitters know they are bullshitting and hence they are intentionally deceptive. They might not know whether their words are false, but they know that their confidence is undeserved and that "the right thing to do" is to confess their ignorance. But LLMs aren't even aware of their own ignorance. To them, "bullshitting" and "telling the truth" are precisely the same thing: the result of shallow token prediction, by a computer which does not actually understand human language.

That's why I prefer "confabulate" to "bullshit" - confabulation occurs when something is wrong with the brain, but bullshitting occurs when someone with a perfectly functioning brain takes a moral shortcut.


I don’t like “confabulate” because it has a euphemistic quality. I think most people hear it as a polite word for lying (no matter the dictionary definition). And this is a space that needs, desperately needs, direct talk that regular people can understand. (I also think “confabulate” implies intention just as much as “bullshit” does to most people.)


You’re right about the model’s agency. To be precise I’d say that LLMs spew bullshit but that the bullshitters in that case are those who made the LLMs and claimed (in the worst piece of bullshit in the whole equation) that they are truthful and should be listened to.

In that sense you could describe LLMs as industrial-strength bullshit machines. The same way a meat processing plant produces pink slime by the design of its engineers, so too do LLMs produce bullshit by the design of theirs.


You sound more like an out of touch executive or a recently-converted-Bitcoin-scammer than someone doing real work with this stuff (and yes I saw your top-voted post — as someone who also works in games at a large AAA studio I’ll just say I look forward to the GDC talk showing the evidence).


+1.

I also am increasingly thinking that this generation of AI is going to destroy social media with a flood of what Cory Doctorow calls “botshit”.

Only difference is that instead of being worried about this, I’m more <Jack Nicholson head-nodding meme>. The death of social media can’t come too soon.


I’m not at all worried about LLMs replacing even low-level engineers. We’ve been using Copilot and ChatGPT for a while at work now, and the most useful applications we’ve found are more analogous to a compiler than to a developer. For example, we’ve had good luck using them to help port a bunch of code between large APIs. The process still involves a lot of human work to fix bugs in the output, but a project we estimated at 6 months took three weeks.
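
To give a sense of the shape of that workflow (not our actual tooling; the model name, prompt, and paths below are placeholders, though the OpenAI Python client calls are real): feed each legacy file plus a summary of the target API to the model, write out a draft port, then have a human review and fix it.

    from pathlib import Path
    from openai import OpenAI  # assumes the openai package is installed

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Hypothetical inputs: legacy source tree plus notes on the new API.
    NEW_API_NOTES = Path("docs/new_api_summary.md").read_text()
    PROMPT = (
        "Port this file from OldSDK v1 to NewSDK v2. Preserve behavior and "
        "comments, and mark anything you're unsure about with a TODO.\n\n"
        "New API summary:\n{notes}\n\nFile:\n{code}"
    )

    for src in Path("legacy_src").rglob("*.py"):
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[{"role": "user",
                       "content": PROMPT.format(notes=NEW_API_NOTES,
                                                code=src.read_text())}],
        )
        out = Path("ported_src") / src.relative_to("legacy_src")
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(resp.choices[0].message.content)  # draft only; humans fix bugs

The model is doing mechanical translation here, which is why the compiler analogy fits better than the junior-developer one.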

On the other hand, as someone whose role gives me visibility into the way senior leaders at the company think about what AI will be able to do, I’m absolutely terrified that they’re going to detonate the company by massively over-investing in AI before it’s proven and by forcing everyone to distort their roadmaps around some truly unhinged claims about what AI is going to do in the future.

CEOs and senior corporate leaders don’t understand what this technology is and have always dreamed of a world where they didn’t need engineers (or anyone else who actually knows how to make stuff) but could instead turn the whole company into a big “Done” button that just pops out real versions of their buzzword-filled fever dreams. This makes them the worst possible rubes for some of the AI over-promising — and eager to make up their own!

Between this and the really crazy over-valuations we’re already seeing in companies like Nvidia, I’m seeing the risk of a truly catastrophic 2000- or 2008-style economic crash rising rapidly and am starting to prepare myself for that scenario in the next 2-5 years.

