> I find that AI substantially boosts materials discovery, leading to an increase in patent filing and a rise in downstream product innovation. However, the technology is effective only when paired with sufficiently skilled scientists.
I can see the point here. Today I was exploring the possibility of a new algorithm. I asked Claude to generate a part that is well known (but for which there aren't many examples on the internet), and it hallucinated a function. Despite being wrong, it was close enough to the solution that I could "rehallucinate" it on my side and turn it into a creative solution. Of course, the hallucination would have been useless if I were not already an expert in the field.
I came to the same conclusion a while back. LLMs are very useful when user expertise level is medium to high, and task complexity is low to medium. Why? Because in those scenarios, the user can use the LLM as a tool for brainstorming or drawing the first sketch before improving it. Human in the loop is the key and will stay key for the foreseeable future, no matter what the autonomous AI agent gurus are saying.
https://www.lycee.ai/blog/mistral-ai-strategy-openai
"when user expertise level is medium to high, and task complexity is low to medium" – this reminds me of Python itself. Python isn't the best at anything, it's slow, sometimes vague in its formalisms, etc. But it keeps being super popular because most work is low to medium complexity. In everyone's work, from a simple webdev to an AI researcher, there are moments of complexity in the work but most of the work is getting through the relatively simple stuff.
Or maybe in general we can say that to do something really hard and complex, you should put a lot of effort into getting all the not-hard, not-complex pieces in place, making yourself comfortable with them so they don't distract, and setting the stage for the hard part. And when you look back you'll find it odd how the hard part wasn't where you spent most of the time, and yet that's how we actually do hard stuff. Like we have to spend time knolling our code to be ready for the creative part.
Habitual Artificial Intelligence contrasts nicely with Artificial General Intelligence. It parses data and forms habits based on that data. When you want to discover something new, you have to break out of a habit and think. It also forms some habits better than others.
When I saw how AlphaZero played chess back in 2017, differently from other engines, that's how I usually described it: as a habit-forming machine.
Yes, amplification is a really apt analogy.
Just treat the hallucinations as the non-linear distortion and harmonics that come from the amplification process. You can filter out the unwanted signals and noise judiciously if you're well informed.
Taking the analogy further, you need proper impedance matching to maximize accuracy, whether at the source or via load-pull (closed-loop or open-loop); for an LLM, that matching can take the form of RAG, for example.
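To make the RAG-as-impedance-matching idea a bit more concrete, here's a rough sketch of the pattern: retrieve reference text first, then constrain the model with it. The bag-of-words scoring is just a stand-in for a real embedding model, and call_llm is a hypothetical placeholder, not any particular API.

    # Toy sketch of "impedance matching" via retrieval: ground the prompt in
    # reference documents so the model's output is constrained by known-good
    # context. Word-count cosine similarity stands in for real embeddings.
    from collections import Counter
    import math

    def score(query: str, doc: str) -> float:
        """Cosine similarity over word counts -- a crude stand-in for embeddings."""
        q, d = Counter(query.lower().split()), Counter(doc.lower().split())
        dot = sum(q[w] * d[w] for w in q)
        norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
        return dot / norm if norm else 0.0

    def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
        """Return the k documents most similar to the query."""
        return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

    def grounded_prompt(query: str, corpus: list[str]) -> str:
        """Build a prompt that forces the answer through retrieved context."""
        context = "\n".join(retrieve(query, corpus))
        return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    # answer = call_llm(grounded_prompt(question, my_docs))  # call_llm is hypothetical

The point of the sketch is just that the "matching network" sits outside the model: the model still amplifies, but what it amplifies has been filtered first.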
I wonder if the next generation of experts will be held back by use of AI tools. Having learned things “the hard way” without AI tools may allow better judgement of these semi-reliable outputs. A younger generation growing up in this era would not yet have that experience and may be more accepting of AI generated results.
> Having learned things “the hard way” without AI tools may allow better judgement
I see a parallel in how web search replaced other skills like finding information in physical libraries. We might not do research the old way, but we learned new tricks for the new tools. We know when to rely on them and how much, and how to tell useful from garbage. We don't write by hand or do computation in our heads much, but we type and compute more.
Yeah, as a CS student, I have some professors who allow use of LLMs because they will be part of the job going forward. I get that, and I use them for learning, as opposed to internet searches, but I still manually write my code and fully understand it, cause I don't wanna miss out on those lessons. Otherwise I might not be able to verify an LLM's output.
Reminds me of the "Learn X the Hard Way" series, distributed as PDF I think, on the idea that if there's code samples you should transcribe them by hand because the act of transcribing matters.
Maybe that's an argument for simpler chat modalities over shared codepads, as forcing the human to assemble bits of code provided by the LLM helps keep the human in the driver's seat.
Yeah. My favorite professor this semester constantly says "hey, if you rely too much on the robot, and can't do this yourself, you won't get a job." I know some people are just here for the paper, but that makes me feel better when I'm having a hard time finding a new role.
If a model is right 99.99% of the time (which nobody has come close to), we still need something that understands what it's doing enough to observe and catch that 0.01% where it's wrong.
Because wrong at that level is often dangerously wrong.
This is explored (in an earlier context) in the 1983 paper "Ironies of Automation".
> we still need something that understands what it's doing enough to observe and catch that 0.01% where it's wrong.
Nobody has figured out how to get a confidence metric out of the innards of a neural net. This is why chatbots seldom say "I don't know", but, instead, hallucinate something plausible.
Most of the attempts to fix this are hacks outside the LLM. Run several copies and compare. Ask for citations and check them.
Throw in more training data. Punish for wrong answers. None of those hacks work very well. The black box part is still not understood.
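To make the "run several copies and compare" hack concrete, here's a rough sketch of the idea; sample_model is a hypothetical stand-in for an LLM call with nonzero temperature, not a real API.

    # Minimal sketch of the run-and-compare hack: sample the same question
    # several times and treat agreement as a rough confidence signal.
    from collections import Counter

    def self_consistency(question: str, sample_model, n: int = 5) -> tuple[str, float]:
        """Return the most common answer and the fraction of samples that agree with it."""
        answers = [sample_model(question) for _ in range(n)]
        best, votes = Counter(answers).most_common(1)[0]
        return best, votes / n

    # answer, agreement = self_consistency("What year was X patented?", sample_model)
    # if agreement < 0.6:
    #     print("Low agreement -- treat the answer as unreliable.")

Note that this only measures consistency, not correctness: the model can confidently agree with itself on the same wrong answer, which is exactly why these hacks don't fix the underlying problem.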
This is the elephant in the room of LLMs. If someone doesn't crack this soon, AI Winter #3 will begin. There's a lot of startup valuation which assumes this problem gets solved.
> There's a lot of startup valuation which assumes this problem gets solved.
Not just solved, but solved soon. I think this is an extremely difficult problem, to the point that it would involve new aspects of computer science to even approach correctly, but we seem to think that throwing more CPU and $$$ at it will make the problem work itself out. I myself am skeptical.
Is there any progress? About two years ago, there were people training neural nets to play games, looking for a representation of the game state inside the net, and claiming to find it. That doesn't seem to be mentioned any more.
As for "solved soon", the market can remain irrational longer than you can stay solvent. Look at Uber and Tesla, both counting on some kind of miracle to justify their market cap.
I get the impression that most of the 'understand the innards' work isn't scalable - you build out a careful experiment with a specific network, but the work doesn't transfer to new models, fine-tuned models, etc.
I’m pretty sure humans make mistakes too and it happens rather frequently that nobody catches them until it’s too late. In most fields we’re okay with that because perfection is prohibitively expensive.
Obviously systems have always had to be resilient. But the point here is how dangerous a "set it and forget it" AI can be. Because the mistakes it makes, although fewer, are much more dangerous, unpredictable, and inscrutable than the mistakes a human would make.
Which means the people who catch these mistakes have to be operating at a very high level.
This means we need to resist getting lulled into a false sense of security with these systems, and we need to make sure we can still get people to a high level of experience and education.
I find proofreading code-gen AI output less satisfying than writing it myself, though it does depend on the nature of the function. Migrating mindless mapping-type functions to autocomplete is nice.
This is one big point I've subscribed to: I'd rather write the code and understand it that way than read and try to understand code I did not write.
Also, I think it would be faster to write my own than to fully understand someone else's (the LLM's) code. I have developed my own ways of ensuring certain aspects of the code, like security, organization, and speed. Teasing out how those things are addressed in code I didn't write takes me longer.
Yes, I have experienced it, too. I was building a web crawler using Replit as an agent. I could have done it in 2 hours without LLM help, but I wanted to see how the LLM would do it. I gave it a set of instructions, but the LLM could not execute on them. It later chose an alternative path, but that also did not yield results. I then gave an exact list of steps. Results were slightly better but not what I was expecting. Overall, it's good for getting something going, but you still have to hold its hand. It is not the best but also not the worst experience.
Yeah, I had a similar experience where I asked why a bug was happening and it gave me something that looked wrong, but upon closer inspection it pointed in a vague general direction I hadn't thought of, and I solved my bug with its help. The caveat is you still need to know your shit to decipher/recognize it.