The project sounds quite interesting but I'm not sure running it is going to work!
The code `gpt_model = "gpt4_20231230_1106preview"` is not using a valid model name as best as I can tell, so it seems unlikely to work - from https://github.com/SakanaAI/DiscoPOP/blob/main/scripts/launc...
Unusually, the repo has no issues section, so I can't give them feedback that way. luchris429's repo does have one, so I'll report it there.
Maybe it's dead code. Still, it's wrong.
Author here! Thanks for pointing that out. The correct model name is indeed "gpt-4", not "gpt4_20231230_1106preview". We were previously using an Azure endpoint, which is why the model name is different.
While I understand the frustration, I assure you that the rest of the code is functional. This was a simple oversight and should be a trivial fix. I appreciate your feedback and understanding.
They are very useful when ideating with a human. On their own they could veer off into uncertain territory, and likely make mistakes obvious to humans.
I'm sure LLMs can optimize the training of other LLMs (either by inventing new ways or fine-tuning existing ones). But we can't predict whether this will result in a giant leap in the field, or just small increments. That's the definition of singularity, isn't it?
Absolutely. LLMs get a lot of hate but sincerely, GPT-4o can be given a hunk of code (one function/small class worth) and told to find optimization opportunities, and it will do a great job, especially considering the 30s it takes to ask.
It’s not perfect, but it understands lock-free algorithms and branch prediction, can tell you which memory order to use for atomic operations if you’re using too strong an ordering, AND it will catch silly bugs at the same time. I had a bounds check in a lock-free algorithm I was optimizing, which equated to `if(idx < start && idx >= end) return false`, and it mentioned that error while optimizing.
This guy was really putting it through its paces on already highly optimized code in an esoteric architecture (Nintendo 64) and it still found a few things:
https://youtu.be/20s9hWDx0Io
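For anyone wondering why that bounds check is a bug, here is a minimal Python sketch. It assumes a half-open range `[start, end)` and that the intended operator was "or" rather than "and"; the comment above does not show the actual fix, so both function names are made up for illustration.

```python
def in_bounds_buggy(idx: int, start: int, end: int) -> bool:
    # Mirrors the quoted check: reject only when idx is below start AND at/after end.
    # Whenever start <= end, both conditions cannot hold at once, so nothing is rejected.
    if idx < start and idx >= end:
        return False
    return True


def in_bounds_fixed(idx: int, start: int, end: int) -> bool:
    # Assumed intent: reject when idx falls outside the half-open range [start, end).
    if idx < start or idx >= end:
        return False
    return True


# The buggy version lets an obviously bad index through; the fixed one does not.
print(in_bounds_buggy(-1, 0, 10), in_bounds_fixed(-1, 0, 10))
```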
The uninformed preliminary answer instead is yes: loaded dice can produce better values.
The article is about using LLMs in an evolutionary framework to design better algorithms for LLM advancement, with particular regard to preference optimization algorithms - and guess what, it seems it worked.
I like the idea and respect where this comes from, but I think that titles have been perverted by SEO way too much, and at some point I need writers to stop trying to lure me in for my own good.
I wish we could get descriptive, boring, but accurate titles back.
I hate clickbait, but this one doesn't look like it: it's not misleading, it's not trying to hide something to get you to click, and it does not lure you with information that's not in the article.
Don't misunderstand: building system models using existing system responses as a way of analyzing those systems is a useful methodology, and it makes some otherwise tedious things less tedious, much like "high level" languages removed the tedium of writing assembly code. But for the same reason that a compiler won't emit a new, more powerful CPU instruction in its code generator, LLMs don't generate previously unseen system responses.
Is it possible for Copilot, or say Llama or GPT-4o, to suggest a piece of code, then actually run a test they design in an IDE, look at the results, and try to fix any issues?
Right now you ask an LLM to write code to do basic web scraping of the HN website for the latest URL and the username of the submitter. Sure, it will give you the code and a test script, but you as the user have to run the script and give manual feedback to the LLM.
If the testing step could be automated, the user would give an input and desired output, or a prompt, and choose between the results. That would be good.
Kind of like inpainting, outpainting and other painting stuff, but for code.
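The automated loop asked about above could look roughly like this. It is a sketch under stated assumptions, not how Copilot or any real tool works: `generate` stands in for an LLM call (here replaced by a canned `fake_llm`), the candidate is expected to define a function named `solve`, and `exec` runs it without any sandboxing.

```python
from typing import Callable, Optional

# Hypothetical interface: (task prompt, last error or None) -> Python source defining `solve`.
Generator = Callable[[str, Optional[str]], str]


def run_tests(source: str, cases: list) -> Optional[str]:
    """Run candidate source against (args, expected) pairs.
    Returns None if every case passes, else an error message to feed back."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # unsandboxed; a real tool would isolate this step
        solve = namespace["solve"]
        for args, expected in cases:
            got = solve(*args)
            if got != expected:
                return f"solve{args} returned {got!r}, expected {expected!r}"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def generate_until_passing(prompt: str, generate: Generator, cases: list, max_rounds: int = 3):
    """Ask for code, test it, and send failures back until tests pass or rounds run out."""
    feedback = None
    for round_no in range(1, max_rounds + 1):
        source = generate(prompt, feedback)
        feedback = run_tests(source, cases)
        if feedback is None:
            return source, round_no
    return None, max_rounds


def fake_llm(prompt: str, feedback: Optional[str]) -> str:
    # Stand-in for a model: wrong on the first try, corrected once it sees an error.
    if feedback is None:
        return "def solve(a, b):\n    return a - b\n"
    return "def solve(a, b):\n    return a + b\n"


source, rounds = generate_until_passing("add two numbers", fake_llm, [((1, 2), 3), ((0, 0), 0)])
print(rounds)
```

The user only supplies the prompt and the input/output pairs; the run-and-report step that is manual today happens inside `run_tests`.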
Genetic Programming was a thing in the 90s, but it was hampered by the inefficiency of largely random mutations (plus some crossover, which was still largely undirected) with low odds of doing anything helpful, and by a lack of computational speed for testing. A GP framework attempting to use LLMs to apply more or less "reasoned" changes within the same structure of generations of "mutations" tested against each other and the previous generation's best would be interesting.
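The generational structure described above can be sketched with the mutation operator left pluggable. This is an illustration only, not the article's method: `evolve`, `random_step`, and the toy fitness function are hypothetical names, and the random step is a placeholder for the spot where an LLM-proposed "reasoned" change would go.

```python
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def evolve(
    seeds: List[T],
    mutate: Callable[[T], T],
    fitness: Callable[[T], float],
    generations: int = 30,
    children_per_parent: int = 4,
    survivors: int = 5,
) -> T:
    """Keep the fittest candidates across generations; parents compete with their children."""
    population = list(seeds)
    for _ in range(generations):
        children = [mutate(p) for p in population for _ in range(children_per_parent)]
        # Children are ranked against each other and against the previous generation's best.
        population = sorted(population + children, key=fitness, reverse=True)[:survivors]
    return population[0]


# Toy problem: candidates are integers and the target is 42.
rng = random.Random(0)


def random_step(x: int) -> int:
    # Undirected 90s-style mutation; an LLM call would replace this function.
    return x + rng.randint(-3, 3)


def fitness(x: int) -> float:
    return -float((x - 42) ** 2)


seeds = [0, 100]
best = evolve(seeds, random_step, fitness)
print(best)
```

Because survivors are carried over unchanged, the best candidate never gets worse from one generation to the next; what a smarter `mutate` would change is how many generations it takes to get anywhere.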
The key bit here is there is no known way (as yet) to encode "reasoning".
I was a big fan of genetic programming, wrote a lot of code, did lots of research. And unlike LLMs, it could end up with code that had never been written before and that accomplished some task, but the random walk through a galaxy-sized search space with atom-sized (or maybe molecule-sized) regions of solutions made it computationally infeasible.
If one could somehow encode 'reasoning', one could do the equivalent of gradient descent to converge on a working solution, but without that you are unlikely[1] to find anything in a reasonable amount of time.
[1] The chance is non-zero but it is very very near zero.
LLMs can definitely end up with code that has never been written before, even before considering that you can both ask them for modifications to very constrained parts of the code and sample more broadly than always picking the most probable tokens.
But they also appear to have a far higher probability of producing changes that move towards something that will run.
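"Sampling more broadly" usually means temperature sampling. A minimal sketch, assuming raw next-token scores (`logits`) and treating a temperature of 0 as "always take the top token", which is a common convention rather than something stated in this thread; the function name is made up.

```python
import math
import random
from typing import List


def sample_with_temperature(logits: List[float], temperature: float, rng: random.Random) -> int:
    """Return a token index. Higher temperature flattens the distribution."""
    if temperature <= 0:
        # Assumed convention: greedy decoding, i.e. always the most probable token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [score / temperature for score in logits]
    top = max(scaled)
    # Subtract the max before exponentiating for numerical stability.
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 1.0, 0.0]
greedy = {sample_with_temperature(logits, 0.0, rng) for _ in range(200)}
broad = {sample_with_temperature(logits, 5.0, rng) for _ in range(200)}
print(greedy, broad)
```

Greedy decoding only ever returns index 0 here, while the higher temperature also reaches less likely tokens, which is how output that is not the single most probable continuation gets produced.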
Yes, exactly. Tools perform without "knowing" the purpose. Unintelligent yet effective.
So, "perform a selection over the enumerated combinations in the solutions space" works without the process being further sophisticated. It works as much as it can - as a preparation of data until the stage in which intelligence is required.
We have been doing it for a while: simulated annealing, genetic algorithms... Dumb hammers in a way, encoding an action from an intelligent operator, and providing an effective aid when under intelligent control.
I guess the key aspect of human invention is a stochastic element in the combination of existing patterns, i.e. seeing (or imagining) connections that are not obvious.
Of course, they can invent anything. A better question is how efficient? Because even with brute force you can invent anything: https://libraryofbabel.info/
Yes, but you may need a lot of monkeys.