> Most of these discoveries are on open problems suggested to us by external mathematicians Javier Gomez Serrano and Terence Tao, who also advised on how to best formulate them as inputs to AlphaEvolve. This highlights the potential for synergistic partnerships between AI-driven discovery engines like AlphaEvolve and human mathematical expertise.
Next FIFA will be revolutionary. It will probably have 26 in the title and will cost 80 USD or more.
If they could, they would rent out each player in your team on a per-match basis.
But if anything, those rules benefit ChatGPT: it can remember ~all of Wikipedia and translate ~every language on Earth, while a human would need access to online services for that.
If anything, I'd think allowing looking stuff up would benefit human players over ChatGPT (though humans are probably much slower at it, so they probably lose on time).
If it takes a model plus a database containing a large chunk of the internet to compete and win, that says something: that setup is much more expensive and complex than the model alone, because models have problems "remembering" correctly just like people do.
Fair and equivalent testing matters not because it lets people win, but because it usefully shows where the strengths and weaknesses of people and current AI actually lie.
I'm not sure how to make sense of this in the context of what we're discussing. Access to the web is exactly what's in question, and emulating the internet well enough that you don't actually need to access it to have the information is very expensive in resources, because the dataset is so massive. That was the point I was making.
Because an accepted answer to that specific question is invariably a link/reference that the asker could have searched for (and posted if they think it's useful for the discussion) themselves directly, instead of putting that burden on the rest of us and amortizing everyone's attention. It's entitled and lazy.
Alternative example: "I wondered what the rules actually say about web search and it is indeed not allowed: (link)"
I would put that the other way around: a thing with an owner does not have rights, and an AI will have an owner. The software may not have an owner, but the hardware it runs on does.
"Imo an AI with a fixed reward function doesn't seem like agi to me"
They specifically talk about this in the position paper and describe the need for a "flexible", adaptive reward function. It's very hand-wavy and doesn't really explain how they would do this, but there's a lot of research along these lines. AGI also isn't really the subject of the paper.
He also brought us IC-Light! I wonder why he's still contributing to open source... Surely all the big companies have made him huge offers. He's so talented
I think he is working on his Ph.D. at Stanford. I assume whatever offers he has haven't been attractive enough to make him abandon that. Whether he'll still be doing open work afterwards or get sucked into the bowels of some proprietary corporate behemoth remains to be seen, but I suspect he won't have trouble monetizing his skills either way.
Looks like a cool paper. It's really puzzling to me why Llama turned out to be so bad while they keep releasing great research. Especially considering the number of GPUs they have, Llama really seems inexcusable compared to models from much smaller teams with far fewer resources.
Llama will advance further just like the rest. The LLM leaderboards are constantly changing. They will all reach a maturity point and be roughly the same; that's probably something we'll see within the next 1-3 years at most. Then it'll just be incremental drops in the cost to train and run them, while quality stays comparable. Not to mention we're already running out of training data.
It was always about more than GPUs: even when the original Llama came out, the community released fine-tunes that would bench higher than the base model. And with the DeepSeek distilled models, it turned out you could fine-tune some reasoning into a base model and make it perform better.
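That DeepSeek-style recipe is, at its core, supervised fine-tuning on reasoning traces generated by a stronger "teacher" model. As a rough illustration only, here is a minimal sketch using the Hugging Face Trainer API; the base model name, the `reasoning_traces.jsonl` file, its prompt/reasoning/answer fields, and the hyperparameters are hypothetical placeholders, not what DeepSeek actually used.

```python
# Minimal sketch (assumed setup): fine-tune a small base model on teacher
# reasoning traces with a plain causal-LM objective. All names below are
# illustrative, not a reproduction of any published recipe.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import load_dataset

base = "meta-llama/Llama-3.2-1B"  # hypothetical choice of base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Each record is assumed to hold a prompt plus a long chain-of-thought answer
# exported from a stronger model, stored as JSONL.
raw = load_dataset("json", data_files="reasoning_traces.jsonl")["train"]

def tokenize(example):
    text = example["prompt"] + "\n" + example["reasoning"] + "\n" + example["answer"]
    out = tokenizer(text, truncation=True, max_length=2048)
    out["labels"] = out["input_ids"].copy()  # standard next-token objective
    return out

train = raw.map(tokenize, remove_columns=raw.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sft-reasoning",
        per_device_train_batch_size=1,  # batch size 1 avoids needing a padding collator
        num_train_epochs=1,
        learning_rate=1e-5,
    ),
    train_dataset=train,
)
trainer.train()
```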