As LLMs become stronger, we're seeing a new situation: a human discovers a mathematical method or result, but the formal proof is generated (and even cross-verified) by multiple LLMs, even when the original author can't fully reproduce the proof themselves.
Should such AI-generated proofs be considered valid and publishable?
What standards should apply when the idea is human-created, but the proof is AI-derived?
Curious to hear opinions from mathematicians, engineers, researchers, and journal editors. This feels like an important shift in how we think about proofs and authorship.
Can you point out examples of this? So far, what's been discussed and used is having LLMs generate Lean (or possibly other formal languages), which is machine-checkable without the need for another LLM. This isn't LLMs peer-reviewing LLM work; it's people and deterministic (hopefully correct) machines checking LLM work.
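For anyone unfamiliar with that workflow: "machine-checkable" means the proof compiles under Lean's small, deterministic kernel, and once it type-checks there's nothing left for a second LLM (or a human referee re-deriving the argument) to judge. A toy illustration, not taken from any of the projects being discussed:

    -- Toy Lean 4 lemma: if this file compiles, the kernel has verified the proof.
    -- Nat.add_comm is a standard library theorem; no LLM is judging correctness here.
    theorem my_add_comm (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b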
The only LLMs-checking-LLM-proofs stuff I've seen is from loons, like the people who think they've figured out life, the universe, and everything after a few Adderall-filled days with ChatGPT.