Hacker News | feanaro's comments

This alone still wouldn't be a clear demonstration that AGI is around the corner. It's quite possible a LLM could've done Einstein's job, if Einstein's job was truly just synthesising already available information into a coherent new whole. (I couldn't say, I don't know enough of the physics landscape of the day to claim either way.)

It's still unclear whether this process could be merely continued, seeded only with new physical data, in order to keep progressing beyond that point, "forever", or at least for as long as we imagine humans will continue to go on making scientific progress.


Einstein is chosen in such contexts because he's the paradigmatic paradigm-shifter. Basically, what you're saying is: "I don't know enough history of science to confirm this incredibly high opinion of Einstein's achievements. It could just be that everyone's been wrong about him, and if I really got down and dirty and learned the facts at hand, I might even prove it." Einstein is chosen to avoid exactly this kind of nit-picking.


They can also choose Euler or Gauss.

These two are so above everyone else in the mathematical world that most people would struggle for weeks or even months to understand something they did in a couple of minutes.

There's no "get down and dirty" shortcut with them =)


No, by saying this, I am not downplaying Einstein's sizeable achievements nor trying to imply everyone was wrong about him. His was an impressive breadth of knowledge and mathematical prowess and there's no denying this.

However, what I'm saying is not mere nitpicking either. It is precisely because of my belief in Einstein's extraordinary abilities that I find it unconvincing that an LLM being able to recombine the extant written physics-related building blocks of 1900, with its practically infinite reading speed, necessarily demonstrates comparable capabilities to Einstein.

The essence of the question is this: would Einstein, having been granted eternal youth and a neverending source of data on physical phenomena, be able to innovate forever? Would an LLM?

My position is that even if an LLM is able to synthesise special relativity given 1900 knowledge, this doesn't necessarily mean that a positive answer to the first question implies a positive answer to the second.


I'm sorry, but 'not being surprised if LLMs can rederive relativity and QM from the facts available in 1900' is a pretty scalding take.

This would absolutely be very good evidence that models can actually come up with novel, paradigm-shifting ideas. It was absolutely not obvious at the time from the existing facts, and some crazy leaps of faith needed to be taken.

This is especially true for General Relativity, for which there were just a few mismatches in the measurements, like Mercury's precession, and where the theory almost entirely follows from thought experiments.


Isn't it an interesting question? Wouldn't you like to know the answer? I don't think anyone is claiming anything more than an interesting thought experiment.


This does make me think about Kuhn's concept of scientific revolutions and paradigms, and that paradigms are incommensurate with one another. Since new paradigms can't be proven or disproven by the rules of the old paradigm, if an LLM could independently discover paradigm shifts similar to moving from Newtonian gravity to general relativity, then we have empirical evidence of an LLM performing a feature of general intelligence.

However, you could also argue that it's actually empirical evidence that the move from 19th century physics to general relativity wasn't truly a paradigm shift -- you could have 'derived' it from previous data -- and that the LLM has actually proven something about structural similarities between those paradigms, not that it's demonstrating general intelligence...


His concept sounds odd. There will always be many hints of something yet to be discovered, simply by the nature of anything worth discovering having an influence on other things.

For instance, spectroscopy enables one to look at the spectra emitted by another 'thing', perhaps the sun, and it turns out that there are little streaks within the spectra that correspond directly to various elements. This is how we're able to determine the elemental composition of things like the sun.

That connection between elements and the patterns in their spectra was discovered in the early 1800s. And those patterns are caused by quantum mechanical interactions and so it was perhaps one of the first big hints of quantum mechanics, yet it'd still be a century before we got to relativity, let alone quantum mechanics.


You should read it


I mean, "the pieces were already there" is true of everything? Einstein was synthesizing existing math and existing data is your point right?

But the whole question is whether or not something can do that synthesis!

And the "anyone who read all the right papers" thing - nobody actually reads all the papers. That's the bottleneck. LLMs don't have it. They will continue to not have it. Humans will continue to not be able to read faster than LLMs.

Even me, using a speech synthesizer at ~700 WPM.


> I mean, "the pieces were already there" is true of everything? Einstein was synthesizing existing math and existing data is your point right?

If it's true of everything, then surely having an LLM work iteratively on the pieces, along with being provided additional physical data, will lead to the discovery of everything?

If the answer is "no", then surely something is still missing.

> And the "anyone who read all the right papers" thing - nobody actually reads all the papers. That's the bottleneck. LLMs don't have it. They will continue to not have it. Humans will continue to not be able to read faster than LLMs.

I agree with this. This is a definitive advantage of LLMs.


No, that's a completely different concept, because we have faultless machines which perfectly and deterministically translate high-level code into byte-level machine code. This is another case of (nearly) perfect abstraction.

On the other hand, the whole deal of the LLM is that it does so stochastically and unpredictably.


The unpredictable part isn't new - from a project manager's point of view, what's the difference between an LLM and a team of software engineers? Both, from that POV, are a black box. The "how" is not important to them, the details aren't important. What's important is that what they want is made a reality, and that customers can press on a button to add a product to their shopping cart (for example).

LLMs mean software developers let go of some control of how something is built, which makes one feel uneasy because a lot of the appeal of software development is control and predictability. But this is the same process that people go through as they go from coder to lead developer or architect or project manager - letting go of control. Some thrive in their new position, having a higher overview of the job, while some really can't handle it.


"But this is the same process that people go through as they go from coder to lead developer or architect or project manager - letting go of control."

In those circumstances, it's delegating control. And it's difficult to judge whether the authority you delegated is being misused if you lose touch with how to do the work itself. This comparison shouldn't be pushed too far, but it's not entirely unlike a compiler developer needing to retain the ability to understand machine code instructions.


As someone who started off fixing assembly issues for a large corporation: assembly code may sometimes contain issues very similar to those in more high-level code, so the perfection of the abstraction is not guaranteed.

But yeah, there's currently a wide gap between that and a stochastic LLM.


We also have machines that can perfectly and deterministically check written code for correctness.

And the stochastic LLM can use those tools to check whether its work was sufficient; if not, it will try again - without human intervention. It will repeat this loop until the deterministic checks pass.
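A minimal sketch of that loop in Python. The generate_patch stub is hypothetical (a real system would call a model there and feed the checker's errors back into the prompt); the deterministic check here is just Python's own byte-compiler:

```python
import subprocess
import sys

def generate_patch(feedback):
    """Stand-in for an LLM call (hypothetical): returns candidate source code.
    A real implementation would include `feedback` in the prompt."""
    return "print('hello')\n"

def run_checks(path):
    """Deterministic check: does the file at least compile as Python?"""
    result = subprocess.run([sys.executable, "-m", "py_compile", path],
                            capture_output=True, text=True)
    return result.returncode == 0, result.stderr

def fix_until_green(path, max_attempts=5):
    """Generate, check, feed errors back, repeat until the checks pass."""
    feedback = ""
    for _ in range(max_attempts):
        with open(path, "w") as f:
            f.write(generate_patch(feedback))
        ok, feedback = run_checks(path)
        if ok:
            return True
    return False
```

The same skeleton works with any deterministic gate (a test suite, a linter, a type checker) swapped into run_checks.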


> We also have machines that can perfectly and deterministically check written code for correctness.

Please do provide a single example of this preposterous claim.


It's not like testing code is a new thing. JUnit is almost 30 years old today.

For functionality: https://en.wikipedia.org/wiki/Unit_testing

With robust enough test suites you can vibe code an HTML5 parser:

- https://ikyle.me/blog/2025/swift-justhtml-porting-html5-pars...

- https://simonwillison.net/2025/Dec/15/porting-justhtml/

And code correctness:

- https://en.wikipedia.org/wiki/Tree-sitter_(parser_generator)

- https://en.wikipedia.org/wiki/Roslyn_(compiler)

- https://en.wikipedia.org/wiki/Lint_(software)

You can make analysers that check for deeply nested code, people calling methods in the wrong order and whatever you want to check. At work we've added multiple Roslyn analysers to our build pipeline to check for invalid/inefficient code, no human will be pinged by a PR until the tests pass. And an LLM can't claim "Job's Done" before the analysers say the code is OK.
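As a toy illustration of that kind of analyser (not Roslyn; just Python's standard ast module), here's a check that fails any code nested more deeply than a chosen threshold, the sort of rule a build pipeline could enforce before an LLM is allowed to claim the job's done:

```python
import ast

MAX_DEPTH = 3  # arbitrary threshold for this sketch

def max_nesting(tree):
    """Return the deepest nesting level of control-flow statements."""
    def depth(node, d=0):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.With, ast.Try)):
            d += 1
        children = [depth(child, d) for child in ast.iter_child_nodes(node)]
        return max(children, default=d)
    return depth(tree)

def check_source(source):
    """Pass/fail gate, like an analyser wired into a CI pipeline."""
    return max_nesting(ast.parse(source)) <= MAX_DEPTH
```

Real analysers (Roslyn, lint, etc.) work on richer representations than a bare AST, but the shape is the same: a deterministic predicate over the code that either passes or blocks the merge.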

And you don't need to make one yourself, there are tons you can just pick from:

https://en.wikipedia.org/wiki/List_of_tools_for_static_code_...


> It's not like testing code is a new thing. Junit is almost 30 years old today.

Unit tests check whether code behaves in specific ways. They certainly are useful to weed out bugs and to ensure that changes don't have unintended side effects.

> And code correctness:

These are tools to check for syntactic correctness. That is, of course, not what I meant.

You're completely off the mark here.


What did you mean then if unit tests and syntactic correctness aren't what you're looking for?


Algorithmic correctness? Unit tests are great for quickly poking holes in obviously algorithmically incorrect code, but far from good enough to ensure correctness. Passing unit tests is necessary, not sufficient.
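A tiny illustration of "necessary, not sufficient": a buggy max function (hypothetical example) that happily passes a plausible-looking unit test while being algorithmically wrong.

```python
def my_max(xs):
    """Buggy: silently assumes all values are non-negative."""
    best = 0
    for x in xs:
        if x > best:
            best = x
    return best

# Unit tests the buggy implementation passes:
assert my_max([1, 5, 3]) == 5
assert my_max([2, 2]) == 2

# ...and yet my_max([-3, -1]) returns 0, not the true maximum -1.
```

The tests constrain behaviour at a few points; they don't pin down the algorithm.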

Syntactic correctness is more or less a solved problem, as you say. Doesn't matter if the author is a human or an LLM.


It depends on the algorithm of course. If your code is trying to prove P=NP, of course you can't test for it.

But it's disingenuous to claim that even the majority of code written in the world is so difficult algorithmically that it can't be unit-tested to a sufficient degree.


Suppose you're right and the "majority of code" is fully specified by unit testing (I doubt it). The remaining body of code is vast, and the comments in this thread seem to overlook that.


This is essentially a group theoretical result about permutation groups. Would be nice to see a treatment from this angle.


And yet the categorical concepts in Hask are undoubtedly practically useful, more so than an arbitrary sample of concepts, and compose extraordinarily well. Does that have nothing to do with those concepts deriving from (even more general concepts of) category theory?


I don't think anything good ever came from Ylva Johansson. Mentions of her name on something should make one automatically treat that thing with suspicion.


How do you discover the principles in the first place? You can discover them once and then apply them in all applicable places precisely because you generalised them.

You have a point that the result may very well be more easily explained in concrete terms to practitioners of a given field in which you applied it, though.


What do you mean by the Wireguard option for mitmproxy?

EDIT: Oh, look at this https://mitmproxy.org/posts/wireguard-mode/. TIL.


It's a pretty neat feature! I think it's in beta but it works flawlessly in my experience. Sure is a lot easier than setting up a separate (W)LAN with iptables rules to force redirect traffic.
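For reference, the mode is enabled with a single flag (a config-style invocation; port and interface details vary by setup):

```shell
# Run mitmproxy as a WireGuard server; it prints a client config
# to import into the WireGuard app on the phone being tested.
mitmproxy --mode wireguard
```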


A server is someone else's device. Your phone is your own device. So no, doing the scan on your own device and making your device your potential adversary is not better than doing it on the server. You can always choose not to use the server.


This doesn't follow.

Apple only ever scanned images being uploaded to the server. They were only ever going to scan images (even if it was done on the local device) if they were uploaded to the server.

On the one hand you have:

- do the scan in private, get a pass (I'm assuming we all get a pass), and no-one outside of your phone ever even looks at your images.

On the other hand, you:

- do the scan on upload. Some random bloke in support gets tasked with looking at 1 in every 10,000 images (or whatever) to make sure the algorithm is working, and your photo of little Bobby doing somersaults in the back garden is now being studied by Jim.

If you never uploaded it, it was never scanned, in either case.

So yes, you've lost privacy because faux outrage on the internet raised enough eyebrows. Way to go.


So there is an upper limit, which is the real price?


What are you even on about, mate? A hacker's multi tool with infinite potential for exploration is an idea "too malicious" to consider?

