Can someone please explain... If I don't know any Lean (and I suspect most people don't), is it of any direct value to me? I'm trying to understand whether there's something it can help me with (e.g. automatically writing proofs for my Go programs somehow... I'm not sure), or whether I should just cheer for more open models in general, even though this one isn't for me?
Presumably the idea is that an agent generates a Lean4 specification against which the software is measured.
But then the Lean4 specification effectively becomes the software artifact.
And we're sort of back to square one. How do you verify that a Lean4 spec is correct (and that it describes what needs to be built in the first place) without human review?
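To make the problem concrete, here is a toy sketch (all names hypothetical, deliberately trivial) of what such a Lean 4 spec could look like. Lean can check the proof mechanically, but nothing in the toolchain can check that the statement matches the real requirement:

```lean
-- Hypothetical toy example of a Lean 4 "spec": the statement is the artifact.

def insertSorted (x : Nat) : List Nat → List Nat
  | [] => [x]
  | y :: ys => if x ≤ y then x :: y :: ys else y :: insertSorted x ys

-- Minimal sortedness predicate, so the example needs no libraries.
def Sorted : List Nat → Prop
  | x :: y :: rest => x ≤ y ∧ Sorted (y :: rest)
  | _ => True

-- The spec: inserting into a sorted list keeps it sorted. A machine can
-- verify a proof of this; only a human can tell whether "keeps the list
-- sorted" was what actually needed to be built.
theorem insertSorted_sorted (x : Nat) (xs : List Nat) (h : Sorted xs) :
    Sorted (insertSorted x xs) := by
  sorry -- proof elided; the point is that the statement itself needs review
```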
Oh, I was not comparing it with outsourced development; I was comparing it with developing it oneself.
Sure, outsourcing is similar, but the difference is that one uses a process that is inherently probabilistic and shows up in every result, while the other just depends on the probability of you getting a good team.
I suspect the unspoken premise was that it was all in the context of people who - just like those who hire contractors - don’t have the capacity to do it themselves.
In this context I suspect a SotA LLM could sometimes beat some cost-comparable UpWork professionals in both quality and spec adherence. In other words, if you need an app, can’t build it yourself, and have a tight budget, LLMs are quickly becoming a viable option for more and more complex apps (still only simple ones before the output turns to junk, but progress is pretty astounding)
> beat some cost-comparable UpWork professionals in both quality and spec adherence...
I am not sure I want to keep paying for something that needs some amount of luck on my side to be useful. Writing elaborate plans for LLMs also feels a bit pointless when there is no hard and fast rule about how much of the plan will be followed.
Some people appear to be doing it, but I am afraid it is not something that will have universal appeal.
> The real issue is that cheaters suck the fun out of the game
Unpopular opinion: cheaters don’t, griefers do.
“Cheater” is a pejorative for someone who sidesteps the rules and uses technology instead of, uh, pardon the potentially poor word choice, innate skills. They don’t inherently want to see others suffer as they stomp - it’s a matchmaking bug that they’re put where they don’t belong. They just want to do things they cannot do on their own but which are technically possible. A more positive term for that is “hacker”.
Griefers are a different breed: they don’t just enjoy their own success but get entertained by others’ suffering. Not a cheating issue TBH (cheats merely enable more opportunities); more of a “don’t match us anymore, we don’t share the same ideas of fun” thing. “Black hat” is a close enough term, I guess.
YMMV, but if someone performs adequately for my skill level (that is, they also don’t play well), then they don’t deprive me of any fun, irrespective of how they’re playing.
Yeah, that’s a really unpopular opinion. Cheaters don’t want to play the game. There is no matchmaking for them that makes sense.
They have inhuman skills, usually paired with terrible game IQ and generally awful toxicity. They get boosted up to play with intelligent players purely because they can hold a button to outplay everyone. It gets to the point where you have a player on your team who has no idea how to play but is mechanically good, and it breaks the entire competitiveness of the game.
> They don’t inherently want to see others suffer as they stomp
Cheaters want to dominate other players, feel like they deserve to dominate other players and are perfectly happy for other players to suffer as long as they feel good.
That’s provably not universally true, although I have no idea about the exact demographics.
The best I’ve ever seen was some online discussion about motives, but I never compiled any statistics out of the random anecdotes (which must be biased and probably aren't representative).
That's a gross exaggeration. Some people just want to play the game, but lack motor skills commensurate with their other abilities.
Are players who take advantage of developer-supplied aim assist and other assistive technologies "motivated by a toxic sense of self regard and a desire to humiliate others"?
The world would surely be a saner place if, instead of “MCP vs CLI”, people talked about “JSON-RPC vs execlp(3)”.
Not accurate, but it at least makes one think of the underlying semantics. Because, really, what matters is some DSL to discover and describe action invocations.
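For the flavor of it, a rough Go sketch of what each side boils down to (the message shape is plain JSON-RPC 2.0; the MCP method and tool names here are illustrative assumptions, not a spec reference):

```go
package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
)

// MCP-style: a "tool call" is ultimately a JSON-RPC 2.0 message over some transport.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  map[string]any `json:"params"`
}

func main() {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "tools/call", // illustrative method name
		Params: map[string]any{
			"name":      "grep",
			"arguments": map[string]any{"pattern": "TODO"},
		},
	}
	msg, _ := json.Marshal(req)
	fmt.Println(string(msg)) // what actually crosses the wire

	// CLI-style: the same action as an exec'd process - execlp(3) territory.
	out, err := exec.Command("grep", "-rn", "TODO", ".").CombinedOutput()
	if err != nil {
		fmt.Println("grep:", err) // grep exits non-zero when nothing matches
	}
	fmt.Print(string(out))
}
```

Either way, the interesting layer is the one that discovers and describes the invocations, which is exactly the DSL point.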
That was all x86_64, but even if aarch64 is more memory-efficient, the difference can’t be too drastic, and 8GiB was borderline unusable even 10 years ago.
Nowadays it must be a teeth-grinding tight fit for a browser and a couple of Electron apps, held together with a prayer that the next website doesn’t go too crazy with the bells and whistles and wasn’t vibeslopped with utter disregard for any big-Os.
Because, look around: the same code compiled for x86_64 and aarch64 is not that drastically different in size, save for some special cases (like NumPy). Data structures are going to show even fewer differences. And the assets are the same.
I’ve cursorily checked a few programs and the difference seemed to be about 10-20% (with some exceptions), so 8GiB of RAM on aarch64 is like 10GB on x86_64. Significantly nicer, but not life-changingly nicer - you’re still very limited.
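Spelling out that back-of-the-envelope step (taking the 20% upper bound: if the same working set is ~20% smaller on aarch64, the x86_64 equivalent is):

$$\frac{8\,\mathrm{GiB}}{1 - 0.2} = 10\,\mathrm{GiB} \approx 10.7\,\mathrm{GB}$$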
Edit: The next comment has a very good point about memory and SSD bandwidth increases allowing faster swap and compressed-RAM performance. That’s something I hadn’t considered. So maybe it’ll feel closer to an old 16GiB machine or something like that…
Yeah. Also the bandwidth of modern soldered-on Mac SSDs is insane compared to where it was in the Intel era. The performance impact of moving applications in and out of swap should be much lower than it was a few years ago.
Neither of those solves it; they just try to conserve the status quo.
The issue, as I understand it, is literally a new Eternal September, just that instead of “noobs” there are “clankers” this time.
Personally, I don’t give a flying fuck about things like gender, organs (like skin or genitalia) or the absence thereof, or anything of the sort when someone posts something online, unless the posted content is strongly related to one of those topics. Ideas matter no matter who or what produces them. Species fits into the same aspects-I-don’t-care-about list just fine - on the Internet nobody knows^W cares you’re a dog. Or a bunch of matrices in a trench coat. As long as you behave in a socially appropriate way.
The problem with bots is that they’re not just noobs - unlike us meatbags, they don’t just do wrong and stupid things but can’t possibly learn to stop (because the models are static). Solving that, I think, is the true solution, the one that brings the Internet back to life. Anything else seems to just address things correlated with the symptoms.
(Yea, I’m leaning towards technooptimist and transhumanist views - I was raised in a culture that had a lot of those, and was sold a dream of a progress that transcends worlds, and I haven’t found a reason to denounce that. Your mileage may vary.)
Instead of full e-voting I would love to see a verification scheme layered on top of the traditional paper ballot. Something like STAR-Vote or Scantegrity. Even if it’s flawed, it would be nice to run precisely because it doesn’t affect the election but could produce useful insights. If it fails, nothing particularly bad happens; if it works, cool, we get extra assurances or maybe spot some fraud we weren’t aware of.
But there seems to be either no political will or some issue with the practical implementations. There were some municipal experiments here and there, and then just… crickets. Does anyone know what happened to those efforts?
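For intuition only, here's a toy hash-commitment sketch in Go - emphatically not Scantegrity or STAR-Vote (real schemes are far more careful; in particular, a receipt must not prove how you voted, which this toy deliberately ignores) - but it shows the basic “paper ballot plus publicly checkable record” flavor:

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// commit hashes the ballot choice with a random salt. The hash goes on a
// public bulletin board; the salt stays on the voter's paper receipt.
func commit(choice string) (published, receiptSalt string, err error) {
	salt := make([]byte, 16)
	if _, err = rand.Read(salt); err != nil {
		return "", "", err
	}
	h := sha256.Sum256(append(salt, []byte(choice)...))
	return hex.EncodeToString(h[:]), hex.EncodeToString(salt), nil
}

// verify lets the voter confirm their ballot is included on the board,
// by recomputing the commitment from their receipt.
func verify(published, receiptSalt, choice string) bool {
	salt, err := hex.DecodeString(receiptSalt)
	if err != nil {
		return false
	}
	h := sha256.Sum256(append(salt, []byte(choice)...))
	return hex.EncodeToString(h[:]) == published
}

func main() {
	board, receipt, err := commit("candidate A")
	if err != nil {
		panic(err)
	}
	fmt.Println("on the bulletin board:", board)
	fmt.Println("receipt checks out:", verify(board, receipt, "candidate A"))
}
```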
In my naive understanding, neither requires any will or consciousness.
S1 is “bare” language production: picking words or concepts to say or think via fancy pattern prediction. There’s no reasoning at this level, just blabbering. However, language by itself weeds out the most obvious nonsense purely statistically (some concepts are rarely in the same room), though we may call that “mindless” - that’s why even early LLMs produced semi-meaningful texts.
S2 is a set of patterns inside the language (“logic”) that biases S1 to produce reasoning-like phrases. It doesn’t require any consciousness or will, just concepts pushing S1 towards a special structure; simply invoking one keeps them “in mind” and throws them into the mix.
I suspect S2 has a spectrum of rigor, because one can just throw in some rules (like “if X then Y, not Y, therefore not X”) or may do fancier stuff (imposing a larger structure on it all, like formulating and testing a null hypothesis). Either way it all falls back on S1 for the ultimate decision-making, a sense of what sounds right (allowing us our favorite logical flaws), thus the fancier the rules (patterns of “thought”), the sounder the reasoning is likely to be.
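(That first rule is just modus tollens, and it's the sort of pattern that is formally a one-liner - here in Lean 4, purely as a reference point:)

```lean
-- Modus tollens: the "if X then Y, not Y, therefore not X" pattern above.
theorem modus_tollens {X Y : Prop} (hxy : X → Y) (hny : ¬Y) : ¬X :=
  fun hx => hny (hxy hx)
```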
S2 doesn’t just rely on S1-as-language but is a part of it, though, because it’s a phenomenon born out of (and inside) the language.
Whether it’s willfully, “consciously” engaged, or whether it works just because S1 predicts the logical-thinking concept as appropriate for certain lines of thinking and starts to invoke it, probably doesn’t even matter - the answer mainly depends on whichever definition of “will” we pick (there are many).
LLMs and humans can hypothetically do both just fine, but when it comes to checking, humans currently excel because (I suspect) they have a “wider” language in S1, one that doesn’t only include word-concepts but also sensory concepts (like visuospatial thinking). Hence, as I get it, the world-models idea.
I have this gut feeling the world has finally got to the point where it has decided to fix this pesky warming issue, among other things, with a nice and cozy nuclear winter.