I think the commenter is saying that they will combine a world model with the word model. The resulting combination may be sufficient for very solid results.
Note that humans generate their own incomplete world model. For example, there are sounds and colors we don't hear or see, odors we don't smell, etc. We have an incomplete model of the world, but we still have a model that proves useful for us.
> they will combine a world model with the word model.
This takes "world model" far too literally. Audio-visual generative AI models that create non-textual "spaces" are not world models in the sense the previous poster meant. I think what they meant by world model is that the vast majority of the knowledge we rely upon to make decisions is tacit, not something that has been digitized, and not something we even know how to meaningfully digitize and model. And even describing it as tacit knowledge falls short; a substantial part of our world model is rooted in our modes of action, motivations, etc., and not coupled together in simple recursive input -> output chains. There are dimensions to our reality that, before generative AI, didn't see much systematic introspection. After all, we're still mired in endless nature vs. nurture debates; we have a very poor understanding of ourselves. In particular, we have an extremely poor understanding of how we and our constructed social worlds evolve dynamically, and it's that aspect of our behavior that drives the frontier of exploration and discovery.
OTOH, the "world model" contention feels tautological, so I'm not sure how convincing it can be for people on the other side of the debate.
Really all you're saying is the human world model is very complex, which is expected as humans are the most intelligent animal.
At no point have I seen anyone here ask the question "What is the minimum viable state of a world model?"
We as humans with our ego seem to state that because we are complex, any introspective intelligence must be as complex as us to be as intelligent as us. Which doesn't seem too dissimilar to saying a plane must flap its wings to fly.
Has any generative AI been demonstrated to exhibit the generalized intelligence (e.g. achieving in a non-simulated environment complex tasks or simple tasks in novel environments) of a vertebrate, or even a higher-order non-vertebrate? Serious question--I don't know either way. I've had trouble finding a clear answer; what little I have found is highly qualified and caveated once you get past the abstract, much like attempts in prior AI eras.
> Planning: We demonstrate that V-JEPA 2-AC, obtained by post-training V-JEPA 2 with only 62 hours of unlabeled robot manipulation data from the popular Droid dataset, can be deployed in new environments to solve prehensile manipulation tasks using planning with given subgoals. Without training on any additional data from robots in our labs, and without any task-specific training or reward, the model successfully handles prehensile manipulation tasks, such as Grasp and Pick-and-Place with novel objects and in new environments.
There is no real bar any more for generalized intelligence. The bars that existed prior to LLMs have largely been met. Now we’re in a state where we are trying to find new bars, but there are none that are convincing.
The ARC-AGI 2 private test set is one current bar that a large number of people find important, and it will again be convincing to a large number of people if LLMs start doing really well on it. The performance drop on the private set is still huge, though, and results remain far inferior to human performance.
This. For areas where you can use tried-and-tested libraries (or tools in general), LLMs will generate better code when they use them.
In fact, LLMs will be better than humans in learning new frameworks. It could end up being the opposite: frameworks and libraries could become more important with LLMs.
> In fact, LLMs will be better than humans in learning new frameworks.
LLMs don't learn? The neural networks are trained just once before release, and it's an extremely expensive process.
Have you tried using one on your existing code base, which is basically a framework for whatever business problem you're solving? Did it figure it out automagically?
They know react.js and nest.js and next.js and whatever.js because they had humans correct them and billions of lines of public code to train on.
Wouldn't there be a chicken and egg problem once humans stop writing new code directly? Who would write the code using this new framework? Are the examples written by the creators of the framework enough to train an AI?
There's tooling out there that's 100% vibe coded and used by tens of thousands of devs daily. If that codebase found its way into training data, would it somehow ruin everything? I don't think this is really a problem. The problem will become identifying good codebases from bad ones; if you point out which code is bad during training, it makes a difference. There's a LOT of writing out there about how to write better code that I'm sure is already part of the training data.
How much proprietary business logic is on public github repos?
I'm not talking about "do me this solo founder saas little thing". I'm talking about working on existing codebases running specialized stuff for a functional company or companies.
> LLMs will be better than humans in learning new frameworks.
I don't see a basis for that assumption. They're good at things like Django because there is a metric fuckton of existing open-source code out there that they can be trained on. They're already not great at less popular or even fringe frameworks and programming languages. What makes you think they'll be good at a new thing that there are almost no open resources for yet?
Yeah, I don't know why you'd drop frameworks and libraries just because you're using an LLM. If you AREN'T using them, you're just loading a bunch of solved problems into the LLM's context so it can re-invent the wheel. I really love the LLM because now I don't need to learn the new frameworks myself. LLMs really remove all the bullshit I don't want to think about.
LLMs famously aren’t that good at using new frameworks/languages. Sure they can get by with the right context, but most people are pointing them at standard frameworks in common languages to maximize the quality of their output.
This is not my experience any longer. With a properly set up feedback loop and the framework's documentation, it does not seem to matter much whether they are working with completely novel stuff or not. Of course, when that is not available they hallucinate, but who even does that anymore? Anyone can see that LLMs are just glorified auto-complete machines, so you really have to put a lot of work into the environment they operate in and into quick feedback loops. (Just like with 90% of developers made of flesh...)
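To make "feedback loop" concrete, here's roughly the shape of what I mean, sketched in Python. Everything here is illustrative: ask_llm is a stand-in for whatever model API you use, and pytest is just one example of a fast, automatic check.

    import subprocess
    from pathlib import Path

    def ask_llm(prompt: str) -> str:
        # Stand-in for whatever model API you use (hypothetical).
        raise NotImplementedError

    def run_checks(project_dir: str) -> tuple[bool, str]:
        # Any fast, automatic signal works here: compiler, linter, test suite.
        proc = subprocess.run(["pytest", project_dir], capture_output=True, text=True)
        return proc.returncode == 0, proc.stdout + proc.stderr

    def feedback_loop(task: str, framework_docs: str, target: Path,
                      project_dir: str, max_rounds: int = 5) -> bool:
        prompt = f"{framework_docs}\n\nTask: {task}\n\nReturn only the file contents."
        for _ in range(max_rounds):
            target.write_text(ask_llm(prompt))    # model proposes a file
            ok, output = run_checks(project_dir)  # environment pushes back
            if ok:
                return True
            # Feed the failure output straight back in; this is the whole trick.
            prompt = (f"{framework_docs}\n\nTask: {task}\n\n"
                      f"Your previous attempt failed:\n{output}\n\nFix it.")
        return False

With the framework's docs in the prompt and failures coming back automatically, how novel the framework is matters a lot less than how tight that loop is.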
I asked Claude to use some Dlang libraries even I had not heard of and it built a full blown proof of concept project for me, using obscure libraries nobody really knows. It just looked through docs and source code. Maybe back 3 years ago this would have been the case.
I have to admit, I have almost no problems with Teams. The one big issue I had was performance when screen sharing. But I got a new laptop and this problem went away. Seems so odd that so many people have major problems with it, while I feel like within my workgroup there are almost no problems to speak of.
This was discussed before: if your Windows computer doesn't have a valid HEVC license installed, Teams falls back to software encoding and performs horribly. Most manufacturers include the license, but not all. It's also only 99 cents on the Microsoft Store (which might be unavailable on enterprise-managed devices).
How extensively do you use it? When my team was just using it for meetings and the attached chats, it did actually work completely fine. When broader orgs started pushing more communications through it (the "teams" in teams, and all the weird chat room/forums that entails) all of the rough edges became very apparent. All of that is just a shockingly disorganized mess.
And then we will get rid of them again, because some suits will tell us that we don't actually want them, that they are "complicated", that we must just trust them, and that recursive data types are too hard to get right. Let's all write SMS again. Or better yet, send faxes.
Some engineers will facepalm super hard but won't be listened to, as usual, and we will enter the next cosmic age of self-inflicted suffering.
Big difference is being out of office. I expect Trump to get a ton of money after leaving office, because people like proximity to fame, but I don't like the stench when he's in office and has direct political influence.
That said, Trump also investigated Obama for the Netflix deal. Will he investigate Melania now?
Being out of office is irrelevant. "Do this for me now, I'll make sure you're taken care of when you retire." This is so common the revolving door in government is a well worn trope.
As far as I can tell no executive branch agency investigated the Netflix deal.
> This is so common the revolving door in government is a well worn trope
On TV and Reddit. In the real world you’re not getting policy outcomes today for a handshake of a payout tomorrow without someone in office to guarantee your end.
Except if they back out of the deal after getting what they want, they'll never be able to make this kind of deal ever again.
Regardless, the revolving door is well known. It's been talked about since the 1800s. There's a wikipedia page for it. Pretending it doesn't happen doesn't change the fact that it happens and is quite common.
> if they back out of the deal after getting what they want, they'll never be able to make this kind of deal ever again
If someone is stupid enough to go running their mouth about a bribe gone bad, or willing to give on policy in exchange for mere promises, then you either didn't need to bribe them or you're wasting your time and money.
These deals don't happen that way because they can't. It's why e.g. Bob Menendez winds up with gold bars, Melania is being paid now and Trump's crypto is being purchased and sold.
> the revolving door is well known. It's been talked about since the 1800s
Sure. But not in the way you describe. You hire the ex politician not to pay them back for a favour earlier but to curry favour with the folks still in power.
> Pretending it doesn't happen doesn't change the fact that it happens and is quite common
Straw man. Nobody said it doesn't happen. Just that the way you're describing it is wrong.
I've heard that one of the advantages of this administration is that you don't need data or convincing arguments -- just bribery and flattery. If you're OK with bribery and flattery then you'll find this administration much easier to work with. Getting your way is a simpler path.
It’s crazy to think that Instagram Reels, owned by Meta, is preferable to TikTok now. Reels is now at least competitive in terms of content, unlike two years ago when people were worried about TikTok being banned and Reels was not a good alternative.
Has anyone tried creating a language that would be good for LLMs? I feel like what would be good for LLMs might not be the same thing that is good for humans (but I have no evidence or data to support this, just a hunch).
The problem with this is that the reason LLMs are so good at writing Python/Java/JavaScript is that they've been trained on a metric ton of code in those languages; they've seen the good, the bad, and the ugly, and been tuned toward the good. A new language would mean training from scratch, and if we're introducing new paradigms that are 'good for LLMs but bad for humans', humans will struggle to write good code in it, making the training process harder. Even worse, say you get a year and 500 features into that repo and the LLM starts going rogue - who's gonna debug that?
For example, Claude can fluently generate Bevy code as of the training cutoff date, and there's no way there's enough training data on the web to explain this. There's an agent somewhere in a compile test loop generating Bevy examples.
A custom LLM language could have fine-grained fuzzing, mocking, concurrent calling, memoization, and other features that allow LLMs to generate and debug synthetic code more effectively.
If that works, there's a pathway to a novel language having higher quality training data than even Python.
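To gesture at what that could look like (approximated in Python rather than a new language; the memo and fuzz helpers below are invented purely for illustration), the idea is that memoization and property fuzzing are declared inline on the function, so a generate-and-check loop gets failures back immediately:

    import functools
    import random

    def memo(fn):
        # Memoization declared right next to the function it applies to.
        return functools.lru_cache(maxsize=None)(fn)

    def fuzz(make_input, prop, trials=200):
        # A tiny inline fuzzer: random inputs, one property, checked at import time.
        # In a purpose-built language this would be a first-class annotation whose
        # failures feed straight back to the code-generating model.
        def wrap(fn):
            for _ in range(trials):
                x = make_input()
                assert prop(fn, x), f"property failed on input {x!r}"
            return fn
        return wrap

    @fuzz(lambda: random.randint(-10**6, 10**6),
          lambda f, x: f(x) >= 0 and f(x) == f(-x))
    @memo
    def absval(x: int) -> int:
        return x if x >= 0 else -x

The same pattern could extend to mocking and concurrency hints; the point is that the checks live next to the code the model just wrote, not in a separate harness a human has to maintain.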
I recently had Codex convert a script of mine from bash to a custom, Make-inspired language for HPC work (think Nextflow, but an actual language). The bash script submitted a bunch of jobs based on some inputs. I wanted this converted to use my pipeline language instead.
I wrote this custom language. It's on Github, but the example code that would have been available would be very limited.
I gave it two inputs -- the original bash script and an example of my pipeline language (unrelated jobs).
The code it gave me was syntactically correct, and was really close to the final version. I didn't have to edit very much to get the code exactly where I wanted it.
This is to say -- if a novel language is somewhat similar to an existing syntax, the LLM will be surprisingly good at writing it.
> Has anyone tried creating a language that would be good for LLMs?
I’ve thought about this and arrived at a rough sketch.
The first principle is that models like ChatGPT do not execute programs; they transform context. Because of that, a language designed specifically for LLMs would likely not be imperative (do X, then Y), state-mutating, or instruction-step driven. Instead, it would be declarative and context-transforming, with its primary operation being the propagation of semantic constraints. The core abstraction in such a language would be the context, not the variable. In conventional programming languages, variables hold values and functions map inputs to outputs. In a ChatGPT-native language, the context itself would be the primary object, continuously reshaped by constraints. The atomic unit would therefore be a semantic constraint, not a value or instruction.
An important consequence of this is that types would be semantic rather than numeric or structural. Instead of types like number, string, bool, you might have types such as explanation, argument, analogy, counterexample, formal_definition.
These types would constrain what kind of text may follow, rather than how data is stored or laid out in memory. In other words, the language would shape meaning and allowable continuations, not execution paths. An example:
    @iterate:
        refine explanation
        until clarity ≥ expert_threshold
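To make those abstractions a bit more concrete, here is a rough host-language sketch (Python, purely as scaffolding; every name below is invented for illustration). The Context is the primary object, and the only operation is propagating a Constraint into it:

    from dataclasses import dataclass, field
    from enum import Enum, auto

    class SemanticType(Enum):
        # Types constrain what kind of text may follow, not how data is stored.
        EXPLANATION = auto()
        ARGUMENT = auto()
        ANALOGY = auto()
        COUNTEREXAMPLE = auto()
        FORMAL_DEFINITION = auto()

    @dataclass(frozen=True)
    class Constraint:
        kind: SemanticType
        condition: str  # e.g. "clarity >= expert_threshold"

    @dataclass
    class Context:
        text: str
        constraints: list[Constraint] = field(default_factory=list)

        def constrain(self, c: Constraint) -> "Context":
            # The primary operation: reshape the context by adding a constraint.
            return Context(self.text, self.constraints + [c])

    # Roughly the @iterate block above, spelled out:
    ctx = Context("draft explanation of backpropagation")
    ctx = ctx.constrain(Constraint(SemanticType.EXPLANATION,
                                   "refine until clarity >= expert_threshold"))

An interpreter for such a language would then be the model itself: each constraint narrows the space of allowable continuations rather than mutating state.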
There are two separate needs here. One is a language that can be used for computation where the code will be discarded. Only the output of the program matters. And the other is a language that will be eventually read or validated by humans.
Most programming languages are great for LLMs. The problem is with the natural language specification for architectures and tasks. https://brannn.github.io/simplex/
"Hi Ralph, I've already coded a function called GetWeather in JS, it returns weather data in JSON can you build a UI around it. Adjust the UI overtime"
At runtime it modifies the application with improvements: say all of a sudden we're getting air quality data from the JSON tool; the Ralph loop will notice and update the application.
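A rough sketch of that "notice" step (in Python rather than the JS from the example above, and with made-up field names): diff the dotted key paths of the latest JSON payload against the previous one, and anything new is a signal for the loop to go extend the UI.

    import json

    def key_paths(obj, prefix=""):
        # Collect dotted key paths from a JSON-like dict,
        # e.g. {"air_quality": {"pm25": 12}} -> {"air_quality", "air_quality.pm25"}.
        paths = set()
        if isinstance(obj, dict):
            for k, v in obj.items():
                path = f"{prefix}.{k}" if prefix else k
                paths.add(path)
                paths |= key_paths(v, path)
        return paths

    def new_fields(previous_payload: str, current_payload: str) -> set[str]:
        # Anything the tool started returning that it didn't return before.
        return key_paths(json.loads(current_payload)) - key_paths(json.loads(previous_payload))

    before = '{"temp_c": 21, "wind_kph": 9}'
    after = '{"temp_c": 21, "wind_kph": 9, "air_quality": {"pm25": 12}}'
    print(new_fields(before, after))  # {'air_quality', 'air_quality.pm25'}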
The Arxiv paper is cool, but I don't think I can realistically build this solo. It's more of a project for a full team.
In my 30 years in industry -- "we need to do this for the good of the business" has come up maybe a dozen times, tops. Things are generally much more open to debate with different perspectives, including things like feasibility. Every blue moon you'll get "GDPR is here... this MUST be done". But for 99% of the work there's a reasonable argument for a range of work to get prioritized.
When working as a senior engineer, I've never been given enough business context to confidently say, for example, "this stakeholder isn't important enough to justify such a tight deadline". Doesn't that leave the business side of things as a mysterious black box? You can't do much more than report "meeting that deadline would create ruinous amounts of technical debt", and then pray that your leader has kept some alternatives open.