I think the commenter is saying that they will combine a world model with the word model. The resulting combination may be sufficient for very solid results.
Note that humans generate their own incomplete world model. For example, there are sounds and colors we don't hear or see, odors we don't smell, etc. We have an incomplete model of the world, but we still have a model that proves useful for us.
> they will combine a world model with the word model.
This takes "world model" far too literally. Audio-visual generative AI models that create non-textual "spaces" are not world models in the sense the previous poster meant. I think what they meant by world model is that the vast majority of the knowledge we rely upon to make decisions is tacit, not something that has been digitized, and not something we even know how to meaningfully digitize and model. And even describing it as tacit knowledge falls short; a substantial part of our world model is rooted in our modes of action, motivations, etc., and not coupled together in simple recursive input -> output chains. There are dimensions to our reality that, before generative AI, didn't see much systematic introspection. After all, we're still mired in endless nature vs. nurture debates; we have a very poor understanding of ourselves. In particular, we have an extremely poor understanding of how we and our constructed social worlds evolve dynamically, and it's that aspect of our behavior that drives the frontier of exploration and discovery.
OTOH, the "world model" contention feels tautological, so I'm not sure how convincing it can be for people on the other side of the debate.
Really all you're saying is the human world model is very complex, which is expected as humans are the most intelligent animal.
At no point have I seen anyone here ask the question "What is the minimum viable state of a world model?"
We as humans with our ego seem to state that because we are complex, any introspective intelligence must be as complex as us to be as intelligent as us. Which doesn't seem too dissimilar to saying a plane must flap its wings to fly.
Has any generative AI been demonstrated to exhibit the generalized intelligence (e.g. achieving in a non-simulated environment complex tasks or simple tasks in novel environments) of a vertebrate, or even a higher-order non-vertebrate? Serious question--I don't know either way. I've had trouble finding a clear answer; what little I have found is highly qualified and caveated once you get past the abstract, much like attempts in prior AI eras.
> Planning: We demonstrate that V-JEPA 2-AC, obtained by post-training V-JEPA 2 with only 62 hours of unlabeled robot manipulation data from the popular Droid dataset, can be deployed in new environments to solve prehensile manipulation tasks using planning with given subgoals. Without training on any additional data from robots in our labs, and without any task-specific training or reward, the model successfully handles prehensile manipulation tasks, such as Grasp and Pick-and-Place with novel objects and in new environments.
There is no real bar any more for generalized intelligence. The bars that existed prior to LLMs have largely been met. Now we’re in a state where we are trying to find new bars, but there are none that are convincing.
The ARC-AGI 2 private test set is one current bar that a large number of people find important, and it will again be convincing to a large number of people if LLMs start doing really well on it. The performance drop on the private set is still huge, though, and results remain far inferior to human performance.
This. For areas where you can use tried-and-tested libraries (or tools in general), LLMs will generate better code when they use them.
In fact, LLMs will be better than humans in learning new frameworks. It could end up being the opposite: frameworks and libraries could become more important with LLMs.
> In fact, LLMs will be better than humans in learning new frameworks.
LLMs don't learn? The neural networks are trained just once before release, and it's an extremely expensive process.
Have you tried using one on your existing code base, which is basically a framework for whatever business problem you're solving? Did it figure it out automagically?
They know react.js and nest.js and next.js and whatever.js because they had humans correct them and billions of lines of public code to train on.
Wouldn't there be a chicken and egg problem once humans stop writing new code directly? Who would write the code using this new framework? Are the examples written by the creators of the framework enough to train an AI?
There's tooling out there that's 100% vibe coded and used by tens of thousands of devs daily. If that codebase found its way into training data, would it somehow ruin everything? I don't think this is really a problem. The problem will become identifying good codebases from bad ones; if you point out which code is bad during training, it makes a difference. There's a LOT of writing out there about how to write better code that I'm sure is already part of the training data.
How much proprietary business logic is on public github repos?
I'm not talking about "do me this solo founder saas little thing". I'm talking about working on existing codebases running specialized stuff for a functional company or companies.
> LLMs will be better than humans in learning new frameworks.
I don't see a basis for that assumption. They're good at things like Django because there is a metric fuckton of existing open-source code out there that they can be trained on. They're already not great at less popular or even fringe frameworks and programming languages. What makes you think they'll be good at a new thing that there are almost no open resources for yet?
Yeah, I don't know why you'd drop frameworks and libraries just because you're using an LLM. If you AREN'T using them, you're just loading a bunch of solved problems into the LLM's context so it can re-invent the wheel. I really love the LLM because now I don't need to learn the new frameworks myself. LLMs really remove all the bullshit I don't want to think about.
LLMs famously aren’t that good at using new frameworks/languages. Sure they can get by with the right context, but most people are pointing them at standard frameworks in common languages to maximize the quality of their output.
This is not my experience any longer. With a properly set up feedback loop and the framework's documentation, it does not seem to matter much whether they are working with completely novel stuff or not. Of course, when that is not available they hallucinate, but who even does that anymore? Anyone can see that LLMs are just glorified auto-complete machines, so you really have to put a lot of work into the environment they operate in and into quick feedback loops. (Just like with 90% of developers made of flesh...)
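To make "feedback loop" concrete, here's roughly the shape of what I mean, sketched in Python. Everything here is illustrative: ask_llm is a stand-in for whatever model API you use, and pytest is just one example of a fast, automatic check.

    import subprocess
    from pathlib import Path

    def ask_llm(prompt: str) -> str:
        # Stand-in for whatever model API you use (hypothetical).
        raise NotImplementedError

    def run_checks(project_dir: str) -> tuple[bool, str]:
        # Any fast, automatic signal works here: compiler, linter, test suite.
        proc = subprocess.run(["pytest", project_dir], capture_output=True, text=True)
        return proc.returncode == 0, proc.stdout + proc.stderr

    def feedback_loop(task: str, framework_docs: str, target: Path,
                      project_dir: str, max_rounds: int = 5) -> bool:
        prompt = f"{framework_docs}\n\nTask: {task}\n\nReturn only the file contents."
        for _ in range(max_rounds):
            target.write_text(ask_llm(prompt))    # model proposes a file
            ok, output = run_checks(project_dir)  # environment pushes back
            if ok:
                return True
            # Feed the failure output straight back in; this is the whole trick.
            prompt = (f"{framework_docs}\n\nTask: {task}\n\n"
                      f"Your previous attempt failed:\n{output}\n\nFix it.")
        return False

With the framework's docs in the prompt and failures coming back automatically, how novel the framework is matters a lot less than how tight that loop is.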
I asked Claude to use some Dlang libraries even I had not heard of and it built a full blown proof of concept project for me, using obscure libraries nobody really knows. It just looked through docs and source code. Maybe back 3 years ago this would have been the case.
I have to admit, I have almost no problems with Teams. The one big issue I had was performance when screen sharing. But I got a new laptop and this problem went away. Seems so odd that so many people have major problems with it, while I feel like within my workgroup there are almost no problems to speak of.
This was discussed before: if your Windows computer doesn't have a valid HEVC license installed, Teams falls back to software encoding and performs horribly. Most manufacturers include the license, but not all. It's also only 99 cents on the Microsoft Store (which might be unavailable on enterprise-managed devices).
How extensively do you use it? When my team was just using it for meetings and the attached chats, it did actually work completely fine. When broader orgs started pushing more communications through it (the "teams" in teams, and all the weird chat room/forums that entails) all of the rough edges became very apparent. All of that is just a shockingly disorganized mess.
And then we will get rid of them again, because some suits will tell us that we don't actually want them, that they are "complicated", that we must just trust them, and that recursive data types are too hard to get right. Let's all write SMS again. Or better yet, send faxes.
Some engineers will facepalm super hard but won't be listened to, as usual, and we will enter the next cosmic age of self-inflicted suffering.
Big difference is being out of office. I expect Trump to get a ton of money after leaving office, because people like proximity to fame, but I don't like the stench when he's in office and has direct political influence.
That said, Trump also investigated Obama for the Netflix deal. Will he investigate Melania now?
Being out of office is irrelevant. "Do this for me now, I'll make sure you're taken care of when you retire." This is so common the revolving door in government is a well worn trope.
As far as I can tell no executive branch agency investigated the Netflix deal.
> This is so common the revolving door in government is a well worn trope
On TV and Reddit. In the real world you’re not getting policy outcomes today for a handshake of a payout tomorrow without someone in office to guarantee your end.
Except if they back out of the deal after getting what they want, they'll never be able to make this kind of deal ever again.
Regardless, the revolving door is well known. It's been talked about since the 1800s. There's a wikipedia page for it. Pretending it doesn't happen doesn't change the fact that it happens and is quite common.
> if they back out of the deal after getting what they want, they'll never be able to make this kind of deal ever again
If someone is stupid enough to go running their mouth about a bribe gone bad, or willing to give on policy in exchange for mere promises, then you either didn't need to bribe them or you're wasting your time and money.
These deals don't happen that way because they can't. It's why e.g. Bob Menendez winds up with gold bars, Melania is being paid now and Trump's crypto is being purchased and sold.
> the revolving door is well known. It's been talked about since the 1800s
Sure. But not in the way you describe. You hire the ex politician not to pay them back for a favour earlier but to curry favour with the folks still in power.
> Pretending it doesn't happen doesn't change the fact that it happens and is quite common
Straw man. Nobody said it doesn't happen. Just that the way you're describing it is wrong.
I've heard that one of the advantages of this administration is that you don't need data or convincing arguments -- just bribery and flattery. If you're OK with bribery and flattery then you'll find this administration much easier to work with. Getting your way is a simpler path.
It’s crazy to think that Instagram Reels, owned by Meta, is preferable to TikTok now. Reels is now at least competitive in terms of content, unlike two years ago when people were worried about TikTok being banned and Reels was not a good alternative.
Has anyone tried creating a language that would be good for LLMs? I feel like what would be good for LLMs might not be the same thing that is good for humans (but I have no evidence or data to support this, just a hunch).
The problem with this is that the reason LLMs are so good at writing Python/Java/JavaScript is that they've been trained on a metric ton of code in those languages; they've seen the good, the bad, and the ugly, and been tuned toward the good. A new language would mean training from scratch, and if we're introducing new paradigms that are 'good for LLMs but bad for humans', humans will struggle to write good code in it, making the training process harder. Even worse, say you get a year and 500 features into that repo and the LLM starts going rogue - who's gonna debug that?
For example, Claude can fluently generate Bevy code as of the training cutoff date, and there's no way there's enough training data on the web to explain this. There's an agent somewhere in a compile test loop generating Bevy examples.
A custom LLM language could have fine-grained fuzzing, mocking, concurrent calling, memoization, and other features that allow LLMs to generate and debug synthetic code more effectively.
If that works, there's a pathway to a novel language having higher quality training data than even Python.
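To gesture at what that could look like (approximated in Python rather than a new language; the memo and fuzz helpers below are invented purely for illustration), the idea is that memoization and property fuzzing are declared inline on the function, so a generate-and-check loop gets failures back immediately:

    import functools
    import random

    def memo(fn):
        # Memoization declared right next to the function it applies to.
        return functools.lru_cache(maxsize=None)(fn)

    def fuzz(make_input, prop, trials=200):
        # A tiny inline fuzzer: random inputs, one property, checked at import time.
        # In a purpose-built language this would be a first-class annotation whose
        # failures feed straight back to the code-generating model.
        def wrap(fn):
            for _ in range(trials):
                x = make_input()
                assert prop(fn, x), f"property failed on input {x!r}"
            return fn
        return wrap

    @fuzz(lambda: random.randint(-10**6, 10**6),
          lambda f, x: f(x) >= 0 and f(x) == f(-x))
    @memo
    def absval(x: int) -> int:
        return x if x >= 0 else -x

The same pattern could extend to mocking and concurrency hints; the point is that the checks live next to the code the model just wrote, not in a separate harness a human has to maintain.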
I recently had Codex convert a script of mine from bash to a custom, Make-inspired language for HPC work (think Nextflow, but an actual language). The bash script submitted a bunch of jobs based on some inputs. I wanted this converted to use my pipeline language instead.
I wrote this custom language. It's on Github, but the example code that would have been available would be very limited.
I gave it two inputs -- the original bash script and an example of my pipeline language (unrelated jobs).
The code it gave me was syntactically correct, and was really close to the final version. I didn't have to edit very much to get the code exactly where I wanted it.
This is to say -- if a novel language is somewhat similar to an existing syntax, the LLM will be surprisingly good at writing it.
> Has anyone tried creating a language that would be good for LLMs?
I’ve thought about this and arrived at a rough sketch.
The first principle is that models like ChatGPT do not execute programs; they transform context. Because of that, a language designed specifically for LLMs would likely not be imperative (do X, then Y), state-mutating, or instruction-step driven. Instead, it would be declarative and context-transforming, with its primary operation being the propagation of semantic constraints. The core abstraction in such a language would be the context, not the variable. In conventional programming languages, variables hold values and functions map inputs to outputs. In a ChatGPT-native language, the context itself would be the primary object, continuously reshaped by constraints. The atomic unit would therefore be a semantic constraint, not a value or instruction.
An important consequence of this is that types would be semantic rather than numeric or structural. Instead of types like number, string, bool, you might have types such as explanation, argument, analogy, counterexample, formal_definition.
These types would constrain what kind of text may follow, rather than how data is stored or laid out in memory. In other words, the language would shape meaning and allowable continuations, not execution paths. An example:
    @iterate:
        refine explanation
        until clarity ≥ expert_threshold
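To make those abstractions a bit more concrete, here is a rough host-language sketch (Python, purely as scaffolding; every name below is invented for illustration). The Context is the primary object, and the only operation is propagating a Constraint into it:

    from dataclasses import dataclass, field
    from enum import Enum, auto

    class SemanticType(Enum):
        # Types constrain what kind of text may follow, not how data is stored.
        EXPLANATION = auto()
        ARGUMENT = auto()
        ANALOGY = auto()
        COUNTEREXAMPLE = auto()
        FORMAL_DEFINITION = auto()

    @dataclass(frozen=True)
    class Constraint:
        kind: SemanticType
        condition: str  # e.g. "clarity >= expert_threshold"

    @dataclass
    class Context:
        text: str
        constraints: list[Constraint] = field(default_factory=list)

        def constrain(self, c: Constraint) -> "Context":
            # The primary operation: reshape the context by adding a constraint.
            return Context(self.text, self.constraints + [c])

    # Roughly the @iterate block above, spelled out:
    ctx = Context("draft explanation of backpropagation")
    ctx = ctx.constrain(Constraint(SemanticType.EXPLANATION,
                                   "refine until clarity >= expert_threshold"))

An interpreter for such a language would then be the model itself: each constraint narrows the space of allowable continuations rather than mutating state.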
There are two separate needs here. One is a language that can be used for computation where the code will be discarded. Only the output of the program matters. And the other is a language that will be eventually read or validated by humans.
Most programming languages are great for LLMs. The problem is with the natural language specification for architectures and tasks. https://brannn.github.io/simplex/
"Hi Ralph, I've already coded a function called GetWeather in JS, it returns weather data in JSON can you build a UI around it. Adjust the UI overtime"
At runtime it modifies the application with improvements: say all of a sudden we're getting air quality data from the JSON tool; the Ralph loop will notice and update the application.
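A rough sketch of that "notice" step (in Python rather than the JS from the example above, and with made-up field names): diff the dotted key paths of the latest JSON payload against the previous one, and anything new is a signal for the loop to go extend the UI.

    import json

    def key_paths(obj, prefix=""):
        # Collect dotted key paths from a JSON-like dict,
        # e.g. {"air_quality": {"pm25": 12}} -> {"air_quality", "air_quality.pm25"}.
        paths = set()
        if isinstance(obj, dict):
            for k, v in obj.items():
                path = f"{prefix}.{k}" if prefix else k
                paths.add(path)
                paths |= key_paths(v, path)
        return paths

    def new_fields(previous_payload: str, current_payload: str) -> set[str]:
        # Anything the tool started returning that it didn't return before.
        return key_paths(json.loads(current_payload)) - key_paths(json.loads(previous_payload))

    before = '{"temp_c": 21, "wind_kph": 9}'
    after = '{"temp_c": 21, "wind_kph": 9, "air_quality": {"pm25": 12}}'
    print(new_fields(before, after))  # {'air_quality', 'air_quality.pm25'}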
The Arxiv paper is cool, but I don't think I can realistically build this solo. It's more of a project for a full team.
In my 30 years in industry -- "we need to do this for the good of the business" has come up maybe a dozen times, tops. Things are generally much more open to debate with different perspectives, including things like feasibility. Every blue moon you'll get "GDPR is here... this MUST be done". But for 99% of the work there's a reasonable argument for a range of work to get prioritized.
When working as a senior engineer, I've never been given enough business context to confidently say, for example, "this stakeholder isn't important enough to justify such a tight deadline". Doesn't that leave the business side of things as a mysterious black box? You can't do much more than report "meeting that deadline would create ruinous amounts of technical debt", and then pray that your leader has kept some alternatives open.