Hacker News | mlyle's comments

The models don't consider these because there's considerable uncertainty as to the size of these effects and potential countervailing forces of similar magnitudes.

The fact is, for all of these other secondary effects etc... we just don't know. It's too complicated of a system.

So as a result, we've got a prediction of something between "somewhat bad" and "catastrophically-is-an-understatement bad" with a maximum likelihood estimate of "really really bad."


> ... we just don't know. It's too complicated of a system.

I wish this comment was higher up.

The big thing under-discussed about climate change is that the deeper we get into it, the harder it is to predict and understand.

I recall Dr. Richard Alley discussing how a Thwaites Glacier collapse wasn't factored into any IPCC reports; but he ultimately pointed out that this was for good reason: it's simply not possible to model these things and their consequences accurately.

I don't do any climate modeling, but I do a lot of other modeling and forecasting: the biggest assumption we make in all statistical models is that the system itself stays more or less statistically similar to what it currently is and what we have seen in the past. As soon as you drop that assumption, you're increasingly in the world of wild guessing. If you wanted me to build you a RAM price prediction model two years ago, I could have done a pretty good job. Ask for one today and you're better off asking someone with industry experience but no modeling experience what they think might happen.
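To make the stationarity point concrete, here's a toy sketch (all numbers made up for demonstration; this is not a real price model): a trend fit on a stable regime extrapolates fine, right up until the regime itself changes.

```python
# Toy illustration: a linear trend fit on a stable regime extrapolates
# fine -- until the system itself changes. Numbers are invented.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

# "Past": prices drift down 1 unit/month, as they had for years.
past = [(t, 100.0 - 1.0 * t) for t in range(24)]
a, b = fit_line([t for t, _ in past], [p for _, p in past])

# Forecast for month 30 under the stationarity assumption:
forecast = a + b * 30          # ~70

# But the regime changed (say, a demand shock doubles prices):
actual = (100.0 - 1.0 * 30) * 2.0  # 140

print(forecast, actual)  # the fit isn't wrong; its assumption is
```

The model is doing exactly what it was asked to do; the failure is in the assumption that the future resembles the past.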

This is the hidden threat of climate change most people are completely unaware of: we know it will be bad when certain things happen, and we know they will happen in the nearish future, but we can't say exactly how or when they'll unfold with any meaningful confidence.


And yet we can still say something simple that is true: as the planet continues to warm, warming will accelerate due to non-human greenhouse gas emissions driven by feedback loops and tipping points in the natural carbon cycle. This is an unassailable statement.

> This is an unassailable statement.

No. I believe what you're saying is very likely to be true, but we know there's both positive and negative feedback, and we don't really know how they will interplay or where all the tipping points are.

There may even be significant phase delay in these mechanisms and so we could even get oscillation.


Over time periods in excess of 10K years this is a reasonable caveat. For more human-oriented timelines, there's no negative feedback mechanism I'm aware of that would do anything close to producing an actual oscillation.

Edit: I'd be happy for you to educate me how I'm wrong btw, since that would mean I've missed something significant, which would make me happy! So please do tell me if you know of such a mechanism.


I really meant to say that there's no way to know, for any region of the CO2 vs. temperature graph, whether positive or negative feedback dominates. You're proposing it all runs away in one lump, and I'm saying that there can be chunks of runaway and then damping. An extreme case would be if things are really underdamped somewhere and we spiral down to one of these points.

There are all kinds of things that have time lags from years to centuries, though, that could cause ringing (ocean heat uptake, rates of carbon uptake as the biosphere adapts and shifts, etc).

Indeed, we have evidence of ringing in the geologic climate record-- like Dansgaard-Oeschger events. We also live with ringing in weather systems like El Nino. Warming intensifying or creating new modes of oscillation would not be that surprising.
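The "phase delay can cause ringing" point can be shown with a toy numerical sketch (emphatically not a climate model; the forcing, gain, and delay values are arbitrary, chosen only to show the qualitative effect): a negative feedback that responds to the state a few time units ago overshoots and oscillates, where the same feedback with no lag settles monotonically.

```python
# Caricature of delayed negative feedback: dT/dt = forcing - gain * T(t - delay).
# With no delay the response settles smoothly; with a lag it rings.

def simulate(forcing, gain, delay, steps, dt=0.1):
    lag = int(delay / dt)
    T = [0.0] * steps
    for n in range(1, steps):
        # Feedback sees the state `delay` time units in the past.
        past = T[n - 1 - lag] if n - 1 - lag >= 0 else 0.0
        T[n] = T[n - 1] + dt * (forcing - gain * past)
    return T

no_lag = simulate(forcing=1.0, gain=0.5, delay=0.0, steps=500)
lagged = simulate(forcing=1.0, gain=0.5, delay=3.0, steps=500)

# Both systems have the same equilibrium, forcing / gain = 2.0.
print(max(no_lag), max(lagged))
# The no-lag run approaches 2.0 monotonically from below; the lagged run
# overshoots well past 2.0 and oscillates around it before damping out.
```

Same forcing, same feedback strength; only the lag differs. That's the sense in which delays in carbon and heat uptake could, in principle, produce oscillatory behavior rather than a smooth approach to a new equilibrium.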


If those customers end up profitable, it could be tempting for nVidia to vertically integrate.

I don't think it's as easy as others say, though.


There's nothing to say that you can't build something intelligent out of them by bolting a memory on it, though.

Sure, it's not how we work, but I can imagine a system where the LLM does a lot of heavy lifting and allows more expensive, smaller networks that train during inference and RAG systems to learn how to do new things and keep persistent state and plan.
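A hypothetical sketch of that architecture (the "LLM" here is a stub, and the keyword lookup is a crude stand-in for a real RAG system; everything is invented for illustration): the model's weights stay frozen, while a persistent memory updated at inference time carries state across queries.

```python
# Sketch: frozen model + bolted-on persistent memory. `frozen_llm` is a
# placeholder for a real model call; `recall` stands in for retrieval.

class MemoryAugmentedAgent:
    def __init__(self):
        self.memory = []  # persistent state that survives across queries

    def frozen_llm(self, prompt):
        # Stand-in for a real LLM; its "weights" never change.
        return f"answer({prompt})"

    def recall(self, query):
        # Trivial keyword retrieval standing in for a RAG system.
        return [m for m in self.memory if any(w in m for w in query.split())]

    def ask(self, query):
        context = self.recall(query)
        answer = self.frozen_llm(f"{context} | {query}")
        # The "learning" happens here, outside the static weights:
        self.memory.append(f"Q: {query} -> A: {answer}")
        return answer

agent = MemoryAugmentedAgent()
agent.ask("capacitor sizing")
print(agent.recall("capacitor"))  # the prior exchange is now retrievable state
```

The interesting part is where the adaptation lives: not in the network, but in the store around it.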


Memory is not just bolted on top of the latest models. They undergo training on how and when to effectively use memory and how to use compaction to avoid running out of context when working on problems.

Maybe there's an analogy to our long and short term memory - immediate stimuli are processed in the context of deep patterns that have accreted over a lifetime. New information can absolutely challenge a lot of those patterns, but to have that information reshape how we basically think takes a lot longer - more processing, more practice, etc.

In the case of the LLM, the static weights produced by a finite training process are a proxy for that longer-term learning / fundamental structure, and the ability to use tools and store new insights and facts is analogous to shorter-term memory and "shallow" learning.

Perhaps periodic fine-tuning has an analogy in sleep or even our time spent in contemplation or practice (..or even repetition) to truly "master" a new idea and incorporate it into our broader cognitive processing. We do an amazing job of doing this kind of thing on a continuous basis while the machines (at least at this point) perform this process in discrete steps.

If our own learning process is a curve then the LLM's is a step function trying to model it. Digital vs analog.


Do you have some reading material to share on this matter?

Thanks in advance.


I don't, but look into what the creators of Codex, Gemini CLI, Claude Code, Kimi CLI, etc. have said about the models. While these harnesses are advertised as coding-specific, we know that coding ability correlates with reasoning ability.

You aren't wrong, and that is a fascinating area of research. I think the key thing is that the memory has to fundamentally influence the underlying model, or at least the response, in some way. Patching memory on top of an LLM is different from integrating it into the core model. To go back to human terms, it is like an extra bit of storage, but not directly attached to our neocortex. So in the analogy it works more like a filter than a core part of our intelligence. You think about something and assemble some thought, then it goes to this next filter layer and gets augmented, and that smaller layer is the only thing being updated.

It is still meaningful, but it narrows what the intelligence can be, sufficiently so that it may not meet the threshold. Maybe it would, but it is probably too narrow. This is all strictly if we ask that it meet some human-like intelligence, rather than the philosophy of "what counts as intelligence" -- but we are humans. The strongest, or at least most honest, definitions of intelligence I think exist center on our metacognitive ability to rewire the grey matter for survival, based not on immediate action-reaction but on the psychological time of analyzing the past to alter the future.


> I have built a quick demo

This is obvs 5 minutes of LLM-generated code.

> a regulator verifies the cryptographic proof without trusting the operator.

No, the regulator verifies that the operator signed the proof, which isn't much different from the operator simply asserting it.
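The point can be made concrete in a few lines. This sketch uses HMAC as a stand-in for a real signature scheme (the key and claim are invented): verification tells the "regulator" only that the operator's key vouched for the statement, not that the statement is true.

```python
# HMAC here stands in for a real digital signature; the structural point
# is identical: verification proves *who* signed, not *what happened*.
import hashlib
import hmac

operator_key = b"operator-secret"  # hypothetical key, for illustration

def operator_sign(claim: bytes) -> bytes:
    return hmac.new(operator_key, claim, hashlib.sha256).digest()

def regulator_verify(claim: bytes, tag: bytes) -> bool:
    # The regulator can only check that the operator's key produced the tag.
    return hmac.compare_digest(operator_sign(claim), tag)

false_claim = b"the agent never violated policy"  # may be entirely untrue
tag = operator_sign(false_claim)

print(regulator_verify(false_claim, tag))  # True -- verification passes
# ...even though nothing here attests to what actually happened. The
# "proof" reduces to "the operator says so," just signed.
```

Cryptography pins the statement to a keyholder; it does nothing to connect the statement to reality.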


It looks almost entirely envisioned and implemented by AI.

An agent signing a covenant doesn't do anything. You're not going to enforce a contract against it, and there's not some kind of non-repudiation problem to solve.

Enforcing behavioral covenants or boundaries is inherent to how you make things safe. But how do you really do it for anything that matters? How do you make sure that an agent isn't discriminating based on race or other factors?

The whole reason you're using an LLM is because you're doing something either:

A) at very low scale, in which case it's hard to capture sufficient covenants cost-efficiently

or B) with very great complexity, where the behavior you want is hard to encapsulate in code-- in which case meaningful enforcement of the complex covenants that may result is hard.

Indeed, if you could just write code to do it, you'd just write code to do it.

I'm glad you're interested in these issues and playing with them. I'll leave you with one last thought: 134 KSLOC is a bug, not a feature. Some software systems need to be huge, but for software systems that need to be trusted-- small, auditable, and understandable to humans (and agents) is the key thing you're looking for. Could you build some kind of small trustable core that solves a simple problem in an understandable way?


You're right with the 134K point. The actual cryptographic kernel (covenant building, verification, hash-chaining) is just about 3-4K lines. The rest is just adapters, plugins, and test harnesses. I should lead with that number. On enforcement: the covenant itself isn't the enforcement. Middleware intercepts tool calls before execution and blocks violations. But you're right that this only works for constraints you can express as rules. "No external calls" and "rate limit 100/hour" are enforceable. "Don't discriminate" is not -- that's a fundamentally harder problem and I'm not pretending this solves it. The small trustable core advice is truly good and probably what I should focus on next. Thank you.
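That rule-based interception can be sketched in a few lines (a hypothetical sketch, not the actual library; the tool names, rule set, and limits are made up). It also shows the limitation directly: only mechanically checkable rules fit in the `invoke` gate.

```python
# Hypothetical middleware: tool calls are intercepted before execution
# and mechanically checkable covenant rules are enforced.
import time

class PolicyViolation(Exception):
    pass

class Middleware:
    def __init__(self, blocked_tools, max_calls_per_hour):
        self.blocked = set(blocked_tools)
        self.limit = max_calls_per_hour
        self.calls = []  # timestamps of allowed calls

    def invoke(self, tool, fn, *args):
        now = time.time()
        if tool in self.blocked:
            raise PolicyViolation(f"tool '{tool}' is forbidden by covenant")
        recent = [t for t in self.calls if now - t < 3600]
        if len(recent) >= self.limit:
            raise PolicyViolation("rate limit exceeded")
        self.calls = recent + [now]
        return fn(*args)  # only now does the tool call actually run

mw = Middleware(blocked_tools={"external_http"}, max_calls_per_hour=2)
mw.invoke("read_file", lambda p: f"contents of {p}", "notes.txt")  # allowed
try:
    mw.invoke("external_http", lambda u: None, "https://example.com")
except PolicyViolation as e:
    print(e)  # blocked before execution
```

Note that a rule like "don't discriminate" has no place to live in this gate: there is no predicate over a single tool call that expresses it.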

Why does whether the agent "commits" to a rule cryptographically matter?

Surely it's just the enforcement, and maybe the measuring of sentinel events -- how far does it wander off course.

How is cryptography an important part of this, given that we're talking about a layer that sits on top of an LLM without an adversary in-between?

I know you mention non-repudiation, but ... there's no kind of real non-repudiation here in this environment.


Very fair question. If you control the whole stack with your agent, your middleware and your logs, then cryptography doesn't add much. You already trust yourself.

But, it matters when there are multiple parties. An enterprise deploys an agent that can handle customer data. The customer wants proof the agent has followed the rules. The regulator wants proof that the logs were not just edited after an incident. Without cryptographic signatures and hash chains, the enterprise can just say "trust us." With them, the proof is independently verifiable.

It's just the difference between "we followed the rules" and "here's a mathematically verifiable proof we followed the rules." For internal use, it's overkill. For anything with external accountability, that's exactly the point.
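The hash-chain part of that is simple to sketch (a minimal toy, not the project's actual code): each log entry commits to the hash of the entry before it, so editing a past entry breaks every later link. Note what it does and doesn't give you: tamper-evidence for the recorded log, and nothing about whether the code that wrote the entries behaved as claimed.

```python
# Toy hash-chained log: each entry's hash covers the previous entry's hash,
# so an after-the-fact edit is detectable by anyone replaying the chain.
import hashlib

def entry_hash(prev_hash: str, payload: str) -> str:
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

def append(log, payload):
    prev = log[-1][1] if log else "genesis"
    log.append((payload, entry_hash(prev, payload)))

def verify(log):
    prev = "genesis"
    for payload, h in log:
        if entry_hash(prev, payload) != h:
            return False
        prev = h
    return True

log = []
append(log, "agent called tool A")
append(log, "agent called tool B")
print(verify(log))  # True

log[0] = ("agent did nothing wrong", log[0][1])  # after-the-fact edit
print(verify(log))  # False -- the chain exposes the edit
```

Whether the entries were honest when first written is a separate question the chain cannot answer.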


There's no mathematically verifiable proof that anyone followed the rules. There's a cryptographic chain, but it just means "this piece of the stack, at some point, was convinced to process this and recorded that it did this." -- not whether that actually happened, what code was running, etc.

It doesn't tell you anything about what code was running there or whether it was really enforced.

Look, it's cool that this is an area that interests you. But I want you to know that AI agents are sycophantic and will claim your ideas are good and will not necessarily steer you in good directions. I have patents in the area of non-repudiation dating back 25 years and am doing my best to give you good feedback.

Non-repudiation, policy enforcement, audit-readiness, ledgers: these are all good things. As far as I can tell, there's nothing too special about doing this with LLMs, too. The same kinds of code that a bank uses to ensure that its ledger isn't tampered with and that the right software is running in the right places would work for this job -- and it wasn't vibe coded and mostly specified by AI.


You’re correct. The cryptographic chain proves “this middleware has processed this action and has recorded it,” not that the enforcement logic itself was correct or that the code running was what you think it was. Those are both different guarantees and I have been conflating them.

On “nothing too special about doing this with LLMs,” also fair. The primitives (policy enforcement, audit trails, non-repudiation) aren’t new. The bet is that AI agents will need these at a scale and standardization level that does not exist yet, and having it as a composable library matters when every framework (LangChain, CrewAI, Vercel AI SDK) is building agents differently. But the underlying cryptography isn’t novel.


Proving policy controls are in place and that actions were taken is a fairly universal problem.

Cryptography doesn't really do as much to improve it as one would think. Yes, providing evidence of sequence or that stuff happened before a certain time is a helpful tool to have in the toolbox.

The earliest human writings date to about 3000-3500 BCE, and are almost entirely ledgers on clay tablets.

I want to point out a little asymmetry. It's a little rude to generate a bunch of stuff, including writing, using LLMs, and then expect actual humans to interact with it. If it wasn't your time to do and understand and say, why should it be worth others' time to read and respond to it?


? We're talking about autonomous weapons systems. That would be international.

Secondarily, we're talking about domestic surveillance / law enforcement. That would be domestic.

(But they do not find an issue with international intelligence gathering-- which is a legitimate purpose of national security apparatus).


I don’t think deploying “80% right” tools for mass surveillance (or anything that can remotely impact human life) counts as lawful in any context.

Just because the US currently lacks a functioning legislative branch doesn’t magically make it OK when gaps in the law are reworded into “national security.”


I'm really not sure what you're trying to say or assert -- could you put it more clearly?

The tools are not good enough to be ethically deployed, least of all for surveillance.

Just because Congress is failing to do its job doesn’t mean the executive branch should simply do what it wants under the guise of “national security.”


I think there's a notable distinction between "domestic mass-surveillance" and use in international intelligence gathering.

The poster said:

> Both their stances are flawed because their ethics apparently end at the border

It seems like Anthropic is ethically concerned about the use of autonomous weapons anywhere, and about surveillance by a country against its own citizens. Countries spy on each other a lot, but the ethical implications and risks of international spying are substantially different from those of a country acting against its own citizenry.

Therefore, I think Anthropic's stance is A) ethically consistent, and B) not artificially constrained to the US (doesn't "end at the border"). There's room for disagreement and criticism, but I think this particular hyperbole is invalid.


I’m just attempting to clarify what they said though I feel it was pretty clear to begin with.

One of Anthropic's lines in the sand was domestic mass-surveillance.

> > Secondarily, we're talking about domestic surveillance / law enforcement. That would be domestic.

> One of Anthropic's lines in the sand was domestic mass-surveillance.

And?


Some people feel that mass surveillance is wrong whether it is domestic or not. For those people, being OK with mass surveillance as long as it is not done to your kind is a morally wrong stance.

>and?

A little more effort/less obvious bait would go a long way to fostering a more productive discussion.


I think the person you are replying to takes issue with the thing which you have simply asserted.

Which thing? Helping intelligence / international surveillance?

>That would be international.

No other country should dictate what our military is or is not allowed to do. As they say, all is fair in love and war, and if we want to break some international treaty, that is our choice to do so. Both are based on domestic decisions of what should be allowed.


We are talking about US corporations deciding to/not to provide tech to the US government. That's completely orthogonal to your concern.

The market price of the ransom for a King is a King's ransom.

One reason we end up with excess capacity is process improvements; adding new fabs to get more density or performance doesn't make old fabs go away, and so we go through cycles of excess capacity. Demand has been relatively constant.

Here we're facing different forces-- unprecedented demand for DRAM that may be durable. But it also looks like the pace of supply changes may decrease as process improvements get smaller and the industry stops moving so much in lockstep.

It still matters what happens to the demand function, though. If enough AI startups blow up that there's a lot of secondhand DRAM on the market, and demand for new DRAM is impacted too, that will push things down.

Sort of like what happened with the glut of telecom equipment after the dot-com bust.


The problem with any claim of consciousness is that it is unfalsifiable.

But it seems to be pretty hard to come up with a coherent claim of meat-consciousness that really excludes the possibility of machine-consciousness without some kind of really motivated reasoning.


This assumes all the energy is leaked when you open the door, and that the power is constant rather than ramping down. I'm guessing a *lot less* leaks than this.

(And, of course, you don't absorb all of what leaks).

