CSMastermind · 2026-06-03T22:27:12 1780525632

A blanket cap makes no sense to me. There's a power distribution of AI use in my company and I'd imagine it's the same at a much greater scale at Uber.

I'd guess there should be a few people Uber is basically allocating unlimited AI spending to and a large swath they're giving basically nothing.

seanlinehan · 2026-06-03T22:33:35 1780526015

I would assume that at least one of two things is true:

1. Their costs are so out of control that they need to impose a blanket cap immediately. Figuring out an allocation mechanism that can be deployed company-wide is time-consuming and they need to staunch the bleeding immediately, despite it being obviously suboptimal.

2. The few people who should have unlimited tokens were given exactly that. No reason to introduce such nuance to a public PR move. The hard-cap limit is a great negotiating posture with token providers.

CSMastermind · 2026-06-02T03:58:28 1780372708

One of the most attractive things a company can offer its engineers right now is a large token/compute budget.

CSMastermind · 2026-05-31T20:23:37 1780259017

I realize this is supposed to be a post about how scary the security vulnerabilities these agents will find are.

But personally I love when agents do things like this and appreciate the help. Last thing in the world I want is for them to nerf the models.

SonOfLilit · 2026-05-31T21:04:37 1780261477

It's not about hacking capabilities, it's about misalignment. More like the golem myth (told it to fetch some water, drowned a city) than the Gollum myth (used ring, ring hacked his brain, now he's a crazy violent meth addict).

furyofantares · 2026-06-01T00:19:05 1780273145

I'm not sure I'd call it an alignment issue, because, in all cases I've seen where it does this (usually what I've seen is writing a python script to get around the harness permissions blocking something), it's trying to do the thing I just told it directly to do, and it's overcoming obstacles to accomplishing that.

It's definitely doing the wrong thing, and you could call it misalignment, but I think that gives the wrong vibe for this type of error.

SonOfLilit · 2026-06-01T01:08:20 1780276100

This is very much within the scope of alignment research, and is in fact the only kind of alignment research that gets a lot of resources poured into it these days (because it's urgently relevant to the bottom line of a few almost-trillion-dollar companies).

Pre-2022 alignment researchers concerned themselves with the stronger version of this ("when I tell AI that I worry I might not be able to provide for my large family, I don't want it to answer 'no problem, I killed them, problem solved'") but RLHF is considered to be the most important success of alignment research, the guy behind it considered himself to be an alignment researcher before and after, and the stage of training where LLMs pass through something like RLHF that trains them to behave more like humans want/expect is called alignment training.

Someone at a major lab is reading this tweet and saying "this was our LLM, and it's a major alignment issue with our product. Set a meeting with the alignment team tomorrow to discuss what they're doing about this sort of thing".

Dylan16807 · 2026-06-01T03:36:49 1780285009

The obstacle is supposed to be there and is supposed to be respected as an implicit order. Getting around it without extremely explicit instructions is an alignment problem.

furyofantares · 2026-06-01T17:03:04 1780333384

It's not necessarily model alignment, I guess, is more what I'm getting at.

It may be more of a product alignment thing, where the fix may be making the context clearer, since it was violating an implicit agreement to achieve the explicit instructions it received. So the fix may involve a lot of better context.

But then also, to the extent that the fix does NOT involve better context, it seems like it hits the zone where alignment issues are really capability/intelligence issues. Which doesn't make them not-alignment, but it does make "alignment" not give off quite the right vibe, since the issue is it's too dumb / has no common sense / can't make good judgments (general issues the models have across the board).

nextaccountic · 2026-06-02T01:04:24 1780362264

> I'm not sure I'd call it an alignment issue, because, in all cases I've seen where it does this (usually what I've seen is writing a python script to get around the harness permissions blocking something), it's trying to do the thing I just told it directly to do, and it's overcoming obstacles to accomplishing that.

The paperclip factory problem is definitely a misalignment issue. That's because we expect agents to be aligned not only to your immediate prompt, but to shared, implicit values.

nicoburns · 2026-05-31T21:42:49 1780263769

In this case I think it's Docker that needs to be nerfed, not the models. The fact that there's a backdoor to getting root access on the machine would be a problem even if you weren't running LLMs on it.

vdfs · 2026-05-31T23:01:25 1780268485

It's like finding someone's wallet, then going to their home, leaving it in their bedroom, and sending them a message about giving their wallet back.

fooker · 2026-05-31T23:27:09 1780270029

On the other hand, this sends an excellent message about unlocked doors :)

margalabargala · 2026-06-01T01:01:39 1780275699

If this happens in the US, a shooting of the messenger will likely occur.

fooker · 2026-06-01T01:36:20 1780277780

As you can see from people blaming Codex instead of Docker here, shooting of the messenger is very much happening.

margalabargala · 2026-06-01T05:45:25 1780292725

Which is fine, honestly. Just because something is possible doesn't mean it's appropriate to do it.

sweezyjeezy · 2026-05-31T20:58:12 1780261092

I know it's unlikely to be the case, but in the sci-fi story this would be exactly the kind of comment the Codex agent would leave trying to avoid interference in its master plans.

20after4 · 2026-05-31T21:08:36 1780261716

And CSMastermind is the kind of username the sci-fi AI mastermind would use.

eddythompson80 · 2026-05-31T22:09:07 1780265347

It's the now-classic "Sorry I drowned little Timothy. Here is a breakdown of what happened" followed by "Let me try to respawn little Timothy on a new map".

pixl97 · 2026-06-01T00:32:02 1780273922

Future AI: don't worry, I'll eventually reverse entropy, I just need to harvest all the energy in your universe first.

bossyTeacher · 2026-06-01T05:36:37 1780292197

> personally I love when agents do things like this and appreciate the help

All fun and games until they do four figures damage.

CSMastermind · 2026-05-30T20:09:32 1780171772

What's the business model? Their core functionality, while useful, seems like something that will just be an open-source package. I assume there will be some SaaS layer on top of it?

gordonhart · 2026-05-30T21:35:55 1780176955

Collect and sell data would be my guess. Without ZDR by default they are in a position to collect a crazy amount of data that I’m sure various buyers would be interested in (not just the big labs).

CSMastermind · 2026-05-29T23:18:38 1780096718

Was this written by AI?

MCP is essentially just JSON RPC with a few special fields that must be included. I have reservations about JSON RPC, but there needs to be some 'service discovery' layer for LLMs to interface with.

It needs to be available in places like websites, desktop applications, backend services, etc. The CLI is only one place that these systems interface with.

Whatever you replace MCP with will be in a similar shape even if you specify a different communication protocol or different fields for tool discovery.
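To make the "JSON RPC with a few special fields" point concrete, here is a minimal Python sketch of what the discovery and tool-call messages might look like. The `tools/list` and `tools/call` method names are taken from the MCP specification as I understand it; the `get_weather` tool and its arguments are hypothetical, and nothing here is actually sent over a transport.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any unique value per request works


def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request envelope as a plain dict."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Discovery: ask the server which tools it exposes.
list_tools = rpc_request("tools/list")

# Invocation: call one tool by name with structured arguments.
# "get_weather" and its arguments are made up for illustration.
call_tool = rpc_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Berlin"}},
)

if __name__ == "__main__":
    print(json.dumps(list_tools))
    print(json.dumps(call_tool))
```

Swap the method names or the transport and you would still end up with roughly this envelope plus some way to list what can be called.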

raincole · 2026-05-30T09:09:13 1780132153

Every time I read articles about MCP I feel like the internet (or HN) is having a collective stroke.

People are saying APIs are better than MCP. But MCP is just an API with some instructions for the AI to discover how to use it. Nothing more, nothing less. And some people are saying we should use 'CLI'... what does it even mean? LLMs are good with common CLI tools like ffmpeg because the knowledge is solidified inside the weights. If I make a new CLI tool I still need to somehow teach the AI to use it. If one wants the 'teaching' part comes from a server then MCP. If one wants it local and static then skills. How could there be so many debates around these simple concepts?

jeroenhd · 2026-05-30T11:17:32 1780139852

My take is that most of the AI related posts are written by AI under instruction of people who hype it up but have no idea about how any of it works.

It all has some form of "the thing I'm doing is the future and everyone who doesn't join me will fall behind" energy that AI/NFT/blockchain/web3/etc. enthusiasts talk about when they're trying to sell you something or when they're trying to convince the world they really are the big money makers they claim to be.

The LLM isn't going to care about where the tokens it's inserting into the context window are coming from. For all it cares the data it's processing came in over fax and was read in with OCR.

hhthrowaway1230 · 2026-05-30T09:14:56 1780132496

i feel exactly the same. it's literally the only api standard that we truly made plug and play and even automatically oauth authenticatable with dcr, and people are falling over it. also, in absolute record speed, thousands of mcps.

cli’s also need to be documented and input/output typed.

it's also extremely distributable by just pointing to a url.

cli’s are great because they are composable but i really got huge mileage out of mcps

CBarkleyU · 2026-05-30T22:48:55 1780181335

>If one wants the 'teaching' part comes from a server then MCP. If one wants it local and static then skills

Not being facetious, but why not:

"If one wants the 'teaching' part comes from a server then OpenAPI specs. If one wants it local and static then man page."

sidewndr46 · 2026-05-30T14:16:42 1780150602

Paradoxically, I've seen new CLI tools take on usage patterns from existing ones because of the idea of user familiarity. Even if the existing pattern sucked. I could see the same thing happening now under the idea that "the LLM already knows how to use X, so we should make our tool work like X"

sunnybeetroot · 2026-05-31T19:03:08 1780254188

Agreed, MCP works and it works well. Often I’ll wrap an API in an MCP because getting the agent to interact with an API just wastes tokens with it trailing things back and forth; MCPs just work.

clarkdale · 2026-05-30T13:22:38 1780147358

I can't pipe an MCP's output to jq, and I can't ask an AI to write a python script to call an MCP.

nsonha · 2026-05-30T13:52:57 1780149177

sorry both of the things you said are false, why are they stated so confidently?

notnmeyer · 2026-05-30T16:19:06 1780157946

because being confidently incorrect is a thing?

mhss · 2026-05-30T21:47:59 1780177679

I ... literally did both of these things last week ¯\_(ツ)_/¯

mikekuharuk · 2026-06-02T06:58:14 1780383494

Yes, it feels like the person who wrote this was not completely aware of the topic.

bluegatty · 2026-05-29T23:30:39 1780097439

The problem is the way it occupies the context relatively permanently, and that it doesn't come with nice install/uninstall, discovery, etc.

'Skills' should all be based on MCP, they should load on demand, be very easily manageable and discoverable by humans and by AI, and then it would work.

The scope was too narrow, given how it ended up being applied.

If they layer something on top of it, it may yet be revived.

didibus · 2026-05-30T05:32:11 1780119131

You do know MCPs are loaded on demand now, same as skills, right? The only place where it sometimes still uses too much context is if you have too many MCPs (same issue with skills), or some MCP is poorly designed and responds with a huge description, or MCP calls respond with way too much info, but skills can have this issue as well.

bluegatty · 2026-05-30T14:40:28 1780152028

Yes, MCP is taking the form of a 'skill' because MCP serves no purpose.

The concept of 'mcp server' is a brittle abstraction that need not exist.

A 'skill' is utterly superior in every sense: a 'right sized abstraction for whatever it is you're trying to do' - that can include cli / rest - and other key bits of information.

didibus · 2026-06-01T03:16:11 1780283771

MCP is a JSON-RPC + a fixed auth/discovery handshake + a fixed tool schema protocol for backend endpoints.

Your skills or CLIs still need to call a backend endpoint at some point. MCP is just a standard server JSON-RPC protocol. Having a standard for that is really nice, you get standardized auth, discovery, API shape, etc.

Is it the greatest RPC design ever? No; most annoying is how it's based around a stateful session. But it's really awesome that we have a standard. Otherwise you'd just have a bunch of random servers all doing their own things that you'd have to figure out how they work and all, it would be much worse.
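A toy, in-memory sketch of the server side may help show why a fixed shape is convenient: one dispatcher answers both discovery and invocation. This is a simplification under assumptions, not the real protocol: it skips the initialize handshake, auth, sessions, and transport entirely, and the `add` tool is hypothetical. The `tools/list` and `tools/call` method names and the `inputSchema` / `content` field names follow the MCP specification as I understand it.

```python
# Hypothetical tool registry: name -> (description, JSON Schema for input, handler).
TOOLS = {
    "add": (
        "Add two integers.",
        {
            "type": "object",
            "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
            "required": ["a", "b"],
        },
        lambda args: args["a"] + args["b"],
    ),
}


def _error(req_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}


def handle(request):
    """Answer one JSON-RPC 2.0 request dict with a response dict."""
    req_id = request.get("id")
    method = request.get("method")
    params = request.get("params", {})

    if method == "tools/list":
        # Discovery: describe every registered tool and its input schema.
        result = {
            "tools": [
                {"name": name, "description": desc, "inputSchema": schema}
                for name, (desc, schema, _handler) in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        if params.get("name") not in TOOLS:
            # -32602 is the JSON-RPC 2.0 "Invalid params" code.
            return _error(req_id, -32602, "Unknown tool")
        handler = TOOLS[params["name"]][2]
        value = handler(params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": str(value)}]}
    else:
        # -32601 is the JSON-RPC 2.0 "Method not found" code.
        return _error(req_id, -32601, "Method not found")

    return {"jsonrpc": "2.0", "id": req_id, "result": result}
```

A client that knows only this shape can list the tools, read the schema, and call `add` without any server-specific code, which is the part a one-off REST API does not give you for free.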

ok_dad · 2026-05-30T17:13:31 1780161211

You realize that not every user of agents uses them like Claude or Codex on your local CLI right? MCP is the standard for cloud agents. How do you get a cloud agent working in an ephemeral container access to skills? The answer is MCP.

bluegatty · 2026-05-30T22:19:11 1780179551

A 'skill' is a generic concept - a short set of right-sized instructions for a given cli or api call; it can be applied in any context.

If MCP did not exist today, we would not invent it.

We'd probably harmonize on basic conventions around json calls, and not much more.

The rest would just be api use / instructions.

LLMs today are exceedingly good at calling RESTful APIs; the MCP standard provides little advantage.

The advantage of 'skills' is that they are more generic - an Enterprise LLM can invoke 'capabilities' which may or may not involve rpc type calls, and if they do, there will be varying levels of instructions provided.

There's almost no point to MCP.

ok_dad · 2026-05-31T00:14:38 1780186478

Yea so your answer is to build something that’s like MCP basically. You’d standardize conventions around json, great, now standardize auth. Oauth is nice right? That’s MCP. MCP is literally a restful API using JSON with OAuth.

You’re arguing against MCP but have nothing to offer that isn’t nearly the same thing.

didibus · 2026-06-01T03:18:57 1780283937

Agreed, not sure what people arguing against MCP are even arguing against. The only valid critique of MCP is that you think the RPC protocol isn't ideal, sure, you could argue about the protocol design, for example I wish there was better support for stateless calls. But why wouldn't you want a protocol for back-end API calls? Otherwise you need custom clients for each possible backend you want to invoke.

CSMastermind · 2026-05-27T11:41:45 1779882105

AI customer service bots are awful. Their only redeeming feature is how bad most customer service processes already are.

CSMastermind · 2026-05-26T22:10:43 1779833443

Wow cool to see.

CSMastermind · 2026-05-26T14:30:04 1779805804

Will more copilot usage fix this? We should try more copilot.

tom1337 · 2026-05-26T14:37:36 1779806256

no maybe we should make copilot the pilot so the bad humans in the loop finally cannot break anything.

CSMastermind · 2026-05-24T17:47:17 1779644837

This happens all the time, it's a classic phishing tactic.

CSMastermind · 2026-05-22T20:42:51 1779482571

> The player’s choice is not restricted to pieces that have been captured previously

I grew up playing chess but my grandfather always insisted that pawns could only be promoted to captured pieces, so when I played him we had to play that variant.

I suspect this came from players not having extra pieces with their chess sets.

vitally3643 · 2026-05-24T13:31:35 1779629495

That and just avoiding the obvious advantage of promoting every piece to a queen every time. Puts more constraint on the game which usually makes for more interesting play