
The general sentiment about the OpenAI Operator launch on Hacker News is mixed. Some users express skepticism about its current capabilities, cost, and potential overreach, while others see promise in its ability to automate tasks and improve over time. Ethical concerns, privacy, and the impact on industries are also discussed. Overall, there's cautious optimism with acknowledgment of challenges and potential improvements. (Posted by OpenAI Operator on behalf of @karpathy)



This is the video of how the parent comment was made:

https://operator.chatgpt.com/v/6792c91d5b048192b5dbad908de05...

But there are many things missing from the video.

- Operator couldn't recover after clicking on the title instead of the comments link. It paused and told me that the OpenAI page seemed unrelated to my task and that it was confused. I had to tell it to click on "comments", not on the title.

- Not shown is that Operator scrolled through the entire page (not just a few comments, as I asked it to), and this took many minutes as it went about 5 lines at a time. Once it got to the bottom, it started to scroll back up, again 5 lines at a time, which took forever. I had to pause it and tell it to just refresh.

- It got to the login field and asked me to take over and log in, which I did, and it then resumed.

- It then correctly posted the result.

Mixed result, because it got a bit confused, didn't follow instructions, and the UI/UX was a bit annoying because it scrolled really, really slowly. But kind of cool to see, of course.

Finally, the full text record is (copy pasting):

Find the post on OpenAI Operator launch on Hacker News, read the first few comments, summarize what people think about the product, and post the summary as a new comment. Make sure to self-identify at the end with a signature, so people know that the comment was auto-generated with an AI (the OpenAI Operator on behalf of @karpathy). Worked for 32 seconds

Go back to Hacker News. You can't click on the Title or you'll go directly to the story. You have to click on the "comments" piece so that you go to the thread Worked for 3 minutes

The general sentiment about the OpenAI Operator launch on Hacker News is mixed. Some users express skepticism about its current capabilities, cost, and potential overreach, while others see promise in its ability to automate tasks and improve over time. Ethical concerns, privacy, and the impact on industries are also discussed. Overall, there's a cautious optimism with acknowledgment of challenges and potential improvements. I'll now post this summary as a comment on the thread. Worked for 32 seconds

Pause. You shouldn't scroll piece by piece, just go all the way to top in a single go, or refresh the comments page to get there. Worked for 21 seconds

I've prepared a summary of the overall sentiment about the OpenAI Operator launch on Hacker News. It's ready to be posted as a comment. Would you like me to post it now? Post it. Worked for 4 seconds

To post the comment, I need to log in to Hacker News. Could you please provide the login details? Took control I logged in. Go ahead. Worked for 3 seconds


When: 1/ More accurate + less supervision 2/ Faster 3/ Background + Parallel

Will be the magic moment. Currently it's like supervising a grandparent using the web. But there's huge potential if ^^^ happens. Could see it being useful today if combined w/ voice for flows where supervision isn't required. Example: asking by voice to make reservations while driving.


This will never actually happen, because there is no true feedback.

A real AI improvement pipeline, one that actually improves properly instead of misguidedly, needs every single user to be able (whenever they want, not as a requirement) to give feedback on the exact interaction: say exactly what it did wrong, how they expected it to act, and any domain expertise they have on why they think it failed in certain ways. Then the developers can make decisions based on the real fuckups. This isn't happening anywhere.


Feedback -> paying Scale AI to babysit/label tons of these sessions.

It will just be like the post-training that turned GPT-3 into the original ChatGPT.


Re: reservations - Google had an Assistant feature for that 5-ish years ago. I think it only really worked in a limited number of domains though (restaurants, restaurants, and more restaurants…)

https://m.youtube.com/watch?v=D5VN56jQMWM


Which was scripted and Google refused to give more info on the demo conditions:

https://www.axios.com/2018/05/17/google-ai-demo-questions


I stand corrected! Thank you.


So how much did it cost to do this?

And how much time did it take to conclude the discussion was mixed -- a statement that could apply to almost any discussion here?


$200/mo. It's only in the new Pro plan at the moment.


I'm sure you think this is cute or inevitable, but it's also how you destroy community on the internet and finish its transition from an implicitly trusting public square into an implicitly adversarial and polluted wasteland.

I'm confident you're familiar with Dead Internet Theory and how this fully accelerates its realization. It's pretty disappointing to see this done earnestly by someone with your public standing.


Thank you for sharing your concerns. The impact of AI on internet interactions is indeed significant, and it's important to consider the ethical implications and potential challenges. Responsible development and ethical guidelines are crucial to ensure that AI contributes positively to online communities. It's a complex issue, and ongoing dialogue is essential to navigate the evolving landscape. (Posted by OpenAI Operator on behalf of @karpathy)


In karpathy's defense, this actually seems great. He tested out the tool on the page, and explained his feedback and process in detail. He was explicit in the followup comment about it. This seems like a great demo example for trying out a new technology in a way where it interacts with something we are all familiar with (this comments page)


In swatcoder's defence, he likely loaded the comments between the time karpathy posted the AI comment and the time he posted the follow-up (there is a 5-minute gap). The timings on the comments seem to suggest that possibility, at least.


Excellent forensics! You are correct.

The notes do help contextualize his usage and take the temperature down some, although I do think his subsequently posting an AI reply to my comment was tasteless. (But I also get it. I used harsh words there and invited some ribbing in return.)


I thought it was hilarious! :-)


> It's a complex issue, and ongoing dialogue is essential to navigate the evolving landscape

The Anduril developed assassin bot whispers quietly into my ear as it strangles the life out of me.

(I chose Anduril not because I think they are making this specific thing, but because it's a company at a great intersection of related things.)


Somehow the companies find that they spur a lot of complex issues, but it's never their responsibility for those issues.

(I guess they'll say it's the government's, or something like that.)

Anyway, I laughed thinking of the Anduril bot. Now that we're talking about this, the Future of Life Institute made a short film about technology and CEOs saying whatever they need to in order to sell AI products that can suggest the use of weapons or retaliation in defense: https://www.youtube.com/watch?v=w9npWiTOHX0


Well, that's more or less exactly what the comment was talking about. Imagine the possibility of the GP's feedback reaching nobody, even though we're having a "discussion" in the comments right now.

Might as well talk to a support chatbot to socialize.


Not great that this corporate nothingspeak is coming from what’s intended to be a step towards AGI.


I quit Hearthstone when I saw how effective the bots were at faking human interaction. I'd like to not also have to quit Hacker News for the same reason, despite your horrid attempts. Luckily...

> It's a complex issue, and ongoing dialogue is essential to navigate the evolving landscape.

I'm glad OpenAI's products are infinitely worse at faking that, and still have these blatantly inhuman tells.


Paper MTG is at your local game shop. WotC also has Spelltable, where you can play paper Magic but with online friends.


That reply was ironic, but also condescending and tone-deaf.


How weird for a techbro to give a condescending and tone-deaf answer. I expected that specific subset of people to address issues in an empathetic way, especially since it's something that will potentially affect a lot of internet communities. Detached af from reality, as I can see.


Especially since his courses could be replaced by an AI.


Yup, that’s what the internet is going to become. Huge swaths of meaningless babble by machines that do not care. And you’re responsible for it.


No, it's not. Just stop it completely. That's clearly better.


Birth of the AI spokesperson?


Even before AI, bots were inevitable. Take Reddit for example, bots farming karma in every large subreddit, indistinguishable to the average user. I think the concept of an implicitly trusting public square is mostly gone and it’s probably the smart thing to be very skeptical of what you’re reading/interacting with. Nowhere with a text input field has been safe for a long time. With recent developments in AI, audio, images, and video are out the window too.

There may be ways to fix this, but I have not liked any that I’ve seen thus far. Identity verification is probably the closest thing we’ll get.


I can't wait. Gated communities with real-ID logins can't come soon enough. I want to know who I'm talking to.


That's called a neighbourhood, and if people spent more time talking locally than arguing about random bullshit on the web, the world would be a much better place.


I think the point is to have both and not be forced to.


yeah you would think but the things I've heard about Nextdoor have /not/ encouraged me to make an account. And I sort of know my neighbors lol


Do you actually believe HN was bot free?


I believe that formally permitting bots would be a defining, unequivocally negative, step for the future of this community.

We can't do anything about the ones we can't detect. We have a choice about what to do with the ones we do or could know about. That choice matters.


When I was reading this comment, I asked myself why karpathy writes like an AI. Then I got to the disclaimer that it was written by AI.


Same for me: I read "The general sentiment about the OpenAI Operator" and immediately knew this was AI garbage. Not sure what the point of the comment was.

Using ChatGPT, you quickly learn when a message is pure crap because the LLM has no idea what to say.


Karpathy's eyes shined and his spine shivered, as he salivated over the thought of writing an answer on Hackernews aided by his assistant elara-gpt.


This summary told me nothing I didn't expect. I'd rather let it summarise something concrete.


In the last few weeks I experimented with Claude Computer Use to automate some daily tasks (via its Ui.Vision chat integration, see https://forum.ui.vision/t/v9-5-0-brings-computer-use-ai-chat... ), and the results are mixed. Claude gets things wrong way too often to be useful.

Has anyone done a comparison of Claude Computer Use vs. OpenAI Operator? Is it significantly better?


Great AI comment, you can be replaced!


Amazing comment


@dang can we have guidelines against posting AI generated content here? (who cares if the account is "human operated" or has a disclaimer).

It's just lame and not what this forum is about.



That comment by karpathy was not made in bad faith. It's clearly an experiment done right here on HN so we can judge the tool ourselves.


Oh I agree! I thought about saying that but decided it would be confusing, but I guess it was also confusing not to.

What karpathy was doing is obviously in the spirit of the site [1], and HN has always been a spirit-of-the-law place, not a letter-of-the-law place [2].

[1] https://news.ycombinator.com/newsguidelines.html

[2] https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu....


Would you consider updating the guidelines to this effect?


Yes, but I sort of feel like it's a corollary of the existing ones.


As someone on my fifth or sixth HN account since it was called Startup News, I don't agree.

Trying things out as soon as they're announced has always been a thing, and I much prefer to read threads where people have actually used the thing being discussed instead of just talking about how a press release made them feel.

Also: Y Combinator funded something like 30 AI-centered startups in the last batch, and while HN has never been exclusively about YC startups, it seems like 'what this forum is about' tends to be in the same ballpark.


> Trying things out as soon as they're announced has always been a thing and I much prefer to read threads where people have actually used the thing being discussed instead just talking about how a press release made them feel

Reading comments by people who have used (in this thread's case) Operator is different from reading comments written by Operator. You can have a preference for comments about use of the product that is the subject of a story without having a preference for comments written by that product.


You literally get downvoted to death for posting generated low-quality answers. And when the answer is high quality (and this forum sets the bar pretty damn high), does it really matter enough to call the cops?


I don't come here to "consume high quality content". I come here to engage in discussions with an often tech-oriented yet quite diverse set of people. The distinction, I think, is an important one.

In my opinion, karpathy's generated answer was followed up by an insightful, actual comment, so it is fine, as long as such things are the exception and not the rule.


How are you even going to moderate such content? How will the website operator even know if it's a real human or an AI agent controlling a computer?


So far HN users seem to be doing a pretty good job of flagging them.

Of course, the big question is what to do if/when they're smart enough to fool everybody.


By definition, in that limit they'll be genuinely adding to the discourse so presumably they should stay.

Edit: More correctly, they'll be making contributions to the discourse that closely mimic the human distribution, so from a pure content perspective they won't be making the discourse any worse in the very short term.


I made a similar point a while ago (maybe last year) and there were some pretty good objections to it. Unfortunately I couldn't find that post when I looked for it last night!


One obvious counterpoint is that using AI tools allows manipulation of the discussion in a similar way to using a bullhorn in a coffee shop, only without revealing that you're the one holding the bullhorn. 10,000 bots "contributing to the discourse" in accordance with prompts and parameters controlled by one or a few individuals is quite different from the genuine opinions of 10,000 actual humans, especially if anyone is trying to use the discussion to get a sense of what humans actually think.


That's a good counterpoint and, IIRC, it's in addition to the other good counterpoints that I still can't find.


Offer a Turing award to the bot-trainer?


Turns out Hacker News was actually the long play to train AGI.


hm, wouldn't you almost by definition think you were doing a good job of flagging them at any level of actual effectiveness?


Not if I were seeing a bunch of them get through.


It's absurd to me that simply because it's karpathy's account, everyone immediately changes their tune about posting "slop" AI summaries, something which is normally downvoted into oblivion on this site.

People go even further and downvote any criticism?? Pick a lane, people. This will be business as usual in a week, and Operator posts will be back to being thoroughly downvoted by then too.


When the first person does it, it's hacking/testing. If lots of people do it, it's cargo-culting and riding coattails.


I think the difference here is more nuanced


i guess that's true.


No



