Ask HN: LLM Extended OS
2 points by Obertr on Dec 2, 2023
Concept of LLM Extended OS

I DON'T believe we will be able to jump to LLM-first operating systems straight away.

Self-operating systems are just like ChatGPT. They need reinforcement learning from human feedback (RLHF). They need it to LEARN how to operate effectively.

How can LLMs collect this feedback?

You SHOW the user an interface where the LLM is operating. When the LLM makes a mistake -> the user gives negative feedback.
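
To make that loop concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration (the `Step` and `FeedbackLog` names and the -1/+1 scoring are my assumptions), not an existing tool or a real RLHF pipeline. It only records what the model did and what the human thought of it, which is the raw material any later training would need.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One action the model took on screen, plus the human's verdict."""
    action: str                     # e.g. "open Finder" (free-form, hypothetical)
    feedback: Optional[int] = None  # -1 = mistake, +1 = good, None = not rated


@dataclass
class FeedbackLog:
    """Hypothetical log of visible actions and the ratings a user gave them."""
    steps: List[Step] = field(default_factory=list)

    def record(self, action: str) -> int:
        """Store an action the user just watched; return its index."""
        self.steps.append(Step(action))
        return len(self.steps) - 1

    def rate(self, index: int, score: int) -> None:
        """Attach the user's feedback to a previously recorded action."""
        if score not in (-1, 1):
            raise ValueError("score must be -1 or +1")
        self.steps[index].feedback = score

    def negatives(self) -> List[str]:
        """Actions the user flagged as mistakes."""
        return [s.action for s in self.steps if s.feedback == -1]
```

Usage would look like `i = log.record("delete ~/Documents")` followed by `log.rate(i, -1)` when the user sees the mistake happen on screen.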

What is the best way to do it? Definitely not some server running in the cloud that you can't see except through logs.

For the LLM to learn MOST from humans, it needs to operate on systems where humans can provide PROPER feedback. What are examples of this? The systems we are using RIGHT NOW.

I picked MacOS.

Let's deep dive into how I see it.

I called it LLM extended OS.

First, it has A LOT in common with what @karpathy has shown. It has access to terminal, camera, audio, and browsing.

The main difference is that it has the APPS people use right now. Like:
- VS Code
- Finder
- TODO lists

Just think about it. If an LLM has imperfect attention, just like humans do, why not give it the FULL interface WE use, including:
- Calendar: allow the LLM to schedule tasks
- Reminders: set tasks it NEEDS to do
- Music: no, just kidding
- Discord: allow it to EFFICIENTLY communicate with OTHER LLMs and send them tasks, while being ignored if they are not interested

Another major difference is the graphical user interface (GUI).

Yes, VISION is too powerful NOT to make use of. @josh_bickett just made a great case with the self-operating computer, which proves the point.

One thing I think EVERYONE has missed is something that has been lying here for decades.

Accessibility/VoiceOver

Some people have trouble SEEING stuff. Thanks to Apple, those people can STILL use the computer just like YOU DO. Which is mind-blowing, right?

And think about it: visually impaired people can navigate the computer and almost FULLY operate it, just like you can. I tried it, it's pure magic.

The question is, if those people use only RAW PRONOUNCED TEXT to navigate, will an LLM not be able to do it too?

You know the answer
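
As a rough illustration of why this could work, here is a small Python sketch that flattens a tree of on-screen elements into plain text lines, similar in spirit to what a screen reader announces. This is NOT the real macOS accessibility API: the `UIElement` class, its `role` and `label` fields, and the `describe` function are hypothetical stand-ins I made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UIElement:
    """Hypothetical stand-in for one node of an accessibility tree."""
    role: str   # e.g. "window", "button" (assumed naming)
    label: str  # the text a screen reader would pronounce
    children: List["UIElement"] = field(default_factory=list)


def describe(element: UIElement, depth: int = 0) -> List[str]:
    """Walk the tree depth-first and return one indented text line per element."""
    lines = [f"{'  ' * depth}{element.role}: {element.label}"]
    for child in element.children:
        lines.extend(describe(child, depth + 1))
    return lines


# A made-up window, only to show the shape of the output.
window = UIElement("window", "Finder", [
    UIElement("button", "Back"),
    UIElement("list", "Files", [UIElement("row", "notes.txt")]),
])
print("\n".join(describe(window)))
```

The resulting text could be placed straight into a prompt, so the model reads the screen the same way a VoiceOver user hears it.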

And there is more to write about it... I'll keep posting progress updates on X: @Karmedge

Let me know what you think

I would appreciate any feedback on this concept and on any wrong assumptions I may have made. Feel free to correct me.

The winner in this race must be humanity, not capitalism.

Thoughts?



