Show HN: YakGPT – A locally running, hands-free ChatGPT UI

jwarden · on March 30, 2023

Nice. It took about a minute to clone it, run it, enter my API key, and get started. The speech-to-text worked flawlessly.

Most people can talk faster than they can type, but they can read faster than other people can talk. So an interface where I speak but read the response is an ideal way of interfacing with ChatGPT.

What would be nice is if I didn't have to press the mic button to speak -- if it could just tell when I was speaking (perhaps by saying "hey YakGPT"). But I see how that might be hard to implement.

Would love to hook this up to some smart glasses with a heads-up display where I could speak and read the response.

anonzzzies · on March 31, 2023

> Most people can talk faster than they can type

Most people I know type faster than they can talk. Also more accurate. I find talking a horrible interface to a computer while sitting down. On the move it is another story entirely of course.

By the way, chatgpt is not very fast either, so usually I type something in the chat and continue working while it generates the response.

> smart glasses

I just tried that; it works quite well, however, pressing the mic button kind of messes up that experience.

chenxi9649 · on March 31, 2023

Normal/average talking is ~150 WPM. Average typing speed is about 60-70. Is a 150+WPM a requirement to become anonzzies' friend?

MikePlacid · on March 31, 2023

The only person that could type as fast as I can speak and whom I met in real life was an immigration officer taking my naturalization interview. The sound of a keyboard going: trrr-trrr! And he was amazingly accurate too: all unnecessary things that I said for conversation sake were there, and exactly as I said them. But I think my wife would beat him easily though…

quickthrower2 · on April 1, 2023

Or a really slow talker?

High WPM might be achievable with shorthand though.

thelittleone · on March 31, 2023

The advantage of course is your not tied to a keyboard / desk. So one could potentially be doing Internet research while hiking.

tluyben2 · on March 31, 2023

Yes, and that with smart glasses seems interesting.

xupybd · on March 30, 2023

It wasn't so smooth for me.

I gave up at

Creating an optimized production build ...TypeError: Cannot read properties of null (reading 'useRef')

johnchristopher · on March 30, 2023

Oh, my install failed at:

    Failed to compile.

    pages/index.tsx
    `next/font` error:
    Failed to fetch `Inter` from Google Fonts.


    > Build failed because of webpack errors

Apparently because it can't fetch a font from Google. There should be assets that are critical (js/ts code, templates,css) and assets that are not (freaking fonts) to a yarn build.

edit: hacketyfixey, let's punch the thing in the face until it works:

    ./pages/index.tsx:
    2:  // import { Inter } from "next/font/google";
    12: // const inter = Inter({ subsets: ["latin"] });

(I am sorry)

kami8845 · on March 30, 2023

Haha, I'll set up a docker image that people can pull down!

johnchristopher · on March 30, 2023

Thanks but FWIW, I'd also be interested in why it doesn't build. Shouldn't yarn/npm/gulp/whatever manage dependencies ?

xupybd · on March 30, 2023

I've not found a dependency manager that works reliably across multiple operating systems and operating system versions.

uh_uh · on March 31, 2023

I did, just not in the JavaScript ecosystem.

JimmyRuska · on March 31, 2023

I tried it, it looks good! I had to modify the code to accept 8000 tokens for chatGPT. It would be good if it saved the json payload of the responses as well.

It uses 2 external calls to a javascript CDN for the microphone package and something else. It would probably be best if it was localhost calls only since it handles an API key

hypercube33 · on April 4, 2023

What'd you modify I'm curious?

FriedPickles · on March 30, 2023

I love the concept of this and other alternate ChatGPT UIs, but I hesitate to use them and pay for my calls when I could use chat.openai.com for free.

Any chance you could integrate the backend-api, and let me paste in my Bearer token from there?

kami8845 · on March 30, 2023

Hey! I definitely understand the reservation. This is definitely me as well. My reasons for using the UI at this point:

* GPT-4 is decently faster when talking straight to the API

* The API is so stupidly cheap that it's basically a rounding error for me. Half an hour of chatting to GPT3.5 costs me $0.02

Would be curious what you mean by integrating the backend-api?

qwertox · on March 30, 2023

GPT-3.5 is really cheap (prompt and completion = $0.002 / 1K tokens), but GPT-4 is around 20 times more expensive (prompt = $0.03 / 1K tokens + completion = $0.06 / 1K tokens).

But the benefit from using the API is that you can change the model on the fly, so you chat with 3.5 until you notice that it's not responding properly and, with all the history you have (probably stored in your database), you can send a bigger request with a probably better response once with GPT-4 as the selected model.

I really wish the interface on chat.openai.org would allow me to switch between models in the same conversation in order to 1) not use up your quota of GPT-4 interactions per 3 hours as quickly and 2) not strain the backend unnecessarily when you know that starting a conversation with GPT-3.5 is efficient enough until you notice that you better switch models.

OpenAI already has this implemented: When you use up your quota of GPT-4 chats, it offers you to drop down into GPT-3.5 in that same conversation.

sebzim4500 · on March 30, 2023

Sure, but GPT-4 through the UI costs $20 per month, which is a lot of api calls.

moneywoes · on March 30, 2023

Isn’t it 10 per hour?

arthurcolle · on April 2, 2023

25 / 3 hrs

robopsychology · on March 30, 2023

How is it that cheap?! I ran three queries on langchain yesterday with two ConstitionalPrompts and it cost $0.22 - made me realize deploying my project for cheap could be expensive quick.

kami8845 · on March 30, 2023

GPT3.5 Turbo pricing is 10k tokens or ~7500 words for $0.02. Though note that every API request includes the entire chat context and charges for input & output tokens. https://openai.com/pricing

monkmartinez · on March 30, 2023

You need to check which model you are using, also... LangChain runs through the model several times with increased token count on each successive call.

robopsychology · on March 30, 2023

Yeah I assumed it would be doing several times but still more expensive than OP mentioned. I think the issue is I'm using davinci-003

drusepth · on March 30, 2023

Yeah, davinci-003 is gonna be gpt3, which is more expensive than 3.5.

One more anecdote: I've been running a half dozen gpt3.5 IRC bots for a few weeks and their total cost was less than a dollar. A few hours of playing around with LangChain on gpt3 cost me almost $4 before I realized I needed to switch to 3.5, though even then it still uses a ton of tokens every chain.

robopsychology · on March 30, 2023

Thanks, I'll do that later

agotterer · on March 30, 2023

I’d love to see a comparison of the average cost to use this with the OpenAI API versus subscribing to chat-gpt plus.

Maybe I’ll have to try this for a month and see if it end up costing more than $20. Thanks for creating it!

joenot443 · on March 30, 2023

Wow! Is it really that cheap? GPT4 is much more expensive, I imagine?

kami8845 · on March 30, 2023

GPT-4 is decently more expensive -- I personally really like & use the therapist character a lot. In this scenario the session would cost me less than $1 which is still much cheaper than any therapist I've used previously :)

coolspot · on March 30, 2023

What is your setup?

1xdevloper · on March 30, 2023

You can try the extension I built [0] which uses your existing ChatGPT session to send requests.

[0] https://sublimegpt.com

unitg · on March 30, 2023

The overlay option is great .. Any chance for a firefox version?

Karunamon · on March 30, 2023

Remember that using the API comes with privacy guarantees that using the chatGPT site does not. tldr; anything sent through the API won't be used to train the model and will be deleted after a month.

https://help.openai.com/en/articles/5722486-how-your-data-is...

kami8845 · on March 30, 2023

This is a good point I'll add!

teawrecks · on March 30, 2023

> Run locally on browser – no need to install any applications

That's not what "run locally" means. This isn't any more "local" than talking to chatgpt directly, which is never running locally.

kami8845 · on March 30, 2023

Hey, run locally in this case means: YakGPT has no backend. Whether you use the react app through https://yakgpt.vercel.app/ or run it on your own machine, I store none of your data. I will try and make this wording clearer!

NBJack · on March 31, 2023

In that case you're basically offering a browser-based client. 'Locally' strongly suggests this is running entirely on the machine (vs. making API calls). Going to break a lot of hearts out there with the wording as it is.

rafael09ed · on March 30, 2023

It is more local than talking to chat GPT directly. Open AI stores all your requests on their server. This saves it on your computer. The title also claims it's a UI which always, for now, runs locally.

blairanderson · on March 31, 2023

Honestly your "idea generator" blew my mind. Would love to see a section that includes a larger catalog of prefilled prompts.

I'm thinking: What would a GPT project manager do? What would a GPT money manager do? What would a GPT logistics manager do? GPT Data Analyst, Etc.

meghan_rain · on March 30, 2023

> Run locally on browser – no need to install any applications

> Please enter your OpenAI key

...

Do people just not get it?

I would in fact rather give all my company secrets to this random dude than OpenAI.

iib · on March 30, 2023

There are instructions on how to run the GUI from localhost, and the title and even the phrase that has the link to their own hosting tell you you can run it locally first.

It seems they are genuine, and they phrase it exactly as it is. The only thing I would have maybe wanted to see in the title is "open-source" or free software.

runnerup · on March 30, 2023

Everything still gets sent to OpenAI. “Locally hosted” means the UI, not the AI.

hombre_fatal · on March 30, 2023

OP already makes it clear that they are just a front-end.

asow92 · on March 30, 2023

Love the idea of prompt dictation. Taking that idea a step further, would it possible to have a feature where ChatGPT responses are spoken back to the user?

pibefision · on March 30, 2023

War Games

mthoms · on March 30, 2023

"Do you want to play a game?"

smusamashah · on March 30, 2023

This is fast. And talking to it is a nice touch. Consider adding text to speech too :)

One feature I am missing from all these front ends is the ability to edit your text and generate new response from that point. Official chat gpt UI is the only one that seems to do that.

danielbln · on March 30, 2023

Chat-with-gpt has that, we use it in our org as an alternative chatgpt Interface: https://github.com/cogentapps/chat-with-gpt

smusamashah · on March 31, 2023

In official UI, if you edit a message and get a new response, you can still always go back to any of your previous messages and continue from there on. Basically the history is like a tree in official UI. History in all other frontends including this one is linear.

ilovepuppies · on March 31, 2023

I've never seen this one before. It has several features I've been looking for. Has it been working well for your organization?

danielbln · on March 31, 2023

It has, especially since we don't want to go through the accounting nightmare of buying everyone ChatGPT+ accounts, so just inviting everyone to the OpenAI org and giving out API keys to be used in tools like this one has been good.

ilovepuppies · on March 31, 2023

Good to know, thank you.

tluyben2 · on March 31, 2023

I added whisper to that (was merged) so you can talk to it as well.

smusamashah · on March 31, 2023

In official UI the chat history is like a tree. If you edit a message, it branches off the conversation from that point. You can always go back to any message in the tree and see the conversation from there on. Can you do that in your UI? No UI has done that so far.

tluyben2 · on March 31, 2023

I am not the author , just a contributor, but it would not be very hard to add.

kami8845 · on March 30, 2023

Hey! You can edit past messages you've submitted and they will generate a new response that overwrites whatever happened in the conversation previously. If you're talking about a tree-like struct where you can have different branches, then true, only the official UI has it AFAIK :)

Tiberium · on March 30, 2023

Looks cool! Are you planning on adding more customization to be able to influence the AI? See https://bettergpt.chat/ (it's also open source and uses API in the browser). Basically with that frontend you can control the role of all messages (e.g. add system messages) and also edit them all to better influence the AI in some cases.

kami8845 · on March 30, 2023

Editing the prompts (which are currently submitted via the system message similar to your linked app) is a great idea. I'll add it to the to-do list :)

computershit · on March 30, 2023

BRO. Your transcription is SO fast. I've hacked at a similar project passing to the Whisper API and honestly I was already blown away with its speed and accuracy (as was anyone I showed it to), but your implementation is so much faster both in TTS as well as the response from their API. I will absolutely use this.

ilovepuppies · on March 30, 2023

Very cool. I use a custom local UI as well, based on a fork of a similar project called ChatPad (https://github.com/deiucanta/chatpad). That also uses Mantine UI, and lets you create and save prompts just like chats. Data is stored locally using indexdb. I embedded it in an electron app, which lets me run it from my dock rather than a terminal. But what's missing is speech-to-text, so it's great to see this project has that.

There are a few drawbacks to local, I've discovered. For example I doubt the new plugins can be extended to beyond ChatGPT's web UI. Also, it doesn't stream response tokens as they're generated, which is a pain. I haven't looked into whether OpenAPI let you do that though.

Nice work!

ezzato · on March 31, 2023

Looks great. Super interesting to browse other peoples code. I'm working on a desktop app for ChatGPT.

https://github.com/EzzatOmar/delegate

throwaway675309 · on March 30, 2023

Given that Vocode (realtime audio, llm, etc) came out a few days ago, could you speak to how yours compares to it?

https://github.com/vocodedev/vocode-python

MikePlacid · on March 31, 2023

So, is time, finally, to entertain spam callers with nice, polite, _long_ conversations? About my credit card numbers and passwords to my accounts? My personal record is 40 minutes - some nice guys were trying to install a remote controlled door on my MacBook and were thinking they were very close to success. There are existing services, like https://jollyrogertelephone.com/ - but they are not as good as me. Still, using myself to entertain the robocallers is fun, but expensive, it would be interesting to see if AI is ready to help here…

user- · on March 31, 2023

Cool! I tried out the speech to text and it was instant and accurate, i had no idea whisper was that good.

Do you know their privacy for our voices? Do they train on it, hear it, etc ?

dilek · on April 6, 2023

if you're running it locally, they don't and cannot.

if you're using the hosted whisper, they can. however, they don't specifically talk about it.

Karunamon · on March 31, 2023

I absolutely love this! The UI is nice and responsive and this is the first chatGPT UI that has voice recognition that works outside of chrome!

I kind of want to throw this up on a server for my housemates to use, I am currently the only person with a openai account, so I would like the ability to embed my API key. Minor feature request :-)

einpoklum · on March 30, 2023

Hi ChatGPT! Let me register using my personal information, then tell you what my tasks are at works, what I'm interesting in, what I'm struggling with in life and a bunch of other sensitive personal information. I trust you completely, and am sure a nice AI such as yourself would never use my personal data for anything.

HeavyFeather · on March 30, 2023

Barking up the wrong tree, this post is for a thirdparty tool.

illuminated · on March 31, 2023

The only thing I'd suggest to consider to add is some sort of authentication. If I deploy this on a server so I could reach it with my mobile, on the go, and it has my API credentials, I wouldn't want anyone who stumbles upon the page to be able to interface ChatGPT on my expense.

Otherwise, it really looks good.

1nf_ · on March 31, 2023

Your API key and chat history are stored in browser local storage only.

illuminated · on March 31, 2023

Well, I'd love to see a change there as well, then: if I'd want to share the interface with my family I wouldn't want to reenter everything everywhere they might be accessing the page from.

butterfly771 · on March 31, 2023

you need maybe https://github.com/Chanzhaoyu/chatgpt-web or https://github.com/Yidadaa/ChatGPT-Next-Web

fudged71 · on March 30, 2023

I've been playing around with your Idea Generator persona for the last 15 minutes and have been absolutely blown away. Excellent prompt engineering.

As mentioned by others, it would be great to customize or write new personas/prompts.

Also could you add a voice chatbot as well using vocode? It could be an alternative UI for each of the personas.

diversionfactor · on March 31, 2023

So if you add audio output to it so I can talk to my computer like in Star Trek, I'll venmo you $100. Then, I want to have a command line module so I can ask it to write files to the local disk and run them, so I can deploy code it's just written to AWS, that's worth at least another $100.

yosito · on March 31, 2023

It's not that hard to do, but I think this is lowballing. If you want a talented programmer to do something for you, you should be willing to pay them $150/hr. And I'm assuming this is more than an hour of work.

dingclancy · on March 31, 2023

It would be great if I can just enter "space" in the app and it just lets me talk to it. Keyboard shortcuts!

BTW I have a lot of these ChatGPT UI apps installed, mostly free and open-source. Perhaps this is really the era of going back to just talking to a chat interface like the old times.

chenxi9649 · on March 30, 2023

This is very well made and designed. I will likely use this instead of the actual Chatgpt UI since their API is a lot cheaper than the 20$/month pricing for my usage amount.

Interesting note: I tried speaking mandrain chinese to the mic and it auto translated what I said into English.

donpark · on April 1, 2023

Just tried this in both English and Korean. Fumbled a bit with voice control but worked well once I got it going. Very nice. Korean prompts got translated to English so had to tell ChatGPT to respond in Korean to get full non-English UX.

Well done.

tough · on April 2, 2023

It sounds like a nice modifier to add a one liner to the prompt "Return your response in $user.language"

oriettaxx · on March 30, 2023

It's pretty bad to ask people to enter e private secret key in a web site (any, I mean)

thangngoc89 · on March 30, 2023

They provided an option to build it locally and run it yourself. But yeah, I wish there is a common proxy protocol that would allow website accessing private resources without exposing private keys

titaniczero · on March 30, 2023

OpenAI should implement an oauth authorization server and allow developers to use "Login with OpenAI account" into their apps.

butterfly771 · on March 31, 2023

I agree, this is the best solution, I'm sick of countless projects with key input fields where I have to go to Ctrl/CV every time.

blazespin · on April 11, 2023

Not to mention the ludicrously gaping security issue that this is. My guess is they want to push people to the plugins tho.

andag · on March 30, 2023

Maybe a small video demo would be an ok alternative?

Veen · on March 30, 2023

What alternative would you suggest for a free service that depends on OpenAI APIs? It's easy enough to generate an API key for this service and delete it afterwards.

gkbrk · on March 30, 2023

Why? OpenAI keys can be revoked at any time, and OpenAI allows you to set soft and hard limits for billing as well.

You can also generate multiple keys, so if one app misbehaves, you don't need to rotate all the keys, just the one that misbehaves.

This is assuming the API keys can only do generation. If it can access billing details or something it's very different of course.

balls187 · on March 30, 2023

> Why?

Because it's bad practice to provide sensitive information to untrusted sources, and if you are an ethical developer, it's an anti-pattern to write software that encourages bad practices.

Your credit card company will reverse any authorized charges. Will you email me all your credit card info?

sebzim4500 · on March 30, 2023

If I could generate a credit card number just to send you money then yeah sure.

oriettaxx · on March 31, 2023

> It's pretty bad to ask people to enter e private secret key in a web site (any, I mean)

I answer back to myself: I miss-understood since the idea of the developer is to run it locally http://localhost:3000 while I got scared from the DEMO

Congrats to the developer!

terran57 · on March 30, 2023

I installed it locally about an hour ago and have been running it through some paces. Nice work! (In addition to the predefined prompts, I like the API usage meter at the top).

(now, I just need Openai to take me off the waitlist for GPT-4)

psychoslave · on March 31, 2023

I’m a bit confused, I tried to utter some queries in Esperanto and French and it transcribed English (fine) translations. Can I disable this behavior to have the text transcribed in the language uttered?

andymac4182 · on March 31, 2023

I might be missing it but do we have an idea about the prompt that ChatGPT uses so we can replicate the experience?

I haven't played with the OpenAI API yet. Is there examples of good prompts to use to get good responses?

noobcoder · on March 31, 2023

Love this, Few things we could add: - Search Feature - Way to import/export chats - Star/Favourite replies by ChatGPT - For GPT4 provide 8k/32k model variations - Prompt Dictionary

victorantos · on March 31, 2023

I get a 404 error in the browser console for http://localhost:3000/encoderWorker.umd.js

afro88 · on March 30, 2023

This is exactly what I need, thank you for building this! We're using Azure cognitive services for API access to OpenAI models though. With any luck, expect a PR today for basic Azure support :)

LanternLight83 · on March 30, 2023

Could I hook this up to one of text-generation-webui's API formats?

aryamaan · on March 31, 2023

would be so fun if you could fork a project on vercel i.e this project has a button to fork: - which forks its github - makes a new project on your vercel cause it's connected to your github - it opens a new tab with your project running.

kulikalov · on March 30, 2023

Isn’t GPT a trademark owned by OpenAI? Is it legal to use it?

sebzim4500 · on March 30, 2023

Looks like they've recently applied for the trademark but they haven't got it yet. I have no idea if they will get it or not, it is just an acronym but they did come up with it.

syrrim · on March 31, 2023

They did seemingly position it as a generic name for this style of AI model, and other people have been using it in that fashion (eg "gpt-j"). It's usually recommended to contrast a brand name for your product with its generic name, so that the two don't become confused. Hence why scrabble is always subtitled "crossword game".

matemp · on April 5, 2023

Agreed. I doubt that OpenAI's recent application seeking to trademark "GPT" will be approved. Maybe specific models/products, but not just "GPT" by itself...

To be able to register a trademark in the U.S., the applicant has to show that the proposed trademark is in fact "distinctive" of their company. The more generic a term is in its field, whether to begin with (i.e., by not becoming distinctive in the first place...), or over time (i.e., by failing to maintain its distinctiveness), the less likely it is to be registerable. And, such "distinctiveness" is notably harder to achieve and/or maintain for terms that are more generic/descriptive rather than truly unique…

In the case of "GPT," in the context of software (specifically A.I.), those letters -- particularly in that combination -- are understood to stand for things that refer to a kind of A.I. language model having certain characteristics, even though OpenAI was first to produce a (g)enerative (p)retrained (t)ransformer and they're still the most notable provider of such technologies.

ushakov · on March 30, 2023

What's the use-case for this instead of the default UI?

itsthecourier · on March 31, 2023

Cross-platform compressed audio record!? How!?

yosito · on March 31, 2023

> Run locally on browser – no need to install any applications

This seems to be a contradiction. Am I running it locally, or is it running on someone else's server?

amitmerchant · on March 31, 2023

It simply means calling GPT APIs locally.

NBJack · on March 31, 2023

Kind of a gross misuse of the term, isn't it?

Obertr · on April 2, 2023

speech to text didnt transcribe text after a minute. recording was 5s long(((

kami8845 · on April 4, 2023

yeah unfortunately the OpenAI API hangs sometimet

thelittleone · on March 31, 2023

All your prompts are belong to us

kristopolous · on March 30, 2023

Make it easier to try

kami8845 · on March 30, 2023

Hey! I would love to. I seriously considered adding my own key into the app, and implementing some rate limiting to e.g. allow you to send 3 messages for free. But unfortunately that would require me to store some backend data on you that I do not want: I want this to be a completely "private" / FE-only application that stores no data on anyone.

connorgutman · on March 30, 2023

Testing YakGPT right now, excellent work! I would recommend adding some screenshots to the GitHub README so that people can get an idea of how it looks before entering their API key.

avindroth · on March 30, 2023

Before you comment something like this, ask yourself "How would I make this easier to try?" The only reasonable answer is providing the OP's own API key, which is undesirable.

kristopolous · on March 30, 2023

A video demonstration, cleaner example of what it is, etc... You can experience it by observation

jerrygoyal · on March 30, 2023

could you please add some screenshots of how it looks

kami8845 · on March 30, 2023

will do!