CLI tools for working with ChatGPT and other LLMs (simonwillison.net)
204 points by simonw on May 18, 2023 | hide | past | favorite | 51 comments


I've had a positive experience building a ChatGPT shell for Emacs [1]. Not having to context switch between the editor and browser is great. With Emacs being a text paradise, there are all sorts of possible integrations, like babel integration to elisp [2] or SwiftUI [3].

In addition to a shell, functions for inserting GPT responses can be pretty neat too. For example, creating org tables [4].

[1]: https://xenodium.com/chatgpt-shell-available-on-melpa

[2]: https://xenodium.com/images/chatgpt-shell-available-on-melpa...

[3]: https://xenodium.com/images/chatgpt-shell-available-on-melpa...

[4]: https://raw.githubusercontent.com/xenodium/chatgpt-shell/mai...


I’ve not seen that SwiftUI code block tool before — I’m in absolute love. Mind linking some of your configuration or the tool names?

EDIT: my mistake, the first link points it out as https://github.com/xenodium/ob-swiftui. Thanks for sharing!


Check https://xenodium.com/ob-swiftui-updates for the latest changes. I haven't gotten around to updating the README on the project page.


This requires an API key. By any chance, do you know of something that can be used in Emacs without a key?


There's not really a sane way to communicate with OpenAI/ChatGPT without using an API key.


There are plenty (perhaps far too many) tools that are basically `curl` wrappers around OpenAI. Local LLM tools are what's needed, however; they are much better for deploying systems over terabytes of data at a fraction of the cost.


Yeah, that's on my roadmap for "llm" (hence the name) - I want to be able to use the same tool to execute against local models as well.

Everything that goes through the tool can be logged to SQLite so this should make it easier to build up comparisons of different models.
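Once every prompt/response pair lands in SQLite, comparing models is a plain SQL query. Here's a self-contained sketch of the idea — note the table and column names are illustrative, not llm's actual schema:

```python
import sqlite3

# In-memory stand-in for a prompt/response log. The real log would be a file
# on disk, and llm's own schema may differ from this illustrative one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (model TEXT, prompt TEXT, response TEXT)")
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?)",
    [
        ("gpt-3.5-turbo", "2+2?", "4"),
        ("local-vicuna", "2+2?", "The answer is 4."),
    ],
)

# Line up every model's answer to the same prompt, side by side.
pairs = conn.execute(
    "SELECT model, response FROM log WHERE prompt = '2+2?' ORDER BY model"
).fetchall()
print(pairs)
```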


Does it work with LocalAI [1] if you change the openai.api_base value to http://localhost:8080/ ?

1. https://github.com/go-skynet/LocalAI
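In principle that should work for any tool built on the 2023-era openai Python library, which reads its base URL from the environment at import time — a hedged sketch (whether a given tool picks these up depends on how it configures the client):

```python
import os

# The openai Python library (0.x series) honours these environment variables,
# so exporting them redirects any tool built on it -- assuming the tool does
# not hard-code its own base URL. LocalAI listens on port 8080 by default and
# serves an OpenAI-compatible API under /v1.
os.environ["OPENAI_API_BASE"] = "http://localhost:8080/v1"
os.environ["OPENAI_API_KEY"] = "not-needed-locally"  # LocalAI ignores the key
```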


+1 for more llms and local llms


  $ curl -s https://news.ycombinator.com | strip-tags | ttok -t 4000 | llm --system 'summary bullet points' -s


I use the aichat [1] command line tool a lot for these sort of ad hoc chats. It takes piped input and has nice configurability for setting up a variety of system prompts ("roles"), etc.

If you want to use GPT-4 to manipulate and edit files in your local file system, you can use my cli tool aider [2]. It’s intended for generating and editing code, but you can use it to chat with GPT-4 to read, edit and write any text files in your local. If the files are under git source control, it will commit the changes as they happen as well.

Here’s a transcript of aider editing the ANSI-escape codes in an asciinema screencast recording, for example[3].

[1] https://github.com/sigoden/aichat

[2] https://github.com/paul-gauthier/aider

[3] https://aider.chat/examples/asciinema.html


I wonder how many different ways people use to do basic ChatGPT queries.

My preferred method is to run a WhatsApp bot, so I can easily use the LLM on my phone too. On a computer I just use WhatsApp Web, which I keep running anyway. This method also natively supports iterated conversations.

That, plus some scripts for repetitive stuff.


If you haven't heard, there's an official iOS app[1], so that's probably a far more efficient/private alternative to a custom bot.

[1]: https://apps.apple.com/us/app/openai-chatgpt/id6448311069


That sounds great! Can you share some docs on the WhatsApp bot? IIRC, those APIs were only available to businesses and not individuals.


The OpenAI API is available to everyone. I've spent well over $100 just trying various things out over the past two months, and I wasn't trying to economize - you can do quite a lot even on $10. Just make sure to do some napkin math before you hit some endpoint many times. For example, it's a lot easier to spend a lot on DALL-E than on GPT-3.5.
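That napkin math is easy to make concrete. A quick sketch using the spring-2023 list prices (verify against the current pricing page before relying on these numbers):

```python
# Spring-2023 list prices (assumptions -- check OpenAI's pricing page):
GPT35_PER_1K_TOKENS = 0.002   # $ per 1K tokens, gpt-3.5-turbo
DALLE_PER_IMAGE = 0.020       # $ per 1024x1024 DALL-E image

# A long chat exchange: roughly 2,000 tokens of prompt plus completion.
chat_cost = 2000 / 1000 * GPT35_PER_1K_TOKENS   # $0.004

# Ten DALL-E images.
image_cost = 10 * DALLE_PER_IMAGE               # $0.20

# The same $0.20 buys roughly 100K tokens of gpt-3.5-turbo.
tokens_for_same_price = image_cost / GPT35_PER_1K_TOKENS * 1000

print(f"chat: ${chat_cost:.3f}, images: ${image_cost:.2f}")
```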


Sorry for not being clear - I was referring to the Whatsapp Bot API :)


Ah. The thing I'm using runs some kind of headless Chrome that drives WhatsApp Web: https://github.com/pedroslopez/whatsapp-web.js

From what I understand, this might get killed by Facebook at some point, as they never approved this method.


What was the issue with the official API? https://developers.facebook.com/docs/whatsapp/


It's not available for personal use and requires business verification if I understand correctly: https://developers.facebook.com/docs/whatsapp/overview#terms...


No idea, didn't write it, but it Just Works.


Only wimps use APIs.


To have an LLM on my Android I prefer to use Termux, since WhatsApp and its API are a hassle.


How do you do this? I'm interested in learning more about it. Any documentation would be awesome.


Try a Google search for GitHub projects that do that - or really any other GPT idea. People are building many copies of everything, so I'm not even going to recommend the one I'm using, because there's probably a better one already :). It's simple code, so you can also modify it to your liking.


There's an awesome list for BYOK (bring your own key) projects here: https://github.com/reorx/awesome-chatgpt-api#cli


Charmbracelet recently released 'mods', which has some cool ideas around Unix pipes.

https://github.com/charmbracelet/mods


How prevalent is GPT-4 API access? I feel like I've been on the waiting list forever, yet this tool has GPT-4 as the default.


GPT-4 isn't the default - it uses gpt-3.5-turbo by default, because that's massively cheaper.

If you want to run against GPT-4 (and your API key has access) you can pass "-4" or "--gpt4" as an option.

CORRECTION: Sorry, I was talking about my "llm" tool - https://github.com/simonw/llm - it looks like "mods" does indeed default to 4: https://github.com/charmbracelet/mods/blob/e6352fdd8487ff8fc...


I keep plugging my own… yet another API invoker, with parallel queries, templates, and config files, written in Go: https://github.com/tbiehn/thoughtloom - it has some interesting examples, but I expect the population of users to be constrained to the five of us who enjoy CLIs, jq, and writing bash scripts.


I really like that these tools are designed so you can easily pipe between them - a good way to make things composable. Also really cool to see all the other CLI tools folks have posted here; lots I wasn't aware of.

I've been experimenting with CLI/LLM tools and found my favorite approach is to make the LLM constantly accessible in my shell. The way I do this is to add a transparent wrapper around whatever your shell is (bash, zsh, etc.), send commands that start with capital letters to ChatGPT, and manage a history of local commands and GPT responses. This means you can ask questions about a command's output, autocomplete based on ChatGPT suggestions, etc.

You can see this approach here, I hope it proves useful to other folks! https://github.com/bakks/butterfish
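The capital-letter dispatch rule described above is simple enough to sketch. A toy version (butterfish's real routing logic is its own; this just illustrates the idea):

```python
def route(line: str) -> str:
    """Decide where a line of input goes: lines that start with a capital
    letter are treated as natural-language prompts for the LLM, everything
    else is passed through to the shell."""
    stripped = line.lstrip()
    if stripped and stripped[0].isupper():
        return "llm"
    return "shell"

print(route("What does this error mean?"))  # a question for the model
print(route("ls -la"))                      # an ordinary shell command
```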


Another enthusiastic vote for https://github.com/charmbracelet/mods - this is precisely the UX I was looking and waiting for. The day I cloned it and started using it within my terminal was the day I no longer needed to even window out to Firefox - and it feels very natural to compose with pipes, wrap into shell scripts, etc.

Early days, but you can see some of the ways this is already helping me out quite a bit (and increasing my enjoyment of things I already like to do): github.com/zackproser/automations


It's still a bit hacky in the current PyPI version of LMQL, but you can also use it from the command line, just like `python -c`:

  echo "Who are you?" | lmql run "argmax '\"Q:{await input()} A:[RESULT]';print(RESULT) from 'chatgpt'" --no-realtime
Gives you: I am an AI language model created by OpenAI.

I am one of the LMQL devs, and we plan to add a somewhat more seamless CLI interface, e.g. to support processing multiple lines of text (e.g. quick classification tasks).


I wanted a simple chat history and rudimentary web searches in the terminal, so I wrote my own [0] (bring your own OpenAI API key).

It was a very novel experience writing the API for the simple "tools" in plain English in the system prompt (e.g. to search the web, or read a website), though I never managed to make GPT-4 successfully use the "execute JavaScript" one.

[0] https://github.com/neon-fish/cass
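For anyone curious what "tools described in plain English in the system prompt" can look like, here is a hedged sketch of the pattern. The prompt wording and the `TOOL:` convention are invented for illustration; cass's actual format may differ:

```python
# Hypothetical system prompt: the model is told, in prose, how to request a
# tool call, and the wrapper script parses replies for that convention.
SYSTEM_PROMPT = """\
You can use tools by replying with exactly one line:
  TOOL:search <query>   -- search the web
  TOOL:read <url>       -- fetch and read a web page
Otherwise, answer the user normally."""

def parse_tool_call(reply: str):
    """Return (tool_name, argument) if the model asked for a tool, else None."""
    prefix = "TOOL:"
    if reply.startswith(prefix):
        name, _, arg = reply[len(prefix):].partition(" ")
        return name, arg
    return None

print(parse_tool_call("TOOL:search weather in Oslo"))
print(parse_tool_call("The answer is 4."))
```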


There is also ShellGPT: https://github.com/TheR1D/shell_gpt/


I have a basic Python script running in one of my CLI tabs to talk to OpenAI. It's not even 50 lines. It's basically just a while loop for user input, which it sends to the ChatGPT API, printing the response. Add a try/catch for rate limits and connection issues and that's it.

It's really nice to have an always-open ChatGPT equivalent in one of my terminal tabs that I can switch to at any time.
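A script in that spirit does indeed fit comfortably under 50 lines. Here's a sketch using the 2023-era openai-python interface (`openai.ChatCompletion` was the current API at the time of this thread); the helper names are my own:

```python
def add_turn(messages, role, content):
    """Append one turn to the running conversation history."""
    messages.append({"role": role, "content": content})
    return messages

def chat_loop():
    import openai  # pip install openai; expects OPENAI_API_KEY in the environment

    messages = []
    while True:
        try:
            user = input("> ")
        except (EOFError, KeyboardInterrupt):
            break
        add_turn(messages, "user", user)
        try:
            resp = openai.ChatCompletion.create(
                model="gpt-3.5-turbo", messages=messages
            )
        except openai.error.OpenAIError as e:  # rate limits, connection issues
            print(f"(error: {e})")
            continue
        reply = resp.choices[0].message.content
        add_turn(messages, "assistant", reply)
        print(reply)
```

Keeping the full `messages` list between turns is what makes it a conversation rather than a series of one-off prompts.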


I made a small wrapper too - you call it from the command line and it keeps the conversation context for 15 minutes in a temp file.

https://github.com/mcdallas/gptask
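The expiring temp-file context is a nice trick that generalizes well. An illustrative sketch (gptask's actual file format is its own; the names and layout here are invented):

```python
import json
import os
import tempfile
import time

CONTEXT_TTL = 15 * 60  # seconds; mirrors the 15-minute window described above
CONTEXT_PATH = os.path.join(tempfile.gettempdir(), "gpt-context.json")

def load_context(now=None):
    """Return the saved message list if the context file is still fresh,
    otherwise start a new conversation."""
    now = time.time() if now is None else now
    try:
        with open(CONTEXT_PATH) as f:
            saved = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return []
    if now - saved["timestamp"] > CONTEXT_TTL:
        return []  # context expired -- fresh conversation
    return saved["messages"]

def save_context(messages):
    """Persist the conversation along with the current timestamp."""
    with open(CONTEXT_PATH, "w") as f:
        json.dump({"timestamp": time.time(), "messages": messages}, f)
```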


I just keep mine in memory for simplicity, but it’s nice that you can restart old ones.


Same here, pretty much. I didn't even write it. I asked GPT-4 for the code from the API playground, and it just worked.


Very cool, but I'm curious whether you see people directly interacting with LLMs, versus calling them from a script as part of a larger application? I find myself needing debugging, output visualization, etc. so often that an IDE makes more sense to me as an interface, so I want to learn about cases where it doesn't.


I’ve been using a Jupyter notebook from vscode as my primary interface to GPT lately. Ticks all the boxes for me.


Is there a plugin you are using to do this?


I recently created a simple TUI for ChatGPT optimized for fast access. Currently it's optimized for macOS. I use it daily; feedback welcome.

link: https://github.com/tcrensink/chat_term


Can you elaborate on what you mean by "optimized for fast access" and "optimized for macOS"?

I took a brief look at your code, and it just looks like run-of-the-mill Python and bash, with an integration locked to tmux.


That's right. TUI usability depends on startup speed, and Python scripts are slow to start. The "fast access" is the tmux integration, so it runs in the background. As for "optimized for macOS": it works on Linux, but there are a few bugs to iron out.


So, all in all, what you've built is a daemon, because Python scripts are slow to start?

Have you run benchmarks comparing the time it takes to show the full response of a prompt with a daemon and without?

I'm a bit skeptical that loading the amount of code you have into memory takes that long, but I'm coming from a Node.js background.
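The startup-cost question is easy to measure directly. A minimal benchmark of interpreter cold start — note a real TUI would also pay for its own imports (HTTP client, UI library) on top of this baseline:

```python
import subprocess
import sys
import time

# Spawn a Python interpreter that does nothing and time how long that takes.
# This is the floor any non-daemonized Python CLI pays on every invocation.
start = time.perf_counter()
subprocess.run([sys.executable, "-c", "pass"], check=True)
elapsed = time.perf_counter() - start
print(f"cold interpreter start: {elapsed * 1000:.0f} ms")
```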


I made a TypeScript-based CLI and package [0] that you can import into projects, extend [1], and get metrics from [2]. Hopefully others can find this useful. It has built-in response validation, lots of configurable options, and is fully tested.

[0] https://github.com/keybittech/wizapp

[1] https://github.com/jcmccormick/wc/blob/main/api/src/modules/...

[2] https://gist.githubusercontent.com/jcmccormick/38b5527c16479...


I made a bash script (using rofi) to use ChatGPT, if anyone is interested.

https://github.com/ilse-langnar/bashGPT


Me too. OpenAI anyway.

I would like to get API access to more models - I'm getting RSI from filling out the application emails.

It is fun though - being able to build what I want to use, and get what I need.


I always like the tools this author builds; they are very useful. A question on the strip-tags tool: can it be used in a general way, to extract content from any page?


Been waiting for something like this rather than my haphazard collection of scripts.


phronesitron is one I made :)



