Essentially, I wrote a small browser extension that takes the content of LinkedIn, Twitter, and YouTube posts/titles and filters them out based on whether they are clickbait, low effort, etc.
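A minimal sketch of how such a filter could work, assuming a prompt-a-local-model-per-post approach; the prompt wording, labels, and fail-open behavior here are illustrative, not the extension's actual code:

```python
def build_prompt(title):
    """Build a classification prompt for a local LLM (illustrative wording)."""
    return (
        "Classify the following social media post title as exactly one of: "
        "CLICKBAIT, LOW_EFFORT, OK.\n"
        f"Title: {title}\n"
        "Label:"
    )

def parse_label(completion):
    """Extract the first recognized label from the model's completion."""
    for label in ("CLICKBAIT", "LOW_EFFORT", "OK"):
        if label in completion.upper():
            return label
    return "OK"  # fail open: show the post if the model's answer is unclear
```

The hide/show decision then just keys off the parsed label, so a flaky model answer degrades to showing the post rather than hiding it.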
The thing that made the initial ChatGPT refreshing was the lack of ads; it wasn't trying to sell you anything. This obviously will not continue; commercial pressures will direct AI efforts toward being a better ad pusher.
So the AI of the social media sites will end up trying to get the crap past your local AI filters, in a big AI arms race :)
I would say, bring it on! Nothing will make it past my phi-2 or mistral-7B-v0.1 ^^, at least for now.
I think what this could lead to is a homogenization of the content-serving layer, since all you'd really need is to get content past the user's filters, which follow them from one site to the other, with the display layer becoming less relevant (and less differentiating). But let's see, exciting times.
That's awesome, I want to do something similar: categorize the content in social media, so I can choose what to see when I want. Sometimes I want to avoid politics, sometimes I'm ok with it, for example. Sometimes I want to see only content about game development.
What's your plan with your project, will you turn it into a product for others, open source it, or neither? I would love it if it was either of the former!
Thank you very much for the supportive words; I've been getting lots of positive feedback on this. The end-of-year workload, though, means I need to be mindful of my time. I think one of the first two options will be the way to go.
I'll post an update here as I always do with my small projects.
It has to be the auto-playing Tomb Raider agent, where LLMs were used to give Lara self-awareness. I've never seen anything like it.
It starts off with some classical computer vision shenanigans to understand the character movement and map layout, and to create the 'desire' to explore. Then the LLM is given images, sound descriptions, and prior thoughts as input, letting Lara remark on the situation, which feels very surreal and, at least for me, very unexpected. E.g. she hears the wolves howl and wonders how they survived in this environment. Or makes meta-remarks on game music changes.
Worth pointing out that most of that video is fake[1]. Though it and its debunking video are still a great example of how to make entertaining fictional content with a little help from AI. It probably won't be too long before somebody builds something like that for real; similar AI mods[2] for Skyrim are already out.
I'm attempting to create a frequency list of words for language learners. (In Japanese.)
Commonly, these lists are based on just which word appears in the text at the "surface" level. However, words commonly have multiple "senses," or nuances of meaning, in which they are used. Dictionaries list these senses, but it has traditionally been hard to disambiguate which sense a word is used in, given a usage in text.
LLMs make this feasible, so I'm attempting to create a word sense/usage frequency list.
Consider using fastText's word vectors. They cover a bunch of languages, come pre-sorted by frequency, and are sufficient for basic word sense work. Perhaps use an LLM to automate some of the disambiguation.
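Since the published fastText `.vec` files are plain text (a `vocab_size dim` header line, then one word plus its vector per line, in descending frequency order), pulling out the top-N words needs no special library; the file name below is just an example of fastText's published naming:

```python
def top_words(vec_path, n=1000):
    """Read the n most frequent words from a fastText .vec file."""
    words = []
    with open(vec_path, encoding="utf-8") as f:
        next(f)  # skip the "vocab_size dim" header line
        for line in f:
            words.append(line.split(" ", 1)[0])  # word is the first token
            if len(words) >= n:
                break
    return words

# e.g. top_words("cc.ja.300.vec", 5000) for the Japanese common-crawl vectors
```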
That’s a great idea. I hope it can be done for other languages, too.
I used to help prepare study materials for Japanese learners of English. The other editors and I would try to adjust the vocabulary to keep it at an appropriate level for the target learners. Word-frequency lists provided some guidance, but they showed only how often words appeared in the surveyed texts, not the meanings in which they were used. The word “medium,” for example, might have a fairly high frequency, but could we expect the learners to know the meanings “a substance through which a force travels” or “someone who claims to have the power to receive messages from dead people”?
A similar problem was with multiword idioms. The verb “make” is one of the most common words in English, but how common are “make it,” “make do,” “make up,” “make away with,” or “make out”? Ten years ago, I was unable to find any reliable answers. We had to rely on our gut feelings.
Good luck with your project. LLMs should be a big help.
Thank you! Yep, multi-word idioms are tough. How do you quantify whether a phrase is just the "sum" of its words, or whether there is some additional meaning, some "idiomness," to it? I haven't thought a lot about that yet, but it's a problem I need to solve for this.
If you’d like to discuss these issues, feel free to get in touch. My website URL is on my profile page. I’m not a programmer or expert on natural language processing, but I have worked on over a dozen Japanese-English and English-Japanese dictionaries and enjoy thinking about such problems.
Basically, I have a big corpus of text (novels, as I'm interested in getting the learners to read) and a dictionary. I annotate the words using the dictionary, then give the LLM the text context, the target word, and the possible dictionary definitions as input, and let it select or score which definitions could be considered to "apply" given the context. Finally, I tally the counts.
The disambiguated senses are provided by the dictionary. Does that answer your question?
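If I understand the pipeline correctly, it can be sketched roughly like this; the `ask_llm` scoring step is stubbed out, and the function names and prompt wording are illustrative, not the author's actual code:

```python
from collections import Counter

def disambiguate(context, word, senses, ask_llm):
    """Ask an LLM which dictionary sense of `word` applies in `context`.

    `ask_llm(prompt) -> str` is any completion function; it is expected
    to answer with the number of the chosen sense.
    """
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(senses, 1))
    prompt = (
        f"Context: {context}\n"
        f"Target word: {word}\n"
        f"Dictionary senses:\n{numbered}\n"
        "Answer with the number of the sense that applies:"
    )
    reply = ask_llm(prompt)
    digits = "".join(c for c in reply if c.isdigit())
    idx = int(digits) - 1 if digits else 0
    return senses[idx] if 0 <= idx < len(senses) else senses[0]

def tally(occurrences, ask_llm):
    """occurrences: iterable of (context, word, senses) tuples."""
    counts = Counter()
    for context, word, senses in occurrences:
        counts[(word, disambiguate(context, word, senses, ask_llm))] += 1
    return counts
```

A scoring variant would ask for a weight per sense instead of a single number, at the cost of more parsing.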
How about the highest frequency phrases and variations?
As a language learner, I've found high-frequency word lists not to be that useful. A word is too atomic a unit, devoid of context. Memorizing word lists doesn't lead to speaking a language, but learning phrases often does. Even better is to learn phrases within a context, like a restaurant or a lecture.
LLMs might actually add value here. Word frequencies are simply statistical counts, but finding common phrases is a more complicated problem, and the LLM's structure (attention) might actually be the solution.
(I actually asked ChatGPT-4 about this today. I asked it to tell me the highest-value phrases I should learn if I'm in a restaurant. I also asked it to break down phrases for me, give me a lesson on conjugations, etc.)
Ah, yeah, totally! The whole point of this exercise is to ascend from the level of "words" to the level of "units of meaning." These commonly consist not of single words but of phrases.
Also, you are absolutely correct that learning "atomic units" in isolation is not good practice. What I'm thinking here is to build some tools to collect the data for the "what." The "how" of the learning needs to happen in context.
I've recently been experimenting with training LLMs on the personal corpus of a dear family friend who passed recently, with the intent to eventually embed the device in his tombstone up north so that people can come and commune with him.
He was a well-known tarot reader, mystic and Haskeller in the northern Finnish community; without his help it's very likely I would have been deported from the country before I could get my passport sorted out. We came up with this plan together before he passed mostly out of a really weird shared sense of humor.
I was overwhelmed by the pace of AI news and papers coming out, so I built an automated HN news monitoring service that delivers relevant news straight to my inbox or my RSS feed: https://www.kadoa.com/hacksnack
It uses LLMs to extract, summarize, and tag the front page articles and classify the different perspectives in the comments.
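On the prompting side, the summarize-and-tag step could look roughly like this; the tag set, prompt wording, and reply format are made up for illustration and are not the actual kadoa.com pipeline:

```python
ALLOWED_TAGS = ("LLMs", "hardware", "security", "startups")  # illustrative tag set

def tagging_prompt(title, article_text, tags=ALLOWED_TAGS):
    """Build one prompt asking for a summary and tags in a fixed reply shape."""
    return (
        "You label Hacker News articles.\n"
        f"Allowed tags: {', '.join(tags)}\n"
        f"Title: {title}\n"
        f"Article: {article_text[:4000]}\n"  # truncate to stay within context
        "Reply as two lines:\n"
        "SUMMARY: <one sentence>\n"
        "TAGS: <comma-separated subset of the allowed tags>"
    )

def parse_tags(reply, allowed=ALLOWED_TAGS):
    """Keep only tags from the allowed set, ignoring case and spacing."""
    for line in reply.splitlines():
        if line.upper().startswith("TAGS:"):
            raw = {t.strip().lower() for t in line.split(":", 1)[1].split(",")}
            return [t for t in allowed if t.lower() in raw]
    return []
```

Constraining the reply to a fixed shape and then filtering against a closed tag set keeps hallucinated tags out of the feed.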
Maggie Appleton shared some interesting ideas a few months ago[1]. I especially find the "Branches" concept interesting: just the idea of exploring multiple paths from a starting point in parallel.
Shit... this is exactly what I've been thinking of for the past few months! Those examples with the UI solve so many problems, and I just love the ideas!
I use GPT-4 to generate a poem as a reward for solving the daily puzzle at https://squareword.org The poem is based on words from the puzzle.
It usually manages to create a reasonably coherent and amusing poem from up to 10 completely random words, something I would struggle to do myself. People tell me they enjoy them, although some of the poems turn out a bit odd haha.
We’re playing with embodied LLMs that can externalise thoughts in a virtual environment. The idea is to help facilitate knowledge work.
It’s not our main area of interest, but it’s been interesting to experiment with how human/machine and machine/machine interactions work in real-time when you limit how fast agents can move or write. It's much easier to engage in a dialogue with agents that can't create / move tens of sticky notes and graphics faster than you can create one.
Note on that page: This model is uncensored. I have filtered the dataset to remove alignment and bias. This makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant to any requests, even unethical ones. Please read my blog post about uncensored models. https://erichartford.com/uncensored-models You are responsible for any content you create using this model. Enjoy responsibly.
Why can't the bullet points just be used as is? Either they contain enough signal, or they don't, and an LLM won't help anyway.
I fear everything will be expanded by LLMs soon. "Write an email, three paragraphs, about X," instead of just sending X directly. Then the receiver gets a wall of text and uses an LLM to distill it back to X' before reading. Just hope too much didn't get lost in the inverse compression through the LLM.
I use copilot in emacs, and running "git commit -v" puts the diff in my emacs(client) buffer with copilot on and it's not terrible at describing the changes.
A lot of times it'll even guess the JIRA ticket number from the diff or the branch name.
Currently GitButler only generates commit messages in this way (with some config options for style e.g. semantic commits). With that said, generating PR descriptions is something I was tinkering with this morning.
For me: JSON and YAML formatting and analysis. ChatGPT is pretty decent at the following real work tasks, which I used to use less robust tooling for:
- pretty-printing and indenting a "json-like" string (e.g. a Python object str) from a log, or JSON with typos (extra commas, wrong quotes, imbalanced brackets…), with a summary of the errors at the end.
- giving a verbal description (numerically listed) of the changes between two commits of a YAML file, especially when the order has changed, making git diff hard to read.
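For the narrow case where the "json-like" string is actually a valid Python repr (single quotes, `None`, no typos), a non-LLM baseline still works; anything genuinely malformed is where the LLM earns its keep:

```python
import ast
import json

# A Python repr pulled from a log: not valid JSON (quotes, None).
raw = "{'status': 'ok', 'retries': 3, 'meta': None}"

data = ast.literal_eval(raw)          # safely evaluate Python literals, no code execution
print(json.dumps(data, indent=2))     # None becomes null, quotes become double quotes
```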
Well, it's part chat bot, so I don't know if it meets your criteria. But we're using them for a LOT of things behind the scenes to help kids find content they love that their parents approve of.
[HelloWonder.ai](https://hellowonder.ai)
The front end looks like a chat bot, but on the backend we're using LLMs to find, parse, rate, classify, and rephrase content on the fly for individuals.
Me too, initially in a chat with GPT-4 [0] and then in a (private for now) wrapper that sends me a text message when analysis is complete, sums up the day's meals, and compares to my total calories burned per Apple Watch.
Very cool! :) Just went over the article, and this is close to how I use it. I implemented it in an iPhone app and added some RAG tricks. Let me know if you want to try it out.
Yes, from descriptions. Also, I often have a rough idea of how many calories something has. So one of the main features is that you can say: "Protein shake with 140 cals and 35 grams of protein, remember as P1," and then whenever I have the same thing again I just type P1.
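A toy version of that shortcut feature might look like this; the regex and reply strings are illustrative sketches, not the app's actual parsing:

```python
import re

aliases = {}  # alias name -> (calories, protein grams)

def handle(message):
    """Either register a new alias or expand an existing one."""
    m = re.match(
        r".*?(\d+)\s*cals.*?(\d+)\s*grams of protein.*remember as (\w+)",
        message, re.IGNORECASE,
    )
    if m:
        cals, protein, name = int(m.group(1)), int(m.group(2)), m.group(3).upper()
        aliases[name] = (cals, protein)
        return f"Saved {name}: {cals} kcal, {protein} g protein"
    key = message.strip().upper()
    if key in aliases:
        cals, protein = aliases[key]
        return f"Logged {cals} kcal, {protein} g protein"
    return "Unrecognized entry"
```

In practice the "unrecognized" branch is presumably where the LLM takes over and estimates the macros from the free-text description.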
I have an iPhone app I have been using for half a year (and lost 10 kg); if there is interest, write me an email (in bio) and I might release it then ;)
Thankfully humans are great at pattern matching and it's trivial for me to "vet this ideation":
LLMs are notorious for getting subtleties wrong, and in legal agreements like terms of employment the subtleties are often of material import. Therefore this is a bad idea.
If you don't want to read/don't understand the terms of your job offer then pay a lawyer. Asking JobOfferGPT is just asking for trouble.
Reading it yourself (at whatever level you are capable of and can tolerate), followed by asking an LLM to highlight any areas of the terms that are non-standard, may cause concern, could be restrictive, or might cost you later, could help identify subtleties you might have missed.
Certainly it’d seem no worse and possibly better than just reading the terms, especially as a layperson.
Honeycomb is an OpenTelemetry tool that has a complicated search UI. They also have a text box you can use to have it query your data for you; it basically just drives the filtering and group-by UI. It's really cool because it makes the UI simpler to use; worst case, it might set the wrong filter.
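That "drive the UI" pattern is attractive because the LLM only has to emit a small structured spec the existing UI already understands. A sketch of what consuming such a spec could look like, with a made-up schema (this is not Honeycomb's actual format):

```python
import json

def apply_spec(rows, spec_json):
    """Apply an LLM-emitted filter/group-by spec to in-memory rows."""
    spec = json.loads(spec_json)
    # Keep rows matching every equality filter.
    rows = [r for r in rows
            if all(r.get(f["field"]) == f["equals"] for f in spec.get("filters", []))]
    # Count surviving rows per group-by key.
    groups = {}
    for r in rows:
        key = tuple(r.get(g) for g in spec.get("group_by", []))
        groups[key] = groups.get(key, 0) + 1
    return groups

rows = [
    {"service": "api", "status": 500},
    {"service": "api", "status": 200},
    {"service": "web", "status": 500},
]
spec = '{"filters": [{"field": "status", "equals": 500}], "group_by": ["service"]}'
result = apply_spec(rows, spec)
```

Because the spec is validated data rather than executable code, a wrong LLM answer degrades to a wrong filter, exactly the failure mode described above.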