Show HN: WhatsApp-Llama: A clone of yourself from your WhatsApp conversations (github.com/ads-cmu)
124 points by advaith08 on Sept 9, 2023 | hide | past | favorite | 51 comments
Hello HN!

I've been thinking about the idea of an LLM that's a clone of me - instead of generating replies to be a helpful assistant, it generates replies that are exactly like mine. The concept has appeared in fiction numerous times (the talking paintings in Harry Potter that mimic the person painted, the clones in The Prestige), and with LLMs, I think we might actually be able to do something like this!

I've just released a fork of facebookresearch/llama-recipes which allows you to fine-tune a Llama model on your personal WhatsApp conversations. This adaptation can train the model (using QLoRA) to respond in a way that's eerily similar to your own texting style.

What I've figured out so far:

Quick Learning: The model quickly adapts to personal nuances, emoji usage, and phrases that you use. I've trained for just 1 epoch on a P100 GPU using QLoRA and 4-bit quantization, and it's already captured my mannerisms.

Turing Tests: As an experiment, I had my friends each ask me 3 questions, and I replied with 2 candidate responses (one from me and one from Llama). My friends then had to guess which response was mine and which was Llama's. Llama managed to fool 10% of my friends, but with more compute, I think it can do way better.

Here's the GitHub repository: https://github.com/Ads-cmu/WhatsApp-Llama/

Would love to hear feedback, suggestions, and any cool experiences if you decide to give it a try! I'd love to see how far we can push this by training bigger models for more epochs (I ran out of compute credits)
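Since the post describes fine-tuning on your own chats, here's a rough sketch of how exported message pairs might be turned into Llama 2 chat-format training strings. The `[INST]`/`<s>` template follows Llama 2's chat convention, but the pairing of messages here is an illustrative assumption, not the repo's exact preprocessing:

```python
# Sketch: turn (friend message, my reply) pairs into Llama-2-chat-style
# training strings. The repo's actual preprocessing may differ; this just
# illustrates the [INST] ... [/INST] template Llama 2 chat models expect.

def to_llama2_example(their_msg: str, my_reply: str, system: str = "") -> str:
    sys_block = f"<<SYS>>\n{system}\n<</SYS>>\n\n" if system else ""
    return f"<s>[INST] {sys_block}{their_msg} [/INST] {my_reply} </s>"

pairs = [
    ("want to grab lunch?", "yess 1pm?"),
    ("did you watch the match", "noo missed it, who won"),
]
examples = [to_llama2_example(t, m) for t, m in pairs]
```

A tokenizer's chat-template helper (where available) would be the more robust route, but the string form makes the target format easy to eyeball.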



I wonder if my clone will also respond 3 days later as if nothing happened


You are being very generous. Unless it's something dire, a weekly reply to texts/WhatsApp is a good frequency to keep/advance world peace.


> The concept's appeared in fiction numerous times (the talking paintings in Harry Potter that mimic the person painted, the clones in The Prestige)

How is your most notable example not when Gilfoyle does exactly this so he doesn’t have to talk to Dinesh in Silicon Valley??


Hahaha, a friend told me this, but I haven't watched SV. Will do so immediately.



How is the most notable example of this not the Dixie Flatline in Neuromancer?


I was immediately reminded of this black mirror episode:

https://en.m.wikipedia.org/wiki/Be_Right_Back


Llama 7B is quite dumb. Using the 13B model you'd get significantly better results, and you can train a QLoRA on a single 3090 (I think even less is possible, but I'm not sure).


Oh yeah, definitely. Do you know how I can get access to one for cheap, though? I burnt through $150 just on this exercise with a P100 on GCP.


I'd love to see an update for 13B, and I can vouch that vast.ai prices are very good.


Ooof. I'd expect this to cost like 5 bucks on runpod using a single 3090.

I use axolotl for training. I didn't check your notebook, but axolotl likely comes with more optimized defaults for speed and VRAM than what you're doing.


vast.ai

Yeah, GCP GPU prices are terrible. $150 for a short time on a P100 is highway robbery.

TPUs are better, but still kinda pricey.


You said that the model fooled your friends 10% of the time.

I wonder how well ChatGPT or Llama 2 would do given just the last 5 messages of each conversation and asked to generate the next reply pretending to be you…

Somehow I don’t think it would be worse?


Yeah, I wondered if few-shot prompting would yield better results than fine-tuning. For the amount of fine-tuning I've done (1 epoch, 7B model with 4-bit quantization), I think it might be comparable. But if we scale this to a bigger model and longer training times, I think fine-tuning should produce much better results. Hoping someone with access to compute will try it out and update us!
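The few-shot alternative discussed here can be sketched as a simple prompt builder: stuff the last few messages into the context and ask the model to continue as "me". The prompt wording and the `(sender, text)` message format are illustrative assumptions, not anything from the repo:

```python
# Sketch: build a few-shot prompt from the last n messages of a chat,
# instead of fine-tuning. Works with any instruction-tuned model API.

def build_fewshot_prompt(history: list[tuple[str, str]], new_msg: str,
                         n_last: int = 5) -> str:
    # Keep only the most recent n_last messages as style examples.
    lines = [f"{sender}: {text}" for sender, text in history[-n_last:]]
    transcript = "\n".join(lines)
    return (
        "Below is a WhatsApp conversation. Reply to the last message "
        "exactly in Me's texting style.\n\n"
        f"{transcript}\nFriend: {new_msg}\nMe:"
    )

history = [("Friend", "yo"), ("Me", "heyy"), ("Friend", "free tonight?"),
           ("Me", "yeah after 8"), ("Friend", "cool, usual place?")]
prompt = build_fewshot_prompt(history, "see you at 8 then?")
```

This captures surface style from a handful of examples but nothing from the rest of the chat history, which is the trade-off the fine-tuning approach is meant to address.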


> You said that the model fooled your friends 10% of the time.

I am imagining some nerds actively changing their real behaviour to become closer to ChatGPT so that others will more likely believe it when they take a break and hand over their work/comms to AI tools :)


We are very close to where AI tech can replicate Harry Potter portraits


askdumbledore.org was done a few months ago in fact


I can't seem to find a link to the repo or documentation. I'm very curious how this is achieved. Is a model fine-tuned? If so, how? I'd be most interested in seeing how they formatted the input data if fine-tuning was done.


I used GPT API with a custom prompt :)


Nice. I remember thinking of doing something like this when I was much much more of a novice. I wrote a WhatsApp message parser and thought of doing this with the parsed messages. Unfortunately I knew too little back then, and Llama didn't exist either. Cool to see it!


Super cool! I had a similar idea where I wanted to create such clones of some of my friends (with consent ofc) and see how well they know me. To extend your clone even more, you can also throw in every piece of digital text you have into this, eg. emails, notes, essays, blogs etc. I'm super down to work on LLM clones like these!

edit: I actually started a little work on this. If you wanna export more messages than the 40k limit, you can use [0]. I did, and I have every text I've ever sent since I got WhatsApp.

[0]: https://github.com/YuvrajRaghuvanshiS/WhatsApp-Key-Database-...


Thank you, I'll check it out!

Yeah, this could be extended to create a "simulation game" of us and our friends. This paper (Interactive Simulacra of Human Behaviour, https://arxiv.org/abs/2304.03442) describes a setup for how we could create a Sims-like game with us as the characters.


Nice! I did something similar with GPT 3.5 and slack https://rosslazer.com/posts/fine-tuning/


Very cool. I like the introspection bit; I've realised quite a bit about my texting style from talking to Llama too. I think I'm also very "type first and think later" on WhatsApp.


Now you can edit messages up to 15 minutes after sending! At least one can on an iPhone 14.


A few years ago I did the same thing with GPT-2 on the WhatsApp conversation history between a friend and me.

So it would simulate conversations between us.

The result was hilarious yet at times uncomfortably accurate... like looking into a mirror...


Wow, that's cool! Do you have the code published somewhere?


Tap the contact's name in WhatsApp (I think it only works on a phone) and at the bottom of that screen there's Export Chat.

There was some post-processing needed to get it in this format:

    Alice: blah

    John: blah blah
Edit: OP's GitHub README has instructions for exporting & preprocessing.
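The post-processing step mentioned above can be sketched as a small parser. The regex below targets the common Android export format ("DD/MM/YY, HH:MM - Name: message"); WhatsApp's timestamp format varies by platform and locale, so treat that pattern as an assumption to adjust:

```python
import re

# Sketch: strip WhatsApp export timestamps down to "Name: message" lines.
# Lines that don't match the timestamp pattern are treated as continuations
# of the previous (multi-line) message.
LINE_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2} - (.+?): (.*)$")

def clean_export(raw: str) -> list[str]:
    out = []
    for line in raw.splitlines():
        m = LINE_RE.match(line)
        if m:                        # a new "Name: message" line
            out.append(f"{m.group(1)}: {m.group(2)}")
        elif out:                    # continuation of a multi-line message
            out[-1] += " " + line.strip()
    return out

raw = ("12/09/23, 10:15 - Alice: blah\n"
       "12/09/23, 10:16 - John: blah blah\n"
       "and a second line")
cleaned = clean_export(raw)
```

System notices ("Messages are end-to-end encrypted", "<Media omitted>") would also want filtering in practice.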

---

For fine-tuning GPT-2, I think we used this. I ran it on Google Colab. My friend ran it on his GPU; it should be doable on most modern-ish GPUs.

https://github.com/minimaxir/gpt-2-simple

I tried doing something with this a few months ago, though, and it was a bit of a hassle to get running (needed a specific Python version for some dependencies...). I forget the details, sorry!


Code for fine-tuning davinci (GPT-3): https://github.com/rchikhi/GPT-is-you


> the talking paintings in Harry Potter that mimic the person painted

I remember the moving photos in the newspapers mimicking the person.

But I thought the talking paintings were ghosts living in the paintings or something.


Awesome work, I've had the idea for a while of setting up a pipeline like this that could take input from all available sources of the person to clone their voice and image as well as dialogue.

The intent being to create digital avatars of lost loved ones to help people with the grieving process.

I know that there would be tremendous opportunity in such tech for malicious actors to do serious harm, but the stated goal is still a worthwhile endeavor.


Yeah, this is definitely a dicey ethical question. Would be interested to know what guardrails you're considering for these digital avatars, and how you'll ensure that people use them in a healthy manner and don't get dependent on them.


Right now I haven't gotten past the drawing-board phase. But I have spent more than a few hours hashing out the morality of the idea with friends and family. In the end, I always come back to the fact that the tech itself would HAVE TO BE FOSS; ipso facto, the extent of any guardrails would have to be limited to sternly worded recommendations for how it SHOULD be used. Much like the stance that the US Navy has taken with the Tor Project.

Edit:

However, I also know that the average grieving person doesn't necessarily have the skills or desire to compile source, fine-tune an LLM, etc., so there would most likely be an opportunity for people to create a paid turnkey system wherein a digital archivist helps the grieving person create the avatar. In such a system, I would probably recommend that the product be tied to:

1) a grief counselor

2) a nag system for when interaction time reaches unhealthy levels

3) some method of alerting the grief counselor directly when certain thresholds of toxic interaction are met


Cool idea. One more fictional example:

https://www.youtube.com/watch?v=IWIusSdn1e4


This is cool, although I’m guessing you need to input your conversation history manually? Or is there a way to export it from WhatsApp?


here's how you can export your chat history: https://faq.whatsapp.com/1180414079177245/?cms_platform=andr...


To batch it (I think, trying it now): https://github.com/B16f00t/whapa


WhatsApp can export last 40k messages in any chat.


Good work. But how is this useful other than for deception and trickery, besides the fun aspect of it all? Maybe I'm lacking imagination, and perhaps this type of progress in mimicking human interaction will actually push more and more people back to the IRL world of person-to-person communication.


Oh I think it could have all sorts of use cases. For example:

Let's say you need to buy a gift for a friend. If your LLM assistant is trained on all the WhatsApp chats you've had with that friend, it would understand your relationship well and would be able to recommend something accordingly

Or it could also just speed up replying to messages by suggesting responses, so you have to type less.


I find that Google Workspace saves time in predicting the end of sentences, a subject for a finished email, and informal/formal endings in the language my email is in. I'm sure more personalized predictions would be even better because I'd agree with them more frequently.


Good idea!

I expect there will be profitable businesses based on training LLMs to simulate eminent people & celebrities – on both their public utterances and their private correspondence – then charging for access to the best models.


Yeah, I think there are a lot of use cases for an assistant trained on your chat history. Given how privacy-sensitive this use case is, maybe Apple is best suited to build something like this? Hope they come out with something cool.


Black Mirror was not meant to be a manual for the future... Just watch S02E01 from 2013(!). I know LLMs are not quite there yet, but still.


I'd love to try this but my GPU is potato.

Does anyone know a convenient way to access the kind of GPUs required for this?

Should I just pay for Google Colab?


You get $300 in credits on GCP when you sign up with a new Google account. You can then request a GPU; they approve quota-increase requests immediately.

It's very easy to connect your Colab notebook to a GCP machine.


Any plans for a Llama 2 version? (Wondering how much difference it makes at such small model sizes.)


Oh, my apologies, this is Llama 2! I'm using Llama 2 7B chat here, not the original Llama.


So Discord, Google, and FB chats can pretty much do this too... should have been obvious by now.


Very interesting! I’m wondering if anyone attempted something similar in Telegram though.


i am screaming in horror on the inside



