Hacker News
Training NanoGPT on My Journal (hut.pm)
192 points by romseb on Jan 29, 2023 | 24 comments



"I have been the happiest when She doesn't want to break up" wow that is saddening, you have to shoot higher dude, you deserve it!


Made me sad, too. Please remember that you can NEVER (and I hate using absolutes) find real happiness in the actions of others, or from any other external source. Never. I am 55 and have learned this the hard way and through studying the teachings of experts.


I think "real" happiness is highly subjective when there are just as valid views like "happiness is only real when shared".

At the end of the day, what makes you happy might be a miserable time for others.

Personally, finding happiness in the actions of others is fine for me; it is expecting reciprocation that makes some people bitter, and that is just an unneeded burden.


> Please remember that you can NEVER (and I hate using absolutes) find real happiness in the actions of others, or from any other external source.

I haven't really found "real happiness" yet, but I still get a lot of happiness from talking to all my friends and knowing that I am loved and appreciated <3

Even when I lose or fail to make one, there are always others that I can hang out with to feel better.


Wise, and agreed


It sounds like your incantations may have resurrected Racter.

https://en.m.wikipedia.org/wiki/Racter


Too bad Chamberlain was so stingy and secretive, never opening it up. We can't know how well it really worked versus how heavily they edited output for the book passages.

What did they ultimately accomplish or gain by not sharing the details?

p.s. Is it interesting to anyone besides me how weak nanoGPT is compared even to 80s technology? I look forward to when training our own personal offline GPTs is widely prevalent, with a few orders of magnitude better accuracy / reasoning ability.


The least hypothesis is that the "good" version of Racter that supposedly composed the novel never existed, and that it was all a hoax made up by someone who wanted to bask in some publicity and sell the Racter that actually got released. As you say, the output is suspiciously good for something they insisted was accomplished on a home computer of the era and, oh no, they can't release the software that did it for Reasons... yeah, pull the other one, it's got bells on.

Also, this is interesting:

https://electronicbookreview.com/essay/constructing-the-othe...

> There is little evidence of The Policeman’s Beard’s process of production, likely due to its digital conception, as well as the normalcy of its creators. Chamberlain and Etter were two friends, with the former an amateur computer enthusiast and the latter employed as a computer programmer; neither felt it necessary to document production. Further, all of the referenced sources focus on Chamberlain, and Etter’s name has virtually disappeared from any record of activity since Racter. Even then, his name is mentioned only once, and between commas, in The Policeman Beard’s front matter. There does not appear to be any photographic evidence of Etter (while there is plenty of Chamberlain), nor evidence of additional contributions to any technological field. There is, frankly, no convincing evidence for Thomas Etter’s existence other than a few quotations attributed to him in news articles about Racter, and assurance from Chamberlain himself (correspondence with the author, who also confirmed that Etter is now deceased).


It is too bad. Reading The Policeman’s Beard is Half Constructed, it is pretty clear to me that it was one or both of:

1) nearly every grammatically accurate combination of a human-chosen lexicon being constructed by a computer and then heavily human-filtered

2) a human writing a simple program, getting the feel of the basic computer process and then repeatedly emulating it for artistic/comedic effect

My money is on 2, as it is more fun and less work, especially when you consider it was the early 80s. Still at least the work of a human-computer hybrid in some sense.


>The output is not great, but at least there are some sentences that make sense :) I will do some more neural network training and dataset cleaning, let's see where it will get me.

From my own experience doing something very similar, I'm afraid that this won't cut it. Even a 1B parameter language model will struggle with the simplest topics. The smallest model I have seen that may be useful for general-purpose work is GPT-J with 6B parameters, and even that is far from perfect. The idea of running these architectures in any useful way on old hardware is sadly a pipe dream. AI got where it is today mostly because of the increase in computing power.


True, true. But I already get somewhat coherent sentences that are personally enjoyable and valuable.

The main problem is that 90% of the output is garbage: broken formatting, abruptly ending sentences, mixed languages, occasional nonsensical sentences like `your "two" type,ir zwe "G together"`. If I can produce something that looks like normal sentences, even if they're not particularly meaningful, with ~90% reliability, that's all I need for my purposes, and I think the current bottleneck is actually the very poor dataset of my very heterogeneous notes.

I'm thinking about it like how some people love their pets or toddlers even though they struggle with the most basic tasks. The personal connection makes up for it :)


Yeah, if you only want coherent language, 1B parameters is more or less good enough. If you also want general context awareness and some basic world knowledge, you will need to up it another order of magnitude. Either way, this is nothing that you'd run on dated hardware. Smaller transformers can be useful for very specific tasks, but autoregressive LMs are just insane when it comes to hardware requirements.
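
To put rough numbers on the hardware point, here is a back-of-the-envelope sketch. It assumes weights stored in 16-bit floats (2 bytes per parameter) and counts only the weights, so activations, optimizer state, and any caches are ignored. The function name is made up for illustration.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to hold the model weights, in GB (10^9 bytes).

    Assumption: fp16/bf16 storage, i.e. 2 bytes per parameter by default.
    """
    return n_params * bytes_per_param / 1e9


# The two model sizes mentioned in this thread.
for label, n_params in [("1B", 1e9), ("6B (GPT-J)", 6e9)]:
    print(f"{label}: ~{weight_memory_gb(n_params):.0f} GB for weights alone")
```

Under these assumptions a 1B model needs about 2 GB and a 6B model about 12 GB before any inference overhead, which is why older consumer GPUs struggle.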


> All in all, this AI-generated line describes my feelings about the results perfectly:

> You're a real problem, but I'm just happy that you are doing.

Good line.


This is something that I've imagined playing with eventually.

Are you thinking of making it able to "chat"? (integrate your new input)


Author here. Yes, my main goal currently is enabling it to chat with me, and it seems quite feasible.


I remember Andrej mentioning in [1] that this is a challenging part. What materials, lectures, or papers have you found, or what ideas do you have so far on how to approach it?

[1] https://www.youtube.com/watch?v=kCc8FmEb1nY


You can create chat-like situations with appropriate prompts.

For example, if I enter the message "Wow this day was horrible" to the chat application, it could translate it into a prompt like "Amy: Wow this day was horrible\nMe: " and the language model could auto-complete the response of me - assuming the language model was trained on chat logs between my friend Amy and me.

But as sigmoid10 mentioned, the quality of the replies will not be remotely comparable to ChatGPT.
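
The prompt trick described above could be sketched roughly like this. Everything here is illustrative: `build_prompt`, `extract_reply`, and `chat_turn` are hypothetical helper names, and `generate` stands in for whatever function samples a completion from the trained model. The sketch also assumes one chat turn per line, which may not match a real chat log format.

```python
from typing import Callable, List, Tuple


def build_prompt(history: List[Tuple[str, str]], message: str,
                 friend: str = "Amy", me: str = "Me") -> str:
    """Turn earlier turns plus the new incoming message into a chat-log style prompt."""
    lines = [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"{friend}: {message}")
    lines.append(f"{me}: ")  # left open so the model completes "my" reply
    return "\n".join(lines)


def extract_reply(completion: str) -> str:
    """Keep only the first line; the model tends to keep writing further turns."""
    return completion.split("\n", 1)[0].strip()


def chat_turn(generate: Callable[[str], str],
              history: List[Tuple[str, str]], message: str) -> str:
    """One round trip. `generate` is a placeholder for the model's sampling function."""
    return extract_reply(generate(build_prompt(history, message)))


# With an empty history this reproduces the prompt from the comment above.
print(repr(build_prompt([], "Wow this day was horrible")))
```

Cutting the completion at the first newline is a design choice for this sketch: a model trained on chat logs will usually continue with the other person's next line unless stopped.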


I wonder if note-taking applications like Notion (or new startups) might include something like this as a capability to improve their search and summarization.


> my laptop has less than 1GB

What gives? Most smartphones made in the last 10 years have more RAM than that.


VRAM, not RAM :)


Pretty interesting use of the LLMs.


I'd be interested to know how much text it was trained on.


Author here :) Some results (around 50%) are from a NN trained on 1.77MB of my notes, and others are from another NN trained on 1.25MB of my notes and 2MB of a random slice of "openwebtext" (an open reproduction of OpenAI's WebText dataset used for GPT-2).
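
For anyone wanting to try a similar mix, one possible way to combine personal notes with a random slice of a larger corpus is sketched below. The comment does not say how the slice was actually taken, so this is only a guess at the general idea: the function names are hypothetical, the slice is one contiguous chunk, and sizes are counted in characters rather than bytes.

```python
import random


def random_slice(text: str, n_chars: int, rng: random.Random) -> str:
    """Return one contiguous chunk of n_chars characters from a random position."""
    if n_chars >= len(text):
        return text
    start = rng.randrange(len(text) - n_chars + 1)
    return text[start:start + n_chars]


def build_corpus(notes: str, web_text: str, web_chars: int, seed: int = 0) -> str:
    """Concatenate the notes with a random slice of the larger corpus."""
    rng = random.Random(seed)  # fixed seed so the same slice can be rebuilt later
    return notes + "\n" + random_slice(web_text, web_chars, rng)


# Tiny stand-in strings; real inputs would be read from files.
corpus = build_corpus("my private notes", "some much larger web corpus " * 10, web_chars=40)
print(len(corpus))
```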


Cool, running a chat bot on your laptop.



