First install aitextgen:

pip3 install aitextgen

Then you can generate titles from the Hacker News model straight from the command line, optionally seeding the output with a prompt:

aitextgen generate --model minimaxir/hacker-news --n 20 --to_file False
aitextgen generate --model minimaxir/hacker-news --n 20 --to_file False --prompt "Show HN:"
Show HN: A simple, free and open source alternative to Turkish potatoies
Show HN: A boilerplate for mobile development
Show HN: Simple UI Gao-Parser (for the Web)
Show HN: A fast, fully-featured web application framework
Show HN: I have a side project you want to sell in a startup?
Show HN: S3CARP Is Down
Show HN: Finding the right work with friends and family
Show HN: I built a webapp to remind users to view your photoshopped stripes
Show HN: Send a hands-only gift reason to the Mark Zuckerberg & Stay a lot.
Show HN: A simple, high-performance, full-disk encryption
Show HN: Peer-to-peer programming language
Show HN: Browse and duplicate images in your app's phone
Show HN: Waze – Send a face back end to the internet
Show HN: A simple, minimal, real-time building app to control your Mac.
Show HN: Sheldonize – A collaborative group for startups
Show HN: Gumroad – Make your web app faster
Show HN: An easy way to track time using MD5?
Show HN: A simple, fast, and elegant ORM/Lambda: progressive web apps for Vim
Show HN: A simple landing page I've been working on elsdst Certy. Here is how I was within the last year
Well, it knows how to get HN users' attention all right.
I need to see this in action.
It's easy, awkward, time consuming and probably pretty wrong for tracking hours. Just like regular time tracking!
Will take a look.
Ask HN: What's your favorite computer science podcasts?
Ask HN: How do I convince a non-technical exercise to keep a journal
Ask HN: Is it just me or not?
Ask HN: What do I do with my MVP?
Ask HN: How to sell?
Ask HN: How do you use HackerNews?
Ask HN: Best way to make a B2B startup?
Ask HN: Why do I have to live in San Francisco?
Ask HN: How to tell my heart changes?
Ask HN: How to deal with the difference between a job interview and a product?
Ask HN: What is your favorite open-source sytem?
Ask HN: What are your favorite blogs and resources?
Ask HN: What are the best books for learning a new language/frameworks?
Ask HN: What's your favorite HN post?
Ask HN: What is your favorite RSS reader
Ask HN: Is the SE not a mistake like a safe space business?
Ask HN: How do I start programming in a job?
from aitextgen import aitextgen
ai = aitextgen(model="minimaxir/hacker-news")
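With the model loaded, generation from Python mirrors the CLI flags above. A minimal sketch (the positional n and the prompt argument follow the generate() call shown further down in this thread):

# Generate 5 titles seeded with a prompt, the Python equivalent of --n / --prompt above
ai.generate(5, prompt="Show HN:")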
I'm going to have a lot of fun with this, and it's going to be my starting point for learning more about Colab notebooks and AI (I've always loved doing practical things instead of reading theory to learn something new).
Kudos to you for all this amazing work.
p.s. sorry if this is a lame question, but can this be used the way Gmail has recently started autocompleting my email sentences?
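For what it's worth, the closest aitextgen analogue to that kind of autocomplete would presumably be prompting the model with the sentence typed so far and letting it continue. A rough sketch using the same generate() API shown elsewhere in this thread (the max_length argument is my assumption, and the model here is the Hacker News one, not an email model):

# Hypothetical autocomplete-style use: prompt with the text written so far
# and cap the continuation length (max_length is an assumed parameter)
ai.generate(1, prompt="Thanks for the feedback, I think we should", max_length=40)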
> Generates text faster than gpt-2-simple and with better memory efficiency! (even from the 1.5B GPT-2 model!)
This is exciting news. One of the very few drawbacks of gpt-2-simple is the inability to fine-tune a model of more than ~355M parameters. Do these memory management improvements make it possible to fine-tune a larger one?
Unfortunately not yet; I need to implement gradient checkpointing first. Memory-wise, the results for finetuning 124M are promising (<8 GB VRAM when it used to take about 12 GB VRAM with gpt-2-simple)
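For anyone curious what fine-tuning the 124M model looks like in practice, here is a minimal sketch based on my reading of the aitextgen docs (the file name is hypothetical, and the exact train() arguments may differ by version):

from aitextgen import aitextgen

# Load the default 124M GPT-2 and fine-tune it on a plain-text file.
# batch_size / num_steps are placeholder values; tune them to your GPU.
ai = aitextgen()
ai.train("hn_titles.txt", batch_size=1, num_steps=5000)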
If I want to fine-tune this on some text data, are there obvious constraints to be aware of? I've got a reasonable amount of text (~50-100 GB), but seeing that there's a JSON file created makes me think that's probably too much. gpt-2-simple seems to describe 100 MB as 'massive', so what's a reasonable amount to aim for?
Or should I be training from scratch? (edit: I looked into training from scratch, and since I don't have thousands to throw at this, I'm guessing that's a 'no')
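And for the from-scratch route, my understanding of the aitextgen workflow is roughly: train a small tokenizer on the corpus, build a CPU-sized GPT-2 config, and train a new model against it. This is a sketch only; the helper names (train_tokenizer, GPT2ConfigCPU, TokenDataset) are from the aitextgen docs, but the tokenizer file name and argument details may differ by version:

from aitextgen import aitextgen
from aitextgen.TokenDataset import TokenDataset
from aitextgen.tokenizers import train_tokenizer
from aitextgen.utils import GPT2ConfigCPU

file_name = "corpus.txt"  # hypothetical corpus file

# Train a new BPE tokenizer on the corpus and build a tiny GPT-2 config
train_tokenizer(file_name)
config = GPT2ConfigCPU()

# Create an untrained model with that tokenizer/config and train from scratch
ai = aitextgen(tokenizer_file="aitextgen.tokenizer.json", config=config)
data = TokenDataset(file_name, tokenizer_file="aitextgen.tokenizer.json", block_size=64)
ai.train(data, batch_size=16, num_steps=5000)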
I'm not 100% sure you can encode and store that much data in memory with the current implementation, even with the fast tokenizers.
> I'm not 100% sure you can encode and store that much data in memory with the current implementation, even with the fast tokenizers.
That makes sense. I wasn't too sure what sensible sizes would be; there are probably some interesting subsets of the data I could take and use for fine-tuning (or some sampled data), maybe down to 100 MB, since that sounded like a large-but-OK amount to use.
I'm looking forward to seeing what I can get out of this, thanks for making something simple enough that I can do that for an "I wonder if" kind of problem!
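If it helps, cutting a large corpus down to a ~100 MB sample for fine-tuning doesn't need anything aitextgen-specific. A plain-Python sketch with hypothetical file names (the 1-in-500 keep rate assumes a ~50 GB source; adjust accordingly):

import random

# Keep roughly 1 in 500 lines until ~100 MB has been written out
target_bytes = 100 * 1024 * 1024
written = 0
with open("full_corpus.txt", encoding="utf-8") as src, \
     open("subset.txt", "w", encoding="utf-8") as dst:
    for line in src:
        if random.random() < 1 / 500:
            dst.write(line)
            written += len(line.encode("utf-8"))
            if written >= target_bytes:
                break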
Model I/O: aitextgen abstracts away some of the boilerplate, supports custom GPT-2 models, and improves importing of the old TensorFlow-based GPT-2 models.
Training: Completely different from Transformers: file processing and encoding are handled differently, and the training loop leverages pytorch-lightning.
Generation: Abstracts away boilerplate, allowing the addition of more utility functions (e.g. bolding when printing to the console, or printing bulk text to a file; see the sketch below). Generation is admittedly not that much different from Transformers, but future iterations will increasingly diverge.
That's why I'm looking into smaller models, which has been difficult, but releasing aitextgen was a necessary first step.
The next step is architecting an infrastructure for scalable generation; that depends on a few fixes for both aitextgen and the base Transformers. No ETA.
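For the bulk-text-to-file utility mentioned in the Generation bullet above, the CLI's --to_file flag has a Python-side counterpart; a sketch assuming a generate_to_file() helper that accepts the same generation arguments as generate() (the exact signature is my assumption):

# Write a batch of generations to a text file instead of printing to the console
ai.generate_to_file(n=100, prompt="Ask HN:")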
>>> ai.generate(1, prompt="Trump")
Trump] The best way to start your life is to have sex with someone who is still a virus
As funny as it seems, it shows what things are being associated with the prompt.