I would like to see this expanded, I think it's a bit unfair to assess its abili...

szopa · on April 10, 2023

We'd love to see that too! However, I'm afraid that creating a substantial number of examples would transform this delightful family activity into something akin to punishment. Kłeti is quite the challenge for us Indo-Europeans, and it seems that even its creator isn't immune to the struggle.

afro88 · on April 10, 2023

Both GPT-3.5 and GPT-4 versions of ChatGPT are limited to 4k tokens, even though GPT-4 is capable of 32k.

This leads me to believe that some of the mediocre results OP saw came from hitting the token limit, at which point ChatGPT started "forgetting" earlier parts of the conversation.
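A minimal sketch of how that kind of "forgetting" could arise, assuming the chat front-end simply drops the oldest messages once a token budget is exceeded. How ChatGPT actually trims context is not stated in this thread; `count_tokens` and `fit_to_budget` are hypothetical helpers, and whitespace word counts stand in for a real tokenizer.

```python
def count_tokens(text):
    # Crude stand-in for a tokenizer: counts whitespace-separated words.
    return len(text.split())


def fit_to_budget(messages, budget):
    """Keep the most recent messages whose combined size fits the budget."""
    kept = []
    used = 0
    # Walk from newest to oldest so recent messages win.
    for msg in reversed(messages):
        n = count_tokens(msg)
        if used + n > budget:
            break  # this message and everything older is dropped
        kept.append(msg)
        used += n
    return list(reversed(kept))


history = ["rules of the game", "first example sentence", "latest question"]
trimmed = fit_to_budget(history, budget=5)
print(trimmed)
```

With a budget of 5, the earliest message (the one carrying the rules) no longer fits and is silently dropped, which is the failure mode described above.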

szopa · on April 10, 2023

No, I was explicitly watching for this. In one of the sessions where we asked it to generate Kłeti sentences and the conversation passed the token limit, it started inserting characters like ı (the Turkish dotless i). A week earlier I was playing with interpreting Go positions, and at some point the model switched to talking about chess (a bit less subtle than inserting unusual characters).

knome · on April 10, 2023

GPT-4 allows you to use 8k of context in their current beta, if you're using the chat API directly. It will be interesting (and probably expensive, lol) when they open it to a full 32k.

Baeocystin · on April 10, 2023

I'm really looking forward to being able to use a personalized LoRA on top of a GPT-4+ class model. I want to be able to train on all of my writing over the past few decades and interrogate the history of my ideas, and I think this would be tremendously valuable for writers of all kinds. Heck, think of the value of training (with their blessing) on something like /r/AskHistorians, or other deep-dive, high quality fora.

Name_Chawps · on April 10, 2023

Though unfortunately it will cost like $20 per 32k completion...

M4v3R · on April 11, 2023

More like $1.92 (32 × $0.06) for a 32k prompt, or twice that for a 32k completion. Still not cheap though.
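For reference, the arithmetic can be sketched as follows, assuming prices of $0.06 per 1k prompt tokens and $0.12 per 1k completion tokens for the 32k-context model. Those two constants are assumptions chosen to match the figures quoted above; `cost_usd` is a hypothetical helper.

```python
# Assumed per-1k-token prices in USD; check current pricing before relying on them.
PROMPT_PRICE_PER_1K = 0.06
COMPLETION_PRICE_PER_1K = 0.12


def cost_usd(tokens, price_per_1k):
    """Cost of a given number of tokens at a per-1k-token price."""
    return tokens / 1000 * price_per_1k


prompt_cost = cost_usd(32_000, PROMPT_PRICE_PER_1K)
completion_cost = cost_usd(32_000, COMPLETION_PRICE_PER_1K)
print(f"32k prompt: ${prompt_cost:.2f}, 32k completion: ${completion_cost:.2f}")
```

In practice prompt and completion share the same context window, so a single request cannot be both a full 32k prompt and a full 32k completion.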

Imnimo · on April 10, 2023

The vector database would be good for retrieving vocabulary, but could it be expected to do things like retrieve sentences with similar syntax or tenses? It feels like it would be hard to successfully retrieve examples that were important for reasons other than semantic content.
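A toy illustration of that concern, assuming retrieval ranks stored sentences by cosine similarity to the query. Bag-of-words counts stand in for real embeddings here, which is a strong simplification; `vectorize` and `cosine` are hypothetical helpers and the English sentences are made up for the example.

```python
import math
from collections import Counter


def vectorize(sentence):
    # Stand-in for an embedding model: raw word counts.
    return Counter(sentence.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


query = "the dog will chase the cat"           # future tense
same_words = "the dog chased the cat"          # shared vocabulary, past tense
same_tense = "the bird will eat the seed"      # future tense, other vocabulary

q = vectorize(query)
ranked = sorted(
    [same_words, same_tense],
    key=lambda s: cosine(q, vectorize(s)),
    reverse=True,
)
print(ranked)
```

The sentence sharing vocabulary outranks the one sharing tense, so an example that matters for its grammar would need some other index (for instance, explicit tags for tense or construction) to be retrieved reliably.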