Hi HN! I'm Ted. I've been lurking for ~10 years while working long hours at corporate jobs, and finally decided to strike out on my own. Today I'm launching my side project: Albert AI.
Albert is an attempt to offer a persistent, single-session conversation with an AI, similar to ChatGPT; the experience should feel like texting with a friend. For now I'm using gpt-3.5-turbo with a carefully constructed prompt, long-term memory, and some postprocessing to smooth the rough edges. Soon I hope to integrate gpt-4, but an important point is that users shouldn't have to care about the implementation: I could swap OpenAI for LLaMA-7b, and from the user's perspective the value would still come from their history with Albert.
Here are some technical details:
- Long-term memory: every hour, I identify users with "expired" memories (at least 24 hours since their memories were last generated). All of their recent messages are then loaded, summarized by gpt-3.5-turbo, and merged with the user's existing memories. I'm *not* using a vector database yet; Albert's memory is expected to be lossy, like a human's. I'll add embeddings if I start to see some traction :)
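The expiry sweep boils down to a simple filter. Here's a minimal sketch; the `UserRecord` shape, field names, and `MEMORY_TTL_MS` constant are my own illustrations, not Albert's actual schema:

```typescript
// Sketch of the hourly sweep: find users whose memories were
// generated at least 24 hours ago, so they can be re-summarized
// and merged. All names here are illustrative.
interface UserRecord {
  id: string;
  memoriesGeneratedAt: number; // epoch ms of last memory generation
}

const MEMORY_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

function usersWithExpiredMemories(
  users: UserRecord[],
  now: number
): UserRecord[] {
  return users.filter((u) => now - u.memoriesGeneratedAt >= MEMORY_TTL_MS);
}
```

The selected users' recent messages would then go through the summarize-and-merge step.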
- Context is tricky. I need to include the rules, Albert's memories from conversing with the user, and the most recent messages. I use the js-tiktoken library extensively to make sure I'm packing as much as I can into the input tokens, while leaving enough room for output tokens. This usually means truncating older messages.
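The truncation logic can be sketched as walking the history newest-first until the budget runs out. In the real code the token counter would come from js-tiktoken's encoder; here it's injected as a parameter so the sketch stays self-contained:

```typescript
// Sketch of context packing: keep the newest messages that fit
// in a token budget (what's left after rules + memories + room
// for output tokens), dropping older ones. The counter is
// injected; in practice it would be js-tiktoken.
function packMessages(
  messages: string[], // oldest first
  budget: number, // tokens available for conversation history
  countTokens: (s: string) => number
): string[] {
  const kept: string[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = countTokens(messages[i]);
    if (used + cost > budget) break; // older messages get truncated
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```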
- gpt-3.5-turbo can be wordy, overly polite, and apologetic. I try to reduce this with postprocessing:
- I search responses for specific strings that are often used: /feel free|help|assist|need anything|do for you/
- Then I feed the response back into gpt-3.5-turbo with a prompt telling it to rewrite it by removing any offers to assist.
- Similarly, I detect when the model wishes the user a good day (very freaking annoying) and remove that as well.
- I *do not* yet remove "As an AI" from responses. Coming soon.
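The detect-then-rewrite pass above can be sketched like this. The regex is the one from the post (made case-insensitive here); the rewrite prompt wording is my guess at it, not the actual prompt:

```typescript
// Sketch of the two-step cleanup: detect boilerplate offers of
// help, and if found, build a second-pass rewrite request for
// gpt-3.5-turbo. Prompt wording is illustrative.
const OFFER_PATTERN = /feel free|help|assist|need anything|do for you/i;

function needsRewrite(response: string): boolean {
  return OFFER_PATTERN.test(response);
}

function buildRewriteMessages(response: string) {
  return [
    {
      role: "system",
      content: "Rewrite the following message, removing any offers to assist.",
    },
    { role: "user", content: response },
  ];
}
```

If `needsRewrite` fires, the messages array goes back through the chat completion endpoint and the rewritten response is shown instead.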
- I also have some built-in responses for when Albert gets rate-limited by OpenAI. This was definitely over-engineering and I regret the time I spent on it :)
- For the rules, I can confirm the model responds far better to being told what to do than what not to do. For example, telling it *not* to be overly polite, or not to offer more assistance when the user says thank you, doesn't work. Instead, I tell it to *always* reply "You're welcome" when the user says thank you, then detect "You're welcome" at the start of a response and replace it with something more conversational.
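The positive-rule trick reduces to a cheap string swap on the known marker. A minimal sketch, assuming a hand-picked replacement list (the replies and the `pick` parameter are illustrative):

```typescript
// Sketch of the "always say You're welcome" trick: the prompt
// forces a predictable opener, and this swaps it for something
// more conversational. Replacement list is illustrative.
const CASUAL_REPLIES = ["No problem!", "Anytime!", "Sure thing!"];

function replaceYoureWelcome(response: string, pick = 0): string {
  const replacement = CASUAL_REPLIES[pick % CASUAL_REPLIES.length];
  return response
    .replace(/^you're welcome[.,!]?\s*/i, replacement + " ")
    .trimEnd();
}
```

Responses that don't start with the marker pass through unchanged.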
- From my own experience, sometimes I just want to play with an LLM but I don't have the creativity to make it interesting. For that, I implemented a "Magic Button" with a list of 2,000+ built-in activities and use cases. It's been a big hit with my kids :)
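Mechanically the Magic Button is just a random draw from the built-in list. A tiny sketch (the activities are placeholders; the real list has 2,000+ entries, and the injectable `rand` is only there to make the sketch deterministic in tests):

```typescript
// Sketch of the "Magic Button": pick a random activity from a
// built-in list. These three entries are placeholders for the
// real 2,000+ item list.
const ACTIVITIES = [
  "Write a limerick about our cat.",
  "Quiz me on world capitals.",
  "Invent a bedtime story with a dragon.",
];

function magicButton(rand: () => number = Math.random): string {
  return ACTIVITIES[Math.floor(rand() * ACTIVITIES.length)];
}
```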
- The rest is just tech stack decisions, which I'm happy to dive into. Azure Functions, Cosmos DB, Azure Static Web Apps, Terraform, etc.
I recognize that Albert overlaps heavily with many other services. At this point I'm keeping an eye out for more specific markets I could pivot toward, and hoping to connect with others working in the space.
Thanks for reading!