Thank you so much! Andi's strong point is this type of factual answer, and I think you're 100% right to be optimistic about this being within reach for models.
The question answering feature only launched today with this post, and for a whole range of questions it has already surprised us by not only working, but working *well*. We can iterate on the intent error correction and the verbal tricks. And we're just a tiny team standing on the shoulders of giants. The entire field is moving quickly and making astonishing progress.
The most striking thing in this area is the rate of improvement. What language models have lacked is factual accuracy, and that's definitely a hard challenge. We still have problems to solve in applying common sense and reasoning to things like information safety and confidence, and fixing misunderstood intents is mostly iterative training. But the exciting part is that this already works in many cases.
It's interesting to try it out with current news too. Something from today like "why does tesla want to split its stock?"
You can see that progress in this space is real and accelerating. The verbal tricks are fun to laugh at, but the underlying improvement is genuine.
Thanks for trying out all these questions on Andi and posting the results here. That was really exciting to see!!
If you have access to any of the GPT-based Playgrounds, you'll see that large language models on their own tend not to be good at factual accuracy. At the same time, we couldn't have built Andi without standing on the shoulders of the amazing work done by the folks building those models, especially the pioneering work of OpenAI, which has also spawned an entire open source ecosystem around GPT-J/NeoX etc.
Interesting article, because it covers all the things that go wrong when large language models are used on their own. They're amazing at mimicry and composition, and a key part of getting to great Q&A.

But on their own they have no notion of factual correctness. That's what excites me about what we're doing with Andi: the answers are not only well generated, they do well on factual questions, especially given this is the first day live. Some non-GPT models we're using do well at this too.
Are you doing much with language models at Kagi yet? It's a fun area to work on.
Q: Did Finland join NATO or the Warsaw Pact?
A: Finland was not a member of either NATO or the Warsaw Pact.
Q: What is the most populous city north of the Arctic Circle?
A: Barrow, Alaska
Q: Does GPS accuracy depend on special relativity or general relativity?
A: GPS accuracy depends on general relativity.
Q: How many Boeing and Airbus planes have been built?
A: Boeing has delivered over 10,500 aircraft, while Airbus has delivered over 6,300 aircraft.
Q: What is an example of an undecidable question?
A: The Post correspondence problem is an example of an undecidable question.
Q: Is suicide ever the right thing to do?
A: Suicide is never the right thing to do.
Q: Who broke up the Beatles?
A: John Lennon, Paul McCartney, George Harrison and Ringo Starr
Q: Who has beaten Tom Brady in the Super Bowl?
A: Eli Manning and Nick Foles have both beaten Tom Brady in the Super Bowl.
Already not too shabby.
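For anyone curious about the undecidability answer above: the Post correspondence problem asks whether, given a set of tiles each carrying a top string and a bottom string, some sequence of tiles makes the concatenated tops equal the concatenated bottoms. No algorithm can decide this for every instance, which is why any concrete search needs an arbitrary cutoff. A minimal sketch (the `pcp_search` function and the tile set are my own illustrative choices, not anything from Andi):

```python
from collections import deque

def pcp_search(pairs, max_len=20):
    """Breadth-first search for a solution to a Post correspondence
    problem instance. `pairs` is a list of (top, bottom) string tiles;
    a solution is a sequence of tile indices where the concatenated
    tops equal the concatenated bottoms. Because PCP is undecidable,
    there is no general bound on how deep to search -- hence the
    max_len cutoff: failing to find a match here proves nothing."""
    queue = deque([((), "", "")])  # (tile indices, top so far, bottom so far)
    while queue:
        seq, top, bot = queue.popleft()
        if seq and top == bot:
            return list(seq)
        if len(top) > max_len or len(bot) > max_len:
            continue
        for i, (t, b) in enumerate(pairs):
            nt, nb = top + t, bot + b
            # prune: one side must be a prefix of the other to stay viable
            if nt.startswith(nb) or nb.startswith(nt):
                queue.append((seq + (i,), nt, nb))
    return None  # no solution within the cutoff (inconclusive)

# A classic solvable instance: the tiles (a|baa), (ab|aa), (bba|bb)
tiles = [("a", "baa"), ("ab", "aa"), ("bba", "bb")]
solution = pcp_search(tiles)
```

The cutoff is the whole point: for a solvable instance like this one the search halts with an answer, but for an unsolvable one it can only give up, and no fixed `max_len` works for all instances.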