
A new state-of-the-art open source chatbot - olibaw
https://ai.facebook.com/blog/state-of-the-art-open-source-chatbot
======
drusepth
According to the "Get the code" link [1], it looks like these models need
pretty huge GPUs to even interact with the pre-trained models. Is that
abnormal? I was under the impression that training the model is generally what
takes the beefy GPU, and then using that model requires more consumer-adjacent
hardware. A P100 GPU is $3000 [2].

[1] [https://parl.ai/projects/blender/](https://parl.ai/projects/blender/)

[2]
[https://www.amazon.com/dp/B06WV7HFWV/](https://www.amazon.com/dp/B06WV7HFWV/)
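
As a rough sizing check, a minimal sketch (my arithmetic, not from the page; FP32 weight storage is an assumption, and activations plus framework overhead are extra on top):

    # Weight-only memory footprint, assuming 4 bytes per FP32 parameter.
    # 2.7B and 9.4B are the released Blender model sizes.
    for name, params in [("2.7B", 2.7e9), ("9.4B", 9.4e9)]:
        print(f"{name} model: ~{params * 4 / 1e9:.0f} GB of weights")
    # 2.7B: ~11 GB fits on a 16 GB P100; 9.4B: ~38 GB does not.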

~~~
rahimnathwani
These are very big models, roughly 100x to 300x the parameter count of ResNet-50.

2.7bn parameters (for the smaller model) means roughly 2.7bn multiply-accumulate operations for a single forward pass. You could fit the model in main memory, but how long is it going to take to run all those calculations on a CPU? And since generation is autoregressive, the full model has to run once per token just to output a single sentence.
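
For a rough sense of scale, here's a back-of-the-envelope sketch. Every figure in it is an assumption except the P100 number, which is NVIDIA's published FP32 peak, so treat the output as order-of-magnitude only:

    # Decoding cost for the 2.7B-parameter model, assuming ~2 FLOPs
    # (one multiply, one add) per parameter per generated token.
    PARAMS = 2.7e9
    FLOPS_PER_TOKEN = 2 * PARAMS
    TOKENS_PER_REPLY = 20             # assumed reply length
    CPU_FLOPS = 50e9                  # assumed sustained desktop-CPU rate
    P100_FLOPS = 9.3e12               # P100 FP32 peak (NVIDIA spec)

    total = FLOPS_PER_TOKEN * TOKENS_PER_REPLY
    print(f"FLOPs per reply: {total:.1e}")              # ~1.1e11
    print(f"CPU estimate:  {total / CPU_FLOPS:.1f} s")  # ~2 s
    print(f"P100 estimate: {total / P100_FLOPS:.3f} s") # ~0.01 s

In practice autoregressive decoding on a CPU is memory-bandwidth bound, so the real gap is usually even larger than this ratio suggests.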

------
shermanmccoy
Boiling it all down: when prompted, these models just regurgitate a sentence similar to what they saw in the training data for roughly the same input, using glorified curve fitting. That does not necessarily mean the model understands the meaning of what it is spitting out. So the uninitiated will be really impressed with this kind of toy.

The researchers here appear to have placed particular emphasis on cleaning up what the model spits out, but I think it's lipstick on a pig. The area begging for more research is parsing out the meaning of anything but the simplest sentences.

~~~
AndrewKemendo
>Boiling it all down: when prompted, these models just regurgitate a sentence similar to what they saw in the training data for roughly the same input, using glorified curve fitting

This is not that different from what you do.

What criteria would you use to determine whether something understands the meaning of a word/phrase/concept, other than a string of definitions and metaphors? And what level would be sufficient?

Attempting to prove that something "understands the meaning" is a fruitless
task with no quantifiable criteria - much like proving something is
"conscious."

~~~
shermanmccoy
Representing meaning as a number, or more specifically as a point or set in some vector space, is fine. I'm not suggesting a more sophisticated concept of understanding than the status quo here: basically text in, a number or numbers out, which as I understand it is how most intent analysis currently works.

So meaning in this sense is very much quantifiable, yet how far along are we in parsing out even the most basic meanings? Can we build something that discriminates between "I'm moving in to <address>" and "I'm moving in on <date>" using the latest and greatest word embeddings? Not without some extra layers of external rules imposed on top (see the sketch below). So the model does not 'understand', even in this limited sense of understanding.
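
To make that concrete, here's a minimal sketch of that "extra layer of external rules" bolted on top of off-the-shelf embeddings. The sentence-transformers package, the model name, and the regex patterns are all my illustrative assumptions, not anything the Blender work ships:

    import re
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    a = "I'm moving in to 123 Main Street"
    b = "I'm moving in on June 5th"

    # The embeddings alone place the two sentences very close together.
    ea, eb = model.encode([a, b])
    cosine = np.dot(ea, eb) / (np.linalg.norm(ea) * np.linalg.norm(eb))
    print(f"cosine similarity: {cosine:.2f}")  # typically high

    # The hand-written rule layer that actually separates the intents.
    def intent(sentence: str) -> str:
        if re.search(r"\bmoving in on\b", sentence, re.IGNORECASE):
            return "move-in date"
        if re.search(r"\bmoving in to\b", sentence, re.IGNORECASE):
            return "move-in address"
        return "unknown"

    print(intent(a))  # move-in address
    print(intent(b))  # move-in date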

Don't be fooled by the sentence recycling, is all.

~~~
mycall
I wish someone would fuse Cyc with a modern chatbot.

