I would like to know:
Yes, it really takes a lot of compute and time to train something like BERT (probably around 64 V100s for 3 days).
BUT once it is trained and you want to use it in your application, what about inference?
Inference usually takes a few milliseconds for typical models; is it far longer with BERT?
And can a modern smartphone easily run such inference?
I've heard about MobileNets, but they sacrifice too much accuracy, so I really hope BERT can run today on a 7 nm Snapdragon plus its mini TPU.
I can't find such data on the web, but this is an elementary question, and answering it seems necessary for the widespread success of NLP.
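For anyone who wants a rough number on their own hardware, here is a minimal latency sketch, assuming the Hugging Face transformers library and PyTorch are installed; the model name, prompt, and loop count are just illustrative choices, and real mobile deployments would go through something like TFLite or ONNX instead:

```python
import time

import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: plain BERT-base; swap in a distilled variant to compare.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

inputs = tokenizer(
    "How long does a single BERT forward pass take?",
    return_tensors="pt",
)

with torch.no_grad():
    model(**inputs)  # warm-up pass so first-call overhead doesn't skew timing
    start = time.perf_counter()
    for _ in range(20):
        model(**inputs)
    elapsed_ms = (time.perf_counter() - start) / 20 * 1000

print(f"Mean CPU latency per forward pass: {elapsed_ms:.1f} ms")
```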
"Throw compute at it because we're Google/OpenAI."
Sorry, but judging by the training time, deep learning is far from a "smart" paradigm.
It is essentially statistical brute force + a few clever math tricks.
It is only part of the answer to how to create an artificial general intelligence.
But where is the research on building a causal reasoning system that understands natural language?
It mostly died in the AI winter, and except for a few hipsters like me, OpenCog, or Cyc, it remains dead.
I wonder how many decades firms like Google will need to realize something this obvious: real intelligence is statistical AND causal.