I think in the future we'll see their AI fighting against our AI in an arms race similar to the spam wars. The one with the most computing power and biggest dataset will win and humans will be at their mercy.
Sounds like a GAN in meatspace.
But quite possibly "greater efficiency" according to a fitness function that's not accurately mapped onto "keeping humans alive"...
I wonder if this'll end up in an equivalent state to the "tank detection neural net" which learned with 100% accuracy that the researchers/trainers had always taken pictures of tanks on cloudy days and pictures without tanks on sunny days? ( https://www.jefftk.com/p/detecting-tanks )
Who'd bet against the doctor/insurer neural net ending up approving all procedures where, say, the doctor gets a kickback from a drug company, instead of optimising for maximum human health benefit?
Since when was this ever the case? Especially in America? The US healthcare system is NOT built around providing adequate care for everyone, as far as I've read/heard.
Full disclosure: West-EU citizen here
It was always like this. In my opinion it makes no difference whether some guy is more intelligent and therefore able to oppress others, or whether he uses an AI that is more intelligent. For me the result is the same: I get rekt.
Google/Apple shuttle buses are being shot up with pellet guns today; imagine what happens when big AI corps openly work against the population. The Google AI and Amazon Rekognition protests suggest at least some employees have a shred of self-awareness and survival instinct.
No deep learning though.
By training a language model on the dataset, then fine-tuning that model on the sentiment classification task, they were able to achieve 94.5% accuracy.
Sebastian (author of this article) saw the lesson, was kind enough to run lots of experiments to test the approach more carefully, and did a great job of writing up the results in a paper, which was then accepted at ACL.
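If you want the shape of that idea in code, here's a minimal sketch of the two-stage approach in plain PyTorch. This is my own illustrative reconstruction, not the fastai/ULMFiT implementation; every class name and hyperparameter below is an assumption:

    import torch.nn as nn

    VOCAB_SIZE, EMB_DIM, HID_DIM, N_CLASSES = 10000, 100, 256, 2

    class Encoder(nn.Module):
        # Shared backbone: embeds tokens and runs them through an LSTM.
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB_SIZE, EMB_DIM)
            self.rnn = nn.LSTM(EMB_DIM, HID_DIM, batch_first=True)
        def forward(self, tokens):
            out, _ = self.rnn(self.emb(tokens))
            return out  # (batch, seq, hid)

    class LanguageModel(nn.Module):
        # Stage 1: predict the next token at every position.
        def __init__(self, encoder):
            super().__init__()
            self.encoder, self.head = encoder, nn.Linear(HID_DIM, VOCAB_SIZE)
        def forward(self, tokens):
            return self.head(self.encoder(tokens))

    class SentimentClassifier(nn.Module):
        # Stage 2: reuse the pretrained encoder, add a small classification head.
        def __init__(self, encoder):
            super().__init__()
            self.encoder, self.head = encoder, nn.Linear(HID_DIM, N_CLASSES)
        def forward(self, tokens):
            return self.head(self.encoder(tokens)[:, -1])  # last hidden state

    encoder = Encoder()
    lm = LanguageModel(encoder)
    # ...train `lm` with cross-entropy on next-token prediction over unlabeled text...
    clf = SentimentClassifier(encoder)  # same encoder weights, now fine-tuned on labels

The point is that the expensive part (the encoder) is trained once on plentiful unlabeled text, and the labeled sentiment data only has to train a small head plus gently adjust the encoder.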
It's fine if you don't like it (although it may be you're just not quite used to it yet), but I'm not sure it's fair to call it "bad".
Everything is fully customisable in pytorch and this is explained with many examples in part 2 of the course. Written documentation is being worked on as we move towards a first release later this year (currently the library is still pre-release - it works fine and is used at many big and small companies, but there's still much to do to get it to a v1.0).
I'm glad you linked to the style guide - I've often wondered where certain names come from.
It's not all bad. K for key, V for value, i for index... no disagreement. But the seeming aversion to any variable name longer than 2-3 characters might be great in mathematics, while in code it makes for a nightmare: it optimises for writing over reading. It doesn't need to go to the extreme of UIMutableUserNotificationAction, but at least use words.
AI is already cryptic enough; it doesn't need to be made more so with poor naming conventions. I get that for people in the industry it may be fine, but for it to break out into general use it'll need to be more accessible, which means simpler & clearer verbiage.
It is totally fair to call a naming convention where all variables are 2-or-3-letter acronyms bad - at the very least un-pythonic, and certainly not suitable for an educational tool. If you think it's good practice to make your code shorter at the expense of your users needing an abbreviation guide, you're going to make adoption a much harder sell, especially if you're pitching at python devs. Unlike the last few decades, we now live in an age of autocomplete; there is absolutely no need for this.
Here are a few lines from the imdb ipython notebook (https://github.com/fastai/fastai/blob/master/courses/dl2/imd...) that gave me a headache just to look at:
    trn_dl = LanguageModelLoader(np.concatenate(trn_lm), bs, bptt)
    val_dl = LanguageModelLoader(np.concatenate(val_lm), bs, bptt)
    md = LanguageModelData(PATH, 1, vs, trn_dl, val_dl, bs=bs, bptt=bptt)
The cognitive load put on someone unfamiliar with your code is not acceptable:
LanguageModelData takes two LanguageModelLoaders as arguments, and then later in the code it produces a model - surely these two classes are named the wrong way round? You haven't explained what 'bs' or '1' are supposed to be. To figure out/remember what the other variables are, you have to chase up and down the notebook, which could easily be avoided if you just gave them descriptive names. You only use these variables once or twice, so I don't even understand what is gained by making their names so terse.
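For comparison, here's the same snippet with the abbreviations spelled out. The names below are my guesses at the meanings, and if I'm reading the fastai source right, that bare 1 is the padding token index:

    # bptt = backpropagation-through-time sequence length (guessing at semantics)
    train_loader = LanguageModelLoader(np.concatenate(train_lm_ids), batch_size, bptt_len)
    valid_loader = LanguageModelLoader(np.concatenate(valid_lm_ids), batch_size, bptt_len)
    model_data = LanguageModelData(PATH, pad_idx, vocab_size, train_loader, valid_loader,
                                   bs=batch_size, bptt=bptt_len)

Same number of lines, and nobody needs an abbreviation guide to read it.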
Once you are familiar enough with the abbreviations, it becomes much easier to read. Just from looking at this code, I can already tell you bs is batch size and bptt is backpropagation through time. And then probably dl is data loader, lm is language model, md is model data. No idea what vs or 1 is, but I can just look at the docs for LanguageModelData, which takes, what, less than 30 seconds to read?
I think it's worth it because you can just look at one line of code and know what's going on, instead of parsing through a paragraph of code that does the same thing.
It is now possible to grab a pretrained model and start producing state-of-the-art NLP results in a wide range of tasks with relatively little effort.
This will likely enable much more tinkering with NLP, all around the world... which will lead to new SOTA results in a range of tasks.
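Part of why the effort is small: most of the network arrives pretrained, so you update it gently while a small new head trains quickly. Here's a sketch of that idea (discriminative learning rates, one of the ULMFiT tricks) in plain PyTorch; all names and numbers below are illustrative assumptions, not fastai's API:

    import torch
    import torch.nn as nn

    encoder = nn.LSTM(100, 256, batch_first=True)  # stands in for a pretrained LM encoder
    head = nn.Linear(256, 2)                       # new task head, randomly initialised

    optimizer = torch.optim.Adam([
        {"params": encoder.parameters(), "lr": 1e-4},  # small LR: preserve pretrained features
        {"params": head.parameters(),    "lr": 1e-3},  # larger LR: the head learns from scratch
    ])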
And ULMFiT here: http://nlp.fast.ai/category/classification.html
Are there any applications/websites where this can be seen in action? It's increasingly hard to judge how good state-of-the-art really is from research papers.
I put it on the shelf because the sentiment analysis just wasn't up to snuff (i.e. the bias differentiation was too weak).
Might be time to try again!
A great supplement is Sebastian’s NLP progress repo: https://github.com/sebastianruder/NLP-progress
I helped edit this piece and think it's spot on - exciting times for NLP.
So, the first sentence in this passage is a huge assumption. For a model to predict the next token (word or character) in a string, all it has to do is exactly that: predict the next token in a string. In other words, it needs to model structure; modelling semantics is not required.
Indeed, there exists a wide variety of models that can predict the most likely next token in a string. The simplest of those are n-gram models, which can do this task reasonably well. Maybe what that first sentence above is trying to say is that to predict the next token with good accuracy, modelling of semantics is required, but that is still a great big leap of reasoning. Again, structure is probably sufficient. A very accurate model of structure is still only a model of structure.
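To make the n-gram point concrete, here's a toy bigram model in Python (illustrative only; a real model would need smoothing and a real corpus):

    from collections import Counter, defaultdict

    # Count which token follows which in a tiny toy corpus.
    corpus = "the cat sat on the mat the cat ate".split()
    bigrams = defaultdict(Counter)
    for prev_tok, next_tok in zip(corpus, corpus[1:]):
        bigrams[prev_tok][next_tok] += 1

    def predict_next(token):
        # Most frequent follower - pure co-occurrence counting, no semantics anywhere.
        return bigrams[token].most_common(1)[0][0]

    print(predict_next("the"))  # -> 'cat'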
It's important to consider what we mean when we're talking about modelling language probabilistically. When humans generate (or recognise) speech, we don't do it stochastically, by choosing the most likely utterance from a distribution. Instead, we -very deterministically- say what we want to say. Unfortunately, it is impossible to observe "what we want to say" (i.e. our motivation for emitting an utterance). We are left with observing -and modelling- only what we actually say. The result is models that can capture the structure of utterances, but are completely incapable of generating new language that makes any sense - i.e. they produce gibberish.
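The toy bigram model above shows this in miniature: chain its predictions together and you get locally plausible transitions with no overall intent.

    # Greedy generation from the bigram counts above: each step is locally
    # plausible, but the output as a whole says nothing.
    tok, out = "the", ["the"]
    for _ in range(6):
        tok = predict_next(tok)
        out.append(tok)
    print(" ".join(out))  # -> 'the cat sat on the cat sat'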
It is also worth considering how semantic modelling tasks are evaluated (e.g. machine translation). Basically, a source string is matched to an arbitrary target string meant to capture the source string's intended meaning. "Arbitrary" because there may be an infinite number of strings that carry the same meaning. So what, exactly, are we measuring when we evaluate a model's ability to map between two of those infinite strings, chosen just because we like them best?
Language inference and comprehension benchmarks like the ones noted in the article are particularly egregious in this regard. They are basically classification tasks, where a mapping must be found between a passage and a multiple-choice spread of "correct" labels meant to represent its meaning. It's very hard to see how a model that does well in this sort of task is "incorporating world knowledge", let alone "common sense"!
Maybe NLP will have its ImageNet moment, but that will only be in terms of benchmarks. Don't expect to see machines understanding language and holding reasonable conversations any time soon.