Relevant links for anyone interested:
* spaCy on GitHub: https://github.com/explosion/spacy
* NER demo: https://demos.explosion.ai/displacy-ent/
* Neural coref by HuggingFace: https://huggingface.co/coref/
* Accuracy of built-in spaCy models: https://spacy.io/usage/facts-figures
Last time I calculated, the cheapest way to run spaCy in the cloud was on Google Compute Engine n1-standard preemptible instances. It should be over 100x cheaper per document than using Google's, Amazon's, or Microsoft's cloud NLP APIs. Accuracy will depend on your problem, but if you have your own training data, performance should be similar.
Also, it would be useful to see the Dockerfile or script that generated this image.
Adding a GPU would probably improve the performance of both spaCy and neuralcoref.
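If a GPU is available inside the container, spaCy can be asked to use it with `spacy.prefer_gpu()`; a small sketch (assumes spaCy plus the appropriate CuPy build are installed for GPU support):

```python
import spacy

# prefer_gpu() returns True if a GPU was activated and False otherwise,
# so the same script still runs on CPU-only machines
using_gpu = spacy.prefer_gpu()
print("GPU enabled:", using_gpu)

# Call this before spacy.load() so the pipeline is allocated on the GPU
```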
> Also, it would be useful to see the Dockerfile or script that generated this img.
Will put it on GitHub shortly.
FastText, for example, has pretrained embeddings for 294 languages:
Google's Parsey McParseface handles POS tagging for 53 languages:
The current Docker image does not expose those other languages, but I can expose them in an update if that would help a lot of people.
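For reference, spaCy itself ships language-specific tokenization rules for many languages even without a trained statistical model; a hedged sketch of what exposing them might look like (the sample sentences are just illustrations):

```python
import spacy

texts = {
    "de": "Berlin ist die Hauptstadt von Deutschland.",
    "fr": "Paris est la capitale de la France.",
    "es": "Madrid es la capital de España.",
}

# spacy.blank() builds a pipeline with that language's tokenizer only,
# so no model download is needed; tagging/parsing would require a model
for lang, text in texts.items():
    nlp = spacy.blank(lang)
    doc = nlp(text)
    print(lang, [token.text for token in doc])
```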
"Donald Trump's administration" is not a person.
In the following example, "The currency" is not a subject and "India" is not an object.
I don't know how much useful information is extracted by this system.
(I'm the author of spaCy, not this Docker container.)
I just wish the author of this Docker container had chosen demo sentences that advertised it better.
But "subject" and "object" are used to indicate the subject-verb-object structure of the sentence, not a literal object, right?
Also, that issue is in my code (a poor naming choice). Will put the code up on GitHub soon. Hope that helps.