I assume there's some low latency twitter API that you can pay big money to access? And then I guess that goes to some array of GPUs to run the NLP model as quickly as possible? And the NLP results get traded on by placing orders with an exchange?
Are there companies that roll up many low latency data feeds for purposes like this?
>some low latency twitter API that you can pay big money to access
Yes. The Twitter firehose is the big one, where you get a stream of _all_ tweets. Twitter doesn't publish the price, but one source claims $360k/y for half of the hose. https://econsultancy.com/want-your-own-twitter-firehose-you-...
>And then I guess that goes to some array of GPUs to run the NLP model as quickly as possible?
Generally speaking, trained models usually aren't so big that you need a cluster of GPUs, or even a single one. If the model is small enough, running it on a CPU often makes more sense because it's just easier and cheaper.
In particular, basic NLP and sentiment analysis are pretty easy, especially in a case like this where you're dealing with a very limited vocabulary. A free CPU-based library like https://github.com/cjhutto/vaderSentiment should be good enough.
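To give a feel for how little compute this kind of scoring needs, here is a minimal sketch of a lexicon-based scorer in plain Python. It is a toy stand-in, not VADER's actual algorithm or API: the `LEXICON` words, their valence numbers, the `NEGATIONS` set and the `sentiment_score` function are all made up for illustration.

```python
# Toy lexicon-based sentiment scorer (illustrative only; NOT the VADER algorithm).
import re

# Hypothetical mini-lexicon: word -> valence. The numbers are invented for this example.
LEXICON = {
    "great": 2.0,
    "tremendous": 2.0,
    "winning": 1.5,
    "good": 1.0,
    "bad": -1.0,
    "unfair": -1.5,
    "tariffs": -1.5,
    "disaster": -2.0,
}

# Words that flip the sign of the lexicon word immediately following them.
NEGATIONS = {"not", "no", "never"}


def sentiment_score(text: str) -> float:
    """Return the mean valence of lexicon words in `text` (0.0 if none match)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    hits = 0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        valence = LEXICON[tok]
        # Flip the sign when the previous token is a negation ("not good").
        if i > 0 and tokens[i - 1] in NEGATIONS:
            valence = -valence
        total += valence
        hits += 1
    return total / hits if hits else 0.0


if __name__ == "__main__":
    for tweet in ("Tariffs are a disaster", "Great, tremendous winning!", "not good"):
        print(tweet, "->", sentiment_score(tweet))
```

With a vocabulary this small the whole thing is a dictionary lookup per word, which is why a CPU is plenty. A real library handles punctuation, intensifiers and so on, but the cost profile is similar.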
> And the NLP results get traded on by placing orders with an exchange?
Yes. Doing this quickly is pretty tricky, but financial institutions like JPMorgan Chase would have access to an extremely fast and low-cost (per trade) API to do this. And they probably already have the tech to strategically place big orders, so the other bots don't realise what they're doing. Or they just use a private exchange, aka a dark pool: https://en.wikipedia.org/wiki/Dark_pool
In this specific case you only need access to Trump's tweets, which you can get without parsing and evaluating the firehose (which is expensive to get and process).
> In particular, basic NLP and sentiment analysis are pretty easy, especially in a case like this where you're dealing with a very limited vocabulary.
Exactly. With Trump's tweets you can probably find easy correlations between sentiment (good or bad) and a select range of words, as well as tweet times, whether he is in office or not, and whether it was tweeted from his Android or not. Statistically speaking you can also look at his overall sentiment. E.g. simply shorting the market whenever he tweets might be the profitable move 80% of the time.
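A rough sketch of how you might check a "profitable X% of the time" claim like that, assuming you already have the market return over some window after each tweet. The function names and the sample numbers below are placeholders for illustration, not real market data.

```python
# Sketch: evaluate an "always short after a tweet" rule on a list of returns.
from typing import Sequence


def short_hit_rate(returns_after_tweet: Sequence[float]) -> float:
    """Fraction of events where a short would have made money (return < 0)."""
    if not returns_after_tweet:
        raise ValueError("need at least one observation")
    wins = sum(1 for r in returns_after_tweet if r < 0)
    return wins / len(returns_after_tweet)


def mean_short_return(returns_after_tweet: Sequence[float]) -> float:
    """Average return of the short position (the negative of the market return)."""
    if not returns_after_tweet:
        raise ValueError("need at least one observation")
    return -sum(returns_after_tweet) / len(returns_after_tweet)


# PLACEHOLDER numbers, invented purely to exercise the functions.
SAMPLE_RETURNS = [-0.002, -0.001, 0.003, -0.004, -0.0005]

if __name__ == "__main__":
    print("hit rate:", short_hit_rate(SAMPLE_RETURNS))
    print("mean short return:", mean_short_return(SAMPLE_RETURNS))
```

The reason to look at both numbers: a rule can be right 80% of the time and still lose money if the 20% of misses are large, so the hit rate on its own doesn't tell you whether the trade is worth doing.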
Also the computation needs to be transparent and reproducible. I didn't read the spec, but I doubt that this is a complex NLP model.
Technically, some people are already using a super-fast Twitter API for sentiment analysis, so they probably do it the way you say, but they're on the buy side, not the bank side.
EDIT: basically the purpose of this index is to hedge your vol exposure to Trump's tweets. JPMorgan will sell you a swap that will pay more or less depending on what Trump says. They'll have a fairly large bid/ask spread because this is in fact quite random, but statistically, they found that this product may make sense for some people.
One of the markers JPM uses is likes and retweets, so I don't even think this is calculated with low latency.
E.g. buy Papa John's Pizza on July 11th (when the CEO got fired for using the N-word). It's up $6 since then.