I think it's very difficult to build a generalized model that predicts stock market movements from news (outside very specific events: drugs passing trials, buyouts, etc.). It may be possible to build specific models that track the likelihood of particular events and then map those to movements, though.
My product works on prioritising news for human analysts instead. This is much more practical and fits better with many companies' existing work practices.
The problem is there are so many similar products out there, and it's hard to even sift through them.
This is such an issue that there are entire companies dedicated to reporting on the quality of other data products.
Eagle Alpha is one of them, for instance. They have something like 10,000 different data products in their index. Can you imagine what it would be like to even browse the websites of 10k providers? Let alone talk to them about their offerings?
I know your service is news/sentiment related, and some of these look at credit cards, satellite photos, foot traffic, and so on. But the consumer just wants to find a signal that will make money, so in essence you're competing with those other techniques as well.
How did you solve the challenges of getting/collecting news data?
There are many challenges. What specifically are you talking about?
Newswire data is pretty easy to get and moderately useful. Stock market announcements are another good source.
The TL;DR is there's a difference of opinion about how to use EDNS while protecting visitors' privacy. They'll get it hammered out. Until they do, you can just do a few host lookups with another name server and stick the results in your /etc/hosts (or wherever you keep static names). For the East-coast US, I have
18.104.22.168 (CloudFlare) respects archive.is's agency and doesn't try to work around it
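The workaround above can be sketched in shell. Note the resolver address (8.8.8.8) and the use of `dig` are my assumptions for illustration, not details from the thread:

```shell
#!/bin/sh
# Resolve archive.is through an alternate resolver (8.8.8.8 is an assumed
# example; any resolver other than 1.1.1.1 should work), then print a line
# suitable for appending to /etc/hosts.
ip="$(dig +short archive.is @8.8.8.8 | head -n 1)"
printf '%s archive.is\n' "$ip"
```

Appending the printed line to /etc/hosts pins the name locally so the lookup never reaches 1.1.1.1; worth removing once the dispute settles, since a pinned address can go stale.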
But they do have an ad for a company seeking more funding that might do it. They even have this Economist article on their front page: https://www.arkera.ai/
Bit of a fail for economist.com.