Microsoft open sources code to give any app Bing-like intelligence (arstechnica.com)
119 points by JaimeThompson 4 days ago | 12 comments

I work on the team that built this and much of our other ranking tech. I can answer any questions people have.

Also shameless plug: using this tech we released some artificial search sessions as an exploratory dataset. https://github.com/dfcf93/MSMARCO/tree/master/Conversational...

Can you explain how this is different from a regular word2vec NLP algorithm?

Honestly, it's not. There is a paper in SIGIR 2019 about how it was trained and how it works, called 'Generic Intent Representation in Web Search'. The gist of it is: if a document has a sat (satisfied) click from multiple queries, make the vectors for those queries closer together. Do this over a few billion URLs and documents and you can make a vector representation for any query.
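
For anyone who wants to see the "pull co-clicked queries together" idea concretely, here is a minimal toy sketch. It is not the method from the paper: the click log, the function names (`co_click_pairs`, `train`), and the plain squared-distance update are all assumptions made up for illustration. A real system would presumably use a learned text encoder and negative examples; without negatives, this toy version would eventually collapse every connected query onto a single point.

```python
import itertools
import math
import random
from collections import defaultdict


def co_click_pairs(click_log):
    """Return sorted pairs of distinct queries that share a clicked URL.

    click_log is a list of (query, url) tuples; treating each entry as a
    satisfied click is an assumption of this sketch.
    """
    by_url = defaultdict(set)
    for query, url in click_log:
        by_url[url].add(query)
    pairs = set()
    for queries in by_url.values():
        for a, b in itertools.combinations(sorted(queries), 2):
            pairs.add((a, b))
    return sorted(pairs)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def train(click_log, dim=8, steps=50, lr=0.1, seed=0):
    """Give each query a random vector, then nudge co-clicked queries together.

    Each update is a gradient step on the squared distance between the two
    vectors of a co-clicked pair (hypothetical objective, chosen for brevity).
    """
    rng = random.Random(seed)
    queries = sorted({q for q, _ in click_log})
    vecs = {q: [rng.uniform(-1, 1) for _ in range(dim)] for q in queries}
    pairs = co_click_pairs(click_log)
    for _ in range(steps):
        for a, b in pairs:
            va, vb = vecs[a], vecs[b]
            for i in range(dim):
                diff = va[i] - vb[i]
                va[i] -= lr * diff  # move a toward b
                vb[i] += lr * diff  # move b toward a
    return vecs


# Made-up click log: two differently worded queries clicked the same URL.
log = [
    ("cheap flights", "example.com/flights"),
    ("low cost airfare", "example.com/flights"),
    ("python tutorial", "example.org/python"),
]
vecs = train(log)
```

After training, `cosine(vecs["cheap flights"], vecs["low cost airfare"])` should be close to 1, while "python tutorial" keeps its unrelated random vector, since nothing links it to the other two queries.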

Thanks for the response! I'll definitely dig into it some more.

Awesome. Thanks for also releasing the data set.

Any IP encumbrances we should be aware of?

For the dataset: The MS MARCO datasets are intended for non-commercial research purposes only, to promote advancement in the field of artificial intelligence and related areas, and are made available free of charge without extending any license or other intellectual property rights. The dataset is provided “as is” without warranty, and usage of the data has risks since we may not own the underlying rights in the documents. We are not liable for any damages related to use of the dataset. Feedback is voluntarily given and can be used as we see fit. Upon violation of any of these terms, your rights to use the dataset will end automatically.

Original article on Microsoft's AI Blog: https://blogs.microsoft.com/ai/bing-vector-search/

This is probably a smart strategy for Microsoft: if they can make search into a generic commodity, it cuts the legs out from under Google.

There's a barrier to entry in that users have been trained by Google. Specifically, they've run many searches with Google, and over time the hit/miss feedback has trained them to write queries that produce useful results on Google.

Anyone trying to implement search then either has to be competitive with Google at handling Google-friendly queries, which is probably impossible, or somehow train users.

Thus, offering Bing searches as "good enough" and free to many companies makes a lot of sense.

It would be interesting to see if people adjust how they search on Google vs. Bing, even if unconsciously.

A small point, but I noticed that they listed Linux build instructions before Windows in the README. Yes, it's irrelevant (they are in alphabetical order after all) but MS of a few years back would have been much pettier about something like that.
