Hacker News


> DuckDuckGo gets its results from over four hundred sources. These include hundreds of vertical sources delivering niche Instant Answers, DuckDuckBot (our crawler) and crowd-sourced sites (like Wikipedia, stored in our answer indexes). We also of course have more traditional links in the search results, which we also source from a variety of partners, including Oath (formerly Yahoo) and Bing.


No? Your quote basically confirms it. All organic search results are from Oath and Bing. The other 400 sources are just for fluff like widgets.

It's worse.

AFAIK Oath / Yahoo has been using Bing under the hood since 2009: https://www.nytimes.com/2009/07/30/technology/companies/30so...

Huh, the last time I tried DDG back in 2014 or so, all the search results came from Yandex, which really put me off of it.

Having said that, Bing is no replacement for DDG.

How does that work with their privacy stance? Do Yahoo/Bing get to keep and use that search data and it's just anonymized, or does DDG pay to keep it untracked?

Kind of disheartening regardless. I assumed they had their own scrappy, independent tech stack.

Creating your own search engine in today's world is pretty much impossible.

For one thing, loads of sites load all their content via Ajax, so at a minimum you're gonna need a browser engine as the base of your crawler...

Headless Chrome is available and widely used, and you can usually get around the JS thing by simply waiting a few seconds before scraping. I'd assume the crawling itself isn't the hard part (aside from maybe just the raw compute time it takes).
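
For what it's worth, a rough sketch of that "render, wait, then scrape" idea in Python, shelling out to headless Chrome. The binary name `google-chrome`, the helper names, and the 5-second default are all assumptions for illustration; `--virtual-time-budget` only approximates a fixed wait by giving the page's timers and pending loads a time budget before the DOM is dumped.

```python
import shutil
import subprocess


def build_dump_dom_command(url, chrome_binary="google-chrome", wait_ms=5000):
    """Build a headless Chrome command line that prints the rendered DOM of `url`.

    `chrome_binary` is an assumed name; it differs per platform and install.
    """
    return [
        chrome_binary,
        "--headless",
        "--disable-gpu",
        # Let scripts, timers and pending loads run for up to wait_ms of
        # virtual time before the DOM is serialized.
        f"--virtual-time-budget={wait_ms}",
        # Print the serialized DOM (after scripts ran) to stdout.
        "--dump-dom",
        url,
    ]


def fetch_rendered_html(url, chrome_binary="google-chrome", wait_ms=5000, timeout_s=60):
    """Return the post-JavaScript HTML of `url` as a string."""
    if shutil.which(chrome_binary) is None:
        raise FileNotFoundError(f"{chrome_binary!r} not found on PATH")
    result = subprocess.run(
        build_dump_dom_command(url, chrome_binary, wait_ms),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError if Chrome exits non-zero
    )
    return result.stdout
```

Calling `fetch_rendered_html("https://example.com")` should return the HTML after scripts have run, provided Chrome is installed under that name. It spawns one browser process per page, which is where the raw compute cost comes from.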
