Today, Google accounts for 92% of all Internet searches. Google and Bing together occupy 96% of the search market. Most other search engines, including the popular privacy-focused ones, simply get their search results from Google or Bing and reorder them, because developing a search engine is neither easy nor cost-effective. Consequently, a few big tech companies control what people see.
Right Dao is different. We are a fully independent search engine: we run our own infrastructure and have built the technology from the ground up. That enables us to show search results free from the search engine monopolies' manipulation.
We invite you to give it a try while we keep improving the quality and adding more features.
FAQ: open source?
We think a search engine's code is a bit sensitive in general. The engine depends on many subsystems, including storage, scheduling, indexing, etc. If spammers could see the ranking code, they could push their spam websites up the results. We are not sure if there's a viable way to open source a search engine.
What I mean is: instead of indexing the entire internet with an ad hoc ranking and trying to guess what I want, let me whitelist and blacklist domains and let me configure the ranking. I'd begin with Stack Overflow, Hacker News and arXiv, and probably blacklist Pinterest and other paywalled gardens.
From time to time your search engine could suggest search results from other sources, so I could update my whitelist.
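As a rough illustration of the whitelist/blacklist idea, here is a minimal sketch in Python. Everything in it is hypothetical: the function names, the `(url, score)` result format and the `boost` factor are assumptions made for the example, not anything Right Dao is known to implement.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def matches(host, domains):
    """True if host equals, or is a subdomain of, any listed domain."""
    return any(host == d or host.endswith("." + d) for d in domains)


def personalize(results, allow, block, boost=2.0):
    """Filter and re-rank (url, score) pairs with user-supplied lists.

    Blocked domains are dropped, allowed domains get their score
    multiplied by `boost` (an arbitrary, assumed factor), and anything
    else is kept unboosted so it can still surface as a suggestion.
    """
    kept = []
    for url, score in results:
        host = domain_of(url)
        if matches(host, block):
            continue
        if matches(host, allow):
            score *= boost
        kept.append((url, score))
    return sorted(kept, key=lambda r: r[1], reverse=True)


results = [
    ("https://www.pinterest.com/pin/1", 0.9),
    ("https://stackoverflow.com/q/1", 0.5),
    ("https://example.org/a", 0.8),
]
ranked = personalize(
    results,
    allow={"stackoverflow.com", "news.ycombinator.com", "arxiv.org"},
    block={"pinterest.com"},
)
```

Results from domains on neither list stay in the output at their original score, which is one simple way to get the "suggestions from other sources" behaviour.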
It could index PDF files or even show their summary with some ML model, perhaps for an additional fee.
Another idea that's been bothering me for a while is searching for movies or songs. If you figure out how to show me the most interesting (according to my filters) movies in 2020, I'd pay for that. Even more so for music.
That's half the reason I don't use Google right there and they pretend to follow European laws. So, yeah, right, no thanks.
It's only if you start storing those that you have some rules to follow. Nowadays, it's the same if you are in California with the recent data protection laws.
Also, Right Dao is under New York law, so it has to follow US law, I guess.
So, how do I do business with people?
It's not about not having information, it's about having consent before acquiring it.
You search for something, and we show you what we hope is useful for you as the consumer. We won't be overly smart and show you information we think might be more fitting or relevant because we tracked you down to live near a super-potent AdWords customer and therefore rank his results higher in your search results.
Is it "English only"?
I just searched on it for:
CR2032 "due fili"
("due fili" means "two wires" in Italian)
i.e. the button battery for RTC/CMOS on portables
And the results are essentially "completely random" pages containing neither CR2032 nor "due fili"; most notably, the second result is:
and third is:
Are you independent of VCs? of governments?
Most often this is just a cyberpunky way of trying to avoid either tax or legal liability, but I'm not sure that either necessarily applies here. It would be interesting to know how the project is structured and funded.
Long-term, a combination of theirs and your own could be optimal.
There are strengths and weaknesses to using their dumps. On the one hand, they have already done the crawling and dealt with being throttled, etc. They offer monthly dumps for general content and daily dumps for news.
On the other hand, it's a huge pile of data to wade through, and their index format might not be your preferred method. The archive and index reside officially at AWS, so that may decide where to process it. (Not sure whether other providers maintain a copy as well or not.)
By "huge", specifically:
> October 2020 [...] contains 2.71 billion web pages or 280 TiB of uncompressed content.
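To put those two quoted figures in perspective, here is a back-of-the-envelope calculation in Python. The page count and size come from the quote above; the 1 Gbit/s link speed is purely an assumed number for illustration, and the estimate uses the uncompressed size, so treat it as a rough upper bound rather than a measurement.

```python
TIB = 2 ** 40                 # bytes in one tebibyte

PAGES = 2.71e9                # web pages in the quoted October 2020 crawl
TOTAL_BYTES = 280 * TIB       # quoted uncompressed size


def avg_page_kib(total_bytes=TOTAL_BYTES, pages=PAGES):
    """Average uncompressed size per page, in KiB."""
    return total_bytes / pages / 1024


def transfer_days(total_bytes=TOTAL_BYTES, link_gbit_per_s=1.0):
    """Days needed to move the data over a link of the given speed.

    link_gbit_per_s is a hypothetical sustained throughput, not a
    figure from the thread.
    """
    seconds = total_bytes * 8 / (link_gbit_per_s * 1e9)
    return seconds / 86_400


print(f"average page: {avg_page_kib():.1f} KiB")
print(f"one pass at 1 Gbit/s: {transfer_days():.1f} days")
```

Under these assumptions a single monthly dump works out to roughly a hundred KiB per page and about a month of sustained transfer, which is part of why processing it close to where it is hosted is attractive.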
From our analysis a few years ago, that was to be the approach for the now-defunct Snagz.net (which never fully launched because co-founders were unable to join due to extenuating circumstances).
 https://commoncrawl.org/2016/10/news-dataset-available/ - this one can be hard to find unless you know to look for it
My big question: What's behind the name? I find it a bit confusing and not very memorable at first sight, maybe an explanation would help.
If I search for my name like this: "kevin whitefoot", the first 27 hits on Google are directly relevant, and my name appears in the link or in the extracted text. Right Dao, on the other hand, returns a list where most of the hits do not include my name as quoted, just the two words separately, which means those hits are completely irrelevant, as they refer to completely different people.
So how does one search for a person by name?
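The gap described here is the difference between matching all query words anywhere in a page and matching them as an adjacent, ordered phrase. A minimal Python sketch of the two checks follows; the tokenizer and function names are assumptions for illustration and say nothing about how Right Dao or Google actually handle quoted queries.

```python
import re


def tokens(text):
    """Lowercase word tokens; a deliberately naive tokenizer."""
    return re.findall(r"\w+", text.lower())


def contains_all_words(text, query):
    """True if every query word occurs somewhere in the text."""
    words = set(tokens(text))
    return all(w in words for w in tokens(query))


def contains_phrase(text, phrase):
    """True if the phrase's words occur adjacent and in order."""
    doc, q = tokens(text), tokens(phrase)
    n = len(q)
    return n > 0 and any(doc[i:i + n] == q for i in range(len(doc) - n + 1))


same_person = "Profile of Kevin Whitefoot, engineer"
other_people = "Kevin Smith walked past the Whitefoot farm"

# Both pages contain both words, but only the first contains the phrase.
print(contains_all_words(other_people, "kevin whitefoot"))
print(contains_phrase(other_people, "kevin whitefoot"))
print(contains_phrase(same_person, "kevin whitefoot"))
```

A quoted query would be expected to use the second check, so pages like `other_people` drop out of the results.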
However, our news search is updated every hour, and Salesforce Hyperforce has new results.
First searches have been relevant and also refreshingly different from the big ones. Battle-testing will tell.
Disclosure: Mojeek team member
Happy to add Right Dao but did not find it on Twitter.
Or is there proprietary tech behind this?