I still think there's room for a search engine that supports boolean(ish?) operations like AND, OR, NOT and NEAR. Providing links directly to the source and not a redirect to the search engine company would be a really good thing.
A cookie-less search engine would be double plus good.
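For anyone wondering what AND, OR, NOT and NEAR boil down to, here is a minimal sketch over a toy positional index. The documents, function names and the word-distance definition of NEAR are all made up for illustration; a real engine would tokenize, rank and store things very differently.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercased word to {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def docs_with(index, word):
    """Set of doc ids containing the word."""
    return set(index.get(word.lower(), {}))

def near(index, a, b, distance=3):
    """Docs where a and b occur within `distance` words of each other."""
    pos_a = index.get(a.lower(), {})
    pos_b = index.get(b.lower(), {})
    hits = set()
    for doc_id in pos_a.keys() & pos_b.keys():
        if any(abs(i - j) <= distance
               for i in pos_a[doc_id] for j in pos_b[doc_id]):
            hits.add(doc_id)
    return hits

# Toy corpus (hypothetical).
docs = {
    1: "html table tag reference",
    2: "table tennis rules",
    3: "html is a markup language and a table is a grid",
}
index = build_index(docs)

and_result = docs_with(index, "html") & docs_with(index, "table")   # html AND table
or_result = docs_with(index, "tag") | docs_with(index, "tennis")    # tag OR tennis
not_result = docs_with(index, "table") - docs_with(index, "html")   # table NOT html
near_result = near(index, "html", "table", distance=2)              # html NEAR table
```

AND, OR and NOT are just set intersection, union and difference on the posting lists; NEAR is the only one that needs word positions.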
On the contrary, I find Google barely usable these days, with everything covered in layers and layers of horrible bloat, random UX mess of the week and constant pestering about G+.
DDG is nice and clean, has features Google is (still) lacking and just gets me the results. I may resort to other search-engines (like Google) once a week or so, but at this point, there's no way I'm going back to having Google as default.
That's just not worth it.
As for search results: I tried DDG again last week, and was shocked by the poor quality of results compared to google, bing or yandex. Search for "html table tag" and you'll see no w3schools or MDN results. Instead a lot of other, crappier sites. Google, yandex and bing all show w3schools and MDN near the top, so DDG must be filtering these results (which is ironic, given that they say they don't put you in a "filter bubble").
In my own experience, w3schools is the quickest way of getting the information I want, which I guess is why it is at the top of google. I know it isn't cool to like w3schools, but then I'm not a cool programmer.
Seems like a nice passive income project - man pages for HTML / CSS aimed at beginners but with advanced information too.
I'm surprised it hasn't already been done. Perhaps it has and I just don't know the URL.
I find MDN and w3schools about equal, although I do find w3schools is a bit better laid out and quicker to get the info I'm looking for. MDN certainly isn't perfect...I just had a quick look for some html5 stuff (MediaElementAudioSourceNode) and the page is missing on MDN. I find that w3.org is the only place that has full specs for all the html5 stuff, but it's pretty difficult to navigate.
I agree with what you write about w3.org, it's almost painful to navigate when you just need a quick reference.
As for "perfectly usable", mountaindragon (result 2) and codesinhtml.com (result 3) are atrocious. And makemyownwebpage.com (result 10 or so) looks like a homepage from the 90s.
I think the issue is that DDG is getting crappy results from yandex, then filtering out the good sites to make the results even worse. Weird :)
When I can't find something, I'll try a !g, but only rarely does that help any more.
For more technical searches, Google has become next to useless for me. "lupdate.exe not working Qt" transforms to "did you mean 'Update not working'?" (lupdate.exe is part of the Qt translation toolchain and Windows forces it to run as admin only because there is "update" in its file name. The Google 'correction' of that is perfectly useless).
For more technical searches, Google has become next to useless for me.
Just this morning I tried searching for this:
µC/gui button detach
Which is odd, because googling this:
These days I rarely use !g - the results have improved considerably and the bang searches are a very clever way of not requiring me to change search engine to search another site specifically.
Things look fine.
It's a big difference, and these different experiences with Google could just be a matter of which keywords people tend to use (e.g., very general vs. very targeted searches) and what extensions they chose to install in their browser.
(Note that while DDG's results page for 'flowers' is cleaner, the actual results are similarly shit.)
While you might want the wiki page for flower when you search "flowers" I think most people might actually want to order flowers.
If I wanted to buy flowers I'd search "florist" or "online florist". The wiki page for florists is surely of far less interest than the page for flowers.
I suspect the google/ddg results for "flowers" are a result of florists SEO'ing out the ass, not a reflection on what people are actually looking for when they search for things.
The Google result page for "flowers" also includes the wikipedia page for "Flower", btw.
Yeah, but it is farther down the list. I have to search for it.
If they wanted the wikipedia page, they would have performed the search directly on wikipedia.
The question you have to ask is: what is the most likely thing they are looking to do? Is it to learn more about flowers (wikipedia) or is it to buy flowers (regular results with ads + map of nearby flower sellers)?
Why not? Computers are better at memorizing things than people; it shouldn't be a human's job to remember a site URL (and browser bookmarks are less useful than intelligent search engines).
The second most stupid thing is that if you are behind a corporate LAN/VPN you have to type a CAPTCHA to search.
The last stupid thing is its UI design: where's the page cache link? How do I limit the search time range and order results by time?
Plus ducks are my favorite animal, so launching DDG as I begin the work day is a joy!
One example I can remember:
I was working on some linux V4L2 code and wanted to get more information on the "buf_queue". By mistake I searched for "vbuf_queue". Google's results:
It only shows 3 results where I am (and 0 a few months ago when the problem occurred) which makes it pretty obvious I'm not searching for the right thing.
As for DDG:
It displays a pageful of garbage that I will parse for a while until it occurs to me I made a typo.
And it does it for pretty much any bogus query as well, compare:
In this case ddg happily outputs what appears to be misinterpreted binary files.
Maybe DDG works well for non-technical contents but 90% of my queries at work are obscure programming/electronics stuff, component datasheets and the like. For that ddg is simply not usable by my standards.
I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
The "I'm feeling ducky" feature never fails to elicit a smile from me.
(Why correct him? Because I care.)
I switched to DDG a couple years ago after becoming concerned about privacy. Unfortunately, DDG has never quite cut it for me in terms of search results. I do a TON of technology related searches, particularly wrt very obscure programming related technologies, especially related to embedded target dev. DDG always comes up short, and I typically end up doing "!g" searches. So startpage.com appears to be the best blend of privacy + good search results. For me anyway...
Good point, if anyone made a search engine run by koalas I'd have to use it no matter how bad it was
I think the way DDG can gain more traction is motivating top sites to enhance their local search pagerank. It's incredible how Google search restricted to Stack Overflow works better than the Stack Overflow local search engine. I wrote a rant about this at Challenging Google’s Search Engine.
Can you elaborate a little bit about why it's not a distributed solution if sites share their local indexes with the routing search engine?
Yes, sure. I use the concept of distribution in the sense of separating a process within different entities (search router and local searches).
In the distributed case the "search router" queries other sites to determine the best results. For example, searching for code samples involved querying Stack Overflow, Code Project, Forums, etc. This approach is clearly expensive: you rely on the other sites speed, web service availability, etc.
The non distributed approach is just receiving their algorithms and data and processing everything in the router search engine. Obviously this solution can be implemented in a distributed way inside the search engine but it is not distributed in the sense of distributing the process within different entities.
In the two cases you are distributing efforts, one of the key goals of this approach because it is really difficult to compete with Google. Google "knows" how to give good results in diverse areas while in the proposed attack vector you rely on others for part of this optimization. A movie site should know how to give good results about movies while a site related to books knows about books.
It's important to note that the vast majority of search results ends in relatively few sites, so if the top visited sites implement this approach Google search market share can be challenged. Obviously we don't really know if this approach will work in practice until we see it.
I'm very interested in this topic because I've proposed my own solution for how to improve search engines, not from an algorithmic point of view, but from a systemic point of view. And making it fully distributed is the key.
While you mentioned the 2-level search and "receiving their algorithms and data ...", I don't think it's very feasible. Do you agree? So a vertically distributed architecture across various industries is not a feasible solution. But we can build a horizontally distributed architecture which will collect data from geographic locations. In each location, there will be many different verticals. It's a matter of time: if Google cannot find a better solution, search engines will be improved in some way.
Why not? I don't get it.
1) Google quality of indexing doesn't have any competition yet.
2) They can calculate a page rank across different domains
3) No single entity can make the same efforts or is so smart to build a similar thing
If you follow the 2-tier route:
1) Each entity takes responsibility for optimizing the quality of search locally.
2) They know their own domain or they can learn how to optimize their page rank at a local level instead of a global level
So, in the end you have distributed the work of local optimization across different intelligent entities. For example, when you look at the Linux kernel or other open source projects you can count millions of man-hours that are difficult to have if you run a single entity.
I also agree with you about using distributed sites to optimize the results locally. Actually what I proposed is to make the search distributed from both a geographic location and a vertical market point of view, as opposed to the dedicated sites you describe. But they are complementary. The dedicated sites definitely will provide better and more relevant results than a global search engine if Google search was not limited to a particular site.
However, the only thing I don't agree with is when you said it does not have to be distributed, and that the search router can integrate the algorithms from the dedicated sites. I think that's not quite feasible, since it's not possible for Stack Overflow or Wikipedia to share their algorithms with DDG.
Let me know if I misunderstood you. If you'd like to take it offline, I'll be happy to discuss with you via email. See my profile.
Usually, the router search engine queries data from the second tier websites to get high quality results without having other websites' algorithms. Also there is another problem: how do you know which websites to go to for an arbitrary keyword? For example, when a user searches for "cookie" on your search engine, where do you send the query? How do you know if they are looking for a food cookie or a browser cookie?
Regarding how you know where to route a query, it is an issue but not so great in this case. The article doesn't talk about having a two tiered search for every web site. If it has a two tiered search for the top 100 sites that is enough to challenge Google (the main point of the article), and making 100 searches and filtering them in the 2nd tier is not difficult.
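To make the fan-out-and-filter idea concrete, here is a rough sketch of a router that sends the query to every registered site and merges what comes back. Everything in it is hypothetical: the backend functions stand in for each site's local search, the URLs are placeholders, and it assumes each site returns a relevance score that is comparable across sites, which is a big assumption in practice.

```python
from typing import Callable, Dict, List, Tuple

Result = Tuple[float, str]              # (site-local relevance score, URL)
LocalSearch = Callable[[str], List[Result]]

def route(query: str, backends: Dict[str, LocalSearch], limit: int = 5) -> List[str]:
    """Query every backend, skip the ones that fail, merge by score."""
    merged: List[Result] = []
    for name, search in backends.items():
        try:
            merged.extend(search(query))
        except Exception:
            # A slow or unavailable site should not sink the whole query.
            continue
    merged.sort(key=lambda r: r[0], reverse=True)
    return [url for _, url in merged[:limit]]

# Stand-in backends (hypothetical sites, hard-coded answers).
def qa_site(query: str) -> List[Result]:
    return [(0.9, "https://qa.example/q/1")] if "cookie" in query else []

def recipe_site(query: str) -> List[Result]:
    return [(0.7, "https://recipes.example/cookies")] if "cookie" in query else []

def broken_site(query: str) -> List[Result]:
    raise TimeoutError("backend unavailable")

backends = {"qa": qa_site, "recipes": recipe_site, "broken": broken_site}
results = route("cookie", backends)
```

Note that this does not solve the "food cookie or browser cookie" question; it just returns both and leaves the ordering to whatever scores the sites report.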
1. You don't know that.
2. No search engine ever spied on you when they started.
2. You don't know that.
Have they been audited? That could help.
I did blind search engine testing and Google came out on top.
That said Google is frustrating me more and more, especially now you need to "put it in quotes" if you don't want them to just show you things they think might be tangentially related in some way to your search and, you know, show you results with the actual terms in.
It's quite interesting to me that Google's main advantage when I first used it was I didn't need to AND all my terms together, it was the differentiating feature that won me over (I can't recall who from, it was around the time of Teoma and AllTheWeb IIRC). Now even when you use the cumbersome notation to say a term is required it still shows you pages without the term on ... but it nonetheless is the best offering I can find (for me). Argh!
Also: news, and book or Google Scholar.
I'm willing to accept the lack of these most of the time. Yes, privacy matters to me.
Not exactly what I'm after but might be useful, thanks.
* When I want a local result (i.e. I'm googling for business cards, and I want a UK based company that will sell them to me). This is because DuckDuckGo doesn't use your location (by default).
* When I am searching for a word that means something different to me than for the general population. For instance when I search for Python I don't want to see snakes. Google does this by tracking the results you click and forming a profile of you.
I guess the DuckDuckGo way is to search these by !uk and !programming, but that is a whole extra word to type. It would be nice if it let you list your interests in its settings page, creating your own filter bubble.
You opt into one or a few of these personae and each query sent to a search engine would track this persona.
The key is that these categories should be few and predefined (i.e. offered by the search engine) so that they cannot be used to track individual users.
I've noticed trying to find news stories that are not in wide circulation (small events in foreign countries or not-so-wide-spread announcements) are much harder to find on DDG.
Having said that, I try to use DDG FIRST, then move on to other engines if it will not provide. It does feel like a sacrifice from Google; I guess they know me real real well.
DDG is especially terrible if you are used to localized versions of Google. I was used to "Swedish Google", and using a localized (Swedish) DDG was totally unusable.
And so I found Startpage. Works better than the real deal for me. There are a few differences though. A search operator like site: is named host: as an example.
That's usually my first fallback before going !g
How can you be so sure of that? A few months ago, pretty much anyone would say "but the NSA don't spy on you". From a paranoid (i.e. security) point of view, DDG is not better than Google et al.
Now, I do believe they are more respectful of everyone's privacy, but there's absolutely nothing more than words to back this.
i've started using it as my main engine a while back but i absolutely have to go back to google fairly often. i'm using pentadactyl so i never noticed the !g !gi etc. commands. thanks!
It is not !g typed when you are not focused on the query box.
I keep leaving DDG for the not-always-super-relevant results, and I keep coming back for the privacy, clean interface, and !bang syntax.
I love that using it means my searches are private by default. But the real killer feature is !bang. I find that it's faster to do a Google image search using DDG with "!gi" than to search in Google and then move my hand to the mouse and click "Images".
For example, after setting up once, I can press "Ctrl+L" then type "gi" then "Tab" and start typing my search keyword. I don't even have to first go to Google or DDG.
There's a bang for almost every site I frequent, and when a site changes its domain or search URI the bang is updated to reflect it almost immediately with no work on my end.
After a few weeks of acquainting myself with the bang system, I started to see web search from a completely different perspective. I think of a search engine now as less of an "everything index" and more of an "index of contextual searches". My mind, instead of just thinking "I'll google it", thinks "I'll choose a context for it".
Google is like a system-wide grep whose output is altered by advertising, and DDG w/ bangs is like a vast collection of commands piped into a grep.
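If you want to see how little machinery the "choose a context" idea needs, here is a toy bang resolver. This is only a sketch, not how DDG implements it: the bang table is a tiny hand-picked sample and the URL templates are my assumptions about each site's search URL.

```python
from urllib.parse import quote_plus

# Hypothetical bang table; the URL templates are assumptions for illustration.
BANGS = {
    "w": "https://en.wikipedia.org/w/index.php?search={}",
    "hn": "https://hn.algolia.com/?q={}",
}
DEFAULT = "https://duckduckgo.com/?q={}"

def resolve(query: str) -> str:
    """Return the URL to visit: a site search if a known !bang is present."""
    words = query.split()
    for i, word in enumerate(words):
        key = word[1:].lower()
        if word.startswith("!") and key in BANGS:
            # Drop the bang itself and search the chosen site for the rest.
            rest = " ".join(words[:i] + words[i + 1:])
            return BANGS[key].format(quote_plus(rest))
    # No known bang: fall back to a general search.
    return DEFAULT.format(quote_plus(query))

wiki_url = resolve("!w flower")
hn_url = resolve("html table !hn")
plain_url = resolve("start a garden")
```

The bang can sit anywhere in the query, and an unknown one such as "!zzz" simply falls through to the general search with the text left intact.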
Corporations in data monitoring projects, OK.
I'll ignore the entire filter bubble issue and get right down to the privacy implications. Often, when people use search engines or other related websites (reddit search) to look up all kinds of information that in single snippets would probably be meaningless to most people, in aggregate it can paint an entire picture about that person, their interests, their computer activity and location through IP logs, and I would even venture to say we aren't too far off from a psychologist doing a person's mental profile from their search history in a court case, or even textual analysis of writing style to prove a person wrote something. (dangerous implications)
Google and MS are concerned almost entirely with our internet activity, as opposed to your claim. It is the core data metric of what makes them their shit-tons of money. More google than MS, but they are making huge moves into advertising (I have been doing SEO research for my company), and they are increasingly involved in politics of a questionable nature which include the NSA, the State Department, the CIA, and others.
So no, we will not stop talking about privacy, and if your argument is that privacy is dead, then at least skip the many times proven bad "if you have nothing to hide" implied argument you make.
On the other hand the fact that a specific person within those companies is not tracking your web activity doesn't make the privacy problem go away.
Nevertheless, I just find it silly when somebody alludes that some actual person (e.g. Bill Gates, Sergey Brin) might actually track your specific searches; although I know nobody actually believes it, it's just a figure of speech, just to give a human body to our fears. That's the problem: corporations aren't humans, yet they have a life of their own.
EDIT: thanks to acheron for the name of the figure of speech I was mentioning. My point is that this personification is misleading and causes endless discussions about what can be expected by this or that company, and what you can expect from their employees etc.
The concept here is https://en.wikipedia.org/wiki/Synecdoche
If you could fit an entire Google data center into your mobile phone, maybe this kind of digital personal assistant could be taken offline, but for the current state of technology, big data requires the cloud. Even in the 24th Century, the Enterprise Computer is a centralized data store which tracks queries made from com-badges.
No. Another way is to make their behavior more predictable. If I type "Chinese" into the search box, Google may tell me about the Chinese language, or it may send me some ads for local take-out places. Either it guesses which I want, or it makes it easy for me to specify what I'm asking; I know which I'd prefer.
If I'm having a conversation with you about learning foreign languages, and then I ask you about Chinese, I expect you to know I'm not talking about food.
Having to overspecify a query that could be determined from context is particularly annoying on mobile, or via voice.
Sorry, it sucks.
Just because you have fifty different tracking, advertising, retargeting, etc. scripts doesn't mean you should use them all at once on your website.
In at least four cases, Barksdale spied on minors' Google accounts without their consent, according to a source close to the incidents. In an incident this spring involving a 15-year-old boy who he'd befriended, Barksdale tapped into call logs from Google Voice, Google's Internet phone service, after the boy refused to tell him the name of his new girlfriend, according to our source. After accessing the kid's account to retrieve her name and phone number, Barksdale then taunted the boy and threatened to call her.
In other cases involving teens of both sexes, Barksdale exhibited a similar pattern of aggressively violating others' privacy, according to our source. He accessed contact lists and chat transcripts, and in one case quoted from an IM that he'd looked up behind the person's back. (He later apologized to one for retrieving the information without her knowledge.) In another incident, Barksdale unblocked himself from a Gtalk buddy list even though the teen in question had taken steps to cut communications with the Google engineer.
results than compromise privacy.
It is not so much that I am not willing to share stuff (even with the government). I used to run a small SaaS for a specific legal industry, and I was subpoenaed by the attorney general's office. So I am well aware of the process and I do think in some cases, governments do need access to our data to ensure security.
The difference is to ask for permission (court order and transparent due procedure) and have transparency. Maybe the big difference is the exercise of Power instead of Force.
And power is a word with many meanings, so to be rigorous, this is what I meant by power:
"Power means pretty much the same thing as freedom. Power is a thing that everybody wants the most they can possibly have of. That is, skiing is power, sex appeal is power, the ability to make yourself heard by your congressman is power. Anything that comes out of you and goes out into the world is power and in addition to that, the ablity to be open, to appreciate, to receive love, to respond to others, to listen to music, to understand literature, all of that is power. By "power" I mean human faculties exercised to the largest possible degree. So, in a way, in a large sense, by power I mean individual intelligence. Now when you reach out to another person through the energy or creativity that is in you and that other person responds, you are exercising power. When you make somebody else do something against their will, to me that is not power at all, that is force, and force to me is the negation of power." - Charles Reich
And Free Software (Free as in freedom) is a good real world exercise in power.
"Searches Google and !bangs DuckDuckGo."
What does this mean? It's the equivalent of a DDG !g search? Isn't that exactly the same as just searching Google?
This gives you a single search bar for all documentation, which is amazing.
Thanks for sharing.
or just search ddg for !bang
seriously, I don't know how I would function without !python, !pypi, !w, !gmap, and !hn
Now I use DuckDuckGo as my home page and it's my only search engine. And the search results just keep getting better and better.
I don't recall many doodles since, as I'm not using Google much and rarely from the homepage even then.
I found that a lot of times when I was looking for general advice from other people (recent examples: how to treat dandruff, how to start a garden) I actually get really well thought out answers from real people who have done the same things. And I know they're not trying to sell me anything or just get page views.
Doing those same searches on google just returns vapid SEO filled articles from ehow and wikihow, and those kind of places, pretty worthless.
site:reddit.com start a garden
I switched my default search engine to Duck Duck Go a few years ago mainly because of that interaction. I still need to fall back on the g! shortcut a lot, but search quality has improved quite a bit over the years. I like supporting a hometown startup that exists outside the Bay Area bubble.
Another issue that kept coming up for me was their lack of keyboard controls on search result pages. Although I would speculate this isn't really as relevant to most users.
I type in "go blahblahblah" a whole lot more nowadays.
Is this like how foreign words get remapped into Japanese syllables? Fast Company tech writers only speak Apple? chuckles
If their data sources ever cut them off, it's over.
They need to build their own crawlers like gigablast.
Unfortunately the opportunity for that is pretty meek. Many webmasters block crawlers that aren't the top search engines. :-(
Oddly enough we have blocked legit googlebot/bing/baidu servers, because they fail to properly configure their servers...
However, just to put DuckDuckGo's 4 Mqpd into a broader context:
Google 2013: 5.9 Gqpd
Bing 2012: 3.1 Gqpd
Not to put DuckDuckGo down or anything, but it's important to understand what a >1000x difference in scale means, in terms of operating costs and scalability issues. For example, for a single person it is very easy to get hold of $10k but it's extremely difficult to get $10M. (I'm not actually interested in money, it's just an example measurement unit, and particular range, that people are familiar with).
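For the record, the ratios work out like this, taking the queries-per-day figures quoted above at face value (they are from different years, so treat this as rough):

```python
# Queries per day, as quoted in the comment above (different years, so rough).
ddg_qpd = 4_000_000            # DuckDuckGo: 4 Mqpd
google_qpd = 5_900_000_000     # Google 2013: 5.9 Gqpd
bing_qpd = 3_100_000_000       # Bing 2012: 3.1 Gqpd

google_ratio = google_qpd / ddg_qpd
bing_ratio = bing_qpd / ddg_qpd

print(f"Google handles about {google_ratio:.0f}x DuckDuckGo's volume")
print(f"Bing handles about {bing_ratio:.0f}x DuckDuckGo's volume")
```

So the ">1000x" holds against Google (about 1475x), while Bing comes out nearer 775x on these numbers.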
It's incredible how much you can accomplish with very very few focussed makers.
We created our own rack mounted HSM, our own hardware POS payment terminals, and all the payment web structure. We also answered support and automated the supply chain.
That said, I hope they grow strong so I can use their engine.
All in all, I'm happy with it. This after sixteen years of Google use.
You can compile Chromium with the tracking flags off, and then search in incognito.
I need motivation.
Tip from experience: Really think about logistics before diving in. Racks (even quarter racks) are heavy!! Data center equipment is also noisy. You eventually also hit limits on household electric circuits. It is fun though :)
Their algorithm isn't necessarily better, but if you were to just give it a try, you'll notice their 'Instant Answers' section at the top of a search that usually gets you exactly what you want without having to click on a single result (much like how certain search queries on Google will return you an instant answer).
There are other much better reasons not to prefer google as your search company though, foremost among them that they sell advertising and are no longer in the growth stage but the monopoly stage of the corporate lifecycle, so their incentives are not really aligned with free search customers. Their dominance of web search is starting to seriously distort the web.
You would think they would, in order to make the service better.
The less self-aware majority still have nothing to hide, and the other side is using Google via proxy.
His past isn't exactly one I'd look for if I wanted a trustworthy search operator, I think.
I don't like that when I press tab I go to the search field; I would like to go to the first link. Also, when I type in YouTube, the first link is the Wikipedia entry.
(edit to add) I read somewhere that the first Google home page just had the search box on, and people used to wait because they thought it hadn't finished loading. So they added the copyright notice as a footer just so people would realise it had.
Edit: it appears to be a "karma balancer" that docks points in a thread if it measures participation as unbalanced. That'd be my guess.