Anecdote 2: I have a close friend who is a high school teacher. She's not all that tech savvy, so I help her out sometimes. Google is damn near useless when it comes to helping her develop her courses. Not exaggerating at all... maybe 60, 70... 80 percent of the results for educational queries are SEO spam. Every result is essentially, "want to learn how to write a monologue? Click here to pay $20/month for the privilege".
It's gotten so bad that I jokingly tell her that it would be more efficient to just walk down to the library and take out a book on whatever topic she's querying. But, you know, they do say that every joke has an element of truth to it...
I doubt the declining quality of search results is a product of Google's own advertising business, as the article would have us believe; it's mostly a product of third-party SEO spam. This outcome was also wholly predictable. Part of the appeal of Google to early adopters was the lack of the SEO spam that destroyed the utility of earlier search engines. I recall conversations, back when Google was so new it still eschewed graphical ads, that foresaw this outcome, though our predictions may have been a bit on the pessimistic side (i.e. Google held up against SEO longer than we thought it would).
My theory for why this isn't happening: I'd guess that most of these SEO-spam sites actually serve Google-funneled ads, which means there's something wrong there too...
Blacklists are also problematic. I have an email account with a smaller provider that I never bother to use since there is a good chance that their servers are blacklisted at any given point in time. Getting their server removed from these lists is non-trivial in most cases. Once they are removed, they usually get relisted within a few months. The problem also runs in the opposite direction: I have corporate email accounts where every external email (including those from certain departments of their own organization) is labelled as such and as potentially suspicious simply because blacklists are not sufficient. The only reliable outcome of blacklisting is the reduced reliability of communications channels.
User reported abuse is even more problematic since it opens up avenues for abuse. While it may be relatively easy to filter out bad reports in situations where there is a minimal vested interest (e.g. finding something disagreeable), that won't be the case when there is a considerable vested interest (e.g. attacking the competition).
I think this is instead an active choice on the part of Google. When the options are "show someone's personal blog as the top result" or "show some company as the top result", they always choose the latter because a company is somehow believed to be more trustworthy.
Also, the AI you speak of can equally be used to produce content that looks real but is not. It's even possible to produce articles where it's impossible to determine whether a human wrote them.
This is a cat and mouse game that won't end. There's too much money on the line.
Those badass AIs should be able to use their image-recognition powers to recognize the products I put on the counter and calculate the price. It seems this is still beyond AI's capabilities (apparently playing chess or Go is way easier than working in a grocery store).
Really? Classification of images on a known static background should work very well. Or at least well enough that you can request a manual scan every 10th item and still get a large speed increase. The bag weight works as a double-check anyway.
It's a totally different thing to be selling Angular templates when your website is titled "Tutorial: Learn how to write Angular services".
One of the two is clearly misleading.
Try searching for
site:reddit.com best headphones
And click "Tools" and set the date filter to be "Past week" for example.
Click the first result. You will notice it's from a year ago. Click the second result, it's from 2 years ago and so on. Not a single result matches the date filter. Similar issues with the other filters.
It's 100% broken and has been this way for at least a year now, as far as I remember.
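(For what it's worth, you can reproduce this straight from the URL bar, skipping the Tools menu. As far as I know the past-week filter maps to the tbs=qdr:w parameter - treat that as a best guess from observation, not documentation:

    https://www.google.com/search?q=site%3Areddit.com+best+headphones&tbs=qdr%3Aw

The dates on the results still ignore the filter either way.)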
And then this. An unsolved problem that they could have solved 20 years ago - recognising that pages are not necessarily one atomic unit and that different parts can be updated at different times. Or more generally that different types of websites require slightly different approaches to search.
It's not a trivial problem if you think about it. But for a search company to get this completely wrong even for a global top 100 discussion forum requires a severe lack of incentives.
And that's where I get the sinking feeling that I shouldn't be encouraged by Google's failure at all, because those incentives are difficult to fix for anyone. I'm pessimistic about Ramaswamy's approach. Putting a paid ad blocker on Bing isn't going to fix this.
Unless the redesign of Reddit broke this somehow? But as far as I know, this is broken for other sites too. And the Reddit redesign is over 2 years old now.
Or alternatively, improve their algorithm for detecting page changes.
Somehow it should be improved, as it’s a useful feature.
While smaller companies seem to have a hard time implementing A/B testing, Google seems to be running tens or hundreds of tests continuously.
Here's a nice trick: if you report any issue on a search results page, it seems to opt you out of the experimental group and you get normal results for some hours/days/weeks. Still not great like in 2009, but not as crazy as whatever bugs you now. At least it has worked for me on a couple of occasions.
Googlers: If you are relying on people giving feedback to know if a change is annoying users, be aware that your feedback process actively discourages people from submitting feedback. Even I have to be reaaalllly motivated to send feedback.
A common human fallacy is to pick a medium-size number and think it is big. People have trouble grasping large scale (perhaps due to how our senses instinctively use a logarithmic scale).
If you use Firefox, some results are full of a Firefox-specific scam claiming "you are the billionth search result" (maybe it also happens in other browsers). This has been going on for years now - I noticed it because scammers were scraping my blog for content and republishing it, but making search terms redirect to scam sites.
Sure, having a search engine do exactly what the user requests would be great for us IT people. Not sure it is possible to build much traffic with that alone, though.
I can't say for sure if this got more frequent in recent times or not.
I'm working off the thesis that combining highly relevant vertical results is the best way to combat SEO.
As a fun aside, when I did a Show HN for this tool, the title was "Runnaroo, a new search engine that didn't raise 37.5M to launch" as a friendly joke toward Neeva. Dang changed it later to be just "Runnaroo, a new search engine."
 Example search for "bose headphones reddit": https://www.runnaroo.com/search?term=bose%20headphone%20redd...
I saw from your previous post that you're using Google for the web results, but the only option listed in their docs has a 10k queries/day limit. Have you been able to get them to agree to a higher limit, or are you planning to move off of Google once your traffic grows?
Also, your example search has a character encoding issue - "Stolen iPad Pro & Bose headphones".
Regarding, "what's your distinct advantage over the easy to remember site:reddit.com"
It's arguable that just typing 'reddit' with the query is easier to remember than typing "site:reddit.com" for most people, but you can have the best of both worlds and still use the site operator and get direct Reddit results .
It's great if I know what I want to search for (i.e. I know the field and the keywords and I need the specifics), but when I don't know what I want to search for, I can't use it for discovery (i.e. when learning new things and I lack the terminology).
The bangs are painfully slow over the main duckduckgo site.
Miss the bangs.
Also, for non-trivial results, Qwant is mostly a fancy Bing front-end, so you might as well use Bing.
And yes, it was around that time that search quality took a nose dive for me.
Also unless you work at Google or have done extensive research: I think you are still wrong.
I think part of the reason for that is that off-reddit there's so much emphasis on SEO and analytics (trying to hit all the right keywords, linking to other pages/sites, tricking people to stay on a page/site longer etc.) that everything becomes very cluttered very quickly.
I use !g for Google pretty frequently, but there are many searches I just send directly to the site I think has the best answer.
I'm still tricked every time though: I see the result matching exactly what I'm looking for, I disable my Reddit block in the browser, click the link, and am 99.9% of the time disappointed in the result.
So it's another UI over Bing, like DuckDuckGo? I'm not too optimistic; at the moment there are fundamental issues with how search engines interpret text and rank results.
Privacy. DDG at least claims to make it a first-class feature, and one would imagine that means that they're not selling you out when they pass the search along to Bing. Going directly to Microsoft may be OK. I haven't really bothered to look into it; I just went with my warm fuzzies. This is a spot where Microsoft has a checkered past, and it's going to take a bit more than the ill-fated "Scroogled" ad campaign to change minds there.
Cleaner UI. Bing's interface is relatively cluttered compared to DDG's. It loads all sorts of images, sticks a chumbox on the bottom of the home page, nags you to download Edge, etc. If I run a search, I often have to scroll through two entire screen heights of I-don't-know-what before I get to actual webpages. Lately, DDG has been adding clutter to their site as well, but there's still quite a lot less of it, and what there is tends to be less visually noisy.
Same with the Windows menu? WTF is cotton candy doing there on a fresh install?
Why does opening Edge show so much MSN news spam?
Like, is Microsoft just oblivious to what the user really cares about?
Most likely some exec's compensation is tied to how many ads they shove at users, so they prioritize that over the user's experience.
For a while, you could get away with interpreting Apple's behavior as if it were a single person with a coherent mind. Since 2011, though, that model's been getting less and less workable for Apple as well.
- searching via POST requests (so that your searches are not saved in browser history)
- I heard many people like DDG's browser for iOS, as a dedicated incognito browser
Which you can do inside your browser too.
> searching via POST requests (so that your searches are not saved in browser history)
That's not a benefit to me.
If I want less browser history I'll handle that locally, thank you.
> I heard many people like DDG's browser for iOS, as a dedicated incognito browser
That's not a benefit of the site.
Better privacy, and sometimes better infoboxes, are what I would immediately list, but purely on features that's not very compelling.
Prices are here:
Little known fact: if you buy anything on Amazon, DuckDuckGo gets to know exactly what you bought (the precise item) as part of the Amazon affiliate program.
This isn't specific to DuckDuckGo in any way whatsoever. This happens for ANY affiliate. I've used the Amazon affiliate program, and every month I would get reports of the exact items people were purchasing. I couldn't link those purchases to any particular individual, mind you, but I could see exact items.
DDG has their own crawler and their results are a composite of many different sources of which Bing is one.
It’s easy enough to check. I searched my name on both DDG and Bing, the results are completely different.
The fact that they return different results doesn't mean much. Firstly, the Bing API and web search return slightly different results, especially if you use some of the extended parameters; secondly, it doesn't mean they return results without additional post-processing, since they can have their own weighting/pageranking algorithms, filters, etc. on top of Bing.
DDG isn’t just a UI for Bing but their results rely on Bing.
Search for "what is my ip" and you will see Bing bot IP in the DDG snippets.
> To do that, DuckDuckGo gets its results from over four hundred sources. These include hundreds of vertical sources delivering niche Instant Answers, DuckDuckBot (our crawler) and crowd-sourced sites (like Wikipedia, stored in our answer indexes). We also of course have more traditional links in the search results, which we also source from multiple partners, though most commonly from Bing (and none from Google).
In other words, their own crawler and the other 400 sources are used for their Instant Answers and widgets while all "traditional links" (i.e. the search results) come from Bing.
Even back when DDG did only use Bing/Yahoo data, you'd likely have seen different results for your name depending on what personalised results Bing/Yahoo might show you, or other aspects (such as weighting applied to your location).
A distinct focus isn't just "Bing without ads", in a similar way that Bing isn't just Google with different ads. They can have different focuses, qualities and utilities.
The UI is less important. Those 3-4 ads appearing at the top of search results can be filtered out by browser extensions.
Bing isn't just Google, because it uses a different index and a different ranking algorithm. The results are vastly different.
If a search engine is just a shell over Bing, then there isn't much point in using that over Bing.
Note DDG provides some niceties over Bing, plus the pledge that they won't share your searches with them, but you basically have to take their word for it.
neeva.co, neeva.co/blog, moneycontrol.com, indiatimes.com, nytimes.com, neeva.tech, neevagroup.com, babycenter.com
neeva.co, nytimes.com, androidauthority.com, medium.com, oflox.com, gomoguides.com, neeva.tech, neevagroup.com
(I don't use Bing, and DDG isn't supposed to track me, so neither should be personalized results.)
https://i.imgur.com/fML3x0l.png (in French)
Once it is set to my country (France), then I get almost identical results.
I don't know how to deactivate this feature on Bing, so I could not compare Bing and DDG in the case where the "location" setting would be ticked off on both sites.
Sounds like a UI over Bing to me.
1. Actually return the results matching the words that people typed in your search box. The more they match, the more they go up.
More and more, what you get back from Google (and others as well) has become extremely disappointing. Verbatim search (where you surround exact terms in quotes) seems to have vanished in the last year or so. More often than not I have clicked on a result, only to find that my particular query is nowhere to be found on the site, but some "related stuff" is.
As virtually all early search engine developers have found out the hard way, it's extremely easy for website owners to cheat such a basic algorithm by spamming keywords.
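To make it concrete, here's a minimal sketch (in Python, and not any real engine's algorithm) of that naive word-matching approach, and why keyword stuffing trivially beats it:

    # Naive keyword-match ranking: score a page by how often the
    # query words appear in it. Just the idea, not a real ranker.
    from collections import Counter

    def naive_score(query: str, page_text: str) -> int:
        words = Counter(page_text.lower().split())
        return sum(words[w] for w in query.lower().split())

    honest_page = "How to write a monologue: start with a character and a conflict."
    spam_page = "monologue " * 500 + "Sign up now for only $20/month!"

    query = "how to write a monologue"
    print(naive_score(query, honest_page))  # a handful of matches
    print(naive_score(query, spam_page))    # 500 matches: the spammer wins

Every extra repetition of a keyword moves the spam page up, at zero cost to the spammer.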
This is everything that's wrong with search in a single sentence.
"What's the name of the plant that that eats flies?"
This should, and does, produce results for Venus Fly Traps but I doubt any useful page contains all of those words.
If you were searching that sentence without the quotations I could get behind it return Venus Fly Traps results, but if you are using the double quotes you are trying to search for that EXACT string.
"What's the movie where the guy turns into a pidgin?"
This should return results for Spies in Disguise, and indeed, every search engine I tried does. I doubt any relevant page actually contains all the words in my query.
Of course, around that time Duckduckgo started going downhill as well in this regard.
I doubt getting them to switch would be very easy unless you did some deal with browsers. But they, like most others, use Chrome, and I doubt Google would do that.
I was sitting down with a very experienced C++ engineer a few months ago to work on a problem. There was something we needed to do a web search on.
He opened Chrome, clicked in the address bar (which already had the keyboard focus, but never mind that), and typed "google". This did a Google search for Google, and the first search result was www.google.com. Then he clicked www.google.com which took him to the Google home page, and there he typed in the search terms.
Yes, he Googled Google to do a Google search.
It's a coincidence that modern behaviour is the same for certain domain names.
If he's an experienced C++ engineer he's probably of that vintage, and has probably been in demand enough in the meantime that he hasn't had to address this "niggling" behaviour.
I've heard this a lot, and have basically trained myself by rote not to hover over people's computers when we're looking at something together, but I still can't say I understand it. I don't know if it's an inarticulable phenomenon, but do you have some sense of what drives this and/or what it feels like?
People sharing their desktop/app/window, then breaking out of that sharing selection to bring something else up, while they're talking.. and searching, and things aren't working exactly as expected, so they go into rabbit-hole mode, etc...
Everyone's got a million things going on in their brains, and an audience changes things, and "presenting to an audience" is different than sharing with an audience.
I switch between my Mac and a Windows machine, between Safari and Chrome, between ctrl- and Command-, etc etc. Half the time things I'm connected to are broken (VPNs, endpoints, services...) so I find myself half-stabbing my way through little time windows during the day. And if I'm talking to someone, while trying to do something I've done 1000x before, I'm probably covering my bases by stabbing at the keyboard even more.
I do presentations a lot too, it's a totally different mode.
Perhaps it makes me really uncomfortable when what I am doing is the center of attention. Like the nervousness normal people feel while performing something on a stage in front of hundreds of people, but on a micro scale, even doing something in front of a single person elicits the same response from me.
The HN circle is complete
"Google’s Top Search Result? Surprise, It’s Google"
The distance from their pen to their paper is an example of a ridiculous metric. And expecting humans to be robots is a straw man of what I'm asking.
The candidate's ability (or effort) to understand and use the tools they use 100 times a day to do their work is not an irrelevant metric.
Doctors need to have a basic understanding of how stethoscopes work. If they start by listening to your elbow, and only then proceed to listen to your chest and back, something is amiss.
It's not a matter of suboptimal behavior or poor efficiency.
That doesn't suggest that I don't know how calendars or timekeeping work: I was just distracted and glanced at my watch on autopilot.
 I did make a lot of smart-ass jokes...
I don't remember the stats, but when I worked for Yahoo (2003-2005) they got a fairly substantial number of daily users to stay on Yahoo with that trick.
Translate: Feedback from catering to the least common denominator boosts techie self-esteem.
PSA: Similar reasoning is responsible for political ads.
Isn't this just manually entering the prefix?
That wouldn't take care of SEO spam, which is very good at stuffing its pages with whatever words people search for.
Also, as much as I hate to admit it, Google is pretty good at handling wrong spellings or synonyms and getting good results in the majority of cases.
Include ads on the destination website as part of the reputation score. If a website is loaded with ads, sink it to the bottom. Google doesn't do this because they likely make money from the ads.
Heuristically determine spammy content. It's pretty easy to tell at first glance which content is bullshit, so it's probably not hard to create an ML model to do the same classification.
Manually assign positive weights to websites used by engineers and domain experts. You could even curate this list in the open and solicit help in maintaining it.
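Something like this re-ranking pass, sketched in Python, is all I'm imagining - every signal name, weight, and domain below is made up for illustration:

    # Hypothetical re-ranking combining the three ideas above.
    TRUSTED_DOMAINS = {"stackoverflow.com", "developer.mozilla.org"}  # openly curated list

    def rerank_score(result: dict) -> float:
        score = result["base_relevance"]           # whatever the core engine computed
        score -= 2.0 * result["ad_density"]        # ad-loaded pages sink
        score -= 5.0 * result["spam_probability"]  # output of an ML spam classifier
        if result["domain"] in TRUSTED_DOMAINS:
            score += 1.0                           # manual positive weight
        return score

    results = [
        {"domain": "seo-spam.example", "base_relevance": 0.9,
         "ad_density": 0.8, "spam_probability": 0.95},
        {"domain": "stackoverflow.com", "base_relevance": 0.7,
         "ad_density": 0.1, "spam_probability": 0.02},
    ]
    for r in sorted(results, key=rerank_score, reverse=True):
        print(r["domain"])  # stackoverflow.com comes out on top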
There is a clear upper bound on the amount of total legitimate web content, and that upper bound is not prohibitively high -- linear in the total number of coherent-content-producing humans, with only so many hours in a day and only so many years of adequate brain functioning (and not in the amount of whatever computing resources are thrown in to support the algorithmic content fire hoses that the currently-dominant search engines contend with).
That page doesn't contain the word "conditional" at all - the word that Microsoft uses is "branching" but Google deduced that it was the best result, which was perfect. The same search in either Bing or DDG produces results that all have the words "microsoft", "forms", and "conditional" in them and none of them link to the page I mentioned above, which I consider to be the best result.
Moral of the story? Search is hard, but Google does it better than everyone else.
Also, I learned via this thread that DDG is just a wrapper around Bing, which explains why the search results between DDG and Bing were near identical - they even have the matching video suggestions.
Sounds like the results of a neural network: roughly approximating your intent and searching around that intent, in continuous space, to find other viable search terms and phrases. (This is one possible approach, given in broad strokes.)
That's a massive barrier to entry. You need enough data and compute to train a massive language model, more compute to run the model against all incoming queries, and then even more compute to handle the extra search load precipitated by use of the language model.
Not to mention the years of R&D that go into these models and their associated tooling.
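In very broad strokes, the continuous-space part looks something like this sketch (toy 3-d vectors standing in for real embeddings; producing good ones is exactly where the massive language model, and the compute barrier, come in):

    # Rank indexed phrases by similarity to the query's *intent*, not its words.
    import numpy as np

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Hypothetical pre-computed embeddings for indexed phrases.
    index = {
        "venus flytrap care":       np.array([0.9, 0.1, 0.0]),
        "carnivorous plants guide": np.array([0.8, 0.3, 0.1]),
        "fly fishing basics":       np.array([0.1, 0.9, 0.2]),
    }

    query_embedding = np.array([0.85, 0.15, 0.05])  # "plant that eats flies"

    for phrase, emb in sorted(index.items(),
                              key=lambda kv: -cosine(query_embedding, kv[1])):
        print(f"{cosine(query_embedding, emb):.2f}  {phrase}")
    # The plant pages rank first despite sharing no words with the query.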
Luckily, most of the time you could improve my user experience by removing that cr*p and giving me my 2007-2009 Google back.
From there you would only need to allow users to make personal blacklists, share personal blacklists (this was about the time when auto-generated content started to become popular), and maybe also aggregate some popular blacklists into a default blacklist, and it would be better than anything we have seen since.
(I remember having a txt file with -spammydomain.com -anotherspammer.com etc. that I pasted in at the end of certain searches to take care of sites that either had
- auto-generated content
- or stuffed their pages with black-on-black or white-on-white keywords)
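Automating that txt file is a one-liner; a sketch with placeholder domains (note I'm using Google's stricter -site: operator rather than plain word exclusion):

    # Append domain exclusions to a query, like that old txt file did.
    BLACKLIST = ["spammydomain.com", "anotherspammer.com"]  # placeholders

    def with_blacklist(query: str) -> str:
        return query + " " + " ".join(f"-site:{d}" for d in BLACKLIST)

    print(with_blacklist("angular mat-table tutorial"))
    # angular mat-table tutorial -site:spammydomain.com -site:anotherspammer.com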
But in all honesty it is not the SEO scammers' fault that Google serves me pages that don't contain the words I searched for after I have chosen the verbatim option.
It also isn't the SEO scammers' fault that when I search for Angular mat-table I get a number of pictures of tables with mats on them. That is probably the result of someone playing with some cool AI tools while others are busy finding more efficient ways to ignore customer feedback ;-)
We must manage to keep those two thoughts in our head simultaneously:
- Black hat SEO has changed
- Google has adapted to another audience and has ditched us power users hoping we wouldn't notice.
Screenshots of that and some other clear examples of Google and Amazon testing out AI in production are here: https://erik.itland.no/tag:aifails
Anyway my point is if you rewound the clock to 2008, you'd have a way bigger problem than you might think.
- + (ok, they broke that deliberately around Google+)
- the verbatim operator
All those should be able to work even if the crawler and processing techniques are updated, right?
Also a heads up: I added some more details to my post above, I didn't think you would answer so fast :-)
Edit: I only know the black hat methods that were well known 10 years ago, like:
- backlink farming from comment fields (we protected against it by applying nofollow to all links in comments)
- Google bombing (coordinated efforts to link to certain pages with particular words in the link text, trying to get Google to return a specific result for an unrelated query. I think the canonical example was something like a bunch of people making links with the text "a massive failure" that all pointed to the White House website.)
- Link-for-link schemes
I used to be able to search effectively, but lately it's just so terrible. It's made for 99% of people's dumb searches, but try to get specific and it fails hard.
True, but they do it worse and worse.
Google still has some margin before their usability declines to the level of their competitors, but they're headed there quite fast.
So far it's based on Bing, which does. This makes it a bit of a hard sell compared to an intelligent ad blocker.
The most important problem of search engines is SEO spam. Google themselves face a sort of moral hazard not to be too stringent on SEO spam, because much of it shows ads by Google, increasing Google's revenue.
OTOH I wonder if the subscription revenue is going to be sufficient to afford access to a reasonably good search index and enough processing power to efficiently combat SEO spam while returning relevant results. This requires your own data centers, run frugally, because the fees of something like AWS or Azure would just be exorbitant for a global search engine and a global search index.
I wonder if companies aiming to provide alternative search engines will cooperate on maintaining a common index, to distribute the massive costs of doing that. They could even publicly sell access to it once running a competing search engine is no longer practical; researchers, for example, would buy it.
We may end up with search engines that use multiple indices where each is curated to a certain domain of information, rather than a one-size-fits-all index.
It's not an easy question.
There are over 1.7 billion websites, so the task of ranking content, the way algorithmic search engines do it in a matter of milliseconds, is not as easy as it sounds when you add humans into the mix. It would only end up the way Mahalo did.
You can still automatically index but have users vote on the results. There are 1.7 billion websites and 3 billion+ users, and you don't need that many to be active voters to help assist the algorithms. Plus, how many are at the top anyway? I'd love to downvote a ton of Google results, even if they were only used as a trainer for my own.
Plus, there are so many "super curation" sites like here and Reddit that provide a big dataset curated by people automatically. Lean on them more. Everyone knows "site:reddit.com" or "site:stackoverflow.com" already gives you better results.
A simple upvote downvote on their results would let me downvote all the spam SEO sites. It wouldn’t take many votes for them to start tuning it.
Stats are a good way to blind yourself. That algorithms scale doesn't mean people don't improve them. Google's problem is they are too cocky about algorithms, but their algorithms fail compared to the curated communities all over the web already.
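For what it's worth, the vote-tuning could be as dumb as this sketch (the damping constant is an arbitrary pick, and nothing here reflects how Google actually ranks):

    # Toy vote-adjusted ranking: user votes nudge the algorithmic score.
    import math

    def adjusted_score(algo_score: float, upvotes: int, downvotes: int) -> float:
        # Log-damped vote ratio, so a small brigade can't dominate the signal.
        vote_factor = math.log2((upvotes + 1) / (downvotes + 1))
        return algo_score + 0.1 * vote_factor

    print(adjusted_score(0.9, upvotes=2, downvotes=40))  # SEO spam, heavily downvoted
    print(adjusted_score(0.7, upvotes=30, downvotes=1))  # genuinely useful result
    # The downvoted spam now ranks below the useful result.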
But if you filter for paid members, you've filtered for much smarter people (on average) picking the best results - and you've also filtered out a lot of the SEO people who are going to be trying to manipulate the bounce rate with proxies.
One thing that would be really helpful: stop counting words that appear in links in the sidebar or other non-content parts of the page as important for my search. It's amazing how many searches go astray because someone has some words in a sidebar that have nothing to do with the content of the page. You would think with all this ML someone would teach a search algorithm to ignore it.
>Neeva's most unusual feature is its ability to also search users' personal files. In a demo, Ramaswamy searched for tax documents and photos, all surfaced within his search results or available in a Personal tab in the Neeva interface.
Privacy focused my a$$!
Search results for hardware problems are total garbage, landing you in forum topics about totally different hardware, just because some show-off poster listed every piece of electronic equipment he's ever owned in his forum signature.
Something even worse than this has forced me to DDG out of necessity: fucking CAPTCHA.
Which is essentially discrimination against anyone with a mobile internet connection who blocks google's tracking.
I just can't use Google search anymore; it's a special kind of torture, after years of getting used to using Google for everything, to now get this thrown in my face every single fucking time - way to permanently train users away from your search engine. It's time we had more competitors; Google search is a monopoly and it's only going to abuse its users more.
1) Word-based crawling/indexing: quickly abused by spammers.
2) AI/ML-based: I think this is the current model (?), but after a while the machine got "clever" and it makes Google think it knows better than me what I am looking for, so it returns results for "most people's" tastes. The problem is when you are not "most people", and/or you are looking for some niche topic/work-related/tech stuff/etc. Simply trying to discover new things, like an interesting blog or a small shop, is impossible.
3) Paid-based: as in the article, and might be a good idea. But I think it has to run on a custom indexer. Why would I pay for Bing results?
4) Aggregators: a search engine that returns results from a bunch of other search engines, like DDG and others.
5) A mix of the above?
So unless new ideas come to the rescue, I think it's always going to get worse.
I guess that ends quickly once spam means you get blacklisted no matter how many Google ads you serve ;-)
A large piece of this article hints at how they have some interesting options that Google doesn't have.
Not sure if this solves any user problem to be honest, and the idea is only appealing to me because I think some amount of domain expertise and human curation could go into categorising pages. While this sounds (and no doubt would be) labour intensive, if we consider the number of domains that users actually visit when conducting a search (i.e. ignoring anything past page 1 of google) then perhaps it's not so extreme.
That said, search is a really hard problem to solve (even if you can take shortcuts like using Bing's API).
Watching Google slowly fill more and more of the search results with ads, this is an obvious and very welcome idea.
I haven't seen an advertisement in 10+ years. I don't really understand why anyone chooses to see them when they don't have to.
And sorry if this comes off as confrontational, I just see so many people talking about advertisements and it's difficult to have to tell each individual that adblocking extensions have existed for close to 15 years. I wish there were some better way to spread this information so no one would have to see ads or comment about their existence ever again. The internet is so much better without them.
None of the choices of content blockers in Safari successfully block the majority of ads on the internet.
Furthermore, it's not just about the surfaced ads. Even if you use uBlockOrigin, search engines like Google optimize for ad clicking, which will affect the search result ranking even if you have ads blocked. As a result, search quality has been steadily decreasing over the past decade (there have been hundreds of highly ranked HN discussions on this in the past).
Finally, uBlockOrigin is an amazing tool developed by 1 person. There is always the chance that, in the future, there are developments in browsers or ad-serving technologies that render it obsolete (e.g if Google decides to make a breaking change to the Chrome Extension API, like Safari did). In that case, it would be worthwhile to have alternatives.
As to the search quality decrease, that's definitely more of a reason to desire an alternative than seeing ads is.
The first is punitive. While it does, kind of, punish bad behavior, it doesn't go far enough. You may deprive Google of some small revenue, but you're still giving them a lot that contributes to their bottom line. They can still claim a googol of clicks and eyeballs, etc. People will still see them as the end-all-be-all, and voluntarily submit their sites to Google's attention--ignoring any other options.
The second is the failure to reward. It is not enough to kill off bad actors if you haven't nurtured good actors to take their place.
In this case, the good actor could even be Google. If there are efforts to do things differently, we should find the ones we like and reward them. They will benefit and their competition will observe and imitate.
It didn't to me. I use ublock and take it a step further to use other search engines. Personally, I only notice how ad filled Google search is when I use someone else's computer.
On the other hand, how come Cliqz has been shut down instead of being sold? Are there no companies with deep pockets who are interested in containing Google's revenue besides MS? E.g. since Cliqz was so privacy focussed, wouldn't that have been a great start for Apple to have a privacy respecting search engine?
Siri has search, and its top hits are surprisingly helpful. Maps is getting slowly better, and has both native and web clients. Apple TV+ and iCloud Drive have paid tiers. Shortcuts and Messages look like apps, but they're really UI around services, as is Siri.
I hadn't even had a chance to read the first line of the article and the site's already asking me to sign up for their marketing trash.
Why in the world would anyone think that's acceptable? I have never seen the site before and have no idea of the quality of its content so what on earth makes them think they're going to get a subscriber out of me?
Obviously not at the same level as Google, and there are other parts. But I believe we can do this together if we try. People were talking about building their own search engine on the Elixir forums a while ago and many seemed interested.
You can do crawling by using an extension that allows you to create a new tab, crawl data on your current url and send it up to the mothership.
It's a solution more easily solved by vc companies or government laws, because we're not seeing Google doing that in this lifetime, while FOSS solutions simply won't get the needed traction.
You just traded SEO as we know it for a scheme in which any rando can just upload the supposed contents of any URL.
Also, ethically speaking, aren't we at the point of considering the idea of trusting random people smarter than trusting huge corporations whose only goals are to make more and more money?
A proof of work does not seem viable either. You're asking for the submitters to pass it for no reward, so the difficulty factor can't be particularly high. But then it becomes useless at blocking somebody who is actually deriving a benefit from submitting (fake) results.
The giant company will in this case build an index that's far superior. The crowd-sourced version will have huge amounts of duplication of popular pages and massive underrepresentation of the long tail. And can you imagine how inefficient the distributed version would be, both in storage and bandwidth? There can't be any facility for scheduling pages to be crawled at sensible intervals given the push model. The indexing nodes will just be flooded with pages they didn't actually want.
The crowd-sourced version will also not be "random people" like you suggested. A lot of them will have an agenda, and will be trying to manipulate the index to meet that agenda. And manipulate it in a way that's not useful to the people making searches. At least the company's goal of making money is furthered by building as useful an index as they can given the resource constraints.
The search engine results page can be used for validation: just use people pressing the back button as a signal of whether the results were useful or not.
But further, you are not really thinking through how one would abuse this kind of feature. If I were doing SEO, I wouldn't forge a page to have content that makes it get returned for irrelevant searches. Instead, I would forge some high-quality pages to show up as having backlinks to my page, boosting its PageRank. Or, to demote the pages of people I dislike, I'd forge them to have content that makes them not show up in any searches. Your heuristic would not work there: if there are no clicks in the first place, there can't be any bounces.