Hacker News

It's a fair point about how awful recipe sites look without ad blockers, but this part is just plain incorrect:

> You can tell just by looking at the URLs that those sites are going to be worthless blogspam.

At least two of the three results in the screenshot are from legitimate baking sites (Cookie and Kate, Sally's Baking Addiction) which are generally trusted sources online. I don't know anything about the third. But Google seems to have actually done a good job of highlighting recipes from reliable blogs.

The points about the compromised experience on those sites due to intrusive ads remain.




I just looked up Cookie and Kate. On my iPad I had to flick 7 times to get past the exposition on Crispy Roasted Chickpeas and find the actual ingredients. When I found the ingredients, they occupied a small squeezed sliver of the page. As I was counting the number of simultaneous ads surrounding the ingredient list (4 separate ads), a pop-up covered them all and suggested I sign up for her newsletter.

The recipe looks good (chickpeas, olive oil, salt, spices, oh shit I stole her blog post). I also think the site counts as "worthless blogspam".


The problem is that Google started weighing time spent on page very heavily in their ranking algorithm - I don't remember at what point this happened but it must be about a decade ago by now. Every time a user clicks a Google result without using "Open in New Tab" and clicks the back button, Google gets a signal about how long they spent on the page. The longer a user spends on the site, the stronger the signal. Once all the SEO vampires figured it out, everyone started to pile on prologues to all their content, not just recipe sites. In my experience that was the beginning of the end.

Any recipe site that survived had to adopt the tactic or die, leaving only the spammers and the odd outlier with actual content to write about like Serious Eats. Same thing happened to Youtube and their preview photos; even the legit content creators had to start making those stupid bug eye images.


Yup. This is the Long Click metric.

Evaluating search is difficult because it's a tension: if users click a lot, is it because they find many valuable things, or because they didn't find what they were looking for?

If a user clicked just once, is it because they found what they were looking for or just that the rest of the results were so bad the user gave up?

The long click (user clicked, then didn't click again for a while) is a better metric, but also not ideal: did they stay because they found what they were looking for, or was the result so confusing that they had to stay just to work out whether it was the right thing? Most often it's because they found what they were looking for, but the pathological cases hide in the middle: with many similar correct results, the winner is the one that makes the user a little slower.

(This has nothing to do with tabs or back buttons, by the way. It happens any time they can detect subsequent clicks on the search result page.)
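As a rough illustration of the long-click idea described above (the data shape, field names, and the 30-second threshold are all my own assumptions, not anything Google has published), the classifier only needs the timestamps of a user's clicks on the results page:

```python
from dataclasses import dataclass

# Hypothetical click-log entry: a click on one search result.
# Field names are invented for illustration.
@dataclass
class Click:
    result_url: str
    timestamp: float  # seconds since the search was issued

def classify_clicks(clicks, session_end, long_threshold=30.0):
    """Label each click 'long' or 'short' by the gap until the user's
    next click back on the results page (or the end of the session).
    The 30-second threshold is an arbitrary assumption."""
    labels = {}
    ordered = sorted(clicks, key=lambda c: c.timestamp)
    for cur, nxt in zip(ordered, ordered[1:] + [None]):
        next_time = nxt.timestamp if nxt else session_end
        dwell = next_time - cur.timestamp
        labels[cur.result_url] = "long" if dwell >= long_threshold else "short"
    return labels
```

For example, a user who clicks `a.com` at t=0, returns and clicks `b.com` at t=5, then stays until t=120 produces a short click on `a.com` and a long click on `b.com` — which is exactly why padding a page with a prologue pays off under this metric.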

I've worked in the search space (though on less evil projects than Google) and I still struggle with the question on how to evaluate search. If you have ideas, let me know!


One idea, though people will probably hate me for it: if the user returns to e.g. the Google search page (i.e. when the long-click metric would be triggered), show a dialog on top asking 'result great / OK / bad-or-confusing'. It can probably be gamed (botnets trying to destroy the reputation of others), but at least a long time on a page would not automatically mean 'great result'. (In the arms race against abuse, a 'bad-or-confusing' click could be made to not actually push a value down, just stop it from going higher.)

Kind regards, Roel


This was tried with a +1 button around the time of Google Plus's launch.


> if users click a lot, is it because they find many valuable things, or because they didn't find what they were looking for?

Why do you care as a search engine? This is a natural human problem that can't be solved with technology, only by humans.

It used to be that I went straight to page 5 of Google, because that was where the real results were. The first few pages were people who knew more SEO than sense.

These days that doesn't work, thanks to "semantic search": now results appear to be sorted by some relevance metric, and by about page 5 you start getting into "marginally related to some definition of what you typed in, but still knows too much SEO to be useful."

The point is, this was already a solved problem if you knew to go to about page 4-5. Then people started trying to use a technical solution to a very human problem.


> Why do you care as a search engine?

Wait, are you really asking why a search engine would care how well it finds what the user is looking for?

Granted, there are a lot of search engines that sell themselves on other metrics ("it's fast!" or "it uses AI!" or "it's in the cloud!") but any serious search engine player strives to learn how good it is -- in practice -- at helping the user find what they are looking for. That's ultimately the purpose of a search engine.


> Wait, are you really asking why a search engine would care how well it finds what the user is looking for?

While a useful metric, it's an unknowable metric.

1. You have no idea if the user even knows what they are looking for, so how would you know that they found it?

2. You have no idea if the user found what they are looking for, maybe what they are looking for isn't on the internet?

3. You have no idea if the user is even looking for something, maybe it was just a cat running across the keyboard?

The only way to learn the answer is to have humans talk to humans. You can't game your way through it by using metrics.

It reminds me of this one time the CEO asked our team to add a metric for "successful websites" (we were a hosting provider) and we rebuffed with "define successful." They immediately mentioned page views, to which we replied "what about a restaurant with a downloadable menu that Google links to directly?" And so it went back and forth, with "successful" never being defined for all verticals and all cases. It just isn't possible to define using heuristics.


I disagree. It's unfortunate that some users don't know what they want, some want things that don't exist, and that some are cats. But most users are humans with a rough idea of an existing thing they are looking for. It's worth it for a search solution to find out how good it is at helping them. The cats add noise to that measurement, they don't invalidate it.

Do you philosophically agree there are websites that are more successful than others? If yes, then there are tangible qualities that distinguish this group from the other. They may be subjective, fuzzy, and hard to pin down, but they're still there. If no, a success measure is irrelevant to you but other people might disagree, and once thoroughly investigated, you sort of have to agree the measurement coming out of it reflects their idea of success.

In none of this am I saying it's simple or easy (I started this subthread by saying it's difficult!) but fundamentally knowable.

Yes, humans talking to humans is definitely the start. But I'm positivistically inclined enough to think that with effort we can extract theories from these human interactions.


I didn’t go into all the problems with “successful websites” but it really is impossible to measure. For me, my business site is successful when I capture leads, my blog is successful when I write posts, a restaurant is successful when people show up to eat. There’s no way of knowing what variables and metrics constitute success without asking the person.

I had a CEO who searched for the related business search terms every morning. No clicks, he just wanted to see the ranking. The other day, I was searching for an open NOC page that I knew existed but couldn't remember the search terms. Eventually I gave up, but I'm 90% sure I left the tab open to a random promising search result that had nothing to do with what I was really searching for. There's a PDF that archive.org fought over, and simply mentioning it results in a DMCA notice. You can find it now, but for nearly 20 years you could only find rumors of it on the internet, and a paper copy was the only way you could read it.

Even when I know exactly what I'm looking for, I sometimes open a bunch of tabs to search results and check all of them (this is actually the vast majority of my non-mobile searches), especially because the search results are often wrong or miss some important caveats -- especially when searching for error messages.

The only way you could find out these searches were unsuccessful (or successful) is to ask. There’s no magic metrics to track that will tell you whether or not my personal experience found the search successful.


I feel like the problem is trying to turn human experience into a metric. Probably the better approach would be to have a well staffed QA team.


We should be mad at Yahoo for having fucked up. If anything, they could have spun out the search part and be remembered for it.


I honestly don't think it's possible to have a QA team large enough to handle the gajillions of websites that come up and disappear every day. They just have to come up with better and better metrics until they find one that approximates the human experience the best.


You don't have to cover the long tail... Maybe just top 10% of topics would be a big improvement.


Google also massively reduced AdSense payouts over the years as well.

Result? Adsense-based websites started jamming in more ads per page to maintain their old revenue levels. Pages became longer so that more ads could be thrown in.


Why did people continue to engage with such trashy sites?


Where do you find out about metrics like this?


There are SEO industry nerds that scour Google patents for clues (this long click metric was an early 2010s patent that was granted in 2015), and Google lets information slip from time to time, either officially or unofficially.


The first site "cookieandkate" might look like blogspam but it wasn't.

After going through some random archived posts from 2011 and 2016, I think it probably fell into the same trap the article mentioned, which kind of proves how needless SEO spam ruins websites.

[1] is a link to a recipe on the same site from back in 2011. It has some content at the top giving personal context and plenty of normal pictures of the actual recipe, not those fancy artistic photos. It has that personal-touch, no-hidden-agenda feel.

[2] is a link to another recipe from 2016. The content and format are more or less the same as in 2011, with a bit more long-form content.

Compare that with current posts on the site. The content looks similar, but there is a lot of needless use of bold/emphasised text, probably for SEO. Every paragraph is worded like it has some call to action or an agenda.

[1]. https://web.archive.org/web/20120109080425/http://cookieandk...

[2]. https://web.archive.org/web/20160108100019/http://cookieandk...


That's pretty depressing. I don't really do any kind of content marketing work these days and haven't really been around that industry for a decade, but I can only imagine how disappointing it must have been to start seeing your traffic drop off, seeing which results were winning in search compared to your own site, seeing how they were winning, and then having to add more and more shit to your own site in order to climb back up the rankings.


I got so fed up with this that I made a browser extension for it. It's in the Chrome Web Store and on Firefox as well, but you'll have to build the Xcode project in the Safari directory if that's your preferred browser.

https://github.com/sean-public/RecipeFilter


That's not entirely fair.

The problem is that Google forces actual good cooks to make their recipes look like worthless blogspam, but a good original recipe is not actually worthless blogspam, even when disguised in the way Google requires.


When it looks and acts like the spam sites, then what difference is there really? If I have to scroll 4 pages to find the ingredients and then scroll around like crazy to find the instructions (then scroll back and forth while cooking/baking) then it does not matter how good the recipe is, the page killed it for me.


I'd argue that most web users have a higher tolerance for ads than HN users, so they put up with the scrolling. And if it results in a tasty recipe, then they'll do it next time too, since that's seemingly the (tolerable) price to be paid for good food.

But lots of recipe sites now have a "jump to recipe" link at the top, so they've realised the junk is annoying for some fraction of their users. Although page junk is a pain, shortcuts for low-tolerance users seem like a good compromise.


Look it's not OK to milk humans like this. It's manipulative and rapey. Just because the NPC meme is true does not mean you get to hack their programming for a buck and call yourself a good community member and businessman.

Enough has to be enough!


Nobody forces you to put ads on anything.

The idea that every website or tool with lots of visitors should be monetized is sad.

The original author made a tool; why do you have to make money on it?

Perhaps it's sad that websites without ads aren't ranked higher.


Because websites aren't free to build or run. No one is obligated to put ads on their site, sure. They're also not obligated to work for many hours to provide you with free content, or pay $X/mo to serve it to you.


But they can also have a separate job that doesn't ruin the internet and, like some of us, produce free content out of generosity that is not spam-ridden.

Also web hosting doesn't cost much when your website is well made with some frugality in mind.

And there are also better, cleaner ways to make money on the internet: getting rid of the ads and spam and having the content accessible to paid members.


While it is admirable that you are willing to produce content out of your own generosity, it seems a little optimistic to assume that everyone making content on the internet is both willing and able to share it for free.

I am somewhat curious to hear more about the better and cleaner ways to make money on the internet, but I have a suspicion that in some circumstances (such as recipes) they may put you at a competitive disadvantage. I certainly have no desire to pay to access recipes I find via Google searches.


Not engaging in fraud also puts you at a competitive disadvantage to those that do. Doesn't mean we have to be happy to be defrauded.


We need to find a metric for anti-profitability. I think that index could yield much higher quality results.

Detect sales/commercial language and structure,* and specifically target that for removal from results, as if sales-oriented sites were hardcore porn with the child-safety filter turned on.

*Buy and cart buttons/functions, tables containing prices with descriptions that don't look like long-form reviews (which would be its own filterable tag), etc., and domains trying to obfuscate get blacklisted permanently.
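A toy illustration of the idea above (not a real ranking component; the signal patterns, weights, and threshold are all made-up assumptions): score pages by weighted counts of commercial features and filter out high scorers.

```python
import re

# Made-up heuristic signals for 'sales-oriented' pages. Patterns and
# weights are illustrative assumptions, not a tested classifier.
COMMERCIAL_SIGNALS = {
    r"add to cart|buy now|checkout": 3,
    r"\$\d+(\.\d{2})?": 1,            # bare prices in the text
    r"limited time|act now|% off": 2,
    r"promo code|coupon": 2,
}

def commercial_score(page_text: str) -> int:
    """Sum weighted pattern hits over the lowercased page text."""
    text = page_text.lower()
    return sum(
        weight * len(re.findall(pattern, text))
        for pattern, weight in COMMERCIAL_SIGNALS.items()
    )

def filter_results(pages: dict, threshold: int = 3) -> list:
    """Keep only URLs whose page text scores below the threshold."""
    return [url for url, text in pages.items()
            if commercial_score(text) < threshold]
```

The hard part, as the footnote admits, is the arms race: once the patterns are known, obfuscation follows, which is why the sketch would need the permanent-blacklist backstop rather than pattern matching alone.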


Really just removing all sites with ads would be a huge improvement. Regular old websites trying to sell you something are usually not nearly as bad as those that want to monetize you while pretending to be free.


Nowadays, there are numerous free hosting services for static sites.


Websites are practically free to build and run (if you treat it as a hobby and don’t count your time). I agree on the rest though.


The thing is, even if you don't put ads on a page or tool, Google will sometimes not index it because it doesn't think there's 'enough' content, no matter how little sense that makes. At least half the issues with recipe sites and company sites come from them trying to get a site that doesn't need reams of text content indexed by a search engine that seems to blindly value the quantity of content and time spent on the page over all else.


The people who have bad content are the ones to get money, while those who have good content are not. Logical result is that people with good content stop producing that content while the people with bad content continue producing it and being rewarded for it.


Look I hate these SEO-laden pages just as much as the next guy, but I think the binary classification of "good content" and "bad content" lacks nuance. I would refer to it instead as "bad packaging" of (often) good content. As much as I loathe having to hunt for the "jump to recipe" button on my phone each time I open one of these pages, I also appreciate being able to freely view recipes which I enjoy and cook regularly.


I just stopped looking for recipes online if I can avoid it. It became literally faster and easier to search an old-school cookbook. And there was a period when I considered those completely outdated.


An earnest writer and spammer might reach the same method in different ways, but the result is still blogspam.


>I also think the site counts as "worthless blogspam".

This is a strange complaint. You're visiting the blog of a woman who writes about cooking. Can't speak to the ads (I block them), but her site looks pretty good. Why do you think she should list her recipe like some kind of index? Perhaps she blogs for her own enjoyment, not for yours?

Have you ever read popular cook books? They aren't simply listings of ingredients, either.


You should try viewing the site without your ad blocker turned on. Here's a preview: https://imgur.com/a/FDI0L6i. The red arrow is where ad #4 was when I checked it out last night.

Edit: real cookbooks were basically my answer to this problem, to be honest. Some of them actually have fun stories in them. Most of them have a standard-ish "recipe on one page, photo on the opposite page" format. But none of them have promo codes for shoes, supplements, or terrible Canadian coffee chains in them.


I created this simple site exactly for this: https://recipebotpro.com/

You enter the name of your desired dish and get a plain recipe with steps in 5 seconds. No ads, etc.


I suspect you’ve bitten off more than you can chew.

I checked four recipes. One was a joke made out of genital references. Three began with near identical “embark on a journey of flavour” pseudo-SEO bullshit.


FWIW, I tried a few recipes too and they came out just fine, without the usual clutter. I further anticipate that this is the direction we'll be going in general, "search" as we know it was a ~30 year period where Google reigned supreme. The world since moved on.


Yeah, but the new gatekeepers and tech are going to be worse. AI companies, where you never see original human content any more, just what the company's AI shows you.


lol. "Cups"... No serious recipes there.


I generally use https://www.taste.com.au. No bullshit prologue about how a distant relative used to make the recipe in question. Just an overview, photo, ingredients and steps. Everything else is secondary and usually worthless.


Why, when I try to click that link, does it redirect me to tags.news.com.ua? My DNS filters are blocking it.


Hmmm. First one I clicked from their home page:

https://www.taste.com.au/baking/galleries/autumn-cakes/p6d5x...

> When the weather starts to finally cool down and the evenings ...

Just No.


That's not a recipe, it's a short intro to a list of recipes. Just Learn To Read.


"Just Learn To Read" adds nothing to the sentence that precedes it. The point was already made correctly and well. You should avoid, when possible, starting a comment you actually want read with an insult, or ending it with a snap. It degrades the quality of the conversation.


I really don't care.


> it's a [...] intro to a list of recipes

That's exactly the point. It doesn't need to be there, doesn't add any value whatsoever, etc. ;)


I laughed out loud at this. You haven’t looked up many recipes in the last few years, have you? 95% of recipe results are nonsense and ads. It can take a few minutes of searching just to identify ingredients sometimes. My wife and I have been improvising recipes lately to avoid digging through all the junk. I actually recommend this: you can sort of make stuff up based on prior experience and things turn out pretty well sometimes.

Or, put your simplified recipes in a binder near the kitchen

Anything to avoid going to google to find a recipe



