Hacker News

Google Rep: "You said in the post that quotes don't give exact matches. They really do. Honest."

Me: Google for "quotes don't give":

Top result, does not contain the given phrase

Second result, does not contain that phrase

This is literally the first thing I tried and I did a view source to be sure. They are lying to us and/or themselves.

Looking at the results, I would say half of them don't include that phrase, including results for products on Amazon.

Clearly Search does not work the way this Google Rep believes it to work.

I'm the Google Rep, and there were two more parts to my reply which explain this:

Here's why people often think quoting with Google doesn't work when it really does (I've looked at a huge number of these reports). 1) We match ALT text 2) We match text not readily visible, such as in a menu or small text 3) Page has changed since we indexed it 4) Punctuation...

Punctuation comes into play if you did a quoted search like "dog cat" and there's text that says "dog, cat": we'll see that text without the punctuation. That doesn't seem a major issue, but we're looking at whether we could improve there.

And...as noted in another reply below, in the first result, if you look at the cached page that shows the ENTIRE page that we indexed (rather than the paginated version you land on), there's this text:

quotes, don-t-give-up-the-fight

which, when you remove the punctuation, is the match:

quotes dont give

And I get that this can be frustrating, that we don't consider punctuation in a quoted search. That's not a new change, however. It's been that way for ages. But as I also said, it's something we might revisit.

I realise you're trying to help, but this is still not right. The top result for "quotes don't give": https://www.goodreads.com/quotes/tag/never-give-up

Does not contain: "quotes don't give" "quotes dont give"

Nor does the cached version: http://webcache.googleusercontent.com/search?q=cache:https:/...

Yes, as someone pointed out, there are tokenized versions down at the bottom, but that exact string does not exist.

So first of all it's just not true.

Second of all, if your consumers are all complaining that the results do not match what they are searching for, it might make more sense to listen. Adding quotation marks like that to find an exact string should be a huge red flag that the user is only interested in that string – why then would "it's in an offscreen meta tag of a cached version of the actual site" be anything other than a bad result?

Regarding point #2 - why would you ever match text that cannot be Ctrl+F'd?

I can understand why you would match it generally, but why would you ever serve those types of results to users in a browser? Why would users ever want this as a feature?
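To make the Ctrl+F gap concrete, here is a minimal sketch of how text can be in a page's markup without being in its visible text. It only covers the ALT-text case from point #1, the `page` snippet and the `TextAndAlt` class are made up for illustration, and it is not a claim about how Google actually indexes pages.

```python
from html.parser import HTMLParser


class TextAndAlt(HTMLParser):
    """Collects visible text nodes and <img alt="..."> values separately."""

    def __init__(self):
        super().__init__()
        self.visible = []  # text a reader could find with Ctrl+F
        self.alt = []      # ALT text, normally not shown on the page

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "alt" and value:
                    self.alt.append(value)

    def handle_data(self, data):
        if data.strip():
            self.visible.append(data.strip())


# Hypothetical page: the phrase only appears in the image's ALT text.
page = '<p>Never stop trying.</p><img src="poster.png" alt="quotes about persistence">'

parser = TextAndAlt()
parser.feed(page)
parser.close()

print(parser.visible)
print(parser.alt)
```

An index built from both lists would match "quotes about persistence", while a find-in-page over the rendered text would not, which is the mismatch being complained about here.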

What's interesting to me is that if I search for "You said in the post that quotes" the first result is the Twitter thread, with that quoted phrase bolded. The second result is some other site which is mirroring this thread itself. There are no more results, I only get those two.

However, if I search like you did for "quotes don't give" then I get the behavior you describe. So the quoting works sometimes, not others. Maybe based on length or something else, I'm not sure.

Just tried this and you're completely right.

Shameless how they think anyone would be gaslit.

The difference might be due to Google A/B testing their algorithm. That is, giving users different results for the same query and trying to infer which results were preferred based on whatever happens next (user keeps searching, or user goes and stays on some site).

Ignoring full page replacements (Showing results for "eggzactly that", to see results for "eggzackly that"...), I think this is all just punctuation related.

For instance, on the ["quotes don't give"] example, the first result I get is


If I do a find-in-page for "quotes don't give", I get zero results. Oh no! Perfidy!

... but, if you look more closely, you'll find this string waaaaay down at the bottom:

> tags: don-t-give-up, don-t-give-up-on-your-dreams, don-t-give-up-on-yourself, don-t-give-up-quotes, don-t-give-up-the-fight, encouragement, ...

Thanks to the wonders of tokenization, that "don-t-give-up-quotes, don-t-give-up-the-fight" gives you the string of tokens, "don t give up quotes don t give up the fight", which contains the exact phrase "quotes don t give", which is the tokenization of the phrase "quotes don't give".
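A rough sketch of that reasoning in Python, assuming a simple tokenizer that lowercases and splits on anything that is not a letter or digit. The `tokenize` and `phrase_matches` helpers are hypothetical and only illustrate the idea; Google's real tokenizer is not described in this thread.

```python
import re


def tokenize(text: str) -> list[str]:
    # Lowercase and split on every non-alphanumeric character,
    # so "don't" and "don-t" both become ["don", "t"].
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_matches(phrase: str, document: str) -> bool:
    # True if the phrase's tokens appear as a contiguous run in the document's tokens.
    needle = tokenize(phrase)
    haystack = tokenize(document)
    n = len(needle)
    if n == 0:
        return False
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


tags = "don-t-give-up-quotes, don-t-give-up-the-fight"
query = "quotes don't give"

print(tokenize(query))              # ['quotes', 'don', 't', 'give']
print(query in tags)                # False: the literal string is not on the page
print(phrase_matches(query, tags))  # True: the token sequence is
```

The same helper also treats "dog cat" as matching "dog, cat", which is the punctuation case the Google Rep mentions.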

Yes. And thank you for spotting.

Nice work! I too viewed source, but I did "give up" quickly.

That's actually from the visible text of the page -- well, desktop version as I saw it; when you click through on mobile they only show half as many quotes and you need to load more to find it.

I may start using another search engine just because of this. I can't tell you how many times I've searched for something in quotes, but the results almost always lack a word or two. They just want to show you some results; they don't care if that's exactly what you wanted.
