
I'm the Google Rep, and there were two more parts to my reply which explain this:

Here's why people often think quoting with Google doesn't work when it really does (I've looked at a huge number of these reports). 1) We match ALT text 2) We match text not readily visible, such as in a menu or small text 3) Page has changed since we indexed it 4) Punctuation...

Punctuation comes into play if you do a quoted search like "dog cat" and there's text that says "dog, cat": we'll see that text without the punctuation, so it counts as a match. That doesn't seem like a major issue, but we're looking at whether we could improve there.

And...as noted in another reply below, in the first result, if you look at the cached page that shows the ENTIRE page that we indexed (rather than the paginated version you land on), there's this text:

quotes, don-t-give-up-the-fight

which, when you remove the punctuation, is the match:

quotes dont give

And I get that this can be frustrating, that we don't consider punctuation in a quoted search. That's not a new change, however. It's been that way for ages. But as I also said, it's something we might revisit.
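To make the punctuation point concrete, here is a rough sketch of that kind of matching. It is only a toy model, not Google's actual tokenizer: it assumes every run of non-alphanumeric characters acts as a token boundary on both the query and the page, and the names `tokenize` and `phrase_matches` are made up for illustration.

```python
import re


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase the text and treat any run of punctuation or
    # whitespace as a separator, keeping only alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_matches(query: str, page_text: str) -> bool:
    # A quoted phrase "matches" when its tokens appear as a contiguous
    # run in the page's token list, in the same order.
    q = tokenize(query)
    p = tokenize(page_text)
    n = len(q)
    return n > 0 and any(p[i:i + n] == q for i in range(len(p) - n + 1))


page = "quotes, don-t-give-up-the-fight"

print(phrase_matches("dog cat", "dog, cat"))      # True: the comma is ignored
print(phrase_matches("quotes don't give", page))  # True under this model
print("quotes don't give" in page)                # False: no literal substring
```

Under these assumptions the query and the page both reduce to the tokens quotes / don / t / give, so the phrase counts as a match even though a literal find-in-page for the quoted string comes up empty.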




I realise you're trying to help, but this is still not right. The top result for "quotes don't give": https://www.goodreads.com/quotes/tag/never-give-up

Does not contain: "quotes don't give" or "quotes dont give"

Nor does the cached version: http://webcache.googleusercontent.com/search?q=cache:https:/...

Yes, as someone pointed out, there are tokenized versions down at the bottom, but that exact string does not exist.

So first of all it's just not true.

Second of all, if your consumers are all complaining the results do not match what they are searching, it might make more sense to listen. Adding quotation marks like that to find an exact string should be a huge red flag that the user is only interested in that string – why then would "it's in an offscreen meta tag of a cached version of the actual site" be anything other than a bad result?


Regarding point #2 - why would you ever match text that cannot be CTRL+F'd?

I can understand why you would match it generally, but why would you ever serve those types of results to users in a browser? Why would users ever want this as a feature?



