Ask HN: Why does Google put the query in the URL hash instead of query string?
158 points by SimeVidas on Jan 9, 2017 | 66 comments
Search for something on Google, and the URL will look something like this:


The search query is the `#q=something` part of the URL, whereas other parameters are stored in the URL via the query string (the part between `?` and `#`). Why is that? Why isn’t the search query stored in the query string? Why in the hash?

If you search from any place (toolbar, address bar) other than google.com, your search query is passed in as a query string.

But once you are already on the google.com search page, the entire page need not be reloaded. So Google fetches the search results for the new search string via an XHR and updates the page.

In fact if you search for https://www.google.com/search?q=wonderland#q=alice, the webpage would first load the search results for 'wonderland' and once the page is loaded, there would be another XHR for 'alice' and the DOM would be updated again with the new results.
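
To make the two-step load concrete, here is a minimal TypeScript sketch of how such a URL splits into the part the server receives and the part only client-side script can read. The helper name `splitQueries` is hypothetical, and this only illustrates standard URL parsing, not Google's actual code.

```typescript
// Hypothetical helper: separates the query the server sees from the one
// that stays in the fragment and is only visible to client-side script.
function splitQueries(href: string): { serverQuery: string | null; clientQuery: string | null } {
  const url = new URL(href);
  // The query string (between "?" and "#") is sent with the HTTP request.
  const serverQuery = url.searchParams.get("q");
  // The fragment (after "#") never leaves the browser; parse it the same way.
  const clientQuery = new URLSearchParams(url.hash.slice(1)).get("q");
  return { serverQuery, clientQuery };
}

const parts = splitQueries("https://www.google.com/search?q=wonderland#q=alice");
console.log(parts); // serverQuery is "wonderland", clientQuery is "alice"
```

The first page load can only use `serverQuery`; a script on the loaded page would then have to read `clientQuery` and issue the second request.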

They could use `pushState` to change the query without reloading the page?

They could if they were building it now. IE got pushState support in IE 10 in 2012.

The hash state has been there since Instant Search was launched in 2010.

Graceful degradation/progressive enhancement is a thing, though.

The amount of money Google makes supporting older browsers is most likely measured in billions.

Not to mention, why fix something that isn't broken? There are cases for it, but it's roughly the same tech that it was 4 years ago. What would they gain by spending developer time on it?

Perfect. They definitely don't need optimization for search engines, which is probably the most important reason for a clean URL.

Exactly. You are someone who gets it.

I suspect they spend a ton of developer time. It is their #1 source of income. What they won't do is release a change unless they are 99.9% certain that it yields an improvement.

Yes, but in this case the benefits of the enhancement are small enough that Google may have decided that consistent behaviour was more desirable.

Consistent and long-lasting support is a thing too.

Rare on today's web, but still a thing.

Would that not remove the possibility for sharing or manually editing the URL? (I'm asking because I don't know how much cool stuff pushState can do.)

No, but it makes shared links less efficient. If you share a URL like this, the server will not receive the hash on the initial request, so the JS client will have to do the search as a second request after the page load. The reality is that most users will never even notice because it's still fast.

Manually editing the link causes that same fetch-render-refetch flow.

Edit: Oh, you were asking about pushstate. No, that specifically fixes the problem with the double fetch, so long as the server side and client side do the right thing to make the user see the same page for the same URL.

Pushstate just lets the JS client modify the URL without triggering a page reload, so the client can change the actual query param instead of the hash.
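
A rough sketch of that idea in TypeScript follows. The helper `pushQuery` and the `HistoryLike` interface are hypothetical names introduced so the example can run outside a browser; `history.pushState` itself is the standard History API call. This is an assumption about how a client could do it, not how Google does.

```typescript
// Minimal structural type so the sketch can run outside a browser;
// window.history satisfies it.
interface HistoryLike {
  pushState(data: unknown, unused: string, url?: string): void;
}

// Hypothetical helper: puts the new search term in the real query string
// and records it with pushState, which changes the address bar without a reload.
function pushQuery(history: HistoryLike, currentHref: string, query: string): string {
  const url = new URL(currentHref);
  url.searchParams.set("q", query);
  url.hash = ""; // the fragment is no longer needed to carry the query
  const next = url.pathname + url.search;
  history.pushState({ q: query }, "", next);
  return next;
}

// In a browser: pushQuery(window.history, window.location.href, "alice");
```

Because the resulting URL carries the term in the query string, a server that understands `q` can render the same results directly when the link is shared or reloaded.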

Not unless the server doesn't interpret the URL the same way.

The fragment of a URL is not sent to the server as part of an HTTP request.

Interesting that your alice/wonderland example works. Would've thought Google would try to prevent that.

I don't think they can - iirc the hash isn't sent to the server, so to it this looks like any other normal query.

Well sure they can, how do you think the search results for "alice" are obtained? It gets sent to the server, just not upon first load but using Javascript. This Javascript could check for another query being present in the query string, or (less reliably) the server could check the referrer.

Why all the downvotes? I think he means that it could prevent the alice query, not the wonderland.

The server doesn't have access to anything following the hash. That part is only accessible on the client side via Javascript, and you do not have that Javascript until the browser has sent the first request.

That's what I said:

> how do you think the search results for "alice" are obtained? It gets sent to the server, just not upon first load but using Javascript.

Where do you think the initial js comes from? The "wonderland" query.

I know but the comment I was responding to said Google couldn't possibly know about the duplicate parameter. Well, here you have it, they can. Not at first, but still before displaying the results.

As far as Google are concerned it makes no difference if the query is a GET variable in the URL or a hash fragment that their JS code sends as a query after the search page is loaded. They both mean "load Google Search with this parameter from the URL".

Don't forget that the hash part is client-side; Google doesn't know about it until the page is loaded.

For what it is worth, Safari on OSX goes straight to "alice" results, but I think Safari handles google.com differently than other domains.

I'm on Chrome, and even if I search directly from google.com (google.co.uk) I see my search query as a proper query variable.

On a related note, and probably obvious to many, I've just discovered I can make a nice little link to a one-result search, handy for a bookmark.

Things like current temperature and forecast I don't want a dedicated app for when a simple search is more efficient...


Yes, and if you're using Firefox, you can combine this with its Smart Keywords to create super quick shortcuts!


For example, I go directly to Wikipedia's page on "India" by typing `wp India` in my address bar using these search keywords, or search images of birds on Google by typing `gi birds` in it.

(There's a rather cumbersome way to do the same in Chrome as well)
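
For readers who have not used keyword searches, the mechanism can be sketched roughly as below in TypeScript. The `%s` placeholder follows the convention Firefox keyword bookmarks use; the table contents, the URL templates, and the helper name `expandKeyword` are illustrative assumptions, not the browser's implementation.

```typescript
// Hypothetical keyword table. The URL templates are illustrative assumptions.
const keywords: Record<string, string> = {
  wp: "https://en.wikipedia.org/wiki/Special:Search?search=%s",
  gi: "https://www.google.com/search?tbm=isch&q=%s",
};

// Expands input such as "wp India" into a URL, or returns null when the
// first word is not a known keyword or no search terms follow it.
function expandKeyword(input: string, table: Record<string, string>): string | null {
  const trimmed = input.trim();
  const space = trimmed.indexOf(" ");
  if (space === -1) return null;
  const keyword = trimmed.slice(0, space);
  const terms = trimmed.slice(space + 1).trim();
  if (!Object.prototype.hasOwnProperty.call(table, keyword)) return null;
  // A function replacer avoids any special handling of "$" sequences.
  return table[keyword].replace("%s", () => encodeURIComponent(terms));
}

console.log(expandKeyword("wp India", keywords));
```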

It's not that cumbersome in Chrome. Right click the URL bar -> "Edit search engines" -> Find the search engine you'd like to easily search, double click the second column and enter a keyword for it. For example, for wikipedia, I have "wp" as a keyword. In the URL bar, you can just type "wp<tab>" and you can search wikipedia.

You don't even have to do any configuration. After you've used a search box that's correctly formatted (like Wikipedia's is), Chrome automatically offers you the option to "press tab to search" when you start typing that site's URL.

Yep, I use this feature all the time and I think there's only ever been one or two sites I've had to set it up for manually. For most sites, Ctrl+T <First 2-3 Letters of Site Name> <Tab> <Search Query> <Enter> works just fine without me having to configure anything.

The level of customization offered in Firefox is a nice benefit though, as you can add a wildcard to any portion of a URL.

I have some specific shortcuts like wdir (walking directions from my home address) and some shortcuts that allow me to quickly jump to subsections of specific sites.

You don't even need to use the tab key. It automatically works for me just from pressing space after the keyword in Chrome. In the end, the usage is identical to FF.

Set DuckDuckGo as your default search engine and you don't even need to do any configuring. You can then just type `!wiki India` or `!gi birds` or any other !bang: https://duckduckgo.com/bang

I've tried that, and while it is nice and useable it is also significantly slower, going through DDG for one more complete roundtrip, than it is going directly to a URL you already know.

I've had one of these for YouTube for years, really excellent feature.

Opening a new tab with yt <songname> is such a habit now.

Exactly the same for me!

That's pretty neat. Didn't know that. Thanks!

And if you set num=0 then it returns the normal "did not match any documents" page: https://www.google.com/search?num=0&q=temperature+london
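
A small TypeScript sketch of building such a link programmatically is below. The helper name `searchUrl` is hypothetical; `num` and `q` are simply the parameter names visible in the URL above, and nothing here is an official API.

```typescript
// Hypothetical helper: builds a search URL with a result-count parameter.
// "num" and "q" are the parameter names that appear in the URL above.
function searchUrl(query: string, num: number): string {
  const params = new URLSearchParams({ num: String(num), q: query });
  return "https://www.google.com/search?" + params.toString();
}

console.log(searchUrl("temperature london", 0));
// prints "https://www.google.com/search?num=0&q=temperature+london"
```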

Which seems like it could be used to fool someone into thinking there really are no results for a subject.

Or to pretend Google search is broken ;)

Or tell the truth about the truth ;)


You still get the google "applet" + 1 search result, any ways to get only the "applet"?

Very cool, thanks!

So cool.

I had always assumed they did this for financial reasons.

The hash won't be sent along as part of the referrer header, so the search query won't be available to 3rd party analytics. The only way to track search queries is to use Google's own analytics, which can correlate queries to traffic since it's watching both ends of the handoff.

The technical explanation discussed in this thread is certainly valid. But it only explains what is happening. The question posed was "why". Why would a search engine implement a design that would have the effect of removing the query string from referrer headers?

Answer that, and you'll answer your question.

There are good technical explanations of the behaviour in the comments. And the observed behaviour is that only subsequent queries get put in the hash instead of the query.

The why is also addressed by the comments (IE only got support for pushState after the solution was already built).

Besides, for a long time, Google added intermediate pages when you clicked through a search result, adding the appropriate referrers right back in.

I don't think you need to attribute this behaviour to malice when a simple technical explanation suffices.

If that's "malice" then everything Google does is malicious. There's nothing wrong with doing something for multiple reasons. It seems exceedingly unlikely that "now we can control Referer" never came up in search product discussions.

If you click a search result, you still have a google.com/url?… redirector in between, so hiding the search term in the hash wouldn't really be necessary.

As others have said, it's to show the data without reloading the page.

They could of course do it without actually adding anything to the url - but then it won't be bookmarkable (and refresh would get you back to a page without results).

They could use the new history pushState API, but then: 1. it won't work in older browsers 2. if they wanted it to work in older browsers they'd have to shim it (which ends up using hashes anyway) and maintain it, and it would add to the page download size, which Google (at least on the main site) takes very seriously.
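
The shim trade-off can be sketched roughly as follows in TypeScript. The helper `nextLocation` and its `hasPushState` flag are hypothetical; this is only an illustration of the fallback logic described in the comment, not any real shim library.

```typescript
// Hypothetical decision helper: where should the next query go?
// With pushState the query string can be rewritten in place; without it,
// only the fragment can change without triggering a page load.
function nextLocation(currentHref: string, query: string, hasPushState: boolean): string {
  const url = new URL(currentHref);
  if (hasPushState) {
    url.searchParams.set("q", query);
    url.hash = "";
  } else {
    url.hash = "q=" + encodeURIComponent(query);
  }
  return url.href;
}

// In a browser the flag could come from feature detection, for example:
//   typeof window.history.pushState === "function"
console.log(nextLocation("https://www.google.com/search?q=wonderland", "alice", false));
```

Note that the fallback branch produces exactly the mixed `?q=...#q=...` URL shape discussed earlier in the thread, which is why a shim "ends up using hashes anyway".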

Why would google.com/?q=search-term not be bookmarkable?

OP is saying that if you do not update the URL between searches, it will not be bookmarkable, not that a URL with the actual search term in the query string isn't. The reason they do not update the query string itself is browser incompatibility. Doing it this way they get support for all browsers released after 2010, whereas they could only support IE 10 or greater by using History.pushState.

Because subsequent searches aren't full page reloads. The first search (via the Chrome URL bar) puts the search in the "q" query string for me.

> Because subsequent searches aren't full page reloads.

This does not mean Google cannot rewrite the URL without reloading the page (via history.pushState). But fewer browsers support this [1] (it is missing in IE9, browsers from 2009, and some Android browsers).

[1]: http://caniuse.com/#search=history.pushState

Could it be to hide the query in the referrer? Although I think all clicks go through Google servers first.

Aren't all Google search pages now on HTTPS? If they are, then they drop the query in the referrer.

Correct. The only service which can provide you with google queries is google analytics.

So if you want the data, you are forced to help spread the google drag net across the internet.

No. Keywords do not appear in Google Analytics since search switched to HTTPS. The query strings are dropped going from HTTPS to HTTP. Keywords will show up as "(not provided)". (Disclaimer: I work at Google but I'm not on the search team. I do use Google Analytics.)

They drop them from https to https using this method as well. In this case, what is good for the user is amazing for Google. They are the only ones that get access to the query stream.

Is there any technical reason for this? Because as far as I know the keywords still appear in Webmaster Tools.

Anchors are not part of a referer sent to external websites; also, it will not be logged in webserver logging since it's never sent to the server. Everything from # on is stripped off and stays on the client. The only way to interact with it, is via javascript.
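
As a hedged illustration of the stripping described here, the TypeScript sketch below computes the most a Referer header could carry for a given page URL. The helper name `refererValue` and the `hl=en` parameter in the example are assumptions for illustration; a stricter referrer policy may trim the value further, which is not modeled.

```typescript
// Hypothetical helper: the most a Referer header could carry for a page URL.
// Browsers strip the fragment and any user:password part before sending it;
// a stricter referrer policy may trim the value further (not modeled here).
function refererValue(pageHref: string): string {
  const url = new URL(pageHref);
  url.hash = "";
  url.username = "";
  url.password = "";
  return url.href;
}

console.log(refererValue("https://www.google.com/search?hl=en#q=alice"));
// prints "https://www.google.com/search?hl=en"
```

The `q=alice` in the fragment never reaches the destination site, while anything in the query string would.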

"never sent to the server"

It's sent in a separate request.

That's because a request is manually created with Javascript; this is not because it's part of the URL. If you append #something to most random URLs, nothing will happen (unless specific behaviour was written to do so).

Yes, I know.

Pretty sure they do it for search privacy, as the hash is not usually passed in the "referer" header, so the pages visited from Google search will not have access to the typed query.

I think they did it to reduce stability and occasionally render blank results pages.

I've lost count of the amount of times I've had to force-refresh the page and enter my query all over again because of this.
