

What is up with the insane long Google results URL now? - mergy
http://mergy.org/2013/01/what-is-up-with-the-insane-long-google-results-url-now/

======
dsl
I assume someone at Google reviews and approves each one of these parameters
that gets added, and each one of them has a good reason for existing (even if
we don't know what it is or believe it's a good reason).

As a thought experiment, I asked myself: if I worked at Google, how would I
quantify the expense of a query parameter? Users don't really know or care what
URLs look like anymore as long as the page is fast, so apart from measuring the
gallons of tears generated on Hacker News, I guess resource usage would be the
first logical thing to look at, right?

Assuming every search is exactly like the one provided in the linked article,
and using some public numbers from 2011 on searches/day, these are the top 5
parameters in terms of bandwidth used just transmitting the key and value to
the server.

    
    
       gs_l       42.3 Mbps
       bav        11.7 Mbps
       bvm         9.1 Mbps
       fp          8.7 Mbps
       sclient     6.1 Mbps
    

For just the top 5, that's almost 78 Mb of inbound data per second. It is sent
over end users' pipes (which are often limited by upload), hitting my transit
and peering links, passing through my routers, hitting my front end load
balancers, being turned into GoogleInternalProtocolX and fanned out to dozens
of servers inside the datacenter over switches and internal routers, and being
logged on durable storage for 18 months (assuming they dump this extra data
when they strip personal information from the search record). Wow.

As a general rule of thumb, every 20 bytes you add to URLs by way of keys or
values increases the overhead of almost every part of the Google
infrastructure by 1 MB per second. (Note I did switch from bits to bytes in
the final conversion, and this number is sourced from shaky data to start
with.)
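
The rule of thumb can be checked with a few lines of arithmetic. This is only a
sketch: the rate of 50,000 searches per second is an assumption back-derived
from the "20 bytes is about 1 MB per second" figure, not a number taken from
any public source, and `extra_bandwidth` is a hypothetical helper name.

```python
def extra_bandwidth(extra_bytes, searches_per_second):
    """Return (megabytes/s, megabits/s) added by extra_bytes on every search URL."""
    bytes_per_second = extra_bytes * searches_per_second
    megabytes = bytes_per_second / 1_000_000
    megabits = bytes_per_second * 8 / 1_000_000
    return megabytes, megabits


# Assumed rate, back-derived from the rule of thumb (not a measured figure).
ASSUMED_SEARCHES_PER_SECOND = 50_000

mb_per_s, mbit_per_s = extra_bandwidth(20, ASSUMED_SEARCHES_PER_SECOND)
print(f"{mb_per_s:.1f} MB/s = {mbit_per_s:.1f} Mbps")
```

The same function shows the bits-versus-bytes switch: the table is in megabits
per second, while the rule of thumb is in megabytes per second.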

~~~
lelandbatey
Maybe I missed it, but where are you getting this data from? It looks quite
interesting, and I'd love to see more. Would you be willing/able to share?

~~~
dsl
I googled for "google searches per day", kept clicking around until I found a
few sites that agreed on a number for a recent year. Counted the number of
bytes for each URL parameter, then just did a bunch of back of the envelope
math. None of it is scientific or probably even close to the actual numbers.
Just a thought experiment I did that I thought might be interesting enough for
a comment.
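
For anyone who wants to repeat the byte counting, here is a rough sketch of how
it could be done. The URL is hypothetical (the parameter names come from the
table in the parent comment, the values are made up), and counting one extra
byte per pair for the `&` separator is my own assumption about what to include.

```python
from urllib.parse import urlsplit


def param_sizes(url):
    """Map each query key to the bytes used by 'key=value' plus one separator."""
    sizes = {}
    for pair in urlsplit(url).query.split("&"):
        if not pair:
            continue
        key = pair.split("=", 1)[0]
        # Count the raw (still percent-encoded) bytes, since that is what is sent.
        sizes[key] = sizes.get(key, 0) + len(pair.encode("utf-8")) + 1
    return sizes


# Hypothetical URL: names are from the thread, values are invented.
url = "https://www.google.com/search?q=test&sclient=psy-ab&fp=1234abcd"
for key, size in sorted(param_sizes(url).items(), key=lambda kv: -kv[1]):
    print(key, size)
```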

~~~
Cogito
You were correct, but maybe add some more details and links to your sources?

------
NelsonMinar
Google's search URLs have slowly been creeping up in length over the years.
Some of the parameters like hl are for user interface control (in this case,
language). Others like sourceid are for broad tracking of who's using Google
how. And lately there are many more nonces mostly related to tracking
individual users, Google+, instant search, etc. Many of those change based on
who you are logged in as.

Google is also responsible for all the utm_source spam in Feedburner and other
tracked URLs. And they are the ones behind the #! / _escaped_fragment_
nonsense. Google's first search product worked so well mostly because it
relied on standard URLs as unique pointers to web pages. It's a shame they're
breaking URLs in so many ways now.

~~~
polyfractal
I don't have a problem with the _escaped_fragment_ stuff, since no human is
supposed to ever see those. I utilize those on my single-page JS website and
it is fantastic. Google can index my site while I can serve blazingly fast
pages to my users.

If a human ever sees an _escaped_fragment_ page...something terribly wrong has
happened.

------
patio11
Speculation: way back in the day if you Googled for [software marketing
service], liked what you saw, and had me Google for it, I'd get substantially
the same results. That has been increasingly untrue for years, due to
geotargeted results, personalization, persistence across searches, search
refinement, their datacenter model, yadda yadda. If users expect that
copy/pasting a link to a search will let someone else repro it, then that
search query now has to carry more state than just "What I typed into the
Internet before I clicked the Googles" (totally baffling to users, who think
this is what selects their results).

~~~
batgaijin
but what the hell is the point of that if the url redirects to another url? do
they really want to track people copy-pasting urls into email/ims? if so, why
not use goo.gl?

~~~
jxi
What redirect? And, did you even read the comment that you're replying to?

1\. There isn't a redirect from that huge link.

2\. The post you're replying to just explained a theory that doesn't involve
tracking: the URL needs to be that long to be unique enough to specify a page
with the exact same search results.

~~~
michaelt
Perform the following experiment:

1\. Visit <https://www.google.com/search?q=hackernews>

2\. Mouse over first result, note the url news.ycombinator.com shown in your
browser's status bar.

3\. Right-click the first result and 'copy link location'

4\. Check what's on your clipboard. For me it's
www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CDIQFjAA&url=http%3A%2F%2Fnews.ycombinator.com%2F&ei=fRb5UNSeD-
am0AW88YGQBg&usg=AFQjCNGKJHXhsq1s0-gYR96B--m47G9oRw&bvm=bv.41248874,d.d2k

5\. Visit said URL in your browser and note you get redirected to
news.ycombinator.com

As you can see, you are redirected via a huge link.
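
The destination is sitting in the `url` query parameter of that link, so it can
be recovered without following the redirect. A minimal sketch, assuming the
link has the same shape as the one on my clipboard; `redirect_target` is a
hypothetical helper, and the sample link is a shortened form with the tracking
parameters left out.

```python
from urllib.parse import urlsplit, parse_qs


def redirect_target(link):
    """Return the destination in the 'url' query parameter, or None if absent."""
    params = parse_qs(urlsplit(link).query)
    values = params.get("url")
    return values[0] if values else None


# Shortened form of the clipboard link; tracking parameters are omitted.
link = ("https://www.google.com/url?sa=t&rct=j&source=web&cd=1"
        "&url=http%3A%2F%2Fnews.ycombinator.com%2F")
print(redirect_target(link))
```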

~~~
daave
As janzer said, that's not the URL patio11 is talking about, however this
redirect is here for a good reason too. It's needed to preserve the search
terms in the referrer header for links that are triggered by JavaScript -
which is needed if webmasters want to see which search terms are driving
traffic to their sites.

~~~
No1
I presume it's also handy for tracking click-throughs.

------
j_s
This is Google keeping search keywords for their paying customers (changing
referer), among other things. Honestly I wouldn't care so much if it weren't
so dang slow at random times!

Firefox addon (useful many places):

[https://addons.mozilla.org/en-us/firefox/addon/redirect-
remo...](https://addons.mozilla.org/en-us/firefox/addon/redirect-remover/)

Chrome:

[https://chrome.google.com/webstore/detail/undirect/dohbiijnj...](https://chrome.google.com/webstore/detail/undirect/dohbiijnjeiejifbgfdhfknogknkglio)

Edit: This comment is relevant to search results, not the address bar, my bad!

~~~
Cthulhu_
Thanks for linking to that one, the redirect thing is very annoying (and time-
consuming, relatively speaking, which for a speed-obsessed party like google
isn't a good thing or very representative)

------
rosche
But I really want it to stop. I hate not being able to just "copy link
address" off of the results page

~~~
jakub_g
This is really insane. The only important params from my point of view are
(apart from query) hl (language -- important if you use google.com outside of
US to have results in English or any other lang), page number, and safe (for
filtering).

The solutions that could be applied on Google side for this:

1\. Perhaps they could add some button like on YouTube and other pages to
share the search (be it web search, image search etc.). Of course it will
clutter the UI but it's already cluttered and non-intuitive, especially given
their experiments (AB testing?). Every few weeks I see something moved,
changed, colors tweaked. Think what if Microsoft were doing something like this
with Windows or Office... :)

2\. Use the History API to decrappify the URL after the load (it's a kind of
cheat, but I would like it).
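
The "decrappify" step itself is just a filter over the query string. Here is a
rough sketch of that filter, not of the browser-side History API call, which is
outside this example. `q`, `hl` and `safe` are the parameters named above;
using `start` as the page-number key is my assumption, and the sample URL
values are made up.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# q, hl and safe are named in the comment; "start" for paging is an assumption.
KEEP = ("q", "hl", "start", "safe")


def clean_search_url(url):
    """Drop every query parameter not in KEEP, preserving order (and drop the fragment)."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in KEEP]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Hypothetical long URL; the extra parameter values are invented.
long_url = ("https://www.google.com/search?q=hacker+news&hl=en"
            "&sclient=psy-ab&fp=1234abcd&bav=on.2")
print(clean_search_url(long_url))
```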

------
nonamegiven
Easy to work around, no plugins or silliness required.

Get to Google via Duck Duck Go as your intermediary. Either from your search
bar (if DDG is your default search engine), or directly from the DDG search
page, search for

!g testing

!g tells DDG to send the query to google, and you land on the google results
page (not DDG), at the following URL:

[https://encrypted.google.com/search?hl=en&q=testing](https://encrypted.google.com/search?hl=en&q=testing)

Even if you prefer google to DDG, you should make DDG your default search, and
get to google via !g.

!gi searches google images. !am searches amazon. !imdb searches imdb, etc.

<https://duckduckgo.com/bang.html>

~~~
marshallford
I am really interested in this, any tips for automation on chrome so I can use
the omni bar and don't have to type !g ?

~~~
nivla
Click on the "Zebra Crossing" icon (that used to be the wrench/spanner :sigh:),
then Settings > Search > Manage Search Engines.

Scroll to the bottom and add "DDGooG" as Name, duckduckgo.com as the keyword
and <https://duckduckgo.com/?q=!g%20%s> as the URL.

Hover over the URL and select "Make Default".

You are all set :)

~~~
magicalist
That's a start, but it's a pretty roundabout way to do it.

You can just use
[https://encrypted.google.com/search?hl=en&q=%s](https://encrypted.google.com/search?hl=en&q=%s)
directly :)

~~~
alxndr
...and make ones for Google Images, Google Maps, Wikipedia, and lots of other
sites...

~~~
magicalist
right, but that's different than what the GP to my post was asking, which was
how to automate a regular google search with the clean URL without having to
type !g every time.

------
jackhlaw
Sucky side effect - Safari's browsing history just lists a whole bunch of
really long google URLs. If I'm looking for something particular I have to try
every link in the list.

Edit: See [http://bartkowalski.com/2012/02/google-urls-in-safari-
browse...](http://bartkowalski.com/2012/02/google-urls-in-safari-browser-
history/) for an example.

~~~
craigc
Well the original post was about the results pages. These urls are the
redirects when you click on one of the results.

This seems like a bug in Safari though and not an issue with Google's URLs
although I admit they are not too pretty.

An alternative to trying every link would be to try a different browser.

~~~
brigade
It's entirely due to Google's client-side redirects. Which are stupid and
annoying and Google should fix them.

But since Google doesn't care, the Detox extension works nicely. Though it's
always somewhat sad when you have to resort to browser extensions to work
around web programmers' stupidity.

EDIT: Also, I just tried your suggestion of a different browser, and tried
Google's own browser. The stupid titleless redirection URLs are still there,
cluttering up your history. So that doesn't really fix it.

------
franze
my job involves copy & pasting URLs into mails, IMs, documents, .. all the
time, that's why the default search URL triggered by my google chrome omnibar
looks like this:

    
    
      https://www.google.com/search?q=hacker+news&pws=0&hl=en
      (pws - no personalization, hl = language) 
    
    

how to: Google Chrome Settings -> Section "Search" Button "Manage Search
Engines" -> Overlay "Other Search Engines" -> Scroll Down -> Add new search
engine with the URL
[https://www.google.com/search?q=%s&pws=0&hl=en](https://www.google.com/search?q=%s&pws=0&hl=en)
-> "Make default"

sadly custom search engines aren't synced in chrome, so every time you set up
a new browser, you need to add your custom search setting again...
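
For the curious, the `%s` placeholder works roughly like the sketch below: the
typed query is URL-encoded and substituted into the template. This is an
illustration, not Chrome's actual implementation; `expand_template` is a
hypothetical name, and encoding spaces as `+` is an assumption chosen to match
the example URL above.

```python
from urllib.parse import quote_plus


def expand_template(template, query):
    """Substitute the URL-encoded query for the %s placeholder."""
    return template.replace("%s", quote_plus(query))


template = "https://www.google.com/search?q=%s&pws=0&hl=en"
print(expand_template(template, "hacker news"))
```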

~~~
aaronharnly
This is fantastic advice. Thank you.

------
keyboardP
It's been like this for a while. A Twitter friend of mine created a simple
bookmarklet that you can click which will remove the extra data and allow you
to right-click and copy the URL alone. You can drag the bookmarklet from here
to your bookmark bar [http://techkp.blogspot.co.uk/2012/01/copy-pasting-
googles-se...](http://techkp.blogspot.co.uk/2012/01/copy-pasting-googles-
search-results.html)

Click the bookmarklet before copying the link to get the actual URL.

~~~
vjani
Or you could use this greasemonkey script that doesn't even require you to
click on the bookmarklet, and it makes all result links direct, so you can
reach them faster:

<http://userscripts.org/scripts/show/134151>

------
swlkr
In chrome I set my default search engine to "Googol" with the search query:
<http://google.com/search?q=%s>. Seemed to fix it for me

~~~
zeroexzeroone
but then you have to tell everyone who visits your site to do this...

------
daave
Some of the parameters are there to enable certain performance enhancements.
For example:

- The home page, after rendering the search box and all that, asynchronously downloads the css, images and html needed to display the chrome around the results listing.

- That way it is cached on the client, and when they perform a query, less data has to be downloaded from Google's servers (just the actual results & ads), making the result page render faster.

- It knows not to download all the chrome due to the presence of the 'fp' GET parameter; the absence of that parameter will cause the entire results page, including chrome, to be downloaded.

I presume the rest of the parameters are useful for similar reasons.
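
To make the 'fp' behaviour concrete, here is a purely illustrative sketch of
the decision as described. It is not Google's code; `response_kind` and the two
return labels are hypothetical names, and the sample value of `fp` is made up.

```python
from urllib.parse import urlsplit, parse_qs


def response_kind(url):
    """Decide what to send based only on whether 'fp' is present."""
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return "results_only" if "fp" in params else "full_page"


print(response_kind("https://www.google.com/search?q=test&fp=1234abcd"))
print(response_kind("https://www.google.com/search?q=test"))
```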

It should also be noted that for non-HTML5 compatible browsers, modifying the
hash fragment with JavaScript is the only way to change the url that gets
bookmarked without causing a page reload (which would add latency), so if you
want a bookmarkable local results page for images with certain preferences,
adding a bunch of crap (latitude, longitude, preference hash, query, search
type, etc.) to the fragment is the only way.
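
State kept in the fragment is never sent to the server in the HTTP request, so
it has to be read back on the client. A minimal sketch of that read-back,
assuming the fragment uses the same key=value&key=value layout as a query
string; `fragment_state` is a hypothetical helper and the sample URL is
invented.

```python
from urllib.parse import urlsplit, parse_qs


def fragment_state(url):
    """Parse key=value pairs stored after the '#' into a dict of lists."""
    return parse_qs(urlsplit(url).fragment)


# Hypothetical URL; the keys and values are for illustration only.
print(fragment_state("https://www.google.com/#q=keywords&hl=en"))
```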

------
readme
DuckDuckGo URL: <https://duckduckgo.com/?q=test>

~~~
firefoxman1
<http://www.google.com/search?q=test>

~~~
quaz3l
Click the search button.

------
dfc
CoralCDN link:

[http://mergy.org.nyud.net/2013/01/what-is-up-with-the-
insane...](http://mergy.org.nyud.net/2013/01/what-is-up-with-the-insane-long-
google-results-url-now/)

I couldn't get through to the main site. Fortunately someone had fetched the
page over coralcdn at some earlier point. I wonder if someone wrote a
proactive coralcdn bot for HN links.

~~~
mergy
Throwing some more bandwidth on it now. Thanks for the mirror link.

------
kahirsch
One side effect of these is that it's hard to figure out which minimal set of
parameters is needed to search when you're trying to create a keyword search
(or similar task). E.g., I can type "gna harry carey" in my address bar to
search in the Google News Archive, but they've broken my URLs several times
and sometimes the fix was far from intuitive. (Right now, for gna I use
"[https://www.google.com/search?hl=en&gl=us&tbm=nws...](https://www.google.com/search?hl=en&gl=us&tbm=nws&q=%s&tbs=ar:1)".)

Since we're on the topic of Google searches, have other people noticed that
advanced searches have become quite a bit worse over the last couple of years?
The basic search is often quite astounding in how it gets what you want at the
top of the results, but using advanced operators often produces quite strange
results.

This is even more obvious in the non-web searches, such as Google News, Google
Groups, patents, and Google Books. There you can see a result, but then if you
add a restriction to the search, such as date range, or "group:" or
"ininventor:", it won't find any results--even the ones you just saw which
match the new criteria.

------
hgpc
If you want to share a Google search, you can always use the short and sweet:
google.com/#q=keywords

------
adrockdust
I don't believe this is a new development.

~~~
marshray
I think it used to happen only if your User-Agent parsed a few specific ways.
I did just recently notice the broken "Copy Link Location" in places where it
wasn't before.

All the more reason to stick with DDG.

------
wahnfrieden
Does someone know a working Chrome extension / userscript that will disable
the Google search results redirects?

~~~
vjani
Yep, <http://userscripts.org/scripts/show/134151>

------
therandomguy
Not sure if this is related but I started using this goo.gl extension and it
has been great to paste links in emails. I use it all day long especially for
Google docs links.

<http://goo.gl/Ya1kC>

------
untangle
In Chrome, you can lessen the pain by using the "goo.gl" (or similar)
extension.

------
RexRollman
Google: slowly fucking up a good thing.

