
Copy-Pasting URLs from Google Search Can Leak Previous Searches - andygambles
https://medium.com/@jeremyrubin/caution-copy-pasting-urls-from-google-search-can-leak-previous-searches-11940508e79
======
danso
I've noticed this before, as every time I copy-paste a search URL, I manually
delete the parameters as even the non-readable ones probably contain some
metadata about me.
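
The manual cleanup described above could be scripted. This is only a rough sketch: `shareable_search_url` is a made-up helper name, the `ie=UTF-8` parameter in the sample URL is an assumption added for illustration, and it assumes that a `#q=` fragment (when present) holds the most recent search.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode


def shareable_search_url(url: str) -> str:
    """Return a copy of a search URL that carries only the latest 'q' value."""
    parts = urlsplit(url)
    # Assumption: a "#q=..." fragment, when present, is the newest search.
    fragment_q = parse_qs(parts.fragment).get("q")
    query_q = parse_qs(parts.query).get("q")
    latest = (fragment_q or query_q or [""])[-1]
    # Rebuild the URL with a single q parameter and no fragment.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode({"q": latest}), "")
    )


print(shareable_search_url(
    "https://www.google.com/search?q=original&ie=UTF-8#q=newsearch"
))
# https://www.google.com/search?q=newsearch
```

Every other parameter is dropped, so the recipient may well see differently ranked results than the sender did.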

Yeah, it'd be nice if Google search pages came with a handy "Link to this"
button as Google Maps and Google Books do...but in reality, that would raise
even more issues. If I understand Google's search mechanism...the results are
tailored to each user...my search for "burger place" is going to be much
different than someone doing it in New York. Yet if I see a "Link to this"
button -- _especially_ if it's a link shortener -- then I would expect it to
direct all users to results that I've seen. OK, _I_ don't expect that...but I
bet 99% of other users do.

So it's not a clear-cut decision about whether such a feature should exist,
given the ephemerality and personalization of SERPs. So given that it's not an
intended common use case to pass around URLs to results...what is it that
Google should do to fix this? Re-engineer their system so that every URL is a
non-human-readable URL that obfuscates all possible metadata? Remove the
ability to tailor search results based on past results of that same session? I
can see why they might choose: "If a user wants to send around a URL by
manually copying and pasting it, we should trust them to read it before they
send it out as the relevant parameters are human readable."

~~~
saalweachter
I suspect it's someone being clever.

I _think_ doing subsequent searches as a "#q=x" change to the URL lets them
avoid reloading the page and use clever javascript to replace the results.

~~~
simoncion
A friend of mine says that the only part of the URL you can change on older
browsers and _not_ refresh the page is the anchor-tag part.

OTOH, if you _only_ put search params in the anchor-tag part, the initial
search is much slower on slow connections.

Still, infoleaks are BAD. Page load and update speed shouldn't come before
safety of a user's private information.

~~~
plonh
Strange to cite a "friend" when stating a basic fact about how web browsing
works.

~~~
PhasmaFelis
Fact-checking is good, but it takes time and it's not always worth the
trouble. I'd rather see someone use an honest qualifier than either claim
certainty they don't have, or not post at all.

------
nostrademons
Weird...when I was at Google, very significant engineering effort was expended
(we're talking man-years) to prevent exactly this case, and it was a major
design constraint on the features we could launch. Something must've changed
in the thinking of the higher-ups.

I also wonder why they don't just use pushState, which was proposed as a
solution way back in 2010 but didn't have the browser support necessary. Now
it's got the browser support:

[http://caniuse.com/#search=pushState](http://caniuse.com/#search=pushState)

~~~
sotojuan
There was a thread a few weeks back about something Google-related and an ex-
Googler was saying how many hours they spent making the home page and search
results pages as light as possible, but that recently they stopped caring
about that.

~~~
nostrademons
You mean this one?

[https://news.ycombinator.com/item?id=10395008](https://news.ycombinator.com/item?id=10395008)

That ex-Googler was me. :-)

I'm not actually sure whether they don't care about it, BTW - the SRP is about
1/3 as heavy as it was when I left, so it looks like someone's been cleaning
it up. These things tend to move in cycles - I was hired at the very end of a
"latency & performance" cycle, then spent most of my career there during a
"moar featurez!" cycle, and it wouldn't surprise me if the focus is again
latency and performance.

~~~
mattmanser
Doesn't that bring into question all the research that says the faster your
page loads the more money you make?

If Google can swing between adding a load to the page load and then whittling
it down making it faster, the money difference involved must be fairly
trivial.

~~~
nostrademons
Not exactly. It's more that making the page faster is fairly predictably going
to get you $X in additional revenue. Adding a new feature is going to get you
anywhere from $0 to $Y in revenue, but _you can't know what $Y is until
you've launched it and given users some time to learn about it_. So the only
way to avoid getting stuck in a local maximum - your current feature set, as
optimized as possible - is to periodically try to shake up the page, add some
new features, and measure their effects. After a couple years or so, the
features for which $Y < their cost in latency & developer maintenance are
killed, and a new round of optimization & code cleanup starts based on the
current feature set.

------
a3n
Similar behavior when you search for something in Amazon, then click on
something. The original search is in the URL of the page that you're on. Be
careful sharing that last link.

Not Safe for Work:

Here I searched for "blow up doll" (yup, they sell them), then clicked on a
cute alien inflatable doll. The original search is at the end of the martian
doll URL, that I so thoughtfully shared with my kid, or grandkid, or spouse,
or employer's decorating committee ...

[http://www.amazon.com/Green-Inflatable-Martian-Alien-
Decorat...](http://www.amazon.com/Green-Inflatable-Martian-Alien-
Decoration/dp/B003KXWHPK/ref=sr_1_1?ie=UTF8&qid=1446073301&sr=8-1&keywords=blow+up+doll)
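
For what it's worth, a link like that can be trimmed before sharing. The sketch below is only an illustration: `strip_amazon_tracking` is a hypothetical helper, and it assumes the product page still resolves once the `ref=` path segment and the query string are gone.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_amazon_tracking(url: str) -> str:
    """Drop the query string and any 'ref=' path segment from a product URL."""
    parts = urlsplit(url)
    # Segments such as "ref=sr_1_1" describe how the page was reached.
    segments = [s for s in parts.path.split("/") if not s.startswith("ref=")]
    return urlunsplit((parts.scheme, parts.netloc, "/".join(segments), "", ""))


url = (
    "http://www.amazon.com/Green-Inflatable-Martian-Alien-Decoration/dp/B003KXWHPK/"
    "ref=sr_1_1?ie=UTF8&qid=1446073301&sr=8-1&keywords=blow+up+doll"
)
print(strip_amazon_tracking(url))
# http://www.amazon.com/Green-Inflatable-Martian-Alien-Decoration/dp/B003KXWHPK
```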

------
cromwellian
In another time and place, and if Google wasn't involved, I doubt this would
even hit the front page of HN. People have been embedding tons of state into
URLs for years without obfuscating/encrypting them. Web apps should be
idempotent with respect to URLs, and bookmarking any screen you're looking at
should allow you, the user, to return to that state.

Maybe there should be a fix for this, at least to encrypt the old query, but
realistically, this behavior is widespread in web apps, and it's only
newsworthy because Google.

~~~
lukeschlather
Unnecessary state should not be present in the URL. Yeah, people do it all the
time, but it's sloppy coding and bad for many reasons, including those
described in the article.

~~~
voltagex_
What about allowing people to bookmark parts of a single page app?

~~~
detaro
How does that contradict leaving out state that isn't needed to return to a
place in the app (in this case, past state)? I'd think most users expect that
a URL leads them to the same view again, but not that it preserves the history
of how they got there.

------
caffeinewriter
While I understand why people are irked (to say the least) by this, I can
understand the technical reasoning behind it.

The hash part of the URL (#andmorestuff) is not sent to the server by the
browser; the query string (?var=stuff&another=otherstuff) is.

Using the query string to send the search to the server is smart, since the
server can respond with data without an additional round-trip AJAX request.
However, short of the History API, client-side scripts can't change anything
in the URL other than the hash without triggering a page load. So, if you want
the URL to reflect the search, you can either a) reload the page using the
query string, or b) follow the single page app methodology: write the new
search into the hash, and fetch the results using an AJAX request.

To keep Google's interactivity, with the smoother feel of not seeing an entire
page refresh (plus potentially less data being sent over the wire after the
initial load), the search could simply be stored in internal state. However,
should the user refresh, the search would be lost. If you try going to a Google search with a
search query hash, you'll notice a brief delay before the search results are
displayed.

For example, in this link:
[https://www.google.com/search?q=original#q=newsearch](https://www.google.com/search?q=original#q=newsearch)

The page loads, but there's an AJAX request to the /search route with the
hash-specified query, with query string parameters to specify how the data
should be served. Still, the client-side script cannot strip the query string
of the previous search without a full page reload.
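
To make the split concrete, here is a small sketch using the example link above. `split_for_request` is a made-up name; it just separates what a client would put on the HTTP request line from the fragment that never leaves the browser.

```python
from urllib.parse import urlsplit


def split_for_request(url: str) -> tuple[str, str]:
    """Return (request target sent to the server, fragment kept client-side)."""
    parts = urlsplit(url)
    # The request line carries the path and query string, never the fragment.
    target = parts.path + ("?" + parts.query if parts.query else "")
    return target, parts.fragment


target, fragment = split_for_request(
    "https://www.google.com/search?q=original#q=newsearch"
)
print(target)    # /search?q=original
print(fragment)  # q=newsearch
```

So the old search is what reaches the server on the initial load, and the new one is only visible to the script running in the page.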

While the info leakage is annoying, it's unavoidable in this strategy.

Note: The browser history, and subsequently the URL can be manipulated through
the History API's pushState, which could be a direction Google could go in,
but IMHO, the hash approach is viable as well.

[https://developer.mozilla.org/en-US/docs/Web/API/History_API](https://developer.mozilla.org/en-US/docs/Web/API/History_API)

~~~
Exuma
Why can't typing a search in the Chrome bar go directly to google.com#search-term?

~~~
aiiane
Because that would make the search results load slower, since it would first
have to load google.com and then do a second request for the actual search.

~~~
davito
But isn't that how it's meant to work?

------
signaler
Clean Links [https://addons.mozilla.org/en-US/firefox/addon/clean-links/](https://addons.mozilla.org/en-US/firefox/addon/clean-links/)

Mozilla Addons to the rescue once again. I am always surprised at how
underappreciated plugins like these are. If I ever stumbled into money, I would pay
it forward to all the developers who feverishly code these plugins for the
betterment of humanity and the web.

------
jlrubin
Author here -- curious to hear HN's thoughts on if this should be fixed or is
acceptable.

~~~
Terribledactyl
I don't recall needing to copy and paste a search URL to someone in a long
time, but maybe other people use it in their workflow. (maybe in jest I'll use
[http://lmgtfy.com](http://lmgtfy.com))

Interestingly enough in Safari on 10.11, the URL I see in the address bar when
searching for "$x" is just "$x" but when I copy and paste "$x" into a text box
I see my previous search in the URL. I've noticed Safari shortening to just
the base recently but this feels a bit off.

~~~
simoncion
> ...when I copy and paste "$x" into a text box...

What do you mean by this? The following things fail to repro the issue for me
on Firefox 41:

* Open google.com
* Type a search in the search box, press return. (get a #q=$SEARCH anchor)
* Type another search in the search box, press return. (get a #q=$SEARCH2 anchor)

* Go to google.com
* Type in the search box, press return. (Same result as above.)
* Paste text into the search box, press return. (Same result as above.)

~~~
jlrubin
It's a very Safari-specific thing.

address bar reads [$x], where $x is search term, on copy you get a url.

~~~
simoncion
> address bar reads [$x], where $x is search term, on copy you get a url.

Yikes.

------
edtechdev
For me, it's mainly an issue when trying to copy and paste a link to a file
like a Word document that downloads instead of being viewable in the browser.

------
RawInfoSec
>Note: This information has been disclosed to Google appropriately, they have
>chosen to not fix this behavior.

>The other day, my friend sent me a link

So did Google respond with a "No" or is 2-3 days with no fix considered too
long?

I'm not diminishing the seriousness of this problem but it just got a whole
lot worse being on HN after only 2-3 days head start on a fix to a major part
of the Google product line.

~~~
jlrubin
They responded with a no, even after further prompting that they should fix
it.

~~~
RawInfoSec
Can you post this dialog? I have trouble understanding how they gave a solid
"No" on something which warrants much consideration.

Most companies won't even respond if they don't intend to fix something, that
way they can claim they're working on a fix when it all blows up in their
face.

------
morganvachon
A safe(r) way to search Google is using startpage.com, which searches Google
anonymously on your behalf. I just confirmed this bug doesn't translate over
to Startpage; this is the URL you get when you follow the same procedure as
the author:

[https://startpage.com/do/metasearch.pl](https://startpage.com/do/metasearch.pl)

~~~
lern_too_spel
Then how will the author share the results page? That was the task they wanted
to accomplish.

~~~
laurent123456
Looks like there's a "Bookmark this search" link. It can be right-clicked to
copy the URL.

------
joering2
Slight OT, but it took me 4 hours to realize that it's Google who's crawling
the URLs of pages (mostly cron jobs, or admin sites) that nobody should
ever stumble upon.

I'm not sure how this works (the URL I paste and use in Chrome is sent to
Google?), but it came back a few hours later from an agent called "Google
favicon". The IPs in 66.249. range strongly suggest it's a legitimate Google
bot.

Why they do it - no idea, but just to be safe I blocked all 66.249. and
66.102. traffic across my systems.

EDIT: yes, there is a robots.txt that disallows all traffic. Google ignores it.

This is a very internal site - no need to have everyone from 66.249 accessing it.

~~~
toomuchtodo
Using robots.txt to block them should've worked as well.
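
A quick way to sanity-check the rules themselves is Python's standard `urllib.robotparser`. This is a sketch only: `is_allowed` is a made-up helper and `internal.example` is a placeholder host. It shows what a crawler that honors robots.txt would conclude, and says nothing about whether a particular fetcher actually honors it.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines: list[str], user_agent: str, url: str) -> bool:
    """Evaluate robots.txt rules (given as lines) for one agent and URL."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# A robots.txt that disallows all traffic, as described above.
DENY_ALL = ["User-agent: *", "Disallow: /"]

print(is_allowed(DENY_ALL, "Googlebot", "https://internal.example/admin/cron"))
# False
```

Since robots.txt is advisory rather than access control, authentication or a network-level block remains the only real barrier for an internal site.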

~~~
oaktowner
s/as well/better

------
mikkom
Are you sure this isn't by design? I mean if you send a link of

    
    
        https://www.google.com/search?q=computer+language&....&q=python
    

it _should_ return different results than

    
    
        https://www.google.com/search?q=snakes&....&q=python
    

So if people send URLs with queries (they should just send actual target URLs)
then it's quite relevant that the search results are the same for both
links sent.
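
For illustration, both values are recoverable from a URL with a repeated parameter. The sketch below uses simplified versions of the two links with the elided middle parameters left out (an assumption), and `q_values` is a made-up helper. How the server weighs the earlier value is not something this shows.

```python
from urllib.parse import urlsplit, parse_qs


def q_values(url: str) -> list[str]:
    """Return every value of the 'q' query parameter, in order of appearance."""
    return parse_qs(urlsplit(url).query).get("q", [])


print(q_values("https://www.google.com/search?q=computer+language&q=python"))
# ['computer language', 'python']
print(q_values("https://www.google.com/search?q=snakes&q=python"))
# ['snakes', 'python']
```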

~~~
aw3c2
The problem is that the first search terms are included for no reason.
[https://www.google.com/search?q=psychiatric+doctor+in+palo+a...](https://www.google.com/search?q=psychiatric+doctor+in+palo+alto&....#q=python)

------
josu
>Google’s automatic inclusion of prior search terms is a similar violation of
a user’s privacy expectations, and they should fix it.

How? That's information you are voluntarily sharing. Besides you can easily
see what you are sharing since it's not obfuscated nor encrypted in any way.

------
jamesrom
The lack of ability to share search results is a conscious decision by Google
to make their search engine feel private and personal.

You have your own Google that knows a great deal about you. It is tailored
to you, your location, your life. Google has become an extension of self.

Making search more social by adding simple sharing capability means that you
would be less likely to use Google for personal and private search queries.

Google is the most successful advertising company on the planet.

