
Chrome deploys deep-linking in latest build despite privacy concerns - mikro2nd
https://www.theregister.co.uk/2020/02/20/chrome_deploys_deeplinking/
======
derefr
Amusingly, I built a (private prototype of an) extension for functionality
almost exactly like this a while ago. My goal was to be able to bookmark
arbitrary long articles (or single-page books) “in the middle”—without the
author providing an anchor permalink—and then come back to them where I left
off, precisely by embedding my scroll location on the page into the fragment
of the bookmarked URL. It worked pretty well, and mostly obviated my need for
any service like Instapaper/Pocket.
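
A minimal sketch of the idea, for the curious. The `scroll=` fragment format and both function names are made up here for illustration and are not taken from the actual prototype; the only real API used is the standard `URL` class.

```typescript
// Hypothetical fragment format: "#scroll=<fraction of the scrollable height>".
const PREFIX = "scroll=";

// Build a bookmarkable URL that remembers how far down the page the reader was.
function withScrollFragment(pageUrl: string, scrollY: number, scrollableHeight: number): string {
  const url = new URL(pageUrl);
  // Store a fraction rather than pixels so the bookmark survives a window resize.
  const fraction = scrollableHeight > 0
    ? Math.min(1, Math.max(0, scrollY / scrollableHeight))
    : 0;
  url.hash = PREFIX + fraction.toFixed(4);
  return url.toString();
}

// Recover the stored fraction, or null if the fragment is an ordinary anchor.
function readScrollFragment(bookmarkUrl: string): number | null {
  const hash = new URL(bookmarkUrl).hash.replace(/^#/, "");
  if (!hash.startsWith(PREFIX)) return null;
  const fraction = Number(hash.slice(PREFIX.length));
  return Number.isFinite(fraction) ? fraction : null;
}
```

In a browser extension the inputs would presumably come from `window.scrollY` and the document height, and the restored fraction would be passed to `window.scrollTo`; that wiring is left out so the sketch stays runnable outside a page.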

I always wondered why browser bookmarks don’t just _work like this_, even if
doing so required storing additional metadata outside the URL itself.

One thing that’s interesting about this functionality in Chrome specifically,
is that due to the nature of Chrome’s PDF support, these URLs should allow you
to deep-link into a PDF from outside it, which hasn’t really been possible
before.

Still quite a way from Xanadu transclusions, but it’s a start.

~~~
jfim
For those confused by the "Xanadu transclusions" part, it's basically quoting
documents through hyperlinks (somewhat like OLE embedding on Windows).

See
[https://en.wikipedia.org/wiki/Transclusion](https://en.wikipedia.org/wiki/Transclusion)

------
thinkingemote
The quote about DNS seems wrong to me

""Consider a situation where I can view DNS traffic (e.g. company network),
and I send a link to the company health portal, with #:~:text=cancer," he
wrote. "On certain page layouts, I might be able [to] tell if the employee has
cancer by looking for lower-on-the-page resources being requested.""

I thought DNS requests just get the domain, not the hash and not even the page
requested.

But let's play along. How would sending a link tell you anything apart from
whether a user clicked said link? How would links to existing anchors
example.com#h3-subtitle be any different?

~~~
liveoneggs
also #fragments aren't sent to the server at all, unless this changes that (a
major major change if so)
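
To make the split concrete, here is a rough sketch of which part of a link each party gets to see. The function name and the example host are invented for illustration; the `URL` class is standard.

```typescript
// Split a link into the parts visible to each observer.
function splitByAudience(link: string): { dnsLookup: string; httpRequestTarget: string; clientOnly: string } {
  const url = new URL(link);
  return {
    dnsLookup: url.hostname,                       // all a DNS resolver learns
    httpRequestTarget: url.pathname + url.search,  // what goes to the server in the request line
    clientOnly: url.hash,                          // the fragment stays in the browser
  };
}

const parts = splitByAudience("https://health.example.com/portal?tab=history#:~:text=cancer");
console.log(parts.dnsLookup);         // health.example.com
console.log(parts.httpRequestTarget); // /portal?tab=history
console.log(parts.clientOnly);        // #:~:text=cancer
```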

~~~
falcolas
Does this matter when the server (well, multiple servers when you throw in ads
as well) are running their custom Javascript on the client, which has access
to not only the viewport coordinates, but the URL itself?

~~~
liveoneggs
does this add an additional vector to that case?

------
ghostpepper
I think a lot of the comments in this thread are trying to evaluate the
privacy concerns on merit, which makes sense, but IMHO it's also instructive
to look at the fact that other W3C members don't want this included and Google
is able to do it anyway. Perhaps that should be the bigger cause for alarm
than any one feature.

~~~
bad_user
In Google's defense, and basically because I like this particular feature, I
have to point out the obvious ... most new browser features happen _before_
and not after standardization.

The standardization process is meant for reaching agreement among browser
makers and to provide clear specs for those features. But it isn't meant for
exploration.

That has always been the case. Even Mozilla does it. And if not, then you
can't really get the browser vendors to agree on mostly anything a priori,
without market validation.

Standardization works as a refinement of already existing work. E.g. Google
came with NaCl/PNaCl which was basically their own ActiveX, Mozilla came with
asm.js in response and the end result was WebAssembly, which is objectively
better than both and that wouldn't have happened without that prior art.

The W3C doesn't guard what gets implemented in browsers. It never did. And if
browser makers willingly participate in the standardization process, we should
thank them, because they can always stop doing it. Because really, retreating
from the W3C would mean absolutely nothing for computer illiterate people.

~~~
joshuamorton
To elaborate, the process for standardizing browser features today is whatwg
managed, not w3c. And it's basically "write a proposal, solicit feedback,
implement the feature in one browser, and if it's ever implemented in 3
browsers it's part of the standard".

~~~
SahAssar
WHATWG is mostly HTML and DOM/browser APIs, ECMAScript is by TC39, CSS is
still in W3C, URI schemes (which this is about) are in W3C, as seen in this
draft being hosted by WICG, which as far as I can tell is a W3C group.

------
pkulak
Wow, what a dumb justification for the privacy concerns. URL hashes aren't
even sent to the server, let alone have anything to do with DNS. We already
put sensitive stuff in URL hashes, like OAuth tokens.

~~~
t0mas88
Read again. Their point isn't that the fragment is sent to the server or DNS.

Their point is that you can infer the existence of a certain text on a page
the user views (even via https) by sending them a link like this and observing
the browser loading resources that would be at the bottom of the page. The
only reason the browser would do that is if it scrolled to the bottom,
confirming the text you put in the link is on the page.

------
blakesterz
The article says the folks at Google have this short doc to address the
concerns:

Scroll-to-text Fragment Navigation - Security Issues

[https://docs.google.com/document/d/1YHcl1-vE_ZnZ0kL2almeikAj...](https://docs.google.com/document/d/1YHcl1-vE_ZnZ0kL2almeikAj2gkwCq8_5xwIae7PVik/edit#heading=h.uoiwg23pt0tx)

~~~
sincerely
I don't have access to this document, is there a mirror?

~~~
def_true_false
[https://pastebin.com/raw/M1yQMK3h](https://pastebin.com/raw/M1yQMK3h)

------
phillipseamore
What? A DNS lookup doesn't include anything but the hostname and anything
following a hash is never sent with a request from the browser.

~~~
cmg
I think the idea there is that there would be lazy-loaded content from another
site that would only load when someone scrolled far enough to see the
highlighted word, which will automatically happen with this feature.

The title of the Forbes article is too hyperbolic for my tastes and while this
could be a security concern in very specific situations it's being overblown.

~~~
phillipseamore
I think that would need to be a designed attack, not something that would
apply to 99% of websites. The only legitimate resource (that could be used as
a canary) being loaded far down a page would be an image (and that kind of
requires it to be lazy loading as well).

~~~
merrywhether
Latest Chrome defers out-of-viewport images on its own, or at least tries to.

------
bad_user
I avoid Chrome due to privacy concerns, but I actually like this feature
¯\\_(ツ)_/¯

Many times I'm looking for an existing anchor for linking to a certain section
of the page, because the author did not bother to create a table of contents
and many times that anchor ID is missing.

Also — I was under the impression that anything that comes after # is not sent
to the server, being a fragment meant to be processed entirely by the client.

What am I missing?

~~~
chias
Let me give you an example. It will be contrived for simplicity but there are
circumstances that will seem more real-worldish. Let's consider the following
scenario:

\- you visit a page at example.com that contains private data about you (e.g.
bank account number, medical conditions, what have you)

\- this page loads external resources only when they are scrolled into view

\- it also hotlinks some image below the fold that a potentially bad actor can
see requests to (maybe the image is on their webserver, maybe you're on a corp
network and they can see dns resolutions, whatever)

So now I link you to:
[https://example.com/privatepage#:~:text=Account%201](https://example.com/privatepage#:~:text=Account%201)

Did the image get requested? If yes, I know your account number starts with 1;
if not, I know it starts with something other than 1. Rinse and repeat.
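
For illustration, the probe links in this scenario could be generated along these lines. This is only a sketch: the helper names are invented, and it covers just the simple `#:~:text=` form. As far as I know the draft asks for `-`, `&` and `,` to be percent-encoded inside the text, which is why `-` is handled on top of `encodeURIComponent`.

```typescript
// Percent-encode text for use after "#:~:text=".
// encodeURIComponent already escapes "&" and ","; "-" is escaped by hand.
function encodeTextDirective(text: string): string {
  return encodeURIComponent(text).replace(/-/g, "%2D");
}

// Attach a text directive to a page URL, replacing any existing fragment.
function textFragmentUrl(pageUrl: string, text: string): string {
  const hashAt = pageUrl.indexOf("#");
  const base = hashAt === -1 ? pageUrl : pageUrl.slice(0, hashAt);
  return `${base}#:~:text=${encodeTextDirective(text)}`;
}

// One probe link per guess, e.g. one for each possible leading digit.
function probeUrls(pageUrl: string, prefix: string, candidates: string[]): string[] {
  return candidates.map((c) => textFragmentUrl(pageUrl, prefix + c));
}

console.log(textFragmentUrl("https://example.com/privatepage", "Account 1"));
// https://example.com/privatepage#:~:text=Account%201
```

Each probe still needs the target to click a separate link, which is where the scenario gets impractical.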

~~~
tpmx
> Rinse and repeat.

How exactly would you as an attacker perform this rinse and repeat action?

~~~
chadlavi
and why would a bank site hotlink to an asset owned by someone else?

All the security concerns I've seen for this seem quite contrived to me. And I
say that as someone who assiduously avoids google products*

(*alas, except at work, because I can't really choose that)

~~~
tpmx
I think this is the perfect storm.

1) Privacy-conscious browsers are trying to get exposure, so they are
stretching an extremely narrow privacy risk into something extreme.

2) (I also believe) media companies are worried this will rob them of ad
exposures, so they are incentivized to cover this as something scary.

3) The "bad guy" is Google. This means the amplitude of the story is
immediately 10X larger.

------
rubatuga
Seems like a useful feature to me. You could just train yourself not to click
on hyperlinks with the aforementioned tag. And like another comment says, it
shouldn't be a privacy issue unless you are running over HTTP, since the
resource URLs are encrypted.

~~~
prophesi
I agree that it's a useful feature and not a privacy issue, but I will say
that it's nigh-impossible to avoid clicking links like these. I already try my
best to strip out tracking parameters in query strings, and avoid shortened
URLs. But they're simply everywhere, and sites/email try their best to hide
them.

------
AlexandrB
IE6 is back, baby! All that’s left is deep integration with some proprietary
Google “standard” (I’m thinking AMP) and it’ll be the early 2000s all over
again.

~~~
DoctorOW
It'll be like the early 2000's if switching to another browser was most likely
just IE6 reskinned.

~~~
abbracadabbra
Microsoft Edge, which is built on Chromium, would fit this parallel nicely.

~~~
_verandaguy
And Brave, and Vivaldi

~~~
seabrookmx
And Opera.

------
bastawhiz
The example that the security researcher gave seems moot: the same thing would
happen if the employee simply scrolled down on the page manually, no? And we
already have the ability to link to anchors on a page, and that's not
considered to be a privacy issue. Can someone explain how this is actually a
meaningful privacy issue?

~~~
merrywhether
Say you have a long page that lists “Pre-existing conditions” at the bottom,
and near that section is also a unique image or other external asset. If you
click on the link and cancer is in your list, the page will scroll and load
the related assets instantly. Without cancer in your list, you’d only load
those assets through human scroll, which would most likely look different
timing-wise. Thus you can determine with high probability whether your target
has cancer listed (if you have access to DNS records, as mentioned in the
example, and the target is using a browser that delays offscreen asset loading
- like this same latest Chrome).

Whereas anchors tend to be generic (#preexisting-conditions), this new scroll
behavior can be used to create an existence check for any user-specific text
on the page (in carefully crafted scenarios). There are probably other
variations that could be devised on this concept, since it allows indirect
page interaction that can bypass authorization walls (since the browser would
transmit cookies normally and such).
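
As a rough sketch of the timing check described above: the log shape, host names, and the one-second threshold are all assumptions picked for illustration, not measurements of anything.

```typescript
// One observed DNS lookup: which host was resolved, and when (in ms).
interface Lookup {
  host: string;
  atMs: number;
}

// Delay between the first lookup of the portal and the next lookup of the
// host serving the below-the-fold asset; null if either was never seen.
function delayToCanary(log: Lookup[], portalHost: string, canaryHost: string): number | null {
  const portal = log.find((e) => e.host === portalHost);
  if (!portal) return null;
  const canary = log.find((e) => e.host === canaryHost && e.atMs >= portal.atMs);
  return canary ? canary.atMs - portal.atMs : null;
}

// Guess: a very short delay suggests the browser jumped straight to the text,
// a long one suggests a human scrolled there. The threshold is arbitrary.
function looksAutoScrolled(delayMs: number | null, thresholdMs: number = 1000): boolean {
  return delayMs !== null && delayMs <= thresholdMs;
}
```

Caching, prefetching, or a fast reader would all throw this off, so at best it gives a probability, not a certainty.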

~~~
prophesi
This is absurdly difficult to pull off with very little payoff. You'd need to
be sniffing the traffic of a network. Then craft a URL that contains a unique
image near your text fragment query. Then somehow send that URL to the victims
on your network. _Then_ check how long it takes for them to load that image
upon clicking the URL.

I'd like to call myself a privacy advocate, but this is just absurd. The pros
obviously outweigh the con of a very precise and targeted attack that leaks a
predetermined bit of information.

If you've got their DNS records, you've already violated plenty of their
privacy to get the information you want. No need for this text fragment
"attack."

~~~
notatoad
>I'd like to call myself a privacy advocate, but this is just absurd.

Yeah, my read of this is that it has nothing to do with privacy, people who
want to block change for some reason have just learned that "i have privacy
concerns about google" is a catchword that will get you some press coverage,
and are essentially hijacking the actually valid and important privacy
concerns to push forward their unrelated opinions (and promote their browser
product).

~~~
chadlavi
This is my take on this, too.

As a FF user, I'd love to see FF implement this too!

~~~
bzbarsky
FF is considering implementing something in this space, yes. Note that there
are at least two proposals for how this could work: one that is already
deployed via a polyfill on various sites and the Google one. They have various
functionality tradeoffs, and unfortunately Google decided to make up a wholly
new thing instead of improving the existing thing...

------
canacrypto
I'm not convinced this is a problem. In terms of ways we leak privacy on the
web, this seems very low on the priority list.

~~~
kevmoo1
Seems like a great excuse to create click-bait headlines – not much else.

------
tyingq
Confused. As I understand it, anything after the # in a uri isn't sent over
the wire.

So the only way someone could see that you're navigating to a specific
fragment is some sort of deep chrome logging, or chrome plugin, etc. And if
that's the case, cat's already out of the bag for everything you do already.

~~~
hesselink
I was confused as well. I think (from some other comments) that the issue is
that you can scroll to text that is specific to you and others might not have
on their page (e.g. 'cancer' on some medical page) and then somehow gather
from the requests that you scrolled there. It seems pretty hypothetical, but I
can see the issue with forcing a scroll depending on what text is on the page,
I guess.

------
pornel
There are plenty of reasons one would want to switch away from Chrome, but
potential abuse of ScrollToTextFragment must be the most contrived one.

------
overcast
Just like everything else. Millions of users won't understand what this is
talking about, or care for that matter.

------
thrower123
Hmm. The use case of linking to a specific location in a document, really
appears to be more of an issue with websites not actually using ids that could
be anchored to. If every paragraph had an id, you could get 95% of the way to
the desired functionality by just making it easy to copy a link to e.g.
www.example.com/foo#paragraph4.

At this point, Wikipedia is one of the few websites I use regularly that
actually works in this way.

------
superkuh
No doubt because most large web corporations have completely switched to
javascript for their anchors (ie, #) and no longer use HTML spec anchors that
actually work. I'm looking at you Microsoft Github.

~~~
gnomewascool
> I'm looking at you Microsoft Github.

Where don't anchors work when javascript is disabled? (With admittedly very
brief testing just now, I couldn't find any such cases.)

~~~
superkuh
On every single repo index page that has "anchors" that I've tried over the
last year. The markdown is now interpreted differently so anchors are
class="anchor" and not real anchors. Maybe you didn't fully disable JS? Make
sure JS is disabled before you load the page and you've cleared your cache
(ctrl-f5 in FF-alikes).

I just went to the most recent github tab in my browser session and found one
instantly:
[https://github.com/quiet/quiet#dependencies](https://github.com/quiet/quiet#dependencies)

~~~
kemayo
Can't say I've noticed it, but then I'm going around with Javascript enabled,
so it's not like I would.

That said, the github one is a sort of interesting case, insofar as they're
doing it to avoid user-generated content clashing with their page-chrome
ids... while still preserving a readable URI that matches what the content
creator expects. Which doesn't seem like an unreasonable case, to me.

You'll note that there _is_ a real anchor on that link --
[https://github.com/quiet/quiet#user-content-dependencies](https://github.com/quiet/quiet#user-content-dependencies) \--
which works without Javascript. It's just rather inaccessible.

------
BiteCode_dev
Can somebody explain to me how this is more of a privacy concern than the rest
of the url, often containing the title slug, date, id, tags and/or filter
parameter of the page?

------
janpot
Is it possible to agree with the assessment of severity of this privacy issue
without being regarded as supportive of the way Google uses its monopoly to
force certain webstandards on the web?

------
homero
I don't get it. This is the least privacy reducing feature there is. It's a
great feature.

Wouldn't it be nice to have people worry about this while hundreds of other
actual privacy reducing features go unnoticed?

------
noahmotion
Was it wrong of me to be amused when I got to the end of this post on internet
privacy concerns and Google, only to see a link at the bottom encouraging me
to click in order to follow the author on Facebook?

~~~
_bxg1
And I couldn't read the article because of my adblocker

------
keeganjw
Could someone explain this a bit better? I've read two articles on this this
morning and I still don't understand what the privacy concerns are with this
feature. Thanks!

~~~
vinaypai
The feature in question is the ability to use a fragment (# in a URL) to link
to matching text rather than just an ID. Here is the key line with the
so-called privacy concern.

"Consider a situation where I can view DNS traffic (e.g. company network), and
I send a link to the company health portal, with [the anchor] #:~:text=cancer.
On certain page layouts, I might be able [to] tell if the employee has cancer
by looking for lower-on-the-page resources being requested.”

So they could send someone a link to a page with a fragment, trick them into
clicking it, and, if the text matches, watch for DNS requests from lazy-loaded
resources to learn that they clicked it.

It's convoluted nonsense.

------
Avi-D-coder
Last week I switched to Firefox nightly on desktop and Firefox preview on
mobile.

It took some configuration, but I'm not regretting it at all.

------
hallihax
For those concerned, I believe you can just disable this behaviour via this
flag: chrome://flags/#enable-text-fragment-anchor

------
garganzol
They should have used XPointer instead of reinventing the wheel. It has a much
more effective syntax.

------
halayli
I don't think you can arrive at any conclusion based on analyzing traffic/DNS
requests alone.

How can you tell a user didn't jitter scroll and cause more content to show
up?

What if it was a 404 page?

How's the intruder able to see the query param anyway? If it's non secure all
bets are off.

The same argument used revolving around text can be made about words in a
domain name.

------
hartator
What was wrong with just `example.com/#section` instead of
`ScrollToTextFragment`?

That seems to complicate things for no good reason.

------
ryanolsonx
If I'm understanding his correctly, google actually changes the HTML to have
these anchor tags?

~~~
DanHulton
No, it's a selector, but one that can include the actual text in the page. So
instead of having to select based on a pre-existing anchor, like
`http://site.com/page#anchor`, you can link to
`http://site.com/page#understanding+his+correctly`
(though that's not their format). No need to alter the markup.

Though, I'm curious how this works when text is broken apart over spans or
divs.
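
One plausible way to handle that, sketched below, is to match against the page's flattened text rather than its markup. This is an assumption for illustration, not a description of Chrome's matcher: the helper names are invented, the tag stripping is a naive regex, and only the simple single-string `#:~:text=` form is handled.

```typescript
// Pull the text out of a "#:~:text=..." fragment; null if there is none.
function parseTextDirective(link: string): string | null {
  const hash = new URL(link).hash;
  const marker = ":~:text=";
  const start = hash.indexOf(marker);
  if (start === -1) return null;
  // Several directives may be joined with "&"; keep only the first.
  const rest = hash.slice(start + marker.length);
  const end = rest.indexOf("&");
  const raw = end === -1 ? rest : rest.slice(0, end);
  return decodeURIComponent(raw);
}

// Naive flattening: drop tags, collapse whitespace. Fine for a toy example,
// not a real HTML parser.
function visibleText(html: string): string {
  return html.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
}

// Would the directive find its text, even when it is split across elements?
function wouldMatch(link: string, html: string): boolean {
  const needle = parseTextDirective(link);
  return needle !== null && visibleText(html).includes(needle);
}

const html = "<p>If I'm <span>understanding</span> his <b>correctly</b></p>";
console.log(wouldMatch("http://site.com/page#:~:text=understanding%20his%20correctly", html)); // true
```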

------
slim
All this fuss should have been about lazy loading of resources. The leak is
there

------
karol
Why is this feature so interesting it got implementation priority? Does this
enable some commercial activity by Google?

------
JaceLightning
Uhhhhh, you do realize the anchor (#...) is not sent to the server, right?

~~~
l33tman
The attack depends on measuring the timing of resource-loading that in turn
depends on the free-text search the attacker can get the victim browser to do
on his behalf.

------
MentallyRetired
As a developer, how do I lend my support to stopping google from trying to
steal the web?

Something actionable, something specific to my semi-unique position as a
developer?

~~~
jfk13
As a first step, at least: Use and contribute to Firefox (or other non-Blink-
based browsers, but there aren't many to choose from...), and encourage those
you support/influence to do the same.

------
minikites
Imagine telling someone in 2003 or so that the most popular web browser
fifteen years later would be made by DoubleClick and imagine what their
reaction might be.

~~~
speedgoose
Something like "wow, software development will improve so much! You will make
a web browser in a double-click."

~~~
stickykeys
The year is 20XX. You return to your slave cube to find what you assume is a
PC running "Plan 9 from Microsoft". You are shocked at its convenient
features:

The world's information is merely one click away.

The world's browsers are simply two clicks away.

Dispensing candy is an easy ctrl-triple-click.

Bootstrapping a compiler is just the konami code backwards.

Printing the screen is as simple as hitting PrintScr and loading your
printer with $300 cyan ink cartridges.

However turning the computer off is a week-long process that involves arguing
with the built-in HAL9000

------
tomaszs
Will never use it. Read what Google did with JS websites and its
recommendation to implement hashbang and escaped fragment to provide a
prerendered version.

You won't. Because it didn't work and killed SEO. Then after years they
silently removed the documentation about it, without providing any way to
transition to other approaches without damaging SEO even more. Moreover, it
was impossible to do anything about it, because the server does not receive
the data after the hash.

And now Google tries to force a new standard that is technically broken. Don't
use it. It is a trap. It will hurt your SEO sooner or later!

------
blakesterz
EDITED an hour later: A couple of people have pointed out it's back, they've
made some changes and marked it as public now.

There was another article on this same topic this morning with this:

"Google's engineers have not ignored worries about the security and privacy
risks. To their credit, they've gathered them together into a single document
and they've clearly been engaged in understanding what people are worried
about. It's just that they've concluded the concerns aren't that big a deal or
can be dealt with to their satisfaction."

That Single Document is here, and I went and had a read this morning, but now
I get permission denied, I guess it was getting too much attention?

[https://docs.google.com/document/d/1YHcl1-vE_ZnZ0kL2almeikAj...](https://docs.google.com/document/d/1YHcl1-vE_ZnZ0kL2almeikAj2gkwCq8_5xwIae7PVik/edit#heading=h.uoiwg23pt0tx)

I wish I would've saved a copy now. It had some decent details on the various
vulnerabilities and how they are handling (or ignoring) them.

~~~
drusepth
It's available and listed as (PUBLIC) in the title now.

