
Crooked Style Sheeding – Webpage tracking using only CSS - ProfDreamer
https://github.com/jbtronics/CrookedStyleSheets
======
kodablah
If you're concerned as a user of a malicious site:

* Link click tracking - So what, the site could route you through a server-side proxy anyway

* Hover tracking - Can track movements of course, but doesn't really help fingerprinting. This is still annoying, though, and not an easy fix

* Media query - So what, the user agent mostly gives this away anyway

* Font checking - Can help fingerprinting... browsers need to start restricting this list better IMO (not familiar w/ current tech, but would hope we could get it down to OS-specific at the most)
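
As a sketch of how the font-checking vector works (in the spirit of the linked project; the `tracker.example` endpoint and parameter names here are hypothetical), a server could emit one probe rule per font:

```python
def font_probe(font_name, endpoint="https://tracker.example/font"):
    """Build a @font-face rule whose local() source is tried first,
    so the tracking url() is only fetched when the font is absent:
    each request tells the server one font the visitor lacks."""
    token = font_name.lower().replace(" ", "-")
    return (
        f'@font-face {{ font-family: "probe-{token}"; '
        f'src: local("{font_name}"), url("{endpoint}?missing={token}"); }}'
    )
```

Pages would then reference the `probe-*` families on some invisible text; which requests arrive at the server sketches the installed-font fingerprint.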

If you're concerned as a site owner that allows third party CSS:

* You should have stopped allowing this a long time ago (good on you, Reddit [0] though things like this weren't one of the stated reasons)

* You have your Content-Security-Policy header set anyways, right?

Really though, is there an extension that has a checkbox that says "no
interactive CSS URLs"? I might make one, though I'm still figuring out how I
might detect/squash such a thing. EDIT: I figure just blocking url() for
content and @font-face.src would be a good compromise, so as not to break all
sorts of background images for now.

0 -
[https://www.reddit.com/r/modnews/comments/66q4is/the_web_red...](https://www.reddit.com/r/modnews/comments/66q4is/the_web_redesign_css_and_mod_tools/)

~~~
oneeyedpigeon
> not an easy fix

 _All_ of this is an easy fix: disable css. In the same way that "I don't want
to be tracked by javascript" can easily be resolved by disabling javascript.
I'm not seriously suggesting everyone does that, but anyone who is so paranoid
that they don't want a site knowing that they're reading its content might
want to consider it.

~~~
keypress
Happily used to (5 years ago) surf the web with no JS and no CSS, or rather
applying my own style-sheet for 90% of my web viewing. I'd fall back to Chrome
when absolutely necessary. It was fast and comfortable, it just relies on well
structured accessible content.

~~~
dahart
> it just relies on well structured accessible content.

Honest question: how much of this is left? What popular sites are still
accessible this way? HN might be the only site I visit frequently where
browsing with no js/css has any hope of working.

~~~
ekianjo
Check out surfraw (by Assange); you can actually access a surprising amount of
resources using sr and lynx from a terminal. Text only, but it still makes the
internet pretty useful.

~~~
MamboJ
Are you aware of any piece, writeup, review, or anything on Assange’s
programming skills?

------
Angostura
Who's going to be first to make the 'I always browse the Web with CSS
disabled' post?

~~~
yorwba
Using lynx, the only thing that makes reading Hacker News somewhat
inconvenient is the lack of indentation to show the nesting hierarchy, but
otherwise it works quite well.

Some other sites are so messed up that it's actually more comfortable to read
them in a text-only browser that completely ignores CSS and replaces images
with their alt text.

Of course I frequently _do_ want to look at images, so my main browser remains
Firefox, but it's still useful to remember that other browsers with different
tradeoffs exist and can be used.

Sometimes, you really just want to read some text and don't need any of that
fancy other stuff.

~~~
ams6110
You can see the indentation if you use w3m. HN uses tables to structure the
comment hierarchy, and the w3m browser does a pretty great job rendering
tables.

~~~
kuschku
It'd be much nicer if HN used nested lists (without icon) for comment
structuring. That'd also work fine in many more textmode browsers.

------
chatmasta
I don't see what's problematic about this. The tracking is not really done in
CSS, so much as on the server. You could accomplish the same thing with 1x1
images, or loading any remote resource. Effectively the only difference is
you're loading the URL conditionally via CSS, as opposed to within a
`<script>` or `<img>` tag. Furthermore, this can be blocked in the same way as
any tracking URL.

I concede this is a novel way of fingerprinting the browser from within the
client, without using JS. However, I think a better way to describe this would
be " _initiating_ tracking on the frontend without the use of javascript."

~~~
Osmose
The difference is that CSS can trigger remote resource loads in response to
post-pageload user behavior, which intuitively seems like a JS-only thing. For
example, tracking where the mouse has moved, as mentioned in the readme.

I wouldn't say it's some sudden, alarming capability, but it is distinctly
more capable than <img> tags.
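
A minimal sketch of such a post-pageload trigger, with a hypothetical tracker.example endpoint standing in for the server-side logging script:

```python
def hover_rule(element_id, endpoint="https://tracker.example/hit"):
    """Build a rule whose ::after pseudo-element only tries to load
    its content url() once the element is hovered, so the server log
    records the hover without any JavaScript running."""
    return (
        f"#{element_id}:hover::after "
        f'{{ content: url("{endpoint}?hover={element_id}"); }}'
    )
```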

------
antibland
About 8 years ago, a colleague and I interviewed a nervous kid fresh from
undergrad. He was applying for a junior front-end position at our fast-growing
startup. Dressed in a shiny, double-breasted suit and wingtip shoes, he
followed us into a tiny office (space was so limited) where we conducted
interviews.

"Tell us about your CSS experience," we asked him.

"Ah, yes. I, well, haha, of course. The CSS is where you make your calls, to
the database, ah, server, ah, of course."

Unsurprisingly, we did not hire the applicant, though his answer to our
question lived on in infamy for many years. But all that changed, today,
reading this. The joke was on us. That kid was clearly from a future of which
we had no awareness. Starting today, I'll always trust programmer applicants
donning double-breasted suits.

------
shove
Reminds me of similar techniques that could be used several years ago to sniff
browser history via a collection of a:visited rules.

~~~
avian
> However using my method, its only possible to track, when a user visits a
> link the first time

This suggests that browser history sniffing is still possible - as long as you
make the user click the link (in contrast to the old a:visited method, where
this could be done with no user interaction).

------
andrewmcwatters
Any part of a browser that can make a request can be used to do this sort of
thing. Any part of a browser that can alter the view and its related DOM
attributes can cause a user to interact with it and give up data
involuntarily.

Turn off JavaScript, and CSS media queries can still cause resources to load
based on a number of parameters. Have canvas enabled and you can be
fingerprinted. Use one browser over another and get feature-detected. Anchor
states give away browsing history. Hell, even your IP address sacrifices
privacy, and that's before the page gets rendered.

So with that being said, if you're browsing the web, you're giving up
information.

------
globuous
Very smart! This is a few lines of code away from a CSS-class-based mini
tracking framework...

Aside from the obvious, this could also be used as fallback (restricted) A/B
testing for no-JS users? I'm thinking data about just what was hovered and
clicked, plus media queries, allows for some basic UI testing of responsive
websites.
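
A sketch of the media-query half of that idea (the breakpoints and endpoint below are illustrative):

```python
def viewport_probe(min_width, endpoint="https://tracker.example/vw"):
    """The url() inside a media query is only fetched while the query
    matches, so each request reveals the visitor's viewport class."""
    return (
        f"@media (min-width: {min_width}px) {{ "
        f'#probe-{min_width} {{ background-image: url("{endpoint}?min={min_width}"); }} }}'
    )

# one rule per breakpoint; distinct elements avoid property overrides
sheet = "\n".join(viewport_probe(w) for w in (480, 768, 1200))
```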

------
secdewd11
This doesn’t mention my personal favorite CSS tracking trick: timing attacks
that can be used to detect what sites you have loaded. This can be done by
interweaving requests to a remote URL (say, a background-image) with requests
to your server script, which times the differences.

~~~
ubernostrum
The fanciest tracking trick is the HSTS supercookie.

You use a bunch of subdomains -- a.example.com, b.example.com, etc. -- each
configured so that a particular URL (call it the 'set' URL) sends an HSTS
header. A different URL (the 'get' URL) doesn't.

You generate an ID for the user, and encode it as a bit pattern using the
subdomains to indicate positions of '1' digits. Say your ID is 101001 -- you
serve a page which includes images loaded from the 'set' URLs for subdomains
a, c, and f. On later page loads, you serve a page including images loaded
from the 'get' URLs of every subdomain, and you pay attention to which ones
are requested via HTTPS. Since the 'set' URL sent an HSTS header, subdomains
a, c, and f get requested over HTTPS, and now you reconstruct the ID from
that: 101001.
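
The scheme can be sketched in a few lines; the six subdomains and bit layout below are illustrative assumptions:

```python
SUBDOMAINS = ["a", "b", "c", "d", "e", "f"]  # a.example.com .. f.example.com

def set_subdomains(user_id):
    """First visit: embed 'set'-URL images only for subdomains whose
    position in the ID holds a '1', pinning HSTS on those hosts."""
    return [s for s, bit in zip(SUBDOMAINS, user_id) if bit == "1"]

def reconstruct_id(https_requested):
    """Later visits: embed 'get'-URL images from every subdomain over
    plain HTTP; the ones upgraded to HTTPS were pinned earlier."""
    return "".join("1" if s in https_requested else "0" for s in SUBDOMAINS)
```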

~~~
j_s
> _The fanciest tracking trick is_

I feel like this changes all the time; I was recently surprised to discover
'TLS Client Channel ID' (my nomenclature is a bit fuzzy - an RFC for automatic
client certs "for security") and would love to learn more about the extent of
its current implementation in Chrome.

[https://news.ycombinator.com/item?id=15753648](https://news.ycombinator.com/item?id=15753648)

>londons_explore: _In Chrome, it also uses the TLS Client Channel ID, which is
a persistent unique identifier established between a browser and a server
which (on capable platforms) is derived from a key stored in a hardware
security module, making it hard to steal. Ie. if you clone the hard drive of a
computer, when you use the clone, Google will know you are a suspicious
person, even though you have all the right cookies._

[https://en.wikipedia.org/wiki/Transport_Layer_Security_Chann...](https://en.wikipedia.org/wiki/Transport_Layer_Security_Channel_ID)

[http://www.browserauth.net/channel-bound-cookies](http://www.browserauth.net/channel-bound-cookies)

------
SimeVidas
But how many users disable JavaScript in their browser to prevent tracking?
And is the fact that a website can track all your clicks and mouse movements a
privacy/security issue to begin with? Isn’t it by design that the website
you’re visiting can track you?

~~~
oneeyedpigeon
> by design

By design, the web is a "1. send me the document <-> 2. here it is"
transaction, not a series of many small notifications. By design, the url()
property almost certainly wasn't intended to be dynamic. This is clearly
'bending the established rules' — cleverly, admittedly.

~~~
dahart
> By design, the web is a "1. send me the document <-> 2. here it is"
> transaction, not a series of many small notifications.

That was true in 1998, but most of the web has been turning, by design, into
what you might call a series of many small notifications ever since. Gmail's
been doing it since 2004. Today, almost all large sites are running Google
Analytics or something like it, which tracks everything this article discusses
and operates on constant micro-transactions. All web apps are built on many
small notifications, and many of them even use WebSockets, which were
explicitly, by design, built for streams of micro-transactions.

> By design, the url() property almost certainly wasn't intended to be
> dynamic. This is clearly 'bending the established rules'

There was never a rule against it, even if dynamic usage wasn't expected or
imagined (which I find unlikely). CSS allows it, therefore it's allowed by
design.

------
EldonMcGuinness
Call me naive, but as a dev I don't see why this would be any better than
using JS. The group of people that block JS is likely to do the same for this,
and, as mentioned by others, common sources of such mucking are blocked by a
good ad blocker.

Then there is the whole _"how could it be integrated into an existing site
with minimal fuss"_ issue. With JS you can specify targets and the like for
actions and observations; the only comparable thing would be to offer _sass_ /
_less_ integration so that it works with clients that disable or block JS,
which is arguably much more difficult.

While it is definitely clever, I just don't see a practical use for it. It
would really only benefit those willing to put the work into using it and only
work so long as their logging URL is available and not blocked. I just don't
see the real value.

~~~
r3bl
Does it need to be practical? It's just a proof of concept, as stated in the
very first sentence of the README file.

This seems to me more like "wow, something cool has been done in an unusual
way" material, rather than "this is something you should consider using".

------
mistersquid
At least in Safari 11.0.2 (macOS 10.12.6), link tracking does not work. The
selector

    #link2:active::after

appears to always match in Safari 11.0.2 (":active" is being disregarded).

I clicked on none of those links and have never visited google.de, but the
results.php page told me all 3 links had been clicked.

EDIT: formatting, remove word.

~~~
akvadrako
You should let somebody know:
[http://bugreport.apple.com](http://bugreport.apple.com)

The demo site correctly tracks me in Safari 11.0.1 (macOS 10.13.1)

------
Raphmedia
Alright lads, let's all go back to RSS feeds and scrap that whole "browser"
experiment.

~~~
tqkxzugoaupvwqr
I seriously think we need an alternative to HTML that axes styling and
scripting and concentrates solely on the markup / content description.
Websites would use a certain set of elements/descriptors to describe the
content they contain. The user’s website reader would parse the markup /
content description and display a page how it thinks it should be displayed
(according to the user’s preferences). All websites would have the same
styling – the one chosen by the user. This HTML alternative could provide an
API that makes it possible to have dynamic websites but still prevents
scripting and fingerprinting.

~~~
j_s
Against an Increasingly User-Hostile Web |
[https://news.ycombinator.com/item?id=15611122](https://news.ycombinator.com/item?id=15611122)
_(2017Nov:1307 points, 502 comments)_

#oneofus
[https://hn.algolia.com/?query=13226170&type=comment](https://hn.algolia.com/?query=13226170&type=comment)
_(click a 'comments' link on the search results, then 'parent')_

I've connected similar sentiment here for about a year now; I've appreciated
mention of several helpful tools in this thread.

------
asadlionpk
Interesting trick. But I think ad blockers block requests to entire tracking
domains, so even CSS calls would be blocked?

~~~
AntonyGarand
Not if you're hosting your own tracking information, on your own domain

~~~
beckler
honestly, I don't care if you use your own tracking solution on your own
domain, as long as it's not passing the data to a third-party.

I get that some of that data is genuinely useful in determining what parts of
an app are popular and what is not. Even though I don't like being tracked for
dumb shit like ads, it does have valid uses.

------
petercooper
I'm surprised browsers wouldn't prefetch stuff like this given it's an easy
performance win. Which would then also make these stats useless.

~~~
fixermark
Prefetching state that's never needed is a bandwidth waste.

~~~
saagarjha
For many users (e.g. desktop), bandwidth is something you're throwing away if
you're not using it. I'm not saying this is a good idea in all cases, but it
might be in some.

------
Trufa
Very interesting; it's always intriguing to see how much of a cat-and-mouse
game this privacy stuff is. I keep thinking this needs an overhaul and a
slightly different approach altogether; sadly, I can't produce any viable
solutions.

With huge and complex issues like this, I don't think we have to find one
solution but rather point in the right direction, and I'm not even sure we're
doing that.

~~~
tqkxzugoaupvwqr
In my opinion the new approach should be: Let websites deliver content and let
the user’s website reader interpret the markup / content description and style
the page according to the user’s preferences. Websites shouldn’t be able to
style and script themselves any longer.

The website should load more content when the user scrolled to the bottom? Let
the website reader retrieve the content itself. The website wants to know the
dimensions of the viewport to load the appropriately sized image or change the
layout? Tough luck, this is none of the website’s business! Let the user’s
website reader handle this.

------
expertentipp
This is where the talent, funds, and resources will go as ads and marketing
are industries with lots of funding available. Even more tracking and of even
more pervasive kind. We hate tracking while we bet our time and money on it.
The web is cancelled, go back home everyone.

------
rhn_mk1
Is there a way to turn off CSS media queries in Firefox, or fake their
conditions? Apart from the security issues, it's plain annoying when the page
layout will change completely because a few pixels of window size are missing
for the perfect experience.

------
en-us
Well this is depressing.

------
john-aj
This could easily be stopped by a change in browser behavior. If web browsers
contacted every address specified with `url()` automatically on page load,
without evaluating the conditions, this type of conditional request would be
impossible.

Conceivably, you could solve it through a simple browser extension that looks
through all of the page’s stylesheets and calls all URLs present in the CSS
before the page is rendered.
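
A rough sketch of the extraction step such an extension would need (a naive regex; real CSS parsing is messier):

```python
import re

# naive matcher for url(...) tokens in a stylesheet
URL_RE = re.compile(r'url\(\s*["\']?([^"\')\s]+)["\']?\s*\)')

def prefetch_targets(css_text):
    """Collect every url() so all of them can be fetched up front;
    once every URL is hit unconditionally, a conditionally triggered
    request no longer carries any signal."""
    return [u for u in URL_RE.findall(css_text) if not u.startswith("data:")]
```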

In an ideal implementation, though, URLs dependent on “static”, non-
identifiable conditions, such as an image with `display: none`, would be left
alone.

~~~
fixermark
That's likely to have unfortunate performance implications, particularly on
mobile or low-bandwidth connections.

------
nukeop
The obvious solution is to block the server-side pages that the CSS elements
link to. This kind of tracking can be mitigated the same way any other kind of
tracking is already handled by uBlock or uMatrix.

~~~
kalcode
uBlock can't block manual tracking... just third-party scripts that do it.

Example: You visit example-site.com

example-site.com is the PHP server that sends you the HTML. It is also the
site that does the tracking. So when you click something, it sends that data
to example-site.com, which can then forward the data to a third-party tracking
service.

If you blocked the server-side pages, or put them in your hosts file, the site
example-site.com would be completely blocked too.

Ultimately, if everyone uses ad blockers to block tracking scripts, the
tracking can be moved to the back end. If you block the back end, you
effectively block the website you are accessing in the first place.

~~~
nukeop
If it's all moved to the backend, then we win, because then we can easily
control what data is being collected.

------
Quagga
I think it is time to split the web into:

* user- and machine-readable content (text with hyperlinks, pictures, audio, video, the rest)

* a universal app store (javascript, css, intents, permissions...)

Every user could consume or style content as he wishes. If my IDE has a dark
theme, I want all web pages to have a dark theme. Why do I need javascript to
read news or browse pictures?

If a user wants to install an app from the app store, he should accept the
software license and give permissions to that application.

------
talmand
This is an interesting concept, but I'm not seeing anything that couldn't
already be done with a properly set up website and server logging.

Things like "@supports (-webkit-appearance:none)" don't give you Chrome
detection. They give you WebKit detection, which is a rather large subset of
the whole. Plus, some of the other browsers have started supporting webkit
prefixes.
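
A sketch of what such engine probing looks like when generated server-side; the condition list and endpoint are illustrative, and a request only tags an engine family, not a specific browser:

```python
ENGINE_PROBES = {
    "webkit": "(-webkit-appearance: none)",
    "gecko": "(-moz-appearance: none)",
}

def supports_probe(name, condition, endpoint="https://tracker.example/engine"):
    """Each @supports block only fetches its url() when the engine
    recognizes the prefixed property, tagging the engine family."""
    return (
        f"@supports {condition} {{ "
        f'#probe {{ background-image: url("{endpoint}?engine={name}"); }} }}'
    )

sheet = "\n".join(supports_probe(n, c) for n, c in ENGINE_PROBES.items())
```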

~~~
claar
> doesn't give you chrome detection

Checking every possible prefix should distinguish most versions of most
rendering engines -- still not bad.

~~~
talmand
It would only, at best, give you outdated browser versions, unless you are
going to create a huge set of rules checking certain properties against other
properties. Plus, it doesn't tell you which browser, only maybe which WebKit
engine version, which tells you next to nothing.

------
Someone
_”Interesting is, that this resource is only loaded when it is needed (for
example when a link is clicked).”_

The resource is retrieved using GET, so I wouldn't think that lazy loading is
required by the HTTP standard. If so, browsers can mitigate this kind of
attack by pre-fetching these resources (even pre-fetching a fraction at random
might already be enough).

It is a neat hack, though.

------
Tepix
Do browsers really need to allow fetching URLs in the "after" event of a link?

~~~
flooq
It's not an "after" event; the "after" is for inserting a pseudo-element. That
pseudo-element then sits inside the original link element and hits the
tracking URL by trying to load a resource from it when the link is active.
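
Put concretely (a minimal sketch with a hypothetical endpoint), the mechanism is:

```python
def click_rule(link_id, endpoint="https://tracker.example/click"):
    """While the link is :active (being clicked), the ::after
    pseudo-element exists and tries to load its content url(),
    producing one loggable request per first click."""
    return (
        f"#{link_id}:active::after "
        f'{{ content: url("{endpoint}?clicked={link_id}"); }}'
    )
```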

------
daxaxelrod
Tracking seems to only really be server-side. The CSS just dispatches requests
with query-string params. Probably not an ideal production tracking solution,
as it severely limits the data you can send back for better analytics.
------
jwilk
What's "sheeding"?

~~~
jwilk
Apparently a typo:

[https://github.com/jbtronics/CrookedStyleSheets/commit/c5d59...](https://github.com/jbtronics/CrookedStyleSheets/commit/c5d599f025f2222e077fce3d87afddbe5c4ee088)

------
keypress
How does 'check spelling as you type' work: via a dictionary that is
previously downloaded, or is this an online service that leaks all or some of
your key presses?

------
bradyholt
Nice POC! I love the project name :)

------
madez
The demo shows that this technique doesn't work for "Privacy Browser" on
Android. It can be obtained from F-Droid.

------
lozzo
very good. I wonder if somebody really needs this.

------
JepZ
Wow, CSS is the new JS :D

------
throwawazqq
A lot of css is fluff masking low information content. Turning off css helps
me not have to page down x times to see a noncathartic one-liner.

