Hacker News
Show HN: Tricking the user to access history using CSS and captchas (frantzmiccoli.github.io)
140 points by frantzmiccoli on June 9, 2014 | 47 comments



Very nice concept. You should also add the following CSS to the captcha letters:

     -webkit-touch-callout: none; -webkit-user-select: none; -khtml-user-select: none;
     -moz-user-select: none; -ms-user-select: none; user-select: none; 
This will make it feel even more like a real captcha by making it impossible to select the text. (Right now you can select it to see the invisible letters)


Alternatively, pointer-events: none; will do all the above in one property. Better cross-browser support also.


Done, thanks for the hint ;)


I can still select text with mouse in ff 24.5 esr - but still very clever trick!


I didn't apply the style to the right element... My bad!


Works now, but you might want to block pointer events too, for example I can drag with the mouse from the captcha into the text area below to get a URL.


Quite a bit scarier than the TinSnail demo, but it must have a much lower bandwidth. The source only has three links and you will probably see all three if you have caching turned on. I guess if you're looking for one or two specific sites, it doesn't matter.


Brilliant. getComputedStyle used to give away the color of a link, so at one time this attack was trivial: you didn't need any user input, as a blue link meant unvisited (:link), and a purple one meant :visited. Replacing getComputedStyle with user input ("is this letter black or transparent?") is definitely brilliant.
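
For anyone who never saw the old attack, here is a minimal sketch of the idea. The color lookup is passed in as a callback so the logic can be read (and run) outside a browser; in an old browser it would have been roughly a getComputedStyle(link).color read. The function names, the purple value and the fake browser are all assumptions for illustration. Current browsers report the unvisited style to scripts, so this no longer leaks anything.

```typescript
// Hypothetical sketch of the historical :visited sniffing technique.
// `readColor` stands in for reading the computed color of a link to `url`.
type ColorReader = (url: string) => string;

// Assumed stylesheet rule: a:visited { color: rgb(128, 0, 128) }
const VISITED_COLOR = "rgb(128, 0, 128)";

function sniffVisited(urls: string[], readColor: ColorReader): string[] {
  // A link whose computed color matched the :visited rule was in the history.
  return urls.filter((url) => readColor(url) === VISITED_COLOR);
}

// Stand-in for an old browser in which only the first site had been visited.
const fakeBrowser: ColorReader = (url) =>
  url === "https://news.ycombinator.com/" ? VISITED_COLOR : "rgb(0, 0, 255)";

const sniffed = sniffVisited(
  ["https://news.ycombinator.com/", "https://example.com/"],
  fakeBrowser
);
```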


There was a pretty interesting talk about using differences in style render time to get history.

https://www.youtube.com/watch?v=KcOQfYlyIqw

It's pretty interesting, and can be done in a way that doesn't require any user input.


You might want to check out the research paper "I Still Know What You Visited Last Summer: Leaking Browsing History via User Interaction and Side Channel Attacks" ( http://www.ieee-security.org/TC/SP2011/PAPERS/2011/paper010.... ). The paper describes several similar (if not the same) attacks.


Another interesting, more recent paper (I couldn't find a link to the actual pdf, but I'm sure people are resourceful):

http://dl.acm.org/citation.cfm?id=2516712

This one describes an attack that not only steals browser history, but also reconstructs pages from the user's cache.


That's impressive.

Honestly, modern browsers should just start ignoring off-domain :visited styles.


That would break aggregator sites like HN and Reddit (although really they should be maintaining the visit history themselves, as they do with Reddit Gold users).


I tried to use the history tracking that comes with Reddit Gold for about a week and it was virtually useless. After browsing on my phone and two computers only like 1 in 10 of the links would correctly show up as purple on the other devices (even just PC to PC it didn't work).

Edit: I should mention I bought Reddit Gold just for this feature, so I was optimistic that it'd work.


As an alternative, you can set reddit to hide links that you have voted on, which does not require a reddit gold account.


> although really they should be maintaining the visit history themselves, as they do with Reddit Gold users

Another HTTP request between me and the content I want. Another 1s of RTT (UMTS link)... No.


This can be done in parallel with JS, without using an HTTP redirect.


Nope, if you fire the AJAX request directly on the onclick event, chances are high it will not be submitted/processed before the browser navigates away...
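
One way around that race, sketched under the assumption that the browser supports navigator.sendBeacon, which queues a small POST that is meant to survive the page unloading. The sender is injected here so the example runs without a browser; the /visited endpoint, the payload shape and the helper names are hypothetical.

```typescript
// Hypothetical sketch: record a click without delaying navigation.
// `Beacon` mirrors the shape of navigator.sendBeacon(url, data): it queues
// the request and returns true if it was accepted for delivery.
type Beacon = (url: string, body: string) => boolean;

function recordVisit(href: string, beacon: Beacon, endpoint = "/visited"): boolean {
  // The endpoint path and JSON payload are assumptions for illustration.
  return beacon(endpoint, JSON.stringify({ href }));
}

// In a browser this might be wired up as:
//   link.addEventListener("click", () =>
//     recordVisit(link.href, (u, b) => navigator.sendBeacon(u, b)));

// Stand-in beacon that just collects what would have been sent.
const queued: Array<{ url: string; body: string }> = [];
const fakeBeacon: Beacon = (url, body) => {
  queued.push({ url, body });
  return true;
};

const accepted = recordVisit("https://example.com/story", fakeBeacon);
```

The click handler never calls preventDefault, so the navigation itself is not held up by the logging request.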


You could do opt-in permissions, similar to the permission request for location information or for Chrome desktop notifications.


There has to be a better way to indicate :visited, using browser chrome.

Perhaps only showing the visited info on mouseover (as a cursor style),

or limiting it to cases where the style is a color-change that is not nearly the same as the background color, in a DOM element that is front-most z-index... (but this probably can't be computed reliably...)

or just defaulting to disabling, until the user approves the domain or path (NoScript-esque)
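
To make the "not nearly the same as the background color" idea concrete, here is a rough sketch that uses the WCAG contrast ratio as the measure. Choosing that metric, the 3:1 threshold and the function names are all assumptions; the sketch only compares colors and says nothing about z-index or overlapping elements, which is the part that probably can't be computed reliably.

```typescript
// Hypothetical sketch of a "both link states must be visible" check,
// using the WCAG contrast ratio to compare a text color with a background.
type RGB = [number, number, number]; // channels in 0..255

function relativeLuminance([r, g, b]: RGB): number {
  // Linearize each sRGB channel, then weight by perceived brightness.
  const lin = (c: number): number => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b);
}

function contrastRatio(a: RGB, b: RGB): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Honor the :visited color only if both the unvisited and the visited
// color stand out from the background. The 3:1 threshold is an assumption.
function visitedStyleAllowed(
  unvisited: RGB,
  visited: RGB,
  background: RGB,
  minRatio = 3
): boolean {
  return (
    contrastRatio(unvisited, background) >= minRatio &&
    contrastRatio(visited, background) >= minRatio
  );
}

const white: RGB = [255, 255, 255];
const blue: RGB = [0, 0, 255];
const purple: RGB = [128, 0, 128];

const normalLinks = visitedStyleAllowed(blue, purple, white);
const hiddenLetters = visitedStyleAllowed(white, [0, 0, 0], white);
```

With these inputs, ordinary blue and purple links on white pass, while a captcha-style setup that hides unvisited letters in the background color is rejected.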


What am I missing? I just got this pre-determined list of links: https://github.com/frantzmiccoli/visited-captcha-history/blo...

I was impressed when this list came up, but suspicious because I hadn't visited reddit or github yet today.


It's sites you've visited at any time in the past (since the cache was cleared). Anything that would normally show up as purple rather than blue on regular websites.


It took me a minute to figure out, but that hard-coded list is the list of URLs it checks to see if you've visited. Try opening a private browser window.


Interesting but this method is limited to the URLs that you list in the javascript (in this case linklist.js). More of a specific validation to see if the user has visited the links you provide rather than a total data scrape.

To fully scrape the user's history you would have to list every URL in existence.

Great proof of concept though.
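
A rough sketch of why the method is bounded by the list, assuming each captcha letter is a link to one candidate URL and only :visited links are drawn in a visible color. This is not the demo's actual code; the letters, the type and the function name are hypothetical.

```typescript
// Hypothetical sketch: map a typed captcha answer back to visited URLs.
// Assumes one distinct letter per candidate URL, and that a letter is
// readable only when its link matches :visited.
interface Candidate {
  letter: string;
  url: string;
}

function inferVisited(list: Candidate[], typed: string): string[] {
  const answer = typed.toUpperCase();
  // A letter the user could read implies its link was in the history.
  return list
    .filter((c) => answer.indexOf(c.letter.toUpperCase()) !== -1)
    .map((c) => c.url);
}

const linkList: Candidate[] = [
  { letter: "K", url: "https://github.com/" },
  { letter: "R", url: "https://www.reddit.com/" },
  { letter: "W", url: "https://news.ycombinator.com/" },
];

// The user only saw (and typed) two of the three letters.
const inferred = inferVisited(linkList, "kw");
```

Each typed letter yields one bit about one listed URL, so a single captcha can never reveal more sites than it has letters.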


For ad related purposes, it makes it easy to see if you've visited competitors' websites (3-5 of them), therefore meaning that you're actively looking for business, instead of just bouncing on the page with no intent of buying anything.


There are a lot of attacks like this, and it's serious enough that browsers attempt to mitigate them by e.g. preventing JavaScript from reading out the computed properties of a visited link element.

Three sites is obviously too few to do much, but if you splat in a bigger list of popular web sites you can learn a lot about your visitors.


Rebuilding the full user history would be impossible, but finding a few compromising websites would be easy.


Would have been a greater one if linklist.js contained links to more sites than Github, Reddit and Hackernews... I mean I could have guessed those by assuming that I visited that page via Hackernews.


My point was to actually show something to the testers, I reduced the scope on purpose.


Clever


The submitted title was "Show HN: Tricking the user to access his history using CSS and captchas". We finessed the pronoun issue in this case by just taking "his" out.


I'm impressed by the number of comments this can raise. Your solution seems like a good one, if only I could edit the title.


s/his/their

Unless of course there is something on this service that actually limits all of your users into being one gender :)


:%s/his/their/g

sorry.


There was only one "his" in the title, it didn't need to be a global replace.




1745 called. They want their grammar back.

http://en.wikipedia.org/wiki/Singular_they


The "male" pronoun's use is grammatically correct when the gender of the subject is unknown.

Take your SJW puffery to Tumblr.


> Take your SJW puffery to Tumblr

Personal attacks are not allowed on Hacker News. Please don't address other users like this.

(My comment here has nothing to do with pronouns.)


How is this anything but a non-sequitur? Personal attacks aren't ungrammatical and racial slurs aren't ungrammatical. Yelling "FIRST!" at the top of a comment thread probably isn't ungrammatical either. Like using 'he' for a generic user, each behavior is obnoxious (albeit in different ways and with different scales).

Singular they is well established as a grammatical construction (http://www.crossmyt.com/hc/linghebr/sgtheirl.html, for instance), and avoids the issue entirely.


You don't have to be a jerk about it.


It is true that the use of male pronouns in gender-neutral context has historically been considered correct and is still considered correct by many. It is, however, generally discouraged because it only serves to reinforce the very real and problematic implication of male as the "default" sex.


Discouraged by whom?


It's a fairly common prescription in style guides, though it is still debated: try Googling "style guides gender neutral language" for instance.


Incorrect. Grammar, like all aspects of language, is socially generated. Currently, the shift has been to move towards 'she' or 'they' as a default gender.

Saying "'he' is grammatically correct" is like saying "C++ is the proper programming language." It's all about usage.


And "he" as a gender-neutral pronoun is still quite common usage, thus still correct by your definition of it.



