

Ask pg: "&#x2F" considered harmful? - dmckeon
https://www.hnsearch.com/search#request/all&q=%26%23x2F&sortby=create_ts+asc&start=0

======
dmckeon
It appears that trailing slash characters on URLs have been showing up as an
encoded entity (amper-hash-x-hex-hex) in links since sometime on June 7, 2013.
The posted search turns up many examples; here's a simple one:

[https://news.ycombinator.com/item?id=5845372](https://news.ycombinator.com/item?id=5845372)

On a related topic, I've seen mention of people adding a bare '#' hash to URLs
to avoid duplicate link detection. No idea if that can be handled well in
software.

~~~
krapp
_I've seen mention of people adding a bare '#' hash to URLs to avoid
duplicate link detection. No idea if that can be handled well in software._

pg could attempt to hash the head or meta tags from the remote URL (though
that would require making a request for every submission), and apply closer
scrutiny to URLs with common domains and paths. Just adding cruft to a URL
would seem to be something you'd be able to detect easily.

Of course I say this having no clue how difficult that would be to do in Arc.
To say nothing of having to follow redirects, etc...
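
For what it's worth, the "cruft" part can be roughed out without any network requests. This is only a sketch in Python rather than Arc, and `canonical_url` is a made-up name, not anything HN actually runs. It assumes that a fragment and a trailing slash never distinguish two submissions, which is wrong for sites that route on the fragment.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Hypothetical normalizer: reduce a URL to a form suitable for duplicate checks."""
    parts = urlsplit(url.strip())
    # Scheme and host are case-insensitive, so lowercase them.
    # (Simplification: this also lowercases any userinfo in the netloc.)
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Treat "/a/" and "/a" as the same page; an empty path becomes "/".
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment: it is never sent to the server, so a bare '#'
    # (or '#anything') does not change which document is fetched.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


seen = set()
for submitted in (
    "https://news.ycombinator.com/item?id=5845372",
    "https://news.ycombinator.com/item?id=5845372#",
):
    key = canonical_url(submitted)
    print(submitted, "duplicate" if key in seen else "new")
    seen.add(key)
```

The second URL comes out as a duplicate because both reduce to the same key. Redirects and content hashing would still need a request per submission, as noted above.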

------
mooism2
The <title> appears to be double-encoded.
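
A minimal illustration of what double-encoding looks like, in Python. The `escape_slash` helper is hypothetical and only stands in for whatever escaping the site applies; the assumption is that the same escaping step runs twice on the same string.

```python
import html


def escape_slash(text: str) -> str:
    """Hypothetical escaper: encode '&' first, then '/' as a hex entity."""
    return text.replace("&", "&amp;").replace("/", "&#x2F;")


once = escape_slash("/")    # the entity a browser would decode back to "/"
twice = escape_slash(once)  # the '&' of the entity gets escaped again

# A browser decodes entities exactly once, so the reader sees the raw entity.
shown_once = html.unescape(once)
shown_twice = html.unescape(twice)

print(once, "->", shown_once)
print(twice, "->", shown_twice)
```

With a single pass the slash round-trips cleanly; with two passes the page displays the literal entity text instead of the slash.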

