
You'll always lose in an arms race that you didn't start.

If you start stripping affiliate IDs, I'll just write a redirector and link to that, or link to an existing redirector. Are you going to ban all of bit.ly? Or t.co? Or letter.obscure_tld from your website?
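For reference, stripping affiliate IDs might look roughly like the sketch below. The parameter names in `AFFILIATE_PARAMS` are assumptions for illustration, not a list any site is known to use, and `strip_affiliate_params` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of query parameters treated as affiliate IDs.
AFFILIATE_PARAMS = {"tag", "ref", "affid", "aff_id"}


def strip_affiliate_params(url: str) -> str:
    """Return the URL with any query parameter named in AFFILIATE_PARAMS removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in AFFILIATE_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

A direct link such as `https://example.com/item?id=42&tag=foo-20` loses its `tag` parameter, but a redirector link such as `https://bit.ly/abc` passes through unchanged because the affiliate ID only appears after the redirect. That is the gap the comment above is pointing at.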




We did ban URL shorteners on delicious.

It wasn't that hard to pick out URLs that were not user-facing (ones that a bookmarklet would never see).
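One simple way to ban shorteners is a host blocklist. The sketch below is only an illustration of that idea, not a description of what delicious actually did; the entries in `SHORTENER_HOSTS` and the helper name `is_shortener` are assumptions.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; a real one would be longer and maintained over time.
SHORTENER_HOSTS = {"bit.ly", "t.co", "tinyurl.com", "goo.gl"}


def is_shortener(url: str) -> bool:
    """Return True if the URL's host is on the shortener blocklist."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host in SHORTENER_HOSTS
```

A fixed list like this does nothing against a redirector on a domain nobody has seen before, which is the objection raised at the top of the thread.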


Isn't it possible to first check whether the URL behind a bit.ly link is an affiliate link?


There's a better way.

Make a crawler that follows your redirects. If it hits an affiliate page, you can presume (with some likelihood) that it's a spam link.

If you put in intermediate redirects that the crawler wouldn't pick up, there's a chance your targets won't either and you'll lose customers.
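A minimal sketch of the redirect-following idea, under some assumptions: `fetch_location` is a hypothetical callable that returns the `Location` header for a URL (or `None` when there is no redirect), and `AFFILIATE_PARAMS` is an illustrative list of query parameter names. The network layer is left out on purpose so the logic can be tried without making requests.

```python
from urllib.parse import parse_qs, urljoin, urlsplit

# Hypothetical query parameters treated as affiliate markers.
AFFILIATE_PARAMS = {"tag", "affid"}
MAX_HOPS = 10


def looks_like_affiliate(url):
    """Return True if the URL carries one of the assumed affiliate parameters."""
    query = parse_qs(urlsplit(url).query)
    return any(name.lower() in AFFILIATE_PARAMS for name in query)


def follow_redirects(url, fetch_location, max_hops=MAX_HOPS):
    """Return the list of URLs visited, starting with `url`.

    `fetch_location(url)` is assumed to return the redirect target
    (possibly relative) or None when the URL does not redirect.
    """
    chain = [url]
    seen = {url}
    for _ in range(max_hops):
        location = fetch_location(chain[-1])
        if location is None:
            break
        next_url = urljoin(chain[-1], location)
        if next_url in seen:  # redirect loop, stop here
            break
        chain.append(next_url)
        seen.add(next_url)
    return chain


def is_probable_affiliate_link(url, fetch_location):
    """Flag the link if any hop in its redirect chain looks like an affiliate URL."""
    return any(looks_like_affiliate(u) for u in follow_redirects(url, fetch_location))
```

For a dry run, a plain dict can stand in for the network: pass `{"https://bit.ly/abc": "https://shop.example/item?tag=x-20"}.get` as `fetch_location`. Redirects done in JavaScript or with a meta refresh never show up as a `Location` header, so this sketch would miss them, which is the kind of intermediate redirect the comment above mentions.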


The crawler better be undetectable as such. For instance, it better send "expected" headers (User Agent, Accepts, etc.), and it better have cookies enabled, and also operate from many distinct and perpetually changing IP addresses.

Otherwise, the spammer can run their own URL shortener on a $5/month VPS and show a spammy link to users and a regular-looking link to the crawler.

BTW: a "crawler" implemented with Mechanical Turk workers would be a little bit harder to detect, but would also have its downsides.


"For instance, it better send 'expected' headers (User Agent, Accepts, etc.), and it better have cookies enabled..."

How are these challenges exactly? Just set the user-agent and enable cookies. Done.

"...also operate from many distinct and perpetually changing IP addresses."

Okay. Change the IP it operates on every few days.
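To make "set the user-agent and enable cookies" concrete, here is a rough sketch using only Python's standard library. The header values are illustrative assumptions copied in the style of a desktop browser, and `build_browser_like_opener` is a hypothetical helper name.

```python
import urllib.request
from http.cookiejar import CookieJar

# Illustrative browser-style headers; the exact values are assumptions.
BROWSER_HEADERS = [
    ("User-Agent", "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"),
    ("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),
    ("Accept-Language", "en-US,en;q=0.5"),
]


def build_browser_like_opener():
    """Return (opener, jar): an opener that stores cookies and sends browser-style headers."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Replace the default "Python-urllib" User-agent with the headers above.
    opener.addheaders = list(BROWSER_HEADERS)
    return opener, jar
```

Calling `opener.open(url)` would then send these headers and keep any cookies in `jar` across requests. Rotating the source IP address is an infrastructure question (proxies or multiple hosts) and is not covered by this sketch.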



