
Will Skimlinks (or someone similar) offer a reverse-affiliate service that strips affiliate IDs from links on your site instead of adding them? Does this already exist?

(It would be trivial for Pinterest to manually do this, say, for Amazon, which could instantly crush a spam model based only on Amazon, without any spam network detection/banning required)

Personally, though, I think affiliate links in social networks are pretty innocuous, if not slightly positive.

You'll always lose in an arms race that you didn't start.

If you start stripping affiliate IDs, I'll just write a redirector and link to that, or link to an existing redirector. Are you going to ban all of bit.ly? Or t.co? or letter.obscure_tld from your website?

We did ban URL shorteners on delicious.

It wasn't that hard to pick out URLs that were not user-facing (ones that a bookmarklet would never see).

Isn't it possible to first check whether the URL behind a bit.ly link is an affiliate link?

There's a better way.

Make a crawler that follows your redirects. If it hits an affiliate page, you can presume (with some likelihood) that it's a spam link.
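A rough sketch of that crawler idea, in Python. It is only an illustration under some assumptions: `fetch` is a hypothetical callable you supply that returns the status code and `Location` header for one request, and `AFFILIATE_PARAMS` is a made-up, deliberately tiny list (only Amazon's `tag` parameter). A real detector would need far more patterns.

```python
from urllib.parse import urljoin, urlsplit, parse_qs

# Illustrative only: query parameters treated as affiliate markers.
# "tag" is the parameter Amazon affiliate links carry; the set is an assumption.
AFFILIATE_PARAMS = {"tag"}

REDIRECT_CODES = (301, 302, 303, 307, 308)


def follow_redirects(url, fetch, max_hops=10):
    """Return the list of URLs visited, starting with `url`.

    `fetch(url)` is a hypothetical injected callable returning
    (status_code, location_header_or_None) for a single request.
    """
    chain = [url]
    seen = {url}
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status not in REDIRECT_CODES or not location:
            break
        nxt = urljoin(chain[-1], location)  # Location may be relative
        if nxt in seen:  # stop on redirect loops
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


def looks_affiliated(url):
    """True if the URL's query string carries a known affiliate parameter."""
    query = parse_qs(urlsplit(url).query)
    return any(param in query for param in AFFILIATE_PARAMS)


def is_suspect(url, fetch):
    """True if any hop in the redirect chain looks like an affiliate link."""
    return any(looks_affiliated(u) for u in follow_redirects(url, fetch))
```

Injecting `fetch` keeps the chain logic separate from the HTTP client, so the headers, cookies and IP rotation discussed elsewhere in the thread can be handled in one place.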

If you put in intermediate redirects that the crawler wouldn't pick up, there's a chance your targets won't either and you'll lose customers.

The crawler better be undetectable as such. For instance, it better send "expected" headers (User Agent, Accepts, etc.), and it better have cookies enabled, and also operate from many distinct and perpetually changing IP addresses.

Otherwise, the spammer will be able to run his/her own URL shortener service on a 5 USD/month VPS and show a spammy link to users and a regular-looking link to the crawler.

BTW: a "crawler" implemented with Mechanical Turk workers would be a little bit harder to detect, but would also have its downsides.

"For instance, it better send "expected" headers (User Agent, Accepts, etc.), and it better have cookies enabled..."

How are these challenges exactly? Just set the user-agent and enable cookies. Done.

"...also operate from many distinct and perpetually changing IP addresses."

Okay. Change the IP it operates on every few days.

I'm building one of these at the moment.

It's to process URLs, and it's just a part of what I need, but knowing that others might like such a thing means I'll see if I can open it up afterwards.

I'm looking to achieve this by two methods:

1) If the link contains the ID of the end page (i.e. ASIN for Amazon), then reconstruct the URL of the end page.

2) For places in which #1 fails, follow the link and seek to determine whether a permalink or canonical URL exists at the end page.
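The two methods above might look roughly like this in Python (the poster mentions Go; Python is used here purely for illustration). The ASIN pattern, the hard-coded `www.amazon.com` host, and the `fetch_html` callable are all assumptions made for the sketch, not anything the poster described.

```python
import re
from html.parser import HTMLParser

# Assumption: an ASIN is 10 uppercase alphanumerics after /dp/ or /gp/product/.
ASIN_RE = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:[/?#]|$)")


def amazon_clean_url(url):
    """Method 1: rebuild a bare product URL from the ASIN, or return None."""
    match = ASIN_RE.search(url)
    if not match:
        return None
    # Assumption: always rebuild on www.amazon.com; other storefronts differ.
    return f"https://www.amazon.com/dp/{match.group(1)}"


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonical = attr["href"]


def canonical_from_html(html):
    """Method 2: return the canonical URL declared in the page, if any."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def clean_url(url, fetch_html):
    """Try method 1, then method 2, else return the URL unchanged.

    `fetch_html(url)` is a hypothetical callable returning the page body.
    """
    return amazon_clean_url(url) or canonical_from_html(fetch_html(url)) or url
```

Method 1 never touches the network, which is why it runs first; method 2 only helps when the destination page actually declares a canonical URL.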

Ironically I seek to strip affiliate codes in order to add my own in my given use case... but I'm using golang and am trying to structure it all in a flow based way in which stripping and adding codes are just separate steps.

So it doesn't seem to me to be too hard to then expose each side as a service by itself.

2) What if I build a redirect engine that doesn't redirect to an affiliate page at first and then turn on the redirect after your engine completes the permalink check? For example, if you do the permalink check when the story is published, I would have my engine wait 5 minutes or so before changing the destination of the URL to my affiliate page.

For my own work I plan to keep a store of ALL user submitted links.

I plan to iterate over it and do sane things with it:

* Is it an embedded image, and is Chrome reporting the domain as malware? = Convert it to a link instead of an image
* Is it a link that was only transiently available? = Indicate that it is no longer available and suggest searching instead

My aim is to self-heal user submitted content that links elsewhere, as much as it is to monetise that content where it's possible to.

I plan to visit links on a schedule and react as necessary. I hadn't really figured in affiliate spammers, but that would just be part of the self-healing now... detect change in destination and re-run the bit that strips affiliate codes.
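The "detect change in destination" step could be as small as the following Python sketch. The in-memory `store` dict and the `resolve` callable (which would follow the link and return its final destination) are placeholders invented for illustration; a real system would persist the store and run this from a scheduler.

```python
def recheck(store, url, resolve):
    """Re-resolve `url` and report whether its destination changed.

    `store` maps each submitted URL to the destination seen on the last visit.
    `resolve(url)` is a hypothetical callable returning the final destination.
    Returns True only when a previously recorded destination differs.
    """
    current = resolve(url)
    previous = store.get(url)
    store[url] = current  # always remember the latest destination
    return previous is not None and previous != current
```

When `recheck` returns True, that is the cue to re-run the affiliate-stripping step on the link. It only catches a delayed switch if a later scheduled visit happens after the switch, so the visit schedule matters.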

Stripping affiliate IDs from links is pretty easy; the hard part is knowing all the possible combinations across all the networks.
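To show both halves of that point, here is a minimal Python sketch of the easy part. The `AFFILIATE_PARAMS_BY_HOST` table is the hard part: it is a hypothetical, one-entry stand-in (Amazon's `tag` parameter only), and keeping such a table complete for every network is exactly the problem described above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Illustrative, incomplete table: domain -> query parameters to drop.
AFFILIATE_PARAMS_BY_HOST = {
    "amazon.com": {"tag"},
}


def strip_affiliate(url):
    """Return `url` with known affiliate query parameters removed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    blocked = set()
    for domain, params in AFFILIATE_PARAMS_BY_HOST.items():
        # Match the domain itself and any subdomain such as www.
        if host == domain or host.endswith("." + domain):
            blocked |= params
    if not blocked:
        return url  # unknown site: leave the link untouched
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in blocked
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Scoping the parameter names per host avoids mangling unrelated sites that happen to use a `tag` query parameter for something else.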

Skimlinks already ignores their redirect if the link is already affiliated so I'm guessing they could just reverse this logic & have a solution.
