
We've found that good indicators include: a large distance between billing and shipping addresses, a large distance between estimated IP location and billing address, large order size, and using a free email provider like Gmail/Yahoo/Hotmail (that's the smallest of the factors, but virtually all of our fraud orders use them). Even combining these and others with a threshold, it's still hard to reliably detect fraud without too many false positives.
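To make the "combine indicators with a threshold" idea concrete, here is a minimal sketch. The field names, the weights, the 500 km distance cutoff, and the 1000 order-size cutoff are all made up for illustration and are not the values anyone in this thread used; the only thing taken from the comment is that the free-email signal counts for less than the others.

```python
import math

# Hypothetical list; a real one would be much longer.
FREE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def fraud_score(order, distance_km=500.0, large_order=1000.0):
    """Sum illustrative weights for each indicator that fires.

    `order` is a dict with (lat, lon) tuples under "billing", "shipping"
    and "ip_location", a numeric "total", and an "email" string.
    All weights and cutoffs are assumptions, not tuned values.
    """
    score = 0
    if haversine_km(*order["billing"], *order["shipping"]) > distance_km:
        score += 2  # billing and shipping far apart
    if haversine_km(*order["ip_location"], *order["billing"]) > distance_km:
        score += 2  # estimated IP location far from billing address
    if order["total"] > large_order:
        score += 2  # large order size
    domain = order["email"].rsplit("@", 1)[-1].lower()
    if domain in FREE_EMAIL_DOMAINS:
        score += 1  # weakest signal, so it gets the smallest weight
    return score


def needs_review(order, threshold=4):
    """Flag the order when the combined score reaches the threshold."""
    return fraud_score(order) >= threshold
```

Raising `threshold` trades missed fraud for fewer false positives, which is exactly the tension described above.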

At a previous gig we found the same basic factors. I wrote a quick script to iterate through all the available Weka[1] classifiers using our manually flagged data as a training set. Then I took the 20 top-performing ones and used them on incoming orders in production. If more than half the classifiers agreed a transaction was fraud, we denied it. Though this seems a very blunt hammer (I'm not a machine learning expert by any stretch), it worked remarkably well.

[1] http://www.cs.waikato.ac.nz/ml/weka/
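For anyone who wants to try the voting approach, a rough sketch of the idea in plain Python follows. It is not the original script and does not use Weka: each "classifier" is just a hypothetical function that takes an order and returns True for fraud, and ranking by accuracy on a labeled set is my assumption about what "top performing" means.

```python
def top_k_by_accuracy(classifiers, labeled_orders, k=20):
    """Keep the k classifiers with the highest accuracy.

    `labeled_orders` is a list of (order, is_fraud) pairs. Ideally this is
    a held-out set rather than the data the classifiers were trained on.
    """
    def accuracy(clf):
        correct = sum(1 for order, is_fraud in labeled_orders
                      if clf(order) == is_fraud)
        return correct / len(labeled_orders)

    return sorted(classifiers, key=accuracy, reverse=True)[:k]


def majority_says_fraud(classifiers, order):
    """Return True when strictly more than half the classifiers vote fraud."""
    votes = sum(1 for clf in classifiers if clf(order))
    return votes > len(classifiers) / 2
```

With an even number of classifiers (such as 20), a tie is not "more than half", so a 10-10 split lets the order through.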

What happens to the false positives - i.e. the ones you denied that were real? Do people get a way to prove they really are legit?

(Matt Cutts's blog claimed I was a spammer when I made a comment about two-factor authentication - I have no idea why. It told me to email the administrator to get it accepted, but of course provided no clues on what address to use etc., so I didn't bother. I'm betting the anti-spam software is claiming a victory when it was actually a failure.)

For any denial we popped a "technical issues, please call customer support" message. We found that fraudsters were far less likely to call than real customers.
