
A couple years ago at Inky we pivoted away from general improvements to email to focus on phishing prevention using ML and computer vision, and this has been tremendously successful. (This pivot was motivated primarily by Google Inbox, which of course just got shuttered -- but that's another post.)

One challenge with phishing is that virtually all the "best practice" pieces written by the press still follow this Atlantic article's "blame the user" approach to phishing prevention. I.e., train your end users to not click on stuff in bad emails.

Unfortunately, what we've seen over the last year or so is exactly what you'd expect: now that many companies run simulated phishing training campaigns -- sending fake phishing emails to end users to train them not to click on "bad links" -- attackers are sending brand forgery emails that look essentially perfect. The key insight here is that the attacker has a labor-saving technique that is also completely devastating to the approach of training users: "Save As HTML".

It's obvious in retrospect, but all the attacker has to do is take the exact HTML from a real transactional email (say, a DocuSign request), edit it to change one link, and resend it. (In security parlance this is a kind of "replay attack".) By definition, the body of this email will look identical to the original transactional mail, so you're left with training users to see the invisible. The hapless end user logs in "to DocuSign" and thereby gives the attacker his/her credentials.
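
To make the "one changed link" point concrete, here is a minimal sketch of what software can inspect that a human looking at the rendered mail cannot: the hosts behind every link in the HTML body. The function name off_brand_hosts and the set of brand domains passed to it are assumptions for illustration, not a description of any real product, and a real brand may legitimately link to domains that are not on such a list.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the hostname of every <a href=...> in an HTML body."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            host = urlsplit(href).hostname  # None for mailto: and relative links
            if host:
                self.hosts.append(host.lower())


def off_brand_hosts(html_body, brand_domains):
    """Return link hosts that are neither a brand domain nor a subdomain of one.

    brand_domains is a hypothetical allowlist supplied by the caller.
    """
    parser = LinkCollector()
    parser.feed(html_body)
    return [
        host
        for host in parser.hosts
        if not any(host == d or host.endswith("." + d) for d in brand_domains)
    ]
```

In a cloned mail where only the "Review Document" link was edited, every other link still points at the real brand, so the one host this returns is exactly the part the user cannot see.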

In contrast, a machine learning system that is trained to recognize brand-indicative clues from emails can trivially verify DKIM, etc. on the mail to determine whether the mail really is from DocuSign or not. (It's actually not really trivial, because DocuSign might send mail through MailChimp or some random domain they never told you about, but that's a detail...) Software 1, Humans 0.
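
A rough sketch of the second half of that check, assuming some other component has already decided which brand the mail claims to be from. It only reads the d= tag of each DKIM-Signature header and compares it to a hypothetical per-brand allowlist. It does not verify the signature cryptographically, which a real system would have to do first, and the allowlist is precisely the "domain they never told you about" problem mentioned above.

```python
import re
from email import message_from_string

# Hypothetical allowlist of signing domains per brand; illustrative only.
BRAND_SIGNING_DOMAINS = {"docusign": {"docusign.com"}}


def dkim_signing_domains(msg):
    """Return the d= domain of every DKIM-Signature header.

    This only parses the header; it does NOT check the signature itself.
    """
    domains = set()
    for header in msg.get_all("DKIM-Signature", []):
        match = re.search(r"(?:^|;)\s*d\s*=\s*([^;\s]+)", header)
        if match:
            domains.add(match.group(1).lower())
    return domains


def lacks_brand_signature(raw_email, claimed_brand):
    """True if no DKIM d= domain belongs to the brand the mail claims to be from."""
    msg = message_from_string(raw_email)
    allowed = BRAND_SIGNING_DOMAINS.get(claimed_brand, set())
    signed_by = dkim_signing_domains(msg)
    return not any(
        d == a or d.endswith("." + a) for d in signed_by for a in allowed
    )
```

Unsigned mail and mail signed only by an unrelated domain both come back True here, which is the cheap signal a human reader never gets to see.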

This leads me to personally believe that while phishing awareness training is important and a good practice, the future must be one where the machines do the vast majority of phishing email identification, blocking these emails before they reach end users. And it's a hard problem.

Of course, attackers can't precisely control the headers -- e.g., they can't easily send DKIM-signed mail from a domain named docusign.com -- so they can't literally replay a real DocuSign mail. But here again they use lots of clever tricks. One of my favorite (i.e., most evil) real-world phishing emails was a clone of an American Express "confirm card activity" email sent from domain aexp-ib.com. Most recipients would plausibly believe that that domain was some kind of internal Amex mail server or something, so it doesn't look at all weird. Even more devastatingly, this email came DKIM-signed -- with SPF and DMARC "alignment" -- by a very high reputation sender (Google), so it sailed right through mail protection systems based around traditional "good mail / bad mail" signals.

Why was it signed by Google? Because the attacker set up a G Suite account and sent the emails from there. This is another challenging phenomenon we're seeing: it's trivial for attackers to "inherit" the good reputation of a shared service like G Suite in this manner. Similarly, instead of hosting phishing sites on sketchy-looking URLs that can be detected with simple Bayesian models (good vs bad URL detection), attackers now host their stuff on Google Sites and on compromised web sites with high Alexa rankings. You can use simple heuristics like "don't trust mail from a domain set up in the last 3 days" but that's problematic too, because attackers can simply bank domains. (And, for that matter, real senders create new domains and send legitimate mail from them.)
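
For reference, the domain-age heuristic is about this simple, which is part of why it is so easy to defeat. This sketch assumes the registration timestamp has already been looked up elsewhere (e.g., via WHOIS or RDAP); the names MIN_DOMAIN_AGE and is_too_new are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# The "3 days" threshold from the heuristic above; the value is illustrative.
MIN_DOMAIN_AGE = timedelta(days=3)


def is_too_new(registered_at, now=None):
    """True if the sending domain was registered less than MIN_DOMAIN_AGE ago.

    registered_at must be a timezone-aware datetime obtained elsewhere.
    """
    now = now or datetime.now(timezone.utc)
    return now - registered_at < MIN_DOMAIN_AGE
```

An attacker who registers the domain and waits four days passes this check, and a legitimate sender who launches a new domain today fails it.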

According to the FBI, email-based phishing attacks have cost companies over $12B since 2013. If there's any silver lining to this scourge, it's that it makes for a really interesting technical challenge for the white hats that pretty much everyone understands the need for. (A completely different set of techniques is required to block impersonations of people -- "spear phishing" -- but I'll leave that for another day.)




> One challenge with phishing is that virtually all the "best practice" pieces written by the press still follow this Atlantic article's "blame the user" approach to phishing prevention. I.e., train your end users to not click on stuff in bad emails.

This is the same problem most pieces of advice for avoiding fraud and financial crimes have. Identity theft shouldn't be blamed on the victims' 'carelessness', because at the end of the day, it's the bank, credit card company, or whoever that screwed up and let someone access money that they shouldn't have.

The solution to most financial crimes and fraud isn't to 'train the customer to avoid the bad guys', since the bad guys are getting more sophisticated all the time and people can't be vigilant 24/7. The solution is to fix the underlying systems and procedures at the companies involved so criminals can't exploit their systems in the way they can right now.


Exactly. What people fear is not "identity theft" but bank libel. I couldn't care less if a fraudster borrows money from a bank using my social security number and other public information. That is the bank's problem when they try to collect the money. The problem is when the bank libels me by writing to a credit bureau and telling them that I have defaulted on a loan. If this libel were treated as a serious crime and prosecuted vigorously, "identity theft" would go away.


You'd still have a hell of a hassle on your hands cleaning up the mess and protecting your credit score, etc., right?


If the banks were heavily penalized for giving wrong info to the credit bureaus, the number of people having this problem of "identity theft" would plummet. If, while cleaning up your credit score you could also get a $10,000 judgment from the bank that libeled you, maybe you don't feel so shitty about the hassle the bank has put you through.


Are they heavily penalized for giving wrong info to credit bureaus? Or perhaps it's not a big problem, as banks have the drill down with regard to identity theft's aftermath?


Are banks saying that you yourself defaulted, or just that the given identity defaulted? Whether you as a person defaulted, or your identity was stolen and someone else defaulted with it, there is still increased risk from the baseline that your identity will default.


I would be interested in knowing if this is how the banks get around the libel problem. New laws should be enacted preventing that, if that is the case.


For one thing, what I said above. For another, it is impossible to libel someone by one private party telling another private party information. Being published is a necessary condition of libel.


If a credit bureau creating documents for others to use for various reasons is not publishing, then Congress needs to pass a law declaring that it is publishing, or make it a crime for banks to report false statements about people to the credit bureaus. Do you think that a creditor that reports a false statement that a person defaulted on a loan is not wrong and should not be penalized? It's great for the people who loan money, but not how any reasonable person would think the world should work.


A credit reporting agency seems very much like a publisher. They receive statements from some parties and distribute them (usually more widely) to other parties.


A few bits of information about me are not my identity.


Enough of your identity to have a wild night out in Boise on your dime.


Who pays the costs of phishing? Is it mostly individuals or do corporations feel the pinch?

Your analysis gets at the root causes, the bug fix that would stop the trouble tickets. Are the incentives there, or the political will? It’s big. You’re getting into major-league public policy when banking regulations are involved. The problem must not yet be painful enough to drive the politics, but maybe that day will arrive sooner or later.

While this could stop the most lucrative form of phishing, it would still leave the other forms of phishing. And as @badrabbit mentioned, there are other types of phishing (e.g., malware delivery).


When you find a phishing email, do you have a program that spams their "docusign" login with vast quantities of plausible-looking but fake credentials?

IMHO active countermeasures are always the best approach to stuff like this. Attack, attack, attack.

Also I could imagine giving them special fake usernames, such that when they try to login via that special fake username, it turns on extra metrology and slowbanning/cpu intensive operations etc.


The 2nd option you mention, submitting honeypot credentials and then watching for login attempts, is helpful as part of a comprehensive defense.
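
As a rough sketch of how that could look, assuming you control the real login service and can hook its authentication path: issue a unique fake username per phishing page, submit it there, and treat any later login attempt with it as a strong signal. The class and method names below are hypothetical, and example.com stands in for your own domain.

```python
import secrets


class HoneytokenRegistry:
    """Tracks fake usernames that were deliberately fed to phishing pages."""

    def __init__(self):
        self._tokens = {}  # fake username -> phishing URL it was submitted to

    def issue(self, phishing_url):
        """Create a unique fake username to submit to the given phishing page."""
        username = f"user-{secrets.token_hex(4)}@example.com"
        self._tokens[username] = phishing_url
        return username

    def check_login(self, username):
        """Return the phishing URL the username was planted on, or None.

        A non-None result means the login attempt came from whoever
        harvested credentials on that page.
        """
        return self._tokens.get(username)
```

Because each token maps back to one phishing URL, a hit tells you not just that an attacker is testing credentials but which campaign they came from.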


We don't do this kind of thing, but some companies do offer "take-down" and related services. Not sure if credential-DDoS is one of them though. :)


What’s the objective? Is it meant as deterrence of future attacks, retribution, or maybe other objectives? I’m not criticizing, I’m trying to learn.


I always thought the standard anti-phishing advice for end users is to never click on any link in any e-mail. Then it doesn't matter how sophisticated the attack is.


Thanks for sharing! Someone in my social circle recently fell victim to a phishing attack. I had similar thoughts that you really need to prevent the email from getting through in the first place. Success for phishing is all about the weakest link in an organization and training only goes so far.


I think you’ve elucidated a good middle term vision. It seems sound to me but I would not qualify as an expert on this topic.

That leaves us with this transition period before your vision’s realized. You’re realistically imagining a time when we solve this problem, relieving us human beings of the burden of consistently resisting the urge to take the phish hook.

In the scenario you mentioned above, did the phisher spoof the real domain and authenticate it? Or are you saying they used a similarly spelled domain that they owned and set up email authentication on it, so that it passed as technically legitimate and left the final defense up to the judgment of each recipient?

Sorry, it’s been a long week and I’m slow on the uptake now...


It is a design problem that people are trying to fix in production. I strongly agree with you about not blaming users but I think the focus should be on what happens after they click on the link or open up the attachment.


> completely devastating to the approach of training users: "Save As HTML".

If you are training users to rely on information that can be spoofed you are Doing It Wrong.


Good insights, as even many InfoSec folks seem to fall into this blame-the-victim mentality.



