
Combatting illegal abuses of ngrok - johns
https://inconshreveable.com/05-29-2014/automate-away-your-problems-combatting-illegal-abuses-of-ngrok/
======
patio11
I think this is an excellent illustration of a situation where your users get
a better experience by being asked to pay money than they get by being given
services for free. Putting _any_ pay gate in front of this will greatly
decrease the amount of abuse you get. You'll still have abuse problems, but
abuse is largely a matter of "outrunning your friend, not outrunning the
grizzly." When you're free and automatable, you are the weakest link on the
Internet for possible exploitation. Even though people can use e.g. stolen
credit cards and fake data to sign up for services, you will no longer be the
weakest link even with that trivial barrier, greatly decreasing the amount of
abuse you'll suffer.

This is why Appointment Reminder requires a CC upfront, by the way. I have
very little desire to again be on a conference call with a lawyer and a police
detective about violation of a restraining order made possible by an ex-
husband being able to essentially proxy his harassing phone calls through my
software.

There exist a variety of potential scalable anti-abuse measures you could
implement here. For example, I think that foo.ngrok.com should probably have a
low, hard limit on how many IPs/regions/etc. should be able to connect to
it. Nobody should be putting production systems on it, so if 100 IPs are
connecting to one of the ngrok sites, that's likely either a) abuse or b) a
use case which you should not be supporting on a free tier.
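
A hard cap like the one described above could be sketched roughly as follows. This is only an illustration: `DistinctIpLimiter`, `max_ips`, and the in-memory store are hypothetical names, and nothing here reflects how ngrok is actually built.

```python
from collections import defaultdict


class DistinctIpLimiter:
    """Caps how many distinct client IPs may reach one free-tier subdomain."""

    def __init__(self, max_ips: int) -> None:
        self.max_ips = max_ips
        # subdomain -> set of client IPs already admitted (hypothetical in-memory store)
        self._seen = defaultdict(set)

    def allow(self, subdomain: str, client_ip: str) -> bool:
        seen = self._seen[subdomain]
        if client_ip in seen:
            return True  # IPs admitted earlier keep working
        if len(seen) >= self.max_ips:
            return False  # hard limit reached: refuse any new IP
        seen.add(client_ip)
        return True


limiter = DistinctIpLimiter(max_ips=3)
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"):
    print(ip, limiter.allow("foo", ip))
```

With a low cap, a developer and a few teammates are unaffected, while a link mailed to thousands of recipients stops working after the first handful of visitors.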

Relatedly: the problem with abuse, here and elsewhere, isn't limited to "My
hosting provider is opening tickets. If I don't catch one, the site shuts
down." Abuse which your hosting provider doesn't catch is still abuse. You
didn't get into the business to make the Internet a darker, more dangerous
place. Make the necessary technical/business changes to avoid operating a
public nuisance.

~~~
inconshreveable
I disagree with respect to a pay gate. A payment gate on ngrok, even something
as simple as requiring a credit card number would pose a significant barrier
for a large swath of ngrok users. As some motivating evidence I can tell you:

I've had requests to support niche payment systems (whose names I can't even
recall) because the country's credit card infrastructure is poor or not
commonly used.

I've had a number of people tell me that their credit card wouldn't process
because their provider is not supported world-wide.

ngrok is useful to students, some in high-school or even younger who surely
don't have easy access to a credit card.

Sure, I could make those people email me with special requests to get around
the system, but that greatly degrades the experience and moves the initial
experience from joy to frustration.

There are a number of possible technical solutions I would rather pursue first
before greatly damaging first-time user experience and funnel.

~~~
patio11
I won't disagree that putting a credit card form up is a barrier, principally
because constructing barriers is the point of the exercise. Net-net, it
strikes me as better to inconvenience a fraction of users predictably with
mitigation in place versus inconveniencing all users unpredictably in a
difficult-to-recover-from fashion when e.g. your hosting company sends you a
termination letter.

By the way, since you mention you haven't worked in hosting before: you should
know that you're at risk of getting your account terminated, even if you
answer all tickets in a timely fashion. Every ticket opened against you
represents multiple fairly highly-paid humans having to interrupt their work
days. They accept this as a cost of doing business the first few times. You
will eventually be told "You don't pay us _nearly_ enough to generate this
volume of tickets. Take your business elsewhere."

Anyhow, your business, ultimately your call. If you're looking to do primarily
automated solutions:

1) Risk score customers. You can do this in intelligent, data-driven fashions,
but you'll get results quicker by just coding heuristics. Some heuristics
which you'll find valuable are "Where is the ngrok client getting called from?
Does it come from a number of high-risk countries?" (I know, I know, most
people hate the implications here but it is a case of being mugged by
reality), "What OS are they using for the client application?", "Has the
machine invoking the current ngrok subdomain been banned before?", "Does the
subdomain receive traffic from 'far away' from the IP we're sending it to?",
"Does the subdomain receive traffic from more than N IPs?", "Is there a large
lag between creation of the subdomain and first use?", "Does an IP create
multiple subdomains either in parallel or in series?", etc. You can easily
make people's access increasingly open as their risk score approaches Presumed
Safe and make it more restricted as their risk score approaches Quite Possibly
Not Safe. (This is a nice compromise which lets legitimate users without a
credit card in without-loss-of-generality China continue to get non-zero value
out of ngrok.)

2) You may wish to have your client application fingerprint systems which it
is installed on. (There exists substantial literature on this.) Add abusers'
fingerprints to a hellban list: their IPs can see their sites, and maybe the
first 3 IPs accessing their subdomain see it and get added to a bucket which
shares their hellban, but everyone else mysteriously 500s out. This will make
it hard for abusers to realize "Doh, I have been found out and need to switch
proxies or rooted boxes."
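
The two suggestions above (heuristic risk scoring and a fingerprint hellban) might look something like this in outline. Everything here is an assumption made for illustration: the signal names, weights, thresholds, tier names, and the `Hellban` class are hypothetical, and real values would have to come from actual abuse data.

```python
from dataclasses import dataclass, field


@dataclass
class TunnelSignals:
    """Hypothetical per-tunnel signals, loosely based on the heuristics listed above."""
    previously_banned: bool = False
    distinct_visitor_ips: int = 0
    subdomains_from_client_ip: int = 1
    minutes_idle_before_first_hit: int = 0


def risk_score(s: TunnelSignals) -> int:
    """Add up illustrative weights for every heuristic that fires."""
    score = 0
    if s.previously_banned:
        score += 5
    if s.distinct_visitor_ips > 100:
        score += 3
    if s.subdomains_from_client_ip > 5:
        score += 2
    if s.minutes_idle_before_first_hit > 60:
        score += 1
    return score


def access_tier(score: int) -> str:
    """Open access up or restrict it as the score moves between the extremes."""
    if score >= 5:
        return "restricted"
    if score >= 2:
        return "limited"
    return "open"


@dataclass
class Hellban:
    """Banned fingerprints plus the small bucket of visitors who still see the site."""
    fingerprints: set = field(default_factory=set)
    bucket: dict = field(default_factory=dict)  # fingerprint -> visitor IPs let through
    bucket_size: int = 3

    def status_for(self, fingerprint: str, owner_ip: str, visitor_ip: str) -> int:
        if fingerprint not in self.fingerprints:
            return 200
        allowed = self.bucket.setdefault(fingerprint, set())
        if visitor_ip == owner_ip or visitor_ip in allowed:
            return 200  # the abuser and the first few visitors notice nothing
        if len(allowed) < self.bucket_size:
            allowed.add(visitor_ip)
            return 200
        return 500  # everyone else gets an unexplained server error
```

Keeping the score as a plain sum of weights makes it easy to add or retune a heuristic each time a new abuse pattern shows up.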

P.S. Customer to vendor here: irrespective of your decision with regards to
people who don't pay money for your services, please charge more. It will give
you more resources to fix this problem so that you don't get shut down again
and ruin my Twilio programming productivity.

~~~
neil_s
I agree with @patio11, this seems like an amazing service that I can't wait to
use now that I have discovered it, but similar to those running open proxies
on the net and thus creating negative externalities for the rest of the web,
you have a certain responsibility to block abuse before it is reported to the
hosting provider.

Maybe something like Sift Science could help automatically create these
heuristics. You could just modify your script to notify their systems every
time abuse is reported, and it could learn the features of potential abusers.

------
jacquesm
Interesting article. I just spent the better part of 4 months in total working
for a customer with similar issues. In the end the solution we picked was to
take all the information available during the first contact with the users to
generate a whole pile of signals, each of which was then used to generate
inputs to a Bayesian classifier operating with a fairly arbitrary cut-off.
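
For readers unfamiliar with the approach, a toy version of a signal-based Bayesian classifier with a cut-off might look like the sketch below. It is a plain Bernoulli naive Bayes with add-one smoothing, not the production system described here; the signal names, the training rows, and the `CUTOFF` value are all invented for illustration.

```python
import math

CUTOFF = 0.9  # arbitrary illustrative threshold


def train(samples):
    """samples: list of (set_of_signal_names, is_abuser) pairs."""
    counts = {True: {}, False: {}}
    totals = {True: 0, False: 0}
    vocab = set()
    for signals, is_abuser in samples:
        totals[is_abuser] += 1
        for name in signals:
            counts[is_abuser][name] = counts[is_abuser].get(name, 0) + 1
            vocab.add(name)
    return counts, totals, vocab


def abuse_probability(model, signals):
    """Posterior probability of abuse given which signals fired."""
    counts, totals, vocab = model
    n = totals[True] + totals[False]
    # Smoothed prior odds, then one likelihood ratio per known signal.
    log_odds = math.log((totals[True] + 1) / (n + 2)) - math.log((totals[False] + 1) / (n + 2))
    for name in vocab:
        p_bad = (counts[True].get(name, 0) + 1) / (totals[True] + 2)
        p_good = (counts[False].get(name, 0) + 1) / (totals[False] + 2)
        if name in signals:
            log_odds += math.log(p_bad / p_good)
        else:
            log_odds += math.log((1 - p_bad) / (1 - p_good))
    return 1 / (1 + math.exp(-log_odds))


def is_flagged(model, signals):
    return abuse_probability(model, signals) >= CUTOFF


model = train([
    ({"disposable_email", "tor_exit"}, True),
    ({"disposable_email"}, True),
    ({"corporate_email"}, False),
    ({"corporate_email"}, False),
    (set(), False),
])
print(abuse_probability(model, {"disposable_email"}))
print(abuse_probability(model, {"corporate_email"}))
```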

The system went into production a while ago and the results are quite
amazing: there is a significant reduction in the amount of manual work to be
done to combat the abusers and the number of false positives is extremely
small (and so far each false positive that we identified has led to an
improvement of the code and similar false positives were not witnessed
afterwards).

These are annoying problems, but any free and open service will sooner or
later be faced with having to make their abuse defences scale.

~~~
rmrfrmrf
Very cool solution! How easy was it to collect data for the classifier? I'm
just wondering if the company had huge detailed logs or if you had to train
the classifier in real time.

~~~
jacquesm
They had data going back for years which really helped a lot. Still, coming up
with all the signals was hard work and re-computing the various weights took
days per run. An added complexity was that, to ensure the non-abusive users'
privacy, the system had to operate on an absolute minimum of information per
user.

The resulting program does not run in real time, it basically runs every few
minutes to have a good look at whoever signed up since the previous run and
then assigns a probability to each of the new accounts.

This still catches the bad guys long before they can get up to any mischief
(they still have to verify their account) and does not get in the way of the
speed of the forms processing during sign-up.

~~~
rmrfrmrf
I can imagine it being a grueling effort! Part of what has kept me away from
using a classifier or any kind of 'neural network'-type solution is that the
initial configuration always seems to be something of a shot in the dark, and
tweaking the weights is a non-trivial process with, as you said, a lot of wait
time. When it works, though, it has that wow factor that clients love. I give
you all props for taking that risk and having it pay off!

~~~
jacquesm
Yes, the wow factor has definitely made me a bunch of new friends. The
downside is that now they think I can do magic...

------
kbar13
To the OP:

I'm glad you're being proactive in responding to abuse reports.

The fact of the matter is, phishing is probably one of the worst forms of
abuse, in that it directly affects those who don't know what hit them and can
also be near impossible to detect.

As someone who works in the hosting industry, I can tell you for a fact that
while your provider most likely forwards you all of the reports it receives,
chances are there are still instances that are never reported to the provider,
and thus never reach you.

Responding to abuse reports is great, and is highly encouraged, but preventing
abuse from occurring in the first place should be the first priority.

Thanks for writing this post!

~~~
inconshreveable
This automation was just a first step.

This is an area that is almost completely new to me. I've never worked in the
hosting world or had to deal with this type of problem personally. I certainly
have a lot to learn. Can you talk about the techniques you've used in the
hosting world to combat these problems or point to any resources that you or
others have posted about it?

~~~
crznp
Why not limit anonymous accounts? For instance: a time limit on anonymously
created URLs. If I'm just trying out the service or running a demo remotely,
changing the URL after an hour isn't a big deal. If I'm sending out phishing
emails, I would be disappointed if all my links broke after an hour.

Then if ngrok seems useful except for those damn limits, I can just give you
more information and get them removed.

------
conroy
Alan is a good friend of mine. Taking lead from patio11[0], I have been trying
to convince him to charge money for ngrok. He's put together a great hack here
to solve the problem, but charging a monthly or one-time fee (à la Pinboard)
would prevent many of these shady sites from popping up.

[0]: [http://www.kalzumeus.com/2014/04/03/fantasy-tarsnap/](http://www.kalzumeus.com/2014/04/03/fantasy-tarsnap/)

------
birken
Another potential solution is to maintain your simple free usage flow for non-
risky accounts, but require a credit card or human review for accounts which
seem risky. Determining risky accounts would have to be something you figure
out by analyzing the known good/bad accounts, but depending on how easy they
are to tell apart and what signals you have available it could be very
powerful.

I recall a conversation I once had with an early Google engineer who said they
had abuse problems with the early adwords campaigns due to the ads going up in
real time. Phishers would create adwords accounts with stolen credit cards to
create phishing ads that would steal more credit cards (which they would use
to make more adwords accounts, and so on). Human review and whack-a-mole was
just too slow and didn't work. So what they did is they found a bunch of
signals that correlated with the phishing accounts; if a new account triggered
the signals, then the ads didn't go up in real time and were human reviewed
first. So for the false positives it was a minor inconvenience, and when it
worked properly it caught a bunch of the phishers before they caused damage.
Even today with whatever amazing technology Google has, I believe they now
human review all adwords ads... preventing abuse is hard.

------
jarrett
I wonder if it would be possible to automatically flag ngrok sites for manual
review based on certain criteria. E.g. if the phrase "citibank" appeared on
the site, it would appear in a moderation queue.
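
As a rough sketch of that idea, assuming the service could fetch the page text at all: `WATCHLIST`, `moderation_queue`, and `scan_page` are hypothetical names, and "citibank" is simply the example phrase used above.

```python
from collections import deque

# Phrases that should trigger manual review; a real list is an open question.
WATCHLIST = ("citibank",)

# Hypothetical in-memory queue of (subdomain, matched phrases) awaiting a human.
moderation_queue = deque()


def scan_page(subdomain: str, page_text: str) -> bool:
    """Queue the site for manual review if any watched phrase appears in it."""
    lowered = page_text.lower()
    hits = [phrase for phrase in WATCHLIST if phrase in lowered]
    if hits:
        moderation_queue.append((subdomain, hits))
    return bool(hits)


scan_page("foo", "Welcome to CitiBank online. Please sign in.")
scan_page("bar", "My weekend side project")
print(list(moderation_queue))
```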

Though there may be bulletproof ways for criminals to bypass such automatic
scanning. For example, I've heard of criminals creating two versions of their
site: One, a harmless site, that you get if you type in the domain name cold,
and then a criminal one, that you only get if you have the right referrer
header or query string. Which makes it difficult or impossible for an outside
entity to see the illegal content without access to the spam email, referring
site, or what have you.

I don't know much about ngrok, but I take it you can tunnel SSL over it such
that it's impossible for ngrok to inspect the contents of your traffic. Which
would rule out another otherwise viable strategy: Monitor traffic for suspicious
keywords, thus bypassing the cloaking techniques described above.

Any other techniques for automated flagging that I might be missing? Maybe
some kind of content-agnostic traffic analysis that spots a likely spam
fingerprint, e.g. certain chronological traffic patterns?

~~~
zhemao
The post already mentions why he doesn't want to do this.

> One of the core tenants of the ngrok.com service is that it does not inspect
> your traffic at all beyond reading the header field necessary to perform the
> multiplexing.

~~~
jarrett
Yeah, I'm just thinking one could relax that policy--if there were actually a
viable strategy for flagging sites, which there may not be.

Unless you're tunneling SSL, you can't _know_ that ngrok doesn't inspect your
traffic. Promises of privacy that are based on the honor system aren't worth
much, at least to me. I'm not at all attacking the honesty of the ngrok
operator; I'm just stating a generality about security--one that applies
regardless of how much you think you trust any particular actor. For two main
reasons: 1) The actor may not be as good as you think, and 2) even a truly
good actor can be compromised in a variety of ways.

Therefore, to me at least, a promise not to inspect traffic has little or no
value. And if that promise has no value to users of ngrok, perhaps it could be
relaxed in favor of protecting the long-term viability of the service.

That being said, the caveat stated above still applies. There's no point in
relaxing the promise unless there exists a viable flagging strategy. And such
a strategy may not exist, owing to the problems I described in my previous
post.

------
croikle
Is there any protection from trolls sending mail to this address? Yes, one can
roll back the change, but an automated attack would win.

Perhaps the address should be an unguessable <GUID>@ngrok.org instead? Or do
you validate SPF and only accept mail from your ISP?

(Maybe you already have solutions; I just found it interesting to consider
abuse of the system.)
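
A minimal sketch of the unguessable-address idea, assuming the report address is generated once and shared only with the hosting provider. The function names are hypothetical, and a random hex token stands in for the GUID.

```python
import hmac
import secrets


def new_report_address(domain: str = "ngrok.org") -> str:
    """Build an address whose local part is 128 bits of randomness."""
    return f"{secrets.token_hex(16)}@{domain}"


def is_report_address(candidate: str, expected: str) -> bool:
    """Compare in constant time so the check itself leaks nothing about the secret."""
    return hmac.compare_digest(candidate.lower().encode(), expected.lower().encode())


address = new_report_address()
print(address)
print(is_report_address("abuse@ngrok.org", address))
```

Mail sent to any other address would then be ignored by the automation, which leaves a troll needing to guess the random local part.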

------
trevmckendrick
FWIW, ngrok is so easy to set up that I remember audibly saying under my
breath "wow" when I first used it.

I get that abuse is a problem, but count my vote of extreme appreciation for
how quickly I got exactly what I needed.

~~~
MBCook
Agreed. I used it to do some API testing that required callbacks to my dev
machine and was dumbfounded it was so easy. I kept looking for what I had to
setup but the instructions just said "run this command".

So I gave it a try and bam... it worked. So amazingly easy.

I'd vote for anything that doesn't break that experience. I could see
reasonable rate limiting, but that may not kick in unless a phishing link was
getting a lot of traffic.

An amazing service.

------
rgejman
OP:

I'm sure the service you offer is great. Moreover, I'm sure that not requiring
any identifying information greatly improves the experience you give to first-
time users and increases conversion. However, you are inevitably doing a
disservice to your users and the entire internet by offering hosting without
any sanity checks (like an account, valid credit card or deposit). I hope
you'll reconsider whether increasing legitimate usage of your service is worth
inadvertently assisting criminal activity.

------
mitchellh
Vagrant Share has avoided this issue by putting a reasonably short time limit
on any given share unless you pay. Vagrant Share is completely free, but your
share stops routing after 4 to 8 hours. You can just request a share again
immediately but the name will be completely different.

We see many shares per day and haven't had this issue. Would this work for
ngrok?
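
The time-limit approach could be sketched like this. It is not Vagrant Share's or ngrok's actual implementation: `ExpiringTunnels`, the injectable clock, and the 4-hour TTL in the usage lines are assumptions chosen to mirror the description above.

```python
import secrets
import time


class ExpiringTunnels:
    """Hands out random tunnel names that stop routing after a fixed lifetime."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._expires_at = {}  # tunnel name -> expiry timestamp

    def create(self) -> str:
        # A fresh random name each time, so a renewed share gets a new URL.
        name = secrets.token_hex(5)
        self._expires_at[name] = self._clock() + self.ttl_seconds
        return name

    def is_routable(self, name: str) -> bool:
        expires_at = self._expires_at.get(name)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            del self._expires_at[name]  # expired: forget the name entirely
            return False
        return True


tunnels = ExpiringTunnels(ttl_seconds=4 * 60 * 60)
name = tunnels.create()
print(name, tunnels.is_routable(name))
```

Because the name changes on every renewal, a link embedded in a phishing email goes dead once the lifetime is up, while a developer just copies the new URL.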

~~~
inconshreveable
This is by far the most common suggestion I've received that I hadn't
considered before. I like it a lot, and I wouldn't be surprised if I
ended up implementing this.

Another possible alternative I'm considering is to force all non-paying
tunnels to use http authentication which would greatly reduce the utility to
phishers and the like.

~~~
dylz
A lot of phish already use HTTP auth.

    
    
        http://legitimate:www.bankofpaypai.com@460fc73627.ngrok.sexy
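
To see why a URL like this is deceptive, it helps to split it with a standard parser. Everything before the `@` is userinfo (a username and password), and the part after it is the host that actually gets contacted. The snippet uses only Python's standard `urllib.parse`.

```python
from urllib.parse import urlsplit

url = "http://legitimate:www.bankofpaypai.com@460fc73627.ngrok.sexy"
parts = urlsplit(url)

print(parts.username)  # legitimate
print(parts.password)  # www.bankofpaypai.com
print(parts.hostname)  # 460fc73627.ngrok.sexy
```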

~~~
drrotmos
Plus, adding HTTP auth would totally fudge every tunneled app using HTTP auth
for itself.

------
danielweber
Thank you for opening by telling me what ngrok is.

------
robocat
Looks like something worth charging for and that my business would use. I like
to be charged because:

* I like to see that services I use have a viable business model (e.g. standard SaaS tiered pricing signup is a good signal to me).

* I don't want to be using a service that is flakey (usual problem is they become uneconomic to run, although your current problem counts too!).

* I want somebody with an economic incentive to help me if I do have a problem.

* Trust is important if I am installing something - I want your business to have strong visible incentives to be trustworthy.

Perhaps find a friend (go halves or something) that can set up exactly the
same thing using a payment model and a different domain that looks more
business friendly?

If paying, I would definitely like a more anonymous looking domain for use
when diagnosing problems with our corporate clients.

That said, I will definitely be trying it (suitably sandboxed!).

~~~
tokenizerrr
Ngrok has paid features that let you use your own domain name. When used with
HTTPS it will give an SSL warning, but HTTPS is optional.

~~~
robocat
Thanks - I will look into it after having a go to see if it does actually help
me.

Re SSL: However my #1 issue at present (a single client where the content is
removed from XHR requests) happens on https.

There are cheaper ways I could do the equivalent of ngrok, but ngrok looks
like it will do exactly what I need to diagnose particular problems. If ngrok
saves me a day a year of work and resolves problems more quickly for clients,
that is worth paying low 2 digit amounts per month.

------
ClashTheBunny
I feel that the article does a good job of lauding the position for low entry,
but it lacks an absolute perspective. For example, today email is generally
viewed by people as a double edged sword because of the ease of entry for
people sending. People sometimes still view Wikipedia with scepticism because
of the low cost of entry to add your personal views to the system. Chat
roulette, which was a great idea, became a haven for people who wanted to
expose themselves. All three cases are situations where a low barrier of entry
has either taken an army of content curators (either software or human) or
driven the service into obscurity. How does this solve the issue of either
making ngrok a double edged sword or something that the general public will be
warned about in a 20 second context-less news blurb?

------
steveax
Limiting the number of IP addresses that can access a free account seems
entirely reasonable as does cycling the subdomains after a few hours and would
surely cut down on the attractiveness to abusers.

BTW, paid ngrok customer here and I love the service - sorry you're having to
deal with miscreants.

------
dkersten
At the risk of sounding all 90's ;-) what about inserting a banner into the
HTTP responses of free users stating that it's probably not your bank.

Yes, it means you can't serve a REST API, but hey, another reason to upgrade
to a paid account, right?
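
A rough sketch of banner injection is below. `BANNER` and `add_banner` are hypothetical. Limiting the rewrite to `text/html` bodies is an extra assumption that goes beyond the suggestion above, which accepts breaking REST APIs. Note also that this only works on plain HTTP traffic the proxy can read.

```python
BANNER = (
    b'<div style="background:#fc0;padding:8px">'
    b"Served through a free tunnel. This is probably not your bank."
    b"</div>"
)


def add_banner(content_type: str, body: bytes) -> bytes:
    """Insert the warning right after the opening <body> tag of HTML responses."""
    if not content_type.lower().startswith("text/html"):
        return body  # assumption: leave JSON, images, etc. untouched
    lowered = body.lower()
    start = lowered.find(b"<body")
    end = lowered.find(b">", start) if start != -1 else -1
    if end == -1:
        return BANNER + body  # no usable <body> tag: prepend instead
    return body[: end + 1] + BANNER + body[end + 1 :]


page = b"<html><body><p>hello</p></body></html>"
print(add_banner("text/html; charset=utf-8", page))
```

A proxy doing this would also have to recompute `Content-Length` (or re-chunk the response) after changing the body.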

------
crazytony
Do you also pass the abuse notice on down to the tunneler's ISP? Eg: if you
are notified, notify the abuse@ for the ISP of the endpoint's IP.

------
Demiurge
What about making the free subdomains expire after 24 hours?

------
im3w1l
I don't get it. What is a legitimate use case of this?

~~~
patio11
I have used it for Appointment Reminder development. It is fundamental to the
architecture of the app that Twilio needs to be able to fire HTTP requests
against my app, live, when my Twilio phone numbers get called/SMSed. Making
HTTP requests against a laptop is hard. Ngrok lets me register a subdomain
with Twilio and say "OK, Twilio API, when the dev number is called, hit this
URL on this subdomain. It forwards to a laptop but you don't have to know
that."

It is also useful for more prosaic cases like, in a Campfire room, "Hey, I
have a new design on localhost. Can you sanity check it for me?
foo.ngrok.com/the/page/goes/here"
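
For context, the local side of that setup can be as small as the sketch below: a handler on the laptop that accepts the POST callback, with the tunnel supplying the public URL. The handler, the port, and the assumption that the callback body is form-encoded are all illustrative, and nothing here calls a real Twilio API.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs


def describe_callback(raw_body: bytes) -> bytes:
    """Parse a form-encoded callback body and build a plain-text acknowledgement."""
    fields = parse_qs(raw_body.decode("utf-8"))
    return f"received fields: {sorted(fields)}".encode("utf-8")


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        reply = describe_callback(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)


def make_server(port: int = 8080) -> HTTPServer:
    """Bind to localhost only; the tunnel is what makes it reachable from outside."""
    return HTTPServer(("127.0.0.1", port), CallbackHandler)
```

Calling `make_server().serve_forever()` and pointing a tunnel at port 8080 (a hypothetical choice) would let an outside service deliver callbacks to the laptop.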

~~~
MBCook
Twilio is how I learned about it. I needed to do development on their API that
supported callbacks. There were three options:

a) Use ngrok and easily get an externally accessible tunnel to a web server
running on my dev machine so I could test and make changes

b) Get sysadmin to setup some kind of port forwarding to my local machine or a
reverse proxy to my local machine

c) Deploy test software (over and over and over) to a non-critical server and
try to do the debugging there. Of course this would also need the database (or
the ability to connect to the one on my dev machine).

b & c would have been messes. They would have taken a long time, involved all
sorts of security issues, etc. Instead I fired up ngrok, did my testing,
closed it back down and was all set. Perfect testing, direct connection just
like prod would be (no reverse proxies or anything), ability to snoop the
traffic easily with Wireshark, etc.

For development or showing a fun little project to some friends quickly it's
an amazing tool.

Honestly, the possible 'evil' uses of it (like phishing) never occurred to me.

------
j_m_b
Honest question: Why would I pay for something like this when I can use the
tunneling and proxy features of openssh for free aka 'ssh -NfD 5000
user@host.com'?

~~~
elithrar
NAT traversal, works easily on dynamic IPs without having to use another
service, etc.

------
le_meta
tl;dr: blacklist

~~~
felixrabe
I actually hope OP will change the service by preventing abuse to prevent
*.ngrok.org getting blacklisted. I can see some legitimate use cases for
hacking my own cheap cloud service running on my Raspberry Pi at home.

Btw, I found a list of similar services at
[http://stackoverflow.com/a/20702133](http://stackoverflow.com/a/20702133)
(ngrok being named at the top).

