
A Recycled IP Address Caused Me to Pirate Books by Accident - nickjj
https://nickjanetakis.com/blog/a-recycled-ip-address-caused-me-to-pirate-390000-books-by-accident
======
chatmasta
This is known as "subdomain takeover" and is definitely a common problem. It's
probably one of the most frequently reported bug types on hackerone.

I wonder -- has anyone written code to spin up EC2 instances and check for
subdomains pointing to the IP? Not sure how you could do that efficiently
(does rdns work after the IP has been recycled?), but a starting point might
be gathering as many NXDOMAIN subdomains as possible, filtering the ones at
cloud providers, and starting instances until you get a match.
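
The "filtering the ones at cloud providers" step could be sketched roughly as below. This assumes you already have a list of (subdomain, last known IP) pairs from somewhere. The provider names and CIDR blocks are hypothetical placeholders (RFC 5737 documentation ranges); a real run would load each provider's published ranges instead.

```python
import ipaddress

# Hypothetical provider ranges for illustration only (RFC 5737 documentation
# blocks). Real ranges would come from each provider's published list.
CLOUD_RANGES = {
    "example-cloud-a": [ipaddress.ip_network("192.0.2.0/24")],
    "example-cloud-b": [ipaddress.ip_network("198.51.100.0/24")],
}


def provider_for(ip):
    """Return the provider name whose range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, networks in CLOUD_RANGES.items():
        if any(addr in net for net in networks):
            return name
    return None


def cloud_candidates(records):
    """Keep only (subdomain, ip, provider) entries hosted at a known provider."""
    out = []
    for subdomain, ip in records:
        provider = provider_for(ip)
        if provider is not None:
            out.append((subdomain, ip, provider))
    return out
```

The surviving entries are the ones worth comparing against whatever addresses a new instance is handed.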

~~~
Operyl
Bing has been known to be a great search engine for "reverse IP searches."
[https://www.bing.com/search?q=ip%3A208.109.192.70](https://www.bing.com/search?q=ip%3A208.109.192.70)

If you're fast enough, it could still be cached in bing when you search.

~~~
textmode
[https://api.hackertarget.com/reverseiplookup/?q=208.109.192....](https://api.hackertarget.com/reverseiplookup/?q=208.109.192.70)

~~~
iforgotpassword
Try passing 1.1.1.1 if you want a no-js version of the infinitely scrolling
website.

~~~
chatmasta
That’s pretty cool. They must be buffering output themselves and sending raw
http responses over the socket.

------
hartator
> But before deleting it, I copied the IP address so I could open a support
> ticket on DigitalOcean. I figured they would like to know that someone is
> illegally distributing content on one of their servers. Now that they know
> the IP address, they can shut it down.

It’s also possible you exposed a web service that wasn’t meant to be public.

~~~
stelonix
This. The article was interesting but the copyright knighthood put me off.

What if the person was simply serving those files for himself over the
internet (I've done it countless times) and Google caught it because the
_author_ was careless with handling DNS entries? Now DO has an IP and an
accusation, more power is given to the DMCA-strike-first-ask-later status quo,
all for what? It's not child pornography we're talking about, it's _books_ for
Christ's sake. There's no harm, it does not affect your life, why go through
the effort of bringing trouble to someone else because of _your own lack of
care_ with sensitive issues such as DNS entries?

~~~
tzs
> What if the person was simply serving those files for himself over the
> internet

There are a couple of reasons to believe that this is not the case.

First, there were thousands of them. Someone having thousands of books is not
unreasonable, of course, but both the breadth and depth of this collection is
such that it is extremely unlikely it is someone's personal library.

Second, the PDFs aren't the actual books. They are just short blurbs
describing the book and containing download deep links into bookfreenow.com. A
couple examples [1] [2]. Clicking to create an account so you can start
downloading redirects through some ad companies (and possibly some shady
affiliate marketing companies), eventually reaching some download site (I
think) that tells you no free slots are available and asks you to make an
account.

(The bookfreenow.com pages for each book all seem to be the same template with
just the book info substituted. Even the comments on each page are from
the same people, at the same times, and say the exact same things except they
have the correct book title on each page. They aren't even trying to make it
look like the comments are legit).

[1] [http://bookfreenow.com/downloads/the-lm3900-a-new-current-di...](http://bookfreenow.com/downloads/the-lm3900-a-new-current-differencing-quad-of-plus-or/)

[2] [http://bookfreenow.com/downloads/essential-orthopaedics-by-j...](http://bookfreenow.com/downloads/essential-orthopaedics-by-j-maheshwari/)

~~~
dbasedweeb
_First, there were thousands of them. Someone having thousands of books is not
unreasonable, of course, but both the breadth and depth of this collection is
such that it is extremely unlikely it is someone's personal library._

Sounds like my personal library actually.

_Second, the PDFs aren't the actual books. They are just short blurbs
describing the book and containing download deep links into bookfreenow.com. A
couple examples [1] [2]. Clicking to create an account so you can start
downloading redirects through some ad companies (and possibly some shady
affiliate marketing companies), eventually reaching some download site (I
think) that tells you no free slots are available and asks you to make an
account._

Well that’s a lot harder to explain in charitable terms! So is this even
piracy, or just some kind of scam based on the promise of piracy?

~~~
vbezhenar
It's a scam. I was searching some ebook for free and there are many links like
that. It's funny that scammers might be more successful saving paid books than
copyright warriors :)

~~~
chipotle_coyote
_It's funny that scammers might be more successful saving paid books than
copyright warriors_

You'll never actually get the book, because they don't have it. It's a scam to
try to trap people who are trying to find free downloads of ebooks rather than
paying for them.

(It's also kind of a funny definition of "save" you have there. With all due
respect, if you want to save paid books, you should -- crazy as this may sound
-- pay for them.)

~~~
vbezhenar
Saving is the wrong word, I guess. I mean that someone who's trying to find a
pirated book will just stop trying after a few unsuccessful attempts. For
example, copyright owners are forcing Google to hide search results with
pirated content. But maybe polluting search results with fake content is a
better strategy.

------
iMerNibor
Just got some new ips and I've been on the other end of this. Some staging
domain of a website still points to one of the ips I got and there's a health
checker that keeps trying to ping /health on the domain.

Nowhere near the scale of this though, just some background noise I'll ignore

~~~
imhoguy
Serve them a cheap gzip b0mb (example [0]) - it should move their ass to clean
up things.

[0] [https://blog.haschek.at/post/f2fda](https://blog.haschek.at/post/f2fda)

~~~
dx034
The users are often innocent, they just want to use the old page. Best is to
just serve a 403 or refuse the connection.
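
A minimal sketch of the "just serve a 403" approach, using only the Python standard library. It assumes the server knows which hostnames it is meant to answer for; `ALLOWED_HOSTS`, `status_for_host`, `HostCheckHandler`, and `serve` are made-up names for illustration. In practice this would more likely be a default server block in the web server's own config.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical set of hostnames this server is meant to answer for.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def status_for_host(host_header):
    """Return 200 for a known Host header, 403 for anything else."""
    if not host_header:
        return 403
    # Drop an optional ":port" suffix and normalize case.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return 200 if hostname in ALLOWED_HOSTS else 403


class HostCheckHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = status_for_host(self.headers.get("Host"))
        body = b"ok\n" if status == 200 else b"forbidden\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Start the demo server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), HostCheckHandler).serve_forever()
```

Requests arriving through someone else's stale DNS record carry a Host header the server does not recognize, so they get a 403 instead of the default site.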

------
nebulous1
This is a decent reminder to get rid of DNS entries to IP addresses you no
longer control.

That said ...

> A few months ago I started to receive an absurd amount of notifications, but
> I ignored them.

Really? I find it pretty amazing that he chalked it up to “Google is probably
on drugs”, without even investigating at all!

~~~
icebraining
I can see myself doing the same thing, particularly if I didn't have much free
time when it happened. I wouldn't expect Google Alerts to warn me of a
security issue!

~~~
nickjj
The problem is, when the Google Alert hit my inbox it didn't show the
ssl.nickjanetakis.com subdomain in the alert snippet. It just showed "Nick
Janetakis".

Still, I should have clicked through to see what was up, but then again, the
links looked very suspicious. I don't make a habit of clicking a bunch of
unknown links sent via email, especially not when running Windows.

------
tinix
Ha!

One of my old staging subdomains had an old Digital Ocean address left in it
for a bit while we migrated some servers, and Google indexed some random ebook
pirate site too, here[1] is a snap of the logs for anyone who is curious. Once
I updated the DNS, Googlebot started to blow us up.

[1]
[https://node.zeneval.com/snaps/a79fe276b688da7b589ce539c9a4a...](https://node.zeneval.com/snaps/a79fe276b688da7b589ce539c9a4a062.png)

I never would have even noticed, had it not been for Googlebot indexing the
crap out of us, and causing 10s of thousands of sessions to be created in a
short time which threw our Munin graphs off the charts.

The site we were staging ran fine, redis handled it without breaking a sweat,
but we're not a public facing service, so I just straight up blocked Google
Bot w/ an nginx rule.

------
hartator
> I know I made a stupid mistake by not removing the A record but this could
> happen to anyone. I would like to see more services only allow for DNS based
> authentication by adding TXT records.

There are plenty of reasons why one would prefer HTML verification over TXT DNS
verification. It's usually faster and more predictable. Plus, DNS is far from
being completely secure.

~~~
icebraining
It's also pretty nice if you're a provider hosting a website for the client
(e.g. Github Pages, Shopify, etc). Getting them to point an A record to us is
hard enough, but at least it's only once. Then you can use HTML verification
for setting up LE certificates, Analytics, etc.

------
itake
Just out of curiosity, how do you know this just wasn't an intentional side
effect of someone hosting a website on a DO box? Namely, was the box just
responding to anything that would connect to it?

Google has probably already crawled that domain previously, and when it asked
for that IP address, it found some other person's website.

~~~
smileybarry
The screenshot shows the blog author's name attached to all of the search
results as a proper, spaced name. (As opposed to a domain substring) It looks
like intended impersonation.

------
deaps
I'm not very security minded. I have old domains and subdomains that I used to
use that have long since passed - that lived on server IP's that have also
long since passed on from my ownership as well.

I just double checked all of my old _stuff_ - and there's not a trace left
out there. Apparently, I cleaned up all my old DNS entries as things moved on,
even though none of those domains are my 'brand' (as the author states it is
his). As a non-security minded person, I find it hard to believe a security-
minded person, who's good at his trade, forgets to do this.
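
That kind of double check can be sketched as a small audit, assuming you can export your A records as a name-to-IP mapping and list the addresses you still control. `find_dangling` and its inputs are hypothetical names, not any provider's API.

```python
import ipaddress


def find_dangling(a_records, owned_ips):
    """Return A records whose target IP is not in the set we still control."""
    owned = {ipaddress.ip_address(ip) for ip in owned_ips}
    return sorted(
        (name, ip)
        for name, ip in a_records.items()
        if ipaddress.ip_address(ip) not in owned
    )
```

For example, `find_dangling({"ssl.example.com": "192.0.2.99"}, ["192.0.2.10"])` flags the `ssl` record, which is the situation the article describes.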

------
rando444
_The only way someone is going to gain access to my server is if they manage
to gain access to my workstation and steal my SSH key pair._

If it has happened before, you'd be foolish to think that it's impossible to
happen again.

[https://nvd.nist.gov/vuln/detail/CVE-2016-1247](https://nvd.nist.gov/vuln/detail/CVE-2016-1247)

~~~
tinix
What does this have to do with anything even remotely related to this article,
other than it being a webserver? Symlink takeover is not a new vulnerability,
and if someone has a user account on your server, you're already owned anyway.
Escalating to root is trivial almost always.

~~~
rando444
The author spends the beginning of the article talking about how he takes
security very seriously, that his webserver is practically uncompromisable,
and that the odds of it being compromised are so remote because he has "the
reflexes of a highly trained ninja" and doesn't run nginx as root.

I'm pointing out that his server isn't as uncompromisable as he's trying to
lead the reader to believe.

~~~
tinix
If the author is truly a "ninja" they wouldn't be running their web
application as the nginx www-data user in the first place, and then a web
application exploit wouldn't inherently give anyone access to the nginx user
either to exploit the log-rotation mechanism via symlink. One can read more
about the CVE you linked here[1]. But basically the gist of it is this:

> As the /var/log/nginx directory is owned by www-data, it is possible for
> local attackers who have gained access to the system through a vulnerability
> in a web application running on Nginx (or the server itself) to replace the
> log files with a symlink to an arbitrary file.

This assumes the web application is also running as www-data, which wouldn't
be that smart.

[1] [https://legalhackers.com/advisories/Nginx-Exploit-Deb-Root-P...](https://legalhackers.com/advisories/Nginx-Exploit-Deb-Root-PrivEsc-CVE-2016-1247.html)

~~~
mbreese
From the article...

_My site is static too, which means it’s only being hosted through nginx from
a non-root user._

~~~
tinix
Yeah, so then you have to exploit nginx, not a web application. Good luck with
that. If someone can get RCE through nginx alone, you're already toast.

------
cathhhhji
What would be the reason to use someone else's domain that you don't have
control over to point to an IP address?

~~~
nebulous1
They probably didn't even know that it was accessible via that domain. Their
webserver responded to any request with the default site, and google decided
to crawl ssl.nickjanetakis.com and found all the pdfs.

~~~
nickjj
That's what I thought too but then I noticed that someone bothered to put
"- Nick Janetakis" in the titles of those PDF pages (check the screenshot in
the article).

~~~
nebulous1
Well spotted!

I don't think that's exactly what was going on though, although perhaps
somebody else can chime in.

I don't think the "- Nick Janetakis" is actually in the title of the PDF,
rather google has appended it to the actual title (the end of which has been
replaced with an ellipsis).

I think google can get this from either the title of an HTML page or from an
og:site_name entry of an HTML page (I'm not 100% on all this). It's possible
that google took these from the "actual" ssl.nickjanetakis.com and still
remembers the og:site_name and applies it to the pdf files?
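
For reference, here is a rough sketch of where those two fields live in a page, using Python's standard `html.parser`. It only shows how a crawler could read them; whether Google actually combines them this way is a guess, and `SiteNameParser` and `extract_names` are made-up names.

```python
from html.parser import HTMLParser


class SiteNameParser(HTMLParser):
    """Collect the <title> text and the og:site_name meta value, if present."""

    def __init__(self):
        super().__init__()
        self.site_name = None
        self.title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:site_name":
            self.site_name = attrs.get("content")
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title = (self.title or "") + data


def extract_names(html):
    """Return (title, og_site_name) for an HTML document string."""
    parser = SiteNameParser()
    parser.feed(html)
    return parser.title, parser.site_name
```

A PDF has neither tag, so a site name shown next to a PDF result would have to come from some HTML page seen on the same host.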

------
incompatible
So, if I had a subdomain set up that way, and sci-hub or somebody came along
and started using it, would I have any legal obligation to do anything about
it? Could my domain be seized?

~~~
incompatible
I expect the domain registrar terms and conditions would get your domain shut
down sooner or later.

------
z3t4
Google is aggressive when it comes to crawling (which I think is OK) so it's
very much possible that the one who hosted these PDF's had no idea that Google
had crawled the site, or that it was under that domain.

~~~
smileybarry
It seems like they knew what they were doing, given that the search results
have the author's name attached to them. (As a proper name, rather than
cutting "nickjanetakis" out of "ssl.nickjanetakis.com")

------
uptown
What's the purpose for spreading PDFs like this? Are there ways to embed
malware into PDFs so they attack the host machine of whoever downloads the
files?

~~~
goda90
There are people who feel good about sharing non-free materials with others.
Perhaps this approach is for users where torrenting isn't an option.

~~~
Froyoh
Yep, like me

~~~
tudelo
I don't disagree with the idea, but it is hard to think of a world where we
only optionally pay for things. Do you think it's okay because the cost is
prohibitive? Or because if we truly value it we will support it if it is free
or not free? I don't necessarily know how to reason this issue out.

------
trumped
> The odds of that are remote because my workstation never leaves my office
> and I have the reflexes of a highly trained ninja.

Does HE ever leave his office?

------
dec0dedab0de
if you're using cookies for sessions on your main domain this can be a very
big flaw.
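
A simplified sketch of why: a cookie set with a `Domain` attribute is sent to every subdomain, including one that now points at somebody else's server. The functions below follow the domain-match idea from RFC 6265 but ignore IP-address hosts, public suffixes, `Path`, and `Secure`. `domain_match` and `cookie_sent_to` are illustrative names only.

```python
def domain_match(host, cookie_domain):
    """Simplified RFC 6265 domain-match: exact host or a dot-separated suffix."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


def cookie_sent_to(request_host, origin_host, domain_attr=None):
    """Would a browser attach a cookie set by origin_host to request_host?

    Without a Domain attribute the cookie is host-only. With one, every
    subdomain of that domain receives it.
    """
    if domain_attr is None:
        return request_host.lower() == origin_host.lower()
    return domain_match(request_host, domain_attr)
```

So `cookie_sent_to("ssl.example.com", "example.com", "example.com")` is true, while the same cookie without a `Domain` attribute stays on `example.com` only.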

------
throwaway218649
> I have Google Alerts set up so I get emailed when people link to my site. A
> few months ago I started to receive an absurd amount of notifications, but I
> ignored them. I chalked it up to “Google is probably on drugs”.

Between this quote and the bozo-level advice in the "Domain Validation Should
Be More Strict" section ("I would like to see more services only allow for DNS
based authentication by adding TXT records" is going to solve this problem?
permanently decommissioning IPv6 addresses?), the one lesson I can take away
from this article is to stay as far away as possible from any of this guy's
security related courses.

