
A website that deletes itself once indexed by Google - cjlm
https://github.com/mroth/unindexed/blob/master/README.md
======
tonyarkles
I had a client once who had something similar, although unintentionally. She
approached me because her website "kept getting hacked" and she didn't trust
the original developers to solve the security problems... And rightly so!

There were two factors that, together, made this happen: first, the admin
login form was implemented in JS, and if you went to log in with it with JS
disabled, it wouldn't verify your credentials. And it submitted via a GET
request. Second, once you were in the admin interface, you could delete
content from the site by clicking on an X in the CMS. Which, as was the
pattern, presented you with a JS alert() prompt before deleting the content...
via a GET request.

Looking at the server logs around the time it got "hacked", you could see
GoogleBot happily following all the delete links in the admin interface.

~~~
brador
What's the idea solution for this? I would drop a cookie and use that to
verify admin privileges on each page. Is that right?

~~~
ben0x539
Fundamentally, authentication when someone tries to delete a thing needs to
happen in server-side logic, not on the client side. The rest is flavouring.

~~~
vkjv
Authentication should happen server side, but it need not happen at the time
of the delete. When deleting, you should be authorizing and validating the
request. Client-side checks _can_ make for a nicer UX... but if the action
itself happens server side (e.g., a delete), the authorization and validation
must happen server side too.

~~~
TillE
HTTP is stateless. You need to generate some kind of token that can be checked
for each admin action a user takes.
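
A minimal sketch of that pattern (hypothetical names, not the actual CMS from
the story): destructive actions go over POST, the session is verified on the
server, and a per-session token guards each request. A crawler like GoogleBot
only issues GETs and never carries a valid session or token, so any one of
these checks would have stopped it:

    import hmac
    import secrets

    # Hypothetical in-memory session store; a real CMS would use a database.
    SESSIONS = {}  # session_id -> {"user": ..., "csrf_token": ...}

    def create_session(user):
        """Log a user in server-side and issue a token for admin actions."""
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = {"user": user, "csrf_token": secrets.token_hex(16)}
        return session_id

    def handle_delete(method, session_id, csrf_token, content_id, store):
        """Delete content only if every check passes on the server."""
        if method != "POST":                     # GET must never mutate state
            return 405
        session = SESSIONS.get(session_id)
        if session is None:                      # credentials checked server-side
            return 401
        if not hmac.compare_digest(csrf_token or "", session["csrf_token"]):
            return 403                           # token ties request to session
        store.pop(content_id, None)
        return 200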

------
tlrobinson
I'm surprised there are so many people on Hacker News asking "why?".

Hackers don't need a reason, other than it being clever, novel, fun, etc. But
if you want a reason there are plenty:

* art: there are numerous interpretations of this

* fun: this is sort of the digital equivalent of a "useless box" [http://www.thinkgeek.com/product/ef0b/](http://www.thinkgeek.com/product/ef0b/)

* science: experiment to see how widespread a URL can be shared without Google becoming aware of it

* security: embed unique tokens in your content to detect if it has leaked to the public

~~~
barbs
I agree that there are lots of reasons that someone would make a site like
this, but I think people are curious as to the maker's specific reason. From
the github:

 _Why would you do such a thing? My full explanation was in the content of the
site. (edit: ...which is now gone)_

I'm curious as to what the website said originally.

~~~
tlrobinson
In that case, I'd guess "art".

------
dsjoerg
It's a digital embodiment of coolness; once the masses can find out about it,
it isn't cool anymore and the coolness is gone. Literally.

~~~
LukeB_UK
I think Hipsterism is what you're actually referring to.

~~~
xorcist
I was cool long before it was hip.

------
frik
An alternative would be to check for the browser _user agent_ and delete the
website right at that point and return a 404 page to the Google crawler bot.
Then Google won't have a static copy of the website.

~~~
desdiv
Your approach is "a website that irrevocably deletes itself once indexed by
Google".

What OP has done is "a website that irrevocably deletes itself once Google
decided to publicly reveal the fact that it indexed said website".

OP's approach has no way of knowing _when_ the site was indexed. It's
conceivable that Google indexed it on the very first day and decided not to
share it publicly until 21 days later.

~~~
nostrademons
Technically, the former is when it is "crawled" and the latter is when it is
"indexed".

In practice, since 2010 these two events have generally been separated by
minutes.

~~~
tyrust
If you really want to get "technical", then the first one is when the site is
"crawled" and the latter is when it's "served". "Indexing" happens in-between
the two.

------
whoopdedo
What about the opposite? A website that is created when it is indexed? Start with
nothing and content is added each time the site is visited by Googlebot, or
shared on Facebook, tweeted, posted on Reddit, etc. The website exists only so
that it can be shared, and the act of sharing it defines what the website is.

~~~
TeMPOraL
This is an uber cool idea. Especially if, when this website is shared by
someone, it would attempt to scan the sharer's public feed, last submissions,
last comments, last tweets, etc. (depending on where it got shared), and
generate additional content based on what it found.

Sounds like an awesome weekend project.

------
yk
Cool, but why? (And shouldn't we invent digital Baroque art before inventing
digital postmodernism?)

~~~
cheatsheet
Both exist.

Postmodernism is a lot more relevant to the digital age than anything, imo. It
emphasizes pointing out ways of thinking and doing, which I think is
especially relevant when we are actually automating most of our ways of
thinking and doing.

I know it gets a bad rap because of the ridiculous examples, but the real
point of it is to engage the viewer in a serious kind of contemplation of the
massive infrastructure that exists and how it shapes our culture, thoughts,
understanding, and action.

We have the expectation that the generations to come will accept this
infrastructure and what it says about how the human mind functions. But much
of it is founded on belief systems of how thought and action operate in the
real world. Most of these systems are baseless, the idea of a base obfuscated
only by the sheer complexity involved in understanding each layer.

~~~
matt4077
Please don't tell me Geocities was our Renaissance.

~~~
cheatsheet
I really look forward to when we, as academics, historically document and
seriously examine the various phases of the internet, from a variety of
alternative perspectives.

It's interesting while it's being built, but it's also interesting to look
back and reflect on the bigger picture, outside of the buzzwords and technical
terminology used to pull the creation through, and make it actualized.

I look forward to when critics and theorists start thinking about the goal of
the internet from a social perspective, a collective cultural subconscious
directive. I look forward to the kinds of art-history methodology used to
explain the significance of Picasso or Manet in their respective time periods
being applied to reason about the relation between the internet and everything
that is not the internet.

It's interesting when some information gets washed away and other information
is retained through time, and it isn't always the stuff that is indexed that
is retained. The idea that art critics can even agree to call the same
collection of works "cubism" or "impressionism" fascinates me, and I look
forward to the same kinds of invented vocabularies being used to describe
various processes, movements, and patterns throughout internet culture (way
beyond studying memes and tropes - there are so many layers to the collective
psyche of the internet, it is dumbfounding).

I don't know what Geocities represents. I'd have to define its 'kind' and
compare and contrast it to other 'kinds' throughout time. I know this was
meant to be a humorous comment, but I love to weave theories, and some of them
even turn out to be descriptive of the nature of things.

~~~
joepie91_
And if you want to help out to archive the data that is needed for that kind
of work, ArchiveTeam needs your help:
[http://archiveteam.org/index.php?title=Main_Page](http://archiveteam.org/index.php?title=Main_Page)
:)

------
byte1918
Thank you.

[http://i.imgur.com/cjDeLEb.png](http://i.imgur.com/cjDeLEb.png)

EDIT: What's with the downvote hate? Somebody actually posted a valid key...

~~~
PhasmaFelis
As far as I can tell, you just posted part of a random screengrab from your
web browser for no obvious reason. Striking's response suggests that this is
actually a reference to a site which, per the OP, is gone forever, along with
any chance of getting your joke. So...I'm not really sure what you were
expecting.

------
hackhat
>Why would you do such a thing? My full explanation was in the content of the
site. (edit: ...which is now gone)

So, did anyone really understand why he did this?

~~~
TeMPOraL
My guess - because he could, and likely had some good laugh when discussing it
with friends.

------
ikeboy
Anyone know the origin or have an archive?

~~~
TimWolla
The origin is this:
[http://eep40h.herokuapp.com/](http://eep40h.herokuapp.com/)

~~~
ikeboy
[https://web.archive.org/web/20150213152238/http://eep40h.her...](https://web.archive.org/web/20150213152238/http://eep40h.herokuapp.com/)

Yay!

Edit: and now [https://archive.today/3QpC9](https://archive.today/3QpC9)

~~~
aqwas
If anything, that's a much deeper comment than the website itself. No matter
how hard you try, it's impossible to really destroy something once it's been
on the web. Resistance is futile.

~~~
scribu
That's not quite the point being made, since the site didn't even attempt to
block indexing via robots.txt or a meta tag.

------
WA
Not sure if I see this as "art" or something. I mean, _irrevocably deletes
itself_ could be attached to a thousand arbitrary things.

- deleted after 100 visitors

- deleted if visited with IE 6.0 for the first time

- deleted if referrer is Facebook

- ...

~~~
comboy
Also, irrevocability seems a bit questionable (Google cache, archive.org, etc.)

~~~
getsat

        <meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
        <meta http-equiv="Pragma" content="no-cache" />
        <meta http-equiv="Expires" content="0" />

+

        Cache-Control: no-cache, no-store, must-revalidate
        Pragma: no-cache
        Expires: 0

+

        User-agent: ia_archiver
        Disallow: /

Of course, this won't prevent crawlers which do not honor these headers/meta
tags from caching your site, but if you're not in Google's index you're likely
not getting traffic from said crawlers.

~~~
comboy
Good point. I wonder if the meta tags were updated later or if archive.org
ignored them:
[https://web.archive.org/web/20150213152238/http://eep40h.her...](https://web.archive.org/web/20150213152238/http://eep40h.herokuapp.com/)

------
cubano
Snapchat for websites...hmmmm perhaps.

------
thewizardofmys
I see some potential use for this: for example, as soon as Google's crawlers
reach the site, I know that it is accessible from outside and I destroy it.

~~~
aqme28
What is the purpose of a website that is inaccessible "from outside"?

~~~
jessaustin
Maybe it's a resource that should only be used by people in a particular
organization?

------
arash_milani
"Death is reason for the beauty of butterfly"

~~~
ars
Who said that? I could not agree less. Butterflies are beautiful for their
color, not their death.

~~~
TeMPOraL
Whoever said that, probably meant selection pressure.

~~~
mod
Potentially also that if every butterfly that ever existed were still alive,
we wouldn't be very fond of them.

------
neilellis
I have to say I'm not usually a fan of conceptual art, but kudos - the concept
is great. Keep experimenting!

------
scottcanoni
I would be interested in similar experiments but with a couple of minor
variations to see the effects of each:

1. Sending the NOINDEX meta tag

2. Combining meta tags

3. Monitoring for a referrer URL that matches a Google search page to catch
the 1st non-sneaky user coming from the index.

4. Monitoring other search engines and their behaviors.

------
angelortega
    grep Googlebot /var/www/log/* && rm -rf /var/www/site

------
shubhamjain
How about detecting GoogleBot traffic and deleting when it has crawled your
website?

~~~
tjgq
Then anyone would be able to trigger the autodestruct by spoofing their UA.

~~~
desdiv
Googlebot's identity can be authenticated to prevent spoofing:

[https://support.google.com/webmasters/answer/80553?hl=en](https://support.google.com/webmasters/answer/80553?hl=en)
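
Google's documented verification is a double DNS lookup: reverse-resolve the
caller's IP, check that the hostname ends in googlebot.com or google.com, then
forward-resolve that hostname and confirm it maps back to the same IP. A
sketch with the resolvers injectable, so the logic can be exercised without
network access (the defaults use real DNS):

    import socket

    def verify_googlebot(ip, resolve_ptr=None, resolve_a=None):
        """Verify a claimed GoogleBot IP via reverse-then-forward DNS lookup."""
        resolve_ptr = resolve_ptr or (lambda addr: socket.gethostbyaddr(addr)[0])
        resolve_a = resolve_a or (lambda name: socket.gethostbyname(name))
        try:
            host = resolve_ptr(ip)  # e.g. crawl-66-249-66-1.googlebot.com
            if not host.endswith((".googlebot.com", ".google.com")):
                return False
            return resolve_a(host) == ip  # forward lookup must round-trip
        except OSError:
            return False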

~~~
tjgq
I actually wasn't aware of that! Thanks for the link.

------
hartator
[https://github.com/mroth/unindexed/blob/master/views/alive.e...](https://github.com/mroth/unindexed/blob/master/views/alive.ect)

------
bernardlunn
Like a snow angel? Art that auto destructs? Stay in the moment.

------
lukasm
What problem does it solve? EDIT: that was an honest question.

~~~
317070
The problem of creating something interesting, a.k.a. creating art.

------
hellbanner
Guess I better clone the source before it's deleted...

------
switchb4
Oh, I was thinking of something very similar a few minutes ago, and when I
opened Hacker News and saw this post I was amazed.

------
tzury
a) One can also use the referer to check whether a visitor has come from
Google and trigger the deletion that way (in addition to "seek itself in
Google").

b) robots.txt would get the same result, plus no cached content at Google;
with "deleting itself", the cached content remains at Google.
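
A sketch of option (a), with a hypothetical helper that inspects the Referer
header of an incoming request. Note that the Referer is client-supplied, so
anyone can fake it; this is a trigger for an art project, not access control:

    from urllib.parse import urlparse

    def came_from_google_search(referer):
        """Heuristic: did this request arrive via a Google results page?"""
        if not referer:
            return False
        parts = urlparse(referer)
        host = parts.hostname or ""
        # Crude domain check covering google.com, www.google.co.uk, etc.
        is_google = "google" in host.split(".")
        return is_google and parts.path.startswith("/search")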

------
psykovsky
You mean a website which can't be used with Chrome or even with Android
itself, on any browser.

------
colund
This makes me think about the immensely cool self destructing sunglasses in
Mission Impossible

------
facepalm
I have a thought that I will forget immediately once somebody asks me what it
is.

Now I am an artist, yay :-)

~~~
qu4z-2
What is it?

------
FaisalRashid
@Cjlm, what type of problem does it solve?

~~~
sorokod
A Google-worshiping sand mandala
([http://en.wikipedia.org/wiki/Sand_mandala](http://en.wikipedia.org/wiki/Sand_mandala))?

