
Medium tries to prevent people reading deleted articles on the Wayback Machine? - scandox
http://www.selectedintelligence.com/post/173476365679/medium-tries-to-prevent-people-reading-deleted
======
dorian-graph
Over the last year I've come to trust effectively nothing on the internet.
I've had so many Spotify playlists I was following disappear, whole artists,
websites, articles, etc.

I'm slowly moving to offline-first versions of all the information I care
about. Edit: This change also lends itself to the 'slow web' (or just slow
$whatever) movement, which I'm a fan of.

~~~
amelius
The problem is, it's getting harder every day to find DRM-free content. For
audio, CDs are getting out of fashion, and piracy channels lack the seeders
because everybody is on Spotify.

~~~
twostoned
Yes. The debate is gone. Young people have a much different view of
'ownership' (I struggled for a better word) than older people. For example, I
remember the copyright, file sharing, music piracy arguments and debates from
the 90s (Metallica, Napster! Hah) and 00s. But when I talk about this stuff
now with people in their early 20s there seems to be less awareness. DRM &
'Stream everything' are the way it is, as if it's some kind of inevitability.
The concept of actually owning, or possessing, something (even if it's a byte
stream on a physical hard drive in your house) seems to be disappearing. It's
interesting to watch.

I think the most interesting part is the lack of discussion.

~~~
Arzh
That's because 99% of the time the streaming service for music is better than
trying to build your own library. It also seems to have a lot less attribution
than video stream right now so people don't have to pay attention to where
they can stream a specific song, they can use just about any app and get what
they are looking for.

~~~
blux
One major drawback to me is the recurring cost. My feeling is that building an
offline library that you truly own is much cheaper than using some streaming
service with monthly recurring costs that inflate over time.

~~~
rbrcurtis
Only if you never listen to new music. CDs cost around $10, which is your
Spotify sub cost per month. Imagine only listening to one new album every
month.

------
jsty
Archive.is doesn't seem to be affected by the redirect strategy.

article: [http://archive.is/gPcBW](http://archive.is/gPcBW)

~~~
MarkusAllen
Archive.is is awesome. It just works.

~~~
fiatjaf
Who pays for it?

~~~
rambojazz
I've always asked the same question myself... I don't think they are a
nonprofit like the IA. Their FAQ says "It is privately funded" and wrt ads "I
cannot make a promise that it will not". Years ago there was an archiving
service that displayed ads, but unfortunately I don't remember which one it
was... I vaguely remember it could have been archive.is, but I'm not sure.

------
dahart
> So it looks like Medium has embedded a method to frustrate the casual user
> of Wayback Machine from seeing articles that their authors have removed from
> the original site.

It strikes me as less likely that Medium is doing something intentional to
prevent reading deleted articles, and more likely that the author of this post
is making assumptions.

Besides, archive.org has a policy of respecting copyright. All you have to do
is ask them to not re-publish, and they will. No need to engineer wacky
redirects that don’t work anyway.

[http://archive.org/about/faqs.php#20](http://archive.org/about/faqs.php#20)

------
dontchooseanick
You might like that :

root@localhost:~# links -dump 'https://web.archive.org/web/20160826003417/https://medium.com/@Svenskunganka/interviewing-my-mother-a-mainframe-cobol-programmer-c693d40d88f7' | nc seashells.io 1337

Results at [https://pastebin.com/SMGBscz2](https://pastebin.com/SMGBscz2)

~~~
scandox
I used w3m -dump
[https://web.archive.org/web/20160826003417/https://medium.co...](https://web.archive.org/web/20160826003417/https://medium.com/@Svenskunganka/interviewing-
my-mother-a-mainframe-cobol-programmer-c693d40d88f7)

Produces a fairly nice, readable version also.

~~~
dvfjsdhgfv
Actually it's not necessary to use text-mode browsers as the trick used here
is JS-related, so it's enough to switch JS off.

Actually most nuisances on the web today are JS-related so I have a button for
quickly disabling it. It works like a charm, also for this case.
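
A minimal Python sketch of the same idea as `links -dump` / `w3m -dump`: fetch the archived HTML without executing any JavaScript and keep only the visible text. It assumes, as the comments above suggest, that the redirect is performed by a script in the page. The names `TextExtractor`, `html_to_text` and `fetch_text` are hypothetical helpers for illustration, not part of any archive tooling.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Script bodies arrive here too; they are dropped while skipping.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text content of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_text(url):
    """Download a page and return its text; no script is ever run."""
    with urlopen(url) as response:
        return html_to_text(response.read().decode("utf-8", errors="replace"))
```

Calling `fetch_text` with the archived URL from the comments above should give roughly what the text-mode browsers print, since nothing in this path can trigger a script-driven redirect.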

------
baby
When I delete an article from my blog, it's because I don't want anyone to be
able to read it anymore. Be it shame, inaccuracy, change of mood, etc... I
think there is something fundamentally wrong with wanting to have EVERYTHING
backed up at all times against the creator's will.

~~~
ekianjo
> When I delete an article from my blog

It's not yours anymore once it's public. It's like saying "when I do something
crazy in public I want to be able to make everyone forget about it later on".
That's not how things are supposed to work.

> I don't want anyone to be able to read it anymore

People can still have local copies so you have absolutely NO CONTROL.

~~~
zmk_
Except they do in Europe under the 'right to be forgotten'.

~~~
lostmsu
That's what they think. Fortunately for the Internet, there are lots of
unaffected servers outside it.

~~~
StanislavPetrov
And unfortunately for Joy Reid.

------
thogenhaven
I wonder how the Wayback Machine will work after GDPR? I can't imagine they can
just show content that the authors deleted from primary sources?

~~~
alerighi
They can't, and I would like to know what Europe wants to do about it. Block
the Wayback Machine in Europe? Well, I can still access it with a VPN if I want.
Also, I want to know what they will do about git and GitHub, or even blockchain
projects (how do you delete something from a blockchain?).

The problem is that GDPR is stupid legislation written by incompetent people
who don't understand the subject, imposed with no possibility of choice
on member states, like all the regulations from the EU (the cookie banner law,
for example).

And of course GDPR doesn't much affect the companies it aims to
fight, like Facebook, Google, etc. They have teams of lawyers paid millions
with the sole purpose of finding ways to circumvent these regulations; they
will just update their terms of service and be done. The ones that will be more
affected are small companies, startups, personal side projects, and
people who don't have money to spend on a lawyer for a project that doesn't
make them any revenue.

I think that in Europe it's no longer possible to do anything. If you have a
good and innovative idea and you want to realize it, better take a flight to
the US...

~~~
simion314
The law is meant to allow me to delete my account from your cool SV startup,
and delete meaning actually delete the data and not deactivate the account but
continue using or selling my data.

The cookie law is a problem because lazy web developers did not implement it
right. You probably also complain about the "don't spam me" law because it adds
a bit of extra work to add the unsubscribe link and implement the requirements.

The laws are done for the good of the society and not for helping a minority
to implement some move fast break things, pivot and try again.

~~~
jfaucett
There is a big difference in law and regulation between intention and real-world
effects. For instance, making marijuana illegal has the intention of
decreasing drug addiction and dependence but has the effects of
disproportionately incarcerating youth aka "criminals" under the new law for
drug consumption, and thus limiting their opportunities in the socio-economic
system.

If you look into it I think parent is most likely correct with his predictions
since they are easily verifiable i.e. big corps do have massive teams and
monetary funds to deal with this legislation, startups and one-man shops do
not. This is completely ignoring the deontological question of what should be
the case, where I think most would be in agreement.

~~~
matthewmacleod
_big corps do have massive teams and monetary funds to deal with this
legislation, startups and one-man shops do not._

That applies to literally every piece of legislation. Yet we don't decide that
small restaurants should be exempt from food hygiene laws, or that small
construction teams should be exempt from health and safety laws.

~~~
jakeogh
Conflating information with feeding someone poison food is so par.

~~~
matthewmacleod
Your objection is not helpful.

Being careless with personal data has harmful consequences.

Being careless with food safety has harmful consequences.

This is why these things are related.

~~~
walshemj
What personal data is being exposed? The EU isn't keen on anonymous
publishing in general, e.g. in the UK anything published by a political party
during elections etc. _must_ have both the printer's name and their agent's.
The penalties are quite severe.

~~~
matthewmacleod
I'm really not sure what your point is?

------
gnode
I think this is more likely to be unintentional. As another comment mentioned,
archive.is isn't affected. If you want to remove things from the Internet
Archive, you can do so using your robots.txt:

[https://archive.org/about/faqs.php#14](https://archive.org/about/faqs.php#14)

[https://www.fightcyberstalking.org/how-to-block-your-
website...](https://www.fightcyberstalking.org/how-to-block-your-website-from-
the-wayback-machine/)

~~~
johanj
robots.txt is really only supposed to be used for blocking the Internet
Archive's first snapshot, and not to remove existing snapshots – and even this
might not be the case in the future as they try to preserve most snapshots.
They made a few policy changes last year[1] to how they handle robots.txt
files, to handle cases where a domain is sold and a new robots.txt file would
result in deleting old data among other things.

[1]: [https://blog.archive.org/2017/04/17/robots-txt-meant-for-
sea...](https://blog.archive.org/2017/04/17/robots-txt-meant-for-search-
engines-dont-work-well-for-web-archives/)

~~~
jrochkind1
Hmm, that may be what it's meant for, but pretty sure it can currently be used
to block things retroactively too. IA may still have it in the archive, but
won't let viewers view it.

As happened in this case:
[https://news.ycombinator.com/item?id=16919017](https://news.ycombinator.com/item?id=16919017)

No? The article you linked says they've stopped paying attention to robots.txt
for US government and military sites, but it looks like it still retroactively
removes visibility for everything else.

I guess IA could change their practices. If medium or people like them start
actively using robots.txt to try to retroactively remove things from
visibility in the archive, perhaps IA will change their practices/policy. I
would welcome it.

------
textmode
Is a certain browser required to reproduce the problem? I had no trouble
accessing the archived page.

[https://web.archive.org/web/20160826003417/https://medium.co...](https://web.archive.org/web/20160826003417/https://medium.com/@Svenskunganka/interviewing-
my-mother-a-mainframe-cobol-programmer-c693d40d88f7)
----------------------------------------------------------------------

Interviewing my mother, a mainframe COBOL programmer

My 1 year older brother (to the left), my mother and me (to the right)

My mother has been working for one of the largest banks in the EU since before
I was born and I've always been fascinated by her line of work, especially
these last years since I've become a programmer myself. I've been asked to
interview her plenty of times, and finally decided to do so.

    
    
       * * *

------
erikb
I don't really see why this is a big deal. If I decide I want to delete the
content I put on Medium I very well want them to delete it everywhere else as
well.

One could read that as "Medium tries to protect their users content".

~~~
baby
Completely agree with your comment, not sure why you're being downvoted. A
blog is the same thing as a Facebook account. Would you like it if you
couldn't delete your pictures on Facebook?

~~~
jobigoud
A blog is not like a Facebook post. It's more like writing an essay,
publishing it in a book, and then asking libraries to burn their copy of the
book.

~~~
danso
That seems like an arbitrary distinction, especially since Facebook can be
used as a personal blog via its posting feature.

------
tenryuu
Uhh, is the first Medium article not meant to be loading? It worked fine for
me, but I read it as if it was meant to redirect to the homepage.

If I did read it wrong, I was able to load the second one fine as well.

------
ayepif
What are some ways around it as is mentioned in the article?

~~~
rambojazz
Either disable JavaScript before visiting the Wayback Machine, or stop the
loading of the page just after it has loaded the text but before it performs
the redirection (a bit tricky, you have to stop it at the right time).

~~~
gnud
You can't use the wayback machine without JS though, can you?

~~~
JackCh
Given the link to the archive, it works without javascript for me. I disabled
scripts from archive.org with umatrix and the archived page loaded just fine.
The only difference from usual is the top bar that archive.org normally
displays isn't present.

However you do seem to need javascript enabled to query the wayback machine
from web.archive.org: _" The Wayback Machine requires your browser to support
JavaScript, please email info@archive.org if you have any questions about
this. "_
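
Since a direct link is what lets you skip the JavaScript-dependent search page, here is a small sketch of how such a link is put together. It assumes the common layout `https://web.archive.org/web/<14-digit timestamp>/<original URL>` seen in the links in this thread; other variants (shorter timestamps, modifier suffixes) are not handled. `parse_wayback_url` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Assumed layout: /web/<YYYYMMDDhhmmss>/<original URL>
WAYBACK_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_wayback_url(url):
    """Split a Wayback Machine link into (capture time, original URL)."""
    match = WAYBACK_PATTERN.match(url)
    if match is None:
        raise ValueError("not a recognised Wayback Machine URL: " + url)
    timestamp, original = match.groups()
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S"), original
```

For the link discussed here, this yields a capture time of 26 August 2016 and the original medium.com address.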

------
thisisit
If the page loads completely then yes it redirects to medium.com. But you can
read the article by interrupting the redirect and clicking the stop button.

~~~
scandox
This is true. It took me five goes, but I am over 40 and my mouse finger is
not what it was.

~~~
StavrosK
Why mouse? Click the link and press escape. You youngsters with your mouses.

~~~
lostlogin
And there you go, it turns out that ‘mouses’ is an accepted pleural for a
computer mouse.

Less horrible than ‘Magic Trackpad 2 — Silvers’ I suppose.

~~~
StavrosK
Yep, unlike "pleural", which is not yet an accepted spelling of "plural" :p

~~~
lostlogin
That’s just excellent. It stays. Guess which field I work in.

------
paulie_a
Medium generally prevents people from reading articles anyways due to their
insanely shitty UI. I generally avoid that site now.

------
jrochkind1
Hmm, all they have to do is have a dynamic robots.txt that forbids Wayback from
the deleted articles, and they'll even remove the workaround. Yes?

~~~
Exuma
Once it's stored I imagine they don't need to even scrape the page again, so
robots.txt wouldn't do anything.

~~~
dragonwriter
Internet archive does rescrape periodically, and it removes archived pages
based on the _current_ robots.txt. This behavior is documented behavior of the
archive that goes beyond the normal conventions of robots.txt.

~~~
LinuxBender
I would add, the content itself is not removed. They only stop displaying it
whilst the robots.txt says not to. If they can not reach your robots.txt, the
content comes back as I have experienced multiple times.
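
A rough sketch of the "dynamic robots.txt" idea from this subthread, in Python. It is only an illustration: the `ia_archiver` user-agent name, the example paths, and the helpers `build_robots_txt` and `is_blocked` are assumptions, and whether the Internet Archive still honors such rules is exactly what is being debated above.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the archive crawler identifies itself with this user-agent.
ARCHIVE_USER_AGENT = "ia_archiver"


def build_robots_txt(deleted_paths, user_agent=ARCHIVE_USER_AGENT):
    """Render a robots.txt that hides deleted paths from one crawler only."""
    # Note: Disallow rules are prefix matches, so "/posts/a" also covers
    # "/posts/a-sequel".
    rules = [f"Disallow: {path}" for path in sorted(deleted_paths)]
    if not rules:
        rules = ["Disallow:"]  # an empty Disallow means "nothing is blocked"
    lines = [f"User-agent: {user_agent}", *rules, "", "User-agent: *", "Disallow:"]
    return "\n".join(lines) + "\n"


def is_blocked(robots_txt, user_agent, url):
    """Check a URL against robots.txt text using the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

A site would regenerate this file whenever an article is deleted. As LinuxBender notes, this would at most hide the snapshot while the rule stays reachable; it would not delete the stored copy.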

------
coldacid
This is why I like archive.is, it tends to avoid falling for any of that kind
of BS. Something gets archived and it's there forever.

------
tinus_hn
Alternatively, they can just tell the Wayback Machine to delete the articles
or forbid it from archiving them, right?

------
spechide
Just use Telegram Instant View.

------
phnofive
> Medium tries to prevent people reading deleted articles on The Wayback
> Machine

Please use the article headline. However, the automatic redirect makes this
pretty close to the truth.

~~~
scandox
Yes probably should have used my own title. It's just difficult for me to
accept a headline so lacking in pith.

Edit: have updated title

~~~
phnofive
Awesome. I am stepping through the page execution and trying to figure out
what is embedded in the page that would cause a redirect if the article is
deleted. Did you find it?

~~~
scandox
No. I took the view that a much more intelligent HN user would do that for me.

~~~
nasredin
Expect a promotion soon.

~~~
scandox
I’ve already been promoted to the level of my incompetence.

------
onetimemanytime
Maybe I'm out of touch with HN, but kudos to Medium. I delete the article for
a reason, so why should someone read it years later (regardless of the website
or format)? It's my article.

~~~
leereeves
What would credibility mean in a world where lies, mistakes, and fits of
emotion could be erased completely?

~~~
psyc
I'm not sure I understand the question. As it is now, the vast majority of my
transgressions _in real life_ disappear into a memory hole. What makes the
internet so special in that regard? You _want_ there to be a permanent digital
record of people's mistakes, why? So that those best at hiding their defects
can get ahead? Why is that desirable?

~~~
leereeves
For roughly the same reason schools administer tests and keep permanent
records of grades.

An author's track record for honesty and accuracy is (was?) the foundation of
credibility.

~~~
krapp
And yet the same community that insists the internet act as a permanent,
immutable and irrevocable platform for "judging honesty" tends to object when
intelligence agencies, law enforcement and employers use their data for just
that purpose.

Even if it were the case that one could _only_ judge a person's honesty and
accuracy from the "track record" of content published to the web that can be
traced to their identity, this assumes that all such content is unbiased and
factual, and that any interpretation of that content would also be unbiased
and factual, but that isn't true.

------
bazooka2th
Is this an April Fools' joke?

Too much first person.

Also, Wayback Machine "is frequently used by journalists and citizens to
review dead websites".

This isn't some fucking standard; it's Wayback Machine's responsibility to
archive websites, not the other way around.

