
ArchiveTeam rescues Justin.tv videos - sp332
https://twitter.com/textfiles/status/476064989879349248
======
timr
I don't think anyone has mentioned this yet, so I might as well: unless they
changed it after I left, JTV has been deleting videos older than a week for
many years. I wrote the original (well, almost original) system, and IIRC, we
took it down to a week of storage back in mid-2009 or so.

JTV caught quite a lot of static for _"not giving more than a week's
notice"_, but it's pretty hard to do that when you're going from one week of
storage to _no_ weeks of storage.

Also, since people are asking: back when I first worked on it (circa late
2008), we started by installing (again, IIRC) 40TB of RAID6 disk space, and
that was enough to store about a month of video before we ran out. By the time
I left, we had something like 4-5x that capacity, and we were down to a week
of archive storage. So that should give you some perspective on the amount of
data involved... and it's probably gone up substantially since then.
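
For a rough sense of the ingest rate those figures imply, here is a back-of-envelope sketch. It assumes "a month" means 30 days, that the 40TB was fully usable space, and that "4-5x" means 160-200TB; none of those details are stated above, so treat the output as an order-of-magnitude estimate only.

```python
# Back-of-envelope ingest rates implied by the storage figures quoted above.
# Assumptions (not from the comment): a month is 30 days, and the quoted
# capacities are usable space that filled exactly within the retention window.

DAYS_PER_MONTH = 30
DAYS_PER_WEEK = 7
INITIAL_CAPACITY_TB = 40  # "40TB of RAID6 disk space"


def tb_per_day(capacity_tb: float, retention_days: float) -> float:
    """Average daily ingest if capacity_tb fills up in retention_days."""
    return capacity_tb / retention_days


# Late 2008: 40TB held about a month of video.
early = tb_per_day(INITIAL_CAPACITY_TB, DAYS_PER_MONTH)

# Later: 4-5x the capacity held about a week of video.
late_low = tb_per_day(INITIAL_CAPACITY_TB * 4, DAYS_PER_WEEK)
late_high = tb_per_day(INITIAL_CAPACITY_TB * 5, DAYS_PER_WEEK)

print(f"late 2008: ~{early:.1f} TB/day")
print(f"later:     ~{late_low:.0f}-{late_high:.0f} TB/day")
```

Under these assumptions the daily volume grew by very roughly 17-21x between the two points in time.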

~~~
stingraycharles
> JTV caught quite a lot of static for "not giving more than a week's notice",
> but it's pretty hard to do that when you're going from one week of storage
> to no weeks of storage.

I have trouble understanding this point. What prevented them from
announcing something along the lines of "We keep videos for one week at the
moment; in 3 months' time, however, we will stop this and there will be no more
storage"?

------
aroman
I love the ArchiveTeam.

Jason Scott, the leader of the project, gave a fantastically entertaining talk
about how they saved Geocities, Yahoo! Video, and Friendster — using a
Distributed Preservation of Service Attack. Definitely worth a watch.

[https://www.youtube.com/watch?v=-2ZTmuX3cog](https://www.youtube.com/watch?v=-2ZTmuX3cog)

------
textfiles
Since people are interested... a fact that may be obscured from the
conversation and statistics is that something like 550gb of those videos have
ZERO views.

Kudos to archive team!

~~~
AustinDizzy
How can videos with at least 10 views also have zero views? The ArchiveTeam is
only keeping videos with at least 10 views because the total size of those
videos with less than 10 views is 1.01 Petabyte.

~~~
textfiles
Sorry for not being clear. Out of the 1.1 petabyte, something like 50-60% of
that has zero views. Archive Team duplicated everything with 10-infinity
views, which was about 10tb.

------
userbinator
The page for their project:
[http://archiveteam.org/index.php?title=Justin.tv](http://archiveteam.org/index.php?title=Justin.tv)

I'd be curious to know how many hours (days? months? _years_?) of video
content JTV actually has.

~~~
kd5bjo
5 years ago, it was 22 hours recorded per minute[1]. That translates to 25
years per week, which is their previous retention term. The volume has
probably gone up substantially in the intervening time.

[1] [http://mashable.com/2009/05/21/justin-tv-usage-stats/](http://mashable.com/2009/05/21/justin-tv-usage-stats/)
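
The "25 years per week" conversion can be checked with a few lines. This is only a sketch of the arithmetic; the 365-day year is an assumption, and the 22 hours/minute figure is taken from the cited article as quoted above.

```python
# Convert "hours of video recorded per minute" into "years of video per week".
HOURS_RECORDED_PER_MINUTE = 22   # figure quoted from the 2009 article
MINUTES_PER_WEEK = 60 * 24 * 7   # 10,080 minutes
HOURS_PER_YEAR = 24 * 365        # assumption: non-leap year

hours_per_week = HOURS_RECORDED_PER_MINUTE * MINUTES_PER_WEEK
years_per_week = hours_per_week / HOURS_PER_YEAR

print(f"{hours_per_week:,} hours of video per week")
print(f"~{years_per_week:.1f} years of video per week")
```

That works out to 221,760 hours, or a little over 25 years of footage for each week of retention.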

~~~
userbinator
So they're huge numbers, but still pretty small in comparison to YouTube which
claims to have 100 hours of video uploaded every minute [1] - meaning they
probably have at least several _millennia_ of stored video... wow.

[1]
[http://www.youtube.com/yt/press/statistics.html](http://www.youtube.com/yt/press/statistics.html)
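
As a sanity check on "several millennia": at a constant upload rate, the years of video added per calendar year is simply 60 times the hours uploaded per minute, because the days-per-year factor cancels out. A small sketch, assuming the quoted rates held steady for a full year (which they certainly did not, so this is an illustration rather than an estimate of either site's real archive):

```python
# Years of video added per calendar year at a constant upload rate.
# minutes_per_year / hours_per_year == 60, so the year length cancels out.

def video_years_per_year(hours_per_minute: float) -> float:
    """Years of footage accumulated in one calendar year."""
    minutes_per_year = 60 * 24 * 365
    hours_per_year = 24 * 365
    return hours_per_minute * minutes_per_year / hours_per_year


youtube = video_years_per_year(100)  # "100 hours of video uploaded every minute"
jtv_2009 = video_years_per_year(22)  # the 2009 JTV figure quoted upthread

print(f"YouTube:  ~{youtube:,.0f} years of video per year")
print(f"JTV 2009: ~{jtv_2009:,.0f} years of video per year")
```

So a single year at the quoted YouTube rate would already amount to about six millennia of footage.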

------
sp332
I guess this answers why they're not doing archives anymore. 1,000 TB of
videos with fewer than 10 views!

~~~
baddox
Where do you see that statistic?

~~~
dbpatterson
They said they archived 10TB (and that covered everything with at least 10
views), and that the total including all videos would be 1.1PB, ie 1100 TB. So
there is over 1000TB of videos with less than 10 views.

~~~
bjterry
1.1PB would cost $56,000 at the cost of Backblaze's latest pod, based on their
recently posted pricing.[1] That would be for live storage in physical
computers. If they put it on tapes at $0.01/GB, it would only cost $11,000 for
the tapes.

1: [http://www.tuaw.com/2014/03/19/backblaze-now-storing-100-pet...](http://www.tuaw.com/2014/03/19/backblaze-now-storing-100-petabytes-of-data-announces-storage-p/)
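
The cost figures are easy to reproduce. The sketch below assumes decimal units (1 PB = 1000 TB = 1,000,000 GB) and covers media only; the per-GB pod price is not quoted above, so it is back-calculated from the $56,000 total rather than taken from Backblaze.

```python
# Reproduce the storage-cost arithmetic from the comment above.
# Assumption: decimal units, i.e. 1 TB = 1000 GB.
TOTAL_TB = 1100              # 1.1 PB
GB_PER_TB = 1000
TAPE_USD_PER_GB = 0.01       # tape price quoted in the comment
POD_TOTAL_USD = 56_000       # pod-based total quoted in the comment

total_gb = TOTAL_TB * GB_PER_TB

# Tape media cost at the quoted per-GB price.
tape_cost_usd = total_gb * TAPE_USD_PER_GB

# Per-GB price implied by the quoted pod total (derived, not a published number).
implied_pod_usd_per_gb = POD_TOTAL_USD / total_gb

print(f"tape media: ${tape_cost_usd:,.0f}")
print(f"implied pod price: ${implied_pod_usd_per_gb:.3f}/GB")
```

Both numbers exclude drives, power, hosting, and redundancy, so they are a floor on the real cost rather than a budget.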

