
The Upspin manifesto: On the ownership and sharing of data (2014) - stablemap
https://commandcenter.blogspot.com/2017/10/the-upspin-manifesto-on-ownership-and.html
======
spc476
I'm reading the manifesto, and for some reason, a story I once heard, possibly
apocryphal, came floating up out of my memory: a person was using their PC
when a coworker came up asking for a copy of a spreadsheet and handed the
first person a floppy disk. That person took the disk, inserted it into the
PC, then _launched Lotus 1-2-3_ , loaded the file, and then saved it to the
floppy. The second person was incredulous that the first person didn't just
copy the file directly from the hard drive to the floppy. The first person
replied, "you can do that?"

~~~
zerocrates
I have a family member whose standard workflow involved doing the same with
Word on Windows. Save As was the only way he knew how to copy things.

He's a baby boomer. I'd assume this is only more common now with people's main
interactions with computing being in an app-focused, hidden-filesystem world.

------
jpeloquin
The manifesto's idea that files should have a unique address which any machine
can access reminds me of Brian Hauer's rant, which argues that each
application should have a single instance with a unique address (for a given
user) that any machine can access
([http://tiamat.tsotech.com/pao](http://tiamat.tsotech.com/pao)). Put the two
together and a person's entire digital life would seamlessly follow them
between machines.

I like the proposal of making caching a central design element to work around
today's bandwidth limitations. I work with large-ish (a few TB) scientific
datasets, and it isn't pleasant to have to choose between (a) storing
everything on network storage and suffering slow IO or (b) storing everything
locally on every workstation and suffering the need to synchronize data.
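
A minimal sketch of the caching idea, assuming a read-through cache: a file is
copied from the slow network share to local disk the first time it is opened,
and later reads are served locally. The class name `ReadThroughCache`, the
`demo` helper, and the paths are hypothetical, and the sketch deliberately
ignores invalidation (it never notices when the remote copy changes), which is
the hard part in practice.

```python
import shutil
import tempfile
from pathlib import Path


class ReadThroughCache:
    """Serve files from a local cache, copying from slow storage on first use."""

    def __init__(self, remote_root, cache_root):
        self.remote_root = Path(remote_root)  # e.g. a mounted network share
        self.cache_root = Path(cache_root)    # fast local disk
        self.hits = 0
        self.misses = 0

    def open(self, relative_path):
        """Return a binary file handle, fetching the file on a cache miss."""
        cached = self.cache_root / relative_path
        if cached.exists():
            self.hits += 1
        else:
            self.misses += 1
            cached.parent.mkdir(parents=True, exist_ok=True)
            # No invalidation: a stale local copy is never refreshed.
            shutil.copyfile(self.remote_root / relative_path, cached)
        return cached.open("rb")


def demo():
    """Read the same file twice; only the first read touches 'remote' storage."""
    with tempfile.TemporaryDirectory() as remote, \
            tempfile.TemporaryDirectory() as local:
        (Path(remote) / "run1").mkdir()
        (Path(remote) / "run1" / "data.bin").write_bytes(b"abc")
        cache = ReadThroughCache(remote, local)
        for _ in range(2):
            with cache.open("run1/data.bin") as handle:
                payload = handle.read()
        return payload, cache.hits, cache.misses
```

Calling `demo()` reads one small file twice and reports one miss followed by
one hit. With multi-terabyte datasets the same structure would also need an
eviction policy, since the local disk cannot hold everything.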

~~~
scoot
_" it isn't pleasant to have to choose between (a) storing everything on
network storage and suffering slow IO or (b) storing everything locally on
every workstation and suffering the need to synchronize data."_

That's a solved problem, which depending on the workload could involve a
compute farm with a clustered or distributed file system, a copy data
management solution, or cachefs, for example. There are also solutions for
shared storage between containers across multiple nodes.

~~~
jpeloquin
> That's a solved problem ... compute farm with a clustered or distributed
> file system, a copy data management solution, or cachefs, for example.

Thank you for suggesting some potential solutions to my data management
problems. From googling them, I get the impression that you are thinking
primarily of enterprise scale data management (i.e., organizations with server
farms and an IT department), whereas I'm primarily thinking of organizations
with < 30 employees (who mostly use desktop software) and a single file
server. My particular situation is an academic lab. However, I think these
solutions can still work with a little adaptation:

Copy data management seems to be the use of block level data deduplication or
virtual disks on a server in order to decrease the disk utilization per VM or
per container. I'm not completely sure; I found mostly marketing documents,
and there's no Wikipedia entry. This, as well as clustered/distributed file
systems, would apply if we turned the file server into a VM server and had
each employee work on a VM via remote desktop. In principle, this could work,
and would let temporary employees (e.g., summer students and visiting
scholars) get started quickly with a standard OS environment. I will
experiment with this when our last batch of desktop PCs hit end of life.

CacheFS looks useful if we start using NFS to connect to our file server
instead of Samba. It looks like Windows 10 (Enterprise) supports NFS caching
too
([https://technet.microsoft.com/en-us/library/cc976862.aspx](https://technet.microsoft.com/en-us/library/cc976862.aspx)).
I will try this.

------
krylon
That sounds a lot like IPFS (Interplanetary filesystem)
[[https://ipfs.io/](https://ipfs.io/)]

Or am I missing some key point here?

------
hawkinsw
Rob Pike explains this manifesto in several videos available on YouTube:

[https://www.youtube.com/watch?v=ENLWEfi0Tkg](https://www.youtube.com/watch?v=ENLWEfi0Tkg)

It's a fascinating talk. I really enjoyed it. I hope you do too!

------
helper
The 2014 label is a little weird. While it was written in 2014 it was only
just published publicly today.

------
natural219
Definitely a fan of this project, but I'm intensely curious as to why they
separated from Camlistore, which seems like a very similar project and is also
headed by a key member of the Go core team (Brad Fitzpatrick). Anybody from
either of those two projects care to comment?

Motivation: There are 100s of initiatives trying to solve similar problems,
and they could be solved relatively quickly if engineers deigned to work
together on a solution instead of splintering off into hundreds of fractured
groups.

~~~
jff
Brad Fitzpatrick's response:
[https://news.ycombinator.com/item?id=13700968](https://news.ycombinator.com/item?id=13700968)

~~~
natural219
Ah, thanks for the speedy response.

> The main difference I see is that Camlistore can model POSIX filesystems for
> backup and FUSE, but that's not its preferred view of the world.

This makes me want to throw things. I'm actually mentally discounting both
projects now on the charge that core authors seem to care more about bickering
over technical details than implementing working solutions to these society-
breaking problems.

~~~
jff
Andrew Gerrand worked on both and apparently didn't think Camlistore was the
right basis for what they wanted in Upspin. But I'm sure you, who I'm not sure
has used either project, know better than Andrew and Brad and Rob.

~~~
natural219
I am claiming I do, yes, and would happily make my case to any of them for why
they should do the hard work of agreeing on minor technical details and merge
the two projects. It is the easiest instinct for engineers to "split off and
code their own version" over technical disagreements, and why we have a
dizzying array of incompatible, half-completed decentralization projects while
Facebook and Twitter continue to eat society.

Thank you again for the info/backstory, though. I am just a naysayer who has
sat through 1000 pitches of Fitzpatrick's basic thesis back in 2010 and seen
excruciatingly minimal progress in the space of "actually making these things
work for normal people".

~~~
enneff
I know both projects intimately and they are not "minor technical details" but
rather fundamental architectural differences.

~~~
natural219
I'm happy to discuss this further -- my life-passion-project is to see
decentralization through -- but fear I've overstepped my bounds in this thread
and am taking away focus from the project at hand, which I am a supporter of.

~~~
scoot
_" I'm actually mentally discounting both projects now "_ _" the project at
hand, which I am a supporter of"_

Which is it?

~~~
natural219
I keep a ranking of decentralization projects in terms of how likely they are
to succeed and catch on. Camlistore and Upspin have been near the top of my
list for years now (Camlistore was the one that originally inspired me to quit
my job at Twitch and do decentralization advocacy full-time). I am now slightly
less excited about both projects, although they still have incredible
potential and I would be overjoyed if either of them met with minor success.

At this point, I get the sense that Upspin/Camlistore don’t really _want_ to
succeed in terms of catching mass-market success and disrupting the
innovation-stifling tech giants. It seems like they’re more interested in
scratching their personal itch and being content with that. Totally fine, but
I’m going to be slightly less excited about releases from both of these
projects in the future unless I get indications that the core team members are
willing to escape the same trap that plagues all standardization schemes
([https://xkcd.com/927/](https://xkcd.com/927/))

------
DonbunEf7
Sounds like Named Data Networking, which cannot come soon enough. Is anybody
doing commercial NDN yet?

------
skybrian
Assuming they got similar adoption, I'm wondering why I should use Upspin
rather than Keybase? It seems like Keybase's users and groups are more
sophisticated, and its private git support is immediately useful.

~~~
shykes
I am a big fan of both Upspin and Keybase. In my view they are quite
different.

1) Polish vs Openness:

\- Keybase is a polished product and a closed identity platform;

\- Upspin is open-source plumbing and an (almost) open platform.

2) Focus

\- Keybase is a crypto identity platform which happens to have a file storage
app;

\- Upspin is a file storage platform which happens to have a crypto identity
feature.

 _Note: I call Upspin "almost open" because it does not support running your
own key server in a private namespace. All users must use the same public key
server. In exchange for a slightly less open platform, Upspin gets a strong
guarantee of a single global namespace, which is a really great feature for
end users. I think it's great that the project is clear about its priorities
and the tradeoffs it's willing to make, and communicates them upfront._

~~~
4ad
Of course you can trivially run your own key server and your own upspin
universe. But then of course, you can't talk with other people.

The problem with upspin is that there's a SPOC keyserver, not that there's a
single namespace. You could trivially have a single global namespace with many
delegated keyservers using DNS. You know, like e-mail. But the authors
unfortunately don't want that.
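
To make the e-mail analogy concrete, here is a minimal sketch of delegated
lookup, assuming user names of the form `user@domain`. The plain dictionary
stands in for DNS records, and every name in it (`split_user`,
`key_server_for`, the `.example` hosts) is hypothetical; nothing here is
Upspin's actual API or behaviour.

```python
def split_user(user_name):
    """Split 'ann@example.com' into ('ann', 'example.com')."""
    local, sep, domain = user_name.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a user@domain name: {user_name!r}")
    return local, domain.lower()


def key_server_for(user_name, delegations, default_server):
    """Pick the key server responsible for a user.

    `delegations` maps a domain to its key server and stands in for DNS
    records; a real system would issue a DNS query here instead. Domains
    without a delegation fall back to `default_server`.
    """
    _, domain = split_user(user_name)
    return delegations.get(domain, default_server)


# Hypothetical delegation table: one domain runs its own key server.
DELEGATIONS = {"lab.example": "keys.lab.example:443"}
CENTRAL = "keys.central.example:443"
```

With this table, `key_server_for("ann@lab.example", DELEGATIONS, CENTRAL)`
resolves to the lab's own server, while a user at any other domain falls back
to the central one. Names stay globally unique because the domain is part of
the name, just as with e-mail addresses.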

------
scribu
> From a human point of view, the data is all we care about: my pictures, my
> mail, my documents.

I don’t think it’s that simple.

Consider messaging apps: do users care about their own messages? No - a list
of sentences is meaningless when detached from the overall conversation.

So in whose $HOME do you store that conversation?

~~~
mark_edward
Both, like email

------
xfer
Is there any specification of the protocol? Or is the code the specification
for now (and in flux)?

~~~
enneff
The core interfaces are pretty well-documented:
[https://godoc.org/upspin.io/upspin](https://godoc.org/upspin.io/upspin) and
the wire protocol:
[https://godoc.org/upspin.io/rpc](https://godoc.org/upspin.io/rpc)

We think the APIs have settled down a lot now, but they may yet change.

------
aidenn0
The manifesto talks about networked home directories; has anyone used
AFS/Coda/InterMezzo and can they speak to how the experience compares to NFS?

------
Karrot_Kream
Hm I'd love to marry this to git-annex. Not sure why I haven't heard of upspin
before.

------
alpb
Why does the title read "(2014)"? This seems to have been published today.
It's new content, even though it was probably prepared internally in 2014.

~~~
dang
The article was written in 2014. If, say, a letter from Mark Twain got
published for the first time today, we'd put the year it was written in the
title.

~~~
nickm12
Count me among the confused. Works are usually referenced by their publication
year and, on Hacker News, I always assume something with a (YEAR) in the title
was published in that year.

