
Why Files Exist - brettcvz
http://blog.filepicker.io/post/26157006600/why-files-exist
======
ChuckMcM
If you want to get existential, files don't exist. What exists is a way to name
a non-volatile data set. Given the name, and a non-volatile memory unit, and
an algorithm for translating between that name and a memory unit specific
representation of its internal structure, you can retrieve the data set. If
the name is sufficiently portable you could in theory hand it to another
program/process/thread and that thread could translate to the actual data on
the memory unit.

It sounds all horribly abstract but it is the actual reason file names, file
systems, and file system APIs exist. Then there is a whole different set of
semantic interpretation of the contents of files. Whether it is a simple
stream of UTF8 encoded code points, or an ELF file which can describe
executable code that is ready to be loaded into memory and executed.

The OP decries the lack of an _interchange_ format which is simply a convention
by which two programs can both interpret the contents of non-volatile
memory which they have both accessed using a unique name. And that mostly
because of iOS devices and the applications which have eschewed the idea of
putting the names of their non-volatile data sets into a globally accessible
namespace.

~~~
quesera
That's exactly right. Files don't exist. The "filesystem" is just a big fat KV
store.

Somewhere along the way someone decided it'd be a useful abstraction to
imagine a hierarchical organization system (directories) on top, so that was
glommed on, but it's not real either.
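
To make the "KV store with directories glommed on" point concrete, here is a minimal sketch in Python. It is a toy model, not how any real filesystem is implemented: the `store` dict and the `write`, `read`, and `listdir` helpers are all made up for illustration. The "directories" exist only as a convention on the key names.

```python
# Toy model (hypothetical, not any real filesystem): a flat key-value
# store where "directories" are only a naming convention on the keys.
store: dict[str, bytes] = {}


def write(name: str, data: bytes) -> None:
    """Store a data set under a flat name."""
    store[name] = data


def read(name: str) -> bytes:
    """Retrieve the data set for a name."""
    return store[name]


def listdir(prefix: str) -> list[str]:
    """Fake a directory listing by filtering keys on a shared prefix."""
    prefix = prefix.rstrip("/") + "/"
    children = {
        key[len(prefix):].split("/", 1)[0]
        for key in store
        if key.startswith(prefix)
    }
    return sorted(children)


write("/home/alice/notes.txt", b"buy milk")
write("/home/alice/photos/cat.jpg", b"\xff\xd8")
write("/home/bob/todo.txt", b"ship it")

print(listdir("/home"))        # ['alice', 'bob']
print(listdir("/home/alice"))  # ['notes.txt', 'photos']
```

Nothing in the store knows what a folder is; the hierarchy is produced entirely by `listdir` at lookup time.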

It was never a perfect system, but it worked well enough when most users were
fairly technical and had a humanly comprehensible set of labelled data
streams.

I appreciate the effort to insulate users from the complexity that has grown
up underneath them. The simple fact is that most people fail at large scale
taxonomy and organization. It's hard. And it's a lot of work to maintain even
if you're good at it. See: library science. So I don't think there is another
model that will succeed as well as "files" have.

iOS hides the filesystem, but it's still there obviously. So far all we've
seen is insulation for those who need it, as a byproduct of huge control loss
for everyone. The other (valuable) byproduct is security.

We haven't found the compromise yet. There might not be one.

~~~
tagx
There is a hierarchy of organization structures that are typically used for
"content." For content with low numbers, a list works best. For example, the
home screen is a list. With a medium number of files, hierarchical structures
like traditional file systems work best. However, when you reach many, many
files, tagging and searching is typically needed.

Files do exist. They are the values in the filesystem KV store, but they are
also a schema for the value so that interoperability works. If we followed
your logic, apps would not exist either but we all agree they do.

~~~
derleth
> when you reach many, many files, tagging and searching is typically needed.

When you reach many, many files, just _try_ to get people to tag them. Just
_try_.

~~~
drone
Indeed, you either tag from the get-go expecting to have many, many files in
the future, or you give up on ever contextually managing those files outside
of large containers.

How do you tag a file when you can't find it to tag it, and when do you tag a
file that you've forgotten about?

Ultimately, someone comes up with yet another abstraction that makes it just a
little bit easier... Now, if we made every application that wrote a file also
tag it meaningfully, and then had meaningful translations, and.. oh geez. I
normally just delete everything and start over when I realize I have no idea
what 90% of the files I just scanned were for. If they were truly important, I
would've known what they were. I guess the people with ten bazillion files on
a PC are just data hoarders. "But, but, but I'm going to need that report one
day!" (Bet you would tag it, now wouldn't you?)

~~~
johnchristopher
Well, there is something such as digital hoarding:
[http://online.wsj.com/article/SB1000142405270230340470457730...](http://online.wsj.com/article/SB10001424052702303404704577305520318265602.html)

First link for "digital hoarding", but there is more to read about it.

------
kahirsch
There is one aspect that everybody here seems to be ignoring. Files can span
boundaries of time, space, connectivity, bandwidth, and trust. They also span
boundaries of architecture--CPU and OS.

I have files "in the cloud" that were born on systems that haven't been
manufactured since before Google existed. Those files are self-contained units
that I control and can move to whatever system I desire.

And, although people here say that end-users just don't know how to use files,
I have relatives who are over 85 years old who still manage to attach photos--
and Powerpoint presentations, for some reason--to emails and share them.

Saying that we _only_ need an API is saying that it's okay for the data to die
when the manufacturer goes out of business, or decides that it's time to shut
down the DRM servers, or you just lose your phone.

Files are reifications of data that allow us to separate some concerns.
Transporting and backing-up files are _orthogonal_ to the data that is in the
files. We can compress and email and FTP any kind of data whatsoever.

That's not an insignificant thing.

~~~
zanny
You "could" in theory write apps for the iPhone that interface over localhost
ports. Which would be awful. (too bad ports in unix are described with file
descriptors!)

------
thoughtsimple
I disagree that files are the only solution. Back in the 90's Apple had an OS
that had fully interoperable data in applications and that OS didn't have a
file system.

It was Newton-OS and it used something known as soups for persistent storage.
Soups were discoverable databases that intelligently handled Flash cards
insertion/ejection. The ability to handle Flash on removable media is still
something that mobile OS's have trouble with to this day.

The OS could merge soups on different stores dynamically and detect if some
data in a soup was currently in use on an ejected card and ask for the card
back. This merging of soups on different storage devices is something I've
never seen duplicated in the subsequent 20 years.

Files are not the only way to achieve the requirements in the article. They
are just the common solution.

~~~
wrs
Hey, someone remembers! (I did the Newton object store.)

I spent years of my life trying to get rid of treating direct user access to
the filesystem as a foundational UI metaphor, at both Apple and Microsoft. As
I liked to say, why is the UI based on a filesystem debugger? (If you can see
/dev or C:\windows\system32, then yeah, you're running a debugger.)

Many people who aren't programmers don't seem to get deep hierarchy (deep
meaning > 2 levels). Searching works, tags kind of work, but few people really
know how to set up and use a folder hierarchy.

The reason it works to let the app deal with navigation is that the app knows
how to do type-specific, contextual navigation. People like concrete things
(whereas programmers like abstract things—a constant struggle). If you're
trying to find a song, you want to have a UI that knows about songs: they come
in albums, the same song may be on multiple albums, they have artists and
composers, etc. Any attempt to represent that in a filesystem hierarchy can be
nothing but a compromise.

This has nothing to do with defining standard formats for exchanging units of
data. Just how you find them once you've stored them.

~~~
encoderer
First, hat-tip for your accomplishments.

Now on to the bashing....

(I kid)

Seriously, though: For a song, that works fine. But what happens when it's a
note I jot down in a hurry? And then an address I tap in for later. And my
grocery list.

Now, I have to keep this mental mapping of where my data lives. I have to,
essentially, remember file types and associations myself.

Not saying I need a file browser, but the current iOS facility for this isn't
good enough. Look at the card-wallet thingy for iOS6. Maybe what would work is
something like that for each general type of content. You want to see any
stored gift cards and boarding passes? Open your wallet. You want to see any
stored notes and grocery lists and what not? Open your moleskin.

You've clearly thought about this more than I have, though. So what's your
take on it?

~~~
wrs
Well, your examples are kinda covered in iOS already: the note goes in Notes,
the address goes in the Address Book, and the list goes in Reminders. But I
think I see what you mean -- where do you throw random bits of stuff and how
do you get it back?

I think in the sort of usage you're describing, you just make random things
and save them, and you get them back with search and a chronological list. The
three things you describe don't sound like you'll need them after, say,
tomorrow afternoon. So why put a ton of effort into organizing them?

It is of course useful to be able to organize arbitrary files in a more
permanent way. The repeated mistake (to me) is that the process of
organization is not itself considered a concrete application based on specific
use cases. For some reason, a document format is considered application-
specific, but as soon as you want to group two documents together you're
dropped into this pure universal abstraction of a filesystem hierarchy. In
other words, applications get to define how files work, but not how folders
work.

For example, you could have a "project" that let you group various things
together (maybe some CAD drawings of an office remodel along with various
random notes and a budget spreadsheet). That's what a folder does, but a
project would be much more specific--maybe do some time tracking, have some
client-based organizational functions, etc. And of course you'd look at
projects in the project application, not in a filesystem debugger.

------
juriga
IMHO, interoperability between apps is the main benefit of Android against iOS
(see Intents in Android).

I can take any photo, URL or file and open/share it in/to another app. The OS
and app developers take care of which apps support which resources, so as a
user I'm always presented with a sensible list of apps currently available on
my device.

I'd say the notion of a file with a specific file type is too abstract and
technical for most use cases for casual users. The UI should group pieces of
data as human-understandable resources (i.e. a "picture" can be a .jpg, .png
etc.). With this level of abstraction, a user can be expected to understand
when presented with a list of apps:

OS: "What do you want to do with this URL?"

User: "Share to Twitter/Facebook/My other browser"

~~~
MatthewPhillips
Don't you see the problem with this? By allowing the app, rather than the user
or the system, to own the file you wind up with multiple copies.

App A shares data with App B. The user makes some changes and App B saves its
new copy. It doesn't send the data back to the originating app. It _could_ ,
but then the user would have to manually do that and save it in App A again.

This is horrendous! People are confusing the poor UI that we have for files
(file pickers) with thinking that files themselves are a bad abstraction.

~~~
jpxxx
I understand what you're saying, but you need to illustrate an actual problem
beyond "technically there's two" that this causes in mobile workflows for
typical consumers.

That the current Send-To idiom is limited is not under dispute. That it's
worth changing is what's worth discussing.

~~~
MatthewPhillips
The 2 copies are not guaranteed to be in sync. If I modify data in App B, App
A now has an old copy. I shouldn't have to explain why this is undesirable.

~~~
jpxxx
Relax. :) Now we're getting to the crux of the situation.

Let's talk spreadsheets. That's a pretty data-freshness-critical thing, yes?
Jennifer gets an e-mail with an Excel attached. She previews the Excel,
decides it needs an edit, and taps Send-To -> Numbers (the most common current
scenario). A duplicate is created and shipped over to be owned by Numbers.

She makes edits, then clicks Send To -> Email. The correct version goes out.
The updated version remains in Numbers. The user assumes that Numbers has the
spreadsheet, because that's where she was making changes. The copy in the Mail
Attachments archive remains old but _it's never invoked a second time_ and the
workflow never takes in stale data.

The only scenario in which Send-To causes problems is in the case of two
applications that have equal abilities to process a given filetype _AND_
roughly equal chances of being invoked by a user, and how often is that going
to be coming up on a telephone or a tablet?

You could always build a file locking system at the app-level or OS-level or
cloud-FS level, but then we're back to Who's Freshest?, the hardest game of
all to win.

Apple chose not to play this game and pushed all the filesystem yuckiness out
onto a rarely-traveled edge of a regular user's possibility space. It's not
ideal, but what is?

~~~
drone
Why do there have to be two copies of the file? Could there not be only one
copy of the file, with two "begin pointers" that are dated? When a user
decides that they no longer need an older version, the older section can be
purged. Obviously, this could result in quite large files, and we start
wanting to do smarter edits to the contents, but I'm not sure having two
distinct copies of the data serves the average user any better than one with
two internal copies of the data. Obviously, there are technical challenges,
but let's just assume for a minute that years of database and similar use
case-driven development have solved many of the basic issues already.
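
A rough sketch of the "one file, dated begin pointers" idea, in Python. Everything here is hypothetical (the `VersionedFile` class and its method names are invented for illustration), and it deliberately ignores concurrent access, which is the hard part.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedFile:
    """Hypothetical single 'file' holding dated versions instead of per-app copies."""

    # Each entry is (timestamp, contents), kept sorted by timestamp.
    versions: list[tuple[int, bytes]] = field(default_factory=list)

    def save(self, timestamp: int, data: bytes) -> None:
        """Add a new dated version without touching older ones."""
        self.versions.append((timestamp, data))
        self.versions.sort(key=lambda version: version[0])

    def latest(self) -> bytes:
        """Contents of the newest version."""
        return self.versions[-1][1]

    def as_of(self, timestamp: int) -> bytes:
        """Newest contents saved at or before the given time."""
        candidates = [data for ts, data in self.versions if ts <= timestamp]
        if not candidates:
            raise LookupError("no version that old")
        return candidates[-1]

    def purge_before(self, timestamp: int) -> None:
        """Drop versions older than timestamp, always keeping the newest one."""
        kept = [version for version in self.versions if version[0] >= timestamp]
        self.versions = kept or self.versions[-1:]


sheet = VersionedFile()
sheet.save(100, b"draft from Mail")
sheet.save(200, b"edited in Numbers")
print(sheet.latest())        # b'edited in Numbers'
print(sheet.as_of(150))      # b'draft from Mail'
sheet.purge_before(200)
print(len(sheet.versions))   # 1
```

Every app would ask the same object for `latest()`, so there is no second copy to go stale, at the cost of the file growing until someone purges it.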

~~~
jpxxx
Two copies solves more problems than it wastes space.

What you're describing is concurrent access to a shared resource, which means
we now need to start having the following discussions:

- Who's accessing?
- Where's the thing they're accessing?
- When are they starting?
- When are they done?
- What happens if they stop talking without saying goodbye?
- What happens if changes from User A happen before User B gets a full copy of the starting work object?
- What happens if User B deletes everything? Should she be able to?
- How do we manage identity?
- What happens if the work object is asked to save but is in an incomplete or non-logical state?
- What happens if the work object is damaged?

(Now multiply the word 'user' by multiple axes: actual humans, programs,
network services, filesystem services)

This is an awful, awful, awful lot of work designed to save the space used by
a duplicate 300KB .PPTX file. And frankly, I've only ever seen SubEthaEdit get
it right.

Also: the user will never, ever, ever, ever decide to review or delete an old
file. That's shit-work, and we are mortal.

A majority of customers aren't even aware that there can be multiple versions
of the same work or that the work they're doing can be expressed discretely -
they're just doing stuff, and cannot be expected to think about stuff.

So in short, let's waste some space and save some hassle. That's what flash
chips are for.

------
sehugg
I think the "here today, gone tomorrow" nature of app stores is impairing file
interoperability. There's just little incentive to allow your productivity or
creative app to play with others (unless that's the whole point of your app,
like PlainText). I've given up on fancy note-taking apps, knowing there will
always be a better one that's not compatible with my old data.

In another decade we'll have a whole lot of unreadable proprietary app data,
inaccessible because the original app doesn't work on new hardware. Extracting
it will be a tedious process of either reverse engineering or emulating the
old hardware/software combination.

Not that we haven't been down this road before, but it just seems like it's
worse this time. Even the word "file format" seems archaic, and not many
(other than pirates) seem interested in reverse engineering and/or documenting
them.

------
guard-of-terra
File systems have to evolve. These days, file system means two things: an
application-independent API to access common documents, and a hierarchical
local storage. But it doesn't have to be that way.

The best thing I've seen in file system evolution is KDE's KIO: Any KDE
application can take any KIO url and use it; all file operations are
asynchronous (even if you open local files, and that's very nice even for
local files that are big), and any program can use network resources as easily
as local ones with little to no effort.

But we should improve on that: a heterogeneous user file system should provide
discoverability (e.g. your social network photos are automatically available
in any program once you bind the account, and you know where to find them).
File system branches restrict some operations on files or hint on their cost
(scanning a huge photo bank is a very expensive operation; you can't access
the contents of audio files inside a streaming service but you can play them
in the program). There also should be other ways to organize files than just
dumb hierarchies (imagine a search box in place of a folder, you need a
query and then you enter the search results; or you can have tag cloud in your
file system)

There's a great deal of work of innovation here and nobody does it at the
moment, so it seems.

Sorry for mistakes 'cause I'm hurrying to go to bed :)

------
paulsutter
Files exist because decks of punched cards were cumbersome. A best practice
was to make a diagonal stripe across the top with magic marker so that you
could restore the ordering if you accidentally dropped the deck. Files in a
filesystem eliminated the need for that. And the cards were heavy.

That's why files exist. Not sure what the article is trying to say, TL/DR

~~~
agumonkey
The diagonal trick reminds me of CDROM CRC error mechanism (reversed of
course)

------
ori_b
The key isn't the specific file abstraction used today. The key is being able
to name data. Whether that is through a traditional hierarchical file system,
through an activity log, through URLs, through some hash-based key-value
store, the requirement is being able to refer to data independent of the
application that produced it.
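
As one illustration of naming data independently of the application, here is a minimal sketch of the hash-based key-value option in Python. The `blobs` dict and the `put`/`get` helpers are hypothetical; only `hashlib.sha256` is a real library call. The name is derived from the bytes themselves, so any program holding the name can ask for the data.

```python
import hashlib

# Hypothetical in-memory store; a real one would persist the blobs somewhere.
blobs: dict[str, bytes] = {}


def put(data: bytes) -> str:
    """Store data under a name derived from its contents and return that name."""
    name = hashlib.sha256(data).hexdigest()
    blobs[name] = data
    return name


def get(name: str) -> bytes:
    """Look the data up by name, regardless of which program stored it."""
    return blobs[name]


name = put(b"holiday photo bytes")
print(len(name))                                  # 64
print(get(name) == b"holiday photo bytes")        # True
print(put(b"holiday photo bytes") == name)        # True
```

The trade-off is that such names are stable and portable but not human-friendly, so some other layer still has to map them to something a person can recognize.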

------
jules
We will still need a way to pass information between applications, but that
may be so different from the concept of files that it would be ridiculous to
call it files. For example applications on the internet exchange information
via APIs. Microsoft is doing something similar with Windows 8: if you want to
get a photo into an application you show the user a menu to get a photo. The
user gets a list of all his photos to pick from. Where this list comes from is
dependent on which other applications are installed: if you have a facebook
application you can choose your photos from facebook, if you have picasa then
you can also choose photos from picasa, etc. This works because each
application that has photos is supposed to provide an API to the OS to access
its photos. Exchanging information by exchanging it directly via standardized
APIs makes a lot more sense than exchanging it via an abstraction layer
designed to operate on top of a hard disk. This is similar to the difference
between Unix pipes and getting the output of one program, storing it on your
hard disk, and then reading it in with another program. With the API model the
disk loses its special status, and instead becomes just one other data
source/sink like any other (FUSE turned on its head, if you will).
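
A small sketch of that provider model, in Python. This is not the Windows 8 API; `register_photo_source` and `photo_choices` are invented names, and the photo lists are placeholder data. The point is only that the disk is registered the same way as every other source.

```python
from typing import Callable

# Hypothetical registry: each installed app registers a function that lists its photos.
PhotoSource = Callable[[], list[str]]
photo_sources: dict[str, PhotoSource] = {}


def register_photo_source(app: str, source: PhotoSource) -> None:
    """Called by an app at install time to offer its photos to the picker."""
    photo_sources[app] = source


def photo_choices() -> list[tuple[str, str]]:
    """Build the picker menu by asking every registered source for its photos."""
    return [
        (app, photo)
        for app in sorted(photo_sources)
        for photo in photo_sources[app]()
    ]


# The local disk is just one more source alongside the others.
register_photo_source("disk", lambda: ["IMG_0001.jpg"])
register_photo_source("facebook", lambda: ["beach.jpg", "party.jpg"])

print(photo_choices())
# [('disk', 'IMG_0001.jpg'), ('facebook', 'beach.jpg'), ('facebook', 'party.jpg')]
```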

------
aganek
I love this post.

There is no doubt in my mind that the file system (as we know it) is dead.
Daily workflows are becoming more and more integrated with the social graph.
It's one thing to manage your own file set, but try keeping track of everyone's
files... or even your own across multiple different purpose devices for that
matter.

If I save files using one filtering scheme and someone else saves to the same
shared drive using another scheme... both of our files eventually become lost
in a mess.

Like others have posted, I believe the solution is search. Maybe not textbox
search like Google, but certainly different ways to view lists of files. Can
you imagine viewing the most recent files edited by a certain coworker, or the
most recent files edited within range of a certain GPS location. I don't have
an exact answer how to sort the data, but in my mind... there is a lot of
additional data that can be used to help filter file presentations beyond
just the file index and file attributes used today.

I'm in the bay area, working on a startup to address this shift. Message me if
interested... I'm always looking for people to talk about it with.

------
7952
"but in every OS there needs to be at least some user-facing notion of a file,
some system-wide agreed upon way to package content and send it between
applications."

This is what the world wide web does. DNS, HTTP, and MIME types solve these
problems. The problem is that it is still too difficult to make things on a
device into URLs.

~~~
brettcvz
But even on the web there is limited ability for applications to share data
without explicitly working with the APIs. A central filesystem allows for
"star network" integration rather than point-to-point

~~~
icebraining
The API is already there: it's HTTP. read() is GET, write() is PUT or PATCH,
unlink() is DELETE. If you want to be fancy you can use WebDAV, which is also
a standard API.
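
The mapping can be sketched with Python's standard `urllib.request`. The helper names and the `example.com` URL are made up for illustration, and the requests are only constructed here, not sent; passing one to `urllib.request.urlopen` would actually perform it against a server that supports the method.

```python
from urllib.request import Request


def read_request(url: str) -> Request:
    """read() maps to GET."""
    return Request(url, method="GET")


def write_request(url: str, data: bytes) -> Request:
    """write() maps to PUT, with the new contents as the request body."""
    return Request(url, data=data, method="PUT")


def unlink_request(url: str) -> Request:
    """unlink() maps to DELETE."""
    return Request(url, method="DELETE")


url = "http://example.com/notes/groceries.txt"  # placeholder URL
for request in (read_request(url), write_request(url, b"milk\n"), unlink_request(url)):
    print(request.get_method(), request.full_url)
# GET http://example.com/notes/groceries.txt
# PUT http://example.com/notes/groceries.txt
# DELETE http://example.com/notes/groceries.txt
```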

You don't need APIs, you need _standard file formats_ , just like with
filesystems.

------
tagx
How many different file types do you typically use in a week?

~~~
tlrobinson

        $ find ~ -type f -atime -1w | awk -F/ '{print $NF}' | awk -F\. '{if (NF>2 || (NF>1 && $1!="")) print tolower($NF)}' | sort | uniq | wc -l
              83
    

Granted a lot of those are system files.
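
For anyone without that flavor of `find` (the `-1w` unit is not accepted everywhere), a rough Python equivalent might look like the following. The function names are my own, and the access-time cutoff is an approximation of what `find -atime` does, so the count may differ slightly from the shell version.

```python
import os
import stat
import time
from pathlib import Path
from typing import Optional


def extension(name: str) -> Optional[str]:
    """Mirror the awk filter: skip names with no dot and bare dotfiles like .bashrc."""
    parts = name.split(".")
    if len(parts) > 2 or (len(parts) > 1 and parts[0] != ""):
        return parts[-1].lower()
    return None


def count_recent_types(root: Path, days: float = 7.0) -> int:
    """Count distinct extensions among regular files accessed in the last `days` days."""
    cutoff = time.time() - days * 86400
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, filename))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(info.st_mode) or info.st_atime < cutoff:
                continue
            ext = extension(filename)
            if ext is not None:
                seen.add(ext)
    return len(seen)
```

Calling `count_recent_types(Path.home())` walks the whole home directory, so expect it to take a while on a large one.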

~~~
super_mario
And this is the best counter example why not having files/filesystem would
suck. You could not do this rather simple calculation at all.

Somehow this crusade against files and the filesystem just feels like it has
ulterior motives behind it. I have yet to see even a computer illiterate user
who has a hard time understanding "folder" metaphor and that folder may have
items inside them, including other folders.

~~~
agumonkey
If some files are archives (or any encapsulation format/mechanism), then the
count is false.

Files and folders are too generic and not generic enough. Some files aren't
files, some files are ~folders. Actually most of those files are ~folders,
they are containers for other kind of data and relationships. List of samples,
Tree of names, Graph of points.

IIRC Plan9 tried to be a little more generic (in a good way), you could
read/write/list anything even visual objects with one single mechanism.

We need maps to see/categorize/find data. Graphs of atoms that you can close
(as in closure, any datum involved in the meaning of an operation has to be
included) to transmit them in a consistent state. Moving files is wrong and
everybody has seen it, it's full of hardcoded context.

------
tobyjsullivan
The limitation of apps not saving to the iOS file system is not a bad thing.
It is progress.

There is nothing preventing my shiny new iOS app from sharing files with other
applications. Apple is just preventing those files from being stored on the
device. Instead, if an app developer wants interoperability, they can have the
app save a file to Dropbox, or my Google Drive. Any other application can
access that same cloud storage and access the file.

The beauty is we've moved beyond sharing between applications on a single
device. Now EVERY application I run on EVERY device I have has the potential
to share the same data seamlessly.

This is why iOS doesn't open its file system. It wants the app developers to
use something a little more flexible and reliable.

~~~
juriga
> if an app developer wants interoperability

Implementing interoperability with all the possible cloud storage systems
shouldn't be left to each app developer separately. This should be a feature
of the operating system.

As an Android user, I'm genuinely interested if iOS users find the sharing
options between apps too limited. Do you often end up requesting new sharing
options from the developers of your favorite apps?

Also, not every piece of data is a file I'd want to save to Dropbox. For
example, I share article URLs from Flipboard to 2cloud many times a day
(2cloud opens the URL on my desktop browser). I'd hate to have an extra
save/open step between the apps.

~~~
tobyjsullivan
My argument to this is that having access to the file system just gives
developers a cop-out. If you give app devs access to the FS, they will all use
that because it's easier and avoids the challenges of supporting cloud
storage. However, this is ultimately worse for the user experience in the end.

I get what you mean about only wanting certain documents on Dropbox - that
was just a limited example. The spirit of the concept is that the developer
can choose what cloud storage to use based on the application.

In the case of a mobile Photoshop app, Dropbox might make sense. In your
example, the storage medium would be different (maybe proprietary even) but a
cloud space would still be ideal for the end user over just storing these
URLs on the local device.

~~~
vacri
Nice though Dropbox is, I don't have an account on it. You're saying that in
order to share data between two apps on the same device, I should have to sign
up to a third-party system to do so?

Then, if I have different kinds of documents, I should sign up to a bunch of
different third-party cloud systems? Each sign up being another username and
password, another point-of-failure for security, more management overhead? All
for a system that won't work if I lose network coverage (rural, underground,
airplane mode, choked tower, foreign travel etc)?

The cloud does have its good points, but I do not buy this snake oil.

------
colinsidoti
We'll see how it plays out, but I imagine the notion of a file will continue
to decline, and end up replaced by APIs.

APIs continue to provide interoperability, but instead of having the user
select a file to upload, they select an image through the Facebook API. This
should ultimately improve the user's experience, but there are some downsides
during this transition period (e.g. Photoshop Touch's lack of sharing features).

While you could argue FilePicker brings back the file, you could also hedge
your bets the other way, and work as much as possible to abstract away the
file. Instead of grabbing Facebook photos as a set of files, what if I could
easily grab the set of photos that contain me and a friend?

~~~
guard-of-terra
This way if you start a new social network, in addition to network effect it
would have that huge disadvantage to Facebook that no programs are willing to
interoperate with it because they only know that proprietary Facebook API.

Good for Facebook, bad for you and me.

------
colourforth
Forget about "filesystems" for a moment.

Files exist because the amounts of "stuff" users want to "store" do not always
correlate well to block size.

To put it another way, block size is fixed. But the size of "stuff" is
variable.
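
A quick sketch of that mismatch in Python. The 4096-byte block size and the helper names are assumptions for illustration; real devices and filesystems vary. A "file" in this view is little more than a recorded length plus the list of fixed-size blocks that hold it.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes; real systems vary


def blocks_needed(size: int) -> int:
    """Number of fixed-size blocks required to hold `size` bytes (ceiling division)."""
    return -(-size // BLOCK_SIZE)


def slack(size: int) -> int:
    """Bytes left unused in the last block."""
    return blocks_needed(size) * BLOCK_SIZE - size


def allocate(free_blocks: list[int], size: int) -> list[int]:
    """Take enough blocks off the free list to hold `size` bytes."""
    count = blocks_needed(size)
    if count > len(free_blocks):
        raise OSError("not enough free blocks")
    taken = free_blocks[:count]
    del free_blocks[:count]
    return taken


free = list(range(10))
report = {"size": 10_000, "blocks": allocate(free, 10_000)}
print(report["blocks"])   # [0, 1, 2]
print(slack(10_000))      # 2288
print(len(free))          # 7
```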

OK, now you can go back to thinking in terms of "file systems".

------
lucb1e
Oh, I thought this was going to be about who invented files in the first
place. Why'd you call something a file? How'd that idea arise? The only real-
life "files" I know are these reports the police keeps on people, or I guess
any dataset. But who invented filesystems?

Edit: 302 found <http://en.wikipedia.org/wiki/Computer_file#History>

------
njharman
Interfaces (read APIs) and data types (PDF, png, json, markdown, etc.) are much
superior to files for consumer level users. This is where iOS is heading.
It seems by evolution, not design.

Files are great when there is a competent, skilled user to provide the
interface glue between apps. To automate, and have things just work,
interfaces and data types are needed.

------
AsylumWarden
I remember the Palm Pilot didn't use files. I think I was using the Palm IIIxe
maybe. Developers, perhaps a little freaked out by the idea of no files,
actually created an API to make it look like there were files. I thought it
was pretty funny at the time.

------
nitinthewiz
My response to this, including how file systems and file explorers will never
die -> [http://blog.nitinkhanna.com/why-the-file-system-will-never-d...](http://blog.nitinkhanna.com/why-the-file-system-will-never-die/)

------
stcredzero
I recently switched to using MiniKeepass on my iPhone. I have my encrypted
KeePass file on Dropbox. To get it into MiniKeepass, I just went to the iOS
Dropbox app and clicked on my key database file. MiniKeepass was registered as
an app for the corresponding MIME type, and the file opened. Easy.

That worked great for me. What else is needed?

