

Dropbox's Carousel: Speeding up the Data Model - drewhaven
https://tech.dropbox.com/2014/08/building-carousel-part-ii-speeding-up-the-data-model/

======
batbomb
I'm working on something similar, where we have a file-system-like structure
in a database managing tons and tons of files and metadata.

What I've found works best for me is:

1\. A virtual file system which interacts with the database.

2\. A virtual file that is basically the minimum attributes for a given file
and a generic container for a variety of other views (e.g. FileAttributeView,
FileVersionsView, DirectoryAttributeView, DirectoryChildrenView, AclView,
ImageFileAttributeView). In this sense, it behaves similarly to a dentry and
an inode.

3\. A global virtual file cache for the virtual file system. The cache can be
smart: it can use different underlying caches depending on the file type, file
attributes, and eviction model you want.

4\. A layered approach to these file attribute views. For example, one view
holds the basic set of attributes, but if a user requests more information, it
may become an extended set of attributes. Or the extended view may contain
only the extended information and merge into a new view on demand, so the
extended information can be garbage collected first.
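To make points 2 and 4 concrete, here is a minimal sketch of a virtual file that holds a few fixed attributes plus a container of lazily loaded views. Only the view names echo the list above; the FileView interface, the view() method, and every field are assumptions made up for illustration, not my actual code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical marker interface for all views (not part of java.nio.file).
interface FileView {}

// Basic view: the cheap attribute set that is fetched first.
final class BasicAttributeView implements FileView {
    final long size;
    final long modifiedMillis;

    BasicAttributeView(long size, long modifiedMillis) {
        this.size = size;
        this.modifiedMillis = modifiedMillis;
    }
}

// Extended view: layered on the basic one and only built on demand.
final class ExtendedAttributeView implements FileView {
    final BasicAttributeView basic;
    final String mimeType;

    ExtendedAttributeView(BasicAttributeView basic, String mimeType) {
        this.basic = basic;
        this.mimeType = mimeType;
    }
}

// The virtual file: minimal identity plus a generic container of views.
final class VirtualFile {
    final long id;
    final String name;
    private final Map<Class<? extends FileView>, FileView> views = new ConcurrentHashMap<>();

    VirtualFile(long id, String name) {
        this.id = id;
        this.name = name;
    }

    // Returns the cached view of the given type, running the loader only on a miss.
    // The loader must not call view() on the same file (no nested loads).
    <V extends FileView> V view(Class<V> type, Supplier<V> loader) {
        return type.cast(views.computeIfAbsent(type, t -> loader.get()));
    }

    boolean hasView(Class<? extends FileView> type) {
        return views.containsKey(type);
    }

    // Dropping a single view lets it be garbage collected before the rest.
    void dropView(Class<? extends FileView> type) {
        views.remove(type);
    }
}
```

In a real system the loader would query the database; here it is just whatever Supplier the caller passes in.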

If necessary, the views themselves can use their own dedicated caches as well.
This simplifies cache eviction.
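As a rough sketch of point 3, a global cache can route each file to a separate underlying cache with its own capacity. LRU eviction here uses the access-order mode of java.util.LinkedHashMap; the FileKind enum, the GlobalFileCache class, and the capacities are hypothetical names chosen for illustration.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Small LRU cache: an access-ordered LinkedHashMap that evicts the eldest entry.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}

// Hypothetical file categories; a real system might key on attributes instead.
enum FileKind { REGULAR, IMAGE }

// Global cache that delegates to one underlying cache per file kind.
final class GlobalFileCache<V> {
    private final Map<FileKind, LruCache<String, V>> caches = new EnumMap<>(FileKind.class);

    GlobalFileCache(int regularCapacity, int imageCapacity) {
        caches.put(FileKind.REGULAR, new LruCache<>(regularCapacity));
        caches.put(FileKind.IMAGE, new LruCache<>(imageCapacity));
    }

    // Synchronized because even get() reorders an access-ordered LinkedHashMap.
    synchronized V get(FileKind kind, String path) {
        return caches.get(kind).get(path);
    }

    synchronized void put(FileKind kind, String path, V file) {
        caches.get(kind).put(path, file);
    }

    synchronized void invalidate(FileKind kind, String path) {
        caches.get(kind).remove(path);
    }
}
```

Swapping the eviction model for one kind of file then only means replacing that one underlying cache.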

The nice thing is that whenever a file is updated, all updates go through the
virtual file system, so you can handle cache invalidation in several ways. The
bluntest is removing the virtual file from the global cache, or you could
invalidate the view, or just reinitialize a view's members.
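The three invalidation options can be sketched roughly as follows, assuming every write goes through a single write() method. TinyVfs, AttrView, CachedFile, and the Invalidation enum are hypothetical names, and an in-memory map stands in for the database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical attribute view whose members can be reinitialized in place.
final class AttrView {
    private volatile long size;

    AttrView(long size) { this.size = size; }

    long size() { return size; }

    void reinitialize(long newSize) { this.size = newSize; }
}

// Cached virtual file holding named views.
final class CachedFile {
    final Map<String, AttrView> views = new ConcurrentHashMap<>();
}

// The three granularities, from bluntest to finest.
enum Invalidation { EVICT_FILE, DROP_VIEW, REINIT_VIEW }

final class TinyVfs {
    // Stand-in for the database: path -> file size.
    private final Map<String, Long> db = new ConcurrentHashMap<>();
    private final Map<String, CachedFile> cache = new ConcurrentHashMap<>();

    // Reads go through the cache and load the view from the "database" on a miss.
    long size(String path) {
        CachedFile f = cache.computeIfAbsent(path, p -> new CachedFile());
        return f.views.computeIfAbsent("attrs", k -> new AttrView(db.getOrDefault(path, 0L))).size();
    }

    // Every update passes through here, so invalidation happens in one place.
    void write(String path, long newSize, Invalidation how) {
        db.put(path, newSize);
        CachedFile f = cache.get(path);
        if (f == null) {
            return; // nothing cached, nothing to invalidate
        }
        switch (how) {
            case EVICT_FILE:
                cache.remove(path); // bluntest: drop the whole virtual file
                break;
            case DROP_VIEW:
                f.views.remove("attrs"); // drop only the stale view
                break;
            case REINIT_VIEW: {
                AttrView v = f.views.get("attrs");
                if (v != null) {
                    v.reinitialize(newSize); // refresh the view's members in place
                }
                break;
            }
        }
    }

    boolean isCached(String path) {
        return cache.containsKey(path);
    }
}
```

Whichever option is used, the next read sees the new value; the choice only changes how much cached state is thrown away.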

I implemented this on top of the Java Files API with some inspiration from the
Linux VFS and Apache Commons VFS.

~~~
mandeepj
Very impressive. If possible, please give more details, such as what you are
using this for.

If you wrote a blog post about it, that would be great.

