Hacker News
Camlistore: a new project from Brad Fitzpatrick (camlistore.org)
118 points by joshfraser on Jan 29, 2011 | 22 comments

Looks like a pretty neat project, though I didn't have a very good sense of what they're trying to do until I read the use cases page: http://camlistore.org/docs/uses

Too bad they mostly seem to be focusing on Go (for local) and Google App Engine (for remote) implementations. That's probably a serious barrier to excitement for most people I know, even if they plan to encourage other implementations in the future.

What Brett said. Also, as the project matures we will provide binaries and downloads but for now the barrier to entry is probably convenient.

Go isn't the project's choice. It's my personal choice as I continue to love the language. The project encourages all languages.

It sounds more like Tonido (http://www.tonido.com).

At least, the broader principles:

    * Disk is cheap and getting cheaper

    * Put the user in control. Own your data.

    * Privacy and paranoia

    * Decentralization is important, but..

    * End users won't be dorks. Must also be possible to be easy, hosted.

    * Content-Addressability has so many awesome properties (validation, cachability, etc). Use it as much as possible.

    * Redundancy and over-explicitness is fine. Compression will help. Redundancy and over-explicitness will be convenient for future digital archeologists, too.
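To make the content-addressability point concrete, here is a minimal sketch in Python. It is not Camlistore's code or API: the `BlobStore` class, the `blobref` helper, and the "sha1-" prefix on refs are all assumptions made for illustration. It shows the two properties named above: a ref can be validated against the bytes it names, and identical content is stored only once.

```python
import hashlib


def blobref(data: bytes) -> str:
    """Name a blob by the SHA-1 of its bytes (the "sha1-" prefix is an assumed format)."""
    return "sha1-" + hashlib.sha1(data).hexdigest()


class BlobStore:
    """Toy in-memory content-addressed store; hypothetical, not Camlistore's API."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        ref = blobref(data)
        # Identical content always maps to the same ref, so duplicates are free.
        self._blobs[ref] = data
        return ref

    def get(self, ref: str) -> bytes:
        data = self._blobs[ref]
        # Validation: whoever holds the ref can check the bytes they were handed.
        if blobref(data) != ref:
            raise ValueError("blob does not match its ref: " + ref)
        return data


store = BlobStore()
ref_a = store.put(b"hello camli")
ref_b = store.put(b"hello camli")  # same bytes, same ref, one stored copy
```

Because a ref never points at different bytes, anything fetched by ref can be cached indefinitely, which is where the cachability property comes from.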


Write code. We have Go, Java, JavaScript, Python, Perl in the repo. Multiple implementations are the goal.

I also found overview.txt (link below) in the documentation useful for getting a picture of the project, which if I had to summarize in three words would be "git-like content-addressable filesystem." Some of the things they're talking about for synchronization are neat and could open up some really interesting possibilities.


I was about to submit this myself - one of the most interesting Google 20% projects I've come across in a long time.

If you don't look through the whole website, at least skim their vision for use cases: http://camlistore.org/docs/uses

The single feature of being able to keep a private store of different web services could make this really take off for a lot of people.

My three-word summary would be "application strength Dropbox"

I was thinking about how proxying would work. You could get very cool proxying, similar to the way AptProxy works, where instead of accessing http://camlistore.org:3179/camli/hash-1, a client would access http://localproxy?url=http://camlistore.org:3179/camli/hash-.... The proxy would parse the URL, see if it references a blob that a local camlistore has cached, and serve out of that.
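A rough sketch of that proxy lookup, in Python. Everything here is hypothetical: the function names, the `/camli/` path prefix taken from the example URL, and the plain dict standing in for the local camlistore cache. `fetch_upstream` is a caller-supplied function so the sketch makes no real network calls.

```python
from urllib.parse import parse_qs, urlparse

CAMLI_PREFIX = "/camli/"  # assumed from the example URL above


def wrapped_target(proxy_url: str) -> str:
    """Extract the upstream URL carried in the proxy's ?url= parameter."""
    targets = parse_qs(urlparse(proxy_url).query).get("url")
    if not targets:
        raise ValueError("proxy URL carries no ?url= target")
    return targets[0]


def blobref_from_target(target_url: str):
    """Return the blobref named by the upstream URL, or None if it is not a blob URL."""
    path = urlparse(target_url).path
    if not path.startswith(CAMLI_PREFIX):
        return None
    return path[len(CAMLI_PREFIX):] or None


def serve(proxy_url: str, local_cache: dict, fetch_upstream):
    """Serve from the local cache when the referenced blob is present, else go upstream."""
    target = wrapped_target(proxy_url)
    ref = blobref_from_target(target)
    if ref is not None and ref in local_cache:
        return local_cache[ref], "cache"
    data = fetch_upstream(target)
    if ref is not None:
        # Safe to keep forever: a content-addressed ref never names different bytes.
        local_cache[ref] = data
    return data, "upstream"
```

The second request for the same blob never leaves the local machine, which is the AptProxy-like behavior described above.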

Then I looked at the camlistore sharing model. The proposal involves storing private data in camlistore blobs, protecting it behind a 401, and requiring the client to append a ?via=hash-2, where hash-2 is a claim that says "ok, let this data through". I'm not a big fan of that, because proxies won't reproduce your security model, and private data is stored in the clear.
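For readers who haven't read the sharing docs, this is roughly how I understand the 401-plus-via check, sketched in Python. It is my reading of the proposal, not the project's implementation: the `PrivateBlobServer` class and the `{"share": <ref>}` claim format are made up, and signature verification of the claim is left out entirely.

```python
import hashlib
import json


def blobref(data: bytes) -> str:
    # The "sha1-" prefix is an assumed ref format.
    return "sha1-" + hashlib.sha1(data).hexdigest()


class PrivateBlobServer:
    """Toy model of the via= check; hypothetical, with no claim signing."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        ref = blobref(data)
        self._blobs[ref] = data
        return ref

    def get(self, ref: str, via=None):
        """Return (status, body): 401 unless `via` names a claim that shares `ref`."""
        if ref not in self._blobs:
            return 404, b""
        claim_bytes = self._blobs.get(via) if via else None
        if claim_bytes is None:
            return 401, b""
        try:
            claim = json.loads(claim_bytes)
        except ValueError:
            return 401, b""
        if not isinstance(claim, dict) or claim.get("share") != ref:
            return 401, b""
        # The body goes out as cleartext, so any cache on the path now holds it.
        return 200, self._blobs[ref]
```

The last comment is the objection: once the 200 response passes through a dumb proxy, the proxy holds the cleartext and knows nothing about the 401 rule.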

What one should do, really, is store private data encrypted. Then, you can reference a "phantom blob", which is not actually an object in storage but represents the cleartext of the private data. Your claim could now be a recipe on how to reconstruct this phantom blob, i.e. get chunk X of ciphertext and decrypt it with this key.
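Here is a sketch of the phantom-blob idea in Python. All names are hypothetical, and the cipher is a deliberately toy XOR keystream built from `hashlib` so the example stays self-contained; it is not secure, and a real system would use an authenticated cipher from a vetted library. The recipe fields (`phantom`, `ciphertext`, `key`) are my own invention, not a Camlistore schema.

```python
import hashlib


def blobref(data: bytes) -> str:
    # The "sha1-" prefix is an assumed ref format.
    return "sha1-" + hashlib.sha1(data).hexdigest()


def toy_keystream_xor(key: bytes, data: bytes) -> bytes:
    """TOY cipher, illustration only: XOR with a SHA-256 counter keystream. Not secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def make_recipe(store: dict, plaintext: bytes, key: bytes) -> dict:
    """Store only ciphertext; return a claim describing how to rebuild the cleartext."""
    ciphertext = toy_keystream_xor(key, plaintext)
    cipher_ref = blobref(ciphertext)
    store[cipher_ref] = ciphertext
    return {
        "phantom": blobref(plaintext),  # names the cleartext, which is never stored
        "ciphertext": cipher_ref,
        "key": key.hex(),
    }


def reconstruct(store: dict, recipe: dict) -> bytes:
    """Follow the recipe and verify the result against the phantom blobref."""
    ciphertext = store[recipe["ciphertext"]]
    plaintext = toy_keystream_xor(bytes.fromhex(recipe["key"]), ciphertext)
    if blobref(plaintext) != recipe["phantom"]:
        raise ValueError("reconstructed bytes do not match the phantom blobref")
    return plaintext
```

The store, and any proxy caching it, only ever sees the ciphertext blob. Whoever holds the recipe can rebuild the phantom blob and check it against its ref.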

Now, dumb proxies won't store sensitive stuff in the clear (although they would store enough data that the plaintext could easily be reconstructed), and at least your security model is preserved.

Encryption should occur higher in the app layer. For cache privacy everything should be over https and avoid proxies. There are uses for fully public blobs too and those do not have these issues.

Sounds like git for personal storage in the cloud.

Right. At least, there is a blob server in the architecture and a blobref in the reference document. In Camlistore, they have something called "schema blob" that looks more like a generalized version of the specific "tree", "commit" and "tag" in Git. I hope they could have a sample skeleton in their schema to support the Git Object Model. In that scope, they could host a standard git repository in their store and maybe benefit from existing git tools...

We could (and kinda expected that somebody _would_) but I don't quite see the existing tools working. I'd love to be proven wrong, though. I suppose it could be done with some remapping front-end, but I think that front-end would also need to maintain maps between as-git-computed blobrefs and Camli blobrefs.
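To illustrate why such a map would be needed, here is a sketch in Python. Git hashes a short header plus the content, so the id git computes for a file differs from a plain hash of the same bytes. The `camli_blobref` format (SHA-1 of the raw bytes with a "sha1-" prefix) and the `RefMap` class are assumptions for the example, not the project's code.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Git's id for a blob object: SHA-1 over "blob <size>" + NUL + content."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def camli_blobref(data: bytes) -> str:
    """Assumed Camli-style ref: SHA-1 of the raw bytes, with a "sha1-" prefix."""
    return "sha1-" + hashlib.sha1(data).hexdigest()


class RefMap:
    """Hypothetical two-way map a remapping front-end would have to maintain."""

    def __init__(self):
        self.git_to_camli = {}
        self.camli_to_git = {}

    def add(self, data: bytes):
        git_id, camli_ref = git_blob_id(data), camli_blobref(data)
        self.git_to_camli[git_id] = camli_ref
        self.camli_to_git[camli_ref] = git_id
        return git_id, camli_ref


refs = RefMap()
git_id, camli_ref = refs.add(b"hello\n")
# git_id is what `git hash-object` reports for the same file contents;
# camli_ref is a different hash of the very same bytes.
```

This only covers file contents. Git trees and commits embed git ids inside their own bodies, so a front-end would have to translate those references as well.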

If you look at the docs for the Signed Claims[1], perhaps the most interesting part of this is the safe way to share content. Being able to cryptographically verify a claim of access to or ownership of content sounds pretty awesome to me.

I spoke to Brad and Brett at OrdCamp[2] here in Chicago this weekend and it sounded pretty interesting. They plan to make the tooling around this hide the difficulties of dealing with public/private keys from the average user. I suppose if it's done right this could do for sharing content what SSL did for commerce on the web.

[1] http://camlistore.org/docs/json-signing

[2] http://ordcamp.com

meta: their schema seems to be JSON instances plus comments. http://camlistore.org/code/?p=camlistore.git;a=tree;f=doc/sc...

This is probably just a temporary notation, but perhaps it's clear enough for ongoing use?

Optional/required is handled by comments. JSON "[ ]" syntax indicates lists, but it's unclear whether the contents represent a repeated group, like (abc)*, or alternatives, like (a|b|c)*. Alternatives don't seem to be handled at all, and they could also appear as the value of a field, not just in a list.

This reminded me of Venti. http://en.wikipedia.org/wiki/Venti

They mention it in their list of influences.

I've been working on a distributed, Venti-like archival filesystem, too. While Plan 9 never really caught on, it had many great systems within it that really deserve wider exposure.

I wanted to give it a try, but you need billing enabled to use at least the blobserver on App Engine.

I believe you can enable billing and leave your cap at $0.

You could use the local version.

Ok, that is awesome. You had me hooked at "Apache License".

Very interesting project. Brad, are there any similarities with your previous work on brackup? Perhaps with how the files are tracked?
