I know that you wrote that you used flat files for Viaweb. I was wondering whether you are still doing that for YC News, or whether you have broken down and started using a database or rolled your own data store with Arc.
Please write about how you do this and how you interact with the files (loading and serializing). I'm looking at alternatives to relational databases, and information about how to avoid data corruption (features analogous to transactions) is scarce. How would you convince a mission-critical site developer that this is safe?
It doesn't matter that it is in Erlang; you can do it in any language. Clojure (runs on the JVM) is a Lisp that implements software transactional memory:
http://clojure.sourceforge.net/
I wrap code that changes things within a call to atomic, which prevents the thread from switching in the middle of it. That solves the problem of two threads trying to modify an object at the same time. I don't add any protections against e.g. the host machine's power being shut off in the middle of writing a file, though maybe MzScheme does.
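For anyone who wants to try the same idea outside Arc, here is a rough Python analogue. It is only a sketch: the names atomic, votes, upvote and run_demo are made up for illustration, and unlike Arc's atomic it does not stop thread switching. It just makes sure only one thread at a time is inside an atomic block, which is enough to protect a read-modify-write.

```python
import threading
from contextlib import contextmanager

# One process-wide re-entrant lock; a hypothetical stand-in for Arc's atomic.
_atomic_lock = threading.RLock()

@contextmanager
def atomic():
    """Run the enclosed block while holding the global lock."""
    with _atomic_lock:
        yield

votes = {}  # hypothetical in-memory state shared by all threads

def upvote(item_id, times):
    for _ in range(times):
        with atomic():
            # Safe read-modify-write: no other thread can be inside atomic() now.
            votes[item_id] = votes.get(item_id, 0) + 1

def run_demo(workers=4, times=10000):
    """Hammer one counter from several threads and return the final count."""
    votes.clear()
    threads = [threading.Thread(target=upvote, args=("item-1", times))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return votes["item-1"]
```

Like the original, this does nothing about power loss in the middle of a file write; it only covers threads stepping on each other in memory.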
The answer to your specific question, though, is that I wouldn't try. News isn't written like banking software.
Transaction isolation would be handled as in any application that isn't backed by a database: use the native thread synchronization features of your language and/or library.
As for persisting transactions, you could marshal the deltas for each transaction to a file, and at regular intervals apply them to the full image to create an up-to-date image.
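A minimal sketch of that delta-plus-image scheme in Python, under some loud assumptions: the state is a flat dict, each delta is a dict of key overwrites, and the class name DeltaStore and the file names snapshot.json and deltas.log are invented for the example. Because deltas are plain overwrites, replaying the log over an already updated image is harmless.

```python
import json
import os
import tempfile

class DeltaStore:
    """Hypothetical store: full image in snapshot.json, deltas in deltas.log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "deltas.log")
        self.state = self._load()

    def _load(self):
        """Rebuild state from the last image plus any logged deltas."""
        state = {}
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                state = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    try:
                        delta = json.loads(line)
                    except json.JSONDecodeError:
                        break  # unparseable (e.g. half-written) line: stop replaying
                    state.update(delta)
        return state

    def commit(self, delta):
        """Append one transaction's delta to the log, then apply it in memory."""
        with open(self.log_path, "a") as f:
            f.write(json.dumps(delta) + "\n")
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the bytes to disk
        self.state.update(delta)

    def checkpoint(self):
        """Write a fresh full image, swap it in by rename, then empty the log."""
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(self.snapshot_path))
        with os.fdopen(fd, "w") as f:
            json.dump(self.state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snapshot_path)  # readers see old or new image, never half
        open(self.log_path, "w").close()     # truncate the delta log
```

You would call commit for every transaction and checkpoint from a timer. This is not a full recovery story (it ignores concurrent writers and anything appended after a torn log line), but it shows the shape of the approach.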
Just remember to yield every now and then if you do anything lengthy. Depending on your application, it may not be that bad, and if you don't need anything fancier in the way of scheduling fairness, it makes your life really simple.
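To make the "yield every now and then" point concrete, here is a small Python sketch of cooperative scheduling using generators. It is an illustration of the general idea, not of any particular runtime; count_words and run_round_robin are hypothetical names. A task only loses control where it says yield, so everything between two yields is effectively atomic.

```python
from collections import deque

def count_words(name, lines, log, chunk=2):
    """A lengthy task that voluntarily yields control every `chunk` lines."""
    total = 0
    for i, line in enumerate(lines, start=1):
        total += len(line.split())
        if i % chunk == 0:
            yield  # give the other tasks a turn
    log.append((name, total))

def run_round_robin(tasks):
    """Run generator tasks in turn, one slice each, until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task until its next yield
            ready.append(task)  # not done yet: back of the queue
        except StopIteration:
            pass                # task finished; drop it
```

If a task forgets to yield during something long, everybody else simply waits, which is the trade-off being described.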
Yep, the CSP style of channel communication is great. Those thinking Erlang is better than sliced bread need to make sure they're up on what came before and after (see http://swtch.com/~rsc/thread/). Personally, I find Erlang to be too clunky as a language. Good for special-purpose telephone switching software maybe, but for general programming the CSP style can be done in nicer ways than having to switch to a whole new language.
Not CSP, no, but libraries that bolt onto existing languages, e.g. C, Python. I'd love to see it become more of a mainstream technique in Python than `import threading', etc.
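For a taste of what the channel style looks like in plain Python, here is a sketch built only on the standard library's queue and threading modules. It is an approximation: queue.Queue(maxsize=1) is a small buffer rather than a true CSP rendezvous, and the names DONE, producer, squarer and run_pipeline are invented for the example. The point is that the threads share nothing except the channels.

```python
import queue
import threading

DONE = object()  # sentinel marking end of stream (a convention of this sketch)

def producer(out, n):
    """Send 0..n-1 down the channel, then signal completion."""
    for i in range(n):
        out.put(i)
    out.put(DONE)

def squarer(inp, out):
    """Read numbers from one channel and write their squares to another."""
    while True:
        item = inp.get()
        if item is DONE:
            out.put(DONE)  # pass the end-of-stream marker along
            return
        out.put(item * item)

def run_pipeline(n):
    """Wire producer -> squarer -> caller and collect the results."""
    a = queue.Queue(maxsize=1)  # bounded, so senders block until the reader catches up
    b = queue.Queue(maxsize=1)
    threads = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=squarer, args=(a, b)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

No locks appear in the stage functions; all coordination happens through put and get.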
Ignore the crufty Occam (Transputer anyone?) syntax, just concentrate on the concepts. Although some of the PDFs seem to have many pages, often a page is the same as the previous with a minor change; they're slides!
Would you mind elaborating on the persistence a little? What is the granularity of persistence? For example, does each submission get written to its own file with associated comments? Do you utilize a memory-mapped file? Inquiring minds want to know :)