Show HN: A threaded, TCP, key value store in Haskell

sold · on Feb 16, 2013

You can change

    appV fn x = atomically $ readTVar x >>= writeTVar x . fn

to

    appV fn x = atomically $ modifyTVar x fn

The function getValue need not be in the IO monad. You can simplify it using fromMaybe from Data.Maybe.
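A minimal sketch of what a pure getValue could look like. The original getValue is not shown in this thread, so the `DB` type (a `Data.Map` from String to String) and the "null" placeholder for missing keys are assumptions made for illustration.

```haskell
import qualified Data.Map as Map
import Data.Maybe (fromMaybe)

-- Assumption: the store is a map from String keys to String values.
type DB = Map.Map String String

-- Pure lookup: returns the stored value, or a placeholder when the key is absent.
-- The "null" placeholder is hypothetical; use whatever reply the protocol expects.
getValue :: String -> DB -> String
getValue key db = fromMaybe "null" (Map.lookup key db)
```

The caller would then read the TVar once (for example with readTVarIO) and apply getValue to the result, keeping the IO part separate from the lookup.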

Sometimes you check "head l" and "tail l"; it's more idiomatic to use pattern matching. For example, you can change:

    setCommand handle cmd db = do
        appV (conv k v) db
        hPutStrLn handle $ "OK"
        where k = (head cmd)
              v = (unwords (tail cmd))
    ....
    case (head cmd) of
        ("get") -> getCommand handle (unwords (tail cmd)) db
        ("set") -> setCommand handle (tail cmd) db

to something like this:

    setCommand handle key val db = do
        appV (conv key val) db
        hPutStrLn handle "OK"
    ...
    case cmd of
      "get":key     -> getCommand handle (unwords key) db
      "set":key:val -> setCommand handle key (unwords val) db

Also consider using hlint; it often gives good advice.
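The pattern-matching suggestion above can also be pulled out into a small pure parser, which is easy to test without a socket. This is only a sketch: the `Command` type and `parseCommand` are hypothetical names, keys are assumed to contain no spaces, and a catch-all case is added for input that matches neither command.

```haskell
-- Hypothetical command type, for illustration only.
data Command
  = Get String
  | Set String String
  | Unknown
  deriving (Eq, Show)

-- Split a request line into words and match on the shape of the list.
parseCommand :: String -> Command
parseCommand line = case words line of
  ["get", key]      -> Get key
  "set" : key : val -> Set key (unwords val)
  _                 -> Unknown
```

Note that going through words and unwords collapses runs of whitespace in the value, just as the original head/tail version does.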

mrhonza · on Feb 16, 2013

Excellent, thank you. This is really helpful.

joeyh · on Feb 16, 2013

Nice example of using Software Transactional Memory, which is starting to be available in things like gcc but is much, much nicer in Haskell for reasons involving purity and monads.

Particularly this function, which changes a variable in the database, transactionally.

  appV :: (DB -> DB) -> TVar DB -> IO ()
  appV fn x = atomically $ readTVar x >>= writeTVar x . fn

It would be easy to use this to add commands that, e.g., increment counters in the database, and STM would ensure that it works correctly with multiple concurrent writers. No locking needed. A quick example, which could be improved by changing the DB type to support Int as well as String values:

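  -- Assumes DB is a Data.Map from String to String, with lookup taken from
  -- Data.Map rather than the Prelude.
  incrCommand :: Handle -> String -> TVar DB -> IO ()
  incrCommand handle k db = do
      appV incvar db
      hPutStrLn handle "OK"
    where
      incvar m = insert k (show (n + 1)) m
        where n = fromMaybe (0 :: Int) (lookup k m >>= readMaybe)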
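To make the concurrent-writers claim concrete, here is a small self-contained sketch. It generalizes appV to any value type and uses the strict modifyTVar'; `concurrentIncrements` is a hypothetical helper, not part of the project under discussion.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM (TVar, atomically, modifyTVar', newTVarIO, readTVarIO)
import Control.Monad (forM_, replicateM_)

-- Apply a pure update to the shared value in one transaction.
appV :: (a -> a) -> TVar a -> IO ()
appV fn x = atomically $ modifyTVar' x fn

-- Fork `writers` threads that each increment the counter `perWriter` times,
-- wait for all of them, and return the final value.
concurrentIncrements :: Int -> Int -> IO Int
concurrentIncrements writers perWriter = do
  counter <- newTVarIO 0
  done <- newEmptyMVar
  forM_ [1 .. writers] $ \_ -> forkIO $ do
    replicateM_ perWriter (appV (+ 1) counter)
    putMVar done ()
  -- Each writer signals once when it finishes.
  replicateM_ writers (takeMVar done)
  readTVarIO counter
```

Because every increment is its own transaction, no update is lost: the result should always equal writers * perWriter, with no explicit lock in sight.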

I don't know how well Haskell's STM performs compared with eg, database transactions or locking. It's been more than fast enough for my own needs. Anybody know?

tinco · on Feb 16, 2013

As far as I understand it, this simple example won't perform any better than if it were written using MVar. (i.e. it was explicitly locking reads and writes)

The difference is that STM actions can be nested inside one another within a single `atomically` and still work. If you do a readMVar whilst that MVar is already locked in a function above you in your call stack, you would have a deadlock.

So STM seems to be a smart way of keeping track of which locks the current execution thread has so you never double lock.

At least in this case :)
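A small sketch of that composition point, using hypothetical names (`withdraw`, `deposit`, `transfer`) that are not from the project: two STM actions are written separately and then combined into one transaction under a single `atomically`.

```haskell
import Control.Concurrent.STM (STM, TVar, atomically, modifyTVar', newTVarIO, readTVarIO)

-- Two small STM actions; neither runs a transaction by itself.
withdraw :: TVar Int -> Int -> STM ()
withdraw acct n = modifyTVar' acct (subtract n)

deposit :: TVar Int -> Int -> STM ()
deposit acct n = modifyTVar' acct (+ n)

-- Composed into one transaction: other threads see both updates or neither.
transfer :: TVar Int -> TVar Int -> Int -> IO ()
transfer from to n = atomically $ do
  withdraw from n
  deposit to n

-- Move 30 from an account holding 100 to an empty one and read both balances.
demoTransfer :: IO (Int, Int)
demoTransfer = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  transfer a b 30
  (,) <$> readTVarIO a <*> readTVarIO b
```

With MVars used as locks, the equivalent of `transfer` would have to take both locks itself and worry about ordering; here the two pieces are reused unchanged.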

stuffihavemade · on Feb 16, 2013

If you like this, check out http://www.happstack.com/docs/crashcourse/AcidState.html . It allows for arbitrary Haskell data structures to safely and automatically be marshaled in and out of the data store.

pestaa · on Feb 16, 2013

I really loved AcidState when I first learnt about it. Saves tons of headaches when converting Haskell's superb algebraic data types into dumb SQL types; but it also limits the points of entry into 1 (which means no outside queries/reports which SQL does very well).

Another feature it lacked is replication, which I believe was on the roadmap, so I'm not sure about its current state.

If you can live with the fact that your application is the only entry point for your data, definitely look into AcidState.

ac · on Feb 16, 2013

Replication and sharding (horizontal and vertical scaling) in acid-state are in development right now. The developer's page has the roadmap as well as his contact details: http://acid-state.seize.it. There's another project for distributed storage and data processing in Haskell called Holumbus http://holumbus.fh-wedel.de/trac which powers one of the search engines for Hackage APIs.

ac · on Feb 17, 2013

> but it [acid-state] also limits the points of entry into 1 (which means no outside queries/reports which SQL does very well).
> [...]
> If you can live with the fact that your application is the only entry point for your data, definitely look into AcidState.

Sorry for the double post, but I couldn't edit my previous reply anymore. You do have a point about SQL being more amenable to outside queries and reports, especially ad-hoc ones. However, take a look at Data.Acid.Remote (http://hackage.haskell.org/packages/archive/acid-state/0.8.2...), which allows you to connect remotely to an acid store in your running application; at this time, though, you will have to provide your own security solution for that. Also, for ad-hoc queries on your data you can probably bolt a dynamic Haskell evaluator/interpreter onto your application: see http://www.haskell.org/haskellwiki/Safely_running_untrusted_... for a possible solution. That is, of course, not safe (as in Haskell-safe, see the section about exploits), but neither are ad-hoc SQL queries on your live data store.

tinco · on Feb 16, 2013

    conv k v db = insert k v db

You could rewrite that as:

    conv = insert

Probably some remnant of you figuring out how you should write appV, which is a proper helpful function, but probably should be called updateTVar or something :)

mrhonza · on Feb 16, 2013

LOL, thanks! You're absolutely right. You hack and hack until you just can't see what's right in front of you.

jrockway · on Feb 16, 2013

hlint will also find things like this.

lclarkmichalek · on Feb 16, 2013

How do I store something with a \n in it ;)

Edit: While I posted this as a joke, I do love the way that redis avoids any kind of injection attack by prepending the message length. Of course, I understand that this project is not at the stage where it is meant to be deployed in production or where security becomes an issue; it's just interesting to see how the different communication methods have different consequences.
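A rough sketch of the length-prefix idea, loosely modeled on the shape of a Redis bulk string ("$<length>\r\n<payload>\r\n"). It is a simplification for illustration: `encodeFrame` and `decodeFrame` are hypothetical names, and it counts Chars in a String where a real implementation would count bytes in a ByteString.

```haskell
-- Encode a payload as a length-prefixed frame: "$<length>\r\n<payload>\r\n".
encodeFrame :: String -> String
encodeFrame payload = "$" ++ show (length payload) ++ "\r\n" ++ payload ++ "\r\n"

-- Decode one frame from the front of the input, returning the payload and
-- the unconsumed rest, or Nothing if the input is malformed or incomplete.
decodeFrame :: String -> Maybe (String, String)
decodeFrame ('$' : rest) =
  case span (/= '\r') rest of
    (digits, '\r' : '\n' : body)
      | [(n, "")] <- reads digits
      , n >= 0
      , (payload, '\r' : '\n' : remaining) <- splitAt n body ->
          Just (payload, remaining)
    _ -> Nothing
decodeFrame _ = Nothing
```

Since the reader takes exactly the announced number of characters, a newline inside the value is just data and cannot terminate the command early.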

dschiptsov · on Feb 16, 2013

No monads?!)

tinco · on Feb 16, 2013

Every single function in that Haskell file returns a monad except for the version which just returns a constant string.