Tachyon - Fault Tolerant Distributed File System (highscalability.com)
99 points by aespinoza 1193 days ago | 13 comments

Anyone know if it has the same single point of failure issues as HDFS (NameNode)?

Also, bummer that it's written in Java. I know that was for easy pluggability into Hadoop, but it makes it tough to write bindings for other languages.

You may want to look into http://hadoop.apache.org/docs/r1.1.2/streaming.html for that, or maybe Hadoop's HTTPFS (http://hadoop.apache.org/docs/r2.0.3-alpha/hadoop-hdfs-httpf...). If this is a drop-in replacement, a REST wrapper around the API could work as well: client bindings in any language would only need an HTTP library to take advantage of it.

Edit: https://github.com/amplab/tachyon/blob/master/src/main/java/... It appears to just be a thin wrapper around HDFS, so the NameNode issue and whatnot probably apply here.
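
To make the REST-wrapper idea a bit more concrete, here is a rough Python sketch, assuming nothing about Tachyon's actual client API. `InMemoryStore`, `make_handler`, and `start_server` are hypothetical names made up for illustration; the store is only a stand-in for whatever real file system client would sit behind the HTTP layer.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class InMemoryStore:
    """Hypothetical stand-in for a real file system client (NOT Tachyon's API)."""

    def __init__(self):
        self._files = {}
        self._lock = threading.Lock()

    def write(self, path, data):
        with self._lock:
            self._files[path] = data

    def read(self, path):
        with self._lock:
            return self._files.get(path)


def make_handler(store):
    """Build a request handler class bound to the given store."""

    class Handler(BaseHTTPRequestHandler):
        def do_PUT(self):
            # The URL path is used as the file path; the body is the content.
            length = int(self.headers.get("Content-Length", 0))
            store.write(self.path, self.rfile.read(length))
            self.send_response(201)
            self.send_header("Content-Length", "0")
            self.end_headers()

        def do_GET(self):
            data = store.read(self.path)
            if data is None:
                self.send_error(404, "no such file")
                return
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def log_message(self, fmt, *args):
            pass  # keep the sketch quiet

    return Handler


def start_server(store, host="127.0.0.1", port=0):
    """Serve the store over HTTP in a background thread (port 0 = any free port)."""
    server = ThreadingHTTPServer((host, port), make_handler(store))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With something like this in front, a client in any language only needs to issue `PUT /some/path` and `GET /some/path`. The extra HTTP hop is of course the kind of overhead a memory-speed file system is trying to avoid.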

Actually, any language with a Thrift binding can talk to Tachyon: https://github.com/amplab/tachyon/blob/master/src/thrift/tac...

I guess using a non-native API would introduce significant overhead and kill all the advantages of using the Tachyon filesystem.

I take it the resulting DFS can't be mounted, am I right? If so, that's a pity. By the way, are there any DFSs with similar properties (very fast, aggressive memory use) that can be mounted?

Why not? As far as I can tell, it'd be feasible to write a FUSE adapter that uses Thrift to talk to the Tachyon servers.

FUSE can be really ugly sometimes, though; kernel modules are better if you need any real reliability.

There it is. I didn't dig much into that.

> Anyone know if it has the same single point of failure issues as HDFS (NameNode)?

I think the NameNode SPOF issue in HDFS is also being dealt with:

"Prior to Hadoop 2.0.0, the NameNode was a single point of failure (SPOF) in an HDFS cluster. [...] The HDFS High Availability feature addresses the above problems by providing the option of running two redundant NameNodes [...]"


Hadoop 2.x lets you run a standby namenode with automatic failover.

Additionally, the newish QJM (Quorum Journal Manager) will allow those namenodes to write to a quorum of servers.
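
For anyone unfamiliar with the "write to a quorum" idea, here is a toy Python sketch of the general majority-ack rule. It is not the actual QJM protocol or any Hadoop API; `JournalNode` and `quorum_write` are hypothetical names, and the `up` flag just simulates whether a server is reachable.

```python
class JournalNode:
    """Toy stand-in for one journal server; `up` simulates reachability."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.log = []

    def append(self, txid, entry):
        # An unreachable node never acknowledges the write.
        if not self.up:
            return False
        self.log.append((txid, entry))
        return True


def quorum_write(nodes, txid, entry):
    """Send the entry to every node; succeed only if a strict majority acks."""
    acks = sum(1 for node in nodes if node.append(txid, entry))
    return acks >= len(nodes) // 2 + 1
```

With three nodes, the write still succeeds when one is down but fails when two are. Note that a failed write here still leaves the entry on whichever nodes were up; a real system needs a recovery step to reconcile that, which this sketch leaves out.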

Always cool to see different DFS options out there. That said, I personally won't find this very useful. It does not appear to have any POSIX support and thus would only work as an API/library-accessible storage share. Hadoop at least has ways of mounting, although I personally haven't used them.

Wow, so the bottom line: 1. Useful for streaming because of the nature of the file system. 2. Will probably lose the append nature of HDFS. 3. Retaining data in memory makes it hard to use with any of the HDFS-based databases like HBase. So raw MapReduce users should definitely try it...

I am extremely interested in this kind of technology. And to be honest, this one sounds very interesting. I am trying it tonight in the lab.
