Facebook open-sources LogDevice, a distributed storage for sequential data (logdevice.io)
336 points by cedricvg on Sept 12, 2018 | 118 comments

Can someone from FB chime in with some info on how much storage is needed for the logs/data? Say, for 1 GB of raw input logs from an HTTP server (nginx/apache), when stored in LogDevice would they take notably less space on disk (compression), or more (overhead)? This interests me for evaluating the resources/costs I'd need to prepare if I were to deploy it...

These numbers really depend on the compressibility of the content, compression scheme and the type of batching used. The metadata overhead is fairly minimal. LogDevice allows you to configure this on either the client, sequencer or rocksdb level.

Is some form of compression enabled by default, without having to tweak options?

No, compression is disabled by default. You can enable compression either on the storage layer, or by enabling batching and compression on the sequencer, or by using the buffered write API with compression on the client.
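To get a rough feel for how much batching matters for compression, here is a minimal sketch in Python. It uses zlib from the standard library as a stand-in for whatever compression scheme is configured, and the synthetic access-log lines and the `compressed_size` helper are made up for illustration, so the numbers say nothing about LogDevice's actual on-disk footprint.

```python
import zlib

def compressed_size(records, batch_size):
    """Total bytes after zlib-compressing records in batches of batch_size."""
    total = 0
    for start in range(0, len(records), batch_size):
        batch = b"".join(records[start:start + batch_size])
        total += len(zlib.compress(batch))
    return total

# Synthetic access-log lines; real nginx/apache logs will compress differently.
records = [
    f'10.0.0.{i % 250} - - [12/Sep/2018:10:00:{i % 60:02d} +0000] '
    f'"GET /item/{i} HTTP/1.1" 200 {500 + i % 100}\n'.encode()
    for i in range(2000)
]

raw = sum(len(r) for r in records)
per_record = compressed_size(records, batch_size=1)   # every record compressed alone
batched = compressed_size(records, batch_size=500)    # 500 records per compressed batch

print(f"raw: {raw} B, per-record: {per_record} B, batched: {batched} B")
```

Running the same measurement on a sample of your own logs is probably the quickest way to estimate the disk space you would need before adding replication on top.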

have you had a look at https://github.com/oklog/oklog https://www.youtube.com/watch?v=gWWK2eyZ-sc

I think it's fairly simple and might be enough. Can't comment on storage requirements though.

Dang, I was looking at oklog earlier, but looks like it is archived now...

Happy to finally see LogDevice open. We have been working on this for years now.

Hi Ahmed,

I imagine you looked at other solutions before starting this. A distributed log is a fairly simple idea to understand (hard to implement) but what pain point is being solved?

Seeing that it is written in C/C++ - would it be that logdevice is optimised purely for speed and responsiveness?

Can you give an overview of the differences from, e.g., Apache Kafka?

It seems very similar.

It's a very different architecture and design. You can head to https://logdevice.io/docs/Concepts.html to learn more about how LogDevice works.

In terms of function, LogDevice is similar to the core of Apache Kafka.

True, but Kafka has two very annoying features built into it:

- There is no many-to-many log recovery whereas -- for example in Pulsar/DistributedLog -- logs are stored in small segments and distributed to multiple nodes.

- Read scalability. Since all the log is stored in one node (with some replicas) the readers are bound to single disk sequential read capacity. Again Pulsar stores logs in segments that are distributed among broker nodes which helps a lot when there are many readers.

LogDevice has many-to-many rebuilding as well, and typically data for a log (similar to partition in Kafka) is spread relatively uniformly over the (potentially large, much bigger than replication factor) set of shards that hold data for that log.

I'm not sure how accurate your comment is regarding Kafka's annoying features given that Kafka has partitions, which "solve" all of the problems you stated.

No it doesn't, since a single partition is stored sequentially on one disk, which limits the consumers to the bandwidth of a single disk (say c1 reads the beginning of the partition and c2 the end of the partition). But in the case of Pulsar, c1 is most probably connected to a different node than c2.

LogDevice has this concept of "node set", which is the set of storage nodes that can be selected by the sequencer as recipients for a record or a block of records. A typical node set size is around 20-30 in our deployments. Each storage node in the node set contains a subset of the records (or blocks of records) of the log, we call that subset a log strand. The amount of IO capacity available to append records to a log or read records from a log scales with the size of the node set.

All of this is done while preserving the total ordering guarantee thanks to the separation of sequencing and storage.

The operator could for example set a bigger node set size for logs that are known to have multiple consumers and require more IO capacity.

At Facebook, we have use cases where a single consumer will need to replay a backlog of records in a log, sometimes hours or days worth of data, to rebuild its state. We call this a backfill. Node sets allow the IO to be spread across multiple disks, which improves backfill speed and helps reduce hotspots.

-- Adrien from the LogDevice team.
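The node set idea above can be sketched as a toy model: one sequencer numbers the records and spreads copies over a few nodes of the node set, and a reader merges the per-node strands back into a single ordered log. Everything here (the `Sequencer` class, `read_log`, uniform random selection, a node set of 6 with 2 copies) is an assumption for illustration and not the actual LogDevice placement logic.

```python
import heapq
import random

class Sequencer:
    """Hands out increasing sequence numbers and picks recipients."""
    def __init__(self, node_set, replication, seed=0):
        self.node_set = list(node_set)
        self.replication = replication
        self.next_lsn = 1
        self.rng = random.Random(seed)

    def append(self, payload, storage):
        lsn = self.next_lsn
        self.next_lsn += 1
        # Each record goes to a different random subset of the node set.
        for node in self.rng.sample(self.node_set, self.replication):
            storage[node].append((lsn, payload))
        return lsn

def read_log(storage):
    """Merge the per-node strands by LSN, dropping duplicate copies."""
    last = 0
    for lsn, payload in heapq.merge(*storage.values()):
        if lsn != last:
            yield lsn, payload
            last = lsn

node_set = [f"N{i}" for i in range(6)]
storage = {node: [] for node in node_set}   # node -> its strand of the log
seq = Sequencer(node_set, replication=2)
for i in range(10):
    seq.append(f"record-{i}", storage)

records = list(read_log(storage))
print(records[:3])
```

Each node only holds part of the log, so both appends and reads touch many disks, yet the reader still sees one total order because ordering comes from the sequencer's numbers rather than from where the data landed.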

splitting data into partitions would mean there's no total order on that data anymore, right?

Kafka only guarantees a total order within a given partition, not across them.

Does it require at least one fully dedicated FTE to set up, maintain, and use correctly?

Curious where logdevice would be a bad decision..

From what I can see this doesn't have built-in consumer balancing and offset storage, like Kafka does. It also lacks more exotic Kafka features like topic compaction and exactly-once processing.

In Kafka bulk reading is very cheap, the broker basically just calls sendfile() to send a file segment with compressed message chunks. On the other hand only the leader of a partition can serve requests, so you are often limited by bandwidth. It looks like LogDevice has to do a bit more work server side, but may be able to read from all servers with a replica.

Kafka stores more metadata in the record wrapper, like client and server timestamps and partition key.

There are client libraries for C++ and Python.

Operationally they look similar - both require a Zookeeper cluster, and both require assigning permanent ids to nodes.

It would be interesting to see some benchmarks comparing LogDevice with Kafka and Pulsar. That said, I suspect from the lack of buzz around Pulsar that Kafka isn't a performance bottleneck for most people using it.

Kafka is also very embedded everywhere now, with a big first-mover advantage. Pulsar already does everything Kafka does but also supports custom functions, per-message acknowledgements, and native cross-region replication.

Unfortunately it's hard to change something that already works. Most users don't hit the performance limits of their tools so they'll just continue using Kafka if it's already running.

Congrats Ahmed!

Very interesting! I hadn't heard of this before but I'd love to see it in action.

If anyone from the FB team or anyone using LogDevice wants to test performance with Optane SSDs (and compare to a NAND SSD), make a request by submitting an issue on our GitHub page: https://github.com/AccelerateWithOptane/lab/issues. I'll hook you up with a server hosted by Packet.

Martin Kleppmann seems to point out technologies for problems of similar patterns already exist - https://twitter.com/martinkl/status/1039938408393662465

Those are streaming/pubsub services though, this actually claims to be a store. I feel that's an important difference.

Do people just point their system journal at Kafka and wait for something to break?

At my previous job we built something similar to this out of rabbitmq and mongodb. I always wondered what the other big log companies used. Mongodb seemed like a pretty good fit, but a pure append only database might be even better. Trimming performance in MongoDB was subpar so we worked around it by creating a new collection for each day, trimming became a simple operation of dropping a collection at the end of each day.

> Those are streaming/pubsub services though, this actually claims to be a store. I feel that's an important difference.

> Do people just point their system journal at Kafka and wait for something to break?

Kafka can be used as a data store if you like, so long as you're happy with the data management and access patterns it gives you - it is, after all, optimised for large sequential reads.

LogDevice looks to be very similar for most use cases to Kafka, hell, they even use RocksDB, which is used by stateful operations in Kafka Streaming, and of course, Zookeeper.

Where it differs is that it looks like it was designed for you to be able to work against a single "cluster" that could well be running across multiple data-centres. Which is very much a Facebook problem to solve.

So yeah, Kafka was a distributed log built for LinkedIn size problems, LogDevice is a distributed log built for Facebook sized problems.

Most of us don't have Facebook sized problems.

What's a good distributed log for 10-dev sized companies? :)

If you're willing to go cloud, then Google's BigQuery is a very good fit for sequential data. It's fully queryable, unlike most log-oriented databases, and it's extremely cheap compared to competing offerings (e.g. AWS RedShift).

Apache Bookkeeper. Or install Apache Pulsar and get full messaging capabilities with tiering out to S3 for infinite retention.

OKLog, Humio, and Splunk are all worth checking out.

OKLog has been abandoned by the author (the project is now read only on GitHub).

Humio is not self-hosted or open source, so not really a fair comparison. It also seems targeted towards operational logs, i.e. system logging, traffic logging, auditing. Not things like data pipelines. Kafka and friends can be used for that kind of log, but they are more like databases; they use the term "log" in the sense of sequential and append-only.

Same goes for Splunk, which does have a self-hosted version, but is extremely expensive, last I checked. The SaaS version is also extremely expensive.

Humio does have a self-hosted version, but it is closed source. You can download a trial, and it supports most log types via support for open-source log shippers, i.e. Logstash and Beats, along with other popular formats including Kafka.


The UI is what simplifies analysis and visualization with live, real-time query and db.

AWS Kinesis + s3

Thanks! I'm guessing you're referring to Kinesis Streams? Is there an OOB solution to persist the records past the default 168hours, or is this something that you have to build out yourself following some pattern?

You can configure it to output to S3 and the mechanism for that is easy to configure and has fault tolerance.


All of those are similar systems and have persistence. I'm not sure what distinction "streaming" makes but they also all support multiple publishers and subscribers. Some only use local storage on the nodes while others can tier out to cold storage like S3.

MongoDB is a full OLTP document store so it won't match the write throughput and pubsub features of these focused systems. RabbitMQ on the other hand has performance limits but is meant for complex service-bus style routing and RPC uses, but I recommend using NATS for that now.

I had just stumbled across https://github.com/facebookincubator/python-nubia and am anxious to try it out. Was wondering about the internal project it was factored out from. This appears to be it.

Correct. LDShell in logdevice was the starting point of python-nubia.

The use cases overlap neatly with Kafka's. Everything from its usage of Zookeeper to its time-and-storage-based retention tuning is similar.

The announcement does not clarify the reason they use this over Kafka. Is it because Kafka doesn't scale to millions of logs on a single cluster, or is it because Kafka is not sympathetic to heterogeneous disk arrays containing SSD and HDD? I strongly suspect it may be latency of writes at scale, but this is pure speculation.

I don't know. If I understood why anyone might use this, I'd contribute to building language bindings for the APIs.

Some strengths of LogDevice include:

- It's designed to work with a large number of logs (roughly equivalent to partitions in Kafka), hundreds of thousands per cluster is common.

- Sequencer failover is very quick, typical failover time when a sequencer node fails is less than a second.

- It supports location awareness and can place data according to replication constraints specified (e.g. replicate it in 3 copies across 2 different regions and 3 racks).

- Because of non-deterministic data placement, it is very resilient to failures in terms of write availability.

- If a node/shard fails, it detects the failure and rebuilds the data that was replicated to failed nodes/shards automatically

> Because of non-deterministic data placement, it is very resilient to failures in terms of write availability.

I am happy to expand more on this point.

We have this concept of "node set" of a log, which is the set of storage nodes available to receive record copies sent by the sequencer. It is made of 20-30 nodes in typical deployments at Facebook. Write availability is maintained as long as enough storage nodes in the node set are available to accept copies. When storage node failures are detected, the sequencer can just exclude these nodes from the list of potential recipients for new records. It does not need to update a view that needs to be synchronized with readers, which is a heavy-weight operation. This model allows preserving high write availability even if many nodes in the node set are unhealthy.

Additionally, this record copy placement flexibility allows the sequencer to quickly route around latency spikes on individual storage nodes, which helps guarantee low append latency.
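As a rough illustration of why this keeps writes available, here is a small Python sketch of the exclusion step. The function name `pick_recipients`, the uniform random choice, and the numbers (20 nodes, 3 copies) are assumptions for the example; it also ignores location-aware replication constraints.

```python
import random

def pick_recipients(node_set, unhealthy, replication, rng):
    """Choose `replication` healthy nodes, or None if too few are available."""
    candidates = [n for n in node_set if n not in unhealthy]
    if len(candidates) < replication:
        return None  # not enough healthy nodes to store all copies right now
    return rng.sample(candidates, replication)

rng = random.Random(42)
node_set = [f"N{i}" for i in range(20)]

# 17 of 20 nodes are down: an append with 3 copies can still be placed.
down = set(node_set[:17])
print(pick_recipients(node_set, down, 3, rng))

# With 18 down, only 2 healthy nodes remain and the append cannot be placed.
down.add(node_set[17])
print(pick_recipients(node_set, down, 3, rng))
```

The point of the sketch is that skipping a bad node is a purely local decision by the sequencer: nothing about the node set has to be renegotiated with readers.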

> Is it because Kafka doesn't scale to millions of logs on a single cluster

I doubt that's it, since Kafka can certainly do that.

Millions of separate topics on a single Kafka cluster? The way it's designed requires opening files for all of those topics and their partitions so good luck if you're trying that. You'll run out of file handles, then memory, and then the disk access will completely freeze up.

I didn't think we were speaking of millions of topics here; only millions of logs. You can certainly have logs numbering in the millions using a single topic. Mux/demux would have to happen at the producer/consumer side, of course.
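A minimal sketch of what that mux/demux could look like, in Python, with plain lists standing in for a topic. The record layout and the `mux`/`demux` helpers are hypothetical and do not use any Kafka client API.

```python
from collections import defaultdict

def mux(log_id, payload):
    """Producer side: tag each record with the logical log it belongs to."""
    return {"log_id": log_id, "payload": payload}

def demux(topic_records):
    """Consumer side: split one shared stream back into per-log streams."""
    logs = defaultdict(list)
    for record in topic_records:
        logs[record["log_id"]].append(record["payload"])
    return logs

shared_topic = [mux("log-a", "a1"), mux("log-b", "b1"), mux("log-a", "a2")]
per_log = demux(shared_topic)
print(dict(per_log))
```

Note that in this scheme a consumer interested in one logical log still has to read past the records of every other log sharing the topic.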

Do you mean log segments then? In that case I don't see what's special about it because that's just rolling files and all of these systems can handle millions that way.

As far as millions of topics, if you have to do it at a logical layer yourself, then you might as well use a system that supports it natively.

The logs in LogDevice also have an independent lifecycle, which your solution doesn't allow.

A log in LogDevice is roughly equivalent to a Kafka partition.

It does not. I've lost a lot of time profiling Kafka perf issues against clusters on the exact same hardware with the exact same traffic but with a 3000% throughput difference. The root cause was that one cluster had a lot of empty test topics.

Try benchmarking Kafka from 0 partitions to a few thousand partitions in 100 partition increments. The benchmark only needs to write to a single topic, using their provided producer perf tool while all other topics are inactive with zero data.

As the partitions increase there is a very noticeable drop in throughput that looks to be linear.

Kafka does not handle a large number of partitions well currently, large even being low thousands. It's easy to hit with just a few hundred topics.

Reading between the lines when LinkedIn and Netflix advertise several clusters, I am guessing they shard the data.

I didn't think we were speaking of millions of topics or partitions here; only millions of logs. You can certainly have logs numbering in the millions using a single topic. Mux/demux would have to happen at the producer/consumer side, of course.

Great to see this released. Some similar architecture decisions to Apache Pulsar as well, with the separation of compute (in this case the sequencer) from the storage.

Kafka has done well so far, especially in making streaming systems more common, but it's about time for the next-gen systems.

How does LogDevice differ from Kafka?

Kafka brokers handle both the computation (partition/topic management, sequencing, assignments, etc) and storage together. This coupling creates scaling and operational challenges which LogDevice removes by separating the layers. Storage nodes can be as simple as object stores (but optimized for appending files) and use multiple non-deterministic locations for a given piece of data to randomize placement. They read, write and recover data very quickly by working together in a mesh.

Meanwhile the compute layer becomes very lightweight and almost stateless, which is easy to scale. In LogDevice, the Sequencers are potential bottlenecks but generating a series of incrementing numbers is about the fastest thing you can do so it'll outpace any actual data ingest to a single log, while giving you a total order of all entries within that log. The numbers (LSNs) follow the Hi/Lo sequence pattern so if a Sequencer fails, another one takes its place with a greater "High" number, so it's guaranteed that all of its LSNs will be greater than the previous Sequencer as a result. This also provides a built-in buffer to still accept messages and assign the permanent LSNs to them after recovery in case a Sequencer fails.
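The Hi/Lo idea described above can be sketched in a few lines of Python. This is only an illustration: the 32-bit width of the low part and the names `make_lsn` and `split_lsn` are assumptions for the example, not taken from the LogDevice source.

```python
OFFSET_BITS = 32  # assumed width of the "Lo" part, for illustration only

def make_lsn(epoch, offset):
    """Pack an epoch (high part) and a per-epoch offset (low part) into one number."""
    assert 0 <= offset < (1 << OFFSET_BITS)
    return (epoch << OFFSET_BITS) | offset

def split_lsn(lsn):
    """Recover (epoch, offset) from a packed sequence number."""
    return lsn >> OFFSET_BITS, lsn & ((1 << OFFSET_BITS) - 1)

old = make_lsn(epoch=7, offset=1_000_000)  # issued by the sequencer that later failed
new = make_lsn(epoch=8, offset=1)          # first number from its replacement

print(old < new, split_lsn(new))
```

Because the epoch sits in the high bits, any number issued in a later epoch compares greater than every number from an earlier one, regardless of how far the old sequencer had counted.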

Apache Pulsar is similar to LogDevice but goes further where brokers manage connections, routing and message acknowledgements while data is sent to a separate layer of Apache Bookkeeper nodes which store the data in append-optimized log files.

Interesting. Microsoft's Tango paper had some interesting things to say about sequencers/sequences as well.

Amazing! We had lots of operational issues because of the coupling you mentioned.

One question though: will Presto support querying from LogDevice directly? :)

I worked on LogDevice at FB until about 6 months ago.

I'm not that familiar with Kafka, but in general LogDevice emphasizes write availability over read availability. There are many applications where data is being generated all the time, and if you don't write it, it will be lost. However, if reading is delayed, it just means readers are a little behind and will need to catch up.

So, when a sequencer node dies and we need to figure out what happened to the records that were in flight -- which ones ended up on disk & can be replicated, what the last record was -- LogDevice still accepts new writes. However, to ensure ordering, these new writes aren't visible to readers until the earlier writes are sorted out.

What happens with a sequencer which appears to fail but hasn't really, and then comes back up after the process to figure out inflight records has completed? If that sequencer receives a record for a log, will it be able to write it to the storage nodes? I.e. is there any fencing mechanism to tell the storage nodes that the epoch has been bumped, so don't accept writes for that epoch anymore?

yes, in LogDevice it's called "sealing". However, as it stands, a newly activated sequencer won't wait for sealing on the old epoch to complete before taking new writes - in the tradeoff between write availability and consistency LogDevice picks higher availability. Blocking new writes until sealing is complete, however, should be fairly easy to integrate into LogDevice as an option.

Does it block reads until sealing is complete? How many nodes in the nodeset have to respond before sealing is complete? [NodeSet] - [ReplicationFactor] + 1?

Yes, reads are not released (i.e. are blocked) until sealing is complete. We call the minimal set of nodes sufficient to serve reads for a log (the same set is needed for sealing to complete) an f-majority.

For a simple case where placement of data is location-agnostic, indeed the definition of f-majority is n - r + 1, where n is the nodeset size, and r is the replication factor.

However, if your replication property, is say, "place 3 copies across 3 racks", then the definition of f-majority becomes more complicated - e.g. having all nodes in the nodeset respond minus two racks will also satisfy it.
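For the location-agnostic case, the n - r + 1 rule can be checked by brute force on a small nodeset. The reasoning: if n - r + 1 nodes have responded, at most r - 1 have not, so no record with r copies can live entirely on the silent nodes. The Python below is a sketch with hypothetical helper names and does not model rack-aware or region-aware placement.

```python
from itertools import combinations

def f_majority(n, r):
    """Responders needed so that every r-copy placement has a copy among them."""
    return n - r + 1

def always_intersects(n, r, k):
    """True if every k-subset of n nodes shares a node with every r-subset."""
    nodes = range(n)
    return all(
        set(responders) & set(copies)
        for responders in combinations(nodes, k)
        for copies in combinations(nodes, r)
    )

n, r = 6, 3
k = f_majority(n, r)
print(k, always_intersects(n, r, k), always_intersects(n, r, k - 1))
```

With a nodeset of 20 and a replication factor of 3, the same formula gives 18 responders.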

> Yes, reads are not released (i.e. are blocked) until sealing is complete.

Which are the cases where consistency is compromised then? If a client of the log needs consistency, it needs to ensure that it has seen all previous updates to a log before making a new update, which implies a read.

> However, if your replication property, is say, "place 3 copies across 3 racks", then the definition of f-majority becomes more complicated - e.g. having all nodes in the nodeset respond minus two racks will also satisfy it.

Sure, the aim being that no write can be successfully acknowledged by enough replicas to complete the write.

> Which are the cases where consistency is compromised then? If a client of the log needs consistency, it needs to ensure that it has seen all previous updates to a log before making a new update, which implies a read.

Consistency in a more general sense than just read-modify-write consistency. If you have sequencers active in several epochs at the same time accepting writes, the records may end up being written out of order, and there would be a breakage of the total ordering guarantee.

> Consistency in a more general sense than just read-modify-write consistency. If you have sequencers active in several epochs at the same time accepting writes, the records may end up being written out of order, and there would be a breakage of the total ordering guarantee.

But given that reads are blocked on all sequencers before the current one, this should still provide total order atomic broadcast, unless a single client can connect to a sequencer with a lower epoch than one it has already seen.

LogDevice clients do notify sequencers if they have seen newer epochs, which would cause a sequencer reactivation, which indeed resolves the issue within the context of a single client.

However, there can still be reordering in the context of a wider system. E.g. if client A sends a write (w1) to sequencer in epoch X, which gets replicated and acknowledged, and after that client B sends a write (w2) to sequencer in epoch (X-1) which gets replicated and acknowledged (because epoch X-1 is not sealed), then readers eventually will see w2 before w1. If writes in epoch X weren't accepted before the sealing of the epoch (X-1) had completed, this reordering would be impossible, however as a result write availability would suffer.
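The w1/w2 scenario can be written down as a tiny Python model. It assumes readers deliver records ordered by (epoch, offset within the epoch); the tuple layout and the concrete numbers are made up for the example.

```python
# Each acknowledged write: (name, epoch, offset within epoch), in wall-clock ack order.
X = 5
acked_in_time_order = [
    ("w1", X, 1),      # client A, sequencer in epoch X, acknowledged first
    ("w2", X - 1, 9),  # client B, stale sequencer in unsealed epoch X-1, acknowledged later
]

# Readers deliver by (epoch, offset), not by acknowledgement time.
read_order = [
    name
    for name, epoch, offset in sorted(acked_in_time_order, key=lambda w: (w[1], w[2]))
]

print(read_order)
```

Every reader sees the same order, so the log itself stays consistent; what is lost is agreement between that order and the real-time order in which the two writes were acknowledged.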

Ok, but for this to be problematic, readers would need to have some other mechanism to know that w1 did actually take place before w2. So FIFO instead of total order.

Anyhow, thanks for answering my questions. Very interesting system.

Ah, I think you may be talking about the repeatable reads property? All readers in LogDevice are guaranteed to see the same records in the same order (aside from trimmed data).

What I was wondering really, was whether LogDevice provides total order atomic broadcast, and as such whether it solves consensus. It appears it does (or rather, it daisychains on the consensus provided by zookeeper and uses its own fencing mechanism, similar to what bookkeeper/Pulsar does).

Scribe is the Facebook-internal Kafka equivalent. LogDevice is the storage layer used by Scribe.

Scribe isn’t the only place where LogDevice is used though — Facebook has documented using it for TAO as well (as part of the secondary indices)

I don't believe Scribe and Kafka are equivalent. Isn't Scribe at-most-once?

Unless we're talking about two different projects named Scribe, which is certainly possible.

As lclarkmichalek said, there’s more than one way to skin a cat.

At any rate, I used “equivalent” here to mean that, while different trade-offs have been made, it has the sort of users building the same sort of applications on the same sort of abstractions — it plays the same role, for all intents and purposes.

Scribe doesn't really make decisions about that, it doesn't store checkpoints for readers. Readers are commonly more-than-once.

> Scribe doesn't really make decisions about that

It kind of does, on the write pipeline. Tailers vary, but scribed controls the semantics of how your message gets delivered to LogDevice.

I’m wondering this as well. In the description it says that it ensures total ordering while Kafka only ensures partition ordering. I haven’t read enough to say more.

Awesome, I have been waiting for this since seeing the @scale talk about it. https://atscaleconference.com/videos/logdevice-a-file-struct...

Is there supposed to be a replay of that talk on the site you link to or is it just not loading for me?

The event is hosted by Facebook and there is an embedded Facebook video player on that page. Here's the direct link: https://www.facebook.com/atscaleevents/videos/19602876909109...

Thanks! My script and adblockers must've broken the site.

The amount of great quality open source projects from Facebook just keeps growing. I really like the consistency guarantees:


And it uses RocksDB under the hood:


Thanks for open-sourcing that; it looks like a great project.

Could someone from the LogDevice team give a bit of information about the scale it is used at within Facebook?

- How many records can this thing ingest per day?

- Are there any limitations on the maximum number of storage nodes?

- What would be your maximum and advised record size for production usage?

- ZooKeeper seems to be the central point, used as the epoch provider. Did you encounter any scaling limitations or a maximum number of clients due to that?

I cannot give you exact numbers, but here is some information that might be useful:

- LogDevice ingests over 1TB/s of uncompressed data at Facebook. This was already highlighted in last year's talk at the @Scale conference.

- The maximum number of storage nodes in a cluster, as defined by default in the code, is 512. However, you can use --max-nodes to change that. There is no theoretical limit there. Each LogDevice storage daemon can handle multiple physical disks (we call them shards). So, if you have 15 disks per box and 512 servers, that's 7680 total disks in a single cluster.

- The maximum record size is 32MB. However, in practice, payloads are usually much smaller.

- Zookeeper is not (currently) a scaling limitation as we don't connect to zookeeper from clients (as long as you are sourcing the config file from the filesystem and not using zookeeper for that as well).

Hope that helps.

Very interesting!

I like the idea of decoupling compute from storage for streaming/log data.

I wonder if it would be easy to make it run under Consul, instead of ZooKeeper.

We use Zookeeper primarily for the EpochStore. This is the abstraction that you can use if you want to replace Zookeeper. It shouldn't be that hard as long as Consul offers the same guarantees as zookeeper.

Am I the only one puzzled by


Store up to a million logs on a single cluster. ?

This sounds pretty confusing / low volume.

logs = topics, so they mean 1M separate topics on a single cluster.

Makes more sense. Thank you!

What benefit to facebook is there from open sourcing technology they have developed?

Facebook's competitive advantage doesn't come from having the best reliable streaming data store at scale, or from its software in general. Even if MySpace, Friendster or Google + got their hands on the whole software stack & started running it, people would stick with Facebook.

So there's no cost to open sourcing. The benefit comes from being known as technically innovative in general, and for recruiting, being known as having interesting, meaty, challenging projects to work on.

The impetus usually comes from team members who want to do the work. It could be to become known for having worked on the project, or a sense of giving back to the community, or a hope that you'll get bug fixes & features from outside contributors. In my (very limited) experience, managers "passively encourage" it -- they generally don't push the team to do it, but when the team asks, they encourage it.

If that's true, why haven't they open sourced Haystack? Clearly they're holding onto it due to competitive advantage.

I don't know, but my guess is nobody associated with it wants to put their other work on hold to make it happen. From my limited experience, nobody pushes you, and nobody blocks you. So it depends a lot on the motivations of the engineers on the project.

>So there's no cost to open sourcing

Not true, there's a legal cost associated with making sure something is really ready for public eyes.

Look at React - if it had never been open sourced, Facebook might still be using it, but it wouldn't be the same thing it is today. For one, basically all of the current excellent React team probably wouldn't be working at Facebook. And it would be far, far harder for Facebook to recruit engineers for product teams who were proficient in React. Since it is open source and very popular, the odds of a browser introducing a change that unfixably hurts React's performance is now very low. Et cetera.

The cost of maintaining an open source project is real, but when it is a world-class piece of infrastructure, open sourcing it helps keep it world-class.

Branding / Marketing / and more importantly, attracting talent to work for them.

Institutional dependency?

Awesome to see this finally happening :)

Previous discussion in HN: https://news.ycombinator.com/item?id=15142266

I don't see anything about trust requirements or verification. Does LogDevice assume that all devices in my cluster are trusted?

LogDevice uses SSL for authentication. This can be enabled for both clients and servers [1].

[1] https://logdevice.io/docs/Settings.html#security

That's not what I mean though. What if I have a cluster with devices I don't trust, but I want to let them emit logs if they conform to a particular protocol. Like, will this thing check signatures for me and such?

Since it doesn't say anything about trustlessness, I assume that it assumes that all nodes are trusted.

LogDevice is crash fault tolerant not byzantine fault tolerant if that's what you're asking. This fault tolerance is in regards to where the logs are placed not who's emitting them though. If you want to analyse logs for inconsistencies or attack patterns you should look into something like SEAMS/REAMS, it's completely out of scope for LogDevice.

LogDevice is payload-agnostic and doesn't inspect the value of the binary blobs it stores. If your writer is allowed to write according to ACLs, LogDevice will happily take writes from it regardless of their content (up to the max payload size). Verification of payload content should be done on a layer above LogDevice - either before taking the write or when reading.
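One way such a layer could look, as a hedged sketch in Python: the writer prepends an HMAC-SHA256 tag to each payload and the reader checks it before trusting the record. The `seal` and `unseal` helpers and the shared key are hypothetical, and nothing here calls into LogDevice itself; the sealed blob would simply be the opaque payload handed to the append call.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes

def seal(key, payload):
    """Writer side: prepend an HMAC-SHA256 tag to the payload."""
    return hmac.new(key, payload, hashlib.sha256).digest() + payload

def unseal(key, blob):
    """Reader side: return the payload, or None if the tag does not match."""
    tag, payload = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None

key = b"shared-secret-for-this-writer"  # hypothetical key, for illustration
blob = seal(key, b"GET /index.html 200")

print(unseal(key, blob))
print(unseal(key, blob[:-1] + b"X"))  # tampered payload is rejected
```

A shared-secret tag only proves the record came from someone holding the key; if writers must not be able to impersonate each other, per-writer keys or public-key signatures would be needed instead.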


All daemons and system administration utilities belong in sbin, because bin is for end-user applications.

Historically, the "s" in sbin meant something else, but it always contained applications and scripts only root could run.

When I see these examples, it's depressing to see just how much understanding of UNIX is missing.

Maybe sending a PR would help?

It's for Linux only, and I run illumos-based SmartOS on my own infrastructure.

That's not the point. The point is that all these generation Y kids grew up on PC buckets and still don't understand UNIX and the concepts behind it, and yet they use it to power their applications. This can only end badly unless they start making an effort to understand the concepts behind the substrate they are writing software for.

A few things. First, can't you run Linux inside a Solaris Zone? I don't know much about Solaris stuff (although I do like it very much, I grew up mostly with Linux, which you so much despise, and I'm not too familiar with other Unixes). So... I think you could probably run Logdevice if you really wanted.

Then, here's my two cents. When engaging in conversation and civil dialog, please try to avoid being so dismissive and so proud of yourself and of how much you think you know about stuff. You come across as abrasive and entitled. It's not nice to just jump into a conversation and talk trash about the work of others just because you dislike the operating system that they use.

Finally, if you really care, work on porting it to your operating system of choice and engage in civil conversation doing pull requests, etc. Everybody will be thankful for that.

> You come across as abrasive and entitled.

I recognize the good intention here, but if you're going to post this kind of comment, please try to eliminate the personal provocations. They don't help, and do hurt.


Please follow the site guidelines.


Which guideline do you believe I did not follow? (I hope you write back face to face, because I would tell him the same thing in person and then some, and enjoy every microsecond of it.)

The comment was uncivil and snarky.

Regardless of your temperament, can you please not be abrasive/aggressive on HN? It encourages worse from others and leads to a toilet-whirlpool effect.

The person in question is one of the authors of "logdevice", which is relevant to that person's reaction: instead of saying "we didn't know about sbin, we'll fix that" it seems it was easier to just write a tractate trying to teach me, multiple decades older than them, manners. What was aggressive?

I will let the generation Y kids know your opinion ;)

External logging service is my favorite way of doing replication. It provides nice features. Specifically:

- Cross vendor replication which makes migration much easier.

- No dependency on vendor provided replication protocols.

- Ability to use in-app databases such RocksDB, SQLite, ...

- Upgrading DB nodes becomes way easier since they are totally separated from each other.

How does that fit in a ML training pipeline? (this is mentioned on the page)

It's just streaming data but more scalable and with total ordering which can be important for ML.

Sounds like it might have been influenced by the MSR CORFU project (separate sequencer, write striping). Can anyone confirm?

It's hard to deny that there is at least some influence there. Like LogDevice, the zlog project [0] is influenced by CORFU (separate sequencer, write striping), but both use different storage interfaces / strategies.

[0]: https://github.com/cruzdb/zlog

Is there any comparison with other similar storages?

This is a lot more similar to Apache BookKeeper.

this is like....the harder half of a whole blockchain project :D super interesting

Is this a Kafka competitor?
