
Scribe: Transporting petabytes per hour via a distributed, buffered queueing system - gluegadget
https://engineering.fb.com/data-infrastructure/scribe/
======
dekhn
braggy PR is misleading: the 25GB/s coming from CERN is after they filter the
data down from 600TB/s because there are no commercial systems that can
capture data at higher rates.

~~~
breck
This is a good point!

Just for fun, for more perspective on big data, a human body generates around
1-10M new cells per second, and a cell contains about 10-100GB of information.
So a single human is generating 1-100PB/s of data just in the new cells! (Give
or take a few OOM)
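
A quick multiplication of those figures, taking the ranges above at face value
(just a sketch; both ranges are the rough assumptions stated above):

    # order-of-magnitude sketch using the ranges quoted above (all assumed)
    cells_per_second = (1e6, 1e7)     # ~1-10M new cells per second
    bytes_per_cell = (10e9, 100e9)    # ~10-100 GB of information per cell

    low = cells_per_second[0] * bytes_per_cell[0]   # 1e16 B/s ~ 10 PB/s
    high = cells_per_second[1] * bytes_per_cell[1]  # 1e18 B/s ~ 1000 PB/s
    print(f"{low / 1e15:.0f}-{high / 1e15:.0f} PB/s")  # within an OOM of the above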

~~~
throwaway_bad
Are you trying to quantify the "information" by the size of the DNA? I think
this is a pretty meaningless number to multiply out, since most of the DNA will
be exact copies and DNA alone doesn't capture all the information about a cell.

OTOH the amount of "information" needed to perfectly simulate a cell is
probably unbounded. Just a corollary of the fact that we currently don't know
how to perfectly simulate reality. Even a single "real" number can take up
infinite space.

~~~
glenvdb
> OTOH the amount of "information" needed to perfectly simulate a cell is
> probably unbounded. Just a corollary of the fact that we currently don't
> know how to perfectly simulate reality.

This is a very good point. The 'information' in a cell isn't the base pairs in
its DNA, but all the atoms that make up the whole cell. And then each atom
encapsulates properties such as position, velocity, charge, van der Waals
radius etc.

However this considers atoms with classical mechanics. In a quantum mechanical
representation it would be very different again and you can start asking
really hairy questions about whether information can be created or destroyed.

------
londons_explore
Keep in mind the network cost of petabytes per hour cross continent.

Those of us who don't own our own cross-ocean fiber can't afford to design
systems like this.

~~~
carapace
Never underestimate the bandwidth of a cargo ship full of SSDs.

(I'm paraphrasing an old, old joke.)

~~~
noir_lord
Fairly high latency though I guess.

~~~
trhway
Using a jet for trans-ocean data delivery gives you several hours of latency -
acceptable for logs - at a cost on the order of $0.1-0.3/TB (really depends on
the napkin used for the estimate).
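
A rough sketch of that napkin math; every figure below is an assumption rather
than an actual freight quote:

    # hypothetical air-freight napkin math; all numbers are assumptions
    drives = 1000                   # drives per shipment
    tb_per_drive = 16               # TB per drive
    shipment_tb = drives * tb_per_drive           # 16,000 TB

    freight_cost_usd = 3000         # assumed trans-ocean air-freight cost
    cost_per_tb = freight_cost_usd / shipment_tb  # ~$0.19/TB, in the range above

    flight_hours = 10               # assumed door-to-door latency
    effective_gbps = shipment_tb * 8e3 / (flight_hours * 3600)
    print(f"${cost_per_tb:.2f}/TB at ~{effective_gbps:,.0f} Gb/s effective")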

~~~
derefr
Big capital costs in setting up the number of parallel writers required on one
end and readers required on the other, though. And presumably human "IT
teamster" labor, hooking and unhooking drives.

------
100k
Naming is hard! Facebook used to have a _different_ Scribe
([https://en.wikipedia.org/wiki/Scribe_(log_server)](https://en.wikipedia.org/wiki/Scribe_\(log_server\))).

We used it at a company I worked for, but it had long since been deprecated, so
I was confused when I saw this Scribe.

~~~
johndoe345
This is the same Scribe. Facebook closed-sourced it because it was too hard to
maintain both an open version and a version that addressed Facebook's needs.

~~~
naringas
I wonder why they couldn't just make the open source version address their
needs...

~~~
javiermaestro
The version that was open-sourced kept evolving and integrating with other
internal systems at Facebook. That's what made it hard to continue open-
sourcing it (why the open-sourced version was discontinued) and why the
current version is also hard to open-source.

Maybe one day we'll have a version available. In any case, one of the larger
parts of the system (LogDevice) is open source :)

(disclaimer: I work in Scribe)

LogDevice: [https://engineering.fb.com/core-data/open-sourcing-
logdevice...](https://engineering.fb.com/core-data/open-sourcing-logdevice-a-
distributed-data-store-for-sequential-data/)

------
dividuum
Is it normal for these internal systems to not implement any kind of access
control? From the post it seems like every reader can access every stream?

~~~
javiermaestro
That's actually not the case, there's access control :)

The article just focuses on certain areas of the system and doesn't go into
the security and privacy parts, that's all.

(I work in Scribe)

------
pvlak
Why can't the Producer/Scribed write directly to LogDevice storage? Are there
reasons for routing through the WriteService?

~~~
thetrooperer
That's a good question. There are multiple reasons for this. I'll briefly
mention two of them. One is the high fan-in ratio - millions of machines are
writing relatively small blobs of data, so the middle layer serves as an
aggregator (which saves the backend's IOPS, number of connections, etc.).
Another reason is the volume of metadata - it would be inefficient to keep all
the LogDevice-level metadata on each of the producer hosts.
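
As a sketch of the fan-in idea (hypothetical names and interfaces, not Scribe's
actual API), a middle tier might buffer many small producer writes and flush
them to the backend as a few large batches:

    # hypothetical write-aggregating middle tier; the backend interface is assumed
    import time

    class WriteAggregator:
        def __init__(self, backend, max_bytes=1 << 20, max_delay_s=0.5):
            self.backend = backend          # e.g. a LogDevice/Kafka client (assumed)
            self.max_bytes = max_bytes      # flush once ~1 MB is buffered...
            self.max_delay_s = max_delay_s  # ...or after 500 ms, whichever is first
            self.buffer = []
            self.buffered_bytes = 0
            self.last_flush = time.monotonic()

        def write(self, stream, payload: bytes):
            self.buffer.append((stream, payload))
            self.buffered_bytes += len(payload)
            if (self.buffered_bytes >= self.max_bytes or
                    time.monotonic() - self.last_flush >= self.max_delay_s):
                self.flush()

        def flush(self):
            if self.buffer:
                # one large append per batch instead of one tiny write per producer
                self.backend.append_batch(self.buffer)
                self.buffer = []
                self.buffered_bytes = 0
            self.last_flush = time.monotonic()

Thousands of producers funneling into a handful of aggregators like this turns
millions of tiny writes into a much smaller number of backend operations and
connections.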

~~~
pvlak
Would the WriteService (aggregator) make sense for environments with thousands
of machines (not millions) that are all within the same data center? In our
company, we are moving away from this design of having aggregators toward
writing directly to storage wherever possible, as it reduces message loss.

On the volume of metadata held by producers, will there be any significant
difference between holding WriteService metadata vs. LogDevice metadata?

~~~
thetrooperer
The devil is probably in the details, but if you have a single datacenter and
all writes are coming from thousands of machines ("edge"), then yes, it may
make more sense to set up a single LogDevice / Kafka cluster and have all the
edge hosts write to it directly.

------
bradhe
So, if I'm reading this correctly, 2.5GB/s of log data being generated? If we
assume (aggressively) that they have 5mil machines in their infrastructure,
doesn't that mean that each machine would have to be generating 500kB/s of log
data?

Despite that, I find the claims to be underwhelming. So your system can
process massive amounts of data by scaling massively horizontally...neat.

~~~
javiermaestro
The number in the article is 2.5 TB/s, not GB/s :)
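
For scale, with the corrected unit and your assumed 5 million machines, the
per-machine figure works out to roughly what you computed:

    total_bytes_per_s = 2.5e12   # 2.5 TB/s from the article
    machines = 5e6               # the parent's assumed fleet size
    print(f"{total_bytes_per_s / machines / 1e3:.0f} kB/s per machine")  # ~500 kB/s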

(disclaimer: I work in Scribe)

~~~
bradhe
Right—sorry. But point still stands. Under what circumstances was that much
data being generated from (what I’m assuming is) normal logging?

~~~
javiermaestro
I'm not following. I understood from your first comment that you think the
amount of data is low ("underwhelming") and from your last comment that it's a
lot ("that much data").

In any case, the data is "whatever needs to be logged".

And it's not "server logs", which is what I'm interpreting from your comment.
Scribe transports most data at Facebook to be processed by real-time systems
(e.g. Puma, Scuba) and also "batch systems" (data warehouse). So, it's quite a
lot, being "the ingestion pipe" for Facebook.

Does this answer your question? :-?

Puma: [https://research.fb.com/publications/realtime-data-
processin...](https://research.fb.com/publications/realtime-data-processing-
at-facebook/)

Scuba: [https://research.fb.com/publications/scuba-diving-into-
data-...](https://research.fb.com/publications/scuba-diving-into-data-at-
facebook/)

~~~
bradhe
> So, it's quite a lot, being "the ingestion pipe" for Facebook.

I see. I walked away from the article with the impression that it was meant to
be a log aggregation service a la flume, splunk, or logstash.

> the amount of data is low ("underwhelming") and from your last comment that
> it's a lot ("that much data").

I was remarking on the numbers with regard to generation, not consumption.
Based on the article, my estimate was pointing out that generating 2.5TB/s of
transactional logs and telemetry data using "millions" of machines would be
technically possible but not reasonably practical...and thus likely not real
;). But you corrected my understanding: that number isn't from normal logging
at all; it's a different, much broader use case.

------
ninju
How does this compare to a robust Splunk infrastructure?

~~~
packetslave
For what you'd pay for a Splunk license that can handle petabytes/hour of
data, it would probably be cheaper to just buy Facebook and use Scribe. :-)

