
Crossing the 1 Tb/s throughput threshold for object storage with NVMe drives - edogrider
https://blog.min.io/performance-at-scale-minio-pushes-past-1-3-terabits-per-second-with-256-nvme-drives/
======
mrb
Fabric-wise, PCIe 4.0 offers 16 GT/s per lane (roughly 16 Gbit/s raw), so a
single PCIe x16 device would have 256 Gbit/s of bandwidth available (in each
direction), and a single server could reach 1 Tbit/s with four NVMe devices in
four PCIe slots. There is the overhead of 128b/130b encoding to be considered,
but that's minimal (~1.5%). There is also the overhead of PCIe packet headers
to consider, but if the NVMe device supports large maximum payload sizes
(1 kB+) it should be minimal.
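
A quick back-of-the-envelope check of that arithmetic (a sketch in Python; the
lane rate and encoding efficiency are the figures above, and packet/TLP
overhead is ignored):

    # PCIe 4.0 bandwidth sanity check, per the numbers above.
    LANE_RATE_GBPS = 16.0            # 16 GT/s per lane, ~16 Gbit/s raw
    ENCODING_EFFICIENCY = 128 / 130  # 128b/130b encoding, ~1.5% overhead

    def slot_gbps(lanes):
        # Usable one-direction bandwidth, ignoring packet/TLP overhead.
        return lanes * LANE_RATE_GBPS * ENCODING_EFFICIENCY

    print(f"x16 slot: {slot_gbps(16):.0f} Gbit/s")      # ~252 Gbit/s
    print(f"4 slots:  {4 * slot_gbps(16):.0f} Gbit/s")  # ~1008 Gbit/s ~= 1 Tbit/s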

Are there PCIe 4.0 x16 NVMe devices commercially available?

~~~
Rafuino
x16 NVMe SSDs aren't very common. The most recent one I've seen is the Micron
X100, but that's an x16 PCIe 3.0 device and it's not commercially available
yet. Most consumer and enterprise SSDs are x4, though that may change as
better media and controllers become able to saturate PCIe capabilities. There
is also a tradeoff against power requirements to consider, though.

~~~
jordanthoms
Asus motherboards (and probably others too) support a mode (PCIe bifurcation)
where you can split the x16 slot into four x4 connections and use
https://www.asus.com/us/Motherboard-Accessories/HYPER-M-2-X16-CARD/ to mount
them.

------
algo_trader
Are there any architecture papers/blogs on redesigning software for very fast
storage?

Once storage is as fast as memory, things change.

Please note: I am NOT referring to cpu-on-the-memory-chip architectures

~~~
the8472
P2P-DMA is making inroads into the Linux kernel, which means software can
limit itself to orchestrating transfers between storage, GPUs, accelerators,
and network cards, and avoid having the data touch system RAM or the CPUs at
all.

~~~
natmaka
Confining the CPU to the most noble processing, and flanking it with highly
specialized units that exchange data directly... isn't that the mainframe all
over again?

------
SomeHacker44
Can someone please fix the title? I presume it means TB/s or Tb/s but I am not
able to load the article now. Then delete my comment.

~~~
LawnboyMax
It should be Tb/s

------
Nextgrid
My question is whether this includes the overhead of actually persisting the
data and guaranteeing a certain SLA for retrieving it. Don't get me wrong,
this is a nice achievement, and good luck to them if their objective is to
build a storage company (neither I nor my company needs to store large amounts
of data at high speed, so no bias there), but there's a difference between
ingesting data at high speed and actually guaranteeing it's going to be there
in a year's time despite potential storage drive failures and the like.

------
BubRoss
This doesn't say what MinIO actually is. It also says the benchmark cost
$1,000 to run. And since this is terabits, 1 Tbit/s is only about 125
gigabytes per second using 256 drives.
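
Spelling out that conversion (a quick sketch; the per-drive figure is just the
aggregate divided across the 256 drives):

    # Convert the headline throughput from terabits to gigabytes per second.
    TBIT_PER_S = 1.0                  # the 1 Tbit/s threshold from the title
    gb_per_s = TBIT_PER_S * 1000 / 8  # 1 Tbit/s = 125 GB/s
    per_drive = gb_per_s / 256        # spread across 256 NVMe drives
    print(f"{gb_per_s:.0f} GB/s total, ~{per_drive:.2f} GB/s per drive")
    # -> 125 GB/s total, ~0.49 GB/s per drive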

~~~
mtmail
I looked up the pricing: each i3en.24xlarge (96 vCPUs, 768 GB RAM, 8 x 7,500
GB NVMe SSD) costs $10.848 per hour. For the test they rented 32, so around
350 USD/hour, or around 250,000 USD/month.
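
The arithmetic, spelled out (a sketch; the $10.848/hour on-demand rate is the
figure quoted above):

    # Cost of the 32-node i3en.24xlarge cluster at on-demand pricing.
    HOURLY_RATE = 10.848    # USD per i3en.24xlarge instance-hour
    NODES = 32
    HOURS_PER_MONTH = 730   # ~365 * 24 / 12

    per_hour = HOURLY_RATE * NODES
    per_month = per_hour * HOURS_PER_MONTH
    print(f"${per_hour:.0f}/hour, ~${per_month:,.0f}/month")
    # -> $347/hour, ~$253,409/month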

~~~
dx034
At this cost, is AWS really worth it? Shouldn't colocation be much cheaper
after only a few months of use?
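
A rough break-even sketch (the hardware and colocation numbers here are
made-up placeholders purely for illustration; real quotes for 32 dense NVMe
servers plus colo fees would vary a lot):

    # Hypothetical break-even: buying hardware vs. renting at ~$253k/month.
    # NOTE: HW_CAPEX and COLO_MONTHLY are assumptions, not real quotes.
    CLOUD_MONTHLY = 253_000   # from the calculation above
    HW_CAPEX = 1_500_000      # assumed one-time cost for 32 dense NVMe servers
    COLO_MONTHLY = 15_000     # assumed power/space/network per month

    months = HW_CAPEX / (CLOUD_MONTHLY - COLO_MONTHLY)
    print(f"Break-even after ~{months:.1f} months")  # ~6.3 months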

------
lettergram
The key word is “drives”: writing simultaneously across all of them is how you
get high throughput.

