
Oracle built a 1,060 node Raspberry Pi cluster - whalesalad
https://www.servethehome.com/oracle-shows-1060-raspberry-pi-supercomputer-at-oow/
======
NohatCoder
Now load any piece of Oracle software onto it and get charged for running 1060
CPUs. Whoops.

~~~
walrus01
Whenever I see any technology news about Oracle, I mentally translate it as
"Larry Ellison needs a bigger yacht".

~~~
tyingq
He owns an entire Hawaiian island. It's like he's a real-life Bond villain.

[https://en.wikipedia.org/wiki/Lanai](https://en.wikipedia.org/wiki/Lanai)

~~~
n0rbwah
Oh, and there are 3000 people living there? It's like he's some kind of feudal
lord.

------
Yuioup
That's what Azure feels like sometimes, running like a cluster of Pis.

~~~
partisan
I really do wonder why Azure is so slow. You have to optimize somewhat
aggressively to squeeze average performance out of Azure SQL, for instance.

~~~
jermaustin1
Back in 2012 or 2013 I built an auto-scaling WordPress hosting off of Azure.
It worked OK, better than shared hosting, but the $15 base VM at Azure (it was
a cheaper time back then) couldn't compete with the base $10 droplet at
Digital Ocean (it was a more expensive time back then).

On top of that Digital Ocean's API was so much better than Azure. So I spent a
week and switched to Digital Ocean. Needed to scale the instances less, cost
less, so more profit!

Unfortunately after 2 months of development, and half a year of marketing it,
we only managed to attract a handful of customers, and had to shut it down.

~~~
twobat
Why did you try to scale when you had no customers?

~~~
jermaustin1
I guess I wasn't perfectly clear. The technology was to scale the customer's
WordPress installation when certain metrics were met: if CPU or RAM > 75%,
clone the droplet and load balance.

The technology was basically a beefy server for NFS, a beefy MySQL server,
and a beefy load balancer. When you started hosting with us, you paid $50+/mo
and got what felt like a standard WordPress install on a dedicated Digital
Ocean droplet. When metrics showed that you were hitting the scale threshold,
we would create a new droplet, run an initialization script to mount NFS, set
up the nginx server, and then attach it to the load balancer.

When the combined CPU or RAM dropped below the initial scale threshold, it
would drop one VM from the load balancer at a time, keeping it provisioned for
an hour in case it needed to scale again.
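
For anyone who wants to picture the scale-up/scale-down rule described above, here is a minimal sketch of the decision logic only. It is not the original code (that was an ASP.Net MVC app): the names `Pool` and `step` and the action strings are made up for illustration, and the actual droplet creation via the Digital Ocean API is left out. The 75% threshold and the one-hour grace period are the only values taken from the comment.

```python
from dataclasses import dataclass, field

SCALE_THRESHOLD = 75.0     # percent CPU or RAM, as described in the comment
KEEP_WARM_SECONDS = 3600   # detached droplets stay provisioned for an hour


@dataclass
class Pool:
    """Hypothetical bookkeeping for one customer's droplets."""
    active: int = 1                            # droplets behind the load balancer
    warm: list = field(default_factory=list)   # detach times of idle droplets


def step(pool, cpu_pct, ram_pct, now):
    """Apply one scaling decision and return the action taken.

    cpu_pct / ram_pct are the combined metrics across active droplets;
    now is a timestamp in seconds.
    """
    # Forget idle droplets whose one-hour grace period has expired.
    pool.warm = [t for t in pool.warm if now - t < KEEP_WARM_SECONDS]

    if cpu_pct > SCALE_THRESHOLD or ram_pct > SCALE_THRESHOLD:
        # Prefer a droplet that is still provisioned over creating a new one.
        if pool.warm:
            pool.warm.pop()
            action = "reattach"
        else:
            action = "create"
        pool.active += 1
        return action

    if pool.active > 1:
        # Below the threshold: drop one droplet at a time, but keep it warm.
        pool.active -= 1
        pool.warm.append(now)
        return "detach"

    return "hold"
```

Calling `step` once per heartbeat interval would reproduce the behaviour described. A real version would probably want a lower scale-down threshold than the scale-up one, so the pool does not flap around 75%.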

Worked amazingly well. Sure the load balancer was a single point of failure,
but in the 10 or so months that it was running, we never had a second of
outage based on pinging the websites.

~~~
apple4ever
Now that is very cool! I've always wanted to try something like that.

What was it written in?

~~~
jermaustin1
The main application I left hosted at Azure. It was an ASP.Net MVC app.

I used the Digital Ocean API to provision infrastructure, and SSH to remotely
log in and run the initialization bash script. The init script configured
nginx, NFS (maybe it was SMB, I don't remember now), and a cron job to send
heartbeat statistics back to the MVC app every minute.

There are so many better ways to do it now, though. Off the shelf load
balancers with letsencrypt and an api, hosted dns, managed & clustered
databases, monitoring api. It would be a lot simpler today than it was just 2
years ago, let alone 7.

------
axegon_
It's incredible how powerful these small things are. I have a couple and I use
them for just about anything, from access servers to access my home network,
to running various applications inside containers on them. I did plan on
making a cluster some time ago but for various reasons I've put that plan at
the back of the line.

The real question here is did they use it to run some heavy applications on
the cluster and how did it perform? Basically do they have a purpose for it or
was it simply a "because we can" type of thing?

~~~
shervinafshar
I think an objective might have been demonstrating the benefits of Oracle
Autonomous Linux regarding automatic patching and tuning; lower TCO being the
advertised selling point.

------
djaque
I think this is cool, but engineering for engineering's sake aside I've
seriously looked at building a raspi cluster a couple of times and have always
concluded that it isn't worth it.

The real selling point for me is energy efficiency and price, but the reason I
could never make it happen is because most of the applications I want to use
are x86 only and I don't have the resources to be a trailblazer in that area.
In addition sourcing becomes an issue with raspberry pis when you aren't a big
company like Oracle. It's hard to buy more than a handful as an individual.

~~~
giancarlostoro
The Raspberry Pi Zeros are ridiculously affordable. For the cost of a regular
Pi 3 you can get one that's wireless, with a case with 3 covers (1 for the
Raspberry Pi camera). So maybe the Zero Ws are more of what you might want.
If you don't need them all to have wireless then you can just get the Zeros
instead.

~~~
djaque
By sourcing, I wasn't arguing about their price. The problem is that it's hard
to find anyone who will sell you more than a handful of them at low prices.

I haven't checked recently, but at least that's what I remember.

------
tra3
I've been curious about RPis for a while.

I have a file server running AMD Phenom II 955 that I'm thinking of replacing
with a RPi. Is there a practical way to have RPis manage a software RAID?
Performance is not a consideration. Is it even worth looking at RPi in this
scenario? The advantages I can think of are a small footprint and lower
electricity cost.

~~~
whalesalad
> Is there a practical way to have RPis manage a software RAID?

I can't think of any good reason to use software RAID, let alone doing so
on a Raspberry Pi. I would recommend hardware RAID, ZFS, or a Synology
appliance.

If performance is truly not a concern, yes it is possible to do this with a Pi
but I don't think you are going to be super impressed.

~~~
ComputerGuru
ZFS is 100% software raid.

~~~
colejohnson66
ZFS doesn’t support hardware RAID controllers?

~~~
whalesalad
It supports it, but ZFS itself is operating on the host CPU, not the RAID
controller. Additionally, ZFS prefers that the RAID controller is
disabled/defeated and operating in JBOD mode where disks are passed-thru to
the OS without any caching.

~~~
ComputerGuru
Not _just_ caching. The problem is that non-JBOD HBAs also swallow error codes
and mask retries, meaning ZFS is potentially unaware of some read errors and
is not in control of the latency as the controller firmware has its own
timeouts and retry counters. Moreover, most will not pass through SMART or
else fake it.

------
SEJeff
This explains so much about Ellison's vision for Oracle Cloud.

------
tzs
> At Oracle OpenWorld 2019, tucked away in the Oracle Code One area [...]

As someone not familiar with Oracle events, I'm rather confused about Oracle
Code One. I found a FAQ [1] which says:

> Oracle Code One is the ultimate developer event where we will cover the
> technologies and programming languages developers have come to love over the
> years. Expect talks on Go, Rust, Python, JavaScript, and R and SQL along
> with more of the latest Java technical content.

but looking at the photos, I see in the background quite a few arcade video
games. Is this some sort of break area for people to hang out in between
talks, or something like that?

[1] [https://www.oracle.com/code-one/faq.html](https://www.oracle.com/code-one/faq.html)

~~~
SanderMak
Oracle Code One is the successor of JavaOne. Since Oracle acquired Sun, the
JavaOne conference was co-hosted with the much larger Oracle OpenWorld
conference. Recently, JavaOne was rebranded into Oracle Code One, morphing it
into a cross-technology conference rather than a Java-focused one.

~~~
jaaron
I remember when JavaOne was definitely worth going to. I've not heard anything
like that for years and just assume it's dead as "Code One." Anyone want to
confirm?

------
nrki
No benchmarks. Cool but just seems like an ad for Oracle.

~~~
rhinoceraptor
Of course, you're not allowed to benchmark Oracle publicly, it's in the EULA
:)

~~~
ineedasername
Seriously? As someone with purchasing power, I could only assume they were
slower than alternatives in that case. Anything they quoted me would have to
be considered marketing fluff without independent numbers.

~~~
rhinoceraptor
I can almost understand why they do it. Benchmarking is incredibly hard to do
correctly, and it requires a very deep understanding of software and hardware.

For example, I remember hearing about Joyent's port of KVM to Illumos. They
were testing the Illumos KVM against Linux KVM, and the Illumos KVM was
faster. After a lot of debugging, they realized the Illumos machine was
turboboosting better than the Linux machine, since it was in a colder part of
the data center.

------
_bxg1
I wonder what the cost-effectiveness is in terms of compute power. I'd assume
not great, but who knows? Raspberry Pis pack a lot of punch for their size and
price.

~~~
asdfasgasdgasdg
When it comes to cluster-scale computing, size is not the main issue. The main
issues are: compute/$ and compute/watt.

There is also the matter of how much RAM and disk can be attached to each
board, and the speed of the NIC. Gigabit ethernet is a little slow these days.

~~~
willis936
GbE is slow, yes, but GbE for the amount of compute a raspi 4 can do is
massive. Raspis don’t even have the oomph to max out GbE links. You could hook
48 raspi 4s into a switch and use a 25 GbE link to a storage cluster.

Edit: It is conceivable that a raspi has the oomph to max out a GbE link but
is limited to 300 Mbps due to the use of a USB 2.0 bridge in between the
ethernet interface and the SoC. My mistake.
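
To put rough numbers on the 48-node example, here is a small worked calculation of how oversubscribed the 25 GbE uplink would be. The helper name `oversubscription` is made up for illustration; the node count, the 25 GbE uplink, and the 300 Mbps cap come from the comment, and it assumes the worst case where every node pushes its maximum rate at once.

```python
def oversubscription(nodes, per_node_gbps, uplink_gbps):
    """Ratio of worst-case aggregate node traffic to uplink capacity.

    A value above 1.0 means the uplink is the bottleneck when every
    node transmits at its maximum rate at the same time.
    """
    return nodes * per_node_gbps / uplink_gbps


# Every node at full gigabit line rate: 48 Gbps offered to a 25 Gbps uplink.
line_rate = oversubscription(nodes=48, per_node_gbps=1.0, uplink_gbps=25.0)

# Every node limited to 300 Mbps: 14.4 Gbps offered to the same uplink.
capped = oversubscription(nodes=48, per_node_gbps=0.3, uplink_gbps=25.0)

print(f"line rate: {line_rate:.2f}x, capped at 300 Mbps: {capped:.2f}x")
```

So at full line rate the uplink is a little under 2:1 oversubscribed, while with the 300 Mbps cap the nodes could not fill it even in the worst case.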

~~~
esotericn
The RPi4 is capable of gigabit. I benchmarked mine at 900-950mbit via iperf
yesterday. That was to a usb3 gigabit ethernet adapter on a laptop.

~~~
willis936
Ah thanks. My previous post is dated knowledge, applicable only to the raspi
3.

------
jimmcslim
More info on the USB power supplies they were using would be interesting,
seems likely it would have been rack mounted and probably custom?

~~~
rasz
couple of those will do the job [https://www.digikey.com/product-detail/en/tdk-lambda-america...](https://www.digikey.com/product-detail/en/tdk-lambda-americas-inc/HWS1000-5-HD/HWS1000-5-HD-ND/3984323)
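
A back-of-the-envelope way to check how many such supplies a 1,060-board cluster would take. Everything except the board count is an assumption here: the 5 W per board is a placeholder (the thread does not give a measured draw), the 1000 W per supply is only my reading of the part number, and the 80% headroom factor and the function name `supplies_needed` are made up for illustration.

```python
import math


def supplies_needed(boards, watts_per_board, supply_watts, headroom=0.8):
    """Number of supplies required if each is loaded to `headroom` of its rating."""
    usable_watts = supply_watts * headroom
    return math.ceil(boards * watts_per_board / usable_watts)


# ASSUMED inputs: 5 W per board and 1000 W per supply are placeholders,
# not figures from the article or a datasheet.
derated = supplies_needed(boards=1060, watts_per_board=5.0, supply_watts=1000.0)
flat_out = supplies_needed(boards=1060, watts_per_board=5.0, supply_watts=1000.0,
                           headroom=1.0)
print(derated, flat_out)
```

Swap in a measured per-board draw and the real supply rating before trusting the result; the point is only that the count scales linearly with both.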

------
salvagedcircuit
Talk about a thermal design nightmare. Definitely just a "look-at-me" exhibit.

------
Havoc
>an array of USB power supplies.

Surprised they didn't just bridge 5V to GPIO pins.

------
jonplackett
Not much info on what they can actually do with it

------
tr33house
This seems like Oracle's way of marketing to devs that they're a cool place to
work... And it might have actually just worked

------
tobyhinloopen
Why...

~~~
rvz
Exactly, I just feel that Oracle just needed to do a huge flex to the world to
show that they are relevant.

Weird flex but okay, I'll just carry on with my ARM64 Cavium kubernetes
cluster that blows this toy out of the ocean.

~~~
walshemj
But they probably built it as a test cluster to do initial work on before
going for a big shot on a big boy cluster.

