

SeaMicro drops an atom bomb on the server industry - timf
http://venturebeat.com/2010/06/13/seamicro-drops-an-atom-bomb-on-the-server-industry/

======
pjscott
Performance per watt depends on the computer _and the workload._ It's not
apparent from the SPECint benchmarks they show in the article, but chips like
Atom are better than beefy server chips for some server workloads, and worse
for others, when measuring performance per watt. Here's a paper from
Berkeley's RAD lab which tries two Atom processors and a Xeon on several
different server workloads, and compares their performance per watt:

[http://www.sigops.org/sosp/sosp09/papers/hotpower_10_chun.pd...](http://www.sigops.org/sosp/sosp09/papers/hotpower_10_chun.pdf)

The tl;dr version is that what you really want are _hybrid_ datacenters, where
you can assign various workloads to different types of machines, and use each
machine type for what it's best suited to do.
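The hybrid assignment reduces to picking, per workload, the machine type with the best *measured* performance per watt; a trivial sketch (all numbers hypothetical, in the spirit of the paper's result):

```python
def plan(measured):
    """measured: workload -> {machine type -> perf per watt}, obtained by
    actually running each workload on each machine type.
    Returns the hybrid-datacenter assignment: workload -> best machine type."""
    return {wl: max(scores, key=scores.get) for wl, scores in measured.items()}

# Hypothetical numbers: the winner flips depending on the workload,
# so a uniform fleet loses either way.
measured = {
    'static_web': {'atom': 0.9, 'xeon': 0.6},
    'db_joins':   {'atom': 0.2, 'xeon': 0.7},
}
plan(measured)   # -> {'static_web': 'atom', 'db_joins': 'xeon'}
```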

~~~
ssp
See also this:

[http://www.realworldtech.com/page.cfm?ArticleID=RWT090909050...](http://www.realworldtech.com/page.cfm?ArticleID=RWT090909050230)

for a comparison of various types of chips.

~~~
pjscott
That chart compares theoretical peak GFLOPS per watt, which is an absolutely
terrible metric for server workloads. As the paper I linked to showed, Atom is
sometimes much better and sometimes much worse than large, fast server
processors in performance per watt _on real server workloads._ Those charts
don't take into account things like cache, or branch prediction, or multicore
memory coherence.

The only way to get meaningful performance per watt data is to actually run
real workloads on various processors.

------
DEinspanjer
I worked with one of these prototype boxes. I was using it for something a bit
outside their common use case: clustered ETL processing of log data. I was
quite happy with the performance. In the workloads I had that needed lots of
threads, I was able to use the box to spin up a _lot_ of nodes and crunch
through several hundred GB of log data very quickly. The machines were easy to
work with since they felt like normal Linux nodes, and the interconnect fabric
made inter-process communication very snappy.

~~~
runT1ME
I'm still confused, is it a single multiprocessor machine, or is it a
'cluster' in the sense that each node has a single processor? (You used both
threads and inter-process, but also implied multiple nodes...)

~~~
blhack
>I'm still confused, is it a single multiprocessor machine

I would love to watch that thing boot, haha. SO many penguins!

~~~
blhack
I apologize for the self-reply, but am I wrong here? I use mostly OpenBSD, and
Linux only lives on servers that I don't physically see booting. I remember
that when I first started using Linux, there would be one penguin for each
processor core. Booting it up for the first time on a dual-Xeon machine and
seeing 8 of them was kind of funny, in a nerdy way.

Would this be different? If there are 512 cores...would it show a penguin for
each core?

Sorry, like I said, I haven't actually watched the boot process on a linux
machine for quite some time, and now I'm a bit curious...

Sorry if this question is juvenile or stupid or something.

~~~
wmf
You're right; IIRC some SGIs show a whole screenful of penguins. But you're
kind of derailing the conversation.

------
p3ll0n
James Hamilton, VP at Amazon and author of the popular blog "Perspectives,"
has some interesting insights into and great things to say about the work
SeaMicro and other new startups are doing to revolutionize the server industry.

[http://perspectives.mvdirona.com/2010/06/14/SeaMicroReleases...](http://perspectives.mvdirona.com/2010/06/14/SeaMicroReleasesInnovativeIntelAtomServer.aspx)

~~~
hga
See <http://news.ycombinator.com/item?id=1429625> for the HN item on this
excellent posting of his, with a few comments; e.g. he notes this system
has no ECC, and that Intel is pretty unique in that limitation.

------
limist
"People who are really serious about software should make their own hardware."
- Alan Kay

This news is yet another data point that developers will need to hack
concurrency sooner rather than later, as a core skill in one's professional
repertoire. Off to learn Stackless PyPy, Clojure, Scala, etc...

~~~
WilliamLP
I think I can afford to wait until people start proving undeniably (through
out-competing rivals) that Scala and Clojure have any pragmatic benefit in
real multicore programming tasks. Anything that mentions Fibonacci numbers
doesn't count, nor do easily parallelizable problems commonly handled in C/C++
(ray tracing, media encoding, web servers, etc.) It needs to be something not
trivially broken up into independent parts.

~~~
drhodes
While this isn't undeniable proof, it's interesting none the less, Simon
Peyton Jones: Data Parallel Haskell.

<http://www.youtube.com/watch?v=NWSZ4c9yqW8>

He talks about how this scheme can parallelize non-trivial data structures,
and how tedious bookkeeping can be avoided. I haven't used it yet in a
project, so I can't speak to its efficacy, but it's worth a watch, imho.

------
pornel
Their tech overview PDF has less layman fluff than the article:
[http://dev.seamicro.com/sites/default/files/SeaMicroTechOver...](http://dev.seamicro.com/sites/default/files/SeaMicroTechOverview.pdf)

------
patrickgzill
Interesting in terms of technology, but of no interest to me as someone who
does colocation and web servers for my clients, especially since they almost
all use traditional RDBMSes like MySQL and Postgres.

The Atom is too underpowered and too RAM limited for individual systems - you
would do better in most cases with a 2x quadcore setup and 32-64GB RAM
combined with OpenVZ or Solaris Zones. Lack of ECC = automatic
disqualification for me as well.

For a company that is doing a lot of web serving a la Facebook or eBay I can
definitely see the appeal. In such larger cases, power usage dwarfs many other
considerations.

~~~
arvinjoar
I would rethink that if I were you. What if you started to provide virtual
private servers on one of these boxes instead of offering colocation? I think
this innovation makes VPS solutions even more efficient than they are today.

~~~
lsc
ECC is a huge thing in that case.

The thing about being a VPS provider is that if your hardware chokes, your
customers notice, and they get pissed. Sure, if you are running some kind of
web farm you set it up so that things keep running without interruption if you
lose a server. But if you provide VPS hosting and you lose a server? People
notice, and they will be pissed. If you are a VPS provider and you lose
customer data? They will parade your severed head around town on the end of a
pike.

Amazon has successfully tackled this problem by training their customers that
they can shoot a server at any time. This is good for them, but it's not how
VPSs have traditionally operated. Customers expect the thing to stay up and
their data to stay safe.

~~~
bnoordhuis
I've seen ECC mentioned a few times in this thread but how relevant is it
really? At my old job we used both regular COTS hardware from Dell and HP and
high-performance gear from SUN, often at four times the price. The latter had
ECC memory and, if memory serves, shielded CPUs but I never noticed much
difference in stability or performance when compared to regular hardware. If
someone has some numbers on this, I'd love to see them.

~~~
lsc
If your Dell and HP kit was server grade, it almost certainly had ECC RAM.
Before Nehalem, it was almost impossible to get dual-CPU servers that would
support non-ECC RAM. Even today, now that unbuffered RAM is supported
(buffering and ECC are not the same thing, but it's unusual to see registered
or buffered RAM that is not also ECC), nearly all server hardware you get from
the likes of Dell or HP is going to default to using ECC RAM.

The big deal with ECC is that with non-ECC RAM, you don't know when in-RAM
data was corrupted. If you have a bad bit of RAM that is being used for your
journaling/disk caching, it's pretty easy to corrupt your data.

If you've got ECC, it corrects single-bit errors and logs that there was a
problem, and it can be configured to crash on double-bit errors rather than
just silently corrupting your data.

If you use non-ECC RAM, not only will bad RAM silently corrupt your data and
sometimes crash your server, but there will be no indicator (other than random
crashes) that the problem is in fact bad RAM.

You can get away with not having ECC in large server farms if you have lots of
checksums on generated data... but in a VPS-type situation where people get
pissed if you lose a single server, it's sure to end in tears.
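As a toy illustration of that single-error-correct / double-error-detect behavior, here's a Hamming(7,4)-plus-parity sketch in Python. (Real server ECC does SECDED in hardware over 64-bit words; this just shows the principle.)

```python
def encode(nibble):
    """Encode 4 data bits as 8: Hamming(7,4) plus an overall parity
    bit, giving single-error correction / double-error detection."""
    bits = [0] * 8                      # bits[1..7], 1-indexed like Hamming
    d = [(nibble >> i) & 1 for i in range(4)]
    bits[3], bits[5], bits[6], bits[7] = d
    bits[1] = bits[3] ^ bits[5] ^ bits[7]
    bits[2] = bits[3] ^ bits[6] ^ bits[7]
    bits[4] = bits[5] ^ bits[6] ^ bits[7]
    overall = 0
    for b in bits[1:]:
        overall ^= b
    return bits[1:] + [overall]         # 8-bit codeword

def decode(code):
    """Return (status, nibble): 'ok', 'corrected' (single-bit error,
    fixed and loggable), or 'double-bit error' (detected but not
    fixable; the case ECC hardware can be told to crash on)."""
    bits = [0] + code[:7]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos]:
            syndrome ^= pos             # xor of set positions names the bad bit
    overall = 0
    for b in code:
        overall ^= b
    if syndrome == 0 and overall == 0:
        status = 'ok'
    elif overall == 1:                  # odd parity: exactly one bit flipped
        if syndrome:
            bits[syndrome] ^= 1         # flip it back
        status = 'corrected'
    else:                               # even parity but nonzero syndrome
        return 'double-bit error', None
    nibble = bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)
    return status, nibble
```

Non-ECC RAM is effectively the same machinery with the last two branches collapsed into "ok": the flip happens and nothing tells you.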

------
astrodust
I hope this shakes up the VPS hosting industry, too. I'd love to see what kind
of Linode or EC2 style on-demand computing these might provide.

~~~
wenbert
Hopefully it will be cheaper. $20/month might sound very cheap, but for some
guys (like myself) living in 3rd world countries, it's next to impossible.

~~~
lsc
Hey, I have a question. If I wanted to advertise to /technical/ people in your
area, where would I do it?

Note: I'm not really interested in business people; I want *NIX enthusiasts
who are /not/ spammers, who can write English and are capable of running a
Linux box that won't get compromised, and who don't need much help from me to
run the thing.

~~~
mlni
Why not just buy google ads for technical keywords and target only the desired
countries?

~~~
lsc
For my ad money, I like display ads rather than pay per click ads... I am just
trying to say 'Hey, I exist' and pay per click is usually not a very cost-
effective way of doing that.

(now, most of my advertising 'budget' was spent on, say, writing a book... but
per-click ads are the opposite of what I want.)

------
GFischer
Interesting... I've never been in a really big datacenter, so I'd like to see
some (hopefully non-biased) reviews from somebody that does work in those
places.

Would this really work well for the intended market? There are lots of
startups over here that plan on massively serving webpages - would something
like this (only cheaper :) ) make you reconsider using whatever cloud services
you're currently using?

Articles I found on Google:

[http://gigaom.com/2010/06/13/seamicros-low-power-server-fina...](http://gigaom.com/2010/06/13/seamicros-low-power-server-finally-launches/)

and the Wall Street Journal's take:

[http://blogs.wsj.com/digits/2010/06/14/seamicro-tries-to-ret...](http://blogs.wsj.com/digits/2010/06/14/seamicro-tries-to-rethink-the-internet-server/)

~~~
stcredzero
_There are lots of startups over here that plan on massively serving webpages_

SeaMicro is making the same sort of lateral move that RISC made and that
Transmeta attempted: could certain functions happen more efficiently if moved
outside of the traditional 'box'?

Could a datacenter provide a low-latency infrastructure for front-end web-
servers while reducing their power and other expenses? I think there's a good
chance this is already being done.

------
CountHackulus
I do see a potential problem here. In the pictures, they show a bunch of Atom
CPUs soldered directly to the board. That means dire things for service: if a
single CPU has a flaw, you need to replace an entire board of CPUs.

Compare this to a standard blade setup, where you could just swap out CPUs, or
even an IBM System Z where you could hotswap the CPU, and service doesn't look
so great.

~~~
wmf
To be fair, the unit of hotswap in System z is a book that contains ~32 cores;
that's not so different from SeaMicro.

------
stcredzero
The way I read this, they are achieving savings by virtually mux-ing (or de-
muxing depending on viewpoint) much of everything that's not the CPU. Is this
optimized to make the support of virtual servers with relatively low
throughput more efficient?

~~~
njharman
I read it as: they created a chip that runs code written to emulate/virtualize
most of the other (non-CPU) logic on a motherboard. So instead of 10 "io"
chips and 10 "memory" chips, most of which are idle, they have 15 virt chips
which may act as either "io" or "memory" depending on demand.

I guess (de)muxing is close to that? Multiplexing many "requests" among fewer
real (non-virtual) objects.
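A toy queueing sketch of why that pooling wins (the 10/10-vs-15 numbers are just the hypothetical ones from above, not SeaMicro's actual design):

```python
from collections import Counter

def ticks_to_drain(requests, chips):
    """How many ticks to serve a queue of requests, each chip finishing
    one request per tick. `chips` either dedicates chips per kind
    ({'io': 10, 'memory': 10}) or pools them ({'any': 15})."""
    pending = Counter(requests)
    ticks = 0
    while sum(pending.values()):
        ticks += 1
        if 'any' in chips:
            budget = chips['any']       # virtualized: any chip serves any kind
            for kind in list(pending):
                take = min(pending[kind], budget)
                pending[kind] -= take
                budget -= take
        else:                           # fixed-function: idle chips can't help
            for kind in list(pending):
                pending[kind] -= min(pending[kind], chips.get(kind, 0))
        pending = +pending              # drop exhausted kinds
    return ticks

# An io-heavy burst: 20 dedicated chips take 5 ticks (the 10 memory
# chips sit idle the whole time), while 15 pooled chips finish in 3.
burst = ['io'] * 45
ticks_to_drain(burst, {'io': 10, 'memory': 10})   # -> 5
ticks_to_drain(burst, {'any': 15})                # -> 3
```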

~~~
stcredzero
_I read it to be they created a chip that runs code written to
emulate/virtualize most of the other (non-cpu) logic on a motherboard._

That's exactly what I meant by "virtually (de)muxing everything but the CPU."
I should have said, "CPU+chipset," though.

I wonder if their architecture is also (unintentionally) optimized for
languages like Ruby/Python? I suspect that languages like those tend to have
more CPU operations per IO operation. I wonder if anyone has researched this
metric?

------
xtacy
This seems quite similar to the FAWN project at CMU.
<http://www.cs.cmu.edu/~fawnproj/> The idea is similar: if IO is the
bottleneck, instead of scaling up IO, scale down the CPU power.

------
mattmcknight
It would have been better to see a comparison with SGI (ex-Rackable) CloudRack
systems, which take a bit of an in-between approach: they use Xeons, but at
least nominally seem to pack more cores into the same enclosure size. One of
their power tricks, in addition to pulling DC converters back, further from
the computers, is to allow things to run hot, resulting in savings on cooling
costs.

~~~
Tamerlin
The SGI machines are geared pretty heavily toward supercomputing applications.
As such they're tailored for applications that need to share data between
processes efficiently, and that also need to do a lot of computing.

The SeaMicro server probably would look pretty lousy when compared on SGI's
turf, since Atom processors aren't designed to be performance-competitive with
the current desktop and server processors.

What would make more sense is a comparison against a datacenter for a company
like Disney or Google or Amazon. I suspect that even though SeaMicro's rig
would look pretty bad for heavy number-crunching apps, it would most likely do
very well in web and database workloads. Those apps don't need to share a lot
of data between threads/processes, and they're generally I/O bound rather than
compute bound.
With proper caching setups, they're mostly network bound, rather than disk
bound also.

~~~
mattmcknight
I was thinking of the Rackable stuff, not the supercomputing stuff. Rackable
actually acquired SGI, but took the SGI name.
[http://www.sgi.com/company_info/newsroom/press_releases/2009...](http://www.sgi.com/company_info/newsroom/press_releases/2009/april/rackable.html)

~~~
Tamerlin
I'd forgotten that Rackable did the acquiring in that one. Oops :)

------
Scott_MacGregor
Great news! Hardware innovation typically means new software opportunities. If
this turns out to be a generally accepted workhorse server design and not
just a hotrod box, I wonder who will be the first in here to develop a
profitable software product for it.

Dell products have features and services that make them enterprise friendly;
they are more than just hardware to the customer. So trying to compete with
Dell head-on might not be the company's best strategy at first. Perhaps
following the strategy EMC used with CLARiiON, selling through Dell, would be
more of a money maker for the company.

------
zokier
[http://en.community.dell.com/dell-blogs/b/direct2dell/archiv...](http://en.community.dell.com/dell-blogs/b/direct2dell/archive/2009/05/19/dell-launches-quot-fortuna-quot-via-nano-based-server-for-hyperscale-customers.aspx)

Similar concept from Dell, from over a year ago. Although Fortuna seems a bit
more conventional, I'm not sure that's a bad thing.

------
sh1mmer
Anyone care to suggest some ideas on a few things:

a) who would likely buy these (corporates, SMEs, startups)?

b) it seems that they have increased the risk of single point of failure (e.g.
1 PSU taking down 128 nodes) what's the mitigation strategy?

c) what would an architecture on a box like this look like? Should I just be
thinking of it as a cheap set of VPS nodes?

d) People keep mentioning the kind of processing these chips are good for and
not so good for. Can someone be explicit about good real world uses and bad
ones?

~~~
wmf
a) Web 2.0 sites.

b) It has four redundant power supplies, as do all systems of this size.

c) It's a Beowulf cluster.

d) See the FAWN, Amdahl blade, Gordon, and CEMS papers.

------
10ren
I would think that ARM chips would use less power than Atom, but compatibility
with existing software is a big selling point (as always). Still, this would
provide pressure toward ARM servers, and therefore ARM-compatible software.

OTOH, I get the impression that the bulk of the power savings are not from the
CPU at all, but from virtualizing the other components. Therefore, the
pressure towards ARM is much less.

And I suppose Intel wouldn't be supportive unless there was a compelling long-
term reason to choose Atom.

------
rythie
Does this really solve a problem anyone has though?

It seems we have an oversupply of CPU on modern boxes and an undersupply of
I/O speed (or space, with SSDs) and memory.

------
wingo
Interesting article, but a terrible title! Atom bombs, really?

~~~
jbarham
I think it's supposed to be a pun...

~~~
wingo
Oh dear, I did not realize that. Internets one, wingo zero...

------
surlyadopter
With that many processors crammed into such a small case, isn't heat
dissipation a problem? Or are the Atom chips used not as power-hungry as a
standard commercial CPU?

~~~
jonknee
"Atom chips used not as power hungry as a standard commercial cpu" is quite
literally the entire point of this startup/product and the article. You had to
quit reading before the second sentence to miss this:

> The startup is announcing today it has created a server with 512 Intel Atom
> chips that gets supercomputer performance but uses 75 percent less power and
> space than current servers.

------
jobeirne
Where can I invest?

------
GrandMasterBirt
I am completely baffled that Intel did not do this. I guess they figured ATOM
is just a crappy little side-project processor, not one of the big boys.

~~~
ippisl
Some see the move to many-core Atom servers as a move to put financial
pressure on Intel (since Atoms are cheap).

And this move helps the future of ARM-based servers, since ARM is more
similar to Atom than to Xeon.

~~~
hga
There are going to be limits to the amount of pressure as long as Intel
doesn't offer ECC for any variety of the Atom microarchitecture. One of the
articles I read said that they thought about using an AMD chip, but it wasn't
going to ship soon enough. I can see them hearing from their customers
(would-be and actual) that they'd buy more if they offered ECC, and SeaMicro
offering an AMD-based box in the future.

I wonder how Atom-specific the ASIC is....

~~~
ippisl
I think the chip also suits the ARM architecture without changes.

------
c00p3r
There is no information about the chipset, memory specs, or expansion slots
in particular.

There could be a huge bottleneck between CPUs and RAM, because so many CPUs
access memory concurrently while Atoms have very small caches.

That means real-world applications, like multi-threaded services (especially
JVM-based ones, or simply MySQL), could not run efficiently.

I'm also very skeptical about hundreds of KVMs, which is the only working
virtualization technology I know. =)

~~~
hga
Here's a lot more technical detail:
[http://www.anandtech.com/show/3768/seamicro-announces-sm1000...](http://www.anandtech.com/show/3768/seamicro-announces-sm10000-server-with-512-atom-cpus-and-low-power-consumption/2)

Per CPU (8 per board):

    Atom Z530 (1.6GHz, Silverthorne)
    US15 chipset (Poulsbo)
    2GB DDR2

Per board:

    4 ASICs to virtualize IO
    2x PCIe x16 connectors

" _Each hop takes 8 microseconds_ ", so good enough for (most/all?) IO but not
good enough for memory.

No ECC of course, it being a non-server Intel chip. Each single cored CPU has
its own DIMM.

_EDITED to clarify the layout per CPU and per board after c00p3r made it
clear I wasn't being clear._

~~~
c00p3r
So, it's just a tight packing of 512 netbook boards in one box. I thought it
was a 512-core system. Oh, I'm such a moron. Nothing to see here.

~~~
hga
No, no: each board has 8 CPU/chipset/DIMM(s?) sets (the latter on the other
side of the board), with 4 ASICs handling all the rest of a motherboard's IO,
virtualizing it (e.g. each core sees 4 virtual SATA drives). That's why each
board needs 2 x16 PCIe connections, the backplane has a fancy
supercomputer-style torus topology interconnect, and you can put up to 64
physical disks in front. Lots of Ethernet out the back, and there's an FPGA
system for the backplane, which might be due to the small number of units but
will give them lots of flexibility.

------
hackermom
An ARM-powered variant would likely sit on top of this Atom-powered machine,
quite possibly with the same or better numbers when looking at
performance-per-watt.

The article quotes the interviewee saying that ARMs can be used, but isn't
clear enough to determine whether they actually have ARM versions of these
machines, which I'd be very interested in seeing 100k specint_rate figures
for.

~~~
pjscott
Most of their technology is not related to the processors themselves, but to
the rest of the computer. They've got a fancy high-bandwidth, low-latency
interconnect between the chips. They multiplex access to I/O hardware, which
I'm sure takes advantage of that nice interconnect. They have hardware-
accelerated network load balancing which tries to keep the CPUs at maximally-
efficient levels of utilization, letting the spare CPUs sleep, and avoiding
malfunctioning CPUs. I would love to see this with ARM Cortex-A9 processors,
and I think that's actually very likely if this company succeeds.
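A greedy sketch of what such a balancing policy might look like; this is purely my guess at the idea, and all names and numbers here are hypothetical:

```python
def assign(n_requests, cpus, efficient_load):
    """Fill one CPU to its most efficient load before waking the next;
    `cpus` lists only the healthy ones, so failed CPUs are never
    offered work. Returns cpu -> request count; any CPU absent from
    the result can stay asleep."""
    assignment = {}
    remaining = n_requests
    for cpu in cpus:
        if remaining == 0:
            break
        batch = min(remaining, efficient_load)
        assignment[cpu] = batch
        remaining -= batch
    return assignment

# 25 requests at 10 per CPU, with a bad cpu2 already excluded:
assign(25, ['cpu0', 'cpu1', 'cpu3', 'cpu4'], 10)
# -> {'cpu0': 10, 'cpu1': 10, 'cpu3': 5}; cpu4 stays asleep
```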

------
Super74
Big surprise, Dell, Hewlett-Packard and IBM not innovating? I can't believe
it...

Just look at the computers they are selling. Most still offer 512M of RAM as
a default. What can you run on 512? Nothing like having to upgrade the day you
receive your product. Consumers by and large need the guidance of
manufacturers to make the right HW choices, and the manufacturers just want
the cash. They will suck each segment dry until the market forces them to make
changes. That's progress?

With all the advances in multi-monitor add-on SW or the fact that Windows and
Mac OS have been supporting multi-monitor control for years, try finding
hardware already fitted with multiple display connections. In the end, the
user is forced to customize their own equipment. And yes, I know most HN
readers do this, but I am talking about the general public.

Great to see the little guys are still fighting. The big ones really don't
give a damn.

~~~
rbanffy
> What the F* can you run on 512?

A lot, really. I have rendered DVD quality movies, handled product composition
databases and processed millions of orders on machines with less than 512 megs
of memory.

Not many companies (or people) have enough data or volume to stress a machine
with 512 megs of RAM. I agree it seems a pittance - my netbook has more than
that - but if I break up memory usage by application here, Firefox takes most
of the RAM, then Gwibber, then Rhythmbox. None of them would be running on a
server.

It would be interesting to write down a list of what you can't do with 512M.

~~~
Super74
Well, you can't effectively run Mac OS X or W7, the most used operating
systems in the world, on 512 to start. Even Ubuntu needs an upgraded graphics
card and RAM to utilize all of the bells and whistles.

As I said before, I was referring to the general public. There will always be
hackers or high-end users that can make miracles.

~~~
lftl
I think the general miscommunication here is that you seem to be talking about
desktop computing, while this article is focused on datacenter usage.

~~~
Super74
I agreed with the article in the sense that a smaller company has seen the
error of some major manufacturers' ways. My opinion was that these major
manufacturers do not have the user's best interest in mind, and I used their
PC sales as another example.

They have billions for research at their disposal, and they couldn't see that
their equipment was too power hungry, would not be sustainable in the long
run, and probably needed re-tuning? I'm guessing they did.

