
RaptorCS's redemption: the POWER9 machine works - ddevault
https://drewdevault.com/2019/10/10/RaptorCS-redemption.html
======
yjftsjthsd-h
> However, I refuse to give any company credit for waking up their support
> team only when a scathing article about them frontpages on Hacker News. I
> told them I wouldn’t publish a positive follow-up unless they also convinced
> me that the support experience had been fixed for the typical user as well.

You are a gem and I wish more people were like you.

~~~
tpmx
A very old, but still key lesson here for product makers: when you know that
you're doing new, early-gen work with a kinda reasonably high probability of
something faulting - you need to provide stellar, high touch support, whatever
it costs. And you need to plan to be ready for that from the point when you
ship your first thing.

(I think this applies both to hardware and software products.)

Of course, this is easier said than done, but ultimately this is the core
function - are your customers happy or not?

As a CTO you need to prepare the company and your investors for this failure
mode. You need to be able to explain to them that sometimes making sure that
the product actually works and the existing customers are happy is more
important than that sexy new feature of the month. It's not always a very easy
balance.

In my experience, even semi-technical people with a more sales/marketing kind
of focus in the company often get blindsided by this. It's on us to prep them
for this.

It seems like this company learned it the hard way. The key thing is that they
did learn.

~~~
tpearson-raptor
For what it's worth, we do have a focus on making sure the hardware works
properly etc.; I've always been pushing for that and haven't had any real
problems there.

This particular board was a test escape. A very unique, thankfully singular
(so far as I know) test escape, but there's always a chance for that -- the
larger failure in my mind was the breakdown in the support process. With our
new controls in place (including rapid escalation / RMA of "odd" faults
observed by customer) I am confident this won't happen again.

And yes, we're updating tests to catch the newly discovered failure mode. It's
an interesting one, the board is a sort of a "zombie" in that if you boot it
up by looking at it just right, it'll run stably and pass our stress tests,
but it should never have left our factory in that condition due to the other
faults. Full stop. :)

~~~
qubex
That particular horror story led me to _not_ equip our development team with a
dozen POWER9 machines, which is a pity, because the combination of almost-
total trustworthiness and enormous firepower was extremely attractive for our
particular niche (plus, I used to hand-code PPC64 AltiVec “velocity engine”
code for HFT funds back when Apple sold G5 Xserves).

I’m glad to hear that it was a one-off and that you have revised your support
strategy, but... really, it made for a pretty frustrating read. It’s not chump
change, and going out on that kind of a limb on... trust... which was clearly
unwarranted at least at the time of that initial incident.

There’s nothing more frustrating than paying top dollar for something exotic
and then being stonewalled by oblivious “tech support” when things don’t work
as advertised, particularly when something is sufficiently novel or unusual
that relatively few other users are out there to proffer help and the
potential points of failure are plentiful and “unknown unknowns”.

As far as I’m concerned, that incident was extremely bad publicity. That board
could’ve been shipped to me. I’d have been left to dangle from my solitary
rope and squirm when management asked me what was going on.

That’s just not acceptable, sadly.

~~~
tpmx
If I were you I'd be a lot more comfortable pulling the trigger after reading
all of this.

Being worried over what _might have been_ spoilt milk if you had gotten a
broken board _in the past_ seems counterproductive.

~~~
qubex
The new machines I eventually decided to go with were ordered more than a week
ago.

~~~
tpearson-raptor
Sorry to hear that. If you ever get tired of worrying about untrustworthy
firmware phoning home, DRM, etc., please consider us for the replacement
purchase? ;)

------
trollied
Original issue thread:
https://news.ycombinator.com/item?id=21049093

------
classichasclass
I hope this means some future work on Wayland performance on ppc64le or at
least on 2D only is coming. My GPU-less 4-core Blackbird was unusable in
Wayland with the basic BMC graphics (whereas it is quite sprightly in X.org on
the same version of Fedora and with the exact same hardware loadout). I'll be
honest and say I'm a Wayland skeptic generally for reasons I won't derail this
post with, but I'm resigned to it probably being the future, and I would like
to see it do better.

~~~
emersion
Which compositor are you using? If you don't have a GPU you'll want one with
software rendering support. Using a software OpenGL implementation like
llvmpipe will work but will be very slow.

We don't support this yet in wlroots, but it's definitely something we want to
add, and not a limitation of the Wayland protocol at all.

~~~
classichasclass
It was whatever was out of the box with Fedora 30 and GNOME (I guess gnome-
shell itself?), and indeed was very slow with llvmpipe. I look forward to the
future work on this.

------
Athas
I really like these machines. Unfortunately, I'm in Europe, where they seem to
be harder (or riskier) to get hold of. I suppose I should also try to think of
something reasonable to do with them, given their cost...

~~~
smcl
Fellow EU here: I was curious too, but the high cost + the insane import duty
+ risk of return put me off. I'm 99% sure if I RMA'd it like OP did, the local customs
would somehow not understand that it's a replacement/repair and sting me for
the import duty once more :(

~~~
gawin
This is the main reason I didn’t buy one.

~~~
smcl
Worth noting that I actually feel quite bad about this, I _really_ wanted to
try one out, and vote with my wallet in favour of more open hardware, an
alternative architecture and to support a smaller company like Raptor. I
eagerly followed all the news around Raptor (both here on HN, lobste.rs and
elsewhere) and read all the guy "ClassicHasClass"'s blogs @ talospace.com -
but in the end I unfortunately can't justify to myself the expense (when I
spec out a modest Blackbird system it consumes the better part of a month's
salary, I've recently bought my first flat and I sense a recession around the
corner...). That actually makes me feel a bit hypocritical and part of the
"problem", but maybe I'm not actually the target market.

~~~
tpearson-raptor
Anyone wanting control of a powerful modern computer is part of the target
market. We just haven't been able to drive the price down further at this time
-- doing hardware design around fully open source firmware is _hard_ and a lot
of the typical shortcuts to lower costs (including the always-concerning
"post-sale monetization" concepts) simply aren't something we find acceptable
on any level.

Basically, it doesn't do anyone any good for us to lower cost by giving up the
full owner control experience that is centric to our product lines. :)

------
dragontamer
> I have christened it “flandre”, which I think is fitting.

Youmu Konpaku is best character. Just gotta love the sword-characters in a
bullet-game. Although Flandre was an "extra" endgame boss character, never
actually playable IIRC. Really good music for her stage though.

\----------

Glad to hear everything is working out for your machine. One thing I'm curious
about:

> Installation was a breeze, it compiles the kernel on 32 cores from spinning
> rust in 4m15s

Is that an 8-core / 32-thread CPU? I don't think there's actually a CPU with
32 real cores available from them. If so, I think that's an impressive speed
for 8 real cores.

~~~
Sir_Cmpwn
Yes, I meant threads. Thanks for catching that.
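
For reference, the arithmetic behind the correction can be sketched as below. This assumes the 8-core part runs four hardware threads per core (SMT4); the SMT mode is an assumption, not something stated in the article, and both helper names are made up for illustration.

```python
import re


def hardware_threads(cores: int, smt_ways: int) -> int:
    """Logical CPUs the OS sees: physical cores times threads per core."""
    return cores * smt_ways


def parse_duration(text: str) -> int:
    """Convert a duration such as '4m15s' into whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = int(match.group(2) or 0)
    return minutes * 60 + seconds


# Assumed configuration: 8 physical cores, SMT4.
threads = hardware_threads(cores=8, smt_ways=4)
build_seconds = parse_duration("4m15s")
print(threads, build_seconds)
```

So "32 cores" in the quoted build time would correspond to 32 logical CPUs on 8 physical cores under that assumption.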

------
equalunique
Source hut build slave for ppc64? Groundbreaking!

~~~
kop316
To ask the dumb question, what is source hut? It looks essentially like Gitlab
to me.

~~~
Un1corn
It can do the same tasks as GitLab (Git hosting, issue tracking, CI and
more) but it does it in a different way that is more similar to the workflow
of the Linux kernel and other big open source projects.

For example instead of pull requests, you have mailing lists in which you send
patches. You can see more in the website of Sourcehut:
https://sourcehut.org

~~~
kop316
Ahh, makes sense. Thanks!

------
nicoburns
> I am quite impressed with it so far. Installation was a breeze, it compiles
> the kernel on 32 cores from spinning rust in 4m15s

Does anyone know how this compares to a recent x86 machine? This seems quite
fast for the entire kernel, but I have no idea what typical build times are
like.

~~~
JackRabbitSlim
The trick is far fewer modules and compile time options.

A fun benchmark would be a POWER9 vs. an x86_64 cross-compiling the same ARM64
kernel. I wouldn't bet heavily on the POWER9...

~~~
tpearson-raptor
I might take that bet core for core, matched node size (i.e. 14nm), run out of
tmpfs or NVMe storage, and with both systems protected against Spectre,
Meltdown etc. (i.e. no cheating by ignoring the ISA specifications)...

Slightly OT, but if you limit the playing field to open ISA / fully owner
controlled systems, POWER9 is so far out front in terms of performance that it
looks like an outlier. ;)

~~~
zwegner
> both systems protected against Spectre, Meltdown etc. (i.e. no cheating by
> ignoring the ISA specifications)

Does the x86 specification say "speculation will have no observable side
effects on the memory subsystem"? I wasn't aware of that. In other words,
Spectre/Meltdown are bad security flaws, but I don't think they're the result
of cheating.

~~~
wahern
Meltdown is absolutely cheating as it speculates through permission
boundaries. I don't see how anyone could argue otherwise in good faith,
especially considering that it was only Intel and IBM affected. AMD and SPARC
were immune, and the extent of ARM's vulnerability was a single register,
which seems like a bug, not reflective of an architectural design decision.

You can't design your software to protect against Meltdown, which makes the
ISA guarantees useless. Whereas Spectre-type side channels can be mitigated
through software, which is what cryptographic algorithm implementations have
been doing for years.
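
As a rough illustration of the kind of software-side discipline meant here, this is a minimal sketch of a comparison whose control flow does not depend on where the inputs differ. The function name is hypothetical; Python's standard library offers `hmac.compare_digest` for real use. Python itself gives no hard timing guarantees, so only the structure is the point, and this shows the classic timing-leak pattern rather than a complete answer to every Spectre variant.

```python
import hmac


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Compare two byte strings without exiting early on the first mismatch."""
    if len(a) != len(b):
        return False  # the length is treated as public information here
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # accumulate differences; no branch on secret data
    return diff == 0


secret = b"expected-mac-value"
guess = b"expected-mac-valuf"
# The hand-written sketch and the stdlib helper should agree.
print(constant_time_equal(secret, guess), hmac.compare_digest(secret, guess))
```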

There's far more room for debate regarding culpability for Spectre class
attacks, though it's pretty clear that Intel deliberately pushed the envelope
in ways that AMD and ARM weren't prepared to do.

~~~
zwegner
Again, I don't see that as cheating or violating a spec (or at least any
guarantee made by Intel about x86(-64) behavior that I know of). I would
assume that speculation can do just about anything (access protected memory,
run illegal/protected instructions, etc), as long as any effects aren't
committed to architectural state until the speculated path retires. The
existence of side channels (timing information of subsequent memory accesses)
for retrieving information from speculative execution paths is a different
issue. Perhaps it was naive of the architects to assume this wouldn't have any
consequences beyond improving performance, but I don't know what spec it is
supposed to violate.

~~~
wahern
What's the point of protected memory, _especially_ when using VM extensions,
and _particularly_ with regards to SGX, if the architecture is implemented in
such a way that unprivileged software can read the entire contents of memory
and there's no way for software, either the kernel or the processing software,
to prevent it? You can make a tortured, pedantic argument defending Intel if
we disregard VM-x and SGX, that memory protection was originally intended only
to prevent data corruption, not confidentiality, but at the end of the day all
such an argument does is emphasize the deliberate choices Intel made to
sacrifice confidentiality for performance. And those choices are all the more
unforgivable considering Intel's primary motivation for taking these
performance shortcuts was to expand into and secure their dominance of the
VM and cloud hosting market; a market _predicated_ on their architecture
having at least the nominal capability to ensure data confidentiality.

