
Intel's Xeon Phi Is Being Sold for a Low Price - lelf
http://www.phoronix.com/scan.php?page=news_item&px=MTgzNjY
======
Coding_Cat
An insanely low price _relative to_ its original price. Personally, I still
find this a very reasonable price (double it would still be decent), but
normally they are heavily overpriced.

It's hard to compare different devices of course, but in terms of pure FLOPS
(and benchmarks) it is roughly the same as an AMD r280x, which costs $240 and
allows you to play video-games in between your work.

There is definitely a market for these cards, I think; niche, but it exists.
However, the normal price of >$2k is the old-fashioned "industry tax" (cf.
Tesla cards) and not a reflection of the actual performance.

~~~
rayiner
You're comparing apples and oranges. The Xeon Phi has full-fledged x86 cores
and 8GB of ECC memory. It's a much more flexible machine than any GPU.

~~~
Marat_Dukhan
I wouldn't call Xeon Phi x86:

\- It doesn't support CMOV, MMX, SSE, AVX, or any of the other ISA extensions
that popped up after the original Pentium.

\- It does not use standard System V x86-64 ABI.

\- Most of its compute power is due to a special vector instruction set that is
not compatible with any previous or future x86 ISA (including AVX-512).

\- The only fully supported compiler is the Intel Compiler (icc). And in my
opinion, its code-generation quality for Xeon Phi is far worse than for other
Intel architectures.

\- The only supported assembler is GAS. No NASM, no YASM.

\- Most debugging and profiling utilities (that run fine on normal x86 cores)
are not supported on Xeon Phi. This includes valgrind, Address Sanitizer,
Thread Sanitizer, and even memory profiling options in Intel compiler.

Given these limitations, how does it help that Xeon Phi is x86-based?

~~~
rayiner
It handles branching and memory access like a conventional x86 core.

------
cpks
I'm a little confused about the value proposition here. This is about 60 CPUs
at 1GHz with little memory and low IPC. Price, before promotion, is $2k, and
after, $250.

I can get an FX-9590 for $250, which has 8 cores, but running at 4.7GHz, and
higher IPC. In terms of raw compute speed, it seems like the FX-9590 will be
at least the same speed. But:

* I can use standard programming tools.

* I only distribute computation 8 ways, not 60 ways. That's easier.

* Anything which is not easily made parallel is much faster. My experience is that there tends to be a lot of that.

* I have less work to transfer data back and forth. For big data, that actually takes a fair bit of time.

When would the Phi be faster? When would I want to use it?
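
One rough way to frame that question is a toy model in the style of Amdahl's law. The sketch below is only an illustration: it assumes per-core speed can be approximated by clock rate (8 cores at 4.7GHz versus about 60 at 1GHz, as quoted above) and deliberately ignores IPC, vector width, memory bandwidth, and data-transfer cost. The function and dictionary names are made up for this example.

```python
# Toy Amdahl's-law-style model: relative time to finish one unit of work when
# only part of it can be spread across cores.
# Assumption: per-core speed is approximated by clock rate in GHz; IPC, vector
# units, memory bandwidth, and host<->card transfers are all ignored.

def run_time(parallel_fraction, cores, core_speed):
    """Relative time for one unit of work (lower is better)."""
    serial_part = (1.0 - parallel_fraction) / core_speed
    parallel_part = parallel_fraction / (cores * core_speed)
    return serial_part + parallel_part

# Rough figures from the comment above.
FX_9590 = {"cores": 8, "core_speed": 4.7}
XEON_PHI = {"cores": 60, "core_speed": 1.0}

for p in (0.50, 0.95, 0.99, 1.00):
    fx = run_time(p, **FX_9590)
    phi = run_time(p, **XEON_PHI)
    winner = "FX-9590" if fx < phi else "Xeon Phi"
    print(f"parallel fraction {p:.2f}: FX {fx:.4f}, Phi {phi:.4f} -> {winner}")
```

Under these scalar-only assumptions the Phi loses at a parallel fraction of 0.95 and only pulls ahead at 0.99 and above, which matches the intuition that serial sections dominate quickly on slow cores. Real workloads that use the Phi's vector units would shift the picture.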

~~~
nkurz
_In terms of raw compute speed, it seems like the FX-9590 will be at least the
same speed._

They aren't really comparable. An FX-9590 at 4.7GHz is about 300 GFlops, but
the Xeon Phi 31S1P is slightly over 1 TFlop --- more than 3x as many
floating point operations per second. The spec that probably shows the most
difference is memory bandwidth. The FX-9590 with 1866 memory achieves 30GB/s.
The Phi reaches 320GB/s, about 10 times as much. The presumption is that you
might put 4 of these in a server, in addition to the standard processors.
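
The ratios in this paragraph can be checked with a few lines of arithmetic. This is a sketch using only the approximate figures quoted above; the Phi's "slightly over 1 TFlop" is rounded down to 1000 GFlops as an assumption, and the dictionary and function names are illustrative.

```python
# Approximate figures as quoted in the comment above, not measured here.
# Assumption: "slightly over 1 TFlop" is rounded down to 1000 GFlops.
FX_9590 = {"gflops": 300.0, "mem_bw_gb_s": 30.0}
PHI_31S1P = {"gflops": 1000.0, "mem_bw_gb_s": 320.0}

def ratio(a, b, key):
    """How many times larger a[key] is than b[key]."""
    return a[key] / b[key]

def flops_per_byte(device):
    """Peak floating point operations available per byte of memory traffic."""
    return device["gflops"] / device["mem_bw_gb_s"]

compute_ratio = ratio(PHI_31S1P, FX_9590, "gflops")
bandwidth_ratio = ratio(PHI_31S1P, FX_9590, "mem_bw_gb_s")

print(f"compute:   {compute_ratio:.1f}x")
print(f"bandwidth: {bandwidth_ratio:.1f}x")
print(f"FX-9590 flops per byte: {flops_per_byte(FX_9590):.3f}")
print(f"Phi flops per byte:     {flops_per_byte(PHI_31S1P):.3f}")
```

The flops-per-byte figure is a crude balance metric: on these numbers the Phi has proportionally more bandwidth behind each floating point operation, which is one reason bandwidth-bound kernels are a plausible fit.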

The price is actually quite a bit better, too, as some vendors are offering
the Phi at $125 each in quantities of 10: [http://www.colfax-
intl.com/nd/xeonphi/31s1p-promo.aspx](http://www.colfax-
intl.com/nd/xeonphi/31s1p-promo.aspx)

This article offers some good information about cases where the Phi is a good
(or bad) choice:
[https://software.intel.com/sites/default/files/article/33016...](https://software.intel.com/sites/default/files/article/330164/an-
overview-of-programming-for-intel-xeon-processors-and-intel-xeon-phi-
coprocessors_1.pdf)

The short answer would be that the Phi might be the best choice in cases where
you need to perform billions of energy-efficient floating point operations on
a working set small enough to fit in the card's RAM, where the degree of
branching is such that a "normal" GPU would be inappropriate, and where the
problem justifies considerable programmer time for optimization. Financial and
scientific models are the usual examples.

~~~
tormeh
Ok, but if the normal price is 2000USD, then that leaves space for a lot of
9590s. You could get almost the same memory bandwidth and 3x as much
processing power with 10 9590s. And remember that's floating point
performance; the Bulldozer architecture the 9590 is based on gives you
effectively about 1.5x as many cores when doing integer operations.

So, ok, the Phi is useful at the current price, but at the 2000USD price other
people are talking about in the thread it seems pretty useless.
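
The "10 9590s" comparison works out as follows if the per-device figures quoted earlier in the thread (about 300 GFlops and 30GB/s for the FX-9590, about 1 TFlop and 320GB/s for the Phi) are simply summed. This is a naive sketch: it assumes perfect scaling and ignores that ten CPUs need ten sockets with separate, unshared memory. The names are illustrative.

```python
# Per-device figures quoted earlier in the thread (approximate).
FX_9590 = {"gflops": 300.0, "mem_bw_gb_s": 30.0}
XEON_PHI = {"gflops": 1000.0, "mem_bw_gb_s": 320.0}

def aggregate(device, count):
    """Naive sum over `count` devices; assumes perfect scaling across machines."""
    return {key: value * count for key, value in device.items()}

ten_fx = aggregate(FX_9590, 10)
compute_vs_phi = ten_fx["gflops"] / XEON_PHI["gflops"]
bandwidth_vs_phi = ten_fx["mem_bw_gb_s"] / XEON_PHI["mem_bw_gb_s"]

print(f"10x FX-9590 compute vs one Phi:   {compute_vs_phi:.2f}x")
print(f"10x FX-9590 bandwidth vs one Phi: {bandwidth_vs_phi:.2f}x")
```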

~~~
awalton
Everyone in this thread keeps railing on its floating point perf, which is
about on par with parts like nVidia's Tesla and similarly priced... but its
_integer_ performance is why people are really buying these cards up. All GPUs
are floating point monsters but integer pansies because that's what their
graphics workload heritage turned them into. Xeon Phi is a Computing Monster,
with crazy Integer and Floating Point perf, because its heritage was "Let's
try to make a graphics card out of CPUs" and not vice versa. It's similar to
buying a whole stack of x86 chips just to get to the SSE/AVX units. Only, now
you don't need a whole stack of computers, you need a handful of host
computers and a stack of add-on cards.

And if you think $2000 is expensive for 60 x86 cores (~$35/core) when all you
care about is the SSE/AVX unit and how many vector ops you can cram through
it, you're definitely not Intel's target.
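
For reference, the per-core arithmetic at the prices mentioned in this thread can be sketched as below. The core count of 60 is the round figure used in the comments, taken here as an assumption, and `price_per_core` is just an illustrative helper.

```python
def price_per_core(price_usd, cores):
    """Dollars per core at a given card price."""
    return price_usd / cores

CORES = 60  # assumption: the "about 60" figure used in the thread

# Prices mentioned in the thread: list price, promotion, and quantity-of-10 offer.
for label, price in (("list", 2000), ("promotion", 250), ("quantity of 10", 125)):
    print(f"{label}: ${price_per_core(price, CORES):.2f} per core")
```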

~~~
nkurz
Thanks, you are right that I shouldn't have focussed on just floating point. I
emphasized it because it better matched the frequently quoted "teraflop in a
single card" measurement. As you say, the Phi is equally strong at vector
integer operations as floating point, in both cases operating on 512-bit
vectors.

But I'll quibble with the assertion that all GPUs are "integer pansies",
although it's a fantastic phrase. Historically, NVidia GPUs are much stronger
at floating point operations than integer because of their graphics origins,
and NVidia's Tesla is definitely what Intel views as nearest to the Phi's
target market.

But AMD/ATI cards have excellent vector integer performance, several times
that of NVidia/Tesla, and much better per dollar than Phi until this recent
price drop. This is why AMD cards were the best choice for bitcoin mining
until the recent arrival of dedicated ASICs:
[http://www.extremetech.com/computing/153467-amd-destroys-
nvi...](http://www.extremetech.com/computing/153467-amd-destroys-nvidia-
bitcoin-mining)

------
jsnell
It is a very aggressive promotion, but I kind of wonder whether there's any
chance it'll find a useful target audience. At least when we briefly looked at
deploying various kinds of coprocessor solutions, paying the real cost for a
development unit would not have been any kind of a blocker. It's just that
actually deploying a Phi in production didn't seem to make any sense for us in
the end (just like GPU and ASIC based options didn't either). I would imagine
that a development unit would not have been an issue for anyone else who has
seriously evaluated one of these. Seems like this kind of fire sale would only
help in reaching people who need lots and lots of Phis, don't realize it yet,
and are susceptible to buying a piece of kit just because it's almost free.
(And admittedly very cool).

Also, my understanding is that this model is a 270W TDP board with passive
cooling. What kind of a machine can you actually install those in?

~~~
13
You're meant to be running these in server rack cases that already have fans
in the front. Most server CPU heatsink designs take this sort of form as
well[1], as there's no point in putting a tiny fan on top of the heatsink when
you've got delta fans at the front blasting air through what amounts to a
sealed pipe. I've dealt with GPUs of a lower TDP in a cramped environment and
the amount of airflow you need is completely insane. If you ran this without
aggressive cooling for more than a couple of seconds it would just ignite.

[1]: [https://i.imgur.com/yh2oMeh.png](https://i.imgur.com/yh2oMeh.png)

------
lovelearning
Some case studies of this here [1].

[1]: [https://software.intel.com/en-us/mic-
developer#pid-22612-186...](https://software.intel.com/en-us/mic-
developer#pid-22612-1861)

~~~
tim333
Thanks - I was curious what it was for. I see:

\- Black-Scholes Valuation Computing

\- Weather simulation

fairly specialist number crunching I guess

[update]

I see the 'world's fastest computer', Tianhe-2, uses 48,000 Xeon Phis. Wonder if
they got a bulk discount?

[https://en.wikipedia.org/wiki/Tianhe-2#Applications](https://en.wikipedia.org/wiki/Tianhe-2#Applications)

------
myrryr
In 1997 there was a supercomputer called ASCI Red. It had 76 cabinets of
processors.

The entire thing (with memory and disk) took up 1600 sq feet of floor space.

Here we are 17 years later, and we can get the same processing power in a PCIe
card.

------
octotoad
Does anybody know how these sorts of devices are presented to an operating
system? Is it controlled/programmed in a similar manner to a modern GPGPU
card?

~~~
orbifold
They run an embedded Linux operating system on the coprocessor, which in
principle allows you to run programs on it, as if it were an ordinary albeit
weird independent computer (see here [https://software.intel.com/en-
us/articles/building-a-native-...](https://software.intel.com/en-
us/articles/building-a-native-application-for-intel-xeon-phi-coprocessors)).
That is, you can ssh into the device, etc. There are also various tools sold by
Intel, all with three-digit price tags (I believe), that allow you to program
with parallel directives, for example, and translate that into something
that runs on a Xeon Phi. You can also use OpenCL, in which case it presents
itself as one of several OpenCL devices you can use.

------
rbanffy
I've said it before but I think it's worth repeating: general purpose CPU
cores are not becoming much faster. Learning to make your software run well on
multiple cores seems to be a very sound investment.

------
soylentcola
I just checked and these are on the list of unsupported processors for Cinema
4D. Are there any uses for this sort of thing in other 3d rendering
applications or is it not suited to that kind of thing? Only asking because
there are still a lot of rendering engines that don't leverage GPU but $250
for a good chunk of rendering power would be a great alternative to building a
dedicated render farm for someone like me who does it more for art/creative
projects and not so much for a living. Right now my options are just
continuing to use my standard consumer workstation, building a small render
farm, or purchasing (and adapting to) a GPU-accelerated renderer.

I'm certainly no expert on this stuff but I'm always on the lookout for
affordable ways to make my projects easier to work with.

------
sah88
Knights Landing, the second version, is supposed to be coming out in early
2015. If you want to get one, be warned: you might need a special motherboard
as well if you were thinking of tossing it in a consumer box.

[http://www.pugetsystems.com/blog/2013/08/06/Will-your-
mother...](http://www.pugetsystems.com/blog/2013/08/06/Will-your-motherboard-
work-with-Intel-Xeon-Phi-490/)

Here is a link to Intel's promotion page, which lists participating vendors in
various locations:

[https://software.intel.com/en-us/articles/special-
promotion-...](https://software.intel.com/en-us/articles/special-promotion-
intel-xeon-phi-coprocessor-31s1p)

~~~
trsohmers
Won't be until the second half of 2015... We should have a better estimate
come ISC in June, and my bet is on a release in November of 2015 to coincide
with SC15.

------
angry_octet
Perhaps the insanely low price is because no one is buying the current design
in the quantities they naively expected and they have to dump these ugly
monstrosities for whatever price they can get before a new architecture is
released.

------
zoba
The Xeon Phi looks really awesome, and I'd love to be able to have that many
cores at my disposal... however, as far as I can tell, it's not at all like having
57 cores on your current computer. Whereas in my current software (Ruby,
Haskell programs) I can just specify the number of threads to create to
parallelize my software further, I'm not really sure how I'd use the Phi.

------
norswap
What can I use this "coprocessor" for?

~~~
jvreeland
Coprocessors' target audience has been, for the most part, HPC providers. If
you're doing large numerical tasks or need some huge amount of parallelism, they can
be incredibly useful. That being said there's a not insignificant amount of
development time needed to properly use them.

------
khebbie
A really dumb question: Would it be usable for running some kind of virtual
machine, like ubuntu in virtual box or vagrant?

~~~
ams6110
No

~~~
khebbie
Craps, it would have been so cool to buy a 200 usd extension for your computer
and have a mega virtual machine...

------
Xcelerate
Is there a cheap way to connect this to my Macbook pro? I do a lot of MD
simulations, and this would be great to play around with when I don't want to
use research funded computing time, but I feel like the connection to my
laptop could end up being more expensive than the board itself.

~~~
nitrogen
You can probably find a Thunderbolt-connected PCIe enclosure, but the
bandwidth from host memory to card memory will be lower than a direct
connection. Drivers would be another issue.

------
rdc12
I would be really tempted if I can find a computer capable of running it on
the cheap (student). I think my current desktop may work, if I remove the
video card and use the integrated graphics...

------
patrickg_zill
I seem to recall a case where someone used a GPU to handle index lookups or
some other part of query processing for a database. I wonder if this
could be used for that purpose.

------
discardorama
Is the deal over? Amazon lists it for $499 now.

------
danellis
Can Blender Cycles use this? Looks like there was talk about it in 2012/3, but
nothing recently.

------
scottcanoni
How good is it at Crypto Currency mining?

~~~
comboy
Looking at GFlops, much worse than GPUs from a few years back (and we are in the
ASICs era now). Don't know about scrypt, but since there are also ASICs there,
I would imagine cost and power usage per hashrate would not even be
comparable.

------
haddr
good luck trying to get that in europe...

~~~
rbanffy
I live in Brazil. Anything is easier than that...

~~~
comboy
Sorry for being ignorant about politics, but can you elaborate? Why is that
so?

~~~
rbanffy
Customs. Import taxes are absolutely crazy.

------
stefantalpalaru
Previous submission, completely ignored:
[https://news.ycombinator.com/item?id=8593125](https://news.ycombinator.com/item?id=8593125)

~~~
carlwarnick
You buried the lead. The price was the important information.

~~~
stefantalpalaru
Whenever I improve the title, it gets reverted back to the original. Whenever
I point to a secondary source, it gets replaced with the primary. When I use
the primary source with the original title, it gets no votes.

