
Russia now selling home-grown CPUs with Transmeta-like x86 emulation - lelf
http://arstechnica.com/gadgets/2015/05/russia-now-selling-home-grown-cpus-with-transmeta-like-x86-emulation/
======
huhtenberg
Elbrus is a mature project with quite a bit of history, going all the way back
to the mid 70s. It was state-funded, so it went through several periods of
stagnation and nearly got scrapped at some point, but back when I was at
uni many CS profs spoke with a great deal of reverence of both the project
itself and those working on it.

[http://en.wikipedia.org/wiki/Elbrus_%28computer%29](http://en.wikipedia.org/wiki/Elbrus_%28computer%29)

~~~
thesz
The difference between the Elbrus of the 70s and the contemporary one is quite
significant.

The old Elbrus was stack-based on the outside and had a layer that translated
stack-based ops into RISC commands for OoO execution. The stack-based
instruction set was meant to reduce code size (and the complexity of code
generation).

New Elbruses are VLIWs and I cannot agree with that architectural decision.
They claim their VLIW and compiler solve frequent stalls (a hallmark of any
VLIW arch, except in a DSP setting where memory access is quite predictable),
but the benchmark numbers do not agree with that.

Consider this: [http://www.7-cpu.com/](http://www.7-cpu.com/)

Elbrus with 4 threads is about 15 times as slow in compression as Intel i7
(Intel i7 3770 (Ivy Bridge)). The difference in clock speeds is about
sevenfold.

7-zip compression is very memory-intensive, and it accesses memory in a rather
unfriendly manner: going backwards in the dictionary search and forwards in
the comparison.

This great discrepancy means that Elbrus stalls much more heavily than the i7.
And rightfully so: OoO CPUs like the i7 are specifically designed to avoid
stalls.
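
As a rough sketch of the arithmetic behind that discrepancy, using only the
rounded figures quoted above (15x slower, about 7x lower clock). Putting the
whole per-clock remainder down to stalls is this comment's reading, not
something the two numbers prove on their own:

```python
# Rounded figures quoted in the comment (taken from the 7-cpu.com tables).
slowdown_total = 15.0  # Elbrus with 4 threads vs. i7-3770, 7-Zip compression
clock_ratio = 7.0      # i7 clock / Elbrus clock ("about sevenfold")

# If speed scaled purely with clock frequency, the gap would equal clock_ratio.
# Whatever is left over is a per-clock deficit.
per_clock_gap = slowdown_total / clock_ratio

print(f"per-clock gap: {per_clock_gap:.2f}x")  # per-clock gap: 2.14x
```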

Other than that CPU architecture decision, Elbrus as a SoC is very good.

~~~
userbinator
I think that's a memory bandwidth bottleneck, since with a single thread the
Elbrus is close to Ivy Bridge in performance: it achieves 600MIPS at 500MHz
which is 1.2MIPS/MHz, while Ivy Bridge with 4200MIPS at 3400MHz is
1.24MIPS/MHz.
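
For reference, the per-clock comparison above is just the benchmark rating
divided by the clock frequency. A minimal sketch with the same four numbers
(the helper name `mips_per_mhz` is only for illustration):

```python
def mips_per_mhz(mips: float, mhz: float) -> float:
    """Benchmark rating normalised by clock frequency."""
    return mips / mhz

elbrus = mips_per_mhz(600, 500)         # single thread, 500 MHz
ivy_bridge = mips_per_mhz(4200, 3400)   # single thread, 3400 MHz

print(f"Elbrus:     {elbrus:.2f} MIPS/MHz")      # Elbrus:     1.20 MIPS/MHz
print(f"Ivy Bridge: {ivy_bridge:.2f} MIPS/MHz")  # Ivy Bridge: 1.24 MIPS/MHz
```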

~~~
thesz
For 7Zip the issue is stalling, not bandwidth.

The latency gap between a 600MHz DDR memory interface and a CPU with a 500MHz
clock is smaller than the gap between an 800MHz DDR interface and a 3.4GHz
CPU. The 500MHz CPU can get away with a smaller cache or fewer cache levels.
The 3.4GHz CPU cannot: there is not enough computation available to mask a
delay of several dozen clock cycles. There is not enough computation available
in 8 SPARC Tx Niagara threads for one 1.8GHz core, even with their scout
thread tech (to reduce stalling, a stalled thread resolves future addresses
and prefetches data). What to say about a 3.4GHz CPU with only two threads per
core? It will certainly stall during cache misses on random memory accesses
with a (relatively) slow DDR interface.
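
A rough sketch of why the same memory hurts the faster core more: a fixed
DRAM access time costs more clock cycles the higher the core clock is. The
50 ns latency below is an assumed round number for illustration, not a
measured figure for either machine:

```python
def miss_penalty_cycles(latency_ns: float, cpu_mhz: float) -> float:
    """Core clock cycles that pass while waiting latency_ns for memory."""
    return latency_ns * cpu_mhz / 1000.0

DRAM_LATENCY_NS = 50.0  # assumption: same absolute memory latency for both CPUs

for cpu_mhz in (500, 3400):
    cycles = miss_penalty_cycles(DRAM_LATENCY_NS, cpu_mhz)
    print(f"{cpu_mhz:>4} MHz core: {cycles:.0f} cycles per miss")
# 500 MHz core: 25 cycles per miss
# 3400 MHz core: 170 cycles per miss
```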

In short, if you reduce the clock frequency of an Intel i7 to 500 MHz, you
will have a pipeline shortened from 15-20 stages to 5-6, in proportion to the
clock frequency reduction, and a reduced stalling penalty for cache misses.
With a shorter pipeline, mispredicted branches do not do much harm. A shorter
pipeline also means quicker computation and less stalling in Read-After-Write
command chains. To sum up, you will get more performance per clock cycle than
at higher frequencies.

(this is why mobile processors are mostly much slower than desktop
counterparts)

So, on the 7Zip problem a hypothetical 500MHz i7 will be faster than just a
3.4GHz i7 scaled down by a factor of 34/5. I think the difference will be two-
to threefold in favor of the 500MHz i7 (mostly due to the shorter pipeline,
partly due to less severe penalties).

This gives that hypothetical 500MHz i7 a speed of 2.5-3MIPS/MHz.

PS Reducing a pipeline from N stages to N-1 stages gives a speed boost of
about 1/N. E.g., I observed speed boosts of about 15%-20% in simulation when
the pipeline length went from 5 to 4 stages. This was in a "real"
cycle-accurate simulation, with proper simulation of DDR memory and
controller, cache and their delays. The tests were also real, like string and
graphics handling/processing.
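
The 1/N rule of thumb is easy to tabulate. This is only the approximation
stated above, not a pipeline model; the stage counts other than 5 are picked
for illustration:

```python
def approx_speedup(n_stages: int) -> float:
    """Rule of thumb: going from N to N-1 pipeline stages gains about 1/N."""
    return 1.0 / n_stages

for n in (5, 15, 20):
    print(f"{n:>2} -> {n - 1:>2} stages: about {approx_speedup(n):.1%}")
#  5 ->  4 stages: about 20.0%
# 15 -> 14 stages: about 6.7%
# 20 -> 19 stages: about 5.0%
```

For N = 5 the rule gives 20%, the upper end of the 15%-20% seen in simulation.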

------
spatular
Here is more information from the official site [1,4,5]:

\- native "Elbrus" ISA or x86 ISA,

\- Elbrus ISA is VLIW, can dispatch 23 operations per cycle (33 with SIMD),
in-order execution,

\- it's stated that x86 code translation + register allocation is done in HW,
but later they write about a software translator and full-system emulator,

\- 6 ALUs (all support integer operations, 4 can do FP),

\- 256 x 84-bit register file,

\- hardware support for loops, including pipelining,

\- some kind of module for async mem preloading,

\- speculative execution and branching predicates,

\- "4S" model has 4 cores,

\- 800 MHz core clock,

\- 64 KB L1, 128 KB L2, 8 MB L3 (shared between cores),

\- 3 DDR3-1600 interfaces, ECC support,

\- 3 x 12GBytes/s inter-CPU links, support for up to 4 sockets,

\- 65nm process, 380 mm^2 die size, 986e6 transistors,

\- software is based on Linux 2.6.33 and Debian 5.0 with more than 3000
packages.
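
A back-of-the-envelope sketch of what the figures in the list above imply.
It assumes each DDR3-1600 interface is a standard 64-bit channel (8 bytes per
transfer, ECC bits not counted) and that all four cores issue the full 23
operations every cycle, so these are theoretical peaks, not measurements:

```python
# Figures from the spec list above.
CORES = 4
CLOCK_HZ = 800_000_000               # 800 MHz core clock
OPS_PER_CYCLE = 23                   # per core, without SIMD
DDR_CHANNELS = 3
DDR_TRANSFERS_PER_S = 1_600_000_000  # DDR3-1600 means 1600 MT/s
TRANSISTORS = 986_000_000
DIE_AREA_MM2 = 380

# Assumption (not stated in the list): 64-bit data bus per DDR3 channel.
BYTES_PER_TRANSFER = 8

peak_ops_per_s = CORES * CLOCK_HZ * OPS_PER_CYCLE
peak_mem_bytes_per_s = DDR_CHANNELS * DDR_TRANSFERS_PER_S * BYTES_PER_TRANSFER
transistors_per_mm2 = TRANSISTORS / DIE_AREA_MM2

print(f"peak issue rate:     {peak_ops_per_s / 1e9:.1f} Gops/s")       # 73.6
print(f"peak DRAM bandwidth: {peak_mem_bytes_per_s / 1e9:.1f} GB/s")   # 38.4
print(f"transistor density:  {transistors_per_mm2 / 1e6:.2f} M/mm^2")  # 2.59
```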

There are some benchmarks for older chip model "2S" (overclocked to 500MHz, 2
cores) [2,3]. FP performance is about 1-5x of Pentium M 1GHz (1 core?)
depending on benchmark, integer performance is about 1x. New CPU, "4S", should
be 3 times faster than "2S".

[1]
[http://www.elbrus.ru/arhitektura_elbrus](http://www.elbrus.ru/arhitektura_elbrus)

[2]
[http://www.elbrus.ru/files/535269/9f0cd8/50606f/000000/2014-...](http://www.elbrus.ru/files/535269/9f0cd8/50606f/000000/2014-04-19_161108.png)

[3]
[http://www.elbrus.ru/files/535269/0e0cd8/50586f/000000/2014-...](http://www.elbrus.ru/files/535269/0e0cd8/50586f/000000/2014-04-19_161043.png)

[4] [http://www.mcst.ru/mikroprocessor-elbrus4s](http://www.mcst.ru/mikroprocessor-elbrus4s)

[5] [http://www.mcst.ru/mikroprocessor-elbrus4s-gotov-k-serijnomu...](http://www.mcst.ru/mikroprocessor-elbrus4s-gotov-k-serijnomu-proizvodtstvu)

\--

Edit: loop pipelining, OS information

~~~
agumonkey
How many processors have hardware support for loops? I expect it to be a
different, more efficient infrastructure than Comparison/Jump, maybe something
similar to DisplayLists in old OpenGL?

~~~
Zardoz84
x86 LOOP

~~~
revelation
Or just REP.
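
For anyone who hasn't met them, a rough Python model of what the two x86
instructions mentioned here do. It is only a sketch of their semantics with a
32-bit count register; the function names are made up for illustration, and
it says nothing about how Elbrus implements its own hardware loops:

```python
def loop_instruction(ecx: int, body) -> int:
    """Model of `label: <body>; LOOP label` with a 32-bit count register."""
    while True:
        body()
        ecx = (ecx - 1) & 0xFFFFFFFF  # LOOP decrements the counter first...
        if ecx == 0:                  # ...and falls through once it reaches zero
            return ecx

def rep_movsb(mem: bytearray, esi: int, edi: int, ecx: int) -> None:
    """Model of REP MOVSB with the direction flag clear: copy ecx bytes forward."""
    while ecx != 0:                   # REP tests the count before each iteration
        mem[edi] = mem[esi]
        esi, edi, ecx = esi + 1, edi + 1, ecx - 1

hits = []
loop_instruction(3, lambda: hits.append(1))  # body runs three times

mem = bytearray(b"abcd____")
rep_movsb(mem, esi=0, edi=4, ecx=4)          # copies "abcd" over the underscores

print(len(hits), bytes(mem))  # 3 b'abcdabcd'
```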

~~~
ddingus
Which CPU is this for?

~~~
Sanddancer
x86/x64 --
[http://web.itu.edu.tr/kesgin/mul06/intel/instr/rep.html](http://web.itu.edu.tr/kesgin/mul06/intel/instr/rep.html)

~~~
ddingus
Thanks!

------
return0
The title is a little "Soviet"; it's a Russian company, not "Russia".

~~~
ludamad
As far as I know, this is new ground for Russia, so the title could be
warranted

~~~
return0
I might excuse it if it said Russians, but the message they are passing is
that this is part of some state-sponsored plan. "A Finn builds minix clone" is
different than "Finland builds minix clone". [not that linux was a minix
clone]

~~~
Retric
My understanding is the project had at least the same level of state support
as SpaceX. So, sure, it's a private company, but not really a 100% private
project.

PS: Intel is at 14nm and started with 65nm back in 2006. Which is not as bad
as you might assume, but still far from bleeding edge.

------
mrbill
Wasn't the NVIDIA "Denver" Tegra K1 originally intended to be an
x86-compatible chip along these lines, then when they couldn't get the
licensing right, it was turned into an ARM-compatible?

[http://en.wikipedia.org/wiki/Project_Denver](http://en.wikipedia.org/wiki/Project_Denver)

~~~
rkuska
It's not ARM as in the ARM architecture; it's just a Russian acronym.

~~~
mrbill
I'm aware of that; I'm talking about how the K1 was originally supposed to be
x86-compatible through microcode and then had to be implemented as an ARM chip
when the licensing deal fell through. Obviously the Soviet chip doesn't have
to worry about paying Intel licensing costs.

------
ChuckMcM
Hah, what goes around comes around. It is notable that China is going
gangbusters on building ARM variants rather than coming up with an entirely
new architecture. Back in the USSR days when Sun was working with the ELVIS
group they were required to have some Soviet designed machines in addition to
the SparcStations that Sun provided. Those machines were not well liked by the
researchers.

~~~
sedachv
Do you have any more info on the ELVIS joint project? This is the only mention
I've been able to find:
[http://articles.latimes.com/1993-03-17/business/fi-11913_1_n...](http://articles.latimes.com/1993-03-17/business/fi-11913_1_network-technology)
The announcement is dated over a year after the dissolution of the USSR.

~~~
ChuckMcM
Not much more, I left the networking group for the "project Green" group in
1992. The director in charge of that project had been working on it for a bit
before I transferred but to be honest I don't recall if that started before
the end of '91 or in early '92.

------
dmitrygr
I wonder if, unlike Transmeta, they will let you run native code on it too.
Transmeta could not do it for technical reasons. As far as I know, the
underlying arch was designed ONLY as a JIT target, and did not even have
protected memory as such. Memory accesses were translated to different
instructions (one privileged and one not) based on context of translated code.

~~~
exDM69
Not sure about this Russian chip but in the case of Transmeta and Nvidia
Denver (and to a lesser degree, Intel x86 µops), writing the "native" code
directly is not beneficial in any way.

The whole point is that the JIT compiler running in the CPU can make dynamic
optimizations that are somewhat similar in nature to the branch prediction and
other optimizations modern CPUs do.

The native code executed by these CPUs is a poor target for static
compilation. Without runtime data about which branches are taken, which memory
locations are touched, etc, it is not possible to generate code that
outperforms the built-in JIT or can compete with more traditional CPUs.

And besides, the JIT frontend in these chips is rather cheap in terms of power
and performance.

------
bhewes
"Technology independence" is going to be our generation's equivalent of
"energy independence".

~~~
frozenport
Nope. We need "energy independence" to avoid places with dictators and
repugnant middle eastern societies. The only folk that care about "technology
independence" have irrational, kleptocratic nationalist agendas (Russia,
China). When's the last time you had a problem with a bastion of technology?

~~~
touristtam
Funny that you left the USofA out alongside Russia and China. Technology
independence is very relevant in this world where everything seems so
networked that it is virtually impossible to imagine going back.

Just read the last few years' news about hardware back doors and commercial
pressure from technology firms holding a de facto monopolistic position, and
you'd understand a bit more the reluctance of anyone who is not a direct
partner of the USofA to use and rely upon technologies coming from them.

And energy independence is not so much about avoiding having to deal with
those countries you are mentioning (however you want to characterize them),
but about not being chained to their pricing policies, since they are acting
like a cartel. This has been the issue since the initial oil crisis of the
early '70s.

~~~
frozenport
Frankly, I'd take a direct partner of the USofA any day over a direct partner
of Russia or China.

------
cordite
Why don't they do something similar but with ARM? We already have plenty of
proven stuff working on that architecture.

Also, the properties that they are getting feel like being knocked back a
decade--is it due to the fabrication facilities that they have?

~~~
weland
As much as I'd love to see x86 go the fuck away, we have _a lot_ more proven
stuff working on that architecture -- including a lot of legacy or closed-
source software that won't get ported soon.

> Also, the properties that they are getting feel like being knocked back a
> decade--is it due to the fabrication facilities that they have?

Most likely. Top of the line stuff needs a lot of money and expertise --
Russian companies don't have that much of the former, and Russia exports or
rent^H^H^H^H uses a lot of the latter for outsourcing.

------
CUViper
Is that "Elbrus-compatible Linux distro" native? Or just x86 with some tweaks
to make it compatible? (perhaps addressing emulation shortcomings or
optimizations)

If they made kernel changes, and choose to respect GPL, I'd be interested to
see those sources...

~~~
spatular
It's stated on the official site [1] that it's based on Linux 2.6.33, and it
looks like the kernel and userspace are compiled for the Elbrus ISA and run
natively, without x86 emulation.

There are no links to sources on their site, and they don't provide
datasheets. To request sources under GPL you need to get binaries first. I
live in Russia, and I've never seen Elbrus in real use anywhere. It's not
marketed or sold to the general public. I think the target market is
government security agencies. Of course they get sources anyway for audit and
have no incentive to publish them.

[1] (Russian) [http://www.mcst.ru/os_elbrus](http://www.mcst.ru/os_elbrus)

~~~
mjg59
> To request sources under GPL you need to get binaries first.

This is a common misconception, at least as far as GPLv2 goes. If you're
distributing commercially then you have two choices:

1) Include the source when you distribute the binaries (GPLv2 3(a))

2) Offer the source code on demand to any third party (GPLv2 3(b))

Of course, Elbrus may be distributing under the first of these, but if it's
the second then there's no need to obtain binaries in order to make a request
for source.

~~~
spatular
Sure. It's just that in this case option 2 is very unlikely. I'd say getting
sources after buying one or two Elbrus-based servers would be a success for a
company not affiliated with the government.

The company I work for once tried to get Linux kernel sources for an embedded
system produced by another Russian company. We just got an honest "we have our
proprietary module in our linux tree, so we won't be giving you any sources.
Still, we are nice enough to recompile it for you with options you need."
Ugh...

------
Theodores
You could have a CPU that updates itself to be able to emulate whatever new
features come along in x86/AMD64 to therefore be 'future proof'. Even if the
raw performance is not at the same level as the genuine Intel, does it matter
if merely surfing the web? If performance is 'ample' then the CPU that just
gets updated to include new instructions could make it so computers could last
for decades doing things like showing web pages. How hard can that get?

~~~
SCHiM
Many instructions on your CPU are already 'microcoded' instead of wired into
the hardware. I'm not 100% sure how it works, or what it is. But it sounds
like what you're proposing could already be implemented in modern CPUs.

[https://en.wikipedia.org/wiki/Microcode](https://en.wikipedia.org/wiki/Microcode)

~~~
sliverstorm
Microcode is a special unit in the decoder. When a microcoded instruction is
detected in the codestream, it is stopped and replaced by a stream of
instructions from the microcode ROM.

Imaginary example:

mul 37 -1

Could be automatically identified and replaced with a predetermined codestream
of:

add (not 37) 1

This allows super complicated instructions (of which x86 has many) to be
implemented with simple instructions, and it also allows the designer to work
around bugs that might be discovered later on.
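
The imaginary `mul`/`add` pair above relies on the two's-complement identity
-x = (not x) + 1. A quick sketch that checks it, assuming a 32-bit register
width (the width and the function names are assumptions for illustration, and
this is not how any real microcode ROM is written):

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers

def mul_minus_one(x: int) -> int:
    """The 'complicated' instruction: x * -1, truncated to 32 bits."""
    return (x * -1) & MASK

def not_then_add_one(x: int) -> int:
    """The replacement sequence: bitwise NOT, then add 1, truncated to 32 bits."""
    return ((~x & MASK) + 1) & MASK

samples = (0, 1, 37, 0x7FFFFFFF, 0x80000000, MASK)
agree = all(mul_minus_one(x) == not_then_add_one(x) for x in samples)

print(agree, hex(not_then_add_one(37)))  # True 0xffffffdb
```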

~~~
dfox
That is the "x86 on RISC-ish core" interpretation of the term microcode, which
is probably not exactly correct as far as the actual implementations go.

As a technical term, microcode is the contents of a memory that is used to
implement CPU sequencing (in contrast to doing sequencing in hardwired random
logic, or having essentially no sequencing logic, as is the case in classical
RISC designs). In the most straightforward implementation it's essentially a
Turing machine with the microcode as the state table and the tape replaced by
the rest of the CPU. Various enhancements and modifications are possible,
which end in having an essentially full-blown CPU for processing microcode,
but even in that case one thing is critical for the term: the microcode only
handles control functions (sequencing) and does not directly process data.

------
ex3ndr
The price is known. It is about 200,000 RUB, which is roughly $3,900.

~~~
tekni5
I was doubting your comment, but after looking into it, it seems like all the
Russian sources claim it to be around 200,000 rubles.

Does anyone have any idea why an average consumer would even consider buying
this at this price?

~~~
liviu
I think they were made for government purposes, not for regular people.

~~~
static_noise
Also for people who don't want US backdoors in their hardware.

I can imagine that quite a few companies outside the US are going to build
some of their core IT around this CPU. The CPU price is quite low if you
consider what the data is worth.

~~~
visarga
> Also for people who don't want US backdoors in their hardware.

For people who want to swap NSA surveillance for KGB surveillance.

FTFY

~~~
newuser88273
If you're in the west, KGB backdoors are less of a risk to you.

Say you're in the west. NSA has backdoors into your US-designed hardware and
your US-designed OS. NSA also, probably, has access to your ISP's routers and
US net/web corporations like Google. As a result, NSA may hide C&C channels
where KGB use of a backdoor to your hardware would still show detectable
illegitimate network traffic.

Lower risk of detection translates to higher probability of use.

------
ido
Why use names such as SPARC and ARM when the products are neither? Is it some
sort of joke/wordplay that got lost in translation?

~~~
ArtifTh
МЦСТ (MCST) was an abbreviation for "Московский центр SPARC-технологий"
("Moscow Center of SPARC Technologies"); now it's just meaningless letters.
ARM in the computer's name probably stands for "Автоматизированное рабочее
место" \- "Automated workstation".

~~~
huhtenberg
Oddly enough the name is apparently due to Dave Ditzel, who was financing MCST
at the beginning. The same Ditzel that went on to found Transmeta 3 years
later.

PS. I didn't actually know that; I just followed a few links from the Elbrus
Wikipedia page and bumped into a 2003 interview with the MCST founder [1]. He
basically says there that in the early 90s there was a spike of interest in
the tech behind Elbrus 3. First it was HP, then it was Ditzel, who was at that
time with Sun. Ditzel eventually quit Sun and went on to found Transmeta and
develop the Crusoe. Babayan then goes on to say that "We are effectively on a
market, except we are not seeing any money. But on the other hand without
Ditzel we would not have gotten any (research) money and all our work would've
died." Interesting stuff.

[1]
[http://offline.homepc.ru/2003/81/24693/](http://offline.homepc.ru/2003/81/24693/)

------
RexRollman
Transmeta! Now there's a name I haven't heard in a long time.

------
ck2
A little late to the party, maybe they should have been working on ARM

