
The Mill CPU Architecture: Switches [video] - willvarfar
http://millcomputing.com/docs/switches
======
deepnotderp
Well, I'd love to get some discussion here.

One of the perennial questions that I've never received a satisfactory answer
to is: Where's the ILP?

Every time I get asked this, I get referred to the "Phasing" talk, but that
isn't a satisfactory response either.

I think quantitative analysis will prove to be far more nuanced and damning
than expected. Take this paper
([https://pdfs.semanticscholar.org/1002/50a822251d1302c146ab49...](https://pdfs.semanticscholar.org/1002/50a822251d1302c146ab499635af2b3adb45.pdf)).
This paper analyzes a processor with _much STRONGER_ characteristics than the
Mill: It dynamically identifies these dependencies and then schedules them.

The result? It gets smashed by a classic OoO.

~~~
cwzwarich
It seems to me that the main fundamentally new idea of the Mill is the
deferred load mechanism described in the Memory talk. They claim that it can
compete with OoO execution with a much simpler implementation. It does
require some of the other Mill mechanisms for full efficiency, e.g. it uses
the belt to defer loads across calls.

If this mechanism doesn't work as expected with real software, then there's no
reason to think that the Mill will fare any better for general-purpose
computation than any earlier VLIW/EPIC machine, for all the same reasons as
before. And in that same talk they make some claims about OoO that seem a bit
naive. The state-of-the-art has evolved a lot since the company was founded in
2003.

~~~
deepnotderp
Copy paste:

Fun fact: the Mill's proposed method for hiding that latency, "deferred loads",
has already been done by Duke's Architecture Group:
[http://people.duke.edu/~bcl15/documents/huang2016-nisc.pdf](http://people.duke.edu/~bcl15/documents/huang2016-nisc.pdf)
(warning: PDF link). The big gain? A measly ~8%.

~~~
jimrandomh
IIRC the Mill's presentation about deferred loads predates this paper, though
the paper is a lot more detailed and has simulations. It's not clear how the
Mill's gain from deferred loads would compare (it differs in a lot of other
ways that would interact).

------
rbanffy
There seems to be no hardware implementation right now, not even on FPGA (last
message about it dating from 2014). I also couldn't find a GCC or LLVM backend
for its architecture, or a software simulator, with or without cycle counters.

As much as I'd love to see something new in the general-purpose CPU arena (was
Itanic the last attempt?) I will remain skeptical of such claims until they
can be demonstrated by actual benchmarks.

~~~
chillydawg
They've got a huge mountain of patents to file and sort out before they can
start going public, I think.

~~~
DiabloD3
On top of this, they _have_ gotten patents filed, and as they do go through
the process, information gets released about what is now covered by patent.

As in, they are making good on the promise of actually proving it isn't
vaporware; it is just a slow and painful process.

Anything that gets leaked ahead of time can be used to derail their patent
process and, depending on what it is, it could be the difference between being
a profitable venture or not.

Everyone who makes certain types of processors (which includes Intel and AMD
both, along with all the ARM licensees) has a vested interest in ruining this
effort, whether or not Mill ends up being successful (as in, Mill holding
certain patents, even if their implementation was not successful, is still a
highly profitable venture).

As a side note, a lot of people use Itanium as an excuse to shit on Mill: just
remember, quite a few things engineered for Itanium have found their way back
into today's x86 designs. As in, things Intel patented in Itanium are still in
use and lived on. Even though the Mill is strange, if it doesn't succeed,
parts of it will live on in other designs.

