
We Need Hardware Traps for Integer Overflow - mmastrac
http://blog.regehr.org/archives/1154  
======
willvarfar
(Mill team)

With apologies to those tired of hearing about the new Mill CPU, let me
explain how the Mill traps on integer overflow:

For overflow-able arithmetic operations, we support four modes:

* truncate

* except

* saturate

* double-width

With excepting overflow, the result is marked as invalid (we term it "Not-a-
Result (NaR)").

As these invalid results are used in further computation, the NaR propagates.

When you finally use the result non-speculatively, e.g. store it or branch
on it, the hardware faults.

NaRs have lots of other uses.

This is described in the Metadata talk
[http://millcomputing.com/topic/metadata/](http://millcomputing.com/topic/metadata/)

And [http://millcomputing.com/topic/introduction-to-the-mill-cpu-programming-model-2/](http://millcomputing.com/topic/introduction-to-the-mill-cpu-programming-model-2/) for a broader overview.

------
rwallace
I think the article overstates the cost of doing this in software and
understates the cost of doing it in hardware.

The cost of doing it in software is just a highly predictable (not taken)
branch after every integer arithmetic operation that the compiler can't prove
stays within bounds. The article presents no data on this cost. I have none to
hand either, but I'm going to predict that on a modern CPU with typical
workloads it will be small enough that it would be very hard to measure.

The article speaks as though doing it in hardware would be free, but that's
very far from true. The hardware solution might have a nominal cost of 'zero
clock cycles' where overflow doesn't occur, but extra transistors in critical,
heavily used parts of the CPU core would be burning a small but nonzero amount
of energy all the time _even on code that doesn't use the overflow check_ -
i.e. the vast majority.

If you think overflow check is a great feature (of which personally I'm not at
all convinced), go ahead and add it to a new language or provide it as a
library function in an existing language. But imposing it as an inescapable
tax on all hardware makes no sense whatsoever.

~~~
jcalvinowens
> The cost of doing it in software is just a highly predictable (not taken)
> branch after every integer arithmetic operation that the compiler can't
> prove stays within bounds.

While I agree with you that hardware overflow traps are a bad idea, I think
the article's author is referring more to the general overhead of a software
BIGNUM implementation. Specifically with his JavaScript example, I think it's
very plausible that using integers instead of IEEE754 would incur a 10%
overhead if you were throwing lots of big enough numbers around.

What the author really wants is a hardware BIGNUM implementation, not overflow
trapping. I just don't think he really thought it through.

His contention that simply checking for overflow in C or C++ incurs a 5%
overhead is unquestionably false though. Typically one would check for integer
overflow in C like this:

    
    
      unsigned int do_addition(unsigned int a, unsigned int b)
      {
    	unsigned int tmp;
    
    	tmp = a;
    	a += b;
    	if (a < tmp)
    		asm volatile ("nop;" ::); /* Handle it here */
    
    	return a;
      }
    

Any decent compiler will do the right thing (GCC with -Os):

      0000000000000000 <do_addition>:
         0:	89 f0                	mov    %esi,%eax
         2:	01 f8                	add    %edi,%eax
         4:	73 01                	jae    7 <do_addition+0x7>
         6:	90                   	nop
         7:	c3                   	retq

I think rwallace is absolutely right in saying the overhead of such a check
would be very close to zero in the non-overflow case.

~~~
stormbrew
Note, in case someone blindly takes this as advice: This method of expressing
it in C doesn't work for multiplication (you could wrap more than once).

I'm not sure you can express an overflow check for multiplication purely in
portable C (without library or intrinsic support), actually. Well, I guess you
could break the multiplication into checked additions manually, but that's
probably not a great idea.

~~~
xenonflash
He also picked one of the most trivial cases - unsigned addition. Signed data,
especially signed multiplication, requires a lot more steps.

I'm working on a set of numerical problems in C now that involve checking for
integer overflow. The best way of doing software overflow checks depends on
the larger scope of the problem. You can knit your checks into various places
in your code in ways that avoid unnecessary duplication of computational
effort. However, that is a lot of work and has the potential for hidden
programmer-introduced errors.

The real problem is that languages like C simply don't allow you to take
advantage of hardware which already exists in the CPU without writing in-line
assembler. A standard set of "checked" math macros which handled the
portability issues would probably satisfy most C applications.

Edit: For addition, subtraction, and multiplication, you just take one
operand, calculate the largest possible second operand for that data type
which won't overflow, and check that the actual second operand doesn't exceed
it (remembering to take signs into account). For division and modulus, check
for division by zero. For multiplication, division, modulus, negation, and
absolute value of signed values, check that you are not negating the maximum
negative integer, as integer ranges are not symmetrical (e.g. one byte is -128
to +127).

If you are looping over arrays and have multiple checks for different cases
(e.g. negative, positive, etc.), then you can have different loops for
different cases and so avoid redundant checks for that data. I'm working on
this sort of application, so the above works out best for that. If you're
doing something a bit different, then different algorithms may make sense.
Unfortunately, there's no universal one-size-fits-all solution to this
problem in software.

~~~
jcalvinowens
> For addition, subtraction, and multiplication, you just take one operand,
> calculate the largest possible second operand for that data type which won't
> overflow, and check that the actual second operand doesn't exceed it

Sure, but that calculation is CPU-dependent, since it depends on how the
underlying hardware represents signed integers. By definition, it is
impossible to portably check for signed integer overflow in C, as I'm sure you
know.

I implemented a simplistic BIGNUM library in C once (that's where I pulled
that expanding multiply code in the other comment from). The only truly
portable way to do that is to make your bignums sign-magnitude and use
exclusively unsigned arithmetic on them. That's what I was envisioning in my
original point about performance degradation due to overflow checking.

Realistically of course, most CPUs these days are two's complement, and you
can make signed overflow defined by compiling with "-fwrapv", which I would
guess is what you're doing.

~~~
xenonflash
Yes I'm assuming two's complement, but there's not a lot of hardware around
these days that isn't two's complement. I'm writing a library for something
that already assumes two's complement while doing other things.

If the code had to be portable to one's complement hardware, then I would
create special cases for that type of hardware. Laying my hands on such
hardware for testing would be the big problem, and if you haven't tested it,
then how do you know that it works?

As for "-fwrapv", it's not portable either, and I need to cover both signed
and unsigned math. It's also not compatible with what I need to link to (I've
gone down this road already). I also need to cover the largest native word
sizes, so the trick of using a larger word size won't work for me.

I'm only dealing with arrays of numbers though, so I can often amortize the
checking calculations over many array elements instead of doing them each
time. This is an example of knitting the checks into the overall algorithm
instead of using a generic approach.

As things stand, there's currently no universal one-size-fits-all answer to
this problem in most languages.

I do like how Python has handled this - integers are infinitely expandable and
simply can't overflow. This comes at the expense of performance though. What
this type of solution needs is an option for unchecked native arithmetic for
cases where you need maximum speed.

~~~
jcalvinowens
> As for "-fwrapv" [...] It's also not compatible with what I need to link to
> (I've gone down this road already)

I'm actually really curious: exactly what issues did you run into with this?
Intuitively I wouldn't think it would be a problem.

------
userbinator
It's a little amusing that x86 has the INTO instruction, a single-byte
opcode at CEh that was designed specifically for this purpose and has been
there since the 8086, but when AMD designed their 64-bit extensions it
became an invalid instruction (and Intel was forced to go along, presumably
for compatibility). A rather shortsighted move, I think; instead of having a
possibly useful (if not previously often-used) instruction, that single-byte
opcode is wasted in 64-bit mode. With it, adding overflow-trapping
arithmetic support to a compiler would be trivial and would add only one
extra byte to each arithmetic operation that needs it.

Ditto for BOUND, which is useful for automatic array bounds-checking - it
performs both lower and upper bounds checks.

Also, I don't really get why integers wrapping around should be
"unexpected". It is only unexpected to those who don't understand how
integers work. The saying "know your limit, play within it" comes to mind.

~~~
JoshTriplett
Those instructions generate an interrupt; you'd have to define the OS ABI to
make those instructions trap back to the application in a catchable way.

~~~
ahomescu1
If you put the INTO immediately after the overflowing instruction, the OS
could just go back one instruction (harder than it seems, but not impossible).
Wouldn't that work?

~~~
JoshTriplett
That'd require the OS to define that as part of the ABI.

Also, you almost certainly don't want to just go back an instruction; you want
to catch and handle the overflow.

------
sounds
Modern CPUs don't like traps ("exceptions"). The exception causes a pipeline
flush which kills performance for math-intensive code.

For example, detecting integer overflow on x86 and x86_64 CPUs is easy: check
the overflow flag after every arithmetic operation. It would only be slightly
more difficult to detect overflow for SSE (vector) operations, which would
require doing some bit masking and shifting.

For a language such as Swift, building it in is simple.

~~~
maggit
The entire premise of the linked article is that doing it the way you suggest
is too expensive for the common case, where no overflow happens. Further, it
implicitly asserts that the cost of a hardware trap/exception, while great,
will be offset by the savings from the common case.

Now, it doesn't back these assertions with much data, but neither do you :)

~~~
al2o3cr
The problem is that potentially-overflowing integer instructions make a real
mess out of things like out-of-order execution and speculative execution.

For data to back this, see your favorite computer architecture reference,
particularly anything that discusses the consequences of highly-complex
instructions in things like the VAX.

~~~
brigade
Citing stuff from the 80s RISC movement is pretty outdated nowadays,
especially in this specific case where quintessential RISC architectures such
as MIPS implemented trapping arithmetic instructions from the start.

------
acqq
"Those who do not understand the hardware are doomed to blog about it, poorly.

Intel chips already detect overflow, that is what the OF (bit 11) flag in the
flags/eflags register indicates. That the most recent operation overflowed.

Testing, and generating a trap, for an overflow is a single instruction (JO –
jump if overflow and JNO – jump if no overflow).

This is true of almost all CPU’s. At the hardware/assembly programming level,
overflow detection is handled by hardware, and detected by a single
instruction. The reason all of your languages listed don’t have any sort of
trap/etc. is simple. The language designers did not bother to check the result
of their operations. Why, most likely because almost all of them are
ultimately implemented in C, and C does not reflect the overflow status back
into the C language level.

So the problem isn’t the hardware. It is the software architectural design(s)
implemented above the hardware. They are not bothering to communicate to you
the results of what the hardware already has available and tells them about."

(Quoting a comment from Anon)

~~~
dspillett
_> Why, most likely because almost all of them are ultimately implemented in
C_

That isn't right: languages created before and after C exhibit the same
behaviour. Some languages _do_ explicitly check the overflow flag every time.

The reason for not always checking (or never checking) is that in a tight
loop the extra instruction can significantly affect performance, especially
back when CPUs had a fraction of their current speed. The reason for not
exposing the value to higher-level constructs is similar: you would have to
check it every time it might change and update an appropriate structure.
Since overflow should be a rare occurrence, checking every time and saving
the result is wasteful.

The linked article specifically mentions the performance effect of checking
overflow flags in software. I believe what it is calling for is some form of
interrupt that fires when the flag is switched on in a context where a handler
for it is enabled - a fairly expensive event happens upon overflow but when
all is well there is no difference in performance (no extra instructions run).
Of course there would be complications here: how does the CPU keep track of
what to call (if anything) in the current situation? Task/thread handling code
in the OS would presumably need to be involved in helping maintain this
information during context switches.

~~~
acqq
The performance penalty you mention ("in a tight loop the extra instruction
can significantly affect performance") doesn't happen if you add a new type
to the language (that is, what in VC++ is a "SafeInt"). In a really tight
loop you wouldn't use that type. That type is important exactly for cases
like the MSFT example I gave (calculating how much to allocate -- overflow
means you allocate much less and you don't catch it!). So no, you don't have
to "check every time."

The reason it's not in standard C is to be portable with some odd old
architecture which doesn't have the overflow flag at all. Some modern language
can be clearly designed to depend on the overflow flag. The cost would happen
only when the programmer really does access it (in a modern language: by using
such a type) and the cost would be minimal, as there is a direct hardware
support.

> I believe what it is calling for is some form of interrupt that fires when
> the flag is switched on in a context where a handler for it is enabled

And that is misguided, as it doesn't allow fine-grained control -- it's all
or nothing: either all instructions generate "an interrupt" or none. If you
want to change the behavior from variable to variable, changing the
processor mode would cost. If you add new instructions for everything that
can overflow _and trap_, you'd add a lot of new instructions. So that's also
bad. The simplest approach is: use what's already there. The flag is there
in the CPU; it's not used by the languages the OP mentions, but once a
language supports it for the "safe" integer type, it will be checked only
when it's really needed: for that type and nowhere else.

Finally, maintaining the exception-handling information (what you call "how
does the CPU keep track of what to call") is something that modern compilers
and even assembly writers must already take care of, and it is very well
understood: for example, the Win x64 ABI expects every non-leaf function to
maintain proper stack-unwinding information, so even when I write assembly
code I effectively have to support exceptions for every non-trivial function
I write. This part is well known, and most of it is handled outside the OS;
the OS merely has some expectations that the compiler (in the broad sense,
including the native code generator and linker) must fulfill.

------
forrestthewoods
I'm gonna shill for a moment and link a post I wrote on integer overflows very
recently. [http://forrestthewoods.com/perfect-prevention-of-int-
overflo...](http://forrestthewoods.com/perfect-prevention-of-int-overflows/)

TLDR: Not accidentally performing an undefined operation is really really
hard.

~~~
pascal_cuoq
Hello,

the comment system on your blog eats comments; they are lost forever when
submitted. You might want to know that.

Apart from that:

1) “2147483650f” is Java syntax. Also this number, written in C++ as
2147483650.0f, actually represents the number 2147483648 (assuming float is
the single-precision IEEE 754 format). You might want to denote that number
2147483648.0f, which would be less confusing.

2) the line “const int min = 0x80000000;” does NOT contain undefined behavior.
The overflow occurs during a conversion from an integer type to a signed
integer type. Overflow during such a conversion is implementation-defined (or
an implementation-defined signal is raised). Even an implementation-defined
signal is not undefined behavior, but in practice, the compiler you use
produces wrap-around behavior, and it will continue to do so, because it has
been forced to document it.

~~~
forrestthewoods
What login did you use for making the comments? It just uses Disqus which is
pretty common these days. I was able to successfully make a comment with a
Twitter login. Recent posts have had surprisingly few comments but not zero.
Strange...

The 2147483648 vs 2147483650 is actually kind of interesting. The Visual
Studio watch window prints floats with one less digit of precision than they
need: it actually displays (from memory) 2.14748365e9 when it _should_
display 2.147483648e9. Thanks for the catch.

I'll have to check the spec docs to see if the unsigned-to-signed overflow
is undefined or implementation-defined. I think you're correct but I need to
verify.

------
angersock
One of the surprisingly annoying minor things we implemented for some safe
code were C routines for "safe arithmetic" -- explicitly and carefully
catching overflows due to multiplication and addition, for example.

This code proved invaluable when writing binary parsers designed to support
"unsafe" mesh data coming off of the network--the file might be garbage, but
we could at least safely parse it.

I'd go so far as to say that if you are dealing with a binary
format--especially a sane one which has length headers and chunks for rapid
seeking--and you _aren't_ using something similar, you are doing it wrong.

EDIT:

For the curious, you may find some of these bit-twiddling hacks to be of some
use.

[http://graphics.stanford.edu/~seander/bithacks.html#IntegerL...](http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious)

~~~
TorKlingberg
Regehr, where this post is from, recently had an article about how to do safe
arithmetic in C.

[http://blog.regehr.org/archives/1139](http://blog.regehr.org/archives/1139)

As you say, it is surprisingly difficult to get right, especially if you
want good performance. A common mistake is checking for signed integer
overflow after the fact; by then the undefined behavior has already
happened.

~~~
userbinator
> It is too late when undefined behavior has already happened.

This seems more like a theoretical concern, as just about all the hardware out
there is 2's complement and signed overflow wraps around in the usual way -
because that's the simplest, most straightforward way to implement it. (This
also means I think compilers that exploit this "loophole" in the language are
seriously violating the principle of least surprise.) If you're working on the
few exceptions to this, then integer overflow is probably going to be the
least of your worries...

------
JoshTriplett
Personally, I'd like to see more programming done in languages that simply
don't allow integer overflow in the first place. Most current languages have
arbitrary-precision integers; well-implemented arbitrary-precision integers
are quite efficient when they fit in a machine word, and as efficient as
possible when larger. Sure, you'll lose a bit of performance due to checks for
overflow, but those checks need to exist anyway, in which case it seems
preferable to Just Work rather than failing.

~~~
jjoonathan
Checks are the smallest cost of bignums. Dynamic allocation isn't free, it's
terribly expensive. Dereferencing pointers isn't free, it's terribly
expensive. It's one thing for a scripting language (where people expect poor,
inconsistent performance) to have automatic bignums but it's quite another for
a language in which people will be writing performant code to have automatic
bignums. They introduce a thousand difficult-to-debug edge cases that slow
your code to 1/3 or 1/5 (or 1/100th or 1/1000th) speed. I don't know about
you, but I don't consider that "Just Working."

Don't get me wrong, they could be a useful feature, perhaps even one which
should be enabled by default in Swift (I'd argue against them, but I wouldn't
be entirely unsympathetic to their proponents). However, they aren't for
everyone and they don't come cheap.

~~~
JoshTriplett
> Dynamic allocation isn't free, it's terribly expensive. Dereferencing
> pointers isn't free, it's terribly expensive.

So don't do either of those things until your arithmetic overflows the size of
a word; until then, you can keep numbers in a register.

~~~
xxs
That's not easy either, as it'd require _very_ heavy inlining by the
compiler. Functions that accept just 'integer' have to check whether the
value is a native number or an actual reference and process it differently.
C/C++, Java*, C# have it easier there - when you pass 'int'/'long' the
receiver knows it's a native number.

* Fixnum support (headerless objects), needed for non-Java languages on the JVM, is still unimplemented to my knowledge[0]

[0]
[http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6674617](http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6674617)

~~~
JoshTriplett
> That's not easy either as it'd require very heavy inlining by the compiler.

Right, bignums need to be a _language_ feature, not a _library_ feature.
Compilers with native bignum support often have ways of handling native
unboxed single-register numbers, and then branching to full bignum routines
when needed. Haskell can do that, for instance.

Also, you can use the standard trick of decreasing the maximum single-register
size and using the extra bits to identify indirect objects.

~~~
xxs
>> _Also, you can use the standard trick of decreasing the maximum single-
register size and using the extra bits to identify indirect objects._

That's a given; you still pay the price (mask/shift), though. That was what
the "languages like C/C#/Java have it easier" part was about. The point is
mostly that even if built-in support (interrupts) exists in the hardware,
bignums still need quite a lot of extra code in place, plus a good
optimizing/inlining compiler.

Personally I am happy with constrained integer types - otherwise the entire
stack (incl. storage) must support bignums, and often (almost always) going
out of range would actually be a bug.

------
jlarocco
x86 processors already have overflow and carry bits in their flags register to
tell when overflow has occurred.

It makes more sense to me to have compiler writers check the flags if they
care about overflow, and avoid the slow down if they don't.

~~~
danbruc
No, doing it in hardware makes more sense. If you expect overflows, you
disable overflow checks in the hardware and everything is as before. If you
don't expect overflows, you enable the hardware checks, and if you have a
bug you get the exception. If you do it in software you have to execute
additional instructions every time, but they do nothing useful if your
program is correct.

------
Someone
I found it weird that nobody mentioned it, but was this posted because
Apple's new language Swift has integer overflow detection (with &+, &-, etc.
operators for doing conventional C-style modulo integer math)?

~~~
noblethrasher
It was posted to reddit six days ago[1], four days before Swift.

[1]
[http://www.reddit.com/r/programming/comments/26s8iu/we_need_...](http://www.reddit.com/r/programming/comments/26s8iu/we_need_hardware_traps_for_integer_overflow/)

------
paulannesley
> In a unicode language we might use ⊞, ⊟, etc.

Nope.

------
zvrba
Or you could have a "sticky" overflow flag, so that you could check for
overflow after each complete expression instead of after every single math
operation. This needs one new flag and one new conditional instruction. (Plus
compiler modifications.)

~~~
stormbrew
I don't know if this solves all the associated problems. If part of the
chain of operations isn't side-effect-free, e.g. a function call, you'll
still have to check more than once per chain. And if you want to do more
than throw an error/exception you may need to re-run the operations on a
slow path to get the correct result.

But I like it a lot even so. For a general case it seems like it would help.

~~~
xxs
The branch is to be predicted (close to perfectly) by the hardware so the
branch cost would be like an extra cycle.

------
stormbrew
An interesting tangent to this, in terms of trapping math operations, is that
Swift added an operator for a _non-_ trapping div-by-zero operation: &/

I'm not sure I've seen that in any other languages before.

------
Sirius17
I've worked with many CPUs (i86, Z80, 6502, ...) and all had some kind of
overflow / underflow indicator, normally in the status register.

------
Gracana
ARMv5 and beyond have saturating arithmetic instructions*, which can be
pretty handy. It won't cause an exception, but it is at least one way to
improve the situation.

* and a sticky saturation flag in the status register

~~~
stephencanon
Deprecated (and moved to NEON) in aarch64, FWIW.

~~~
Gracana
Huh, okay, I guess that's sensible. Thanks for pointing that out, I haven't
been paying much attention to the latest in ARM, so I hadn't heard that.

------
shasta
In case anyone has the same difficulty reading this as I did, "integer types
that wrap" is supposed to refer to arithmetic modulo 2^n. I thought at first
he was talking about wrapped integer types.

~~~
maggit
> wrapped integer types

A google search for "wrapped integer types" gives me the Wikipedia article on
Integer Overflow as its first result. The other hits seem to be related.

What else do you mean by "wrapped integer types" if not arithmetic modulo 2^n?

~~~
simcop2387
I think he probably means boxed integer types: [http://msdn.microsoft.com/en-
us/library/yz2be5wk.aspx](http://msdn.microsoft.com/en-
us/library/yz2be5wk.aspx)

------
robryk
I wonder how often it'd be possible to deduce that an overflow is not possible
from the context. (For example, if we've just checked that INT_MAX/a > b
before doing a*b.)

~~~
DasIch
Division is a lot more expensive than the multiplication itself; checking
the overflow flag after the multiplication - which the OP criticizes as
being too slow - is going to be much faster than your check.

------
perlgeek
My assembler is very rusty, but couldn't compilers check the carry flag on
i386 after a math operation that could potentially overflow, and handle the
trapping in software?

~~~
danbruc
Of course, and this is what actually happens if you enable overflow
checking, but it comes at a price: if your code is correct you will never
need the checks, yet you will execute them every time.

~~~
throwawayaway
Here's the best guide I could find on enabling those checks in GCC. It is
unclear whether they take advantage of the carry bits; do you have a source
for that information?

[http://www.pixelbeat.org/programming/gcc/integer_overflow.ht...](http://www.pixelbeat.org/programming/gcc/integer_overflow.html)

~~~
danbruc
You can see the implementation in GCC here [1], for example addition starting
on line 74. What machine code gets emitted for that obviously depends on the
target architecture but it is reasonable to assume that the optimizer
recognizes these patterns and uses hardware flags where available.

[1]
[https://github.com/mirrors/gcc/blob/master/libgcc/libgcc2.c](https://github.com/mirrors/gcc/blob/master/libgcc/libgcc2.c)

~~~
throwawayaway
[http://www.emulators.com/docs/LazyOverflowDetect_Final.pdf](http://www.emulators.com/docs/LazyOverflowDetect_Final.pdf)
This paper seems to imply that high-level languages can't access the carry
bit, yet C#'s checked mechanism uses it.

I've been using -ftrapv with gcc for a long time, from here:
[https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-
Options](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options).

From the code you posted and the assembly[1] emitted in the PDF above, I
can't see the optimiser taking advantage of the carry bit. No mention of CF
or OF:

[1]

    lea ebx, DWORD PTR [edi+eax]
    cmp ebx, edi
    jae SHORT $LN4@unsigned_c
    cmp ebx, eax
    jae SHORT $LN4@unsigned_c
    mov ecx, 1
    jmp SHORT $LN5@unsigned_c
    xor ecx, ecx

The GCC flag -ftrapv only works for signed integers; it's not clear what
the carry-bit behaviour is for unsigned integers.

------
enupten
Didn't Lisp machines have hardware traps of some kind?

~~~
ScottBurson
Lots of them. But the CPU wasn't even pipelined, never mind superscalar, so
adding traps was easy -- just more microcode.

For example, there was tag type DTP-GC-FORWARD. When the CPU loaded a word
from memory with this value in the tag field, it would automatically indirect
through the pointer contained in the word.

