
Two's complement – You beauty - yogeswarant
http://everyogi.in/c/twos_complement/
======
vog
All of this is suddenly less surprising for people who learned some _modular
arithmetic_ in school (or university). That is, calculating with the
remainders of division. For example:

\- Calculating "modulo 60" means calculating time with a round clock in mind,
considering only minutes and ignoring hours (and seconds).

\- Calculating with angles (in degrees) means just calculating "modulo 360".

\- "modulo 1000" means calculating with the last 3 decimal digits of an
integer, ignoring the front digits.

The fundamental result here is: No matter modulo which number you calculate:
addition, negation, subtraction and multiplication work "out of the box". And
you'll quickly notice that "two's complement" just means calculating modulo
2^n, where n=8,16,32,64 or 128. But this all really works for any m >= 2, not
just m = 2^n.

(One drawback though: division doesn't work in general; it only works if m is
prime, and even then it is slightly different from what you'd expect,
although completely logical.)

In short, the elegance comes from modular arithmetic. It has nothing to do
with "two" or "binary"; it would work the same way on, say, 3-state logic
machines.

 _EDIT: To those who downvoted this: Do you care to elaborate? The author
didn't mention modular arithmetic with a single word, although it is an
essential part of truly understanding how and why two's complement works._

~~~
gizmo686
I didn't downvote you. However, I am a math person and your post _feels_ wrong
for some reason. It isn't wrong, but it feels like it should be; and I am not
entirely sure why.

CPUs use modular arithmetic for both signed (two's complement) and unsigned
values, so that cannot be the defining feature of two's complement. The
insight with two's complement is that we can interpret the "upper" half of
numbers as negative. As you say, this should be very familiar to anyone who
has worked with modular arithmetic. For those who have not: in 3-bit two's
complement (integers mod 8), we interpret 7 = 8-1 = -1, and so on. As a
result, operations on signed and unsigned integers are literally identical.
Almost.

CPUs differ from modular arithmetic in one way: multiplication. Specifically,
CPUs do not do modular arithmetic for multiplication. The result of
multiplying two n-bit numbers can be as big as 2n bits, so when CPUs do this
multiplication, they store the result in two registers. If you only look at
the bottom register, the result is equivalent to modular arithmetic. However,
if you consider the top register, it is not (this is why there are separate
MUL and IMUL instructions).

Arguably, the same thing happens for addition. However, because at most one
extra bit is needed, it is not given its own register; instead, the upper
digit is stored in the carry flag.
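The two-register multiply described above can be sketched in Python (with
n = 8 standing in for the machine word size):

```python
# An n-bit multiply can need up to 2n bits; CPUs split the result into a
# "high" and a "low" register. The low half alone is the product mod 2^n.
n = 8
a, b = 200, 123
full = a * b                     # 24600, too big for 8 bits
low = full & ((1 << n) - 1)      # what the bottom register would hold
high = full >> n                 # what the top register would hold

assert low == (a * b) % (1 << n)     # bottom half == modular result
assert (high << n) | low == full     # both halves together: exact product
```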

The other insight of two's complement is the encoding. Under the construction
I presented above, given the 3-bit two's complement number 0b111, we would
have to compute that:

    
    
        0b111 > 0b011
        0b111 = 0b1000 - 0b0001
    

The insight of two's complement is a way of performing this computation using
primitive bitwise operations. Specifically:

    
    
       If the first bit is 0, we are done
       Otherwise, perform bitwise negation, add 1, and consider the result "negative". 
    

There is no obvious equivalent of this method for other bases.
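The decoding rule above can be sketched in Python (3-bit width to match the
example; the function name is mine):

```python
def decode(bits, n=3):
    """Read an n-bit pattern as two's complement using only bitwise steps:
    if the first (top) bit is 0 we are done; otherwise bitwise-negate
    within n bits, add 1, and call the result negative."""
    if bits >> (n - 1) == 0:
        return bits
    mask = (1 << n) - 1
    return -((~bits & mask) + 1)

assert decode(0b111) == -1
assert decode(0b011) == 3
assert decode(0b100) == -4
```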

~~~
vog
_> CPUs use modular arithmetic for both signed (two's complement) and unsigned
values, so that cannot be the defining feature of two's complement_

I disagree, because this still fits nicely into modular arithmetic.

Signed versus unsigned just means that we choose a different set of
representatives for certain operations (such as comparison). For unsigned, we
use the smallest non-negative representative. For signed, we use the
representative with the smallest absolute value (and prefer the negative one
if there is a tie). Still, nothing to do with binary.

Except for one single detail: the "tie" is resolved in favour of the negative
number, because that way the first bit always denotes the sign. This little
detail is binary-specific, but that's it.

 _> CPUs differ from modular arithmetic in one way: multiplication.
Specifically, CPUs do not do modular arithmetic for multiplication. However,
the result of the multiplication of two n-bit numbers could be as big as
2n bits. When CPUs do this multiplication, they store the result in two
registers. If you only look at the bottom register, the result is equivalent
to modular arithmetic._

Good point, but in most (non-assembly) code that I see, the result of a
multiplication is stored in a same-size integer, so I'd argue the modular
(bottom-register) result is what is actually used most of the time. I agree
that this is still a difference, though.

 _> The insight of two's complement is a way of performing this computation
using primitive bitwise operations._

I believe that negation is _not_ what this is all about. On the contrary,
negation is more complicated for two's complement than for other
representations. For example, in other representations you just flip a single
bit to negate a number.

No, the point is that there are _no special cases_ for increment, decrement,
addition, subtraction and multiplication (with the small difference discussed
above). And there is not even a difference between signed and unsigned
arithmetic except for comparison (also discussed above). This is what works
out perfectly well in modular arithmetic, and has nothing to do with binary.
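A small Python sketch of "same residues, different representatives" (8 bits;
the helper names are mine):

```python
# Signed and unsigned are two ways to name the same residue class mod 2^8.
M = 256

def unsigned_rep(x):
    return x % M                        # smallest non-negative representative

def signed_rep(x):
    r = x % M
    return r - M if r >= M // 2 else r  # smallest absolute value, tie -> negative

# Same bit pattern, same arithmetic...
assert unsigned_rep(-1) == 255 and signed_rep(255) == -1
assert unsigned_rep(200 + 100) == unsigned_rep(44)
# ...only comparison differs:
assert unsigned_rep(200) > unsigned_rep(100)    # unsigned: 200 > 100
assert signed_rep(200) < signed_rep(100)        # signed:   -56 < 100
```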

~~~
Double_Cast
"Two's Complement represents a _linear_ automorphism over a cyclic group" is
what I think you're trying to say. russdill & co are hinting that modular
arithmetic gives you cyclic groups.

I think for most people, the aha-moment comes when they realize One's
Complement double-counts the zero. In contrast, linearity is simply assumed.
Otherwise, why would anyone make a number system out of it?

------
Elrac
I program a UNISYS 2200 mainframe at work. It uses 1's complement.

Yes, there are two zeros. Not a problem in practice, because all arithmetic
operations normalize -0 to +0 at no extra cost in execution time, so -0
practically doesn't happen. IIRC from Assembler class, addition is implemented
as subtraction of the negative operand. Just in case anyone ever needs it,
e.g. for bitmaps, the SZ (store zero) assembler instruction is complemented by
an SNZ.

Programming in a high level language, the representation of negative numbers
is all but transparent to the programmer. When reading dumps, not having to
perform an extra addition (subtraction?) when changing a number's sign is
pretty sweet! Also, abs(-MAXINT) == (+MAXINT), reliably. The asymmetry of
number ranges always bothered me on 2's complement machines.

What _is_ annoying about the UNISYS boxes is the 36 bit word format, though.
Characters are stored in 9 bit quarterwords that map pretty awkwardly to bytes
containing 8-bit ASCII. Binary data formats are essentially incompatible with
anything.

~~~
gumby
> What _is_ annoying about the UNISYS boxes is the 36 bit word format, though.
> Characters are stored in 9 bit quarterwords that map pretty awkwardly to
> bytes containing 8-bit ASCII. Binary data formats are essentially
> incompatible with anything.

This is why the FTP protocol has a byte size command. If all you have is 8-bit
bytes then that seems strange. But at the time FTP was designed the most
common machines on the ARPANET had 36-bit words (mostly PDP-10s and their
derivatives) and bytes (the term was used in the more general sense) were just
bit strings of 1-36 bits. 7-bit ASCII was common (5 characters would fit in a
word, like my username GUMBY), as were six bit bytes (pack six characters into
a word). I never used 9-bit _characters_ though arrays of nine-bit bytes were
not unreasonable.

BTW the PDP-10 had 18-bit addresses so each word of memory held a Lisp cons;
CAR, CDR, RPLACA etc were machine instructions. Gordon Bell and Alan Kotok
designed the -10 (and its predecessor the PDP-6) with Lisp in mind. The first
Lisp Machines.

> Binary data formats are essentially incompatible with anything.

Well, that's true today, but look at it the other way around: Unix was really
developed for an 8/16-bit machine. It was a reimplementation of Multics that
ran on a 36-bit machine (GE 645 & Honeywell 6180) written in PL/1. Unix was
famously written for the PDP-7 (an 18-bit machine) but it was written in
assembly. The famous PDP-11 version was written in a BCPL derivative you might
have heard of called "C" and, since PL/1's level of machine abstraction was
still new, the derivative modeled the PDP-11 architecture. So nowadays all
CPUs are C machines and C runs well on them. Probably the most common non-
PDP-11-like machine most programmers will program these days is a GPU.

~~~
Turing_Machine
> 7-bit ascii was common (5 characters would fit in a word, like my username
> GUMBY), as were six bit bytes (pack six characters into a word)

There were a bunch of different six bit character encodings, often (though not
always strictly correctly) called "BCD". The horror show of IBM's EBCDIC was
an eight bit extension of one of these.

Then there was 5 bit Baudot code, and...

The last time I checked, many *nix systems will still assume that you're on a
5 bit Baudot (uppercase only) teletype (i.e., a genuine physical tty) if you
attempt to log in using all uppercase in your user name.

Some systems hacked in more characters by having special "shift in" and "shift
out" characters. If a "shift in" character appeared in the stream, the system
would switch to the alternate character set until a "shift out" character was
received.

~~~
kps

> … *nix systems will still assume that you're on a 5 bit Baudot (uppercase
> only) teletype …

_Akshully_ the original 1963 version of ASCII¹, which was a 7 bit code but did
not include lower case. The Model 33 teletype² (one of the terminals used by
UNIX developers³, and probably a contributing factor to two-character command
names) was a 1963-ASCII device. Even after 1967 ASCII added lower case,
popular low end video terminals⁴ did not include it so that they could get
away with 6 bits worth of printable character ROM.

¹
[http://worldpowersystems.com/archives/codes/X3.4-1963/index....](http://worldpowersystems.com/archives/codes/X3.4-1963/index.html)

²
[https://en.wikipedia.org/wiki/Teletype_Model_33](https://en.wikipedia.org/wiki/Teletype_Model_33)

³
[https://commons.wikimedia.org/wiki/File:Ken_Thompson_(sittin...](https://commons.wikimedia.org/wiki/File:Ken_Thompson_\(sitting\)_and_Dennis_Ritchie_at_PDP-11_\(2876612463\).jpg)

⁴ [https://en.wikipedia.org/wiki/ADM-3A](https://en.wikipedia.org/wiki/ADM-3A)

------
zardeh
The Python result should be expected. Python's integer type isn't sized; that
is, Python will happily give you factorial(100), despite it being much larger
than 64 bits. It can't then give you two's complement, because it can't know
the size with which to complement the two.
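You can, however, supply the size yourself by masking, which is the usual
Python idiom for viewing a two's complement bit pattern (the helper name is
mine):

```python
def tc_bits(x, n):
    """Show x as an n-bit two's complement bit string."""
    return format(x & ((1 << n) - 1), '0{}b'.format(n))

assert tc_bits(5, 8) == '00000101'
assert tc_bits(-5, 8) == '11111011'
assert tc_bits(-1, 8) == '11111111'
```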

~~~
brandonbloom
That's a reasonable tradeoff for Python, but it's worth noting that there is a
way to deal with this.

Quoting from
[http://reference.wolfram.com/language/tutorial/IntegerAndNum...](http://reference.wolfram.com/language/tutorial/IntegerAndNumberTheoreticalFunctions.html)

    
    
        Bitwise operations are used in various combinatorial
        algorithms. They are also commonly used in manipulating
        bitfields in low‐level computer languages. In such
        languages, however, integers normally have a limited
        number of digits, typically a multiple of 8. Bitwise
        operations in the Wolfram Language in effect allow
        integers to have an unlimited number of digits. When an
        integer is negative, it is taken to be represented in
        two's complement form, with an infinite sequence of ones
        on the left. This allows BitNot[n] to be equivalent
        simply to (-1 - n).
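
Python's ~ operator follows the same convention: conceptually an unbounded
two's complement with infinitely many ones on the left, so ~n == -1 - n holds
for every int:

```python
# Python ints behave like "infinite two's complement" under bitwise not:
for n in (0, 1, 42, -7, 10 ** 30):
    assert ~n == -1 - n
# The "infinite sequence of ones" ...111 is simply printed as -1:
assert ~0 == -1
```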

~~~
gpvos
Yes, but how would you print that infinite sequence of 1s?

~~~
ema
Same as the infinite number of zeros in front of a positive number.

~~~
AstralStorm
Just trimming 1s would be ambiguous: is 0b11 3 or -2? You would have to add a
0 prefix to all positive numbers, or some other prefix to negative ones.

~~~
brandonbloom
Just prefix a + or - just like you do for base 10.

~~~
gpvos
So then it would not make much of a difference anymore with what Python does.
Which was what they wanted to avoid.

------
white-flame
Looking at limited-digit odometer style devices gives the best example of why
2's complement is saner.

    
    
      ...    or  ...
      9998       1110
      9999       1111
      0000       0000
      0001       0001
      0002       0010
      ...
    

What number is before 0 in binary? 1111. So that's where -1 is. The number
before that? 1110, so that's -2. The whole XOR + 1 thing can be derived from
this shape.

That, and simple addition of both signed and unsigned numbers actually works.
:)

The only question is where you draw the line between underflowing negatives
and overflowing positives, and going halfsies on the top bit seems to make
sense. For an 8-bit number, there are 128 numbers on each side of the
negative/non-negative split, but zero mucks it up by not being mathematically
positive. 1's complement evens it out by having 127 numbers on each side plus
two zeros, but messes up signed math.

------
chii
The
[https://en.wikipedia.org/wiki/Method_of_complements](https://en.wikipedia.org/wiki/Method_of_complements)
article explains a more fundamental piece of information about this operation.
I often see articles explaining two's complement that don't say anything
about this general 'complement' method (which works for all bases, not just
binary).

~~~
ramshorns
> In the decimal numbering system, the radix complement is called the ten's
> complement and the diminished radix complement the nines' complement. In
> binary, the radix complement is called the two's complement and the
> diminished radix complement the ones' complement. The naming of complements
> in other bases is similar. Some people, notably Donald Knuth, recommend
> using the placement of the apostrophe to distinguish between the radix
> complement and the diminished radix complement. In this usage, the four's
> complement refers to the radix complement of a number in base four while
> fours' complement is the diminished radix complement of a number in base 5.

That makes sense. The names one's complement and two's complement are kind of
confusing otherwise, since they actually refer to totally different things,
and it's not clear how they would generalize to higher bases.

------
0xcde4c3db
> There would be two ways to represent 0, as +0 and -0.

IEEE 754 floating point actually has this (due to having a dedicated sign
bit). I think most code doesn't care (IIRC they are defined as equal for
comparison purposes even though the bits are different in memory), but
apparently it's sometimes handy to have for some functions that have a
discontinuity at zero or otherwise need to preserve the sign through a
multiplication by zero.

~~~
kybernetikos
Because of this, JavaScript has a +0 and a -0. They make a fun trivia
question, because there are very few ways to distinguish them: most ways of
checking equality (even ===) will report that they are equal.

In fact, I only know two ways to distinguish them: divide something by them
(you get positive infinity for +0 and negative infinity for -0), or use
Object.is(-0, 0), which will return false.
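Python floats follow the same IEEE 754 rules, so the trivia translates
directly (avoiding division, which raises an exception in Python instead of
returning infinity):

```python
import math

assert -0.0 == 0.0                        # equality can't tell them apart
assert str(-0.0) == '-0.0'                # but printing can...
assert math.copysign(1.0, -0.0) == -1.0   # ...and so can copysign
# atan2 sees the sign too: pi for (+0.0, -1) vs -pi for (-0.0, -1)
assert math.atan2(0.0, -1.0) == -math.atan2(-0.0, -1.0)
```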

~~~
SamBam
Weird. I just tried it out. So in Javascript you can have two variables, `a`
and `b`, such that `a === b` and `1/a !== 1/b`.

------
paulddraper
If you're looking for true beauty, look no further than negabinary.
([http://mathworld.wolfram.com/Negabinary.html](http://mathworld.wolfram.com/Negabinary.html))

E.g.

    
    
         3 = (-2)^2 + (-2)^1 + (-2)^0 = 0111_-2
        -3 = (-2)^3 + (-2)^2 + (-2)^0 = 1101_-2
    

There is no sign bit and you don't have to worry about sign; everything just
works like "normal" numbers, because that is in fact what they are.

A pity it was never used except a few times in early computing. I'm not sure
why 2's complement won.
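For the curious, conversion to negabinary is a short repeated division by -2
with the remainder forced into {0, 1} (a sketch; the function name is mine):

```python
def to_negabinary(n):
    """Digits of n in base -2, most significant first."""
    if n == 0:
        return '0'
    digits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:            # force the remainder to be 0 or 1
            r += 2
            n += 1
        digits.append(str(r))
    return ''.join(reversed(digits))

assert to_negabinary(3) == '111'     # 4 - 2 + 1
assert to_negabinary(-3) == '1101'   # -8 + 4 + 1
assert to_negabinary(6) == '11010'   # 16 - 8 - 2
```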

~~~
jonsen
_...why 2 's-complement won._

Maybe conversion to/from character code is easier. Notice the conversion code
in the linked article.

~~~
paulddraper
> conversion to/from character code is easier

(1) How often do you convert from machine integers to binary character
representations?

(2) If you're referring to

    
    
            for(j = 0; j < bitlen; j++) {
                bin[j] = (i & 1) ? '1': '0';
                i >>= 1;    
            }
    

that's the exact same for a negabinary machine.

~~~
jonsen
I meant to/from numbers in text form. Usually decimal.

------
vanderZwan
One thing that I find very fascinating about Gustafson's Unum work is that he
proposes a lot of interesting ideas - not all new, as he himself is happy to
remind you - for encoding numbers:

> _Type 2 unums are a direct map of signed integers to the projective real
> number line. The projective reals map the reals onto a circle, so positive
> and negative infinity meet at the top._

[http://deliveryimages.acm.org/10.1145/3010000/3001758/ins01....](http://deliveryimages.acm.org/10.1145/3010000/3001758/ins01.gif)

He also proposes to include the reciprocal of every included number in this
projection, leading to a very nice property:

> _To negate a unum, you negate the integer associated with the bit string, as
> if that integer was a standard two's complement number. Flip the bits and
> add one, ignoring any overflow; that gives you the negative of an integer.
> It works with no exceptions. But get this: To reciprocate a unum, you ignore
> the first bit and negate what remains! Geometrically, negating is like
> revolving the circle about the vertical axis and reciprocating is revolving
> it about the horizontal axis. And yes, the reciprocal of zero is ±∞ and vice
> versa._

[http://ubiquity.acm.org/article.cfm?id=3001758](http://ubiquity.acm.org/article.cfm?id=3001758)

[http://www.johngustafson.net/presentations/Unums2.0.pdf](http://www.johngustafson.net/presentations/Unums2.0.pdf)

------
bsder
While two's complement is quite clever, it's not an obvious choice when you
are building things out of tubes or relays.

One's complement has two very nice properties:

1) the range is symmetric

2) "end-around carry" makes all the bits look identical in terms of
implementation
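Both properties can be sketched in Python (8 bits; the helper names are
mine): negation is a plain bit flip, and a carry out of the top bit wraps
around to the bottom:

```python
N = 8
MASK = (1 << N) - 1

def ones_neg(x):
    return ~x & MASK                # 1's complement negation: flip every bit

def ones_add(a, b):
    s = a + b
    if s > MASK:                    # end-around carry: the bit carried out
        s = (s & MASK) + 1          # of the top comes back in at the bottom
    return s & MASK

assert ones_add(0b00000101, ones_neg(0b00000011)) == 0b00000010  # 5 + (-3) == 2
assert ones_neg(0b00000000) == 0b11111111                        # the second zero
```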

~~~
bogomipz
But isn't it a drawback of 1's complement that 0 in an 8-bit number can be
represented in two different ways?

0000 0000 and 1111 1111

~~~
bsder
Is it a drawback? If your "compare to 0" is "are all bits the same", then it
would be an advantage.

An asymmetric range is a "drawback" of two's complement. Is that a drawback?

It depends upon what your building blocks in the technology are. For example,
we don't use J-K flip-flops anymore because they are a pain to make and use
when MOSFETs are your building blocks (J-K wants bipolars).

------
laszlokorte
A widget I built last year for interactively visualizing a number circle with
various binary interpretations: [https://thesis.laszlokorte.de/demo/number-
circle.html](https://thesis.laszlokorte.de/demo/number-circle.html)

~~~
colejohnson66
That's really cool! However, I can drag the SVG's viewbox around; something I
wouldn't expect to be able to do.

EDIT: It makes sense when you zoom in (like you would with 7 bits), but if
I'm zoomed out all the way, I wouldn't expect to be able to pan around. I'd
expect it to prevent panning past the edge.

~~~
laszlokorte
Makes sense - thanks for the suggestion :)

------
eps
It's an elegant construct, but this used to be a part of an entry-level course
in every computer school I know of. Have things changed now?

~~~
TillE
I suspect lots of people here have little or no higher education. Mine was 10+
years ago, so an occasional refresher can be nice; I remember the general
concepts, but not necessarily the details.

~~~
ashark
The category of Things I Once Knew but Have Since Forgotten Because I Don't
Use Them Often Enough probably includes 95+% of everything I've ever learned
about both mathematics and computers/programming.

The people on here who rattle off the names of various mathematical theorems
like it's nothing and act like it's weird not to remember how intro-level
algorithms work without thinking really hard and doing some trial-and-error
for a while or consulting a reference must have much more interesting jobs
than I _ever_ have. :-/

------
crawfordcomeaux
Would it be possible to use -0 & +0 in useful ways, like determining the
"direction" of the previous operation? No clue how that could be useful, but
I'm betting there's a use case out there.

~~~
cmrx64
Yes! Although I'm not sure if it would ever be useful for integers, it's vital
in floating point:
[https://people.freebsd.org/~das/kahan86branch.pdf](https://people.freebsd.org/~das/kahan86branch.pdf)

------
morecoffee
Probably the most interesting part of 2's complement is how to negate
numbers: flip all the bits and add 1. Which is strange, because to negate the
number again, you flip all the bits and add 1 again. It feels like you
accidentally added 2, doesn't it?
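A quick exhaustive Python check shows the apparent extra 2 cancels out (8
bits; the helper name is mine):

```python
M = 1 << 8
MASK = M - 1

def neg(x):
    return ((x ^ MASK) + 1) & MASK   # flip all bits, add 1 (mod 2^8)

# Negating twice returns every value unchanged: no accidental +2, because
# flipping the bits of (~x + 1) gives x - 1, and the second +1 restores x.
for x in range(M):
    assert neg(neg(x)) == x
```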

------
dnautics
There is that weird number 100000...0000, the extra negative number with no
positive analog. I'm currently working with a non-integer value
representation system that hijacks this value and assigns it +/- infinity
(negation is two's complement).

------
1010111
Shitty C code: the binary printing function is needlessly complicated.

    
    
        void print_bin(int x) {
            for (unsigned m = ~(~0u >> 1); m; m >>= 1)
                putchar(x & m ? '1' : '0');
            putchar('\n');
        }

~~~
yellowapple
Shitty C code: I've seen Perl golfs more readable than this.

The article's code is more explicit and verbose, which makes it easier for
non-C-programmers (like myself) to actually understand what's going on.

~~~
todd8
Understandably confusing for a non-C-programmer, but it is idiomatic C. It's
not code golf; it's just the way one does machine independent bit twiddling.

A mask is used to test each bit position in turn. The tests look like this if
written using binary literals:

    
    
        0b1000 & x
        0b0100 & x
        0b0010 & x
        0b0001 & x
    

The '&' is the bitwise AND operator in C. Of course, we'd have to do as many
of these as the word size, so _1010111_ uses a for loop that starts with the
first mask, a 1 in the leftmost position, and shifts it right one position
each time through the loop (using the C right-shift operator >> on an
unsigned mask value). When the one bit is eventually shifted out the right
side of the mask, the mask is all zeros, so the loop terminates because zero
acts like false in the for loop test.

The only other tricky thing is initializing the mask. To set only the leftmost
bit in a word the code uses the bit complement operator ~ of C. Breaking it
down for a four bit example looks like:

    
    
        0u          == 0b0000
        ~0u         == 0b1111
        ~0u >> 1    == 0b0111
        ~(~0u >> 1) == 0b1000
    

This is the expression that appears in the for loop initializing the mask
value, and it works for any word size.

The original article's code was definitely not idiomatic, efficient, or safe
(the memory allocation for an array of characters could fail and segfault).
The book _Hacker's Delight_ is a great reference for those wanting to
understand how to do low-level coding, a requirement for close-to-the-hardware
work like writing device drivers.

------
gumby
> I was not curious to ask what is the need for one's complement or two's
> complement.

A problem with school, not the student (a common problem IMHO).

Good write up!

------
user51442
The Universe is two's complement. Item 154 of HAKMEM:
[http://catb.org/jargon/html/H/HAKMEM.html](http://catb.org/jargon/html/H/HAKMEM.html)

------
signa11
You could as well provide an explicit mask that tells Python how far the sign
extends, e.g.:

    
    
             bin(-5 & 0b1111)  gives  '0b1011'
    

which is what you want

------
gravypod

        int bitlen = sizeof(i) * 8;
    

I wish life was this simple.

------
BuuQu9hu
This is halfway to p-adic numbers and quote notation.

