
Mathematicians Discover the Perfect Way to Multiply
https://www.wired.com/story/mathematicians-discover-the-perfect-way-to-multiply/
======
makapuf
Preceding discussion on Hacker News:
[https://news.ycombinator.com/item?id=19474280](https://news.ycombinator.com/item?id=19474280)

------
svat
> _in 1960, the 23-year-old Russian mathematician Anatoly Karatsuba took a
> seminar led by Andrey Kolmogorov [....] Kolmogorov asserted that there was
> no general procedure for doing multiplication that required fewer than n^2
> steps. Karatsuba thought there was—and after a week of searching, he found
> it._

Karatsuba's method is beautiful and actually quite simple to explain.

Suppose you want to multiply two 200-digit numbers. Write them as aX+b and
cX+d, where X=10^100. We want to compute the product (aX+b)(cX+d), which is
(ac)X^2 + (ad+bc)X + (bd).

By the naive method, we'd end up effectively performing all four
multiplications ac, ad, bc, bd. Karatsuba observed that we can do it with just
three multiplications: ac, bd, and (a+b)(c+d) — and get our (ad+bc) as
(a+b)(c+d)-ac-bd.

Doing this recursively (so that T(n) = 3T(n/2) + O(n)) gives Karatsuba's
method, cutting down the n^2 time of the naive method to about n^1.58 time.
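The three-multiplication trick above can be sketched in a few lines. A minimal Python version (my own illustration, assuming non-negative integers; a real implementation would split on bits rather than decimal digits):

```python
def karatsuba(x: int, y: int) -> int:
    if x < 10 or y < 10:                 # base case: single-digit factor
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    X = 10 ** m                          # split point, as in aX+b and cX+d
    a, b = divmod(x, X)
    c, d = divmod(y, X)
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    ad_bc = karatsuba(a + b, c + d) - ac - bd   # (ad+bc) from one multiply
    return ac * X * X + ad_bc * X + bd
```

Three recursive calls on half-size inputs give the T(n) = 3T(n/2) + O(n) recurrence.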

[1]:
[https://en.wikipedia.org/wiki/Karatsuba_algorithm](https://en.wikipedia.org/wiki/Karatsuba_algorithm)

~~~
nopinsight
Karatsuba’s algorithm is a shining example of creative applications of simple
concepts.

I used to teach math to gifted elementary school students and used some
problems from Japanese exams for the talented. I think the creativity and
complexity of thought required to come up with this algorithm is about the
same as some solutions to a harder problem in those exams. (The creativity
required to solve them is significantly more than for American Math
Competition, which typically involves difficult but quite conventional
thinking.) [1]

More countries should adopt these Japanese-style problems in their gifted
education as well as simpler versions for normal students. They are a great
training ground for creative problem solving, which has become increasingly
essential for successful careers in most fields.

PS. Hong Kong holds an annual creative math problem-solving competition for
primary and secondary school students:

[https://www.edb.gov.hk/en/curriculum-development/major-level-of-edu/gifted/resources_and_support/competitions/local/cps.html](https://www.edb.gov.hk/en/curriculum-development/major-level-of-edu/gifted/resources_and_support/competitions/local/cps.html)

There was also at least one international math competition for primary
students in Asia-Pacific which focuses on creative math problem solving.

[1] AMC problems and solutions
[https://artofproblemsolving.com/wiki/index.php/AMC_Problems_...](https://artofproblemsolving.com/wiki/index.php/AMC_Problems_and_Solutions)

~~~
brianpgordon
Interesting; I thought that a common criticism of Japanese-style education was
that it's based to a large extent on rote memorization, whereas the West tends
to focus more on concepts and creative thinking. Can you speak to that? Is the
conventional wisdom wrong, or am I just wrong about what the conventional
wisdom is?

~~~
jbay808
One anecdote: My North American math education seemed to focus on rote
memorization, with little creative thinking at all. Did you have a different
experience? Not sure how they teach math in Japan though.

~~~
52-6F-62
Not necessarily related (and avoiding the attached political junk)—

here in Ontario (Canada), the current government has ... strongly advocated...
for going back to the "good old days" of rote memorization for the math
curriculum, versus the modern discovery-math methodology, i.e. problem solving
and creative thinking.

There tend to be pillars of thought in NA.

~~~
chmln
Don't see what the current ON government has to do with the abhorrent state of
math education here. It was just as bad under the previous governments

~~~
52-6F-62
I was not getting into the political junk, as I said. Only pointing out that
there are pillars of thought in NA who think that rote memorization as the
sole method for learning mathematics was “the good old days” and is the only
worthy method (with a practical example).

------
xiphias2
"Hardware changes with the times, but best-in-class algorithms are eternal"

Nowadays, for these kinds of problems the really interesting solution is the
best parallel one. It may be elegant to still prove things for Turing
machines, but a GPU has 2000 running threads (and if the tensor cores can be
used, even more instructions in flight), so an algorithm implementation that
doesn't use them is about 100x slower in practice.

~~~
bigred100
I saw Jack Dongarra say that plain parallelism is now old and the thing is
“communication avoidance”. Also made the point (relevant to your quotation)
that changing hardware means redoing the algorithms every ten years to exploit
it.

~~~
xiphias2
That's a great model that even helps running algorithms on FPGAs.

Are there researchers working on finding the theoretical communication
limits of classical numerical problems?

------
davesque
Is "Perfect" really an appropriate qualifier here? Did the source article
present a proof that O(n log n) is the lowest growth order for this kind of
task? Or am I just out of the loop and it's already known that this is the
case?

 _Update:_ Apparently, the article actually says that the approach is _not_
perfect.

~~~
smadge
From the article: “Harvey and van der Hoeven’s algorithm proves that
multiplication can be done in n × log n steps. However, it doesn’t prove that
there’s no faster way to do it. Establishing that this is the best possible
approach is much more difficult.”

~~~
davesque
Thanks. I skimmed it but didn't read thoroughly.

------
frosted-flakes
This article was originally published in Quanta Magazine:
[https://www.quantamagazine.org/mathematicians-discover-the-perfect-way-to-multiply-20190411/](https://www.quantamagazine.org/mathematicians-discover-the-perfect-way-to-multiply-20190411/)

~~~
pradn
Ah! I was wondering why the Wired graphic is similar to how Quanta does it.

------
ecesena
Paper: [https://hal.archives-ouvertes.fr/hal-02070778/document](https://hal.archives-ouvertes.fr/hal-02070778/document)

------
kristianp
"The speed gap between multiplication and addition has narrowed considerably
over the past 20 years to the point where multiplication can be even faster
than addition in some chip architectures."

Does anyone have an example of hardware where multiplication is actually
faster than addition? That's not an intuitive result unless they're talking
about floating point numbers, and even then...

~~~
a1369209993
I don't know of an architecture that actually _does_ this, but I suspect
they're comparing an N->N->N truncating multiply with an N->N->N+1 addition
that sets the carry flag. In addition to forcing the compiler to reorder test-
fiddle-branch sequences in a potentially suboptimal way, you'd also introduce
speculation hazards for any instructions that depend on the flags.

I'd be interested to see any hardware where that's enough to make addition
_slower_, though, since usually you can make up for it by dumping an extra
(very cheap) adder into the execution system.

------
asimpletune
I remember implementing strassen, only to be surprised that the naive
implementation was actually faster, because it better exploited the machine
architecture... that is until I tried using absolutely huge numbers.

This recent innovation is very exciting.

~~~
thomasahle
Strassen is for multiplying matrices. This is integers.

~~~
ahelwer
They probably meant Schönhage–Strassen, which is for integers.

------
_bxg1
"It was kind of a general consensus that multiplication is such an important
basic operation that, just from an aesthetic point of view, such an important
operation requires a nice complexity bound"

It is just nuts that this kind of reasoning works out reliably.

~~~
wilg
That is amazing. Do we understand why aesthetics and intuition are so
effective here? Or actually if that's even trueish in the general case?

~~~
umanwizard
The human brain is an incredibly powerful unconscious pattern recognition
device. The conscious mind is unaware of most of this processing.

For chess positions where there is no tactic that gives a clear advantage, the
best move is determined by subtle positional/strategic considerations. It's
interesting that often, top players will all agree what the best move is, but
be incapable of explaining why except in very vague terms. It just "feels
right".

I haven't read about computer chess in a while so I'm not sure if this is
still the case, but at least for several years after it became impossible for
a human to beat a computer, a human+computer team was still much stronger than
a computer alone, because of this intuitive strategic/positional ability that
computers lack.

~~~
_bxg1
The marvel isn't that we're good at picking up on patterns, it's that nature
itself is so patterned to begin with.

~~~
umanwizard
Both are incredible!

I guess to explain my comment a bit further: I conjecture (as a pure layman)
that a similar process is going on when a chess grandmaster says "I played Nf7
because it looked natural" as when a mathematician says "I expect this
statement to be true for aesthetic/intuitive/moral reasons".

I think in this case the mathematician's unconscious mind has correctly
identified some pattern that his or her conscious mind is unable to access.

Just like I can't really explain why I know my dad's face is his, and not that
of some other man of similar race, age, and build.

------
yahrly
Ok I'll be that guy - where is the reference implementation, and at what n can
I expect an improvement over existing libgmp multiplication on a Core i7?

~~~
thomasahle
It only really kicks in over the previous best at about n>2^2^4000 iirc.

This is a theoretical paper, but it is possible that some of the new Fourier
transform ideas can be used by libgmp down the line.

~~~
Someone
Not that it matters, but the constant used in the paper is 4096, not 4000.
This algorithm is proven to be O(n log n) only for numbers with at least
2^4096 ≈ 10^1233 bits.

For comparison, we estimate there are 10^80 or so atoms in the known universe,
so we would need quite a bit of progress in physics to even store one such
number; this clearly is theoretical computer science.

On the plus side, they didn’t set out to minimize that constant, and think it
can easily [1] be decreased significantly.

[1] ‘easily’ in the mathematical sense of “we think everybody who understands
this paper and is willing to do potentially a lot of hard work will be able to
pull it off”

~~~
klyrs
> This algorithm is proven to be O(n log n) only for numbers with at least
> 2^4096...

No worries, all the rest runs in O(1).

------
Lowkeyloki
I was going to make a joke that they could have found out years ago by talking
to biologists, but then it turned out that this article is really fascinating!

------
lordnacho
Strassen, perhaps unsurprisingly, also has something to say about matrix
multiplication:

[https://en.wikipedia.org/wiki/Strassen_algorithm](https://en.wikipedia.org/wiki/Strassen_algorithm)

"Strassen's algorithm works for any ring, such as plus/multiply..."

------
lamename
Could someone explain how one can use the FFT in an algorithm for
multiplication?

I guess I would understand if I understood what FFT actually does 'under the
hood', but I assumed fft itself required multiplication.

~~~
adrianratnapala
Discrete fourier transforms have always been a way of doing fast _polynomial_
multiplication -- which is also called convolution. That is we can do

    
    
         (a0 + a1^z + a2*z^2...) * (b0 + b1^z + b2*z^2...)
    

as

    
    
        fa  = DFT([a0, a1, a2, ...])   # O(N log N)
        fb  = DFT([b0, b1, b2, ...])   # O(N log N)
        fab = element-by-element multiply of fa and fb   # O(N)
        return InverseDFT(fab)   # O(N log N)
    

Now this is well established if we are literally doing a convolution of two
sequences of complex numbers. But this is quite different from integer
multiplication because:

1. Polynomial multiplication doesn't have carries.

2. It does inexact arithmetic on complex numbers, while we want exact results
on integer digits.

I am guessing (2) is solved by getting modular arithmetic to do the same
algebraic gymnastics that make the FFT work.

Also I didn't think any of this was new -- I vaguely remember reading about
fast convolutions using discrete algebras in old textbooks from the '70s.
Maybe the hard thing was to do it in a way that leaves enough information left
over to do the carries.
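A hedged end-to-end sketch in Python/NumPy of the scheme above (my own illustration, not the paper's algorithm): treat the decimal digits as polynomial coefficients, convolve via FFT, then handle point (1) by propagating carries. The floating-point FFT is inexact, so rounding to the nearest integer is only safe for moderately sized inputs, which is exactly the concern in point (2).

```python
import numpy as np

def fft_multiply(x: int, y: int) -> int:
    """Multiply non-negative integers via FFT-based digit convolution."""
    a = [int(d) for d in str(x)][::-1]   # least-significant digit first
    b = [int(d) for d in str(y)][::-1]
    n = len(a) + len(b)                  # room for all product coefficients
    fa = np.fft.rfft(a, n)
    fb = np.fft.rfft(b, n)
    # pointwise product in the frequency domain == convolution of the digits
    coeffs = np.rint(np.fft.irfft(fa * fb, n)).astype(int)
    carry, digits = 0, []
    for c in coeffs:                     # now do the carries (point 1)
        carry, d = divmod(int(c) + carry, 10)
        digits.append(d)
    while carry:
        carry, d = divmod(carry, 10)
        digits.append(d)
    return int("".join(map(str, digits[::-1])))
```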

------
gigatexal
Might be fastest, but it's not intuitive for my way of thinking. Perhaps
because I am so entrenched in the n^2 method.

~~~
dreamcompiler
Multiplication using FFTs was one of the most mind-blowing things I
encountered in my professional career. It involved realizing that third-grade
multiplication is in fact polynomial convolution, and in the frequency domain
that just becomes coefficient-wise multiplication.
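A tiny worked version of that observation (my own illustration, using NumPy): the digit-by-digit partial products of third-grade multiplication are exactly a discrete convolution, and carrying converts the result back into decimal digits.

```python
import numpy as np

# Digits of 12 and 34, least-significant first
a = [2, 1]   # 12
b = [4, 3]   # 34
conv = np.convolve(a, b)   # [8, 10, 3]: the grade-school partial-product columns
carry, digits = 0, []
for c in conv:             # carrying turns the columns into decimal digits
    carry, d = divmod(int(c) + carry, 10)
    digits.append(d)
if carry:
    digits.append(carry)
print(digits[::-1])        # [4, 0, 8], the digits of 12 * 34 = 408
```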

The Karatsuba method is almost as cool and more practical for smallish
numbers.

~~~
lordnacho
Can you explain the connection with FFT? Or provide some useful links? Sounds
really interesting.

~~~
tprice7
I gave a quasi-explanation in response to lamename's comment. If you want some
search queries, learn what convolution is, then observe the connection between
discrete convolution and polynomial multiplication, then learn about the
convolution theorem.

~~~
lordnacho
This bringing back control theory for me. Thanks.

------
NelsonMinar
This is a lovely, readable summary of a relatively abstract mathematical
result.

------
wppick
Are exponentiation operations much faster than multiplication since an
exponentiation b^n is just n additions? Given that is true, I wonder if there
is a fast approximation method of solving a multiplication by solving an
exponentiation that would have a close result?

~~~
TheRealPomax
Except it's not, because exponentiation is not something reserved for
integers. Just because you can write out b^4 as b * b * b * b doesn't mean
that "that must be the rule". How would that rule even begin to explain how to
calculate b^2/3 or b^π?

It's a cute little rule to use in middle school, when you get your very first
glimpse at exponentiation, but it's also incredibly wrong if you try to
actually use it with the rational, real, or complex numbers =)

([https://www.reddit.com/r/3Blue1Brown/comments/a6mqsf/how_is_...](https://www.reddit.com/r/3Blue1Brown/comments/a6mqsf/how_is_exponentiation_different_from_repeated/)
has some more information in case you want to know what _would_ be a way to
understand exponentiation for all numbers, not just integers)
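For what it's worth, the standard extension of exponentiation beyond integer exponents goes through exp and log rather than repeated multiplication; a minimal sketch (my own, assuming a positive base):

```python
import math

def power(b: float, x: float) -> float:
    # b**x for any real exponent x, defined as e^(x ln b); requires b > 0
    return math.exp(x * math.log(b))
```

For integer exponents this agrees with repeated multiplication (power(2, 10) is 1024 up to floating-point error), but it also gives meaning to power(2, 2/3) or power(2, math.pi).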

