
Project Euler 001 the Hard Way - bdg
https://statagroup.com/articles/explore
======
FabHK
Two quick remarks:

> Looking at the code it’s quite obvious 3 and 5 are replaceable with any set
> of other numbers.

What's not obvious to me (I'm not a mathematician) is that both solutions
answer the question correctly when the set of numbers is not pairwise coprime.
For example, "Find the sum of all the multiples of 3 or 6 below 1000" is
clearly just the same as "Find the sum of all the multiples of 3 below 1000",
while the inclusion-exclusion algorithm will probably deliver a different
answer. So, the set will need some pre-processing to remove common factors,
which will then require adjusting the answers.

Next, for the problem "Find the sum of all the multiples of <M numbers> below
<N>", the naive algorithm seems to have a runtime of about O(NM), while the
"sophisticated" algorithm seems to have a runtime of at least O(2^M) - so, as
we increase M (the size of the test set), the "naive" algorithm will soon be
faster, or not?

~~~
Someone
_What's not obvious to me (I'm not a mathematician) is that both solutions
answer the question correctly when the set of numbers is not pairwise coprime.
For example, "Find the sum of all the multiples of 3 or 6 below 1000" is
clearly just the same as "Find the sum of all the multiples of 3 below 1000",
while the inclusion-exclusion algorithm will probably deliver a different
answer._

You’re right. The solution for this is to replace “product of n and m” by
“least common multiple of n and m”. So, you would get (using 4 and 6 as an
example that’s slightly better than 3 and 6):

    
    
        Number of multiples of either 4 or 6
        = Number of multiples of 4
        + Number of multiples of 6
        - Number of multiples of lcm(4,6) = 12
    
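A minimal JavaScript sketch of that identity, using the lcm instead of the product (the helper names `countMultiples` and `countEither` are mine, not from the article):

```javascript
// Greatest common divisor / least common multiple helpers.
const gcd = (a, b) => (b === 0 ? a : gcd(b, a % b));
const lcm = (a, b) => (a / gcd(a, b)) * b;

// How many positive multiples of k lie strictly below n.
const countMultiples = (k, n) => Math.floor((n - 1) / k);

// Inclusion-exclusion for two divisors, subtracting multiples of the
// lcm (not the product) so non-coprime pairs are counted correctly.
const countEither = (a, b, n) =>
  countMultiples(a, n) + countMultiples(b, n) - countMultiples(lcm(a, b), n);
```

For 4 and 6 below 1000 this gives 249 + 166 - 83 = 332.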

and yes, if M gets large enough, the “naive” algorithm can get faster. It
will help if you bail out once you have found _a_ divisor, and, if your
numbers are ‘large enough’ (1), to do division testing in some specific order
(2).

(1) what is ‘large’ will be system dependent. In general, once you need
bigints, but if your CPU doesn’t have a division instruction, it can come
earlier.

(2) in general, smallest to largest, but if one of your test divisors is a
power of 2, move those up front. Also, if you’re forced to use bigints,
divisors of 2^(register size) ± 1 and their factors may be easier (just as
testing divisibility by 9 or 11, or 9’s factor 3, is easier for integers
written in decimal).
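A sketch of that naive loop with the early bail-out, assuming plain JavaScript numbers (the smallest-first sort is just the heuristic suggested above; `sumMultiples` is an illustrative name):

```javascript
// Sum every i below limit that is divisible by at least one divisor,
// bailing out of the inner loop at the first hit.
function sumMultiples(divisors, limit) {
  const sorted = [...divisors].sort((a, b) => a - b); // try small divisors first
  let total = 0;
  for (let i = 1; i < limit; i++) {
    for (const d of sorted) {
      if (i % d === 0) {
        total += i;
        break; // one matching divisor is enough
      }
    }
  }
  return total;
}
```

The bail-out also makes double-counting impossible, so non-coprime sets like {3, 6} need no special handling here.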

~~~
FabHK
> The solution for this is to replace “product of n and m” bij “least common
> multiple of n and m”

Nice, and it works for my example, too:

    
    
        Number of multiples of either 3 or 6
        = Number of multiples of 3
        + Number of multiples of 6
        - Number of multiples of lcm(3,6) = 6
        = Number of multiples of 3
    

Cool. How to extend to more than 2 numbers?

~~~
jarekkruk
You could do it like in the article: for every nonempty element of the
powerset of those numbers (i.e. the combinations of each length), calculate
its lcm, and add or subtract the count of multiples of that lcm according to
the length of the element.

So for A, B, C you have (where M(x) is the number of multiples of x): M(A) +
M(B) + M(C) - M(lcm(B,C)) - M(lcm(A,C)) - M(lcm(A,B)) + M(lcm(A,B,C))
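A sketch of that powerset walk in JavaScript, encoding each nonempty subset as a bitmask (the names are illustrative; M(x) from the comment becomes `Math.floor((n - 1) / l)`):

```javascript
const gcd = (a, b) => (b === 0 ? a : gcd(b, a % b));
const lcm = (a, b) => (a / gcd(a, b)) * b;

// Count numbers below n divisible by at least one of the divisors.
// Each bitmask picks a nonempty subset; odd-sized subsets are added,
// even-sized subsets subtracted, each contributing M(lcm(subset)).
function countAny(divisors, n) {
  let total = 0;
  for (let mask = 1; mask < 1 << divisors.length; mask++) {
    let l = 1;
    let size = 0;
    for (let i = 0; i < divisors.length; i++) {
      if (mask & (1 << i)) {
        l = lcm(l, divisors[i]);
        size++;
      }
    }
    total += (size % 2 === 1 ? 1 : -1) * Math.floor((n - 1) / l);
  }
  return total;
}
```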

------
gordaco
Isn't this like... really, really basic? I would have never considered
iterating over all the numbers, not even when I first entered Project Euler
back in 2011; inclusion-exclusion always was the obvious way. I'm probably
biased since I'm a big Project Euler fan and I probably have a much more math-
oriented way of thinking than the average software developer, so take my
opinion with a grain of salt.

Most serious Project Euler problems require finding ways to reduce the problem
complexity into something manageable. For example, problem 268 is a more
complex and challenging version of problem 1, which can't be solved by brute
force in a reasonable amount of time [0]. Also, whenever you solve a problem,
don't forget to check the problem thread, accessible only to solvers, for many
mathematical insights (for example, here's a hint: buried somewhere in problem
10's thread, in what I believe to be the highest rated comment found in any
problem thread all over Project Euler, you can find a very useful function
that can be used to solve a few, much later, problems).

Also, just yesterday Project Euler released the 700th problem (which is an
easy one if you know basic modular arithmetic) [1].

[0]
[https://projecteuler.net/problem=268](https://projecteuler.net/problem=268)

[1]
[https://projecteuler.net/problem=700](https://projecteuler.net/problem=700)

~~~
fctorial
From the problem 700:

    
    
        Consider the sequence 1504170715041707n mod 4503599627370517
    

Could you explain what 'n mod' means? Google doesn't seem to know what that
is.

~~~
philiplu
The problem is a bit underspecified, perhaps. 'n' is a positive integer, so 1,
2, 3, .... So the sequence is found by multiplying 1504170715041707 by 1, 2,
3, ... sequentially, then finding the remainder when you divide each of those
products by 4503599627370517.
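A quick sketch of that in JavaScript, using BigInt because the products overflow double-precision integers (the name `term` is mine, not from the problem):

```javascript
// Problem 700's sequence: 1504170715041707 * n mod 4503599627370517,
// for n = 1, 2, 3, ... BigInt keeps the products exact.
const A = 1504170715041707n;
const M = 4503599627370517n;
const term = (n) => (A * BigInt(n)) % M;
```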

------
Ragib_Zaman
>...where I discover the hidden complexity of a simple programming problem.

I thought it was very ironic that soon after that sentence, the author claims
the arithmeticSum method is O(1) when it is actually O(log(n) log(log(n))).

Many people seem to assume that multiplication is a constant time operation.
There is actually immense "hidden complexity" in doing multiplication of
arbitrarily large integers efficiently. David Harvey and Joris van der Hoeven
proved last year that multiplication of two n-bit integers can be done in
O(n log n). It is still an open conjecture that this is the best possible.

~~~
dahart
You’re right, and it’s a good point, but I think you’re too hard on the author
and other people.

> the author claims the arithmeticSum method is O(1)

He really claimed that Gauss’ formula is O(1), where it’s reasonable to assume
multiplication without a specific implementation is constant. After that he
gave an implementation in JavaScript that is O(1). The complexity is only
larger if you use numeric methods on computers that support arbitrarily large
numbers.

> Many people seem to assume that multiplication is a constant time operation.

Multiplication is constant time for the built-in data types, for any 64-bit
ints or doubles. It’s okay to call it constant until you use huge number
methods.
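For concreteness, here is a sketch of the kind of constant-time closed form being discussed; this is my reconstruction, not necessarily the article's exact `arithmeticSum`:

```javascript
// Gauss: 1 + 2 + ... + m = m * (m + 1) / 2.
const triangular = (m) => (m * (m + 1)) / 2;

// Sum of all multiples of k up to and including limit:
// k + 2k + ... = k * (1 + 2 + ... + floor(limit / k)).
const sumMultiplesUpTo = (k, limit) => k * triangular(Math.floor(limit / k));
```

Both run in constant time on built-in doubles, which is the sense in which the O(1) claim is fair; with arbitrary-precision integers the multiplication cost reappears.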

~~~
rwbhn
But the author follows that with "It works the same way for 100 as it does
10e100."

~~~
dahart
True! That statement does follow the math formula, and by itself the
statement is true regardless of complexity. It will work the same way for
small numbers as for large ones.

The 10e100 comment does precede the implementation, and this particular O(1)
implementation of course might not support inputs in the range of 10e100
exactly.

------
appleiigs
If I was shown the fizz buzz question in a job interview, and I answered using
the “hard way”, how would they respond?

Would they be impressed by the math? Or would they complain about the number
of lines of code, readability for code review, etc?

~~~
ubercow13
What's the 'hard way'?

~~~
appleiigs
The title of the post is “Euler 001 the Hard Way”. The way I read it, the
hard way would be “Solving all combinations with performance”.

------
foxes
The inclusion/exclusion idea can actually be useful for harder Project Euler
questions (deriving some sort of closed-form sum to compute). Worthwhile to
think about.

I'm also reminded of the "semigroup resonance" way to solve FizzBuzz posted
recently on HN [0]. Seems like another interesting method.

[0] [https://blog.ploeh.dk/2019/12/30/semigroup-resonance-
fizzbuz...](https://blog.ploeh.dk/2019/12/30/semigroup-resonance-fizzbuzz/)

------
Grue3
It's not the hard way, it's the way you're supposed to "solve" a PE problem.
Almost all of them are trivially solvable by brute force and a sufficiently
powerful computer. Instead you're supposed to analyze the problem and find a
mathematical "trick" that makes it solvable even with pen and paper.

------
pizzaknife
Very cool write-up. To me, what it really illustrates is that dealing with
arguably simple data and requirements has different tiers of how to deal with
the scaling quantity. At some point, a different language (math) becomes the
required dialect, as programming language expressions become outmoded. Neat

~~~
bdg
Thank you. My original version 8 years ago included a short tour into "The
Inclusion-exclusion principle" which is how I eventually derived the pattern
to use even/odd sums of bits in the ABC=111 encoding. I cut it out because I
felt it was distracting for most of my readers.

I eventually summarized it this way after a few wordy paragraphs that
explained the way the equation worked:

> The sum of all the intersections would mean (if we had 4 items, A, B, C, D)
> adding together A, B, C, and D. The second line would mean we subtract the
> sum of AB, AC, AD, BC, BD, and CD. The third line means we add the sum of
> ABC, ABD, ACD, and BCD; the fourth line means we subtract the value of ABCD.
> The pattern here is we alternate the operation (adding or subtracting a sum)
> based on the cardinality (how many items are in a set) of something. Even
> cardinalities are subtracted, odd cardinalities are added. We sum together
> all combinations of that cardinality.
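That parity rule can be sketched directly with the bit encoding mentioned above (a mask like ABC=111 selects a subset); this is my illustration of the pattern, not the article's exact code:

```javascript
const gcd = (a, b) => (b === 0 ? a : gcd(b, a % b));
const lcm = (a, b) => (a / gcd(a, b)) * b;

// Sum of multiples of k strictly below n, via the arithmetic-series formula.
const sumOfMultiples = (k, n) => {
  const m = Math.floor((n - 1) / k);
  return (k * m * (m + 1)) / 2;
};

// Inclusion-exclusion over every nonempty subset of the divisors:
// odd cardinalities are added, even cardinalities are subtracted.
function sumAny(divisors, n) {
  let total = 0;
  for (let mask = 1; mask < 1 << divisors.length; mask++) {
    let l = 1;
    let size = 0;
    for (let i = 0; i < divisors.length; i++) {
      if (mask & (1 << i)) {
        l = lcm(l, divisors[i]);
        size++;
      }
    }
    total += (size % 2 === 1 ? 1 : -1) * sumOfMultiples(l, n);
  }
  return total;
}
```

Because lcm (not the product) is used per subset, non-coprime sets like {3, 6} fall out correctly as well.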

------
JakeStone
Something similar from 4+ years ago, but in C#

[https://gist.github.com/RichardVasquez/6780214](https://gist.github.com/RichardVasquez/6780214)

------
master_yoda_1
To the author: good for your weekend musing. But please don't torture
interviewers with it (with a time limit of 45 minutes).

------
jpxw
`sumMultiplesOfTwoAndThree` should probably be `sumMultiplesOfThreeAndFive`
right?

