Fast constant-time GCD algorithm and modular inversion (cr.yp.to)
124 points by MrXOR 14 days ago | 81 comments



It seems "constant-time" isn't used to mean "the time taken is O(1) regardless of the size of the input n", but rather, "for a given input size, the algorithm is carefully written to do the same amount of work no matter what the specific bits of the input are, to defeat timing attacks". This took me a bit of time to figure out.

To illustrate the kind of thing it's talking about, consider the naive algorithm for computing a^n via exponentiation by squaring:

  total = 1
  while n > 0:
    if n % 2 == 1:  # n is odd
      total = total * a
      n = n - 1
    a = a * a
    n = n // 2      # integer halving
  return total
If n has k bits, and j of them are 1s, then there will be k squarings of a (the last of which is wasted), and j multiplications of total by a. An attacker who can measure the total time may be able to at least figure out the number of 1 bits in n. If they can get fine-grained observations of power draw or something, then they might even be able to tell which bits are 1.

Consider this alternative:

  total = 1
  while n > 0:
    maybe_total = total * a
    if n % 2 == 1:  # n is odd
      total = maybe_total
    a = a * a
    n = n >> 1
  return total
This will do the same number of multiplications for any n of a given bit length, if you can convince the compiler not to optimize the unused multiplication away. Note that it still has a branch, though, which might conceivably be detectable. To plug that hole, something like this might work:

  # the branch
  #   if n % 2 == 1:    # n is odd
  #     total = maybe_total
  # becomes this:
    low_bit = n & 1     # i.e. 1 or 0 for n odd or even
    mask = low_bit - 1  # i.e. 0 or "all 1s" respectively
    total = (total & mask) | (maybe_total & ~mask)
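
Putting those pieces together, a minimal Python sketch of the whole loop might look like the following. It is illustrative only: ct_pow and nbits are made-up names, nbits is an assumed public bound on the exponent's bit length (so even the loop count leaks nothing beyond the size), and real constant-time code would use fixed-width words, since Python's bignum arithmetic is itself variable-time.

  # Branch-free square-and-multiply over a fixed number of exponent bits.
  def ct_pow(a, n, nbits):
    total = 1
    for i in range(nbits):       # fixed iteration count, independent of n's value
      bit = (n >> i) & 1         # 1 if this exponent bit is set, else 0
      mask = bit - 1             # 0 if the bit is set, "all 1s" (-1) if not
      maybe_total = total * a    # always computed
      total = (total & mask) | (maybe_total & ~mask)
      a = a * a                  # always squared
    return total

  ct_pow(3, 10, 8) == 3 ** 10    # True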


This is standard terminology within programming and cryptography. This is the only practical definition to use, as cryptography commonly deals with arbitrarily sized inputs, which can of course not all run in O(1).

A simple example of constant-time within this definition, as well as why constant-time does not mean O(1) in this context, is that of a comparator:

The usual, and fastest, approach is to go through the data in the largest chunks the CPU can handle, returning 1 as soon as a mismatch is found, and 0 at the end on the assumption that reaching the end means no mismatch. However, this reveals information about the values compared, as the time the comparison takes reveals where the mismatch is located.

For constant time, your goal is to ensure that all data is processed equally regardless of outcome. You do this by effectively making the comparison a-b=c, or a^b=c, where a zero result means equality. In practice, this would be implemented by XOR'ing corresponding chunks of the two inputs together (again, in the largest chunks the CPU can handle) and OR'ing the result into an accumulator that starts at 0. This continues, with no early exit, to the very end, where you simply return the accumulator, which is 0 for equality and some nonzero value otherwise.

Both approaches of course leak the length of the values compared, but in these cases the lengths are commonly known to the adversary, so protecting them matters less.
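
A minimal Python sketch of that idea, byte-at-a-time rather than word-at-a-time for clarity, and assuming the two inputs have equal, publicly known length (ct_equal is just an illustrative name):

  # Constant-time equality: process every byte, no early exit.
  def ct_equal(a, b):
    assert len(a) == len(b)  # lengths are assumed public
    acc = 0
    for x, y in zip(a, b):
      acc |= x ^ y           # becomes nonzero on any mismatch, but keep going
    return acc == 0

In practice you would reach for a vetted primitive such as Python's hmac.compare_digest rather than rolling your own.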


This is the only practical definition to use, as cryptography commonly deals with arbitrarily sized inputs, which can of course not all run in O(1).

It doesn't necessarily follow that any algorithm on an arbitrarily-sized input can't run in O(1) - for a trivial example there is an algorithm for "determine if the input number is odd" which is O(1).


"not all", not "none of".

The "not all" is referring to the (executions of the algorithm under) potential inputs.

> for a trivial example there is an algorithm for "determine if the input number is odd" which is O(1).

I would argue that this particular example cannot be considered a case of arbitrarily-sized input: There is only a single meaningful bit of information, which is the single bit accessed.

However, if you must, we can expand to clarify that the input is meant to include arbitrary meaningful bits of information that require processing and cannot simply be ignored.


That's not how this works. What you're doing here is defining away the scope of the problem to make O(1) complexity analysis redundant. A bound of O(1) conventionally means that we can exploit some structural component of the input and the given problem to find a solution independent of the input size. If you redefine input size to the more narrow sense of inputs which don't have some sort of structural feature like that, then yes of course nothing is constant time. But then you're just shifting the difficulty of the problem around without much of a gain.

I see how you may think that, but I would argue that I am not redefining anything, as things turn nonsensical without this assumption.

It is not an input if it does not affect your output (I am sure a better formal definition exists; I usually hear "meaningful", hence my use of that). Without this restriction, any input could be arbitrarily padded, inflating input sizes and making an algorithm's complexity appear lower.

I find the even/odd example to be an exceptionally clear case of an algorithm that takes a fixed 1-bit input: Its implementation on a little-endian machine does not know the bounds of its input, only the location of the least significant byte (due to byte-addressing).

Without a defined bound, such an implementation must be assumed to either take a single byte as input (here a byte read is an implementation detail, despite the algorithm only needing one bit), or the full system memory from the location specified as input regardless of contents (due to having used the word "arbitrary", the input is allowed to not fit in registers).

However, I admit that I am now entering deeper theoretical waters, with regard to definitional details, than I am normally comfortable with. That said, I do not find any other definition that lines up with practice.


> I see how you may think that, but I would argue that I am not redefining anything, as things turn nonsensical without this assumption.

The point I'm making is that this statement reduces to the claim that defining an algorithm's worst case time complexity as O(1) or constant time is nonsensical.

Do you disagree that adding two numbers is a constant time operation?


> Do you disagree that adding two numbers is a constant time operation?

It is constant time by the crypto-programming definition in both theory and practice, and O(1) only in theory (bigints are a pain).

I did not try to claim that O(1) is nonsensical. Rather, that certain O(1) variable-input algorithms are in fact simply fixed-input algorithms, because they do not actually consider their full input.

I also find these "nonsensical" O(1) algorithms to be outliers.


Do you disagree that adding two numbers is a constant time operation?

I do. A general algorithm for adding two numbers, at least one represented in the conventional way as a series of digits in some base, has a lower bound of Ω(log n).
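
For intuition, here is a sketch of schoolbook addition on little-endian digit lists (add_digits is just an illustrative name): carry propagation forces it to touch every digit, so the work grows with the number of digits, i.e. with log of the value, not O(1).

  # Add two numbers given as little-endian digit lists in some base.
  # Work is proportional to the number of digits, i.e. to log(value).
  def add_digits(a, b, base=10):
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
      s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
      result.append(s % base)
      carry = s // base
    if carry:
      result.append(carry)
    return result

  add_digits([9, 9], [1])  # [0, 0, 1], i.e. 99 + 1 = 100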


I think it would be tough to find a suitable definition of "meaningful". When searching for an element in an ordered list, how many items are meaningful? One if it's present, zero otherwise? Always log(n)? Always n?

If "meaningful" is a non-trivial property, as difficult to determine as an algorithmic lower bound, then it's not terribly useful.

And it also leaves us with no way to talk about non-constant time algorithms for determining whether a number is odd. Can we say one runs in exponential time when we only talk about "meaningful" bits of input?


> I think it would be tough to find a suitable definition of "meaningful".

I do not believe it is tough to find a suitable practical definition of meaningful: If a piece of information has no effect on the output of the algorithm, regardless of its value, then it is not meaningful.

Examples of meaningful values would be any value in a list being sorted. Non-meaningful values would be any bit other than the lowest bit when determining if a number is even or odd.

I find it to be an incredibly trivial definition when considering algorithms.

> If "meaningful" is a non-trivial property, as difficult to determine as an algorithmic lower bound, then it's not terribly useful.

I find the case where the full input is not meaningful to be an outlier that does not at all affect how we normally deal with algorithms and bounds.

However, when the two do not match, one can start discussing what your input actually is. I would, for example, argue that an even/odd number checker only ever takes a single bit as input. It may consume more due to architectural limitations around bit accesses on modern architectures, but that is not part of the core algorithm.


With this definition, different parts of the input might be "meaningful" for two algorithms that solve the same problem. This makes the size of the "meaningful" part of the input ill suited for comparing efficiency.

(It also makes the statement "no program can run faster than linear time in the size of the meaningful input" true directly by definition, which makes it a pretty boring statement...)


Ok, let's take as an example a string equality comparison algorithm (strcmp() == 0). What would be your definition of a meaningful input?

For an equality operation, all bits are meaningful, as there are input configurations in which any bit can lead to a change in output.

However, I do see your point in that whether a bit flip in input leads to a change in output depends on the particular input. My counter to that is that it is not the specific input, but the potential inputs that matter.


> This is standard terminology within programming and cryptography

Within cryptography, yes, but in programming in general, absolutely not.

The cryptographic definition of constant-time postdates the algorithmic one by decades, I think. It's a shame the cryptographers didn't pick a different term.


> The cryptographic definition of constant-time postdates the algorithmic one by decades, I think.

The normal usage of the phrase "constant time" predates CS by centuries, and it is much closer in usage to the crypto version than the complexity version.

Crypto uses the term "constant time" to mean same time for any input, while the general CS community uses O(1) to mean bounded time and also often uses complexity to hide log N terms. It also allows variable time, as long as it's bounded (and that sometimes allows hiding terms like log N or log log N...).

A good example is that most algorithms treat addition as a constant-time operation, but, for example, addition of indices for lookups is not constant time when inputs are unbounded. So if you compute an index by addition and ignore the log N time to do the addition, then you fudged. Yet this is commonplace in complexity theory, which models many basic operations as constant time when they are not for unbounded inputs.

For example, a "constant time" hash table operation, which often computes arbitrarily large indices with addition, ignores the log N needed to do that addition.

Thus crypto uses the phrase in the more precise, centuries old usage.


> Within cryptography, yes, but in programming in general, absolutely not.

I cannot think of any algorithms with arbitrary-sized inputs that have truly constant execution time. The reason I do not find this possible with classical computers is that, no matter what you do, reading data from memory is an unavoidably variable-time process.

Do note that arbitrary-sized inputs here imply that the input is meaningful and must all be processed. I do not see algorithms that do not process their inputs as candidates here, such as the even/odd test mentioned elsewhere, which just reads a single bit. I consider an algorithm that only reads a fixed amount of information from its input to be a fixed-input algorithm, regardless of the "full" size of its input.

Of course, if you place an upper bound on the input, then you can pad in various ways, or possibly have algorithms whose execution time is inversely proportional to their input size, thus balancing out. However, placing an upper bound also means violating the "arbitrary" requirement entirely, thus disqualifying the algorithm.

Do note that I'd actually be quite interested if you have examples of such algorithms.


> I cannot think of any algorithms with arbitrary sized inputs that have truly constant execution time.

With respect, I think you may misunderstand the meaning of "constant time" in the sense of complexity theory, i.e. O(1). Accessing an element in an array of size n is a constant time operation. See: https://stackoverflow.com/questions/7297916/why-does-accessi...

EDIT: Corrected "search" to "access"


Wouldn’t searching for an element in an arbitrary array of size n happen in O(n) time, while only accessing it is considered O(1)? I could understand the search case for something like a hash table being O(1) but does that also apply to your standard array?

Ah, I misspoke. I meant "access", you're correct about searching. The original link I cited for explanation still applies.

So binary search is not log n time because it only reads log n values from the input? To know which parts are not read you basically have to run the algorithm. I find your definition unhelpful.

Depending on how expansive your idea of the model of ‘computation’ is, there is this old paper: http://zero.sci-hub.tw/1814/0efc01f09f718e409481a47ff90e98fc...

TLDR: using optics for sorting


> A simple example of constant-time within this definition, as well as why constant-time does not mean O(1) in this context, is that of a comparator:

This is incorrect. The paper explicitly uses "constant time" in the sense of O(1). You can see this listed throughout the paper, in various complexity analyses, beginning from Section 2.


> This is standard terminology within programming and cryptography. This is the only practical definition to use, as cryptography commonly deals with arbitrarily sized inputs, which can of course not all run in O(1).

How about 'consistent time' or 'uniform time' for non-O(1) uses?


It might be standard notation, but not only is O(1) misleading as it misrepresents the dependency on the problem size, it is also the wrong symbol as it hides the exact thing it's trying to convey.

This is incorrect. Refer to Section 2, "Organization of the paper":

> We start with the polynomial case. Section 3 defines division steps. Section 5, relying on theorems in Section 4, states our main algorithm to compute c coefficients of the nth iterate of divstep. This takes n(c + n) simple operations. We also explain how “jumps” reduce the cost for large n to (c + n)(log cn)^(2+o(1)) operations. All of these algorithms take constant time, i.e., time independent of the input coefficients for any particular (n, c).

What the authors are doing is (in the simplest sense) adding a worst case O(1) component to the GCD algorithm in the exponent. This is fundamentally a complexity theory paper, and Bernstein and Yang are using "constant time" in the complexity theoretic sense.

Moreover this is not about clever implementation; the algorithm they present will explicitly not take the same amount of time regardless of the input. In line with the presented complexity analysis throughout the paper, the worst case running time is asymptotically bounded independently of inputs n and c.


The authors are not using "constant time" in the complexity-theoretic sense. That would mean an algorithm whose running time doesn't depend on n,c.

The GCD algorithm here has the property that (I'm quoting section 1.4 here) "the number of operations is ... asymptotically n (log n)^(2+o(1))". That is not constant as n varies.

The nice properties they claim for their algorithm are:

1. The asymptotic running time is good. This is a complexity-theoretic claim. The asymptotic performance is of the same order as e.g. Schoenhage's earlier algorithm, so this isn't in itself any sort of breakthrough.

2. For fixed input size, the running time is constant. This is a cryptographic claim: it gives immunity to timing attacks. There have been earlier constant-time GCD and modular inverse algorithms, so again this on its own isn't any sort of breakthrough.

It isn't clear to me whether 1+2 is claimed to be new. I think it isn't: that is, there are other constant-time GCD algorithms with the same asymptotic growth of runtime.

3. The constant factors are good. This is a matter of the practicality of the algorithm. Here the authors are claiming to have done better than anyone before them.


> The authors are not using "constant time" in the complexity-theoretic sense.

Yes, they are. I've already made a top-level comment citing the paper's explicit definition from Section 2 and comparing it to canonical definitions from the usual literature of algorithm analysis.

> That would mean an algorithm whose running time doesn't depend on n,c.

It does not. Note the exponent, 2 + O(1). It is true that the execution time varies with the input size, but this does not preclude constant time asymptotics.

> The GCD algorithm here has the property that (I'm quoting section 1.4 here) "the number of operations is ... asymptotically n (log n)^(2+o(1))". That is not constant as n varies.

Yes it is. Constant time does not mean that execution time does not vary, in either complexity theory or cryptography.


Your "top-level comment citing the paper's explicit definition" is wrong. It says "The asymptotic runtime of the presented algorithm does not depend on the inputs, n and c", which is flatly untrue because (1) n and c are not the inputs to the algorithm and (2) the runtime does depend on n.

Having a bounded exponent is simply not the same thing as "constant time"; running time that increases unlimitedly with the size of the input precisely does preclude constant-time asymptotics.

In complexity theory, "constant time" means that the execution time is bounded as the size of the input increases. The execution time of this algorithm is not bounded as the size of the input increases.

In cryptography, I don't know for sure whether "constant time" always means the same thing, but in the context of algorithms immune to timing attacks (which is the relevant context here) it means that execution time doesn't depend on the details of the input.

In this case, the algorithm does not have complexity-theory "constant time" because the runtime grows roughly like n (log n)^2. It does have cryptography "constant time" because if you fix the size of the input, the runtime does not depend on the specific numbers you put in.

I'm sorry to be so harsh here, but you have made confident wrong statements and doubled down on them when challenged. Given that your profile says "My academic background is in complexity theory", this is pretty surprising, but there's really no question that what you're saying is wrong. I'm guessing that maybe you misread something and now don't want to lose face, but you need to look again: this algorithm is not a constant-time algorithm in the complexity-theoretic sense, and it is a constant-time algorithm in the sense that the execution time doesn't depend at all on the input if you fix its size.


I think there's a misunderstanding here. Can you please be specific about the operation you're saying is increasing commensurate with inputs n, c? The GCD algorithm is accomplished using O(1) polynomial multiplications, which is achieved because the coefficients for the polynomial multiplication are given by n, c.

EDIT: To make my position clear, what I am saying is this:

1. The presented algorithm will have variable computation time, but not variable asymptotic time,

2. The algorithm uses O(1) (i.e. constant time) polynomial multiplications in the worst case, and

3. The worst case bound does not change with n nor c, though they are used in the algorithm to calibrate the divsteps such that no more than O(1) operations are required.

I will concede it's possible I'm misunderstanding the paper itself, but I don't see that here.


The title of the paper is "Fast constant-time gcd computation and modular inversion". The algorithm presented in the paper does not compute GCDs or modular inverses in time O(1). The time taken for either of those tasks is O(n (log n)^(2+o(1))).

The algorithm presented in the paper also doesn't take a bounded number of polynomial multiplications, though I'm not sure why those should be the relevant thing to count. In section 5.4 they say they split a size-n problem into two problems whose size add up to n using O(1) polynomial multiplications of size about n, and that their algorithm takes them both to be of size ~ n/2. That means O(n) polynomial multiplications, but most of them are small so the total cost for given n is, as they say in section 5.5, Theta(log n) times the cost of polynomial multiplication provided a fast polynomial multiplication algorithm is used.

The worst-case bound, both for the number of polynomial multiplications and for the actual running time, does depend on n. (The former is of order n but, again, most of the polynomial multiplications are small; the latter is of order about n (log n)^2.)

(As for the cryptographic "constant time" notion, my understanding from the paper is that with a naive timing model -- e.g., assuming that all memory accesses take equal time -- their algorithm is exactly constant-time for fixed input size. They say in a footnote near the end of section 7: "There is considerable variation in the cycle counts, more than 3% between quartiles, presumably depending on the mapping from virtual addresses to physical addresses. There is no dependence on the secret input being inverted.")


I'd propose "uniform-time" to disambiguate? Or is there a better word? I feel like there must be...


I have the same feeling... https://en.wikipedia.org/wiki/Side-channel_attack#Countermea... mentions "isochronous" operations.


Oh interesting. I've heard isochronous in the context of real time communication but not here. Cool!


Or maybe "fixed-time". I don't like "isochronous" because it introduces an obtuse neologism.


> I don't like "isochronous" because it introduces an obtuse neologism.

Obtuse, perhaps, but neologism?

http://etymology.enacademic.com/20730/isochronous meaning "equal time" dating back to the early 1700s.


Case in point. I'm not an etymologist, nor should we expect people to have to research the words they use. If a word is that old and not in common parlance, it is a failure of a word and it really shouldn't be used.

> not in common parlance, it is a failure of a word

Harsh but not entirely unfair (it is, after all, how language evolves.) Although I'd counter that "isochron" isn't uncommon in travel-related situations (cf [1]) and "isochronous" is easily understandable by extension.

[1] e.g. in the UK parliament, #7927, "the resident population within the Woolwich 20-minute isochron area was about 60,000" https://publications.parliament.uk/pa/cm200607/cmselect/cmcr...


Yes, fixed-time! I like it.


Thanks for clarifying – really changes the connotation of the headline.


This is more an issue with our definition of 'constant time' than a problem with the description. Though I can't think of a better name to replace 'constant time'. Any proposals?


> It seems "constant-time" isn't used to mean "the time taken is O(1) regardless of the size of the input n", but rather, "for a given input size, the algorithm is carefully written to do the same amount of work no matter what the specific bits of the input are, to defeat timing attacks".

This seems very obvious to me. "Constant" independent of the input size seems literally impossible for any nontrivial algorithm. At the very least you need to somehow read in all the input, and that's already input-size dependent.


It's pretty frustrating to see the discussion on this submission dominated by people litigating the "constant time" terminology. The authors, Bernstein and Yang, are using constant time in the conventional, complexity theoretic sense of the word. Here is a quote from Section 2, "Organization of this paper":

> We start with the polynomial case. Section 3 defines division steps. Section 5, relying on theorems in Section 4, states our main algorithm to compute c coefficients of the nth iterate of divstep. This takes n(c + n) simple operations. We also explain how “jumps” reduce the cost for large n to (c + n)(log cn)^(2+o(1)) operations. All of these algorithms take constant time, i.e., time independent of the input coefficients for any particular (n, c).

In particular note that last sentence. The asymptotic runtime of the presented algorithm does not depend on the inputs, n and c. This algorithm analysis is confirmed throughout the remainder of the paper, which walks through each stage of the algorithm. Now let's look at a few canonical definitions of "constant time", i.e. O(1).

From Skiena, we have:

Constant functions - f(n) = 1 - Such functions might measure the cost of adding two numbers, printing out the "Star Spangled Banner", or the growth realized by functions such as f(n) = min(n, 100). In the big picture, there is no dependence on the parameter n.

Likewise from Sedgewick & Wayne:

Constant. A program whose running time's order of growth is constant executes a fixed number of operations to finish its job; consequently its running time does not depend on N. Most Java operations take constant time.

I'll update if I find a choice example from Knuth in TAOCP, but I think this suffices. The discussion about whether or not the cryptographic use of the term satisfies the complexity theoretic sense of the term is a red herring; it's a distinction without a difference. Algorithm analysis focuses on asymptotic behavior, which is definitionally given by tail behavior, or rate of growth of a function. Among other things, this paper is not about an implementation methodology that ensures the GCD algorithm will take exactly the same amount of time regardless of the input.

______________________

1. The Algorithm Design Manual, 2nd Edition, § 2.3.1 Dominance Relations, Page 39

2. Algorithms, 4th Edition, § 1.4 Analysis of Algorithms, Page 187


> Among other things, this paper is not about an implementation methodology that ensures the GCD algorithm will take exactly the same amount of time regardless of the input.

The introduction to the paper talks about the importance of developing constant-time implementations of Euclid's algorithm to avoid timing attacks, pointing out how algorithms with conditional jumps are susceptible to, for example, cache timing attacks.

As far as I understand it, a constant time algorithm, in the cryptographic sense, is one whose running time is uncorrelated with the precise secret information. If it were correlated, then timing attacks would let you infer the secret information. Presumably, if there is no random model, then the definition is that the algorithm must run in the exact same amount of time no matter the input, assuming a fixed size for the inputs. (At least one of the cited papers agrees with this definition.)

The contribution of the paper is a constant time divstep algorithm. The quote "All of these algorithms take constant time, i.e., time independent of the input coefficients for any particular (n, c)" seems to have ambiguous quantifiers and should be rewritten. It should be something like "given a fixed number of iterations n and a fixed number of input power series coefficients c, our divstep takes a constant n(c + n) simple operations, and the algorithm is constant time, i.e., the running time is not correlated with the precise inputs." Section 1.1 gives exact cycle counts for various CPUs.

Section 7 discusses how to use bit masking to actually implement it in a constant time way (in the cryptographic sense).

Anyway, divstepsx is certainly not O(1) time unless n and t are fixed constants.


I think you misread slightly.

> All of these algorithms take constant time, i.e., time independent of the input coefficients for any particular (n, c).

This means that once you have chosen a particular n and c, the time no longer varies. However, if n and c vary, the running time is definitely allowed to vary also (as the formulas n(c + n) and (c + n)(log cn)^(2+o(1)) clearly do).


No, I didn't misread. You and I are in (apparently violent) agreement. Constant time does not mean that running time cannot vary, in either complexity theory or cryptography. There are misconceptions on both sides here, with regard to what the terminology means in both complexity theory and cryptography.

For precision, I'll start with a good definition[1] for what "constant time" means in cryptography:

> Constant-time implementations are pieces of code that do not leak secret information through timing analysis. This is one of the two main ways to defeat timing attacks: since such attacks exploit differences in execution time that depend on secret elements, make it so that execution time does not depend on secret elements. Or, more precisely, that variations in execution time are not correlated with secret elements: execution time may still vary, but not in a way that can be traced back to any kind of value that you wish to keep secret, in particular (but not only) cryptographic keys.

Secure, constant time cryptographic algorithms need not have unvarying execution time. Now going back to complexity theory, it is also an extraordinarily common misconception that "constant time" means "the algorithm has the same execution time regardless of the size of the input." This is not the case. Big O notation doesn't even care about what always happens, it cares about what happens in the worst case. When we use "constant time" in the O(1) sense of the word, we are not precluding the possibility of an algorithm having variable execution time. Again, for precision, we are simply saying that the execution time (number of operations, etc) has an asymptotic upper bound which is independent of the input. The execution time may vary with the input, and generally speaking it will.

_________________________

1. Thomas Pornin, Why constant time crypto? https://bearssl.org/constanttime.html


> Constant time does not mean that running time cannot vary, in either complexity theory or cryptography.

Again, overloading of the term "constant time" causes pointless misunderstanding and arguments. Your statement here is wrong.

In complexity theory the term "constant time" does indeed mean the running time is bounded even with unbounded input (i.e., even as the input size goes to infinity), although it could vary within this bound.

In cryptography the term "constant time" is sometimes used to mean a different concept, that the operation actually takes constant non-varying time, so that an attacker can't exploit this as a side channel to figure out the input values.

The paper seems to be using the latter meaning.


> In cryptography the term "constant time" is sometimes used to mean a different concept, that the operation actually takes constant non-varying time, so that an attacker can't exploit this as a side channel to figure out the input values.

Note that I cited Thomas Pornin, a cryptographer in both theory and implementation, for my definition of constant time in cryptography. It is emphatically not necessary for software to run with unvarying execution time in order for it to be "constant time" according to the cryptographic sense of the term. This will be a poor hill for you to die on, but I invite you to provide literature supporting your alternative definition.


The definition of “constant time” in computational complexity theory is O(1) time as a function of the input size n, i.e., the time taken for an input of size n is bounded by some constant regardless of n. This is clear even from the two quotes you picked, from Skiena and from Sedgewick: both say that the running time does not depend on the input size n.

What we have here in the cryptography sense is different: the running time n(c+n) clearly does depend on n; it just does not depend on the actual input. That is, it is only constant across inputs of the same size.

I am not sure how you're claiming that n(c+n) is O(1) when it's clearly Θ(n^2) (i.e. O(n^2) and Ω(n^2)).


> What we have here in the cryptography sense is different: the running time n(c+n) clearly does depend on n; it just does not depend on the actual input.

Can you explain to me what the input is, if not c and n?


Consider the sentence just before “This takes n(c + n) simple operations”, as even quoted by you: “Section 5 […] states our main algorithm to compute c coefficients of the nth iterate of divstep”. So from this it is clear that c and n are the number of coefficients desired and the number of iterations desired; they are not the input to the algorithm (the two numbers or polynomials whose gcd is sought).

I'm really curious about your comments here. To understand them better, could you take a moment to say which of these statements you disagree with:

(1) A function f is O(1) if it is bounded. That is, if there exists a constant C such that for all n, we have f(n) < C.

(2) When we say that an algorithm (or more precisely, its running time) is O(1), we mean that the function “the algorithm's running time as a function of its input size” is O(1). In other words, let's denote the time that the algorithm takes on input I by T(I), and let us denote f(n) = max_{|I|=n} T(I) where |I| denotes the size of input I. Then we mean that f(n) = O(1). In simpler terms: that the worst-case running time does not grow without bound as a function of the input size.

(3) The algorithm in this paper, to compute the GCD of two polynomials or integers, does not have a bounded running time. (In fact, the authors describe it by the term “subquadratic”, and compare it to earlier subquadratic algorithms: “[…] a constant-time algorithm where the number of operations is subquadratic in the input size—asymptotically n(log n)^{2+o(1)}.”) That is, given any constant C, there exist inputs to this algorithm (two sufficiently large polynomials or integers) such that the running time of the algorithm is greater than C.

(4) As (3) shows that the property described in (2) does not hold, the running time of the algorithm here is not O(1).

One can argue separately whether “constant time” should be used to mean

• (the usual computational complexity sense) that the worst-case running time, i.e. the function f(n) is O(1), or

• (the cryptography sense) that T(I) does not give any exploitable information about I beyond what is known, typically |I|.

This is just an unfortunate clash of terminology across the cultures of the two fields, and IMO not an interesting thing to debate — everyone can choose whatever terminology works for them. But your claim that the two definitions are the same is puzzling, as all of (1), (2), (3), (4) seem obvious to me.


Agreeing with you by example: In computational complexity the size of the input varies. In cryptography it typically doesn't: time is identical for all inputs of the same size.

Imagine a magic sorting algorithm that always takes 1 second to check if an input is sorted, and 9 seconds to sort the input if wasn't sorted. It then returns the sorted value as output.

EG sorting "abcdefg" would check (taking 1 second) and then return (taking 0 seconds since it's sorted). Sorting "gfedcba" would check (taking 1 second) and then sort (taking 9 seconds) and return. Taking in the complete works of Shakespeare it would check (taking 1 second) and then sort (taking 9 seconds) and return.

It's O(1), yet the time varies based on the input. From a computational complexity terminology standpoint it's constant time; from a cryptographic terminology (and common intuition) standpoint it's clearly not, since it doesn't always take the same time to run.


The confusion is that the word-size of the operands plays a role.

For example, some people would say that an operation like a⨯b takes constant time or that its time complexity is O(1). However, other people (for example those implementing a bignum library) would say that the operation takes time depending on the word size of the operands. In the case of the multiplication operation, see for example [1] for a complexity analysis, to get an idea of what I mean.

[1] https://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strasse...


I would suggest that you take the time to reflect on the fact that if almost everyone disagrees with you, it might point to something that you misunderstand, either about the meaning of what they say or in your understanding of the subject.

Their paper shows that the algorithm takes constant time to compute the Cth coefficient of the Nth step, for a given C and N. Not that the algorithm's time is independent of C and N. C and N are not the input of the GCD. The algorithm's running time is constant for the input of the GCD; it is dependent on C and N.


If I understood correctly, this paper is NOT about a new blazing fast security-breaking CS-history-changing algorithm. This paper suggests an algorithm that takes the same amount of time to compute GCD(6, 9) and GCD(123456789, 987654321), to prevent leaking hints on its inputs through side-channels. That is, this thing is basically less efficient, but still runs the same number of instructions no matter the input.

(EDIT: ... as long as inputs have the same bit-length. Any 32-bit inputs will be handled faster than 1024-bit inputs, but any 1024-bit inputs will consume the same amount of time no matter their actual values. That is, 0x0001 and 0x000000001 are handled differently by the algorithm.)

The paper does mention this:

> However, in cryptography, these algorithms are dangerous. The central problem is that these algorithms have conditional branches that depend on the inputs. Often these inputs are secret, and the branches leak information to the attacker through cache timing, branch timing, etc.

So, yeah, this is a security-centered cryptography paper. The term "constant-time" is used in a different context here.


The term "constant-time" is used in the complexity theoretic sense. Can you explain to me, concretely, how what you've said here

> as long as inputs have the same bit-length. Any 32-bit inputs will be handled faster than 1024-bit inputs, but any 1024-bit inputs will consume the same amount of time no matter their actual values. That is, 0x0001 and 0x000000001 are handled differently by the algorithm

indicates the algorithm is constant-time in one sense but not the other?


The problem is: "Constant—with respect to what?" "Remains constant under what conditions?" The function f(x,y) = x^2 is constant under varying y, and is not constant under varying x. The adjective "constant", by itself, is incomplete—unless it's truly a mathematical constant, like 2 or e, that depends on nothing else—which is what leads people to the "O(1)" interpretation of the phrase "constant time".

So if it's not used to mean "constant (no matter what you vary)", then it means "constant (if you vary certain parameters and I'm not specifying which ones)". When you use a phrase with something left out and implied, then the audience has to fill it in somehow. If the audience shares your background, perhaps has been reading similar papers recently in which "constant with respect to xyz" had the xyz spelled out explicitly, this may go well; if not, it may not. In this case, people's interpretations of "the xyz we're varying" appear to range over "the entire space of integer-tuple inputs", "the size of the integers", "the bits of the integers after the leading 1", "the parts of the inputs that are considered 'secret'", and more.

So, if you say something with an implicit part left unspecified, and people fill it in with something different than what you intended... the first time this happens, I might consider it an unfortunate accident. If it happens repeatedly, it may be worth being more explicit or choosing another term. (Suggested terms: "secret-hiding", "secret-blind". "[something]-oblivious" might be another good word-formation—precedent exists in "cache-oblivious" algorithms.)

This is not the worst terminological mess we have in CS[1].

[1] My (least) favorite example is the term "dynamic programming", whose name appears to have been chosen because it sounded good and was vague enough to cover what the author wanted: "Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities." https://en.wikipedia.org/wiki/Dynamic_programming#History


The main difference here is the goal. An "O(1)" algorithm often means an efficient algorithm out there in the field, but this paper has absolutely no intention of making anything efficient.

People are wrong in thinking that (1) O(1) = fast/efficient, and (2) that I'm arguing over the definition of "constant-time".


I was a bit disappointed to see that the "constant time" was click bait. It should be "fixed time" - or similar - instead.


One might argue that calling O(1) just "constant time" in asymptotic complexity theory is click bait itself! Calling O(1) bounded time would have been better terminology, in my mathematical opinion.

Sometimes people actually count individual operations to form a function T(n) depending on the input n (see, for example, Knuth's books). If T(n) is the constant function, then the algorithm takes constant time in a much more literal sense.

(The "constant time" terminology bugged me when I first learned about computational complexity.)

Edit: It seems "constant time" in this paper might actually mean "the running time is uncorrelated with the precise secret value, only its size."


It’s not click bait. It’s standard terminology.


To clarify: it's not "constant time" in the sense of having O(1) time complexity with regards to the size of the inputs, which is what most people mean by "constant time" (which is obviously not possible in this case: there's never going to be a GCD algorithm that can work as fast on 100-bit integers as on 1,000,000,000-bit integers).

It's "constant time" in the cryptographic sense, that the time to run it can't be used as a side-channel to figure out what the inputs are. A great result to be sure, but the terminology is undoubtedly confusing.


With any algorithm complexity analysis you have to define what the inputs are considered to be. For cryptography, algorithms are designed to be constant-time with respect to the non-secret inputs. The secret inputs (which you are trying to protect) usually do not vary from one call to the next (eg, long-term private keys etc) - so can be assumed to be constant.

So while the terminology seems confusing, it’s not actually different. It’s just a different choice of “input” compared to typical algorithm analysis.


> it's not "constant time" in the sense of having O(1) time complexity with regards to the size of the inputs

Yes it is. The presented algorithm is constant time in the exponent, i.e. 2 + O(1), where this exponent is not impacted by the size of the inputs n and c. Much like any other complexity analysis, an algorithm is O(1) as long as O(1) is asymptotically the "largest part" of the running time. As the size of n increases, the exponent 2+O(1) increasingly dominates execution time.


"Constant time in the exponent" is nonsense, I'm afraid.

A bounded exponent of n would be the same thing as "polynomial time", but the thing that's bounded (and indeed arbitrarily close to 2 for large n) is the exponent of log n.

The running time of the algorithm presented in this paper is n (log n)^(2+o(1)). This ...

... is not constant; it increases with n, a bit faster than linearly.

... has o(1), not O(1), in the exponent; the two mean different things. O(1) means "bounded", o(1) means "tends to zero". The claim isn't that the running time is <= n times polynomial(log n) but that it's <= n times "at most approximately a quadratic polynomial in log n".

... doesn't in fact depend mostly on that exponent; the most important factor is the n, not the (log n)^(2+o(1)). If that 2 were a 100, the n factor would still (asymptotically) matter more.

For instance, suppose n=2^100 and our logs are to base 2. Then the running time of this algorithm is approximately some constant times 2^100 (that's the n factor) times 100^2 (that's the factor with log n in it). 2^100 is much, much, much bigger than 100^2.


I disagree. "Constant time" means O(1), which this isn't.

Disagreeing with a fact doesn’t make a lot of sense. Maybe you meant to say “I didn’t know this was standard terminology” or “I think it’s a confusing choice of word anyway”.

Unfortunately the terms fundamentally overlap (even constant time equality is not constant as a function of input size).


I presume you came across this researching the zero-day DoS in Windows 10 (and others?) caused by an infinite loop in Microsoft's modular inversion code? Thanks for sharing!



Isn't constant time GCD a problem for factoring big primes?


I have an O(N) algorithm for factoring any prime: Read each digit of the prime from the input tape and write it to the output tape. :P

(Perhaps you meant factoring large semi-primes? :) )


I have an O(1) then!


Given that you have to read and then write each digit of the input I find it hard to believe that you have an O(1) algorithm - can you tell us what it is?


Take a photo of the input on the tape and print the photo.

Depending on the size of the prime you might need to take multiple photos.

Print "1", thought that would be obvious...

Assuming you mean factoring large numbers, specifically semi-primes with broadly similar-sized factors and no assumed structure, the answer is still no. Taking N as the number being factored and C a candidate that might contain a factor, computing GCD(N, C) is not a significant part of the process. The bulk of the process is usually finding C.

And the article isn't claiming that the GCD code here takes the same time regardless of the input; it's saying that for two inputs of the same size the time taken is the same. Time as a function of the input size is still proportional to the number of bits in the input; it's just that the routine takes the same time regardless of how many of those bits are 0, regardless of the structure of the input numbers.

========

To others commenting here, the guidelines[0] say:

Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.

[0] https://news.ycombinator.com/newsguidelines.html


At first look: the algorithm is just free from side-channel attacks.



