
Algorithmic complexity attacks and libc qsort() - matslina
http://calmerthanyouare.org/2014/06/11/algorithmic-complexity-attacks-and-libc-qsort.html
======
nicknash
One semi-interesting thing here is that the "killer adversary" described on
that page requires knowing that the algorithm you're trying to exploit is
quicksort.

Long ago, I wrote adversaries that attempt to force _any_ algorithm that is
extracting a partial or total order from data to approach their worst case.

These adversaries, like McIlroy's, sit in the compare() function, but they
ask themselves "what answer can I give, consistent with the data so far,
to force the algorithm to ask me the largest number of subsequent questions?"

For non-introspective quicksorts this will be O(n^2), but it should also bring
out the worst constant factors in O(n log n) worst-case algorithms.

I did some experiments:
[http://nicknash.me/2012/07/31/adversaries/](http://nicknash.me/2012/07/31/adversaries/)

A proper reference is Kahn's "Sorting and Entropy", IIRC.

~~~
samsamoa
Very interesting writeup. How did you determine the number of topological
sortings for your "Yes" and "No" examples? I'm getting 360 and 144, rather
than 120 and 48 that you wrote.

------
simias
I'm surprised the article does not point out more directly that it's usually
pretty simple to mitigate this attack vector by switching to mergesort. The
worst case is O(n log n) so there's no real "killer input".

I think it's a good rule of thumb to say "if you need to sort large quantities
of untrusted data you should probably use mergesort".

~~~
matslina
Mergesort would be a poor choice for qsort() due to the linear space
complexity. With N bytes of RAM, you'd only be able to sort (a bit less than)
N/2 bytes of data. An in-place algorithm is preferable.

Glibc is the only libc I'm aware of that implements mergesort. It still falls
back to quicksort for large inputs though.

~~~
tedunangst
All BSD-derived libcs contain mergesort(), and heapsort() as well.

~~~
mzs
True, but parent meant used in the qsort and qsort_r implementations.

~~~
tedunangst
Oh, of course. reading fail.

------
CJefferson
Just in case anyone is curious, this does not affect any implementation of
std::sort in C++ I am aware of.

They all use introsort -- basically, use quicksort until you have done some
number of partitions, then switch to heapsorting the cells of the partition.
This ensures fast quicksort performance while guaranteeing O(n log n) worst
case.

~~~
comex
In Linux-land, glibc's qsort also uses introsort, and musl libc's uses
smoothsort:

[http://www.etalabs.net/compare_libcs.html](http://www.etalabs.net/compare_libcs.html)

------
mqsiuser
As someone who implemented an open source quicksort
([http://www.mqseries.net/phpBB2/viewtopic.php?p=273722](http://www.mqseries.net/phpBB2/viewtopic.php?p=273722))
which is used by many in production:

Now this is very interesting, but it could also be prevented with a random
number generator (which influences the selection of the pivot element). If
this had been exploited, people would have looked into mitigating it.

Edit: You can detect recursion-tree degeneration (just check the depth) and
react to it (pivot selection).

~~~
sgeisenh
Yeah, random pivot quicksort is n log n expected case. It's pretty easy to show
that it is realistically impossible to exceed n log n for large data sets. The
constants are quite high, though, due to the need to generate a random number
for each element.

Additionally, quicksort is just as easy to implement in parallel as mergesort
and runs about as fast since it can be performed in-place.

------
jmount
Some historic quick-sort mis-implementations had O(n^2) runtime when trying to
sort constant data (all keys equal; the effect was due to not picking the
partition carefully in that case). See:
[http://www.win-vector.com/blog/2008/04/sorting-in-anger/](http://www.win-vector.com/blog/2008/04/sorting-in-anger/)

------
mzs
"Inspecting a diff between the qsort() of 4.4BSD-Lite and that of current day
FreeBSD reveals that very little has changed since 1994."

That reminds me :)

    
    
      $ expr \( -2147483648 \) \* -1
      -2147483648
      $ expr \( -2147483648 \) / -1
      Floating point exception: 8 (core dumped)
      $ uname -srm
      FreeBSD 8.4-RELEASE-p9 i386

~~~
Malus
Looks like it is fixed now:

    
    
      $ expr \( -2147483648 \) \* -1
      2147483648
      $ expr \( -2147483648 \) / -1
      2147483648
      $ uname -srm
      FreeBSD 10.0-RELEASE amd64

Or maybe it doesn't work on i386?

~~~
mzs
It's likely an i386 thing:

    
    
      $ echo | awk '{ print 2 ** 31 }'
      2147483648
    

Make sure you are using /bin/expr and maybe give -9223372036854775808 a whirl
too.

~~~
Malus
It now reports overflow instead of crashing, but the first case is still
incorrect:

    
    
      $ which expr
      /bin/expr
      $ expr \( -9223372036854775808 \) \* -1
      -9223372036854775808
      $ expr \( -9223372036854775808 \) / -1
      expr: overflow
    

Looking at the source code:

    
    
      void
      assert_times(intmax_t a, intmax_t b, intmax_t r)
      {
              /*
               * if first operand is 0, no overflow is possible,
               * else result of division test must match second operand
               */
              if (a != 0 && r / a != b)
                      errx(ERR_EXIT, "overflow");
      }

      struct val *
      op_times(struct val *a, struct val *b)
      {
              struct val *r;

              assert_to_integer(a);
              assert_to_integer(b);
              r = make_integer(a->u.i * b->u.i);
              assert_times(a->u.i, b->u.i, r->u.i);

              free_value(a);
              free_value(b);
              return (r);
      }

      void
      assert_div(intmax_t a, intmax_t b)
      {
              if (b == 0)
                      errx(ERR_EXIT, "division by zero");
              /* only INTMAX_MIN / -1 causes overflow */
              if (a == INTMAX_MIN && b == -1)
                      errx(ERR_EXIT, "overflow");
      }

      struct val *
      op_div(struct val *a, struct val *b)
      {
              struct val *r;
              assert_to_integer(a);
              assert_to_integer(b);
              /* assert based on operands only, not on result */
              assert_div(a->u.i, b->u.i);
              r = make_integer(a->u.i / b->u.i);
              free_value(a);
              free_value(b);
              return (r);
      }
    

Looks like the check for overflow in the multiplication case is broken since
the check itself does not account for overflow. I'll try to remember to submit
a patch when I get home.

~~~
mzs
thanks!

------
Jabbles
In comparison, the Go standard sort function is not vulnerable to this, and
references the same paper as this article:

[http://golang.org/src/pkg/sort/sort.go#L168](http://golang.org/src/pkg/sort/sort.go#L168)

~~~
lelf
Huh? And by what black magic is its qsort not O(n^2) in the worst case?

~~~
bcoates
It sets a stack limit proportional to log(n) and gives up and heapsorts the
subproblem if the limit is exceeded. You get no more than n log(n) quicksort
operations, plus heapsorting any partition of the input, which is also
n log(n).

In other words, they use the worst-case n log(n) heapsort with an optimization
to use the much faster quicksort for non-pathological inputs, which is almost
always the case.

------
galapago
The bogosort is well known to be invulnerable to such attacks.

------
haberman
A similar DoS-with-degenerate-input these days is hash flooding. In the same
way that quicksort can degrade from O(n lg n) to O(n^2) with degenerate input,
so can hash tables degrade from O(1) to O(n) when all of the keys have hash
collisions.

This is the main motivation behind SipHash, a new hash function that is
designed to be cryptographically collision-resistant but fast enough to use in
hash tables: [https://131002.net/siphash/](https://131002.net/siphash/)

~~~
chowells
You mean the one mentioned in the article?

Though you and the article both are perpetuating the lie that hash table
operations are O(1). What magical hash function distributes n inputs into O(n)
buckets in O(1) time? (Hint: distributing into O(n) buckets requires looking
at O(log n) bits.)

~~~
haberman
> Though you and the article both are perpetuating the lie that hash table
> operations are O(1).

Hash table operations are O(1) on average, assuming a uniform hash function,
and assuming that the number of buckets is kept greater than the number of
elements. More info here:
[http://en.wikipedia.org/wiki/Hash_table#Performance_analysis](http://en.wikipedia.org/wiki/Hash_table#Performance_analysis)

Computing the hash function itself is not usually factored into this analysis,
probably because it's an orthogonal concern (i.e. hash functions and hash
table implementations can be mixed and matched), and because computing the
hash function for every operation is not strictly required (the hash of the
key can be cached).

But even if you are factoring in computation of the hash function, this is a
strange argument:

> Hint: distributing into O(n) buckets requires looking at O(log n) bits.

Computers can look at O(log n) bits in a single operation for any "n" that can
reasonably be held in memory.

For example, if your hash table has less than 4B entries, "log n" is less than
32. Any 32 bit computer can "look at" 32 bits in a single operation. So I
can't see why you think that "log n" needs to be a factor in the complexity
analysis.

------
ccurtsinger
There is no "killer input" for randomized quicksort. I'm surprised libc
doesn't select pivots randomly already.

~~~
ahomescu1
Except if you know the PRNG state, but I imagine that's really hard to pull
off.

~~~
gamegoblin
I wonder if it's possible to do some statistical attacks by sending sample
input and measuring response time? That is, after getting the response times
for a million sorts, perhaps you can make inferences about the current PRNG
state?

I suspect fluctuations in latency would make this exceedingly difficult, but I
am constantly amazed by the statistical attacks people pull off.

~~~
gsnedders
Considering the fact that timing attacks based on memcmp short circuiting were
shown to be remotely practical a decade ago, latency isn't going to be an
issue. The bigger question is whether you can get down your inputs to a small
enough number to practically determine PRNG state (you probably can).

------
jacquesm
Quite nifty. Imagine how you could seed a service with data over a very long
period of time, and then at just the wrong moment make a single request
causing the server to fall over.

This sort of thing might even work where the target comes to you for data
(such as a search engine crawling you).

------
petergeoghegan
PostgreSQL removed the switch to insertion sort even before NetBSD, in 2006.

------
FlyingLawnmower
Was anyone able to view the linked research paper on Hash Table exploits?

~~~
icegreentea
You gotta dig a bit. But here:

[https://www.usenix.org/legacy/publications/library/proceedin...](https://www.usenix.org/legacy/publications/library/proceedings/sec03/tech/full_papers/crosby/crosby.pdf)

~~~
pbsd
See also the recent followups [1, 2].

[1]
[http://events.ccc.de/congress/2011/Fahrplan/attachments/2007...](http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf)

[2]
[https://131002.net/siphash/siphashdos_29c3_slides.pdf](https://131002.net/siphash/siphashdos_29c3_slides.pdf)

------
danielweber
What is plotted on the black bar charts? The axes are not labelled.

~~~
kylebgorman
It represents the contents of the array to be sorted, I believe. The x-axis is
the index in the array, the y-axis is the value at that index.

------
rainhacker
Pet project mentioned on calmerthanyouare.org

[http://code.google.com/p/awib/](http://code.google.com/p/awib/) "Awib is a
brainfuck compiler written in brainfuck."

