

Released: Python 2.6.8, 2.7.3, 3.1.5, and 3.2.3 - johns
http://mail.python.org/pipermail/python-list/2012-April/1290792.html

======
ComputerGuru
I'm both very much surprised and disappointed that this feature is disabled by
default and needs to be explicitly turned on.

Honestly, if the spec doesn't guarantee that iteration will be in order, no
one has a right to complain if it actually isn't with this release.

In the C++ world I come from, the Spec is treated as being holy. You don't
write code that invokes undefined behavior or relies on unspecified behavior.
If you do, you're on your own and no one will touch your code with a ten foot
pole if they can help it. (Unless, of course, there really is no other way due
to your compiler, etc. etc. in which case you litter the comments with
warnings)

In this particular case, it's a simple matter of choosing the right data
structure to suit your needs, or sorting contents before accessing them if you
truly require in-order iteration.
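
As a minimal sketch of the two options named above (sort at the point of use, or pick a container whose contract includes ordering), assuming made-up data and hypothetical helper names:

```python
from collections import OrderedDict

# Hypothetical data, for illustration only.
scores = {"carol": 3, "alice": 1, "bob": 2}


def items_in_key_order(mapping):
    """Sort keys explicitly instead of trusting dict iteration order."""
    return [(key, mapping[key]) for key in sorted(mapping)]


def insertion_ordered(pairs):
    """Use a container that documents insertion order as part of its contract."""
    return OrderedDict(pairs)


ordered_items = items_in_key_order(scores)
by_insertion = insertion_ordered([("carol", 3), ("alice", 1), ("bob", 2)])
```

Either approach keeps working no matter how the hash function changes between releases.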

~~~
ywrkdl
>In the C++ world I come from, the Spec is treated as being holy.

That's ridiculous.

>You don't write code that invokes undefined behavior or relies on unspecified
>behavior.

This happens all the time, and not only is code written that relies on
unspecified behavior, vendors maintain backward compatibility for programs
that do this.

<http://www.joelonsoftware.com/articles/APIWar.html>

> If you do, you're on your own and no one will touch your code with a ten
> foot pole if they can help it.

Not really. If you do this and you're a huge success, you're the center of the
universe. If you do this and no one notices, you're fucked.

~~~
ComputerGuru
Software APIs are one thing, the C/C++ spec is another. The latter undergoes
revision after revision through hundreds of publicly released drafts and after
years and years of planning and experimentation. The former is subject to the
whims and caprices of whomever's software you're using.

------
chubot
Wow this is great. As I remember basically all language implementations like
Ruby, PHP, Perl, V8, etc. were vulnerable to this problem. Is Python the first
to provide a fix?

~~~
haberman
Lua authors think the problem is overstated, but are fixing it anyway for
"propaganda purposes":
<http://article.gmane.org/gmane.comp.lang.lua.general/87667>

~~~
emboss
I find their attitude worrying - how can something be overstated if it's
essentially possible for anyone to take down servers as easily as that? It's
possible, so it must be fixed. That's what basically any textbook on security
tries to convey: there's no such thing as "mostly secure" - either it's done
right or there's no need to do anything at all. Ignorance won't help in making
the web a safer (more secure) place.

~~~
rplnt
It should be fixed in the web frameworks people are using. Some rolled out
fixes after the bug disclosure, which came surprisingly late considering it
was well known in theory and Perl fixed it years ago (in 2003). The fix is
simple -- don't allow users to pass thousands of arguments/options, or
basically any user input which is later put into a dictionary.
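
A rough sketch of that framework-level mitigation, assuming a hypothetical `MAX_FIELDS` limit and a hypothetical `parse_query_capped` helper (not any particular framework's API): reject the request before any attacker-controlled keys reach a dict.

```python
from urllib.parse import parse_qsl

MAX_FIELDS = 1000  # hypothetical limit; tune for the application


def parse_query_capped(query, max_fields=MAX_FIELDS):
    """Parse a query string, refusing inputs with too many fields."""
    # Count separators first so an oversized request is rejected cheaply,
    # before any keys are hashed. ';' is counted too, because older Python
    # versions also treat it as a separator (a conservative over-count).
    field_count = query.count("&") + query.count(";") + 1 if query else 0
    if field_count > max_fields:
        raise ValueError("too many query parameters")
    return dict(parse_qsl(query, keep_blank_values=True))
```

This limits the damage but does not remove the collision problem itself; JSON bodies, headers, and other inputs that end up as dict keys would need the same treatment.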

------
gdg92989
This easily could have been the title of an article bashing Python
fragmentation.

~~~
yk
This was also my first thought. But with changes as big as Python 3, there is
probably no way around this.

~~~
gregbair
How does that explain the four "current" versions, then?

~~~
jgeralnik
There aren't 4 current versions, there are 2. The other 2 are old versions
that still get security updates.

------
kibwen
Does anyone know if this behavior will be enabled by default in 3.3?

~~~
DasIch
It will be.
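
For the releases in this announcement, randomization is opt-in, via the `-R` command-line flag or the `PYTHONHASHSEED` environment variable. A small sketch for observing the effect from a child interpreter (the `hash_in_child` helper is hypothetical):

```python
import os
import subprocess
import sys


def hash_in_child(seed, text="abc"):
    """Return hash(text) as computed by a fresh interpreter with the given seed."""
    # PYTHONHASHSEED accepts "random" or an integer; an integer gives a
    # repeatable seed, and 0 disables randomization.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash(%r))" % text], env=env
    )
    return int(out.decode().strip())
```

Calling `hash_in_child` twice with the same seed should give the same value, while different non-zero seeds should give different values.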

------
hristov
I am surprised they get into quadratic algorithmic complexities. If they use
the most efficient data structures (e.g., balanced binary trees) their
algorithms for storage and retrieval should be no higher than O(log(n)). Thus
if everything collides at the same entry in a hash table, you should still be
able to do stores and reads for O(log(n)) time. And if that is the case, it
would be very difficult to execute an attack that would cause significant
disturbances even if the bad guys could beat the hash function.

~~~
tedunangst
I think you're assuming that python hash tables use linked list buckets. They
don't. They use open addressing, and resize the table as necessary. There's no
"data structure" that could be switched to a binary tree.

~~~
hristov
Yes, that is what I was assuming.
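
To make the open-addressing point concrete, here is a toy table that counts probes. It is only a sketch: it uses linear probing and never resizes, whereas CPython's dict uses a perturbed probe sequence, and the class and function names are made up for illustration.

```python
class ToyOpenAddressingTable:
    """Fixed-size open-addressing table with linear probing (illustrative only)."""

    def __init__(self, size=1024):
        self.slots = [None] * size
        self.probes = 0  # total slots inspected across all inserts

    def insert(self, key, hash_value):
        size = len(self.slots)
        index = hash_value % size
        while True:
            self.probes += 1
            slot = self.slots[index]
            if slot is None or slot == key:
                self.slots[index] = key
                return
            index = (index + 1) % size  # collision: try the next slot


def total_probes(n, colliding):
    """Insert n distinct keys and return how many slots were inspected."""
    table = ToyOpenAddressingTable(size=4 * n)
    for i in range(n):
        # A constant hash models an attacker-chosen set of colliding keys.
        table.insert(i, 0 if colliding else i)
    return table.probes
```

With well-spread hashes, `total_probes(100, False)` inspects 100 slots. With every key colliding, `total_probes(100, True)` inspects 1 + 2 + ... + 100 = 5050, which is the quadratic growth under discussion.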

------
bmm6o
Does anyone have a link to the commit that fixed this? I'd be interested in
seeing the before and after.

~~~
ken
It looks like most of it is in <http://hg.python.org/cpython/rev/f4b7ecf8a5f8>

~~~
bmm6o
For anyone else that's curious, it's a pretty simple change:

    
    
    --- a/Modules/datetimemodule.c
    +++ b/Modules/datetimemodule.c
    @@ -2566,10 +2566,12 @@ generic_hash(unsigned char *data, int len)
         register long x;
     
         p = (unsigned char *) data;
    -    x = *p << 7;
    +    x = _Py_HashSecret.prefix;
    +    x ^= *p << 7;
         while (--len >= 0)
             x = (1000003*x) ^ *p++;
         x ^= len;
    +    x ^= _Py_HashSecret.suffix;
         if (x == -1)
             x = -2;
    

The hash computation is initialized with a global random value, and a second
one is xored in at the end. (-1 isn't allowed as a hash, since it's the
sentinel value that indicates the hash hasn't been computed yet.)
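
A rough Python rendering of the same idea, for readers who would rather not read the C. This is a sketch under assumptions: it emulates a 64-bit C `long` with a mask, XORs in the input length the way the loop above appears to intend, returns 0 for empty input as a choice made here, and takes `prefix` and `suffix` as plain arguments standing in for `_Py_HashSecret`.

```python
MASK = (1 << 64) - 1  # emulate a 64-bit C long (a platform assumption)


def toy_string_hash(data, prefix=0, suffix=0):
    """Seed with prefix, mix in each byte, then XOR in length and suffix."""
    if not data:
        return 0  # sketch choice: empty input hashes to 0
    x = prefix & MASK
    x ^= data[0] << 7
    for byte in data:
        x = ((1000003 * x) ^ byte) & MASK
    x ^= len(data)
    x ^= suffix & MASK
    # Reinterpret the 64-bit pattern as a signed value, as C would.
    if x >= 1 << 63:
        x -= 1 << 64
    # -1 is reserved, so it is remapped just like in the C code.
    return -2 if x == -1 else x
```

With `prefix=0` and `suffix=0` this behaves like the old unseeded loop; a different `prefix` changes the result for the same bytes, so an attacker who does not know the secret cannot precompute a colliding key set offline.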

------
pyre
I'm disappointed that they didn't fix pydoc to handle Partial objects
correctly (i.e. show them as methods and display their __doc__ text rather
than show them as variables of class Partial).

------
lawnchair_larry
Wow, they are only fixing this just now? Terrible. And yeah, off by default is
a failure. You should be secure by default.

------
jbarham
This post by Russ Cox about random hash functions is timely:
<http://research.swtch.com/randhash>

~~~
BarkMore
The Russ Cox post is unrelated. See the last paragraph in his post.

