Hacker Newsnew | comments | show | ask | jobs | submit | myle's comments login

Isn't version 2 wrong, because

    size_t probe = (low + high) / 2;
may overflow?

-----


Not if low and high are both unsigned (size_t is unsigned.) Even if it overflows, the result will be correct. See: http://googleresearch.blogspot.com/2006/06/extra-extra-read-...

-----


for 64 bit size_t it's pretty academic, but no, not correct.

suppose low is M-3 and high is M-1 (where M is 2^[#bits]) then mid should be M-2. but (M-3 + M-1) is (modulo M) equal to M-4. and half that is M/2 - 2, which is a lot less than the correct M-2. (pretend it's 2 bit unsigned ints so M is 4)

the correct 'mid' computation is low + (high - low)/2.

-----


That blog is wrong. The program won't be undefined( unsigned wrap is defined ), but the offset will not be correct.

-----


So the article isn't wrong, but its misleading you into thinking this should be correct for the wrong reasons.

They are using signed integers as their indices, which means that the signed bit is always 0. Thus the addition after casting to unsigned will never overflow, and you can divide by two (shift by 1) and then recast to a signed integer, no harm no foul.

-----


Actually it is misleading by them( and you ) to assume that in C, an unsigned int can represent values larger than the largest signed int value.

In other words, C allows that UINT_MAX == INT_MAX, in which case you will overflow.

If they made that assumption, they should explicitly mention it, but they didn't.

> Update 17 Feb 2008:... ...Now that we've made this change, we know that the program is correct;)

It seems the article is aware of the irony. Another update would be in order.

-----


Sure, and if you expect your software to run on such a platform with any degree of confidence then you're right to consider it. Even better, tell us about a conforming implementation that you've used in product recently that has UINT_MAX == INT_MAX.

Also I'm not trying to mislead people into thinking that its a good way to implement this. The confounding bit from the article is they started in Java and ended up in C. If you were indexing with signed ints in C, C++, or any language that has unsigned integers then you already have a bug with or without the bad mean check.

-----


You got it backwards there. Only if you know your implementation and plan to code only for it, can you even start to consider bending the C Standard, and not the other way around.

-----


They didn't say unsigned overflow was undefined, but you're right that the offset won't be correct. I thought it seemed intuitively wrong to me, but I figured google research is probably a reliable source!

-----


That's not true. Unsigned wrap is well-defined and wraps around to 0. Signed wrap is undefined.

-----


It is not true that unsigned wrap is defined? Did you even read my comment, here is the relevant part: unsigned wrap is defined

-----


True, but only if the list has more than 2^31 entries on a 32-bit machine.

In practise that's not possible if it contains 32-bit integers because that would take up a 16 GB linear address space which doesn't exist on a 32-bit machine.

An academically correct version would be `size_t probe = low + ((high - low) / 2);` but that would be much slower.

You could instead just add an initial debug-mode-only assert on the list length, for correctness.

-----


Sun had to put out a JDK bug fix for this nine years ago ( http://googleresearch.blogspot.com/2006/06/extra-extra-read-... ).

-----


a) Half-life of technologies is very short.

b) Instead of 10 years experience, maybe someone has 10x 1-year experience.

c) To learn theory, data structures, etc maybe university/company provides a better environment than your PSTN connection on the internet (at least that's what we had when I was 12).

d) When you get hired in a new environment, almost certainly you will face a codebase that you were not aware of. What mostly matters is your ability to generalize your experience.

Because of at least (a) - (d), the condition in the first part of your if statement, does not necessarily imply the second part.

-----


There are algorithms for LP in polynomial time in worst case.

-----


You could have created a filter to avoid marking these mails as spam.

-----


Only after you find out they were getting filtered, usually because someone calls you to say "WhyTF aren't you responding to my emails?"

-----


Some legit websites instead of displaying a 404 error, they redirect (3xx) to a page like the main page or about.

-----


Or return 200 with an error message. Something Google would want to know in any cause.

-----


String concatenation is pretty fast nowadays with the + operator.

-----


Do you have a source? Also, I would argue that concatenation with + is less readable than string formatting.

-----


I think that concatenation with + is okay when you're concatenating two strings. And CPython can optimize x = x + y or x += y calls.

http://docs.python.org/2/library/stdtypes.html

-----


This could have been prevented. Oblivious Turing Machines: http://stackoverflow.com/questions/14847080/how-does-an-obli...

-----


$ man ascii

-----


$ man ascii | cat | lp -

-----


We have a nominee for the useless use of cat award.

-----


Realized after I posted. Too late to fix it. The only reason I made the mistake in the first place is that I wasn't sure if the pager would mess up text output.

(What? You expect me to print three or so pages to test an assumption about how *nix utilities work?)

-----


    man -t ascii | lp -

-----


You can use the same idea in quicksort while choosing the pivot (and make it O(n log n)) but you don't because it is expensive.

There is a very nice and simple probabilistic algorithm instead.

-----


No one mentioned qtile. It is written in Python, tiling WM and very simple to use.

-----

More

Applications are open for YC Winter 2016

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact

Search: