The trick would be to make the bugs interesting: weird or hard, preferably both.
After reading an issue like that I would be very hard pressed to motivate myself to contribute anything.
Given that no one stepped up to do it yet and random users chimed in with aggressive and/or unhelpful noise, locking the issue was adequate.
I find it concerning how much abuse GNOME devs get, and I can completely understand that they respond like this.
It’s a little strange because for Reddit’s r/legaladvice it’s the opposite. Follow up posts can rank much more highly than the original.
Reddit on the other hand is passive entertainment, users are more interested in a novel problem that someone else dealt with.
But an update? That means there's actual feedback and conclusions! Something that's desperately lacking on that particular board.
It's not about reddit as a whole.
The only bugs that HN will want to solve, will be the ones that should be solved by the community.
Fun to see the problem and the solution. Thanks Ned and thanks HN.
Source: have seen nedbat@ in action.
Seriously though, we can't know every single third-party package under the sun. But this one is more important than the average.
He had a compound conditional without enough parentheses in it, and one of the possible outcomes resulted in fd=1, which is stdin on many operating systems. My general advice now is to assume a 5th grade reading level for math (which consists of +-x/), and use parentheses for everything else. Especially if you think yourself to be clever (narrator: he did).
if(PORTA & 3 != 0)
looks like it should be true if either of the low two bits is set. But actually, it only fires if the low bit is set because this resolves to PORTA & (3 != 0), which is PORTA & 1.
Some C compilers will warn you about this. Embedded C compilers are not usually the sort of compilers that are nice enough to have these warnings.
As a bonus, different languages handle precedence of &,|,^ differently: Python, for instance, binds & more tightly than ==.
And then everyone decides their own languages will copy it forever.
One more for my list of "why I don't like C".
X != Y returns an integer value, no "implicit cast" required for &.
I've never understood why any language does this. To save a few keystrokes? Code is much more read than written.
With this kept in mind, I've personally found almost all the implicit conversions in both languages to be intuitive.
Beyond that though, most implicit casts are just bad.
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
— Brian W. Kernighan and P. J. Plauger in The Elements of Programming Style.
Both of these were bizarre bugs, and neither was immediately evident.
#1 was presented by an error saying something to the effect of "received xXxX bytes, expecting yyyy".
#2 was harder. It became evident because of logging. The high res timer was used gor logging amd intermittently, I was seeing +200 years supposedly elapsed during a db operation. Obviously wrong. It was a bug due to how registers are handled between x86 and x64 code.
Does anyone expect that their software would function properly in the face of arbitrary changes patched in by untrusted third parties ?
A code that looks simple and correct has been hiding a very subtle bug because no one understood the implications of an arbitrary exception being thrown at any point under the caller.
And of all the people who looked at the bug not a single one looked at this seemingly simple code and said "oh yeah, that's because this code is obviously wrong". It took someone skilled enough to script gdb to figure out the buggy interactions.
So the next time someone makes fun of `if err != nil`, send them the buggy implementation of tempfile.NamedTemporaryFile and ask them to spot the bug.
Also, KeyboardInterrupt should not be an exception. It's a signal, not an error or exceptional situation that code wants to report to the caller. But once you have a hammer, everything looks like like a nail. In addition to making the code impossible to reason about, no-one can agree what exactly is an exception so they randomly commingle expected errors with exceptional circumstances and, in case of Python, out-of-band signals.
("randomly" as in: why is "open file" throwing an exception when file doesn't exist but "find substring" returns -1 when substring doesn't exist?)
Are you also going to throw out all use of automatic garbage collection in languages because of a bug caused by a GC pause.
Should we stop bothering with encryption because people discover weaknesses and attack vectors?
Come on, man.
> Also, KeyboardInterrupt should not be an exception. It's a signal, not an error or exceptional situation that code wants to report to the caller.
OP doesn't mention anything about KeyboardInterrupt, but I'll bite.
Some languages like C# make "Events" a 1st party citizen, making it really easy to write non-sequential code and handling such events.
Most don't, so when people want to GUARANTEE a caller handles a particular signal, they use the Exception tool because it posesses the necessary qualities.
"Using Exceptions as flow control" is a well-known anti-pattern but even anti-patterns have their use in exceptional circumstances.
> ("randomly" as in: why is "open file" throwing an exception when file doesn't exist but "find substring" returns -1 when substring doesn't exist?)
I'll concede this one point. In-memory utility methods shouldn't throw exceptions when a useful error code would be just as good - it keeps code cleaner without trycatch blocks. NumberFormatException is my bane in Java.
Perhaps it's high time that problem was solved?
It actually is. Abusing exceptions as the normal control flow is a crude anti-pattern. Obviously the normal flow of control is not exceptional behavior. Therefore, abusing exceptions and exception-handling mechanisms to implement the happy path is simply wrong at many levels, from conceptual to practical, and in the end actually cause problems and hard to find buga such as the problems described in this bug report.
A generator in Java has the has_next() and next() method and this can be statically checked and you're basically only allowed to call next is hasnext is true.
In python the generator only has to provide the next-method and raises an exception when it's exhausted. I don't find this particularly unclean.
Surely that’s because of java? conversions normally throw in python and that’s no issue. So does indexing. So does str.index.
Please go check on the original post.
Every time there’s a subtle bug with 7 contributing factors, people pick the one which is caused by a design decision they happen to hate, and place all of the blame squarely on that one.
Because there can be lots of ways for `open` to throw an exception (encoding issues, not found, permissions, etc.). The underlying open syscall has 39 error codes. Python adds a few more failure modes atop that. I'd much rather deal with named error conditions than deal with return values -1 through -50.
Find substring can only fail in one way.
My experience is that while its true that its easy to have bugs in implicit exception handling logic, its just as easy, maybe easier, to have bugs in the explicit boilerplate its replacing.
I like Rust's `?` operator (and Swift's `guard` statement also), but it's worth noting that the bug wouldn't have existed in Rust in the first place because the file object wouldn't have outlived the scope it was defined in.
There was a try block with two function calls, and it seems like the author assumed only the first function call could throw.
With return-based errors, the error handling wouldn't implicitly be shared between the two functions. The author would write the handling for the first function, and then they would notice the second function could return an error state too. Then they might write a correct handler, or at least we should expect them to say "this can't happen, so panic". This bug would have been trivial to fix if it panicked instead of corrupting fds.
Programmers in exception languages never think (or at least should never think) "X cannot throw" because it is never true. Any function has the potential of throwing an exception. This is even more true in Python than in other exception throwing languages.
No, the problem here is that the author has realized that they need special cleanup logic in the case that a file fails to be opened, but they applied that logic too broadly.
return _TemporaryFileWrapper(file, name, delete)
Because it's almost true, and the code cleanup is clearly designed around just the first function.
_TemporaryFileWrapper does nothing in its __init__ except copy the four parameters into the object. There's no reason to anticipate a problem.
But thinking about it more precisely, there's an even worse issue.
Even if neither function has an error, if you get a KeyboardInterrupt between the object creation and the actual return, everything gets corrupted.
People aware of the bug, trying to fix this code, suggested a nested try/catch, and I think that would still explode if the timing was exactly wrong.
Because really, who thinks about a return statement throwing? It's an unintuitive problem that can only happen with this style of exceptions.
It's probably even worse than that. If an exception can hit right after the file object is made then you get a similar problem...
Is there anything in Python that restricts the timing of KeyboardException or is this entire adventure on extremely treacherous territory?
> That's precisely the point I was making: there is no python code which is guaranteed not to throw.
Well the first thing you said is "what is your alternative to exceptions, and how would they would have made the error easier to spot? Its not obvious that the error would have been easier to spot if it had been written using Go's `!= nil` pattern or Rust's `?` pattern."
In those languages, you can't have an error state trigger between lines of code, or inside a line of code that does basic things like assigning to a local variable or returning.
It won't happen in C++ either (with certain assumptions).
Python is a language that's supposed to be straightforward, and very few coders are going to understand exactly how hostile exceptions can be. So while these assumptions shouldn't have been made, the blame goes toward the design of the language. An alternative system would be better.
Uhh... it was a normal exception that triggered the bug in question.
> Well the first thing you said is
Yes, that's the first thing I said. But the point I'm making here is that your initial response to that is probably wrong: "the author assumed only the first function call could throw" That's just not how you think in Python. (Or at least it is not how you should think.)
My point is this: the bug wasn't assuming that the second function wouldn't fail. The bug was assuming that the same cleanup code was appropriate in the case of either failure. That same bug could easily exist with return codes as it does in exceptions.
> Python is a language that's supposed to be straightforward
In fairness to Python, this is an atypical case. The problem arises because this is low level code dealing directly with fds. Most of the time, code will deal with higher level objects which will properly clean up after themselves regardless of what crazy stuff you do with them.
That doesn't count because it was an invasive monkeypatch that took a function that could not possibly cause an error and made it throw.
> My point is this: the bug wasn't assuming that the second function wouldn't fail. The bug was assuming that the same cleanup code was appropriate in the case of either failure.
Maybe. Hard to know for sure.
> In fairness to Python, this is an atypical case.
I dunno, pretty much any code that uses a catch block to undo things has to be very carefully written. Even if you're wrapping things in higher level objects, you still have ugly scenarios where you're mutating a data structure and have to unwind the changes halfway through. I bet tons of that code is written under the assumption that there will be no exceptions.
I guess if we consider "trying to catch KeyboardInterrupt at any level without immediate exit" as a weird barely-supported case, then things are fine.
The original claim was based on that possibility, its pretty goofy to declare it doesn't count.
> I dunno, pretty much any code that uses a catch block to undo things has to be very carefully written. Even if you're wrapping things in higher level objects, you still have ugly scenarios where you're mutating a data structure and have to unwind the changes halfway through. I bet tons of that code is written under the assumption that there will be no exceptions.
Is it any different for return codes than for exceptions? I'd say that any code which attempts to undo change in the case of an error has to be carefully written. Every special case on the ordinary code path, has to have matching special cases on the error paths. In fact, I've never seen code in any language that does such a thing. And I wouldn't trust if it I did, since its extremely difficult to get right for all cases, and the errors paths almost certainly hasn't been well tested.
> I guess if we consider "trying to catch KeyboardInterrupt at any level without immediate exit" as a weird barely-supported case, then things are fine.
Well, trying to continue running after KeyboardInterrupt is dubious, the user just asked the program to terminate.
While there is lots of meta-complexity around this, the bug itself is relatively simple. Make sure files are closed properly on error.
Also find -1 is a wrapper from C, probably kept for compatibility reasons.
It is an issue of exceptions that try clauses can easily be overly broad in the amount of code they cover.
file = _io.open(fd, mode, buffering=buffering,
newline=newline, encoding=encoding, errors=errors)
return _TemporaryFileWrapper(file, name, delete)
However due to the mock it's the wrapper creation which fails. This means the cleanup runs, closes the fd, but leaves the file object (created by io.open) around and assuming it owns an fd, the file object later on gets collected by the GC and closes the fd a second time, breaking anyone unlucky enough to have gotten its reuse.
This issue would not happen if the wrapper were created outside the `try` statement (though the file would be leaking to be eventually closed by the GC which is not great either)
The issue here is a bug in resource management. The specific bug was caused by an exception that shouldn't be caught still being caught. Which erroneously messed with resource management.
Well, first, there's more to it than that. "Open file" will throw an exception when you open the file to read and it doesn't exist. If you open the file to write and it doesn't exist, you won't get an exception because that's expected behavior. Instead, the file will be automatically created behind the scenes. The complementary scenario, where you're opening a file to write and the file already exists, is more likely to be a bug.
Exceptions have semantic qualities and depend on what you're trying to do. Trying to read a file that doesn't exist is obviously an error state because it is inherently impossible to do. The same goes for detecting the location of a substring within a string that doesn't contain it. But examining a string to see whether it contains a particular substring has no such problem. You can always test a string for the inclusion of some other string. So the judgment of whether "find substring" should throw an exception or not depends on whether you think the question being asked when someone invokes "find substring" is "is substring in there?" (answer of "no" is not exceptional) or "I know substring is in there; where is it?" (answer of "it isn't in there, you doofus" is exceptional). Both questions are common; this isn't really a call the language should be making.
The same problem is more frequently discussed in the context of hash tables. If I look up a key in a hash table and I get a null value, is that because there is no value stored for that key, or is it because null is the value stored for that key? People would like accessing the table to be convenient while distinguishing these two cases, and python doesn't allow that. You can choose from some different behaviors:
table[key] # Absence of the key is exceptional.
table.get(key) # Absence of the key will yield None...
# but this is indistinguishable from
# a key with the stored value None.
key in table # This directly tests whether key is
# present, but won't tell you the stored value.
But you can store any value, even null, in a hash table -- that's their thing. Finding substrings doesn't have this problem; a legal result must be a valid index into the string. We can solve the how-do-we-want-the-hash-table-to-behave problem perfectly by returning an invalid index when the substring isn't present, and... that's what we do.
 Common Lisp addresses this by allowing functions to return multiple values (distinct from a single value with multiple subvalues) -- it's possible to treat the return value of a multiple-valued function as a single value, in which case you get only the first value. This is great for the hash table example, but in general it makes this easier while making it much more annoying to handle the multiple values when you do want them all.
Or, you can create a second null-like type in the language, name it 'undefined' perhaps, and return that when the key doesn't exist.
Then you don’t have the issues of in-band signalling.
Is that multiple values? (Yes.) Or a wrapper type? (Also yes, it's a tuple, a collection of values.) This works, but it's unpopular because it makes every table access more difficult.
If you ignore most of what I said then it’s not.
> Is that multiple values? (Yes.) Or a wrapper type? (Also yes, it's a tuple, a collection of values.)
It’s neither type-safe nor unambiguous.
> This works
For low value of works. Which I guess is all you can ask for in a dynamically typed langage (eg Erlang uses this pattern to fairly good effect, but it has very good support for it).
But it should not be confused with an actual solution to the issue. It can make things less bad (again given good support for this pattern which Python does not have), not actually good.
How'd you get here? It's exactly as type-safe as the unwrapped hash table, which is admittedly not especially type-safe, and it's fully unambiguous.
> and it's fully unambiguous.
No, you have any random value flagged as not present (in fact you can forget about or ignore the flag entirely), and no clue if that means anything.
No, this isn't a solution. This is like terminating C strings with 0 characters, or terminating TSV fields with tabs. It sounds like it makes sense, until you ask "what if the terminator occurs within the data?"
Eventually someone's going to need to store undefined. It's no different than null. If it were a competence issue (which it isn't), we'd be saying that storing null in the hash table was a mistake, not something to work around with null2.
Checking for undefined is better known/easily discoverable but hasn't been the right way to do it since at least IE 5.5 / Firefox 1.
Negative indices are valid in python. Also str.index raises.
And product types are not a good way to signal exclusive conditions.
> Hacker News can be helpful, in spite of the tangents about shell redirection, authorship attribution, and GitHub monoculture.