
Why Ruby’s Timeout is dangerous and Thread.raise is terrifying - jvns
http://jvns.ca/blog/2015/11/27/why-rubys-timeout-is-dangerous-and-thread-dot-raise-is-terrifying/
======
AznHisoka
Using Ruby's Timeout led to a bug that took me 2 weeks to debug. It was causing
data interleaving and corruption in Redis (i.e., you set a key's value to 1 but
it's set to 2 instead because a separate thread set another key to 2; that's
how dangerous it is).

The sad thing is there are few viable alternatives. I forget which one, but
open_timeout or read_timeout in the Net::HTTP library just uses Timeout as
well. So there's really no way to safely crawl a URL and have it time out if it
takes more than x seconds.

~~~
jonhohle
How is that different from any other race condition when writing multi-
threaded code?

~~~
AznHisoka
I was writing thread-safe code. But the fact I was using timeout led to the
race condition.

------
asveikau
Beware of anyone who says phrases like "kill a thread". The thread must have
some cooperation in such a thing; the right design is for it to politely be
asked to stop. Otherwise if it touches any shared objects, acquires any locks,
etc., "kill the thread" becomes more like "make my address space into a death
trap".
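
To make "politely asked to stop" concrete, here is a minimal Python sketch
using a threading.Event as the stop request. The names worker, run_briefly and
stop_requested are made up for illustration, and the lock just stands in for
whatever shared state a real thread would touch.

```python
import threading
import time


def worker(stop_requested: threading.Event, log: list) -> None:
    """Do small units of work until asked to stop, then clean up."""
    lock = threading.Lock()  # stands in for any shared resource
    while not stop_requested.is_set():
        with lock:  # released before every stop check, never left held
            log.append("tick")
        # wait() doubles as an interruptible sleep: it returns early once set
        stop_requested.wait(timeout=0.01)
    log.append("cleaned up")


def run_briefly(seconds: float = 0.05) -> list:
    """Start the worker, ask it to stop after `seconds`, and return its log."""
    log: list = []
    stop_requested = threading.Event()
    t = threading.Thread(target=worker, args=(stop_requested, log))
    t.start()
    time.sleep(seconds)
    stop_requested.set()  # a polite request, not a kill
    t.join()              # the worker decides when it is safe to exit
    return log
```

The worker only ever stops at the top of its loop, so the lock is never held
when it exits and the cleanup step always runs.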

------
sams99
This is the alternative:
[https://github.com/SamSaffron/message_bus/blob/master/lib/me...](https://github.com/SamSaffron/message_bus/blob/master/lib/message_bus.rb#L419-L445)
It's significantly less risky; Timeout and Thread#raise caused a huge amount of
pain at scale when I experimented with them in message_bus.

------
iamleppert
You generally should not be writing your watchdog logic in the same code doing
the actual work!! Use signals and other processes!
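
A rough Python sketch of that split, assuming the work can be run as a
separate command: the parent is the watchdog and the child does the work.
run_with_deadline is a made-up helper name; subprocess.run and its timeout
argument are the standard library's.

```python
import subprocess
import sys


def run_with_deadline(argv, seconds):
    """Run argv as a child process; return its stdout, or None on timeout."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child at this point
        return None
    return done.stdout


# Two toy "jobs": one finishes at once, one would sleep for 30 seconds.
fast = [sys.executable, "-c", "print('done')"]
slow = [sys.executable, "-c", "import time; time.sleep(30)"]
```

Killing the child cannot corrupt the parent's memory or leave the parent's
locks held, which is the whole point of putting the watchdog in a different
process.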

~~~
colanderman
Yes, this. Threads are not a resource to be monitored and managed; processes
are.

Similar wisdom hides in the pthread_join(3) man page:

    
    
        There is no pthreads analog of waitpid(-1, &status, 0), that is, "join
        with any terminated thread". If you believe you need this
        functionality, you probably need to rethink your application design.

------
pythrowa
Python's "daemon threads" are also pretty broken. It's just impossible to
implement safely on top of pthreads with Python's model of tearing down the
world (GC) at process exit — the world includes stack frames!

~~~
ptx
It was apparently fixed in Python 3.3:
[http://bugs.python.org/issue1856](http://bugs.python.org/issue1856)

Or are you saying daemon threads are still broken even with the fix, because
of the underlying pthreads stuff?

------
jrockway
I don't really see the danger here. You can also get a signal from the OS at
any time, and you have to be prepared to handle it. Do you retry your library
calls when they return with EINTR? (The answer appears to be no; I've seen
plenty of programs exit with "Interrupted system call". That's not an error.)

If you really want something to time out, there are two options. Build into
your library the ability to supply a timeout, and reliably time out; or run
the operation in a separate process and kill it when the deadline is exceeded.

Go has a nice "context" idiom for carrying around cancellation and deadline
information, so the chances are if you set a timeout on the context, it's
likely to be obeyed because nearly every call that performs IO accepts the
context and cleanly aborts when the deadline or cancellation signal arrives
(presumably propagating err back to you). Though this is not perfect; disk
writes do not appear to accept a context, which shows how careful you have to
be when designing something to time out. ("Disk" writes can easily be network
RPCs; consider NFS or FUSE.)
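
This is not Go's context package, just a small Python sketch of the same idea:
carry a deadline down the call chain and only give up at points the code
chooses. Deadline and fetch_all are hypothetical names invented for the
example.

```python
import time


class Deadline:
    """Hypothetical helper: an absolute deadline passed down the call chain."""

    def __init__(self, seconds: float):
        self._expires_at = time.monotonic() + seconds

    def remaining(self) -> float:
        """Seconds left; zero or negative once the deadline has passed."""
        return self._expires_at - time.monotonic()

    def check(self) -> None:
        """Raise TimeoutError if the deadline has passed."""
        if self.remaining() <= 0:
            raise TimeoutError("deadline exceeded")


def fetch_all(items, deadline: Deadline, step=lambda item: item):
    """Process items one by one, giving up only between steps."""
    results = []
    for item in items:
        deadline.check()  # the only place this function can time out
        results.append(step(item))
    return results
```

A step that does real IO would also need to pass deadline.remaining() on as
its own timeout (for a socket, via settimeout), otherwise a single slow step
can still overrun the deadline, which is the gap described above for disk
writes.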

~~~
JoachimSchipper
Yes, EINTR handling in many scripting languages - I'm most familiar with
Python - is horrifically broken. It appears that
[https://www.python.org/dev/peps/pep-0475/#implementation](https://www.python.org/dev/peps/pep-0475/#implementation)
is intended to solve this issue... for Python 3.5.
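
For reference, the retry-on-EINTR pattern being discussed looks roughly like
this when written by hand in Python. retry_on_eintr is a made-up helper name;
InterruptedError is the built-in exception Python raises for EINTR.

```python
def retry_on_eintr(call, *args, **kwargs):
    """Call `call` again whenever it is interrupted by a signal (EINTR)."""
    while True:
        try:
            return call(*args, **kwargs)
        except InterruptedError:
            # The call did not fail; a signal handler just ran. Try again.
            continue
```

A wrapper like this has to go around every blocking call, which is why doing
it once inside the interpreter is the more reliable fix.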

------
ptx
"Nobody writes code to defend against an exception being raised on literally
any line" says the article, but isn't that why we have e.g. Python's context
managers and Java's try-with-resources? In Python, KeyboardInterrupt can
already be raised anywhere and any object could have any method unexpectedly
replaced with one that starts throwing exceptions.

But it's a good point that it creates a greater risk of raising an exception
in the cleanup part of the context manager or try/finally, where one might
normally try to avoid it by being extra careful:

    
    
      import time

      try:
          try:
              time.sleep(10)  # or some interesting function
          except KeyboardInterrupt:
              print("Unexpectedly aborted, but that's fine.")
          finally:
              print("Cleaning up!")
              time.sleep(10)  # or some important cleanup
      except KeyboardInterrupt:
          print("Abort was aborted! Now things are all screwed up!")
    

So one should probably be careful not to press ^C more than once when aborting
a Python program, I guess.

------
scotty79
> java.lang.Thread.stop, which does essentially the same thing. It was
> deprecated in Java 1.2, in 1998,

> Java has a Thread.interrupt method, which sends InterruptedException to a
> thread. But an InterruptedException is only allowed to be thrown at specific
> times, for instance during Thread.sleep. Otherwise the thread needs to
> explicitly call Thread.interrupted() to see if it’s supposed to stop.

I remember when I read about it back then when I played a bit with Java.

I felt a bit baffled that they would make something as stupid as `stop()` ...
after all, Delphi, which I had learned some years before, already had this
solution: you can only notify the thread that it should stop, through the
Terminated flag on the thread object, and you have to write the thread code to
end voluntarily when the flag is set.

------
meneses
Forget the implementation, I have never been comfortable with the concept of
Timeout itself.

------
colanderman
I think that – just like how they say every sufficiently complex program is
doomed to reimplement Lisp – every insufficiently well thought-out programming
language is doomed to reimplement INTERCAL.

(Seriously; Thread.raise is just COMEFROM with a non-deterministic line
number.)

------
Maarten88
The proper way .NET offers for this is not Thread.Abort but passing a
CancellationToken into the thread, making the timeout explicit to the
executing thread. All async functions in the framework have an overload
accepting a CancellationToken.
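
Python has no CancellationToken, so the following is not the .NET API. It is a
loosely analogous sketch using asyncio, where a timeout turns into a
cancellation that can only arrive at an await the coroutine itself wrote. The
names worker and main are made up for the example.

```python
import asyncio


async def worker(log):
    """Loop forever, but let cancellation in at each await."""
    try:
        while True:
            log.append("working")
            await asyncio.sleep(0.01)  # cancellation can only land at an await
    except asyncio.CancelledError:
        log.append("cancelled cleanly")  # cleanup runs at a known point
        raise


async def main():
    """Run the worker under a timeout and return what it logged."""
    log = []
    try:
        await asyncio.wait_for(worker(log), timeout=0.05)
    except asyncio.TimeoutError:
        log.append("timed out")
    return log
```

Running asyncio.run(main()) returns the log; the worker is never interrupted
between the append and the await, only at the await itself.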

------
adevine
Java deprecated Thread.stop long before Java 6. IIRC it was Java 1.1 or 1.2.

------
nimish
Haskell has asynchronous exception handling for this reason, but it feels like
the equivalent of signal handling, which is notoriously subtle.

