

Ruby core classes aren't thread safe - bitshift
http://www.jstorimer.com/newsletter/ruby-core-classes-arent-thread-safe.html

======
tomp
It's not that Arrays are not thread safe; it's just that the code was written
in a non-thread-safe way.

Writing

    
    
      x[i] -= 1
    

actually means

    
    
      x[i] = x[i] - 1
    

So, there's a read, a subtraction, and a write, and they all happen
sequentially. Since they are not in a transaction or protected by a mutex,
nothing guarantees that other threads don't mutate `x[i]` in the meantime.

This has nothing to do with Ruby, and nothing to do with multicore, either.
Even on a single CPU core, threads might interleave and cause unexpected
behavior.
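
To make the race concrete, here is a minimal sketch of the same read-modify-write on a shared hash, with a Mutex making the whole update atomic (the `counts` hash and its key are made up for illustration):

```ruby
counts = { apple: 1000 }
lock = Mutex.new

threads = 10.times.map do
  Thread.new do
    100.times do
      # Without the mutex, another thread could run between the read
      # of counts[:apple] and the write back, losing decrements.
      lock.synchronize { counts[:apple] -= 1 }
    end
  end
end

threads.each(&:join)
puts counts[:apple]  # => 0; without the lock this can end up > 0 on a GIL-free VM
```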

~~~
niggler
That's not quite true. The language spec can mandate that the `-=` be atomic
(the x86 equivalent is mandating a `LOCK` prefix).

~~~
masklinn
Technically it'd have to mandate that `[]-=` be atomic; there's a load from
and a store to a k:v collection, not just from and to memory.

~~~
acdha
> there's a load from and a store to a k:v collection

That is itself a separate specification question: does simple array access
trigger full load/store semantics or is it simply returning a reference?

~~~
masklinn
Note that it's not a "simple array" here it's an associative one, a map. And
of course it's not just access, it's the retrieval of a reference _to an
immutable cell_ , you can't just alter it in place unless you are certain no
other reference to the cell exists in the system.

~~~
acdha
> Note that it's not a "simple array" here it's an associative one, a map. And
> of course it's not just access, it's the retrieval of a reference to an
> immutable cell, you can't just alter it in place unless you are certain no
> other reference to the cell exists in the system.

This still comes down to the question of whether your language prefers
references or values: a design decision in scope for the original question. A
language designer might quite reasonably choose to specify that `foo[i] -= 1`
means “Get a reference to the object at position i in container foo and tell
it to decrement”, in which case the container access is not a concurrency
issue except for writing/replacing the initial object. This approach can also
be implemented atomically on many architectures because it's a simple
arithmetic operation rather than retrieving and storing immutable objects.

It would also potentially be easier to scale if threads don't typically update
the same elements because you could perform locking at the cell level rather
than the entire container.
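
A sketch of what that cell-level locking could look like in Ruby; the `Cell` class below is hypothetical, invented purely for illustration, not a real API:

```ruby
# Hypothetical container slot that carries its own lock, so updates
# to different indices never contend with each other.
class Cell
  def initialize(value)
    @value = value
    @lock = Mutex.new
  end

  # Read, transform, and write back under this cell's lock only.
  def update
    @lock.synchronize { @value = yield(@value) }
  end

  def value
    @lock.synchronize { @value }
  end
end

foo = Array.new(4) { Cell.new(10) }

# Two threads per cell, each decrementing 100 times.
threads = 8.times.map do |i|
  Thread.new { 100.times { foo[i % 4].update { |v| v - 1 } } }
end
threads.each(&:join)

puts foo.map(&:value).inspect  # => [-190, -190, -190, -190]
```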

------
masklinn
Under the semantics described here, Java or C# core classes aren't "thread-
safe" either and I'd expect the _vast_ majority of standard libraries to
completely fail the test (potential winners: Clojure using an immutable
collection bound on an atom, as they have compare-and-swap semantics; and
probably Haskell somehow), the example code requires performing the following
actions atomically:

* Loading an instance-local collection

* Fetching a numeric value from the collection

* Incrementing or decrementing the numeric value

* Putting the incremented value back

Even if the collection is "synchronized" (each method call takes a collection-
local lock), because the value is altered outside the collection there's no
way for the change to be atomic unless it's wrapped in a transaction block or
protected by a lock. As far as I can think, the only ways for core classes to
"be thread-safe" considering the example (keeping mutable collections
semantics) would be to either have collections dedicated specifically to the
operation such as Java6's AtomicIntegerArray
([http://docs.oracle.com/javase/6/docs/api/java/util/concurren...](http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/atomic/AtomicIntegerArray.html)),
or to have _the collection_ apply the operation internally e.g.
Hash#apply(key, &operation) used roughly like this:

    
    
        def decrease item
          @stock.apply(item) {|from| from - 1}
        end
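
Ruby's Hash has no such `apply`; a minimal sketch of the idea, wrapping a plain Hash and a Mutex (the `AtomicHash` name and its interface are invented for illustration):

```ruby
# Hypothetical wrapper: the collection applies the update internally,
# so the read-transform-write happens under one lock.
class AtomicHash
  def initialize(hash = {})
    @hash = hash
    @lock = Mutex.new
  end

  def apply(key)
    @lock.synchronize { @hash[key] = yield(@hash[key]) }
  end

  def [](key)
    @lock.synchronize { @hash[key] }
  end
end

stock = AtomicHash.new("widget" => 4000)

threads = 400.times.map do
  Thread.new { 10.times { stock.apply("widget") { |n| n - 1 } } }
end
threads.each(&:join)

puts stock["widget"]  # => 0
```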

~~~
aardvark179
A better title might have been, "true concurrent execution may reveal problems
in your code that a GIL has hidden."

Having worked on a JVM port of a similar language and standard library, what
is most surprising is how much still works in real-world situations even when
fundamental parts are not thread-safe. Finding and fixing those problems
requires careful insertion of locks and some quite heavy multithreaded
testing, and that time may be better spent implementing new thread-safe
alternatives (or simply exposing the Java concurrent collections) which
encourage the programmer to write thread-safe code, and putting locks in at
the application level to fix concurrency issues.

~~~
masklinn
> A better title might have been, "true concurrent execution may reveal
> problems in your code that a GIL has hidden."

Indeed (and as many Python developers have discovered trying to run their code
on pypy, deterministic "garbage collection" schemes such as refcounting may do
the same with resource leaks)

------
jstorimer
Hi. OP here.

Thanks for the comments about atomicity vs. thread-safety. Absolutely on
point. The article started out demonstrating what happened with concurrent
Array mutation, but then I put in that += operation and didn't address it.
Sorry for not making the distinction. Atomicity is absolutely a different
issue than a thread-safe collection. I'm publishing something new tomorrow
that addresses this point.

To bring things back to code, the point I was originally trying to make is
that this code is not thread-safe.

    
    
      array = []
      threads = []
    
      10.times do
        threads << Thread.new do
          100.times { array.push(rand) }
        end
      end
    
      threads.each(&:join)
      # 10 threads each inserted 100 values, result should be 1000
      puts array.size
    

Specifically, too many Ruby programmers won't think twice about whether this
operation is thread-safe:

    
    
      array.push(item)
    

But there's no such guarantee. This is demonstrated nicely when this code
example is run on an implementation with no global lock; try it on JRuby.
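
One way to make the example hold on any implementation is to serialize the pushes with a Mutex; a quick sketch:

```ruby
array = []
lock = Mutex.new

threads = 10.times.map do
  Thread.new do
    # Array#push isn't guaranteed to be atomic, so take a lock around
    # it; this version also yields 1000 on JRuby or Rubinius.
    100.times { lock.synchronize { array.push(rand) } }
  end
end

threads.each(&:join)
puts array.size  # => 1000
```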

------
incidently
The article seems to be wrong in several respects ... First, the issue
described has nothing to do with arrays; the same problem happens when using a
plain number:

    
    
        class Inventory
          def initialize(nb)
            @nb_items = nb
          end
         
          def decrease
            @nb_items -= 1
          end
         
          def nb_items
            @nb_items
          end
        end 
    
        @inventory = Inventory.new(4000) 
    
    
        threads = Array.new
        400.times do
          threads << Thread.new do
            10.times do
              @inventory.decrease
            end
          end
        end
         
        threads.each(&:join)
        puts @inventory.nb_items
    

Second, the mutex in the OP's code synchronizes the whole block passed to a
thread, i.e. there's no parallelism at all (the second thread waits until the
first one finishes, and so on). It should rather be something like:

    
    
        class Inventory
          def initialize(nb)
            @nb_items = nb
    
            @lock = Mutex.new
          end
         
          def decrease
            @lock.synchronize do
              @nb_items -= 1
            end
          end
         
          def nb_items
            @nb_items
          end
        end 
    
        @inventory = Inventory.new(4000) 
    
        threads = Array.new
    
        400.times do
          threads << Thread.new do
            10.times do
              @inventory.decrease
            end
          end
        end
         
        threads.each(&:join)
        puts @inventory.nb_items

------
wuest
Nice write-up. It's easy to fall into the trap of assuming that things are
"thread-safe" when writing under the iron fist of a Global Interpreter Lock.

Ruby threads seem like a fairly narrow topic on which to base a book; I'll
look forward to seeing what all the book covers.

------
marshray
Let's ban this term "thread safe" and instead say what we really mean.

But first, let's figure out what we really mean.

------
JulianMorrison
There is one core library class that's thread safe, and it's Queue.

Otherwise, take a lock, or don't share data.
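
For the earlier producer-style example, Queue can stand in for the shared Array with no explicit locking; a quick sketch:

```ruby
queue = Queue.new  # Queue synchronizes push/pop internally

threads = 10.times.map do
  Thread.new { 100.times { queue.push(rand) } }
end
threads.each(&:join)

puts queue.size  # => 1000, with no explicit locking needed
```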

------
pieter
This still doesn't explain why the MRI implementation is accidentally
thread-safe. Why doesn't the interpreter switch threads after reading the
value from the hash but before storing the updated value?

~~~
sabat
Due to the Global Interpreter Lock (GIL), your whole script is wrapped in one
giant mutex. That means that you don't have code running in true parallel, so
the data is not corrupted as it is in JRuby and Rubinius (which both implement
real, true, parallel threads).

~~~
sabat
Not sure why someone chose to downvote this to 0, but here's how the OP put
it:

 _The global lock is a feature of MRI that basically wraps a big mutex around
all of your code. That's right, even if you're using multiple threads on a
multi-core CPU, if your code is running on MRI it will not run in parallel._

~~~
masklinn
OP is wrong: MRI will release the GIL (and yield to other threads) on IO and
after running for a bit (which can yield a convoy effect, actually lowering
performance). C extensions can also release the GIL when not manipulating VM
structures.

Technically the GIL is there to protect the VM's internal state, so its
operational semantics are that it's taken and released for each bytecode
instruction.

However the efficiency would stink (short of awesome automatic lock elision or
merging), so GIL-based VMs usually have a higher threshold, either based on
some sort of instructions count (which may or may not map 1:1 to bytecode
instructions) or an internal "wall clock" timer. I think MRI's the latter but
don't quote me on it.

~~~
jstorimer
Correct. Later on I say:

 _There are some very specific caveats that MRI makes for concurrent IO. If
you have one thread that's waiting on IO (ie. waiting for a response from the
DB or a web request), MRI will allow another thread to run in parallel._

------
exabrial
Does ruby have a spec? If not, I don't really see this as a problem...
synchronization, especially in a dynamic language that's already really slow,
would just add more overhead.

~~~
chc
It does, though it's the sort that's mainly derived from the canonical
implementation -- see RubySpec.

------
niggler
Does the spec mandate thread safety? (I guess I should ask if there's a spec
or is MRI the reference implementation)

~~~
jstorimer
AFAIK there is no spec. MRI is the reference implementation, but many things
are experimental or intentionally unspecified.

Given that MRI ships with a GIL, the only core classes that are intentionally
aware of multi-threading concerns are Mutex, ConditionVariable, and Queue.

~~~
aardvark179
A GIL does not mean classes should ignore concurrency concerns; it's still
possible to get odd behaviour from things like hash table implementations in
a GIL-based interpreter when inserting objects, as you may end up
thread-switching mid-operation.

What saves you most of the time is that it isn't worth switching threads too
often, so normally you get lucky.

