
Julia vs. Python: Monte Carlo Simulations of Bitcoin Options - sebg
http://rawrjustin.github.io/blog/2014/03/18/julia-vs-python-monte-carlo-simulations-of-bitcoin-options
======
jwmerrill
Here's a gist that optimizes the author's Julia code to give a 4x speedup over
the unvectorized code, and handily beat the vectorized codes too:

[https://gist.github.com/jwmerrill/9715447](https://gist.github.com/jwmerrill/9715447)

In general, carefully written devectorized Julia will be faster than
equivalent vectorized code because it is possible to avoid allocating
containers at intermediate stages of the computation. This is surprising for
people coming from e.g. Matlab, R, or Numpy, because vectorized code is often
faster in those environments. In those languages, direct scalar code is slow,
and vectorization is about expressing the computation in a form that will call
out to fast C operations, but the C still pays a price to allocate
intermediates. In Julia, the JIT is able to emit fast machine code for scalar
operations, so you don't need to express your code in a form that calls out to
C. There are JIT compilers for Matlab and Python too, but Julia's JIT is very
high quality and everyone uses it all the time.

The main changes I made here were hoisting computations of constants out of
the loop, and preallocating a 2D Float64 array to hold the results. These were
worth roughly 2x each. I've chosen the storage format so that it is accessed
in column order. Both optimizations have their own sections in the Julia
Manual's performance tips section:

[http://julia.readthedocs.org/en/latest/manual/performance-
ti...](http://julia.readthedocs.org/en/latest/manual/performance-tips/)

~~~
shoo
your point at the end is good - if we're only interested in the final price,
then we should be able to do this in constant memory, independent of the
number of timesteps and samples.

edit: on my machine the running time seems dominated by the generation of
random numbers, rather than anything else.

perhaps another optimisation is to elimate the `exp` from the inner loop, and
just do addition. if you're only interested in the final price, you can do one
`exp` at the end.

i.e. change

    
    
        paths[1, i] = st = s0
        for j in 2:M
            st *= exp(a + b*randn())
            paths[j, i] = st
        end
    

to this

    
    
        paths[1, i] = log_st = log(s0)
        for j in 2:M
            log_st += (a + b*randn())
            paths[j, i] = log_st
        end

~~~
Retric
One solution to this is use a circular data structure filled with random
numbers. Then dedicate a thread to updating that ring with new random numbers.
Meanwhile you pull random numbers as fast as you want. It's not good for
encryption, but for simulation it's fine.
[http://en.wikipedia.org/wiki/Circular_buffer](http://en.wikipedia.org/wiki/Circular_buffer)

Another approach is to keep reseeding a fast RNG with a slow but higher
quality RNG.

~~~
StefanKarpinski
You don't have to do anything quite that complicated. You can just pre-
generate an array of Gaussian random samples and do table lookup in the inner
loop:

    
    
        paths = zeros(Float64, M, I)
        rands = zeros(Float64, M-1, 1)
    
        for i in 1:I
            randn!(rands)
            paths[1, i] = st = s0
            for j in 2:M
                st *= exp(a + b*rands[j-1])
                paths[j, i] = st
            end
        end
    

That shaves off another 35% percent of runtime on my machine.

~~~
jwmerrill
Nice! In case it isn't obvious, randn! takes an existing vector and fills it
up with new random normal variates. This allows computing a bunch of random
variates at once, which is a little more efficient, without having to allocate
a new vector every time through the loop.

Another option would be to generate all M-1xI random numbers ahead of time,
but that would be less memory efficient.

~~~
StefanKarpinski
I tried that too, but it wasn't any faster – you've already gotten most of the
gains to be had by lifting the random number generation out of the innermost
loop. So I went with the option that uses less memory.

------
pavanky
We have a matrix library in C++ that works on top of CUDA and OpenCL.

I recently added a monte carlo based option pricing example. which can be seen
here:
[https://github.com/arrayfire/arrayfire_examples/blob/master/...](https://github.com/arrayfire/arrayfire_examples/blob/master/financial/monte_carlo_options.cpp)

I am bringing this up because all the different paths can be calculated in
parallel. Usually without a matrix library in C++, you will end up having to
write low level CUDA / OpenCL / SSE code to get the performance.

Here is what an intel i7 CPU does using our library + OpenCL:

\----

[1]: Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz, 7875 MB, OpenCL Version: 1.2
Compute Device: [1]

Time for 100000 paths - vanilla method: 65.815 ms, barrier method: 68.271 ms

\----

BTW, we are working on porting the library to other languages as well. Right
now, it is supported for R, Java and Fortran.

~~~
noname123
Thanks pavansky for sharing. Can you tell me what is the performance and model
accuracy trade off between Monte-Carlo option pricing vs. BSM vs. Binomial vs.
Heston.

Currently I use BSM; however, live performance is poor in extracting implied
volatility from NBBO of option spreads as I use a naive approach to iterate
and converge on the IV. I assume that's what people use CUDA and GPU for to
calculate the greeks and pricing of the whole US option chain series in
realtime.

I'm curious if using CUDA or GPU rented from aws would make performance
faster, in a) extracting IV from market ticks, b) figuring out break-even and
hedging/adjustment points in different complex option spreads, c) calculating
expected payoff in term's of Kelly's for a spread or potential adjustment.

Gl with your endeavor.

~~~
S4M
> Can you tell me what is the performance and model accuracy trade off between
> Monte-Carlo option pricing vs. BSM vs. Binomial vs. Heston.

I don't work in finance anymore, but Heston is _not_ a pricing method per say.
In the linked article, the OP assumes that the price of a bitcoin follows a
lognormal distribution (i.e: dS_t = mu _dt + sigma_ dW_t), while Heston
assumes a different diffusion process (dS_t = mu _dt + digma_t_ dW_t, and
d(sigma_t) = kappa(thau - sigma_t) _dt+ zeta_ sqrt(sigma_t) dW'_t) [0].

I would say that Monte Carlo is in general slower and less accurate than the
other methods you mentionned, but more flexible in the sense that it allows
more complicated diffusion models or payoffs (e.g: option on the max or min of
a basket of underlyings).

[0]:
[https://en.wikipedia.org/wiki/Heston_model](https://en.wikipedia.org/wiki/Heston_model)

------
ajtulloch
I wrote [http://tullo.ch/articles/python-vs-
julia/](http://tullo.ch/articles/python-vs-julia/) a few weeks ago, which
compared Cythonized implementations of particular ML algorithms in Python
against Julia implementations.

There were some useful discussions on HN at
[https://news.ycombinator.com/item?id=7383121](https://news.ycombinator.com/item?id=7383121)

------
tanvach
I have written something similar comparing Julia, Python, Cython, Matlab and
C++ for European options Monte Carlo simulation.

[http://pithawat.com/post/monte-carlo-
simulation](http://pithawat.com/post/monte-carlo-simulation)

[http://pithawat.com/post/monte-carlo-in-
cython](http://pithawat.com/post/monte-carlo-in-cython)

[http://pithawat.com/post/first-brush-with-
julia](http://pithawat.com/post/first-brush-with-julia)

------
phpnode
FAO amurmann - you made a slight faux pas in HN etiquette[0] and were
sentenced to over 1000 days of purgatory without your knowledge. Almost no one
can see your posts and no one at all can reply to them. For literally years.
Sorry about that!

[0]
[https://news.ycombinator.com/item?id=2385595](https://news.ycombinator.com/item?id=2385595)

------
spikels
Anyone know why the two Bitcoin Call Option Valuations get such different
results: Python 361.203 and Julia 374.496? Monte Carlo will always some error
but that's a pretty big difference. Can't confirm which more correct right
now.

~~~
sirsar
This bothered me as well. That's a 3.6% error, which is far from negligible.
Was the main problem too few trials?

~~~
rawrjustin
the problem seems to be too few trials. 10x the number of paths and the
calculations for both languages converge on around 361ish

------
lolball
FWIW, there exists a package for calling Pandas from Julia with almost no
overhead. It lets you set and get elements of Pandas timeseries as fast as
native Julia arrays:
[https://github.com/malmaud/Pandas.jl](https://github.com/malmaud/Pandas.jl)
(I'm the author).

Thus Julia can be seen as a tool that leverages Pandas, not as a rival. In
fact, most Python libraries are very easy to wrap in Julia because of the
PyCall package
([https://github.com/stevengj/PyCall.jl](https://github.com/stevengj/PyCall.jl)).

------
AlwaysBCoding
I've been using R a bunch lately, but would love to use either Python or Julia
instead.

Does either Python or Julia have an IDE as good as RStudio? Where it's really
easy to write code and get visualizations side by side?

~~~
philsheard
I've not personally used RStudio but the IPython Notebook is great for
interactive code with inline visualisations. I believe there's also a Julia
kernel for IPython too so you could have either there.

I'd recommend Python as it has a lot of feature parity with R these days
(especially with the Pandas library for fast dataframe operations).

~~~
Myrmornis
I'm a daily python user, and mostly ex-R user. Agree data-frames in pandas
help. But python really doesn't have much feature parity with R when it comes
to statistical data analysis.

~~~
aeroevan
As another (mostly) ex-R user I have been trying to move to statsmodels,
scikit-learn, etc. with pandas.

I still fire up R/RStudio with ggplot2 for more complicated plots though...

------
jmpeax
I'd like to see the non-vectorized Python version with a @jit annotation.
(That is, from the Numba package:
[http://numba.pydata.org/](http://numba.pydata.org/))

------
apendleton
I'd suggest trying the non-vectorized version on PyPy. This code seems very
JIT-able, and CPython is nowhere near the state of the art for performance
with this kind of code.

~~~
emidln
On my rMBP, I did a quick back of the envelope comparison using PyPy 2.2.1 vs
CPython 2.7.6 on OS X Mavericks and came up with (best of 3 runs):

    
    
        $ time pypy testcode.py
    
        real    0m1.529s
        user    0m1.402s
        sys     0m0.121s
    
        $ time python testcode.py                                                                                            
    
        real    0m13.880s
        user    0m13.635s
        sys     0m0.241s
    
        $ time python testcode_vectorized.py 
    
        real    0m0.746s
        user    0m0.551s
        sys     0m0.139s

------
phyalow
Hmmm, practically speaking pricing European Options using simulations is not
recommended when you can use a beautiful close formed solution (along the
lines of a modified Garman–Kohlhagen model)! But for other styles american,
quantos, barriers etc this lattice simulation is great. Curious as to why the
author only focused on Europeans!

~~~
shoo
i would also find it very interesting to see distinct mathematical approaches
for solving these kinds of problems compared.

are finite-element models or some other approximation of the probability
distribution used in practice as an alternative to simulation when closed-form
solutions are not available?

~~~
phyalow
Hi, apologies for the slow reply. Yes predominately Monte-carlo simulations
are used. If you are interested in the field an easy entry point is Paul
Wilmotts "Introduction to Quantitative Finance".

------
wheaties
And I expect to see large improvements in Pandas as it is an active and
vibrant project. This will be hard to replicate in Julia, not for lack of
trying by the community, but rather due to the head start of the Python
community.

To be honest, if Julia wanted to really captivate an audience it would need to
come out with a package or library which did something needed by the community
in a way that Python didn't. Then you could conceivably see larger adoption
rates.

~~~
andrewcooke
Wut? Python is going to get better because it has a head start? What's the
logic in that? Surely the incumbent is less likely to improve than the
challenger because it's already been polished. IOW it's usually the new guy
that has room to improve.

~~~
wheaties
I never said Python is going to be better. I merely stated that with a head
start in terms of functionality you get from Pandas that Python will have the
upper hand.

Also, if history is any guide, the "new" and the "hot" tend not to beat an
incumbent just because they're new and hot. Having the most room to improve is
not a blessing. You have to improve at a rate that is faster and greater than
the incumbent can improve. And if they can't improve at a faster rate,
libraries and ecosystems win by offering something compelling that is either
hard or impossible to replicate in the incumbent.

~~~
StefanKarpinski
Your argument assumes that improvement is equally easy in both systems, which
is not the case. The Python ecosystem has a substantial head start, but Julia
makes library development much easier – I would hazzard a 10x factor, but it's
really hard to quantify. The result is that developer effort is amplified
significantly, making it plausible for the newer system to "catch up" faster
than you'd expect.

I don't really like to look at it so competitively, however. Open source is
notoriously not a zero-sum game – everyone wins when there are more good
options, whether they are in Python or Julia.

------
tantalor
Is this a performance-sensitive exercise? Is the naive python implementation
good enough? How frequently would you need this, and under what kind of load?

I take it the point is that Julia is _generally_ faster without the
optimization step, but the article seemed so heavily focused on this
particular problem that I have to question whether it matters in this case.

~~~
jwmerrill
Monte Carlo simulation in general is definitely performance sensitive. If
you're averaging over realizations of a process, the accuracy of your estimate
typically depends on the square root of the number samples you take. For a
given time budget, every factor s improvement you make to the speed of the
calculation earns you sqrt(s) more accuracy.

In a high frequency trading situation, if you can estimate a "correct" price
faster than your competitor can, then you get to make the trade, and they
don't. Doesn't get much more performance sensitive than that.

------
amurmann
I just started learning R and am still in the phase of learning a new language
where I feel insecure that I picked the tight language to learn. Can anyone
say anything about how R relates to the examples we have seen here for Python
and Julia? Also Julia seems yo be the new hotness in this space, is that
right? Should I skip R and learn Julia instead?

------
edwinksl
Impressive.

------
kartman
For practice, you can test out your bitcoin option pricing skills in real life
at our site, feedback appreciated :)
[https://www.cointures.com](https://www.cointures.com)

