

Quantopian’s algorithmic trading platform now accepts outside data sets - jbredeche
http://pandodaily.com/2013/04/02/want-to-take-on-wall-street-quantopians-algorithmic-trading-platform-now-accepts-outside-data-sets/

======
rglullis
Just slightly off-topic, but the quant.ly forum is so dead, and I think the
Quantopian guys might chime in here.

I started taking the Finance Engineering class on Coursera, and one of the
things that did not help me was their reliance on Excel as a way to solve most
of the exercises in the problem sets. They even have part of the lectures
dedicated to "how to use Excel for X". For solving the problem sets, they
mention Matlab, but that's about it. Suffice it to say, with my refusal to buy
an Excel license and LibreOffice not having the requested features, I now
find myself having to learn the theory and write a bunch of small pieces of
ad-hoc code to re-implement the formulas. And no, I could not find an
easy-to-use constraint solver in either Python or Octave.

So my task of "learn about Mean Variance Optimization" also included "learn
and re-implement a constraint solver in NumPy", and the first lesson I'm
actually taking away from this class is "forget about any and all decent
tooling if you want to work in Finance."

There was another field of work where I experienced something like that, which
is bioinformatics. Not to that degree, though. There is still way too much
spreadsheet-emailing going on for my taste, but it was recognized as far from
ideal. So at the lab there was a big push to adopt Galaxy as part of
everyone's workflow. I no longer work over there, but I'm sure that the
ad-hoc, sloppy pieces of R and Python code are being replaced by better
integrated and more efficient Galaxy tools.

Going back to Quantopian... am I too far off to wish that it could become the
"Galaxy for Quants _and finance students_ "? Am I too much of a finance n00b
to think that it makes sense to have something like Galaxy, but with tools
focused on quantitative analysis? Is there a market in developing only these
tools, instead of focusing on trading itself?

Or maybe this is _exactly_ what these guys want to do, but it's just that they
are too early in the development to tell? If so, I'm not exactly looking for
work, but if you need a seasoned Python developer with experience in Galaxy
and an interest in finance, we should talk...

~~~
kyzyl
1\. Constrained optimization ("solving") is a very well studied, deep field. I
assure you that optimization tools exist in dozens of languages, not just
Excel. Perhaps you were searching the wrong corner of Google? What are you
looking for specifically?

2\. The folks doing portfolio optimization are generally finance geeks first,
and math geeks second (if at all). Like you, they do not want to implement
everything themselves, and thus are born libraries like QuantLib[1], which has
interfaces to "C#, Objective Caml, Java, Perl, Python, GNU R, Ruby, and
Scheme" with other bindings on the way or hacked by someone already. This is
not me endorsing QuantLib, but it is certainly a common choice.

3\. There is a market for finance tooling, and there are many smart people
rolling around in it as we speak. For example, take a look at rapidquant[2]
from (formerly?) LambdaFoundry.

4\. It can be tough to find a tool that suits you out of the box, but they do
exist. My personal preference is Python, but at the moment R and C++ win in
terms of financial library support. (hint: rpy2)

[1] <http://quantlib.org/index.shtml> [2] <http://www.rapidquant.com/>

~~~
rglullis
Thanks for your response. I think the key word about the constraint solver
part is "easy to use". I did find QuantLib, but in my specific case I was just
looking for an immediate substitute for Excel. Learning how to do minimally
productive work with QuantLib seemed like a task that would take longer than
my budgeted time of 1.5 hours/day for the Coursera class.

I do plan on spending some more time setting up an environment with QuantLib
and IPython notebook. But when I was talking about tooling, perhaps I misused
the term. I was thinking of finer-grained things. For instance, there is no
"tool" that would just take a table of assets and the covariance matrix and
give you the portfolio optimization. In a system like Galaxy, you can develop
a "tool" that does _exactly that_ and make that available to researchers. They
can then incorporate it into their workflow, get the output and put it in the
workflow pipeline for other "tools", etc, etc. I haven't seen anything like
that for finance.
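
For reference, a minimal sketch of what such a "tool" could look like in Python, assuming SciPy is available. The function name `min_variance_weights`, the long-only bounds, and the three-asset inputs are all made up for illustration; this is one common formulation of mean-variance optimization, not anything Quantopian or Galaxy ships.

```python
import numpy as np
from scipy.optimize import minimize


def min_variance_weights(mu, cov, target_return):
    """Long-only weights that minimize portfolio variance for a target return.

    Hypothetical helper: mu is a vector of expected returns, cov the
    covariance matrix of the same assets.
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = len(mu)

    def variance(w):
        return w @ cov @ w

    constraints = [
        # Weights sum to one (fully invested).
        {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
        # Expected portfolio return hits the target.
        {"type": "eq", "fun": lambda w: w @ mu - target_return},
    ]
    bounds = [(0.0, 1.0)] * n          # assumption: no short selling
    w0 = np.full(n, 1.0 / n)           # start from equal weights
    result = minimize(variance, w0, method="SLSQP",
                      bounds=bounds, constraints=constraints)
    if not result.success:
        raise ValueError(result.message)
    return result.x


# Made-up inputs for three hypothetical assets.
mu = [0.08, 0.12, 0.10]
cov = [[0.040, 0.006, 0.004],
       [0.006, 0.090, 0.010],
       [0.004, 0.010, 0.060]]
weights = min_variance_weights(mu, cov, target_return=0.10)
print(weights)
```

Sweeping `target_return` over a range of values and collecting the resulting variances would trace out an efficient frontier, which is roughly what the spreadsheet solver is being used for in this kind of exercise.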

~~~
kyzyl
Yeah, there is some startup time required to get to know the tools. In my
experience it's worth it, though. I've seen some absolutely mind-boggling
spreadsheets trying to do what is a couple of lines of work with numpy/pandas.
Spreadsheets are dead simple, but get unwieldy very quickly as your data &
problem become more complex.

I suppose you're right, Quantopian could turn into a Galaxy-like tool, but I
think there is a great deal of experimentation that goes on in finance that
doesn't fit into the pipeline model very well. As well, a lot of the fitting
routines run in finance are quite computationally intensive and wouldn't
really work on a framework like Quantopian. For example, good luck hooking
your risk model into a recurrent neural network on Quantopian. That isn't to
say that bioinf. is not computationally bound; of course it is. But a lot of
what's done is more oriented to data pipelines as opposed to fancy
computations (correct me if I'm wrong), which is why a thing like Galaxy fits
so well.

~~~
rglullis
Oh, genomics can be very demanding in its computing power requirements. We
had a few applications with the running time reported in days/weeks.

But still... with all the talk about cloud computing and PaaS, this is the
only real use-case where I see a benefit in tying a platform to something
like EC2. In the case of Galaxy, the developers provided an AMI that you could
deploy and use with your own account. For the Quantopian case, one of their
services could be exactly the managing and scaling of these tools.
So, the neural network would be a tool that you'd have access to, and you
would define how much power should be allocated when running it.

------
kyzyl
Good work! This is an important step. For some people, not being able to use
their CRSP data (or whatever) is simply a deal-breaker. As someone who once
implemented most of the bits that make up the Quantopian functionality, I can
certainly appreciate the effort you folks are putting in.

I believe it's been mentioned a few times now, but I have to agree that risk
metrics should be pretty high on your priority list. Sharpe/Sortino/Omega/etc.
are great to have, but you really need to be able to build and experiment with
variance models. As in all science, a number doesn't mean much without an
error estimate :-) GARCH models are pretty popular these days, and they are
fairly non-trivial to implement correctly (especially in their vectorized
form), particularly for people new to the field.
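
To give a feel for what is involved, here is a rough sketch of the GARCH(1,1) conditional variance recursion and its Gaussian negative log-likelihood in NumPy. It assumes zero-mean returns and initializes the variance with the sample variance, both simplifying choices; the function names, parameter values and simulated returns are made up for illustration.

```python
import numpy as np


def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion:

    sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]
    """
    r = np.asarray(returns, dtype=float)
    sigma2 = np.empty(len(r))
    sigma2[0] = r.var()  # assumption: start from the sample variance
    for t in range(1, len(r)):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2


def garch11_neg_loglik(params, returns):
    """Gaussian negative log-likelihood, the quantity a fit would minimize."""
    omega, alpha, beta = params
    r = np.asarray(returns, dtype=float)
    sigma2 = garch11_variance(r, omega, alpha, beta)
    return 0.5 * np.sum(np.log(2.0 * np.pi * sigma2) + r ** 2 / sigma2)


# Simulated daily returns, purely as placeholder data.
rng = np.random.default_rng(0)
returns = 0.01 * rng.standard_normal(500)

sigma2 = garch11_variance(returns, omega=1e-6, alpha=0.05, beta=0.90)
nll = garch11_neg_loglik((1e-6, 0.05, 0.90), returns)
print(sigma2[:3], nll)
```

The recursion itself is short. Much of the difficulty is in the fitting step, which has to minimize the likelihood while keeping omega positive, alpha and beta non-negative, and alpha + beta below one.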

Now for the most part this stuff _can_ be implemented by the users in their
algos, but it's often a non-starter when it comes to the more complex stuff
like vector autoregressive processes, bootstrap analysis or sane fitting
routines.
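
As an illustration of the "error estimate" point, the sketch below puts a bootstrap standard error and percentile interval around a Sharpe ratio. It assumes a zero risk-free rate, 252 trading periods per year, and independent resampling of daily returns (a block bootstrap would be more appropriate when returns are serially dependent); the function names and the simulated data are hypothetical.

```python
import numpy as np


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    r = np.asarray(returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)


def bootstrap_sharpe(returns, n_boot=2000, periods_per_year=252, seed=0):
    """Standard error and 95% percentile interval via an i.i.d. bootstrap."""
    r = np.asarray(returns, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row is one resample (with replacement) of the return series.
    idx = rng.integers(0, len(r), size=(n_boot, len(r)))
    samples = r[idx]
    stats = (np.sqrt(periods_per_year)
             * samples.mean(axis=1) / samples.std(axis=1, ddof=1))
    lower, upper = np.percentile(stats, [2.5, 97.5])
    return stats.std(ddof=1), (lower, upper)


# Simulated daily returns, purely as placeholder data.
rng = np.random.default_rng(1)
returns = 0.0005 + 0.01 * rng.standard_normal(750)

point = sharpe_ratio(returns)
std_err, (lower, upper) = bootstrap_sharpe(returns)
print(point, std_err, lower, upper)
```

With only a few years of daily data the interval tends to be wide, which is the point being made above about a single number not meaning much on its own.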

------
fawce
Here's an example of Fetcher in action:
<https://www.quantopian.com/posts/new-feature-fetcher>

~~~
wicknicks
Some additional details about Fetcher: <https://www.quantopian.com/help> (You
will need to scroll down to the Fetcher section -- couldn't find anchor links)

~~~
fawce
Here you are: <https://www.quantopian.com/help#overview-fetcher>

------
spitfire
Cool!

Now add risk management. It's not sexy, but it is table stakes.

