Quantopian’s algorithmic trading platform now accepts outside data sets (pandodaily.com)
42 points by jbredeche on April 2, 2013 | 12 comments


Good work! This is an important step. For some people, not being able to use their CRSP data (or whatever) is simply a deal-breaker. As someone who once implemented most of the bits that make up the Quantopian functionality, I can certainly appreciate the effort you folks are putting in.

I believe it's been mentioned a few times now, but I have to agree that risk metrics should be pretty high on your priority list. Sharpe/Sortino/Omega/etc. are great to have, but you really need to be able to build and experiment with variance models. As in all science, a number doesn't mean much without an error estimate :-) GARCH models are pretty popular these days, and they are fairly non-trivial to implement correctly (especially in their vectorized form), particularly for people new to the field.
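For a sense of what the simplest such variance model looks like, here is a minimal sketch of the GARCH(1,1) conditional variance recursion in plain Python. The function name and the parameter values in the usage note are illustrative assumptions; the parameters are taken as given here, whereas fitting them (typically by maximum likelihood) is the genuinely non-trivial part.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance path of a GARCH(1,1) model with *given* parameters.

    sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]
    Hypothetical helper for illustration; it does not estimate the parameters.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    # Initialise at the unconditional (long-run) variance.
    current = omega / (1.0 - alpha - beta)
    sigma2 = []
    for r in returns:
        sigma2.append(current)  # variance in force for this period
        current = omega + alpha * r * r + beta * current  # update for the next period
    return sigma2
```

Called as `garch11_variance([0.01, -0.02, 0.015], omega=1e-6, alpha=0.1, beta=0.8)`, it returns one variance per return, starting from the long-run level `omega / (1 - alpha - beta)`.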

Now for the most part this stuff _can_ be implemented by the users in their algos, but it's often a non-starter when it comes to the more complex stuff like vector autoregressive processes, bootstrap analysis or sane fitting routines.
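As an illustration of the kind of error estimate meant here, this is a rough sketch of a bootstrap standard error for the Sharpe ratio using only the standard library. The function names are hypothetical, the risk-free rate is assumed to be zero, and the plain i.i.d. resampling ignores autocorrelation in returns, so a block bootstrap would be the more careful choice for real data.

```python
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio, assuming a zero risk-free rate."""
    return statistics.mean(returns) / statistics.stdev(returns)


def bootstrap_sharpe_se(returns, n_resamples=1000, seed=0):
    """Standard error of the Sharpe ratio from an i.i.d. bootstrap."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample the return series with replacement, same length as the original.
        sample = rng.choices(returns, k=len(returns))
        estimates.append(sharpe(sample))
    # The spread of the resampled Sharpe ratios approximates the sampling error.
    return statistics.stdev(estimates)
```

Reporting `sharpe(returns)` together with `bootstrap_sharpe_se(returns)` gives the number and a rough error bar side by side.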


Just slightly off-topic, but the quant.ly forum is so dead, and I think the Quantopian guys might chime in here.

I started taking the Finance Engineering class on Coursera, and one of the things that did not help me was their reliance on Excel as a way to solve most of the exercises in the problem sets. They even have part of the lectures dedicated to "how to use Excel for X". For solving the problem sets, they mention Matlab, but that's about it. Suffice it to say, with my refusal to buy an Excel license and LibreOffice lacking the required features, I found myself having to learn the theory and write a bunch of small pieces of ad-hoc code to re-implement the formulas. And no, I could not find an easy-to-use constraint solver in either Python or Octave.

So my task of "learn about Mean Variance Optimization" also included "learn and re-implement a constraint solver in NumPy", and the first lesson I'm actually taking away from this class is "forget about any and all decent tooling if you want to work in Finance."
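For the simplest version of that problem (minimize portfolio variance subject to full investment and a target expected return, with short selling allowed), no general solver is needed: the equality constraints turn it into one linear system. The following is a sketch under those assumptions, with a hypothetical function name; long-only or other inequality constraints would require a proper quadratic programming solver.

```python
import numpy as np


def min_variance_weights(mu, cov, target_return):
    """Minimum-variance weights with sum(w) = 1 and mu @ w = target_return.

    Solves the KKT system of the equality-constrained quadratic program.
    Short positions are allowed; mu must not be constant across assets,
    otherwise the two constraints are linearly dependent.
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = mu.size
    A = np.vstack([np.ones(n), mu])        # constraint matrix, shape (2, n)
    b = np.array([1.0, target_return])     # constraint right-hand side
    # KKT system: [[cov, A.T], [A, 0]] @ [w, multipliers] = [0, b]
    kkt = np.block([[cov, A.T], [A, np.zeros((2, 2))]])
    rhs = np.concatenate([np.zeros(n), b])
    solution = np.linalg.solve(kkt, rhs)
    return solution[:n]                    # drop the Lagrange multipliers
```

Sweeping `target_return` over a range of values and recording `w @ cov @ w` for each result traces out the unconstrained efficient frontier.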

There was another field of work where I experienced something like that, which is bioinformatics. Not to that degree, though. There is still way too much spreadsheet-emailing going on for my taste, but it was recognized as far from ideal. So at the lab there was a big push to adopt Galaxy as part of everyone's workflow. I no longer work over there, but I'm sure that the ad-hoc, sloppy pieces of R and Python code are being replaced by better integrated and more efficient Galaxy tools.

Going back to Quantopian... am I too far off to wish that it could become the "Galaxy for Quants and finance students"? Am I too much of a finance n00b to think that it makes sense to have something like Galaxy, but with tools focused on quantitative analysis? Is there a market in developing only these tools, instead of focusing on trading itself?

Or maybe this is exactly what these guys want to do, but it's just that they are too early in the development to tell? If so, I'm not exactly looking for work, but if you need a seasoned Python developer with experience in Galaxy and an interest in finance, we should talk...


1. Constrained optimization ("solving") is a very well studied, deep field. I assure you that optimization tools exist in dozens of languages, not just Excel. Perhaps you were searching the wrong corner of Google? What are you looking for specifically?

2. The folks doing portfolio optimization are generally finance geeks first, and math geeks second (if at all). Like you, they do not want to implement everything themselves and thus are born libraries like Quantlib[1], which has interfaces to "C#, Objective Caml, Java, Perl, Python, GNU R, Ruby, and Scheme" with other bindings on the way or hacked by someone already. This is not me endorsing qlib, but it is certainly a common choice.

3. There is a market for finance tooling, and there are many smart people rolling around in it as we speak. For example, take a look at rapidquant[2] from (formerly?) LambdaFoundry.

4. It can be tough to find a tool that suits you out of the box, but they do exist. My personal preference is Python, but at the moment R and C++ win in terms of financial library support. (hint: rpy2)

[1] http://quantlib.org/index.shtml [2] http://www.rapidquant.com/


Thanks for your response. I think the key phrase about the constraint solver part is "easy to use". I did find quantlib, but in my specific case I was just looking for an immediate substitute for Excel. Learning how to do minimally productive work with quantlib seemed like a task that would take longer than my budgeted time of 1.5 hours/day for the Coursera class.

I do plan on spending some more time setting up an environment with quantlib and ipython notebook. But when I was talking about tooling, perhaps I misused the term. I was thinking of finer-grained things. For instance, there is no "tool" that would just take a table of assets and the covariance matrix and give you the portfolio optimization. In a system like Galaxy, you can develop a "tool" that does exactly that and make it available to researchers. They can then incorporate it into their workflow, get the output and feed it into the pipeline for other "tools", etc, etc. I haven't seen anything like that for finance.


Yeah, there is some startup time required to get to know the tools. In my experience it's worth it, though. I've seen some absolutely mind-boggling spreadsheets trying to do what is a couple of lines of work with numpy/pandas. Spreadsheets are dead simple, but get unwieldy very quickly as your data & problem become more complex.
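To make the "couple of lines" claim concrete, here is a small sketch using NumPy alone that turns a price table into period returns, mean returns and a sample covariance matrix. The function name and the rows-are-dates, columns-are-assets layout are assumptions for illustration.

```python
import numpy as np


def returns_and_cov(prices):
    """Simple returns, mean returns and sample covariance from a price table.

    prices: 2-D array-like, one row per date, one column per asset.
    """
    prices = np.asarray(prices, dtype=float)
    returns = prices[1:] / prices[:-1] - 1.0   # simple period-over-period returns
    mean = returns.mean(axis=0)                # per-asset mean return
    cov = np.cov(returns, rowvar=False)        # sample covariance (ddof=1), columns = assets
    return returns, mean, cov
```

The equivalent spreadsheet needs a returns column per asset plus a hand-built grid of covariance formulas, which is where things tend to get unwieldy.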

I suppose you're right, Quantopian could turn into a Galaxy-like tool, but I think there is a great deal of experimentation that goes on in finance that doesn't fit into the pipeline model very well. Also, a lot of the fitting routines run in finance are quite computationally intensive and wouldn't really work on a framework like Quantopian. For example, good luck hooking your risk model into a recurrent neural network on Quantopian. That isn't to say that bioinf. is not computationally bound, of course it is, but a lot of what's done is more oriented to data pipelines as opposed to fancy computations (correct me if I'm wrong), which is why a thing like Galaxy fits so well.


Oh, genomics can be very demanding in its computing power requirements. We had a few applications with the running time reported in days/weeks.

But still... with all the talk about cloud computing and PaaS, this is the only real use-case where I see a benefit in tying a platform to something like EC2. In the case of Galaxy, the developers provided an AMI that you could deploy and use with your own account. For the Quantopian case, one of their services could be exactly the managing and scaling of these tools. So, the neural network would be a tool that you'd have access to, and you would define how much power should be allocated when running it.


I don't know enough about Galaxy to say that we want to be the Galaxy of finance, but I can tell you where we want to go.

There are several key steps in the process of creating quant investment strategies: research, development/testing, backtesting, paper-trading, and finally, live trading. We want to cover all of it.

I think Galaxy is most like the 'research' phase - you want to get your data assembled and in something like a pandas datapanel, which you can then plot ad hoc. You're looking for patterns, very quickly testing hypotheses about the data.

We made the decision to build live trading next, and we're full tilt on that, but we will keep expanding the product and it will include the research phase.

Drop me a line and we can talk :) -- fawce at quantopian dot com


I've enrolled in "Introduction to Computational Finance and Financial Econometrics"[1] and it uses R (and Excel) for portfolio optimization. Unfortunately I haven't had enough free time these past weeks to keep up, so I'm still stuck on the week 2 lessons. Another interesting read is the Systematic Investor Blog[2] and its related toolkit, also in R.

[1] https://www.coursera.org/course/compfinance [2] http://systematicinvestor.wordpress.com/


Here's an example of fetcher in action: https://www.quantopian.com/posts/new-feature-fetcher


Some additional details about Fetcher: https://www.quantopian.com/help (You will need to scroll down to the Fetcher section -- couldn't find anchor links)



Cool!

Now add risk management. It's not sexy, but it is table stakes.



