Machine learning predictions for QM eMini Crude Oil

tiredfatdude · on Sept 26, 2017

If you glance over the whitepaper, it's obvious this is a completely ridiculous model. Trying to predict intraday price movements with previous day volume numbers and open interest as features? Please. Also, selling the "model" predictions for $365 a year is the real scam -- if he had any confidence in the model, he'd get a private backer and be trading size out of his own account rather than making $50 bets.

madchops1 · on Sept 26, 2017

It's an indicator to assist in trading decisions not a crystal ball. And your analysis of my analysis is incorrect. I do have private backers using this. But I don't see why I can't share as well and have multiple streams of revenue. I post all the results so you can see for yourself. I'm not rich and have only been day trading for a year or two. This is for the little guy not the big guy. So I trade and show examples as someone who may not have lots of liquidity available to show how you too could learn to do this.

rainhacker · on Sept 26, 2017

You should change the following in your whitepaper though:

"..It contains information that is confidential and privileged. If you have received this document in error, please notify the sender and delete this file."

madchops1 · on Sept 26, 2017

Ya I should remove that. Ty

sireat · on Sept 26, 2017

I mean it is possible that the model could work for someone with access to more capital and better fee structure than OP making $50 bets.

Still, this seems like a model that will likely work about 95% of the time and then fail on outliers, i.e. a "picking pennies in front of a bulldozer" model.

madchops1 · on Sept 26, 2017

The nice thing about QM is that 1 tick is a 12.50 dollar move. So it's easy to cover fees. Today those were $50 profits. 2 shares moved 2 ticks. I use stops and limits of course. So my risk was about $100. It's not the same as options.
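The arithmetic in the comment above can be sketched in a few lines. This is only an illustration: the $12.50 tick value and the 2 x 2 trade come from the comment, while the function name and the 4-tick stop distance are assumptions chosen to reproduce the "about $100" risk figure.

```python
TICK_VALUE_USD = 12.50  # per-tick value quoted in the comment above


def gross_pnl(contracts: int, ticks: int, tick_value: float = TICK_VALUE_USD) -> float:
    """Gross profit or loss in dollars, before fees and slippage."""
    return contracts * ticks * tick_value


# The trade described above: 2 units moving 2 ticks in the trader's favor.
profit = gross_pnl(contracts=2, ticks=2)

# Hypothetical stop 4 ticks away (an assumption, not stated in the comment).
risk = gross_pnl(contracts=2, ticks=-4)

print(profit, risk)  # 50.0 -100.0
```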

HockeyPlayer · on Sept 26, 2017

A strategy that goes long or short QM doesn't have an asymmetrical return profile. I.e., it isn't "picking up pennies". An example of that is selling options.

madchops1 · on Sept 26, 2017

Thanks HockeyPlayer spot on.

tiredfatdude · on Sept 26, 2017

I think you misunderstand. What you're doing is even worse than picking up pennies: your strategy just boils down to punting futures with no hedge or any form of risk management on your open position. You're just gambling on coinflips. If you backtest your "model" you'll find that crude oil moves induced by external macro events will completely wipe you out because you have no hedge against them.

madchops1 · on Sept 26, 2017

You are correct about there being no hedge against outlying macro events. But I'm not suggesting using this without any other variables in your trading decision. It may help make a trading decision.

madchops1 · on Sept 26, 2017

Also you should always place a stop and limit order. So you don't get completely wiped out. It's very common.

madchops1 · on Sept 26, 2017

I do appreciate the criticism though I honestly thought I'd get more. Results, results, results

soVeryTired · on Sept 26, 2017

Have a look at Carver's book 'Systematic Trading' and maybe 'Expected Returns' by Ilmanen. You also need to get a proper backtest going. A black box and a diary won't get you very far.

madchops1 · on Sept 26, 2017

I am doing my training/evaluation with a data split of 70/30. Doesn't that qualify as a proper backtest?

soVeryTired · on Sept 26, 2017

I don't really know what you mean by evaluation. But you need to be able to (faithfully) generate all the positions your system would take through time, and also to generate all the returns you would have made through time.

Aside from pure P&L, you should be looking at how much risk your system is taking, and under what conditions it's doing badly. All backtests are overfit: their use is mostly in identifying problems with your strategy, rather than predicting how much money you'll make.
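A minimal sketch of the kind of backtest described above, assuming one position per day (+1 long, -1 short, 0 flat) held from one price to the next. The function names, prices, and positions are all made up for illustration. It ignores fees, slippage, and stops, so it is a starting point for finding problems, not an estimate of returns.

```python
from typing import List, Sequence


def backtest(prices: Sequence[float], positions: Sequence[int],
             point_value: float = 1.0) -> List[float]:
    """P&L per interval for holding positions[t] from prices[t] to prices[t + 1]."""
    if len(positions) != len(prices) - 1:
        raise ValueError("need exactly one position per price interval")
    return [pos * (prices[t + 1] - prices[t]) * point_value
            for t, pos in enumerate(positions)]


def max_drawdown(pnl: Sequence[float]) -> float:
    """Largest peak-to-trough drop of the cumulative P&L curve."""
    equity = peak = worst = 0.0
    for x in pnl:
        equity += x
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


prices = [50.0, 50.5, 50.25, 51.0, 49.0]  # made-up prices
positions = [1, 1, -1, 1]                 # made-up model signals

pnl = backtest(prices, positions)
print(pnl)                # per-interval P&L
print(sum(pnl))           # total P&L
print(max_drawdown(pnl))  # one simple measure of risk taken
```

Looking at the per-interval list, rather than only the total, is what shows when the system makes and loses money.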

One question you'd get asked if you were proposing this in a real trading environment is this: what is it about the QM emini contract that makes this work? Does it work for other energy contracts? For other commodities? For bonds, or equities? If not, why not?

madchops1 · on Sept 26, 2017

Basically I have a dataset and I train my model with 70% and then evaluate its guesses against the remaining 30%. Hence a baseline is created and I can see if my model performs better.

It took some doing to get this model to perform well. I did this by adding features that help recognize patterns in the time series data.

The features I created are not specific to QM as they are technical (eg. numbers, not news), and time-series related. So the models should work with any historical dataset with the same fields.

My goal is to add another future at some point.

soVeryTired · on Sept 26, 2017

I don't understand your baseline.

I feel like you're talking past me a little. The first thing you need to do is generate all the positions your system would have taken over as many years as possible, and figure out at what times you make and lose money. Otherwise you don't have a backtest.

madchops1 · on Sept 29, 2017

I apologize. I can do that. I'm going to generate that backtest you described.

Right now I have residual data from the AWS machine learning data that tells me whether there is any structure to the times it does guess wrong. And a value below baseline is a better than 50/50 guess according to what I have learned about how AWS does its ML. Knowing that, I use this personally as a supporting indicator for my trade decisions. Since it's so new and I really don't want people to think I'm scamming or something, I'm just releasing my results free for now, not trying to be a douche ;)

AWS defines the baseline as follows:

"Baseline RMSE: Amazon ML provides a baseline metric for regression models. It is the RMSE for a hypothetical regression model that would always predict the mean of the target as the answer. For example, if you were predicting the age of a house buyer and the mean age for the observations in your training data was 35, the baseline model would always predict the answer as 35. You would compare your ML model against this baseline to validate if your ML model is better than a ML model that predicts this constant answer."
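The quoted definition can be sketched as follows. This is a plain-Python illustration, not Amazon ML's own code: the function names and numbers are made up, and it assumes the training-set mean is scored against held-out evaluation targets.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def baseline_rmse(train_targets: Sequence[float], eval_targets: Sequence[float]) -> float:
    """RMSE of a model that always predicts the mean of the training targets."""
    mean = sum(train_targets) / len(train_targets)
    return rmse(eval_targets, [mean] * len(eval_targets))


train_targets = [1.0, 2.0, 3.0]  # made-up training targets (mean 2.0)
eval_targets = [2.0, 4.0]        # made-up held-out targets
model_preds = [2.5, 3.5]         # made-up model predictions

print(baseline_rmse(train_targets, eval_targets))  # constant-mean baseline
print(rmse(eval_targets, model_preds))             # model error, lower is better
```

Note that this compares the size of the prediction errors. On its own it does not say how often the predicted direction of a move is right.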

madchops1 · on Sept 26, 2017

I will. Thank you for the advice.

minimaxir · on Sept 26, 2017

A statistical model based on bad/dependent features is essentially just random guessing, which in this case the model just makes favorable guesses.

madchops1 · on Sept 26, 2017

I experimented with the features until I got good results in my evaluations. If I am getting favorable guesses, how does that point to bad features? Favorable guesses is what I was going for in order to assist in trading decisions.

I use a 70/30 split of training/eval

madchops1 · on Sept 26, 2017

Just looking for feedback. I use many historical price values other than volume. Plus some specific things I have added for pattern recognition in time-series data. That's the special part.

jb1991 · on Sept 26, 2017

@madchops1 This comment reveals just how seriously we should take your work. This article should be flagged.

madchops1 · on Sept 26, 2017

touche. Sorry for my negative reaction.

jb1991 · on Sept 26, 2017

Note to other readers: the comment was changed.

madchops1 · on Sept 26, 2017

I reacted that way because dude said he read the white-paper and somehow took this as a historical analysis of volume. Which it is not.

I use historical open, high, change, last, settle, prev. day open interest, plus several other fields I use to help recognize patterns and properly weight time-series data.

madchops1 · on Sept 26, 2017

It was changed. I said "your just jelly". Sorry. I changed it.

financedude · on Sept 26, 2017

The OP says "I don't see why I can't trade it AND share it as well to get multiple streams of revenue?"

Maybe because the capacity of day hold futures (especially e-micro crude) is so small that there is almost no way that the revenues earned from your 365 a year subscribers is going to be greater than the decreased capacity of your strategy from having all those people trading it.

As someone who works at a quant fund, this kind of shit pisses me off to no end. It makes a legit industry look like Herbalife

madchops1 · on Sept 26, 2017

I don't really have a strategy to copy. I just provide indicators that may affect how you decide to make a trade or not.

Also the point of this project is to get machine models that evaluate below baseline by improving the analysis of time-series data. I spent a lot of time improving the quality until they became better than baseline. Like I said I do not have a crystal ball or strategy for sale. Just insights from pattern recognition of historical time-series data that has helped me trade so I'm putting it out there.

jb1991 · on Sept 26, 2017

Entirely agree. This site is like looking at a caricature and passing it off as fine portrait art.

d--b · on Sept 26, 2017

OP: how can you expect to be taken seriously if you don't publish backtests, and overfitting analysis?!

It is _extremely_ likely that your model's guesses are just as bad as coin flips.

madchops1 · on Sept 26, 2017

I will do this, thank you for your feedback. I didn't release my results until my models were better than baseline in evaluation. Hence better than flipping a coin. I published the results of my evaluations so you can see that. It's not a crystal ball. It's an indicator to help in trading decisions.

madchops1 · on Sept 26, 2017

I published the evaluation results in the whitepaper. I am doing my training/evaluation with a data split of 70/30. Doesn't that qualify as a proper backtest?

JackFr · on Sept 26, 2017

That's how Simons did it at Renaissance. He sold his model for $365 a year.

madchops1 · on Sept 26, 2017

I did not know that. Thanks for the info.

lowdest · on Sept 26, 2017

This is all a bit naive, OP is not up to speed with the state of the art. This looks like any number of student projects that are created every semester.

The best thing about this is that it looks like OP prototyped and released v1 for sale in under a month. That's respectable.

madchops1 · on Sept 26, 2017

It may be a bit naive. I'm going to keep working on this. I have found my indicators greatly improve my own trading results, and making the models' evaluation numbers improve is very exciting and interesting to me and maybe others.

Thanks for the one compliment :)

madchops1 · on Sept 26, 2017

Also my models beat baseline in evaluations so I'm stoked on that. It took some doing to produce results with time-series data.

gwbas1c · on Sept 26, 2017

It's been about a decade since I worked with machine learning, and even then I hardly scratched the surface. (I mostly wrote the data access and glue code for someone else's machine learning system.)

Given what I see, it's really hard (for me) to understand what's going on and what's novel. I'm not very active in the financial or machine learning area. All I can understand is that someone wrote an investment program that makes money.

Even with my limited knowledge of machine learning, I know that it's very easy to confuse luck and success. How do you know that you're not just lucky? How do you know that your computer program is really investing, and not just "good timing"?

madchops1 · on Sept 26, 2017

I am not trading purely on these numbers. They are indicators that help me make my trading decisions and I think they may help others too.

My models' evaluations are performing better than baseline when trained with 70% of the data and evaluated against the remaining 30% so I take that as value. As someone else put it, a potentially "favorable guess". At this point I'm using the predictions regularly. And I guess I'll know more the longer I keep track of daily results.

madchops1 · on Sept 28, 2017

Updated White Paper. Generated more evaluations and back tests. https://s3.amazonaws.com/karlcdn/Dutchess.ai+White+Paper.pdf

madchops1 · on Sept 26, 2017

I know a lot of you hate this but the model has correctly predicted the direction of movement at end of day 9 out of 11 days since my model was under baseline and I began publishing my results. I know I've said this a few times but it's not a crystal ball, it's a value that can assist in trading decisions.

soVeryTired · on Sept 26, 2017

Crude oil has been going up all month. Is your system biased long (i.e. towards buying)?

Edit: looking through your trade history, it does appear to have a long bias. It's predicted a rise in price on all but two days.
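One way to check the long-bias concern is to score the model's calls against a trivial "always long" rule over the same days. The sketch below uses made-up direction series (+1 up, -1 down), not the actual trade history. The names are hypothetical.

```python
from typing import Sequence


def hit_rate(predicted_dirs: Sequence[int], actual_dirs: Sequence[int]) -> float:
    """Fraction of days on which the predicted direction matched the actual one."""
    hits = sum(1 for p, a in zip(predicted_dirs, actual_dirs) if p == a)
    return hits / len(actual_dirs)


# Made-up example of a rising month: 9 up days out of 11.
actual = [1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1]
# Made-up model that predicts "up" on all but two days.
model = [1, 1, 1, 1, 1, 1, -1, 1, -1, 1, 1]
always_long = [1] * len(actual)

print(hit_rate(model, actual))
print(hit_rate(always_long, actual))
```

In this invented case both score 9 of 11, so the hit rate alone would not separate the model from simply buying every day.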

gwbas1c · on Sept 26, 2017

11 days is a very short time. Assume that you're lucky.
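To put a rough number on the "lucky" question, here is a sketch of the chance of getting at least 9 of 11 calls right by guessing. It assumes independent days and a fixed per-day success probability, both simplifications. The function name is hypothetical.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(successes, trials + 1))


# Fair coin: (55 + 11 + 1) / 2048, roughly 3.3%.
print(binomial_tail(9, 11))

# If a rule such as "always long" is right on 70% of days in a trending
# month (a hypothetical figure), 9 of 11 becomes much less surprising.
print(binomial_tail(9, 11, p=0.7))
```

The result depends heavily on the per-day probability assumed, which is why the long-bias question matters for an 11-day sample.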

madchops1 · on Sept 26, 2017

That's also what the evaluation of an ML model is for.

madchops1 · on Sept 26, 2017

That's true. I'm going to keep going.

madchops1 · on Sept 28, 2017

I did another back test with a randomly selected 70%/30% training to evaluation ratio for evaluating time-series models. Adding results to whitepaper. The results are still under baseline.

k3oni · on Sept 26, 2017

Couple questions:

1. Why QM?

2. Was this tried on any other future and if yes what was the outcome?

madchops1 · on Sept 26, 2017

I have been trading QM for a year or two now. It has low margin and good volatility for day trading. The tick value is 12.50. So movement of one tick will cover your trade costs. So it makes it possible for a beginner to start with ~$3,000-5,000 and still be able to realistically make money.

cardmagic · on Sept 27, 2017

Trading costs are more than just commission. Slippage for QM is higher than CL because of thinner volume and can easily be $10-25+ per order ($20-50 round trip). You might say that you don’t worry about slippage because you use limit orders, but limit orders have their own problems, like not getting filled which can easily screw up your returns vs predicted returns. Be careful, there are far more ways to lose money trading than you seem to realize still.
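The effect of slippage on a small trade can be sketched with the figures from this thread: a $50 gross profit (2 units x 2 ticks x $12.50) against $20 to $50 of round-trip slippage. The function name and the commission parameter are assumptions. No commission figure is given in the thread, so it defaults to zero.

```python
def net_pnl(gross: float, round_trip_slippage: float, commission: float = 0.0) -> float:
    """Profit left after subtracting slippage and (optionally) commission."""
    return gross - round_trip_slippage - commission


gross = 2 * 2 * 12.50  # the $50 trade discussed earlier in the thread

for slippage in (20.0, 50.0):  # low and high end of the quoted round-trip range
    print(slippage, net_pnl(gross, slippage))
# 20.0 30.0
# 50.0 0.0
```

At the high end of the quoted range the whole gross profit is gone before any commission is counted.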

madchops1 · on Sept 28, 2017

Also doing random not sequential training and evaluation. The proper way to eval time-series models.

madchops1 · on Sept 26, 2017

I am doing my training/evaluation with a data split of 70/30. Doesn't that qualify as a proper backtest?

soVeryTired · on Sept 26, 2017

What is this I don't even

madchops1 · on Sept 26, 2017

Working on a tutorial...

daveguy · on Sept 26, 2017

I may not be reading this right, but does the white paper contain 5 days of result data?

madchops1 · on Sept 27, 2017

ya it contains the evaluation results of the model and 5 days of result data.

thinkmilitant · on Sept 26, 2017

The ever-relevant xkcd - https://xkcd.com/1570/

jb1991 · on Sept 26, 2017

Not just relevant; that comic and this website are nearly indistinguishable.

madchops1 · on Sept 26, 2017

I love xkcd.