I think there are some key differences from cycles in chess. In chess, the cycles come from your (or your opponent's) actions. With minimax you would max (or min) over these actions, so if you pruned away the cycles at some depth your result would not be incorrect, because the min or max does not change. But here the cycles come from randomness in the game, of which chess has none, and you need to take an expectation over the cycles. So you can't just prune them away without changing the result.
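To make the pruning point concrete, here is a minimal sketch. It is a toy example of my own, not the actual crafting model: a single action costs `cost` and succeeds with probability `p_success`, and on failure the randomness puts you back in the same state, which is the cycle. The function names are made up for illustration. The exact expected cost solves V = cost + (1 - p) * V, so V = cost / p, and cutting the cycle off after a few retries underestimates it.

```python
def exact_expected_cost(p_success: float, cost: float) -> float:
    """Solve V = cost + (1 - p_success) * V for V."""
    return cost / p_success


def truncated_expected_cost(p_success: float, cost: float, depth: int) -> float:
    """Expected cost when the self-loop is pruned after `depth` attempts.

    Pruning here means assuming the remaining cost is zero once the
    depth limit is hit, which is what dropping the cycle amounts to.
    """
    value = 0.0
    for _ in range(depth):
        # Pay for one attempt, then with probability (1 - p_success)
        # we are back in the same state and owe `value` more.
        value = cost + (1.0 - p_success) * value
    return value


if __name__ == "__main__":
    p, c = 0.1, 1.0
    print("exact:", exact_expected_cost(p, c))
    for depth in (1, 5, 50):
        print("depth", depth, "->", truncated_expected_cost(p, c, depth))
```

With a max or min over actions, dropping a branch that loops back cannot change which branch wins. With an expectation, every truncated retry removes probability mass from the sum, so the pruned value only approaches the true one as the depth grows.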
A scoring function could definitely help guide the exploration and/or prune the tree, but only at the action nodes, not the environment randomness nodes. Rolling out the full tree more than 1-2 levels would be infeasible because of the randomness in the environment. When you take an action, the randomness can transport you into an exponential number of states, so you have a huge branching factor that is much larger than chess. I think in chess you have a factor of 40ish? Here it's more like ~1000 or ~10,000 depending on the item.
I also wouldn't know how to design a scoring function for this. If you do something simple like counting the number of missing modifiers, you will end up stuck in bad states. Maybe there is something really clever here that you can do, but I don't know what it is.
If you have an idea how to make tree search work for this I'd love to try it.
I think the heuristic approach is not as bad as you think. Human crafters, and the current crafting techniques we have, are pretty efficient. You don't need to explore the full graph of random states because the mod weights themselves are a representation of those states. As a human I can't think of any crafting item (alt and chaos spamming aside) where the probability isn't a simple-to-understand number. I think you can use something like modweight x currency cost (x some time modifier for acquisition and use of currency) for scoring. This is how craftofexile does it.
The more I think about this, the more I feel you went really overkill on this. The complexity is several orders of magnitude lower than what you claim. Like, try crafting stuff on the emulator and see how easily you can narrow it down to usually 2 or 3 choices at most steps. There should be really easy heuristics to invalidate most branches. I've also used the COE simulator a ton and never had to model something that's even on the order of 50+ states.
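A rough sketch of that scoring idea follows. This is my reading of it, not craftofexile's actual code, and the weights, costs, and function names are all made up. It assumes each use of a currency rolls the wanted modifier with probability weight / total weight, so the expected number of uses is total weight / weight, and multiplying by the per-use cost gives a number you can compare across options.

```python
from typing import Dict, Tuple


def expected_currency_cost(mod_weight: int, total_weight: int, cost_per_use: float) -> float:
    """Expected spend to hit one modifier, assuming independent rolls.

    Hypothetical model: P(hit) = mod_weight / total_weight per use.
    """
    if mod_weight <= 0 or mod_weight > total_weight:
        raise ValueError("mod_weight must be in (0, total_weight]")
    # Expected uses = 1 / P(hit) = total_weight / mod_weight.
    return cost_per_use * total_weight / mod_weight


def cheapest_option(options: Dict[str, Tuple[int, int, float]]) -> str:
    """Pick the option name with the lowest expected cost.

    Each value is (mod_weight, total_weight, cost_per_use).
    """
    return min(options, key=lambda name: expected_currency_cost(*options[name]))


if __name__ == "__main__":
    # Illustrative numbers only, not real game data.
    options = {
        "expensive_but_likely": (1000, 10000, 2.0),
        "cheap_but_unlikely": (500, 10000, 0.5),
    }
    for name, args in options.items():
        print(name, expected_currency_cost(*args))
    print("pick:", cheapest_option(options))
```

The time modifier mentioned above would just be another multiplier on `cost_per_use`. This only scores a single step toward a single modifier, so it says nothing about multi-step plans.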
Just a remark about the cycles not changing the min-max.
That's not correct. Identical positions do not always have the same value.
Sometimes, reaching the same position again will allow a player to claim a draw (threefold repetition). Sometimes a repeated position will bring them closer to some other draw mechanism (e.g. 50 moves without a capture or pawn move).
However, reading this, it seems like a good fit to try a genetic algorithm. I know, GAs are regarded as a complete backwater in AI these days, but nature has shown that they work in these kinds of situations.
I do want to release the code, but it may take a couple more weeks. It's currently a mess of research code that's mixed with some other stuff I was working on so I would need to do a cleanup first. See you on the boat!
- AI / Deep Learning Research - previously at Google and have published papers. Mostly focused on NLP and RL, but I keep up with other subfields.
- Infra: Devops, Rust, Go, Kubernetes, Microservices, large-scale systems, all kinds of data stores. Have managed large clusters. Used to be an early Apache Spark engineer and was in a database research group in grad school
- Worked in HFT-style algotrading for a few hedge funds
- Worked at multiple early-stage startups, so I can do other things like full-stack web or app development, but I prefer not to do these full-time. But I can help if stuff comes up.
Hi! 15+ years of engineering experience, and have been through a lot of technology cycles. I'm in an ok place right now and focusing on research and side projects. I'm not actively looking for work but if there's something at the intersection of my interests I'd love to talk. Not sure myself what that would look like, perhaps something around MLOps, infra/automation, Reinforcement Learning, Algorithmic trading, etc.
Location: Usually Japan/East Asia, but currently in Europe due to COVID
Willing to Relocate: No
Technologies (reverse-chronological order):
- AI / Deep Learning Research - previously at Google and have published papers. Mostly focused on NLP and RL, but I keep up with other subfields.
- Infra: Devops, golang, rust, kubernetes, microservices, large-scale systems, all kinds of databases. Have managed large clusters. Used to be an early Apache Spark engineer and was in a database research group in grad school.
- Briefly worked in algo trading (HFT-style)
- Worked at multiple early-stage startups, so I can do other things like full-stack web or app development, but I prefer not to do these full-time. But I can help if stuff comes up.
Hi! 15+ years of engineering experience, and have been through a lot of technology cycles. I'm in a decent place right now and focusing on research and side projects. I'm not actively looking for work but I figured I would post anyway - who knows what opportunities come along! If there's something at the intersection of my interests I'd love to talk. Not sure myself what that would look like, perhaps something around ML/RL research, academia, trading, or infrastructure.
Yup. The only reason I am starting to write about it now is that I am no longer running the system. You could argue that it's not useful to write about systems that worked in the past, but I would disagree. New systems can work 99% the same way, but get an additional edge from somewhere else, like new data or better models. Most of the engineering will always be the same.
> Just because you have a good forecast doesn't translate into cash. It has to be paired with a trading strategy. This is probably why the author thinks the answer is RL, because coincidentally if you approach this problem with RL, it does the forecasting + strategy.
Exactly, this is one of the nice things about RL. You don't need to do a bunch of handwaving to turn your predictions into a strategy.
I think you're comparing apples to oranges here. These funds manage billions of dollars of client money, which forces them into highly liquid markets with scalable strategies. That's quite different from how individuals or smaller prop funds can operate, trading off capacity for higher returns by trading in less liquid markets or with strategies that are "not worth it" for large hedge funds. If you must manage billions of client money then you are right in terms of competition, but as someone who only trades his own capital, you can see a lot higher returns.
>These funds manage billions of dollars of client money, which forces them into highly liquid markets with scalable strategies.
Obviously this is true, but I think you're missing the point. Trading with ML on price data is a strategy that literally anyone can reproduce and, as is evident from reading the comments in this thread, is something that many people have tried to replicate. In that context, everyone using that strategy is effectively acting as a large fund. Further, a large fund or prop shop can deploy small-scale strategies; I think the limiting factor really tends to be leverage. But if they are just trying to make 5% returns for example, they can deploy a lot of small strategies that make ~5% returns. And that's not mentioning the countless tiny shops operating under the radar trading <10-50 million AUM (really, I think the average fund is much smaller than what you would imagine). What I'm getting at is that there are a lot more market players than the "big guys" and they will either have an equivalent strategy to you or will be better equipped to take advantage of that same alpha because of more capital/better data sources/smarter stats. With that in mind, it seems insane to suggest that you can find significant alpha in such a low-hanging fruit.
Remember that you are trading during one of the longest bull markets in history. It's not hard to make good returns, but it is hard to analyze risk. There are a million and one ways to make 100% y/y, but only a fraction of a percent of those will continue to work in the long term. With a black-box model you cannot properly assess risk. Even with well-understood models, this is something that real industry players struggle with: backtrading alpha != simulation alpha != profitable alpha != long-term alpha.
Also considering you're the OP & are trying to argue in favor of this type of trading, it would be very informative to disclose what kinds of returns you actually made. It's hard to expect people to listen to your opinion in a game where everyone successful is motivated towards secrecy.
If the above is true, why would a fund not just allocate a small amount of resources to trade on OP's strategies? Either:
(1) OP's strategy performs worse than the alternative
(2) They already do this, and have resources that allow them to outperform OP at their own strategy
If the returns are really meaningful, i.e. better Sharpe ratio than just holding $SPY or some dead simple strategy like that, then (2) must be true at least _somewhere_.
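For anyone who wants to run that comparison, here is a minimal sketch of an annualized Sharpe ratio. The function name, the per-period risk-free rate, and the 252 trading periods per year are my assumptions, not anything OP described. You would compute it once for the strategy's returns and once for the benchmark's returns over the same period.

```python
import math
import statistics
from typing import Sequence


def annualized_sharpe(
    returns: Sequence[float],
    risk_free: float = 0.0,
    periods_per_year: int = 252,
) -> float:
    """Annualized Sharpe ratio from per-period simple returns.

    `risk_free` is the per-period risk-free rate (assumed constant).
    Needs at least two returns, and they must not all be equal,
    otherwise the sample standard deviation is undefined or zero.
    """
    excess = [r - risk_free for r in returns]
    mean_excess = statistics.mean(excess)
    stdev_excess = statistics.stdev(excess)  # sample standard deviation
    return mean_excess / stdev_excess * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Illustrative daily returns only, not real data.
    strategy = [0.004, -0.002, 0.003, 0.001, -0.001]
    benchmark = [0.001, 0.000, 0.002, -0.001, 0.001]
    print("strategy:", annualized_sharpe(strategy))
    print("benchmark:", annualized_sharpe(benchmark))
```

A handful of data points like this tells you nothing, of course. The ratio is only as meaningful as the length and representativeness of the return series behind it.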
Location: Usually Japan/East Asia, but currently in Europe due to COVID
Willing to Relocate: No
Technologies (reverse-chronological order):
- AI / Deep Learning Research - previously at Google and have published papers. Mostly focused on NLP and RL, but I keep up with other subfields.
- Infra: Devops, golang, rust, kubernetes, microservices, large-scale systems, all kinds of databases. Have managed large clusters. Used to be an early Apache Spark engineer and was in a database research group in grad school.
- Briefly worked in algo trading (HFT-style)
- Worked at multiple early-stage startups, so I can do other things like full-stack web or app development, but I prefer not to do these full-time. But I can help if stuff comes up.
Hi! 15+ years of engineering experience, and have been through a lot of technologies and cycles. I'm in a decent place right now and focusing on research and side projects. I'm not actually looking for work. But I figured I would post anyway - who knows what opportunities come along! If there's something at the intersection of my interests I'd love to talk. Not sure myself what that would look like, perhaps something around ML/RL, research, infra, or possibly trading.
I posted this because it was recommended to me several times in [0], together with several other "computational approaches to Physics" books, and thought it would be interesting to HN users. If you're looking for more books like this, the whole Twitter thread is worth a read. It's full of good recommendations.