
Unsupervised machine learning with basket clusters - signa11
https://hackernoon.com/unsupervised-machine-learning-for-fun-profit-with-basket-clusters-17a1161e7aa1
======
alextheparrot
Here's my understanding of what's going on:

He has a dataset that links companies to each element of the periodic table
(Intel would likely be highly linked to silicon, while not as linked to
nitrogen as a fertilizer company). He then uses some clustering and whatnot to
produce groups of stocks he is trading as a unit.
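
To make that concrete, here is a minimal sketch of the grouping step as I understand it, not the author's actual code. It assumes each company is a vector of link strengths to a few elements and runs plain k-means (mentioned elsewhere in this thread) with a deterministic farthest-first initialization. The company names, the three elements, and every number in `LINKS` are invented for illustration.

```python
# Hypothetical link strengths (0..1) between companies and three elements.
# All names and values are made up; they are not from the real dataset.
ELEMENTS = ["Si", "N", "Li"]
LINKS = {
    "ChipCo A": [0.9, 0.1, 0.2],
    "ChipCo B": [0.8, 0.2, 0.1],
    "FertCo A": [0.1, 0.9, 0.0],
    "FertCo B": [0.2, 0.8, 0.1],
    "BatteryCo A": [0.3, 0.0, 0.9],
    "BatteryCo B": [0.2, 0.1, 0.8],
}


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def farthest_first(vectors, k):
    """Deterministic init: start at the first vector, then repeatedly
    add the vector farthest from all centroids chosen so far."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    return centroids


def kmeans(vectors, k, iters=50):
    """Plain k-means (Lloyd's algorithm); returns one label per vector."""
    centroids = farthest_first(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


def baskets(links, k):
    """Group company names by cluster label, sorted for stable output."""
    names = list(links)
    labels = kmeans([links[n] for n in names], k)
    groups = {}
    for name, label in zip(names, labels):
        groups.setdefault(label, []).append(name)
    return sorted(sorted(group) for group in groups.values())


print(baskets(LINKS, 3))
```

On toy data this cleanly separated, the chip, fertilizer, and battery names end up in separate baskets. Note that the clustering only says which stocks look alike on these features; it says nothing about which basket to buy.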

How he picks what to trade is kinda hand wavy (Is he trading all baskets and
only reporting the good one?). He also mentions using Google trends data,
which might be some signal for his clusters. Given the lack of detail, my gut
reaction is to assume he made tens of these strategies and has kindly reported
the best one to us.
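
A rough sketch of why that matters, under assumptions that are entirely mine: 40 strategies, none with any real edge, each with monthly excess returns drawn from a zero-mean Gaussian with 4% volatility. None of these numbers come from the article. The point is only that the best of many zero-edge strategies can look impressive over a single year.

```python
import random


def yearly_excess_return(rng, months=12, monthly_vol=0.04):
    """Compound 12 months of zero-mean excess returns (no true edge).

    monthly_vol is an assumed figure chosen for illustration.
    """
    growth = 1.0
    for _ in range(months):
        growth *= 1.0 + rng.gauss(0.0, monthly_vol)
    return growth - 1.0


def simulate(n_strategies, seed=0):
    """Yearly excess return for each of n_strategies skill-free strategies."""
    rng = random.Random(seed)
    return [yearly_excess_return(rng) for _ in range(n_strategies)]


results = simulate(40)
first = results[0]   # what you would report if you only ever tried one
best = max(results)  # what you would report if you picked the winner

print(f"first strategy tried: {first:+.1%}")
print(f"best of 40:           {best:+.1%}")
```

With these assumptions a year's excess return has a standard deviation of about 14% (0.04 times the square root of 12), so a 10% outperformance from a hand-picked winner is well within what pure noise produces.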

At the end, you see he is a financial consultant hoping to build his resume up
through posts like these.

------
rushabh
So the author was able to beat S&P500 by 10% over a period from June-2016 to
June-2017 using this solver. The most important question is whether the same
underlying relationships will hold true for 2017-18.

This seems like a classic example of hindsight. There are many things one can
tell in hindsight once the results are out. The key is whether they will hold
in the future too. Am I missing something?

~~~
justadeveloper2
It doesn't matter if you can do it once. You have to do it consistently over
years, which is not possible. I know about Renaissance Technologies, but I
suspect they have insider information or some other angle and it's not all
about their algorithms.

~~~
eximius
That is a rather large accusation. Can you restate that as something
constructive or helpful to the conversation?

~~~
justadeveloper2
Do you understand investing at all? People have been attempting to "beat the
market" for decades and nobody has been successful. There is always the
"reversion to the mean" issue in every case. People with algorithms think they
are beating the market then something changes and their algorithm no longer
works. Except for Renaissance--they have consistently been beating the market,
or so it appears.

Does that help?

~~~
dajohnson89
No, it doesn't help at all. You have yet to provide any evidence for your
accusation of insider trading (or "something else"). You can't really accuse
RT of wrongdoing, just because you don't understand their methodology.

Also -- if I were a quant that managed to beat the market consistently, I
would shut up and go straight to RT, for two reasons. To avoid toxic dubious
comments like yours, and because RT can pay me better than your standard hedge
fund.

~~~
scottlegrand2
Except that Renaissance trading doesn't really want to hire quants. They wish
to hire scientists and mold them in the Renaissance way, which is something
you'd know if you were a quant.

[http://www.reuters.com/article/simons-hedge-idUSN21355752200...](http://www.reuters.com/article/simons-hedge-idUSN2135575220070522)

~~~
justadeveloper2
It's a good article, but it's all that you will find about them, no real
details on how they are beating the markets over and over again. It's simply
not possible to be that successful for so long. The article states that some
experiments succeed and some fail and that their strategies peter out over
time and they have to develop new ones. Okay, but never booking a loss?

~~~
scottlegrand2
Speaking as a former quant myself (1 year), I believe that there are transient
patterns in the market.

The challenge is that those patterns appear and then they disappear mostly
forever because they are frequently created by someone else's mistakes like
Nassim Taleb buying way out of the money put options that never really created
any significant profit. Free Alpha, get your free Alpha, right there.

To tap into those patterns while they are profitable, you need a lot of smart
people, and a great deal of infrastructure for experimentation and delivery of
strategies that can tap into those patterns before they disappear. While
you're right that it is possible that they cheat, it's also possible that they
have a sufficiently sophisticated infrastructure to actually make this work.
It's kind of like how Nintendo only ships 1 out of 3 video games they develop.
All IMO of course.

------
PLenz
I can't help but feel that this is noise multiplied by noise. Regression to
the mean ought to be coming up real soon.

------
drtillberg
This is systematic semi-blind guessing at what other people will guess in the
future are good investments. It sprays investment capital equally at companies
that are not in any way equivalent. Hypothetically, just because two companies
both have names that begin with "Z" and sell product on Amazon, that is not an
investment thesis for treating them alike. Treating them alike _is_ a recipe
for misallocation of capital in the long run, which is what we sometimes seem
to have in spades in this economy.

------
ikeboy
I don't get the strategy here. What's the investment thesis? Are they
collecting news and trading on it? It doesn't seem to mention that.

------
square90
I'm skeptical of the starmine dataset[0] being used:

"The dataset is based on relationships between elements in the periodic table
and public companies."

This description is a bit vague.

[0]
[http://starmine.ai/datasets/dataset_builder.html](http://starmine.ai/datasets/dataset_builder.html)

~~~
zenkat
I'm having trouble deciding if this whole post is a sly commentary on how easy
it is to get machine learning wrong.

Beat the market by 10% with k-means clustering and a feature set derived from
companies and chemical elements! Hahahaha no.

~~~
scottlegrand2
I know, right? Now mapping it to the sequences coming from the numbers
stations? Totally different story!

------
leereeves
If this approach works, is it wise to be sharing it publicly?

Is there a danger that whatever gains you were able to capture with a unique
insight will end up being shared with competitors?

~~~
lightbyte
Sharing it publicly is the author saying it does not actually work as he
claims it does. You don't write a blog post on a method you created to make
guaranteed money, you use it.

~~~
scottlegrand2
A couple years back there were some traders talking about trying to implement
a really fast version of K-means for clustering stocks that are transiently
correlated on some sub-reddit. I wonder if this guy is the one who started
that thread.

------
landtuna
Is this not a parody, making a joke about how sensitive to supervision
"unsupervised" learning methods are?

~~~
scottlocklin
If it's not a parody, it's one of the most unintentionally hilarious things
involving data I have ever seen presented.

To anyone who takes this seriously: I have a price of butter in Bangladesh
indicator which works REALLY WELL on the S&P500.

