The MtGox 500 (stamen.com)
265 points by bryanjowers on Mar 20, 2014 | 40 comments

Users 1 and 15's charts make no sense - they have to be special system accounts of some sort. My guess is that #15 is the account that receives trading fees, and #1 represents MtGox itself (or some specific aspect of MtGox, such as its cold-storage).

There are still a lot of points on these plots that don't make sense, though; generally they look like vertical stripes labelled as sets of small sell orders, both far below and far above the market price. User 30 has what looks like a large sell (in Mar 2013) far above the highest-ever price. So either MtGox's order-matching is way more broken than anyone ever knew (unlikely), or these actually represent fees, withdrawals, or something similar, and the y-axis position is meaningless.

Also worthy of note is that massive sell order by user 1 in Nov 2013. That's hard to interpret without knowing what the dots on that graph really mean (I doubt they're trades), but it's likely significant.

Each dot is a trade. I pulled the columns from the db we generated with the raw info of that one big trade, so you can see what the structure of the source data looks like. Removed the user id of the buyer, but it's trivial to find if you have the dataset.


2013-11_mtgox_japan.csv|1384974682548514|2013-11-20 19:11:22|*|buy|USD|1000|610000|60941267.96|0|3

2013-11_mtgox_japan.csv|1384974682548514|2013-11-20 19:11:22|THK|sell|USD|1000|610000|60941267.96|0|0
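For anyone following along, here is a minimal sketch of reading those two rows. The column names are my guesses (the thread only shows raw values, no header), so treat them as hypothetical labels rather than the real schema. It also checks two things that can be read off the row itself: the implied USD price per coin, and that the long ID decodes as a microsecond Unix timestamp matching the date column when read as UTC.

```python
from datetime import datetime, timezone

# Hypothetical column names: assumed for illustration, not taken from the dataset.
COLUMNS = ["source_file", "trade_id", "timestamp", "user", "side",
           "currency", "bitcoins", "money", "money_jpy", "col_10", "col_11"]

ROWS = [
    # "*" stands in for the buyer's user id, which was removed in the post above.
    "2013-11_mtgox_japan.csv|1384974682548514|2013-11-20 19:11:22|*|buy|USD|1000|610000|60941267.96|0|3",
    "2013-11_mtgox_japan.csv|1384974682548514|2013-11-20 19:11:22|THK|sell|USD|1000|610000|60941267.96|0|0",
]

def parse_row(line):
    """Split one pipe-delimited row into a dict keyed by the assumed column names."""
    record = dict(zip(COLUMNS, line.split("|")))
    for key in ("bitcoins", "money", "money_jpy"):
        record[key] = float(record[key])
    return record

trades = [parse_row(line) for line in ROWS]
sell = trades[1]

# Implied price in the trade currency, and the implied JPY/USD conversion rate.
price_usd = sell["money"] / sell["bitcoins"]
jpy_per_usd = sell["money_jpy"] / sell["money"]

# Interpret the ID as microseconds since the Unix epoch (an assumption).
id_time = datetime.fromtimestamp(int(sell["trade_id"]) // 1_000_000, tz=timezone.utc)

print(price_usd, round(jpy_per_usd, 2), id_time)
```

Note that both rows share the same ID, so one trade appears once per side (buy and sell).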

So the size of the dots is the Money_JPY "volume" - that is, Yen, and not Bitcoins? Isn't that a bit redundant, because it contains information from the y-axis, which encodes the Yen value of a BC? Wouldn't it have been more interesting to see if the number of BCs traded actually went down with the increase of value of BCs? Right now, it only shows how the volume of trades has increased, but that seems to be obvious given BC's value hike. Actually, to make the graphs distinct, shouldn't the number of BCs traded be on the Y-axis (maybe on log scale, too, as I suspect a huge drop over time) and the dots remain as they are?

May I ask, where'd you get the dataset?

A torrent found on Reddit the day it was leaked. It looks like mods may have removed that thread:


If you find a copy, don't run the .exe file contained within. It contains malware.

From "About":

>The special user TIBANNE_LIMITED_HK is not represented in the MtGox 500 since its trades are not in JPY. You can read a conjecture on the role of this user[1]. The other special user, THK, shows up as #1 in the MtGox 500.

[1] http://7u83.cauwersin.com/?p=12

They look like bid/ask offers.

The biggest takeaway from this is that bots were driving the price up and preventing its fall.

"Bots" offer a lot of stability to the market because they don't react to bad news, though an error can cause "flash crashes," as happened in May of 2010 on the stock market.


How exactly did these charts lead you to that conclusion?

Look at the several places where the bots started binge buying when the price started to fall out.

What makes you think people wouldn't binge buy when the price started to fall out?

People react to news. They might binge buy, but they sleep, and they don't sell as often thereafter. The bots were keeping the price steady by buying and selling. If not for the bots, the ups and downs would have been a lot "peakier."

I love seeing the mostly red charts; they look like people who mined a lot. Check #145 and #180. These people are selling at exponential curves on a log plot... mind blowing.

Or it could be a business that is accepting bitcoins for purchases and then immediately cashing those bitcoins into USD.

Very cool. Important to realize the graphs are on a log scale... threw me for a loop for a second before I noticed.

Could one of these graphs represent the activity of Willy[0]?

As an aside, I experienced a nice bit of schadenfreude when looking at a lot of the graphs from ~250 to ~299

[0] https://bitcointalk.org/index.php?topic=497289.0

This is really beautiful. It's an amazing amount of information to be borne (almost) purely visually.

My favorite is 117. They are the Devon Sawa in Final Destination of Bitcoin.

Any theories on user 15?

The amount of money lost is a little easier to get your head around when you see so many traders buying at relatively high prices towards the end.

Sounds like a bug in the author's graph generation code or some incorrect data in MtGox's database. The price didn't vary that much.

How amazing would it be if we had this kind of data for publicly traded stocks?

For some classes of investors, we do

Fun stuff. Bryan, if this is your site, can you add to the plot in the corner a dot with net value change? Assume that all bitcoin "held" are currently worthless, so a holding of 10 btc at the end would be -10*last sale price recorded on the exchange. That would give an interesting idea of which traders "won" the game and which "lost."

We started doing this analysis and have the data per user but didn't want to include it in this go-around. Still much to do and investigate with this data set! Thanks for the feedback.
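A rough sketch of the net value idea proposed above, in case it helps anyone with the dataset. This is one possible reading of the suggestion, not the analysis Stamen ran: it sums each user's cash flow from trades, counts the coins still held at the end, and writes those coins off at the last recorded price. The tuple layout and the sample trades are made up for illustration.

```python
from collections import defaultdict

def net_value_change(trades):
    """Summarise each user's result, treating coins still held as worthless.

    trades: iterable of (user, side, bitcoins, money) tuples in time order,
    all in one currency. This layout is an assumption for the sketch.
    """
    cash = defaultdict(float)   # fiat received from sells minus fiat spent on buys
    coins = defaultdict(float)  # coins bought minus coins sold on the exchange
    last_price = None
    for user, side, bitcoins, money in trades:
        sign = 1 if side == "sell" else -1
        cash[user] += sign * money
        coins[user] -= sign * bitcoins
        last_price = money / bitcoins
    return {
        user: {
            "net_cash": cash[user],
            "coins_left": coins[user],
            # Value lost if the remaining coins are worth nothing.
            "written_off": max(coins[user], 0.0) * last_price,
        }
        for user in cash
    }

# Made-up example: "a" buys 10 coins for 100, later sells 4 of them for 80.
sample = [
    ("a", "buy", 10, 100.0),
    ("b", "sell", 10, 100.0),
    ("a", "sell", 4, 80.0),
    ("b", "buy", 4, 80.0),
]
result = net_value_change(sample)
print(result["a"])
```

Users who deposited or mined coins elsewhere end up with a negative `coins_left`, since the trade log alone does not show deposits.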

I’m not sure I understand the difference between the green Selling BitCoins (to get Dollars, presumably) and Buying Dollars (by selling BitCoins, also presumably).

I’m even more puzzled by the rates out of trade: did MtGox allow traders to agree one-on-one on their own rates? That would allow a lot more laundering than any theft.

Also missing is the relative size of the traders.

Yes, you used to be able to trade via dark pool in large amounts (at the time, that would be 5+ kbtc) to avoid upsetting the market price. Not sure if they still supported it at time of closure.

This is actually rather scary. Enough people have logged data that these can be dereferenced despite the feeble attempt at depersonalization, but I guess the names will come out in the legal proceedings anyway, depending on jurisdiction.

Then we'll have Newsweek writers at our houses asking how we feel about "losing millions" when our cost basis was really around $10. :-)

To head that off: "Umm, now I don't have to worry about how to declare this on my taxes? 'Miscellaneous Income' was making me nervous."

I don't think the (publicly) leaked database contained any usernames/emails.

You're right, it didn't. All of the trades data included internal User IDs. More recent data (as of April 2012) included User ID hashes, which Gox users have access to (from their password recovery emails for example).

> depersonalization

I think you're maybe looking for bowdlerization or anonymization or something similar?

The problem I had with both those terms is that they're not anonymized (and unless the usernames are obscene, bowdlerization isn't really a good fit). With the other leaks, at least for the older accounts, a name can be associated with these charts. For the newer accounts or those that lay dormant for a few years before trading, I suspect those who have been logging the web socket and IRC traffic could create a correlation with an individual.

Anonymize is probably fine, but I want "reversible anonymization". :)

Green -> Buying bitcoin with any currency

Red -> Selling bitcoin for USD

Blue -> Selling bitcoin for any non-USD currency
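The three rules above, written out as a tiny function in case the legend is still confusing. The `side` and `currency` argument names and values are assumptions about how a trade row would be labelled; the colour mapping itself is just the list above.

```python
def dot_color(side, currency):
    """Return the chart colour for a trade, following the three rules above."""
    if side == "buy":
        return "green"  # buying bitcoin with any currency
    if side == "sell":
        # selling for USD is red; selling for anything else is blue
        return "red" if currency == "USD" else "blue"
    raise ValueError(f"unknown trade side: {side!r}")

print(dot_color("buy", "JPY"), dot_color("sell", "USD"), dot_color("sell", "EUR"))
```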

i want to get married to stamen. the company. they're so awesome.

Pretty interesting stuff. Thanks for the dataviz. What kind of coin volume did it take to get into the top 500?

This is incredibly well done. Did you just download all the data and analyze it with some crazy R skills?

Thanks! The data was released in a big zip file with a mix of malicious content (malware that stole your btc, personal data on MtGox CEO Mark Karpeles). Trade data was in monthly csvs, and also included some duplicate data which we removed.

Analysis happened in SQLite, Python/pandas, and csvkit; charts were built in d3.js and exported with PhantomJS.
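For the duplicate removal, one way it could be done in SQLite is sketched below. This is a guess at the approach, not the actual pipeline: the table layout and the choice of (trade_id, user, side) as the uniqueness key are assumptions. Side has to be part of the key because the same trade ID appears once for the buyer and once for the seller.

```python
import sqlite3

def load_unique_trades(rows):
    """Load trade rows into an in-memory SQLite table, skipping duplicates.

    rows: iterable of (trade_id, user, side, currency, bitcoins, money).
    The schema and uniqueness key are assumptions for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE trades (
               trade_id TEXT, user TEXT, side TEXT,
               currency TEXT, bitcoins REAL, money REAL,
               UNIQUE (trade_id, user, side)
           )"""
    )
    # INSERT OR IGNORE drops any row that collides with the UNIQUE key.
    conn.executemany("INSERT OR IGNORE INTO trades VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn

rows = [
    ("1384974682548514", "*", "buy", "USD", 1000, 610000),
    ("1384974682548514", "THK", "sell", "USD", 1000, 610000),
    ("1384974682548514", "THK", "sell", "USD", 1000, 610000),  # duplicate row
]
conn = load_unique_trades(rows)
trade_count = conn.execute("SELECT COUNT(*) FROM trades").fetchone()[0]
print(trade_count)
```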

1-45 is interesting when compared to 15. Others mentioned this may be a bug in data interpretation or a GOX account, but if you look at all of the high volume traders you can see horizontal striations that correlate with user15.

What pattern would most likely represent the Winklevoss pattern? Assuming there was trading prior to raising public awareness?

Really cool visualization. I wish we could see similar for retail vs institutional traders in the capital markets.

Great use of small multiples.

Isn't User 15 proof that the entire market is flawed and the users were scammed?

