
Alleged Mt.Gox code leaked on IRC node by Russian Hacker - obilgic
http://pastebin.com/W8B3CGiN
======
Nilzor
Wow. This code is pretty bad. I mean, it's bad for a college project. It's
horrible for a company dealing with large sums of money.

Some random red flags:

\- There's a class with the name of the application. (Issues: Scope, SRP)

\- There's a class with 1708 lines of code. (Scope)

\- There's a switch-case statement that runs over 150 LOC (readability,
maintainability)

\- There's a string parsing function in the same class as transaction
processing (Separation of concerns)

\- There are segments of code commented out (are they not using source
control?)

\- There's inlined SQL (maintainability, security)

\- There's JSON being generated manually & inline (SoC, DRY)

\- There's XML being generated manually & inline (SoC, DRY)

\- To sum up function __Route_getStats($path)_ : XML production, JSON
production, file writing, business logic, SQL commands, HTTP header fiddling,
hard coded paging limits, multiple exit points...

The amount of refactoring needed here to bring this code up to acceptable
quality is simply staggering.

~~~
easy_rider
for an MVP this is fine :D Seeing how long mtgox has been around this is
absurd. 1700+ lines would spin my head pretty fast.. And inline SQL is non-
forgivable.

~~~
davidw
> And inline SQL is non-forgivable

What do PHP folk typically do for a big, hairy query that isn't really doable
via an ORM? Not that this is the case here, just curious. I haven't touched
PHP for years.

~~~
easy_rider
Oops, "inline SQL" is probably a wrong way of putting it. Actually, even in
Symfony2 which has an ORM (Doctrine), I sometimes find myself writing SQL
queries in code because Doctrine can be so inefficient in some situations.
Writing pure SQL makes your code less portable (i.e. tied to a specific
database, in this case MySQL). Now I don't think its bad at all to have
somewhat complex group/(sub)queries written this way if portability is not a
big concern, and if the number of those queries is not as big, it is something
you just need to be aware of.

What we're actually talking about here which is really bad is string
concatenation without parameterization.

Meaning, it seems the code is not using a database driver. PDO has been the
accepted way of doing this. You pass the parameters to PDO separately from the
query for sanitation. We can all see why this is bad:

    
    
         if (isset($_GET['limit'])) {
             $limit = (int)$_GET['limit'];
             if ($limit < 1) $limit = 1;
             if ($limit > 10000) $limit = 10000;
          }
     $req = 'SELECT * FROM `Money_Bitcoin_Node` WHERE `Status` = \'up\' AND `Last_Checked` > DATE_SUB(NOW(), INTERVAL 6 HOUR) AND `Version` >= 31500 AND (`Last_Down` IS NULL OR `Last_Down` < DATE_SUB(NOW(), INTERVAL 2 WEEK)) AND `First_Seen` < DATE_SUB(NOW(), INTERVAL 2 WEEK) ORDER BY RAND() LIMIT '.$limit;

~~~
Nilzor
Exactly. Parameterization is #1 priority. Second you _could_ also move the
statement string itself to a separate resource (disk file), which helps for
maintainability.

so... the statement above refactored to slightly better pseudo-code:

    
    
        $limit = getLimitParameter();
        $sqlString = SqlResourceLoader::Load("BitCoinNodeSelect.sql");
        $statement = $db->prepare($sqlString);
        $statement->bindValue(1, $limit, PDO::PARAM_INT);
        $statement->execute();
    

..or the query with the SQL along with parameter parsing logic could be
contained in a separate class altogether.

~~~
emilsedgh
Excuse me for asking, how saving the SQL query in a file helps
maintainability?

Storing the query in a file means an additional IO call, right? Isn't the
better approach to wrap the sql statement in a function if it's about to be
reused?

I'm not arguing, I'd like to know the point and learn why saving that into a
file helps maintenance.

~~~
Nilzor
Scenario: You're asked to rewrite SQL for UseCase#14 because it's slow. I'd
argue the developer find it more quickly if it's stored in a separate .SQL
file with a file name related to the use case (discoverability). It would also
separate better in source control history as you'd see when you did SQL
changes and when you did PHP changes. Could be valuable when bug-hunting.
_edit_ : And the obvious: instead of scrolling to line 762, column 18 of PHP
file X, you open file Y on line 1, column 1. It's also an .SQL file, which
tells an IDE to syntax highlight it as SQL and not PHP.

Yes, there's a performance cost. Hard disks are fast, though. And then there's
caching. I always value maintanability higher than performance, as
maintainability often is correlated with number of bugs.

You suggest to wrap the SQL statement in a function, and that might be just as
fine. There's no single best solution here. Just a lot of solutions that are
better than the Bitcoin-class ;-)

------
gnerd
I had a little peak and started seeing a few bugs but then I hit line 1009:

> mail('mark@ookoo.org', 'BLOCK IMPORT ERROR', $e->getMessage()."\n\n".$e);

I haven't done much digging around this issue (so perhaps someone can
enlighten me on this) but I was curious at what was at that domain (since it
seemed the email was going to Mark Karpeles). When I went to the staff page[1]
I could see his company Mutum Sigillum LLC takes care of the administration
for that IRC network. That same staff page used his nickname MagicalTux and
the corresponding page[2] links to his "Professionnal PHP5 certification"[3]
but that lists his name as "Robert Karpeles" and not Mark Karpeles. Does
anyone know why this is? Did he change his name between 2006 and now?

[1] [http://en.wiki.gg.st/wiki/Staff](http://en.wiki.gg.st/wiki/Staff) [2]
[http://en.wiki.gg.st/wiki/MagicalTux](http://en.wiki.gg.st/wiki/MagicalTux)
[3]
[http://www.expertrating.com/transcript.asp?transcriptid=1005...](http://www.expertrating.com/transcript.asp?transcriptid=1005930&submit=Submit)

Edit: After more digging and looking at his test scores and code, I have come
to the conclusion that this guy's skill level is not what he thinks it is.
Here is my favourite:
[http://code.ohloh.net/file?fid=it28K0-Hdeyw2F2XDguAEU0ZSKI&c...](http://code.ohloh.net/file?fid=it28K0-Hdeyw2F2XDguAEU0ZSKI&cid=lrhV9kzrbr8&s=)

Notice his fix to stop local file inclusion vulnerabilities in PHP, apparently
he has never heard of a null byte.

~~~
groovylick
Plus just about 200 lines later there is a completely different email address
to notify of a failed ssh connection.

>mail('mark@tibanne.com,luke+eligius@dashjr.org', 'SSH connection to
freetxn@'.$ip.' failed', 'Used ssh key 14a70b11-5f36-4890-82ca-5de820882c7f,
but couldn\'t login to push those txs:'."\n".implode("\n", $el_todo));

------
TazeTSchnitzel
Ooh, floating-point arithmetic in software making financial transactions!

~~~
JumpCrisscross
Pardon my naïveté - why would floating point arithmetic be a red flag in
transactional code?

~~~
route66
It's less about the transactions but the money.

    
    
        >> 3 * 1005.01
        => 3015.0299999999997

This might get truncated to 3015.02 and might

    
    
        >> 3 * 1005.01 * 10000
        => 30150299.999999996

turn into a real loss.

~~~
brianpgordon
Like Superman 3!

------
ojr
It is impressive that PHP ran the largest Bitcoin exchange for a while and the
main flaw happened to be a software hack than its ability to scale. If you
think language X would be more secure and saved Mt Gox, the Russians would
like to have a word with you

~~~
gnerd
PHP is a tool and as any tool with a low learning curve it can be used by
people with or without expertise.

I think we certainly have had PHP built infrastructure that scales, but surely
you can see why an order matching system should not be written in PHP.

First of all, by doing it that way, the order matching system was coupled to
the website, so now it makes perfect sense why the BitCoin price crashed after
the DDoS attacks on MtGox. Because taking the website down meant taking the
order matching down with it. No more trading.

From what I am seeing, this is not a case of PHP being evil (although, would
you really run mission critical systems with PHP? The execution model doesn't
make sense in that world and if you think a set_time_limit(0) on a PHP script
is the same as an actual daemon written in a robust language meant for that
execution model, then I think we are in extreme disagreement).

For me this is a case of a guy who's confidence was ahead of the reality. I'm
sure in his mind a pacemaker running on PHP code is perfectly fine, and
perhaps it might actually work for a while but that's just it, it will fail
eventually (its a square peg in a round hole after all) and when it does it
will be bad. We don't craft critical systems thinking of the best case
scenario, we do it thinking of the worst and for my money, whatever happened
at MtGox, its the worst case scenario.

PHP in Facebook, yes, NYSE? FUCK NO.

~~~
girvo
I love PHP. Yes, you read that right. And, I completely agree with you.

Also, for the GP, PHP's shared-nothing architecture means scaling horizontally
is actually pretty easy all things considered. I just wouldn't write a bitcoin
exchange in anything that doesn't have static typing: PHPs gotchas could quite
literally cause a massive loss of money.

~~~
ramchip
A matching engine can't scale horizontally very much in the first place

------
smutticus
Am I to believe that MtGox code that actually touched Bitcoin transactions was
actually written in PHP? Is this some sick joke?

I have used PHP on and off for years for many things. I'm not a PHP hater, but
if it's not touching HTTP, it probably shouldn't be written in PHP.

~~~
Aqueous
There's nothing intrinsically wrong with that if you do it correctly. But this
was not done correctly.

------
DonGateley
Looking at that code makes me yearn for Cobol.

------
sasas
The the leaked IRC chat logs [1] the leaker stated he was Serbian, not
Russian. Is there additional information available?

[1]
[http://www.reddit.com/r/Bitcoin/comments/1x93tf/some_irc_cha...](http://www.reddit.com/r/Bitcoin/comments/1x93tf/some_irc_chatter_about_what_is_going_on_at_mtgox/cf99yac)

------
seunosewa
I don't read or write PHP but the code looks quite neat and well-organized to
me...

~~~
McGlockenshire
It's organized, but not well built.

Previous comments:
[https://news.ycombinator.com/item?id=7332372](https://news.ycombinator.com/item?id=7332372)

------
cognivore
Apologies in advance to the PHPers out there, but... PHP?!

~~~
TazeTSchnitzel
MagicalTux is a known lunatic who wants to write everything in PHP. Which
isn't too bad of an idea, but he, for example, hacked together an SSHd in PHP
and immediately put it into production use:

[http://blog.magicaltux.net/2010/06/27/php-can-do-anything-
wh...](http://blog.magicaltux.net/2010/06/27/php-can-do-anything-what-about-
some-ssh/)

Essentially he is a bad programmer and he has Not Invented Here (fixed, thanks
jey) syndrome, both of which are _terrible_ attributes for someone coding a
Bitcoin exchange.

~~~
jey
I think you meant NIH ("Not Invented Here"), not NIMBY ("Not In My Backyard").

~~~
TazeTSchnitzel
Oops, yes I did indeed mean Not Invented Here syndrome. Thanks.

------
dschiptsov
PHP...

What a farce.

------
tobehonest
Now that it has been shown the NSA to offensive attacks to discredit anti-
government forces, could any of this mess be attributed to PSYOPS to discredit
the whole cryptocurrencies?

