
Show HN: An educational blockchain implementation in Python - jre
https://github.com/julienr/ipynb_playground/blob/master/bitcoin/dumbcoin/dumbcoin.ipynb
======
magnat
> It is NOT secure neither a real blockchain and you should NOT use this for
> anything else than educational purposes.

It would be nice if the non-secure parts of the implementation or design were
clearly marked.

What's the point of an educational article if bad examples aren't clearly marked
as bad? If MD5 usage is the only issue, the author could easily replace it with
SHA and get rid of the warning at the start. If there are other issues, how
can a reader know which parts to trust?

Even if fixing the bad/insecure parts is "left as an exercise for the reader",
the learning value of the article would be much greater if those parts were at
least pointed out.

~~~
jre
OP here.

erikb is spot on in the sibling comment. This hasn't been expert-reviewed,
hasn't been audited so I'm pretty confident there is a bug somewhere that I
don't know about.

It's educational in the sense that I tried as best as I could to implement the
various algorithmic parts (mining, validating blocks & transactions, etc.).

I originally used MD5 because I thought I would do more exploration regarding
difficulty and MD5 is faster to compute than SHA. In the end, I didn't do that
exploration, so I could easily replace MD5 with SHA. I'll update the notebook
to use SHA, but I'm still not gonna remove the warning :)
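
For readers unfamiliar with what "difficulty" means here, a minimal proof-of-work sketch follows. It is not the notebook's code: `mine`, `header`, `difficulty_bits` and `max_nonce` are hypothetical names, and SHA-256 is used in place of MD5.

```python
import hashlib


def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Search for a nonce whose hash is below the target.

    Returns (nonce, digest), or None if no nonce below max_nonce works.
    """
    # A hash counts as valid when, read as a 256-bit integer, it is below
    # the target. Each extra difficulty bit halves the target.
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest
    return None  # bounded search, so the loop always terminates


result = mine(b"example block header", difficulty_bits=12)
```

Each additional bit roughly doubles the expected number of hashes, which is why a faster hash function makes difficulty experiments cheaper to run.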

I'll also try to point out more explicitly which parts I _think_ are not
secure.

~~~
magnat
> I'll also try to point out more explicitly which parts I think are not
> secure.

Things I've noticed:

* Use of floating point arithmetic.

* Non-reproducible serialization in verify_transaction can produce slightly different, but equivalent JSON, which leads to rejecting transactions if produced JSON is platform-dependent (e.g. CRLFs, spaces vs tabs).

* Miners can perform DoS by creating a pair of blocks referencing each other (recursive call in verify_block is made before any sanity checks or hash checks, so they can modify block's ancestor without worrying about changing its hash).

* mine method can loop forever due to integer overflow.

* Miners can put a transaction in a block whose output sum is greater than its input sum - the only place where this is checked is compute_fee, and no path from verify_block leads there.
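
Two of the points above (floating point amounts and the unchecked output sum) can be illustrated with a short sketch. It assumes amounts are stored as integers in the smallest currency unit; `outputs_covered` is a hypothetical helper, not a function from the notebook.

```python
# Binary floats cannot represent most decimal fractions exactly,
# so sums of amounts can drift from the expected value.
float_sum_matches = (0.1 + 0.2 == 0.3)  # False with IEEE 754 doubles


def outputs_covered(input_amounts, output_amounts):
    """Return True if the outputs do not spend more than the inputs provide.

    Amounts are assumed to be non-negative integers (smallest unit).
    """
    amounts = list(input_amounts) + list(output_amounts)
    if any(amount < 0 for amount in amounts):
        return False  # a negative output could hide an overspend
    return sum(output_amounts) <= sum(input_amounts)
```

A check like this would need to run on the block verification path itself, not only when computing fees.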

~~~
jre
Those are all very good points I didn't think about, thanks for these.

I'll fix the two bugs with verify_block and the possibility for a miner to
inject an invalid output > input transaction.

I'll add a note for the 3 others.

~~~
westurner
For deterministic serialization (~canonicalization), you can use
sort_keys=True or serialize OrderedDicts. For deserialization, you'd need
object_pairs_hook=collections.OrderedDict.
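
A minimal sketch of the sort_keys approach, using only the standard `json` module. `canonical_json` is a hypothetical helper name, the transaction fields are made up, and the compact `separators` are an extra assumption to remove whitespace differences; this is not the notebook's code.

```python
import json
from collections import OrderedDict


def canonical_json(obj) -> bytes:
    """Serialize obj to bytes that do not depend on key insertion order."""
    # sort_keys fixes the key order; separators drops optional whitespace.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Same content, different insertion order (hypothetical fields).
tx_a = {"sender": "alice", "receiver": "bob", "amount": 5}
tx_b = {"amount": 5, "receiver": "bob", "sender": "alice"}
same_bytes = canonical_json(tx_a) == canonical_json(tx_b)

# If the original key order must be kept on load instead,
# object_pairs_hook preserves it.
loaded = json.loads('{"b": 1, "a": 2}', object_pairs_hook=OrderedDict)
loaded_keys = list(loaded)
```

Signing or hashing the bytes returned by such a helper, rather than whatever string a given platform happens to produce, is what makes verification reproducible.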

Most current blockchains sign a binary representation with fixed length
fields. In terms of JSON, JSON-LD is for graphs and it can be canonicalized.
Blockcerts and Chainpoint are JSON-LD specs:

> Blockcerts uses the Verifiable Claims MerkleProof2017 signature format,
> which is based on Chainpoint 2.0.

[https://github.com/blockchain-certificates/cert-verifier-js/...](https://github.com/blockchain-certificates/cert-verifier-js/blob/master/README.md#check-certificate-integrity)

~~~
Cyph0n
FYI, dicts are now ordered by default as of Python 3.6.

~~~
cpburns2009
That's an implementation detail, and shouldn't be relied upon. If you want an
ordered dictionary, you should use collections.OrderedDict.

~~~
westurner
It's now the spec for 3.7+ (in 3.6 it was a CPython implementation detail).

> #python news: @gvanrossum just pronounced that dicts are now guaranteed to
> retain insertion order. This is the end of a long journey.

[https://twitter.com/raymondh/status/941709626545864704](https://twitter.com/raymondh/status/941709626545864704)

More here:
[https://www.reddit.com/r/Python/comments/7jyluw/dict_knownor...](https://www.reddit.com/r/Python/comments/7jyluw/dict_knownordered_versus_ordereddict_an/)

OrderedDicts are backwards-compatible and are guaranteed to maintain order
after deletion.
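
A small sketch of how the two types behave, assuming Python 3.7 or later for the plain dict guarantee. One difference that still matters is that OrderedDict equality is order-sensitive, while dict equality is not.

```python
from collections import OrderedDict

# Deleting and re-inserting a key moves it to the end in both types.
plain = {"b": 1, "a": 2, "c": 3}
del plain["a"]
plain["a"] = 4
plain_keys = list(plain)

ordered = OrderedDict([("b", 1), ("a", 2), ("c", 3)])
del ordered["a"]
ordered["a"] = 4
ordered_keys = list(ordered)

# Equality: plain dicts ignore order, OrderedDicts compare it.
dicts_equal = {"x": 1, "y": 2} == {"y": 2, "x": 1}
ordered_equal = OrderedDict(x=1, y=2) == OrderedDict(y=2, x=1)
```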

------
h4l0
For the last couple of months, there have been many simple, educational
implementations that explain blockchain technology, I guess thanks to the crypto
bubble. I wish these had been around when we were doing our senior design project
on blockchains in 2014. Back then, I could only find basiccoin[0], which was
minified purely to fit in 1000 LOC.

After that, I decided to re-implement everything from scratch. My foremost
constraint was to write readable code so that anyone could read the codebase
and have an idea of how blockchain works.

My current draft implementation can be found at
[https://github.com/halilozercan/halocoin](https://github.com/halilozercan/halocoin),
which currently lacks a detailed README and documentation. However, you can
still experiment with it using the API or CLI. I'm running a dedicated server
to provide an always-online peer you can connect to.

[0]: [https://github.com/zack-bitcoin/basiccoin](https://github.com/zack-bitcoin/basiccoin)

Edit: a word

------
westurner
Thanks! Simplest explanation I've seen.

Here's an nbviewer link (which, like base58, works on/over a phone):
[https://nbviewer.jupyter.org/github/julienr/ipynb_playground...](https://nbviewer.jupyter.org/github/julienr/ipynb_playground/blob/master/bitcoin/dumbcoin/dumbcoin.ipynb)

Note that Bitcoin does two rounds of SHA256 rather than one round of MD5.
There's also a "P2P DHT" (peer-to-peer distributed hash table) for storing and
retrieving blocks from the blockchain; instead of traditional database multi-
master replication and secured offline backups.
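
A minimal sketch of the double hashing mentioned above, using the standard `hashlib` module. `double_sha256` is a hypothetical helper name, and the input bytes are a placeholder rather than a real Bitcoin block header.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice: hash the raw digest of the first round."""
    first_round = hashlib.sha256(data).digest()
    return hashlib.sha256(first_round).digest()


block_hash = double_sha256(b"placeholder header bytes")
block_hash_hex = block_hash.hex()
```

Note that the second round hashes the 32 raw digest bytes, not the hex string of the first digest.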

> ERROR:root:Invalid transaction signature, trying to spend someone else's
> money ?

This could be more specific. Where would these types of error messages log to?

~~~
jre
Thanks for the clarification regarding the hash.

Regarding the errors, they are logged when verify_block/verify_transaction returns
False, just to be a bit more explicit about what failed. In a real
implementation, I guess you would throw exceptions instead (or use some Result
pattern), but I tried and it cluttered the code quite a bit, so I went back to
logging.
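
For comparison, a minimal sketch of what the exception-based variant could look like. `InvalidTransaction` and `check_signature` are hypothetical names, and the boolean argument stands in for a real signature check; none of this is taken from the notebook.

```python
class InvalidTransaction(Exception):
    """Raised when a transaction fails validation (hypothetical name)."""


def check_signature(signature_ok: bool, tx_id: str) -> None:
    """Raise with a specific message instead of logging and returning False."""
    if not signature_ok:
        raise InvalidTransaction(
            f"transaction {tx_id}: signature does not match the sender's public key"
        )


try:
    check_signature(False, "tx-42")
except InvalidTransaction as exc:
    # The caller decides whether to log, reject the block, or re-raise.
    message = str(exc)
```

The trade-off is the one described above: the message can name the failing transaction, but every caller needs a try/except around validation.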

------
geraldbauer
FYI: Great blockchain (from scratch) starter article. At the Awesome
Blockchains page [1] I collect starter blockchains and articles (in Python,
Ruby, JavaScript, etc.). The idea is that the best way to learn about
blockchains is to do it yourself: build your own blockchains from scratch.
Great example. Keep it up. Cheers.

[1] [https://github.com/openblockchains/awesome-blockchains](https://github.com/openblockchains/awesome-blockchains)

~~~
jre
Thanks! The awesome-blockchains list is a great resource, thanks for sharing.

------
heynk
This is great. Just last weekend I did the same thing, coding a very basic
blockchain in Python for educational purposes. You tackled wallets, which I
didn't get to yet, so that was really helpful.

I'm still a little unsure around exactly how miners and nodes communicate with
each other. Especially things like broadcasting transactions and new blocks.
Any good resources for that?

~~~
jre
Thanks! I've completely left out communication from this because it wouldn't
fit in the notebook and I haven't researched it. I would also appreciate it if
anybody has good resources on it.

------
ivan_ah
Very good writeup that shows all the steps.

Note it uses an MD5 hash instead of SHA256, so it's not exactly Bitcoin. I wonder
how much more work it would be to make the code fully implement Bitcoin. Would it
still be readable? Or Ethereum? It would be of great value for understanding, even
if Python would be inefficient to run in prod.

~~~
h4l0
Fully implementing the Bitcoin white paper would require a moderate amount of
work if you do not consider every little detail related to the security of your
client code. However, the current Bitcoin protocol has added many features, such
as scripting.

I think one can maintain code readability in a Python implementation, but
documentation is the key here. The developer needs to clearly state the objective
of each function.

For Ethereum, you need one external element: the Ethereum Virtual Machine.
Smart contracts are basically bytecode that runs on the EVM. Without it, the
blockchain cannot function. So Ethereum development may require extra
knowledge on top of blockchain technology.

~~~
Gargoyle
I've been ignoring bitcoin for a while. Do the devs still maintain that the
official implementation is the only implementation possible since it contains
quirks the spec doesn't specify?

~~~
pmorici
Yup, though a group got fed up with that and forked to create Bitcoin Cash
which has a spec and multiple teams and implementations. Much healthier
situation.

------
vinn124
nice walkthrough.

also, not sure why folks are nitpicking about minor things like security
disclaimers, number of sha256 hashes, md5, etc. while ignoring nontrivial gaps
(eg no merkle dags, one of the cornerstone concepts).

~~~
jre
Thanks.

You're right about the Merkle tree. This is a whole section of the bitcoin paper
and it's pretty important. But as far as I understand, it's "only" an
optimization to save disk space, so it doesn't change the underlying logic.
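
For reference, a minimal sketch of computing a Merkle root over a list of transactions. It follows the Bitcoin convention of double SHA-256 and of duplicating the last hash when a level has an odd number of entries, but it ignores Bitcoin's byte-order details; `merkle_root` and `double_sha256` are hypothetical helper names, and the leaves are placeholder bytes.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves):
    """Hash the leaves pairwise, level by level, until one hash remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [double_sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


root = merkle_root([b"tx-a", b"tx-b", b"tx-c"])
tampered_root = merkle_root([b"tx-a", b"tx-b", b"tx-X"])
```

Because the block header only needs to commit to the root, changing any transaction changes the root, and transaction bodies can be dropped later while their hashes remain verifiable.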

------
jobeirne
Something from a few months back that's a bit closer to Bitcoin's actual
implementation but at a similar level of readability:
[https://github.com/jamesob/tinychain](https://github.com/jamesob/tinychain)

------
brodo
Cool project. I think having educational * implementations in Python is great
in general. Better than having just an article about it...

------
antman
So is there a non-educational implementation of blockchain that someone could
use/parametrize/configure? Something that is not educational but more
industrial strength? What must someone do to convert an educational
implementation to a serious one?

------
satellitec4t
Why is this all contained in one big json document? Am I that out of touch
with modern coding??

~~~
theuly
What the author wrote is an IPython / Jupyter notebook [0], which is
implemented as a JSON document [1]. The notebooks are pretty much markdown
with runnable code blocks in between.

Jupyter notebooks are popular in the data science / Python communities to
explain concepts; for examples, see [2].

[0] [https://jupyter.org/](https://jupyter.org/)

[1] [https://ipython.org/ipython-doc/3/notebook/nbformat.html](https://ipython.org/ipython-doc/3/notebook/nbformat.html)

[2] [https://github.com/jupyter/jupyter/wiki/A-gallery-of-interes...](https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks)

------
partycoder
Educational implementation, brilliant. Now some startup will give it a name
and make an ICO.

