Building a tiny blockchain (medium.com)
187 points by nitramm on July 18, 2017 | 57 comments



This worries me. Bitcoin solved the double spend problem by using the LCR with PoW. That is the novel aspect which allowed cryptocurrencies to come into existence.

If you remove the LCR and PoW, all you are left with is a toy example which cannot work in practice.

What is needed is education about consensus design and bitcoin's implementation of the solution, rather than a 'how to guide' for building a basic linked list under the guise of 'blockchain'.

Cheers, Paul.


As for proof of work, am I right in assuming it works like this :-

Add a field to store a random number. Keep generating values for this field until the hash of the structure has 'n' leading zeros.

Is that more or less correct?


Yes, that's more or less it.

It's worth adding to that description that the hash must be secure in that it does not allow determining what random number to use to get a particular output (I'm sure you already recognized this detail - it just guarantees you have to keep trying actually random numbers to find a result).
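For anyone who wants to see the loop described above, here is a minimal sketch. It assumes a block is a plain dict hashed via canonical JSON, and it counts the nonce upward instead of drawing random numbers (with a secure hash either works). The names `block_hash` and `mine` are made up for illustration and are not from the article.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block (an assumed format)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def mine(block: dict, n_zeros: int) -> dict:
    """Try nonces until the hex digest starts with n_zeros zero digits."""
    prefix = "0" * n_zeros
    candidate = dict(block, nonce=0)  # copy the block and add the nonce field
    while not block_hash(candidate).startswith(prefix):
        candidate["nonce"] += 1
    return candidate


# Three hex zeros take roughly 16**3 = 4096 attempts on average.
mined = mine({"index": 1, "data": "hello", "prev_hash": "00" * 32}, n_zeros=3)
print(mined["nonce"], block_hash(mined))
```

Each extra hex zero multiplies the expected number of attempts by 16, which is why 'n' works as a difficulty knob.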

Also worth adding is that the 'n' (basically the difficulty) is generally going to need some way of scaling based on how quickly correct hashes were found in the past. This is important to prevent the problems that would arise if people could generate correct results too quickly (or even just significantly faster than previously).

In particular, Bitcoin's difficulty factor scales such that a new block is found approx. every 10 minutes, so 'n' is adjusted every so many blocks depending on how fast the hashes were found. Because in the case of Bitcoin a found block means more coins in the market, it is important that when more hashing power is added, it doesn't result in blocks now being found 1 minute apart. If that were to happen, then you'd have 10 times the number of coins entering the market.


And this scaling of the work factor - is that tied to timestamps in the previous blocks? (I.e., the work factor is something like max(average(time_blockN+1 - time_blockN)))?

[ed: from other comments, I gather that "passage of time" means "added blocks (hashes)" - but I'm still not clear on where the agreement on what constitutes "in 10 minutes on average" comes from. If A sees an addition to the blockchain of N blocks with X proof-of-work from B (let's call it N * X), and sees M blocks with Y proof-of-work from C, it should be easy to figure out which is bigger of N * X and M * Y (here we don't really assume multiplication, but some way to figure out maximum proof-of-work).

But how does A know how long either took? Is the time local, so that A looks at the head of the B and C chains, and considers when A saw this head, and figure the elapsed time based on that?]


Yes, but it is not adjusted per-block, but every 2016 blocks (which is 2 weeks' time if each block takes 10 minutes to find).
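A rough sketch of that periodic adjustment, assuming difficulty is expressed as a numeric target where a lower target means more work. The constant names and the `retarget` function are made up for illustration. The factor-of-4 clamp mirrors Bitcoin's limit on a single adjustment; real clients handle further details that are left out here.

```python
RETARGET_INTERVAL = 2016                                 # blocks between adjustments
TARGET_SPACING = 10 * 60                                 # desired seconds per block
EXPECTED_TIMESPAN = RETARGET_INTERVAL * TARGET_SPACING   # two weeks, in seconds


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last 2016 blocks actually took."""
    # Clamp so a single adjustment changes the target by at most a factor of 4.
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    # Blocks found too quickly -> smaller target -> harder proof of work.
    return old_target * clamped // EXPECTED_TIMESPAN


# Blocks arrived twice as fast as intended, so the target is halved.
print(retarget(1000, EXPECTED_TIMESPAN // 2))
```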

It's also worth keeping in mind the problem that timestamps are technically forgeable by the miner. Bitcoin doesn't really solve this problem, but it does require that the timestamp of a block be greater than the median of the past 11 blocks (with various other constraints). Thus, if mining is sufficiently distributed, a single miner submitting the most misleading timestamp that would still be accepted won't affect things long-term, since they can't generate blocks fast enough to heavily affect the median. Still technically an attack vector though.
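To make the timestamp rule concrete, here is a small sketch of a check against the median of the previous 11 block times. The function name `timestamp_acceptable` is hypothetical, timestamps are plain integers, and the other constraints mentioned above are deliberately left out.

```python
from statistics import median


def timestamp_acceptable(new_time: int, previous_times: list[int], window: int = 11) -> bool:
    """Accept a block time only if it is later than the median of recent blocks."""
    recent = previous_times[-window:]
    return new_time > median(recent)


# Eleven earlier blocks with times 100, 200, ..., 1100; their median is 600.
times = list(range(100, 1200, 100))
print(timestamp_acceptable(700, times))  # backdated before the last block, still accepted
print(timestamp_acceptable(600, times))  # not later than the median, rejected
```

The first call shows the leeway a miner has: a time earlier than the previous block can still pass, which is why this limits forged timestamps without eliminating them.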


To respond to your edit (Which I don't think I quite explained):

In that situation, A doesn't care how long it actually took to find those blocks; the scaling of the difficulty/work-factor handles that. A can rely on the fact that (unless someone introduced or removed a significant chunk of hashing power, or broke the hashing algorithm) the current network difficulty represents the amount of work needed to represent approximately '10 minutes'. PoW below that difficulty isn't accepted, and above that would be considered proving more work than required (but is also, of course, harder to find. So it proves more, but should take longer than 10 minutes for the network to find).

Since Bitcoin calculates the difficulty to result in about 10 minutes between blocks (as I explained in my last comment), that's approximately how long it should take. It's possible B or C got lucky and found it faster than that, but statistically that is unlikely since the hash result is completely random. If significantly more hashing power is added to the network and the blocks are consistently found faster than the target time, then when the difficulty is recalculated later on (after the 2016 blocks are done) it will go up, making them again take about 10 minutes with the new hashing power taken into account.


Again, that's how it works, but not why it works. In bitcoin it is used as an unforgeable proxy for elapsed time.


>"In bitcoin it is used as an unforgeable proxy for elapsed time"

This sounds pretty interesting, can you elaborate on how exactly it serves as a proxy for the passage of time?


In a p2p environment, timestamps are totally useless at proving when I sent you bitcoins. However, when I sent you bitcoins is fundamentally important when it comes to ordering transactions, because I can send you bitcoins, but also send them to myself after you accept them. If I can forge timestamps then this attack is trivial to pull off.

You need an analogue to elapsed time to solve this problem. Hashing in bitcoin is probabilistic: there is an expected number of hashes required to solve the PoW (on average). Each hash takes an amount of time that cannot be made faster, so you suddenly have an unforgeable, easily provable analogue to elapsed time.


That makes perfect sense. Thanks.


What does LCR stand for? Thank you


Longest Chain Rule, that the network honors the longest chain as the correct chain.


Bitcoin has moved away from the Longest Chain Rule to the blockchain with the most cumulative Proof of Work as what the nodes recognize as valid blockchain.


I hear a lot about hash rates. How are aggregate hash rates confirmed/reported? Does a miner submit every single attempted hash to the network as a potential solution or do miners self-report their total hash rates?


It is not necessary for miners to report any stats. Hash rate is simply inferred from the current difficulty divided by the average time taken to find a block.
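As a back-of-the-envelope sketch of that inference: in Bitcoin, difficulty 1 corresponds to about 2**32 hashes per block on average, so the expected hashes per block divided by the observed block time gives an estimated rate. The function name and the sample difficulty below are made up for illustration.

```python
def estimated_hash_rate(difficulty: float, avg_block_seconds: float) -> float:
    """Approximate network hashes per second.

    Assumes a block needs about difficulty * 2**32 hashes on average.
    """
    return difficulty * 2**32 / avg_block_seconds


# Illustrative numbers only: a made-up difficulty with 10-minute blocks.
print(f"{estimated_hash_rate(1_000_000, 600):.3e} hashes/second")
```

The estimate is noisy over short windows because block discovery is random, so it is usually averaged over many blocks.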


Could you explain the difference? Isn't the longest chain the one with the most cumulative proof of work?


Every block hash has a target it is trying to be under. For example, if the target is "0000008dab3", then when you are hashing your block, you need to come up with a hash that is below that number. When the network's hash rate goes up, people are producing hashes quicker, so a hash below that target is found in well under 10 minutes (the interval the network tries to average). If blocks are found in less than 10 minutes for too long, then the target gets lowered even further, say to "00000000ab3", to require everyone to produce more hashes so it takes longer.

Now to actually answer your question: if you take 2 bitcoin blockchains that stem from the same origin block, but blockchainA has 20 blocks with a very easy target "fffffffdab3" and blockchainB has 1 block with a very difficult target "00000000003", then blockchainB has actually done more work than blockchainA, even though it has fewer blocks (it is "shorter"). So blockchainB has the most cumulative proof of work.
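The comparison above can be sketched numerically. This assumes 256-bit hashes and estimates the work behind one block as the expected number of hashes needed to land at or below its target. The two targets are made-up values, not real Bitcoin targets, and `block_work` and `chain_work` are illustrative names.

```python
def block_work(target: int) -> int:
    """Expected number of hashes to find a 256-bit hash at or below `target`."""
    return 2**256 // (target + 1)


def chain_work(targets: list[int]) -> int:
    """Cumulative work of a chain, given the target of each of its blocks."""
    return sum(block_work(t) for t in targets)


easy_target = 2**255   # roughly every second hash qualifies
hard_target = 2**200   # only a tiny fraction of hashes qualify

chain_a = [easy_target] * 20   # 20 blocks, very little work each
chain_b = [hard_target]        # 1 block, a lot of work

# Pick the chain with the most cumulative work, not the most blocks.
best = max([chain_a, chain_b], key=chain_work)
print(len(best), chain_work(chain_a), chain_work(chain_b))
```

A node that picked by block count would follow `chain_a`; picking by summed work selects `chain_b`.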


Because the "difficulty" rate (essentially the number of leading 0's needed for the proof of work) can change dynamically, the longest chain is not always the chain with the most work.


Why would that make it more secure? Couldn't I simply make a lot of fraudulent transactions and append them, having a long chain, but a false one?


You could, but you'd be competing with everyone else that's also making transactions so you'd need an incredible amount of computational power to outstrip that.


And if you had that much hashing power, you'd be better off using it for mining. It's explained in more detail in the (surprisingly accessible) bitcoin whitepaper.


Looks like "longest chain rule".


Correct me if I'm wrong but the article is about elucidating the structure of a blockchain database - not a practical secure cryptocurrency.


You are correct. However, this field is so new and so potentially disruptive that there are a lot of so-called 'blockchain developers' inventing new cryptocurrencies which are unworkable, or essentially no different to VISA in terms of trust, who are convincing unwitting investors to part with large sums of cash because they don't understand the basics of what a cryptocurrency is. We don't need more of them. We need understanding.


Regardless of the semantic discussion on what exactly a 'blockchain' is and whether or not this qualifies, I found the article interesting by taking a tiny piece of the whole monster and demystifying it a bit.

I don't see this any differently than a tutorial that makes a static image move around the screen with key presses. Is it a full game? Not at all. But can it teach you how to build a tiny piece of it, and you can look at other tutorials to figure out how to build other tiny pieces of it, and eventually synthesize that knowledge and make a full game? Yes.


Hey you got some link for other tutorials to go from here?


The clever part about bitcoin is reusing existing technologies in a novel way.

It uses:

- Hashcash-style proof of work: http://www.hashcash.org/ (originally designed to make spamming harder, although I don't think it ever really became widely adopted).

- Merkle trees to reference transactions in the blocks (same technology used for BitTorrent chunks): https://en.wikipedia.org/wiki/Merkle_tree

- Elliptic curve cryptography for signing transactions: https://en.wikipedia.org/wiki/Elliptic_curve_cryptography (asymmetric cryptography, used in TLS, SSH etc...)

- When a new "node" connects to the network it uses hardcoded addresses and DNS seeds to find other nodes to connect to: https://bitcoin.org/en/glossary/dns-seed (before that it used IRC)

And there are probably a few other things but basically if you know how to work with those technologies then you have all the tools you need to implement a bitcoin clone. It's nothing groundbreaking, it's "just" a clever novel use of existing technology.
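To give a feel for one of the building blocks listed above, here is a simplified Merkle root sketch. It uses double SHA-256 and duplicates the last hash when a level has an odd number of entries, which follows Bitcoin's general approach, but byte ordering and transaction serialization are ignored. The leaves are arbitrary byte strings, not real transactions.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash the leaves pairwise, level by level, until one hash remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256d(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"tx-a", b"tx-b", b"tx-c"])
print(root.hex())
```

Changing any single leaf changes the root, so a block header only needs to carry the root to commit to every transaction.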

All the actual "rules" of the network (the block reward etc...) are simply validated in the nodes, they'll reject any block that's not properly crafted (i.e. bad reward, invalid proof of work etc...). Therefore the node consensus dictates what bitcoin is. If you implement a non-compatible change in the client and you don't convince all the other users to switch you create a fork since you won't have the same definition of what constitutes a valid block.


I haven't encountered any myself, no. But I haven't really been looking either.


So ah... where's the part where you implement a working blockchain, again? There's no proof-of-work, consensus logic, currency logic, accounting logic, peer-to-peer networking code, etc.

Instead this is just a list of hashed-linked documents... Not that much different from a Git repo or a basic file system with integrity checks. Hashing != blockchains.


What? A blockchain isn’t a distributed ledger. A blockchain IS a chain of linked documents - whereas these documents are usually hash trees including the “transactions” per block. They ensure integrity by hashing the previous blocks. The proof of work is just a way to ensure a self regulating growth of virtual currencies and to prevent double spending.


>A blockchain isn’t a distributed ledger. A blockchain IS a chain of linked documents

You are interpreting "blockchain" to be constrained to a "chain of hashes". That is a valid interpretation.

The parent (Uptrenda) is interpreting "blockchain" as an umbrella label for "distributed ledger". This wider definition is also a valid interpretation and this language phenomenon is called "synecdoche"[1].

To point back to the actual article, the author (Gerald Nash) is using "blockchain" in both meanings. On the one hand, he talks about the wider scope of distributed currency... but on the other hand, his Python example is constrained to a "chain of hashes". Since Mr Nash is mentioning the Satoshi bitcoin whitepaper when talking about "blockchain" (the wider meaning), Uptrenda's criticism is reasonable because the most interesting part of distributed-blockchain is the invention of incentives (mining rewards) and social agreement on acceptable hashes (e.g. how many 0000s are counted, which chain is chosen, etc).

As other distributed projects such as IPFS/Diaspora/Sandstorm/etc show, finding the right combination of incentives to create social buy-in and sustainability is the hard part. The hashes are the easy part.

[1] https://en.wikipedia.org/wiki/Synecdoche


That argument is not exactly valid, because the concept of a block chain has been described a long time before it was used in "virtual currencies." [1]

"Block chains" in modern times are very helpful in handling transactions in databases by replacing things like CAS-Numbers [2] with hashes. The hash of the previous block effectively becomes the CAS-Number, with the feature of staying persistent and verifiable.

CAS example: IF X_version IS 200 SET X=2

BC example: IF hash(X) IS 200 ADD document Y WITH Y_parent = hash(X)
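A runnable sketch of the "BC example" above, under the assumption that the chain is an in-memory list of dicts. The helper names `doc_hash` and `append_if_head_matches` are invented for illustration. The append only succeeds if the caller presents the hash of the current head, which plays the role of the expected version in a compare-and-swap.

```python
import hashlib


def doc_hash(doc: dict) -> str:
    """Hash a document via a deterministic text encoding of its items."""
    return hashlib.sha256(repr(sorted(doc.items())).encode("utf-8")).hexdigest()


def append_if_head_matches(chain: list[dict], expected_head_hash: str, payload: str) -> bool:
    """Append a document only if the caller has seen the current head (CAS-like)."""
    if doc_hash(chain[-1]) != expected_head_hash:
        return False  # the head moved since the caller looked; retry with the new head
    chain.append({"parent": expected_head_hash, "payload": payload})
    return True


chain = [{"parent": "", "payload": "genesis"}]
head = doc_hash(chain[-1])
print(append_if_head_matches(chain, head, "Y"))  # head matches, Y is appended
print(append_if_head_matches(chain, head, "Z"))  # stale head hash, rejected
```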

That is, while the article does in fact also mention Bitcoin, I don't see a point in discrediting the article, the code or the author in any way. Personally, I find it far more valuable for engineers to understand the underlying, fundamental concepts of things before going all-in. Sure, he could have discussed all these things, but for people who are new to the topic, that would not have been very helpful in understanding the concept of a block chain.

[1] https://link.springer.com/article/10.1007/BF00196791

[2] https://en.wikipedia.org/wiki/Compare-and-swap


>, because the concept of a block chain has been described a long time before

Citing historical usage and original meanings is not relevant as to why people use synecdoche.

The point is that in today's discourse, the term "blockchain" has already expanded beyond the original literal technically constrained meaning of "chain of hashes". Even if _you_ want to keep the original meaning of "blockchain", you still have to recognize when _others_ are using synecdoche.

Examples in media:

+ https://www.google.com/search?q=bloomberg+blockchain

+ https://www.google.com/search?q=wsj+blockchain

+ https://www.igvita.com/2014/05/05/minimum-viable-block-chain...

To interpret those articles correctly, you have to mentally substitute "blockchain" with "distributed ledger". If you substitute with "chain of hashes", the articles make no sense.

Synecdoche throws original historical usage out the window.

>, but for people who are new to the topic, that would not have been very helpful in understanding the concept of a block chain.

There are many web articles that create a "toy" blockchain by showing _hashes_. The problem is that the abundance of such articles actually hides the real ingenuity of the Satoshi/bitcoin.

Again, the hard part is combining multiple technologies in clever ways that can synchronize the psychology of anonymous people -- aka "decentralized trust". The hashing is the least interesting component of all that. However, since the "hashes" are the most "accessible" part of it (especially to programmers), that's what everybody ends up writing about! This has the unintended side effect of hiding the more groundbreaking ideas (the psychology) of distributed ledgers.


The psychology aspect is a part that's the most impressive. You have incentives -- everybody likes money -- and proof-of-work -- consensus that solves the byzantine generals problem. But if you combine them all to produce a new trustless kind of money then this innovation itself serves like a socially scaleless memetic payload.

It's like what Nick Szabo talks about in his social scalability article. That the design of Bitcoin offers high social scalability, allowing anyone to participate in the system across cultures, languages, customs, and laws. It kind of ties everything together under a single model.

I also agree with you about the "Synecdoche" thing, too. It seems like we're arguing about the same thing. As you say, you can interpret "blockchain" as the chain of blocks that is outputted from the consensus system or as the whole system itself. But I guess in the context of this article it was confusing whether this poster was having a genuine misunderstanding or was just talking about a part of the system.


> That argument is not exactly valid, because the concept of a block chain has been described a long time before it was used in "virtual currencies."

The idea of a "block chain" had a name back then and it still has a name now. No software developer would confuse a hash list with the underpinnings of a virtual currency unless:

A) You're writing a title for an article and want more clicks

B) You're a banking institution that wants to seem "hip"

C) You're working on a centralized database but really need funding so you throw in the word "blockchain"

D) You're a reporter and you have no clue what you're talking about

Show me anyone referring to a "block chain" in a non virtual currency context that doesn't fall into the above categories and I'll reconsider.


I see what you mean. It's amazing how people on Hacker News always know the exact word for a concept no matter how obscure. I learn more from the comments here than I do from the articles.


This is a really twisted take on blockchains. The whole point behind a blockchain is that cryptoeconomic incentives serve as an additional security function which is better than relying on generosity alone to contribute hashing power. So while it is true that a blockchain can exist without any kind of currency - the real genius behind blockchains is how they are designed as a new kind of corporation - one which requires no trust to pay users to create a single, shared view of an ordered list of events.

It's kind of like how in modern cryptography most of our algorithms aren't bullet proof. Cryptographers aren't saying that an algorithm can't be broken, but that it requires so many computational resources to do so that it's simply improbable. We can say that blockchains also include this idea but they add the economic aspect; now it isn't just improbable to break an algorithm (with universe sized computers) -- it's also improbably (and irrationally) expensive.

Your other point is that a blockchain is simply a linked chain of documents. This is incorrect. The whole point behind a blockchain is that it serves as a way to get people to agree on an ordered list of events. The problem was never being able to form that list (anyone can hash a list of documents, it's basic applied crypto.) It was getting a group of strangers to agree on a single result under highly adversarial conditions.

See also: http://unenumerated.blogspot.com.au/2017/02/money-blockchain...


> The whole point behind a blockchain is that cryptoeconomic incentives serve as an additional security function which is better than relying on generosity alone to contribute hashing power.

This has nothing to do with blockchains. What you're describing is proof of work with a payout system, and you could implement that without a blockchain at all.

>So while it is true that a blockchain can exist without any kind of currency

It has always been true. This was, in fact, exclusively the case before the first cryptocurrencies - blockchains are data structures, not a concept exclusive to currency.

>Your other point is that a blockchain is simply a linked chain of documents. This is incorrect.

It is exactly correct. The use of blockchains in cryptocurrencies is irrelevant to what they are. If I build the next great money machine tomorrow using a doubly-linked list, it doesn't change what a doubly-linked list fundamentally is. The fact that many people have been introduced to the idea of a blockchain through cryptocurrencies doesn't alter the basic facts.

>The whole point behind a blockchain is that it serves as a way to get people to agree on an ordered list of events.

This is not what the blockchain does. The blockchain helps to ensure that Proof of Work is effective for maintaining consensus after blocks are made - it's the role of PoW to produce the agreement. A blockchain without PoW would allow members to produce new combinations of blocks at will to edit the history.


> This was, in fact, exclusively the case before the first cryptocurrencies - blockchains are data structures, not a concept exclusive to currency.

What are you talking about? The term "blockchain" didn't even exist before Bitcoin. This seems like a case of trying to rewrite history...


Okay, let's argue that a blockchain is simply a data structure to create a linked list of hashes... Do you know what's part of that data structure as defined in the original paper? The format of a new block hash (how many zeros it has). It's the fingerprint of a block that decides how to extend the chain. So if you simply decide to build a linked chain of blocks without any regard to the format of block hashes (proof-of-work) and the total cumulative hashing power, then it makes no sense at all.

Like I said: you can't just abstract this part away as a "data structure" and still have it mean anything. Without proof-of-work your toy example is literally just a made-up document sitting on a single computer.

>blockchains are data structures, not a concept exclusive to currency.

Blockchains as a linked list of hashes don't make sense. You need proof-of-work to decide on the chain. Proof-of-work is expensive so without the currency, proof-of-work doesn't make sense. There are other ways to agree on the order of events but they are not blockchains as understood by the experts in this space (and removing any one of these things undermines the benefits.)

Journalists and enterprise "blockchains" enthusiasts are free to use the term however they like but it doesn't change the fact that blockchains were introduced as an incentivized, trust-free ledger of value -- and by removing any one of its parts the benefits are still lost.

>This is not what the blockchain does. The blockchain helps to ensure that Proof of Work is effective for maintaining consensus after blocks are made - it's the role of PoW to produce the agreement. A blockchain without PoW would allow members to produce new combinations of blocks at will to edit the history.

Blockchains use proof-of-work to create agreement and pay anonymous contributors for doing the work to do so, it's as simple as that. A "blockchain" without either of these things is not a blockchain. Incentives are needed to improve social scalability and ensure that there is a reason to continue to secure the ledger.

Proof-of-work is used to give the chain its meaning. I could hash a list of documents right now that said I had millions of dollars but this list couldn't be used by anyone else in an untrusting network of computers without a consensus algorithm... It just so happens that proof-of-work is still the most secure way to do that consensus. So you cannot abstract it away as a "data structure" independent from proof-of-work and the incentives that drive it...

The argument you're trying to make about "blockchains" is something I've only seen enterprise blockchain enthusiasts make about ledgers, and these people usually don't understand the system very well so they end up taking parts of the blockchain away and building something that makes little sense in the end (sometimes it's an improvement but usually not.)


It's also missing the logic to choose the best tip, i.e. the logic that decides which fork is the one to follow.


Again, it’s not a distributed ledger. There’s no “fork” unless you want your data structure to do so.


The only real blockchain is CBC mode. (Cipher Block Chaining)

What you're describing is a cryptocurrency, or some other form of cryptographically assured distributed ledger with a proof of work.

If you think I'm wrong, look in the original Bitcoin paper and search for "blockchain".


Clearly there's an association to Bitcoin and there are certain expectations by people when mentioning blockchain.

It's justified since a blockchain on its own is a relatively simple data structure which in itself is not that significant. It's the composition of all the elements that has given the value and recognition to this technology.


Same answer here: the term "blockchain" didn't exist before Bitcoin. It has a very specific meaning involving a particular kind of distributed ledger.


I wanted to try re-writing in Ruby to warm up today. Can someone confirm this is correct? https://gist.github.com/chrisallick/cb196b13555c86f9193f3ec4...


Looks good to me.

You might want to keep track of the blocks as you create them, or you will find it hard to verify the integrity of a block.

(Do I know the hash of the previous block? Do I know about the next block, if it is not the last block?)
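Something along these lines would cover it, sketched in Python (the language of the article) instead of Ruby. The block layout and the names `make_block` and `chain_is_valid` are assumptions for illustration and are not taken from the gist. The point is to keep the blocks in a list so each one can be checked against its predecessor.

```python
import hashlib


def make_block(index: int, data: str, prev_hash: str) -> dict:
    """Build a block whose hash covers its index, data and the previous hash."""
    digest = hashlib.sha256(f"{index}{data}{prev_hash}".encode("utf-8")).hexdigest()
    return {"index": index, "data": data, "prev_hash": prev_hash, "hash": digest}


def chain_is_valid(chain: list[dict]) -> bool:
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        recomputed = make_block(block["index"], block["data"], block["prev_hash"])["hash"]
        if block["hash"] != recomputed:
            return False  # block contents no longer match its stored hash
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False  # link to the previous block is broken
    return True


# Keep every block as it is created so the whole chain can be verified later.
chain = [make_block(0, "genesis", "0")]
for i in range(1, 4):
    chain.append(make_block(i, f"block {i}", chain[-1]["hash"]))

print(chain_is_valid(chain))
```

Editing the data of any stored block makes `chain_is_valid` return False, because the recomputed hash no longer matches the stored one.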


yeah i'm not sure how deep i want to go. i already have a side self-education project but i keep getting sucked into blockchain stuff and find myself reading white papers. i don't know... seems like it could become an obsession.

the thing i really want to try making is take a sinatra/redis/crud app and rebuild the database as a distributed database on a blockchain. or at least i think that's what i want to do.


Regardless of the quality of the article, it's refreshing to see so many good questions and concise answers in this thread (mostly about blockchains-as-distributed-ledgers). It's been a while since I saw such nice, civil and interesting technical discussion on hn.


I've got a JS block-tree here. The root block is hashed from user content. https://github.com/jchris/document-coin


When you're talking about people's money, just following tutorials and piecing something together does not cut it.


We detached this subthread from https://news.ycombinator.com/item?id=14796915 and marked it off-topic.


I don't think anyone is suggesting taking this and publishing as the latest niche 'WhateverCoin'.


Indeed. But there is a lot of talk about bitcoin and blockchain in general lately, and a lot of existing cryptocurrencies which are unworkable, created by people who want to get rich quick.


For sure. But those people probably don't bother trying to make their own implementation of the protocol. The code is out there for them to just go 'copy+paste, change a few config settings, done'. They're essentially the script kiddies of the cryptocurrency world.

But me, I'm not planning on starting a coin, although I'm pretty interested in the tech behind it, albeit not enough to deep dive into the white papers and code, at least not yet. So I thought this was a nice byte-size chunk and was informative.


That's fair enough.


I don't think BlockChain is about creating a newer coin, it's just interesting as a topic.



