The Ethereum-blockchain size will not exceed 1TB anytime soon (dev.to)
119 points by fagnerbrack on Dec 15, 2017 | 41 comments

The described system is basically identical to Bitcoin's pruned mode. It's a lot better than geth's fast sync, which trusts your peers not to have messed with the state.

I don't think there's any way they could mess with the state, since every block header has a merkle root of that block's state.

That just prevents anyone except miners from messing with the state. One of the nice properties of Bitcoin is that even miners can't magic more coins into existence than allowed by the protocol or spend other people's money. You lose that a little with this kind of fast sync. It's probably worth it from a usability standpoint though.

Miners could mess with very recent state, but if you download enough blocks with full state to be confident of finality for the earliest one, and verify just those transitions, you're still safe.

If the incorrect blocks were published on the network, they'd be rejected by other miners, so you don't have to look far back unless (1) there's a sustained 51% attack on the whole network, or (2) you're a very juicy target and a huge miner can take over your internet connection, and you don't notice a large drop in difficulty.

Miners are already incentivized not to mess with the state.

Highly unlikely, but in theory it should be vulnerable to a birthday attack.

If you can get hash collisions, then this issue is only one of many and that isn't exclusive to pruned modes. Bitcoin and friends rely pretty heavily throughout on the assumption of no hash collisions.

With 256-bit hashes, it would be many millions of years before a birthday attack succeeded.
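As a rough back-of-the-envelope check of that claim: a birthday collision on an n-bit hash needs on the order of 2^(n/2) hash evaluations. The sketch below plugs in an assumed attacker hash rate of 1e19 hashes per second. That rate is purely illustrative, not a measurement of any real network.

```python
# Rough estimate of how long a birthday attack on a 256-bit hash would take.
HASH_BITS = 256
# Assumption: the attacker computes 1e19 hashes per second (illustrative only).
HASHES_PER_SECOND = 1e19
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def birthday_attack_years(bits: int, rate: float) -> float:
    """Years needed to perform ~2**(bits/2) hash evaluations at `rate` per second."""
    work = 2 ** (bits // 2)  # birthday bound: about sqrt(2**bits) attempts
    return work / rate / SECONDS_PER_YEAR


years = birthday_attack_years(HASH_BITS, HASHES_PER_SECOND)
print(f"{years:.2e} years")
```

Under that assumption the result is on the order of 10^12 years, so "many millions of years" is, if anything, an understatement.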

A pruned Bitcoin node still downloads the full blockchain; it just does not store all of it. It trusts the local file system, not other nodes.

But a pruned bitcoin node does not hold the full blockchain for other nodes to download. A pruned Ethereum node, however, does.

Yeah that's what I meant, I updated my comment to be clearer.

Irrelevant, as it no longer works on a powerful laptop, which means it is not a real blockchain anymore.

Good thinking. I like that.

I'm not trolling: I have a 50 Mbit connection and a specced-out MacBook, probably the most powerful laptop you can expect a user to have.

This Mist/Wallet thing does. not. work.

Maybe if I tweaked it or used a light client it would, but as a user I don't care.

Yeah, I gave up on ETH wallets as well. I just ended up dumping my coins onto Coinbase and hoping they don't get hacked or something... I looked into hot wallets, but so many people have their keys stolen somehow. Figured it was the least risk to leave it on an exchange.

I have a Dell laptop that I bought for ~$500 in 2014 and upgraded to 16GB RAM and an SSD; it holds the ETH, LTC and BTC blockchains. Its constant network transfer is 4MB/s, with avg. CPU usage of 70%. It runs Arch. What problem are you having?

How often do you sync? First of all, Mist crashes way too often. Even when it works, it takes hours to sync. Does it work smoothly for you?

Mist crashed only once for me, when I ran out of memory: "swarm" consumed 15GB of RAM and was killed by the kernel. Otherwise it has never crashed for me. Mist is on all the time, Bitcoin once a day for a few hours (3-5 hrs), and Litecoin is on at weekends.

Maybe because it's always on it works for you. I only open it once in a while, and leave frustrated.

You should really use Parity, I think. Its chain will end up at ~9GiB and the traffic is minimal.

I know Parity is better, I tried it. But I prefer "official" stuff, to at least be bailed out if I lose my coins :D

lol :D

Depends on your purpose for running a node. I use it to validate transactions and to query transaction histories, for which the full state is necessary.

It would be nice if they could implement a version that allows this, without the bloat.

I also use it to query transaction histories and to trace the currency flows. For that, full sync is necessary AFAIK.

Someone help me with this, but why do people need the full history of the Ethereum blockchain? Shouldn't having a few of the last valid blocks be enough?

Each block does not replay the entire state of the system. There could be an unspent output in a block from a year ago, that is then spent in a transaction tomorrow. You need to track all of the valid inputs that might go into a transaction. You don't need the full history (i.e. you don't need spent outputs), but you certainly need a lot more than just the most recent N blocks.

You're describing Bitcoin. Ethereum doesn't work like that.

In Ethereum, you can track the account balances; in Bitcoin, you can choose to track just the unspent transactions. The decision is pretty arbitrary from the pruning perspective.
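To make the comparison concrete, here is a minimal sketch of the two bookkeeping styles. It is a toy model, not either protocol's actual data structures: the function and class names are hypothetical, and signatures, fees, nonces and gas are all ignored.

```python
from dataclasses import dataclass


# Account model (Ethereum-style): the state is a map from address to balance.
def apply_account_tx(balances: dict, sender: str, recipient: str, amount: int) -> None:
    if balances.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    balances[sender] -= amount
    balances[recipient] = balances.get(recipient, 0) + amount


# UTXO model (Bitcoin-style): the state is the set of unspent outputs.
@dataclass(frozen=True)
class Output:
    txid: str
    index: int
    owner: str
    amount: int


def apply_utxo_tx(utxos: set, inputs: list, outputs: list) -> None:
    if len(set(inputs)) != len(inputs):
        raise ValueError("duplicate input")
    for inp in inputs:
        if inp not in utxos:
            raise ValueError("input missing or already spent")
    if sum(o.amount for o in outputs) > sum(i.amount for i in inputs):
        raise ValueError("outputs exceed inputs")
    utxos.difference_update(inputs)  # spent outputs leave the state
    utxos.update(outputs)            # new outputs become spendable
```

In both cases a node only needs the current state (balances or unspent outputs) to validate the next transaction, which is why old history can be pruned either way.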

Is this for Bitcoin or for Ethereum? I thought the former has UTXOs and the latter has accounts.

So the issue is, if you want to validate a transaction, I would have to find the block in which the input was made, and confirm all the blocks from that one to the current block?

You're correct, Ethereum uses account balances. You don't have to keep track of inputs and outputs, just check that the balance is sufficient.

Since each block header has a merkle root of the block's state, you can just get all the block headers to check PoW, get the current state, and validate transactions and states as many blocks back as you feel you need to make sure a miner hasn't faked the current state.
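A simplified sketch of that check: recompute a Merkle root over the downloaded state and compare it with the root committed in the block header. Note the assumptions: this uses a plain binary Merkle tree with SHA-256 over sorted `address:balance` pairs, whereas Ethereum actually uses a Merkle Patricia trie with Keccak-256, and the function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def state_root(state: dict) -> bytes:
    """Merkle root over sorted (address, balance) pairs (simplified binary tree)."""
    level = [_h(f"{addr}:{bal}".encode()) for addr, bal in sorted(state.items())]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def state_matches_header(state: dict, header_root: bytes) -> bool:
    """True if the state a peer sent us hashes to the root in the block header."""
    return state_root(state) == header_root
```

A peer that hands over a tampered state would have to produce a different state with the same root, which comes down to finding a hash collision. What this check cannot catch is a miner who committed a bad root in the header to begin with, hence the advice to validate some number of recent blocks as well.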

Here's an old article by Vitalik on the subject:


Yeah, looks like I got the terminology wrong, but the concept is the same for these purposes. Whether you track the outputs or the account balances, there's a lot more information that you need to keep an accounting of than what you can glean from just the last N blocks. Which makes sense; it is a distributed ledger after all.

2 years ago when I was first experimenting with Ethereum, it was necessary for trying to extract out smart contract addresses among other things. Now, I'd likely use etherscan or such, unless I was building something proprietary.

Many reasons. If you want to analyze the events and method calls happening on a smart contract you need to review the whole blockchain.

A more obvious use case is reviewing all the transactions related to your address.

Somebody still has to save the full chain. I don't think the idea is to end up with just one full copy of the chain.

But that's precisely my point. Pruned Ethereum nodes hold _all_ blocks and therefore are full nodes. In Ethereum, unlike pruning Bitcoin nodes, old blocks are not deleted.

Yes but you only need the transaction history for that, since you can reconstruct any block state you wish from the transactions.

If blocks don't contain anything that the transactions cannot recreate, then why are the blocks bloated, and why do we even need them?

You don't. Full nodes do just fine without storing all those block states.

You need the current state, and very recent history just to make sure there's no shenanigans, but you don't need old state unless you're doing historical queries for some reason.

1/2 a TB now, http://bc.daniel.net.nz/

That's only if you store the full state of every block, which isn't necessary even for a full node. If you just keep a few recent block states, plus the entire transaction history, it's in the low tens of GB but you can still fully validate the entire chain history, and reconstruct the state of any block you want.
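The idea of rebuilding any block's state from the transaction history alone can be sketched as a simple replay. This is a toy model under stated assumptions: balances only, transactions as `(sender, recipient, amount)` tuples, no contracts, gas or mining rewards, and `replay` is a hypothetical helper name.

```python
def replay(genesis: dict, blocks: list, upto: int) -> dict:
    """Rebuild the balances as of block index `upto` by replaying transactions."""
    state = dict(genesis)  # copy so the genesis allocation is not mutated
    for block in blocks[: upto + 1]:
        for sender, recipient, amount in block:
            if state.get(sender, 0) < amount:
                raise ValueError("invalid transaction in history")
            state[sender] -= amount
            state[recipient] = state.get(recipient, 0) + amount
    return state


# Usage: two blocks of transfers on top of a genesis allocation.
genesis = {"alice": 10}
blocks = [
    [("alice", "bob", 3)],
    [("bob", "carol", 1)],
]
print(replay(genesis, blocks, 0))
print(replay(genesis, blocks, 1))
```

The trade-off is time for space: a node that keeps only the transactions has to recompute an old state on demand instead of reading it from disk.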

Thanks for not reading the linked article, I'm explaining this very same example in the first paragraph :)
