
Breaking the Solidity compiler with a fuzzer - lymonty
https://blog.trailofbits.com/2020/06/05/breaking-the-solidity-compiler-with-a-fuzzer/
======
saagarjha
The Solidity compiler is not quite mature yet, I feel. A compiler optimizations
course I took as an undergraduate had us try to improve upon the state of the
art in some emerging field, and a partner and I picked solc. The optimizer was
clearly very new and left us a lot of low-hanging fruit because it would
readily leave performance on the table. For example, it would fail to
hoist/defer stores to storage, which are two to three orders of magnitude more
expensive than standard arithmetic or stack operations. It's really strange
that nobody has given this another look, considering that billions of dollars
flow through smart contracts…
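
To make the store-deferral point concrete, here is a minimal Python sketch of a
toy gas model. The gas figures and the function names are assumptions chosen
for illustration; they are not real EVM prices and nothing here comes from
solc. It compares a loop that writes to storage on every iteration with one
that accumulates locally and stores once.

```python
# Illustrative gas prices. These are assumptions for the sketch, not
# authoritative EVM figures; storage reads are ignored for simplicity.
GAS_SSTORE = 5000  # assumed cost of one store to contract storage
GAS_ADD = 3        # assumed cost of one arithmetic operation


def sum_naive(values):
    """Write the running total to storage on every loop iteration."""
    storage = {"total": 0}
    gas = 0
    for v in values:
        storage["total"] = storage["total"] + v
        gas += GAS_ADD + GAS_SSTORE
    return storage["total"], gas


def sum_deferred(values):
    """Accumulate in a local variable and store once after the loop."""
    acc = 0
    gas = 0
    for v in values:
        acc += v
        gas += GAS_ADD
    storage = {"total": acc}
    gas += GAS_SSTORE
    return storage["total"], gas


if __name__ == "__main__":
    print(sum_naive([1, 2, 3]))
    print(sum_deferred([1, 2, 3]))
```

Both versions leave the same value in storage; under these assumed prices the
deferred version pays for one store instead of one per iteration.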

Solidity itself is kind of strange too: it doesn't seem to have a consistent
design, and parsing it is in many cases harder than it needs to be for what
appears to be no good reason other than "this is how C/Java looks so we're
going to blindly copy it." I can't say I'm surprised to see a fuzzer quickly
find bugs in the compiler.

~~~
DennisP
The EVM itself is pretty inefficient. They're planning to replace it entirely
with ewasm (a deterministic subset of wasm), once they do the big 2.0
transition in a year or two. That may be why there hasn't been a major effort
on compiler optimizations for the EVM, though they do make regular updates to
Solidity. (The current version is 0.6.9, fwiw.)

There are also some other smart contract languages in the works, the main one
being Vyper, but it's not quite ready for production.

But if you're interested in picking some of that low-hanging fruit, you'd
probably be welcome; most of the Ethereum community is pretty open to new
contributors.

~~~
davidmurdoch
Are you sure the future ETH 1 shard won't continue to execute transactions via
the EVM? AFAIK, I don't think it will be replaced, it'll just live within
Ethereum 2 as an "execution environment".

~~~
DennisP
That is a possibility. They haven't made a final decision, but it would make
things easier. But most of the network would still be running ewasm (after
Phase 2).

------
deegles
Could this be used to find bugs in already published contracts? It brings to
mind the 2013 PRNG issue [0] that led to a bunch of wallets being drained.
Something that is perfectly valid today might have a vulnerability in the
future.

[0] https://android-developers.googleblog.com/2013/08/some-securerandom-thoughts.html

~~~
albntomat0
Theoretically, I think so.

Looking through their code examples, though, the compiler failures look to be
in obscure areas that are unlikely to be used. Additionally, they state that
some are dependent on particular compiler flags.

------
e79
It doesn’t surprise me at all that bugs were found in the SMT Checker. I recently
wrote a blog post on how Solidity’s model checker works, and stumbled across
several bugs while attempting to write simple example contracts. I didn’t even
need a fuzzer :).

That area of the codebase is far from complete, which is why it is considered
experimental and hidden behind a flag that you have to manually enable.

~~~
pfdietz
One of the distinguished papers at PLDI this year is on fuzzing for SMT
solvers. People have been fuzzing these solvers for years, but even now new
approaches find new bugs.

https://testsmt.github.io/papers/winterer-zhang-su-pldi20.pdf

------
analognoise
Rather than fuzzing, why not just implement it in a language with a formal
proof like Ada SPARK?

~~~
DyslexicAtheist
You'll still only be as correct as your formal proofs (i.e. using formal
proofs isn't a replacement for fuzzing).

~~~
e79
On the other hand, fuzzing is only as “correct” as your coverage,
properties/assertions, corpus synthesis, etc.
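
As a rough illustration of how much the property and the input generator
matter, here is a minimal Python sketch of differential fuzzing against a toy
optimizer. Everything in it is hypothetical: the expression format, the
`optimize` pass, and its deliberately planted bug are made up for this example
and have nothing to do with solc or the fuzzer from the article.

```python
import random

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def gen(rng, depth):
    """Build a random expression: an int leaf or an (op, left, right) tuple."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(-5, 5)
    return (rng.choice(sorted(OPS)), gen(rng, depth - 1), gen(rng, depth - 1))


def evaluate(expr):
    """Reference semantics: evaluate the expression tree directly."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left), evaluate(right))


def optimize(expr):
    """Toy pass that moves a literal operand to the right-hand side.

    Deliberate bug: it swaps operands for every operator, so it wrongly
    treats "-" as commutative.
    """
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    left, right = optimize(left), optimize(right)
    if isinstance(left, int) and not isinstance(right, int):
        left, right = right, left
    return (op, left, right)


def fuzz(seed=0, trials=1000):
    """Return the first expression whose value changes under optimize, if any."""
    rng = random.Random(seed)
    for _ in range(trials):
        expr = gen(rng, depth=3)
        if evaluate(expr) != evaluate(optimize(expr)):
            return expr
    return None


if __name__ == "__main__":
    print(fuzz())
```

The property checked here is "optimizing must not change the result". If `gen`
never produced "-" nodes, or if the check only looked for crashes, the same
loop would run without reporting anything even though the bug is still there.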

