
Launch HN: Synthetic  Minds (YC S18) – Program Synthesis to Protect Dapps - saurabh20n
Hi HN! I&#x27;m Saurabh, founder of Synthetic Minds (<a href="http:&#x2F;&#x2F;synthetic-minds.com" rel="nofollow">http:&#x2F;&#x2F;synthetic-minds.com</a>) in the current YC batch. We protect smart contracts by using formal methods to synthesize their adversaries.<p>Current use of web-app-like development practices for smart contracts is not working. People have lost over 4.1M ETH due to code bugs in these contracts, aka decentralized applications (Dapps). That equals $1.8B at current exchange rates, or $500M at the time of loss. The execution environment is &quot;deploy once, change never&quot;, which resembles hardware and space applications. NASA&#x2F;Airbus use formal methods to check this kind of software. Dapps need similar tools.<p>I have worked on formal methods since 2006. In my PhD on program synthesis, I reduced checking code properties to theorems that could be proven mechanically. Incidentally, a full program was not required. The program could have &quot;holes&quot; and the prover would “fill” the holes by synthesizing code. I took those ideas to synthetic biology in my postdoc and previous company.<p>In 2016 when I started to look at smart contracts, I realized those ideas could help build robust Dapps, and maybe prevent them from losing $100M+&#x2F;year. In June 2016, a Solidity vulnerability was discussed (June 10), but the DAO was deemed immune (June 12), right before it was exploited (June 17), losing ~$150M. In 2017, another ~$300M was lost to the Parity multisig bug. This year, I built a system to help mitigate future failures.<p>The core insight requires some background: Imagine you wrote a lossless compression algorithm `K` (e.g., LZ77 or LZW). Would it not be amazing to synthesize the decompressor `D` automatically? The invariant for `K+D` is the identity function. So if we had a `D` with holes, we could use a prover to verify the identity invariant and “solve for D” simultaneously. Now, what if your `K` was a blockchain smart contract, and `D` an arbitrary user contract? If we want K to be immune to double spends, we could ask for `K+D` that violates the invariant “K’s balance at start &gt;= K’s balance at end”. Not only would we formally verify properties, we would synthesize a specific `D` that helps explain any violations.<p>Here&#x27;s how it works: Our tool has `prove` and `autocode` modes. For a smart contract, `prove` validates its properties while `autocode` will generate user contracts that break it—if any exist. Our servers will aim for turnaround times of minutes allowing use in CI, instead of weeks-long human audits. Two demo screencasts are online: <a href="http:&#x2F;&#x2F;synthetic-minds.com&#x2F;pages&#x2F;faq.html" rel="nofollow">http:&#x2F;&#x2F;synthetic-minds.com&#x2F;pages&#x2F;faq.html</a>.<p>There are 3 steps under the hood. The 1st step is a source-to-source compiler from Solidity to a shared-memory, non-object, explicit Blockchain-state Intermediate Representation (BIR). BIR is not Solidity specific, so we could compile Ethereum’s Vyper or Tezo’s Liquidity to it, but Solidity’s traction makes it a natural place to start. The 2nd step is a symbolic executor that converts BIR to an “equation”. Think of symbolic execution as execution that traverses <i>every path</i> and <i>without</i> explicit values. E.g., `func a(int x) { y = x + 1; z = y * y; return z; }` would translate to `a(x) = (x + 1).(x + 1)`. Symbolic execution allows us to convert entire programs into an SMT equation. SMT is a theorem proving language used by many automated solvers, e.g., Z3, CVC4. The most novel, 3rd step, augments a user contract with holes for function bodies. That encodes any arbitrary <i>future</i> user contract. The solver can fill holes that (dis)prove combined properties. This simultaneously validates global invariants, and provides readable code if problematic user contracts exist.<p>Formal methods are sometimes hard to use, but it turns out that for smart contracts we can bypass the major issues (false positives, esoteric bugs found, and difficult-to-explain findings). First, most false positives come from inevitable approximations that formal analyses have to make, so they can terminate while analyzing code that itself might not terminate, i.e., Undecidability of analyzing Turing-complete languages. Gas limits truncate executions, which means we get to skip those approximations. Second, when esoteric (weird and obscure) bugs cause failures in normal web services you have to weigh the time and cost of fixing it against the potential loss, which might be negligible. But because smart contracts are immutable, any esoteric code path can be fatal to the entire Dapp, so there is no such tradeoff. Lastly, in contrast with the infinite variability of interfaces for normal web services, the uniformity of blockchain actors provides a template, enabling us to synthesize code as developer-friendly explanations of behavior.<p>Over 12+ years of work, there is one question I have repeatedly asked: Can we trade off human insight at the expense of compute power?  Current alternatives for securing contracts look for 10-20 anti-patterns, built from human insights observing past failures. In contrast, our approach asks for a functional invariant (e.g., no double spend), which might be 100x or more expensive to solve. Compute is cheap though, and our ability to find future failures is only limited by the amount we expend on `autocode`.<p>The first prototype is operational now, and I am sending out beta invites next week: <a href="http:&#x2F;&#x2F;synthetic-minds.com" rel="nofollow">http:&#x2F;&#x2F;synthetic-minds.com</a>. Excited to hear thoughts from the community!
======
russdpale
This is a great, but its hard to trust closed source code. Everything is
untrustworthy until we can see the code, I could care less about current
exploits being found. How do you know that in a months time someone isn't
going to release a similar tool that yours does? Furthermore, what is stopping
you from exploiting current projects? Or that you haven't already? The space
embraces open source for trust minimization, Am I being too paranoid here? Are
you going to open source everything at some point?

With that said, I don't want to be completely negative, what you are doing is
a __damned __great idea and I think just about all of us can get behind these
types of processes, as someone who is learning about solidity vulnerabilities
(blockchain @ berkeley) right now it is kind of funny this post comes around
today. I have kind of been fretting about how easy it is to write terrible
smart contracts and I kept thinking, well how the hell does NASA do this? What
about air frame manufacturing? That is how I want to develop my dApps, I don
't want to deploy and hope, you know? I hate that feeling, not just because I
don't want to lose anyone funds, but because I want to be proud of my code and
use it for good.

I am really looking forward to checking this out, I can't see it at work but
will check it out as soon as I get home! I think this is really important work
you are doing. I think just about everyone who is responsible in this space
knows doing contract development in a web development mind frame is dangerous
and stupid. I even heard someone say the other day "agile smart contract
development", like NO! I had an old drill sergeant tell me about combat "Slow
is smooth, and smooth is fast", well turns out it applies to smart contract
and blockchain development as well.

~~~
saurabh20n
All great points. And I agree with you on most everything.

I care about the major complicated projects succeeding, e.g., 0x, Golem,
Augur, Aragon, to name a few. They can go the DAO route, or I can do my part
in helping. Help me get in touch with more projects if you're connected.

Open sourcing right off the bat will have the opposite effect. If it works at
scale, we will be pitting the attackers against the developers. I would rather
get this tool in the hands of developers first. I welcome other people
developing the same :) - better yet: I've already developed a version, come
help me make it useful for the community.

~~~
russdpale
Yeh, I totally see what you are saying and the end I agree with you, even as
die hard FOSS as I am.

Having checked out your website I am pretty excited about what you are doing.
How are we to ever build true value delivering services if the underlying
architecture is not secure? This has been my biggest worry with the entirety
of the space. That we will move too fast, lose focus on security first, and
end up building gigantic infrastructures on shaky foundations thus turning
into the very incumbents that we are aiming to displace.

With the idea I am planning on, I will most certainly need your services. I
need for all parties to 1) understand the benefits of trustless sytems, and 2)
be able to trust the trustless system in an environment where each party is
relatively conservative in that it dislikes and discourages disruptions.
Security will be of paramount importance.

Anyways, thanks for making this, its great to see such efforts being made in
the space. We need less hype, and more maths, as I like to say.

------
dumbfoundded
If this all works as you expect and you create a way to find many bugs very
easily, you may want to carefully consider the rollout. For example, if you
release a tool, I could use this tool to steal money from a project that I
analyze to have bugs in it.

~~~
saurabh20n
Absolutely right! Which is why I am connecting directly with the project
developers. Also, I specifically built it to run over Solidity source code. As
opposed to EVM bytecode like other tools. Only 1% of contracts have their
Solidity code public on Etherscan. Making this accessible to 100% of the
original developers and only 1% of attackers, should limit it to white-hat
uses.

~~~
arcticfox
> Which is why I am connecting directly with the project developers.

I wouldn't be surprised to find some of the big hacks have been project
developers themselves moonlighting.

~~~
saurabh20n
You sir or madam, have a devious mind :). I shall hold on to my trust in
humanity for a little while longer!

------
westoncb
> The invariant for `K+D` is the identity function. So if we had a `D` with
> holes, we could use a prover to verify the identity invariant and “solve for
> D” simultaneously.

This was pretty mind-blowing for me. (I haven't really looked into program
synthesis so I may be slightly more susceptible ;) There was one part of it
that seemed problematic from my (perhaps flawed) understanding, though.

So I guess the idea is we have K+D = I (and this is roughly equivalent to
'D(K(s)) = I(s)' where 's' is a string to compress), so if we come up with a
proof of 'I'—which would be a path through theorem space starting from I'm not
sure what axioms—and we know 'I' is equivalent to 'K+D', and we already know
the value of 'K,' then the value of 'D' is implied.

My question is what is the equivalent here to the subtraction operation that
would allow us to say D = I - K? It seems very surprising to me that K and D
would be structured in such a way that their own proofs would necessarily have
any relationship to the proof we derived for I, so that such a subtraction
would work so cleanly...

~~~
saurabh20n
Your `D(K(..))` notation is better than mine. Email me, and I will send you a
gift card.

What you do is solve `\exists D: \forall arr: D(K(arr)) = arr` where `arr` is
an arbitrary length array/bitstream. See paper [1]. There is also really old
code you can browse at the bottom of my personal page linked in profile
(doesn't compile anymore coz of deprecated dependencies.)

The arithmetic analogy break down when solving. So you're not explicitly
computing `I - K`, but instead letting the constraint solver pick values that
satisfy the equation (hence SMT solving.) Also, you are absolutely right that
there is no reason to have their structure be related to identity: _But_ , the
"damage" done by `K` is encoded in the state at the interface to `D` and `D`
better undo that state to get to identity, which is what the solver would
exploit.

[1] Path-based Inductive Synthesis for Program Inversion [http://saurabh-
srivastava.com/pubs/pldi11-pins.pdf](http://saurabh-
srivastava.com/pubs/pldi11-pins.pdf)

~~~
westoncb
> So you're not explicitly computing `I - K`, but instead letting the
> constraint solver pick values that satisfy the equation

Ah—makes sense. I wonder if attempts at going in the other direction have been
made, and initially constraining the structures of the known parts so that you
could do something like a subtraction to get the unknown (probably a crazy
dead-end idea I realize :).

> But, the "damage" done by `K` is encoded in the state at the interface to
> `D` and `D` better undo that state to get to identity, which is what the
> solver would exploit.

Interesting. So because we know it's an inversion scenario beforehand we can
modify the solver to not be so general, and only bother considering solutions
which modify the state in an inverse manner to our original program...

Cool. I'll be keeping this at the back of my mind to see if any ideas come up.
Thanks and good luck!

~~~
saurabh20n
Look up Nate's work [1], specifically Boomerang [2] that aims for
bidirectional programming. Be crazy :)

[1] Nate Foster:
[http://www.cs.cornell.edu/~jnfoster/](http://www.cs.cornell.edu/~jnfoster/)
[2] Boomerang
[http://www.seas.upenn.edu/~harmony/](http://www.seas.upenn.edu/~harmony/)

------
Animats
I'm surprised that YC funded this. The ICO/off-brand cryptocurrency/smart
contract fad is kind of over, as yesterday's post on failed ICOs pointed out.

In the "dApp" world, the top five by market cap are:[1]

\- EOS, a competitor to Etherium. It's a platform for more dApps.

\- VeChain Thor, which pivoted from a RFID tracking system for cargo to a
"mobile wallet".

\- Tron - "The platform is still in its infancy as is expected to reach the
“Eternity” by 2023. In its final stage, the platform will allow content
creators to directly monetize their content by creating their own token over
the TRX blockchain."

\- Omise - this is a "white label solution for mobile payments". It doesn't
really use "dApps", it just used a token sale to raise money.

\- Icon - a blockchain to integrate other blockchains.

None of these actually _do_ anything.

If this is so great, is YC itself buying ICO tokens? Didn't think so.

[1] [https://www.newgenapps.com/blog/top-ethereum-dapps-tokens-
re...](https://www.newgenapps.com/blog/top-ethereum-dapps-tokens-
revolutionizing-world-economy)

~~~
1996
Typical negative comment when it comes to blockchain technology.

Some people have made millions, some brand new technology is created after a
unique technological advancement that solves the byzantine general problem,
even YC is investing --- yet old people cry "it is nothing but a fad!", so
much that it becomes comical, like the black knight in the Monty Python
(especially when it comes from someone confusing Dapps and Tokens)

Eventually, you guys will have to admit you were wrong. A bit like people who
thought horseless carriage were a fad. It's ok, we all get things wrong
someones. No bad blood.

Technology is here to stay. Many people are ideologically opposed to the
libertarian roots of cryptocurrencyes. Many others regret not investing in
time, or selling too early.

I get that. But maybe it's time to move on.

~~~
JustFinishedBSG
> unique technological advancement that solves the byzantine general problem

It doesn't even solve the byzantine general problem...

------
zitterbewegung
Are you using rosette? I had an idea of using program synthesis for creating
new smart contracts based on a language that would create them from unit
tests. Are you going to open source your system?

What I had so far is at
[https://github.com/zitterbewegung/Belmod](https://github.com/zitterbewegung/Belmod)

~~~
saurabh20n
I am not using Rosette, although I love the work Emina and Ras are doing to
make synthesis accessible.

[ Edit: Your idea is fantastic, btw. If you can find use cases where the size
of contracts is small, it'll be amazing to have synthesis from test cases. ps:
Do borrow some ideas from the way Sumit solves Flashfill. ]

My system is closed source, partially because it is computationally heavy and
I'll spin up a server fleet for analyses, and also because I don't want it to
be used for attacks, i.e., limit use to mostly the original developers of
projects.

Connect with the email on the website. I'll look at your code a little later.

~~~
zitterbewegung
Thanks for the advise! I haven't been working on it because I didn't really
know where to start and I had to learn solidity.

------
detroitcoder
Seems very useful. As someone who has recently started rolling out solidity
contracts to learn and eventually create contracts with real money behind,
having something that adds to my own confidence in the security of my code is
helpful. Do you plan on eventually charging for the cli used in the
marketplace demo?

~~~
saurabh20n
Yes. The hope is that you run this analysis as part of your development
workflow. e.g., on every merge into master. Charge up like AWS and use as
frequently as you need. The current alternatives involve hiring a firm for
$20-500k for a manual audit that happens at the end of years long development.
I will work with you to get this into your hands as soon as I can.

------
zebnyc
Congrats on the launch. How is your value proposition different than
quantstamp (QSP)? I believe YC also invested in them

~~~
saurabh20n
Thanks!

The key distinction is building a synthesizer, not just a verifier. These
systems are hard to use, and synthesized code is the key to making them
accessible. Watch the demo videos. See if you agree.

On the verifier side: QSP is a protocol/marketplace for connecting people
needing audits, with human auditors or free open source tools (Oyente and
Mythril) [1]. My verifier is more comprehensive and future-proof at the
expense of being compute intensive, as discussed on my FAQ [2].

[1] [https://quantstamp.com/visual-guide-to-
quantstamp](https://quantstamp.com/visual-guide-to-quantstamp)

[2] FAQ: "How is this different from Ethereum's other tools ?"
[https://synthetic-minds.com/pages/faq.html](https://synthetic-
minds.com/pages/faq.html)

------
retr0yrus
We will soon have ScAfee but for Dapps!

Is it not true that program synthesis will work, as long as a "contract" isn't
too large or designed adversarially? Because if that weren't true you'd have a
debugger for all circuits and programs (suitably restricted in computing
power).

~~~
saurabh20n
I'm guessing you meant McAfee for Dapps?

Yes. It should work. There's a good amount of work to be done though. :)

I don't follow what you mean by a "debugger" in this case.

------
DennisP
I'm an independent developer. What would it cost me to use this, once you've
fully launched?

~~~
saurabh20n
It will be pay for use, like AWS. You fill up, and then depending on how
compute-heavy your contracts are to analyze, you will be charged accordingly.
Your costs will end up depending on complexity of your contracts.

Compare this to the alternatives of paying $20,000 - $500,000 for manual
audits.

~~~
DennisP
So I still have no idea how much it might cost. Let's say my contract is 500
LOC and average complexity.

(I'm not convinced formal verification with automated tools can actually
substitute for manual audits. I think they're great for _augmenting_ manual
audits, but what if there are failure conditions I didn't think to assert as
impossible?)

~~~
saurabh20n
If you see the demo runs on the website FAQ, that example of about 20-50 LoC
cost ~$200. You can extrapolate accordingly. Now if you integrate into your
CI, I am of course going to only rerun on changes (syntactic, easily checked;
and hopefully semantic later.)

If you want a more precise cost analysis, shoot me an email for your specific
contract.

And you are right about audits. Except a caveat: The system will by default
check for global properties as well, e.g., no double spend. Again, look at the
examples in the FAQ. They had no assertions in them. Yet, we find errors.

~~~
JustFinishedBSG
> You can extrapolate accordingly.

Hm no I can't. What's the complexity of your algorithm?

~~~
saurabh20n
Ah. I see why that was unclear.

I make multiple SMT queries. SMT solvers work well in practice, but they are
solving a problem which is between NP-Complete to Undecidable (when
quantifiers are involved [1]).

Also, we are using a very hand-wavy word definition "complexity of code". This
is not the complexity-theory complexity, but instead the types of operations
used and the looping/control structure. (In hindsight, I should have used a
different word.) So 40 lines of multiple nested loops with multiple hashes
being taken and multiple dynamic dispatches over addresses, will take _much
much_ longer to analyze than 40 lines of straight line arithmetic.

My tool attempts to guess the estimated runtime midway through the analysis,
but there is no real way to accurately estimate the cost upfront. That is why
I was saying that if it cost ~$200 for 20-50 LoC then if you accept all
caveats then under wildly simplifying assumptions the 500 LoC might take
10x-25x of that cost.

[1]
[https://stackoverflow.com/a/10924475](https://stackoverflow.com/a/10924475)

------
baby_wipe
I didn't understand a lot of that, but have you found any dapp bugs yet?

~~~
saurabh20n
Ah, sorry. Happy to explain any pieces in more detail. I hope the core is
clear: If you think of Dapps as servers, then it synthesizes its clients
(specifically ones that cause assertion failures).

For bugs: The demo videos on the FAQ page show the kinds of bugs it can find,
e.g., the one that crashed the DAO, and ones that simpler tools will be unable
to find. Of course you are right, the holy grail is finding bugs that escape
both human audits and other tools. Haven't run it on very large codebases yet.
Working on it!

~~~
baby_wipe
Oh no you're fine. I wasn't criticizing your explanation, just prefacing that
I'm pretty ignorant with this stuff. But I find what you're building very
interesting.

------
fembot__
This is cool! I work at MakerDao, we are working on formal verification as
well. Congrats on YC (again)-look forward to seeing how this project shapes
up.

------
adamnemecek
Hey, can you shoot me an email? It's my username at gmail. I might have a
suggestion.

------
verdverm
FYI, your SSL certificate is invalid

~~~
saurabh20n
Fixed. Thanks for noticing.

------
thebiglebrewski
Congrats on the launch but....needs a TL; DR;. Holy wall of text, batman!

~~~
saurabh20n
LOL. I agree. Want to take a stab at it?

~~~
1996
TLDR: you use formal verification methods to find obvious bugs in the
sourcecode of blockchain contracts, taking advantage of the blockchain natural
properties to limit the search space

You have a great idea there. Run with it!

