
What if we could see all concurrency bugs in the debugger? - smarr
https://stefan-marr.de/2019/07/what-if-we-could-see-all-concurrency-bugs-in-the-debugger/
======
breatheoften
The interface looks pretty interesting!

I think adding a way to select multiple universe nodes and say “I expect all
these universes to converge on the value 66” and then serialize that
expectation into a runtime verification which either runs during program
execution or runs during program execution when in a concurrency sanitize mode
would be pretty cool!

~~~
xtofs
Thanks for the suggestion we don’t provide that exact interface but we do
allow querying over all the universes with a graph query language. In the
paper we detail how to get the shortest path to all the end states which makes
it a bit easier to debug.

------
jsw
TLA+ specs explore all state spaces of concurrent processes. Avoiding state
explosions is something you need to be conscious of when building those. The
modeling required also takes some learning investment. Curious if this can
really pull off something like a TLA+ state exploration and invariant
violation detection with real-world code.

~~~
xhgdvjky
TLA+ is great, but it struggles so much to handle lots of states...

Maybe that's a good thing since it encourages very symmetric designs, but it
bothers me

~~~
tempguy9999
Are there other systems where combinatorial explosions of states do not occur?
Is there something in TLA that makes it prone to this where other model
checkers perhaps aren't? Slightly related, isn't state space explosion an
intrinsic part of any such problem?

If possible, could you explain what you mean by symmetry, and how it reduces
state space (it's a big ask, I know).

sorry for the firehose of Q's! I'd really appreciate an expert's view though.

~~~
0815test
Model checkers are essentially glorified fuzzers, so the way they work is
inherently vulnerable to state space explosion. The way to address that is to
move closer to building things that are correct "by construction", and prove
them as such. Symmetry is one way of doing this, it allows you to "reduce" a
whole lot of redundant checking to something much simpler, based on some
underlying properties of what you're working with. A logical proof works much
the same way, of course.

TLA specifications are typically "checked", or rather "fuzzed" by model
checkers; this is not formally required (TLA+ even includes a proof language)
but things tend to be done that way.

~~~
tempguy9999
Would whoever downvoted this please explain why so we can learn from it.
Thanks.

~~~
hwayne
Not the downvoter, but:

\- (Bounded) model checking is exhaustive, fuzzing is not. If your system
successfully model checks, then no possible input can cause an error. Fuzzing
is randomized, so there still _could_ be a bad input. That's why model
checking is considered formal verification and fuzzing is not.

\- Not all model checking is brute force searches. A lot of model checkers use
SMT or SAT, so the check time grows slowly even with large state spaces.

\- _Empirically speaking_, model-checking has scaled much better and been used
for a wider range of applications than correct-by-construction has.

~~~
0815test
> (Bounded) model checking

Oh come on. If your _bounded_ model checking is actually checking every
possible input or program state, then it's not just a method of verification
but an actual proof. (It may be a subjectively _inelegant_ proof, but it's a
proof nonetheless.) That much is obvious. What I clearly meant by "glorified
fuzzing" was the far more common use of model checking in ways that _stop
short_ of this standard. That sort of use can still be nicely scalable, it
_does_ count as a formal method, and it can point out remarkably complex bugs
in your system - but by definition, it cannot give any assurance that you're
doing things correctly! That still requires proof.

~~~
tempguy9999
To put my oar in, and this is purely based on what I've read about spin, spin
can verify 100% for 'small' models. If larger models are checked, it will run
out of memory so you put it in a probabilistic mode which gives probabilistic
assurances, still 'good' but less than 100%. In that respect I suppose it's
degenerated to rather posh fuzzing.

I think the difference between small and large is down to how long you're
prepared to let it run, but even more fundamentally, how much memory you can
give it.

I'd agree that if the model matches the code and you successfully check the
entire state space then it's a proof.

------
kjar
Not quite the same, but reminded me of reversible debugging:
[https://www.gnu.org/software/gdb/news/reversible.html](https://www.gnu.org/software/gdb/news/reversible.html)

~~~
namibj
Mozilla's rr is made for exactly that.

~~~
XMPPwocky
WinDBG's Time-Travel Debugging is another nice implementation of this.

~~~
souprock
Heh, it feels like everybody and their dog has written a reversible debugger.

Listed in this Hacker News thread we have 7 implementations in gdb, plus rr
and WinDBG.

I'll add one more, an in-house hypervisor that I help develop. Stuff like
finding "all concurrency bugs" as in this article would be a matter of
scripting it up. BTW, we're hiring:
[https://news.ycombinator.com/item?id=19797601](https://news.ycombinator.com/item?id=19797601)

So that's at least 10, counting the gdb backends distinctly except lumping the
Linux native x86 and x86_64 together.

~~~
roca
Of the 7 implementations listed in the gdb page: * gdb's "Native"
implementation is unusably slow. 1000x slowdown during recording. * Simics is
expensive and AFAIK a full-system simulator not very useful for debugging
applications. * Moxie-ELF is for an obscure custom CPU. SID looks similar. *
VMWare have discontinued support for reverse execution. * Chronicle-gdbserver
is unusably slow, 300x recording overhead. * UndoDB is the only one that's
really usable.

We actually have another gdb remote backend supporting reverse execution
that's in active use but not publicly available yet --- Pernosco.

~~~
souprock
A properly-equipped full-system simulator is useful for debugging
applications. I have one. It is decently fast too. It's free or very
expensive, depending on how you count several in-house developers to maintain
it.

------
eggxbox8
Reminds me of Corensic's Jynx.

[https://en.wikipedia.org/wiki/Jinx_Debugger](https://en.wikipedia.org/wiki/Jinx_Debugger)

Never used it but the premise of running different possible thread
interleavings in parallel and picking the one that crashes seemed genius.

~~~
tempguy9999
Sounds like what some model checkers do. See
[https://en.wikipedia.org/wiki/SPIN_model_checker](https://en.wikipedia.org/wiki/SPIN_model_checker)
for example.

Spin homepage
[http://spinroot.com/spin/whatispin.html](http://spinroot.com/spin/whatispin.html)

SPIN's language Promela is very odd, like stripped down basic from the 1980s
but even more reduced after that.

Edit: a bit more from the spin site FYI "[spin] checks the logical consistency
of a specification and reports on deadlocks, race conditions, different types
of incompleteness, and unwarranted assumptions about the relative speeds of
processes"

I wish so much I had reason to use it. I so much wish my programming career
wasn't the daily, usually boring, straightforward business support. I'd love
to work with TLA, formal methods, this and more but there seems no way to do
get a job doing this. Any thoughts welcome. Been trying to learn hoare logic
but it all seems so abstract, and it's easy to misunderstand things with no-
one to ask. Anyway, sorry for the whinge.

~~~
jrumbut
I think the typical business software could be a great place to start because
you have sections of the code that would be very simple to model in a formal
methods environment and that really hairy section that could actually use it
(but it would be hard).

I hate to say I've never had the patience for it. For me it has always turned
out to be painful without much reward.

I think my experience isn't uncommon given the popularity of formal methods.
If it was easy and full of low hanging fruit (like automated testing) it would
probably be reasonably popular (like automated testing).

~~~
tempguy9999
I dunno where in business software it might work, but my experience is that
quality is often almost irrelevant. A few weeks ago I got myself unpopular for
finding and reporting a serious bug. That could have been picked up through a
little knowledge or some straightforward testing, but it wasn't because that
didn't happen (there's a little more to it than that, but you see my point).
If producing crap is acceptable, there's no room for formal methods. I've had
a lot of experience of this.

If you can't get the boss's buy-in for even obvious correctness and/or
timesaving approaches (where time spent = salaries paid), I'm stuffed.

Or maybe the problem's partly me. I can't rule that out. I'm not the most
politic person.

> If [formal methods] was easy and full of low hanging fruit...

It isn't and I don't expect it to be. The fruit is right at the top, but it
may be the best quality. I'm ok putting in the work to reach that.

------
whateveracct
For property-based testing of concurrency & nondeterminism, look into dejafu
:)

------
saagarjha
Obligatory plug for LLVM’s Thread Sanitizer, which can find many concurrency
issues before their express themselves as difficult-to-debug problems.

------
carapace
Reminds me of super-compilation.

[https://en.wikipedia.org/wiki/Supercompilation](https://en.wikipedia.org/wiki/Supercompilation)

------
imvetri
(Thumbs-up)

I'm looking for sites/ information that can help me to publish papers. But I
do not have much idea about it. How did you publish?

~~~
quelltext
You typically have to know the venue (or journal) you want to publish at. Then
make sure to check their websites (regularly) for CFPs (call for paper). Then
you submit your paper by the given deadline following the given paper
submission guidelines and wait until it is peer-reviewed. Then either it gets
rejected or accepted (or some middle ground: asking for revisions).

------
hacker_9
Apart from looking nice, I think I understood the program even less
afterwards.

