
Bastion – Highly-available distributed fault-tolerant runtime - windor
https://github.com/bastion-rs/bastion
======
elteto
How is runtime fault-tolerance achieved? My understanding of Erlang is that
the BEAM VM implements these capabilities (custom threads, supervision,
restarts, hot reload), but it is one level removed and above actual code. And
they implement their own user-space threading runtime in order to support
them. But in Rust, there is no such runtime (or is Bastion implementing one?)
and it seems like this is used as a library. I'm very curious.

I think another way to frame my question would be: what is the basic unit of
parallel execution in Bastion? A thread? Or a separate process? There are
mentions of lightweight processes and subprocesses in the README but it is
rather vague what these are.

~~~
blattimwind
Erlang's use of m:n threading is orthogonal to fault-tolerance (perhaps not
inside the implementation, but conceptually).

~~~
elteto
It definitely is not orthogonal. Suppose an OS thread goes into an infinite
loop. How do you cleanly stop it (feel free to assume Linux/Windows/macOS)?
In Erlang this is possible because of the custom threading implementation.
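
To make the question concrete, here is a minimal sketch using only Rust's standard library (not Bastion). `std::thread` offers no call to forcibly stop another thread, so a loop can only be ended cleanly if it checks a shared flag itself. The function name `run_until_stopped` is made up for this illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs a busy worker for roughly `run_for`, asks it to
/// stop, and returns how many loop iterations it completed.
pub fn run_until_stopped(run_for: Duration) -> u64 {
    let stop = Arc::new(AtomicBool::new(false));
    let worker_stop = Arc::clone(&stop);

    let worker = thread::spawn(move || {
        let mut iterations: u64 = 0;
        loop {
            iterations += 1;
            // The only reason this loop can be stopped cleanly is that it
            // polls the flag on its own. A loop without this check would
            // keep running until the whole process exits.
            if worker_stop.load(Ordering::Relaxed) {
                break;
            }
            thread::yield_now();
        }
        iterations
    });

    thread::sleep(run_for);
    // Request cancellation; the worker decides when to honour it.
    stop.store(true, Ordering::Relaxed);
    worker.join().expect("worker panicked")
}
```

Calling `run_until_stopped(Duration::from_millis(20))` returns once the worker has noticed the flag. A runtime that schedules its own lightweight processes can preempt them between reductions instead of relying on this kind of cooperation.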

~~~
blattimwind
In Erlang that's possible because the program runs in a VM. Erlang could do
the same with 1:1 and m:1 threading.

------
paulsutter
Could we hear a little more about the background of the project, including
what it's being developed for? Really interested to learn more about the
project, this looks great.

------
windor
I really appreciate the work on Bastion, which captures the spirit of Erlang
actor programming with its supervision strategy! The code is clean and well
documented, and I cannot believe the project is not better known in the Rust
community.

~~~
windor
BTW: They are working on Hot-Code Swap.

------
mkj
This looks promising, though the "No Forced Trait Implementations" seems to
instead require using a strange looking msg!() macro?

[https://docs.rs/bastion/0.3.4/bastion/macro.msg.html](https://docs.rs/bastion/0.3.4/bastion/macro.msg.html)

Seems less clean to read than Riker ([https://riker.rs](https://riker.rs)),
though that doesn't really do async well.

~~~
windor
Yes, the msg!() macro is a little painful to write. I think it could be
refactored into a pattern like `impl Handler<Msg>`. But beyond that, it
supports async/await naturally. :)
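
For readers unfamiliar with the `impl Handler<Msg>` shape, a rough sketch follows. The `Handler` trait, the `Counter` actor, and the `Increment`/`Get` messages are all hypothetical names invented here; none of this is Bastion's API, and it ignores async entirely.

```rust
/// Hypothetical trait: one implementation per message type an actor accepts.
pub trait Handler<M> {
    /// What the actor answers with for this message type.
    type Reply;
    fn handle(&mut self, msg: M) -> Self::Reply;
}

/// Hypothetical messages.
pub struct Increment(pub u32);
pub struct Get;

/// Hypothetical actor state.
#[derive(Default)]
pub struct Counter {
    total: u32,
}

impl Handler<Increment> for Counter {
    type Reply = ();
    fn handle(&mut self, msg: Increment) {
        self.total += msg.0;
    }
}

impl Handler<Get> for Counter {
    type Reply = u32;
    fn handle(&mut self, _msg: Get) -> u32 {
        self.total
    }
}
```

With this shape the compiler picks the handler from the message type, so `counter.handle(Increment(2))` and `counter.handle(Get)` dispatch to different implementations. The cost is exactly the forced trait implementation that the README says Bastion avoids.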

------
jokoon
I have a hard time understanding what this is. Is it an alternative to Docker
somehow? What other frameworks/platforms would Bastion compete with?

~~~
coenhyde
So did I. I didn't know if it was a service or a library or if it integrates
with something. Looks like it's a library for Rust. I think mentioning Rust
would make it quicker to understand where this sits.

~~~
windor
Yes. The title was changed, which I wasn't aware of.

Original title: `The missing part of actor-model programming in rust`.

------
davidw
Looks like good work! I'm curious about why I might use this instead of
Erlang.

~~~
hopia
I was also wondering if this is aiming to be _the Erlang for Rust developers_,
or rather _a better Erlang_. Either one would probably be worthwhile.

~~~
lostcolony
Out of curiosity, what would you be looking for in "a better Erlang"? Most if
not all of my issues were syntactical, or things that were given up as
tradeoffs that I can't qualify as "better", so I'm curious what someone else's
impressions are here.

~~~
hopia
I'm honestly no expert on the subject. I have virtually zero experience with
the Erlang language. And I don't know a thing about Rust. However, if that
counts, I have read large parts of Joe Armstrong's PhD thesis about distributed
systems and written plenty of Elixir.

It may be wrong to say we need a _better Erlang_, but for modern web
development we certainly could use a fair bit more compiler enforced type
safety for writing more correct programs. Mostly due to the lack of a proper
type system, Erlang is not particularly expressive as a language, although its
primitives (processes & message passing) are well geared exactly for what it
focuses on.

Erlang focuses strongly on fault-tolerance and subsequently on high
availability, but I'm not sure this perk is such a significant benefit anymore
compared to various other runtimes spinning the wheels of your typical web
server in a cloud setup. We rarely need to write servers anymore that keep
running for many years straight without a restart, only patching code via hot
reloads.

Just my take but since you asked, trading some fault tolerance for more
correct programs would therefore probably be a fair trade-off for a _new kind
of Erlang_.

~~~
lostcolony
Thanks for the reply!

Yeah; I always added type annotations to my code and ran Dialyzer...I wish
there was a stricter compilation mode for that. That said, I really liked the
'optimistic' part of it, so I didn't spend a large amount of time correcting
my type specs when proper testing proved the code worked.

I can agree with bfrog's comment too; Rust's typing can definitely help
eliminate a bunch of classes of bugs...but honestly I'm not sure I ever ran
into them using Erlang in production anyway. Immutable data + message passing
tends to make it easy to implement things correctly, whereas some of Rust's
more standout features seem designed to solve problems caused by the language
itself picking a different set of tradeoffs (i.e., borrow checking as part of
what is necessary to manage memory without risking it going out of scope
prematurely, or being unable to be reclaimed at a deterministic time by the
compiler). I'm not familiar enough with Rust myself to really comment though;
I just know that in two and a half years of production Erlang, the only bug we
ever encountered was caused by a C driver (and our own bad design in not
circuit breaking calls to it, leading to restarts that trickled up the
supervisor chain under heavier load than we'd tested for). Rust would not have
been able to help us with any of that. Even in the driver, the issue was an
unnecessary network call that sometimes hit an empty cache, causing a huge
network hiccup that led to an unhandled timeout. Erlang got us 'correctness'
as well as 'fault tolerance', at least as much as a language can (i.e., we
could still implement the wrong things, such as when we implemented an O(N^2),
but correct, algorithm that we had to replace with an O(N) one when we noticed
certain calls being too slow).

------
xanth
I wonder how this performs in comparison to actix[1] & axiom[2]?

1\. [https://github.com/actix/actix](https://github.com/actix/actix)

2\. [https://github.com/rsimmonsjr/axiom](https://github.com/rsimmonsjr/axiom)

------
dana321
Runtime for what? Does it only run rust code?

------
pronoiac
Odd name - bastion hosts, aka jumpboxes or homeboxes, are also the access
points that bridge different security zones, like internet to a secure VPC.

------
spurdoman77
Can someone elaborate on the use cases for this?

~~~
gavinray
To provide context, understanding this requires a little bit of background
knowledge about concurrency paradigms.

In concurrent programming, there are a few mental models/approaches you can
use to achieve it. Each of them has a different "value system" and different
tradeoffs, if you will.

In a nutshell, you have:

\- Locks (Mutex/Semaphore)

\- Communicating Sequential Processes

\- Software Transactional Memory

\- Actor Model

The Actor Model is a particularly powerful paradigm because it isolates
processes and works via message passing and spawning. Erlang/Elixir are
fault-tolerant because of the BEAM's process model: any given process (more
or less) can fail, and it's not a problem due to isolation.

What this library allows you to do is architect applications in ways such that
they are much more resilient to failure and easier to scale out +
parallelize/distribute.

It doesn't have to be a networked application either, any code process can be
an actor. It applies to any software.
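
As a rough illustration of the idea (isolated state, message passing, and a supervisor that restarts a failed actor), here is a toy sketch using only Rust's standard library. It is not Bastion's API; the names `Msg`, `actor`, `supervise`, and `demo` are invented for this example, and it assumes the default `panic = "unwind"` setting.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Messages understood by the toy actor (hypothetical).
pub enum Msg {
    Add(i64),
    Crash,
}

/// Actor body: owns its state and changes it only in response to messages.
/// Returns the final total once every sender has been dropped.
fn actor(mailbox: &Receiver<Msg>) -> i64 {
    let mut total = 0;
    for msg in mailbox.iter() {
        match msg {
            Msg::Add(n) => total += n,
            Msg::Crash => panic!("actor crashed"),
        }
    }
    total
}

/// Supervisor: restarts the actor with fresh state whenever it panics.
/// Returns the final total and the number of restarts.
pub fn supervise(mailbox: Receiver<Msg>) -> (i64, u32) {
    let mut restarts = 0;
    loop {
        // The mailbox outlives a crashed actor, so queued messages survive.
        match panic::catch_unwind(AssertUnwindSafe(|| actor(&mailbox))) {
            Ok(total) => return (total, restarts),
            Err(_) => restarts += 1,
        }
    }
}

/// Sends a few messages, including one that crashes the actor.
pub fn demo() -> (i64, u32) {
    let (tx, rx) = mpsc::channel();
    let supervisor = thread::spawn(move || supervise(rx));
    for msg in vec![Msg::Add(1), Msg::Crash, Msg::Add(2), Msg::Add(3)] {
        tx.send(msg).expect("supervisor is alive");
    }
    drop(tx); // closing the channel lets the actor finish normally
    supervisor.join().expect("supervisor panicked")
}
```

Here the crash wipes the actor's state (the `1` added before it is lost), but the rest of the program keeps running and later messages are still processed. A real actor runtime adds scheduling, supervision trees, and distribution on top of this basic shape.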

If you want a great overview of the Actor model, there are some slides here
which do a fantastic job of illustrating it:

[https://cs.nyu.edu/wies/teaching/ppc-14/material/lecture10.p...](https://cs.nyu.edu/wies/teaching/ppc-14/material/lecture10.pdf)

~~~
michael_j_ward
Do you have any good resources for learning more about these models /
approaches?

~~~
macintux
I’ve only skimmed this, but it seemed reasonable.

[https://pragprog.com/book/pb7con/seven-concurrency-models-in...](https://pragprog.com/book/pb7con/seven-concurrency-models-in-seven-weeks)

