
Show HN: Hypergolix, “programmable Dropbox” with client-side encryption - nbadg
https://pyhgx.readthedocs.io/en/latest/index.html
======
nbadg
Hi HN, basically a one-man show here. Hypergolix just entered its alpha
release, and I thought it would be an excellent opportunity for feedback. The
fundamental motivation of the project is for individuals to retain autonomy
over information they store on untrusted third-party servers; the hope is that
Hypergolix can make "IoT" development easier with client-side encryption than
without it.

Some links:

[1] Project website:
[https://www.hypergolix.com/](https://www.hypergolix.com/)

[2] Hypergolix source (~36k LoC):
[https://github.com/Muterra/py_hypergolix](https://github.com/Muterra/py_hypergolix)

[3] Golix docs (the crypto protocol that powers Hypergolix):
[https://github.com/Muterra/doc-golix](https://github.com/Muterra/doc-golix)

[4] Golix Python implementation (~5k LoC; needs rewrite):
[https://github.com/Muterra/py_golix](https://github.com/Muterra/py_golix)

[5] Hypergolix Slack channel:
[https://hypergolix.signup.team/](https://hypergolix.signup.team/)

~~~
mempko
As a fellow distributed software developer
([http://firestr.com](http://firestr.com)) great work! Quick question, how are
you doing conflict resolution if there is a long term network partition? Does
a client have any control over merge conflicts?

~~~
nbadg
Thanks! All mutable objects have an internal monotonic counter. Currently,
whichever counter is highest takes precedence, and any decreasing counter is
rejected by the server. The "older" (lower counter) copy will receive an error
when it tries to push, which will automatically pull in the most recent state.
The application can then decide what to do.
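
A minimal sketch of the counter rule described above. This is an in-memory illustration, not Hypergolix code: `RelayServer`, `StaleUpdate`, and `push_or_pull` are hypothetical names, and it assumes that only a strictly higher counter is accepted.

```python
class StaleUpdate(Exception):
    """Raised when a push carries a counter that is not newer than the stored one."""

    def __init__(self, current_counter, current_state):
        super().__init__(f"stale push; server is at counter {current_counter}")
        self.current_counter = current_counter
        self.current_state = current_state


class RelayServer:
    """Hypothetical in-memory stand-in for a persistence server."""

    def __init__(self):
        self._objects = {}  # object id -> (counter, state)

    def push(self, obj_id, counter, state):
        stored_counter, stored_state = self._objects.get(obj_id, (-1, None))
        # Assumption: only a strictly higher counter is accepted.
        if counter <= stored_counter:
            raise StaleUpdate(stored_counter, stored_state)
        self._objects[obj_id] = (counter, state)


def push_or_pull(server, obj_id, counter, state):
    """Try to push; on rejection, return the newer (counter, state) from the server."""
    try:
        server.push(obj_id, counter, state)
        return counter, state
    except StaleUpdate as exc:
        # The application decides what to do with the newer state.
        return exc.current_counter, exc.current_state


server = RelayServer()
server.push("settings", 1, {"interval": 5})
server.push("settings", 2, {"interval": 10})

# A copy that was offline still holds counter 1 and tries to push.
result = push_or_pull(server, "settings", 1, {"interval": 99})
print(result)
```

Here the stale copy's push is rejected and it ends up holding the server's newer state; merging the two is left to the caller.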

That being said, contention issues like this require:

1\. the same account

2\. to be accessing the same object

3\. at the same time

4\. from multiple computers

5\. one (or more) of which goes offline

6\. while both are still producing data

This is the primary reason why support for concurrent instances of the same
account is very experimental. All objects are single-author, so if you don't
have concurrent sign-ons, you have no contention.

Conflict resolution will always be an application-level concern. I would like
to expose some synchronization primitives (distributed locks, semaphores, etc)
for use within accounts, but this is a ways down the road.

------
problems
How is the other end located? Do you have a DHT for this? LAN
discovery? Or are we basically relying on your server to stay online?

Not a lot of notes about internals like this on the site - as an end-user
developer it looks very good, so I'm sure someone will use it, but as a small
time operations guy I worry about it.

~~~
nbadg
Because the system is asynchronous, you have to have a persistence server
somewhere -- think email, not P2P. Since everything needs that server to work
anyway, it's doing double-duty as a relay server. Each endpoint pubs/subs, and
the mutual server handles the rest. So for example, when I was monitoring my
home server from my flight over the holidays, all traffic was passing through
hgx.hypergolix.com.

But it's specifically designed to use as many relay servers as you'd like, at
the same time. So if you're worried about uptime, you can run your own
servers. You do that like this:

    
    
        hypergolix config --addhost HOSTNAME PORT TLS
    

So, when I'm at home, my laptop will get updates over my LAN home server, in
addition to hgx.hypergolix.com. Not only is this more reliable, it also
reduces the receiver latency (sending latency is unaffected, because it's
still pushing upstream to both servers).
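
A rough sketch of why a second relay server lowers receive latency but not send latency. Everything here is illustrative: `Receiver`, `fan_out`, the `lan-server` name, and the delay figures are made up, and dropping duplicate copies by SHA-512 digest is an assumption rather than a description of how Hypergolix does it.

```python
import hashlib


class Receiver:
    """Hypothetical endpoint subscribed to several relay servers at once."""

    def __init__(self):
        self._seen = set()
        self.delivered = []  # (arrival time, server name, payload)

    def on_update(self, arrival, server, payload):
        digest = hashlib.sha512(payload).hexdigest()
        if digest in self._seen:
            return False  # duplicate copy from a slower server
        self._seen.add(digest)
        self.delivered.append((arrival, server, payload))
        return True


def fan_out(receiver, payload, sent_at, server_delays):
    """Push one update to every server; hand copies over in arrival order."""
    arrivals = sorted((sent_at + delay, name) for name, delay in server_delays.items())
    for arrival, name in arrivals:
        receiver.on_update(arrival, name, payload)


# Made-up one-way delays in seconds, for illustration only.
delays = {"hgx.hypergolix.com": 0.080, "lan-server": 0.002}
laptop = Receiver()
fan_out(laptop, b"cpu=12%", sent_at=0.0, server_delays=delays)
print(laptop.delivered)
```

The sender still pushes to every server, so its cost is unchanged; the receiver acts on whichever copy arrives first and ignores the rest.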

LAN discovery (of both services and actual users) is planned but not currently
supported; there are a whole host of P2P operations that Hypergolix is very
well-suited for, but that haven't yet been implemented due to time
constraints.

------
gcb0
the first application you can provide (maybe as the project sample app) is to
observe one file and sync it when changed.

most people I know who use dropbox use it with expensive media workflows
that are extremely slow to adopt anything.

~~~
nbadg
The idea isn't to replace Dropbox. Hypergolix doesn't sync _files_ , it syncs
_objects_. That might seem like a small distinction, but when you're writing
application code, it makes a big difference. For example, the second half of
the sample app (which hasn't been written up yet, but has source on github
[1]) uses a different object to remotely control the logging frequency on the
server.

[1]
[https://github.com/Muterra/py_hypergolix_demos/blob/master/t...](https://github.com/Muterra/py_hypergolix_demos/blob/master/telemeter/telemeter.py)
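
To make the object-versus-file distinction concrete, here is a toy sketch of an object used as a remote control for logging frequency. It is not the code from the linked demo and does not use the Hypergolix API: `SharedObject` and `Telemeter` are hypothetical stand-ins, and the "sync" is just a local callback.

```python
class SharedObject:
    """Hypothetical stand-in for a synced object; not the Hypergolix API."""

    def __init__(self, state):
        self.state = state
        self._callbacks = []

    def on_update(self, callback):
        self._callbacks.append(callback)

    def push(self, new_state):
        # In a real deployment the update would travel via a relay server.
        self.state = new_state
        for callback in self._callbacks:
            callback(self)


class Telemeter:
    """Logs readings at an interval that a remote peer can change."""

    def __init__(self, interval_obj):
        self.interval = interval_obj.state
        interval_obj.on_update(self._interval_changed)

    def _interval_changed(self, obj):
        self.interval = obj.state


interval = SharedObject(60)   # seconds between log entries
monitor = Telemeter(interval)
interval.push(5)              # the remote side asks for faster logging
print(monitor.interval)
```

The application reacts to a value changing, with no file to watch, parse, or write back.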

~~~
gcb0
I got that. but the first thing I can think of is to have a client that syncs
files :)

even applications that sync files already treat them as "objects" as you need
decisions on which side has a more up to date version for conflict resolution
and such.

------
StavrosK
Am I right in reading that everything is free, _except_ if you want the server
to store your data? I.e., if I store data on my home server, I can use this
for free?

Does your server have access to any plaintext?

~~~
nbadg
Correct. This is yet another "hosted platform for revenue" open source
project; if you deploy on your own servers, you don't pay us.

Servers have no access to plaintext. They also have extremely limited
metadata: only the "author" [1] of the data and its ciphertext length are
known.

[1] Technically not the author but the "binder", which is a specific term used
in the protocol, but we're getting a little deep into the weeds. See here for
more info about binders:
[https://github.com/Muterra/doc-golix/blob/master/whitepaper....](https://github.com/Muterra/doc-golix/blob/master/whitepaper.md#data-retention-and-removal)

~~~
StavrosK
That's pretty cool, thanks. So I don't need to worry about deploying MQTT
servers and authenticating between them, I can just use this.

Can I suggest this alternative API:

    obj = hgxlink.new_threadsafe(cls=hgx.JsonProxy, state='Hello world!')
    obj.share_threadsafe(bob)

becomes:

    bob.send({"some": "serializable object"})

~~~
nbadg
The API is definitely more cumbersome than I would prefer.

I really like this idea:

> bob.send(obj)

However, I don't think object creation and sending will ever be combined into
the same operation, because:

\+ objects don't need to be shared (imagine using an object to track
application settings; you don't want to send that to Bob but you want it to
persist across sessions)

\+ objects can be shared with more than just Bob (and we'd like it to be the
same object!)

So hopefully, in the future the API will look more like:

> obj = hgx.JsonProxy('hello world')

> bob.share(obj)

Unfortunately because of the async/await syntax, this gets a little
complicated to implement. But it's definitely on the horizon.
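
A toy sketch of why creation and sharing stay separate in that hoped-for shape. These classes are hypothetical stand-ins written for illustration, not the current or planned Hypergolix API, and the sketch ignores the async/await complication entirely.

```python
class JsonProxy:
    """Hypothetical object wrapper; creating one does not share it."""

    def __init__(self, state):
        self.state = state
        self.shared_with = set()


class Contact:
    """Hypothetical handle for another account."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def share(self, obj):
        # Hand over a reference to the same object, not a copy.
        obj.shared_with.add(self.name)
        self.inbox.append(obj)


settings = JsonProxy({"theme": "dark"})  # never shared, only persisted
greeting = JsonProxy("hello world")

bob, carol = Contact("bob"), Contact("carol")
bob.share(greeting)
carol.share(greeting)

print(bob.inbox[0] is carol.inbox[0])
print(settings.shared_with)
```

Both recipients hold the same object, and the settings object exists without ever being sent to anyone, which a combined create-and-send call could not express.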

~~~
StavrosK
Oh, I see, so not everything is 1:1. That makes sense then, thanks.

------
StavrosK
That looks great! How are you encrypting communications between nodes?

~~~
nbadg
With a purpose-built protocol called Golix [1]. The documentation goes into a
lot more detail but it has three main aspects:

1\. Encrypt things like PGP, except key encapsulation is separate from
ciphertext delivery. Specific primitives used are AES-256, RSA-4096, and
X25519, though deprecation of RSA is planned soon

2\. Everything is content/hash addressed, which helps substantially with the
above. Specific primitive: SHA-512

3\. Data retention is governed like a reference-counted programming language;
data gets a container, and then you make a signed "binding" to give the
container an address. You can then sign binding revocations ("debindings").
When no addresses are left, the server removes the content.
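
Points 2 and 3 can be sketched together as a content-addressed store that drops data once nothing binds it. This is a simplified illustration, not the Golix implementation: `ContainerStore` and its methods are hypothetical, signatures and encryption are omitted, the SHA-512 digest is used directly as the address, and bindings are modelled as a plain set of binder ids.

```python
import hashlib


class ContainerStore:
    """Hypothetical server-side store: content-addressed, retained while bound."""

    def __init__(self):
        self._containers = {}  # SHA-512 hex digest -> ciphertext bytes
        self._bindings = {}    # digest -> set of binder ids

    def put(self, ciphertext, binder):
        address = hashlib.sha512(ciphertext).hexdigest()
        self._containers[address] = ciphertext
        self._bindings.setdefault(address, set()).add(binder)
        return address

    def bind(self, address, binder):
        if address not in self._containers:
            raise KeyError(address)
        self._bindings[address].add(binder)

    def debind(self, address, binder):
        binders = self._bindings.get(address, set())
        binders.discard(binder)
        if not binders:
            # No bindings left: drop the content.
            self._containers.pop(address, None)
            self._bindings.pop(address, None)

    def __contains__(self, address):
        return address in self._containers


store = ContainerStore()
addr = store.put(b"opaque ciphertext", binder="alice")
store.bind(addr, "alice-laptop")

store.debind(addr, "alice")
still_there = addr in store   # one binding remains

store.debind(addr, "alice-laptop")
gone = addr not in store      # last binding revoked, content removed
print(still_there, gone)
```

The store only ever handles ciphertext bytes and binder ids, which mirrors the limited metadata a server is said to see.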

[1] [https://github.com/Muterra/doc-golix](https://github.com/Muterra/doc-golix)

~~~
pvg
One thing I didn't get from the docs is some clearer explanation justifying
what at first glance seems like a great deal of bespoke, custom crypto. Just
browsing through the source I'm immediately hit with talk of running out of
entropy, use of deprecated implementations, use of low-level functionality
from a library that actually tries to provide safer, higher-level constructs,
etc.

Maybe this is all necessary. But it isn't at all obvious why.

~~~
nbadg
Thanks for the feedback. Can I ask which parts of the documentation you were
looking at?

~~~
pvg
Skimmed the security paper and a bit of the protocol lib source.

------
mxuribe
Sounds interesting...the use-cases are not clicking in my head just right...So
I'll dive in and rummage around some more.

------
thruflo22
Nice work, keep it up and thanks for sharing!

