
Show HN: Baxx – Unix-friendly backup service - zulgan
https://txt.black/~jack/baxx-dev.txt
======
bloopernova
(Alternative product recommendation, please downvote/remove if you feel that
isn't appropriate)

For Unix/Linux backups, may I suggest Borg Backup? It encrypts and does dedupe
astonishingly well. It also works over SSH incredibly fast, and restores are
done via a mounted FUSE filesystem, so it's easy to pick and choose what you
need. It prunes really well too, and is a single executable so it's easy to
distribute via Ansible/Puppet/etc. I've been using it for several years and it
hasn't failed me yet.

If y'all want to laugh at my bash scripting skills, I have a backup script
that sends backup status to a Zabbix monitor server at
[https://gist.github.com/anthonyclarka2/cef41d201dd5b890dae67...](https://gist.github.com/anthonyclarka2/cef41d201dd5b890dae6786010531aeb)
(I'd appreciate improvements to that script if anyone wants to critique)

~~~
dmd
I evaluated Borg and Restic and found that both of them fall over once you get
to (what I consider to be) production level volumes; in my case that's ~1 PB
and in the range of a billion files.

Sadly, the only thing I've found so far that works at all at those scales is
Bacula, and that is file-based -- i.e., if you have a gigabyte file that
changes by one byte, it backs up the whole gigabyte again. Not ideal.
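
To make the file-based vs. chunk-based difference concrete, here is a minimal Python sketch. It assumes fixed-size chunks keyed by SHA-256; the names `chunk_digests` and `chunks_to_upload` and the 4 KiB chunk size are made up for illustration, and this is not how Bacula, Borg or Restic are actually implemented (Borg and Restic use content-defined chunking).

```python
import hashlib
import os

CHUNK_SIZE = 4096  # illustrative chunk size, chosen only for the demo

def chunk_digests(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks and return the SHA-256 hex digest of each."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]

def chunks_to_upload(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> int:
    """Count chunks of `new` whose digest was not already stored for `old`."""
    stored = set(chunk_digests(old, chunk_size))
    return sum(1 for d in chunk_digests(new, chunk_size) if d not in stored)

if __name__ == "__main__":
    original = os.urandom(1024 * 1024)   # 1 MiB of random sample data
    changed = bytearray(original)
    changed[500_000] ^= 0xFF             # change a single byte
    total = len(chunk_digests(original))
    print(chunks_to_upload(original, bytes(changed)), "of", total, "chunks need re-upload")
```

A one-byte change in place touches one chunk here. Fixed-size chunks stop helping as soon as a byte is inserted or removed, because every later chunk shifts; that is the case content-defined chunking is meant to handle.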

~~~
8fingerlouie
I can't say I've been testing with PB sizes, but for my "meager" 8TB backup,
Borg works well. A daily backup & prune operation takes less than 20 minutes.

Restic falls over as soon as you cross 2TB sizes, and prune operations are
painfully slow. Backing up 8TB took 4 weeks with Restic, and I gave up waiting
for prune to finish. It was crossing the 24 hour mark, making it unsuitable
for daily backups.

~~~
dmd
Yeah. I back up a few terabytes a _day_ with no problem in Bacula.

------
sanderjd
This is really neat. I really love the idea and the presentation.

I would not trust my backups to your service _yet_ , just because of the "this
is a prototype" language. My immediate thought is, "this seems great, I'll
have to come back and check it out once it's more of a real business". But
therein lies the rub, I think: what will drive me back to check it out later?
There doesn't seem to be a mailing list to sign up for. Maybe you'll hit the
front page of HN with a full launch later, and I'll see it that way, which
would be great, but maybe not!

In any case, nice work!

~~~
zulgan
good point, I added 'make a mailing list' to the todo

~~~
zulgan
i made a mailing list and slack

      slack: https://baxx.dev/join/slack
      google groups: https://baxx.dev/join/groups

------
lazyant
Nobody has commented on what jumped at me as a great feature: detecting if the
backup seems bad.

      * get notified if the file is too small  
      * get notified if the file is too old
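
For anyone wondering what such checks boil down to, here is a minimal Python sketch. The function name `backup_alerts`, its parameters and the thresholds are hypothetical; this is not baxx's actual rule engine, just the two rules above expressed as code.

```python
import time

def backup_alerts(size_bytes, mtime, min_size_bytes, max_age_seconds, now=None):
    """Return a list of human-readable alerts for one uploaded file.

    size_bytes / mtime describe the newest uploaded version of the file;
    min_size_bytes / max_age_seconds are thresholds chosen by the user.
    """
    now = time.time() if now is None else now
    alerts = []
    if size_bytes < min_size_bytes:
        alerts.append(f"file is too small: {size_bytes} < {min_size_bytes} bytes")
    age = now - mtime
    if age > max_age_seconds:
        alerts.append(f"file is too old: {age:.0f}s > {max_age_seconds}s")
    return alerts

if __name__ == "__main__":
    # A 10-byte dump last uploaded two hours ago, with a 1 KiB / 1 hour policy.
    for alert in backup_alerts(10, time.time() - 7200, 1024, 3600):
        print(alert)
```

The "too old" rule is the one that catches a cron job that silently stopped running, which is the failure that plain upload-and-forget backups tend to miss.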

------
cyphar
I have a feeling rsync.net is far more Unix-friendly as a backup service (it's
basically a Unix shell provider with automated ZFS snapshots). I do think the
website has a neat aesthetic though, and it is cute you can register over SSH
(not that this is practically advantageous over a website-based registration
-- registration is a one-time operation).

The repo on GitHub doesn't have a license AFAICS.

------
Topgamer7
Your email provider domain appears to have been hacked and is being used for
black-hat SEO purposes:
[http://sofialondonmoskva.com/blog/](http://sofialondonmoskva.com/blog/)

~~~
zulgan
whoa this is my friend's company domain, i've just had email there for 10 years
or so, will let him know, thanks!

------
samat
Some very basic html with no css would be much easier to read than parsing
markdown in my head. Machines should do machine work, not people.

Edit: still love idea and execution, thanks for sharing!

~~~
furyofantares
This is kind of a funny comment; markdown was designed specifically to be
readable as-is, which is why so much of its syntax looks like conventional
email and other plain-text usages that predate it.

~~~
throwaway123156
True, but markdown was meant to be readable for a markup language.

Which is likely still less readable than something whose sole purpose is to
present to humans, such as HTML/CSS.

------
ebg13
> _Trial 1 Month 0.1E_

> _Subscription: 5E per Month_

> ...

> _I decided to charge 5$ (0.1$ trial)_

So which is it? $ or E? And is E supposed to be €? Also, in English the
currency marker goes to the left of the number.

~~~
zulgan
I used E because € does not render on xterm, maybe I should switch it to EUR
to remove the confusion

also fixed the $ to E in the blog, thanks for the heads up

~~~
yjftsjthsd-h
> because € does not render on xterm,

It does for me?

~~~
sourcesmith
Yes, no problem rendering a € in an xterm.

------
UI_at_80x24
I want to thank you for using a good name for your service.

Too many times some developer will pick a common word to launch their
service/product thus making it impossible to search for it.

Your name (a) is a single syllable

(b) almost hints at what it does in the name (Baxxup?)

(c) is not easily confused with other products

(d) Your closest competitor from a google SEO standpoint is a tanning salon
(single word google)

Great job! I wish I was creative like that. Usually my crap looks like: sbs
(server backup script). Works for me, but my co-workers hate me.

~~~
zulgan
thanks! my process is actually much easier than you think: i simply do some
small transpositions on really cool (for me) names i have seen in world of
warcraft :) such as zulgan and juun, judoc; jaxx -> baxx because backups start
with b

others I chose are horrible, such as
[https://scrambled-eggs.xyz](https://scrambled-eggs.xyz) which is effectively
unsearchable, though you could say it was by design haha

------
xaybey
For some reason there's a lot of negativity in this thread. I thought the txt
file was really cool, signed up immediately, and have been having a lot of fun
with the interface. I don't remember the last time I found registering for a
SaaS product to be _fun_.

~~~
rsync
"I don't remember the last time I found registering for a SaaS product to be
fun."

Agreed. This:

"The way you register is through ssh, just `ssh register@ui.baxx.dev`"

Is very cool and makes me happy.

~~~
peteradio
That is very cool, how is this accomplished??

~~~
humblebee
Enable empty password for a particular user, and set the default login shell
to the registration program.

[0][https://github.com/jackdoe/baxx/blob/3d2f6f014e5def3b5c35496...](https://github.com/jackdoe/baxx/blob/3d2f6f014e5def3b5c35496e9b68df220f68e099/gui/docker/sshd_config#L33)

[1][https://github.com/jackdoe/baxx/blob/3d2f6f014e5def3b5c35496...](https://github.com/jackdoe/baxx/blob/3d2f6f014e5def3b5c35496e9b68df220f68e099/gui/docker/entry.sh#L24)

------
StreamBright
How does this compare to Tarsnap?

[https://www.tarsnap.com/](https://www.tarsnap.com/)

------
marssaxman
I like this.

Another down-to-earth, unix-oriented, fair-deal style "cloud" service worth
investigating is rsync.net. I've been using it to sync data between my
devices, and I like the way it leaves the user in control.

I have not tried Baxx yet, but it seems to be a project in a similar spirit,
and I am happy to see more investment in that kind of future.

------
swlkr
This is really cool. The aesthetic is amazing. This really speaks to the
hacker in me, the no-website thing is a really inventive way to stand out from
other backup services.

Great execution!

------
billpg
If there's no website, what am I looking at when I follow the link?

~~~
kiallmacinnes
I think the intent is that there is no web UI for the product itself. The UI
for the product itself is a terminal program, accessed via either SSH or a
web-based terminal client. The product UI has zero HTML/CSS/etc.

~~~
gojomo
The website [https://ui.baxx.dev](https://ui.baxx.dev) is, in fact, serving
HTML, CSS, and JS.

I get the point, but the "without a website" claim is kind of a cutesy
attention-grabbing prevarication.

I'm usually pretty tolerant of those, but personally find that here it mixes
poorly with the idea of a backup service. From such a service, I'd prefer to
see total honesty, and a sort of buttoned-down, almost military
humorlessness/seriousness.

~~~
kiallmacinnes
You're of course right, that page is serving HTML/CSS/JS - but that's kinda
nitpicking.

It's roughly akin to me saying "my Golang on Linux program is 100% C code free -
no C is used in the execution of my code!" .. well, clearly that statement
would be false.

However, it's perfectly legitimate to say my project is C code free - and
that's what this project is saying. It's saying the project is web-tech free.
There just happens to be an off-the-shelf execution environment which uses
web-tech.

~~~
gojomo
Perhaps if you don't mind fuzzy hand-wavy claims that are as likely to mislead
as inform, they could say "our project's source code includes no 'web-tech'".

But that's not the headline. The headline was instead that it's a service
"without a website". But there is absolutely, positively, no-nitpicking-involved,
an HTTPS-accessible website for the service, set up by the service's
own creators.

------
zulgan
author here: I am going to my daughter's birthday, so I won't reply to comments
for a while, but if anyone runs into trouble please send an email to
jack@baxx.dev

thanks a lot for the feedback, much appreciated!

------
marceloneil
May I suggest Kimsufi along with Hetzner if you scale up? It's similar
pricing and they have some NA locations (although less storage). I've had good
experiences with them.

~~~
zulgan
i will check it out thanks!

------
dsr_
My key objection is that it doesn't offer any additional value: if you are the
sort of customer who would do this, you are also the sort of customer who
would implement the same chain of commands to back up to a virtual machine that
you owned, eliminating the attractiveness of baxx.dev as an aggregated target.

What additional value can you add?

~~~
zulgan
* it took me a good 2 weeks to implement it

* i have actually implemented these kinds of systems multiple times and always failed at the 'who watches the watchers' part, and usually ended up with broken backups (as in: i usually needed them a few years after the system was written, and by then something was always wrong)

* in the future it will use contextual bandits to probe for broken backups (such as: hey, is file XYZ with size X uploaded at Y weird?) and learn from all the customers triggering the AI flywheel, more backups more data, better contextual bandits, better service, more customers (the idea is to use bandits[or something else with exploration factor] for probability of "good" notification)

so in a few words: value = out of the box alerting + nice api + machine learning
(not implemented yet) + good price (no cloud)

------
escaper
I get that it's probably "trendy" to say you don't have or need a website but
with it being so easy to create a static page, I'm failing to understand why
you wouldn't just do that, which is already the bare minimum, if you care
about your project and want to sell it?

~~~
dmos62
Why is an HTML document the bare minimum? I absolutely appreciate this far
more accessible form, author's dedication to function and disregard for
showmanship. Somewhat paradoxically, for a potential client like me, this is
way better marketing than any amount of superfluous CSS sugaring or
formatting.

~~~
recursive
I think this text/plain presentation _is_ a form of showmanship.

Put another way, do you think there is ever any amount of CSS that is not
superfluous?

~~~
dmos62
It's showmanship in that it's so nonstandard that it's theatrical. We
shouldn't stigmatize that.

CSS can be non-superfluous in documents, like when formatting text or a table.

I'd like to focus on the fact that this required orders of magnitude less
effort than designing a web site, and most importantly – the presentation did
not suffer. There were no compromises to be made.

Some have mentioned that this lacks functionality, but I'm having a hard time
imagining what that might be. Maybe mobile-readiness, in that it's
preformatted. Considering that this is a Unix-y tool for power-users, I think
it's ok to expect them to read this on a PC.

------
arendtio
Two reasons why I do not want to use the service in its current state (meant
as constructive criticism):

1\. If I were to send my backups to some SaaS, there is no way I would do that
without encryption.

2\. I like to back up my filesystem and not just the files in it (to make sure
I've got everything and to make restoring easy). Currently, I just dd my block
devices, but I am sure that could be optimized to not upload a complete image
every time.

Sure, I could solve those two problems myself, but then I could also use AWS
S3/Backblaze ;-)

~~~
ars
> Currently, I just dd my block devices

That's actually not a good idea at all.

It's very fragile, the slightest problem and you may lose the entire backup.

Silent corruption may get backed up for months and you won't notice until it's
too late.

Doing a restore means having space for the full file, just to restore a single
small file.

If you don't know when your file changed you may have to do that multiple
times instead of just being able to check when that single file changed.

In short: don't do that. Copy the full directory structure, sure. But not the
disk image.

~~~
arendtio
Why is dd fragile? I mean, yes doing so while the filesystem is mounted is
like Russian Roulette, but if the filesystem is unmounted? I see that it isn't
very efficient, but I never had any problems with reliability.

Silent corruption might actually be a problem, but that is something you won't
solve by backing up files and directories instead of block devices. After all,
a block device backup just contains more information than a files and
directories backup. The only advantage of filesystem backups is that you can
more easily validate whether a file should have changed, but even if you detect
that it changed when it shouldn't have, you still need some kind of checksum or
so to find out which version is correct.

On the other hand, if you back up the files and directories you have to care
about the filesystem type and whether there are special types like links, device
nodes and the like. On that side, I had enough unpleasant experiences that I
am trying to avoid that trouble.

To restore single files I can simply mount the image on a loop device, so no
problem there.

~~~
zulgan
there is absolutely no technical problem doing it on an unmounted filesystem but
it just imposes certain restrictions that I don't think are worth their cost

considering in many cases you want to copy live files, you have to copy them
to a place you can unmount, so now you have 2 problems

if you don't gzip the dd you risk pure cosmic microwave background bitflips and
no checksums, and they can corrupt superblocks etc

if you gzip it (to have crc sums) you can't restore partial files from it so
easily unless you use squashfs and significantly complicate the process

so tar and gzip is more robust (compared to the dd) just because of the crc
(of gzip not of tar), size and portability
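
The CRC point can be checked with a few lines of Python. This is only a sketch under stated assumptions: the "image" is a made-up byte string standing in for a dd dump, and `flip_bit` / `gzip_detects_damage` are illustrative helper names, not part of any backup tool.

```python
import gzip
import zlib

def flip_bit(blob: bytes, index: int) -> bytes:
    """Return a copy of blob with the lowest bit of one byte flipped."""
    damaged = bytearray(blob)
    damaged[index] ^= 0x01
    return bytes(damaged)

def gzip_detects_damage(blob: bytes) -> bool:
    """True if decompressing the blob fails a format or integrity check."""
    try:
        gzip.decompress(blob)
    except (OSError, EOFError, zlib.error):
        # BadGzipFile (a subclass of OSError) is what a CRC mismatch raises.
        return True
    return False

if __name__ == "__main__":
    image = b"superblock and file data " * 400   # stand-in for a raw dd image
    # compresslevel=0 stores the payload uncompressed, so a flipped byte in the
    # middle should land in the stored data and be caught by the CRC32 trailer.
    archive = gzip.compress(image, compresslevel=0)
    print("intact archive flagged: ", gzip_detects_damage(archive))
    print("damaged archive flagged:", gzip_detects_damage(flip_bit(archive, len(archive) // 2)))
```

The same flip applied to the raw image would go unnoticed, since a plain dd dump carries no checksum of its own.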

------
presto8
Love the idea and implementation! However I have two main questions/concerns
that might prevent me from being a customer:

Cost. Currently I pay $10/month to CrashPlan to store 2TB a month. I think
your plan is to simply pass through the glacier price, but even that would be
much higher cost I think?

Large files. Because I only have finite monthly bandwidth and a limited upload
speed, it would be good to do incremental backup of large binaries, only
uploading the parts that changed.

------
alexandernst
As cool as this could be, this service is not passing any company-grade backup
quality requirements. Also, the price is a complete no. Not because 5€/m is
much, but because 5€/m can get you hundreds of GBs of storage in AWS Glacier,
with contractual guarantees about your data, how it's managed, etc...

Basically, this is a toy. Use it as a toy, not as a service where you'd
actually put your company files.

That said, good job!

~~~
GordonS
> because 5€/m can get you hundreds of GBs of storage in AWS Glacier

Yes... until you want to take anything back out - then you're going to have to
wait, and you're going to pay a lot more than €5

~~~
alexandernst
Indeed, but it matters how much of the data you want to recover and how fast
you want to recover it. Getting 100 GB from Glacier is not expensive at all if
you can wait a couple of days.

~~~
GordonS
When Glacier first launched I looked at pricing, and it was going to cost a
_lot_ to restore from backup if needed. But after your comment I checked the
pricing again, and you're right - it's very much cheaper now than it used to
be!

------
Arkanosis
I'm confused by the `find | xargs | sha256 | curl` trick… Is it even remotely
as efficient as the rolling checksum of an actual rsync?
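
For context on what "rolling checksum" refers to, here is a small Python sketch of the weak checksum described in the rsync technical report (an Adler-32 style pair of sums). It is not rsync's actual C code, the function names are mine, and it says nothing about how the baxx pipeline behaves.

```python
M = 1 << 16  # modulus used by the weak checksum

def weak_checksum(block: bytes):
    """Compute the (a, b) pair for a block from scratch, in O(len(block))."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b

def roll(a: int, b: int, out_byte: int, in_byte: int, n: int):
    """Slide a window of length n one byte to the right in O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b

def rolling_checksums(data: bytes, n: int):
    """Yield (offset, a, b) for every window of length n in data."""
    if len(data) < n:
        return
    a, b = weak_checksum(data[:n])
    yield 0, a, b
    for k in range(1, len(data) - n + 1):
        a, b = roll(a, b, data[k - 1], data[k + n - 1], n)
        yield k, a, b

if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog"
    for offset, a, b in rolling_checksums(sample, 8):
        print(offset, a + (b << 16))
```

The O(1) update is what lets rsync test a block match at every byte offset of a file. A whole-file hash cannot do that: it can only say whether the file changed at all.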

------
decide1000
This is great. Building a company from the terminal. Good luck on your
journey.

Don't need a backup solution right now, but I'll keep it in mind.

------
brg1007
".... 750E profit - 50% tax = 375E profit"[1]. 50% Taxes in DE, nice place to
do business ...

[1] [https://github.com/jackdoe/baxx/blob/master/infra-and-pricin...](https://github.com/jackdoe/baxx/blob/master/infra-and-pricing.txt)

~~~
q3k
Are you sure this is Germany? The author has their location set to 'Amsterdam'
on GH, and has no impressum on their site.

FWIW, corporate taxes in Germany aren't 50%, but 15%. What is close to 50% is
personal income tax + health care if you pass a certain threshold. However,
even if you're running just a sole proprietorship (Freier Beruf or Gewerbe),
this would only be your personal income - all business expenses are not counted
towards your personal income tax base and you get the VAT you paid back.

~~~
biztos
I would guess it’s someone with a job, and this is hitting income tax.

But corporate taxes in DE are about 30%, not 15. Don’t forget your
Gewerbesteuer.

------
DINKDINK
Issuing invoices and paying directly in the terminal via Bitcoin's Lightning
network would be a pretty cool add-on.

Machine to Machine payments: Say you deploy a bitcoin miner to stranded gas.
It starts to mine coins and can start paying you to back up its data without
ever needing to create a Paypal account.

------
helper
Registration over ssh is cute but not safe from MITM attacks. Even if the ssh
key was published somewhere (which as far as I can tell it isn't) you would be
dependent on people manually adding the key to their known_hosts file which
you can't reasonably expect most people to bother with.

~~~
marci
You can't reasonably expect most people who would register to a service over
ssh and make backups with curl in a shell script to update their known_hosts
file?

The instructions on the main page would change from:

      ssh register@ui.baxx.dev

to something like

      curl https://ui.baxx.dev/ssh_keys -o /tmp/baxx_ssh_keys
      cat /tmp/baxx_ssh_keys
      cat /tmp/baxx_ssh_keys >> ~/.ssh/known_hosts
      rm /tmp/baxx_ssh_keys
      ssh register@ui.baxx.dev

or in one line:

      curl https://ssh_key.baxx.dev >> ~/.ssh/known_hosts && ssh register@ui.baxx.dev
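
The manual step in between (looking at the key before trusting it) usually means comparing fingerprints. Below is a minimal Python sketch that computes the same `SHA256:...` style fingerprint OpenSSH prints for a public key line. The key blob in the demo is made up and is not a baxx host key, and `matches_published` assumes a fingerprint has been published somewhere out of band, which per this thread is not currently the case.

```python
import base64
import hashlib

def sha256_fingerprint(public_key_line: str) -> str:
    """Fingerprint a 'keytype base64blob [comment]' public key line."""
    blob_b64 = public_key_line.split()[1]
    digest = hashlib.sha256(base64.b64decode(blob_b64)).digest()
    # OpenSSH prints the digest as unpadded base64 with a 'SHA256:' prefix.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")

def matches_published(public_key_line: str, published_fingerprint: str) -> bool:
    """Compare a downloaded key against a fingerprint obtained out of band."""
    return sha256_fingerprint(public_key_line) == published_fingerprint

if __name__ == "__main__":
    # Hypothetical key material, only here to exercise the function.
    fake_blob = base64.b64encode(b"not a real host key").decode("ascii")
    print(sha256_fingerprint("ssh-ed25519 " + fake_blob + " example"))
```

Fetching the key over HTTPS moves the trust to the TLS certificate of the site serving it, so the fingerprint comparison only adds something if the published value comes from a different channel.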

~~~
helper
The author of this service didn't think it important to provide the ssh key
and based on the comments in this thread people have already signed up for
this service without caring about the key. So yes, I think even people who
would use this service mostly can't be bothered to manually update their
known_hosts file.

~~~
marci
In that sense then I agree with you.

Edit: I thought you meant that, given the command, most would still not do it.

------
Krokku
Interesting concept to sell a shell service. I hope we will see this more in
the future.

------
weakwire
That is a website.

------
alexellisuk
I love the UX for this.. why haven't we seen more of going back to UNIX
principles? I'm also curious.. why and what ML do you want to add?

~~~
zulgan
* in the future it will use contextual bandits to probe for broken backups (such as: hey, is file XYZ with size X uploaded at Y weird?) and learn from all the customers triggering the AI flywheel, more backups more data, better contextual bandits, better service, more customers (the idea is to use bandits[or something else with exploration factor] for probability of "good" notification)

i think the problem needs exploration, i have been bitten by bad anomaly
detection and setting up alerts post-mortem after losing data one too many
times :)

i imagine creating a bunch of features such as:

      * file extension
      * time of upload
      * delta from previous version
      * size difference from other files in the directory
      * etc..

and having enough customers we should be able to do good-ish prediction when
a file version is "weird", so we could send notifications such as:

> hey is this file ok?
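
A rough Python sketch of how the features listed above could be computed for one file version. Everything here is an assumption made for illustration: the function name `file_features`, the field names, and the choice of "ratio to the directory's median size" as the size-difference feature. Nothing like this is implemented in baxx yet.

```python
import os
import statistics
from datetime import datetime, timezone

def file_features(path, size, uploaded_at, previous_size, sibling_sizes):
    """Build a feature dict for one uploaded file version.

    uploaded_at is a Unix timestamp; previous_size is None for a first upload;
    sibling_sizes are the sizes of other files in the same directory.
    """
    ext = os.path.splitext(path)[1].lower()
    hour = datetime.fromtimestamp(uploaded_at, tz=timezone.utc).hour
    median = statistics.median(sibling_sizes) if sibling_sizes else None
    return {
        "extension": ext,
        "upload_hour_utc": hour,
        "delta_from_previous": None if previous_size is None else size - previous_size,
        "ratio_to_dir_median": None if not median else size / median,
    }

if __name__ == "__main__":
    print(file_features("db/dump.sql.gz", 1000, 5 * 3600, 4000, [2000, 4000, 6000]))
```

A dump that suddenly shrinks to a quarter of its usual size shows up as a large negative delta and a low ratio, which is the kind of signal the notification would key on.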

with some exploration factor, even using UCB will(should) be better than
nothing

I plan to use vowpal wabbit's contextual bandits with only 2 actions, "send
notification"/"dont send notification" given all the context
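
To show what the exploration factor does, here is a minimal sketch of plain UCB1 over the two actions. It is deliberately not contextual and does not use Vowpal Wabbit; the class name, the action labels and the reward convention (1.0 when the user confirms a notification was useful, 0.0 otherwise) are assumptions made for the example.

```python
import math

class UCB1:
    """Plain UCB1: pick the action with the best mean reward plus an exploration bonus."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.counts = {a: 0 for a in self.actions}
        self.totals = {a: 0.0 for a in self.actions}

    def choose(self):
        # Try every action once before trusting any averages.
        for a in self.actions:
            if self.counts[a] == 0:
                return a
        t = sum(self.counts.values())

        def score(a):
            mean = self.totals[a] / self.counts[a]
            bonus = math.sqrt(2 * math.log(t) / self.counts[a])
            return mean + bonus

        return max(self.actions, key=score)

    def update(self, action, reward):
        self.counts[action] += 1
        self.totals[action] += reward

if __name__ == "__main__":
    bandit = UCB1(["send", "skip"])
    for reward in (1.0, 0.0, 1.0):       # made-up feedback, one per decision
        action = bandit.choose()
        bandit.update(action, reward)
        print(action, reward)
```

The bonus term shrinks as an action is tried more often, so a rarely used action still gets picked now and then. A contextual version would condition the choice on the file features instead of keeping one global average per action.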

this of course will be just extra on top of the manual alert rules, but
hopefully it will save some data :)

If the users agree we could also publish anonymized datasets with labeled data
(such as: given context alert was sent: was [good]/[bad])

Which will be awesome.

------
antoineMoPa
Interesting! How did you integrate PayPal payment in the shell without a
website?

~~~
zulgan
using paypal IPN and redirect to paypal subscribe page, literally as simple as
[https://github.com/jackdoe/baxx/blob/master/api/account_rout...](https://github.com/jackdoe/baxx/blob/master/api/account_route.go#L440)

------
nik1aa5
I use borgbackup to a NAS on the LAN. I then use rclone to copy the
repositories to a Google Drive that happens to be unlimited because of the
subscription of my previous university. :-) Luckily my account persisted when
I left the uni.

------
Tepix
What's a good way to encrypt filenames on the remote side?

~~~
zulgan
i use something like

      curl -T path/to/file https://baxx.dev/io/$BAXX_TOKEN/$(echo | encrypt -k ~/.pass | base32)

which uploads: /EAAAAAEQJUAYSU5PR3WO7IKOX2ZPM5NSNIALFYJE5MMDJYNMNUZEA===

------
johnklos
Go isn't as portable as it should be.

------
nzjrs
Where is this located (for GDPR purposes)?

~~~
wongarsu
Deducing from [1]: Currently somewhere on Digital Ocean infrastructure. Long
term on Hetzner servers (apparently SX62 [2]), so at that stage it's either
Germany or Finland.

1: [https://github.com/jackdoe/baxx/blob/master/infra-and-pricin...](https://github.com/jackdoe/baxx/blob/master/infra-and-pricing.txt)

2: [https://www.hetzner.com/dedicated-rootserver/matrix-sx](https://www.hetzner.com/dedicated-rootserver/matrix-sx)

------
edoo
For my home directory/active data I use 'rdiff-backup' to keep 7 days of
hourly snapshots. If I screw anything up I never lose more than an hour. My
bulk data gets a snapshot every 3 days due to the size/time. All that data is
already on raid1 but the extra filesystem level backup will protect from any
low level disk hosings. I mirror that backup drive weekly and keep an extra
copy off site. If I'm super paranoid about losing something I'll also save it
with 'duplicity' to S3, although I also like remote mounting an AWS volume and
using it with 'EncFS'.

If I'm going anywhere with my laptop and want access to my data faster than
internet speeds I'll preload everything with 'unison' onto the laptop and
temporarily have my own Dropbox clone while mobile.

Using those open source tools lets me do basically the same thing as these
services while keeping full control of my data.

------
mistrial9
roach motel ?

------
stephenr
Calling it Unix friendly but not supporting ssh (and thus sftp/rsync/etc)
seems like quite a weird choice, and one that’s lost you a potential customer.

~~~
zulgan
thanks for the feedback, having `scp file $BAXX_TOKEN@scp.baxx.dev:file`
support is definitely on my list

