
Show HN: Joule, a Git-like workflow for your Datasets - Allezxandre
https://joule.host
======
DamonHD
I don't begin to understand what you're offering. Pretty graphics, nice
checkboxes. But what is your service?

(Yes, I have data sets... Yes I do AI stuff...)

~~~
Allezxandre
The service is basically dataset hosting. Today, hosting code is quite easy:
with Git, for instance, you can have your code up for sharing on GitHub in a
few seconds. However, as soon as you need to make a large dataset available,
Git alone is no longer practical. So instead, I’ve seen people use Google
Drive to host the datasets, and then write scripts to make downloading and
stitching the Zip files easier.
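
For illustration, here is a minimal sketch of the stitching half of such a
script. It assumes the archive was split into parts that have already been
downloaded into one folder; the part naming (dataset.zip.part00, ...) and the
function names are hypothetical, and the download step is left out entirely.

```python
import shutil
import zipfile
from pathlib import Path


def stitch_parts(parts_dir, pattern, output_zip):
    """Concatenate split archive parts, in sorted name order, into one file."""
    parts = sorted(Path(parts_dir).glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no parts matching {pattern!r} in {parts_dir}")
    with open(output_zip, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream each part so large files never sit fully in memory.
                shutil.copyfileobj(src, out)
    return parts


def extract_dataset(zip_path, target_dir):
    """Unpack the stitched archive and return the names of its members."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Hypothetical usage, assuming parts named dataset.zip.part00, part01, ...:
#   stitch_parts("downloads", "dataset.zip.part*", "dataset.zip")
#   extract_dataset("dataset.zip", "data")
```

Every project ends up carrying its own variant of this, which is the
boilerplate a single download command would replace.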

I built this service to solve that: instead of hosting the datasets yourself,
you just push them using Joule, and the users that download your code will
only need to run a single command to download the dataset.

EDIT: here’s a sample Git repository with a dataset hosted on Joule:
[https://gitlab.com/joule-host/that-state-of-the-art-
research...](https://gitlab.com/joule-host/that-state-of-the-art-research-
paper)

~~~
phanaton
What advantages over Git LFS does it offer?

~~~
Allezxandre
Git LFS is limited by your hosting solution. If you use GitLab, for instance,
your repository cannot exceed 10 GB of storage space (including code and other
resources). And on GitHub I believe that limit is 1 GB.

Many deep learning datasets that deal with pixel data grow past the 10 GB
mark.

