

Show HN: Domino, a PaaS for data science - DominoDataLab
http://www.dominodatalab.com

======
stan_rogers
Not, perhaps, a little concerned with the naming conflict with IBM (formerly
Lotus) Domino? At version 9, it's not dead yet (despite rumours to the
contrary going back to R4.5).

------
earino
I have used the Domino platform a number of times when I needed access to
high-end hardware to build some large-scale models (in R). They have cool
technology and are pretty reasonably priced.

------
kelv
I'm interested in the "Version control for data" feature. How does that work?
I have data that is still changing in both content and structure. As the test
suite grows, it is becoming a burden to maintain every time there is a change
to the data. It would be great to be able to write tests against snapshots of
the database, and be able to rollback to a different point for each test
seamlessly.

~~~
siganakis
From my reading of the getting started guide, it looks like it treats your
"working" directory as a git repository, with each run basically doing a
commit & push.

This means that your data (in files) needs to be in the working directory, and
is versioned alongside your code. Sounds pretty cool, but I am not sure how
it would scale for large / constantly changing data sets.

~~~
DominoDataLab
siganakis's explanation is correct, with a minor caveat [0]. We do some clever
things (diffing, compressing), but if large amounts of data are always
changing, the network transfer will take time. We have folks comfortably using
Domino with about 50GB of data in their project directories. And it's only the
amount of data in the current revision -- not cumulative across all revisions
-- that matters. Anyway, if you have a use case with more than ~50GB, let us
know, we're eager to engineer a more advanced solution -- just haven't had the
need yet ;)

If your code pulls data from a database (e.g., kelv's test cases), one option
is to save the DB snapshot out to a file when the code runs. After we finish
executing your code, we snapshot all the new/changed files in your working
directory (we call those the "results" of the run). Using this approach, you'd
have a record of the DB snapshot for each run of your tests. But, again... if
your snapshot is more than tens of GB, the network transfer time could get
annoying.
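
A minimal sketch of the "save the DB snapshot out to a file" idea, assuming
the database is SQLite and the tests are in Python. The function names and the
snapshots/db.sql path are made up for illustration and are not part of Domino;
other databases would use their own dump tools instead.

```python
import sqlite3
from pathlib import Path


def dump_snapshot(conn: sqlite3.Connection, out_path: Path) -> Path:
    """Write the schema and rows of `conn` to a SQL text file.

    The file lands wherever `out_path` points; putting it inside the
    working directory is what would make it part of a run's results.
    """
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as fh:
        # iterdump() yields the SQL statements needed to rebuild the database.
        for statement in conn.iterdump():
            fh.write(statement + "\n")
    return out_path


def load_snapshot(snapshot_path: Path) -> sqlite3.Connection:
    """Rebuild an in-memory database from a file written by dump_snapshot."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(snapshot_path.read_text(encoding="utf-8"))
    return conn


if __name__ == "__main__":
    # Hypothetical stand-in for the real database the tests read from.
    live = sqlite3.connect(":memory:")
    live.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    live.execute("INSERT INTO users (name) VALUES ('kelv')")
    live.commit()

    path = dump_snapshot(live, Path("snapshots") / "db.sql")
    restored = load_snapshot(path)
    print(restored.execute("SELECT name FROM users").fetchall())
```

Each test can then call load_snapshot on whichever snapshot file it wants,
so tests no longer depend on the current state of the live database.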

[0] We treat the working directory as a git repo, but since git breaks down
with large files, we only use it to track your directory structure; we store
actual file contents elsewhere.
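
To make the split in [0] concrete, here is a rough sketch of one way to track
directory structure separately from file contents. It is not Domino's actual
implementation: the manifest format, the SHA-256 keys, and the function names
are all assumptions for illustration. The small manifest is the sort of thing
git handles well; the large blobs live in a separate store.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never fully in memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot_tree(work_dir: Path, blob_dir: Path) -> dict:
    """Store file contents in blob_dir and return {relative path: digest}.

    blob_dir must live outside work_dir, otherwise the blobs themselves
    would be picked up as project files on the next snapshot.
    """
    blob_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for path in sorted(work_dir.rglob("*")):
        if not path.is_file():
            continue
        digest = file_digest(path)
        blob = blob_dir / digest
        if not blob.exists():  # unchanged or duplicate content is stored once
            shutil.copyfile(path, blob)
        manifest[path.relative_to(work_dir).as_posix()] = digest
    return manifest


def write_manifest(manifest: dict, manifest_path: Path) -> None:
    """Write the manifest as stable, diff-friendly JSON."""
    text = json.dumps(manifest, indent=2, sort_keys=True) + "\n"
    manifest_path.write_text(text, encoding="utf-8")
```

In a scheme like this, only files whose contents changed since the previous
snapshot produce new blobs, so the cost of a revision follows the amount of
changed data rather than the total history.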

------
alexatkeplar
In the case study,
http://www.dominoup.com/buzz.html, you
have an orphaned sentence:

    Second, it centralizes the storage of project data, enabling.

And the next sentence should be "And third", not "And second":

    And second, Domino automatically

~~~
DominoDataLab
Thanks, fixed now.

------
dbpokorny
C, you need C. Otherwise it's just a toy.

~~~
DominoDataLab
We let users run arbitrary shell scripts, so you can use that to run C code --
either by compiling binaries for x86-64 and including them in your project, or
by running a shell script that first compiles your code, and then executes the
resulting binaries. It hasn't been a common use case, but we definitely have
some folks doing that.

~~~
dbpokorny
Thanks for the quick reply! Would you be willing to provide some more details
about your service? Just so I understand: I can just upload my C code and you
will run it...can you tell me how to do that? As in, prove that this is
possible by at least mentioning this somewhere in the documentation? Or
perhaps giving me a link to the appropriate reference material at
dominoup.com? The website is very slick.

~~~
DominoDataLab
Hi there. I've made a sample public project you can check out to see an
example of running C code [0]. I just wrote a shell script that compiles my C
code and then executes the resulting binary. Then I tell Domino to run my
shell script. Voila. Let me know if you have any questions.
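
For anyone who would rather not open the project, the same two steps (compile,
then execute the resulting binary) can be sketched as a small driver. This is
written in Python rather than as the shell script in the linked project, and
it is not that project's code: the C program, the function name, and the
assumption that a compiler called cc is on the PATH are all illustrative.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical stand-in for the user's C code.
C_SOURCE = r'''
#include <stdio.h>

int main(void) {
    printf("hello from C\n");
    return 0;
}
'''


def compile_and_run(source: str, compiler: str = "cc") -> str:
    """Compile `source` with `compiler`, run the binary, return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "main.c"
        exe = Path(tmp) / "main"
        src.write_text(source, encoding="utf-8")
        # Step 1: compile. check=True raises if the compiler reports an error.
        subprocess.run([compiler, str(src), "-o", str(exe)], check=True)
        # Step 2: execute the resulting binary and capture what it prints.
        result = subprocess.run(
            [str(exe)], check=True, capture_output=True, text=True
        )
        return result.stdout


if __name__ == "__main__":
    if shutil.which("cc") is None:
        print("no C compiler named 'cc' found on PATH")
    else:
        print(compile_and_run(C_SOURCE), end="")
```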

We allude to this in our FAQ ("What if I want to run C++ code...") but we'll
update the docs to make the specific instructions clearer.

[0] https://app.dominoup.com/nelprin_at_mac/c-example

~~~
dbpokorny
c++ LOL! I love it. Thank you so much. I don't mean to hate on C++ too much.

much love!

