
Show HN: Nextjournal – seamless data science for teams - kvlr
https://nextjournal.com
======
kvlr
I’m one of the founders of Nextjournal and I’m really excited that after
almost three years in Private Beta we’re finally opening signups to everyone
today!

Nextjournal is a computational notebook platform and our goal is to make
computation more accessible and automatically reproducible, so it becomes
easier to collaborate and build on top of each other's work.

If you'd like to know more, check out our launch blog post at
[https://nextjournal.com/mk/public-beta](https://nextjournal.com/mk/public-beta)
or sign up and give it a try!

~~~
wuschel
Hey,

just had a very brief look out of sheer curiosity, so please take this quick
feedback with a grain of salt: The running times of your Python notebook get
longer and longer with each _print()_ statement and cell. While the
reproducibility of your Python notebook is a wonderful thing to have, I think
the performance decrease is a very strong downside.

Cheers -

~~~
kvlr
Hey, this sounds like something we'd want to look at. I don't think it's
inherent to reproducibility. I think we should be truncating the output in
this case. Please send us the notebook via the Help button that shows up when
you produce an error in a cell. I'll take a look and we'll figure this out.
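
To illustrate what truncating output could look like, here is a minimal Python sketch. It is not Nextjournal's actual code: the function name `truncate_output` and the `max_lines` limit are assumptions made up for this example. It keeps only the tail of a cell's output and notes how many lines were dropped.

```python
def truncate_output(text: str, max_lines: int = 1000) -> str:
    """Keep only the last `max_lines` lines of a cell's output.

    Hypothetical helper for illustration; the limit is an arbitrary assumption.
    """
    lines = text.splitlines()
    if len(lines) <= max_lines:
        # Short output is returned unchanged.
        return text
    omitted = len(lines) - max_lines
    kept = lines[-max_lines:]
    # Prepend a notice so the reader knows output was dropped.
    return f"[... {omitted} earlier lines omitted ...]\n" + "\n".join(kept)


if __name__ == "__main__":
    noisy = "\n".join(f"step {i}" for i in range(10))
    print(truncate_output(noisy, max_lines=3))
```

Capping what gets stored and rendered per cell keeps the cost of each additional `print()` bounded, whatever the real cause of the slowdown turns out to be.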

~~~
wuschel
Done.

------
kfk
So I do BI/analytics at a big company with a team of 6 people; here is my
take. We need something like this aimed at business analysts with little to no
coding experience, and we need it to be priced in the $100-300 per year range,
not more. Such a tool would compete with the MS Office package and would be
great. Most of the stuff is available in various open source packages; it would
be about putting it all together in one easy desktop install and adding a nice
GUI on top of various functions (like ipywidgets but more high level).
For instance, we could totally add a basic GUI on top of Altair to do some
basic charting. That basic charting is 80% of business needs when it comes to
exploratory analysis.

~~~
akrymski
Isn't that tableau.com?

~~~
kfk
How is that tableau.com? Tableau only does visualization, and a very small
subset of it at that. Business analytics is a lot more than visually appealing
dashboards.

------
MartinMond
I've been researching what data science tools to use at my company.

How is Nextjournal different from Jupyter or Google Colaboratory?

~~~
kvlr
While Colaboratory is built on top of Jupyter, Nextjournal is not.

We do support importing Jupyter notebooks and running Jupyter kernels, and we
also have our own runtime protocol.

In Jupyter (and hence in Colaboratory) you normally have one runtime that's
running both your server code as well as the user code. In Nextjournal there's
a separate application called the Runner that's orchestrating the runtimes
which currently are docker images.

This allows us to use Nextjournal notebooks to do any kind of installations
without the need for a full Jupyter kernel inside the image, something that
gets tricky in Jupyter. Once we have a bash shell inside the image, we can do
installations.

You can choose to commit the filesystem state at any time as a docker image
and reuse it in other notebooks. This is actually how our default environment
images are built: our default Python environment
[https://nextjournal.com/nextjournal/python-environment](https://nextjournal.com/nextjournal/python-environment)
is built on top of the minimal bash environment
[https://nextjournal.com/nextjournal/bash-environment](https://nextjournal.com/nextjournal/bash-environment)
which just imports a stock Ubuntu image.

Our system takes care of only referencing the image SHAs everywhere, so
everything is immutable and you can't accidentally overwrite anything.

You can also pull those docker images and use them locally.

Any data you upload or results you save (just write to a /results folder) is
put into content-addressed storage, so same thing here, you'll never
accidentally overwrite a file.
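
As a rough illustration of why content-addressed storage prevents accidental overwrites, here is a minimal in-memory sketch in Python. It is not Nextjournal's implementation: the `ContentStore` class and the choice of SHA-256 are assumptions for this example. The key of each blob is the hash of its bytes, so different content always lands under a different key.

```python
import hashlib


class ContentStore:
    """Hypothetical in-memory content-addressed store (illustration only)."""

    def __init__(self) -> None:
        self._blobs = {}  # maps hex digest -> bytes

    def put(self, data: bytes) -> str:
        # The key is derived from the content itself (SHA-256 is an assumption).
        key = hashlib.sha256(data).hexdigest()
        # Storing identical content twice is a no-op; nothing is replaced.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


if __name__ == "__main__":
    store = ContentStore()
    first = store.put(b"results, version 1")
    second = store.put(b"results, version 2")
    # Both versions remain retrievable under their own keys.
    print(first != second, store.get(first))
```

Because a "new version" of a file is simply new content with a new key, references to the old key keep resolving to the old bytes.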

Lastly the document is stored in the database (Datomic) and you can restore
any previous state.

Leveraging immutability at all layers of the stack is what enables our "remix"
feature: the ability to quickly and cheaply clone any published notebook
and continue where another person left off.
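
To sketch why immutability makes cloning cheap, here is a toy Python model. It says nothing about how Nextjournal or Datomic actually store documents: `Version`, `edit`, `remix`, and `history` are all hypothetical names for this example. Each version is frozen and points to its parent, so a clone only needs to share a reference and later edits never disturb the original.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    """One immutable notebook state (hypothetical model)."""
    content: str
    parent: Optional["Version"] = None


def edit(head: Version, new_content: str) -> Version:
    # An edit creates a new version; the old one is left untouched.
    return Version(new_content, parent=head)


def remix(head: Version) -> Version:
    # Cloning copies nothing: the clone starts from the same immutable version.
    return head


def history(head: Optional[Version]) -> List[str]:
    # Walk parent links from newest to oldest.
    out = []
    node = head
    while node is not None:
        out.append(node.content)
        node = node.parent
    return out


if __name__ == "__main__":
    original = edit(Version("draft"), "published")
    fork = edit(remix(original), "my changes")
    print(history(original))
    print(history(fork))
```

The fork shares the whole earlier history with the original, and any previous state can still be reached by following the parent links.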

~~~
joshe
Put this on your site. I know that
[https://nextjournal.com/features](https://nextjournal.com/features) seems
more like marketing copy, but this is much more compelling.

Just saying "much more", or "fully" doesn't help much. Try removing all the
adjectives from your marketing copy to see if it's actually communicating
anything. (Then edit, then add some back :-)). Also, most of the features on
this page are things you get with Jupyter or Colab. Address what is actually
different, like you do here.

------
r3tex
Nextjournal is really how notebooks were meant to be used - for sharing one's
code, its output, and all the reasoning in-between with great looking
presentation. I'm very happy that my articles turned out so good looking on
the platform.

~~~
kvlr
For reference:
[https://nextjournal.com/r3tex/loss-landscape](https://nextjournal.com/r3tex/loss-landscape)
is the article he's talking about.

------
sandGorgon
How does this compare to Google Colab or Azure ML notebooks for Python only?
(I know that Nextjournal supports many more languages.)

Especially pricing per resource (it's not clear from the website).

~~~
kvlr
Our standard instances have 3.75 GB of RAM, and we keep a pool of three idle
ones around. With the free account you can currently use larger instances
of up to 16 GB of RAM and 1 Nvidia K80 GPU.

If you sign up for the paid plan, which is $99 per researcher per month, you can
provision more powerful machines – basically anything that Google Cloud
offers.

We currently don't enforce any storage limits.

This is our first iteration of pricing though so I'm pretty sure this will
still change over time. We've gotten a lot of feedback from people asking for
a cheaper plan.

What most people don't realise, however, is that you can use most of the
features (including private drafts) as it stands now for free. We've also been
debating whether we should allow private drafts on the free plan or take a
stance on what open science really means (working in the open from the start),
but decided against this for now.

Curious to hear what others think about this. Do you expect drafts to be
private and would it be a violation of those expectations if they were not?

------
reacharavindh
I wish I could leverage such polished interfaces for my research group. But
we have a lot of contracts that bind us to keep our research data in house. We
cannot simply "run something in the cloud".

So, Jupyterhub and manual tinkering to get such polish for now.

~~~
kvlr
While I can't say anything definitive or give a timeline, we do want to support
research groups like yours. Ideally we'd be able to have our paid offering for
companies using Nextjournal in private subsidize our open science/source
offering.

We also definitely want to open source parts of our product but we haven't
figured out what parts (or everything) and under what license.

Our priority is currently on providing a useful hosted product and becoming
sustainable. It's certainly also interesting to see how e.g. Metabase is doing
it the other way around, open source first without a hosted product. But I
guess I'm a bit scared of not being ready to develop Nextjournal in the
open at this point, in terms of bandwidth and keeping things backwards
compatible.

~~~
reacharavindh
Completely fair. Your hard work and your product - you should turn it into a
sustainable business as you see fit. I was only casually commenting to share
my personal thought that it would be a great time saver for me as a sysadmin
to "just" use something like your product instead of spending hours tinkering
with Jupyterhub to get it to work well. I'm not entitled to anything.

From my past experiences there are a lot of enterprises that are rightfully
scared to let their employees use such a service and open up data regardless
of what promises a SaaS company makes. There is a general assumption that if the
SaaS company fucks up, all we get is a "we take our security very
seriously...." blog post.

So, making your product work within a corporate network without "call home" is
a great advantage and immediately expands your target audience with some
potentially big pockets.

------
refset
This is really neat - great work! It took me less than 10 minutes to figure out
how to copy a Crux tutorial into Nextjournal using the Clojure template:
[https://nextjournal.com/crux/a-bitemporal-tale](https://nextjournal.com/crux/a-bitemporal-tale)

The only issue I encountered was that adding comments after the final close
parens in the code sections creates EOF errors.

~~~
kvlr
Awesome, happy to see that. Been wanting to play with Crux anyway. Nextjournal
runs on Clojure and Datomic and we use some of pack.alpha and aero from juxt,
so thanks!

------
ZeroCool2u
Having just gone through an evaluation for platforms just like Nextjournal,
there are a lot of companies that make similar claims, but very few that
deliver in reality.

In the end, the only one we found that delivered on the promises of
reproducibility and managing the entire data science life cycle end to end,
facilitating collaboration, and getting stuff done was Domino Datalab[1].

Can you compare and contrast Nextjournal to DD? Better yet, do you feel you're
competing in the same areas or are you really more focused just on
reproducibility? Even if you're not now, it feels like all these
types of products converge to this state eventually, just by nature of
the sales process and promising more and more features to customers.

Regardless, it looks really solid, so best of luck!

[1]: [https://www.dominodatalab.com/](https://www.dominodatalab.com/)

~~~
kvlr
I haven't tried Domino Datalab myself so take this with a grain of salt.

While data science is an obvious use case of literate programming, it's not
the only one. I see the fundamental problem that needs to be addressed as one
of dependency management. We address this today using Docker. In the future we
plan to use a more functional approach, most likely based on Nix or Guix. This
more principled approach should address both reproducibility and usability (by
allowing images to be composed and providing much better install times thanks
to binary caching).

I haven't really used Domino Datalab, so I'm not sure if they allow for the
installation of arbitrary system libraries and packages like we do. Check out
some of our machine learning samples which run on GPUs:
[https://nextjournal.com/collection/machine-learning](https://nextjournal.com/collection/machine-learning)

In the future we also plan to allow in-browser JavaScript execution. This is
currently hidden behind a feature flag, but we already have an article that
uses it:
[https://nextjournal.com/dubroy/ohm-parsing-made-easy](https://nextjournal.com/dubroy/ohm-parsing-made-easy)

------
cw
really excited to see you guys launch!

