Jupyter, Mathematica, and the Future of the Research Paper

dahart · on April 15, 2018

> The tie-breaker is social, not technical.

The tie-breaker is financial. Jupyter is winning because it's free, not because it's social. It becomes social because of widespread adoption, and it gets widespread adoption because it's free.

I love Jupyter, love love love. But there's a lot of hyperbole and opinion here. Mathematica is just a for-profit business, it's that simple. And it wouldn't be fair to deny the example that Mathematica, Maple & Matlab have set for free solutions like Jupyter.

There's nothing dishonest about a for-profit business. And if Mathematica wants to keep PDF export for themselves, so be it, that's their right. What's dishonest is expecting technical software and service for free and calling people names like vandals if you don't get it. Just celebrate Jupyter and enjoy that people are doing great work you get to use without paying. I don't love Mathematica or its founder, but there's no real need to impugn Mathematica in order to make this point.

carreau · on April 15, 2018

> it's free

Well for you maybe – and we strongly believe it should be – but it's built on top of thousands of volunteer hours, grant money (Thanks Sloan, Helmsley, Moore), and donations from companies (Anaconda, Microsoft...), individuals, and partners. NumFOCUS (https://www.numfocus.org/) manages all of that; it's a 501(c)(3), so donations are tax deductible! If Jupyter is of help to you (or your company or organisation), think about contributing back (Code, Dev Time, Design, UX, translation, Legal, ...)

Much Love from the Jupyter Team.

ILikeConemowk · on April 15, 2018

Fernando, the creator of IPython, which ultimately morphed into Jupyter, is missing from your list. That dude is an inspiration.

carreau · on April 15, 2018

Sorry, Fernando as well as all the contributors are included in the "we, the Jupyter team". I did not start to mention individuals otherwise the list would be really long. I would encourage people to look at (at minimum) the steering council on Jupyter.org for a list of key people in the project.

We are also striving to make contributing to Jupyter / open source recognized in Academia - for people like Fernando to get proper recognition, and future generations of scientists to have incentives to do this kind of work.

ChrisRackauckas · on April 15, 2018

>We are also striving to make contributing to Jupyter / open source recognized in Academia - for people like Fernando to get proper recognition, and future generations of scientists to have incentives to do this kind of work.

And thank you for this! Do you have a post about what efforts are going on? I myself am trying to find out how to navigate academic spaces while spending significant time doing open source development. I am curious how you're pushing for changed incentives to help people in this kind of position out.

sonofaragorn · on April 15, 2018

Thank you for mentioning him, I didn't know of him. For anyone interested here is an interesting talk he gave recently: https://www.youtube.com/watch?v=xuNj5paMuow

dahart · on April 15, 2018

Thank you for all your hard work!!

Understanding what it took for Jupyter to be where it is, and how much work it took, who pushed for it, and how many people had to work to make it happen, this is important.

You're absolutely right to emphasize that it's not free to build. I worry that promoting a mentality like the author's that open source and free to use is honest, and for-profit is "dishonest" dramatically undervalues and undermines your work as well as the funding model. If we treat Jupyter like it's our right, and not a miracle of funding and hard work, we will make it more difficult for Jupyter to attract help and to succeed.

nerdponx · on April 15, 2018

I can only imagine how many billions of dollars in company value have been generated using the product of their uncompensated labor.

Same goes for the rest of the data stack: Python itself, Numpy, Scipy, Scikit-Learn, Pandas, etc.

carreau · on April 15, 2018

Fun fact: as far as the Jupyter team can tell, there are more full-time devs on Jupyter than on CPython. Also pretty excited that NumPy recently got funding for 2 full-time devs who will work at UC BIDS!

recharged96 · on April 15, 2018

It's not free. Grant money means someone's paid for it (us, taxes). Volunteer is about credit, not fiat, but still costs something (gpl restrictions).

Jupyter is great; it's revolutionized online education, for example. But it's not as full-featured as Mathematica and, like most FOSS projects, slower to fix bugs, aka you get what you pay for. Both have a place unless you have hired, 100% dedicated help on Jupyter. This is much like public radio vs corporate. Together they serve the market's needs.

carreau · on April 15, 2018

Grants usually differ from US taxes. And AFAICT, Jupyter received no money from government sources.

Jupyter is not GPL but MIT, and volunteering is rarely about credit.

infinite8s · on April 16, 2018

I'm pretty sure some of Fernando's early work on IPython was funded through his NSF grants.

rotorblade · on April 15, 2018

>> The tie-breaker is social, not technical.

> The tie-breaker is financial. Jupyter is winning because it's free [...]

It is a bit more nuanced than that. Personally I do not pay for Mathematica usage, so why do I like Jupyter more?

The Mathematica notebook interface is horrible. You may go "oh, neat" the first few times you try it, but then you (at least I) get more and more frustrated by all the idiotic issues:

* Indentation/text-wrapping. Write a long line that starts wrapping and it gets a little indentation to signify this, and your next line that you have indented is slightly more indented, but it is really hard to see, so you have no idea where your line breaks are.

* Brackets. "[" is used both for function calls and for part specification. The number of square brackets in your expression makes it unnecessarily hard to read when it is big enough.

* Jumping text. The notebook interface does not have "auto-complete brackets" (maybe v11 does), so when you add your first bracket, all the text in the Cell gets reformatted and you have to find the fucking place where you wanted to end the bracket. This is akin to working with images in MS Word.

* Exporting. The notation is just ugly, fine, that is personal, but "Sin[]" as "sin()"... ok. Ah, good, it has a "Copy as LaTeX", nice... "Sin[]" -> "\text{Sin}[]". Really? Who in the world uses "\text{Sin}" for the sine function and square brackets for function calls when typesetting maths in LaTeX?

It is just a complete nightmare to try and incorporate these things into your workflow, at least for me.

Jupyter just behaves as you'd expect. Just the fact that it is so much smoother to work with wins. I do symbolic calculations, and SymPy can do some things much more easily than Mathematica, but there are a lot of things it can't do, or that take some more work to get going. Jupyter lets you not have an aneurysm every day at work, which makes you actually wanna spend the extra time working it out.

improbable22 · on April 15, 2018

These are matters of taste, I use both & find Mathematica's notebooks much better than Jupyter. Editing text in a web browser is just painfully clumsy, and plain text is not great for reading mathematical expressions of any length. But obviously it depends what you're doing with it.

You complain about Sin[] but not about ugly things like np.sin()? Also I think you're looking for

    Sin[θ] //TraditionalForm //TeXForm

pletnes · on April 15, 2018

Using \text{} instead of \sin is, arguably, plain incorrect. Or at least at odds with the tex philosophy of separation between content and formatting.

rotorblade · on April 15, 2018

Yeah, I guess that is fair. It might just be that it is more or less the furthest way from how I like to work as I can imagine.

> Sin[θ] //TraditionalForm //TeXForm

Did not know of this. Thanks!

henrikeh · on April 15, 2018

TeXForm automatically applies TraditionalForm unless another *Form is applied.

So TeXForm[Sin[x]] gives "\sin (x)"

improbable22 · on April 15, 2018

Indeed, sorry! It even says so in the help.

And while I swear I've had rotorblade's problem before, right now "Copy as LaTeX" seems to do this too.

rexpress · on April 15, 2018

Regarding the double brackets for representing Part, you can instead use 〚 and 〛, which can be input with `ESC`-[-[-`ESC` and `ESC`-]-]-`ESC` respectively. This makes the code prettier and easier to read.

There is a very easy to implement modification you can make to allow the keyboard shortcuts `CMD`-[ and `CMD`-] for the double bracket symbol, which is described here:

http://szhorvat.net/pelican/pages/mathematica.html

I rather wish Wolfram would provide these keyboard shortcuts in the base software

promer · on April 15, 2018

This is a good summary of the problems I have with the Mathematica notebook interface. And just as with the problems I described with the typography on PDF output, when you encounter something like LaTeX output that reads "\text{Sin}[]" you have to wonder, is this bad by mistake or bad on purpose?

gaius · on April 15, 2018

> The tie-breaker is financial. Jupyter is winning because it's free

This is it really

Back in the 90s I was using a program called MathCAD; it provided a “notebook” interface by running as a plugin to Word 6. In terms of general usability and experience, the MathCAD of 20 years ago blows away modern-day Jupyter and its silly “cells” interface, which it has not because it’s better but because it’s trying to force itself into a web browser. I haven’t used MathCAD since but I bet in 2018 it’s amazing.

I think few people who have used the commercial tools think Jupyter is better. But the commercial tools are soooooooo expensive...

fperez_org · on April 15, 2018

FWIW, I used MathCAD extensively around that period while in grad school (I taught physics lab courses that were 100% structured around MathCAD workflows). I hated it, and Jupyter is explicitly informed by that experience. So it's not like MathCAD very much by design, not by lack of knowledge.

We acknowledge there's a lot to improve in Jupyter, and some discussions in this post make excellent points (many of which we'd like to make progress on in the future). But the Jupyter team did probably use most/all of the modern scientific computing platforms, MathCAD included (and Maple, Mathematica, IDL, Matlab, Gnuplot, ...) at some point in our careers. We typically make our choices with reasonably good knowledge of the landscape.

We make mistakes, or our tradeoffs may be different than the optimal ones for your use case. But lack of knowledge of these tools is rarely the reason :)

sanderjd · on April 15, 2018

I used MathCAD in physics and chemistry classes in college. I frequently wonder what happened to it, because I share your perception that while Jupyter is very nifty, it is hamstrung by the limitations of the browser environment.

I'm waiting impatiently for the coming revolution of non-web collaborative internet-connected rich client applications.

tincholio · on April 15, 2018

As a die-hard Emacs user, I think that org-mode with its org-babel capabilities blows Jupyter out of the water, and it produces much better, readable output. You can use it with pretty much any language, and combine several languages in a single document without any issues. Besides all this, it's just a plain text format, and you can extract all the code into proper source files for later offline use, too ('tangling', in literate programming parlance). Jupyter uses a rather obtuse json format which is not practical to work on directly.

Of course, it's not web-based and it requires some basic knowledge of Emacs, but functionality-wise it is so much better that it ends up being frustrating to use Jupyter when collaborating with other people. There are some workarounds (EIN and ob-ipython modes help, but it's not quite the same).

dahart · on April 15, 2018

> org-babel capabilities blows Jupyter out of the water ... Besides all this, it's just a plain text format

This feels like the argument against Slack in favor of IRC. I don't doubt it's true for you that org-babel is better, but there isn't really even that much comparison between Jupyter and org-babel, they are very different things.

One of the more compelling reasons to use Jupyter is inline images and plots. Another one of the more compelling reasons to use it is it basically comes bundled with everything you need: python with its own virtual environment that doesn't modify your system python, and a bunch of amazing libraries, numpy, scipy, matplotlib. Not needing to know emacs helps too, but that's really low on the list.

> Jupyter uses a rather obtuse json format which is not practical to work on directly.

This is a weird thing to say. Jupyter has download as .py right in the file menu. Not to mention download as Markdown, HTML, LaTeX or PDF, if you want to present or share instead of export the python.
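For what it's worth, the .ipynb file is ordinary JSON, so pulling the code out by hand is only a few lines as well. A rough sketch, assuming the nbformat 4 layout (a top-level `cells` list whose entries carry `cell_type` and `source`); the helper name `code_from_notebook` and the sample notebook are made up for illustration, and this is not how Jupyter's own export is implemented:

```python
import json


def code_from_notebook(nb_json: str) -> str:
    """Concatenate the source of every code cell in an nbformat-4 notebook."""
    nb = json.loads(nb_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # nbformat allows source to be one string or a list of line strings.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks)


# Hypothetical two-cell notebook, just enough structure for the sketch.
SAMPLE = json.dumps({
    "nbformat": 4, "nbformat_minor": 2, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "outputs": [], "source": ["x = 1\n", "print(x)"]},
    ],
})

if __name__ == "__main__":
    print(code_from_notebook(SAMPLE))
```

The markdown cell is dropped and only the code cell's two lines come out.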

tincholio · on April 15, 2018

> One of the more compelling reasons to use Jupyter is inline images and plots

You can do those in org, since forever. I meant that the files themselves are plain text, which is good for sharing, version control, etc. ipynb files embed images with Base64 encoding, which makes seeing diffs a major PITA.
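A common workaround on the Jupyter side is to clear outputs before committing, which is roughly what tools like nbstripout automate. A minimal sketch using only the standard library, assuming the nbformat 4 layout (`cells`, `outputs`, `execution_count`); the function name `strip_outputs` and the sample notebook are hypothetical:

```python
import json


def strip_outputs(nb_json: str) -> str:
    """Return the notebook JSON with code-cell outputs and counters cleared."""
    nb = json.loads(nb_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drops embedded Base64 images too
            cell["execution_count"] = None  # counters change on every re-run
    # Stable key order and indentation keep line-based diffs small.
    return json.dumps(nb, indent=1, sort_keys=True)


# Hypothetical notebook with one plot output; the Base64 payload is a placeholder.
SAMPLE = json.dumps({
    "nbformat": 4, "nbformat_minor": 2, "metadata": {},
    "cells": [{
        "cell_type": "code", "metadata": {}, "execution_count": 7,
        "source": ["plot(data)"],
        "outputs": [{
            "output_type": "display_data", "metadata": {},
            "data": {"image/png": "iVBORw0KGgoAAAANSUhEUg..."},
        }],
    }],
})

if __name__ == "__main__":
    print(strip_outputs(SAMPLE))
```

The trade-off is that the committed file no longer shows the plots, so this suits version control more than sharing results.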

>This is a weird thing to say. Jupyter has download as .py right in the file menu. Not to mention download as Markdown, HTML, LaTeX or PDF, if you want to present or share instead of export the python.

You can export org to those formats and more, but the main part is that you can organize your code (e.g, tangling to different files, etc.), and use other languages than python (or whatever other kernel you're running in Jupyter).

Plus, you get to use a proper text editor instead of editing on a browser textarea.

dahart · on April 15, 2018

I agree with all those points. I've used org mode for a few things, but not for python notebooks. I remember seeing a great video a few years back on using python in org mode to make a research notebook. It really made me want to try org mode for literate / notebook style programming. I think it was this one: https://youtu.be/1-dUkyn_fZA

It sounds like org mode does not store images inline, since you don't like the b64 images in Jupyter? I imagine that could go both ways depending on what you want. Even for diffing, you might like to see that the image changed (though I generally agree that diffing images is not ideal). If you have to package images separately when sharing, then wouldn't inline images score a point for easier sharing?

> use other languages than python

Agreed, and that is powerful, but this also goes to my point. Jupyter is made for python and org mode isn't. Jupyter comes with python and libraries, where with Emacs you're on your own to figure out how to install and use different languages, and setup your environment so Emacs can see them. You're on your own to bring in images and plots, where Jupyter already understands commands that output images.

> Plus, you get to use a proper text editor instead of editing on a browser textarea.

Yes, but the editor is Emacs. :) For a lot of people the browser textarea is a bonus because it's the same thing you use everywhere. It's uncomplicated and non-technical and utterly consistent, even if lacking any power.

tincholio · on April 15, 2018

>If you have to package images separately when sharing, then wouldn't inline images score a point for easier sharing?

I guess that depends on who you're sharing with and how. If you're using git, you can just add them to your repo (only they won't clobber your diffs). If you're just publishing, you can export to HTML and put it somewhere online.

>Agreed, and that is powerful, but this also goes to my point. Jupyter is made for python and org mode isn't. Jupyter comes with python and libraries, where with Emacs you're on your own to figure out how to install and use different languages, and setup your environment so Emacs can see them

Well, that requires a minimum of computer literacy, but rocket science, it ain't. I agree that Emacs is not for everyone, but if you're doing data analysis (and presumably something with the results), it's certainly not out of reach, and good tooling is worth learning.

I agree that Jupyter is "easier", but it's not simpler. And also, it's limiting in what it can do. So yes, you might get up and running a bit faster, but you're setting yourself up for frustration down the line.

> Yes, but the editor is Emacs.

Precisely, it's vastly better than the alternative. If you don't like the bindings, you can use Evil (this is my choice, despite never having been a heavy vi user), or CUA mode, and get a more familiar experience for people used to Windows.

dahart · on April 15, 2018

Jupyter isn’t competing with emacs, it’s not an editor. It’s competing with the python shell. And its editing capabilities are way better than the python shell.

Adding git as a dependency to sharing is too much in general, that would prevent sharing.

tincholio · on April 15, 2018

> Jupyter isn’t competing with emacs, it’s not an editor.

I'm fully aware of that, my point was that functionality- and ergonomics-wise, you're better off using Emacs/org. I agree that it's not for everyone, but for people working on data science, it's a much more powerful tool.

>Adding git as a dependency to sharing is too much in general, that would prevent sharing.

I think we probably have vastly different target audiences in mind.

tincholio · on April 15, 2018

BTW, and unrelated to the Jupyter vs. org discussion, you may find this interesting (just came across it earlier today): https://blog.oscarnajera.com/2017/11/git-diff-images-and-pdf...

goerz · on April 16, 2018

> Plus, you get to use a proper text editor instead of editing on a browser textarea

I have a shortcut that lets me edit any browser text area in MacVim, which works out really well in the Jupyter notebook.

todd8 · on April 15, 2018

I share your opinion about the superiority of org-mode as a means to assemble a document out of various pieces. It's not hard to show nicely formatted code, the code's output, the math behind the code and perhaps a graph of the results all in a LaTeX/PDF document and or HTML. All of this can be done with a flat text file that is easy to save as a part of a project under source control.

What I wonder is why are Jupyter notebooks so popular? I've tried them, but perhaps I don't understand the workflow in the way people use them. Are they intended to be works in progress where one explores ideas and looks at results? This is how I've usually used them and Mathematica's notebooks as well.

For me though, my Jupyter or Mathematica notebooks end up with lots of false steps and dead-ends. To prepare a presentable form of the results I have to go back and edit the notebook to such a degree that it doesn't feel like working in a notebook as much as it is working in a complex document with a really weak set of editing tools. To be fair, I haven't used Jupyter or Mathematica for a couple of years and things may have changed. Maybe HN readers can help me understand how to more effectively use these tools.

dahart · on April 15, 2018

> What I wonder is why are Jupyter notebooks so popular?

How easy is it to share a Jupyter notebook with someone else, versus sharing an org-mode notebook? Think about the recipients rather than how hard it is for you. Assume the person you share with needs to be able to re-run your notebook. Assume the person you share with isn't using the same operating system that you are. Be honest and think about all the steps involved.

With Jupyter, I send a single install link to Anaconda. With Emacs + org mode + python + numpy + scipy + matplotlib, I might not even know where to begin if the recipient is using Windows or Mac. The recipient needs to install emacs, and they need to know how to use Emacs. They have to know how to install python packages (and possibly a package manager), and ideally be able to use virtualenv too.

If I need different libraries than what Anaconda comes with, I can put the commands in the notebook, and I won't mess up the recipient's system version of Python. With org-mode, they'd need to figure it out on their own.

If I choose to use a different language than Python in my org-mode notebook, the recipient will have to install that programming language and environment themselves before they can use my notebook.

It's the simplicity and bundling of dependencies that make Jupyter popular. All the flexibility and power you get with org mode comes at a high price.

takluyver · on April 15, 2018

If the notebook is something you can share publicly, then https://mybinder.org/ makes sharing even easier. You send a link, and the only thing the recipient needs is a modern browser.

If they want to actually work on the notebook and keep their changes, they'll probably need to install it (or switch to a platform like Cocalc). But Binder is enough to let them view, run it and make experimental changes, without installing anything or creating any new account.

tincholio · on April 15, 2018

> How easy is it to share a Jupyter notebook with someone else, versus sharing an org-mode notebook?

Well, now you're talking about a whole environment. Use docker, or VMs or whatever. There's plenty of options for doing this (we actually use Docker for Jupyter, to make sure we work off a consistent install, so I guess sharing can be non-trivial if you walk a bit off the beaten path, anyway)

> If I choose to use a different language than Python in my org-mode notebook, the recipient will have to install that programming language and environment themselves before they can use your notebook.

Sure, and if you need to use another language from your Jupyter notebook, you can't. I don't get your point. Yes, if you need to use a tool, you need to install it. It requires a minimum of computer literacy.

dahart · on April 15, 2018

Yes, Anaconda + Jupyter is an environment and org mode is not. That’s a big reason Jupyter is so popular.

> I don’t get your point.

My point was Jupyter inflicts far less technical burden and debt on other people. There’s a huge downside to custom environments. If that’s not obvious, that explains why it’s hard to understand jupyter’s popularity.

> it requires a minimum of computer literacy

I’m sure you didn’t mean it that way, but this borders on white tower programmer centric dismissiveness.

Personally I know how to install all the things an org mode notebook might require, but I don’t want to. It takes too much of my time and energy. Expecting other people to adopt your choice of tools and environment by arguing that your tools are basic literacy is presumptuous. Use org mode, I’m wildly in favor of people using tools that they like, and you like it. Just don’t claim it’s objectively better on all axes when it’s not.

tincholio · on April 15, 2018

> Yes, Anaconda + Jupyter is an environment and org mode is not. That’s a big reason Jupyter is so popular.

You're again comparing apples to oranges. If you want to use Jupyter, you need to install a whole lot of stuff (easier to do if you're using Anaconda, agreed). It's "not simple" enough that there are public docker images for it. My point is that when tooling is an issue, you can do the exact same thing with an org-based setup.

>My point was Jupyter inflicts far less technical burden and debt on other people. There’s a huge downside to custom environments. If that’s not obvious, that explains why it’s hard to understand jupyter’s popularity.

I agree that there's difficulty in achieving reproducible environments, and that's why there's tools that handle that for you. I think that much of Jupyter's popularity comes from the fact that it's easily available, and based on Python, which lets non-programmers get started moderately quickly. I just don't think that it's the best choice. This is not to shit on Jupyter, I've mentioned a bunch of advantages of using org (which by the way, you haven't really addressed, other than focusing on the setup aspects)

> I’m sure you didn’t mean it that way, but this borders on white tower programmer centric dismissiveness.

It's not, really. My point is that when you're doing presumably advanced analysis on presumably complex data, you (should!) certainly have the capacity to deal with tooling.

dragonwriter · on April 15, 2018

> Sure, and if you need to use another language from your Jupyter notebook, you can't.

Not entirely true, BTW.

https://blog.dominodatalab.com/lesser-known-ways-of-using-no...

tincholio · on April 15, 2018

Still limited, and pretty clumsy. Have a look here: https://orgmode.org/worg/org-contrib/babel/languages.html

jononor · on April 15, 2018

There are Jupyter kernels for most languages out there. https://github.com/jupyter/jupyter/wiki/Jupyter-kernels

tincholio · on April 15, 2018

And for the most part, you cannot use them at the same time from the same notebook.

paultopia · on April 15, 2018

I think of a Jupyter notebook as a REPL with superpowers. I can try code out in real-time and incrementally like a REPL, but when something breaks, the full history is right in front of me, so I can tweak something 10 lines ago and run the entire thing again like with an ordinary source file. It combines the best of both methods of programming.

Seen that way, sharability is just a nice bonus. Yeah, I'd never try to turn anything directly from a jupyter notebook into a serious way of sharing research results---but for a quick and dirty "here's what I did" on a blog or email or something, or to show students, it does the job.

taeric · on April 15, 2018

You should give some of the emacs modes a try. Could do this sort of stuff for quite a while now. And I'm really just talking of inferior buffers. Not org.

Kind of odd, but repl on steroids is really just how dim the modern view of the old lisp machines is. Most of our neat repl tricks were just taken for granted on some of those machines.

paultopia · on April 15, 2018

Yeah, I like the idea of that stuff, and org-babel too, actually. But the setup just seems like a bit of a bear. I use spacemacs, and the python layer for that[1] can spawn a repl and send code, but in order to do so incrementally, I have to manually select the code to send to the repl. The great thing about the cell construction in jupyter is that it supplies a natural way to organize chunks of code, you don't have to define the chunks of code before executing them.

That being said, it's still pretty sweet!

[1] https://github.com/syl20bnr/spacemacs/tree/master/layers/%2B...

tincholio · on April 16, 2018

Maybe you should try ob-ipython. It'll spawn a Jupyter server with a python kernel, and send your source blocks as cells, and you also get to keep all the other org goodies.

paultopia · on April 16, 2018

Thanks!

tincholio · on April 15, 2018

> What I wonder is why are Jupyter notebooks so popular?

I often wonder the same. I'm now kinda forced to use them, since they're the tool of choice in my team, but it's a major pain. I think that they're "easy" (in the way Rich Hickey describes "easy" vs "simple") to get started with (just as python, btw), but their lack of flexibility comes back to bite you sooner or later.

For me, I've resorted to using a mix of EIN and ob-ipython, and then copy/pasting as needed to keep the stuff in synch with Jupyter. I'm now considering writing a simple ox-jupyter exporter, and maybe some import functionality, to avoid dealing with Jupyter directly.

John Kitchin has written a basic exporter (http://kitchingroup.cheme.cmu.edu/blog/2017/01/21/Exporting-...), but it doesn't support inline images (it should be simple to fix this), and it doesn't really adhere to the ox- style of exports, so I think I'll probably do my own instead of adapting his (it'll be a good learning experience, for sure).

gaius · on April 15, 2018

> combine several languages in a single document without any issues

You can do this with RMarkdown as well, despite the name, and run notebooks in RStudio. Much better experience than Jupyter in a browser.

https://yihui.name/knitr/demo/engines/

_Wintermute · on April 16, 2018

RMarkdown is pretty useless if you're not using R.

  Except engine='R' (default), all chunks are executed in separate sessions, so the variables cannot be directly shared.

So yes you can write a python snippet, but good luck trying to write a notebook.
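The effect of "separate sessions" is easy to mimic outside knitr. A small sketch, assuming each chunk is modelled as a fresh interpreter started with `subprocess`; the helper `run_chunk` is made up for illustration and says nothing about how knitr actually launches its engines:

```python
import subprocess
import sys


def run_chunk(code: str) -> subprocess.CompletedProcess:
    """Run one chunk in its own fresh Python interpreter (a separate session)."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )


# The first chunk defines x; the second chunk runs in a new process and cannot see it.
first = run_chunk("x = 41 + 1\nprint(x)")
second = run_chunk("print(x)")

if __name__ == "__main__":
    print("first chunk output:", first.stdout.strip())
    print("second chunk failed:", second.returncode != 0)
    print("NameError reported:", "NameError" in second.stderr)
```

Anything the second chunk needs from the first has to be passed explicitly, for example through a file.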

tincholio · on April 15, 2018

I haven't used Knitr (I do my R stuff from within Emacs, and org is a natural choice there), but it does look pretty powerful

emj · on April 15, 2018

Requiring basic Emacs knowledge is just too much, just looking through the features of org-babel I can't really find anything that makes it better suited. Except not having to bother with a webbrowser, that is a nice plus, but hardly important for most people I work with.

abhirag · on April 15, 2018

I have used both Jupyter Notebooks and org mode with org-babel extensively, and I agree with the OP regarding the fact that the org-babel workflow is vastly superior. OP did point out a few features which the org mode workflow has and Jupyter Notebooks don't, but I will try to provide a comprehensive list:

1. Plain text format, git and git diffs work

2. You can combine many languages in a single document, and every code block can be part of a separate session, as an analogy to Jupyter Notebooks, you can have multiple kernels backing a single notebook and you can decide what kernel you want the current code block to run in.

3. You can edit a code block in the major mode for that language, i.e. you get all the features of Emacs while editing code: documentation, auto-complete, snippets and anything Emacs can do, and Emacs can do a lot :)

4. You can have internal and external links to any part of the document (or any other org-mode file) within the editor which get exported as links in the HTML file too. Want to refer to a code block you used before, just name it and drop a link. Extremely useful in binding the whole document together.

5. Literate Programming support -- You can decide the order the concepts are introduced in according to the human reader, not according to the execution order the machine demands it to be in:

  #+NAME: named_code_block
  #+BEGIN_SRC python :eval no
  function_not_defined_yet()
  #+END_SRC

  #+NAME: complete_code_block
  #+BEGIN_SRC python :noweb yes
  def function_not_defined_yet():
      print("nice function innit?")

  <<named_code_block>>
  #+END_SRC

The <<named_code_block>> gets expanded to whatever you defined it as, and you control how the document is structured so that it is the most readable. You can keep working backed by a REPL in the initial stages and then extract (tangle, in literate programming speak) to a file, again in the order you want, using the <<named_code_block>> (NOWEB syntax). So one org-mode file can generate your whole project if you wish.

6. With the internal and external links and <<named_code_block>> (NOWEB syntax) the org-mode file is closer to being a hypertext file than Jupyter Notebook even though Jupyter Notebook is the one running in a browser.
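To make the noweb expansion from point 5 concrete, here is a toy version of the idea in Python. It is only a sketch of the substitution step, assuming blocks are kept in a plain dict keyed by name; the `expand` function and the `blocks` dict are made up for illustration and this is not how org-babel implements tangling:

```python
import re

# A reference is a line that contains only <<name>>, optionally indented.
REF = re.compile(r"^(?P<indent>[ \t]*)<<(?P<name>[^<>\n]+)>>[ \t]*$", re.MULTILINE)


def expand(name: str, blocks: dict, _stack: tuple = ()) -> str:
    """Return block `name` with every <<ref>> line replaced by that block's text."""
    if name in _stack:
        raise ValueError("circular reference: " + " -> ".join(_stack + (name,)))

    def substitute(match):
        body = expand(match.group("name"), blocks, _stack + (name,))
        indent = match.group("indent")
        # Carry the reference's indentation over to each inserted line.
        return "\n".join(indent + line if line else line
                         for line in body.splitlines())

    return REF.sub(substitute, blocks[name])


# The two blocks from the example above, in reader-friendly order.
blocks = {
    "named_code_block": "function_not_defined_yet()",
    "complete_code_block": (
        "def function_not_defined_yet():\n"
        '    print("nice function innit?")\n'
        "\n"
        "<<named_code_block>>"
    ),
}

if __name__ == "__main__":
    print(expand("complete_code_block", blocks))
```

The expanded text defines the function before calling it, even though the call was introduced first in the document.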

I have covered only the major features of org-babel, I haven't even covered all the features. I love Jupyter Notebooks too, but org-babel is something else. I am currently working on a toy ray tracer in Clojure in literate programming style and loving every moment :)

emj · on April 17, 2018

I will try to run an org-mode to ipynb converter, so thanks for the suggestion! I just wish there was an Emacs version that wasn't too different from everything else, so other people could look past the stigma of Emacs. So thank you for the feature rundown, and to be crystal clear, I want more options, but I will only be able to run org-mode/babel myself. Requiring Emacs is just too big of a hurdle for anything bigger than a two-man team.

abhirag · on April 17, 2018

It hurts to hear stigma and Emacs together in a sentence, but I guess you are referring to the arcane keybindings of Emacs. In that case, I use native keybindings for editing in Emacs, i.e. the old Ctrl-C, Ctrl-V, Ctrl-A, Ctrl-Z, Ctrl-S, and vim keybindings for executing commands, i.e. for things like running code. This is a great setup for beginners, so do contact me if you need help using Emacs with native keybindings :)

Just to give an example of what can be done using org-mode, this is the project I am using to grok Literate Programming -- http://ray_tracer.surge.sh/

The whole thing is generated using one org-mode file -- https://gitlab.com/snippets/1710454

and this org-mode file is the one I work in, it will eventually generate the source code in separate files too, once I have finished the project.

This is what my setup looks like -- https://i.imgur.com/mqi8vPR.png

Anyways I can't convince you to use it, but hopefully I can convince you to give it a try, it isn't easy but it is worth it :)

tincholio · on April 15, 2018

You saved me a whole lot of typing! Thanks :)

I'd add that you can also benefit from other aspects of org, such as project management functionality, outline editing, table editing, tables with formulas, direct git integration via Magit, etc., etc.

I agree that getting people on Emacs is a non-minor issue, but using something like Spacemacs with CUA-mode enabled could go a long way towards acceptance.

abhirag · on April 15, 2018

I would love discussing the literate ray tracer I have been working on and the literate programming workflow I have made for it. Couldn't find your email in the profile, mine is abhirag@outlook.com, would be great to discuss if I can improve my workflow further :)

tincholio · on April 15, 2018

Just sent you a mail (check spam folder if you don't see it, it comes from my own domain, and sometimes it's dumped with the spam)

abhirag · on April 15, 2018

Found it in the spam folder, will mail you the details soon :)

FractalLP · on April 15, 2018

Carl Sassenrath (Amiga and Rebol inventor) tried to do this with Rebol. He had what he called iOS (internet operating system) that would let you share the tiny Rebol source files. A 1/2 page of Rebol can run a fully functional GUI with graphics...etc as there is a DSL for most things and high level primitives. The interpreter is < 5MB, so you can transfer everything over the web, but still avoid JavaScript...etc. A cool vision that didn't pan out. The Red project is continuing what is cool about Rebol, but with native compilation as well and a cross-platform compiler for Mac, Linux, Windows, Android, BSD, and several others. They have a whitepaper out on Dapps (decentralized apps) and how to use it with ethereum and blockchain. Worth a look.

tunaoftheland · on April 15, 2018

As an alternative to Jupyter UI and Emacs, Hydrogen (https://github.com/nteract/hydrogen) could be viable. It runs as an Atom extension and connects to a Jupyter server instance. I haven't used it for anything other than a minimal project, but I preferred its UX to that of the browser interface of Jupyter. Atom isn't my favorite editor, but it's pleasant for this particular use case. Looks like the same team also offers an Electron-based application instead of an Atom extension.

tincholio · on April 15, 2018

That looks quite similar (though much nicer UI-wise) to EIN (the Emacs mode for jupyter). I guess if you're doing D3-based viz stuff it will be a better choice than EIN, as you can render those properly.

spot · on April 15, 2018

> In terms of general usability and experience, 20-years-ago blows away modern-day Jupyter and it’s silly “cells” interface, which it does not because it’s better but because it’s trying to force itself into a web browser.

I am really interested in the specific advantages MathCAD had/has, and the problems you see with cells and being in a web browser.

> the coming revolution of non-web collaborative internet-connected rich client applications.

why? how is the web not up to this?

gaius · on April 17, 2018

> the problems you see with cells and being in a web browser.

My end goal is a document, and for that I either want a programmer-grade text editor, or a full-featured word processor. Filling in a web form is a terrible way to write a document. My tool of choice at the moment is RStudio - all the interactivity you could possibly ask for, and very easy to edit complex files then render them to a great-looking PDF with the output of my code interspersed with paragraphs of text, footnotes and TOC and all the rest.

madengr · on April 15, 2018

MathCAD went to shit in about 2000. I had been using it for 10 years then (started in DOS). They made some annoying UI changes and really increased the cost. The kicker was when they called me and accused me of using a fake license, when I had dutifully been paying support for both a home and work license. I got fed up and went to Mathematica.

MathCAD was later bought by PTC, and further went to shit. It amuses me how some companies can self destruct.

gaius · on April 15, 2018

> MathCAD went to shit in about 2000

That would be a couple of years after I last used it. What a shame, at least I have those memories. Really loved how low-friction working with it was, and how trivial it was to generate high quality documents or reports, there was no "write it up" stage, you were done basically as soon as you had done the actual work. Then just print and go! Still can't really do that with Jupyter.

madengr · on April 15, 2018

Yes, it was a good run. Now I remember specifically:

Started using the DOS version in 1990 in college.

Bought version 3.1 in ~1992, Windows version running it under OS/2. Could afford to buy it then, even as a student. Ah, OS/2 was also good.

It was < $100 for each version upgrade until 2000, when they switched to maintenance based licensing, soon bumping the price to $495.

I had a home license which I dropped. I still had a work license, but then some sales twerp said I was using a pirated license at work, and wanted to speak to my IT department. This is a >100k employee company, so serious consequences for software piracy. I faxed them my valid license certificate, and told them to stuff it, dropping my renewal, and buying Mathematica.

Now I see it's Mathcad Prime, a subscription-based license, for $800 a year. They were purchased by PTC several years ago, which only made things worse.

This is a textbook case of screwing over your user base, cha$ing corporate accounts. I don't know anyone who uses it now. Many people used it back in the day.

Sad thing is I don't even use Mathematica much. Too painful most of the time. I don't like the interface, but it's better than browser-based Jupyter. Just forget trying to install Jupyter and its dependencies on a corporate-managed Windows computer. I just use paper and a calculator. I swear things have regressed from 20 years ago. At least I can still buy a new RPN calculator.

FractalLP · on April 15, 2018

Yea. Using Mathematica is much easier to do than Jupyter as my IT department just has to purchase it. Sadly, I have a lot of data, so using a calculator (even a programmable RPN one) is out of the question. Even Excel is too small for some things.

promer · on April 15, 2018

I've still got a Mathematica license. I never use it. For me, the out-of-pocket monetary cost was never an issue.

gravypod · on April 15, 2018

Has anyone tried integrating this sort of thing into LibreOffice? I bet Calc would make a decent DataFrame tool.

gravypod · on April 15, 2018

Posted too long ago for me to be able to edit.

Looks like someone has: http://comppad.sourceforge.net/

FractalLP · on April 15, 2018

I used MathCAD in college engineering, but never the Word interface. My wife had to use it for all her engineering classes and was a whiz with it. Still, the overall system is pretty far behind in overall functionality I would say.

klmr · on April 15, 2018

> The tie-breaker is financial.

It can be both financial and social: individual institutes are all too happy to pay for Mathematica licenses, and consequently institute members can use it to produce reproducible research with it. However, the reproducibility of the resulting notebooks is drastically hindered by the fact that a reader essentially also needs to pay for Mathematica to get the full benefit out of these notebooks (even if they are readable without Mathematica). As a consequence, few people bother using it even though they can afford to. Social drivers disincentivise its usage.

By a similar dynamic, Git beat out the competing DVCSs: in this case it was mostly technical rather than financial factors that drove individual actors to prefer Git over the alternatives (due, in large part, to GitHub). But many people don’t actually care about technical considerations (or even prefer other systems over Git in this regard). What people care about most is seamless integration. In the end, a social driver caused Git’s adoption.

acidburnNSA · on April 15, 2018

Fair points. Just to clarify I think the author was referring to Vandals [1] in the original sense rather than merely poking fun.

[1] https://en.wikipedia.org/wiki/Vandals

blablabla123 · on April 15, 2018

Also not to forget that Jupyter is just different from Mathematica. With Mathematica you can do symbolic computations; yes, also Statistics and Machine Learning, but also Group Theory and whatnot. Jupyter does a great job as an interface for certain Statistics and Machine Learning tasks, and I'm quite sure that it needs fewer resources, but that's all.

That said, I'm still missing a free but powerful tool for symbolic computations like Mathematica or Maple.

> The tie-breaker is financial.

Exactly, it cannot be emphasized enough. Of course as a student you get these powertools for a small price or even for free. But if you are not in University, those tools are super expensive. For a reason but there is still a need for far more open source in this area.

EDIT: I'm just realizing there is Sympy, niiceee...

hpcjoe · on April 15, 2018

Maxima, the gpl'ed version of Macsyma (which I used in grad school days for stat mech calculations on spin systems with Potts, Ising, and other models) is available for most systems. There is a Jupyter kernel for it[1].

I am personally more a fan of Julia than Python, and Julia + Jupyter is an awesome combination.

[1] https://github.com/robert-dodier/maxima-jupyter

stjohnswarts · on April 15, 2018

Exactly, you can be a programmer and be a capitalist too. May the best paradigm or hybrid of the various paradigms win.

FractalLP · on April 15, 2018

Long time user of Python here and recent user of Mathematica.

Some observations I have are that they're both great. Python is a nice open source scripting language, but getting libraries to work can sometimes be a pain. Mathematica is basically install this and everything is included. The Mathematica documentation is amazing and it is really simple how to do most things. The whole iPhone "there is an app for that" is equivalent to "there is a function for that".

Graph Theory works flawlessly in Mathematica. In Python, there is a module for Graphviz. Let me know if Python has something new though. There are a lot of other examples. Mathematica's Import[] function can read over 150 different file types including: CSV, XLS, genetic encoding files, optimization files... whatever. It is usually far easier and more consistent than finding a corresponding Python library and struggling with the install and minimal documentation. Let me be clear that Python is awesome and rocks, and I think Jupyter is moving it in the right direction. I just feel like many dismiss Mathematica as something that does Calculus homework rather than what it is today, which is a massive 20 million LOC conglomeration of C & Java & Wolfram language that does everything from Statistics, Machine Learning, Visualization, BlockChain, 3D printing, NodeGraphs, data sets and analysis... etc. in a single consistent package. It is expensive and proprietary and certainly has its own faults, but a lot of that cash is funneled back into a great product.

askvictor · on April 15, 2018

While I totally hear you regarding the pain of python modules (particularly on Windows), the point of python 'distributions' like anaconda and canopy is to bring the kitchen sink along, kind of like mathematica.

The problem with Mathematica from a science point of view is that, being closed source, means you can't independently ensure the calculations are happening correctly. To be replicable, science involving data needs to use open source tools.

gaius · on April 15, 2018

> The problem with Mathematica from a science point of view is that, being closed source, means you can't independently ensure the calculations are happening correctly

Have there been any high-profile failures root-caused to Mathematica (or MATLAB or any similar product) getting its sums wrong? I can't find any news stories etc. Plenty of serious calculations were and are done on “closed source” HP and TI calculators too. Every serious scientific instrument with its own data capture uses a binary blob somewhere too, so if that’s a problem for you then you can’t even trust the raw data!

And even if you have all of the code - you still need to worry if the proprietary, closed FPU is working “correctly”.

This sounds very much like a post-hoc justification for “it's free as in beer”. Do you think Wikipedia is more trustworthy than real references too? What about blogs?

PopePompous · on April 15, 2018

Science publication is moving (very, very, very slowly) towards a model where instead of a final, polished traditional paper, the raw data along with the software tools and interpretation is published. In principle this should allow readers to completely understand and reproduce the processing of the raw data, rather than reading a few paragraphs summarizing the processing done by the authors. Using a closed source tool for processing the data limits how deeply a reader can delve into the processing that the authors did, because the functions in the proprietary package are black boxes. Jupyter has no black boxes.

gaius · on April 15, 2018

> In principle this should allow readers to completely understand and reproduce the processing of the raw data, rather than reading a few paragraphs summarizing the processing done by the authors. Using a closed source tool

But consider http://www.bbc.co.uk/news/science-environment-39054778

"Science is facing a "reproducibility crisis" where more than two-thirds of researchers have tried and failed to reproduce another scientist's experiments"

I don't think that can be handwaved away as "OMG closed source software!". Especially since all the scientists in a given field will have access to the same software anyway. Give them open source and the issue will persist, and we both know it because the root cause isn't anything to do with the license of the software

jononor · on April 15, 2018

Reducing the barriers for doing reproduction studies will likely increase how often they happen.

Or just consider peer review. How often does a reviewer today actually review the code and data used in a study? As far as I know this is essentially never done. That would involve seeing the code, running it, maybe messing a bit with it.

nmca · on April 15, 2018

Open source software, in general, tends to make things more reproducible. Sure, software licencing might not be the singular root cause, but why does that suggest we shouldn't capitalise on the improvements available?

gaius · on April 15, 2018

> Open source software, in general, tends to make things more reproducible

Citation very much needed for that. Because you can very easily find that 6 months or a year later you update your dependencies and everything is now broken. I recently came back to a Python project after a year, updated my packages then realized: I simply cannot be bothered to unpick the mess that resulted just to add one trivial feature. Whereas the poster child for backwards compatibility is closed-source and proprietary.

Science is not reproducible because there are no incentives for it to be so, despite everyone paying lip service to it. It's extra work and helps those who are competing with you for grants, after all. That's a social problem, not a software one. The software is irrelevant.

takluyver · on April 15, 2018

Open source software is important for reproducibility for a couple of reasons. Firstly, if you record that you've done your analysis with Python 3.6.3 and Numpy 1.14.2, and it later breaks on some newer version, it's relatively easy to get the same versions you were using. Commercial software vendors are usually not keen on you downloading and running a version of their product which was superseded four years ago.

Secondly, of course, open source means that if you're not sure why two versions/functions/libraries are giving you different answers, you can go and find out. I accept that a lot of people may not have time for that, but I don't think you can fix that problem unless it's possible to dig down and follow the working.

Finally, 'reproducible by anyone with a computer' is a lot better than 'reproducible by people who buy a license for the tool I used to do it'.
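The "record that you've done your analysis with Python 3.6.3 and Numpy 1.14.2" step can be sketched with nothing but the standard library. This is a minimal illustration, not a complete provenance tool: environment_snapshot is a made-up helper, the package list is just an example, and it assumes Python 3.8+ for importlib.metadata. It does not capture compiler flags or system libraries.

```python
import json
import platform
from importlib import metadata


def environment_snapshot(packages):
    """Collect interpreter, compiler, OS, and package versions as a plain dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "compiler": platform.python_compiler(),
        "platform": platform.platform(),
        "packages": versions,
    }


# Example package names; list whatever your analysis actually imports.
snapshot = environment_snapshot(["numpy", "pip"])
print(json.dumps(snapshot, indent=2))
```

Saving that JSON next to the notebook at least tells a later reader which versions to ask for, whether or not they can still be installed.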

gaius · on April 15, 2018

> If you record that you've done your analysis with Python 3.6.3 and Numpy 1.14.2, and it later breaks on some newer version, it's relatively easy to get the same versions you were using

Better record which compiler you used too, and what flags, and every version of every library and everything else. It’s not as simple as you make out and it’s far from guaranteed that all those packages will still be available or compile on your OS.

> Commercial software vendors are usually not keen on you downloading and running a version of their product which was superseded four years ago.

I guess you must not deal with vendors much because generally they are fine with this. It’s part of the support agreement usually, just another service. Getting an “obsolete” version for whatever reason has never been a problem for me.

By “anyone with a computer” you mean “anyone who can exactly reproduce my configuration which I don’t even know myself for certain”

improbable22 · on April 15, 2018

Because the reproducibility crisis has very little to do with the difficulty in re-running the same code. These are almost orthogonal concerns.

The typical problem paper has a small data set, on which the authors tried 20 different things, one of which achieved p<0.05 and got published. The result tells you nothing meaningful about the world (or more often, tells you something about how its authors wish the world worked). But re-running their code on their data set will not reveal the problem.
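A back-of-the-envelope check of the "tried 20 different things" scenario, as a small Python sketch. It assumes the 20 analyses are independent and that there is no real effect in any of them, which is a simplification; chance_of_false_positive is a made-up name.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """P(at least one p < alpha) across independent tests when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** num_tests


for n in (1, 5, 20):
    share = chance_of_false_positive(n)
    print(f"{n:2d} analyses -> {share:.1%} chance of at least one 'significant' result")
```

Under those assumptions, 20 tries give roughly a 64% chance of at least one p<0.05 by luck alone, and re-running the published code on the published data reproduces that lucky result every time.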

BlackFingolfin · on April 15, 2018

I am not sure that there are/were "high profile failures", but serious bugs for sure, not just in Mathematica but also in the calculators you mention; see e.g. here: https://mathoverflow.net/questions/11517/computer-algebra-er...

Now, being open source does not prevent such errors. What I think is far more important here is open issue tracking: companies like Wolfram may not be exactly eager to let you know that there are serious bugs in their products (and what they are) unless they must. Being able to fix the bug yourself can of course be a real perk for some users, but realistically, it's out of the question for most.

fdej · on April 15, 2018

Here is an article from a couple of years ago about how one group of mathematicians were misled by (and had to spend some time tracking down) a bug in Mathematica determinant evaluation: http://www.ams.org/notices/201410/rnoti-p1249.pdf

FractalLP · on April 15, 2018

Thanks for posting. I'm sure bugs like this show up in any and all software. I do wish they had a public bug tracker though.

madengr · on April 15, 2018

I’ll add that many closed sourced developers will show you sections of the source code if you sign an NDA. I’ve done that on several occasions when I needed to see how a model was implemented. As long as you are not a competitor, it’s usually not an issue. I sign NDAs often to see IP.

gaius · on April 15, 2018

Additionally most of Mathematica is written in Mathematica and you can just read it. Same with MATLAB.

I get the feeling that many of the Jupyter fans commenting here have only ever used a Jupyter setup and are unaware of the wider industry

starpilot · on April 15, 2018

> The problem with Mathematica from a science point of view is that, being closed source, means you can't independently ensure the calculations are happening correctly. To be replicable, science involving data needs to use open source tools.

Excel and MATLAB can't be used for real science?

inigoalonso · on April 15, 2018

At least you need to be very careful with them: https://genomebiology.biomedcentral.com/articles/10.1186/s13...

Abstract: "The spreadsheet software Microsoft Excel, when used with default settings, is known to convert gene names to dates and floating-point numbers. A programmatic scan of leading genomics journals reveals that approximately one-fifth of papers with supplementary Excel gene lists contain erroneous gene name conversions."
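A rough Python sketch of the kind of "programmatic scan" the abstract mentions: flag identifiers that a spreadsheet with default settings could read as a date or a floating-point number. The two patterns are a heuristic written for illustration, not Excel's actual parsing rules and not the paper's script; at_risk is a made-up name. SEPT2, MARCH1 and DEC1 are gene symbols of the kind affected, and 2310009E13 is a RIKEN-style identifier that looks like scientific notation.

```python
import re

# Month names and abbreviations a spreadsheet may read as the start of a date.
MONTHS = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)
# Month prefix, optional hyphen, then a one- or two-digit number (e.g. SEPT2).
DATE_LIKE = re.compile(r"^(?:%s)-?\d{1,2}$" % "|".join(MONTHS), re.IGNORECASE)
# Digits, the letter E, digits: reads as scientific notation (e.g. 2310009E13).
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def at_risk(symbol):
    """Return True if `symbol` looks like something a spreadsheet might auto-convert."""
    symbol = symbol.strip()
    return bool(DATE_LIKE.match(symbol) or FLOAT_LIKE.match(symbol))


gene_list = ["TP53", "SEPT2", "MARCH1", "BRCA1", "DEC1", "2310009E13"]
flagged = [gene for gene in gene_list if at_risk(gene)]
print(flagged)
```

Running a check like this over a gene list before it goes anywhere near a spreadsheet (or importing the column explicitly as text) is cheap insurance.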

starpilot · on April 15, 2018

Yeah, but that doesn't change the fact that 99% of the science/engineering/business world is reliant on Excel. Airplanes are designed in Excel.

PopePompous · on April 15, 2018

I think your estimate of 99% is a wee bit high. At least in the field I'm familiar with, astronomy, the idea of using Excel for any serious computation or design would be met with laughter.

iguy · on April 15, 2018

Well since the author is an economist, questions about "real science" can be interpreted a few ways.

For a fun look into the high standards of the field, a few years ago Piketty made the mistake of sharing his Excel files, in which all sorts of crucial adjustments were hard-coded into tables of data...

https://marginalrevolution.com/marginalrevolution/2014/05/pi...

jjgreen · on April 15, 2018

Matlab's fine, but Excel has (had?) some serious defects in the basic statistical functions; in particular, it couldn't reliably calculate a standard deviation.
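For anyone wondering how a standard deviation can go wrong at all: the textbook failure is the one-pass "calculator" formula, which subtracts two huge, nearly equal numbers. Whether that is exactly what the affected Excel versions did is an assumption on my part; the Python sketch below only illustrates the numerical effect, and the function names and data are made up.

```python
import math


def naive_sample_std(values):
    """One-pass formula: (sum of squares - (sum)^2 / n) / (n - 1). Numerically fragile."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    variance = (total_sq - total * total / n) / (n - 1)
    # Cancellation can even push the computed variance below zero.
    return math.sqrt(variance) if variance >= 0 else float("nan")


def two_pass_sample_std(values):
    """Two-pass formula: compute the mean first, then sum squared deviations from it."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


# Small spread (true sample variance is 30) on top of a large offset.
data = [1e9 + offset for offset in (4, 7, 13, 16)]

print("naive   :", naive_sample_std(data))
print("two-pass:", two_pass_sample_std(data))
print("exact   :", math.sqrt(30))
```

With a large offset and a small spread, the squares need more digits than a double has, so the naive result is off (or not even a number), while the two-pass version recovers the square root of 30.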

chillydawg · on April 15, 2018

I'm not sure I agree with that. If you publish your methods and make your data available, then a replicator has all they need. They'd have to reimplement the data pipeline anyway, otherwise it's not a replicated result it's just someone running your notebook with potentially the same bugs in it again.

FractalLP · on April 15, 2018

I hear what you're saying and you're right that WinPython & Anaconda certainly help, but the documentation is still a long way off from Mathematica in my opinion.

One thing in Python's favor though might be depth in certain categories. The machine learning stuff in Mathematica is very nice and high level if you want neural networks, but if you need PSO or GA, you'll probably have to write your own or grab someone else's notebook.

It was difficult for me to support closed-source software as I've always supported linux for this reason.

As far as ensuring accuracy of calculations, having a very large and highly technical user base over several decades helps, but I'm not sure how much this is used in theory. If a statistician publishes a paper using R, is anyone really going to check the R module source code? I bet this is a rare occurrence.

dr_zoidberg · on April 15, 2018

The best package for graph analysis in Python is NetworkX: http://networkx.github.io/

cs702 · on April 15, 2018

This is spot-on:

"Membership in an open source community is like membership in the community of science. There is a straightforward process for finding a true answer to any question. People disagree in public conversations. They must explain clearly and listen to those who response with equal clarity. Members of the community pay more attention to those who have been right in the past, and to those who enhance their reputation for integrity by admitting in public when they are wrong. They shun those who mislead. There is no court of final appeal. The only recourse is to the facts.

It’s a messy process but it works, the only one in all of human history that ever has. No other has ever achieved consensus at scale without recourse to coercion.

In science, anyone can experiment. In open source, anyone can access the facts of the code. Linus Torvalds may supervise a hierarchy that decides what goes into the Linux kernel, but anyone can see what’s there. Because the communities of science and open source accept facts as the ultimate source of truth and use the same public system for resolving disagreements about the facts, they foster the same norms of trust grounded in individual integrity."

The entire blog post is worth a read.

gaius · on April 15, 2018

> Membership in an open source community is like membership in the community of science. There is a straightforward process for finding a true answer to any question

Oh please. Dare to ask what is the best of anything and prepare for an epic flame war.

ChrisRackauckas · on April 15, 2018

Well, I asked the author for the data that led him to believe that the Julia community is "monopolistic", and he got mad and blocked me instead of linking to anything...

https://twitter.com/ChrisRackauckas/status/98552939470474035...

So your remark is pretty spot on.

littlehood · on April 15, 2018

You asked for that block in the second question. You sound very entitled to answers.

ChrisRackauckas · on April 16, 2018

I am just curious what led him to his conclusions. He made a very nasty comment about a group of people and gives no justification for why. I cannot seem to find out why when Googling either. Is it not okay to ask someone how they came to their conclusions?

littlehood · on April 16, 2018

You could've asked politely, but it looks like you missed your chance.

dEnigma · on April 15, 2018

Because "What is the best X" is a vague question and rarely has just one true answer.

gaius · on April 15, 2018

As is true of most interesting questions in most fields.

dEnigma · on April 15, 2018

Possibly, but this is why one shouldn't expect to get a simple, "true" answer or be surprised when there are arguments and flamewars.

yaroslavvb · on April 15, 2018

I've been using Mathematica since 1995 and Jupyter/colab for 5+ years. Most recently I've been using them both in parallel. While Jupyter is probably the future in terms of mass adoption, there are still some areas where Jupyter is lagging.

1. Mathematica has an easy way of sharing notebooks. I just run the "deploy" command, which turns a notebook into a publicly accessible webpage hosted by Wolfram; here's an example -- https://www.wolframcloud.com/objects/user-eac9ee2d-7714-42da...

2. Mathematica has a more active community. Mathematica-specific questions are likely to be answered within an hour by experts on https://mathematica.stackexchange.com/

3. Mathematica has better tools for simple interactivity. I like to throw in "Manipulate" for a simple graph with a draggable constant, or go to http://demonstrations.wolfram.com/index.php for an idea for more complicated demonstration to use in a presentation

4. Mathematica has more options for advanced visualization, and interfaces are more uniform since graph drawing, 3D drawing, and other kinds of visualizations are developed within a single system. Some examples https://www.wolfram.com/language/11/new-visualization-domain...

carreau · on April 15, 2018

Thanks for your feedback. Mathematica indeed has millions of $ to provide more features and advertise them, while Jupyter has only a few full-time devs, who probably do not advertise its features enough:

1) Binder makes that a git push away https://mybinder.org/ Want to check the discovery of gravitational waves? Go ahead! https://github.com/minrk/ligo-binder You know the nice thing? It does not require you to opt in: as long as a repo is public you can run it. So you don't need to deploy, or even know it exists. We are _already_ doing that for you.

2) Jupyter is "just" the frontend. StackOverflow has matplotlib, numpy, sympy, ... tags. We don't have the subdomain (yet), and I actually prefer tags because they make searching better :-)

3) Sure, it's called ipywidgets (https://ipywidgets.readthedocs.io/en/latest/), that's the tech. Write from ipywidgets import interact, then put @interact as a decorator on your function... that's it.

4) For a convenience library that uses ipywidgets for 3D, see https://ipyvolume.readthedocs.io/en/latest/animation.html (hey, it also supports VR!). See https://www.youtube.com/watch?v=nZ3HQpSXn2U, that will blow your mind.
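A minimal sketch of point 3, for readers who have not seen interact before. The damped_wave function and its parameter ranges are invented for illustration; the only real API used is interact from ipywidgets, called in its function form, which is equivalent to using it as a decorator. The import is kept inside make_interactive so the file still runs where ipywidgets is not installed.

```python
import math


def damped_wave(frequency=1.0, damping=0.2, samples=5):
    """Return `samples` values of exp(-damping * t) * sin(frequency * t) for t = 0, 1, 2, ..."""
    return [
        round(math.exp(-damping * t) * math.sin(frequency * t), 3)
        for t in range(samples)
    ]


def make_interactive():
    """Wrap damped_wave in sliders. Call this from a notebook cell."""
    # Imported lazily: ipywidgets is only needed for the interactive version.
    from ipywidgets import interact

    # A (min, max) tuple of floats becomes a float slider, a tuple of ints an int slider.
    return interact(damped_wave, frequency=(0.5, 5.0), damping=(0.0, 1.0), samples=(2, 50))


# The plain function works anywhere, with or without a notebook.
print(damped_wave(frequency=2.0, samples=4))
```

In a notebook, calling make_interactive() shows three sliders and re-runs damped_wave whenever one of them moves; swap the list for a plot and you have something close to the "draggable constant" use of Manipulate.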

We'll try to be better at advertising our features !

smortaz · on April 15, 2018

MyBinder is great. As well as cocalc and sagemath. There's also https://notebooks.azure.com which is a free hosting of Jupyter notebooks. You get a linux/docker container w a Terminal, Anaconda, etc.

Check out jakevdp's book for example:

https://notebooks.azure.com/jakevdp/libraries/PythonDataScie...

Clone to run.

Also try Jupyter Lab (experimental) - closer to an IDE than plain notebooks. right click on a Library (repo) and select Open in Jupyter Lab.

[disclaimer - our team's offering]

applecrazy · on April 15, 2018

There’s also Google Colaboratory

askvictor · on April 15, 2018

With regards to sharing, both Google and Microsoft have free hosting for shareable jupyter notebooks. Probably not quite as easy to get them from your computer to the cloud as a deploy command, but it probably wouldn't be hard to create a module that does exactly that (if one doesn't already exist)

gaius · on April 15, 2018

See https://blogs.msdn.microsoft.com/uk_faculty_connection/2017/... - you can easily import from Github

https://notebooks.azure.com

iguy · on April 15, 2018

Right, Jupyter is nice to have, but this is really over the top nonsense:

> Jupyter encourages individual integrity; Mathematica lets individuals hide behind corporate evasion

I have no idea what he's talking about re PDF export either. I print to PDF all the time, to email people a static document to look at, etc. It works just fine. (Whether you can make book-quality formatted text easily, I've no idea, never been tempted to try.)

rexpress · on April 15, 2018

I suspect that the OP used the "Print..." command in the File menu, and selected PDF as the printer option. ISTR that this can sometimes result in poor quality results as presumably it is relying on an external PDF engine to render the notebook.

Whenever I've used the "Save As..." command, choosing PDF as the target, I've also only had good quality output.

ChrisRackauckas · on April 15, 2018

There was a bug in Mathematica 10.0.3 that messed up saving parts to PDFs though. I wonder if he was on the release where this was broken:

https://mathematica.stackexchange.com/questions/68893/save-a...

promer · on April 15, 2018

Nope.

mistermann · on April 15, 2018

The writing style is very reminiscent of Ayn Rand.

promer · on April 15, 2018

mistermann · on April 15, 2018

Have any downvoters actually even read Ayn Rand, or is it the usual people who don't even have to, because they know all they need to about her?

This segment, with its hyperbole, ridiculous black-and-white simplification of complex matters, and tone of (over)confidence, is unmistakably typical of her writing style:

"Jupyter rewards transparency; Mathematica rationalizes secrecy. Jupyter encourages individual integrity; Mathematica lets individuals hide behind corporate evasion. Jupyter exemplifies the social systems that emerged from the Scientific Revolution and the Enlightenment, systems that make it possible for people to cooperate by committing to objective truth; Mathematica exemplifies the horde of new Vandals whose pursuit of private gain threatens a far greater pubic loss–the collapse of social systems that took centuries to build.

Membership in an open source community is like membership in the community of science. There is a straightforward process for finding a true answer to any question. People disagree in public conversations. They must explain clearly and listen to those who respond with equal clarity. Members of the community pay more attention to those who have been right in the past, and to those who enhance their reputation for integrity by admitting in public when they are wrong. They shun those who mislead. There is no court of final appeal. The only recourse is to the facts.

It’s a messy process but it works, the only one in all of human history that ever has. No other has ever achieved consensus at scale without recourse to coercion."

promer · on April 15, 2018

iguy, agreed. You have no idea because you haven't tried.

iguy · on April 15, 2018

In other news, you will be shocked to hear that major journals also don't accept photographs of proofs clearly worked out on the blackboard. Although, to be fair, I have also not tested this myself.

coldtea · on April 15, 2018

Defensive much? Still, the fact remains that the profit-motive accusations for them hampering PDF export are unsubstantiated.

sago · on April 15, 2018

Jupyter is an amazing and useful piece of software. I agree that its openness is important, that its flexibility in producing content is excellent, and that it deserves to be the current hotness. But I'm afraid

> Now, Jupyter is the unambiguous technical leader.

is pure fantasy, imho. SymPy is still two decades behind Mathematica in large swathes of symbolic computation.

It may be that, for the things that the author wanted to do, the Python libraries were a good fit (it seems he was working on NLP), but overall, I just don't see it.

askvictor · on April 15, 2018

I'm surprised there's no mention or discussion of the importance of open-source tooling for replicable science. Without seeing and reviewing the source, how can you tell that a particular calculation is right? Also, relying on costly tools such as Mathematica cuts off a sizable amount of the population from being able to replicate or play with your findings on cost grounds alone.

falkod · on April 15, 2018

Long-term Mathematica user (physicist) here: I don't think the use of open source software would make most science -- maybe that does not apply to cs/datascience -- more replicable. Usually that takes an expert in the field. And usually these experts are employed at universities where Mathematica licenses are not the prime cost factor. That said, I am all for open source software. Although I would argue that trustworthy scientific results probably do not rely on the inner workings of e.g. Mathematica anyway, but use Mathematica as a vehicle for, say, linear algebra or symbolic manipulation etc. While the inner workings of Mathematica may not be open source, the relevant algorithms are not proprietary but usually well-known mathematical results and as such, at least in principle, easily reproducible outside of the ecosystem.

jononor · on April 15, 2018

Open source brings with it a strong culture of publishing code openly. It seems that the number of public Jupyter notebooks is already higher than that of Mathematica or similar, despite those tools having a 20 year lead?

hpcjoe · on April 15, 2018

Responders have been making the argument that Jupyter is beating Mathematica because of financial or social issues. I'd like to posit a different interpretation, which could be construed to encapsulate these reasons, as well as additional other factors.

Jupyter has a lower friction to adoption and usage than Mathematica, for a definition of friction which encompasses ease of acquisition and sharing. I include economic considerations in the ease of acquisition and sharing. Lack of proprietary walled garden lock-in/lock-out factors in as well.

People are also likely considering the longer term scenario, whereby data, model, and information interchange has been hindered by proprietary formats (the "wall" in the walled garden) and lack of complete information on how to get information in and out. Which is what the OP was complaining about, as they were not able to easily construct a publication quality preprint/submission from one, but could do it easily from the other.

Some of these sources of friction are effectively "own-goals": you increase friction in such a way that something people need to do becomes effectively impossible. Or you hide it. Or you bar certain groups from using that functionality.

Then the question is balancing the longevity of the format, the proprietary value against alternatives. Increasingly, people are less interested in this friction for a number of critical systems.

I am looking at this from the perspective of someone who has a few 10's of MB of data/writings on 25-30 year old 3.5 inch and 5.25 inch floppies. These are in formats for which I may not have an ability to extract the data/information without some significant effort.

The formats that have survived well for me over the last 30 years have been either open, or readable/writeable with open tools. The closed ones, not so much luck with.

limeblack · on April 15, 2018

FYI there is a less complete open source implementation of Mathematica called mathics[1][2]. In fact it is also Python based, just like Jupyter (I don't think this is a coincidence).

[1]: http://mathics.org

[2]: http://mathics.net

ninguem2 · on April 15, 2018

He should look into Sage.

http://www.sagemath.org/

lopmotr · on April 15, 2018

Open source is only great when it exists. For finite element analysis, there are only two generally useful open source products and neither of them has a remotely modern or easy UI. For $10,000 or so, you can get a proprietary one that's fast to use and doesn't have you hitting a brick wall when you find there's some key feature it can't do.

UI is a major failure of open source - it can hardly ever achieve it, at least not well. Most of the popular open source programs have no UI at all.

forapurpose · on April 15, 2018

> Python libraries let me replicate everything I wanted to do with Mathematica: Matplotlib for graphics, SymPy for symbolic math, NumPy and SciPy for numerical calculations

Are the Python libraries precise enough for professional mathematicians? And do they deal with mathematical 'edge cases', a variety of inputs (formats, notations, etc.), etc.?

On one hand, I could say 'the author uses them therefore they must be sufficient'. On the other, I've seen plenty of cases where the professionals were not careful about the tools they use (e.g., spreadsheets running critical, large-scale financial operations).

williamstein · on April 15, 2018

SageMath (which I started in 2004) is in fact a Python library targeted at professional mathematicians (mainly research in pure math), and is much stronger than Mathematica in many areas, including number theory, algebraic combinatorics and algebraic dynamics. It is weaker than Mathematica in symbolic calculus.

bloaf · on April 15, 2018

People tend to underestimate the extent of Mathematica's libraries. Take for example process control [1]. Not only does Mathematica have a pretty thorough set of functions for solving process control problems, it works with both linear and non-linear systems, and can find symbolic solutions. When I search for Python equivalents, I find abandoned or incomplete-looking projects that are much more limited in scope (e.g. linear systems only) and trying to just provide some of Matlab's functionality (i.e. no symbolic analysis).

[1] https://reference.wolfram.com/language/guide/ControlSystems....

mlevental · on April 15, 2018

in particular I'm curious if sympy is really as good as Mathematica. I haven't used Mathematica since doing physics hw as an undergrad but its symbolic manipulation was amazing most of the time

ChrisRackauckas · on April 15, 2018

It's not, but for most people it may not matter. Mathematica seems to have a much larger set of integrals, differential equations, special functions, etc. that it can recognize. So as much as I dislike the language itself, I do keep Mathematica installed because in many cases SymPy cannot handle the transformation.

On the other hand, open source tools are catching up. Part of the SymPy project is SymEngine, which is a reconstruction in C++. SymEngine is not as feature-filled as SymPy yet, but it flies. It's much faster than SymPy, and also Mathematica. SymEngine.jl works very well since by using the symbolic types inside of Julia you get both the speed (via function specialization) and a lot of free features through generic code. For example, Julia code is generic over number types, so if you call inv on a matrix of SymEngine symbols the built-in Base version will compile a fast version for SymEngine symbols and use it. So you have all of Julia's Base available along with Julia packages (yeah, you can put them in a neural net if you wanted to). So SymEngine.jl is the only thing that works when I need that speed, but I do have to keep Mathematica installed for its special handling of specific equations.
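
To illustrate what "generic over number types" buys you, here is a rough Python analogue (not Julia, and without Julia's compile-time specialization). The function name `generic_inverse` is made up for this sketch. It only uses `+`, `-`, `*`, `/` and a comparison with zero, so the same code runs on floats and on exact `Fraction` values. For genuinely symbolic entries the `!= 0` pivot test would need a real zero test, which this sketch does not attempt.

```python
from fractions import Fraction


def generic_inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination.

    Works for any element type that supports +, -, *, / and comparison with 0.
    """
    n = len(matrix)
    # Augment to [A | I]; the plain ints 0 and 1 get promoted by the
    # element type's own arithmetic during elimination.
    aug = [
        list(row) + [1 if i == j else 0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ZeroDivisionError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        # Scale the pivot row so the pivot becomes 1.
        aug[col] = [entry / pivot_value for entry in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The right half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]


exact = generic_inverse([[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]])
approx = generic_inverse([[2.0, 1.0], [1.0, 1.0]])
print(exact)
print(approx)
```

The same function body serves both calls; only the element type changes.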

And I hate to say it but... the Mathematica notebook is really pretty. You put the exponents up, the fractions on top of each other...

mlevental · on April 15, 2018

>And I hate to say it but... the Mathematica notebook is really pretty. You put the exponents up, the fractions on top of each other...

yup. for quick and dirty type setting i preferred it to latex

fdej · on April 15, 2018

> On the other hand, open source tools are catching up.

SymPy is doing nicely, but it's decades behind Mathematica when it comes to symbolic computation.

By the way, if you want to invert a symbolic matrix, chances are that generic Gaussian elimination working with symbolic expressions isn't the best algorithm (it might even give wrong answers if zero testing is done incorrectly), and compiled code isn't going to change that. The state of the art in symbolic linear algebra uses specialized algorithms like evaluation-interpolation, modular computation, and all kinds of low level optimizations.
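
A toy sketch of the evaluation-interpolation idea, in plain Python with exact `Fraction` arithmetic. It assumes the matrix entries are univariate polynomials stored as coefficient lists (lowest degree first). All function names here are made up for the illustration, and production libraries use far more refined versions (modular arithmetic, better interpolation, multivariate support). The point is only that the determinant is obtained from purely numeric eliminations, where testing for zero is trivial, and is then reconstructed as a polynomial.

```python
from fractions import Fraction


def poly_eval(coeffs, x):
    """Evaluate [c0, c1, ...] (lowest degree first) at x using Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


def det_exact(matrix):
    """Determinant of a square matrix of Fractions via Gaussian elimination."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def poly_mul_linear(coeffs, root):
    """Multiply a polynomial by (x - root)."""
    out = [Fraction(0)] * (len(coeffs) + 1)
    for i, c in enumerate(coeffs):
        out[i + 1] += c
        out[i] -= root * c
    return out


def lagrange_interpolate(xs, ys):
    """Coefficients of the unique polynomial of degree < len(xs) through the points."""
    n = len(xs)
    result = [Fraction(0)] * n
    for i in range(n):
        basis = [Fraction(1)]
        denom = Fraction(1)
        for j in range(n):
            if j != i:
                basis = poly_mul_linear(basis, xs[j])
                denom *= xs[i] - xs[j]
        for k, c in enumerate(basis):
            result[k] += ys[i] * c / denom
    return result


def det_by_interpolation(poly_matrix):
    """Determinant of a matrix of polynomials, returned as a coefficient list."""
    # Each term of the determinant takes one entry per row, so the sum of
    # the per-row maximum degrees bounds the degree of the result.
    bound = sum(max(len(entry) - 1 for entry in row) for row in poly_matrix)
    xs = [Fraction(k) for k in range(bound + 1)]
    ys = [
        det_exact([[poly_eval(entry, x) for entry in row] for row in poly_matrix])
        for x in xs
    ]
    coeffs = lagrange_interpolate(xs, ys)
    while len(coeffs) > 1 and coeffs[-1] == 0:
        coeffs.pop()  # drop trailing zero coefficients
    return coeffs


# The matrix [[x, 1], [1, x]], whose determinant is x^2 - 1.
x_poly, one_poly = [0, 1], [1]
print(det_by_interpolation([[x_poly, one_poly], [one_poly, x_poly]]))
```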

ChrisRackauckas · on April 15, 2018

It was a bunch of small matrices so it worked out well, but that's good to know. For future reference, what library implements these symbolic linalg routines? When I was looking around I could only find LinBox (http://www.linalg.org/) but the barrier to entry was a little high for me to dig in.

fdej · on April 15, 2018

In Julia, there is Nemo.jl for high performance exact and symbolic linear algebra. At least that is the goal; it doesn't have the best algorithms yet in all cases, but it has some of them (and more are being developed). You can see some examples here: http://nemocas.org/benchmarks.html

ChrisRackauckas · on April 15, 2018

I really like that library for other reasons, but I couldn't find documentation for anything symbolic. Could you point to the right places? Feel free to follow up on other forums since this is getting pretty off topic.

fdej · on April 15, 2018

You have to work with multivariate polynomials (possibly over number fields). If you have symbolic expressions in another format, you would have to manually convert them to that representation first, possibly after introducing extra variables for non-algebraic constants (which may or may not be a good idea depending on the specific circumstances). Having that kind of rewriting done automatically would certainly be a good thing!

bloaf · on April 15, 2018

Re: Speed

Mathematica is undergoing a compiler re-vamp, which may change that.

currymj · on April 15, 2018

one pretty clearly inspired by Julia's approach, from what I can understand.

ChrisRackauckas · on April 15, 2018

Well I think Mathematica has an interesting spot here. If you consider their symbolic language as instead an AST for mathematical functions with tons of available transformations to perform simplifications, then that can allow for a very nice compiled output. Julia, Modelica, TensorFlow, Mathematica, etc. all seem to be heading in this same direction of codifying and then simplifying mathematical structures to receive better runtime code. There's a lot of convergent development going on here.

currymj · on April 16, 2018

https://www.youtube.com/watch?v=jxl6IvDvGHU

That's a video where they lay out how they want it to work, if you're interested and haven't seen it.

arca_vorago · on April 15, 2018

I have chosen the emacs org mode system over Jupyter, but I still like Jupyter regardless. The real tragedy is how dependent people have become on proprietary stacks like Mathematica.

hatmatrix · on April 15, 2018

org-mode + babel is excellent. But again the social aspect necessary for adoption is much less developed than Jupyter (the emacs community is very social, but small).

In that Jupyter files are just JSON files, I hope that it will be easier to switch between the two in the future. Like [1], [2], and [3].

[1] https://github.com/gregsexton/ob-ipython

[2] https://github.com/jkitchin/ox-ipynb

[3] https://github.com/millejoh/emacs-ipython-notebook
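
A minimal sketch of what "just JSON" means in practice, using only the Python standard library. It assumes the nbformat 4 layout (a top-level `cells` list whose items carry `cell_type` and `source`). The notebook content and the `to_org` helper are invented for illustration and are far cruder than the tools linked above: markdown is passed through untranslated and outputs are ignored.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 layout (illustrative content).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["Compute a sum."]},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["x = 1 + 1\n", "print(x)"],
        },
    ],
}


def cell_text(cell):
    """Return a cell's source as one string (it may be a string or a list of lines)."""
    source = cell["source"]
    return source if isinstance(source, str) else "".join(source)


def to_org(nb, language="python"):
    """Hypothetical helper: code cells become org-mode source blocks, the rest plain text."""
    parts = []
    for cell in nb["cells"]:
        text = cell_text(cell)
        if cell["cell_type"] == "code":
            parts.append(f"#+BEGIN_SRC {language}\n{text}\n#+END_SRC")
        else:
            parts.append(text)
    return "\n\n".join(parts)


# Round-trip through the on-disk representation, which is plain JSON text.
serialized = json.dumps(notebook, indent=1)
restored = json.loads(serialized)
org_text = to_org(restored)
print(org_text)
```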

mfe5003 · on April 15, 2018

I learned and was fluent with Mathematica early and learned Python later. I still run to Mathematica for doing symbolic analysis because there is basically no impediment between my ideas and the keyboard when I am solving that type of problem. I've moved all my numerical analysis to the scipy system since it is a more natural language for those types of problems.

ChrisRackauckas · on April 15, 2018

>Which reminds me. If you are a Julia enthusiast, how do you suppose the investors in this new language plan to make their big score?

This is a weird jab at Julia. Open source software is woefully underfunded. Julia Computing was founded in the wake of Heartbleed where people learned that open source needs some kind of funding to keep developers alive (example article: https://arstechnica.com/information-technology/2014/04/tech-... ). Coming from academic backgrounds, the core contributors really had two options if they wanted to devote full time to Julia: either everyone gets an academic job while working on Julia instead of papers (lol), or band together to get R&D funding and use that to fund a life of open source development. They did the latter.

It's quite silly to even imply there's something nefarious that can go on here. Their main product is the language. They can't sabotage that without sabotaging themselves. They may have some priorities swayed, just like how any other individual who's working on open source is doing it for their own reasons. For example, IBM funded them to add PowerPC support, and what do you know, Julia works on PowerPC. Is that so awful? With this funding model, what ends up happening is you have a large group of people who dedicate their lives to developing open sourced code for automatic GPU compilation, machine learning libraries, etc. along with compiler support for optimizing scientific computing. Because of this (and other reasons), Julia ends up having a much stronger governance, which is one reason why its development ends up being more active. And this activity in turn makes its project more democratic than projects like CPython or Jupyter which have been larger projects for a longer time, but with fewer contributors (Julia's 686 vs CPython's 524 vs Jupyter's 330).

And most of the Julia contributors aren't even part of that company! Many are academics. A lot of the funding is through NumFOCUS, a non-profit which also helps projects like Jupyter, matplotlib, etc. which the author is for! (And they are great projects as well!)

So while I am happy that the author is pro open source, I think it's necessary to point out that this open source outsider view is both wrong and dangerous. Saying that you love the purity and despise anyone who gets to make a living from it is harmful! Open source is a labor of love, but it has also destroyed many careers. I think society has this view that open source (mathematical) projects are "funded" by academic careers, but even creators of popular projects like SageMath have publicly noted that open source is harmful to academic success (https://escapethetower.wordpress.com/2016/06/13/creator-of-s...).

Instead of being against funding open source contributors, I would like to see the author promote funding for open source. Paul Romer is a leading economist. He has the power to proclaim that open source matters for academic careers and push for it to be put on equal footing with papers for grant applications in his field. People like him should be advocating for jobs dedicated to open source development, not scoffing at the supposed impurity of someone being paid to develop a public good. Someone at the top of the academic hierarchy should start a change and make the development of public tools as valued as the development of (non-public) publications.

jhbadger · on April 15, 2018

Also the idea of companies making money off of open source isn't new or nefarious and often helps people who aren't even their customers. Red Hat is the classic example, and many organizations use CentOS, a distribution based off of Red Hat's distro, without paying Red Hat anything. And in scientific computing there's RStudio, which makes a great open source IDE for R besides offering products and services for sale.

ChrisRackauckas · on April 15, 2018

Wow, I asked for the data that led him to his conclusion about "the monopoly" and he blocked me. So much for open science...

https://twitter.com/paulmromer/status/985529525491654657

improbable22 · on April 15, 2018

I just realised who this blog post is by, it's the Paul Romer. Apparently gravitas is important at the World Bank, because clearly he's run the tank dry.

rev · on April 15, 2018

Dan Toomey's Learning Jupyter is free today on https://www.packtpub.com/packt/offers/free-learning btw.

jonnycomputer · on April 15, 2018

R Studio Notebooks are pretty good too; I like that, by default, there is an interactive console connected to the same kernel in addition to the notebook. This allows me to use the console to interactively probe my data, or try out something, and then record a more finished product in the notebook itself. I think this can be done in Jupyter (http://jupyter-notebook.readthedocs.io/en/latest/examples/No...), but not out of the box.

heisenzombie · on April 15, 2018

In JupyterLab (http://jupyterlab.readthedocs.io/en/stable/), this is built-in. Just right click a notebook and choose "New Console for Notebook".

JupyterLab is in beta and is intended to replace the current Jupyter front-end.

jonnycomputer · on April 23, 2018

I tried JupyterLab out. Definitely an upgrade.

jonnycomputer · on April 15, 2018

cool!

wodenokoto · on April 15, 2018

Since the author is talking about using Jupyter for research papers, how do you do basic things, like bibliographies, naming tables and referencing them later?

I have seen tables of contents, but those have been generated by a big block of javascript.

llamaz · on April 15, 2018

You can export to LaTeX and hack the "template" (in the terminology of Jupyter) that defines how this conversion occurs.

I used this as a starting point and modified it for my own purposes: http://blog.juliusschulz.de/blog/ultimate-ipython-notebook

EGreg · on April 15, 2018

It always goes like this.

The initial solutions may be proprietary, and financed by investors. They have a business model so of course they don’t give everything away for free.

With time, enough people get together to build an open source alternative. And then like a snowball it eclipses everything proprietary that went before it.

What would a world look like that didn’t apply Capitalism to ideas?

One where companies couldn’t sue one another for Intellectual Property infringements. Like Waymo suing Uber.

One where self driving cars can incorporate improvements made by any other self driving car instead of putting people at risk reinventing the wheel.

Where the long tail of drug research leads to something.

Why would people release their findings? Because if they don’t, others will. And then they won’t get that small measure of input and control and attach their name to it. Jonas Salk is an exception in the biomedical field. Albert Einstein is the norm in physics.

The alternatives to Capitalism do not have to be Socialism. They can be SCIENCE. OPEN SOURCE. WIKI.

Collaboration instead of Competition.

I would like to see the same in web browser engines. WebKit instead of IE. And so on. When that happens, we all win.

Yes free software is good. Software, like knowledge, does not have to be scarce.

TeMPOraL · on April 15, 2018

> It always goes like this.

> The initial solutions may be proprietary, and financed by investors. They have a business model so of course they don’t give everything away for free.

> With time, enough people get together to build an open source alternative. And then like a snowball it eclipses everything proprietary that went before it.

I wish it did. The most obvious counterexamples that come to mind are Microsoft Office and Adobe Photoshop. With maybe the recent exception of Krita for the latter case, there are no known open-source alternatives that don't suck hard compared to the proprietary applications I mentioned.

I'm not really sure why that happens, but open source doesn't seem to scale well when building large end-user applications.

EGreg · on April 15, 2018

Maybe not end-user applications. But everything underneath them.

I also think that open-source needs some sort of end-user facing consulting, the “last mile” as it were. A Wordpress expert or hosted service setting up your blog. A browser like Chrome that uses WebKit or Blink underneath.

But the underlying free platform beats the proprietary ones in the end. Let people build their little competing businesses on top of it if they want.

In the world I’m talking about (collaboration via contributions) everyone would get UBI and be able to not contribute anything.

zeth___ · on April 15, 2018

Jupyter isn't a foil for mathematica. It has completely different use cases.

The straight one to one fight is between sagemath, or their rebranded cocalc site, and mathematica. In terms of ability to do things, sagemath is the glue between all the amazing open source math and science software that has been written over the last 50 years.

https://cocalc.com/

But in terms of presentation by far the best is org-mode with sage, julia and everything else tied in. One text file that can be emailed around, put under version control, and can speak two dozen computer languages with babel, and has latex support for both pdf and html output.

williamstein · on April 15, 2018

Thanks for mentioning SageMath and CoCalc (I founded both of these projects)! A minor clarification is that CoCalc is not a rebranding of SageMath, but is instead a new web application whose goal is to make it very easy to collaboratively use Sage, Jupyter, LaTeX, Julia, etc. In contrast, Sage is a more traditional open source software package, which people install on their own computers. The goal of Sage is to be a viable open source alternative to the core Mathematica computer algebra system (and also to Magma, etc.), whereas the goal of CoCalc is to make all technical open source software very, very easily accessible, mainly to students.

zeth___ · on April 15, 2018

Have you had a look at org-mode babel?

It's extremely interesting how they have managed to create an all plain text environment that is in some ways[0] better than the current notebooks. Of course it's not as user friendly and needs quite a bit of polish, and a major refactor to separate the above use case from the tangle use case.

The tangle part of org is even more interesting as it's probably the first fully literate system that allows you to completely document everything about a software project. And since it's language agnostic I do mean everything from install, compile, connect to the external databases, export the documentation in every format under the sun and whatever else you can think of (including docx through pandoc).

I'm going through the source code slowly to get it streamlined, but the code is suffering from the typical lisp syndrome of "Everything is implemented everywhere."

[0] Being able to run multiple languages on the same org file, controlling environments for scripting languages, compiling on page for non-scripting languages.

williamstein · on April 15, 2018

Yes, org-mode is very inspiring. But, as you say, it's not quite user friendly enough for some target audiences (e.g., beginning students).

ChrisRackauckas · on April 15, 2018

First of all, thank you for all of your hard work. One thing I am curious about though is how you think you can match the "cohesiveness" of Mathematica. While I am not a big fan of Mathematica the language, there is something gained by a top down approach. The options all seem to have the same names, the display tools tend to compose with new functionality well, the symbolic parts integrate well with the numerical tooling. While the poly-sourcing approach of Sage can definitely excel in the features area (and definitely surpass it in more "esoteric" areas of algebra), I do think that the tools end up with a higher barrier to entry since it's much harder to police syntactic and dependency inconsistencies. I am trying to work out some of these problems of my own and was wondering how you're coping with it.

williamstein · on April 15, 2018

> how you think you can match the "cohesiveness" of Mathematica

The basic strategy is to provide a new Python library ("the sage library" [1]), which is a layer between the user and the other dependencies. E.g., instead of just saying "use Maxima for symbolic limits", we have our own notion of symbolic functions (implemented using Ginac), and automatically convert them to and from Maxima when doing symbolic limits. The user never has to know anything about Maxima. When we first implemented this, it just used interprocess communication under the hood, but (thanks to Nils Bruin and others) we now use a direct C library interface built on top of ECL (Embeddable Common Lisp). This "new Python library", which isn't so new, since I started it in 2004, is now nearly a million lines of code.

[1] https://github.com/sagemath/sage/tree/master/src/sage