
Stencila – Spreadsheet-like live reactive programming environment - anu_gupta
https://stenci.la/stencila/blog/introducing-sheets/spreadsheets-are-dead-long-live-reactive-programming-environments-
======
nokome
Hey, Stencila developer here. Thanks to the original poster for sharing the
link and for all the interest. Unfortunately, the site is not handling all
that interest too well (R session hosting instances filling up, timeouts,...).
I'm working on it but please bear with me while I try to stabilize things.

~~~
Dwolb
Hey great work. I like that people are re-thinking spreadsheets as users get
more technically inclined to use more sophisticated tools.

What do you think about fixing R notation to not describe cells, but the named
array? i.e. instead of Average(A2:B2) it's Average(this_array) where
this_array is the name for the row/column data and when the Average cell is
clicked, highlight cells A2:B2?

There are many edge cases around detecting the proper array name or the user
not inputting column or row names, but the ability to quickly read and
understand the code as it relates to the sheet may make certain trade-offs
more acceptable (i.e. forcing the user to enter array names).
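
A minimal sketch of that idea in Python, assuming a hypothetical `Sheet` class (none of these names come from Stencila): the formula refers to `this_array` by name, and the sheet can still report which cells to highlight.

```python
from statistics import mean


class Sheet:
    """Toy sheet: cell values keyed by id, plus names that map to cell ranges."""

    def __init__(self):
        self.cells = {}   # e.g. {"A2": 10.0}
        self.names = {}   # e.g. {"this_array": ["A2", "B2"]}

    def name_range(self, name, cell_ids):
        self.names[name] = list(cell_ids)

    def values(self, name):
        return [self.cells[c] for c in self.names[name]]

    def highlight(self, name):
        # Cells a UI would highlight when a formula using `name` is clicked.
        return self.names[name]


sheet = Sheet()
sheet.cells.update({"A2": 10.0, "B2": 20.0})
sheet.name_range("this_array", ["A2", "B2"])

# Equivalent of Average(this_array) instead of Average(A2:B2).
average = mean(sheet.values("this_array"))
```

The trade-off mentioned above shows up in `name_range`: someone (the user or a heuristic) has to supply the name before the formula can use it.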

~~~
nokome
Thanks for the feedback.

Your suggestion about named arrays is important and relates to the discussion
here around the approach taken in the OSX Numbers app and the now dead Lotus
Improv. It's connected to a planned feature in Stencila sheets that I call cell
"mapping" - effectively projecting an R/Python object onto grid cells. There
is some more about that here:
[https://github.com/stencila/stencila/issues/118](https://github.com/stencila/stencila/issues/118).

Also related is a proposed improvement in how sheet expressions are
translated into R code for arrays:
[https://github.com/stencila/stencila/issues/157](https://github.com/stencila/stencila/issues/157)

------
mattbowen
I've been shifting a bunch of data analysis out of spreadsheets (and some
adhoc SQL) to Jupyter/Pandas, and we've found some unexpected tradeoffs on
both sides.

The lack of testability and version control (and really, long-term
maintainability) is what drove us out of spreadsheets into Jupyter. We've
found though that the workflow in Jupyter, even with Pandas, is not great for
exploring and getting a feel for data --- we end up missing the very quick
"poking around" you can do in excel or in a good sql client.

I'd have to use Stencila more to know if it strikes the right balance for the
kind of analysis work I do, but I'm glad to see such a thoughtful attempt to
try a new balance.

~~~
Mikeb85
Try R and RStudio. It excels at 'poking around' data.

In addition, you can show any data frame as a spreadsheet of data,
import/export CSV files, etc... Knitr, Shiny, Plotly and other technologies
also make producing documents, graphs and whatnot super easy.

This Stencila also looks cool though. Spreadsheet + R....

------
AndyMcConachie
Lotus 1-2-3 got people to use computers. One could argue that spreadsheets are
the single biggest business innovation of the late 20th century. I don't think
they're going away that easily.

I know people who type whole letters in Excel. I know an accountant in
particular who I showed how to use MS Word for writing letters, but he prefers
Excel. He writes full documents in Excel, prints them out and mails them.

Laugh if you want but spreadsheets are not going away in my lifetime.

~~~
koz1000
Sooner or later, all project management ends up in Excel. Doesn't matter what
ticket or tracking tool you've paid for.

~~~
grahamburger
My experience has been that it always ends up in Google Docs. Collaboration in
Docs is just so stupidly easy compared to emailing spreadsheets around. And
yeah I've been through several cycles of this:

    
    
        start tracking in google docs, we'll get a system eventually. 
        -> OK we have bought a new PM system, let's start using it. 
        -> Everyone please start using the new system. Move your google docs there today. 
        -> I ... why is this still in Google Docs?
        -> Ok we've paid a bunch more money to get all of the requested features in our new system. Please please please start using it
        -> Sigh. Just share me on the Google Doc.

~~~
jschwartzi
Purpose-built software systems allow you to ask a lot of important questions,
but they don't allow you to ask all of the important questions. It's useful to
have a tool that you can fall back on that gives you a basic representation of
the data that you can manipulate with simple tools. For most people that will
be a spreadsheet. You can also do it with *nix tools but they're really
unapproachable for the uninitiated.

------
giardini
I think the more usual form of that phrase is "Spreadsheets are dead, long
live spreadsheets!". And that captures the truth of the situation better.

Despite all the known weaknesses of spreadsheets, it is sheer hubris to
believe that they will be supplanted by reactive programming (or databases, or
anything else for that matter). It seems that users will give up their
spreadsheets only when they are pried from their cold, dead hands.

European Spreadsheet Risks Interest Group has some great links to articles
about the pros/cons/whattodo of spreadsheets:
[http://www.eusprig.org/](http://www.eusprig.org/)

Jocelyn Ireson-Paine has done lots of work with spreadsheets and their
problems. Lots of links to spreadsheet sites from there:
[http://www.j-paine.org/](http://www.j-paine.org/)

~~~
blackbagboys
Sorry, I can't resist indulging in a moment of pedantry here.

The phrase "The king is dead, long live the king!", which this title and many
others play off of, is referring to two separate individuals, the now-dead
king in the first clause and his successor in the second. The phrase is
representative of a theory of monarchy in which, upon the death of the
sovereign, their authority passes instantaneously to their heir, without the
need for a coronation or a formal investiture, in order to foreclose on the
possibility of pretenders to the throne arising during an interregnum.

So the correct (and I know you said 'usual' and not 'correct', but bear with
me) form of the phrase would be 'spreadsheets are dead, long live [whatever is
succeeding spreadsheets]'. Of course, historically speaking, it was not at all
uncommon for dead royals to suddenly reappear and reassert their claim to the
throne....

~~~
conceit
Spreadsheet software is improving here and there. Excel today is not your old
lotus-1-2-3. Although you'd be hard pressed to notice.

~~~
discreteevent
Indeed. After all the phrase is not: "The king is dead. Long live the
republic"

------
askyourmother
Spreadsheets can be a pain, but be careful what you ask for to replace them...

Having contracted at two big american banks ( _cough jp, cough boa_ ) that
decided to build their own proprietary technology sinking ship solutions (
_cough athena, quartz_ ), it is a horrible experience.

The business get sold on the idea of the "benefits" of this new "solution" so
they pay lots for it. Lots and lots. Then all tech projects are guided towards
it, otherwise it becomes a "political" problem if you don't use it, even if
you can present a technical case.

Basically, it works, just about, if you are building the equivalent of a
simple option spreadsheet pricer. Even then, the effort required should ring
alarm bells. Still, job for life for those that maintain it...

------
dnprock
I think notebook environments like iPython provide greater flexibility in
terms of programming. I don't think they can replace spreadsheets, nor that
spreadsheets will develop the functionality to replace notebook environments.

The missing link is a seamless experience between spreadsheet and reproducible
data analysis (programming).

My team is working on [https://nxsheet.com](https://nxsheet.com). We look to
provide this seamless experience. For example: Generate normal distribution -
[https://nxsheet.com/sheets/56e845da4030182e337c6c2b](https://nxsheet.com/sheets/56e845da4030182e337c6c2b)

Stencila looks interesting. Great work!

~~~
nokome
Hey, Stencila developer here. Thanks, nxsheet looks really nice - great work
too!

I agree, somehow getting spreadsheets into the workflow of reproducible data
analysis is important. They are a tool that I for one had ignored (focussing
instead on reproducibility of text + code) - but there are too many people
that use them to ignore them!

------
shaftway
Iteration on spreadsheets is great, and there's opportunity for tons of
innovation in this space. But this suffers from the same problems that most
spreadsheets have. Errors like not averaging over the right set of data (which
you pointed out) don't come up because the code can't be diffed in git. They
come up because there's not enough clarity in whether a set of cells are
consistent or not. Making this text-based doesn't add clarity because
ultimately there will be too much text to read.

Here's a test for the most common kind of error. If I add data in A5 and B5,
will that data be represented in the chart? How about in the averages? Will I
be able to even see that?

Here's the pivot: Break your data into regions, where regions have repeated
elements. Something like this:

    
    
        A1 = 'Height
        B1 = 'Width
        C1 = 'Area
        A2:C2 = Region {
          C1 = =A1*B1
        }{
          A1 = 2
          B1 = 3
          A2 = 4
          B2 = 5
        }
        B3 = 'Average
        C3 = =sum(C2)/count(C2)
    

This could be used to generate:

    
    
        Height  Width   Area
        2       3        6
        4       5       20
                Average 13
    

The dataset associated with that sub-block can be clearly annotated to an
editing user as such, with simple tools for adding a row to that dataset, or
sorting it without affecting the rest of the sheet. Within the dataset,
there's no corruption of the formulas (the third row's C can't be different
than the second's), you've still got your diffing (probably better because
it's clear that data changed but formulas didn't) and it's extremely hard
to make the calculated average come out wrong.

Yeah, that exact representation isn't great. Maybe a "view" metaphor, so that
headers and footers can be attached to the dataset instead of floating outside
it. But once you've gone this direction, there's all sorts of amazing things
possible by sharing datasets, linking them, transforming them, etc.
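
A rough Python sketch of the region idea, assuming a hypothetical `Region` class invented here for illustration: every row shares the same column formulas, and the footer aggregates over whatever rows the region currently holds.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """A block of rows that all share the same column formulas."""
    headers: list
    formulas: dict                      # column name -> function of a row dict
    rows: list = field(default_factory=list)

    def add_row(self, **inputs):
        self.rows.append(inputs)

    def computed(self):
        # Apply the shared formulas to every row; no row can diverge.
        out = []
        for row in self.rows:
            full = dict(row)
            for col, fn in self.formulas.items():
                full[col] = fn(full)
            out.append(full)
        return out

    def column(self, name):
        return [r[name] for r in self.computed()]


region = Region(
    headers=["Height", "Width", "Area"],
    formulas={"Area": lambda r: r["Height"] * r["Width"]},
)
region.add_row(Height=2, Width=3)
region.add_row(Height=4, Width=5)

areas = region.column("Area")
average = sum(areas) / len(areas)   # footer: sum(C)/count(C) over the region
```

Because the footer reads from `region.column("Area")`, a row added later with `add_row` is included in the average automatically, which is the failure case the A5/B5 test above is probing.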

~~~
chillingeffect
The Numbers app on OSX does something quite like this. It's highly underrated,
but having a number of smaller, floating spread sheets in one big sandbox is
waaaaay better for many things than a single sheet. I think the concept should
be pushed even further.

~~~
Someone
The UI is clunkier than that of Numbers, but you can do something similar in
Excel with tables (blocks in a sheet that you somewhat enforce some structure
on. See [https://support.office.com/en-gb/article/Overview-of-
Excel-t...](https://support.office.com/en-gb/article/Overview-of-Excel-
tables-7ab0bb7d-3a9e-4b56-a3c9-6c94334e492c)).

And of course, Excel can flag cells that, according to its heuristics, seem to
use inconsistent formulas, such as using empty cells in a summation, or having
one cell in a column use a different formula.

------
michaelwww
This article reminded me of an interesting video by Chris Granger, co-author
of the Light Table code editor, titled "In Search of Tomorrow: What does
programming look like in 10 years?" He's designing a visual/code block hybrid
editor.

[https://www.youtube.com/watch?v=VZQoAKJPbh8](https://www.youtube.com/watch?v=VZQoAKJPbh8)

~~~
erichocean
Thanks for the link! I've been working on the same problem as well
(reinventing computing)—vastly different solution though.

------
gavinpc
Spreadsheets somehow hit a vein with the public, so that tells you something.
I've been kind of obsessed with the Tup build system, and in trying to distill
what it does into one sentence, I find it helpful to say, It's like a
spreadsheet for files, where the formulas are shell commands. Even though I
can grok FRP on its own, I still find that analogy helpful.

------
migueldeicaza
I am personally a fan of Calca, not only does it do live coding, it can also
do some powerful math:

[http://calca.io/](http://calca.io/)

~~~
mgoszcz2
I agree, and variable names (with spaces!) are much more descriptive than cell
ids. I do wish for an open source version though.

------
tryitnow
Exciting stuff. I'm a financial analyst and I spend most of my time in Excel
because I'm not about to send an R script to our executive team. Everybody
understands spreadsheets.

But spreadsheets are terrible at everything except their original use case
(small bore financial modelling).

Right now I'm transitioning a lot of my workflow to MSFT's Power BI tools, but
I'd love to have a justification for using more R at work. Stencila could be
that.

------
steveeq1
Has anyone here ever used Lotus Improv? Does this program solve some of the
flaws that this article highlights? I keep on hearing how Improv was one of
the great programs that never took off, and I'm curious to now try it.

For those of you that don't know (or weren't even born yet), Lotus Improv was
a spreadsheet alternative that was released for the NeXT computer in the early
'90s that was well-reviewed but never sold well. It was eventually abandoned
when IBM bought Lotus in the mid-90's.

~~~
nokome
Hi, article author here. I have never used Improv but while researching for
this work I did read about it and yes, it did seem to have introduced several
innovations to address flaws in the spreadsheet approach. The OSX Numbers app
seems to have picked up some of these ideas. And, I'm hoping to apply some of
them to Stencila Sheets too.

------
debacle
This looks very sexy, and I don't say that often about technology. This is
something I've wanted to see for a long time - the ability to seamlessly move
from a spreadsheet to an application

Edit: Quick feedback

Cell names appear to be case sensitive. While this makes sense from a
programming standpoint, if I can't have a cell named "a1" the application
should convert that to "A1" for me.

Not being able to tab between cells is annoying. Was there a decision made not
to capture the tab event?

~~~
nokome
@debacle, thanks for the nice feedback.

Regarding cell names. Currently every cell has an "id" e.g. A1 which is
represented in the background R session as the variable A1. If you enter a
formula like "pi = 3.14" into cell A1 then that cell has an id of A1 as well
as a "name" (like an alias) of "pi". You can name a cell "a1" e.g. enter "a1 =
42" into cell A1 - but that could be confusing.
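
To illustrate the id/name distinction, here is a small Python sketch. It assumes a simplified `name = expression` syntax and does not reflect how Stencila actually parses cells; `parse_cell` and `build_symbols` are hypothetical helpers.

```python
import re

# "name = expression", but not "a == b" (the lookahead rejects a second "=").
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=(?!=)\s*(.+)$")


def parse_cell(cell_id, source):
    """Split 'name = expression' into (id, name, expression); name may be None."""
    match = ASSIGNMENT.match(source)
    if match:
        return cell_id, match.group(1), match.group(2).strip()
    return cell_id, None, source.strip()


def build_symbols(cells):
    """Map every cell id, and every cell name, to that cell's expression."""
    symbols = {}
    for cell_id, source in cells.items():
        _, name, expression = parse_cell(cell_id, source)
        symbols[cell_id] = expression
        if name is not None:
            symbols[name] = expression   # the name acts as an alias for the id
    return symbols


symbols = build_symbols({"A1": "pi = 3.14", "B1": "2 * pi"})
```

After this, both `symbols["A1"]` and `symbols["pi"]` hold the same expression, which is why entering "a1 = 42" into A1 would give two differently-cased handles on one cell.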

Yes, not being able to tab between cells is annoying - no decision was made on
that - we just haven't got round to it. Just created an issue:
[https://github.com/stencila/stencila/issues/162](https://github.com/stencila/stencila/issues/162).
Thanks for the prod!

~~~
debacle
It might be good user experience to either make variable names case-
insensitive (could be done any time before R), or default a1/A1 as the same
pointer.

You'll really want to appeal to less technical folk, and they have trouble
with case-sensitive things.
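
One possible shape for the "a1/A1 as the same pointer" option, sketched in Python under the assumption that cell ids are letters followed by digits (`normalise` is a made-up helper):

```python
import re

# Assumed shape of a cell id: one or more letters followed by one or more digits.
CELL_ID = re.compile(r"^[A-Za-z]+[0-9]+$")


def normalise(identifier):
    """Upper-case anything shaped like a cell id; leave other names alone."""
    return identifier.upper() if CELL_ID.match(identifier) else identifier
```

With this, `normalise("a1")` gives `"A1"` while `normalise("pi")` stays `"pi"`. The cost is that a user-chosen name like `x2` also gets upper-cased, since it looks like a cell id.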

------
jmj42
It seems, perhaps, Resolver One's time has finally come. Alas, Resolver One
(and Resolver Systems) died a slow death and ceased to be in 2012. Perhaps
this will fare better.

I always thought Resolver was a brilliant idea, but they never gained any
traction against the heavy weights (Excel).

Edit- Add link to Resolver One wikipedia article
[https://en.wikipedia.org/wiki/Resolver_One](https://en.wikipedia.org/wiki/Resolver_One)

~~~
nokome
Thanks for the link. I was aware of Pyspread (which sounds similar) but not of
Resolver One - I'll add a link in the article

------
digi_owl
I think another change is needed for spreadsheets to be a more reliable tool.

Right now a sheet is a full document thing. Meaning that it starts with A1 in
top left, and spreads out from there.

but what if you could break a "sheet" down into units?

So that you can have say a constants unit, with its own A1 to whatever as
needed, and then another unit, perhaps called results, that hold the formula
in their own A1 to whatever, each unit movable and extendable on the screen at
the same time.

Now if formula A1 wants to refer to something in constants A1, the entry would
look something like constants(A1).

This way one would have everything on screen, while avoiding the worry that as
the sheet grows and one move things around to make it more readable, the
formulas are referencing the wrong cells.
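
A small Python sketch of unit-scoped references, assuming the `constants(A1)` syntax proposed above (the `units` layout and `resolve` helper are invented for illustration):

```python
import re

# Hypothetical reference syntax: "unit(CELL)", e.g. "constants(A1)".
UNIT_REF = re.compile(r"^([A-Za-z_]\w*)\(([A-Z]+[0-9]+)\)$")

# Each unit has its own independent grid, starting at its own A1.
units = {
    "constants": {"A1": 0.5},
    "results": {"A1": 10.0},
}


def resolve(reference, current_unit):
    """Look up 'constants(A1)' in another unit, or plain 'A1' in the current one."""
    match = UNIT_REF.match(reference)
    if match:
        unit, cell = match.groups()
    else:
        unit, cell = current_unit, reference
    return units[unit][cell]


# A formula in cell A2 of the "results" unit, scaling its own A1 by a constant.
units["results"]["A2"] = resolve("A1", "results") * resolve("constants(A1)", "results")
```

Since references are keyed by unit name rather than by position on screen, moving a unit around would not change what `constants(A1)` points to.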

~~~
zyxley
This is literally exactly what Numbers.app on OS X does.

~~~
digi_owl
And it's stuck to OSX, yay...

~~~
mcphage
There's iWork for iCloud, but I'm not sure what the deal with it is if you're
not a Mac user.

------
kornish
One project I've recently been wanting to play around with is Beaker, which touts
itself as a polyglot data science notebook tool. The pitch is that you can use
the right tool for the job, no matter what questions you want to ask next to
each other. It looks like an iPython notebook on steroids.

[http://beakernotebook.com/](http://beakernotebook.com/)
[https://github.com/twosigma/beaker-
notebook](https://github.com/twosigma/beaker-notebook)

Disclaimer (or not): no involvement, but looks very handy.

------
akhilcacharya
This looks very similar to AlphaSheets and the thing I was making over spring
break until my friends called it stupid.

Very interesting!

~~~
meesterdude
> the thing I was making over spring break until my friends called it stupid.

Those sound less like friends, and more like people holding you back from
curiosity and creativity, which can bring many direct and indirect fruits.

~~~
akhilcacharya
Eh, they'd rather have me work on shady cryptocurrency projects instead.

------
davecap1
Off-topic: how do people make those animated gifs like the one in the blog
post ([https://stenci.la/stencila/blog/introducing-
sheets/screencas...](https://stenci.la/stencila/blog/introducing-
sheets/screencast.gif))?

~~~
nokome
Hey, author here. I used byzanz for this one:
[https://www.maketecheasier.com/record-screen-as-animated-
gif...](https://www.maketecheasier.com/record-screen-as-animated-gif-ubuntu/)
. For snazzier stuff, Screenflow is good.

------
ebiester
This kind of makes me want to brush off POI and attach it to Scala to develop
a git-able way to create spreadsheets.

Create your spreadsheet, import it to an IDE, and create a way to
intelligently create reports that can be connected to business intelligence
suites... Someone must have done this.

------
samfisher83
Can you make this in your spreadsheet software:

[http://www.quertime.com/article/arn-2012-08-22-1-25-drawings...](http://www.quertime.com/article/arn-2012-08-22-1-25-drawings-
and-games-made-with-microsoft-excel/)

~~~
nokome
Hey, that's really cool, thanks for the link. Right now you can't do
that...but give us a few weeks. We are planning to implement conditional cell
formatting using CSS i.e. cell CSS styles as a function of R expressions
[https://github.com/stencila/stencila/issues/97](https://github.com/stencila/stencila/issues/97).
Once that is done, it would be fun to see if we could create these types of
pretty picture.

------
meesterdude
very well done! a plaintext spreadsheet. I think it has a lot of appeal, and
for all the reasons outlined. Looking at it raw, it's fairly understandable.
Awesome! Coolest thing i've seen all month on HN.

------
sandGorgon
hi - quick question. Is your Dockerfile open source ? We have been building an
R powered backend for one of our analytics services and it's been a struggle
getting a scalable R-based infrastructure going.

BTW - another killer usecase would be an easy way to build database backed
spreadsheets. instead of the TSV, it would be cool to figure out how to
interface to postgresql for example.

~~~
nokome
Hey, yeah it's all open. The Dockerfile is here:
[https://github.com/stencila/stencila/blob/master/docker/ubun...](https://github.com/stencila/stencila/blob/master/docker/ubuntu-14.04-r-3.2/Dockerfile)
I'm no expert, but it seems to be doing the job. Would appreciate any
suggestions you might have! I'd like to make it smaller i.e. use a smaller
base image.

~~~
sandGorgon
just one comment - your entire js is basically building a reactive workflow
through vanilla js. it is a very commendable piece of work!

however, I'm wondering if using Reactjs+Redux would not reduce the amount of
js by many orders of magnitude. You would get all your reactive processing for
free.

EDIT: your CPP code is quite cool ! But same comment there, you could probably
use nodejs and npm packages to get this out of the box for free. for example -
git.cpp ->
[https://www.npmjs.com/package/git](https://www.npmjs.com/package/git),
frame.cpp ->
[https://www.npmjs.com/package/dataframe](https://www.npmjs.com/package/dataframe),
http-client ->
[https://github.com/mzabriskie/axios](https://github.com/mzabriskie/axios) \+
[https://github.com/petkaantonov/bluebird](https://github.com/petkaantonov/bluebird),
etc.

------
robbiemitchell
Do you think named cells and ranges in Excel / Google Sheets get at this
problem?

~~~
nokome
Yes, I think that they certainly help in making spreadsheets more readable. At
present Stencila sheets only has named cells but one of the things we want to
add soon is mapping of cells onto underlying `data.frames` or matrices. In
other words to "project" data objects onto the spreadsheet grid.
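
As a sketch of what "projecting" a table onto the grid could mean, here is a Python version that uses a list of dicts in place of an R `data.frame`. The `project` function and its parameters are assumptions, not the planned Stencila design.

```python
def column_letter(index):
    """Convert a 0-based column index to letters: 0 -> 'A', 25 -> 'Z', 26 -> 'AA'."""
    letters = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def project(columns, rows, top=1, left=0):
    """Map a table onto grid cells: header row first, then one row per record."""
    cells = {}
    for c, name in enumerate(columns):
        cells[f"{column_letter(left + c)}{top}"] = name
    for r, record in enumerate(rows, start=1):
        for c, name in enumerate(columns):
            cells[f"{column_letter(left + c)}{top + r}"] = record[name]
    return cells


grid = project(
    ["Height", "Width"],
    [{"Height": 2, "Width": 3}, {"Height": 4, "Width": 5}],
)
```

Here `grid` maps `A1`/`B1` to the headers and `A2:B3` to the data, so the cells are a view of the object rather than the place where the data lives.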

------
known
Awesome.

------
zellyn
Misspelling Dan Bricklin's name within the first four words gives me low
expectations…

~~~
nokome
Hey, thanks for pointing that out. Fixed now. I actually got it wrong in all
three places I mentioned him :(

------
luso_brazilian
OT and just my opinion: there seems to be an unfortunate trend in the
promotion of new ideas where, instead of simply exposing the virtues of the
idea on its own, it first needs to knock down the (proven and world adopted)
predecessor as something unpolished, unplanned, untested and, in general, a
bad idea from the start.

Compare this trend, for instance, with the first Linux announcement by Linus
Torvalds [1] below.

It is an unfortunate trend because it ends up polarizing and creating
unnecessary division among the early adopters of the new idea and the current
userbase of the old one.

 _From: torvalds@klaava.Helsinki.FI (Linus Benedict Torvalds)_

 _Newsgroups: comp.os.minix_

 _Subject: What would you like to see most in minix?_

 _Summary: small poll for my new operating system_

 _Message-ID: <1991Aug25.205708.9541@klaava.Helsinki.FI>_

 _Date: 25 Aug 91 20:57:08 GMT_

 _Organization: University of Helsinki_

 _Hello everybody out there using minix –_

 _I’m doing a (free) operating system (just a hobby, won’t be big and
professional like gnu) for 386(486) AT clones. This has been brewing since
april, and is starting to get ready. I’d like any feedback on things people
like /dislike in minix, as my OS resembles it somewhat (same physical layout
of the file-system (due to practical reasons) among other things)._

 _I’ve currently ported bash(1.08) and gcc(1.40), and things seem to work.
This implies that I’ll get something practical within a few months, and I’d
like to know what features most people would want. Any suggestions are
welcome, but I won’t promise I’ll implement them :-)_

 _Linus (torvalds@kruuna.helsinki.fi)_

 _PS. Yes – it’s free of any minix code, and it has a multi-threaded fs. It is
NOT protable (uses 386 task switching etc), and it probably never will support
anything other than AT-harddisks, as that’s all I have :-(._

[1] [http://www.thelinuxdaily.com/2010/04/the-first-linux-
announc...](http://www.thelinuxdaily.com/2010/04/the-first-linux-announcement-
from-linus-torvalds/)

~~~
sevensor
The article itself is much more positive on spreadsheets than the headline --
it suggests improvements to the underlying data / computational model while
preserving the interface.

And it's about time somebody said something nice about spreadsheets. Go to
your average (non-software) engineer and ask to see an engineering model.
Chances are, it's in Excel. It's less buggy than you expect, because your
average engineer is going to notice and fix nonsensical results. Unlike the
scientists the author is talking about, your average engineer is looking at
very similar data from day to day and is attuned to anomalous behavior.

I think there's a lot of room for a tool like the one described by the article
that includes built-in validation and is git-friendly. However, given the
overwhelming dominance of Excel, it's going to have to integrate with, rather
than replace Excel.

~~~
nokome
Thanks for the comments. Actually, I was just working on importing Excel
spreadsheets into Stencila sheets last night. Some interesting aspects
involved in translating Excel formulas into R expressions e.g.
`AVERAGE(A1:A5)` to `mean(A1:A5)`. To do it properly looks like we'll need to
build an Excel formula parser. But do-able - and could be fun!
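
For a feel of the simplest possible version, here is a regex-based sketch in Python that only renames function calls. The mapping table is an assumption covering a handful of functions, and this is not a substitute for the real parser mentioned above.

```python
import re

# Assumed mapping for a small subset of functions; not an exhaustive or official list.
EXCEL_TO_R = {"AVERAGE": "mean", "SUM": "sum", "MAX": "max", "MIN": "min"}

# An upper-case function name immediately followed by "(".
FUNCTION_CALL = re.compile(r"\b([A-Z]+)\s*\(")


def translate(formula):
    """Rename known Excel functions; leave cell ranges such as A1:A5 untouched."""
    body = formula.lstrip("=")

    def rename(match):
        name = match.group(1)
        if name not in EXCEL_TO_R:
            raise ValueError(f"unsupported Excel function: {name}")
        return EXCEL_TO_R[name] + "("

    return FUNCTION_CALL.sub(rename, body)
```

`translate("=AVERAGE(A1:A5)")` gives `mean(A1:A5)`. String literals, lower-case function names, and functions whose arguments differ between Excel and R are all mishandled or rejected by this approach, which is the argument for a proper parser.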

~~~
sevensor
Interesting! I imagine there's a more-or-less reasonable subset of Excel
formulas you could start with. I look forward to seeing what comes of this!

------
dang
Since that title is baity, we changed it to something more neutral in
accordance with the HN guidelines:

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

If anybody suggests a better title, we can change it again.

------
wrong_variable
Spreadsheets are optimized for data entry. Try entering CSV data manually
using a text editor.

Without Excel there would be no way for my sales/marketing team to give me
valuable data to process at the backend.

If you have been doing programming in excel for the past 20 years then you are
simply a fool - sorry.

~~~
ebiester
You would be surprised how many people have done this. The average office
worker doesn't have a development environment on their computer, but they have
Excel, and they have a coworker who was a whiz who said, "Did you know you
could do this?"

They just kept automating task after task, silently, without the IT department
ever knowing. 90% of them never thought of it as coding, and 10% realized they
could make more money as an expert who knows how to coerce excel combined with
their domain knowledge than they could as a programmer.

~~~
shaftway
This.

I've been at top 20 banks in the world that run 14 hour jobs in Excel that
import data from web services, run analysis, and FTP result sets elsewhere.
They were used to analyze risk on tens of billions of dollars worth of
securities.

They were riddled with bugs, but it let the front-line traders iterate quickly
and prototype something that the backend team (my team) could turn into a
"real" reporting process.

~~~
nokome
Absolutely. I think it is the rapid prototyping aspect of spreadsheets that is
perhaps the most powerful. For analysing large amounts of data they are
obviously not great. But they excel (pardon the pun) at providing an
environment in which to quickly formulate and visualise numerical models -
because they are reactive so you can see what happens when you change inputs -
without having to recompile, or manually rerun code.
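
A toy Python sketch of that reactive behaviour, assuming a hypothetical `ReactiveSheet` class and no circular references (a real engine would track a dependency graph rather than loop to a fixed point):

```python
class ReactiveSheet:
    """Toy reactive sheet: formula cells recompute whenever an input changes."""

    def __init__(self):
        self.values = {}
        self.formulas = {}   # cell id -> (function, list of input cell ids)

    def set_value(self, cell, value):
        self.values[cell] = value
        self._recalculate()

    def set_formula(self, cell, fn, inputs):
        self.formulas[cell] = (fn, list(inputs))
        self._recalculate()

    def _recalculate(self):
        # Naive fixed-point loop: re-evaluate formulas until nothing changes.
        # Assumes there are no circular references between cells.
        changed = True
        while changed:
            changed = False
            for cell, (fn, inputs) in self.formulas.items():
                if all(i in self.values for i in inputs):
                    new = fn(*(self.values[i] for i in inputs))
                    if self.values.get(cell) != new:
                        self.values[cell] = new
                        changed = True


sheet = ReactiveSheet()
sheet.set_value("A1", 2)
sheet.set_value("B1", 3)
sheet.set_formula("C1", lambda a, b: a * b, ["A1", "B1"])
sheet.set_formula("D1", lambda c: c + 1, ["C1"])

# Changing an input updates every dependent cell without a manual rerun.
sheet.set_value("A1", 10)
```

After the last line, `C1` and `D1` have been recomputed from the new `A1`, with no explicit recompile or rerun step by the user.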

