
The Future of Excel - karamazov
https://datanitro.com/blog/2013/2/12/future/
======
d4nt
I think an important aspect of what makes Excel formulas (relatively) useable
to people who don't think they're programmers is their functional nature. You
don't have to simulate the program's state in your mind because there is no
state. You hit return and it shows you what the result is, and if your formula
gets too long you can break it up into multiple cells. Giving users a
procedural programming language is only going to help a small number of Excels
users - the ones who like thinking in a procedural way.

For me, the real pain point in Excel's formula language is in how the path of
least resistance leads you towards unmaintainable code. "SUM(A3:K754)" is not
very meaningful. If only named ranges were more discoverable and useable, we
might end up with more examples like "SUM(ProductSales2012)".

Shameless plug: I'm working on a visual, functional programming tool for
querying data that's aimed at people who currently only use Excel:
<http://querytreeapp.com>. I think the advance in browser technology is going
to bring a number of tools like this to market to challenge Excel's dominance
and hopefully improve to quality of "shadow IT" in business.

~~~
niggler
'"SUM(A3:K754)" is not very meaningful.'

That's not true. You are looking at that formula in the context of a sheet,
and excel takes great care to preserve the formula under meaningful
transformations. For example, if you insert a row between rows 100 and 101,
the formula will transform to "SUM(A3:K755)" without intervention.

~~~
pasbesoin
What happens when you insert a row at the beginning or end? That's a bit less
"certain", for most people, and can lead to incorrect sums.

Re the grandparent, Excel does have named ranges. Some it constructs on its
own based upon column headers and whatnot. Others can be explicitly defined.

Unfortunately (again), the tools/widget Excel provided -- back a decade or
more ago, when I lasted used it intensively -- were "out of the way" and
fairly limited in terms of their interface and behavior/functionality.

Still, if you have e.g. a column headed REVENUE and a column headed COST, you
can insert "= REVENUE - COST" into an adjacent column in the same row, and
Excel will know what you mean. And... in this case, it is less likely to
become "confused" (do what it's programmed to, versus "do what I mean" --
DWIM) if/when you move individual cells around.

P.S. What I do wish, or did, was that Microsoft would better document and
explain publicly some of Excel's underlying behavior and edge cases. It can
take some time, exposure, and mental effort to trick some of these out.

Excel is a powerful tool that does its job very well. A good part of the
problem is that with its ubiquitous distribution and use, much of the time
this is like handing a toddler a loaded handgun to play with.

The handgun is functioning just fine. Its deployment and use in this case is,
however, less than optimal.

~~~
niggler
"what happens when you insert a row at the beginning or end?"

Excel's documentation clearly explains the rules: "You can insert blank cells
above or to the left of the active cell on a worksheet. When you insert blank
cells, Excel shifts other cells in the same column down or cells in the same
row to the right to accommodate the new cells. Similarly, you can insert rows
above a selected row and columns to the left of a selected column. "

So lets say you are looking at the formula =SUM(A3:A5). If you click A3 and
hit insert, A3:A5 are shifted down to A4:A6 and the sum formula is =SUM(A4:A6)
If you click A5 and hit insert, A5 is shifted down to A6 and the new row is
inserted, so the formula does include the new cell and you see that:
=SUM(A3:A6).

The rule is fairly defininitve, and the BIFF8 Rk transformation is fairly
straightforward

~~~
pasbesoin
"for most people"

I've seen a lot of incorrect ranges, often the result of subsequent grid
manipulations.

With hopefully honest and sufficient humility, I submit that we are not "most
people" -- in this respect.

------
ollysb
It strikes me that excel appeals to people for exactly the same reasons that
lighttable appeals to developers. You put numbers and equations in and
instantly see the results. If you need to do something complicated you can
build it up, one step at a time and see how the numbers flow through the
system. You can even try different scenarios by just tweaking the numbers.

~~~
hammock
Yes, for me at least. (IAMA non-coder, Excel power user)

SQL is the same way- it suits my "power user" style by allowing me to type in
a piece of script, hit the checkbox to find errors (or run it in a test
environment), debug, rinse, repeat. Then I can stitch together all these
pieces into one giant operation and let it run.

I would love to learn Python but as OP says I also have a job to get done.
Every time I sit down with a Python tut, it seems (maybe just my perception)
that there's so much fundamentals, coding, etc before I see the output what
I'm actually doing. That there is a larger upfront investment than there is
when learning Excel or SQL.

~~~
nnq
...you know that there's a significant group of, uhm... "programmers", that
are really _afraid_ of SQL and do everything to wrap it in things like ORMs
and hope they'll never have to write it or think about it, right? As a
programmer, I only started to like and "understand" SQL after learning more
about the underlying relational algebra and seeing its beauty, but it was just
a stream of WTFs for me before that... and some kids that I recognize as
"thinking the way I do" can learn programming in Python starting from 0 in
less than week but look at SQL with a complete "wtf" face ...I guess different
people really think differently ...sometimes it's scary how differently

~~~
thirsteh
It's almost as if a query language is a good way to query a relational
database... Hmm.

~~~
nnq
indeed ...but: try explaining to someone who has just went through the effort
of picking up a general purpose programming language why does he need to learn
"a retarded query language that doesn't even have functions, loops and
assignments" :) ...of course, you can answer him "well... but actually it
does" and expose him mercilessly to the world of sql extensions before he even
learned basic sql.

~~~
wvenable
I don't think of SQL as programming -- programming is telling the computer
what to do, SQL is describing the data that you want and letting the computer
figure out how to get it to you. Regular expressions work the same way.

The worst SQL is done by people who treat it like a programming language.

I've taught SQL to non-developers and most pick it up pretty easily as long
they're taught the right way.

~~~
thirsteh
Small correction: The worst SQL is done by people who treat it like an
_imperative_ programming language.

SQL is a declarative programming language. It's probably much easier to learn
for people with experience in functional programming (where you also specify
what you want, not how to get it), and tools like Excel, than people who are
used to thinking about telling the computer what to do.

------
OliverM
You can already embed Python in Excel. There's a few different projects that
do this; here's one: <http://www.bnikolic.co.uk/expy/expy.html>

The article doesn't do any close human-factor analysis of why Excel is so much
more accessible than programming languages; it's the lower abstraction bar. No
need to name variables in functions when you can just click on the cell you
want. No need to create loops when you can drag an already entered formula
across a number of cells. Excel is a direct-manipulation programming
environment, not a language, and a language can't compete here.

But I wish them luck; they diagnose the problem reasonably well, if not the
solution.

------
spankalee
How does adding Python to the spreadsheet paradigm improve the fragility
problem at all? The major problem with spreadsheets is a dangerous mix of ad-
hoc model definition and data. An author can build ad-hoc structure, but has
no easy or default way to ensure that the structure is consistent or valid,
and it's incredibly to accidentally change both data and model with bad
keystrokes.

What really needs to replace spreadsheets is new ways to define models that
are as easy-to-use as ad-hoc spreadsheets. For instance:

    
    
      * A single table should only contain one set of data adhering to the same schema.
      * Formulas, formatting and validation should be defined per-column, not per-cell.
      * Typing and pasting into ranges should be disabled by default.
      * Auto-import from other sources into a table needs to be much easier.
    

Excel actually contains some of these features, but until it removes or
disables the ad-hoc nature it'll still be dangerous. Maybe it should add a
"safe mode" so that financial institutions can have an easy migration path
from such high risk.

I've spent a lot of time migrating financial apps away from spreadsheets. Some
of the errors I've seen could have caused many millions in unnecessary losses,
maybe did before I got there. It's insane, but this doesn't seem to fix it in
any meaningful way.

------
mckoss
_I'm one of the (10) original developers of Excel._

I would say people love spreadsheets (Excel just happened to be a particularly
well done implementation of a spreadsheet when the GUI became the most popular
desktop platform) because:

a) Easy visual data entry - everyone understands a table of data.

b) Non-procedural programming model.

c) Simple analysis tools (batteries included): charts and pivot tables.

There just has not been much imagination applied to improving on this model -
there are great gobs of ground for innovation for someone willing to take a
stab.

1) Make spreadsheets "functional" - i.e., allow a whole spreadsheet model to
be used as a function definition (I was amazed when I first learned
spreadsheets and found they lacked this capability).

2) Make it easier to sling large data sets around - and especially share and
work with dataset repositories.

3) Better hooks for developers to integrate spreadsheets with more advanced
functionality and features. There is too much locked up in the spreadsheet as
a monolithic application. I can publish my Python program easily - why can't I
do the same for an application developer in "spreadsheet language"?

There has been relatively little thinking about how to organize and manage
collections of spreadsheets and models. Most people still deal with one
spreadsheet at a time; it's its own little island - largely cut off from the
rest of the world. In the era of cloud infrastructure and hosted data, why
aren't there better systems in place to manage data and models in the
aggregate? What if you married the best of GitHub and Excel together, for
example?

~~~
gruseom
This comment is amazing. It deserves a lot more attention in this thread.
Thank you!

~~~
mckoss
Nice of you to say so...

------
aleyan
"Excel is like a user-friendly nuclear power plant. An amateur operator might
be able to keep it running most of the time, and most of his mistakes won't be
serious. But the wrong mistake can blow up your business."

Yeah, running huge Excel sheets is a risk. Running huge Excel sheets with VBA
is a risk and a pain. Powering Excel with python is a step in definitely a
step in the right direction and would save me personally a lot of pain and
boost my productivity right now.

Unfortunately it is a step I can not take. Excel is supported by Microsoft and
Excel Bloomberg plugin is supported by Bloomberg. DataNitro is supported by a
company founded last year that doesn't have single person's name in their
About Us or Contact Us pages. If I was to build a sheet depending on their
technology, would a wrong misstep two years down the line blow up my business?

~~~
cgio
We should not confuse procurement risks with usage risks. In general, the risk
with Excel, as indicated in all these articles, does not come from the vendor
but from the user. Your systemic risk is always bounded below by the highest
risk you are facing, which is user incompetence. Even with power users you
still get errors. The main negative differentiator for Excel is that these
errors are sometimes hard to identify.

------
bobx11
Many developers think they know the solution to a problem before understanding
the problem. They jump to the solution, so to speak. This is exactly what HN
has been doing about Excel the last few days... if you want to replace Excel,
spend a year building models in it, then you'll see how it's harder to replace
than teaching someone development.

~~~
hoodwink
Agreed. I'm a non-programmer, Excel power user for finance. The HN community
seems to think that Excel is the primary cause of all these financial
catastrophes and that some new software solution will prevent them going
forward. In reality, the problem is user error, missing checks and balances,
and oftentimes, complete lack of common sense.

An organization that doesn't have the discipline to incorporate some checks
into its influential financial models or doesn't have quants who can "back of
the envelope" gut check the model is going to make errors regardless of what
number crunching software it uses.

I could see a consequence of taking the calculations normally in Excel
formulas and making them even more blackbox so only the software engineers can
check the calculations to lead to MORE errors, not fewer because its less
accessible to the people who know the math.

I could go on and on about how mid-sized private equity shops like mine would
never adopt some proprietary solution over Excel because then they'd be too
dependent on the developer to make changes, but that's not the point.

~~~
gknoy
> the problem is user error, missing checks and balances, and oftentimes,
> complete lack of common sense.

You are somewhat right, but I don't think you get what we are saying. My
background: I was a developer tasked with integrating cost models that were
implemented completely in Excel. The problems we had were not because the
organization didn't have checks, but because Excel makes it Very Hard to do
such checks on a spreadsheet with sufficient complexity -- or perhaps makes it
Very Easy to make it hard to do that.

It's easy to say "The model sucked because they built it poorly", but honestly
that's disingenuous. The models sucked because the tool was pushed too far.
There were VB functions with it which were well-written - it was the mess of
calculation logic that was hard to deal with. Here are a laundry list of the
types of problems I encountered, and I'll go into more detail later on what
programmers want instead.

\- Debugging an error (bad value, NaN, etc) meant tracing the parent formulas
through several sheets, with no way to unit tests smaller calculations that
were used as part of larger ones. Large amounts of stuff were done with table
transforms, because Excel makes that Easy, but it was hard to understand what
it was doing because of the dependencies.

\- Moving input cells, or copy/pasting output tables, often meant that later
versions of the sheet had an ever-changing interface of where you write your
inputs and read your outputs (its API, essentially), which had to be
constantly updated, and then debugged.

\- Formulas that had any decision logic in them nearly never had humanly
readable names, but rather were cell addresses -- does
'intermediate_calcs'!C42 refer to the Frobulated Foo, or is it the un-
normalized Foo before we frobulate it?

\- Results tables had frequent copy and paste errors, where they referred to
the wrong sub-range of some more-detailed results page. These are really easy
to do wrong, because it's hard to tell by looking at the formulas what you're
referring to -- since regions and the like are quite often left un-named.

Why is this bad? What do programmers see as missing in Excel? For me, it's all
about testability and source control.

In contrast, developers are accustomed to building things by aggregating
smaller pieces of logic, each of which has a unique name, and code comments to
describe its purpose and intended use, and with well-defined interfaces for
communication between the pieces. We write tests which ensure that each little
piece works right, tests that verify that we are passing them the right data,
and tests that verify that the results we get match what we expect. (This sort
of testability is Very Hard to do with excel.)

The code we work in are stored in text files, which means that everything we
do is traceable in Version Control -- you can ensure that you have the correct
version of a file by more than its checksum or modified date, and can see the
logic changes which were added to fix a problem. Variables have names like
"foo_frobulated" and "foo_normalized", so it's much harder to accidentally use
the wrong variable for a calculation.

Many times formulas in Excel start off simple, and later become very complex
in order to account for exceptional data. Code does too, but it tends to be
more readable, so you can more easily notice when you make a mistake.

[edit: finished the paragraph that I prematurely posted.]

edit 2: You make a VERY good point about the accessibility to people who know
the math. The math's correctness is sometimes hard to validate when there's a
mistake in the inputs -- mistakes which can be hard to find. You're correct
that describing parts of it in code might make it more complicated, but I feel
that the ability to more easily sanity-check the inputs, outputs, and
intermediate calculations of The Maths will make for a more robust calculation
tool.

Programming isn't wizardry, any more than SQL or creating complex Excel tools
are. Excel presents data well, as complexity scales it's easy to introduce
human errors -- whether in inputs, calculations, or outputs. Software
development has mostly solved that through tests and source control, which is
why we rail so hard about Excel. It would be like watching someone struggle
with talking with someone remotely by spooling a Very Long wire, when you're
standing there holding a cellular phone.

------
sakopov
Excel is a pretty amazing piece of software. Previous place i worked at -- a
medium-size insurance company -- did all of their monthly expense calculations
in uber-complicated spreadsheets and then exported the results to AS/400. Part
of their issue was that training staff became an increasingly complex task
because of the structure of the data in the sheets some of which had to be
imported from a 3rd party source. I have never seen Excel used to this extent
until i was brought in to convert these sheets to a web app. We ended up
building a custom DSL based on mathematical expressions, an in-browser IDE for
accountants to configure formulas in the system and a full-blown custom report
builder. It was by far the most complex project I've ever designed and it had
an immense impact on how i look at advanced Excel users today.

------
michaelfeathers
It's interesting to see this discussion of the future of Excel and complaints
about its language. The fact of the matter is that the language is an evolved
artifact with an incredible history. Spreadsheets started with VisiCalc. Then
came Lotus 1-2-3, and Quattro. Finally, we have Excel and Numbers.

The language that Excel uses is odd because it was forged in the fire of
competition between vendors. Each of them had to support their predecessors'
idiosyncrasies to get adopters. No one was able to make a clean cut at
spreadsheet languages and make it stick. I know there have been some attempts.
Back in the days of the spreadsheet Cambrian explosion there were even some
oddities like 3D and nD spreadsheets. None of them could ditch "the language"
that we are left with today.

It's a shame. I think a good "ground up" redesign of the spreadsheet would be
incredibly useful if it was done and people bothered to use it.

The thing that makes it hard to revisit the spreadsheet is that it is "the
programming model" for non-programmers. Non-programmers just want to get their
work done and they bias toward familiarity.

------
bradgessler
The beginning is silly: "Why do people keep using Excel to build giant, error-
prone spreadsheets?"

This could just as easily be: "Why do people keep using programming languages
to build giant, error-prone applications?"

I can't even read the rest of it. Excel is so successful because it lowers the
bar for creating applications. They might not be pretty, and they may have
expensive bugs, but its like everything else in the world.

I'd bet the amount of value Excel has created in the world far outweighs the
consequences of any bugs in the worksheets.

------
nnq
There already is (and has been for some time) a "better Excell" that'a also a
"better Ms Access" at the same time: Quantrix (<http://www.quantrix.com/>), it
can replace brittle spreadsheets with much easier to manage documents and do
much more

...but there's a problem with it: it's just _too much_ for 99% of people, both
as _features_ (It's probably easier to learn a general purpose programming
language) and as _price_ :)

~~~
nnq
If someone would: (a) develop a web-based version of it, (b) make it easy to
use (hard part 1), (c) make excell (ab)users realize that they _really need
it_ (hard part 2 - maybe impossible? :) ), (d) make it extendable with
multiple general purpose programming languages (python, ruby, ocaml or f#...)
and (e) give a "personal/student version" for free to get people hooked on it,
it would _really shake things up_ in this area...

P.S. If you get filthy rich from my 5 point plan above, please remember to
give me and my company a life-long full-support license for your software :)

P.S.2. ...or open-source it and become the Linus of financial analysis
software ;)

------
PeterisP
Excel is excellent for simple formulas in a single spot. It sucks in two
points: complex/ugly formulas (anything that requires a switch/case statement
or nested IF's) and heavily repeated(copypasta) formulas.

Here's what I can imagine as a real improvement: Allow me to write a piece of
code that does the job of a function, but allows to touch multiple cells - and
integrate it into excel as deeply as 'normal' formulas.

This means the following things that current macros don't really do:

a) Show the "formula" in the formula bar when selecting a result cell (it will
usually be only a couple or dozen lines long anyway), and allow editing it in
the sheet, not in a separate editor;

b) When a result cell is selected, highlight the data source cells (as is done
for 'normal' formulas), and the other result cells;

c) When I change a source cell, recalculate all result cells automagically as
for normal formulas - this likely imposes a 'no-side-effect' restriction on
the code language.

d) For the "stdlib" use the same function names as current excel formulas, so
you don't have to learn a new set of them.

Simple, and very useful - could LibreOffice handle that?

~~~
mdda
Perhaps in the pop-up editor, the names of the parameters could also show
their current values on hover (where possible), so that one could debug the
function live, in a similar way to the sheet showing live numbers most of the
rest of the time (with show-me formulae).

Also, by doing functions like this, one could have some documentation too...

------
zjonsson
Matrix functions and named ranges are the most underused functionality of
Excel. Collectively they can replace a large amount of the conventional
spagetti VBA code, emphasizing Excel's true strength.

Simple example: CDO pricer using Gaussian Copula and base correlation
<https://dl.dropbox.com/u/10755342/120626_Simple_CDO.xlsx>

~~~
niggler
You mean array formula (the {=....} jazz). Matrix functions refer to functions
like MINVERSE

~~~
zjonsson
Not only that, regular algebraic operations also work with vectors/matrices as
inputs, as long as you press CTRL-SHIFT-ENTER after editing. This means that
most functions with complex MATCH and OFFSET can be replaced as well (i.e.
SUM( (column_vector = cvalue) * (row vector = vvalue) * (matrix))

------
politician
Does anyone remember Resolver One? The proprietary Excel-with-python clone? I
wish they'd release the source.

~~~
mwexler
Loved that tool. For those who didn't know, they EOLd it. Sigh.
<http://www.resolversystems.com/eol.php>

------
rlpb
A couple of problems with Excel [edit: actually just spreadsheets in general],
which I think fundamentally restrain it:

1\. Difficult change control. It is very difficult to diff spreadsheets. So
changes to a model are very hard to verify for correctness.

2\. Massive code duplication by design (eg. the fill function). So it is easy
to introduce an error by failing to update all relevant cells.

~~~
cgio
1\. The review functionality goes to a very granular monitoring of changes in
a spreadsheet. It is indispensable, especially when many people have access to
the file. It can be circumvented, of course, but we are talking about
unintentional errors. 2\. In Excel you get very evident visual indicators when
a formula is dissimilar to the formulas around it. Duplication is necessary at
the most ad-hoc level. When you use Tables, things are more safe.

The major issues with Excel are not of functionality but of user training on
the functionality available.

------
endlessvoid94
> The only way to prevent people from running important systems with giant
> Excel spreadsheets is to present a better alternative.

I think the spreadsheet can still be improved. Macros can help a lot with
this, but there are a myriad of other ways you could improve upon excel that
gives a smooth transition from "casual user" to "power user". Read "A Small
Matter of Programming"[0] for more!

[0] [http://www.amazon.com/Small-Matter-Programming-
Perspectives-...](http://www.amazon.com/Small-Matter-Programming-Perspectives-
Computing/dp/0262140535)

------
cdcox
This is one of the beauties of Matlab and similar languages. They look and act
like excel but more explicit and powerful. (and harder to use) Matlab
documentaiton is very similar to excel documentation and the ability to run
m-files with the touch of a button gives a lot of 'reactivity' that other
languages lack. Also a live, easily accessible and manipulable variable editor
makes it much easier to work with data.

I would really like a program that wrapped something like Matlab around excel.
Basically something that let you work in excel then let you hit a button to
'vomit' the variable space into Matlab code. I often find it much easier to
manipulate data in Excel and often find myself working in Excel until it
becomes too burdensome then swapping over and converting into Matlab or
Python. I think seeing a language 'live coded' and being able to alter the
live code would allow people to bridge the gap more easily. It would also fix
the debugging issues as each spreadsheet would also contain a executable code
that could be implanted into other spreadsheets easily, (it would also contain
a number of assumptions made explicit like what variables it takes in and what
it kicks out.) Of course how strings etc are handled is a messy question.

A similar tool exists for the open source scientific image altering toolbox-
ImageJ that allows it to 'record' your actions and easily loop/automate your
code by turning your calls into code as you perform them. I've found non-
programmers can easily learn how to write 'code-like' actions into this macro
language.

------
tjoff
One of the problems is that Excel is used for tasks that it isn't designed
for.

I'm currently plotting a lot of graphs in Excel (2007) and the experience is
just dead awful, wanting to quickly change parameters and ranges and sheets to
plot is a _nightmare_. So naturally, I went to find a good specialized tool
for this but haven't come up with any. Is Excel cannibalizing the market for
such tools despite it being so bad at it?

~~~
wycx
I found R to be the solution for me. I faced the exact same problem when I
wanted to quickly plot lots and lots of X-ray spectra. It was slow and tedious
to do in Excel because it involved repetitive plotting of similar data.

I put my data into Access, then accessed it using R via ODBC. The cost was the
time to validate my data and get it into a database. The benefit was that
subsetting and plotting the data became very straightforward, and the source
of error was the query and plotting script. Transposition, highlighting the
wrong cells, not pasting the correct data, etc were eliminated as sources of
error.

Plotting via a script is to plotting by hand in excel what using a computer is
to photocopying, cutting and pasting physical paper.

Repetitive plotting and storing data: tasks for which excel is not really
designed.

------
lightblade
Excel really is programming. It is classified as cell-oriented dataflow
programming. Functional Reactive Programming is a close cousin to such
paradigm.

~~~
Robin_Message
Classifying such languages is actually part of my PhD thesis, so I can
authoritatively :) say that while merely adding the concept of time gets you
to functional reactive programming from excel, they aren't really related
because FRP is so much more powerful.

------
nailer
This sounds like Resolver One, which was a Python based SpreadSheet (using
IronPython) from a few years ago. I wish these guys better luck.

------
spikels
I would love to see a good open source Excel clone. It does not need to have
all the unnecessary features of the recent versions. IMHO Excel 97, the
version that added VBA, was just about perfect.

Adding support for Linux, other "macro" languages and easy interfaces to
popular open source projects, which MS will never do, could potentially make
it the dominant spreadsheet.

~~~
recoiledsnake
Isn't this what you want? <http://www.openoffice.org/product/calc.html>

~~~
spikels
Perhaps but despite being nominally open source interference by Sun (and later
Oracle) have crippled the OpenOffice project. Back in 2010 most of the
developers left for LibreOffice. Currently it seems to have become an IBM
sponsored project after being spun out of Oracle to Apache. Not sure if this
history has made it impossible for OpenOffice to succeed.

However you are right OpenOffice does deserve another close look as does
LibreOffice. I wish there were alternatives with less baggage.

~~~
mdda
LibreOffice has Python 3+ now baked-in (i.e. part of the default install).

~~~
spikels
Thanks for the tip. I would love to have access to all the Python libraries in
a spreadsheet. In the past I have had to build custom C++ libraries for
Windows Excel to do complex calculations quickly. It was quite tedious and
involved lots of undocumented tricks.

------
mherdeg
There's a name for this product space! "Spreadsheet controls" describes the
product space inhabited by various products that aim to help businesses manage
the terrifying, complex Excel workbooks whose contents influence business
decisions in subtle, interrelated ways.

Microsoft recognized that this was an important hole in the way enterprises
used Excel, which is why they acquired a company in this space, Prodiance, in
2011. Office 2013 integrates some but not all of the company's products in its
"enterprise risk management" / "spreadsheet controls" functionality; see some
discussion at [http://blogs.office.com/b/microsoft-
excel/archive/2012/09/13...](http://blogs.office.com/b/microsoft-
excel/archive/2012/09/13/introducing-spreadsheet-controls-in-office-2013.aspx)
.

It was really interesting to read the prior blog post on this topic (
[http://baselinescenario.com/2013/02/09/the-importance-of-
exc...](http://baselinescenario.com/2013/02/09/the-importance-of-excel/) )
because it sounded so familiar! That blog post is spiritually identical to
about the first third of the Prodiance sales pitch.

The "controls" space has a surprising amount of depth. Just a few problems
that these folks think about:

(1) Excel spreadsheets are not a very good computation engine: business logic
can be subtly altered by a typo that includes one too few lines in a range or
a copy-paste error that accidentally includes a constant instead of a
calculated cell. It turns out to be possible to programmatically inspect a
workbook for "risky" or "possibly wrong" calculations and warn on their
presence.

(2) Chained dependencies can be really hard to track: some businesses use one
spreadsheet that refers to data in another spreadsheet that refers to data in
another spreadsheet… and you have to hit "refresh" in all of them if you want
your changes to propagate correctly.

(3) Data can be changed maliciously: by leaving some text white-on-white, by
using Very Hidden sheets, or by deliberately changing a formula, you can put
misleading information in a quarterly report to hide bad news!

(4) It's hard to find "one version of the truth": people constantly e-mail
around things like "quarterly report - new calculations.xlsx" or "quarterly
report - new calculations - REVISED USE THIS.xlsx" and we just have to hope
that business processes use the right one. In some kinds of businesses, people
_really_ want to see a complete audited list of "important spreadsheets" and
"all the versions of them" and "how they have changed and what specifically
has changed".

These are all problems that can be solved at other layers — for example, this
blog post advertises "move your business logic to Python with
DataNitro/IronSpread" as a solution to problem (1).

If DataNitro's plugin is going to help provide "enterprise risk management",
one really interesting thing to hear would be what their plan is for auditing.
How can people see what ancillary Python code is attached to a workbook and
how it's changed over time? That is, what assurances can someone get that the
generateQuarterlyReport() method is working as designed, and can someone get
an e-mail when it changes?

~~~
sliverstorm
_It's hard to find "one version of the truth"_

This is where network shares, Sharepoint, and so on come in handy.

Even more so if Excel supports revisions the way Word apparently does now.

------
redact207
I love excel. Users love excel. Love it so much to the extent that we actively
embed an excel sheet in our internal apps now instead of any third party "data
grids".

Why?

They're easy to work with, copy/paste across apps, easy to create interface
logic - even so much that the users can "code" up the sheet and we embed it
and bind the dataset to a database or service layer.

------
wtvanhest
I can't just install a random EXE file on my finance computer without knowing
who you are. Even if I felt like this was a great product, there is no way I
can risk my client's data to save some time.

I'm honestly not sure how you get over that hump, but you should think about
it, hard.

------
eterps
The STEPS system provides a spreadsheet that is a simple extension of the
system language: <http://www.vpri.org/pdf/tr2011004_steps11.pdf>

------
ExcelGuru
Great idea but not sure who is going to use this. I can't see enterprises
installing this on their desktops and as someone already pointed out the
larger enterprises have been using <http://www.clusterseven.com> or
<http://www.prodiance.com> for years to control their spreadsheets

------
rhokstar
Yes, I agree that Excel is a programming language. This is one of the ways I
learned programming also! Good way at looking at programming.

However, VBA is not the answer for most Excel users to move past because the
expectation is set by Microsoft; it requires nothing more than introductory
courses and common sense. This is where most people will use Excel.

------
BigBlueSaw
As has been mentioned elsewhere, Excel users are programmers, whether they
know it or not.

What they need is the discipline of software engineering, a completely
separate discipline from programming. If Excel spreadsheets had better
encapsulation, internal documentation, automated testing, and the like, the
world would be a better place.

------
dagw
Slightly off topic, but are there any good books or other sources for
experienced programmers who quickly needs to get up to speed on the more
powerful features of excel/VBA? I've just been asked at work to take a look at
a huge importan excel 'program' that's not quite working the way it should.

------
HCIdivision17
Excel VBA has libraries, they're just called references. It lacks an exception
library, but it does have a similar style of attempt-fail-compensate (though
you've got to put up with ugly labels). Personally I'd use IPython Notebook
over it, but that's not always the reasonable choice.

------
theaeolist
Can someone explain why Excel disallows cycles (recursion) in formulas?

~~~
CurtHagenlocher
This is configurable. You can enable recursion and set a max for the number of
iterations. You can implement relaxation-based differential equation solvers
that way :).

Edit: You can also solve Sudoku this way: [http://office.microsoft.com/en-
us/excel-help/microsoft-excel...](http://office.microsoft.com/en-us/excel-
help/microsoft-excel-blog-building-a-sudoku-solver-using-excel-iterative-
calculation-HA010293321.aspx)

------
elchief
So, 6 months until the TDE (Test-Driven Excel) book comes out?

------
smnrchrds
Why people don't use Microsoft Access when they need more complicated logic?

------
tbatterii
how does putting a "real" language on the backend of spreadsheets help fix
what happened in London? seems like it's just a bigger gun with which to shoot
yourself in the foot.

------
kros
VBA should be replaced by VB.NET or (Iron)Python.

~~~
karamazov
Why would you prefer those to Python? (And to every other language?)

~~~
tbatterii
a .net based language probably already has hooks into the office API's, that
would be one reason to use iron python vs. regular cpython.

~~~
vj44
The problem with ironpython is that it's not always compatible with all python
libraries (esp. those based on C extensions; ie. heavy, scientific libraries
similar to scipy); Using regular CPython immediately solves this problem.
Also, the official .Net office API (read: VSTO) is not targeted on newbie
programmers (imho).

~~~
tbatterii
and the problem with CPython is that is wouldn't be compatible with any of the
office api's that are likely based on .net technology. so which one do you
think would be better for office. I haven't done ms office automation since
office 2k7 but that is the way it was heading.

edit: oh I see what you are getting at. a new product from a new company
created at mit in 2012. <https://www.datanitro.com/>

I would argue that a $500 license is also not targeted at newbie programmers
either.

can you run native python from excel without shelling out $500 ?

~~~
vj44
That's exactly what DataNitro offers - the entire Excel api in (C)Python.

Oh, and it's free for non-commercial use (educational etc.)

~~~
tbatterii
so datanitro really is the future of excel as described in the blog post. good
to know.

I'm still curious as to how putting any "real" programming language behind
spreadsheets is supposed to fix the situation that happened last week. the
blog wasn't very convincing and it feels like it's just giving a poor
disciplined group of people a more powerful tool to screw things up with.
That's not a future I look forward to.

------
RaSoJo
no for mac??? :(

