
In defense of Excel - dangoldin
http://dangoldin.com/2013/09/20/in-defense-of-excel/
======
Pxtl
My problem with Excel is that it is not conducive to making well-documented
systems. The formulas are hidden and there's no way to comment them. The
ability to name your variables is clumsy and pushed away from the main UI so
most users are unaware of it and tend to use anonymous variables. The default
UI for creating formulas binds them to a fixed range of cells instead of the
entire column, which is absurd.

It's _intensely_ powerful (my boss can do spectacular things with pivot tables
and teaches all the business people to do this as a means to doing their own
analysis of our reports) but the high-power features have a positively nasty
UI compared to the corresponding SQL query.

I'd love to see an Excel designed-from-scratch that scratches the same itches
of approachability and user-friendliness but with an awareness of Excel's
failings and _without_ obsessing over the spreadsheet metaphor. Something in-
between Excel and Access.

~~~
dangoldin
Interesting idea. It's such a difficult usability problem that it's tough to
solve it for everyone. I had a job where I was responsible for getting
business folk the data they needed and kept on trying to come up with a way
for them to extract data from the database without having them write
queries. I got this nugget from the CTO: "To get the expressiveness of SQL you
will be writing SQL"

Just need to figure out where the tradeoffs are and I think Excel made a
pretty good one. I think Excel shouldn't be used for recurring jobs but for
the one off it can't be beat.

~~~
Pxtl
Heh, at one smaller group I worked in, we ended up just creating some query-only
SQL accounts for some select technically-inclined business folks and
installing Toad on their machines, and let them hit the production database.
It worked shockingly well, in spite of the 9000 reasons that this is a
terrible idea. For more complicated stuff, we often could handle their service
requests just by emailing out a query that they'd just save to the desktop.

It was obviously a really crude approach, but it worked... I've worked at a
lot of places with more mature and sophisticated approaches to ad-hoc
reporting requests but they were all way more of a headache in terms of IT
workload.

Of course, it didn't hurt that one of the VPs quickly became a SQL select-
statement guru within the exec team.

~~~
dangoldin
We had that going for a while too and it worked well. Someone with SQL
knowledge would write the query and then hand it off to the people with the
read only SQL accounts. The issue we ended up running into is that the schema
ended up evolving quite a bit so the queries became outdated and gave
incorrect results. Then we had to have a process to centralize the queries and
turn them into maintained views that could then be used as the basis for the
queries.
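
To make the "maintained views" idea concrete, here is a minimal sketch using
Python's built-in sqlite3 module. The table, column, and view names are
invented for illustration, not taken from the setup described above. The
point is only that a saved query keeps working after a schema change as long
as the view preserves the names it relies on.

```python
import sqlite3

# Hypothetical schema: the underlying table was renamed and now stores cents,
# but the view keeps the names and units the saved queries were written for.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_v2 (order_id INTEGER, customer TEXT, total_cents INTEGER);
    INSERT INTO orders_v2 VALUES (1, 'acme', 1250), (2, 'acme', 750), (3, 'initech', 2000);
    CREATE VIEW orders AS
        SELECT order_id, customer, total_cents / 100.0 AS total FROM orders_v2;
""")

# A query someone saved to their desktop before the schema changed.
SAVED_QUERY = "SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer"
totals = conn.execute(SAVED_QUERY).fetchall()
print(totals)
```

When the schema changes again, only the view definition needs updating, not
every query that was emailed out.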

------
nonchalance
Excel needs no defense: it is a rock-solid product which gives non-programmers
lots of power. People have tried for decades to build alternatives with little
success.

EDIT: My favorite example is
[http://carywalkin.ca/download-arena-xlsm/](http://carywalkin.ca/download-arena-xlsm/)

~~~
Someone
Also:
[http://www.geocities.jp/nchikada/pac/](http://www.geocities.jp/nchikada/pac/)
Pacman in Excel.

------
golergka
I don't think that Excel was ever in need of defending. It's one of the best
analytic tools for tasks of proper scope. It was also the first tool that we
had to learn and master in bioinformatics class before writing even a single
line of code.

~~~
cowls
Agreed, in my experience Excel is an excellent tool for beginners and advanced
users. I don't think it has too much to worry about from "tableau"...

~~~
dangoldin
Maybe I'm in the wrong circles but many of the "hardcore" quant folk will
belittle it. That's why I decided to write something up. I don't think Excel's
the right tool for every job but it's a lot more powerful than many people
give it credit for.

~~~
jasonpbecker
I belittle it. A lot. And rightfully so.

It's fine if you want to keep a small budget or a quick bit of work with
numbers, but it's awful for dealing with even modest amounts of data and here's
why:

Nothing you do is captured and everything is destructive.

Any time someone edits their spreadsheet they are irreparably altering the
data without real documentation. Excel encourages moving, shaping, dragging,
and futzing that may be easier, but ultimately can cause major problems
because there's no audit trail. Ever try and replicate something modestly
sophisticated that was done in Excel? Good freaking luck.

There's nothing wrong with Excel for some basic graphs and calculations. The
problem is when people start to use Excel as a database or as a serious tool
for analytics (like some of those business folks).
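
As a rough illustration of the contrast being drawn, here is a small Python
sketch in which the raw data is never modified and every transformation is a
named step that can be re-run. The sample data and function names are made
up; this is one way to get an audit trail, not the only one.

```python
import csv
import io

# Hypothetical raw export; in practice this would be a file that is never edited.
RAW_CSV = "region,sales\nnorth,100\nsouth,\nnorth,50\n"

def load(text):
    """Parse the raw CSV into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def drop_missing(rows):
    """Return a new list without rows that lack a sales figure."""
    return [r for r in rows if r["sales"] != ""]

def total_by_region(rows):
    """Sum sales per region into a new dictionary."""
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + int(r["sales"])
    return totals

raw = load(RAW_CSV)
clean = drop_missing(raw)
result = total_by_region(clean)
print(len(raw), len(clean), result)
```

Because each step returns new data, the script itself documents what was done
to the numbers, and anyone can replay it from the raw input.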

~~~
smacktoward
What's needed is a "serious tool" that's as easy for people to get into and
understand as Excel is.

Excel is as ubiquitous as it is because the mental model of a spreadsheet is
very simple and easy to wrap one's brain around. Here's a grid of cells, type
stuff into them. Boom, you're an Excel user. (It can do a lot more than that,
of course, but many many many people never realize that and still get lots of
value out of it. Which is kind of remarkable, really.)

Proper databases are much more powerful, but none of them present the user
with an interface whose basic metaphors are as immediately obvious as Excel's.
You have to learn about tables, data types, relations, etc. to get anything
done in them. Most of that stuff comes up with Excel too, but you don't _need_
to know it to get started. The Excel user's first baby steps are much smaller
than the database user's are.

Given a choice between a flawed tool that's immediately understandable and a
perfect tool that requires learning, people will always choose the flawed-but-
understandable tool. So they go with Excel.

~~~
pyoung
I don't think we should necessarily aim to make "serious tools" easy for
people to get into. I am not saying that the UI shouldn't be intuitive, but
rather that making something 'too easy' results in low barriers to entry and
you end up with the wrong types of people using that tool. Excel is very
useful (I use it much in the same way the OP does), but it has its
limitations. Because it is so easy to get into, it's easy for individuals
lacking the proper experience to over extend themselves.

I think much of the explosion in data science jobs these last few years isn't
really about "big data", but rather the fact that your typical corporate
analyst does not have the experience to tackle mid range data analysis jobs
because the only tool they have under their belts is Excel. What really is
needed is a larger number of people with experience in statistics, scientific
computing (Python, R, etc.), and database systems. The current tools we have
are fine, they just need to be used properly (well, we should always strive to
improve our tools, but you get the idea).

------
sailfast
Excel can be an excellent tool - I think the problem is in creating something
that buries key calculations or is heavily prone to human error so major
mistakes are made. Some of these things (recently reported in the news) moved
markets and made or lost billions. You should definitely check your math!

Tableau is great for visualizations but if you want to do anything on your own
servers it is prohibitively expensive and there are other alternatives out
there. Not sure it provides the same capability as Excel for analysis.

~~~
chadgeidel
This is my biggest gripe. There have been several well-publicised studies
where calculations were in error.

88% of Excel spreadsheets have errors - I couldn't find the actual study after
a quick googling.
[http://www.marketwatch.com/story/88-of-spreadsheets-have-err...](http://www.marketwatch.com/story/88-of-spreadsheets-have-errors-2013-04-17)

Recent national debt study found to contain erroneous calculations:
[http://www.nextnewdeal.net/rortybomb/researchers-finally-rep...](http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems)

~~~
lucozade
If 88% of spreadsheets have errors, what's the likelihood that the figure of
88% is correct?

~~~
peatmoss
That figure was calculated using R.

------
James_Duval
I'm currently attempting to write a roguelike using nothing other than Excel
formulae (no macros etc.). It's going surprisingly well so far, although the
limitations are obviously frustrating - not least the inability to correctly
lock-in random numbers, although I think that can be worked around by linking
the data from another workbook.

Obviously this is not a serious project, but it's kind of fun. I think this is
where Excel gets dangerous. To a certain type of mind (perhaps like mine), the
enjoyment from forcing a non-compliant system to do your bidding is greater
than the enjoyment of using the proper tool to do the job quickly and
effectively.

I still don't use pivot tables. Those are too practical.

~~~
DanBC
> I think this is where Excel gets dangerous. To a certain type of mind
> (perhaps like mine), the enjoyment from forcing a non-compliant system to do
> your bidding is greater than the enjoyment of using the proper tool to do
> the job quickly and effectively.

You're hacking around with something for fun.

Other people? Not so much. EG, the guy who used MS Works spreadsheet (which
was not loadable by anything else) for _everything_ , including word
processing. He'd open a spreadsheet, and type text in, and get "nicely" laid
out documents.

------
abvdasker
I recently had to develop a little reporting system for my company's sales
team and I have to say, Excel has some excellent interfaces on Windows. The
little app I built is a GUI that lets someone choose a spreadsheet tied to a
remote MySQL database via ODBC, refresh the data in that spreadsheet and email
it at scheduled intervals. It literally took like a few days.

Also, coming from a Java background but doing all my recent work in Ruby
(about which I feel ambivalent), coding in C# was a real pleasure. I'm more of
a text-editor type, but I was also profoundly impressed with Visual Studio.
Microsoft ain't all bad.

------
DanBC
I agree, Excel (and other spreadsheet software) is easy and fun.

But people are bad with numbers and spreadsheets give them the power to be bad
with lots of numbers very quickly.

Spreadsheets are tricky to audit, and often they're not audited because
they're just a tool that some guy uses without anyone knowing.

I've mentioned the European Spreadsheet Risks Interest Group before, but here's the
link for people who haven't seen it
([http://www.eusprig.org/](http://www.eusprig.org/))

Panko has a nice website with some information about human error and the
things that people do with spreadsheets
([http://panko.shidler.hawaii.edu/SSR/Mypapers/whatknow.htm](http://panko.shidler.hawaii.edu/SSR/Mypapers/whatknow.htm))

If you like Excel and pivot tables you might like PowerPivot, a free addon
from MS. Here's some information about it:
([http://www.powerpivotpro.com/what-is-powerpivot/](http://www.powerpivotpro.com/what-is-powerpivot/))

EDIT: I like some of the horror stories here
([http://www.eusprig.org/horror-stories.htm](http://www.eusprig.org/horror-stories.htm)), eg

> _The London 2012 organising committee (Locog) confirmed on Wednesday that a
> decidedly unsynchronised error in its ticketing process had led to four
> synchronised swimming sessions being oversold by 10,000 tickets._

[...]

> _Locog said the error occurred in the summer, between the first and second
> round of ticket sales, when a member of staff made a single keystroke
> mistake and entered ‘20,000’ into a spreadsheet rather than the correct
> figure of 10,000 remaining tickets. The error was discovered when Locog
> reconciled the number of tickets sold against the final layouts and seating
> configurations for venues, and began contacting ticket holders before
> Christmas._

~~~
gruseom
Isn't it a bit silly to blame Excel for a user's typing 20,000 instead of
10,000? Computers can't magically come up with the right answer from wrong
input. You could argue that there should have been error handling or data
validation, but most custom software doesn't do that either.

~~~
DanBC
I'm not singling out one spreadsheet package.

I'm not even singling out spreadsheets - thus the link to Panko and "human
errors".

But fat-finger errors are very easy to make with spreadsheets, and very hard
to find. This error, (20,000 instead of 10,000) was only found when LOCOG
realised they didn't have enough seats for all the tickets they'd sold.

Spreadsheets are particularly tricky because people scatter input data among
all those cells; sometimes data is manually copied and pasted (which can
either update automatically with other changes, or not, depending how it's
cut and pasted); they then manipulate those cells with formulas (sometimes
across multiple sheets) and then they spit out a number. Or sometimes not even
a number but a bar on a chart. Few people bother looking at the input or at
the formulas used, they just see the output and sometimes apply a brief sanity
check.

> You could argue that there should have been error handling or data
> validation, but most custom software doesn't do that either.

Most people don't write custom software. Spreadsheets give people a very
powerful tool. Most people don't have the skills to protect themselves from
that power - they don't know about error handling or data validation and
spreadsheets make it hard to provide those sanitation and checking tools.
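
For what it's worth, the kind of check being discussed can be tiny. Below is
a hedged Python sketch of a reconciliation rule that rejects a "remaining
tickets" entry that disagrees with capacity minus sales. The function name
and the capacity and sales figures are invented for illustration; only the
20,000 vs 10,000 entries echo the quoted story.

```python
def validate_remaining(capacity, sold, remaining_entered):
    """Return the entered figure only if it reconciles with capacity - sold."""
    expected = capacity - sold
    if remaining_entered != expected:
        raise ValueError(
            f"entered {remaining_entered} remaining, but capacity {capacity} "
            f"minus sold {sold} gives {expected}"
        )
    return remaining_entered

# Hypothetical venue: 15,000 seats, 5,000 already sold.
ok = validate_remaining(15000, 5000, 10000)

try:
    validate_remaining(15000, 5000, 20000)  # the fat-finger entry
    rejected = False
except ValueError as err:
    rejected = True
    print("rejected:", err)
```

The check only helps if capacity and sales live somewhere independent of the
cell being typed into, which is exactly what ad-hoc spreadsheets tend to lack.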

~~~
gruseom
My point about custom software might have been unclear. Here's what I meant. A
one-off ticketing app is a classic example of software that would normally get
developed either in a spreadsheet _or_ as some custom app that a team would
build for the occasion. Such custom apps are typically bad at data validation
and error handling, so it's plausible that the same typo would have had the
same effect outside of Excel.

When I read that Eusprig literature on spreadsheet errors a few years ago, it
seemed to me of rather low quality. Certainly many spreadsheets contain
errors, but what are we comparing this to? Software in general contains
errors. Custom app software—the only alternative to spreadsheets in most
cases—tends to be of poor quality and is very expensive compared to
spreadsheets. So it's far from clear that spreadsheets are a bad tradeoff.

Programmers take it for granted that "proper" software is better than
spreadsheets, but I think that is mostly bias. Intellectually and technically,
spreadsheets are low-status. Among users, on the other hand, they're high-
status, for very good and deep reasons.

I'm also biased. I think that these problems need to be solved by innovation
within the spreadsheet space itself.

------
kyllo
The biggest problem with Excel is version control, because a spreadsheet is
just a file (.xls is binary, .xlsx is zipped XML but not human-readable), and
people share it as an e-mail attachment. This is what makes it a horrible,
unmaintainable mess.

This is also the advantage that Google Docs Spreadsheets have over Excel. The
data is cloud-hosted and automatically keeps a revision history, allowing non-
technical users to collaboratively edit without having to learn about version
control.

But most companies don't allow their employees to upload company data into
Google Docs, and with good reason, especially in light of the NSA spying
scandal.
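
To show why the file format matters for version control: plain text can be
compared line by line, which is what most version control tools rely on. The
sketch below uses Python's standard difflib on two made-up CSV exports of the
same sheet. Exporting to CSV is an assumption for illustration, not something
Excel or Google Docs does for you automatically, and the file names are
hypothetical.

```python
import difflib

# Two hypothetical CSV exports of the same sheet, before and after an edit.
old = "item,qty\nwidgets,10\ngadgets,5\n".splitlines(keepends=True)
new = "item,qty\nwidgets,12\ngadgets,5\n".splitlines(keepends=True)

diff = list(difflib.unified_diff(old, new, fromfile="budget_v1.csv", tofile="budget_v2.csv"))
print("".join(diff), end="")

# Keep only the changed rows, skipping the ---/+++ file header lines.
changed = [
    line for line in diff
    if line.startswith(("-", "+")) and not line.startswith(("---", "+++"))
]
```

A binary or zipped file offers no such line-level view, so an emailed
attachment only tells you that something changed, not what.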

~~~
lucozade
It's really not the NSA or GCHQ that most companies are concerned about, at
least in this regard. They're concerned that confidential information will
become available to competitors, the public, etc. Companies are generally much
more afraid of Anonymous and Lulzsec than the NSA.

The concern with the NSA, if they have one, is usually the perception that
customer/client information they hold internally is being abused.

~~~
kyllo
Right, anyway the point being that cloud-hosting the spreadsheets resolves the
biggest problem with Excel, which is version control, but it introduces a new
problem, which is data security on servers that you don't control.

------
bgilroy26
Sarbanes-Oxley will take a big bite out of MS Office licenses. The law is 10
years old, but there was a lot of lock-in in 2003.

The JP whale provides a great just-so story for regulators who want to take
powerful, flexible tools like Excel out of analysts' hands. Give it another 10
years, Excel will not have the ubiquity it enjoys today in financial
institutions.

------
PaulHoule
I just computed summary statistics from something in my Hadoop cluster and I
made the plots with Excel rather than R because it was a lot quicker.

If I had to generate these plots every week then I could script it easily with
R, but in a fast-paced environment Excel has its place.

------
mistercow
It would have been nice if this article had explained the shortcomings Google
Spreadsheets have compared to Excel. As someone who has never actually run
into a significant difference in practice, I would have found that
interesting.

------
leandrod
Gnumeric is so much more pleasurable… and LibreOffice so much more compatible…

------
maxcan
[http://blog.docmunch.com/blog/2013/re-in-defense-of-excel](http://blog.docmunch.com/blog/2013/re-in-defense-of-excel)

