
Why MS Excel Is a Poor Choice for Data Projects - preetish
https://www.promptcloud.com/blog/Why-MS-Excel-Is-A-Poor-Choice-For-Data-Projects
======
CJefferson
This article seems to be a bunch of random claims with no evidence.

If you want to demonstrate how Excel is a poor choice, show how someone would
do data analysis using excel, then show how your alternative system, whatever
it is, would do it "better", for some definition of better you want to use.

~~~
dkersten
_"This article seems to be a bunch of random claims with no evidence."_

I was part of a startup that was essentially trying to replace excel for a
specific market and we made a very similar list (with sources). Sadly I don't
have access to it anymore...

Anecdotally, from what I hear from friends who work in the industry, many
banks have a rather error-prone practice of emailing excel sheets to each
other resulting in version mismatches, versions getting lost, latency etc.

Here are a few reports for you to look at:

[http://www.businessinsider.com/excel-partly-to-blame-for-tra...](http://www.businessinsider.com/excel-partly-to-blame-for-trading-loss-2013-2?utm_source=scoopinion&IR=T)

[http://www.cio.com/article/2438188/enterprise-software/eight...](http://www.cio.com/article/2438188/enterprise-software/eight-of-the-worst-spreadsheet-blunders.html)

[http://blogs.wsj.com/moneybeat/2014/10/16/spreadsheet-mistak...](http://blogs.wsj.com/moneybeat/2014/10/16/spreadsheet-mistake-costs-tibco-shareholders-100-million/)

But I guess the question you are asking isn't _"is Excel error-prone"_ but
rather _"are the alternatives less error-prone"_. I don't have any supporting
evidence, but I would think that when using a "real" programming language, you
would have version control, unit tests and so on.
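
To make the unit-test point concrete, here is a minimal sketch of what a
tested calculation might look like in Python. The function `compound_growth`
and its numbers are purely illustrative assumptions, not taken from any of the
reports above.

```python
import unittest


def compound_growth(principal: float, rate: float, periods: int) -> float:
    """Value of `principal` after `periods` periods at a fixed `rate`.

    Hypothetical example of logic that often lives in a spreadsheet cell.
    """
    if periods < 0:
        raise ValueError("periods must be non-negative")
    return principal * (1 + rate) ** periods


class CompoundGrowthTest(unittest.TestCase):
    def test_zero_periods_returns_principal(self):
        self.assertEqual(compound_growth(1000.0, 0.05, 0), 1000.0)

    def test_two_periods(self):
        # 1000 * 1.05 * 1.05 = 1102.5
        self.assertAlmostEqual(compound_growth(1000.0, 0.05, 2), 1102.5)

    def test_negative_periods_rejected(self):
        with self.assertRaises(ValueError):
            compound_growth(1000.0, 0.05, -1)
```

Saved to a file, this could be run with `python -m unittest` and kept under
version control, so every change to the formula is both recorded and checked.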

~~~
TeMPOraL
> _I would think that when using a "real" programming language, you would have
> version control, unit tests and so on._

You would if and only if you employed professional programmers for those
tasks, which with Excel (apparently) can be done by people with less or
different training. Version control and unit tests are not features of the
environment, but of professional training and discipline.

~~~
dkersten
True, although availability of tools and ease of use does encourage certain
practices. You "could" do these things with Excel, but it doesn't encourage it
at all.

But you're right, without professional training and discipline even if the
tools exist, they're not used...

~~~
TeMPOraL
I agree. My point being, when thinking about a viable Excel replacement, one
also needs to consider whether and how the required skillset differs, in order
to get the most out of the new tool. A company may not be willing to switch if
the alternative solution requires extended specialized training for the staff.

~~~
dkersten
Yes, indeed, you are correct.

------
agentgt
I'm surprised they didn't mention Looker as an alternative (I have zero
association with the company). They also missed R. Yes, it is just a language,
but its impressive growth and tooling make it a compelling alternative.

I think the Snowplow Analytics guys (again, zero association) have better and
arguably more transparent/honest documentation, particularly of the whole
end-to-end process.

One of the reasons people like Excel (besides the obvious ubiquity of that
software) is that they don't have to put their data on some other service.
Some even feel it is more secure (this is perhaps somewhat false). That being
said, I don't know if I would ever trust a company hosting all my data
collection/warehouse needs, like the authors of this blog. They might not do
anything bad with the data, but they sure do have a lot of leverage on you
once you completely rely on them.

The other thing is that proprietary data visualization/calculation companies
will come and go. I bet Excel will still be here 20 years from now. That is
why R and SQL are also good things to learn.

------
nickpeterson
Another issue is that there is a massive difference between casually knowing
Excel formulas and being familiar with Power Query, Power Pivot, et cetera. I
think someone well versed in DAX expressions and other parts of Excel could
probably handle a few midsized things. That said, if your dataset is gigabytes
in size, then obviously you're going to need a specialized database or custom
programming.

------
infoaddicted
Lost me at the opening sentence:

> A business organization or enterprise always needs to adopt multi-
> directional approaches.

Reads like a marketing class assignment from junior high.

------
thanatropism
> With MS Excel failing to offer optimum results and performance,
> organizations across the business landscape are in search of better tools
> and technologies to bring out insights from mere numbers. With the advanced
> data visualization tools mentioned here, they will have perfect
> opportunities to leverage the power of information and data.

Who writes like this in 2017?

~~~
kristianc
> There’s no denying the fact that we are living in an age, where Big Data
> analytics seems to be the key to achieving unsurpassed success. Whether it’s
> a startup or an established business, data analytics happens to be a crucial
> necessity.

We're getting closer than ever to the world's first Turing-complete thinkpiece
bot.

------
sevensor
I used to believe that Excel was a bad choice and that users should use a real
modeling language. I've changed my mind. Excel is the only tool I've seen that
actually gets domain experts who aren't programmers to express their models in
any kind of formal language. It deserves a lot more credit than it gets.

------
minimaxir
A good rule of thumb for using spreadsheets vs. a programming
language/database is to limit the amount of data in the sheet _to what you can
feasibly scan_ by panning/scrolling around the spreadsheet. Beyond that,
organizing and analyzing the clump of data becomes inefficient and slow, which
is bad in the long run.

A fair limit is 1,000 rows in a sheet, which is enough for most utilitarian
use cases (e.g. daily-aggregated data for a year), and certainly enough for
bespoke models. I would not recommend using spreadsheets for Kaggle
competitions, though.

------
danzig13
This seems to be a theme of my comments lately, but I wonder why the writer
doesn't mention SSAS, SSIS and SSRS as alternatives.

------
ioquatix
I once had a customer try to load a 20GB CSV file into Excel. Well, it didn't
take long to crash, that's for sure.
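
For a file that size, one option is to stream it row by row instead of loading
it whole. A minimal sketch using Python's standard `csv` module follows; the
column name `category` and the file name in the usage note are hypothetical,
and this only illustrates the approach rather than what the customer actually
needed.

```python
import csv
from collections import Counter
from typing import Iterable


def count_by_column(lines: Iterable[str], column: str) -> Counter:
    """Tally the values of one column, holding only one row in memory at a time."""
    counts: Counter = Counter()
    # DictReader consumes the input lazily, so a file object works as well as a list.
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts
```

Usage would look something like `with open("big.csv", newline="") as f:
counts = count_by_column(f, "category")`. Memory use then depends on the
number of distinct values in the column, not on the size of the file.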

------
HalfwayToDice
spam?

------
rodionos
Here's a good example of a dataset that cannot be loaded into Excel:

[https://catalog.data.gov/dataset/crimes-2001-to-present-398a...](https://catalog.data.gov/dataset/crimes-2001-to-present-398a4)

> The dataset contains more than 65,000 records/rows of data and cannot be
> viewed in full in Microsoft Excel.

~~~
masklinn
Excel has supported 1M rows since Version 12 (Excel 2007). The binary format
(XLS) doesn't support more than 64k rows, but XLSX works just fine.
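
As a quick sanity check before opening a file, the two worksheet limits can be
written down as constants. This is a minimal sketch; the helper name
`fits_in_sheet` is made up for illustration, and `row_count` is assumed to
include the header row.

```python
XLS_MAX_ROWS = 65_536      # 2**16 rows per worksheet in the legacy binary .xls format
XLSX_MAX_ROWS = 1_048_576  # 2**20 rows per worksheet in .xlsx (Excel 2007 and later)


def fits_in_sheet(row_count: int, fmt: str = "xlsx") -> bool:
    """Return True if `row_count` rows fit in a single worksheet of format `fmt`."""
    limits = {"xls": XLS_MAX_ROWS, "xlsx": XLSX_MAX_ROWS}
    return row_count <= limits[fmt]
```

So a 70,000-row export overflows an XLS sheet but fits comfortably in XLSX.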

~~~
krembo
In addition, Excel's data model is not bound by the worksheet row limit and
is, in practice, limited mainly by your machine's hardware.

