Hacker News new | past | comments | ask | show | jobs | submit login
Stop Using Excel, Finance Chiefs Tell Staffs (wsj.com)
627 points by petethomas on Nov 22, 2017 | hide | past | web | favorite | 417 comments

I would argue that the most commonly used programming language is Excel. But few of the people using it realize they're programming.

It's a brilliant reactive data programming model that makes intuitive sense to non-technical users. They feel empowered to use it to solve problems right now with a computer. They experiment with it, try things, Google how to do more things- just like any programmer does. And they feel capable of doing this because they don't know they're programming.

Within the Amazon warehouse world, I have seen incredible innovation using it. An acquaintance of mine got into development by using it to help save soldiers lives while serving in Afghanistan[0].

I agree that mission critical data needs to get out of it and into a centralized system, but I still feel Excel is an incredible tool in any business.


I used to be a Lotus Notes specialist, and one interesting thing I noticed is that from the earliest versions it was designed specifically to be a "step up" from a spreadsheet (Lotus of course also made Lotus 1-2-3, which was the original killer-app spreadsheet before Excel took over).

Notes still let you see data as rows and columns and had very spreadsheet-like functions, many of which could be transferred directly from Lotus 1-2-3 - but it was a full client-server system which supported decentralized replication, built-in messaging and email, and a solid security model (yes yes I know it was also a crazy confused mess and a UI nightmare but that's a different topic).

Back in the day, organizations had teams going around finding the most useful ad-hoc spreadsheets and converting them into simple Lotus Notes databases. Other organizations would sometimes also give staff training in building simple Notes databases - not a whole lot more difficult than doing a spreadsheet - and then have expert teams come in and polish them up as needed. This made business software bottom-up rather than top-down, which turned out to be a whole lot more useful in a lot of cases.

In many ways, even after I left the Notes world for PHP and Rails and JS, my career in business software has still mostly been about converting spreadsheets into more "proper" applications.

Spreadsheets are the 'blue-green algae' of the software world: the simple base level that's everywhere and surprisingly essential.

> my career in business software has still mostly been about converting spreadsheets into more "proper" applications.

Still true for many people in 2017. @patio11 probably said it best:

> Every spreadsheet shared in a business is an angel announcing another SaaS app still needs to be built.


FWIW we ended up in business software after one of our open source libraries to read and write spreadsheets (https://github.com/sheetjs/js-xlsx, demo http://oss.sheetjs.com/) ended up garnering lots of demand from businesses looking to build those "proper" applications

> Every spreadsheet shared in a business is an angel announcing another SaaS app still needs to be built.

It's a pithy quote, you'll get a very different reaction from a finance or small to med-sized business professional.

Excel is the ultimate maker studio that actually lives up to the promise of "build your own mini-app". I'm not saying it can't be improved on. It's just that the core model of flexibility is so compelling, pushing everyone into a SaaS app is invariably going to take away substantial power from the user.

Excel is an incredible swiss army knife when dealing with one person in one location on one computer, but the warts start showing up when any of those assumptions break.

One computer: once you decide you want to use your smartphone, for all but the most basic sheets there are problems with Excel mobile preview, even with Microsoft's official apps.

One location: Excel's lack of a universal time (showing different times in for users in different timezones) have led to gnarly workarounds. There was an attempt to correct for it in the XLSX format (cell type 'd') but it has its own problems.

One person: Excel has no inherent sync strategy. You are forced to farm out to email (and the inherent filename versioning nonsense) or use a system like SharePoint (which has its own problems you notice when dealing with people in different timezones across the world, like accidental file locking and data loss)

I hear you. I think One Computer and One Person issues have generally been solved by Office Online or Google Spreadsheets. I can't speak to difficulty around lack of universal time nor do I know how important of a requirement that is to most companies.

Any solution, SaaS or otherwise, will have flaws. It might solve Excel's weaknesses really well, but will likely lack its strengths.

Excel is still going strong in 2017 to a large degree because its strengths continue outweigh its weaknesses and that people have found reasonable ways, or add-on tools, to work around it's weaknesses.

It will be interesting to see in the coming decade if niche industry solutions or an even more incredible swiss army knife were to come around and dethrone the incumbent.

Add another 'one': one point in time.

What I mean is that Excel is essentially a write-only language. It's very quick to prototype something for a semi-technical person, but auditing / code-reviewing an Excel sheet later is essentially impossible. The ubiquitous manual loop unrolling doesn't help readability either.

From my experience, this is more dependent on the team/process than Excel. I've worked in investments for several years and auditing Excel sheets was a core, well-run process at several companies I worked for. It was quite easy for smart, non-technical folks to follow formulas and equations to track down why their checks weren't passing.

It also scaled fine to thousands of employees.

What's with all talk wrt Office 365 and web versions and syncing that goes on with that? Is none of that actually the case? I wouldn't be surprised, just don't know.

True but there is a threshold. If the problem is small enough, you can't beat an Excel spreadsheet's flexibility and low startup cost.

Having worked with a number of finance teams, the biggest problem is not when Excel is used as an analytics tool but as a financial database. Excel as a source of truth for financial data is truly terrifying. Even financial audits are still often handled with custom data pulls exchanged over Excel spreadsheets.

Shameless plug: that's why we ended up developing one of those SaaS tools (http://modfin.io). It lets finance teams map their source data to a proper, easily auditable accounting ledger so that they don't need to do their "magic" in Excel.

Why is it terrifying?

I am a programmer and I use csv/Excel as the single-entry financial database for a small healthcare business ($500k revenue). The accountant is fine with it. Filtering and pivot tables are a joy in Excel, and I can still run Python scripts whenever I need (mainly to auto-categorize bank transactions). Most other businesses around here use QuickBooks, which has a proper system of accounts, but then I can't play with the data.

I'm guessing because of 1) lack of built in logging, 2) lack of change control and 2) difficult to automate testing and verification (because of it's relatively unstructured nature).

Excel as the source of truth for a financial system is particularly scary to me b/c it's so easy for someone/anyone to change entries or miscalculate so it's very difficult for me to get confidence in its completeness and correctness.

Completely agree about analytics and scripting, but I'd utilize the CSV/Excel reports as point-in-time analytics that can be tied back a structured source of truth (or a source of truth I can hold liable, e.g. a bank or credit card statement).

Logging: I just save a copy of the spreadsheet every time I make a tax filing. Do I need anything more fine-grained?

Validation: I'm confident that I can catch duplicated rows and mistyped amounts because Excel formulas verify that the sum of all transactions for each account equals the account balance, and that transfers between accounts add up to $0. Do I need anything more?

Point-in-time doesn't work because I need to keep track of info that doesn't show up in the bank statements. I run a Python script that adds pretty-prints names and adds categories, and I also manually enter check details, category adjustments, business purpose of meals and flight tickets, etc.

I don't use a proper database because to do so I'd have to give up Excel's nice features (or worse, switch back and forth between Excel and the database).

> Validation: I'm confident that I can catch duplicated rows and mistyped amounts because Excel formulas verify that the sum of all transactions for each account equals the account balance, and that transfers between accounts add up to $0. Do I need anything more?

I'm sure that the people who made these billion dollar mistakes were pretty confident too... https://www.cio.com/article/2438188/enterprise-software/eigh...

"The error occurred when the accountant omitted the minus sign on a net capital loss of $1.3 billion" (Fidelity)

"There were honest mistakes made in a spreadsheet used in the implementation of a new accounting standard." (Fannie Mae, $1.36 billion loss)

There's a bunch of lower (million dollar) losses listed in the linked page too.

I don't doubt that people have made mistakes involving spreadsheets. But these mistakes are not unique to spreadsheets. Switching to an enterprise system doesn't magically prevent data entry problems.[1] Similarly, you're as likely to implement an accounting function incorrectly in Excel as in SQL or Java.

I think that you need to show that Excel requires more data entry, or riskier data entry, or that Excel can't catch as many errors, etc. For now I'm not convinced.

[1] https://www.accountingweb.com/aa/auditing/human-errors-the-t...

This doesn't scale well. If only one person edits the spreadsheet and that person knows how to save copies and make sure there is only one latest version it may work. If you give this to 10 people they will start making copies, saving those with different names ans save the copies on some server or on their own machines and after a while you don't have a single source of truth anymore. And you almost never have logging when many people edit copies of Excel documents.

That's true. Currently I handle scaling by automating nearly all the work. Most routine transactions are automatically categorized without manual input; I just skim the report. We modify 0-2 transactions/day, which is low enough for Dropbox to work well. This wouldn't work for a Fortune 500.

How do you know the account balance total they you are validating your line items against is correct? Is that coming from another accounting system?

I validate account totals against the balance reported by the bank.

Oh I see so you are only reporting on cash receipts, so presumably there is an accounts receivable ledger to track the invoices raised with your cash received etc. Thanks for the reply

Yup, I use cash accounting because our medical AR numbers are mostly made-up. They're tracked and analyzed through a medical billing program.

Apart from the generic question of the pros and cons of using Excel as a data store and how you solve issues around access control, concurrent access, backups, etc., financial databases have some additional requirements to be properly auditable. For example, one of the biggest differences is that the data should be immutable with only additions allowed. E.g. if you made an error and need to correct it (say you sent the wrong amount of money to someone or simply just had a bug), you should keep the record of the erroneous transaction and reverse it or book an adjustment.

You can try to institute the same controls over Excel, either directly in code or just implicitly in terms of how it's used, but they tend to break down pretty quickly with scale.

No. An domain-specific SaaS app takes all the power away from the people writing their spreadsheet programs. Only one app is needed, a hosted spreadsheet with backup and versioning an replication. The only domain-specific business is in writing plugins.

I agree that purpose-specific tools aren't the answer, unless the question was, "can I have a straight jacket?"

Flip side, my feeling is that, if an analysis was worth doing, it's probably worth doing right. And to do it right, it should probably be done in a way that one can show their work. I like where the R community in particular has gone in dropping the barrier to reproducible analysis to roughly the same place as Excel.

My friend's startup has a project that matches up with a lot of these requirements, called Coda. It's easy to query data and build visualizations, has a spreadsheet data model, and has an easier programming model. Excel did a great job getting people to "program" their spreadsheets though.

I also wonder about google sheets capabilities. It does work so well for many people working on one spreadsheet, except for undo. The major flaw I've seen is undo, because in google sheets it was global undo, not personal undo.

I wrote this in another thread, but I think the next move should be to create an Excel-compatible spreadsheet program with a few additional constraints inspired by programming. Make primary keys required, each column needs a type, etc. These are concepts I’ve successfully explained to non-technical coworkers to great effect. By enforcing them in a tool, I think you would get about 80% of the benefits of true software without all of the overhead that entails.

This actually exists inside excel since 2010, it is called get & transform (known before as power query) and lives in the data tab. It allows the user to mash data from different sources (databases, csv, excel files, current workbook) and perform sql-like opérations : filter, join, etc. In a wisiwyg-record-macro fashion. I find it very intuitive and the steps are easily reproducible as each transformation step is translated into a language derived from F#. It is a game-changer for business users that dare using it, as it is rigorous and precise like a classical SQL flavor, but also usable with zero training.

Since it was a very powerful tool but struggled getting traction, they are using it as the core query tool for power BI (alongside DAX for great dashboarding).

Well done Microsoft.

Airtable does something like this. I don’t know how excel-compatible it is, but the core of data types and something similar to primary keys are enforced in a pretty intuitive way, and basic formulas work fine. Haven’t done much advanced stuff with it.

There's a reason chefs don't use multi-tools; purpose-built tools always beat generic tools when you need to maximize efficiency on a task.

I would say a chef's knife is the ultimate kitchen multi-tool. Having one great knife you really know how to use for cutting everything is much more efficient than a bunch of different knives.

Of course the multitool is the single point of failure. If the chef isn't careful, that one great knife can be the vector that means ALL the customers get salmonella, not just the ones who ordered the chicken!

A nice quote but getting data between SaaS apps is much harder than between Excel sheets, and modifying a SaaS app with that one little thing you need for your particular case is of course impossible.

Getting data between SaaS apps is not only easier, but formats like JSONAPI are far more powerful than pushing csv binaries that may or may not adhere to a spec.

One can see parallels between Excel and Unix command line scripting here.

> Every spreadsheet shared in a business is an angel announcing another SaaS app still needs to be built.

With new (or prospective) clients that are SMBs, one of the first things I ask is, "so, what are you still using spreadsheets for?"

> Every spreadsheet shared in a business is an angel announcing another SaaS app still needs to be built.

Really? I am working in company which is not handling super-sensitive data (i.e. not a Bank, for example) but I can assure you that if went around proposing to adopt a SAAS for what Excel is used internally I would not get very far.

No matter how good you are as a SAAS provider, our data stay in-house. End of story.

Many people on HN make the jump from "I would not buy that" to "There is not a market for that", but there is, in fact, a market for SaaSifying lots of things.

It's no value judgement if you want to continue using Excel for privacy, compliance, or security reasons. You do you. But SaaS companies get very good at developing privacy, compliance, and security stories, particularly as they move into enterprise. (Below enterprise it's less of a big deal, partially because customers care about them less and partially because small businesses are, as a judgement-free statement of engineering reality, abominably bad at data security.)

Yes - I agree, the point is that most of the time an Excel "app" is solving a very specific problem using data coming from some other (internally managed, usually legacy) system. Like: I run a report, transform the result in .csv with some ultraedit macro, slurp it up in Excel and off I go.

(off I go can often being something like: upload the result to the same or a different legacy system, again in some custom format).

Maybe I can pester my own IT dept. to add an extra option to the report so that it spouts off a .csv directly. That's all, the rest still lives in Excel.

Then I am free to play with data as much as I like (take also in account that in some cases you want these in a Excel just to be able to manipulate them better, while the old app works record by record, you can make changes across a thousand records using the Excel interface... and keep also in mind that this is maybe something you need to do once or twice a year, when you renew your catalog prices or whatever).

Now, what is the cost of:

- going out on the market looking for a SAAS that can get my data exactly in the format I use internally, applies the required transformations and send the result back in a format I can use

- assess that the SAAS vendor is indeed trustable for my desired level of security, SOX compliance, etc.

- add one more vendor to my portfolio of vendors/licenses/purchasing orders

So the reason not to go to SAAS is actually a combination of security/external resource dependency/cost/bureacracy.

Where "cost" is mostly inertia+sunk cost fallacy. I am sure that a SAAS may cost less than the hours spent manually doing all the stuff I mentioned above (including mantaining the Excel spreadsheet) but the latter is a "hidden cost" because it something that happens infrequently and is part of the normal chores of whoever is using the Excel sheet itself.

While introducing a new SAAS app will be an IT cost (to identify/approve it, add a recurring subscription etc.)

As a Lotus specialist I can see why you'd think Lotus 1-2-3 was the original killer app spreadsheet and as someone who worked retail at the time selling UNIX and Microsoft machines I'd have to say we sold a ton of Lotus 1-2-3.

But we have to give Dan Bricklin and Bob Frankston their dues because Visicalc is the original killer app spreadsheet in the literal sense: people used to say they didn't know if Visicalc sold more Apples or if Apple sold more Visicalc. It truly was the program for which people bought the platform.

You're right - my mistake!

Maybe I did not appreciate it properly, but I remember using Lotus Notes at my work a few years ago, and I can remember it being the most awful piece of software I have ever used. Admittingly I am not a specialist, more an employee trying to use it to do my job.

> I can remember it being the most awful piece of software I have ever used

Many people seem to say the same thing but the only problem I could ever see (in occasional use) was a non-standard UI which non-technical users had trouble grasping. What made it so awful?

Using Lotus Notes at my current employer.

Notes itself has some great functionality. The UI feels a little clunky - but generally you can do what you need to do.

The main problems I encounter is the way our IT department have deployed it. Many functions are not enabled or only partially working. As far as I am aware, we have no Lotus/Domino expert in our country; head office (other side of the world) has quite a few though. Any requests made through our regional helpdesk for fixes/improvements mostly go nowhere.

I like the idea of the replicated databases for email and other business functions. It should mean staff could switch to a different machine and quickly and painlessly get access to all their email and other data. However, there is a considerable amount of convoluted configuration required (the way it is currently configured) that appears to require local admin access that makes it impossible without helpdesk assistance.

I suspect many companies deployed Lotus/Domino is similar semi-functional ways that made it less useful than it could be.

The reason many companies migrated away from Lotus/Domino though was the risk of vendor lock in. My employer is stuck with a vast amount of business process captured in Lotus/Domino. They have been slowly migrating core functionality out to other web or networked database applications over the last 12 years. Even so, I don't see any move away from Lotus Notes for email for many years to come.

Navigation, the UI, the way the databases where setup. I can remember that even basic things not working the way normal windows applications work (usability related things), but it was so long ago I can't remember the specific things, sorry. Just that it was awful.

Bloaty when I used it. Crappy ui. Crashed a lot. (Circa 2007 or so for me)

It was pretty awesome for enterprise software in the mid-90s, but the world caught up with it soon after.

"You either die a hero, or you live long enough to see yourself become the villain."

In the mid-90's, I thought I was helping by bringing Notes in to the company I worked at, and people seemed to agree. A ticket tracking database we set up seemed to help. Many years later, after I left, I saw a presentation where Notes was cast as the problem, with a website proposed as the solution.

It is, you're not wrong

What would you use now if you needed the same sort of functionality? I find myself with some unexpected needs to build databases whose specifications and scope I'm still trying to figure out, and for now I'm missing about in Google sheets and figuring Fusion tables will probably be sufficient, but I'd be interested in any alternatives as long as they're secure and shareable.

Some combination of things like Python and SQLite would probably go a long way?

I basically agree - but in my case, instead of Lotus, it was Framework: https://en.wikipedia.org/wiki/Framework_(office_suite) - I think I used versione III and IV, mostly.

This sounds familiar to my, albeit limited, Microsoft Access experience.

Completely agree.

It's common to tell kids (and everyone) that they need to "learn to code." That is what Excel already is! And it's incredibly powerful, especially if you don't have any other coding skills.

But, of course, it's messy. When tens of millions of people can each write their own programs and easily modify them and change them to suit their own needs without needing to recompile or anything, then you get the mess which is Excel.

...and, in fact, that's also what you'd get if you taught everyone how to program Python in high school instead of Excel. You'd get tens of millions of non-interoperable messes of programs (which still get the job done!).

But that's fine. Give humans tools, and they'll find clever ways of using them that you don't intend. And it'll continue to drive people crazy. In some ways, the Excel hate (though obviously not unfounded) seems almost kind of like a bourgeoisie elitism. THIS is what it looks like when you give everyone the ability to code!

> ...and, in fact, that's also what you'd get if you taught everyone how to program Python in high school instead of Excel.

I disagree. Python leaves a readable instruction set, while Excel buries the logic in an opaque two dimensional grid of cells (plus or minus atrocious VBA). I think this gives a better ladder for people to improve their skills to the point of not producing unmaintainable garbage.

The grid of cells is your memory space. Coders like looking at a stack. Spreadsheet users like looking at the registers.

Which you prefer depends to some extent on which end of the problem you're coming from; if you're starting out with a lot of domain knowledge, a spreadsheet is attractive because you can look at the data, see whether the way it's being stored/processed is basically correct by spot checking it, and then automate. I'll still often reach for a spreadsheet if I don't know off the top of my head what the rules or relationships are and I need the equivalent of a whiteboard with logic.

If you have a top-down perspective and you want to automate a process whose fundamentals you know well, code is more comfortable. And you can go back and expand, modify, or adjust it later. But two things I don't like about code for handling large volumes of data are that I have to construct some sort of viewer to format the output (requiring more coding, or slower bug discovery if I press ahead without), and endless typing I like selecting things from menus and tool palettes because while I don't mind holding an instruction set in my head I don't like doing the same for every library I might want to use, especially remembering all the parameters. I hate hate hate writing glue code and want the computer to do that stuff for me.

Fortunately a lot of these problems seem to be receding thanks to tools like Jupyter and some of the promising Flow-based programming tools that let you easily visualize structure, code, and visualization without getting bogged down in details.

I should add that I prefer hacking to software engineering; that is, I'm only interested in programming as a means to an end of having a customizable tool for myself rather than building a product for other people, so I want spend as little of my time as possible on constructing the tool.

> Python leaves a readable instruction set

Not everyone structures programs in the same way, or puts functions in easily found locations, or names things usefully, or even approaches the problem in a way that makes sense to the majority of other people. For a concrete example, see this[1].

> Excel buries the logic in an opaque two dimensional grid of cells

It would be trivial to represent excel grid formulas as a set of predefined functions, or a special data structure that you register functions in, and then it could be in textual form. It wouldn't make it any easier to reason about, but then again control flow isn't always easy to reason about when a program is split into many different objects and functions, until you've internalized a fair amount of the program. Spreadsheets give people a very limited, but structured output and a very limited, but easily reasoned about way or processing. It's different than "normal" programming, but that difference allows for very intuitive use.

> I think this gives a better ladder for people to improve their skills to the point of not producing unmaintainable garbage.

How much work has been put into making spreadsheets easier to share with other people? I'm not sure the same culture has existed for spreadsheets as for programming, where people have needed to share and reason about complex algorithms and data structures. If it had, I'm sure we'd have much more in the way of how to structure your excel well so it can easily be consumed by others, or yourself in the future (if it doesn't already exist, which it may).

1: http://thedailywtf.com/articles/Python-Charmer

> Excel buries the logic in an opaque two dimensional grid of cells (plus or minus atrocious VBA)

Yes, mixing the good old Excel dependency tree together with VBA macros creates truly dreadful and opaque spreadsheets.

...and, in fact, that's also what you'd get if you taught everyone how to program Python in high school instead of Excel. You'd get tens of millions of non-interoperable messes of programs (which still get the job done!).

That's exactly what happened with Perl when everyone learned to use, modify, and develop CGI scripts in the early days of web-based application development. Dynamically-generated website content worked and evolved into the web we know today, but those scripts were awful and did a lot of harm to Perl's reputation.

I personally think that a lot of notoriety from Perl stemmed from insane overuse of regular expressions. I'm not talking simple and concise expressions doing ordinary useful things, but rather insanely long and intricate ones which one had to be at a god-expert-level on regex just to attempt to understand.

These kinds of expressions are what ultimately caused the derisive terming of Perl as "line noise". It was really unfair, to say the least, because there were plenty of ways to code in Perl and do so cleanly, with a minimal usage of regex.

Honestly, it seemed more like people were coding in regular expressions, and simply using Perl as a "run-time" wrapper for those expressions (and maybe those expressions existed first; I don't know). I'm not the greatest fan of regular expressions (to me - especially on more complex operations - it is almost like programming the tape of a Turing machine), but I do know they have their place in the world. However, because the "language" (if you will) of regex allows for an extreme number of operations and such, it almost succumbs to an "inner platform effect", allowing those with familiarity to roam free, to the detriment of future maintainers.

That said - none of this is new. If you go back to the beginning of microcomputing, you'll find the same thing happened with the various dialects of BASIC, as people self-taught themselves programming in that language, then spread the knowledge thru various means (magazines, books, and later bbs systems) - it became a horrible mess, even though it all pretty much "worked".

Regular expressions can get out of hand, but that's a developers (bad) choice. Perl lets you use multi-line regular expressions that ignore whitespace and allow end-of-line comments, which lets you properly document longer expressions if you need to use them.

Regular expressions are really powerful, and are so much better than the typical non-regex string operations most languages support. Combine that power with the ~= and qr operators, which make regexes as integral to the language syntax as == and "", and you get why Perl is so good at textual searching, parsing, and manipulation. It's easily what I miss most about Perl development since my career has moved to primarily C# and Javascript.

> I personally think that a lot of notoriety from Perl stemmed from insane overuse of regular expressions.

The problem was that Perl's performance on builtins(sorting and regular expressions in particular) was stunningly good while its performance on anything else was stunningly bad.

The idea that you would Decorate-Sort-Undecorate an enormous array because the overhead of passing a function to sort() was so much slower was ridiculous.

In the case of regexes, breaking apart the regex into readable chunks and processing the data multiple times was so much slower that everybody piled everything into the regex. Named captures didn't appear until 2007.

Passing a function into a sort() is going to be slow in any language, unless the language is able to inline the function. That depends on static analysis by the compiler, which is notoriously difficult in Perl.

The key, as you mentioned, is to use builtins, or more generally to use canonical expressions. There's always more than one way to write an expression, but Perl definitely has opinions about the correct way(s), and the Perl compiler does pattern-matching against the code to recognize many of the canonical expressions. They then get compiled to higher-performing IL code. This is why using things like the spaceship operator, <=>, will run much faster than equivalent verbose comparison code.

This is over-simplified, but I always thought about it this way: the Perl compiler generates "assembly code" for the "Perl CPU" to run. It can generate verbose assembly code using lots of low-level assembly commands, but it can also recognize commonly-used expressions and compile them to a single high-level assembly command, which can then run in a single "CPU cycle" instead of stepping through verbose code. Therefore, it was always worthwhile to learn how to write the commonly-used expressions in the common way, so that the compiler can recognize and optimize them.

I don't know if other languages have this kind of optimization. JIT compiling, as I understand it, does a quick and rough compile without optimizing on the first pass in order to start execution quickly, and then recompiles with more optimization on code blocks that are consuming a lot of runtime. I don't think any of the JIT compilers look for common expressions and apply pre-determined, hand-coded optimized function calls to them.

PHP always got heat for the same reasons (so much terrible, amateur advice replicating via StackOverflow et al.), and the only thing keeping Python from meeting the same fate is the fact that shoddy scripts written in Python generally don't get deployed to a public-facing platform for the world to abuse.

Craziest thing I ever found that someone did with excel was implementing a neural network using it. It was meant as a tutorial to teach how neural networks worked, and not meant for production systems, but I do wonder if anyone ever took it further.

Stands to reason that "everyone" is coding wrong. You wouldn't say people are programming just because natural language is turing complete, or because they are while physics is turing complete. One important aspect of information is structure, but programming classes at the low level merely teach syntax. So I'm thinking that therefore spreadsheets are unstructured. Sure that's elitist or at least academic - in comparison to amateurish.

Excel is a great way to code, but a horrible way to decode. Every try to pick apart someone else's spreadsheet.

You could make the same argument about most programming languages. I hate to sound like I'm parroting Spolsky, but, he's right that the reason that the reason every developer ever wants to re-write the code they inherit is because it's easier to write code than it is to read it.

It's obviously an exaggeration, but not by much.

the way i tell great developers from good developers is from how willing they are to dive into someone else's turd and make changes to it.

Oh god the years I spent on doing that with wireless drivers were a fucking nightmare.

Which I guess makes that test more about dedication and attitude than about raw skill.

have you tried to do scientific plots with Excel? I only tried libre calc, which admittedly might lack behind in a couple regards, but it's nigh impossible, no compare to scripting.

When I consulted at a health insurance company (a Blue Cross Blue Shield licensee with 1000 employees, 200 IT staff) about 10 years ago, I was absolutely horrified that the system of record for health care plan data was Excel spreadsheets. They had staff who would update the claims processing system based on thousands of spreadsheets. When you called customer service, the CSR app would auto-open the member's plan information XLS file to the CSR rep. It got the job done - but it was the wrong tool for the job.

We recommended they implement a PDM (product data management) system - it would have saved them millions of dollars (considering they had a staff of 20 whose sole job was to update the claims processing system by hand based on these spreadsheets.) Unfortunately when the 2008 financial crisis happened, management balked at the initial outlay. It's quite possible they're still using Excel to this day.

There are upsides to excel in cases like that…

The health insurance people of various sorts who use this setup are the ones with domain knowledge. With an excel program, they can (1) understand how it works (2) add to it, modify it (3) create a copy and break it (4) debug it (5) have ideas about how it could work differently…

I’m not saying that excel is the tool for the job (sounds like it is definitely not), and those advantages erode as the setup gets more complex and/or has more people using it. But, there are reasons excel gets used this way.

The alternative is often rough. Software projects like this have high failure rates. Getting “requirements” out of users’/clients’ heads and into programmer’s is difficult. It means the standard (natural?) gap between domain knowledge and programming knowledge is restored. That has costs.

I’m going to repeat again that I am not advocating health insurance companies run policies and claims in excel. It is a bad idea, almost certainly. I am saying that we should be curious at why they are doing it this way. I think it’s a “good” question in that it leads to interesting places.

If the excel team was blessed with a plague of inspiration tomorrow morning, Is there a way they could make it so that excel is the right way to manage an insurance business?

On top of what you're saying, the alternative is frequently a heap of under specified, under tested (and only manually tested!), copy-pasted and shoddy quality Cobol/Java/Oracle/C#/SQLServer/PHP/MySQL code.

Yes, the performance will generally be better (1 minute to run a report, not 1 hour), but I'm not sure things will generally be better. As you said, the business people can at least modify the Excel. These systems? Not so much, without an usually massive support contract (Yaaaay! for us IT people, I guess).

People sometimes make a false equivalent and imagine that the Excel jungle will be replaced by a Google-developed, 90%+ unit tested, refactored and cleaned periodically bunch of code.

> the business people can at least modify the Excel

I'm not convinced that's a good thing. If they can change the Excel then they can also fuck up the system, and if you fuck up a system like this then it's horrendously difficult to unfuck. I think systems like this look great when they're working, but the second they go tits up you realise why people are horrified by them.

Users able to modify cod is the opensource model. It doesn't lose its benefits and advantages just because the users are not us(professional devs)

It's not so much being able to view or modify code that's the problem, but the fact that any modifications to an excel spreadsheet are inherently running in production. Open source has many benefits, but while open source projects may accept pull requests there is very few which put contributions straight into the latest build without review.

That's not a realistic comparison. I have yet to see one user try to modify Firefox's code. Spreadsheets, on the other hand ...

You are still making a distinction between user and developer. I think the point is that we can all be both.

That's not the issue at all. The issue is that these systems invariably have no review process, no testing, and no versioning (at best they'll have each spreadsheet versioned with no real way to know what each version contains). And this makes them incredibly fragile.

Especially since the people who do this generally don't use version control.

They do use version control. They put each version in a folder named after the date

And these user-developed systems are often great...until someone accidently deletes or changes some data and it becomes a mess.

Unfortunately, many businesses then find they have no way to know the extent of the data lost or who did it. And there’s no proper backups. Happens over and over thoughout the world.

Excel’s a great tool and there are lots of terrific uses for it. But what the article describes isn’t one of them. If we’re being honest, this isn’t the first time the issue’s been discussed and the companies likely aren’t actually switching away from excel, but rather supplementing it.

In fact, the article seems to be more a “hit job” / “marketing piece” for Anaplan Inc., Workiva Inc. and Adaptive Insights.

I agree, the issue is not Excel as such but how it's being handled: I once wrote a blog post on how you can unit test Excel spreadsheets, too: https://www.zoomeranalytics.com/blog/unittests-for-microsoft...

> The alternative is often rough. Software projects like this have high failure rates. Getting “requirements” out of users’/clients’ heads and into programmer’s is difficult. It means the standard (natural?) gap between domain knowledge and programming knowledge is restored. That has costs.

The alternative doesn't have to have a high failure rate. I was brought in to help reboot their in-house implementation of a product data management system - the core issue was they didn't have any real competency in software development (though they believed they did.) Ultimately I recommended they abandon the in-house effort and instead proposed using an off-the-shelf PDM system, we evaluated quite a few decent options over the course of six months and found several that met the business's requirements. It would have been the right thing to do and I believe it would have been successful. Culturally the IT department staff was plagued with not-invented-here syndrome. It was a generally dysfunctional IT group and it's quite likely that Excel got it's start as the system of record because the business needed to get the job done, and the IT department was unable or unwilling to deliver.

Alternatively you misunderstood the business requirements and would have cost then a fortune on a failed project.

Alternatively you were brought in by management to make cost cutting recommendations - and your solution wasn't cheap enough.

Alternatively the company accounted for the enormous risk of the project succeeding but actually being worse than what they had in place.

You really don't know the politics of this.

Edit: Another alternative is that your plan would have resulted in sufficient job losses that the company was worried that it would demotivate other employees and damage the company more than the cost savings.

I upvoted your post because I think those alternatives are interesting to consider, but _you_ know far less about the politics of the situation than the OP.

It's entirely possible that the OP was exactly correct, or close to correct. It's also quite likely that at least one of the factors you mentioned (or others) was also at play.

OP here - the director that brought me in was (and still is) a straight shooter. There's no question as to why I was brought in. While she had the support of the CIO, the rest of the directors spent more time protecting their kingdoms than trying to modernize their systems. Fortunately for both myself and the director, we moved on from that organization.

Make it easy to interoperate with databases and/or revision control painlessly for the end user and the developer. The users are fine with excel, it's the attendant issues like centralizing data, running automated services off of the data, backing it up, and keeping an audit trail that created pain when I worked in healthcare.

If excel online was better, a lot of that would be pretty doable.good points.

But..I do think that if we're thinking in terms of "users" and "developers" then we're no longer talking about excel. The whole reason excel is used this way is the continuous spectrum of sophistication.

Also because Excel is both an IDE for developing spreadsheets, and a GUI container for operating them. Splitting dev time and run time environments would help with control. When a dev supplies a user with a traditional .Net GUI app, they don't enable the user to hack the app by supplying a copy of VisualStudio alongside. Effectively, this is what Excel does.

Yes, but.

It's no good if you fix excel by making it not excel. The whole reason excel is so popular is the intuitive, unintimidating but terribly wrong way it mixes data, code and GUI. Functions in the data, constants repeated in every row, hard coded variables all over the place. If you're just supplying a GUI and hiding the code, the user can't change, add or debug anymore. The point is that they can code, even if they don't know they are coding.

The trick is to think up a way of shipping VisualStudio, but making it less terrifying to engineers, managers and such.

Agreed. But a different way of working the same trick is to have an alternate run time that can execute .xls[x|m] as a headless server process.

One other point is let's say you do create some fancy new system to handle the task. It's likely going to erode and be surpassed by new computing technology at which point you're looking at an expensive legacy migration which could prove to be costly and you have the same downsides you mentioned.

Excel upgrades don't seem to break previous files as severely and have the backing of Microsoft.

I can't believe I'm defending Excel.

"If the excel team was blessed with a plague of inspiration tomorrow morning, Is there a way they could make it so that excel is the right way to manage an insurance business?"

What a professional software developer's workflow has and Excel lacks:

* principled way of sharing code across a team with distributed source control (or similarly powerful tool),

* automated regression tests,

* input validation,

* continuous deployment tools to automatically deploy working code (default mode for Excel I guess), and a mechanism to roll back to a previous version (not so much),

* build and dependency management tools for bringing in libraries from a vast open source ecosystem,

* systems for reporting bugs and feature requests and tracking their progress,

* networked database that is the shared "source of truth" for all users,

* security and auditing,

* hardware capacity planning (run locally or in AWS? what are the cost trade offs?)

So sure, get the health insurance people doing all of those things, and have Microsoft figure out how to do all of them well in Excel, and congratulations, they are now software developers. None of these require using a "real programming language", at least not directly.

That's great (though I'd quibble about a few of them, eg I was doing input validation in Excel 25 years ago, and boy do I feel old now), but while you're learning to do all that great stuff you're not learning anything about the domain you want to work in.

OK, I don't know who wants to work in health insurance but the basic point is Excel is a good tool for people who have specialist knowledge in a particular domain and want to automate their work. Programmers know the best way of programming but frequently overestimate their understanding of other people's problems.

You underestimate how difficult it is to work with at least some software companies, how expensive it is, how difficult negotiation it takes to convince quite a few nerds that user actually has business case for that requirement. And many software companies produce poor quality and are very slower then necessary despite using all buzzwords and all nighters. Managing even outsourced projects takes learning and expensive failures are nor an exception. All in all, the software we all collectively produce is still very expensive and unreliable.

Even absent that, it is difficult to explain what users need to coders who have zero idea about the domain. Unless it turns out they are the kind that just know user is idiot the moment she has different opinion on her own job, in which case expect needed time to tripple.

The homegrown excel is awesome analysis once the company decides to move on. It is concrete, you can generate test cases from it, it has features.

Integrated version control is one of the biggest things I wish they included in Excel rather than relying on an external system to do it.

>If the excel team was blessed with a plague of inspiration tomorrow morning, Is there a way they could make it so that excel is the right way to manage an insurance business?

No... but I think the converse might be possible. If those with the programming knowledge sought to replace the Excel system but in a way which actually preserved the useful features of the Excel solution. Dirty, filthy things that programmers don't like. No referential integrity, data duplication, deferring decisions to human beings rather than lists of rigid criteria, etc. If they could swallow their bile for just a bit, they could build a system just as useful and flexible and introduce tools to add in the things like data consistency and referential integrity - tools that the user would initiate, thus accepting responsibility for the restrictions such things impose on themselves and the data.

That is not, unfortunately, how it is usually done. (I am currently literally working on a program to replace a 40K row Excel sheet that crashes multiple times a day which has a workflow centered around it that involves slicing and dicing the sheet, passing it off to multiple different groups for separate editing, etc... not sure I will be able to sell the team on this approach, however I do believe it is sound.)

| I am saying that we should be curious at why they are doing it this way.

If you hand a caveman a machine gun, he will club you to death with it. IOW, the casual ability for domain experts to screw up a spreadsheet is enough of a reason to decide that excel is a terrible solution to this problem.

I would make a different caveman analogy. Hand a caveman a gun. He probably won't be a better rifleman than you (or someone). Really bad form and technique. But, he'll probably be a way better hunter than you, shoddy rifle technique or not.

Caveman were better than you at a lot of things. They knew how to find squirrels

The problem is switching would mean transitioning to a generalized system that they would have to work around. The Excel system looks like a nightmare to the HN crowd, but it was built around their business processes. It's basically a boutique app designed specifically for their job, it would be very hard to find a general product that does the job as well.

I've seen this a lot where a consultant will swoop in and tell them that they need to use Peoplesoft or something and the organization spends millions of dollars and years of retraining to get a system that has a lot more friction in their actual workflow and is never quite as good for getting the job done.

Of course there are often side benefits that can't be overlooked, like a much more sane backup solution, interoperability with other providers, etc... But in the end most of the employees will tell you straight up that the old system was better. Too much of the domain knowledge was baked into the old system and doesn't make the transition to the new system.

>When I consulted [...] We recommended they implement a PDM (product data management) system.

>We recommended they get a system that manages their product data

Yep, I believe you are a consultant.

They should be careful though: excel allows a maximum of 1,000,000 rows. I have seen more than one major crisis occur when a spreadsheet ran out of space, yet all of it was needed together to do daily business.

1,048,576 rows to be precise. ;) Though you'll run into performance issues in more complex spreadsheets once you get into the 100,000 row area, or even 10,000 or so. Then you'll see power user departments switch to the more opaque but more performant Access.

They switch to Access until they hit the data size limit of 2GB. Then they come back to IT who has to deal with the mess that you get when you allow domain experts to build databases.

I feel your pain but not everything built in Access becomes an unwieldy, klugey monstrosity. When Access solutions are born, their useful life, evolving reasons for use and expected storage requirements aren't always well-defined. It's only after users begin to embrace something that works, and request new functionality, that these things become clearer.

When faced with a decision b/w using a RAD tool to create something outside of IT, or spending $500,000 and waiting 6 months so real devs can deliver something Finance needed yesterday, that may not be needed tomorrow, the choice is an easy one.

Your generalization isn't entirely unfair; IT is often called on to fix a mess, but some Access solutions are built by people who understand database design/normalization and can put together a easily-maintainable application that never hits storage limits or requires upgrading to a SQL Server-managed back-end.

Corporate IT never lays eyes on many Access projects that aren't crap.

Domain experts are not the best people to build databases (says a domain expert who's failing miserably at that task right now), but database experts are sort of terrible at understanding domain problems.

It seems to me that many programmers don't appreciate that industry practices or regulatory requirements impose quirky requirements on domain specialists that everyone knows are far from optimal but which need to be complied with anyway if the business is to turn a profit or retain its licensure.

Someone mentioned a health insurance example above: there's 50 states, with 50 different sets of insurance regulations, and then some additional territories, federal regulations, and then wird business practices of healthcare providers who are subject to their own 57 varieties of regulation, and there are powerful economic and political interests that get in the way of burning it all down and replacing it with a single system designed by a committee of one patient, one doctor, one nurse, one pharmacist, one lawyer, and one economist.

And even that highly simplistic requirements committee went through 3 versions while I was writing it up.

The domain experts already have lots of complex knowledge and experience they need to automate. They don't want to learn about databases. Unfortunately, the reverse is equally true.

> It seems to me that many programmers don't appreciate that industry practices or regulatory requirements impose quirky requirements on domain specialists that everyone knows are far from optimal but which need to be complied with anyway if the business is to turn a profit or retain its licensure.

I think only those that turn into consulting understand it properly, because they aren't confined into a cubicle coding all day long without direct feedback how the real world actually looks like.

At most there is some feedback coming from ticket management system or the product manager.

Yeah, updating these sorts of insurance systems was a huge line of work for Rails devs in the late '00s.

Excel is simultaneously flexible enough to solve a wide array of problems and structured enough to enable larger tasks like slicing and filtering data. This article is trying to push for cloud-based silos that solve specific problems, but most of them fall apart in "real-world" scenarios because they aren't flexible enough to support the ad-hoc workflows people have developed over years.

The dystopia that I fear is an endless gauntlet of powerless employees droning 'the computer doesn't have a field for that' or 'the computer won't let me do that'. We already seem perilously close to this, putting the consistency of systems ostensibly built to serve the needs of society over and above the needs of society.

You can do crazy shit in excel. After someone opens an excel sheet - download a webpage and save it to disk? no problem. Lets load other excel files in memory and run their scripts too! Don't worry about your computer fan - while we scan your disk for stuff. Just state "On Error Resume Next" and start scanning hardware. Company email accounts are stuffed with one off sketchy excel attachments.


When you put it like that, ransom1538, I'm not going to click the link!

The idea is that general Excel files do not run macros. They need to be saved with a special extension and you need to manually approve macros, with warnings if the file is in a location where other users can write to the file.

I have no idea how watertight this is though, it’s Microsoft quality (and some of the most ancient code they have) so I wouldn’t bet the farm on it.

I've been working on a spreadsheet which is more amenable to embedding in programs (compiling to an IR which can be translated to high level languages or [eventually] compiler IRs like those of LLVM or GCC). I worked at a company where months were spent recreating models from a spreadsheet produced by subject matter experts, and this was a major cost.

A spreadsheet optimized for integration into software build processes would probably create more than a billion dollars of real value every year with even limited adoption.

While it may be true that Excel is overused, I believe that the spreadsheet is chronically underused as a model for software development.

I'll join in with a bit of show-and-tell of my own, if I may.

I'm working on a plugin for Excel for dev's and other technical people. It lets the user use SQL and C# in Excel, connect to various types of databases and move data in either direction, and provides a runtime for automation.

The idea is basically to let tech users use their existing skills to make life easier. Main areas of use are:

- process (live) data in Excel via SQL and C#/linq

- move data between Excel and databases

- fetch data from various sources using .net (e.g. get data from a AD, file system, REST etc. in a few lines of code)

- replace VBA with C# and .NET for automation

It's also very useful for prototyping. Business logic goes inside C# scripts, and Excel takes care of user input, displaying results, data binding (formulas and graphs) and data storage.

It's commercial but freemium, the url is http://www.querystorm.com.

I posted about it a few times: https://news.ycombinator.com/item?id=15670030 https://news.ycombinator.com/item?id=11583488

Here's an intro video: https://vimeo.com/242216594

Yeah, I've always thought with the types of integration hooks you're talking about that Excel could serve as a front end for many enterprise/back office apps.

My first heavy use of Excel was in a thermodynamics class in college. We modeled heat gradients for various HVAC systems with Excel. After that, I used it to calculate stresses and strains of different structures using each cell as a node in finite element analysis model.

I did something like this in 2002 for a VFX house; but it was loading a DLL that hot loaded other DLLs, queried the routines and exposed them as Excel functions. Through that I pulled in a stochastic modeling framework I wrote, and pulled in 5 years of the studio's timecard records to create a production forecasting system, right down to selecting the correct staff for each position, with backup people, for 1200 digital artists. The Excel "spreadheet" was something like 250 MB when saved with a film production in it.

Yes, I saw your tool before. I recommended it to somebody a couple months ago and they're very happy with it for their use case.

It is brave of you to integrate directly with Excel, and I commend you for that. Maybe some day I will work up the courage to do something other than basic one-way import.

.NET and ODBC already provide C# and SQL, no?

Yes...but. You can't do a SQL join between the data in an excel table and a remote database via a nice GUI front end using just .NET and ODBC. I am evaluating this product at the moment, and it looks really useful for Finance/analyst types

Interesting: optimized for integration is what I've been doing too. If you've got a minute to take a look at the links in my profile I'd be keen to hear your feedback.

This idea looks super interesting. Is the source already online, or will be?

Will be, "soon". It turns out it's difficult to get the basics of a spreadsheet right. I made the mistake of trying to use an existing spreadsheet application (like LibreOffice Calc or Excel) as the source format early on.

Have you seen Treesheets, http://strlen.com/treesheets/?

That's quite cool, thanks for the link. Tables seem to be a more popular model for finance and similar, but it's always interesting to see new data models like this.

This could allow for more interesting things like gathering fields from records into a meta-table from a relational DB, that would definitely be a valuable next step.

Was minorly obsessed with this a couple of years ago. Is very interesting but ultimately lacks a lot that could make it useful

Out of curiosity, what's missing from it ? I've been meaning to use it but still haven't found actual uses, I'd love to see what others do with it.

Mathematical functions, easy formatting, and more anything a purpose. Basically it does thinga you don't want a spreadsheet to do. It's more akin to a mind mapper and there are easier to use tools for that.

It’s a good foundation for future open-source development, uses wxWidgets toolkit for cross-platform desktops. Author has graphics background, now busy at BigCo.

Is there a github or similar I can watch?

After seeing the level of interest here, I set up a (tentative) github org where I'll put the repo for the evaluator when it has some reasonable subset of the desired functionality.


I definitely agree that spreadsheets (of which Excel is just the latest) have drawn a lot of people into basic programming. But at the same time, it has stunted a lot of people's growth because of its sharp limitations.

Spreadsheets have a lot of similarities to the sort of code bases that as an industry we've been moving away from. E.g., no version control, no tests, opaque code, bugs are hard to find, write-only code, bit rot, random antique scripts used in critical spots, poor reusability, latest version is on somebody's hard drive somewhere, nobody knows which version is the right one.

I also think it's deeply unfortunate that there's no good path from spreadsheet master to real developer. As great as spreadsheets are, they're a cul de sac. I hope this new generation of tools fixes that. If not, somebody sure should.

"The dynamic spreadsheet is a good example of such a tissuelike superobject. It is a simulation kit, and it provides a remarkable degree of direct leverage. Spreadsheets at their best combine the genres established in the 1970s (objects, windows, what-you-see-is-what-you-get editing and goal-seeking retrieval) into a “better old thing” that is likely to be one of the “almost new things” for the mainstream designs of the next few years.

"A spreadsheet is an aggregate of concurrently active objects, usually organized into a rectangular array of cells similar to the paper spreadsheet used by an accountant. […] A spreadsheet is a simulated pocket universe that continuously maintains its fabric; it is a kit for a surprising range of applications.”

– Alan Kay, “Computer Software”, Scientific American (1984). pp. 56–57. PDF: https://frameworker.files.wordpress.com/2008/05/alan-kay-com...

Republished as “Computer Software” (1984). pp. 6–7. PDF: http://home.cc.gatech.edu/guzdial/uploads/130/Alan%20Kay%20-...

The issue isn't that you can't program in excel the issue is that excel lacks many of the safeguards that have been built around programming in the last few years.

Version control for excel is terrible at best. Code review is basically non-existent and how could it be when logic is split over several cells and variables are named $A17 or B23.

I think the win with excel is its super simple interface, that said there is plenty of room to improve on the programming language interface.

> variables are named $A17 or B23.

Excel lets you name tables, cell ranges, or individual cells. I have seen spreadsheets put together by people who take the time to do this, and they are remarkably readable!

Way back before they added visual basic into it Excel had what was called a Macro language that was also built in sheets - just slightly different appearance (very wide cells) and altered functionality (to prevent you from doing numerical operations to your macro code). I think this was around version 4?

Anyway, it basically worked like a stack, with the currently selected cell being the instruction pointer. So when debugging you could either look at your data and watch blocks of numbers flashing in and out of existence or changing state, or look at the macro sheet and watch the pointer bounce around as it went to different branches or sat in loops or whatever. There were commands to turn screen updates on and off (which sped things up considerably) so as you got more and more of it working it would go from looking like some confusing nightmare to a smooth list of progress bars and passed sanity checks.

They moved to VBS because programmers were finding the cell model too restrictive/weird, and for integration with Access, but doing it all in cells was remarkably and pleasantly like assembly-level debugging. You'd hack something up quickly, figure out your algorithm, then name the ranges and use and index value, and you could lock ranges and what-all else. It wasn't as easy to read as a page of code (eg no whitespace) but the crudity of it pushed you towards that sort of 1-operation-per-line discipline that assembler demands and really forced you to think about semantic efficiency. I had macros consisting of thousands of cells of code with every single operation commented. You could, in turn, name the ranges that had macro code in them and do code profiling to find out where your performance bottlenecks were, and so on.

If Excel is even a small part of your job, it is worth taking an hour of your time to watch Joel Spolsky's "You Suck at Excel": https://www.youtube.com/watch?v=0nbkaYsR94c

Some HN comments here: https://news.ycombinator.com/item?id=12448545

TIL, thanks for letting me know

Completely agree. I think simply enforcing a version control would be sufficient to solve most Excel woes, however. It doesn't solve manual user error (a la Reinhart-Rogoff), but it's a good next step.

M$ft XL developer here. I attended a conference to meet our Excel guru users and it's fairly impressive what these non-coders achieve. Excel is like an operating system to them, and they even create Excel files to "debug" there other Excel files, e.g. debugging the order of formula calculation and validating the output at each step in their property assets spreadsheets - without using any VBA

MS access is often a way better tool for what people do. The problem is that it requires a bit more work up front, a little more abstraction, and usually some programming. Having said that, most companies have people in IT that can help out something like that together for people but that's not their job.

> it requires a bit more work up front

Because of this "bit more work", it's also much less flexible in the long run.

Excel is not a great database, but it's a highly flexible one.

Does access have a spreadsheet view for bulk editing?

Yes, it’s called the datasheet view.

Access does have a datasheet view, but it is _much_ less powerful than Excel e.g. copy/paste multiple cells, fill down, and it is SLOWER.

In a database you don't "fill down". You build a query to generate what you want (enter the expressions once) and click on the "data sheet" view for the query. It's different, but the simplicity of referencing $A$5 in excel and using whatever cells you want as inputs is way simpler than any way you might take user inputs in Access.

Excel totally totally missed the wonderful opportunity to do so much better than Google Spreadsheets. They could have created a centralized shared way of creating, editing, sharing and storing data. Their limitations as it exists today are ludicrous at best - 10 max users? Give companies a way to keep the data in a centrally managed place with access control and auditing and users the ability to freely share their creations (and data)...

> few of the people using it realize they're programming

An idea I've been toying with for a few months now is an automation app (read: window manager) whose UI is somewhere between Excel and GameMaker Studio's drag & drop UI. Like where you just have a giant grid/canvas, and where you can define "routines" and "variables" as groups of cells. If I could figure it out, I think it would open up the idea of an automation app to a broader scope, to people who aren't programmers, and I might be able to make some money off it. Especially considering I made like $500 in donations for Hydra back when I was doing more WM stuff for Mac, so there seems to be some excitement for the idea of programming your Mac, but perhaps it could be a living if I can figure out how to reach out to power users who aren't programmers. And it was your comment that gave me the idea of making it more like a grid instead of more like GameMaker. So, thanks!

I think that excel is one of the best applications on the market.

Every single department in any company can benefit from knowing it well.

The problem is getting people comfortable and competent to know it well.

This is where I think google sheets fucking failed: they should have made an effort to make sure every single gmail user had a tutorial to know how to empower their curiosities and interests via google sheets.

I found this just the other day: A User-Centred Approach to Functions in Excel [PDF][2003]: https://www.microsoft.com/en-us/research/wp-content/uploads/...

Honestly the article stinks like an advertorial of cloud vendors trying to push their wares. I agree that Excel isn't a good fit for many things, but blanket rules like 'no excel' seem counterproductive.

Why do we readers take these sorts of things at face value? You know the whole article is a sales pitch.

I suppose you either understand how PR works and simply disregard anything printed or broadcast and do your own research, and accept this system unlikely to change...

Or you get irritated by it, while waiting for the "inevitable" collapse of this promotional/propaganda model...

Honestly I don't know what is happening, but it seemed like the first option, which I've followed for a couple of decades is a good enough cure for disillusionment and depression. (Caused by investing in the second option)

Edit: add quotes to inevitable.

Every excel spreadsheet is an error waiting to happen.

Implicit & unexplained business rules, no tests, no version control, unlabeled cell references, hardwired variables, everything is global, no encapsulation, no logs and no documentation.

Even assembly code is easier read and maintain than a complicated excel workbook.

Functional programming before it was fashionable!

This is how I like to explain functional programming to developers today. When you say "Functional Programming", a lot of devs think "scary complicated thing". When you say "Basic excel spreadsheets" they go "anyone can do that!".

Excel is a very limited form of functional programming. It's sort of SIMD and it eschews side effects, but it has no first class functions which is extremely frustrated and -non-functional

Perhaps a better term is declarative?

Excel is the Swiss Army Knife application - it isn't the best tool, but it's the most versatile and the one you have with you. Good insight about the relation to programming.

In the book "A Small Matter of Programming"[0] (1993), Bonnie A. Nardi makes the case that there are really only two successful end user programming systems - spreadsheets and CAD.

I think that's just as true today, at least as far as business software goes.

0 - https://mitpress.mit.edu/books/small-matter-programming

Compare "Improving the world’s most popular functional language: user-defined functions in Excel" https://www.microsoft.com/en-us/research/publication/improvi...

UDFs are the real LPT. Much of gnarly spreadsheeting going around has to do with having to simulate loops with rows or just partition very large formulas (think option pricing formulas) into partial result cells.

Could you explain your first sentence? The acronyms are a bit dense.

Edit: UDF is user defined functions. What's LPT?

Many of the larger businesses I have seen(not too many) have a disconnect between the subject matter experts that use Excel to codify their knowledge and the developers. The program(Excel workbook) often isn't big enough to warrant the dev costs when it works "now" but is too important not to in the long run.

I do think this is a tooling issue that could generate code based on Excel workbooks.

I agree that mission critical data needs to get out of it and into a centralized system, but I still feel Excel is an incredible tool in any business.

You would have thought this would be obvious- it seems analogous to perl/python or other such Swiss army knife tools we use for quick one-off solutions that occasionally get pressed into permanent formal use where they shouldn't.

Python (and probably Perl too) work fine for permanent formal use as long as they are engineered as such. I have seen clear, well structured permanent code bases in python. The problem is just that people like coding one off prototypes in python, and those end up in prod; it's just as bad as when Java prototypes end up in prod, but who actually wants to write prototypes in Java, if given the choice?

Couldn't you argue the same thing about excel? It does not lend itself to well engineered permanent formal use, but you certainly could build a robust system with it.

I agree - it's also really useful as a prototyping language, since its accessibility allows domain experts to really make something that suits their workflow. My current company (Parsley, www.parsleycooks.com) would have been a lot harder to do without looking at the Excel sheets that the more tech-savvy and organized chefs use for logistics work.

> I would argue that the most commonly used programming language is Excel.

Anytime another programmer asks me what language they should learn I tell them Excel and they think I'm joking. Besides being a quick and dirty way to solve a variety of problems it helps you speak in the language of your users (if you work in B2B)

> I would argue that the most commonly used programming language is Excel. But few of the people using it realize they're programming.

Indeed, I'd also argue that the most commonly used database is Excel. Although few people using it realize they're designing schema.

> reactive data programming model

Can we say "one way dataflow constraint system"? They're even called "spreadsheet constraints"...

Agreed! The business critical solutions (!) you can build quickly with Excel that even non-technical people can understand is the real MVP.

"[Excel's non-VBA interface is] a brilliant reactive data programming model.."

Also known as functional programming

Excel remains a tool.

It can be a really powerful one I agree. It's simple to use, a lot of people knows how to handle it. And it's probably one of the best tools to do some quick data analysis.

However, like every tool, it has its pros & cons. One of the main cons, IMHO, is that data tends to be mixed with processing a lot (but it also simplify usage, so...).

Overall spreadsheets have been really helpful. Retrospectively, there is a good reason why it triggered the personal computers boom in the 80ies. They are good adhoc tools that can help lot. And Excel is the dominant one.

However, there are tons of horror stories around Excel, worst case scenario, it has lead to wrong decisions affecting the world economy.

I've personally a few horror stories (fortunately, not affecting the world economy).

At my last job, we used an internal spreadsheet to evaluate costs and margins when answering contract bids. This spreadsheet was a monstrosity taking ages to compute and it was a nightmare to manage, sync, and share between all the participants.

I've also seen the traditional "on what did you work last week" spreadsheet, quite annoying to update, even more given that my workstation is under Linux...

I've also seen spreadsheets as specification documents, generating an xml file, which in turn, through some xslt, was used to generate C++/C++ headers files describing an interface between two sub-systems. The concept (automating spec to code translation) was really interesting. However, I don't know about you, but I personally hate to have MS Office in the tool chain used to compile a piece of software, specially when it targets Linux and only Linux...

But those are not the most terrifying spreadsheets I've seen. The most terrifying on was a giant spreadsheet basically managing every aspects of a huge project (thousands of piece of equipment, hundreds of sites). This document was so big it took several hours just to open it. It was like a giant CMDB of all the configuration items, and, trust me there were a lot. But it also did more, a lot more. It was generating configuration files for applications, background images for monitoring maps, firewall rules... It was also a tool that helped architect and size the number of equipment required for a given site. I probably don't know the full extent of its functionalities. Just to give you an idea of its complexity, at one point, even the number of cells in a whole excel document was reached (with Office 2003 however). All the auto-generated stuff were always kind of wrong and had to be fixed by hand, which made for a repetitive, annoying and error prone process. Also, there was no clear naming convention, and it was a mess to take the generated stuff and load it in our automation scripts. It was, de-facto, a huge piece of software with no versioning, tests or QA and it showed. There was only one guy behind it, basically managing the beast full time. When he was sick or on PTO, the whole project was impacted. And when he left, it was a huge mess, nobody was able to take over.

Spreadsheets are great, but past an adhoc usage, please, please, please, implement something more serious. And also, keep track of Excel usage to avoid these kinds of monsters. Often there are real needs behind these beasts, but Excel is a really horrible choice in the long run.

Related commitstrip:


> I would argue that the most commonly used programming language is Excel.

I wouldn't say Excel is a programming language - unless you are referring to VBA attached to excel. Excel is a spreadsheet application. It's no more a programming language than Microsoft Word is a programming language.

Since PowerPoint was proven to be Turing complete my bet is on it and not Excel being the most commonly used: https://www.youtube.com/watch?v=uNjxe8ShM-8

PowerPoint, while popular, has no where near the widespread use of Excel. In most businesses, PowerPoint is the sort of thing you use once a month, once a week maybe if you are doing a lot of meetings. Excel gets used daily or even hourly by members of almost every department in businesses large and small.

I would agree with OP that Excel is the most-used programming environment of all time

The phrases "avoid data silos" and "we want single source of truth" have been used repeatedly about Excel misuse/abuse/overuse for 20+ years. The WSJ story talks about the complaint but not about the underlying tension between data redundancy vs data manipulation.

Excel endures because it's the non-programmer's REPL for data. Like a Lisp REPL, using Excel is nimble and has an instant feedback loop. Click "A-Z" to sort ascending, click "Z-A" to sort descending. Move/hide/freeze columns. If one has intermediate-to-advanced skills, one can pivot the data on multiple axis. None of the centralized systems with even the slickest web GUIs can match Excel's capabilities.

That's why the centralized systems like SAP ERP, Microsoft Team Foundation, mainframe sytems, etc offer functionality to export to Excel spreadsheets. (Or they offer export to csv files -- which is 1 step removed from Excel.) With MS TFS, you can even re-import from Excel so in that case, Excel acts as a sort of offline data-entry client for the project management tool.

One middle ground between Excel data silos and the utopia of a centralized system is a shared document store like Sharepoint or Dropbox. The hierarchy would be something like:

1) Excel spreadsheets emailed around. Lots of redundancy; uncertainty of which version is canonical.

2) Excel xls file stored on Sharepoint with automatic version control. Employees must have discipline to only email links to the "one true" spreadsheet and never attach the actual xls file in the emails.

3) The data that was in Excel is migrated to centralized system and all data entry and reporting is done there. However, because the central systems are so cumbersome, this abstraction "leaks" back out to Excel because of the highly desired "export to Excel" functions! (This restarts the dreaded Excel "data silos" complaint cycle all over again.)

There isn't an IT department in the world that can develop centralized IT solutions fast enough to keep up with the ways employees slice & dice data. That's why Excel is "overused".

> There isn't an IT department in the world that can develop centralized IT solutions fast enough to keep up with the ways employees slice & dice data. That's why Excel is "overused".

It wasn't very long ago that I realized that my two decades in IT could be summed up as: making proper applications out of Excel workbooks with feeble macros. I hate Microsoft as much as the next Linux zealot, but dang if they didn't create a world in which I've made a career. Where I'm at now, I'm drowning in opportunities to ameliorate horrific Excel-based workflows, and I'm fairly confident that there's enough work out there to last me until another 20 years to retirement.

>making proper applications out of Excel workbooks

Yes, exactly.

If we're being uncharitable, we can spin Greenspun's 10th rule[1] of programming as:

>"Any sufficiently complicated Excel spreadsheet contains an ad-hoc, informally-specified, bug-ridden, slow implementation of a proper centralized database."

Or, if we're being charitable, we frame it as an internal MVP (Minimum Viable Product):

>"Spreadsheets are the internal 'mvp' that proves the business value before you build the centralized systems. When the spreadsheet becomes unmaintable spaghetti formulas and the xls email workflow crushes under its own weight, that will give the company the evidence and the confidence to spend $1 million and migrate the spreadsheet to a proper centralized database."

[1] https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

I’ve just done exactly this: replace an old xls file which was sent each month by email with a simple web based application. The app includes some nice dataviz.

But then the people keep asking “how can I export to excel”. One user was copying an html table split in 20 pages one by one to excel.

So I found a elegant and simple solution: an excel spreadsheet linked to the PostgreSQL db of the app via ODBC. Now users can just click on Refresh All in excel and get the latest data. I’ve also added an “click here to open” link in excel which opens the form with the row’s id in the web app.

Now everyone is happy: the data is kept centralized and available to everyone in real time. And users can make all the pivot tables and pivot charts that they want natively in excel.

Do not underestimate the power of ODBC with excel!

(may not work for big data though, but most business app do not contain more than a few thousand rows)

>that will give the company the evidence and the confidence to spend $1 million and migrate the spreadsheet to a proper centralized database

...along with understanding that migrating at earlier stage would cost only $100K.

Migrating too early before a business flow is properly understood could lead to that reduced cost but also a product that doesn't quite fit the actual needs.

A database is not a spreadsheet replacement. A database backed application is a spreadsheet replacement.

Henceforth known as "jasode's corollary."

How did you build that career if you don't mind me asking? I've moved on from that stuff in my early career but would love to get back into it.

None of the centralized systems with even the slickest web GUIs can match Excel's capabilities.

You can very easily use Excel as a client to a central database and get the best of both worlds.

Actually you run into the one of Excel's biggest problems when you do this, namely "Multiple versions of the truth". Here's how that happens: The central database often contains bad data. Users "fix" the bad data in their spreadsheet. Excel doesn't have the ability to write the updated data back to the database. Now everyone has their own versions of "the truth". There are products that attempt to solve this problem by adding the ability to write modified data back to the database. See for example Pebble Reports, http://pebblereports.com/features/ but this is a reporting program and does not have spreadsheet capabilities.

My first question was going to follow the poster above, and you answered it for me. I didn't realize Excel couldn't write back to the database. It shouldn't be too hard to develop an extension for that purpose, would it?

Aside from that — what about MS cloud services? I mean, maybe that's not appropriate for healthcare or certain businesses, but for others it allows there to be a single source of truth if the workflow is correctly tuned, no? I can produce a doc in my local instace, push to 365, edit remotely on the web, have somebody else edit remotely, and sync it back to my local instance without a hitch.

I use Office365 for that purpose regularly at my job. I'm a developer, but I have regular communication with editors, producers, project managers, division managers, operations managers, art directors, and so on. A lot of the information we share is tabled and needs to stay updated, and editable by each party involved. It's worked well for me in that regard.

Maybe others have had different experiences?

"I didn't realize Excel couldn't write back to the database. It shouldn't be too hard to develop an extension for that purpose, would it?"

Well, more accurately, it can write back to the database, but normal usage of Excel won't afford that result. Normal usage of Excel you have an independent value.

It is also a dubious proposition as to whether we want people to actually write back to the central database; you'd be one stray macro away from disaster. You sorta want it if someone manually corrects something, but not if they operate on it programmatically, and now we're making things really complicated.

There shouldn't be any risk if you have version control.

Thanks for the insight.

> There are products that attempt to solve this problem by adding the ability to write modified data back to the database.

Another company/product that I know about (though it's been over a decade since I last touched it) that attempted to fix this issue was Actuate:


Hmm - and now that I've started digging into it, I see that this company has radically changed from what I knew of it. I had known it as a company which created software for designing and running reports from various data sources (whatever could feed it, honestly - excel, odbc, csv files, etc), which internally could be configured for more advanced usage using a language that had similarities to Visual Basic (it was OO long before VB, and geared for the reporting system, not general purpose usage). Before I left my employer who used it, they had just announced a Java-based system of "business objects" and some kind of tool to interact with data in an Excel-like manner (the tool essentially had all of the functionality of Excel at the time - including pivot tables), where changes would propagate thru the system. Plus interaction could only be done thru those objects, which were defined as "views" into the dataset.

So - after research this (just now!) - imagine my surprise at finding that this company now is named:


...and they also did something called "BIRT" - which was open source and part of Eclipse:


Apparently this started in 2004 (not long after I left my employer - and that it is Java-based also fits).

I'm not sure what is what with the company any longer? If I had to guess (probably wrongly), they spun off their Actuate reporting stuff and the other tools as open source (?), and are now concentrating on other things, possibly in the same vein?

Weird. But very interesting.

>easily use Excel as a client to a central database and get the best of both worlds.

Yes it seems that way but many spreadsheets exist because the centralized system is an incomplete solution.

For example, a company might have a "human resources tracking" centralized system. (E.g. Ariba, Peoplesoft, etc). It has some standard features like ingesting resumes and extracting "skills" from parsing MS Word docs, etc.

However, the company happens to use a interviewing process where the candidate must pass (1) phone interview and (2) onsite with 5 peer programmers, and (3) hiring manager.

The hiring manager wants to "capture" all of those states (phone_passed=True/False, peer interviews with 5 different numerical scores, manager_pass=True/False). The centralized system doesn't have fields that capture all that so the hiring manager creates a spreadsheet that has those extra columns. Many spreadsheets out in the wild "fill the gap" between incomplete centralized systems and custom IT programming. An adhoc xls file can be created far faster than submitting a change request to the corporate IT department to customize Ariba -- which could take months (or never) to implement.

Or spreadsheets act as the "intersection" of data between multiple systems. Again, one purpose of centralized "data warehouses" was to eliminate Excel "data silos" but again, no IT department can create data warehouse solutions (which also includes all the custom reports) as fast as someone creating spreadsheets and just emailing them around.

Having a hybrid process where Excel acts as a round-trip offline client to a central db doesn't really eliminate the fundamental reason Excel (over)use keeps multiplying like rabbits.

This also illustrates another problem: all the programmers here are assuming perfect domain knowledge even exists in the first place.

When it doesn't, Excel will work fine (and will only cause disaster in individual cases). A "proper" unit-tested solution will either not work at all or just won't support half the use cases.

And without someone specifying the domain knowledge, there's nothing any programmer, no matter how well-versed in unit testing, agile, or ... can do about it.

Can you easily integrate with externals systems via REST & SOAP? I've done some quick web searches on this topic and have never seen anything that jumped out at me.

I know Google Sheets gives you a little bit of this with Javascript embedding but I was searching for something more tightly integrated with Excel or LibreOffice.

Look on the "Get & Transform" tab

> There isn't an IT department in the world that can develop centralized IT solutions fast enough to keep up with the ways employees slice & dice data.

That's why the principal way you sell centralized systems to users isn't in the ease of manipulating results but in the ease / automation of data entry. The whole pain point and problem with Excel is in manual data entry, and having a single source of truth, even if you succeed in enforcing its property as a single source of truth, does not automatically solve disagreements between different human sources of data.

That's why tools like Grafana have interesting futures for business users, because they enforce a clear separation between data sources and data visualization, while offering a lot of control to the end user over statistical manipulations and visualizations that they can create, and keeping those visualizations updated in real time, so that they don't have to be re-prepared every time period for some new report.

"One middle ground between Excel data silos and the utopia of a centralized system is a shared document store like Sharepoint or Dropbox." ... 100% agree, and add I find that Google Sheet is the best for this role: 1 source of truth, modification history, read/write permissions and shareable. I am now starting to use spreadsheets for use cases I would have strongly objected in the past. I'm surprised Google sheet isn't more mainstream.

And an API that lets data be used in other 'real' applications

Excel endures because it's the non-programmer's REPL for data. Like a Lisp REPL, using Excel is nimble and has an instant feedback loop. Click "A-Z" to sort ascending, click "Z-A" to sort descending. Move/hide/freeze columns. If one has intermediate-to-advanced skills, one can pivot the data on multiple axis. None of the centralized systems with even the slickest web GUIs can match Excel's capabilities.

What about Business Intelligence products such as Hyperion? They can provide reports from a defined data model which gets data directly from the database. People can create their pivots and other transformations on the data fairly easily. Is their something that Excel offers that makes it much better than one of these Business Intelligence products in certain situations?

Excel offers a few things in most scenarios, primarily because that any enforced structure always implies lost flexibility, and one of the issues often quoted with Excel is the lack of enforcement of using the right structures and data sources.

A simple one: If I have been asked to do a report, but it turns out I don't have the correct access rights to some data, my manager can simply send it to me in an Excel sheet, or he can log an issue with IT to change the acess rights to a specific key/dimension/hierarchy.

The 'one true data source' move often correspond to tighter access controls when good auditing really would have been the correct solution, hence the above situation is fairly common.

Thanks very much for the info that Excel may offer least friction in a lot of circumstances, especially when ad-hoc things need to be done with data.

For qualitative data management, but numerical analysis, 1 source of truth collaboratively - I use Atlassian Jira with great success.

Got a complex work stream? Got thousand small problems that needs to be slowly figured out over 50 different main topics? Got a team of 2,50 or 1000? No problem.

1. Just set up a bunch of jira entries while tagging them with the same labels

2. set up filters on these labels.

3. Set up dashboard(s) that give a complete realtime/update to any inquiring user.

Need to report status? No problem, screenshot of the Dashboard and done.

It's really great tbh.

Alternatively, use Google Sheets :) Centralized, single source of truth spreadsheets.

Google Sheets is great and makes sense for a lot of vanilla use cases, but even in its current form, doesn't hold a candle to Excel running on a Windows machine when it comes to doing anything even slightly fancy or non-trivial.

That's not true. Like newer versions of Excel, you can use Javascript on Google Sheets now https://developers.google.com/apps-script/guides/sheets It's better than Excel's VBA.

Also Google Sheets has an API for stuff like Python.

Excel can cover many use cases with just with UI workflows without requiring its users to actually learn programming.

Only the very advanced users turn to VBA programming.

Example of fancy or non-trivial?

Google Sheets and other similar web apps replicate most if not all of the Excel features you just listed, and it's centralized and more than one person can work on it simultaneously.

> The phrases "avoid data silos" and "we want single source of truth" have been used repeatedly about Excel misuse/abuse/overuse for 20+ years.

Yep. Even if they get centralize data via RDBMs like SQL Server or even god forbid Microsoft Access, office workers are going to want the front-end of that data to be excel.

There is too much knowledge investment in excel for many office workers to want to switch.

> A year ago, Mr. Bell’s team spent hours distributing hundreds of Excel spreadsheets to regional and unit leaders each month for planning and performance tracking of the company’s 415 U.S. restaurants, he said. Now the same process takes minutes.

Then I believe it should say stop abusing Excel.

Excel can be an incredibly useful tool, but it's definitely not universally useful. If you're using it to manually collate data from 400+ sources, you're doing it wrong.


> A year ago, Mr. Bell’s team spent hours distributing hundreds of Excel spreadsheets to regional and unit leaders each month for planning and performance tracking of the company’s 415 U.S. restaurants, he said. Now the same process takes minutes.

Reads to me like, "last year Mr. Bell's team spent hours hammering in nails by smacking them with the wooden part of the hammer's handle"

I agree. Excel files prolly should not be distributed using the same method as a freshly-out-of-the-ditto-machine dictaphone memo from 1967.

And the premise that things have changed enough to warrant reevaluating worn out business processes like Mr. Bell's here is true enough.

But dumping a tool because some out of touch C-level tool says so is a recipe for huge traction loss.

I mean ... people that use nail-guns for a living still have to use hammers.

This is the point that's going to get missed a lot. In the corporate world Excel is the proverbial hammer to nearly every use case involving data manipulation, regardless of how well suited it is for the task.

Edit: I just noticed that I'm the third person to reply to you and we all referenced hammers.

I don't know that Excel was the wrong tool for this task. I'm not sure why the spreadsheets couldn't have been distributed and consumed automatically by an Excel application, or at least by helper applications that were fed by and/or into Excel sheets. An Excel spreadsheet is a good enough form; and I don't think replacing one with a webform or pdf form would be any better.

edit: my favorite thing about using excel is that instead of spitting out an answer, you can spit out an Excel sheet that shows a lot of the work itself, therefore partially explaining the answer to the people that have to consume it. The product of your program can often be another program, giving the person receiving the report the opportunity to tweak or add new parameters to your output. Basically currying accounting calculations.

That is exactly right.

Many moons ago large bank I worked at purchased a company and soon it became apparent that their entire business ran on a Lotus 1-2-3 spreadsheet with 30,000 lines of really dense, uncommented, unindented, spaghetti LotusScript code spread over 2 or 3 functions. The kicker is it would only run in the 16-bit edition of the product for whatever reason. So we had to have a VM to run it in because Windows 2000 had a problem with either Lotus 1-2-3 16 bit edition or the spreadsheet itself.

So yeah, stop abusing Excel - Lotus 1-2-3 in this case.

The core complaint of not having a single source of truth seems more about how the company handles data/filing rather than Excel itself.

Maybe I'm a bit bias having used Excel my entire career and very dependent on it, but everytime I've seen companies steer away from the desktop application with 0365/sheets/tableau its simply not as useful for adhoc data analysis and reporting.

I agree. Those are great products, if you have the staff to support and use them, but this is rarely the case. Every time I see a company implement them, they think it's going to be a silver bullet to reporting and analytics, and most of the time it just causes a mess, because no one really knows how to properly use them, or if someone does, that person doesn't have the bandwidth to keep up with the entire organization's needs. Excel is amazing, it's relatively easy to use, has immense power for dealing with calculations and graphs, and it is cheap. An organization's problem isn't excel, its the fact that they keep 20 copies of the same data and no one is keeping track of which is the source of truth.

Every email reply becomes an independent fork of the data inside the spreadsheet. The issue is storing and sending the file locally, and the solution is storing it in a centralized location. Cloud storage is often used towards this end, but on-premises file shares work very well too.

This is one of those solved problems that companies inexplicably keep working to solve again.

It's such a bizarrely difficult uphill battle. We have Office 365, with a damned good web editor, and the ability to save directly to OneDrive/Sharepoint from the desktop client, with the versioning that that supports, and we still end up with shit emailed all over and files with version-12.xlsx and -$DATE.xlsx saved here, there, and everywhere.

The web editor might be good enough today but it used to choke and wouldn’t open/render all kinds of stuff that’s normal in financial spreadsheets. Thats not even getting into what a shit show SharePoint has been over the years. Microsoft has slowly eroded trust in anything but email sharing and (if there haven’t been missteps within an org with them) file shares.


This is a key point. It's easy to branch data, but painful to merge it.

How do you even HAVE a source of truth? You email a spreadsheet to two people, and the both want to work on it... which one is the source of truth? They each have modified the original, and we want to keep both changes.

I mean, this is the problem that distributed version control was designed to solve. Emailing a spreadsheet around and calling one a source of truth doesn’t solve the simultaneous edits problem.

>its simply not as useful for adhoc data analysis and reporting.


You wanna know what's even more useful? Code notebooks.

I store everything from my sql queries to my processing pipeline to my models, and my vizualizations. Most notebooks have slide-show modes now so your presentation to business stakeholders is simple to prepare as well, and with the right plotting packages you can draw interactive charts that do all the fancy stuff. And when it's time to update the analysis, you just re-run the notebook.

But business analysts are too freaking lazy to just learn to write some python or R, and they're perfectly happy to get paid to do the same task over and over again when it comes time to making updates.

And don't get me started on source control.


Most business analysts who are pulling this kind of data have a finance background. To say that they're "too lazy" because they don't want to do their work using Python seems a bit harsh.

In practice what I've seen is that IT and finance departments don't even let business analysts write their own SQL queries for fear that they'll write bad ones that bring the database server to its knees.

>In practice what I've seen is that IT and finance departments don't even let business analysts write their own SQL queries for fear that they'll write bad ones that bring the database server to its knees

I'll take data warehouses for 600, Alex.

I use jupyter a good amount, but even in a low user environment there's a big worry about version control and code sharing/review

I know there's " automatically check stuff into git" sort of models but would really like an integrated Jupyter + GitHub + Google Drive system so that it's easier to track what's happening and to find older versions of stuff.

Oh yes. I have tried to introduce jupyter notebooks with pandas to data analysts so many times, but they always want to do it in excel.

Do you have any way you'd recommend of getting multiple users working together with code notebooks? I've messed around with Jupyter, but that's about it.

We usually just send .ipynb files around because of the stuff we do, we all usually have the same libraries.

Installing a notebook server is an option though. IBM will be happy to let you pay to have them help, or you can do it the sane way with continuum analytics.


I happen to use and advocate for R/RStudio for ad-hoc work (I spin it as "excel on steroids" when I am trying to convince someone). But it is hard to pull people away from something that has worked for them.

The alternatives to excel in the past have been really, really unpleasant to use. A now EOL product from Oracle comes to mind. It is called "Discoverer"-- that's right, "Discover" + "er". It is a turgid POS requiring extreme suffering and handholding/permission granting from your local friendly DBA-- all this to produce 90's era bar charts that are all but impossible to tweak.

IMO if you really want non-tech users to become more technical and use DBs as single source of truth, use KNIME. It still needs proper reporting addons though, but you can export data, even to tableau.

I saw a presentation about KNIME but was unfamiliar with many of the processes that it allows one to string together to build a workflow, so the notion of building a workflow myself seemed very daunting. Do you think it would allow one to build workflows for basic stuff and gradually grow in their confidence with it?

The presentation mentioned that the company is/was working on reporting add-ons, I think.

Oh yeah, I'm pretty dumb and I managed to do very complicated stuff. It seems a bit tricky to "get" at the beginning, but it's pretty easy.

The trick is to learn what node does what, because there are lots of them. You can also download community nodes for additional behaviors. Once you learn the basics, it's pretty straightforward.

I learnt by trial and error, looking for examples and info about nodes, and now I have some workflows with data from DBs, files, APIs and all at once.

Totally agreed. Especially when the alternative is abusing an ERP for ad hoc reports or using a journey map approach that captures the top X use cases, but misses what is actually needed.

Agreed. Excel users should learn more about importing data from other sources and processing that. Excel itself should make it even more obvious.

I'm not sure of the origin, but I was once told a truism that any speccing process, if left long enough, will eventually spec out Excel.

This is NOT about "inefficiencies", "data security" or whatever.

In big biz, the "finance chiefs" (or other persons with purchase authority) usually get pampered by vendors with expensive business trips or whatever... especially at the scales that require SAP or Oracle solutions. So this likely means: the CFOs bought some expensive software that promises to replace Excel without any due diligence, the staff looks at the software and decides it's crap and continues to use Excel => CFO gets mad because no one uses the expensive software.

If you don't believe me, just ask yourselves how Oracle, IBM and Accidenture are still in business despite numerous highly expensive and publicized failures. It's all due to CxO pampering (in addition to a bit of vendor lock in).

This 1000%. The companies you mentioned don’t innovate on tech, they innovate on sucking up to executives and convincing them to buy things that the executive’s own staff recommends against. They have extraordinarily well-oiled sales machinery that deftly bypasses any kind of sanity checking inside the customer’s company and devote all their efforts to keeping it running.

http://bad.solutions is one of my favorite providers of Enterprise-grade software that is not remotely usable.

What's Accidenture?

"Nickname" for Accenture, an IT consulting/bodyshop famous for delivering stuff at waaay less than acceptable quality (or not delivering at all). In addition they're also known for exploiting their employees.

See also http://exposingevilempire.com/.

Is it different from other giant IT services firms?

No, but Accidenture is the most known of the bunch. And the other IT "service" companies, well, guess how they get their contracts...

Likely Accenture: https://www.accenture.com

Not sure if the misspelling is "accidental" or on purpose, in which case both it and this sentence are puns.

He probably means Accenture.


A hilarious pun at Accenture's expense.

More like expense of their feed^W clients.

Registration is open for Startup School 2019. Classes start July 22nd.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact