I've learned products designed to replace spreadsheets have a huge hurdle because the people who use spreadsheets treat operating the sheet as their job. Replacing them removes their autonomy and control over an information process, and subsumes the value they bring to their employers - so they will resist products that threaten that. Excel is a complete management subculture.
The other advice I give is that if you are generating analytics, have a Power BI connector of some kind, because the people who make decisions (managers, etc.) make them based on Power BI, and not from an interface that their staff use as peers and likely control. In enterprise, they want data in metrics their staff can't see, hence a separate tool.
Spreadsheets will always be with us, I think. The opportunity may be in creating one that has sufficient work-alike features with legacy ones, plus new power features (Python, etc.), where there is a connector between the high-power open development environment and the familiar Excel ones managers use. The key thing is not asking managers or senior employees to change.
It's not about autonomy in an abstract sense. It's the fact that regular people can customize, hack, modify, automate, and otherwise program the spreadsheet. Data analysts and managers are more than trained monkeys; they actually need to do those kinds of things (at least sometimes) in order to do their jobs.
I agree that the ideal world is one where you can connect the spreadsheets to other data sources, so you get the best of both.
This is a double-edged sword, because often there is no one validating the results of the spreadsheets...
"The number looks ok" is not a good validation, and there have been some very public data errors as a result of bad spreadsheets.
I've often wondered what would happen if, in the average business, even 10% of the spreadsheets were actually audited... I suspect the results would be rather shocking.
I'm not sure if I should be proud of this or not, but once I had some pretty critical field values in an Excel spreadsheet that I had to make sure were compatible with my code. I ended up adding some derived formulas in a new locked sheet within the file, committed the XLSX to Git and then created a JUnit test to make sure everything was in sync. In a nutshell, the XLSX became the source of truth.
Yup, and this is frequently a total disaster. Like, hundreds of thousands of dollars get lost either in spreadsheet errors or by paying the salaries of people who manage spreadsheets when their jobs could probably be replaced by a much better solution.
But, whatever - more work for everyone! Certainly suits me. It's like wanting the average person to keep programming C because you're in infosec.
Indeed, and I think this is the real problem with non-programmer/data people working with spreadsheets. The problem isn't the spreadsheet interface itself. The problem is that people are doing what amounts to engineering without any training in engineering and no sense of best practices.
> I've learned products designed to replace spreadsheets have a huge hurdle because the people who use spreadsheets treat operating the sheet as their job. Replacing them removes their autonomy and control over an information process, and subsumes the value they bring to their employers - so they will resist products that threaten that. Excel is a complete management subculture.
Spreadsheets are better because, as you say, the owner treats maintaining it as their job. If you replace it with an IT process, the new "owner" is likely a below-average developer who is probably uninterested in the business. A few years down the road, the usefulness of the replacement will suffer.
> Spreadsheets are better because, as you say, the owner treats maintaining it as their job. If you replace it with an IT process, the new "owner" is likely a below-average developer who is probably uninterested in the business.
I disagree. Spreadsheets are incomparably worse, because what you charitably describe as "the owner treats maintaining it as their job" plays out in real life as a single employee who abused a first-mover advantage to monopolize, exert undue control over, and even hijack key operational areas.
We've all heard horror stories of how employees screwed over their former bosses because only they had control over things like key spreadsheets. Advocating for spreadsheets is advocating for these vulnerabilities.
I have heard these horror stories so many times but never witnessed them in reality. From my point of view, excel is a great way to ensure knowledge is not hidden, as you have a file format that embeds calculations, data and outputs. It can be ugly, but nothing that a seasoned excel warrior cannot parse and there’s plenty of them around. Now if someone as an employer does not even have copies of the files, they have bigger problems than Excel itself.
The other horror stories, of errors in spreadsheets: those, yes, I have witnessed regularly.
> Advocating for spreadsheets is advocating for these vulnerabilities.
Spreadsheets will exist regardless of what developers think of them. Ironically, that's a good thing.
Many projects that put food on devs' tables started out as out-of-control Excel monstrosities that were created and operated for long spans of time by well-meaning and productive folks. They start as simple manual spreadsheets with some formulas and then evolve into much more involved beasts. Work gets done and it's all nicely contained in somebody's cube and they look good and can be rightfully proud of their accomplishment.
Things just get done. For a while. Sometimes a LONG while. Until the bitter realities that software developers have learned to deal with over the decades start to seep into these projects and drown the unwitting folks who created them, slowly but surely, like an ever-increasing number of small holes in the bottom of a boat. That's when things break or become unmanageable, and that's when developers start getting engaged, assuming these Excel masterpieces have actually become mission-critical.
There's a guy at work that operates one of these Excel monstrosities. It's been going for ~7 years now. It's a monster Excel spreadsheet that, among an ever-growing list of things, does dubious probabilistic forecasting of future POs based on shit ripped from Salesforce (not even using the API). He has a dedicated laptop behind him pulling in data from multiple sources, like clockwork, and has recently started making attractive Power BI dashboards using his Excel worksheets as the data sources. And you know what? He looks GOOD to the people that matter. Does the forecasting actually work? Not really, but being so immersed in all that data has made him knowledgeable about many details of the operation. He's able to keep track of costs and stay on top of things. It doesn't matter (to him) that the whole thing will vaporize when he leaves, or that he could tire of it and just foist it upon some hapless supply-chain person who's just learned to use formulas in Excel.
Most forecasting doesn't really work, as in, most falls between "not useful for predicting the future" and "outright wrong." You could get your forecasts for free or you could pay millions of dollars, pretty much same result. Not an Excel thing in any way.
Indeed, the spreadsheet I am thinking of is particularly hot garbage, but slap some dorky corporate bar charts on there and it's like beer goggles for suits.
But why are you associating that with Excel? Every analytics tool features charts front and center regardless of how it is delivered, web, BI, dashboard, Jupyter, etc, and no matter what language it is written in (R: ggplot, Py: matplotlib, JS: highcharts etc.).
The guy's spreadsheet seems to work. He's delivering what his bosses want to see. You might have an issue with the final output but they apparently don't. What exactly is the problem you think you can fix?
I am on the side of excel being used like this, even if it's hot-garbage. The worst that can happen is that it collapses upon itself and then others need to come in and do it right, or migrate the thing to something else entirely.
A counter point to this is that spreadsheets bottle-neck the sharing of data and introduce data cleaning issues.
Most spreadsheets are built with the mindset that it is the end of the dataflow. However, at some point, this data needs to be shared forward. This might not be the original intention, but the more important the report is, the more important downstream use-cases become.
This is when spreadsheets become problematic. One can say that it's the owner's job to keep it compatible, but thinking of keeping it compatible isn't what normal spreadsheet users do.
Some issues I've seen:
* One can easily add a column or rename one. This breaks any data sharing, because downstream reports now break. (In many cases, the data could have been added as a row instead of a column: a new status code, etc.)
* Data types are not enforced. Nothing prevents entering text into what should be a number, or even creating a completely new status code. Again, automation downstream breaks.
* Important info is usually not included. The spreadsheet is the latest representation of the data, so in many cases, attributes like the time-period (because it's implied) and unique identifiers (skus most frequently) aren't included.
* Maintaining compatible dimensions across different domains is not a priority for a spreadsheet owner. Finance may group countries differently than Supply-chain, which means they'll always see different numbers and argue that their number is correct.
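To make the first two bullets concrete, here is a minimal sketch of the kind of check a downstream consumer could run on a sheet exported as CSV. The column names, the allowed status codes, and the `validate_export` helper are all made up for illustration; a real report would need its own contract.

```python
import csv
import io

# Hypothetical contract for one exported report; every name here is illustrative.
EXPECTED_COLUMNS = ["sku", "period", "status", "units"]
ALLOWED_STATUS = {"open", "shipped", "cancelled"}


def validate_export(csv_text):
    """Return a list of human-readable problems found in an exported sheet."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    if header != EXPECTED_COLUMNS:
        # A renamed or added column is reported once; row checks would only add noise.
        return [f"header mismatch: expected {EXPECTED_COLUMNS}, got {header}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if not row["sku"]:
            problems.append(f"line {line_no}: missing sku")
        if row["status"] not in ALLOWED_STATUS:
            problems.append(f"line {line_no}: unknown status {row['status']!r}")
        try:
            int(row["units"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: units is not a whole number: {row['units']!r}")
    return problems
```

Nothing in this stops the spreadsheet owner from making the change; it only makes the breakage visible at the hand-off point instead of three reports downstream.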
Source: work at a Fortune 500 company, that has way too many excel reports (with critical performance metrics) and combining them to get an accurate view of the company performance is very labor-intensive and error-prone.
Excel can be replaced, but it won't be replaced by the current crop of VC-funded SaaS.
> Airtable really ought to be killing Excel, but the SaaS model combined with a stupidly low artificial row count limit (over 50000 rows is listed as "contact us for pricing") means that it will never achieve penetration into weird and wonderful use cases like Excel has.
Many Excel processes are 20+ years old. No SaaS could replace the stability and pricing.
The ability to email a spreadsheet and have it just work on the other side (and become the recipient's copy, which they can fork and "own") is huge. In a SaaS situation, you have to solve IAM and security vs. leveraging what users already have as a sunk cost in email and Windows.
When we think of document based workflows as a problem (vs. say just the data/info), we tend to think of them as inefficient and prone to duplication, forking, editing, versioning problems - but I'd argue these are valuable features because they create levers for managing. Maybe I've spent too much time staring into the enterprise abyss and this is the inner deadness of a consultant speaking, but what documents facilitate (e.g. MS Office) is flexibility of ownership, provenance, authenticity, sources of truth, authority, and other qualities.
When you solve a problem, it becomes inert, there is nothing about it to manage anymore, which means someone can't extract value from it, and that's value destruction to them. SaaS problematizes these document features and then "solves" them, which in fact just constrains managers by concretizing data and workflows instead of being a tool that provides some data that ultimately supports a narrative conversation without being a forcing function on a dynamic of ongoing "problems" that is producing value for the business.
I'd suggest this is the quiet part your SaaS prospect customers can't say out loud, because managing isn't solving problems, it's extracting value from them, and using tech to collapse dynamics that are producing value is anti-value from that perspective.
Another way to see the same problem is that the IT system replacing the spreadsheet will take a long time to build and will replicate the process in the narrowest possible way. Then something changes, and updating the system becomes an uphill battle that takes years of fighting for budget and prioritisation, so you have to revert to spreadsheets.
Plus the gap between a spreadsheet and an application can be huge in a large organization. No one needs a permit-to-build process or ci/cd pipeline to start a new spreadsheet and make changes to it.
> I've learned products designed to replace spreadsheets have a huge hurdle because the people who use spreadsheets treat operating the sheet as their job.
In general? Not really. Just as frequently, the fancy tool promoters don't care to understand the subtleties of the job and when it requires flexibility or judgment that the spreadsheet accommodates better. They have their hammer -- software formally engineered by software experts for disempowered "users" -- and everything looks like a nail.
"It's faster and more reliable (when everything goes as planned)" isn't really the slam dunk these folks think it is.
Give these users a more flexible tool like Alteryx, that actually lets them do their job, and I've seen that they'll happily migrate off of Excel.
I agree. One of the great things about Excel is that it’s massively flexible. It’s almost the antithesis of what most development environments want to be. Programming is about considering all the code paths. Spreadsheets are about “what if”.
The flip side of this is that understanding the behaviour of a spreadsheet is generally a specialist job, which is why we have people whose job it is to “run” the spreadsheet. The spreadsheet has rules and boundaries and it will stop working if you just start plugging random values into formula cells.
I would add that any tool looking to replace Excel will be built on some very powerful but very primitive foundations, in order to compete with that flexibility. It’ll never be about adding a special “view” like AirTable or just tacking on Python.
Excel is like Emacs; most users will write some Elisp at some point, it’s designed to be meddled with from the ground up. AirTable is the VSCode; most users will never write a line of plugin code and when you do you’ll find you can’t extend much.
> the people who use spreadsheets treat operating the sheet as their job
Yes, observed the same. And "The new tool is faster and more reliable!" does not help either. They got their workflow and cope with it - for years.
The only time I saw people adopt new tools (in banking) was when they could deliver their assignments to their superiors much faster. They have to see a personal advantage.
Spreadsheets are malleable by design. Being able to modify how it functions on the fly is an enormous power for these users. And the better/faster/smarter SaaS replacement is also extremely rigid and inflexible. So while it works to replace the current version of their god spreadsheet, it can’t adjust on the fly as the user’s needs change like the spreadsheet can.
I’m convinced that there does exist a “better spreadsheet” that treats power users as exactly that and incorporates things from the software engineering world like version control, modularity, reusability, shareability, etc., and that it hasn’t been built yet.
I’m trying my best to create such a product. What you’re describing is something that (1) amateur spreadsheeters, (2) power users, and (3) programmers can all use without feeling out of their depth or patronized. I personally think that given that Excel is a pure-ish declarative language, building on a declarative language that programmers use is a solid attack on this, which is why I’ve chosen pure functional programming as a basis for my work. It’s been an enjoyable design process to include or throw out ideas that would alienate either end of the spectrum.
Elsewhere: Making all cells content-addressed (CAS) gives you a strong foundation for version control, modularity, resizability and share-ability.
Happy to share more thoughts on this. I’m two years into the process.
I realized I left that open-ended, but I’ll touch on a few points that I think are relevant here:
* There’s a language which is like Haskell/Elm/PureScript in terms of being purely functional and statically typed. But with syntax that looks more like Excel.
* Purity gets us fearless recalculation.
* Static types let us build UI elements automatically based on the inferred types of code.
* It’s content-addressable like Unison. That means every expression and “cell” has a unique SHA512 hash of it which refers to only that expression.
* Content addressability makes cache invalidation of results trivial.
* It also makes it easy to say “I want exactly this version of that person’s cell and for all time.” Makes it impossible to break someone else’s code once it’s working.
* It also lets you fearlessly federate, if ever needed.
* Content addressed also means you can write tests against code and have them run on every change. Only the tests whose dependencies changed will be rerun. That’s not normal in Python or Haskell, but in a spreadsheet it is.
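As a rough illustration of the content-addressing bullets, here is a toy sketch in Python. It is not the commenter's design or Unison's: the `Sheet` class, the choice to hash the formula text together with the hashes of the cells it reads, and the cell names are all assumptions made for the example.

```python
import hashlib


def cell_hash(formula, dep_hashes):
    """SHA-512 over the formula text plus the hashes of the cells it reads (assumed scheme)."""
    h = hashlib.sha512()
    h.update(formula.encode("utf-8"))
    for name in sorted(dep_hashes):
        h.update(b"\0" + name.encode("utf-8") + b"=" + dep_hashes[name].encode("ascii"))
    return h.hexdigest()


class Sheet:
    def __init__(self):
        self.cells = {}       # name -> (function, formula text, dependency names)
        self.cache = {}       # content hash -> computed value
        self.evaluations = 0  # counts real computations, not cache hits

    def define(self, name, formula, deps, fn):
        self.cells[name] = (fn, formula, tuple(deps))

    def hash_of(self, name):
        _, formula, deps = self.cells[name]
        return cell_hash(formula, {d: self.hash_of(d) for d in deps})

    def value(self, name):
        key = self.hash_of(name)
        if key not in self.cache:
            fn, _, deps = self.cells[name]
            self.cache[key] = fn(*(self.value(d) for d in deps))
            self.evaluations += 1
        return self.cache[key]


sheet = Sheet()
sheet.define("price", "10", [], lambda: 10)
sheet.define("qty", "3", [], lambda: 3)
sheet.define("total", "price * qty", ["price", "qty"], lambda p, q: p * q)

first = sheet.value("total")    # computes price, qty and total
price_hash = sheet.hash_of("price")
sheet.define("qty", "4", [], lambda: 4)
second = sheet.value("total")   # recomputes qty and total; price is a cache hit
```

Because a cell's hash changes only when its own formula or something upstream changes, "is the cached result still valid?" reduces to a dictionary lookup. Test results could be keyed by the same hashes, which is one way the "only rerun affected tests" bullet might work.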
There are other design choices related to your comment but I don’t want to ramble on.
Truly Excel could be improved in a vast number of ways. But would anyone use it? Excel is a kind of product that is perfectly understandable to the average business person.
The more complicated you make it, the more it becomes like real software development. And that is a skill that most people don’t possess.
Excel is ridiculously complicated. I think you're vastly underestimating the ability of these "average business people" because they don't know how "real software development" works.
I think there is scope to replace Excel, but it’s hard. You’re seeing a gradual spread of “data science” tools in finance, as the more technical analysts start to use Python over Excel. But as you say, the management layer still uses Excel to poke and prod a model.
I think that there will be a generational shift here - you are not going to train an SVP to use Python, but the next generation of SVPs might have more exposure and be willing to use Numpy in a Jupyter notebook.
And in the other direction - there is definitely scope to come up with an “Excel isomorphic” Python framework for data science. It’s fairly easy to generate an Excel sheet from Python computations, but maintaining bi-directionality is Hard, and would require restrictions on the Excel side. I think with the right UI, you could do this though.
I have a colleague who is a PhD in Applied Math. Pretty bright guy, huge Python/Jupyter lover with many years of experience. Loves to use git, loves to write dozens of unit tests. A true Man of the Future, according to Excel haters.
He wrote some Python to solve some mildly complex business problem. I told him to translate it into Excel for stakeholders. He did, and the answer came out completely different.
It turned out he had made multiple catastrophic errors in the Python. This is not the first, second, or third time this has happened.
Python, and tech-beyond-Excel in general, just isn't the silver bullet software types often seem to think it is. Even experts sometimes seem to do a worse job in it than in Excel.
I love both spreadsheets and Python/Jupyter. But Excel isn't the silver bullet either. The reason the error was caught (and the real lesson) has more to do with the fact that he tried to reproduce his work than with a particular technology stack.
I think Excel has something to do with it. In Excel, you're usually forced to do computations in small steps and look at the intermediate results. In traditional coding, you don't see anything but the final result (unless you ask to see it). It's much easier to assume everything is working as intended when it's not.
Excel has its problems too. Different tools for different jobs. Tech boosters need to understand this and not just cynically assume that spreadsheet lovers are old fogeys who are afraid of their jobs being automated away.
Certainly if you do traditional TDD or something like REPL-driven development, you see the intermediate results and validate the correctness of your code as you go.
Those are great! But they are coding disciplines one must choose to follow, and continue following every step of the way. You aren't inherently following them because that's how the tool works. Sometimes the tool-enforced discipline has advantages.
How would one write tests for a complex, mission-critical Excel spreadsheet? Or use version control?
Spreadsheets often mix the data and code/formulas and the formulas are hidden behind the sheet view and sprinkled across many cells. At least Python scripts separate the code from the data so you can write tests using known good or fuzzing data. And you can use version control to track and review code changes.
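For what that looks like in practice, here is a small sketch: one formula pulled out into a function, checked against hand-computed values and against randomized inputs. The `gross_margin` formula and the property being checked are invented for illustration.

```python
import random


def gross_margin(revenue, cost):
    """Margin as a fraction of revenue; the kind of formula often buried in a cell."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost) / revenue


def check_known_good():
    # Hand-computed cases act as the trusted reference values.
    assert gross_margin(200.0, 150.0) == 0.25
    assert gross_margin(100.0, 100.0) == 0.0


def check_random_inputs(trials=1000, seed=0):
    # Crude fuzzing: a property that must hold for any cost between 0 and revenue.
    rng = random.Random(seed)
    for _ in range(trials):
        revenue = rng.uniform(1.0, 1e6)
        cost = rng.uniform(0.0, revenue)
        assert 0.0 <= gross_margin(revenue, cost) <= 1.0
```

Since the function lives in a plain text file, a change to it shows up as a reviewable diff in version control, which is the other half of the question above.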
I agree that, in the hands of a hypothetical ideal person, Python should be better than Excel for almost everything. But as my story above shows, even people you'd expect to be experts, seemingly following best practice, don't end up being very close to this ideal person.
Honestly, your story just indicates that someone senior needs to have a discussion with the applied mathematician about their coding practices (and they may even need a bit of training in that regard).
His code style is fine. His problem is that he is short on the discipline and focus to validate his results in meaningful ways. An exhortation to follow some vague "coding practices" won't fix that - he'll respond e.g. by writing lots of tests, none of which catch the problem his code actually has.
The two ways I know to get correct work out of him are (1) review it and kick it back to him when problems are found, (2) have him implement what he's trying to do in Excel.
So I was mainly a Haskell programmer in grad school before getting into scientific computing, and the main good habit I picked up was working at the REPL and breaking my code up into small, testable functions/components, and thinking very hard about state. The main bad habit I had was expecting the Julia compiler to be as helpful as GHC when refactoring code.
While I was writing my thesis and job hunting, I attended a few workshops aimed at “grad students breaking into industry”. The main thing I noticed from applied mathematics students in particular was they would write out long functions (really hard to debug) or they worked exclusively in Jupyter notebooks (these have super complicated state, so it takes a lot of discipline to be able to translate these into usable code).
Sometimes it's hard to think of the right tests, especially when you're solving a mathematical model that hasn't been solved before.
Even for things that are intuitive and have been implemented thousands of times before, like web logins and shopping carts, where the tests one should do are not hard to think of... even so, software engineers rarely develop tests that catch all possible bugs on the first try.
Replacing Excel is like replacing paper. People always talk about formulas, but most spreadsheets don’t have any. They’re todo lists, or more commonly, static database exports.
Or they have formulas, but the author calculates a few numbers and throws it away. In this case they’re like a scratch pad, or a calculator with a visible memory.
I think it’s certainly possible to make better business modeling tools, and have played with some designs in that space. But they’ll never be Excel. And that’s ok.
Warning: I'm a founder working in the spreadsheet space, so take the rest of this comment with a (large) grain of salt. I've written before [1] (HN and elsewhere) about how I think spreadsheets are the most popular programming paradigm ever, we just don't talk about it much. As this article mentions, there are many ways we can push this forward.
I personally think the most powerful low-code spreadsheet tools we can build are those that allow spreadsheet users to easily transition to full programming languages, if they want to. So rather than locking users into limited and proprietary product #115 (some of them are mentioned in this article), IMO it's better if users can transition to a full programming language (like Python) very naturally. So I've spent the past 2 years building Mito [2].
Mito is a spreadsheet extension to your JupyterLab environment. You can display any Pandas dataframe as a spreadsheet, and edit it in a very similar way to Excel. For each edit you make, it generates the corresponding Python code below for those edits. Practically, you can think about Mito as recording a macro, but instead of generating scummy-crummy VBA code, it generates Python.
We're open core [3]. Feedback greatly appreciated!
Shouldn't your product be independent of Jupyter? Jupyter is amazing but still seems to have a beta feel to me as often old Jupyter notebooks will not work at all (for lots of reasons).
I love Python, hate the limits on the Google Sheets APIs, and honestly don't think VBA is "scummy-crummy"; it is just unsupported. Microsoft made a huge mistake discontinuing real Visual Basic, as it honestly could have been where Python is now. Instead, VB.NET is basically dead, and the momentum Microsoft had with WYSIWYG code editing is way behind where it was.
Very cool that you are generating code for the Jupyter notebook. What are the practical row limits as I see that as one reason someone might use Pandas instead of Excel?
The main benefit (to our users) of being in Jupyter is that we don’t have to force them to switch up their whole workflow. If they want to layer on a spreadsheet, being an extension makes it easy to do this. We don’t want to lock people into our platform; we just want to provide the best spreadsheet experience we can!
For us, it’s nice because we don’t have to reinvent the wheel(s) that Jupyter comes with :-)
Yeah that’s fair feedback. Our Pro features are a WIP currently, so this might evolve in the future. It was important to us that there is a way to be totally telemetry-less if users prioritize that - vs most other cloud-based SaaS data science tools where you pretty much have no hope of total privacy.
Sadly, it isn't. Microsoft set the precedent with Windows 10's telemetry, which they only give you a setting to turn off if you bought Enterprise Edition.
Mito looks very cool! Looking forward to trying it.
The blending of spreadsheets and notebooks seems inevitable. One trend is notebook-as-program. I've heard several variations of this: "The data scientists give us a notebook, and our automation runs the notebook to do <ML thing> on <our internal data sets>". It's clunky to use a format initially intended for interactive visual use for headless automation, but there's a practical wisdom to sticking with whatever format the data scientists prefer. Twenty years ago, I saw the same sort of thing with engineers and spreadsheets.
The one thing notebooks lose, though, is data flow computing. That's a major strength of spreadsheets, and the imperative execution of notebook cells seems like a step backwards. Although I'm sure somebody has bolted some kind of inter-cell dependency execution onto Jupyter by this point.
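For anyone unsure what "data flow computing" means here, a minimal sketch: cells declare what they read, and evaluation order is derived from those dependencies rather than from the order the cells were written in. This uses the standard library's `graphlib` (Python 3.9+); the cell names and formulas are made up, and this is not how Jupyter or any particular spreadsheet implements it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each cell lists the cells it reads and a function of their values.
# "total" is deliberately defined first to show that definition order is irrelevant.
CELLS = {
    "total": (("subtotal", "discount"), lambda subtotal, discount: subtotal - discount),
    "discount": (("subtotal",), lambda subtotal: subtotal // 10),
    "subtotal": (("price", "qty"), lambda price, qty: price * qty),
    "price": ((), lambda: 10),
    "qty": ((), lambda: 3),
}


def recalculate(cells):
    """Evaluate every cell after the cells it depends on."""
    graph = {name: set(deps) for name, (deps, _) in cells.items()}
    values = {}
    for name in TopologicalSorter(graph).static_order():
        deps, fn = cells[name]
        values[name] = fn(*(values[d] for d in deps))
    return values


values = recalculate(CELLS)
```

A notebook run top to bottom would fail on this ordering, since `total` comes before the cells it needs. A spreadsheet never asks the user to think about that.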
Seamlessly integrating spreadsheets into Jupyter sounds like the holy grail to me. I haven't tried it yet, though.
I think their users are people like me - who work a lot in Jupyter and swear by it - in a Python data analysis/visualisation environment. To bring in the best of spreadsheets into that could be magical. To just work in libreoffice calc or Excel would be a nonstarter, it just doesn't match all the other python tools in the workflow.
Mito isn’t just a spreadsheet that works with Python, but a spreadsheet that allows you to generate Python code when you edit!
If you're just looking to work with spreadsheets with Python, I’d also recommend checking out XLWings - I haven’t used it myself but some of our users do and love it!
Another direction here is reactive notebooks where when you change one field in the notebook all of the dependent fields are automatically updated. Pluto.jl for example [1]. It has the feeling of a cross between a Jupyter notebook and a spreadsheet without looking like a spreadsheet.
Thanks. I'm a bit dubious about using the jargon - which people can easily misunderstand - instead of saying, "mostly open source components, some proprietary ones".
Every day I think about how Microsoft Access allowed otherwise not-so-tech-savvy users to, with just a little training and practice, build a complete relational database for their entire business, supported by a relatively sane GUI and a way to build forms and reports with very little (if any) 'code'. I have no idea what small companies are using nowadays, but I think there has to be some untapped middle ground somewhere between Microsoft Access and a dumb spreadsheet.
There are still lots of folks using Access. For those of us in the Linux space, we can use LibreOffice Base and script it with Python. I once built a system this way for a local business that needed to join the computer age. Then, about a year or so later, the owner's relative came to the business, decided that it was too old school, and had it redone as a web app. Needless to say, the owner is not too happy with it: it took too long, cost too much both to develop and run, and has more errors. He lamented to me about it when I was in town... The issue is the world has moved on from local/native apps.
Claris Works on Macs in the 1990s is the best database product I’ve ever used. It was amazingly user friendly. I used it to make form letters & reports. Claris Works prioritized the application layer of the database and hid a lot of the complexity at the data layer.
DB Browser for SQLite is the coolest database product I’ve used recently. Similar to Mito, DB Browser generates SQL into a log as operations are performed in the GUI. Kind of a fun way to learn some basic SQL. I need to find a good GUI authoring layer for SQLite…
I'm structuring my UI for Adama (https://www.adama-platform.com/) around the baby of Access and Excel, and the engine is available in early access now. I love Excel and Access, and I plan a future pivot into small business space.
Microsoft Access is, of course, still around. It would be great if there were a solid way to develop a database, forms, and reports in Access and then deploy it to the web.
If you've ever had to work with someone else's Access database, it is unusual to see a reasonably normalized relational database. Most people are much more comfortable with the single flat file of Sharepoint lists.
Speaking as someone with direct experience maintaining Access databases that should have been SQL Server from inception, yes it absolutely does. If you need to force feed 10+GB through Access's artificial 2GB constraint and your programming language (VBA) is both single threaded and interpreted, if you aren't clever then you will run into performance problems, just as you would if you were doing something similar in JavaScript or Python + SQLite.
That's a good question. For an honest and complete answer, I'd have to remember the messes I've seen and it's been a couple decades now, so I don't. But simply, everything in setting up forms and reports is more convenient if you have real relations as opposed to a bunch of columns like "supplier1, supplier2, supplier3...".
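A small sketch of the difference, using SQLite from Python rather than Access (the table and column names are invented). With a proper link table, a report query does not need to know how many suppliers a product can have, which is exactly what the "supplier1, supplier2, supplier3..." layout cannot offer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per product/supplier pair replaces supplier1, supplier2, supplier3...
    CREATE TABLE product_supplier (
        product_id INTEGER NOT NULL REFERENCES product(id),
        supplier_id INTEGER NOT NULL REFERENCES supplier(id),
        PRIMARY KEY (product_id, supplier_id)
    );
""")
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "widget"), (2, "gadget")])
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?)", [(1, "Acme"), (2, "Globex"), (3, "Initech")]
)
conn.executemany(
    "INSERT INTO product_supplier VALUES (?, ?)", [(1, 1), (1, 2), (1, 3), (2, 2)]
)


def suppliers_per_product(connection):
    """Count suppliers per product without hard-coding a maximum number of suppliers."""
    rows = connection.execute("""
        SELECT p.name, COUNT(ps.supplier_id)
        FROM product AS p
        LEFT JOIN product_supplier AS ps ON ps.product_id = p.id
        GROUP BY p.id, p.name
        ORDER BY p.name
    """)
    return rows.fetchall()
```

Adding a fourth supplier to "widget" is one more row here; in the flat layout it is a schema change plus an edit to every form and report that lists suppliers.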
Founder of excel collaboration and versioning startup [1] here so I am a believer.
There are only about 30 million programmers. There are over 1 billion Excel users. Excel is Turing complete. Excel is by far the most used programming language on the planet. It is easily 20 times more popular than the next contender.
The value of Excel is that it is presenting the data, with a set of formulae that let you keep derived data up-to-date. This inferred data provides sums and computations, sometimes simple, but sometimes exquisitely complex. And through this whole range of complexity, with a billion users, virtually nobody treats Excel seriously like a programming language.
We have a programming language which is essentially acting as a declarative database, and yet we don't do unit tests, we don't keep track of changes, we collaborate with Excel by sending it to our colleagues in the mail, and god forbid we should do any serious linting of what is in the thing.
Anyone who has used Excel in anger realizes why it is so brilliant. Show me another declarative constraint based, data driven inference language that I can teach to my grandmother.
The problem isn't Excel. The problem is that we are treating Excel like it's a word processor, and not what it is: a programming language.
We are a small B2B, and during the pandemic our primary source of income disappeared. Our president built out a new service, one that is VERY analytics heavy. He built it in weeks using Excel, rather than the months it would have taken using "proper development". It's a beautiful monstrosity, and eventually needs to be ported to our codebase, but it saved the company.
Excel is great and versatile, yet lacks some critical features.
- It's easy to load data from a database or an API without VBA, yet impossible to write updates back without VBA. With VBA it's still messy string concatenation of SQL queries.
- VBA is a security nightmare.
- Version control is bad. Distributing spreadsheets in emails and collecting changes back is a nightmare. Sharing a file on a network drive lacks fine-grained permission control.
- It's hard to maintain a proper normalized relational data model. It's impossible to abstract the normalized model from the user (i.e. show labels in select boxes but store IDs).
- It's locale dependent. Date formatting, column separator (comma/semicolon separated), even function names. Unusable in international data exchange.
- It mixes formatting, logic and data. Impossible to make reusable blocks. New lambda functions help somewhat.
Excel is locale dependent in a weird way. The internal model (which includes the function names) is completely locale independent and only the UI is internationalized (according to OS locale) and localized (according to Excel language version). This is not a bad construction, but breaks down spectacularly when you start to use Excel as any kind of real programming or data analysis environment, because then you get to the cracks in the user-visible “API surface” where the internal/external format distinction is somewhat smeared.
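A small illustration of why text-level exchange breaks down, as a sketch only. The helper parse_number below is hypothetical and is not how Excel parses anything; it just shows that the same characters mean different values depending on which locale you assume.

```python
from datetime import datetime


def parse_number(text, decimal_sep, group_sep):
    """Parse a formatted number given explicit separators (no locale guessing)."""
    normalized = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(normalized)


# The same four characters, read under two conventions.
us_reading = parse_number("1.234", decimal_sep=".", group_sep=",")
de_reading = parse_number("1.234", decimal_sep=",", group_sep=".")

# The same date string, read day-first and month-first.
day_first = datetime.strptime("03/04/2021", "%d/%m/%Y").date()
month_first = datetime.strptime("03/04/2021", "%m/%d/%Y").date()

print(us_reading, de_reading)   # two different numbers
print(day_first, month_first)   # two different dates
```

Nothing in the text itself says which reading is right, which is why exchanging locale-neutral values (or at least an explicitly declared format) matters once a spreadsheet export feeds another system.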
Click and drag down is the opposite of code reuse, it is effectively a one time code generator.
In fact it is one of the main reasons I have largely abandoned spreadsheets; the behaviour I actually want is found in SQL views: calculate all items in a column with one expression, versus calculating items in a column with unique expressions that were copied, pasted, then transformed for each and every row.
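For anyone who has not used views, a minimal sketch with Python's built-in sqlite3 (table and column names are invented for the example): the derived column is defined once, and rows added later pick it up without anything being dragged down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line_items (id INTEGER PRIMARY KEY, qty INTEGER, unit_price REAL)"
)
conn.executemany(
    "INSERT INTO line_items (qty, unit_price) VALUES (?, ?)",
    [(2, 9.50), (1, 20.00), (5, 3.00)],
)

# One expression defines the derived column for every row, present and future.
conn.execute(
    "CREATE VIEW line_totals AS "
    "SELECT id, qty, unit_price, qty * unit_price AS total FROM line_items"
)

# A row added after the view exists is covered by the same expression.
conn.execute("INSERT INTO line_items (qty, unit_price) VALUES (?, ?)", (4, 2.50))

totals = [row[0] for row in conn.execute("SELECT total FROM line_totals ORDER BY id")]
print(totals)
```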
The other main reason I abandoned spreadsheets is row-level integrity. Too many times, data that goes together in a row has drifted apart (sorting on a subset of columns is the main culprit), another inherent problem solved by using a relational database.
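The drift is easy to reproduce in a few lines. This is a toy sketch with made-up data, modelling a sheet as independent column lists versus a list of row tuples:

```python
names = ["carol", "alice", "bob"]
balances = [30, 10, 20]

# Spreadsheet-style mistake: sort one column and leave its neighbour alone.
drifted = list(zip(sorted(names), balances))

# Row-oriented storage: each record is a unit, so sorting moves whole rows.
rows = list(zip(names, balances))
intact = sorted(rows)

print(drifted)  # alice is now paired with carol's balance
print(intact)   # every name keeps its own balance
```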
the solution to having excel at every desk is to have postgres at every desk. the code will be just as bad but the data integrity will be better.
Bypassing the need to consciously name things is one of the worst parts of excel.
Imagine taking a data pipeline and renaming all columns and variables alphabetically and then asking someone to check the business logic
Also it pushes people to include excessive parts of formulas into a single calculator rather than breaking them into comprehensible and testable chunks.
>Suddenly, the field has begun to bloom. A small cluster of startups have in the past year released spreadsheet products
All of which are subscription based SAAS. Maybe I am showing my VisiCalc oldster roots, but I want my document related tools to be pay once / run locally.
I long for a computing world where the free tier is run-in-a-browser-on-someone-else's-computer and the paid tier is run local and native.
It is a horrible and unusable business model/architecture.
1) Requiring subscription to access when it is not technically required is onerous and very questionable value, especially for tools that I'll use intermittently.
2) Requiring online access instead of local & native apps makes the false assumption that I'm always connected. Often my most productive times are on 6-hour flights (& no, the onboard wifi is not reliable) or away at locations where there is no connectivity.
3) Performance will always suck compared to a fully local app - all the optimizations a developer could do to make it run acceptably could also be applied to a local app to make it really snappy
4) Any SAAS carries significant extra risk that does not exist with a local native app - that the software/service could disappear at any time due to a host of business reasons. Sure, native app companies can also go out of biz or discontinue products, but I still have the software and can upgrade on MY schedule.
5) MOST CRITICALLY, entire classes of very attractive customers are LEGALLY locked out of using your apps. In my industry, there is an entire new class of information called CUI (Controlled Unclassified Information): anything related to DOD projects that isn't quite classified, but where there are very strong restrictions on what can be exposed in what way. My wife's industry is legal, and they have almost nothing that could be put on a SAAS, even billing information. And of course same with medical.
Of course there are some apps where the online SAAS is close to essential, like real-time collaboration / conferencing.
However anything else, and it looks strongly like the business model is extractive and rent-seeking, instead of providing value. (and this also applies even to IOT devices - I should be able to access those via the IP, not only through your service...)
I'm glad there are more tools to make working with data easier, but I don't use any of them because putting my data into a proprietary service I have to keep paying for is a dealbreaker.
Subscription based SaaS doesn't bother me in and of itself. Cloud based supporting multiple collaborators and multiple devices, tech support often built in, no managing upgrades or patches, etc.
The problem is portability and lock-in. Which itself might be due to lack of standardization.
Also that it's pretty hard to make good spreadsheet software, I guess? (Even Excel has bugs sometimes, not to mention various quirks for backward compatibility...)
AFAIK FramaSoft decided to stop working on FramaCalc (based on / hosting EtherCalc, based on SocialCalc), because it was just too hard for a small association like that (which already has to deal with managing lots of other services for free), but they seem to have decided to postpone the shutdown, I guess because of COVID?
My reactive workspace tool aka spreadsheet replacement (Inflex) is SaaS, but I’ve also considered offering a completely downloadable tool for a one-off price. I think if I ever get a substantial user base and people ask for that, it’s on the table.
As a greybeard, I have seen plenty of old ways become new again. It's hard to predict when the next shift will be or where it will go, but it will shift.
I’ve been doing this for customers for years now using Microsoft Lists (sometimes I use Excel) as the back end and PowerApps as the front-end GUI/data entry. Now the language from PowerApps (which is based on Excel) has been open sourced and Microsoft want to make it available in other apps too. That has questionable value, but my solutions continue to be popular because the people I make them for can easily edit them and make simple changes.
Some of the solutions we make have been widely adopted too, with hundreds and in some cases thousands of users.
I just learned how to use spreadsheets recently, and I love them. As a programmer, spreadsheets are exactly the kind of tech I want to use every day. User-friendly point and click, IDE features, complex functions mapping data across rows/columns/tables. All I want now is to hook it up to Postgres, and then maybe a new app to automatically design simple web pages that pull out and display data with the same functions. This would save me months of time per year versus manually coding the same things.
We're building an (open source, self hosted) tool that does the "hook it up to Postgres and pull out and display data with the same functions" portions of what you'd like. https://github.com/centerofci/mathesar
Excel users tend to do a far better job of data storage, analysis, and forecasting than most of the overpaid SaaS-reliant "modern data team" (Data Engineers, Data Analysts, Analytics Engineers, Data Scientists).
It definitely is a scary business proposition for them to improve Excel and awaken the 800lb gorilla. Microsoft is certainly capable of improving Excel in ways that might even improve the productivity of the entire world. Seems like some of these startups are likely just giving training to future Microsoft employees.
Agree with this. When battling Microsoft, product can matter less than things like contracts and licensing. I have to believe that businesses are spending so much on Azure that they get Office for free (as an analogy - at a previous employer GSuite was nearly free because our AdSense bill was so high). Businesses are unlikely to pay extra for a few more features when they already have Excel and have no plans to move off Azure.
Excel is constantly being improved. Are you aware of Power Query? And how about the 14 new functions that were released to Insiders a few weeks ago? Excel Online is far behind the desktop version of Excel but it's making fast progress.
Meh, depends on the data teams (and their chosen suite of tools)... Or on the Excel users
Both sides can either hack together horrible work-arounds (a matter of "when you have a hammer, everything looks like a nail...") or come up with brilliantly thought-through solutions.
Each tool should be used for its best use cases, but not bent into what it wasn't designed for!
IMHO spreadsheets excel at intuitively manipulating the data ON the data itself.
While "modern data" tools (especially dbt) try to convert data teams to developer best practices... at the expense of less intuitive/direct manipulation of the data.
That being said, I think there are also things we could explore in that space: how to make the modern data stack more intuitive?
> how to make the modern data stack more intuitive?
I'd start with getting people to learn relational data modeling and SQL [1][2] at a deep level. Stop reaching for python and pandas/spark for every basic data manipulation task or query. Stop adding in layers of Airflow/Dagster/Prefect when a simple cron would work. Stop adding in Kubernetes/GKE/Fargate to manage the aforementioned. Stop moving data between systems constantly (meltano, airbyte, fivetran) when you already have it in a perfectly good place. Stop with the toxic positivity that's completely overflowing the modern-data world and all these bullshit VC-infused startups who are convinced they need every single element and more.
99% of business needs can be satisfied by a single Postgres/Mysql installation and a halfway-competent person armed with SQL and an understanding of normalization. Reach for Excel when you need to do more "hands-on" analysis, business modeling, charting, basic forecasting, and presentation for non-technical users.
I definitely agree with the over-hype making simple mundane tasks way harder than they should be.
So yeah, do NOT over-engineer!!
But, on the other hand, doing everything with a single Postgres and spreadsheets seems to go with the hammer-to-nail adage.
And all too often, you end up with unmaintainable duct-taped hack-arounds...
Which is clearly NOT better (nor necessarily worse) than the over-engineered solution.
In some cases (maybe not 1% but clearly not the majority either), it does make sense to look at other tools that might be available.
That being said, there are waaaay too many options to filter through, because of that darn hype bubble.
As much as I hate to admit it, it is going to be a long time before we get rid of spreadsheets in de facto production use. And for that, I really would like to see one thing:
I'd like to have a separate worksheet type for "datasheets". Looks and behaves just like an ordinary worksheet, but:
- plugging in a formula applies the formula on every cell of the column. No exceptions.
- you cannot have different data types in cells within one column. That is, if you have dates in a column, you can't have a string in one cell of the column, etc.
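To pin down what I mean by those two rules, here is a rough sketch in Python. The Datasheet class and its methods are purely hypothetical, not an existing library or an Excel feature: each column declares one type, and a formula belongs to the column rather than to individual cells.

```python
from datetime import date


class Datasheet:
    """Hypothetical typed sheet: one type per column, one formula per derived column."""

    def __init__(self, **column_types):
        self.column_types = column_types   # e.g. {"due": date, "amount": int}
        self.rows = []
        self.formulas = {}                 # derived column name -> function(row)

    def add_row(self, **values):
        if set(values) != set(self.column_types):
            raise ValueError("row must supply exactly the declared columns")
        for name, value in values.items():
            expected = self.column_types[name]
            if not isinstance(value, expected):
                raise TypeError(
                    f"column {name!r} holds {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        self.rows.append(dict(values))

    def set_formula(self, name, func):
        # The formula is stored once and applied to every row. No exceptions.
        self.formulas[name] = func

    def column(self, name):
        if name in self.formulas:
            return [self.formulas[name](row) for row in self.rows]
        return [row[name] for row in self.rows]


sheet = Datasheet(due=date, amount=int)
sheet.add_row(due=date(2021, 5, 1), amount=100)
sheet.add_row(due=date(2021, 6, 1), amount=250)
sheet.set_formula("fee", lambda row: row["amount"] // 10)
print(sheet.column("fee"))
```

Trying sheet.add_row(due="tomorrow", amount=5) raises a TypeError instead of silently storing a string in a date column.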
Yes, I know about powerthings in excel. No, I do not want to mess with those. Just normal spreadsheet formulas and sheets.
(Okay, a couple of other things: Get rid of VBA and bring in the full Python ecosystem instead. And if not yet available, version control. It is luckily a while since I have worked with Excel, so these may be outdated comments.)
>I'd like to have a separate worksheet type for "datasheets". Looks and behaves just like an ordinary worksheet
I think you're on to a good idea, but it has to be a table in the middle of a spreadsheet if you want to get acceptance. What would probably cinch it for you is if you could embed an entire table in a bounded box, with scrolling. It would break the row/column addressing for the contents, but that's always clumsy with tables in sheets anyway.
Example: a data table lives in C10-M10 / C20-M20
Down in C21-M21 you could have sums, counts, etc. that work across all the data.
With this arrangement, you could then sum columns, count, etc. It would be really slick if you snuck in SQL. The tradeoff in immediate addressing of cells within a table could be acceptable.
A while ago, I saw something that allowed spreadsheets in spreadsheet cells, it was mostly intended for interactive use, and dispensed with cell addresses completely in a manner that seemed reasonable at the time.
People always want to reinvent the spreadsheet with their latest whizzbang solution without a problem. But Excel solves 80% of the common problems so nobody really cares.
YUP! And too often a company claiming to have something better than Excel isn't a direct equivalent. It's a way to market their whizzbang solution.
Domo was claiming it could get companies off of spreadsheets. But it turns out that they're an enterprise level reporting system that's a minimum of $30,000. That's not a viable option for a 5-person non-profit, but they'll get swept up in the conversation only to find out later that no, Domo and Excel are not equivalents.
Excel, and spreadsheets, need a re-branding campaign. They are cloud-based, low-code, ELT tools; these terms are typically associated with buzzy enterprise tech companies, not technology developed in the 1980’s.
What are you using spreadsheets for privately? I never dug deep into their capabilities, so I mostly use them to track expenses within some particular scope, e.g. healthcare. When I recently wanted to compare several loans and estimate our financial situation several years in the future, I wrote some Python code and used a Jupyter notebook to enter parameters and make plots. Have any of you done something similar using spreadsheets?
On a side note: I didn't find a Python library for time series generation (not analysis). Something where you can build some models (e.g. loan, income, expenses) which depend on a common parameter (time) and then evaluate all your models for different values of the common parameter. Right now, I generate pandas series/dataframes and combine them afterwards, which also took some massaging of pandas (which I also usually don't use a lot).
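Roughly what I have in mind, as a plain-Python sketch without pandas. All function names and figures here are invented for illustration; the loan uses the standard fixed-rate annuity formula and assumes a non-zero interest rate.

```python
def loan_balance(principal, annual_rate, years):
    """Return a model month -> remaining balance for a fixed-rate annuity loan."""
    r = annual_rate / 12                      # monthly rate, assumed non-zero
    n = years * 12                            # number of monthly payments
    payment = principal * r / (1 - (1 + r) ** -n)

    def balance(month):
        if month >= n:
            return 0.0                        # loan is paid off
        growth = (1 + r) ** month
        return principal * growth - payment * (growth - 1) / r

    return balance


def savings(monthly_income, monthly_expenses):
    """Return a model month -> accumulated savings (no interest)."""
    def total(month):
        return (monthly_income - monthly_expenses) * month

    return total


def evaluate(models, months):
    """Evaluate every model on the same time axis."""
    return {name: [model(m) for m in months] for name, model in models.items()}


months = list(range(0, 121, 12))              # yearly snapshots over ten years
table = evaluate(
    {
        "loan_a": loan_balance(200_000.0, 0.03, 10),
        "loan_b": loan_balance(200_000.0, 0.02, 15),
        "savings": savings(4_000, 3_000),
    },
    months,
)

for name, series in table.items():
    print(name, [round(value) for value in series])
```

Each model is just a function of the shared time parameter, so adding another scenario is one more entry in the dict, and the result can still be handed to pandas or a plotting library afterwards.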
I use Google Sheets for personal finance with TillerHQ to automate downloading bank data. It is, as far as I know, the only workable solution in this space.
I think 'products replacing spreadsheets' is not the right angle. Spreadsheets are great but not for everything.
I think there are a lot of misuses of spreadsheets. When users record list-type data, long-form content, relational data, etc. within spreadsheets, they are probably better off using an app that they, or their IT team, can build.
I once saw an enterprise org process where a health and safety check was completed on a clipboard. That user would go back to their desk and push that data into Excel; another user would then write a script to push that Excel data into MySQL.
There are many use cases where spreadsheets are perfect, such as building financial models.
For transparency - I am the cofounder of a no/low-code tool called Budibase. I am also a happy spreadsheet user.
https://github.com/Budibase/budibase
Well it disappeared over a decade ago. Very interesting demo. Time is a flat circle I guess. Nothing actually comes to mind from the present that is quite like this, hmm.
Excel is by far the most successful programming language and IDE. People love to hate it (and the people using it), which is somewhat misguided: there is simply no way to change people (and they keep making new ones), so telling them that what works for them is somehow wrong is both wrong and doesn't work.
Instead, the spreadsheet paradigm has the promise of being far more powerful. Jupyter notebooks are one example of adapting it to a different realm, and it also ended up being used everywhere and looked down upon by the snobs.
Been thinking about this problem for a long time. I use python/jupyterhub notebooks for our data analysis flows at work. I’ve become an expert at it and am a go-to person in our org… but even still when I have to try something new on “small” amounts of data, I _still_ go to Excel first. Its barrier to getting the results you want is extremely low - thus far unparalleled by any other software I’ve used in the last 20 years… and I have a hard time seeing anything that would replace that (though maybe I am just set in my ways…)
> Spreadsheets.com, for example, lets users dump almost anything into a cell. Drop a photo or a PDF into a cell and the product will immediately create a thumbnail, which you can then expand, as if the spreadsheet were some sort of blog content-management system.