This reminds me of the 3D roller coaster simulator in Excel[1]. The difference here is that this was made with readability in mind and even works without macros.
Well, the roller coaster sim is wireframes only, which is always much easier to compute. You’ll notice that the Doom clone has lighting and shaders, which, however primitive, add complexity to the render.
It’s like the difference between rendering text and decompressing JPEG images. Both can pile up enough data to exhibit lag, but you get more bang for your buck with the lighter data stream.
If Excel had just a couple more features, it could be much more ergonomic and not even need macros. Specifically: anonymous functions/lambdas made with just pure formulas. Define a function in a cell like `=($s) => [expr]` (or something idc). Call it with `=A1("string arg", B2:B100)` or by name if you name it via named ranges. Functions are obviously a new data type, so you can pass them as parameters to each other or to relevant builtins. It would turn Excel into some kind of hybrid pure functional/spatial language where the code itself is spatial data.
The typical things one gains from higher-order functions are mapping and filtering over data structures. Excel only has one data structure: the table. Mapping is done by writing the formula once and then dragging it from the corner to fill the whole column. Filtering is not really done with formulas; normally you use the GUI to hide the rows to be filtered away.
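For concreteness, here is roughly what those two operations buy you in a conventional language; a minimal Python sketch, with a plain list standing in for a column (the names and values are invented for illustration):

    # A column of values, standing in for something like A2:A5.
    prices = [120, 45, 310, 78]

    # "Map": the formula-dragged-down-the-column operation.
    doubled = list(map(lambda p: p * 2, prices))     # [240, 90, 620, 156]

    # "Filter": what hiding rows through the GUI approximates.
    cheap = list(filter(lambda p: p < 100, prices))  # [45, 78]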
Functions raise questions of scoping. How are closures supposed to work? Could there be some way to say “the function in cell A1, closing over the state as of when cell B2 was last logically recomputed”? Excel handles scoping in ordinary formulas using $ (absolute vs. relative references).
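To make the closure question concrete, here is a minimal Python sketch of the two plausible semantics (a one-element list stands in for cell B2; everything here is invented for illustration):

    b2 = [100]                             # stand-in for cell B2

    live = lambda: b2[0]                   # closes over the cell itself
    frozen = (lambda v: lambda: v)(b2[0])  # snapshots B2's current value

    b2[0] = 999                            # B2 is recomputed
    print(live())    # 999: sees the new state
    print(frozen())  # 100: the state "when B2 was last computed"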
Obviously this all breaks apart when you don’t want tables where you drag things either always down or always across.
> Mapping is done by writing the formula once and then dragging it from the corner to fill the whole column. Filtering is not really done with formulas; normally you use the GUI to hide the rows to be filtered away.
You should look into array formulas (or block formulas). They break out of the one-cell-at-a-time mold and can be much more powerful.
Yes, but it’s not done with a function like filter in Scheme or Haskell; it’s done via the GUI, outside the world of formulas. For more complex stuff there are database features hidden away inside.
You jest, but people really do underestimate Excel. Yes, it sucks - but there is a reason that despite all its flaws, people keep trying to use it for things. (And no, that reason is not "they're too stupid to learn how to code")
And I'm not sure the things that make Excel so crappy could really be fixed without sacrificing the things that make it such a flexible and useful tool. For instance, adding the ability to recurse or iterate sanely would remove the transparency of having every iteration of a computation clearly visualized cell-by-copy-pasted-cell.
For me it usually is: "I would like to use some sensible tools, but corporate policy requires me not to (because they are not available), so I have to (ab)use Excel/Office/VBA to get things done." It's not pretty, stable, efficient, or really maintainable, but if you are forced to use a hammer...
PS: Excel does some things OK, like drawing pretty graphs on smallish datasets and visualizing quick analyses with pivot tables. It just lacks a sensible scripting language and is generally very brittle once the data gets nontrivial and more than one person is involved.
Excel-based workflows are definitely brittle, but people learn to be careful with it (you have to, if you want to keep your job).
One big problem it solves for regular people (i.e. not programmers) is this: "our workflow and/or reality of our business changes much faster than the IT department/outside contractors can keep up with".
(The other big, and related, problem is: "our IT department/outside contractors have no clue and don't really even care much about what we actually do or need".)
It's also really quite good for automating numerical calculations; a sort of poor-man's-Matlab. Having every intermediate value in a multi-step computation both immediately visualized (and thus easily sanity-checked) and accessible as a variable, without having to keep track of names, is very powerful.
I’ve been solving this problem with PowerShell for the last year or so. It’s usually available on locked-down corporate Windows boxes, and you can use the entire .NET Framework, as well as third-party .NET libraries, without admin rights. I’m using it with System.Data.SQLite to build a WPF application at the moment, which will hopefully replace a whole bunch of messy and hard-to-maintain Excel sheets and Access databases.
I love PowerShell for this reason. I had 100 CSV files and needed the first two rows removed to put them into a signal-viewing app, and it took only a few lines of PowerShell to do this on my corporate machine. I would normally do this in another PL, but strangely PowerShell's solution further pushed me to look into it for other uses on that machine. I wish there were a good book for this, since I actually prefer that to just Googling quick solutions when trying to actually learn something.
Yep, I could have learned much faster with a good book. Most PS tutorials are targeted at sysadmins with minimal programming experience. I would have preferred something that assumed programming experience and focused on what makes PS different from other scripting languages.
I work in a Microsoft shop, but I always have a bash shell running. It prevents me from really diving into PowerShell. I will end up running a quick sed '2,3d' on the csv files and move on with my life.
I have PowerShell on my local company machine, and without admin privileges I can't even sideload bash through a Git install. We're not a software company, so requesting Git is also out of the standard operational procedures for a user. IT knows my coding background (since 1978), but they can't change policy.
I work at an engineering firm (not software), and for performing and presenting manual calculations it is great. You wouldn't write a Python or Julia script or use a Jupyter notebook for any of it. Because I like to code, I've written small programs for more involved calculations in J, C, F#, and my favorite, Frink. Frink is great for engineering because it handles different units and systems throughout a calculation. Mathcad is also good for laying out manual calculations. Excel is the engineer's scratchpad. There are certain things that are just easier to do in Excel, and not everything turns into a problem that needs a programming language. But there are ways of working with Excel from most programming languages aside from VBA.
A lot of Excel's issues are well-known because of how pervasive it is. Python and kin rely on C libraries and other layers of imported packages for numerical work that can lead to all sorts of numerical mischief. Even doing:
    round(2.575, 2)
in Python leads to 2.57 instead of rounding up to 2.58. Everything in our engineering firm is checked by another engineer.
The behavior of round() for floats can be surprising: for example, round(2.675, 2) gives 2.67 instead of the expected 2.68. This is not a bug: it’s a result of the fact that most decimal fractions can’t be represented exactly as a float.
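For engineering work where half-up rounding matters, the usual Python workaround is the standard-library decimal module; a minimal sketch:

    from decimal import Decimal, ROUND_HALF_UP

    # The float literal 2.675 is stored as roughly 2.67499999...,
    # so round() is faithfully rounding the *stored* value down.
    print(round(2.675, 2))  # 2.67

    # A Decimal built from a string keeps the exact decimal value, and
    # ROUND_HALF_UP gives the schoolbook rounding a checker expects.
    print(Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))  # 2.68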
> Yes, it sucks - but there is a reason that despite all its flaws, people keep trying to use it for things
Yes, but the main one is that institutional IT policies frequently dictate that people who aren't employed specifically in software development roles aren't allowed software development tools, while everyone in the org tends to be allowed core Office apps like Excel.
When it's literally the only remotely applicable tool most people are allowed to have, it shouldn't be surprising that it gets used for a lot, independent of merit.
You can recurse in Excel. It's disabled by default, but you can set the max depth on circular references in some menu. I haven't used it since the 2007 version.
I'm surprised that Microsoft hasn't plugged the Roslyn .NET compiler into Excel as a first-class scripting environment yet. LINQPad in Excel would be miles better than VBA.
Spreadsheets are pretty good for exploring the execution of an algorithm or computation graph. You can basically bookkeep every step of an execution.
Efficiency, or the lack of it, is visible in the spreadsheet in a very concrete way. How many cells are used, how many “steps” or rows are required to finish a computation, and how many times cells are addressed all relate directly to the time and space complexity of the algorithm.
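A minimal Python sketch of that bookkeeping style, tracing Euclid's GCD one row per step the way a spreadsheet column would (the function and values are invented for illustration):

    def gcd_trace(a, b):
        rows = [(a, b)]          # row 0: the inputs
        while b:
            a, b = b, a % b      # each iteration becomes a new row
            rows.append((a, b))
        return rows

    for step, (a, b) in enumerate(gcd_trace(1071, 462)):
        print(f"step {step}: a={a:4d}  b={b:4d}")
    # The number of printed rows *is* the step count, so the time
    # complexity is literally visible, as described above.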
Excel has been an interesting tool that has been abused by many different people in many different subject areas, from petro-engineers to asset management to budget forecasting to drug sales forecasting to activity planning and on and on and on.
With judicious use of VBA, many of the tasks above, especially those related to forecasting, planning, and reporting, can be automated down from hours to just seconds. I have seen an inordinate number of hours wasted by people using Excel for all sorts of tasks that were repetitive to the max. A thinking programmer can work with such people to change those tasks from hugely manual to simply automatic.
Mind you, the VBA that has been written by non-programmers is usually so bad that you don't even want to attempt to change or fix it. It is far better to just start from scratch.
It is a tool that can be used in quite novel ways, but it does have its gotchas that will cause interesting failures for businesses that rely on it. But when it is the only tool you have, then you use it any way you can.
What we have often seen in the business is that Excel gives a flexibility and freedom that purpose-built tools often don't have. Want an extra field? Done. Need a field to have a date later than xx-xx-xxxx? Done. Even if the purpose-built tool has some kind of custom-field system that accommodates these wishes, there will always be some feature the developers need to build. And that costs more turnaround time (and money) than doing it yourself.
That is certainly something that I have seen as well. The problem lies in validating the spreadsheets. Often, the amount of validation is minimal, whether of the actual formulae used, the relationships that are created, or even the data that is used.
I have seen some outstanding examples of well-written spreadsheets, but these tend to be the minority case. Depending on the organisation and the priority it puts on validation, the spreadsheets can range in quality from very poor to mediocre.
I have seen critical spreadsheets that were a disaster of coding.
Many times, the IT costs of getting the IT developers to do the work for you are so far above what it costs for you to do it yourself, or even to get a developer in specifically to work for you, outside of any controls that your IT group would exert. Plenty of the work I have done in the last 20 years was based on the end-user employing me to do work that was not cost-effective, or even allowed for, by the company IT teams.
Is validation a problem? You see the intermediate steps of the calculation in the columns. It's way clearer than black-box unit testing. It's like nice live debugging with print statements everywhere.
Validation is always a problem. In one early example I came across, the core data on which the model was based did not match what was really there. The data was assumed to be correct and, of course, nobody actually did the checking. The previous year's data was used for the following year's calculations.
I knew the data for the spreadsheet was wrong only because I had been involved in looking at the actual data sources in the weeks previously.
But there can also be formula errors. They seem to produce the correct information, but they have subtle errors that miss necessary edge cases or use hard-coded values when they shouldn't.
The problem is that when you involve a Real Programmer you ruin the flexibility of the sheet that made it so useful for the original user.
It's better to waste hours using Excel than to waste even more hours trying to convince some developer to make a change and waiting for the product - which might never come.
I have been the "Real Programmer" closely involved with building the sheet models. The flexibility is in being able to build lots of different models for investigation. Often the model builder misses some of the subtleties of using spreadsheets, and their models don't quite go the way they expect.
In my case, I was on call to discuss what was to be achieved and to help them build what they needed. The turnaround times were short. My function as the "Real Programmer" was to get them into a position to solve any relevant problems and to make sure they were able to progress with their work. That is the beauty of working directly for the end-user instead of being part of the central IT team.
Probably French. The free.fr email/www addresses and the French typography of having spaces before punctuation other than plain commas and full-stops (like this ..., or this :, or even this ?) give it away. :)
Somebody said that a decade ago, and the responses provided invaluable pointers into regions of PL theory I would not otherwise have explored this well... Happy reading ;)
I'm not going to argue that one spreadsheet solution is "better" than another, as that is really a question of personal preference. However, it should be noted that people actually do these sorts of demos in other programs as well. And not just FOSS solutions; I've seen people do fanciful stuff in Google Docs too.
I wrote this [0] little quick and dirty snippet to enable a Google Sheet as a JSON data source via GET requests.
Unfortunately Google Apps Script only accepts GET and POST requests [1], so it wouldn't be semantically RESTful, but adding update and delete features to the script is trivial if you send an action param along with your POST.
Using that data source snippet I wrote a 'static' CMS that pulls the JSON data from sheets in the workbook and then compiles and deploys a Jekyll site. I've been enthusiastically replacing WordPress sites with this wherever possible when clients only need data to change and don't require actual page editing (this system is perfect for restaurants - think menu changes). The "deploy" event is a little convoluted since I wanted to control what happens and when it happens, but it is all kicked off by submitting a Google Form which is attached to the Sheet.
The script as-is returns one object per row, but converting it to key/value responses is also pretty easy. If Office 365 has an equivalent to Google Apps Script that can publish a script as a web app, I'd love to hear about it.
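For what it's worth, consuming such an endpoint from outside is straightforward; a hypothetical Python sketch, where the deployment URL and the action/id fields are placeholders rather than anything from the actual script above:

    import requests  # third-party HTTP library: pip install requests

    # Placeholder web-app URL from the Apps Script deploy dialog.
    URL = "https://script.google.com/macros/s/DEPLOYMENT_ID/exec"

    rows = requests.get(URL).json()  # one JSON object per sheet row

    # Updates/deletes tunneled through POST with an "action" param,
    # since Apps Script web apps only accept GET and POST.
    requests.post(URL, data={"action": "delete", "id": 3})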
ANECDOTE TIME! And actually a career-related question:
I'm an absolute nerd for sports stats, and have a friend who moved out of the finance/BI world in the hospitality sector and landed herself an impressive gig at ESPN, of all things, as a stats editor.
It seems a relevant jump (because all I know of her profession is that she's an incredible mathematician), but the gap seems wide just from a domain-knowledge perspective; she openly states not caring at all for sports, but the new job location puts her close to family.
My question is: is Data Science so applied that one can make that kind of jump between domains easily? This gal is one of the smartest people I know, but the more I look into what it is you folks do, the more I am simultaneously intimidated by and interested in the field.
Domain knowledge is always useful, but it can also introduce bias into one's analysis.
For these types of professional roles you generally see specialized teams in three broad groups: data engineers, modelers/analysts, and domain experts. Unicorns sometimes exist, but usually you see T-shaped experience with depth in one of the three.
A bit off topic, but Excel has many dangerous features. Is there a spreadsheet that is safe to use? I just know about Gnumeric and LO's Calc, but I don't know how good they are (I rarely use spreadsheets, most of the time to draw simple curves).
From that link:
“Spreadsheet errors are costing businesses billions of pounds, according to a financial modeling company, which is calling for the introduction of industry-wide standards to reduce the risk of mistakes.
F1F9 estimated that 88 percent of all spreadsheets have errors in them, while 50 percent of spreadsheets used by large companies have material defects. The company said the mistakes are not just costly in terms of time and money - but also lead to damaged reputations, lost jobs and disrupted careers.”
[1] http://www.excelunusual.com/coaster/