
Benchmarking Spreadsheet Systems - manigandham
https://blog.acolyer.org/2019/12/06/benchmarking-spreadsheet-systems/
======
sheetjs
Performance is affected by a myriad of factors, many of which aren't visible
by looking at the worksheet.

For example, XLS/XLSX support a special "shared formula" representation which
tells Excel that a formula is structurally similar to other formulae. This is
not always written, depending on how the file was generated and other factors.
Without that, Excel will take an "inefficient" calculation approach. You can't
figure it out from the UI or by inspecting the formula text -- you actually
have to dig into the file to see it.
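
As a rough illustration of what "digging into the file" can look like for XLSX
(a sketch, not a description of how any particular tool does it): an XLSX file
is a zip archive, and each worksheet part stores formulas in `<f>` elements.
A shared formula carries `t="shared"` plus a group index `si`; the first cell
of the group also holds the formula text and a `ref` range, while the others
only point back at the group. The helper name `count_formula_kinds` is made up
for this example, and it assumes worksheets live under `xl/worksheets/`, which
is the usual layout but is not guaranteed.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def count_formula_kinds(xlsx):
    """Count shared vs. other formula elements across all worksheet parts.

    `xlsx` may be a path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    counts = {"shared": 0, "other": 0}
    with zipfile.ZipFile(xlsx) as zf:
        for name in zf.namelist():
            # Assumption: worksheets are stored as xl/worksheets/*.xml.
            if not (name.startswith("xl/worksheets/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(zf.read(name))
            for f in root.iter(NS + "f"):
                kind = "shared" if f.get("t") == "shared" else "other"
                counts[kind] += 1
    return counts
```

A workbook with a long column of near-identical formulas and a `shared` count
of zero is one where the formulas were written out cell by cell.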

~~~
magnifique
That's really cool! How did you figure this out and what does this
representation look like?

~~~
bri3d
[https://docs.microsoft.com/en-us/dotnet/api/documentformat.o...](https://docs.microsoft.com/en-us/dotnet/api/documentformat.openxml.spreadsheet.cellformula?view=openxml-2.8.1)

The Office Open XML format underlying XLSX is an ISO standard (ISO/IEC 29500).
Most things are documented, albeit in a somewhat obscure way.

------
known
Spreadsheets freeze when asked to process more than 1 million records; AWK
works like a charm:

[https://blog.jpalardy.com/posts/alternative-to-sort-uniq-c/](https://blog.jpalardy.com/posts/alternative-to-sort-uniq-c/)
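
For anyone who would rather not reach for AWK, the same general idea (count
records in a hash table in one streaming pass instead of sorting first) can be
sketched in Python. This is only an illustration of that idea, not code from
the linked post; `count_lines` is a made-up name and it assumes one record per
line.

```python
from collections import Counter


def count_lines(lines):
    """Count how often each line occurs, in a single streaming pass.

    `lines` is any iterable of strings, e.g. an open file or sys.stdin.
    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for line in lines:
        # Drop the trailing newline so "a\n" and a final "a" count together.
        counts[line.rstrip("\n")] += 1
    return counts
```

Memory use grows with the number of distinct lines rather than the total
number of lines, which is why this copes with inputs that a spreadsheet
cannot open.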

------
rolling_roland
I was struggling with spreadsheet performance some time ago, trying to open a
CSV with over a million rows, which made Excel choke. To help with this I
built a collection of tools for myself
([https://blocksheet.io](https://blocksheet.io)) so I could at least split my
CSV files into smaller ones and then edit them in Excel.
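
Splitting a large CSV into Excel-sized pieces can also be done with a few
lines of Python. The sketch below is not how blocksheet works; it is a minimal
stand-in that assumes the first row is a header to repeat in every piece, and
the function name `split_csv` and the `part_0001.csv` naming are made up for
the example.

```python
import csv
import itertools


def split_csv(src_path, rows_per_file, out_pattern="part_{:04d}.csv"):
    """Split a CSV into files of at most `rows_per_file` data rows each.

    The header row is copied into every output file.
    Returns the list of paths written. Hypothetical helper for illustration.
    """
    paths = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:  # empty input: nothing to split
            return paths
        for index in itertools.count(1):
            # Pull the next chunk without loading the whole file into memory.
            chunk = list(itertools.islice(reader, rows_per_file))
            if not chunk:
                break
            path = out_pattern.format(index)
            with open(path, "w", newline="") as out:
                writer = csv.writer(out)
                writer.writerow(header)
                writer.writerows(chunk)
            paths.append(path)
    return paths
```

Something like `split_csv("big.csv", 500_000)` keeps each piece comfortably
under Excel's limit of 1,048,576 rows per sheet.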

>In a similar vein, sorting causes problems on very small datasets (less than
10K rows):

Reading this made me wonder just how much Excel prioritizes formulas and other
fancy features over basic utilities such as sorting. For comparison, I just
tried sorting a CSV with over 2 million rows on blocksheet, and even though it
took a few seconds and made my laptop fan work overtime (it's a static site),
it still managed to do it in a reasonable time.

For me it's such a rare problem to run into huge spreadsheets, that it's a bit
overkill to build my own software for it. So if anyone knows any good ones
that deal well with rearranging large spreadsheets, I'd be happy to hear about
them.

------
conductr
What I always remind people is that Microsoft has no intention of fixing these
kinds of performance problems in Excel, because they have other products to
sell you if you have “big data”.

------
skavi
Has anyone done testing that includes Apple's Numbers?

~~~
tln
I always regret opening something in Numbers. Excel and Google Sheets share a
lot of idioms, which makes gaining and keeping expertise far easier. Those two
also start up faster and put fewer annoying barriers (e.g., interstitials) in
the way of working at speed.

