
ExcelCompare: Command line tool and API for diffing Excel Workbooks - jsvine
https://github.com/na-ka-na/ExcelCompare
======
artmageddon
I work at a certain custodial bank and it's scary just how much of our
business depends on Excel VBA. Something like this would be amazing as we have
virtually no SCM / revision control for our critical spreadsheets.

Does anyone else know of any neat "can't-live-without" Excel utilities or
add-ons that would make a software engineer's life a lot easier? A few weeks back
someone posted ThingieQuery[0], which also looks fantastic. The IDE for VBA is
awful, and I'll take any improvement I can get.

[0]:
[https://news.ycombinator.com/item?id=11583488](https://news.ycombinator.com/item?id=11583488)

~~~
dgudkov
How about getting rid of VBA entirely and replacing it with a visually
designed workflow? EasyMorph[1] does data transformation outside Excel (so
it's not a plugin or add-on), but then results can be inserted into a sheet of
an existing spreadsheet.

[1] [http://easymorph.com](http://easymorph.com) (I'm the founder).

~~~
artmageddon
Part of the issue is that the team I'm on consists of a majority of finance
majors who only know VBA, and maybe just a few C++ devs (I'm pretty much the
only non-finance person on the team). This is sort of one of those "when all you have
is a hammer, everything looks like a nail" issues. Simple one-off VBA tools
find a little bit of traction and then, through patchwork, evolve into
something critical yet lack proper up-front and long-term-planning design.
They're all smart folks of course, but I feel like so much more could be
accomplished with better tooling.

Your product looks really cool; I've actually been looking for something that
will do ETL tasks and I've only used Informatica and SSIS, which obv aren't
free. Will be checking it out!

~~~
Chris2048
This might be changing though. I'm following a CQF program, and while the
tools are traditionally Excel, VBA and C++, Yves from Python Quants is there
lecturing about using Python.

------
stephengillie
It's interesting to see this written in Java, since the Excel application can
be reached as a .NET object[0]. On top of that, Compare-Object is a built-in
cmdlet in PowerShell. Sure, with this built in Java you can use it to diff
Excel spreadsheets on CentOS, but how often do you have those there? But in
.NET most of the pieces are probably already there, so less work may have been
needed to produce the same tool.

[0] [http://www.madwithpowershell.com/2013/11/using-excel-
functio...](http://www.madwithpowershell.com/2013/11/using-excel-functions-in-
powershell.html)

~~~
aargh_aargh
It's for more than just Excel and Windows.

------
tacon
Microsoft finally added the ability to compare two workbooks in Excel 2013:

[https://support.office.com/en-us/article/Basic-tasks-in-
Spre...](https://support.office.com/en-us/article/Basic-tasks-in-Spreadsheet-
Compare-f2b20af8-a6d3-4780-8011-f15b3229f5d8)

Not a command line tool, and no API, but they delivered the basic feature. As
far as I can tell, it isn't used very much.

------
itisbiz
A good strategy is to move all data transformation into the Excel add-in Power
Query and leave the raw data in its original format. No more copy/paste of data.
"Fat finger" events are minimized. Using PQ gets you moving towards a more robust
business process, data management and IT architecture.

------
sha1-1b141e
OOC, how much of the complexity of implementing this was getting the data out
of the formats, and how much was implementing the diff once you had the data?

~~~
jkaptur
It looks like the Apache POI library gets the data out of the formats. The
tool itself seems to compare the data cell-wise (i.e. A1 against A1 - no
alignment seems to be attempted) and there doesn't seem to be any handling for
merged cells, charts, pivot tables, filters, data validation, etc.
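
To make "cell-wise, no alignment" concrete, here is a minimal sketch in Python. It is not ExcelCompare's actual code (that is Java on top of Apache POI); the `Sheet` type and the `diff_cells` helper are hypothetical names made up for illustration, and each sheet is assumed to be already loaded into a plain dict keyed by cell reference.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a sheet is a dict from cell reference ("A1") to value.
Sheet = Dict[str, object]


def diff_cells(left: Sheet, right: Sheet) -> List[Tuple[str, object, object]]:
    """Compare two sheets strictly by cell reference (A1 against A1).

    No attempt is made to align inserted or deleted rows and columns.
    A cell missing on one side is reported with None for that side.
    """
    diffs = []
    for ref in sorted(set(left) | set(right)):
        a, b = left.get(ref), right.get(ref)
        if a != b:
            diffs.append((ref, a, b))
    return diffs


old = {"A1": "Name", "B1": "Total", "A2": "x", "B2": 10}
new = {"A1": "Name", "B1": "Total", "A2": "x", "B2": 12, "A3": "y"}
print(diff_cells(old, new))  # [('A3', None, 'y'), ('B2', 10, 12)]
```

The trade-off of skipping alignment shows up as soon as a row is inserted near the top: every cell below it shifts by one reference, so a comparison like this would report nearly the whole sheet as changed.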

------
teddyh
They should add this to Diffoscope
<[https://diffoscope.org/](https://diffoscope.org/)>.

