
XLParser: A C# parser for Microsoft Excel formulas - mpweiher
https://github.com/spreadsheetlab/XLParser
======
dtech
Hi! I'm one of the authors and did most of the implementation work, very
surprised seeing this appear on HN!

If you want to see the tool in action, check out our online demo [1].

If you have any questions you can ask them here. We also have a paper [2]
which described the innards and design choices.

[1]: [http://xlparser.perfectxl.nl/demo/](http://xlparser.perfectxl.nl/demo/)

[2]:
[https://drive.google.com/file/d/0B79P2Uym3JjvMjlaWWtnTWRLQmc...](https://drive.google.com/file/d/0B79P2Uym3JjvMjlaWWtnTWRLQmc/view?usp=sharing)

~~~
mpweiher
You shouldn't really be surprised, this immediately looked like awesome work
showing up in my twitter feed, and I think the comments largely agree.

------
emeryberger
Nice to see this, and thanks for open-sourcing it. We (by which I mostly mean
Dan Barowy
([http://people.cs.umass.edu/~dbarowy/](http://people.cs.umass.edu/~dbarowy/))
burned a considerable amount of time and energy on parsing Excel formulas for
our CheckCell project ([http://checkcell.org](http://checkcell.org)), but
yours looks more general (for example, CheckCell doesn't care about
precedence, so we didn't implement that).

~~~
dtech
Glad to hear, that's exactly why we open-sourced it.

We want to prevent everyone from doing the same work over and over again and
provide a basis to work from if people so desire.

------
cm2187
One feature that would be cool is an Excel addin to which you would designate
a bunch of cells from a spreadsheet as inputs and one output, would follow the
logic and emit code to replicate this logic as a function.

Code in VBA would be an obvious candidate, allowing non technical users to
apply a complex Excel based logic to a whole list of entries for instance, by
making a function that can be called on every line. Or allowing to hide the
logic in a password protected VBA project so that the spreadsheet can be
shared without disclosing the logic.

Could also be used by IT departments when they inherit from spreadsheets they
need to integrate in systems. In fact it could even be used as a UI. Business
users express the business logic in excel, a syntax they are familiar with,
and update the logic in a system automatically by the system parsing the
spreadsheet.

~~~
masklinn
> One feature that would be cool is an Excel addin to which you would
> designate a bunch of cells from a spreadsheet as inputs and one output,
> would follow the logic and emit code to replicate this logic as a function.

From having "you suck at excel" run in the background a few weeks back, I
learned that feature is already in excel, there are "input", "calculation" and
"output" cell formats/styles:
[https://youtu.be/0nbkaYsR94c?t=2756](https://youtu.be/0nbkaYsR94c?t=2756)

------
jimmcslim
Interesting. I would have thought that Excel formulas would be relatively easy
to parse, but they with things like array formulas, etc it could start to get
a bit hairy.

This might be very useful for a project I have coming up, so I'll definitely
look at it further!

~~~
dtech
The basics are very simple, but there are some very intricate rules/quirks...

For example, the following are valid formulas:

    
    
        `=SUM(IF(TRUE,A2,C7):C10)`
        `=SUM(A1:B3:C2)`
        `=SUM(A1,,,,,,A2)`
        `=SUM((A1,A2),A3)` means something different than `=SUM(A1,A2,A3)`, both are valid
        `=SUM(A:A 1:1)`
    

We still have only <<100 production rules while the official Microsoft one has
hundreds. If you don't care about the complicated structures (and our research
indicates that they are very rarely used) you can make the grammar _a lot_
simpler.

------
mkesper
Why do you always need the most recent Visual Studio version for projects?

~~~
taspeotis
It could be written with C# 6 and Visual Studio 2015 is (currently) the only
version of Visual Studio to support C# 6.

You can see C# 6 features like the nameof operator in use here [1].

[1]
[https://github.com/spreadsheetlab/XLParser/search?utf8=%E2%9...](https://github.com/spreadsheetlab/XLParser/search?utf8=%E2%9C%93&q=nameof)

------
excel2flow
Too bad I didn't have this year ago, it would have saved me some work with
[http://excel2flow.asp2.cz/](http://excel2flow.asp2.cz/)

