
Excel_to_code – Converts some Excel spreadsheets to C - gmac
https://github.com/tamc/excel_to_code
======
donatj
I've got a project in a similar veign here: [https://github.com/donatj/XLS-
XML-Formula-Expander](https://github.com/donatj/XLS-XML-Formula-Expander)

It's ancient and flawed but it converts a SpreadsheetML (
[https://en.wikipedia.org/wiki/SpreadsheetML](https://en.wikipedia.org/wiki/SpreadsheetML)
) aka "Excel 2003 XML" spreadsheet to PHP in real time.

There is a VERY complicated configuration tool for a major steel company that
runs on this still, last I checked anyway. I left that job ~5 years ago. I've
done a few small updates since then for them, but not much.

It was nice though because it allowed the company to just submit a new
Spreadsheet and their tool would automatically update. I was _very_ proud of
it at the time. Haven't had reason to maintain it though, but it works much
faster than other tools that do similar for PHP.

------
danjc
As a developer (or someone with some scripting or DBA skills), it's pretty
easy to scoff at Excel being used as a way to internalise business rules but
the thing about it is that it's this ubiquitous language that a lot of people
understand.

We've saved ourselves a lot of hassle implementing application integration
solutions by outsourcing domain-specific number crunching/mapping to a
spreadsheet in our integration platform. It enables us to treat the customers'
business rules as a black box - feed inputs against named ranges and read out
named ranges
([https://developers.flowgear.net/kb/Node:Excel_Workbook](https://developers.flowgear.net/kb/Node:Excel_Workbook)).

~~~
morgante
How did you go about converting the Excel workbooks into a usable program
which you can feed data into and get back computed results?

~~~
cnvogel
I think Joel Spolsky (?) is often quoted by saying that the easiest way to
parse ms office file formats is to just install it on a server and use the
documented API to Word, Excel,...

I used this approach regularly about 4 years ago in my previous job to
interact, query,... with data stored in "excel databases".

So, while I'm not the parent commenter, my suggestion would be to have a
windows server that runs both my interface software, and excel. You load the
workbook, populate input fields, let excel do the processing and query output
fields/workbooks/...

Of course the excel sheet could also contain VBA code actively pulling out
data from a DB, heck it could even call code in custom shared libraries. But I
doubt that passing around workbooks full of code amongst it illiterate users
will do any good. The less magic they contain outside of stock cell formulas
the less chance of users messing up the computation.

If you want to play with it, look up the win32 modules shipping directly with
ActiveState's python distribution.

~~~
danjc
That's about how we do it. Excel (and its COM API) was never intended for
multi-tenancy or consumption from a service. That said, if you run the service
under a named user account (as opposed to SYSTEM or NETWORK) and make sure you
won't have users accessing the file and locking it, you're good to go.

If you need to invoke the same sheet repeatedly, you can build a cache of
range output values keyed on a hash of range input values. This enables you to
bypass Excel for a state that you've previously seen.

It's also pretty easy to inject and execute script (as opposed to invoking
existing scripts) - that deals with the problem of users being able to see or
edit VBA.

~~~
cnvogel
> users accessing the file and locking it

I'd say that having such a file on a network location accessible by your users
is a pretty bad idea. As always, a proper develop/test/release process should
be in place, otherwise great confusion about the actual revision of a "Excel
Program" in production will ensue with absolute certainty.

Treating it as a binary blob doesn't preclude one from implementing at least a
basic form of revision control. You'll not be able to create useful diffs
between revisions, but at least one can identify them.

------
robsco
I've got a system that has been working quite nicely. Basically it's a python
server that interfaces with a headless instance of LibreOffice Calc.
Spreadsheets are uploaded to a directory that is polled for changes. The
python server opens/closes the spreadsheets in LibreOffice. The python client
connects to the python server and can modify and read cells in any
spreadsheets. Each spreadsheet is 'locked' when it is being accessed so to
avoid two simultaneous modifications/requests giving incorrect results.

LibreOffice nowadays works pretty well with excel spreadsheets. It also means
that as LibreOffice improves so does my excel sever. I'm also working in the
GNU/Linux world so this is a great solution for me.

~~~
batou
This sounds similar to a reporting system I wrote in about 1999 using Word
automation API. It sat on a windows NT box and watched for changes on a
netware share. That would load up a text file, read it, parse it and pick
documents and fill in bookmarks in the documents then print it to a printer or
PDF file.

Worked wonderfully apart from the memory leaks :)

------
cm2187
In fact I always thought it should be a feature of Excel. Take a worksheet,
define inputs and outputs, and make the worksheet a function which can then be
reused to apply for instance to a table of data.

There are many things that could be improved in Excel. But Microsoft seems to
have unstaffed all developpers from the office team and that product seems to
be in maintenance mode. The only innovation in 10y is a button that enables to
share a spreadsheet on facebook. Will be very popular in a corporate
environment.

------
lorddoig
> For small spreadsheets this will take a minute or so. For large spreadsheets
> it is best to run it overnight.

Ah, Ruby.

~~~
jakobegger
Any insights why the conversion is so slow? Intuitively, converting a
spreadsheet doesn't sound like it should be a high complexity task.

~~~
JupiterMoon
Where do you start? No seriously where would your program start and where
would it finish? Most spreadsheets are a tangled mess of interacting cells.

~~~
laumars
Excel manages to process it in real time.

~~~
JupiterMoon
Yep for all its faults it's an impressive program.

~~~
laumars
In fairness to Excel, I think most of it's faults are more down to user misuse
(eg people creating "databases" in Excel) rather than problems with the
package itself.

Though Excel's handling of CSV can sometimes be a little troublesome -
particularly with date / time fields. And I wish it support Open Document
Spreadsheet (.ODS). But the former is a fringe case and the latter is
obviously just a business decision. So, as you say, it's a very impressive
program.

------
jscottmiller
> The generated code uses floating point, rather than fully precise
> arithmetic, so results can differ slightly

Use with caution.

~~~
teraflop
Excel already uses floating point internally, so if you're worried about
compatibility, this is _less_ risky than switching to arbitrary precision
math.

~~~
jscottmiller
Wow, didn't realize that about excel (though I have observed it in google
sheets). I wonder how many hedge funds rely on sheets with catastrophic
cancellation issues.

~~~
AceJohnny2
I expect all financial math is fixed point.

~~~
chillingeffect
I'm not sure why ppl downvoted you. But from what I've heard - never
implemented myself - what you're saying is common... At least as far as using
separate fixed point values to represent the left and right sides of the
decimal place.

Now I'm sure for things like computing interest by raising things to a power
or even computing percentages, etc, floating point is used (and eventually
reduced to fixed point values), so maybe someone found it too absolute to say
"all" financial math is fixed point? Anyway, if someone with experience can
confirm or deny, it would be neat...

~~~
JupiterMoon
Because he is wrong? Excel uses 64 bit floats.

------
barsonme
Really interesting project. The source code is a bit scary[1] but I suppose
that's to be expected (machine) translating Excel to C. Good work.

[1] - [https://i.imgur.com/q2KZRR1.png](https://i.imgur.com/q2KZRR1.png)

~~~
maguirre
unrelated but you should see the generated code that matlab/simulink outputs
from their Realtime workshop. This reminds of that

------
arthurcolle
Not really understanding why one would want this?

~~~
untog
Because a lot (a _lot_ ) of companies have a staggering array of business
logic sewn into spreadsheets. Eventually that might end up having performance
issues, so transforming some of that logic to C is, at the very least, an
interesting solution to a problem.

Everyone looks down on them, but spreadsheets really are incredibly
functional, and a lot more accessible that most programming languages.

~~~
arthurcolle
Right, I get that... but why C? Presumably if you're using Excel you have
access to the whole MSFT stack, so it just seems weird to use a Unix language
when it seems likely that some VS/C# tool has already been built for this.

I know C/C++ works just fine on Windows, but the command-line tools on cmd.exe
leave a lot to be desired.

~~~
kcbanner
What makes C a Unix language?

~~~
cdcarter
Well, C was designed for the purpose of writing Unix.

------
kranzky
Nice work! I built something similar that keeps everything in Ruby. It was
made to allow models built in Excel by a non-programmer to be executed on the
back-end of a Rails app, but it's pretty generic:
[https://github.com/agworld/soroban](https://github.com/agworld/soroban)

I had big plans for the project; the next step was to convert the models to
JavsScript so they could be executed directly on the client. But, like most of
my pet projects, this has remained a pipe dream.

Good luck with it!

------
sqlacid
Surely I'm not the only one that remembers someone doing this with Lotus 1-2-3
in the 80's...how far we've come. What's next, a database in a spreadsheet?
Word processor? Modem application?

~~~
amyjess
At my last company, someone wrote a format to encode RDF data in spreadsheets.

Google Sheets was our primary tool for configuration and content authoring for
way too long. Eventually, he added support for OOXML sheets, too. At some
point, a long time later, we started migrating to TTL, and thank goodness for
that.

The code was all open-sourced, by the way, but I'd rather not link it here
because I don't really want this username associated with that company.

~~~
morgante
> The code was all open-sourced, by the way, but I'd rather not link it here
> because I don't really want this username associated with that company.

Any search clues? It sounds really interesting and I've been exploring the
possibility of using spreadsheets as a widely understood authoring tool.

~~~
rch
I've no idea if it's related, but here is one I'm familiar with:

[https://code.google.com/p/owlpopulous/](https://code.google.com/p/owlpopulous/)

------
tmaly
I did something like this at work to convert these huge excel sheets to a
compressed format. I normalize the excel xml files with an xslt then I extract
of the meta data I store in named ranges.

------
dnesteruk
Also check out [http://activemesa.com/x2c](http://activemesa.com/x2c)

