
Ask HN: Language compiled to Excel spreadsheets? - marekr
Hi!<p>Yesterday there was a thread about biggest problems with Excel. I’d like to ask you for an opinion on an idea - source code for Excel. The goal is to give people who create models and tools in Excel the advantages of having a source code to what they are building and maintaining.
It is challenging and time-consuming to build spreadsheets that are really solid, minimize the chance of errors, that can be unit-tested and version controlled.<p>The idea is to create a declarative language which lets you describe fields, tables and worksheets in spreadsheet. You are encouraged to use names instead of cell coordinates. For formulas you just use Excel’s formula language. There’s a simple “compiler” that converts that into a spreadsheet.<p>Compared to regular Excel:
- you can unit-test the source code
- you can keep it in version control
- it’s much easier to read and reason about (it’s auditable)
- the outcome is a spreadsheet that satisfies the best practices (e.g. formula cells are blocked, input cells are validated).<p>What do you think about the idea? Who could make a good early adopter?
======
BjoernKW
There might be some merit to this in very specific use cases but I suppose for
those use cases you might be better off using a language / environment such as
Python or R anyway.

I know quite a few heavy Excel users (mostly of the MBA variety). They
sometimes accomplish astonishing feats with Excel. They're skilled and
knowledgeable problem-solvers but they're not very likely to touch a
programming environment other than Excel.

The reason for this presumably is that spreadsheets and Microsoft Excel in
particular have hit a sweet spot of an IDE that sort of covers many common
business use cases while still being easy enough to use and getting started
with for an intelligent person who knows nothing about programming at first.
Spreadsheets are also a very visual environment compared to a text-based
programming language.

So, yes there's lots to improved about handling spreadsheets, making them more
testable and scalable but the core - the spreadsheet development environment
if you will - isn't going anywhere. At least I can't see too many people
(especially the MBA kind) using a declarative language instead of Excel.

~~~
slantyyz
>> Spreadsheets are also a very visual environment compared to a text-based
programming language.

I like to think of a spreadsheet to be a REPL for the masses.

------
dragonbonheur
A programming language that is as easy to write as Excel formulas exists
already: It's called BASIC. Lots of people fell in love with it since the late
70's. Sure it's often ridiculed, mostly by people who somehow manage to slide
an ill-fitting and unwarranted Djikstra quote in their argument, but the
language has evolved with the times and once enabled millions of programmers
to earn a respectable and honest living. Dialects like FreeBASIC, QB64,
DARKBASIC, XOJO, FNX Basic, AutoIT and GAMBAS may seem very different to one
another, but they remain BASIC and easy enough to learn within a week.
Communities still flourish and are very tightly knit. Source code, Modules and
Libraries are abundant, as are very capable IDEs, together with complete
documentation.

Go forth(cough) and give FreeBASIC a try for a start. It already runs faster
than Python ever will on Linux, DOS32 and Windows. QB64 is cross-platfrom too.
If you're on the move give RFO BASIC a try, when you get back to your regular
PC you can make APKs of the programs you've designed on your tablet. B4J
enables you to run your programs on the Java VM. And if they are still not
enough, you can still write your own dialect of BASIC... in BASIC!!!

~~~
networked
I've also noticed the trend with various dialects of Basic producing tightly-
knit communities but I wonder if you can credit that to the language itself.
Is a modern, object-oriented Basic really friendlier than (a subset of) Python
or Lua (edit: with a similar standard library)? I find it at least as likely
that today Basic's reputation as an informal, hobbyist- and beginner-friendly
language shapes the communities that use it in a certain way more than the
language itself. I acknowledge this may not have been the case in the 1985 or
even 1995. Blitz3D [1] was an exceptionally quick and easy way to do 3D in the
year 2001 but at that point it could have been written to expose the same API,
say, to a Pascal compiler.

What exactly is right about Basic has been a matter of great interest to me
lately because I have been thinking about developing a bootable early home
computer Basic-like interactive environment for UEFI. I quickly realized that
one of the hardest design decision involved in this project would be whether
the language of the environment should actually be a dialect Basic (rather
than Logo, Lua, Scheme, JavaScript, Tcl, Python or something else entirely).

On a related note, I wish there were large empirical studies of how quickly
new programmers learn to use each of the stereotypically friendly languages.
Perhaps MOOC will eventually produce those.

[1]
[https://en.wikipedia.org/wiki/Blitz3D](https://en.wikipedia.org/wiki/Blitz3D)

~~~
dragonbonheur
> Is a modern, object-oriented Basic really friendlier than Python or Lua?

QuickBASIC had User Defined Types, FreeBASIC not only has Pointers (and
therefore Function Pointers) but also adds methods to TYPES which are defined
as Subroutines just like in C++. In Fact FreeBASIC can be thought of as a
covert way to learn and understand C++ for QBasic programmers.

As for it being an informal or hobbyist language, if one begins by designing
the proper data structures and adds the adequate functions to handle them, the
resulting programs can be just as capable as those designed in any other
language. Being able to think like a programmer makes all the difference, not
the reputation of the language being used.

As for Blitz3D, it can access external DLLs. DLLs can be easily written and
compiled with freeBASIC. No problem for any adventurous BASIC programmer.

As for your BASIC dialect you can take a look at [1] to learn how to write
your own in FreeBASIC or QB64. This code can be easily understood and ported
to any language.

[Edit] The standard library may not be as complete as Python's but FreeBASIC
can work with most DLLs compiled with C and the documentation will mention
ways to include C headers and work with various external libraries.

[1]
[http://www.qb64.net/forum/index.php?topic=6388.0](http://www.qb64.net/forum/index.php?topic=6388.0)

~~~
networked
>As for it being an informal or hobbyist language, if one begins by designing
the proper data structures and adds the adequate functions to handle them, the
resulting programs can be just as capable as those designed in any other
language.

Well, that is exactly my point. Basic has the reputation of being easy for
beginners and hobbyists but properly written code in a recent OOP dialect of
Basic doesn't look _too_ different from code written in any object-oriented
Algol-derived language. As such, it is probably about as easy/as hard to
write. Which is what makes me suspect that Basic-the-language might not really
be the secret sauce of environments like QB64 or Blitz3D. Other possible
reasons they get the sort of following they do could be that

1\. Those environments are easy to install and create your first project with.

2\. They start you with an IDE, so compiling and running your first code is
only a matter of pressing a single key or clicking a single button.

3\. They ship a "fun-oriented" standard library. You don't have to install any
addition packages or even import extra modules into your project to access
subroutines for plotting, sprite graphics, sound, 3D, networking, etc.

4\. They have high-quality, beginner-friendly documentation with plentiful
examples.

5\. They are fast, both the IDEs and the runtimes.

6\. (A language-related one.) Somewhat surprisingly, they are more often that
not statically typed and offer decent error messages. Perhaps compile-time
errors are a better educational tool than runtime errors.

I am not sure how important it is to have all of the above points but 1-3 seem
crucial.

~~~
dragonbonheur
The secret is indeed the traditional tight integration of IDEs + Documentation
+ Ease of compilation and deployment of code. I'd say the first 5 points are
absolutely crucial to the success of good BASIC coding environments and more
languages should offer the same.

~~~
networked
I think a great example of a non-Basic language that succeeded because of 1-6
is Turbo Pascal. Its documentation was thorough enough that you could learn
and use the language and its many libraries without external references
(except, perhaps, to get you started).

Going with Basic instead of a different language for your IDE can be a good
choice for another reason. It is, or at least was, good for marketing to
people who write code but don't consider themselves developers. To the
potential buyer of the IDE using Basic signals, "this is a friendly
environment; we don't expect you to be a grizzled professional programmer".
(This may not work today. The latest product from Blitz Research is called
"Monkey" and REALbasic was renamed "Xojo" in 2010, both apparently for
marketing reasons. The company behind Xojo described theirs at
[http://www.xojo.com/support/faq_xojonew.php](http://www.xojo.com/support/faq_xojonew.php))

------
gus_massa
I think that the advantage of Excel for non programmers is that it's less
abstract and you can see all the intermediate results instead of having a
mental model in your head. Also it makes some bugs show faster, because you
see the immediate result.

Perhaps an alternative is to make something between Excel and "text"
programming. Usually a big spreadsheet has many almost independent blocks.
Make some mini-spreadsheets with names and clear title rows/columns. Allow the
mini-spreadsheets to be positioned easily and connected by formulas.

(Disclaimer: I use a lot of Excel instead of small scripts, the main problem
is that you cant's nest 3 for loops.)

[In a related way, I saw some physicist program small calculations in
Mathematica because it's more interactive than Fortran and it's easier to make
graphs and find the errors.]

[There are some similar attempts. For example in Racket you can use #lang
frtime to get a "reactive" program thatauto updates the values.
[http://docs.racket-lang.org/frtime/](http://docs.racket-lang.org/frtime/)
It's a pity that the docs don't have an easy example.]

~~~
dang
> the main problem is that you cant's nest 3 for loops

I'm curious what you mean by this. Care to explain?

~~~
gus_massa
The problem is that the spreadsheets are 2D, so you iterate easily over two
variables, the row and the column.

If you need to "program" this in an Excel spreadsheet

    
    
      For i = 1 to 5
        For j = 1 to 5
          If 2 * i + j = i * j then
            Print i, j, "*"
          End If
        Next
      Next
    

You can "write" this in a 6x6 block with the values

    
    
        1  2  3  4  5
      1 =If(2*B$1+$A2=B$1*$A2, "*", "")
      2
      3
      4
      5 
    

And copy the formula from B2 to all the cells, so you get

    
    
        1  2  3  4  5
      1 
      2
      3       *
      4    *
      5

~~~
dang
Ah, that's what I thought you might mean. Can you talk about why you use
spreadsheets in the first place, then? You're dealing with nontrivial problems
if you hit this limitation, and you could write programs in a general-purpose
language if you wanted to. So what do you get from Excel?

(I care about this, btw, because my real job is making a new spreadsheet.)

------
davidwihl
We did something like this in OfficeWriter [1]. We built a set of APIs to
create, edit and generate Excel (and Word) files programmatically without the
need for MS Office in the server. In addition, we had a "template mode" so a
user could place datamarkers within an existing Excel spreadsheet. At run-
time, our code would open the spreadsheet, populate data from a provider data
source and then expand all associated cell references such as formulas,
charts, etc.

so it was not fully declarative but would not have been difficult to map a
high level language to our API. Some sophisticated customers did something to
that effect with a web based wizard type interface.

Microsoft purchased the IP in 2007. The independent product is still sold and
maintained for existing customers.

[1] www.officewriter.com

------
CamTin
For non-technical people whose tasks have reached the limit of what can easily
be maintained in Excel, it seems like you need the opposite as well--call it a
spreadsheet decompiler, which outputs this hypothetical declarative DSL.

Alice in accounting maintains a set of spreadsheets for a number of years, and
finally the task of adding some new feature, or porting it to a batch system
in some production line-of-business system falls to Bob, a developer. Bob runs
Alice's spreadsheet through the decompiler, spends some time refactoring it by
adding descriptive names to some of the "variables," building tests, and
generally structuring it for extensability and maintenance. Now he can compile
it back to Excel and send it back to Alice and others that need it, or (and
this is the really cool part), deploy as just another module in the ordinary
business automation.

You've basically built a way to turn a spreadsheet hacked together by a domain
expert and a small number of development man-hours into something much move
valuable and maintainable.

I don't know much about enterprise-y reporting software, but it seems that if
you built this, with the right integrations, you could probably sell quite a
few copies of it, plus some high-buck support contracts.

~~~
thanatropism
This is a bit harder. An Excel spreadsheet has no self-evident order of
evaluation -- the cell dependency graph can be arbitrarily complicated.

Of course, Excel itself (and other spreadsheet programs) already do
"decompiling", but this works like an always-on loop that resolves the direct,
algebraic dependencies but makes no distinction for business logic. This is
probably true of most real decompilation tasks.

------
platform
Excel's programming model is called "non monotonic data flow programming" [1].
The diagram referenced in [1] shows which other languages support that model
Therefore I would suggest to start from Oz, Alice, Curry, AKL, FCP... as the
platforms for Excel replacement

[1]
[https://www.info.ucl.ac.be/~pvr/paradigmsDIAGRAMeng108.pdf](https://www.info.ucl.ac.be/~pvr/paradigmsDIAGRAMeng108.pdf)

------
saboot
Wouldn't existing excel bindings for languages accomplish this?

[https://www.pyxll.com/](https://www.pyxll.com/)
[http://xlwings.org/](http://xlwings.org/)

------
thanatropism
When we have semi-complicated models we try to factor out the harder part to
run outside Excel -- and thus have programming utilities available -- and
write "front-end" logic (currency conversions, inflation adjustments, charts,
etc.) in Excel. I work a lot with Matlab, and this is really to use with the
.xlam add-in provided. The trick is to figure out the division between the
"back-end part" (that needs debugging tools, tests, etc.) and the "spreadsheet
part" (that can take an understandable, but black-boxed/stored-as-values input
and do spreadsheet logic that's flexible enough for the guys who need to mess
with that).

------
dragonwriter
I think the main reason people choose to use Excel the way they do (aside from
lack of choice resulting from policy, etc., decisions) is the easy, no-coding
onramp to building things (And tools that let you do coding aside from what is
built into Excel would likely, for those policy constrained, fall afoul of the
same constraints.) So, I suspect a compile-to-excel language helps anything;
the people that need it won't use it, the people that would use it already
have many ways to use traditional programming languages to produce apps that
the end-user accesses through Excel.

------
slantyyz
This is a useful idea, IMO. I would use it for programmatically generating
complex reports in Excel. Would be nice if it supported advanced stuff like
charts and PivotTables too.

I've recently been messing with some Office files (mostly XLSX) using various
libraries in Node, and I've been thinking that a transpiler type of tool would
be great for this use case, given that an XLSX file is basically just a zip
file with some XML files.

------
brudgers
Historical precedent indicates that this will not be a high value story for
the product owner. Variant data types are a feature, not a bug. If they don't
ship in the box, someone will union up their own to make life easier.

People use Excel to reason about their business. Reasoning about the code is a
second order problem and SourceForge is paved with the rotting corpses of
projects focused on making a better spreadsheet.

------
lbruder
Sounds like a mixture of the sc file format and BASIC:

    
    
        let A0 = 123 
        let A1 = 234 
        let A2 = 345 
        rightstring B3 = "asd"
        leftstring B5 = "foo"
        label B11 = "bar"
        goto B11 A0
    

Just a textual description of what your spreadsheet should look like. Gosling
had it right back in the 80s ;)

------
travjones
Forgive me if I'm an idea killer, negative nancy, devil's advocate, or
whatever. Although this may be functional for a few people, I don't see it
gaining traction among users of Excel. I'm a grad student and my lab mates all
use Excel proficiently. However, they steer away from programming or anything
to do with "code." As an anecdote, I had a few terminal windows open on my
laptop. One of them walked by and remarked, "Are you hacking?!"

Also, I doubt version control is something this community would begin using,
simply because of the perceived effort required to get going (in reality,
source control is not that complicated, but once you start talking about
"repositories" you are going to lose these people). It's unfortunate, but most
of society (even really smart folks!) prefer to remain ignorant to tech in
general. :/

------
cfallin
I did something like this when I was doing lots of data analysis for a
research project: [1]

The problem space I was addressing was more restricted than a fully generic
spreadsheet model, but I found everything you say about reproducibility and
descriptiveness to be true. I kept my report/plot descriptions in git
alongside the experiment setups / parameters.

FWIW, prior to this, I often wrote one-off Python scripts that would generate,
e.g., 1000 rows of a particular set of formulae in CSV then import that into
Excel. It probably wouldn't be hard to build a little tool in a few hundred
lines of code to do this.

[1] [https://github.com/cfallin/scomp](https://github.com/cfallin/scomp)

------
codekiln
There is far too little done to bridge traditional programming expertise with
spreadsheet expertise. In that respect your idea hits close to a sweet spot.
However, I think you're not quite at the optimum center of opportunity if you
care about this realm; instead of making a programming language that can
compile to spreadsheets (few programmers truly care about spreadsheets; few
spreadsheet masters care about programming), consider making a declarative
markup for ETL processes based on visualizing results in spreadsheets, with
bindings in all the major programming languages.

------
eitally
It's a useful idea, but it would only be useful for those programmers who are
saddled with using Excel as their delivery mechanism ... which is a tiny
minority of the people writing complex VBA, macros or formulas, or using Excel
add-ins they customize. Those people are usually business users and are not
interested or not willing to act like real programmers.

Imho / ymmv, of course, but I am coming from 15 years of enterprise experience
where a lot of my team's dev work seems to have been replicating Excel
spreadsheets as web apps. :-/

------
bryanh
Link to thread from yesterday:
[https://news.ycombinator.com/item?id=9935357](https://news.ycombinator.com/item?id=9935357)

~~~
pickle27
thank you

------
jakozaur
How about company that will turn your spreadsheets into apps. Start with
specialized consulting and start building product to cover common use cases.

Like Gigster: [http://techcrunch.com/2015/07/22/uber-for-
developers/](http://techcrunch.com/2015/07/22/uber-for-developers/)

Input: Send the existing spreadsheet and maybe some description.

Output: Based on spreadsheets and their history you build and host a webapp to
do that.

~~~
gmac
A friend wrote this: a project that converts Excel sheets to C or Ruby.

[https://github.com/tamc/excel_to_code](https://github.com/tamc/excel_to_code)

------
draegtun
There is an Excel dialect (DSL) for Rebol -
[http://www.robertmuench.ch/development/projects/excel/dialec...](http://www.robertmuench.ch/development/projects/excel/dialect_documentation/)

Here is a cleaner example I posted on HN before -
[https://news.ycombinator.com/item?id=6668323](https://news.ycombinator.com/item?id=6668323)

------
jeffwass
Do any of the open source office spreadsheet equivalents have a sanely
parsable file format that can be extended for this? Ie, this language would
basically be a more verbose file format?

------
w_t_payne
I'm not sure this warrants a domain-specific language, but I can see the
attraction of a _really_ nice API for generating dynamic Excel reports with VB
macros and whatnot.

------
andrewchambers
Hmm, there may be merit in this. Though a counter case was when my brother
just wrote a C# plugin for my dad to perform some of the more annoying
functions he was dealing with.

~~~
marekr
Would it be beneficial if your dad was able to achieve the same on his own by
using the language?

~~~
andrewchambers
Really hard to say I think, anything is possible.

------
_pmf_
> You are encouraged to use names instead of cell coordinates.

You do realize that it is perfectly trivial to name cells and regions in Excel
and use these names in formulas?

------
cschmidt
The [https://datanitro.com/](https://datanitro.com/) product integrates Python
into Excel quite nicely.

------
vhogemann
Like Sublime Text, but for Spreadsheets?

1) Create a contained python/node/ruby environment.

2) Bind an object to a spreadsheet ui

3) ????

4) Profit?

------
WorldWideWayne
It's a great idea. I built a whole reporting package that basically rendered
to Excel and it was a real time-saver vs developers creating one-off Excel
books and writing custom code to fill them in.

Rendering to Excel isn't the difficult part though (unless you're trying to do
it equally well on all platforms I guess). Creating a new language/DSL and all
of the other features you mentioned are the riskiest parts of this venture.

So, can you do it? If so, do it quickly and then sell it to Microsoft for a
couple hundred million :)

