
Computer vision basics in Excel, using just formulas - alok-g
https://github.com/amzn/computer-vision-basics-in-microsoft-excel
======
eigenvalue
This is an amazing idea! It's also a testament to the extreme power and
efficiency of the core Excel code that everything works so smoothly despite
this being not at all what Excel was designed for. There is something about
everything (data and "code") being so instantly and interactively available
for inspection that makes everything seem simpler and easier to grasp.

~~~
asdfman123
That's true. It's also important to point out that a lot of what runs neural
networks is just multiplying a bunch of numbers by each other and getting the
"best fit." It's all matrix math. I was surprised how straightforward it was
after taking Andrew Ng's Coursera course, since ML is considered to be so
advanced and cutting edge.
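
To make the "matrix math plus best fit" point concrete, here is a minimal sketch in plain Python. It is not taken from the course or from the linked repo; the helper names `matvec` and `fit_linear`, the learning rate, and the toy data are all assumptions chosen for illustration. It fits a line by repeatedly multiplying inputs by weights and nudging the weights to shrink the error.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def fit_linear(xs, ys, steps=5000, lr=0.01):
    """Fit ys ~ w*x + b by gradient descent on mean squared error.

    Hypothetical helper: steps and lr are illustrative defaults.
    """
    # Each input row is [x, 1] so the bias b is just another weight.
    X = [[x, 1.0] for x in xs]
    weights = [0.0, 0.0]
    n = len(xs)
    for _ in range(steps):
        preds = matvec(X, weights)
        errors = [p - y for p, y in zip(preds, ys)]
        # Gradient of the mean squared error: (2/n) * X^T * errors
        grad = [
            2.0 / n * sum(row[j] * e for row, e in zip(X, errors))
            for j in range(2)
        ]
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data lying exactly on y = 2x + 1
w, b = fit_linear(xs, ys)
```

On this toy data the weights should settle close to w = 2 and b = 1. A real network stacks many such multiply-and-adjust steps with nonlinearities in between.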

~~~
barrkel
Well matrices just encode linear functions, so it's not the most surprising
thing that you'd use them for calculating a lot of linear functions.

------
dsalzman
If you want to see the power of "basic" operations, watch this video of Dan
Ingalls, co-inventor of Smalltalk, demo his software to do OCR on Devanagari
text in 1980! [https://vimeo.com/4714623](https://vimeo.com/4714623)

~~~
scroot
This is such a cool demo

------
sambeau
I had a friend doing this in 2003. He had a spreadsheet that could read road
signs with a lot of white noise applied. He called it foveola vision. It was
super impressive. He later converted it to a C library but the concept was
essentially the same.

[https://www.scenereader.com](https://www.scenereader.com)

------
stared
See [http://www.deepexcel.net/](http://www.deepexcel.net/), an educational
April Fool's Day joke from 2016.

I used to show these spreadsheets to make it explicit that all operations are
simple, as in addition, multiplication, max and ReLU.
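
A rough sketch of how little is involved, in plain Python rather than spreadsheet formulas. The function names, the 5x5 image, and the edge kernel are made up for illustration and are not taken from the spreadsheets mentioned above. One convolution, one ReLU, and one max pool use nothing beyond multiply, add, and max.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation (what CNN layers usually call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Clamp negative responses to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Illustrative kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)
pooled = max_pool(relu(features))
```

Each output cell here corresponds to what a single spreadsheet cell formula would compute from its neighbours.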

------
yummypaint
Very nice. Spreadsheets are also great for doing quick Monte Carlo
simulations. Things like finding the solid angle of a cylinder from an
arbitrary perspective quickly become algebraically intractable. Raytracing with
gnumeric is comparatively easy.
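
For readers who want to try the cylinder example outside a spreadsheet, here is a minimal Monte Carlo sketch in Python. It is not the gnumeric setup described above. It assumes a finite cylinder whose axis lies along z, and the function names and sample count are hypothetical. Random directions are drawn uniformly on the sphere, and the fraction of rays that hit the cylinder, times 4π, estimates the solid angle.

```python
import math
import random


def ray_hits_cylinder(origin, direction, radius, z_min, z_max):
    """True if the ray origin + t*direction (t > 0) hits a capped cylinder on the z axis."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    # Side surface: solve |(ox, oy) + t*(dx, dy)|^2 = radius^2 for t.
    a = dx * dx + dy * dy
    if a > 0.0:
        b = 2.0 * (ox * dx + oy * dy)
        c = ox * ox + oy * oy - radius * radius
        disc = b * b - 4.0 * a * c
        if disc >= 0.0:
            root = math.sqrt(disc)
            for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
                if t > 0.0 and z_min <= oz + t * dz <= z_max:
                    return True
    # End caps: intersect the planes z = z_min and z = z_max.
    if dz != 0.0:
        for z_cap in (z_min, z_max):
            t = (z_cap - oz) / dz
            if t > 0.0:
                x, y = ox + t * dx, oy + t * dy
                if x * x + y * y <= radius * radius:
                    return True
    return False


def random_direction(rng):
    """Uniform direction on the unit sphere (z uniform in [-1, 1], azimuth uniform)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)


def solid_angle(origin, radius, z_min, z_max, samples=200_000, seed=0):
    """Estimate the solid angle (steradians) the cylinder subtends at origin."""
    rng = random.Random(seed)
    hits = sum(
        ray_hits_cylinder(origin, random_direction(rng), radius, z_min, z_max)
        for _ in range(samples)
    )
    return 4.0 * math.pi * hits / samples


# Sanity check with a known answer: viewed from a point on the axis, only the
# near cap is visible, so the solid angle is 2*pi*(1 - cos(45 degrees)).
estimate = solid_angle((0.0, 0.0, 0.0), radius=1.0, z_min=1.0, z_max=2.0)
expected = 2.0 * math.pi * (1.0 - math.cos(math.pi / 4.0))
```

The on-axis case is only there because it has a closed form to compare against. Moving `origin` off the axis needs no code changes, which is where the Monte Carlo approach pays off.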

------
StreakyCobra
Reminds me of this video of Matt Parker (standupmaths):
[https://www.youtube.com/watch?v=UBX2QQHlQ_I](https://www.youtube.com/watch?v=UBX2QQHlQ_I)

~~~
alok-g
OP co-author here. :-)

Yes, someone told us about this video when we first showcased this work. This
video and a few more such works that we have discovered since then are linked
in Q&A #7 in the readme. :-)

~~~
LegitShady
The text mentions a talk you gave. Was that recorded anywhere?

~~~
alok-g
We don't have a good-quality recording of this that we could release. I have
been looking forward to making one in the future. Thanks for the interest. :-)

Addendum: The text notes inserted within the Excel files partially cover for
it, as that's roughly what the talk had, other than a possibility of live Q&A.
The Q&A present in the readme is based on questions we have been actually
asked. :-)

------
BrandiATMuhkuh
This reminds me of a fast.ai video where they use Excel for a CNN.

~~~
BrandiATMuhkuh
Found the video: [https://youtu.be/gbceqO8PpBg](https://youtu.be/gbceqO8PpBg)

~~~
bigmit37
Yeah thought of this as well.

------
bsenftner
Anyone remember the line of PlayStation (PSX) bowling games? I was director of
the studio that wrote "Ten Pin Alley", "Brunswick Circuit Pro Bowling",
"Flintstone Bowling" and others. The bowling physics engine was originally
written by the founder of the studio in Excel. This is the same guy that made
the Vectrex game console
([https://en.wikipedia.org/wiki/Vectrex](https://en.wikipedia.org/wiki/Vectrex)),
and he found it easier to work in Excel than the fixed point math & C compiler
for the original PSX.

------
bradgessler
Spreadsheets are highly hackable sandboxed self-contained runtimes. They’re a
really great way to deliver self-contained client-side software that can
quickly evolve.

------
samdung
What is the most widely used database? Microsoft Excel. I was blown away when I
first learnt this fact.

~~~
ganstyles
I would probably lean towards describing Excel as a data store rather than a
database because it doesn't preserve many of the properties that make a
database a database, such as ACID compliance. Would anyone disagree?

~~~
catblast
Yes. ACID compliance is not a prerequisite for what most people, laypeople or
not, call a database.

------
Robotbeat
This makes me wonder... has anyone bothered with hardware-accelerated Excel?
Not just graphics acceleration. Seems like something you could do with an
FPGA.

I bet Microsoft has an FPGA-accelerated version of Excel in a lab somewhere.

------
bArray
I think the nicest part of this is for people to be able to poke at and
inspect every part of the code - very cool. Normally these things are hidden
in large loops! Here you can tug on a single thread and follow it through.

------
lowdose
Chart to data would be awesome. It takes a lot of time to adjust a png into an
image possible to redistribute with custom branding. It's only surface
detection so must be possible.

------
sandesh1712
An insightful and commendable effort to explain the complex topic of computer
vision and CNNs with a lucid, simple, hands-on, stepwise example in Excel.

------
godelmachine
The idea of mapping the two-dimensional structure of an image onto the two
dimensions of an MS Excel sheet is very intuitive.

------
wiseleo
Interesting how they published a paper on GitHub. I wonder who else will adopt
this format.

~~~
ta999999171
Anyone who wants their study reproduced!

------
atum47
I once heard a girl say that her dad can prove anything using Excel (she
was talking about how her dad had raised her and her brother). Every day I'm
more convinced she was right.

------
animalnewbie
I'm beginning to wonder who/what does/did more harm to technological progress
- Microsoft or aversion to Microsoft?

Some of the stuff they do is incredible, just not talked about.

~~~
harry8
Don't do stats with excel. It's all wrong, Microsoft won't fix the bugs.

They really kind of earned their reputation in an honest and direct fashion.
Aversion to Microsoft works great, doesn't it? Need a spreadsheet? Use
gnumeric. Calculation errors are bugs and those bugs get fixed.

~~~
nnghj
> Don't do stats with excel. It's all wrong, Microsoft won't fix the bugs.

e.g?

~~~
ska
See, for example these slides:

[http://biostat.mc.vanderbilt.edu/wiki/pub/Main/TheresaScott/...](http://biostat.mc.vanderbilt.edu/wiki/pub/Main/TheresaScott/StatsInExcel.TAScott.slides.pdf)

Or more formally papers like:
[https://doi.org/10.1007/s00180-014-0482-5](https://doi.org/10.1007/s00180-014-0482-5)

Both of which are a bit out of date so some things may have been fixed.

It's generally not a great platform for numeric work, but some things can be
improved if you know the issues. For example, the last time I checked (a while
ago) things like sum/std/mean would not do anything intelligent with large
columns/rows, leading to accumulation errors if you did it naively. You can
work around stuff like that if you know it is there, but you will end up
reimplementing things, which makes it painful.
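
As an illustration of the kind of accumulation error meant here, a small Python sketch follows. It makes no claim about how Excel's own functions are implemented; `naive_sum` and `kahan_sum` are hypothetical helper names. It compares a naive running total against Kahan compensated summation, using `math.fsum` from the standard library as the accurate reference.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan compensated summation: carries the rounding error forward."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = v - compensation
        t = total + y
        compensation = (t - total) - y
        total = t
    return total


# A long column of identical values that are not exactly representable in binary.
values = [0.1] * 1_000_000

reference = math.fsum(values)  # correctly rounded sum
naive_error = abs(naive_sum(values) - reference)
kahan_error = abs(kahan_sum(values) - reference)
```

The naive total drifts away from the reference as the column gets longer, while the compensated version stays within rounding distance of it. This is the sort of workaround that has to be rebuilt by hand when a tool only offers the naive version.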

------
hemantvirmani
This is an awesome approach to demonstrating something very complex in an
extremely easy way.

------
FpUser
This is just awesome

------
gugagore
Could someone report how feasible it is to run this in LibreOffice Calc?

~~~
ryanjshaw
It's in the linked readme:

"While the files open in LibreOffice (tested in version 6.4.0.3 (x64)), it is
slow to the level of being unusable. We have not tested in Apache OpenOffice."

~~~
whereistimbo
In my opinion this might be because of xlsx compatibility overhead. Using the
ods format might speed it up a bit.

~~~
dclusin
Microsoft has had really sharp people working on spreadsheet performance for
many years. I remember reading a blog post from I believe Joel Spolsky or
someone talking about what excel is doing behind the scenes to achieve high
performance and I was pretty impressed.

One example that comes to mind was that spreadsheets are just memory mapped
files and the layout of the file on disk is identical to the data structures
in memory. This allows them to eschew translation to a data interchange format.
So they got performance at the cost of interoperability, which is probably
what's hampering open office & friends.

~~~
Someone
That’s certainly history, if you use a modern file format such as .xlsx, and,
likely, also if you use the old format.

Microsoft likely changed several in memory structures when Excel went 64-bit,
if not earlier.

One thing that Excel does is multi-threaded recalculation
([https://docs.microsoft.com/en-us/office/client-developer/exc...](https://docs.microsoft.com/en-us/office/client-developer/excel/multithreaded-recalculation-in-excel))

~~~
saber6
> Microsoft Office Excel 2007 was the first version of Excel to use
> multithreaded recalculation (MTR) of worksheets. You can configure Excel to
> use up to 1024 concurrent threads when recalculating, regardless of the
> number of processors or processor cores on the computer.

Somewhere, there is probably someone running hundreds of threads for excel
(likely in a beefy VM/VDI). It is probably wired so deep into their business
that they are afraid to move to other methods (that are more scalable). But
such is the power of excel. What you see is what you get is not to be
underestimated.

