
Excel and SQL - karamazov
https://datanitro.com/blog/2013/5/23/Excel_and_sql/
======
slg
I might be laughed off of HN for even suggesting this, but why not use Access
and VBA for something like this? Most computers that have Excel installed will
already come with a copy of Access. Plus Microsoft built the three to work
together pretty seamlessly. It might not provide all the same advantages of
Python and other database software, but it is easier to pickup and is a huge
improvement over the old Excel as a database solution.

~~~
mattmanser
Access is for people who should be programmers but aren't yet. If you find
yourself thinking about Access it's probably time to hire some proper
programmers or get some proper programming training rather than a SAMS teach
yourself VBA in 21 days book (OT, do they still have them?).

But most businesses won't realize that/be too cheap to spend the money to save
much more money. Although it's much better these days than it was 5 years ago.

Though I know a lot of MI people where most of their job could probably be
automated.

~~~
jacques_chester
Speaking as a Real Database(tm) bigot, Access has two key advantages over Real
Databases.

1\. It's there.

Most large organisations stump up for Office Pro, and Office Pro includes
Access. The corporate policies prevent you from installing the Real Database
of your choice -- you can only use what's already installed. Happily, that
includes Access.

2\. It's upgradeable to a Real Database.

Microsoft make transforming Access into a true multi-user SQL database fairly
straightforward: install SQL Server and run the upgrade Wizard.

If SQL Server is not your personal favourite Real Database, then with a bit
more work you can get Access to talk to something else via ODBC. Not as
seamless, but still a clear upgrade pathway.

One of the projects that made me realise I wanted to be a developer and not a
lawyer (long story) was an Access database I wrote for my part-time job. An
errors-tracking system. I calculated that it saved the company 35 hours of
_manager_ time per month.

What did they have before that? A physical book, typed into an Excel
spreadsheet once per month.

Access is a tool with unique bureaucracy-dodging properties. It's important
not to discount those.

~~~
untog
Seconded- Access was my gateway drug into development, too. Well, not quite- I
knew I wanted to do it and had played around with PHP in my spare time, but
after graduating college I found myself in a crappy administration job. They
had a crappy Excel-based tracking system, I replaced it with Access. Then that
grew into a SQL Server-backed VB.NET application. Then I left, went into
another .NET job developing C#, then journeyed into the world of Node, Python
and Ruby.

TL;DR: Access is a great place to get started.

------
sokoloff
Is Python/DataNitro really "One of the easiest ways to connect Excel to a
database"?

I can go to the Data tab, click on the ribbon item "From other sources:From
SQL server" or "From other sources:From Microsoft Query" in far less time than
it takes me to read this blog post.

Am I missing something? (other the possibility for SQL injection attacks by
users of that spreadsheet)

~~~
gizzlon
Can you use it with arbitrary sql (non MS) servers on other hosts?

~~~
sokoloff
Yes, the latter is an ODBC option, which gives you arbitrary connectivity.

------
danbruc
Excel supports querying databases out of the box - Data > Get External Data,
no coding besides writing the query required.

~~~
TheAnimus
Yes, and with the new 2013, it has some good OData stuff too.

Whilst this is a blog post by the party who have created the library, I also
disagree with some of their assertions.

Hosting it in a sharepoint type thing, with track changes on, multiple users
work quite well indeed. Not to mention that if someone is doing modifies or
deletes, I'd much, much rather have that kind of history (hell even git/svn)
than having a database without an audit setup. Given the amount of work
involved in setting up an audit system, merging it into the excel UI they've
just created, I really can't see the point he is making, or where he is coming
from.

In fact I wouldn't really suggest people moved away from Excel for the volume
of data he speaks of ether, it is very easy to backup (host on sharepoint or
similar) incredibly easy to share with the people work on it.

What I would say for it being time to move is when you have a _relationship_
then its damn well time to move.

------
reeses
Pre-SQLite, I used the Excel OLEDB provider for a simple data store.
Basically, each sheet is a table, and once you have a connection, you can run
SQL against the workbook. "SELECT * FROM [People]" works mostly as expected,
and ranges can be specified like "SELECT [People$A] FROM [People$A2:Y314]"

------
BrianEatWorld
Its no substitute for a real db but, if you are stuck in Excel, I find that
Microsoft's Power Pivot is a good transitioning step. It comes free with most
licenses and is pretty well supported.

It adds SQL Server functionality to Excel, speeding up large queries, adding
SQL-like query functionality and greatly extending the limits on data, such as
the approx one million row max.

<http://www.microsoft.com/en-us/bi/PowerPivot.aspx>

------
kijin
> _ticker text, position integer, purchase_price real_

Oops, bad example. Never store money using a floating-point data type! SQLite
has a NUMERIC type that preserves the exact value of your decimal amount.

------
dergachev
Excel can easily export CSV files, which are easy to parse. However, standard
Linux tools aren't great for working with CSVs, especially ones with multi-
line content (eg exports from a CMS).

As an experiment, I wrote a quick ruby script called csv2sqlite which parses
one more CSV files (and their headers), and automatically populates an SQLite
database based on the CSV.

If you have a CSV and want to easily know how many records it has, or to
filter or join these records, it can be just a matter of running something
like following:

ruby ~/csv2sqlite/csv2sqlite.rb baby-names-10.csv --output babynames.db

sqlite3 babynames.db "SELECT * FROM baby_names_10 WHERE percent > .05;"

Hope it helps you!

~~~
Someone
_"Excel can easily export CSV files, which are easy to parse"_

'CSV' and 'easy to parse' do not go together that well
[http://en.wikipedia.org/wiki/Comma-
separated_values#Toward_s...](http://en.wikipedia.org/wiki/Comma-
separated_values#Toward_standardization) also is instructive:

Nevertheless, RFC 4180 is an effort to formalize CSV. It defines the MIME type
"text/csv", and CSV files that follow its rules _should_ be very widely
portable.

[...]

Each record " _should_ " contain the same number of comma-separated fields.

[...]

Fields containing a line-break, double-quote, and/or commas _should_ be
quoted.

[...]

The format is simple and can be processed by _most_ programs that claim to
read CSV files.

~~~
Groxx

      $ irb
      >> require 'csv'
      # => true
      >> CSV.parse(file)
    

^ works every time I've tried it. Systems which export mangled data to CSV
probably do so elsewhere, so CSV isn't special there, and CSV is really really
simple to escape well enough that any decent parser won't have any problems at
all.

------
cafard
Not to be too obvious, but what about plain old Microsoft Query? I pull data
into Excel from time to time with that.

------
jwilliams
Not sure of the audience for this. If you understand SQL you'd already know
most of it... The rest use Excel because that's what they understand.

In this case you have someone that understands SQL and Python, but still wants
to use a spreadsheet. Applying the predicates at the start, this person should
dump Excel entirely - they can get much better integrity using SQL and Python
alone.

~~~
regularfry
There are many, many organisations which use Excel sheets as de facto standard
interfaces, or "applications" you _have_ to use, because it's the only common
tool the people with the domain knowledge have access to and are trained in,
and sharing them is trivial - just email a .xlsx. I've seen government
regulators supply template Excel sheets as the mandated standard by which
private companies _must_ , legally, submit data.

Being able to interact with those spreadsheets in a sensible language with a
front-end that doesn't suck would be huge.

------
andrewflnr
I don't understand the point about backups. I would have thought a database
was more complicated to back up than an excel file.

------
amitparikh
I'm not saying it's the smartest thing to do, but Excel can most certainly
handle data on the order of tens of thousands of rows. It's not blazing fast,
but Pivoting that much data is not a problem at all.

------
retube
As a heavy user of Excel+VBA I would love to be able to switch to coding in
Python. I detest VBA with a passion. Plus the DB stuff is nice. It's much
better able to return results as a formula array vs the native MS facilities
which populate ranges with static data (and potentially clobber existing
content). Plus it also means your data is refreshed with a simple Shift-F9.

I actually wrote a wrapper for the ADO/Jet DB engine in VBA which does exactly
this [1]. However, doing it all in python would be a heck of a lot easier.

1) {=DB_QUERY("/path/to/.csv|.xls|.db","SELECT * FROM....")}

~~~
bornhuetter
Out of curiosity, what do you hate so much about VBA?

------
digz
I have had the unfortunate pleasure of having to integrate SQL and Excel for
my start up. A provider's tool that we need only works in Excel, so I have to
bring data from SQL into Excel, process it and then put it back into the DB.
Then all of our real data processing is built with python. So far I have just
used VBA/ODBC to handle everything, but for obvious reasons, I hate it.

I'll definitely give this plugin a try.

------
rebelde
"DataNitro is embedded in Excel." So users need to install something? Yuck.
That doesn't work for us.

We use handsontable (like Google Docs spreadsheet) with a MySQL back end.
Users just need a web browser to edit data. I wouldn't call it a replacement
for Excel, just a way for non-technical people to edit info in a database.

~~~
n00b101
Just curious, why does installing software on end-user devices not work for
you?

------
yahelc
Oof, DataNitro being Windows-only is a major bummer. Something like this could
be hugely useful for slightly less technical analysts on my team.

Anyone know of a Mac OS X friendly alternative?

~~~
quiesbill
It's not an exact alternative but I'm using Google Spreadsheets with Python
and SQLite. It's webapp, it's in the cloud and always synced among the team.
There are some API performance issues since you're working with a remote
server but it's OK for me.

Some API wrappers I've used: a slightly outdated official wrapper gdata-
python-client [1] and a bit more convenient gspread [2].

[1]: <https://code.google.com/p/gdata-python-client/>

[2]: <https://github.com/burnash/gspread/>

~~~
yahelc
That's really interesting. Is any of your GDocs/Python/SQLite setup publicly
available?

------
magicg
Which database is best for stuff like this?

And works with Python.

~~~
IanChiles
Python has bindings for pretty much everything, so I'd imagine that the
database you're most familiar with, or that there is the most information on
(most likely MySQL, Postgres, or SQLite)

~~~
kevinthew
I'd suggest SQLite unless you had a ton of users interacting all at once (if
it's something involving editing excel spreadsheets, I doubt you'd have a
problem with this).

~~~
IanChiles
Slightly off topic, but has anyone built an Excel-esque program on top of SQL?
I feel like something of the sort could really be an improvement, as now it'd
be easy for end users (who normally use excel) and developers to collaborate
on the same data.

