

Show HN: A midterm test on basic SQL for my journalism class - danso
http://www.padjo.org/2014-10-23/

======
Alupis
This is fantastic!

We're in a technological world now, and not knowing how to use that technology
puts one at a severe disadvantage.

I'm very glad to see other fields start using technology and actually
_teaching_ it in the Universities.

I have a friend who is in the Political Science program at his University, and
they are teaching everyone how to use the R programming language. He couldn't
stop complaining until I just laid out everything he could do with it.

~~~
capkutay
I think teaching a declarative language like SQL along with some basic, common
queries is a lot easier than teaching R to someone with no technical
background. Even as a CS student, it took me a couple weeks to get the hang of
R.

In general, I see a trend towards teaching data science and analytics to
people who aren't software engineers. I think SQL is an excellent way to bring
analytics skills to the masses (over Python, R, Java).

~~~
Alupis
R is taught specifically because it is a statistical analysis language, which
is what a Poli Sci student can expect to be doing a _lot_ of after school.

R isn't really "for CS students", it's for statisticians and analytics
professionals.

Just like chemists often write their own chemical modelling programs, it takes
a lot of domain-specific knowledge to craft the correct algorithms for things
like this.

~~~
ams6110
That's good to know. As someone with a VERY solid grasp of SQL, I found R to
be baffling (though I've only tinkered with it a few times).

~~~
grayclhn
Trying to use R to do statistics -- graphs, linear regression, t-tests, etc.
-- is the best way to get your feet wet. It's ironically not great for data
management and relations. (Although there are very good packages for that sort
of thing: data.table, dplyr, and sqldf all come to mind.)

------
pjungwir
Cool! It'd be great for more journalists to learn SQL. It was originally
designed to let managers run reports without asking the programmers, so
perhaps there is hope.

I see you have discussed GROUP BY. In that case, you should also talk about
HAVING. It's just like WHERE, but it runs after grouping instead of before. It
can be really useful, and a lot of people don't even know it's there.

------
nkozyra
This is also required reading:

[http://blog.codinghorror.com/a-visual-explanation-of-sql-joins/](http://blog.codinghorror.com/a-visual-explanation-of-sql-joins/)

------
wcummings
Did you learn regular expressions as well? I bet they'd be very useful for
journalists / journalism majors.

~~~
danso
Yes...teaching regexes is one of my goals...since the beginning I've forced
them to jump through the hoops of setting up GitHub and submitting Markdown
files, to get used to the idea of dealing with plain text (which of course, is
key for understanding CSVs)...so regexes are a natural thing to learn for
finding patterns and, on a day to day basis, cleaning data...I think in terms
of technical skills, it's probably the most useful thing I could teach given
how much time is ultimately spent on data munging.
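As a rough illustration of regexes for day-to-day data cleaning, here is a small Python sketch that normalizes phone numbers. The sample values and the `clean_phone` helper are hypothetical, invented for this example rather than taken from the course:

```python
import re

# Hypothetical messy values, invented for illustration.
raw_phones = ["(650) 555-0100", "650.555.0101", "650 555 0102", "n/a"]


def clean_phone(value):
    """Return a 10-digit phone as 650-555-0100, or None if it doesn't fit."""
    digits = re.sub(r"\D", "", value)  # drop everything that is not a digit
    match = re.fullmatch(r"(\d{3})(\d{3})(\d{4})", digits)
    if match is None:
        return None
    return "-".join(match.groups())


cleaned = [clean_phone(p) for p in raw_phones]
print(cleaned)
```

Values that cannot be read as exactly ten digits come back as `None`, so they can be reviewed by hand instead of being silently "fixed".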

~~~
maxerickson
Have you seen [http://software-carpentry.org/](http://software-carpentry.org/)
?

I haven't looked closely enough at what you are doing or what they are doing to
pretend to make a comparison, but I guess it isn't too risky to say that they
are at least thinking about some of the same things you are.

~~~
danso
Yes I have...they're hitting a lot of what I want to do in a class for next
quarter, which will be heavily command-line focused. Besides using some of
their lessons, hopefully I'll come up with a few of my own to contribute to
the project. But I do agree that both scientists and journalists could benefit
a lot from learning how to work with their computers at a lower level.

------
krick
I think this is just excellent. I'm more a tech person and not a journalist
myself, so I don't expect to find something new for me there in the sense of using
technology, yet I surely will go through it, as I find the approach itself
interesting enough to do it.

By the way, maybe there are other courses/books/sets of exercises/whatever I
could use to learn more about that non-technological, "liberal" side of
journalism? I was always wondering what makes a person a journalist, but
unfortunately never achieved it myself.

------
declan
This is excellent! I've taught a graduate journalism class in the past and
<danso>'s approach here is inspiring -- I hope other instructors pay
attention.

Though note the class appears to be "COMM273D/Public Affairs Data Journalism,"
so the students enrolling presumably are willing to put in the effort required
to learn a bit about SQL (I don't know if this approach would be as successful
in a basic journalism course).

------
jefflinwood
Interesting! I'm teaching journalism students how to build mobile apps on iOS
with Objective-C at UT-Austin. For most of them, this is their first
experience programming.

------
bdcravens
With the heavy reliance on ORM magic in modern frameworks, how long before
students in a class like this know SQL better than some professional
developers?

------
jedanbik
It's so great to see such accessible teaching materials for the next
generation of data journalists. Almost makes me wish I studied journalism in
college.

------
Animats
You can use SELECT inside of SELECT now, and MySQL is usually smart enough to
plan and optimize the join for you.
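A small sketch of what "SELECT inside of SELECT" looks like next to the equivalent join. It runs on SQLite through Python's built-in sqlite3 module rather than MySQL, so it only shows that the two forms return the same rows and says nothing about how MySQL plans them. The `reporters` and `stories` tables are hypothetical:

```python
import sqlite3

# Hypothetical tables, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reporters (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stories (id INTEGER PRIMARY KEY, reporter_id INTEGER, title TEXT);
    INSERT INTO reporters VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cal');
    INSERT INTO stories VALUES (1, 1, 'Budget'), (2, 1, 'Schools'), (3, 3, 'Transit');
""")

# SELECT inside of SELECT: reporters who have at least one story.
nested = conn.execute(
    "SELECT name FROM reporters "
    "WHERE id IN (SELECT reporter_id FROM stories) ORDER BY name"
).fetchall()

# The same question written as an explicit join.
joined = conn.execute(
    "SELECT DISTINCT r.name FROM reporters r "
    "JOIN stories s ON s.reporter_id = r.id ORDER BY r.name"
).fetchall()

print(nested)
print(joined)
```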

------
illicium
Why MySQL and SQLite, but not PostgreSQL?

~~~
danso
Besides being the two variants I'm most familiar with, MySQL and SQLite have
the most variety of GUIs and, ostensibly, the most help docs...but the point
of the SQL isn't for them to learn even the most basic things about database
admin (I decided to skip over indexing and I just provide the SQL dumps for
imports)...it's: 1) be able to handle datasets of more than a million rows
(Excel and Spreadsheets top out at around 1M), 2) be able to join datasets on
foreign keys (in the past, I've tried Fusion Tables, but the merging function
is a bit wonky, and is pretty inflexible)...and 3) ...something that I've now
since realized, it's a great way to show the difference between learning how
to use software (i.e. Excel, Google Spreadsheets), and learning how to tell
the computer explicitly what you are trying to find.

The most practical difference for me is that it's _much_ easier for me to
describe a data querying process through pseudo-SQL than it is to describe all
the submenus and clicking and highlighting you have to do in a
spreadsheet....the tradeoff being, you have to get past the wall of learning
SQL syntax. But I was very surprised...the students picked it up very quickly,
even a student-athlete who hasn't been able to come to a single class...(and
no, I don't think someone is doing his homework for him, since I meet with him
weekly and he's since been able to work on SQL and datasets on top of what
I've assigned him)...

They're still a long ways from understanding SQL to the point of being able to
administer a DB, but that's OK, the class is about querying and investigating
data...and using SQL to the point where they can wrangle the data to
analyze/visualize it in other software (such as a spreadsheet)
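To make point 2 above (joining datasets on foreign keys) concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `schools` and `scores` tables and all of their values are invented for illustration and are not one of the class datasets:

```python
import sqlite3

# Hypothetical datasets: one row per school, many score rows per school.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schools (school_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scores (school_id INTEGER, year INTEGER, avg_score REAL);
    INSERT INTO schools VALUES (1, 'Lincoln High'), (2, 'Roosevelt High');
    INSERT INTO scores VALUES (1, 2013, 71.5), (1, 2014, 74.0), (2, 2014, 68.0);
""")

# scores.school_id is the foreign key that points back at schools.school_id.
rows = conn.execute(
    "SELECT schools.name, scores.year, scores.avg_score "
    "FROM scores JOIN schools ON scores.school_id = schools.school_id "
    "ORDER BY schools.name, scores.year"
).fetchall()

for row in rows:
    print(row)
```

The joined rows can then be exported and analyzed or charted in a spreadsheet, which matches the workflow described above.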

~~~
ChrisAntaki
You're doing great work! The students that graduate with this SQL knowledge
will be empowered to find needles in haystacks.

------
mceoin
This is pretty cool - I had no idea journalism students had to know SQL.
Thanks for sharing!

~~~
tomelders
Having worked with journalists, I'm still shocked that they don't know SQL. It
seems like a perfect fit.

