

Show HN: SQL Teaching – Codecademy for SQL - rhc2104
http://www.sqlteaching.com/

======
krat0sprakhar
As a backend developer, I've been increasingly trying to reduce my reliance on
ORMs and instead focus on writing SQL. Owing to my poor SQL chops, I spent a
couple of weeks working through @danso's data journalism course[0]. It is a
fantastic introduction to SQL, even for all non-technical people (forwarded to
my business analyst friends as well).

Amongst other topics, it covers aggregations, joins, grouping and gives a good
well-rounded introduction to performance characteristics and best practices.
For those wondering "where to go from here", don't look any further and give
the course a shot.

[0] - [http://www.padjo.org/](http://www.padjo.org/)

~~~
PJDK
Why the move away from ORMs? I've certainly banged my head against the wall
with them any number of times, but I wouldn't want to give up the compile time
checking of queries when I make a db change.

~~~
collyw
Maybe it depends on your ORM.

I use the Django ORM most of the time, and it is great for quick stuff.

Say I have to do a bulk update, with a few conditions thrown in. MySQL becomes
too much hassle trying to parse the first digit out of a string ,and updating
another field with that. A simple script using the Django ORM won't take long
to do that and will be far easier than trying to work around MySQL's limited
string functions.

On the other hand Django's ORM doesn't handle some pretty common cases
(conditional aggregates are the first thing that comes to mind). Or complex
joins on more than one field. I looks around, and there are ways of hacking
these things into the ORM, but they really don't seem worth the bother, when
you can use SQL for the query.

~~~
jeffasinger
This is why I'm so excited for django 1.8, it introduces the ability to use
pretty much any SQL function in aggregates, annotations and order bys,
including CASE.

~~~
collyw
Where can I read about this?

------
veb
I believe that visualization plays a huge part in learning. In reality, you'll
never see "current tables" looking like that. I checked Google Images ("sqlite
result") and it seems that you could benefit heaps by designing your tables a
bit better so they look like
[http://i.stack.imgur.com/9CXVO.png](http://i.stack.imgur.com/9CXVO.png)
perhaps.

I would also pay a little more attention to explaining what something is, and
why you do it like that. Instead to me it feels more like "copy and paste this
command, click run" and so you're not really learning at all. Why SELECT? Why
am I using *?

I realise these explanations may come in the later lessons, but from the very
beginning I should know what these are (syntax-wise). Then, as things become
progressively different I can understand and follow.

I love sites like this, I think they're so beneficial but my main gripes are
usually just that: the results, they really need to feel more real-world, and
practical-like. Then, I want to know the why/what/how about everything in what
I'm doing.

That's just me though. I may be good at SQL, but I'm damn sure I've probably
missed a lot of simple things!

Is there a way to sign my email up for updates? I'd like to follow the
progress!

EDIT: Ah there is:
[https://github.com/rhc2104/sqlteaching](https://github.com/rhc2104/sqlteaching)

~~~
rhc2104
Thanks for the feedback!

SQL Teaching is designed for non-technical people that want to learn the
basics of data analysis. Just learning SELECT with WHERE clauses and basic
joins get them pretty far.

The Github repo is:
[https://github.com/rhc2104/sqlteaching](https://github.com/rhc2104/sqlteaching)
, so I guess following that repo is a way of following progress.

~~~
veb
Heh, found it just before your comment. :)

Keep up the great work.

------
chernevik
IMHO, I think the material here is too brief, and insufficiently progressive,
to help a beginner move to confidence. To master concepts, people need
repetition, and application of concepts to problems of sufficient difference
to see the fundamentals of the concept apart from any particular. Giving
people a dozen tasks, each introducing some different concept, won't leave
them with a basis for a confident understanding.

I also think beginners need:

\- A strong grasp of DISTINCT. This prepares them for the division of tables
into subsets by GROUP BY, which is vital to an understanding of how
aggregation and window functions operate. ORDER BY can used to show beginners
how they can first organize the data, and see how different values in
different fields can be used to segregate records into groups.

\- WHERE is important but actually not that hard to communicate. The key
things about WHERE are one, you can specify true / false expressions whose
evaluation dictates inclusion or exclusion of a record (this is crucial to
understanding how a JOIN works), and two, you can use IN to compare a record's
value to a _list_ of values. Once someone grasps this, it becomes very easy
for them to see how a subquery works.

\- A strong grasp of how subqueries work. One of the great strengths of SQL is
the ability to encapsulate complexity into a subquery and pass that result on
to another query for further use.

\- A thorough walk-through on "how" JOINs work. What does it mean to JOIN two
records? To JOIN two tables? How do we "filter" that join of two tables? (All
of this at a _conceptual_ level, taking care to note that how we might think
through a complicated join is _not_ how the operation is implemented by the
software.) What is the difference between an INNER JOIN and a LEFT JOIN, and
how are these differences useful?

\- Some discussion of just what exactly SQL is, what it is used for, and what
it isn't used for. The tool has to be placed into some context so that
beginners are oriented to where they will encounter it and what they can use
it for. In particular they need to understand its interface, and the textual /
non-visual nature of that interface. Most people are accustomed to working
with computers through GUIs. The textual nature of SQL makes it very
different, in profound ways, from the vast majority of tools people use on
computers.

~~~
buckbova
If your code is littered with DISTINCT clauses either you are doing something
wrong or the data design is poor.

DISTINCT leads to expensive SORT operations and effectively poor performance.

~~~
swyman
I'm genuinely curious. Would you mind elaborating or pointing me to another
resource that goes into detail on why this is the case?

~~~
buckbova
The poor design part or the performance part?

The use of DISTINCT everywhere smells of denormalized tables. Now if you are
in data warehouse or reporting position, then this likely makes sense. I tend
to work in transactional applications and keep the redundancy down to an
absolute minimum. I abstract away some of the complicated queries with views,
procs, and functions where allowable.

As for performance, different rdbms implement this differently, but the
general query plan category for GROUP BY and DISTINCT is SORT. If you are
querying something where you have multiple subqueries with DISTINCT and you
and ordering the final results, you are adding extra sort operations to the
query plan, hence hurting performance.

I tend to design everything to optimize read operations because there tends to
be much more read than write in systems I work in. For me this means
denormalized and heavily indexed where table/index scans are extremely rare.
On smaller data sets, some rdbms always scan tables because it saves
operations based on table statistics.

I hope this helps.

------
xtrumanx
I guess this is the client-side alternative to SQLZOO[0] which is a fantastic
resource I used to use back in the day to sharpen my SQL skills.

Been doing SQL for so long now but only recently have I been exposed to how
powerful indices and learning how to read the execution plan is.

I realize this is beginner stuff based on a local browser-based SQLite but I
wish more learning resources like this and SQLZOO had an advanced section to
discuss performance issues you may encounter once you have a large enough
dataset. For years I just assumed we just had so much data that having to wait
for the query to complete was natural until I tried figuring out how to
resolve a deadlock issue we've been having an accidentally figured out how to
tune queries to run instantaneously.

There are probably many people out like my former self who know the basics and
also have to maintain huge databases.

By the way, if you're one of them, "Use the Index, Luke"[1] is where I learned
everything I needed to know about creating indices (though I don't think it
has much about reading execution plans).

[0] [http://sqlzoo.net](http://sqlzoo.net) [1] use-the-index-luke.com

~~~
hobs
It depends on what engine you are using once you are talking about xplans, if
you are into SQL Server and you are looking for a great book to reference
thats almost all practical "this is how stuff is going to be interpreted",
check out [http://www.amazon.com/SQL-Server-Query-Performance-
Tuning/dp...](http://www.amazon.com/SQL-Server-Query-Performance-
Tuning/dp/1430267437/) by Grant Fritchey (the previous book had Sajal Dam)

Additionally, SQLPass puts out a great DVD every year and in the past year or
two they had some great talks about how to read xplans, why they are actually
lies, and going into things like statistics io, measuring recompile/cpu time,
determining if implicit conversions are causing issues, etc etc etc

~~~
xtrumanx
It seems that the SQLPass requires registration to enter the "Session
Recordings" page so I'll have to look into it later.

I googled "execution plan lie" and the only stuff that comes up is regarding
how Oracle's Explain Plan sometimes provided inaccurate information. Is that
what you were referring to?

I guess explains why Microsoft labelled their version of Explain Plan "
_Estimated_ Execution Plan". I mostly just include the actual execution plan
when running the query so I can get the real execution plan and the io stats
all in one go.

~~~
hobs
Actual execution plan's costs in SQL Server still contain a lot of estimates,
I have seen SQL Server guide you in the wrong direction if you are looking at
the thing "costing" the most, when in fact I find when I am doing perf tuning
my biggest first steps are: 1\. Is the code pants on head stupid and not set
based or not sargable 2\. Do the query plans clearly misinterpret the data and
why that is

My current process is: 1\. set statistics io on; set nocount on; 2\. Grab
actual execution plan, throwing it in sql sentry plan explorer (free edition)
(holy crap good) 3\. Finding which estimates are off in the breakdown in 2 4\.
Investigating specific code areas where estimates are vastly different than
actual, fixing whatever issue (implicit conversions, udfs, old stats, etc)
arises 5\. Consider adding indexes if needed 6\. Paste outputs of both stats
into [http://www.statisticsparser.com/](http://www.statisticsparser.com/) (you
can have it print things as headers, so I like print 'test a' go exec sp_proc
@params go print 'test b' go exec sp_proc2 @params ) 7\. Compare if I did it
right or not by checking overall reads, cpu time, etc

Update: I have the videos in question on my dropbox, I dont know if they would
like a public link, so if you dont want to sign up let me know via my email in
my profile and I will send them to you.

------
brudgers
One thing that makes CodeAcademy successful is an attention to gamification. I
think sketching some of that out is a priority for the next iteration. Clear
game mechanics is what keeps people coming back to a platform like this.

------
andrewstuart2
I would most definitely love to see some table design thrown in there as well,
with ER diagrams, cardinality, and normalization (bonus points for really
explaining _why_ well). I just noticed this is on GitHub, so if I get a
chance, I'd certainly like to contribute :-)

From my talks with a few of my great programmer friends, it seems this is an
area that more people could use some bolstering.

------
clay_to_n
In the 'Select specific columns' lesson, FROM isn't capitalized. Not an error
but could cause confusion. Same with AND in its lesson. It might also be nice
in the first lesson to explain that the caps aren't necessary, but make it
more understandable.

In the first lesson, after the paragraph "Imagine we have a table...", it
might help to display an actual table there. I was a bit confused, reading
through the whole page and not visually seeing any table. Only after I started
writing the command did I realize I had a table underneath the input box.

Overall, looks really good! Not a bad place to start for developers who
haven't used SQL before but want to get a simple working knowledge of it.

~~~
teddyuk
<personalOpinionPleaseDontCry>Capitalization doesn't make it more readable -
just use a ide (or whatever) with syntax colouring and your sorted.
</personalOpinionPleaseDontCry>

~~~
vertex-four
It is a convention, much the same as variable/function/class naming
conventions and whitespace conventions, that makes it easier to read an
arbitrary piece of code when you come up to it for the first time.

------
contradictioned
Also: "SQL Island", where you text-adventure-like play a game through SQL
[http://wwwlgis.informatik.uni-
kl.de/extra/game/](http://wwwlgis.informatik.uni-kl.de/extra/game/)

~~~
Orangeair
Is there by any chance an English version available?

~~~
contradictioned
Mh on my settings it was english by default. With this parameter it should
work: [http://wwwlgis.informatik.uni-
kl.de/extra/game/?lang=en](http://wwwlgis.informatik.uni-
kl.de/extra/game/?lang=en)

~~~
Orangeair
Weird. It did not want to be in English for me on Chrome or IE. Thanks for the
link.

------
nissimk
Looks good for the basics, but as some other folks mentioned here, it would
benefit from some articles about design, and even more importantly, rationale.
I hear from so many programmers "why?" They don't understand the benefits of
relational. This perspective ranges from old school programmers that just want
to store data in the file system, to newcomers that like mongoDB or other
object stores.

In my point of view, the main answer to "why relational" is that with an rdbms
you can answer ad-hoc questions by writing a query rather than writing a
program. Even as a programmer, having the power of SQL on the server is really
great for analytics.

Furthermore, as an ORM hater, I would say that query results should be
processed in your programming language as a table structure rather than
converting to specific objects per table. If you're building a CRUD system and
the screens update one or two objects/tables at a time, then an ORM is useful,
but if you're doing any kind of analytics, any object structure other than a
table or list of dictionaries is cumbersome and encourages code over sql which
is a bad thing.

~~~
thanksgiving
> encourages code over sql which is a bad thing

Not saying you are incorrect but you made an assertion here. Can you please
elaborate?

~~~
astine
Not certain about the OP's reasoning, but doing aggregate functions on the
database server is often more performant than doing the same in your
application. This is, in part, because of the smaller result set sizes.

Also, it's useful having logic baked into the database rather than your
application in the case that you need a different application to connect to
your database. It's a problem, for example, if you're maintaining your foreign
key relationships through ActiveRecords's relations and ignoring it on the
server. It makes it much easier for junk data to get in the database.

------
AlisdairO
Good stuff!

You might want to be a little careful with ensuring an aggressive query
timeout. I ran the following:

select * from family_members f1, family_members f2, family_members f3,
family_members f4, family_members f5, family_members f6, family_members f7,
family_members f8

This took a while to run - I didn't go any further as I didn't want to damage
the site during your launch, but add a few more cartesian products in there
and it could hurt.

~~~
rhc2104
Thanks for the heads up. I actually use Sql.js (SQLite in JavaScript), so
there is currently no server-side component of the website.

~~~
AlisdairO
Ah, fantastic! I assumed based on the mention of SQLite on the front page.

~~~
mdellabitta
This is pretty much the DOS version of "why are you hitting yourself"

~~~
AlisdairO
Indeed :-)

I actually had no idea sql.js existed up until now. Seems like it could be
pretty useful.

------
asafira
I was looking for a way to sharpen my SQL skills a few months ago, and after
being annoyed with some of the very, very beginner stuff out there I found
this:

[http://www.sql-ex.ru/](http://www.sql-ex.ru/)

The site can be slow sometimes, but I have found queries/puzzles that (a) have
shown me use-cases for functions and capabilities I have never used and (b)
can be challenging! Later they apparently focus more on optimization, but I
haven't gotten there. (Especially since they have hundreds of these more
difficult queries). I've done about 70ish so far.

What do you guys think about it? Overall, never going to be useful, or a good
resource to get better with SQL?

------
villek
Looks like a great start!

One thing that would be essential is better feedback on errors. Currently, if
I enter an invalid SQL query I’m only told that it was incorrect. For
beginners, this can be frustrating.

------
avinassh
I have learnt SQL from [https://sqlzoo.net](https://sqlzoo.net) It's a really
good site with options of other database system syntax also.

~~~
AlisdairO
This is a self-plug, but you might find
[http://pgexercises.com](http://pgexercises.com) useful as well.

~~~
avinassh
Hey your exercises are really good. I am half way through it and so far I am
enjoying it a lot.

~~~
AlisdairO
fab - I'm really glad you like it!

------
elyase
I also like [http://datamonkey.pro](http://datamonkey.pro).

------
uberneo
looks good and might be helpful for somebody starting with SQL . You can add
some advanced version as well with some more complex examples and use cases
with inner queries , UPSERTS n all

------
domoarevil
Cool, will check, thanks.

Despite the clunky UI, I've found that www.sql-ex.com provides the best medium
to tricky problems for those wanting a refresher. (MS T-SQL centric.)

------
zer00
This is awesome!

One thing though, any reason you have Mary making 10% less than Dave in your
first data set? Seems like kind of a weird thing to include.

------
warkid
Cool! One should be able to run his code by pressing something like Ctrl/Alt-
Enter, instead of reaching for mouse all the time.

------
nodesocket
Very nice. Love to see a few more intermediate examples using LEFT and RIGHT
OUTER and INNER JOIN, HAVING, UNION, and sub-queries.

------
gauravgupta
Looks useful. You should submit this on [http://hackr.io](http://hackr.io) as
well.

------
allworknoplay
Good start. Many others have suggested other query types. I would also suggest
that you cover inserts as well.

------
mohdmaqboolalam
completed the course thanks for the tutorials.

