Hacker News new | past | comments | ask | show | jobs | submit login
Learn SQL Interactively on Khan Academy (khanacademy.org)
439 points by pamelafox on May 11, 2015 | hide | past | web | favorite | 86 comments

This is great!

I created http://sqlteaching.com , so it's wonderful to see Khan Academy's take. If anyone from KA would like to chat with me about SQL tutorials or anything else in general, feel free to reach out- rhc2104@columbia.edu

I love your site and plan to use it as a teaching tool for my Wed Development course I'm teaching.

As a request, are you working on a user system to record completed lessons? As an extension, are you looking into making a classroom management system akin to Codecademy? I'd love to create a classroom, have students log in, complete the necessary lessons and have the means to know where they are in the overall plan.

If you'd like some help getting that going, let me know! I'm in the process of developing typing exercises for my courses to give students extra practice with the basics.

Pamela from KA here. We're using the same client-side SQL solution as sqlteaching.com (which is great!). Our editing environment is here and open source: https://github.com/Khan/live-editor

KA does have classroom tools and progress reports, but we don't have the ability for teachers to create their own lessons (if that's what you're looking for). More details on that here: https://www.khanacademy.org/coach-res/reference-for-coaches/...

It's a lot of work to create the classroom management itself (that's done by a whole separate team at KA) so it's not something most folks tack on to teaching sites.

Hi tsumnia,

Thanks for the feedback! One nice thing about SQL Teaching is that it is completely client-side. This is possible because somebody compiled SQLite to JavaScript: https://github.com/kripken/sql.js/

Thus, I would have a bias against adding a classroom management system which would require some kind of backend database that I would need to keep running (funny, I know).

However, SQL Teaching is open source https://github.com/rhc2104/sqlteaching , so I would update the site if you were to add the feature, or you could just take the JavaScript files and integrate it into your site!

Not funny at all! Its something that I am toying with on my end as well. I'll have to play with it this summer and let you know if I get anything rolling!

I've definitely recommended your service to my friends looking to learn SQL.

Also, I made my share of contribution to the SQL pedagogy by writing blog articles on "SQL for Excel Users":


It was nice to find your site in addition the the Khan Academy's SQL course. Maybe I will try to direct my kids to walk through this.

I am curious if you know a resource that would start approximately where your site/tool stops. I know there are books and reference manuals - but I would love to see the intermediate SQL version of your tool - including things like cursors, stored procedures and Common Table Expressions.

Or does SQL at the next level lose too much of its uniformity? And, you would have to make separate sites for PostgreSQL, MySQL, SQL Server etc.

Hi mswen,

Unfortunately, I don't know of interactive tutorials that cover the things you mentioned. The Khan Academy tutorial and SQL Bolt seem to have a few topics not covered by SQL Teaching, so that is a start! (Mainly, write queries)

SQL Teaching uses SQLite because that way I don't need to maintain a backend to the site.

But if there are more advanced topics that can be taught by SQLite, feel free to make a issue in the repo. Pull requests are also welcome!

I mainly learn now by looking up things when I have a need and then hoping to still remember it 3 months from now when I need it again. The material covered in your tutorial is kind of the daily bread and butter stuff that I use and am comfortable with, but I know just enough to understand that there is this whole other level to SQL querying - even without really dealing with Database Administration and index tuning.

I guess I will just keep learning as the need becomes apparent. Anyway thanks for making your resource available.

Hey. Thanks for this, it's absolutely fantastic.

Been using SQL for a while now but never got the hang of joins and could never find an easy guide for them. This was the simplest explanation I could find.

Some people find graphics like this helpful for conceptualizing them :


I used your site recently to help my GCSE students revise for their CS exam. It was perfect, so thanks for all your efforts in creating it.

Your going to get a load of spam putting you email address up here like that.

Thanks! Will check out both sites.

I'm doing the course, as it's something I haven't learned in depth yet.

They mentioned pronunciation : 'S-Q-L' vs 'sequel'. I've always pronounced it S-Q-L. The lecturer implies there's no right answer, that 'sequel' is historical but 'S-Q-L' is preferred for international use.

The annoying thing, though, is that in writing a choice has to be made. If someone pronounces it 'S-Q-L' then to read 'a SQL query' is weird -- it should say 'an SQL query'. And vice versa for someone who pronounces it 'sequel'. 'an sequel query' isn't right.

So, in writing, what is correct? 'An sql query' or 'a sql query'?

This is a question for English.StackExchange.



That said... As a native English speaker, I can infer your pronunciation from which of the options of "a" and "an" you decide to use.

FWIW, "an" is appropriate for any word starting with a vowel sound (whether or not it actually starts with a vowel) and inappropriate for any word that doesn't. "An historic" drives me nuts, but I'm one of those jerks. (My mother was an English teacher - I never stood an chance!)

On some level, I think you can still apply the "starts with a vowel" rule for SQL, since in this case "S" is really standing in for "ess" (assuming you're not saying "sequel", which I've never gotten into the habit of doing).

It's even simpler. The "starts with a vowel" rule is about pronunciation. The spelling is entirely orthogonal. If you pronounce "SQL" as "Ess-cue-ell", it's "an". If you pronounce it as "sequel", it's "a".

Aside: the reason "an" is also sometimes used for word starting with an "h" is that the "h" in those cases is silent and the pronounciation therefore starts with a vowel (following the silent "h"). There are some style guides that recommend using "an" for words where the "h" isn't silent (or "a" for words where it is) but they're just being intentionally obtrusive.

Yes - exactly my point. It is interesting to learn, as another commented somewhere in this thread, that my brain just invokes the appropriate "sequel" vs. S.Q.L. just from whether or not someone includes the ever-so-important "n."

It's not automatic for me, which is why I found it annoying enough to ask. If a writer writes 'an SQL query', that doesn't bother me at all, but if a writer writes 'a SQL query', it takes me a while before I start hearing it in my head as 'a sequel query' -- until then, it just sounds wrong.

Worse is that some writers seem to mix it up, use both periodically.

Generally in English, I think that when it's unclear whether the majority considers an abbreviation to be an acronym or a plain initialism, it's proper to treat the abbreviation as an initialism.

Importantly, people who have only ever read the abbreviation, and not heard it spoken aloud, will assume it's a plain initialism. It's better, I would think, to write something that reads properly to these people—something "more accessible."

The people who are aware of the acronym pronunciation, meanwhile, will be able to adapt their reading either way, because they're also aware that a plain-initialism reading exists, so you're not unduly burdening them in parsing the sentence.

Wow. I never realized it but the author's choice of a/an dictates how I read it in my head. Without even thinking about it, if I see "an SQL query" I read it as "an es-queue-ell query". If it's written as "a SQL query", then I read "a sequel query"

But that doesn't answer your question. I suppose it's the same as when saying it. Pick one and stick with it, whatever your choice.

I believe it's written "[Aa]n? (sql|SQL) query" but it depends on which standard library your native language is using.

A choice has to be made in speech, as well. And, as there is no one clear correct answer in speech, and the choice in writing is based on the choice in speech, there is no one clear correct answer in writing. Choose one, and use it consistently within a given work, and you should be fine.

Of course, if its a work on SQL, you can normally avoid "a(n) SQL query" and just say a "a query" (with SQL implicit from context) or, in the less-common situation where you need to distinguish it from a query in a different query language, "a query in SQL".

I always pronounced it 'S-Q-L' until I got a job where everyone says "sequel". Now I just say "sequel" to fit in, and because it's fewer syllables. Even if a text said "an SQL query", I would read it "a sequel query" without even thinking about it.

Some languages, like French, have a group which can dictate what is and isn't correct in the language (as it turns out, the vast majority of French speaking people don't give a rat's butt about what the Académie Française thinks and still do as they please, but that's another story).

In English, however, there is no such authority. Whether you are writing a SQL query or an SQL query is all up to you. Aren't you happy you don't have an old wrinkly white French dude to chastise you about which one is correct?

Speaking of the Académie Française—do published works in France get "French-picked" to conform to proper Académie grammar?

Furthermore, there's the QUEL language, and SQL is probably going off that, especially with the original name of SEQUEL. Here's an interesting post on it: http://patorjk.com/blog/2012/01/26/pronouncing-sql-s-q-l-or-...

FWIW, I use sequel as that's how I learned it. But I also dislike unnecessary syllables so that's my justification.

To make things more confusion there is a software product called SEQUEL

This was posted sometime ago and I really enjoyed : http://sqlbolt.com/?rp I will definitely check out Khan's academy course as well!

I like that SQLBolt starts with querying (Khan does not). Even though a logical proceeding would start with CREATE and INSERT, many/most people will probably never do anything more than SELECT.

I was going to post exactly this exact comment.

Wow, KA hits another one in the outfield!

The audio that highlights part of the code is awesome (much better than watching a passive video), and the fact that everything can be executed in real time is awesome. Good job!

Thanks! That environment is all open source, if you're ever looking for some OSS bugs to tackle :-)


Is the recording and playback of course content part open sourced somewhere?

The recording part is there as well, in particular https://github.com/Khan/live-editor/blob/master/js/ui/record... These slides give a little high level overview of that: https://speakerdeck.com/pamelafox/teaching-programming-onlin... (slide 26)

That's great, thanks for pointing in the right direction, and great job btw!

It's great to see KA getting involved in teaching SQL - I really think there's too little SQL literacy out there amongst devs in general.

I created http://pgexercises.com a while back, which goes up to some relatively tougher exercises. I'd be very happy to help out with KA's effort if it would be useful. Email is in my profile.

Oh, nice! I'll add that to our final article in the course, for folks that want to learn PostGres. I'm hoping this course is a starting off point for people, and they can get more advanced with non-KA resources.

That's very kind of you - thanks!

This is great... but I am torn because, as someone who has spent a lot of time with databases, I feel like there are many mistakes that can lead to confusion.

First, it's not teaching standard SQL, it appears to be teaching MySQL. That means any user who tries out any other system will immediately run into confusing error messages about the double quotes. Not ideal, and totally unnecessary -- sticking to standard SQL should work reasonably well in MySQL anyway (at the beginner level, at least).

Second, in the section "Creating Tables and Inserting Data", the teacher jumps into surrogate keys immediately. Again, I feel that is unnecessary and will just lead to confusion. The beauty of a relational database is that it connects with the real world; and having arbitrary numbers floating around destroys that beauty.

I really hate to criticize, because I really support the effort. Hopefully my feedback is useful.

Khan Academy is using SQLite (because there is a JavaScript version of it)

Can't they use the standard single quotes then?

These are all artifacts from the fact that the implementation is based on SQLite, which does things this way.

Imo every developer should be comfortable with sql, upto joins at least. Basic data modelling is also invaluable.

Serious question...what kind of developer can you be without knowing SQL?

One who does lots of low-level C/Assembly programming. I know several people who do lots of low-level OS work, and don't use SQL.

I was a developer for >3 years at my last job, and barely touched SQL. I did desktop software that mostly interacted with streaming data, state machines, etc, and mostly wrote limited data to flat files as text or XML where necessary. We just didn't do anything where database storage was appropriate. I worked with a few SELECT queries to get data from a legacy Access DB, but that's about it. I was probably barely at lesson 1 level on all of the linked lessons.

I have learned a lot of SQL at my current job, though. I can now admin multiple types of DB servers and write SQL up through window queries, CTEs, indexes, stored procedures, etc, so I've learned enough to be past the end of every SQL lesson I've ever seen. There's a huge amount of fascinating stuff to learn and a ton of cool stuff you can do.

Web developers don't have to know SQL if they are only doing front end work, since the routes they hit which may or may not use SQL are written by back end developers.

Actually I'm a back end developer and I don't use SQL at all. I learned it (a bit at least) on my own just out of interest and random personal projects, but I don't use SQL at work at all; we only use DynamoDB.

Game engines, graphics, animations, lot of front-end web stuff... The list goes on.

Not that I hadn't already done plenty of CRUD in SQL in jobs prior, but for 4 years I worked on streaming stock quote servers. All in all the system had 500+ servers deployed in a tiered setup. No data was persisted to disk (save for some config), and no SQL was ever involved. Ticker symbols (CUSIPs) did need to be looked up and those went through our network, but the actual lookups went out to a service, and results were cached.

That's just one example. I'm sure there are many.

I write mathematical optimization software. I haven't used SQL since my undergrad. I know that one of our teams though use SQLite to store benchmarking results or something.

I've been programming professionally for 6 years and only just started using SQL for the first time for a side project -- some web dev stuff.

I know Prolog pretty well so it's not like I was completely flabbergasted by it, I just had no need to use SQL before.

Big Data Developers can use Spark, Cascdading or Pig to do pretty advanced data warehouse, analytics, ETL tasks. Web app developers accessing databases can use an ORM e.g. Hibernate to abstract away the SQL layer or use JSON to query MongoDB, CouchDB etc.

You could write every application under the sun without knowing a single bit of SQL.

One can't seriously use an ORM without knowing SQL, because they all come with their own proprietary incomplete wrapper language (HQL, etc.). ORM simplicity breaks down the moment you push them even slightly hard.

Big data analytics is increasingly done with Hive's SQL dialect, or Impala, or Hawq. Beyond all of these, relational data modelling is an essential skill for anyone building server side applications if you have to deal with information management, whether you use SQL or not.

In today's world you could probably be powerful using the NoSQL databases. But, imho, you will still have to understand the basics of set theory. Unions, intersects, etc.

But I agree with the spirit of what you're saying. A good programmer will know SQL. It's definitely not going away anytime soon.

SQL has almost no practical overlap with set theory. Relational calculus, and relational algebra, sure, but SQL neither operates on nor produces sets.

SELECT column_1 FROM Strings_1 UNION SELECT column_1 FROM Strings_2

Is this set theory? A UNION of two sets? Oh I see your pedantic argument. They aren't sets because they are ordered, they are lists.

If I can't explain how UNIONS and INTERSECTS are relevant to SQL to you then I can't help you because you're simply being difficult. It has every practical application to SQL.

    > They aren't sets because they are ordered, they are
    > lists.
Half correct - the other reason the results aren't sets are because they are also able to return duplicates, as evidenced by UNION ALL and INTERSECT ALL.

Were I being pedantic, I would turn to the (horrifyingly inconsistently written) SQL1992 standard, which explicitly says that your columnar inputs are multisets (which is a posh way of saying "not a set").

Coming back to your original assertion:

    >> you will still have to understand the basics of set
    >> theory. Unions, intersects
In 15 years of software development, open-source and commercial, mostly with databases, I've only used UNION, INTERSECT, and EXCEPT in SQL when doing academic homework, translating from relational algebra and relational calculus. I note the world's most popular open-source database doesn't support INTERSECT.

But most people don't write SQL by hand any more, they use an ORM - most of which have scant support for UNION, INTERSECT, and EXCEPT, because they are obvious vestigial details left over from archaic database theory. I have resisted the temptation to become actually pedantic by discussing the SQL treatment of NULLs.

This constitutes "almost no practical overlap".

But look, here's the meat: Fundamentally, sets are distinguished from other list-like data types by having no ordering, and having no repetition. Practical SQL usage requires you to understand that you will have duplicates, that you may well need to aggregate those duplicates somehow, and that order matters.

That's a nicely crafted response. Allow me to retort: http://lmgtfy.com/?q=set+theory+and+sql

Huh, I guess others think like I do.. I guess we're all morons. Shrugs.

    > Huh, I guess others think like I do..
There is absolutely no safety in numbers:




    > I guess we're all morons
Maybe, but it's more likely you're just wrong.

I'm not exactly sure why you're so opposed to the idea, but okay. You've won. I will never ever say SQL relates to sets. Thanks!

I've used intersections and unions with sql. And also Cartesian products (a cross join). Don't those count?

No, because in both cases the inputs and outputs are lists, not sets.

Mobile developer who uses existing APIs ?

You could develop relational database applications using Dataphor's D4 language without knowing SQL, though I suspect most D4 users also know SQL.

Or you could develop software that doesn't use relational databases for persistence; there's still lots of that.

As a game developer, I have a bunch of very talented programmers in my office, and afaik I know the most SQL in the office, which isn't much.

Embedded? D3? COBOL?

Yup - very few of our graphics required any SQL


They exist, they really exist!

So far my experience is that getting a SQL query correct is fairly easy and usually fun but that the real difficulty, e.g., for SQL Server, is for the rest:

(1) Installation.

Grim. Barbed wire enema.

Wiped out my Windows boot partition requiring that I reinstall everything starting with the Windows DVD.

(2) Security.

The stuff with users, logins, etc. and nearly everything in SQL Server Management Studio was clear as mud. In the end:

(A) Assume that basically what is there is essentially old capabilities and access control lists, imagine how they might work, and make that a first guess. It helped.

(B) Use SQL Server Management Studio only for passively reading but never for actual management where change things.

(C) For each of the important steps in setting up security, do searches on the Internet, find standard text SQL statements, and try those. Save and profusely document the statements that do work.

(D) Do system management only with SQL statements in simple text files and executed with the SQL Server command line utility SQLCMD.EXE.

(3) Connection String.

The last time I had to get a SQL Server connection string to work took a solid week of throwing things against a wall to see if they would stick. The effort was an unanesthetized upper molar root canal procedure.

I haven't found that stuff overly difficult so far.

http://downloadsqlserverexpress.com/ is a nice source to install SQL Server and Management Studio from, which also has a link to a blog post kinda mocking how ridiculously hard it is to find the installer on the Microsoft website.

I also find it odd just how long it takes to install, versus Postgresql on Windows installing in like 5 minutes I think.

I'm not sure what you're trying to do with users and permissions, but I haven't had much trouble with that in Management Studio. Add a new login in the server Logins folder, set either a username/password or domain authentication, select databases and set permissions. Not that I've tried anything tricker like assigning multiple users to a role, or doing stuff with AD groups.

https://www.connectionstrings.com/ is a decent resource for connection strings too. It seems to be pretty easy, as long as you're connecting from other Microsoft tools, like Management Studio, sqlcmd, ADO.NET, etc. I'm pretty sure I once spent like a day or so trying to figure out how to get ActiveRecord to connect to SQL Server before I got it working.

That said, the usual lessons could stand to pay more attention to this kind of management stuff.

Many thanks. Your post is a keeper.

I've got SQL Server working now with notes on how I did it and with some simple text files with SQL commands to be run with the Microsoft SQLCMD.EXE for setting security. For the connection string, I found something that is working and, of course, have it documented, saved, etc. To define the tables and columns, I have that also as just simple text in a file run with SQLCMD.EXE.

To keep SQL Server installation problems from ruining my boot main partition, finally I am good at using NTBACKUP to get a clean backup and know how to restore it. So, before doing anything like a SQL Server install, I will use NTBACKUP to save the main boot partition. If SQL Server messes up, then, boom, I will just restore the boot partition and start over with the SQL Server install.

The worst time with SQL Server installs it wasn't clear what the heck to click on, ended up with two versions installed side by side, tried to uninstall one, and ended up with a sick SQL Server that wouldn't do even the simplest things.

Another time I wanted SQL Server to use a database from an earlier install, and trying to do this got SQL Server sick. Trying to uninstall and reinstall ruined my boot partition.

But, as soon as I can get the other work done, I will get the first server computer for my "server farm", install some recent versions of Windows Server, SQL Server, IIS, etc. and go live. Then, with Windows Server, etc., I will be back into system management again.

Today I am fixing a bug in my Web site: I have two Web pages where they use


to send a user from one to the other, and I have a fairly elaborate session state to be saved in one page and restored in the next one (instead of Redis, wrote my own session state store with just de/serialization from/to byte strings of instances of my session state class and two collection classes). But going between the two pages I must have coded when half asleep and didn't get quite right. So fixing that now. Also putting in lots of documentation to make it clear just what is going on.

My wisdom in building a Web site with ASP.NET: Start with the class for the session state you want for your site. That is, concentrate on this data, with essentially global scope. The first crucial work is just being really clear on just what the session state data is to be.

The second most important thing is to be clear on just how ViewState works. ViewState is useful and can save some coding.

Third, in positioning things on the Web page, maybe make good use of instances of the class Place Holder.

Fourth, have clearly in mind all the data that is available from a post back -- some of that data can be useful.

Fifth, make good use of the ability to write data to the log file.

Last, have clearly in mind when routines



the button click routines


etc. run.

After that, just keep it all simple and clear for the user.

If you want to get an understanding of basic SQL without all that overhead, you can try Microsoft Access.

Oh god, no more Access. Access has a weird semi-dialect of SQL where it's insanely picky about how you position parenthesis in multiple join statements. I found it impossible to get a query right without the graphical query generator, and that's after routinely writing 5-10 table joins in T-SQL. No thanks, I'd recommend Sqlite over it anyday for a lightweight database. Works great on command line or in any of dozens of helper tools on any platform, and it's Free.

I agree with the SQLlite recommendation.

> basic SQL

That was easy. Think of it as close to set theory!

I needed to get through all the overhead stuff since I'm using SQL Server in the server farm for my startup. So, I have code for Web pages with ADO.NET and also some stand alone programs that do SQL Server queries. Again, the only difficult stuff was the overhead stuff, and it would appear that the Khan course would not cover that.

Does anyone know of a good data modeling course like this? I understand SQL, but I think I could still improve at data modeling.

Try Jennifer Widom's online course on databases:


It reminds me of the interactive SQL tutorial GalaXQL.


We also have an interactive online SQL course at CodeHS (completely client side as well). You can check it out at https://codehs.com/sql or try out the first hour free at https://codehs.com/hoc

very nice. and if you're looking for a sql job, check this out, still in beta:


Could somebody confirm that SQLite is used to the course? By the way, this seems great!

Yep, confirmed. Here's a blog post by Brian who implemented the SQLite support in our editor: http://www.brianbondy.com/blog/id/168/sql-on-khan-academy-en...

Thanks pamelafox

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact