Anyone who's interested in this might want to check out the OSRM project, which uses a much more complex routing algorithm to efficiently find paths through the entire OSM graph, instead of just a tiny subset: http://map.project-osrm.org/
To expand on this comment a bit - OSRM still uses Dijkstra, so if you understand that, you already basically understand what OSRM does.
What OSRM does in order to speed things up is optimize the graph structure - we still use Dijkstra, but the search completes in a handful of steps, rather than hundreds of thousands.
There are quite a few techniques like this. OSRM implements an approach called Contraction Hierarchies. We scan over the graph, inserting "shortcuts" that skip over nodes. As long as you follow a few basic rules, you can repeatedly insert shortcuts all over the graph. This gives you a routing graph that is equivalent to the original, but on which a Dijkstra search will typically complete in a handful of iterations.
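To make the "shortcut" idea concrete, here is a minimal sketch of a single contraction step in Python. It is not OSRM's code: the dict-of-dicts graph and the names `dijkstra` and `contract` are made up for illustration, and the graph is assumed to be undirected. A real Contraction Hierarchies implementation keeps the contracted nodes in a ranked hierarchy and uses a bidirectional query; this sketch simply removes the node and shows that distances between the remaining nodes are preserved.

```python
import heapq

def dijkstra(graph, source, skip=None):
    """Distances from source; graph is {node: {neighbor: weight}}.
    The optional `skip` node is treated as if it were not in the graph."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            if v == skip:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def contract(graph, v):
    """Remove v, adding a shortcut u-w only when the path u-v-w is strictly
    shorter than every path from u to w that avoids v."""
    neighbors = list(graph[v].items())
    shortcuts = []
    for i, (u, du) in enumerate(neighbors):
        witness = dijkstra(graph, u, skip=v)  # paths that avoid v
        for w, dw in neighbors[i + 1:]:
            if witness.get(w, float("inf")) > du + dw:
                shortcuts.append((u, w, du + dw))
    for u, _ in neighbors:
        del graph[u][v]
    del graph[v]
    for u, w, weight in shortcuts:
        graph[u][w] = weight
        graph[w][u] = weight
    return shortcuts

graph = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2},
    "C": {"A": 5, "B": 2, "D": 3},
    "D": {"C": 3},
}
print(dijkstra(graph, "A")["D"])  # 6, via A-B-C-D
print(contract(graph, "B"))       # [('A', 'C', 3)]
print(dijkstra(graph, "A")["D"])  # still 6, now via the shortcut A-C
```

The witness search is the "basic rule" doing the work here: a shortcut is only added when skipping the node would otherwise make some route longer.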
We hope one day to implement several other speedup techniques - each has advantages/disadvantages, depending on what you want to do. Contraction Hierarchies lead to very fast queries (~5ms for a cross-the-US route), but the pre-processing time is very long (~6hrs on a beefy machine for the OSM planet). Any updates to the graph (new/removed roads, adjusted road speeds, etc.) require complete re-processing. Other techniques trade some search performance for a bit more flexibility - faster update times, query customization (e.g. "avoid highways").
It's a really fascinating corner of CS theory to work in, and I really enjoy it :-)
"Route Planning in Transportation Networks" gives an excellent overview of current search speedup techniques. It's a bit hefty, but if you're interested in knowing what's the state of the art, this is a good place to start.
Could you elaborate on why the preprocessing time is long? I didn't study contraction hierarchies, but to me it seems that computing shortcuts in a planar graph is the perfect fit for a divide-and-conquer approach.
Well, intuitively, I'd say you could divide the graph in two parts (along a cut). Then compute the CH-extended graph for both of the parts. And then combine those two graphs into the CH-extended graph for the whole graph. And you do this recursively, alternating the direction of the cut. This way, it is also easy to parallelize.
The difficulty is that the performance of queries on the final graph is dependent on its shape. As lorenzhs said, you want the shortcuts to be as long as possible.
The final shape of the graph is highly dependent on the order you contract the nodes in - small changes in contraction order have large effects on the final shape.
One of the very expensive parts of the pre-processing step is determining the best order to perform contraction. Sure, you could just iterate over all nodes, contracting as you go (and parallelize), but you'd end up with a contracted graph that's not a whole lot better for queries than the original. Order matters.
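A toy illustration of why order matters, again as a rough Python sketch rather than anything OSRM does: the helper names `shortest` and `count_shortcuts` are invented here, and the graph is assumed to be undirected. On a star-shaped graph, contracting the hub first forces a shortcut between every pair of leaves, while contracting the leaves first needs no shortcuts at all.

```python
import heapq
from itertools import combinations

def shortest(graph, src, dst, skip):
    """Dijkstra distance from src to dst that avoids the node `skip`."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if v != skip and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return float("inf")

def count_shortcuts(edges, order):
    """Contract nodes in the given order; return how many shortcuts were added."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    total = 0
    for node in order:
        nbrs = graph[node]
        new_edges = []
        for (u, du), (w, dw) in combinations(nbrs.items(), 2):
            # A shortcut is needed only if no path avoiding `node` is as short.
            if shortest(graph, u, w, skip=node) > du + dw:
                new_edges.append((u, w, du + dw))
        for u in nbrs:
            del graph[u][node]
        del graph[node]
        for u, w, weight in new_edges:
            graph[u][w] = graph[w][u] = weight
        total += len(new_edges)
    return total

star = [("hub", leaf, 1) for leaf in "abcd"]
print(count_shortcuts(star, ["hub", "a", "b", "c", "d"]))  # 6
print(count_shortcuts(star, ["a", "b", "c", "d", "hub"]))  # 0
```

Real road networks are nothing like a star, of course, but the same effect shows up at scale, which is why so much of the pre-processing effort goes into choosing the order.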
There is a general group of approaches that do what you're describing - partition the graph recursively, and produce optimized overlays in various forms. This can be done in parallel, and recursively:
Query performance is generally not quite as fast as a well-optimized CH graph, but the overlays can be generated much faster and that work can be highly parallelized. We hope one day to get a chance to implement this approach in OSRM.
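For a feel of what a partition-based overlay looks like, here is a minimal single-level sketch in Python. It is only an illustration under stated assumptions: the names `build_overlay` and `cell_of` are hypothetical, the graph is undirected, the partition is given by hand, and real implementations use multiple levels and far better partitioning. The overlay keeps only the boundary nodes (those with an edge into another cell), connects boundary nodes of the same cell with their within-cell shortest distances, and keeps the edges that cross between cells. Each cell is processed independently, which is where the parallelism comes from.

```python
import heapq

def dijkstra(graph, source, allowed=None):
    """Distances from source; if `allowed` is given, never leave that node set."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if allowed is not None and v not in allowed:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def build_overlay(graph, cell_of):
    """Build an overlay graph containing only the boundary nodes."""
    cells = {}
    for node, cell in cell_of.items():
        cells.setdefault(cell, set()).add(node)
    boundary = {u for u in graph for v in graph[u] if cell_of[u] != cell_of[v]}
    overlay = {b: {} for b in boundary}
    for b in boundary:
        # Keep edges that cross from this cell into another one.
        for v, w in graph[b].items():
            if cell_of[v] != cell_of[b]:
                overlay[b][v] = w
        # Connect b to the other boundary nodes of its own cell.
        inside = dijkstra(graph, b, allowed=cells[cell_of[b]])
        for other in boundary:
            if other != b and cell_of[other] == cell_of[b] and other in inside:
                overlay[b][other] = inside[other]
    return overlay

# A ring of six nodes split into two cells of three.
graph = {
    "a": {"b": 1, "f": 10}, "b": {"a": 1, "c": 2}, "c": {"b": 2, "d": 1},
    "d": {"c": 1, "e": 2}, "e": {"d": 2, "f": 1}, "f": {"e": 1, "a": 10},
}
cell_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
overlay = build_overlay(graph, cell_of)
print(sorted(overlay))                # ['a', 'c', 'd', 'f']
print(dijkstra(graph, "a")["f"])      # 7
print(dijkstra(overlay, "a")["f"])    # 7
```

A full query would also need to search inside the source and target cells to reach the overlay; the sketch only covers boundary-to-boundary distances.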
The difficult problem with the second approach is partitioning the graph well :-)
Are there any instructions? It took me a long while to figure out how to drop pins (the placeholder text says you have to press Enter, but you actually need to use the mouse), and now I have no idea how to display the route.
(Also, it's open-source.)