The GDS companies get this data from these sources and then in turn provide (crappy) APIs for customers to use to query it.
In general, if you're doing bookings you can get small amounts of the data (a query at a time) from a GDS. Otherwise you're looking at millions of dollars per year. And then you need to write code to parse it and price tickets using it (approximately 1M LoC if you're terse about it -- more like 30M if you're a GDS).
(I know all this because I co-founded ITA Software, whose software now powers Google Flights.)
Honest question (I swear I'm not trying to troll): what do you actually do with it that you can't do (or do as well as) with Google Flights or Kayak or some other site like that?
I ask because I've used ITA Matrix and never managed to find a cheaper flight than I can through "normal" means... but I rarely fly multi-destination (and basically never do more complicated stuff) so I'm not sure if the use case is beyond mine or if I just don't know how to utilize it in a useful manner.
- Multiple departure/destination airports (you can search SFO,SJC,OAK to LGB,LAX,SNA and back in a single search). To be fair, Google supports this on desktop too (but not on mobile).
- Better search for return flights with variable stay lengths (Google only does a 5x5 matrix, ITA Matrix does a full month).
However, perhaps the biggest downside is that you can't actually book through ITA Matrix. It only finds the fare; you have to find it elsewhere to actually buy it (although Hipmunk takes routing codes, which makes it easier).
E: 25% Mileage
L, U, T, X, V: 50% Mileage
H, Q, K: 75%
B, M, S: 100% Mileage
Y: 125% Mileage
If you search on Google Flights, these will all be called "Economy". If you search on most of the other OTAs, you can sometimes find the fare class during checkout or even as part of your search results, but you can't filter on it (Hipmunk is one that does support some of ITA's syntax for these filters, but not all). The buckets aren't always strictly more/less expensive, and they're usually not exposed very easily, if at all. So you're often left crawling from listing to listing, expanding each one to see if it's going to get you any miles. (I'll save the debate of whether miles are worth all the effort for another day.)
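To make the buckets concrete, here's a rough sketch of the lookup you end up doing in your head while crawling listings. The percentages are just the ones listed above; real earning charts differ by airline and change over time, and the function name `miles_earned` is made up for the example.

```python
# Illustrative only: earning rates copied from the list above.
# Real charts vary by airline and partner, so treat these as assumptions.
EARNING_RATE = {}
for classes, pct in [("E", 25), ("LUTXV", 50), ("HQK", 75), ("BMS", 100), ("Y", 125)]:
    for fare_class in classes:
        EARNING_RATE[fare_class] = pct


def miles_earned(fare_class, distance_miles):
    """Return the miles credited for a flight of distance_miles booked in fare_class."""
    pct = EARNING_RATE.get(fare_class.upper())
    if pct is None:
        raise KeyError(f"no earning rate known for fare class {fare_class!r}")
    # Integer arithmetic avoids float rounding surprises.
    return distance_miles * pct // 100


if __name__ == "__main__":
    for fc in ("E", "T", "K", "M", "Y"):
        print(fc, miles_earned(fc, 5000))
```

Same "Economy" seat, same 5,000-mile flight, and the credit ranges from a quarter of the distance to more than all of it, which is why being able to filter on the bucket matters.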
On ITA, it's not unreasonable to construct a query that says "During the month of November, show me round trips that are between 12 and 19 days that are going from Denver to either Narita or Haneda Airports, which will earn me more than 50% miles on either Delta or American or JAL, but also only ones that connect in Portland or Los Angeles, with no prop planes or overnight stops, and no <50 minute connections or 3+ hour connections". (I wouldn't actually specify all of these stipulations, but they're good for the example! :) )
(Delta, to its credit, does allow you to search by minimum fare class on its advanced search.)
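For anyone who thinks better in code, that example query is basically one big predicate over itineraries. The sketch below is NOT ITA's query syntax or data model; the `Itinerary` class, its fields, and `matches` are all hypothetical, just the constraints from the sentence above written out in Python.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Set


@dataclass
class Itinerary:
    # Hypothetical record shape, invented for this example.
    origin: str
    destination: str
    depart: date
    ret: date
    carriers: Set[str]              # IATA carrier codes used on the trip
    connection_airports: List[str]  # airports where you change planes
    connection_minutes: List[int]   # length of each connection
    earning_pct: int                # mileage earning rate of the fare class
    has_prop_plane: bool = False
    has_overnight_stop: bool = False


def matches(it: Itinerary) -> bool:
    """The example query: DEN to NRT/HND in November, 12-19 day stay, and so on."""
    stay_days = (it.ret - it.depart).days
    return (
        it.origin == "DEN"
        and it.destination in {"NRT", "HND"}
        and it.depart.month == 11
        and 12 <= stay_days <= 19
        and it.carriers <= {"DL", "AA", "JL"}   # Delta, American, or JAL only
        and it.earning_pct > 50
        and all(a in {"PDX", "LAX"} for a in it.connection_airports)
        and all(50 <= m < 180 for m in it.connection_minutes)
        and not it.has_prop_plane
        and not it.has_overnight_stop
    )
```

The filter itself is trivial. The hard part is generating and pricing the candidate itineraries to run it over, and that's what you need the fare data for.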
If you're only looking at flights out of a particular airport and with a specific (default) routing, you'll probably not find anything better.
If you know that certain airports are hubs, or that a particular carrier has a slightly longer route that takes you via a certain city, perhaps with a longer layover, then you might be able to find some really good deals.
Being in Australia, I really don't get to take advantage of these things at all.
2. As recently revealed by the company (http://www.flyertalk.com/forum/27265924-post483.html), you can avoid connections in certain countries, which can be beneficial for visa purposes or certain personal privacy requirements. (I know a German physics professor who refuses to go to the US because of the fingerprinting.)
How did you go from games to developing airline reservation and fare software?
As for the GDSes, their code is primarily TPF assembly. So now add the LoC blow-up of using a very low-level language.
There are like 25 people in the world who understand this stuff, and half of them likely work at ITA/Google.
As to the "categories", people at Sabre working with them used to say that in some cases you cannot even be sure the calculation (of a ticket price, the "fare") will end in finite time, so the "macro language" is apparently Turing-complete!
I've written elsewhere that I personally think the original Sabre system was/is one of the most impressive accomplishments in the history of computing. Using modern tools made things easier for us at ITA, though we compensated for that by trying to compute the entire (very large) solution space for every query, where prior systems used heuristics.