No, the core of a stock exchange resides entirely in RAM. Everything is reset once a day (overnight for stocks, immediately after regular hours for futures), and brokerages are responsible for resubmitting long-lived "good 'til cancelled" orders each day before the session begins.
Despite being RAM-bound, it's easy to parallelize because each stock is its own isolated exchange, so stocks can be distributed across hardware in proportion to each one's average volume. In other words, the state for any one stock is localized, but the workload as a whole is embarrassingly parallel.
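As a rough illustration of spreading stocks across machines by volume, here is a minimal sketch. I'm assuming a simple greedy rule (heaviest stock first, always onto the least-loaded machine); the function name and the volume figures are made up, and I have no idea what placement policy a real exchange uses.

```python
import heapq

def assign_symbols(avg_volume, n_machines):
    """Greedy balancing: place the heaviest symbol first, always on the
    machine with the smallest total volume so far.
    (Hypothetical helper, not any exchange's actual policy.)"""
    heap = [(0, m) for m in range(n_machines)]  # (current load, machine id)
    heapq.heapify(heap)
    assignment = {}
    for symbol, vol in sorted(avg_volume.items(), key=lambda kv: -kv[1]):
        load, machine = heapq.heappop(heap)      # least-loaded machine
        assignment[symbol] = machine
        heapq.heappush(heap, (load + vol, machine))
    return assignment

# Made-up average daily volumes, in arbitrary units.
volumes = {"AAA": 90, "BBB": 50, "CCC": 40, "DDD": 10}
placement = assign_symbols(volumes, n_machines=2)
```

Because no stock's book ever needs to see another stock's orders, this assignment is the only coordination needed; each machine then runs independently.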
The only traditional databases are for reporting purposes (e.g. who traded with whom), and they are append-only as far as the core exchange code is concerned. The reporting database is fed by the same firehose feed that you can get as an exchange member, albeit with more redundancy. There are also access control systems that only come into play when you first open a session, and lots of other moving parts, each with its own task. For example, if you lose your connection to the exchange, you can ask a certain (non-core) server to replay all the events that happened from a given point in time so that you can catch up. These servers, too, would be listening to the main event stream and spooling it to disk, but the central exchange processes are RAM-only.
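To make the replay idea concrete, here is a minimal sketch of a non-core listener that spools events to disk and replays them from a given timestamp. The class name, the JSON-lines file format, and the event fields are all my own assumptions for illustration; real exchanges use their own wire formats and recovery protocols.

```python
import json

class EventSpool:
    """Append-only spool of (timestamp, event) records, one JSON object per line.
    (Hypothetical sketch; the on-disk format is an assumption.)"""

    def __init__(self, path):
        self.path = path
        open(self.path, "a", encoding="utf-8").close()  # make sure the file exists

    def append(self, ts, event):
        # Called for every event heard on the main stream; never rewrites old data.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"ts": ts, "event": event}) + "\n")

    def replay(self, since_ts):
        # Yield every spooled record at or after since_ts, in arrival order.
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if record["ts"] >= since_ts:
                    yield record
```

A client that dropped its connection at time `t` would call something like `spool.replay(t)` and apply the results before rejoining the live feed. The core matching process never touches this file.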
How do they handle machine failures? That's harder to speculate on from the outside, but if I were them I'd be running the same exchange in parallel on 2-3 machines. I don't know how it could be done without serializing the incoming order flow to make sure that each machine sees the same order of events, but it's not impossible.
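Here is a minimal sketch of what that serialization could look like, purely as speculation on my part: a single sequencer stamps every incoming order with a sequence number, and each replica applies orders strictly in sequence order, buffering anything that arrives early. `Sequencer`, `Replica`, and the toy price-time matching logic are hypothetical names and simplifications, not how any particular exchange does it.

```python
import heapq
from itertools import count

class Sequencer:
    """Single point that imposes one global order on incoming orders."""
    def __init__(self):
        self._next = count(1)

    def stamp(self, order):
        return next(self._next), order

class Replica:
    """Deterministic toy matching engine; applies orders strictly by sequence number."""
    def __init__(self):
        self.expected = 1
        self.pending = {}   # seq -> order, for messages that arrived early
        self.bids = []      # heap of [-price, seq, price, qty]
        self.asks = []      # heap of [price, seq, price, qty]
        self.trades = []    # (resting seq, incoming seq, price, qty)

    def receive(self, seq, order):
        self.pending[seq] = order
        while self.expected in self.pending:
            self._apply(self.expected, self.pending.pop(self.expected))
            self.expected += 1

    def _apply(self, seq, order):
        side, price, qty = order
        own, other = (self.bids, self.asks) if side == "buy" else (self.asks, self.bids)
        while qty > 0 and other:
            best = other[0]
            crosses = best[2] <= price if side == "buy" else best[2] >= price
            if not crosses:
                break
            fill = min(qty, best[3])
            self.trades.append((best[1], seq, best[2], fill))
            qty -= fill
            best[3] -= fill          # heap order depends only on key and seq
            if best[3] == 0:
                heapq.heappop(other)
        if qty > 0:                  # remainder rests on the book
            key = -price if side == "buy" else price
            heapq.heappush(own, [key, seq, price, qty])

sequencer = Sequencer()
orders = [("sell", 101, 5), ("buy", 100, 3), ("buy", 101, 7), ("sell", 100, 4)]
stamped = [sequencer.stamp(o) for o in orders]

a, b = Replica(), Replica()
for seq, order in stamped:
    a.receive(seq, order)                      # in-order delivery
for i in (2, 0, 3, 1):
    b.receive(*stamped[i])                     # scrambled delivery
```

Both replicas end up with identical trades and books regardless of delivery order, so if one machine dies the other can carry on. The obvious cost is that the sequencer itself becomes the thing you have to make redundant.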