> I'm the CEO of Mark Logic Corporation, a company which develops and markets an XML server.
Reminds me of a local company that was developing something similar, an "XML acceleration appliance" or something like that. The most notable fact about that company was that every single employee had a different version of what their product was actually doing. It was really bizarre. They had substantial funding, they had a brand, they were marketing the product and even managed to sell it, but every time they tried to explain the technical side of things it came out trivial. Don't know if they are still in business, but I wouldn't be surprised if they are.
I suspect that Google Protocol Buffers, Thrift, etc. are sufficient workarounds for anyone who has a current need for one of these things. If you are using XML as your payload and worried about shaving off milliseconds of latency, then you've done the wrong thing. Perhaps large IT organizations are forced to suffer the consequences of vendor-provided interoperability schemes and buy products like this to make the best of a bad situation.
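To put a rough number on the payload point, here is a minimal Python sketch that encodes the same record as XML and as a fixed-layout binary blob. It uses the standard library's `struct` as a stand-in for a compact binary format; it is not Protocol Buffers or Thrift, and the record and its field names are made up for illustration.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record; the field names are invented for this comparison.
record = {"id": 42, "price": 19.99}

# XML encoding: one element per field.
root = ET.Element("record")
ET.SubElement(root, "id").text = str(record["id"])
ET.SubElement(root, "price").text = str(record["price"])
xml_bytes = ET.tostring(root)

# Compact binary encoding: little-endian 4-byte int followed by an 8-byte double.
binary_bytes = struct.pack("<id", record["id"], record["price"])

print(len(xml_bytes), "bytes as XML vs", len(binary_bytes), "bytes as packed binary")
```

The size gap is only part of the story, since parsing cost matters too, but it gives a feel for why people reach for binary formats when latency is the concern.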
At the very least, it's significantly less stupid than "XML acceleration". :-P
I'm thinking it's kinda a specialized object database for XML documents? Considering a lot of documents these days are ODF and MSOOXML, that might actually be an interesting technology.
The MarkLogic server is a native XML database. It's designed and written from the ground up to do XML. That's why it's so much better/faster than the competition.
Storage-wise, it's often the case that the MLS DB for a project is actually the same size or smaller than the original data even after you add all of the indexes, etc.
Performance-wise, the MarkMail.org site is running a 4-node cluster. The cluster is managed entirely by the ML server and is fully transactional across all nodes.
Feature-wise, the ML server provides all of the underlying goodies for doing not only the various text searches but also the slick faceted search.
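For anyone who hasn't run into faceted search before, here is a rough Python sketch of the idea: run a text match, then count the matching documents by the values of a few fields so the user can drill down. This is only an illustration and says nothing about how the ML server implements it; the sample messages, field names, and the `faceted_search` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical mail-archive records; field names are made up for illustration.
MESSAGES = [
    {"list": "dev", "sender": "alice", "body": "xml index question"},
    {"list": "dev", "sender": "bob", "body": "cluster setup notes"},
    {"list": "users", "sender": "alice", "body": "xml query help"},
]

def faceted_search(messages, term, facet_fields=("list", "sender")):
    """Return messages whose body contains the term, plus per-field value counts."""
    hits = [m for m in messages if term in m["body"].split()]
    # One Counter per facet field, computed over the matching messages only.
    facets = {f: Counter(m[f] for m in hits) for f in facet_fields}
    return hits, facets

hits, facets = faceted_search(MESSAGES, "xml")
for field, counts in facets.items():
    print(field, dict(counts))
```

A real engine would answer both the match and the counts from indexes rather than scanning every document, which is where the hard work is.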
Mark Logic is more than legit; they have some very sexy stuff. To demonstrate their product's foo, they also power MarkMail, which is becoming one of the most widely used mail archive tools: www.markmail.org.
"and second how we've completely lost touch with how big things are."
I think the author missed reading his own writing :)
It's about 39,000 times, not counting overhead for metadata and end-of-file slack.
It is amazing, though, how fast the price of storage has come down. 10GHz processors are probably not going to happen, but I really won't be surprised if someone offers a multi-terabyte solid state drive within a decade.
Not all that long ago I bought a 500 MB Conner Peripherals hard drive for about $1,500, and I wondered how we would ever fill that thing up...