Data coming from computers (in this case, server logs) is guaranteed to have a structure that makes it easier to process than data from more organic sources (reports collected by people, measurements from instruments, etc.), which are closer to the domain Brooks would have dealt with.
When dealing with these problems today, the greatest challenge isn't writing a script; it's deciding whether data points that don't fit the model invalidate the data or the model.
Moreover, we've spent five decades systematizing the former. The latter is more challenging than ever and fraught with controversy.
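To make the split concrete, here's a minimal sketch of the "solved" half: a script that flags points that don't fit a simple model. Everything here is an assumption for illustration (Python as the language, response-time logs as the data, a median-based rule with an arbitrary cutoff `k`). Note that the hard part the comment describes, deciding whether a flagged point indicts the data or the model, is exactly what a script like this cannot do.

```python
import statistics

def flag_misfits(values, k=5.0):
    """Split values into (fits, misfits) using a median-based rule.

    A point is a "misfit" if it lies more than k median-absolute-deviations
    from the median. k=5.0 is an arbitrary cutoff, not a principled choice.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    misfits = [v for v in values if abs(v - med) > k * mad]
    fits = [v for v in values if abs(v - med) <= k * mad]
    return fits, misfits

# Hypothetical latencies parsed from logs; one suspicious reading.
latencies_ms = [12, 14, 13, 15, 11, 14, 13, 500]
fits, misfits = flag_misfits(latencies_ms)
# The script can tell you 500 is unusual; it cannot tell you whether the
# server really stalled (the model is wrong) or the log line was corrupted
# (the data is wrong).
```

The median/MAD rule is used instead of mean/stddev because a single extreme value inflates the standard deviation enough to mask itself; the median-based version doesn't have that problem.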
Reading the article had me thinking: “computers are good at automating… computers.”
The quantity of logs he’s describing could only be created by having a machine take readings and dump them somewhere. The cause of the problem had to be a computer in the first place. For human data sources, even back in the 60s we had machines fast enough to tally the census, balance checkbooks, or book flights.
So yeah, Dan had a problem of unimaginable scope in the 80s but also the problem is kind of fake? “Help, my computers are spitting out too many numbers and I can’t graph them all!”
I feel like Dan was on to something here but also something in the article didn’t quite line up.
> So yeah, Dan had a problem of unimaginable scope in the 80s but also the problem is kind of fake?
I get what you're saying, but I don't think the problem is "fake" or artificial, and maybe not even avoidable. Software is a tool, and you almost always need other tools to build and maintain tools.
Compare large construction projects, which often require separate construction projects of their own, like building temporary roads to move all the material. Or drilling a long tunnel through a mountain: frequently, entire factories are built near the tunnel entrance to process the excavated material into the concrete used for encasing the tunnel walls.
Another example: To produce huge numbers of screws in a cost-effective way, you need large, highly specialized machines, which serve no other purpose. And you need people who maintain those machines. Is that a fake problem, or are those fake jobs? Hardly.
Construction is an interesting case. If someone had written after the Empire State Building that skyscrapers would not grow another order of magnitude in the coming century, they would have been laughed at. There were lots of popular articles about “mile high” skyscrapers then. And skyscrapers today are much better! We have counterweights at the top, glass facades, fancy designs… But they’re mostly the same height. Even the show-off Burj Khalifa is only twice the occupied height. And we don’t even have a kilometer-high skyscraper yet.