Not even close to "almost every general". Even taking into consideration his own admission that he left out the Mongols, his analysis is obviously very, very Western-centric. Somehow the omission of the most populous continent in the world and probably throughout history didn't dissuade the author from making such a bold claim. The most obvious omissions were the Chinese and Japanese generals. It's not as if those two countries didn't fight a lot of wars, write a lot about war ("Art of War", anyone?), or have a historical tradition. Let's not forget the Indian subcontinent has its own history of warfare.
His comment about Rommel was kind of awful as well. Rather than trying to correct for why his model didn't generate some expected results, he tried to convince people that their perspective is wrong. Rommel may not be the impressive general he's popularly believed to have been, but the author's understanding of Rommel is quite shallow. Rommel's exploits go back all the way to WWI. He was the youngest recipient of the "Pour le Mérite" when he captured over 9000 prisoners with just 150 soldiers.
This is an example of someone with technical skills applying those skills to a field they don't really understand and wrapping it up with a bold claim.
Some other thoughts:
- Analysis is limited to results of individual battles. That's a very narrow slice of a general's actual job.
- He ties WAR to overall W/L, which isn't great but the data doesn't give you many options.
- The model rewards "underdog" wins. This sounds like a decent proxy for skill, but it seems like a big part of the job is avoiding being an underdog in the first place.
- Army size and casualty figures for anything pre-17th century (and that's generous) are extremely suspect.
I wouldn't trust this model to compare Montgomery to Patton during the Sicily invasion, let alone compare Caesar to Napoleon.
As someone who picked up a computational military modeling course in college, I can say that modeling ancient warfare is a vastly different task from modeling modern battles.
My gut would be that modern warfare de-correlates more strongly from numeric advantage due to increased speed and lethality of available force types.
Also, for the author, if you wanted to be more accurate, start calculating actual expected outcomes from the forces. Lanchester's Laws are as good a place as any to start.
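For anyone who hasn't run into Lanchester's Laws, here is a minimal sketch of the square law (the aimed-fire variant, so arguably a better fit for modern battles than ancient ones). The function name and the `alpha`/`beta` effectiveness parameters are my own labels for illustration, not anything from the article, and real values would have to be estimated from data.

```python
import math


def lanchester_square(a0, b0, alpha, beta):
    """Predict the outcome of an aimed-fire engagement under Lanchester's square law.

    a0, b0: starting troop counts for sides A and B.
    alpha:  rate at which one A soldier removes B soldiers (hypothetical units).
    beta:   rate at which one B soldier removes A soldiers (hypothetical units).
    Returns (winner, survivors_on_winning_side).
    """
    # The model is dA/dt = -beta * B and dB/dt = -alpha * A, which keeps
    # alpha * A^2 - beta * B^2 constant for the whole fight.
    a_strength = alpha * a0 ** 2
    b_strength = beta * b0 ** 2
    if a_strength > b_strength:
        return "A", math.sqrt((a_strength - b_strength) / alpha)
    if b_strength > a_strength:
        return "B", math.sqrt((b_strength - a_strength) / beta)
    return "draw", 0.0


# Outnumbered 2:1, side A needs more than 4x the per-soldier effectiveness to win.
print(lanchester_square(100, 200, alpha=5.0, beta=1.0))
print(lanchester_square(100, 200, alpha=3.0, beta=1.0))
```

Under these assumptions, numbers count quadratically, so an "expected outcome" built this way would credit an underdog win very differently than a simple troop ratio would.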
Except he forgot to close with "which is why you should fund my startup."
2) If only the changed portion of the title is colored, there's an indication of whether it was a subtle change to better fit the article content (even if the title was the same as the article previously) or an entire rewrite because it wasn't descriptive enough or was just plain misleading or flame-bait.
3) It may help you recognize it from the main page if you see it later, knowing it previously had a different title.
4) This is something separate from what I asked about before, but if all comments from HN staff noting changes followed the format: Changed the title from "$OLD" to "$NEW" because $REASON, I think that it would cut down on the fluff comments about whether it changed (as in #1) and also cut down on the comments that opine about HN's editorializing of titles, which seem to occur fairly commonly when the title changes on submissions of sufficient activity.
#4 would probably help the most, but even an orange asterisk after the title, in a span with a title attribute holding the original title, would be good. I'm just against loss of information in general, and would like to know what the original submitted title was in some cases, and it's not always obvious. Maybe I'm just weird.
However, one failure of this approach is that it assumes that the numbers of soldiers given for ancient battles are accurate. The Greeks will tell you that the Persians came with millions of soldiers, but that just wasn't even possible given the state of military logistics at the time.
So, look at the #4 general in the article, Khalid ibn al-Walid. As an example data point, the article on the Battle of Yamama (https://en.wikipedia.org/wiki/Battle_of_Yamama) says he fought at nearly a 4:1 disadvantage, and as far as I can tell, the only source for these numbers is the Quran. Much like how the Greek sources love to tell you how big, bad and numerous the Persians that they defeated were, how great is God's glory if the army of God is steamrollering equal or smaller size armies?
I'm not sure how the author could account for this. IMO this underweights more recent and well-attributed generals such as Napoleon or Wellesley.
Subutai had over 60 battles. He also conquered the largest area and coordinated simultaneous operations over massive distances.
A general is also responsible for the logistics of maintaining his army and also for choosing battles where they have the most advantage. Thus, if a general over the course of a campaign is always engaging in battles with enemies where he is at a terrain or numerical advantage, that is a positive for the general, not a negative. If a general is constantly fighting at a disadvantage in numbers or terrain, that can be a sign of weakness, not of strength - especially if they lose.
Thus, wins above replacement isn't a very good measure of generalship.
In addition, the data is incomplete. Napoleon fought 43 battles, losing five. His biggest disaster, the Russian invasion, was also his longest-distance campaign. In contrast, here is the map of Alexander's empire. https://en.wikipedia.org/wiki/Alexander_the_Great#/media/Fil.... Believing that the 9 listed battles of Alexander represent every battle his army fought while conquering an area this vast is very naive. The truth is that over the 2000 years since Alexander, the accounts of battles got distilled and what we have is a greatest hits version of his battles. In contrast, Napoleon was in an era for which it is more likely that we have a more complete list of his battles.
Finally, in comparing Napoleon with some of the other generals, it is important to remember that ultimately Napoleon lost on the field of battle. When it really mattered, he failed. Alexander, Julius Caesar, Subutai, Genghis Khan, etc. all won their battles - especially the ones that counted.
Napoleon may have been a great tactician on the field of battle, but being a great general is more than that.
Fighting battles on the eastern front in WW2 is significantly different from warfare during ancient times or even the 18th century. For one, warfare is far more detrimental to the rest of society, and fighting effectively requires a lot more effort in the modern world compared to early times in terms of logistics, production, technology and strategy.
Also, "warfare performance" is very much influenced by the environment one fights in. How does one measure who is a better general when one, for instance, is severely undersupplied or in completely foreign terrain? There are simply too many variables to take into account to do such a study across such an incredibly large timeline.
This seems like a fun exercise in modeling data and doing some quick analysis. Obviously not for doing any worthwhile analysis or grading our existing general officer corps or putting any other weights in there.
For true analysis of tactical prowess you're looking at battle-to-battle analysis I'd imagine and metrics just don't go back that far. "most flanks covered", "most miles gained", "most efficiency per round of ammunition"... rather dark either way.
Well done for the purpose and nice post about how to gather / munge data!
Out of curiosity, anybody know what Schwarzkopf's stats would have been?
Otherwise, a general who was willing to nuke the other side might "win" a lot of battles, especially against a "replacement" who is morally opposed to civilian casualties.
You can also win the battle and lose the war. Are generals that overcommit to win a battle really good generals?
I wonder how true this is of many generals as well.
I think we are using "general" as a functional title here, not a rank.
The probability of accomplishing such a feat by pure chance is 0.00004. But then the campaign was over, and that general never fought another battle, with a total WAR (according to the article) of 0.99996.
Do you see my problem with the article? It rewards generals that fight unnecessary battles, completely discounts the decisiveness of the victory, and assigns a nearly arbitrary value for average replacement general. It is clear to me that the metric is measuring belligerence and battle prowess, rather than overall efficacy. The article even says it discounts strategy.
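To make the arithmetic concrete, here is a rough sketch. It assumes the per-battle credit is simply the result minus a replacement general's chance of winning, which is what the 0.99996 figure implies; the function names and the ten-easy-wins career are hypothetical, made up for comparison.

```python
def battle_war(won, p_replacement_win):
    """Credit for one battle: actual result minus a replacement general's win chance."""
    return (1.0 if won else 0.0) - p_replacement_win


def career_war(battles):
    """Sum per-battle credit over (won, p_replacement_win) pairs."""
    return sum(battle_war(won, p) for won, p in battles)


# One decisive long-shot victory, then retirement.
one_miracle = [(True, 0.00004)]
# Ten hypothetical wins in battles a replacement would have won 90% of the time.
ten_easy_wins = [(True, 0.9)] * 10

print(career_war(one_miracle))    # roughly 0.99996
print(career_war(ten_easy_wins))  # roughly 1.0
```

So under this scoring, ten routine wins edge out the single campaign-ending miracle, because credit per battle is capped at 1 and only accumulates by fighting more battles.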
If, for instance, the general lured the rival's army away from the capital, while suffering a decisive defeat, but married his son to the widow empress during the battle, so the son could order the execution of the rival, the battle tactics were completely irrelevant in the face of the instantly-winning strategy.
"Alexander the Great, despite winning all 9 of his battles, accumulated fewer WAR largely because of his shorter and less prolific career."
WAA helps you compare between those with long careers that weren't great players and those who were truly great players who had short careers. The author needs a WAA for this comparison, not WAR.
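A toy comparison of the two baselines, as a sketch only. The flat 40% replacement and 50% average win rates are hypothetical numbers I picked for illustration, and a flat rate ignores how hard each battle was; the only figure taken from the article is Alexander's 9-0 record.

```python
def wins_above(battles_fought, wins, baseline_win_rate):
    """Wins beyond what a baseline general would expect over the same number of battles."""
    return wins - battles_fought * baseline_win_rate


REPLACEMENT_RATE = 0.4  # hypothetical win rate of a replacement-level general
AVERAGE_RATE = 0.5      # hypothetical win rate of an average general

careers = {
    "long, average career": (60, 30),   # hypothetical: 30 wins in 60 battles
    "short, perfect career": (9, 9),    # Alexander's record as quoted above
}

for name, (fought, wins) in careers.items():
    war = wins_above(fought, wins, REPLACEMENT_RATE)
    waa = wins_above(fought, wins, AVERAGE_RATE)
    print(f"{name}: WAR={war:.1f} WAA={waa:.1f}")
```

With these made-up rates the long, average career comes out ahead on WAR (6.0 vs 5.4) but has a WAA of 0.0 against 4.5 for the short, perfect one, which is the distinction being drawn here.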
It's a ranking of a few popular wikipedia articles, based on metrics drawn up without any connection to the realities of war.
Lots of people are commenting about the poor data set selection. I think that's understandable given that the author isn't a historian but rather a baseball geek and data scientist. Still, I can't help but feel a little bit of anger (speaking as a software engineer and a history Ph.D. dropout) at seeing yet another example of a technical person blithely wandering into a field that has been studied for thousands of years and making very grandiose claims without even a cursory study of the field. But hey, that's tech for you, always disrupting (I mean that both sarcastically and not sarcastically at the same time).
What I want to address is more fundamental than getting the data set right, however. The author doesn't seem to understand that the very nature of the historical record is highly subjective.
1) Even if Wikipedia nailed every statistic, the statistics themselves about wins and losses, troop numbers, lengths of battle, places, etc. are increasingly unreliable as you go back in time. In some texts and historiographical traditions, the numbers are not just unreliable, they're arguably cut from whole cloth. Anything before, say, 1000 A.D. in Western Europe, for example, is highly disputed. Biases in old texts aside, we have enormous gaps in what texts have survived to this day. History isn't written by the victors, but the dried wood pulp and calf skins that it's written on are selectively preserved by them. The author's model has no awareness of the history of historiography.
(Tangentially, Napoleon, whom the author's model rates as the greatest general of all time, was indirectly responsible for a huge amount of destruction of Europe's archives after issuing orders to transfer archives from across the continent to Paris. Early modern logistics meant that huge portions of documents were destroyed in transport. I remember when I was working in the Vatican Secret Archives, something like 1/3rd of that archive was destroyed, and that's one of the main archives for European and world history.)
2) Even if we had all the numbers exactly perfect, what counts as a victory and what counts as a loss is highly subjective. Was the North Vietnamese Tet Offensive a victory or a loss? For whom? In what sense? These philosophical questions can't be answered by a model, at least not without the philosophical assumptions being made explicit.
This question is fundamentally one that requires a nuanced approach. I think that data-driven approaches can really help, but the author's model needs not only more refinement, it also needs to acknowledge more of the confounding factors involved. I encourage the author to keep working on it.
You should try this for admirals!
Another major problem with this ranking is that, in many cases, the battlefield was not "level". Sometimes literally, but also technologically, socially, etc. For example, Napoleon's Grande Armée was quite different from any other European force in its composition, supply, and leadership. It wasn't a small group of nobles carrying on knightly traditions, like most other European countries had. It was every freakin' Frenchman that could hold a rifle. For supply its soldiers relied heavily on foraging and raiding, which allowed them to move faster than opposing armies could. Command was highly distributed and Napoleon's commanders had a lot more information and leeway to act. When big battles happened, command chains had a way of breaking down, and that put traditional, strictly top-down command structures at a disadvantage. Napoleon's army was bigger, faster, and smarter (if left to its own devices) than other European forces. It's arguable that Napoleon won so many battles, not because of his own tactical performance on the battlefield, but because he built a better army than those his peers built. Is that tactics? To put it another way, would Napoleon have fared as well against other French commanders had a schism occurred and he been faced with a civil war against half of his own army?
Julius Caesar had an even more overwhelming advantage over most of his opponents, thanks to the extreme disparity in organization between Roman legions and the Gauls or pretty much anyone else who wasn't Roman. He did spend some time fighting other Romans, but which type of opponent shot him up this list? Also, Caesar is probably one of the luckiest military commanders in history. If not for his luck, Caesar and his forces could have been annihilated in a dozen different battles and history would just have said "Well, that was a poor choice". How much does Caesar's luck inflate his position on this list?
Finally, one huge difference between baseball and historical battles is the historians. You can get a pretty unbiased view of any baseball game in the last few decades because you can actually watch a recording of it. Historical accounts of battles, on the other hand, are subject to all kinds of bias, exaggeration, etc. Just the numbers of combatants involved become notoriously unreliable in all but very recent history, if they're even reliable then. For example, casualties inflicted by U.S. forces in the Vietnam war are often thought to be wildly inflated because they were often just assumed. Say U.S. forces take fire from somewhere in a jungle. They call in a napalm strike. Everything burns and the shooting stops. The U.S. commanders have incentives to pad their body counts, so they assume everybody shooting at them (seemed like there were hundreds of them!) was wiped out. Accounts of battles just aren't trustworthy in a quantitative sense most of the time.
At some point it's not just luck. He was a brilliant tactician and did innovative new things almost every battle.
The one battle Caesar did lose was an example of Caesar's luck. Caesar only survived because Pompey (another great general) neglected to (or was unable to) finish off Caesar's forces in the rout. Pompey still trapped Caesar's forces, who were sure to starve and dwindle if Pompey just waited them out. But his rich backers goaded Pompey into fighting Caesar again, a seemingly easy battle given his huge advantages in men, especially cavalry. But with it being do or die, Caesar pulled out another victory through innovative tactics, and that victory made him the ruler of Rome.
I think anyone who is remotely reasonable would agree that sometimes war is necessary, such as in wars of national survival like for the UK in the Second World War.
If we've agreed that war is sometimes necessary, we'd better make sure that we have people who are extremely good at it, and that is what this analysis is trying to recognise.