I'd like to introduce a young man whose studies in chemistry at the University of London were interrupted by the Second World War. As a chemist, he was assigned to a chemical defense experimental station with the British Army Engineers and undertook work determining the effects of poison gas. There, he was faced with mountains of data on the effects of various doses of various compounds on rats and mice. Since the Army could provide no statistician, and since this bright lad had once read R.A. Fisher's revolutionary Statistical Methods for Research Workers, he was drafted to do the work. Thus was born a statistician who would become the Director of the Statistical Research Group at Princeton (where he married one of Fisher's daughters), create the Department of Statistics at the University of Wisconsin, and exert incredible influence in the fields of statistical inference, robustness (a word he defined and introduced into the statistical lexicon), and modelling; experimental design and response surface methodology; time series analysis and forecasting; and distribution theory, transformation of variables, and nonlinear estimation ... one might just as well say he influenced "statistics" and note that a working statistician owes much of their craft to this man whether they know it or not. Ladies and gentlemen, I'd like to introduce George Edward Pelham Box.
Essentially, all models are wrong, but some are useful.
--George Box
We've all heard this aphorism, even if we do not know it springs from George Box. The statement is true in the most fundamental sense possible, and it is a profoundly important insight. What isn't widely understood or acknowledged, though, is how incredibly dangerous and damaging the idea can be and why, despite its veracity, we should ruthlessly suppress its use. True but dangerous? How is this possible?
Listen carefully the next time you hear this mantra invoked. The key is the manner in which most use the statement, emphasizing the first half as exculpatory ("It doesn't matter that my model is wrong, since all models are") and the latter half as permissive. The forgiveness of intellectual sins implicit in the first half of the statement requires of the analyst or programmatic and planning partisan no examination of the sin and its consequences; we are forgiven, for we know not what we do ... though we should know, and we should not be forgiven for turning a blind eye.
Once forgiven, we elevate the utility of the model as the only criterion of interest, but this is a criterion with no definition. As such, it admits all manner of pathologies that contradict Box's intent in framing this discussion of statistical methods. Consider a somewhat more expansive discussion of the same concept. In the same work quoted above, Box and Draper wrote,
Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.
And earlier, in a marvelous 1976 paper, Box averred,
Since all models are wrong the scientist cannot obtain a 'correct' one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so over-elaboration and over-parameterization is often the mark of mediocrity.
In each case, the question of utility is tied explicitly to the question of how wrong the model is or is not. Similarly, this is precisely the emphasis in Einstein's famous injunction, often quoted "as simple as possible, but no simpler." He actually said,
It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.
In all cases, it is the phenomenon under investigation, the datum of experience, that forms the basis for evaluating the relative utility of the model that is, of necessity, not a perfect representation of the phenomenon. The question is not whether the model is right or wrong, useful or useless. The real question of interest is whether the model is right in the ways that matter. The two concepts, separated by a lowly conjunction in Box's famed quote, are not separate in his intellectual construct. Yet they are very much so in the framing of the quote as typically (ab)used.
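To make Box's point about parsimony concrete, here is a minimal sketch in Python--not anything Box wrote, just an illustration with invented data--of how an over-elaborated model can fit the observations in hand while representing new experience worse than an economical one:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "data of experience": a noisy but fundamentally simple phenomenon.
x = np.linspace(0, 1, 30)
y = 2.0 * x + rng.normal(scale=0.2, size=x.size)

# Fresh observations of the same phenomenon, unseen during fitting.
x_new = np.linspace(0, 1, 300)
y_new = 2.0 * x_new + rng.normal(scale=0.2, size=x_new.size)

for degree in (1, 9):
    coeffs = np.polyfit(x, y, degree)   # fit a polynomial of the given degree
    pred = np.polyval(coeffs, x_new)    # predict the unseen observations
    rmse = np.sqrt(np.mean((pred - y_new) ** 2))
    print(f"degree {degree}: held-out error = {rmse:.3f}")

# Both models are wrong; the over-parameterized one tends to chase the noise
# in the original sample and so represents new experience less adequately.
```

The particular numbers are arbitrary; the point is Box's: elaboration buys a closer fit to what we have already seen, not a more adequate representation of experience.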
(Image: A Spherical Chicken and a Vacuum?)
So why write about these things? The answer is simple. These questions affect every aspect of nearly every problem a military analyst will face--whether that analyst is an operations researcher, an intelligence analyst, a strategist engaged in course of action analysis, or something else entirely. Examples abound.
Consider the ubiquitous 1-n list, a model of decision making that problematically imposes a strict, transitive order on preferences, treats cost and benefit as marginal with all the path dependence possibilities that entails, and does not typically account for interactions and dependencies across the list, all of which compromise the utility of the list as a tool. The model is, however, simple, easy to explain, and conveys some sense of rigor in the construction of the list ... even if none exists. Useful indeed.
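As a toy illustration of the first problem, consider three hypothetical programs whose pairwise preferences form a cycle; the names and judgments below are invented for the example, and the sketch simply checks whether any strict 1-n ordering can honor them:

```python
from itertools import permutations

# Invented pairwise judgments: "A is preferred to B," and so on. Different
# criteria or stakeholders can easily produce a cycle like this one.
prefers = [("A", "B"), ("B", "C"), ("C", "A")]

def consistent(ranking, judgments):
    """Check whether a strict 1-n ordering honors every pairwise judgment."""
    position = {item: i for i, item in enumerate(ranking)}
    return all(position[a] < position[b] for a, b in judgments)

valid = [r for r in permutations("ABC") if consistent(r, prefers)]
print(valid)  # [] -- no 1-n list is consistent with the stated preferences
```

When the underlying judgments are cyclic or interdependent, whatever list we publish is an artifact of the procedure used to break the cycle, not a faithful summary of the preferences themselves.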
Or consider the notion of risk as an expected value expressed via the product of probability and consequence. With no meaningful characterization of the underlying distribution in the probability half of this formula, risk degenerates to a simple point estimate with no consideration of the heaviness of the probability tails and the relative likelihood of extremity. Or worse, the distribution is implicitly assumed to be Gaussian because that is what we've been taught to expect, and extreme outcomes are unwittingly eliminated from our calculus. On a related note, when considering a given scenario (within the scope of the various Defense Planning Scenarios) and speaking of risk, are we considering the likelihood of that particular scenario (by definition asymptotically close to zero) or the likelihood of some scenario in a given class? This sounds a bit academic, but it is also the sort of subtle phenomenon that can influence our thinking based on the assessment frame we adopt. As such, characterizing the output of such a model as a description of risk is specious at best.
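Here is a minimal sketch of the danger in the point estimate, using two invented loss distributions deliberately constructed to have the same expected loss (all numbers are arbitrary and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# Two hypothetical consequence (loss) distributions with the same expected
# loss of roughly 100, in arbitrary units.
thin = rng.normal(loc=100.0, scale=15.0, size=n)                    # the Gaussian habit of mind
heavy = rng.lognormal(mean=np.log(100.0) - 0.5, sigma=1.0, size=n)  # same mean, heavy right tail

for name, losses in (("thin-tailed", thin), ("heavy-tailed", heavy)):
    print(f"{name:12s} expected loss = {losses.mean():6.1f}   "
          f"P(loss > 500) = {np.mean(losses > 500):.4f}")
```

Both models report essentially the same "risk" when reduced to an expected value, yet one treats extreme outcomes as practically impossible while the other makes them a realistic planning concern.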
(Image: John Maynard Keynes)
(Image: Frank H. Knight)
Further, the question of consequence is no less problematic. What do we mean by consequence and how do we quantify it (because the probability/consequence model of risk demands an ordered consequence)? And how does the need for a quantifiable expression of consequence shape the classes of events and outcomes we consider? Does it bias the questions we ask and information we collect, shifting the world subtly into a frame compatible with the probability/consequence mode of orienting to it? What are the consequences of being wrong in such a case?
(Image: Continuum of Conflict, 2015 U.S. National Military Strategy)
The continuum of conflict depicted in the 2015 National Military Strategy offers another kind of list, though, based on the Defense Planning Scenarios. Since each scenario is assigned a numerical value, and since the real numbers are totally ordered, we suddenly have a continuum of conflict. This model may be useful--it certainly makes the complex simple--but is it right in the ways that matter? The continuum makes each of the types of conflict shown effectively similar, differing only in degree. Even the implication of such a continuum is dangerous if it leads military planners to believe the ways and means associated with these forms of conflict are identical or that one form of conflict can be compartmentalized in our thinking. Perhaps some room should be made for the notion that more is not always simply more; sometimes more is different, but this is an idea explicitly excluded from an ordering like that presented here.
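To see why the ordering itself is suspect, consider a small sketch with two invented scenarios described on two notional dimensions; the weights used to collapse them onto a single axis are arbitrary, and so, therefore, is the resulting "continuum":

```python
# Invented scenario descriptions on two notional 0-10 dimensions.
scenarios = {
    "coercive gray-zone campaign": {"violence": 2, "ambiguity": 9},
    "limited conventional clash": {"violence": 8, "ambiguity": 2},
}

def score(s, w_violence, w_ambiguity):
    """Collapse a multidimensional description into a single real number."""
    return w_violence * s["violence"] + w_ambiguity * s["ambiguity"]

for weights in ((1.0, 0.1), (0.1, 1.0)):
    ordering = sorted(scenarios, key=lambda name: score(scenarios[name], *weights))
    print(weights, "->", ordering)

# The ordering reverses when the arbitrary weights change: the "continuum"
# is a property of the scoring model, not of the conflicts themselves.
```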
(Image: Blind Men Building Models of an Elephant)
(Image: Modeling Counterinsurgency in Afghanistan)
Or what about our models of human nature and the international system? Are we classical realists, structural realists, institutionalists, Romantics, Marxists, or something else? The structural realism of Kenneth Waltz is famously parsimonious, abstracting a great deal into billiard balls that interact on the basis of power alone (a description that is itself more parsimonious than is fair). But this leaves us with a model that cannot explain critical phenomena and necessitates expansion and refinement--see Stephen Walt's balance of threat, for example, a socially constructed concept outside the Waltz model. In the end, we are faced with a model and not with reality, with approximations of truth and not with truth itself.
This notion is particularly important in thinking about the veracity and utility of our models. They are, in fact, models. In all cases, the intent is an "adequate representation of a single datum of experience." But in studying our models we can become detached from experience and attach ourselves to the models themselves, associate our intellectual value with their form and behavior, and make them into things worthy of study unto themselves. In short, we are wont to reify them, a process Peter Berger and Thomas Luckmann describe as
... the apprehension of the products of human activity as if they were something else than human products--such as facts of nature, results of cosmic laws, or manifestations of divine will. Reification implies that man is capable of forgetting his own authorship of the human world, and further, that the dialectic between man, the producer, and his products is lost to consciousness. The reified world is ... experienced by man as a strange facticity, an opus alienum over which he has no control rather than as the opus proprium of his own productive activity.
(Image: Auguste Rodin, The Thinker)
Finally, if the model is wrong, we must demand a new model more closely aligned to the question of interest, a model right enough to be useful. And this is not just a task for analysts and mathematicians, though it is our duty. This is a task for planners, strategists, operators, decision makers, and everyone else. We must seek the truth, even if we may not find it.
First, however, we should probably scrub Box's exculpatory and damaging aphorism from our decision-making discourse.