
Friday, August 7, 2015

Right, Wrong, and Relevance


I'd like to introduce a young man whose studies in chemistry at the University of London were interrupted by the Second World War. As a chemist, he was assigned to a chemical defense experimental station with the British Army Engineers and undertook work determining the effect of poison gas. There, he was faced with mountains of data on the effects of various doses of various compounds on rats and mice. Since the Army could provide no statistician, and since this bright lad had once read R.A. Fisher's revolutionary Statistical Methods for Research Workers, he was drafted to do the work. Thus was born a statistician who would become the Director of the Statistical Research Group at Princeton (where he married one of Fisher's daughters), create the Department of Statistics at the University of Wisconsin, and exert incredible influence in the fields of statistical inference, robustness (a word he defined and introduced into the statistical lexicon), and modelling; experimental design and response surface methodology; time series analysis and forecasting; and distribution theory, transformation of variables, and nonlinear estimation ... one might just as well say he influenced "statistics" and note that a working statistician owes much of his or her craft to this man whether they know it or not. Ladies and gentlemen, I'd like to introduce George Edward Pelham Box.

George Box
But Box hardly needs an introduction. You already know him and no doubt quote him regularly even if you are not a professional analyst or statistician. Let me prove it to you. George Box, in his work with Norman Draper on Empirical Model-Building and Response Surfaces, coined the phrase, "Essentially, all models are wrong, but some are useful."

We've all heard this aphorism, even if we do not know it springs from George Box. It is true in the most fundamental sense possible and a profoundly important insight. What isn't widely understood or acknowledged, though, is how incredibly dangerous and damaging this idea is and why, despite its veracity, we should ruthlessly suppress its use. True but dangerous? How is this possible?

Listen carefully the next time you hear this mantra invoked. The key is the manner in which most use the statement, emphasizing the first half as exculpatory ("It doesn't matter that my model is wrong, since all models are") and the latter half as permissive. The forgiveness of intellectual sins implicit in the first half of the statement requires of the analyst or programmatic and planning partisan no examination of the sin and its consequences; we are forgiven, for we know not what we do ... though we should know and should not be forgiven for turning a blind eye.

Once forgiven, the utility of the model is elevated as the only criterion of interest, but this is a criterion with no definition. As such it admits all manner of pathologies that contradict the intent of Box in framing this discussion of statistical methods. Consider a somewhat more expansive discussion of the same concept. In the same work quoted above, Box and Draper wrote,
Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.
And earlier, in a marvelous 1976 paper, Box averred,
Since all models are wrong the scientist cannot obtain a 'correct' one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so over-elaboration and over-parameterization is often the mark of mediocrity.
In each case, the question of utility is tied explicitly to the question of how wrong the model is or is not. Similarly, this is precisely the emphasis in Einstein's famous injunction, often quoted as "as simple as possible, but no simpler." He actually said,
It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.
In all cases, it is the phenomenon under investigation, the datum of experience, that forms the basis for evaluating the relative utility of the model that is, of necessity, not a perfect representation of the phenomenon. The question is not whether the model is right or wrong, useful or useless. The real question of interest is whether the model is right in the ways that matter. The two concepts, separated by a lowly conjunction in Box's famed quote, are not separate in his intellectual construct. Yet they are very much so in the framing of the quote as typically (ab)used.

A Spherical Chicken and a Vacuum?
Why does this matter? As a separate idea, "useful" is applied without considering the meaning of the word. Does it mean useful in the sense that it illustrates and illuminates some natural phenomenon, as Box would have it? Or does it mean that the model is simple enough to be understood by a non-technical audience? On the other hand, it may mean that our tools appear to deliver answers to life, the universe, and everything, a chimera some see as worth chasing. Or perhaps it means that the model is simple in the sense that Herbert Weisberg uses to characterize "willful ignorance" which "entails simplifying our understanding in order to quantify our uncertainty as mathematical probability." And it is no great stretch to extend this notion of willful ignorance to the ways in which we frame underlying assumptions regarding the structure of the elements and interactions in a model to facilitate their mathematical and/or digital representation. There is a great physics joke in this vein involving a spherical chicken in a vacuum, but that's a story for another day. If any of these begin to affect our assertions regarding utility, we have crossed over into a territory where utility becomes a permissive cause for intellectual failure, and that is a dangerous territory.

So why write about these things? The answer is simple. These questions affect every aspect of nearly every problem a military analyst will face--whether that analyst is an operations research analyst, an intelligence analyst, a strategist engaged in course of action analysis, etc. Examples abound.

Consider the ubiquitous 1-n list, a model of decision making that problematically imposes a strict, transitive order in preferences, treats cost and benefit as marginal with all the path dependence possibilities that entails, and does not typically account for interaction and dependencies across the list, all of which compromise the utility of the list as a tool. The model is, however, simple, easy to explain, and conveys some sense of rigor in the construction of the list ... even if none exists. Useful indeed.
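To make the transitivity problem concrete, here is a minimal sketch in Python. The three notional programs and the pairwise judgments below are hypothetical, chosen only for illustration: when pairwise preferences cycle, every strict 1-n ordering contradicts at least one of the judgments that supposedly produced it.

```python
# Hypothetical illustration: three notional programs whose pairwise
# comparisons form a cycle (A preferred to B, B to C, C to A).
from itertools import permutations

pairwise_prefs = {("A", "B"), ("B", "C"), ("C", "A")}  # (preferred, dispreferred)

def violations(ranking):
    """Count pairwise preferences contradicted by a strict 1-n ranking."""
    position = {item: i for i, item in enumerate(ranking)}
    return sum(1 for better, worse in pairwise_prefs
               if position[better] > position[worse])

for ranking in permutations(["A", "B", "C"]):
    print(ranking, "violates", violations(ranking), "pairwise preference(s)")
# Every possible strict list violates at least one of the stated preferences:
# the transitive order the list imposes does not exist in the judgments.
```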

Or consider the notion of risk as an expected value expressed via the product of probability and consequence. With no meaningful characterization of the underlying distribution in the probability half of this formula, risk degenerates to a simple point estimate with no consideration of the heaviness of the probability tails and the relative likelihood of extremity. Or worse, it is implicitly viewed as a Gaussian distribution because that is what we've been taught to expect, and extreme outcomes are unwittingly eliminated from our calculus. On a related note, when considering a given scenario (within the scope of the various Defense Planning Scenarios) and speaking of risk, are we considering the likelihood of a given scenario (by definition asymptotically close to zero) or the likelihood of some scenario in a given class? This sounds a bit academic, but it is also the sort of subtle phenomenon that can influence our thinking based on the assessment frame we adopt. As such, characterizing the output of such a model as a description of risk is specious at best.
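A toy calculation makes the point about tails. The numbers below are notional, not drawn from any real assessment: two loss distributions are constructed to carry roughly the same expected value, so a probability-times-consequence risk score treats them as equivalent, yet one admits catastrophic outcomes the other effectively rules out.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two notional loss distributions with roughly the same mean (~100):
# one thin-tailed (Gaussian), one heavy-tailed. The point-estimate
# "risk score" is the same; the chance of an extreme outcome is not.
gaussian_losses = rng.normal(loc=100.0, scale=15.0, size=n)
heavy_losses = rng.pareto(a=1.5, size=n) * 50.0  # Lomax mean 1/(a-1)=2, scaled to ~100

for name, losses in [("Gaussian", gaussian_losses), ("Heavy-tailed", heavy_losses)]:
    print(f"{name:12s} mean={losses.mean():7.1f}  "
          f"P(loss > 500)={np.mean(losses > 500):.4%}  "
          f"worst observed={losses.max():,.0f}")
# Nearly identical expected values; the heavy-tailed case admits
# catastrophic outcomes the Gaussian frame quietly rules out.
```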

John Maynard Keynes
This isn't the end of the issue vis-à-vis probability, though, and there are deeper questions about the model we use as we seek some objective concept of probability to drive our decisions. The very notion of an objective probability is (or at least once was and probably still should be) open to doubt. Consider A Treatise on Probability, a seminal work of John Maynard Keynes--a mathematician and philosopher long before he became one of the fathers of modern macroeconomics--or Risk, Uncertainty, and Profit by Frank H. Knight, both first published in 1921. Both, in the formative days of the modern theory of probability, put forward a notion that probability is inherently subjective. Knight, for example, includes in his notion of risk (i.e., probability) the question of confidence: "The action which follows upon an opinion depends as much upon the confidence in that opinion as upon the favorableness of the opinion itself." But if subjective confidence is inherent to assessments of probability and risk, we enter into all manner of human cognitive shenanigans. Does increasing information increase the objective assessment of probability, the subjective assessment of confidence, both, or neither? There is some evidence to suggest the second and not the first, with all manner of consequences for how we conceive of risk (and for notions of information dominance and network-centric warfare). But these are central questions for models of decision making under risk.
Frank H. Knight

Further, the question of consequence is no less problematic. What do we mean by consequence and how do we quantify it (because the probability/consequence model of risk demands an ordered consequence)? And how does the need for a quantifiable expression of consequence shape the classes of events and outcomes we consider? Does it bias the questions we ask and information we collect, shifting the world subtly into a frame compatible with the probability/consequence mode of orienting to it? What are the consequences of being wrong in such a case?

Continuum of Conflict, 2015 U.S. National Military Strategy
There is an interesting corollary relationship between the numerical output of the risk model and 1-n lists, in the sense that the numerical output provides a de facto list. Et voila! The model is useful, at least in one sense.

It offers another kind of list, though, based on the Defense Planning Scenarios. Since each scenario is assigned a numerical value, and since real numbers are well ordered, we suddenly have a continuum of conflict. This model may be useful--it certainly makes the complex simple--but is it right in the ways that matter? The continuum makes each of the types of conflict shown effectively similar, differing only in degree. Even the implication of such a continuum is dangerous if it leads military planners to believe the ways and means associated with these forms of conflict are identical or that one form of conflict can be compartmentalized in our thinking. Perhaps some room should be made for the notion that more is not always simply more; sometimes more is different, but this is an idea explicitly excluded from an ordering like that presented here.

Blind Men Building Models of an Elephant
Another interesting question arises from the ways in which these conflicts are modeled as we seek to develop computations of the consequences in them or to develop recommendations for the force structures best aligned with the demands of given scenarios. How will we represent the scenarios, our forces, the forces of the adversary, and their respective strategies? Will attrition define the objectives, and, if so, what is the model for attrition we will use and how does that model for attrition apply across the continuum of conflict? Will our enemies be volitional, dynamic, and devious or static and inanimate? Will we make simplifying assumptions of linearity? That assumption sounds esoteric but matters: a nonlinear model exhibits behaviors a linear model cannot replicate, may be more difficult to develop and interpret, and is generally more reflective of reality. Stanislaw Ulam's adage--"Using a term like nonlinear science is . . . like referring to the bulk of zoology as the study of non-elephant animals"--is a trenchant reminder of this principle.
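A deliberately abstract sketch, not tied to any specific combat model, shows why linearity is a substantive assumption rather than a technicality: a linear system obeys superposition and relaxes toward a single equilibrium no matter where it starts, while even a very simple nonlinear system can have multiple attractors, so a small change in the starting point flips the long-run outcome entirely.

```python
# Toy comparison (my own illustration): linear decay vs. a nonlinear system
# with two stable equilibria. The nonlinear case exhibits a threshold
# behavior no linear model can replicate.

def simulate(f, x0, dt=0.01, steps=2000):
    """Crude Euler integration of dx/dt = f(x)."""
    x = x0
    for _ in range(steps):
        x += dt * f(x)
    return x

linear = lambda x: -x             # single equilibrium at 0, no surprises
nonlinear = lambda x: x - x**3    # stable equilibria at -1 and +1

for x0 in (0.1, -0.1, 0.2):
    print(f"x0={x0:+.1f}  linear -> {simulate(linear, x0):+.3f}   "
          f"nonlinear -> {simulate(nonlinear, x0):+.3f}")
# The linear model ends near 0 for every start; the nonlinear model ends
# near +1 or -1 depending on which side of the threshold it began.
# Sometimes more (or slightly different) is not simply more -- it is different.
```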
Modeling Counterinsurgency in Afghanistan
But this does not mean linear representations are necessarily inappropriate or without value, and precise emulation can be taken too far. Will we proceed down a path of non-linear interactions and voluminous detail, toeing Box's line of "excessive elaboration," as we often do with large-scale campaign simulations or the (perhaps unfairly) infamous effort to model the dynamics of counterinsurgency in Afghanistan? What does utility mean in each of these cases, and what does "right in the ways that matter" mean here?

Or what about our models of human nature and the international system? Are we classical realists, structural realists, institutionalists, Romantics, Marxists, or something else? The structural realism of Kenneth Waltz is famously parsimonious, abstracting a great deal into billiard balls that interact on the basis of power alone (a description that is itself more parsimonious than is fair). But this leaves us with a model that cannot explain critical phenomena and necessitates expansion and refinement--see Stephen Walt's balance of threat, for example, a socially constructed concept outside the Waltz model. In the end, we are faced with a model and not with reality, with approximations of truth and not with truth itself.

This notion is particularly important in thinking about the veracity and utility of our models. They are, in fact, models. In all cases, the intent is an "adequate representation of a single datum of experience." But in studying our models we can become detached from experience and attach ourselves to the models themselves, associate our intellectual value with their form and behavior, and make them into things worthy of study unto themselves. In short, we are wont to reify them, a process Peter Berger and Thomas Luckmann describe as
... the apprehension of the products of human activity as if they were something other than human products--such as facts of nature, results of cosmic laws, or manifestations of divine will. Reification implies that man is capable of forgetting his own authorship of the human world, and further, that the dialectic between man, the producer, and his products is lost to consciousness. The reified world is ... experienced by man as a strange facticity, an opus alienum over which he has no control rather than as the opus proprium of his own productive activity.
Auguste Rodin, The Thinker
This suggests an important remedy to the problem of models that are wrong in ways that matter. If we recognize them as the products of human activity, as opus proprium, and not as handed down from authority, then they can be superseded by new products of human ingenuity. They are not sacred, and when we say a model is wrong, our next thought should never be to apologize for the model ("but it is useful"). Rather, our thoughts should turn to whether the model is right in the ways that matter. This is the only proper way to defend our work.

Finally, if the model is wrong, we must demand a new model more closely aligned to the question of interest, a model right enough to be useful. And this is not just a task for analysts and mathematicians, though it is our duty. This is a task for planners, strategists, operators, decision makers, and everyone else. We must seek the truth, even if we may not find it.

First, however, we should probably scrub Box's exculpatory and damaging aphorism from our decision-making discourse.

Saturday, February 28, 2015

The Right Answer?

It is strange how serendipity occasionally intervenes to link multiple lines of thinking, and the conversations that go along with them, until thinking on a subject crystallizes. In June 2014, Harvard Business Review published an article titled "Why Smart People Struggle With Strategy." The piece begins thus:
Strategy is often seen as something really smart people do — those head-of-the-class folks with top-notch academic credentials. But just because these are the folks attracted to strategy doesn't mean they will naturally excel at it. The problem with smart people is that they are used to seeking and finding the right answer; unfortunately, in strategy there is no single right answer to find. Strategy requires making choices about an uncertain future. It is not possible, no matter how much of the ocean you boil, to discover the one right answer. There isn't one. In fact, even after the fact, there is no way to determine that one’s strategy choice was “right,” because there is no way to judge the relative quality of any path against all the paths not actually chosen. There are no double-blind experiments in strategy.
When this crossed my digital desk this week, I was reminded of a slew of recent articles on the Service Academies (here, here, and here) and a great discussion that ensued on The Constant Strategist over the questions raised in the first of the linked articles. Here, two questions matter: What was the crux of the original issue and where have I landed in the overarching questions?

The initial online debate in this exploration centered on the curriculum most appropriate to the education of military officers. What should be the emphasis in a liberal education intended to develop them deliberately? There are two--one might call them adversarial--camps in this debate centered on the relative importance of the sciences and the humanities. I have always found myself standing athwart the apparent chasm between these two positions. As a military analyst with too many graduate degrees in math, I have enormous sympathy for the technical side of this debate (perhaps selfishly, since another position would invalidate much of my education and professional life). But I began my life as a student of English Literature and I spent a formative interlude as a graduate student in military history and strategy, and I know the technocrat's approach to conflict and strategic planning is problematic. But since it is hard to ask that everyone know everything, what is the right answer?

In the end, I think appropriate diversity is the answer, at every unit of analysis from the individual to the population. The trick is to ensure both expertise in the population (i.e., someone somewhere has spent a life studying the topics of interest) and familiarity in individuals (i.e., we have all studied enough of the other that we can speak a common language and seek useful metaphors). That means we should encourage and incentivize both expertise and exposure in a variety of disciplines--math, statistics, physics, chemistry, engineering, history, anthropology, theology, literature, etc. But there's a catch.

In our world of military operations research analysts, so well trained in seeking optimality, the idea of external familiarity really matters, and this is why I'm writing. There is a distressing and problematic bias in the technical fields toward the existence of a "correct" answer, not unlike the assertion in the Harvard Business Review piece. It's how we're trained. In our education, there exists a provably true answer (at least within the constraints of our axiomatic systems) to most of the textbook questions we answer as we learn our trade. This is, in fact, part of the reason for my own shift once upon a time between literature and math as a chosen field of study. Certainty has a certain comfort and led to fewer arguments between student and teacher.

I do NOT want to encourage anyone to not study the sciences, operations research, or (my own love) mathematics. And I do NOT want to encourage avoidance of the less technical disciplines. Rather, what I want to encourage is an appreciation of contingency in the application of the technical disciplines and a rigor in the application of the non-technical disciplines, especially in the context of militarily relevant questions.

Why? Fundamentally because our models and computational tools are by definition rife with assumptions. What if one or more of those assumptions are wrong? What if we forget some minor idea that turns out to be critical? What if our axioms don't work? (I'm looking at you, Economics.) What if optimality itself is a chimera?

For us, reading history of various kinds and actively considering the question of how our forebears (analytic and otherwise) erred is perhaps a useful remedy. The problem is not that one is smart or not. The problem is how and what one studies and with what intent.

The proverb says, "Iron sharpeneth iron." A suggested circular addendum to this wisdom is that the humanities temper the steel of the sciences while the sciences sharpen the analytic edge of the humanities.

Saturday, February 14, 2015

Seeking Truth

The last few posts I've penned for this forum (here and here) have danced around the edges--and occasionally jumped up and down on--the notion that we humans are flawed, cognitively compromised, and subject to some intrinsic constraints on our ability to see, understand, communicate, and act on the truth. Though this is not a new soapbox, I hadn't realized that this notion had taken over my writing and become as strident as it had. Then a good friend asked a simple question, and I found myself wrestling with the consequences of the human cognitive silliness on which I've been recently focused and what it means for truth in general and, perfectly apropos of this forum, truth in our analytic profession.

So, what poser did my wise friend propose? He offered three alternative positions based on the existence of truth and our ability to know it:
  1. There is a truth and we can grow to understand it.
  2. There is a truth and we cannot understand it.
  3. There is no truth for us to understand.
(Technically, I suppose there is a fourth possibility--that there is no truth and we can grow to understand it--but this isn't a particularly useful alternative to consider. As a mathematician and pedant by training and inclination, though, it is difficult to not at least acknowledge this.) 

The question is then where I fall on this list of possibilities. It's an important question, if for no other reason than where we sit is where we stand, and it becomes difficult, if not hypocritical, to conscientiously pursue an analytic profession if we believe either two or three is the case. Strangely, though, I found this a harder question to answer than perhaps I should have, but here is where I landed:

At least with respect to the human physical and social universes with which we contend, there is an objective truth that is in some sense knowable and we, finite and flawed as we are, can discover these truths via observation, experimentation, and analysis.

In retrospect, my position on this question should have been obvious. I've been making statements that human cognition is biased and flawed, averring that this is a truth, and I believe it to be one. We can observe any number of truths in the way humans and the universe we occupy behave. I find, on reflection, though, that there is a limit to this idea. Specifically, we can probably never know with precision the underlying mechanisms that produce the truths we observe. We may know that cognitive biases exist and we may be able to describe their tendencies, but (speaking charitably) we are unlikely to ever have an incontrovertible cause-and-effect model to allow us to interact with and influence these tendencies in a push-button way.

So, the trouble I have with truth is that we apply truth value to the explanatory models we create. Since these models are artificial creations and not the systems themselves, they must, by definition, fail to represent the system perfectly. Newtonian theories of gravity based on mass give way to relativistic theories of gravity based on energy. In some ways one is better than the other, but neither is true in a deep sense. Our models are never true in the larger sense. They may constitute the best available model. They may be "true enough" or "right in all the ways that matter." But both of these conditions are mutable and context-dependent. In a sense, I find myself intellectually drawn to the notion that truth in the contexts that matter to us professionally is an inductive question and not a deductive one.

In the end, I'm actually encouraged by this reflection, though the conclusion that models are and must be inherently flawed results in some serious consternation for this mathematician (soothed only by the clarity with which mathematicians state and evaluate our axiomatic models). I understand better what I'm seeking. I understand better the limitations involved. And, at the risk of beating a dead horse, I am more convinced of the need to put our ideas out in the world. This reflection might never have taken place if not for Admiral Stavridis and his injunction to read, think, and write.

Monday, January 19, 2015

Data Worship and Duty

If you spend more than a few minutes working as an analyst--operations, program, logistics, personnel, or otherwise--it is almost inevitable that some wise military soul will offer trenchant historical lessons about undue trust in analytics for decision making derived from the performance of Robert McNamara as Secretary of Defense. Too often, these criticisms are intended to deflect and deflate criticisms and conclusions of analysis without addressing the analysis itself (an ad hominem approach without so much of the hominem). But that doesn't mean there aren't common mistakes made in the conduct of analysis and worthwhile lessons to be learned from McNamara.


This short article from the MIT Technology Review is a bit old, but it makes a number of useful points. The "body count" metric, for example, is a canonical case of making important what we can measure rather than measuring what's important (if what is important is usefully measurable at all). Is the number of enemy dead (even if we can count it accurately) an effective measure of progress in a war that is other than total? So, why collect and report it? And what second-order effects are induced by a metric like this one? What behavior do we incentivize by the metrics we choose, whether it's mendacious reporting of battlefield performance in Vietnam or the tossing of unused car parts in the river?

There's something more fundamental going on in the worship of data, though. We gather more and more detailed information on the performance of our own and our adversaries' systems and think that by adding decimals we add to our "understanding." Do we, though? In his Foundations of Science, Henri Poincaré writes:
If we could know exactly the laws of nature and the situation of the universe at the initial instant, we should be able to predict exactly the situation of this same universe at a subsequent moment. But even when the natural laws should have no further secret for us, we could know the initial situation only approximately. If that permits us to foresee the subsequent situation with the same degree of approximation, this is all we require, we say the phenomenon has been predicted, that it is ruled by laws. But this is not always the case; it may happen that slight differences in the initial conditions produce very great differences in the final phenomenon; a slight error in the former would make an enormous error in the latter. Prediction becomes impossible and we have the fortuitous phenomenon.
Poincaré is describing here what would later be dubbed the butterfly effect for nonlinear systems (with the comparison to predicting the weather made explicit in a later chapter). In systems such as these, to chase data is to chase a unicorn or the pot of gold at the end of the rainbow. Rather, it is structure we should chase. Modeling isn't about populating our tools with newer and better data (though this may be important, if secondary). Rather, modeling is about understanding the underlying relationships among the data.
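A tiny numerical sketch makes the point. The logistic map below is a standard textbook example of my own choosing (not anything from Poincaré's text): two trajectories that begin one part in a billion apart become unrecognizably different within a few dozen steps, so adding decimals to the initial data buys almost nothing.

```python
# Illustration of sensitive dependence on initial conditions: the chaotic
# logistic map x -> r*x*(1-x) with r = 4. An extra decimal of "data" in the
# starting value does not yield proportionally better prediction.
def logistic_trajectory(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.300000000)
b = logistic_trajectory(0.300000001)   # "better data": one more decimal of precision
for t in (0, 10, 20, 30, 40, 50):
    print(f"step {t:2d}: {a[t]:.6f} vs {b[t]:.6f}  |diff| = {abs(a[t] - b[t]):.6f}")
# By step 40-50 the two trajectories bear no resemblance to one another.
```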

We often hear or read that some General or other should have fought harder against the dictates of the McNamara Pentagon, but one wonders if perhaps such a fight is also the duty of a military analyst.