28 January 2005

Secrets of a Theorist 3: Computer Models

by George E. Hrabovsky, President, MAST

Introduction

In the last installment I described the fundamental units of measurement and how these describe standard measuring requirements. I then showed how these ideas can be used to check the validity of an equation. This time around I will discuss a fad that has the potential to produce dangerously unpredictable results.

Models in General

Modeling is the purview of theoretical science. That is what we do. The purpose of a model is to make predictions about the behavior of physical quantities in particular situations. Once a prediction is made, we can then test it in a lab or attempt to observe it in nature. We use mathematics to build models because it allows us to work with physical quantities in a very natural way and because we can assign relationships to quantities (which become constants or variables).

As our models become more sophisticated and complicated, the equations we use to make our predictions become harder to solve. Often it is mathematically impossible to arrive at a purely symbolic solution to an equation. This was a major limitation for centuries until the development of the computer and the advent of the computer model.

Computer Models

The computer allows us to approximate a solution to an equation. Depending on the method we use, our model can be very accurate. Here we understand the word accurate to mean close to the actual solution. We then understand that there is some error involved in our model. We can define this error by a simple formula,

error = | approximation - solution | .

Using symbols, we will write Δ x for the error, m_x for the approximation, and x for the solution:

Δ x = | m_x - x | .

This implies that

-Δ x ≤ m_x - x ≤ Δ x .

Adding x to each part gives

x - Δ x ≤ m_x ≤ x + Δ x .

So the model result lies within Δ x of the actual solution, and in the worst case

m_x = x ± Δ x .

In other words, the model result is the actual solution offset by the error. Each time we solve an equation approximately, we introduce error into our model. If we have a model that makes predictions over time, where each solution represents another time step, then each time step will introduce additional error and eventually our model will have more error than solution.
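To make the accumulation of error concrete, here is a minimal sketch in Python. The test equation dx/dt = x with x(0) = 1, the forward Euler method, and the step size of 0.1 are all assumptions chosen for illustration; they do not come from any particular model. The exact solution is e^t, so the error |m_x - x| can be computed directly at every step.

```python
import math


def euler_errors(step: float, n_steps: int) -> list[float]:
    """Integrate dx/dt = x from x(0) = 1 with forward Euler and
    return |approximation - solution| after each time step."""
    errors = []
    m_x = 1.0  # model value; starts at the exact initial condition
    for k in range(1, n_steps + 1):
        m_x += step * m_x        # one Euler step: m_x <- m_x + h * m_x
        x = math.exp(k * step)   # exact solution at time k * step
        errors.append(abs(m_x - x))
    return errors


# Illustrative run: 50 steps of size 0.1 (both values are assumptions).
errors = euler_errors(step=0.1, n_steps=50)
for k in (1, 10, 50):
    print(f"step {k:3d}: error = {errors[k - 1]:.4f}")
```

For this particular equation the error grows at every step, because each new step starts from a value that is already slightly wrong.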

The Danger of Computer Models

The propagation of error through a model is of deep concern. The predictive power of a model is only good so long as the error is below a certain level. Here is a quick probabilistic way of looking at it. Let p be the chance of a failure in any single time step of the model, n the number of time steps, and p_h the chance of at least one failure somewhere in the model's history. If the steps fail independently of one another, the chance that all n steps succeed is (1 - p)^n, so

p_h = 1 - (1 - p)^n .

Let's say that we have a really good model, with a 0.1% error in each time step. For a single time step we have,

p_h = 1 - (1 - 0.001)^1 = 1 - (0.999)^1 = 0.001 .

OK, let's see what this is for a hundred time steps:

p_h = 1 - (1 - 0.001)^100 = 0.095 .

This is almost a 10% chance that an error will occur. Another way of looking at it is that nearly 10% of the model is filled with error. After a thousand time steps we will have slightly more than 63% of the model filled with error.
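The figures for one, a hundred, and a thousand time steps can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and it assumes, as the formula itself does, that each time step fails independently with the same chance p.

```python
def history_failure_probability(p: float, n: int) -> float:
    """Chance of at least one failure in n time steps, assuming each
    step fails independently with probability p: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p) ** n


# Per-step failure chance of 0.1%, as in the example in the text.
for n in (1, 100, 1000):
    print(n, round(history_failure_probability(0.001, n), 3))
```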

This points out the danger of running a model without performing an error analysis on it. Without such an analysis there is no way to know at what point your model ceases to give meaningful results. Without knowing this, there is no reason for anyone to believe your model.
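One simple piece of such an analysis is to ask how many time steps a model can run before the chance of failure passes a level you are willing to accept. Solving 1 - (1 - p)^n ≤ tolerance for n gives n ≤ ln(1 - tolerance) / ln(1 - p). The sketch below does that; the function names and the tolerance parameter are hypothetical, and the independence of the time steps is again an assumption.

```python
import math


def failure_probability(p: float, n: int) -> float:
    """Chance of at least one failure in n independent steps."""
    return 1.0 - (1.0 - p) ** n


def max_trusted_steps(p: float, tolerance: float) -> int:
    """Largest n for which 1 - (1 - p)^n stays at or below tolerance.

    Both p and tolerance must lie strictly between 0 and 1.
    """
    if not (0.0 < p < 1.0 and 0.0 < tolerance < 1.0):
        raise ValueError("p and tolerance must be strictly between 0 and 1")
    # Dividing by ln(1 - p), which is negative, flips the inequality.
    return math.floor(math.log(1.0 - tolerance) / math.log(1.0 - p))


# Illustrative values: 0.1% failure chance per step, 50% tolerance.
n_max = max_trusted_steps(p=0.001, tolerance=0.5)
print(n_max, failure_probability(0.001, n_max))
```

Past that number of steps, the chance that something has gone wrong somewhere in the run is higher than the tolerance you chose.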

This has come to the fore recently, with predictive atmospheric models showing catastrophic climate changes over the next hundred years. But let's do a little error analysis. The absolute best climate models have less than a 50% accuracy rate. Let's give the model that 50%, even though it isn't that good. And then let's run 200 time steps (each step is 6 months in the model).

p_h = 1 - (1 - 0.5)^200 ≈ 1 .

This is a disastrous result for the model. It indicates that essentially 100% of the model is filled with error at this point. The model is filled with 50% error right from the start. This means that there is no reason to believe this model is actually capable of solving the equations. Despite this, there are no error analyses of these models, and policy makers are using their results to shape policy and public opinion.

So, the next time someone tells you that a computer model indicates something, ask them, "Hey, what is the error?"

Created by Mathematica  (January 20, 2005)


   
Copyright 2005 by Society for Amateur Scientists