21 December 2001
Design Your Experiments:
Part II, Noise
by Kevin Kilty
I'll
spend at least three parts of this series on design of experiments,
maybe more if I think it useful, discussing models and their characteristics.
Scientists can never actually
say what something is, but only how it behaves; and models form the
basis for describing how a thing behaves. The models I typically think
about are mathematical, but models don't have to be so limited. They
could be purely descriptive. What is essential about models is that
they give us a means to describe and predict behavior, and provide
a way to test theories. It is this last point, about being able to test
theories, which separates scientific theories from the kind of theories,
interpretations actually, that the literary critic speaks of.
First and most fundamental,
we have to recognize that the measurements we make are not exact, but
include the influence of error or noise. Many people refer to the inexactness
of experiments as "experimental error." This is what I mean by the word
noise, except I think that noise describes the issue better because
some of the inexactness of experiments is unavoidable, and therefore
unlike an error of some type.
Just as experimental results
contain noise, our models have to reflect this reality as well. Noise
is actually quite difficult to define--you've all no doubt heard the
adage that one person's noise is another's signal. I'll define noise
as any influence which prevents us from attaining identical measured
values, or identical results, on repeated versions of an experiment.
Noise may come from many sources. I'll give you three examples.
- First, some processes
have inherent noisiness. Thermal noise, also called Johnson noise, is a good example
of this. All noise of this type comes from the molecular and atomic
world being fundamentally statistical.
- Second, we cannot make
measurements that are absolutely identical from one try to the next
because our measuring equipment has limited resolution and precision.
This leads to slight uncertainty in measured results.
- Third, it is darned difficult
to repeat any experiment under identical conditions. There are always
unaccounted-for influences in an experiment which cause it to be unique,
and the result of these unknown influences is noise.
One thing you probably notice
from this short list is that it covers two completely different types
of noise, the sum of which becomes experimental noise. The first
type affects the phenomenon itself. To measure this type of noise we
must replicate experiments. We perform experiments on a limited
area, or a limited amount of material, or on a group of plants or animals.
These smallest units of stuff on which we perform experiments are experimental
units. We replicate an experiment on various experimental units
that we try to make as identical to one another as possible. Sometimes
we can use the same experimental unit over again; sometimes we cannot.
The second type of noise
in our short list comes from our inability to measure anything with
absolute accuracy. This is observational noise. To measure this
we can repeat our measurement. The smallest unit of outcome that we
measure in the aftermath of an experiment is the observational unit.
The observational unit doesn't have to be identical to the experimental
unit. It can be only a portion of the experimental unit. Observational
error or noise is a part of the total noise in an experiment, but on its
own it cannot tell us how large the experimental noise is. Only replications
of the experiment will fully measure experimental noise.
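To make the distinction concrete, here is a minimal Python sketch of the two kinds of measurement. It is only an illustration under assumed numbers: the true value of 50, the unit-to-unit spread of 3, the measurement spread of 1, the Gaussian noise model, and the function name compare_noise are all my own choices, not properties of any real experiment.

```python
import random
import statistics


def compare_noise(n_units=2000, n_repeats=200, unit_sd=3.0, obs_sd=1.0, seed=1):
    """Return (spread of repeated measurements, spread across replicated units)."""
    rng = random.Random(seed)
    true_value = 50.0  # hypothetical quantity being measured

    # Each experimental unit differs slightly because of unaccounted-for influences.
    unit_values = [true_value + rng.gauss(0.0, unit_sd) for _ in range(n_units)]

    # Repeating the measurement on ONE unit sees only observational noise.
    repeats = [unit_values[0] + rng.gauss(0.0, obs_sd) for _ in range(n_repeats)]

    # Measuring each replicated unit once sees unit-to-unit variation
    # plus observational noise, i.e. the full experimental noise.
    replicates = [value + rng.gauss(0.0, obs_sd) for value in unit_values]

    return statistics.stdev(repeats), statistics.stdev(replicates)


repeat_sd, replicate_sd = compare_noise()
print(f"repeated measurements of one unit: sd = {repeat_sd:.2f}")
print(f"one measurement on each replicate: sd = {replicate_sd:.2f}")
```

Under these assumed settings the first spread should land near 1 and the second near 3.2, the square root of 3 squared plus 1 squared. Repeating a measurement, however many times, never reveals the larger figure.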
To avoid being too abstract,
let me provide an example. Suppose we wish to measure how effective
an antiseptic is for sterilizing a surface. We take a single surface
that we believe is uniform--for instance, a single surface used to prepare
food. We subdivide this uniform surface into several smaller areas.
Each of these is a possible experimental unit to which we can apply
the antiseptic (test areas) or not (control areas). Since we intend
each subdivision to be identical, we are eliminating possible sources
of experimental noise.
To further reduce sources
of noise, we need to assign each subdivision to the control group
or the test group randomly. This helps to prevent either pure bad luck
or subconscious decision from adding noise or error to the experiment.
There are many ways to do this. For instance, we could subdivide the
surface into a Latin Square, something I'll discuss in a later installment.
Or, we could organize the surface into pairs of subdivided areas and
use a coin flip to assign one half of each pair to the test and the other
half to the control.
Making each experimental unit a pair is a well-established technique
for making experimental units that are nearly identical and for reducing
experimental noise.
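The coin-flip assignment is easy to sketch in Python. This is just one possible way to do it; the function name assign_pairs, the labels "A" and "B" for the two halves, and the seed value are hypothetical, introduced only for illustration.

```python
import random


def assign_pairs(n_pairs, seed=None):
    """Flip a coin for each pair to decide which half gets the treatment."""
    rng = random.Random(seed)
    layout = []
    for pair in range(n_pairs):
        # Heads: half A is the test half. Tails: half B is.
        heads = rng.random() < 0.5
        layout.append({
            "pair": pair,
            "A": "test" if heads else "control",
            "B": "control" if heads else "test",
        })
    return layout


for row in assign_pairs(6, seed=2001):
    print(row)
```

Fixing the seed makes the layout reproducible, which is handy for record keeping. The point of the coin is that the experimenter never chooses which half is treated.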
The experiment consists of
applying antiseptic to the test half of each paired area, and also applying
some type of placebo to the control half. The placebo is needed to keep
the two halves of each experimental unit as identical as possible. Now
it is time to measure the outcome of the experiment. The most reasonable
way to do this is to count bacteria in each region. But it seems likely
that our experimental units are too large to make a full census of bacteria.
Instead, we will sample each area. These sampled areas are the observational
units of this experiment, and it should be obvious that they are smaller
than the experimental units. Even so, they are probably still large
enough that we cannot count bacteria on them with complete accuracy.
Nothing else about the experiment at this point affects its design.
Let me summarize this simple
experimental design...
- Our experimental units
consist of small areas of a surface that are as uniform as possible,
including being prepared in a uniform manner, and which are divided
into a test half and a control half. These are paired experimental
units.
- The experiment consists
of applying a treatment (antiseptic in this case) to the test half
and applying some sort of placebo treatment to the control half of
each unit.
- An experimental result
is the difference in bacteria counts between the control and test
halves of each experimental unit. Each such result is a replication
of the basic experiment.
- Experimental noise consists
of variation in the bacterial counts among the experimental units.
Some of this variation comes from unaccounted-for variation in the
surface and other factors. Some comes from measurement.
- Measurement of the experiment
consists of counting bacteria in small sampled areas of each experimental
unit. These samples are the observational units.
- The observational units
are small, but still large enough to make a perfectly accurate count
of bacteria impossible. We can repeat our count on each observational
unit to estimate observational noise.
- A portion of experimental
noise is observational noise; the rest comes from unaccounted-for
influences.
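The design summarized above can be sketched as a small simulation. Everything numeric here is an assumption made for illustration: the baseline count of about 200, the antiseptic leaving 40 percent of the bacteria, the Gaussian noise terms, and the function name simulate_paired_experiment. Real counts would be integers and would come from the laboratory, not from a random number generator.

```python
import math
import random
import statistics


def simulate_paired_experiment(n_pairs=12, n_counts=3, seed=7):
    rng = random.Random(seed)
    results = []          # one control-minus-test difference per experimental unit
    count_variances = []  # variance of repeated counts, one per sampled half

    for _ in range(n_pairs):
        # Unaccounted-for variation in the surface, shared by both halves of a pair.
        baseline = rng.gauss(200.0, 30.0)
        control_true = baseline + rng.gauss(0.0, 5.0)
        test_true = 0.4 * baseline + rng.gauss(0.0, 5.0)  # assumed antiseptic effect

        # Repeated counts on the sample taken from each half (observational noise).
        control_counts = [control_true + rng.gauss(0.0, 4.0) for _ in range(n_counts)]
        test_counts = [test_true + rng.gauss(0.0, 4.0) for _ in range(n_counts)]

        results.append(statistics.mean(control_counts) - statistics.mean(test_counts))
        count_variances.append(statistics.variance(control_counts))
        count_variances.append(statistics.variance(test_counts))

    # Spread of the replicated results estimates experimental noise.
    experimental_noise = statistics.stdev(results)
    # Pooled spread of the repeated counts estimates observational noise.
    observational_noise = math.sqrt(statistics.mean(count_variances))
    return results, experimental_noise, observational_noise


results, exp_noise, obs_noise = simulate_paired_experiment()
print(f"mean difference (control - test): {statistics.mean(results):.1f}")
print(f"experimental noise (sd of differences): {exp_noise:.1f}")
print(f"observational noise (pooled sd of repeated counts): {obs_noise:.1f}")
```

Each entry in results is one replication of the basic experiment. The two noise figures are not on quite the same footing, since one describes differences of averaged counts and the other single counts, but the sketch shows where each estimate comes from.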
Even though noise is a constant
impediment to our search for truth, there is no reason to despair. We
can handle noisy experiments if we learn to think statistically. Toward
this goal, in the next installment I'll present models of noise, and
make recommendations on how to report noise, or uncertainty, in experimental
results. Then we are ready to begin looking at inference in the presence
of noise and how this affects experimental design.