Essay - Fitting Facts to Equations
At H. P. Grice's seminar on metaphysics in the summer of 1975, we discussed Aristotle's categories. I argued then that the category of quantity was empty; there were no quantities in nature—no attributes with exact numerical values of which it could be said that they were either precisely equal or unequal to each other. I was thinking particularly about physics, and the idea I had was like the one I have been defending in these essays, that the real content of our theories in physics is in the detailed causal knowledge they provide of concrete processes in real materials. I thought that these causal relations would hold only between qualities and not between quantities. Nevertheless I recognized that real materials are composed of real atoms and molecules with numerically specific masses, and spins, and charges; that atoms and molecules behave in the way that they do because of their masses, spins, and charges; and that our theoretical analyses of the causal processes they are involved in yield precise numerical calculations of other quantities, such as line shapes in spectroscopy or transport coefficients in statistical mechanics.
Why then did I want to claim that these processes were essentially qualitative? It was because our knowledge about them, while detailed and precise, could not be expressed in simple quantitative equations of the kind that I studied in theoretical physics. The distinction I wanted, it turns out, was not that between the qualitative and the quantitative, but rather the distinction between the tidy and simple mathematical equations of abstract theory, and the intricate and messy descriptions, in either words or formulae, which express our knowledge of what happens in real systems made of real materials, like helium-neon lasers or turbo-jet engines. We may use the fundamental equations of physics to calculate precise quantitative facts about real situations, but as I have urged in earlier essays, abstract fundamental laws are nothing like the complicated, messy laws which describe reality. I no longer want to urge, as I did in the summer seminar, that there are no quantities in nature, but rather that nature is not governed by simple quantitative equations of the kind we write in our fundamental theories.
My basic view is that fundamental equations do not govern objects in reality; they govern only objects in models. The second half of this thesis grew out of another Grice seminar on metaphysics not long after. In the second seminar we talked about pretences, fictions, surrogates, and the like; and Grice asked about various theoretical claims in physics, where should we put the ‘as if’ operator: helium gas behaves as if it is a collection of molecules which interact only on collision? Or, helium gas is composed of molecules which behave as if they interact only on collision? Or . . . ?
Again, I wanted to make apparently conflicting claims. There are well-known cases in which the ‘as if’ operator should certainly go all the way in front: the radiating molecules in an ammonia maser behave as if they are classical electron oscillators. (We will see more of this in the last essay.) How closely spaced are the oscillators in the maser cavity? This realistic question is absurd; classical electron oscillators are themselves a mere theoretical construct. What goes on in a real quantum atom is remarkably like the theoretical prescriptions for a classical electron oscillator. The oscillators replicate the behaviour of real atoms; but still, as laser specialist Anthony Siegman remarked in his laser engineering class, ‘I wouldn't know where to get myself a bagful of them’.1
The classical electron oscillators are undoubted fictions. But, even in cases where the theoretical entities are more robust, I still wanted to put the ‘as if’ operator all the way in front. For example, a helium-neon laser behaves as if it is a collection of three-level atoms in interaction with a single damped mode of a quantized field, coupled to a pumping and damping reservoir. But in doing so, I did not want to deny that the laser cavities contain three-level atoms or that a single mode of the electromagnetic field is dominant. I wanted both to acknowledge these existential facts and yet to locate the operator at the very beginning.
It seems now that I had conflicting views about how to treat this kind of case because I was conflating two functions which the operator could serve. On the one hand, putting things to the left of the operator is a sign of our existential commitment. A helium-neon laser is a collection of three-level atoms . . . But putting things on the right serves a different function. Commonly in physics what appears on the right is just what we need to know to begin our mathematical treatment. The description on the right is the kind of description for which the theory provides an equation. We say that a ‘real quantum atom’ behaves like a classical electron oscillator; already the theory tells us what equation is obeyed by a classical electron oscillator. Similarly, the long description I gave above of a laser as a collection of three-level atoms also tells us a specific equation to write down, in this case an equation called the Fokker–Planck equation; and there are other descriptions of gas lasers which go with other equations. We frequently, for example, treat the laser as a van der Pol oscillator, and then the appropriate equation would be the one which B. van der Pol developed in 1920 for the triode oscillator.
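To make concrete how a description on the right 'tells us a specific equation', here is the van der Pol equation in its commonly quoted dimensionless form. The symbols are labels chosen for this illustration and do not appear in the essay: x stands for the oscillation amplitude and μ for the strength of the nonlinear damping. The Fokker–Planck treatment is not reproduced here.

```latex
\ddot{x} - \mu\,(1 - x^{2})\,\dot{x} + x = 0, \qquad \mu > 0
```

For small |x| the damping term is negative and the oscillation grows; for large |x| it is positive and the oscillation decays, so the amplitude settles on a stable limit cycle. Once a system is described as a van der Pol oscillator, this is the equation to write down, whatever the system is made of.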
Contrary to my initial assumption I now see that the two functions of the ‘as if’ operator are quite distinct. Giving a description to which the theory ties an equation can be relatively independent of expressing existential commitment. Both treatments of the laser which I mentioned assume that the helium-neon laser contains a large number of three-level neon atoms mixed with a much greater number of helium atoms, in interaction almost entirely with a single mode of the electromagnetic field. Similarly, when an experimentalist tells us of a single mode of a CW GaAs (gallium arsenide) laser that ‘below threshold the mode emits noise like a narrow band black body source; above threshold its noise is characteristic of a quieted amplitude stabilized oscillator’ he is telling us not that the make-up of the laser has changed but rather that its intensity fluctuations follow from different equations above and below threshold. In these cases what goes on the right of the ‘as if’ operator does not depend on what we take to be real and what we take to be fictional. Rather it depends on what description we need to know in order to write down the equation that starts our mathematical treatment.
The views I urged in the two seminars go hand-in-hand. It is because the two functions of the ‘as if’ operator are independent that the fundamental equations of our theories cannot be taken to govern objects in reality. When we use the operator to express existential commitment, we should describe on the left everything we take to be real. From a first, naïve point of view, to serve the second function we should just move everything from the left of the operator to the right. To get a description from which we can write down an equation, we should simply report what we take to be the case.
But that is not how it works. The theory has a very limited stock of principles for getting from descriptions to equations, and the principles require information of a very particular kind, structured in a very particular way. The descriptions that go on the left—the descriptions that tell what there is—are chosen for their descriptive adequacy. But the ‘descriptions’ on the right—the descriptions that give rise to equations—must be chosen in large part for their mathematical features. This is characteristic of mathematical physics. The descriptions that best describe are generally not the ones to which equations attach. This is the thesis that I will develop in the remaining sections of this paper.
1. Two Stages of Theory Entry
Let us begin by discussing bridge principles. On what Fred Suppe has dubbed the ‘conventional view of theories’,3 championed by Hempel, Grünbaum, Nagel, and others in the tradition of logical empiricism, the propositions of a theory are of two kinds: internal principles and bridge principles. The internal principles present the content of the theory, the laws that tell how the entities and processes of the theory behave. The bridge principles are supposed to tie the theory to aspects of reality more readily accessible to us. At first the bridge principles were thought to link the descriptions of the theory with some kind of observation reports. But with the breakdown of the theory–observation distinction, the bridge principles were required only to link the theory with a vocabulary that was ‘antecedently understood’.
The network of internal principles and bridge principles is supposed to secure the deductive character of scientific explanation. To explain why lasers amplify light signals, one starts with a description in the antecedent vocabulary of how a laser is constructed. A bridge principle matches this with a description couched in the language of the quantum theory. The internal principles of quantum mechanics predict what should happen in situations meeting this theoretical description, and a second bridge principle carries the results back into a proposition describing the observed amplification. The explanation is deductive because each of the steps is justified by a principle deemed necessary by the theory, either a bridge principle or an internal principle.
Recently, however, Hempel has begun to doubt that explanations of this kind are truly deductive.4 The fault lies with the bridge principles, which are in general far from exceptionless, and hence lack the requisite necessity. A heavy bar attracts iron filings. Is it thus magnetic? Not necessarily: we can never be sure that we have succeeded in ruling out all other explanations. A magnet will, with surety, attract iron filings only if all the attendant circumstances are right. Bridge principles, Hempel concludes, do not have the character of universal laws; they hold only for the most part, or when circumstances are sufficiently ideal.
I think the situation is both much better and much worse than Hempel pictures. If the right kinds of descriptions are given to the phenomena under study, the theory will tell us what mathematical description to use, and the principles that make this link are as necessary and exceptionless in the theory as the internal principles themselves. But the ‘right kind of description’ for assigning an equation is seldom, if ever, a ‘true description’ of the phenomenon studied; and there are few formal principles for getting from ‘true descriptions’ to the kind of description that entails an equation. There are just rules of thumb, good sense, and, ultimately, the requirement that the equation we end up with must do the job.
Theory entry proceeds in two stages. I imagine that we begin by writing down everything we know about the system under study, a gross exaggeration, but one which will help to make the point. This is the unprepared description: it is the description that goes to the left of the ‘as if’ operator when the operator is used to express existential commitment. The unprepared description contains any information we think relevant, in whatever form we have available. There is no theory–observation distinction here. We write down whatever information we have: we may know that the electrons in the beam are all spin up because we have been at pains to prepare them that way; or we may write down the engineering specifications for the construction of the end mirrors of a helium-neon laser; and we may also know that the cavity is filled with three-level neon atoms. The unprepared description may well use the language and the concepts of the theory, but it is not constrained by any of the mathematical needs of the theory.
At the first stage of theory entry we prepare the description: we present the phenomenon in a way that will bring it into the theory. The most apparent need is to write down a description to which the theory matches an equation. But to solve the equations we will have to know what boundary conditions can be used, what approximation procedures are valid, and the like. So the prepared descriptions must give information that specifies these as well. For example, we may describe the walls of the laser cavity and their surroundings as a reservoir (a system with a large number of resonant modes). This means that the laser has no memory. Formally, when we get to the derivation, we can make a Markov approximation. (Recall the discussion in Essay 6.)
This first stage of theory entry is informal. There may be better and worse attempts, and a good deal of practical wisdom helps, but no principles of the theory tell us how we are to prepare the description. We do not look to a bridge principle to tell us what is the right way to take the facts from our antecedent, unprepared description, and to express them in a way that will meet the mathematical needs of the theory. The check on correctness at this stage is not how well we have represented in the theory the facts we know outside the theory, but only how successful the ultimate mathematical treatment will be.
This is in sharp contrast with the second stage of theory entry, where principles of the theory look at the prepared description and dictate equations, boundary conditions, and approximations. Shall we treat a CW GaAs laser below threshold as a ‘narrow band black body source’ rather than the ‘quieted stabilized oscillator’ that models it above threshold? Quantum theory does not answer. But once we have decided to describe it as a narrow band black body source, the principles of the theory tell what equations will govern it. So we do have bridge principles, and the bridge principles are no more nor less universal than any of the other principles. But they govern only the second stage of theory entry. At the first stage there are no theoretical principles at all—only rules of thumb and the prospect of a good prediction.
This of course is a highly idealized description. Theories are always improving and expanding, and an interesting new treatment may offer a totally new bridge principle. But Hempel's original account was equally idealized; it always looked at the theory as it stood after the explanation had been adopted. I propose to think about it in the same way. In the next section I want to illustrate some bridge principles, and I shall describe the two stages of theory entry with some examples from quantum mechanics.
2. Some Model Bridge Principles
If we look at typical formalizations of quantum mechanics it seems that the fundamental principles do divide into internal principles and bridge principles, as the conventional view of theories maintains. The central internal principle is Schroedinger's equation. The Schroedinger equation tells how systems, subject to various forces, evolve in time. Actually the forces are not literally mentioned in the equation since quantum mechanics is based on William Hamilton's formulation of classical mechanics, which focuses not on forces, but on energies. In the standard presentation, the Schroedinger equation tells how a quantum system evolves in time when the Hamiltonian of the system is known, where the Hamiltonian is a mathematical representation of the kinetic and potential energies for the system. Conservation principles, like the conservation of energy, momentum, or parity, may also appear as internal principles in such a formalization. (On the other hand, they may not, despite the fact that they are of fundamental importance, because these principles can often be derived from other basic principles.)
The second class of principles provide schemata for getting into and out of the mathematical language of the theory: states are to be represented by vectors; observable quantities are represented by operators; and the average value of a given quantity in a given state is represented by a certain product involving the appropriate operator and vector. So far, all looks good for the conventional view of theories.
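The two sets of principles just described can be written compactly. The following is a sketch in standard Dirac notation, which is an assumption of this illustration rather than the essay's own notation: |ψ⟩ is the state vector, Ĥ the Hamiltonian operator, Â the operator for an observable quantity, and the state is taken to be normalized.

```latex
i\hbar\,\frac{\partial}{\partial t}\,|\psi(t)\rangle = \hat{H}\,|\psi(t)\rangle,
\qquad
\langle \hat{A} \rangle_{\psi} = \langle \psi |\, \hat{A} \,| \psi \rangle,
\qquad
\langle \psi | \psi \rangle = 1
```

The first expression is the internal principle (time evolution under a given Hamiltonian); the second is the schema that turns an operator and a vector into an average value. Neither says which Ĥ to use for a given situation.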
But notice: one may know all of this and not know any quantum mechanics. In a good undergraduate text these two sets of principles are covered in one short chapter. It is true that the Schroedinger equation tells how a quantum system evolves subject to the Hamiltonian; but to do quantum mechanics, one has to know how to pick the Hamiltonian. The principles that tell us how to do so are the real bridge principles of quantum mechanics. These give content to the theory, and these are what beginning students spend the bulk of their time learning.
If the conventional view were right, students should be at work learning bridge principles with mathematical formulae on one end and descriptions of real things on the other. Good textbooks for advanced undergraduates would be full of discussions of concrete situations and of the Hamiltonians which describe them. There might be simplifications and idealization for pedagogical purposes; nevertheless, there should be mention of concrete things made of the materials of the real world. This is strikingly absent. Generally there is no word of any material substance. Instead one learns the bridge principles of quantum mechanics by learning a sequence of model Hamiltonians. I call them ‘model Hamiltonians’ because they fit only highly fictionalized objects. Here is a list of examples. I culled it from two texts, both called Quantum Mechanics, one by Albert Messiah6 and the other by Eugen Merzbacher.7 This list covers what one would study in just about any good senior level course on quantum mechanics. We learn Hamiltonians for:
free particle motion, including
    the free particle in one dimension,
    the free particle in three dimensions,
    the particle in a box;
the linear harmonic oscillator;
piecewise constant potentials, including
    the square well,
    the potential step,
    the periodic potential,
    the Coulomb potential;
‘the hydrogen atom’;
central potential scattering;
and eventually, the foundation of all laser theory,
    the electron in interaction with the electromagnetic field.
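For orientation, three of the model Hamiltonians on this list take the following standard textbook forms. This is a sketch under stated assumptions: SI units, m for the particle mass, ω for the oscillator's angular frequency, μ for the reduced mass of the electron–proton pair, and r for their separation. None of these symbols is fixed by the essay itself.

```latex
\begin{aligned}
\text{free particle:}\quad
  & \hat{H} = \frac{\hat{p}^{2}}{2m} \\
\text{linear harmonic oscillator:}\quad
  & \hat{H} = \frac{\hat{p}^{2}}{2m} + \tfrac{1}{2}\, m\,\omega^{2}\,\hat{x}^{2} \\
\text{Coulomb potential (`the hydrogen atom'):}\quad
  & \hat{H} = \frac{\hat{p}^{2}}{2\mu} - \frac{e^{2}}{4\pi\varepsilon_{0}\, r}
\end{aligned}
```

None of these expressions mentions a material substance, an environment, spin, or the radiation field.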
There is one real material mentioned in this list—hydrogen. In fact this case provides a striking illustration of my point, and not a counterexample against it. The Hamiltonian we learn here is not that for any real hydrogen atom. Real hydrogen atoms appear in an environment, in a very cold tank for example, or on a benzene molecule; and the effects of the environment must be reflected in the Hamiltonian. What we study instead is a hypothetically isolated atom. We hope that later we will be able to piece together this Hamiltonian with others to duplicate the circumstances of an atom in its real situation.
But this is not the most striking omission. In his section titled ‘The Hydrogen Atom’, Messiah proposes a particular Hamiltonian and uses it to provide a solution for the energy spectrum of hydrogen. He says:
This spectrum is just the one predicted by the Old Quantum Theory; its excellent agreement with the experimental spectrum was already pointed out. To be more precise, the theory correctly accounts for the position of the spectral lines but not for their fine structure. Its essential shortcoming is to be a non-relativistic theory . . . [Also] the Schroedinger theory does not take the electron spin into account.8
These are critical omissions. The discovery and the account of the fine structure of hydrogen were significant events in quantum mechanics for the reasons Messiah mentions. Fine structure teaches important lessons both about relativity and about the intrinsic spin of the electron.
The passage quoted above appears about three quarters of the way through Volume I. About the same distance into Volume II, Messiah again has a section called ‘The Hydrogen Atom’. There he uses the relativistic theory of the Dirac electron. Even the second treatment is not true to the real hydrogen atom. The reasons are familiar from our discussion of the Lamb shift in Essay 6. Here is what Messiah himself says:
The experimental results on the fine structure of the hydrogen atom and hydrogen-like atoms (notably $\mathrm{He}^{+}$) are in broad agreement with these predictions.
However, the agreement is not perfect. The largest discrepancy is observed in the fine structure of the $n = 2$ levels of the hydrogen atom. In the non-relativistic approximation, the three levels $2s_{1/2}$, $2p_{1/2}$, and $2p_{3/2}$ are equal. In the Dirac theory, the levels $2s_{1/2}$ and $2p_{1/2}$ are still equal, while the $2p_{3/2}$ level is slightly lower (the separation is of the order of $10^{-4}$ eV). The level distance from $2p_{3/2}$ to $2p_{1/2}$ agrees with the theory but the level $2s_{1/2}$ is lower than the level $2p_{1/2}$ and the distance from $2s_{1/2}$ to $2p_{1/2}$ is equal to about a tenth of the distance from $2p_{3/2}$ to $2p_{1/2}$. This effect is known as the Lamb shift. To explain it, we need a rigorous treatment of the interaction between the electron, the proton and the quantized electromagnetic field; in the Dirac theory one retains only the Coulomb potential which is the main term in that interaction; the Lamb shift represents ‘radiative corrections’ to this approximation.9
We know from our earlier discussion that the treatment of these ‘radiative corrections’ for the hydrogen spectrum is no simple matter.
The last sentence of Messiah's remark is telling. The two sections are both titled ‘The Hydrogen Atom’ but in neither are we given a Hamiltonian for real hydrogen atoms, even if we abstract from the environment. Instead, we are taught how to write the Coulomb potential between an electron and a proton, in the first case non-relativistically and in the second, relativistically. Messiah says so himself: ‘The simplest system of two bodies with a Coulomb interaction is the hydrogen atom’.10 ‘The hydrogen atom’ on our list is just a name for a two-body system where only the Coulomb force is relevant. Even if the system stood alone in the universe, we could not strip away the spin from the electron. Even less could we eliminate the electromagnetic field, for it gives rise to the Lamb shift even when no photons are present. This two-body system, which we call ‘the hydrogen atom’, is a mere mental construct.
Messiah's is of course an elementary text, intended for seniors or for beginning graduate students. Perhaps we are looking at versions of the theory that are too elementary? Do not more sophisticated treatments (journal articles, research reports, and the like) provide a wealth of different, more involved bridge principles that link the theory to more realistic descriptions? I am going to argue in the next chapter that the answer to this question is no. There are some more complicated bridge principles; and of course the theory is always growing, adding both to its internal principles and its bridge principles. But at heart the theory works by piecing together in original ways a small number of familiar principles, adding corrections where necessary. This is how it should work. The aim is to cover a wide variety of different phenomena with a small number of principles, and that includes the bridge principles as well as the internal principles. It is no theory that needs a new Hamiltonian for each new physical circumstance. The explanatory power of quantum theory comes from its ability to deploy a small number of well-understood Hamiltonians to cover a wide range of cases. But this explanatory power has its price. If we limit the number of Hamiltonians, that is going to constrain our abilities to represent situations realistically. This is why our prepared descriptions lie.
I will take up these remarks about bridge principles again in the next chapter. Here I want to proceed in a different way. I claim that in general we will have to distort the true picture of what happens if we want to fit it into the highly constrained structures of our mathematical theories. I think there is a nice analogy that can help us see why this is so. That is the topic of the next section.
3. Physics as Theatre
I will present first an analogy and then an example. We begin with Thucydides' views on how to write history:
XXII. As to the speeches that were made by different men, either when they were about to begin the war or when they were already engaged therein, it has been difficult to recall with strict accuracy the words actually spoken, both for me as regards that which I myself heard, and for those who from various other sources have brought me reports. Therefore the speeches are given in the language in which, as it seemed to me, the several speakers would express, on the subjects under consideration, the sentiments most befitting the occasion, though at the same time I have adhered as closely as possible to the general sense of what was actually said.11
Imagine that we want to stage a given historical episode. We are primarily interested in teaching a moral about the motives and behaviour of the participants. But we would also like the drama to be as realistic as possible. In general we will not be able simply to ‘rerun’ the episode over again, but this time on the stage. The original episode would have to have a remarkable unity of time and space to make that possible. There are plenty of other constraints as well. These will force us to make first one distortion, then another to compensate. Here is a trivial example. Imagine that two of the participants had a secret conversation in the corner of the room. If the actors whisper together, the audience will not be able to hear them. So the other characters must be moved off the stage, and then back on again. But in reality everyone stayed in the same place throughout. In these cases we are in the position of Thucydides. We cannot replicate what the characters actually said and did. Nor is it essential that we do so. We need only adhere ‘as closely as possible to the general sense of what was actually said’.
Physics is like that. It is important that the models we construct allow us to draw the right conclusions about the behaviour of the phenomena and their causes. But it is not essential that the models accurately describe everything that actually happens; and in general it will not be possible for them to do so, and for much the same reasons. The requirements of the theory constrain what can be literally represented. This does not mean that the right lessons cannot be drawn. Adjustments are made where literal correctness does not matter very much in order to get the correct effects where we want them; and very often, as in the staging example, one distortion is put right by another. That is why it often seems misleading to say that a particular aspect of a model is false to reality: given the other constraints that is just the way to restore the representation.
Here is a very simple example of how the operation of constraints can cause us to set down a false description in physics. In quantum mechanics free particles are represented by plane waves—functions that look like sines or cosines, stretching to infinity in both directions. This is the representation that is dictated by the Schroedinger equation, given the conventional Hamiltonian for a free particle. So far there need be nothing wrong with a wave like that. But quantum mechanics has another constraint as well: the square of the wave at a point is supposed to represent the probability that the particle is located at that point. So the integral of the square over all space must equal one. But that is impossible if the wave, like a sine or cosine, goes all the way to infinity.
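The conflict between the two constraints can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes a real sine wave sin(kx) as a stand-in for the plane wave, picks an arbitrary wave number, and uses a simple midpoint rule. The function name `sine_wave_norm` and its parameters are hypothetical labels, not anything from the texts discussed.

```python
import math


def sine_wave_norm(k: float, half_width: float, steps: int = 100_000) -> float:
    """Midpoint-rule estimate of the integral of sin(k*x)**2
    over the interval [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        total += math.sin(k * x) ** 2
    return total * dx


if __name__ == "__main__":
    # The integral keeps growing as the interval widens, so no fixed
    # amplitude can make the total probability equal to one on the whole line.
    for half_width in (10.0, 100.0):
        print(half_width, sine_wave_norm(1.0, half_width))
```

The integral grows roughly in proportion to the width of the interval, so it has no finite limit as the interval is extended over all space.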
There are two common solutions to this problem. One is to use a Dirac delta function. These functions are a great help to physics, and generalized function theory now explains how they work. But they side-step rather than solve the problem. Using the delta function is really to give up the requirement that the probabilities themselves integrate to one. Merzbacher, for instance, says ‘Since normalization of ∫Ψ*Ψ to unity is out of the question for infinite plane waves, we must decide on an alternative normalization for these functions. A convenient tool in the discussion of such wave functions is the delta function’.12 I have thus always preferred the second solution.
This solution is called ‘box normalization’. In the model we assume that the particle is in a very, very large box, and that the wave disappears entirely at the edges of this box. To get the wave to go to zero we must assume that the potential there—very, very far away from anything we are interested in—is infinite. Here is what Merzbacher says in defence of this assumption:
The eigenfunctions are not quadratically integrable over all space. It is therefore impossible to speak of absolute probabilities and of expectation values for physical quantities in such a state. One way of avoiding this predicament would be to recognize the fact that physically no particle is ever absolutely free and that there is inevitably some confinement, the enclosure being for instance the wall of an accelerator tube or of the laboratory. V [the potential] rises to infinity at the boundaries of the enclosure and does then not have the same value everywhere, the eigenfunctions are no longer infinite plane waves, and the eigenvalue spectrum is discrete rather than continuous.13
Here is a clear distortion of the truth. The walls may interact with the particle and have some effect on it, but they certainly do not produce an infinite potential.
I think Merzbacher intends us to think of the situation this way. The walls and environment do contain the particle; and in fact the probability is one that the particle will be found in some finite region. The way to get this effect in the model is to set the potential at the walls to infinity. Of course this is not a true description of the potentials that are actually produced by the walls and the environment. But it is not exactly false either. It is just the way to achieve the results in the model that the walls and environment are supposed to achieve in reality. The infinite potential is a good piece of staging.
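What the staging buys can be seen in a short sketch of the textbook particle in a one-dimensional box with infinite walls. The assumptions are loud ones: one dimension, a box running from 0 to L, an electron as the particle, and hypothetical function names chosen for this illustration. The eigenfunctions vanish at the walls, integrate to one, and have a discrete set of energies.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg (illustrative choice of particle)


def box_eigenfunction(n: int, x: float, length: float) -> float:
    """psi_n(x) = sqrt(2/L) * sin(n*pi*x/L) inside [0, L], zero outside,
    which is what an infinite potential at the walls enforces."""
    if x < 0.0 or x > length:
        return 0.0
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)


def box_energy(n: int, length: float, mass: float = M_E) -> float:
    """E_n = (n*pi*hbar)**2 / (2*m*L**2): a discrete spectrum, in joules."""
    return (n * math.pi * HBAR) ** 2 / (2.0 * mass * length ** 2)


def total_probability(n: int, length: float, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of the integral of psi_n(x)**2 over the box."""
    dx = length / steps
    return sum(
        box_eigenfunction(n, (i + 0.5) * dx, length) ** 2 for i in range(steps)
    ) * dx


if __name__ == "__main__":
    for length in (1.0e-9, 1.0e-6, 1.0e-3):  # box widths in metres
        gap = box_energy(2, length) - box_energy(1, length)
        print(f"L = {length:g} m: total probability = "
              f"{total_probability(1, length):.6f}, E2 - E1 = {gap:.3e} J")
```

The total probability comes out as one for any box width, while the spacing between levels shrinks as the box grows. A very large box therefore secures normalization without disturbing anything of interest near the particle.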