2 Classical Gases

Our goal in this section is to use the techniques of statistical mechanics to describe the dynamics of the simplest system: a gas. This means a bunch of particles, flying around in a box. Although much of the last section was formulated in the language of quantum mechanics, here we will revert to classical mechanics. Nonetheless, a recurrent theme will be that the quantum world is never far behind: we’ll see several puzzles, both theoretical and experimental, which can only truly be resolved by turning on $\hbar$.

2.1 The Classical Partition Function

For most of this section we will work in the canonical ensemble. We start by reformulating the idea of a partition function in classical mechanics. We’ll consider a simple system – a single particle of mass $m$ moving in three dimensions in a potential $V(q)$. The classical Hamiltonian of the system [3] is the sum of kinetic and potential energy,

$$H = \frac{p^2}{2m} + V(q)$$

[3] If you haven’t taken the Classical Dynamics course, you should think of the Hamiltonian as the energy of the system expressed in terms of the position and momentum of the particle.

We earlier defined the partition function (1.21) to be the sum over all quantum states of the system. Here we want to do something similar. In classical mechanics, the state of a system is determined by a point in phase space. We must specify both the position and momentum of each of the particles — only then do we have enough information to figure out what the system will do for all times in the future. This motivates the definition of the partition function for a single classical particle as the integration over phase space,

$$Z_1 = \frac{1}{h^3}\int d^3q\, d^3p\; e^{-\beta H(p,q)} \tag{2.49}$$

The only slightly odd thing is the factor of $1/h^3$ that sits out front. It is a quantity that needs to be there simply on dimensional grounds: $Z$ should be dimensionless so $h$ must have dimension (length × momentum) or, equivalently, Joule-seconds (Js). The actual value of $h$ won’t matter for any physical observable, like heat capacity, because we always take $\log Z$ and then differentiate. Despite this, there is actually a correct value for $h$: it is Planck’s constant, $h = 2\pi\hbar \approx 6.6\times 10^{-34}\,\mathrm{Js}$.

It is very strange to see Planck’s constant in a formula that is supposed to be classical. What’s it doing there? In fact, it is a vestigial object, like the male nipple. It is redundant, serving only as a reminder of where we came from. And the classical world came from the quantum.
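The claim that the value of h drops out of observables can be checked numerically. The sketch below is only an illustration: it assumes a free particle in a one-dimensional box of length L (a simplification not used in the text), for which the phase-space integral gives Z1 = (L/h) sqrt(2πm/β). The function names and the unit choice m = L = 1 are hypothetical.

```python
import math

def log_Z1(beta, h, m=1.0, L=1.0):
    """log Z1 for a free particle in a 1D box: Z1 = (L / h) * sqrt(2*pi*m / beta)."""
    return math.log(L / h) + 0.5 * math.log(2.0 * math.pi * m / beta)

def mean_energy(beta, h, eps=1e-6):
    """E = -d(log Z1)/d(beta), estimated with a central finite difference."""
    return -(log_Z1(beta + eps, h) - log_Z1(beta - eps, h)) / (2.0 * eps)

if __name__ == "__main__":
    beta = 2.0
    for h in (1.0, 1e-3, 6.6e-34):
        # h only shifts log Z1 by a constant, so the derivative is unchanged
        print(h, mean_energy(beta, h))
```

Changing h shifts log Z1 by a constant, so the finite-difference estimate of the energy should stay at 1/(2β) up to rounding, whatever value is passed in.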

2.1.1 From Quantum to Classical

It is possible to derive the classical partition function (2.49) directly from the quantum partition function (1.21) without resorting to hand-waving. It will also show us why the factor of $1/h$ sits outside the partition function. The derivation is a little tedious, but worth seeing. (Similar techniques are useful in later courses when you first meet the path integral.) To make life easier, let’s consider a single particle moving in one spatial dimension. It has position operator $\hat{q}$, momentum operator $\hat{p}$ and Hamiltonian,

$$\hat{H} = \frac{\hat{p}^2}{2m} + V(\hat{q})$$

If $|n\rangle$ is the energy eigenstate with energy $E_n$, the quantum partition function is

$$Z_1 = \sum_n e^{-\beta E_n} = \sum_n \langle n|e^{-\beta\hat{H}}|n\rangle \tag{2.50}$$

In what follows, we’ll make liberal use of the fact that we can insert the identity operator anywhere in this expression. Identity operators can be constructed by summing over any complete basis of states. We’ll need two such constructions, using the position eigenvectors $|q\rangle$ and the momentum eigenvectors $|p\rangle$,

$$\mathbf{1} = \int dq\, |q\rangle\langle q| \ ,\qquad \mathbf{1} = \int dp\, |p\rangle\langle p|$$

We start by inserting two copies of the identity built from position eigenstates,

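$$\begin{aligned} Z_1 &= \sum_n \langle n| \int dq\, |q\rangle\langle q|\, e^{-\beta\hat{H}} \int dq'\, |q'\rangle\langle q'|n\rangle \\ &= \int dq\, dq'\, \langle q|e^{-\beta\hat{H}}|q'\rangle \sum_n \langle q'|n\rangle\langle n|q\rangle \end{aligned}$$

But now we can replace $\sum_n |n\rangle\langle n|$ with the identity matrix and use the fact that $\langle q'|q\rangle = \delta(q'-q)$, to get

$$Z_1 = \int dq\, \langle q|e^{-\beta\hat{H}}|q\rangle \tag{2.51}$$

We see that the result is to replace the sum over energy eigenstates in (2.50) with a sum (or integral) over position eigenstates in (2.51). If you wanted, you could play the same game and get the sum over any complete basis of eigenstates of your choosing. As an aside, this means that we can write the partition function in a basis independent fashion as

$$Z_1 = \mathrm{Tr}\, e^{-\beta\hat{H}}$$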

So far, our manipulations could have been done for any quantum system. Now we want to use the fact that we are taking the classical limit. This comes about when we try to factorize $e^{-\beta\hat{H}}$ into a momentum term and a position term. The trouble is that this isn’t always possible when there are matrices (or operators) in the exponent. Recall that,

$$e^{\hat{A}}e^{\hat{B}} = e^{\hat{A}+\hat{B}+\frac{1}{2}[\hat{A},\hat{B}]+\ldots}$$

For us $[\hat{q},\hat{p}] = i\hbar$. This means that if we’re willing to neglect terms of order $\hbar$, which is the meaning of taking the classical limit, then we can write

$$e^{-\beta\hat{H}} = e^{-\beta\hat{p}^2/2m}\,e^{-\beta V(\hat{q})} + \mathcal{O}(\hbar)$$

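We can now start to replace some of the operators in the exponent, like $V(\hat{q})$, with functions $V(q)$. (The notational difference is subtle, but important, in the expressions below!)

$$\begin{aligned} Z_1 &= \int dq\,\langle q|e^{-\beta\hat{p}^2/2m}\,e^{-\beta V(\hat{q})}|q\rangle \\ &= \int dq\, e^{-\beta V(q)}\,\langle q|e^{-\beta\hat{p}^2/2m}|q\rangle \\ &= \int dq\,dp\,dp'\, e^{-\beta V(q)}\,\langle q|p\rangle\langle p|e^{-\beta\hat{p}^2/2m}|p'\rangle\langle p'|q\rangle \\ &= \frac{1}{2\pi\hbar}\int dq\,dp\; e^{-\beta H(p,q)} \end{aligned}$$

where, in the final line, we’ve used the identity

$$\langle q|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\,e^{ipq/\hbar}$$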

This completes the derivation.
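For readers who want the last step spelled out, here is a short sketch. It uses only the plane-wave identity for ⟨q|p⟩ quoted above and the fact that momentum eigenstates diagonalise the kinetic term; the delta-function normalisation of |p⟩ is assumed.

```latex
\begin{align*}
\langle p|e^{-\beta\hat{p}^2/2m}|p'\rangle &= e^{-\beta p^2/2m}\,\delta(p-p') \\
\langle q|p\rangle\langle p|q\rangle &= \left|\frac{1}{\sqrt{2\pi\hbar}}\,e^{ipq/\hbar}\right|^2 = \frac{1}{2\pi\hbar} \\
\int dq\,dp\,dp'\; e^{-\beta V(q)}\,\langle q|p\rangle\, e^{-\beta p^2/2m}\,\delta(p-p')\,\langle p'|q\rangle
  &= \frac{1}{2\pi\hbar}\int dq\,dp\; e^{-\beta\left(p^2/2m + V(q)\right)}
\end{align*}
```

The delta function removes the integral over p', and the two plane-wave factors multiply to the constant 1/(2πħ), which is where the prefactor 1/h comes from.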

2.2 Ideal Gas

The first classical gas that we’ll consider consists of N particles trapped inside a box of volume V. The gas is “ideal”. This simply means that the particles do not interact with each other. For now, we’ll also assume that the particles have no internal structure, so no rotational or vibrational degrees of freedom. This situation is usually referred to as the monatomic ideal gas. The Hamiltonian for each particle is simply the kinetic energy,

$$H = \frac{\vec{p}^{\,2}}{2m}$$

And the partition function for a single particle is

$$Z_1(V,T) = \frac{1}{(2\pi\hbar)^3}\int d^3q\, d^3p\; e^{-\beta\vec{p}^{\,2}/2m} \tag{2.52}$$

The integral over position is now trivial and gives $\int d^3q = V$, the volume of the box. The integral over momentum is also straightforward since it factorizes into separate integrals over $p_x$, $p_y$ and $p_z$, each of which is a Gaussian of the form,

$$\int dx\; e^{-ax^2} = \sqrt{\frac{\pi}{a}}$$

So we have

$$Z_1 = V\left(\frac{m k_B T}{2\pi\hbar^2}\right)^{3/2}$$

We’ll meet the combination of factors in the brackets a lot in what follows, so it is useful to give it a name. We’ll write

$$Z_1 = \frac{V}{\lambda^3} \tag{2.53}$$

The quantity $\lambda$ goes by the name of the thermal de Broglie wavelength,

$$\lambda = \sqrt{\frac{2\pi\hbar^2}{m k_B T}} \tag{2.54}$$

$\lambda$ has the dimensions of length. We will see later that you can think of $\lambda$ as something like the average de Broglie wavelength of a particle at temperature $T$. Notice that it is a quantum object – it has an $\hbar$ sitting in it – so we expect that it will drop out of any genuinely classical quantity that we compute. The partition function itself (2.53) is counting the number of cubes of side $\lambda$ that we can fit into the volume $V$.
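To get a feel for the size of λ, the following sketch evaluates (2.54) and (2.53) in SI units. The choice of helium at 300 K in a one-litre box is an assumption made purely for illustration, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34            # J s
K_B = 1.380649e-23                # J / K
ATOMIC_MASS_UNIT = 1.66053907e-27 # kg

def thermal_wavelength(mass, temperature):
    """Thermal de Broglie wavelength, eq. (2.54): sqrt(2*pi*hbar^2 / (m*k_B*T))."""
    return math.sqrt(2.0 * math.pi * HBAR**2 / (mass * K_B * temperature))

def single_particle_Z(volume, mass, temperature):
    """Z1 = V / lambda^3, eq. (2.53)."""
    return volume / thermal_wavelength(mass, temperature) ** 3

if __name__ == "__main__":
    m_helium = 4.0026 * ATOMIC_MASS_UNIT  # assumed example atom
    lam = thermal_wavelength(m_helium, 300.0)
    print(f"lambda = {lam:.3e} m")
    print(f"Z1 for one litre = {single_particle_Z(1e-3, m_helium, 300.0):.3e}")
```

For these inputs λ comes out at a fraction of an ångström, far smaller than the box, so Z1 is an enormous number.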

Figure 8: Deviations from ideal gas law at sensible densities
Figure 9: Deviations from ideal gas law at extreme densities

Z1 is the partition function for a single particle. We have N, non-interacting, particles in the box so the partition function of the whole system is

$$Z(N,V,T) = Z_1^N = \frac{V^N}{\lambda^{3N}} \tag{2.55}$$

(Full disclosure: there’s a slightly subtle point that we’re brushing under the carpet here and this equation isn’t quite right. This won’t affect our immediate discussion and we’ll explain the issue in more detail in Section 2.2.3.)

Armed with the partition function Z, we can happily calculate anything that we like. Let’s start with the pressure, which can be extracted from the partition function by first computing the free energy (1.36) and then using (1.35). We have

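$$\begin{aligned} p &= -\frac{\partial F}{\partial V} \\ &= \frac{\partial}{\partial V}\left(k_B T\log Z\right) \\ &= \frac{N k_B T}{V} \end{aligned} \tag{2.56}$$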

This equation is an old friend – it is the ideal gas law, pV=NkBT, that we all met in kindergarten. Notice that the thermal wavelength λ has indeed disappeared from the discussion as expected. Equations of this form, which link pressure, volume and temperature, are called equations of state. We will meet many throughout this course.
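The disappearance of λ can also be seen numerically. The sketch below differentiates k_B T log Z with respect to V by finite differences, using the partition function (2.55). The particle number, volume and masses in the usage lines are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # J s
K_B = 1.380649e-23      # J / K

def log_Z(N, V, T, m):
    """log of Z = V^N / lambda^(3N), eq. (2.55)."""
    lam = math.sqrt(2.0 * math.pi * HBAR**2 / (m * K_B * T))
    return N * (math.log(V) - 3.0 * math.log(lam))

def pressure(N, V, T, m, rel_step=1e-6):
    """p = d(k_B T log Z)/dV at fixed T, via a central finite difference."""
    dV = V * rel_step
    dlogZ = log_Z(N, V + dV, T, m) - log_Z(N, V - dV, T, m)
    return K_B * T * dlogZ / (2.0 * dV)

if __name__ == "__main__":
    N, V, T = 1e22, 1e-3, 300.0
    for m in (6.6e-27, 6.6e-26):  # two different masses, same pressure
        print(m, pressure(N, V, T, m), N * K_B * T / V)
```

Because λ does not depend on V, the mass only enters through a V-independent constant in log Z and the two printed pressures should agree with N k_B T / V.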

As the plots above show [4], the ideal gas law is an extremely good description of gases at low densities. Gases deviate from this ideal behaviour as the densities increase and the interactions between atoms become important. We will see how this comes about from the viewpoint of microscopic forces in Section 2.5.

[4] Both figures are taken from the web textbook “General Chemistry” and credited to John Hutchinson.

It is worth pointing out that this derivation should calm any lingering fears that you had about the definition of temperature given in (1.7). The object that we call T really does coincide with the familiar notion of temperature applied to gases. But the key property of the temperature is that if two systems are in equilibrium then they have the same T. That’s enough to ensure that equation (1.7) is the right definition of temperature for all systems because we can always put any system in equilibrium with an ideal gas.

2.2.1 Equipartition of Energy

The partition function (2.55) has more in store for us. We can compute the average energy of the ideal gas,

$$E = -\frac{\partial}{\partial\beta}\log Z = \frac{3}{2}N k_B T \tag{2.57}$$

There’s an important, general lesson lurking in this formula. To highlight this, it is worth repeating our analysis for an ideal gas in arbitrary number of spatial dimensions, D. A simple generalization of the calculations above shows that

$$Z = \frac{V^N}{\lambda^{DN}} \qquad\Rightarrow\qquad E = \frac{D}{2}N k_B T$$

Each particle has $D$ degrees of freedom (because it can move in one of $D$ spatial directions). And each particle contributes $\frac{1}{2}D k_B T$ towards the average energy. This is a general rule of thumb, which holds for all classical systems: the average energy of each free degree of freedom in a system at temperature $T$ is $\frac{1}{2}k_B T$. This is called the equipartition of energy. As stated, it holds only for degrees of freedom in the absence of a potential. (There is a modified version if you include a potential.) Moreover, it holds only for classical systems or quantum systems at suitably high temperatures.
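Equipartition for a single free degree of freedom can be checked by direct numerical integration of the Boltzmann weight. This is a minimal sketch in units where the mass is 1; the midpoint rule, the cutoff and the function names are illustrative choices rather than anything taken from the text.

```python
import math

def mean_kinetic_energy(kT, m=1.0, n_steps=20000, cutoff=12.0):
    """<p^2/2m> under the Boltzmann weight exp(-p^2 / (2 m kT)), by the midpoint rule.

    The integration range is +-cutoff standard deviations of the momentum.
    """
    p_max = cutoff * math.sqrt(m * kT)
    dp = 2.0 * p_max / n_steps
    numerator = 0.0
    denominator = 0.0
    for i in range(n_steps):
        p = -p_max + (i + 0.5) * dp
        weight = math.exp(-p * p / (2.0 * m * kT))
        numerator += (p * p / (2.0 * m)) * weight
        denominator += weight
    return numerator / denominator

def ideal_gas_energy(N, D, kT):
    """Total energy of N free particles in D dimensions: N * D independent quadratic terms."""
    return N * D * mean_kinetic_energy(kT)

if __name__ == "__main__":
    print(mean_kinetic_energy(2.0))       # close to kT / 2
    print(ideal_gas_energy(10, 3, 1.0))   # close to (3/2) * N * kT
```

The result should not depend on the mass: a heavier particle carries more momentum but the same average kinetic energy.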

We can use the result above to see why the thermal de Broglie wavelength (2.54) can be thought of as roughly equal to the average de Broglie wavelength of a particle. Equating the average energy per particle from (2.57) to the kinetic energy $E = p^2/2m$ tells us that the average (root mean square) momentum carried by each particle is $p \sim \sqrt{m k_B T}$. In quantum mechanics, the de Broglie wavelength of a particle is $\lambda_{dB} = h/p$, which (up to numerical factors of 2 and $\pi$) agrees with our formula (2.54).

Finally, returning to the reality of $D=3$ dimensions, we can compute the heat capacity for a monatomic ideal gas. It is

$$C_V = \left.\frac{\partial E}{\partial T}\right|_V = \frac{3}{2}N k_B \tag{2.58}$$

2.2.2 The Sociological Meaning of Boltzmann’s Constant

We introduced Boltzmann’s constant $k_B$ in our original definition of entropy (1.2). It has the value,

$$k_B \approx 1.381\times 10^{-23}\,\mathrm{J\,K^{-1}}$$

In some sense, there is no deep physical meaning to Boltzmann’s constant. It is merely a conversion factor that allows us to go between temperature and energy, as reflected in (1.7). It is necessary to include it in the equations only for historical reasons: our ancestors didn’t realise that temperature and energy were closely related and measured them in different units.

Nonetheless, we could ask why does kB have the value above? It doesn’t seem a particularly natural number. The reason is that both the units of temperature (Kelvin) and energy (Joule) are picked to reflect the conditions of human life. In the everyday world around us, measurements of temperature and energy involve fairly ordinary numbers: room temperature is roughly 300K; the energy required to lift an apple back up to the top of the tree is a few Joules. Similarly, in an everyday setting, all the measurable quantities — p, V and T — in the ideal gas equation are fairly normal numbers when measured in SI units. The only way this can be true is if the combination NkB is a fairly ordinary number, of order one. In other words the number of atoms must be huge,

$$N \sim 10^{23} \tag{2.59}$$

This then is the real meaning of the value of Boltzmann’s constant: atoms are small.

It’s worth stressing this point. Atoms aren’t just small: they’re really really small. $10^{23}$ is an astonishingly large number. The number of grains of sand in all the beaches in the world is around $10^{18}$. The number of stars in our galaxy is about $10^{11}$. The number of stars in the entire visible Universe is probably around $10^{22}$. And yet the number of water molecules in a cup of tea is more than $10^{23}$.

Chemist Notation

While we’re talking about the size of atoms, it is probably worth reminding you of the notation used by chemists. They too want to work with numbers of order one. For this reason, they define a mole to be the number of atoms in one gram of Hydrogen. (Actually, it is the number of atoms in 12 grams of Carbon-12, but this is roughly the same thing.) The mass of a Hydrogen atom is roughly $1.6\times 10^{-27}$ kg, so the number of atoms in a mole is Avogadro’s number,

$$N_A \approx 6\times 10^{23}$$

The number of moles in our gas is then n=N/NA and the ideal gas law can be written as

pV=nRT

where $R = N_A k_B$ is called the Universal gas constant. Its value is a nice sensible number with no silly power in the exponent: $R \approx 8\,\mathrm{J\,K^{-1}\,mol^{-1}}$.
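The orders of magnitude quoted in this section are easy to reproduce. The sketch below uses the SI values of k_B and N_A; the example of one litre of gas at atmospheric pressure and 300 K is an assumption chosen for illustration, and the function names are hypothetical.

```python
K_B = 1.380649e-23      # J / K
N_A = 6.02214076e23     # 1 / mol

R = N_A * K_B           # universal gas constant, J / (K mol)

def number_of_atoms(p, V, T):
    """N = p V / (k_B T) from the ideal gas law."""
    return p * V / (K_B * T)

def number_of_moles(p, V, T):
    """n = N / N_A, equivalently p V / (R T)."""
    return number_of_atoms(p, V, T) / N_A

if __name__ == "__main__":
    p, V, T = 101325.0, 1e-3, 300.0   # one litre at atmospheric pressure (assumed example)
    print(f"R = {R:.4f} J/(K mol)")
    print(f"N = {number_of_atoms(p, V, T):.3e} atoms")
    print(f"n = {number_of_moles(p, V, T):.4f} mol")
```

The pressure, volume and temperature are all ordinary numbers in SI units, yet N is of order 10^22, while the number of moles is again of order one (here a few hundredths).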

2.2.3 Entropy and Gibbs’s Paradox

“It has always been believed that Gibbs’s paradox embodied profound thought. That it was intimately linked up with something so important and entirely new could hardly have been foreseen.”

Erwin Schrödinger

We said earlier that the formula for the partition function (2.55) isn’t quite right. What did we miss? We actually missed a subtle point from quantum mechanics: quantum particles are indistinguishable. If we take two identical atoms and swap their positions, this doesn’t give us a new state of the system – it is the same state that we had before. (Up to a sign that depends on whether the atoms are bosons or fermions – we’ll discuss this aspect in more detail in Sections 3.5 and 3.6.) However, we haven’t taken this into account: we wrote the expression $Z = Z_1^N$, which would be true if all the $N$ particles in the gas were distinguishable, for example if each of the particles were of a different type. But this naive partition function overcounts the number of states in the system when we’re dealing with indistinguishable particles.

It is a simple matter to write down the partition function for N indistinguishable particles. We simply need to divide by the number of ways to permute the particles. In other words, for the ideal gas the partition function is

$$Z_{\rm ideal}(N,V,T) = \frac{1}{N!}Z_1^N = \frac{V^N}{N!\,\lambda^{3N}} \tag{2.60}$$

The extra factor of $N!$ doesn’t change the calculations of pressure or energy since, for each, we had to differentiate $\log Z$ and any overall factor drops out. However, it does change the entropy since this is given by,

$$S = \frac{\partial}{\partial T}\left(k_B T\log Z_{\rm ideal}\right)$$

which includes a factor of $\log Z$ without any derivative. Of course, since the entropy is counting the number of underlying microstates, we would expect it to know about whether particles are distinguishable or indistinguishable. Using the correct partition function (2.60) and Stirling’s formula, the entropy of an ideal gas is given by,

$$S = N k_B\left[\log\left(\frac{V}{N\lambda^3}\right) + \frac{5}{2}\right] \tag{2.61}$$

This result is known as the Sackur-Tetrode equation. Notice that not only is the entropy sensitive to the indistinguishability of the particles, but it also depends on $\lambda$. However, the entropy is not directly measurable classically. We can only measure entropy differences by integrating the heat capacity as in (1.10).

The benefit of adding an extra factor of $N!$ was noticed before the advent of quantum mechanics by Gibbs. He was motivated by the change in entropy of mixing between two gases. Suppose that we have two different gases, say red and blue. Each has the same number of particles $N$ and sits in a volume $V$, separated by a partition. When the partition is removed the gases mix and we expect the entropy to increase. But if the gases are of the same type, removing the partition shouldn’t change the macroscopic state of the gas. So why should the entropy increase? This is referred to as the Gibbs paradox. Including the factor of $N!$ in the partition function ensures that the entropy does not increase when identical atoms are mixed [5].

[5] Be warned however: a closer look shows that the Gibbs paradox is rather toothless and, in the classical world, there is no real necessity to add the $N!$. A clear discussion of these issues can be found in E.T. Jaynes’ article “The Gibbs Paradox” which you can download from the course website.
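The effect of the N! on the mixing entropy can be made concrete with a short sketch. It compares the Sackur-Tetrode entropy (2.61) with the entropy that follows from the naive partition function (2.55), which works out to N k_B [log(V/λ³) + 3/2]; that second formula is derived here from S = ∂(k_B T log Z)/∂T rather than quoted from the text. The function names and numerical inputs are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34  # J s
K_B = 1.380649e-23      # J / K

def thermal_wavelength(m, T):
    """Thermal de Broglie wavelength, eq. (2.54)."""
    return math.sqrt(2.0 * math.pi * HBAR**2 / (m * K_B * T))

def entropy_indistinguishable(N, V, T, m):
    """Sackur-Tetrode entropy, eq. (2.61)."""
    lam = thermal_wavelength(m, T)
    return N * K_B * (math.log(V / (N * lam**3)) + 2.5)

def entropy_distinguishable(N, V, T, m):
    """Entropy from the naive Z = Z1^N of eq. (2.55): N k_B [log(V / lambda^3) + 3/2]."""
    lam = thermal_wavelength(m, T)
    return N * K_B * (math.log(V / lam**3) + 1.5)

def mixing_excess(entropy, N, V, T, m):
    """S(2N, 2V) - 2 S(N, V): entropy change on removing a partition between identical gases."""
    return entropy(2 * N, 2 * V, T, m) - 2.0 * entropy(N, V, T, m)

if __name__ == "__main__":
    args = (1e22, 1e-3, 300.0, 6.6e-27)   # N, V, T, m (assumed example values)
    print(mixing_excess(entropy_indistinguishable, *args))  # zero: entropy is extensive
    print(mixing_excess(entropy_distinguishable, *args))    # 2 N k_B log 2
```

With the N! included the entropy is extensive and nothing changes when the partition is removed; without it, the same calculation produces a spurious increase of 2 N k_B log 2.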

2.2.4 The Ideal Gas in the Grand Canonical Ensemble

It is worth briefly looking at the ideal gas in the grand canonical ensemble. Recall that in such an ensemble, the gas is free to exchange both energy and particles with the outside reservoir. You could think of the system as some fixed subvolume inside a much larger gas. If there are no walls to define this subvolume then particles, and hence energy, can happily move in and out. We can ask how many particles will, on average, be inside this volume and what fluctuations in particle number will occur. More importantly, we can also start to gain some intuition for this strange quantity called the chemical potential, μ.

The grand partition function (1.39) for the ideal gas is

$$\mathcal{Z}_{\rm ideal}(\mu,V,T) = \sum_{N=0}^{\infty} e^{\beta\mu N}\,Z_{\rm ideal}(N,V,T) = \exp\left(\frac{e^{\beta\mu}V}{\lambda^3}\right)$$

From this we can determine the average particle number,

$$N = \frac{1}{\beta}\frac{\partial}{\partial\mu}\log\mathcal{Z} = \frac{e^{\beta\mu}V}{\lambda^3}$$

which, rearranging, gives

$$\mu = k_B T\log\left(\frac{\lambda^3 N}{V}\right) \tag{2.62}$$

If $\lambda^3 < V/N$ then the chemical potential is negative. Recall that $\lambda$ is roughly the average de Broglie wavelength of each particle, while $V/N$ is the average volume taken up by each particle. But whenever the de Broglie wavelength of particles becomes comparable to the inter-particle separation, then quantum effects become important. In other words, to trust our classical calculation of the ideal gas, we must have $\lambda^3 \ll V/N$ and, correspondingly, $\mu < 0$.
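To see how comfortably a room-temperature gas satisfies this condition, the sketch below evaluates (2.62) in SI units. Helium at 300 K and atmospheric pressure is an assumed example, and the threshold used in `is_classical` is an arbitrary illustrative cut, not a criterion from the text.

```python
import math

HBAR = 1.054571817e-34  # J s
K_B = 1.380649e-23      # J / K

def chemical_potential(number_density, T, m):
    """mu = k_B T log(lambda^3 N / V), eq. (2.62)."""
    lam = math.sqrt(2.0 * math.pi * HBAR**2 / (m * K_B * T))
    return K_B * T * math.log(lam**3 * number_density)

def is_classical(number_density, T, m, threshold=1e-2):
    """Rough check of lambda^3 << V/N; the threshold is an arbitrary illustrative choice."""
    lam = math.sqrt(2.0 * math.pi * HBAR**2 / (m * K_B * T))
    return lam**3 * number_density < threshold

if __name__ == "__main__":
    T = 300.0
    m_helium = 4.0026 * 1.66053907e-27      # kg, assumed example atom
    n = 101325.0 / (K_B * T)                # N/V from the ideal gas law at 1 atm
    mu = chemical_potential(n, T, m_helium)
    print(f"mu / k_B T = {mu / (K_B * T):.2f}")
    print("classical regime:", is_classical(n, T, m_helium))
```

For these inputs μ is negative and more than ten times k_B T in magnitude, reflecting the fact that λ³ is several orders of magnitude smaller than V/N.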

At first sight, it is slightly strange that μ is negative. When we introduced μ in Section 1.4.1, we said that it should be thought of as the energy cost of adding an extra particle to the system. Surely that energy should be positive! To see why this isn’t the case, we should look more closely at the definition. From the energy variation (1.38), we have

$$\mu = \left.\frac{\partial E}{\partial N}\right|_{S,V}$$

So the chemical potential should be thought of as the energy cost of adding an extra particle at fixed entropy and volume. But adding a particle will give more ways to share the energy around and so increase the entropy. If we insist on keeping the entropy fixed, then we will need to reduce the energy when we add an extra particle. This is why we have μ<0 for the classical ideal gas.

There are situations where μ>0. This can occur if we have a suitably strong repulsive interaction between particles so that there’s a large energy cost associated to throwing in one extra. We also have μ>0 for fermion systems at low temperatures as we will see in Section 3.6.

We can also compute the fluctuation in the particle number,

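$$\Delta N^2 = \frac{1}{\beta^2}\frac{\partial^2}{\partial\mu^2}\log\mathcal{Z}_{\rm ideal} = N$$

As promised in Section 1.4.1, the relative fluctuations $\Delta N/N = 1/\sqrt{N}$ are vanishingly small in the thermodynamic $N\to\infty$ limit.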
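The result that the variance equals the mean can be checked by summing the grand partition function term by term. In the sketch below the terms of the sum are written as w_N = x^N / N! with x = e^{βμ} V / λ³, which is just a relabelling of e^{βμN} Z_ideal(N,V,T); the truncation rule and the function name are illustrative choices.

```python
import math

def grand_canonical_moments(x, n_max=None):
    """Mean and variance of N with weights w_N = x^N / N!, where x = exp(beta*mu) * V / lambda^3.

    These weights are the terms e^{beta mu N} Z_ideal(N, V, T) of the grand partition function.
    Returns (log of the summed weights, mean N, variance of N).
    """
    if n_max is None:
        n_max = int(x + 20.0 * math.sqrt(x) + 50)  # truncate far into the tail
    weight = 1.0              # w_0
    total = weight
    first = 0.0               # running sum of N * w_N
    second = 0.0              # running sum of N^2 * w_N
    for n in range(1, n_max + 1):
        weight *= x / n       # w_N = w_{N-1} * x / N
        total += weight
        first += n * weight
        second += n * n * weight
    mean = first / total
    variance = second / total - mean**2
    return math.log(total), mean, variance

if __name__ == "__main__":
    log_total, mean, variance = grand_canonical_moments(50.0)
    print(log_total, mean, variance)   # each close to x = 50
```

The logarithm of the sum reproduces log 𝒵 = x, and the mean and variance both come out equal to x, so ΔN/N = 1/√N.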

Finally, it is very easy to compute the equation of state in the grand canonical ensemble because (1.45) and (1.48) tell us that

$$pV = k_B T\log\mathcal{Z} = k_B T\,\frac{e^{\beta\mu}V}{\lambda^3} = k_B T N \tag{2.63}$$

which gives us back the ideal gas law.

2.3 Maxwell Distribution

Figure 10: Maxwell distribution for Noble gases: He, Ne, Ar and Xe.

Our discussion above focusses on understanding macroscopic properties of the gas such as pressure or heat capacity. But we can also use the methods of statistical mechanics to get a better handle on the microscopic properties of the gas. Like everything else, the information is hidden in the partition function. Let’s return to the form of the single particle partition function (2.52) before we do the integrals. We’ll still do the trivial spatial integral $\int d^3q = V$, but we’ll hold off on the momentum integral and instead change variables from momentum to velocity, $\vec{p} = m\vec{v}$. Then the single particle partition function is

$$Z_1 = \frac{m^3 V}{(2\pi\hbar)^3}\int d^3v\; e^{-\beta m v^2/2} = \frac{4\pi m^3 V}{(2\pi\hbar)^3}\int dv\; v^2 e^{-\beta m v^2/2}$$

We can compare this to the original definition of the partition function: the sum over states of the probability of that state. But here too, the partition function is written as a sum, now over speeds. The integrand must therefore have the interpretation as the probability distribution over speeds. The probability that the atom has speed between v and v+dv is

$$f(v)\,dv = \mathcal{N}\,v^2 e^{-mv^2/2k_B T}\,dv \tag{2.64}$$

where the normalization factor $\mathcal{N}$ can be determined by insisting that probabilities sum to one, $\int_0^\infty f(v)\,dv = 1$, which gives

$$\mathcal{N} = 4\pi\left(\frac{m}{2\pi k_B T}\right)^{3/2}$$

This is the Maxwell distribution. It is sometimes called the Maxwell-Boltzmann distribution. Figure 10 shows this distribution for a variety of gases with different masses at the same temperature, from the slow heavy Xenon (purple) to light, fast Helium (blue). We can use it to determine various average properties of the speeds of atoms in a gas. For example, the mean square speed is

$$\langle v^2\rangle = \int_0^\infty dv\; v^2 f(v) = \frac{3k_B T}{m}$$

This is in agreement with the equipartition of energy: the average kinetic energy of each atom in the gas is $\langle E\rangle = \frac{1}{2}m\langle v^2\rangle = \frac{3}{2}k_B T$.
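Both the normalisation and the mean square speed can be verified by numerical integration of (2.64). The following is a minimal sketch using the midpoint rule; the cutoff, step count and function names are illustrative choices, and the helium mass in the usage lines is an assumed example.

```python
import math

def maxwell_pdf(v, m, kT):
    """Maxwell speed distribution f(v), eq. (2.64) with its normalisation factor."""
    norm = 4.0 * math.pi * (m / (2.0 * math.pi * kT)) ** 1.5
    return norm * v * v * math.exp(-m * v * v / (2.0 * kT))

def speed_moment(power, m, kT, n_steps=20000, cutoff=12.0):
    """Integral of v^power * f(v) from 0 to cutoff * sqrt(kT/m), by the midpoint rule."""
    v_max = cutoff * math.sqrt(kT / m)
    dv = v_max / n_steps
    return sum(
        ((i + 0.5) * dv) ** power * maxwell_pdf((i + 0.5) * dv, m, kT) * dv
        for i in range(n_steps)
    )

if __name__ == "__main__":
    m = 4.0026 * 1.66053907e-27       # helium atom in kg (assumed example)
    kT = 1.380649e-23 * 300.0         # J, at 300 K
    print("normalisation:", speed_moment(0, m, kT))
    print("rms speed (m/s):", math.sqrt(speed_moment(2, m, kT)))
    print("sqrt(3kT/m):    ", math.sqrt(3.0 * kT / m))
```

The zeroth moment should come out as 1 and the second moment as 3 k_B T / m, so the two speeds printed at the end should agree.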

Maxwell’s Argument

The above derivation tells us the distribution of velocities in a non-interacting gas of particles. Remarkably, the Maxwell distribution also holds in the presence of any interactions. In fact, Maxwell’s original derivation of the distribution makes no reference to any properties of the gas. It is very slick!

Let’s first think about the distribution of velocities in the $x$ direction; we’ll call this distribution $\phi(v_x)$. Rotational symmetry means that we must have the same distribution of velocities in both the $y$ and $z$ directions. However, rotational invariance also requires that the full distribution can’t depend on the direction of the velocity; it can only depend on the speed $v = \sqrt{v_x^2 + v_y^2 + v_z^2}$. This means that we need to find functions $F(v)$ and $\phi(v_x)$ such that

$$F(v)\,dv_x\,dv_y\,dv_z = \phi(v_x)\,\phi(v_y)\,\phi(v_z)\,dv_x\,dv_y\,dv_z$$

It doesn’t look as if we possibly have enough information to solve this equation for both $F$ and $\phi$. But, remarkably, there is only one solution. The only function which satisfies this equation is

$$\phi(v_x) = A\,e^{-Bv_x^2}$$

for some constants $A$ and $B$. Thus the distribution over speeds must be

$$F(v)\,dv_x\,dv_y\,dv_z = 4\pi v^2 F(v)\,dv = 4\pi A^3 v^2 e^{-Bv^2}\,dv$$

We see that the functional form of the distribution arises from rotational invariance alone. To determine the coefficient $B = m/2k_B T$ we need the more elaborate techniques of statistical mechanics that we saw above. (In fact, one can derive it just from equipartition of energy.)
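The uniqueness claim can be made plausible in a few lines. What follows is a sketch of the standard argument, under the assumption (not spelled out in the text) that the functions F and ϕ are differentiable and non-zero.

```latex
\begin{align*}
% take the logarithm of F(v) = \phi(v_x)\phi(v_y)\phi(v_z)
\log F(v) &= \log\phi(v_x) + \log\phi(v_y) + \log\phi(v_z),
  \qquad v = \sqrt{v_x^2 + v_y^2 + v_z^2} \\
% differentiate with respect to v_x, using \partial v / \partial v_x = v_x / v
\frac{1}{v}\frac{F'(v)}{F(v)} &= \frac{1}{v_x}\frac{\phi'(v_x)}{\phi(v_x)} \\
% the right-hand side depends on v_x alone, and the same left-hand side equals the
% analogous expressions in v_y and v_z, so all of them equal a constant, called -2B
\frac{\phi'(v_x)}{\phi(v_x)} &= -2B\,v_x
  \quad\Longrightarrow\quad \phi(v_x) = A\,e^{-Bv_x^2}
\end{align*}
```

The constant must satisfy B > 0 for the distribution to be normalisable.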

2.3.1 A History of Kinetic Theory

The name kinetic theory refers to understanding the properties of gases through their underlying atomic constituents. The discussion given above barely scratches the surface of this important subject.

Kinetic theory traces its origin to the work of Daniel Bernoulli in 1738. He was the first to argue that the phenomenon that we call pressure is due to the constant bombardment of tiny atoms. His calculation is straightforward. Consider a cubic box with sides of length $L$. Suppose that an atom travelling with velocity $v_x$ in the $x$ direction bounces elastically off a wall so that it returns with velocity $-v_x$. The particle experiences a change in momentum $\Delta p_x = 2mv_x$. Since the particle is trapped in a box, it will next hit the same wall at a time $\Delta t = 2L/v_x$ later. This means that the force on the wall due to this atom is

$$F = \frac{\Delta p_x}{\Delta t} = \frac{mv_x^2}{L}$$

Summing over all the atoms which hit the wall, the force is

$$F = \frac{Nm\langle v_x^2\rangle}{L}$$

where $\langle v_x^2\rangle$ is the mean square velocity in the $x$-direction. Using the same argument as we gave in Maxwell’s derivation above, we must have $\langle v_x^2\rangle = \langle v^2\rangle/3$. Thus $F = Nm\langle v^2\rangle/3L$ and the pressure, which is force per area, is given by

$$p = \frac{Nm\langle v^2\rangle}{3L^3} = \frac{Nm\langle v^2\rangle}{3V}$$

If this equation is compared to the ideal gas law (which, at the time, had only experimental basis) one concludes that the phenomenon of temperature must arise from the kinetic energy of the gas. Or, more precisely, one finds the equipartition result that we derived previously: $\frac{1}{2}m\langle v^2\rangle = \frac{3}{2}k_B T$.
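As a consistency check on Bernoulli’s formula, the sketch below estimates ⟨v²⟩ by sampling velocities and compares the resulting pressure with the ideal gas law. Note that it borrows the Maxwell distribution derived earlier in this section (each velocity component Gaussian with variance k_B T/m), which Bernoulli did not have; the sample size, seed, units and function name are all illustrative assumptions.

```python
import math
import random

def kinetic_pressure(N, V, m, kT, n_samples=100_000, seed=0):
    """Estimate p = N m <v^2> / (3 V) with velocities sampled from the Maxwell distribution.

    Each Cartesian velocity component is Gaussian with mean 0 and variance kT/m.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(kT / m)
    total_v2 = 0.0
    for _ in range(n_samples):
        vx = rng.gauss(0.0, sigma)
        vy = rng.gauss(0.0, sigma)
        vz = rng.gauss(0.0, sigma)
        total_v2 += vx * vx + vy * vy + vz * vz
    mean_v2 = total_v2 / n_samples
    return N * m * mean_v2 / (3.0 * V)

if __name__ == "__main__":
    N, V, m, kT = 1000, 2.0, 1.0, 1.5   # arbitrary units
    print("kinetic estimate:", kinetic_pressure(N, V, m, kT))
    print("ideal gas law:   ", N * kT / V)
```

The two numbers agree up to statistical noise, which shrinks like the inverse square root of the number of samples.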

After Bernoulli’s pioneering work, kinetic theory languished. No one really knew what to do with his observation nor how to test the underlying atomic hypothesis. Over the next century, Bernoulli’s result was independently rediscovered by a number of people, all of whom were ignored by the scientific community. One of the more interesting attempts was by John Waterston, a Scottish engineer and naval instructor working for the East India Company in Bombay. Waterston was considered a crackpot. His 1843 paper was rejected by the Royal Society as “nothing but nonsense” and he wrote up his results in a self-published book with the wonderfully crackpot title “Thoughts on Mental Functions”.

The results of Bernoulli and Waterston finally became accepted only after they were re-rediscovered by more established scientists, most notably Rudolf Clausius who, in 1857, extended these ideas to rotating and vibrating molecules. Soon afterwards, in 1859, Maxwell gave the derivation of the distribution of velocities that we saw above. This is often cited as the first statistical law of physics. But Maxwell was able to take things further. He used kinetic theory to derive the first genuinely new prediction of the atomic hypothesis: that the viscosity of a gas is independent of its density. Maxwell himself wrote,

“Such a consequence of the mathematical theory is very startling and the only experiment I have met with on the subject does not seem to confirm it.”

Maxwell decided to rectify the situation. With help from his wife, he spent several years constructing an experimental apparatus in his attic which was capable of providing the first accurate measurements of viscosity of gases [6]. His surprising theoretical prediction was confirmed by his own experiment.

[6] You can see the original apparatus down the road in the corridor of the Cavendish lab. Or, if you don’t fancy the walk, you can simply click here: http://www-outreach.phy.cam.ac.uk/camphy/museum/area1/exhibit1.htm

There are many further developments in kinetic theory which we will not cover in this course. Perhaps the most important is the Boltzmann equation. This describes the evolution of a particle’s probability distribution in position and momentum space as it collides with other particles. Stationary, unchanging, solutions bring you back to the Maxwell-Boltzmann distribution, but the equation also provides a framework to go beyond the equilibrium description of a gas. You can read about this in the lecture notes on Kinetic Theory.

2.4 Diatomic Gas

“I must now say something about these internal motions, because the greatest difficulty which the kinetic theory of gases has yet encountered belongs to this part of the subject”.

James Clerk Maxwell, 1875

Consider a molecule that consists of two atoms in a bound state. We’ll construct a very simple physicist’s model of this molecule: two masses attached to a spring. As well as the translational degrees of freedom, there are two further ways in which the molecule can move

  • Rotation: the molecule can rotate rigidly about the two axes perpendicular to the axis of symmetry, with moment of inertia I. (For now, we will neglect the rotation about the axis of symmetry. It has very low moment of inertia which will ultimately mean that it is unimportant).

  • Vibration: the molecule can oscillate along the axis of symmetry

We’ll work under the assumption that the rotation and vibration modes are independent. In this case, the partition function for a single molecule factorises into the product of the translation partition function $Z_{\rm trans}$ that we have already calculated (2.53) and the rotational and vibrational contributions,

$$Z_1 = Z_{\rm trans}\,Z_{\rm rot}\,Z_{\rm vib}$$

We will now deal with $Z_{\rm rot}$ and $Z_{\rm vib}$ in turn.

Rotation

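The Lagrangian for the rotational degrees of freedom is [7]

$$L_{\rm rot} = \frac{1}{2}I\left(\dot{\theta}^2 + \sin^2\theta\,\dot{\phi}^2\right) \tag{2.65}$$

[7] See, for example, Section 3.6 of the lecture notes on Classical Dynamics.

The conjugate momenta are therefore

$$p_\theta = \frac{\partial L_{\rm rot}}{\partial\dot{\theta}} = I\dot{\theta}\ ,\qquad p_\phi = \frac{\partial L_{\rm rot}}{\partial\dot{\phi}} = I\sin^2\theta\,\dot{\phi}$$

from which we get the Hamiltonian for the rotating diatomic molecule,

$$H_{\rm rot} = \dot{\theta}p_\theta + \dot{\phi}p_\phi - L_{\rm rot} = \frac{p_\theta^2}{2I} + \frac{p_\phi^2}{2I\sin^2\theta} \tag{2.66}$$

The rotational contribution to the partition function is then

$$\begin{aligned} Z_{\rm rot} &= \frac{1}{(2\pi\hbar)^2}\int d\theta\,d\phi\,dp_\theta\,dp_\phi\; e^{-\beta H_{\rm rot}} \\ &= \frac{1}{(2\pi\hbar)^2}\sqrt{\frac{2\pi I}{\beta}}\int_0^\pi d\theta\,\sqrt{\frac{2\pi I\sin^2\theta}{\beta}}\int_0^{2\pi}d\phi \\ &= \frac{2Ik_B T}{\hbar^2} \end{aligned} \tag{2.67}$$

From this we can compute the average rotational energy of each molecule,

$$E_{\rm rot} = k_B T$$

If we now include the translational contribution (2.53), the partition function for a diatomic molecule that can spin and move, but can’t vibrate, is given by $Z_1 = Z_{\rm trans}Z_{\rm rot} \sim (k_B T)^{5/2}$, and the partition function for a gas of these objects is $Z = Z_1^N/N!$, from which we compute the energy $E = \frac{5}{2}Nk_B T$ and the heat capacity,

$$C_V = \frac{5}{2}k_B N$$

In fact we can derive this result simply from equipartition of energy: there are 3 translational modes and 2 rotational modes, giving a contribution of $5N\times\frac{1}{2}k_B T$ to the energy.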

Vibrations

The Hamiltonian for the vibrating mode is simply a harmonic oscillator. We’ll denote the displacement away from the equilibrium position by ζ. The molecule vibrates with some frequency ω which is determined by the strength of the atomic bond. The Hamiltonian is then

$$H_{\rm vib} = \frac{p_\zeta^2}{2m} + \frac{1}{2}m\omega^2\zeta^2$$

from which we can compute the partition function

$$Z_{\rm vib} = \frac{1}{2\pi\hbar}\int d\zeta\,dp_\zeta\; e^{-\beta H_{\rm vib}} = \frac{k_B T}{\hbar\omega} \tag{2.68}$$

The average vibrational energy of each molecule is now

$$E_{\rm vib} = k_B T$$

(You may have anticipated $\frac{1}{2}k_B T$ since the harmonic oscillator has just a single degree of freedom, but equipartition works slightly differently when there is a potential energy. You will see another example on the problem sheet from which it is simple to deduce the general form.)
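For completeness, here is a short worked version of the step from the vibrational partition function (2.68) to the average energy, together with the way the energy splits between the kinetic and potential terms. The split follows from the two Gaussian integrals in the phase-space average; it is added here as a sketch and is not taken from the text.

```latex
\begin{align*}
E_{\rm vib} &= -\frac{\partial}{\partial\beta}\log Z_{\rm vib}
  = -\frac{\partial}{\partial\beta}\log\left(\frac{1}{\beta\hbar\omega}\right)
  = \frac{1}{\beta} = k_B T \\
\left\langle \frac{p_\zeta^2}{2m} \right\rangle &= \frac{1}{2}k_B T\ ,
  \qquad
\left\langle \frac{1}{2}m\omega^2\zeta^2 \right\rangle = \frac{1}{2}k_B T
\end{align*}
```

The kinetic and potential terms are both quadratic and each contributes the same amount, which is why the total is twice the naive expectation. Note that ω drops out of the energy because it only enters log Z as a β-independent constant.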

Figure 11: The heat capacity of Hydrogen gas H2. The graph was created by P. Eyland.

Putting together all the ingredients, the contributions from translational motion, rotation and vibration give the heat capacity

$$C_V = \frac{7}{2}Nk_B$$

This result depends on neither the moment of inertia, $I$, nor the stiffness of the molecular bond, $\omega$. A molecule with large $I$ will simply spin more slowly so that the average rotational kinetic energy is $k_B T$; a molecule attached by a stiff spring with high $\omega$ will vibrate with smaller amplitude so that the average vibrational energy is $k_B T$. This ensures that the heat capacity is constant.

Great! So the heat capacity of a diatomic gas is $\frac{7}{2}Nk_B$. Except it’s not! An idealised graph of the heat capacity for $H_2$, the simplest diatomic gas, is shown in Figure 11. At suitably high temperatures, around 5000K, we do see the full heat capacity that we expect. But at low temperatures, the heat capacity is that of monatomic gas. And, in the middle, it seems to rotate, but not vibrate. What’s going on? Towards the end of the nineteenth century, scientists were increasingly bewildered about this behaviour.

What’s missing in the discussion above is something very important: $\hbar$. The successive freezing out of vibrational and rotational modes as the temperature is lowered is a quantum effect. In fact, this behaviour of the heat capacities of gases was the first time that quantum mechanics revealed itself in experiment. We’re used to thinking of quantum mechanics as being relevant on small scales, yet here we see that $\hbar$ affects the physics of gases at temperatures of 2000K. But then, that is the theme of this course: how the microscopic determines the macroscopic. We will return to the diatomic gas in Section 3.4 and understand its heat capacity including the relevant quantum effects.

2.5 Interacting Gas

Until now, we’ve only discussed free systems; particles moving around unaware of each other. Now we’re going to turn on interactions. Here things get much more interesting. And much more difficult. Many of the most important unsolved problems in physics are to do with the interactions between large numbers of particles. Here we’ll be gentle. We’ll describe a simple approximation scheme that will allow us to begin to understand the effects of interactions between particles.

We’ll focus once more on the monatomic gas. The ideal gas law is exact in the limit of no interactions between atoms. This is a good approximation when the density of atoms N/V is small. Corrections to the ideal gas law are often expressed in terms of a density expansion, known as the virial expansion. The most general equation of state is,

$$\frac{p}{k_B T} = \frac{N}{V} + B_2(T)\frac{N^2}{V^2} + B_3(T)\frac{N^3}{V^3} + \ldots \tag{2.69}$$

where the functions $B_j(T)$ are known as virial coefficients.
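The structure of the expansion (2.69) can be sketched as a small function that sums the series for given coefficients. The coefficient values in the usage lines are made-up numbers for illustration only, not results for any real gas, and the function name is hypothetical.

```python
def virial_pressure(number_density, kT, coefficients=()):
    """Pressure from the truncated virial expansion (2.69).

    coefficients = (B2, B3, ...) evaluated at the temperature of interest;
    an empty tuple gives back the ideal gas law p = n kT.
    """
    series = number_density
    for j, B_j in enumerate(coefficients, start=2):
        series += B_j * number_density**j
    return kT * series

if __name__ == "__main__":
    n, kT = 2.0, 3.0                                 # arbitrary units
    print(virial_pressure(n, kT))                    # ideal gas: n kT
    print(virial_pressure(n, kT, (0.5,)))            # with a hypothetical B2
    print(virial_pressure(n, kT, (0.5, 0.25)))       # with hypothetical B2 and B3
```

At low density the higher powers of N/V are negligible and the ideal gas term dominates, which is why the expansion is organised in powers of the density.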

Our goal is to compute the virial coefficients from first principles, starting from a knowledge of the underlying potential energy U(r) between two neutral atoms separated by a distance r. This potential has two important features:

  • An attractive $1/r^6$ interaction. This arises from fluctuating dipoles of the neutral atoms. Recall that two permanent dipole moments, $p_1$ and $p_2$, have a potential energy which scales as $p_1p_2/r^3$. Neutral atoms don’t have permanent dipoles, but they can acquire a temporary dipole due to quantum fluctuations. Suppose that the first atom has an instantaneous dipole $p_1$. This will induce an electric field $E\sim p_1/r^3$ which, in turn, will induce a dipole of the second atom $p_2\sim E\sim p_1/r^3$. The resulting potential energy between the atoms scales as $p_1p_2/r^3\sim 1/r^6$. This is sometimes called the van der Waals interaction.

  • A rapidly rising repulsive interaction at short distances, arising from the Pauli exclusion principle that prevents two atoms from occupying the same space. For our purposes, the exact form of this repulsion is not so relevant: just as long as it’s big. (The Pauli exclusion principle is a quantum effect. If the exact form of the potential is important then we really need to be dealing with quantum mechanics all along. We will do this in the next section).

One very common potential that is often used to model the force between atoms is the Lennard-Jones potential,

$$U(r)\sim\left(\frac{r_0}{r}\right)^{12}-\left(\frac{r_0}{r}\right)^{6} \tag{2.70}$$

The exponent 12 is chosen only for convenience: it simplifies certain calculations because 12=2×6.

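An even simpler form of the potential incorporates a hard core repulsion, in which the particles are simply forbidden from coming closer than a fixed distance by imposing an infinite potential,

$$U(r)=\begin{cases}\infty & r<r_0\\[4pt] -U_0\left(\dfrac{r_0}{r}\right)^{6} & r\geq r_0\end{cases} \tag{2.71}$$

Figure 12: the hard-core potential with van der Waals attraction.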

The hard-core potential with van der Waals attraction is sketched in Figure 12. We will see shortly that the virial coefficients are determined by increasingly difficult integrals involving the potential $U(r)$. For this reason, it’s best to work with a potential that’s as simple as possible. When we come to do some actual calculations we will use the form (2.71).

2.5.1 The Mayer f Function and the Second Virial Coefficient

We’re going to change notation and call the positions of the particles r instead of q. (The latter notation was useful to stress the connection to quantum mechanics at the beginning of this Section, but we’ve now left that behind!). The Hamiltonian of the gas is

$$H=\sum_{i=1}^{N}\frac{p_i^2}{2m}+\sum_{i>j}U(r_{ij})$$

where $r_{ij}=|\vec{r}_i-\vec{r}_j|$ is the separation between particles. The restriction $i>j$ on the final sum ensures that we sum over each pair of particles exactly once. The partition function is then

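$$\begin{aligned}
Z(N,V,T) &= \frac{1}{N!}\frac{1}{(2\pi\hbar)^{3N}}\int\prod_{i=1}^{N}d^3p_i\,d^3r_i\;e^{-\beta H}\\
&= \frac{1}{N!}\frac{1}{(2\pi\hbar)^{3N}}\left[\int\prod_i d^3p_i\;e^{-\beta\sum_j p_j^2/2m}\right]\times\left[\int\prod_i d^3r_i\;e^{-\beta\sum_{j<k}U(r_{jk})}\right]\\
&= \frac{1}{N!\,\lambda^{3N}}\int\prod_i d^3r_i\;e^{-\beta\sum_{j<k}U(r_{jk})}
\end{aligned}$$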

where $\lambda$ is the thermal wavelength that we met in (2.54). We still need to do the integral over positions. And that looks hard! The interactions mean that the integrals don’t factor in any obvious way. What to do? One obvious thing to try is to Taylor expand (which is closely related to the so-called cumulant expansion in this context)

$$e^{-\beta\sum_{j<k}U(r_{jk})}=1-\beta\sum_{j<k}U(r_{jk})+\frac{\beta^2}{2}\sum_{j<k,\;l<m}U(r_{jk})U(r_{lm})+\ldots$$

Unfortunately, this isn’t so useful. We want each term to be smaller than the preceding one. But as $r_{ij}\rightarrow 0$, the potential $U(r_{ij})\rightarrow\infty$, which doesn’t look promising for an expansion parameter.

Instead of proceeding with the naive Taylor expansion, we will instead choose to work with the following quantity, usually called the Mayer f function,

$$f(r)=e^{-\beta U(r)}-1 \tag{2.72}$$

This is a nicer expansion parameter. When the particles are far separated at $r\rightarrow\infty$, $f(r)\rightarrow 0$. However, as the particles come close and $r\rightarrow 0$, the Mayer function approaches $f(r)\rightarrow -1$. We’ll proceed by trying to construct a suitable expansion in terms of $f$. We define
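The two limits quoted above are easy to check numerically. The following is a minimal sketch that evaluates the Mayer function (2.72) for the hard-core potential with van der Waals attraction (2.71). The function names, the choice of units ($r_0=U_0=1$) and the value of $\beta$ are illustrative assumptions, not part of the notes.

```python
import math

def hard_core_vdw(r, r0=1.0, U0=1.0):
    """Potential (2.71): infinite for r < r0, -U0 (r0/r)^6 otherwise."""
    if r < r0:
        return math.inf
    return -U0 * (r0 / r) ** 6

def mayer_f(r, beta, potential=hard_core_vdw):
    """Mayer f function (2.72): f(r) = exp(-beta U(r)) - 1."""
    return math.exp(-beta * potential(r)) - 1.0

if __name__ == "__main__":
    beta = 0.5  # assumed illustrative value, in units where U0 = 1
    for r in (0.5, 1.0, 1.5, 3.0, 10.0):
        print(f"r = {r:5.1f}   f = {mayer_f(r, beta):+.6f}")
```

Inside the hard core the function sits at exactly $-1$, and it decays towards zero at large separation, so $f$ stays bounded even where $U$ itself diverges.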

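$$f_{ij}=f(r_{ij})$$

Then we can write the partition function as

$$\begin{aligned}
Z(N,V,T) &= \frac{1}{N!\,\lambda^{3N}}\int\prod_i d^3r_i\prod_{j>k}(1+f_{jk})\\
&= \frac{1}{N!\,\lambda^{3N}}\int\prod_i d^3r_i\left(1+\sum_{j>k}f_{jk}+\sum_{j>k,\;l>m}f_{jk}f_{lm}+\ldots\right)
\end{aligned} \tag{2.73}$$

The first term simply gives a factor of the volume $V$ for each integral, so we get $V^N$. The second term has a sum, each element of which is the same. They all look like

$$\int\prod_{i=1}^{N}d^3r_i\;f_{12}=V^{N-2}\int d^3r_1\,d^3r_2\;f(r_{12})=V^{N-1}\int d^3r\;f(r)$$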

where, in the last equality, we’ve simply changed integration variables from $\vec{r}_1$ and $\vec{r}_2$ to the centre of mass $\vec{R}=\frac{1}{2}(\vec{r}_1+\vec{r}_2)$ and the separation $\vec{r}=\vec{r}_1-\vec{r}_2$. (You might worry that the limits of integration change in the integral over $\vec{r}$, but the integral over $f(r)$ only picks up contributions from atomic size distances and this is only actually a problem close to the boundaries of the system where it is negligible). There is a term like this for each pair of particles – that is $\frac{1}{2}N(N-1)$ such terms. For $N\sim 10^{23}$, we can just call this a round $\frac{1}{2}N^2$. Then, ignoring terms quadratic in $f$ and higher, the partition function is approximately

$$\begin{aligned}
Z(N,V,T) &= \frac{V^N}{N!\,\lambda^{3N}}\left(1+\frac{N^2}{2V}\int d^3r\;f(r)+\ldots\right)\\
&= Z_{\rm ideal}\left(1+\frac{N}{2V}\int d^3r\;f(r)+\ldots\right)^{N}
\end{aligned}$$

where we’ve used our previous result that $Z_{\rm ideal}=V^N/N!\,\lambda^{3N}$. We’ve also engaged in something of a sleight of hand in this last line, promoting one power of $N$ from in front of the integral to an overall exponent. Massaging the expression in this way ensures that the free energy is proportional to the number of particles as one would expect:

$$F=-k_BT\log Z=F_{\rm ideal}-Nk_BT\log\left(1+\frac{N}{2V}\int d^3r\;f(r)\right) \tag{2.74}$$

However, if you’re uncomfortable with this little trick, it’s not hard to convince yourself that the result (2.75) below for the equation of state doesn’t depend on it. We will also look at the expansion more closely in the following section and see how all the higher order terms work out.

From the expression (2.74) for the free energy, it is clear that we are indeed performing an expansion in density of the gas since the correction term is proportional to N/V. This form of the free energy will give us the second virial coefficient B2(T).

We can be somewhat more precise about what it means to be at low density. The exact form of the integral $\int d^3r\,f(r)$ depends on the potential, but for both the Lennard-Jones potential (2.70) and the hard-core repulsion (2.71), the integral is approximately $\int d^3r\,f(r)\sim r_0^3$, where $r_0$ is roughly the minimum of the potential. (We’ll compute the integral exactly below for the hard-core potential). For the expansion to be valid, we want each term with an extra power of $f$ to be smaller than the preceding one. (This statement is actually only approximately true. We’ll be more precise below when we develop the cluster expansion). That means that the second term in the argument of the log should be smaller than 1. In other words,

$$\frac{N}{V}\ll\frac{1}{r_0^3}$$

The left-hand side is the density of the gas. The right-hand side is atomic density. Or, equivalently, the density of a substance in which the atoms are packed closely together. But we have a name for such substances – we call them liquids! Our expansion is valid for densities of the gas that are much lower than that of the liquid state.
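To get a feeling for the numbers, here is a rough sketch that estimates the dimensionless combination $(N/V)\,r_0^3$ for a gas at everyday conditions. The inputs (atmospheric pressure, $T=300$ K and an atomic size $r_0=3\times 10^{-10}$ m) are assumed, illustrative values rather than numbers taken from the notes, and the density is estimated from the ideal gas law itself.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K

def ideal_number_density(pressure_pa, temperature_k):
    """Number density N/V of an ideal gas, from p = (N/V) k_B T."""
    return pressure_pa / (K_B * temperature_k)

def density_expansion_parameter(number_density, r0):
    """Dimensionless combination (N/V) r0^3, which should be << 1."""
    return number_density * r0 ** 3

if __name__ == "__main__":
    # Assumed illustrative inputs, not taken from the text.
    n = ideal_number_density(pressure_pa=101_325.0, temperature_k=300.0)
    x = density_expansion_parameter(n, r0=3.0e-10)
    print(f"N/V = {n:.3e} per m^3,   (N/V) r0^3 = {x:.1e}")
```

With these assumed inputs the parameter comes out well below one, which is consistent with the ideal gas law working well for gases under ordinary conditions.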

2.5.2 van der Waals Equation of State

We can use the free energy (2.74) to compute the pressure of the gas. Expanding the logarithm as $\log(1+x)\approx x$ we get

$$p=-\frac{\partial F}{\partial V}=\frac{Nk_BT}{V}\left(1-\frac{N}{2V}\int d^3r\;f(r)+\ldots\right)$$
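For readers who want the intermediate step, the differentiation can be sketched as follows. The shorthand $I$ for the integral of the Mayer function is introduced here only for convenience, and the volume derivative of $F_{\rm ideal}$ is taken to give the ideal gas pressure $Nk_BT/V$.

```latex
\begin{align*}
F &\approx F_{\rm ideal} - N k_B T\,\frac{N}{2V}\, I ,
  \qquad I \equiv \int d^3r\; f(r) \\
p &= -\frac{\partial F}{\partial V}
   = \frac{N k_B T}{V} - \frac{N^2 k_B T}{2V^2}\, I
   = \frac{N k_B T}{V}\left(1 - \frac{N}{2V}\, I\right)
\end{align*}
```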

As expected, the pressure deviates from that of an ideal gas. We can characterize this by writing

$$\frac{pV}{Nk_BT}=1-\frac{N}{2V}\int d^3r\;f(r) \tag{2.75}$$

To understand what this is telling us, we need to compute $\int d^3r\,f(r)$. Firstly let’s look at two trivial examples:

Repulsion: Suppose that $U(r)>0$ for all separations $r$ with $U(r\rightarrow\infty)=0$. Then $f=e^{-\beta U}-1<0$ and the pressure increases, as we’d expect for a repulsive interaction.

Attraction: If U(r)<0, we have f>0 and the pressure decreases, as we’d expect for an attractive interaction.

What about a more realistic interaction that is attractive at long distances and repulsive at short? We will compute the equation of state of a gas using the hard-core potential with van der Waals attraction (2.71). The integral of the Mayer f function is

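$$\int d^3r\;f(r)=\int_0^{r_0}d^3r\,(-1)+\int_{r_0}^{\infty}d^3r\left(e^{+\beta U_0(r_0/r)^6}-1\right) \tag{2.76}$$

We’ll approximate the second integral in the high temperature limit, $\beta U_0\ll 1$, where $e^{+\beta U_0(r_0/r)^6}\approx 1+\beta U_0(r_0/r)^6$. Then

$$\begin{aligned}
\int d^3r\;f(r) &= -4\pi\int_0^{r_0}dr\,r^2+\frac{4\pi U_0}{k_BT}\int_{r_0}^{\infty}dr\,\frac{r_0^6}{r^4}\\
&= \frac{4\pi r_0^3}{3}\left(\frac{U_0}{k_BT}-1\right)
\end{aligned}$$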

Inserting this into (2.75) gives us an expression for the equation of state,

$$\frac{pV}{Nk_BT}=1-\frac{N}{V}\left(\frac{a}{k_BT}-b\right)$$

We recognise this expansion as capturing the second virial coefficient in (2.69) as promised. The constants $a$ and $b$ are defined by

$$a=\frac{2\pi r_0^3U_0}{3}\ ,\qquad b=\frac{2\pi r_0^3}{3}$$

It is actually slightly more useful to write this in the form $k_BT=\ldots$. We can multiply through by $k_BT$ then, rearranging, we have

$$k_BT=\frac{V}{N}\left(p+\frac{N^2}{V^2}a\right)\left(1+\frac{N}{V}b\right)^{-1}$$

Since we’re working in an expansion in density, $N/V$, we’re at liberty to Taylor expand the last bracket, keeping only the first two terms. We get

$$k_BT=\left(p+\frac{N^2}{V^2}a\right)\left(\frac{V}{N}-b\right) \tag{2.78}$$

This is the famous van der Waals equation of state for a gas. We stress again the limitations of our analysis: it is valid only at low densities and (because of our approximation when performing the integral (2.76)) at high temperatures.
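The high-temperature approximation can be tested numerically. The sketch below evaluates the integral (2.76) without expanding the exponential and compares the resulting second virial coefficient, $-\frac{1}{2}\int d^3r\,f(r)$, with the combination $b-a/k_BT$. The function names, the substitution $u=r_0/r$ used to map the infinite range onto $[0,1]$, the midpoint rule and the sample value $\beta U_0=0.1$ are choices made for this illustration only.

```python
import math

def mayer_integral(beta_U0, r0=1.0, steps=20_000):
    """Integral of the Mayer f function for the potential (2.71).

    The hard-core region contributes -4 pi r0^3 / 3 exactly. For r > r0 the
    substitution u = r0 / r turns the radial integral into
    4 pi r0^3 * int_0^1 du (exp(beta_U0 u^6) - 1) / u^4,
    which is evaluated here with the midpoint rule.
    """
    inner = -4.0 * math.pi * r0 ** 3 / 3.0
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += math.expm1(beta_U0 * u ** 6) / u ** 4
    outer = 4.0 * math.pi * r0 ** 3 * total * h
    return inner + outer

def second_virial_numeric(beta_U0, r0=1.0):
    """B2 = -(1/2) * integral of f, with no high-temperature expansion."""
    return -0.5 * mayer_integral(beta_U0, r0)

def second_virial_high_T(beta_U0, r0=1.0):
    """B2 = b - a / (k_B T) with a and b as defined in the text."""
    b = 2.0 * math.pi * r0 ** 3 / 3.0
    return b * (1.0 - beta_U0)

if __name__ == "__main__":
    for beta_U0 in (0.01, 0.1, 0.5, 1.0):
        exact = second_virial_numeric(beta_U0)
        approx = second_virial_high_T(beta_U0)
        print(f"beta*U0 = {beta_U0:4.2f}   numeric = {exact:.5f}   b - a/kT = {approx:.5f}")
```

The two columns agree closely for small $\beta U_0$ and drift apart as $\beta U_0$ approaches one, which is the high-temperature limitation stressed above.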

We will return to the van der Waals equation in Section 5 where we’ll explore many of its interesting features. For now, we can get a feeling for the physics behind this equation of state by rewriting it in yet another way,

$$p=\frac{Nk_BT}{V-bN}-a\frac{N^2}{V^2} \tag{2.79}$$

Figure 13: the excluded volume around each atom.

The constant $a$ contains a factor of $U_0$ and so captures the effect of the attractive interaction at large distances. We see that its role is to reduce the pressure of the gas. The reduction in pressure is proportional to the density squared because this is, in turn, proportional to the number of pairs of particles which feel the attractive force. In contrast, $b$ only contains $r_0$ and arises due to the hard-core repulsion in the potential. Its effect is to reduce the effective volume of the gas because of the space taken up by the particles.

It is worth pointing out where some quizzical factors of two come from in $b=2\pi r_0^3/3$. Recall that $r_0$ is the minimum distance that two atoms can approach. If we think of each atom as a hard sphere, then they have radius $r_0/2$ and volume $4\pi(r_0/2)^3/3$. Which isn’t equal to $b$. However, as illustrated in Figure 13, the excluded volume around each atom is actually $\Omega=4\pi r_0^3/3=2b$. So why don’t we have $\Omega$ sitting in the denominator of the van der Waals equation rather than $b=\Omega/2$? Think about adding the atoms one at a time. The first guy can move in volume $V$; the second in volume $V-\Omega$; the third in volume $V-2\Omega$ and so on. For $\Omega\ll V$, the total configuration space available to the atoms is

$$\frac{1}{N!}\prod_{m=1}^{N}(V-m\Omega)\approx\frac{V^N}{N!}\left(1-\frac{N^2}{2}\frac{\Omega}{V}+\ldots\right)\approx\frac{1}{N!}\left(V-\frac{N\Omega}{2}\right)^{N}$$

And there’s that tricky factor of 1/2.
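The counting argument can be checked with a few lines of arithmetic. This sketch compares the logarithm of the product $\prod_m(V-m\Omega)$ with the logarithm of $(V-N\Omega/2)^N$, leaving out the common factor of $1/N!$. The function names and the sample values of $N$, $V$ and $\Omega$ are arbitrary illustrative choices.

```python
import math

def log_config_volume_exact(N, V, omega):
    """log of prod_{m=1}^{N} (V - m * omega); the 1/N! is omitted."""
    return sum(math.log(V - m * omega) for m in range(1, N + 1))

def log_config_volume_approx(N, V, omega):
    """log of (V - N * omega / 2)^N; the 1/N! is omitted."""
    return N * math.log(V - N * omega / 2.0)

if __name__ == "__main__":
    N, V, omega = 1000, 1.0e6, 1.0  # assumed sample values with N * omega << V
    exact = log_config_volume_exact(N, V, omega)
    approx = log_config_volume_approx(N, V, omega)
    ideal = N * math.log(V)
    print(f"exact  - N log V = {exact - ideal:.6f}")
    print(f"approx - N log V = {approx - ideal:.6f}")
    print(f"-N^2 omega / 2V  = {-N * N * omega / (2.0 * V):.6f}")
```

Both expressions shift $\log V^N$ by close to $-N^2\Omega/2V$, so it is $\Omega/2$, rather than $\Omega$, that plays the role of the volume removed per atom.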

Above we computed the equation of state for the dipole van der Waals interaction with hard core potential. But our expression (2.75) can seemingly be used to compute the equation of state for any potential between atoms. However, there are limitations. Looking back to the integral (2.76), we see that a long-range force of the form $1/r^n$ will only give rise to a convergent integral for $n\geq 4$. This means that the techniques described above do not work for long-range potentials with fall-off $1/r^3$ or slower. This includes the important case of $1/r$ Coulomb interactions.
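The convergence condition follows from a one-line integral. As a sketch, take an attractive tail $U(r)=-C/r^n$ outside the core, where the constant $C$ is introduced here purely for illustration, and work at high temperature so that the exponential can be expanded to first order:

```latex
\begin{align*}
U(r) &= -\frac{C}{r^{n}} \quad (r > r_0), \qquad
f(r) \approx \frac{\beta C}{r^{n}} \quad \text{when } \beta |U| \ll 1 \\
\int_{r_0}^{\infty} d^3r\; f(r) &\approx 4\pi\beta C \int_{r_0}^{\infty} dr\; r^{2-n}
  = \frac{4\pi\beta C}{n-3}\; r_0^{\,3-n} \qquad (n > 3)
\end{align*}
```

For $n=6$ and $C=U_0r_0^6$ this reproduces the attractive contribution $4\pi\beta U_0r_0^3/3$ found above. For $n=3$ the radial integral diverges logarithmically, and for $n<3$ it diverges as a power of the upper limit.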

2.5.3 The Cluster Expansion

Above we computed the leading order correction to the ideal gas law. In terms of the virial expansion (2.69) this corresponds to the second virial coefficient B2. We will now develop the full expansion and explain how to compute the higher virial coefficients.

Let’s go back to equation (2.73) where we first expressed the partition function in terms of f,

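$$\begin{aligned}
Z(N,V,T) &= \frac{1}{N!\,\lambda^{3N}}\int\prod_i d^3r_i\prod_{j>k}(1+f_{jk})\\
&= \frac{1}{N!\,\lambda^{3N}}\int\prod_i d^3r_i\left(1+\sum_{j>k}f_{jk}+\sum_{j>k,\;l>m}f_{jk}f_{lm}+\ldots\right)
\end{aligned} \tag{2.80}$$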

Above we effectively related the second virial coefficient to the term linear in f: this is the essence of the equation of state (2.75). One might think that terms quadratic in f give rise to the third virial coefficient and so on. But, as we’ll now see, the expansion is somewhat more subtle than that.

The expansion in (2.80) includes terms of the form $f_{ij}f_{kl}f_{mn}\ldots$ where the indices denote pairs of atoms, $(i,j)$ and $(k,l)$ and so on. These pairs may have atoms in common or they may all be different. However, the same pair never appears twice in a given term as you may check by going back to the first line in (2.80). We’ll introduce a diagrammatic method to keep track of all the terms in the sum. To each term of the form $f_{ij}f_{kl}f_{mn}\ldots$ we associate a picture using the following rules

  • Draw $N$ atoms. (This gets tedious for $N\sim 10^{23}$ but, as we’ll soon see, we will actually only need pictures with a small subset of atoms).

  • Draw a line between each pair of atoms that appear as indices. So for fijfklfmn, we draw a line between atom i and atom j; a line between atom k and atom l; and so on.

For example, if we have just N=4, we have the following pictures for different terms in the expansion,

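$f_{12}$: a single line joining atoms 1 and 2; $f_{12}f_{34}$: two separate lines, joining 1-2 and 3-4; $f_{12}f_{23}$: a chain 1-2-3 with atom 4 alone; $f_{21}f_{23}f_{31}$: a triangle on atoms 1, 2 and 3 with atom 4 alone.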

We call these diagrams graphs. Each possible graph appears exactly once in the partition function (2.80). In other words, the partition function is a sum over all graphs. We still have to do the integrals over all positions ri. We will denote the integral over graph G to be W[G]. Then the partition function is

$$Z(N,V,T)=\frac{1}{N!\,\lambda^{3N}}\sum_{G}W[G]$$

Nearly all the graphs that we can draw will have disconnected components. For example, those graphs that correspond to just a single fij will have two atoms connected and the remaining N-2 sitting alone. Those graphs that correspond to fijfkl fall into two categories: either they consist of two pairs of atoms (like the second example above) or, if (i,j) shares an atom with (k,l), there are three linked atoms (like the third example above). Importantly, the integral over positions ri then factorises into a product of integrals over the positions of atoms in disconnected components. This is illustrated by an example with N=5 atoms,

$$W\big[\text{triangle }123,\ \text{pair }45\big]=\left(\int d^3r_1\,d^3r_2\,d^3r_3\;f_{12}f_{23}f_{31}\right)\left(\int d^3r_4\,d^3r_5\;f_{45}\right)$$

We call the disconnected components of the graph clusters. If a cluster has $l$ atoms, we will call it an $l$-cluster. The $N=5$ example above has a single 3-cluster and a single 2-cluster. In general, a graph $G$ will split into $m_l$ $l$-clusters. Clearly, we must have

$$\sum_{l=1}^{N}m_l\,l=N \tag{2.81}$$

Of course, for a graph with only a few lines and lots of atoms, nearly all the atoms will be in lonely 1-clusters.

We can now make good on the promise above that we won’t have to draw all $N\sim 10^{23}$ atoms. The key idea is that we can focus on clusters of $l$ atoms. We will organise the expansion in such a way that the $(l+1)$-clusters are less important than the $l$-clusters. To see how this works, let’s focus on 3-clusters for now. There are four different ways that we can have a 3-cluster: the three open chains $f_{12}f_{23}$, $f_{23}f_{31}$ and $f_{31}f_{12}$, and the triangle $f_{12}f_{23}f_{31}$.

Each of these 3-clusters will appear in a graph with any other combination of clusters among the remaining $N-3$ atoms. But since clusters factorise in the partition function, we know that $Z$ must include a factor

$$U_3\equiv\int d^3r_1\,d^3r_2\,d^3r_3\left(f_{12}f_{23}+f_{23}f_{31}+f_{31}f_{12}+f_{12}f_{23}f_{31}\right)$$

$U_3$ contains terms of order $f^2$ and $f^3$. It turns out that this is the correct way to arrange the expansion: not in terms of the number of lines in the diagram, which is equal to the power of $f$, but instead in terms of the number of atoms that they connect. The partition function will similarly contain factors associated to all other $l$-clusters. We define the corresponding integrals as

$$U_l\equiv\int\prod_{i=1}^{l}d^3r_i\sum_{G\in\{l\text{-cluster}\}}G \tag{2.82}$$

Notice that U1 is simply the integral over space, namely U1=V. The full partition function must be a product of Ul’s. The tricky part is to get all the combinatoric factors right to make sure that you count each graph exactly once. The sum over graphs G that appears in the partition function turns out to be

$$\sum_{G}W[G]=N!\sum_{\{m_l\}}\prod_{l}\frac{U_l^{m_l}}{(l!)^{m_l}\,m_l!} \tag{2.83}$$

The product $N!/\prod_l m_l!\,(l!)^{m_l}$ counts the number of ways to split the particles into $m_l$ $l$-clusters, while ignoring the different ways to internally connect each cluster. This is the right thing to do since the different internal connections are taken into account in the integral $U_l$.

Combinatoric arguments are not always transparent. Let’s do a couple of checks to make sure that this is indeed the right answer. Firstly, consider $N=4$ atoms split into two 2-clusters (i.e. $m_2=2$). There are three such diagrams, $f_{12}f_{34}$, $f_{13}f_{24}$ and $f_{14}f_{23}$. Each of these gives the same answer when integrated, namely $U_2^2$, so the final result should be $3U_2^2$. We can check this against the relevant terms in (2.83) which are $4!\,U_2^2/(2!)^2\,2!=3U_2^2$ as expected.

Another check: $N=5$ atoms with $m_2=m_3=1$. All diagrams come in the combinations

$$U_3U_2=\int\prod_{i=1}^{5}d^3r_i\left(f_{12}f_{23}+f_{23}f_{31}+f_{31}f_{12}+f_{12}f_{23}f_{31}\right)f_{45}$$

together with graphs that are related by permutations. The permutations are fully determined by the choice of the two atoms that sit in the pair: there are 10 such choices. The answer should therefore be $10\,U_3U_2$. Comparing to (2.83), we have $5!\,U_3U_2/3!\,2!=10\,U_3U_2$ as required.
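The same two checks, and any other small case, can be done by brute force. The sketch below enumerates every way of splitting $N$ labelled atoms into unordered groups and counts those with the required cluster sizes, then compares with the combinatoric factor in (2.83). The function names are hypothetical and the enumeration is only practical for small $N$.

```python
import math
from collections import Counter

def set_partitions(items):
    """Yield every way to split `items` into unordered, non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` in a block of its own ...
        yield [[first]] + partition
        # ... or add it to one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]

def count_by_brute_force(N, cluster_sizes):
    """Count splittings of N atoms whose block sizes match cluster_sizes."""
    target = Counter(cluster_sizes)
    return sum(
        1
        for partition in set_partitions(list(range(N)))
        if Counter(len(block) for block in partition) == target
    )

def count_by_formula(N, cluster_sizes):
    """N! / prod_l [ (l!)^{m_l} m_l! ], the factor appearing in (2.83)."""
    denominator = 1
    for l, m_l in Counter(cluster_sizes).items():
        denominator *= math.factorial(l) ** m_l * math.factorial(m_l)
    return math.factorial(N) // denominator

if __name__ == "__main__":
    for N, sizes in ((4, [2, 2]), (5, [3, 2]), (6, [2, 2, 1, 1])):
        print(N, sizes, count_by_brute_force(N, sizes), count_by_formula(N, sizes))
```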

Hopefully you are now convinced that (2.83) counts the graphs correctly. The end result for the partition function is therefore

$$Z(N,V,T)=\frac{1}{\lambda^{3N}}\sum_{\{m_l\}}\prod_{l}\frac{U_l^{m_l}}{(l!)^{m_l}\,m_l!}$$

The problem with computing this sum is that we still have to work out the different ways that we can split $N$ atoms into different clusters. In other words, we still have to obey the constraint (2.81). Life would be very much easier if we didn’t have to worry about this. Then we could just sum over any $m_l$, regardless. Thankfully, this is exactly what we can do if we work in the grand canonical ensemble where $N$ is not fixed! The grand canonical partition function is

$$\mathcal{Z}(\mu,V,T)=\sum_{N}e^{\beta\mu N}Z(N,V,T)$$

We define the fugacity as $z=e^{\beta\mu}$. Then we can write

$$\mathcal{Z}(\mu,V,T)=\sum_{N}z^{N}Z(N,V,T)=\sum_{m_l=0}^{\infty}\prod_{l=1}^{\infty}\left(\frac{z}{\lambda^3}\right)^{m_l l}\frac{1}{m_l!}\left(\frac{U_l}{l!}\right)^{m_l}=\prod_{l=1}^{\infty}\exp\left(\frac{U_l\,z^l}{\lambda^{3l}\,l!}\right)$$

One usually defines

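$$b_l=\frac{\lambda^3}{V}\frac{U_l}{l!\,\lambda^{3l}} \tag{2.84}$$

Notice in particular that $U_1=V$ so this definition gives $b_1=1$. Then we can write the grand partition function as

$$\mathcal{Z}(\mu,V,T)=\prod_{l=1}^{\infty}\exp\left(\frac{V}{\lambda^3}b_l z^l\right)=\exp\left(\frac{V}{\lambda^3}\sum_{l=1}^{\infty}b_l z^l\right) \tag{2.85}$$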

Something rather cute happened here. The sum over all diagrams got rewritten as the exponential over the sum of all connected diagrams, meaning all clusters. This is a general lesson which also carries over to quantum field theory where the diagrams in question are Feynman diagrams.
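The rewriting can be verified order by order in $z$. In the sketch below, the coefficient of $z^N$ in $\exp(\sum_l c_l z^l)$ is compared with the constrained sum over $\{m_l\}$ obeying (2.81), where $c_l$ stands in for $U_l/l!\,\lambda^{3l}$. The numerical values of $c_l$ are arbitrary rationals chosen for the test, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def exp_series(c, order):
    """Coefficients e_0..e_order of E(z) = exp(sum_l c[l] z^l).

    Uses E' = S' E, i.e. n e_n = sum_{k=1}^{n} k c_k e_{n-k}.
    """
    e = [Fraction(1)] + [Fraction(0)] * order
    for n in range(1, order + 1):
        e[n] = sum(k * c.get(k, Fraction(0)) * e[n - k] for k in range(1, n + 1)) / n
    return e

def partitions(n, max_part=None):
    """Yield the integer partitions of n as dicts {l: m_l}."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield {}
        return
    for l in range(min(n, max_part), 0, -1):
        for m in range(1, n // l + 1):
            for rest in partitions(n - l * m, l - 1):
                yield {l: m, **rest}

def constrained_sum(c, N):
    """Sum over {m_l} with sum_l l m_l = N of prod_l c_l^{m_l} / m_l!."""
    total = Fraction(0)
    for p in partitions(N):
        term = Fraction(1)
        for l, m in p.items():
            term *= c.get(l, Fraction(0)) ** m / factorial(m)
        total += term
    return total

if __name__ == "__main__":
    c = {1: Fraction(2), 2: Fraction(-1, 3), 3: Fraction(1, 5)}  # arbitrary test values
    e = exp_series(c, 6)
    for N in range(7):
        print(N, e[N], constrained_sum(c, N))
```

Because the arithmetic is done with exact fractions, the two columns agree exactly rather than to rounding error.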

Back to the main plot of our story, we can now compute the pressure

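$$\frac{pV}{k_BT}=\log\mathcal{Z}=\frac{V}{\lambda^3}\sum_{l=1}^{\infty}b_l z^l$$

and the number of particles

$$\frac{N}{V}=\frac{z}{V}\frac{\partial}{\partial z}\log\mathcal{Z}=\frac{1}{\lambda^3}\sum_{l=1}^{\infty}l\,b_l z^l \tag{2.86}$$

Dividing the two gives us the equation of state,

$$\frac{pV}{Nk_BT}=\frac{\sum_l b_l z^l}{\sum_l l\,b_l z^l} \tag{2.87}$$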

The only downside is that the equation of state is expressed in terms of z. To massage it into the form of the virial expansion (2.69), we need to invert (2.86) to get z in terms of the particle density N/V. Equating (2.87) with (2.69) (and defining B1=1), we have

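$$\begin{aligned}
\sum_{l=1}^{\infty}b_l z^l &= \sum_{l=1}^{\infty}B_l\left(\frac{N}{V}\right)^{l-1}\sum_{m=1}^{\infty}m\,b_m z^m\\
&= \sum_{l=1}^{\infty}\frac{B_l}{\lambda^{3(l-1)}}\left(\sum_{n=1}^{\infty}n\,b_n z^n\right)^{l-1}\sum_{m=1}^{\infty}m\,b_m z^m\\
&= \left[1+\frac{B_2}{\lambda^3}\left(z+2b_2z^2+3b_3z^3+\ldots\right)+\frac{B_3}{\lambda^6}\left(z+2b_2z^2+3b_3z^3+\ldots\right)^2+\ldots\right]\\
&\qquad\times\left[z+2b_2z^2+3b_3z^3+\ldots\right]
\end{aligned}$$

where we’ve used both $B_1=1$ and $b_1=1$. Expanding out the left- and right-hand sides to order $z^3$ gives

$$z+b_2z^2+b_3z^3+\ldots=z+\left(\frac{B_2}{\lambda^3}+2b_2\right)z^2+\left(3b_3+\frac{4b_2B_2}{\lambda^3}+\frac{B_3}{\lambda^6}\right)z^3+\ldots$$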

Comparing terms, and recollecting the definitions of bl (2.84) in terms of Ul (2.82) in terms of graphs, we find the second virial coefficient is given by

$$B_2=-\lambda^3b_2=-\frac{U_2}{2V}=-\frac{1}{2V}\int d^3r_1\,d^3r_2\;f(\vec{r}_1-\vec{r}_2)=-\frac{1}{2}\int d^3r\;f(r)$$

which reproduces the result (2.75) that we found earlier using slightly simpler methods. We now also have an expression for the third coefficient,

$$B_3=\lambda^6\left(4b_2^2-2b_3\right)$$

although admittedly we still have a nasty integral to do before we have a concrete result. More importantly, the cluster expansion gives us the technology to perform a systematic perturbation expansion to any order we wish.
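The inversion from fugacity to density can be spot-checked numerically. The sketch below picks made-up values for $b_2$ and $b_3$ (with $b_1=1$ and all higher $b_l$ set to zero purely for the test), computes $p/k_BT$ and $N/V$ from the cluster sums at a small fugacity, and compares with the virial series truncated after $B_2$ and after $B_3$. The function names and all numbers are illustrative assumptions.

```python
def pressure_and_density(z, b, lam=1.0):
    """Return (p / k_B T, N / V) from the cluster sums, with b = {l: b_l}."""
    p_over_kT = sum(b_l * z ** l for l, b_l in b.items()) / lam ** 3
    density = sum(l * b_l * z ** l for l, b_l in b.items()) / lam ** 3
    return p_over_kT, density

def virial_coefficients(b, lam=1.0):
    """B2 = -lam^3 b2 and B3 = lam^6 (4 b2^2 - 2 b3), as derived in the text."""
    B2 = -lam ** 3 * b[2]
    B3 = lam ** 6 * (4.0 * b[2] ** 2 - 2.0 * b[3])
    return B2, B3

if __name__ == "__main__":
    b = {1: 1.0, 2: -0.3, 3: 0.2}  # made-up cluster coefficients for the check
    B2, B3 = virial_coefficients(b)
    p, n = pressure_and_density(1.0e-3, b)
    print(f"error keeping B2 only : {abs(p - (n + B2 * n**2)):.3e}")
    print(f"error keeping B2, B3  : {abs(p - (n + B2 * n**2 + B3 * n**3)):.3e}")
```

Including $B_3$ should reduce the mismatch by roughly one further power of the density, which is what a correct coefficient at that order does.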

2.6 Screening and the Debye-Hückel Model of a Plasma

There are many other applications of the classical statistical methods that we saw in this chapter. Here we use them to derive the important phenomenon of screening. The problem we will consider, which sometimes goes by the name of a “one-component plasma”, is the following: a gas of electrons, each with charge $-q$, moves in a fixed background of uniform positive charge density $+q\rho$. The charge density is such that the overall system is neutral which means that $\rho$ is also the average number density of the electrons. This is the Debye-Hückel model.

In the absence of the background charge density, the interaction between electrons is given by the Coulomb potential

$$U(r)=\frac{q^2}{r}$$

where we’re using units in which $4\pi\epsilon_0=1$. How does the fixed background charge affect the potential between electrons? The clever trick of the Debye-Hückel model is to use statistical methods to figure out the answer to this question. Consider placing one electron at the origin. Let’s try to work out the electrostatic potential $\phi(\vec{r})$ due to this electron. It is not obvious how to do this because $\phi$ will also depend on the positions of all the other electrons. In general we can write,

$$\nabla^2\phi(\vec{r})=-4\pi\left(-q\,\delta(\vec{r})+q\rho-q\rho\,g(\vec{r})\right) \tag{2.88}$$

where the first term on the right-hand side is due to the electron at the origin; the second term is due to the background positive charge density; and the third term is due to the other electrons whose average charge density close to the first electron is ρg(r). The trouble is that we don’t know the function g. If we were sitting at zero temperature, the electrons would try to move apart as much as possible. But at non-zero temperatures, their thermal energy will allow them to approach each other. This is the clue that we need. The energy cost for an electron to approach the origin is, of course, E(r)=-qϕ(r). We will therefore assume that the charge density near the origin is given by the Boltzmann factor,

$$g(\vec{r})\approx e^{\beta q\phi(\vec{r})}$$

For high temperatures, $\beta q\phi\ll 1$, we can write $e^{\beta q\phi}\approx 1+\beta q\phi$ and the Poisson equation (2.88) becomes

$$\left(\nabla^2-\frac{1}{\lambda_D^2}\right)\phi(\vec{r})=4\pi q\,\delta(\vec{r})$$

where $\lambda_D^2=1/4\pi\beta\rho q^2$. This equation has the solution,

$$\phi(\vec{r})=-\frac{q\,e^{-r/\lambda_D}}{r} \tag{2.89}$$

which immediately translates into an effective potential energy between electrons,

$$U_{\rm eff}(r)=\frac{q^2\,e^{-r/\lambda_D}}{r}$$

We now see that the effect of the plasma is to introduce the exponential factor in the numerator, causing the potential to decay very quickly at distances $r>\lambda_D$. This effect is called screening and $\lambda_D$ is known as the Debye screening length. The derivation of (2.89) is self-consistent if we have a large number of electrons within a distance $\lambda_D$ of the origin so that we can happily talk about average charge density. This means that we need $\rho\lambda_D^3\gg 1$.
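As a quick sanity check on (2.89), the sketch below verifies by finite differences that the screened potential satisfies $\nabla^2\phi=\phi/\lambda_D^2$ away from the origin, using the radial form $\nabla^2\phi=\frac{1}{r}\frac{d^2}{dr^2}(r\phi)$ for a spherically symmetric function. It also evaluates the Debye length from its definition. The function names, the step size and the units ($q=\lambda_D=1$) are illustrative assumptions.

```python
import math

def phi(r, q=1.0, lam_D=1.0):
    """Screened potential (2.89): phi = -q exp(-r / lam_D) / r."""
    return -q * math.exp(-r / lam_D) / r

def radial_laplacian(func, r, h=1.0e-3):
    """(1/r) d^2(r func)/dr^2, estimated with central differences."""
    u = lambda x: x * func(x)
    return (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h * r)

def debye_length(beta, rho, q):
    """lam_D = 1 / sqrt(4 pi beta rho q^2), in units with 4 pi epsilon_0 = 1."""
    return 1.0 / math.sqrt(4.0 * math.pi * beta * rho * q * q)

if __name__ == "__main__":
    for r in (0.5, 1.0, 3.0):
        lhs = radial_laplacian(phi, r)
        rhs = phi(r)  # phi / lam_D^2 with lam_D = 1
        print(f"r = {r:3.1f}   laplacian = {lhs:+.6f}   phi/lam_D^2 = {rhs:+.6f}")
```

The check only covers $r>0$. The delta function at the origin is fixed separately, by requiring $\phi\rightarrow -q/r$ as $r\rightarrow 0$.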