Notes on Classical and Quantum Mechanics

Jos Thijssen

February 10, 2005 (560 pages)


Preface

These notes have been developed over several years for use with the courses Classical and Quantum Mechanics A and B, which are part of the third-year applied physics degree programme at Delft University of Technology. Part of these notes stems from courses which I taught at Cardiff University of Wales, UK. These notes are intended to be used alongside standard textbooks. For the classical part, several texts can be used, such as the books by Hand and Finch (Analytical Mechanics, Cambridge University Press, 1999) and Goldstein (Classical Mechanics, third edition, Addison Wesley, 2004), the older book by Corben and Stehle (Classical Mechanics, second edition, Dover, 1994, reprint of the 1960 edition), and the textbook by Kibble and Berkshire (Classical Mechanics, 5th edition, World Scientific, 2004). The part on classical mechanics is more self-contained than the quantum part, although consultation of one or more of the texts mentioned is essential for a thorough understanding of this field. For the quantum mechanics part, we use the book by D. J. Griffiths (Introduction to Quantum Mechanics, second edition, Pearson Education International/Prentice Hall, 2005). This is a very nice, student-friendly text which, however, has two drawbacks. Firstly, the informal way in which the material is covered has led to an inconsistent use of Dirac notation: very often, the wavefunction formalism is used instead of the linear algebra notation. Secondly, the book does not go into modern applications of quantum mechanics, such as quantum cryptography and quantum computing. Hopefully these notes remedy that situation. Other useful books for learning this material are Introductory Quantum Mechanics by Liboff (fourth edition, Addison Wesley, 2004) and Quantum Mechanics by Bransden and Joachain (second edition, Prentice Hall, 2000).
Many more standard texts are available – we finally mention here Quantum Mechanics by Basdevant and Dalibard (Springer, 2002) and, by the same authors, The Quantum Mechanics Solver (Springer, 2000). Finally, the older text by Messiah (North Holland, 1961), the books by Cohen-Tannoudji, Diu and Laloë (2 vols., John Wiley, 1996), by Gasiorowicz (John Wiley, 3rd edition, 2003) and by Merzbacher (John Wiley, 1997) can all be recommended. Not all the material in these notes can be found in standard undergraduate texts. In particular, the chapter on the relation between classical and quantum mechanics and those on quantum cryptography and quantum information theory are not found in all the books listed here, although Liboff's book contains a chapter on the last two subjects. If you want to know more about these new developments, consult Quantum Computing and Quantum Information by Nielsen and Chuang (Cambridge, 2000). Along with these notes, there is a large problem set, which is more essential than the notes themselves. There are many things in life which you can only learn by doing them yourself. Nobody would seriously believe you can master a sport or a musical instrument by reading books. For physics, the situation is exactly the same: you have to learn the subject by doing it yourself – even by failing to solve a difficult problem you learn a lot, since in that situation you start thinking about the structure of the subject. In writing these notes I had numerous discussions with, and advice from, Herre van der Zant and Miriam Blaauboer. I hope the resulting set of notes and problems will help students learn and appreciate the beautiful theory of classical and quantum mechanics.

Contents

Preface  i

1  Introduction: Newtonian mechanics and conservation laws  1
   1.1  Newton's laws  1
   1.2  Systems of point particles – symmetries and conservation laws  3

2  Lagrange and Hamilton formulations of classical mechanics  8
   2.1  Generalised coordinates and virtual displacements  8
   2.2  d'Alembert's principle  10
   2.3  Examples  11
        2.3.1  The pendulum  11
        2.3.2  The block on the inclined plane  12
        2.3.3  Heavy bead on a rotating wire  14
   2.4  d'Alembert's principle in generalised coordinates  15
   2.5  Conservative systems – the mechanical path  16
   2.6  Examples  20
        2.6.1  A system of pulleys  20
        2.6.2  Example: the spinning top  21
   2.7  Non-conservative forces – charged particle in an electromagnetic field  23
        2.7.1  Charged particle in an electromagnetic field  23
   2.8  Hamilton mechanics  24
   2.9  Applications of the Hamiltonian formalism  27
        2.9.1  The three-pulley system  28
        2.9.2  The spinning top  28
        2.9.3  Charged particle in an electromagnetic field  29

3  The two-body problem  30
   3.1  Formulation and analysis of the two-body problem  30
   3.2  Solution of the Kepler problem  33

4  Examples of variational calculus, constraints  35
   4.1  Variational problems  35
   4.2  The brachistochrone  36
   4.3  Fermat's principle  37
   4.4  The minimal area problem  38
   4.5  Constraints  39
        4.5.1  Constraint forces  39
        4.5.2  Global constraints  41

5  From classical to quantum mechanics  45
   5.1  The postulates of quantum mechanics  45
   5.2  Relation with classical mechanics  47
   5.3  The path integral: from classical to quantum mechanics  50
   5.4  The path integral: from quantum mechanics to classical mechanics  53

6  Operator methods for the harmonic oscillator  55
   6.1  Introduction  55
   6.2  The harmonic oscillator  55

7  Angular momentum  60
   7.1  Spectrum of the angular momentum operators  60
   7.2  Orbital angular momentum  62
   7.3  Spin  63
   7.4  Addition of angular momenta  64
   7.5  Angular momentum and rotations  67

8  Introduction to Quantum Cryptography  69
   8.1  Introduction  69
   8.2  The idea of classical encryption  69
   8.3  Quantum Encryption  71

9  Scattering in classical and in quantum mechanics  75
   9.1  Classical analysis of scattering  75
   9.2  Quantum scattering with a spherical potential  78
        9.2.1  Calculation of scattering cross sections  82
        9.2.2  The Born approximation  84

10  Symmetry and conservation laws  87
    10.1  Noether's theorem  87
    10.2  Liouville's theorem  88

11  Systems close to equilibrium  92
    11.1  Introduction  92
    11.2  Analysis of a system close to equilibrium  93
         11.2.1  Example: Double pendulum  95
    11.3  Normal modes  96
    11.4  Vibrational analysis  97
    11.5  The chain of particles  100

12  Density operators – Quantum information theory  103
    12.1  Introduction  103
    12.2  The density operator  103
    12.3  Entanglement  110
    12.4  The EPR paradox and Bell's theorem  112
    12.5  No cloning theorem  114
    12.6  Dense coding  115
    12.7  Quantum computing and Shor's factorisation algorithm  116

Appendix A  Review of Linear Algebra  119
    A.1  Hilbert spaces  119
    A.2  Operators  120

Appendix B  The time-dependent Schrödinger equation  123

Appendix C  Review of the Schrödinger equation in one dimension  125

1  Introduction: Newtonian mechanics and conservation laws

In this lecture course, we shall introduce some mathematical techniques for studying problems in classical mechanics and apply them to several systems. In a previous course, you have already met Newton's laws and some of their applications. In this chapter, we briefly review the basic theory and consider the interpretation of Newton's laws in some detail. Furthermore, we consider conservation laws of classical mechanics which are connected to symmetries of the forces, and derive these conservation laws starting from Newton's laws.

1.1  Newton's laws

The aim of a mechanical theory is to predict the motion of objects. It is convenient to start with point particles, which have no spatial extent. The trajectory of such a point particle is described by its position at each time. Denoting the spatial position vector by r, the trajectory of the particle is given as r(t), a three-dimensional function depending on a one-dimensional coordinate: the time. The velocity is defined as the time derivative of the vector r(t), and by convention it is denoted as \dot{r}(t):

    \dot{\mathbf{r}}(t) = \frac{d}{dt}\mathbf{r}(t),    (1.1)

and the acceleration a is defined as the second derivative of the position vector with respect to time:

    \mathbf{a}(t) = \ddot{\mathbf{r}}(t).    (1.2)

The last concept we must introduce is that of momentum p, defined as

    \mathbf{p} = m\dot{\mathbf{r}}(t),    (1.3)

where m is the mass. Although we have an intuitive idea about the meaning of mass, it is a rather subtle physical concept, as is clear from the frequent confusion of mass with the concept of weight (see below). Now let us state Newton's laws:

1. A body not influenced by any other matter moves at constant velocity.

2. The rate of change of momentum of a body is equal to the force F:

    \frac{d\mathbf{p}}{dt} = \mathbf{F}(\mathbf{r}, t).    (1.4)


Table 1.1: Forces in nature, for various systems. The symbol m_i stands for the mass of point particle i, q_i for the electric charge of particle i; B is a magnetic and E an electric field. G, \varepsilon_0 and g are known constants. The gravitational and the electrostatic forces are directed along the line connecting the two particles i = 1, 2.

    System                                  Force
    Gravity                                 F_G = \frac{G m M}{r^2} \hat{r}_{12}
    Gravity near the earth's surface        F_g = -m g \hat{z}
    Electrostatics                          F_C = \frac{1}{4\pi\varepsilon_0} \frac{q_1 q_2}{r^2} \hat{r}_{12}
    Particle in an electromagnetic field    F_{EM} = q\,(E + \dot{r} \times B)
    Air friction                            F_{fr} = -\gamma \dot{r}

3. When a particle exerts a force F on another particle, the other particle exerts a force on the first which is equal in magnitude but opposite in direction to F – these forces are directed along the line connecting the two particles. Denoting the particles by indices 1 and 2, the force exerted on 1 by 2 by F_{1,2} and the force exerted on 2 by 1 by F_{2,1}, we have:

    \mathbf{F}_{1,2} = -\mathbf{F}_{2,1} = \pm F_{1,2}\, \hat{\mathbf{r}}_{1,2},    (1.5)

where \hat{r}_{1,2} is a unit vector pointing from r_1 to r_2. The \pm denotes whether the force is repulsive (-) or attractive (+).

Some remarks about these laws are in order. It is questionable whether the second law is really a statement, as it introduces a new vector quantity, called 'force', which is not yet defined. Only if we know the force can we predict how a particle will move. In that sense, a real 'law' is only formed by combining Newton's second law with an explicit expression for the force. In table 1.1, the known forces are given for several systems. Note that the force generally depends on the position r, on the velocity \dot{r}, and also explicitly on time (e.g. when an external, time-varying field is present); an implicit dependence on time is further provided by the time dependence of the position vector r(t). In most cases, the mass is taken to be constant, although this is not always true: think of a rocket burning its fuel or disposing of its launching system, or of bodies moving at a speed of the order of the speed of light, where the mass deviates from the rest mass. With constant mass, the second law reads:

    m\ddot{\mathbf{r}}(t) = \mathbf{F}(\mathbf{r}, t).    (1.6)

In fact, the second law disentangles two ingredients of the motion. One is the mass m, a property of the moving particle which is acted upon by the force; the other is the force itself, which arises from some external origin. In the case of gravitational interaction, the force is proportional to the mass, which therefore drops out of the equation of motion. Generally, mass can be described as the resistance to velocity change, as the second law states that the larger the mass, the smaller the change in velocity (for the same force). It is an experimental fact that the mass which enters the expression for the gravitational force is the same universal mass which occurs in the equation of motion for any force. The weight is the gravitational force acting on a body.
Usually, the first law is phrased as follows: 'when there is no force acting on a point particle, the particle moves at constant velocity'. This statement obviously follows from the second law by taking F = 0. The formulation adopted above emphasises that force has a material origin. It is impossible to fulfil the requirements of this law exactly, as gravitational forces are present everywhere in the universe: the first law is an idealisation. The first law is also not obvious from everyday life, where it is never possible to switch friction off completely: in everyday life, motion requires a force in order to be maintained.

The third law is a statement about forces. It turns out that this statement does not hold exactly, as the forces in this statement would have to act simultaneously. In quantum field theory, particles travelling between the interacting particles are held responsible for the interactions, and these particles cannot travel at a speed faster than that of light in vacuum (about 3 · 10^8 m/s). However, for everyday mechanics the third law holds to sufficient precision, unless the moving particles carry a charge and interact through electromagnetic interactions; in that case, the force no longer acts along the line connecting the two particles.

1.2  Systems of point particles – symmetries and conservation laws

Real objects which we describe in mechanics are not point particles, but to a very good approximation they can be considered as large collections of interacting point particles – in this section we consider systems consisting of N point particles. It is possible to disentangle the mutual forces acting between these particles from the external ones. The mutual forces satisfy Newton's third law: for every force F_{i,j}, which is the force exerted by particle j on particle i, the force F_{j,i} is equal in magnitude but opposite in direction to F_{i,j}. For a particle i, we consider all the mutual forces F_{i,j} for j ≠ i – the remaining forces on i must then be due to external sources (i.e., not depending on the other particles in our system), and we lump these forces together in one external force F_i^{Ext}:

    \mathbf{F}_i = \sum_{j=1;\, j \neq i}^{N} \mathbf{F}_{i,j} + \mathbf{F}_i^{Ext}, \qquad i = 1, \ldots, N.    (1.7)

The equations of motion read:

    m_i \ddot{\mathbf{r}}_i = \sum_{j=1;\, j \neq i}^{N} \mathbf{F}_{i,j} + \mathbf{F}_i^{Ext}.    (1.8)

The total momentum of the system is the sum of the momenta of all the particles:

    \mathbf{p} = \sum_{i=1}^{N} \mathbf{p}_i = \sum_{i=1}^{N} m_i \dot{\mathbf{r}}_i.    (1.9)

We can view the total momentum of the system as the momentum of a single particle with a mass equal to the total mass M of the system and position vector r_C. This position vector is then defined through:

    \mathbf{p} = M \dot{\mathbf{r}}_C = \sum_{i=1}^{N} m_i \dot{\mathbf{r}}_i; \qquad M = \sum_{i=1}^{N} m_i.    (1.10)

This is equivalent to

    \mathbf{r}_C = \frac{1}{M} \sum_{i=1}^{N} m_i \mathbf{r}_i    (1.11)

up to an integration constant which is always taken to be zero. The vector r_C is called the centre of mass of the system. A particle of mass M at the centre of mass (which obviously changes in time) represents the same momentum as the total momentum of the system.

Let us find an equation of motion for the centre of mass. We do this by summing Eq. (1.8) over i:

    \sum_{i=1}^{N} m_i \ddot{\mathbf{r}}_i = \sum_{i,j=1;\, i \neq j}^{N} \mathbf{F}_{i,j} + \sum_{i=1}^{N} \mathbf{F}_i^{Ext}.    (1.12)

In the first term on the right hand side, for every term F_{i,j} there will also be a term F_{j,i}, which is equal in magnitude and opposite in direction to F_{i,j}. So the first term vanishes, and we are left with

    \sum_{i=1}^{N} m_i \ddot{\mathbf{r}}_i = \dot{\mathbf{p}} = \sum_{i=1}^{N} \mathbf{F}_i^{Ext} \equiv \mathbf{F}^{Ext}.    (1.13)
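The pairwise cancellation of the mutual forces behind Eq. (1.13) can be checked numerically. The sketch below integrates two particles in one dimension coupled by a spring – an illustrative choice of mutual force obeying the third law – and verifies that the total momentum stays constant; all parameter values are made up for the example.

```python
# Two particles with a mutual force obeying Newton's third law and no
# external force: the total momentum p = m1*v1 + m2*v2 must stay constant.
m1, m2 = 1.0, 3.0          # masses (illustrative)
x1, x2 = 0.0, 2.0          # positions
v1, v2 = 1.0, -0.5         # velocities
k, rest = 4.0, 1.0         # spring constant and rest length of the coupling
dt = 1e-4

p0 = m1 * v1 + m2 * v2     # initial total momentum
for _ in range(20000):
    f12 = k * (x2 - x1 - rest)   # force on particle 1 exerted by particle 2
    v1 += (f12 / m1) * dt
    v2 += (-f12 / m2) * dt       # third law: F_{2,1} = -F_{1,2}
    x1 += v1 * dt
    x2 += v2 * dt

p = m1 * v1 + m2 * v2
assert abs(p - p0) < 1e-9        # momentum conserved up to round-off
```

Each step adds f12·dt to the momentum of particle 1 and subtracts exactly the same amount from particle 2, so conservation holds to machine precision regardless of the integration error in the individual trajectories.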

We see that the centre of mass behaves as a point particle with mass M subject to the total external force acting on the system.

Conservation of physical quantities, such as energy, momentum etcetera, is always the result of some symmetry. This deep relation is borne out in a beautiful theorem, formulated by E. Noether, which we shall consider in the next semester. In this section we shall derive three conservation properties from Newton's laws and the appropriate symmetries. The first symmetry we consider is that of a system of particles experiencing only mutual forces, and no external ones. We then see immediately from Eq. (1.13) with F^{Ext} = 0 that \dot{p} = 0; in other words, the total momentum is conserved.

Conservation of momentum: In a system consisting of interacting particles, not subject to an external force, the total momentum is always conserved.

Next, let us consider the angular momentum L. This is a vector quantity, which for particle i is defined as L_i = r_i × p_i. The total angular momentum L is the sum of the vectors L_i:

    \mathbf{L} = \sum_{i=1}^{N} \mathbf{L}_i.    (1.14)

To see how L varies in time, we calculate the time derivative of L_i:

    \dot{\mathbf{L}}_i = \dot{\mathbf{r}}_i \times \mathbf{p}_i + \mathbf{r}_i \times \dot{\mathbf{p}}_i.    (1.15)

The first term on the right hand side vanishes because p_i is parallel to \dot{r}_i, so we are left with

    \dot{\mathbf{L}}_i = \mathbf{r}_i \times \dot{\mathbf{p}}_i = \mathbf{N}_i;    (1.16)

N_i is the torque acting on particle i. Now we calculate the torque on the total system by summing over i and replacing \dot{p}_i by the force (according to the second law):

    \dot{\mathbf{L}} = \sum_{i=1}^{N} \mathbf{r}_i \times \mathbf{F}_i = \sum_{i=1}^{N} \mathbf{r}_i \times \left( \sum_{j=1;\, j \neq i}^{N} \mathbf{F}_{i,j} + \mathbf{F}_i^{Ext} \right).    (1.17)

The first term on the right hand side vanishes, again as a result of the third law:

    \mathbf{r}_i \times \mathbf{F}_{i,j} + \mathbf{r}_j \times \mathbf{F}_{j,i} = (\mathbf{r}_i - \mathbf{r}_j) \times \mathbf{F}_{i,j} = \mathbf{0},    (1.18)



Figure 1.1: Path from r1 to r2 . The force at some point along the path is shown, together with the contribution to the work of a small segment dr along the path.

where the last equality is a result of the direction of F_{i,j} coinciding with that of the line connecting r_i and r_j (which excludes electromagnetic interactions between moving particles from the discussion). We therefore have

    \dot{\mathbf{L}} = \sum_{i=1}^{N} \mathbf{r}_i \times \mathbf{F}_i^{Ext}.    (1.19)

We see that if the external forces vanish, the angular momentum does not change.

Conservation of angular momentum: In a system consisting of interacting particles (not electromagnetic), not subject to an external force, the angular momentum is always conserved.

Finally, we consider the energy. Let us evaluate the work W done by moving a single particle from r_1 to r_2 along some path Γ (see figure 1.1). This is by definition the inner product of the force and the infinitesimal displacements, summed over the path:

    W = \int_{\Gamma} \mathbf{F} \cdot d\mathbf{r} = \int_{t_1}^{t_2} \mathbf{F} \cdot \dot{\mathbf{r}} \, dt.    (1.20)

Using Newton's second law, we can write:

    W = \int_{t_1}^{t_2} m \ddot{\mathbf{r}} \cdot \dot{\mathbf{r}} \, dt = \int_{t_1}^{t_2} \frac{m}{2} \frac{d}{dt} \dot{\mathbf{r}}^2 \, dt = \frac{m}{2} \left( \dot{\mathbf{r}}_2^2 - \dot{\mathbf{r}}_1^2 \right),    (1.21)

where \dot{r}_1 is the velocity at time t_1, and similarly for \dot{r}_2. We see that it follows from Newton's second law that the work done along the path Γ is equal to the change in the kinetic energy T = m\dot{r}^2/2.

A conservation law can be derived for the case where F is a conservative and time-independent force. This means that F can be written as the negative gradient of some scalar function, called the potential (from vector calculus it is known that a necessary and sufficient condition for this to be possible is that the force is curl-free, i.e. ∇ × F = 0):

    \mathbf{F}(\mathbf{r}) = -\nabla V(\mathbf{r}).    (1.22)

In that case we can write the work in a different way:

    W = -\int_{\Gamma} \nabla V(\mathbf{r}) \cdot d\mathbf{r} = -\int_{t_1}^{t_2} \frac{dV(\mathbf{r})}{dt} \, dt = V(\mathbf{r}_1) - V(\mathbf{r}_2).    (1.23)
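Equation (1.23) states that for a conservative force the work depends only on the endpoints and equals the drop in potential. A small numerical sketch, with an arbitrary made-up potential and path, checks this by evaluating the line integral (1.20) with a midpoint rule:

```python
# Work along a path for F = -grad V, compared with V(r1) - V(r2), Eq. (1.23).
def V(x, y):
    return x * x + 2.0 * y * y        # illustrative potential

def F(x, y):
    return (-2.0 * x, -4.0 * y)       # F = -grad V

# path from r1 = (0, 0) to r2 = (1, 1) along the parabola y = x^2
N = 20000
W = 0.0
for n in range(N):
    sa, sb = n / N, (n + 1) / N       # parameter values of segment endpoints
    xa, ya = sa, sa * sa
    xb, yb = sb, sb * sb
    fx, fy = F((xa + xb) / 2, (ya + yb) / 2)   # midpoint rule for F . dr
    W += fx * (xb - xa) + fy * (yb - ya)

# W = V(r1) - V(r2) = 0 - 3 = -3, independent of the chosen path
assert abs(W - (V(0.0, 0.0) - V(1.0, 1.0))) < 1e-5
```

Replacing the parabola by any other curve between the same endpoints leaves W unchanged, which is exactly the path independence that the potential guarantees.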


From this and from Eq. (1.21) it follows that

    T_1 + V_1 = T_2 + V_2,    (1.24)

where T_1 is the kinetic energy at the point r_1 (or at the time t_1), etcetera. Thus T + V is a conserved quantity, which we call the energy E. Of course, now that we know the expression for the energy, we can verify that it is a conserved quantity by calculating its time derivative, using Newton's second law:

    \dot{E} = m \dot{\mathbf{r}} \cdot \ddot{\mathbf{r}} + \nabla V(\mathbf{r}) \cdot \dot{\mathbf{r}} = \mathbf{F} \cdot \dot{\mathbf{r}} - \mathbf{F} \cdot \dot{\mathbf{r}} = 0.    (1.25)

For a many-particle system, the derivation is similar – the condition on the force is then that there exists a potential function V(r_1, r_2, \ldots, r_N) such that the force F_i on particle i is given by

    \mathbf{F}_i = -\nabla_i V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N).    (1.26)

Note that V depends on 3N coordinates – the gradient ∇_i acting on V gives a three-dimensional vector:

    \nabla_i V = \left( \frac{\partial V}{\partial x_i}, \frac{\partial V}{\partial y_i}, \frac{\partial V}{\partial z_i} \right).    (1.27)

The kinetic energy is the sum of the one-particle kinetic energies. Now energy conservation is derived as follows:

    \dot{E} = \sum_i m_i \dot{\mathbf{r}}_i \cdot \ddot{\mathbf{r}}_i + \sum_i \nabla_i V \cdot \dot{\mathbf{r}}_i = \sum_i \left( \mathbf{F}_i \cdot \dot{\mathbf{r}}_i - \mathbf{F}_i \cdot \dot{\mathbf{r}}_i \right) = 0.    (1.28)
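The conservation of E = T + V expressed by Eq. (1.25) can also be observed numerically. The sketch below integrates a particle in an illustrative time-independent potential V(x) = x^4/4 with the velocity-Verlet scheme, chosen here because it conserves energy well over long runs; the mass, initial condition and step size are arbitrary example values.

```python
# Energy conservation check for m * x'' = -dV/dx with V(x) = x^4 / 4.
m = 2.0
x, v = 1.0, 0.0        # initial condition, illustrative
dt = 1e-3

def force(x):
    return -x**3       # F = -dV/dx

def energy(x, v):
    return 0.5 * m * v * v + 0.25 * x**4   # E = T + V

E0 = energy(x, v)
a = force(x) / m
for _ in range(50000):                 # velocity-Verlet integration
    x += v * dt + 0.5 * a * dt * dt
    a_new = force(x) / m
    v += 0.5 * (a + a_new) * dt
    a = a_new

assert abs(energy(x, v) - E0) < 1e-4   # E drifts only at O(dt^2)
```

If V were given an explicit time dependence, the same check would fail, in line with Eq. (1.29) below: energy conservation hinges on time translation invariance.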

The function V above depends on time only through the time dependence of the arguments r_i. If we consider a charged particle in a time-dependent electric field, this is no longer the case: then t occurs as an additional, explicit argument of V. If V depended explicitly on time, the energy would change at a rate

    \dot{E} = \frac{\partial}{\partial t} V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t),    (1.29)

where the arguments r_i also depend on time (but do not take part in the differentiation with respect to t). If V does not depend explicitly on time, we can define the zero of time (i.e. the time when we set our clock to zero) arbitrarily. This time translation invariance is essential for having conservation of energy. Similarly, the conservation of momentum is related to space translation invariance of the potential, i.e. the potential should not change when we translate all particles over the same vector. Finally, angular momentum conservation is related to rotational symmetry of the potential. In quantum mechanics, all these symmetries lead to the same conserved quantities (or rather their quantum mechanical analogues).

A final remark concerns the evaluation of the kinetic energy of a many-particle system. As we have seen above, the motion of the centre of mass can be split off from the analysis in a suitable way. This procedure also works for the kinetic energy. Let us decompose the position vector r_i of particle i into two parts: the centre of mass position vector r_C and the position relative to the centre of mass, which we call r'_i:

    \mathbf{r}_i = \mathbf{r}_C + \mathbf{r}'_i.    (1.30)

As, by definition, r_C = \sum_i m_i \mathbf{r}_i / M, we have

    \sum_i m_i \mathbf{r}'_i = \sum_i m_i \mathbf{r}_i - M \mathbf{r}_C = \mathbf{0}.    (1.31)

We can use this decomposition to rewrite the kinetic energy:

    T = \sum_i \frac{m_i}{2} \left( \dot{\mathbf{r}}_C + \dot{\mathbf{r}}'_i \right)^2 = \frac{M}{2} \dot{\mathbf{r}}_C^2 + \dot{\mathbf{r}}_C \cdot \sum_i m_i \dot{\mathbf{r}}'_i + \sum_i \frac{m_i}{2} \dot{\mathbf{r}}'^2_i.    (1.32)

The second term vanishes as a result of (1.31), and therefore we have succeeded in writing the kinetic energy of the many-particle system as the kinetic energy of the centre of mass plus the kinetic energy of the relative coordinates:

    T = T_{CM} + \sum_i \frac{m_i}{2} \dot{\mathbf{r}}'^2_i.    (1.33)

This formula is a convenient device for calculating the kinetic energy in many applications.
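The decomposition (1.33) is easy to verify numerically: for random masses and velocities (made up for the example), the total kinetic energy equals the centre-of-mass term plus the relative term, with the cross term of Eq. (1.32) vanishing identically.

```python
import random

# Check T = T_CM + sum_i (m_i / 2) |v_i - v_C|^2, Eq. (1.33), in 2D.
random.seed(1)
masses = [random.uniform(0.5, 2.0) for _ in range(10)]
vels = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(10)]

M = sum(masses)
# centre-of-mass velocity, cf. Eq. (1.10): v_C = (1/M) sum_i m_i v_i
vC = tuple(sum(m * v[k] for m, v in zip(masses, vels)) / M for k in (0, 1))

T = sum(0.5 * m * (v[0]**2 + v[1]**2) for m, v in zip(masses, vels))
T_cm = 0.5 * M * (vC[0]**2 + vC[1]**2)
T_rel = sum(0.5 * m * ((v[0] - vC[0])**2 + (v[1] - vC[1])**2)
            for m, v in zip(masses, vels))

assert abs(T - (T_cm + T_rel)) < 1e-12   # cross term sums to zero
```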

2  Lagrange and Hamilton formulations of classical mechanics

The laws of classical mechanics, formulated by Newton, and the various laws for the forces (see table 1.1) supply sufficient ingredients for predicting the motion of mechanical systems in the classical limit. Working out the solution for particular cases is not always easy, however. In this chapter we shall develop an alternative formulation of the laws of classical mechanics, which renders the analysis of many systems easier than the traditional Newtonian formulation, in particular when the moving particles are subject to constraints. The new formulation will not only enable us to analyse new applications more easily than using Newton's laws, but it also leads to an important example of a variational formulation of a physical theory. Broadly speaking, in a variational formulation, a physical solution is found by minimising a mathematical expression involving a function by varying that function. Many physical theories can be formulated in a variational way, in particular quantum mechanics and electrodynamics.

2.1  Generalised coordinates and virtual displacements

When observing motion in everyday life, we often encounter systems in which the moving particles are subject to constraints. For example, when a car moves on the road, the road surface withholds the car from moving downward, as is the case with the balls on a billiard table. Another example is a particle suspended on a rigid rod (i.e. the pendulum), which can only move on the sphere around the suspension point with radius equal to the rod length. The constraints are realised by forces, which we call the forces of constraint. The forces of constraint guarantee that the constraints are met – they often do not influence the motion within the allowed subspace. (The subspace on which the particle is allowed to move is not necessarily a linear subspace, e.g. the spherical surface in the case of a pendulum; mathematicians would use the term 'submanifold' rather than subspace.) The main object of the next few sections is to show that it is possible to eliminate these constraint forces from the description of the mechanical problem.

As the presence of constraints reduces the actual number of degrees of freedom of the system, it is useful to use a smaller set of degrees of freedom to describe the system. As an example, consider a ball on a billiard table. In that case, the z-coordinate drops out of the description, and we are left with the x and y coordinates only. This is obviously a very simple example, in which one of the Cartesian coordinates is simply left out of the description of the system. More interesting is a ball suspended on a rod. In that case we can use the angular coordinates ϑ and ϕ to describe the system – that is, we replace the coordinates x, y and z by the angles ϕ and ϑ – see figure 2.1. In this case, we see that the coordinates no longer represent distances – they do not have the dimension of length, but are values of angles, and therefore dimensionless. This is the reason why we speak of generalised coordinates.

[Figure 2.1: The pendulum in three dimensions. The position of the mass is described by the two angles ϕ and ϑ.]

These coordinates form a reduced representation of a system subject to constraints. In chapter 2 of the Schaum book you find many examples of constraints and generalised coordinates. Generalised coordinates are denoted by q_j, where j is an index which runs over the degrees of freedom of the constrained system.

We now look at constraints and generalised coordinates from a more formal viewpoint. Let us consider a system consisting of N particles in 3 dimensions, so that the total number of coordinates is 3N. The system is subject to a number of constraints, which are of the form

    g^{(k)}(\mathbf{r}_1, \ldots, \mathbf{r}_N, t) = 0, \qquad k = 1, \ldots, K.    (2.1)

Constraints of this form (i.e. independent of the velocities) are called holonomic. Usually, it is then possible to transform the 3N degrees of freedom to a reduced set of 3N − K generalised coordinates {q} = q_j, j = 1, \ldots, 3N − K. It is then possible to express the position vectors in terms of these new coordinates:

    \mathbf{r}_i = \mathbf{r}_i(\{q\}, t).    (2.2)

As an example, consider the particle suspended on a rod; see figure 2.2. The Cartesian coordinates x, y and z can be written in terms of the generalised coordinates ϑ and ϕ as:

    x = l \sin\vartheta \cos\varphi;    (2.3)
    y = l \sin\vartheta \sin\varphi;    (2.4)
    z = -l \cos\vartheta,    (2.5)

where l is the length of the rod (and therefore fixed). These equations are a particular example of Eqs. (2.2). The velocity can be expressed in terms of the \dot{q}_j:

    \dot{\mathbf{r}}_i = \sum_{j=1}^{3N-K} \frac{\partial \mathbf{r}_i}{\partial q_j} \dot{q}_j + \frac{\partial \mathbf{r}_i}{\partial t}.    (2.6)

From this equation we also find directly

    \frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = \frac{\partial \mathbf{r}_i}{\partial q_j},    (2.7)

a result which will be very useful further on.

Newton's laws predict the evolution of a mechanical system without ambiguity from a given initial state (provided that state is not an unstable point, such as zero velocity at the top of a hill). However, we are sometimes interested in a variation of the path of a system, i.e. a displacement of one or more particles in some direction. Such displacements are called virtual displacements in order to distinguish them from the actual displacement, which is always governed by the Newtonian equations of motion. If we now generalise the definition of work, Eq. (1.20), to include virtual displacements δr_i rather than the mechanical displacements which actually take place, then the work done due to such a displacement is defined as

    \delta W = \sum_{i=1}^{N} \mathbf{F}_i \cdot \delta \mathbf{r}_i.    (2.8)

The notion of virtual work is very important in the following section.

2.2 d'Alembert's principle

We start from Newton's law of motion for an N-particle system:

ṗ_i = m_i r̈_i = F_i,    i = 1, . . . , N.    (2.9)

It is always possible to decompose the total force on a particle into a force of constraint F^C and the remaining force, which we call the applied force F^A:

F = F^C + F^A.    (2.10)

If you consider any system consisting of a single particle (or nonrotating rigid body) subject to constraints, you will find that the forces of constraint are always perpendicular to the space in which the particle is allowed to move. For example, if a particle is attached to a rigid rod which is suspended such that it can rotate freely, the particle can only move on a spherical surface. The force of constraint, which is the tension in the rod, is always normal to that surface. Similarly, the force of the billiard table on the balls is always vertical, i.e. perpendicular to the plane of motion. This notion provides a way to eliminate these forces from the description. Consider an arbitrary but small virtual displacement δr within the subspace allowed by the constraint. Because the force of constraint is perpendicular to this subspace, we have:

ṗ · δr = (F^C + F^A) · δr = F^A · δr.    (2.11)

We see that the force of constraint drops out of the description, and we are left with a motion determined by the applied force only. Because (2.11) holds for every small δr, we have

ṗ = F^A    (2.12)

if we restrict all vectors to be tangential to the constraint subspace. The principle we have formulated in Eq. (2.11) is called d’Alembert’s principle. For systems consisting of a single rigid body, it expresses the fact that the forces of constraint are perpendicular to the subspace of the constraint. The expression F · δ r is the virtual work done as a result of the virtual displacement. It is important to note that the virtual displacements are always considered to be spatial – the time is not changed. This is particularly important in cases where the constraints are time-dependent. In the next section we shall consider an example of this.


Figure 2.2: The pendulum moving in a plane. The rod of length l is rigid and massless, and is suspended without friction.

For more than one object, the contributions to the virtual work must be added, so that we obtain:

∑_i ṗ_i · δr_i = ∑_i F_i^A · δr_i.    (2.13)

In this form, the contributions of the constraint forces to the virtual work do not all vanish for each individual object, but the total virtual work due to the constraint forces vanishes:

∑_i F_i^C · δr_i = 0.    (2.14)

In summary, we can formulate d’Alembert’s principle in the following, concise form: The virtual work due to the forces of constraint is always zero for virtual displacements which do not violate the constraint. The use of d’Alembert’s principle can simplify the analysis of systems subject to constraints, although we often use this principle tacitly in tackling problems in the ‘Newtonian’ approach. In that approach we usually demand that the forces of constraint balance the components of the applied force perpendicular to the constraint subspace. Nevertheless, it is convenient to skip this step, using d’Alembert’s principle, especially in complicated problems (many applied forces and constraints).

2.3 Examples

2.3.1 The pendulum

As a simple example, let us consider a pendulum moving in a plane. This system is shown in figure 2.2. Using Newton's mechanics, we say that the ball of mass m is kept on the circle by the tension in the suspension rod. This tension is directed along the rod, and it precisely compensates the component of the gravitational force along the same line. The component of the gravitational force tangential to


Figure 2.3: Small block on an inclined plane.

the circle of motion determines the motion. The motion is given by r(t) = lϕ(t), where ϕ is the angle shown in the figure. So r̈(t) = l ϕ̈(t), and the equation of motion is

l ϕ̈(t) = −g sin ϕ(t).    (2.15)

Using d'Alembert's principle simplifies the first part of this analysis. We can simply say that the motion is determined by the component of the applied force (i.e. gravity) lying in the subspace of the motion (i.e. the circle), and this leads to the same equation of motion. Although in this simple case the difference between the approaches with and without d'Alembert's principle is minute, in more complicated systems the possibility to avoid analysing the forces of constraint is a real gain.

2.3.2 The block on the inclined plane

Now we consider a more complicated example: that of a block sliding on a wedge. We shall denote the block by SB (small block) and the wedge by IP (inclined plane). The setup is shown in figure 2.3. It consists of the wedge (inclined plane) of mass M, which can move freely (i.e. without friction) over a horizontal table, and the small block of mass m, which can slide over the inclined plane (also frictionless). The aim is to find expressions for the accelerations of IP and SB. The Cartesian unit vectors are x̂ and ŷ; the unit vector along the inclined plane, pointing to the right, is d̂, and the upward normal unit vector to the plane is called n̂. Let us solve this problem using the standard approach. The acceleration of IP is called A, and that of the small block is A + a, i.e. a is the acceleration of the small block with respect to the inclined plane. Newton's second law for the two bodies reads:

M A = −Mg ŷ + F₂ ŷ − F₁ n̂,    (2.16a)
m(A + a) = −mg ŷ + F₁ n̂.    (2.16b)

As we know that the motion of IP is horizontal, we know that all ŷ components of the forces acting on it will cancel, and A is directed along x̂. Similarly, we know that a is zero along n̂. This allows us


to simplify the equations:

M A = −F₁ sin α;    (2.17a)
m(A x̂ + a_k d̂) = −mg ŷ + F₁ n̂,    (2.17b)

where a_k is the component of a directed along d̂. The first of these equations is a scalar equation. The second equation represents in fact two equations, one for the x and one for the y component. We have three unknowns: A, a_k and F₁. Translating d̂ and n̂ into x- and y-components is straightforward, and (2.17b) becomes:

m(A + a_k cos α) = F₁ sin α,    (2.18a)
−m a_k sin α = F₁ cos α − mg.    (2.18b)

Now we can solve for the accelerations by eliminating F₁ from our equations, and we find:

a_k = g (M + m) sin α / (M + m sin²α);    (2.19a)
A = −g m sin α cos α / (M + m sin²α).    (2.19b)

The solution of this problem contains one nontrivial step: the fact that we have split the acceleration of SB into the acceleration of IP plus the acceleration of SB with respect to IP has enabled us to remove the latter's component along n̂. This is not so easy a step when a different representation is used (e.g. when the acceleration is not split into these parts). Now we turn to the solution using d'Alembert's principle:

ṗ_SB · δr_SB + ṗ_IP · δr_IP = F^A_SB · δr_SB + F^A_IP · δr_IP.    (2.20)

We identify two natural coordinates: the coordinate X of the IP along the horizontal direction, and the distance d from the top of the IP to the SB. The corresponding virtual displacements are

δr_SB = δd d̂ + δX x̂    and    δr_IP = δX x̂,    (2.21)

and the total virtual work done as a result of the displacements δX and δd is the sum of the work done on both bodies.

The applied forces are the gravity forces – we do not care about constraint forces any longer – and we find

F^A_IP · δr_IP = 0,    (2.22)

as the displacement is perpendicular to the applied (gravity) force. Furthermore,

F^A_SB · δr_SB = mg sin α δd.    (2.23)

On the other hand,

p_SB = m(Ẋ x̂ + ḋ d̂)    (2.24)

and

p_IP = M Ẋ x̂,    (2.25)

so that ṗ_IP = MA x̂ and ṗ_SB = m(A x̂ + a_k d̂). Taking the time derivatives of (2.24) and (2.25) and using d'Alembert's equation (2.20) for this problem, together with (2.22) and (2.23), we obtain

mA δX + m a_k δd + mA cos α δd + m a_k cos α δX + MA δX = mg sin α δd.    (2.26)

Figure 2.4: Bead on a rotating wire.

As this equation should hold for any pair of virtual displacements δX and δd, the coefficients of both δX and δd should vanish simultaneously, giving the equations:

(m + M)A + m a_k cos α = 0;    (2.27a)
m(a_k + A cos α) = mg sin α.    (2.27b)
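Equations (2.27) are linear in A and a_k, so they are easy to check by machine. A minimal sketch in Python (the function name and sample values are ours, not from the text), solving the system with Cramer's rule:

```python
import math

def wedge_accelerations(m, M, alpha, g=9.81):
    """Solve the linear system (2.27):
         (m + M) A + m cos(alpha) a_k = 0
         m (a_k + A cos(alpha))       = m g sin(alpha)
    for the wedge acceleration A and the relative acceleration a_k."""
    c, s = math.cos(alpha), math.sin(alpha)
    # After dividing the second equation by m, the system reads
    # [[m + M, m*c], [c, 1]] @ [A, a_k] = [0, g*s]; apply Cramer's rule.
    det = (m + M) - m * c * c            # = M + m sin^2(alpha)
    A = -m * c * g * s / det
    a_k = (m + M) * g * s / det
    return A, a_k

m, M, alpha, g = 1.0, 3.0, math.radians(30.0), 9.81
A, a_k = wedge_accelerations(m, M, alpha, g)
denom = M + m * math.sin(alpha) ** 2     # denominator of the closed form (2.19)
```

The values agree with (2.19), and in the limit M → ∞ the wedge stays put (A → 0) while a_k → g sin α, the familiar fixed-incline result.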

Not surprisingly, these equations lead to the same result (2.19) as obtained before. Although the second approach does not seem simpler, it is safer, since the constraint forces do not have to be taken into account explicitly. This manifests itself in the fact that we do not have to eliminate the constraint force F₁ as in the direct approach.

2.3.3 Heavy bead on a rotating wire

In this section, we consider a system with a time-dependent constraint. A bead slides without friction along a straight wire which rotates about a vertical axis, under an angle α (see figure 2.4). The position of the bead along the wire is denoted by q, which is the distance of the bead from the origin. The momentum of the bead is given by

p = mqω sin α t̂ + mq̇ q̂.    (2.28)

It should however be noted that the unit vectors t̂ and q̂ rotate themselves, and hence their time derivatives occur in ṗ. The latter occurs in d'Alembert's equation, in which gravity enters as the applied force F^A. Instead of working out ṗ explicitly, we can use the following trick:

ṗ · δr = (d/dt)(p · δr) − p · δṙ.    (2.29)

At first sight, you might think that the second term on the right hand side is zero, as δr = δq q̂ and δq does not involve any time dependence: virtual displacements are always assumed to be instantaneous. However, even with a time-independent δq, the displacement δr is time-dependent, as the displacement is carried out in a rotating frame. This can also be seen from


the fact that q̂ is time-dependent. In fact, in our system the displacement along the wire will cause a change in the rotational velocity, and it is this velocity change which gives δṙ. If the bead is moved upward, for example, the bead will move along a circle which has a larger radius, but still at the same angular velocity, so that the orbital speed increases. The orbital speed is given as qω sin α, so that we have:

δṙ = ω sin α δq t̂.    (2.30)

As δr is given by δq q̂, we find

ṗ · δr = mq̈ δq − mω² sin²α q δq = F^A · δr = −mg cos α δq,    (2.31)

and we find the equation of motion:

q̈ − ω² sin²α q = −g cos α.    (2.32)

The solution to this equation can be found straightforwardly:

q(t) = q₀ + A e^{Ωt} + B e^{−Ωt},    (2.33)

with q₀ = g cot α/(ω² sin α), A and B arbitrary constants, and Ω = ω sin α. Later we shall encounter more powerful techniques which enable us to solve such a problem more easily.
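That (2.33) indeed solves (2.32) is easy to verify numerically. A small sketch (the parameter values are arbitrary), evaluating the residual q̈ − Ω²q + g cos α with a central-difference second derivative:

```python
import math

def q_exact(t, A, B, omega, alpha, g=9.81):
    """Solution (2.33): q(t) = q0 + A exp(Omega t) + B exp(-Omega t),
    with Omega = omega sin(alpha) and q0 = g cos(alpha) / Omega**2."""
    Om = omega * math.sin(alpha)
    q0 = g * math.cos(alpha) / Om**2     # equals g cot(alpha) / (omega**2 sin(alpha))
    return q0 + A * math.exp(Om * t) + B * math.exp(-Om * t)

omega, alpha, A, B, g = 2.0, math.radians(40.0), 0.3, -0.1, 9.81
Om = omega * math.sin(alpha)
h = 1e-4
residuals = []
for t in (0.0, 0.5, 1.0):
    q = q_exact(t, A, B, omega, alpha, g)
    qdd = (q_exact(t + h, A, B, omega, alpha, g) - 2 * q
           + q_exact(t - h, A, B, omega, alpha, g)) / h**2
    # residual of (2.32); zero up to finite-difference error
    residuals.append(qdd - Om**2 * q + g * math.cos(alpha))
```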

2.4 d'Alembert's principle in generalised coordinates

In the previous section we have encountered a few examples of systems subject to constraints, and analysed them using d'Alembert's principle. In this section we shall do the same for an unspecified system and derive the equations of motion for a general constrained system using d'Alembert's principle. We start from d'Alembert's equation for N objects:

∑_{i=1}^{N} ṗ_i · δr_i = ∑_{i=1}^{N} F_i^A · δr_i.    (2.34)

If we write

δr_i = ∑_{j=1}^{3N−K} (∂r_i/∂q_j) δq_j,    (2.35)

and realise that the q_j can be varied independently, we see that we must have

∑_{i=1}^{N} ṗ_i · (∂r_i/∂q_j) = ∑_{i=1}^{N} F_i^A · (∂r_i/∂q_j).    (2.36)

In order to reformulate this equation, we use a trick similar to the one we applied already to the bead sliding along the wire:

∑_{i=1}^{N} ṗ_i · (∂r_i/∂q_j) = (d/dt) ∑_{i=1}^{N} p_i · (∂r_i/∂q_j) − ∑_{i=1}^{N} p_i · (d/dt)(∂r_i/∂q_j).    (2.37)

We note furthermore that in the second term, the time derivative can be written as

(d/dt)(∂r_i/∂q_j) = ∂ṙ_i/∂q_j.    (2.38)


In section 1.2 we have seen that the work done equals the change in kinetic energy. This suggests that the kinetic energy might be a convenient device for expressing d'Alembert's equation in generalised coordinates. To see that this is indeed the case, we first calculate its derivative with respect to q_j:

∂T/∂q_j = ∑_{i=1}^{N} m_i ṙ_i · (∂ṙ_i/∂q_j).    (2.39)

Similarly:

∂T/∂q̇_j = ∑_{i=1}^{N} m_i ṙ_i · (∂ṙ_i/∂q̇_j) = ∑_{i=1}^{N} p_i · (∂r_i/∂q_j),    (2.40)

where we have used (2.7). We see that the left hand side of d'Alembert's equation leads to

(d/dt)(∂T/∂q̇_j) − ∂T/∂q_j.    (2.41)

Defining

F_j = ∑_{i=1}^{N} F_i^A · (∂r_i/∂q_j),    (2.42)

where F_j is the generalised force, we have the following formulation of d'Alembert's principle in generalised coordinates:

(d/dt)(∂T/∂q̇_j) − ∂T/∂q_j = F_j.    (2.43)

There is no sum over j in this equation because the variations δ q j are arbitrary and independent. It is then possible to obtain the form (2.43) from d’Alembert’s principle by taking only one particular δ q j to be nonzero.
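As an illustration of (2.42) and (2.43), take the planar pendulum with the single generalised coordinate ϕ (the same parametrisation as used for the pendulum elsewhere in this chapter):

```latex
% r = l(\sin\varphi,\,-\cos\varphi), applied force F^A = (0,\,-mg):
F_\varphi = \mathbf{F}^A \cdot \frac{\partial \mathbf{r}}{\partial \varphi}
          = (0,\,-mg) \cdot l(\cos\varphi,\,\sin\varphi) = -mgl\sin\varphi,
\qquad
T = \tfrac12 m l^2 \dot{\varphi}^2,
```

so that (2.43) yields ml²ϕ̈ = −mgl sin ϕ, equivalent to the pendulum equation of motion (2.15) found before.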

2.5 Conservative systems – the mechanical path

Consider now a particle which moves in a constrained subspace under the influence of a potential. As an example you can imagine a non-flat surface on which a ball is moving from r₁ to r₂. If the ball is not forced to obey the laws of mechanics, it can move from r₁ at time t₁ to r₂ at time t₂ along many different paths. Instead of approaching the problem of finding the motion of the ball from a differential point of view, where we update the position and the velocity of a particle at each infinitesimal time step, we consider the path allowed for by the laws of mechanics¹ as a special one among all the available paths from r₁ at t₁ to r₂ at t₂. We thus try to find a condition on the path as a whole rather than for each of its infinitesimal segments. To this end, we start from d'Alembert's principle, and apply it to two paths, r_a(t) and r_b(t), which are close together for all times. The difference between the two paths at some time t between t₁ and t₂ is δr(t) = r_b(t) − r_a(t), and we write down d'Alembert's principle at time t using this δr(t):

m r̈(t) · δr(t) = F · δr(t),    (2.44)

¹ This path is not always, but nearly always, unique.


where it is understood that F is the applied force only, as δr lies in the constrained subspace.² This equation holds for every t between t₁ and t₂, and we can formulate a global condition on the path by integrating over time from t₁ to t₂:

∫_{t₁}^{t₂} m r̈(t) · δr(t) dt = ∫_{t₁}^{t₂} F · δr(t) dt.    (2.45)

The analysis which follows resembles that of the previous chapter, where we derived the conservation property of the energy. Indeed, the right hand side looks like an expression for the work, but it should be kept in mind that δr is not a real displacement of the particle, but a difference between two possible paths. Via partial integration, and using the fact that the begin and end points of the path are fixed, we can transform the left hand side of (2.45):

∫_{t₁}^{t₂} m r̈(t) · δr(t) dt = −∫_{t₁}^{t₂} (∂/∂ṙ)(m ṙ²/2) · δṙ dt = −∫_{t₁}^{t₂} (∂T/∂ṙ) · δṙ dt ≈ −∫_{t₁}^{t₂} [T(ṙ_b) − T(ṙ_a)] dt,    (2.46)

where the approximation holds to first order in δr. The resulting expression is the difference in kinetic energy between the two paths, integrated over time. If we are dealing with a conservative force field, the right hand side of (2.45) can also be transformed into a difference between two global quantities:

∫_{t₁}^{t₂} F · δr(t) dt = −∫_{t₁}^{t₂} ∇V · δr(t) dt ≈ −∫_{t₁}^{t₂} [V(r_b) − V(r_a)] dt.    (2.47)

Combining (2.46) and (2.47) we obtain:

δ ∫_{t₁}^{t₂} (T − V) dt = 0,    (2.48)

in other words, d'Alembert's principle for a conservative force can be transformed into the condition that the linear variation (2.48) vanishes. This global condition distinguishes the mechanical path from all other ones. The quantity T − V is called the Lagrangian, L. The integral of this quantity over time, ∫_{t₁}^{t₂} L dt, is called the action, denoted by S:

S = ∫_{t₁}^{t₂} dt (T − V) = ∫_{t₁}^{t₂} dt L.    (2.49)

We have derived a new principle:

The mechanical path of a particle moving in a conservative potential field from a position r₁ at time t₁ to a position r₂ at t₂ is a stationary solution of the action, i.e. the linear variation of the action with respect to an allowed variation of the path around the mechanical path vanishes.

This principle is called Hamilton's principle. Note that the variations of the path are restricted to lie within the constrained subspace. The advantage of this new formulation of mechanics with conservative force fields over the Newtonian formulation is that it holds for any system subject to constraints, and that it holds independently of the coordinates which are chosen to represent the motion. This is clear from the fact that we search for the minimum of the action within the subspace allowed for by the constraint, and this subspace is properly described by the generalised coordinates q_j. When solving the motion of some particular mechanical system, our task is therefore to properly express T and V in terms of these generalised coordinates, plug the Lagrangian L = T − V into the action, and minimise the latter with respect to the generalised coordinates (which are functions of time). Although this might seem a complicated way of solving a simple problem, it should be realised that the transformation of forces and accelerations to generalised coordinates is usually more complicated than writing the kinetic energy and the potential in terms of these new coordinates. Furthermore, we shall see below that the problem of finding the stationary solution for a given action leads straightforwardly to a second-order differential equation, which is the correct form of the Newtonian equation of motion in terms of the chosen generalised coordinates.

As an example, consider the pendulum. The position of the mass m is given by the 2 coordinates x and y (we neglect the third coordinate z). The constraint obeyed by these coordinates is x² + y² = l². This constraint allows us to use only a single generalised coordinate ϕ: x = l sin ϕ and y = −l cos ϕ. The velocity is given by v_ϕ = l ϕ̇. This example shows that the generalised coordinate q = ϕ does not necessarily have to have the dimension of length, and likewise q̇ = ϕ̇ does not necessarily have the dimension of velocity. The kinetic energy is now given as T = ml²ϕ̇²/2, and the potential energy by V = −mgl cos ϕ. The Lagrangian of the pendulum is therefore

L = T − V = m (l²ϕ̇²/2 + gl cos ϕ).    (2.50)

We now turn to the problem of determining the stationary solution for an action with such a Lagrangian.

² We suppose that the constrained subspace is smooth and that r_a(t) is close to r_b(t) for all t.
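Hamilton's principle can also be made concrete numerically: discretise the action for a particle in uniform gravity (L = mẋ²/2 − mgx), evaluate it on the mechanical path between two fixed endpoints, and compare with slightly perturbed paths. A minimal sketch (all names and parameter values are ours):

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretised action S = sum over segments of (T - V) dt for a 1D path."""
    S = 0.0
    for i in range(len(path) - 1):
        v = (path[i + 1] - path[i]) / dt         # segment velocity
        x = 0.5 * (path[i] + path[i + 1])        # segment midpoint
        S += (0.5 * m * v * v - m * g * x) * dt  # L = T - V
    return S

# The mechanical path x(t) = (g/2) t (T - t) solves x'' = -g with x(0) = x(T) = 0.
T, n, g = 1.0, 1000, 9.81
dt = T / n
t = [i * dt for i in range(n + 1)]
true_path = [0.5 * g * ti * (T - ti) for ti in t]
S_true = action(true_path, dt)

# Perturbing the path (endpoints kept fixed) raises the action:
gains = []
for eps in (-0.1, 0.1):
    pert = [x + eps * math.sin(math.pi * ti / T) for x, ti in zip(true_path, t)]
    gains.append(action(pert, dt) - S_true)
```

The increase equals the second variation ∫ m(δẋ)²/2 dt = ε²π²m/(4T) ≈ 0.0247 for ε = 0.1, confirming that the mechanical path is stationary (here even minimal).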
The Lagrangian can have many different forms, depending on the particular set of generalised coordinates chosen; therefore we shall now work out a general prescription for determining the stationary solution of the action without making any assumptions concerning the form of the Lagrangian, except that it may depend on the q_j and on their time derivatives q̇_j:

S[q] = ∫_{t₁}^{t₂} L(q, q̇, t) dt.    (2.51)

Here q(t) is any vector-valued function, q(t) = (q₁(t), . . . , q_N(t)). We now consider an arbitrary, but small variation δq(t) of the path q(t), and calculate the change in S as a result of this variation:

δS[q] = S[q + δq] − S[q] = ∫_{t₁}^{t₂} L(q + δq, q̇ + δq̇, t) dt − ∫_{t₁}^{t₂} L(q, q̇, t) dt ≈ ∫_{t₁}^{t₂} [ (∂L(q, q̇, t)/∂q) · δq + (∂L(q, q̇, t)/∂q̇) · δq̇ ] dt.    (2.52)

Note that both q and q̇ depend on time. Note further that ∂/∂q is a vector – the derivative must be interpreted as a gradient with respect to all the components of q. The use of ∂ and not d in the derivatives indicates that when calculating the gradient with respect to q, q̇ is considered as a constant, and vice versa. Of course, δq and δq̇ are not independent: if we know q(t) for all t in the interval under consideration, we also know the time derivative q̇. We can remove δq̇ by partial integration:

∫_{t₁}^{t₂} [ (∂L(q, q̇, t)/∂q) · δq + (∂L(q, q̇, t)/∂q̇) · δq̇ ] dt = ∫_{t₁}^{t₂} [ ∂L/∂q − (d/dt)(∂L/∂q̇) ] · δq dt.    (2.53)


Because δq is small but arbitrary, this variation can only vanish when the term in brackets on the right hand side vanishes. Consider for example a δq which is zero except for a very small range of t-values around some t₀ in the interval between t₁ and t₂. Then the term between the square brackets must vanish in that small range. We can do this for any small interval on the time axis, and we conclude that the term in brackets vanishes for all t in the integration interval. So our conclusion reads:

The action S[q] is stationary, that is, its variation with respect to q vanishes to first order, if the following equations are satisfied:

(d/dt)(∂L/∂q̇_j) = ∂L/∂q_j,    for j = 1, . . . , N.    (2.54)

The equations (2.54) are called Euler equations. In the case where L is the Lagrangian of classical mechanics, L = T − V, the equations are called Euler–Lagrange equations (note that in the above derivation, no assumption has been made with respect to the form of L nor what it means – the only assumption is that L depends at most on q, q̇ and t). The Euler equations have many applications outside mechanics. Often the following notation is used:

δL = ∑_{j=1}^{N} [ ∂L/∂q_j − (d/dt)(∂L/∂q̇_j) ] δq_j    (2.55)

and

δL/δq = ∂L/∂q − (d/dt)(∂L/∂q̇),    (2.56)

or, written in another way:

δL/δq_j = ∂L/∂q_j − (d/dt)(∂L/∂q̇_j).    (2.57)

Note that (2.56) is an equality between (N-dimensional) vector quantities. The analysis given here can be summarised by a procedure for solving a mechanical problem in classical mechanics with conservative forces:

• Find a suitable set of coordinates which parametrises the subspace of the motion allowed for by the constraints.
• Express the kinetic energy T and the potential V in those coordinates.
• Write down the Lagrange equations (2.54) for the Lagrangian L = T − V and solve them.

Turning again to our simple example of the pendulum, we use the Lagrangian found in (2.50) and write down the Euler–Lagrange equation for it:

∂L/∂ϕ = −mgl sin ϕ = (d/dt)(∂L/∂ϕ̇) = ml² ϕ̈.    (2.58)

The solution to this equation can be found through numerical integration. In the next section we shall encounter some more complicated examples which show the advantages of the new approach more clearly.
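For instance, writing (2.58) as ϕ̈ = −(g/l) sin ϕ, a standard fourth-order Runge–Kutta integrator does the job; a minimal sketch (step size and initial conditions are arbitrary), using the conserved energy E = ml²ϕ̇²/2 − mgl cos ϕ as a sanity check:

```python
import math

def rk4_step(phi, phidot, dt, g=9.81, l=1.0):
    """One fourth-order Runge-Kutta step for phi'' = -(g/l) sin(phi)."""
    def f(p, pd):
        return pd, -(g / l) * math.sin(p)
    k1p, k1v = f(phi, phidot)
    k2p, k2v = f(phi + 0.5 * dt * k1p, phidot + 0.5 * dt * k1v)
    k3p, k3v = f(phi + 0.5 * dt * k2p, phidot + 0.5 * dt * k2v)
    k4p, k4v = f(phi + dt * k3p, phidot + dt * k3v)
    return (phi + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6,
            phidot + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6)

def energy(phi, phidot, m=1.0, g=9.81, l=1.0):
    """Conserved energy E = T + V of the pendulum."""
    return 0.5 * m * l * l * phidot**2 - m * g * l * math.cos(phi)

phi, phidot = 0.5, 0.0        # released from rest at 0.5 rad
E0 = energy(phi, phidot)
for _ in range(10_000):       # 10 s of motion with dt = 1 ms
    phi, phidot = rk4_step(phi, phidot, 1e-3)
```

The energy drift stays tiny over many oscillation periods, and |ϕ| never exceeds the release angle.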


2.6 Examples

2.6.1 A system of pulleys

We consider a system of massless pulleys as in the figure below.

(Figure: the four string segments l₁, l₂, l₃ and l₄, the masses m_a and m_c suspended on the left and right, and the central pulley of mass m_b.)

The string is also massless and furthermore inextensible. It is quite complicated to find out what the forces on the system are when taking all the forces on the pulleys and on the wire into account. However, it turns out that using Hamilton's principle makes it an easy problem. The total string length is l = l₁ + l₂ + l₃ + l₄ and is fixed. Of course l₂ = l₃. Therefore, we can take l₁ and l₄ as generalised coordinates, and we have:

l₂ = l₃ = (l − l₁ − l₄)/2.    (2.59)

The height of the central pulley is given by l₂ (or l₃), and the total potential energy is therefore given as:

V = −g [ m_a l₁ + (m_b/2)(l − l₁ − l₄) + m_c l₄ ].    (2.60)

The speed of the left mass m_a is given by l̇₁, and that of the right one, m_c, by l̇₄. Using (2.59) we find that the speed of the central pulley is given by (−l̇₁ − l̇₄)/2. The Lagrangian is therefore given as

L = (1/2) m_a l̇₁² + (1/2) m_c l̇₄² + (1/8) m_b (l̇₁² + l̇₄² + 2 l̇₁ l̇₄) + g [ m_a l₁ + (m_b/2)(l − l₁ − l₄) + m_c l₄ ].    (2.61)

The Euler–Lagrange equations can be derived straightforwardly:

(m_a + m_b/4) l̈₁ + (m_b/4) l̈₄ = (m_a − m_b/2) g;    (2.62a)
(m_c + m_b/4) l̈₄ + (m_b/4) l̈₁ = (m_c − m_b/2) g.    (2.62b)


The two equations can be solved for l̈₁ and l̈₄, and the result is

l̈₁ = g (4 m_a m_c + m_a m_b − 3 m_c m_b) / (m_c m_b + 4 m_a m_c + m_a m_b);    (2.63a)
l̈₄ = g (4 m_a m_c + m_c m_b − 3 m_a m_b) / (m_a m_b + 4 m_a m_c + m_b m_c).    (2.63b)
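The algebra leading from (2.62) to (2.63) is easy to cross-check; a minimal sketch (the function name and mass values are ours):

```python
def pulley_accelerations(ma, mb, mc, g=9.81):
    """Solve the coupled Euler-Lagrange equations (2.62),
         (ma + mb/4) l1'' + (mb/4) l4'' = (ma - mb/2) g,
         (mb/4) l1'' + (mc + mb/4) l4'' = (mc - mb/2) g,
    for the accelerations l1'' and l4'' by Cramer's rule."""
    a11, a12, b1 = ma + mb / 4, mb / 4, (ma - mb / 2) * g
    a21, a22, b2 = mb / 4, mc + mb / 4, (mc - mb / 2) * g
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

ma, mb, mc, g = 1.0, 2.5, 1.5, 9.81
l1dd, l4dd = pulley_accelerations(ma, mb, mc, g)
den = mc * mb + 4 * ma * mc + ma * mb   # common denominator of (2.63)
```

The results match (2.63), and for m_b = 2m_a = 2m_c both accelerations vanish, in agreement with the stationary-motion check below.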

To check whether the answer is reasonable, we verify that a stationary motion (i.e. a motion with constant velocity) is possible if m_b = 2m_a = 2m_c. The solution is now trivial, since the right hand sides of (2.63) then vanish, as should indeed be the case. We see that the Lagrange equations provide a framework which enables us to find the equations of motion quite easily.

2.6.2 Example: the spinning top

Consider a top with cylindrical symmetry. The position of the top is defined by its two polar angles ϑ and ϕ, and a third angle, ψ, defines the rotation of the top around its symmetry axis. The angular velocity is given in terms of these three polar angles as:

ω = ϕ̇ ẑ + ϑ̇ ê + ψ̇ d̂,    (2.64)

where ẑ is a unit vector along the z-axis, ê is a unit vector in the xy plane which is perpendicular to the axis of the top, and d̂ is a unit vector along the axis of the top. The axis of the top is shown in the figure:

(Figure: the unit vectors d̂, ê and f̂ and the angles ϑ, ϕ and ψ, drawn relative to the x, y and z axes.)

From this figure, it is clear that

ê = (− sin ϕ, cos ϕ, 0)    (2.65a)

and

d̂ = (cos ϕ sin ϑ, sin ϕ sin ϑ, cos ϑ),    (2.65b)

and it follows that

f̂ = ê × d̂ = (cos ϕ cos ϑ, sin ϕ cos ϑ, − sin ϑ).    (2.66)

The rotational kinetic energy of the top is given by

T = (1/2) ωᵀ I ω    (2.67)


(the superscript T turns the column vector ω into a row vector). It is always possible to find some axes with respect to which the moment of inertia tensor is diagonal; as a result of the axial symmetry of the top, one diagonal element, which we shall denote by I₃, corresponds to the symmetry axis d̂, and the two other diagonal elements correspond to axes in the plane perpendicular to the body axis, such as ê and f̂ – we call these elements I₁. The kinetic energy is then given by

T = (1/2) I₁ (ω · ê)² + (1/2) I₁ (ω · f̂)² + (1/2) I₃ (ω · d̂)² = (1/2) I₁ ϕ̇² sin²ϑ + (1/2) I₁ ϑ̇² + (1/2) I₃ (ψ̇ + ϕ̇ cos ϑ)².    (2.68)
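The reduction from (2.67) to the closed form (2.68) via the unit vectors (2.65)–(2.66) can be verified numerically; a small sketch with arbitrary angles and rates:

```python
import math

def top_kinetic_energy(phi, theta, phidot, thetadot, psidot, I1, I3):
    """Evaluate T = I1 (w.e)^2 / 2 + I1 (w.f)^2 / 2 + I3 (w.d)^2 / 2
    directly from the unit vectors (2.65)-(2.66)."""
    e = (-math.sin(phi), math.cos(phi), 0.0)
    d = (math.cos(phi) * math.sin(theta), math.sin(phi) * math.sin(theta), math.cos(theta))
    f = (e[1] * d[2] - e[2] * d[1],        # f = e x d
         e[2] * d[0] - e[0] * d[2],
         e[0] * d[1] - e[1] * d[0])
    z = (0.0, 0.0, 1.0)
    w = tuple(phidot * z[i] + thetadot * e[i] + psidot * d[i] for i in range(3))
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return 0.5 * I1 * dot(w, e) ** 2 + 0.5 * I1 * dot(w, f) ** 2 + 0.5 * I3 * dot(w, d) ** 2

phi, th, pd, td, sd, I1, I3 = 0.7, 1.1, 0.4, -0.3, 2.5, 1.2, 0.8
T_vec = top_kinetic_energy(phi, th, pd, td, sd, I1, I3)
T_closed = (0.5 * I1 * (pd * math.sin(th)) ** 2 + 0.5 * I1 * td ** 2
            + 0.5 * I3 * (sd + pd * math.cos(th)) ** 2)
```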

The gravitational force results in a potential V = MgR cos ϑ, where M is the top's mass and R the distance from the point where it rests on the ground to the centre of mass. The Lagrangian therefore reads:

L = (1/2) I₁ ϕ̇² sin²ϑ + (1/2) I₁ ϑ̇² + (1/2) I₃ (ψ̇ + ϕ̇ cos ϑ)² − MgR cos ϑ.    (2.69)

The Lagrange equations for ϑ, ϕ and ψ are then given by:

I₁ ϑ̈ = I₁ ϕ̇² sin ϑ cos ϑ − I₃ (ψ̇ + ϕ̇ cos ϑ) ϕ̇ sin ϑ + MgR sin ϑ;    (2.70a)
(d/dt) [ I₁ ϕ̇ sin²ϑ + I₃ (ψ̇ + ϕ̇ cos ϑ) cos ϑ ] = 0;    (2.70b)
(d/dt) [ I₃ (ψ̇ + ϕ̇ cos ϑ) ] = 0.    (2.70c)

We immediately see that ψ̇ + ϕ̇ cos ϑ is a constant of the motion – we shall call this ω₃:

ω₃ = ψ̇ + ϕ̇ cos ϑ = constant.    (2.71)

ω₃ denotes the component of the angular velocity along the spinning axis. Let us search for solutions of constant precession: ϑ = constant, so that ϑ̈ = 0. We furthermore set ϕ̇ = Ω. The first Lagrange equation (2.70a) then gives:

I₁ Ω² cos ϑ − I₃ ω₃ Ω + MgR = 0.    (2.72)

If ω₃ is large, we find the two solutions

Ω = MgR / (I₃ ω₃),    (2.73)

for which Ω is inversely proportional to ω₃, and

Ω = I₃ ω₃ / (I₁ cos ϑ),    (2.74)

(2.75)

2.7. Non-conservative forces – charged particle in an electromagnetic field

2.7

23

Non-conservative forces – charged particle in an electromagnetic field

In this section we consider one particular type of force which is not conservative, but which can still be analysed fully within the Lagrangian approach. This is the very important example of a charged particle in an electromagnetic field. Suppose we have a collection of N particles which experience a non-conservative force which can be derived from a generalised potential W (ri , r˙ i ) in the following way: F=−

∂W d ∂W + . ∂ ri dt ∂ r˙i

(2.76)

Analogous to the previous section we can derive a variational condition, starting from d’Alembert’s principle:  Z t2 Z t2 Z t2  ∂W d ∂W − m¨ri δ ri dt = − δ T dt = + δ ri dt. (2.77) ∂ ri dt ∂ r˙i t1 t1 t1 The left hand side has been transformed as in (2.46), and the procedure for the right hand side is similar with the extension that the second term of the integrand is subject to a partial integration, leading to  Z t2 Z t2 Z t2  ∂W ∂W − δ T dt = δ ri − δ r˙ i dt = − δW dt. (2.78) − ∂ ri ∂ r˙i t1 t1 t1 So we see that the variation of the action Z t2

S[q] =

[T −W ] dt

(2.79)

t1

vanishes. It can also be checked by working out the Euler-Lagrange equations, which for this action directly leads to the classical equation of motion m¨ri = Fi . 2.7.1

Charged particle in an electromagnetic field

A point particle with charge q moving in an electromagnetic field experiences a force F = q (E + v × B) .

(2.80)

The charge q of the particle should not be confused with the generalised coordinates qi introduced before. E is the electric field, B is the magnetic field. These fields are not independent, but they are related through the Maxwell equations. We use the following two Maxwell equations ∇ · B = 0 and ∂B = 0. ∇×E+ ∂t

(2.81a) (2.81b)

We know from vector calculus that a vector field whose divergence is zero, can always be written as the curl of a vector function depending on space (and, in our case, time); applying this to (2.81a) we see that we can write B in the form B = ∇ × A, where A is a vector function, called the vector potential, depending on space and time. Substituting this expression for B in Eq. (2.81b) leads to   ∂A ∇× E+ = 0. (2.82) ∂t

24

Lagrange and Hamilton formulations of classical mechanics

Now we use another result from vector calculus, which says that any function whose curl is zero can be written as the gradient of a scalar function, which in this case we call the potential, φ (r,t). This results in the following representations of the electromagnetic field: E(r,t) = −∇φ (r,t) −

∂A (r,t); ∂t

B(r,t) = ∇ × A(r,t).

(2.83a) (2.83b)

In fact, by using two Maxwell equations, we have reduced the set of 6 field values (3 for E and 3 for B) to 4 (3 for A and 1 for φ ). As the force is velocity-dependent, it is not conservative. We are after a function W (r, r˙ ) which, when used in an action of the usual form, yields the correct equation of motion with the force (2.80). The potential which does the job is W (r, r˙ ) = qφ (r,t) − q˙r · A(r,t) = qφ − q(xA ˙ x + yA ˙ y + z˙Az ).

(2.84)

Note that Ax denotes the x-component and not the partial derivative with respect to x. The Lagrangian occurring in the action is therefore: 1 L = m˙r2 + q˙r · A(r,t) − qφ (r,t). 2

(2.85)

To see that this Lagrangian is indeed correct we work out the force component in the x-direction. First we calculate the derivative of the potential W with respect to x:   ∂ Ay ∂W ∂φ ∂ Ax ∂ Az − = −q + q x˙ + y˙ + z˙ . (2.86) ∂x ∂x ∂x ∂x ∂x Furthermore d dt



∂W ∂ x˙

 = −q

  dAx ∂ Ax ∂ Ax ∂ Ax ∂ Ax = −q + x˙ + y˙ + z˙ . dt ∂t ∂x ∂y ∂z

(2.87)

The Euler-Lagrange equations for the action contain the two contributions resulting from the potential. We have mx¨ = −

d ∂W + ∂x dt



∂W ∂ x˙



  ∂ φ ∂ Ax = −q + + ∂x ∂t      ∂ Ay ∂ Ax ∂ Az ∂ Ax q y˙ − + z˙ − = qEx + q(yB ˙ z − z˙By ) (2.88) ∂x ∂y ∂x ∂z

i.e. precisely the equation of motion with the force given in (2.80)!

2.8

Hamilton mechanics

It is possible to formulate Lagrangian mechanics in a different way. At first sight this does not add anything new to the formalism which was constructed in the previous sections, but we shall see that this new formalism provides us with a conserved quantity which is the energy or some analogous object. More importantly, this formalism is essential for setting up quantum mechanics in a structured way, as will be shown in a later course.


Let us again consider a system described by a Lagrangian formulated in terms of generalised coordinates, with the equations of motion given by:

d/dt ∂L/∂q̇_j = ∂L/∂q_j.  (2.89)

This is a second order differential equation, which we shall transform into two first order ones. We define the canonical momenta p_j as

p_j = ∂L/∂q̇_j.  (2.90)

The canonical momentum should not be confused with the mechanical momentum, which is simply ∑_i m_i ṙ_i, although the two coincide when the generalised coordinates are simply the r_i. Using the canonical momenta, the equations of motion can be formulated as:

ṗ_j = ∂L/∂q_j.  (2.91)

In the particular example of a conservative system formulated in terms of the position coordinates r_i:

L = ∑_{i=1}^N (m_i/2) ṙ_i² − V(r_1, …, r_N),  (2.92)

the momenta are given as

p_i = m_i ṙ_i  (2.93)

and the equations of motion are

ṗ_i = −∂V/∂r_i.  (2.94)

We see that in the case of a particle moving in a conservative force field, the generalised momentum corresponds to the usual definition of momentum. We have reformulated the Euler–Lagrange equation as two first-order differential equations. The Euler–Lagrange equations were derived from a variational principle, Hamilton's principle, which requires the action to be stationary for the mechanical path. We may ask ourselves if it is possible to derive our two new equations from the same variational principle. This turns out to be the case indeed. If a variational principle is to lead to two equations for each generalised coordinate, the corresponding functional should have two independent parameters per generalised coordinate q_j which are varied. Of course, in addition to the generalised coordinate q_j we use p_j as the second coordinate. We know the form of the Lagrangian in terms of q_j and p_j (the parameter q̇_j disappears from the description, as argued above). The problem is that a straightforward application of variational calculus with respect to q_j and p_j is quite intricate. In fact, in order to simplify the derivation of the variational principle, it is useful to introduce a new function, called the Hamiltonian H, depending on the generalised coordinates and momenta, and the time t, as follows:

H(p_j, q_j, t) = ∑_{j=1}^N p_j q̇_j − L[q_j, q̇_j(q_k, p_k), t].  (2.95)

Note that we can indeed express q̇_j in terms of the p_k and q_k, as indicated in the second argument of L, by inversion of Eq. (2.90).¹ Let us calculate the derivatives of H with respect to q_j and p_j:

∂H/∂p_j = q̇_j + ∑_k p_k ∂q̇_k/∂p_j − ∑_k (∂L/∂q̇_k)(∂q̇_k/∂p_j).  (2.96)

Note that it follows from (2.90) that the second and third terms on the right hand side cancel, so that we are left with

∂H/∂p_j = q̇_j.  (2.97)

Now let us calculate the derivative with respect to q_j:

∂H/∂q_j = −∂L/∂q_j + ∑_k p_k ∂q̇_k/∂q_j − ∑_k (∂L/∂q̇_k)(∂q̇_k/∂q_j).  (2.98)

Again using (2.90) we see that the second and third terms on the right hand side cancel; furthermore, by (2.91) the first term on the right hand side is equal to −ṗ_j, and we are left with:

∂H/∂q_j = −ṗ_j.  (2.99)

Eqs. (2.97) and (2.99), together with the definition of the Hamiltonian (2.95) and of the momentum (2.90), are equivalent to the equations of motion. Eqs. (2.97) and (2.99) are called Hamilton's equations. Note that we must consider the generalised coordinates and the canonical momenta as independent coordinates, in contrast to the Lagrange picture, in which q_j and q̇_j are related by

q̇_j = dq_j/dt.

This independence of coordinates and momenta is needed in order to arrive at the correct equations of motion. Only when these equations are solved do we obtain relations between them. It is very important to realise the difference between the formal independence of the coordinates at the level of formulating the Hamiltonian and deriving the equations of motion, and the dependence which is a consequence of the solution of these equations.

If the system does not depend explicitly on time, the Hamiltonian is the analogue of the energy. The simplest case is a conservative system with the positions r_i as coordinates. In that case it is easy to see that

H = ∑_{i=1}^N p_i²/(2m_i) + V(r_1, …, r_N).  (2.100)

More generally, let us consider a conservative system formulated in terms of generalised coordinates q_1, …, q_s. Note the difference with Eq. (2.2), where r_i may contain an explicit time-dependence – in the present case we assume that the constraints have no explicit time-dependence. In that case it is possible to express the position coordinates r_i in terms of the s generalised coordinates q_j, j = 1, …, s:

r_i = r_i(q_1, …, q_s)  (2.101)

1 For this inversion to be possible, the Lagrangian should be convex, but we shall not go into details concerning this point.


and therefore the velocities can be calculated as

ṙ_i = ∑_{j=1}^s (∂r_i/∂q_j) q̇_j.  (2.102)

Therefore, if we formulate the kinetic energy ∑_i ½m_i ṙ_i² in terms of the generalised coordinates, we obtain an expression which is quadratic in the q̇_j:

T = ∑_{k,j=1}^s M_{kj}(q_1, …, q_s) q̇_j q̇_k,  (2.103)

where

M_{jk} = M_{kj} = ∑_{i=1}^N (m_i/2) (∂r_i/∂q_j) · (∂r_i/∂q_k).  (2.104)

If we calculate the contribution to the momenta arising from the kinetic energy, we find that they depend linearly on the q̇_j:

p_j = ∂T/∂q̇_j = ∑_{k=1}^s (M_{jk} + M_{kj}) q̇_k = 2 ∑_{k=1}^s M_{kj} q̇_k.  (2.105)

Hence

∑_{j=1}^s q̇_j p_j = 2T  (2.106)

and

H = 2T − (T − V) = T + V = Energy.  (2.107)

For a general system, Hamilton's equations of motion can be used to derive the time derivative of the Hamiltonian:

dH/dt = ∑_{j=1}^s (∂H/∂q_j) q̇_j + ∑_{j=1}^s (∂H/∂p_j) ṗ_j + ∂H/∂t.  (2.108)

Using Hamilton's equations of motion (2.97) and (2.99) we see that the first two terms on the right hand side cancel, and we are left with:

dH/dt = ∂H/∂t.  (2.109)

We see therefore that if H (or L) does not depend explicitly on time, then H is a conserved quantity. If the potential does not contain a q̇_j-dependence, this implies conservation of energy. If the potential on the other hand does contain such a dependence, then (2.109) implies conservation of some quantity which plays a role more or less equivalent to that of the energy.
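The conservation law (2.109) is easy to observe numerically. Below is a minimal sketch (the harmonic-oscillator Hamiltonian H = p²/2m + kq²/2 and the symplectic Euler integrator are our own illustrative choices, not taken from the text): Hamilton's equations (2.97) and (2.99) are stepped forward and the energy drift stays small.

```python
# H = p^2/(2m) + k q^2/2; Hamilton's equations integrated with the
# symplectic Euler scheme, which keeps the energy error bounded.
m, k = 1.0, 4.0
dt, steps = 1.0e-3, 20000
q, p = 1.0, 0.0
H0 = p*p/(2*m) + 0.5*k*q*q
for _ in range(steps):
    p += -k*q*dt         # pdot = -dH/dq, Eq. (2.99)
    q += p/m*dt          # qdot =  dH/dp, Eq. (2.97)
H1 = p*p/(2*m) + 0.5*k*q*q
drift = abs(H1 - H0)/H0  # relative energy drift after 20 time units
```

With a non-symplectic scheme (plain forward Euler) the same experiment shows a steadily growing energy, which is a useful contrast when trying this out.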

2.9 Applications of the Hamiltonian formalism

In this section we shall reconsider the systems studied before in the Lagrange framework, and point out which features are different when these systems are considered within the Hamiltonian framework. From the derivation of the Hamiltonian and Hamilton's equations, it is seen that the latter can be viewed as a different way of writing Lagrange's equations. The reason for introducing the Hamiltonian and Hamilton's equations is that they are often used in quantum mechanics, and that the Hamiltonian formalism is more convenient for discovering some conserved quantities.

2.9.1 The three-pulley system

From the Lagrangian (2.61), the momenta p1 and p4 associated with the degrees of freedom l1 and l4 are found as:

p1 = (ma + mb/4) l̇1 + (mb/4) l̇4;  (2.110a)
p4 = (mc + mb/4) l̇4 + (mb/4) l̇1.  (2.110b)

After some calculation, we therefore find for the Hamiltonian:

H = (1/2Δ)[mc p1² + ma p4² + (mb/4)(p1 − p4)²] − g[ma l1 + (mb/2)(l − l1 − l4) + mc l4],  (2.111)

with

Δ = (ma + mc) mb/4 + ma mc.  (2.112)

The Hamilton equations read:

ṗ1 = (ma − mb/2) g;  (2.113a)
ṗ4 = (mc − mb/2) g.  (2.113b)

The solution is simple since the right hand sides of these equations are constants:

p1 = (ma − mb/2) g t;  (2.114a)
p4 = (mc − mb/2) g t,  (2.114b)

where the initial conditions are that the system is standing still at t = 0. Together with Eqs. (2.110), we obtain the same solution as in the Lagrangian case. We see that the differences between the two approaches are not very dramatic in this case. Note that it is now easy to see that for ma = mc = mb/2 the system is in equilibrium.

2.9.2 The spinning top

From the Lagrangian, we can derive the momenta associated with the three degrees of freedom ϕ, ϑ and ψ:

pϕ = I1 ϕ̇ sin²ϑ + I3 (ψ̇ + ϕ̇ cos ϑ) cos ϑ;  (2.115a)
pϑ = I1 ϑ̇;  (2.115b)
pψ = I3 (ψ̇ + ϕ̇ cos ϑ).  (2.115c)

If we want to express the kinetic energy in terms of these momenta, we need to solve for the time derivatives of the angular coordinates ϑ, ϕ and ψ in terms of these momenta:

ϕ̇ = (pϕ − pψ cos ϑ)/(I1 sin²ϑ);  (2.116a)
ϑ̇ = pϑ/I1;  (2.116b)
ψ̇ = pψ/I3 − cos ϑ (pϕ − pψ cos ϑ)/(I1 sin²ϑ).  (2.116c)


After some calculation, the Hamiltonian is then found to be

H = pϑ²/(2I1) + pψ²/(2I3) + (pϕ − pψ cos ϑ)²/(2I1 sin²ϑ) + Mgr cos ϑ.  (2.117)

As the Hamiltonian does not depend on ψ and ϕ, we see immediately that pψ and pϕ must be constant. Coordinates which do not occur in the Hamiltonian, although their momenta do, are called ignorable: these momenta are constant in time – they represent constants of the motion. We have seen that both pψ and pϕ are constants of the motion. The Hamiltonian now reduces to a simple form:

H = pϑ²/(2I1) + U(ϑ),  (2.118)

where

U(ϑ) = pψ²/(2I3) + (pϕ − pψ cos ϑ)²/(2I1 sin²ϑ) + Mgr cos ϑ.  (2.119)

The Hamilton equations yield

−I1 ϑ̈ = −ṗϑ = dU/dϑ.  (2.120)

This equation is difficult to solve analytically. Note that apart from the ignorable coordinates, we have an additional constant of the motion, the energy:

pϑ²/(2I1) + U(ϑ) = E = constant.  (2.121)
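The reduced one-dimensional ϑ-motion is easy to explore numerically. The sketch below (parameter values, the symplectic Euler scheme and the finite-difference derivative of U are our own illustrative choices) integrates ϑ in the effective potential U(ϑ) of Eq. (2.119) and checks that the energy (2.121) is conserved:

```python
import math

# theta moves in the effective potential U(theta) of Eq. (2.119);
# parameter values are arbitrary illustrative choices.
I1, I3, Mgr = 1.0, 0.5, 1.0
p_phi, p_psi = 0.8, 1.2

def U(th):
    return (p_psi**2/(2*I3)
            + (p_phi - p_psi*math.cos(th))**2/(2*I1*math.sin(th)**2)
            + Mgr*math.cos(th))

def dU(th, h=1e-6):              # central-difference derivative of U
    return (U(th + h) - U(th - h))/(2*h)

th, p_th = 1.0, 0.0              # released from rest at theta = 1 rad
E0 = p_th**2/(2*I1) + U(th)
dt = 1e-4
for _ in range(50000):           # symplectic Euler on (theta, p_theta)
    p_th -= dU(th)*dt            # p_theta' = -dU/dtheta, Eq. (2.120)
    th += p_th/I1*dt             # theta'   =  p_theta/I1
E1 = p_th**2/(2*I1) + U(th)
```

Since (pϕ − pψ cos ϑ)² does not vanish at ϑ = 0 or π for these parameters, U diverges at both ends and ϑ nutates back and forth inside a potential well, as the worksheet analysis will show.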

The motion and its analysis will be considered in a worksheet.

2.9.3 Charged particle in an electromagnetic field

Finally we consider again the charged particle in an electromagnetic field. The momentum can be found as usual from the Lagrangian – we obtain

p = m ṙ + qA.  (2.122)

The Hamiltonian is

H = p · ṙ − ½m ṙ² − q ṙ · A + qφ = ½m ṙ² + qφ(r) = (p − qA)²/(2m) + qφ(r).  (2.123)

You might already know that this Hamiltonian is used in quantum mechanics for a particle in an electromagnetic field.
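The algebra behind (2.123) can be checked symbolically. A small sketch (assuming SymPy is available; the component symbols are introduced only for this check):

```python
import sympy as sp

# Legendre transform of the Lagrangian (2.85): with p = m rdot + q A,
# H = p . rdot - L should reduce to (p - q A)^2/(2m) + q phi.
m, q, phi = sp.symbols('m q phi')
rdot = sp.Matrix(sp.symbols('vx vy vz'))
A = sp.Matrix(sp.symbols('Ax Ay Az'))
L = m/2*rdot.dot(rdot) + q*rdot.dot(A) - q*phi
p = sp.Matrix([sp.diff(L, v) for v in rdot])   # canonical momentum (2.122)
H = sp.expand(p.dot(rdot) - L)
H_closed = sp.expand((p - q*A).dot(p - q*A)/(2*m) + q*phi)
```

Both expressions expand to ½m ṙ² + qφ, confirming the middle equality in (2.123) as well.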

3 The two-body problem

In this chapter we consider the two-body problem within the framework of Lagrangian mechanics. One of the most impressive results of classical mechanics is the correct description of the planetary motion around the sun, which is equivalent to electric charges moving in each other's field. With the analytic solution of this problem, we shall recover the famous Kepler laws. The problem is also important in quantum mechanics: the hydrogen atom is a quantum version of the Kepler problem.

3.1 Formulation and analysis of the two-body problem

The two-body problem describes two point particles with masses m1 and m2. We denote their positions by r1 and r2 respectively, and their relative position, r2 − r1, by r. Finding the Lagrangian is quite simple. The kinetic energy is the sum of the kinetic energies of the two particles, and the potential energy is the interaction, which depends only on the separation r = |r| of the two particles and is directed along the line connecting them (note that this last restriction excludes magnetic interactions). We therefore have:

L = (m1/2) ṙ1² + (m2/2) ṙ2² − V(r).  (3.1)

Before deriving the equations of motion, we note that instead of writing the kinetic energy as the sum of the kinetic energies of the two particles, it can also be separated into the kinetic energy of the centre of mass and that of the relative motion, as in Eq. (1.33):

T = T_CM + ∑_{i=1}^2 (m_i/2) ṙ_i′²,  (3.2)

where

r_i′ = r_i − r_CM;  (3.3)
r_CM = (m1 r1 + m2 r2)/M,  (3.4)

and

T_CM = (M/2) ṙ_CM²,  M = m1 + m2.  (3.5)

As there are only two particles, we can work out the coordinates r_i′ relative to the centre of mass r_CM, and we find, using Eq. (3.4):

r1′ = r1 − r_CM = (m2/M)(r1 − r2)  (3.6)

and

r2′ = r2 − r_CM = (m1/M)(r2 − r1).  (3.7)

We can take time derivatives by simply putting a dot on each r in these equations and then, after some calculation, we find for the kinetic energy:

T = (M/2) ṙ_CM² + (m1 m2/2M) ṙ².  (3.8)

The Lagrangian is therefore

L = T − V = (M/2) ṙ_CM² + (m1 m2/2M) ṙ² − V(r).  (3.9)

We see that the kinetic energy of the relative motion has the form of the kinetic energy of a single particle of mass m1 m2/(m1 + m2) and position vector r(t). The mass µ = m1 m2/(m1 + m2) is called the reduced mass. Of course we could write down the Euler–Lagrange equations for this Lagrangian as before, but it is convenient to perform a further separation: that of the kinetic energy of the relative coordinate into a radial and a tangential part. First we must realise that the plane through the origin and the initial velocity vector ṙ of the relative position will always remain the plane of the motion, as the force acts only within that plane. In this plane, we choose an x and a y axis. Then we can conveniently introduce polar coordinates r and ϕ, in which the x and y coordinates can be expressed as follows:

x = r cos ϕ;  (3.10)
y = r sin ϕ.  (3.11)

It then immediately follows that the kinetic energy of the relative motion can be rewritten as

(µ/2)(ẋ² + ẏ²) = (µ/2)(ṙ² + r²ϕ̇²).  (3.12)

The Lagrange equations given in Eq. (2.54) then take the form:

M r̈_CM = 0;  (3.13a)
d/dt (µr²ϕ̇) = 0;  (3.13b)
µ r̈ − µrϕ̇² = −dV(r)/dr.  (3.13c)

The first equation tells us that the centre of mass moves at constant velocity: it does not feel a net force. This follows from the fact that it does not appear in the potential, and is in accordance with the conservation of total momentum in the absence of external forces. Coordinates such as r_CM, with constant canonical momentum, are called ignorable – see section 2.9.2. The second and third equations do not depend on r_CM – therefore we see that the relative motion can be entirely understood in terms of a single particle of mass µ moving in a plane under the influence of a potential V(r).

We now use the second equation to eliminate ϕ̇. First note that the term in brackets occurring in this equation must be a constant – we call this constant ℓ (ℓ = µr²ϕ̇ is precisely the angular momentum, and we see that it is conserved); the third equation then transforms into

µ r̈ − ℓ²/(µr³) = −dV(r)/dr.  (3.14)

Note that this equation can be viewed as that of a one-dimensional particle subject to a force F = ℓ²/(µr³) − dV(r)/dr. Such a force can in turn be derived from a conservative potential:

F(r) = −(d/dr) V_Eff(r) = −(d/dr)[ℓ²/(2µr²) + V(r)].  (3.15)

Figure 3.1: Effective potential for a two-particle system.

The subscript Eff is used to distinguish this 'effective' potential from the original, bare attraction potential V(r). The potential V_Eff is represented in figure 3.1 for the case V(r) = −1/r.

From Eqs. (3.13b) and (3.13c) and from figure 3.1, we can infer the qualitative behaviour of the motion. We have seen [Eq. (3.13b)] that the angular momentum is constant. This implies that the motion will always keep the same orientation (i.e. clockwise or anti-clockwise). If the particles move apart, the speed at which they orbit around each other decreases (since increasing r implies decreasing rϕ̇).

The motion in the radial direction can be understood qualitatively as follows. Note that we can interpret Eq. (3.13c) as describing the motion of a particle in one dimension. This particle has a mechanical energy which is the sum of its kinetic energy and the effective potential, and this energy should remain constant. Furthermore, the energy cannot be lower than the lowest value of the effective potential shown in figure 3.1. If it lies between this value and 0, then r will vary between some minimum and maximum value, as shown in this figure. If, on the other hand, E is positive, r will vary between some minimum value and infinity.

We have seen that the r-component of the two-body motion can be described in terms of a single particle in one dimension. The energy of this particle is the sum of its kinetic and potential energy – the latter is the effective potential [see Eq. (3.15)]. It turns out that this energy is equal to the total energy of the two-particle system (neglecting the contribution of the centre of mass motion to the latter). As we have already worked out the kinetic and potential energy of the two-body problem above, we immediately see that

E = T + V = (µ/2)(ṙ² + r²ϕ̇²) + V(r),  (3.16)

which can easily be identified as the kinetic energy µṙ²/2 of the one-dimensional particle plus the effective potential.
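The turning points r_min and r_max of figure 3.1 can be computed directly from the effective potential. A minimal numerical sketch (the values of µ, ℓ and E are arbitrary illustrative choices; V(r) = −1/r as in the figure), solving E = V_Eff(r), which for this potential is a quadratic equation:

```python
import math

# Bound motion (E < 0) in V(r) = -1/r: the turning points solve
# E = ell^2/(2 mu r^2) - 1/r, i.e. E r^2 + r - ell^2/(2 mu) = 0.
mu, ell, E = 1.0, 1.0, -0.3

a, b, c = E, 1.0, -ell**2/(2*mu)
disc = math.sqrt(b*b - 4*a*c)
r_min, r_max = sorted(((-b + disc)/(2*a), (-b - disc)/(2*a)))

def V_eff(r):
    return ell**2/(2*mu*r**2) - 1/r
```

For E ≥ 0 the same quadratic has at most one positive root, which is the single minimum distance of the unbound motion.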

3.2 Solution of the Kepler problem

The special case V(r) = −A/r is very important, as it describes the gravitational and the Coulomb attraction. Also, in this special case, the motion can be studied further by analytical means. Finding the solution in the form r(t), ϕ(t) is not convenient – rather, we search for r(ϕ), which contains explicit information about the shape of the orbit. We use the fact that the angular momentum ℓ = µr²ϕ̇ is constant and combine this with the fact that the energy is constant and given by (3.16):

ϕ̇ = ℓ/(µr²);  (3.17)
ṙ² = (2/µ)(E − V) − ℓ²/(µ²r²).  (3.18)

Eliminating the dt of the time derivatives by dividing (3.17) by the square root of (3.18) leads to

dϕ/dr = ±ℓ / (r² [2µ(E − V(r)) − ℓ²/r²]^(1/2)).  (3.19)

With V(r) = −A/r this can directly be integrated to give

ϕ − C = arcsin[(µAr − ℓ²)/(ε µAr)].  (3.20)

In addition to the integration constant C on the left hand side, we see a constant ε, called the eccentricity, which is given in terms of the problem parameters as

ε = √(1 + 2Eℓ²/(µA²)).  (3.21)

Inverting Eq. (3.20) to find r as a function of the polar angle ϕ gives:

r = ℓ² / (µA [1 − ε sin(ϕ − C)]).  (3.22)

We have some freedom in choosing C – it changes the definition of the angle ϕ. If we take ϕ = 0 as the angle for which the two particles are closest (perihelion), we see that C = π/2. The motion can now be classified according to the value of ε. We take ε positive – changing the sign of ε does not change the shape of the orbit (putting ϕ → ϕ + π compensates this sign change). For ε = 0, r does not depend on ϕ. This corresponds to a circle. If 0 < ε < 1, we have an ellipse (r varies between some maximum and minimum value). For ε = 1, we have a parabola (r → ∞ for ϕ = π), and for ε > 1 we have a hyperbola (r → ∞ for cos ϕ = −1/ε). Usually, the notation

λ = (ℓ²/µA) · 1/(1 + ε)  (3.23)

is used, so that the equation relating the two polar coordinates on the curve of the motion reads:

r = λ(1 + ε)/(1 + ε cos ϕ).  (3.24)

Figure 3.2: Ellipse with various parameters indicated.

In figure 3.2, we indicate the semi-major and semi-minor axes a and b respectively, and the focal points. The semi-major axis can be related to the parameters we use to represent the motion:

a = λ/(1 − ε).  (3.25)

The area of an ellipse in terms of its semi-major axis is πa²√(1 − ε²). This can be related to the angular momentum by realising that the infinitesimal area swept by a line from the origin to the point of the motion is given by r²dϕ/2. This tells us that the rate at which this area changes is given as ℓ/(2µ), so that the total area, which is swept in one revolution of period T, is equal to Tℓ/(2µ), and we have:

Tℓ/(2µ) = πa²√(1 − ε²).  (3.26)

The quantities a and ε are not independent – remember a = λ/(1 − ε); furthermore, ℓ is related to λ and ε [see Eq. (3.23)]. Using this to eliminate ε finally leads to

T² = (4π²µ/A) a³.  (3.27)

We have now recovered all three laws of Kepler:

• All planets move around the sun in elliptical paths. In fact, most planets have eccentricities very close to zero.
• A line drawn from the sun to a planet sweeps out equal areas in equal times. The rate at which this area increases is given by ℓ/(2µ), as we have seen above.
• The squares of the periods of revolution of the planets about the sun are proportional to the cubes of the semi-major axes of the ellipses. See Eq. (3.27).
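The third law can be verified by direct numerical integration of the reduced equation of motion µr̈ = −A r/r³. The sketch below (initial conditions, step size and the velocity-Verlet integrator are our own illustrative choices) measures one orbital period and compares it with Eq. (3.27):

```python
import math

# Integrate the reduced two-body motion for V(r) = -A/r and compare the
# measured period with Kepler's third law, T^2 = 4 pi^2 mu a^3 / A.
mu, A = 1.0, 1.0
x, y = 1.0, 0.0            # start on the x-axis at r = 1
vx, vy = 0.0, 1.1          # tangential start, below escape speed

Eorb = 0.5*mu*(vx*vx + vy*vy) - A/math.hypot(x, y)
a_axis = -A/(2*Eorb)                               # semi-major axis
T_kepler = 2*math.pi*math.sqrt(mu*a_axis**3/A)     # Eq. (3.27)

dt, t, prev_y = 1e-4, 0.0, y
T_measured = None
for _ in range(200000):                            # velocity Verlet
    r3 = math.hypot(x, y)**3
    vx += 0.5*(-A*x/(mu*r3))*dt
    vy += 0.5*(-A*y/(mu*r3))*dt
    x += vx*dt
    y += vy*dt
    r3 = math.hypot(x, y)**3
    vx += 0.5*(-A*x/(mu*r3))*dt
    vy += 0.5*(-A*y/(mu*r3))*dt
    t += dt
    if prev_y < 0.0 <= y and x > 0:                # one full revolution
        T_measured = t
        break
    prev_y = y
```

The upward crossing of the x-axis with x > 0 occurs only once per revolution for a bound orbit started this way, so the detected crossing time is the period.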

4 Examples of variational calculus, constraints

4.1 Variational problems

In the previous chapters, we have considered a reformulation of classical mechanics in terms of a variational principle. This will lead the way to formulating quantum mechanics – this is the subject of the next chapter. In this chapter we make an excursion which is still in the field of classical problems, though not classical dynamics as in the previous chapters. In fact, variational calculus is not only useful for mechanics. Many physical problems which occur in everyday life can be formulated as variational problems. In the next sections we shall consider a few examples.

We shall first introduce some further analysis concerning the problems we are about to treat in this chapter. Consider an expression of the form

J = ∫ dx F(y, y′, x).  (4.1)

We have used a notation which differs from that used in previous chapters in order to emphasise that J is not always the action and F not always the Lagrangian. J assigns a real value to every function y(x) – it is called a functional. There is a whole branch of mathematics, called functional analysis, dedicated to such objects. Here we shall only consider finding the stationary solutions (minima, maxima or saddle points) of J; they are given as the solutions to the Euler equations

∂F/∂y − (d/dx) ∂F/∂y′ = 0.  (4.2)

In the case where F does not depend explicitly on x, i.e.

F = F(y, y′),  (4.3)

we can directly integrate the Euler equation(s) once: by multiplying the Euler equation by y′ we find

(d/dx)[F(y, y′) − y′ ∂F(y, y′)/∂y′] = 0.  (4.4)

From this it follows that the solution must obey

F(y, y′) − y′ ∂F(y, y′)/∂y′ = Constant.  (4.5)

This is a first order differential equation: we have integrated the second order Euler equation once.


4.2 The brachistochrone

Near the end of the 17th century, Johann Bernoulli studied a problem which we may formulate as follows. Suppose you are to design a monorail in an amusement park. There is a track in your monorail where the trolleys, which arrive at some high point A with low (approximately zero) speed, should move to another place B under the influence of gravity (no motor is used and friction is neglected) in the shortest possible time. The problem is to design the shape of the track in order to achieve this goal. It will be clear that the track lies in a plane.

Let us first consider the possible solutions heuristically. One could argue that a straight line would be the best solution because it is the shortest path between A and B. On the other hand, it would seem favourable to increase the particle's velocity as much as possible in the beginning of the motion. This would call for a steep slope near the starting point A, followed by a more or less horizontal path to B, but the resulting curve is considerably longer than the straight line. We must therefore find the optimum between the shortness of the path and an early increase of the velocity due to a steeper slope.

We can solve this problem using the techniques of the previous section. We must minimise the time for a curve which can be parametrised as x(s), y(s). Obviously, there are many ways to parametrise a curve – we shall use for s the distance along the curve, measured from the point A. The infinitesimal segment ds is given by

ds = √(dx² + dy²) = dx √(1 + (dy/dx)²) = dx √(1 + y′²).  (4.6)

s can be expressed as a function of t – the relation between the two is given by

ds = v dt,  (4.7)

where v = √(vx² + vy²) is the particle speed. The time needed to go from A to B is given by

t = ∫_A^B ds/v.  (4.8)

We need an equation for v in terms of the path. As the gravitational force is responsible for the change in velocity, it is useful to consider the x- and y-components of the path. In fact, we have the following relation between v and y as a result of conservation of energy:

½ v² = gy,  (4.9)

where the height y is measured downward from the point A. This means that we have put in the boundary condition that when y = 0, then v = 0, which is correct since the particle is released from A with zero velocity. Therefore, using (4.9), we arrive at

t[y] = ∫_0^{x0} dx √(1 + y′²)/√(2gy),  (4.10)

where x0 is the horizontal distance between A and B. We have to find the stationary function y(x) for the functional t[y]. The Euler–Lagrange equations have the solution [see Eq. (4.5)]:

√((1 + y′²)/(2gy)) − y′² √(1/(2gy(1 + y′²))) = Constant.  (4.11)

This can be simplified to

y(1 + y′²) = C = Constant.  (4.12)

In order to solve this equation we substitute

y′ = tan φ,  (4.13)

so that we have:

y = C cos²φ = C [½ + ½ cos(2φ)]  (4.14)

and

dx/dφ = (1/y′) dy/dφ = C sin(2φ)/tan φ = 2C cos²φ.  (4.15)

The solution is therefore

x = C [φ + ½ sin(2φ)] + D;  (4.16a)
y = C [½ + ½ cos(2φ)].  (4.16b)

D and C are integration constants – if we identify the point A with (0, 0), the curve starts at φ = π/2, and D/C = −π/2. The two coordinates of B are used to fix the value of φ at point B and the constants D and C. Note that the boundary condition v = 0 at point A was already incorporated in Eq. (4.9). The resulting curve is called a cycloid – it is the curve described by a point on the rim of a rolling wheel.
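As a sanity check, travel times can be compared numerically. The sketch below uses the equivalent standard cycloid parametrisation x = R(θ − sin θ), y = R(1 − cos θ) with y measured downward (R and the endpoint B are arbitrary illustrative choices), evaluates the time functional (4.8) along the cycloid with a simple quadrature, and compares it with the straight ramp to the same endpoint:

```python
import math

# Descent times from A = (0, 0) to B = (pi*R, 2*R), with v = sqrt(2 g y).
g, R = 9.81, 1.0
xB, yB = math.pi*R, 2.0*R

# cycloid: x = R(th - sin th), y = R(1 - cos th); time via midpoint rule
n, T_cyc = 2000, 0.0
for i in range(n):
    th = (i + 0.5)*math.pi/n
    dxdth = R*(1 - math.cos(th))               # dx/dtheta
    dydth = R*math.sin(th)                     # dy/dtheta
    v = math.sqrt(2*g*R*(1 - math.cos(th)))    # Eq. (4.9)
    T_cyc += math.hypot(dxdth, dydth)/v*(math.pi/n)

# straight ramp y = (yB/xB) x: the time integral can be done in closed form
T_line = 2*math.sqrt((xB**2 + yB**2)/(2*g*yB))
```

Along the cycloid the integrand ds/v is in fact the constant √(R/g), so the descent time is π√(R/g), visibly shorter than the straight-line time.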

4.3 Fermat's principle

The path traversed by a ray of light in a medium with a varying refractive index is not a straight line. According to Fermat's principle this path is determined by the requirement that the light ray follows the path which allows it to go from one point to another in the shortest possible time. The time needed to traverse a path is determined by the speed of light along that path, and this quantity is given as

c(n) = c/n,  (4.17)

where c is the speed of light in vacuum and n is the refractive index. The latter might vary with position. If the path lies in the xy plane, the path length dl of a segment corresponding to a distance dx along the x-axis is given by

dl = dx √(1 + (dy/dx)²).  (4.18)

The time dt needed to traverse the path dl is given as:

dt = dl/(c/n),  (4.19)

so that the total time can now be given as an integral over dx:

t = ∫_0^L dx (n(y)/c) √(1 + (dy/dx)²).  (4.20)


Here we have assumed that n depends on the coordinate y only. Now take n(y) = √(1 + y²); then we must minimise

ct = ∫_0^L dx √(1 + y²) √(1 + (dy/dx)²).  (4.21)

For this case, the Euler–Lagrange equations reduce to the equation [see Eq. (4.5)]

dy/dx = (1/A) √((1 − A²) + y²).  (4.22)

The solution is given as

y(x) = ±√(1 − A²) sinh(x/A + B).  (4.23)

The possible range of A-values is |A| ≤ 1.
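A quick numerical check (the values of A and B are arbitrary illustrative choices) confirms that (4.23) solves (4.22), and that the first integral F − y′ ∂F/∂y′ of Eq. (4.5) is indeed constant along the solution – it works out to the value A:

```python
import math

# y(x) = sqrt(1 - A^2) sinh(x/A + B) against Eq. (4.22), and the first
# integral of Eq. (4.5) for F = sqrt(1 + y^2) sqrt(1 + y'^2).
A, B = 0.6, 0.3

def yf(x):
    return math.sqrt(1 - A*A)*math.sinh(x/A + B)

def ypf(x):
    return math.sqrt(1 - A*A)/A*math.cosh(x/A + B)

ode_res, first_integrals = [], []
for x in (-1.0, -0.2, 0.0, 0.7, 2.0):
    y, yp = yf(x), ypf(x)
    ode_res.append(yp - math.sqrt((1 - A*A) + y*y)/A)     # Eq. (4.22)
    F = math.sqrt(1 + y*y)*math.sqrt(1 + yp*yp)
    dFdyp = math.sqrt(1 + y*y)*yp/math.sqrt(1 + yp*yp)
    first_integrals.append(F - yp*dFdyp)                  # Eq. (4.5)
```

The identity behind the second check is F − y′ ∂F/∂y′ = √(1 + y²)/√(1 + y′²), which reduces to A on the solution.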

4.4 The minimal area problem

Consider a soap film which is suspended between two parallel hoops (see figure). The soap film has a finite surface tension, which means that its energy scales linearly with its area. As the film tends to minimise its energy, it minimises its area. The area of a surface of revolution described by a function y(x) is given by:

2π ∫_0^L dx y √(1 + y′²).  (4.24)

Minimising this functional of y leads to the standard Euler–Lagrange solution, Eq. (4.5), for functionals with no explicit x-dependence:

y/√(1 + y′²) = C.  (4.25)

The solution to this equation is given by

y(x) = C cosh((x + A)/C).  (4.26)


We now assume that the hoops have the same diameter. Let us furthermore choose the x-axis such that the origin is in the middle between the two hoops. Using the fact that cosh is an even function, we have:

R = C cosh(L/(2C)),  (4.27)

where R is the radius of the hoops. Consider now the graph of C cosh[L/(2C)] as a function of C for fixed L.

(Figure: plot of C cosh[L/(2C)] versus C for fixed L.)

It is clear that for R lying in the "gap" of the graph, no solution can be found. What happens is that if the hoops are not too far apart, the soap film will form a nice cosh-shaped surface. However, when we pull the hoops apart, there will be a moment at which the film can no longer be sustained and it collapses. It can be seen from the graph that usually there are two different solutions. The one with the smallest surface area is to be selected. The surface area is found as

A(y) = π [LC + 2R √(R² − C²)].  (4.28)
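Equation (4.27) can be solved numerically. A minimal sketch (plain bisection; the values of L and R are arbitrary illustrative choices) locates the two branches on either side of the minimum of C cosh[L/(2C)] and compares their areas via Eq. (4.28):

```python
import math

# Solve R = C cosh(L/(2C)) for the two catenoid solutions by bisection.
L, R = 1.0, 1.0

def f(C):
    return C*math.cosh(L/(2*C)) - R

def bisect(lo, hi, n=200):
    for _ in range(n):
        mid = 0.5*(lo + hi)
        if f(lo)*f(mid) <= 0.0:   # keep the half-interval with a sign change
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

# the minimum of C cosh(L/(2C)) separates the two solution branches
C_min = min((k/1000.0 for k in range(50, 2000)),
            key=lambda C: C*math.cosh(L/(2*C)))
C1 = bisect(0.05, C_min)          # deep catenoid (larger area)
C2 = bisect(C_min, 5.0)           # shallow catenoid (smaller area)

def area(C):                      # Eq. (4.28)
    return math.pi*(L*C + 2*R*math.sqrt(R*R - C*C))
```

Increasing L for fixed R eventually pushes the minimum of C cosh[L/(2C)] above R, at which point both roots disappear – the collapse described above.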

4.5 Constraints

4.5.1 Constraint forces

In d'Alembert's approach, the forces of constraint are neglected, as they are usually of limited physical importance. In some cases, however, it might be useful to know what these forces are. For example, a designer of a monorail would like to know the force which is exerted on the rail by the train, in order to certify that the monorail is robust enough. In fact, it is possible to work out within a Lagrangian analysis what the forces of constraint are.

Let us first recall the solution to the following problem: find the minimum of the function f(x), where x = (x1, x2, …, xN), under the conditions g⁽ᵏ⁾(x) = C_k, where the C_k are constants, k = 1, …, K. Consider a small variation δx such that g⁽ᵏ⁾(x + δx) = C_k still holds for all k. Then it holds that

g⁽ᵏ⁾(x) + δx · ∇g⁽ᵏ⁾(x) = g⁽ᵏ⁾(x) = C_k,  (4.29)


hence

δx · ∇g⁽ᵏ⁾(x) = 0  (4.30)

for all k. On the other hand, for variations δx satisfying (4.30), f should not change to first order along δx, so we have

δx · ∇f(x) = 0.  (4.31)

Now we can show that ∇f(x) must lie in the span of the set ∇g⁽ᵏ⁾(x). If it were to lie outside the span, we could write it as the sum of a vector lying in the span of the ∇g⁽ᵏ⁾(x) plus a vector perpendicular to this space. If we take δx proportional to the latter, then (4.30) is satisfied, but (4.31) is not. Therefore we conclude that ∇f can be written as a linear combination of the gradients ∇g⁽ᵏ⁾:

∇f(x) = ∑_{k=1}^K λ_k ∇g⁽ᵏ⁾(x).  (4.32)

This is the well-known Lagrange multiplier theorem.

Let us consider a simple example: finding the minimum or maximum of the function f(x, y) = xy on the unit circle g(x, y) = x² + y² − 1 = 0. There is only one Lagrange parameter λ, and Eq. (4.32) for this case reads

(y, x) = λ(2x, 2y),  (4.33)

whose solution is x = ±y, λ = ±1/2. The constraint x² + y² = 1 then fixes the solutions to x = y = ±1/√2 and x = −y = ±1/√2. Indeed, the symmetry of the problem allows only the axes x = ±y and the axes x = 0 or y = 0 as candidate solutions, and it is easy to identify the stationary points (minima, maxima, saddle points) among these.

Now suppose we have a mechanical N-particle system without constraints. For such a system we know the Lagrangian L. The combined coordinates of the system are represented as a vector R = (r1, …, rN). Then we have for any displacement δR that the corresponding change in Lagrangian vanishes:

∑_{i=1}^N (δL(R, Ṙ, t)/δr_i) · δr_i ≡ δL(R, Ṙ, t) = 0  (4.34)

for all t, where we have used the notation of (2.55). Now suppose that there are constraints present of the form

g⁽ᵏ⁾(R) = 0.  (4.35)

The argument used above for ordinary functions can be generalised to show that we should have

δL(R, Ṙ, t)/δR = ∑_{k=1}^K λ_k ∇_R g⁽ᵏ⁾(R);  (4.36)

the reader is invited to verify this. As L is the Lagrangian of a mechanical system without constraints, we know that the left hand side of this equation can be written as ṗ − F^A. The right hand side has the dimension of a force, and must therefore coincide with the constraint force.

Let us analyse the simple example of the pendulum once again. Without constraints we have

L = (m/2)(ẋ² + ẏ²) − mgy.  (4.37)

The constraint is given by

l² = x² + y².  (4.38)


So the pendulum equations of motion become

m ẍ = 2λx;  (4.39a)
m ÿ = −mg + 2λy.  (4.39b)

These equations cannot be solved analytically, as they describe the full pendulum, and not just the small-angle limit. In the small-angle limit the weight hangs at y ≈ −l, so that m ÿ ≈ 0 gives λ ≈ mg/(2y) = −mg/(2l). We see that λ is negative, so that the solution of m ẍ = 2λx is oscillatory, with frequency ω = √(−2λ/m) = √(g/l). The λ-dependent terms in the equations of motion indeed represent a force in the +y direction of magnitude mg: this is the tension in the string or rod on which the weight is suspended.

When using polar coordinates, we have

L = m [ṙ²/2 + (rϕ̇)²/2] + mgr cos ϕ,  (4.40)

with the constraint r = l, which leads to the Lagrange equations:

m r̈ = mg cos ϕ + mrϕ̇² − λ;  (4.41)
m (r²ϕ̈ + 2rṙϕ̇) = −mgr sin ϕ.  (4.42)

Filling in the constraint r = l is particularly easy. The constraint force is given by

λ = mg cos ϕ + mrϕ̇². (4.43)

The constraint force consists of a term which compensates the gravitational force (first term) and an extra term which is necessary for keeping the circular motion going (a centripetal force, the second term). The equation for ϕ̈ reduces to the usual pendulum equation when the constraint is used:

ϕ̈ = −(g/l) sin ϕ. (4.44)
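The small-angle statements above are easy to check numerically. The following sketch (not part of the notes; all parameter values are arbitrary choices) integrates the pendulum equation (4.44) with a simple semi-implicit Euler scheme, measures the oscillation period, which should be close to 2π√(l/g), and evaluates the constraint force of Eq. (4.43):

```python
# Sketch: integrate phi'' = -(g/l) sin(phi) and evaluate the constraint force
# lambda = m*g*cos(phi) + m*l*phidot**2 of Eq. (4.43). Parameters are arbitrary.
import math

m, g, l = 1.0, 9.81, 1.0
dt, steps = 1.0e-4, 200_000
phi, phidot = 0.05, 0.0              # small initial angle: near-harmonic motion

crossings = []                        # upward zero crossings of phi (one per period)
prev = phi
for n in range(1, steps + 1):
    phidot += -(g / l) * math.sin(phi) * dt   # semi-implicit Euler step
    phi += phidot * dt
    if prev < 0.0 <= phi:
        crossings.append(n * dt)
    prev = phi

lam = m * g * math.cos(phi) + m * l * phidot**2   # instantaneous constraint force
period = crossings[1] - crossings[0]              # should approach 2*pi*sqrt(l/g)
```

For the chosen small amplitude, the measured period agrees with the harmonic value to well below one percent.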

In practice, constraints are seldom used explicitly in the solution of mechanical problems.

4.5.2 Global constraints

In section 4.5.1 we have analysed constraints of the form

g⁽ᵏ⁾(r₁, r₂, …, r_N; t) = 0. (4.45)

This type of constraint is called holonomic, and it frequently allows us to represent the system using generalised coordinates. Constraints of this type impose conditions on the system which should hold at every moment in time; we may therefore consider them as an infinite set of constraints (one for each time). Such constraints are called local, where this term refers to the fact that the constraint is local in time. Sometimes, however, we must deal with constraints of a different form. Consider for example the problem of finding the shape of a chain of homogeneous density ρ (mass per unit length) suspended at its two end points. We represent this shape by a function y(x), where x is the coordinate along the line connecting the two end points and y(x) is the height of the chain at coordinate x. The shape

Examples of variational calculus, constraints

Figure 4.1: Example of a function δy(x) which is nonzero only near x₁ and x₂.

is determined by the condition that it minimises the (gravitational) potential energy, and it is readily seen that this energy is given by the functional

V = gρ ∫₀ˣ y √(1 + (dy/dx)²) dx. (4.46)

We leave out the constants g and ρ in the following, as they do not affect the shape. If we minimised the potential energy (4.46) as it stands, we would get divergences, as we have not yet required the total length of the wire to have the fixed value L. This requirement can be formulated as

L = ∫₀ˣ √(1 + (dy/dx)²) dx. (4.47)

This is a constraint which is not holonomic, and there is no way to reduce the number of degrees of freedom. This type of constraint is called global, as it is formulated as a condition on an integral of the same type as the functional to be minimised, and does not have to be satisfied for all values of the integration variable. Therefore, we must generalise the derivation of the Euler equations to cases in which a functional constraint is present. Let us consider two functionals, J and K:

J = ∫_a^b F(y, y′, x) dx, (4.48a)
K = ∫_a^b G(y, y′, x) dx, (4.48b)

and suppose we want to minimise J under the condition that K has a given value; i.e., for each variation δy which satisfies

∫ δG(y, y′, x) dx = 0, (4.49)

we require that

∫ δF(y, y′, x) dx = 0. (4.50)

Consider now a particular variation which is nonzero only in a small neighbourhood of two values x₁ and x₂ (see figure 4.1). If the areas under these two humps are A₁ and A₂ respectively, we have

∫_a^b δG(y, y′, x) dx = A₁ δG[y(x₁), y′(x₁), x₁]/δy + A₂ δG[y(x₂), y′(x₂), x₂]/δy, (4.51)

Figure 4.2: The cosh solution to the suspended chain problem.

and therefore

A₂/A₁ = − (δG₁/δy) / (δG₂/δy), (4.52)

with an obvious shorthand notation. Applying this argument once again, we see that for functions y(x) satisfying requirement (4.52) we should have

A₂/A₁ = − (δF₁/δy) / (δF₂/δy). (4.53)

But this can only be true for arbitrary x₁ and x₂ when δF/δy and δG/δy are proportional:

δF/δy = λ δG/δy. (4.54)

Therefore, we must solve the Euler equations for the combined functional

J(y) − λK(y), (4.55)

where λ is fixed by putting the solution of this minimisation back into the constraint. This is the Lagrange multiplier theorem for functionals.

We shall now apply this to the suspended chain problem. We have F = y√(1 + y′²) and G = √(1 + y′²). Therefore, the Euler equations read

(y + λ) [√(1 + y′²) − y′²/√(1 + y′²)] = Constant, (4.56)

which leads to

y + λ = C √(1 + y′²). (4.57)

The solution is given by

y(x) = A cosh[α(x − x₀)] + B (4.58)


with

A = C = 1/α, (4.59a)
B = −λ. (4.59b)

Boundary conditions are y(0) = y(X) = 0, and the length of the wire must be equal to L. These conditions fix x₀, λ and C: x₀ = X/2, λ = C cosh[X/(2C)], and C follows from L = 2C sinh[X/(2C)]. In figure 4.2 the solution is shown.

5 From classical to quantum mechanics

In the first few chapters we have considered classical problems, in particular the variational formulation of classical mechanics in the formulations of Hamilton and Lagrange. In this chapter we turn to quantum mechanics. In the first section, we introduce quantum mechanics by formulating the postulates on which the quantum theory is based. Later on, we establish the link between classical mechanics and quantum mechanics, via Poisson brackets and via the path integral.

5.1 The postulates of quantum mechanics

When we consider classical mechanics, we start from Newton's laws and derive the behaviour of moving bodies subject to forces from these laws. This is an attractive approach, as we like to see a structured presentation of the world surrounding us. In reality, however, people thought about motion and forces for thousands of years before Newton's compact formulation of the underlying principles was found. It is not justified to forget this and to pretend that physics only consists of understanding and predicting phenomena from a limited set of laws. The 'dirty' process of walking in the dark and trying to find a comprehensive formulation of the phenomena under consideration is an essential part of physics. This also holds for quantum mechanics, although it was developed in a substantially shorter time than classical mechanics. In fact, quantum mechanics started at the beginning of the twentieth century, and its formulation was more or less complete around 1930. This formulation consists of a set of postulates which, however, do not have a canonical form similar to Newton's laws: most books have their own version of these postulates, and even their number varies. We now present a particular formulation of these postulates.

1. The state of a physical system at any time t is given by the wavefunction of the system at that time. This wavefunction is an element of the Hilbert space of the system. The evolution of the system in time is determined by the Schrödinger equation:

iℏ ∂/∂t |ψ(t)⟩ = Ĥ |ψ(t)⟩.

Here Ĥ is a Hermitian operator: the Hamiltonian.

2. Any physical quantity Q is represented by a Hermitian operator Q̂. When we perform a measurement of the quantity Q, we will always find one of the eigenvalues of the operator Q̂. For a system in the state |ψ(t)⟩, the probability of finding a particular eigenvalue λᵢ, with associated eigenvector |φᵢ⟩ of Q̂, is given by

Pᵢ = |⟨φᵢ|ψ(t)⟩|² / (⟨ψ(t)|ψ(t)⟩ ⟨φᵢ|φᵢ⟩).

Immediately after the measurement, the system will find itself in the state |φᵢ⟩ corresponding to the eigenvalue λᵢ which was found in the measurement of Q.

Several remarks can be made.

1. The wavefunction contains the maximum amount of information we can have about the system. In practice, we often do not know the wavefunction of the system.

2. Note that the eigenvectors |φᵢ⟩ always form a basis of the Hilbert space of the system under consideration. This implies that the state |ψ(t)⟩ of the system before the measurement can always be written in the form

|ψ(t)⟩ = ∑ᵢ cᵢ |φᵢ⟩.

The probability to find the value λᵢ in a measurement is therefore given by

Pᵢ = |cᵢ|² / ∑ⱼ |cⱼ|².

For a normalised state |ψ(t)⟩ it holds that, if the eigenvectors |φᵢ⟩ are normalised too,

∑ᵢ |cᵢ|² = 1.

In that case Pᵢ = |cᵢ|².

3. So far we have suggested in our notation that the eigenvalues and eigenvectors form a discrete set. In reality, not only discrete but also continuous spectra are possible. In those cases, the sums are replaced by integrals.

4. In understanding quantum mechanics, it helps to make a clear distinction between the formalism which describes the evolution of the wavefunction (the Schrödinger equation, postulate 1) and the interpretation scheme: the wavefunction contains the information we need to predict the outcome of measurements, using the measurement postulate (number 2).

It now seems that we have arrived at a formulation of quantum mechanics which is similar to that of classical mechanics: a limited set of laws (prescriptions) from which everything can be derived, provided we know the form of the Hamiltonian (this is analogous to the situation in classical mechanics, where Newton's laws do not tell us what the form of the forces is). However, there is an important difference: the classical laws of motion can be understood from our everyday experience, so that we have some intuition for their meaning and content. In quantum mechanics, however, the laws are formulated as mathematical statements concerning objects (vectors and operators) for which we do not have a natural intuition. This is the reason why quantum mechanics is so difficult in the beginning (although its mathematical structure as such is rather simple). You should not despair when quantum mechanics seems difficult: many people find it difficult, and the role of the measurement is still the object of intensive debate. Sometimes you must switch your intuition off and use the rules of linear algebra to solve problems.

Above, we have mentioned that quantum mechanics does not prescribe the form of the Hamiltonian.
In fact, although the Schr¨odinger equation, quite unlike the classical equation of motion, is a linear equation, which allows us to make ample use of linear algebra knowledge, the structure of


quantum mechanics is richer than that of classical mechanics, because in principle any type of Hilbert space could occur in Nature. In classical mechanics, the space containing all possible states of a system is essentially a 6N-dimensional space (for an N-body system we have 3N space and 3N momentum coordinates). In quantum mechanics, wavefunctions can be part of infinite-dimensional spaces (like the wavefunctions of a particle moving along a one-dimensional axis), but they can also lie in a finite-dimensional space (for example spin, which has no classical analogue).
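To make the measurement postulate concrete, here is a small numerical sketch (not from the notes) for a three-dimensional Hilbert space: a state is expanded in the eigenbasis of a Hermitian matrix, and the probabilities Pᵢ = |cᵢ|²/∑ⱼ|cⱼ|² are computed. The observable Q and the state ψ below are arbitrary examples.

```python
# Sketch: Born-rule probabilities P_i = |c_i|^2 / sum_j |c_j|^2 for measuring a
# Hermitian observable Q on an (unnormalised) state psi in a 3-dim Hilbert space.
import numpy as np

Q = np.array([[1.0, 0.5, 0.0],
              [0.5, 2.0, 0.3],
              [0.0, 0.3, 3.0]])            # arbitrary Hermitian (real symmetric) observable
psi = np.array([1.0, 1.0j, 0.5])           # arbitrary state |psi>, not normalised

eigvals, eigvecs = np.linalg.eigh(Q)       # columns of eigvecs are the eigenstates |phi_i>
c = eigvecs.conj().T @ psi                 # expansion coefficients c_i = <phi_i|psi>
P = np.abs(c)**2 / np.sum(np.abs(c)**2)   # probabilities of finding the eigenvalues
```

A useful consistency check is that the expectation value ∑ᵢ Pᵢλᵢ equals ⟨ψ|Q|ψ⟩/⟨ψ|ψ⟩.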

5.2 Relation with classical mechanics

In order to see whether we can guess the structure of the Hamiltonian for systems which have a classical analogue, we consider the time evolution of a physical quantity Q. We assume that Q does not depend on time explicitly. However, the expectation value of Q may vary in time due to the change of the wavefunction in the course of time. For normalised wavefunctions:

d/dt ⟨ψ(t)|Q̂|ψ(t)⟩ = (∂/∂t ⟨ψ(t)|) Q̂ |ψ(t)⟩ + ⟨ψ(t)| Q̂ (∂/∂t |ψ(t)⟩).

Using the Schrödinger equation and its Hermitian conjugate,

−iℏ ∂/∂t ⟨ψ(t)| = ⟨ψ(t)| Ĥ

(note the minus sign on the left hand side, which results from the Hermitian conjugation), we obtain

iℏ d/dt ⟨ψ(t)|Q̂|ψ(t)⟩ = ⟨ψ(t)|Q̂Ĥ − ĤQ̂|ψ(t)⟩ = ⟨ψ(t)| [Q̂, Ĥ] |ψ(t)⟩,

where [Q̂, Ĥ] is the commutator. We see that the time derivative of ⟨Q̂⟩ is related to the commutator between Q̂ and Ĥ. This should wake you up or ring a bell. In the exercises, we have seen that for any function f(qⱼ, pⱼ) of the coordinates qⱼ and momenta pⱼ, the time derivative is given by

df/dt = ∑ⱼ (∂f/∂qⱼ ∂H/∂pⱼ − ∂f/∂pⱼ ∂H/∂qⱼ) ≡ {f, H}.

We see that this equation is very similar to that obtained above for the time derivative of the expectation value of the operator Q̂! The differences consist of replacing the Poisson bracket by the commutator and adding a factor iℏ. It seems that classical and quantum mechanics are not that different after all. Could this perhaps be a guide to formulating quantum mechanics for systems for which we already have a classical version? This turns out to be the case. As an example, we start by considering a one-dimensional system for which the relevant classical observables are the position x and the momentum p. Classically, we have

{x, p} = ∂x/∂x ∂p/∂p − ∂x/∂p ∂p/∂x = 1.

The second term in the expression vanishes because x and p are to be considered as independent coordinates. From this, we may guess the quantum version of this relation: [x̂, p̂] = iℏ, which should sound familiar (if it does not, return to the second year quantum mechanics course). It seems that our recipe for making quantum mechanics out of classical mechanics makes sense! Therefore we can now state the following rule:

If the Hamiltonian of some classical system is known, we can use the same form in quantum mechanics, taking into account the fact that the coordinates qⱼ and pⱼ become Hermitian operators and that their commutation relations are:

[q̂ⱼ, q̂ₖ] = 0;  [p̂ⱼ, p̂ₖ] = 0;  [q̂ⱼ, p̂ₖ] = iℏ δⱼₖ.
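The canonical commutator can also be seen at work on a computer. In the sketch below (not from the notes; grid size and wavepacket are arbitrary choices), x and p are represented as matrices on a finite grid, with p built from central finite differences; applied to a smooth wavepacket that vanishes at the grid edges, the commutator reproduces iℏψ up to discretisation error (we set ℏ = 1).

```python
# Sketch: check [x, p] psi ≈ i*hbar*psi with a finite-difference momentum
# operator. Boundary rows of P are left empty, so we only test interior points.
import numpy as np

hbar = 1.0
n, L = 400, 20.0
x = np.linspace(-L / 2, L / 2, n)
dx = x[1] - x[0]

X = np.diag(x)                                   # position operator
P = np.zeros((n, n), dtype=complex)              # p = (hbar/i) d/dx, central differences
for j in range(1, n - 1):
    P[j, j + 1] = hbar / (1j * 2 * dx)
    P[j, j - 1] = -hbar / (1j * 2 * dx)

psi = np.exp(-x**2)                              # Gaussian, negligible at the edges
comm_psi = X @ (P @ psi) - P @ (X @ psi)         # [X, P] acting on psi
```

The agreement improves as O(dx²) when the grid is refined, which is the expected accuracy of the central-difference derivative.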

You can verify these extended commutation relations easily by working out the corresponding classical Poisson brackets.

In the second year, you have learned that

p̂ = (ℏ/i) d/dx.

What about this relation? It was not mentioned here so far. The striking message is that it can be derived from the commutation relation. In order to show this, we must discuss another object you might have missed too: the wavefunction written in the form ψ(r) (for a particle in 3D). It is important to study the relation between this and the state |ψ⟩. Consider a vector a in two dimensions. This vector can be represented by two numbers a₁ and a₂, which are the components of the vector a. However, the actual values of the components depend on how we have chosen our basis vectors. The vector a is an arrow in a two-dimensional space. In that space, a has a particular length and a particular orientation. By changing the basis vectors, we do not change the object a, but we do change the numbers a₁ and a₂. In the case of the Hilbert space of a one-dimensional particle, we can use as basis vectors the states in which the particle is localised at a particular position x. We call these states |x⟩. They are eigenvectors of the position operator x̂ with eigenvalue x:

x̂ |x⟩ = x |x⟩.

The states |x⟩ are properly normalised:

⟨x|x′⟩ = δ(x − x′),

where δ(x − x′) is the Dirac delta function. We can now define ψ(x):

ψ(x) = ⟨x|ψ⟩,

that is, ψ(x) are the 'components' of the 'vector' |ψ⟩ with respect to the basis |x⟩. For three dimensions, we have a wavefunction which is expressed with respect to the basis |r⟩.

In order to derive the representation of the momentum operator, p̂ = (ℏ/i) d/dx, we first calculate the matrix element of the commutator:





⟨x|[x̂, p̂]|x′⟩ = ⟨x|x̂p̂ − p̂x̂|x′⟩ = (x − x′) ⟨x|p̂|x′⟩.

The last expression is obtained by having x̂ act on the bra ⟨x| to its left in the first term, and on the ket |x′⟩ to its right in the second term. On the other hand, using the commutation relation, we know that

⟨x|[x̂, p̂]|x′⟩ = iℏ ⟨x|x′⟩ = iℏ δ(x − x′).

This is an even function of x − x′, as interchanging x and x′ does not change the matrix element on the right hand side. Since this function is equal to (x − x′)⟨x|p̂|x′⟩, we know that ⟨x|p̂|x′⟩ must be an odd function of x − x′.


Now we evaluate the matrix element ⟨x|p̂|ψ⟩. We recall from linear algebra that, since the |x⟩ are the eigenstates of a Hermitian operator, they form a complete set, that is,

I = ∫ |x⟩⟨x| dx,

where I is the unit operator. Then we can write

⟨x|p̂|ψ⟩ = ∫ ⟨x|p̂|x′⟩ ⟨x′|ψ⟩ dx′.

Now we perform a Taylor expansion around x in order to rewrite ⟨x′|ψ⟩:

⟨x′|ψ⟩ = ⟨x|ψ⟩ + (x′ − x) d/dx ⟨x|ψ⟩ + [(x′ − x)²/2!] d²/dx² ⟨x|ψ⟩ + ⋯

Then we obtain

⟨x|p̂|ψ⟩ = ∫ ⟨x|p̂|x′⟩ { ⟨x|ψ⟩ + (x′ − x) d/dx ⟨x|ψ⟩ + [(x′ − x)²/2!] d²/dx² ⟨x|ψ⟩ + ⋯ } dx′.

The first term in brackets gives zero after integration, as it is multiplied by ⟨x|p̂|x′⟩, which was an odd function of x − x′. The second term gives

∫ ⟨x|p̂|x′⟩ (x′ − x) dx′ · d/dx ⟨x|ψ⟩ = −iℏ d/dx ⟨x|ψ⟩,

where we have used the relation

⟨x|p̂|x′⟩ (x′ − x) = −iℏ δ(x′ − x).

We use the same relation for the next term, but then we obtain a term of the form (x′ − x)δ(x′ − x) in the integral over dx′, which obviously yields zero. The same holds for all higher order terms, so we are left with

⟨x|p̂|ψ⟩ = (ℏ/i) d/dx ⟨x|ψ⟩,

which is the required result.

Having obtained this, we can analyse the form of the eigenstates of the momentum operator:

p̂ |p⟩ = p |p⟩.

The states |p⟩ can be represented in the basis |x⟩; the components then are ⟨x|p⟩. We can find the form of these functions by using the eigenvalue equation and the representation of the momentum operator as a derivative:

⟨x|p̂|p⟩ = p ⟨x|p⟩  and  ⟨x|p̂|p⟩ = (ℏ/i) d/dx ⟨x|p⟩.


The first of these equations expresses the fact that |p⟩ is an eigenstate of the operator p̂, and the second one follows directly from the fact that the momentum operator acts as a derivative in the x-representation. Combining the two, we obtain a simple differential equation,

(ℏ/i) d/dx ⟨x|p⟩ = p ⟨x|p⟩,

with a normalised solution

⟨x|p⟩ = (1/√(2πℏ)) e^{ipx/ℏ}.

This allows us to find any state ψ in the momentum representation, that is, the representation in which we use the states |p⟩ as basis states:

⟨p|ψ⟩ = ∫ ⟨p|x⟩ ⟨x|ψ⟩ dx = (1/√(2πℏ)) ∫ e^{−ipx/ℏ} ψ(x) dx.

The analysis presented here for a one-dimensional particle can be generalised to three or more dimensions in a natural way.
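The momentum representation is easy to compute in practice. The sketch below (not from the notes; ℏ = 1 and the Gaussian test state are arbitrary choices) evaluates ⟨p|ψ⟩ by direct quadrature for the normalised Gaussian ψ(x) = e^{−x²/2}/π^{1/4}, whose momentum-space wavefunction is again a Gaussian of the same form:

```python
# Sketch: <p|psi> = (1/sqrt(2*pi*hbar)) * integral exp(-i p x / hbar) psi(x) dx,
# evaluated by simple quadrature on a grid (hbar = 1).
import numpy as np

hbar = 1.0
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
psi = np.exp(-x**2 / 2) / np.pi**0.25          # normalised Gaussian in x

def phi(p):
    """Momentum-space component <p|psi> by quadrature."""
    return np.sum(np.exp(-1j * p * x / hbar) * psi) * dx / np.sqrt(2 * np.pi * hbar)

ps = np.linspace(-3, 3, 61)
phis = np.array([phi(p) for p in ps])
exact = np.exp(-ps**2 / 2) / np.pi**0.25       # analytic momentum-space Gaussian
```

Because the integrand is smooth and decays rapidly, the simple Riemann sum is extremely accurate here.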

5.3 The path integral: from classical to quantum mechanics

The path integral is a very powerful concept for connecting classical and quantum mechanics. Moreover, this formulation renders the connection between quantum mechanics and statistical mechanics very explicit. We restrict ourselves here to a discussion of the path integral in quantum mechanics. The reader is advised to consult the excellent book by Feynman and Hibbs (Quantum Mechanics and Path Integrals, McGraw-Hill, 1965) for more details. The path integral formulation can be derived from the following heuristics:

• A point particle which moves with momentum p at energy E can also be viewed as a wave with a phase ϕ given by ϕ = k · r − ωt, where p = ℏk and E = ℏω.

• For a single path, these phases are additive, i.e. the phases for different segments of the path should be added.

• The probability to find a particle, which at t = t₀ was at r₀, at position r₁ at time t = t₁, is given by the absolute square of the sum of the phase factors exp(iϕ) of all possible paths leading from (r₀, t₀) to (r₁, t₁):

P(r₀, t₀; r₁, t₁) = |∑_{all paths} e^{iϕ_path}|².

This probability is defined up to a constant, which can be fixed by normalisation (i.e. the term within the absolute bars must reduce to a delta function in r₁ − r₀).

These heuristics are the analogue of the Huygens principle in wave optics. To analyse their consequences, we chop the time interval between t₀ and t₁ into many identical time slices (see Fig. 5.1) and consider one such slice. Within this slice we take the path to be linear. To simplify the analysis we consider one-dimensional motion. We first consider the


Figure 5.1: A possible path running from an initial position xi at time ti to a final position xf at time tf . The time is divided up into many identical slices.

contribution of k · x to the phase difference. If the particle moves in a time Δt over a distance Δx, we know that its k-vector is given by

k = mv/ℏ = mΔx/(ℏΔt).

The phase change resulting from the displacement of the particle can therefore be given as

Δϕ = kΔx = mΔx²/(ℏΔt).

We still must add the contribution of the −ωΔt term to the phase. Neglecting the potential energy, we obtain

Δϕ = mΔx²/(ℏΔt) − [ℏ²k²/(2mℏ)] Δt = mΔx²/(2ℏΔt).

The potential also enters through the ωΔt term, to give the result

Δϕ = mΔx²/(2ℏΔt) − V(x)Δt/ℏ.

For the x occurring in the potential we may choose any value between x₀ and x₁; the most accurate result is obtained by substituting the mean value. If we now use the fact that phases are additive, we see that for the entire path the phase is given by

ϕ = (1/ℏ) ∑ⱼ { (m/2) [(x(tⱼ₊₁) − x(tⱼ))/Δt]² − (V[x(tⱼ)] + V[x(tⱼ₊₁)])/2 } Δt.

This is nothing but the discrete form of the classical action of the path! Taking the limit Δt → 0 we obtain

ϕ = (1/ℏ) ∫_{t₀}^{t₁} [mẋ²/2 − V(x)] dt = (1/ℏ) ∫_{t₀}^{t₁} L(x, ẋ) dt.


We therefore conclude that the probability to go from r₀ at time t₀ to r₁ at time t₁ is given by

P(r₀, t₀; r₁, t₁) = |N ∑_{all paths} exp[(i/ℏ) ∫_{t₀}^{t₁} L(x, ẋ) dt]|²,

where N is the normalisation factor

N = √(m/(2πiℏΔt)).

This now is the path integral formulation of quantum mechanics. Let us spend a moment studying this formulation. First note the large prefactor 1/ℏ in front of the exponent. If the phase varies when varying the path, this large prefactor will cause the exponential to vary wildly over the unit circle in the complex plane, and the joint contribution to the probability will become very small. If, on the other hand, there is a region in 'path space' where the variation of the phase with the path is zero or very small, the phase factors add up to a significant amount. Such regions are those where the action is stationary: we recover the classical paths as those giving the major contribution. For ℏ → 0 (the classical case), only the stationary paths remain, whereas for small ℏ, small fluctuations around these paths are allowed: these are the quantum fluctuations.

You may not yet recognise how this formulation is related to the Schrödinger equation. We may identify the expression within the absolute bars in the last equation with a matrix element of the time evolution operator, since both have the same meaning:

⟨x₁|Û(t₁ − t₀)|x₀⟩ = N ∑_{all paths} exp[(i/ℏ) ∫_{t₀}^{t₁} L(x, ẋ) dt].

This particular form of the time evolution operator is sometimes called the propagator. Let us now evaluate this form of the time evolution operator acting for a small time interval Δt on the wavefunction ψ(x, t):

ψ(x₁, t₁) = N ∫_{−∞}^{∞} ∫ D[x(t)] exp{(i/ℏ) ∫_{t₀}^{t₁} [m ẋ²(t)/2 − V[x(t)]] dt} ψ(x₀, t₀) dx₀.

The notation ∫ D[x(t)] indicates an integral over all possible paths from (x₀, t₀) to (x₁, t₁). We first approximate the integral over time in the same fashion as above, taking t₁ very close to t₀ and assuming a linear variation of x(t) from x₀ to x₁:

ψ(x₁, t₁) = N ∫_{−∞}^{∞} exp{(i/ℏ) [m(x₁ − x₀)²/(2Δt) − (V(x₀) + V(x₁))Δt/2]} ψ(x₀, t₀) dx₀.

A similar argument as used above to single out paths close to stationary ones can be used here to argue that the (imaginary) Gaussian factor forces x₀ to be very close to x₁. The allowed range for x₀ is

(x₁ − x₀)² ≲ ℏΔt/m.

As Δt is taken very small, we may expand the exponent with respect to the VΔt term:

ψ(x₁, t₁) = N ∫_{−∞}^{∞} exp[(i/ℏ) m(x₁ − x₀)²/(2Δt)] {1 − i[V(x₀) + V(x₁)]Δt/(2ℏ)} ψ(x₀, t₀) dx₀.


As x₀ is close to x₁, we may approximate [V(x₀) + V(x₁)]/(2ℏ) by V(x₁)/ℏ. We now change the integration variable from x₀ to u = x₀ − x₁:

ψ(x₁, t₁) = N ∫_{−∞}^{∞} exp[imu²/(2ℏΔt)] [1 − (i/ℏ)V(x₁)Δt] ψ(x₁ + u, t₀) du.

As u must be small, we can expand ψ about x₁ and obtain

ψ(x₁, t₁) = N ∫_{−∞}^{∞} exp[imu²/(2ℏΔt)] [1 − (i/ℏ)V(x₁)Δt] [ψ(x₁, t₀) + u ∂/∂x ψ(x₁, t₀) + (u²/2) ∂²/∂x² ψ(x₁, t₀)] du.

Note that the second term in the Taylor expansion of ψ leads to a vanishing integral, as the integrand is an antisymmetric function of u. All in all, after evaluating the Gaussian integrals, we are left with

ψ(x₁, t₁) = ψ(x₁, t₀) − (iΔt/ℏ) V(x₁) ψ(x₁, t₀) + (iℏΔt/2m) ∂²/∂x² ψ(x₁, t₀).

Using

[ψ(x₁, t₁) − ψ(x₁, t₀)]/Δt ≈ ∂/∂t ψ(x₁, t₁),

we obtain the time-dependent Schrödinger equation for a particle moving in one dimension:

iℏ ∂/∂t ψ(x, t) = [−(ℏ²/2m) ∂²/∂x² + V(x)] ψ(x, t).

You may have found this derivation a bit involved. It certainly is not the easiest way to arrive at the Schrödinger equation, but it has two attractive features:

• Everything was derived from simple heuristics, based on viewing a particle as a wave and allowing for interference of the waves;

• The formulation shows that the classical path is obtained from quantum mechanics when we let ℏ → 0.

5.4 The path integral: from quantum mechanics to classical mechanics

In the previous section we have considered how we can arrive from classical mechanics at the Schrödinger equation. This formalism can be generalised in the sense that for each system for which we can write down a Lagrangian, we have a way to find a quantum formulation in terms of the path integral. Whether a Schrödinger-like equation can then be found is not certain: sometimes we run into problems which are beyond the scope of these notes. In this section we assume that we have a system described by some Hamiltonian, and show that the time evolution operator has the form of a path integral as found in the previous section. The starting point is the time evolution operator, or propagator, which, for a time-independent Hamiltonian, takes the form

U(x_f, t_f; x_i, t_i) = ⟨x_f| e^{−(i/ℏ)(t_f − t_i)Ĥ} |x_i⟩.

The matrix element is difficult to evaluate; the reason is that the Hamiltonian, which for a particle in one dimension takes the form

Ĥ = −(ℏ²/2m) d²/dx² + V(x),


is the sum of two noncommuting operators. Although it is possible to evaluate the exponents of the separate terms occurring in the Hamiltonian, the exponent of the sum involves an infinite series of increasingly complicated commutators. For any two noncommuting operators Â and B̂ we have

e^{Â+B̂} = e^{Â} e^{B̂} e^{−(1/2)[Â,B̂] − (1/12)([Â,[Â,B̂]] + [B̂,[B̂,Â]]) + (1/24)[Â,[B̂,[Â,B̂]]] + ⋯}.

This is the so-called Campbell–Baker–Hausdorff (CBH) formula. The cumbersome commutators occurring on the right can only be neglected if the operators Â and B̂ are small in some sense. We can try to arrive at an expression involving small commutators by applying the time slicing procedure of the previous section:

e^{−(i/ℏ)(t_f − t_i)Ĥ} = e^{−(i/ℏ)Δt Ĥ} e^{−(i/ℏ)Δt Ĥ} ⋯ e^{−(i/ℏ)Δt Ĥ}.

Note that no CBH commutators occur, because Δt Ĥ commutes with itself. Having this, we can rewrite the propagator as (we omit the hats on operators)

U(x_f, t_f; x_i, t_i) = ∫ dx₁ … dx_{N−1} ⟨x_f|e^{−iΔtH/ℏ}|x_{N−1}⟩ ⟨x_{N−1}|e^{−iΔtH/ℏ}|x_{N−2}⟩ ⋯ ⟨x₁|e^{−iΔtH/ℏ}|x_i⟩.

Now that the operators occurring in the exponents can be made arbitrarily small by taking Δt very small, we can evaluate the matrix elements explicitly:

⟨x_j|e^{−iΔtH/ℏ}|x_{j+1}⟩ = ⟨x_j|e^{−(iΔt/ℏ)[p²/(2m)+V(x)]}|x_{j+1}⟩ = e^{−iΔtV(x_j)/ℏ} ⟨x_j|e^{−(iΔt/ℏ)p²/(2m)}|x_{j+1}⟩.

The last matrix element can be evaluated by inserting two unit operators formulated in terms of integrals over the complete set |p⟩:

⟨x|e^{−(iΔt/ℏ)p̂²/(2m)}|x′⟩ = ∫∫ dp dp′ ⟨x|p⟩ ⟨p|e^{−(iΔt/ℏ)p̂²/(2m)}|p′⟩ ⟨p′|x′⟩.

We have seen that ⟨x|p⟩ = exp(ipx/ℏ)/√(2πℏ). Realising that the exponential operator is diagonal in p space, we find, after integrating over p,

⟨x|e^{−(iΔt/ℏ)p²/(2m)}|x′⟩ = √(m/(2πiℏΔt)) exp[im(x − x′)²/(2Δtℏ)].

All in all we have

⟨x_j|e^{−iΔtH/ℏ}|x_{j+1}⟩ = √(m/(2πiℏΔt)) e^{−iΔtV(x_j)/ℏ} exp[im(x_j − x_{j+1})²/(2Δtℏ)].

Note that we have evaluated matrix elements of operators; the result is expressed completely in terms of numbers, and we no longer have to bother about commutation relations. Collecting all terms together, we obtain

U(x_f, t_f; x_i, t_i) = ∫ dx₁ … dx_{N−1} exp{(i/ℏ) ∑_{j=0}^{N−1} [(m/2)((x_{j+1} − x_j)/Δt)² − V(x_j)] Δt}.

The expression in the exponent is the discrete form of the classical action; the integral over all intermediate values x_j is the sum over all paths. We have therefore shown that the time evolution operator from x_i to x_f is equivalent to the sum of the phase factors of all possible paths from x_i to x_f.
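The Trotter slicing e^{−iΔtH} ≈ e^{−iΔtV}e^{−iΔtp²/2m} used above is also the basis of a practical numerical method. As a hedged illustration (not from the notes; all grid parameters and the initial state are arbitrary choices), the sketch below uses the symmetric ("Strang") version of the factorisation, accurate to O(Δt²): the potential factor is applied in x-space and the kinetic factor in p-space via an FFT. It evolves a displaced Gaussian in a harmonic well with ℏ = m = ω = 1.

```python
# Sketch: split-operator time stepping based on the Trotter factorisation.
# Potential factor in x-space, kinetic factor in p-space (FFT); Strang splitting.
import numpy as np

hbar = m = 1.0
n, L = 512, 40.0
x = (np.arange(n) - n // 2) * (L / n)
p = 2 * np.pi * hbar * np.fft.fftfreq(n, d=L / n)
dt, steps = 0.01, 500                               # evolve to t = 5
V = 0.5 * x**2                                      # harmonic well, omega = 1

psi = np.exp(-(x - 1.0)**2 / 2) / np.pi**0.25       # displaced ground-state Gaussian
kin = np.exp(-1j * dt * p**2 / (2 * m * hbar))      # full kinetic phase factor
pot_half = np.exp(-0.5j * dt * V / hbar)            # half potential phase factor

for _ in range(steps):
    psi = pot_half * psi
    psi = np.fft.ifft(kin * np.fft.fft(psi))        # kinetic step in momentum space
    psi = pot_half * psi

norm = np.sum(np.abs(psi)**2) * (L / n)             # should stay equal to 1
xbar = np.sum(x * np.abs(psi)**2) * (L / n)         # <x>(t); classically cos(t)
```

Each factor is a pure phase and the FFT is unitary, so the norm is conserved to machine precision; the mean position follows the classical trajectory cos(t), in line with the stationary-phase discussion above.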

6 Operator methods for the harmonic oscillator

6.1 Introduction

Now that we know the basic formulation of quantum mechanics in terms of postulates, we are ready to treat standard quantum problems. You have already met some wavefunction problems in the second year; they are briefly mentioned in the appendix. In this chapter we consider a completely different approach for finding the energy spectrum and eigenfunctions: the operator method. In the wavefunction, or direct, method one tries to find an explicit form of a wavefunction satisfying the Schrödinger equation in, usually, the spatial representation. Operator methods instead aim at solving the problem by finding particular operators satisfying particular commutation relations, in terms of which the Hamiltonian can easily be expressed. By applying the commutation relations and a few general physical criteria, the solution is obtained without tedious mathematics, but at the expense of a somewhat higher level of abstraction. We shall consider an application to the harmonic oscillator here, and use operator methods to find the spectra of angular momentum operators in the next chapter.

The harmonic oscillator is of considerable interest in numerous problems. The reason is that systems in nature are often close to the classical ground state, and the potential can then usually be treated well in a harmonic approximation. Consider for example the hydrogen molecule, which consists of two atoms linked together by a chemical bond with an equilibrium distance r₀. We can stretch or contract the bond and it will then act as a spring, which, for small deviations from the equilibrium distance, is approximately harmonic, as we shall see in chapter 11. The harmonic oscillator also forms the basis of many advanced quantum mechanical field theories, which we shall not go into here.

6.2 The harmonic oscillator

Consider the one-dimensional harmonic oscillator. The Schrödinger equation reads

−(ℏ²/2m) d²ψ(x)/dx² + (1/2) mω² x² ψ(x) = Eψ(x). (6.1)

Here ω is the frequency of the classical harmonic oscillator. This equation can also be written as

(p²/2m) ψ(x) + (1/2) mω² x² ψ(x) = Eψ(x), (6.2)

where we have used the momentum operator p ≡ (ℏ/i) d/dx. The momentum operator does not commute with the position x. We have:

[p, x] = ℏ/i. (6.3)

In order to simplify the notation, we scale the momentum and the distance according to

p̃ = p/√(ℏmω), (6.4a)
x̃ = x √(mω/ℏ), (6.4b)

so that p̃ = (1/i) d/dx̃. The Schrödinger equation now assumes the form

(ℏω/2) [p̃² + x̃²] ψ(x̃) = Eψ(x̃), (6.5)

or

(ℏω/2) [−d²/dx̃² + x̃²] ψ(x̃) = Eψ(x̃). (6.6)

The commutation relation for p̃ and x̃ can be found using (6.3), and we have

[p̃, x̃] = −i. (6.7)

We shall first consider the solution of this problem following the direct method. In order to solve the Schrödinger equation it turns out convenient to write ψ(x̃) in the form

ψ(x̃) = e^{−x̃²/2} u(x̃), (6.8)

where a new function u(x̃) has been introduced. Denoting derivatives with respect to x̃ by a prime, we have

ψ′(x̃) = [−x̃u(x̃) + u′(x̃)] e^{−x̃²/2}, (6.9)
ψ″(x̃) = [x̃²u(x̃) − u(x̃) − 2x̃u′(x̃) + u″(x̃)] e^{−x̃²/2}, (6.10)

and substituting these expressions in (6.6) we obtain

(ℏω/2) [2x̃u′(x̃) + u(x̃) − u″(x̃)] = Eu(x̃), (6.11)

or

−u″(x̃) + 2x̃u′(x̃) + u(x̃) − (2E/ℏω) u(x̃) = 0. (6.12)

The resulting equation can be analysed by writing u as a power series expansion in x̃:



u(x̃) = ∑_{n=0}^{∞} cₙ x̃ⁿ. (6.13)

Substituting this series into (6.12) leads to

∑ₙ [−n(n − 1)cₙ x̃ⁿ⁻² + 2ncₙ x̃ⁿ + cₙ x̃ⁿ − (2E/ℏω) cₙ x̃ⁿ] = 0. (6.14)

Collecting equal powers in this expression and demanding that the resulting coefficient of each power vanishes, we obtain a recursion relation for the cₙ:

cₙ₊₂ = − [(2E/ℏω) − 1 − 2n] / [(n + 2)(n + 1)] · cₙ. (6.15)


This power series expansion diverges so strongly for large values of x̃ that it is impossible to normalise the corresponding wave function, unless the series truncates at a particular value of n. By (6.15), truncation occurs when the numerator 2E/ℏω − 1 − 2n vanishes for some n, so that all higher coefficients vanish. This leads to

    E_n = \hbar\omega\,(n + 1/2).        (6.16)

This is the spectrum of the one-dimensional harmonic oscillator: it is equidistant and bounded from below. The solutions ψ can be written in terms of the solutions u which, under the condition (6.16), are the so-called Hermite polynomials H_n:

    \psi(\tilde{x}) = \left(\sqrt{\pi}\, 2^n\, n!\right)^{-1/2} e^{-\tilde{x}^2/2}\, H_n(\tilde{x}).        (6.17)
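As a numerical aside (a sketch; the helper `series_coeffs` is ours), one can verify that the recursion (6.15) with E = ℏω(n + 1/2) indeed truncates, and that the truncated series is proportional to the physicists' Hermite polynomial H_n as provided by NumPy:

```python
import numpy as np

def series_coeffs(n, kmax=None):
    """Power-series coefficients c_k of u(x) from the recursion (6.15),
    with E = hbar*omega*(n + 1/2), i.e. 2E/(hbar*omega) = 2n + 1."""
    if kmax is None:
        kmax = n + 3
    c = np.zeros(kmax)
    c[n % 2] = 1.0                       # start the even or the odd series
    for k in range(n % 2, kmax - 2, 2):
        c[k + 2] = -((2*n + 1) - 1 - 2*k) / ((k + 2) * (k + 1)) * c[k]
    return c

for n in range(6):
    c = series_coeffs(n)
    # truncation: all coefficients beyond x^n vanish
    assert np.allclose(c[n+1:], 0.0)
    # proportional to the physicists' Hermite polynomial H_n
    h = np.polynomial.hermite.herm2poly(np.eye(n + 1)[n])
    assert np.allclose(c[:n+1] * h[n] / c[n], h)
print("recursion (6.15) reproduces H_0 ... H_5")
```

The comparison rescales the series so that its leading coefficient matches that of H_n; the two polynomials then agree term by term.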

We now show that the harmonic oscillator problem can also be solved by a different method, in which merely commutation relations between operators are used to arrive at the energy spectrum. We define two operators, a and a†, which are each other's Hermitian conjugates:

    a = \frac{1}{\sqrt{2}}\,(\tilde{x} + i\tilde{p}),        (6.18a)
    a^\dagger = \frac{1}{\sqrt{2}}\,(\tilde{x} - i\tilde{p}).        (6.18b)

The fact that these operators are each other's Hermitian conjugates can easily be checked using the fact that both x̃ and p̃ are Hermitian. Using (6.7), it can be verified that

    [a, a^\dagger] = \frac{1}{2}\,[\tilde{x} + i\tilde{p},\, \tilde{x} - i\tilde{p}] = \frac{i}{2}[\tilde{p}, \tilde{x}] - \frac{i}{2}[\tilde{x}, \tilde{p}] = 1.        (6.19)

Furthermore, using Eqs. (6.19) and (6.18), we obtain immediately:

    H = \frac{\hbar\omega}{2}\left(a^\dagger a + a a^\dagger\right) = \hbar\omega\left(a^\dagger a + 1/2\right).        (6.20)

From this, it is easy to calculate the following commutation relations:

    [H, a] = \hbar\omega\,[a^\dagger a, a] = \hbar\omega\,[a^\dagger, a]\,a = -\hbar\omega\, a        (6.21)

and similarly

    [H, a^\dagger] = \hbar\omega\, a^\dagger.        (6.22)

After these preparations, we now consider the eigenvalue problem. Suppose ψ_E is an eigenstate with energy E:

    H\psi_E = E\psi_E.        (6.23)

We now consider the action of the Hamiltonian on the state aψ_E. Using the commutation relation (6.21):

    Ha\psi_E = aH\psi_E - \hbar\omega\, a\psi_E = aE\psi_E - \hbar\omega\, a\psi_E        (6.24)

or:

    H(a\psi_E) = (E - \hbar\omega)(a\psi_E)        (6.25)

and we see that aψ_E is an eigenstate of H with energy E − ℏω!

Similarly we have for a†ψ_E:

    Ha^\dagger\psi_E = a^\dagger H\psi_E + \hbar\omega\, a^\dagger\psi_E = (E + \hbar\omega)(a^\dagger\psi_E),        (6.26)

that is, a†ψ_E is an eigenstate with energy E + ℏω. We say that a is a "lowering" operator, as it lowers the energy eigenvalue by ℏω, and accordingly a† is called a raising operator. Note that if ψ_E is normalised, aψ_E and a†ψ_E need not have this property, as an eigenvector is defined only up to a normalisation constant. We will return to this below.

In order to find the spectrum, we use a physical argument. The spectrum must be bounded from below, as the potential does not assume infinitely negative values. Therefore, if we start with some ψ_E and act successively on it with the lowering operator a, we must have at some point:

    a^n \psi_E = 0        (6.27)

because otherwise the spectrum would not be bounded from below. Let us call a^{n-1}ψ_E = ψ_0. Then aψ_0 = 0. Therefore,

    H\psi_0 = \hbar\omega\left(a^\dagger a + \frac{1}{2}\right)\psi_0 = \frac{1}{2}\hbar\omega\,\psi_0,        (6.28)

that is, ψ_0 is an eigenstate of H with eigenvalue ℏω/2. Acting with a† on ψ_0 we obtain an eigenstate ψ_1 (up to a constant) with eigenvalue 3ℏω/2, etc. Acting n times with a† on ψ_0, we obtain an eigenstate ψ_n (up to a constant) with energy ℏω(n + 1/2), in accordance with the result derived above using the direct method. Often the operator a†a is called the number operator, denoted by N, and H can then be written as ℏω(N + 1/2); ψ_n is an eigenstate of N with eigenvalue n.

The norm of a†ψ_n can be expressed in that of ψ_n:

    \langle a^\dagger\psi_n \,|\, a^\dagger\psi_n\rangle = \langle\psi_n \,|\, a a^\dagger\psi_n\rangle = \langle\psi_n \,|\, (a^\dagger a + 1)\psi_n\rangle = (n+1)\,\langle\psi_n|\psi_n\rangle.        (6.29)

Therefore, if ψ_n is normalised, a†ψ_n/\sqrt{n+1} is normalised too, and normalised states ψ_n can be constructed from a normalised state ψ_0 according to:

    \psi_n = \frac{1}{\sqrt{n!}}\left(a^\dagger\right)^n \psi_0.        (6.30)
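The operator algebra above is easy to check numerically in a truncated number basis, in which a has the matrix elements a|n⟩ = √n |n−1⟩ (a sketch; the truncation size `nmax` is arbitrary):

```python
import numpy as np

nmax = 8                                   # truncated number basis |0>, ..., |nmax-1>
# a|n> = sqrt(n)|n-1>: sqrt(1), ..., sqrt(nmax-1) on the first superdiagonal
a = np.diag(np.sqrt(np.arange(1, nmax)), k=1)
adag = a.T                                 # Hermitian conjugate (real matrix)

N = adag @ a                               # number operator a†a
H = N + 0.5 * np.eye(nmax)                 # H in units of hbar*omega, Eq. (6.20)

# [a, a†] = 1, Eq. (6.19) (exact except in the last truncated state)
comm = a @ adag - adag @ a
assert np.allclose(np.diag(comm)[:-1], 1.0)

# H|n> = (n + 1/2)|n>, and ||a† psi_n||^2 = n + 1, Eq. (6.29)
assert np.allclose(np.diag(H), np.arange(nmax) + 0.5)
psi3 = np.eye(nmax)[3]                     # the state |3>
assert np.isclose(np.linalg.norm(adag @ psi3)**2, 4.0)
print("ladder-operator algebra verified in a truncated basis")
```

The commutator fails only in the highest truncated state, which is the usual price of representing an infinite-dimensional algebra by finite matrices.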

Using the commutation relations for a, a†, it is also possible to show that states belonging to different energy levels are mutually orthogonal:

    \langle\psi_n|\psi_m\rangle \propto \langle\psi_0|\, a^n \left(a^\dagger\right)^m |\psi_0\rangle.        (6.31)

Moving the a's to the right by application of the commutation relations leads to a form involving the lowering operator a acting on ψ_0, which vanishes. Exercise: show that ⟨ψ_2|ψ_3⟩ indeed vanishes.

We have succeeded in finding the energy spectrum, but it might seem that we have not made any progress in finding the form of the eigenfunctions ψ_n. However, we have a simple differential equation defining the ground state ψ_0:

    a\psi_0(\tilde{x}) = \frac{\sqrt{2}}{2}\,(\tilde{x} + i\tilde{p})\,\psi_0(\tilde{x}) = 0        (6.32)

or:

    \left(\tilde{x} + \frac{d}{d\tilde{x}}\right)\psi_0(\tilde{x}) = 0.        (6.33)


The solution can immediately be found as:

    \psi_0(\tilde{x}) = \text{Const.}\; e^{-\tilde{x}^2/2}        (6.34)

in accordance with the result obtained in the direct method. The normalisation constant (for the wave function normalised with respect to the original coordinate x) is found as

    \text{Const.} = \left(\frac{m\omega}{\hbar\pi}\right)^{1/4}        (6.35)

(check!). Using (6.30) and a† = (x̃ − d/dx̃)/√2, we can write the solution for general n as:

    \psi_n(\tilde{x}) = \left(\frac{m\omega}{\hbar\pi}\right)^{1/4} \frac{1}{\sqrt{n!\,2^n}} \left(\tilde{x} - \frac{d}{d\tilde{x}}\right)^n e^{-\tilde{x}^2/2},        (6.36)

which indeed turns out to be in accordance with the solution found in the direct method, but we shall not go into this any further.

7 Angular momentum

7.1 Spectrum of the angular momentum operators

We have seen that the energy spectrum of the harmonic oscillator is easy to find using creation and annihilation operators. Similar methods can be used to find the eigenvalues of angular momentum operators. We know two such types of operators: the analogue of the classical angular momentum:

    \mathbf{L} = \mathbf{r}\times\mathbf{p}        (7.1)

and the spin S. These operators satisfy the commutation relations:

    [J_i, J_j] = i\hbar\,\varepsilon_{ijk} J_k.        (7.2)

Here, i, j and k are indices denoting the Cartesian components x, y, z. The operator J is an angular momentum operator like L or S. ε_{ijk} is the Levi-Civita tensor: it is 1 if ijk is an even permutation of 123 and −1 for an odd permutation (the even permutations of 123 are 123, 231 and 312; the remaining three are the odd permutations). In fact, we will call every operator satisfying (7.2) an angular momentum operator. From the commutation relations (7.2) it can be derived that the components of J commute with J², which we can write symbolically as:

    [\mathbf{J}, J^2] = 0.        (7.3)

Exercise: prove this relation.

The operator J² is positive. This means that for any state |u⟩, ⟨u|J²|u⟩ ≥ 0. Exercise: prove this.

If the Hamiltonian of a physical system commutes with every component of an angular momentum operator J, the eigenstates can be rearranged to be simultaneous eigenstates of the Hamiltonian, J² and J_z (it is impossible to include J_x or J_y because they do not commute with J_z). In analogy to the raising and lowering operators for the harmonic oscillator, we define the operators J_+ and J_- as follows:

    J_+ = J_x + iJ_y,        (7.4a)
    J_- = J_x - iJ_y.        (7.4b)

These operators are not Hermitian – they are each other's Hermitian conjugates. They satisfy the following commutation relations:

    [J_z, J_\pm] = \pm\hbar J_\pm;        (7.5a)
    [J_+, J_-] = 2\hbar J_z.        (7.5b)

By definition, we call the eigenvalues of J² ℏ²j(j+1) and those of J_z ℏm. Here j and m are real (i.e. not necessarily integer) numbers which we will have to find. The eigenstates can now be written as |jm⟩, where we have omitted quantum labels associated with other operators, such as the Hamiltonian. Note that we can always take j ≥ 0 because J² is a positive operator. We now show that for an angular momentum eigenstate |jm⟩, J_±|jm⟩ is an angular momentum eigenstate too:

    J_z\left[J_+|jm\rangle\right] = (J_+ J_z + \hbar J_+)|jm\rangle = \hbar(m+1)\left[J_+|jm\rangle\right]        (7.6)

and because J_+ commutes with J², we see that J_+|jm⟩ is proportional to an angular momentum eigenstate |j, m+1⟩. Similarly, J_-|jm⟩ is proportional to |j, m−1⟩. Therefore, J_± are called raising and lowering operators for the quantum number m. This means that, given an eigenstate |jm⟩, we can in principle construct an infinite sequence of eigenstates by acting an arbitrary number of times on it with J_±. The sequence is finite only if, after acting a finite number of times with either J_+ or J_-, the new state is zero. The first result we have obtained is that the eigenstates |jm⟩ occur in sequences of states with the same j but m stepping up and down by 1.

Suppose |jm⟩ is normalised; then we can calculate the norm of J_+|jm⟩. Using J_-J_+ = J² − J_z² − ℏJ_z (check!) we have:

    \langle J_+ jm|J_+ jm\rangle = \langle jm|J_- J_+|jm\rangle = \langle jm|J^2 - J_z^2 - \hbar J_z|jm\rangle = \hbar^2 (j-m)(j+m+1).        (7.7)

Similarly:

    \langle J_- jm|J_- jm\rangle = \langle jm|J_+ J_-|jm\rangle = \langle jm|J^2 - J_z^2 + \hbar J_z|jm\rangle = \hbar^2 (j+m)(j-m+1).        (7.8)

Both expressions must be positive, and this restricts m to the values

    -j \le m \le j.        (7.9)

The only way to restrict m to |m| ≤ j is when J_+, acting a certain number of times on |jm⟩, yields zero:

    J_+^p |jm\rangle = 0.        (7.10)

Similarly,

    J_-^q |jm\rangle = 0.        (7.11)

Now consider the state

    |j, m+p-1\rangle = J_+^{p-1}|jm\rangle        (7.12)

where the equality holds up to a normalisation constant. This is an angular momentum eigenstate, since it is obtained by acting p−1 times with J_+ on an eigenstate |jm⟩. We must have

    J_+|j, m+p-1\rangle = 0,        (7.13)

which implies that the norm of the resulting state vanishes. By (7.7) it follows that

    j = m + p - 1        (7.14)

(note that the other solution, j = −m − p, is impossible because |m| ≤ j and p > 0). In a similar fashion we find

    -j = m - q + 1        (7.15)

and combining the last two equations yields:

    2j = p + q - 2 = \text{integer}.        (7.16)

Therefore, j is either integer or half-integer, and m assumes the values

    m = -j, -j+1, \ldots, j-1, j.        (7.17)

In conclusion we have:

    The angular momentum states can be labelled |j, m⟩. The numbers j are either integer or half-integer. For a given j, the numbers m run through the values −j, −j+1, …, j−1, j.        (7.18)

From (7.7) and (7.8) we see that from a properly normalised state |jm⟩ we can obtain properly normalised states as follows:

    |j, m-1\rangle = \frac{1}{\hbar\sqrt{(j+m)(j-m+1)}}\, J_-|jm\rangle,        (7.19a)
    |j, m+1\rangle = \frac{1}{\hbar\sqrt{(j-m)(j+m+1)}}\, J_+|jm\rangle.        (7.19b)

These states are defined up to a phase e^{iα}.
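The whole construction can be made concrete by building the (2j+1)-dimensional matrices of J_z and J_± from the matrix elements in Eq. (7.19) and checking the algebra (a sketch in units where ℏ = 1; the helper `jmatrices` is ours):

```python
import numpy as np

def jmatrices(j):
    """Jz, J+, J- in the basis |j,m>, m = j, j-1, ..., -j (units hbar = 1)."""
    m = np.arange(j, -j - 1, -1)
    jz = np.diag(m).astype(complex)
    # J+|j,m> = sqrt((j-m)(j+m+1)) |j,m+1>, cf. Eq. (7.19b)
    jplus = np.diag(np.sqrt((j - m[1:]) * (j + m[1:] + 1)), k=1).astype(complex)
    return jz, jplus, jplus.conj().T

for j in (0.5, 1.0, 1.5, 2.0):
    jz, jp, jm = jmatrices(j)
    jx = (jp + jm) / 2
    jy = (jp - jm) / (2 * 1j)
    # commutation relation [Jx, Jy] = i Jz, Eq. (7.2)
    assert np.allclose(jx @ jy - jy @ jx, 1j * jz)
    # J^2 = j(j+1) times the identity on the (2j+1)-dimensional multiplet
    j2 = jx @ jx + jy @ jy + jz @ jz
    assert np.allclose(j2, j * (j + 1) * np.eye(int(2 * j + 1)))
print("angular momentum algebra verified for j = 1/2, 1, 3/2, 2")
```

Note that both integer and half-integer j pass the check, in line with the conclusion (7.18).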

7.2 Orbital angular momentum

As we have seen, the quantum analogue of the classical angular momentum L = r × p is an angular momentum operator, because it satisfies the commutation relations (7.2). This can be shown using the commutation relation

    [p, x] = \frac{\hbar}{i}.        (7.20)

Exercise: Show that (7.2) holds, using the following formulation of the cross-product:

    (\mathbf{a}\times\mathbf{b})_k = \varepsilon_{ijk}\, a_i b_j.        (7.21)

This type of angular momentum will be called orbital angular momentum, since it is expressed in the orbital coordinates of the particle. Another type of angular momentum operator is that representing the spin. We will now find the spectrum of the orbital angular momentum operators for a single particle in three dimensions. It turns out convenient to express L_z in polar coordinates. We will not derive this expression but simply give:

    L_z = -i\hbar\frac{\partial}{\partial\varphi}.        (7.22)

The fact that it depends only on ϕ is obvious, since L_z is associated with a rotation around the z-axis, and such a motion is expressed as a variation of the angle ϕ. The eigenfunctions of L², L_z can be written as functions of the angles ϑ and ϕ. We know that these are eigenfunctions of L_z with eigenvalue ℏm. Denoting the eigenfunctions F(ϑ, ϕ), we have:

    -i\frac{\partial}{\partial\varphi} F(\vartheta,\varphi) = m\, F(\vartheta,\varphi).        (7.23)


This differential equation has the solution

    F(\vartheta,\varphi) = G(\vartheta)\, e^{im\varphi}.        (7.24)

The wavefunction of the particle should be single-valued – hence it should be equal for ϕ and ϕ + 2π – and this restricts m to integer values. Hence we have:

    The orbital angular momentum of a single particle has only integer quantum numbers j, m.

This result can be generalised to the orbital momentum of a system consisting of an arbitrary number of particles. Half-integer values of j can only come about through particles with half-integer spin.

7.3 Spin

Classically, a charged particle having a nonzero angular momentum has a nonzero magnetic moment. The magnetic moment for a particle of charge q is given by:

    \mathbf{m} = \frac{q}{2}\,(\mathbf{r}\times\mathbf{v}) = \frac{q}{2m}\,\mathbf{L}.        (7.25)

The energy of a magnetic moment in an external magnetic field B is given by

    -\mathbf{m}\cdot\mathbf{B}.        (7.26)

According to the correspondence principle, we add this energy as an extra term to the Hamiltonian of a (spinless) electron (q = −e):

    H = H_0 + H_1;        (7.27a)
    H_0 = \frac{p^2}{2m} + V(\mathbf{r});        (7.27b)
    H_1 = \frac{e}{2m}\,\mathbf{L}\cdot\mathbf{B}.        (7.27c)

We take B in the z-direction. If the potential is spherically symmetric, V(r) = V(r), the eigenstates of H_0 can be taken as simultaneous eigenstates of L² and L_z. But in that case, they are also eigenstates of H_1 and therefore of H. For an eigenstate of H_0 with energy E_0, H_1 shifts the energy by an amount

    \Delta E_1 = \frac{e\hbar}{2m}\, M B = \mu_B\, M B.        (7.28)
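To get a feeling for the size of this splitting (a numerical aside; the 1 tesla field is just an example value), Eq. (7.28) can be evaluated directly:

```python
# Zeeman level shift Delta E_1 = mu_B * M * B, Eq. (7.28), for M = 1
hbar = 1.054571817e-34       # J s
e = 1.602176634e-19          # C
m_e = 9.1093837015e-31       # kg

mu_B = e * hbar / (2 * m_e)  # Bohr magneton, in J/T
B = 1.0                      # example field of 1 tesla
dE_eV = mu_B * B / e         # splitting between adjacent M levels, in eV
print(f"mu_B = {mu_B:.4e} J/T, splitting at 1 T = {dE_eV:.3e} eV")
```

The resulting splitting of roughly 6 × 10⁻⁵ eV per tesla is tiny compared to typical electronic level spacings of a few eV, which is why Zeeman splittings show up only as fine structure in spectra.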

Here M is the quantum number associated with L_z (the capital letter M is used in order to avoid confusion with the mass m). We see that a magnetic field lifts the L_z-degeneracy, yielding a splitting of an l-level into 2l+1 sublevels. Zeeman indeed observed a splitting of the levels of atoms in a magnetic field, but these splittings were not in accordance with (7.28). Later, Uhlenbeck and Goudsmit (1925) explained the observed anomaly by assuming the existence of an intrinsic angular momentum variable, i.e. one not associated with the orbital coordinates. This angular momentum was called spin, S. With the spin there is associated a magnetic moment, given by

    \mathbf{m} = -\frac{eg}{2m}\,\mathbf{S}.        (7.29)

The factor g is very close to 2, and its value can be derived only by using relativistic quantum mechanics. The eigenvalues of the spin operators S², S_z are by convention ℏ²s(s+1) and ℏm_s. For an electron, s is always 1/2, and therefore m_s can only assume the values 1/2 and −1/2; the eigenvalue of S² is always 3ℏ²/4 for an electron. Other particles have been found with spin 0, 1, 3/2, etc. Writing down the Hamiltonian for an electron with spin, we have:

    H = H_0 + H_1;        (7.30)
    H_0 = \frac{p^2}{2m} + V(\mathbf{r});        (7.31)
    H_1 = \frac{e}{2m}\,(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{B}.        (7.32)

We have however forgotten something. Associating a magnetic moment with the spin and one with the orbital angular momentum, we must also take into account the interaction between these two! For a proper calculation of this interaction we would need relativistic electrodynamics, and therefore we simply quote the result:

    H_{SO} = \frac{ge}{2}\,\frac{1}{4\pi\varepsilon_0}\,\frac{1}{2m^2c^2}\,\frac{1}{r}\frac{dV(r)}{dr}\,\mathbf{S}\cdot\mathbf{L}.        (7.33)

For the hydrogen atom, with V(r) = 1/r (choosing suitable units), we have:

    H_{SO} = \frac{ge^2}{8\pi\varepsilon_0\, m^2 c^2}\,\frac{\mathbf{S}\cdot\mathbf{L}}{r^3}.        (7.34)

This spin-orbit splitting is observed experimentally.

7.4 Addition of angular momenta

Consider an electron in a hydrogen atom. The electron has orbital angular momentum, characterised by the quantum numbers l, m_l, and spin quantum numbers s, m_s. The total angular momentum J is the sum of the vector operators L and S:

    \mathbf{J} = \mathbf{L} + \mathbf{S}.        (7.35)

What are the possible eigenvalues of J² and J_z? Heuristically, we can approach this problem by adding L and S as vectors. However, the relative orientation of the two is not arbitrary, as we know that the eigenvalues j, m of the resultant operator are quantised. If L and S are "aligned", we have j = l + s, and if they are opposite we have j = l − s. This means that j is half-integer and does not differ by more than 1/2 from l.

We now want to analyse the combination of angular momenta in a more formal way, starting with the problem of adding two spins, S₁ and S₂. Let us first ask ourselves why we would like to know the relation between the two angular momenta to be added and the result. To answer this question, consider a system consisting of two particles with orbital angular momentum zero (l = 0, m_l = 0), each with spin 1/2, described by the interaction

    V(r) = V_1(r) + V_2(r)\,\frac{\mathbf{S}_1\cdot\mathbf{S}_2}{\hbar^2}.        (7.36)

The second term contains the magnetic interaction between the spins. Neither S_{1z} nor S_{2z} commutes with the second term, and therefore the eigenstates of the Hamiltonian are not simultaneous eigenstates of S_{1z} and S_{2z}. To find observables which do commute with the second term, we note that

    \mathbf{S}_1\cdot\mathbf{S}_2 = \frac{1}{2}\left(S^2 - S_1^2 - S_2^2\right)        (7.37)

and this commutes with S², S₁², S₂² and S_z = S_{1z} + S_{2z}. Exercise: Prove these commutation relations.

Therefore, the eigenstates can be labelled by s₁, s₂ (both 1/2), s_tot (to be evaluated) and m_s (the eigenvalue for S_z; to be evaluated). So let us consider the possible eigenvalues of S² and S_z. We start from the states |s₁, m₁; s₂, m₂⟩, where the labels belonging to the two particles are still separated. As m₁ and m₂ can assume the values 1/2 and −1/2 (denoted by + and − respectively), we have four such states. As the values of s₁ and s₂ are fixed, we can denote the four states simply by |m₁, m₂⟩:

    \chi_1 = |{+}{+}\rangle,        (7.38a)
    \chi_2 = |{+}{-}\rangle,        (7.38b)
    \chi_3 = |{-}{+}\rangle,        (7.38c)
    \chi_4 = |{-}{-}\rangle.        (7.38d)

We must find linear combinations of these states which are eigenstates of S² and S_z. It turns out that all four states are indeed eigenstates of S_z:

    S_z\chi_1 = (S_{1z} + S_{2z})\,\chi_1 = \hbar(1/2 + 1/2)\,\chi_1 = \hbar\chi_1        (7.39)

and furthermore:

    S_z\chi_2 = 0;        (7.40a)
    S_z\chi_3 = 0;        (7.40b)
    S_z\chi_4 = -\hbar\chi_4.        (7.40c)

Now consider the state χ₂ + χ₃. This is certainly not an eigenstate of S_{1z}, and neither of S_{2z}. But it is an eigenstate of S_z with eigenvalue 0. We see therefore that eigenstates of S_z need not necessarily be eigenstates of S_{1z} or S_{2z}. Now we try to find eigenstates of S². It is convenient to write this operator in the following form:

    S^2 = S_1^2 + S_2^2 + 2S_{1z}S_{2z} + S_{1+}S_{2-} + S_{1-}S_{2+}        (7.41)

where we have used the raising and lowering operators:

    S_{1\pm} = S_{1x} \pm iS_{1y}        (7.42)

etc. These operators have the usual effect when acting on our states:

    S_{1+}|{+}{+}\rangle = 0,        (7.43a)
    S_{1+}|{-}{+}\rangle = \hbar|{+}{+}\rangle,        (7.43b)

etcetera (check this). From this, it can be verified that the required eigenstates are:

    \Psi_1 = \chi_1 = |{+}{+}\rangle;        (7.44a)
    \Psi_2 = \chi_4 = |{-}{-}\rangle;        (7.44b)
    \Psi_3 = \frac{\chi_2 + \chi_3}{\sqrt{2}} = \frac{1}{\sqrt{2}}\left(|{+}{-}\rangle + |{-}{+}\rangle\right);        (7.44c)
    \Psi_4 = \frac{\chi_2 - \chi_3}{\sqrt{2}} = \frac{1}{\sqrt{2}}\left(|{+}{-}\rangle - |{-}{+}\rangle\right).        (7.44d)

Using Eq. (7.41), it follows that

    S^2\Psi_1 = 2\hbar^2\Psi_1, \quad\text{hence } s = 1;        (7.45a)
    S^2\Psi_2 = 2\hbar^2\Psi_2, \quad\text{hence } s = 1;        (7.45b)
    S^2\Psi_3 = 2\hbar^2\Psi_3, \quad\text{hence } s = 1;        (7.45c)
    S^2\Psi_4 = 0, \quad\text{hence } s = 0.        (7.45d)
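These results can be verified numerically by building the two-spin operators as Kronecker products of single-spin matrices (a sketch in units where ℏ = 1, so S = σ/2 with σ the Pauli matrices):

```python
import numpy as np

# Pauli matrices; single-spin operators S = sigma/2 in units hbar = 1
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
id2 = np.eye(2)

# two-spin product space: S1 acts on the first factor, S2 on the second
S1 = [np.kron(s, id2) for s in (sx, sy, sz)]
S2 = [np.kron(id2, s) for s in (sx, sy, sz)]
Stot2 = sum((a + b) @ (a + b) for a, b in zip(S1, S2))   # S^2 = (S1 + S2)^2

up, dn = np.array([1, 0], complex), np.array([0, 1], complex)
triplet = np.kron(up, dn) + np.kron(dn, up)              # chi_2 + chi_3, unnormalised
singlet = np.kron(up, dn) - np.kron(dn, up)              # chi_2 - chi_3

# S^2 eigenvalues s(s+1): 2 for the triplet (s = 1), 0 for the singlet (s = 0)
assert np.allclose(Stot2 @ triplet, 2 * triplet)
assert np.allclose(Stot2 @ singlet, 0 * singlet)
print("triplet: s = 1, singlet: s = 0")
```

The same check applied to |++⟩ and |−−⟩ reproduces (7.45a) and (7.45b).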

Exercise: Check these results.

The states can now be labelled |s, m_s⟩ (both s₁ and s₂ are equal to 1/2), with either s = 1 and m_s equal to −1, 0 or +1, or s = 0 and m_s = 0. The s = 1 state is called the triplet state and the s = 0 state the singlet – the names refer to the degeneracy.

Now we consider the addition of an orbital momentum L to a single spin S which has quantum number s = 1/2:

    \mathbf{J} = \mathbf{L} + \mathbf{S}.        (7.46)

The eigenstates we start from are |lm; s m_s⟩, where s = 1/2; we will omit the quantum number s in the remainder. Furthermore, we again denote the two possible values for m_s, 1/2 and −1/2, by + and − respectively. The fact that we should end up with linear combinations being eigenstates of J_z = L_z + S_z restricts the combinations to the pairs

    \alpha|l, m; +\rangle + \beta|l, m+1; -\rangle,        (7.47)

which has eigenvalue m_j = m + 1/2 of J_z. α and β will be fixed by the requirement that the resulting combination is an eigenstate of J², which we write in the form:

    J^2 = L^2 + S^2 + 2L_z S_z + L_+ S_- + L_- S_+.        (7.48)

Consider the action of this operator on the state (7.47):

    \frac{J^2}{\hbar^2}\left[\alpha|l,m;+\rangle + \beta|l,m+1;-\rangle\right]
        = \alpha\left[l(l+1) + \frac{3}{4} + m\right]|l,m;+\rangle + \alpha\sqrt{(l-m)(l+m+1)}\,|l,m+1;-\rangle
        + \beta\left[l(l+1) + \frac{3}{4} - m - 1\right]|l,m+1;-\rangle + \beta\sqrt{(l-m)(l+m+1)}\,|l,m;+\rangle.        (7.49)

We require that this be equal to j(j+1)[α|l,m;+⟩ + β|l,m+1;−⟩]. This leads to a linear homogeneous set of equations for α, β:

    \alpha\left[l(l+1) + \frac{3}{4} + m - j(j+1)\right] + \beta\sqrt{(l-m)(l+m+1)} = 0,        (7.50)
    \alpha\sqrt{(l-m)(l+m+1)} + \beta\left[l(l+1) + \frac{3}{4} - m - 1 - j(j+1)\right] = 0.        (7.51)

This can only hold if the determinant of the system of linear equations vanishes, and this leads to:

    \left[l(l+1) + \frac{3}{4} + m - j(j+1)\right]\left[l(l+1) + \frac{3}{4} - m - 1 - j(j+1)\right] = (l-m)(l+m+1)        (7.52)

and for given l and m this equation has two solutions for j, given by the conditions:

    j(j+1) - l(l+1) - \frac{3}{4} = -l - 1 \quad\text{or}        (7.53a)
    j(j+1) - l(l+1) - \frac{3}{4} = l,        (7.53b)

which leads to

    j = l + 1/2 \quad\text{or}\quad j = l - 1/2.        (7.54)

The ratio α/β follows from the above equations, and the coefficients are fixed by the requirement that they are normalised. For j = l + 1/2:

    \alpha = \sqrt{\frac{l+m+1}{2l+1}}; \qquad \beta = \sqrt{\frac{l-m}{2l+1}},        (7.55)

and for j = l − 1/2:

    \alpha = -\sqrt{\frac{l-m}{2l+1}}; \qquad \beta = \sqrt{\frac{l+m+1}{2l+1}}        (7.56)

(up to an overall phase; the relative sign follows from Eq. (7.50)). The analysis presented here can be generalised to arbitrary angular momentum operators. This becomes a tedious job, which leads to the identification of the linear expansion coefficients, which are called Clebsch-Gordan coefficients. For details, see Messiah.
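A quick numerical cross-check of these coefficients (a sketch for the particular case l = 1, m = 0; the 2×2 matrix below is the block of J²/ℏ² in the basis {|l,m;+⟩, |l,m+1;−⟩} read off from Eqs. (7.50)-(7.51)):

```python
import numpy as np

l, m = 1, 0
off = np.sqrt((l - m) * (l + m + 1))
J2 = np.array([[l*(l+1) + 0.75 + m,     off],
               [off,                    l*(l+1) + 0.75 - m - 1]])

evals, evecs = np.linalg.eigh(J2)
# eigenvalues j(j+1) for j = l - 1/2 = 1/2 and j = l + 1/2 = 3/2
assert np.allclose(sorted(evals), [0.75, 3.75])

# eigenvector for j = l + 1/2 should be (alpha, beta) of Eq. (7.55)
alpha = np.sqrt((l + m + 1) / (2*l + 1))
beta = np.sqrt((l - m) / (2*l + 1))
v = evecs[:, np.argmax(evals)]
v = v * np.sign(v[0])                     # fix the arbitrary overall sign
assert np.allclose(v, [alpha, beta])
print("Clebsch-Gordan coefficients for l = 1, m = 0 verified")
```

Diagonalising the small block is exactly the determinant condition (7.52) in matrix form, so the agreement is no accident.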

7.5 Angular momentum and rotations

In this section we consider rotations of the physical system at hand. Such a rotation can, for three-dimensional space, be expressed as a rotation matrix. This is the class of matrices which are orthogonal: the columns, when considered as vectors, form an orthonormal set, and the same can be said of the rows. Furthermore, the determinant of the matrix is +1 (if the determinant is −1, there is an additional reflection). For simplicity, we will confine ourselves in the analysis which follows to rotations around the z-axis. The matrix of such a rotation over an angle α reads:

    R(\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}.        (7.57)

Of course, if we rotate a physical system, its state, which we will denote |ψ⟩, will change, and we represent this change by an operator R:

    |\psi\rangle \xrightarrow{\ \text{Rotation}\ } R|\psi\rangle.        (7.58)

Now consider the r-representation

    \psi(\mathbf{r}) = \langle\mathbf{r}|\psi\rangle.        (7.59)

The new state of the system is the same as the old one up to a rotation, so if we evaluate the old state at a position r rotated back over an angle α, we should get exactly the same result as when we evaluate the new state in r (see figure 1):

    \langle R^{-1}\mathbf{r}|\psi\rangle = \langle\mathbf{r}|R\psi\rangle.        (7.60)

Using this relation we can find an expression for the operator R. Consider an infinitesimal rotation of a single particle around the z-axis with rotation angle δ, evaluated at r = (x, y, z):

    R(\delta)\psi(\mathbf{r}) = \psi(R^{-1}\mathbf{r}) = \psi(x + y\delta,\, y - x\delta,\, z)
        = \psi(x, y, z) + \delta\left[y\frac{\partial\psi(\mathbf{r})}{\partial x} - x\frac{\partial\psi(\mathbf{r})}{\partial y}\right]
        = (1 - i\delta L_z/\hbar)\,\psi(x, y, z).        (7.61)

This relation is valid for small angles. The expression for larger angles can be found by applying many rotations over small angles in succession. Chopping the angle α into N pieces (N large), we have

    R(\alpha) = \left(1 - i\frac{\alpha}{N}\frac{L_z}{\hbar}\right)^{N} = \exp(-iL_z\alpha/\hbar).        (7.62)

This result can be generalised to rotations around an arbitrary axis characterised by a unit vector u:

    R(\alpha) = \exp(-i\alpha\,\mathbf{u}\cdot\mathbf{L}/\hbar).        (7.63)
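The matrix analogue of this construction is easy to check numerically: exponentiating the antisymmetric generator of infinitesimal z-rotations reproduces the rotation matrix (7.57). A sketch (the Taylor-series `expm` below is a simple stand-in for a library matrix exponential):

```python
import numpy as np

def expm(A, terms=30):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    result, term = np.eye(len(A)), np.eye(len(A))
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

# generator of rotations about the z-axis: K = dR/d(alpha) at alpha = 0
K = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 0.0]])

alpha = 0.7
R = np.array([[np.cos(alpha), -np.sin(alpha), 0],
              [np.sin(alpha),  np.cos(alpha), 0],
              [0, 0, 1]])
assert np.allclose(expm(alpha * K), R)
print("exp(alpha K) reproduces the rotation matrix (7.57)")
```

Here K plays the role of −iL_z/ℏ in Eq. (7.62), acting on the coordinates rather than on the wave function.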

This equation has been derived for a single particle, but it can be generalised to systems consisting of more particles. The angular momentum operator is then the sum of the angular momentum operators of the individual particles. If the particles have spin, this is to be included in the total angular momentum. Equation (7.63) is in fact often used as the definition of total angular momentum. The commutation relations (7.2) can be derived from it, using the commutation relations for rotation matrices (exercise!).

Suppose we have a Hamiltonian H which is spherically symmetric. This implies that a rotation has no effect on the matrix elements of H:

    \langle\psi|H|\phi\rangle = \langle\psi'|H|\phi'\rangle        (7.64)

where the primed states are related to the unprimed ones through a rotation. Therefore we have:

    \langle\psi|H|\phi\rangle = \langle\psi|R^\dagger H R|\phi\rangle = \langle\psi|e^{i\alpha\mathbf{u}\cdot\mathbf{J}/\hbar}\, H\, e^{-i\alpha\mathbf{u}\cdot\mathbf{J}/\hbar}|\phi\rangle.        (7.65)

This relation should hold in particular for infinitesimal rotations, and expanding the exponentials to first order in α we obtain:

    \langle\psi|H|\phi\rangle = \langle\psi|\,1 - \frac{i\alpha}{\hbar}\left[H\,\mathbf{J}\cdot\mathbf{u} - \mathbf{J}\cdot\mathbf{u}\,H\right]|\phi\rangle = \langle\psi|\,1 - \frac{i}{\hbar}\alpha\,\mathbf{u}\cdot[H, \mathbf{J}]\,|\phi\rangle.        (7.66)

As this should hold for arbitrary directions u and arbitrary states ψ, φ, we have

    [H, \mathbf{J}] = 0        (7.67)

and therefore J is a conserved quantity, as

    \frac{d}{dt}\langle\mathbf{J}\rangle = \frac{i}{\hbar}\,\langle[H, \mathbf{J}]\rangle = 0.        (7.68)

Note that it is essential here to consider the total angular momentum, that is, including the spin degrees of freedom.

8 Introduction to Quantum Cryptography

8.1 Introduction

Some of the most important technical developments in the next few years will be based on quantum mechanics. In particular, spectacular developments and applications are expected in the areas of quantum cryptography, quantum teleportation and quantum computing. In this note, I shall briefly explain some issues involved in quantum cryptography.

The idea of quantum cryptography hinges upon the measurement postulate of quantum mechanics. This postulate deals with measurements of physical observables. In quantum mechanics, such an observable is represented by a Hermitian operator, say Q̂. The eigenvectors of this Hermitian operator are denoted |φ_n⟩, with corresponding eigenvalues λ_n. The measurement postulate says that, for a state |ψ⟩ in Hilbert space, which can be expanded as

    |\psi\rangle = \sum_n c_n |\phi_n\rangle,        (8.1)

the result of a measurement of Q̂ yields one of its eigenvalues λ_n. The probability of finding a particular value λ_n is given by |c_n|², and after the measurement the state of the system is reduced to the corresponding eigenvector |φ_n⟩. This last aspect, the fact that the state of a system is influenced by any observation, enables us to detect whether someone, an eavesdropper, has tried to read information, as we shall see below.

8.2 The idea of classical encryption

Encryption of messages can be useful for many different applications. In all these applications, someone, denoted as A, sends a message to B, in such a way that an eavesdropper (E) cannot detect the information sent. To make the example more lively, A is usually given the name Alice, B is Bob, and the eavesdropper E is called Eve. A schematic drawing of the procedure is depicted here:

[Figure: Alice sends a message to Bob while Eve tries to eavesdrop.]

A message which Alice sends to Bob is a series of bits:

0110011101011101....

In order to prevent Eve from eavesdropping on the message, Alice and Bob decide to encrypt their messages. For this purpose, several schemes exist, and we shall present the simplest one here. Alice and Bob have met once, and on that occasion they have agreed on a key which they will use to encrypt messages. A key is some sequence of bits, e.g.:

1111010001010100....

The key does not have any particular structure. Before Alice sends over her message, she encrypts it by performing an exclusive or of her message and the key. An exclusive or performed on message and key performs a bitwise comparison: if two corresponding bits (at the same position) of the message and the key are equal, the result has a bit value 0. In the other cases, i.e. when the bits are unequal, the result has a bit value of 1:

0110011101011101.... Message
1111010001010100.... Key
1001001100001001.... Message XOR Key = Encrypted message.

Bob receives the encrypted message and performs again an exclusive or with the key, which unveils the original contents of the message:

1001001100001001.... Encrypted message
1111010001010100.... Key
0110011101011101.... Message.

Eve can only intercept the encrypted message, and it is difficult (usually impossible) for her to make sense of it. Suppose however that Alice and Bob communicate very frequently, using the same key for each message. In that case, Eve might guess what the key is: she could let her computer generate many different candidate keys and use them to decrypt the messages exchanged between Alice and Bob. She might then quickly guess parts of the key, and once parts are known, discovering the remainder takes less and less effort. Therefore, it would be wise for Alice and Bob to use a key which is at least as long as their messages. Here we have a problem. In order to safely exchange the keys, Bob and Alice have to meet before each message, or they must use a (hopefully) reliable courier.
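The encryption and decryption steps can be illustrated with the bit strings above (a minimal sketch; the helper `xor` is ours). Applying the same key twice returns the original message, which is why one operation serves for both directions:

```python
# XOR (exclusive or) encryption of a bit string with a key of equal length
message = "0110011101011101"
key     = "1111010001010100"

def xor(bits, key):
    """Bitwise exclusive or of two equally long bit strings."""
    return "".join("0" if b == k else "1" for b, k in zip(bits, key))

encrypted = xor(message, key)
assert encrypted == "1001001100001001"     # matches the worked example
assert xor(encrypted, key) == message      # Bob recovers the message
print("encrypted:", encrypted)
```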
The dependence on couriers makes this encryption method cumbersome and vulnerable. Another way of encrypting messages is to use much shorter keys and to encrypt the message using some elaborate mathematical transformation depending on this key. This is done in the Rivest, Shamir and Adleman (RSA) encryption. The idea is based on the factorisation of numbers into prime numbers. Consider the product of two large prime numbers. If you know only that product, it is difficult to find its two prime factors. On the other hand, if you know one of these factors, it is easy to compute the other. So, if Bob and Alice have an encryption and decryption algorithm based on the two prime factors, they can encrypt and decrypt their messages if they know these factors. The product is public in this case, that is, it is available to everyone, and to Eve in particular. Now suppose that Eve finds out the factorisation of the product; then she can eavesdrop on all the messages. The point is that the factorisation requires an amount of cpu time which grows exponentially with the number of bits of the product. So, if that number is large enough, Eve will never be able to crack the code, and this method seems quite safe. In 1994, however, it was shown that a new type of computer, based on the quantum mechanical behaviour of matter, should be able to do the factorisation in a number of steps which grows only as a power


of the number of bits of the product. Only a very primitive quantum computer has been developed to date, but people believe that in the future, RSA will no longer be safe.

8.3 Quantum Encryption

In quantum encryption, two channels are used: one is a public channel, such as the internet, and the other is a private one. The channels are shown in the figure.

[Figure: Alice and Bob are connected by a public channel and by a private quantum-mechanical (QM) channel; Eve tries to listen in.]

This private channel cannot always be guaranteed to be safe from Eve (otherwise, encryption would not be necessary any longer), but Bob and Alice can detect whether their communication has been eavesdropped by Eve, as we shall explain below. The private channel is used to communicate the key only, and this key can be used for the standard exclusive-or encryption described in the previous section.

The communication through the private channel is based on quantum mechanics. The information carriers of this channel are photons in some polarisation state. Unfortunately, the details of the quantum states of photons cannot be given here, as they involve quantum field theory. Therefore you must accept some of the facts which are given in the following. Recall that light is an electromagnetic wave phenomenon and is therefore a wave with a certain polarisation. A polaroid filter will be transparent only for photons with a certain direction of polarisation, and opaque for photons with the perpendicular polarisation. A photon polarisation state can be represented as a unit vector in the two-dimensional plane perpendicular to the direction of propagation of the photon. The states |1⟩ and |0⟩ shown in the figure below form a basis in the Hilbert space of all possible polarisation states (the wave propagates along a direction perpendicular to the paper).

[Figure: the two basis polarisation states |1⟩ and |0⟩, at right angles to each other.]

A state at angle ϑ with respect to the x-axis would then have the form

    |\vartheta\rangle = \cos\vartheta\,|1\rangle + \sin\vartheta\,|0\rangle.


If a detector is put behind a polarisation filter aligned along the x-direction, and a photon is sent to that filter, the detector will register the arrival of the photon if it is polarised in the x-direction, and not when its polarisation is along the y-axis. A photon in the state |ϑ⟩ would thus be detected with probability cos²ϑ.

Now we consider the transmission of data through the quantum channel. This channel is a glass fiber through which Alice sends photons which she selects with a polariser which has one of the four possible orientations depicted in the figure below:

[Figure: the four polariser orientations 1α, 0α, 1β, 0β, with the β directions rotated over 45° with respect to the α directions.]

Note that

    |1\beta\rangle = \frac{1}{\sqrt{2}}\left(|1\alpha\rangle + |0\alpha\rangle\right)

etcetera. Now Bob will receive these photons at his end of the fiber. He first lets them pass through a polariser before they can arrive at the detector. For each photon, he aligns his polariser either along 1α in the figure, or along 1β. When he detects a photon, he records a '1', otherwise a '0'. Whether Bob detects a photon depends on his and on Alice's polariser. Suppose Bob has his polariser along 1α. Then, if Alice has sent a |1α⟩ photon, Bob will detect it. If she has sent a 1β photon, Bob will or will not detect it, with equal probabilities. The same holds for a 0β photon. If a 0α photon was sent by Alice, Bob will not detect it. The table below gives the probabilities with which Bob detects a photon for each of the four possible polarisations which Alice can send over.

8.3. Quantum Encryption

[Figure, rendered as a table: the probability that Bob detects a photon, for his two polariser settings and Alice's four possible polarisations.]

Alice sends:          |1α⟩   |0α⟩   |1β⟩   |0β⟩
Bob, polariser 1α:     1      0     0.5    0.5
Bob, polariser 1β:    0.5    0.5     1      0

After a number of photons has been sent over, Bob sends Alice the settings of the polariser he used for each photon, 1α or 1β, using the public channel. Alice responds by telling Bob which of his settings were compatible with hers (i.e. whether she used an α or a β polariser, not whether she used the 1 or the 0 setting). For compatible settings, i.e. when Bob and Alice both used α or both used β, they both know the result of Bob’s detections. They keep these results as the bits of a sequence. For all other photons, Bob has detected 0 or 1 at random, so these events are discarded. The sequence of retained bits is now taken as the key for encrypting a message with the exclusive-or encryption described in the previous section. Let us consider what would happen during a sample session:

Alice:             1α   0α   0β   1β   1α   1β   1α
Bob's settings:    1α   1β   1β   1α   1α   1β   1β
Bob's detections:   1    1    0    0    1    1    1
Retained bits:      1    x    0    x    1    1    x

Bob sends over his settings (second line) and Alice tells him which of these were compatible with hers. An ‘x’ in the last line denotes a discarded bit. The sequence 1011 is now the key.

Now consider the possibility of eavesdropping. If Eve intercepts the channel, the photons she measures are lost, so she has to send new photons on to Bob. Suppose that Alice used polarisation |1α⟩ and that Bob used the 1α polariser. Without eavesdropping, he would then receive a correct bit of the key, in this case a 1. But suppose Eve used a β polariser. If Eve detects |0β⟩, she will send a similar photon on to Bob. Bob, however, uses an α polariser, so he will find a 0 (i.e. no detection) with 50% probability. If Bob and Alice exchange and compare some of their key bits, they immediately discover such mismatches in the keys, so they stop communicating.
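The sifting procedure above can be mimicked in a few lines of code (a toy model of an ideal channel without eavesdropping; all names are ours):

```python
import random

def bb84_sift(n_photons, seed=17):
    """Toy BB84 key sifting on an ideal channel without eavesdropping.

    Alice picks a random basis ('a' for alpha, 'b' for beta) and a random
    bit for each photon; Bob measures in a random basis.  When the bases
    differ, Bob's detection is a coin flip; when they agree, he recovers
    Alice's bit.  Only the matching-basis events are kept as key bits.
    """
    rng = random.Random(seed)
    key_alice, key_bob = [], []
    for _ in range(n_photons):
        a_basis, b_basis = rng.choice("ab"), rng.choice("ab")
        a_bit = rng.randint(0, 1)
        b_bit = a_bit if a_basis == b_basis else rng.randint(0, 1)
        if a_basis == b_basis:          # announced over the public channel
            key_alice.append(a_bit)
            key_bob.append(b_bit)
    return key_alice, key_bob

key_a, key_b = bb84_sift(1000)
print(len(key_a), key_a == key_b)   # roughly half the photons survive; keys agree
```

On average half of the basis choices coincide, so roughly half of the photons contribute a key bit.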


It is thus necessary to send over only one photon at a time, otherwise Eve could insert a beamsplitter in the quantum channel and detect half or more of the key without being noticed. Therefore, only low intensities must be used, which limits the distance over which communication is possible. With present-day technology, a few tens of kilometers can be reliably bridged with low intensity optical fibers.

9 Scattering in classical and in quantum mechanics

Scattering experiments are perhaps the most important tool for obtaining detailed information on the structure of matter, in particular on the interaction between particles. Examples of scattering techniques include neutron and X-ray scattering for liquids, atoms scattering from crystal surfaces, and elementary particle collisions in accelerators. In most of these scattering experiments, a beam of incident particles hits a target which also consists of many particles. The distribution of the scattered particles over the different directions is then measured, for different energies of the incident particles. This distribution is the result of many individual scattering events. Quantum mechanics enables us, in principle, to evaluate for an individual event the probabilities for the incident particles to be scattered off in different directions, and this probability is identified with the measured distribution. Suppose we have an idea of what the potential between the particles involved in the scattering process might look like, for example from quantum mechanical energy calculations (programs for this purpose will be discussed in the next few chapters). We can then parametrise the interaction potential, i.e. we write it as an analytic expression involving a set of constants: the parameters. If we evaluate the scattering probability as a function of the scattering angles for different values of these parameters, and compare the results with experimental scattering data, we can find those parameter values for which the agreement between theory and experiment is optimal. Of course, it would be nice if we could evaluate the scattering potential directly from the scattering data (this is called the inverse problem), but this is unfortunately very difficult (if not impossible), as many different interaction potentials can have similar scattering properties, as we shall see below.
Many different motivations for obtaining accurate interaction potentials can be given. One is that we might use the interaction potential to make predictions about the behaviour of a system consisting of many interacting particles, such as a dense gas or a liquid. Scattering may be elastic or inelastic. In the former case the energy is conserved; in the latter it is not: energy is transferred from the scattered particles to degrees of freedom which are not included explicitly in the system (inclusion of these degrees of freedom would restore energy conservation). In this chapter we shall consider elastic scattering.

9.1 Classical analysis of scattering

In chapter 3, we have analysed the motion of two bodies attracting each other by a gravitational force which decays with increasing separation r as 1/r². This analysis also applies to opposite charges, which feel an attractive force of the same form (Coulomb’s law). When the force is repulsive, the solution remains the same – we only have to change the sign of the parameter A which defines the interaction potential according to V(r) = A/r. One of the key experiments in physics, which led to the notion that atoms consist of small but heavy nuclei surrounded by a cloud of light electrons, is Rutherford scattering. In this experiment, a thin gold sheet was bombarded with α-particles (i.e. helium-4 nuclei) and the scattering of the latter was analysed using detectors behind the gold film.

In this section, we shall first formulate some new quantities for describing scattering processes and then calculate those quantities for the case of Rutherford scattering. Rutherford scattering is chosen as an example here – scattering problems can be studied more generally; see Griffiths, chapter 11, section 11.1.1 for a nice description of classical scattering.

We consider scattering of particles incident on a so-called ‘scattering centre’, which may be another particle. The scattering centre is supposed to be at rest. This might not always be justified in a real experiment, but the analysis of chapter 3, in which the full two-body problem was reduced to a one-body problem with a reduced mass, carries over to the present case. The incident particles interact with the scattering centre located at r = 0 through the usual scalar two-point potential V(r), which satisfies the requirements of Newton’s third law.

Suppose we have a beam of incident particles parallel to the z-axis. The beam has a homogeneous density close to that axis, and we can define a flux: the number of particles passing a unit area perpendicular to the beam, per unit time. Usually, particles close to the z-axis will be scattered more strongly than particles far from the z-axis, as the interaction potential between the incident particles and the scattering centre falls off with their separation r. An experimentalist cannot analyse the detailed orbits of the individual particles – instead, a detector is placed at a large distance from the scattering centre, and this detector counts the number of particles arriving at each position. You may think of this detector as a photographic plate which changes colour to an extent related to the number of particles hitting it.
The theorist wants to predict what the experimentalist measures, starting from the interaction potential V(r) which governs the scattering process. In figure 9.1, the geometry of the process is shown. In addition, a small cone, spanned by the spherical polar angles dϑ and dϕ, is displayed. It is assumed here that the scattering takes place in a small neighbourhood of the scattering centre, so that, seen from the detector, the orbits of the scattered particles all seem to be directed radially outward from the scattering centre. The surface dA of the intersection of the cone with a sphere of radius R around the scattering centre is given by dA = R² sin ϑ dϑ dϕ. The quantity sin ϑ dϑ dϕ is called the solid angle and is usually denoted by dΩ. This dΩ defines a cone like the one shown in figure 9.1. Now consider the number of particles which hit the detector within this small area per unit time. This number, divided by the incident flux (see above), is called the differential scattering cross section, dσ/dΩ:

dσ(Ω)/dΩ = (number of particles leaving the scattering centre through the cone dΩ per unit time) / (flux of the incident beam).  (9.1)

The differential cross section has the dimension of area (length²). First we note that the problem is symmetric with respect to rotations around the z-axis, so the differential scattering cross section depends only on ϑ. The only two relevant parameters of the incoming particle then are its velocity and its distance b from the z-axis. This distance is called the impact parameter – it is also shown in figure 9.1. We first calculate the scattering angle ϑ as a function of the impact parameter b. We use the solution found in chapter 3 [Eq. (3.24)], which is now a hyperbola. We write this solution in the form

r = λ (1 + ε) / [ε cos(ϑ − C) − 1].  (9.2)

The integration constant C reappears in the cosine because we have not chosen ϑ = 0 at the perihelion – the closest approach occurs when the particle crosses the dashed line in figure 9.1, which bisects the angle between the in- and outgoing particle directions.


Figure 9.1: Geometry of the scattering process. b is the impact parameter and ϕ and ϑ are the angles of the orbit of the outgoing particle.

We know that for ϑ = π, r → ∞, from which we have

cos(π − C) = 1/ε.  (9.3)

Because the cosine is even [cos x = cos(−x)], we can infer that the other value of ϑ for which r goes to infinity, and which corresponds to the outgoing direction, occurs when the argument of the cosine is C − π, so that we find

ϑ∞ − C = C − π,  (9.4)

or ϑ∞ = 2C − π. The subscript ∞ indicates that this value corresponds to t → ∞. From the last two equations we find the following relation between the scattering angle ϑ∞ and ε:

sin(ϑ∞/2) = 1/ε.  (9.5)

We want to know ϑ∞ as a function of b rather than ε, however. To this end we note that the angular momentum is given as

ℓ = µ v_inc b,  (9.6)

where ‘inc’ stands for ‘incident’, and the total energy as

E = (µ/2) v²_inc,  (9.7)

so that the impact parameter can be found as

b = ℓ / √(2µE).  (9.8)

Using Eq. (3.21), we can finally write (9.5) in the form

cot(ϑ∞/2) = √(ε² − 1) = 2Eb/|A|.  (9.9)

From the relation between b and ϑ∞ we can find the differential scattering cross section. The particles scattered with angle between ϑ and ϑ + dϑ must have approached the scattering centre with impact parameters between corresponding boundaries b and b + db. The number of particles flowing per unit time through the ring with radius b and width db is given as j 2πb db, where j is the incident flux. We consider a segment dϕ of this ring. Hence:

dσ(Ω) = b(ϑ) db dϕ.  (9.10)

Relation (9.9) can be used to express the right hand side in terms of ϑ∞ (we drop the subscript ∞, and take absolute values since we are counting particles):

dσ(Ω) = (A/2E)² cot(ϑ/2) d[cot(ϑ/2)] dϕ = (A/2E)² cot(ϑ/2) · 1/(2 sin²(ϑ/2)) dϑ dϕ.  (9.11)

This can be worked out straightforwardly, using dΩ = sin ϑ dϑ dϕ = 2 sin(ϑ/2) cos(ϑ/2) dϑ dϕ, to yield

dσ(Ω)/dΩ = (A/4E)² · 1/sin⁴(ϑ/2).  (9.12)

This is the famous Rutherford formula.
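The chain of steps from b(ϑ) to Eq. (9.12) can be checked numerically: below we evaluate dσ/dΩ = (b/sin ϑ)|db/dϑ|, with b(ϑ) from Eq. (9.9) and a numerical derivative, and compare with the Rutherford formula (a sketch; the unit choice A = 1, E = 1/2 is ours):

```python
import math

# Unit choice (ours): A = 1, E = 0.5, so that A/(2E) = 1.
A, E = 1.0, 0.5

def b(theta):
    """Impact parameter as a function of scattering angle, from Eq. (9.9)."""
    return abs(A) / (2 * E) / math.tan(theta / 2)

def dsigma_numeric(theta, h=1e-6):
    """dsigma/dOmega = (b/sin theta)|db/dtheta|, central-difference derivative."""
    dbdtheta = (b(theta + h) - b(theta - h)) / (2 * h)
    return b(theta) / math.sin(theta) * abs(dbdtheta)

def dsigma_rutherford(theta):
    """The Rutherford formula, Eq. (9.12)."""
    return (A / (4 * E)) ** 2 / math.sin(theta / 2) ** 4

for theta in (0.5, 1.0, 2.0, 3.0):
    print(theta, dsigma_numeric(theta), dsigma_rutherford(theta))
```

The two columns agree to high accuracy for all scattering angles.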

9.2 Quantum scattering with a spherical potential

We now consider the scattering problem within quantum mechanics, by looking at a particle incident on a scattering centre, which is usually another particle.¹ We assume that we know the scattering potential, which is spherically symmetric, so that it depends only on the distance between the particle and the scattering centre. We shall again calculate the differential cross section, dσ/dΩ(Ω), which describes how the scattered intensity is distributed over the various solid angles Ω. This quantity, integrated over the spherical angles ϑ and ϕ, is the total cross section, σ_tot. The scattering process is described by the solutions of the single-particle Schrödinger equation involving the (reduced) mass m, the relative coordinate r and the interaction potential V between the particle and the interaction centre:

[−(ℏ²/2m) ∇² + V(r)] ψ(r) = E ψ(r).  (9.13)

This is a partial differential equation in three dimensions, which could be solved using the ‘brute force’ discretisation methods presented in appendix A, but by exploiting the spherical symmetry of the potential we can solve the problem in another, more elegant way which, moreover, works much faster on a computer. More specifically, in section 9.2.1 we shall establish a relation between the phase shift and the scattering cross sections. In this section, we shall restrict ourselves to a description

¹ Every two-particle collision can be transformed into a single scattering problem involving the relative position; in the transformed problem the incoming particle has the reduced mass m = m1 m2/(m1 + m2).


Figure 9.2: The radial wave functions for l = 0 for various square well potential depths.

of the concept of phase shift and describe how it can be obtained from the solutions of the radial Schrödinger equation.

For the potential V(r) we make the assumption that it vanishes for r larger than a certain value r_max. In case we are dealing with an asymptotically decaying potential, we neglect contributions from the potential beyond the range r_max, which must be chosen suitably, or treat the tail in a perturbative manner. For a spherically symmetric potential, the solution of the Schrödinger equation can always be written as

ψ(r) = ∑_{l=0}^∞ ∑_{m=−l}^{l} A_lm [u_l(r)/r] Y_l^m(ϑ, ϕ),  (9.14)

where u_l satisfies the radial Schrödinger equation:

[ (ℏ²/2m) d²/dr² + E − V(r) − ℏ² l(l+1)/(2mr²) ] u_l(r) = 0.  (9.15)

Figure 9.2 shows the solution of the radial Schrödinger equation with l = 0 for a square well potential for various well depths – our discussion applies also to nonzero values of l. Outside the well, the solution u_l can be written as a linear combination of the two independent solutions j_l and n_l, the regular and irregular spherical Bessel functions. We write this linear combination in the particular form

u_l(r > r_max) ∝ kr [cos δ_l j_l(kr) + sin δ_l n_l(kr)].  (9.16)

The phase shift δ_l is determined via a matching procedure at the well boundary. The motivation for writing u_l in this form follows from the asymptotic expansion for the spherical Bessel functions:

kr j_l(kr) ≈ sin(kr − lπ/2),  (9.17a)
kr n_l(kr) ≈ cos(kr − lπ/2),  (9.17b)

with k = √(2mE)/ℏ, which can be used to rewrite (9.16) as

u_l(r) ∝ sin(kr − lπ/2 + δ_l),  large r.  (9.18)

We see that u_l approaches a sine-wave form for large r, and the phase of this wave is determined by δ_l, hence the name ‘phase shift’ for δ_l (for l = 0, u_l is a sine wave for all r > r_max). The phase shift as a function of energy and l contains all the information about the scattering properties of the potential. In particular, the phase shift enables us to calculate the scattering cross sections; this will be done in section 9.2.1, and here we simply quote the results. The differential cross section is given in terms of the phase shift by

dσ/dΩ = (1/k²) |∑_{l=0}^∞ (2l+1) e^{iδ_l} sin(δ_l) P_l(cos ϑ)|²,  (9.19)

and for the total cross section we find

σ_tot = 2π ∫ dϑ sin ϑ (dσ/dΩ)(ϑ) = (4π/k²) ∑_{l=0}^∞ (2l+1) sin² δ_l.  (9.20)

Summarising the analysis up to this point, we see that the potential determines the phase shift through the solution of the Schrödinger equation for r < r_max. The phase shift acts as an intermediate object between the interaction potential and the experimental scattering cross sections, as the latter can be determined from it.

Unfortunately, the expressions (9.19) and (9.20) contain sums over an infinite number of terms – hence they cannot be evaluated on the computer exactly. However, cutting off these sums can be motivated by a physical argument. Classically, only the waves with an angular momentum smaller than ℏl_max = ℏk r_max will ‘feel’ the potential – particles with higher l-values pass by unaffected. Therefore we can safely cut off the sums at a somewhat higher value of l – we can always check whether the results change significantly when taking more terms into account.

How is the phase shift determined in practice? First, the Schrödinger equation must be integrated from r = 0 outwards with boundary condition u_l(r = 0) = 0. At r_max, the numerical solution must be matched onto the form (9.16) to fix δ_l. This can be done straightforwardly in the few cases where an analytical solution is known. For example, if the potential is a hard core with

V(r) = ∞ for r < a;  V(r) = 0 for r ≥ a,  (9.21)

we know that the solution for l = 0 is given as

u(r) ∼ (r − a) j_0(k(r − a)) ∝ sin(k(r − a)),  (9.22)

which vanishes for r = a. Comparing with (9.18), we immediately see that δ_0 = −ka, which can be substituted directly into the expressions for the cross sections.
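For the hard core, the boundary condition u_l(a) = 0 applied to Eq. (9.16) in fact fixes every phase shift: tan δ_l = j_l(ka)/n_l(ka), with n_l in the standard sign convention (n_l ~ −cos(x − lπ/2)/x, opposite in sign to Eq. (9.17b)). The following sketch (our own code, not the notes' program) computes the phase shifts this way and evaluates the truncated sum (9.20); the terms die off quickly for l ≳ ka, illustrating the cutoff argument above:

```python
import math

def sph_bessel(lmax, x):
    """Spherical Bessel functions j_l(x) and n_l(x), l = 0..lmax, by
    upward recursion.  Standard convention: n_l ~ -cos(x - l*pi/2)/x."""
    j = [math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x]
    n = [-math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x]
    for l in range(1, lmax):
        j.append((2 * l + 1) / x * j[l] - j[l - 1])
        n.append((2 * l + 1) / x * n[l] - n[l - 1])
    return j, n

def hard_sphere_sigma(k, a, lmax=12):
    """Total cross section (9.20) for a hard sphere of radius a: the
    boundary condition u_l(a) = 0 gives tan(delta_l) = j_l(ka)/n_l(ka)."""
    j, n = sph_bessel(lmax, k * a)
    sigma = 0.0
    for l in range(lmax + 1):
        delta = math.atan2(j[l], n[l])
        sigma += (2 * l + 1) * math.sin(delta) ** 2
    return 4 * math.pi / k**2 * sigma

print(hard_sphere_sigma(1.0, 1.0))
# Low-energy limit ka << 1: only l = 0 contributes (delta_0 = -ka) and
# sigma -> 4*pi*a^2, four times the classical geometric cross section.
print(hard_sphere_sigma(0.01, 1.0) / (4 * math.pi))   # close to 1
```

For l = 0 the formula reduces to tan δ_0 = −tan(ka), i.e. δ_0 = −ka modulo π, consistent with Eq. (9.22).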

[Plot: V_eff(r) in meV versus r in units of σ.]

Figure 9.3: The effective potential for the Lennard-Jones interaction for various l-values.

In a computational approach, we use the values of the numerical solution at two different points r1 and r2 beyond r_max; this avoids having to calculate derivatives of the numerical solution. From (9.16) it follows directly that the phase shift is given by

tan δ_l = (K j_l^(1) − j_l^(2)) / (K n_l^(1) − n_l^(2)),  (9.23a)

with

K = (r1 u_l^(2)) / (r2 u_l^(1)).  (9.23b)

In this equation, j_l^(1) stands for j_l(kr1), etc.

A computational example is based on the work by Toennies et al. (J. Chem. Phys. 71, 614, 1979) on the scattering of hydrogen atoms off noble gas atoms. Figure 9.3 shows the Lennard-Jones interaction potential plus the centrifugal barrier ℏ²l(l+1)/(2mr²) of the radial Schrödinger equation, for various l-values. For higher l-values, the potential consists essentially of a hard core, a well and a barrier caused by the centrifugal term. In such a potential, quasi-bound states are possible. These are states which would be genuine bound states for a potential whose barrier does not drop to zero for larger values of r, but remains at its maximum height. You can imagine the following to happen when a particle is injected into the potential at precisely this energy: it tunnels through the barrier, remains in the well for a relatively long time, and then tunnels outward through the barrier in an arbitrary direction, because it has ‘forgotten’ its original direction. In wave-like terms, the particle resonates in the well, and this state decays after a relatively long time. This phenomenon is called a ‘scattering resonance’. Particles injected at this energy are strongly scattered, and this shows up as a peak in the total cross section.
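The two-point matching formula (9.23) is easily tried out on a case with a known answer: the attractive square well with l = 0, for which tan(ka + δ₀) = (k/q) tan(qa), with q = √(2m(E + V₀))/ℏ. The sketch below (units ℏ = m = 1; the well parameters, step size and match points are our choices; n₀ is taken in the standard sign convention, with which (9.23) is consistent) integrates the radial equation outward and extracts δ₀:

```python
import math

# Attractive square well: V(r) = -V0 for r < a, 0 beyond (units hbar = m = 1).
V0, a, E = 2.0, 1.0, 0.5
k, q = math.sqrt(2 * E), math.sqrt(2 * (E + V0))

def j0(x): return math.sin(x) / x
def n0(x): return -math.cos(x) / x    # standard Neumann sign convention

def u_at(r_end, h=1e-4):
    """Integrate u'' = 2(V(r) - E)u outward (RK4) with u(0) = 0, u'(0) = 1."""
    def f(r, u): return 2.0 * ((-V0 if r < a else 0.0) - E) * u
    r, u, up = 0.0, 0.0, 1.0
    for _ in range(int(round(r_end / h))):
        k1u, k1p = up, f(r, u)
        k2u, k2p = up + 0.5*h*k1p, f(r + 0.5*h, u + 0.5*h*k1u)
        k3u, k3p = up + 0.5*h*k2p, f(r + 0.5*h, u + 0.5*h*k2u)
        k4u, k4p = up + h*k3p, f(r + h, u + h*k3u)
        u += h/6 * (k1u + 2*k2u + 2*k3u + k4u)
        up += h/6 * (k1p + 2*k2p + 2*k3p + k4p)
        r += h
    return u

r1, r2 = 2.0, 2.5                       # two match points beyond the well
K = r1 * u_at(r2) / (r2 * u_at(r1))     # Eq. (9.23b)
tan_delta = ((K * j0(k*r1) - j0(k*r2))
             / (K * n0(k*r1) - n0(k*r2)))   # Eq. (9.23a), l = 0
delta_exact = math.atan(k / q * math.tan(q * a)) - k * a
print(math.atan(tan_delta), math.atan(math.tan(delta_exact)))  # agree mod pi
```

The numerically matched phase shift agrees with the analytic square-well result to high accuracy.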

[Plot: total cross section versus energy (meV).]

Figure 9.4: The total cross section as a function of the energy for a Lennard-Jones potential modelling the H–Kr system. The peaks correspond to resonant scattering states.

Such peaks can be seen in figure 9.4, which shows the total cross section as a function of the energy, calculated with a program as described above. The peaks are due to l = 4, l = 5 and l = 6 scattering, with energies increasing with l. Figure 9.5, finally, shows the experimental results for the total cross section for H–Kr. We see that the agreement is excellent.

9.2.1 Calculation of scattering cross sections

In this section we derive Eqs. (9.19) and (9.20). At a large distance from the scattering centre we can make an Ansatz for the wave function, consisting of the incoming beam and a scattered wave:

ψ(r) ∼ e^{ik·r} + f(ϑ) e^{ikr}/r.  (9.24)

Here ϑ is the angle between the incoming beam and the line passing through r and the scattering centre. f does not depend on the azimuthal angle ϕ, because the incoming wave has azimuthal symmetry, and the spherically symmetric potential will not generate m ≠ 0 contributions to the scattered wave. f(ϑ) is called the scattering amplitude. From the Ansatz it follows that the differential cross section is given directly by the square of this amplitude:

dσ/dΩ = |f(ϑ)|².  (9.25)

Beyond r_max, the solution can also be written in the form (9.14), leaving out all m ≠ 0 contributions


Figure 9.5: Experimental results as obtained by Toennies et al. for the total cross section (arbitrary units) of the scattering of hydrogen atoms by noble gas atoms as function of centre of mass energy.

because of the azimuthal symmetry:

ψ(r) = ∑_{l=0}^∞ A_l [u_l(r)/r] P_l(cos ϑ),  (9.26)

where we have used the fact that Y_l^0(ϑ, ϕ) is proportional to P_l(cos ϑ). Because the potential vanishes in the region r > r_max, the solution u_l(r)/r is given by the linear combination of the regular and irregular spherical Bessel functions, and as we have seen this reduces for large r to

u_l(r) ≈ sin(kr − lπ/2 + δ_l).  (9.27)

We want to derive the scattering amplitude f(ϑ) by equating the expressions (9.24) and (9.26) for the wave function. For large r we obtain, using (9.27):

∑_{l=0}^∞ A_l [sin(kr − lπ/2 + δ_l)/(kr)] P_l(cos ϑ) = e^{ik·r} + f(ϑ) e^{ikr}/r.  (9.28)

We write the right hand side of this equation as an expansion similar to that on the left hand side, using the following expression for a plane wave (see e.g. Abramowitz and Stegun, Handbook of Mathematical Functions, Dover, 1965):

e^{ik·r} = ∑_{l=0}^∞ (2l+1) i^l j_l(kr) P_l(cos ϑ).  (9.29)

f(ϑ) can also be written as an expansion in Legendre polynomials:

f(ϑ) = ∑_{l=0}^∞ f_l P_l(cos ϑ),  (9.30)

so that we obtain:

∑_{l=0}^∞ A_l [sin(kr − lπ/2 + δ_l)/(kr)] P_l(cos ϑ) = ∑_{l=0}^∞ [(2l+1) i^l j_l(kr) + f_l e^{ikr}/r] P_l(cos ϑ).  (9.31)

If we substitute the asymptotic form (9.17) of j_l in the right hand side, we find:

∑_{l=0}^∞ A_l [sin(kr − lπ/2 + δ_l)/(kr)] P_l(cos ϑ) = (1/r) ∑_{l=0}^∞ { [(2l+1)/(2ik)] (−1)^{l+1} e^{−ikr} + [f_l + (2l+1)/(2ik)] e^{ikr} } P_l(cos ϑ).  (9.32)

Both the left and the right hand side of (9.32) contain in- and outgoing spherical waves (the occurrence of incoming spherical waves does not violate causality: they arise from the incoming plane wave). For each l, the prefactors of the incoming and of the outgoing waves should be equal on both sides of (9.32). This condition leads to

A_l = (2l+1) e^{iδ_l} i^l  (9.33)

and

f_l = [(2l+1)/k] e^{iδ_l} sin(δ_l).  (9.34)

Using (9.25), (9.30), and (9.34), we can write down an expression for the differential cross section in terms of the phase shifts δ_l:

dσ/dΩ = (1/k²) |∑_{l=0}^∞ (2l+1) e^{iδ_l} sin(δ_l) P_l(cos ϑ)|².  (9.35)

For the total cross section we find, using the orthonormality relations of the Legendre polynomials:

σ_tot = 2π ∫ dϑ sin ϑ (dσ/dΩ)(ϑ) = (4π/k²) ∑_{l=0}^∞ (2l+1) sin² δ_l.  (9.36)

9.2.2 The Born approximation

Consider again a particle which is scattered by a potential. We shall now relax the condition that the potential be spherically symmetric. The stationary Schrödinger equation for the wavefunction reads

[−(ℏ²/2m) ∇² + V(r)] ψ(r) = E ψ(r).

For V(r) ≡ 0, an incoming plane wave is a solution to this equation. It turns out to be possible to write the solution to the Schrödinger equation with potential formally as an integral expression. This


is done using the Green’s function formalism. The Green’s function depends on two positions r and r′; it is defined by

[E + (ℏ²/2m) ∇² − V(r)] G(r, r′) = δ(r − r′).

To understand the Green’s function (and easily recall its definition) you may view the delta function on the right hand side as a unit operator, so that G may be called the inverse of the operator E Î − Ĥ, where Î is the unit operator. For V(r) ≡ 0 we call the Green’s function G0:

[E + (ℏ²/2m) ∇²] G0(r, r′) = δ(r − r′).

Before calculating G0, let us assume we have it at our disposal. We may then write the solution to the full Schrödinger equation, i.e. including the potential V, in terms of a solution φ(r) of the ‘bare’ Schrödinger equation, that is, the Schrödinger equation with potential V ≡ 0:

ψ(r) = φ(r) + ∫ G0(r, r′) V(r′) ψ(r′) d³r′.  (9.37)

This can easily be checked by substituting the solution into the full Schrödinger equation and using the fact that E Î − Ĥ, acting on the Green’s function, gives a delta function.

Now we consider the scattering problem with an incoming beam of the form φ(r) = exp(i k_i · r) (the subscript ‘i’ denotes the incoming wave vector). We see from Eq. (9.37) that this wave persists, but that it is accompanied by a scattering term: the integral on the right hand side. The wavefunction ψ(r) is still very difficult to find, as it occurs in Eq. (9.37) in an implicit form. We can make the equation explicit if we assume that the potential V(r) is small, so that the scattered part of the wave is much smaller than the wavefunction of the incoming beam. In a first approximation we may then replace ψ(r′) on the right hand side of Eq. (9.37) by φ(r′), which is a plane wave:

ψ(r) = φ(r) + ∫ G0(r, r′) V(r′) φ(r′) d³r′ = e^{i k_i·r} + ∫ G0(r, r′) V(r′) e^{i k_i·r′} d³r′.

The key to the scattering amplitude is given by the notion that it must always be possible to write the solution (9.37) in the form

ψ(r) = e^{i k_i·r} + f(ϑ, ϕ) e^{ikr}/r.

At this moment we hardly recognise this form in the expression obtained for the wavefunction. We first must find the explicit expression for the Green’s function G0. Without going through the derivation (see for example Griffiths, pp. 364–366) we give it here:

G0(r, r′) = −(2m/ℏ²) e^{ik|r−r′|} / (4π|r−r′|),

with k = √(2mE/ℏ²). Now we take r far from the origin. As the range of the potential is finite, we know that only contributions with r′ ≪ r have to be taken into account. Taylor expanding the exponent occurring in the Green’s function,

|r − r′| = √(r² − 2 r·r′ + r′²) ≈ r − (r·r′)/r,

leads to

G0(r, r′) ≈ −(2m/ℏ²) [e^{ikr}/(4πr)] e^{−ik r·r′/r}.


In the denominator we have replaced |r − r′| by r; the correction this induces gives a much smaller contribution to the result for r ≫ 1/k. Now we define k_f = k r/r, i.e. k_f is the wave vector corresponding to an outgoing wave travelling from the scattering centre to the point r. We have

ψ(r) = φ(r) − (2m/ℏ²) [e^{ikr}/(4πr)] ∫ V(r′) e^{−i k_f·r′} e^{i k_i·r′} d³r′.

This is precisely of the required form provided we set

f(ϑ, ϕ) = −[m/(2πℏ²)] ∫ V(r′) e^{i(k_i − k_f)·r′} d³r′.

This is the so-called first Born approximation. It is valid for weak scattering; higher-order approximations can be made by iterative substitution for ψ(r′) in the integral occurring in Eq. (9.37). In the first-order Born approximation, the scattering amplitude f(ϑ, ϕ) is, up to a prefactor, the Fourier transform of the scattering potential.

As an example, we consider a potential which is not weak but which is easily tractable within the Born scheme: the Coulomb potential

V(r) = [q1 q2/(4πε0)] (1/r).

The Fourier transform of this potential (understood as the limit of a screened Coulomb potential) reads

Ṽ(k) = ∫ V(r) e^{ik·r} d³r = q1 q2 / (ε0 k²).

Therefore, we immediately find for f(ϑ):

f(ϑ) = −m q1 q2 / [2πε0 ℏ² (k_i − k_f)²].

The angle ϑ is hidden in k_i − k_f, the norm of which is equal to 2k sin(ϑ/2). The result therefore is, using E = ℏ²k²/(2m):

dσ/dΩ = [q1 q2 / (16πε0 E sin²(ϑ/2))]².

This is precisely the classical Rutherford formula: for the Coulomb potential, the first Born approximation happens to reproduce the exact classical result. This could not have been anticipated beforehand; it is a happy coincidence.
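That f is (up to a prefactor) the Fourier transform of the potential can be checked numerically. For a spherically symmetric potential the transform reduces to the radial integral Ṽ(κ) = (4π/κ) ∫₀^∞ V(r) sin(κr) r dr. The sketch below evaluates this for the screened Coulomb (Yukawa) potential e^{−µr}/r — our choice of regularised example, not used in the notes — whose transform is 4π/(κ² + µ²); the µ → 0 limit reproduces the Coulomb result used above:

```python
import math

mu, kappa = 0.5, 1.3   # screening parameter and momentum transfer (our choices)

def integrand(r):
    # V(r) * sin(kappa*r) * r with V(r) = exp(-mu*r)/r
    return math.exp(-mu * r) * math.sin(kappa * r)

# Trapezoidal rule on a truncated range [0, R]; the tail beyond R is
# exponentially small.
h, R = 1e-3, 60.0
n = int(round(R / h))
s = 0.5 * (integrand(0.0) + integrand(R)) + sum(integrand(i * h) for i in range(1, n))
numeric = 4 * math.pi / kappa * h * s
exact = 4 * math.pi / (kappa**2 + mu**2)
print(numeric, exact)   # agree closely
```

The agreement confirms the analytic transform; inserting it into the Born amplitude and taking µ → 0 yields the Rutherford cross section derived above.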

10 Symmetry and conservation laws

In this chapter, we return to classical mechanics and explore the relation between the symmetry of a physical system and the conservation of physical quantities. In the first chapter, we have already seen that translational symmetry implies momentum conservation, that time translation symmetry implies energy conservation, and that rotational symmetry implies conservation of angular momentum. There exists a fundamental theorem, called Noether’s theorem, which shows that, indeed, for every continuous spatial symmetry of a system which can be described by a Lagrangian, some physical quantity is conserved; the theorem also allows us to find that quantity. The special form of the equations of motion for a system described by a Lagrangian (or Hamiltonian) leads already to a large number of conserved quantities, called Poincaré invariants. We shall consider only one Poincaré invariant here: phase space volume. The associated conservation law is called Liouville’s theorem.

10.1 Noether’s theorem

Suppose a mechanical system is invariant under symmetry transformations which can be parametrised by some real, continuous parameter. Examples include those mentioned above: rotations (parametrised by the rotation angles) or translations in space or time. The fact that the system is invariant under these transformations is reflected by the Lagrangian being invariant under these symmetries. For simplicity we shall restrict ourselves to a single continuous parameter, s. In the case of rotations one could imagine s to be the rotation angle about an axis fixed in space, such as the z-axis. The mechanical path of the system, i.e. the solution of the Euler-Lagrange equations of motion, is called q(t). Now we perform a symmetry transformation. This gives rise to a different path, which we call Q(s,t), with Q(0,t) = q(t). The path Q(s,t) should have the same value of the Lagrangian L as the path q(t); in other words, L should not depend on s:

(d/ds) L(Q(s,t), Q̇(s,t)) = 0.  (10.1)

This leads to

∑_{j=1}^N [ (∂L/∂Q_j)(∂Q_j/∂s) + (∂L/∂Q̇_j)(∂Q̇_j/∂s) ] = 0.  (10.2)

Now we use the Euler-Lagrange equations,

∂L/∂Q_j = (d/dt) (∂L/∂Q̇_j),  (10.3)

in order to write

dL/ds = ∑_{j=1}^N [ (∂L/∂Q_j)(∂Q_j/∂s) + (∂L/∂Q̇_j)(∂Q̇_j/∂s) ]
      = ∑_{j=1}^N [ (d/dt)(∂L/∂Q̇_j)(∂Q_j/∂s) + (∂L/∂Q̇_j)(∂Q̇_j/∂s) ]
      = (d/dt) ∑_{j=1}^N (∂L/∂Q̇_j)(dQ_j/ds) = 0,  (10.4)

and we see that the sum in the last expression must be a constant of the motion:

∑_{j=1}^N (∂L/∂Q̇_j)(dQ_j/ds) = ∑_{j=1}^N p_j (dQ_j/ds) = constant in time.  (10.5)

We see that any continuous symmetry of the Lagrangian leads to a constant of the motion, given by (10.5). This analysis is obviously rather abstract, so let us now consider an example. Suppose a one-particle system in three-dimensional space is invariant under rotations around the z-axis. The rotation angle is called α. In order to be able to evaluate the derivatives of the coordinates with respect to α, we use cylindrical coordinates (r, ϕ, z) with x = r cos ϕ and y = r sin ϕ. A rotation about the z-axis over an angle α then corresponds to

ϕ → ϕ + α,  (10.6)

so that we have

p_x dx/dα = −p_x r sin(ϕ + α) = −p_x y;  (10.7)
p_y dy/dα = p_y r cos(ϕ + α) = p_y x;  (10.8)
p_z dz/dα = 0,  (10.9)

so that the conserved quantity, from (10.5), is

x p_y − y p_x = L_z,  (10.10)

the z-component of the angular momentum. Similarly, we would find Lx and Ly for the conserved quantities associated with rotations around the x- and y- axes respectively. Also, it is easy to verify that for more than one particle, the total angular momentum is conserved. The reader is invited to check that space translation symmetry results in momentum conservation.
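The conservation of L_z can also be checked numerically for a rotationally invariant system (a sketch; the potential V(r) = −1/r, unit mass, and all numbers are our choices):

```python
import math

def accel(x, y):
    """Acceleration for the central potential V(r) = -1/r (unit mass)."""
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3

def rk4_step(state, h):
    """One RK4 step for the state (x, y, p_x, p_y)."""
    def deriv(s):
        x, y, px, py = s
        ax, ay = accel(x, y)
        return (px, py, ax, ay)
    k1 = deriv(state)
    k2 = deriv([s + 0.5 * h * d for s, d in zip(state, k1)])
    k3 = deriv([s + 0.5 * h * d for s, d in zip(state, k2)])
    k4 = deriv([s + h * d for s, d in zip(state, k3)])
    return [s + h / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
            for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)]

state = [1.0, 0.0, 0.0, 0.8]                      # x, y, p_x, p_y
Lz0 = state[0] * state[3] - state[1] * state[2]   # Eq. (10.10)
for _ in range(20000):
    state = rk4_step(state, 1e-3)
Lz = state[0] * state[3] - state[1] * state[2]
print(Lz0, Lz)   # equal to high accuracy
```

Even though x, y, p_x and p_y individually vary strongly along the orbit, the combination x p_y − y p_x stays fixed, as Noether's theorem demands for a rotationally invariant potential.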

10.2 Liouville’s theorem

A special conservation law is due to the fact that the equations of motion can be derived from a Hamiltonian (or from a Lagrangian). Such equations of motion are called canonical. The fact that the equations of motion are canonical reflects a symmetry which is called symplecticity (or symplecticness), a discussion of which is outside the scope of these notes. The important notion is that this type of symmetry leads to a number of conserved quantities, called Poincaré invariants, of which we shall consider only one: the volume of phase space. The proof of Liouville’s theorem hinges upon the fact that whenever in a volume integral like

V = ∫_Ω dⁿx  (10.11)


we perform a variable transformation x → y, we must put a correction factor det(J) in the integral, where J is the Jacobian matrix, given by

J_ij = ∂y_i/∂x_j.   (10.12)

We thus have

V = ∫_Ω dⁿx det(J) = ∫_{Ω0} dⁿy,   (10.13)

where Ω0 is the volume Ω transformed to y-space. The state of a mechanical system with N degrees of freedom is represented by a point in 2N-dimensional phase space (q_i, p_i). In the course of time, this point moves in phase space and traces out a trajectory. We now consider not a single mechanical system in phase space, but a set of systems which are initially homogeneously distributed over some region Ω0, with volume V0. In the course of time, every point in Ω0 will move in phase space, and Ω0 will therefore transform into some new region Ω(t). The volume of this new region is given as

V(t) = ∫_{Ω(t)} dⁿq dⁿp.   (10.14)

We want to show that V(t) = V0, hence that the volume does not change in time. To this end, we consider a transformation from time t = 0 to ∆t:

q′_i ≡ q_i(∆t) = q_i(0) + ∂H(p, q)/∂p_i ∆t + O(∆t²);   (10.15)
p′_i ≡ p_i(∆t) = p_i(0) − ∂H(p, q)/∂q_i ∆t + O(∆t²),   (10.16)

where we have used a first-order Taylor expansion and replaced the time derivatives of q_i and p_i using Hamilton's equations. Now we can evaluate the original volume V0 as follows:

V0 = ∫_{Ω0} dⁿq dⁿp = ∫_{Ω(∆t)} dⁿq dⁿp det[J(∆t)].   (10.17)

The Jacobi determinant can be written in block form (each entry denotes an N × N block; rows correspond to the q′_i and p′_i, columns to the q_j and p_j):

det[J(∆t)] = | 1 + ∆t ∂²H/∂q_j∂p_i       ∆t ∂²H/∂p_j∂p_i     |
             | −∆t ∂²H/∂q_j∂q_i          1 − ∆t ∂²H/∂p_j∂q_i |.   (10.18)

Careful consideration of this expression should convince you that det[J(∆t)] = 1 + O(∆t²). We see therefore that V(∆t) = V0 + O(∆t²), from which it follows that

dV/dt (t = 0) = 0.   (10.19)

This argument can be extended to arbitrary times, so that we have proven that V is a constant of the motion. We have found Liouville's theorem in the form: The volume of a region of phase space occupied by a set of systems does not change in time.
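The determinant argument can be illustrated with a single degree of freedom. The sketch below (pure Python; the quartic Hamiltonian H = p²/2 + q⁴/4 is a hypothetical choice for illustration, not from the text) builds the 2 × 2 Jacobian of the map (10.15)-(10.16) and shows that its determinant deviates from 1 only at order ∆t²:

```python
# For H = p^2/2 + q^4/4 the first-order map (10.15)-(10.16) reads
#     q' = q + dt*p,   p' = p - dt*q^3,
# and its Jacobian determinant should equal 1 up to terms of order dt^2.

def jac_det(q, p, dt):
    # J = [[dq'/dq, dq'/dp], [dp'/dq, dp'/dp]]
    dqdq, dqdp = 1.0, dt
    dpdq, dpdp = -dt * 3.0 * q * q, 1.0
    return dqdq * dpdp - dqdp * dpdq

for dt in (1e-2, 1e-3, 1e-4):
    print(dt, jac_det(2.0, 0.5, dt) - 1.0)   # deviation shrinks as dt^2
```

Here det J = 1 + 3q² ∆t² exactly, in agreement with the statement below (10.18) that the corrections are O(∆t²).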

Symmetry and conservation laws

Figure 10.1: A box divided into two halves by a wall with a hole. Initially, particles will be in the right hand volume, and will move to the left. After large times, they will all come back to the right hand volume.

Of course, the region can change in shape, but its total volume will remain constant in time. We could have put any density distribution of points in phase space in the integrals without changing the derivation. Liouville's theorem is important in equilibrium statistical mechanics. So-called ergodic systems are assumed to move to a time-independent distribution in phase space; that is, any large set of systems setting off at time t = 0 from different points in phase space and moving according to the Hamiltonian equations of motion will assume the same, invariant distribution after long times. Liouville's theorem, moreover, tells us that the systems will not all end up in the same point in phase space, but spread over a region with a volume equal to the initial volume. There exists a more specific theorem concerning the behaviour of systems in time. This is Poincaré's theorem, which says that a system which evolves under the mechanical equations of motion will always return arbitrarily close to its starting point within a finite time. Consider for example a box partitioned into two sub-volumes (figure 10.1). There is a small hole in the middle, and there are N particles in the right hand volume. Obviously, a fraction of these particles will move to the left hand volume, but the Poincaré recurrence theorem tells us that after a finite time, all particles will reassemble in the right hand volume! This seems to be in contradiction with the second law of


thermodynamics. This law states that the entropy will not decrease in the course of time. Here we see an increase of entropy when the particles distribute themselves over the two volume halves rather than a single one, yet they come back in a more ordered (less entropic) state after some time. This is only an apparent contradiction, as the Poincaré theorem holds for a finite number of particles (finite-dimensional phase space). What we see here is an example of the inequivalence of interchanging the order in which limits are taken: if we first take the system size to infinity (the approach of statistical mechanics and thermodynamics), the recurrence time becomes infinite. If, on the other hand, we consider a finite system over infinitely large times (the mechanics approach), we see that it returns arbitrarily close to its initial state infinitely often. Taking the system size to infinity afterwards does not alter this conclusion.

11 Systems close to equilibrium

11.1 Introduction

When we prepare a conservative system in a state with a certain energy, it will conserve this energy ad infinitum. In practice, such is never the case, as it is impossible to decouple a system from its environment or from its internal degrees of freedom. This requires some explanation. We usually describe macroscopic objects in terms of the coordinates of their centre of mass and their Euler angles. These are the macroscopic degrees of freedom. As these objects consist of particles (atoms, molecules), they have very many additional, internal, or microscopic degrees of freedom. In fact, the heat which is generated during friction is nothing but a transfer of mechanical energy associated with the macroscopic degrees of freedom to mechanical energy of the internal (microscopic) degrees of freedom. So heat is, in the end, a form of mechanical energy. As a result of friction, macroscopic objects will, when subject to a conservative, time-independent force (apart from friction), always end up at rest in a point where their potential energy is minimal. Any system at a point where its potential energy is minimal is said to be in a stable state. A system which loses its kinetic energy via friction is called a dissipative system. All the macroscopic systems we know are dissipative, although some can approximate conservative systems very well. If the interactions are not all harmonic ('harmonic' means that the potential energy is a quadratic function of the coordinates), then there may be more than one minimum. Local minima correspond to metastable states. A system in a metastable state will return to that state under a small perturbation but, when it is strongly perturbed, it might move to another metastable state with lower potential energy, or to the stable state. An example is shown in figure 11.1, for a particle in a one-dimensional potential. Molecular systems, in which we take all degrees of freedom explicitly into account, are believed

Figure 11.1: System with a metastable (M) and a stable (S) state. A strong perturbation may kick the ball out of its metastable state, and under the influence of dissipation it will then move to the stable state.


to be non-dissipative. We know from statistical physics that every degree of freedom of a system in thermal equilibrium carries a kinetic energy equal (on average) to k_B T/2, where k_B is Boltzmann's constant. At low temperatures, the energies of the particles are small, as can be seen from the Boltzmann distribution, which gives the probability of finding a system with energy E as exp[−E/(k_B T)]. Therefore, for low temperatures, the kinetic and the potential energy of a system are small. It can therefore be inferred that at low temperatures, a system is close to a (meta-)stable state. In this section, we analyse systems close to mechanical equilibrium. The beautiful result of this analysis is a description in terms of a set of uncoupled harmonic oscillators, which are themselves trivial to analyse. Moreover, we obtain a straightforward recipe for finding the resonance frequencies (related to the coupling strengths) of those oscillators.

11.2 Analysis of a system close to equilibrium

Consider a conservative system characterised by generalised coordinates q_j, j = 1, …, N. The system is in equilibrium, at q̃_1, …, q̃_N, if its potential energy is minimal there. In that case we have

∂V/∂q_j (q̃_1, …, q̃_N) = 0;   j = 1, …, N.   (11.1)

Now suppose that we perturb the system slightly, i.e. we change the values of the q_j slightly with respect to their equilibrium values. As the first derivative of the potential with respect to each of the q_j vanishes, a Taylor expansion of the potential only contains second and higher order terms:

V(q_1, …, q_N) = V(q̃_1, …, q̃_N) + (1/2) ∑_{j,k=1}^{N} (q_j − q̃_j)(q_k − q̃_k) ∂²V/∂q_j∂q_k (q̃_1, …, q̃_N) + ….   (11.2)

The terms of order higher than two will be neglected, as we are interested in systems close to equilibrium (i.e. q_j − q̃_j small). We can represent the resulting expansion using matrix notation. Introduce the matrix

K = ( ∂²V/∂q_1∂q_1   ∂²V/∂q_1∂q_2   ⋯   ∂²V/∂q_1∂q_N )
    ( ∂²V/∂q_2∂q_1   ∂²V/∂q_2∂q_2   ⋯   ∂²V/∂q_2∂q_N )   (11.3)
    (      ⋮              ⋮          ⋱        ⋮       )
    ( ∂²V/∂q_N∂q_1   ∂²V/∂q_N∂q_2   ⋯   ∂²V/∂q_N∂q_N )

The matrix K is obviously symmetric. We can write

V(q_1, …, q_N) = V(q̃_1, …, q̃_N) + (1/2) δq^T K δq,   (11.4)

where δq is a column vector with components q_j − q̃_j, j = 1, …, N; the superscript T denotes the transpose of the vector. Now we write down the kinetic energy in terms of the generalised coordinates. We assume that the constraints only depend on the generalised coordinates q_j and not on their derivatives or on time. In that case, the kinetic energy can be written in the form (see page 26):

T = (1/2) ∑_{j,k=1}^{N} M_jk q̇_j q̇_k,   (11.5)


where the matrix M is symmetric: M_jk = M_kj. Note that M_jk may depend on the q_j. In terms of q_j − q̃_j, and using vector notation, we can rewrite the kinetic energy as

T = (1/2) δq̇^T M δq̇.   (11.6)

The equations of motion now read:

(1/2) ∑_{k=1}^{N} (M_jk δq̈_k + δq̈_k M_kj) = −∂V/∂q_j + ∂T/∂q_j = −(1/2) ∑_{k=1}^{N} (K_jk δq_k + δq_k K_kj).   (11.7)

We have omitted the dependence of M_jk on the q_j – this dependence generates terms on the left- and right-hand sides which are both of order δq̇² and can therefore be neglected. Using the symmetry of the matrices M and K, (11.7) reduces to

∑_{k=1}^{N} M_jk δq̈_k = −∑_{k=1}^{N} K_jk δq_k.   (11.8)

Let us consider the two-dimensional case to clarify the procedure. We consider two generalised coordinates q_1 and q_2 with the matrix M_jk equal to the identity. Then the kinetic energy has the form:

T = (1/2) q̇_1² + (1/2) q̇_2².   (11.9)

The potential energy depends on the two coordinates q_1 and q_2:

V = V(q_1, q_2).   (11.10)

The equations of motion read:

q̈_1 = δq̈_1 = −∂V/∂q_1;   (11.11)
q̈_2 = δq̈_2 = −∂V/∂q_2.   (11.12)

Expanding about the point q̃_1, q̃_2, where V is supposed to be minimal, we have

V(q_1, q_2) = V(q̃_1, q̃_2) + (1/2)(q_1 − q̃_1)² ∂²V/∂q_1² (q̃_1, q̃_2)
            + (q_1 − q̃_1)(q_2 − q̃_2) ∂²V/∂q_1∂q_2 (q̃_1, q̃_2)
            + (1/2)(q_2 − q̃_2)² ∂²V/∂q_2² (q̃_1, q̃_2).   (11.13)

This can be written in the form:

V(q_1, q_2) = V(q̃_1, q̃_2) + (1/2) (q_1 − q̃_1, q_2 − q̃_2) ( ∂²V/∂q_1²      ∂²V/∂q_1∂q_2 ) ( q_1 − q̃_1 )
                                                           ( ∂²V/∂q_1∂q_2   ∂²V/∂q_2²    ) ( q_2 − q̃_2 ).   (11.14)

Defining δq_1 = q_1 − q̃_1 and similarly for δq_2, this equation reads:

V(q_1, q_2) = V(q̃_1, q̃_2) + (1/2) (δq_1, δq_2) ( ∂²V/∂q_1²      ∂²V/∂q_1∂q_2 ) ( δq_1 )
                                                ( ∂²V/∂q_1∂q_2   ∂²V/∂q_2²    ) ( δq_2 ).   (11.15)

The 2 × 2 matrix occurring in this expression is our matrix K.
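When V is not convenient to differentiate by hand, the matrix K can also be estimated numerically by central finite differences around the minimum. A minimal sketch (pure Python; the coupled quadratic potential below is an illustrative assumption, not an example from the text, and its exact Hessian is [[2, 1], [1, 2]]):

```python
# Numerical estimate of the matrix K of (11.15) by central finite
# differences of the potential around its minimum at (0, 0).
# V(q1, q2) = q1^2 + q2^2 + q1*q2 is a hypothetical example.

def V(q1, q2):
    return q1 * q1 + q2 * q2 + q1 * q2

def hessian(V, q, h=1e-4):
    n = len(q)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # four-point central-difference formula for d^2 V / dq_i dq_j
            qpp = list(q); qpp[i] += h; qpp[j] += h
            qpm = list(q); qpm[i] += h; qpm[j] -= h
            qmp = list(q); qmp[i] -= h; qmp[j] += h
            qmm = list(q); qmm[i] -= h; qmm[j] -= h
            K[i][j] = (V(*qpp) - V(*qpm) - V(*qmp) + V(*qmm)) / (4 * h * h)
    return K

K = hessian(V, [0.0, 0.0])
print([[round(x, 6) for x in row] for row in K])   # close to [[2, 1], [1, 2]]
```

For a purely quadratic potential the central-difference formula is exact up to rounding, so the recovered matrix agrees with the analytic second derivatives.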

11.2.1 Example: Double pendulum

Consider as an example the double pendulum, consisting of two rigid massless rods with masses M and m: the upper rod, of length L, makes an angle ϑ with the vertical and carries the mass M; the lower rod, of length l, makes an angle ϕ and carries the mass m.

The velocity of the upper mass is Lϑ̇; that of the lower one is the vector sum of the velocity of the upper one and the velocity of the lower one with respect to the upper one. For very small angles ϑ and ϕ both velocities will be approximately in the horizontal direction, so that they can simply be added: v_m = Lϑ̇ + lϕ̇. The kinetic energy therefore reads:

T = (M/2)(Lϑ̇)² + (m/2)(Lϑ̇ + lϕ̇)².   (11.16)

Let us perform a transformation to more convenient variables:

x = Lϑ,   (11.17a)
y = Lϑ + lϕ.   (11.17b)

Note that x and y do not denote cartesian coordinates. In these variables the kinetic energy can simply be written as

T = (M/2) ẋ² + (m/2) ẏ².   (11.18)

The potential energy of the upper mass is MgL(1 − cos ϑ) ≈ MgLϑ²/2, and that of the lower mass is given as mg[L(1 − cos ϑ) + l(1 − cos ϕ)], which in the small-angle approximation becomes

V_Lower(ϑ, ϕ) = (1/2) mg (Lϑ² + lϕ²).   (11.19)

The total potential energy, written in terms of x and y, therefore reads:

V = (M + m)g/(2L) x² + mg/(2l) (y − x)².   (11.20)


The equations of motion can therefore be written as

( M 0 ) ( ẍ )   ( −(M + m)g/L − mg/l     mg/l  ) ( x )
( 0 m ) ( ÿ ) = (  mg/l                 −mg/l ) ( y ).   (11.21)

This is the form given in (11.8). We shall return to this example in the next section.

11.3 Normal modes

Let us try to find solutions to Eq. (11.8) of the form

δq_j = A_j e^{iωt},   (11.22)

where ω does not depend on j – all the degrees of freedom oscillate at the same frequency. Such a motion is called a normal mode. In the following we shall write q_j instead of δq_j: q_j is the generalised coordinate measured with respect to its equilibrium value. We have q̈_j = −ω² q_j, so (11.8) reduces to

∑_{k=1}^{N} M_jk ω² A_k = ∑_{k=1}^{N} K_jk A_k.   (11.23)

If the mass tensor M_jk were the identity, Eq. (11.23) would be an eigenvalue equation. For general mass tensors, the equation is a generalised eigenvalue equation. We can reduce it to an ordinary eigenvalue equation by multiplying both sides by the inverse M⁻¹ of the mass matrix. We then have:

ω² A_j = ∑_{k,l=1}^{N} (M⁻¹)_jk K_kl A_l.   (11.24)

In algebraic terms, the solutions to these equations are the eigenvectors A (with components A_j) and the corresponding eigenvalues ω². In physical terms, the components A_j are the amplitudes of the oscillatory motions of the generalised coordinates, and ω is the frequency of the oscillation. In order for the normal modes to exist, the eigenvalues should be real. That the eigenvalues are indeed real follows from the fact that both M and K are real, symmetric matrices. Although M⁻¹K itself is in general not symmetric, it has the same eigenvalues as the real, symmetric matrix M^(−1/2) K M^(−1/2), and it is a well-known result of linear algebra that the eigenvalues of a Hermitian matrix are real (real and symmetric implies Hermitian). Another question is whether the eigenvalues are positive or negative. Assuming that we are expanding the potential around a minimum, the matrix K can be shown to be positive definite. A positive definite matrix has only positive eigenvalues¹. Moreover, the mass matrix can be shown to be positive definite, and then its inverse M⁻¹ is positive definite as well. The product M⁻¹K of two positive definite matrices has only positive eigenvalues, so the ω² are positive. Hence the frequencies of the oscillations are always real – we do not find exponential growth or decay. In physical terms one could say that perturbing the system from equilibrium always pushes it back to this equilibrium – the 'spring force' experienced by the coordinates is always opposite to the perturbation, and therefore an oscillation arises, rather than a drift away from equilibrium or some exponential decay. Such decay may however be found near a local maximum or near a saddle point of the potential. Let us find the normal modes for the double pendulum. Note that this problem is relatively simple as a result of the fact that the mass matrix is diagonal and therefore trivial to invert. After

¹ In fact, we shall occasionally allow for zero eigenvalues; in that case, the matrix is called positive semidefinite.


multiplying both sides of Eq. (11.21) by M⁻¹, we have a standard diagonalisation problem for the matrix:

( (M + m)g/(ML) + mg/(Ml)    −mg/(Ml) )
( −g/l                        g/l     ).   (11.25)

The eigenvalues are the solutions of the so-called secular equation, which has the form

| (M + m)g/(ML) + mg/(Ml) − ω²     −mg/(Ml)  |
| −g/l                              g/l − ω² | = 0.   (11.26)

This reduces to the following quadratic equation in ω²:

ω⁴ − (M + m)/M (g/L + g/l) ω² + (M + m)/M · g²/(Ll) = 0.   (11.27)

This equation has two solutions for ω². We will examine some special cases. If M ≫ m then, provided that l is not too close to L, the two roots with corresponding eigenvectors (A_x, A_y) are given by

ω ≈ √(g/l);   A_x/A_y ≈ (m/M) · L/(l − L)   (11.28)

and

ω ≈ √(g/L);   A_x/A_y ≈ (L − l)/L.   (11.29)

The first solution describes an almost stationary motion of the upper pendulum with the lower one oscillating at its natural frequency. In the second case, the motions of the upper and lower masses are of the same order of magnitude, at the natural frequency of the upper pendulum. If M ≪ m, the solutions are

ω² ≈ g/(L + l);   A_x/A_y ≈ L/(L + l)   (11.30)

and

ω² ≈ (m/M)(g/L + g/l);   A_x/A_y ≈ −(m/M)(L + l)/L.   (11.31)

The first case describes a motion in which the two rods are aligned, so that we have essentially a single pendulum of length l + L and mass m. The second case corresponds to a very high frequency oscillation of the upper mass with an almost stationary lower mass.
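These limiting cases can be verified by solving the quadratic (11.27) numerically. A small sketch (pure Python; the parameter values are arbitrary illustrative choices):

```python
import math

# Numerical check of the limiting cases (11.28)-(11.29): for M >> m the
# two roots of the quadratic (11.27) in omega^2 approach g/l and g/L.
g, L, l = 9.81, 1.0, 0.5
M, m = 1000.0, 1.0

b = (M + m) / M * (g / L + g / l)      # coefficient of -omega^2 in (11.27)
c = (M + m) / M * g * g / (L * l)      # constant term in (11.27)
disc = math.sqrt(b * b - 4 * c)
w2_hi, w2_lo = (b + disc) / 2, (b - disc) / 2

print(abs(w2_hi - g / l) / (g / l))    # small: this root is close to g/l
print(abs(w2_lo - g / L) / (g / L))    # small: this root is close to g/L
```

With M = 1000 m, both roots agree with the limiting expressions to well below one percent.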

11.4 Vibrational analysis

The way in which atoms are bound together in molecules is described by quantum mechanics. There is a long standing tradition in the quantum mechanical calculation of stationary states of molecules. In the last fifteen years or so it has become possible to perform dynamical computations of molecules to very good accuracy using fully quantum mechanical calculations. These calculations are quite demanding on computer resources and they do not always give a very good insight into the dynamics of interest. Therefore, a semi-classical approach is often adopted in order to calculate vibration spectra for example. First, the total energy of the molecule is calculated as a function of the nuclear positions Ri , i = 1, . . . , N for an N-atomic molecule. There is however a problem in doing this. Suppose we want


Figure 11.2: Interactions in a molecule: stretch, bend and torsion.

to calculate this energy for 10 values of each of the coordinates of a 10-atom molecule. As there are 30 coordinates, we would need to perform 10³⁰ stationary quantum calculations, which would require the age of the universe. Therefore the potential is parametrised in a sensible way, which we now describe. All the chemically bonded atoms are described by harmonic or trigonometric interactions. The degrees of freedom chosen for this parametrisation are the bond length, the bond angle and the dihedral angle. The forces associated with these degrees of freedom are called stretch, bend and torsion respectively. These degrees of freedom are shown in figure 11.2. The form of the potential associated with bond stretching is given as

V_Stretch = (κ/2)(l − l₀)²,   (11.32)

where l is the bond length and l₀ the equilibrium bond length. The spring constant κ determines how difficult it is to stretch the bond. The bending potential is given in terms of the bond angle ϕ:

V_Bend = (α/2)(ϕ − ϕ₀)².   (11.33)

A similar expression exists for the torsional energy. The constants κ and α can be determined from stationary quantum mechanical calculations. Assuming that these parameters are known, we shall now use the given form of the potential to calculate the vibration spectrum of a triatomic, linear molecule, such as CS₂ or CO₂ (see figure 11.3). We neglect bending here, so only bond stretching is taken into account. If the initial configuration is linear, the motion takes place along a straight line, which we take as our X-axis. The coordinates of the three atoms are x₁, x₂ and x₃. The kinetic energy can therefore be written down immediately:

T = (µ/2)(ẋ₁² + ẋ₃²) + (m/2) ẋ₂².   (11.34)

The potential energy is given by

V = (κ/2)(x₂ − x₁ − l)² + (κ/2)(x₃ − x₂ − l)².   (11.35)

Here, l is the equilibrium bond length. The centre of mass of the system will move uniformly as there are no external forces acting, and we take this centre as the origin. The equilibrium coordinates are

Figure 11.3: Triatomic molecule; the outer atoms 1 and 3 have mass µ, the central atom 2 has mass m.

then x₁ = −l, x₂ = 0 and x₃ = l. The deviations from these values are

δx₁ = x₁ + l;   δx₂ = x₂   and   δx₃ = x₃ − l.   (11.36)

In this representation, we have

T = (µ/2)(δẋ₁² + δẋ₃²) + (m/2) δẋ₂²   (11.37)

and

V = (κ/2)(δx₂ − δx₁)² + (κ/2)(δx₃ − δx₂)².   (11.38)

We can find the matrices K and M directly from these expressions:

M = ( µ 0 0 )
    ( 0 m 0 ),   (11.39)
    ( 0 0 µ )

and

K = κ (  1 −1  0 )
      ( −1  2 −1 ).   (11.40)
      (  0 −1  1 )

The normal modes can now be found by solving (11.23) with these matrices. The frequencies follow from the secular equation:

| κ − µω²    −κ          0       |
| −κ         2κ − mω²    −κ      | = 0.   (11.41)
| 0          −κ          κ − µω² |

This leads to

(κ − µω²) ω² (µmω² − κm − 2κµ) = 0,   (11.42)

from which we find:

ω₁ = 0;   ω₂ = √(κ/µ);   ω₃ = √(κ(1/µ + 2/m)).   (11.43)


Figure 11.4: The three modes of the triatomic molecule.

The corresponding eigenvectors can be found after some algebra:

A₁ = ( 1 )      A₂ = (  1 )      A₃ = (   1    )
     ( 1 );          (  0 );          ( −2µ/m  ).   (11.44)
     ( 1 )           ( −1 )           (   1    )

The first of these, corresponding to ω₁ = 0, is a mode in which the atoms all slide in the same direction with the same speed. This is a manifestation of the translational symmetry of the problem, which has been recovered by our procedure. The second one represents a mode in which the middle atom stands still and the two outer atoms vibrate in opposite directions. Obviously, the frequency of this mode is ω₂ = √(κ/µ), corresponding to the two springs. Finally, the last mode is one in which the two outer atoms move in one direction, and the central atom in the opposite direction. The motion can be understood by replacing the two outer masses by a single one with mass 2µ at their midpoint, coupled by a spring with spring constant 2κ to the central mass. The reduced mass of this system, (1/(2µ) + 1/m)⁻¹, then occurs in the expression for the resonance frequency. The three modes are depicted in figure 11.4.
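The eigenpairs can be checked by direct substitution into the generalised eigenvalue problem (11.23), K A = ω² M A. A minimal sketch (pure Python; the values of µ, m and κ are arbitrary illustrative choices):

```python
# Direct substitution check: each pair (omega^2, A) from (11.43)/(11.44)
# must satisfy K A = omega^2 M A for the matrices (11.39) and (11.40).
mu, m, kappa = 2.0, 3.0, 5.0

M = [[mu, 0, 0], [0, m, 0], [0, 0, mu]]
K = [[kappa, -kappa, 0],
     [-kappa, 2 * kappa, -kappa],
     [0, -kappa, kappa]]

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

modes = [
    (0.0, [1.0, 1.0, 1.0]),
    (kappa / mu, [1.0, 0.0, -1.0]),
    (kappa * (1 / mu + 2 / m), [1.0, -2 * mu / m, 1.0]),
]
worst = 0.0
for w2, A in modes:
    Kv, Mv = matvec(K, A), matvec(M, A)
    worst = max(worst, max(abs(Kv[i] - w2 * Mv[i]) for i in range(3)))
print(worst < 1e-9)   # True: all three eigenpairs satisfy K A = omega^2 M A
```

The residuals vanish up to floating-point round-off, confirming (11.43) and (11.44).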

11.5 The chain of particles

In the previous section we analysed a triatomic molecule. Now we shall analyse a larger system: a chain of N particles. We assume that all particles have the same mass and that they are connected by a string with tension τ. The particles are assumed to move only in the vertical (y) direction, and the x-coordinates of adjacent particles differ by a fixed separation d. The first and last string segments are connected to fixed points at y = 0. The chain is depicted in figure 11.5. The chain is a model for a continuous string, which is obtained by letting N → ∞ and d → 0 while keeping the string length Nd fixed. Let us consider particle number k. The string segments connecting this particle to its neighbours are stretched, and this may result in a net force acting on particle k. The segment between particles k and k + 1 has a length

l = √(d² + (y_{k+1} − y_k)²) ≈ d + (y_{k+1} − y_k)²/(2d),   (11.45)

Figure 11.5: The harmonic chain of particles.

where a first-order Taylor expansion is used to obtain the second expression. The potential energy of this segment is equal to the tension τ times the extension of the string, and therefore we find for the total potential energy:

V = τ/(2d) ∑_{k=0}^{N} (y_{k+1} − y_k)²,   (11.46)

with

y₀ = y_{N+1} = 0.   (11.47)

The kinetic energy is given by

T = ∑_{k=1}^{N} (m/2) ẏ_k².   (11.48)

We now find the matrices M_kl and K_kl as

M_kl = m δ_kl,   (11.49)

where δ_kl is the Kronecker delta; in other words, M is m times the unit matrix. For K_kl we find:

K = ( 2τ/d   −τ/d    0      0     ⋯ )
    ( −τ/d   2τ/d   −τ/d    0     ⋯ )
    ( 0      −τ/d   2τ/d   −τ/d   ⋯ )   (11.50)
    ( ⋮       ⋮      ⋮      ⋮     ⋱ )

The normal mode equation (11.23) can be solved analytically for arbitrary N by substituting for the eigenvector A_k = γ exp(iαk), where γ is some constant. This trial solution does not satisfy the boundary conditions (11.47), but we do not worry about this for the moment. Then for 2 ≤ k ≤ N − 1 we find

mω² e^{ikα} = (τ/d)(−e^{iα(k−1)} + 2e^{ikα} − e^{iα(k+1)}).   (11.51)

Dividing the left- and right-hand sides by exp(ikα), we find

mω² = (2τ/d)(1 − cos α).   (11.52)

For each α, −α is also a solution with the same ω. This can be used to construct a solution

A_k = γ(e^{ikα} − e^{−ikα}) = 2iγ sin(kα).   (11.53)


This solution always vanishes at k = 0, and it vanishes also at k = N + 1 when (N + 1)α = nπ, for integer n. So the conclusion is that for each n = 1, …, N we have a solution which vanishes at the two ends of the string. For values of n higher than N, or lower than 1, the solutions obtained are identical to the solutions with 1 ≤ n ≤ N. For each solution, all particles move up and down with the same frequency, given by (11.52). The wavelength is λ = 2πd/α (the value of kd for which kα = 2π), and the wavevector is q = α/d. It is possible to formulate the Lagrangian directly in a continuum form, and to derive the wave equation from this. Note that in the continuum limit, α and d small, we obtain from (11.52) for the frequency:

ω² = τα²/(md) = (τd/m) q².   (11.54)

Comparing with the well-known dispersion relation ω = cq, we learn that the sound speed c is given as √(τd/m). Defining the density ρ = m/d, we have

c = √(τ/ρ).   (11.55)
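The whole construction can be verified numerically: the mode A_k = sin(kα) satisfies both the boundary conditions (11.47) and the normal-mode equations, with ω² given by (11.52). A short sketch (pure Python; the parameter values are arbitrary illustrative choices):

```python
import math

# Check that A_k = sin(k*alpha) satisfies the boundary conditions (11.47)
# and the chain's normal-mode equations with omega^2 from (11.52).
N, m, tau, d = 8, 1.0, 2.0, 0.1
n = 3
alpha = n * math.pi / (N + 1)
w2 = 2 * tau / (m * d) * (1 - math.cos(alpha))

A = [math.sin(k * alpha) for k in range(N + 2)]    # k = 0 .. N+1
print(abs(A[0]) < 1e-12, abs(A[N + 1]) < 1e-12)    # True True: fixed ends
for k in range(1, N + 1):
    lhs = m * w2 * A[k]
    rhs = tau / d * (-A[k - 1] + 2 * A[k] - A[k + 1])
    assert abs(lhs - rhs) < 1e-12
print("dispersion relation (11.52) verified")
```

The trigonometric identity sin((k−1)α) + sin((k+1)α) = 2 sin(kα) cos α is what makes each row of the normal-mode equations collapse to (11.52).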

12 Density operators — Quantum information theory

1 Introduction

In this section, we extend the formalism of quantum physics to include statistical uncertainty which can be traced back to our lack of knowledge of what the wavefunction actually is. States whose form we are not certain about are described by an object called the density matrix. Density matrices can be used to detect coupling between a particle and an outside world, which in the simplest case is another particle. We shall see that this coupling may lead to quantum states which do not have a classical analogue – these states are called entangled. Entanglement is used in novel technological applications which are based on the quantum nature of matter. The most spectacular realisation of this trend, which may be achieved in the next few decades, is the quantum computer, which will be briefly discussed towards the end of this chapter.

2 The density operator

Up to this point, we have always assumed that a quantum system can be described by a wavefunction which contains all the information that can in principle be obtained about the system. In particular, knowledge of the wavefunction enables us to predict the possible outcomes of physical measurements and their probabilities. For example, if we know that the electron in a hydrogen atom finds itself in the state

(1/√3) (|2, 1, 1, +⟩ + i |2, 1, 0, +⟩ − |2, 1, −1, −⟩),   (1)

where the ket-vectors are of the form |n, l, m_z, s_z⟩, we can calculate the possible outcomes of any measurement and their respective probabilities. In this example, a measurement of L_z would yield the value ℏ, 0 or −ℏ, each with probability 1/3. Although knowledge of the quantum mechanical wavefunction does not predict outcomes of measurements unambiguously, the wavefunction is the most precise knowledge we can have about a system. In that sense, the wavefunction is for quantum mechanics what the positions and velocities are for a classical many-particle system. In some cases we might indeed know the state of a quantum system, for example at very low temperatures where particles are almost certainly in the ground state, or when they are in some collective quantum state, as is the case in superfluidity or superconductivity. Another example is a quantum system of which we have just measured all quantities corresponding to a complete set of commuting observables. If we measure, for example, L_z = −ℏ, L² = 2ℏ², E = E₂ and S_z = −ℏ/2, we can be sure that just after that measurement the system is in the state

|2, 1, −1, −⟩.   (2)

Note that we do not know the overall phase factor, but this factor drops out when calculating physical quantities.


In most practical situations, however, we do not know the wavefunction at all! Moreover, if we do not know the state, we cannot infer its form from any sequence of measurements, as the first measurement reduces the state, changing it considerably (although, as mentioned above, we do know the state immediately after the measurement). Moreover, even if we have carefully prepared a system in a well-defined state, the interaction with its surroundings will alter that state. Does this mean that quantum mechanics might be a nice theory, but that in practice we cannot do anything with it? The answer is no: even if we do not know the state of a system precisely, we usually have some statistical knowledge about it. This means that we know the probability P_r that the system is in state |r⟩. Now this might become very confusing: quantum mechanics allows us to make statistical predictions, and now I say that the state of a system is specified in a statistical manner. It might be helpful to keep in mind that a wavefunction by itself is not a statistical object at all: it is a well-defined object whose time evolution can be calculated with — in principle — arbitrary precision. However, measurements performed on a system described by a known wavefunction are subject to quantum mechanical uncertainty. This is called intrinsic uncertainty. If we have — for whatever reason — incomplete knowledge of the state of the system, we speak of extrinsic uncertainty. One of the conceptual difficulties students and scientists have with quantum mechanics is the difference between the wavefunction, which evolves smoothly and deterministically according to the time-dependent Schrödinger equation, on the one hand, and the abrupt change taking place at a measurement, where the state is instantaneously reduced to an eigenfunction of the operator corresponding to the physical quantity we measure, on the other hand.
As an example of a state which is not known explicitly, consider again the hydrogen atom. Suppose we have thousands of hydrogen atoms which we can measure. All these atoms have undergone the same preparation. We measure the energy, L² and L_z of each atom. In all cases we find that the energy is that of the first excited state, E = E₂, and l = 1, but in 25% of the cases we find m = 1, in 50% of the cases m = 0, and in the remaining 25% we find m = −1. It is now tempting to say that the state of every hydrogen atom (neglecting the electron spin) can be written as

|ψ⟩ = (1/2) |2, 1, 1⟩ + (1/√2) |2, 1, 0⟩ + (1/2) |2, 1, −1⟩.   (3)

But can we really infer this information from our measurements? We could flip the sign of the second term on the right hand side, and we would find the same probabilities as with the state given above. Can you now still tell me which state the atoms are in? To emphasise the point more strongly, we look at the simplest possible nontrivial system, described by a two-dimensional Hilbert space, e.g. a spin-1/2 particle. Suppose someone, Charlie, gives us an electron but he does not know its spin state. He does however know that there is no reason for the spin to be preferably up or down, so the probability to measure spin 'up' or 'down' is 1/2 for both. Does that give us enough information to specify the state? Well, you might guess that the state is

|ψ⟩ = (1/√2)(|1/2⟩ + |−1/2⟩),   (4)

but why couldn't it be

|ψ⟩ = (1/√2)(|1/2⟩ − |−1/2⟩)?   (5)

In fact the state of the system could be anything of the form

|ψ⟩ = (1/√2)(|1/2⟩ + e^{iϕ} |−1/2⟩),   (6)


for any real ϕ. Although we do not know the wavefunction exactly, we can evaluate the expectation value of the z-component of the spin: as we find ℏ/2 and −ℏ/2 with equal probabilities, the expectation value is 0. More generally, if we have a spin which is in the spin-up state with probability p and in the down state with probability 1 − p, the expectation value of the z-component of the spin is ℏ(p − 1/2). So expectation values can still be found, although we do not have complete information about the state of the system. This fact might raise the question whether there is any difference in measured physical quantities between one of the candidate wavefunctions suggested above, and the information that the particle is in the 'up' state with probability 1/2 and in the 'down' state with the same probability. After all, they both give the same expectation value for the z-component of the spin. We now introduce the following states:

(7a)

|ψ2 i = |−1/2i ; 1 |ψ3 i = √ (|1/2i + |−1/2i) ; 2 1 |ψ4 i = √ (|1/2i − |−1/2i) ; 2 1 |ψ5 i = √ (|1/2i + i |−1/2i) ; 2 1 |ψ6 i = √ (|1/2i − i |−1/2i) . 2

(7b) (7c) (7d) (7e) (7f)

These states are recognised as the spin-up and -down states for the z, x and y directions. Let us consider a particle in the state (|1/2⟩ + e^{iϕ}|−1/2⟩)/√2, and calculate the probability of finding this particle, in a measurement, in the state |ψ_3⟩:

|(1/2)(⟨1/2| + ⟨−1/2|)(|1/2⟩ + e^{iϕ}|−1/2⟩)|² = |1 + e^{iϕ}|²/4 = (1/2)(1 + cos ϕ).    (8)

If we evaluate the probability to find the particle in the state |ψ_3⟩ in the case it was, before the measurement, in a so-called mixed state, given with equal probabilities to be |1/2⟩ and |−1/2⟩, we find 1/2, as can easily be verified. Calculating the probabilities for a particle to be found in the states |ψ_1⟩ to |ψ_6⟩, we find the following results.

State   | (|1/2⟩ + e^{iϕ}|−1/2⟩)/√2 | Equal mixture of |1/2⟩ and |−1/2⟩
|ψ_1⟩  | 1/2                        | 1/2
|ψ_2⟩  | 1/2                        | 1/2
|ψ_3⟩  | (1 + cos ϕ)/2             | 1/2
|ψ_4⟩  | (1 − cos ϕ)/2             | 1/2
|ψ_5⟩  | (1 + sin ϕ)/2             | 1/2
|ψ_6⟩  | (1 − sin ϕ)/2             | 1/2
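The entries of this table are easy to verify numerically. Below is a minimal numpy sketch; the states are the ones defined in (7a)–(7f), and ϕ = 0.7 is an arbitrary illustrative choice:

```python
import numpy as np

# Basis: |1/2> = [1, 0], |-1/2> = [0, 1]
up, dn = np.array([1, 0], complex), np.array([0, 1], complex)

# The six measurement states psi_1 ... psi_6 of Eqs. (7a)-(7f)
states = [up, dn,
          (up + dn) / np.sqrt(2), (up - dn) / np.sqrt(2),
          (up + 1j * dn) / np.sqrt(2), (up - 1j * dn) / np.sqrt(2)]

def probs_pure(phi):
    """Probabilities |<psi_k|psi>|^2 for the pure state (|1/2> + e^{i phi}|-1/2>)/sqrt(2)."""
    psi = (up + np.exp(1j * phi) * dn) / np.sqrt(2)
    return [abs(np.vdot(s, psi)) ** 2 for s in states]

def probs_mixed():
    """Probabilities for the equal mixture of |1/2> and |-1/2>."""
    return [0.5 * abs(np.vdot(s, up)) ** 2 + 0.5 * abs(np.vdot(s, dn)) ** 2
            for s in states]

phi = 0.7
print(np.round(probs_pure(phi), 4))   # pure-state column of the table for this phi
print(np.round(probs_mixed(), 4))     # 1/2 for every state in the mixed case
```

Varying phi shows that no single pure state reproduces the mixed-state column.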

We see that there is no ϕ, i.e. no pure state, which leads to the same probabilities for all measurement results. It is important to make the distinction between the two cases very clear: if Charley gives us


Density operators — Quantum information theory

millions of times a particle which is in a state with an arbitrary ϕ, we will find probabilities 1/2 to find the particle in either (|1/2⟩ + |−1/2⟩)/√2 or (|1/2⟩ − |−1/2⟩)/√2. If the phase were always the same, say ϕ = 0, then we would find probabilities 1 and 0 respectively. Therefore, a mixed state indicates uncertainty about the relative phase of the components of the wave function. Let us summarize what we have learned: a system can be either in a pure or a mixed state. In the first case, we know precisely the wavefunction of the system. In the second case, we are not sure about the state, but we can ascribe a probability for the system to be in any of the states accessible to it. Note that the uncertainty about the state the particle is in is a classical uncertainty. We can for example flip a coin and, depending on whether the result is heads or tails, send a spin-up or -down particle to a friend. Our friend then only knows that the probability for the particle he receives to be 'up' is 1/2, and similarly for 'down'. We now turn to the general case of a system which can be in any one of a set of normalised, but not necessarily orthogonal, states |ψ_i⟩. The probability for the system to be in the state |ψ_i⟩ is p_i, with obviously ∑_i p_i = 1. Suppose the expectation value of some operator Â in state |ψ_i⟩ is given by A_i. Then the expectation value of Â for the system at hand is given by

⟨A⟩ = ∑_i p_i A_i = ∑_i p_i ⟨ψ_i|Â|ψ_i⟩.    (9)

We now introduce the density operator, which is in some sense the 'optimal' specification of the system. The density operator is defined as

ρ̂ = ∑_i p_i |ψ_i⟩⟨ψ_i|.    (10)

Suppose the set |φ_n⟩ forms a basis of the Hilbert space of the system under consideration. Then the expectation value of the operator Â can be rewritten after inserting the unit operator 1 = ∑_n |φ_n⟩⟨φ_n| as

⟨A⟩ = ∑_i p_i ⟨ψ_i|Â|ψ_i⟩ = ∑_i p_i ⟨ψ_i| ∑_n |φ_n⟩⟨φ_n| Â |ψ_i⟩ = ∑_n ⟨φ_n| [∑_i p_i |ψ_i⟩⟨ψ_i|] Â |φ_n⟩ = ∑_n ⟨φ_n| ρ̂ Â |φ_n⟩ = Tr(ρ̂ Â).    (11)

Here we have used the trace operation, Tr, which adds up all diagonal elements of an operator. For a general operator Q̂:

Tr Q̂ = ∑_n ⟨φ_n| Q̂ |φ_n⟩.    (12)

The trace is independent of the basis used — it is invariant under a basis transformation. We omit the hat from operators unless confusion may arise. Another property of the trace is

Tr(|ψ⟩⟨χ|) = ⟨χ|ψ⟩,    (13)

which is easily verified by writing out the trace with respect to a basis φ_n. If a system is in a well-defined quantum state |ψ⟩, we say that the system is in a pure state. In that case the density operator is

ρ = |ψ⟩⟨ψ|.    (14)


If the system is not in a pure state, but only the statistical weights p_i of the states |ψ_i⟩ are known, we say that the system is in a mixed state. When someone gives you a density operator, how can you assess whether it corresponds to a pure or a mixed state? Well, it is clear that for a pure state we have ρ² = ρ, which means that ρ is a projection operator¹:

ρ² = |ψ⟩⟨ψ|ψ⟩⟨ψ| = |ψ⟩⟨ψ| = ρ,    (15)

where we have used the fact that ψ is normalised. For a mixed state, such as

ρ = α|ψ⟩⟨ψ| + β|φ⟩⟨φ|    (16)

where ⟨ψ|φ⟩ = 0 (and 0 < α, β < 1 with α + β = 1), we have

ρ² = α²|ψ⟩⟨ψ| + β²|φ⟩⟨φ| ≠ ρ.    (17)

Although we have considered a particular example here, it holds in general for a mixed state that ρ is not a projection operator. Another way to see this is to look at the eigenvalues of ρ. For a pure state, for which ρ = |ψ⟩⟨ψ|, clearly |ψ⟩ is an eigenstate of ρ with eigenvalue 1, and all other eigenvalues are 0 (their eigenstates are all states perpendicular to |ψ⟩). These are the only eigenvalues allowed for a projection operator. As

Tr ρ = ∑_i p_i = 1,    (18)

and the trace equals the sum of the eigenvalues λ_i of ρ, we have

∑_i λ_i = 1.    (19)

Now let us evaluate

⟨φ|ρ|φ⟩ = ∑_i p_i |⟨ψ_i|φ⟩|² ≤ 1,    (20)

where the fact that |⟨ψ_i|φ⟩| ≤ 1, combined with ∑_i p_i = 1, leads to the inequality. The condition ∑_i λ_i = 1 means that either one of the eigenvalues is 1 and the rest are 0, or they are all strictly smaller than 1. Thus, for an eigenstate |φ⟩ of a mixed-state density operator, with eigenvalue λ, we have

⟨φ|ρ|φ⟩ = ⟨φ|λ|φ⟩ = λ < 1.    (21)

We see that a density operator has eigenvalues between 0 and 1. In summary: the sum of the eigenvalues of the density operator is 1. The situation where only one of these eigenvalues is 1 and the rest are 0 corresponds to a pure state. If there are eigenvalues 0 < λ < 1, then we are dealing with a mixed state. To summarize, we can say that if a system is in a mixed state, it can be characterized by a set of possible wavefunctions |ψ_i⟩ and probabilities p_i for the system to be in each of those wavefunctions. But a more compact way of representing our knowledge of the system is by using the density operator, which can be constructed when we know the possible states ψ_i and their probabilities p_i [see Eq. (10)]. The density operator can be used to calculate expectation values using the trace, see Eq. (11).

¹Recall that a projection operator P is a Hermitian operator satisfying P² = P.
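These statements can be checked numerically. Since the eigenvalues λ_i of ρ add up to 1, the quantity Tr ρ² = ∑_i λ_i² equals 1 exactly for a pure state and is smaller than 1 for a mixed state; a small numpy sketch:

```python
import numpy as np

up, dn = np.array([1, 0], complex), np.array([0, 1], complex)

def dyad(psi):
    """|psi><psi| as a matrix."""
    return np.outer(psi, psi.conj())

rho_pure  = dyad((up + dn) / np.sqrt(2))        # pure state
rho_mixed = 0.5 * dyad(up) + 0.5 * dyad(dn)     # equal mixture

for rho in (rho_pure, rho_mixed):
    evals = np.linalg.eigvalsh(rho)
    print(np.trace(rho).real, np.trace(rho @ rho).real, np.round(evals, 3))
# pure state : Tr rho = 1, Tr rho^2 = 1,   eigenvalues {0, 1}
# mixed state: Tr rho = 1, Tr rho^2 = 1/2, eigenvalues {1/2, 1/2}
```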


Let us consider an example. Take again the case where Charley sends us a spin-up or -down particle with equal probabilities. For convenience, we denote these two states as |0⟩ (spin up) and |1⟩ (spin down). Then the density operator can be evaluated as

ρ = (1/2)|0⟩⟨0| + (1/2)|1⟩⟨1|.    (22)

This operator works in a two-dimensional Hilbert space — therefore it can be represented as a 2 × 2 matrix:

ρ = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}.    (23)

The matrix elements are evaluated as follows. The upper-left element is

⟨0|ρ|0⟩ = (1/2)⟨0|0⟩⟨0|0⟩ + (1/2)⟨0|1⟩⟨1|0⟩ = 1/2    (24)

as follows from (22) and from the orthogonality of the two basis states. The upper-right element is given by

⟨0|ρ|1⟩ = (1/2)⟨0|0⟩⟨0|1⟩ + (1/2)⟨0|1⟩⟨1|1⟩ = 0    (25)

as a result of orthogonality. The lower-left element ⟨1|ρ|0⟩ and the lower-right ⟨1|ρ|1⟩ are found similarly. Another interesting way to find the density matrix (i.e. the matrix representation of the density operator) is by directly using the vector representation of the states |0⟩ and |1⟩:

ρ = (1/2)\begin{pmatrix}1\\0\end{pmatrix}(1, 0) + (1/2)\begin{pmatrix}0\\1\end{pmatrix}(0, 1) = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}.    (26)

Note the somewhat unusual order in which we encounter column and row vectors: the result is not a number, but an operator. Another day, Charley decides to send us particles which are either 'up' or 'down' along the x-axis. As you might remember, the eigenstates are

(1/√2)(|0⟩ + |1⟩)    (27)

for spin-up (along x) and

(1/√2)(|0⟩ − |1⟩)    (28)

for spin-down. You recognize these states as the states |ψ_3⟩ and |ψ_4⟩ given above. Now let us work out the density operator:

ρ = (1/4)\begin{pmatrix}1\\1\end{pmatrix}(1, 1) + (1/4)\begin{pmatrix}1\\-1\end{pmatrix}(1, −1) = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}.    (29)

We see that we obtain the same density matrix! Apparently, the particular axis used by Charley does not affect what we measure at our end. Another question we frequently ask ourselves when dealing with quantum systems is: what is the probability to find the system in a state |φ⟩ in a measurement? The answer for a system which is in a pure state |ψ⟩ is

P_φ = |⟨φ|ψ⟩|².    (30)
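The outer-product construction of Eqs. (26) and (29) is easily checked numerically; a sketch with numpy:

```python
import numpy as np

zero = np.array([[1], [0]])          # |0> as a column vector
one  = np.array([[0], [1]])          # |1>

# Eq. (26): equal mixture of |0> and |1>
rho_z = 0.5 * zero @ zero.T + 0.5 * one @ one.T

# Eq. (29): equal mixture of (|0> + |1>)/sqrt(2) and (|0> - |1>)/sqrt(2)
plus  = (zero + one) / np.sqrt(2)
minus = (zero - one) / np.sqrt(2)
rho_x = 0.5 * plus @ plus.T + 0.5 * minus @ minus.T

print(rho_z)                        # [[0.5, 0], [0, 0.5]]
print(np.allclose(rho_z, rho_x))    # True: the same density matrix
```

The column-times-row products are exactly the "unusual order" mentioned above: each term is a 2 × 2 operator, not a number.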


If the system can be in any one of a set of states |ψ_i⟩ with respective probabilities p_i, the answer is therefore

P_φ = ∑_i p_i |⟨φ|ψ_i⟩|².    (31)

Another way to obtain the expression on the right hand side is by using the density operator:

⟨φ|ρ|φ⟩ = ∑_i p_i |⟨φ|ψ_i⟩|² = P_φ.    (32)

This equation follows directly from the definition of the density operator. Important examples of systems in a mixed state are statistical systems connected to a heat bath. Loosely speaking, the actual state of the system without the bath varies with time, and we do not know that state when we perform a measurement. We know however from statistical physics that the probability for the system to be in a state with energy E_i is given by the Boltzmann factor exp[−E_i/(k_B T)], so the density operator can be written as

ρ = N ∑_i |ψ_i⟩ e^{−E_i/(k_B T)} ⟨ψ_i|,    (33)

where the ψ_i are eigenstates of the Hamiltonian. The prefactor N is adjusted such that N ∑_i e^{−E_i/(k_B T)} = 1 in order to guarantee that Tr ρ = 1. The density operator can also be written as

ρ = N e^{−Ĥ/(k_B T)},    (34)

as can be verified as follows:

e^{−Ĥ/(k_B T)} = ∑_i |ψ_i⟩⟨ψ_i| e^{−Ĥ/(k_B T)} ∑_j |ψ_j⟩⟨ψ_j| = ∑_i |ψ_i⟩ e^{−E_i/(k_B T)} ⟨ψ_i|.    (35)

Any expectation value can now in principle be evaluated. For example, consider a spin-1/2 particle connected to a heat bath of temperature T in a magnetic field B pointing in the z-direction. The Hamiltonian is given by

H = −γB S_z.    (36)

Then the expectation value of the z-component of the spin can be calculated as

⟨S_z⟩ = Tr(ρ S_z).    (37)

We can evaluate ρ. Using the notation β = 1/(k_B T), it reads

ρ = \frac{1}{e^{βγℏB/2} + e^{−βγℏB/2}} \begin{pmatrix} e^{βγℏB/2} & 0 \\ 0 & e^{−βγℏB/2} \end{pmatrix}.    (38)

Now the expectation value ⟨S_z⟩ can immediately be found, using S_z = (ℏ/2)σ_z, where σ_z is the Pauli matrix:

⟨S_z⟩ = Tr(ρ S_z) = (ℏ/2) tanh(βγℏB/2).    (39)

Considering systems of noninteracting particles, the density operator can be used to derive the average occupation of energy levels, leading to the well-known Fermi-Dirac distribution for fermions and the Bose-Einstein distribution for bosons. This derivation is however beyond the scope of this lecture course — it is treated in your statistical mechanics course.
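The result (39) can be checked numerically. The sketch below works in units where ℏ = k_B = 1, and the values of γ, B and T are arbitrary illustrative choices:

```python
import numpy as np

hbar, kB = 1.0, 1.0          # units with hbar = kB = 1
gamma, B, T = 1.0, 2.0, 0.5  # arbitrary illustrative values
beta = 1.0 / (kB * T)

Sz = hbar / 2 * np.diag([1.0, -1.0])   # Sz = (hbar/2) sigma_z
H = -gamma * B * Sz                    # Eq. (36)

w = np.exp(-beta * np.diag(H))         # Boltzmann weights of the two levels
rho = np.diag(w / w.sum())             # Eq. (38), normalised so Tr rho = 1

Sz_avg = np.trace(rho @ Sz)
print(np.isclose(Sz_avg, hbar / 2 * np.tanh(beta * gamma * hbar * B / 2)))  # True
```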


3 Entanglement

Entanglement is a phenomenon which can occur when two or more quantum systems are coupled. We shall focus on the simplest nontrivial system exhibiting entanglement: two particles, A and B, whose degrees of freedom each span a two-dimensional Hilbert space (as usual, you may think of two spin-1/2 particles). The states of the particles are denoted |0⟩ and |1⟩. Therefore, the possible states of the two-particle system are linear combinations of the states

|00⟩, |01⟩, |10⟩ and |11⟩    (40)

(the first number denotes the state of particle A and the second one that of particle B). We use these states (in this order) as a basis of the four-dimensional Hilbert space, that is, we may identify

|00⟩ ⇔ \begin{pmatrix}1\\0\\0\\0\end{pmatrix}    (41)

and so on. Suppose the system is in the state

|ψ⟩ = (1/2)(|00⟩ + |01⟩ + |10⟩ + |11⟩)    (42)

or, in vector notation:

ψ = (1/2)\begin{pmatrix}1\\1\\1\\1\end{pmatrix}.    (43)

Note that this state is normalised. We perform measurements on the first spin only. More specifically, we measure the probabilities for the first spin to be found in the states

|ψ_1⟩ = |0⟩,  |ψ_2⟩ = |1⟩,  |ψ_3⟩ = (1/√2)(|0⟩ + |1⟩)  or  |ψ_4⟩ = (1/√2)(|0⟩ − |1⟩).    (44)

The resulting probabilities are (check this!):

P_1 = P_2 = 1/2;    (45a)
P_3 = 1;    (45b)
P_4 = 0.    (45c)

These are precisely the same results as those found above for a single particle in the state |ψ_3⟩ = (|0⟩ + |1⟩)/√2; that is, if we want to predict measurements on the first particle, we can forget about the second particle. The reason for this is that we can write the state (42) as

(1/2)(|0⟩_A + |1⟩_A) ⊗ (|0⟩_B + |1⟩_B),    (46)

where ⊗ is the so-called tensor product. The fact that (42) can be written as a (tensor) product of pure states of the two subsystems A and B is responsible for the fact that the second particle does not 'interfere' with the first one.
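That measurements on particle A are insensitive to particle B for the product state (42) can be verified numerically. In the sketch below, the operator |φ⟩⟨φ| ⊗ 1 projects the first qubit on |φ⟩ while ignoring the second:

```python
import numpy as np

zero, one = np.array([1, 0], complex), np.array([0, 1], complex)

# The product state (42): (|0>_A + |1>_A) ⊗ (|0>_B + |1>_B) / 2
psi = np.kron(zero + one, zero + one) / 2
print(np.round(psi, 3))   # amplitude 0.5 on each of |00>, |01>, |10>, |11>

def prob_A(phi, psi):
    """Probability to find particle A in |phi>, not caring about particle B."""
    P = np.kron(np.outer(phi, phi.conj()), np.eye(2))   # |phi><phi| ⊗ 1
    return float(np.vdot(psi, P @ psi).real)

plus  = (zero + one) / np.sqrt(2)
minus = (zero - one) / np.sqrt(2)
print([round(prob_A(phi, psi), 3) for phi in (zero, one, plus, minus)])
# [0.5, 0.5, 1.0, 0.0] -- the probabilities (45) found above
```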

Now consider the state

|ψ_E⟩ = (1/√2)(|00⟩ + |11⟩).    (47)

If we evaluate again the probabilities of finding the first spin in the states ψ_1 or ψ_2, we find

P_1 = P_2 = 1/2,    (48)

as can readily be seen from (47). In order to evaluate the probability to find the first spin in the state ψ_3, we write |ψ_E⟩ in the form

|ψ_E⟩ = (1/2)[(|ψ_3⟩ + |ψ_4⟩)|0⟩ + (|ψ_3⟩ − |ψ_4⟩)|1⟩].    (49)

Now, the probability for the first spin to be in the state ψ_3 or ψ_4, while not caring about whether the second spin is 0 or 1, is seen to be

P_3 = P_4 = 1/2.    (50)

These results are the same as for a single particle with density operator

ρ = (1/2)(|0⟩⟨0| + |1⟩⟨1|).    (51)

The conclusion is that for the state (47), the first particle is well described by a mixed state. There is no way to assign a pure state to particle A: we say that particle A is entangled with particle B, and ψ_E is called an entangled state. Thus, we see that the density operator is very useful for describing particles coupled to an 'outside world'. A state is entangled when it does not allow us to assign a pure state to a part of the quantum system under consideration. Let us now consider entanglement from another point of view. We perform measurements on particle A and on B, checking whether these particles are found in state 1 or 0. For our entangled state (47) we find

P_00 = P_11 = 1/2;    (52a)
P_10 = P_01 = 0,    (52b)

where P_01 is the probability to find particle A in state 0 and particle B in state 1, etcetera. We see that, in terms of classical probabilities, system A is strongly correlated with system B. It turns out that this correlation remains complete even when the measurement is performed with respect to another basis (see exercises): entanglement gives rise to correlation of probabilities, and this correlation cannot be lifted by a basis transformation. Now let's start with a system which is not entangled — it might for example be in the state (42). We assume that the system evolves according to a Hamiltonian H which, in the basis |00⟩, |01⟩, |10⟩, |11⟩, has the following form:

H = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.    (53)


The time evolution operator is given by T = exp(−itĤ/ℏ); at t = πℏ/2 it has the form

T = \begin{pmatrix} -i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & 0 & 0 & i \end{pmatrix},    (54)

so that we find

|ψ(t = πℏ/2)⟩ = −(i/2)(|00⟩ + |01⟩ + |10⟩ − |11⟩),    (55)

which is an entangled state (you will find no way to write it as a tensor product of two pure states of A and B). Thus we see that when a system starts off in a non-entangled state, it may evolve into an entangled state in the course of time.
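This evolution is easy to simulate, since H is diagonal. The sketch below also uses the standard criterion, not derived in these notes, that a two-qubit state a|00⟩ + b|01⟩ + c|10⟩ + d|11⟩ is a product state exactly when ad − bc = 0:

```python
import numpy as np

hbar = 1.0
H = np.diag([1.0, 1.0, 1.0, -1.0])           # the Hamiltonian (53)

psi0 = np.ones(4, dtype=complex) / 2          # the product state (42)

t = np.pi * hbar / 2
U = np.diag(np.exp(-1j * t * np.diag(H) / hbar))   # T = exp(-itH/hbar)
psi_t = U @ psi0
print(np.round(psi_t, 3))   # -(i/2)(|00> + |01> + |10> - |11>), Eq. (55)

a, b, c, d = psi0
print(np.isclose(a * d - b * c, 0))   # True: the initial state is a product state
a, b, c, d = psi_t
print(np.isclose(a * d - b * c, 0))   # False: the evolved state is entangled
```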

4 The EPR paradox and Bell's theorem

In 1935, Einstein, Podolsky and Rosen (EPR) published a thought experiment which demonstrated that quantum mechanics is not compatible with some obvious ideas which we tacitly apply when describing phenomena. In particular, the notions of a reality existing independently of experimental measurements and of locality cannot both be reconciled with quantum mechanics. Locality is used here to denote the idea that an event cannot have an effect at a distance before information has travelled from that event to the other place where its effect is noticed. Together, the notions of reality and locality are commonly denoted as 'local realism'. From the failure of quantum mechanics to comply with local realism, EPR concluded that quantum mechanics is not a complete theory. The EPR paradox is quite simple to explain. At some point in space, a stationary particle with spin 0 decays into two spin-1/2 particles which fly off in opposite directions (momentum conservation requires the directions to be opposite). During the decay process, angular momentum is conserved, which implies that the two particles must have opposite spin: when one particle is found to have spin 'up' along some measuring axis, the other particle must have spin 'down' along the same axis. Obviously, we are dealing with an entangled state. Suppose Alice and Bob each receive one of the outgoing particles from the same decay event. Alice measures the spin of her particle along the z direction, and Bob does the same with his particle. Considered separately, they both have the same probability to find either ℏ/2 or −ℏ/2. However, if quantum mechanics is correct, these measurements should be strongly correlated: if Alice has measured spin up, then Bob's particle must have spin down along the z-axis, so the measurement results are fully correlated.
According to the 'orthodox', or 'Copenhagen', interpretation of quantum mechanics, if Alice is the first one to measure the spin, the particular value measured by her is decided at the very moment of that measurement. But this means that at the same moment the spin state of Bob's particle is determined. Bob, however, could be lightyears away from Alice and perform his measurement immediately after her. According to the orthodox interpretation, his measurement would be influenced by Alice's. But this was inconceivable to Einstein, who maintained that the information about Alice's measurement could not reach Bob's particle instantaneously, as the speed of light is a limiting factor for communication. In Einstein's view, the outcome of the measurements on the particles is determined at the moment when they leave the source, and he believed that a more complete theory could be found which would unveil the 'hidden variables' which determine the outcomes of Alice's and Bob's measurements when the particles left the source. These hidden variables would then represent some 'reality' which exists irrespective of the measurement.

[Figure 12.1: The measuring axes a, b and c for a spin.]

The EPR puzzle remained unsettled for a long time until, in 1964, John Bell formulated a theorem which makes it possible to distinguish between Einstein's scenario and the orthodox quantum mechanical interpretation. We shall now derive Bell's theorem. Suppose we count in an audience the numbers of people having certain properties, such as 'red hair', 'wearing yellow socks' or 'taller than 1.70 m'. We take three such properties, called A, B and C. If we select one person from the audience, he or she will either have each of these properties or not. We denote this by a person being 'in the state' A+, B−, C+, for example. The number of people in the state A+, B−, C+ is denoted N(A+, B−, C+). We now write

N(A+, B−) = N(A+, B−, C+) + N(A+, B−, C−),    (56)

which is a rather obvious relation. We use similar relations in order to rewrite this as

N(A+, B−) = N(A+, C−) − N(A+, B+, C−) + N(B−, C+) − N(A−, B−, C+) ≤ N(A+, C−) + N(B−, C+).    (57)

This is Bell's inequality, which can also be formulated in terms of probabilities [P(A+, B−) instead of N(A+, B−), etcetera]. We have used everyday-life examples in order to emphasise that there is nothing mysterious, let alone quantum mechanical, about Bell's inequality. But let us now turn to quantum mechanics, and spin determination in particular. Consider the three axes a, b and c shown in the figure. A+ is now identified with a spin-up measurement along a, etcetera. We can now evaluate P(A+, C−). Measuring A+ happens with probability 1/2, but after this measurement, the particle is in the spin-up state along the a-axis. If the spin is then measured along the c direction, we have a probability sin²(π/8) to find C− (see problem 16 of the exercises). The combined probability P(A+, C−) is therefore (1/2) sin²(π/8). Similarly, P(B−, C+) is also equal to (1/2) sin²(π/8), and P(A+, B−) is 1/4. Inserting these numbers into Bell's inequality gives

1/4 ≤ sin²(π/8) = (1/2)(1 − (1/2)√2),    (58)

which is obviously wrong. Therefore, we see that quantum mechanics does not obey Bell's inequality. Now what does this have to do with the EPR paradox? Well, first of all, the EPR setup allows us to measure the spin along two different directions at virtually the same moment. But, more importantly, if


the particles would leave the origin with predefined probabilities, Bell's inequality would unambiguously hold. The only way to violate Bell's inequality is by accepting that Alice's measurement reduces the entangled wavefunction of the two-particle system, which is also noticed by Bob instantaneously. So, there is some 'action at a distance', in contrast to what we usually have in physics, where every action is mediated by particles such as photons, mesons, . . . . In 1982, Aspect, Dalibard and Roger performed experiments with photons emerging from decaying atoms in order to check whether Bell's inequality holds or not. Since then, several other groups have redone this experiment, sometimes with different setups. It is now generally accepted that Bell's inequality does not hold for quantum mechanical probabilities. The implications of this conclusion for our view of Nature are enormous: somehow actions can be performed without intermediary particles, so that the speed of light is not a limiting factor for this kind of communication. 'Communication' is however a dangerous term to use in this context, as it suggests that information can be transmitted instantaneously. The 'information' which is transmitted from Alice to Bob or vice versa is purely probabilistic, since neither Bob nor Alice can predict the outcome of their measurements. So far, no schemes have been invented or realised which would allow us to send over a Mozart symphony at speeds faster than the speed of light.
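The numbers entering the violation (58) can be tabulated in a few lines. The angles 0, π/4 and π/2 for the axes a, c and b are the ones implicit in the sin²(π/8) probabilities used above:

```python
import numpy as np

# After finding spin-up along one axis, the probability of spin-down along
# an axis at angle theta is sin^2(theta/2); the first result occurs with
# probability 1/2, giving the joint probability below.
def P(theta):
    return 0.5 * np.sin(theta / 2) ** 2

P_AB = P(np.pi / 2)    # P(A+, B-) = 1/4        (a and b at angle pi/2)
P_AC = P(np.pi / 4)    # P(A+, C-) = sin^2(pi/8)/2
P_BC = P(np.pi / 4)    # P(B-, C+) = sin^2(pi/8)/2

print(round(P_AB, 4), round(P_AC + P_BC, 4))   # 0.25 vs ~0.1464
print(P_AB <= P_AC + P_BC)                     # False: Bell's inequality violated
```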

5 No cloning theorem

In recent years, much interest has arisen in quantum information processing. In this field, people try to exploit quantum mechanics in order to process information in a way completely different from classical methods. We have already encountered one example of these attempts: quantum cryptography, where a random encryption key can be shared between Bob and Alice without Eve being capable of eavesdropping. Another very important application, which unfortunately is still far from realisation, is the quantum computer. When I speak of a quantum computer, you should not forget that I mean a machine which exists only on paper, not in reality. A quantum computer is a quantum machine in which qubits evolve in time. A qubit is a quantum system with a 2-dimensional Hilbert space. Its state can always be written as

|ϕ⟩ = a|0⟩ + b|1⟩,    (59)

where a and b are complex constants satisfying |a|² + |b|² = 1. The states |0⟩ and |1⟩ form a basis in the Hilbert space. A quantum computer manipulates several qubits in parallel. A system consisting of n qubits has a 2ⁿ-dimensional Hilbert space. A quantum computation consists of a preparation of the qubits in some well-defined state, followed by an autonomous evolution of the qubit system, and concluded by reading out the state of the qubits. As the system is autonomous, it is described by a (Hermitian) Hamiltonian. The time-evolution operator U = exp(−itH/ℏ) is then a unitary operator, so the quantum computation between initialisation and reading out the results can be described in terms of a sequence of unitary transformations applied to the system. In this section we shall derive a general theorem for such an evolution, the no-cloning theorem: an unknown quantum state cannot be cloned. By cloning we mean that we can copy the state of some quantum system into some other system without losing the state of our original system. Before proceeding with the proof of this theorem, let us assume that cloning would be possible.
In that case, communication at speeds faster than light would in principle be possible. To see this, imagine Alice has a qubit of which Bob has many clones, entangled with Alice's qubit. If Alice performs a measurement on her qubit in the basis {|0⟩, |1⟩} or in the basis {(|0⟩ + |1⟩)/√2, (|0⟩ − |1⟩)/√2}, Bob's clones will become


aligned along the same axis. As Bob has many clones, he can find out which measurement Alice performed without ambiguity (how?). So the no-cloning theorem is essential in making communication at speeds faster than the speed of light impossible. The proof of the no-cloning theorem for qubit systems proceeds as follows. Cloning for a qubit pair means that we have a unitary evolution U with the following effect on a qubit pair:

U|α0⟩ = |αα⟩.    (60)

The evolution U should work for any state α; therefore it cannot depend on α. Therefore, for some other state |β⟩ we must have

U|β0⟩ = |ββ⟩.    (61)

Now let us operate with U on the state |γ0⟩ with |γ⟩ = (|α⟩ + |β⟩)/√2. By linearity,

U|γ0⟩ = (|αα⟩ + |ββ⟩)/√2 ≠ |γγ⟩,    (62)

since |γγ⟩ also contains the cross terms |αβ⟩ and |βα⟩. This completes the proof.
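The contradiction in Eq. (62) can be made concrete by extending the basis-state cloning rule U|00⟩ = |00⟩, U|10⟩ = |11⟩ linearly to a superposition; a numpy sketch:

```python
import numpy as np

zero, one = np.array([1, 0], complex), np.array([0, 1], complex)

def clone_by_linearity(psi):
    """Extend the rule U|00> = |00>, U|10> = |11> linearly to |psi>|0>."""
    a, b = psi
    return a * np.kron(zero, zero) + b * np.kron(one, one)

gamma = (zero + one) / np.sqrt(2)
out = clone_by_linearity(gamma)      # (|00> + |11>)/sqrt(2): Eq. (62)
target = np.kron(gamma, gamma)       # |gamma>|gamma> = (|00>+|01>+|10>+|11>)/2

print(np.allclose(out, target))      # False: linearity forbids cloning
```

The linear extension produces an entangled (Bell) state, not two copies of |γ⟩.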

6 Dense coding

In this section, I describe a way of sending over more than one bit of information per transmitted qubit. This sounds completely impossible, but, again, quantum mechanics is in principle able to realise the impossible. It is however difficult to implement, as it is based on Bob and Alice sharing an entangled pair of qubits, in the state

|00⟩ + |11⟩.    (63)

From now on, we shall adopt the convention in this field to omit normalisation factors in front of the wavefunctions. We can imagine this state to be realised by having an entangled-pair generator midway between Alice and Bob, sending entangled particles in opposite directions as in the EPR setup. Note that the following qubit operations are all unitary:

I|φ⟩ = |φ⟩;    (64a)
X|0⟩ = |1⟩,    (64b)
X|1⟩ = |0⟩;    (64c)
Z|0⟩ = |0⟩,    (64d)
Z|1⟩ = −|1⟩;    (64e)
Y|0⟩ = |1⟩,    (64f)
Y|1⟩ = −|0⟩,  with Y = XZ.    (64g)

The operator I is the identity and X is called the NOT operator. We assume that Alice has a device with which she can perform any of the four transformations (I, X, Y, Z) on her member (i.e. the first qubit) of the entangled pair. The resulting states for these four transformations are mutually orthogonal:

I(|00⟩ + |11⟩) = |00⟩ + |11⟩;    (65a)
X(|00⟩ + |11⟩) = |10⟩ + |01⟩;    (65b)
Y(|00⟩ + |11⟩) = |10⟩ − |01⟩;    (65c)
Z(|00⟩ + |11⟩) = |00⟩ − |11⟩.    (65d)

Alice does not perform any measurement — she performs one of these four transformations and then sends her qubit to Bob. Bob then measures in which of the four possible states the entangled pair is; in other words, he now knows which transformation Alice applied. This information is 'worth' two bits, but Alice had to send only one qubit to Bob!
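A short simulation of the dense-coding step: Alice's four operations produce four mutually orthogonal two-qubit states, which is exactly what allows Bob to identify her choice. A numpy sketch, with the normalisation factors kept explicit:

```python
import numpy as np

I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
Y = X @ Z                                   # the convention (64g): Y = XZ

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)

# Alice applies one of I, X, Y, Z to her qubit (the first one):
outcomes = [np.kron(G, I) @ bell for G in (I, X, Y, Z)]

# The four resulting states are mutually orthogonal, so Bob can
# distinguish them with a single joint measurement on the pair:
gram = np.round([[abs(np.vdot(u, v)) for v in outcomes] for u in outcomes], 10)
print(gram)    # the 4x4 identity matrix
```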


7 Quantum computing and Shor's factorisation algorithm

A quantum computer is a device containing one or more sets of qubits (called registers), which can be initialised without ambiguity, which can evolve in a controlled way under the influence of unitary transformations, and which can be measured after completion of this evolution. The most general single-qubit transformation is a four-parameter family. For more than one qubit, it can be shown that every nontrivial unitary transformation can be generated by single-qubit transformations of the form

U(θ, φ) = \begin{pmatrix} \cos(θ/2) & -i e^{-iφ} \sin(θ/2) \\ -i e^{iφ} \sin(θ/2) & \cos(θ/2) \end{pmatrix}    (66)

together with one unitary transformation involving more than a single qubit, the so-called 2-qubit XOR. This transformation acts on a qubit pair and has the following effect:

XOR(|00⟩) = |00⟩;    (67a)
XOR(|01⟩) = |01⟩;    (67b)
XOR(|10⟩) = |11⟩;    (67c)
XOR(|11⟩) = |10⟩.    (67d)

We see that the first qubit is left unchanged and the second one becomes the eXclusive OR of the two input bits. Unitary transformations are realised by hardware elements called gates. Several proposals for building quantum computers exist. In the ion trap, an array of ions, each of which can be in either the ground state (|0⟩) or an excited state (|1⟩), is controlled by laser pulses. Coupling of neighbouring ions, needed to realise an XOR gate, is achieved through a controlled momentum transfer to displacement excitations (phonons) of the chain. Here in Delft, activities focus on arrays of Josephson junctions. Josephson junctions are very thin layers of ordinary conductors separating two superconductors. Current can flow through these junctions in either the clockwise or anti-clockwise direction (interpreted as 0 and 1 respectively). Other initiatives include NMR devices and optical cavities. With the NMR technique it has recently become possible to factorise the number 15. Realisation of a working quantum computer will take at least a few decades — if it comes at all. A major problem in realising a working quantum computer is to ensure a unitary evolution. In practice, the system will always be coupled to the outside world. Quantum computing hinges upon the possibility to have controlled, coherent superpositions. Coherent superpositions are linear combinations of quantum states which together form another pure state. As we have seen in the previous section, coupling to the environment may lead to entanglement, which would cause the quantum computer to be described by a density operator rather than by a pure state. In particular, any phase relation between the constituent parts of a phase-coherent superposition is destroyed by coupling to the environment. We shall now treat this phenomenon in more detail. Consider a qubit which interacts with its environment. We denote the state of the environment by the ket |m⟩. The interaction is described by the following prescription:

|0⟩|m⟩ → |0⟩|m_0⟩;    (68a)
|1⟩|m⟩ → |1⟩|m_1⟩.    (68b)

In this interaction, the qubit itself does not change — if it did, our computer would be useless to start with.


Suppose we start with a state

|0⟩ + e^{iφ}|1⟩,    (69)

which is coupled to the environment. This coupling will induce the transition

(|0⟩ + e^{iφ}|1⟩)|m⟩ → |0⟩|m_0⟩ + e^{iφ}|1⟩|m_1⟩.    (70)

Suppose this qubit is then fed into a so-called Hadamard gate, which has the effect

H|0⟩ = (1/√2)(|0⟩ + |1⟩);    (71a)
H|1⟩ = (1/√2)(|0⟩ − |1⟩).    (71b)

Then the outcome is

e^{iφ/2}[|0⟩(e^{−iφ/2}|m_0⟩ + e^{iφ/2}|m_1⟩) + |1⟩(e^{−iφ/2}|m_0⟩ − e^{iφ/2}|m_1⟩)].    (72)

If we suppose that ⟨m_0|m_1⟩ is real, we find for the probabilities to measure the qubit in the state |0⟩ or |1⟩ (after normalisation):

P_0 = (1/2)(1 + ⟨m_0|m_1⟩ cos φ);    (73a)
P_1 = (1/2)(1 − ⟨m_0|m_1⟩ cos φ).    (73b)

If there is no coupling, m_0 = m_1 = m, and we recognise the phase relation between the two states in the probabilities. On the other hand, if ⟨m_0|m_1⟩ = 0, then we find for both probabilities 1/2, and the phase relation has disappeared completely. It is interesting to construct the density operator for the qubit in the final state (72). Consider a qubit

α|0⟩ + β|1⟩    (74)

which has interacted with its environment, so that we have the combined state

α|0⟩|m_0⟩ + β|1⟩|m_1⟩.    (75)

We can arrive at a density operator for the qubit alone by performing the trace over the m-system only. Using (13) we find

ρ_qubit = \begin{pmatrix} |α|² & αβ^* ⟨m_1|m_0⟩ \\ α^*β ⟨m_0|m_1⟩ & |β|² \end{pmatrix}.    (76)

The eigenvalues of this matrix are

λ = 1/2 ± (1/2)√[(|α|² − |β|²)² + 4|α|²|β|² |⟨m_0|m_1⟩|²],    (77)

and these lie between 0 and 1, where the value 1 is reached only for |⟨m_0|m_1⟩| = 1 (or when α or β vanishes). The terms coherence/decoherence derive from the name 'coherence' which is often used for the matrix element ⟨m_0|m_1⟩. Now let us return to the very process of quantum computing itself. The most impressive algorithm, which was developed in 1994 by Peter Shor, is that of factorising large integers, an important problem


in the field of encryption and code-breaking. We shall not describe this algorithm in detail, but present a brief sketch of an important sub-step: finding the period of an integer function f. It is assumed here that all unitary transformations used can be realised with a limited number of gates. The algorithm works with two registers, both containing n qubits. Each register is described by a 2ⁿ-dimensional Hilbert space. As basis states we use the bit sequences of the integers between 0 and 2ⁿ − 1. The basis state corresponding to such an integer x is denoted |x⟩_n. Now we apply the Hadamard gate (71) to all bits of the state |0⟩_n. This yields

H|0⟩_n ≡ |w⟩_n = 2^{−n/2} ∑_{x=0}^{2ⁿ−1} |x⟩_n.    (78)

It is possible (but we shall not describe the method here) to construct, for any function $f$ which maps the set of numbers 0 to $2^n - 1$ onto itself, a unitary transformation $U_f$ which has the effect

\[ U_f |x\rangle_n |0\rangle_n = |x\rangle_n |f(x)\rangle_n \]  (79)

using a limited number of gates. Now we are ready for the big trick in quantum computing. If we let $U_f$ act on the state $|w\rangle_n$, then we obtain

\[ U_f |w\rangle_n |0\rangle_n = 2^{-n/2} \sum_{x=0}^{2^n-1} |x\rangle_n |f(x)\rangle_n. \]  (80)

We see that the new state contains $f(x)$ for all possible values of $x$. In other words, by applying the gates $U_f$ to our state $|w\rangle_n |0\rangle_n$, we have evaluated the function $f$ for $2^n$ different arguments. This feature is called quantum parallelism, and it is responsible for the (theoretical) performance of quantum computing. Of course, if we were to read out the results of the computation for each $x$-value, we would not have gained much, as this would take $2^n$ operations. In general, however, the final result that we are after consists of only a few data, so a useful problem does not consist of simply calculating $f$ for all of its possible arguments.

As an example we consider the problem of finding the periodicity of the function $f$, which is an important step in Shor's algorithm. This is done by reading out only one particular value of the result in the second register, $f(x) = u$, say. The first register is then the sum of all $x$-states for which $f(x) = u$. If $f$ has a period $r$, these $x$-values lie a distance $r$ apart from each other. Now we act with a (unitary) Fourier transform operator on this register, and the result will be a linear combination of the registers corresponding to the period(s) of the function $f$. If there is only one period, we can read it out straightforwardly.

As mentioned, finding the period of some function is an important step in the factorising algorithm. Shor's algorithm is able to factorise an $n$-bit integer in about $300\,n^3$ steps. A very rough estimate of the size at which a quantum computer starts outperforming a classical machine is a number to be factorised of about $10^{130}$.
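The period-finding step can be mimicked with a small classical simulation. The sketch below is illustrative only: the register size, the toy function $f(x) = x \bmod r$ and the use of numpy's FFT in place of the quantum Fourier transform are assumptions, not part of the notes. After "measuring" $f(x) = u$ in the second register, the Fourier transform of the collapsed first register peaks at multiples of $2^n/r$.

```python
import numpy as np

n = 5                 # qubits in the first register
N = 2 ** n
r = 4                 # period of the toy function below
f = lambda x: x % r   # hypothetical periodic function f(x) = x mod r

# After applying U_f to the uniform superposition, reading out f(x) = u
# in the second register collapses the first register onto the x-values
# with f(x) = u; these lie a distance r apart.
u = 1
xs = np.array([x for x in range(N) if f(x) == u])
state = np.zeros(N, dtype=complex)
state[xs] = 1.0 / np.sqrt(len(xs))        # renormalised collapsed register

# Discrete Fourier transform of the register: the probability
# concentrates on multiples of N / r.
amps = np.fft.fft(state) / np.sqrt(N)
probs = np.abs(amps) ** 2
peaks = np.flatnonzero(probs > 1e-6)
print(peaks.tolist())    # [0, 8, 16, 24]: multiples of N/r = 8
```

Reading off the spacing of the peaks (here $N/r = 8$) recovers the period $r = N/8 = 4$.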

Appendix A

Review of Linear Algebra

1 Hilbert spaces

A Hilbert space is defined as a linear, closed inner product space. The notions of linearity, inner product and closure may need some explanation.

• A linear vector space is a vector space in which any linear combination of vectors is an element of that space. In other words, if $u$ and $v$ are elements of the space $\mathcal{H}$, then

\[ \alpha u + \beta v \text{ lies in } \mathcal{H}. \]  (1)

• An inner product is a scalar expression depending on two vectors $u$ and $v$. It is denoted by $\langle u|v\rangle$ and it satisfies the following requirements:

1. \[ \langle u|v\rangle = \langle v|u\rangle^*, \]  (2a)
   where the asterisk denotes complex conjugation.

2. Linearity:
   \[ \langle w|\alpha u + \beta v\rangle = \alpha \langle w|u\rangle + \beta \langle w|v\rangle. \]  (2b)

3. Positive-definiteness:
   \[ \langle u|u\rangle \geq 0, \]  (2c)
   and the equals sign only holds when $u = 0$.

An inner product space is a linear vector space in which an inner product is defined.

• Closure means that if we take a converging sequence of vectors in the Hilbert space, then the limit of the sequence also lies inside the space.

We shall now discuss two examples of Hilbert spaces.

1. Linear vector space of finite dimension $N$. The elements are represented as column vectors:

\[ u = |u\rangle = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{pmatrix}. \]  (3)

The elements $u_i$ are complex. The vector $\langle u|$ is conveniently denoted as

\[ \langle u| = (u_1^*, u_2^*, \ldots, u_N^*). \]  (4)

It is called the Hermitian conjugate of the column vector $|u\rangle$; $\langle u|$ is often denoted as $|u\rangle^\dagger$. The inner product $\langle u|v\rangle$ is the product of the row vector $\langle u|$ and the column vector $|v\rangle$, hence it can be written as

\[ \langle u|v\rangle = \sum_{i=1}^{N} u_i^* v_i. \]  (5)
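In a finite-dimensional space, the inner product (5) is what numpy's `np.vdot` computes (it conjugates its first argument). A quick sketch with arbitrary illustrative vectors, checking requirements (2a) and (2c):

```python
import numpy as np

u = np.array([1 + 1j, 2.0])
v = np.array([3.0, 1 - 1j])

# <u|v> = sum_i u_i^* v_i ; np.vdot conjugates its first argument
inner = np.vdot(u, v)
assert np.isclose(inner, np.sum(u.conj() * v))

# Requirement (2a): <u|v> = <v|u>^*
assert np.isclose(inner, np.conjugate(np.vdot(v, u)))

# Requirement (2c): <u|u> is real and non-negative
norm_sq = np.vdot(u, u)
assert np.isclose(norm_sq.imag, 0.0) and norm_sq.real >= 0
```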

This definition satisfies all the requirements of the inner product mentioned above.

2. A second example is the space of square integrable functions, i.e. complex-valued functions $f$ depending on $n$ real variables $x_1, \ldots, x_n \equiv x$ satisfying

\[ \int d^n x \, |f(x)|^2 < \infty. \]  (6)

Note that the $x$ may be restricted to some domain. The inner product for complex-valued functions is defined as

\[ \langle f|g\rangle = \int d^n x \, f^*(x) g(x). \]  (7)

2 Operators

An operator transforms a vector into some other vector. We shall be mainly concerned with linear operators $\hat{T}$ which, for any two complex numbers $\alpha$ and $\beta$, satisfy

\[ \hat{T}\left(\alpha|u\rangle + \beta|v\rangle\right) = \alpha \hat{T}|u\rangle + \beta \hat{T}|v\rangle. \]  (8)

Examples are operators represented by matrices in a finite-dimensional Hilbert space:

\[ \begin{pmatrix} 1 & 2 & 3 \\ -1 & -2 & 1 \\ 1 & -1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 8 \\ -4 \\ -1 \end{pmatrix}. \]  (9)

An example of a linear operator in function space is the derivative operator $\hat{D} = d/dx$:

\[ \hat{D} f(x) = \frac{d}{dx} f(x). \]  (10)

The Hermitian conjugate $\hat{T}^\dagger$ of an operator $\hat{T}$ is defined by

\[ \left( \hat{T}|u\rangle \right)^\dagger = \langle u| \hat{T}^\dagger. \]  (11)

As an example, consider a two-dimensional Hilbert space:

\[ \hat{T}|u\rangle = \begin{pmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} T_{11} u_1 + T_{12} u_2 \\ T_{21} u_1 + T_{22} u_2 \end{pmatrix}. \]  (12)

Taking the Hermitian conjugate of this we have, using (4):

\[ \left( \hat{T}|u\rangle \right)^\dagger = \left( T_{11}^* u_1^* + T_{12}^* u_2^*, \; T_{21}^* u_1^* + T_{22}^* u_2^* \right). \]  (13)


According to (11), this must be equal to

\[ (u_1^*, u_2^*) \begin{pmatrix} T_{11}^\dagger & T_{12}^\dagger \\ T_{21}^\dagger & T_{22}^\dagger \end{pmatrix} \]  (14)

and we immediately see that

\[ \hat{T}^\dagger = \begin{pmatrix} T_{11}^* & T_{21}^* \\ T_{12}^* & T_{22}^* \end{pmatrix}. \]  (15)

We conclude that the Hermitian conjugate of a matrix is the transpose and complex conjugate of the original. This result holds for matrices of arbitrary size.

Now let us find the Hermitian conjugate of the operator $\hat{D} = d/dx$, defined by

\[ \langle g| \hat{D}^\dagger |f\rangle = \langle f| \hat{D} |g\rangle^*. \]  (16)

Writing out the integral expressions for the inner product we have:

\[ \langle f| \hat{D} |g\rangle = \int dx \, f^*(x) \frac{d}{dx} g(x) = -\int dx \left[ \frac{d}{dx} f^*(x) \right] g(x) = -\int dx \, g(x) \hat{D} f^*(x) = \langle g| -\hat{D} |f\rangle^*, \]  (17)

where we have used partial integration in the second equality and we have assumed that the integrated terms vanish. This condition holds for virtually all sensible quantum systems. Comparing (16) with (17), we see that

\[ \hat{D}^\dagger = -\hat{D}. \]  (18)

A Hermitian operator $\hat{H}$ is an operator satisfying

\[ \hat{H}^\dagger = \hat{H}. \]  (19)
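The relation $\hat{D}^\dagger = -\hat{D}$ can be illustrated numerically. In this sketch, the central-difference matrix approximating $d/dx$ on a periodic grid (periodicity is an assumption made so that the boundary terms vanish) is anti-Hermitian, and its square is Hermitian:

```python
import numpy as np

# Central-difference approximation of D = d/dx on a periodic grid
N, h = 8, 0.1
D = np.zeros((N, N))
for i in range(N):
    D[i, (i + 1) % N] = 1.0 / (2 * h)    # forward neighbour
    D[i, (i - 1) % N] = -1.0 / (2 * h)   # backward neighbour

# Hermitian conjugate = transpose and complex conjugate
assert np.allclose(D.conj().T, -D)           # D is anti-Hermitian
assert np.allclose((D @ D).conj().T, D @ D)  # D squared is Hermitian
```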

We have seen that the differentiation operator $\hat{D}$ is not Hermitian; however, $\hat{D}^2$ is. A unitary operator $\hat{U}$ is an operator which satisfies

\[ \hat{U}\hat{U}^\dagger = \hat{U}^\dagger\hat{U} = \hat{I}, \]  (20)

where $\hat{I}$ is the unit operator which leaves any vector unchanged, $\hat{I}|u\rangle = |u\rangle$. An eigenvector of a linear operator $\hat{T}$ is a vector which satisfies

\[ \hat{T}|u\rangle = \lambda|u\rangle, \]  (21)

where $\lambda$ is a complex number, which is called the eigenvalue. In geometrical terms, this means that a vector which is operated on by $\hat{T}$ changes its length, but not its direction. Eigenvectors are extremely important in quantum mechanics, as we shall see in this course. Eigenvalues are said to be degenerate if they are shared by at least two linearly independent eigenvectors. For an Hermitian operator we have the following:

• The eigenvectors span the whole Hilbert space, which means that any vector of the space can be written as a linear combination of the eigenvectors. This property of the eigenvectors is called completeness.

• All eigenvalues are real.

• Any two eigenvectors belonging to distinct eigenvalues are mutually orthogonal.


In the special case of a finite-dimensional Hilbert space, the matrix representation of an Hermitian operator $\hat{H}$ satisfies

\[ \widehat{\mathrm{Diag}} = \hat{S}^\dagger \hat{H} \hat{S}, \]  (22)

where the matrix $\widehat{\mathrm{Diag}}$ is diagonal, i.e. only its diagonal elements are nonzero, and the columns of the matrix $\hat{S}$ are the eigenvectors of $\hat{H}$.

Two operators $\hat{A}$ and $\hat{B}$ are said to commute if their product does not depend on the order in which it is evaluated:

\[ \hat{A} \text{ and } \hat{B} \text{ commute if } \hat{A}\hat{B} = \hat{B}\hat{A}. \]  (23)

For two commuting operators $\hat{A}$ and $\hat{B}$ it holds that any nondegenerate eigenvector of $\hat{B}$ is also an eigenvector of $\hat{A}$. If however $\hat{A}$ has a degenerate eigenvalue, then there can always be found an orthogonal basis in the degenerate eigenspace of that eigenvalue such that all basis vectors are also eigenvectors of $\hat{B}$, with eigenvalues which may or may not be degenerate.
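These properties can be checked numerically for a small example. The sketch below uses arbitrary illustrative matrix entries and numpy's `eigh`, which returns real eigenvalues in ascending order and an orthonormal set of eigenvectors as the columns of $S$:

```python
import numpy as np

# A small Hermitian matrix: equal to its conjugate transpose
H = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])
assert np.allclose(H, H.conj().T)

eigvals, S = np.linalg.eigh(H)

# The eigenvalues are real (eigh returns them as real numbers) and the
# eigenvectors form a complete orthonormal basis: S† S = I
assert np.allclose(S.conj().T @ S, np.eye(2))

# Diagonalisation: S† H S is diagonal, with the eigenvalues on the diagonal
Diag = S.conj().T @ H @ S
assert np.allclose(Diag, np.diag(eigvals))
```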

Appendix B

The time-dependent Schrödinger equation

The time-dependent Schrödinger equation reads:

\[ i\hbar \frac{\partial}{\partial t} \psi(R,t) = \hat{H} \psi(R,t). \]  (1)

The coordinate $R$ denotes any dependence other than time. Usually $R$ contains the space coordinates of the particle(s) in the system and their spin. In Dirac vector notation, the Schrödinger equation can be written as

\[ i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle = \hat{H} |\psi(t)\rangle. \]  (2)

This equation has a formal solution for the case where the Hamiltonian $\hat{H}$ does not depend on time:

\[ |\psi(t)\rangle = e^{-it\hat{H}/\hbar} |\psi(t=0)\rangle. \]  (3)

This expression is difficult to evaluate, as it involves the exponential of an operator. In case we know the eigenvectors $|\varphi_n\rangle$ and eigenvalues $E_n$ of the Hamiltonian:

\[ \hat{H} |\varphi_n\rangle = E_n |\varphi_n\rangle, \]  (4)

the solution is not difficult to find. We have for any eigenvector $|\varphi_n\rangle$:

\[ |\varphi_n(t)\rangle = e^{-itE_n/\hbar} |\varphi_n\rangle. \]  (5)

Because of completeness, we can write $|\psi(t=0)\rangle$ as

\[ |\psi(t=0)\rangle = \sum_n c_n |\varphi_n\rangle, \]  (6)

and we see that the solution

\[ |\psi(t)\rangle = \sum_n c_n e^{-itE_n/\hbar} |\varphi_n\rangle \]  (7)

satisfies the time-dependent Schrödinger equation with starting value $|\psi(t=0)\rangle$, as can easily be verified by substitution.

The stationary Schrödinger equation can be derived from the time-dependent Schrödinger equation by a separation of variables. Let us try to write the solution to the time-dependent Schrödinger equation in the form

\[ \psi(R,t) = \Phi(R) Q(t). \]  (8)


Substitution into the time-dependent Schrödinger equation and division of the left and right hand sides by $\psi(R,t)$ leads to

\[ \frac{i\hbar \frac{\partial}{\partial t} Q(t)}{Q(t)} = \frac{\hat{H}\Phi(R)}{\Phi(R)}. \]  (9)

On the left hand side we have an expression depending only on $t$, whereas on the right hand side we have an expression depending only on $R$. These two expressions can therefore be equal only when both are constant. We call this constant the energy, $E_n$. This leads to the two equations

\[ i\hbar \frac{\partial Q}{\partial t} = E_n Q(t); \]  (10a)
\[ \hat{H}\Phi(R) = E_n \Phi(R). \]  (10b)

The second equation is the stationary Schrödinger equation, which is essentially an eigenvalue equation for the operator $\hat{H}$. The first equation has as its solution a time-dependent phase factor $\exp(-iE_n t/\hbar)$, which must be multiplied by the eigenfunction of $\hat{H}$ at energy $E_n$ in order to obtain a solution to the time-dependent Schrödinger equation. From (7) we see that the full solution to the time-dependent Schrödinger equation can always be written as a linear combination of the solutions found via the stationary approach.
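A minimal numerical sketch of the expansion (7), using an arbitrary illustrative two-level Hamiltonian and units in which $\hbar = 1$ (an assumption made here for simplicity):

```python
import numpy as np

hbar = 1.0
H = np.array([[1.0, 0.5],
              [0.5, -1.0]])             # a toy Hermitian Hamiltonian

E, phi = np.linalg.eigh(H)              # eigenvalues E_n, eigenvectors |phi_n>

psi0 = np.array([1.0, 0.0], dtype=complex)   # starting state |psi(t=0)>
c = phi.conj().T @ psi0                 # coefficients c_n = <phi_n|psi(0)>

# Eq. (7): |psi(t)> = sum_n c_n exp(-i E_n t / hbar) |phi_n>
t = 2.0
psi_t = phi @ (np.exp(-1j * E * t / hbar) * c)

# The evolution is unitary, so the norm of the state is conserved
assert np.isclose(np.linalg.norm(psi_t), 1.0)
```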

Appendix C

Review of the Schrödinger equation in one dimension

In your first quantum mechanics course, you have encountered the stationary Schrödinger equation in one dimension. In this appendix we briefly review some aspects of this equation and its solutions. The stationary Schrödinger equation in one dimension reads

\[ \left( \frac{-\hbar^2}{2m} \frac{d^2}{dx^2} + V(x) \right) \psi(x) = E\psi(x). \]  (1)

This is an eigenvalue equation: on the left hand side, we have an operator acting on the wave function $\psi(x)$, and the result must be proportional to $\psi$, with proportionality constant $E$, the energy. A restriction on the possible solutions is that they must be square integrable, that is, they must have finite norm. The solution of this equation is known in a few cases only: the constant potential, the harmonic oscillator and the Morse potential, which is related to the hydrogen atom. Here we shall restrict ourselves to the constant potential, $V(x) = V$. For $E > V$, the solutions can be written as

\[ \psi(x) = e^{\pm ikx}, \]  (2)

with

\[ k^2 = 2m(E-V)/\hbar^2. \]  (3)

In Eq. (2), the $+$ sign in the exponent corresponds to a wave running to the right, and the $-$ sign to a left-running wave. This can be seen when the solutions are multiplied by the appropriate time-dependent phase factor $\exp(-iEt/\hbar)$. The solution $\exp(\pm ikx)$ is not normalisable. Nevertheless, it is accepted as a solution, because it is the limit of a sequence of normalisable solutions $\psi_n$ which are of the form $\exp(\pm ikx)$ for $-n < x < n$, and which are smoothly cut off to zero beyond these two boundaries. For $E < V$, the solution is

\[ \psi(x) = e^{\pm qx}, \]  (4)

with

\[ q^2 = 2m(V-E)/\hbar^2. \]  (5)

When the solution extends to $+\infty$, only the $-$ sign is allowed as a result of the normalisability of the wave function. For $x \to -\infty$, only the $+$ sign is admissible for the same reason. The difference with $\exp(\pm ikx)$ (which is also not normalisable but nevertheless accepted) is that it is not possible to find a series of normalisable solutions whose limit behaves as a diverging $\exp(\pm qx)$.


We often deal with a potential which is piecewise constant. At the boundary between two regions of constant potential, the boundary condition must be met that the value and the derivative of the wave function are equal on both sides. Often we do not care about the normalisation of the wave function at first (we do, however, care about the normalisability!). In that case, the two matching conditions for value and derivative can be replaced by a continuity condition for the so-called logarithmic derivative $\psi'(x)/\psi(x)$ (the prime stands for the derivative).

We now consider a general (i.e. non-constant) potential $V(x)$. When $E > V$ for $x \to \infty$ or $x \to -\infty$, then a solution can be found for all values of $E$ larger than $V$: the energy spectrum is then continuous. When $E < V$ for both $x \to \infty$ and $x \to -\infty$, then a normalisable solution is found only for a discrete set of $E$-values: the spectrum is then discrete. Note that in this case it is the normalisability condition which restricts the energy to a discrete set.
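The continuity of the logarithmic derivative can be used to locate bound states numerically. The sketch below is illustrative only: the finite square well, the units $\hbar = m = 1$ and the bisection bracket are assumptions, not part of the notes. For the even ground state of a well $V(x) = -V_0$ for $|x| < a$ (and zero elsewhere), matching $\psi'/\psi$ at $x = a$ gives $q = k\tan(ka)$.

```python
import math

V0, a = 10.0, 1.0     # hypothetical well depth and half-width

def log_deriv_mismatch(E):
    # Inside the well:  psi = cos(kx)  -> psi'/psi at x=a is -k tan(ka)
    # Outside the well: psi = exp(-qx) -> psi'/psi at x=a is -q
    k = math.sqrt(2.0 * (E + V0))
    q = math.sqrt(-2.0 * E)
    return q - k * math.tan(k * a)

# Bisection on an energy bracket chosen to avoid the tan singularity
# at k*a = pi/2 (which sits near E = -8.77 for these parameters)
lo, hi = -V0 + 1e-9, -8.8
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if log_deriv_mismatch(lo) * log_deriv_mismatch(mid) <= 0:
        hi = mid
    else:
        lo = mid

E0 = 0.5 * (lo + hi)   # even ground-state energy, roughly -9.18 here
print(E0)
```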
