Condensed Matter Physics I Peter S. Riseborough November 21, 2002
Contents
1 Introduction
1.1 The Born-Oppenheimer Approximation
2 Crystallography
3 Structures
3.1 Fluids
3.2 Crystalline Solids
3.3 The Direct Lattice
3.3.1 Primitive Unit Cells
3.3.2 The Wigner-Seitz Unit Cell
3.4 Symmetry of Crystals
3.4.1 Symmetry Groups
3.4.2 Group Multiplication Tables
3.4.3 Point Group Operations
3.4.4 Limitations Imposed by Translational Symmetry
3.4.5 Point Group Nomenclature
3.5 Bravais Lattices
3.5.1 Exercise 1
3.5.2 Cubic Bravais Lattices.
3.5.3 Tetragonal Bravais Lattices.
3.5.4 Orthorhombic Bravais Lattices.
3.5.5 Monoclinic Bravais Lattice.
3.5.6 Triclinic Bravais Lattice.
3.5.7 Trigonal Bravais Lattice.
3.5.8 Hexagonal Bravais Lattice.
3.5.9 Exercise 2
3.6 Point Groups
3.7 Space Groups
3.8 Crystal Structures with Bases.
3.8.1 Diamond Structure
3.8.2 Exercise 3
3.8.3 Graphite Structure
3.8.4 Exercise 4
3.8.5 Hexagonal Close-Packed Structure
3.8.6 Exercise 5
3.8.7 Exercise 6
3.8.8 Other Close-Packed Structures
3.8.9 Sodium Chloride Structure
3.8.10 Cesium Chloride Structure
3.8.11 Fluorite Structure
3.8.12 The Copper Three Gold Structure
3.8.13 Rutile Structure
3.8.14 Zinc Blende Structure
3.8.15 Zincite Structure
3.8.16 The Perovskite Structure
3.8.17 Exercise 7
3.9 Lattice Planes
3.9.1 Exercise 8
3.9.2 Exercise 9
3.10 Quasi-Crystals
4 Structure Determination
4.1 X Ray Scattering
4.1.1 The Bragg conditions
4.1.2 The Laue conditions
4.1.3 Equivalence of the Bragg and Laue conditions
4.1.4 The Ewald Construction
4.1.5 X-ray Techniques
4.1.6 Exercise 10
4.1.7 The Structure and Form Factors
4.1.8 Exercise 11
4.1.9 Exercise 12
4.1.10 Exercise 13
4.1.11 Exercise 14
4.1.12 Exercise 15
4.1.13 Exercise 16
4.1.14 Exercise 17
4.2 Neutron Diffraction
4.3 Theory of the Differential Scattering Cross-section
4.3.1 Time Dependent Perturbation Theory
4.3.2 The Fermi-Golden Rule
4.3.3 The Elastic Scattering Cross-Section
4.3.4 The Condition for Coherent Scattering
4.3.5 Exercise 18
4.3.6 Exercise 19
4.3.7 Exercise 20
4.3.8 Anti-Domain Phase Boundaries
4.3.9 Exercise 21
4.4 Elastic Scattering from Quasi-Crystals
4.5 Elastic Scattering from a Fluid
5 The Reciprocal Lattice
5.0.1 Exercise 22
5.1 The Reciprocal Lattice as a Dual Lattice
5.1.1 Exercise 23
5.2 Examples of Reciprocal Lattices
5.2.1 The Simple Cubic Reciprocal Lattice
5.2.2 The Body Centered Cubic Reciprocal Lattice
5.2.3 The Face Centered Cubic Reciprocal Lattice
5.2.4 The Hexagonal Reciprocal Lattice
5.2.5 Exercise 24
5.3 The Brillouin Zones
5.3.1 The Simple Cubic Brillouin Zone
5.3.2 The Body Centered Cubic Brillouin Zone
5.3.3 The Face Centered Cubic Brillouin Zone
5.3.4 The Hexagonal Brillouin Zone
6 Electrons
7 Electronic States
7.1 Many Electron Wave Functions
7.1.1 Exercise 25
7.2 Bloch’s Theorem
7.3 Boundary Conditions
7.4 Plane Wave Expansion of Bloch Functions
7.5 The Bloch Wave Vector
7.6 The Density of States
7.6.1 Exercise 26
7.7 The Fermi-Surface
8 Approximate Models
8.1 The Nearly Free Electron Model
8.1.1 Perturbation Theory
8.1.2 Non-Degenerate Perturbation Theory
8.1.3 Degenerate Perturbation Theory
8.1.4 Empty Lattice Approximation Band Structure
8.1.5 Exercise 27
8.1.6 Degeneracies of the Bloch States
8.1.7 Exercise 28
8.1.8 Brillouin Zone Boundaries
8.1.9 The Geometric Structure Factor
8.1.10 Exercise 29
8.1.11 Exercise 30
8.1.12 Exercise 31
8.1.13 Exercise 32
8.1.14 Exercise 33
8.2 The Pseudo-Potential Method
8.2.1 The Scattering Approach
8.2.2 The Ziman-Lloyd Pseudo-potential
8.2.3 Exercise 34
8.3 The Tight-Binding Model
8.3.1 Tight-Binding s Band Metal
8.3.2 Exercise 35
8.3.3 Exercise 36
8.3.4 Exercise 37
8.3.5 Exercise 38
8.3.6 Exercise 39
8.3.7 Exercise 40
8.3.8 Wannier Functions
8.3.9 Exercise 41
9 Electron-Electron Interactions
9.1 The Landau Fermi Liquid
9.1.1 The Scattering Rate
9.1.2 The Quasi-Particle Energy
9.1.3 Exercise 42
9.2 The Hartree-Fock Approximation
9.2.1 The Free Electron Gas.
9.2.2 Exercise 43
9.3 The Density Functional Method
9.3.1 Hohenberg-Kohn Theorem
9.3.2 Functionals and Functional Derivatives
9.3.3 The Variational Principle
9.3.4 The Electrostatic Terms
9.3.5 The Kohn-Sham Equations
9.3.6 The Local Density Approximation
9.4 Static Screening
9.4.1 The Thomas-Fermi Approximation
9.4.2 Linear Response Theory
9.4.3 Density Functional Response Function
9.4.4 Exercise 44
9.4.5 Exercise 45
10 Stability of Structures
10.1 Momentum Space Representation
10.2 Real Space Representation
11 Metals
11.1 Thermodynamics
11.1.1 The Sommerfeld Expansion
11.1.2 The Specific Heat Capacity
11.1.3 Exercise 46
11.1.4 Exercise 47
11.1.5 Pauli Paramagnetism
11.1.6 Exercise 48
11.1.7 Exercise 49
11.1.8 Landau Diamagnetism
11.1.9 Landau Level Quantization
11.1.10 The Diamagnetic Susceptibility
11.2 Transport Properties
11.2.1 Electrical Conductivity
11.2.2 Scattering by Static Defects
11.2.3 Exercise 50
11.2.4 The Hall Effect and Magneto-resistance.
11.2.5 Multi-band Models
11.3 Electromagnetic Properties of Metals
11.3.1 The Longitudinal Response
11.3.2 Electron Scattering Experiments
11.3.3 Exercise 51
11.3.4 Exercise 52
11.3.5 The Transverse Response
11.3.6 Optical Experiments
11.3.7 Kramers-Kronig Relation
11.3.8 Exercise 53
11.3.9 Exercise 54
11.3.10 The Drude Conductivity
11.3.11 Exercise 55
11.3.12 Exercise 56
11.3.13 The Anomalous Skin Effect
11.3.14 Inter-Band Transitions
11.4 Measuring the Fermi-Surface
11.4.1 Semi-Classical Orbits
11.4.2 de Haas - van Alphen Oscillations
11.4.3 Exercise 57
11.4.4 The Lifshitz-Kosevich Formulae
11.4.5 Other Fermi-Surface Probes
11.4.6 Cyclotron Resonances
11.5 The Quantum Hall Effect
11.5.1 The Integer Quantum Hall Effect
11.5.2 Exercise 58
11.5.3 The Fractional Quantum Hall Effect
11.5.4 Quasi-Particle Excitations
11.5.5 Skyrmions
11.5.6 Composite Fermions
12 Insulators
12.1 Thermodynamics
12.1.1 Holes
12.1.2 Intrinsic Semiconductors
12.1.3 Extrinsic Semiconductors
12.1.4 Exercise 59
12.2 Transport Properties
12.3 Optical Properties
13 Phonons
14 Harmonic Phonons
14.1 Lattice with a Basis
14.2 A Sum Rule for the Dispersion Relations
14.2.1 Exercise 60
14.3 The Nature of the Phonon Modes
14.3.1 Exercise 61
14.3.2 Exercise 62
14.3.3 Exercise 63
14.3.4 Exercise 64
14.3.5 Exercise 65
14.3.6 Exercise 66
14.3.7 Exercise 67
14.4 Thermodynamics
14.4.1 The Specific Heat
14.4.2 The Einstein Model of a Solid
14.4.3 The Debye Model of a Solid
14.4.4 Exercise 68
14.4.5 Exercise 69
14.4.6 Exercise 70
14.4.7 Exercise 71
14.4.8 Lindemann Theory of Melting
14.4.9 Thermal Expansion
14.4.10 Thermal Expansion of Metals
14.5 Anharmonicity
14.5.1 Exercise 72
15 Phonon Measurements
15.1 Inelastic Neutron Scattering
15.1.1 The Scattering Cross-Section
15.2 The Debye-Waller Factor
15.3 Single Phonon Scattering
15.4 Multi-Phonon Scattering
15.4.1 Exercise 73
15.4.2 Exercise 74
15.4.3 Exercise 75
15.5 Raman and Brillouin Scattering of Light
16 Phonons in Metals
16.1 Screened Ionic Plasmons
16.1.1 Kohn Anomalies
16.2 Dielectric Constant of a Metal
16.3 The Retarded Electron-Electron Interaction
16.4 Phonon Renormalization of Quasi-Particles
16.5 Electron-Phonon Interactions
16.6 Electrical Resistivity due to Phonon Scattering
16.6.1 Umklapp Scattering
16.6.2 Phonon Drag
17 Phonons in Semiconductors
17.1 Resistivity due to Phonon Scattering
17.2 Polarons
17.3 Indirect Transitions
18 Impurities and Disorder
18.1 Scattering By Impurities
18.2 Virtual Bound States
18.3 Disorder
18.4 Coherent Potential Approximation
18.5 Localization
18.5.1 Anderson Model of Localization
18.5.2 Scaling Theories of Localization
19 Magnetic Impurities
19.1 Localized Magnetic Impurities in Metals
19.2 Mean Field Approximation
19.2.1 The Atomic Limit
19.3 The Schrieffer-Wolf Transformation
19.3.1 The Kondo Hamiltonian
19.4 The Resistance Minimum
20 Collective Phenomenon
21 Itinerant Magnetism
21.1 Stoner Theory
21.1.1 Exercise 76
21.1.2 Exercise 77
21.2 Linear Response Theory
21.3 Magnetic Instabilities
21.4 Spin Waves
21.5 The Heisenberg Model
22 Localized Magnetism
22.1 Holstein - Primakoff Transformation
22.2 Spin Rotational Invariance
22.2.1 Exercise 78
22.3 Anti-ferromagnetic Spinwaves
22.3.1 Exercise 79
23 Spin Glasses
23.1 Mean Field Theory
23.2 The Sherrington-Kirkpatrick Solution.
24 Magnetic Neutron Scattering
24.1 The Inelastic Scattering Cross-Section
24.1.1 The Dipole-Dipole Interaction
24.1.2 The Inelastic Scattering Cross-Section
24.2 Time Dependent Spin Correlation Functions
24.3 The Fluctuation Dissipation Theorem
24.4 Magnetic Scattering
24.4.1 Neutron Diffraction
24.4.2 Exercise 80
24.4.3 Exercise 81
24.4.4 Spin Wave Scattering
24.4.5 Exercise 82
24.4.6 Critical Scattering
25 Superconductivity
25.1 Experimental Manifestation
25.1.1 The London Equations
25.1.2 Thermodynamics of the Superconducting State
25.2 The Cooper Problem
25.3 Pairing Theory
25.3.1 The Pairing Interaction
25.3.2 The B.C.S. Variational State
25.3.3 The Gap Equation
25.3.4 The Ground State Energy
25.4 Quasi-Particles
25.4.1 Exercise 83
25.5 Thermodynamics
25.6 Perfect Conductivity
25.7 The Meissner Effect
26 Landau-Ginsberg Theory
1 Introduction
Condensed Matter Physics is the study of materials in Solid and Liquid Phases. It encompasses the study of ordered crystalline phases of solids, as well as disordered phases such as the amorphous and glassy phases of solids. It also includes materials with short-ranged order, such as conventional liquids, and liquid crystals, which show unconventional order intermediate between that of a crystalline solid and that of a liquid. Condensed matter has the quite remarkable property that, due to the large number of particles involved, the behavior of a material may be qualitatively distinct from that of its individual constituents. The behavior of the incredibly large number of particles is governed by (quantum) statistics which, through the chaotically complicated motion of the particles, produces new types of order. These emergent phenomena are best exemplified by magnetism and superconductivity, where the collective behavior results in transitions to new phases.

In surveying the properties of materials it is convenient to separate the properties according to two (usually) disparate time scales: a slow time scale which governs the structural dynamics, and a faster time scale which governs the electronic motion. The large difference between the time scales is due to the large ratio of the nuclear masses to the electronic mass, $M_n / m_e \sim 10^3$. The long-ranged electromagnetic force binds these two constituents of different mass into electrically neutral material. The slow moving nuclear masses can be considered to be quasi-static, and are responsible for defining the structure of matter. In this approximation, the fast moving electrons equilibrate in the quasi-static potential produced by the nuclei.
1.1 The Born-Oppenheimer Approximation
The difference in the relevant time scales for electronic and nuclear motion allows one to make the Born-Oppenheimer Approximation. In this approximation, the electronic states are treated as if the nuclei were at rest at fixed positions. However, when treating the slow motions of the nuclei, the electrons are considered as adapting instantaneously to the potential of the charged nuclei, thereby minimizing the electronic energies. Thus, the nuclear charges are dressed by a cloud of electrons, forming ionic or atomic-like aggregates.

A qualitative estimate of the relative energies of nuclear versus electronic motion can be obtained by considering metallic hydrogen. The electronic energies are estimated using only the Bohr model of the hydrogen atom. The equation of motion for an electron of mass $m_e$ has the form

$$ - \frac{m_e v^2}{a} = - \frac{Z e^2}{a^2} \tag{1} $$
where $Z$ is the nuclear charge and $a$ is the radius of the atomic orbital. The standard semi-classical quantization condition due to Bohr and Sommerfeld restricts the angular momentum to integer multiples of $\hbar$,

$$ m_e v a = n \hbar \tag{2} $$
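These equations can be combined to find the quantized total electronic energy of the hydrogen atom. Eliminating $v$ gives the orbital radius $a = n^2 \hbar^2 / ( m_e Z e^2 )$, so that

$$
\begin{aligned}
E_e &= \frac{m_e v^2}{2} - \frac{Z e^2}{a} \\
    &= - \frac{Z e^2}{2 a} \\
    &= - \frac{m_e Z^2 e^4}{2 n^2 \hbar^2}
\end{aligned}
\tag{3}
$$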
which is a standard result from atomic physics. Note that the kinetic energy term and the electrostatic potential term have similar magnitudes.

Now consider the motion of the nuclei. The forces consist of Coulomb forces between the nuclei and electrons, and the quantum mechanical Pauli forces. The electrostatic repulsions and attractions have similar magnitudes, since the inter-nuclear separations are of the same order as the Bohr radius. In equilibrium, the sum of the forces vanishes identically. Furthermore, if an atom is displaced from its equilibrium position by a small distance $r$, the restoring force is approximately given by the dipole force

$$ - \alpha \frac{Z e^2}{a^3} r \tag{4} $$
where $\alpha$ is a dimensionless constant. Hence, the equation of motion for the displacement of a nucleus of mass $M_n$ is

$$ - \alpha \frac{Z e^2}{a^3} r = M_n \frac{d^2 r}{dt^2} \tag{5} $$
which shows that the nuclei undergo harmonic oscillations with frequency

$$ \omega^2 = \alpha \frac{Z e^2}{M_n a^3} \tag{6} $$
The semi-classical quantization condition

$$ \oint dr \, M_n v = n \hbar \tag{7} $$
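yields the energy for nuclear motion as

$$
\begin{aligned}
E_N &= n \hbar \omega \\
    &= \alpha^{1/2} \, n \, \frac{m_e Z^2 e^4}{\hbar^2} \left( \frac{m_e}{M_n} \right)^{1/2}
\end{aligned}
\tag{8}
$$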
Thus, the ratio of the energy of nuclear motion to the energy of electronic motion is given by the factor

$$ \frac{E_N}{E_e} \sim \left( \frac{m_e}{M_n} \right)^{1/2} \tag{9} $$

Since the ratio of the electron mass to the proton mass is approximately $1/2000$, this factor is roughly $1/45$, and the nuclear kinetic energy is negligible when compared to the electronic kinetic energy. A more rigorous proof of the validity of the Born-Oppenheimer approximation was given by Migdal.
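As a rough numerical check of equations (3), (8) and (9), the following sketch evaluates the Bohr-model energies for hydrogen. It is only an illustration under stated assumptions: the formulas in the text are written in Gaussian units, so the code substitutes $e^2 \to e^2 / ( 4 \pi \epsilon_0 )$ in order to work in SI units, the dimensionless constant $\alpha$ is set to 1 (the text does not specify its value), the orbital radius is taken in the electronic ground state, and all function names are hypothetical.

```python
import math

# Physical constants in SI units.
HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
M_P = 1.67262192369e-27     # proton mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m

# The Gaussian e^2 of the text corresponds to e^2 / (4 pi eps0) in SI units.
E2 = E_CHARGE**2 / (4.0 * math.pi * EPS0)


def bohr_radius(n=1, Z=1):
    """Orbital radius a = n^2 hbar^2 / (m_e Z e^2), from equations (1) and (2)."""
    return n**2 * HBAR**2 / (M_E * Z * E2)


def electronic_energy(n=1, Z=1):
    """Equation (3): E_e = - m_e Z^2 e^4 / (2 n^2 hbar^2), in joules."""
    return -M_E * Z**2 * E2**2 / (2.0 * n**2 * HBAR**2)


def nuclear_energy(n=1, Z=1, M_n=M_P, alpha=1.0):
    """Equation (8): E_N = n hbar omega, with omega taken from equation (6).

    alpha = 1 is an assumed value; the text only states that it is
    dimensionless. The radius a is that of the electronic ground state.
    """
    a = bohr_radius(1, Z)
    omega = math.sqrt(alpha * Z * E2 / (M_n * a**3))
    return n * HBAR * omega


if __name__ == "__main__":
    e_e = electronic_energy() / E_CHARGE  # converted to eV
    e_n = nuclear_energy() / E_CHARGE     # converted to eV
    print(f"Bohr radius     : {bohr_radius():.3e} m")
    print(f"E_e             : {e_e:.2f} eV")
    print(f"E_N (alpha = 1) : {e_n:.3f} eV")
    print(f"E_N / |E_e|     : {e_n / abs(e_e):.4f}")
    print(f"sqrt(m_e / M_p) : {math.sqrt(M_E / M_P):.4f}")
```

With $\alpha = 1$ and $n = 1$ the ratio $E_N / |E_e|$ equals $2 ( m_e / M_n )^{1/2}$, roughly 0.05 for hydrogen, which is consistent with the scaling stated in equation (9).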
In the first part of the course it is assumed that the Born-Oppenheimer approximation is valid. First, the subject of Crystallography shall be discussed, and the character of the equilibrium structures of the dressed nuclei in matter is described. An important class of such materials are those which possess long-ranged periodic translational order and other symmetries. It shall be shown how these long-range ordered and amorphous structures can be effectively probed by various elastic scattering experiments, in which the wave length of the scattered particles is comparable to the distance between the nuclei.

In the second part, the properties of the Electrons shall be discussed. On assuming the validity of the Born-Oppenheimer approximation, the nature of the electronic states that occur in the presence of the potential produced by the nuclei shall be discussed. One surprising result of this approach is that, even though the strength of the ionic potential is quite large (of the order of Rydbergs), in some metals the highest occupied electronic states bear a close resemblance to the states expected if the ionic potential were very weak or negligible. In other materials, the potential due to the ionic charges can produce gaps in the electronic energy spectrum. Using Bloch’s theorem, it shall be shown how periodic long-ranged order can produce gaps in the electronic spectrum. Another surprising result is that, in most metals, it appears as though the electron-electron interactions can be neglected or, more precisely, that the electrons can be thought of as sharing the properties of a non-interacting electron gas, albeit with renormalized masses or magnetic moments. The thermodynamic properties of electrons in these Bloch states shall be treated using Fermi-Dirac statistics. Furthermore, the concepts of the Fermi-energy and Fermi-surface of metals are introduced. It shall be shown how the electronic transport properties of metals are dominated by states with energies close to the Fermi-surface, and how the Fermi-surface can be probed.

The third part concerns the motion of the ions or nuclei. In particular, it will be considered how the fast motion of the electrons dresses or screens the inter-nuclear potentials. The low energy excitations of the dressed nuclear or ionic structure of matter give rise to harmonic-like vibrations. The elementary excitations of the quantized vibrations are known as Phonons. It shall be shown how these phonon excitations manifest themselves in experiments and in thermodynamic properties, and how they participate in limiting electrical transport.

The final part of the course concerns some of the more striking examples of Collective Phenomena, such as Magnetism and Superconductivity. These phenomena involve the interactions between the elementary excitations of the solid and, through collective action, they spontaneously break the symmetry of the Hamiltonian. In many cases, the spontaneously broken symmetry is accompanied by the formation of a new branch of low energy excitations.
2 Crystallography
Crystallography is the study of the structure of ordered solids, disordered solids and also liquids. In this section, it shall be assumed that the nuclei are static, frozen into their average positions. Due to the large nuclear masses and strong interactions between the nuclei (dressed by their accompanying clouds of electrons), one may assume that the nuclear or ionic motion can be treated classically. The most notable failure of this assumption occurs with the very lightest of nuclei, such as He. In the anomalous case of He, where the separation between ions, $d$, is of the order of angstroms, the uncertainty of the momentum is given by $\hbar / d$ and the kinetic energy $E_K$ for this quantum zero point motion is given by

$$ E_K = \frac{\hbar^2}{2 M d^2} \tag{10} $$
The kinetic energy is large since the mass $M$ of the He atom is small. The kinetic energy of the zero point fluctuations is larger than the binding energy provided by the weak van der Waals or London force between the He ions. Thus, the inter-ionic forces are insufficient to bind the He ions into a solid, and the material remains in a liquid-like state down to the lowest temperatures. For these reasons, He behaves like a quantum fluid. However, for the heavier nuclei, the quantum nature of the particles manifests itself in more subtle ways.

First, the various types of structures and the symmetries that can be found in Condensed Matter are described, and then the various experimental methods used to observe these structures are discussed.
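To get a feeling for the magnitude of the zero point energy in equation (10), the sketch below evaluates $E_K = \hbar^2 / ( 2 M d^2 )$ and converts it to a temperature. The separation $d = 1$ angstrom and the comparison with xenon are illustrative assumptions rather than values taken from the text, and the function name is hypothetical.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053906660e-27  # atomic mass unit, kg
ANGSTROM = 1.0e-10       # m


def zero_point_energy(mass_amu, d_angstrom):
    """Equation (10): E_K = hbar^2 / (2 M d^2), returned in joules."""
    mass = mass_amu * AMU
    d = d_angstrom * ANGSTROM
    return HBAR**2 / (2.0 * mass * d**2)


if __name__ == "__main__":
    # Atomic masses in atomic mass units; d = 1 angstrom is an assumed value.
    for name, mass_amu in [("He", 4.0026), ("Xe", 131.29)]:
        e_k = zero_point_energy(mass_amu, 1.0)
        print(f"{name}: E_K = {e_k:.2e} J = {e_k / K_B:.2f} K")
```

For this assumed separation the estimate for He is about 6 K, while for Xe it is smaller by the mass ratio, a factor of roughly 33. The result scales as $1 / d^2$, so it should be read as an order of magnitude only.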
3 Structures
The structure of condensed materials is usually thought about in terms of the density of either electrons or nuclear matter. To the extent that the regions of non-zero density of the nuclear matter are highly localized in space, with linear dimensions of $10^{-15}$ meters, the nuclei can be treated as point objects. The electron density is more extended and varies over length scales of $10^{-10}$ meters. The length scale for the electronic density in solids and fluids is very similar to the length scale over which the electron density varies in isolated atoms. The similarity of scales occurs as electrons are partially responsible for the bonding of atoms into a solid. That is, the characteristic atomic length scale is almost equal to the characteristic separation of the nuclei in condensed matter. Due to the near equality of these two length scales, the electron density in solids definitely cannot be represented in terms of a superposition of the density of well defined atoms. However, the electron density does show a significant variation that can be interpreted in terms of the electron density of isolated atoms, subject to significant modifications when brought together. As the electron density for isolated atoms is usually spherically symmetric, the structure in the electronic density may, for convenience of discussion, be approximately represented in terms of a set of spheres of finite radius.
3.1
Fluids
Both liquids and gases are fluids. The macroscopic characteristics of fluids are that they are spatially uniform and isotropic, which means that the average environment of any atom is identical to the average environment of any other atom. The density is defined by the function

$$ \rho(\mathbf{r}) = \sum_i \delta^3( \mathbf{r} - \mathbf{r}_i ) \tag{11} $$
in which $\mathbf{r}_i$ is the instantaneous position of the i-th atom. A measurement of the density usually results in the time average of the density, which corresponds to the time averaged positions of the atoms. In particular, for a fluid, spatial homogeneity ensures that the average density $\overline{\rho(\mathbf{r})}$ at position $\mathbf{r}$ is equal to the average density at a displaced position $\mathbf{r} + \mathbf{R}$,

$$ \overline{\rho(\mathbf{r})} = \overline{\rho(\mathbf{r} + \mathbf{R})} \tag{12} $$

The value of the displacement $\mathbf{R}$ is arbitrary, so the average density is independent of $\mathbf{r}$ and can be expressed as $\overline{\rho(0)}$. This just means that the average position of an individual atom is undetermined.

The operations which leave the system unchanged are the symmetry operations. For a fluid, the symmetry operations consist of the continuous translations through an arbitrary displacement $\mathbf{R}$, rotations through an arbitrary angle about an arbitrary axis, and also reflections in arbitrary mirror planes. The set of symmetry operations forms a group called the symmetry group. For a fluid, the symmetry group is the Euclidean group. Fluids have the largest possible number of symmetry operators and have the highest possible symmetry. All other materials are invariant under a smaller number of symmetry operations.

Nevertheless, fluids do have short-ranged structure, which is exemplified by locating one atom and then examining the positions of the neighboring atoms. The local spatial correlations are expressed by the density - density correlation function, which is expressed as an average

$$ C(\mathbf{r}, \mathbf{r}') = \overline{ \rho(\mathbf{r}) \, \rho(\mathbf{r}') } = \overline{ \sum_{i,j} \delta^3( \mathbf{r} - \mathbf{r}_i ) \, \delta^3( \mathbf{r}' - \mathbf{r}_j ) } \tag{13} $$

Since fluids are homogeneous, the correlation function is only a function of the difference of the positions $\mathbf{r} - \mathbf{r}'$. Furthermore, since the fluids are isotropic and invariant under rotations, the correlation function is only a function of the distance separating the two regions of space, $| \mathbf{r} - \mathbf{r}' |$. At sufficiently large separation distances, the positions of the atoms become uncorrelated, thus,

$$ \lim_{ | \mathbf{r} - \mathbf{r}' | \to \infty } C(\mathbf{r}, \mathbf{r}') \to \overline{\rho(\mathbf{r})} \; \overline{\rho(\mathbf{r}')} \to \overline{\rho(0)} \; \overline{\rho(\mathbf{r} - \mathbf{r}')} \tag{14} $$
That is, at large spatial separations, the density - density correlation function reduces to the product of the independent average of the density at the origin and the average density at a position $\mathbf{r}$. From the homogeneity of the fluid, $\overline{\rho(\mathbf{r})}$ is identical to the average $\overline{\rho(0)}$. The density - density correlation function also contains the correlation of each atom with itself, that is, there are terms with i = j. This leads to a contribution which shows up at short distances,

$$ \overline{ \sum_{i=j} \delta^3( \mathbf{r} - \mathbf{r}_i ) \, \delta^3( \mathbf{r}' - \mathbf{r}_i ) } = \delta^3( \mathbf{r} - \mathbf{r}' ) \, \overline{ \sum_i \delta^3( \mathbf{r} - \mathbf{r}_i ) } = \delta^3( \mathbf{r} - \mathbf{r}' ) \, \overline{\rho(\mathbf{r})} \tag{15} $$
which is proportional to the density. The pair distribution function $g(\mathbf{r} - \mathbf{r}')$ is defined as the contribution to the density - density correlation function which excludes the correlation between an atom and itself,

$$ g( \mathbf{r} - \mathbf{r}' ) = C( \mathbf{r} - \mathbf{r}' ) - \delta^3( \mathbf{r} - \mathbf{r}' ) \, \overline{\rho(\mathbf{r})} \tag{16} $$
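For a system which possesses continuous translational invariance, the pair distribution function can be evaluated as

$$ g( \mathbf{r} - \mathbf{r}' ) = \overline{ \sum_{i \neq j} \delta^3( \mathbf{r} - \mathbf{r}_i ) \, \delta^3( \mathbf{r}' - \mathbf{r}_j ) } = \frac{1}{V} \int d^3\mathbf{R} \sum_{i \neq j} \delta^3( \mathbf{r} - \mathbf{r}_i - \mathbf{R} ) \, \delta^3( \mathbf{r}' - \mathbf{r}_j - \mathbf{R} ) = \frac{1}{V} \sum_{i \neq j} \delta^3( \mathbf{r} - \mathbf{r}' - \mathbf{r}_i + \mathbf{r}_j ) \tag{17} $$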
Since the sum over i runs over each of the inter-atomic separations $\mathbf{r}_j - \mathbf{r}_i$ for each value of j, spatial homogeneity demands that the contribution from each different j value is identical. There are N such terms, and this leads to the expression for the pair distribution function involving an atom at the central site $\mathbf{r}_0$, and the others at sites i,

$$ g(\mathbf{r}) = \overline{\rho(0)} \sum_i \delta^3( \mathbf{r} - \mathbf{r}_i + \mathbf{r}_0 ) \tag{18} $$

where

$$ \overline{\rho(0)} = \frac{N}{V} \tag{19} $$

As this only depends on the radial distance $| \mathbf{r} |$, it is called the radial distribution function g(r). For large r, the pair distribution function, like $C(\mathbf{r}, 0)$, approaches $\overline{\rho(0)}^{\,2}$, or

$$ \lim_{ r \to \infty } g(r) \to \overline{\rho(0)}^{\,2} \tag{20} $$
Liquids are defined as the fluids that have high densities. The liquid phase is not distinguished from the higher temperature gaseous phase by a change in symmetry, unlike most other materials. In the liquid phase the density is higher, so the inter-atomic forces play a more important role than in the low density gaseous phase. The interaction forces are responsible for producing the short-ranged correlations in the density - density correlation function. A model potential that is representative of the typical inter-atomic force between two neutral atoms is the Lennard-Jones potential,

$$ V(r) = 4 \, V_0 \left[ \left( \frac{a}{r} \right)^{12} - \left( \frac{a}{r} \right)^{6} \right] \tag{21} $$

The potential has a short-ranged repulsion between the atoms caused by the overlap of the electronic states, and the long-ranged van der Waals attraction caused by fluctuation induced electric polarizations of the atoms. The resulting potential falls to zero at r = a and has a minimum at $r = 2^{1/6} a$. The potential at the minimum of the well is given by $- V_0$. Another model potential that is of use is the hard sphere potential, which excludes the center of another atom from the region of radius 2 a centered on the central atom.

As the repulsion between atoms dominates the structure of liquids, the Bernal model of random close packing of hard spheres accounts for most of the structure of a liquid. On randomly packing spheres, one finds a packing fraction of 0.638. The packing fraction is defined as the total volume of the hard spheres divided by the (minimum) volume that contains all the spheres. Random packings of hard spheres can be used to calculate the radial distribution function g(r). The model shows that there are strong short-ranged correlations between the closest atoms and that, in three dimensions, there are 12 of these at radial distances 2 a. These correlations show up as a strong peak in the radial distribution function at 2 a, and there are other peaks corresponding to the next few shells of neighboring atoms.
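The stated properties of the Lennard-Jones potential can be checked numerically. The following is a minimal sketch, assuming illustrative units in which V0 = a = 1; the function name `lennard_jones` and the scan grid are choices made here for illustration and are not part of the notes.

```python
def lennard_jones(r, v0=1.0, a=1.0):
    """Lennard-Jones potential V(r) = 4 V0 [(a/r)**12 - (a/r)**6]."""
    x6 = (a / r) ** 6
    return 4.0 * v0 * (x6 * x6 - x6)


V0, A = 1.0, 1.0                      # illustrative units (an assumption)
r_min = 2.0 ** (1.0 / 6.0) * A        # analytic position of the minimum

value_at_a = lennard_jones(A, V0, A)        # expected to vanish
value_at_min = lennard_jones(r_min, V0, A)  # expected to equal -V0

# Brute-force scan over 0.9 a <= r < 2.9 a as an independent check
# of where the minimum lies.
grid = [0.9 * A + 1.0e-4 * A * k for k in range(20000)]
r_scan = min(grid, key=lambda r: lennard_jones(r, V0, A))

print(value_at_a, value_at_min, r_scan, r_min)
```

The scan locates the minimum only to within the grid spacing, so it should be compared with the analytic value to that accuracy.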
3.2
Crystalline Solids
A perfect crystal consists of a space filling periodic array of atoms. It can be partitioned into identical individual structural units that can be repeatedly stacked together to form the crystal. The structural unit is called the unit cell. There are many alternate ways of performing the partitioning and, therefore, there are many alternate forms for the unit cells. The unit cells which have the smallest possible volume are called primitive unit cells. A unit cell may contain one or more atoms. The positions of the atoms, when referenced to a specific point in the unit cell, compose the basis of the lattice.
3.3
The Direct Lattice
Equivalent points taken from each unit cell in a perfect crystal form a periodic lattice. The points are called lattice points. Any lattice point can be reached from any other by a translation $\mathbf{R}$ that is a combination of integer multiples of the three primitive lattice vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$,

$$ \mathbf{R} = n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3 \tag{22} $$
Here, $n_1$, $n_2$ and $n_3$ are integers that determine the magnitudes of three components of a three-dimensional vector. The set of integers $(n_1, n_2, n_3)$ can be used to represent a lattice point in terms of the primitive lattice vectors. The set $(n_1, n_2, n_3)$ runs through all the positive and negative integers. The set of translations $\mathbf{R}$ is closed under addition and, therefore, the translation operations form a group. Given any lattice, there are many choices for the primitive lattice vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$.

The array of lattice points has arrangements and orientations which are identical in every respect when viewed from origins centered on different lattice points. For example, on translating the origin through a lattice vector $\mathbf{R}_m$, the displacements in the primed reference frame are related to displacements in the unprimed reference frame via

$$ \mathbf{r}' = \mathbf{r} + m_1 \mathbf{a}_1 + m_2 \mathbf{a}_2 + m_3 \mathbf{a}_3 \tag{23} $$

and the lattice points in the two frames are related via

$$ n_i' = n_i + m_i \tag{24} $$
and as the numbers $n_i$ and $n_i'$ take on all possible integer values, the set of all lattice points is identical in the two reference frames.

A crystal structure is composed of a lattice in which a basis of atoms is attached to each lattice point. That is, a complete specification of a crystal requires specifying the lattice and the distribution of the various atoms around each lattice point. The basis is specified by giving the number of atoms and types of the atoms in the basis (labelled by j) together with their positions relative to the lattice points. The position of the j-th atom relative to the lattice point, $\mathbf{r}_j$, is denoted by

$$ \mathbf{r}_j = x_j \mathbf{a}_1 + y_j \mathbf{a}_2 + z_j \mathbf{a}_3 \tag{25} $$

where the set $(x_j, y_j, z_j)$ may be non-integer numbers. The choice of lattice and, therefore, the basis, is non-unique for a crystal structure. An example of this is given by a two dimensional crystal structure which can be represented in many different ways, including the possibilities of a representation either as a lattice with a one atom basis or as a lattice with a two atom basis.
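The non-uniqueness of the lattice and basis can be illustrated with a small numerical sketch. The example below assumes a two dimensional square array of atoms with unit spacing (chosen here for illustration, not the example of the notes) and describes it twice: as a square lattice with a one atom basis, and as a rectangular lattice with a doubled cell and a two atom basis. The helper names `crystal_sites` and `window` are hypothetical.

```python
from itertools import product


def crystal_sites(a1, a2, basis, n_range):
    """Atom positions R + r_j for a 2D lattice, with the basis given
    in fractional coordinates (x_j, y_j) of the lattice vectors."""
    sites = set()
    for n1, n2 in product(n_range, repeat=2):
        for x, y in basis:
            px = (n1 + x) * a1[0] + (n2 + y) * a2[0]
            py = (n1 + x) * a1[1] + (n2 + y) * a2[1]
            sites.add((round(px, 9), round(py, 9)))
    return sites


def window(sites, half_width=4):
    """Keep only the sites inside a square window around the origin."""
    return {p for p in sites
            if abs(p[0]) <= half_width and abs(p[1]) <= half_width}


n_range = range(-8, 9)

# Description A: square lattice, one atom per lattice point.
sites_a = crystal_sites((1, 0), (0, 1), [(0, 0)], n_range)

# Description B: rectangular lattice with a doubled cell along x,
# two atoms per lattice point.
sites_b = crystal_sites((2, 0), (0, 1), [(0, 0), (0.5, 0)], n_range)

window_a = window(sites_a)
window_b = window(sites_b)
print(window_a == window_b, len(window_a))
```

Both descriptions place atoms at the same positions inside the window; only the second cell is non-primitive, since it holds two atoms.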
3.3.1
Primitive Unit Cells
The parallelepiped defined by the primitive lattice vectors forms a primitive unit cell. When repeated, a primitive unit cell will fill all space. The primitive unit cell is also a unit cell with the minimum volume. Although there are a number of different ways of choosing the primitive lattice vectors and unit cells, the number of basis atoms in a primitive cell is unique for each crystal structure. No basis contains fewer atoms than the basis associated with a primitive unit cell. There is always just one lattice point per primitive unit cell. If the primitive unit cell is a parallelepiped with lattice points at each of the eight corners, then each corner is shared by eight cells, so that the total number of lattice points per cell is unity as $8 \times \frac{1}{8} = 1$. The volume of the parallelepiped is given in terms of the primitive lattice vectors via

$$ V_c = | \, \mathbf{a}_1 \cdot ( \mathbf{a}_2 \wedge \mathbf{a}_3 ) \, | \tag{26} $$
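As a numerical illustration of the volume formula, the sketch below evaluates the triple product for a simple cubic lattice with unit spacing, for an alternative choice of primitive vectors of the same lattice, and for a doubled (non-primitive) cell. The specific vectors and the helper names are assumptions made here for illustration.

```python
def cross(u, v):
    """Vector product u ^ v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product u . v."""
    return sum(x * y for x, y in zip(u, v))


def cell_volume(a1, a2, a3):
    """V_c = | a1 . (a2 ^ a3) |."""
    return abs(dot(a1, cross(a2, a3)))


# Conventional primitive vectors of a simple cubic lattice (a = 1).
v_standard = cell_volume((1, 0, 0), (0, 1, 0), (0, 0, 1))

# A different, skewed set of primitive vectors of the same lattice.
v_skewed = cell_volume((1, 1, 0), (0, 1, 0), (1, 0, 1))

# A doubled cell: not primitive, it contains two lattice points.
v_doubled = cell_volume((2, 0, 0), (0, 1, 0), (0, 0, 1))

print(v_standard, v_skewed, v_doubled)
```

The two primitive choices give the same volume, while the doubled cell has twice that volume, in line with one lattice point per primitive cell.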
3.3.2
The Wigner-Seitz Unit Cell
An alternate method of constructing a unit cell is due to Wigner and Seitz. The Wigner-Seitz cell has the important property that there are no arbitrary choices made in defining the unit cell. The absence of any arbitrary choice has the consequence that the Wigner-Seitz unit cell always has the same symmetry as the lattice. The Wigner-Seitz unit cell is constructed by forming a set of planes which bisect the lines joining a central lattice point to all other lattice points. The region of space surrounding the central lattice point, of minimum volume, which is completely enclosed by a set of the bisecting planes constitutes the Wigner-Seitz cell. Thus, the Wigner-Seitz cell consists of the volume composed of all the points that are closer to the central lattice site than to any other lattice site. The equation of the plane bisecting the vector from the central point to the i-th lattice point is given by

$$ \left( \mathbf{r} - \frac{1}{2} \mathbf{R}_i \right) \cdot \mathbf{R}_i = 0 \tag{27} $$

where $\mathbf{R}_i$ is the lattice vector. The sections of planes closest to the origin form the surface of the Wigner-Seitz cell. As the definition does not involve any arbitrary choice of primitive lattice vectors, the Wigner-Seitz cell possesses the full symmetry of the lattice. Furthermore, the Wigner-Seitz cell is space filling, since every point in space must lie closer to one lattice point than any other.
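The bisecting-plane condition translates directly into a membership test: a point r lies in the Wigner-Seitz cell if r . R_i does not exceed R_i . R_i / 2 for every lattice vector R_i. The sketch below is a minimal illustration which assumes that checking the lattice points with |n_k| <= 2 is sufficient (true for the compact cells used here, but an assumption in general); the function name `in_wigner_seitz` is hypothetical.

```python
from itertools import product


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def in_wigner_seitz(r, a1, a2, a3, shell=2, tol=1e-12):
    """True if r is at least as close to the origin as to every other
    lattice point n1 a1 + n2 a2 + n3 a3 with |n_k| <= shell."""
    for n in product(range(-shell, shell + 1), repeat=3):
        if n == (0, 0, 0):
            continue
        R = tuple(n[0] * a1[k] + n[1] * a2[k] + n[2] * a3[k]
                  for k in range(3))
        # r is on the far side of the bisecting plane of R if
        # (r - R/2) . R > 0.
        if dot(r, R) > 0.5 * dot(R, R) + tol:
            return False
    return True


# Simple cubic and body centered cubic lattices with a = 1.
sc = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
bcc = ((0.5, 0.5, -0.5), (-0.5, 0.5, 0.5), (0.5, -0.5, 0.5))

print(in_wigner_seitz((0.3, 0.3, 0.3), *sc))    # inside the cube
print(in_wigner_seitz((0.6, 0.0, 0.0), *sc))    # beyond the face at 1/2
print(in_wigner_seitz((0.3, 0.3, 0.3), *bcc))   # cut off by a hexagonal face
print(in_wigner_seitz((0.2, 0.2, 0.2), *bcc))
```

The same point (0.3, 0.3, 0.3) is inside the simple cubic cell but outside the body centered cubic cell, because the latter is bounded by the bisector of the line to the body center.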
3.4
Symmetry of Crystals
A symmetry operation acts on a crystal producing a new crystal, shifting the atoms to new positions such that the new crystal is identical in appearance to the original crystal. That is, the positions of the atoms in the new crystal coincide with the positions of similar atoms in the original crystal. The symmetry operations may consist of:
(i) Translation operations which leave no point unchanged.
(ii) Symmetry operations which leave one point unchanged.
(iii) Combinations of the above two types of operations.
3.4.1
Symmetry Groups
A set of symmetry operations forms a group if, when the symmetry operations are combined, the following properties are satisfied:

(I) The product of any two symmetry operators from the set, say A and B, defined as A B = C, is also in the set. That is, the set of symmetry operations is closed under composition.

(II) The composition of any three elements is associative, which means that the symmetry operation is independent of whether the first and second operators are combined before they are combined with the third, or whether the second and third operators are combined before they are combined with the first.

$$ A \, ( B \, C ) = ( A \, B ) \, C \tag{28} $$

(III) There exists a symmetry operator which leaves all the atoms in their original places, called the identity operator E. The product of an arbitrarily chosen symmetry operator of the group with the identity gives back the arbitrarily chosen operator.

$$ A \, E = E \, A = A \tag{29} $$

(IV) For each operator in the group, there exists a unique inverse operator such that when the operator is combined with the inverse operator, they produce the identity.

$$ A \, A^{-1} = A^{-1} A = E \tag{30} $$
A group of symmetry operators may contain a sub-set of symmetry operators which also form a group. That is, the group laws are obeyed for all the elements of the sub-set. This sub-set of elements forms a sub-group of the group, but is only a sub-group if the elements are combined with the same law of composition as the group. The symmetry group of the direct lattice contains at least two sub-groups. These are the sub-group of translations and the point group of the lattice. Under a translation which is not the identity, no point remains invariant. The point group of the lattice consists of the set of symmetry operations in which at least one point of the lattice is invariant.
3.4.2
Group Multiplication Tables
The properties of a group are concisely represented by the group multiplication table. The number of elements in the group is called the order of the group, so the general group with n operators is of order n. The group multiplication table consists of an n by n array. The group multiplication table has the convention that if A × B = C then the operator A, which is the first element of the product, is located in the left-most column of the table, and the operator B, which is the second element, is located in the uppermost row. The product C is entered in the same row as the element A and the same column as element B.

$$
\begin{array}{ccccc}
E & \cdot & B & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot & \cdot \\
A & \cdot & C & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot & \cdot
\end{array}
\tag{31}
$$
In general the symmetry operations do not commute, that is, A × B ≠ B × A. The identity operator is placed as the first element of the series of symmetry operators, so the first row and first column play the dual role as the list of group elements and also are the elements found by compounding the elements with the identity. Every operator appears once, and only once, in each row or column of the group table. The fact that each operator occurs only once in any row, or any column, is a consequence of the uniqueness of the inverse.

As an example, consider the point group for a single H2O molecule. The group contains a symmetry operation which is a rotation by π about an axis in the plane of the molecule that passes through the O atom and bisects the line between the two H atoms. This is a two-fold axis since a second rotation by π is equivalent to the identity. The two-fold rotation is labelled as $C_2$. In this case, the two-fold axis is the rotation axis of highest order and, thus, is considered to define the vertical direction. In addition to the two-fold axis, there are two mirror planes. It is conventional to denote a mirror plane that contains the n-fold axis of rotation ($C_n$) with highest n as a vertical plane. The H2O molecule is symmetric under reflection in a mirror plane passing through the two-fold axis in the plane which contains the molecule. That is, the mirror plane is the plane passing through all three atoms. This is a vertical mirror symmetry operation and is denoted by $\sigma_v$. The second mirror symmetry operation is a reflection in another vertical plane passing through the $C_2$ axis, but this time, the mirror plane is perpendicular to the plane of the molecule, and is denoted by $\sigma_v'$. The symmetry group contains the elements E, $C_2$, $\sigma_v$, $\sigma_v'$. The group is of order 4. The group table is given by

$$
\begin{array}{cccc}
E & C_2 & \sigma_v & \sigma_v' \\
C_2 & E & \sigma_v' & \sigma_v \\
\sigma_v & \sigma_v' & E & C_2 \\
\sigma_v' & \sigma_v & C_2 & E
\end{array}
$$
Since all the operations in this group commute, the group is known as an Abelian group. Inspection of the table immediately shows that $\sigma_v \times C_2 = \sigma_v'$. The symmetry group of a crystal has at least two sub-groups. One sub-group is the group of translations through the set of lattice vectors R. A translation which is not the identity leaves no point unchanged. A second sub-group is formed by the set of all transformations which leave the same point of the crystal untransformed. This sub-group is the point group.
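The group table of the H2O molecule can be reproduced with a short sketch. It assumes a coordinate choice that is not specified in the notes: the two-fold axis is taken along z and the molecule lies in the xz plane, so that every operation is a diagonal matrix and can be stored as the triple of its diagonal entries. The labels `sv` and `sv'` stand for the two vertical mirror planes.

```python
# Each operation is a diagonal 3x3 matrix, stored as its diagonal (x, y, z).
# Assumed axes: C2 along z, molecule in the xz plane.
OPS = {
    "E":   (1, 1, 1),
    "C2":  (-1, -1, 1),   # rotation by pi about z
    "sv":  (1, -1, 1),    # reflection in the molecular (xz) plane
    "sv'": (-1, 1, 1),    # reflection in the perpendicular (yz) plane
}
NAME = {diagonal: name for name, diagonal in OPS.items()}


def compose(a, b):
    """Product A x B of two diagonal operations, returned by name."""
    return NAME[tuple(x * y for x, y in zip(OPS[a], OPS[b]))]


# table[(A, B)] is the entry in the row of A and the column of B.
table = {(a, b): compose(a, b) for a in OPS for b in OPS}

is_abelian = all(table[(a, b)] == table[(b, a)] for a in OPS for b in OPS)
rows_are_permutations = all(
    sorted(table[(a, b)] for b in OPS) == sorted(OPS) for a in OPS
)

for a in OPS:
    print("  ".join(f"{table[(a, b)]:>3}" for b in OPS))
print(table[("sv", "C2")], is_abelian, rows_are_permutations)
```

Because the matrices are diagonal they necessarily commute, which is one way of seeing why this particular group is Abelian.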
3.4.3
Point Group Operations
The crystallographic point group consists of the symmetry operations that leave at least one point untransformed. The possible symmetry elements of the point group are:

Rotations through integer multiples of $2\pi/n$ around an axis. The n-fold rotations are denoted as $C_n$.

Reflections that take every point into its mirror image with respect to a plane known as the mirror plane. Reflections are denoted by $\sigma$.

Inversions that take every point $\mathbf{r}$, as measured from an origin, into the point $- \mathbf{r}$. The inversion operator is denoted by I.

Rotation Reflections which are rotations about an axis through integer multiples of $2\pi/n$ followed by reflection in a plane perpendicular to the axis. The n-fold rotation reflections are denoted by $S_n$. For even n, $(S_n)^n = E$, while for odd n, $(S_n)^n = \sigma$.

Rotation Inversions which are rotations about an axis through integer multiples of $2\pi/n$ followed by an inversion through an origin. The International notation for a rotation inversion is $\bar{n}$. The rotation inversion and rotation reflection operations are related, for example, $\bar{3} = S_6^{-1}$, $\bar{4} = S_4^{-1}$ and $\bar{6} = S_3^{-1}$.

Since at least one point is invariant under all the transformations of the point group, the rotation axes and mirror planes must all intersect at this point.
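The statements about rotation reflections and rotation inversions can be verified with explicit matrices. The sketch below assumes the rotation axis is the z axis and the reflection plane is z = 0 (a coordinate choice made here); the helper names are hypothetical. The inverse of an orthogonal matrix is taken as its transpose.

```python
import math


def rot_z(theta):
    """Rotation by theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def diag(x, y, z):
    return [[x, 0.0, 0.0], [0.0, y, 0.0], [0.0, 0.0, z]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def power(m, k):
    out = diag(1.0, 1.0, 1.0)
    for _ in range(k):
        out = matmul(m, out)
    return out


def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]


def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


E = diag(1.0, 1.0, 1.0)
SIGMA_H = diag(1.0, 1.0, -1.0)      # reflection in the plane z = 0
INVERSION = diag(-1.0, -1.0, -1.0)


def rotation_reflection(n):
    """S_n: rotation by 2 pi / n followed by reflection in z = 0."""
    return matmul(SIGMA_H, rot_z(2.0 * math.pi / n))


def rotation_inversion(n):
    """n-bar: rotation by 2 pi / n followed by inversion."""
    return matmul(INVERSION, rot_z(2.0 * math.pi / n))


even_ok = all(close(power(rotation_reflection(n), n), E) for n in (2, 4, 6))
odd_ok = all(close(power(rotation_reflection(n), n), SIGMA_H) for n in (3, 5))
# 3-bar = S6^-1, 4-bar = S4^-1, 6-bar = S3^-1
relations_ok = all(
    close(rotation_inversion(p), transpose(rotation_reflection(q)))
    for p, q in ((3, 6), (4, 4), (6, 3))
)
print(even_ok, odd_ok, relations_ok)
```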
3.4.4
Limitations Imposed by Translational Symmetry
Although all point group operations are allowable for isolated molecules, certain point group symmetries are not allowed for periodic lattices. The limitations on the possible types of point group symmetry operations can be seen by examining the effect of an n-fold axis perpendicular to a line through lattice points A − B . . . C − D, with $1 + m_1$ lattice points on it. The direction of the line will be chosen as the direction of the primitive lattice vector $\mathbf{a}_1$, and the line is assumed to have a length $m_1 a_1$. A clockwise rotation of $2\pi/n$ about the n-fold axis of rotation through point A will generate a new line A − B'. Likewise, a counter clockwise rotation of $2\pi/n$ about the n-fold axis of rotation through point D will generate a new line D − C''. The line constructed through B' − C'' is parallel to the initial line A − D. The length of the line B' − C'' is equal to $m_1 a_1 - 2 \cos( 2\pi/n ) \, a_1$ and must be equal to an integer multiple of $a_1$, say $m_1' a_1$. Then

$$ \cos \frac{2\pi}{n} = \frac{ m_1 - m_1' }{2} \tag{32} $$

Thus, $\cos( 2\pi/n )$ must be an integer or a half-odd integer, and so lies in the set $\{ \pm 1, 0, \pm \frac{1}{2} \}$. This restriction limits the possible n-fold rotation axes to be of order n = 1, 2, 3, 4, 6 and allows no others. Thus, apart from the trivial one-fold axis, a crystalline lattice can only contain a two-fold, three-fold, four-fold or six-fold axis of rotation. However, there do exist solids that possess five-fold symmetry, such as quasi-crystals. Quasi-crystals are not crystals as they do not possess periodic translational invariance.
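The restriction can be checked by brute force: the condition that $m_1 - m_1'$ is an integer is equivalent to requiring that 2 cos(2π/n) is an integer. The sketch below scans the orders up to a cutoff; the function name `allowed_orders`, the cutoff and the tolerance are choices made here for illustration.

```python
import math


def allowed_orders(max_n=12, tol=1e-9):
    """Orders n for which 2 cos(2 pi / n) is an integer, which is the
    condition that m1 - m1' in the argument above is an integer."""
    allowed = []
    for n in range(1, max_n + 1):
        twice_cos = 2.0 * math.cos(2.0 * math.pi / n)
        if abs(twice_cos - round(twice_cos)) < tol:
            allowed.append(n)
    return allowed


orders = allowed_orders()
print(orders)
```

For n = 5, for example, 2 cos(2π/5) is approximately 0.618, which is not an integer, so a five-fold axis is excluded.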
3.4.5
Point Group Nomenclature
The point groups are referred to by using two different notation schemes, the Schoenflies and the International notation. In the following examples, first the groups are labelled by their Schoenflies designation and then their International designation is given. The point groups are:
$C_n$ The groups $C_n$ only contain an n-fold rotation axis. The group contains as many elements as the order of the axis. The International symbol is n.

$C_{n,v}$ The groups $C_{n,v}$ contain the n-fold rotation axis and have vertical mirror planes which contain the axis of rotation. If n is odd, the effect of the n-fold axis is such that it produces a set of n equivalent mirror planes. This yields 2 n symmetry operations, which are the n rotations and the reflections in the n mirror planes. If n is even, the effect of repeating $C_n$ only produces $n/2$ equivalent mirror planes. The other $n/2$ rotations merely bring the mirror plane into coincidence with itself, but with the two surfaces of the mirror interchanged. A mirror plane is equivalent to its partner mirror plane found by rotating it through π since, by definition, a mirror plane is two-sided. However, for even n, the effect of the compounded operation $C_n \sigma_v$ acting on an arbitrary point P produces a point P' which is identical to the point P' produced by reflection of P in the mirror plane that bisects the angle between two equivalent mirror planes $\sigma_v$. Thus, the symmetry element given by the product $C_n \sigma_v$ is equivalent to a mirror reflection in the bisecting (vertical) mirror plane. The effect of $C_n$ is to transform this bisecting mirror plane into a set of $n/2$ equivalent bisecting mirror planes. Thus, if n is even, there are also 2 n symmetry operations. These 2 n symmetry operations are the set of n rotations and the two sets of $n/2$ reflections. Mirror planes which are not perpendicular to the rotation axis are recorded as m without any special marking. For even n, the International symbol for $C_{n,v}$ is nmm. The two m's refer to two distinct sets of mirror planes: the first m refers to the set generated from the original vertical mirror plane, and the second m refers to the vertical mirror planes which bisect the first set.
For odd n, the International symbol is just nm, as the group only contains one set of mirror planes and does not contain a set of bisecting mirror planes.

$C_{n,h}$ The groups $C_{n,h}$ contain the n-fold rotation, and have a horizontal mirror plane which is perpendicular to the axis of rotation. These groups contain 2 n elements and, if n is even, the group contains $C_n^{n/2} \sigma_h = C_2 \sigma_h = I$, which is the inversion operator. The International notation usually refers to these as n/m. The diagonal line indicates that the symmetry plane is perpendicular to the axis of rotation. The only exception is $C_{3h}$ or $\bar{6}$. The International symbol signifies that $C_{3h}$ is relegated to the group of rotation reflections which are, in general, designated by $\bar{n}$.

$S_n$ The groups $S_n$ only contain the n-fold rotation - reflection axis. For even n, the group contains only n elements as $(S_n)^n = E$. For odd n, $(S_n)^n = \sigma$, so the group must contain 2 n elements. The International notation is given by the equivalent rotation inversion group $\bar{n}$. For example, $S_6 \equiv \bar{3}$, $S_4 \equiv \bar{4}$, $S_3 \equiv \bar{6}$.

$D_n$ The groups $D_n$ contain an n-fold axis of rotation and a two-fold axis which is perpendicular to the n-fold axis. The effect of the n-fold rotation is to produce a set of equivalent two-fold axes. If n is odd there are n equivalent two-fold axes. If n is even, the n-fold rotation produces $n/2$ equivalent two-fold axes which are two sided. When n is even, the action of a two-fold rotation followed by an n-fold rotation is equivalent to a new two-fold rotation about an axis that bisects the original set of two-fold axes. This can be seen by following the motion of an arbitrary point P with coordinates (x, y, z) under the two-fold rotation about a horizontal axis, say the x axis. The rotation by π about the x axis sends z → − z and y → − y. A further rotation of $2\pi/n$ about the z axis sends the point (x, −y, −z) to the final image point P'. Note that the z coordinate of point P' is − z. Construct the line joining P and P'. The mid-point lies on the plane z = 0, and subtends an angle of $\pi/n$ with the x axis and, therefore, lies on the bisecting rotation axis. As the bisecting axis passes through the mid-point of the line P − P', this shows that the arbitrary point P can be sent to P' via a π rotation about the bisecting axis. Thus, for even n, there are $n/2$ bisecting two-fold axes, and the $n/2$ original two-fold axes. In the case of either even or odd n, the group contains 2 n elements consisting of the n-fold rotations and a total of n two-fold rotations. The International designation for $D_n$ is either n22 or just n2, depending on whether n is even or odd. These two designations occur for reasons similar to those which lead to the two International designations for $C_{n,v}$. For odd n the designation n2 indicates that there is one n-fold axis and one set of equivalent two-fold axes. For even n, the symbol n22 indicates the existence of an n-fold axis and two inequivalent sets of two-fold axes.

$D_{nh}$ The groups $D_{nh}$ contain all the elements of $D_n$ and also contain a horizontal mirror plane perpendicular to the n-fold axis.
The effect of a rotation about a two-fold axis followed by the reflection $\sigma_h$ is equivalent to a reflection in a vertical plane $\sigma_v$ passing through the two-fold axis. Since rotating $\sigma_v$ about the $C_n$ axis produces a set of n vertical mirror planes, adding a horizontal mirror plane to $D_n$ produces n vertical mirror planes $\sigma_v$. The group has 4 n elements which are formed from the 2 n rotations of $D_n$, the n reflections in the vertical mirror planes, and n rotation reflections $C_n^k \sigma_h$. For even n, the International symbol is $\frac{n}{m} \frac{2}{m} \frac{2}{m}$, which is often abbreviated to n/mmm. The symbol indicates that the n-fold axis has a perpendicular mirror plane, and that the two sets of two-fold axes also have their perpendicular mirror planes. For odd n, the International symbol for the group acknowledges the 2n-fold rotation inversion symmetry and is labelled as $\overline{2n}2m$.

$D_{nd}$ The groups $D_{nd}$ contain all the elements of $D_n$ and mirror planes which contain the n-fold axis and bisect the two-fold axes. The effect of the two-fold rotations is to generate a total of n vertical reflection planes. There are 4 n elements: the 2 n rotations of $D_n$ and the n mirror reflections $\sigma_d$ in the n vertical planes, while the remaining n elements are rotation reflections about the principal axis of the form $S_{2n}^{2k+1}$ where k = 0, 1, 2, . . . , ( n − 1 ). The principal axis is, therefore, a 2n-fold rotation reflection axis. The International symbol is n2m indicating an n-fold axis, a perpendicular two-fold axis and a vertical mirror plane.
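The element counts quoted for the axial groups can be checked by generating each group from a few generators and counting the distinct matrices. The sketch below is an illustration under assumed conventions (n-fold axis along z, one two-fold axis along x, one vertical mirror in the xz plane, and the diagonal mirror at the angle π/(2n) from x); the helper names are hypothetical and the closure routine is a plain brute-force search.

```python
import math


def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def diag(x, y, z):
    return [[x, 0.0, 0.0], [0.0, y, 0.0], [0.0, 0.0, z]]


def mirror_vertical(phi):
    """Reflection in the vertical plane containing the z axis and the
    horizontal direction at angle phi from the x axis."""
    c, s = math.cos(2.0 * phi), math.sin(2.0 * phi)
    return [[c, s, 0.0], [s, -c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


def generate_group(generators):
    """All distinct products of the generators (brute-force closure)."""
    identity = diag(1.0, 1.0, 1.0)
    elements, frontier = [identity], [identity]
    while frontier:
        fresh = []
        for g in frontier:
            for h in generators:
                p = matmul(h, g)
                if not any(close(p, e) for e in elements):
                    elements.append(p)
                    fresh.append(p)
        frontier = fresh
    return elements


def point_group_orders(n):
    cn = rot_z(2.0 * math.pi / n)
    c2x = diag(1.0, -1.0, -1.0)          # two-fold axis along x
    sigma_v = mirror_vertical(0.0)       # xz plane
    sigma_h = diag(1.0, 1.0, -1.0)       # plane z = 0
    sigma_d = mirror_vertical(math.pi / (2.0 * n))
    return {
        "Cnv": len(generate_group([cn, sigma_v])),
        "Cnh": len(generate_group([cn, sigma_h])),
        "Dn": len(generate_group([cn, c2x])),
        "Dnh": len(generate_group([cn, c2x, sigma_h])),
        "Dnd": len(generate_group([cn, c2x, sigma_d])),
    }


for n in (2, 3, 4, 6):
    print(n, point_group_orders(n))
```

For each n the counts come out as 2n for $C_{n,v}$, $C_{n,h}$ and $D_n$, and 4n for $D_{nh}$ and $D_{nd}$.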
T The tetrahedral group corresponds to the group of rotations of the regular tetrahedron. The elements comprise four three-fold rotation axes, each passing through one vertex and the center of the opposite face. The compound action of two of the three-fold rotations yields a rotation about a two-fold axis. There are three such two-fold axes passing through the mid-points of opposite edges of the tetrahedron. The tetrahedral group has 12 elements. The symmetry operations can also be found in a cube, if the three four-fold rotation axes present in the cube are discarded. The group has the International symbol of 23.

$T_d$ The group $T_d$ corresponds to the tetrahedral group adjoined by a reflection plane passing through one edge and the mid-point of the opposite edge of the tetrahedron. The reflection plane bisects a pair of two-fold axes of T. There are six mirror planes $\sigma_d$. For the cube, these mirror planes are the diagonal planes, which motivates the use of the subscript d. The mirror planes convert the two-fold axes into four-fold rotation reflection axes $S_4$. The group $T_d$ contains 24 elements. The group $T_d$ has the International designation $\bar{4}3m$.

$T_h$ The group consists of the tetrahedral group adjoined by a mirror plane which bisects the angle between the three-fold axes. For the tetrahedron, this group is equivalent to $T_d$, but for the cube, the mirror plane is parallel to opposite faces. There are three such horizontal planes. The planes bisect the angles between the three-fold axes and, therefore, convert them into six-fold rotation reflection axes. Since the group contains $S_6$, it also contains I. Hence, $T_h = T \times C_i$. The group has 24 elements. The International designation for the group $T_h$ is either $\frac{2}{m}\bar{3}$ or m3.

O The octahedral group has three mutually perpendicular four-fold axes. There are four three-fold axes, and six two-fold axes. It has 24 elements. It has an International designation of 432.
$O_h$ On adjoining a mirror plane to the octahedral group one obtains $O_h$. Adding a vertical mirror plane generates three other mirror planes. The effect of a reflection in the vertical mirror plane followed by a rotation $C_4$ is equivalent to a reflection in a diagonal mirror plane. The $C_3$ axes become $S_6$ axes, just as in the case of $T_h$. The group contains 48 elements. The International designation is either $\frac{4}{m}\bar{3}\frac{2}{m}$ or m3m.
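The orders of the cubic point groups can be confirmed by generating them from integer matrices acting on the cube axes. The sketch below assumes one particular choice of generators (a three-fold rotation about the body diagonal, a two-fold or four-fold rotation about z, a diagonal mirror exchanging x and y, and the inversion); this is one of several equivalent choices, and the names are introduced here for illustration.

```python
def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
INVERSION = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))
C3_DIAGONAL = ((0, 0, 1), (1, 0, 0), (0, 1, 0))   # three-fold axis along (1,1,1)
C2_Z = ((-1, 0, 0), (0, -1, 0), (0, 0, 1))        # two-fold axis along z
C4_Z = ((0, -1, 0), (1, 0, 0), (0, 0, 1))         # four-fold axis along z
SIGMA_D = ((0, 1, 0), (1, 0, 0), (0, 0, 1))       # diagonal mirror, swaps x and y
S4_Z = ((0, -1, 0), (1, 0, 0), (0, 0, -1))        # four-fold rotation reflection


def generate_group(generators):
    """All distinct products of the generators (brute-force closure)."""
    elements, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        fresh = []
        for g in frontier:
            for h in generators:
                p = matmul(h, g)
                if p not in elements:
                    elements.add(p)
                    fresh.append(p)
        frontier = fresh
    return elements


T = generate_group([C3_DIAGONAL, C2_Z])
TD = generate_group([C3_DIAGONAL, C2_Z, SIGMA_D])
TH = generate_group([C3_DIAGONAL, C2_Z, INVERSION])
O = generate_group([C4_Z, C3_DIAGONAL])
OH = generate_group([C4_Z, C3_DIAGONAL, INVERSION])

orders = {"T": len(T), "Td": len(TD), "Th": len(TH), "O": len(O), "Oh": len(OH)}
print(orders)

# Th is T together with every element of T multiplied by the inversion.
th_is_direct_product = TH == T | {matmul(INVERSION, g) for g in T}
# Td contains S4 but no proper four-fold rotation.
print(th_is_direct_product, S4_Z in TD, C4_Z in TD)
```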
3.5
Bravais Lattices
There are an infinite number of choices for the primitive lattice vectors, however, only a few special lattices are invariant under point group operations. These are the Bravais lattices. In three dimensions there are 14 Bravais lattice types. The 14 Bravais lattices are organized according to seven crystal systems. The Bravais lattices can be categorized in terms of the number of symmetry operations. The unit cells have lattice vectors $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$, of length a, b and c, as shown in the figure. The angles between the vectors are denoted by α, β and γ, such that α is the angle between $\mathbf{b}$ and $\mathbf{c}$, etc. That is, $\alpha = \angle ( \mathbf{b} , \mathbf{c} )$, $\beta = \angle ( \mathbf{a} , \mathbf{c} )$, and $\gamma = \angle ( \mathbf{a} , \mathbf{b} )$.
----------------------------------------------------------------------
3.5.1
Exercise 1
Show that the volume of a primitive unit cell, $V_c$, is given by

$$ V_c = a \, b \, c \, \left[ 1 + 2 \cos\alpha \cos\beta \cos\gamma - \cos^2\alpha - \cos^2\beta - \cos^2\gamma \right]^{\frac{1}{2}} \tag{33} $$
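The formula of the exercise can be checked numerically, although this does not replace the derivation that is asked for. The sketch below builds Cartesian lattice vectors from (a, b, c, α, β, γ) using an assumed convention (a along x, b in the xy plane) and compares the triple product with the closed form; the sample triclinic cell parameters are arbitrary values chosen here.

```python
import math


def cell_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian lattice vectors with a along x and b in the xy plane
    (an assumed convention); the angles are in radians."""
    va = (a, 0.0, 0.0)
    vb = (b * math.cos(gamma), b * math.sin(gamma), 0.0)
    cx = c * math.cos(beta)
    cy = c * (math.cos(alpha) - math.cos(beta) * math.cos(gamma)) / math.sin(gamma)
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return va, vb, (cx, cy, cz)


def triple_product_volume(va, vb, vc):
    """| a . (b ^ c) |"""
    bxc = (vb[1] * vc[2] - vb[2] * vc[1],
           vb[2] * vc[0] - vb[0] * vc[2],
           vb[0] * vc[1] - vb[1] * vc[0])
    return abs(sum(x * y for x, y in zip(va, bxc)))


def formula_volume(a, b, c, alpha, beta, gamma):
    """Closed form quoted in the exercise."""
    ca, cb, cg = math.cos(alpha), math.cos(beta), math.cos(gamma)
    return a * b * c * math.sqrt(
        1.0 + 2.0 * ca * cb * cg - ca * ca - cb * cb - cg * cg)


# An arbitrary triclinic cell (illustrative numbers only).
params = (1.0, 1.3, 0.9, math.radians(80.0), math.radians(95.0), math.radians(70.0))

v_triple = triple_product_volume(*cell_vectors(*params))
v_formula = formula_volume(*params)
print(v_triple, v_formula)
```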
----------------------------------------------------------------------
If the point group contains four three-fold axes $C_3$ or (3), the system is cubic. It is possible to choose three coordinate axes which are orthogonal to each other and are perpendicular to the faces of a cube that has the four three-fold axes as the body diagonals.
3.5.2
Cubic Bravais Lattices.
The cubic Bravais lattices have the highest symmetry. The simple cube (P) has three four-fold rotation axes and four three-fold axes, along with six two-fold axes. There are also three mirror planes that can be adjoined to the set of rotational symmetry operations. The three four-fold rotation axes ($C_4$) are mutually perpendicular and pass through the centers of opposite faces of the cube. Any rotation which is an integer multiple of $2\pi/4$ will bring the cube into coincidence with itself. The four three-fold axes ($C_3$) pass through pairs of opposite corners of the cube. A rotation of any multiple of $2\pi/3$ will bring the cube into coincidence with itself, as can be seen by inspection of the three edges at the vertex which the rotation axis passes through. The six two-fold axes ($C_2$) join the centers of opposite edges. The highest symmetry group when mirror symmetry is not included is the octahedral group O. The octahedral group O contains 24 symmetry operations. On adjoining a mirror plane to the set of rotations of the octahedral group, one has the highest symmetry point group, which is labelled as $O_h$ or m3m and has 48 symmetry elements.

The reason that the cubic group is called the octahedral group is explained by the following observation. The group of symmetry operations of the cube is equivalent to the group of symmetry operations on the regular octahedron. This can be seen by inscribing an octahedron inside a cube, where each vertex of the octahedron lies on the center of a face of the cube. Thus, the cubic point group is called the octahedral group O.

There are three types of cubic Bravais lattices: the simple cubic (P), the body centered cubic (I) and face centered cubic (F) Bravais lattices.

The primitive lattice vectors for the simple cubic lattice (P) can be taken as the three orthogonal vectors which form the smallest cube with the lattice points as vertices. The three primitive lattice vectors have equal length, a, and are orthogonal. The vertices of the cube can be labelled as (0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1) and (1, 1, 1). The primitive cell is the cube which contains one lattice point and has a volume $a^3$.

The body centered cubic Bravais lattice (I) has lattice points at the vertices of the cube and also one at the central point, which is $\frac{a}{2}(1, 1, 1)$ when specified in terms of the Cartesian coordinates formed by the edges of the conventional non-primitive unit cell (which is the cube). The primitive lattice vectors are given in terms of the Cartesian coordinates by $\mathbf{a}_1 = \frac{a}{2}(1, 1, -1)$, $\mathbf{a}_2 = \frac{a}{2}(-1, 1, 1)$, $\mathbf{a}_3 = \frac{a}{2}(1, -1, 1)$. These are the three vectors from any lattice point joining three neighboring body centers. The conventional unit cell contains two lattice sites and has a volume $a^3$, where a is the length of the side of the cube.
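The geometry of the primitive cell that follows from these body centered cubic primitive vectors can be worked out numerically. The sketch below assumes a = 1 and evaluates the edge length, the angle between each pair of primitive vectors and the cell volume; the helper names are introduced here for illustration.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


A = 1.0   # cube edge of the conventional cell (illustrative value)
a1 = (A / 2, A / 2, -A / 2)
a2 = (-A / 2, A / 2, A / 2)
a3 = (A / 2, -A / 2, A / 2)

edges = [math.sqrt(dot(v, v)) for v in (a1, a2, a3)]
cosines = [dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
           for u, v in ((a1, a2), (a2, a3), (a3, a1))]
volume = abs(dot(a1, cross(a2, a3)))

print(edges)      # each edge has length sqrt(3)/2 times a
print(cosines)    # each pair of vectors has cosine -1/3
print(volume)     # half the volume of the conventional cube
```

The volume of one half of the cube is consistent with the conventional cell containing two lattice sites.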
The primitive unit cell is a rhombohedron of edge $\frac{\sqrt{3}}{2} a$ which contains one lattice site and has a volume $\frac{1}{2} a^3$. The angle $\gamma$ between any two of the primitive lattice vectors is given by $\cos\gamma = -\frac{1}{3}$. In the primitive cell, each body center of the conventional unit cell is connected by three primitive lattice vectors to three vertices of the conventional cell.

The Wigner-Seitz cell for the body centered cubic lattice is a truncated octahedron. Its eight hexagonal faces lie in the planes that bisect the lines joining the body center to the vertices. These eight planes are truncated by the planes of the cube, which coincide with the bisectors of the lines between neighboring body centers. The truncation produces the six square faces of the body centered cubic Wigner-Seitz cell.

The face centered cubic Bravais lattice (F) consists of the lattice points at the vertices of the cube, and lattice points at the centers of the six faces. The
lattice points at the face centers are located at $\frac{a}{2}(1, 1, 0)$, $\frac{a}{2}(1, 0, 1)$, $\frac{a}{2}(0, 1, 1)$, $\frac{a}{2}(1, 1, 2)$, $\frac{a}{2}(1, 2, 1)$ and $\frac{a}{2}(2, 1, 1)$. The primitive lattice vectors point from the vertex at $(0, 0, 0)$ to the three closest face centers, $\mathbf{a}_1 = \frac{a}{2}(1, 1, 0)$, $\mathbf{a}_2 = \frac{a}{2}(1, 0, 1)$, $\mathbf{a}_3 = \frac{a}{2}(0, 1, 1)$. Since each face is shared by two adjacent conventional unit cells and each vertex is shared by eight, there are 4 lattice sites in the conventional non-primitive cubic unit cell. The primitive unit cell is a rhombohedron with side $\frac{a}{\sqrt{2}}$. The edges of the primitive cell are found by connecting the vertex to the three neighboring face centers, so that the edges of the primitive unit cell connect two opposite vertices of the cube via the six face centers. The volume of the primitive unit cell is found to be $\frac{1}{4} a^3$. The angles between the primitive lattice vectors are $\frac{\pi}{3}$.

The Wigner-Seitz cell for the face centered cubic lattice is best seen by translating the conventional unit cell by $\frac{a}{2}$ along one axis. After the translation has been performed, the unit cell has the appearance of a cube which has lattice sites at the body center and at the mid-points of the twelve edges of the cube. The Wigner-Seitz cell can then be constructed by finding the twelve planes bisecting the lines from the body center to the mid-points of the edges. The resulting figure is a rhombic dodecahedron.
——————————————————————————————————
The presence of either one four-fold rotation axis $C_4$ ($4$) or one four-fold rotation-inversion axis $S_4$ ($\bar{4}$) makes it possible to choose three vectors such that $a = b \neq c$, $\alpha = \beta = \gamma = \frac{\pi}{2}$ and $c$ is parallel to the $C_4$ or $S_4$ axis. This is the tetragonal system.
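Looking back at the b.c.c. and f.c.c. primitive lattice vectors quoted above, the stated edge lengths, cell volumes and angles can be checked numerically. The following is a minimal sketch rather than part of the original notes; the helper names and the choice $a = 1$ are assumptions made only for illustration.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def cell_volume(a1, a2, a3):
    """Volume of the parallelepiped spanned by three lattice vectors."""
    return abs(dot(a1, cross(a2, a3)))

def angle_degrees(u, v):
    """Angle between two vectors, in degrees."""
    cosine = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.degrees(math.acos(cosine))

a = 1.0      # cube edge, set to 1 (assumption for illustration)
h = a / 2

# Primitive vectors as quoted in the text.
bcc = [(h, h, -h), (-h, h, h), (h, -h, h)]
fcc = [(h, h, 0.0), (h, 0.0, h), (0.0, h, h)]

bcc_volume = cell_volume(*bcc)
fcc_volume = cell_volume(*fcc)
bcc_edge = math.sqrt(dot(bcc[0], bcc[0]))
fcc_edge = math.sqrt(dot(fcc[0], fcc[0]))
bcc_angle = angle_degrees(bcc[0], bcc[1])
fcc_angle = angle_degrees(fcc[0], fcc[1])

print("b.c.c.: edge", bcc_edge, "volume", bcc_volume, "angle", bcc_angle)
print("f.c.c.: edge", fcc_edge, "volume", fcc_volume, "angle", fcc_angle)
```

With $a = 1$ the b.c.c. values should come out as edge $\frac{\sqrt{3}}{2}$, volume $\frac{1}{2}$ and $\cos\gamma = -\frac{1}{3}$, and the f.c.c. values as edge $\frac{1}{\sqrt{2}}$, volume $\frac{1}{4}$ and angle $\frac{\pi}{3}$.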
3.5.3
Tetragonal Bravais Lattices.
The tetragonal Bravais lattice can be considered to be formed from the cubic Bravais lattices by deforming the cube, either stretching it or contracting it along one axis. This special axis is denoted as the $c$ axis. Thus, the conventional unit cell can be constructed, starting with a square base of side $a$, by constructing edges of length $c \neq a$ parallel to the normal of the base, from each corner. The simple tetragonal Bravais lattice has a four-fold rotational axis and two orthogonal two-fold axes. These symmetry elements generate the group $D_4$. On adding a horizontal mirror plane to $D_4$, one obtains the highest symmetry tetragonal point group, which is $D_{4h}$ or $4/mmm$ with 16 symmetry elements.

There are two tetragonal Bravais lattices: the simple tetragonal Bravais lattice (P) and the body centered tetragonal Bravais lattice (I). The face centered tetragonal lattice is equivalent to the body centered tetragonal lattice. This can be seen by considering a body centered tetragonal lattice in which the conventional unit cell can be described in terms of a side of length $c$
perpendicular to the square base of side $a$ and area $a^2$. Consider the view along the $c$ axis, which is perpendicular to the square base. By taking a new base of area $2 a^2$ and sides $\sqrt{2}\, a$, which are the diagonals of the original base, one finds that the body centers can now be positioned as the face centers. That is, the body centered tetragonal unit cell is equivalent to the face centered tetragonal unit cell.

The equivalence between the body centered and face centered structures does not apply to the cubic system. However, the conventional body centered cubic unit cell is equivalent to a face centered tetragonal unit cell in which the height along the $c$ axis has a special relation to the side of the base. Namely, the $c$-axis height is $a$ and the side of the base is $\sqrt{2}\, a$. Using the converse construction, the face centered cubic unit cell can be shown to be equivalent to the body centered tetragonal lattice with a particular length of the $c$ axis.
——————————————————————————————————
The orthorhombic system has three mutually perpendicular two-fold rotation axes. The existence of the three mutually perpendicular two-fold rotation axes is compatible with the point groups $D_2$, $C_{2v}$ and $D_{2h}$. It is possible to construct a unit cell with $\alpha = \beta = \gamma = \frac{\pi}{2}$.
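The equivalence of the body centered and face centered tetragonal cells described above can be illustrated by re-indexing lattice points. The sketch below is an illustration, not the author's construction: it measures coordinates in units of $(a, a, c)$, takes the enlarged base vectors along the two diagonals $(a, a, 0)$ and $(-a, a, 0)$, and checks that every body centered tetragonal lattice point falls on a site of a face centered cell. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ZERO = Fraction(0)
HALF = Fraction(1, 2)

# Fractional positions of the sites in a face centered conventional cell.
F_SITES = {(ZERO, ZERO, ZERO), (HALF, HALF, ZERO),
           (HALF, ZERO, HALF), (ZERO, HALF, HALF)}

def to_enlarged_cell(x, y, z):
    """Fractional coordinates in the cell whose base vectors are the
    diagonals (a, a, 0) and (-a, a, 0) of the original square base."""
    u = (x + y) / 2
    v = (y - x) / 2
    return (u % 1, v % 1, z % 1)

def bct_points(n):
    """Body centered tetragonal lattice points, in units of (a, a, c)."""
    points = []
    for i, j, k in product(range(-n, n + 1), repeat=3):
        points.append((Fraction(i), Fraction(j), Fraction(k)))   # vertices
        points.append((i + HALF, j + HALF, k + HALF))            # body centers
    return points

images = {to_enlarged_cell(*p) for p in bct_points(2)}
print(images == F_SITES)
```

Exact rational arithmetic is used so that the comparison with the four face centered sites does not depend on floating point rounding.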
3.5.4
Orthorhombic Bravais Lattices.
The conventional orthorhombic unit cell can be considered to be formed by deforming the tetragonal unit cell by stretching the base along an axis in the basal plane. Thus, the base can be viewed as a rectangle of sides $a \neq b$. The unit cell has another set of edges which are parallel to the normal to the base and have length $c$. Thus, the conventional unit cell has edges which are parallel to three orthogonal unit vectors. The simple orthorhombic lattice (P) only has two-fold rotation axes. The two-fold axes are perpendicular, so the rotational group is $D_2$. Adjoining a horizontal mirror plane converts $D_2$ into the orthorhombic point group with highest symmetry, which is $D_{2h}$ or $mmm$ with 8 symmetry operations.

There are four inequivalent orthorhombic Bravais lattices. These are the simple orthorhombic lattice (P), the body centered orthorhombic (I), the face centered orthorhombic (F) and a new type of lattice, the base centered orthorhombic lattice (C).

The base centered orthorhombic lattice (C) can be constructed from the tetragonal lattice in the following manner. View the square net of side $a$, which forms the bases of the tetragonal unit cells, in terms of a non-primitive unit cell with a square base of side $\sqrt{2}\, a$ with sides along the diagonals. This larger non-primitive unit cell contains one extra lattice site at the center of the base
and the top. When this base centered tetragonal structure is then stretched along one of its sides (one of the diagonal sides of length $\sqrt{2}\, a$), one obtains the orthorhombic base centered lattice.
——————————————————————————————————
The monoclinic lattice system requires a minimum of one two-fold rotation axis. Due to the conditions imposed by the two-fold rotation symmetry, it is possible to choose $\alpha = \gamma = \frac{\pi}{2} \neq \beta$. The monoclinic system is compatible with the point groups $C_2$, $C_s$ and $C_{2h}$.
3.5.5
Monoclinic Bravais Lattice.
The monoclinic Bravais lattice is obtained from the orthorhombic Bravais lattices by distorting the rectangular base perpendicular to the $c$ axis into a parallelogram. The base is a parallelogram, and the two basal lattice vectors are perpendicular to the $c$ axis. The simple monoclinic lattice (P) has a two-fold axis parallel to the $c$ axis. The rotational group is $C_2$. If a horizontal mirror plane is added to $C_2$, then one finds that the most symmetric monoclinic point group is $C_{2h}$ or $2/m$, which has four elements.

There are two types of monoclinic Bravais lattices: the simple monoclinic (P) and the body centered monoclinic Bravais lattice (I). The two monoclinic Bravais lattices correspond to the two tetragonal Bravais lattices. The four orthorhombic lattices collapse onto two lattices in the tetragonal and monoclinic systems, as the centered square net is not distinct from a square net. Likewise, the centered parallelogram is not distinct from a parallelogram.
——————————————————————————————————
The groups $C_1$ and $C_i$ impose no specific restrictions on the lattice. This is the triclinic lattice system.
3.5.6
Triclinic Bravais Lattice.
The triclinic Bravais lattice is obtained from the monoclinic lattice by tilting the $c$ axis so that it is no longer orthogonal to the base. There is only the simple triclinic Bravais lattice (P). The three axes are not orthogonal and the sides are all different. Apart from inversion, which is required by the periodic translational invariance of the lattice, the triclinic lattice has no special symmetry elements. The point group of highest symmetry is $C_i$ or $\bar{1}$, which has two elements.
——————————————————————————————————
The presence of only one three-fold axis, either $C_3$ ($3$) or $S_6$ ($\bar{3}$), produces the trigonal system. There are two types of trigonal system. In one of the trigonal systems, a primitive unit cell may be chosen with $a = b = c$ and $\alpha = \beta = \gamma$ such that the three-fold axis is along the body diagonal. The other trigonal system has $a = b \neq c$, $\alpha = \beta = \frac{\pi}{2}$ and $\gamma = \frac{2\pi}{3}$. This latter system is denoted as the hexagonal system.
3.5.7
Trigonal Bravais Lattice.
The trigonal Bravais lattice is a deformation of the cube produced by stretching it along the body diagonal. The lengths of the sides remain the same and the three angles between the sides are all identical. There is only one trigonal Bravais lattice. Since the lattice has a center of inversion, the point symmetry group is $D_{3d}$ or $\bar{3}m$ with 12 symmetry operations. The body centered cubic and face centered cubic Bravais lattices can be considered to be special cases of the trigonal lattice. For these cubic systems, the sides of the primitive unit cells are all equal and the angles are 109.47 degrees for the b.c.c. structure and 60 degrees for the f.c.c. structure.

The trigonal unit cell contains two equilateral triangles. In the trigonal lattice the equilateral triangles form hexagonal nets. The difference between the trigonal lattice and the hexagonal lattice is merely due to the different stacking of the hexagonal planes.
——————————————————————————————————
The presence of either a six-fold axis $C_6$ ($6$) or a six-fold rotation-inversion axis $S_3$ ($\bar{6}$) indicates that the system is hexagonal. The hexagonal unit cell has $a = b \neq c$, $\alpha = \beta = \frac{\pi}{2}$ and $\gamma = \frac{2\pi}{3}$.
3.5.8
Hexagonal Bravais Lattice.
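The hexagonal Bravais lattice has a unit cell in which the base has sides of equal length, inclined at an angle of $\frac{2\pi}{3}$ with respect to each other. The $c$ axis is perpendicular to the base. The hexagonal system has a point group $D_{6h}$ or $6/mmm$ which has 24 symmetry elements.

There is only one hexagonal Bravais lattice. The primitive unit cells are rhombic prisms which can be stacked to build the hexagonal non-primitive unit cell. The six-fold rotational symmetry of the hexagonal Bravais lattice is most evident from the non-primitive unit cell. The primitive lattice vectors are given in terms of Cartesian coordinates by

$$
\begin{aligned}
\mathbf{a}_1 &= a\, \hat{e}_x \\
\mathbf{a}_2 &= \frac{a}{2} \left( \hat{e}_x + \sqrt{3}\, \hat{e}_y \right) \\
\mathbf{a}_3 &= c\, \hat{e}_z
\end{aligned}
\tag{34}
$$

With this choice $\mathbf{a}_1$ and $\mathbf{a}_2$ subtend an angle of $\frac{\pi}{3}$; the equivalent pair $\mathbf{a}_1$ and $\mathbf{a}_2 - \mathbf{a}_1$ subtends $\frac{2\pi}{3}$ and generates the same lattice.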
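In summary, the following structures were found:

System          Number of Bravais lattices   Edges              Angles
Cubic           3                            $a = b = c$        $\alpha = \beta = \gamma = \frac{\pi}{2}$
Tetragonal      2                            $a = b \neq c$     $\alpha = \beta = \gamma = \frac{\pi}{2}$
Orthorhombic    4                            $a \neq b \neq c$  $\alpha = \beta = \gamma = \frac{\pi}{2}$
Monoclinic      2                            $a \neq b \neq c$  $\alpha = \beta = \frac{\pi}{2} \neq \gamma$
Triclinic       1                            $a \neq b \neq c$  $\alpha \neq \beta \neq \gamma$
Trigonal        1                            $a = b = c$        $\alpha = \beta = \gamma < \frac{2\pi}{3}$, $\neq \frac{\pi}{2}$
Hexagonal       1                            $a = b \neq c$     $\alpha = \beta = \frac{\pi}{2}$, $\gamma = \frac{2\pi}{3}$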
This completes the discussion of the set of fourteen Bravais lattices. In order to specify crystal structures, it is necessary to associate a basis with the underlying Bravais lattice. The addition of a basis can reduce the symmetry of the crystal from the symmetry of the Bravais lattice. This results in thirty-two point groups and, by adjoining the translations and combined operations, one finds the two hundred and thirty space groups.
——————————————————————————————————
3.5.9
Exercise 2
Form a table of the number of the n-th nearest neighbors and the distances to the n-th neighbors for the face centered cubic (f.c.c.), body centered cubic (b.c.c.) and simple cubic (s.c.) lattices, for n = 1, n = 2 and n = 3. ——————————————————————————————————
Having just used symmetry to enumerate all the possible Bravais lattices, we shall now discuss the possible symmetries of crystals. Due to the addition of the basis, the point group symmetry of a crystal can be different from the point group symmetry of the Bravais lattice.
3.6
Point Groups
The addition of a basis to a lattice can result in a reduction of the symmetry of the point group. Here the point groups are enumerated according to the Bravais lattice types, and are given by the Schoenflies designation followed by the appropriate (International) symbol.

The cubic system with a basis can have the point symmetry group of either $O_h$ ($m3m$), $O$ ($43$), $T_d$ ($\bar{4}3m$), $T_h$ ($m3$) or $T$ ($23$).

The tetragonal system can have point group symmetry of $D_{4h}$ ($4/mmm$), $D_4$ ($42$), $C_{4v}$ ($4mm$), $C_{4h}$ ($4/m$) or $C_4$ ($4$).

The orthorhombic system can have point group symmetry of either $D_{2h}$ ($mmm$), $D_2$ ($222$) or $C_{2v}$ ($2mm$).

The monoclinic system can exist with point group symmetry of either $C_{2h}$ ($2/m$), $C_2$ ($2$) or $C_s$ ($m$).

The triclinic system only contains $C_1$ ($1$) and $C_i$ ($\bar{1}$), the group which only consists of the identity and the inversion operation.

The trigonal system has the point groups $D_{3d}$ ($\bar{3}m$), $D_3$ ($32$), $C_{3v}$ ($3m$), $S_6$ ($\bar{3}$) or $C_3$ ($3$).

The hexagonal system has the point groups $D_{6h}$ ($6/mmm$), $D_6$ ($62$), $C_{6v}$ ($6mm$), $C_{6h}$ ($6/m$) or $C_6$ ($6$).

There are four remaining groups. The groups $C_{3h}$ ($\bar{6}$) and $D_{3h}$ ($\bar{6}2m$) are usually included in the hexagonal system. Finally, there are the groups $S_4$ ($\bar{4}$) and $D_{2d}$ ($\bar{4}2m$), which are included with the tetragonal system. This completes the enumeration of the 32 point groups.
3.7
Space Groups
On combining the point group symmetry operations with lattice translations, one can generate 230 space groups. Often, the space group is composed from symmetry operations of the point group and symmetry operations that are
translations by the vectors of the direct lattice. These space groups are called symmorphic groups. Lattices with symmorphic space groups can be constructed by attaching bases with the various point group symmetries to the various Bravais lattices. For example, the 5 cubic point groups can be placed on the three cubic Bravais lattices, yielding 15 cubic space groups. Likewise, the 7 tetragonal groups can be placed on the two tetragonal Bravais lattices, yielding 14 tetragonal space groups. This process only leads to 61 different space groups.

In the other cases, the space groups contain two new types of symmetry operations that cannot be compounded from translations by Bravais lattice vectors and operations contained in the point groups. These groups are non-symmorphic. The new types of symmetry operations occur when there is a special relation between the basis dimensions and the size of the Bravais lattice. These new symmetry elements include:

Screw Axes. A screw operation is a translation by a vector, not in the Bravais lattice, which is followed by a rotation about the axis defined by the translation vector. A screw symmetry is denoted by $n_m$, where $n$ represents the rotation by $\frac{2\pi}{n}$, with $n = 2, 3, 4, 6$, and $m$ represents the number of translations by lattice vectors which accompany one complete rotation by $2\pi$. Thus, $n$ screw operations, each producing a rotation of $\frac{2\pi}{n}$, produce a translation of $m$ lattice spacings.

Glide Planes. A glide operation is composed of a translation by a vector, not in the Bravais lattice, which is followed by a reflection in a plane containing the translation vector. Glide planes are denoted by $a$, $b$, $c$ (according to whether the translation is along the $a$, $b$ or $c$ axis), or $n$ and $d$ (the diagonal or diamond glide, which are special cases involving translations along more than one axis).

The hexagonal close-packed lattice structure has both of these types of non-symmorphic symmetry operations.
The hexagonal close-packed structure can be described by a three-dimensional unit cell which contains a centered hexagonal base, and which has an identical centered hexagonal top located at vertical distance $c$ directly above the base. If one considers the base hexagon to be formed by six equilateral triangles, then there are lattice sites at the vertices of each triangle. These lattice sites form a triangular net in the basal plane and there is a similar triangular net in the upper plane. These lattice sites are designated as the A sites. There is a second net of triangles at a distance $\frac{c}{2}$ vertically over the base. The centers of the mid-plane equilateral triangles are located directly over the (central) lattice sites of the base. There are two possible orientations for these triangles. On choosing any one orientation, the set of lattice sites on this mid-plane are located such that they lie directly over the centers of every other equilateral triangle in the base. These mid-plane lattice sites are designated as the B sites.

Consider a line, parallel to the $c$ axis. The line is equidistant between two neighboring B lattice sites and is equidistant to the two A lattice sites that form
the section of the perimeter of the basal hexagon which is parallel to the line connecting the above two B lattice sites. Viewed along the $c$ axis, the vertical line passes through the center of the rectangle formed by the two A and two B lattice sites. This line is the screw axis. The screw operation consists of a translation by $\frac{c}{2}$ followed by a rotation of $\pi$, and brings the A hexagons into coincidence with the sites of the B hexagons.

The glide planes can also be found by considering the projection of the lattice along the $c$ axis. A line can be constructed which connects any two of the three B sites inside the hexagonal unit cell. Form a parallel line connecting a pair of neighboring A sites that forms part of the perimeter of the hexagonal base. Since this line is on the perimeter of the unit cell, it is equivalent to the parallel line segment connecting A sites at the opposite boundary. Consider the pair of parallel lines, one which connects the B sites, and the other which is the closest line segment that connects the A sites on the perimeter of the base hexagon. The projection of the glide plane along the $c$ axis is parallel to and equidistant from the above pair of lines. The glide operation is a translation by $\frac{c}{2}$ along the $c$ axis followed by a reflection in the plane.

There are two different systems of nomenclature for space groups, one due to Schoenflies and the other due to Hermann and Mauguin. The Hermann-Mauguin space group nomenclature consists of a letter $P$, $I$, $F$, $R$ or $C$, which describes the Bravais lattice type, followed by a statement of the essential symmetry elements that are present. Thus, for example, the space group $P6_3/mmc$ has a primitive ($P$) hexagonal Bravais lattice with point group symmetry $6/mmm$. Another example is given by the space group $Pba2$, which represents a primitive ($P$) orthorhombic Bravais lattice and has a point group of $mm2$ (the $a$ and $b$ glide planes being simple mirror planes in point group symmetry).
3.8
Crystal Structures with Bases.
Crystal structures are specified by giving the basis and the Bravais lattice. The basis is specified by the positions and types of the atoms in the unit cell. Sometimes it is also useful to specify the local coordination polyhedra around each inequivalent site in the lattice. This provides information about the local environment of the atom, which is important for bonding. Small deformations in the positions of the atoms can lower the symmetry of a crystal structure, but usually do not affect the connectivity or topology of the atoms. Therefore, slightly deformed local environments are often described by the same local coordination polyhedra. The local coordination polyhedra have been enumerated by W. B. Jensen (The Structures of Binary Compounds, North Holland publishers (1988)) and by Villars and Daams (Journal of Alloys and Compounds, 197, 177 (1993)).
3.8.1
Diamond Structure
The diamond lattice is formed by the carbon atoms in a diamond crystal. The structure is cubic, and has the space group $Fd3m$. The underlying Bravais lattice is the face centered cubic lattice, and has a two atom basis. In the diamond structure, both atoms are identical. They are located at the sites of the Bravais lattice, $(0, 0, 0)$, and at a second site displaced by the vector $a(\frac{1}{4}, \frac{1}{4}, \frac{1}{4})$ in terms of the Cartesian coordinates of the conventional unit cell. There are four lattice points corresponding to the sites of the conventional f.c.c. unit cell. There are also four interior points which are displaced from the Bravais lattice points by the basis vector $a(\frac{1}{4}, \frac{1}{4}, \frac{1}{4})$. Thus, the diamond structure consists of two interpenetrating face centered cubic lattices with atoms on each lattice site.

Diamond possesses a center of inversion located half way between the origins of the two f.c.c. lattices. This is a glide-like inversion operation. The center of inversion is located at $a(\frac{1}{8}, \frac{1}{8}, \frac{1}{8})$. When this is chosen as the origin, the crystal is symmetric under the transformation $\mathbf{r} \rightarrow -\mathbf{r}$.

Each atom is covalently bonded to four other atoms. The four neighboring atoms form a tetrahedron centered on each atom. The tetrahedra centered on the two inequivalent lattice sites have different orientations. The diamond lattice is most stable for compounds in which the bonds are highly directional. Directional covalent bonding is often found in the elements of column IV of the periodic table. In particular, Carbon, Silicon and Germanium can crystallize in the diamond structure. The great strength of diamond is a consequence of the three-dimensional network of strong covalent bonds. The diamond structure is relatively open, as the packing fraction is only 0.34.
——————————————————————————————————
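The coordination number of four and the packing fraction of 0.34 quoted above can be reproduced from the f.c.c. sites and the two atom basis. The following is a minimal sketch under stated assumptions: lengths are measured in units of $a$, the atoms are treated as touching hard spheres, and the function and variable names are illustrative rather than taken from the notes.

```python
import math
from itertools import product

# Conventional f.c.c. sites and the two atom basis, in units of a.
FCC_SITES = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
BASIS = [(0.0, 0.0, 0.0), (0.25, 0.25, 0.25)]

# The eight atoms of the conventional cubic cell.
cell_atoms = [tuple(s + b for s, b in zip(site, shift))
              for site in FCC_SITES for shift in BASIS]

def neighbor_distances(origin, atoms, reach=1):
    """Sorted distances from origin to all periodic images of the atoms."""
    distances = []
    for atom in atoms:
        for cell in product(range(-reach, reach + 1), repeat=3):
            image = tuple(x + n for x, n in zip(atom, cell))
            d = math.dist(origin, image)
            if d > 1e-9:          # skip the atom itself
                distances.append(d)
    return sorted(distances)

distances = neighbor_distances((0.0, 0.0, 0.0), cell_atoms)
nearest = distances[0]
coordination = sum(1 for d in distances if abs(d - nearest) < 1e-9)

# Hard spheres that just touch have a radius of half the bond length.
radius = nearest / 2
packing_fraction = len(cell_atoms) * (4.0 / 3.0) * math.pi * radius ** 3

print(coordination, nearest, round(packing_fraction, 3))
```

The nearest neighbor distance comes out as $\frac{\sqrt{3}}{4} a$, which is the length of the basis vector $a(\frac{1}{4}, \frac{1}{4}, \frac{1}{4})$.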
3.8.2
Exercise 3
Find the angles between the tetrahedral bonds of diamond. ——————————————————————————————————
3.8.3
Graphite Structure
Graphite is the stable form of carbon. Graphite has a hexagonal unit cell and has the space group $P6_3/mmc$. The primitive lattice vectors may be represented by

$$
\begin{aligned}
\mathbf{a}_1 &= \sqrt{3}\, a\, \hat{e}_x \\
\mathbf{a}_2 &= \frac{\sqrt{3}}{2}\, a\, \hat{e}_x + \frac{3}{2}\, a\, \hat{e}_y \\
\mathbf{a}_3 &= c\, \hat{e}_z
\end{aligned}
\tag{35}
$$

where $a$ is the length of the side of the hexagon. The atoms are located at $[0, 0, z]$ and $[0, 0, \frac{1}{2} + z]$ where $z \approx 0$, and the coordinates are given in terms of the primitive lattice vectors. Another two atoms are located at the positions $[\frac{2}{3}, \frac{2}{3}, z]$ and $[\frac{1}{3}, \frac{1}{3}, \frac{1}{2} + z]$, where $z \approx 0$. The structure is formed in layers, in which each atom is bonded to three other atoms, thereby forming a two-dimensional hexagonal network. The central site of the two-dimensional hexagonal ring is open. The stacking sequence of the layers corresponds to a translation of one layer by $[\frac{1}{3}, \frac{1}{3}, \frac{1}{2}]$ with respect to the other, such that one C atom lies above the hexagonal hollow in the layer below. The layers are relatively far apart and, as is expected, there is only weak van der Waals bonding between the layers. This structure explains the cleavage and other characteristic properties of graphite.

Carbon may crystallize either as a diamond lattice or as graphite, under different conditions. This is an example of polymorphism, which is quite common among the elements. Diamonds are not forever, as they actually are an unstable form of C under ambient conditions, although the rate of transformation to the stable form (graphite) is exceedingly slow.

Boron and Nitrogen, which occur on either side of Carbon in the periodic table, form compounds which have properties that are strikingly similar to Carbon. The Boron and Nitrogen atoms can be bonded in either planar structures, like graphite, or tetrahedral structures, like diamond. The tetrahedrally bonded Boron-Nitrogen materials have extremely high melting points and hardness, and have great importance in materials engineering.
——————————————————————————————————
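The in-plane geometry described above can be checked from the primitive vectors of equation (35) and the two atoms of the layer at $z \approx 0$. The sketch below is illustrative only: it sets $a = 1$, ignores the $c$ direction, and uses hypothetical helper names. It confirms that each atom has three in-plane neighbors at a distance equal to the side $a$ of the hexagon.

```python
import math
from itertools import product

a = 1.0  # side of the hexagon, set to 1 (assumption for illustration)

# In-plane parts of the primitive vectors of equation (35).
a1 = (math.sqrt(3) * a, 0.0)
a2 = (math.sqrt(3) / 2 * a, 1.5 * a)

# The two atoms of the layer at z ~ 0, in primitive-vector coordinates.
layer_basis = [(0.0, 0.0), (2.0 / 3.0, 2.0 / 3.0)]

def cartesian(f1, f2):
    """Convert in-plane fractional coordinates to Cartesian coordinates."""
    return (f1 * a1[0] + f2 * a2[0], f1 * a1[1] + f2 * a2[1])

origin = cartesian(0.0, 0.0)
distances = []
for f1, f2 in layer_basis:
    for n1, n2 in product(range(-2, 3), repeat=2):
        d = math.dist(origin, cartesian(f1 + n1, f2 + n2))
        if d > 1e-9:              # skip the atom at the origin itself
            distances.append(d)

bond_length = min(distances)
bonds = sum(1 for d in distances if d - bond_length < 1e-9)
print(bonds, bond_length)
```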
3.8.4
Exercise 4
There are two forms of graphite. The most common form is hexagonal graphite, which has a stacking sequence A - B - A - B. The other form of graphite is based on an f.c.c. form with a stacking sequence A - B - C - A - B - C. Describe the primitive unit cells for the two forms of graphite. How many atoms are in the primitive unit cells of graphite?
——————————————————————————————————
3.8.5
Hexagonal Close-Packed Structure
The hexagonal close-packed structure has hexagonal symmetry, and the space group is $P6_3/mmc$. It is composed of the hexagonal Bravais lattice, and has a basis composed of two atoms. The two identical atoms are positioned at $[0, 0, 0]$, which is at the vertex of the primitive lattice cell, and at $[\frac{1}{3}, \frac{1}{3}, \frac{1}{2}]$, as expressed in terms of the primitive lattice vectors. (The square brackets indicate that the directions in the direct lattice are specified with respect to the primitive lattice vectors.) The primitive lattice vectors are

$$
\begin{aligned}
\mathbf{a}_1 &= a\, \hat{e}_x \\
\mathbf{a}_2 &= \frac{a}{2} \left( \hat{e}_x + \sqrt{3}\, \hat{e}_y \right) \\
\mathbf{a}_3 &= c\, \hat{e}_z
\end{aligned}
\tag{36}
$$
Thus, the hexagonal close-packed structure has a basis of two atoms, one at $\mathbf{r}_1 = (0, 0, 0)$ and the other at

$$
\begin{aligned}
\mathbf{r}_2 &= \frac{1}{3}\, \mathbf{a}_1 + \frac{1}{3}\, \mathbf{a}_2 + \frac{1}{2}\, \mathbf{a}_3 \\
&= \frac{a}{2}\, \hat{e}_x + \frac{a}{2 \sqrt{3}}\, \hat{e}_y + \frac{c}{2}\, \hat{e}_z
\end{aligned}
\tag{37}
$$

Since $\mathbf{a}_1$ and $\mathbf{a}_2$ are inclined at an angle $\frac{\pi}{3}$, the structure can be considered to be formed by two interpenetrating simple hexagonal Bravais lattices. Alternately, the structure may be viewed as being formed by stacking two-dimensional triangular lattices above one another, with a separation between the layers of half the height of the unit cell. Each atom has 12 nearest neighbors: six within the hexagonal plane and three in each of the planes above and below the atom.

The name hexagonal close-packed comes from thinking of this structure as being formed from hard spheres of radius $r$ which form a close-packed hexagonal layer. The second layer is formed by stacking a second hexagonal layer of atoms above the first. However, the centers of the second layer of atoms are positioned above the dimples in the first layer. There are two sets of dimples between the atoms, so there are two different choices for placing the second
layer of atoms. The third layer is stacked such that the centers of the atoms are directly above the centers of the atoms of the first layer, and the fourth is stacked directly over the second layer, etc. Thus, there are two interpenetrating hexagonal lattices displaced by

$$
\frac{1}{3}\, \mathbf{a}_1 + \frac{1}{3}\, \mathbf{a}_2 + \frac{1}{2}\, \mathbf{a}_3
\tag{38}
$$

or $[\frac{1}{3}, \frac{1}{3}, \frac{1}{2}]$. There are a total of twelve nearest neighbor atoms, which are distributed as 6 neighbors in the plane, 3 in the plane above, and 3 in the plane below.

On assuming the radius of the atomic spheres to be $r$, the lattice constants satisfy $a = b = 2 r$ and $c = 4 \sqrt{\frac{2}{3}}\, r$. This yields the hexagonal close-packed structure, which has the particular ratio of the $c$ to the $a$ axis lengths of

$$
\frac{c}{a} = \sqrt{\frac{8}{3}} = 1.633
\tag{39}
$$

This is the ideal $c$ to $a$ ratio. Hexagonal close-packed systems with the ideal ratio have a packing fraction of 0.74. As atoms are not hard spheres, there is no reason for this value to be found in naturally occurring crystals, and deviations from the ideal value are found most frequently. Only He has the ideal $c$ to $a$ ratio.

The most frequently occurring structures are the close-packed structures. These are the hexagonal close-packed, face centered cubic and body centered cubic structures, which have packing fractions of 0.74, 0.74 and 0.68, respectively. Both simple and transition metals frequently form in the hexagonal close-packed structure, or other close-packed structures.
——————————————————————————————————
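The packing fractions of 0.74, 0.74 and 0.68 quoted above follow from the number of atoms per cell, the touching-sphere radius and the cell volume. A minimal sketch is given below, assuming hard spheres that touch along the nearest neighbor direction, $a = 1$, and the ideal $c/a$ ratio of equation (39) taken as given; the function name is hypothetical.

```python
import math

def packing_fraction(atoms_per_cell, sphere_radius, cell_volume):
    """Fraction of the cell volume occupied by hard spheres."""
    sphere_volume = (4.0 / 3.0) * math.pi * sphere_radius ** 3
    return atoms_per_cell * sphere_volume / cell_volume

a = 1.0  # lattice constant, set to 1 (assumption for illustration)

# f.c.c.: 4 atoms per cubic cell, spheres touch along the face diagonal.
fcc = packing_fraction(4, a / (2 * math.sqrt(2)), a ** 3)

# b.c.c.: 2 atoms per cubic cell, spheres touch along the body diagonal.
bcc = packing_fraction(2, math.sqrt(3) * a / 4, a ** 3)

# h.c.p.: 2 atoms per primitive cell of volume (sqrt(3)/2) a^2 c, with a = 2 r
# and the ideal c/a ratio of equation (39).
c = math.sqrt(8.0 / 3.0) * a
hcp = packing_fraction(2, a / 2, (math.sqrt(3) / 2) * a ** 2 * c)

print(round(hcp, 2), round(fcc, 2), round(bcc, 2))
```

The f.c.c. and ideal h.c.p. values coincide, as expected for two stackings of the same close-packed layers.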
3.8.6
Exercise 5
Show that the $\frac{c}{a}$ ratio for an ideal hexagonal close-packed lattice structure is

$$
\frac{c}{a} = \left( \frac{8}{3} \right)^{\frac{1}{2}}
\tag{40}
$$
——————————————————————————————————
3.8.7
Exercise 6
Na transforms from b.c.c. to h.c.p. at 23 K via a Martensitic transition. On assuming that the density remains constant and the h.c.p. structure is ideal, find the h.c.p. lattice constant $a$ in terms of the b.c.c. value $a_0$.
——————————————————————————————————
3.8.8
Other Close-Packed Structures
One can form other close-packed structures by altering the sequence of stacking of the close-packed layers. The hexagonal close-packed structure can be characterized by the repeated stacking sequence A - B - A - B etc. That is, the atoms in the planes above and below the triangular lattice have centers directly over the dimples and each other, thereby creating a two layer unit cell. Another stacking sequence is given by A - B - C, in which the unit cell consists of three layers. The A and C layers have the atoms centered on the two inequivalent sets of triangular dimples of the B layer. This close-packed stacking corresponds to the face centered cubic lattice. The packing fractions of the face centered cubic lattice and the hexagonal close-packed lattice are identical.

The triangular close-packed nets are the planes perpendicular to the body diagonal of the conventional f.c.c. unit cell. There are two such planes which pass through the conventional unit cell and two further planes that each just graze one vertex of the unit cell. In units of $a$, the intercepts of the first plane with the conventional (Cartesian) axes are $(1, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$. The next plane has intercepts $(2, 0, 0)$, $(0, 2, 0)$ and $(0, 0, 2)$. The sets of planes are known as $\{1, 1, 1\}$ planes and have triangular arrays of atoms, where the side of the triangle has length $\frac{a}{\sqrt{2}}$. The normals to the planes are in the direction $[1, 1, 1]$, i.e.

$$
\hat{n} = \frac{1}{\sqrt{3}} \left( \hat{e}_x + \hat{e}_y + \hat{e}_z \right)
\tag{41}
$$

where the $\hat{e}$ are the orthogonal unit vectors of the conventional cell. The equations of the planes are

$$
\left( \mathbf{r} - m\, a\, \hat{e}_x \right) \cdot \hat{n} = 0
\tag{42}
$$

where $m$ is an integer that labels the plane by the intercept with the $x$ axis. The quantity $m$ is related to the perpendicular distance, $s$, between the plane and the origin through

$$
s = m\, \frac{a}{\sqrt{3}}
\tag{43}
$$

for integer $m$.
It is convenient to introduce three new orthogonal unit vectors to describe the positions of the atoms in the planes. The first is $\hat{n}$, the normal to the planes

$$
\hat{n} = \frac{1}{\sqrt{3}} \left( \hat{e}_x + \hat{e}_y + \hat{e}_z \right)
\tag{44}
$$

The other vectors $\hat{e}_1$ and $\hat{e}_2$ are chosen to be vectors in the planes. These form a new set of Cartesian non-primitive lattice vectors which are defined by

$$
\hat{e}_1 = \frac{1}{\sqrt{2}} \left( \hat{e}_x - \hat{e}_y \right)
\tag{45}
$$

which corresponds to the face diagonal of the conventional unit cell that lies in the triangular plane, and

$$
\hat{e}_2 = \frac{1}{\sqrt{6}} \left( \hat{e}_x + \hat{e}_y - 2\, \hat{e}_z \right)
\tag{46}
$$

which is the "lateral" direction in the triangular plane. The lateral displacement between the atoms of one triangular plane, say the plane which passes through the atom at $a(\frac{1}{2}, \frac{1}{2}, 0)$, and the atoms on the next plane (centered on the origin $(0, 0, 0)$) can be written as

$$
\begin{aligned}
\Delta \mathbf{r} &= \frac{a}{2} \left( \hat{e}_x + \hat{e}_y \right) - \frac{a}{\sqrt{3}}\, \hat{n} \\
&= \frac{a}{6} \left( \hat{e}_x + \hat{e}_y - 2\, \hat{e}_z \right) \\
&= \frac{1}{\sqrt{3}}\, \frac{a}{\sqrt{2}}\, \hat{e}_2
\end{aligned}
\tag{47}
$$

This can be re-written as

$$
\Delta \mathbf{r} = \frac{2}{3} \left( \frac{\sqrt{3}}{2}\, \frac{a}{\sqrt{2}} \right) \hat{e}_2
\tag{48}
$$

as $\frac{a}{\sqrt{2}}$ is the triangular lattice constant and $\frac{\sqrt{3}}{2} \frac{a}{\sqrt{2}}$ is the height of the triangle. Thus, measured in units of the height of the triangle, the atoms in consecutive planes are displaced "laterally" by $0$, $\frac{2}{3}$ and $\frac{4}{3}$, and then the pattern repeats. The resulting structure has layers which have a stacking sequence A - B - C - A - B - C etc.

There are other possible stacking sequences, with longer periodicities. The earlier lanthanides and late actinides have a stacking sequence A - B - A - C with four layers per unit cell; however, the Sm structure only repeats itself after nine layers. The longest known periodicity is 594 layers, which is found in a polytype of SiC. The long-ranged crystallographic order is not due to long-ranged forces, but is caused by spiral steps caused by dislocations in the growth nucleus. There is also the possibility of random stacking sequences.
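The decomposition of the lateral displacement in equations (47) and (48) can be verified by projecting the position of the atom at $a(\frac{1}{2}, \frac{1}{2}, 0)$ onto the three unit vectors $\hat{n}$, $\hat{e}_1$ and $\hat{e}_2$. The following is a small sketch with $a = 1$ and hypothetical helper names, intended only as a numerical check.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def scale(s, v):
    return tuple(s * x for x in v)

def subtract(u, v):
    return tuple(x - y for x, y in zip(u, v))

a = 1.0  # cubic lattice constant, set to 1 (assumption for illustration)

# Unit vectors of equations (44), (45) and (46).
n_hat = scale(1 / math.sqrt(3), (1.0, 1.0, 1.0))
e1 = scale(1 / math.sqrt(2), (1.0, -1.0, 0.0))
e2 = scale(1 / math.sqrt(6), (1.0, 1.0, -2.0))

atom = (a / 2, a / 2, 0.0)

# Perpendicular distance of the atom's plane from the plane through the origin.
spacing = dot(atom, n_hat)

# Remove the component along the normal to obtain the lateral displacement.
delta = subtract(atom, scale(spacing, n_hat))
along_e1 = dot(delta, e1)
along_e2 = dot(delta, e2)

triangle_height = (math.sqrt(3) / 2) * (a / math.sqrt(2))
shift_in_heights = along_e2 / triangle_height

print(spacing, along_e1, along_e2, shift_in_heights)
```

The plane spacing should equal $\frac{a}{\sqrt{3}}$, the component along $\hat{e}_1$ should vanish, and the lateral shift should be $\frac{2}{3}$ of the triangle height.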
3.8.9
Sodium Chloride Structure
The Sodium Chloride or NaCl structure is cubic. The space group is $Fm\bar{3}m$. It has an ordered array of Na and Cl ions located on the sites of a simple cubic lattice of linear dimension $\frac{a}{2}$. Each type of ion is surrounded by six ions of the opposite charge, located at a distance $\frac{a}{2}$ away. The twelve next nearest neighbors have like charge and are located at a distance $\frac{1}{\sqrt{2}} a$ away along the face diagonals of the cubic unit cell. There are four units of NaCl in the unit cell. The structure may be most efficiently visualized as having the Na$^+$ ions located on the sites of a face centered cubic lattice with a vertex at (0, 0, 0), and the Cl$^-$ ions located on a face centered cubic lattice with a vertex at the center of the cubic unit cell $(\frac{1}{2}, \frac{1}{2}, \frac{1}{2})$.

The Sodium Chloride structure is favored by many ionic compounds. In this structure, the electrostatic interactions are balanced by the short ranged repulsive interactions due to the finite size of the ions. The short ranged repulsions are due to the Pauli exclusion principle. The sizes of the ions are important in determining the stability of this structure. If the ions of opposite charge are envisaged as just touching, then the ionic radii must satisfy the equality
$$ a = 2 \left[ r(Na^+) + r(Cl^-) \right] \tag{49} $$
Ions of the same type are closest along the face diagonals, so if they do not touch, the lattice constant satisfies the inequality
$$ \frac{1}{\sqrt{2}} a > 2 r(Cl^-) \tag{50} $$
Combining the above two equations yields an inequality for the ratio of the ionic radii of the ions
$$ \frac{r(Cl^-)}{r(Na^+)} \le 1 + \sqrt{2} \tag{51} $$
If this inequality is not obeyed, the Pauli forces render the structure unstable. Examples of materials that form in the NaCl structure are the alkali halides made from the alkali elements Li, Na, K, Rb or Cs with a halide element F, Cl, Br or I. Alternatively, one can go to the next columns of the periodic table and combine Mg, Ca, Sr or Ba with a chalcogen O, S, Se or Te to form the NaCl structure.
3.8.10 Cesium Chloride Structure
The ionic compound Cesium Chloride or CsCl has a cubic structure. The space group is $Pm\bar{3}m$. The Cs$^+$ ion is located at (0, 0, 0) and the Cl$^-$ ion at the body center of the cube $(\frac{1}{2}, \frac{1}{2}, \frac{1}{2})$. Thus, the CsCl structure resembles a body centered cubic structure in which one type of atom is at the simple cubic sites and the other type of atom is at the body center. Each ion is surrounded by eight ions of opposite charge located at a distance $\frac{\sqrt{3}}{2} a$ away, which corresponds to half the length of the body diagonal of the cube. Each ion has six neighbors of like charge located a distance $a$ away. The ratio of the ionic radii required for this structure to be possible is
$$ \frac{r(Cl^-)}{r(Cs^+)} \le \frac{( \sqrt{3} + 1 )}{2} \tag{52} $$
If the radius ratio is greater than 1.366, but less than 2.42, ionic compounds prefer the NaCl structure. Examples of compounds that form the CsCl structure are the Cs halides, Tl halides, CuZn (beta brass), CuPd, AgMg and LiHg.

Linus Pauling has produced a set of empirical rules which determine the coordination numbers in terms of the ionic radii of the ions. If one assumes that the anion adopts the cubic close-packed structure (f.c.c.), there are three types of holes between the close-packed spheres and each type of hole has a different size. It is assumed that the cations fit into one set of holes. The central site of the conventional f.c.c. unit cell is surrounded by an octahedron and, therefore, has a coordination number of 6. There are also tetrahedral holes with coordination number 4. The tetrahedral holes are located near the 8 corners of the f.c.c. cube, and the vertices of the tetrahedra are located at the corner and the three neighboring face centers. The tetrahedral holes are best seen by considering an octant of the f.c.c. cube. The tetrahedral hole site is at the center of the octant, and the four vertices of the tetrahedron are located at four of the octant's corners. There are 12 trigonal holes which are located near the 8 vertices of the conventional unit cell. The trigonal sites lie in the plane formed by the vertex and any two of the closest face centers.
The radius ratio rule suggests that the structure is determined by maximizing the coordination numbers while keeping ions of opposite charge in contact. This procedure seems likely to maximize the electrostatic attraction energy. By considering the geometry of the holes, one expects that certain structures will be stable for different values of the radius ratio
$$ r_r = \frac{r(X^-)}{r(R^+)} \tag{53} $$
For the tetrahedral sites, by considering the body diagonal of the octant, one expects that
$$ 2 \left[ r(X^-) + r(R^+) \right] = \frac{\sqrt{3}}{2} a \tag{54} $$
and by considering the face diagonal
$$ 2 r(X^-) < \frac{a}{\sqrt{2}} \tag{55} $$
Hence, we find the tetrahedral hole has the limiting radius ratio of
$$ 2 \left( \sqrt{\frac{3}{2}} + 1 \right) > \frac{r(X^-)}{r(R^+)} \tag{56} $$
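In particular, the radius ratio rules suggest that the ranges of radius ratios in which the various configurations are stable are given by
$$ \begin{array}{lll} 6.45 > r_r > 4.45 & \text{trigonal} & 3 \\ 4.45 > r_r > 2.41 & \text{tetrahedral} & 4 \\ 2.41 > r_r > 1.37 & \text{octahedral} & 6 \end{array} \tag{57} $$
where the last column gives the coordination number.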
If the atoms have comparable sizes, then it is necessary to consider more open structures with higher coordination numbers, such as simple cubic. For the simple cubic structure, the hole has coordination number 8 and the hole size is larger, so that $1.37 > r_r$. Thus, since $r_r \sim 1.8$ for Na and Cl, it fits the radius ratio rules as being octahedrally coordinated, as in the NaCl structure. On the other hand, for Cs and Cl, where the ions have comparable sizes, the radius ratio is $r_r \sim 1.07$, which is compatible with the cubic hole structure found in CsCl.
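The radius ratio rules can be sketched as a small lookup. This is only an illustration of the hard-sphere geometry described above: the function name `predicted_hole` and the table `LIMITS` are hypothetical, and the trigonal limit $1/(2/\sqrt{3} - 1) \approx 6.46$ is an assumed closed form for the value quoted as 6.45 in the table.

```python
import math

# Upper limits of r_r = r(X-)/r(R+) below which each type of hole can be
# occupied with unlike ions in contact (hard-sphere geometry).
# Each entry is (hole type, coordination number, upper limit of r_r).
LIMITS = [
    ("cubic", 8, (math.sqrt(3) + 1) / 2),           # about 1.37
    ("octahedral", 6, 1 + math.sqrt(2)),            # about 2.41
    ("tetrahedral", 4, 2 * (math.sqrt(1.5) + 1)),   # about 4.45
    ("trigonal", 3, 1 / (2 / math.sqrt(3) - 1)),    # about 6.46 (assumed form)
]


def predicted_hole(rr):
    """Return (hole type, coordination number) with the highest coordination
    number allowed for the radius ratio rr, or None if rr exceeds all limits."""
    for name, coordination, upper in LIMITS:
        if rr < upper:
            return name, coordination
    return None


print(predicted_hole(1.8))    # radius ratio quoted for Na and Cl
print(predicted_hole(1.07))   # radius ratio quoted for Cs and Cl
```

With the ratios quoted in the text, the lookup places NaCl in the octahedral range and CsCl in the cubic range.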
3.8.11 Fluorite Structure
Fluorite or CaF$_2$ has a cubic structure. The space group is $Fm\bar{3}m$. Ionic compounds of the form RX$_2$, in which the ionic radii satisfy the inequality
$$ \frac{r(X^-)}{r(R^{2+})} \le \frac{( \sqrt{3} + 1 )}{2} \tag{58} $$
can form the fluorite structure. The unit cell has four Ca$^{2+}$ ions, one at the origin and the others located at the face centers of the cube. The eight F$^-$ ions are interior to the cube. The F$^-$ ions form simple cubes which are concentric with the unit cells, but the simple cubes have only half the lattice spacing of the unit cell. Alternatively, the eight F$^-$ ions can be considered to lie on two interpenetrating f.c.c. lattices with origins $(\frac{1}{4}, \frac{1}{4}, \frac{1}{4})$ and $(\frac{3}{4}, \frac{3}{4}, \frac{3}{4})$. Each F anion occupies a site at the center of a tetrahedron formed by the Ca cations. Materials such as Li$_2$O form an anti-fluorite structure. The anti-fluorite structure is the same as the fluorite structure except that the positions of the anions and cations are reversed. The O anions are in the f.c.c. positions and the Li cations form a simple cubic array.
3.8.12 The Copper Three Gold Structure
The Cu$_3$Au structure is cubic, and has the space group $Pm\bar{3}m$. The Bravais lattice corresponds to a primitive cubic structure. There are 3 Cu atoms and one Au atom per unit cell. All the atoms are located on the sites of a face centered cubic unit cell. The Au atom can be envisaged as being positioned on the corners of the cube, whereas the 3 Cu atoms sit on the centers of the faces of the cube, forming octahedra. Thus, the basis of the structure consists of the position of the Au atom
$$ \mathbf{r}_0 = 0 \tag{59} $$
and the three Cu atoms are located at
$$ \begin{aligned} \mathbf{r}_1 &= \frac{a}{2} ( \hat{e}_y + \hat{e}_z ) \\ \mathbf{r}_2 &= \frac{a}{2} ( \hat{e}_x + \hat{e}_z ) \\ \mathbf{r}_3 &= \frac{a}{2} ( \hat{e}_x + \hat{e}_y ) \end{aligned} \tag{60} $$
The Au atoms have 12 Cu nearest neighbors located at a distance $\frac{a}{\sqrt{2}}$, whereas the Cu atoms only have 4 Au nearest neighbors.

Other compounds with the Cu$_3$Au structure are Ni$_3$Al, TiPt$_3$ and the metastable compound Al$_3$Li.
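The neighbor counts quoted for Cu$_3$Au can be checked numerically from the basis of equations (59) and (60). The sketch below is illustrative only: the function name `nearest_neighbours` is hypothetical, positions are in units of the lattice constant, and only the 27 surrounding unit cells are searched, which is assumed to be enough to find the nearest shell.

```python
import itertools
import math

a = 1.0  # cubic lattice constant (arbitrary units)

# Basis of Cu3Au in units of a, taken from equations (59) and (60).
basis = {
    "Au": [(0.0, 0.0, 0.0)],
    "Cu": [(0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)],
}


def nearest_neighbours(position):
    """Return (distance, {species: count}) for the nearest-neighbour shell of
    the site at `position`, searching the 27 surrounding unit cells."""
    found = []
    for shift in itertools.product((-1, 0, 1), repeat=3):
        for species, sites in basis.items():
            for site in sites:
                delta = [a * (site[i] + shift[i] - position[i]) for i in range(3)]
                dist = math.sqrt(sum(x * x for x in delta))
                if dist > 1e-9:          # skip the atom itself
                    found.append((dist, species))
    d_min = min(d for d, _ in found)
    counts = {}
    for d, species in found:
        if abs(d - d_min) < 1e-9:
            counts[species] = counts.get(species, 0) + 1
    return d_min, counts


d_au, shell_au = nearest_neighbours((0.0, 0.0, 0.0))   # an Au site
d_cu, shell_cu = nearest_neighbours((0.0, 0.5, 0.5))   # a Cu site
print(d_au, shell_au)
print(d_cu, shell_cu)
```

The Cu site also has Cu neighbors at the same distance $a/\sqrt{2}$, which is why the text specifies that it has only 4 Au nearest neighbors.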
3.8.13 Rutile Structure
The structure possessed by rutile, TiO$_2$, by cassiterite, SnO$_2$, and by numerous other substances with small cations, is tetragonal. The space group is $P4_2/mnm$. The Ti$^{4+}$ ions occupy the positions (0, 0, 0) ; $(\frac{1}{2}, \frac{1}{2}, \frac{1}{2})$, while the O$^{2-}$ ions occupy the four positions $\pm (x, x, 0)$ ; $\pm (\frac{1}{2} + x, \frac{1}{2} - x, \frac{1}{2})$ where $x \approx \frac{3}{10}$. Thus, the titanium atoms occupy the sites of a body centered tetragonal lattice. The oxygen atoms lie on lines which are oriented along one set of face diagonals of the base. The atoms are also located on horizontal lines through the body centers, and are orthogonal to the lines in the base. The titanium ion is surrounded by six O atoms which form a slightly distorted octahedron.
3.8.14 Zinc Blende Structure
The Zinc Blende or ZnS structure is cubic. This is also known as the Sphalerite structure. The space group is $F\bar{4}3m$. The Zn$^{2+}$ ions are positioned at (0, 0, 0) and the face centers of the cube. The S$^{2-}$ ions are positioned on an interpenetrating face centered cubic lattice with origin $(\frac{1}{4}, \frac{1}{4}, \frac{1}{4})$. There are four units of ZnS in the unit cell. The Zinc Blende structure is related to the diamond structure, except that Zinc Blende involves two different types of atoms. Each atom in ZnS is surrounded by a regular tetrahedron of atoms of the opposite type. Unlike diamond, Zinc Blende has no center of inversion, as the diamond inversion operator interchanges the two different types of atoms. The radius ratio rules suggest that this structure will be adopted whenever
$$ 2 \left( \sqrt{\frac{3}{2}} + 1 \right) > r_r > 1 + \sqrt{2} \tag{61} $$
The Zinc Blende structure is often found for binary compounds formed from pairs of elements from either the II - VI columns, III - V columns or the I - VII columns of the periodic table.
3.8.15 Zincite Structure
Zincite, ZnO, has a hexagonal structure. This structure is also known as the Wurtzite structure. The space group is $P6_3mc$. The primitive lattice vectors are given by
$$ \begin{aligned} \mathbf{a}_1 &= a \hat{e}_x \\ \mathbf{a}_2 &= \frac{a}{2} ( \hat{e}_x + \sqrt{3} \hat{e}_y ) \\ \mathbf{a}_3 &= c \hat{e}_z \end{aligned} \tag{62} $$
where $c$ denotes the lattice constant along the hexagonal axis. The Zn and O atoms occupy the positions $[0, 0, z]$ ; $[\frac{2}{3}, \frac{2}{3}, \frac{1}{2} + z]$ where $z = 0$ for Zn and $z \approx \frac{3}{8}$ for O. Since ZnS also is found in this form above 1300 K, it is not surprising that the Zincite structure has a local coordination similar to that of the low-temperature Zinc Blende structure. Each atom is surrounded by a tetrahedron of atoms of the opposite type. The tetrahedra form continuous interconnected networks. However, symmetry does not require that the tetrahedra are regular.

The cubic Zinc Blende and the hexagonal Wurtzite structures are closely related. They merely differ by the stacking sequence of the Zn (S) close-packed planes. The structure consists of alternate close-packed planes which either contain only Zn or only S ions. The set of planes form layers consisting of a pair of planes. In a layer, the Zn atoms in one plane and the S atoms in the other plane are bonded by vertical tetrahedral bonds. The remaining three tetrahedral bonds join the atoms in the successive layers. Due to the orientation of the inter-layer tetrahedral bonds, successive pairs of planes are displaced horizontally. Thus, the successive sets of vertical bonds are displaced horizontally. In the cubic Zinc Blende sequence, the tetrahedra of the S atom bonds have the same rotational orientation in each layer, so that each S layer is displaced in the same direction. The net horizontal displacement produced in three vertical S layers is equal to the periodicity in the direction of the displacement. This can be considered as having a stacking sequence A - B - C which repeats. In the hexagonal Wurtzite sequence, the tetrahedra of bonds are rotated by $\pi$ between successive S layers. Thus, the horizontal displacement that occurs between one S layer and the next is cancelled by the opposite displacement that occurs on going to the very next S layer. This stacking sequence is A - B which repeats.
3.8.16 The Perovskite Structure
The perovskite structure, as exemplified by BaTiO$_3$, is cubic at high temperatures but becomes slightly tetragonal on cooling below a ferro-electric transition temperature. The cubic structure has the space group $Pm\bar{3}m$. The structure is composed of the Ti atoms positioned on the simple cubic lattice sites (0, 0, 0), and the Ba atoms positioned at the body center sites $(\frac{1}{2}, \frac{1}{2}, \frac{1}{2})$. The three O atoms are located at the mid-points of the edges of the cube, i.e. at $(0, 0, \frac{1}{2})$, $(0, \frac{1}{2}, 0)$ and $(\frac{1}{2}, 0, 0)$. An alternate representation of the unit cell is found by centering the lattice on the Ba ions, by translating the origin via $\frac{1}{2} (1, 1, 1)$. In this representation, the Ti atoms are located at the body centers, and the O atoms lie on the face centers. The TiO$_2$ units form a set of parallel planes separated by planes of BaO. Each Ti atom is surrounded by an octahedron of O atoms, which has corners that are shared with the octahedra surrounding the neighboring Ti atoms.
——————————————————————————————————
3.8.17 Exercise 7
The density of the face centered cubic structure is highest, body centered cubic is the next largest, followed by simple cubic and then diamond has the lowest density. This correlates with the coordination numbers. The coordination number is defined to be the number of nearest neighbors. The coordination numbers are 12 for the f.c.c. lattice, 8 for b.c.c., 6 for s.c. and 4 for diamond. Assume that the atoms are hard spheres that just touch. Find the packing fraction or density of these materials.
——————————————————————————————————
3.9 Lattice Planes
A Bravais lattice plane, by definition, passes through three non-collinear Bravais lattice points. Since these points are connected by combinations of multiples of the primitive lattice vectors, and due to the periodic translational symmetry of the lattice, the lattice planes must contain an infinite number of lattice points. Given one such lattice plane, there exists a family consisting of an infinite set of parallel lattice planes with the same normal. One such lattice plane must pass through each Bravais lattice point, since the lattice viewed from any lattice point is identical to the lattice when viewed from any other lattice point. Thus, the family of parallel planes contains all the points of the Bravais lattice.

Each member of the set of lattice planes must intersect the axes given by the primitive lattice vectors $\mathbf{a}_1$, $\mathbf{a}_2$ and $\mathbf{a}_3$. The planes need not intersect any particular axis at a lattice point, however, every lattice point on the three axes will have one member of the family pass through it. In particular, one plane must pass through the origin O. Each plane is uniquely specified by the three intercepts of the plane with the axes formed by three primitive lattice vectors directed from the origin to the Bravais lattice points $\mathbf{a}_1$, $\mathbf{a}_2$ and $\mathbf{a}_3$. The intercepts $x_1$, $x_2$ and $x_3$ are measured in units of the length of the primitive lattice vectors. That is, the intercepts are $x_1 \mathbf{a}_1$, $x_2 \mathbf{a}_2$ and $x_3 \mathbf{a}_3$. The three points of intersection between one lattice plane and the three primitive axes can be represented as $\kappa [\frac{1}{h_1}, 0, 0]$, $\kappa [0, \frac{1}{h_2}, 0]$ and $\kappa [0, 0, \frac{1}{h_3}]$, where $\kappa$ is a positive or negative integer, and $(h_1, h_2, h_3)$ are also positive or negative integers. The integers $(h_1, h_2, h_3)$ are chosen such that they have no common factors. The index $\kappa$ serves to distinguish between the different members of the same family of planes. The plane that passes through the origin has $\kappa = 0$, whereas the plane that passes next closest to the origin has $\kappa = 1$. The planes that are at successively further distances from the origin have larger magnitudes of $\kappa$.

The indices $(h_1, h_2, h_3)$ are found by locating the intercepts of the plane with the three primitive axes, say $x_1 \mathbf{a}_1$, $x_2 \mathbf{a}_2$ and $x_3 \mathbf{a}_3$, inverting the intercepts to obtain $\frac{1}{x_1}$, $\frac{1}{x_2}$, $\frac{1}{x_3}$, and then finding the smallest three integers which have the same ratio
$$ \frac{1}{x_1} : \frac{1}{x_2} : \frac{1}{x_3} = h_1 : h_2 : h_3 \tag{63} $$
The set of integers $(h_1, h_2, h_3)$ are enclosed in round brackets and denote the Miller indices of the plane. A negative valued integer, such as $- h_1$, is denoted by an overbar such as $\bar{h}_1$. The Miller indices label the direction of the normal to the family of planes. Since the vectors between pairs of intercepts lie in the plane, the three vectors
$$ \frac{1}{h_1} \mathbf{a}_1 - \frac{1}{h_2} \mathbf{a}_2 , \qquad \frac{1}{h_2} \mathbf{a}_2 - \frac{1}{h_3} \mathbf{a}_3 , \qquad \frac{1}{h_3} \mathbf{a}_3 - \frac{1}{h_1} \mathbf{a}_1 \tag{64} $$
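are parallel to the plane. Any two of these vectors span the plane, so the third vector is not independent. The normal to the plane is parallel to the vector product of any two non-collinear vectors in the plane
$$ \hat{n} \propto \kappa^2 \left( \frac{1}{h_1} \mathbf{a}_1 - \frac{1}{h_2} \mathbf{a}_2 \right) \wedge \left( \frac{1}{h_2} \mathbf{a}_2 - \frac{1}{h_3} \mathbf{a}_3 \right) = \frac{\kappa^2}{h_1 h_2 h_3} \left[ h_3 \, \mathbf{a}_1 \wedge \mathbf{a}_2 + h_2 \, \mathbf{a}_3 \wedge \mathbf{a}_1 + h_1 \, \mathbf{a}_2 \wedge \mathbf{a}_3 \right] \tag{65} $$
Thus, the direction of the normal to the plane is given in terms of the components $h_i$ in the three directions defined by $\mathbf{a}_j \wedge \mathbf{a}_k$. The three vectors have the same directions as the primitive "reciprocal lattice vectors". The primitive reciprocal lattice vectors are defined by
$$ \mathbf{b}_1 = 2 \pi \frac{\mathbf{a}_2 \wedge \mathbf{a}_3}{\mathbf{a}_1 \cdot ( \mathbf{a}_2 \wedge \mathbf{a}_3 )} \tag{66} $$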
and cyclic permutations of the set (1, 2, 3). These primitive reciprocal lattice vectors are, in general, not orthogonal. The normal to the plane is then given by the direction of the reciprocal lattice vector $\mathbf{B}_h$
$$ \mathbf{B}_h = h_1 \mathbf{b}_1 + h_2 \mathbf{b}_2 + h_3 \mathbf{b}_3 \tag{67} $$
where $(h_1, h_2, h_3)$ are the Miller indices. The length of this reciprocal lattice vector is written as
$$ | \mathbf{B}_h |^2 = | h_1 \mathbf{b}_1 + h_2 \mathbf{b}_2 + h_3 \mathbf{b}_3 |^2 = \left( \frac{2 \pi}{d_h} \right)^2 \tag{68} $$
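This is seen through the following consideration: The equation for the points $\mathbf{r}$ on the plane which intercepts the primitive lattice vectors $\mathbf{a}_i$ at distances $x_i = \frac{\kappa}{h_i}$ is given by
$$ \mathbf{r} \cdot \mathbf{B}_h = \frac{\kappa}{h_1} \mathbf{a}_1 \cdot \mathbf{B}_h = \frac{\kappa}{h_1} \mathbf{a}_1 \cdot h_1 \mathbf{b}_1 = 2 \pi \kappa \tag{69} $$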
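The minimum distance, $s$, between the origin and the plane is given by
$$ s = \mathbf{r} \cdot \frac{\mathbf{B}_h}{| \mathbf{B}_h |} = \mathbf{r} \cdot \mathbf{B}_h \, \frac{d_h}{2 \pi} = \kappa \, d_h \tag{70} $$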
Thus, it is found that the spacing between successive planes in the family is given by $d_h$ (the value of $s$ for $\kappa = 1$), and the planes are equidistant.
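The recipe for the Miller indices (equation (63)) and the relation between the plane spacing and the reciprocal lattice vector (equations (66) to (68)) can be sketched numerically. The helper names below (`miller_indices`, `reciprocal_vectors`, `plane_spacing`) are hypothetical, and the sketch assumes finite, non-zero intercepts given as integers or exact fractions; planes parallel to an axis are not handled.

```python
import math
from fractions import Fraction
from functools import reduce


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def miller_indices(x1, x2, x3):
    """Smallest integers h1:h2:h3 in the ratio 1/x1 : 1/x2 : 1/x3 (eq. 63)."""
    inverses = [1 / Fraction(x) for x in (x1, x2, x3)]
    common = 1
    for f in inverses:  # least common multiple of the denominators
        common = common * f.denominator // math.gcd(common, f.denominator)
    integers = [int(f * common) for f in inverses]
    divisor = reduce(math.gcd, integers)
    return tuple(i // divisor for i in integers)


def reciprocal_vectors(a1, a2, a3):
    """Primitive reciprocal lattice vectors b1, b2, b3 (eq. 66 and cyclic)."""
    scale = 2 * math.pi / dot(a1, cross(a2, a3))
    return (tuple(scale * x for x in cross(a2, a3)),
            tuple(scale * x for x in cross(a3, a1)),
            tuple(scale * x for x in cross(a1, a2)))


def plane_spacing(h, a1, a2, a3):
    """Spacing d_h = 2 pi / |B_h| of the family of planes with indices h."""
    b = reciprocal_vectors(a1, a2, a3)
    B = [sum(h[i] * b[i][j] for i in range(3)) for j in range(3)]
    return 2 * math.pi / math.sqrt(dot(B, B))


# Simple cubic lattice with unit lattice constant.
sc = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
print(miller_indices(2, 3, 1))
print(plane_spacing((1, 1, 0), *sc), plane_spacing((1, 1, 1), *sc))
```

For the simple cubic lattice the spacings reduce to $a / \sqrt{h_1^2 + h_2^2 + h_3^2}$, which provides a quick check of the general expression.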
Sets of families of planes that are equivalent in a given crystal structure are denoted by $\{h, k, l\}$. For example, in a cubic crystal the families of planes (1, 0, 0), (0, 1, 0) and (0, 0, 1) are equivalent and are denoted by $\{1, 0, 0\}$. The direction of a vector in the direct lattice is specified by three integers in square brackets $[n_1, n_2, n_3]$, which specify the vector
$$ n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3 \tag{71} $$
A negative value for a component is also denoted by an overbar. The set of directions which are equivalent for a crystal structure are denoted by $< n_1, n_2, n_3 >$.
——————————————————————————————————
3.9.1 Exercise 8
Consider the planes (1, 0, 0) and (0, 0, 1) for a f.c.c. lattice with axes described by the conventional unit cell. What are the indices of the planes when referred to the primitive axes?
——————————————————————————————————
3.9.2 Exercise 9
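Show that the angles $\alpha_1$ $(\angle \mathbf{a}_2, \mathbf{a}_3)$, $\alpha_2$ $(\angle \mathbf{a}_3, \mathbf{a}_1)$ and $\alpha_3$ $(\angle \mathbf{a}_1, \mathbf{a}_2)$ between the three primitive lattice vectors of the direct lattice, $\mathbf{a}_i$, are related to the angles between the three primitive lattice vectors of the reciprocal lattice, $\mathbf{b}_i$, $\beta_1$ $(\angle \mathbf{b}_2, \mathbf{b}_3)$, $\beta_2$ $(\angle \mathbf{b}_3, \mathbf{b}_1)$ and $\beta_3$ $(\angle \mathbf{b}_1, \mathbf{b}_2)$ via
$$ \cos \alpha_1 = \frac{\cos \beta_2 \cos \beta_3 - \cos \beta_1}{| \sin \beta_2 \sin \beta_3 |} \tag{72} $$
and also find the inverse relation.
——————————————————————————————————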
3.10 Quasi-Crystals
Quasi-crystals have symmetries intermediate between a crystal and a liquid. Quasi-crystals are usually intermetallic alloys. The quasi-crystal is space filling but, unlike a regular Bravais lattice, does not have just one unit cell. These different "unit cells" are stacked in such a way that there is no long-ranged positional order, but orientational order is nevertheless retained. The absence of long-ranged positional order lifts the restriction on the symmetry of the lattice but puts a restriction on the vectors that describe the "unit cells". For example, an Al-Mn quasi-crystal (Shechtman, Blech, Gratias and Cahn, Phys. Rev. Lett. 53, 1951 (1984)) has icosahedral symmetry, with two, three and five-fold axes. The structure is made from blocks consisting of a central Mn atom surrounded by 12 Al atoms arranged at the corners of an icosahedron. This type of icosahedral structure is often the arrangement of 13 atoms which has the lowest energy (F.C. Frank, Proc. Roy. Soc. London, 215, 43 (1952)). The icosahedra are stacked together with the same orientation. The voids are formed with the second structural unit. The five-fold symmetry of the icosahedra is not allowed for a regular Bravais Lattice. The five-fold point group symmetry imposes a restriction on the lengths of the "lattice vectors" of a quasi-crystal to have certain irrational ratios. Thus, the reciprocal lattice contains reciprocal lattice vectors of arbitrarily small magnitude which show up as an extremely high density of Bragg reflections (Levine and Steinhardt, Phys. Rev. Lett. 53, 2477 (1984)).

A way of obtaining quasi-crystal structures is by projecting a periodic Bravais lattice structure in higher dimensions (six or more) onto three dimensions (P. Kramer and R. Neri, Acta. Crystallogr. Sec. A 40, 580 (1984)). To illustrate this, consider a square two-dimensional lattice, with lattice constant $a$. On any unit cell, construct two parallel lines with slope $\tan \theta$ passing through opposite corners. The equation of the lower line is given by
$$ y = x \tan \theta \tag{73} $$
and the upper line is determined by
$$ y = a + ( x + a ) \tan \theta \tag{74} $$
For rational values of the slope, $\tan \theta = \frac{p}{q}$, the line crosses lattice points periodically, with repeat distance $q a$ along the x direction and periodicity $p a$ along the y direction. Lines with irrational values of the slope cannot cross more than one lattice point and, therefore, do not have periodic long-ranged order. The points $(n a, m a)$ contained in the area between the two lines satisfy the inequality
$$ 1 + ( n + 1 ) \tan \theta > m > n \tan \theta \tag{75} $$
Project the lattice points contained within the strip onto one of the lines. The distance $s$ along the lower line is given by
$$ s = n a \cos \theta + m a \sin \theta \tag{76} $$
where
$$ m = \sum_{m'} m' \, \Theta( 1 + ( n + 1 ) \tan \theta - m' ) \, \Theta( m' - n \tan \theta ) \tag{77} $$
For irrational values of the slope, the resulting array of points is a quasi-periodic array. The spacing between consecutive points of the quasi-periodic array is either given by $a \cos \theta$ or $a \sin \theta$. The spacings are not distributed periodically, but nevertheless, are distributed according to some irregular or more complex pattern. If the slope of the lines is equal to $\frac{1}{2} ( \sqrt{5} - 1 )$, the array of projected points is a Fibonacci series. For a Fibonacci series of numbers, the first terms can be chosen in any way, but each subsequent term is given by the sum of the preceding two numbers, i.e., $F_{n+1} = F_n + F_{n-1}$. Thus, both the series 1, 1, 2, 3, 5, 8, 13 etc. and 3, 3, 6, 9, 15 etc. are Fibonacci series. The golden mean $\frac{1}{2} ( \sqrt{5} + 1 )$ is the limit of the ratio of the successive terms. In our example, the sequence of spacings is given by s c s c c s c s c . . .. The first element of the Fibonacci series is s, the second element is c, the third element consists of s c, the next element is c s c, which is followed by s c c s c etc. If this type of analysis is applied to high dimensional Bravais lattices, one can find three dimensional quasi-crystal structures with five-fold symmetry.

A five-fold symmetry is also found when tiling a two-dimensional plane with two types of tiles, both having the same length of edge $s$, but with angles of $\frac{2 \pi}{5}$ or $\frac{\pi}{5}$. The "diameter" to side ratios of these two types of tiles satisfy $\frac{s}{d'} = \frac{\sqrt{5} - 1}{2} = \frac{d}{s}$. The sides of the tiles are marked and the tiles are adjoined so that the markings match (Gardner, Scientific American, 236, 110 (1977)). The result is a tiling without long-ranged periodic order, although every finite area segment repeats an infinite number of times in the plane. These types of tilings are known as Penrose tilings. The Penrose tiling has long-ranged orientational order, as can be seen by decorating each tile with lines. The lines on the tiles join up to form five sets of parallel lines (Ammann lines). The five sets of lines make an angle of $\frac{2 \pi}{5}$ with respect to each other. The spacings between the successive members of a set form a Fibonacci series.
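The strip projection of equations (75) and (76) is easy to try numerically. The following is a minimal sketch, with lengths measured in units of $a$; the function name `strip_projection` is hypothetical, and the columns are restricted to $n \ge 1$ as an assumption, so that the lattice points lying exactly on the two bounding lines (near the origin) are avoided.

```python
import math


def strip_projection(n_max, slope):
    """Project the square-lattice points (n, m), 1 <= n <= n_max, lying in the
    strip n*slope < m < 1 + (n + 1)*slope onto the lower line.
    Returns the spacings between consecutive projected points together with
    cos(theta) and sin(theta). Lengths are in units of a."""
    theta = math.atan(slope)
    c, s = math.cos(theta), math.sin(theta)
    positions = []
    for n in range(1, n_max + 1):
        upper = 1 + (n + 1) * slope        # m must lie below this value
        m = math.floor(n * slope) + 1      # smallest integer above n*slope
        while m < upper:
            positions.append(n * c + m * s)   # eq. (76) with a = 1
            m += 1
    positions.sort()
    spacings = [q - p for p, q in zip(positions, positions[1:])]
    return spacings, c, s


slope = (math.sqrt(5) - 1) / 2             # irrational slope used in the text
spacings, c, s = strip_projection(200, slope)
labels = "".join("c" if abs(d - c) < 1e-9 else "s" for d in spacings)
print(labels[:21])
print(labels.count("c") / labels.count("s"))   # approaches the golden mean
```

Only the two spacings $\cos \theta$ and $\sin \theta$ occur, and the ratio of their numbers of occurrences approaches the golden mean as the strip is extended, although no finite period appears.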
4 Structure Determination
Structure can be determined by experiments in which beams of particles are scattered from the structure. Elastic scattering experiments are usually preferred as the underlying lattices are not dynamically deformed by the process. In order that the results be easily interpretable in terms of the structure, it is necessary that the wave length associated with the beam of particles should have the same order of magnitude as the spacing between atoms in the structure and secondly, the beam of particles should only interact weakly with the structure. The first condition allows for a clear resolution of diffraction peaks caused by the atomic structure. The second condition ensures that the beam is scattered primarily in the bulk or interior of the material, and not just the surface. It also allows for an easy interpretation of the data via second order perturbation theory.
4.1 X Ray Scattering
X-rays are usually used in the determination of the atomic structure of solids. The strength of the interaction is measured by the deviation of the dielectric constant from its vacuum value ( 1 ). At energies of about 10 keV, the wave length of the x-rays $\lambda$ is $\sim 10^{-10}$ m, and at these high energies the refractive index is almost unity. In x-ray diffraction, the x-rays are elastically scattered from the charge density of the electrons. The formal theory of x-ray scattering shows that the intensity of the reflected waves is given by the Fourier Transform of the electron density - density correlation function. For a solid which possesses long-ranged order, the resulting expression for the intensity can be simplified down to involve the square of the Fourier transform of the electron density. In order to elucidate the role of the Bravais lattice and the coherent nature of x-ray scattering, the atoms shall first be considered to be point like objects. Later, the spatial distribution of the electrons around the nuclei shall be re-introduced.
4.1.1 The Bragg conditions
Bragg considered the specular reflection of a beam of x-rays from successive planes of atoms separated by distances $d$. If the angle between the x-rays and the planes (not the normal to the plane) is $\frac{\theta}{2}$, then the difference in optical path lengths for a beam specularly reflected at the lower of two consecutive layers is
$$ 2 d \sin \frac{\theta}{2} \tag{78} $$
In this expression, $\theta$ is the scattering angle of the particles in the beam. The reflected beams superimpose with a phase difference of
$$ 4 \pi \frac{d}{\lambda} \sin \frac{\theta}{2} \tag{79} $$
and constructive interference occurs whenever
$$ n \lambda = 2 d \sin \frac{\theta}{2} \tag{80} $$
This is Bragg’s law. The value of n is called the order of the Bragg reflection. Since the successive planes are equi-spaced, the scattering for an entire family of planes is constructive when the scattering from two neighboring planes in the family is constructive. Since there are a large number of planes in a family, and since the solid is almost transparent to x-rays, the scattering amplitude from each member of the family adds coherently giving rise to a very high intensity of the scattered beam whenever Bragg’s condition is satisfied. In the application of Bragg’s law to x-ray scattering, not only must one consider the different coherent scattering conditions from a single family of planes, but one must also consider scattering from the different families of planes in the solid. Different families of planes of atoms in a solid have different orientations. Since a plane of every family passes through each lattice point, the different orientations may have different spacings between members of the families of planes, so d can vary from family to family. The different Bragg reflections are usually indexed by the Miller indices (m1 , m2 , m3 ) of the planes that they are reflected from.
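As a small illustration of equation (80), the sketch below lists the scattering angles $\theta$ at which a single family of planes reflects. The function name `bragg_angles` is hypothetical, and the plane spacing and wavelength are assumed to be given in the same units.

```python
import math


def bragg_angles(d, wavelength):
    """Return {n: theta in degrees} for every order n that satisfies
    n * wavelength = 2 d sin(theta / 2), where theta is the scattering angle."""
    angles = {}
    n = 1
    while n * wavelength <= 2 * d:
        angles[n] = math.degrees(2 * math.asin(n * wavelength / (2 * d)))
        n += 1
    return angles


# Illustrative values only: spacing and wavelength of equal length.
print(bragg_angles(d=1.0, wavelength=1.0))
```

No reflection of any order exists once the wavelength exceeds $2d$, which is one reason why the wavelength must be comparable to the interatomic spacing.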
4.1.2 The Laue conditions
Laue's condition is more general than that of Bragg. The Laue condition is derived by considering scattering from the basis atoms in each of the primitive unit cells in the solid. The individual cells scatter the x-rays almost isotropically, however, the scattering in a specific direction will only be coherent at wave lengths for which the scattered waves from each unit cell add constructively. The wave vector of the incident beam is expressed as $\mathbf{k}$, where
$$ \mathbf{k} = \frac{2 \pi}{\lambda} \hat{e} \tag{81} $$
and the reflected wave has wave vector $\mathbf{k}'$
$$ \mathbf{k}' = \frac{2 \pi}{\lambda} \hat{e}' \tag{82} $$
where $\hat{e}$ and $\hat{e}'$ are two unit vectors. Let us consider two scattering centers separated by a vector displacement $\mathbf{d}$. Then, the difference in optical path length for x-rays scattered from one atom is composed of two non-equal segments
$$ d \cos \frac{\theta}{2} = \mathbf{d} \cdot \hat{e} , \qquad d \cos \frac{\theta'}{2} = \mathbf{d} \cdot \hat{e}' \tag{83} $$
The optical path difference between the two waves is given by the difference
$$ d \cos \frac{\theta}{2} - d \cos \frac{\theta'}{2} = \mathbf{d} \cdot ( \hat{e} - \hat{e}' ) \tag{84} $$
Thus, constructive interference of the scattered waves from two unit cells occurs whenever
$$ \mathbf{d} \cdot ( \hat{e} - \hat{e}' ) = m \lambda \tag{85} $$
holds for integer $m$. This condition can be re-expressed in terms of the wave vectors of the incident and scattered x-rays as
$$ \mathbf{d} \cdot ( \mathbf{k} - \mathbf{k}' ) = 2 \pi m \tag{86} $$
If this condition is fulfilled for the set of vectors $\mathbf{d}$ that are all the Bravais Lattice vectors $\mathbf{R}$, one finds the Laue condition for coherent scattering
$$ \mathbf{R} \cdot ( \mathbf{k} - \mathbf{k}' ) = 2 \pi m \tag{87} $$
or alternatively
$$ \exp \left[ i ( \mathbf{k} - \mathbf{k}' ) \cdot \mathbf{R} \right] = 1 \tag{88} $$
If this condition is satisfied for all $\mathbf{R}$ in a solid with $N$ unit cells, constructive interference will occur between all pairs of unit cells, giving rise to coherent scattering. The cross-section will have $N^2$ such contributions, and the scattered wave will be extremely intense. If the scattering vector $\mathbf{q}$ is defined as
$$ \mathbf{q} = \mathbf{k} - \mathbf{k}' \tag{89} $$
the Laue condition is satisfied for the special set of $\mathbf{q}$ values, $\mathbf{Q}$, which satisfy
$$ \exp \left[ i \mathbf{Q} \cdot \mathbf{R} \right] = 1 \qquad \forall \ \mathbf{R} \tag{90} $$
These special $\mathbf{q}$ values can be used to obtain the $\mathbf{k}$ values at which the reflection will occur. The expression for the momentum transfer is
$$ \mathbf{k}' = \mathbf{k} - \mathbf{Q} \tag{91} $$
which can be squared to yield
$$ k'^2 = k^2 - 2 \mathbf{k} \cdot \mathbf{Q} + Q^2 \tag{92} $$
This equation may be combined with the condition for elastic scattering $| \mathbf{k} | = | \mathbf{k}' |$, to result in a condition on the incident $\mathbf{k}$ values for coherent scattering of the form
$$ Q^2 = 2 \mathbf{k} \cdot \mathbf{Q} \tag{93} $$
Thus, $\mathbf{k}$ will satisfy the Laue condition for coherent scattering when the component of $\mathbf{k}$ along $\mathbf{Q}$ bisects $\mathbf{Q}$. Thus, the projection of $\mathbf{k}$ along $\mathbf{Q}$ must be equal to half the length of $\mathbf{Q}$. The incident wave vector $\mathbf{k}$ must lie on the plane bisecting the origin and $\mathbf{Q}$, which is called the Bragg plane.

The Laue condition is satisfied if $\mathbf{Q} \cdot \mathbf{R} = 2 \pi m$ for all lattice vectors $\mathbf{R}$. In particular, if the Laue condition is satisfied, one can choose $\mathbf{R}$ to be any one of three primitive lattice vectors. The three choices of primitive lattice vectors yield the three equations
$$ \begin{aligned} \mathbf{a}_1 \cdot \mathbf{Q} &= 2 \pi m_1 \\ \mathbf{a}_2 \cdot \mathbf{Q} &= 2 \pi m_2 \\ \mathbf{a}_3 \cdot \mathbf{Q} &= 2 \pi m_3 \end{aligned} \tag{94} $$
Since any lattice vector $\mathbf{R}$ can be expressed as integer multiples of the primitive lattice vectors, these three Laue equations are equivalent to the Laue condition. The three Laue equations have a geometrical interpretation. Namely, $\mathbf{Q}$ lies on a cone around the direction of $\mathbf{a}_1$ with projection $2 \pi m_1$. Similarly, $\mathbf{Q}$ also lies on a certain cone around $\mathbf{a}_2$, and also on a cone around $\mathbf{a}_3$. Thus, $\mathbf{Q}$ must lie on the common intersection of the three cones. This is a severe constraint: the values of $\mathbf{k}$ for which this is satisfied can only be found by systematically sweeping the magnitude of $\mathbf{k}$ or by rotating the direction of $\mathbf{k}$, which is equivalent to systematically re-orienting the crystal. However, once $\mathbf{Q}_i$ values have been found which satisfy the Laue conditions, other $\mathbf{Q}$ values can be found which are integral multiples of the initial $\mathbf{Q}_i$'s. General considerations show that there are three basis vectors $\mathbf{b}_i$ which can be used to construct the general $\mathbf{Q}$.
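Sweeping the magnitude of $\mathbf{k}$ at fixed direction can be illustrated with equation (93): for a given direction $\hat{e}$ and a given $\mathbf{Q}$, the condition $Q^2 = 2 \mathbf{k} \cdot \mathbf{Q}$ fixes $k$ and hence the wave length. The sketch below assumes $\mathbf{k} = k \hat{e}$ with $\hat{e}$ a unit vector; the function name `laue_wavelength` is hypothetical.

```python
import math


def laue_wavelength(e_hat, Q):
    """Wave length at which a beam along the unit vector e_hat satisfies
    Q^2 = 2 k.Q for the vector Q, or None if no positive k exists."""
    projection = sum(e * q for e, q in zip(e_hat, Q))
    if projection <= 0:
        return None                      # k cannot reach the Bragg plane of Q
    q_squared = sum(q * q for q in Q)
    k = q_squared / (2 * projection)     # from Q^2 = 2 k (e_hat . Q)
    return 2 * math.pi / k


# Simple cubic lattice with unit lattice constant: Q = 2 pi (m1, m2, m3).
two_pi = 2 * math.pi
print(laue_wavelength((1.0, 0.0, 0.0), (two_pi, 0.0, 0.0)))     # back-reflection
print(laue_wavelength((1.0, 0.0, 0.0), (two_pi, two_pi, 0.0)))  # 90 degree scattering
```

The second case corresponds to planes of spacing $1/\sqrt{2}$ and a scattering angle of $\pi/2$, for which Bragg's law $2 d \sin \frac{\theta}{2} = \lambda$ gives the same wave length.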
4.1.3 Equivalence of the Bragg and Laue conditions
Since a plane belonging to each family of planes passes through each lattice point, it is obvious that the Laue condition is equivalent to the Bragg condition. Let $\mathbf{Q} = \mathbf{k} - \mathbf{k}'$ be a scattering wave vector such that $\mathbf{Q} \cdot \mathbf{R} = 2 \pi m$ for all lattice vectors $\mathbf{R}$. As $\mathbf{k}$ and $\mathbf{k}'$ have the same magnitude, they make the same angle $\frac{\theta}{2}$ with the Bragg plane.

Due to the elastic scattering condition one has $| \mathbf{Q} | = 2 k \sin \frac{\theta}{2}$, and if the scattering is coherent then the magnitude of $\mathbf{Q}$ can be written as $| \mathbf{Q} | = \frac{2 \pi}{d} n$, where $n$ is the order of the reflection and $d$ is a distance characteristic of the lattice. Combining the elastic and Laue conditions, one has
$$ k \sin \frac{\theta}{2} = \frac{\pi n}{d} , \qquad 2 d \sin \frac{\theta}{2} = n \lambda \tag{95} $$
Thus, the Laue diffraction peak associated with the change in $\mathbf{k}$ given by $\mathbf{k} - \mathbf{k}' = \mathbf{Q}$ just corresponds to a Bragg reflection by an effective family of planes which have $\mathbf{Q}$ as their normal. The order $n$ of the Bragg reflection just corresponds to the magnitude of $| \mathbf{Q} |$ divided by $\frac{2 \pi}{d}$, where $d$ is the separation of a family of planes.
4.1.4 The Ewald Construction
Since the Laue condition is very restrictive, the vectors k which produce coherent scattering are relatively few and far between. The Ewald construction (P.P. Ewald, Z. Krist. 56, 129 (1921)) provides a convenient way of visualizing how the Laue condition may be fulfilled. The incident wave vector wave k is centered on the origin O. A sphere of radius k 0 ( = k ) is constructed which is centered on the tip of k. This is the Ewald sphere. The scattered wave vectors have the magnitude k 0 and may be represented by vectors k 0 directed from points on the sphere’s surface directed to the center of the sphere. The scattering wave vectors q = k − k 0 are directed from the origin towards the points on the surface of the sphere. Since the wave vectors Q which are solutions of Q.R = 2πm
(96)
form a lattice of points (the reciprocal lattice) including $\mathbf{Q} = 0$, a lattice point has to be centered on the origin. This lattice is indexed by three integers $(m_1, m_2, m_3)$ corresponding to the components along three primitive (reciprocal) lattice vectors. When a second point of the lattice of $\mathbf{Q}$ points resides on the surface of the Ewald sphere, say at $-\mathbf{k}'$ relative to the center of the sphere, it produces a Bragg reflected beam. In this case, the Laue condition is satisfied and the incident beam will be Bragg reflected at this value of $\mathbf{k}'$. In general, it is expected that the Ewald sphere will not have a second lattice point on its surface. When Bragg reflections occur, they are indexed by the integers $(m_1, m_2, m_3)$ which describe the family of planes associated with the momentum transfer $\mathbf{Q}$.
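The Ewald construction can be mimicked numerically. The sketch below is a minimal illustration, assuming a simple cubic direct lattice of side a so that Q = (2π/a)(m1, m2, m3); it lists the reciprocal lattice points for which k' = k − Q has the same length as k. The function name, the search range `mmax`, and the two trial wave vectors are hypothetical choices made for this example.

```python
import itertools
import math

def ewald_reflections(k, a, mmax=3, tol=1e-9):
    """Return the (m1, m2, m3) of simple cubic reciprocal lattice points
    Q = (2 pi / a) m that lie on the Ewald sphere, i.e. |k - Q| = |k|."""
    g = 2.0 * math.pi / a
    k2 = sum(c * c for c in k)
    hits = []
    for m in itertools.product(range(-mmax, mmax + 1), repeat=3):
        if m == (0, 0, 0):
            continue  # the origin is always on the sphere (no scattering)
        k_prime = [kc - g * mi for kc, mi in zip(k, m)]  # k' = k - Q
        if abs(sum(c * c for c in k_prime) - k2) < tol * k2:
            hits.append(m)
    return hits

a = 1.0
# A specially chosen wave vector along x with k = 2 pi / a: several points are hit.
special = ewald_reflections((2.0 * math.pi / a, 0.0, 0.0), a)
# A generic wave vector: the sphere misses every lattice point in the search range.
generic = ewald_reflections((7.0, 0.3, 0.2), a)
print("special k:", sorted(special))
print("generic k:", generic)
```

The generic case returning an empty list illustrates the statement that, for an arbitrary monochromatic incident beam, the Ewald sphere usually carries no second lattice point.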
4.1.5 X-ray Techniques
There are various techniques which can be used to obtain diffracted beams.

In the Laue Method, a beam of x-rays with a continuum of wavelengths in the range between λ0 and λ1 is used, and the incident beam has a fixed direction. Thus, it is only appropriate to use this method for a single crystal, as a polycrystalline sample would correspond to an average over the orientations relative to the incident beam. In the Laue method, the continuous range of wavelengths broadens the surface of the Ewald sphere into a finite volume enclosed between the two Ewald spheres with the limiting wavelengths. For a large enough mismatch between the wavelength of the interior Ewald sphere λ0 and the exterior sphere λ1, it is quite likely that at least one Bragg reflection will occur. This is the simplest method for orienting a single crystal relative to the direction of the incident beam. If the incident beam is along a direction of high symmetry of the lattice of Q points, the pattern of reflected beams should exhibit the same symmetry. It should be noted that the x-ray pattern will always show a center of symmetry, even if the crystal does not have one. This discovery is due to Friedel.

The Rotating Crystal Method uses a monochromatic beam of x-rays, and in the experiment, the relative direction of the incident beam and the crystal is varied. If one considers the lattice of points Q as being fixed, then the Ewald sphere rotates around the origin and, for large enough k, will sweep some lattice points through the surface of the sphere. This experiment produces a set of Bragg reflected beams that are recorded on a photographic film. In practice, the crystal is rotated about a crystallographic axis, say a1, while the incident beam has a fixed direction perpendicular to a1. The photographic film is bent into a cylinder with an axis which is chosen to coincide with the axis of rotation of the crystal.
Since the incident beam is perpendicular to the rotation axis, the Bragg reflected beams occur within cones of fixed angle. That is, the $\mathbf{b}_2$ and $\mathbf{b}_3$ components of the lattice of $\mathbf{Q}$ points form planes which are perpendicular to $\mathbf{a}_1$. Therefore, under the rotation these two components of the $\mathbf{Q}$ vectors are rotated in the planes. However, the components of $\mathbf{Q}$ parallel to $\mathbf{a}_1$ remain invariant and are governed by $m_1$, since
$$ \mathbf{Q} \cdot \mathbf{a}_1 = 2 \pi m_1 \tag{97} $$
Furthermore, since $\mathbf{k}$ and $\mathbf{a}_1$ are perpendicular
$$ \mathbf{k}' \cdot \mathbf{a}_1 = - \mathbf{Q} \cdot \mathbf{a}_1 \tag{98} $$
and the reflected beams produce a series of Bragg spots which exist in rings wrapped around the photographic film cylinder. Each ring corresponds to a
different value of $m_1$. Direct observation of the angle between $\mathbf{k}'$ and $\mathbf{a}_1$ allows the magnitude of $\mathbf{a}_1$ to be obtained with ease.

The Debye-Scherrer Method uses a polycrystalline or powdered sample. Each grain of the sample has a random orientation; therefore, this method is equivalent to the rotating crystal method in which the sample is rotated over all possible orientations. Each reciprocal lattice point will generate a sphere of radius equal to the magnitude of the reciprocal lattice vector. If this spherical shell of reciprocal lattice vectors intersects the Ewald sphere, it produces Bragg reflections. Each reciprocal lattice vector with length less than 2 k will produce a cone of Bragg reflections, with an angle θ relative to the un-scattered beam. The magnitude of the reciprocal lattice vector is given by $Q = 2 k \sin\frac{\theta}{2}$. Thus, a measurement of θ will give the lengths of the smallest reciprocal lattice vectors.

These methods can be used to determine the reciprocal lattice vectors and, hence, the Bravais lattice associated with the crystal. In order to completely determine the crystal structure, one must determine the basis. This can be done by examining the structure and form factors.
——————————————————————————————————
4.1.6 Exercise 10
Fleming and co-workers describe the structure of various alkali metal C60 compounds in their Nature article, Nature 352, 701 (1991). In figure (2.a) of the paper they indicate an f.c.c. structure for the solid. Indicate the conventional axes on their unit cell. If a powder x-ray diffraction experiment is performed on Rb doped C60 with x-rays of wavelength λ = 0.9 Å, for the dopings 3, 4 and 6 in the paper, what are the angles 2θ for the first 5 diffraction peaks for the observed structures?
——————————————————————————————————
4.1.7 The Structure and Form Factors
If the lattice has a basis, the scattered wave from each unit cell must be composed from the scattered waves from each atom in the basis. This means that the scattering from each type of atom in the basis must be determined and then superimposed to find the scattered wave. The scattering from the electron density of each atom can be expressed in terms of the form factor. The form factors for an atom in a solid differ only slightly from the form factors of isolated atoms, and are mainly determined by the atomic charge number Z. Although there
are differences due to the bonding, the form factors are determined by all the electrons, and not just those involved in bonding. The form factor of the j-th atom in the basis is denoted by $F_j(\mathbf{q})$. It is conventional to use a scale such that the forward scattering (θ = 0) atomic form factor equals the number of electrons in the atom. Since the coherent scattering is restricted to scattering vectors $\mathbf{Q}$ that satisfy the Laue condition, the form factor only needs to be evaluated at these values of $\mathbf{Q}$. The amplitude of the scattered wave from the atoms in the basis of the unit cell can be expressed in terms of the structure factor $S(\mathbf{Q})$ which is given by
$$ S(\mathbf{Q}) = \sum_j \exp\left[ i \mathbf{Q} \cdot \mathbf{r}_j \right] F_j(\mathbf{Q}) \tag{99} $$
This is just a component of the Fourier transform of the electron density from one unit cell. The intensity of the Bragg peaks is proportional to the factor
$$ | S(\mathbf{Q}) |^2 \tag{100} $$
The $\mathbf{Q}$ dependence of the intensity can be used to determine the basis of the crystal. Unfortunately, since only the modulus of $S(\mathbf{Q})$ can be found from experiment and not its phase, indirect methods have to be used to discover the crystal structure. However, if the crystal is centro-symmetric, then for each atom at the basis point $\mathbf{r}_j$ there is another atom of the same type at $-\mathbf{r}_j$, and $S(\mathbf{Q})$ is purely real. The phase problem then simplifies to the question of whether $S(\mathbf{Q})$ is positive or negative.

If the basis of a crystal structure is mono-atomic, the atomic form factor can be factorized out, and the amplitude of the scattered wave is partially determined by the geometric structure factor
$$ S_G(\mathbf{Q}) = \sum_j \exp\left[ i \mathbf{Q} \cdot \mathbf{r}_j \right] \tag{101} $$
The geometric structure factor expresses the interference between identical atoms in the basis. The intensity of the Bragg peak is still determined by the product of the modulus of the form factor with the modulus of the geometric structure factor. The vanishing or variation of the Bragg peak intensities due to interference can be used to determine the positions of the basis atoms.

An example of the ambiguity imposed by the non-measurability of the phase of the structure factor is given by Friedel's law, for non-centrosymmetric crystals. The structure factor $S(\mathbf{Q})$ is a complex number, and can be written as
$$ S(\mathbf{Q}) = A + i B \tag{102} $$
For each $\mathbf{Q}$ that satisfies the Laue condition, there is a vector $-\mathbf{Q}$ which corresponds to the negative integer $-m$. The structure factor $S(-\mathbf{Q})$ is just the complex conjugate of $S(\mathbf{Q})$
$$ S(-\mathbf{Q}) = A - i B \tag{103} $$
Since the structure factors for the vectors $\mathbf{Q}$ and $-\mathbf{Q}$ have the same magnitude, the Bragg peaks have the same intensity. Thus, the diffraction pattern has a center of inversion symmetry, even if the crystal structure does not. Exceptions to Friedel's law only occur if the crystal has anomalous dispersion. This happens when the x-rays are highly absorbed by the crystal.

Face Centered Cubic Lattice. The face centered cubic lattice can be represented in terms of a simple cubic lattice with a four atom basis. The scattering from this lattice can be expressed in terms of the Laue condition for the simple cubic lattice, but modulated by the geometric structure factor. The four atom basis of the non-primitive (conventional) unit cell of the face centered cubic lattice consists of the atomic positions
$$ \begin{aligned} \mathbf{r}_1 &= 0 \\ \mathbf{r}_2 &= \frac{a}{2} \left( \hat{e}_x + \hat{e}_y \right) \\ \mathbf{r}_3 &= \frac{a}{2} \left( \hat{e}_z + \hat{e}_x \right) \\ \mathbf{r}_4 &= \frac{a}{2} \left( \hat{e}_z + \hat{e}_y \right) \end{aligned} \tag{104} $$
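The Bragg vectors for the conventional simple cubic cell are easily found to be
$$ \mathbf{b}_x = \frac{2\pi}{a} \hat{e}_x , \qquad \mathbf{b}_y = \frac{2\pi}{a} \hat{e}_y , \qquad \mathbf{b}_z = \frac{2\pi}{a} \hat{e}_z \tag{105} $$
so a general simple cubic Bragg scattering vector is given by
$$ \mathbf{Q} = \frac{2\pi}{a} \left( m_1 \hat{e}_x + m_2 \hat{e}_y + m_3 \hat{e}_z \right) \tag{106} $$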
The geometric structure factor for the conventional f.c.c. unit cell is found to be
$$ \begin{aligned} S_G(\mathbf{Q}) &= \sum_j \exp\left[ i \mathbf{Q} \cdot \mathbf{r}_j \right] \\ &= 1 + \exp\left[ i \pi ( m_1 + m_2 ) \right] + \exp\left[ i \pi ( m_1 + m_3 ) \right] + \exp\left[ i \pi ( m_2 + m_3 ) \right] \end{aligned} \tag{107} $$
When evaluated at the Bragg vectors, the geometric structure factor adds coherently
$$ S_G(\mathbf{Q}) = 4 \tag{108} $$
if the integers $(m_1, m_2, m_3)$ are either all even or are all odd. The geometric structure factor interferes destructively
$$ S_G(\mathbf{Q}) = 0 \tag{109} $$
if only one integer is different from the other two. That is, if one integer is either even or odd, while the other two are, respectively, odd or even, then $S_G(\mathbf{Q})$ vanishes. Thus, the f.c.c. lattice has the same pattern of Bragg reflections as the simple cubic lattice, but has missing Bragg spots. The resulting lattice of Bragg spots is cubic with twice the dimensions (in q space) but has missing Bragg spots at the mid points of the edges and at the face centers. Thus, it is found that the diffraction pattern has the form of a body centered cubic lattice.

The Body Centered Cubic Lattice. The body centered cubic lattice can be viewed as a simple cubic lattice with a two atom basis
$$ \begin{aligned} \mathbf{r}_0 &= 0 \\ \mathbf{r}_1 &= \frac{a}{2} \left( \hat{e}_x + \hat{e}_y + \hat{e}_z \right) \end{aligned} \tag{110} $$
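Then, the geometric structure factor for the conventional b.c.c. unit cell is just
$$ \begin{aligned} S_G(\mathbf{Q}) &= 1 + \exp\left[ i \mathbf{Q} \cdot \frac{a}{2} \left( \hat{e}_x + \hat{e}_y + \hat{e}_z \right) \right] \\ &= 1 + \exp\left[ i \frac{a}{2} \left( Q_x + Q_y + Q_z \right) \right] \end{aligned} \tag{111} $$
Now the Bragg vectors for the simple cubic structure are just
$$ \mathbf{Q} = \frac{2\pi}{a} \left( m_1 \hat{e}_x + m_2 \hat{e}_y + m_3 \hat{e}_z \right) $$
therefore, at these $\mathbf{Q}$ values the geometric structure factor simplifies to
$$ \begin{aligned} S_G(\mathbf{Q}) &= 1 + \exp\left[ i \pi ( m_1 + m_2 + m_3 ) \right] \\ &= 1 + ( - 1 )^{( m_1 + m_2 + m_3 )} \end{aligned} \tag{112} $$
$$ S_G(\mathbf{Q}) = \begin{cases} 2 & \text{for } ( m_1 + m_2 + m_3 ) \text{ even} \\ 0 & \text{for } ( m_1 + m_2 + m_3 ) \text{ odd} \end{cases} \tag{113} $$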
Thus, the body centered cubic lattice has Bragg spots that form a cubic lattice. However, the intensities of the Bragg spots with odd $( m_1 + m_2 + m_3 )$ vanish, leading to a face centered cubic lattice of Bragg spots.

The Diamond Lattice. The diamond lattice is an f.c.c. lattice with a two atom basis
$$ \begin{aligned} \mathbf{r}_0 &= 0 \\ \mathbf{r}_1 &= \frac{a}{4} \left( \hat{e}_x + \hat{e}_y + \hat{e}_z \right) \end{aligned} \tag{114} $$
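where the conventional f.c.c. unit cell has linear dimension a. From the discussion of scattering from an f.c.c. lattice, one finds that the $\mathbf{Q}$ vectors of the Bragg spots can be expressed in terms of the set of primitive vectors for the b.c.c. lattice
$$ \mathbf{Q} = \sum_i m_i \mathbf{b}_i \tag{115} $$
The primitive vectors are given by
$$ \begin{aligned} \mathbf{b}_1 &= \frac{2\pi}{a} \left( \hat{e}_y + \hat{e}_z - \hat{e}_x \right) \\ \mathbf{b}_2 &= \frac{2\pi}{a} \left( \hat{e}_z + \hat{e}_x - \hat{e}_y \right) \\ \mathbf{b}_3 &= \frac{2\pi}{a} \left( \hat{e}_x + \hat{e}_y - \hat{e}_z \right) \end{aligned} \tag{116} $$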
The geometric structure factor of the diamond lattice, relative to the lattice of Bragg spots of the real space f.c.c. lattice, is given by
$$ S_G(\mathbf{Q}) = 1 + \exp\left[ i \frac{\pi}{2} ( m_1 + m_2 + m_3 ) \right] \tag{117} $$
From this it is found that the geometric structure factor not only gives rise to extinctions but also modulates the intensity of the non-zero Bragg spots, according to the rule
$$ S_G(\mathbf{Q}) = \begin{cases} 2 & \text{for } ( m_1 + m_2 + m_3 ) = 2 \times \text{even} \\ 0 & \text{for } ( m_1 + m_2 + m_3 ) = 2 \times \text{odd} \\ 1 \pm i & \text{for } ( m_1 + m_2 + m_3 ) \text{ odd} \end{cases} \tag{118} $$
As the f.c.c. lattice has Bragg spots arranged on a b.c.c. lattice, it is convenient to transform the Bragg vectors into the coordinate system used for the conventional b.c.c. unit cell
$$ \mathbf{Q} = \frac{4\pi}{a} \left[ \hat{e}_x \left( \tfrac{1}{2} ( m_1 + m_2 + m_3 ) - m_1 \right) + \hat{e}_y \left( \tfrac{1}{2} ( m_1 + m_2 + m_3 ) - m_2 \right) + \hat{e}_z \left( \tfrac{1}{2} ( m_1 + m_2 + m_3 ) - m_3 \right) \right] \tag{119} $$
The rule for the modulation of intensities is expressed directly in terms of the quantity
$$ \sum_i \frac{Q_i \, a}{4 \pi} = \frac{1}{2} ( m_1 + m_2 + m_3 ) \tag{120} $$
Thus, one can describe the system of Bragg spots as residing on a b.c.c. lattice with cubic cell of side $4\pi/a$. The b.c.c. lattice can be re-interpreted in terms of two interpenetrating simple cubic lattices. Thus, the Bragg spots with unequal intensities reside on two interpenetrating simple cubic lattices of side $4\pi/a$. The length scale is twice as large as the reciprocal lattice spacing of the (simple cubic) lattice constructed from the conventional unit cell. One simple cubic lattice contains the origin $\mathbf{Q} = 0$, and the Bragg spots have integer coefficients (in units of $4\pi/a$) for the unit vectors $\hat{e}_x$, $\hat{e}_y$ and $\hat{e}_z$. This means that $( m_1 + m_2 + m_3 )$ is even for this simple cubic lattice. On dividing by a factor of 2, the resulting number is odd and even at consecutive lattice points. When $( m_1 + m_2 + m_3 )/2$ is an even integer, S = 2 and the intensities are finite. However, when $( m_1 + m_2 + m_3 )/2$ is odd then S = 0 so the intensities vanish. Thus, the non-zero intensities on this simple cubic reciprocal lattice actually form a face centered cubic reciprocal lattice. The second interpenetrating simple cubic lattice has Bragg points with half (odd) integer coefficients for the unit vectors $\hat{e}_x$, $\hat{e}_y$ and $\hat{e}_z$. This means that the sum $( m_1 + m_2 + m_3 )$ is odd for this simple cubic lattice. These lattice points are the body center points of the underlying b.c.c. lattice. The geometric structure factor is simply $S_G(\mathbf{Q}) = 1 \pm i$ and thus, the Bragg spots on this
simple cubic lattice all have the same intensities.

Extinctions due to Glide Planes and Screw Axes. Consider a solid with a glide plane, with the glide along the $\hat{e}_z$ axis and the plane perpendicular to the $\hat{e}_y$ axis. Thus, if there is an atom at (x, y, z) in units of the lattice parameters, there is an equivalent atom at (x, −y, z + 1/2). The pairs of basis atoms each contribute a term
$$ S_G(\mathbf{Q}) = \exp\left[ 2 \pi i ( x m_1 + y m_2 + z m_3 ) \right] + \exp\left[ 2 \pi i \left( x m_1 - y m_2 + ( z + \tfrac{1}{2} ) m_3 \right) \right] \tag{121} $$
to the geometric structure factor. One can see that for the special case $m_2 = 0$ the structure factor is composed of terms with the form
$$ \begin{aligned} S_G(\mathbf{Q}) &= \exp\left[ 2 \pi i ( x m_1 + z m_3 ) \right] \left( 1 + \exp\left[ \pi i m_3 \right] \right) \\ &= \exp\left[ 2 \pi i ( x m_1 + z m_3 ) \right] \left( 1 + (-1)^{m_3} \right) \\ &= \begin{cases} 0 & \text{if } m_3 \text{ is odd} \\ 2 \exp\left[ 2 \pi i ( x m_1 + z m_3 ) \right] & \text{if } m_3 \text{ is even} \end{cases} \end{aligned} \tag{122} $$
Thus, reflections of the type $(m_1, 0, m_3)$ will be missing unless $m_3$ is an even number.

Similar extinctions occur for screw axes. Consider a two-fold screw axis parallel to $\hat{e}_y$. The equivalent positions are (x, y, z) and (x, 1/2 + y, z). Thus, the structure factor for $m_1 = 0$ and $m_3 = 0$ is made up of contributions with the form
$$ \begin{aligned} S_G(\mathbf{Q}) &= \exp\left[ 2 \pi i \, y \, m_2 \right] \left( 1 + ( - 1 )^{m_2} \right) \\ &= \begin{cases} 0 & \text{if } m_2 \text{ is odd} \\ 2 \exp\left[ 2 \pi i \, y \, m_2 \right] & \text{if } m_2 \text{ is even} \end{cases} \end{aligned} \tag{123} $$
Thus, reflections of the type (0, m2 , 0) will be missing unless m2 is an even integer.
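The selection rules derived above for the f.c.c., b.c.c. and diamond structures can be checked by summing the phase factors of Eq. (101) directly. The following Python sketch is an illustration only: it assumes that all structures are described in the conventional cubic cell, with basis positions in units of a and Q = (2π/a)(m1, m2, m3), and the helper name and constant names are not from the notes.

```python
import cmath

def geometric_structure_factor(basis, m):
    """S_G(Q) = sum_j exp(i Q . r_j) for Q = (2 pi / a) m and r_j in units of a."""
    total = 0j
    for r in basis:
        phase = 2.0 * cmath.pi * sum(mi * ri for mi, ri in zip(m, r))
        total += cmath.exp(1j * phase)
    return total

# Conventional-cell bases (fractional coordinates).
FCC = [(0, 0, 0), (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5)]
BCC = [(0, 0, 0), (0.5, 0.5, 0.5)]
# Diamond: the f.c.c. basis plus a copy displaced by (1/4, 1/4, 1/4).
DIAMOND = FCC + [(x + 0.25, y + 0.25, z + 0.25) for x, y, z in FCC]

for m in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0), (2, 2, 0), (2, 2, 2)]:
    s_fcc = abs(geometric_structure_factor(FCC, m))
    s_bcc = abs(geometric_structure_factor(BCC, m))
    s_dia = abs(geometric_structure_factor(DIAMOND, m))
    print(f"{m}: |S_fcc|={s_fcc:.3f}  |S_bcc|={s_bcc:.3f}  |S_diamond|={s_dia:.3f}")
```

In the conventional cell the diamond structure factor is the f.c.c. factor multiplied by the two-atom factor of Eq. (117), so reflections such as (2, 0, 0) and (2, 2, 2) are extinguished even though they are allowed for the f.c.c. lattice.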
——————————————————————————————————
4.1.8 Exercise 11
Experiments on solid AxC60 show that the C60 molecules are located on a face centered cubic lattice with lattice spacing a = 14.11 Å, and that the (2, 0, 0) x-ray diffraction peak is very weak when compared to the (1, 1, 1) Bragg peak (Fleming et al., Nature 352, 701 (1991)). Calculate the structure factor for these reflections in an approximation which assumes that the electron distribution of each fullerene molecule is uniformly spread over a spherical shell of radius 3.5 Å.
——————————————————————————————————
4.1.9 Exercise 12
The Hendricks-Teller model for x-ray diffraction from a disordered system considers a one-dimensional line of molecules. The probability that a pair of atoms is separated by a distance a is given by p and the probability that they are separated by a + δa is given by 1 − p. The random system has an infinite unit cell. Calculate the average geometric structure factor for this model, and show that
$$ S_G(Q) = \frac{ p ( 1 - p ) \left( 1 - \cos Q \delta a \right) }{ 1 - p ( 1 - p ) - p \cos Q a - ( 1 - p ) \cos\left[ Q ( a + \delta a ) \right] + p ( 1 - p ) \cos Q \delta a } \tag{124} $$
In a scattering measurement on a random system, one measures the average of $| S_G(Q) |^2$. Determine the relation between $S_G(Q)$ and $| S_G(Q) |^2$ and describe the results of a scattering measurement on this one-dimensional system.
——————————————————————————————————
Polyatomic Crystals. For a polyatomic crystal the structure factor has both the geometric contribution and the contribution from the atomic form factors of the basis atoms
$$ S(\mathbf{Q}) = \sum_j \exp\left[ i \mathbf{Q} \cdot \mathbf{r}_j \right] F_j(\mathbf{Q}) \tag{125} $$
The atomic form factor Fj (Q) is determined by the internal structure of the atom that occupies the position rj in the basis.
The atomic form factor is normalized to the electronic charge of the atom. For a single atom, the form factor is given by
$$ F(\mathbf{Q}) = \int d^3 r \; \rho(\mathbf{r}) \exp\left[ - i \mathbf{Q} \cdot \mathbf{r} \right] \tag{126} $$
where $\rho(\mathbf{r})$ is the atomic electron density. If the charge density is spherically symmetric, then the form factor can be reduced to a radial integral
$$ \begin{aligned} F(Q) &= 2 \pi \int_0^\infty dr \; r^2 \rho(r) \int_{-1}^{1} d\cos\theta \; \exp\left[ - i Q r \cos\theta \right] \\ &= 2 \pi \int_0^\infty dr \; r^2 \rho(r) \, \frac{2 \sin Q r}{Q r} \\ &= 4 \pi \int_0^\infty dr \; r^2 \rho(r) \, \frac{\sin Q r}{Q r} \end{aligned} \tag{127} $$
For forward scattering, Q = 0, the form factor reduces to
$$ F(0) = 4 \pi \int_0^\infty dr \; r^2 \rho(r) = Z \tag{128} $$
where Z is the atomic number. Typically F(Q) decreases monotonically with increasing Q, falling off as a power of $1/Q^2$ for large Q.
——————————————————————————————————
4.1.10 Exercise 13
Calculate the x-ray scattering intensities for the close-packed structures formed by stacking hexagonal layers in the following sequences: (a) The sequence ABAB... (the h.c.p. sequence). (b) The sequence ABCABC... (the f.c.c. sequence). (c) The random sequence in which all the consecutive layers are different, but given one layer (say A), there is an equal probability that it will be followed by either one of the two other layers.
——————————————————————————————————
4.1.11 Exercise 14
Find the atomic form factor for the hydrogen atom, using the electron density
$$ \rho(r) = \frac{1}{\pi a^3} \exp\left[ - \frac{2 r}{a} \right] \tag{129} $$
where a is the Bohr radius.
——————————————————————————————————
Sodium Chloride. An example of a diatomic crystal with a basis is provided by NaCl. This has a face centered cubic lattice and has Na+ ions at the positions (0, 0, 0), (1/2, 1/2, 0), (1/2, 0, 1/2) and (0, 1/2, 1/2). The Cl− ions reside at (1/2, 0, 0), (0, 1/2, 0), (0, 0, 1/2) and (1/2, 1/2, 1/2). The structure can be viewed as a simple cubic lattice with an eight atom basis. In this case, we can use the simple cubic representation of the Bragg vectors $\mathbf{Q}$. Thus, the structure factor is given by
$$ \begin{aligned} S(\mathbf{Q}) = {} & F_{Na}(\mathbf{Q}) \left( 1 + \exp\left[ i \pi ( m_1 + m_2 ) \right] + \exp\left[ i \pi ( m_2 + m_3 ) \right] + \exp\left[ i \pi ( m_3 + m_1 ) \right] \right) \\ & + F_{Cl}(\mathbf{Q}) \left( \exp\left[ i \pi m_1 \right] + \exp\left[ i \pi m_2 \right] + \exp\left[ i \pi m_3 \right] + \exp\left[ i \pi ( m_1 + m_2 + m_3 ) \right] \right) \end{aligned} \tag{130} $$
As $\exp[ i \pi m ] = ( - 1 )^m$ the structure factor can be factorized as
$$ S(\mathbf{Q}) = \left( F_{Na}(\mathbf{Q}) + ( - 1 )^{m_1} F_{Cl}(\mathbf{Q}) \right) \left( 1 + ( - 1 )^{( m_1 + m_2 )} + ( - 1 )^{( m_2 + m_3 )} + ( - 1 )^{( m_3 + m_1 )} \right) \tag{131} $$
The structure factor is 0 unless the indices are either all odd or all even. This is characteristic of face centering. The intensities of the Bragg spots with all even indices and all odd indices are different as the atomic form factors either add or subtract.
——————————————————————————————————
4.1.12 Exercise 15
Potassium Chloride has the same structure as NaCl. However, K+ and Cl− are iso-electronic and so have very similar form factors. Determine the indices $(m_1, m_2, m_3)$ of the allowed Bragg reflections.
——————————————————————————————————
4.1.13 Exercise 16
Calculate the structure factor for the zincblende structure. The zincblende structure is a face centered cubic lattice of side a, with a positively charged ion at the origin and a negatively charged ion at $\frac{a}{4} ( \hat{e}_x + \hat{e}_y + \hat{e}_z )$.
——————————————————————————————————
Since the differences between the atomic form factors show up in the experimentally observed structure factor of compounds, it is possible to distinguish between ordered binary compounds and binary compounds with site disorder. The order-disorder transition in Cu3Au has been observed by x-ray scattering. At high temperatures, the atoms in this material are randomly distributed, with one atom on each site of an f.c.c. lattice. However, there is a transition from the disordered phase, which occurs above a critical temperature of Tc ≈ 660 K, to an ordered phase at lower temperatures. In the completely disordered phase, the structure factor is that pertaining to an f.c.c. crystal, in which the form factor is replaced by the statistically averaged value
$$ F_{av}(\mathbf{Q}) = \frac{3}{4} F_{Cu}(\mathbf{Q}) + \frac{1}{4} F_{Au}(\mathbf{Q}) \tag{132} $$
Thus, at high temperatures, the structure factor is given by
$$ S(\mathbf{Q}) = F_{av}(\mathbf{Q}) \left( 1 + \exp\left[ i \pi ( m_1 + m_2 ) \right] + \exp\left[ i \pi ( m_2 + m_3 ) \right] + \exp\left[ i \pi ( m_3 + m_1 ) \right] \right) \tag{133} $$
Hence, the peaks have intensity of either $16 \, | F_{av}(\mathbf{Q}) |^2$ or zero depending on whether the indices are all even or all odd, or whether they are mixed. In the ordered phase, the Cu atoms reside on the face center sites and the Au on the vertices of the cubes. In this phase, "super-lattice" peaks appear in the spectra for mixed indices. For the completely ordered phase, the structure factor is given by
$$ S(\mathbf{Q}) = F_{Au}(\mathbf{Q}) + F_{Cu}(\mathbf{Q}) \left( \exp\left[ i \pi ( m_1 + m_2 ) \right] + \exp\left[ i \pi ( m_2 + m_3 ) \right] + \exp\left[ i \pi ( m_3 + m_1 ) \right] \right) \tag{134} $$
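The ordered and disordered Cu3Au structure factors of Eqs. (133) and (134) can be compared numerically. The sketch below is a rough illustration which assumes that the form factors may be replaced by their forward-scattering values, F(0) = Z, with Z = 29 for Cu and Z = 79 for Au, so the Q dependence of the form factors is neglected. The function names are hypothetical.

```python
def parity_sign(n):
    """Return (-1)**n for an integer n, i.e. exp(i pi n)."""
    return 1 if n % 2 == 0 else -1

def s_disordered(m, f_au, f_cu):
    """Eq. (133): f.c.c. structure factor with the averaged form factor of Eq. (132)."""
    m1, m2, m3 = m
    f_av = 0.75 * f_cu + 0.25 * f_au
    return f_av * (1 + parity_sign(m1 + m2) + parity_sign(m2 + m3) + parity_sign(m3 + m1))

def s_ordered(m, f_au, f_cu):
    """Eq. (134): Au on the cube vertices, Cu on the face centers."""
    m1, m2, m3 = m
    return f_au + f_cu * (parity_sign(m1 + m2) + parity_sign(m2 + m3) + parity_sign(m3 + m1))

F_AU, F_CU = 79.0, 29.0  # assumed forward-scattering form factors, F(0) = Z

for m in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
    print(m, "disordered:", s_disordered(m, F_AU, F_CU),
          "ordered:", s_ordered(m, F_AU, F_CU))

ratio = (s_ordered((1, 0, 0), F_AU, F_CU) / s_ordered((2, 0, 0), F_AU, F_CU)) ** 2
print("I(1,0,0)/I(2,0,0) in the ordered phase:", round(ratio, 4))
```

The mixed-index reflections vanish in the disordered phase but acquire the amplitude F_Au − F_Cu in the ordered phase, while the unmixed reflections have the amplitude F_Au + 3 F_Cu in both phases.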
The "super-lattice" peaks occur for mixed indices. The relative intensity of the "super-lattice" peaks is approximately given by
$$ \frac{I(1, 0, 0)}{I(2, 0, 0)} \sim \left( \frac{ F_{Au}(0) - F_{Cu}(0) }{ F_{Au}(0) + 3 F_{Cu}(0) } \right)^2 \tag{135} $$
which leads to a relative intensity of about 0.09. Since the x-ray form factors are $F_{Cu}(0) = 29$ and $F_{Zn}(0) = 30$, the relative intensity of the "super-lattice" peaks of CuZn, or beta brass, is of the order of 0.0003. Thus, the super-lattice peaks are difficult to observe in x-ray scattering. However, the order-disorder transition in CuZn is easily observable by neutron diffraction.

At very low temperatures, CuZn exists as an ordered compound of the CsCl type. The structure consists of two interpenetrating simple cubic sub-lattices which have a relative displacement of [1/2, 1/2, 1/2]. The Cu atoms occupy the sites of one sub-lattice, say the A sub-lattice, and the Zn atoms are located on the other sub-lattice, say the B sub-lattice. For an infinite solid the A and B sub-lattices are equivalent; thus, the compound may also form with the Cu atoms on the B sub-lattice and the Zn atoms on the A sub-lattice. At temperatures above the order-disorder transition temperature, the material exists in a disordered phase in which the Cu and Zn atoms are randomly positioned on the sites of the A and B sub-lattices. At the transition temperature, a phase transition occurs between the high temperature disordered phase and the low-temperature ordered phase. The order parameter for the phase transition is given by the scalar quantity φ(T), where
$$ \begin{aligned} \phi(T) &= n(Cu)_A - n(Cu)_B \\ &= n(Zn)_B - n(Zn)_A \end{aligned} \tag{136} $$
where $n(Cu)_A$ and $n(Cu)_B$ are, respectively, the number of Cu atoms on the A and B sub-lattices. The second line follows from the fact that an atom of one type or the other exists at each site. In particular, if the total number of sites is 2 N, the numbers of Zn atoms at the sites of the A and B sub-lattices are, respectively, given by
$$ \begin{aligned} n(Zn)_A &= N - n(Cu)_A \\ n(Zn)_B &= N - n(Cu)_B \end{aligned} \tag{137} $$
Above the transition temperature, the Cu atoms are equally likely to be found on the A and B sub-lattices and so the order parameter is zero, φ = 0. Below the transition temperature, Tc ≈ 741 K, the order parameter has a non-zero magnitude $\phi_0(T)$ which is temperature dependent, and has either a positive or negative sign depending on whether the Cu atoms spontaneously select to occupy the A or B sites, $\phi(T) = \pm \phi_0(T)$. In the ordered state, the temperature dependence of the order parameter is given by
$$ \phi_0(T) \propto ( T_c - T )^{\beta} \tag{138} $$
where β ≈ 0.32. As the Hamiltonian is symmetric under interchange of the A and B sub-lattices, this order-disorder transition provides an example of spontaneous symmetry breaking.
——————————————————————————————————
4.1.14 Exercise 17
Express the inelastic x-ray scattering intensity for CuZn in terms of the atomic form factors FCu (Q), FZn (Q), and the order parameter φ(T ). Assume that the deviations of the site occupancies from the average values at different sites are un-correlated. ——————————————————————————————————
4.2 Neutron Diffraction
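Elastic neutron scattering from the nuclei of a solid involves the change in the momentum of the neutron from the initial value $\hbar \mathbf{k}$ to the final value $\hbar \mathbf{k}'$, corresponding to a wave vector transfer of
$$ \mathbf{q} = \mathbf{k} - \mathbf{k}' \tag{139} $$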
Conservation of momentum requires that the transferred momentum must be equal to a momentum component of the interaction potential. This momentum is ultimately transferred to the solid. Experimentally accessible values of q for neutrons are in the range $0.01 < q < 30$ Å$^{-1}$, which covers the range that is useful to determine crystalline structures. The interaction between the neutron and one nucleus is short ranged and can be modelled by a point contact interaction,
$$ \hat{H}_{int} = \frac{2 \pi \hbar^2}{m_n} \, b \; \delta^3( \mathbf{r} - \mathbf{R} ) \tag{140} $$
where b is the scattering amplitude, of the order of $10^{-14}$ m. The differential scattering cross-section represents the number of particles scattered into solid angle dΩ per incident flux. The differential scattering cross-section for one nucleus is assumed to be isotropic and given by
$$ \frac{d\sigma}{d\Omega} = | b |^2 \tag{141} $$
Hence, the total cross-section is given by
$$ \sigma = 4 \pi | b |^2 \tag{142} $$
For a crystalline lattice of nuclei, as shall be shown, the scattering cross-section is given by
$$ \frac{d\sigma}{d\Omega} = \sum_{i,j} \exp\left[ i \mathbf{q} \cdot ( \mathbf{R}_i - \mathbf{R}_j ) \right] b_i^* b_j \tag{143} $$
where $b_i$ is the scattering amplitude from the i-th nucleus. The value of $b_i$ depends on what isotope exists at the lattice site and also on the direction of nuclear spins. In general, the different isotopes are randomly distributed so they must be averaged over. Thus, $b_i$ and $b_j$ are independent or uncorrelated if they belong to different sites, and the average for $i \neq j$ is given by the product of the averages
$$ \overline{ b_i^* b_j } = \overline{ b_i^* } \; \overline{ b_j } = | \overline{b} |^2 \tag{144} $$
while, if i = j, one has the average of the squared amplitude
$$ \overline{ b_i^* b_i } = \overline{ | b |^2 } \tag{145} $$
In general, the average has the form
$$ \overline{ b_i^* b_j } = | \overline{b} |^2 + \delta_{i,j} \left( \overline{ | b |^2 } - | \overline{b} |^2 \right) \tag{146} $$
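The scattering cross-section can be written as the sum of two parts: a coherent part, which comes from the first term of the average and involves all pairs of sites (i, j), and an incoherent part, which only has contributions from i = j. The coherent cross-section is given by
$$ \frac{d\sigma}{d\Omega} = \sum_{i,j} \exp\left[ i \mathbf{q} \cdot ( \mathbf{R}_i - \mathbf{R}_j ) \right] | \overline{b} |^2 \tag{147} $$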
For coherent scattering from every nucleus in the solid, the momentum transfer must satisfy the Laue condition and so $\mathbf{q}$ must be equal to $\mathbf{Q}$, where $\mathbf{Q}$ satisfies
$$ \mathbf{Q} \cdot \mathbf{R} = 2 \pi m \tag{148} $$
for all lattice vectors $\mathbf{R}$, where m is any integer. When this condition is satisfied, the scattering produces Bragg reflections similar to those observed in x-ray scattering. When the Bragg scattering condition is satisfied the coherent scattering has an intensity proportional to $N^2$. The incoherent scattering cross-section comes from the terms with i = j and is given by
$$ \frac{d\sigma}{d\Omega} = N \left( \overline{ | b |^2 } - | \overline{b} |^2 \right) \tag{149} $$
The incoherent scattering is proportional to the number of nuclei N and is independent of the direction of $\mathbf{q}$. It is obvious that the coherent and incoherent contributions are profoundly different. Only the coherent contribution can be utilized to determine the crystalline structure.
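The separation into coherent and incoherent parts can be illustrated with a short Python sketch. It assumes a hypothetical element with two isotopes of equal abundance and made-up scattering amplitudes (in arbitrary units), and a one-dimensional chain of N sites for the lattice sum; none of these numbers or function names come from the notes.

```python
import cmath
import math

def isotope_averages(abundances, amplitudes):
    """Return (mean of b, mean of |b|^2) over a random isotope distribution."""
    b_mean = sum(c * b for c, b in zip(abundances, amplitudes))
    b2_mean = sum(c * abs(b) ** 2 for c, b in zip(abundances, amplitudes))
    return b_mean, b2_mean

def cross_sections_per_nucleus(abundances, amplitudes):
    """Coherent and incoherent total cross-sections per nucleus,
    4 pi |mean b|^2 and 4 pi (mean |b|^2 - |mean b|^2)."""
    b_mean, b2_mean = isotope_averages(abundances, amplitudes)
    sigma_coh = 4.0 * math.pi * abs(b_mean) ** 2
    sigma_inc = 4.0 * math.pi * (b2_mean - abs(b_mean) ** 2)
    return sigma_coh, sigma_inc

def lattice_sum(q, n_sites, a=1.0):
    """sum_{i,j} exp(i q (R_i - R_j)) for a chain with R_j = j a."""
    amplitude = sum(cmath.exp(1j * q * a * j) for j in range(n_sites))
    return abs(amplitude) ** 2

# Hypothetical isotope mixture: equal abundances, amplitudes 1 and 3 (arbitrary units).
coh, inc = cross_sections_per_nucleus([0.5, 0.5], [1.0, 3.0])
print("coherent:", coh, "incoherent:", inc)

n = 20
print("lattice sum at a Bragg vector:", lattice_sum(2.0 * math.pi, n))  # close to N^2
print("lattice sum at a generic q:  ", lattice_sum(1.0, n))             # of order 1
```

The lattice sum only reaches N² at a Bragg vector, which is why the coherent part carries the structural information, while the incoherent part simply adds N identical isotropic contributions.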
4.3 Theory of the Differential Scattering Cross-section
By definition, the differential scattering cross-section $\frac{d\sigma}{d\Omega}$ is the ratio of the number of particles scattered $dN_{scatt}$ (per unit time) into a solid angle $d\Omega = \sin\theta \, d\theta \, d\varphi$ to the incident flux of particles F (number of particles crossing unit area per unit time) times the solid angle element
$$ dN_{scatt} = F \, \frac{d\sigma}{d\Omega} \, d\Omega \tag{150} $$
Consider a beam of particles, collimated to have a momentum $\mathbf{k}$, that is incident on a crystal. The particles are assumed to interact with either the electrons or nuclei of the solid. An example is x-ray diffraction, in which the beam of photons interacts elastically with the electron density, or alternatively in neutron diffraction experiments the beam of neutrons interacts, via short ranged nuclear forces, with the nuclei of the solid. The interaction Hamiltonian between a particle in the beam and the relevant particles of the solid can be represented as the sum of single particle interactions
$$ \hat{H}_{int} = \sum_j V_j( \mathbf{r} - \mathbf{r}_j ) \tag{151} $$
Here, $\mathbf{r}$ represents the position of the beam particle and $\mathbf{r}_j$ is the position of the j-th particle in the solid. For x-rays in which the energy of the photon is in the keV range, the photon energy is much greater than the electronic energy scale. This has the effect that only certain terms of the interaction Hamiltonian between the x-rays and the electron need be considered. The non-relativistic form of the interaction between the electromagnetic field represented by a vector potential $\mathbf{A}(\mathbf{r}, t)$ and particles of charge q and mass m is given by
$$ H_{int} = - \sum_j \left[ \frac{q}{2 m c} \left( \hat{\mathbf{p}}_j \cdot \mathbf{A}(\mathbf{r}_j, t) + \mathbf{A}(\mathbf{r}_j, t) \cdot \hat{\mathbf{p}}_j \right) - \frac{q^2}{2 m c^2} \, \mathbf{A}(\mathbf{r}_j, t) \cdot \mathbf{A}(\mathbf{r}_j, t) \right] \tag{152} $$
where $\mathbf{r}_j$ and $\hat{\mathbf{p}}_j$ are the position and momentum of the j-th particle. The first pair of terms involve processes in which a single photon is absorbed or emitted, whereas the last term involves the interaction of two photons with the charged particle. To calculate the cross-section for light scattering, one needs to consider terms of fourth order in the vector potential $\mathbf{A}(\mathbf{r}, t)$, as both the initial and final states each involve a photon. In principle, this requires including the first pair of terms in fourth order as well as the last term in second order. However, as the fourth order processes involve intermediate states in which a very high energy photon has either been absorbed or emitted, the energy denominator involving the intermediate state is large. Thus, these contributions can safely be ignored and only the last term in the interaction need be considered explicitly in the calculation of the elastic scattering cross section. Thus, in this approximation, the x-rays couple to the density of the charged particles, $\rho(\mathbf{r}) = \sum_j \delta^3( \mathbf{r} - \mathbf{r}_j )$. For electrons, the coupling constant is proportional to the length
$$ \frac{e^2}{m_e c^2} = \left( \frac{e^2}{\hbar c} \right) \left( \frac{\hbar}{m_e c} \right) \sim 10^{-15} \ \mathrm{m} \tag{153} $$
which involves the fine structure constant and the Compton wave length. The resulting length scale is the so-called classical radius of the electron.
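The order of magnitude quoted in Eq. (153) can be verified by multiplying the fine structure constant by the reduced Compton wave length. The short Python sketch below uses CODATA 2018 values of the constants in SI units; the variable names are arbitrary.

```python
# CODATA 2018 values (SI units).
ALPHA = 7.2973525693e-3     # fine structure constant, e^2 / (hbar c) in Gaussian units
HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
C = 2.99792458e8            # speed of light, m / s

reduced_compton = HBAR / (M_E * C)        # hbar / (m_e c), in metres
classical_radius = ALPHA * reduced_compton

print(f"hbar/(m_e c)  = {reduced_compton:.4e} m")
print(f"e^2/(m_e c^2) = {classical_radius:.4e} m")
```

The result, about 2.8 × 10⁻¹⁵ m, is consistent with the estimate of 10⁻¹⁵ m given in the text.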
4.3.1 Time Dependent Perturbation Theory
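The incident beam has the asymptotic form of a momentum eigenstate with eigenvalue $\hbar \mathbf{k}$
$$ \Psi_k(\mathbf{r}, t) = \left( \frac{1}{V} \right)^{\frac{1}{2}} \exp\left[ + i \mathbf{k} \cdot \mathbf{r} \right] \exp\left[ - i \omega_k t \right] \tag{154} $$
The time independent part of the asymptotic initial state will be denoted by $| k >$ in Dirac notation. The scattered wave at the detector has an asymptotic form of a momentum eigenstate $| k' >$ with momentum eigenvalue $\hbar \mathbf{k}'$. The matrix elements of the interaction potential are given as
$$ \begin{aligned} < k' | \hat{H}_{int} | k > &= \frac{1}{V} \sum_j \int_V d^3 r \; \exp\left[ - i \mathbf{k}' \cdot \mathbf{r} \right] V_j( \mathbf{r} - \mathbf{r}_j ) \exp\left[ + i \mathbf{k} \cdot \mathbf{r} \right] \\ &= \frac{1}{V} \sum_j \int_V d^3 R' \; \exp\left[ - i ( \mathbf{k} - \mathbf{k}' ) \cdot \mathbf{R}' \right] V_j( \mathbf{R}' ) \exp\left[ - i ( \mathbf{k} - \mathbf{k}' ) \cdot \mathbf{r}_j \right] \end{aligned} \tag{155} $$
where $\mathbf{R}' = \mathbf{r} - \mathbf{r}_j$. The integration over $\mathbf{R}'$ yields the Fourier transform of the interaction potential between the scattered particle and the j-th atom
$$ \begin{aligned} V_j(\mathbf{q}) &= \int_V d^3 R' \; \exp\left[ - i \mathbf{q} \cdot \mathbf{R}' \right] V_j( \mathbf{R}' ) \\ &\approx \int d^3 R' \; \exp\left[ - i \mathbf{q} \cdot \mathbf{R}' \right] V_j( \mathbf{R}' ) \end{aligned} \tag{156} $$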
Given one incident particle in the state $| \Psi_k(t) >$, which is initially in an energy eigenstate $| k >$ before the interaction $\hat{H}_{int}$ is turned on adiabatically at $t \to - \infty$, the state of this particle evolves according to the Schrödinger equation
$$ i \hbar \frac{\partial}{\partial t} | \Psi_k(t) > = \left( \hat{H}_0 + \hat{H}_{int} \right) | \Psi_k(t) > \tag{157} $$
As the interaction is weak, the Schrödinger equation can be solved perturbatively using the interaction representation. In the interaction representation the states are transformed through a unitary operator in a manner such that
$$ | \tilde{\Psi}_k(t) > = \exp\left[ + \frac{i}{\hbar} \hat{H}_0 t \right] | \Psi_k(t) > \tag{158} $$
This unitary transformation would make the eigenstate of the non-interacting particle time independent. However, the presence of a non-zero interaction term leads to the time dependent equation of motion
$$ i \hbar \frac{\partial}{\partial t} | \tilde{\Psi}_k(t) > = \hat{\tilde{H}}_{int}(t) \, | \tilde{\Psi}_k(t) > \tag{159} $$
where the new interaction operator is time dependent and is given by
$$ \hat{\tilde{H}}_{int}(t) = \exp\left[ + \frac{i}{\hbar} \hat{H}_0 t \right] \hat{H}_{int} \exp\left[ - \frac{i}{\hbar} \hat{H}_0 t \right] \tag{160} $$
The equation of motion in the interaction representation can be solved by iteration. The equation is integrated to yield
$$ | \tilde{\Psi}_k(t) > = | k > - \frac{i}{\hbar} \int_{-\infty}^{t} dt' \; \hat{\tilde{H}}_{int}(t') \, | \tilde{\Psi}_k(t') > \tag{161} $$
On iterating once, it is found that the state is given to first order in the interaction by
$$ | \tilde{\Psi}_k(t) > = | k > - \frac{i}{\hbar} \int_{-\infty}^{t} dt' \; \hat{\tilde{H}}_{int}(t') \, | k > + \ldots \tag{162} $$
This shows that, if the wave function is started in an initial state which is an energy eigenstate of the unperturbed Hamiltonian, the time evolution caused by the interaction will admix other states into the wave function. In this sense, the particle described by the wave function may be considered as undergoing transitions between the unperturbed energy eigenstates.
4.3.2
The Fermi-Golden Rule
The rate at which the particle makes a transition from the initial state $| k \rangle$ to the state $| k' \rangle$, due to the effect of $\hat{H}_{int}$, is given to second order in the interaction by the Fermi-Golden rule. The probability that the system has made a transition at time t is given by the squared modulus of the transition amplitude
$$
\langle k' | \Psi_k(t) \rangle \tag{163}
$$
However, it is more convenient to calculate the probability based on the matrix elements evaluated in the interaction representation
$$
\langle k' | \tilde{\Psi}_k(t) \rangle \tag{164}
$$
These two quantities are equivalent, as they are simply related via
$$
\langle k' | \Psi_k(t) \rangle = \exp\left[ - \frac{i}{\hbar} E(k') \, t \right] \langle k' | \tilde{\Psi}_k(t) \rangle \tag{165}
$$
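and the phase factor cancels out in the squared modulus. To first order in the perturbation, the transition amplitude is given by
$$
\langle k' | \tilde{\Psi}_k(t) \rangle = - \frac{i}{\hbar} \int_{-\infty}^{t} dt' \, \langle k' | \hat{\tilde{H}}_{int}(t') | k \rangle
= - \frac{i}{\hbar} \int_{-\infty}^{t} dt' \, \exp\left[ \frac{i}{\hbar} ( E(k') - E(k) - i \eta ) \, t' \right] \langle k' | \hat{H}_{int} | k \rangle \tag{166}
$$
where $E(k)$ and $E(k')$ are the unperturbed (non-interacting) energies of the initial and final states of the beam particles. The factor $\eta$ corresponds to adiabatically switching on the interaction at $t' \to - \infty$. The probability that the transition has occurred at time t is given by
$$
\frac{1}{\hbar^2} \left| \int_{-\infty}^{t} dt' \, \exp\left[ \frac{i}{\hbar} ( E(k') - E(k) - i \eta ) \, t' \right] \langle k' | \hat{H}_{int} | k \rangle \right|^2 \tag{167}
$$
The rate at which the transition occurs is given by the time derivative of the transition probability
$$
P(k \to k', t) = \frac{1}{\hbar^2} \frac{\partial}{\partial t} \left| \int_{-\infty}^{t} dt' \, \exp\left[ \frac{i}{\hbar} ( E(k') - E(k) - i \eta ) \, t' \right] \langle k' | \hat{H}_{int} | k \rangle \right|^2 \tag{168}
$$
The transition rate is evaluated as
$$
P(k \to k', t) = | \langle k' | \hat{H}_{int} | k \rangle |^2 \, \frac{\partial}{\partial t} \left( \frac{ \exp\left[ \frac{2 \eta t}{\hbar} \right] }{ ( E(k') - E(k) )^2 + \eta^2 } \right)
= \frac{2}{\hbar} \, | \langle k' | \hat{H}_{int} | k \rangle |^2 \, \frac{ \eta \, \exp\left[ \frac{2 \eta t}{\hbar} \right] }{ ( E(k') - E(k) )^2 + \eta^2 } \tag{169}
$$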
Then, in the limit $\eta \to 0$, the transition rate becomes time independent, and the energy dependent terms reduce to $\pi$ times an energy conserving delta function, since
$$
\lim_{\eta \to 0} \frac{\eta}{ ( E(k') - E(k) )^2 + \eta^2 } = \pi \, \delta( E(k') - E(k) ) \tag{170}
$$
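The limit in eqn(170) can be checked numerically. The sketch below is illustrative only: the function names and the Gaussian test function are assumptions introduced here, not part of the notes. It integrates a smooth function against the weight $\eta / ( x^2 + \eta^2 )$, where x stands for $E(k') - E(k)$, and shows that the result approaches $\pi$ times the value of the function at x = 0 as $\eta$ becomes small.

```python
import math

def smeared_integral(f, eta, n=20000):
    """Integrate f(x) * eta / (x**2 + eta**2) over the whole real line.

    The substitution x = eta * tan(u) turns the Lorentzian weight times dx
    into du, so the integral becomes that of f(eta * tan(u)) over
    (-pi/2, pi/2), which is evaluated here with the midpoint rule.
    """
    du = math.pi / n
    total = 0.0
    for i in range(n):
        u = -math.pi / 2 + (i + 0.5) * du
        total += f(eta * math.tan(u)) * du
    return total

def gaussian(x):
    """Hypothetical smooth test function with gaussian(0) = 1."""
    return math.exp(-x * x)

if __name__ == "__main__":
    # As eta decreases, the printed values should approach pi * gaussian(0).
    for eta in (1.0, 0.1, 0.001):
        print(eta, smeared_integral(gaussian, eta))
```

For a broad weight (large $\eta$) the integral samples the test function away from x = 0 and falls well below $\pi$; only in the limit of small $\eta$ does the weight act like $\pi \, \delta(x)$.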
Hence, we have obtained the Fermi-Golden rule
$$
\lim_{\eta \to 0} P(k \to k', t) = \frac{2 \pi}{\hbar} \, | \langle k' | \hat{H}_{int} | k \rangle |^2 \, \delta( E(k') - E(k) ) \tag{171}
$$
This expression represents the probability per unit time for a transition to occur from the initial state to a very specific final state, with a precisely known $k'$ that exactly conserves energy. As the rate contains a Dirac delta function it is necessary, for the rate to be mathematically meaningful, to introduce a distribution of final states. Thus, one must sum over all states with $k'$ in the solid angle subtended by the detector, irrespective of the magnitude of $k'$. Thus, the Dirac delta function is to be replaced by the density of final states with energy $E = E(k)$ which are travelling in the direction $d\Omega$. The probability that a particle makes the transition from state $k$ to states with final momentum in a solid angle $d\Omega$ distributed around $k'$, per unit time, is given by summing over the number of allowed final states
$$
P(k \to d\Omega) = \int_0^{\infty} \frac{V}{( 2 \pi )^3} \, dk' \, k'^2 \, d\Omega \, \frac{2 \pi}{\hbar} \, | \langle k' | \hat{H}_{int} | k \rangle |^2 \, \delta( E - E(k') )
= \frac{2 \pi}{\hbar} \, | \langle k' | \hat{H}_{int} | k \rangle |^2 \, \rho_{d\Omega}(E, k') \tag{172}
$$
where $\rho_{d\Omega}(E, k')$ is the density of final scattering states per unit energy range, defined as
$$
\rho_{d\Omega}(E, k') = \frac{V}{( 2 \pi )^3} \, d\Omega \int_0^{\infty} dk' \, k'^2 \, \delta( E - E(k') ) \tag{173}
$$
The matrix elements of the interaction operator are to be evaluated with $k'$ that have the magnitude of $k$ and are headed in direction $d\Omega$.
4.3.3
The Elastic Scattering Cross-Section
The scattering cross-section is defined by
$$
\frac{d\sigma}{d\Omega} \, d\Omega = \frac{ P(k \to d\Omega) }{ F } \tag{174}
$$
where the incident flux F is the density of particles (which is one per unit volume, i.e. $\frac{1}{V}$) times the velocity. For massive particles the velocity is just $\frac{\hbar k}{m}$. Thus, for particles of mass $m_n$, the flux is given by
$$
F = \frac{\hbar k}{m_n V} \tag{175}
$$
On changing the variable of integration from $dk'$ to $dE'$, the density of final states is evaluated by integrating over the energy conserving delta function
$$
\rho_{d\Omega}(E, k') = \frac{V}{( 2 \pi )^3} \, d\Omega \int_0^{\infty} dE' \, \frac{dk'}{dE'} \, k'^2 \, \delta( E - E(k') )
= \frac{V}{( 2 \pi )^3} \, k'^2 \, \frac{dk'}{dE'} \, d\Omega \tag{176}
$$
where the magnitude of $k'$ is determined by the solution of $E = E(k')$, hence $k' = k$. For massive particles, one has the energy momentum relation
$$
dE' = \frac{\hbar^2 k'}{m_n} \, dk' \tag{177}
$$
and so, the density of final states can be written as
$$
\rho_{d\Omega}(E, k') = \frac{V}{( 2 \pi )^3} \, \frac{m_n k'}{\hbar^2} \, d\Omega \tag{178}
$$
On inserting the Fermi-Golden rule expression for $P(k \to d\Omega)$
$$
P(k \to d\Omega) = \frac{2 \pi}{\hbar} \, | \langle k' | \hat{H}_{int} | k \rangle |^2 \, \rho_{d\Omega}(E, k') \tag{179}
$$
the final density of states $\rho_{d\Omega}(E, k)$ and the flux F into the expression eqn(174) for the scattering cross-section, one finds that the elastic scattering cross-section for massive particles, such as neutrons, is calculated as
$$
\frac{d\sigma}{d\Omega} = \left( \frac{V m_n}{2 \pi \hbar^2} \right)^2 \left| \int_V d^3r \, \Psi^*_{k'}(r) \, \hat{H}_{int}(r) \, \Psi_k(r) \right|^2
= \left( \frac{m_n}{2 \pi \hbar^2} \right)^2 \sum_{j,j'} V_j(q) \, V^*_{j'}(q) \, \exp[ - i q \cdot ( R_j - R_{j'} ) ] \tag{180}
$$
where q is the scattering vector
$$
k - k' = q \tag{181}
$$
The magnitude of the scattering vector is related to the scattering angle $\theta$ via
$$
q = 2 k \sin \frac{\theta}{2} \tag{182}
$$
On substituting the point contact interaction appropriate for nuclear scattering, and noting that the Fourier transform of the delta function is q independent, one finds the expression for the Fourier component of the potential
$$
V_j(q) = \frac{2 \pi \hbar^2}{m_n} \, b_j \tag{183}
$$
Substituting for $V_j(q)$ in the above expression for the cross-section yields the formula for the elastic neutron scattering cross-section
$$
\frac{d\sigma}{d\Omega} = \sum_{j,j'} b_j \, b^*_{j'} \, \exp[ - i q \cdot ( R_j - R_{j'} ) ] \tag{184}
$$
previously discussed. For massless particles such as photons, the incident flux is just
$$
F = \frac{c}{V} \tag{185}
$$
if the incident vector potential is
$$
A(r, t) = \hat{e}_{\alpha} \, c \, \sqrt{ \frac{\hbar}{2 \omega V} } \, \exp[ i ( k \cdot r - \omega t ) ] + \mathrm{c.c.} \tag{186}
$$
With this normalization, the vector potential represents one incident photon per volume V, with frequency $\omega$ and incident polarization $\hat{e}_{\alpha}$. The density of final states (for polarization $\hat{e}_{\beta}$) is just
$$
\rho_{d\Omega}(E, k') = \frac{V}{( 2 \pi )^3} \, \frac{k'^2}{\hbar c} \, d\Omega \tag{187}
$$
Thus, it is found that the cross-section for elastic x-ray scattering is simply given by
$$
\frac{d\sigma}{d\Omega} = \frac{2 \pi}{\hbar^2 c} \, \frac{V^2 \omega^2}{( 2 \pi c )^3} \left( \frac{e^2}{2 m_e c^2} \right)^2 \left| \int_V d^3r \, A^*_{k'}(r) \cdot \hat{\rho}(r) \, A_k(r) \right|^2
= \left( \frac{e^2}{4 \pi m_e c^2} \right)^2 | \hat{e}_{\alpha} \cdot \hat{e}_{\beta} |^2 \sum_{j,j'} S(q) \, S^*(q) \, \exp[ - i q \cdot ( R_j - R_{j'} ) ] \tag{188}
$$
where the structure factor $S(q)$ is the contribution of a unit cell to the Fourier transform of the electron density. The vectors $R_i$ are the lattice vectors. Thus, the factors of V and $\omega$ cancel, leading to a scattering cross-section that only depends on the Fourier transform of the electronic density and has a coupling constant which is the square of the classical radius of the electron
$$
r_e^2 = \left( \frac{e^2}{4 \pi m_e c^2} \right)^2 \tag{189}
$$
From the form of this coupling constant, it can be seen that the scattering of x-rays from the density of charged nuclei is entirely negligible compared with the scattering from the electron density.
4.3.4
The Condition for Coherent Scattering
Consider scattering from a crystal which has a mono-atomic basis and has a finite spatial extent. In this case, the subscript on the atomic potential can be dropped, and the summations over j and j' run over all the lattice sites. For convenience, it shall be assumed that the crystal has the same shape as the primitive unit cell but has overall dimensions $( N_1 - 1 ) a_1$, $( N_2 - 1 ) a_2$ and $( N_3 - 1 ) a_3$ along the various primitive lattice directions. The solid, therefore, contains a total of $N_1 N_2 N_3$ primitive unit cells and, as the basis consists of one atom, the solid contains a total of $N = N_1 N_2 N_3$ atoms. The summation over $R_j$ in the scattering cross-section can be performed by expressing the general direct lattice vector in terms of the primitive lattice vectors, $R_j = n_1 a_1 + n_2 a_2 + n_3 a_3$, so that
$$
\sum_j \exp[ i q \cdot R_j ] = \sum_{n_1, n_2, n_3} \exp[ i n_1 q \cdot a_1 ] \, \exp[ i n_2 q \cdot a_2 ] \, \exp[ i n_3 q \cdot a_3 ] \tag{190}
$$
The sum over $n_1$ runs from 0 to $N_1 - 1$, and similarly for $n_2$ and $n_3$. This gives the product of three factors, each of the form
$$
\sum_{n_1 = 0}^{N_1 - 1} \exp[ i n_1 q \cdot a_1 ] = \frac{ 1 - \exp[ i N_1 q \cdot a_1 ] }{ 1 - \exp[ i q \cdot a_1 ] }
= \exp\left[ + i \frac{( N_1 - 1 )}{2} q \cdot a_1 \right] \times \frac{ \exp\left[ i \frac{N_1}{2} q \cdot a_1 \right] - \exp\left[ - i \frac{N_1}{2} q \cdot a_1 \right] }{ \exp\left[ i \frac{1}{2} q \cdot a_1 \right] - \exp\left[ - i \frac{1}{2} q \cdot a_1 \right] }
= \exp\left[ + i \frac{( N_1 - 1 )}{2} q \cdot a_1 \right] \left( \frac{ \sin \frac{N_1}{2} q \cdot a_1 }{ \sin \frac{1}{2} q \cdot a_1 } \right) \tag{191}
$$
This function exhibits the effect of the constructive and destructive interference between the scattered waves emanating from the various atoms forming the
solid. The numerator of the function falls to zero at
$$
q \cdot a_1 = \frac{2 m \pi}{N_1} \tag{192}
$$
for general integer values of m. The numerator has maximum magnitude at
$$
q \cdot a_1 = \frac{( 2 m + 1 ) \pi}{N_1} \tag{193}
$$
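The closed form of eqn(191) can be compared with the direct sum over lattice sites. The following is a minimal sketch, in which the function names and the variable x (standing for $q \cdot a_1$) are introduced here purely for illustration. It checks that the two expressions agree, that the sum vanishes at the zeros given by eqn(192) when m is not a multiple of $N_1$, and that all $N_1$ terms add in phase when x is a multiple of $2 \pi$.

```python
import cmath
import math

def direct_sum(x, n):
    """Sum of exp(i*k*x) for k = 0 .. n-1, where x stands for q . a1."""
    return sum(cmath.exp(1j * k * x) for k in range(n))

def closed_form(x, n):
    """Phase factor times sin(n x / 2) / sin(x / 2), as in eqn (191)."""
    s = math.sin(x / 2)
    if abs(s) < 1e-12:
        # x is a multiple of 2 pi: every term of the sum equals 1.
        return complex(n)
    phase = cmath.exp(1j * (n - 1) * x / 2)
    return phase * math.sin(n * x / 2) / s

if __name__ == "__main__":
    n1 = 8
    for x in (0.7, 2 * math.pi * 3 / n1, 2 * math.pi):
        print(x, abs(direct_sum(x, n1)), abs(closed_form(x, n1)))
```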
The overall q dependence is dominated by the denominator which falls to zero when $q \cdot a_1 = 2 m \pi$, for integer m. At these special q values, the function has to be evaluated by L'Hôpital's rule and has a limiting value of magnitude $N_1$. This occurs since, for these q values, the exponential phase factors are all in phase (and equal to unity) and so the sum over the $N_1$ terms simply yields $N_1$. Thus, the scattering cross-section is proportional to the product of the modulus square of three of these factors
$$
\frac{d\sigma}{d\Omega} = r_e^2 \, | F(q) |^2 \times \left( \frac{ \sin \frac{N_1}{2} q \cdot a_1 }{ \sin \frac{1}{2} q \cdot a_1 } \right)^2 \times \left( \frac{ \sin \frac{N_2}{2} q \cdot a_2 }{ \sin \frac{1}{2} q \cdot a_2 } \right)^2 \left( \frac{ \sin \frac{N_3}{2} q \cdot a_3 }{ \sin \frac{1}{2} q \cdot a_3 } \right)^2 \tag{194}
$$
Since, for a macroscopic solid, the numbers $N_1$, $N_2$ and $N_3$ are of the order of $10^7$, the three factors vary rapidly with the magnitudes of $q \cdot a_i$. The maxima occur when the three conditions
$$
q \cdot a_1 = 2 \pi m_1 , \qquad q \cdot a_2 = 2 \pi m_2 , \qquad q \cdot a_3 = 2 \pi m_3 \tag{195}
$$
are satisfied. These special values of q are denoted by Q. In this case, one finds that the scattering cross-section is simply proportional to
$$
\frac{d\sigma}{d\Omega} \sim r_e^2 \, | F(Q) |^2 \, N^2 \tag{196}
$$
which is proportional to the square of the number of atoms in the solid. The coherent scattering from an ordered solid should be contrasted with incoherent scattering from the atoms of a gas. Due to the positional disorder in the gas, the phase factors may be considered to be random. The net scattering intensity for scattering from a gas of N atoms is then approximately equal to just N times the scattering intensity for an isolated atom. The coherent scattering from atoms in a solid possessing long-ranged order is a factor of $N^2$ larger than the scattering intensity for an isolated atom. In summary, the condition that there is complete constructive interference between all the atoms in the solid is given by
$$
\exp[ i Q \cdot R_i ] = 1 \qquad \forall \; i \tag{197}
$$
The intensity of the scattered beam is exceptionally large at these special values Q, compared with all other q values. Thus, coherent scattering is the dominant feature of diffraction from crystalline solids but occurs only rarely, since it requires that the scattered wave length and scattering angle satisfy the above stringent condition. These special values of Q are the lattice vectors of the reciprocal lattice. ——————————————————————————————————
4.3.5
Exercise 18
Consider a sample with N unit cells arranged in M micro-crystals that are oriented parallel with respect to each other, but their positions are random. Calculate the width and height of the Bragg peak. ——————————————————————————————————
4.3.6
Exercise 19
At finite temperatures, the atoms of a crystal undergo thermal vibrations. Due to the vibrations, the intensities of the Bragg peaks are reduced by a Debye-Waller factor which involves the spectrum of lattice vibrations. However, this situation can be approximately modelled by assuming that each atom undergoes a small random displacement $\delta R$ from its equilibrium position $R$. Assume that the displacements are small compared with the separation between neighboring atoms, $| \delta R | \ll a$, and are Gaussian distributed. Also assume that the displacements of different atoms are entirely uncorrelated, $\overline{ \delta_{i,R} \, \delta_{j,R'} } = 0$ for $R \neq R'$. Calculate the diffraction peak intensity, and show that the largest reduction occurs for large Q values. ——————————————————————————————————
4.3.7
Exercise 20
Evaluate the effect of a significant number of thermally induced vacancies (missing atoms) in the elastic scattering cross-section from a crystal. ——————————————————————————————————
4.3.8
Anti-Domain Phase Boundaries
The order-disorder transition usually starts at several nucleation centers in a crystal. For CuZn the underlying CsCl lattice can be divided into two interpenetrating simple cubic sub-lattices: the A and B sub-lattice. In several regions, the nucleation may start with the Cu atoms condensing on the A sub-lattice, whereas the nucleation may occur in other regions where the Cu atoms condense on the B sub-lattice. These distinct domains of nucleation grow and spread through the crystal until they meet and the entire crystal is ordered. The interfaces of the different domains meet at anti-domain phase boundaries at which there is a mismatch of the long-ranged ordering of the atoms. Due to the mismatch, two planes containing similar atoms form the anti-domain phase boundary. The effect of anti-domain phase boundaries is to smear out the "super-lattice" Bragg peaks. This can be seen by considering the amplitude of the scattered x-rays as a superposition of the scattering from the various domains. For simplicity, let us consider the scattering from two domains of identical shape and size. If the scattering amplitude from one domain is denoted by $A_1(q)$ and the scattering from the second domain is denoted by $A_2(q)$ then, as the scattering amplitudes are additive, one obtains
$$
A(q) = A_1(q) + A_2(q) \tag{198}
$$
where
$$
A_2(q) = \exp[ i q \cdot \delta R ] \, A_1(q) \tag{199}
$$
and $\delta R$ is the vector displacement between the origins of the two domains. The scattering amplitude $A_1(q)$ is given by
$$
A_1(q) \propto \left( \frac{ \sin \frac{N_1 q_x a}{2} }{ \sin \frac{q_x a}{2} } \right) \left( \frac{ \sin \frac{N_2 q_y a}{2} }{ \sin \frac{q_y a}{2} } \right) \left( \frac{ \sin \frac{N_3 q_z a}{2} }{ \sin \frac{q_z a}{2} } \right) \tag{200}
$$
For a domain wall in the y-z plane, the displacement between the two Cu sub-lattices is given by
$$
\delta R = \left( N_1 + \frac{1}{2} \right) a \, \hat{e}_x + \frac{a}{2} \, \hat{e}_y + \frac{a}{2} \, \hat{e}_z \tag{201}
$$
Hence, for a CsCl-type structure and if q is close to Q, the total scattering amplitude is given by the expression
$$
A(q) \sim A_1(q) \left( 1 + ( - 1 )^{m_1 + m_2 + m_3} \exp[ i N_1 q_x a ] \right) \tag{202}
$$
The total intensity of the scattered wave is proportional to
$$
I(q) \propto 2 \, | A_1(q) |^2 \left( 1 + ( - 1 )^{m_1 + m_2 + m_3} \cos N_1 q_x a \right) \tag{203}
$$
Thus, if $m_1 + m_2 + m_3$ is even the intensity is modulated by the factor
$$
4 \cos^2 \frac{N_1 q_x a}{2} \tag{204}
$$
whereas, if $m_1 + m_2 + m_3$ is odd the intensity is modulated by the factor
$$
4 \sin^2 \frac{N_1 q_x a}{2} \tag{205}
$$
This factor is due to the interference of the scattering from the two domains. The destructive interference causes an exact cancellation of the intensity at the exact Bragg wave vector, at odd $m_1 + m_2 + m_3$. However, for $q_x$ slightly off the Bragg position
$$
q_x = \frac{2 \pi}{a} m_1 + \delta q_x , \qquad \delta q_x \sim \frac{\pi}{N_1 a} \tag{206}
$$
the scattered intensity is finite and large. That is, the single anti-domain phase boundary between domains of identical shapes and sizes produces a hole in the Bragg peak with odd $m_1 + m_2 + m_3$. For a crystal with a CsCl type structure which contains several anti-domain phases, one expects there to be three sets of anti-domain phase boundaries and one expects that each domain has a different size. On averaging over the distribution of domains, one expects the small oscillations in the scattered intensity from the single domain $S_1(q)$ to be washed out. Furthermore, one expects the intensities of the "super-lattice" peaks to be smeared out in q space.
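The hole in the odd-index peak can be illustrated with a one-dimensional version of the two-domain model. The sketch below is a simplification made here for illustration, not the three-dimensional calculation of the text: only the x component of the displacement in eqn(201) is kept, so the sign factor reduces to $( - 1 )^{m_1}$, and the function name and argument `qa` (standing for $q_x a$) are hypothetical.

```python
import cmath
import math

def two_domain_intensity(qa, n1):
    """|A1 + A2|^2 for two identical one-dimensional domains of n1 cells.

    qa is q_x * a. The second domain is displaced by (n1 + 1/2) * a,
    which is the x component of eqn (201).
    """
    a1 = sum(cmath.exp(1j * qa * n) for n in range(n1))
    a2 = cmath.exp(1j * qa * (n1 + 0.5)) * a1
    return abs(a1 + a2) ** 2

if __name__ == "__main__":
    n1 = 50
    on_peak_odd = two_domain_intensity(2 * math.pi * 1, n1)
    off_peak_odd = two_domain_intensity(2 * math.pi * 1 + math.pi / n1, n1)
    on_peak_even = two_domain_intensity(2 * math.pi * 2, n1)
    print(on_peak_odd, off_peak_odd, on_peak_even)
```

In this model the intensity vanishes exactly at the odd-index Bragg position, is large at the offset $\delta q_x \sim \pi / ( N_1 a )$ of eqn(206), and equals $4 N_1^2$ at the even-index peak.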
4.3.9
Exercise 21
Consider the scattering produced by a CsCl type material, with anti-domain walls. For simplicity, only consider the component of the scattering amplitude associated with a single primitive lattice vector. Let p be the probability of not crossing a domain wall on traversing one step a along a primitive lattice vector, and let q be the probability of crossing a domain wall, where $q \sim \frac{1}{N_1}$. Show that the average scattered intensity, near the "super-lattice" peaks, is proportional to the factor
$$
| A(q_x) |^2 \propto N_1 + \sum_{m_1 = 1}^{N_1} ( N_1 - m_1 ) \, ( p - q )^{m_1} \, 2 \cos m_1 q_x a
$$
$$
= \frac{ 2 N_1 \, p \, q }{ 2 \left( q^2 + ( p^2 - q^2 ) \sin^2 \frac{q_x a}{2} \right) } - \frac{ ( p^2 - q^2 ) \left( q^2 - ( p^2 + q^2 ) \sin^2 \frac{q_x a}{2} \right) }{ 2 \left( q^2 + ( p^2 - q^2 ) \sin^2 \frac{q_x a}{2} \right)^2 } + O\left( ( p - q )^{N_1 + 1} \right) \tag{207}
$$
Hence, show that the intensities of the "super-lattice" Bragg peaks are diminished and acquire low amplitude tails.
4.4
Elastic Scattering from Quasi-Crystals
The scattering intensities from three-dimensional quasi-crystals show ten-fold, six-fold and five-fold symmetric diffraction patterns which can be understood as arising from a space of six or more dimensions. Icosahedral symmetry can be found in a six dimensional hyper-cubic lattice. An icosahedron has 20 identical faces made of equilateral triangles. Five of the faces meet at each of the 12 vertices of the icosahedron, which is responsible for the five-fold symmetry. The x-ray scattering amplitude $A(q)$ from a one-dimensional quasi-crystal can be found by a projection from a two-dimensional lattice. The amplitude is a linear superposition of the scattered amplitudes from the sites $s_n$, where
$$
s_n = n \, a \cos \theta + m' \, a \sin \theta \tag{208}
$$
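and where the points $( n a , m' a )$ are restricted to lie in a two-dimensional strip. The amplitude is given by
$$
A(q) = \sum_n \exp[ i q s_n ] = \sum_{n, m'} \exp[ i q a ( \cos \theta \; n + \sin \theta \; m' ) ]
= \sum_{n, m} \exp[ i q a ( \cos \theta \; n + \sin \theta \; m ) ] \, \Theta( 1 + ( n + 1 ) \tan \theta - m ) \, \Theta( m - n \tan \theta ) \tag{209}
$$
This can be expressed as an integral over a two-dimensional space
$$
A(q) = \sum_{n, m} \exp[ i q a ( \cos \theta \; n + \sin \theta \; m ) ] \, \Theta( 1 + ( n + 1 ) \tan \theta - m ) \, \Theta( m - n \tan \theta )
$$
$$
= \int dx \int dy \, \exp[ i q ( \cos \theta \; x + \sin \theta \; y ) ] \sum_{n, m} \delta( x - n a ) \, \delta( y - m a ) \times \Theta( a + ( x + a ) \tan \theta - y ) \, \Theta( y - x \tan \theta )
$$
This is a two-dimensional Fourier transform
$$
A(q) = \int d^2r \, \exp[ i q \cdot r ] \sum_{n, m} \delta( x - n a ) \, \delta( y - m a ) \times \Theta( a + ( x + a ) \tan \theta - y ) \, \Theta( y - x \tan \theta ) \tag{210}
$$
which is to be evaluated on the one-dimensional line
$$
q = q \, ( \cos \theta , \sin \theta ) \tag{211}
$$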
The two-dimensional Fourier transform is recognized as the Fourier transform of a product
$$
A(q) = \int d^2r \, \exp[ i q \cdot r ] \, B(r) \, C(r) \tag{212}
$$
where $B(r)$ is non-zero on the sites of a two dimensional array
$$
B(r) = \sum_{m, n} \delta( x - n a ) \, \delta( y - m a ) \tag{213}
$$
and the function $C(r)$ projects onto a two-dimensional strip
$$
C(r) = \Theta( a + ( x + a ) \tan \theta - y ) \, \Theta( y - x \tan \theta ) \tag{214}
$$
This can be evaluated using the convolution theorem as the convolution of the Fourier transforms of the two factors
$$
A(q) = \int \frac{d^2q'}{( 2 \pi )^2} \, B(q - q') \, C(q') \tag{215}
$$
The function $B(q)$ is the scattering amplitude from the two-dimensional lattice
$$
B(q) = \sum_{n, m} \exp[ i ( q_x n a + q_y m a ) ] \tag{216}
$$
while the function $C(q')$ is evaluated as
$$
C(q') = \int dx \, \exp[ i ( q'_x + q'_y \tan \theta ) x ] \left( \frac{ \exp[ i q'_y a ( 1 + \tan \theta ) ] - 1 }{ i q'_y } \right)
= ( 2 \pi ) \, \delta( q'_x + q'_y \tan \theta ) \left( \frac{ \exp[ i q'_y a ( 1 + \tan \theta ) ] - 1 }{ i q'_y } \right) \tag{217}
$$
The scattering amplitude for the two-dimensional lattice is only non-zero at the two dimensional reciprocal lattice vectors $q = Q$. Thus, the scattering from the two-dimensional lattice is represented by the factor
$$
B(q) = \left( \frac{2 \pi}{a} \right)^2 \sum_Q \delta^2( q - Q ) \tag{218}
$$
Hence, we find that the amplitude in the two-dimensional space is given by
$$
A(q) = \frac{1}{a^2} \sum_Q C(q - Q)
= \frac{2 \pi}{a^2} \sum_Q \delta( q_x - Q_x + ( q_y - Q_y ) \tan \theta ) \times \left( \frac{ \exp[ i ( q_y - Q_y ) a ( 1 + \tan \theta ) ] - 1 }{ i ( q_y - Q_y ) } \right) \tag{219}
$$
Evaluating this on the line in q space yields the amplitude for scattering from the one-dimensional quasi-crystal
$$
A(q) = 2 \pi \sum_Q \delta\left( \frac{q a}{\cos \theta} - Q_x a - Q_y a \tan \theta \right) \times \left( \frac{ \exp[ i ( q \sin \theta - Q_y ) a ( 1 + \tan \theta ) ] - 1 }{ i ( q a \sin \theta - Q_y a ) } \right)
$$
$$
= 2 \pi \sum_Q \delta( q a - Q_x a \cos \theta - Q_y a \sin \theta ) \times \left( \frac{ \exp[ i ( Q_x \sin \theta - Q_y \cos \theta ) a ( \cos \theta + \sin \theta ) ] - 1 }{ i ( Q_x a \sin \theta - Q_y a \cos \theta ) } \right) \tag{220}
$$
This has delta function like peaks at the wave vectors given by
$$
q a = 2 \pi ( m_1 \cos \theta + m_2 \sin \theta ) \tag{221}
$$
where $m_1$ and $m_2$ are integers. The intensities of the peaks are proportional to
$$
| A(q) |^2 \propto \frac{ \sin^2 \left[ \pi ( m_1 \sin \theta - m_2 \cos \theta ) ( \cos \theta + \sin \theta ) \right] }{ ( m_1 \sin \theta - m_2 \cos \theta )^2 } \tag{222}
$$
Thus, the elastic scattering spectrum consists of a dense set of sharp peaks, but with varying intensities. The intensities are large when the ratio of $m_2$ to $m_1$ is close to the value $\tan \theta$.
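The dependence of the peak intensities in eqn(222) on the ratio $m_2 / m_1$ can be made concrete with a short calculation. The sketch below is illustrative: the function name is hypothetical, and the choice of $\tan \theta$ equal to the inverse golden ratio is an assumption made here to give an irrational slope, not a value taken from the text.

```python
import math

def peak_intensity(m1, m2, theta):
    """Relative intensity of the peak labelled (m1, m2), following eqn (222)."""
    u = m1 * math.sin(theta) - m2 * math.cos(theta)
    width = math.cos(theta) + math.sin(theta)
    if abs(u) < 1e-12:
        # Limit u -> 0 of sin^2(pi * u * width) / u^2.
        return (math.pi * width) ** 2
    return math.sin(math.pi * u * width) ** 2 / u ** 2

# Assumed slope of the strip: tan(theta) = 2 / (1 + sqrt(5)), about 0.618.
THETA = math.atan(2.0 / (1.0 + math.sqrt(5.0)))

if __name__ == "__main__":
    for m1, m2 in ((8, 5), (8, 1), (5, 3), (5, 0)):
        q_a = 2 * math.pi * (m1 * math.cos(THETA) + m2 * math.sin(THETA))
        print((m1, m2), round(q_a, 3), peak_intensity(m1, m2, THETA))
```

With this slope, the peak (8, 5), for which $m_2 / m_1 = 0.625$ is close to $\tan \theta$, is far more intense than the peak (8, 1).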
4.5
Elastic Scattering from a Fluid
The structure of a fluid, as expressed by the pair correlation function, can be inferred from elastic scattering experiments. The intensity of a beam of particles scattered from a liquid can be considered as analogous to the scattering from a solid with an infinite unit cell. First, we shall consider the atoms of the fluid as static point particles. The amplitudes of the beams scattered from each atom add, giving a total amplitude which is proportional to
$$
S(q) = \sum_j \exp[ i q \cdot r_j ] = \int d^3r \, \exp[ i q \cdot r ] \sum_j \delta^3( r - r_j ) \tag{223}
$$
The scattering intensity is given by the square of the scattered amplitude
$$
I(q) \propto | S(q) |^2 = \sum_{i,j} \exp[ + i q \cdot r_i ] \, \exp[ - i q \cdot r_j ]
= \int d^3r \int d^3r' \, \exp[ i q \cdot ( r - r' ) ] \sum_{i,j} \delta^3( r - r_i ) \, \delta^3( r' - r_j ) \tag{224}
$$
On considering the long time average of the atomic positions, one obtains
$$
I(q) \propto \int d^3r \int d^3r' \, \exp[ i q \cdot ( r - r' ) ] \; \overline{ \sum_{i,j} \delta^3( r - r_i ) \, \delta^3( r' - r_j ) }
= \int d^3r \int d^3r' \, \exp[ i q \cdot ( r - r' ) ] \, C(r, r') \tag{225}
$$
The scattering intensity can be expressed in terms of the radial distribution function $g(r)$, since
$$
C(r - r') = \delta^3( r - r' ) \, \rho(0) + g( r - r' ) \tag{226}
$$
Hence,
$$
I(q) \propto \int d^3r \, \rho(0) + \int d^3r \int d^3r' \, \exp[ i q \cdot ( r - r' ) ] \, g( r - r' )
= N + V \int d^3r \, \exp[ i q \cdot r ] \, g(r) \tag{227}
$$
However, the integral over $g(r)$ can be split into two parts
$$
I(q) \propto N + V \int d^3r \, \exp[ i q \cdot r ] \, \rho(0)^2 + V \int d^3r \, \exp[ i q \cdot r ] \, ( g(r) - \rho(0)^2 )
$$
$$
= N + V ( 2 \pi )^3 \rho(0)^2 \, \delta^3(q) + V \int d^3r \, \exp[ i q \cdot r ] \, ( g(r) - \rho(0)^2 )
$$
$$
= N + N^2 \frac{( 2 \pi )^3}{V} \, \delta^3(q) + V \int d^3r \, \exp[ i q \cdot r ] \, ( g(r) - \rho(0)^2 )
$$
$$
= N + N^2 \frac{( 2 \pi )^3}{V} \, \delta^3(q) + V \frac{4 \pi}{q} \int_0^{\infty} dr \, r \, \sin q r \, ( g(r) - \rho(0)^2 ) \tag{228}
$$
The first term represents the incoherent scattering. The second term represents coherent forward scattering. The integral in the last term is convergent and yields non-trivial information about the structure of the fluid.
5
The Reciprocal Lattice
The reciprocal lattice vectors play an important role in describing the properties of a solid that has periodic translational invariance. Any property of the solid, whether scalar, vector or tensor, should have the same periodic translational invariance as the potential due to the charged nuclei. This means that, due to the translational invariance, the physical property only needs to be specified in a finite volume, and this volume can then be periodically continued over all space. The vectors of the reciprocal lattice play an important and special role in the Fourier transform of the physical quantity. The Reciprocal Lattice Vectors have dimensions of inverse distance and are defined in terms of the direct primitive lattice vectors $a_1$, $a_2$ and $a_3$. The primitive reciprocal lattice vectors, $b^{(i)}$, are defined via the scalar product
$$
a_i \cdot b^{(j)} = 2 \pi \, \delta_{ij} \tag{229}
$$
where the Kronecker delta function $\delta_{ij}$ has the value 1 if $i = j$ and is zero if $i \neq j$. Thus, each primitive reciprocal lattice vector is orthogonal to two of the primitive lattice vectors of the direct lattice. The primitive reciprocal lattice vectors can be constructed via
$$
b^{(1)} = 2 \pi \, \frac{ a_2 \wedge a_3 }{ a_1 \cdot ( a_2 \wedge a_3 ) } , \qquad
b^{(2)} = 2 \pi \, \frac{ a_3 \wedge a_1 }{ a_1 \cdot ( a_2 \wedge a_3 ) } , \qquad
b^{(3)} = 2 \pi \, \frac{ a_1 \wedge a_2 }{ a_1 \cdot ( a_2 \wedge a_3 ) } \tag{230}
$$
where the last two expressions are found from the first by cyclic permutation of the labels (1, 2, 3). The denominator is just the volume of the primitive unit cell. The reciprocal lattice consists of the points given by the set of vectors Q where
$$
Q = m_1 b^{(1)} + m_2 b^{(2)} + m_3 b^{(3)} \tag{231}
$$
and $( m_1 , m_2 , m_3 )$ are integers. This set of vectors are the reciprocal lattice vectors. The reciprocal lattice vectors denote directions in the reciprocal lattice or are the normals to a set of planes in the direct lattice. In the latter case, as it shall be seen, the numbers $( m_1 , m_2 , m_3 )$ are equivalent to Miller indices and, hence, are enclosed in round brackets. ——————————————————————————————————
5.0.1
Exercise 22
Find the volume of the primitive unit cell of the reciprocal lattice. ——————————————————————————————————
5.1
The Reciprocal Lattice as a Dual Lattice
The reciprocal lattice vectors can be considered to be the duals of the direct lattice vectors. This relation can be seen by expressing the primitive lattice vectors $a_j$ in terms of the primitive reciprocal lattice vectors $b^{(i)}$, via
$$
a_j = \frac{1}{2 \pi} \sum_i g_{j,i} \, b^{(i)} \tag{232}
$$
The quantity $g_{i,j}$ is given by the metric, since
$$
a_j \cdot a_k = \frac{1}{2 \pi} \sum_i g_{j,i} \, b^{(i)} \cdot a_k \tag{233}
$$
and since
$$
b^{(i)} \cdot a_k = 2 \pi \, \delta^i_k \tag{234}
$$
one has
$$
g_{j,k} = a_j \cdot a_k \tag{235}
$$
Hence, $g_{j,k}$ is the metric tensor. The metric tensor expresses the length s of a vector r in terms of its components $x_i$ along the basis vectors $a_i$. That is, if
$$
r = \sum_i x_i \, a_i \tag{236}
$$
then, for a constant metric, the length is given in terms of the components via
$$
s^2 = \sum_{i,j} g_{i,j} \, x_i \, x_j \tag{237}
$$
The metric tensor, when evaluated in terms of the parameters of the primitive unit cell, is given by the matrix
$$
( g_{i,j} ) = \begin{pmatrix}
a_1^2 & a_1 a_2 \cos \alpha_3 & a_1 a_3 \cos \alpha_2 \\
a_1 a_2 \cos \alpha_3 & a_2^2 & a_2 a_3 \cos \alpha_1 \\
a_1 a_3 \cos \alpha_2 & a_2 a_3 \cos \alpha_1 & a_3^2
\end{pmatrix} \tag{238}
$$
The inverse transform is given by
$$
b^{(i)} = 2 \pi \sum_k g^{i,k} \, a_k \tag{239}
$$
where the quantity $g^{i,k}$ is identified as the metric for the dual vectors. Since
$$
a_j = \frac{1}{2 \pi} \sum_i g_{j,i} \, b^{(i)} = \sum_i g_{j,i} \sum_k g^{i,k} \, a_k \tag{240}
$$
and as
$$
a_j = \sum_k \delta_j^k \, a_k \tag{241}
$$
one infers that
$$
\delta_j^k = \sum_i g_{j,i} \, g^{i,k} \tag{242}
$$
Hence, the metric tensor is the inverse of the metric tensor for the dual vectors. The volume of the unit cell, $V_c$, is given by
$$
V_c^2 = \det ( g_{i,j} ) \tag{243}
$$
or
$$
V_c^2 = a_1^2 \, a_2^2 \, a_3^2 \left( 1 - \cos^2 \alpha_1 - \cos^2 \alpha_2 - \cos^2 \alpha_3 + 2 \cos \alpha_1 \cos \alpha_2 \cos \alpha_3 \right) \tag{244}
$$
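The equivalence of eqn(243) and eqn(244) can be verified numerically for an arbitrary cell. The sketch below is illustrative only; the function names and the sample triclinic cell parameters are assumptions introduced here. It builds the metric tensor of eqn(238), with $\alpha_1$ the angle between $a_2$ and $a_3$ and so on cyclically, and compares its determinant with the closed expression.

```python
import math

def metric_tensor(a1, a2, a3, alpha1, alpha2, alpha3):
    """Metric tensor g_ij = a_i . a_j of eqn (238); angles in radians."""
    c1, c2, c3 = math.cos(alpha1), math.cos(alpha2), math.cos(alpha3)
    return [
        [a1 * a1, a1 * a2 * c3, a1 * a3 * c2],
        [a1 * a2 * c3, a2 * a2, a2 * a3 * c1],
        [a1 * a3 * c2, a2 * a3 * c1, a3 * a3],
    ]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def cell_volume_squared(a1, a2, a3, alpha1, alpha2, alpha3):
    """Closed form for V_c^2 given in eqn (244)."""
    c1, c2, c3 = math.cos(alpha1), math.cos(alpha2), math.cos(alpha3)
    return (a1 * a2 * a3) ** 2 * (1 - c1 * c1 - c2 * c2 - c3 * c3
                                  + 2 * c1 * c2 * c3)

if __name__ == "__main__":
    # Hypothetical triclinic cell: lengths in arbitrary units, angles in degrees.
    cell = (3.0, 4.0, 5.0,
            math.radians(80.0), math.radians(95.0), math.radians(110.0))
    print(det3(metric_tensor(*cell)), cell_volume_squared(*cell))
```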
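The dual metric tensor is given by the inverse of the metric tensor, which is evaluated as the matrix
$$
( g^{i,j} ) = \frac{1}{V_c^2} \begin{pmatrix}
a_2^2 a_3^2 ( 1 - \cos^2 \alpha_1 ) & a_3^2 a_1 a_2 ( \cos \alpha_1 \cos \alpha_2 - \cos \alpha_3 ) & a_2^2 a_1 a_3 ( \cos \alpha_1 \cos \alpha_3 - \cos \alpha_2 ) \\
a_3^2 a_1 a_2 ( \cos \alpha_1 \cos \alpha_2 - \cos \alpha_3 ) & a_1^2 a_3^2 ( 1 - \cos^2 \alpha_2 ) & a_1^2 a_2 a_3 ( \cos \alpha_2 \cos \alpha_3 - \cos \alpha_1 ) \\
a_2^2 a_1 a_3 ( \cos \alpha_1 \cos \alpha_3 - \cos \alpha_2 ) & a_1^2 a_2 a_3 ( \cos \alpha_2 \cos \alpha_3 - \cos \alpha_1 ) & a_1^2 a_2^2 ( 1 - \cos^2 \alpha_3 )
\end{pmatrix} \tag{245}
$$
This dual metric is also defined as
$$
b^{(i)} \cdot b^{(j)} = ( 2 \pi )^2 \, g^{i,j} \tag{246}
$$
From this, one can immediately find that the lengths of the reciprocal lattice vectors are given by
$$
b_1 = 2 \pi \, \frac{a_2 a_3}{V_c} \, | \sin \alpha_1 | \tag{247}
$$
etc., and the angle $\beta_3$ between $b^{(1)}$ and $b^{(2)}$ is given by
$$
\cos \beta_3 = \frac{ ( \cos \alpha_1 \cos \alpha_2 - \cos \alpha_3 ) }{ | \sin \alpha_1 \sin \alpha_2 | } \tag{248}
$$
etc. On using the inverse transformation, the reciprocal lattice vectors are given in terms of the primitive direct lattice vectors by
$$
b^{(1)} = 2 \pi \, \frac{a_1^2 a_2^2 a_3^2}{V_c^2} \left( \frac{( 1 - \cos^2 \alpha_1 )}{a_1^2} \, a_1 + \frac{( \cos \alpha_1 \cos \alpha_2 - \cos \alpha_3 )}{a_1 a_2} \, a_2 + \frac{( \cos \alpha_1 \cos \alpha_3 - \cos \alpha_2 )}{a_1 a_3} \, a_3 \right)
$$
$$
b^{(2)} = 2 \pi \, \frac{a_1^2 a_2^2 a_3^2}{V_c^2} \left( \frac{( \cos \alpha_1 \cos \alpha_2 - \cos \alpha_3 )}{a_1 a_2} \, a_1 + \frac{( 1 - \cos^2 \alpha_2 )}{a_2^2} \, a_2 + \frac{( \cos \alpha_2 \cos \alpha_3 - \cos \alpha_1 )}{a_2 a_3} \, a_3 \right)
$$
$$
b^{(3)} = 2 \pi \, \frac{a_1^2 a_2^2 a_3^2}{V_c^2} \left( \frac{( \cos \alpha_1 \cos \alpha_3 - \cos \alpha_2 )}{a_1 a_3} \, a_1 + \frac{( \cos \alpha_2 \cos \alpha_3 - \cos \alpha_1 )}{a_2 a_3} \, a_2 + \frac{( 1 - \cos^2 \alpha_3 )}{a_3^2} \, a_3 \right) \tag{249}
$$
These expressions are equivalent to the expression in terms of the vector product, and they also satisfy the definitions of the primitive reciprocal lattice vectors
$$
a_i \cdot b^{(j)} = 2 \pi \, \delta_{ij} \tag{250}
$$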
Any vector of the direct Bravais Lattice can be expressed as
$$
R = n_1 a_1 + n_2 a_2 + n_3 a_3 \tag{251}
$$
A reciprocal lattice vector Q can also be written as
$$
Q = m_1 b^{(1)} + m_2 b^{(2)} + m_3 b^{(3)} \tag{252}
$$
where $( m_1 , m_2 , m_3 )$ are integers. Any vector k in the reciprocal lattice can be represented as a superposition of the primitive reciprocal lattice vectors
$$
k = \mu_1 b^{(1)} + \mu_2 b^{(2)} + \mu_3 b^{(3)} \tag{253}
$$
where the $\mu_i$ are, in general, non-integer. Thus, the scalar product of an arbitrary vector k in the reciprocal lattice and a Bravais Lattice vector R is evaluated as
$$
k \cdot R = 2 \pi \left( \mu_1 n_1 + \mu_2 n_2 + \mu_3 n_3 \right) \tag{254}
$$
If k is a reciprocal lattice vector Q then the set of $\mu_i$'s take on integer values $m_i$, so that the scalar product reduces to
$$
Q \cdot R = 2 \pi \left( m_1 n_1 + m_2 n_2 + m_3 n_3 \right) \tag{255}
$$
As the sum of the products of integers is still an integer ( say M ), the Laue condition can be expressed as
$$
Q \cdot R = 2 \pi M \tag{256}
$$
for all R. Thus, the Reciprocal Lattice vectors satisfy the Laue condition. This requirement is equivalent to the condition that the exponential phase factor given by
$$
\exp[ i Q \cdot R ] = 1 \tag{257}
$$
is unity for all Bravais Lattice vectors R. The vectors Q form a Bravais Lattice in which the primitive lattice vectors can be expressed in terms of the vectors b(i) . Also, the reciprocal lattice of a reciprocal lattice is the original direct lattice. ——————————————————————————————————
5.1.1
Exercise 23
Determine the primitive lattice vectors of the lattice that is reciprocal to the reciprocal lattice. How are they related to the vectors of the original direct lattice? ——————————————————————————————————
5.2
Examples of Reciprocal Lattices
Now some examples of reciprocal lattices are examined.
5.2.1
The Simple Cubic Reciprocal Lattice
In terms of Cartesian coordinates, the lattice vectors of the simple cubic direct lattice are
$$
a_1 = a \, \hat{e}_x , \qquad a_2 = a \, \hat{e}_y , \qquad a_3 = a \, \hat{e}_z \tag{258}
$$
The reciprocal lattice vectors are determined to be
$$
b^{(1)} = \frac{2 \pi}{a} \, \hat{e}_x , \qquad b^{(2)} = \frac{2 \pi}{a} \, \hat{e}_y , \qquad b^{(3)} = \frac{2 \pi}{a} \, \hat{e}_z \tag{259}
$$
These are three orthogonal vectors which are oriented parallel to the direct lattice vectors. The reciprocal lattice of the simple cubic direct lattice is also simple cubic.
5.2.2
The Body Centered Cubic Reciprocal Lattice
In terms of Cartesian coordinates, the primitive lattice vectors of the body centered cubic direct lattice are
$$
a_1 = \frac{a}{2} \left( \hat{e}_x + \hat{e}_y - \hat{e}_z \right) , \qquad
a_2 = \frac{a}{2} \left( - \hat{e}_x + \hat{e}_y + \hat{e}_z \right) , \qquad
a_3 = \frac{a}{2} \left( \hat{e}_x - \hat{e}_y + \hat{e}_z \right) \tag{260}
$$
The volume of the unit cell is $V_c = | a_1 \cdot ( a_2 \wedge a_3 ) | = \frac{a^3}{2}$. The reciprocal lattice vectors are determined to be
$$
b^{(1)} = \frac{2 \pi}{a} \left( \hat{e}_x + \hat{e}_y \right) , \qquad
b^{(2)} = \frac{2 \pi}{a} \left( \hat{e}_y + \hat{e}_z \right) , \qquad
b^{(3)} = \frac{2 \pi}{a} \left( \hat{e}_x + \hat{e}_z \right) \tag{261}
$$
The three reciprocal lattice vectors span the three-dimensional reciprocal lattice, but have different orientations from the direct lattice vectors. The reciprocal lattice has cubic symmetry as can be seen by combining the three reciprocal lattice vectors ( adding any two and subtracting the third ) to yield three orthogonal vectors of equal magnitude. The reciprocal lattice of the body centered cubic direct lattice is face centered cubic, with a conventional cell of side $\frac{4 \pi}{a}$.
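The vector product construction of the primitive reciprocal lattice vectors can be applied directly to the body centered cubic vectors above. The sketch below is illustrative, with hypothetical function names, and sets the lattice constant to a = 1 as an assumed unit of length. It uses the signed triple product in the denominator, so that the defining relation $a_i \cdot b^{(j)} = 2 \pi \, \delta_{ij}$ holds whatever the handedness of the chosen primitive vectors.

```python
import math

def cross(u, v):
    """Vector product of two 3-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(u, v))

def reciprocal_vectors(a1, a2, a3):
    """Primitive reciprocal lattice vectors from the vector product formula."""
    volume = dot(a1, cross(a2, a3))  # signed volume of the primitive cell
    scale = 2 * math.pi / volume
    b1 = tuple(scale * x for x in cross(a2, a3))
    b2 = tuple(scale * x for x in cross(a3, a1))
    b3 = tuple(scale * x for x in cross(a1, a2))
    return b1, b2, b3

# Body centered cubic primitive vectors with lattice constant a = 1 (assumed).
BCC = ((0.5, 0.5, -0.5), (-0.5, 0.5, 0.5), (0.5, -0.5, 0.5))

if __name__ == "__main__":
    for b in reciprocal_vectors(*BCC):
        # Printed in units of 2 pi / a.
        print(tuple(round(x / (2 * math.pi), 6) for x in b))
```

In units of $2 \pi / a$ the three vectors come out as (1, 1, 0), (0, 1, 1) and (1, 0, 1), which are primitive vectors of a face centered cubic lattice.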
5.2.3
The Face Centered Cubic Reciprocal Lattice
In terms of Cartesian coordinates, the primitive lattice vectors of the face centered cubic direct lattice are
$$
a_1 = \frac{a}{2} \left( \hat{e}_x + \hat{e}_y \right) , \qquad
a_2 = \frac{a}{2} \left( \hat{e}_x + \hat{e}_z \right) , \qquad
a_3 = \frac{a}{2} \left( \hat{e}_y + \hat{e}_z \right) \tag{262}
$$
The reciprocal lattice vectors are determined to be
$$
b^{(1)} = \frac{2 \pi}{a} \left( \hat{e}_x + \hat{e}_y - \hat{e}_z \right) , \qquad
b^{(2)} = \frac{2 \pi}{a} \left( \hat{e}_x - \hat{e}_y + \hat{e}_z \right) , \qquad
b^{(3)} = \frac{2 \pi}{a} \left( - \hat{e}_x + \hat{e}_y + \hat{e}_z \right) \tag{263}
$$
These are three non co-planar vectors, but have different orientations from the direct lattice vectors. The reciprocal lattice has cubic symmetry. This can be seen by combining pairs of reciprocal lattice vectors, which yields three orthogonal vectors of equal magnitude. The reciprocal lattice of the face centered cubic direct lattice is body centered cubic, with a conventional unit cell of side $\frac{4 \pi}{a}$.
5.2.4
The Hexagonal Reciprocal Lattice
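The hexagonal lattice has lattice vectors
$$
a_1 = \frac{a}{2} \left( \sqrt{3} \, \hat{e}_x + \hat{e}_y \right) , \qquad
a_2 = \frac{a}{2} \left( - \sqrt{3} \, \hat{e}_x + \hat{e}_y \right) , \qquad
a_3 = c \, \hat{e}_z \tag{264}
$$
The volume of the primitive unit cell is
$$
V_c = \frac{\sqrt{3}}{2} \, a^2 c \tag{265}
$$
The primitive reciprocal lattice vectors are
$$
b^{(1)} = \frac{2 \pi}{a} \left( \frac{1}{\sqrt{3}} \, \hat{e}_x + \hat{e}_y \right) , \qquad
b^{(2)} = \frac{2 \pi}{a} \left( - \frac{1}{\sqrt{3}} \, \hat{e}_x + \hat{e}_y \right) , \qquad
b^{(3)} = \frac{2 \pi}{c} \, \hat{e}_z \tag{266}
$$
Thus, the reciprocal lattice of the hexagonal lattice is also a hexagonal lattice, but is rotated about the z axis. ——————————————————————————————————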
5.2.5
Exercise 24
A trigonal lattice is defined by three primitive lattice vectors a1 , a2 and a3 , all of equal length a and an angle θ between any pair of these lattice vectors is a constant. Show that the three vectors a1 = [m, n, p], a2 = [p, m, n] and a3 = [n, p, m], referenced to an orthonormal basis represent a trigonal lattice. Prove that the reciprocal lattice of a trigonal lattice is another trigonal lattice. ——————————————————————————————————
5.3
The Brillouin Zones
The first Brillouin zone is the Wigner-Seitz cell of the reciprocal lattice. That is, the first Brillouin zone is a volume of a unit cell in the reciprocal lattice. This cell is found by first connecting a central reciprocal lattice point O to all the other reciprocal lattice points via the reciprocal lattice vectors $Q_i$. Secondly, these connecting lines are bisected by planes. The equations for the set of these planes are given by
$$
\left( k - \frac{1}{2} Q_i \right) \cdot Q_i = 0 \tag{267}
$$
for each i. The smallest volume around the origin O enclosed by these planes is the first Brillouin zone. That is, the first Brillouin zone consists of all the regions of space that can be reached from O without crossing any of the planes. The regions of the entire reciprocal lattice can be partitioned off into Brillouin zones of higher order. The planes defined by eqn(267) form a set of boundaries for the set of Brillouin zones. The n-th order Brillouin zone consists of the regions of k space that are accessed from the origin by crossing a minimum of n − 1 boundaries. Although the n-th order Brillouin zone exists in the form of isolated regions of k space, these regions can be brought together to make a contiguous volume by translating the isolated regions through appropriately chosen reciprocal lattice vectors $Q_i$.
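The bisecting plane construction translates into a simple membership test: a point k lies in the first Brillouin zone if it is on the origin side of every plane of eqn(267), that is, if $k \cdot Q_i \leq \frac{1}{2} Q_i \cdot Q_i$ for every non-zero reciprocal lattice vector. The sketch below is illustrative; the function name and the `shell` parameter, which limits the search to reciprocal lattice vectors with small integer indices, are assumptions introduced here. A small shell suffices for the cubic lattices, but a strongly skewed lattice may need a larger one.

```python
import itertools
import math

def in_first_zone(k, b1, b2, b3, shell=2):
    """True if k lies on the origin side of every bisecting plane tested.

    Only reciprocal lattice vectors Q = m1 b1 + m2 b2 + m3 b3 with
    |m_i| <= shell are checked (an assumed, finite search range).
    """
    indices = range(-shell, shell + 1)
    for m1, m2, m3 in itertools.product(indices, repeat=3):
        if (m1, m2, m3) == (0, 0, 0):
            continue
        q_vec = [m1 * b1[i] + m2 * b2[i] + m3 * b3[i] for i in range(3)]
        k_dot_q = sum(k[i] * q_vec[i] for i in range(3))
        q_dot_q = sum(x * x for x in q_vec)
        if k_dot_q > 0.5 * q_dot_q + 1e-12:
            return False  # k is beyond the plane bisecting Q
    return True

# Simple cubic reciprocal lattice with lattice constant a = 1 (assumed).
G = 2 * math.pi
SC = ((G, 0.0, 0.0), (0.0, G, 0.0), (0.0, 0.0, G))

if __name__ == "__main__":
    print(in_first_zone((0.4 * G, 0.0, 0.0), *SC))
    print(in_first_zone((0.6 * G, 0.0, 0.0), *SC))
```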
5.3.1
The Simple Cubic Brillouin Zone
The first Brillouin zone of the simple cubic direct lattice is a simple cube centered at the origin O. The sides of the cube have length $2\pi/a$ and the Brillouin zone has a volume of $(2\pi/a)^3$ which, when given in terms of the volume $V_c$ of the unit cell of the direct lattice, is equal to $8\pi^3/V_c$. Points of high symmetry are usually given special names. Points interior to the first Brillouin zone are designated by Greek letters and those on the surface are designated by Roman letters. The center of the zone (0, 0, 0) is denoted by Γ, and the vertex of the cube $\frac{2\pi}{a}(\frac{1}{2}, \frac{1}{2}, \frac{1}{2})$ is called R. The center of the x face located at $\frac{2\pi}{a}(\frac{1}{2}, 0, 0)$ is called X, and the mid-points of the edges at $\frac{2\pi}{a}(\frac{1}{2}, \frac{1}{2}, 0)$ are denoted by M. Points on high symmetry lines are also given special designations. The points between M and X are denoted by Z. The points on the lines between R and X are denoted by S, and the points on the lines between R and M are denoted by T. The points on high symmetry lines in the interior have the following designations: the points between Γ and M are denoted by Σ, the points between Γ and X by ∆, and the points on lines between Γ and R are denoted by Λ.
5.3.2
The Body Centered Cubic Brillouin Zone
The first Brillouin zone for the body centered cubic direct lattice is a rhombic dodecahedron. The cell is centered at the origin Γ = (0, 0, 0). The vertices are located either on the positive or negative Cartesian axes at H = $\frac{2\pi}{a}(1, 0, 0)$ or at diagonal points P = $\frac{2\pi}{a}(\frac{1}{2}, \frac{1}{2}, \frac{1}{2})$. The centers of the faces are denoted by N = $\frac{2\pi}{a}(\frac{1}{2}, \frac{1}{2}, 0)$. Points on the high symmetry lines joining P and H are denoted by F. Other special points are G, which lie on the high symmetry line between N and H, and D, which lie between P and N. The interior points on high symmetry lines are named Σ, which are intermediate between Γ and N, ∆, which are intermediate between Γ and H, and Λ, which are intermediate between Γ and P.
5.3.3
The Face Centered Cubic Brillouin Zone
The Brillouin zone for the face centered cubic lattice has twenty four vertices at W = $\frac{2\pi}{a}(1, \frac{1}{2}, 0)$ and has six square faces which have centers on the Cartesian axes. The centers of the square faces are denoted by X and are located at $\frac{2\pi}{a}(0, 1, 0)$. These squares are connected to eight hexagonal faces with centers at the L points L = $\frac{2\pi}{a}(\frac{1}{2}, \frac{1}{2}, \frac{1}{2})$. The mid-points of the edges joining two hexagonal faces are at $\frac{2\pi}{a}(\frac{3}{4}, \frac{3}{4}, 0)$, and are denoted by K. The mid-points of the edges between the square and hexagonal faces are denoted by U. The points on the lines between X and U contained on the square faces are denoted by S while those between X and W are denoted by Z. The points on the high symmetry lines between L and W on the hexagonal faces are denoted by Q. The points on the high symmetry lines between Γ and K are denoted by Σ, the points on the lines between Γ and X are denoted by ∆, and the points on the line running through Γ and L are known as Λ.
5.3.4
The Hexagonal Brillouin Zone
The Brillouin zone for the hexagonal lattice is a hexagonal prism. The upper and lower faces are hexagons. The hexagonal face centers are at A = $\frac{2\pi}{a}(0, 0, \frac{a}{2c})$. The vertices are at the H points, H = $\frac{2\pi}{a}(\frac{1}{\sqrt{3}}, \frac{1}{3}, \frac{a}{2c})$. The centers of the vertical rectangular faces are denoted by M, with M = $\frac{2\pi}{a}(\frac{1}{\sqrt{3}}, 0, 0)$. The mid-points of the horizontal edges are denoted by L, with L = $\frac{2\pi}{a}(\frac{1}{\sqrt{3}}, 0, \frac{a}{2c})$. Some of the interior high symmetry points are: Γ the zone center, Σ which are located on the high symmetry lines Γ M, and ∆ which are the points on the lines Γ A.
6
Electrons
The types of states of single electrons in the potentials produced by the crystalline lattice are discussed in the next three chapters. For simplicity, we shall first implicitly assume that the effect of the Coulomb interactions between electrons can be neglected. The neglect of electron-electron interactions is unjustified, as can be seen by considering the electrical neutrality of solids. The condition of electrical neutrality leads to the electron charge density being comparable with the charge density due to the lattice of nuclei or ions. Thus, the strength of the interactions between the electrons is expected to be comparable to the strength of the potential due to the nuclei. A simple order of magnitude estimate, based upon the typical linear dimensions of a unit cell $a_0 \sim 2$ Angstroms, leads to the average value of $e^2/r \sim 3$ eV for both these interaction energies. Nevertheless, as a discussion of the effect of pseudo-potentials reveals, for most metals the effect of the periodic potential of the lattice may be considered as small. The small value of the effective potential (or pseudo-potential) leads to a useful approximation, namely that of nearly free electrons. The effect of the finite strength of electron-electron interactions is a more complex issue, and is not yet fully understood. The density functional method in principle provides a method of evaluating the ground state electron density including the effect of electron-electron interactions. However, the density functional method does not describe the excited states. The effect of the electron-electron interactions is that of disturbing the electron density around any excited electron. On assuming that the interactions can be treated as a small perturbation, it can be shown that most of the effects of electron-electron interactions on the low energy excited electrons merely involve the dressing of the single excited electron, thereby forming a quasi-particle excitation.
That is, the effect of the excited electron on the surrounding gas of electrons can be absorbed as renormalizations of the properties of the single-electron excitations. This feature can lead to the low-temperature properties of the electronic system being determined by the gas of quasi-particles, which has the same form as a non-interacting gas of electrons. Systems where this simplification occurs are known as Landau Fermi-liquids. The discussion of the effect of electron-electron interactions is deferred to a later chapter.
7
Electronic States
In describing electronic states in metals, the nature of the many-electron wave function and its decomposition into the sum of anti-symmetric products of one-electron wave functions shall be described first. Then, the general properties of the one-electron basis wave functions shall be discussed. The one-electron wave functions, or Bloch functions, are taken to be eigenfunctions of a suitable non-interacting Hamiltonian in which the potential has the periodicity of the underlying Bravais Lattice.
7.1
Many Electron Wave Functions
The energy of the electrons in a solid can be written as the sum of the kinetic energies, the ionic potential energy acting on the individual electrons, and the interaction potential between pairs of electrons. Thus, for a system with $N_e$ electrons, the Hamiltonian can be written as
$$ \hat{H} = \sum_{i=1}^{N_e} \left[ - \frac{\hbar^2}{2m} \nabla_i^2 + V_{ions}(\mathbf{r}_i) \right] + \frac{1}{2} \sum_{i \neq j} \frac{e^2}{| \mathbf{r}_i - \mathbf{r}_j |} \tag{268} $$
where $\mathbf{r}_i$ denotes the position of the i-th electron, $V_{ions}$ is the potential due to the lattice of ions, and the last term is the pair-wise interaction between the electrons. This Hamiltonian can be separated into two sets of terms,
$$ \hat{H} = \hat{H}_0 + \hat{H}_{int} \tag{269} $$
where
$$ \hat{H}_0 = \sum_{i=1}^{N_e} \left[ - \frac{\hbar^2}{2m} \nabla_i^2 + V_{ions}(\mathbf{r}_i) \right] \tag{270} $$
is the sum of one-body Hamiltonians acting on the individual electrons, and the interaction term is given by the sum of two body terms
$$ \hat{H}_{int} = \frac{1}{2} \sum_{i \neq j} \frac{e^2}{| \mathbf{r}_i - \mathbf{r}_j |} \tag{271} $$
Since electrons are indistinguishable, the Hamiltonian must be symmetric under all permutations of the indices i labelling the electrons. Also, the modulus squared wave function must be invariant under all possible permutations of the electron labels. An arbitrary permutation of the labels can be built up through sequentially permuting pairs of labels. The permutation operator $\hat{P}_{i,j}$ is defined as the operator which interchanges the indices i and j labelling a pair of otherwise indistinguishable particles. Thus, if $\Psi(\mathbf{r}_1, \ldots \mathbf{r}_i, \ldots \mathbf{r}_j, \ldots \mathbf{r}_{N_e})$ is an arbitrary $N_e$ particle wave function, the permutation operator can be defined as
$$ \hat{P}_{i,j} \Psi(\mathbf{r}_1, \ldots \mathbf{r}_i, \ldots \mathbf{r}_j, \ldots \mathbf{r}_{N_e}) = \Psi(\mathbf{r}_1, \ldots \mathbf{r}_j, \ldots \mathbf{r}_i, \ldots \mathbf{r}_{N_e}) \tag{272} $$
Since the Hamiltonian is symmetric under interchange of the indices i and j labelling any two identical particles, the permutation operators commute with the Hamiltonian
$$ [ \hat{P}_{i,j} , \hat{H} ] = 0 \tag{273} $$
Likewise, the permutation operators must also commute with any physical operator $\hat{A}$
$$ [ \hat{P}_{i,j} , \hat{A} ] = 0 \tag{274} $$
otherwise measurements of the quantity $\hat{A}$ could lead to the particles being distinguished. Since the Hamiltonian commutes with all the permutation operators, one can find simultaneous eigenstates of the Hamiltonian $\hat{H}$ and all the permutation operators $\hat{P}_{i,j}$. The energy eigenstates Ψ corresponding to physical states of indistinguishable particles must satisfy the equations
$$ \hat{H} \Psi = E \Psi , \qquad \hat{P}_{i,j} \Psi = p_{i,j} \Psi \tag{275} $$
where $p_{i,j}$ are the eigenvalues of the permutation operators $\hat{P}_{i,j}$. As permuting the same pair of particle indices twice always reproduces the initial wave function, one has
$$ \hat{P}_{i,j}^2 = \hat{I} \tag{276} $$
where $\hat{I}$ is the identity operator. Thus, the eigenstates of the permutation operators satisfy the two equations
$$ \hat{P}_{i,j}^2 \Psi = p_{i,j}^2 \Psi = \Psi \tag{277} $$
Hence, the eigenvalues of the permutation operators must satisfy
$$ p_{i,j}^2 = 1 \tag{278} $$
or
$$ p_{i,j} = \pm 1 \tag{279} $$
Thus, the $N_e$ particle wave functions have the property that, under any permutation of a single pair of identical particles which are labelled by i and j, the un-permuted and permuted wave functions are related by
$$ \Psi(\mathbf{r}_1, \ldots \mathbf{r}_j, \ldots \mathbf{r}_i, \ldots \mathbf{r}_{N_e}) = \pm \Psi(\mathbf{r}_1, \ldots \mathbf{r}_i, \ldots \mathbf{r}_j, \ldots \mathbf{r}_{N_e}) \tag{280} $$
The upper sign holds for bosons and the lower sign holds for fermions. Also, since $p_{i,j}$ is a constant of motion, the nature of the particles does not change with respect to time. Electrons are fermions and, thus, the wave function must always be anti-symmetric with respect to the interchange of any pair of electron labels. Furthermore, the modulus square of the many-electron wave function must be invariant under all possible permutations of the electron labels. The energy eigenfunction for the system of $N_e$ electrons can be written as
$$ \Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots \mathbf{r}_{N_e}) \tag{281} $$
The many-electron energy eigenstates Ψ usually cannot be found exactly. However, they can be expressed in terms of a superposition formed from a complete set of many-electron eigenfunctions $\Phi_{\alpha_1,\alpha_2,\ldots,\alpha_{N_e}}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_{N_e})$ of the Hamiltonian $\hat{H}_0$. The subscript $\alpha_i$ represents the complete set of one-particle quantum numbers (including spin) which completely describes a single fermion state.
$$ \hat{H}_0 \, \Phi_{\alpha_1,\alpha_2,\ldots,\alpha_{N_e}}(\mathbf{r}_1, \mathbf{r}_2, \ldots \mathbf{r}_{N_e}) = E_0 \, \Phi_{\alpha_1,\alpha_2,\ldots,\alpha_{N_e}}(\mathbf{r}_1, \mathbf{r}_2, \ldots \mathbf{r}_{N_e}) \tag{282} $$
This many-electron eigenfunction is interpreted as representing the state in which the $N_e$ electrons are distributed in the set of single-electron states with the specific quantum numbers $\alpha_1, \alpha_2, \ldots \alpha_{N_e}$. The basis states are orthonormal and so satisfy the relations
$$ \prod_{j=1}^{N_e} \int_V d^3 r_j \; \Phi^*_{\beta_1,\beta_2,\ldots}(\mathbf{r}_1, \ldots, \mathbf{r}_{N_e}) \, \Phi_{\alpha_1,\alpha_2,\ldots}(\mathbf{r}_1, \ldots, \mathbf{r}_{N_e}) = \delta_{\alpha_1,\beta_1} \delta_{\alpha_2,\beta_2} \ldots \tag{283} $$
where we have assumed that the sets of single-electron eigenvalues have been ordered. Since the basis is complete, the exact many-body eigenstates of the full Hamiltonian $\hat{H}$ can be written as a linear superposition of the complete set of basis functions
$$ \Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_{N_e}) = \sum_{\alpha_1,\alpha_2,\ldots,\alpha_{N_e}} C_{\alpha_1,\alpha_2,\ldots,\alpha_{N_e}} \, \Phi_{\alpha_1,\alpha_2,\ldots,\alpha_{N_e}}(\mathbf{r}_1, \mathbf{r}_2, \ldots \mathbf{r}_{N_e}) \tag{284} $$
where the sum over the set of $\{\alpha_i\}$ runs over all possible distributions of the $N_e$ electrons in the set of all the single-electron states. The coefficients $C_{\alpha_1,\alpha_2,\ldots,\alpha_{N_e}}$ have to be determined. The coefficients represent the probability amplitudes that electrons occupy the set of single-electron states labelled by $\alpha_1, \alpha_2, \ldots, \alpha_{N_e}$. The set of many-electron basis functions $\Phi_{\alpha_1,\alpha_2,\ldots,\alpha_{N_e}}(\mathbf{r}_1, \mathbf{r}_2, \ldots \mathbf{r}_{N_e})$ can be expressed directly in terms of the one-electron wave functions $\phi_\alpha(\mathbf{r})$. First, note that the non-interacting Hamiltonian $\hat{H}_0$ can be decomposed as the sum of Hamiltonians which only act on the individual electrons
$$ \hat{H}_0 = \sum_{i=1}^{N_e} \hat{H}_i \tag{285} $$
where the one-particle non-interacting Hamiltonian is given by
$$ \hat{H}_i = - \frac{\hbar^2}{2m} \nabla_i^2 + V_{ions}(\mathbf{r}_i) \tag{286} $$
This one-particle Hamiltonian has eigenstates $\phi_\beta(\mathbf{r}_i)$, which satisfy the eigenvalue equation
$$ \hat{H}_i \, \phi_\beta(\mathbf{r}_i) = E_\beta \, \phi_\beta(\mathbf{r}_i) \tag{287} $$
The many-particle non-interacting Hamiltonian $\hat{H}_0$ has eigenfunctions which are the products of $N_e$ one-particle eigenfunctions $\phi_\beta(\mathbf{r})$
$$ \chi(\mathbf{r}_1, \alpha_1; \mathbf{r}_2, \alpha_2; \ldots \mathbf{r}_{N_e}, \alpha_{N_e}) = \phi_{\alpha_1}(\mathbf{r}_1) \, \phi_{\alpha_2}(\mathbf{r}_2) \ldots \phi_{\alpha_{N_e}}(\mathbf{r}_{N_e}) \tag{288} $$
and the non-interacting energy eigenvalue $E_0$ for the many-particle state is given as the sum of the one-electron energy eigenvalues $E_{\alpha_i}$ that are occupied by the electrons
$$ E_0 = \sum_{i=1}^{N_e} E_{\alpha_i} \tag{289} $$
However, the wave functions $\chi(\mathbf{r}_1, \alpha_1; \mathbf{r}_2, \alpha_2; \ldots \mathbf{r}_{N_e}, \alpha_{N_e})$ do not represent physical wave functions, since each of the single particle states with quantum numbers $\alpha_1, \alpha_2, \ldots \alpha_{N_e}$ is occupied by the respective electron labelled by $\mathbf{r}_1, \mathbf{r}_2, \ldots \mathbf{r}_{N_e}$ and, hence, the electrons have been labelled. As the electrons are indistinguishable, it is impermissible to distinguish them by this type of labelling. Thus, physical wave functions should contain terms which are related by all the possible relabellings of the indices of the particles. Electrons are fermions and, therefore, they have wave functions which are anti-symmetric under the interchange of any pair of particles. The proper basis set of the many-electron wave function Φ must correspond to all possible permutations of the single-particle indices. The proper anti-symmetrized wave function $\Phi_{\alpha_1,\alpha_2,\ldots,\alpha_{N_e}}$ is given by the Slater determinant
$$ \Phi_{\alpha_1 \ldots \alpha_{N_e}} = \frac{1}{\sqrt{N_e!}} \begin{vmatrix} \phi_{\alpha_1}(\mathbf{r}_1) & \phi_{\alpha_1}(\mathbf{r}_2) & \ldots & \phi_{\alpha_1}(\mathbf{r}_i) & \ldots & \phi_{\alpha_1}(\mathbf{r}_{N_e}) \\ \phi_{\alpha_2}(\mathbf{r}_1) & \phi_{\alpha_2}(\mathbf{r}_2) & \ldots & \phi_{\alpha_2}(\mathbf{r}_i) & \ldots & \phi_{\alpha_2}(\mathbf{r}_{N_e}) \\ \vdots & \vdots & & \vdots & & \vdots \\ \phi_{\alpha_i}(\mathbf{r}_1) & \phi_{\alpha_i}(\mathbf{r}_2) & \ldots & \phi_{\alpha_i}(\mathbf{r}_i) & \ldots & \phi_{\alpha_i}(\mathbf{r}_{N_e}) \\ \vdots & \vdots & & \vdots & & \vdots \\ \phi_{\alpha_{N_e}}(\mathbf{r}_1) & \phi_{\alpha_{N_e}}(\mathbf{r}_2) & \ldots & \phi_{\alpha_{N_e}}(\mathbf{r}_i) & \ldots & \phi_{\alpha_{N_e}}(\mathbf{r}_{N_e}) \end{vmatrix} $$
The normalization is $1/\sqrt{N_e!}$ as there are $N_e!$ terms in the determinant, corresponding to the $N_e!$ permutations of the electron labels.
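The structure of the Slater determinant can be illustrated with a small numerical sketch. The example below assumes one-dimensional particle-in-a-box orbitals as stand-ins for the $\phi_\alpha$ (an illustrative choice, not taken from the text) and sums the $N_e!$ signed permutation terms explicitly; all function names are hypothetical.

```python
import math
from itertools import permutations

def orbital(n, x, length=1.0):
    # Illustrative one-electron orbital: particle in a box of width `length`
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)

def permutation_sign(perm):
    # Parity of a permutation, found by sorting it with pair interchanges
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def slater_determinant(alphas, positions):
    # Sum over the N! permutations of the electron labels, each weighted by
    # its parity, normalized by 1 / sqrt(N!)
    n = len(alphas)
    total = 0.0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for row, col in enumerate(perm):
            term *= orbital(alphas[row], positions[col])
        total += term
    return total / math.sqrt(math.factorial(n))

psi = slater_determinant((1, 2, 3), (0.2, 0.5, 0.7))
psi_swapped = slater_determinant((1, 2, 3), (0.5, 0.2, 0.7))
psi_repeated = slater_determinant((1, 1, 3), (0.2, 0.5, 0.7))
```

Interchanging two electron positions reverses the sign of the wave function, and repeating a quantum number makes two rows identical so that the determinant vanishes.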
The anti-symmetric wave function has the property that if there are two or more particles in the same one-particle eigenstate, say $\alpha_i = \alpha_j$, then the wave function vanishes. This can be seen by noting that two rows of the determinant are identical and, hence, the determinant vanishes. As the wave function vanishes if two or more electrons occupy the same single particle eigenstate, there is no state in which a one-particle eigenstate is occupied by two or more electrons. The anti-symmetric nature of the fermion wave function directly leads to the Pauli exclusion principle. The Pauli exclusion principle can be stated as "no unique single particle state can be occupied by two or more electrons." For electrons, which have spin one half, a single particle state is uniquely specified only if the spin quantum number is also specified. The single particle wave function $\phi_\alpha(\mathbf{r})$ should be supplemented by the spinor $\chi_\sigma$. That is, the single-electron wave function should be replaced by the product
$$ \phi_\alpha(\mathbf{r}) \rightarrow \phi_\alpha(\mathbf{r}) \, \chi_\sigma \tag{290} $$
where $\chi_\sigma$ is a spinor, or a normalized two component column vector. The spin index σ can be considered to label an eigenstate of a component of an arbitrary single-electron spin operator, and the label σ should be considered as analogous to the single particle eigenvalue α. A complete set of labels for the single-electron state is given by α and σ. An arbitrary spinor $\chi_\sigma$ can be decomposed as the linear superposition of two basis spinors $\chi_\pm$
$$ \chi_\sigma = \sum_\pm \gamma_\pm \, \chi_\pm \tag{291} $$
where the normalization condition is given by
$$ \sum_\pm | \gamma_\pm |^2 = 1 \tag{292} $$
The two basis spinors $\chi_\pm$ are usually denoted by the two component column vectors
$$ \chi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{293} $$
corresponding to an eigenstate of the Pauli matrix $\sigma_z$ with spin up and
$$ \chi_- = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{294} $$
corresponding to the eigenstate with spin down. Thus, the arbitrary state can be written as
$$ \chi_\sigma = \begin{pmatrix} \gamma_+ \\ \gamma_- \end{pmatrix} \tag{295} $$
In this representation, the two components of an arbitrary spinor, $\chi_\sigma$, represent the internal degree of freedom of the spin and, thus, are analogous to the degree of freedom represented by $\mathbf{r}$ in the position representation. The complex conjugate wave function should be replaced by
$$ \phi^*_{\alpha'}(\mathbf{r}) \rightarrow \chi^T_{\sigma'} \, \phi^*_{\alpha'}(\mathbf{r}) \tag{296} $$
which contains $\chi^T_{\sigma'}$, the complex conjugated transpose of the spinor state, given by the two dimensional row matrix
$$ \chi^T_{\sigma'} = \begin{pmatrix} \gamma'^{*}_+ & \gamma'^{*}_- \end{pmatrix} \tag{297} $$
In the situations where the electron spin has to be explicitly considered, these replacements lead to the inner product of two one-electron states not only involving the integration of the product $\phi^*_{\alpha'}(\mathbf{r}) \, \phi_\alpha(\mathbf{r})$ over the electron's position $\mathbf{r}$, but also automatically involving the evaluation of the matrix elements of the individual electron's row spinor state $\chi^T_{\sigma'}$ with the column spinor state $\chi_\sigma$. The probability density $\rho(\mathbf{r})$ for finding an electron at position $\mathbf{r}$ can be obtained from the matrix elements of the many-electron wave function $\Psi(\mathbf{r}_1; \mathbf{r}_2, \ldots \mathbf{r}_{N_e})$ with the one-electron density operator $\hat{\rho}$. The one-electron density operator is given by a sum of Dirac delta functions
$$ \hat{\rho}(\mathbf{r}) = \sum_{i=1}^{N_e} \delta^3( \mathbf{r} - \mathbf{r}_i ) \tag{298} $$
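The density $\rho(\mathbf{r})$ is evaluated as
$$ \rho(\mathbf{r}) = N_e \int_V d^3 r_1 \int_V d^3 r_2 \ldots \int_V d^3 r_{N_e} \; \delta^3( \mathbf{r} - \mathbf{r}_1 ) \, | \Psi(\mathbf{r}_1; \mathbf{r}_2, \ldots \mathbf{r}_{N_e}) |^2 \tag{299} $$
Thus, the trace over the particle positions can be evaluated by integrating over all but one of the particle positions
$$ \rho(\mathbf{r}) = N_e \int_V d^3 r_2 \int_V d^3 r_3 \ldots \int_V d^3 r_{N_e} \; | \Psi(\mathbf{r}; \mathbf{r}_2, \ldots \mathbf{r}_{N_e}) |^2 \tag{300} $$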
The matrix elements of the spin states also have to be taken. The resulting electron density is normalized to $N_e$. The probability density for finding an electron at position $\mathbf{r}$ and another electron at $\mathbf{r}'$ is a correlation function $\rho(\mathbf{r}, \mathbf{r}')$ which is given by the matrix elements of the operator
$$ \hat{\rho}(\mathbf{r}, \mathbf{r}') = \sum_i \sum_{j \neq i} \delta^3( \mathbf{r} - \mathbf{r}_i ) \, \delta^3( \mathbf{r}' - \mathbf{r}_j ) \tag{301} $$
The resulting expression for the two-particle density is found by integrating over the positions of all the electrons except two
$$ \rho_2(\mathbf{r}, \mathbf{r}') = N_e ( N_e - 1 ) \int_V d^3 r_3 \int_V d^3 r_4 \ldots \int_V d^3 r_{N_e} \; | \Psi(\mathbf{r}; \mathbf{r}'; \ldots \mathbf{r}_{N_e}) |^2 \tag{302} $$
This two-particle density correlation function is normalized to twice the number of pairs of electrons, $N_e ( N_e - 1 )$. ——————————————————————————————————
7.1.1
Exercise 25
Evaluate the single-particle density and two-particle density correlation function for a many-particle basis wave function Φα1 ,α2 ,...αNe given by a single Slater determinant of single-particle wave functions φα (r). —————————————————————————————————— The properties of the single-electron wave functions φα (ri ), that are to be used in forming the many-particle basis functions Φα1 ,α2 ...αNe (r1 , r2 , . . . rNe ) as Slater determinants, are discussed in the next chapter. In the following, the electron labels i in the one-electron wave functions are omitted.
7.2
Bloch’s Theorem
Bloch's theorem describes the properties of the one-electron states $\phi_\alpha(\mathbf{r})$ which are eigenstates of the one-electron Hamiltonian with a periodic potential. An electron in the solid experiences a periodic potential that has the periodicity of the underlying lattice of ions. In particular, the potential is invariant under translation through any Bravais lattice vector $\mathbf{R}_i$
$$ V_{ions}(\mathbf{r} - \mathbf{R}_i) = V_{ions}(\mathbf{r}) \tag{303} $$
General properties of the solution of the Schrodinger equation for a single electron in a solid can be found from the periodicity of $V_{ions}(\mathbf{r})$. If the electron-electron interactions are neglected, the independent electrons obey the one-particle Schrodinger equation with the periodic potential,
$$ \hat{H} \phi_\alpha(\mathbf{r}) = \left[ - \frac{\hbar^2}{2m} \nabla^2 + V_{ions}(\mathbf{r}) \right] \phi_\alpha(\mathbf{r}) = E_\alpha \, \phi_\alpha(\mathbf{r}) \tag{304} $$
For an infinite solid, the physically acceptable solutions of this equation are known as the Bloch wave functions. The energies of the Bloch states are usually labelled by two quantum numbers n and $\mathbf{k}$, instead of by α. The one-dimensional case, where the values of k were restricted to real values, was investigated by Kramers (H.A. Kramers, Physica 2, 483 (1935)). Bloch's theorem applies to the eigenstates of the one-particle Hamiltonian,
$$ \hat{H} = - \frac{\hbar^2}{2m} \nabla^2 + V_{ions}(\mathbf{r}) \tag{305} $$
in which the potential has the symmetry
$$ V_{ions}(\mathbf{r} - \mathbf{R}_i) = V_{ions}(\mathbf{r}) \tag{306} $$
for all lattice vectors $\mathbf{R}_i$ in the Bravais Lattice. Bloch's theorem states that the eigenfunctions can be found in the form
$$ \phi_{n,k}(\mathbf{r}) = \exp\left[ i \mathbf{k} \cdot \mathbf{r} \right] u_{n,k}(\mathbf{r}) \tag{307} $$
where the function $u_{n,k}$ is invariant under the translation through any Bravais Lattice vector
$$ u_{n,k}(\mathbf{r} - \mathbf{R}_i) = u_{n,k}(\mathbf{r}) \tag{308} $$
Bloch's theorem asserts that the periodic translational symmetry manifests itself in the transformation of the wave function
$$ \phi_{n,k}(\mathbf{r} - \mathbf{R}_i) = \exp\left[ - i \mathbf{k} \cdot \mathbf{R}_i \right] \phi_{n,k}(\mathbf{r}) \tag{309} $$
Thus, a translation of the wave function through a Bravais lattice vector only shows up through the presence of an exponential factor. Furthermore, if the wave vector $\mathbf{k}$ is real, then the electron density for the Bloch state is identical for each unit cell in the crystal. This prevents the wave function from diverging at the boundaries of the solid. The proof of Bloch's theorem is based on the consideration of the translation operator $\hat{T}_R$ which, when acting on an arbitrary function $f(\mathbf{r})$, has the effect of translating it through a Bravais lattice vector $\mathbf{R}$
$$ \hat{T}_R f(\mathbf{r}) = f(\mathbf{r} - \mathbf{R}) \tag{310} $$
This translation operator can be applied to the wave function $\hat{H} \phi(\mathbf{r})$, which yields
$$ \hat{T}_R \hat{H} \phi(\mathbf{r}) = \hat{H}(\mathbf{r} - \mathbf{R}) \, \phi(\mathbf{r} - \mathbf{R}) = \hat{H}(\mathbf{r}) \, \phi(\mathbf{r} - \mathbf{R}) = \hat{H} \hat{T}_R \phi(\mathbf{r}) \tag{311} $$
Thus, the Hamiltonian commutes with the translation operator which produces a translation through a Bravais lattice vector,
$$ [ \hat{H} , \hat{T}_R ] = 0 \tag{312} $$
This means that it is possible to find simultaneous eigenstates of both $\hat{T}_R$ and $\hat{H}$. Furthermore, the translation operators corresponding to translations through different lattice vectors commute. This can be shown by successive translations $\hat{T}_R$ and $\hat{T}_{R'}$, which yields
$$ \hat{T}_R \hat{T}_{R'} \phi(\mathbf{r}) = \phi(\mathbf{r} - \mathbf{R} - \mathbf{R}') = \phi(\mathbf{r} - \mathbf{R}' - \mathbf{R}) = \hat{T}_{R'} \hat{T}_R \phi(\mathbf{r}) \tag{313} $$
Thus, the translation operators commute
$$ [ \hat{T}_R , \hat{T}_{R'} ] = 0 \tag{314} $$
This proves that the wave functions can be chosen to be simultaneous eigenstates of the Hamiltonian and all the translation operators that produce translations through Bravais lattice vectors. The Bloch functions are chosen such that they satisfy
$$ \hat{H} \phi(\mathbf{r}) = E \, \phi(\mathbf{r}) , \qquad \hat{T}_R \phi(\mathbf{r}) = c(\mathbf{R}) \, \phi(\mathbf{r}) \tag{315} $$
and, thus, are the simultaneous eigenstates of $\hat{H}$ and all the $\hat{T}_R$. The translation operators can be compounded as
$$ \hat{T}_{R'} \hat{T}_R \phi(\mathbf{r}) = \phi(\mathbf{r} - \mathbf{R} - \mathbf{R}') = \hat{T}_{R+R'} \phi(\mathbf{r}) \tag{316} $$
When two translation operators are successively applied to the simultaneous eigenfunctions of the translation operators, the result may be re-interpreted in terms of the compound translation
$$ \hat{T}_{R'} \hat{T}_R \phi(\mathbf{r}) = c(\mathbf{R}') \, c(\mathbf{R}) \, \phi(\mathbf{r}) , \qquad \hat{T}_{R+R'} \phi(\mathbf{r}) = c(\mathbf{R} + \mathbf{R}') \, \phi(\mathbf{r}) \tag{317} $$
This shows that the product of two eigenvalues of different translation operators gives the eigenvalue of the compound translation
$$ c(\mathbf{R}') \, c(\mathbf{R}) = c(\mathbf{R} + \mathbf{R}') \tag{318} $$
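Since a general Bravais lattice vector can be expressed as the sum
$$ \mathbf{R} = n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3 \tag{319} $$
where $(n_1, n_2, n_3)$ are integers, a general eigenvalue can be decomposed in terms of products
$$ c(\mathbf{R}) = c(\mathbf{a}_1)^{n_1} \, c(\mathbf{a}_2)^{n_2} \, c(\mathbf{a}_3)^{n_3} \tag{320} $$
Hence, on defining
$$ c(\mathbf{a}_1) = \exp\left[ - i 2 \pi x_1 \right] , \qquad c(\mathbf{a}_2) = \exp\left[ - i 2 \pi x_2 \right] , \qquad c(\mathbf{a}_3) = \exp\left[ - i 2 \pi x_3 \right] \tag{321} $$
one can define a vector $\mathbf{k}$ via
$$ \mathbf{k} = x_1 \mathbf{b}^{(1)} + x_2 \mathbf{b}^{(2)} + x_3 \mathbf{b}^{(3)} \tag{322} $$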
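With these definitions, the eigenvalue of the translation operator can be expressed in terms of the $\mathbf{k}$ vector as
$$ c(\mathbf{R}) = \exp\left[ - i \mathbf{k} \cdot \mathbf{R} \right] \tag{323} $$
Thus, the eigenvalue equation for the translation operator is expressed as
$$ \hat{T}_R \phi(\mathbf{r}) = \phi(\mathbf{r} - \mathbf{R}) = c(\mathbf{R}) \, \phi(\mathbf{r}) = \exp\left[ - i \mathbf{k} \cdot \mathbf{R} \right] \phi(\mathbf{r}) \tag{324} $$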
which completes the proof of Bloch's theorem. The wave functions which are simultaneous eigenfunctions of the energy and the periodic translation operators are the Bloch functions. The Bloch functions, $\phi_{n,k}(\mathbf{r})$, are labelled by the translation quantum number $\mathbf{k}$ and a quantum number n that pertains to the single particle energy eigenvalue $E_{n,k}$. It should be noted that Bloch's theorem does not guarantee that the quantity $\mathbf{k}$ is real. Since $\mathbf{k}$ is the quantum number associated with the eigenvalue of the operator which translates through a Bravais lattice vector
$$ \hat{T}_R \phi_{n,k}(\mathbf{r}) = \phi_{n,k}(\mathbf{r} - \mathbf{R}) = \exp\left[ - i \mathbf{k} \cdot \mathbf{R} \right] \phi_{n,k}(\mathbf{r}) \tag{325} $$
then it should be clear that, as
$$ \exp\left[ i \mathbf{Q} \cdot \mathbf{R} \right] = 1 \tag{326} $$
the eigenvalue labelled by $\mathbf{k}$ is identical to the eigenvalue labelled by $\mathbf{k} + \mathbf{Q}$. This means that the two wave vectors can be identified, i.e., $\mathbf{k} + \mathbf{Q} \equiv \mathbf{k}$. Thus, the Bloch wave vector when translated through a reciprocal lattice vector $\mathbf{Q}$ leads to an equivalent wave vector. Furthermore, if the convention
$$ \phi_{n,k+Q}(\mathbf{r}) = \phi_{n,k}(\mathbf{r}) \tag{327} $$
is adopted, then the eigenvalues must be related through
$$ E_{n,k+Q} = E_{n,k} \tag{328} $$
Thus, if k is real, any k value can be restricted to lie within one unit cell of reciprocal space (F. Bloch, Zeit. für Physik, 52, 555 (1928)).
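The translation property of eqn(309) and the equivalence of k and k + Q can be checked numerically in one dimension. The sketch below assumes an arbitrary periodic function u(x) built from two Fourier components; the lattice constant, the coefficients and the function names are illustrative assumptions.

```python
import cmath
import math

A = 1.0  # lattice constant (illustrative value)

def u_periodic(x):
    # Any function with the period of the lattice will do; this one is arbitrary
    return 1.0 + 0.3 * math.cos(2 * math.pi * x / A) + 0.1 * math.sin(4 * math.pi * x / A)

def bloch_function(k, x):
    # phi_k(x) = exp(i k x) u(x)
    return cmath.exp(1j * k * x) * u_periodic(x)

def translation_eigenvalue(k, lattice_vector):
    # c(R) = exp(-i k R)
    return cmath.exp(-1j * k * lattice_vector)

k = 0.37 * 2 * math.pi / A
x = 0.123
R = 3 * A            # a Bravais lattice vector
Q = 2 * (2 * math.pi / A)  # a reciprocal lattice vector

translated = bloch_function(k, x - R)
expected = translation_eigenvalue(k, R) * bloch_function(k, x)
```

Within rounding error, `translated` agrees with `expected`, and the translation eigenvalue is unchanged when k is replaced by k + Q.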
7.3
Boundary Conditions
Bloch's theorem does not ensure that the wave vector $\mathbf{k}$ is real. In fact, for surface states or impurity states, $\mathbf{k}$ may become imaginary. However, for bulk states the wave vector is real, as can be ascertained by applying appropriate boundary conditions. Consider a crystalline solid of finite size which has the same shape as the primitive unit cell of the Bravais Lattice but with dimensions $L_1 = N_1 | \mathbf{a}_1 |$, $L_2 = N_2 | \mathbf{a}_2 |$ and $L_3 = N_3 | \mathbf{a}_3 |$ along the three primitive axes. The solid then contains $N = N_1 N_2 N_3$ lattice points. Born-von Karman or periodic boundary conditions are imposed on the wave function
$$ \phi_{n,k}(\mathbf{r} - N_i \mathbf{a}_i) = \phi_{n,k}(\mathbf{r}) \qquad \text{for } i = 1, 2 \text{ or } 3 \tag{329} $$
The periodic boundary conditions ensure that the electronic states are homogeneous bulk states and are unmodified in the vicinity of the surface of the solid. Application of Bloch's theorem yields the condition
$$ \phi_{n,k}(\mathbf{r}) = \phi_{n,k}(\mathbf{r} - N_i \mathbf{a}_i) = \exp\left[ - i N_i \mathbf{k} \cdot \mathbf{a}_i \right] \phi_{n,k}(\mathbf{r}) \qquad \text{for } i = 1, 2 \text{ or } 3 \tag{330} $$
Thus, the periodic boundary conditions are fulfilled if the wave vectors $\mathbf{k}$ satisfy the conditions
$$ \exp\left[ - i N_i \mathbf{k} \cdot \mathbf{a}_i \right] = 1 \tag{331} $$
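Since $\mathbf{k}$ can be written in terms of the primitive reciprocal lattice vectors, $\mathbf{b}^{(i)}$, via
$$ \mathbf{k} = \sum_{i=1}^{3} x_i \, \mathbf{b}^{(i)} \tag{332} $$
and as $\mathbf{a}_i \cdot \mathbf{b}^{(j)} = 2 \pi \delta_{ij}$, then the periodic boundary conditions require that
$$ \exp\left[ - i 2 \pi N_i x_i \right] = 1 \qquad \text{for } i = 1, 2 \text{ or } 3 \tag{333} $$
Thus, the components $x_i$ must be in the form of ratios
$$ x_i = \frac{m_i}{N_i} \tag{334} $$
where $m_i$ are integers. This proves that the general Bloch wave vector $\mathbf{k}$ is a real vector, and the $\mathbf{k}$ vectors have the general form
$$ \mathbf{k} = \sum_{i=1}^{3} \frac{m_i}{N_i} \, \mathbf{b}^{(i)} \tag{335} $$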
Since $N_i \gg 1$, the $\mathbf{k}$ vectors form a dense set of points in reciprocal space. The properties of a solid can be expressed in terms of summations over the electronic states. Since each state can be expressed in terms of the discrete $\mathbf{k}$ quantum number, the summations are over a dense set of $\mathbf{k}$ vectors. The summation over a dense set of $\mathbf{k}$ vectors can be represented in terms of an integral over the energy, weighted by the density of states. From the form of $\mathbf{k}$, the volume of k space per allowed $\mathbf{k}$ value is
$$ \Delta^3 k = \frac{\mathbf{b}^{(1)}}{N_1} \cdot \left( \frac{\mathbf{b}^{(2)}}{N_2} \wedge \frac{\mathbf{b}^{(3)}}{N_3} \right) = \frac{1}{N} \, \mathbf{b}^{(1)} \cdot \left( \mathbf{b}^{(2)} \wedge \mathbf{b}^{(3)} \right) \tag{336} $$
As the volume of the Brillouin zone is given by
$$ \mathbf{b}^{(1)} \cdot \left( \mathbf{b}^{(2)} \wedge \mathbf{b}^{(3)} \right) \tag{337} $$
the volume of k space per state is $1/N$ times the volume of the Brillouin zone. This implies that the number of allowed $\mathbf{k}$ values within the Brillouin zone is equal to the number of unit cells in the crystal. The volume $\Delta^3 k$ associated with a Bloch state is given by
$$ \Delta^3 k = \frac{1}{N} \frac{( 2 \pi )^3}{\mathbf{a}_1 \cdot ( \mathbf{a}_2 \wedge \mathbf{a}_3 )} = \frac{1}{N} \frac{( 2 \pi )^3}{V_c} \tag{338} $$
Now, since the volume of the solid V is N times the volume of the cell $V_c$,
$$ V = N V_c \tag{339} $$
then the volume of k space associated with each Bloch state is
$$ \Delta^3 k = \frac{( 2 \pi )^3}{V} \tag{340} $$
Hence, in the continuum limit, the number of one-electron states (per spin) in an infinitesimal volume of k space $d^3 k$ is given by
$$ \frac{V}{( 2 \pi )^3} \, d^3 k \tag{341} $$
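The counting of allowed k values can be illustrated for a one-dimensional chain, where eqn(335) reduces to k = (m/N) b with b = 2π/a. The sketch below is a minimal example under that assumption; the function name and the chosen number of cells are illustrative.

```python
import cmath
import math

def allowed_k_values(n_cells, a=1.0):
    # One-dimensional analogue of k = (m / N) b, with b = 2 pi / a and
    # m = 0, 1, ..., N - 1 labelling the inequivalent wave vectors
    b = 2 * math.pi / a
    return [m * b / n_cells for m in range(n_cells)]

n_cells = 8
a = 1.0
k_values = allowed_k_values(n_cells, a)

# Born-von Karman condition exp(-i N k a) = 1 for every allowed k
bvk_residuals = [abs(cmath.exp(-1j * n_cells * k * a) - 1.0) for k in k_values]

# Spacing between neighbouring k values, to be compared with 2 pi / L
spacing = k_values[1] - k_values[0]
chain_length = n_cells * a
```

There are exactly N inequivalent k values in one reciprocal unit cell, and their spacing is 2π/L, the one-dimensional counterpart of eqn(340).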
7.4
Plane Wave Expansion of Bloch Functions
Any function obeying Born-von Karman boundary conditions can be expanded as a Fourier series. This implies that the Bloch functions can also be expanded as
$$ \phi_{n,k}(\mathbf{r}) = \sum_{q} C_q \exp\left[ i \mathbf{q} \cdot \mathbf{r} \right] \tag{342} $$
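where the wave vectors $\mathbf{q}$ are to be related to $\mathbf{k}$. From Bloch's theorem, the Bloch functions can also be expressed as
$$ \phi_{n,k}(\mathbf{r}) = \exp\left[ i \mathbf{k} \cdot \mathbf{r} \right] u_{n,k}(\mathbf{r}) \tag{343} $$
Since $u_{n,k}(\mathbf{r})$ has periodic translational invariance, its Fourier series only contains reciprocal lattice vectors $\mathbf{Q}$. The Fourier series expansion of the periodic function is
$$ u_{n,k}(\mathbf{r}) = \sum_{Q} u_{n,k}(\mathbf{Q}) \exp\left[ i \mathbf{Q} \cdot \mathbf{r} \right] \tag{344} $$
and the inverse transform is given by the integral
$$ u_{n,k}(\mathbf{Q}) = \frac{1}{V} \int_V d^3 r \; u_{n,k}(\mathbf{r}) \exp\left[ - i \mathbf{Q} \cdot \mathbf{r} \right] \tag{345} $$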
On comparing the above two forms for the Bloch functions, one has
$$ \phi_{n,k}(\mathbf{r}) = \sum_{q} C_q \exp\left[ i \mathbf{q} \cdot \mathbf{r} \right] = \sum_{Q} u_{n,k}(\mathbf{Q}) \exp\left[ i ( \mathbf{k} + \mathbf{Q} ) \cdot \mathbf{r} \right] \tag{346} $$
Thus, the allowed $\mathbf{q}$ values in the Bloch wave functions are equal to $\mathbf{k}$, modulo a reciprocal lattice vector. Furthermore, the $C_q$ are equal to the Fourier components $u_{n,k}(\mathbf{Q})$. Next, it shall be shown how the $C_q$ can be determined directly from the Schrodinger equation which contains the periodic potential $V_{ions}(\mathbf{r})$. The Bloch functions can be found by solving the Schrodinger equation where the Hamiltonian contains the periodic potential $V_{ions}(\mathbf{r})$. The periodic potential also has a Fourier series expansion
$$ V_{ions}(\mathbf{r}) = \sum_{Q} V_{ions}(\mathbf{Q}) \exp\left[ i \mathbf{Q} \cdot \mathbf{r} \right] \tag{347} $$
and the inverse transform is given by the integral
$$ V_{ions}(\mathbf{Q}) = \frac{1}{V} \int_V d^3 r \; V_{ions}(\mathbf{r}) \exp\left[ - i \mathbf{Q} \cdot \mathbf{r} \right] \tag{348} $$
Furthermore, since $V_{ions}(\mathbf{r})$ is real, the Fourier transform of the potential has the symmetry
$$ V_{ions}(-\mathbf{Q}) = V^*_{ions}(\mathbf{Q}) \tag{349} $$
This follows from taking the complex conjugate of the Fourier series expansion of $V_{ions}(\mathbf{r})$. A second condition on the Fourier expansion coefficients exists for crystals which have an inversion symmetry around a suitable origin. The inversion symmetry implies that the potential is symmetric
$$ V_{ions}(\mathbf{r}) = V_{ions}(-\mathbf{r}) \tag{350} $$
and this implies that the Fourier transform of the potential has the property
$$ V_{ions}(\mathbf{Q}) = V_{ions}(-\mathbf{Q}) = V^*_{ions}(\mathbf{Q}) \tag{351} $$
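The expansion coefficients $C_q$ in the Bloch function are found by substituting the Fourier series into the energy eigenvalue equation. The kinetic energy term is evaluated from
$$ \frac{\hat{\mathbf{p}}^2}{2m} \phi_{n,k}(\mathbf{r}) = - \frac{\hbar^2}{2m} \nabla^2 \phi_{n,k}(\mathbf{r}) = \sum_{q} \frac{\hbar^2 q^2}{2m} \, C_q \exp\left[ i \mathbf{q} \cdot \mathbf{r} \right] \tag{352} $$
The potential term in the energy eigenvalue equation has the form of a convolution when expressed in terms of the Fourier Transforms
$$ V_{ions}(\mathbf{r}) \, \phi_{n,k}(\mathbf{r}) = \sum_{q'} \sum_{Q'} V_{ions}(\mathbf{Q}') \, C_{q'} \exp\left[ i ( \mathbf{q}' + \mathbf{Q}' ) \cdot \mathbf{r} \right] \tag{353} $$
The form of the energy eigenvalue equation is simplified if $\mathbf{q}'$ is expressed as $\mathbf{q}' = \mathbf{q} - \mathbf{Q}'$, so
$$ V_{ions}(\mathbf{r}) \, \phi_{n,k}(\mathbf{r}) = \sum_{Q'} \sum_{q} V_{ions}(\mathbf{Q}') \, C_{q-Q'} \exp\left[ i \mathbf{q} \cdot \mathbf{r} \right] \tag{354} $$
Then, the energy eigenvalue equation takes the form
$$ \sum_{q} \left[ \left( \frac{\hbar^2 q^2}{2m} - E \right) C_q + \sum_{Q'} V_{ions}(\mathbf{Q}') \, C_{q-Q'} \right] \exp\left[ i \mathbf{q} \cdot \mathbf{r} \right] = 0 \tag{355} $$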
The wave vectors $\mathbf{q}$ are expressed as $\mathbf{q} = \mathbf{k} - \mathbf{Q}$ so that $\mathbf{k}$ is always located within the first Brillouin zone. On equating the coefficients of the plane waves to zero, one finds the matrix eigenvalue equation
$$ \left( \frac{\hbar^2 ( \mathbf{k} - \mathbf{Q} )^2}{2m} - E \right) C_{k-Q} + \sum_{Q'} V_{ions}(\mathbf{Q}') \, C_{k-Q-Q'} = 0 \tag{356} $$
The reciprocal lattice vector is transformed as $\mathbf{Q}' \rightarrow \mathbf{Q}'' = \mathbf{Q}' + \mathbf{Q}$ in the second term, leading to an infinite set of coupled equations
$$ \left( \frac{\hbar^2 ( \mathbf{k} - \mathbf{Q} )^2}{2m} - E \right) C_{k-Q} + \sum_{Q''} V_{ions}(\mathbf{Q}'' - \mathbf{Q}) \, C_{k-Q''} = 0 \tag{357} $$
Thus, because of the periodicity of the potential, the Bloch functions only contain Fourier components $\mathbf{q}$ that are connected to $\mathbf{k}$ via reciprocal lattice vectors. For fixed $\mathbf{k}$, the set of equations couples $C_k$ to all the $C_{k-Q}$ via the Fourier components of the potential $V_{ions}(\mathbf{Q})$. In principle, the set of infinite coupled algebraic equations (357) could be used to find the coefficients $C_{k-Q}$ and the eigenvalue $E_{n,k}$. The Bloch function is expressed in terms of the coefficients $C_{k-Q}$ as
$$ \phi_{n,k}(\mathbf{r}) = \sum_{Q} C_{k-Q} \exp\left[ i ( \mathbf{k} - \mathbf{Q} ) \cdot \mathbf{r} \right] = \exp\left[ i \mathbf{k} \cdot \mathbf{r} \right] \sum_{Q} C_{k-Q} \exp\left[ - i \mathbf{Q} \cdot \mathbf{r} \right] \tag{358} $$
Using this, the Bloch function can be expressed in terms of the periodic function $u_{n,k}(\mathbf{r})$ via
$$ u_{n,k}(\mathbf{r}) = \sum_{Q} C_{k-Q} \exp\left[ - i \mathbf{Q} \cdot \mathbf{r} \right] \tag{359} $$
In order to make this approach tractable, it is necessary to truncate the infinite set of coupled equations (357) to a finite set. However, if this set of equations is truncated, approximately $10^3$ to $10^6$ plane wave components would be required before convergence is attained in three dimensions. Therefore, other methods are frequently used.
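The most drastic truncation of the coupled equations (357) keeps only two plane waves, k and k − Q, in one dimension. The sketch below solves the resulting 2 × 2 problem in closed form; it assumes units in which ħ²/2m = 1, takes V(0) = 0 as the zero of energy, and uses an illustrative value for the Fourier component V(Q). It is not a converged calculation, only an illustration of the structure of the equations.

```python
import math

def two_plane_wave_energies(k, q, v_q):
    """Eigenvalues of the 2 x 2 truncation of the coupled equations.

    Units with hbar^2 / 2m = 1 and V(0) = 0 are assumed. The matrix is
        [[ k^2,        V(q)      ],
         [ conj(V(q)), (k - q)^2 ]]
    and its eigenvalues are returned as (lower, upper).
    """
    e_k = k ** 2
    e_k_minus_q = (k - q) ** 2
    mean = 0.5 * (e_k + e_k_minus_q)
    half_difference = 0.5 * (e_k - e_k_minus_q)
    root = math.sqrt(half_difference ** 2 + abs(v_q) ** 2)
    return mean - root, mean + root

q = 2 * math.pi   # smallest reciprocal lattice vector for lattice constant a = 1
v_q = 0.5         # illustrative Fourier component of the potential

# At the zone boundary k = q / 2 the two plane waves are degenerate
lower_edge, upper_edge = two_plane_wave_energies(q / 2, q, v_q)
band_gap = upper_edge - lower_edge

# Away from the boundary the lower level is only weakly shifted from k^2
lower_inside, upper_inside = two_plane_wave_energies(0.1, q, v_q)
```

In this truncation the gap at the zone boundary equals 2|V(Q)|, and away from the boundary the coupling pushes the lower level slightly below the free-electron value.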
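7.5 The Bloch Wave Vector

The Bloch wave vector $\mathbf{k}$ plays a role similar to that of the momentum of a free electron. In fact, it reduces to the momentum quantum number in the limit $V_{\rm ions}(\mathbf{r}) \to 0$. However, for a non-zero crystal potential, $\hbar \mathbf{k}$ is not equal to the eigenvalue of the electron momentum operator $\hat{\mathbf{p}} = - i \hbar \nabla$, since it differs by amounts that are determined by the reciprocal lattice vectors $\mathbf{Q}$ and the coefficients $C_{\mathbf{k}-\mathbf{Q}}$. That is,
$$ \hat{\mathbf{p}} \, \phi_{n,\mathbf{k}}(\mathbf{r}) = \hbar \mathbf{k} \, \phi_{n,\mathbf{k}}(\mathbf{r}) - i \hbar \exp\left[ i \mathbf{k} \cdot \mathbf{r} \right] \nabla u_{n,\mathbf{k}}(\mathbf{r}) \tag{360} $$
Thus, $\hbar \mathbf{k}$ is known as the crystal momentum. The crystal momentum can always be chosen to be in the first Brillouin zone by making the transformation
$$ \mathbf{k} = \mathbf{k}' + \mathbf{Q} \tag{361} $$
On substituting this relation, the Bloch function is re-written as
$$ \exp\left[ i \mathbf{k} \cdot \mathbf{r} \right] u_{n,\mathbf{k}}(\mathbf{r}) = \exp\left[ i \mathbf{k}' \cdot \mathbf{r} \right] \exp\left[ i \mathbf{Q} \cdot \mathbf{r} \right] u_{n,\mathbf{k}}(\mathbf{r}) = \exp\left[ i \mathbf{k}' \cdot \mathbf{r} \right] \tilde{u}_{n,\mathbf{k}}(\mathbf{r}) \tag{362} $$
where
$$ \tilde{u}_{n,\mathbf{k}}(\mathbf{r}) = \exp\left[ i \mathbf{Q} \cdot \mathbf{r} \right] u_{n,\mathbf{k}}(\mathbf{r}) \tag{363} $$
is identified as a periodic function of the type that is used in Bloch's theorem. The new function $\tilde{u}_{n,\mathbf{k}}(\mathbf{r})$ transforms like $u_{n,\mathbf{k}}(\mathbf{r})$, since it has the periodicity of the Bravais lattice as
$$ \exp\left[ i \mathbf{Q} \cdot \mathbf{R} \right] = 1 \tag{364} $$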
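The reduction of a wave vector to the first Brillouin zone, eqn(361), can be sketched for a one-dimensional lattice. The function name `fold_to_first_zone` and the convention that the zone is the half-open interval [-G/2, G/2) are assumptions made for this illustration only.

```python
import cmath
import math

def fold_to_first_zone(k, a=1.0):
    """Split k into (k_prime, n) with k = k_prime + n G and -G/2 <= k_prime < G/2."""
    g = 2.0 * math.pi / a
    n = math.floor(k / g + 0.5)  # index of the reciprocal lattice vector Q = n G
    return k - n * g, n

if __name__ == "__main__":
    g = 2.0 * math.pi
    k_prime, n = fold_to_first_zone(1.3 * g)
    print(k_prime / g, n)
    # The phase factor exp(i Q R) of eqn (364) is unity for any lattice vector R.
    print(cmath.exp(1j * n * g * 3.0))
```

The extra phase factor exp[i Q . r] that appears in eqn(363) is absorbed into the periodic part of the Bloch function, which is why the folded and unfolded labels describe the same state.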
Due to the periodic translational symmetry, the eigenvalue problem can be reduced to finding a solution for the periodic function $u_{n,\mathbf{k}}(\mathbf{r})$ in a single cell of the lattice. The total number of energy eigenfunctions must correspond to the number of electron states originating from each atom in the crystal, and there may be many basis atoms in the unit cell. As an isolated atom is expected to have an infinite number of excited levels, and as the number of different $\mathbf{k}$ points in the Brillouin zone is equal to the number of primitive cells in the crystal, there must be infinitely many energy eigenfunctions with fixed $\mathbf{k}$. The different one-electron states with fixed $\mathbf{k}$ are distinguished by the index $n$. The energy $E_{n,\mathbf{k}}$ is a continuous function of $\mathbf{k}$, forming energy bands. This is seen by examination of the eigenvalue equation, when the Bloch functions are expressed as
$$ \phi_{n,\mathbf{k}}(\mathbf{r}) = \exp\left[ i \mathbf{k} \cdot \mathbf{r} \right] u_{n,\mathbf{k}}(\mathbf{r}) \tag{365} $$
This procedure leads to the energy eigenvalue equation
$$ \hat{H}_{\mathbf{k}} \, u_{n,\mathbf{k}}(\mathbf{r}) = \left[ \frac{\hbar^2}{2m} \left( - i \nabla + \mathbf{k} \right)^2 + V_{\rm ions}(\mathbf{r}) \right] u_{n,\mathbf{k}}(\mathbf{r}) = E_{n,\mathbf{k}} \, u_{n,\mathbf{k}}(\mathbf{r}) \tag{366} $$
Due to the Born-von Karman boundary conditions, each energy band in the Brillouin zone contains $N$ different states. The different $\mathbf{k}$ values are not part of a continuum but form a discrete, dense set of points. Therefore, although $E_{n,\mathbf{k}}$ is a continuous function of $\mathbf{k}$, the energy eigenvalues only exist at this finite set of points.
7.6 The Density of States

A physical quantity $A$ may be expressed in terms of the quantities $A_{n,\mathbf{k}}$ associated with the individual electrons in each of the occupied Bloch states $(n, \mathbf{k})$ in the solid. That is, the quantity $A$ is given by
$$ A = 2 \sum_{n,\mathbf{k}} A_{n,\mathbf{k}} \tag{367} $$
where the sum runs over each level $(n, \mathbf{k})$ that is occupied by an electron. The factor 2 originates from the spin degeneracy. Since the different $\mathbf{k}$ states are dense and uniformly distributed in the Brillouin zone, the summation may be represented by an integration. The volume $\Delta^3 k$ of phase space associated with a Bloch state is given by
$$ \Delta^3 k = \frac{( 2 \pi )^3}{V} \tag{368} $$
The quantity $A$ is expressed as the integral
$$ A = 2 \sum_n \int_{E_{n,\mathbf{k}} < E_F} \frac{V}{( 2 \pi )^3} \, d^3k \; A_{n,\mathbf{k}} \tag{369} $$
where the integration over $\mathbf{k}$ runs over the volume of occupied states in the first Brillouin zone. Thus, for the partially filled bands the integration runs over a volume of $\mathbf{k}$ space enclosed by a surface of constant energy $E_F$, and for the completely filled bands it runs over the entire Brillouin zone. The integration over $\mathbf{k}$ space may be converted into an integral over the energy $E$ by introducing the one-electron density of states $\rho(E)$. The density of states per spin is defined by the integration over the Dirac delta function
$$ \rho(E) = V \sum_n \int \frac{d^3k}{( 2 \pi )^3} \, \delta( E - E_{n,\mathbf{k}} ) \tag{370} $$
If the quantity $A_{n,\mathbf{k}}$ only depends on $(n, \mathbf{k})$ through $E_{n,\mathbf{k}}$, then
$$ A_{n,\mathbf{k}} = A(E_{n,\mathbf{k}}) \tag{371} $$
so the quantity $A$ can be represented as an integral over the density of states
$$ A = 2 \int_{-\infty}^{E_F} dE \; \rho(E) \, A(E) \tag{372} $$
The density of states $\rho(E)$ can be calculated by noting that the infinitesimal integral $\int_E^{E+\Delta E} \rho(E') \, dE' \sim \rho(E) \, \Delta E$ is the number of states in the energy range between $E$ and $E + \Delta E$, or the allowed number of $\mathbf{k}$ values between $E$ and $E + \Delta E$ in each of the energy bands. Thus, on integrating over an energy range $\Delta E$ and using the definition of the density of states in terms of the Dirac delta function, one finds
$$ \rho(E) \, \Delta E \sim V \sum_n \int_E^{E+\Delta E} dE' \int \frac{d^3k}{( 2 \pi )^3} \, \delta( E' - E_{n,\mathbf{k}} ) = V \sum_n \int \frac{d^3k}{( 2 \pi )^3} \left[ \Theta( E + \Delta E - E_{n,\mathbf{k}} ) - \Theta( E - E_{n,\mathbf{k}} ) \right] \tag{373} $$
where $\Theta(x)$ is the Heaviside step function. Thus, the density of states is expressed by an integral over a volume of $\mathbf{k}$ space enclosed by surfaces of constant energy $E$ and $E + \Delta E$. Furthermore, since $\Delta E$ is an infinitesimal quantity, $\Delta E$ can be expressed in terms of the perpendicular distance between the two surfaces of constant energy. Let $S_n(E)$ be the surface $E_{n,\mathbf{k}} = E$ lying within the primitive cell and let $\delta k(\mathbf{k})$ be the perpendicular distance between the surfaces $S_n(E)$ and $S_n(E + \Delta E)$ at point $\mathbf{k}$. Then, as $S_n(E)$ is a surface of constant $E$ and $\nabla E_{n,\mathbf{k}}$ is perpendicular to that surface,
$$ E + \Delta E = E + | \nabla E_{n,\mathbf{k}} | \, \delta k(\mathbf{k}) , \qquad \delta k(\mathbf{k}) = \frac{\Delta E}{| \nabla E_{n,\mathbf{k}} |} \tag{374} $$
Hence, the density of states can be expressed as an integral over a surface of constant energy
$$ \rho(E) = V \sum_n \int_{S_n(E)} \frac{d^2S}{( 2 \pi )^3} \, \frac{1}{| \nabla E_{n,\mathbf{k}} |} \tag{375} $$
This gives an explicit relation between the density of states and the band structure.
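As a check on eqn(375), consider free electrons with E = ħ²k²/2m. The constant energy surface is a sphere of area 4πk² on which |∇E| = ħ²k/m, so eqn(375) gives ρ(E) = V m k / (2π²ħ²) per spin, and integrating ρ(E) from 0 to E gives V k³/(6π²) states. The sketch below compares this with a direct count of the allowed k points of eqn(368). The assumptions, which are not taken from the text, are a cubic box of side `box`, periodic boundary conditions, units with ħ²/2m = 1, and the illustrative function names.

```python
import math

def count_states(e_max, box=1.0):
    """Count allowed k points (per spin) with free-electron energy <= e_max.

    Units hbar^2 / 2m = 1; the allowed points are k = 2 pi (nx, ny, nz) / box.
    """
    step = 2.0 * math.pi / box
    n_max = int(math.sqrt(e_max) / step) + 1
    count = 0
    for nx in range(-n_max, n_max + 1):
        for ny in range(-n_max, n_max + 1):
            for nz in range(-n_max, n_max + 1):
                if step * step * (nx * nx + ny * ny + nz * nz) <= e_max:
                    count += 1
    return count

def dos_free(e, box=1.0):
    """Free-electron density of states per spin from eqn (375): V sqrt(E) / (4 pi^2)."""
    return box ** 3 * math.sqrt(e) / (4.0 * math.pi ** 2)

def integrated_dos_free(e, box=1.0):
    """Integral of dos_free from 0 to e: V E^(3/2) / (6 pi^2)."""
    return box ** 3 * e ** 1.5 / (6.0 * math.pi ** 2)

if __name__ == "__main__":
    e = (2.0 * math.pi * 20.0) ** 2
    print(count_states(e), integrated_dos_free(e))
```

The two numbers agree to within the surface corrections expected for a finite number of k points, which become negligible as the volume grows.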
Since $E_{n,\mathbf{k}}$ is periodic, it is bounded from above and below for each value of $n$. This implies that there will be values of $\mathbf{k}$ in each Brillouin zone where the group velocity vanishes,
$$ \nabla E_{n,\mathbf{k}} = 0 \tag{376} $$
The band energy $E_{n,\mathbf{k}}$ must have at least one maximum and one minimum in the Brillouin zone. At each of these $\mathbf{k}$ points, the integrand in $\rho(E)$ diverges. Other divergences may be expected which originate from $\mathbf{k}$ points near the Brillouin zone boundary, where the dispersion relation is expected to have zero slope. These divergences give rise to van Hove singularities in the density of states. L. van Hove provided a general discussion of these types of singularities using the Morse index theorem (L. van Hove, Phys. Rev. 89, 1189 (1953), also see the discussion by H.P. Rosenstock, Phys. Rev. 97, 290 (1955)). In three dimensions these singularities are integrable. That is, the integration over the surface area yields a finite value for $\rho(E)$. In the three-dimensional case the divergences show up in the slope of the density of states, $\partial \rho(E) / \partial E$, and are the van Hove singularities. The van Hove singularities in the density of states occur at the values of $E$ where $\nabla E_{n,\mathbf{k}}$ vanishes at some points of the surface $S_n(E)$. Typical van Hove singularities occur at the band edges, where the density of states varies as $\sqrt{| E |}$, with $E$ measured from the band edge. Although the density of states $\rho(E)$ at van Hove singularities does not diverge in three dimensions, the derivatives diverge and can give rise to anomalies in thermodynamics, as can be seen by examining the Sommerfeld expansion. In low dimensional systems, the singularity can show up directly as a divergence in the density of states.
----------------------------------------------------------------------
7.6.1 Exercise 26

The energy dispersion relation at a van Hove singularity has a zero gradient. In the vicinity of the van Hove singularity, the d-dimensional dispersion relation can be written as
$$ E_{\mathbf{k}} = E_0 + E_1 \sum_{i=1}^{d} \alpha_i \, k_i^2 \, a_i^2 \tag{377} $$
where the coefficients $\alpha_i$ determine whether the extremum is a maximum, minimum or saddle point. The coefficients are given by
$$ \alpha_i = \pm 1 \tag{378} $$
Characterize the different types of van Hove singularities in the density of states and sketch the energy dependence in the vicinity of the singularity for d = 1, 2 and d = 3.
----------------------------------------------------------------------
7.7 The Fermi-Surface

The ground state of the electronic system has the lowest possible energy. For non-interacting electrons, the electrons occupy the lowest possible eigenvalues. However, the distribution of electrons must satisfy the restriction imposed by the Pauli exclusion principle, which states that no uniquely specified electron state can be occupied by more than one electron. This means that a spin degenerate state cannot be occupied by more than two electrons, one for each spin value. Thus, the ground state of $\hat{H}_0$ is represented by a Slater determinant wave function in which two electrons are placed in the lowest energy eigenstate, and two in each of the successively next lowest states, until all the $N_e$ electrons have been placed in states. In the following, the convention is adopted that the electrons which are associated with the states $(n, \mathbf{k})$ have $\mathbf{k}$ restricted to be within the first Brillouin zone. Two different types of ground states result.

Insulators. In insulators, a number of bands are completely filled and all other bands are completely empty. No band is partially filled. In this case, there must exist an energy interval which separates the lowest unoccupied band state and the highest occupied band state. The density of states must be zero in this energy interval. The width of the interval, where $\rho(E) = 0$, is the threshold energy required to excite an electron from an occupied to an unoccupied state. This energy interval is defined to be the band gap. In an insulator, the chemical potential $\mu$ falls in the band gap. An insulating state can only occur if the number of electrons $N_e$ is equal to an even number times the number of primitive unit cells $N$ in the direct lattice. This is because each band can be occupied by $2 N$ electrons. For example, tetravalent C is insulating when it crystallizes in the diamond structure, and has a band gap of over 5 eV. The elements Si and Ge are also insulating, but have smaller band gaps which are 1.1 eV and 0.67 eV, respectively.

Metals. A number of bands may be partially filled. In this case, the highest occupied Bloch states have an energy $E_F$ which lies within the range of one or more bands. This case corresponds to a metal, in which the one-electron density of states at $E_F$ is non-zero, $\rho(E_F) \neq 0$. Systems with an odd number of electrons per unit cell should be metallic, such as the simple mono-valent metals like Na or K. However, systems with an even number of electrons per unit cell can also be metallic. For example, divalent Mg is metallic. Mg crystallizes in the hexagonal close-packed structure and, hence, has four electrons per unit cell. The small distance between the atoms is responsible for the large dispersion of the bands, which allows the bands to overlap. The overlapping of the bands leads to divalent Mg being metallic. For each partially filled band, there will be a surface in the three-dimensional $\mathbf{k}$ space which separates the occupied from the unoccupied states. The set of all such surfaces forms the Fermi-surface. The Fermi-surface is determined by the equation
$$ E_{n,\mathbf{k}} = E_F \tag{379} $$
Since $E_{n,\mathbf{k}}$ is periodic in the reciprocal lattice, the Fermi-surface may either be represented within the full periodic reciprocal lattice or in a single unit cell of the reciprocal lattice. If the full reciprocal lattice is used, the Fermi-surface is represented in the extended zone scheme. If the Fermi-surface is represented within a single primitive unit cell of the reciprocal lattice, it is represented in a reduced zone scheme.
8 Approximate Models
Some of the earlier approaches to electronic structure of solids will be discussed in this chapter. These methods are not in common use, and are not reliable methods for calculating electronic structures. These older methods also neglect the effect of electron-electron interactions. By contrast, the most common method in use today is based on the Density Functional approach of Kohn and Sham, which is quantitatively reliable and includes the effect of electron-electron interactions. Nevertheless, the older methods were important in the development of the subject and yield important insights into the results of electronic structure calculations.
8.1 The Nearly Free Electron Model

In the nearly free electron approach to electronic structure calculations, one assumes that the periodic potential due to the lattice is small. This assumption is not justified a priori, as the potential is of the order of 10 eV. However, the effect of the potential can be much smaller than this estimate and, for these cases, the nearly free electron model gives results which can be used to phenomenologically describe metals found in groups I, II, III, and IV of the periodic table. These materials have an atomic structure which consists of s or p electrons outside a closed shell configuration. The nearly free electron model works for two main reasons: (i) The region in which the electron-ion interaction is strongest is in the vicinity of the ion. However, since this region is occupied by the core electrons and the Pauli principle forbids the conduction electrons from entering this region, the effective potential is weak. (ii) In the region of space where the conduction electrons reside, the motion of the other conduction electrons effectively screens the potential. Since in the nearly free electron approximation the effective potential is assumed to be small, perturbation theory may be used.
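8.1.1 Perturbation Theory

The wave function for an electron in a Bloch state with wave vector $\mathbf{k}$ is given by
$$ \phi_{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{Q}} C_{\mathbf{k}-\mathbf{Q}} \exp\left[ i ( \mathbf{k} - \mathbf{Q} ) \cdot \mathbf{r} \right] \tag{380} $$
where $\mathbf{Q}$ are reciprocal lattice vectors and the coefficients $C_{\mathbf{k}-\mathbf{Q}}$ have to be determined. The coefficients satisfy the set of coupled algebraic equations
$$ \left[ \frac{\hbar^2}{2m} ( \mathbf{k} - \mathbf{Q} )^2 - E \right] C_{\mathbf{k}-\mathbf{Q}} + \sum_{\mathbf{Q}'} V_{\rm ions}(\mathbf{Q} - \mathbf{Q}') \, C_{\mathbf{k}-\mathbf{Q}'} = 0 \tag{381} $$
where the sum runs over all the reciprocal lattice vectors $\mathbf{Q}'$. For fixed $\mathbf{k}$, there is an equation for each $\mathbf{Q}$ value. The solutions of this equation for fixed $\mathbf{k}$ are labelled by $n$. If one neglects the potential due to the lattice, one obtains the empty lattice approximation. This is the result of the zero-th order perturbation theory. To zero-th order in the perturbing potential $V_{\rm ions}$, the set of equations reduces to
$$ \left[ E^{(0)}_{\mathbf{k}-\mathbf{Q}} - E \right] C_{\mathbf{k}-\mathbf{Q}} = 0 \tag{382} $$
where the zero-th order energy eigenvalues are given by
$$ E^{(0)}_{\mathbf{k}-\mathbf{Q}} = \frac{\hbar^2}{2m} ( \mathbf{k} - \mathbf{Q} )^2 \tag{383} $$
and the zero-th order energy eigenfunctions are
$$ \phi^{(0)}_{\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{V}} \exp\left[ i ( \mathbf{k} - \mathbf{Q} ) \cdot \mathbf{r} \right] \tag{384} $$
If, for a given $\mathbf{k}$, the energies associated with the set of reciprocal lattice vectors $\mathbf{Q}_1, \ldots, \mathbf{Q}_m$ are degenerate,
$$ E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} = E^{(0)}_{\mathbf{k}-\mathbf{Q}_2} = \ldots = E^{(0)}_{\mathbf{k}-\mathbf{Q}_m} \tag{385} $$
then $\phi^{(0)}_{\mathbf{k}}(\mathbf{r})$ can be made of any linear combination of the functions $\exp[ i ( \mathbf{k} - \mathbf{Q}_i ) \cdot \mathbf{r} ]$. The type of perturbation theory that is appropriate depends on whether the zero-th order eigenvalues are degenerate or not.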
8.1.2 Non-Degenerate Perturbation Theory

Non-degenerate perturbation theory can be used when the energy separations between the level under consideration, $E^{(0)}_{\mathbf{k}-\mathbf{Q}_1}$, and all other zero-th order eigenvalues are large compared with the magnitude of the potential
$$ | E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} - E^{(0)}_{\mathbf{k}-\mathbf{Q}} | \gg | V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}) | \tag{386} $$
for fixed $\mathbf{k}$ and all $\mathbf{Q} \neq \mathbf{Q}_1$. This corresponds to the non-degenerate case. We shall evaluate the one-electron energy eigenvalue to second order in $V_{\rm ions}$ but first, we need to consider the first order correction to the energy and wave function of the state under consideration which, to zero-th order, has momentum $\mathbf{k} - \mathbf{Q}_1$. The amplitude corresponding to the plane wave component with this momentum satisfies the secular equation
$$ \left[ E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} - E \right] C_{\mathbf{k}-\mathbf{Q}_1} + \sum_{\mathbf{Q}'} V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}') \, C_{\mathbf{k}-\mathbf{Q}'} = 0 \tag{387} $$
This shall be used to obtain the energy $E$ and the coefficient $C_{\mathbf{k}-\mathbf{Q}_1}$ to first order in $V_{\rm ions}$. The term involving the summation is explicitly of the order of $V_{\rm ions}$, so the coefficients $C_{\mathbf{k}-\mathbf{Q}'}$ in this term only need to be calculated to zero-th order in $V_{\rm ions}$. Only one coefficient is non-zero to zero-th order in $V_{\rm ions}$, since
$$ C^{(0)}_{\mathbf{k}-\mathbf{Q}} = 0 \qquad \forall \; \mathbf{Q} \neq \mathbf{Q}_1 \tag{388} $$
Thus, to first order in $V_{\rm ions}$, only one term survives in the summation and the coefficient $C_{\mathbf{k}-\mathbf{Q}_1}$ satisfies the eigenvalue equation
$$ \left[ E - E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \right] C_{\mathbf{k}-\mathbf{Q}_1} = V_{\rm ions}(0) \, C^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \tag{389} $$
This equation determines the energy eigenvalue $E^{(1)}$ to first order in $V_{\rm ions}$. Since the energy shift is to be calculated to first order in $V_{\rm ions}$, the coefficient $C_{\mathbf{k}-\mathbf{Q}_1}$ can be substituted by its zero-th order value $C^{(0)}_{\mathbf{k}-\mathbf{Q}_1}$. This procedure yields the first order approximation for the energy eigenvalue
$$ E^{(1)} = E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} + V_{\rm ions}(0) \tag{390} $$
This only yields a constant shift in the zero-th order energy eigenvalues, which can be absorbed into the definition of the reference energy. It is also seen from eqn(389) that, to first order, the change in the coefficient $C_{\mathbf{k}-\mathbf{Q}_1}$ remains undetermined, so we may set
$$ C^{(1)}_{\mathbf{k}-\mathbf{Q}_1} = C^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \tag{391} $$
This is seen by substituting the first-order expression for $E - E^{(0)}_{\mathbf{k}-\mathbf{Q}_1}$ into eqn(389). In the following discussion, we shall neglect the effect of the average potential $V_{\rm ions}(0)$. The coefficients of the other plane wave components of the Bloch function satisfy
$$ \left[ E^{(0)}_{\mathbf{k}-\mathbf{Q}} - E \right] C_{\mathbf{k}-\mathbf{Q}} + \sum_{\mathbf{Q}'} V_{\rm ions}(\mathbf{Q} - \mathbf{Q}') \, C_{\mathbf{k}-\mathbf{Q}'} = 0 \tag{392} $$
This is used to obtain the coefficients $C^{(1)}_{\mathbf{k}-\mathbf{Q}}$ to first order in $V_{\rm ions}$. Since the summand is explicitly of first order in $V_{\rm ions}$, the coefficients $C_{\mathbf{k}-\mathbf{Q}'}$ need only be considered to zero-th order. However, only $C^{(0)}_{\mathbf{k}-\mathbf{Q}_1}$ is non-zero in this order so,
$$ C^{(1)}_{\mathbf{k}-\mathbf{Q}} = \frac{V_{\rm ions}(\mathbf{Q} - \mathbf{Q}_1)}{E - E^{(0)}_{\mathbf{k}-\mathbf{Q}}} \, C^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \tag{393} $$
The coefficients $C^{(0)}_{\mathbf{k}-\mathbf{Q}_1}$ and $C^{(1)}_{\mathbf{k}-\mathbf{Q}}$ completely determine the energy eigenfunction to first order in $V_{\rm ions}$. The energy eigenvalue can now be found to second order in $V_{\rm ions}$ using the wave function that has just been calculated to first order in $V_{\rm ions}$. On substituting the expression for $C^{(1)}_{\mathbf{k}-\mathbf{Q}}$, eqn(393), into the secular equation which determines $C_{\mathbf{k}-\mathbf{Q}_1}$, eqn(387), one finds
$$ \left[ E - E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \right] C_{\mathbf{k}-\mathbf{Q}_1} = \sum_{\mathbf{Q} \neq \mathbf{Q}_1} \frac{| V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}) |^2}{E - E^{(0)}_{\mathbf{k}-\mathbf{Q}}} \, C^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \tag{394} $$
Since both the energy and wave function are unchanged to first order in $V_{\rm ions}$, the lowest order non-zero contribution to the term on the left hand side is found when $C_{\mathbf{k}-\mathbf{Q}_1}$ is evaluated in zero-th order and $E$ is evaluated to second order. Thus, to second order in $V_{\rm ions}$, the energy eigenvalue $E$ is given by the solution of
$$ E - E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} = \sum_{\mathbf{Q} \neq \mathbf{Q}_1} \frac{| V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}) |^2}{E - E^{(0)}_{\mathbf{k}-\mathbf{Q}}} \tag{395} $$
or, since the eigenvalue $E$ is approximately equal to $E^{(0)}_{\mathbf{k}-\mathbf{Q}_1}$, the energy eigenvalue is given by
$$ E = E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} + \sum_{\mathbf{Q} \neq \mathbf{Q}_1} \frac{| V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}) |^2}{E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} - E^{(0)}_{\mathbf{k}-\mathbf{Q}}} \tag{396} $$
This relation shows that weakly perturbed non-degenerate bands repel each other. For example, if
$$ E^{(0)}_{\mathbf{k}-\mathbf{Q}} > E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \tag{397} $$
then the second order contribution is negative and $E$ is reduced further below $E^{(0)}_{\mathbf{k}-\mathbf{Q}_1}$. On the other hand, if
$$ E^{(0)}_{\mathbf{k}-\mathbf{Q}} < E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \tag{398} $$
then the second order contribution is positive and $E$ is increased further above $E^{(0)}_{\mathbf{k}-\mathbf{Q}_1}$. Hence, the leading order effect of the perturbation increases the separation between the energy bands.
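The band repulsion expressed by eqn(396) can be made concrete with a minimal one-dimensional sketch. The assumptions are not taken from the text: the potential is V(x) = 2 v cos(G x), so that only the Fourier components V(±G) = v contribute, the reciprocal lattice vectors are Q = n G, and units are chosen with ħ²/2m = 1. The function name `second_order_energy` is illustrative.

```python
import math

def second_order_energy(k, v, n1=0, a=1.0):
    """Energy of the band built on Q1 = n1 G, to second order as in eqn (396).

    Assumed model: V(x) = 2 v cos(G x), so only Q = (n1 +- 1) G couple to Q1.
    Returns (zeroth_order_energy, second_order_energy).
    """
    g = 2.0 * math.pi / a
    e1 = (k - n1 * g) ** 2
    shift = 0.0
    for n in (n1 - 1, n1 + 1):
        e = (k - n * g) ** 2
        shift += v * v / (e1 - e)  # valid only away from degeneracies
    return e1, e1 + shift

if __name__ == "__main__":
    k = 0.1 * 2.0 * math.pi
    print(second_order_energy(k, 0.2, n1=0))  # lowest band is pushed down
    print(second_order_energy(k, 0.2, n1=1))  # next band is pushed up
```

The lowest band is lowered and the band above it is raised, in line with the sign argument given for eqns(397) and (398). The expression fails near a Bragg plane, where an energy denominator vanishes and degenerate perturbation theory is required.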
8.1.3 Degenerate Perturbation Theory

The most important effect of the potential occurs when a pair of the free electron eigenvalues are within $V_{\rm ions}$ of each other, but are far from all other eigenvalues. Under these conditions, the eigenvalues are almost doubly degenerate and one can use degenerate perturbation theory to couple these energy levels. In this case, the set of equations can be truncated to only two non-zero $C$ coefficients. These two coefficients satisfy the pair of equations
$$ \left( E - E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \right) C_{\mathbf{k}-\mathbf{Q}_1} = V_{\rm ions}(\mathbf{Q}_2 - \mathbf{Q}_1) \, C_{\mathbf{k}-\mathbf{Q}_2} \tag{399} $$
and
$$ \left( E - E^{(0)}_{\mathbf{k}-\mathbf{Q}_2} \right) C_{\mathbf{k}-\mathbf{Q}_2} = V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}_2) \, C_{\mathbf{k}-\mathbf{Q}_1} \tag{400} $$
which can be combined to yield the quadratic equation for $E$
$$ \left( E - E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \right) \left( E - E^{(0)}_{\mathbf{k}-\mathbf{Q}_2} \right) = | V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}_2) |^2 \tag{401} $$
This quadratic equation has the solution for the energy eigenvalue
$$ E = \frac{E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} + E^{(0)}_{\mathbf{k}-\mathbf{Q}_2}}{2} \pm \sqrt{ \left( \frac{E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} - E^{(0)}_{\mathbf{k}-\mathbf{Q}_2}}{2} \right)^2 + | V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}_2) |^2 } \tag{402} $$
Whenever the Bloch wave vector $\mathbf{k}$ takes on special values such that the unperturbed bands cross
$$ E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} = E^{(0)}_{\mathbf{k}-\mathbf{Q}_2} \tag{403} $$
the energy bands simplify to yield the two branches
$$ E = E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \pm | V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}_2) | \tag{404} $$
If the unperturbed bands cross, the non-zero potential produces a splitting of $2 \, | V_{\rm ions}(\mathbf{Q}_1 - \mathbf{Q}_2) |$. This result is consistent with that previously found by using non-degenerate perturbation theory. The avoided crossings of the bands are expected to occur whenever
$$ E^{(0)}_{\mathbf{k}-\mathbf{Q}_1} \sim E^{(0)}_{\mathbf{k}-\mathbf{Q}_2} \tag{405} $$
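This gives rise to a specific condition on the wave vectors. For convenience of notation, let $\mathbf{q} = \mathbf{k} - \mathbf{Q}_1$ so that this criterion takes the form
$$ E^{(0)}_{\mathbf{q}} = E^{(0)}_{\mathbf{q}-\mathbf{Q}''} \tag{406} $$
for some reciprocal lattice vector $\mathbf{Q}'' \neq 0$. This requires that the vector $\mathbf{q}$ lies on the Bragg plane bisecting $\mathbf{Q}''$, as this condition reduces to
$$ Q''^2 = 2 \, \mathbf{q} \cdot \mathbf{Q}'' \tag{407} $$
The vector $\mathbf{q} - \mathbf{Q}''$ lies on a second Bragg plane. Thus, the geometric significance of the condition for the degeneracy of the unperturbed bands is that the electronic states satisfy the condition for Bragg scattering. The origin of the gaps can be easily understood from consideration of the wave functions. When $\mathbf{q}$ lies on a single Bragg plane, the energy eigenvalues are simply given by
$$ E = E^{(0)}_{\mathbf{q}} \pm | V_{\rm ions}(\mathbf{Q}'') | \tag{408} $$
The coefficients corresponding to these energies are found from the two coupled equations. In this case, where the unperturbed bands cross, the coefficients are related via
$$ C_{\mathbf{q}} = \pm \, {\rm sign}\left( V_{\rm ions}(\mathbf{Q}'') \right) C_{\mathbf{q}-\mathbf{Q}''} \tag{409} $$
which produces two standing wave solutions. If $V_{\rm ions}(\mathbf{Q}'') > 0$, then the pair of states are the anti-bonding state
$$ | \phi^{+}_{\mathbf{q}}(\mathbf{r}) |^2 \propto \cos^2 \left( \frac{\mathbf{Q}'' \cdot \mathbf{r}}{2} \right) , \qquad E_{+} = E^{(0)}_{\mathbf{q}} + | V_{\rm ions}(\mathbf{Q}'') | \tag{410} $$
and the bonding state
$$ | \phi^{-}_{\mathbf{q}}(\mathbf{r}) |^2 \propto \sin^2 \left( \frac{\mathbf{Q}'' \cdot \mathbf{r}}{2} \right) , \qquad E_{-} = E^{(0)}_{\mathbf{q}} - | V_{\rm ions}(\mathbf{Q}'') | \tag{411} $$
On the other hand, if $V_{\rm ions}(\mathbf{Q}'') < 0$, then the situation is reversed, and the anti-bonding state is given by
$$ | \phi^{+}_{\mathbf{q}}(\mathbf{r}) |^2 \propto \sin^2 \left( \frac{\mathbf{Q}'' \cdot \mathbf{r}}{2} \right) , \qquad E_{+} = E^{(0)}_{\mathbf{q}} + | V_{\rm ions}(\mathbf{Q}'') | \tag{412} $$
while the bonding state is given by the other form
$$ | \phi^{-}_{\mathbf{q}}(\mathbf{r}) |^2 \propto \cos^2 \left( \frac{\mathbf{Q}'' \cdot \mathbf{r}}{2} \right) , \qquad E_{-} = E^{(0)}_{\mathbf{q}} - | V_{\rm ions}(\mathbf{Q}'') | \tag{413} $$
In this context, the wave function
$$ \phi^{p}_{\mathbf{q}}(\mathbf{r}) \propto \sin \left( \frac{\mathbf{Q}'' \cdot \mathbf{r}}{2} \right) \tag{414} $$
is called p-like as it vanishes at the lattice points, whereas
$$ \phi^{s}_{\mathbf{q}}(\mathbf{r}) \propto \cos \left( \frac{\mathbf{Q}'' \cdot \mathbf{r}}{2} \right) \tag{415} $$
is called s-like as it is non-vanishing at the positions of the ions, $\mathbf{r} = \mathbf{R}$. The origin of the gap between the two branches is seen through examination of the average potential energy of the s-like and p-like wave functions
$$ \int_V d^3r \; V_{\rm ions}(\mathbf{r}) \, | \phi^{s,p}_{\mathbf{q}}(\mathbf{r}) |^2 \tag{416} $$
The s-like electrons congregate at the position of the ions where the potential is lower, and the p-like electrons congregate between the ions where the potential is higher. For an attractive interaction $V_{\rm ions}(\mathbf{r}) < 0$, this leads to $\phi^{s}_{\mathbf{q}}(\mathbf{r})$ having a lower energy than $\phi^{p}_{\mathbf{q}}(\mathbf{r})$ (when $V_{\rm ions}(\mathbf{Q}'') < 0$). The Bragg planes have other significance, as can be inferred from the gradient of the energy
$$ E_{\pm} = \frac{E^{(0)}_{\mathbf{q}} + E^{(0)}_{\mathbf{q}-\mathbf{Q}''}}{2} \pm \sqrt{ \left( \frac{E^{(0)}_{\mathbf{q}} - E^{(0)}_{\mathbf{q}-\mathbf{Q}''}}{2} \right)^2 + | V_{\rm ions}(\mathbf{Q}'') |^2 } \tag{417} $$
which is found as
$$ \nabla_{\mathbf{q}} E_{\pm} = \frac{\hbar^2}{m} \left[ \left( \mathbf{q} - \frac{\mathbf{Q}''}{2} \right) \pm \frac{\mathbf{Q}''}{2} \, \frac{ E^{(0)}_{\mathbf{q}} - E^{(0)}_{\mathbf{q}-\mathbf{Q}''} }{ \sqrt{ \left( E^{(0)}_{\mathbf{q}} - E^{(0)}_{\mathbf{q}-\mathbf{Q}''} \right)^2 + 4 \, | V_{\rm ions}(\mathbf{Q}'') |^2 } } \right] \tag{418} $$
On the Bragg plane, one has
$$ E^{(0)}_{\mathbf{q}} = E^{(0)}_{\mathbf{q}-\mathbf{Q}''} \tag{419} $$
therefore, the second term in the expression for the gradient drops out on these planes. Thus, the gradient of the energy of the mixed bands is given by
$$ \nabla_{\mathbf{q}} E_{\pm} = \frac{\hbar^2}{m} \left( \mathbf{q} - \frac{\mathbf{Q}''}{2} \right) \tag{420} $$
and, as $\mathbf{q}$ is on the Bragg plane, the vector $\mathbf{q} - \frac{\mathbf{Q}''}{2}$ is parallel to the plane and so is the gradient. The gradient of the energy is perpendicular to surfaces of constant energy and so, the constant energy surfaces are usually perpendicular to the Bragg planes at their points of intersection.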
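The two-band result, eqn(417), and the vanishing of the normal component of the gradient on the Bragg plane can be checked numerically in one dimension, where the Bragg plane reduces to the point q = Q''/2. This is a sketch under assumptions that are not part of the text: units with ħ²/2m = 1, a real Fourier component `v` standing for V_ions(Q''), and the illustrative function name `two_band`.

```python
import math

def two_band(q, v, g):
    """Return (E_minus, E_plus) of eqn (417) in one dimension.

    Units hbar^2 / 2m = 1; g plays the role of Q'' and v of V_ions(Q'').
    """
    e_a = q * q              # E^(0)_q
    e_b = (q - g) ** 2       # E^(0)_{q - Q''}
    mean = 0.5 * (e_a + e_b)
    half_diff = 0.5 * (e_a - e_b)
    root = math.sqrt(half_diff ** 2 + v * v)
    return mean - root, mean + root

if __name__ == "__main__":
    g, v, h = 2.0 * math.pi, 0.3, 1e-5
    e_minus, e_plus = two_band(0.5 * g, v, g)
    print("gap on the Bragg plane:", e_plus - e_minus)
    # Central difference estimate of dE/dq for the lower band at q = g/2.
    slope = (two_band(0.5 * g + h, v, g)[0] - two_band(0.5 * g - h, v, g)[0]) / (2.0 * h)
    print("slope on the Bragg plane:", slope)
```

The gap equals 2 |v| on the Bragg plane and the slope vanishes there, while far from the plane the lower branch approaches the second order result of non-degenerate perturbation theory.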
Generally, the vanishing of the normal component of the gradient at the Brillouin zone boundary is not dependent on the validity of the nearly free electron approximation, but is a consequence of symmetry. Consider the case in which there is a mirror plane symmetry, $\sigma$. The mirror plane is assumed to run through the origin of the Brillouin zone and is parallel to the Brillouin zone boundary under consideration. Then, the normal component of the gradient is defined as
$$ \mathbf{Q} \cdot \nabla_{\mathbf{k}} E_{\mathbf{k}} = \lim_{\delta \to 0} \left[ \frac{ E_{\mathbf{k}+\delta \mathbf{Q}} - E_{\mathbf{k}-\delta \mathbf{Q}} }{ 2 \delta } \right] \tag{421} $$
However, since the point $\mathbf{k}$ is equivalent to the point $\mathbf{k} - \mathbf{Q}$, one has
$$ E_{\mathbf{k}+\delta \mathbf{Q}} = E_{-\mathbf{Q}+\mathbf{k}+\delta \mathbf{Q}} \tag{422} $$
and, as there exists a mirror plane $\sigma$ through the origin and perpendicular to $\mathbf{Q}$, which reverses the sign of $\mathbf{Q}$, one also has
$$ E_{-\mathbf{Q}+\mathbf{k}+\delta \mathbf{Q}} = E_{\mathbf{Q}+\sigma \mathbf{k}-\delta \mathbf{Q}} \tag{423} $$
Noting that as $\mathbf{k}$ is on the Bragg plane, $\mathbf{k} \equiv \mathbf{Q} + \sigma \mathbf{k}$, so that $E_{\mathbf{k}+\delta \mathbf{Q}} = E_{\mathbf{k}-\delta \mathbf{Q}}$, and substituting this equality into the definition, one finds that the normal component of the gradient vanishes at the Brillouin zone boundary
$$ \mathbf{Q} \cdot \nabla_{\mathbf{k}} E_{\mathbf{k}} = 0 \tag{424} $$
Thus, at the Brillouin zone boundary, either the normal component of the gradient vanishes or the gradient does not exist, i.e. there might be a cusp. The presence of other types of symmetry can give rise to similar conclusions.
8.1.4 Empty Lattice Approximation Band Structure
Since the nearly free electron approximation deviates only slightly from the free electron approximation, the gross features of the band structure can be found using the empty lattice approximation. Since the Brillouin zone is a three-dimensional object and is highly symmetric, it is only necessary to specify the bands within an irreducible wedge. Once the bands are specified within the wedge, then by use of symmetry, the bands are completely known throughout the Brillouin zone. Since it is difficult to represent the energy dispersion relations in a three-dimensional volume of reciprocal space, it is customary to specify the dispersion relations on the lines defining the boundaries of the irreducible wedge. These lines have high symmetries. Consider the case of an f.c.c. Bravais Lattice, and consider the bands within the first Brillouin zone. The high symmetry points are marked by special letters.
Γ ≡ (0, 0, 0)
K ≡ (3π/2a) (1, 1, 0) ≡ (2π/a) (3/4, 3/4, 0)
W ≡ (π/a) (2, 1, 0) ≡ (2π/a) (1, 1/2, 0)
X ≡ (2π/a) (1, 0, 0)
L ≡ (π/a) (1, 1, 1) ≡ (2π/a) (1/2, 1/2, 1/2)

and, in units of 2π/a, correspond to

Γ ≡ (0, 0, 0)          The zone center
K ≡ (3/4, 3/4, 0)      The hexagonal edge center
X ≡ (1, 0, 0)          The diamond face center
W ≡ (1, 1/2, 0)        The corner
L ≡ (1/2, 1/2, 1/2)    The hexagonal face center

The electron bands are usually plotted against $\mathbf{k}$ along the high symmetry directions

Γ → X → W → L → Γ → K → X

The lengths of these linear segments (in units of 2π/a) are given by

Γ → X : 1
X → W : 1/2
W → L : 1/√2
L → Γ : √3/2
Γ → K : 3√2/4
K → X : √10/4
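The segment lengths follow directly from the coordinates of the high symmetry points. A minimal sketch that recomputes them is given below; the dictionary `POINTS`, the list `PATH` and the function name `segment_lengths` are illustrative names, with "G" standing for Γ and all coordinates in units of 2π/a.

```python
import math

# High symmetry points of the f.c.c. Brillouin zone, in units of 2 pi / a.
POINTS = {
    "G": (0.0, 0.0, 0.0),
    "X": (1.0, 0.0, 0.0),
    "W": (1.0, 0.5, 0.0),
    "L": (0.5, 0.5, 0.5),
    "K": (0.75, 0.75, 0.0),
}
PATH = ["G", "X", "W", "L", "G", "K", "X"]

def segment_lengths(path=PATH, points=POINTS):
    """Euclidean length of each straight segment of the path."""
    return [math.dist(points[a], points[b]) for a, b in zip(path, path[1:])]

if __name__ == "__main__":
    for (a, b), length in zip(zip(PATH, PATH[1:]), segment_lengths()):
        print(a, "->", b, round(length, 4))
```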
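The band energies in the empty lattice approximation can be plotted along these axes in units of $E_0$, where
$$ E_0 = \frac{\hbar^2}{2m} \, \frac{4 \pi^2}{a^2} \tag{425} $$
and the components of the reduced wave vectors $\tilde{k}_i$, where
$$ \tilde{k}_i = \frac{k_i \, a}{2 \pi} \tag{426} $$
The energy of the various bands can be constructed from the various $E^{(0)}_{\mathbf{k}-\mathbf{Q}}$.

The first band that is considered is simply $E^{(0)}_{\mathbf{k}}$, where $\mathbf{Q} = (0, 0, 0)$ thus,
$$ \frac{E^{(0)}_{\mathbf{k}}}{E_0} = \tilde{k}_x^2 + \tilde{k}_y^2 + \tilde{k}_z^2 \tag{427} $$
which for Γ → X is just
$$ = \tilde{k}_x^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le 1 \tag{428} $$
For X → W this band dispersion is given by
$$ = 1 + \tilde{k}_y^2 \qquad {\rm for} \quad 0 \le \tilde{k}_y \le \tfrac{1}{2} \tag{429} $$
For W → L this band is described by
$$ = \tfrac{1}{4} + \tilde{k}_x^2 + ( 1 - \tilde{k}_x )^2 \qquad {\rm for} \quad \tfrac{1}{2} \le \tilde{k}_x \le 1 \tag{430} $$
For L → Γ this dispersion is given as
$$ = 3 \, \tilde{k}_x^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le \tfrac{1}{2} \tag{431} $$
For Γ → K this band takes the form
$$ = 2 \, \tilde{k}_x^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le \tfrac{3}{4} \tag{432} $$
The last segment is given by K → X, in which the band takes the form
$$ = \tilde{k}_x^2 + 9 \, ( 1 - \tilde{k}_x )^2 \qquad {\rm for} \quad \tfrac{3}{4} \le \tilde{k}_x \le 1 \tag{433} $$

The next band to be considered is simply $E^{(0)}_{\mathbf{k}-\mathbf{Q}}$, where $\mathbf{Q} = \frac{4 \pi}{a} (1, 0, 0)$ thus,
$$ \frac{E^{(0)}_{\mathbf{k}-\mathbf{Q}}}{E_0} = ( \tilde{k}_x - 2 )^2 + \tilde{k}_y^2 + \tilde{k}_z^2 \tag{434} $$
which for Γ → X is just
$$ = ( \tilde{k}_x - 2 )^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le 1 \tag{435} $$
For X → W this band dispersion is given by
$$ = 1 + \tilde{k}_y^2 \qquad {\rm for} \quad 0 \le \tilde{k}_y \le \tfrac{1}{2} \tag{436} $$
For W → L this band is described by
$$ = \tfrac{1}{4} + ( \tilde{k}_x - 2 )^2 + ( 1 - \tilde{k}_x )^2 \qquad {\rm for} \quad \tfrac{1}{2} \le \tilde{k}_x \le 1 \tag{437} $$
For L → Γ this dispersion is given as
$$ = ( \tilde{k}_x - 2 )^2 + 2 \, \tilde{k}_x^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le \tfrac{1}{2} \tag{438} $$
For Γ → K this band takes the form
$$ = ( \tilde{k}_x - 2 )^2 + \tilde{k}_x^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le \tfrac{3}{4} \tag{439} $$
The last segment is given by K → X, in which the band takes the form
$$ = ( \tilde{k}_x - 2 )^2 + 9 \, ( 1 - \tilde{k}_x )^2 \qquad {\rm for} \quad \tfrac{3}{4} \le \tilde{k}_x \le 1 \tag{440} $$

The next band is $E^{(0)}_{\mathbf{k}-\mathbf{Q}}$, where $\mathbf{Q} = \frac{4 \pi}{a} ( \tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2} )$ thus,
$$ \frac{E^{(0)}_{\mathbf{k}-\mathbf{Q}}}{E_0} = ( \tilde{k}_x - 1 )^2 + ( \tilde{k}_y - 1 )^2 + ( \tilde{k}_z - 1 )^2 \tag{441} $$
which for Γ → X is just
$$ = 2 + ( \tilde{k}_x - 1 )^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le 1 \tag{442} $$
For X → W this band dispersion is given by
$$ = 1 + ( \tilde{k}_y - 1 )^2 \qquad {\rm for} \quad 0 \le \tilde{k}_y \le \tfrac{1}{2} \tag{443} $$
For W → L this band is described by
$$ = \tfrac{1}{4} + ( \tilde{k}_x - 1 )^2 + \tilde{k}_x^2 \qquad {\rm for} \quad \tfrac{1}{2} \le \tilde{k}_x \le 1 \tag{444} $$
For L → Γ this dispersion is given as
$$ = 3 \, ( \tilde{k}_x - 1 )^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le \tfrac{1}{2} \tag{445} $$
For Γ → K this band takes the form
$$ = 1 + 2 \, ( \tilde{k}_x - 1 )^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le \tfrac{3}{4} \tag{446} $$
The last segment is given by K → X, in which the band takes the form
$$ = 1 + ( \tilde{k}_x - 1 )^2 + ( 2 - 3 \, \tilde{k}_x )^2 \qquad {\rm for} \quad \tfrac{3}{4} \le \tilde{k}_x \le 1 \tag{447} $$

The next band is $E^{(0)}_{\mathbf{k}-\mathbf{Q}}$, where $\mathbf{Q} = \frac{4 \pi}{a} ( \tfrac{1}{2}, - \tfrac{1}{2}, \tfrac{1}{2} )$ thus,
$$ \frac{E^{(0)}_{\mathbf{k}-\mathbf{Q}}}{E_0} = ( \tilde{k}_x - 1 )^2 + ( \tilde{k}_y + 1 )^2 + ( \tilde{k}_z - 1 )^2 \tag{448} $$
which for Γ → X is just
$$ = 2 + ( \tilde{k}_x - 1 )^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le 1 \tag{449} $$
For X → W this band dispersion is given by
$$ = 1 + ( \tilde{k}_y + 1 )^2 \qquad {\rm for} \quad 0 \le \tilde{k}_y \le \tfrac{1}{2} \tag{450} $$
For W → L this band is described by
$$ = \tfrac{9}{4} + ( \tilde{k}_x - 1 )^2 + \tilde{k}_x^2 \qquad {\rm for} \quad \tfrac{1}{2} \le \tilde{k}_x \le 1 \tag{451} $$
For L → Γ this dispersion is given as
$$ = 2 \, ( \tilde{k}_x - 1 )^2 + ( \tilde{k}_x + 1 )^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le \tfrac{1}{2} \tag{452} $$
For Γ → K this band takes the form
$$ = 1 + ( \tilde{k}_x - 1 )^2 + ( \tilde{k}_x + 1 )^2 \qquad {\rm for} \quad 0 \le \tilde{k}_x \le \tfrac{3}{4} \tag{453} $$
The last segment is given by K → X, in which the band takes the form
$$ = 1 + ( \tilde{k}_x - 1 )^2 + ( 4 - 3 \, \tilde{k}_x )^2 \qquad {\rm for} \quad \tfrac{3}{4} \le \tilde{k}_x \le 1 \tag{454} $$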
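The branch-by-branch formulas above can be generated automatically by evaluating |k - Q|² for every reciprocal lattice vector and sorting the results. The sketch below does this for the f.c.c. lattice, using the fact that in units of 2π/a the reciprocal lattice vectors are the integer triples whose components are either all even or all odd. The function names and the cutoff `n_max` are illustrative assumptions; energies are returned in units of E_0 of eqn(425).

```python
from itertools import product

def fcc_reciprocal_vectors(n_max=2):
    """Reciprocal lattice vectors of the f.c.c. lattice in units of 2 pi / a.

    These are the integer triples with components all even or all odd.
    """
    return [q for q in product(range(-n_max, n_max + 1), repeat=3)
            if len({c % 2 for c in q}) == 1]

def empty_lattice_energies(k, n_bands=8, n_max=2):
    """Lowest empty-lattice energies |k - Q|^2 at reduced wave vector k, in units of E_0."""
    energies = sorted(sum((ki - qi) ** 2 for ki, qi in zip(k, q))
                      for q in fcc_reciprocal_vectors(n_max))
    return energies[:n_bands]

if __name__ == "__main__":
    print("Gamma:", empty_lattice_energies((0, 0, 0), n_bands=10))
    print("X:    ", empty_lattice_energies((1, 0, 0), n_bands=6))
```

At Γ the lowest level is non-degenerate and is followed by an eight-fold degenerate level at 3 E_0, while at X the two lowest levels at E_0 and 2 E_0 are two-fold and four-fold degenerate, in agreement with eqns(428), (435), (442) and (449) evaluated at the end point of the Γ → X segment.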
It is seen that some branches of these bands are highly degenerate. When $V_{\rm ions} \neq 0$, the degeneracy of the various branches may be lifted. Group theory can be used to determine whether or not the potential lifts the degeneracy of the branches. Thus, even in the empty lattice approximation, the method of plotting bands shows a great deal of structure. The real structure is actually inherent in the Bragg planes, which generally can be associated with an "energy gap" in the dispersion relations. The "gap" may or may not extend across the entire Brillouin zone. A gap only appears in the density of states if the "gap" extends across the entire Brillouin zone. The nearly free electron approximation has been worked out in detail for Al by B. Segall, Physical Review 124, 1797 (1961).

For the b.c.c. lattice, the reciprocal lattice vectors are
$$ \mathbf{b}_1 = \frac{1}{2} \, \frac{4 \pi}{a} \, ( \hat{e}_x + \hat{e}_y ) , \qquad \mathbf{b}_2 = \frac{1}{2} \, \frac{4 \pi}{a} \, ( \hat{e}_x + \hat{e}_z ) , \qquad \mathbf{b}_3 = \frac{1}{2} \, \frac{4 \pi}{a} \, ( \hat{e}_y + \hat{e}_z ) \tag{455} $$
The Cartesian coordinates of the high symmetry points are

Γ ≡ (0, 0, 0)
H ≡ (1, 0, 0)
N ≡ (1/2, 1/2, 0)
P ≡ (1/2, 1/2, 1/2)

in units of 2π/a.
----------------------------------------------------------------------
8.1.5 Exercise 27

Derive the lowest energy bands of a b.c.c. lattice in the empty lattice approximation. Plot the dispersion along the high symmetry directions (Γ → H → N → P → Γ → N).
----------------------------------------------------------------------
8.1.6 Degeneracies of the Bloch States

The degeneracies of the bands at various points in the Brillouin zone, found in the empty lattice approximation, can be lifted by the crystalline potential. The character and degeneracies of the bands at symmetry points can be ascertained by the use of group theory (L.P. Bouckaert, R. Smoluchowski and E. Wigner, Phys. Rev. 50, 58 (1936)). Given a Bloch function $\phi_{n,\mathbf{k}}(\mathbf{r})$, one can apply a general point group symmetry operator $\hat{O}(A_j)$ to the Bloch function, thereby transforming it into the Bloch function corresponding to the wave vector $A_j \mathbf{k}$
$$ \hat{O}(A_j) \, \phi_{n,\mathbf{k}}(\mathbf{r}) = \phi_{n,A_j \mathbf{k}}(\mathbf{r}) \tag{456} $$
This is proved by considering the combined operation consisting of the point group operation $\hat{O}(A_j)$ followed by a translation through a Bravais lattice vector $\mathbf{R}$. The effect of the combined operation is evaluated as
$$ \hat{T}(\mathbf{R}) \, \hat{O}(A_j) \, \phi_{n,\mathbf{k}}(\mathbf{r}) = \hat{T}(\mathbf{R}) \, \phi_{n,\mathbf{k}}(A_j^{-1} \mathbf{r}) = \exp\left[ - i \mathbf{k} \cdot A_j^{-1} \mathbf{R} \right] \phi_{n,\mathbf{k}}(A_j^{-1} \mathbf{r}) \tag{457} $$
where the last equality follows from Bloch's theorem. However, we note that the scalar product remains invariant if both vectors are transformed. We shall transform the vectors $\mathbf{k}$ and $( A_j^{-1} \mathbf{R} )$ by $A_j$. Hence, as
$$ \mathbf{k} \cdot ( A_j^{-1} \mathbf{R} ) = ( A_j \mathbf{k} ) \cdot ( A_j A_j^{-1} \mathbf{R} ) = ( A_j \mathbf{k} ) \cdot \mathbf{R} \tag{458} $$
we find that
$$ \hat{T}(\mathbf{R}) \, \hat{O}(A_j) \, \phi_{n,\mathbf{k}}(\mathbf{r}) = \exp\left[ - i ( A_j \mathbf{k} ) \cdot \mathbf{R} \right] \phi_{n,\mathbf{k}}(A_j^{-1} \mathbf{r}) = \exp\left[ - i ( A_j \mathbf{k} ) \cdot \mathbf{R} \right] \hat{O}(A_j) \, \phi_{n,\mathbf{k}}(\mathbf{r}) \tag{459} $$
Since the quantity
$$ \exp\left[ - i ( A_j \mathbf{k} ) \cdot \mathbf{R} \right] \tag{460} $$
is the eigenvalue of the translation operator $\hat{T}(\mathbf{R})$, the Bloch wave vector of the function $\hat{O}(A_j) \, \phi_{n,\mathbf{k}}(\mathbf{r})$ is $A_j \mathbf{k}$. As this is an energy eigenfunction, the transformed function is a Bloch function. That is,
$$ \phi_{n,A_j \mathbf{k}}(\mathbf{r}) = \hat{O}(A_j) \, \phi_{n,\mathbf{k}}(\mathbf{r}) \tag{461} $$
Since the point group symmetry operations commute with the Hamiltonian,
$$ [ \hat{H} , \hat{O}(A_j) ] = 0 \tag{462} $$
the Bloch states $\hat{O}(A_j) \, \phi_{n,\mathbf{k}}(\mathbf{r})$ all have the same energy $E_{n,\mathbf{k}}$.

A basis set can be constructed by repeated application of the point group symmetry operators on the Bloch functions. The same vector $\mathbf{k}$ cannot appear in distinct bases created from a Bloch function since the symmetry operations form a group. This means that two such bases are either identical or have no wave vector $\mathbf{k}$ in common. A basis created from the Bloch function $\phi_{n,\mathbf{k}}(\mathbf{r})$ in this fashion may be either reducible or irreducible. An irreducible basis can be constructed by selecting an appropriate subset of Bloch functions from the above basis set. If one considers the set of wave vectors $A_i \mathbf{k}$, then certain of these points may be equivalent in that
$$ A_i \mathbf{k} = A_j \mathbf{k} + \mathbf{Q} \tag{463} $$
where $\mathbf{Q}$ is a reciprocal lattice vector. The star of $\mathbf{k}$ is the set of all the inequivalent wave vectors $A_i \mathbf{k}$. More precisely, the star of the wave vector $\mathbf{k}$ consists of the set of all mutually inequivalent wave vectors $A_i \mathbf{k}$, where $A_i$ ranges over all the operations of the point group. Since none of the Bloch wave vectors in the star are equivalent, the corresponding Bloch functions are all linearly independent. Hence, the Bloch functions of the star may be used to construct an irreducible basis. The group of the $\mathbf{k}$ vector consists of all symmetry operations which, when acting on $\mathbf{k}$, lead to an equivalent point. That is, the symmetry operations of the group of the $\mathbf{k}$ vector satisfy

$$ A_j \mathbf{k} = \mathbf{k} + \mathbf{Q} \tag{464} $$
where $\mathbf{Q}$ is a reciprocal lattice vector. As an example, the groups of the $\mathbf{k}$ vectors for the points Γ, Z, M, A of the simple tetragonal lattice coincide with the $D_{4h}$ point group of the tetragonal lattice itself. The groups of the $\mathbf{k}$ vectors at X and R are $D_{2h}$, and $D_{2h}$ is a subgroup of $D_{4h}$. In general, the group of the $\mathbf{k}$ vector of the Γ point will always coincide with the point group of the crystal.

The group of the $\mathbf{k}$ vector has irreducible representations, and these are called the small representations. The basis functions of the star of $\mathbf{k}$ can be symmetrized with respect to the small representations. The symmetrization can be performed by using the projection method. Although the groups of the wave vectors in the star may be different, the small representation of any one can be chosen for the symmetrization process. After the symmetrization, the resulting basis functions form an irreducible representation of the space group. Each basis function of the small representation corresponds to exactly one wave vector in the star and its equivalent wave vectors. The basis functions corresponding to the different irreducible representations are orthogonal.

The irreducible representations of the space group constructed from the Bloch functions are fully determined by the star of the $\mathbf{k}$ vector and the small representation. The basis functions forming the irreducible representation of the space group constructed from the Bloch state $\phi_{n,\mathbf{k}}(\mathbf{r})$ are eigenstates of $\hat{H}_0$ with energy $E_{n,\mathbf{k}}$. Barring accidental degeneracies, the degeneracy of this eigenvalue is equal to the dimension of its irreducible representation. As $\mathbf{k}$ varies in the Brillouin zone, the eigenvalue $E_{n,\mathbf{k}}$ and the corresponding basis functions vary continuously. The group of the $\mathbf{k}$ vector also varies as $\mathbf{k}$ varies. Whenever the dimension of the small representation corresponding to the basis function $\phi_{n,\mathbf{k}}(\mathbf{r})$ changes, the degeneracy of $E_{n,\mathbf{k}}$ changes.
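The simple tetragonal example quoted above can be checked by brute force. The following is a minimal sketch, not part of the original notes: it assumes that $D_{4h}$ is generated by a four-fold rotation about z, a two-fold rotation about x and the inversion, and that wave vectors are written in units of the primitive reciprocal lattice vectors, so that two wave vectors are equivalent when they differ by integers. The function names `generate_group`, `group_of_k` and `star_of_k` are illustrative only.

```python
import numpy as np

def generate_group(generators):
    """Close a set of 3x3 integer matrices under multiplication."""
    def key(matrix):
        return tuple(int(x) for x in matrix.flatten())
    identity = np.eye(3, dtype=int)
    seen = {key(identity)}
    frontier = [identity]
    while frontier:
        new_frontier = []
        for element in frontier:
            for g in generators:
                candidate = g @ element
                if key(candidate) not in seen:
                    seen.add(key(candidate))
                    new_frontier.append(candidate)
        frontier = new_frontier
    return [np.array(k).reshape(3, 3) for k in seen]

def equivalent(k1, k2):
    """True if k1 - k2 is a reciprocal lattice vector (integer components)."""
    diff = np.asarray(k1, dtype=float) - np.asarray(k2, dtype=float)
    return bool(np.allclose(diff, np.round(diff)))

def group_of_k(group, k):
    """Operations A with A k = k + Q, as in Eq. (464)."""
    return [A for A in group if equivalent(A @ k, k)]

def star_of_k(group, k):
    """Mutually inequivalent wave vectors A k."""
    star = []
    for A in group:
        Ak = A @ k
        if not any(equivalent(Ak, s) for s in star):
            star.append(Ak)
    return star

# Assumed generators of D4h acting on (kx, ky, kz).
C4z = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])
C2x = np.diag([1, -1, -1])
inversion = -np.eye(3, dtype=int)
D4h = generate_group([C4z, C2x, inversion])

# High symmetry points of the simple tetragonal zone (reduced coordinates).
points = {
    "Gamma": np.array([0.0, 0.0, 0.0]),
    "Z": np.array([0.0, 0.0, 0.5]),
    "M": np.array([0.5, 0.5, 0.0]),
    "A": np.array([0.5, 0.5, 0.5]),
    "X": np.array([0.0, 0.5, 0.0]),
    "R": np.array([0.0, 0.5, 0.5]),
}
orders = {name: len(group_of_k(D4h, k)) for name, k in points.items()}
stars = {name: len(star_of_k(D4h, k)) for name, k in points.items()}
```

In this sketch the group of the wave vector has all sixteen elements at Γ, Z, M and A, and eight elements at X and R, so the stars of X and R each contain two wave vectors.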
This may signify that at these points different bands cross or merge together. Alternatively, at $\mathbf{k}$ there are a vast number of bands each corresponding to a different small representation. The degeneracy of each band is given by the dimension of the corresponding small representation. If an irreducible representation of the group of the wave vector $\mathbf{k}$ can be decomposed into the irreducible representations of the group of $\mathbf{k}'$, then on varying $\mathbf{k}$ to $\mathbf{k}'$, the branch will split into sub-levels. The degeneracies of the sub-levels are determined by the dimensions of the irreducible representations contained in the decomposition.

——————————————————————————————————

As an example, consider the nearly free electron bands of zinc blende. The material has tetrahedral point group symmetry, $T_d$. The point group contains twenty four elements in five equivalence classes. One class consists of the identity $E$. There is a class of eight $C_3$ operations, which contains the rotation $C_3$ and the inverse rotation $C_3^{-1}$ about the four axes $[1, 1, 1]$, $[1, \bar{1}, \bar{1}]$, $[\bar{1}, 1, \bar{1}]$ and $[\bar{1}, \bar{1}, 1]$. There is a class consisting of three $C_2$ operations around the $[1, 0, 0]$, $[0, 1, 0]$ and $[0, 0, 1]$ axes. There is a class consisting of six $S_4$ operations around the $[1, 0, 0]$, $[0, 1, 0]$ and $[0, 0, 1]$ axes. Finally, there is a class consisting of six $\sigma$ operations which are reflections in the six planes $(1, 1, 0)$, $(1, 0, 1)$, $(0, 1, 1)$, $(1, \bar{1}, 0)$, $(1, 0, \bar{1})$ and $(0, 1, \bar{1})$. Therefore, the group has five irreducible representations. The character table is given by

$$ \begin{array}{c|rrrrr} T_d & E & C_2 \, (3) & C_3 \, (8) & S_4 \, (6) & \sigma \, (6) \\ \hline \Gamma_1 & 1 & 1 & 1 & 1 & 1 \\ \Gamma_2 & 1 & 1 & 1 & -1 & -1 \\ \Gamma_3 & 2 & 2 & -1 & 0 & 0 \\ \Gamma_4 & 3 & -1 & 0 & -1 & 1 \\ \Gamma_5 & 3 & -1 & 0 & 1 & -1 \end{array} $$
Let us consider the band structure along the high symmetry $[1, 1, 1]$ and $[1, 0, 0]$ directions. At the Γ point the group of the $\mathbf{k}$ vector coincides with the point group of the crystal. Since the nearly free electron approximation for the Bloch wave function for $\mathbf{k} = 0$ is a constant, it is a basis for the $\Gamma_1$ representation. Thus, the level is non-degenerate. At a general point along the eight $[1, 1, 1]$ directions, the group of the $\mathbf{k}$ vector is $C_{3v}$ and contains six elements in three classes. These are the identity $E$, a class consisting of the rotation $C_3$ about the $[1, 1, 1]$ axis and its inverse $C_3^{-1}$, and three reflections $\sigma$ in the three equivalent $(1, \bar{1}, 0)$ planes containing the $[1, 1, 1]$ axis. The character table is given by

$$ \begin{array}{c|rrr} C_{3v} & E & C_3 \, (2) & \sigma \, (3) \\ \hline \Lambda_1 & 1 & 1 & 1 \\ \Lambda_2 & 1 & 1 & -1 \\ \Lambda_3 & 2 & -1 & 0 \end{array} $$
Thus, the branches along the Λ axis are either singly or doubly degenerate when the crystalline potential is introduced. The branch which emanates from $\mathbf{k} = 0$ with energy $E = \frac{\hbar^2}{2m} k^2$ belongs to the $\Lambda_1$ representation as this is compatible with the $\Gamma_1$ representation. At the end point L where $\mathbf{k} = \frac{\pi}{a} (1, 1, 1)$, the symmetry operations are identical to those of Λ. In the free electron approximation, the state at L is doubly degenerate (ignoring spin) since the wave vectors $\frac{\pi}{a} (1, 1, 1)$ and $- \frac{\pi}{a} (1, 1, 1)$ differ by a reciprocal lattice vector $\mathbf{Q} = \frac{2\pi}{a} (1, 1, 1)$. Using the compatibility relations, one can show that the next highest band has $\Lambda_1$ symmetry. These two levels are accidentally degenerate, since they are not partner basis functions of a multi-dimensional irreducible representation. Therefore, the degeneracy may be lifted by the presence of a crystalline potential $V(\mathbf{Q})$. On continuing along the band with $\Lambda_1$ symmetry, one reaches the point $\mathbf{k} = \frac{2\pi}{a} (1, 1, 1)$. Since the primitive reciprocal lattice vectors of the f.c.c. lattice are of the form

$$ \begin{aligned} \mathbf{b}_1 &= \frac{2\pi}{a} (-1, 1, 1) \\ \mathbf{b}_2 &= \frac{2\pi}{a} (1, -1, 1) \\ \mathbf{b}_3 &= \frac{2\pi}{a} (1, 1, -1) \end{aligned} \tag{465} $$

then $\mathbf{Q} = \mathbf{b}_1 + \mathbf{b}_2 + \mathbf{b}_3$ is equal to $\frac{2\pi}{a} (1, 1, 1)$. Thus, the point $\mathbf{k} = \frac{2\pi}{a} (1, 1, 1)$ is equivalent to the Γ point. The star consists of just one wave vector. At this point, the eight nearly free electron bands corresponding to

$$ \phi_{\mathbf{k}_j}(\mathbf{r}) \sim \exp\left[ \, i \, \frac{2\pi}{a} ( \pm x \pm y \pm z ) \right] \tag{466} $$

are degenerate. They form the basis of an eight-dimensional representation which is reducible. In this representation, a symmetry transformation $A$ is represented by the $8 \times 8$ matrices, $D(A)$, which are constructed according to the prescription

$$ \hat{O}(A) \, \phi_{\mathbf{k}_i}(\mathbf{r}) = \phi_{\mathbf{k}_i}(A^{-1} \mathbf{r}) = \sum_j \phi_{\mathbf{k}_j}(\mathbf{r}) \, D(A)_{j,i} \tag{467} $$
The characters of this eight-dimensional representation are given by the trace of the $8 \times 8$ matrices and, therefore, the character of an operation is just the number of wave functions that are unchanged by the transformation.

$$ \begin{array}{c|c|r} \text{Class} & \text{Transformation} & \chi \\ \hline E & x, y, z & 8 \\ C_2 \, (3) & x, \bar{y}, \bar{z} & 0 \\ C_3 \, (8) & y, z, x & 2 \\ S_4 \, (6) & \bar{x}, z, \bar{y} & 0 \\ \sigma \, (6) & y, x, z & 4 \end{array} $$
This eight-dimensional representation, Γ, is reduced into the irreducible representations, $\Gamma_\mu$, via

$$ \Gamma = \sum_\mu a_\mu \, \Gamma_\mu \tag{468} $$

The decomposition can be found by considering the characters. The character of a symmetry operation $A$, $\chi(A)$, is decomposed into the characters of the irreducible representations, $\chi_\mu(A)$, via

$$ \chi(A) = \sum_\mu a_\mu \, \chi_\mu(A) \tag{469} $$

The multiplicity $a_\mu$ can be found from the orthogonality relation

$$ \sum_i g_i \, \chi(A_i) \, \chi_\mu(A_i) = g \, a_\mu \tag{470} $$

where the sum over $i$ runs over all the equivalence classes of the group, $g_i$ is the number of symmetry elements in the $i$-th equivalence class, and $g$ is the order of the group. This procedure leads to the decomposition

$$ \chi(A_i) = 2 \, \chi_{\Gamma_1}(A_i) + 2 \, \chi_{\Gamma_4}(A_i) \tag{471} $$
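The counting of unchanged plane waves and the orthogonality relation (470) are easy to mechanize. The sketch below is illustrative and not part of the original notes. It assumes one representative operation per class of $T_d$, and it assumes the labelling in which $\Gamma_4$ is the vector-like representation, with $\chi(S_4) = -1$ and $\chi(\sigma) = +1$, which is the labelling implied by the decomposition (471) and by the basis functions quoted in the text.

```python
import numpy as np
from itertools import product

# Classes of T_d in the order E, C2 (3), C3 (8), S4 (6), sigma (6).
class_sizes = np.array([1, 3, 8, 6, 6])

# One representative operation per class, acting on (x, y, z).
representatives = [
    np.eye(3, dtype=int),                           # E:     (x, y, z)
    np.diag([1, -1, -1]),                           # C2:    (x, -y, -z)
    np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]]),    # C3:    (y, z, x)
    np.array([[-1, 0, 0], [0, 0, 1], [0, -1, 0]]),  # S4:    (-x, z, -y)
    np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]]),    # sigma: (y, x, z)
]

# ASSUMED labelling: Gamma4 is the vector-like representation.
irreps = {
    "Gamma1": [1, 1, 1, 1, 1],
    "Gamma2": [1, 1, 1, -1, -1],
    "Gamma3": [2, 2, -1, 0, 0],
    "Gamma4": [3, -1, 0, -1, 1],
    "Gamma5": [3, -1, 0, 1, -1],
}

def characters(wave_vectors):
    """Character of each class: number of wave vectors left unchanged."""
    return np.array([
        sum(np.array_equal(A @ k, k) for k in wave_vectors)
        for A in representatives
    ])

def decompose(chi):
    """Multiplicities a_mu from sum_i g_i chi(A_i) chi_mu(A_i) = g a_mu."""
    order = int(class_sizes.sum())
    return {
        name: int(round(float(np.dot(class_sizes * chi, chi_mu)) / order))
        for name, chi_mu in irreps.items()
    }

# The eight wave vectors (2 pi / a)(+-1, +-1, +-1), in units of 2 pi / a.
eight_waves = [np.array(signs) for signs in product([1, -1], repeat=3)]
chi8 = characters(eight_waves)
multiplicities = decompose(chi8)
```

With these assumptions the characters come out as (8, 0, 2, 0, 4) and the only non-zero multiplicities are 2 for $\Gamma_1$ and 2 for $\Gamma_4$. The same two functions can be applied to any other set of equivalent wave vectors.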
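Thus, the eight plane wave basis can be symmetrized into two sets of basis functions of $\Gamma_1$ symmetry and two three-dimensional sets of basis functions of $\Gamma_4$ symmetry. The symmetrization process is performed by the use of the projection method. A projector, $\hat{P}^\mu$, which projects the functions onto an irreducible set of basis functions, is constructed from the symmetry operations $\hat{O}(A)$ and the characters of the operations via

$$ \hat{P}^\mu = \frac{n_\mu}{g} \sum_A \chi^\mu(A) \, \hat{O}(A) \tag{472} $$

In this, $n_\mu$ is the dimension of the $\mu$-th irreducible representation, i.e., $n_\mu = \chi^\mu(E)$. When the projector acts on an arbitrary combination of functions with equivalent wave vectors $\mathbf{k}$, $\phi_{\mathbf{k}}(\mathbf{r})$, it produces a basis function, $\phi^\mu_{\mathbf{k}}(\mathbf{r})$, for the $\mu$-th irreducible representation

$$ \hat{P}^\mu \, \phi_{\mathbf{k}}(\mathbf{r}) = \phi^\mu_{\mathbf{k}}(\mathbf{r}) \tag{473} $$

In this way, one can construct the set of symmetrized basis functions:

$$ \begin{array}{c|l} \text{Representation} & \text{Basis functions} \\ \hline \Gamma_1 & \cos\frac{2\pi x}{a} \cos\frac{2\pi y}{a} \cos\frac{2\pi z}{a} \\ \hline \Gamma_1 & \sin\frac{2\pi x}{a} \sin\frac{2\pi y}{a} \sin\frac{2\pi z}{a} \\ \hline \Gamma_4 & \cos\frac{2\pi x}{a} \sin\frac{2\pi y}{a} \sin\frac{2\pi z}{a} \\ & \sin\frac{2\pi x}{a} \cos\frac{2\pi y}{a} \sin\frac{2\pi z}{a} \\ & \sin\frac{2\pi x}{a} \sin\frac{2\pi y}{a} \cos\frac{2\pi z}{a} \\ \hline \Gamma_4 & \sin\frac{2\pi x}{a} \cos\frac{2\pi y}{a} \cos\frac{2\pi z}{a} \\ & \cos\frac{2\pi x}{a} \sin\frac{2\pi y}{a} \cos\frac{2\pi z}{a} \\ & \cos\frac{2\pi x}{a} \cos\frac{2\pi y}{a} \sin\frac{2\pi z}{a} \end{array} $$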
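The projection method can be tried out numerically on the eight plane waves. The sketch below is an illustration under stated assumptions rather than the author's construction: it generates the 24 operations of $T_d$ from one three-fold rotation and one $S_4$ operation, identifies the class of each operation from its trace and determinant, represents each operation by the permutation matrix $D(A)$ of Eq. (467), and builds the projector of Eq. (472). As before, $\Gamma_4$ is assumed to be the vector-like representation.

```python
import numpy as np
from itertools import product

def close_group(generators):
    """Close a set of 3x3 integer matrices under multiplication."""
    def key(matrix):
        return tuple(int(x) for x in matrix.flatten())
    identity = np.eye(3, dtype=int)
    seen = {key(identity)}
    frontier = [identity]
    while frontier:
        new_frontier = []
        for element in frontier:
            for g in generators:
                candidate = g @ element
                if key(candidate) not in seen:
                    seen.add(key(candidate))
                    new_frontier.append(candidate)
        frontier = new_frontier
    return [np.array(k).reshape(3, 3) for k in seen]

C3 = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])    # (x, y, z) -> (y, z, x)
S4 = np.array([[-1, 0, 0], [0, 0, 1], [0, -1, 0]])  # (x, y, z) -> (-x, z, -y)
Td = close_group([C3, S4])

def class_index(A):
    """Class order E, C2, C3, S4, sigma, fixed by (trace, determinant)."""
    signature = (int(np.trace(A)), int(round(float(np.linalg.det(A)))))
    return {(3, 1): 0, (-1, 1): 1, (0, 1): 2, (-1, -1): 3, (1, -1): 4}[signature]

waves = list(product([1, -1], repeat=3))  # wave vectors in units of 2 pi / a
position = {k: i for i, k in enumerate(waves)}

def rep_matrix(A):
    """D(A)_{j,i} = 1 when the plane wave k_i is carried into k_j = A k_i."""
    D = np.zeros((8, 8))
    for i, k in enumerate(waves):
        j = position[tuple(int(x) for x in A @ np.array(k))]
        D[j, i] = 1.0
    return D

def projector(chi_mu):
    """P = (n_mu / g) sum_A chi_mu(A) D(A), with n_mu = chi_mu(E)."""
    total = np.zeros((8, 8))
    for A in Td:
        total += chi_mu[class_index(A)] * rep_matrix(A)
    return chi_mu[0] / len(Td) * total

P_gamma1 = projector([1, 1, 1, 1, 1])
P_gamma4 = projector([3, -1, 0, -1, 1])  # ASSUMED vector-like Gamma4
rank_gamma1 = int(np.linalg.matrix_rank(P_gamma1))
rank_gamma4 = int(np.linalg.matrix_rank(P_gamma4))
```

The ranks of the two projectors, 2 and 6, reproduce the two one-dimensional and two three-dimensional blocks of the decomposition. Acting with `P_gamma1` on a single plane wave produces a combination of the two $\Gamma_1$ functions listed in the table.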
In this basis, all the matrices $D(A)$ representing the symmetry operators $A$ have the same block diagonal form. The matrices contain two one-dimensional blocks and two three-dimensional blocks. Thus, these levels may be split by the application of a potential; however, the degeneracies cannot be completely lifted.

Along the $[1, 0, 0]$ direction towards the X point, the wave vectors are of the form $(k, 0, 0)$ where $0 < k < \frac{2\pi}{a}$. The group of $\mathbf{k}$ is $C_{2v}$. It has four elements in four classes: the identity $E$, a two-fold rotation $C_4^2$ about the $[1, 0, 0]$ axis, and the two diagonal mirror planes $\sigma_d$ and $\sigma_d'$. The character table is given by

$$ \begin{array}{c|rrrr} C_{2v} & E & C_4^2 & \sigma_d & \sigma_d' \\ \hline \Delta_1 & 1 & 1 & 1 & 1 \\ \Delta_2 & 1 & 1 & -1 & -1 \\ \Delta_3 & 1 & -1 & 1 & -1 \\ \Delta_4 & 1 & -1 & -1 & 1 \end{array} $$
Therefore, along this direction, all the irreducible representations are one-dimensional. The wave function emanating from $(0, 0, 0)$ belongs to $\Delta_1$ since this is the only irreducible representation compatible with $\Gamma_1$. This branch continues up to the X point. The point $\frac{2\pi}{a} (1, 0, 0)$ is equivalent to the point $- \frac{2\pi}{a} (1, 0, 0)$, as they are related via the reciprocal lattice vector $\mathbf{Q} = \mathbf{b}_2 + \mathbf{b}_3$. At the X point, the lowest energy level in the nearly free electron approximation is doubly degenerate. The group of the $\mathbf{k}$ vector at the X point is $D_{2d}$ and consists of eight elements arranged in five classes. These are the identity $E$, a two-fold rotation $C_4^2$ about the x axis, a class of two elements which are the two-fold rotations $C_2$ about the y and z axes, a class of two $S_4$ operations about the x axis, and a class of two diagonal reflections $\sigma_d$ in the $(0, 1, 1)$ and the $(0, 1, \bar{1})$ planes. Thus there are five irreducible representations. The character table is given by

$$ \begin{array}{c|rrrrr} D_{2d} & E & C_4^2 \, (1) & C_2 \, (2) & S_4 \, (2) & \sigma_d \, (2) \\ \hline X_1 & 1 & 1 & 1 & 1 & 1 \\ X_2 & 1 & 1 & 1 & -1 & -1 \\ X_3 & 1 & 1 & -1 & -1 & 1 \\ X_4 & 1 & 1 & -1 & 1 & -1 \\ X_5 & 2 & -2 & 0 & 0 & 0 \end{array} $$
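At the X point, the wave functions of the two-fold degenerate energy levels, $E^0$, found in the nearly free electron approximation belong to the one-dimensional $X_1$ and $X_3$ irreducible representations. This degeneracy may be lifted by the potential. On continuing along the same direction, one reaches the point $(2, 0, 0)$, in units of $\frac{2\pi}{a}$. The six $\mathbf{k}$ points $(\pm 2, 0, 0)$, $(0, \pm 2, 0)$ and $(0, 0, \pm 2)$ are all equivalent to the zone center. The group of the wave vector is $T_d$. The six wave functions

$$ \begin{aligned} \phi_{\mathbf{k}}(\mathbf{r}) &= \exp\left[ \pm \, i \, \frac{4\pi}{a} x \right] \\ \phi_{\mathbf{k}}(\mathbf{r}) &= \exp\left[ \pm \, i \, \frac{4\pi}{a} y \right] \\ \phi_{\mathbf{k}}(\mathbf{r}) &= \exp\left[ \pm \, i \, \frac{4\pi}{a} z \right] \end{aligned} \tag{474} $$

can be used as a basis for a six-dimensional representation. In this representation, the characters of the symmetry operations are given by:

$$ \begin{array}{c|c|r} \text{Class} & \text{Transformation} & \chi \\ \hline E & x, y, z & 6 \\ C_2 \, (3) & x, \bar{y}, \bar{z} & 2 \\ C_3 \, (8) & y, z, x & 0 \\ S_4 \, (6) & \bar{x}, z, \bar{y} & 0 \\ \sigma \, (6) & y, x, z & 2 \end{array} $$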
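This representation is reducible and can be decomposed via

$$ \Gamma = \sum_\mu a_\mu \, \Gamma_\mu \tag{475} $$

The multiplicities $a_\mu$ are calculated from

$$ \sum_i g_i \, \chi(A_i) \, \chi_\mu(A_i) = g \, a_\mu \tag{476} $$

which leads to the decomposition

$$ \Gamma = \Gamma_1 + \Gamma_3 + \Gamma_4 \tag{477} $$

into a one-dimensional, a two-dimensional and a three-dimensional irreducible representation. The basis functions can be symmetrized using the projection method. The basis functions for the small representations are

$$ \begin{array}{c|l} \text{Representation} & \text{Basis functions} \\ \hline \Gamma_1 & \cos\frac{4\pi x}{a} + \cos\frac{4\pi y}{a} + \cos\frac{4\pi z}{a} \\ \hline \Gamma_3 & \cos\frac{4\pi y}{a} - \cos\frac{4\pi z}{a} \\ & 2 \cos\frac{4\pi x}{a} - \cos\frac{4\pi y}{a} - \cos\frac{4\pi z}{a} \\ \hline \Gamma_4 & \sin\frac{4\pi x}{a} \\ & \sin\frac{4\pi y}{a} \\ & \sin\frac{4\pi z}{a} \end{array} $$

Hence, the six-fold degenerate energy level $E^0 = \frac{\hbar^2}{2m} \left( \frac{4\pi}{a} \right)^2$ may have its degeneracy lifted by $V(\mathbf{Q})$.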
——————————————————————————————————
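8.1.7 Exercise 28

Using the symmetrized wave functions at $\mathbf{k} = \frac{2\pi}{a} (1, 1, 1)$ in the nearly free electron model for zinc blende

$$ \begin{aligned} \phi_{\Gamma_1} &= \sqrt{\frac{8}{a^3}} \, \cos\frac{2\pi x}{a} \cos\frac{2\pi y}{a} \cos\frac{2\pi z}{a} \\ \phi_{\Gamma_4(x)} &= \sqrt{\frac{8}{a^3}} \, \sin\frac{2\pi x}{a} \cos\frac{2\pi y}{a} \cos\frac{2\pi z}{a} \\ \phi_{\Gamma_4(y)} &= \sqrt{\frac{8}{a^3}} \, \cos\frac{2\pi x}{a} \sin\frac{2\pi y}{a} \cos\frac{2\pi z}{a} \\ \phi_{\Gamma_4(z)} &= \sqrt{\frac{8}{a^3}} \, \cos\frac{2\pi x}{a} \cos\frac{2\pi y}{a} \sin\frac{2\pi z}{a} \end{aligned} \tag{478} $$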
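Show that the matrix elements of the momentum operator between the $\Gamma_1$ and $\Gamma_4$ basis functions are given by

$$ | \langle \Gamma_1 | \, \hat{p}_x \, | \Gamma_4(x) \rangle |^2 = | \langle \Gamma_1 | \, \hat{p}_y \, | \Gamma_4(y) \rangle |^2 = | \langle \Gamma_1 | \, \hat{p}_z \, | \Gamma_4(z) \rangle |^2 = \left( \frac{2 \pi \hbar}{a} \right)^2 \tag{479} $$

while all other matrix elements are zero.

——————————————————————————————————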
8.1.8 Brillouin Zone Boundaries

The Brillouin zone boundaries play an important role in the understanding of Fermi-surfaces. In the empty lattice approximation, the Fermi-surface is a sphere when represented in the extended zone scheme. The nearly free electron approximation introduces a distortion of the sphere which is most marked near the Brillouin zone boundaries. In general, if the spherical Fermi-surface crosses a Bragg plane, then the sphere may distort. In particular, the constant energy surface should be perpendicular to the Bragg plane at the line where they intersect. Due to the appearance of the potential $V_{ions}(\mathbf{Q})$ in the expression for the Bloch energy near the Bragg plane, and also due to the accompanying band splitting, the circles of intersection of the constant energy surfaces (corresponding to $E_F$) with the Bragg plane do not match up. This is necessary since the distortion of the Fermi-surface must conserve the volume enclosed. This volume is equal to the volume enclosed by the spherical Fermi-surface of the empty lattice approximation.

The Fermi-surface in the reduced Brillouin zone scheme can be constructed from the Fermi-surface in the extended zone scheme. This is done by translating the disjoint pieces of the Fermi-surface in the higher order zones by reciprocal lattice vectors, so that the pieces fit back into the first Brillouin zone.

The first Brillouin zone is the Wigner-Seitz unit cell of the reciprocal lattice. It encloses the set of points that are closer to $\mathbf{Q} = 0$ than they are to any other reciprocal lattice vector $\mathbf{Q} \neq 0$. This can be restated as follows: the first Brillouin zone consists of the volume in reciprocal space which can be accessed from the origin without crossing a Bragg plane. The second Brillouin zone is the volume that can be reached from the first Brillouin zone by crossing only one Bragg plane. Likewise, the $(n + 1)$-th Brillouin zone consists of the points, not in the $(n - 1)$-th zone, that can be reached from the $n$-th zone by crossing only one Bragg plane. Alternatively, the $n$-th Brillouin zone is the volume that can only be reached from the origin by crossing a minimum of $(n - 1)$ Bragg planes.

The Fermi-surface is constructed by:
(i) Drawing the free electron sphere.
(ii) Distorting the sphere at the Bragg planes.
(iii) For each of the $n$ Brillouin zones, taking the portions of the surface in the $n$-th zone and translating them by reciprocal lattice vectors so that they lie within the first Brillouin zone. The resulting surface is the branch of the Fermi-surface assigned to the $n$-th band in the repeated zone scheme.

The Hume-Rothery rules provide a correlation of crystal structure with the number of electrons per unit cell, or band filling. They are empirical rules which only apply to alloys of noble metals, such as Cu, Ag and Au, with s-p elements such as Zn, Al, Si, and Ge. If it is assumed that the noble metals have one electron outside the closed d shell, then the alloys have an f.c.c. phase for an average number of electrons per atom up to 1.38, while the b.c.c. phase is stable for band-fillings between 1.38 and 1.48. In the b.c.c. structure, the smallest vectors from the zone center to each face of the Brillouin zone have the form $\frac{1}{2} \frac{2\pi}{a} (1, 1, 0)$, whereas for the f.c.c. lattice these vectors are of the form $\frac{1}{2} \frac{2\pi}{a} (1, 1, 1)$. Therefore, the radius of the Fermi-sphere, $k_F$, at which it first makes contact with the Brillouin zone boundary is given by

$$ \begin{aligned} k_F &= \sqrt{3} \, \frac{\pi}{a} \quad \text{for f.c.c.} \\ k_F &= \sqrt{2} \, \frac{\pi}{a} \quad \text{for b.c.c.} \end{aligned} \tag{480} $$

When the Fermi-sphere first makes contact with the zone boundary, the occupied band is depressed by $V(\mathbf{Q})$, resulting in an energy lowering which stabilizes the structure. In the free electron approximation, the number of electrons per primitive unit cell, $n$, is given by

$$ n = 2 \, \frac{V}{N} \, \frac{1}{( 2 \pi )^3} \, \frac{4\pi}{3} \, k_F^3 \tag{481} $$

where

$$ \begin{aligned} \frac{V}{N} &= \frac{a^3}{4} \quad \text{for f.c.c.} \\ \frac{V}{N} &= \frac{a^3}{2} \quad \text{for b.c.c.} \end{aligned} \tag{482} $$

Thus, one finds that the critical number $n$ is given by $\frac{\sqrt{3} \, \pi}{4} = 1.36$ for the f.c.c. and $\frac{\sqrt{2} \, \pi}{3} = 1.48$ for the b.c.c. lattices.
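The two critical electron counts follow from substituting the contact radii of Eq. (480) and the cell volumes of Eq. (482) into Eq. (481). A minimal numerical sketch of that arithmetic is given below; the function name and the choice of measuring lengths in units of the cubic lattice constant $a$ are assumptions made only for this illustration.

```python
import math

def critical_electron_count(k_f_times_a, cell_volume_over_a3):
    """n = 2 (V/N) (4 pi / 3) k_F^3 / (2 pi)^3, with lengths in units of a."""
    sphere_volume = 4.0 * math.pi / 3.0 * k_f_times_a ** 3
    return 2.0 * cell_volume_over_a3 * sphere_volume / (2.0 * math.pi) ** 3

# f.c.c.: k_F a = sqrt(3) pi and V/N = a^3 / 4
n_fcc = critical_electron_count(math.sqrt(3.0) * math.pi, 0.25)
# b.c.c.: k_F a = sqrt(2) pi and V/N = a^3 / 2
n_bcc = critical_electron_count(math.sqrt(2.0) * math.pi, 0.5)
```

The two values agree with the closed forms $\sqrt{3} \, \pi / 4$ and $\sqrt{2} \, \pi / 3$ quoted above.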
8.1.9 The Geometric Structure Factor

The potential $V_{ions}(\mathbf{r})$ is a periodic function and can be defined in terms of the ionic potentials, $V_{atom}$, the lattice vectors $\mathbf{R}$, and the basis vectors $\mathbf{r}_j$, via

$$ V_{ions}(\mathbf{r}) = \sum_{\mathbf{R}} \sum_j V_{atom}(\mathbf{r} - \mathbf{R} - \mathbf{r}_j) \tag{483} $$
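The evaluation of the Fourier transform of the potential can be reduced to an evaluation of the Fourier transform in one unit cell of the lattice as

$$ \begin{aligned} V_{ions}(\mathbf{Q}) &= \frac{1}{V} \sum_{\mathbf{R},j} \int_V d^3 r \, \exp\left[ - i \, \mathbf{Q} \cdot \mathbf{r} \right] V_{atom}(\mathbf{r} - \mathbf{R} - \mathbf{r}_j) \\ &= \frac{1}{V} \sum_{\mathbf{R},j} \int_V d^3 r \, \exp\left[ - i \, \mathbf{Q} \cdot ( \mathbf{r} - \mathbf{R} ) \right] V_{atom}(\mathbf{r} - \mathbf{R} - \mathbf{r}_j) \\ &= \frac{1}{V} \sum_{\mathbf{R},j} \int_{V'} d^3 r' \, \exp\left[ - i \, \mathbf{Q} \cdot \mathbf{r}' \right] V_{atom}(\mathbf{r}' - \mathbf{r}_j) \end{aligned} \tag{484} $$

where we have used the Laue condition

$$ \exp\left[ \, i \, \mathbf{Q} \cdot \mathbf{R} \right] = 1 \tag{485} $$

and the transformation $\mathbf{r}' = \mathbf{r} - \mathbf{R}$. Furthermore, since the Bravais lattice vectors do not explicitly appear in the summand, the sum over $\mathbf{R}$ merely produces a factor of $N$

$$ \sum_{\mathbf{R}} \equiv N \tag{486} $$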
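one has

$$ \begin{aligned} V_{ions}(\mathbf{Q}) &= \frac{N}{V} \int_V d^3 r \, \exp\left[ - i \, \mathbf{Q} \cdot \mathbf{r} \right] \sum_j V_{atom}(\mathbf{r} - \mathbf{r}_j) \\ &= \frac{N}{V} \sum_j \exp\left[ + i \, \mathbf{Q} \cdot \mathbf{r}_j \right] \int_V d^3 r'' \, \exp\left[ - i \, \mathbf{Q} \cdot \mathbf{r}'' \right] V_{atom}(\mathbf{r}'') \\ &= \frac{N}{V} \, S(\mathbf{Q}) \, V_{atom}(\mathbf{Q}) \end{aligned} \tag{487} $$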
where $S(\mathbf{Q})$ is the geometric structure factor associated with the basis and the other factor is the Fourier transform of the ionic potential

$$ V_{atom}(\mathbf{Q}) = \int_V d^3 r \, \exp\left[ - i \, \mathbf{Q} \cdot \mathbf{r} \right] V_{atom}(\mathbf{r}) \tag{488} $$
Thus, when the geometric structure factor vanishes, the Fourier component of the lattice potential also vanishes and then the lowest order splitting at the Bragg plane also vanishes. An example of this is given by the hexagonal close-packed lattice. The unit cell of the reciprocal lattice of the (direct space) hexagonal close-packed lattice is a hexagonal prism. There are two hexagonal planes which have normals pointing along the positive and negative z axis. These are Bragg planes. The structure factor vanishes for all $\mathbf{q}$ values on the hexagonal top and bottom of the prism. The structure factor can be evaluated as

$$ S(\mathbf{Q}) = 1 + \exp\left[ \, i \, \pi \left( \frac{2}{3} m_1 + \frac{4}{3} m_2 + m_3 \right) \right] \tag{489} $$

which vanishes when $m_1 = m_2 = 0$ and $m_3 = \pm 1$, corresponding to $\mathbf{q}$ lying on the Bragg planes. The vanishing of the structure factor at these particular Bragg planes is a consequence of a glide symmetry. In fact, group theory shows that the splitting on these planes is rigorously zero in the absence of spin-orbit coupling (C. Herring, Phys. Rev. 52, 361 (1937)). Since the gaps vanish on some faces of the Brillouin zones, it is sometimes helpful to define a set of zones, the Jones zones, which are separated by planes in which gaps do occur.

The spin-orbit interaction can lead to the re-occurrence of small gaps in the bands (M.H. Cohen and L. Falicov, Phys. Rev. Letts. 5, 544 (1960)). The spin-orbit interaction is a relativistic effect, which appears as a low order correction to the non-relativistic limit of the Dirac equation. For a particle of charge $q$ in the presence of a scalar and vector potential $(\phi, \mathbf{A})$, this process yields the single particle Hamiltonian in the form

$$ \begin{aligned} \hat{H} = \; & m c^2 + \frac{1}{2m} \left[ \left( \mathbf{p} - \frac{q}{c} \mathbf{A} \right) \cdot \boldsymbol{\sigma} \right]^2 + q \, \phi \\ & - \frac{1}{8 m^3 c^2} \, p^4 + \frac{q \hbar}{4 m^2 c^2} \, \boldsymbol{\sigma} \cdot \left[ \nabla \phi \wedge \left( \mathbf{p} - \frac{q}{c} \mathbf{A} \right) \right] + \frac{q \hbar^2}{8 m^2 c^2} \, \nabla^2 \phi \end{aligned} \tag{490} $$

The first line, apart from the rest energy, coincides with the non-relativistic Pauli Hamiltonian

$$ \hat{H}_P = \frac{1}{2m} \left[ \left( \mathbf{p} - \frac{q}{c} \mathbf{A} \right) \cdot \boldsymbol{\sigma} \right]^2 + \sigma_0 \, q \, \phi \tag{491} $$

which, together with the identity

$$ ( \boldsymbol{\sigma} \cdot \mathbf{a} ) \, ( \boldsymbol{\sigma} \cdot \mathbf{b} ) = \sigma_0 \; \mathbf{a} \cdot \mathbf{b} + i \, \boldsymbol{\sigma} \cdot ( \mathbf{a} \wedge \mathbf{b} ) $$

leads to

$$ \hat{H}_P = \frac{1}{2m} \, \sigma_0 \left( - i \hbar \nabla - \frac{q}{c} \mathbf{A} \right)^2 + \sigma_0 \, q \, \phi \tag{492} $$

$$ \qquad - \frac{q \hbar}{2 m c} \, \boldsymbol{\sigma} \cdot \left( \nabla \wedge \mathbf{A} + \mathbf{A} \wedge \nabla \right) \tag{493} $$
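which, since

$$ \nabla \wedge \left( \mathbf{A} \, \Psi(\mathbf{r}) \right) = \Psi(\mathbf{r}) \left( \nabla \wedge \mathbf{A} \right) - \mathbf{A} \wedge \nabla \Psi(\mathbf{r}) \tag{494} $$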
and as $\mathbf{B} = \nabla \wedge \mathbf{A}$, yields the non-relativistic Pauli Hamiltonian including the anomalous Zeeman interaction

$$ \hat{H}_P = \frac{1}{2m} \, \sigma_0 \left( \mathbf{p} - \frac{q}{c} \mathbf{A} \right)^2 - \frac{q \hbar}{2 m c} \, \boldsymbol{\sigma} \cdot \mathbf{B} + \sigma_0 \, q \, \phi \tag{495} $$

Thus, all the terms in the first line of equation (490) are found in the non-relativistic theory whereas the terms in the second line represent interactions, $\hat{H}_{rel}$, which have a relativistic origin. The relativistic terms are given by

$$ \hat{H}_{rel} = - \frac{1}{8 m^3 c^2} \, p^4 + \frac{q \hbar}{4 m^2 c^2} \, \boldsymbol{\sigma} \cdot \left[ \nabla \phi \wedge \left( \mathbf{p} - \frac{q}{c} \mathbf{A} \right) \right] + \frac{q \hbar^2}{8 m^2 c^2} \, \nabla^2 \phi \tag{496} $$

The first term, which is proportional to $p^4$, represents a relativistic correction to the kinetic energy. The next term is the spin-orbit interaction which can be interpreted as being caused by the interaction of the spin with the magnetic field produced by the electron's own orbital motion. The last term is the Darwin term, which is often discussed as an interaction with a classical electron of finite spatial extent. Thus, the spin-orbit interaction for an electron is truly a relativistic effect and, unlike the other relativistic corrections, is not very symmetric. It is given by the pseudo-scalar interaction

$$ - \frac{q \hbar}{4 m c^2} \, \boldsymbol{\sigma} \cdot ( \mathbf{v} \wedge \mathbf{E} ) \tag{497} $$

Due to its reduced symmetry, the spin-orbit interaction lifts the degeneracy of the bands at high symmetry points in $\mathbf{k}$ space (R.J. Elliott, Phys. Rev. 96, 280 (1954)), such as those on the hexagonal faces of the h.c.p. Brillouin zone.

——————————————————————————————————
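Returning to the geometric structure factor of the hexagonal close-packed lattice, Eq. (489), its zeros on the hexagonal faces can be confirmed with a few lines of arithmetic. The sketch below is illustrative only. It assumes the form $S = 1 + \exp[ i \pi ( \frac{2}{3} m_1 + \frac{4}{3} m_2 + m_3 ) ]$, with the coefficient $\frac{2}{3}$ attached to $m_1$ and $\frac{4}{3}$ to $m_2$; interchanging the two coefficients would not affect the zeros at $m_1 = m_2 = 0$.

```python
import cmath

def hcp_structure_factor(m1, m2, m3):
    """S(Q) for Q = m1 b1 + m2 b2 + m3 b3, in the assumed form of Eq. (489)."""
    phase = cmath.pi * (2.0 * m1 / 3.0 + 4.0 * m2 / 3.0 + m3)
    return 1.0 + cmath.exp(1j * phase)

s_top = hcp_structure_factor(0, 0, 1)      # hexagonal face, m3 = +1
s_bottom = hcp_structure_factor(0, 0, -1)  # hexagonal face, m3 = -1
s_second = hcp_structure_factor(0, 0, 2)   # next plane along the c axis
s_side = hcp_structure_factor(1, 0, 0)     # an in-plane reciprocal vector
```

The structure factor vanishes (to rounding error) for $m_3 = \pm 1$, equals 2 for $m_3 = 2$, and has unit modulus for the in-plane vector, so only the gaps on the hexagonal faces are removed in lowest order.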
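8.1.10 Exercise 29

The effect of the Bragg planes on the density of states can be calculated from the nearly free electron model. For simplicity, consider the effect of one Bragg plane. The Bloch wave vector $\mathbf{k}$ is resolved into components parallel, $\mathbf{k}_\parallel$, and perpendicular, $\mathbf{k}_\perp$, to the reciprocal lattice vector $\mathbf{Q}$

$$ \mathbf{k} = \mathbf{k}_\perp + \mathbf{k}_\parallel \tag{498} $$

The energy of the two bands can be written as

$$ E_{\mathbf{k},\pm} = \frac{\hbar^2}{2m} \, k_\perp^2 + \Delta E_\pm(k_\parallel) \tag{499} $$

where

$$ \Delta E_\pm(k_\parallel) = \frac{\hbar^2}{2m} \left[ k_\parallel^2 + \frac{1}{2} \left( Q^2 - 2 k_\parallel Q \right) \right] \pm \left[ \left( \frac{\hbar^2}{4m} \right)^2 \left( Q^2 - 2 k_\parallel Q \right)^2 + | V(\mathbf{Q}) |^2 \right]^{\frac{1}{2}} \tag{500} $$

describes the splitting of the two bands. (Note that the band energies are not periodic in $k_\parallel$. This is a consequence of our artificial assumption that there is only one Bragg plane.) For each band, the density of states per spin is

$$ \rho_\pm(E) = \frac{V}{( 2 \pi )^3} \int d^3 k \; \delta( E - E_{\mathbf{k},\pm} ) \tag{501} $$

Show that the density of states is given by

$$ \rho_\pm(E) = \frac{V}{4 \pi^2} \, \frac{m}{\hbar^2} \left( k_{\max \parallel}(E) - k_{\min \parallel}(E) \right) \tag{502} $$

where $E = \Delta E_\pm(k_{m \parallel})$ defines the maximum and minimum value of $k_\parallel$. Show that, if the constant energy surface cuts the zone, i.e.,

$$ E^0_{Q/2} - | V(\mathbf{Q}) | \le E \le E^0_{Q/2} + | V(\mathbf{Q}) | \tag{503} $$

then for the lower band one has

$$ k_{\max \parallel}(E) = \frac{Q}{2} \tag{504} $$

and

$$ k_{\min \parallel}(E) = - \sqrt{\frac{2 m E}{\hbar^2}} + O( | V(\mathbf{Q}) |^2 ) \tag{505} $$

for $E > 0$. Show that

$$ \rho_+(E) = \frac{V}{4 \pi^2} \, \frac{m}{\hbar^2} \left( k_{\max \parallel}(E) - \frac{Q}{2} \right) \quad \text{for} \; E \ge E^0_{Q/2} + | V(\mathbf{Q}) | \tag{506} $$

Show that the energy derivative of the density of states, $\frac{\partial \rho}{\partial E}$, is singular at the energies

$$ E = E^0_{Q/2} \pm | V(\mathbf{Q}) | \tag{507} $$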
——————————————————————————————————
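8.1.11 Exercise 30

Consider the point W on the Brillouin zone boundary of an f.c.c. crystal. Three Bragg planes meet at W. The $\mathbf{k}$ value at W is

$$ \mathbf{k}_W = \frac{2\pi}{a} \left( 1, \frac{1}{2}, 0 \right) \tag{508} $$

The three planes are the $(2, 0, 0)$, $(1, 1, 1)$ and $(1, 1, \bar{1})$ planes. The four free electron energies are

$$ \begin{aligned} E_1^0 &= \frac{\hbar^2}{2m} \, \mathbf{k}^2 \\ E_2^0 &= \frac{\hbar^2}{2m} \left( \mathbf{k} - \frac{2\pi}{a} (1, 1, 1) \right)^2 \\ E_3^0 &= \frac{\hbar^2}{2m} \left( \mathbf{k} - \frac{2\pi}{a} (1, 1, \bar{1}) \right)^2 \\ E_4^0 &= \frac{\hbar^2}{2m} \left( \mathbf{k} - \frac{2\pi}{a} (2, 0, 0) \right)^2 \end{aligned} \tag{509} $$

These four energies are degenerate at W and are equal to $E_W^0 = \frac{\hbar^2}{2m} \, \mathbf{k}_W^2$.

Show that near W, the first order energies are given by the solutions of

$$ \begin{vmatrix} E_1^0 - E & V_1 & V_1 & V_2 \\ V_1 & E_2^0 - E & V_2 & V_1 \\ V_1 & V_2 & E_3^0 - E & V_1 \\ V_2 & V_1 & V_1 & E_4^0 - E \end{vmatrix} = 0 $$

where $V_2 = V(2, 0, 0)$ and $V_1 = V(1, 1, 1) = V(1, 1, \bar{1})$, and that at W the roots are

$$ \begin{aligned} E &= E_W^0 - V_2 \qquad &\text{doubly degenerate} \\ E &= E_W^0 + V_2 \pm 2 V_1 \qquad &\text{singly degenerate} \end{aligned} \tag{510} $$

Two Bragg planes meet at the point U, which corresponds to the $\mathbf{k}$ value

$$ \mathbf{k}_U = \frac{2\pi}{a} \left( 1, \frac{1}{4}, \frac{1}{4} \right) \tag{511} $$

Show that at the U point the band energies are given by

$$ \begin{aligned} E &= E_U^0 - V_2 \\ E &= E_U^0 + \frac{1}{2} V_2 \pm \frac{1}{2} \sqrt{ V_2^2 + 8 V_1^2 } \end{aligned} \tag{512} $$

where

$$ E_U^0 = \frac{\hbar^2}{2m} \, \mathbf{k}_U^2 \tag{513} $$

is the free electron energy at point U.

——————————————————————————————————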
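The roots quoted in Eqs. (510) and (512) can be checked numerically before attempting the algebra. The sketch below is a sanity check under stated assumptions, not a solution of the exercise. At W it diagonalizes the $4 \times 4$ matrix of the secular determinant with all four free electron energies set equal to $E_W^0$. At U it assumes that the three degenerate plane waves are those with wave vectors $\mathbf{k}$, $\mathbf{k} - \frac{2\pi}{a}(1, 1, 1)$ and $\mathbf{k} - \frac{2\pi}{a}(2, 0, 0)$. The numerical values of $V_1$ and $V_2$ are arbitrary, hypothetical choices.

```python
import numpy as np

def levels_at_W(E_W, V1, V2):
    """Eigenvalues of the four-wave secular matrix at W (ascending order)."""
    H = np.array([[E_W, V1, V1, V2],
                  [V1, E_W, V2, V1],
                  [V1, V2, E_W, V1],
                  [V2, V1, V1, E_W]], dtype=float)
    return np.linalg.eigvalsh(H)

def levels_at_U(E_U, V1, V2):
    """Eigenvalues of the assumed three-wave secular matrix at U."""
    H = np.array([[E_U, V1, V2],
                  [V1, E_U, V1],
                  [V2, V1, E_U]], dtype=float)
    return np.linalg.eigvalsh(H)

E0, V1, V2 = 0.0, 0.3, 0.5  # hypothetical values, arbitrary energy units

w_levels = levels_at_W(E0, V1, V2)
w_expected = np.sort([E0 - V2, E0 - V2, E0 + V2 + 2 * V1, E0 + V2 - 2 * V1])

root = np.sqrt(V2 ** 2 + 8 * V1 ** 2)
u_levels = levels_at_U(E0, V1, V2)
u_expected = np.sort([E0 - V2, E0 + 0.5 * V2 + 0.5 * root, E0 + 0.5 * V2 - 0.5 * root])
```

In both cases the numerical eigenvalues coincide with the closed-form roots, including the double degeneracy of $E_W^0 - V_2$ at W.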
8.1.12 Exercise 31

Consider a nearly free electron band structure near a Bragg plane. Let

$$ \mathbf{k} = \frac{\mathbf{Q}}{2} + \mathbf{q} \tag{514} $$

and resolve $\mathbf{q}$ into the components $\mathbf{q}_\parallel$ and $\mathbf{q}_\perp$ parallel and perpendicular to the vector $\frac{\mathbf{Q}}{2}$ that defines the Bragg plane. Then, the energy bands are given by

$$ E = E^0_{Q/2} + \frac{\hbar^2}{2m} \, q^2 \pm \left[ 4 \, E^0_{Q/2} \, \frac{\hbar^2}{2m} \, q_\parallel^2 + | V(\mathbf{Q}) |^2 \right]^{\frac{1}{2}} \tag{515} $$

It is convenient to express the Fermi-energy $\mu$ in terms of the energy of the lower band at the Bragg plane

$$ \mu = E^0_{Q/2} - | V(\mathbf{Q}) | + \Delta \tag{516} $$

Show that when $2 \, | V(\mathbf{Q}) | > \Delta > 0$, then the Fermi-surface is only composed of states in the lower Bloch band. Furthermore, show that the Fermi-surface intersects the Bragg plane in a circle of radius $\rho$ where

$$ \rho = \sqrt{\frac{2 m \Delta}{\hbar^2}} \tag{517} $$

Show that, if $\Delta > 2 \, | V(\mathbf{Q}) |$, the Fermi-surface cuts the Bragg plane in two circles of radius $\rho_1$ and $\rho_2$ such that the area between them is

$$ \pi \left( \rho_1^2 - \rho_2^2 \right) = \frac{4 \pi m}{\hbar^2} \, | V(\mathbf{Q}) | \tag{518} $$

This area is measurable through de Haas-van Alphen experiments.

——————————————————————————————————
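8.1.13 Exercise 32

In a weak periodic potential the Bloch states in the vicinity of a Bragg plane can be approximated in terms of two plane waves. Let $\mathbf{k}$ be a wave vector with polar coordinates $(\theta, \varphi)$ in which the z axis is taken to be the direction of the reciprocal lattice vector $\mathbf{Q}$ that defines the Bragg plane.

(i) If $E < \frac{\hbar^2}{2m} \left( \frac{Q}{2} \right)^2$, show that to order $V(\mathbf{Q})^2$ the surface of energy $E$ is given by

$$ k(\theta, \varphi) = \sqrt{\frac{2 m E}{\hbar^2}} \; \left( 1 + \delta(\theta) \right) \tag{519} $$

where

$$ \delta(\theta) = \frac{ m \, | V(\mathbf{Q}) |^2 / E }{ \hbar^2 Q^2 - 2 \hbar Q \cos\theta \, \sqrt{2 m E} } \tag{520} $$

(ii) Show that, to order $| V(\mathbf{Q}) |^2$, the potential results in a shift of the Fermi-energy given by

$$ \Delta \mu = \mu - \mu_0 \tag{521} $$

where

$$ \Delta \mu = - \frac{1}{8} \, \frac{| V(\mathbf{Q}) |^2}{\mu_0} \, \frac{2 k_F}{Q} \, \ln \left( \frac{Q + 2 k_F}{Q - 2 k_F} \right) \tag{522} $$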
——————————————————————————————————
8.1.14 Exercise 33

Consider an energy $E$ which lies within the gap between the upper and lower bands at point $\mathbf{k}$ on the Bragg plane which is defined by the reciprocal lattice vector $\mathbf{Q}$. Let

$$ \mathbf{k} = \frac{\mathbf{Q}}{2} + \mathbf{q} \tag{523} $$

(i) Find an expression for the imaginary part of $\mathbf{k}$ for $E$ within the gap.

(ii) Show that for $E$ at the center of the gap, the imaginary part of $\mathbf{k}$ satisfies

$$ \left( \mathrm{Im} \, k \right)^2 = - \frac{Q^2}{2} \pm \sqrt{ \left( \frac{Q^2}{2} \right)^2 + \left| \frac{2m}{\hbar^2} \, V(\mathbf{Q}) \right|^2 } \tag{524} $$

Thus, on solving for $\mathbf{k}$ given $E$, there is a range of $\mathrm{Im} \, k$ when $\mathrm{Re} \, k = \frac{Q}{2}$.

Complex wave vectors are important for the theory of Zener tunnelling between two bands, caused by strong electric fields. Complex wave vectors also occur in the description of states that are localized near surfaces.

——————————————————————————————————
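The mid-gap result of Eq. (524) can be tested numerically. The sketch below is an illustration under stated assumptions: it works in one dimension along $\mathbf{Q}$, uses units in which $\hbar^2 / 2m = 1$, takes $V(\mathbf{Q})$ to be real, and uses the upper sign of Eq. (524). The values of $Q$ and $V$ are hypothetical. With $k = Q/2 + i \kappa$ inserted into the two plane wave secular matrix, one eigenvalue should be real and equal to the mid-gap energy $E^0_{Q/2}$.

```python
import numpy as np

def kappa_mid_gap(Q, V):
    """Im k at the centre of the gap from Eq. (524), with hbar^2 / 2m = 1."""
    kappa_squared = -Q ** 2 / 2.0 + np.sqrt((Q ** 2 / 2.0) ** 2 + abs(V) ** 2)
    return np.sqrt(kappa_squared)

def two_band_energies(k, Q, V):
    """Eigenvalues of the two plane wave secular matrix for complex k."""
    H = np.array([[k ** 2, V],
                  [V, (k - Q) ** 2]], dtype=complex)
    return np.linalg.eigvals(H)

Q, V = 2.0 * np.pi, 0.5           # hypothetical values
E_mid = (Q / 2.0) ** 2            # E^0_{Q/2} in these units
kappa = kappa_mid_gap(Q, V)
energies = two_band_energies(Q / 2.0 + 1j * kappa, Q, V)
closest = energies[np.argmin(np.abs(energies - E_mid))]
```

In this sketch `closest` agrees with `E_mid` to rounding error, and its imaginary part vanishes, so a decaying solution with $\mathrm{Re} \, k = Q/2$ does exist at the center of the gap.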
8.2 The Pseudo-Potential Method

The failure of the nearly free electron model is primarily due to the large values of the potentials, $V(\mathbf{Q})$, calculated from first principles, and the small values of the experimentally observed splittings between the bands. Due to the large value of the lattice potential, if the wave functions are expanded in terms of plane waves, very many plane waves (of the order of $10^6$) are needed to obtain convergence. Furthermore, band structure calculations with the exact lattice potential are expected to reproduce the entire set of wave functions ranging from the core wave functions located within the ions, up to the valence and/or conduction wave functions. Since the core electrons are very localized and almost atomic, a large number of plane waves are needed for an accurate calculation of the core wave functions. Large numbers of plane waves are also needed to calculate the valence band wave functions. The need for a large number of Fourier components to calculate the valence band wave functions can be understood from the fact that the conduction or valence band states have to be orthogonal to the wave functions of the core electrons. Thus, the conduction electrons should have wave functions that exhibit rapid oscillations in the vicinity of the ion cores.

Historically, there have been many methods which were used to avoid the need to use many plane waves. The methods used include orthogonalized plane waves, augmented plane waves and pseudo-potentials. All these methods have some common features, namely that they produce wave functions that require fewer plane wave components in the expansion and, thereby, increase the rate of convergence, and concomitantly diminish the effect of the ionic potential. The pseudo-potential method provides a first principles way of explicitly finding a smaller effective potential.

The electrons in the valence band move in a periodic potential $V_{ions}(\mathbf{r})$ provided by the ions. The ionic potential already includes a partial screening of the nuclear potential by the ion core electrons.
The valence band Bloch functions $\phi^v_{\mathbf{k},n}(\mathbf{r})$ undergo many oscillations in the region of the core as they must be orthogonal to the core electron wave functions $\phi^c_{\mathbf{k},\alpha}(\mathbf{r})$. In the Dirac notation, the orthogonality condition is expressed as

$$ \langle \phi^v_{\mathbf{k},n} | \phi^c_{\mathbf{k},\alpha} \rangle = 0 \tag{525} $$

The valence band Bloch function can be expressed in terms of a smooth function

$$ \psi^v_{\mathbf{k},n}(\mathbf{r}) \tag{526} $$

that does not contain the oscillations that orthogonalize the Bloch state, $| \phi^v_{\mathbf{k},n} \rangle$, with the core states. The smooth function is known as the pseudo-wave function. The pseudo-wave function is related to the valence band Bloch function by the definition

$$ | \phi^v_{\mathbf{k},n} \rangle = | \psi^v_{\mathbf{k},n} \rangle - \sum_\alpha | \phi^c_{\mathbf{k},\alpha} \rangle \langle \phi^c_{\mathbf{k},\alpha} | \psi^v_{\mathbf{k},n} \rangle \tag{527} $$
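This definition automatically ensures the orthogonality of the core states with the valence band states without placing any restriction on the form of the pseudo-wave function. The basic idea behind pseudo-potential theory is that the smooth pseudo-wave function represents the electronic wave function in the region between the cores, and may be expressed in terms of only a few plane wave components (J.C. Phillips and L. Kleinman, Phys. Rev. 116, 287 (1959)). Since the Bloch state, $| \phi^v_{\mathbf{k},n} \rangle$, satisfies the one-particle Schrödinger equation

$$ \hat{H} \, | \phi^v_{\mathbf{k},n} \rangle = E^v_{\mathbf{k},n} \, | \phi^v_{\mathbf{k},n} \rangle \tag{528} $$

one finds that the smooth function satisfies

$$ \hat{H} \, | \psi^v_{\mathbf{k},n} \rangle - \sum_\alpha E^c_{\mathbf{k},\alpha} \, | \phi^c_{\mathbf{k},\alpha} \rangle \langle \phi^c_{\mathbf{k},\alpha} | \psi^v_{\mathbf{k},n} \rangle = E^v_{\mathbf{k},n} \left( | \psi^v_{\mathbf{k},n} \rangle - \sum_\alpha | \phi^c_{\mathbf{k},\alpha} \rangle \langle \phi^c_{\mathbf{k},\alpha} | \psi^v_{\mathbf{k},n} \rangle \right) \tag{529} $$

This equation can be re-arranged to yield an eigenvalue equation for the (unknown) smooth function, which has the same energy eigenvalues as the exact eigenfunction. The rearranged equation has the form

$$ \left( \hat{H} + \hat{V}(E^v_{\mathbf{k},n}) \right) | \psi^v_{\mathbf{k},n} \rangle = E^v_{\mathbf{k},n} \, | \psi^v_{\mathbf{k},n} \rangle \tag{530} $$

where

$$ \hat{V}(E^v_{\mathbf{k},n}) = \sum_\alpha \left( E^v_{\mathbf{k},n} - E^c_{\mathbf{k},\alpha} \right) | \phi^c_{\mathbf{k},\alpha} \rangle \langle \phi^c_{\mathbf{k},\alpha} | \tag{531} $$

is a non-local and energy dependent contribution to the potential. The important point is that this potential may be regarded as being positive and, therefore, counteracts the effect of the large negative potential due to the ions. This can be seen by taking the expectation value of the energy dependent potential in an arbitrary state $| \Psi \rangle$

$$ \langle \Psi | \, \hat{V}(E^v_{\mathbf{k},n}) \, | \Psi \rangle = \sum_\alpha \left( E^v_{\mathbf{k},n} - E^c_{\mathbf{k},\alpha} \right) | \langle \Psi | \phi^c_{\mathbf{k},\alpha} \rangle |^2 \tag{532} $$

and as the valence electrons have a higher energy than the core electrons, $E^v_{\mathbf{k},n} > E^c_{\mathbf{k},\alpha}$, one finds

$$ \langle \Psi | \, \hat{V}(E^v_{\mathbf{k},n}) \, | \Psi \rangle \ge 0 \tag{533} $$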
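Thus, the potential operator is effectively positive as it increases the expectation value of the energy for an arbitrary state. The operator $\hat{V}$ is non-local. This can be seen by considering the action of $\hat{V}$ on an arbitrary wave function $\Psi(\mathbf{r})$. The operator has the effect of transforming the state through

$$ \hat{V}(E^v_{\mathbf{k},n}) \, \Psi(\mathbf{r}) = \sum_\alpha \left( E^v_{\mathbf{k},n} - E^c_{\mathbf{k},\alpha} \right) \int_V d^3 r' \; \phi^{c \, *}_{\mathbf{k},\alpha}(\mathbf{r}') \, \Psi(\mathbf{r}') \; \phi^c_{\mathbf{k},\alpha}(\mathbf{r}) \tag{534} $$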
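Thus, the result of the operator acting on the wave function at position $\mathbf{r}$ depends on the values of the wave function at all other positions $\mathbf{r}'$. If the original one-particle Schrödinger equation for $\phi^v_{\mathbf{k},n}(\mathbf{r})$ has the form

$$ \left[ - \frac{\hbar^2}{2m} \nabla^2 + V_{ions}(\mathbf{r}) \right] \phi^v_{\mathbf{k},n}(\mathbf{r}) = E^v_{\mathbf{k},n} \, \phi^v_{\mathbf{k},n}(\mathbf{r}) \tag{535} $$

then the Schrödinger equation for the smooth function $\psi^v_{\mathbf{k},n}(\mathbf{r})$ has the form

$$ \left[ - \frac{\hbar^2}{2m} \nabla^2 + V_{ions}(\mathbf{r}) + \hat{V}(E^v_{\mathbf{k},n}) \right] \psi^v_{\mathbf{k},n}(\mathbf{r}) = E^v_{\mathbf{k},n} \, \psi^v_{\mathbf{k},n}(\mathbf{r}) \tag{536} $$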
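The statement that the pseudo-wave function obeys an eigenvalue equation with the valence energy, whatever core admixture it contains, can be illustrated on a small matrix model. The sketch below is purely illustrative and is not a band structure calculation: the Hamiltonian is a hypothetical random real symmetric matrix, its three lowest eigenstates are taken to play the role of the core states, and the next eigenstate plays the role of the valence state.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_core = 8, 3

# Hypothetical model Hamiltonian: a random real symmetric matrix.
M = rng.normal(size=(n_states, n_states))
H = (M + M.T) / 2.0
energies, states = np.linalg.eigh(H)   # eigenvalues in ascending order

core = states[:, :n_core]              # columns stand in for the core states
E_core = energies[:n_core]
E_v = energies[n_core]                 # lowest "valence" energy
phi_v = states[:, n_core]              # corresponding "valence" state

# Energy dependent operator: sum_alpha (E_v - E_alpha) |core><core|
V_hat = core @ np.diag(E_v - E_core) @ core.T

# Pseudo-wave function: valence state plus an arbitrary core admixture,
# so that removing its core components recovers phi_v.
psi = phi_v + core @ rng.normal(size=n_core)

residual = (H + V_hat) @ psi - E_v * psi
smallest_eigenvalue = np.linalg.eigvalsh(V_hat).min()
```

In this model `residual` vanishes to rounding error for any choice of the core admixture, and the added operator has no negative eigenvalues, in line with the positivity argument given above.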
The Schrödinger equation for the smooth wave function has exactly the same energy eigenvalues as the original equation. The pseudo-potential is defined as

$$ \hat{V}_{pseudo} = V_{ions}(\mathbf{r}) + \hat{V}(E^v_{\mathbf{k},n}) \tag{537} $$

and, as has been shown, the effect of the pseudo-potential is much weaker than that of $V_{ions}(\mathbf{r})$. Also, as the eigenstate $\psi^v_{\mathbf{k},n}(\mathbf{r})$ is a smooth function, it can be expanded in terms of a few plane waves

$$ \psi^v_{\mathbf{k},n}(\mathbf{r}) = \sum_{\mathbf{Q}} C_{\mathbf{k} - \mathbf{Q}} \, \exp\left[ \, i \, ( \mathbf{k} - \mathbf{Q} ) \cdot \mathbf{r} \right] \tag{538} $$
Thus, the pseudo-potential may be treated as a weak perturbation and gives results very similar to those of the nearly free electron model. There are many different forms that the pseudo-potential can take (B.J. Austin, V. Heine and L.J. Sham, Phys. Rev. 127, 276 (1962)). The non-local pseudo-potential can be approximated by a local potential and, as its energy dependence is weak, $E^v$ can be set to zero in the pseudo-potential. In this approximation, the pseudo-potential is almost zero within the core. This is a result of the so-called cancellation theorem (M. Cohen and V. Heine, Phys. Rev. 122, 1821 (1961)). The cancellation theorem can be found from classical considerations. Classically, the gain in kinetic energy of a conduction electron as it enters the core region is equal to the potential energy. As the oscillations in $\phi^c_{k,\alpha}(r)$ give rise to the kinetic energy of the electron in the core region, one expects the pseudo-potential to cancel in the core region. Therefore, the pseudo-potential follows the ionic core potential for distances larger than the ionic core radius $R_c$, at which point the attractive potential almost shuts off. The empty core approximation to the atomic pseudo-potential (N.W. Ashcroft, Phys. Letts. 23, 48 (1966)) is given by
$$ V_{pseudo}(r) = \begin{cases} - \dfrac{Z e^2}{r} & \text{for } r > R_c \\ 0 & \text{for } r < R_c \end{cases} \tag{539} $$
Basically, this is a reflection of the fact that the valence electrons do not probe the region of the cores as this region is already occupied by the core electrons and the Pauli exclusion principle forbids the overlap of states. The Fourier transform of the local pseudo-potential is a smooth function of the wave vector $q$,
$$ V_{pseudo}(q) = - \frac{4 \pi Z e^2}{q^2} \cos q R_c \tag{540} $$
Only the values of $V_{pseudo}(q)$ at the reciprocal lattice vectors $Q$ are physically important and most of these are small. When one includes the effect of the screening electron clouds, the pseudo-potential is replaced by the screened pseudo-potential
$$ V_{pseudo}(r) = \begin{cases} - \dfrac{Z e^2}{r} \exp\left[ - k_{TF} \, r \right] & \text{for } r > R_c \\ 0 & \text{for } r < R_c \end{cases} \tag{541} $$
The Fourier transform of the screened pseudo-potential is given by
$$ V_{pseudo}(q) = - \frac{4 \pi Z e^2}{q^2 + k_{TF}^2} \cos q R_c \tag{542} $$
which is weakened with respect to the original potential.
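The two Fourier-transformed expressions quoted above are easy to tabulate. The following minimal sketch evaluates them as written in the text; the function name `v_empty_core`, the choice of units with $e^2 = 1$, and the sample values of $Z$, $R_c$ and $k_{TF}$ are assumptions made purely for illustration.

```python
import math

def v_empty_core(q, Z, Rc, e2=1.0, k_tf=0.0):
    """Empty-core pseudo-potential in q space.

    k_tf = 0 gives the unscreened form; k_tf > 0 gives the screened form.
    Units are arbitrary (e2 = 1 is an illustrative choice).
    """
    return -4.0 * math.pi * Z * e2 * math.cos(q * Rc) / (q * q + k_tf * k_tf)

Z, Rc, k_tf = 1, 1.5, 1.0           # hypothetical sample parameters
q_node = math.pi / (2.0 * Rc)       # first zero of cos(q Rc)

bare = v_empty_core(0.5, Z, Rc)
screened = v_empty_core(0.5, Z, Rc, k_tf=k_tf)
at_node = v_empty_core(q_node, Z, Rc)
```

The potential changes sign at `q_node`, so reciprocal lattice vectors that happen to lie near a node contribute only weakly, and screening removes the $1/q^2$ growth at small $q$.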
8.2.1
The Scattering Approach
The pseudo-potential is a potential that gives the same eigenvalues as $V_{ions}(r)$, for the valence electron states. The pseudo-potential may be obtained from scattering theory. Consider a single ionic scattering center with a spherically symmetric potential $V(r)$ which is zero for $r > R$. Then for $r > R$, the radial wave function has the asymptotic form
$$ R_l(r, E) = C_l \left[ j_l(kr) - \tan \delta_l \; \eta_l(kr) \right] \tag{543} $$
where
$$ E = \frac{\hbar^2 k^2}{2m} \tag{544} $$
and $j_l(x)$ and $\eta_l(x)$ are the spherical Bessel and Neumann functions. The coefficients $C_l$ and the phase shifts $\delta_l(E)$ are obtained by matching the asymptotic form to the solution at some large distance $r = R$. The exact logarithmic derivative of $R_l(r, E)$ at $r = R$ can be defined as
$$ L_l(E) = \frac{R_l'(R, E)}{R_l(R, E)} \tag{545} $$
The matching condition of the logarithmic derivative of the asymptotic form with the logarithmic derivative of the wave function at $r = R$ leads to the equation
$$ \tan \delta_l(E) = \frac{ j_l(kR) \, L_l(E) - k \, j_l'(kR) }{ \eta_l(kR) \, L_l(E) - k \, \eta_l'(kR) } \tag{546} $$
The phase shifts $\delta_l(E)$ determine the scattering amplitude $f(\theta, E)$ for a particle of energy $E$ to be scattered through an angle $\theta$. Partial wave analysis yields the relation
$$ f(\theta, E) = \frac{1}{2 i k} \sum_l ( 2 l + 1 ) \left[ \exp\left( 2 i \delta_l \right) - 1 \right] P_l(\cos \theta) \tag{547} $$
The scattering amplitude only depends on the phase shift modulo $\pi$. The phase shift can always be restricted to the range $- \frac{\pi}{2}$ to $+ \frac{\pi}{2}$ by defining
$$ \delta_l = n_l \, \pi + \Delta_l \tag{548} $$
where $n_l$ is an integer chosen such that
$$ | \Delta_l | < \frac{\pi}{2} \tag{549} $$
The value of $n_l$ denotes the number of the oscillations in the radial wave function $R_l(r, E)$. The (truncated) phase shifts $\Delta_l$ produce the same scattering amplitude as the original phase shift $\delta_l(E)$. The atomic pseudo-potential is defined as any potential in which the complete phase shifts are the truncated phase shifts $\Delta_l$ and, thus, gives rise to the same scattering amplitude, but does not produce any bound states (according to Levinson's theorem). The pseudo-radial wave functions $\tilde{R}_l(r, E)$ have no nodes and, thus, have no rapid oscillations. Therefore, the pseudo-radial wave function can be represented in terms of a finite superposition of plane waves of long wave length. The pseudo-potential actually only depends on the function $L_l(E)$. From the knowledge of the logarithmic derivative, $L_l(E)$, one can construct the pseudo-potential. One method has been proposed by Ziman and Lloyd.
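The truncation of the phase shift can be illustrated with a few lines of code. This is a minimal sketch: the helper name `truncate_phase_shift` and the sample value of the phase shift are hypothetical, and the boundary case $|\Delta_l| = \pi/2$ is not treated specially.

```python
import cmath
import math

def truncate_phase_shift(delta):
    """Split delta into n*pi + Delta with |Delta| <= pi/2 (n is the nearest integer)."""
    n = round(delta / math.pi)
    return n, delta - n * math.pi

delta = 7.0                                   # hypothetical full phase shift (radians)
n, Delta = truncate_phase_shift(delta)

# The partial-wave factor exp(2 i delta) - 1 is unchanged by the truncation.
factor_full = cmath.exp(2j * delta) - 1.0
factor_truncated = cmath.exp(2j * Delta) - 1.0
```

Since only `exp(2j * delta)` enters the partial-wave sum, the full and truncated phase shifts give identical scattering amplitudes.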
8.2.2
The Ziman-Lloyd Pseudo-potential
Ziman and Lloyd independently proposed a pseudo-potential which is local in $r$ and is zero everywhere except on the surface of a shell of radius $R$. The potential operator, $\hat{V}^{ZL}$, is written as
$$ \hat{V}^{ZL} = \sum_l B_l(E) \, \delta( r - R ) \, \hat{P}_l \tag{550} $$
where $\hat{P}_l$ projects onto the states with angular momentum $l$ (J.M. Ziman, Proc. Phys. Soc. (London) 86, 337 (1965), P. Lloyd, Proc. Phys. Soc. (London), 86, 825 (1965)). Inside the sphere the potential is zero and so the radial wave function is just proportional to $j_l(kr)$, since the Neumann function is excluded due to the boundary condition at $r = 0$. The amplitude $B_l(E)$ is chosen so as to give the proper asymptotic properties of the wave function of the true potential $V$, for $r > R$. The pseudo-radial wave functions satisfy the radial Schrodinger equation, given by
$$ - \frac{\hbar^2}{2m} \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial \tilde{R}_l}{\partial r} \right) + \left[ \frac{\hbar^2 \, l ( l + 1 )}{2 m r^2} + V^{ZL}(r) \right] \tilde{R}_l(r) = E \, \tilde{R}_l(r) \tag{551} $$
The derivative of the pseudo-radial wave function is found by integrating the radial Schrodinger equation over the shell at $r = R$
$$ - \frac{\hbar^2}{2m} \left[ \frac{\partial}{\partial r} \tilde{R}_l(r) \right]_{R^-}^{R^+} + B_l(E) \, \tilde{R}_l(R) = 0 \tag{552} $$
The pseudo-wave function is matched with the true wave function at the radius $r = R^+$. The matching condition determines the function $B_l(E)$ in the pseudo-potential in terms of the logarithmic derivative of the true wave function, $L_l(E)$. Thus, the coefficient $B_l(E)$ is related to $L_l(E)$ via
$$ L_l(E) - k \, \frac{j_l'(kR)}{j_l(kR)} = \frac{2m}{\hbar^2} \, B_l(E) \tag{553} $$
Therefore, the $B_l(E)$, for different $l$, are determined in terms of the exact value of the logarithmic derivatives. The projection operator is simply given as
$$ \hat{P}_l = \sum_m | l, m \rangle \langle l, m | \tag{554} $$
which also gives rise to the non-locality of the pseudo-potential operator. The pseudo-potential for the solid can be constructed as a superposition of the pseudo-potentials of the ions. It should be noted that the pseudo-potential only cancels for states of angular momentum l if there are core states with angular momentum l; otherwise, the electrons experience the full potential. Thus, in C the 2s electrons experience the cancelled pseudo-potential but the 2p electrons interact with the full potential. The 2p electrons are relatively tightly bound compared with the 2s. Thus, the s → p promotion energy is lower than in the other group IV elements Si, Ge, Sn and Pb. This allows C to easily form the tetrahedrally directed sp3 valence bonds. Similarly, in the 3d transition metals, the 3d electrons are tightly bound compared with the 4d or 5d electrons in the second and third series. Thus, the 3d electrons form tightly bound narrow bands, and pseudo-potential theory is inappropriate. In summary, the pseudo-potentials can be created from first principles and then, if the pseudo-potential is weak enough, the nearly free electron model can be used to obtain the results for the valence bands of real solids. ——————————————————————————————————
8.2.3
Exercise 34
An electron outside a hydrogen atom with a 1s core state is treated by the pseudo-potential method. Calculate the Bloch wave function for an electron which has a pseudo-wave function that can be approximated by a single plane wave. Discuss whether this function is appropriate to represent a 2s wave function. Evaluate the magnitude of the pseudo-potential, for low energy electron states. ——————————————————————————————————
8.3
The Tight-Binding Model
The tight-binding method is appropriate to the situation in which the electron density in a solid can be considered to be mainly a superposition of the densities of the individual atoms (J.C. Slater and G.F. Koster, Phys. Rev 94, 1498 (1954)). However, the tight-binding method does produce slight corrections to the atomic densities. It should be a good approximation for the inner core orbitals where the ratio of the radius of the atomic orbit to the inter-atomic separation is small. Consider a lattice with a mono-atomic basis. The Hamiltonian for a single ion centered at 0 is $\hat{H}_0$ and has eigenstates $| \phi_m \rangle$ defined by the eigenvalue equation
$$ \hat{H}_0 \, | \phi_m \rangle = E_m \, | \phi_m \rangle \tag{555} $$
The periodic potential of the ions can be written as the sum of the potential from the ion at site 0, $V_0$, and the potential due to all other ions in the crystalline lattice, $\Delta V$,
$$ V_{ions} = V_0 + \Delta V \tag{556} $$
Thus, the Hamiltonian is written as the sum of a single ion Hamiltonian and the potential due to the rest of the ions
$$ \hat{H} = \hat{H}_0 + \Delta V \tag{557} $$
In the tight-binding method it is convenient to define Wannier functions, $\tilde{\phi}_n$, as a transform of the Bloch functions
$$ \phi_{k,n}(r) = \sum_R \exp\left[ i k \cdot R \right] \tilde{\phi}_n( r - R ) \tag{558} $$
Thus, the Wannier functions are centered around the different lattice points $R$. The Wannier states are almost localized states and are composed of a linear superposition of the atomic states
$$ | \tilde{\phi}_n \rangle = \sum_m b_{n,m} \, | \phi_m \rangle \tag{559} $$
The band structure is found from the energy eigenvalue equation for the Bloch wave functions
$$ \hat{H} \, | \phi_{k,n} \rangle = E_{k,n} \, | \phi_{k,n} \rangle \tag{560} $$
or
$$ \left( \hat{H}_0 + \Delta V \right) | \phi_{k,n} \rangle = E_{k,n} \, | \phi_{k,n} \rangle \tag{561} $$
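This energy eigenvalue equation is projected onto the atomic wave function $| \phi_m \rangle$ located at 0, leading to
$$ \langle \phi_m | \hat{H} | \phi_{k,n} \rangle = \langle \phi_m | \left( \hat{H}_0 + \Delta V \right) | \phi_{k,n} \rangle = E_{k,n} \, \langle \phi_m | \phi_{k,n} \rangle \tag{562} $$
However, the state $| \phi_m \rangle$ is an eigenstate of the atomic Hamiltonian $\hat{H}_0$ and so the overlap is given by
$$ \langle \phi_m | \hat{H}_0 | \phi_{k,n} \rangle = E_m \, \langle \phi_m | \phi_{k,n} \rangle \tag{563} $$
On substituting this relation into the matrix elements of the eigenvalue equation, the equation reduces to
$$ ( E_{k,n} - E_m ) \, \langle \phi_m | \phi_{k,n} \rangle = \langle \phi_m | \Delta V | \phi_{k,n} \rangle \tag{564} $$
The Bloch wave function can be expressed in terms of the Wannier functions, and then the Wannier functions are expressed in terms of the atomic wave functions via
$$ \phi_{k,n}(r) = \sum_R \exp\left[ i k \cdot R \right] \tilde{\phi}_n( r - R ) = \sum_{R, m'} b_{n,m'} \exp\left[ i k \cdot R \right] \phi_{m'}( r - R ) \tag{565} $$
The overlap of the Bloch functions and the atomic wave function is expressed as the sum of the overlap of atomic wave functions at the same site and the overlaps of atomic wave functions centered at different sites
$$ \langle \phi_m | \phi_{k,n} \rangle = \sum_{m'} \delta_{m,m'} \, b_{n,m'} + \sum_{m'} \sum_{R \neq 0} b_{n,m'} \exp\left[ i k \cdot R \right] \int d^3r \; \phi^*_m(r) \, \phi_{m'}(r - R) \tag{566} $$
Substituting this into the energy eigenvalue equation, one obtains the equation
$$ \begin{aligned} & ( E_{k,n} - E_m ) \sum_{m'} \delta_{m,m'} \, b_{n,m'} + ( E_{k,n} - E_m ) \sum_{m', R \neq 0} b_{n,m'} \exp\left[ i k \cdot R \right] \int d^3r \; \phi^*_m(r) \, \phi_{m'}(r - R) \\ & \quad = \sum_{m'} b_{n,m'} \int d^3r \; \phi^*_m(r) \, \Delta V(r) \, \phi_{m'}(r) + \sum_{m', R \neq 0} b_{n,m'} \exp\left[ i k \cdot R \right] \int d^3r \; \phi^*_m(r) \, \Delta V(r) \, \phi_{m'}(r - R) \end{aligned} \tag{567} $$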
The first term on the left side involves the overlap of two atomic wave functions both centered at site 0. These atomic wave functions are part of an orthonormal set of eigenfunctions. The second term on the left hand side involves the overlap of atomic wave functions at site 0 and site $R$, and may be expected to be exponentially smaller than the first term,
$$ \int d^3r \; \phi^*_m(r) \, \phi_{m'}(r - R) \ll 1 \tag{568} $$
The two terms on the right both involve the potential $\Delta V$ and the atomic wave function $\phi_m(r)$ located at site 0. The first term on the right hand side involves the effect of the potential due to the other ions on the central atom. This term represents the effect of the crystalline electric field on the atomic levels. The remaining term represents the delocalization of the electrons. The magnitudes of the coefficients $b_{n,m}$ that appear in the expansion of the Wannier state crucially depend on the ratios of the overlap integrals to the energy difference $E_{k,n} - E_m$. Generally, this allows one to approximate the Wannier functions by retaining only a finite number of atomic wave functions in their expansions. That is, the expansion of the Wannier function is truncated by only considering atomic wave functions that have energies close to the energy of the Bloch state. The set of equations can be solved approximately by considering the spatial dependence. If one assumes that the potential $\Delta V$ is non-zero only in the range where $\phi_m(r)$ is negligibly small, both terms on the right hand side will be approximately zero. Thus, in a first order and very crude approximation, it is found that $E_{k,n} = E_m$. On keeping the two center and three center integrals in which $R$ is limited to a few neighbor sites of 0, and to atomic states with a few energies close to $E_m$, the set of equations truncates into a finite set. These can be solved to yield the Bloch state energies and the Bloch wave functions.
In general, the band widths are linearly related to the overlap matrix elements, $\gamma_{i,j}$, where
$$ \gamma_{i,j}(R) = - \int d^3r \; \phi^*_i(r) \, \Delta V(r) \, \phi_j(r - R) \tag{569} $$
in which $\phi_j$ are atomic wave functions and $R$ represent atomic positions relative to the central atom 0. The band widths increase with the increase in the ratio of the spatial extent of $\phi_i(r)$ to the typical separation $R$. Thus, bands with large binding energies which tend to have wave functions with small spatial extents form narrow bands while the higher energy bands have broader band widths.
The overlap integrals are conventionally expressed in terms of the angular momentum quantum numbers $(l, m)$ of the atomic wave functions that are quantized along the axis joining the atoms. The matrix elements are non-negligible only if the z-component of the angular momentum satisfies a selection rule. The non-zero overlap matrix elements are then characterized by $m$. In analogy to the atomic wave functions, the type of bonding is labelled by the Greek letters σ, π and δ, corresponding respectively to $m = 0$, $m = \pm 1$ and $m = \pm 2$. The overlap integrals corresponding to ssσ and ppπ bonds are negative, as the lobes of the wave function with the same sign overlap the negative crystal field potential. The ppσ bonds are positive at large to intermediate separations as lobes of opposite sign overlap the negative potential, but become negative at small values of $R$ where the overlap of lobes with the same sign starts to dominate. The spσ overlap is an odd function of $R$ and vanishes for zero separation $R = 0$ as the different atomic wave functions are orthogonal. The sign of the spσ overlap depends on the ordering of the s and p orbitals along the axis. The spσ bond is positive if lobes of different sign overlap and is negative if lobes of the same sign overlap. The Helmholtz-Wolfsberg approximation consists of replacing the value of the potential $\Delta V$ by a constant. The magnitude of the potential is factorized out of the integral. Therefore, the overlap integrals merely depend on the displaced atomic wave functions, $i$ and $j$. The overlap integrals are then written as
$$ \gamma_{i,j}(R) = - \Delta V \; t_{i,j}(R) \tag{570} $$
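The overlap between hydrogen-like 1s wave functions
$$ \phi_{1s}(r) = \sqrt{\frac{\kappa^3}{\pi}} \, \exp\left[ - \kappa r \right] \tag{571} $$
can be evaluated from the Fourier transformed wave function
$$ \phi_{1s}(q) = \sqrt{\frac{\kappa^3}{\pi}} \, \frac{8 \pi \kappa}{( q^2 + \kappa^2 )^2} \tag{572} $$
The overlap of two wave functions, with a relative displacement $R$, can be evaluated via the convolution theorem
$$ \int d^3r \; \phi^*_{1s}(r) \, \phi_{1s}(r - R) = \int \frac{d^3q}{( 2 \pi )^3} \; \phi_{1s}(-q) \, \phi_{1s}(q) \, \exp\left[ i q \cdot R \right] \tag{573} $$
with the result that
$$ t_{1s,1s,\sigma} = - \left( 1 + \kappa R + \frac{1}{3} \kappa^2 R^2 \right) \exp\left[ - \kappa R \right] \tag{574} $$
On using the hydrogenic-like 2s and 2p wave functions,
$$ \begin{aligned} \phi_{2s}(r) &= \sqrt{\frac{\kappa^3}{\pi}} \, \left( 1 - \kappa r \right) \exp\left[ - \kappa r \right] \\ \phi_{2p,0}(r) &= \sqrt{\frac{\kappa^3}{\pi}} \, \cos \theta \; \kappa r \, \exp\left[ - \kappa r \right] \\ \phi_{2p,\pm 1}(r) &= \sqrt{\frac{\kappa^3}{2 \pi}} \, \sin \theta \, \exp\left[ \pm i \varphi \right] \kappa r \, \exp\left[ - \kappa r \right] \end{aligned} \tag{575} $$
one finds that the Fourier transforms of the 2s and 2p wave functions are given by
$$ \begin{aligned} \phi_{2s}(q) &= \sqrt{\kappa^3} \; \frac{32 \pi \kappa \, ( q^2 - \kappa^2 )}{( \kappa^2 + q^2 )^3} \; Y^0_0(\theta_q, \varphi_q) \\ \phi_{2p,0}(q) &= i \, \sqrt{\frac{\kappa^5}{3}} \; \frac{64 \pi \kappa \, q}{( \kappa^2 + q^2 )^3} \; Y^0_1(\theta_q, \varphi_q) \\ \phi_{2p,\pm 1}(q) &= - i \, \sqrt{\frac{\kappa^5}{3}} \; \frac{64 \pi \kappa \, q}{( \kappa^2 + q^2 )^3} \; Y^{\pm 1}_1(\theta_q, \varphi_q) \end{aligned} \tag{576} $$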
where the dependence on the direction of $q$ is expressed through the factors $Y^m_l(\theta_q, \varphi_q)$. The functions $Y^m_l(\theta, \varphi)$ are the spherical harmonics. On using the convolution theorem, the approximate overlap integrals are evaluated as
$$ \begin{aligned} t_{2s,2s,\sigma} &= - \left( 1 + \kappa R + \frac{1}{3} \kappa^2 R^2 + \frac{1}{15} \kappa^4 R^4 \right) \exp\left[ - \kappa R \right] \\ t_{2s,2p,\sigma} &= \frac{13}{30} \, \kappa^3 R^3 \, \exp\left[ - \kappa R \right] \\ t_{2p,2p,\sigma} &= - \left( 1 + \kappa R + \frac{1}{5} \kappa^2 R^2 - \frac{2}{15} \kappa^3 R^3 - \frac{1}{15} \kappa^4 R^4 \right) \exp\left[ - \kappa R \right] \\ t_{2p,2p,\pi} &= - \left( 1 + \kappa R + \frac{2}{5} \kappa^2 R^2 + \frac{1}{15} \kappa^3 R^3 \right) \exp\left[ - \kappa R \right] \end{aligned} \tag{577} $$
where $\kappa$ determines the spatial extent of the wave function and $R$ is the interatomic separation. Typically for a material such as C, the relative strengths of the bonds are given by the ratios at the radius $R$ where the bonding saturates. Typical values of the relative strengths are given by
$$ t_{2s,2s,\sigma} : t_{2s,2p,\sigma} : t_{2p,2p,\sigma} : t_{2p,2p,\pi} = - 1 : 1 : 0.75 : - 0.49 \tag{578} $$
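The quoted ratios can be reproduced directly from the overlap expressions. The sketch below evaluates the four functions of $x = \kappa R$ and normalizes them to $| t_{2s,2s,\sigma} |$. The value $x = 4.55$ is an assumption chosen here because it approximately reproduces the quoted numbers; the text itself does not state the radius at which the ratios were taken, and the function names are hypothetical.

```python
import math

def t_2s2s_sigma(x):
    return -(1 + x + x**2 / 3 + x**4 / 15) * math.exp(-x)

def t_2s2p_sigma(x):
    return (13 / 30) * x**3 * math.exp(-x)

def t_2p2p_sigma(x):
    return -(1 + x + x**2 / 5 - 2 * x**3 / 15 - x**4 / 15) * math.exp(-x)

def t_2p2p_pi(x):
    return -(1 + x + 2 * x**2 / 5 + x**3 / 15) * math.exp(-x)

def relative_strengths(x):
    """Return the four overlaps at x = kappa * R, in units of |t_2s2s_sigma|."""
    scale = abs(t_2s2s_sigma(x))
    overlaps = (t_2s2s_sigma, t_2s2p_sigma, t_2p2p_sigma, t_2p2p_pi)
    return tuple(t(x) / scale for t in overlaps)

x_assumed = 4.55                      # illustrative choice of kappa * R
ratios = relative_strengths(x_assumed)
```

At this separation the ppσ overlap is positive while the ssσ and ppπ overlaps are negative, in line with the sign discussion given earlier for these bonds.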
The structure of tight-binding d bands can be found by expressing the Bloch functions in terms of five atomic d wave functions that correspond to the different eigenvalues of the z component of the orbital angular momentum $m_z = \pm 2$, $m_z = \pm 1$ and $m_z = 0$. If $m_z$ is quantized along the axis between two atoms, the tight-binding overlap integrals between these sets of states are denoted, respectively, by $t_{d,d,\delta}$, $t_{d,d,\pi}$ and $t_{d,d,\sigma}$. The matrix elements for arbitrary orientations are tabulated in the article of Slater and Koster (1954). Representative ratios of the strengths of the $t_{d,d,\sigma}$, $t_{d,d,\pi}$ and $t_{d,d,\delta}$ bonds are given by
$$ t_{d,d,\sigma} : t_{d,d,\pi} : t_{d,d,\delta} = - 6 : 4 : - 1 \tag{579} $$
In general, the tight-binding bands obtained by considering d bands alone are highly inaccurate. Usually, a broad s band crosses the narrow set of d bands. This degeneracy is lifted as the d and s bands hybridize strongly (V. Heine, Phys. Rev. 153, 673 (1967)). The Bloch functions are constructed out of localized atomic levels with equal amplitude on every site, the sites differing only through the phase $\exp[ i k \cdot R ]$. Thus, the electrons are equally likely to be found in any atomic cell of the crystal. Also, $\mathrm{Re} \, \phi_{k,n}$ shows that the atomic structure is modulated by the sinusoidal variation of $\exp[ i k \cdot R ]$. Since the mean velocity is given by
$$ v(k) = \frac{1}{\hbar} \nabla E_k \neq 0 \tag{580} $$
then the electrons have a non-zero velocity and will be able to move throughout the crystal. The non-zero velocity is due to the coherent tunnelling of the electron between the atoms. For a lattice with a basis, the Bloch wave function is given by
$$ \phi_k(r) = \sum_R \exp\left[ i k \cdot R \right] \sum_{j,m} a_{j,m} \, \phi_m( r - r_j - R ) \tag{581} $$
where rj are the positions of the basis atoms and aj,m are the amplitudes of the orbitals on the j-th basis atom. The equation for the Bloch function has a structure in which the basis atoms in each unit cell can be viewed as forming molecules. These molecular wave functions in each lattice cell are then combined via the tight-binding method.
8.3.1
Tight-Binding s Band Metal
For a simple s-band metal the Wannier state $| \tilde{\phi}_n \rangle$ can be approximated by the atomic s wave function. As this s wave function is non-degenerate, one has
$$ | \tilde{\phi}_1 \rangle \approx | \phi_s \rangle \tag{582} $$
or $b_s = 1$. All other coefficients are set to zero, corresponding to the assumption that the energy of the s band, $E_s$, is well separated from the energies of the other bands. This is probably a good assumption for the 1s band which is often regarded as forming part of the core of the ions. The energy eigenvalue equation truncates to
$$ \left( E_{s,k} - E_s \right) \left[ 1 + \sum_{R \neq 0} \exp\left[ i k \cdot R \right] \int d^3r \; \phi^*_s(r) \, \phi_s(r - R) \right] = \langle \phi_s | \Delta \hat{V} | \phi_s \rangle + \sum_{R \neq 0} \exp\left[ i k \cdot R \right] \int d^3r \; \phi^*_s(r) \, \Delta V(r) \, \phi_s(r - R) \tag{583} $$
The overlap between the atomic wave functions on different sites is defined to be a function $\alpha(R)$ through
$$ \int d^3r \; \phi^*_s(r) \, \phi_s(r - R) = \alpha(R) \tag{584} $$
The matrix element of the atomic functions centered at 0 with the tail of the potential, $\Delta V$, is defined to be $\beta$ where
$$ \langle \phi_s | \Delta \hat{V} | \phi_s \rangle = - \beta \tag{585} $$
and the matrix elements of the atomic functions centered at 0 and $R$ with the tail of the potential are defined to be $\gamma(R)$ through
$$ \int d^3r \; \phi^*_s(r) \, \Delta V(r) \, \phi_s(r - R) = - \gamma(R) \tag{586} $$
The dispersion relation can be expressed in terms of these three functions via
$$ E_{s,k} = E_s - \left( \frac{ \beta + \sum_{R \neq 0} \gamma(R) \exp\left[ i k \cdot R \right] }{ 1 + \sum_{R \neq 0} \alpha(R) \exp\left[ i k \cdot R \right] } \right) \tag{587} $$
Since $\gamma(R) = \gamma(-R)$ and $\alpha(R) = \alpha(-R)$, the dispersion relation $E_{s,k}$ is an even periodic function of $k$. For bonding only to the nearest neighbors, the sums over $R$ are truncated to run only over the nearest neighbors. For the f.c.c. structure the dispersion relation becomes
$$ E_{s,k} = E_s - \left( \frac{ \beta + \gamma(k) }{ 1 + \alpha(k) } \right) \tag{588} $$
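where
$$ \gamma(k) = 4 \gamma \left[ \cos \frac{k_x a}{2} \cos \frac{k_y a}{2} + \cos \frac{k_x a}{2} \cos \frac{k_z a}{2} + \cos \frac{k_y a}{2} \cos \frac{k_z a}{2} \right] \tag{589} $$
and
$$ \alpha(k) = 4 \alpha \left[ \cos \frac{k_x a}{2} \cos \frac{k_y a}{2} + \cos \frac{k_x a}{2} \cos \frac{k_z a}{2} + \cos \frac{k_y a}{2} \cos \frac{k_z a}{2} \right] \tag{590} $$
Usually $\alpha$ is neglected as it is small. The tight-binding bands are off-set from $E_s$ by an energy $\beta$ due to the tail of the potential of all other atoms at site 0,
$$ \beta = - \langle \phi_s | \Delta \hat{V} | \phi_s \rangle \tag{591} $$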
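The nearest-neighbor f.c.c. dispersion is simple enough to explore numerically. The following sketch evaluates it with $\alpha$ neglected and scans a coarse grid of wave vectors to locate the bottom and top of the band. The function names, the units ($\gamma = a = 1$, $E_s = \beta = 0$) and the grid are illustrative assumptions.

```python
import itertools
import math

def gamma_k(kx, ky, kz, gamma=1.0, a=1.0):
    """Nearest-neighbour f.c.c. structure factor 4*gamma*(cx*cy + cx*cz + cy*cz)."""
    cx, cy, cz = (math.cos(0.5 * k * a) for k in (kx, ky, kz))
    return 4.0 * gamma * (cx * cy + cx * cz + cy * cz)

def e_fcc(kx, ky, kz, Es=0.0, beta=0.0, gamma=1.0, a=1.0):
    """Tight-binding s band with the overlap alpha(k) neglected."""
    return Es - beta - gamma_k(kx, ky, kz, gamma, a)

def band_extremes(n=21, gamma=1.0, a=1.0):
    """Scan a cubic grid with k components in [-2*pi/a, 2*pi/a]."""
    ks = [(-2.0 + 4.0 * i / (n - 1)) * math.pi / a for i in range(n)]
    energies = [e_fcc(kx, ky, kz, gamma=gamma, a=a)
                for kx, ky, kz in itertools.product(ks, repeat=3)]
    return min(energies), max(energies)

e_min, e_max = band_extremes()
e_small_k = e_fcc(0.01, 0.0, 0.0)     # should be close to -12 + k^2 a^2
```

In these units the scan finds the band bottom at the zone center and the band top at wave vectors such as $(2\pi/a, 0, 0)$, and the small-$k$ value follows the quadratic form expected near the zone center.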
The band width is governed by the overlap of the central atom's wave function with the nearest neighbor atomic wave function. This overlap, $\gamma$, is evaluated from
$$ \gamma = - \int d^3r \; \phi^*_s(r) \, \Delta V(r) \, \phi_s(r - R_{nn}) \tag{592} $$
When $\alpha$ is neglected, $\gamma(k)$ ranges from $12 \gamma$ at $k = 0$ to $- 4 \gamma$ at the X point, so the band width for the f.c.c. lattice is $16 \gamma$. For small $| k | a$ one can expand the dispersion relation in powers of $k$
$$ E_{s,k} = E_s - \beta - 12 \gamma + \gamma \, k^2 a^2 \tag{593} $$
which is independent of the direction of $k$ near $k = 0$. Thus, the constant energy surfaces are spherical around $k = 0$. The gradient of the energy has a component perpendicular to the square face of the Brillouin zone (the face containing the X point) that is given by
$$ \frac{\partial E_k}{\partial k_x} = 2 a \gamma \, \sin \frac{k_x a}{2} \left[ \cos \frac{k_y a}{2} + \cos \frac{k_z a}{2} \right] \tag{594} $$
Thus, if $E_{s,k}$ is plotted along any line in $k$ space which is perpendicular to the square face, it crosses with zero slope. The points on the hexagonal face satisfy the equation
$$ k_x + k_y + k_z = \frac{3}{2} \, \frac{2 \pi}{a} = \frac{3 \pi}{a} \tag{595} $$
Since there is no plane of symmetry parallel to the hexagonal face, the energy plotted along any line perpendicular to the hexagonal face is not required to cross with zero slope,
$$ \begin{aligned} \nabla E_{s,k} \cdot \hat{e} \; \propto \; & \sin \frac{k_x a}{2} \left[ \cos \frac{k_y a}{2} + \cos \frac{k_z a}{2} \right] \\ + & \sin \frac{k_y a}{2} \left[ \cos \frac{k_x a}{2} + \cos \frac{k_z a}{2} \right] \\ + & \sin \frac{k_z a}{2} \left[ \cos \frac{k_x a}{2} + \cos \frac{k_y a}{2} \right] \end{aligned} \tag{596} $$
This only vanishes along the lines joining L $( \frac{1}{2}, \frac{1}{2}, \frac{1}{2} )$ to the vertices W $( 1, \frac{1}{2}, 0 )$, where the coordinates are given in units of $2 \pi / a$. For degenerate levels such as p or d levels, the tight-binding method leads to an N × N secular equation where N is the orbital degeneracy. For heavy elements, spin-orbit coupling should be included. In this case, the potential $\Delta V$ should have a spin dependent contribution. The spin-orbit coupling breaks the spin degeneracy and increases the size of the secular equation by a factor of 2 (J. Friedel, P. Lenghart and G. Leman, J. Phys Chem. Solids 25, 781 (1964)). ——————————————————————————————————
8.3.2
Exercise 35
Consider two p orbitals, one located at the origin and another at the point $R \, ( \cos \theta_x, \cos \theta_y, \cos \theta_z )$, where $R$ is the separation between the two ions and the $\cos \theta$ are the direction cosines of the displacements. The overlap parameters for the orbitals $\phi_i(r)$ and $\phi_j(r)$ are defined by
$$ \gamma_{i,j}(R) = - \int d^3r \; \phi^*_i(r) \, \Delta V(r) \, \phi_j(r - R) \tag{597} $$
Show that the overlap parameters are given by
$$ \begin{aligned} \gamma_{x,x} &= - \left( t_{pp\sigma} \cos^2 \theta_x + t_{pp\pi} \sin^2 \theta_x \right) \\ \gamma_{x,y} &= - \left( t_{pp\sigma} - t_{pp\pi} \right) \cos \theta_x \cos \theta_y \end{aligned} \tag{598} $$
Thus, the tight-binding parameters not only depend on the distance, R, but also depend on the direction. ——————————————————————————————————
8.3.3
Exercise 36
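Consider the p bands in a cubic crystal, which have the p wave functions
$$ \begin{aligned} \phi_{p_x}(r) &= x \, f(r) \\ \phi_{p_y}(r) &= y \, f(r) \\ \phi_{p_z}(r) &= z \, f(r) \end{aligned} \tag{599} $$
where $f(r)$ is a spherically symmetric function. The energies of the three p bands are found from the secular equation
$$ \det \left[ \left( E_k - E_p \right) \delta_{i,j} + \beta_{i,j} + \gamma_{i,j}(k) \right] = 0 \tag{600} $$
and
$$ \gamma_{i,j}(k) = \sum_R \exp\left[ i k \cdot R \right] \gamma_{i,j}(R) \tag{601} $$
and
$$ \gamma_{i,j}(R) = - \int d^3r \; \phi^*_i(r) \, \Delta V(r) \, \phi_j(r - R) \tag{602} $$
and
$$ \beta_{i,j} = \gamma_{i,j}(0) \tag{603} $$
Show that, using cubic symmetry,
$$ \beta_{x,x} = \beta_{y,y} = \beta_{z,z} = \beta \tag{604} $$
and all other overlap matrix elements are zero
$$ \beta_{x,y} = \beta_{y,z} = \beta_{x,z} = 0 \tag{605} $$
Assuming that only the nearest neighbor overlaps $\gamma_{i,j}(R)$ are non-zero, show that for a simple cubic lattice $\gamma_{i,j}(k)$ are diagonal in $i$ and $j$. Hence, the $p_x$, $p_y$ and $p_z$ wave functions generate three independent bands
$$ \begin{aligned} E_{x,k} &= E_p + 2 \, t_{pp\sigma} \cos k_x a + 2 \, t_{pp\pi} \left( \cos k_y a + \cos k_z a \right) \\ E_{y,k} &= E_p + 2 \, t_{pp\sigma} \cos k_y a + 2 \, t_{pp\pi} \left( \cos k_x a + \cos k_z a \right) \\ E_{z,k} &= E_p + 2 \, t_{pp\sigma} \cos k_z a + 2 \, t_{pp\pi} \left( \cos k_x a + \cos k_y a \right) \end{aligned} \tag{606} $$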
The relative values of these parameters can be estimated from first principles calculations of bulk silicon, where the ratios were found to be given by tppσ : tppπ = 3.98 : − 1 . ——————————————————————————————————
8.3.4
Exercise 37
Consider the p bands in a face-centered cubic lattice with nearest neighbor hopping $\gamma_{i,j}(R)$. Show that the system is described by a 3 × 3 secular equation which is expressed in terms of four integrals
$$ 0 = \begin{vmatrix} E - E^0_k + M^0_x & - M^1_z & - M^1_y \\ - M^1_z & E - E^0_k + M^0_y & - M^1_x \\ - M^1_y & - M^1_x & E - E^0_k + M^0_z \end{vmatrix} \tag{607} $$
where the functions $M^0_i$ and $M^1_i$ are given by
$$ \begin{aligned} M^0_x &= 4 \gamma_0 \cos \frac{k_y a}{2} \cos \frac{k_z a}{2} \\ M^1_x &= 4 \gamma_1 \sin \frac{k_y a}{2} \sin \frac{k_z a}{2} \end{aligned} \tag{608} $$
and cyclic permutations. The energy $E^0_k$ is given by
$$ E^0_k = E_p - \beta - 4 \gamma_2 \left[ \cos \frac{k_y a}{2} \cos \frac{k_z a}{2} + \cos \frac{k_x a}{2} \cos \frac{k_z a}{2} + \cos \frac{k_x a}{2} \cos \frac{k_y a}{2} \right] \tag{609} $$
Evaluate the integrals in terms of the overlap of atomic wave functions by using the Helmholtz-Wolfsberg approximation. Also show that the three energy bands are degenerate at the Γ point, and that when $k$ is directed along the cube axis (Γ X) or the cube diagonal (Γ L), two bands are degenerate. ——————————————————————————————————
8.3.5
Exercise 38
The parent compound of the doped high temperature superconductors is La2CuO4 which has the Perovskite structure. In this structure, the CuO2 atoms form planes. Each Cu atom is surrounded by an octahedron of O atoms, of which four atoms are in the plane. The in-plane Cu − O bonds can serve to define the x and y axes. The O atoms that have the Cu − O bonds parallel to the x axis are denoted as $O_x$, whereas the other O atoms are denoted by $O_y$. In this coordinate system, the appropriate basis orbitals are the Cu $d_{x^2 - y^2}$ orbitals, while the only $O_x$ states which mix with the Cu states are the $p_x$ states and the only $O_y$ states that mix with the Cu are the $p_y$ states. Using the tight-binding form of the Bloch wave function
$$ \phi_k(r) = \sum_R \exp\left[ i k \cdot R \right] \left[ a \, \phi^d_{x^2 - y^2}( r - R ) + b_x \, \phi^p_x\!\left( r - R - \frac{a}{2} \hat{e}_x \right) + b_y \, \phi^p_y\!\left( r - R - \frac{a}{2} \hat{e}_y \right) \right] \tag{610} $$
find the energy bands for the CuO2 planes. ——————————————————————————————————
8.3.6
Exercise 39
Evaluate the tight-binding density of states for the s states of a simple hypercubic lattice in d = 1, d = 2, d = 3, d = 4, in which only the nearest neighbor hopping matrix elements t are retained. Calculate the form of the density of states when d → ∞. ——————————————————————————————————
8.3.7
Exercise 40
Consider the tight-binding density of states for s states on a tetragonal lattice where the overlap in the c direction is $t'$ and the overlap in either the a or b direction is $t$. Assume that $t \gg t'$. Examine the form of the Fermi-surface when the band is nearly half-filled. Evaluate the density of states. ——————————————————————————————————
8.3.8
Wannier Functions
Consider the position $r$ to have a fixed value. The Bloch functions can be written as
$$ \phi_{k,n}(r) = \sum_R \exp\left[ i k \cdot R \right] f_n(r, R) \tag{611} $$
The Bloch function $\phi_k$ for fixed $r$ is periodic in $k$, with periodicity given by the primitive reciprocal lattice vectors $Q$. Clearly
$$ \phi_{k+Q,n} = \sum_R \exp\left[ i ( k + Q ) \cdot R \right] f_n(r, R) = \sum_R \exp\left[ i k \cdot R \right] f_n(r, R) = \phi_{k,n} \tag{612} $$
since $Q$ and $R$ satisfy the Laue condition. Thus, the Bloch functions are periodic functions in $k$ space. The Fourier coefficients, $f_n(r, R)$, that appear in the $k$ space Fourier expansion can be found from the inversion formula
$$ f_n(r, R) = \frac{1}{\Omega_c} \int_{\Omega_c} d^3k' \; \exp\left[ - i k' \cdot R \right] \phi_{k',n}(r) \tag{613} $$
where the integration volume $\Omega_c$ is the volume of one cell of the reciprocal lattice. The simultaneous transformations $r \to r - R_0$ and $R \to R - R_0$ leave $f_n(r, R)$ unchanged
$$ f_n(r, R) = f_n(r - R_0, R - R_0) \tag{614} $$
This is proved by considering the effect of the transformation $r \to r - R_0$ on the definition of the functions $f_n(r, R)$
$$ \phi_{k,n}(r) = \sum_{R'} \exp\left[ i k \cdot R' \right] f_n(r, R') \tag{615} $$
Applying the transformation on the Bloch function yields
$$ \phi_{k,n}(r - R_0) = \sum_{R'} \exp\left[ i k \cdot R' \right] f_n(r - R_0, R') \tag{616} $$
and then, on transforming the sum over $R'$ as $R' = R - R_0$, one has
$$ \phi_{k,n}(r - R_0) = \sum_R \exp\left[ i k \cdot ( R - R_0 ) \right] f_n(r - R_0, R - R_0) \tag{617} $$
On comparing the above expression with the result of Bloch's theorem
$$ \phi_{k,n}(r - R_0) = \exp\left[ - i k \cdot R_0 \right] \sum_R \exp\left[ i k \cdot R \right] f_n(r, R) \tag{618} $$
one recovers the symmetry relation
$$ f_n(r, R) = f_n(r - R_0, R - R_0) \tag{619} $$
Using the above symmetry of $f_n(r, R)$ under a translation $R_0$, and on choosing $R_0 = R$ one finds
$$ f_n(r, R) = f_n(r - R, 0) = \tilde{\phi}_n(r - R) \tag{620} $$
which shows that the function only depends on the difference $r - R$. Hence, it has been shown that the Bloch function can be expressed as
$$ \phi_{k,n}(r) = \sum_R \exp\left[ i k \cdot R \right] \tilde{\phi}_n(r - R) \tag{621} $$
where $\tilde{\phi}_n(r)$ are the Wannier functions (G. Wannier, Phys. Rev. 52, 191 (1937)). The Wannier functions at different sites are orthogonal. Thus, as they are linearly related to the Bloch wave functions $\phi_{k,n}(r)$, the set of Wannier functions form a complete orthogonal set. The Wannier functions are given in terms of the Bloch functions via
$$ \tilde{\phi}_n(r - R) = \frac{1}{\Omega_c} \int_{\Omega_c} d^3k \; \exp\left[ - i k \cdot R \right] \phi_{k,n}(r) \tag{622} $$
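The inversion from Bloch functions to a Wannier function can be tried out on the simplest possible case. The sketch below assumes one-dimensional plane-wave Bloch functions, $\phi_k(x) = \exp( i k x )$, for a lattice of spacing $a$, integrates over the first zone $- \pi / a < k < \pi / a$ with a midpoint rule, and compares the result with the closed form $\sin( \pi x / a ) / ( \pi x / a )$ obtained by doing the same integral analytically. The function names and the number of quadrature points are illustrative assumptions.

```python
import math

def wannier_free_1d(x, a=1.0, n_k=2000):
    """Midpoint-rule estimate of (1/Omega_c) * integral of exp(i k x) over the first zone."""
    omega_c = 2.0 * math.pi / a               # length of the 1D reciprocal cell
    dk = omega_c / n_k
    total = 0.0
    for j in range(n_k):
        k = -math.pi / a + (j + 0.5) * dk
        total += math.cos(k * x)              # sin(k x) part cancels between k and -k
    return total * dk / omega_c

def sinc_form(x, a=1.0):
    """Closed-form result sin(pi x / a) / (pi x / a)."""
    u = math.pi * x / a
    return 1.0 if u == 0.0 else math.sin(u) / u

samples = [0.0, 0.3, 1.0, 2.5]
errors = [abs(wannier_free_1d(x) - sinc_form(x)) for x in samples]
```

The function equals one at its own site, vanishes at the other lattice sites, and falls off only as $1/x$ in between.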
The Wannier functions are localized around the site $R$, as can be seen by substituting the expression for the Bloch functions in the above equation
$$ \tilde{\phi}_n(r - R) = \frac{1}{\Omega_c} \int_{\Omega_c} d^3k \; \exp\left[ + i k \cdot ( r - R ) \right] u_{n,k}(r) \tag{623} $$
The phase factor in the integral over $d^3k$ has the effect of localizing the Bloch function around $r = R$, as at this $r$ value, the phase of the integral is stationary. The integral is easy to evaluate for free electrons for which $u_{n,k}(r) = 1$. The Wannier functions appropriate to free electrons in an orthorhombic lattice are given by
$$ \tilde{\phi}_n(r) = \frac{ \sin\left[ \frac{\pi x}{a_x} \right] }{ \frac{\pi x}{a_x} } \; \frac{ \sin\left[ \frac{\pi y}{a_y} \right] }{ \frac{\pi y}{a_y} } \; \frac{ \sin\left[ \frac{\pi z}{a_z} \right] }{ \frac{\pi z}{a_z} } \tag{624} $$
which have amplitudes that decay algebraically outside the unit cell. This algebraic decay is found only for bands with infinite width. Bands that have allowed energies that are separated by forbidden ranges of E of finite width have Wannier functions that decay exponentially. Furthermore, the rate of exponential decay is dependent on the band width (W. Kohn Phys. Rev. 115 (1959), E.I. Blount, Solid State Physics, Vol 13, Acad. Press, (1962)). ——————————————————————————————————
8.3.9
Exercise 41
Prove that the Wannier functions centered on different lattice sites are orthogonal
$$ \int d^3r \; \tilde{\phi}^*_{n'}(r - R') \, \tilde{\phi}_n(r - R) \propto \delta_{n',n} \, \delta_{R',R} \tag{625} $$
Also show that the Wannier functions are normalized to unity
$$ \int d^3r \; | \tilde{\phi}_n(r) |^2 = 1 \tag{626} $$
——————————————————————————————————
9
Electron-Electron Interactions
In the last chapter, the effects of interactions between electrons were neglected in the calculation of the energies of single-electron excitations and the single-electron wave functions. The neglect of the effects of electron-electron interactions is certainly not justifiable from considerations of the relative strength of the effect of the Coulomb interactions with the potential due to the lattice of nuclei compared with the electron-electron interactions. However, due to the Pauli exclusion principle, the lowest energy excitations of an interacting electron gas can be put into a one to one correspondence with the excitations of a non-interacting gas of fermions. The effects of electron-electron interactions are weak for low energy excitations and this leads to the concept of treating the interacting electron system as a Landau Fermi Liquid.
9.1
The Landau Fermi Liquid
The Pauli exclusion principle plays an important role in reducing the effect of electron-electron interactions. An important result of this blocking principle is that the low energy excitations of an interacting electron gas behave very similarly to those of a non-interacting electron gas. This allows one to consider the low energy excitations as quasi-particles, which have a one-to-one correspondence with the excitations of a non-interacting electron gas. This is the basis of the Landau theory of Fermi-liquids. An important step in deriving the Landau theory was provided by J.M. Luttinger, who showed that electrons with energies close to the Fermi-energy have scattering rates that vanish as the energy approaches the Fermi-energy, to all orders in the electron-electron interaction. This can already be seen from the lowest order calculation of the lifetime of an electron in a Bloch state due to electron-electron interactions. Although a rigorous derivation of Fermi Liquid theory must consider processes of all orders in the electron-electron interaction, we shall only consider the lowest order processes.

Consider the lowest order process, in which an electron, initially in a state $\mathbf{k}$ above the Fermi-surface, is scattered to a state $\mathbf{k} - \mathbf{q}$. In this scattering process a second electron is excited from an initial state $\mathbf{k}'$ below the Fermi-surface to a state $\mathbf{k}' + \mathbf{q}$ above the Fermi-surface. This process conserves momentum and will conserve energy if
$$
\frac{\hbar^2 k^2}{2m} - \frac{\hbar^2 ( \mathbf{k} - \mathbf{q} )^2}{2m} = \frac{\hbar^2 ( \mathbf{k}' + \mathbf{q} )^2}{2m} - \frac{\hbar^2 k'^2}{2m} \tag{627}
$$
or
$$
( \mathbf{k} - \mathbf{k}' ) \cdot \mathbf{q} = q^2 \tag{628}
$$
For fixed $\mathbf{k}$ and $\mathbf{k}'$ this is the equation of a sphere of diameter $| \mathbf{k} - \mathbf{k}' |$, centered on $( \mathbf{k} - \mathbf{k}' ) / 2$. Thus, $\mathbf{q}$ ranges from $0$ to $\mathbf{k} - \mathbf{k}'$, and conservation of energy ensures that $\mathbf{k} - \mathbf{q}$ lies on a sphere of radius $| ( \mathbf{k} - \mathbf{k}' ) / 2 |$, centered at $( \mathbf{k} + \mathbf{k}' ) / 2$, passing through $\mathbf{k}$ and $\mathbf{k}'$. However, since $\mathbf{k} - \mathbf{q}$ must be above the Fermi-surface, there are additional restrictions due to the Pauli exclusion principle, namely
$$
| \mathbf{k} - \mathbf{q} | \geq k_F \tag{629}
$$
and
$$
| \mathbf{k}' + \mathbf{q} | \geq k_F \tag{630}
$$
Thus, only a segment of the surface of this sphere represents final states of the possible processes. This segment becomes small as k approaches kF . In the limit | k | → kF this segment tends to a circle in the plane of intersection of the sphere and the Fermi-surface, unless of course k = − k 0 . The net result is that the phase space available for the scattering process vanishes as k → kF , and the scattering rate vanishes (J.J. Quinn and R.A. Ferrell, Phys. Rev. 112, 812 (1958)).
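The shrinking of the available phase space can be illustrated numerically. The following is a minimal Monte Carlo sketch, not part of the original derivation: it samples $\mathbf{k}'$ uniformly inside the Fermi-sphere and $\mathbf{q}$ uniformly on the energy-conserving sphere of (628), and counts how often the Pauli restrictions (629) and (630) are satisfied. The function names, the sample size, the seed and the choice of units with $k_F = 1$ are assumptions made for illustration only.

```python
import math
import random


def length(v):
    """Euclidean length of a three-component vector."""
    return math.sqrt(sum(c * c for c in v))


def random_unit_vector(rng):
    """Return a direction distributed uniformly over the unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        size = length(v)
        if size > 0.0:
            return [c / size for c in v]


def allowed_fraction(k_mag, k_f=1.0, samples=20000, seed=0):
    """Fraction of energy-conserving processes allowed by the Pauli principle.

    k_mag is |k| for the incoming electron (assumed >= k_f).
    """
    rng = random.Random(seed)
    k = [0.0, 0.0, k_mag]
    allowed = 0
    for _ in range(samples):
        # k' uniformly distributed inside the Fermi-sphere
        radius = k_f * rng.random() ** (1.0 / 3.0)
        k_prime = [radius * c for c in random_unit_vector(rng)]
        # q lies on the sphere (k - k') . q = q^2, centered on (k - k') / 2
        centre = [(a - b) / 2.0 for a, b in zip(k, k_prime)]
        direction = random_unit_vector(rng)
        q = [c + length(centre) * u for c, u in zip(centre, direction)]
        final_1 = [a - b for a, b in zip(k, q)]        # k - q
        final_2 = [a + b for a, b in zip(k_prime, q)]  # k' + q
        if length(final_1) >= k_f and length(final_2) >= k_f:
            allowed += 1
    return allowed / samples


if __name__ == "__main__":
    for k_mag in (1.5, 1.2, 1.05, 1.0):
        print(k_mag, allowed_fraction(k_mag))
```

Within this sketch the allowed fraction falls rapidly as $|\mathbf{k}|$ approaches $k_F$, in line with the geometric argument above.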
9.1.1
The Scattering Rate
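The scattering rate can be evaluated from Fermi's Golden rule
$$
\frac{1}{\tau_k} = \frac{2 \pi}{\hbar} \sum_{\mathbf{q}} \sum_{\mathbf{k}'} \left( \frac{4 \pi e^2}{q^2 + k_{TF}^2} \right)^2 \frac{m}{\hbar^2} \, \delta\left( ( \mathbf{k} - \mathbf{k}' ) \cdot \mathbf{q} - q^2 \right) \tag{631}
$$
The sum over $\mathbf{k}'$ is performed where $\mathbf{k}'$ lies within the Fermi-sphere, and the final states are subject to the restrictions (629) and (630). Since the phase space allowed by these restrictions vanishes as $k \rightarrow k_F$, the quasi-particle scattering rate vanishes as $E_k \rightarrow \mu$ at zero temperature. At finite temperatures the quasi-particle scattering rate at the Fermi-energy varies as $( k_B T )^2$ (E. Abrahams, Phys. Rev. 95, 834 (1954)). The quasi-particle concept remains valid in the limit $E_k \rightarrow \mu$ and $T \rightarrow 0$.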
9.1.2
The Quasi-Particle Energy
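The quasi-particle excitation energy $E_k$ is affected by the interaction with the other electrons in the system. The manner in which this change in energy occurs can be estimated from perturbation theory. To second order in the perturbation, the energy of the state with an additional electron in state $\mathbf{k}$ is given by
$$
\begin{aligned}
E_k^{+} = {} & E_k^0 + \sum_{|k_n| < k_F} E_{k_n}^0 + \Big\langle \mathbf{k} \prod_{|k_n| < k_F} \mathbf{k}_n \Big| \hat{H}_{int} \Big| \mathbf{k} \prod_{|k_n| < k_F} \mathbf{k}_n \Big\rangle \\
& + \sum_{\mathbf{q}} \sum_{|k_m| < k_F} \frac{ \Big| \Big\langle \mathbf{k} \prod_{|k_n| < k_F} \mathbf{k}_n \Big| \hat{H}_{int} \Big| \mathbf{k} - \mathbf{q} , \, \mathbf{k}_m + \mathbf{q} \prod_{|k_n| < k_F , \, n \neq m} \mathbf{k}_n \Big\rangle \Big|^2 }{ E_k^0 + E_{k_m}^0 - E_{k - q}^0 - E_{k_m + q}^0 }
\end{aligned}
\tag{632}
$$
To second order in the interaction, the ground state energy is given by
$$
\begin{aligned}
E_{gs} = {} & \sum_{|k_n| < k_F} E_{k_n}^0 + \Big\langle \prod_{|k_n| < k_F} \mathbf{k}_n \Big| \hat{H}_{int} \Big| \prod_{|k_n| < k_F} \mathbf{k}_n \Big\rangle \\
& + \sum_{\mathbf{q}} \sum_{m , m'} \frac{ \Big| \Big\langle \prod_{|k_n| < k_F} \mathbf{k}_n \Big| \hat{H}_{int} \Big| \mathbf{k}_{m'} - \mathbf{q} , \, \mathbf{k}_m + \mathbf{q} \prod_{|k_n| < k_F , \, n \neq m , m'} \mathbf{k}_n \Big\rangle \Big|^2 }{ E_{k_{m'}}^0 + E_{k_m}^0 - E_{k_{m'} - q}^0 - E_{k_m + q}^0 }
\end{aligned}
\tag{633}
$$
The excitation energy for adding an electron to state $\mathbf{k}$ is defined by
$$
E_k^{exc} = E_k^{+} - E_{gs} \tag{634}
$$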
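To this order, the excitation energy is expressed in terms of two-particle states as
$$
\begin{aligned}
E_k^{exc} = {} & E_k^0 + \sum_{|k_n| < k_F} \langle \mathbf{k} \, \mathbf{k}_n | \hat{H}_{int} | \mathbf{k} \, \mathbf{k}_n \rangle \\
& + \sum_{|k - q| > k_F} \; \sum_{|k_m| < k_F , \, |k_m + q| > k_F} \frac{ | \langle \mathbf{k} \, \mathbf{k}_m | \hat{H}_{int} | \mathbf{k} - \mathbf{q} , \, \mathbf{k}_m + \mathbf{q} \rangle |^2 }{ E_k^0 + E_{k_m}^0 - E_{k - q}^0 - E_{k_m + q}^0 } \\
& - \sum_{|k + q| < k_F} \; \sum_{|k_m| < k_F , \, |k_m + q| > k_F} \frac{ | \langle \mathbf{k} + \mathbf{q} , \, \mathbf{k}_m | \hat{H}_{int} | \mathbf{k} , \, \mathbf{k}_m + \mathbf{q} \rangle |^2 }{ E_{k + q}^0 + E_{k_m}^0 - E_k^0 - E_{k_m + q}^0 }
\end{aligned}
\tag{635}
$$
The terms first order in the interaction represent the interaction of the particle with the average density due to the other electrons. The last two terms are second order terms. The first of this pair represents the scattering of the electron in state $\mathbf{k}$ by an electron $\mathbf{k}_m$ in the Fermi-sea, to final states $\mathbf{k} - \mathbf{q}$ and $\mathbf{k}_m + \mathbf{q}$ above the Fermi-sea. The last term represents a subtraction, as it corresponds to a scattering process for a pair of electrons that initially are below the Fermi-surface, which is forbidden by the Pauli exclusion principle as the state $\mathbf{k}$ is occupied by an electron. The $\mathbf{k}$ independent terms are absorbed into a shift of the Fermi-energy.

This expression represents the excitation energy for adding an electron to state $\mathbf{k}$. However, the many-body state consists of a linear superposition of single-electron states and states where the added electron is dressed by electron-hole pairs. The quasi-particle weight $Z^{-1}(k)$ is defined as the fraction of the initial bare electron contained in the quasi-particle. To lowest order in $\hat{H}_{int}$, the quasi-particle weight or wave function renormalization is calculated as
$$
\begin{aligned}
Z(k) = {} & 1 + \sum_{|k - q| > k_F} \; \sum_{|k_m| < k_F , \, |k_m + q| > k_F} \frac{ | \langle \mathbf{k} \, \mathbf{k}_m | \hat{H}_{int} | \mathbf{k} - \mathbf{q} , \, \mathbf{k}_m + \mathbf{q} \rangle |^2 }{ ( E_k^0 + E_{k_m}^0 - E_{k - q}^0 - E_{k_m + q}^0 )^2 } \\
& + \sum_{|k + q| < k_F} \; \sum_{|k_m| < k_F , \, |k_m + q| > k_F} \frac{ | \langle \mathbf{k} + \mathbf{q} , \, \mathbf{k}_m | \hat{H}_{int} | \mathbf{k} , \, \mathbf{k}_m + \mathbf{q} \rangle |^2 }{ ( E_{k + q}^0 + E_{k_m}^0 - E_k^0 - E_{k_m + q}^0 )^2 }
\end{aligned}
\tag{636}
$$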
which is greater than unity. Thus, the fraction of the bare electron in the quasi-particle state is always less than unity. This conclusion remains valid to all orders of perturbation theory, if the Fermi Liquid phase is stable. When $|k|$ crosses $k_F$, the excitation changes from a quasi-particle to a quasi-hole. At zero temperature, due to the vanishing of the quasi-particle scattering rate, the distribution of the number of bare particles has a discontinuity at the Fermi-energy of $Z(k)^{-1}$. This discontinuity is small compared with the discontinuity for non-interacting electrons, which is completely contained in the Fermi-function. Thus, the concept of a Fermi-surface remains well defined for interacting electron systems. The quasi-particle weight has the effect that the excitation energy for a single quasi-particle is given by the expression
$$
E_{qp}(k) = \frac{E_k^{exc}}{Z(k)} \tag{637}
$$
In addition to the shift in the excitation energy, the quasi-particle excitation energy is reduced by $Z(k)$, and these two effects combine to yield a reduction of the dispersion. The reduced dispersion is interpreted in terms of an increase in the effective mass of the quasi-particle. The density of single-electron excitations is given by the quasi-particle contribution
$$
\rho_{qp}(E) = \sum_{\mathbf{k}} Z(k)^{-1} \, \delta\left( E - E_{qp}(k) \right) \tag{638}
$$
where $E$ is the excitation energy relative to the Fermi-energy. Due to the quasi-particle weight factor, the single-electron density of states is narrowed and peaks up near the Fermi-energy. As the quasi-particles obey Fermi-Dirac statistics, the quasi-particles can give rise to an enhancement of the coefficient of the linear $T$ term in the low-temperature electronic specific heat.

Despite the apparent simplicity of the Fermi Liquid picture, it is exceedingly difficult to quantitatively derive the Fermi Liquid description appropriate to a specific microscopic Hamiltonian. Since the perturbation due to the electron-electron interaction is long-ranged, there are divergent terms in the perturbation expansion. The divergent terms first appear in the expansion taken to second order. The divergent terms can actually be re-summed to yield finite results. The re-summations are made possible by the fact that the long-ranged Coulomb interaction in a metal is screened by the other electrons. The screening process involves the Coulomb interaction to infinite order. By taking into account the screening of the long-ranged Coulomb interaction, the divergent terms can be summed to infinite order leading to finite results. That is, the divergence associated with any term can be eliminated by combining it with a subset of other divergent terms. However, the re-summation of all the terms in the perturbation expansion presents a serious challenge, and so approximations have been developed. These approximations involve the summation of infinite subsets of the terms that appear in the perturbation expansion. One such approximation is the Hartree-Fock approximation. The Hartree-Fock approximation is self-consistent first order perturbation theory, in that it just consists of the first order terms in the perturbation expansion. However, in these terms, all the wave functions are calculated self-consistently by taking the first order processes into account.
——————————————————————————————————
9.1.3
Exercise 42
Using a perturbation expansion, find the energy of a free electron gas to first order in the electron-electron interaction. ——————————————————————————————————
9.2
The Hartree-Fock Approximation
The Hartree-Fock approximation consists of writing the many-electron wave function as a single Slater determinant, much the same way as for independent or non-interacting electrons. This should be contrasted with the exact wave function, which is expected to be composed of a linear superposition of Slater determinants. The Hartree-Fock approximation, therefore, involves finding the best one-electron basis functions that take the average effect of electron-electron interactions into account (D.R. Hartree, Proc. Camb. Phil. Soc., 24, 89 (1928)). The Hartree-Fock approximation can be expressed in terms of the Rayleigh-Ritz variational principle (V.A. Fock, Zeit. für Physik, 61, 126 (1930)), in which the many-particle wave function is written as a single Slater determinant (J.C. Slater, Phys. Rev. 35, 210 (1930)). The Hamiltonian operator is expressed as
$$
\hat{H} = \sum_i \left[ \frac{\hat{p}_i^2}{2m} + V_{ions}(\mathbf{r}_i) \right] + \frac{1}{2} \sum_{i \neq j} \frac{e^2}{| \mathbf{r}_i - \mathbf{r}_j |} \tag{639}
$$
The expectation value of the Hamiltonian in terms of a Slater determinant $\Phi$ of a complete set of unspecified single-electron wave functions $\phi_{\alpha,\sigma}(\mathbf{r})$ is given by
$$
\langle \hat{H} \rangle = \left[ \prod_{i=1}^{N_e} \int d^3 r_i \right] \Phi^*_{\alpha_1 , \ldots \alpha_{N_e}}(\mathbf{r}_1 , \ldots \mathbf{r}_{N_e}) \; \hat{H} \; \Phi_{\alpha_1 , \ldots \alpha_{N_e}}(\mathbf{r}_1 , \ldots \mathbf{r}_{N_e}) \tag{640}
$$
The expectation value of the energy is evaluated as
$$
\begin{aligned}
E = {} & \sum_{\alpha} \int d^3 r \; \phi^*_{\alpha}(\mathbf{r}) \left[ - \frac{\hbar^2}{2m} \nabla^2 + V_{ions}(\mathbf{r}) \right] \phi_{\alpha}(\mathbf{r}) \\
& + \frac{1}{2} \sum_{\alpha , \beta} \int d^3 r \int d^3 r' \; \phi^*_{\alpha}(\mathbf{r}) \, \phi^*_{\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \phi_{\beta}(\mathbf{r}') \, \phi_{\alpha}(\mathbf{r}) \\
& - \frac{1}{2} \sum_{\alpha , \beta} \int d^3 r \int d^3 r' \; \phi^*_{\alpha}(\mathbf{r}) \, \phi^*_{\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \phi_{\alpha}(\mathbf{r}') \, \phi_{\beta}(\mathbf{r})
\end{aligned}
\tag{641}
$$
where the sums over $\alpha$ and $\beta$ run over all the single particle quantum numbers labelling the Slater determinant $\Phi$. The first term just represents the sum of one-particle energies of the electrons. The second term represents the interaction energy between an electron and the average charge density of all the electrons. The last term is the exchange term; it arises due to the Coulomb interaction and the anti-symmetry of the many-electron wave function. The spin indices have been suppressed in the expression for the energy. The quantum number $\alpha$ needs to be supplemented by the spin quantum number $\sigma$ to uniquely specify the state, and $\phi_{\alpha}(\mathbf{r}) \rightarrow \phi_{\alpha}(\mathbf{r}) \, \chi_{\sigma}$. Therefore, in the matrix elements there is not only an integration over $\mathbf{r}$, but the matrix elements of the spin states also have to be evaluated.

The single-electron wave functions are to be chosen such that they minimize the energy, subject to the constraint that they remain normalized to unity. Hence, subject to this condition, the single-electron wave functions are chosen such that the first order variation of the energy is identically equal to zero. The minimization is performed by using the Lagrange method of undetermined multipliers. First, one forms the functional $\Omega$, which is the average value of the Hamiltonian minus the $N_e$ constraints that ensure that the one-electron wave functions are normalized to unity. The functional $\Omega$ is given by
$$
\begin{aligned}
\Omega = {} & \left[ \prod_{i=1}^{N_e} \int d^3 r_i \right] \Phi^*_{\alpha_1 , \ldots \alpha_{N_e}}(\mathbf{r}_1 , \ldots \mathbf{r}_{N_e}) \; \hat{H} \; \Phi_{\alpha_1 , \ldots \alpha_{N_e}}(\mathbf{r}_1 , \ldots \mathbf{r}_{N_e}) \\
& - \sum_{i=1}^{N_e} \lambda_{\alpha_i} \left( \int d^3 r_i \; \phi^*_{\alpha_i}(\mathbf{r}_i) \, \phi_{\alpha_i}(\mathbf{r}_i) - 1 \right)
\end{aligned}
\tag{642}
$$
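where the $\lambda_{\alpha}$ are the undetermined multipliers. Since $\phi_{\alpha}$ is an arbitrary complex function, the real and imaginary parts are independent. Instead of working with the real and imaginary parts, we shall consider the function $\phi_{\alpha}$ and its complex conjugate $\phi^*_{\alpha}$ as being independent. The second step of the Lagrange method consists of considering the effect of varying the set of $\phi^*_{\alpha}$. The deviations of the variational functions $\phi^*_{\alpha}(\mathbf{r})$ from the extremal functions, $\phi^*_{HF,\alpha}(\mathbf{r})$, are denoted by $\delta\phi^*_{\alpha}$, i.e.,
$$
\phi^*_{\alpha}(\mathbf{r}) = \phi^*_{HF,\alpha}(\mathbf{r}) + \delta\phi^*_{\alpha}(\mathbf{r}) \tag{643}
$$
To first order in the deviation $\delta\phi^*_{\alpha}(\mathbf{r})$, the functional $\Omega$ changes by an amount $\delta\Omega$. The change $\delta\Omega$ is evaluated as
$$
\begin{aligned}
\delta\Omega = {} & \sum_{\alpha} \int d^3 r \; \delta\phi^*_{\alpha}(\mathbf{r}) \left[ - \frac{\hbar^2}{2m} \nabla^2 + V_{ions}(\mathbf{r}) - \lambda_{\alpha} \right] \phi_{HF,\alpha}(\mathbf{r}) \\
& + \sum_{\alpha , \beta} \int d^3 r \int d^3 r' \; \delta\phi^*_{\alpha}(\mathbf{r}) \, \phi^*_{HF,\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \phi_{HF,\beta}(\mathbf{r}') \, \phi_{HF,\alpha}(\mathbf{r}) \\
& - \sum_{\alpha , \beta} \int d^3 r \int d^3 r' \; \phi^*_{HF,\beta}(\mathbf{r}) \, \delta\phi^*_{\alpha}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \phi_{HF,\beta}(\mathbf{r}') \, \phi_{HF,\alpha}(\mathbf{r})
\end{aligned}
\tag{644}
$$
The expression for $\delta\Omega$ must vanish identically for any of the independent and arbitrary variations $\delta\phi^*_{\alpha}(\mathbf{r})$, if the Hartree-Fock wave functions $\phi_{HF,\alpha}(\mathbf{r})$ minimize the average energy. In order for this to be true, for each value of $\alpha$, the coefficient of $\delta\phi^*_{\alpha}(\mathbf{r})$ must vanish identically. After interchanging the variables $\mathbf{r}$ and $\mathbf{r}'$ in the last term, one finds that the normalized Hartree-Fock wave functions must satisfy the set of equations
$$
\begin{aligned}
0 = {} & \left[ - \frac{\hbar^2}{2m} \nabla^2 + V_{ions}(\mathbf{r}) - \lambda_{\alpha} \right] \phi_{HF,\alpha}(\mathbf{r}) \\
& + \sum_{\beta} \int d^3 r' \; \phi^*_{HF,\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \phi_{HF,\beta}(\mathbf{r}') \, \phi_{HF,\alpha}(\mathbf{r}) \\
& - \sum_{\beta} \int d^3 r' \; \phi^*_{HF,\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \phi_{HF,\alpha}(\mathbf{r}') \, \phi_{HF,\beta}(\mathbf{r})
\end{aligned}
\tag{645}
$$
in order to minimize the energy. To simplify further analysis, we shall explicitly display the spin dependence by writing
$$
\begin{aligned}
\phi_{HF,\alpha}(\mathbf{r}) & = \psi_{\alpha}(\mathbf{r}) \, \chi_{\sigma} \\
\phi_{HF,\beta}(\mathbf{r}) & = \psi_{\beta}(\mathbf{r}) \, \chi_{\sigma'}
\end{aligned}
\tag{646}
$$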
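This notation recognizes that the spatial component of the wave function, $\psi_{\alpha}(\mathbf{r})$, depends on all the quantum numbers represented by $\alpha$, including the spin quantum number, as in the un-restricted Hartree-Fock approximation. The Hartree-Fock equations are re-written as
$$
\begin{aligned}
0 = {} & \left[ - \frac{\hbar^2}{2m} \nabla^2 + V_{ions}(\mathbf{r}) - \lambda_{\alpha} \right] \psi_{\alpha}(\mathbf{r}) \, \chi_{\sigma} \\
& + \sum_{\beta} \int d^3 r' \; \chi^T_{\sigma'} \, \psi^*_{\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \psi_{\beta}(\mathbf{r}') \, \chi_{\sigma'} \; \psi_{\alpha}(\mathbf{r}) \, \chi_{\sigma} \\
& - \sum_{\beta} \int d^3 r' \; \chi^T_{\sigma'} \, \psi^*_{\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \psi_{\alpha}(\mathbf{r}') \, \chi_{\sigma} \; \psi_{\beta}(\mathbf{r}) \, \chi_{\sigma'}
\end{aligned}
\tag{647}
$$
In the inner product, the integration over the position $\mathbf{r}'$ of the spatial component of the wave function is combined with the matrix elements of the spin wave functions. The spin matrix elements are given by
$$
\chi^T_{\sigma'} \, \chi_{\sigma} = \delta_{\sigma' , \sigma} \tag{648}
$$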
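Since the Coulomb interaction is spin independent, the last term contains a Kronecker delta function that is non-vanishing only when $\sigma = \sigma'$. The set of Hartree-Fock equations are eigenvalue equations for a non-local linear operator
$$
\begin{aligned}
0 = {} & \left[ - \frac{\hbar^2}{2m} \nabla^2 + V_{ions}(\mathbf{r}) - \lambda_{\alpha} \right] \psi_{\alpha}(\mathbf{r}) \\
& + \sum_{\beta} \int d^3 r' \; \psi^*_{\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \psi_{\beta}(\mathbf{r}') \, \psi_{\alpha}(\mathbf{r}) \\
& - \sum_{\beta} \delta_{\sigma' , \sigma} \int d^3 r' \; \psi^*_{\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \psi_{\beta}(\mathbf{r}) \, \psi_{\alpha}(\mathbf{r}')
\end{aligned}
\tag{649}
$$
There is one such equation for each value of $\alpha$. In solving the above equations for $\psi_{\alpha}(\mathbf{r})$, one should consider the functions $\psi_{\beta}(\mathbf{r})$ as known quantities. In this case, the eigenvalue equations are linear in the eigenfunctions, $\psi_{\alpha}$, and the undetermined multipliers, $\lambda_{\alpha}$, are the eigenvalues. The term proportional to
$$
V_{direct}(\mathbf{r}) = \sum_{\beta} \int d^3 r' \; \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, | \psi_{\beta}(\mathbf{r}') |^2 \tag{650}
$$
represents a contribution to the potential from the average electrostatic potential due to all the electrons in the system. This potential includes the contribution from the electron in state $\alpha$. This potential is independent of the spin states of the electrons, and is called the direct interaction. The last term in the Hartree-Fock equation is non-local, as it relates the unknown eigenfunction $\psi_{\alpha}(\mathbf{r})$ to the weighted average of the unknown eigenfunction at other points in space, $\psi_{\alpha}(\mathbf{r}')$. The non-local potential represented by
$$
V^{\sigma}_{exch}(\mathbf{r} , \mathbf{r}') = - \sum_{\beta} \delta_{\sigma , \sigma'} \, \psi^*_{\beta}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \psi_{\beta}(\mathbf{r}) \tag{651}
$$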
is called the exchange interaction. Since the Coulomb interaction is spin independent, the matrix elements in the non-local exchange potential are non-zero only if the spin of state $\alpha$ is identical to the spin of state $\beta$. If the spins are anti-parallel, the exchange term is zero. Thus, the exchange term is spin dependent. With this notation, the Hartree-Fock equations can be written as
$$
\left[ - \frac{\hbar^2}{2m} \nabla^2 + V_{ion}(\mathbf{r}) + V_{direct}(\mathbf{r}) \right] \psi_{\alpha}(\mathbf{r}) + \int d^3 r' \; V^{\sigma}_{exch}(\mathbf{r} , \mathbf{r}') \, \psi_{\alpha}(\mathbf{r}') = \lambda_{\alpha} \, \psi_{\alpha}(\mathbf{r}) \tag{652}
$$
This set of equations can be solved iteratively. Using approximations for the direct and exchange potentials, one can solve the equations to find a set of wave functions which are approximations for the $\psi_{\alpha}(\mathbf{r})$. These approximate wave functions are then used to construct new approximations for the direct and exchange potentials. The procedure is repeated until self-consistency is achieved.

The contributions to the direct and exchange potentials arising from the state where $\beta = \alpha$ exactly cancel in the non-local operator. Therefore, there are no self interaction terms in the Hartree-Fock approximation. Since the cancelling terms may be retained in both sums, the linear potential operator is the same for all the single-electron wave functions.

The Hartree-Fock approximation can be solved exactly for the free electron gas, in which the potential of the lattice of ions is replaced by a constant value. This (unrealistic) uniform potential is of special importance, since the solution is often used as a starting point for discussing the electronic structure of a non-uniform electron gas. Specifically, the most common method of determining electronic structure, the local density functional method, utilizes the expression for the ground state energy of the uniform electron gas.
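The iterative procedure described above has the structure of a damped fixed-point iteration. The following is a minimal sketch of that structure only, under loud assumptions: a single number stands in for the full set of orbitals and potentials, `toy_update` is a hypothetical map playing the role of "construct the potentials, solve the eigenvalue equations, return the new quantities", and the linear mixing parameter is an illustrative choice rather than part of the Hartree-Fock equations themselves.

```python
import math


def self_consistent_loop(update, initial_guess, mixing=0.5,
                         tolerance=1e-10, max_iterations=1000):
    """Iterate x -> update(x) with linear mixing until x stops changing.

    Returns the converged value and the number of iterations used.
    """
    current = initial_guess
    for iteration in range(1, max_iterations + 1):
        proposed = update(current)  # "solve with the current potentials"
        # Mix old and new values to damp oscillations between iterations
        new = (1.0 - mixing) * current + mixing * proposed
        if abs(new - current) < tolerance:
            return new, iteration
        current = new
    raise RuntimeError("self-consistency was not reached")


def toy_update(x):
    """Hypothetical stand-in for one Hartree-Fock cycle (not a physical model)."""
    return math.cos(x)


if __name__ == "__main__":
    value, iterations = self_consistent_loop(toy_update, initial_guess=1.0)
    print(value, iterations)
```

In a real calculation the scalar would be replaced by the orbitals (or the density), and the convergence test by a norm of the change between successive iterations.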
9.2.1
The Free Electron Gas.
The Hamiltonian for the free electron gas is invariant under all translations and, as long as the translational symmetry is not spontaneously broken, the Hartree-Fock eigenstates should be simultaneous eigenstates of the momentum operator. Thus, the Hartree-Fock equations for a uniform potential $V_{ions} = V_0$ should have the eigenfunctions
$$
\psi_{k,\sigma}(\mathbf{r}) = \frac{1}{\sqrt{V}} \exp\left[ i \, \mathbf{k} \cdot \mathbf{r} \right] \chi_{\sigma} \tag{653}
$$
where $V$ is the volume of the crystal. It can be seen that this is true by substituting the wave functions into the Hartree-Fock eigenvalue equations. The charge density due to the electrons is a constant, and this combines with the uniform charge density from the background gas of ions. Due to charge neutrality, the resulting net direct Coulomb potential from the total charge density vanishes
$$
V_{ions}(\mathbf{r}) + V_{direct}(\mathbf{r}) = 0 \tag{654}
$$
In order to evaluate the exchange potential, one has to perform the sum over values of $\mathbf{k}' , \sigma'$. The sum over $\mathbf{k}' , \sigma'$ only runs over the occupied states. We shall assume that the Hartree-Fock state does not spontaneously break the spin rotational symmetry and lead to magnetism. Likewise, we shall also assume that the Hartree-Fock solution does not break translational invariance. Magnetic solutions which also break translational invariance have been found by Overhauser (A.W. Overhauser, Phys. Rev. Letts. 4, 462 (1960), Phys. Rev. 128, 1437 (1962)) and also by Kohn and Nettel (W. Kohn and S.J. Nettel, Phys. Rev. Letts. 5, 8 (1960)). In the non-magnetic translationally invariant case, the Hartree-Fock states are spin degenerate, and the one-particle states are filled according to the magnitude of the kinetic energy. All the one-particle states labelled by $(\mathbf{k} , \sigma)$, where $\mathbf{k}$ is contained inside a sphere of radius $k_F$, are filled with electrons. The spin-dependent exchange term is evaluated as
$$
V_{exch}(\mathbf{r} , \mathbf{r}') = - \sum_{|k'| \leq k_F , \, \sigma'} \delta_{\sigma , \sigma'} \, \frac{1}{V} \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \exp\left[ i \, \mathbf{k}' \cdot ( \mathbf{r} - \mathbf{r}' ) \right] \tag{655}
$$
The exchange potential also has translational invariance, and so it is possible that plane waves are eigenfunctions of the Hartree-Fock equations. The exchange potential is evaluated from
$$
\begin{aligned}
V_{exch}(\mathbf{r} , \mathbf{r}') & = - \frac{1}{( 2 \pi )^3} \int_0^{2\pi} d\varphi \int_0^{\pi} d\theta \, \sin\theta \int_0^{k_F} dk' \, k'^2 \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \exp\left[ i \, \mathbf{k}' \cdot ( \mathbf{r} - \mathbf{r}' ) \right] \\
& = - \frac{2 \pi e^2}{( 2 \pi )^3} \int_0^{k_F} dk' \, k' \left( \frac{ \exp\left[ + i \, k' | \mathbf{r} - \mathbf{r}' | \right] - \exp\left[ - i \, k' | \mathbf{r} - \mathbf{r}' | \right] }{ i \, | \mathbf{r} - \mathbf{r}' |^2 } \right)
\end{aligned}
\tag{656}
$$
The integration over $k'$ can be performed with the aid of an identity obtained by differentiating the expression
$$
\int_0^1 dx \, \cos \alpha x = \frac{\sin \alpha}{\alpha} \tag{657}
$$
with respect to $\alpha$. That is,
$$
\int_0^1 dx \, x \, \sin \alpha x = \frac{\sin \alpha}{\alpha^2} - \frac{\cos \alpha}{\alpha} \tag{658}
$$
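As a quick sanity check, the identity (658) can be verified numerically. The sketch below is illustrative only; the midpoint-rule integrator, its step count and the function names are assumptions and are not part of the original derivation.

```python
import math


def moment_integral(alpha, steps=20000):
    """Midpoint-rule estimate of the integral of x sin(alpha x) over [0, 1]."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        total += x * math.sin(alpha * x)
    return total * width


def closed_form(alpha):
    """Right-hand side of (658)."""
    return math.sin(alpha) / alpha ** 2 - math.cos(alpha) / alpha


if __name__ == "__main__":
    for alpha in (0.5, 2.0, 7.3):
        print(alpha, moment_integral(alpha), closed_form(alpha))
```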
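The resulting expression for the exchange potential is
$$
V_{exch}(\mathbf{r} , \mathbf{r}') = - \frac{e^2 k_F^4}{2 \pi^2} \left( \frac{ \sin k_F | \mathbf{r} - \mathbf{r}' | }{ ( k_F | \mathbf{r} - \mathbf{r}' | )^4 } - \frac{ \cos k_F | \mathbf{r} - \mathbf{r}' | }{ ( k_F | \mathbf{r} - \mathbf{r}' | )^3 } \right) \tag{659}
$$
The long-ranged oscillatory behavior of the exchange potential is due to the sharp cut off of the integration at $k_F$. This cut off occurs as the Fermi-wave vector $k_F$ is the largest wave vector associated with the occupied one-electron states. The contribution of the exchange potential to the energy eigenvalue $\lambda_k$ can be found from
$$
\int d^3 r' \; V_{exch}(\mathbf{r} , \mathbf{r}') \, \psi_k(\mathbf{r}') = \frac{1}{\sqrt{V}} \exp\left[ i \, \mathbf{k} \cdot \mathbf{r} \right] \int d^3 r' \; \exp\left[ i \, \mathbf{k} \cdot ( \mathbf{r}' - \mathbf{r} ) \right] V_{exch}(\mathbf{r} , \mathbf{r}') \tag{660}
$$
Thus, the contribution to the eigenvalue stemming from the exchange potential is just the Fourier transform of the exchange term, $V_{exch}(k)$,
$$
V_{exch}(k) = - \frac{e^2 k_F^4}{2 \pi^2} \int d^3 R \; \exp\left[ i \, \mathbf{k} \cdot \mathbf{R} \right] \left( \frac{\sin k_F R}{( k_F R )^4} - \frac{\cos k_F R}{( k_F R )^3} \right) \tag{661}
$$
which can be evaluated directly. An alternate method involves using the convolution theorem, in which case the expression
$$
V_{exch}(k) = - \frac{V}{( 2 \pi )^3} \int d^3 r' \; \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \int_{|k'| \leq k_F} d^3 k' \; \psi^*_{k'}(\mathbf{r}') \, \psi_{k'}(\mathbf{r}) \, \exp\left[ i \, \mathbf{k} \cdot ( \mathbf{r}' - \mathbf{r} ) \right] \tag{662}
$$
can be used. The plane wave nature of the eigenfunctions can be utilized to write the expression as
$$
V_{exch}(k) = - \frac{V}{( 2 \pi )^3} \int d^3 r' \; \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \int_{|k'| \leq k_F} d^3 k' \; | \psi_{k'}(\mathbf{r}') |^2 \, \exp\left[ i \, ( \mathbf{k}' - \mathbf{k} ) \cdot ( \mathbf{r} - \mathbf{r}' ) \right] \tag{663}
$$
The electron density, per spin, arising from state $\mathbf{k}'$ is just $| \psi_{k'}(\mathbf{r}') |^2 = \frac{1}{V}$ for $| \mathbf{k}' | \leq k_F$. Since this is independent of $\mathbf{r}'$, the exchange contribution to the eigenvalue involves the Fourier transform of the Coulomb potential. The Fourier transform of the exchange potential is found as
$$
V_{exch}(k) = - \int d^3 r' \; \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \frac{1}{( 2 \pi )^3} \int_{|k'| \leq k_F} d^3 k' \; \exp\left[ i \, ( \mathbf{k}' - \mathbf{k} ) \cdot ( \mathbf{r} - \mathbf{r}' ) \right] \tag{664}
$$
Hence, the expression for the exchange contribution to the eigenvalue $\lambda_k$ is given by
$$
\begin{aligned}
V_{exch}(k) & = - \frac{1}{( 2 \pi )^3} \int_{|k'| \leq k_F} d^3 k' \; \frac{4 \pi e^2}{| \mathbf{k} - \mathbf{k}' |^2} \\
& = - \frac{e^2}{\pi k} \int_0^{k_F} dk' \, k' \, \ln \frac{| k + k' |}{| k - k' |}
\end{aligned}
\tag{665}
$$
The integral can be evaluated as
$$
V_{exch}(k) = - \frac{2 e^2}{\pi} \, k_F \, F\left( \frac{k}{k_F} \right) \tag{666}
$$
where
$$
F(x) = \frac{1}{2} + \frac{1 - x^2}{4 x} \, \ln \frac{| 1 + x |}{| 1 - x |} \tag{667}
$$
At $k = 0$, the function $F(0)$ is unity. At $k = k_F$, the function falls to the value $F(1) = \frac{1}{2}$ and has a logarithmic singularity in the slope. This singularity in the slope is due to the long-ranged nature of the Coulomb interaction ( $\frac{4 \pi}{k^2}$ ). The function $F(x)$ falls to zero in the limit $x \rightarrow \infty$. Thus, the eigenvalue $\lambda_k$ is given by
$$
\lambda_k = \frac{\hbar^2 k^2}{2m} - \frac{2 e^2}{\pi} \, k_F \, F\left( \frac{k}{k_F} \right) \tag{668}
$$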
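The function $F(x)$ of (667) and the eigenvalue $\lambda_k$ of (668) are simple to tabulate. The following sketch is illustrative: the function names are hypothetical, the small-$x$ series $F(x) \approx 1 - x^2/3$ is used only to avoid dividing by zero, and the eigenvalue is expressed in Hartree atomic units ($\hbar = m = e = 1$), which is an assumption made here for convenience.

```python
import math


def exchange_f(x):
    """F(x) of (667), with the limiting values at x = 0 and x = 1 handled explicitly."""
    x = abs(x)
    if x < 1e-6:
        return 1.0 - x * x / 3.0  # leading terms of the small-x expansion
    if x == 1.0:
        return 0.5  # (1 - x^2) ln|...| tends to zero at x = 1
    return 0.5 + (1.0 - x * x) / (4.0 * x) * math.log(abs(1.0 + x) / abs(1.0 - x))


def hartree_fock_eigenvalue(k, k_f=1.0):
    """lambda_k of (668) in atomic units (hbar = m = e = 1)."""
    return 0.5 * k * k - (2.0 / math.pi) * k_f * exchange_f(k / k_f)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0, 100.0):
        print(x, exchange_f(x), hartree_fock_eigenvalue(x))
```

In these units the occupied band width is $\lambda_{k_F} - \lambda_0 = \frac{1}{2} k_F^2 + \frac{k_F}{\pi}$, so exchange widens the band relative to the free electron value $\frac{1}{2} k_F^2$.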
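The total energy of the electron system is given by the sum of the kinetic energy and the exchange energy
$$
\begin{aligned}
E_{HF} & = 2 \sum_{\mathbf{k}} \frac{\hbar^2 k^2}{2m} - \sum_{\mathbf{k} , \mathbf{k}'} \int d^3 r \int d^3 r' \; \psi^*_k(\mathbf{r}) \, \psi^*_{k'}(\mathbf{r}') \, \frac{e^2}{| \mathbf{r} - \mathbf{r}' |} \, \psi_k(\mathbf{r}') \, \psi_{k'}(\mathbf{r}) \\
& = \sum_{\mathbf{k}} \left[ \frac{\hbar^2 k^2}{2m} + \lambda_k \right]
\end{aligned}
\tag{669}
$$
where the summations are restricted to the values of $\mathbf{k}$ and $\mathbf{k}'$ which are within the Fermi-sphere. The Hartree-Fock energy can be re-expressed as
$$
E_{HF} = 2 \sum_{k \leq k_F} \frac{\hbar^2 k^2}{2m} - \frac{2 e^2 k_F}{\pi} \sum_{k \leq k_F} \left[ \frac{1}{2} + \frac{k_F^2 - k^2}{4 \, k \, k_F} \, \ln \frac{| k_F + k |}{| k_F - k |} \right] \tag{670}
$$
The summations over $\mathbf{k}$ can be evaluated by transforming them into integrals
$$
\begin{aligned}
E_{HF} & = 2 \, \frac{4 \pi V}{( 2 \pi )^3} \int_{k \leq k_F} dk \, k^2 \, \frac{\hbar^2 k^2}{2m} - \frac{2 e^2}{\pi} \, k_F \, \frac{4 \pi V}{( 2 \pi )^3} \int_{k \leq k_F} dk \, k^2 \left[ \frac{1}{2} + \frac{k_F^2 - k^2}{4 \, k \, k_F} \, \ln \frac{| k_F + k |}{| k_F - k |} \right] \\
& = \frac{V}{\pi^2} \, \frac{\hbar^2 k_F^5}{10 \, m} - \frac{V e^2}{\pi^3} \, k_F^4 \left( \frac{1}{3} - \frac{1}{12} \right)
\end{aligned}
\tag{671}
$$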
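The exchange integral in (671) can be checked numerically. In terms of $x = k / k_F$, the integral is $k_F^3 \int_0^1 dx \, x^2 F(x)$, and the numerical factor multiplying $V e^2 k_F^4 / \pi^3$ in (671) is the value of $\int_0^1 dx \, x^2 F(x)$. The sketch below evaluates this with a midpoint rule; the integrator, its step count and the function names are assumptions made for illustration.

```python
import math


def exchange_f(x):
    """F(x) of (667); the midpoint rule below never samples x = 0 or x = 1."""
    return 0.5 + (1.0 - x * x) / (4.0 * x) * math.log(abs(1.0 + x) / abs(1.0 - x))


def midpoint_integral(func, lower, upper, steps=20000):
    """Midpoint-rule estimate of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


def exchange_moment():
    """Dimensionless integral of x^2 F(x) over the Fermi-sphere, 0 <= x <= 1."""
    return midpoint_integral(lambda x: x * x * exchange_f(x), 0.0, 1.0)


if __name__ == "__main__":
    moment = exchange_moment()
    print("integral of x^2 F(x):", moment)
    # Exchange energy per electron in units of e^2 k_F, using N_e = V k_F^3 / (3 pi^2)
    print("exchange energy per electron / (e^2 k_F):", -3.0 * moment / math.pi)
```

The result is consistent with $\frac{1}{3} - \frac{1}{12} = \frac{1}{4}$, and hence with an exchange energy per electron of $- \frac{3}{4} \frac{e^2 k_F}{\pi}$.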
The number of electrons, per spin, $\frac{N_e}{2}$ is given by
$$
\frac{N_e}{2} = \frac{V}{8 \pi^3} \, \frac{4 \pi}{3} \, k_F^3 \tag{672}
$$
Using this, the Hartree-Fock approximation for the cohesive energy of the free electron gas can be expressed as
$$
E_{HF} = N_e \left[ \frac{3}{5} \, \frac{\hbar^2 k_F^2}{2m} - \frac{3}{4} \, \frac{e^2 k_F}{\pi} \right] \tag{673}
$$
An alternative expression is given by introducing a characteristic dimension, or radius $r_s$, such that there exists one electron in a sphere of radius $r_s a_0$, where $a_0$ is the Bohr radius ( $a_0 = \frac{\hbar^2}{m e^2}$ ). Then, the uniform electron density, $\rho$, is given by the equivalent expressions
$$
\frac{1}{\rho} = \frac{4 \pi}{3} \, a_0^3 \, r_s^3 = \frac{3 \pi^2}{k_F^3} \tag{674}
$$
Thus, the magnitude of the Fermi-wave vector $k_F$ is given by
$$
k_F = \left( \frac{9 \pi}{4} \right)^{\frac{1}{3}} \frac{1}{r_s a_0} \tag{675}
$$
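and so the electronic energy is expressed as
$$
\begin{aligned}
\frac{E_{HF}}{N_e} & = \frac{3}{10} \, \frac{\hbar^2}{m a_0^2} \left( \frac{9 \pi}{4} \right)^{\frac{2}{3}} \frac{1}{r_s^2} - \frac{3}{4} \, \frac{e^2}{\pi a_0} \left( \frac{9 \pi}{4} \right)^{\frac{1}{3}} \frac{1}{r_s} \\
& = \frac{e^2}{2 a_0} \left[ \frac{3}{5} \left( \frac{9 \pi}{4} \right)^{\frac{2}{3}} \frac{1}{r_s^2} - \frac{3}{2 \pi} \left( \frac{9 \pi}{4} \right)^{\frac{1}{3}} \frac{1}{r_s} \right] \\
& = \left[ \frac{2.21}{r_s^2} - \frac{0.9163}{r_s} \right] \mbox{Rydbergs}
\end{aligned}
\tag{676}
$$
where 1 Rydberg $= \frac{e^2}{2 a_0}$. The Hartree-Fock energy has a minimum at the $r_s$ value given by $r_s \sim 4.8$ and has a cohesive energy of about 0.1 Rydbergs. Typical materials have spatially varying densities, hence, the local value of $r_s$ also varies. For a hydrogen-like atom, the ground state density is given by
$$
\rho(r) = \frac{Z^3}{\pi a_0^3} \, \exp\left[ - \frac{2 Z r}{a_0} \right] \tag{677}
$$
Therefore, typical values of $r_s$ are given by the density at the nuclear position
$$
r_s = \frac{ ( \frac{3}{4} )^{\frac{1}{3}} }{Z} = \frac{0.9086}{Z} \tag{678}
$$
and at the first Bohr radius $r = a_0 / Z$
$$
r_s = \frac{ ( \frac{3}{4} )^{\frac{1}{3}} \, e^{\frac{2}{3}} }{Z} = \frac{1.7696}{Z} \tag{679}
$$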
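The numerical coefficients quoted in (676), the position of the Hartree-Fock minimum, and the hydrogen-like $r_s$ values of (678) and (679) can be reproduced with a few lines of arithmetic. The sketch below is illustrative; the function and constant names are hypothetical, energies are in Rydbergs and lengths in units of $a_0$.

```python
import math

# Coefficients of (676), in Rydbergs
KINETIC = 0.6 * (9.0 * math.pi / 4.0) ** (2.0 / 3.0)
EXCHANGE = (3.0 / (2.0 * math.pi)) * (9.0 * math.pi / 4.0) ** (1.0 / 3.0)


def hf_energy_per_electron(r_s):
    """Hartree-Fock energy per electron of (676), in Rydbergs."""
    return KINETIC / r_s ** 2 - EXCHANGE / r_s


def optimal_r_s():
    """r_s at which dE/dr_s = 0, i.e. -2 KINETIC / r_s^3 + EXCHANGE / r_s^2 = 0."""
    return 2.0 * KINETIC / EXCHANGE


def hydrogenic_r_s(r_over_a0, z=1.0):
    """Local r_s from the hydrogen-like density (677), using 1/rho = (4 pi / 3) r_s^3."""
    density = z ** 3 / math.pi * math.exp(-2.0 * z * r_over_a0)  # in units of a0^-3
    return (3.0 / (4.0 * math.pi * density)) ** (1.0 / 3.0)


if __name__ == "__main__":
    print("coefficients:", KINETIC, EXCHANGE)
    print("minimum at r_s =", optimal_r_s(),
          "with energy", hf_energy_per_electron(optimal_r_s()), "Rydbergs")
    print("hydrogen r_s at the nucleus:", hydrogenic_r_s(0.0))
    print("hydrogen r_s at r = a0:", hydrogenic_r_s(1.0))
```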
Since for metals the density of electrons corresponds to $r_s$ values in the range of 2 to 5, the exchange term is of similar magnitude to the kinetic energy term. The Hartree-Fock approximation indicates that the cohesive energy is largest for low density metals, i.e., those with $r_s \sim 5$. In the particular case of the free electron gas where the lattice potential is zero, the Hartree-Fock approximation coincides with first order perturbation theory. If higher order terms are included (M. Gell-Mann and K. Brueckner, Phys. Rev. 106, 347 (1957), W.J. Carr and A.A. Maradudin, Phys. Rev. A 133, 371 (1964)), one obtains the expression for the energy per electron
$$
\frac{E}{N_e} = \frac{2.21}{r_s^2} - \frac{0.9163}{r_s} + 0.06218 \, \ln r_s - 0.094 + O(r_s) \tag{680}
$$
in units of $\frac{e^2}{2 a_0}$. The energy has the form of an expansion in $r_s$, valid for $r_s < 1$. Thus, the Hartree-Fock result can be thought of as an approximation which reproduces the high density limit ( small $r_s$ limit ) correctly. The other terms in the expression are due to electron correlations.

A completely different behavior is expected to occur in the low density limit. In reducing the density from the high density metallic limit to the low density limit, the system is expected to undergo a transition to a Wigner crystal phase (E.P. Wigner, Phys. Rev. 46, 1002 (1934)). In a Wigner crystal, the electrons are expected to localize in a b.c.c. structure. The total energy is expected to be dominated by the electrostatic interaction and the energies of the vibrations of the electronic lattice (W.J. Carr, R.A. Coldwell-Horsfall, and A.E. Fein, Phys. Rev. 124, 747 (1961)). The energy of the Wigner crystalline phase is given by
$$
\frac{E}{N_e} = \frac{e^2}{2 a_0} \left[ - \frac{1.792}{r_s} + \frac{2.65}{r_s^{\frac{3}{2}}} - \frac{0.73}{r_s^2} + \ldots \right] \tag{681}
$$
for $r_s \gg 1$.

The electronic wave functions described by a Slater determinant are not devoid of correlations. The correlations are a result of the Pauli exclusion principle. The two-particle density-density correlation function for a single Slater determinant can be written as
$$
\begin{aligned}
\rho_2(\mathbf{r} , \mathbf{r}') & = \frac{1}{2} \sum_{\alpha , \beta} | \phi_{\alpha}(\mathbf{r}) \, \phi_{\beta}(\mathbf{r}') - \phi_{\beta}(\mathbf{r}) \, \phi_{\alpha}(\mathbf{r}') |^2 \\
& = \sum_{\alpha} | \phi_{\alpha}(\mathbf{r}) |^2 \sum_{\beta} | \phi_{\beta}(\mathbf{r}') |^2 - \sum_{\alpha , \beta} \phi^*_{\alpha}(\mathbf{r}) \, \phi_{\beta}(\mathbf{r}) \, \phi^*_{\beta}(\mathbf{r}') \, \phi_{\alpha}(\mathbf{r}')
\end{aligned}
\tag{682}
$$
On making the spin dependence explicit, by writing
$$
\begin{aligned}
\phi_{\alpha}(\mathbf{r}) & = \psi_{\alpha}(\mathbf{r}) \, \chi_{\sigma} \\
\phi_{\beta}(\mathbf{r}) & = \psi_{\beta}(\mathbf{r}) \, \chi_{\sigma'}
\end{aligned}
\tag{683}
$$
one finds that the two-particle density-density correlation function is given by
$$
\begin{aligned}
\rho_2(\mathbf{r} , \mathbf{r}') & = \sum_{\alpha} | \psi_{\alpha}(\mathbf{r}) |^2 \sum_{\beta} | \psi_{\beta}(\mathbf{r}') |^2 - \sum_{\alpha , \beta} \delta_{\sigma , \sigma'} \, \psi^*_{\alpha}(\mathbf{r}) \, \psi_{\alpha}(\mathbf{r}') \, \psi^*_{\beta}(\mathbf{r}') \, \psi_{\beta}(\mathbf{r}) \\
& = \rho(\mathbf{r}) \, \rho(\mathbf{r}') - \sum_{\sigma} G_{\sigma}(\mathbf{r}' , \mathbf{r}) \, G_{\sigma}(\mathbf{r} , \mathbf{r}')
\end{aligned}
\tag{684}
$$
where $G_{\sigma}$ is given by a sum over the single-particle states labelled by $\alpha$ which have the spin quantum number $\sigma$
$$
G_{\sigma}(\mathbf{r} , \mathbf{r}') = \sum_{\alpha} \psi^*_{\alpha}(\mathbf{r}') \, \psi_{\alpha}(\mathbf{r}) \tag{685}
$$
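The equivalence of the two forms of the two-particle correlation function, (682) and (684), can be checked on a small example. The sketch below is an illustration under stated assumptions: the orbitals are the real one-dimensional particle-in-a-box states $\sqrt{2/L} \sin( n \pi x / L )$, all with the same spin, and the function names are hypothetical. These orbitals are not the plane waves of the free electron gas; they merely provide a concrete orthonormal set.

```python
import math


def orbital(n, x, box_length=1.0):
    """Real particle-in-a-box orbital (an illustrative choice of single-particle state)."""
    return math.sqrt(2.0 / box_length) * math.sin(n * math.pi * x / box_length)


def pair_density_direct(x, x_prime, n_occupied):
    """First line of (682): half the sum of squared antisymmetrized pair amplitudes."""
    total = 0.0
    for a in range(1, n_occupied + 1):
        for b in range(1, n_occupied + 1):
            amplitude = (orbital(a, x) * orbital(b, x_prime)
                         - orbital(b, x) * orbital(a, x_prime))
            total += 0.5 * amplitude * amplitude
    return total


def pair_density_exchange_form(x, x_prime, n_occupied):
    """(684) for a single spin species: rho(x) rho(x') - G(x', x) G(x, x')."""
    states = range(1, n_occupied + 1)
    density_x = sum(orbital(n, x) ** 2 for n in states)
    density_x_prime = sum(orbital(n, x_prime) ** 2 for n in states)
    g = sum(orbital(n, x_prime) * orbital(n, x) for n in states)  # real and symmetric
    return density_x * density_x_prime - g * g


if __name__ == "__main__":
    print(pair_density_direct(0.3, 0.7, 4), pair_density_exchange_form(0.3, 0.7, 4))
    print("coincident points:", pair_density_exchange_form(0.4, 0.4, 4))
```

For coincident points the parallel-spin pair density vanishes, as required by the Pauli exclusion principle.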
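The last term in the two-particle density-density correlation function is the exchange term. The exchange term originates from pairs of electrons with parallel spins. In the Hartree-Fock approximation for the free electron gas, the exchange contribution to the two-particle density-density correlation function $\rho_2(\mathbf{r} , \mathbf{r}')$ is expressed in terms of the factors
$$
\begin{aligned}
G_{\sigma}(\mathbf{r} , \mathbf{r}') & = \sum_{|k'| < k_F} \psi^*_{k'}(\mathbf{r}') \, \psi_{k'}(\mathbf{r}) \\
& = \frac{1}{V} \sum_{|k'| < k_F} \exp\left[ i \, \mathbf{k}' \cdot ( \mathbf{r} - \mathbf{r}' ) \right] \\
& = \frac{k_F^3}{2 \pi^2} \left( \frac{ \sin k_F | \mathbf{r} - \mathbf{r}' | }{ ( k_F | \mathbf{r} - \mathbf{r}' | )^3 } - \frac{ \cos k_F | \mathbf{r} - \mathbf{r}' | }{ ( k_F | \mathbf{r} - \mathbf{r}' | )^2 } \right)
\end{aligned}
\tag{686}
$$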
where the summation is over the Fermi-sphere. The density-density correlation function shows a hole in the density of parallel spin electrons around the electron, and the parallel spin contribution vanishes as $| \mathbf{r} - \mathbf{r}' | \rightarrow 0$, as expected from the Pauli exclusion principle. The exchange potential has a similar form and can be thought of as arising from a deficiency in the density of parallel spins around an electron at $\mathbf{r}$. The Hartree-Fock approximation is deficient in that it does not include a similar correlation hole between electrons with anti-parallel spins.

In the Hartree-Fock approximation, the energies of the excited states are given by Koopmans' theorem (T.A. Koopmans, Physica 1, 104 (1933)). That is, the energy for adding or removing an electron from the system is given by the eigenvalue $\lambda_k$, if the other one-electron states in the many-particle Slater determinant are not changed, or equivalently, if the other electrons in the ground state are not re-arranged. Thus, in the Hartree-Fock approximation, the quasi-particle energies are given by
$$
E_{qp}(k) = \lambda_k \tag{687}
$$
The quasi-particle density of states, per spin, is given by
$$
\begin{aligned}
\rho_{qp}(E) & = \sum_{\mathbf{k}} \delta( E - E_{qp}(k) ) \\
& = \frac{V}{2 \pi^2} \int_0^{k_F} dk \, k^2 \, \delta( E - E_{qp}(k) ) \\
& = \frac{V k^2}{2 \pi^2} \left( \frac{dE_{qp}(k)}{dk} \right)^{-1} \Bigg|_{k(E)}
\end{aligned}
\tag{688}
$$
where $k(E)$ is the value of $k$ that satisfies the equation
$$
E_{qp}(k) = E \tag{689}
$$
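The behavior of the density of states (688) near the Fermi-energy can be explored numerically, using only the eigenvalue (668) and a finite-difference estimate of the slope $dE_{qp}/dk$. The sketch below is illustrative: it uses Hartree atomic units ($\hbar = m = e = 1$) with $k_F = 1$, and the function names and the step size of the central difference are assumptions.

```python
import math


def exchange_f(x):
    """F(x) of (667) for 0 < x < 1 (the points used below avoid x = 0 and x = 1)."""
    return 0.5 + (1.0 - x * x) / (4.0 * x) * math.log(abs(1.0 + x) / abs(1.0 - x))


def quasi_particle_energy(k, k_f=1.0):
    """E_qp(k) = lambda_k of (668) and (687), in atomic units."""
    return 0.5 * k * k - (2.0 / math.pi) * k_f * exchange_f(k / k_f)


def eigenvalue_slope(k, step=1e-6):
    """Central-difference estimate of dE_qp/dk."""
    return (quasi_particle_energy(k + step) - quasi_particle_energy(k - step)) / (2.0 * step)


def density_of_states_per_volume(k):
    """rho_qp / V of (688), evaluated at the wave vector k."""
    return k * k / (2.0 * math.pi ** 2) / eigenvalue_slope(k)


if __name__ == "__main__":
    for k in (0.5, 0.9, 0.99, 0.999, 0.9999):
        print(k, eigenvalue_slope(k), density_of_states_per_volume(k))
```

In this sketch the slope grows without bound, and the density of states decreases, as $k$ approaches $k_F$ from below.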
From the above, one sees that at the Fermi-energy defined by
$$
E_F = E_{qp}(k_F) \tag{690}
$$
the quasi-particle density of states is zero since
$$
\frac{dE_{qp}(k)}{dk} = \frac{\hbar^2 k}{m} - \frac{e^2 k_F}{\pi k} + \frac{e^2}{\pi} \, \frac{( k_F^2 + k^2 )}{2 k^2} \, \ln \frac{| k_F + k |}{| k_F - k |} \tag{691}
$$
which diverges logarithmically at $k(E_F) = k_F$. Thus, the Hartree-Fock approximation for the free electron gas is of limited utility in discussing properties of real metals. This is caused by the divergent slope of the one-electron eigenvalues near the Fermi-surface. This spurious divergence, caused by the neglect of screening, results in the one-electron density of states falling to zero just at the Fermi-energy.
——————————————————————————————————
9.2.2
Exercise 43
Show, using perturbation theory, that the second order correction to the energy of a free electron gas is given by
$$
\Delta E^{(2)} = - \frac{m}{\hbar^2} \sum_{\mathbf{k} , \mathbf{k}' , \mathbf{q}} \left( \frac{4 \pi e^2}{V q^2} \right)^2 \frac{1}{ \mathbf{q} \cdot ( \mathbf{k} - \mathbf{k}' + \mathbf{q} ) } \tag{692}
$$
where $k < k_F$, $k' < k_F$, $| \mathbf{k} + \mathbf{q} | > k_F$ and $| \mathbf{k}' - \mathbf{q} | > k_F$. Since this integral is dominated by the region $q \rightarrow 0$, the values of $k \sim k_F$ and $k' \sim k_F$. Show that the contribution to $\Delta E^{(2)}$ is proportional to
$$
\int \frac{d^3 q}{q^3} = 4 \pi \int \frac{dq}{q} = 4 \pi \ln q \tag{693}
$$
and, thus, diverges for $q \rightarrow 0$. Simple second order perturbation theory does not work for the free electron gas. Nonetheless, perturbation theory can be applied by using more elaborate techniques which take into account the screening of the Coulomb interaction.
——————————————————————————————————
9.3
The Density Functional Method
The density functional method provides an exact method for calculating the electron density and ground state energy for interacting electrons in the presence of a crystalline potential. As such, it can be used to determine the stability of various lattice structures. It can also be used to determine ground state properties or static properties of the electronic systems, such as those provided by elastic scattering experiments. It is based on the Hohenberg and Kohn Theorem (P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964)). The Hohenberg-Kohn theorem considers a many-electron Hamiltonian in which the form of the Coulomb interaction term between pairs of electrons, $V_{int}(\mathbf{r} , \mathbf{r}')$, is known, but in which the one-particle potential due to the ion cores is considered to be an external potential. Thus, the external potential $V_{ext}(\mathbf{r})$ varies between one crystal structure and the next. The Hamiltonian is written as the sum of the kinetic energy of the electrons and the interaction and external potentials acting on each electron. The Hohenberg-Kohn theorem associates every non-degenerate many-body ground state wave function with a unique external potential. A second map exists between the many-body ground state wave function and the ground state electron density. Therefore, the expectation value of any ground state property can be expressed in terms of a unique functional of the ground state density. Having established this, the ground state properties and electron density can then be evaluated from the Rayleigh-Ritz variational principle for the ground state energy, in which the electron density is the function to be varied. This implies that, if one can construct the unique energy functional which contains the external potential due to the lattice, then one can find the ground state energy and electron density. This functional is not known; however, it is customary to make the local density approximation. In this approximation, an un-testable assumption is made about the electron-electron interactions in a non-uniform electron gas. The method also generates eigenvalues which are interpreted in terms of the energies of independent Bloch electrons.
The energy dispersion relations generated this way do show a marked similarity with the experimentally determined bands of simple metals. The basis for density functional theory is provided by a theorem proved by Hohenberg and Kohn (P. Hohenberg and W. Kohn, Phys. Rev. 136 B864 (1964)).
9.3.1
Hohenberg-Kohn Theorem
The Hohenberg-Kohn theorem first assures us that the electron density in a solid, ρ(r), uniquely specifies the electrostatic interaction potential between the Ne electrons and the ionic lattice. Thus, the density ρ(r) can be used as the basic variable. Furthermore, the energy can be expressed as a unique functional of the energy density involving the potential due to the ionic lattice. This establishes a variational principle which can be used to calculate the electron density, ρ(r), and the total energy of the electronic system. 193
First, it shall be assumed that $V_{ions}(r)$ is not uniquely specified if $\rho(r)$ is given. That is, it is assumed that there exist at least two potentials $V$ and $V'$ which give rise to the same ground state electron density. These potentials are related to the exact ground state many-particle wave functions via the energy eigenvalue equations,
$$ \hat{H} \, \Psi(r_1, \ldots, r_{N_e}) = E \, \Psi(r_1, \ldots, r_{N_e}) \tag{694} $$
and
$$ \hat{H}' \, \Psi'(r_1, \ldots, r_{N_e}) = E' \, \Psi'(r_1, \ldots, r_{N_e}) \tag{695} $$
From the Rayleigh-Ritz variational principle, one finds that the primed wave function $\Psi'(r_1, \ldots, r_{N_e})$ provides an upper bound to the ground state energy of the unprimed Hamiltonian $\hat{H}$,
$$ E = \left[ \prod_{i=1}^{N_e} \int d^3 r_i \right] \Psi^*(r_1, \ldots, r_{N_e}) \, \hat{H} \, \Psi(r_1, \ldots, r_{N_e}) < \left[ \prod_{i=1}^{N_e} \int d^3 r_i \right] \Psi'^*(r_1, \ldots, r_{N_e}) \, \hat{H} \, \Psi'(r_1, \ldots, r_{N_e}) \tag{696} $$
However, as the primed and unprimed Hamiltonians are related through
$$ \hat{H} = \hat{H}' + \hat{V} - \hat{V}' \tag{697} $$
and as $\Psi'$ is the ground state of $\hat{H}'$ with energy eigenvalue $E'$, the energies satisfy the inequality
$$ E < E' + \int d^3 r \; \rho(r) \left( V(r) - V'(r) \right) \tag{698} $$
By similar reasoning, it can also be shown that the energies satisfy the inequality
$$ E' < E + \int d^3 r \; \rho(r) \left( V'(r) - V(r) \right) \tag{699} $$
in which the primed and unprimed quantities are interchanged. The assumption that the ground state densities of the primed and unprimed Hamiltonians are equal has been used. Adding these two inequalities leads to an inconsistency
$$ E + E' < E + E' \tag{700} $$
Therefore, the assumption that the same ground state density can be found for two different potentials is false. Furthermore, the potentials can, at most, only differ by a constant $V'(r) - V(r)$. Thus, the ground state electron density $\rho(r)$ must correspond to a unique $V(r)$. This means that the electron density, $\rho(r)$, can be taken to be the principal variable.
9.3.2
Functionals and Functional Derivatives
As a mathematical prelude, we shall define functionals and functional derivatives. A functional is a generalization of a function. A function $f(r)$ can be defined as a mapping which maps each point in space, $r$, to a number. The value of the number depends on the position of the point. The functional is similar in that it maps a scalar function onto a number. The value of the functional, $F[\rho]$, depends upon the function $\rho(r)$, i.e., the values of the function $\rho$ at each point in space. Functionals are usually expressed in terms of integrals over space, often as multiple integrals. A simple example of a functional is given by the number of electrons $N_e[\rho]$, which is a functional of the density. The number of electrons is given by
$$ N_e[\rho] = \int d^3 r \; \rho(r) \tag{701} $$
It is a functional as different densities may correspond to different numbers of particles, i.e., Na has a different density than Li and they have different numbers of electrons. The classical Coulomb energy is a more interesting functional. The Coulomb energy is defined as the pairwise sum of interactions
$$ E_{Coul}[\rho] = \frac{e^2}{2} \int d^3 r \int d^3 r' \; \frac{\rho(r) \, \rho(r')}{| r - r' |} \tag{702} $$
This yields a number which is the value of the energy, and this number depends on the density at all points of space.

Given a functional $F[\rho]$, one can define a functional derivative. The definition of the functional derivative is similar to the definition of a derivative of a function. However, instead of defining the derivative in terms of the difference of the function at two nearby points, one defines the functional derivative in terms of the difference of the functional for two functions that are close. For example, an arbitrary family of functions, $\rho'(r)$, can be defined in terms of a fixed function $\rho(r)$ and an arbitrary deviation $\delta\rho(r)$ via
$$ \rho'(r) = \rho(r) + \lambda \, \delta\rho(r) \tag{703} $$
The scale factor $\lambda$ varies from unity to zero continuously. When $\lambda = 1$, this relation defines the shape of the deviation $\delta\rho(r)$. If $\lambda$ is changed continuously to zero, the differences between the function $\rho'$ and the fixed function $\rho$ vanish. The shape of the deviation $\lambda \, \delta\rho(r)$ is arbitrary and does not change; only the magnitude of the deviation changes. The functional derivative can be expressed in terms of the limit of the difference of the functional evaluated at these two functions. If one assumes that one may Taylor expand the functional in powers of $\lambda$, one has
$$ F[\rho'] = F[\rho] + \lambda \, \delta^1 F[\rho, \delta\rho] + \frac{1}{2} \lambda^2 \, \delta^2 F[\rho, \delta\rho] + \ldots \tag{704} $$
since the differences now depend on two functions $\rho$ and $\delta\rho$. If one defines the terms of first order in $\lambda$ to have the form
$$ \delta^1 F[\rho, \delta\rho] = \int d^3 r \; \frac{\delta F[\rho]}{\delta\rho(r)} \, \delta\rho(r) \tag{705} $$
then the quantity
$$ \frac{\delta F[\rho]}{\delta\rho(r)} \tag{706} $$
is independent of the shape of the deviation, and is defined to be the first order functional derivative. Sometimes a functional may depend on the derivatives of $\rho$, i.e.,
$$ F[\rho] = \int d^3 r \; f(r, \rho, \nabla\rho) \tag{707} $$
In this case, one can define a functional derivative in terms of the partial derivatives,
$$ \frac{\partial f}{\partial \rho} \tag{708} $$
and the vector quantity
$$ \frac{\partial f}{\partial \nabla\rho} \tag{709} $$
etc., where the functions $\rho$ and $\nabla\rho$ etc. are treated as independent variables. This yields the first order variation as
$$ \delta^1 F[\rho, \delta\rho] = \int d^3 r \left[ \delta\rho \, \frac{\partial f}{\partial \rho} + \nabla\delta\rho \cdot \frac{\partial f}{\partial \nabla\rho} \right] \tag{710} $$
If the functions $\rho$ satisfy appropriate conditions at the boundaries of the integration, the equation can be integrated by parts to eliminate the gradient of the deviation from the term
$$ \int d^3 r \; ( \nabla\delta\rho ) \cdot \frac{\partial f}{\partial \nabla\rho} \tag{711} $$
In this case, the first order functional derivative is evaluated as
$$ \frac{\delta F[\rho]}{\delta\rho(r)} = \frac{\partial f}{\partial \rho} - \nabla \cdot \frac{\partial f}{\partial \nabla\rho} \tag{712} $$
The extension to functionals containing higher order derivatives is quite straightforward.
An alternative method of evaluating functional derivatives is based on the observation that the functional derivative is independent of the variation $\delta\rho$. Since $\delta\rho$ is arbitrary, one may choose $\delta\rho$ to have any particular form. A variation in the form of a Dirac delta function proves to be a useful choice
$$ \delta\rho(r) = \delta^3(r - r') \tag{713} $$
since, for this particular choice, the value of $\delta^1 F[\rho, \delta^3(r - r')]$ is given by
$$ \delta^1 F[\rho, \delta^3(r - r')] = \frac{\delta F[\rho]}{\delta\rho(r')} \tag{714} $$
An example of the first order functional derivative is given by the functional derivative of the Coulomb energy
$$ \frac{\delta E_{Coul}[\rho]}{\delta\rho(r_1)} = \frac{e^2}{2} \int d^3 r \; \frac{\rho(r)}{| r - r_1 |} + \frac{e^2}{2} \int d^3 r' \; \frac{\rho(r')}{| r_1 - r' |} = e^2 \int d^3 r \; \frac{\rho(r)}{| r - r_1 |} \tag{715} $$
In obtaining the final expression, we have relabelled the variable of integration. The first order functional derivative of the monomial functional
$$ F_n[\rho] = \int d^3 r \; \rho(r)^n \tag{716} $$
is simply evaluated as
$$ \frac{\delta F_n[\rho]}{\delta\rho(r_1)} = n \, \rho(r_1)^{n-1} \tag{717} $$
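The delta function method can be checked numerically. The following is a minimal sketch, assuming a one-dimensional grid in place of three-dimensional space and an arbitrary Gaussian test density; the function names and grid parameters are illustrative choices and are not part of the notes. A discretized delta function of small weight `eps` is added at one grid point, and the resulting change of $F_n[\rho]$ divided by `eps` is compared with $n \rho^{n-1}$ from Eq. (717).

```python
import math

def monomial_functional(rho, dx, n):
    """F_n[rho] = integral of rho(x)^n dx, approximated by a Riemann sum."""
    return sum(value ** n for value in rho) * dx

def functional_derivative(functional, rho, j, dx, eps=1e-6):
    """Approximate dF/drho(x_j) by adding a discretized delta function.

    The variation eps * delta(x - x_j) becomes eps / dx on grid point j,
    so that its integral over the grid equals eps.
    """
    bumped = list(rho)
    bumped[j] += eps / dx
    return (functional(bumped) - functional(rho)) / eps

# Hypothetical test density: a Gaussian on a uniform grid (an assumption).
dx = 0.01
grid = [-3.0 + i * dx for i in range(601)]
rho = [math.exp(-x * x) for x in grid]

n = 3
j = 300  # grid point closest to x = 0
numerical = functional_derivative(
    lambda r: monomial_functional(r, dx, n), rho, j, dx)
analytic = n * rho[j] ** (n - 1)  # right-hand side of Eq. (717)
print(numerical, analytic)
```

The two numbers agree up to corrections of order `eps / dx`, which is the finite height of the discretized delta function.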
The delta function method also proves useful for evaluating functional derivatives of higher orders. The first order functional derivative is often encountered in variational principles. In a variational principle, there exists a function $\rho(r)$ which yields an extremal value of the functional. That is, if the function is changed by an arbitrary small variation $\lambda \delta\rho$ away from the extremal function, the functional does not change to first order. On regarding the functional $F[\rho + \lambda\delta\rho]$ as a function of $\lambda$, the extremal condition is equivalent to
$$ \left. \frac{\partial}{\partial\lambda} F[\rho + \lambda\delta\rho] \right|_{\lambda = 0} = 0 \tag{718} $$
since the value of the functional does not change to order $\lambda$ as $\lambda$ approaches zero. This equation is satisfied for an arbitrary shape $\delta\rho(r)$, if the functional derivative is identically zero
$$ \frac{\delta F[\rho]}{\delta\rho(r)} = 0 \tag{719} $$
for all $r$. The extremal function $\rho(r)$ must satisfy this extremal condition for all $r$. Often, the extremal condition provides an integro-differential equation that can be used to uniquely determine $\rho(r)$. The above condition only guarantees that $F[\rho]$ is an extremum. In order that the functional $F[\rho]$ is minimized, we require that
$$ \delta^2 F[\rho, \delta\rho] > 0 \tag{720} $$
for every $\delta\rho$. The second order functional derivative is defined via
$$ \delta^2 F[\rho, \delta\rho] = \int d^3 r \int d^3 r' \; \frac{\delta^2 F[\rho]}{\delta\rho(r) \, \delta\rho(r')} \, \delta\rho(r) \, \delta\rho(r') \tag{721} $$
On using the choice
$$ \delta\rho(r) = \delta^3(r - r_1) \tag{722} $$
for the deviation, centered at $r_1$, in the first derivative and the choice
$$ \delta\rho'(r') = \delta^3(r' - r_2) \tag{723} $$
when differentiating the second time, one obtains
$$ \delta^2 F[\rho, \delta\rho, \delta\rho'] = \frac{\delta^2 F[\rho]}{\delta\rho(r_1) \, \delta\rho(r_2)} \tag{724} $$
As an example, the second order functional derivative of the Coulomb energy is found to be
$$ \frac{\delta^2 E_{Coul}[\rho]}{\delta\rho(r_1) \, \delta\rho(r_2)} = \frac{e^2}{| r_1 - r_2 |} \tag{725} $$
A second example is provided by the functional derivative of the monomial
$$ F_n[\rho] = \int d^3 r \; \rho(r)^n \tag{726} $$
for real $\rho$. For this functional, the second order functional derivative has the form
$$ \frac{\delta^2 F_n[\rho]}{\delta\rho(r_1) \, \delta\rho(r_2)} = \delta^3(r_1 - r_2) \; n \, ( n - 1 ) \, \rho(r_1)^{n-2} \tag{727} $$
etc.
9.3.3
The Variational Principle
Hohenberg and Kohn defined an energy functional of the electron density
$$ E[\rho] = F[\rho] + \int d^3 r \; V_{ions}(r) \, \rho(r) \tag{728} $$
in which the functional $F[\rho]$ depends on the kinetic energy $\hat{T}$ given by
$$ \hat{T} = - \frac{\hbar^2}{2 m} \sum_{i=1}^{N_e} \nabla_i^2 \tag{729} $$
and the electron-electron interaction energy, $\hat{V}_{int}$, given by
$$ \hat{V}_{int} = \frac{1}{2} \sum_{i \neq j} \frac{e^2}{| r_i - r_j |} \tag{730} $$
The functional $F[\rho]$ can be evaluated as
$$ F[\rho] = \left[ \prod_{i=1}^{N_e} \int d^3 r_i \right] \Psi^*(r_1, \ldots, r_{N_e}) \left( \hat{T} + \hat{V}_{int} \right) \Psi(r_1, \ldots, r_{N_e}) \tag{731} $$
The functional $F[\rho]$ is a universal functional of $\rho$, as the functional $F[\Psi]$ is a universal functional of $\Psi$. Furthermore, as will be shown, the energy of the electronic system $E$ is given by the minimum value of the functional $E[\rho]$ where $\rho(r)$ is the correct ground state density of an $N_e$ electron system associated with the lattice potential $V_{ions}(r)$. In fact, $E$ is the minimum value of $E[\rho]$ evaluated for the set of functions, $\rho(r)$, which correspond to the $N_e$-electron ground state densities of arbitrary potentials. Such densities are known as $V$-representable densities. Not all densities are $V$-representable. Let $\rho'(r) \neq \rho(r)$ be an arbitrary density associated with some many-body wave function $\Psi' \neq \Psi$, that is not the ground state of our system. The ground state energy is defined by
$$ E = \left[ \prod_{i=1}^{N_e} \int d^3 r_i \right] \Psi^*(r_1, \ldots, r_{N_e}) \, \hat{H} \, \Psi(r_1, \ldots, r_{N_e}) = E[\rho] \tag{732} $$
The Rayleigh-Ritz variational principle asserts that the expectation values of the Hamiltonian satisfy the inequality
$$ \left[ \prod_{i=1}^{N_e} \int d^3 r_i \right] \Psi^*(r_1, \ldots, r_{N_e}) \, \hat{H} \, \Psi(r_1, \ldots, r_{N_e}) \leq \left[ \prod_{i=1}^{N_e} \int d^3 r_i \right] \Psi'^*(r_1, \ldots, r_{N_e}) \, \hat{H} \, \Psi'(r_1, \ldots, r_{N_e}) \tag{733} $$
and so
$$ E[\rho] \leq E[\rho'] \tag{734} $$
This establishes the minimum principle for the energy functional
$$ \frac{\delta E[\rho]}{\delta\rho(r)} = 0 \tag{735} $$
subject to the constraint that the total number of electrons is fixed
$$ \int d^3 r \; \rho(r) = N_e \tag{736} $$
The condition that $\rho$ is $V$-representable may be replaced by the less stringent condition that $\rho$ is $N$-representable (M. Levy, Proc. Nat. Acad. Sci. 76, 6062 (1979)), which only requires
$$ \rho(r) > 0 , \qquad \int d^3 r \; \rho(r) = N_e , \qquad \int d^3 r \; | \nabla \rho^{\frac{1}{2}}(r) |^2 < \infty \tag{737} $$
Having established the existence of the variational principle, the precise form of the functional remains to be determined.
9.3.4
The Electrostatic Terms
Hohenberg and Kohn suggest that one should separate the long-ranged classical Coulomb energy of the electrons from the functional $F[\rho]$. This term represents the average Coulomb interaction with the electrons in the system and, therefore, represents the Hartree terms. That is, the energy functional representing the kinetic and electron-electron interaction energies is written as
$$ F[\rho] = \frac{e^2}{2} \int d^3 r \int d^3 r' \; \frac{\rho(r) \, \rho(r')}{| r - r' |} + G[\rho] \tag{738} $$
The total energy functional is given by
$$ E[\rho] = \int d^3 r \; V_{ions}(r) \, \rho(r) + \frac{e^2}{2} \int d^3 r \int d^3 r' \; \frac{\rho(r) \, \rho(r')}{| r - r' |} + G[\rho] \tag{739} $$
The electrostatic potential $\phi_{es}(r)$ is given by the sum of the potential due to the lattice of ions and the electron-electron interaction
$$ - | e | \, \phi_{es}(r) = V_{ions}(r) + e^2 \int d^3 r' \; \frac{\rho(r')}{| r - r' |} \tag{740} $$
This potential may be obtained directly from Poisson's equation from the density of the ions and electrons
$$ - \nabla^2 \phi_{es}(r) = 4 \pi | e | \left( Z \rho_{ions}(r) - \rho(r) \right) \tag{741} $$
where $| e |$ is the magnitude of the charge on the electron. The electrostatic potential determines the chemical potential through the variational procedure. The energy functional is minimized with respect to variations of $\rho(r)$ subject to the constraint that the density is normalized to $N_e$. This is performed by using Lagrange's method of undetermined multipliers. The method consists of constructing the functional $\Omega[\rho]$ as
$$ \Omega[\rho] = E[\rho] - \mu \left( \int d^3 r \; \rho(r) - N_e \right) \tag{742} $$
Then on writing $\rho'(r) = \rho(r) + \lambda \, \delta\rho(r)$ and Taylor expanding in $\lambda$ one has
$$ \Omega[\rho'] = \Omega[\rho] + \lambda \int d^3 r \; \frac{\delta\Omega[\rho]}{\delta\rho(r)} \, \delta\rho(r) + \frac{\lambda^2}{2} \int d^3 r \int d^3 r' \; \frac{\delta^2 \Omega[\rho]}{\delta\rho(r) \, \delta\rho(r')} \, \delta\rho(r) \, \delta\rho(r') + \ldots \tag{743} $$
The extremal condition
$$ \frac{\delta\Omega[\rho]}{\delta\rho(r)} = 0 \tag{744} $$
becomes
$$ \frac{\delta E[\rho]}{\delta\rho(r)} = \mu \tag{745} $$
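The first order functional derivative of $E$ is evaluated from the Taylor expansion by retaining the terms of first order in $\lambda$. The first order term in $E$, $\delta^1 E$, is evaluated as
$$ \delta^1 E = \int d^3 r \; \delta\rho(r) \, V_{ions}(r) + \frac{e^2}{2} \int d^3 r \int d^3 r' \left[ \delta\rho(r) \, \frac{\rho(r')}{| r - r' |} + \frac{\rho(r)}{| r - r' |} \, \delta\rho(r') \right] + \int d^3 r \; \frac{\delta G[\rho]}{\delta\rho(r)} \, \delta\rho(r) \tag{746} $$
On interchanging the variables of integration $r$ and $r'$ in the second part of the Coulomb term and combining it with the first, one obtains
$$ \delta^1 E = \int d^3 r \; \delta\rho(r) \left[ V_{ions}(r) + e^2 \int d^3 r' \; \frac{\rho(r')}{| r - r' |} + \frac{\delta G[\rho]}{\delta\rho(r)} \right] \tag{747} $$
Since the first two terms are identified with the electrostatic potential, the functional derivative is given by
$$ \frac{\delta E[\rho]}{\delta\rho(r)} = V_{ions}(r) + e^2 \int d^3 r' \; \frac{\rho(r')}{| r - r' |} + \frac{\delta G[\rho]}{\delta\rho(r)} = - | e | \, \phi_{es}(r) + \frac{\delta G[\rho]}{\delta\rho(r)} \tag{748} $$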
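Hence, $\Omega$ is minimized if $\rho$ satisfies the equation
$$ - | e | \, \phi_{es}(r) + \frac{\delta G[\rho]}{\delta\rho(r)} = \mu \tag{749} $$
For large $N_e$, $\mu$ is equal to the chemical potential given by
$$ \mu = \frac{\partial E}{\partial N_e} \tag{750} $$
9.3.5
The Kohn-Sham Equations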
The Kohn-Sham equations provide a formal correspondence between the many-body problem and an effective (non-interacting) one-body problem (W. Kohn and L.J. Sham, Phys. Rev. 140, A1133 (1965)). This allows the kinetic energy term in the energy functional to be determined. The kinetic energy functional $T[\rho]$ can be defined via
$$ T[\rho] = \left[ \prod_{i=1}^{N_e} \int d^3 r_i \right] \Psi^*(r_1, \ldots, r_{N_e}) \, \hat{T} \, \Psi(r_1, \ldots, r_{N_e}) \tag{751} $$
so the non-electrostatic contribution to the energy functional may be written as the sum
$$ G[\rho] = T[\rho] + E_{xc}[\rho] \tag{752} $$
which defines the exchange and correlation functional $E_{xc}[\rho]$. The variational principle for the density functional gives
$$ - | e | \, \phi_{es}(r) + \frac{\delta E_{xc}[\rho]}{\delta\rho(r)} + \frac{\delta T[\rho]}{\delta\rho(r)} = \mu \tag{753} $$
Thus, the quantity
$$ - | e | \, \phi_{es}(r) + \frac{\delta E_{xc}[\rho]}{\delta\rho(r)} \tag{754} $$
plays the role of an effective potential, $V_{eff}[\rho, r]$, which not only depends on $r$, but is also a functional of $\rho$. The effective potential is given by
$$ V_{eff}[\rho, r] = - | e | \, \phi_{es}(r) + \frac{\delta E_{xc}[\rho]}{\delta\rho(r)} \tag{755} $$
Thus, minimizing the energy functional entails solving the equation
$$ V_{eff}[\rho, r] + \frac{\delta T[\rho]}{\delta\rho(r)} = \mu \tag{756} $$
Formally, this is equivalent to solving for the ground state of a (non-interacting) problem with the energy functional given by
$$ E_s[\rho] = T[\rho] + \int d^3 r \; V_s(r) \, \rho(r) \tag{757} $$
in which the electron-electron interaction terms are absent. The variational procedure leads to
$$ V_s(r) + \frac{\delta T[\rho]}{\delta\rho(r)} = \mu \tag{758} $$
Since the particles are non-interacting, this equation is solved exactly by finding the single-particle wave functions $\phi_{s,\alpha}$ which make up the single Slater determinant that represents the non-interacting ground state. The set of single-particle wave functions are given as the solutions of the eigenvalue equation
$$ \left[ - \frac{\hbar^2}{2 m} \nabla^2 + V_s(r) \right] \phi_{s,\alpha}(r) = E_{s,\alpha} \, \phi_{s,\alpha}(r) \tag{759} $$
and then the electron density is given by
$$ \rho(r) = \sum_{i=1}^{N_e} | \phi_{s,\alpha_i}(r) |^2 \tag{760} $$
By analogy, one can find the solution of the effective one-body eigenvalue equation
$$ \left[ - \frac{\hbar^2}{2 m} \nabla^2 + V_{eff}[\rho, r] \right] \phi_{eff,\alpha}(r) = \lambda_{eff,\alpha} \, \phi_{eff,\alpha}(r) \tag{761} $$
and the electron density is given by
$$ \rho(r) = \sum_{i=1}^{N_e} | \phi_{eff,\alpha_i}(r) |^2 \tag{762} $$
The value of the kinetic energy functional for this effective one-body problem can be found from the eigenvalues $\lambda_{eff,\alpha_i}$ by
$$ T[\rho] = \sum_{i=1}^{N_e} \lambda_{eff,\alpha_i} - \int d^3 r \; V_{eff}[\rho, r] \, \rho(r) \tag{763} $$
Thus, one also has to minimize the sum of the effective one-body eigenvalues
$$ \sum_{i=1}^{N_e} \lambda_{eff,\alpha_i} \tag{764} $$
This shows that the Kohn-Sham equations provide a method of obtaining the kinetic energy functional and of minimizing the energy functional. Although Kohn-Sham eigenvalues $\lambda_{eff,\alpha}$ are often used to describe electron excitation energies, they have no physical meaning. In general, the method only provides the ground state energy and ground state electron density. However, there is a density functional analogue of Koopmans' theorem: the eigenvalue of the highest occupied effective single-particle level is the Fermi-energy. All the non-trivial information about the many-body ground state is contained in the exchange and correlation functional. This is usually approximated in an uncontrolled fashion by using the local density functional approximation.
9.3.6
The Local Density Approximation
In the Kohn-Sham equations, the remaining unknown quantity is the exchange and correlation functional $E_{xc}[\rho]$. This contains the information about the many-body interactions. The Local Density Approximation is motivated by an assumption, namely, that this functional can be represented as an integral over all space of a function of $\rho$. This assumes that the functional has no non-local terms or, equivalently, that the non-local terms in the density functional can be expanded in powers of the gradient. This expansion could be justifiable if the density $\rho(r)$ was slowly varying in space. The first few terms of the gradient expansion of the exchange-correlation energy would be
$$ E_{xc}[\rho] = \int d^3 r \left[ E_{xc0}(\rho(r)) + E_{xc2}(\rho(r)) \, | \nabla \rho(r) |^2 + \ldots \right] \tag{765} $$
where the coefficients $E_{xc0}(\rho(r))$ and $E_{xc2}(\rho(r))$ are ordinary functions of the density. The local density approximation neglects the gradient terms and uses the same form of the exchange-correlation function $E_{xc0}(\rho(r))$ as pertains to the free electron gas. In the free electron gas, the electron density $\rho$ is independent of $r$. However, in the local density approximation, the uniform density appearing in the expressions for the uniform electron gas is replaced by the local electron density. The exchange and correlation terms of the local density approximation are taken from the free electron gas. The energy of the free electron gas is written as
$$ E = N_e \left[ \frac{3}{5} \frac{\hbar^2 k_F^2}{2 m} - \frac{3}{4 \pi} e^2 k_F - 0.0311 \, \frac{m e^4}{\hbar^2} \ln k_F + O(1) \right] \tag{766} $$
where the first term is due to the kinetic energy, and the second term is the exchange energy. The final term is the leading term in the high density expansion of the electron correlation energy, as evaluated by Gell-Mann and Brueckner (M. Gell-Mann and K. Brueckner, Phys. Rev. 106, 364 (1957)). For the free electron gas, the electrostatic interaction energy between the electrons and the smeared out lattice of ions cancels identically with the Hartree term. To obtain the exchange-correlation energy, the kinetic energy term is omitted to find
$$ E_{xc} = N_e \left[ - \frac{3}{4 \pi} e^2 k_F - 0.0311 \, \frac{m e^4}{\hbar^2} \ln k_F + O(1) \right] \tag{767} $$
Combining this together with the two relations
$$ k_F = \left( 3 \pi^2 \rho \right)^{\frac{1}{3}} \tag{768} $$
and
$$ N_e = V \rho \tag{769} $$
the exchange-correlation term can be expressed as
$$ E_{xc} = V \rho \left[ - \frac{3 e^2}{4 \pi} \left( 3 \pi^2 \rho \right)^{\frac{1}{3}} - 0.0104 \, \frac{m e^4}{\hbar^2} \ln \rho + O(1) \right] \tag{770} $$
The exchange-correlation energy in the local density approximation is simply given by
$$ E_{xc}[\rho] = \int d^3 r \; \rho(r) \left[ - \frac{3 e^2}{4 \pi} \left( 3 \pi^2 \rho(r) \right)^{\frac{1}{3}} - 0.0104 \, \frac{m e^4}{\hbar^2} \ln \rho(r) + O(1) \right] \tag{771} $$
Since the effective potential is given by the sum
$$ V_{eff}[\rho, r] = - | e | \, \phi_{es}(r) + \frac{\delta E_{xc}[\rho]}{\delta\rho(r)} \tag{772} $$
the local density approximation for the exchange and correlation energy functional contributes a term to the potential in the Kohn-Sham equations of
$$ V_{xc}[\rho] = - \frac{e^2}{\pi} \left( 3 \pi^2 \rho(r) \right)^{\frac{1}{3}} - 0.0104 \, \frac{m e^4}{\hbar^2} \ln \rho(r) + O(1) \tag{773} $$
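The step from Eq. (771) to Eq. (773) is a functional derivative of a local functional, which reduces to an ordinary derivative of the integrand with respect to $\rho$. The following is a minimal sketch that checks the exchange term numerically, assuming atomic units ($e^2 = 1$) and an arbitrary test density; the function names are illustrative. It also shows where the coefficient 0.0104 of the logarithm comes from.

```python
import math

def exchange_energy_density(rho):
    """Exchange energy per unit volume in atomic units (e^2 = 1):
    rho * [-(3 / (4 pi)) * (3 pi^2 rho)^(1/3)], first term of Eq. (771)."""
    return -rho * (3.0 / (4.0 * math.pi)) * (3.0 * math.pi ** 2 * rho) ** (1.0 / 3.0)

def exchange_potential(rho):
    """Exchange part of V_xc in Eq. (773): -(1 / pi) * (3 pi^2 rho)^(1/3)."""
    return -(1.0 / math.pi) * (3.0 * math.pi ** 2 * rho) ** (1.0 / 3.0)

def numerical_derivative(f, x, h=1e-6):
    """Central finite difference approximation to df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

rho = 0.02  # hypothetical density in electrons per cubic Bohr radius
v_numeric = numerical_derivative(exchange_energy_density, rho)
v_formula = exchange_potential(rho)
print(v_numeric, v_formula)

# Substituting k_F = (3 pi^2 rho)^(1/3) gives ln k_F = (1/3) ln rho + constant,
# so the coefficient 0.0311 of ln k_F becomes 0.0311 / 3 for ln rho.
log_coefficient = 0.0311 / 3.0
print(round(log_coefficient, 4))
```

The factor of 4/3 produced by differentiating $\rho^{4/3}$ is what changes the exchange coefficient from $3/(4\pi)$ in the energy to $1/\pi$ in the potential.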
which adds to the electrostatic potential. The first term comes from the exchange interaction, and has the form that was originally proposed by J.C. Slater but has a different coefficient (J.C. Slater, Phys. Rev. 81, 385 (1951)). The higher order terms come from the correlation energy. In practice, the form of the exchange-correlation energy that is used as an input to the local density approximation is a form which interpolates between the high density limit and the low density limit. As the density is reduced, the electrons are expected to undergo a phase transition and form a Wigner crystal. Since the energy is expected to be a non-analytic function at the phase transition, the interpolation is of doubtful utility. It seems more appropriate to use the results of Monte Carlo calculations for the correlation energy of the homogeneous electron gas (D.M. Ceperley and B.J. Alder, Phys. Rev. Lett. 45, 566 (1980)). The local density functional approximation has been used to successfully describe many different materials, and fails miserably for some others. Attempts to justify this expression based on the gradient expansion have failed. Basically, the electron density varies too rapidly for the gradient expansion to be useful.
9.4
Static Screening
The response of an electronic system to a static or time independent external potential is quite remarkable in a metal. In a metal, the static external potential is screened out by the electron response. The screening is characterized by the dielectric constant. Classically, the total electrostatic potential $\phi_{es}(r)$ is related to the charge density through Poisson's equation. In the absence of the external potential, Poisson's equation is written as
$$ - \nabla^2 \phi_{es}(r) = 4 \pi | e | \left( Z \rho_{ions}(r) - \rho(r) \right) \tag{774} $$
For a free electron gas, the charge density of the electrons exactly cancels the contributions from the smeared out charges of the ions. The corresponding potential is constant, and the reference value of $\phi_{es}(r)$ may be set to zero. It is expected that a positive external charge with density $\rho_{ext}(r)$ will induce a change in the electronic density $\rho_{ind}(r)$. The external charge produces the external potential which is defined by the Poisson equation
$$ - \nabla^2 \phi_{ext}(r) = 4 \pi | e | \, \rho_{ext}(r) \tag{775} $$
The total potential $\phi_{es}(r)$ satisfies the Poisson equation
$$ - \nabla^2 \phi_{es}(r) = 4 \pi | e | \left( \rho_{ext}(r) - \rho_{ind}(r) \right) \tag{776} $$
where $\rho_{ext}$ is assumed to have a positive charge, and the induced electron density $\rho_{ind}$ is associated with a negative charge. The external potential is related to the total potential via the dielectric constant through the non-local relation
$$ \phi_{ext}(r) = \int_V d^3 r' \; \varepsilon(r, r') \, \phi_{es}(r') \tag{777} $$
In a spatially homogeneous system, the dielectric constant is translationally invariant and, therefore, only depends upon the difference $r - r'$. In this case, the linear response relation is expressed as a convolution
$$ \phi_{ext}(r) = \int_V d^3 r' \; \varepsilon(r - r') \, \phi_{es}(r') \tag{778} $$
This non-local relation, which is valid for homogeneous systems, is simpler after it has been Fourier transformed. The Fourier transform of $\phi_{ext}(r)$ is defined by
$$ \phi_{ext}(q) = \frac{1}{V} \int_V d^3 r \; \phi_{ext}(r) \, \exp\left[ - i \, q \cdot r \right] \tag{779} $$
and the Fourier transform of the dielectric constant is defined by
$$ \varepsilon(q) = \int_V d^3 r \; \varepsilon(r) \, \exp\left[ - i \, q \cdot r \right] \tag{780} $$
Hence, the Fourier transform of the convolution is just the product of the respective Fourier transforms. Thus, the relation becomes
$$ \phi_{ext}(q) = \varepsilon(q) \, \phi_{es}(q) \tag{781} $$
Hence, the total potential is reduced by the dielectric constant
$$ \phi_{es}(q) = \frac{\phi_{ext}(q)}{\varepsilon(q)} \tag{782} $$
The Fourier transforms of the Poisson equations yield
$$ q^2 \, \phi_{ext}(q) = 4 \pi | e | \, \rho_{ext}(q) \tag{783} $$
and
$$ q^2 \, \phi_{es}(q) = 4 \pi | e | \left( \rho_{ext}(q) - \rho_{ind}(q) \right) \tag{784} $$
On using the first equation to eliminate $\rho_{ext}(q)$ in the second, one obtains
$$ q^2 \, \phi_{es}(q) = q^2 \, \phi_{ext}(q) - 4 \pi | e | \, \rho_{ind}(q) \tag{785} $$
Taking the induced charge density term to the other side of the equation produces
$$ q^2 \, \phi_{ext}(q) = q^2 \, \phi_{es}(q) + 4 \pi | e | \, \rho_{ind}(q) \tag{786} $$
The definition of the dielectric constant, $\varepsilon(q)$, can be used to yield the relation
$$ \varepsilon(q) = 1 + \frac{4 \pi | e | \, \rho_{ind}(q)}{q^2 \, \phi_{es}(q)} \tag{787} $$
On expressing the total scalar potential as a potential energy term acting on the electrons
$$ V(q) = - | e | \, \phi_{es}(q) \tag{788} $$
and defining the response function $\chi(q)$ as the ratio of the induced density to the potential
$$ \chi(q) = \frac{\rho_{ind}(q)}{V(q)} \tag{789} $$
one finds that the expression for the dielectric constant reduces to
$$ \varepsilon(q) = 1 - \frac{4 \pi | e |^2}{q^2} \, \chi(q) \tag{790} $$
Thus, the dielectric constant is related to the response of the charge density to the total potential. This response function can be calculated via different techniques. However, in making approximations, it is imperative that only the response to the total field is approximated and not the response to the external field. In a metal, it is all the electrons that take part in screening an external charge. If each electron were to react independently to screen the external charge, the external charge density would be over-screened by a factor of Ne as each electron by itself could neutralize a charge of | e |. The simplest approximate theory of the system’s response to the total field is given by the Thomas-Fermi approximation. The Thomas-Fermi theory pre-dates linear response theory and density functional theory. A more accurate approximation for weak potentials is based on linear response theory. The above derivation has the following drawbacks: First, the use of Poisson’s equation only treats the classical direct Coulomb interactions between aggregates of electrons, neglecting the effect of the exchange interactions. Second, the assumption of spatial homogeneity neglects the effect of Umklapp interactions in a solid. This neglect produces simple algebraic coupled equations. The inclusion of Umklapp scattering produces an infinite set of coupled equations which has no known analytic solution.
9.4.1
The Thomas-Fermi Approximation
The Thomas-Fermi approximation is based on the assumption that the potential is slowly varying. The energy of a Bloch state is given by
$$ \frac{\hbar^2 k^2}{2 m} - | e | \, \phi_{es}(r) \tag{791} $$
The wave vector of the highest occupied state, $k_F(r)$, depends on $r$ and is given by
$$ \frac{\hbar^2 k_F^2(r)}{2 m} - | e | \, \phi_{es}(r) = \mu \tag{792} $$
Thus, the electron density at position $r$ is expressed in terms of a local Fermi-wave vector
$$ \rho(r) = 2 \; \frac{4 \pi k_F^3(r)}{3} \; \frac{1}{8 \pi^3} \tag{793} $$
On expressing the Fermi-wave vector in terms of the chemical potential and the electrostatic potential, the total density becomes
$$ \rho(r) = \frac{1}{3 \pi^2} \left( \frac{2 m}{\hbar^2} \right)^{\frac{3}{2}} \left( \mu + | e | \, \phi_{es}(r) \right)^{\frac{3}{2}} \tag{794} $$
The induced density is given in terms of the electrostatic potential via
$$ \rho_{ind}(r) = \frac{1}{3 \pi^2} \left( \frac{2 m}{\hbar^2} \right)^{\frac{3}{2}} \left[ \left( \mu + | e | \, \phi_{es}(r) \right)^{\frac{3}{2}} - \mu^{\frac{3}{2}} \right] \tag{795} $$
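This is the basis of the Thomas-Fermi theory. On assuming that $| e | \, \phi_{es}(r)$ is small compared with $\mu$, the equation can be linearized, yielding
$$ \rho_{ind}(r) = | e | \, \phi_{es}(r) \, \frac{\partial \rho_0}{\partial \mu} \tag{796} $$
Thus, the Thomas-Fermi response function is given by
$$ \chi_{TF} = - \frac{\partial \rho_0}{\partial \mu} = - \frac{1}{2 \pi^2} \left( \frac{2 m}{\hbar^2} \right)^{\frac{3}{2}} \mu^{\frac{1}{2}} = - \frac{m \, k_F}{\pi^2 \hbar^2} \tag{797} $$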
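The quality of the linearization leading from Eq. (795) to Eq. (796) can be gauged numerically. The following is a minimal sketch, assuming energies are measured in units of $\mu$ and the common prefactor $(1/3\pi^2)(2m/\hbar^2)^{3/2}$ is set to one, since it cancels in the relative error; the function names are illustrative.

```python
import math

def induced_density_full(u, mu, prefactor=1.0):
    """Eq. (795) with u = |e| * phi_es and
    prefactor = (1 / (3 pi^2)) * (2 m / hbar^2)^(3/2)."""
    return prefactor * ((mu + u) ** 1.5 - mu ** 1.5)

def induced_density_linear(u, mu, prefactor=1.0):
    """Eq. (796): u * d(rho_0)/d(mu), where
    d(rho_0)/d(mu) = (3/2) * prefactor * mu^(1/2)."""
    return prefactor * 1.5 * math.sqrt(mu) * u

def relative_error(u, mu):
    """Relative deviation of the linearized density from the full expression."""
    full = induced_density_full(u, mu)
    return abs(induced_density_linear(u, mu) - full) / abs(full)

mu = 1.0  # energies in units of the chemical potential (an assumption)
errors = {ratio: relative_error(ratio * mu, mu) for ratio in (0.01, 0.1, 0.5)}
for ratio, error in errors.items():
    print(ratio, error)
```

To leading order the relative error grows as $| e | \phi_{es} / 4 \mu$, so the linear form is reliable only for potentials well below the Fermi energy.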
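This leads to the Thomas-Fermi approximation for the dielectric constant
$$ \varepsilon(q) = 1 + \frac{4 \pi e^2}{q^2} \frac{\partial \rho_0}{\partial \mu} = 1 + \frac{k_{TF}^2}{q^2} \tag{798} $$
The Thomas-Fermi wave vector is given in terms of the Fermi-wave vector by
$$ k_{TF}^2 = \frac{4 m e^2}{\pi \hbar^2} \, k_F = \frac{4 k_F}{\pi a_0} \tag{799} $$
and by the alternate expression
$$ \frac{k_{TF}}{k_F} = \sqrt{\frac{4}{\pi k_F a_0}} = \left( \frac{16}{3 \pi^2} \right)^{\frac{1}{3}} r_s^{\frac{1}{2}} = 0.8145 \; r_s^{\frac{1}{2}} \tag{800} $$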
Thus, $k_{TF}$ is of the order of $k_F$ in a metal, and depends on the density of mobile electrons available to perform screening. This means that the external potential or charge is screened over distances of the order of $k_{TF}^{-1} \sim 1$ Angstrom.
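The numerical factor in Eq. (800) and the Angstrom-scale screening length can be reproduced in a few lines. The following is a minimal sketch, assuming that $r_s$ denotes the radius, in units of the Bohr radius $a_0$, of the sphere containing one electron, and using a handful of $r_s$ values in the typical metallic range; the function names and the chosen values are illustrative.

```python
import math

BOHR_RADIUS_ANGSTROM = 0.529177  # a_0 expressed in Angstrom

def fermi_wave_vector(rs):
    """k_F in units of 1/a_0, from rho = 3 / (4 pi rs^3 a_0^3)
    and k_F = (3 pi^2 rho)^(1/3)."""
    return (9.0 * math.pi / 4.0) ** (1.0 / 3.0) / rs

def thomas_fermi_wave_vector(rs):
    """k_TF in units of 1/a_0, from Eq. (799): k_TF^2 = 4 k_F / (pi a_0)."""
    return math.sqrt(4.0 * fermi_wave_vector(rs) / math.pi)

def screening_length_angstrom(rs):
    """Thomas-Fermi screening length 1 / k_TF in Angstrom."""
    return BOHR_RADIUS_ANGSTROM / thomas_fermi_wave_vector(rs)

for rs in (2.0, 3.0, 4.0, 5.0):  # assumed metallic density parameters
    ratio = thomas_fermi_wave_vector(rs) / fermi_wave_vector(rs)
    # ratio / sqrt(rs) should reproduce the coefficient in Eq. (800)
    print(rs, ratio / math.sqrt(rs), screening_length_angstrom(rs))
```

Because $k_{TF}$ varies only as $r_s^{-1/2}$, the screening length changes slowly across this range of densities.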
This can be most clearly seen by applying the Thomas-Fermi approximation to the screening of a point charge $Z | e |$ in a metal. The charged particle is located at the origin. From the Fourier transform of Poisson's equation, the external potential is given by
$$ \phi_{ext}(q) = \frac{4 \pi Z | e |}{q^2} \tag{801} $$
The total potential is given by
$$ \phi_{es}(q) = \frac{\phi_{ext}(q)}{\varepsilon(q)} = \frac{4 \pi Z | e |}{q^2 + k_{TF}^2} \tag{802} $$
which no longer possesses the long-ranged divergence as $q \rightarrow 0$. On performing the inverse Fourier transform, thereby transforming the potential back into direct space, one has
$$ \phi_{es}(r) = \frac{Z | e |}{r} \, \exp\left[ - k_{TF} \, r \right] \tag{803} $$
Thus, the charged impurity is exponentially screened over a distance $k_{TF}^{-1}$. The induced charge density is given by
$$ \rho_{ind}(r) = \frac{Z e^2}{r} \, \frac{\partial \rho_0}{\partial \mu} \, \exp\left[ - k_{TF} \, r \right] = \frac{Z \, k_{TF}^2}{4 \pi r} \, \exp\left[ - k_{TF} \, r \right] \tag{804} $$
On integrating this over all space, one finds that the screening in a metal is perfect in that the total number of electrons in the induced density is equal to $Z$. The Thomas-Fermi approximation is deficient. For isolated atoms, it can be shown that the Thomas-Fermi approximation breaks down as it predicts that the electron density at the nuclear position is infinite (L.D. Landau and E.M. Lifshitz, Quantum Mechanics), i.e.,
$$ \lim_{r \rightarrow 0} \rho(r) \sim r^{- \frac{3}{2}} \tag{805} $$
The Thomas-Fermi approximation cannot describe negative ions. That is, in the Thomas-Fermi approximation, the number of electrons cannot exceed the nuclear charge. Furthermore, the Thomas-Fermi method also precludes the binding of neutral atoms into molecules (N.L. Balazs, Phys. Rev. 156, 42 (1967)). The Thomas-Fermi method is deficient as it assumes that the potential is slowly varying in space compared to the distance over which the electrons adjust to the potential. Therefore, the Thomas-Fermi method assumes that a local approximation for the kinetic energy is valid. This is not the case for most simple metals, where the potential due to the ions varies over distances of the order of Angstroms.
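The statement that Thomas-Fermi screening of a point charge is perfect can be verified by integrating the induced density of Eq. (804) over all space. The following is a minimal sketch using a midpoint rule; the function names, the cutoff radius, and the test values of $Z$ and $k_{TF}$ are illustrative assumptions.

```python
import math

def induced_density(r, Z, k_tf):
    """Eq. (804): rho_ind(r) = Z * k_TF^2 * exp(-k_TF r) / (4 pi r)."""
    return Z * k_tf ** 2 * math.exp(-k_tf * r) / (4.0 * math.pi * r)

def total_induced_electrons(Z, k_tf, r_max=None, steps=100000):
    """Integrate 4 pi r^2 rho_ind(r) from 0 to r_max with the midpoint rule."""
    if r_max is None:
        r_max = 40.0 / k_tf  # the integrand is negligible beyond this radius
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr  # midpoints avoid evaluating at r = 0
        total += 4.0 * math.pi * r * r * induced_density(r, Z, k_tf) * dr
    return total

# Hypothetical impurity charge and screening wave vector (arbitrary units).
print(total_induced_electrons(Z=2.0, k_tf=1.7))
```

The result is independent of $k_{TF}$: the screening wave vector sets the size of the screening cloud but not the total charge it contains.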
9.4.2
Linear Response Theory
Linear response theory describes the response of a system to a weak perturbing potential. In such cases, the response is approximately linear in the perturbation, so perturbation theory may be used. The effect of a perturbing potential $\delta V(r)$ on the electronic system is considered. The effect of this one-body potential on the one-body Bloch functions $\phi_{n,k}(r)$ is examined via perturbation theory. To first order in the perturbation, the one-electron eigenfunctions are altered. The one-electron eigenfunctions are no longer Bloch functions, but are given by
$$ \psi_{n,k}(r) = \phi_{n,k}(r) + \sum_{n',k' \neq n,k} \frac{M_{n',k';n,k}}{E_{n,k} - E_{n',k'}} \, \phi_{n',k'}(r) \tag{806} $$
where $M_{n',k';n,k}$ is the matrix element of the perturbing potential between two Bloch functions,
$$ M_{n',k';n,k} = \int d^3 r' \; \phi^*_{n',k'}(r') \, \delta V(r') \, \phi_{n,k}(r') \tag{807} $$
The induced change in the electron density, to first order in $\delta V(r)$, is found as
$$ \rho_{ind}(r) = \sum_{n,k,\sigma} \left[ \sum_{n',k' \neq n,k} \phi^*_{n,k}(r) \, \frac{M_{n',k';n,k}}{E_{n,k} - E_{n',k'}} \, \phi_{n',k'}(r) + c.c. \right] \tag{808} $$
where the summation over $n, k$ runs over all the occupied states and c.c. denotes the complex conjugated term. Thus, the response is not local in the perturbation but is non-local. The response is expressed in the form
$$ \rho_{ind}(r) = \int d^3 r' \; \chi(r, r') \, \delta V(r') \tag{809} $$
The response function $\chi(r, r')$ is given by the expression
$$ \chi(r, r') = \sum_{n,k,\sigma} \left[ \sum_{n',k' \neq n,k} \phi^*_{n,k}(r) \, \phi_{n,k}(r') \, \frac{\phi^*_{n',k'}(r') \, \phi_{n',k'}(r)}{E_{n,k} - E_{n',k'}} + c.c. \right] \tag{810} $$
where the summation over $n, k, \sigma$ runs over all one-electron states that were occupied before the perturbation was turned on. Due to the Pauli exclusion principle, the summation over $n', k'$ is restricted to the unoccupied states. The expression for the response is expected to be modified by the presence of electron-electron interactions.
The expression for the non-interacting response can easily be evaluated for free electrons. First, the variables $k'$ and $k$ are interchanged in the complex conjugate term, and then, due to a cancellation between the two terms, the range of one integration in each term is extended over all momentum space. Once again, the variables $k'$ and $k$ are interchanged in the second term, to yield
$$ \chi(r, r') = \frac{4 m}{\hbar^2} \left[ \int_{| k | \leq k_F} \frac{d^3 k}{( 2 \pi )^3} \int \frac{d^3 k'}{( 2 \pi )^3} \; \exp\left[ i \, ( k - k' ) \cdot ( r - r' ) \right] \frac{1}{k^2 - k'^2} + c.c. \right] \tag{811} $$
where the integration over $k'$ runs over all space. As the Hamiltonian possesses translational invariance, the response function only depends on the vector $R = r - r'$. Thus, for the homogeneous electron gas, the real space linear response relation is in the form of a convolution. The integrations over the directions of $k$ and $k'$ can be evaluated by standard means. The range of integration over the magnitude of $k'$ can be extended between $- \infty$ and $+ \infty$ and evaluated by means of contour integration, which leads to
$$ \chi(r, r') = - \frac{2 m}{\hbar^2} \, \frac{2}{( 2 \pi )^3} \int_0^{k_F} dk \; k \; \frac{\sin 2 k | r - r' |}{| r - r' |^2} \tag{812} $$
The resulting expression is
$$
\chi(r, r') = \frac{2m}{\hbar^2} \frac{k_F^4}{\pi^3} \left[ \frac{\cos 2 k_F | r - r' |}{( 2 k_F | r - r' | )^3} - \frac{\sin 2 k_F | r - r' |}{( 2 k_F | r - r' | )^4} \right] \tag{813}
$$
This is the response to a delta function perturbation at the origin. This delta function perturbation requires the electron gas to adjust at very short wavelengths. Instead of having the exponential decay predicted by the Thomas-Fermi approximation, the response only decays algebraically, with characteristic oscillations determined by the wave vector $2 k_F$ due to the sharp cutoff at the Fermi-surface. That is, $2 k_F$ is the largest wave vector available for a zero energy density fluctuation in which an electron is excited from just below to just above the Fermi-surface. The oscillations in the density that occur in response to a potential are known as Friedel oscillations.

It is more convenient to consider the Fourier transform of the response function
$$
\chi(q) = \int_V d^3r \, \chi(r) \exp\left[ - i q \cdot r \right] \tag{814}
$$
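As a quick consistency check between Eq. (812) and Eq. (813), the radial integral can be evaluated numerically and compared with the closed form. The sketch below is only illustrative: it works in the dimensionless variable $x = 2 k_F |r - r'|$ and measures $\chi$ in units of $2 m k_F^4 / (\hbar^2 \pi^3)$, and the function names `simpson`, `chi_numeric` and `chi_closed` are introduced here and are not part of the notes.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def chi_numeric(x):
    """Eq. (812) in units of 2 m kF^4 / (hbar^2 pi^3), with k = kF * t and x = 2 kF R."""
    integral = simpson(lambda t: t * math.sin(x * t), 0.0, 1.0)
    return -integral / x**2

def chi_closed(x):
    """Closed form of Eq. (813) in the same units."""
    return math.cos(x) / x**3 - math.sin(x) / x**4

for x in (0.5, 3.0, 10.0, 25.0):
    print(x, chi_numeric(x), chi_closed(x))
```

At large $x$ the cosine term dominates, which is the $\cos( 2 k_F R ) / R^3$ Friedel tail.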
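The response function $\chi(q)$ is evaluated from
$$
\chi(q) = 2 \, \frac{2m}{\hbar^2} \frac{1}{V} \sum_k \left[ \frac{1}{k^2 - ( k + q )^2} + \frac{1}{k^2 - ( k - q )^2} \right] \tag{815}
$$
where the summation over $k$ runs over the occupied states within the Fermi-sphere. The summation can be replaced by an integration
$$
\begin{aligned}
\chi(q) &= - 2 \, \frac{2m}{\hbar^2} \frac{1}{4 \pi^2} \int_0^{k_F} dk \, k^2 \int_{-1}^{+1} d\cos\theta \left[ \frac{1}{q^2 + 2 k q \cos\theta} + \frac{1}{q^2 - 2 k q \cos\theta} \right] \\
&= - 2 \, \frac{2m}{\hbar^2 q} \frac{1}{4 \pi^2} \int_0^{k_F} dk \, k \, \ln \frac{| q + 2 k |}{| q - 2 k |}
\end{aligned} \tag{816}
$$
The response is given explicitly by
$$
\chi(q) = - \frac{m k_F}{\hbar^2 \pi^2} \left[ \frac{1}{2} + \frac{4 k_F^2 - q^2}{8 q k_F} \ln \frac{| 2 k_F + q |}{| 2 k_F - q |} \right] \tag{817}
$$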
This is the Lindhard function for the free electron gas (J. Lindhard, Kgl. Danske Videnskab. Selskab. Mat. Fys. Medd. 28, 8 (1954)). The Lindhard function reduces to the value of the corresponding Thomas-Fermi response function at $q = 0$, which is
$$
\chi_{TF} = - \frac{k_{TF}^2}{4 \pi e^2} \tag{818}
$$
Thus, for very slowly varying potentials, the response of the free electron gas is identical to the response function found using the Thomas-Fermi approximation. The magnitude of the Lindhard function drops with increasing $q$, falling to half the $q = 0$ value at $q = 2 k_F$. At this point, the slope has a weak logarithmic singularity. The electron gas is ineffective in screening the applied potential for $q \ge 2 k_F$, as $2 k_F$ corresponds to the largest wave vector at which electrons on the spherical Fermi-surface can readjust.
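The step from the radial integral of Eq. (816) to the closed form of Eq. (817) can be checked numerically. The following is a minimal sketch, assuming the dimensionless variable $y = q / ( 2 k_F )$ and measuring $\chi(q)$ in units of $- m k_F / ( \hbar^2 \pi^2 )$, so that the bracket of Eq. (817) becomes $\frac{1}{2} + \frac{1 - y^2}{4 y} \ln | \frac{1 + y}{1 - y} |$. The function names are hypothetical, and a simple midpoint rule is used because the integrand has an integrable logarithmic singularity.

```python
import math

def lindhard_closed(y):
    """Bracket of Eq. (817) as a function of y = q / (2 kF)."""
    if abs(y - 1.0) < 1e-12:
        return 0.5                      # limiting value at q = 2 kF
    return 0.5 + (1.0 - y * y) / (4.0 * y) * math.log(abs((1.0 + y) / (1.0 - y)))

def lindhard_numeric(y, n=200000):
    """Radial integral of Eq. (816) with k = kF * t, by the midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        if t == y:                      # skip the singular point if it is hit exactly
            continue
        total += t * math.log(abs((y + t) / (y - t)))
    return total * h / (2.0 * y)

for y in (0.5, 1.5, 3.0):
    print(y, lindhard_numeric(y), lindhard_closed(y))
```

In these units the function equals one in the limit $q \to 0$ (the Thomas-Fermi value) and one half at $q = 2 k_F$.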
9.4.3
Density Functional Response Function
The change in the electron density $\rho_{ind}(r)$ due to an external potential, $\phi_{ext}(r)$, in which electron-electron interactions are included can be obtained from density functional theory. The relation between the induced density and the external potential is given by the screened response function
$$
\rho_{ind}(r) = - \int d^3r' \, \chi_s(r - r') \, | e | \, \phi_{ext}(r') \tag{819}
$$
Density functional theory yields an effective potential which contains the effect of the electron-electron interactions
$$
| e | \, \phi_{eff}(r) = | e | \, \phi_{ext}(r) - \int d^3r' \left[ \frac{| e |^2}{| r - r' |} + \frac{\delta^2 E_{xc}}{\delta\rho(r) \, \delta\rho(r')} \right] \rho_{ind}(r') \tag{820}
$$
The relation between the induced electron density and the effective potential is given by
$$
\rho_{ind}(r) = - \int d^3r' \, \chi_0(r - r') \, | e | \, \phi_{eff}(r') \tag{821}
$$
where $\chi_0(r - r')$ is the Lindhard response function for non-interacting electrons. The response function, including the effects of the electron-electron interactions, can be found by Fourier transforming the above set of equations. Thus, the full response function is given by
$$
\rho_{ind}(q) = - \chi_s(q) \, | e | \, \phi_{ext}(q) \tag{822}
$$
and the non-interacting response function is given by
$$
\rho_{ind}(q) = - \chi_0(q) \, | e | \, \phi_{eff}(q) \tag{823}
$$
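The relationship between the effective and external potential is given by
$$
\phi_{eff}(q) = \phi_{ext}(q) + \left[ - \frac{4 \pi | e |}{q^2} + \frac{\pi | e |}{k_{TF}^2} \Gamma_{xc}(q) \right] \rho_{ind}(q) \tag{824}
$$
This equation can be solved for $\chi_s(q)$ in terms of the non-interacting response function $\chi_0(q)$.
$$
\chi_s(q) = \frac{\chi_0(q)}{1 - | e |^2 \left[ \frac{4 \pi}{q^2} - \frac{\pi}{k_{TF}^2} \Gamma_{xc}(q) \right] \chi_0(q)} \tag{825}
$$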
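The dielectric constant $\varepsilon(q)$ is given by
$$
\begin{aligned}
\frac{1}{\varepsilon(q)} &= \frac{\phi_{es}(q)}{\phi_{ext}(q)} \\
&= 1 - \frac{4 \pi | e | \, \rho_{ind}(q)}{q^2 \, \phi_{ext}(q)} \\
&= 1 + \frac{4 \pi e^2}{q^2} \chi_s(q)
\end{aligned} \tag{826}
$$
The exchange contribution to $\Gamma_{xc}(q)$ is given in the limit $q \to 0$ by
$$
\Gamma_{xc}(q) = 1 + \frac{5}{9} \left( \frac{q}{2 k_F} \right)^2 + \frac{73}{225} \left( \frac{q}{2 k_F} \right)^4 + \ldots \tag{827}
$$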
It is noted that if the effect of the exchange-correlation terms on the screening could be dropped, then the dielectric constant is approximated by
$$
\frac{1}{\varepsilon(q)} \approx 1 + \frac{4 \pi e^2}{q^2} \frac{\chi_0(q)}{\varepsilon(q)} \tag{828}
$$
which is consistent with the result for free electrons using the Lindhard approximation for the response to the total field, and treating the total scalar potential classically via Poisson's equation. In obtaining this approximate result, it was necessary to calculate the response of the system to the external potential by including processes, to all orders in $e^2$, in which the electron gas is polarized. That is, the electron gas is polarized by the external potential and then the resulting polarization and the external potential are screened by the electron gas, ad infinitum. This infinite regression is necessary for the external charge to be completely screened at large distances, and is a consequence of the long-ranged nature of the Coulomb interaction, $\lim_{q \to 0} \frac{4 \pi e^2}{q^2} \to \infty$. This re-emphasizes the importance of only making approximations in the response to the total potential $\chi$ and not in the response to the external potential $\chi_s$.

The response of the electronic system to an applied potential can be used to examine the stability of a structure. The electronic energy change due to the perturbation consists of the potential energy of interaction between the ions and the electron gas, as well as the change induced in the energy of electron-electron repulsions. All of these energies can be expressed in terms of the induced charge density.
——————————————————————————————————
9.4.4
Exercise 44
Calculate the Lindhard function for a free electron gas, $E_k^0 = \frac{\hbar^2 k^2}{2 m}$, in $d = 1$, $d = 2$ and $d = 3$ dimensions, at zero temperature.
——————————————————————————————————
9.4.5
Exercise 45
Consider the Lindhard function for a tight-binding non-degenerate s band on a hyper-cubic lattice with the dispersion relation
$$
E_k = E_0 - 2 t \sum_{i=1}^{d} \cos k_i a \tag{829}
$$
Show that the response function at the corner of the Brillouin zone, $q = \frac{\pi}{a} ( 1, 1, 1, \ldots )$, diverges as the number of electrons in the band approaches one per site.
——————————————————————————————————
10
Stability of Structures
In this chapter, the structural stability of a metal is discussed. The total energy of the metal will be expressed in terms of the energy for a uniform electron gas, and the interaction with the periodic structure will be treated as a perturbation.
10.1
Momentum Space Representation
In the uniform electron gas, the electrostatic energy between pairs of electrons and also between the particles forming the background positive charge exactly cancels with the interaction between the electrons and the positive charges. When the periodic potential is introduced as a perturbation, the change in the total energy can be expressed in terms of the change in the one-electron eigenvalues. However, the inclusion of the Coulomb interaction between the lattice and the electrons will also require that the contributions from electron-electron and ion-ion interactions be explicitly reconsidered in the calculation of the total energy. The energy of a one-electron Bloch state, calculated to second order in the potential due to the ionic lattice, can be expressed in terms of the one-electron energy eigenvalues for a free electron gas as
$$
E_{n,k} = \frac{\hbar^2 k^2}{2 m} + V_{ions}(k, k) + \frac{2m}{\hbar^2} \sum_{k' \ne k} \frac{| V_{ions}(k', k) |^2}{k^2 - k'^2} \tag{830}
$$
The zero-th order and first order terms in this energy are independent of the lattice structure of the ionic potential. This can be seen by examining the matrix elements
$$
V_{ions}(k', k) = \frac{1}{V} \int d^3r \, V_{ions}(r) \exp\left[ i ( k - k' ) \cdot r \right] \tag{831}
$$
which is just the average potential when $k = k'$. The sum over the energies of all the occupied Bloch states, $(k, \sigma)$, contributes to the total energy of the solid. The first order contribution from $V_{ions}(k, k)$, like the kinetic energy of the free electron gas, does not depend on the structure. These terms combine to produce a volume-dependent contribution to the solid's total energy.

The other volume-dependent contribution to the total energy of the solid originates from the electron-electron interactions and the ion-ion interactions. It is convenient to combine these terms with the energy of the zero-th order electron-ion interaction, due to the exact cancellation for the uniform electron gas. This combination is the total electrostatic interaction. It can be evaluated in the approximation that the Coulomb interactions between different Wigner-Seitz cells are totally screened (E. Wigner and F. Seitz, Phys. Rev. 43, 804 (1933), Phys. Rev. 45, 509 (1934)). This means that the ion-ion interactions need not be considered explicitly. The electrostatic contribution to the energy is then written as
$$
E_{es} = \int d^3r \, V_{ions}(r) \, \rho(r) + \frac{e^2}{2} \int d^3r \int d^3r' \, \frac{\rho(r) \, \rho(r')}{| r - r' |} \tag{832}
$$
To lowest order in the structure, the electrostatic contribution to the total energy can be evaluated by considering the Wigner-Seitz unit cell to be spherical with radius $R_{WS}$. The electron density is given by
$$
\rho = \frac{3 Z}{4 \pi R_{WS}^3} \tag{833}
$$
For the uniform density, the electron-electron repulsion term is evaluated as
$$
E_{es} = \int d^3r \, V_{ions}(r) \, \rho(r) + \frac{3}{5} \frac{Z^2 e^2}{R_{WS}} \tag{834}
$$
For the free-electron approximation for the kinetic energy to be valid, the electrostatic contribution from the ions should be calculated using the pseudo-potential. We shall use the Ashcroft empty core approximation for the ionic pseudo-potential. Inside the Wigner-Seitz cell, the pseudo-potential reduces to that of an isolated atom
$$
V_{atom}(r) = \begin{cases} - \dfrac{Z e^2}{r} & \text{for } r \ge R_c \\ 0 & \text{for } r \le R_c \end{cases} \tag{835}
$$
where $R_c$ is the radius of the ionic core. Hence, for a structureless metal, the electrostatic terms can be expressed as
$$
E_{es} = - \frac{3}{2} \frac{Z^2 e^2}{R_{WS}} \left[ 1 - \left( \frac{R_c}{R_{WS}} \right)^2 \right] + \frac{3}{5} \frac{Z^2 e^2}{R_{WS}} \tag{836}
$$
The potential terms inversely proportional to the Wigner-Seitz radius can be combined as $- \frac{9}{10} \frac{Z^2 e^2}{R_{WS}}$. The coefficient $\alpha = \frac{9}{10}$ is the Madelung constant for a solid composed of spherical unit cells. In general, the Madelung constant will depend slightly on the structure of the lattice. For a solid with structure, the electrostatic energy can be expressed as the sum
$$
E = E_M + E_c \tag{837}
$$
where $E_M$ is the Madelung energy and $E_c$ is the core energy. The Madelung energy is the electrostatic energy due to point charges immersed in a neutralizing uniform distribution of electrons. The Madelung energy is given by
$$
E_M = - \alpha \frac{Z^2 e^2}{R_{WS}} \tag{838}
$$
where $\alpha$ is the structure-dependent Madelung constant. The Madelung constants are evaluated as

Structure            α
b.c.c.               0.89593
f.c.c.               0.89587
h.c.p.               0.89584
simple hexagonal     0.88732
simple cubic         0.88006
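The spherical-cell estimate $\alpha = \frac{9}{10}$ can be reproduced by integrating the two electrostatic terms directly and then compared with the tabulated constants. The sketch below is only a rough check: energies are in units of $Z^2 e^2 / R_{WS}$, radii are in units of $R_{WS}$, and the names `simpson`, `electrostatic_energy` and `MADELUNG` are introduced here for illustration.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def electrostatic_energy(rc=0.0):
    """E_es of a uniformly filled sphere with an empty-core ion of radius rc."""
    # electron-ion term: -(Z e^2 / r) weighted by the uniform density, for r >= rc
    e_ion = -simpson(lambda r: 3.0 * r, rc, 1.0)
    # electron-electron term: charge enclosed within r (r^3) interacting with the shell 3 r^2 dr
    e_ee = simpson(lambda r: 3.0 * r**4, 0.0, 1.0)
    return e_ion + e_ee

# Madelung constants copied from the table above
MADELUNG = {"b.c.c.": 0.89593, "f.c.c.": 0.89587, "h.c.p.": 0.89584,
            "simple hexagonal": 0.88732, "simple cubic": 0.88006}

alpha_sphere = -electrostatic_energy(0.0)
for name, alpha in MADELUNG.items():
    print(name, alpha, alpha_sphere - alpha)
```

With a finite core radius the result shifts by $\frac{3}{2} ( R_c / R_{WS} )^2$ in these units, which is the structure-independent core contribution.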
The Madelung energy is seen to increase as the symmetry is lowered. The remaining contribution to the electrostatic energy is defined to be the core energy. The core energy is given by
$$
E_c = \frac{3}{2} \frac{Z^2 e^2}{R_{WS}} \left( \frac{R_c}{R_{WS}} \right)^2 \tag{839}
$$
and, as it is the electrostatic energy associated with the spherical pseudo-potential core, is not dependent on the solid's structure.

The largest structure-dependent contribution to the energy originates from the second order terms of the Bloch energies in the electron-ion interaction
$$
E^{(2)}_{n,k} = \frac{2m}{\hbar^2} \sum_{k' \ne k} \frac{| V_{ions}(k', k) |^2}{k^2 - k'^2} \tag{840}
$$
On summing over all the occupied Bloch states ( $| k | < k_F$ ) and both spin values $\sigma$, one obtains a contribution $E_2$ to the total energy of
$$
E_2 = \frac{2m}{\hbar^2} \sum_{| k | < k_F, \sigma} \; \sum_{k' \ne k} \frac{| V_{ions}(k', k) |^2}{k^2 - k'^2} \tag{841}
$$
In the free electron basis, the matrix elements of the electron-ion interaction, $V_{ions}(k', k)$, depend only on the momentum difference $q = k' - k$.
$$
V_{ions}(k', k) = \frac{1}{V} \int d^3r \, V_{ions}(r) \exp\left[ i ( k - k' ) \cdot r \right] \tag{842}
$$
The potential due to the lattice can be written as the sum of the individual potentials from the atoms. The basis position of the j-th atom in the unit cell is denoted by $r_j$ and the Bravais lattice vector is denoted by $R_i$. Thus, the potential for the lattice of ions is given by
$$
V_{ions}(r) = \sum_{i,j} V_j(r - R_i - r_j) \tag{843}
$$
The matrix elements are then given by
$$
\begin{aligned}
V_{ions}(q) &= \frac{1}{V} \sum_{i,j} \int d^3r \, V_j(r - R_i - r_j) \exp\left[ - i q \cdot r \right] \\
&= \frac{1}{V} \sum_{i,j} \int d^3r \, \exp\left[ - i q \cdot R_i \right] V_j(r - R_i - r_j) \exp\left[ - i q \cdot ( r - R_i ) \right]
\end{aligned} \tag{844}
$$
This can be expressed as
$$
V_{ions}(q) = \frac{1}{V} \sum_i \exp\left[ - i q \cdot R_i \right] \sum_j \exp\left[ - i q \cdot r_j \right] V_j(q) \tag{845}
$$
where $V_j(q)$ is related to the Fourier transform of the potential from the j-th atom of the basis
$$
V_j(q) = \int d^3r \, V_j(r) \exp\left[ - i q \cdot r \right] \tag{846}
$$
For simplicity, a crystal with a mono-atomic basis is considered. The matrix elements are only non-zero when $q$ is a reciprocal lattice vector $Q$. The matrix elements can be expressed in terms of the structure factor $S(Q)$, via
$$
V_{ions}(Q) = \frac{N}{V} S(Q) \, V_0(Q) \tag{847}
$$
The structure dependence of the total electronic energy is contained in the second order contribution
$$
E_2 = \frac{N^2}{V^2} \sum_{k,\sigma} \frac{2m}{\hbar^2} \sum_{Q \ne 0} \frac{| S(Q) |^2 \, | V_0(Q) |^2}{k^2 - ( k + Q )^2} \tag{848}
$$
where the sum over $k, \sigma$ runs over the occupied states ( $k < k_F$ ), and the term with $Q = 0$ is omitted. On interchanging the order of the summations over $k$ and $Q$, one finds that the second order term can be expressed in terms of the Lindhard function $\chi(q)$,
$$
\begin{aligned}
E_2 &= \frac{N^2}{V^2} \sum_{Q \ne 0} | S(Q) |^2 \, | V_0(Q) |^2 \sum_{k,\sigma} \frac{f_k}{E_k - E_{k+Q}} \\
&= \frac{1}{2} \frac{N^2}{V^2} \sum_{Q \ne 0} | S(Q) |^2 \, | V_0(Q) |^2 \sum_{k,\sigma} \frac{f_k - f_{k+Q}}{E_k - E_{k+Q}} \\
&= \frac{1}{2} \frac{N^2}{V} \sum_{Q \ne 0} | S(Q) |^2 \, | V_0(Q) |^2 \, \chi(Q)
\end{aligned} \tag{849}
$$
The summation over $q$ is limited to the reciprocal lattice vectors $Q$. Therefore, it depends on the lattice structure through the structure factors $| S(Q) |^2$, on the electron density through the factors $\chi(q)$, and on the nature of the ions through $V_0(q)$. The latter is often expressed in terms of the Thomas-Fermi screened pseudo-potential
$$
V_0(q) = - 4 \pi Z e^2 \, \frac{\cos q R_c}{q^2 + k_{TF}^2} \tag{850}
$$
where $R_c$ is the radius of the ionic core. The potential has a node at $q_0 R_c = \frac{\pi}{2}$. The structural part of the electronic energy depends sensitively on the position of the node $q_0$ with respect to the smallest reciprocal lattice vectors $Q$. Reciprocal lattice vectors close to a node $q_0$ contribute little to the cohesive energy. The system may lower its structural energy if $Q$ moves away from $q_0$ without causing a change in the volume-dependent contribution to the energy. Reciprocal lattice vectors greater than $2 k_F$ contribute little, as the response of the electron gas is negligible.

In addition to these terms, there is a structural contribution arising from the electron-electron interactions which comes from the induced change in the electron density
$$
E_2^{es} = - \frac{V}{2} \sum_q \rho^*_{ind}(q) \, \frac{4 \pi e^2}{q^2} \, \rho_{ind}(q) \tag{851}
$$
This term occurs since the effect of electron-electron interactions has been double counted. On noting that the ionic potential only has non-zero Fourier components at $q = Q$, and that
$$
\rho_{ind}(Q) = \chi(Q) \, V_{ions}(Q) \tag{852}
$$
one can combine this with the contribution from the Bloch energies. The factor $\frac{4 \pi}{q^2} \chi(q)$ is related to the dielectric constant $\varepsilon(q)$ through
$$
\varepsilon(q) = 1 - \frac{4 \pi e^2}{q^2} \chi(q) \tag{853}
$$
The two second order terms can be combined to yield the dominant contribution to the structural energy
$$
E_{structural} = \frac{1}{2} \frac{N^2}{V} \sum_{Q \ne 0} | S(Q) |^2 \, | V_0(Q) |^2 \, \chi(Q) \, \varepsilon(Q) \tag{854}
$$
Since both pseudo-potential terms $V_0(Q)$ include screening, the explicit factor of $\varepsilon(q)$ cancels with one factor of $\varepsilon(q)$ in the denominators. Thus, the structural energy is only screened by one factor of the dielectric constant. The magnitude of the structural energy is quite small. The maximum magnitude of the pseudo-potential is $\frac{Z e^2}{R_c}$, which may be as small as 12 eV. The magnitude of $\chi$ is given by the inverse of the Fermi-energy, which is typically 5 eV. Thus, the structural energy is of the order of milli-Rydbergs.

Since the structure factor vanishes unless $q = Q$, the structural energy depends on the screened potential only at the reciprocal lattice vectors. Note that the pseudo-potential contains nodes at the wave vectors $q_0 = n \frac{\pi}{2 R_c}$ (with $n$ odd). The structural energy is composed of negative contributions, but the reciprocal lattice vectors which are close to the nodes contribute little to the stability of the structure. In fact, reciprocal lattice vectors at the nodes would correspond to the special case in which the band gap at the appropriate Brillouin zone boundary is zero. Usually, the opening of a band gap at a Brillouin zone boundary in a conduction band can result in an increased stability of the structure. The electronic states below the "band gap" are depressed and, if occupied, result in a lowering of the solid's energy. However, the states above the "band gap," if empty, are raised but do not contribute to the solid's energy.

Al is f.c.c. and the reciprocal lattice vectors (1, 1, 1) and (2, 0, 0) are both larger than $q_0$. On moving down the column of the periodic table from Al to Ga and then In, the ratios $Q / q_0$ are reduced.

                   Al      Ga      In
Q(1, 1, 1)/q0      1.04    0.94    0.93
Q(2, 0, 0)/q0      1.20    1.08    1.09

As the $Q$ vector for (2, 0, 0) approaches $q_0$ in In, there is a loss in structural stability and the series undergoes a transition from the f.c.c. to a tetragonal structure (V. Heine and D.L. Weaire, Solid State Physics, 24, 1 (1970)). When this transition occurs, the set of equivalent f.c.c. reciprocal lattice vectors that have $\frac{Q}{q_0} \sim 1$ split. In the tetragonal structure, as the structure is sheared, the reciprocal lattice vectors undergo different changes. Some values of $\frac{Q}{q_0}$ move to higher values while others move to lower values. This type of transformation leaves the atomic volume unchanged, but as all the "band gaps" $V(Q)$ increase, the transition lowers the energy of the structure. This structural transition occurs when the lowering of the electronic energy outweighs the increase in the Madelung energy.
10.2
Real Space Representation
The dominant electronic structural energy is given by a sum over all $Q$ of
$$
E_{structural} = \frac{1}{2} \frac{N^2}{V} \sum_{Q \ne 0} | S(Q) |^2 \, | V_0(Q) |^2 \, \chi(Q) \, \varepsilon(Q) \tag{855}
$$
where $S(Q)$ is the structure factor evaluated at a reciprocal lattice vector. This can be written as a sum over all vectors $q$, by using the Laue identity
$$
N^2 \sum_Q \delta_{q,Q} = \sum_{R \ne R'} \exp\left[ i q \cdot ( R - R' ) \right] \tag{856}
$$
where $R$ and $R'$ are Bravais lattice vectors. The structural energy then takes the form
$$
E_{structural} = \frac{1}{2} \sum_{q \ne 0} \sum_{R \ne R'} \exp\left[ i q \cdot ( R - R' ) \right] | S(q) |^2 \, \theta(q) \tag{857}
$$
where $\theta(q)$ is defined to be
$$
\theta(q) = \frac{1}{V} | V_0(q) |^2 \, \chi(q) \, \varepsilon(q) \tag{858}
$$
It should be noted that in this approximation, $\theta(q)$ is independent of the direction of $q$. The product of the structure factors can be written as
$$
| S(q) |^2 = \sum_{i \ne j} \exp\left[ i q \cdot ( r_i - r_j ) \right] \tag{859}
$$
Thus, on denoting the position of the atoms by $R_j = R + r_j$, one has
$$
E_{structural} = \frac{1}{2} \sum_{i \ne j} \sum_{q \ne 0} \exp\left[ i q \cdot ( R_i - R_j ) \right] \theta(q) \tag{860}
$$
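The Fourier transform of $\theta(q)$ is defined as
$$
\theta(R_{i,j}) = \sum_q \theta(q) \exp\left[ i q \cdot R_{i,j} \right] \tag{861}
$$
where the vector $R_{i,j}$ denotes the relative position of the two atoms. On changing the sum to an integration, $\theta(R_{i,j})$ is evaluated as
$$
\begin{aligned}
\theta(R_{i,j}) &= \frac{V}{( 2 \pi )^3} \int d^3q \, \theta(q) \exp\left[ i q \cdot R_{i,j} \right] \\
&= \frac{2 \pi V}{( 2 \pi )^3} \int_0^\infty dq \, q^2 \int_{-1}^{1} d\cos\theta \; \theta(q) \exp\left[ i q R_{i,j} \cos\theta \right] \\
&= \frac{V}{( 2 \pi )^2} \int_0^\infty dq \, q^2 \, \theta(q) \left( \frac{\exp\left[ i q R_{i,j} \right] - \exp\left[ - i q R_{i,j} \right]}{i q R_{i,j}} \right) \\
&= \frac{V}{2 \pi^2} \int_0^\infty dq \, q^2 \, \theta(q) \, \frac{\sin q R_{i,j}}{q R_{i,j}}
\end{aligned} \tag{862}
$$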
Thus, the electronic contribution to the structural energy has the real space representation
$$
E_{structural} = \frac{1}{2} \sum_{i \ne j} \theta(R_{i,j}) \tag{863}
$$
The Madelung energy, which is the sum over the interaction energies of the ions
$$
E_{Madelung} = \frac{1}{2} \sum_{i \ne j} \frac{Z^2 e^2}{| R_{i,j} |} \tag{864}
$$
should also be added to the structural energy. Thus, the total structural energy can be expressed in terms of the sum of pair potentials $\Theta(R)$, where
$$
\Theta(R) = \frac{Z^2 e^2}{R} + \theta(R) \tag{865}
$$
The pair potential represents the interaction between a pair of bare ions in the solid plus the effect of the screening clouds. The pair potential does not describe the volume dependence of the energy of the solid, but only the structure-dependent contribution to the energy. The pair potential can be expressed as
$$
\Theta(R) = \frac{Z^2 e^2}{R} + \frac{V}{2 \pi^2} \int_0^\infty dq \, q^2 \, \theta(q) \, \frac{\sin q R}{q R} \tag{866}
$$
The first and second terms can be combined to yield the interaction between an ion and a screened ion. This can be seen by expressing the potential in terms of a dimensionless function $\tilde{V}(q)$ defined by
$$
V_0(q) = \frac{4 \pi Z e^2}{q^2 \, \varepsilon(q)} \, \tilde{V}(q) \tag{867}
$$
Thus, the interaction can be expressed as
$$
\begin{aligned}
\Theta(R) &= \frac{Z^2 e^2}{R} \left[ 1 + \frac{2}{\pi} \int_0^\infty dq \, \frac{\sin q R}{q} \, \frac{4 \pi e^2 \chi(q)}{q^2 \, \varepsilon(q)} \, | \tilde{V}(q) |^2 \right] \\
&= \frac{Z^2 e^2}{R} \left[ 1 + \frac{2}{\pi} \int_0^\infty dq \, \frac{\sin q R}{q} \, \frac{1 - \varepsilon(q)}{\varepsilon(q)} \, | \tilde{V}(q) |^2 \right] \\
&= \frac{Z^2 e^2}{R} \left[ 1 - \frac{2}{\pi} \int_0^\infty dq \, \frac{\sin q R}{q} \, | \tilde{V}(q) |^2 \right] \\
&\quad + \frac{Z^2 e^2}{R} \left[ \frac{2}{\pi} \int_0^\infty dq \, \frac{\sin q R}{q} \, \frac{1}{\varepsilon(q)} \, | \tilde{V}(q) |^2 \right]
\end{aligned} \tag{868}
$$
The integral in the first term can be evaluated with the calculus of residues, and is evaluated in terms of the pole at $q = 0$. Since $\tilde{V}(0) = 1$ and as $R > R_c$, the integral is equal to unity. Therefore, the first term cancels identically. Hence, the interaction energy between a bare ion and a screened ion is given by the expression
$$
\Theta(R) = \frac{Z^2 e^2}{R} \left[ \frac{2}{\pi} \int_0^\infty dq \, \frac{\sin q R}{q} \, \frac{1}{\varepsilon(q)} \, | \tilde{V}(q) |^2 \right] \tag{869}
$$
The long-ranged nature of the Coulomb interaction between the bare ions has been completely eliminated due to the screening. The very weak logarithmic singularity at $q = 2 k_F$ leads to Friedel oscillations in the potential at asymptotically large distances $R$
$$
\Theta(R) = A \, \frac{\cos 2 k_F R}{R^3} \tag{870}
$$
However, at intermediate distances, the pair potential can be approximately expressed as the sum of three (damped) oscillatory terms (D.G. Pettifor and M.A. Ward, Solid State Commun. 49, 291 (1984))
$$
\Theta(R) = \frac{Z^2 e^2}{R} \sum_{n=1}^{3} B_n \cos\left( \alpha_n \, 2 k_F R + \phi_n \right) \exp\left[ - \beta_n k_F R \right] \tag{871}
$$
where the phase shift depends on the ionic core radius $R_c$ and the electron density. This form is obtained as a result of approximating the Lindhard function $\chi(q)$ by a ratio of polynomials (Padé approximation). The integration over $q$ can be performed via contour integration. The pairs of complex poles in the integrand produce the terms which have a damped oscillatory dependence on $R$. The fit parameters for Na are given by:

Na
n          1        2        3
α_n        0.291    0.715    0.958
β_n        0.897    0.641    0.271
B_n        1.961    0.806    0.023
φ_n/π      1.706    1.250    1.005

while for Mg the interaction is specified by

Mg
n          1        2        3
α_n        0.224    0.664    0.958
β_n        0.834    0.675    0.277
B_n        5.204    1.313    0.033
φ_n/π      1.599    0.932    0.499

and for Al one has

Al
n          1        2        3
α_n        0.156    0.644    0.958
β_n        0.793    0.698    0.279
B_n        7.954    1.275    0.030
φ_n/π      1.559    0.832    0.431
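To get a feel for the three-term form of Eq. (871), the tabulated parameters can be inserted directly. The sketch below evaluates only the dimensionless combination $\Theta(R) \, R / ( Z^2 e^2 )$ as a function of $x = k_F R$, so no value of $k_F$ or $Z$ has to be assumed. The dictionary `PARAMS` and the function `reduced_pair_potential` are names introduced here for illustration.

```python
import math

# (alpha_n, beta_n, B_n, phi_n / pi) for n = 1, 2, 3, copied from the tables above
PARAMS = {
    "Na": [(0.291, 0.897, 1.961, 1.706), (0.715, 0.641, 0.806, 1.250), (0.958, 0.271, 0.023, 1.005)],
    "Mg": [(0.224, 0.834, 5.204, 1.599), (0.664, 0.675, 1.313, 0.932), (0.958, 0.277, 0.033, 0.499)],
    "Al": [(0.156, 0.793, 7.954, 1.559), (0.644, 0.698, 1.275, 0.832), (0.958, 0.279, 0.030, 0.431)],
}

def reduced_pair_potential(metal, x):
    """Theta(R) * R / (Z^2 e^2) from Eq. (871), with x = kF * R."""
    return sum(B * math.cos(2.0 * alpha * x + math.pi * phi) * math.exp(-beta * x)
               for alpha, beta, B, phi in PARAMS[metal])

for metal in PARAMS:
    print(metal, [round(reduced_pair_potential(metal, x), 5) for x in (2.0, 4.0, 6.0, 8.0)])
```

At large $x$ only the $n = 3$ term survives, since it has the smallest damping constant $\beta_n$; this is the long-ranged component of the pair potential.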
The contributions to the pair potential are arranged in order of increasing range, i.e., they are arranged in order of decreasing $\beta_n$. The $Z$ dependence of the phase shifts determines the position of the minima of the pair potential. This pair potential, although it only has a magnitude of about $10^{-2}$ eV, dominates the structural energy.

neighbor shell number                1        2      3        4        5
b.c.c.   number of neighbors         8        6      12       24       8
         neighbor distance           √3/2     1      √2       √11/2    √3
f.c.c.   number of neighbors         12       6      24       12       24
         neighbor distance           √2/2     1      √6/2     √2       √10/2
h.c.p.   number of neighbors         12       6      2        18       12
         neighbor distance           √2/2     1      2/√3     √6/2     √11/√6
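The shell counts and distances in the table can be reproduced by brute-force enumeration of lattice points. The following is a small sketch under stated assumptions: the cubic lattice constant is set to one, the h.c.p. lattice is taken with the ideal $c/a$ ratio and scaled to the same nearest-neighbor distance as f.c.c., and the helper `neighbor_shells` together with the constants `BCC`, `FCC` and `HCP` are names chosen here.

```python
import itertools
import math
from collections import Counter

def neighbor_shells(lattice, basis, n_shells=5, reach=4):
    """Return [(distance, count), ...] for the first shells around the atom at the origin."""
    counts = Counter()
    for n1, n2, n3 in itertools.product(range(-reach, reach + 1), repeat=3):
        for b in basis:
            pos = [n1 * lattice[0][i] + n2 * lattice[1][i] + n3 * lattice[2][i] + b[i]
                   for i in range(3)]
            d2 = sum(x * x for x in pos)
            if d2 > 1e-9:                       # skip the reference atom itself
                counts[round(d2, 6)] += 1       # group atoms by squared distance
    return [(math.sqrt(d2), n) for d2, n in sorted(counts.items())[:n_shells]]

CUBIC = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
BCC = (CUBIC, [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)])
FCC = (CUBIC, [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)])

s = math.sqrt(2.0) / 2.0        # h.c.p. nearest-neighbor distance, matched to f.c.c.
c = s * math.sqrt(8.0 / 3.0)    # ideal c/a ratio (an assumption of this sketch)
HCP = ([(s, 0.0, 0.0), (0.5 * s, 0.5 * math.sqrt(3.0) * s, 0.0), (0.0, 0.0, c)],
       [(0.0, 0.0, 0.0), (0.5 * s, s / (2.0 * math.sqrt(3.0)), 0.5 * c)])

for name, structure in (("b.c.c.", BCC), ("f.c.c.", FCC), ("h.c.p.", HCP)):
    print(name, [(round(d, 4), n) for d, n in neighbor_shells(*structure)])
```

The first two shells of f.c.c. and h.c.p. coincide in both number and distance, so any energy difference between them has to come from the third and more distant shells.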
The energy difference between the f.c.c. and h.c.p. structures is determined by the third, fourth and fifth nearest neighbors, as the number and positions of the nearest and next nearest neighbors are the same. Hence, the relative stability of this pair of structures is determined by the reasonably long distance behavior of the pair potential. The form of the pair potential can be used to describe the relative stability of the h.c.p. and f.c.c. structures of Na, Mg and Al (A.K. McMahan and J.R. Moriarty, Phys. Rev. B 27, 3235 (1983)). At ambient pressure, Na and Mg are h.c.p. and Al is f.c.c. The f.c.c. form of Mg is unstable due to a repulsive contribution from the pair potentials between the (12) fourth nearest neighbor pairs. The h.c.p. form of Al is unstable due to a repulsive contribution from the pair potentials between the (12) fifth nearest neighbor pairs. This trend is understood as being almost entirely due to the long-ranged component of the pair potential. Basically, as the value of $Z$ increases on going across the row from Na to Al, the phase shift of the long-ranged interaction decreases. This means that the oscillations in the pair potential move out to larger distances. This causes the changes in the pair potential at the positions of the fourth or fifth nearest neighbors.

Under pressure, these materials are predicted to transform to a b.c.c. phase. The phase shift of the long-ranged component decreases monotonically with increasing $\frac{R_c}{r_s}$, which corresponds to increasing pressure. The change in the phase shifts moves the oscillations in the pair potentials to distances larger than the neighbor distances. This shows that as the pressure is increased, one may expect the energy differences between the h.c.p. and f.c.c. phases to oscillate. The energy differences between the b.c.c. and close-packed phases originate from the combined (14) first and second nearest neighbors in b.c.c. and the (12) nearest neighbors of the close-packed structures. The separations of the neighbors in the b.c.c. structure should be scaled by a factor of $2^{-\frac{1}{3}}$ to yield the same electron density as the close-packed structures. After this scaling, it is found that the nearest neighbor distances in the close-packed structures are intermediate between the nearest neighbor and the next nearest neighbor distances of the b.c.c. structure. On decreasing the phase shift, one may expect to see the b.c.c. phase become unstable to a close-packed phase when the (8) nearest neighbors experience the hard core repulsive potential. On further decreasing the phase shift, the (12) neighbors of the close-packed phase will experience the same hard core potential, at which point the b.c.c. phase becomes stable again. This region of stability of the b.c.c. structure will remain until the (6) next nearest neighbors are compressed to distances where the pair potential has the form of a hard core repulsion.

These and similar considerations illuminate the origins of the stability of different structures, which are hard to extract from other methods, as the structural energy typically amounts to only 1% of the cohesive energy of a solid. In general, the cohesive energy of the solid will also involve three-atom and four-atom interactions, etc., in addition to the pair potential. To obtain a more accurate description of structural stability, it is necessary to utilize density functional calculations.
11
Metals
In a metal with $N_e$ electrons, the state with minimum energy has the $N_e$ lowest one-electron energy eigenvalue states filled with one electron per state (per spin) in accordance with the Pauli exclusion principle. In a metal, the highest occupied and the lowest unoccupied state have energies which only differ by an infinitesimal amount. This energy is called the Fermi-energy, $\epsilon_F$. Thus, the one-electron states have occupation numbers distributed according to the law
$$
f(\epsilon) = 1 \quad \text{if} \quad \epsilon < \epsilon_F , \qquad f(\epsilon) = 0 \quad \text{if} \quad \epsilon > \epsilon_F \tag{872}
$$
The number of electrons in a solid $N_e$ is dictated by charge neutrality to be equal to the number of nuclear charges $N Z$. At finite temperatures, the electron occupation numbers are statistically distributed according to the Fermi-Dirac distribution function
$$
f(\epsilon) = \frac{1}{1 + \exp\left[ \beta ( \epsilon - \mu ) \right]} \tag{873}
$$
where $\beta = 1 / ( k_B T )$ is the inverse temperature. The Fermi-Dirac distribution represents the probability that a state with energy $\epsilon$ is occupied. Due to the Pauli exclusion principle, the distribution also represents the average occupation of the level with energy $\epsilon$. The value of the chemical potential coincides with the Fermi-energy at zero temperature, $\mu(0) = \epsilon_F$. Since the solid remains charge neutral at finite temperatures, the chemical potential is determined by the condition that the solid contains $N_e$ electrons. For a solid with a density of states given by $\rho(\epsilon)$, per spin, the total number of electrons is given by
$$
2 \int_{-\infty}^{+\infty} d\epsilon \, \rho(\epsilon) \, f(\epsilon) = N_e \tag{874}
$$
which is an implicit equation for $\mu$. The factor of two represents the number of different spin polarizations of the electron.
11.1
Thermodynamics
Due to the Pauli exclusion principle, the density of states at the Fermi-energy can often be inferred from measurements of the thermodynamic properties of a metal. As the characteristic energy scale for the electronic properties is of the order of eV, and room temperature is of the order of 25 meV, the thermodynamic properties can usually be evaluated in the asymptotic low-temperature expansion first investigated by Sommerfeld (A. Sommerfeld, Zeit. für Physik, 47, 1 (1928)). The low-temperature Sommerfeld expansion of the electronic specific heat, for non-interacting electrons, shall be examined.
11.1.1
The Sommerfeld Expansion
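The total energy of the solid can be expressed as an integral
$$
E = 2 \int_{-\infty}^{+\infty} d\epsilon \, \rho(\epsilon) \, \epsilon \, f(\epsilon) \tag{875}
$$
Integrals of this type can be evaluated by expressing them in terms of the zero temperature limit of the distribution and small deviations about this limit.
$$
E = 2 \int_{-\infty}^{\mu} d\epsilon \, \rho(\epsilon) \, \epsilon + 2 \int_{-\infty}^{\mu} d\epsilon \, \rho(\epsilon) \, \epsilon \left[ f(\epsilon) - 1 \right] + 2 \int_{\mu}^{+\infty} d\epsilon \, \rho(\epsilon) \, \epsilon \, f(\epsilon) \tag{876}
$$
The variable of integration in the terms involving the Fermi-function is changed from $\epsilon$ to the dimensionless variable $x$ defined by
$$
\epsilon = \mu + k_B T x \tag{877}
$$
The Fermi-function becomes
$$
f(\mu + k_B T x) = \frac{1}{1 + \exp x} \tag{878}
$$
Thus, the integral becomes
$$
\begin{aligned}
E &= 2 \int_{-\infty}^{\mu} d\epsilon \, \rho(\epsilon) \, \epsilon \\
&\quad + 2 k_B T \int_{-\infty}^{0} dx \, \rho(\mu + k_B T x) \, ( \mu + k_B T x ) \left[ f(\mu + k_B T x) - 1 \right] \\
&\quad + 2 k_B T \int_{0}^{+\infty} dx \, \rho(\mu + k_B T x) \, ( \mu + k_B T x ) \, f(\mu + k_B T x)
\end{aligned} \tag{879}
$$
The integral over the negative range of $x$ is re-expressed in terms of the new variable $y$ where
$$
y = - x \tag{880}
$$
Thus, the energy is expressed as
$$
\begin{aligned}
E &= 2 \int_{-\infty}^{\mu} d\epsilon \, \rho(\epsilon) \, \epsilon \\
&\quad + 2 k_B T \int_{0}^{\infty} dy \, \rho(\mu - k_B T y) \, ( \mu - k_B T y ) \left[ f(\mu - k_B T y) - 1 \right] \\
&\quad + 2 k_B T \int_{0}^{+\infty} dx \, \rho(\mu + k_B T x) \, ( \mu + k_B T x ) \, f(\mu + k_B T x)
\end{aligned} \tag{881}
$$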
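However, the Fermi-function satisfies the relation
$$
1 - f( \mu - k_B T y ) = f( \mu + k_B T y ) \tag{882}
$$
or equivalently
$$
1 - \frac{1}{1 + \exp[ - y ]} = \frac{1}{1 + \exp[ y ]} \tag{883}
$$
On setting $y$ back to $x$, one finds
$$
\begin{aligned}
E &= 2 \int_{-\infty}^{\mu} d\epsilon \, \rho(\epsilon) \, \epsilon \\
&\quad + 2 k_B T \int_{0}^{+\infty} dx \, \frac{1}{1 + \exp x} \left[ \rho(\mu + k_B T x) \, ( \mu + k_B T x ) - \rho(\mu - k_B T x) \, ( \mu - k_B T x ) \right]
\end{aligned} \tag{884}
$$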
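The terms within the square brackets can be Taylor expanded in powers of $k_B T x$, and the integration over $x$ can be performed. Due to the presence of the Fermi-function, the integrals converge. One then has an expansion which is effectively expressed in powers of $k_B T / \mu$. Thus, the energy is expressed as
$$
E = 2 \int_{-\infty}^{\mu} d\epsilon \, \rho(\epsilon) \, \epsilon + 4 k_B T \sum_{n=0}^{\infty} \left\{ \int_0^\infty dx \, \frac{x^{2n+1}}{1 + \exp x} \, \frac{( k_B T )^{2n+1}}{(2n+1)!} \, \frac{\partial^{(2n+1)}}{\partial \mu^{(2n+1)}} \left[ \mu \, \rho(\mu) \right] \right\} \tag{885}
$$
The integrals over $x$ are evaluated as
$$
\begin{aligned}
\int_0^\infty dx \, \frac{x^n}{1 + \exp x} &= \int_0^\infty dx \, x^n \sum_{l=1}^{\infty} ( - 1 )^{l+1} \exp\left[ - l x \right] \\
&= n! \sum_{l=1}^{\infty} \frac{( - 1 )^{l+1}}{l^{n+1}}
\end{aligned} \tag{886}
$$
which are finite for $n \ge 1$. Furthermore, the summation can be expressed in terms of the Riemann $\zeta$ functions defined by
$$
\zeta(m) = \sum_{l=1}^{\infty} \frac{1}{l^m} \tag{887}
$$
Using this, one finds that
$$
\sum_{l=1}^{\infty} \frac{( - 1 )^{l+1}}{l^{2n+2}} = \frac{2^{2n+1} - 1}{2^{2n+1}} \, \zeta(2n + 2) \tag{888}
$$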
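The chain of identities in Eqs. (886) to (888) is easy to verify numerically for the two lowest odd powers. The sketch below compares a direct quadrature of the Fermi integral, the alternating series, and the zeta-function form, using the known values $\zeta(2) = \pi^2 / 6$ and $\zeta(4) = \pi^4 / 90$. The function names are introduced here for illustration, and the upper cutoff of the integral is an assumption that only matters at the level of $\exp( - 60 )$.

```python
import math

def simpson(f, a, b, n=6000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def fermi_moment_numeric(n, x_max=60.0):
    """Left-hand side of Eq. (886), integrated up to x_max."""
    return simpson(lambda x: x**n / (1.0 + math.exp(x)), 0.0, x_max)

def fermi_moment_series(n, terms=20000):
    """n! times the alternating series of Eq. (886)."""
    return math.factorial(n) * sum((-1.0) ** (l + 1) / l ** (n + 1) for l in range(1, terms + 1))

ZETA = {2: math.pi**2 / 6.0, 4: math.pi**4 / 90.0}

def fermi_moment_zeta(m):
    """Same quantity via Eq. (888), for the odd power n = 2m + 1."""
    n = 2 * m + 1
    return math.factorial(n) * (2**n - 1) / 2**n * ZETA[n + 1]

for m in (0, 1):
    n = 2 * m + 1
    print(n, fermi_moment_numeric(n), fermi_moment_series(n), fermi_moment_zeta(m))
```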
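The Riemann zeta functions have special values
$$
\zeta(2) = \frac{\pi^2}{6} , \qquad \zeta(4) = \frac{\pi^4}{90} \tag{889}
$$
Thus, the Sommerfeld expansion for the total electronic energy only involves even powers of $T$, that is,
$$
E = 2 \int_{-\infty}^{\mu} d\epsilon \, \rho(\epsilon) \, \epsilon + 4 ( k_B T )^2 \sum_{n=0}^{\infty} \left\{ \frac{2^{2n+1} - 1}{2^{2n+1}} \, \zeta(2n + 2) \, ( k_B T )^{2n} \, \frac{\partial^{(2n+1)}}{\partial \mu^{(2n+1)}} \left[ \mu \, \rho(\mu) \right] \right\} \tag{890}
$$
The coefficients may be evaluated in terms of the Riemann $\zeta$ functions. Although the expansion contains an explicit temperature dependence, there is an implicit temperature dependence in the chemical potential $\mu$. This temperature dependence can be found from the equation
$$
N_e = 2 \int_{-\infty}^{+\infty} d\epsilon \, \rho(\epsilon) \, f(\epsilon) \tag{891}
$$
which also can be expanded in powers of $T^2$ as
$$
N_e = 2 \int_{-\infty}^{\mu} d\epsilon \, \rho(\epsilon) + 4 ( k_B T )^2 \sum_{n=0}^{\infty} \left\{ \frac{2^{2n+1} - 1}{2^{2n+1}} \, \zeta(2n + 2) \, ( k_B T )^{2n} \, \frac{\partial^{(2n+1)}}{\partial \mu^{(2n+1)}} \rho(\mu) \right\} \tag{892}
$$
Since $N_e$ is temperature independent, in principle, the series expansion can be inverted to yield $\mu$ in powers of $T$.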
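As an illustration of inverting the number equation, the leading term of Eq. (892) gives $\mu \approx \epsilon_F - \frac{\pi^2}{6} ( k_B T )^2 \rho'(\epsilon_F) / \rho(\epsilon_F)$. The sketch below assumes a free-electron density of states $\rho(\epsilon) \propto \sqrt{\epsilon}$ (an assumption made only for this example), for which the estimate becomes $\mu \approx \epsilon_F [ 1 - \frac{\pi^2}{12} ( k_B T / \epsilon_F )^2 ]$, and compares it with a direct numerical solution of Eq. (891). Energies are in units of $\epsilon_F$ and all function names are hypothetical.

```python
import math

def fermi(eps, mu, kT):
    z = (eps - mu) / kT
    if z > 700.0:                       # avoid overflow in exp for very large arguments
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def electron_number(mu, kT, n=4000):
    """Integral of sqrt(eps) f(eps) over eps, using eps = u^2 to remove the square-root endpoint."""
    u_max = math.sqrt(max(mu, 0.0) + 40.0 * kT)
    h = u_max / n
    total = 0.0
    for i in range(n + 1):
        u = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)   # Simpson weights
        total += weight * 2.0 * u * u * fermi(u * u, mu, kT)
    return total * h / 3.0

def chemical_potential(kT):
    """Solve N(mu, T) = N(eps_F, 0) = 2/3 for mu by bisection."""
    lo, hi = 0.5, 1.5
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if electron_number(mid, kT) > 2.0 / 3.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def chemical_potential_sommerfeld(kT):
    """Leading-order estimate from Eq. (892) for rho proportional to sqrt(eps)."""
    return 1.0 - (math.pi**2 / 12.0) * kT**2

kT = 0.05
print(chemical_potential(kT), chemical_potential_sommerfeld(kT))
```

The residual difference between the two values is of fourth order in $k_B T / \epsilon_F$, as expected from the next term of the expansion.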
11.1.2
The Specific Heat Capacity
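The electronic contribution to the heat capacity, for non-interacting electrons, can be expressed as
$$
C_{N_e}(T) = T \left( \frac{\partial S}{\partial T} \right)_{N_e} = \left( \frac{\partial E}{\partial T} \right)_{N_e} \tag{893}
$$
as the solid remains electrically neutral. Using the Sommerfeld expansion of the energy, the specific heat can be expressed as the sum of the specific heat at constant $\mu$ and a term depending on the temperature derivative of $\mu$ at constant $N_e$.
$$
\begin{aligned}
C_{N_e} &= 4 k_B^2 T \sum_{n=0}^{\infty} \left\{ \frac{2^{2n+1} - 1}{2^{2n}} \, (n + 1) \, \zeta(2n + 2) \, ( k_B T )^{2n} \, \frac{\partial^{(2n+1)}}{\partial \mu^{(2n+1)}} \left[ \mu \, \rho(\mu) \right] \right\} \\
&\quad + \left( \frac{\partial \mu}{\partial T} \right)_{N_e} \left[ 2 \mu \, \rho(\mu) + 4 k_B^2 T^2 \sum_{n=0}^{\infty} \left\{ \frac{2^{2n+1} - 1}{2^{2n+1}} \, \zeta(2n + 2) \, ( k_B T )^{2n} \, \frac{\partial^{(2n+2)}}{\partial \mu^{(2n+2)}} \left[ \mu \, \rho(\mu) \right] \right\} \right]
\end{aligned} \tag{894}
$$
In the above expression, $\mu$ is to be expanded in powers of $T$ about its zero temperature value $\mu = \epsilon_F$. The temperature derivative of the chemical potential can be evaluated from the temperature derivative of the equation for the fixed number of electrons $N_e$,
$$
\begin{aligned}
0 &= 4 k_B^2 T \sum_{n=0}^{\infty} \left\{ \frac{2^{2n+1} - 1}{2^{2n}} \, (n + 1) \, \zeta(2n + 2) \, ( k_B T )^{2n} \, \frac{\partial^{(2n+1)}}{\partial \mu^{(2n+1)}} \rho(\mu) \right\} \\
&\quad + \left( \frac{\partial \mu}{\partial T} \right)_{N_e} \left[ 2 \rho(\mu) + 4 k_B^2 T^2 \sum_{n=0}^{\infty} \left\{ \frac{2^{2n+1} - 1}{2^{2n+1}} \, \zeta(2n + 2) \, ( k_B T )^{2n} \, \frac{\partial^{(2n+2)}}{\partial \mu^{(2n+2)}} \rho(\mu) \right\} \right]
\end{aligned} \tag{895}
$$
This equation yields the temperature dependence of $\mu$ which can be substituted back into the expression for the temperature dependence of $C_N$. This yields the leading term in the low-temperature expansion for the electronic specific heat of non-interacting electrons as
$$
\begin{aligned}
C_N &= k_B^2 T \, 4 \, \zeta(2) \, \rho(\mu) + O( k_B^4 T^3 ) \\
&= \frac{2 \pi^2}{3} k_B^2 T \, \rho(\mu) + O( k_B^4 T^3 )
\end{aligned} \tag{896}
$$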
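The linear-in-$T$ result of Eq. (896) can be checked against a direct numerical derivative of the energy. The sketch below assumes, purely for illustration, a flat density of states $\rho$ per spin on a band $[ -W, W ]$ at half filling, so that $\mu = 0$ at all temperatures by particle-hole symmetry and no chemical potential shift has to be tracked. Units are chosen with $k_B = 1$, and the function names are hypothetical.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def fermi(eps, kT):
    """Fermi function with mu = 0."""
    return 1.0 / (1.0 + math.exp(eps / kT))

def energy(kT, rho=1.0, half_width=1.0):
    """E = 2 * integral of rho * eps * f(eps), Eq. (875), for a flat band."""
    return 2.0 * rho * simpson(lambda e: e * fermi(e, kT), -half_width, half_width)

def specific_heat(kT, d=1e-3):
    """Central finite difference dE/dT at fixed particle number (mu stays at 0)."""
    return (energy(kT + d) - energy(kT - d)) / (2.0 * d)

def specific_heat_sommerfeld(kT, rho=1.0):
    """Leading term of Eq. (896) with k_B = 1."""
    return (2.0 * math.pi**2 / 3.0) * rho * kT

kT = 0.02
print(specific_heat(kT), specific_heat_sommerfeld(kT))
```

For a flat band the higher-order Sommerfeld terms vanish, so the agreement is limited only by exponentially small band-edge corrections and by the quadrature.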
The coefficient of the linear term is proportional to the density of states, per spin, at the Fermi-energy. The result is understood by noting that the Pauli exclusion principle prevents electrons from being thermally excited, unless they are within $k_B T$ of the Fermi-energy. There are $\rho(\mu) \, k_B T$ such electrons, and each electron contributes $k_B$ to the specific heat. Thus, the low-temperature specific heat is of the order of $k_B^2 T \rho(\mu)$. The inclusion of electron-electron interactions changes this result and, in a Fermi-liquid, may increase the coefficient of $T$. The low-temperature specific heat is enhanced, due to the enhancement of the quasi-particle masses. This can be demonstrated by a simplified calculation in which the quasi-particle weight is assumed to be independent of $k$. Since the quasi-particle width in the vicinity of the Fermi-energy is negligible, one has the relationship between the quasi-particle density of states and the density of states for non-interacting electrons given by
$$
\begin{aligned}
\rho_{qp}(E) &= \sum_k \delta\left( Z(k) \, E - E_k + \mu \right) \\
&= \sum_k \frac{1}{Z(k)} \, \delta\left( E - \frac{( E_k - \mu )}{Z_k} \right) \\
&= \sum_k \frac{1}{Z(k)} \, \delta\left( E - E_{qp}(k) \right)
\end{aligned} \tag{897}
$$
Also, the quasi-particle density of states at the Fermi-energy is un-renormalized as
$$
\rho_{qp}(0) = \sum_k \delta\left( \mu - E_k \right) = \rho(\mu) \tag{898}
$$
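The $\gamma$ term in the low-temperature specific heat is calculated from the quasi-particle entropy $S$ defined in terms of the quasi-particle occupation numbers $n^{qp}_k$ by
$$ \begin{aligned} S &= - k_B \sum_{\sigma, k} \Big[ n^{qp}_k \ln n^{qp}_k + ( 1 - n^{qp}_k ) \ln ( 1 - n^{qp}_k ) \Big] \\ &= - 2 k_B \int_{-\infty}^{\infty} dE \; Z \, \rho_{qp}(E) \Big[ f(E) \ln f(E) + ( 1 - f(E) ) \ln ( 1 - f(E) ) \Big] \end{aligned} \tag{899} $$
Thus, in this approximation, the coefficient of the linear $T$ term is given by
$$ \gamma = \lim_{T \to 0} \frac{C_N}{T} = \left( \frac{\partial S}{\partial T} \right)_{N_e} = k_B^2 \, Z \, \frac{2 \pi^2}{3} \rho(\mu) \tag{900} $$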
In the more general case, the specific heat coefficient is enhanced through a $k$-weighted average of the quasi-particle mass enhancement $Z_k$. For materials like CeCu$_6$, CeCu$_2$Si$_2$, CeAl$_3$ and UBe$_{13}$, the values of the $\gamma$ coefficients are extremely large, of the order of 1 J / mole of f ion / K$^2$, which is 1000 times larger than that of Cu. The quasi-particle mass enhancements are inferred by comparison to L.D.A. electronic density of states calculations and are about 10 to 30. The enhancement is assumed to be due to the strong electron-electron interactions, which the L.D.A. fails to take into account.
——————————————————————————————————
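To give a feel for the scale quoted above, the following minimal sketch evaluates the free-electron value of $\gamma$ from $\gamma = \frac{2\pi^2}{3} k_B^2 \rho(\mu)$, using the free-electron density of states per spin $\rho(\mu) = 3 N / (4 \epsilon_F)$. The Fermi energy of roughly 7 eV for Cu, the one conduction electron per atom, and the function name `free_electron_gamma` are assumptions made for this illustration only.

```python
import math

# Physical constants (rounded CODATA values).
GAS_CONSTANT = 8.314462   # R = N_A * k_B, in J / (mol K)
K_B_EV = 8.617333e-5      # Boltzmann constant in eV / K


def free_electron_gamma(fermi_energy_ev, electrons_per_atom=1.0):
    """Free-electron Sommerfeld coefficient in J / (mol K^2).

    gamma = (2 pi^2 / 3) k_B^2 rho(mu) with rho(mu) = 3 N / (4 E_F) per spin,
    which reduces to gamma = (pi^2 / 2) N k_B^2 / E_F.
    """
    return (math.pi ** 2 / 2.0) * electrons_per_atom * GAS_CONSTANT * K_B_EV / fermi_energy_ev


# Assumed input: a free-electron Fermi energy of about 7 eV for Cu.
gamma_cu = free_electron_gamma(7.0)
heavy_fermion_gamma = 1.0  # J / (mol K^2), the order of magnitude quoted in the text
enhancement = heavy_fermion_gamma / gamma_cu

print(f"gamma (Cu, free electron) = {gamma_cu * 1e3:.3f} mJ / (mol K^2)")
print(f"1 J / (mol K^2) is larger by a factor of about {enhancement:.0f}")
```

Under these assumptions the free-electron estimate is a fraction of a mJ / mol K$^2$, so a coefficient of order 1 J / mol K$^2$ is larger by roughly three orders of magnitude.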
11.1.3
Exercise 46
Calculate the next-to-leading-order term in the low-temperature electronic specific heat.
——————————————————————————————————
11.1.4
Exercise 47
CeNiSn is thought to be a zero-gap semiconductor with a V-shaped density of states. The density of states near the Fermi-level is approximated by
$$ \rho(\epsilon) = \begin{cases} \alpha_0 \, \epsilon & \text{for } \epsilon > 0 \\ - \alpha_1 \, \epsilon & \text{for } \epsilon < 0 \end{cases} \tag{901} $$
where $\alpha_0$ and $\alpha_1$ are positive numbers. Find the leading temperature dependence of the low-temperature specific heat.
——————————————————————————————————
11.1.5
Pauli Paramagnetism
In the absence of spin-orbit scattering effects, the susceptibility of a metal can be decomposed into two contributions: the susceptibility due to the spins of the electrons, and the susceptibility due to the electrons' orbital motion. The spin susceptibility for non-interacting electrons gives rise to the Pauli-paramagnetic susceptibility, which is positive and is temperature independent at sufficiently low temperatures. The susceptibility due to the orbital motion has a negative sign and, therefore, yields the Landau-diamagnetic susceptibility. The magnetization due to the electronic spins can be calculated from
$$ M_z = - \frac{\partial \Omega}{\partial H_z} \tag{902} $$
where the grand canonical potential is given by
$$ \Omega = - k_B T \sum_{\alpha} \ln \Big( 1 + \exp \big[ - \beta ( E_\alpha - \mu ) \big] \Big) \tag{903} $$
and where the sum over $\alpha$ runs over the quantum numbers of the single particle states including the spin. The applied magnetic field $H_z$ couples to the quantum number corresponding to the $z$ component of the spin of the electron, $\sigma$, via the Zeeman energy
$$ \hat{H}_{Zeeman} = - \frac{g | e |}{2 m_e c} H_z S_z = - \mu_B H_z \sigma \tag{904} $$
where the spin angular momentum is given by $S = \frac{\hbar}{2} \sigma$ and the gyromagnetic ratio $g = 2$ originates from the Dirac or Pauli equation. The quantity $\mu_B$ is the Bohr magneton and is given in terms of the electron's charge and mass by
$$ \mu_B = \frac{| e | \hbar}{2 m_e c} \tag{905} $$
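The energy of a particle can then be written as
$$ E_\sigma(k) = E(k) - \mu_B \sigma H_z \tag{906} $$
where $\sigma$ is the eigenvalue of the Pauli spin matrix $\hat{\sigma}_z$. The density of states, per spin, in the absence of the field is defined as
$$ \rho(\epsilon) = \sum_k \delta( \epsilon - E(k) ) \tag{907} $$
Thus, in the presence of a field, one has the spin dependent density of states
$$ \rho_\sigma(\epsilon) = \rho( \epsilon + \mu_B \sigma H_z ) \tag{908} $$
The grand canonical potential can be expressed as an integral over the density of states
$$ \begin{aligned} \Omega &= - k_B T \int_{-\infty}^{\infty} d\epsilon \sum_\sigma \rho( \epsilon + \mu_B \sigma H_z ) \ln \Big( 1 + \exp \big[ - \beta ( \epsilon - \mu ) \big] \Big) \\ &= - k_B T \int_{-\infty}^{\infty} dE \sum_\sigma \rho(E) \ln \Big( 1 + \exp \big[ - \beta ( E - \mu_B \sigma H_z - \mu ) \big] \Big) \end{aligned} \tag{909} $$
where the variable of integration has been changed in the last line. The summation over $\sigma$ runs over the values $\pm 1$. The spin contribution to the magnetization induced by the applied field is given by
$$ \begin{aligned} M_z &= \mu_B \int_{-\infty}^{\infty} dE \sum_\sigma \sigma \, \rho(E) \, \frac{1}{1 + \exp \big[ \beta ( E - \mu_B \sigma H_z - \mu ) \big]} \\ &= \mu_B \int_{-\infty}^{\infty} dE \sum_\sigma \sigma \, \rho(E) \, f( E - \mu_B \sigma H_z ) \\ &= \mu_B \Big[ N_e(\sigma = 1) - N_e(\sigma = -1) \Big] \end{aligned} \tag{910} $$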
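The last line of Eq. (910) can be evaluated numerically as a difference of spin populations. The sketch below does this in dimensionless units with $\mu_B = 1$ and $\mu = 1$, and compares the ratio $M_z / H_z$ at a weak field with $2 \mu_B^2 \rho(\mu)$, the low-temperature result obtained later in this section. The free-electron form $\rho(E) = \sqrt{E}$, the temperature $k_B T = 0.01$ and all function names are assumptions chosen for illustration.

```python
import math


def fermi(x):
    """Fermi function 1 / (exp(x) + 1), written in a form that cannot overflow."""
    return 0.5 * (1.0 - math.tanh(0.5 * x))


def density_of_states(energy):
    """Assumed free-electron form rho(E) = sqrt(E), per spin, arbitrary units."""
    return math.sqrt(energy)


def spin_population(sigma, field, mu=1.0, kT=0.01, e_max=2.0, steps=20000):
    """N_e(sigma): trapezoidal integral of rho(E) f(E - mu_B sigma H_z - mu), mu_B = 1."""
    h = e_max / steps
    total = 0.0
    for i in range(steps + 1):
        energy = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * density_of_states(energy) * fermi((energy - sigma * field - mu) / kT)
    return total * h


def magnetization(field):
    """M_z = mu_B [N_e(sigma = +1) - N_e(sigma = -1)] with mu_B = 1."""
    return spin_population(+1, field) - spin_population(-1, field)


field = 1.0e-3
chi_numeric = magnetization(field) / field
chi_zero_temperature = 2.0 * density_of_states(1.0)  # 2 mu_B^2 rho(mu)

print(f"M_z / H_z at weak field = {chi_numeric:.5f}")
print(f"2 mu_B^2 rho(mu)        = {chi_zero_temperature:.5f}")
```

The chemical potential is held fixed in this sketch, so it illustrates the linear response of the spin populations rather than the full temperature dependence at fixed $N_e$.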
The magnetization due to the spins is just proportional to the number of up-spin electrons minus the number of down-spin electrons. The spin susceptibility is defined by
$$ \chi_p^{zz}(T, H_z) = \frac{\partial M_z}{\partial H_z} \tag{911} $$
and is given by
$$ \chi_p(T, H_z) = - \mu_B^2 \int_{-\infty}^{\infty} dE \sum_\sigma \sigma^2 \rho(E) \, \frac{\partial}{\partial E} f( E - \mu_B \sigma H_z ) \tag{912} $$
It is usual to measure the susceptibility at zero field. Since the derivative of the Fermi-function is peaked around the chemical potential, only electrons within $k_B T$ of the Fermi-energy contribute to the Pauli-susceptibility. At sufficiently low temperatures, one may use the approximation
$$ - \frac{\partial f}{\partial E} = \delta( E - \epsilon_F ) \tag{913} $$
so that the zero temperature value of the Pauli-susceptibility is evaluated as
$$ \chi_p(0) = 2 \mu_B^2 \rho(\epsilon_F) \tag{914} $$
which is inversely proportional to the free electron mass, and is also proportional to the density of states at the Fermi-energy. The finite temperature susceptibility can be evaluated by integration by parts, to obtain
$$ \chi_p(T) = 2 \mu_B^2 \int_{-\infty}^{\infty} dE \; f(E) \, \frac{\partial \rho(E)}{\partial E} \tag{915} $$
The zero field spin susceptibility can then be obtained via the Sommerfeld expansion
$$ \chi_p(T) = 2 \mu_B^2 \left[ \rho(\mu) + 2 k_B^2 T^2 \sum_{n=0}^{\infty} \frac{2^{2n+1} - 1}{2^{2n+1}} \, \zeta(2n+2) \, (k_B T)^{2n} \frac{\partial^{(2n+2)}}{\partial \mu^{(2n+2)}} \rho(\mu) \right] \tag{916} $$
Thus, the spin susceptibility has the form of a power series in $T^2$. The temperature dependence of the chemical potential can be found from the equation for $N_e$. The leading change in the chemical potential $\Delta\mu$ due to $T$ is given by
$$ \Delta \mu = - \frac{\pi^2}{6} k_B^2 T^2 \left( \frac{ \frac{\partial \rho(\epsilon_F)}{\partial \epsilon_F} }{ \rho(\epsilon_F) } \right) + O( k_B^4 T^4 ) \tag{917} $$
The temperature dependence of the chemical potential depends on the logarithmic derivative of the density of states, such that it moves away from the region of high density of states to keep the number of electrons fixed. This leads to the leading temperature dependence of the Pauli susceptibility being given by
$$ \chi_p(T) = 2 \mu_B^2 \left[ \rho(\epsilon_F) + \frac{\pi^2}{6} k_B^2 T^2 \left( \frac{\partial^2 \rho(\epsilon_F)}{\partial \epsilon_F^2} - \frac{ \left( \frac{\partial \rho(\epsilon_F)}{\partial \epsilon_F} \right)^2 }{ \rho(\epsilon_F) } \right) + O( k_B^4 T^4 ) \right] \tag{918} $$
The temperature dependence gives information about the derivatives of the density of states. The coefficient $\gamma$ of the linear $T$ term in the low-temperature specific heat and the zero temperature susceptibility are proportional to the density of states at the Fermi-energy. The susceptibility and specific heat can be used to define the dimensionless ratio
$$ \lim_{T \to 0} \frac{T \chi_p(T)}{C(T)} = \frac{\chi_p(0)}{\gamma} \tag{919} $$
This ratio is known as the Sommerfeld ratio. For free electrons, this ratio has the value
$$ \lim_{T \to 0} \frac{T \chi_p(T)}{C(T)} = \frac{3 \mu_B^2}{\pi^2 k_B^2} \tag{920} $$
The effect of electron-electron interactions can change this ratio, as they may affect the susceptibility in a different manner than the specific heat. The Stoner model, discussed in the chapter on magnetism, shows that the effect of electron-electron interactions can produce a large enhancement of the paramagnetic susceptibility for electron systems close to a ferromagnetic instability. Thus, near a ferromagnetic instability, the Sommerfeld ratio is expected to be large. However, for heavy fermion materials where both $C(T)/T$ and $\chi_p(0)$ are highly enhanced, the value of the Sommerfeld ratio is very close to that of non-interacting electrons.
——————————————————————————————————
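The leading shift of the chemical potential in Eq. (917) can be checked numerically. The sketch below fixes the electron number for an assumed free-electron density of states $\rho(E) = \sqrt{E}$, for which Eq. (917) reduces to $\Delta\mu = - \frac{\pi^2}{12} (k_B T)^2 / \epsilon_F$, and solves for $\mu(T)$ by bisection. Energies are measured in units of $\epsilon_F$; the function names and the choice $k_B T = 0.05$ are illustrative assumptions.

```python
import math


def fermi(x):
    """Fermi function 1 / (exp(x) + 1), written in a form that cannot overflow."""
    return 0.5 * (1.0 - math.tanh(0.5 * x))


def electron_number(mu, kT, e_max=3.0, steps=4000):
    """Trapezoidal integral of sqrt(E) f(E) over 0 < E < e_max.

    The substitution E = x^2 removes the square-root singularity at E = 0.
    """
    h = math.sqrt(e_max) / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * 2.0 * x * x * fermi((x * x - mu) / kT)
    return total * h


def chemical_potential(kT, fermi_energy=1.0):
    """Solve N(mu, T) = N(eps_F, T = 0) for mu by bisection."""
    target = (2.0 / 3.0) * fermi_energy ** 1.5
    lo, hi = 0.5 * fermi_energy, 1.5 * fermi_energy
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if electron_number(mid, kT) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


kT = 0.05
shift_numeric = chemical_potential(kT) - 1.0
shift_leading = -(math.pi ** 2 / 12.0) * kT ** 2  # Eq. (917) for rho ~ sqrt(E)

print(f"numerical shift of mu      = {shift_numeric:.6f}")
print(f"leading Sommerfeld term    = {shift_leading:.6f}")
```

The chemical potential moves downward, away from the region of higher density of states, and the small residual difference between the two numbers is of the order of the neglected $(k_B T)^4$ term.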
11.1.6
Exercise 48
Determine the field dependence of the low-temperature Pauli susceptibility. ——————————————————————————————————
11.1.7
Exercise 49
Determine the high temperature form of the Pauli susceptibility. ——————————————————————————————————
11.1.8
Landau Diamagnetism
Free electrons in a magnetic field aligned along the $z$ axis have quantized energies given by
$$ E_{k_z, n} = \frac{\hbar^2 k_z^2}{2 m} + \left( n + \frac{1}{2} \right) \hbar \omega_c \tag{921} $$
where
$$ \omega_c = \frac{| e | H_z}{m c} \tag{922} $$
is the cyclotron frequency and $n$ is a non-negative integer. For a cubic sample of linear dimension $L$, the value of $k_z$ is given by
$$ k_z = \frac{2 \pi}{L} n_z \tag{923} $$
The Landau levels have their orbits in the $x$-$y$ plane quantized and have a level spacing of $\hbar \omega_c$. Each Landau level is highly degenerate. The degeneracy $D$, or number of electrons with a given $n$ and $k_z$, can be found as the ratio of the area of the sample divided by the area enclosed by the classical orbit
$$ D = \frac{L^2}{2 \pi r_c^2} \tag{924} $$
where $r_c$ is the radius of the classical orbit. This radius can be obtained by equating the field energy with the zero point energy of the Landau level
$$ \frac{m}{2} \omega_c^2 r_c^2 = \frac{1}{2} \hbar \omega_c \tag{925} $$
Thus, the degeneracy is given by
$$ D = \frac{L^2 m \omega_c}{2 \pi \hbar} = \frac{| e | L^2 H_z}{h c} \tag{926} $$
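The size of these quantities can be illustrated numerically. The sketch below evaluates $\hbar \omega_c$ per gauss, the degeneracy $D$ of Eq. (926), and the ratio $\mu / \hbar \omega_c$, which counts the Landau levels lying below the Fermi-energy. The constants are rounded Gaussian-unit values; the sample size of 1 cm, the Fermi energy of 1 eV and the function names are assumptions made for the example.

```python
# Gaussian (CGS) constants, rounded CODATA values.
E_CHARGE = 4.803205e-10    # magnitude of the electron charge, esu
HBAR = 1.054572e-27        # reduced Planck constant, erg s
PLANCK_H = 6.626070e-27    # Planck constant, erg s
M_ELECTRON = 9.109384e-28  # free-electron mass, g
C_LIGHT = 2.997925e10      # speed of light, cm / s
ERG_PER_EV = 1.602177e-12  # conversion factor


def cyclotron_energy_ev(field_gauss, mass=M_ELECTRON):
    """hbar * omega_c = |e| hbar H_z / (m c), returned in eV."""
    return E_CHARGE * HBAR * field_gauss / (mass * C_LIGHT) / ERG_PER_EV


def landau_degeneracy(field_gauss, length_cm):
    """D = |e| L^2 H_z / (h c) for a square cross-section of side L."""
    return E_CHARGE * length_cm ** 2 * field_gauss / (PLANCK_H * C_LIGHT)


# Assumed illustrative values: a 1 cm sample and a 1 eV Fermi energy.
per_gauss = cyclotron_energy_ev(1.0)
degeneracy_1kG = landau_degeneracy(1.0e3, 1.0)
occupied_levels = 1.0 / cyclotron_energy_ev(1.0e4)

print(f"hbar omega_c per gauss   = {per_gauss:.3e} eV")
print(f"D at 1 kG, L = 1 cm      = {degeneracy_1kG:.2e}")
print(f"mu / (hbar omega_c), 1 eV at 10 kG = {occupied_levels:.0f}")
```

Because $D$ scales as $L^2 H_z$, the degeneracy is macroscopic for any laboratory-sized sample and field.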
Since $H_z \sim 1$ kG, a typical value of the degeneracy is of the order of $D \sim 10^{10}$. These levels can be treated semi-classically as there are an enormous number of Landau levels in an energy interval. The number of occupied Landau levels is given by the Fermi-energy $\mu$ divided by $\hbar \omega_c$,
$$ \frac{\mu}{\hbar \omega_c} = \frac{\mu}{ \left( \frac{| e | \hbar}{m c} \right) H_z } \tag{927} $$
The numerical constant has the value
$$ \frac{| e | \hbar}{m c} \sim 1.16 \times 10^{-8} \ \mathrm{eV / G} \tag{928} $$
so, with $\mu \sim 1$ eV and $H_z \sim 10^4$ G, one finds that the number of occupied Landau levels is approximately given by
$$ \frac{\mu}{\hbar \omega_c} \sim 10^4 \tag{929} $$
11.1.9
Landau Level Quantization
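The Hamiltonian of a free electron in a magnetic field is given by
$$ \hat{H} = \frac{1}{2m} \left( \hat{\mathbf{p}} + \frac{| e |}{c} \mathbf{A} \right)^2 \tag{930} $$
Using the gauge $\mathbf{A} = (0, H_z x, 0)$ appropriate for a field along the $z$ axis, the Schrödinger equation takes the form
$$ - \frac{\hbar^2}{2 m} \nabla^2 \phi - i \hbar \frac{| e | H_z}{m c} \, x \, \frac{\partial \phi}{\partial y} + \frac{e^2 H_z^2}{2 m c^2} \, x^2 \phi = E \phi \tag{931} $$
This can be solved by the substitution
$$ \phi(\mathbf{r}) = f(x) \exp \big[ i ( k_y y + k_z z ) \big] \tag{932} $$
so that $f(x)$ satisfies
$$ - \frac{\hbar^2}{2 m} \frac{\partial^2 f}{\partial x^2} + \left[ \frac{1}{2 m} \left( \hbar k_y + \frac{| e | H_z}{c} x \right)^2 - \left( E - \frac{\hbar^2 k_z^2}{2 m} \right) \right] f(x) = 0 \tag{933} $$
which is recognized as the equation for the harmonic oscillator with energy eigenvalue
$$ E - \frac{\hbar^2 k_z^2}{2 m} = \left( n + \frac{1}{2} \right) \hbar \omega_c \tag{934} $$
where
$$ \omega_c^2 = \frac{e^2 H_z^2}{m^2 c^2} \tag{935} $$
That is, the motion in the plane perpendicular to the field, $H_z$, is quantized into Landau levels (L.D. Landau, Zeit. für Physik, 64, 629 (1930)). The energy spacing between the levels is given by $\hbar \omega_c$, where
$$ \omega_c = \frac{| e | H_z}{m c} \tag{936} $$
and the orbit is centered around the position
$$ x_0 = - \frac{\hbar k_y c}{| e | H_z} \tag{937} $$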
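The identification with the harmonic oscillator can be made explicit by completing the square in the potential term of Eq. (933). The following step is a sketch added for illustration and uses only the definitions of $\omega_c$ and $x_0$ given above.

```latex
\begin{align*}
\frac{1}{2m} \left( \hbar k_y + \frac{|e| H_z}{c} \, x \right)^2
  &= \frac{1}{2m} \left( \frac{|e| H_z}{c} \right)^2
     \left( x + \frac{\hbar k_y c}{|e| H_z} \right)^2 \\
  &= \frac{m \omega_c^2}{2} \, ( x - x_0 )^2 .
\end{align*}
```

The wave vector $k_y$ therefore enters only through the position of the oscillator minimum, which is why the energy eigenvalues do not depend on $k_y$.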
The momentum dependence of the position $x_0$ has a classical analogy. The center of the classical orbit is determined by its initial velocity $v_y$ via
$$ v_y = \omega_c x_0 \tag{938} $$
so the center of the quantum orbit is determined by $p_y$. The energy of the Landau orbit is given by
$$ E_{k_z, n} = \frac{\hbar^2 k_z^2}{2 m} + \left( n + \frac{1}{2} \right) \hbar \omega_c \tag{939} $$
The degeneracy of the $n$-th level must correspond to the number of $k_x$, $k_y$ values for $H_z = 0$ that collapse onto the Landau levels as $H_z$ is increased. The degeneracy can be enumerated in the case of periodic boundary conditions,
$$ \phi(x, y, z) = \phi(x, y + L_y, z) \tag{940} $$
The periodic boundary conditions imply that
$$ \exp \big[ i k_y L_y \big] = 1 \tag{941} $$
or
$$ k_y = \frac{2 \pi}{L_y} n_y \tag{942} $$
The $x$ dependent factor of the wave function $f(x)$ is centered at $x_0$ where
$$ x_0 = - \frac{\hbar k_y c}{| e | H_z} \tag{943} $$
For a sample of width $L_x$, one must have $L_x > x_0 > 0$, so one has the inequality
$$ \frac{| e | H_z}{\hbar c} L_x > - k_y > 0 \tag{944} $$
The degeneracy, $D$, is the number of quantized $k_y$ values that satisfy this inequality. The degeneracy is found to be
$$ D = \frac{ \frac{| e | H_z L_x}{\hbar c} }{ \frac{2 \pi}{L_y} } = \frac{| e | H_z L_x L_y}{2 \pi \hbar c} \tag{945} $$
independent of $n$. Thus, the degeneracy $D$ of every Landau level is given by
$$ D = \frac{| e | H_z L_x L_y}{2 \pi \hbar c} \tag{946} $$
The degeneracy can be expressed in terms of the amplitude of the oscillations in the $x$ direction, which is defined as the length scale that determines the exponential fall off of the ground state wave function
$$ r_c = \sqrt{ \frac{\hbar}{m \omega_c} } \tag{947} $$
The degeneracy of the Landau levels can also be expressed as
$$ D = \frac{L_x L_y}{2 \pi r_c^2} \tag{948} $$
as previously found from classical considerations. The quantization of the orbital motion, in the presence of a periodic potential, has been considered by Rauh (A. Rauh, Phys. Stat. Solidi, B 65, K131 (1974), A. Rauh, Phys. Stat. Solidi, B 69, K9 (1975)) and by Harper (P.G. Harper, Ph.D. Thesis, University of Birmingham (1954), P.G. Harper, Proc. Phys. Soc. London, A 68, 874 (1955)). These authors have shown that the periodic potential causes the Landau levels to be broadened or split.
11.1.10
The Diamagnetic Susceptibility
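The diamagnetic susceptibility is determined from the field dependence of the grand canonical potential, $\Omega$,
$$ \Omega = 2 D \frac{L_z}{2 \pi} \sum_n \int_{-\infty}^{\infty} dk_z \left[ \hbar \omega_c \left( n + \frac{1}{2} \right) + \frac{\hbar^2 k_z^2}{2 m} - \mu \right] \Theta\left( \mu - \hbar \omega_c \left( n + \frac{1}{2} \right) - \frac{\hbar^2 k_z^2}{2 m} \right) \tag{949} $$
On integrating over $k_z$, one finds
$$ \Omega = - \frac{8}{3} \, \frac{D L_z}{2 \pi} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \sum_{n=0}^{\frac{\mu}{\hbar \omega_c} - \frac{1}{2}} \left( \mu - \hbar \omega_c \left( n + \frac{1}{2} \right) \right)^{\frac{3}{2}} \tag{950} $$
or
$$ \Omega = - \frac{2}{3} \, \frac{V m \omega_c}{\pi^2 \hbar} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \sum_{n=0}^{\frac{\mu}{\hbar \omega_c} - \frac{1}{2}} \left( \mu - \hbar \omega_c \left( n + \frac{1}{2} \right) \right)^{\frac{3}{2}} \tag{951} $$
The summation over $n$ can be performed using the Euler-MacLaurin formula
$$ \sum_{n=0}^{N} F(n) = \int_0^N dx \, F(x) + \frac{1}{2} \Big( F(0) + F(N) \Big) + \frac{1}{12} \Big( F'(N) - F'(0) \Big) + \ldots \tag{952} $$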
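As a sketch of the intermediate step, the formula can be applied to $F(n) = \left( \mu - \hbar \omega_c ( n + \frac{1}{2} ) \right)^{3/2}$, with the upper limit $N = \frac{\mu}{\hbar \omega_c} - \frac{1}{2}$ treated as a continuous variable so that $F(N) = F'(N) = 0$. Treating $N$ as continuous discards the oscillatory contributions to the sum, which is an assumption of this illustration. Expanding each term to second order in $\hbar \omega_c$ gives

```latex
\begin{align*}
\int_0^N dx \, F(x)
  &= \frac{2}{5 \hbar \omega_c} \left( \mu - \tfrac{1}{2} \hbar \omega_c \right)^{5/2}
   \simeq \frac{1}{\hbar \omega_c} \left[ \frac{2}{5} \mu^{5/2}
      - \frac{1}{2} \hbar \omega_c \, \mu^{3/2}
      + \frac{3}{16} \hbar^2 \omega_c^2 \, \mu^{1/2} \right] , \\
\frac{1}{2} F(0)
  &\simeq \frac{1}{\hbar \omega_c} \left[ \frac{1}{2} \hbar \omega_c \, \mu^{3/2}
      - \frac{3}{8} \hbar^2 \omega_c^2 \, \mu^{1/2} \right] , \\
- \frac{1}{12} F'(0)
  &\simeq \frac{1}{\hbar \omega_c} \left[ \frac{1}{8} \hbar^2 \omega_c^2 \, \mu^{1/2} \right] , \\
\sum_{n=0}^{N} F(n)
  &\simeq \frac{1}{\hbar \omega_c} \left[ \frac{2}{5} \mu^{5/2}
      - \frac{1}{16} \hbar^2 \omega_c^2 \, \mu^{1/2} \right] .
\end{align*}
```

The terms linear in $\hbar \omega_c$ cancel, and the coefficients of the quadratic terms combine as $\frac{3}{16} - \frac{3}{8} + \frac{1}{8} = - \frac{1}{16}$.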
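This produces the leading order field dependence of the grand canonical potential, given by
$$ \Omega = - \frac{2}{3} \, \frac{V m}{\pi^2 \hbar^2} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \left[ \frac{2}{5} \mu^{\frac{5}{2}} - \frac{1}{16} \hbar^2 \omega_c^2 \, \mu^{\frac{1}{2}} + \ldots \right] \tag{953} $$
The diamagnetic susceptibility is given by the second derivative with respect to the applied field
$$ \chi_d = - \frac{\partial^2 \Omega}{\partial H_z^2} = - \frac{1}{3} \mu_B^2 \, \frac{V m}{\pi^2 \hbar^2} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \mu^{\frac{1}{2}} \tag{954} $$
where we have expressed the orbital magnetic moment in terms of the (orbital) Bohr magneton
$$ \mu_B = \frac{| e | \hbar}{2 m c} \tag{955} $$
The diamagnetic susceptibility $\chi_d$ can be compared with the Pauli paramagnetic susceptibility $\chi_p$. For free electrons, the Pauli susceptibility is given by
$$ \chi_p = 2 \mu_B^2 \rho(\mu) = 2 \mu_B^2 \, \frac{V m k_F}{2 \pi^2 \hbar^2} = \mu_B^2 \, \frac{V m}{\pi^2 \hbar^2} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \mu^{\frac{1}{2}} \tag{956} $$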
Hence, the spin and orbital susceptibilities are related via
$$ \chi_d = - \frac{1}{3} \chi_p \tag{957} $$
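The factor of one third can be verified numerically from the expressions quoted in this section. The sketch below differentiates the field-dependent grand canonical potential of Eq. (953) twice by a finite difference and divides by the Pauli susceptibility $2 \mu_B^2 \rho(\mu)$. The units $\hbar = m = c = |e| = V = 1$ (so that $\omega_c = H_z$ and $\mu_B = \frac{1}{2}$) and the function names are conveniences assumed for the example.

```python
import math


def grand_potential(field, mu):
    """Leading field dependence of Omega, Eq. (953), in units hbar = m = c = |e| = V = 1."""
    omega_c = field  # omega_c = |e| H_z / (m c)
    prefactor = (2.0 / 3.0) * math.sqrt(2.0) / math.pi ** 2
    return -prefactor * (0.4 * mu ** 2.5 - omega_c ** 2 * math.sqrt(mu) / 16.0)


def chi_diamagnetic(mu, dh=1.0e-3):
    """chi_d = - d^2 Omega / d H_z^2, from a central finite difference at H_z = 0."""
    second = (grand_potential(dh, mu) - 2.0 * grand_potential(0.0, mu)
              + grand_potential(-dh, mu)) / dh ** 2
    return -second


def chi_pauli(mu):
    """chi_p = 2 mu_B^2 rho(mu), with rho(mu) = k_F / (2 pi^2) and mu_B = 1/2."""
    mu_bohr = 0.5
    k_fermi = math.sqrt(2.0 * mu)
    return 2.0 * mu_bohr ** 2 * k_fermi / (2.0 * math.pi ** 2)


ratio = chi_diamagnetic(1.0) / chi_pauli(1.0)
print(f"chi_d / chi_p = {ratio:.6f}")
```

Since both susceptibilities scale as $\mu^{1/2}$, the ratio is independent of the Fermi-energy chosen.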
Thus, the Landau diamagnetic susceptibility is negative and has a magnitude which, for free electrons, is just one third of the Pauli paramagnetic susceptibility (L.D. Landau, Zeit. für Physik, 64, 629 (1930)). The diamagnetism results from the quantized orbital angular momentum of the electrons. The value of $\mu_B$ in the diamagnetic susceptibility is given by the band mass $m^*$, whereas the factor of $\mu_B$ in the Pauli susceptibility is defined in terms of the mass of the electron in vacuum $m_e$. In systems such as Bismuth, in which the band mass is smaller than the free electron mass, the diamagnetic susceptibility is larger by a factor of
$$ \frac{\chi_d}{\chi_p} = - \frac{1}{3} \left( \frac{m_e}{m^*} \right)^2 \tag{958} $$
and the diamagnetic susceptibility can be larger than the Pauli susceptibility. The susceptibility of Bismuth is negative.
In the presence of spin-orbit coupling, the orbital angular momenta are coupled with the spin angular momenta. As a result, the components of the total susceptibility are coupled. The manner in which the total angular momentum couples to the field is described by the g factor.
11.2
Transport Properties
11.2.1
Electrical Conductivity
The electrical conductivity of a normal metal is considered. The application of an electromagnetic field will produce an acceleration of the electrons in the metal. This implies that the distribution of the electrons in phase space will become time dependent, and in particular the Fermi-surface will be subject to a time dependent distortion. However, the phenomenon of electrical transport in metals is usually a steady state process, in that the electric current density $\mathbf{j}$ produced by a static electric field $\mathbf{E}$ is time independent and obeys Ohm's law
$$ \mathbf{j} = \sigma \mathbf{E} \tag{959} $$
where $\sigma$ is the electrical conductivity. This steady state is established by scattering processes that dynamically balance the time dependent changes produced by the electric field. That is, once the steady state has been established, the acceleration of the electrons produced by the electric field is balanced by scattering processes that are responsible for equilibration. Since Ohm's law holds almost universally, without requiring any noticeable non-linear terms in $\mathbf{E}$ to describe the current density, it is safe to assume that the current density can be calculated by only considering the first order terms in the electromagnetic field. The validity of this assumption can be related to the smallness of the ratio $\lambda | e | E / \mu$, where $\lambda$ is the mean free path, $E$ the strength of the applied field and $\mu$ the Fermi-energy. This has the consequence that the Fermi-surface in the steady state where the field is present is only weakly perturbed from the Fermi-surface with zero field. A number of different approaches to the calculation of the electrical conductivity will be described. For simplicity, only the zero temperature limit of the conductivity shall be calculated. The dominant scattering process for the conductivity in this temperature range is scattering by static impurities.
11.2.2
Scattering by Static Defects
The electrical conductivity will be calculated for the case in which the scattering is due to a small concentration of randomly distributed impurities. The potential due to the distribution of impurities located at positions $\mathbf{r}_j$, each with a potential $V_{imp}(\mathbf{r})$, is given by
$$ V(\mathbf{r}) = \sum_j V_{imp}( \mathbf{r} - \mathbf{r}_j ) \tag{960} $$
This produces elastic scattering of electrons between Bloch states of different wave vectors. The transition rate in which an electron is scattered from the state with Bloch wave vector $\mathbf{k}$ to a state with Bloch wave vector $\mathbf{k}'$ is denoted by $\frac{1}{\tau(\mathbf{k} \to \mathbf{k}')}$. If the strength of the scattering potential is weak enough, the transition rate can be calculated from Fermi's golden rule as
$$ \begin{aligned} \frac{1}{\tau(\mathbf{k} \to \mathbf{k}')} &= \frac{2 \pi}{\hbar} \, \Big| \langle \mathbf{k}' | V | \mathbf{k} \rangle \Big|^2 \, \delta( E(\mathbf{k}) - E(\mathbf{k}') ) \\ &= \frac{2 \pi}{\hbar} \, \frac{1}{V^2} \sum_{i,j} \exp \Big[ i ( \mathbf{k} - \mathbf{k}' ) \cdot ( \mathbf{r}_i - \mathbf{r}_j ) \Big] \, \Big| V_{imp}( \mathbf{k} - \mathbf{k}' ) \Big|^2 \, \delta( E(\mathbf{k}) - E(\mathbf{k}') ) \end{aligned} \tag{961} $$
where the delta function expresses the restriction imposed by energy conservation in the elastic impurity scattering processes. As usual, the presence of the delta function requires that the transition probability is calculated by integrating over the momentum of the final state. As the positions of the impurities are distributed randomly, the scattering rate shall be configurationally averaged. The configurational average of any function is obtained by integrating over the positions of the impurities
$$ \overline{F} = \prod_j \left[ \frac{1}{V} \int d^3 \mathbf{r}_j \right] F( \{ \mathbf{r}_j \} ) \tag{962} $$
The configurational average of the scattering rate is evaluated as
$$ \overline{ \frac{1}{\tau(\mathbf{k} \to \mathbf{k}')} } = \frac{2 \pi}{\hbar} \, \frac{1}{V^2} \sum_j \Big| V_{imp}( \mathbf{k} - \mathbf{k}' ) \Big|^2 \, \delta( E(\mathbf{k}) - E(\mathbf{k}') ) \tag{963} $$
where only the terms with $i = j$ survive. The conductivity can be calculated from the steady state distribution function of the electrons, in which the scattering rate dynamically balances the effects of the electric field. This is found, in the quasi-classical approximation, from the Boltzmann equation.

The Boltzmann Equation. The distribution of electrons in phase space at time $t$, $f(\mathbf{k}, \mathbf{r}, t)$, is determined by the Boltzmann equation. The Boltzmann equation can be found by examining the increase in the number of electrons in an infinitesimal region of phase space that occurs during a time interval $dt$. The number of electrons in the infinitesimal volume $d^3\mathbf{k} \, d^3\mathbf{r}$ located at the point $\mathbf{k}, \mathbf{r}$ at time $t$ is
$$ f(\mathbf{k}, \mathbf{r}, t) \, d^3\mathbf{k} \, d^3\mathbf{r} \tag{964} $$
The increase in the number of electrons in this volume that occurs in time interval $dt$ is given by
$$ \Big[ f(\mathbf{k}, \mathbf{r}, t + dt) - f(\mathbf{k}, \mathbf{r}, t) \Big] \, d^3\mathbf{k} \, d^3\mathbf{r} = \frac{\partial}{\partial t} f(\mathbf{k}, \mathbf{r}, t) \, d^3\mathbf{k} \, d^3\mathbf{r} \, dt + O(dt^2) \tag{965} $$
This increase can be attributed partly to changes caused by the regular or deterministic motion of the electrons in the applied field, and partly to the irregular motion caused by the scattering. The appropriate time scale for the changes in the distribution function due to the applied fields is assumed to be much longer than the time interval in which the collisions occur. The deterministic motion of the electrons' trajectories in phase space results in a change in the number of electrons in the volume $d^3\mathbf{k} \, d^3\mathbf{r}$. The increase due to these slow time scale motions is equal to the number of electrons entering the six-dimensional volume through its surfaces in the time interval $dt$ minus the number of electrons leaving the volume. This is given by
$$ \Delta f(\mathbf{k}, \mathbf{r}, t) \, d^3\mathbf{k} \, d^3\mathbf{r} = - dt \left[ \nabla \cdot \Big( \dot{\mathbf{r}} \, f(\mathbf{k}, \mathbf{r}, t) \Big) + \nabla_k \cdot \Big( \dot{\mathbf{k}} \, f(\mathbf{k}, \mathbf{r}, t) \Big) \right] d^3\mathbf{k} \, d^3\mathbf{r} \tag{966} $$
and the slow rates of change in position and momentum of the electrons are determined via
$$ \dot{\mathbf{r}} = \frac{\hbar \mathbf{k}}{m} , \qquad \hbar \dot{\mathbf{k}} = - | e | \mathbf{E} \tag{967} $$
Hence, the deterministic changes are found as
$$ \Delta f(\mathbf{k}, \mathbf{r}, t) \, d^3\mathbf{k} \, d^3\mathbf{r} = - \left[ \nabla \cdot \left( \frac{\hbar \mathbf{k}}{m} f(\mathbf{k}, \mathbf{r}, t) \right) - \nabla_k \cdot \left( \frac{| e | \mathbf{E}}{\hbar} f(\mathbf{k}, \mathbf{r}, t) \right) \right] d^3\mathbf{k} \, d^3\mathbf{r} \, dt \tag{968} $$
This involves the sum of two terms, one coming from the change of the electron's momentum and the other from the change in the electron's position. The two gradients in this expression can be evaluated; each gradient yields two terms. One term of each pair involves a gradient of the distribution function, while the other only involves the distribution function itself. One term, originating from the change in the electron's position, involves the spatial variation of the velocity. From Hamilton's equations of motion it can be shown that the coefficient of this term is equal to the second derivative of the Hamiltonian,
$$ \nabla \cdot \dot{\mathbf{r}} = \nabla \cdot \nabla_p H \tag{969} $$
while the similar term originating from the change in the particle's momentum is just equal to the negative of the second derivative
$$ \nabla_p \cdot \dot{\mathbf{p}} = - \nabla_p \cdot \nabla H \tag{970} $$
Since the Hamiltonian is an analytic function, these terms are of equal magnitude and of opposite sign. Thus, these terms cancel, yielding only
$$ \Delta f(\mathbf{k}, \mathbf{r}, t) \, d^3\mathbf{k} \, d^3\mathbf{r} = - \left[ \frac{\hbar \mathbf{k}}{m} \cdot \nabla f(\mathbf{k}, \mathbf{r}, t) - \frac{| e | \mathbf{E}}{\hbar} \cdot \nabla_k f(\mathbf{k}, \mathbf{r}, t) \right] d^3\mathbf{k} \, d^3\mathbf{r} \, dt \tag{971} $$
The remaining contribution to the change in number of electrons per unit time occurs from the rapid irregular motion caused by the impurity scattering. The net increase is due to the excess in scattering of electrons from occupied states at $(\mathbf{k}', \mathbf{r})$ into an unoccupied state $(\mathbf{k}, \mathbf{r})$ over the rate of scattering out of state $(\mathbf{k}, \mathbf{r})$ into the unoccupied states at $(\mathbf{k}', \mathbf{r})$. The restriction imposed by the Pauli exclusion principle is that the state into which the electron is scattered should be unoccupied in the initial state. This restriction is incorporated by introducing the probability that a state $(\mathbf{k}, \mathbf{r})$ is unoccupied, through the factor $( 1 - f(\mathbf{k}, \mathbf{r}, t) )$.
$$ \Delta f(\mathbf{k}, \mathbf{r}, t) \, d^3\mathbf{k} \, d^3\mathbf{r} = \sum_{\mathbf{k}'} \left[ \frac{1}{\tau(\mathbf{k}' \to \mathbf{k})} f(\mathbf{k}', \mathbf{r}, t) \big( 1 - f(\mathbf{k}, \mathbf{r}, t) \big) - \frac{1}{\tau(\mathbf{k} \to \mathbf{k}')} f(\mathbf{k}, \mathbf{r}, t) \big( 1 - f(\mathbf{k}', \mathbf{r}, t) \big) \right] d^3\mathbf{k} \, d^3\mathbf{r} \, dt \tag{972} $$
On equating these three terms and cancelling common factors of $d^3\mathbf{k} \, d^3\mathbf{r} \, dt$, one obtains the Boltzmann equation
$$ \frac{\partial}{\partial t} f(\mathbf{k}, \mathbf{r}, t) = - \frac{\hbar \mathbf{k}}{m} \cdot \nabla f(\mathbf{k}, \mathbf{r}, t) + \frac{| e | \mathbf{E}}{\hbar} \cdot \nabla_k f(\mathbf{k}, \mathbf{r}, t) + I \Big[ f(\mathbf{k}, \mathbf{r}, t) \Big] \tag{973} $$
where the functional $I[f]$ is the collision integral and is given by
$$ I \Big[ f(\mathbf{k}, \mathbf{r}, t) \Big] = \sum_{\mathbf{k}'} \left[ \frac{1}{\tau(\mathbf{k}' \to \mathbf{k})} f(\mathbf{k}', \mathbf{r}, t) \big( 1 - f(\mathbf{k}, \mathbf{r}, t) \big) - \frac{1}{\tau(\mathbf{k} \to \mathbf{k}')} f(\mathbf{k}, \mathbf{r}, t) \big( 1 - f(\mathbf{k}', \mathbf{r}, t) \big) \right] \tag{974} $$
Thus, the Boltzmann equation can be written as the equality of a total derivative obtained from the regular motion and the collision integral which represents the scattering processes
$$ \frac{d}{dt} f(\mathbf{k}, \mathbf{r}, t) = I \Big[ f(\mathbf{k}, \mathbf{r}, t) \Big] \tag{975} $$
Since
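$$ \frac{1}{\tau(\mathbf{k} \to \mathbf{k}')} = \frac{1}{\tau(\mathbf{k}' \to \mathbf{k})} \tag{976} $$
the collision integral can be simplified to yield
$$ I \Big[ f(\mathbf{k}, \mathbf{r}, t) \Big] = \sum_{\mathbf{k}'} \frac{1}{\tau(\mathbf{k} \to \mathbf{k}')} \Big[ f(\mathbf{k}', \mathbf{r}, t) - f(\mathbf{k}, \mathbf{r}, t) \Big] \tag{977} $$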
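The simplification rests on a cancellation of the Pauli-blocking factors, which can be written out in one line. The shorthand $f \equiv f(\mathbf{k}, \mathbf{r}, t)$ and $f' \equiv f(\mathbf{k}', \mathbf{r}, t)$ is introduced here only for this illustration.

```latex
\begin{equation*}
f' \, ( 1 - f ) - f \, ( 1 - f' ) = f' - f' f - f + f f' = f' - f .
\end{equation*}
```

The terms quadratic in the distribution function cancel once the two rates are equal, so the collision integral for elastic impurity scattering is linear in $f$.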
Due to time reversal invariance of the scattering rates and conservation of energy, the collision integral vanishes in the equilibrium state. In equilibrium, the distribution function is time independent and uniform in space. The distribution function, therefore, only depends on $\mathbf{k}$ in a non-trivial manner, and can be written in terms of the Fermi-function $f_0(\mathbf{k})$. Thus, in this case, the distributions are related via
$$ f(\mathbf{k}, \mathbf{r}, t) = \frac{1}{V} f_0(\mathbf{k}) \tag{978} $$
The equilibrium distribution function $f_0(\mathbf{k})$ is only a function of the energy $E(\mathbf{k})$. The condition of conservation of energy, which occurs implicitly in the scattering rate, requires $f_0(\mathbf{k}) = f_0(\mathbf{k}')$. Hence, in equilibrium the collision integral vanishes. In the steady state produced by the application of an electric field, the electron density will be time independent and uniform throughout the metal, and so the temporal and spatial dependence of $f(\mathbf{k}, \mathbf{r}, t)$ can still be neglected. In this case, the distribution function in momentum space is still related to the distribution function in phase space via
$$ f(\mathbf{k}, \mathbf{r}, t) = \frac{1}{V} f(\mathbf{k}) \tag{979} $$
where $f(\mathbf{k})$ is the non-equilibrium distribution describing the steady state.

The Solution of the Boltzmann Equation. Since the electron distribution in the steady state conduction of electrons is close to equilibrium, one may look for solutions for $f(\mathbf{k})$ close to the equilibrium Fermi-Dirac distribution function. Thus, solutions of the form
$$ f(\mathbf{k}) = f_0(\mathbf{k}) + \Phi(\mathbf{k}) \, \frac{\partial f_0(\mathbf{k})}{\partial E(\mathbf{k})} \tag{980} $$
can be sought, where $\Phi$ is an unknown function, with dimensions of energy. It is to be shown that $\Phi(\mathbf{k})$ is determined by the electric field and is small compared with the Fermi-energy $\mu$. The above ansatz for the non-equilibrium distribution function is motivated by the notion that the term proportional to $\Phi$ occurs from a Taylor expansion of the steady state distribution function. In other words, the variation of $\Phi$ with $\mathbf{k}$ occurs from the distortion of the Fermi-surface in the steady state.
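The Taylor expansion referred to above can be written out explicitly. Reading the ansatz as a $\mathbf{k}$-dependent energy shift of the equilibrium distribution is an interpretation offered here as a sketch.

```latex
\begin{equation*}
f_0\big( E(\mathbf{k}) + \Phi(\mathbf{k}) \big)
  = f_0\big( E(\mathbf{k}) \big)
  + \Phi(\mathbf{k}) \, \frac{\partial f_0(\mathbf{k})}{\partial E(\mathbf{k})}
  + O( \Phi^2 ) .
\end{equation*}
```

To first order in $\Phi$, the ansatz therefore describes the equilibrium Fermi-function evaluated at a slightly shifted energy, which corresponds to a small distortion of the occupied region in $\mathbf{k}$ space.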
If the above ansatz for the steady state distribution is substituted into the Boltzmann equation one obtains
$$ - \frac{| e | \mathbf{E}}{\hbar} \cdot \nabla_k f(\mathbf{k}, \mathbf{r}, t) = I \left[ \Phi(\mathbf{k}) \, \frac{\partial f_0(E(\mathbf{k}))}{\partial E(\mathbf{k})} \right] \tag{981} $$
This shows that the energy $\Phi$ has a leading term which is proportional to the first power of the electric field. However, in order to obtain a current that satisfies Ohm's law, only the terms in $\Phi$ linear in $\mathbf{E}$ need to be calculated. Therefore, the Boltzmann equation can be linearized by dropping the term that involves the electric field and $\Phi$, since this is second order in the effect of the field. The linearized Boltzmann equation can be solved by noticing that the collision integral is equal to the source term which is proportional to the scalar product $( \mathbf{k} \cdot \mathbf{E} )$. Hence, it is reasonable to assume that $\Phi(\mathbf{k})$ has a similar form
$$ \Phi(\mathbf{k}) = A(E(\mathbf{k})) \, ( \mathbf{k} \cdot \mathbf{E} ) \tag{982} $$
where $A(E)$ is an unknown function of the energy, or other constants of motion. Due to conservation of energy, the unknown coefficient can be factored out of the collision integral, as can the factor of $\frac{\partial f_0}{\partial E}$, since both are only functions of the energy. It remains to evaluate an integral of the form
$$ \int d^3\mathbf{k}' \; \delta( E(\mathbf{k}) - E(\mathbf{k}') ) \; | V_{imp}( \mathbf{k} - \mathbf{k}' ) |^2 \; ( \mathbf{k}' - \mathbf{k} ) \cdot \mathbf{E} \tag{983} $$
The integration over $\mathbf{k}'$ can be performed by first integrating over the magnitude of $\mathbf{q} = \mathbf{k} - \mathbf{k}'$. On using the property of the energy conserving delta function,
$$ \delta( E(\mathbf{k}) - E(\mathbf{k}') ) = \frac{2 m}{\hbar^2} \, \delta( q^2 - 2 k q \cos \theta' ) \tag{984} $$
this sets the magnitude of $q = 2 k \cos \theta'$, where the direction of $\mathbf{k}$ was chosen as the polar axis. For simplicity it shall be assumed that the impurity potential is short ranged, so that the dependence of $V(q)$ on $q$ is relatively unimportant. The integration over the factor of $\mathbf{q} \cdot \mathbf{E}$ can easily be evaluated, and the result can be shown to be proportional to just $( \mathbf{k} \cdot \mathbf{E} )$. That is, on expressing the scalar product as
$$ \mathbf{q} \cdot \mathbf{E} = \Big( q \sin \theta' \cos \phi' \, E_x + q \sin \theta' \sin \phi' \, E_y + q \cos \theta' \, E_z \Big) \tag{985} $$
and on integrating over the azimuthal angle $\phi'$ in
$$ d\Omega' = d\theta' \sin \theta' \, d\phi' \tag{986} $$
the terms proportional to $E_x$ and $E_y$ vanish. The integration over the polar angle $\theta'$ produces a factor
$$ 8 \pi \, \frac{2 m}{\hbar^2} \, k^2 \int_{-1}^{1} d\cos \theta' \; \cos^3 \theta' \; | V( 2 k \cos \theta' ) |^2 \; E_z \tag{987} $$
This yields the result
$$ = 8 \pi \, \frac{2 m}{\hbar^2} \, k \, ( \mathbf{k} \cdot \mathbf{E} ) \int_{-1}^{1} d\cos \theta' \; \cos^3 \theta' \; | V( 2 k \cos \theta' ) |^2 \tag{988} $$
which can be expressed as an integral over the scattering angle $\theta = \pi - 2 \theta'$
$$ = 2 \pi \, \frac{2 m}{\hbar^2} \, k \, ( \mathbf{k} \cdot \mathbf{E} ) \int_{-1}^{1} d\cos \theta \; ( 1 - \cos \theta ) \; \Big| V\Big( 2 k \sin \frac{\theta}{2} \Big) \Big|^2 \tag{989} $$
Identifying the non-equilibrium part of the distribution function with
$$ \Phi(\mathbf{k}) \, \frac{\partial f_0(\mathbf{k})}{\partial E(\mathbf{k})} = ( \mathbf{k} \cdot \mathbf{E} ) \, A(E) \, \frac{\partial f_0(\mathbf{k})}{\partial E(\mathbf{k})} \tag{990} $$
yields the solution for the non-equilibrium contribution of the distribution function as
$$ \Phi(\mathbf{k}) = + \frac{| e |}{\hbar} \, \tau_{tr}(\mathbf{k}) \; \mathbf{E} \cdot \nabla_k E(\mathbf{k}) \tag{991} $$
Thus, $\Phi$ is proportional to the energy change of the electron produced by the electric field in the interval between scattering events. In the above expression, the term
$$ \frac{1}{\tau_{tr}(\mathbf{k})} = c \, \frac{2 \pi}{\hbar} \sum_{\mathbf{k}'} \delta( E(\mathbf{k}) - E(\mathbf{k}') ) \; | V( \mathbf{k} - \mathbf{k}' ) |^2 \, \big( 1 - \cos \theta \big) \tag{992} $$
is identified as the transport scattering rate, in which $c$ is the concentration of impurities. The transport scattering rate has the form of the rate for scattering out of the state $\mathbf{k}$ but has an extra factor of $( 1 - \cos \theta )$. In the quantum formulation of transport this factor appears as a vertex correction. Basically, the electrical current is related to the momentum of the electrons in the direction of the applied field. Forward scattering processes do not result in a reduction of the momentum and, therefore, leave the current unaffected. The transport scattering rate involves a factor of $( 1 - \cos \theta )$ where $\theta$ is the scattering angle. This factor represents the relative importance of large angle scattering in the reduction of the total current.

The Current Density. The current density can be obtained directly from the expression
$$ \begin{aligned} \mathbf{j} &= - 2 | e | \, \frac{1}{V} \sum_{\mathbf{k}} \frac{1}{\hbar} \frac{\partial E(\mathbf{k})}{\partial \mathbf{k}} \, f(\mathbf{k}) \\ &= - 2 | e | \, \frac{1}{V} \sum_{\mathbf{k}} \frac{1}{\hbar} \frac{\partial E(\mathbf{k})}{\partial \mathbf{k}} \left[ f_0(\mathbf{k}) + \Phi(\mathbf{k}) \, \frac{\partial f_0(\mathbf{k})}{\partial E(\mathbf{k})} \right] \end{aligned} \tag{993} $$
where the factor of 2 represents the sum over the electron spins. On viewing the electron distribution function as the first two terms in a Taylor expansion, the electron distribution function can be described by an occupied Fermi-volume which has been displaced from the equilibrium position in the direction of the applied field. The displacement of the Fermi-volume produces the average current in the direction of the field. The first term represents the current that is expected to flow in the equilibrium state. This term is zero, as can be seen by using the symmetry of the energy $E(\mathbf{k}) = E(-\mathbf{k})$ in the Fermi-function. Due to the presence of the velocity vector $\frac{1}{\hbar} \nabla_k E(\mathbf{k})$, it can be seen that the current produced by an electron of momentum $\mathbf{k}$ identically cancels with the current produced by an electron of momentum $-\mathbf{k}$. Thus, the non-zero component of the current originates from the non-equilibrium part of the distribution function. This can only be evaluated once the Bloch energies are given. The current is given by
$$ \mathbf{j} = - 2 \, \frac{e^2}{\hbar^2} \sum_{\mathbf{k}} \tau_{tr}(\mathbf{k}) \; \nabla_k E(\mathbf{k}) \; \Big( \nabla_k E(\mathbf{k}) \cdot \mathbf{E} \Big) \, \frac{\partial f_0(\mathbf{k})}{\partial E(\mathbf{k})} \tag{994} $$
On recognizing the zero temperature property of the Fermi-function
$$ - \frac{\partial f_0(\mathbf{k})}{\partial E(\mathbf{k})} = \delta( E(\mathbf{k}) - \mu ) \tag{995} $$
it is seen that the electrical current is carried by electrons in a narrow energy shell around the Fermi-surface. On using the symmetry properties of the integral one finds that only the diagonal component of the conductivity tensor is non-zero and is given by
$$ \sigma_{\alpha,\beta} = - \frac{2 \, \delta_{\alpha,\beta}}{3} \, \frac{e^2}{\hbar^2} \sum_{\mathbf{k}} \tau_{tr}(\mathbf{k}) \; | \nabla_k E(\mathbf{k}) |^2 \; \frac{\partial f_0(\mathbf{k})}{\partial E(\mathbf{k})} \tag{996} $$
For free electron bands, the conductivity tensor is evaluated as
$$ \sigma_{\alpha,\beta} = \delta_{\alpha,\beta} \, \frac{\rho \, e^2 \, \tau_{tr}}{m} \tag{997} $$
where $\rho$ is the density of electrons, $m$ is the mass of the electrons and $\tau_{tr}$ is the Fermi-surface average of the transport scattering time $\tau_{tr}(\mathbf{k})$.
——————————————————————————————————
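The role of the factor $( 1 - \cos \theta )$ in the transport scattering rate of Eq. (992) can be illustrated by comparing the angular averages with and without it. The sketch below computes the ratio of the transport rate to the total scattering-out rate on the Fermi-surface for two angular profiles: an isotropic one, as for a short-ranged potential, and a forward-peaked one. The screened form $| V(q) |^2 \propto ( q^2 + \kappa^2 )^{-2}$, the value $k / \kappa = 3$ and all function names are hypothetical choices made only for this example.

```python
import math


def angular_average(weight, steps=2000):
    """Midpoint-rule integral of weight(cos_theta) over -1 < cos_theta < 1."""
    h = 2.0 / steps
    return sum(weight(-1.0 + (i + 0.5) * h) for i in range(steps)) * h


def transport_to_total_ratio(rate):
    """(1 / tau_tr) / (1 / tau) for an angular scattering probability rate(cos_theta)."""
    total = angular_average(rate)
    transport = angular_average(lambda c: (1.0 - c) * rate(c))
    return transport / total


def isotropic(cos_theta):
    """Short-ranged potential: |V(q)|^2 independent of the scattering angle."""
    return 1.0


def forward_peaked(cos_theta, k_over_kappa=3.0):
    """Hypothetical screened potential, |V(q)|^2 ~ 1 / (q^2 + kappa^2)^2."""
    # (q / kappa)^2 with q = 2 k sin(theta / 2), i.e. q^2 = 2 k^2 (1 - cos theta).
    q_squared = 2.0 * (1.0 - cos_theta) * k_over_kappa ** 2
    return 1.0 / (1.0 + q_squared) ** 2


ratio_isotropic = transport_to_total_ratio(isotropic)
ratio_forward = transport_to_total_ratio(forward_peaked)

print(f"isotropic scattering:      tau / tau_tr = {ratio_isotropic:.4f}")
print(f"forward-peaked scattering: tau / tau_tr = {ratio_forward:.4f}")
```

For isotropic scattering the two rates coincide, while for predominantly forward scattering the transport rate is much smaller than the total rate, so the current relaxes more slowly than the single-particle states decay.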
11.2.3
Exercise 50
Determine the conductivity tensor $\sigma_{\alpha,\beta}(\mathbf{q}, \omega)$ which relates the Fourier component of a current density $j_\alpha(\mathbf{q}, \omega)$ to a time and spatially varying applied electric field with a Fourier amplitude $E_\beta(\mathbf{q}, \omega)$ via Ohm's law
$$ j_\alpha(\mathbf{q}, \omega) = \sum_\beta \sigma_{\alpha,\beta}(\mathbf{q}, \omega) \, E_\beta(\mathbf{q}, \omega) \tag{998} $$
Assume that $\omega$ is negligibly small compared with the Fermi-energy so that the scattering rate can be evaluated on the Fermi-surface. The above result should show that in the zero frequency limit $\omega \to 0$ the $q = 0$ conductivity is purely real and given by the standard expression $\sigma_{\alpha,\beta}(0, 0) = \delta_{\alpha,\beta} \, \rho \, \frac{e^2}{m} \tau_{tr}$, and decreases for increasing $\omega$. The frequency width of the Drude peak is given by the scattering rate $\frac{1}{\tau_{tr}}$.
——————————————————————————————————
11.2.4
The Hall Effect and Magneto-resistance.
The Hall effect occurs when an electrical current is flowing in a sample and a magnetic field is applied in a direction transverse to the direction of the current density. Consider a sample in the form of a rectangular prism, with axes parallel to the axes of a Cartesian coordinate system. The magnetic field is applied along the $z$ direction and a current flows along the $y$ direction. The Hall effect concerns the appearance of a voltage (the Hall voltage) across a sample in the $x$ direction. The Hall voltage appears in order to balance the Lorentz force produced by the motion of the charged particles in the magnetic field. The initial current flow in the $x$ direction sets up a net charge imbalance across the sample in accordance with the continuity equation. The build up of static charge produces the Hall voltage. In the steady state, the Hall voltage balances the Lorentz force, opposing the further build up of static charge. The sign of the Hall voltage is an indicator of the sign of the current carrying particles. The Hall electric field is given by
$$ E_x \, \hat{e}_x = + v_y B_z \; \hat{e}_y \wedge \hat{e}_z \tag{999} $$
The Hall voltage $V_H$ is related to the electric field and the width of the sample $d_x$ via
$$ V_H = - E_x d_x = - v_y B_z d_x = - \frac{j_y B_z d_x}{\rho q} \tag{1000} $$
Hence, measurement of the Hall voltage $V_H$ and $j_y$, together with the magnitude of the applied field $B_z$, determines the carrier density $\rho$ and the charge $q$. This is embodied in the definition of the Hall constant, $R_H$
$$ R_H = \frac{E_y}{j_x H_z} \tag{1001} $$
which for semi-classical free carriers of charge $q$ and density $\rho$ is evaluated as
$$ R_H = \frac{1}{\rho \, q} \tag{1002} $$
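As a rough illustration of how Eq. (1000) is used, the Python sketch below inverts the relation between the Hall voltage and the product $\rho q$ for a single carrier type. The function names, the choice of SI units, and the sample numbers are assumptions introduced here for illustration; they are not taken from the text.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulombs (SI)


def hall_voltage(rho_q, j_y, b_z, d_x):
    """Hall voltage from Eq. (1000): V_H = - j_y B_z d_x / (rho q)."""
    return -j_y * b_z * d_x / rho_q


def carrier_charge_density(v_hall, j_y, b_z, d_x):
    """Invert Eq. (1000) for the product rho*q (charge per unit volume)."""
    return -j_y * b_z * d_x / v_hall


# Made-up measurement, for illustration only (SI units).
j_y = 1.0e6      # current density, A/m^2
b_z = 1.0        # magnetic field, T
d_x = 1.0e-3     # sample width, m
v_hall = 7.0e-8  # measured Hall voltage, V

rho_q = carrier_charge_density(v_hall, j_y, b_z, d_x)
carrier_sign = "electron-like" if rho_q < 0 else "hole-like"
density = abs(rho_q) / E_CHARGE  # carriers per m^3, assuming |q| = e
print(carrier_sign, density)
```

With the sign convention of Eq. (1000), a positive Hall voltage in this geometry corresponds to negatively charged carriers; the magnitude of the density follows only under the extra assumption that each carrier has charge of magnitude $|e|$.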
In other geometries, one notices that the current will flow in a direction other than parallel to the applied field. Neither the conductivity tensor nor the resistivity tensor will be diagonal. The dependence of the resistivity on the magnetic field is known as magneto-resistance. The phenomenon of transport in a magnetic field can be quite generally addressed from knowledge of the conductivity tensor in an applied magnetic field. This can be calculated using the Boltzmann equation approach.

The Boltzmann Equation.

The Boltzmann equation for the steady state distribution $f(\mathbf{p})$, in the presence of static electric and magnetic fields, can be expressed as
$$ - |e| \left( \mathbf{E} + \mathbf{v} \wedge \mathbf{B} \right) \cdot \nabla_p f(\mathbf{p}) = I\left[ f(\mathbf{p}) \right] \tag{1003} $$
Since only a solution for $f(\mathbf{p})$ is sought which contains terms linear in the electric field $\mathbf{E}$, the equation can be linearized by making the substitution $f(\mathbf{p}) \to f_0(\mathbf{p})$, but only in the term explicitly proportional to $\mathbf{E}$:
$$ - |e| \, \mathbf{E} \cdot \nabla_p f_0(\mathbf{p}) - |e| \left( \mathbf{v} \wedge \mathbf{B} \right) \cdot \nabla_p f(\mathbf{p}) = I\left[ f(\mathbf{p}) \right] \tag{1004} $$
The substitution of the zero field equilibrium distribution function $f_0(\mathbf{p})$ in the term proportional to $\mathbf{E}$, without any magnetic field corrections, is consistent with the equilibrium in the presence of a static magnetic field. This can be seen by examining the limit $\mathbf{E} = 0$, where the Boltzmann equation reduces to
$$ - |e| \left( \mathbf{v} \wedge \mathbf{B} \right) \cdot \nabla_p f(\mathbf{p}) = I\left[ f(\mathbf{p}) \right] \tag{1005} $$
which has the solution $f(\mathbf{p}) = f_0(\mathbf{p})$, since in this case the collision integral vanishes and the remaining term is also zero as
$$ \left( \mathbf{v} \wedge \mathbf{B} \right) \cdot \nabla_p f_0(\mathbf{p}) = \left( \mathbf{v} \wedge \mathbf{B} \right) \cdot \nabla_p E(\mathbf{p}) \, \frac{\partial f_0(\mathbf{p})}{\partial E(\mathbf{p})} = \left( \mathbf{v} \wedge \mathbf{B} \right) \cdot \mathbf{v} \, \frac{\partial f_0(\mathbf{p})}{\partial E(\mathbf{p})} = 0 \tag{1006} $$
since due to the vector identity
$$ \left( \mathbf{A} \wedge \mathbf{B} \right) \cdot \mathbf{A} = 0 \tag{1007} $$
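the scalar product vanishes. This is just a consequence of the fact that a magnetic field does not change the particle’s energy. The observation can be used to simplify the Boltzmann equation since, as has been seen, the magnetic force term only acts on the deviation from equilibrium. The ansatz for the steady state distribution function is
$$ f(\mathbf{p}) = f_0(\mathbf{p}) + \Phi(\mathbf{p}) \, \frac{\partial f_0(\mathbf{p})}{\partial E(\mathbf{p})} = f_0(\mathbf{p}) + \mathbf{v} \cdot \mathbf{C} \, \frac{\partial f_0(\mathbf{p})}{\partial E(\mathbf{p})} \tag{1008} $$
where $\mathbf{C}$ is an unknown vector function. It shall be shown that the vector $\mathbf{C}$ is independent of $\mathbf{p}$. In this case, the collision integral simplifies to the case that was previously considered. Namely, the collision integral reduces to the transport scattering rate times the non-equilibrium part of the steady state distribution function. On cancelling the common factor involving the derivative of the Fermi-function, and using
$$ \nabla_p \, . \, \mathbf{v} = \frac{1}{m^*} \tag{1009} $$
one finds
$$ |e| \, \mathbf{E} \cdot \mathbf{v} + \frac{|e|}{m^* c} \left( \mathbf{v} \wedge \mathbf{B} \right) \cdot \mathbf{C} = \frac{1}{\tau_{tr}} \left( \mathbf{v} \cdot \mathbf{C} \right) \tag{1010} $$
The solution of this equation is independent of $\mathbf{v}$, hence $\mathbf{C}$ is a constant vector. This can be seen explicitly by substituting the identity
$$ \left( \mathbf{v} \wedge \mathbf{B} \right) \cdot \mathbf{C} = \left( \mathbf{B} \wedge \mathbf{C} \right) \cdot \mathbf{v} \tag{1011} $$
back into the Boltzmann equation. The resulting equation can be solved for all $\mathbf{v}$ if $\mathbf{C}$ satisfies the algebraic vector equation
$$ \frac{1}{\tau_{tr}} \, \mathbf{C} = |e| \, \mathbf{E} + \frac{|e|}{m^* c} \, \mathbf{B} \wedge \mathbf{C} \tag{1012} $$
To solve the above algebraic equation it is convenient to change variables to
$$ \boldsymbol{\omega}_c = \frac{|e|}{m^* c} \, \mathbf{B} \tag{1013} $$
The equation shall be solved by finding the components of $\mathbf{C}$ parallel and transverse to $\mathbf{B}$.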
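If the scalar product of the algebraic equation is formed with $\boldsymbol{\omega}_c$, and on recognizing that $\boldsymbol{\omega}_c \cdot ( \boldsymbol{\omega}_c \wedge \mathbf{C} ) = 0$, one finds that the component of $\mathbf{C}$ parallel to the magnetic field is given by
$$ \boldsymbol{\omega}_c \cdot \mathbf{C} = |e| \, \tau_{tr} \, \boldsymbol{\omega}_c \cdot \mathbf{E} \tag{1014} $$
The transverse component of $\mathbf{C}$ can be obtained by taking the vector product of the algebraic equation with $\boldsymbol{\omega}_c$. This results in the equation
$$ \boldsymbol{\omega}_c \wedge ( |e| \, \mathbf{E} ) + \boldsymbol{\omega}_c \wedge ( \boldsymbol{\omega}_c \wedge \mathbf{C} ) = \frac{1}{\tau_{tr}} \, \boldsymbol{\omega}_c \wedge \mathbf{C} \tag{1015} $$
but
$$ \boldsymbol{\omega}_c \wedge ( \boldsymbol{\omega}_c \wedge \mathbf{C} ) = \boldsymbol{\omega}_c \, ( \boldsymbol{\omega}_c \cdot \mathbf{C} ) - \omega_c^2 \, \mathbf{C} \tag{1016} $$
so one recovers the relation between the transverse component and $\mathbf{C}$ from
$$ \boldsymbol{\omega}_c \wedge ( |e| \, \mathbf{E} ) + \boldsymbol{\omega}_c \, ( \boldsymbol{\omega}_c \cdot \mathbf{C} ) - \omega_c^2 \, \mathbf{C} = \frac{1}{\tau_{tr}} \, \boldsymbol{\omega}_c \wedge \mathbf{C} \tag{1017} $$
by eliminating the longitudinal component. The resulting relation is found as
$$ \boldsymbol{\omega}_c \wedge ( |e| \, \mathbf{E} ) + \boldsymbol{\omega}_c \, ( |e| \, \tau_{tr} \, \boldsymbol{\omega}_c \cdot \mathbf{E} ) - \omega_c^2 \, \mathbf{C} = \frac{1}{\tau_{tr}} \, \boldsymbol{\omega}_c \wedge \mathbf{C} \tag{1018} $$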
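The transverse component can be substituted back into the original algebraic equation to find the complete expression for $\mathbf{C}$:
$$ |e| \left[ \boldsymbol{\omega}_c \wedge \mathbf{E} + \tau \, \boldsymbol{\omega}_c \, ( \boldsymbol{\omega}_c \cdot \mathbf{E} ) \right] - \omega_c^2 \, \mathbf{C} = \frac{1}{\tau^2} \, \mathbf{C} - \frac{1}{\tau} \, |e| \, \mathbf{E} \tag{1019} $$
Therefore, $\mathbf{C}$ is given by the constant vector
$$ \left( 1 + \omega_c^2 \tau^2 \right) \mathbf{C} = \tau \, |e| \left[ \mathbf{E} + \tau^2 \, ( \boldsymbol{\omega}_c \cdot \mathbf{E} ) \, \boldsymbol{\omega}_c + \tau \, ( \boldsymbol{\omega}_c \wedge \mathbf{E} ) \right] \tag{1020} $$
which only depends upon $\mathbf{E}$ and $\mathbf{B}$ but not on the momentum $\mathbf{p}$. This leads to the explicit expression for the non-equilibrium distribution function of
$$ f(\mathbf{k}) = f_0(\mathbf{k}) + \frac{\tau \, |e|}{1 + \omega_c^2 \tau^2} \left[ \mathbf{v} \cdot \mathbf{E} + \tau^2 \, ( \mathbf{v} \cdot \boldsymbol{\omega}_c ) ( \boldsymbol{\omega}_c \cdot \mathbf{E} ) + \tau \, ( \mathbf{v} \wedge \boldsymbol{\omega}_c ) \cdot \mathbf{E} \right] \frac{\partial f_0(\mathbf{k})}{\partial E(\mathbf{k})} \tag{1021} $$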
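The closed form (1020) can be checked numerically against the algebraic equation (1012). The sketch below is a minimal pure-Python check; the helper names (`cross`, `dot`, `solve_c`, `residual`) and the sample field values are assumptions made here for illustration, and units are chosen so that $|e| = 1$.

```python
def cross(a, b):
    """Vector product a ^ b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product a . b of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def solve_c(e_field, omega_c, tau, charge=1.0):
    """Closed form of Eq. (1020) for the vector C."""
    w_dot_e = dot(omega_c, e_field)
    w_cross_e = cross(omega_c, e_field)
    pref = tau * charge / (1.0 + dot(omega_c, omega_c) * tau ** 2)
    return tuple(pref * (e_field[i]
                         + tau ** 2 * w_dot_e * omega_c[i]
                         + tau * w_cross_e[i]) for i in range(3))


def residual(c_vec, e_field, omega_c, tau, charge=1.0):
    """C/tau - |e| E - omega_c ^ C, which vanishes if Eq. (1012) holds."""
    w_cross_c = cross(omega_c, c_vec)
    return tuple(c_vec[i] / tau - charge * e_field[i] - w_cross_c[i]
                 for i in range(3))


# Arbitrary illustrative values.
e_field = (0.3, -1.2, 0.7)
omega_c = (0.5, 0.1, 2.0)
tau = 0.8
c_vec = solve_c(e_field, omega_c, tau)
max_residual = max(abs(r) for r in residual(c_vec, e_field, omega_c, tau))
print(max_residual)
```

The residual should vanish to rounding error for any choice of fields, which is the numerical counterpart of the statement that $\mathbf{C}$ depends only on $\mathbf{E}$ and $\mathbf{B}$.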
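The deviation from equilibrium can be interpreted in terms of an anisotropic displacement of the Fermi-function involving the work done by the electric field on the electron in the time interval between scattering events.

The Conductivity Tensor.

The average value of the current density is given by
$$ \mathbf{j} = - |e| \, \frac{2}{V} \sum_{\mathbf{k}} \mathbf{v}(\mathbf{k}) \, f(\mathbf{k}) \tag{1022} $$
where $f(\mathbf{k})$ is the steady state distribution function, and the factor of 2 represents the summation over the electron’s spin. On substituting for the steady state distribution function, one notes that, because of the symmetry $\mathbf{k} \to - \mathbf{k}$ of the equilibrium distribution function, no current flows in the absence of the electric field. The current density $\mathbf{j}$ is linear in the magnitude of the electric field $\mathbf{E}$, and is given by
$$ \mathbf{j} = 2 \, e^2 \int \frac{d^3k}{(2\pi)^3} \left( - \frac{\partial f_0}{\partial E} \right) \frac{\tau}{1 + \omega_c^2 \tau^2} \, \mathbf{v} \left[ ( \mathbf{v} \cdot \mathbf{E} ) + \tau^2 \, ( \mathbf{v} \cdot \boldsymbol{\omega}_c ) ( \boldsymbol{\omega}_c \cdot \mathbf{E} ) + \tau \, ( \mathbf{v} \wedge \boldsymbol{\omega}_c ) \cdot \mathbf{E} \right] \tag{1023} $$
Thus, the conductivity tensor is recovered in dyadic form as
$$ \tilde{\sigma} = 2 \, e^2 \int \frac{d^3k}{(2\pi)^3} \left( - \frac{\partial f_0}{\partial E} \right) \frac{\tau}{1 + \omega_c^2 \tau^2} \left[ \mathbf{v} \, \mathbf{v} + \tau^2 \, ( \mathbf{v} \cdot \boldsymbol{\omega}_c ) \, \mathbf{v} \, \boldsymbol{\omega}_c + \tau \, \mathbf{v} \, ( \mathbf{v} \wedge \boldsymbol{\omega}_c ) \right] \tag{1024} $$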
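Furthermore, if $E(\mathbf{k})$ is assumed to be spherically symmetric, one finds that the components of the tensor can be expressed as
$$ \sigma_{\alpha,\beta} = e^2 \int dE \, \rho(E) \left( - \frac{\partial f_0}{\partial E} \right) \frac{v^2}{3} \, \frac{\tau}{1 + \omega^2 \tau^2} \left[ \delta_{\alpha,\beta} + \tau^2 \, \omega_\alpha \omega_\beta \pm ( 1 - \delta_{\alpha,\beta} ) \, \tau \, \omega_\gamma \right] \tag{1025} $$
where in the off-diagonal term the convention is introduced that $\gamma$ is chosen such that $(\alpha, \beta, \gamma)$ corresponds to a permutation of $(x, y, z)$. Since the product of the density of states per unit volume, $\rho(E)$, and $v^2$ is proportional to $E^{3/2}$, the conductivity tensor can be evaluated, by integration by parts, to yield
$$ \sigma_{\alpha,\beta} = \frac{\rho \, e^2}{m} \, \frac{\tau}{1 + \omega^2 \tau^2} \left[ \delta_{\alpha,\beta} + \tau^2 \, \omega_\alpha \omega_\beta \pm ( 1 - \delta_{\alpha,\beta} ) \, \tau \, \omega_\gamma \right] \tag{1026} $$
where $\rho$ now denotes the electron density, and where the $\pm$ sign is taken to be positive when $(\alpha, \beta, \gamma)$ is an odd permutation of $(x, y, z)$ and negative when $(\alpha, \beta, \gamma)$ is an even permutation of $(x, y, z)$. Thus, if the field is applied along the z direction, it is found that the diagonal components of the conductivity tensor are given by
$$ \sigma_{x,x} = \sigma_{y,y} = \frac{\rho \, e^2 \tau}{m} \, \frac{1}{1 + \omega_c^2 \tau^2} , \qquad \sigma_{z,z} = \frac{\rho \, e^2 \tau}{m} \tag{1027} $$
The non-zero off-diagonal terms are found as
$$ \sigma_{y,x} = - \sigma_{x,y} = \frac{\rho \, e^2}{m} \, \frac{\omega_c \tau^2}{1 + \omega_c^2 \tau^2} \tag{1028} $$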
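To make the index and sign conventions of Eq. (1026) concrete, the following sketch assembles the full tensor for an arbitrary field direction and evaluates it for a field along z, where it should reproduce Eqs. (1027) and (1028). The function name and the normalisation $\sigma_0 = \rho e^2 \tau / m = 1$ are choices made here for illustration.

```python
EVEN_PERMUTATIONS = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}  # cyclic orders of (x, y, z)


def conductivity_tensor(omega, tau, sigma0=1.0):
    """Components of Eq. (1026); omega = (w_x, w_y, w_z), sigma0 = rho e^2 tau / m."""
    w2 = sum(w * w for w in omega)
    pref = sigma0 / (1.0 + w2 * tau ** 2)
    sigma = [[0.0] * 3 for _ in range(3)]
    for a in range(3):
        for b in range(3):
            value = tau ** 2 * omega[a] * omega[b]
            if a == b:
                value += 1.0
            else:
                g = 3 - a - b  # the remaining Cartesian index
                # positive for odd permutations, negative for even ones
                sign = -1.0 if (a, b, g) in EVEN_PERMUTATIONS else 1.0
                value += sign * tau * omega[g]
            sigma[a][b] = pref * value
    return sigma


# Field along z with omega_c * tau = 1.
sigma = conductivity_tensor((0.0, 0.0, 2.0), 0.5)
for row in sigma:
    print(row)
```

For $\omega_c \tau = 1$ the transverse diagonal components are halved relative to $\sigma_0$, the component along the field is unchanged, and $\sigma_{y,x} = - \sigma_{x,y}$.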
Thus, the diagonal components of the conductivity tensor are anisotropic. The component parallel to the field is constant while the other two components decrease like $\omega_c^{-2}$ in high fields. The off-diagonal components are zero at zero field, increase linearly with the field for small $\omega_c$, but then decrease like $\omega_c^{-1}$ at high fields. A useful representation of the conductivity is through the Hall angle. For example, if one applies the magnetic field along the z direction and then an electric field along the x direction, $E_x \neq 0$, then the current will have an x and a y component that can be characterized by a complex number $z$
$$ z = J_x + i \, J_y = \sigma_0 \, E_x \, \frac{1 - i \, \omega_c \tau}{1 + \omega_c^2 \tau^2} = \sigma_0 \, E_x \, \frac{1}{1 + i \, \omega_c \tau} \tag{1029} $$
This complex number $z$ lies on a semi-circle of radius $\frac{\sigma_0}{2} E_x$ centered on the point $( \frac{\sigma_0}{2} E_x , 0 )$, as
$$ z - \frac{\sigma_0}{2} \, E_x = \frac{\sigma_0}{2} \, E_x \, \frac{1 - i \, \omega_c \tau}{1 + i \, \omega_c \tau} \tag{1030} $$
and the modulus is just given by
$$ \left| z - \frac{\sigma_0}{2} \, E_x \right| = \frac{\sigma_0}{2} \, E_x \tag{1031} $$
Thus, the number $z$ lies on a semi-circle of radius $\frac{\sigma_0}{2} E_x$ passing through the origin. The Hall angle $\Psi_H$ is defined as the angle between the line subtended from the point $z$ to the origin and the $J_x$ axis. Thus
$$ \tan \Psi_H = \frac{J_y}{J_x} \tag{1032} $$
and from the Boltzmann equation analysis of the magneto-conductivity
$$ \Psi_H = \tan^{-1} \omega_c \tau \tag{1033} $$
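The semi-circle property of Eqs. (1029) to (1031) is easy to confirm with complex arithmetic. The short sketch below evaluates $z$ for several values of $\omega_c \tau$ and measures the distance from the centre of the circle; the function name and the sample values of $\sigma_0$ and $E_x$ are assumptions made here.

```python
def hall_current(sigma0, e_x, wc_tau):
    """Complex current z = J_x + i J_y in the form given by Eq. (1029)."""
    return sigma0 * e_x / (1.0 + 1j * wc_tau)


sigma0, e_x = 2.0, 3.0
radius = 0.5 * sigma0 * e_x

# Deviation of |z - radius| from the radius for a range of field strengths.
deviations = [abs(abs(hall_current(sigma0, e_x, x) - radius) - radius)
              for x in (0.0, 0.1, 1.0, 10.0, 100.0)]
print(max(deviations))
```

As $\omega_c \tau$ runs from zero to infinity the point $z$ moves along the circle from $\sigma_0 E_x$ on the real axis to the origin.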
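Thus, from knowledge of $\sigma_0$ and $E$ one can find $z$ and, thence, $J$. The resistivity tensor $\rho_{i,j}$ is obtained from the conductivity tensor $\sigma_{i,j}$ by inverting the relation
$$ J_i = \sum_j \sigma_{i,j} \, E_j \tag{1034} $$
to obtain
$$ E_i = \sum_j \rho_{i,j} \, J_j \tag{1035} $$
The resistivity tensor is found as
$$ \tilde{\rho} = \begin{pmatrix} \rho_0 & \rho_0 \, \omega_c \tau & 0 \\ - \rho_0 \, \omega_c \tau & \rho_0 & 0 \\ 0 & 0 & \rho_0 \end{pmatrix} $$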
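The inversion leading from the conductivity tensor of Eqs. (1027) and (1028) to the resistivity tensor can be reproduced with a small adjugate-based 3x3 inverse. This is a sketch only: the helper names and the values $\sigma_0 = 2$, $\omega_c \tau = 3$ are arbitrary choices made here.

```python
def sigma_field_along_z(wc_tau, sigma0):
    """Conductivity tensor of Eqs. (1027)-(1028) for a field along z."""
    pref = sigma0 / (1.0 + wc_tau ** 2)
    return [[pref, -pref * wc_tau, 0.0],
            [pref * wc_tau, pref, 0.0],
            [0.0, 0.0, sigma0]]


def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate (transpose of cofactors)."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[adj[r][s] / det for s in range(3)] for r in range(3)]


sigma0, wc_tau = 2.0, 3.0
rho = inverse3(sigma_field_along_z(wc_tau, sigma0))
for row in rho:
    print(row)
```

The diagonal entries of the result equal $\rho_0 = 1/\sigma_0$ independently of $\omega_c \tau$, while the off-diagonal entries are $\pm \rho_0 \, \omega_c \tau$.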
Thus, for the free electron model the diagonal part of the resistivity tensor is completely unaffected by the field. There is neither a longitudinal nor a transverse magneto-resistance. However, as $H_z$ increases, the transverse component of the electric field $E_y$ increases. This is the Hall field. The Hall field is given by
$$ E_y = \omega_c \tau \, \rho_0 \, J_x = \frac{J_x H_z}{\rho \, |e|} \tag{1036} $$
Thus, the Hall resistivity is
$$ \rho_{y,x} = \frac{E_y}{J_x} = \frac{H_z}{\rho \, |e|} \tag{1037} $$
Thus, the Hall constant $R_H$ is given by
$$ R_H = \frac{E_y}{H_z J_x} = \frac{1}{\rho \, |e|} \tag{1038} $$
The magneto-resistivity is usually classified as being longitudinal or transverse. The longitudinal magneto-resistance is the change in the resistivity tensor $\rho_{z,z}$ due to the application of a magnetic field along the z direction. The transverse magneto-resistance is given by the change in $\rho_{x,x}$ or $\rho_{y,y}$ due to a field $H_z$. The longitudinal magneto-resistance is usually due to the dependence of the scattering rate on the magnetic field, whereas the transverse magneto-resistance is due to the action of the Lorentz force. The general features of the magneto-resistance are:
(i) For low fields $H_z$, such that $\omega_c \tau < 1$,
$$ \Delta \rho_{x,x} = \rho_{x,x}(H_z) - \rho_{x,x}(0) \propto H_z^2 \tag{1039} $$
(ii) There is an electric field $E_y$ transverse to $J_x$ and $H_z$, which has a magnitude proportional to $H_z$.
(iii) For large fields, $\rho_{x,x}(H_z)$ may either continue to increase with $H_z^2$ or saturate.
(iv) For a set of samples which all have different residual resistivities $\rho_{z,z}(T = 0, H_z = 0)$, the transverse magneto-resistance usually satisfies Kohler’s rule
$$ \frac{\Delta \rho_{x,x}(H_z)}{\rho_{x,x}(T = 0, H_z = 0)} = F\left( \frac{H_z}{\rho_{x,x}(0, 0)} \right) \tag{1040} $$
Basically, Kohler’s rule expresses the fact that $\rho(H_z)$ only depends on $H_z$ through the combination $\omega_c \tau$ and that $\Delta \rho_{x,x}$ and $\rho_{z,z}(T = 0, H_z = 0)$ are both proportional to $\tau^{-1}$. The standard form of the relationship between $\mathbf{E}$ and $\mathbf{J}$ is expressed as a vector equation
$$ \mathbf{E} = \rho_0 \, \mathbf{J} + a \, ( \mathbf{J} \wedge \mathbf{H} ) + b \, H^2 \, \mathbf{J} + c \, ( \mathbf{J} \cdot \mathbf{H} ) \, \mathbf{H} + d \, \tilde{T} \, \mathbf{J} \tag{1041} $$
where $\tilde{T}$ is a tensor which only has diagonal components that, when referred to the crystalline axes, are $(H_x^2, H_y^2, H_z^2)$. That is, $\tilde{T}$ is the matrix
$$ \tilde{T} = \begin{pmatrix} H_x^2 & 0 & 0 \\ 0 & H_y^2 & 0 \\ 0 & 0 & H_z^2 \end{pmatrix} $$
The five unknown quantities may be determined by five experiments.
(1) When $\mathbf{J}$ and $\mathbf{H}$ are parallel to the x axis one has the longitudinal magneto-resistance given by
$$ \rho_{x,x} = \rho_0 + ( b + c + d ) \, H^2 \tag{1042} $$
(2) When $\mathbf{J} \parallel x$ and $\mathbf{H} \parallel y$ then
$$ \rho_{x,x} = \rho_0 + b \, H^2 \tag{1043} $$
which is the transverse magneto-resistance.
(3) With $\mathbf{J} \parallel (1, 1, 0)$, i.e.
$$ \mathbf{J} = \frac{J}{\sqrt{2}} \, (1, 1, 0) \tag{1044} $$
and
$$ \mathbf{H} = \frac{H}{\sqrt{2}} \, (1, 1, 0) \tag{1045} $$
one has a different longitudinal magneto-resistance
$$ \Delta \rho = \left( b + c + \frac{d}{2} \right) H^2 \tag{1046} $$
(4) When $\mathbf{H} = H \, (0, 0, 1)$ then a second transverse magneto-resistance is found as
$$ \Delta \rho = b \, H^2 \tag{1047} $$
but when $\mathbf{H} = \frac{H}{\sqrt{2}} \, (1, -1, 0)$ then the magneto-resistance is found as
$$ \Delta \rho = \left( b + \frac{d}{2} \right) H^2 \tag{1048} $$
(5) The constant $a$ makes no contribution to the magneto-resistance, but is found from the Hall effect. If $\mathbf{H}$ is transverse to $\mathbf{J}$ then the Hall effect is determined by $a$ alone, and is isotropic.
The magneto-resistance is usually positive, except for cases where the scattering is of magnetic origin, such as disorder with spin-orbit coupling or Kondo scattering by magnetic impurities in metals.
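The five geometries can be checked directly against the vector relation (1041) by projecting $\mathbf{E}$ onto the current direction. In the sketch below the numerical values of $\rho_0, a, b, c, d$ and $H$ are arbitrary illustrative choices, and the function names are assumptions introduced here; for case (4) the current is taken along $(1, 1, 0)$ as in case (3).

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def electric_field(J, H, rho0, a, b, c, d):
    """Eq. (1041) with T = diag(Hx^2, Hy^2, Hz^2) in the crystal axes."""
    h2, jh, jxh = dot(H, H), dot(J, H), cross(J, H)
    return tuple(rho0 * J[i] + a * jxh[i] + b * h2 * J[i]
                 + c * jh * H[i] + d * H[i] ** 2 * J[i] for i in range(3))


def magnetoresistance(J, H, rho0, a, b, c, d):
    """Delta rho = E.J / J^2 - rho0 along the current direction."""
    E = electric_field(J, H, rho0, a, b, c, d)
    return dot(E, J) / dot(J, J) - rho0


params = dict(rho0=1.0, a=0.3, b=0.1, c=0.2, d=0.4)  # made-up coefficients
Hm, s = 2.0, 1.0 / math.sqrt(2.0)

case1 = magnetoresistance((1, 0, 0), (Hm, 0, 0), **params)             # Eq. (1042)
case2 = magnetoresistance((1, 0, 0), (0, Hm, 0), **params)             # Eq. (1043)
case3 = magnetoresistance((s, s, 0), (Hm * s, Hm * s, 0), **params)    # Eq. (1046)
case4a = magnetoresistance((s, s, 0), (0, 0, Hm), **params)            # Eq. (1047)
case4b = magnetoresistance((s, s, 0), (Hm * s, -Hm * s, 0), **params)  # Eq. (1048)
print(case1, case2, case3, case4a, case4b)
```

The coefficient $a$ drops out of every projection because $( \mathbf{J} \wedge \mathbf{H} ) \cdot \mathbf{J} = 0$, which is why it has to be obtained from the Hall effect instead.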
11.2.5
Multi-band Models
The transverse magneto-resistance for a multi-band model is non-trivial, unlike the one band free electron model. The resistance can be obtained from the current field diagram, in which the currents originating from the various sheets of the Fermi-surface are considered separately. For example, a two band model with positive and negative charge carriers produces two components of the current $J_+$ and $J_-$, through their responses $\sigma_+$, $\sigma_-$ to the electric field $E_x$. On assuming that the carriers have the same Hall angles $\Psi_H$, then the total current is found as
$$ J_x = \frac{\sigma_+ + \sigma_-}{2} \, E_x \, ( 1 + \cos 2 \Psi_H ) \tag{1049} $$
and
$$ J_y = \frac{\sigma_+ - \sigma_-}{2} \, E_x \, \sin 2 \Psi_H \tag{1050} $$
Thus,
$$ \sigma_{x,x} = \sigma_0 \cos^2 \Psi_H , \qquad \sigma_{y,x} = \sigma_0 \, \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \, \sin \Psi_H \cos \Psi_H \tag{1051} $$
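Since the conductivity tensor is anisotropic and given by
$$ \tilde{\sigma} = \begin{pmatrix} \sigma_{x,x} & \sigma_{x,y} & 0 \\ - \sigma_{x,y} & \sigma_{x,x} & 0 \\ 0 & 0 & \sigma_{z,z} \end{pmatrix} $$
the transverse magneto-resistivity can be found from
$$ \rho_{x,x} = \frac{\sigma_{x,x} \, \sigma_{z,z}}{\sigma_{z,z} \, ( \sigma_{x,x}^2 + \sigma_{x,y}^2 )} = \frac{\sigma_{x,x}}{\sigma_{x,x}^2 + \sigma_{x,y}^2} $$
$$ = \frac{1}{\sigma_0} \, \frac{\cos^2 \Psi_H}{\cos^4 \Psi_H + \left( \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \right)^2 \sin^2 \Psi_H \cos^2 \Psi_H} $$
$$ = \frac{1}{\sigma_0} \, \frac{1}{\cos^2 \Psi_H + \left( \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \right)^2 \sin^2 \Psi_H} $$
$$ = \frac{1}{\sigma_0} \, \frac{\sec^2 \Psi_H}{1 + \left( \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \right)^2 \tan^2 \Psi_H} \tag{1052} $$
Furthermore, since $\tan \Psi_H = \omega_c \tau$, one has
$$ \sec^2 \Psi_H = 1 + \tan^2 \Psi_H = 1 + \omega_c^2 \tau^2 \tag{1053} $$
and then
$$ \rho_{x,x} = \frac{1}{\sigma_0} \, \frac{1 + \omega_c^2 \tau^2}{1 + \left( \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \right)^2 \omega_c^2 \tau^2} \tag{1054} $$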
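This saturates if $| \rho_+ - \rho_- | > 0$ and increases indefinitely for a compensated metal, $\rho_+ = \rho_-$. Basically, the positive magneto-resistance occurs because the Lorentz force produces a transverse component of the current in each sheet of the Fermi-surface. The Lorentz force acting on these transverse currents then produces a shift of the Fermi-surface opposite to the shift produced by the electric field. A similar analysis can be performed on the Hall coefficient
$$ R_H = \frac{E_\perp}{H \, J} \tag{1055} $$
The value of $E_\perp$ is the component of the field perpendicular to the current. This is found from the angle $\theta$ between $\mathbf{J}$ and $\mathbf{E}$
$$ \cos \theta = \frac{J_x}{J} , \qquad \sin \theta = \frac{J_y}{J} \tag{1056} $$
Thus
$$ R_H = \frac{E \sin \theta}{H \, J} = \frac{E \, J_y}{H \, J^2} $$
$$ = \frac{1}{\sigma_0 H} \, \frac{\tan \Psi_H}{\cos^2 \Psi_H} \, \frac{\frac{\rho_+ - \rho_-}{\rho_+ + \rho_-}}{1 + \left( \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \right)^2 \tan^2 \Psi_H} $$
$$ = \frac{\omega_c \tau}{\sigma_0 H} \, \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \, \frac{1 + \omega_c^2 \tau^2}{1 + \left( \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \right)^2 \tan^2 \Psi_H} $$
$$ = \frac{1}{|e| \, ( \rho_+ - \rho_- )} \left( \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \right)^2 \frac{1 + \omega_c^2 \tau^2}{1 + \left( \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-} \right)^2 \tan^2 \Psi_H} \tag{1057} $$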
The Hall coefficient saturates to
$$ R_H \to \frac{1}{|e| \, ( \rho_+ - \rho_- )} \tag{1058} $$
for large magnetic fields.
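The field dependence contained in Eqs. (1054) and (1057) can be explored with two one-line functions of the compensation ratio $r = ( \rho_+ - \rho_- ) / ( \rho_+ + \rho_- )$ and of $x = \omega_c \tau$. The sketch below works with dimensionless ratios; the function names and sample values are assumptions made here.

```python
def rho_xx_ratio(r, x):
    """sigma_0 * rho_xx from Eq. (1054), with r the compensation ratio and x = omega_c tau."""
    return (1.0 + x * x) / (1.0 + r * r * x * x)


def hall_ratio(r, x):
    """R_H in units of 1 / (|e| (rho_+ - rho_-)), from the last line of Eq. (1057)."""
    return r * r * (1.0 + x * x) / (1.0 + r * r * x * x)


# Uncompensated case: the magneto-resistance saturates at 1 / r^2.
saturated = rho_xx_ratio(0.5, 1.0e4)
# Compensated case r = 0: quadratic growth without saturation.
compensated = rho_xx_ratio(0.0, 3.0)
# High-field Hall coefficient approaches the value of Eq. (1058).
hall_high_field = hall_ratio(0.5, 1.0e4)
print(saturated, compensated, hall_high_field)
```

For $r = 0.5$ the resistivity saturates near four times its zero-field value, whereas for $r = 0$ it keeps growing as $1 + \omega_c^2 \tau^2$.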
11.3
Electromagnetic Properties of Metals
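Maxwell’s equations relate the electromagnetic field to charges and current sources $\rho_e(\mathbf{r}; t)$ and $\mathbf{j}(\mathbf{r}; t)$. Maxwell’s equations can be formulated as
$$ \nabla \cdot \mathbf{E}(\mathbf{r}; t) = 4 \pi \, \rho_e(\mathbf{r}; t) $$
$$ \nabla \cdot \mathbf{B}(\mathbf{r}; t) = 0 $$
$$ \nabla \wedge \mathbf{B}(\mathbf{r}; t) = \frac{4 \pi}{c} \, \mathbf{j}(\mathbf{r}; t) + \frac{1}{c} \frac{\partial \mathbf{E}(\mathbf{r}; t)}{\partial t} $$
$$ \nabla \wedge \mathbf{E}(\mathbf{r}; t) = - \frac{1}{c} \frac{\partial \mathbf{B}(\mathbf{r}; t)}{\partial t} \tag{1059} $$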
where $\mathbf{E}$ and $\mathbf{B}$ represent the microscopic electric and magnetic fields, and $\rho_e$ and $\mathbf{j}$ are the microscopic charge and current densities. These are eight equations for the six unknown quantities. The six unknown quantities are the components of $\mathbf{E}$ and $\mathbf{B}$. The sourceless equations have a formal solution in terms of a scalar potential $\phi$ and a vector potential $\mathbf{A}$, which are related to the electric field $\mathbf{E}$ and magnetic field $\mathbf{B}$ via
$$ \mathbf{E} = - \nabla \phi - \frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} , \qquad \mathbf{B} = \nabla \wedge \mathbf{A} \tag{1060} $$
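The solutions for the potentials are not unique, as the gauge transformations
$$ \mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla \Lambda \tag{1061} $$
and
$$ \phi \to \phi' = \phi - \frac{1}{c} \frac{\partial \Lambda}{\partial t} \tag{1062} $$
yield new scalar and vector potentials, $(\mathbf{A}', \phi')$, that produce the same physical $\mathbf{E}$ and $\mathbf{B}$ fields as the original potentials $(\mathbf{A}, \phi)$. The four quantities $\phi$ and $\mathbf{A}$ satisfy the four source equations
$$ \nabla^2 \phi + \frac{1}{c} \frac{\partial}{\partial t} \left( \nabla \cdot \mathbf{A} \right) = - 4 \pi \, \rho_e \tag{1063} $$
and
$$ \nabla^2 \mathbf{A} - \frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} - \nabla \left( \nabla \cdot \mathbf{A} + \frac{1}{c} \frac{\partial \phi}{\partial t} \right) = - \frac{4 \pi}{c} \, \mathbf{j} \tag{1064} $$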
These equations are usually simplified by choosing a gauge condition. The gauge conditions which are usually chosen are either the Coulomb Gauge
$$ \nabla \cdot \mathbf{A} = 0 \tag{1065} $$
or the Lorentz Gauge
$$ \nabla \cdot \mathbf{A} + \frac{1}{c} \frac{\partial \phi}{\partial t} = 0 \tag{1066} $$
The Lorentz gauge has the advantage that it is explicitly covariant under Lorentz transformations. The Coulomb gauge, also known as the transverse gauge or radiation gauge, is quite convenient for non-relativistic problems in that it separates out the effect of radiation from electrostatics.
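The space and time Fourier transform of the charge density is defined as
$$ \rho_e(\mathbf{q}, \omega) = \frac{1}{V} \int d^3r \int dt \, \exp\left[ - i \, ( \mathbf{q} \cdot \mathbf{r} - \omega t ) \right] \rho_e(\mathbf{r}; t) \tag{1067} $$
On Fourier transforming the source equations with respect to space and time, one has
$$ - q^2 \, \phi(\mathbf{q}, \omega) + \frac{\omega}{c} \, \mathbf{q} \cdot \mathbf{A}(\mathbf{q}, \omega) = - 4 \pi \, \rho_e(\mathbf{q}, \omega) $$
$$ \left( - q^2 + \frac{\omega^2}{c^2} \right) \mathbf{A}(\mathbf{q}, \omega) + \mathbf{q} \left( \mathbf{q} \cdot \mathbf{A}(\mathbf{q}, \omega) - \frac{\omega}{c} \, \phi(\mathbf{q}, \omega) \right) = - \frac{4 \pi}{c} \, \mathbf{j}(\mathbf{q}, \omega) \tag{1068} $$
In the wave-vector and frequency domain, the Coulomb gauge condition is expressed as
$$ \mathbf{q} \cdot \mathbf{A}(\mathbf{q}, \omega) = 0 \tag{1069} $$
which shows that the vector potential is transverse to the direction of $\mathbf{q}$. In the transverse gauge, the equation for the vector potential reduces to
$$ \left( - q^2 + \frac{\omega^2}{c^2} \right) \mathbf{A}(\mathbf{q}, \omega) - \mathbf{q} \, \frac{\omega}{c} \, \phi(\mathbf{q}, \omega) = - \frac{4 \pi}{c} \, \mathbf{j}(\mathbf{q}, \omega) \tag{1070} $$
The first term is transverse and the second term is longitudinal. Thus, the current can also be divided into a longitudinal term
$$ \mathbf{j}_L(\mathbf{q}, \omega) = \hat{\mathbf{q}} \left( \hat{\mathbf{q}} \cdot \mathbf{j}(\mathbf{q}, \omega) \right) \tag{1071} $$
and a transverse term
$$ \mathbf{j}_T(\mathbf{q}, \omega) = \mathbf{j}(\mathbf{q}, \omega) - \hat{\mathbf{q}} \left( \hat{\mathbf{q}} \cdot \mathbf{j}(\mathbf{q}, \omega) \right) \tag{1072} $$
Thus, the second non-trivial Maxwell equation separates into the transverse equation
$$ \left( - q^2 + \frac{\omega^2}{c^2} \right) \mathbf{A}(\mathbf{q}, \omega) = - \frac{4 \pi}{c} \, \mathbf{j}_T(\mathbf{q}, \omega) \tag{1073} $$
and the longitudinal equation
$$ - \mathbf{q} \, \frac{\omega}{c} \, \phi(\mathbf{q}, \omega) = - \frac{4 \pi}{c} \, \mathbf{j}_L(\mathbf{q}, \omega) \tag{1074} $$
In the Coulomb gauge, the other non-trivial Maxwell equation relates the charge density to the scalar potential via
$$ - q^2 \, \phi(\mathbf{q}, \omega) = - 4 \pi \, \rho_e(\mathbf{q}, \omega) \tag{1075} $$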
This is just Poisson’s equation, and it has the solution
$$ \phi(\mathbf{q}, \omega) = \frac{4 \pi}{q^2} \, \rho_e(\mathbf{q}, \omega) \tag{1076} $$
which is equivalent to Coulomb’s law. When Fourier transformed with respect to space and time, Poisson’s equation yields an instantaneous relation between the charge density and the scalar potential in the form of Coulomb’s law. Although this is an instantaneous relation, the signals transmitted by the electromagnetic field still travel with speed $c$ and are also causal. This is because, in the Coulomb gauge, the retardation effects are contained in the vector potential. Poisson’s equation actually has the same content as the longitudinal equation, as can be seen by examining the continuity equation which expresses conservation of charge
$$ \frac{\partial \rho_e}{\partial t} + \nabla \cdot \mathbf{j} = 0 \tag{1077} $$
The continuity equation can be Fourier transformed to yield
$$ - \omega \, \rho_e(\mathbf{q}, \omega) + \mathbf{q} \cdot \mathbf{j}(\mathbf{q}, \omega) = 0 \tag{1078} $$
This shows that the fluctuations in the charge density are related to the longitudinal current. On solving the continuity condition, one finds that the longitudinal current is given by
$$ \mathbf{j}_L(\mathbf{q}, \omega) = \hat{\mathbf{q}} \, \frac{\omega}{q} \, \rho_e(\mathbf{q}, \omega) \tag{1079} $$
On substituting the above expression for the longitudinal current into the longitudinal equation, one finds
$$ - \mathbf{q} \, \frac{\omega}{c} \, \phi(\mathbf{q}, \omega) = - \frac{4 \pi}{c} \, \hat{\mathbf{q}} \, \frac{\omega}{q} \, \rho_e(\mathbf{q}, \omega) \tag{1080} $$
On cancelling the factors of $\omega / c$ and $q$, one obtains Poisson’s equation. This proves that the longitudinal equation is equivalent to Poisson’s equation. We have also found that the longitudinal current can be expressed in the forms
$$ \mathbf{j}_L(\mathbf{q}, \omega) = \hat{\mathbf{q}} \, \frac{\omega}{q} \, \rho_e(\mathbf{q}, \omega) = \frac{\mathbf{q} \, \omega}{4 \pi} \, \phi(\mathbf{q}, \omega) \tag{1081} $$
so the longitudinal current can be viewed as being produced either by the charge density or by the scalar potential.
11.3.1
The Longitudinal Response
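The currents and charge densities are usually broken down into the external contributions and the induced contributions, via
$$ \mathbf{j}(\mathbf{q}, \omega) = \mathbf{j}_{ind}(\mathbf{q}, \omega) + \mathbf{j}_{ext}(\mathbf{q}, \omega) , \qquad \rho_e(\mathbf{q}, \omega) = \rho_{e \, ind}(\mathbf{q}, \omega) + \rho_{e \, ext}(\mathbf{q}, \omega) \tag{1082} $$
The external scalar potential is given in terms of the external charge density via Poisson’s equation
$$ - q^2 \, \phi_{ext}(\mathbf{q}, \omega) = - 4 \pi \, \rho_{e \, ext}(\mathbf{q}, \omega) \tag{1083} $$
The frequency and wave vector dependent dielectric constant for a homogeneous medium, $\varepsilon(\mathbf{q}, \omega)$, is defined by the ratio
$$ \varepsilon(\mathbf{q}, \omega) = \frac{\phi_{ext}(\mathbf{q}, \omega)}{\phi(\mathbf{q}, \omega)} \tag{1084} $$
The dielectric constant describes the screening of the external potential by longitudinal or charge density fluctuations. The dielectric constant is related to the longitudinal conductivity. This can be seen by combining the relation
$$ \mathbf{j}_L(\mathbf{q}, \omega) = \frac{\mathbf{q} \, \omega}{4 \pi} \, \phi(\mathbf{q}, \omega) \tag{1085} $$
with the expression for the induced component of the longitudinal current
$$ \mathbf{j}_L(\mathbf{q}, \omega)_{ind} = \frac{\mathbf{q} \, \omega}{4 \pi} \left( \phi(\mathbf{q}, \omega) - \phi(\mathbf{q}, \omega)_{ext} \right) \tag{1086} $$
Hence, on using the definition of the frequency dependent dielectric constant, one obtains
$$ \mathbf{j}_L(\mathbf{q}, \omega)_{ind} = \frac{\mathbf{q} \, \omega}{4 \pi} \left( 1 - \varepsilon(\mathbf{q}, \omega) \right) \phi(\mathbf{q}, \omega) \tag{1087} $$
The total scalar potential $\phi(\mathbf{q}, \omega)$ can be related to the longitudinal electric field, $\mathbf{E}_L(\mathbf{q}, \omega)$, since the electric field can be written as the sum of the time dependence of the vector potential and the gradient of the scalar potential
$$ \mathbf{E}(\mathbf{q}, \omega) = \frac{i \omega}{c} \, \mathbf{A}(\mathbf{q}, \omega) - i \, \mathbf{q} \, \phi(\mathbf{q}, \omega) \tag{1088} $$
If the longitudinal part of the electric field is identified as
$$ \mathbf{E}_L(\mathbf{q}, \omega) = - i \, \mathbf{q} \, \phi(\mathbf{q}, \omega) \tag{1089} $$
then one obtains the relation between the longitudinal current and the longitudinal electric field
$$ \mathbf{j}_L(\mathbf{q}, \omega)_{ind} = \frac{i \omega}{4 \pi} \left( 1 - \varepsilon(\mathbf{q}, \omega) \right) \mathbf{E}_L(\mathbf{q}, \omega) \tag{1090} $$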
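Hence, as the longitudinal conductivity $\sigma_L$ is defined by the relation
$$ \mathbf{j}_L(\mathbf{q}, \omega)_{ind} = \sigma_L(\mathbf{q}, \omega) \, \mathbf{E}_L(\mathbf{q}, \omega) \tag{1091} $$
one finds that the conductivity and the dielectric constant are related through
$$ \sigma_L(\mathbf{q}, \omega) = \frac{i \omega}{4 \pi} \left( 1 - \varepsilon(\mathbf{q}, \omega) \right) \tag{1092} $$
The frequency dependent dielectric constant can be expressed in terms of the response of the charge density due to the potential
$$ \varepsilon(\mathbf{q}, \omega) = \frac{\phi_{ext}(\mathbf{q}, \omega)}{\phi(\mathbf{q}, \omega)} = \frac{\phi(\mathbf{q}, \omega) - \phi_{ind}(\mathbf{q}, \omega)}{\phi(\mathbf{q}, \omega)} = 1 - \frac{4 \pi}{q^2} \, \frac{\rho_{e \, ind}(\mathbf{q}, \omega)}{\phi(\mathbf{q}, \omega)} \tag{1093} $$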
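The charge density is related to the electron density via a factor of the electron’s charge
$$ \rho_{e \, ind}(\mathbf{q}, \omega) = - |e| \, \rho_{ind}(\mathbf{q}, \omega) \tag{1094} $$
and the scalar potential acting on the electrons produces the potential $\delta V(\mathbf{q}, \omega)$ where
$$ \delta V(\mathbf{q}, \omega) = - |e| \, \phi(\mathbf{q}, \omega) \tag{1095} $$
Thus, the frequency dependent dielectric constant may be written as
$$ \varepsilon(\mathbf{q}, \omega) = 1 - \frac{4 \pi e^2}{q^2} \, \frac{\rho_{ind}(\mathbf{q}, \omega)}{\delta V(\mathbf{q}, \omega)} = 1 - \frac{4 \pi e^2}{q^2} \, \chi(\mathbf{q}, \omega) \tag{1096} $$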
(H. Ehrenreich and M.H. Cohen, Phys. Rev. 115, 786 (1959)) where we have used the definition of the frequency dependent response function $\chi(\mathbf{q}, \omega)$. The frequency dependent response function is defined by
$$ \chi(\mathbf{q}, \omega) = \frac{\rho_{ind}(\mathbf{q}, \omega)}{\delta V(\mathbf{q}, \omega)} \tag{1097} $$
The real space and time form of the linear response relation can be found by re-writing this relation as
$$ \rho_{ind}(\mathbf{q}, \omega) = \chi(\mathbf{q}, \omega) \, \delta V(\mathbf{q}, \omega) \tag{1098} $$
and then performing the inverse Fourier transform. The real space and time form of the linear response relation is in the form of a convolution
$$ \rho_{ind}(\mathbf{r}, t) = \int d^3r' \int_{-\infty}^{\infty} dt' \, \chi(\mathbf{r} - \mathbf{r}'; t - t') \, \delta V(\mathbf{r}', t') \tag{1099} $$
The dependence of the response function on $\mathbf{r} - \mathbf{r}'$ is a direct consequence of our assumption that space is homogeneous. As the response function relates the cause and effect in a linear fashion, the response function can be calculated perturbatively. The induced electron density is found, in real space and time, by treating the time dependent potential as a perturbation. The resulting causal, non-local relation is then Fourier transformed with respect to space and time. This procedure is a generalization of our previous treatment of static screening. The expectation value of the electron density operator $\hat{\rho}(\mathbf{r})$ at time $t$ is calculated in a state that has evolved from the ground state due to the interaction. The electron density operator is given by
$$ \hat{\rho}(\mathbf{r}) = \sum_i \delta^3\left( \mathbf{r} - \mathbf{r}_i \right) \tag{1100} $$
and the time dependent perturbation is
$$ \hat{H}_{int}(t) = \int d^3r' \, \hat{\rho}(\mathbf{r}', t) \, \delta V(\mathbf{r}', t) \tag{1101} $$
The expectation value of the electron density is to be evaluated in the interaction representation. The expectation value of the density is given by
$$ \rho(\mathbf{r}, t) = \langle \Psi_{int}(t) | \, \hat{\rho}_{int}(\mathbf{r}, t) \, | \Psi_{int}(t) \rangle \tag{1102} $$
where the state and operators are expressed in the interaction representation. In the interaction representation the operators evolve with respect to time under the influence of the unperturbed Hamiltonian $\hat{H}_0$, and are given by
$$ \hat{\rho}_{int}(\mathbf{r}, t) = \exp\left[ + \frac{i t}{\hbar} \hat{H}_0 \right] \hat{\rho}(\mathbf{r}) \, \exp\left[ - \frac{i t}{\hbar} \hat{H}_0 \right] \tag{1103} $$
In the interaction representation, the state evolves under the influence of the interaction $\hat{H}_{int}(t)$. To first order in the perturbation, the ground state is given by
$$ | \Psi_{int}(t) \rangle = \left[ 1 - \frac{i}{\hbar} \int_{-\infty}^{t} dt' \, \hat{H}_{int}(t') + \ldots \right] | \Psi_0 \rangle \tag{1104} $$
where $| \Psi_0 \rangle$ is the initial ground state eigenfunction of $\hat{H}_0$. The induced electron density is defined as
$$ \rho_{ind}(\mathbf{r}, t) = \langle \Psi_{int}(t) | \, \hat{\rho}_{int}(\mathbf{r}, t) \, | \Psi_{int}(t) \rangle - \langle \Psi_0 | \, \hat{\rho}_{int}(\mathbf{r}, t) \, | \Psi_0 \rangle \tag{1105} $$
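The second term is time independent, as it is the expectation value in the ground state of the time independent Hamiltonian $\hat{H}_0$. On substituting the expression for the perturbed wave function, one finds a linear relationship between the induced density and the perturbing potential
$$ \rho_{ind}(\mathbf{r}, t) = - \frac{i}{\hbar} \int_{-\infty}^{t} dt' \, \langle \Psi_0 | \left[ \hat{\rho}_{int}(\mathbf{r}, t) , \hat{H}_{int}(t') \right] | \Psi_0 \rangle $$
$$ = - \frac{i}{\hbar} \int_{-\infty}^{t} dt' \int d^3r' \, \langle \Psi_0 | \left[ \hat{\rho}_{int}(\mathbf{r}, t) , \hat{\rho}_{int}(\mathbf{r}', t') \right] | \Psi_0 \rangle \, \delta V(\mathbf{r}', t') $$
$$ = \int_{-\infty}^{+\infty} dt' \int d^3r' \, \chi(\mathbf{r}, \mathbf{r}'; t - t') \, \delta V(\mathbf{r}', t') \tag{1106} $$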
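This is a causal relation in which the response function is identified as
$$ \chi(\mathbf{r}, \mathbf{r}'; t - t') = - \frac{i}{\hbar} \, \langle \Psi_0 | \left[ \hat{\rho}_{int}(\mathbf{r}, t) , \hat{\rho}_{int}(\mathbf{r}', t') \right] | \Psi_0 \rangle \, \Theta( t - t' ) \tag{1107} $$
where $\Theta(t)$ is the Heaviside step function. Thus, the response function is a two time correlation function, which involves the ground state expectation value of the commutator of the density operators at different positions and different times. Due to the time homogeneity of the ground state, the correlation function only depends on the difference of the two times. For a spatially homogeneous system, the correlation function only depends on the difference $\mathbf{r} - \mathbf{r}'$. The expression can be evaluated by using the completeness relation
$$ \sum_n | \Psi_n \rangle \langle \Psi_n | = \hat{I} \tag{1108} $$
On inserting a complete set of states between the density operators, one obtains
$$ \chi(\mathbf{r}, \mathbf{r}'; t - t') = - \frac{i}{\hbar} \sum_n \Big[ \langle \Psi_0 | \hat{\rho}_{int}(\mathbf{r}, t) | \Psi_n \rangle \langle \Psi_n | \hat{\rho}_{int}(\mathbf{r}', t') | \Psi_0 \rangle - \langle \Psi_0 | \hat{\rho}_{int}(\mathbf{r}', t') | \Psi_n \rangle \langle \Psi_n | \hat{\rho}_{int}(\mathbf{r}, t) | \Psi_0 \rangle \Big] \, \Theta( t - t' ) \tag{1109} $$
On expressing the time dependence of the operators in terms of the eigenvalues of the unperturbed Hamiltonian, $\hat{H}_0$, the response function reduces to
$$ \chi(\mathbf{r}, \mathbf{r}'; t - t') = - \frac{i}{\hbar} \sum_n \exp\left[ + \frac{i}{\hbar} (t - t') (E_0 - E_n) \right] \langle \Psi_0 | \hat{\rho}(\mathbf{r}) | \Psi_n \rangle \langle \Psi_n | \hat{\rho}(\mathbf{r}') | \Psi_0 \rangle $$
$$ \qquad + \frac{i}{\hbar} \sum_n \exp\left[ - \frac{i}{\hbar} (t - t') (E_0 - E_n) \right] \langle \Psi_0 | \hat{\rho}(\mathbf{r}') | \Psi_n \rangle \langle \Psi_n | \hat{\rho}(\mathbf{r}) | \Psi_0 \rangle \tag{1110} $$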
for $t - t' > 0$, and is zero otherwise. In the above expression for the response function, the density operators are no longer time dependent. Up to this point, our analysis has been completely general. To illustrate the structure of the response function, we shall now make the assumption that the electrons are non-interacting. The ground state $| \Psi_0 \rangle$ and the excited states $| \Psi_n \rangle$ can be represented by single Slater determinants, composed of the sets of one-electron energy eigenfunctions $\{ \phi_{\alpha_j}(\mathbf{r}_j) ; j \in 1, 2, \ldots N_e \}$ and $\{ \phi_{\beta_j}(\mathbf{r}_j) ; j \in 1, 2, \ldots N_e \}$, respectively. The matrix elements of the one-electron operator $\hat{\rho}(\mathbf{r})$ are non-zero only if the sets of quantum numbers $\{ \alpha_j ; j \in 1, 2, \ldots N_e \}$ and $\{ \beta_j ; j \in 1, 2, \ldots N_e \}$ differ by at most one element, say the i-th value. Thus, we may permute the indices in the set $\beta_j$ until one has
$$ \alpha_i \neq \beta_i \tag{1111} $$
and
$$ \alpha_j = \beta_j \qquad \forall \, j \neq i \tag{1112} $$
In this case, the matrix elements $\langle \Psi_0 | \hat{\rho}(\mathbf{r}) | \Psi_n \rangle$ are trivially evaluated as
$$ \langle \Psi_0 | \hat{\rho}(\mathbf{r}) | \Psi_n \rangle = \int d^3r_i \, \phi^*_{\alpha_i}(\mathbf{r}_i) \, \delta^3( \mathbf{r} - \mathbf{r}_i ) \, \phi_{\beta_i}(\mathbf{r}_i) = \phi^*_{\alpha_i}(\mathbf{r}) \, \phi_{\beta_i}(\mathbf{r}) \tag{1113} $$
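The matrix element is only non-zero if the spin state of $\alpha$ is identical to the spin state of $\beta$, so the spin quantum number is conserved. In the above expression, the single electron state $\alpha_i$ is occupied in the initial state $| \Psi_0 \rangle$ and unoccupied in the final state $| \Psi_n \rangle$, and the single electron state $\beta_i$ is unoccupied in the initial state $| \Psi_0 \rangle$ and occupied in the final state $| \Psi_n \rangle$. All the other single-electron quantum numbers in $| \Psi_0 \rangle$ and $| \Psi_n \rangle$ are unchanged, i.e., $\alpha_j = \beta_j$ for all $j \neq i$. Furthermore, the Pauli exclusion principle requires that $\beta_i \neq \beta_j$. This shows that the final states of the non-interacting many-electron system are obtained by exciting a single electron from the state $\alpha_i$ to the state $\beta_i$. For non-interacting electrons, the excitation energy $E_n - E_0$ is simply given by the difference in the single-electron energy eigenvalues
$$ E_n - E_0 = E_{\beta_i} - E_{\alpha_i} \tag{1114} $$
Thus, the response function is simply given by
$$ \chi(\mathbf{r}, \mathbf{r}'; t) = - \frac{i}{\hbar} \sum_{\alpha, \beta} \exp\left[ + \frac{i}{\hbar} \, t \, ( E_\alpha - E_\beta ) \right] \phi^*_\alpha(\mathbf{r}) \, \phi_\beta(\mathbf{r}) \, \phi^*_\beta(\mathbf{r}') \, \phi_\alpha(\mathbf{r}') $$
$$ \qquad + \frac{i}{\hbar} \sum_{\alpha, \beta} \exp\left[ - \frac{i}{\hbar} \, t \, ( E_\alpha - E_\beta ) \right] \phi_\alpha(\mathbf{r}) \, \phi^*_\beta(\mathbf{r}) \, \phi_\beta(\mathbf{r}') \, \phi^*_\alpha(\mathbf{r}') \tag{1115} $$
for $t > 0$. The sum over $\alpha$ is restricted to run over the single particle quantum numbers that are occupied in the ground state, and $\beta$ runs over the quantum numbers that are unoccupied in the ground state. The spin quantum number is conserved, that is $\sigma_\alpha = \sigma_\beta$. On evaluating the response function for free electrons, summing over spin states and using the Bloch state energy eigenvalues, one finds
$$ \chi(\mathbf{r} - \mathbf{r}'; t) = - \frac{2 i}{\hbar V^2} \sum_{|\mathbf{k}| < k_F} \; \sum_{|\mathbf{k}'| > k_F} \exp\left[ + \frac{i t}{\hbar} ( E_k - E_{k'} ) \right] \exp\left[ - i \, ( \mathbf{k} - \mathbf{k}' ) \cdot ( \mathbf{r} - \mathbf{r}' ) \right] $$
$$ \qquad + \frac{2 i}{\hbar V^2} \sum_{|\mathbf{k}| < k_F} \; \sum_{|\mathbf{k}'| > k_F} \exp\left[ - \frac{i t}{\hbar} ( E_k - E_{k'} ) \right] \exp\left[ + i \, ( \mathbf{k} - \mathbf{k}' ) \cdot ( \mathbf{r} - \mathbf{r}' ) \right] \tag{1116} $$
for $t > 0$. Since the free electron gas is homogeneous, the response function only depends on the distance between the perturbation and the response, $\mathbf{r} - \mathbf{r}'$. On Fourier transforming the response function with respect to space and time one obtains $\chi(\mathbf{q}, \omega)$ as
$$ \chi(\mathbf{q}, \omega) = \int_{-\infty}^{+\infty} dt \int d^3r \, \exp\left[ - i \, ( \mathbf{q} \cdot \mathbf{r} - \omega t ) \right] \chi(\mathbf{r}; t) \tag{1117} $$
Since the response function contains the Heaviside step function $\Theta(t)$, the integral over $t$ can be evaluated in the interval $\infty > t \geq 0$. The integral over $t$ converges faster if $\omega$ is analytically continued into the upper half complex plane to $\omega \to z = \omega + i \delta$. The factor of $\exp[ - \delta \, t ]$ damps out the oscillations in the integrand as $t \to \infty$. Thus, one finds that in the $(\mathbf{q}, \omega)$ domain the response function is complex and is given by the expression
$$ \chi(\mathbf{q}, \omega + i \delta) = \frac{2}{V} \sum_{|\mathbf{k}| < k_F , \; |\mathbf{k} + \mathbf{q}| > k_F} \left[ \frac{1}{\hbar \omega + i \delta + E_k - E_{k+q}} \right] - \frac{2}{V} \sum_{|\mathbf{k}| > k_F , \; |\mathbf{k} + \mathbf{q}| < k_F} \left[ \frac{1}{\hbar \omega + i \delta + E_k - E_{k+q}} \right] \tag{1118} $$
The restrictions on the summation over $\mathbf{k}$ can be simplified. To see this, we shall introduce a function $f_k$ which behaves like the $T \to 0$ limit of the Fermi-function. The function is defined by
$$ f_k = 1 \qquad \text{for } E_k < E_F \tag{1119} $$
and
$$ f_k = 0 \qquad \text{for } E_k > E_F \tag{1120} $$
The response function can then be written as the sum over all $\mathbf{k}$ as
$$ \chi(\mathbf{q}, \omega + i \delta) = \frac{2}{V} \sum_{\mathbf{k}} \left[ \frac{f_k \, ( 1 - f_{k+q} )}{\hbar \omega + i \delta + E_k - E_{k+q}} \right] - \frac{2}{V} \sum_{\mathbf{k}} \left[ \frac{f_{k+q} \, ( 1 - f_k )}{\hbar \omega + i \delta + E_k - E_{k+q}} \right] $$
$$ = \frac{2}{V} \sum_{\mathbf{k}} \left[ \frac{f_k - f_{k+q}}{\hbar \omega + i \delta + E_k - E_{k+q}} \right] \tag{1121} $$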
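In the last line, it is seen that the factors which explicitly enforce the Pauli-exclusion principle cancel. For $\omega$ just above the real axis, i.e. in the limit $\delta \to 0$, the imaginary part of the response function is found as
$$ \mathrm{Im} \, \chi(\mathbf{q}, \omega + i \delta) = - \frac{2 \pi}{V} \sum_{|\mathbf{k}| < k_F} \delta\left( \hbar \omega + E_k - E_{k+q} \right) + \frac{2 \pi}{V} \sum_{|\mathbf{k} + \mathbf{q}| < k_F} \delta\left( \hbar \omega + E_k - E_{k+q} \right) \tag{1122} $$
From this analysis, one can see that for positive $\omega$ the imaginary part of $\chi(\mathbf{q}, \omega)$ is non-zero in the region of $(\omega, q)$ space where
$$ \frac{\hbar}{2 m} \, ( - 2 k_F q + q^2 ) < \omega < \frac{\hbar}{2 m} \, ( + 2 k_F q + q^2 ) \tag{1123} $$
It is only in this region that the argument of the first delta function in $\mathrm{Im} \, \chi(\mathbf{q}, \omega)$ has a solution
$$ \frac{\hbar}{m} \, \mathbf{k} \cdot \mathbf{q} = \omega - \frac{\hbar}{2 m} \, q^2 \tag{1124} $$
with $k < k_F$. These conditions divide $(q, \omega)$ space into non-overlapping regions. For completeness, the complete expressions for the real and imaginary parts of the Lindhard dielectric function at finite frequencies are given (J. Lindhard, Kgl. Danske Videnskab. Selskab. Mat. Fys. Medd. 28, 8 (1954)). The real part is given by
$$ \mathrm{Re} \, \varepsilon(q, \omega) = 1 + \frac{k_{TF}^2}{2 q^2} \Bigg[ 1 + \frac{k_F}{2 q} \Bigg\{ \left( 1 - \frac{( 2 m \omega - \hbar q^2 )^2}{4 \hbar^2 q^2 k_F^2} \right) \ln \left| \frac{2 m \omega - 2 \hbar q k_F - \hbar q^2}{2 m \omega + 2 \hbar q k_F - \hbar q^2} \right| $$
$$ \qquad + \left( 1 - \frac{( 2 m \omega + \hbar q^2 )^2}{4 \hbar^2 q^2 k_F^2} \right) \ln \left| \frac{2 m \omega + 2 \hbar q k_F + \hbar q^2}{2 m \omega - 2 \hbar q k_F + \hbar q^2} \right| \Bigg\} \Bigg] \tag{1125} $$
and the imaginary part is given by
$$ \mathrm{Im} \, \varepsilon(q, \omega + i \delta) = \frac{\pi}{2} \, \frac{k_{TF}^2}{q^2} \, \frac{m \omega}{\hbar q k_F} \qquad \text{for } 2 m \omega < 2 \hbar q k_F - \hbar q^2 \tag{1126} $$
$$ \mathrm{Im} \, \varepsilon(q, \omega + i \delta) = \frac{\pi}{4} \, \frac{k_{TF}^2}{q^2} \, \frac{k_F}{q} \left[ 1 - \frac{( 2 m \omega - \hbar q^2 )^2}{4 \hbar^2 q^2 k_F^2} \right] \qquad \text{for } 2 \hbar q k_F - \hbar q^2 < 2 m \omega < 2 \hbar q k_F + \hbar q^2 \tag{1127} $$
and
$$ \mathrm{Im} \, \varepsilon(q, \omega + i \delta) = 0 \qquad \text{for } 2 \hbar q k_F + \hbar q^2 < 2 m \omega \tag{1128} $$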
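The expressions (1125) to (1128) can be evaluated directly. The sketch below does so in units with $\hbar = m = 1$, restricted to $\omega \geq 0$ and $0 < q < 2 k_F$, where the three frequency ranges quoted above apply as written. The function names, the default values $k_F = k_{TF} = 1$, and the identification $\omega_p^2 = k_{TF}^2 k_F^2 / 3$ (which assumes the standard Thomas-Fermi definition of $k_{TF}$) are assumptions made here for illustration.

```python
import math


def re_epsilon(q, omega, k_f=1.0, k_tf=1.0):
    """Real part, Eq. (1125), in units hbar = m = 1 (singular on the continuum edges)."""
    a = 2.0 * q * k_f
    u_minus = 2.0 * omega - q * q
    u_plus = 2.0 * omega + q * q
    term_minus = (1.0 - (u_minus / a) ** 2) * math.log(abs((u_minus - a) / (u_minus + a)))
    term_plus = (1.0 - (u_plus / a) ** 2) * math.log(abs((u_plus + a) / (u_plus - a)))
    return 1.0 + k_tf ** 2 / (2.0 * q * q) * (1.0 + k_f / (2.0 * q) * (term_minus + term_plus))


def im_epsilon(q, omega, k_f=1.0, k_tf=1.0):
    """Imaginary part, Eqs. (1126)-(1128), for omega >= 0 and 0 < q < 2 k_F."""
    if not (0.0 < q < 2.0 * k_f) or omega < 0.0:
        raise ValueError("sketch only covers omega >= 0 and 0 < q < 2 k_F")
    two_w = 2.0 * omega
    lower = 2.0 * q * k_f - q * q
    upper = 2.0 * q * k_f + q * q
    if two_w < lower:
        return 0.5 * math.pi * (k_tf / q) ** 2 * omega / (q * k_f)
    if two_w < upper:
        bracket = 1.0 - ((two_w - q * q) / (2.0 * q * k_f)) ** 2
        return 0.25 * math.pi * (k_tf / q) ** 2 * (k_f / q) * bracket
    return 0.0


static_value = re_epsilon(1.0, 0.0)      # static limit at q = k_F
high_frequency = re_epsilon(0.5, 5.0)    # well above the particle-hole continuum
print(static_value, high_frequency, im_epsilon(1.0, 0.25))
```

Two checks are useful: the imaginary part is continuous across $2 m \omega = 2 \hbar q k_F - \hbar q^2$, and at high frequency the real part approaches $1 - \omega_p^2 / \omega^2$.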
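The real part is an even function of $\omega$ and the imaginary part is an odd function of $\omega$. For $\omega = 0$ the response function reduces to the real static response function calculated previously. For $| \omega | > \frac{\hbar}{2 m} ( 2 k_F q + q^2 )$ the imaginary part of the function vanishes, as the denominator never vanishes for any $\mathbf{k}$ value in the range of integration. In this region of $q$ and $\omega$ there are no poles; therefore, the real part of the response function $\chi(q, \omega)$ can be expanded in powers of $q^2$. To the order of $q^4$, one finds
$$ \mathrm{Re} \, \chi(q, \omega) = \frac{k_F^3}{3 \pi^2} \, \frac{q^2}{m \, \omega^2} \left[ 1 + \frac{3}{5} \left( \frac{\hbar k_F q}{m \, \omega} \right)^2 + \ldots \right] \tag{1129} $$
Thus, for high frequencies such that $\omega \gg q \, \frac{\hbar k_F}{m}$, the dielectric constant can be approximated by
$$ \varepsilon(q, \omega) = 1 - \frac{4 \pi \rho \, e^2}{m \, \omega^2} \left[ 1 + \frac{3}{5} \left( \frac{\hbar k_F q}{m \, \omega} \right)^2 + \ldots \right] = 1 - \frac{\omega_p^2}{\omega^2} \left[ 1 + \frac{3}{5} \left( \frac{\hbar k_F q}{m \, \omega} \right)^2 + \ldots \right] \tag{1130} $$
where the expression for the electron density $\rho$
$$ \rho = 2 \, \frac{k_F^3}{6 \pi^2} \tag{1131} $$
has been used, and the plasmon frequency $\omega_p$ has been defined via
$$ \omega_p^2 = \frac{4 \pi \rho \, e^2}{m} \tag{1132} $$
Thus, the dielectric constant has zeros at the frequencies $\omega = \omega_p(q)$, where
$$ \omega_p^2(q) = \omega_p^2 \left[ 1 + \frac{3}{5} \left( \frac{\hbar k_F q}{m \, \omega_p(q)} \right)^2 + \ldots \right] \tag{1133} $$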
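Equation (1133) defines $\omega_p(q)$ implicitly. If the series is truncated after the term shown, it becomes a quadratic equation in $\omega_p^2(q)$, which can be solved in closed form or by simple fixed-point iteration. The sketch below does both in units with $\hbar = m = 1$; the function names and the sample value of $q$ are assumptions made here, and the result is only meaningful where the truncated expansion is valid, i.e. for small $q$.

```python
import math


def plasmon_frequency(q, omega_p=1.0, k_f=1.0):
    """Positive root of w^4 - wp^2 w^2 - (3/5) wp^2 (k_F q)^2 = 0, i.e. truncated Eq. (1133)."""
    s = 0.6 * (k_f * q) ** 2
    return math.sqrt(0.5 * omega_p ** 2 * (1.0 + math.sqrt(1.0 + 4.0 * s / omega_p ** 2)))


def plasmon_frequency_iterated(q, omega_p=1.0, k_f=1.0, steps=50):
    """Fixed-point iteration of truncated Eq. (1133), starting from w = omega_p."""
    w = omega_p
    for _ in range(steps):
        w = omega_p * math.sqrt(1.0 + 0.6 * (k_f * q / w) ** 2)
    return w


q = 0.3
w_closed = plasmon_frequency(q)
w_iter = plasmon_frequency_iterated(q)
print(w_closed, w_iter)
```

Both routes should agree, and at $q = 0$ they reduce to $\omega_p$; the upward dispersion with increasing $q$ is quadratic to leading order.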
If the external potential is zero, $\phi_{ext}(q, \omega_p(q)) = 0$, and the total potential is non-zero, $\phi(q, \omega_p(q)) \neq 0$, then the real and imaginary parts of the dielectric constant must both vanish, $\varepsilon(q, \omega_p(q)) = 0$, as
$$ \varepsilon(q, \omega_p(q)) \, \phi(q, \omega_p(q)) = \phi_{ext}(q, \omega_p(q)) \quad \Rightarrow \quad \varepsilon(q, \omega_p(q)) \, \phi(q, \omega_p(q)) = 0 \tag{1134} $$
In this case, when the total potential inside the solid, $\phi(q, \omega_p(q))$, is non-zero, the induced density and current fluctuations must be finite. These longitudinal collective charge oscillations are plasmons. Typical plasmon energies $\hbar \omega_p$ in metals range from values as low as 3.72 eV in K and 5.71 eV in Na to values as high as 15.8 eV in Al. The dielectric materials Si, Ge etc. also have plasmon energies of the order of 16 eV. One may enquire as to the nature of the excitations at larger $q$ values, such that the phase velocity of the plasmons becomes comparable to the Fermi velocity $v_F = \hbar k_F / m$. At a critical value of $q$ the denominator of the response function may vanish, so the response function acquires a sizeable imaginary part. The plasmon excitations merge with a continuum of particle-hole excitations which have excitation energies given by
$$ \hbar \omega(q, k) = E_{k+q} - E_k \tag{1135} $$
for $k < k_F$. The edges of the continuum stretch from $\frac{\hbar^2}{2m} ( 2 k_F q + q^2 )$ to $\frac{\hbar^2}{2m} ( - 2 k_F q + q^2 )$. When the plasmon merges into the continuum it undergoes significant broadening. This sort of damping is called Landau damping. Landau damping can also be viewed classically, in terms of electrons surfing on the waves in the potential field. Imagine that a wave with phase velocity $\omega / q$ is propagating through an electron gas, and consider the electrons whose velocities are almost parallel to, and close in magnitude to, the phase velocity of the wave. In the frame of reference travelling with the wave, such an electron is at rest and experiences an essentially time-independent electric field. The electric field continuously transfers energy from the wave to the electrons that have the same velocity. If there is a slight mismatch in the velocities, electrons with lower velocity than the wave draw energy from the wave and accelerate, whereas electrons that are moving faster lose energy and slow down. This has the consequence that the rate of energy loss of the wave is proportional to the derivative of the electron velocity distribution, evaluated at the wave's phase velocity.
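As a rough numerical companion to the plasmon energies quoted above, the free-electron formula $\omega_p^2 = 4 \pi \rho e^2 / m$ can be evaluated in Gaussian units. The sketch below is illustrative only: the conduction-electron densities are standard free-electron tabulated values and are an assumption, not taken from these notes. Some deviation from the measured energies is to be expected, since the formula ignores band-structure and core-polarisation effects.

```python
import math

# Gaussian (CGS) constants
E_CHARGE = 4.80320e-10     # electron charge in esu
M_ELECTRON = 9.10938e-28   # electron mass in g
HBAR = 1.054572e-27        # erg s
ERG_PER_EV = 1.602177e-12  # erg per eV

def plasmon_energy_ev(density_cm3):
    """Return hbar * omega_p in eV for a free-electron gas of the given density (cm^-3)."""
    omega_p = math.sqrt(4.0 * math.pi * density_cm3 * E_CHARGE ** 2 / M_ELECTRON)
    return HBAR * omega_p / ERG_PER_EV

# Assumed free-electron densities in cm^-3 (illustrative tabulated values)
densities = {"K": 1.40e22, "Na": 2.65e22, "Al": 18.1e22}

if __name__ == "__main__":
    for metal, rho in densities.items():
        print(metal, round(plasmon_energy_ev(rho), 2), "eV")
```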
11.3.2 Electron Scattering Experiments
The longitudinal excitations of the electrons in a metal can be probed by scattering a beam of charged particles or fast electrons. The coupling takes place via the Coulomb interaction
$$ \hat{H}_{int} = \sum_i \frac{e^2}{| \mathbf{r} - \mathbf{r}_i |} \tag{1136} $$
where $\mathbf{r}_i$ labels the positions of the electrons in the plasma and $\mathbf{r}$ is the position of the incoming high-energy electron. If the incident beam is composed of electrons which have high energies, the beam electrons can be treated as classical particles and are, therefore, distinguishable. This ignores the possibility of exchange interactions with the electrons in the metal. Analysis of the Mott scattering formula for electrons also shows that the neglect of exchange scattering is an excellent approximation for small-angle scattering. Therefore, we shall consider the charged particles in the beam as being distinguishable from the electrons in the solid. The rate at which a charged particle is scattered inelastically from state $k$ with energy $E(k)$ to state $k'$ with energy $E(k')$ is given by
$$ \frac{1}{\tau(k \to k')} = \frac{2 \pi}{\hbar} \left| < k' \, \Psi_n | \hat{H}_{int} | k \, \Psi_0 > \right|^2 \delta \left( E_n + E(k') - E_0 - E(k) \right) \tag{1137} $$
where $| \Psi_0 >$ and $E_0$ are the ground state wave function and ground state energy of the solid. The final state wave function and energy are given by $| \Psi_n >$ and $E_n$. The momentum and energy loss of the charged particle are defined to be
$$ \hbar \mathbf{q} = \hbar \mathbf{k} - \hbar \mathbf{k}' , \qquad \hbar \omega = E(k) - E(k') \tag{1138} $$
On performing the integral over the position of the fast charged particle one has
$$ \frac{1}{\tau(k \to k')} = \frac{2 \pi}{\hbar} \left( \frac{4 \pi e^2}{q^2 V} \right)^2 \left| < \Psi_n | \sum_i \exp[ \, i \, \mathbf{q}\cdot\mathbf{r}_i \, ] | \Psi_0 > \right|^2 \delta \left( \hbar \omega + E_0 - E_n \right) \tag{1139} $$
The energy conserving delta function can be replaced by an integral over time by using the identity
$$ \delta \left( \hbar \omega + E_0 - E_n \right) = \int_{-\infty}^{\infty} \frac{dt}{2 \pi \hbar} \exp[ \, i \omega t \, ] \exp \left[ i \, ( E_0 - E_n ) \frac{t}{\hbar} \right] \tag{1140} $$
The energy eigenvalues in the exponential time evolution factors can be replaced by the general time evolution operators involving the unperturbed Hamiltonian operator $\hat{H}_0$,
$$ \frac{1}{\tau(k \to k')} = \frac{2 \pi}{\hbar^2} \left( \frac{4 \pi e^2}{q^2 V} \right)^2 \int_{-\infty}^{\infty} \frac{dt}{2 \pi} \exp[ \, i \omega t \, ] < \Psi_n | \sum_i \exp[ \, i \, \mathbf{q}\cdot\mathbf{r}_i \, ] | \Psi_0 > \times < \Psi_0 | \sum_i \exp \left[ i \frac{\hat{H}_0 t}{\hbar} \right] \exp[ - i \, \mathbf{q}\cdot\mathbf{r}_i \, ] \exp \left[ - i \frac{\hat{H}_0 t}{\hbar} \right] | \Psi_n > \tag{1141} $$
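The factor involving $\mathbf{r}_i$ can be expressed as the Fourier transform of the electron density operator
$$ \frac{1}{V} \sum_i \exp[ - i \, \mathbf{q}\cdot\mathbf{r}_i \, ] = \frac{1}{V} \int d^3r \, \exp[ - i \, \mathbf{q}\cdot\mathbf{r} \, ] \sum_i \delta^3( \mathbf{r} - \mathbf{r}_i ) = \frac{1}{V} \int d^3r \, \exp[ - i \, \mathbf{q}\cdot\mathbf{r} \, ] \, \hat{\rho}(\mathbf{r}) = \hat{\rho}_q \tag{1142} $$
Thus, on combining the above expressions, the inelastic scattering rate is found as
$$ \frac{1}{\tau(k \to k')} = \frac{2 \pi}{\hbar^2} \left( \frac{4 \pi e^2}{q^2} \right)^2 \int_{-\infty}^{\infty} \frac{dt}{2 \pi} \exp[ \, i \omega t \, ] < \Psi_n | \hat{\rho}_{-q} | \Psi_0 > \times < \Psi_0 | \exp \left[ i \frac{\hat{H}_0 t}{\hbar} \right] \hat{\rho}_q \exp \left[ - i \frac{\hat{H}_0 t}{\hbar} \right] | \Psi_n > $$
$$ = \frac{2 \pi}{\hbar^2} \left( \frac{4 \pi e^2}{q^2} \right)^2 \int_{-\infty}^{\infty} \frac{dt}{2 \pi} \exp[ \, i \omega t \, ] < \Psi_0 | \hat{\rho}_q(t) | \Psi_n > < \Psi_n | \hat{\rho}_{-q}(0) | \Psi_0 > \tag{1143} $$
where the density operator is evaluated in the interaction representation. If only the final state of the charged particle is measured and the final state of the solid $| \Psi_n >$ is not measured, there is a distribution of possible final states of the solid, and the index $n$ labelling them must be summed over
$$ \frac{1}{\tau(k \to k')} = \frac{2 \pi}{\hbar^2} \left( \frac{4 \pi e^2}{q^2} \right)^2 \sum_n \int_{-\infty}^{\infty} \frac{dt}{2 \pi} \exp[ \, i \omega t \, ] < \Psi_0 | \hat{\rho}_q(t) | \Psi_n > < \Psi_n | \hat{\rho}_{-q}(0) | \Psi_0 > \tag{1144} $$
The sum over the final states can be evaluated using the completeness relation, which leads to the result
$$ \frac{1}{\tau(k \to k')} = \frac{2 \pi}{\hbar^2} \left( \frac{4 \pi e^2}{q^2} \right)^2 \int_{-\infty}^{\infty} \frac{dt}{2 \pi} \exp[ \, i \omega t \, ] < \Psi_0 | \hat{\rho}_q(t) \, \hat{\rho}_{-q}(0) | \Psi_0 > \tag{1145} $$
The factor of $q^{-4}$ shows the scattering process is dominated by small momentum transfers. The density-density correlation function, $S(q, \omega)$, is defined via
$$ S(q, \omega) = V^2 \int_{-\infty}^{\infty} \frac{dt}{2 \pi \hbar} \exp[ \, i \omega t \, ] < \Psi_0 | \hat{\rho}_q(t) \, \hat{\rho}_{-q}(0) | \Psi_0 > \tag{1146} $$
On substituting this relation into the scattering rate we obtain the result
$$ \frac{1}{\tau(k \to k')} = \frac{2 \pi}{\hbar} \left( \frac{4 \pi e^2}{q^2 V} \right)^2 S(q, \omega) \tag{1147} $$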
Thus, it is seen that the long wavelength electron density fluctuations are mainly responsible for scattering the incident charged particle. For non-interacting electrons, $S(q, \omega)$ is evaluated as
$$ S(q, \omega) = 2 \sum_{|k| < k_F , \; |k+q| > k_F} \delta \left( \hbar \omega + E_k - E_{k+q} \right) \tag{1148} $$
where the summation over $k$ is over the filled Fermi-sphere, subject to the restriction that the final state be allowed by the Pauli exclusion principle. The inelastic scattering cross-section can be evaluated in terms of the scattering rate, and is found to be given by the expression
$$ \frac{d^2 \sigma}{d\Omega \, d\omega} = \frac{k'}{k} \left( \frac{2 m_q e^2}{\hbar^2 q^2} \right)^2 \hbar \, S(q, \omega) \tag{1149} $$
where $m_q$ is the mass of the charged particle. Thus, in the Born approximation the scattering cross-section is directly related to the density-density correlation function. This type of correlation function was first introduced by van Hove in the context of neutron scattering (L. van Hove, Phys. Rev. 95, 249 (1954)). The Fluctuation-Dissipation Theorem (H.B. Callen and T.A. Welton, Phys. Rev. 83, 34 (1951)) relates the spectrum of electron density fluctuations to the imaginary part of the dielectric constant. At finite temperatures this relation has the form
$$ S(q, \omega) = \frac{q^2 V}{4 \pi^2 e^2} \, \frac{1}{\exp[ - \beta \hbar \omega ] - 1} \, \mathrm{Im} \left[ \frac{1}{\varepsilon(q, \omega + i\delta)} \right] \tag{1150} $$
The relation between $S(q, \omega)$ and the inverse dielectric constant can be seen through the following classical argument. The power, per unit volume, dissipated by the electromagnetic field of the charged particle is given by
$$ P(\mathbf{r}, t) = \frac{1}{4 \pi} \, \mathbf{E} \cdot \frac{\partial \mathbf{D}}{\partial t} \tag{1151} $$
For a negatively charged particle travelling with velocity $\mathbf{v}(t)$, the displacement field $\mathbf{D}(\mathbf{r}, t)$ is the experimentally controllable quantity and is given by the expression
$$ \mathbf{D}(\mathbf{r}, t) = - \nabla \left[ - \frac{| e |}{| \mathbf{r} - \mathbf{r}(t) |} \right] \tag{1152} $$
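On Fourier transforming $\mathbf{D}(\mathbf{r}, t)$ with respect to space and time one finds $\mathbf{D}(q, \omega)$. However, $\mathbf{D}(q, \omega)$ is related to the Fourier transform of the electric field $\mathbf{E}(q, \omega)$ via a factor of the dielectric constant
$$ \mathbf{E}(q, \omega) = \frac{\mathbf{D}(q, \omega)}{\varepsilon(q, \omega + i\delta)} \tag{1153} $$
On Fourier transforming the expression for the power density, $P(\mathbf{r}, t)$, with respect to $\mathbf{r}$ and $t$, one finds $P(q, \omega)$ to be given by
$$ P(q, \omega) = - \frac{\omega}{8 \pi} \, \mathrm{Im} \left[ \frac{1}{\varepsilon(q, \omega + i\delta)} \right] \left| \mathbf{D}(q, \omega) \right|^2 = \frac{\omega}{8 \pi} \left[ \frac{\mathrm{Im}\,\varepsilon(q, \omega + i\delta)}{| \varepsilon(q, \omega + i\delta) |^2} \right] \left| \mathbf{D}(q, \omega) \right|^2 $$
$$ = \frac{\omega}{8 \pi} \left[ \frac{\mathrm{Im}\,\varepsilon(q, \omega + i\delta)}{( \mathrm{Re}\,\varepsilon(q, \omega) )^2 + ( \mathrm{Im}\,\varepsilon(q, \omega + i\delta) )^2} \right] \left| \mathbf{D}(q, \omega) \right|^2 \tag{1154} $$
This result implies that the zero in the real part of $\varepsilon(q, \omega)$ should show up as a delta function peak in the power loss.
------------------------------------------------------------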
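The statement that a zero of $\mathrm{Re}\,\varepsilon$ appears as a peak in the loss can be illustrated numerically. The sketch below assumes a Drude-type model dielectric function, $\varepsilon(\omega) = 1 - \omega_p^2 / [ \omega ( \omega + i/\tau ) ]$, at $q = 0$; the model choice, function names and parameter values are all assumptions made for illustration. For weak damping the loss function $- \mathrm{Im}[ 1/\varepsilon ]$ is sharply peaked close to $\omega_p$.

```python
def drude_epsilon(omega, omega_p=1.0, tau=50.0):
    """Model dielectric function (assumed Drude form), frequencies in units of omega_p."""
    return 1.0 - omega_p ** 2 / (omega * (omega + 1j / tau))

def loss_function(omega, omega_p=1.0, tau=50.0):
    """Energy-loss function -Im[1/eps], which controls the power loss in (1154)."""
    return -(1.0 / drude_epsilon(omega, omega_p, tau)).imag

# Scan a frequency grid around omega_p and locate the maximum of the loss function
grid = [0.5 + 0.001 * j for j in range(1001)]
peak = max(grid, key=loss_function)

if __name__ == "__main__":
    print("loss peak at omega =", peak, "with height", loss_function(peak))
```

As the relaxation time in this model grows, the peak narrows and its height increases, which is the numerical counterpart of the delta function limit.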
11.3.3 Exercise 51
Use linear response theory to express the change in the electron density induced by an external charge. Hence, express the inverse of the dielectric constant in terms of the exact eigenstates and energy eigenvalues of the interacting many-electron system. Use the resulting expression to find the $T = 0$ form of the fluctuation-dissipation theorem (P. Nozières and D. Pines, Nuovo Cimento, 9, 470 (1958)).
------------------------------------------------------------
Solution
In the Coulomb gauge, the Fourier transform of the external charge density $\rho_{ext}(\mathbf{r}, t)$ is related to the external potential via Poisson's equation
$$ - q^2 \phi_{ext}(q, \omega) = - 4 \pi | e | \rho_{ext}(q, \omega) \tag{1155} $$
The total field is related to the external charge density and the induced charge density via
$$ - q^2 \phi(q, \omega) = - 4 \pi | e | \rho_{ext}(q, \omega) - 4 \pi | e | \rho_{ind}(q, \omega) \tag{1156} $$
The dielectric constant is defined as
$$ \frac{1}{\varepsilon(q, \omega)} = \frac{\phi(q, \omega)}{\phi_{ext}(q, \omega)} \tag{1157} $$
which can be expressed as
$$ \frac{1}{\varepsilon(q, \omega)} = 1 + \frac{\rho_{ind}(q, \omega)}{\rho_{ext}(q, \omega)} \tag{1158} $$
Hence, the linear response relation can be expressed as
$$ \rho_{ind}(q, \omega) = \left[ \frac{1}{\varepsilon(q, \omega)} - 1 \right] \rho_{ext}(q, \omega) = \left[ \frac{1}{\varepsilon(q, \omega)} - 1 \right] \frac{q^2}{4 \pi | e |} \, \phi_{ext}(q, \omega) = \left[ 1 - \frac{1}{\varepsilon(q, \omega)} \right] \frac{q^2}{4 \pi e^2} \, \delta V(q, \omega) \tag{1159} $$
However, the interaction operator is given by
$$ \hat{H}_{int}(t) = \int d^3r \; \hat{\rho}(\mathbf{r}) \, \delta V(\mathbf{r}, t) \tag{1160} $$
where the electron density is given by
$$ \hat{\rho}(\mathbf{r}) = \sum_i \delta^3( \mathbf{r} - \mathbf{r}_i ) \tag{1161} $$
The induced electron density is evaluated using linear perturbation theory, in the interaction representation. In the absence of the perturbation, the ground state is denoted by $| \Psi_0 >$. The perturbation is turned on adiabatically, and the ground state evolves to the state $| \Phi_0(t) >$ which, to first order in the interaction, is given by
$$ | \Phi_0(t) > = \left[ 1 - \frac{i}{\hbar} \int_{-\infty}^{t} dt' \, \hat{H}_{int}(t') \right] | \Psi_0 > \tag{1162} $$
The induced electron density $\rho_{ind}(\mathbf{r}, t)$ is then given by the expectation value of the commutator
$$ \rho_{ind}(\mathbf{r}, t) = - \frac{i}{\hbar} \int_{-\infty}^{t} dt' < \Psi_0 | \, [ \, \hat{\rho}(\mathbf{r}, t) , \hat{H}_{int}(t') \, ] \, | \Psi_0 > $$
$$ = - \frac{i}{\hbar} \int_{-\infty}^{t} dt' \int d^3r' < \Psi_0 | \, [ \, \hat{\rho}(\mathbf{r}, t) , \hat{\rho}(\mathbf{r}', t') \, ] \, | \Psi_0 > \delta V(\mathbf{r}', t') $$
$$ = \int_{-\infty}^{\infty} dt' \int d^3r' \; \chi(\mathbf{r} - \mathbf{r}', t - t') \, \delta V(\mathbf{r}', t') \tag{1163} $$
The material is assumed to be homogeneous; therefore, the response function is only a function of the spatial separation $\mathbf{r} - \mathbf{r}'$. Furthermore, since $\hat{H}_0$ is independent of time, the response function is only a function of $t - t'$. On Fourier transforming this equation with respect to space and time, one finds
$$ \rho_{ind}(q, \omega) = \chi(q, \omega) \, \delta V(q, \omega) \tag{1164} $$
where
$$ \chi(q, \omega) = V \sum_n \left[ \frac{| < \Psi_0 | \hat{\rho}_q | \Psi_n > |^2}{\hbar \omega + i \delta + E_n - E_0} + \frac{| < \Psi_0 | \hat{\rho}_q | \Psi_n > |^2}{- \hbar \omega - i \delta + E_n - E_0} \right] \tag{1165} $$
The imaginary part of the response function is found as
$$ \mathrm{Im}\,\chi(q, \omega) = - \pi V \sum_n | < \Psi_0 | \hat{\rho}_q | \Psi_n > |^2 \left[ \delta( \hbar \omega + E_n - E_0 ) - \delta( \hbar \omega - E_n + E_0 ) \right] \tag{1166} $$
Thus, the zero temperature limit of the fluctuation-dissipation theorem has the form
$$ \mathrm{Im} \left[ \frac{1}{\varepsilon(q, \omega)} \right] = \frac{4 \pi^2 e^2 V}{q^2} \sum_n | < \Psi_0 | \hat{\rho}_q | \Psi_n > |^2 \left[ \delta( \hbar \omega + E_n - E_0 ) - \delta( \hbar \omega - E_n + E_0 ) \right] = \frac{4 \pi^2 e^2}{q^2 V} \left[ S(q, -\omega) - S(q, \omega) \right] \tag{1167} $$
The first term is only non-zero if $0 > \omega$, and the second term is only non-zero in the range $\omega > 0$.
------------------------------------------------------------
11.3.4 Exercise 52
Show, using classical electromagnetic theory, that the power loss spectrum of a particle with charge $e$ moving with velocity $v$, due to plasmons, can be expressed as
$$ P(\omega) = - \frac{2 e^2}{\pi v} \, \mathrm{Im} \left[ \frac{\omega}{\varepsilon(\omega + i\delta)} \right] \ln \left( \frac{q_0 v}{\omega} \right) \tag{1168} $$
Assume that the dielectric constant is independent of $q$ for $q < q_0$, where $\hbar q_0$ is the maximum momentum transfer.
------------------------------------------------------------
Solution 52
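The average power $P$ dissipated by the charged particle can be expressed as the limit $\tau \to \infty$ of
$$ P = \frac{1}{\tau} \int_0^{\infty} dt \; P(t) \exp \left[ - \frac{t}{\tau} \right] = \frac{1}{\tau} \int_0^{\infty} dt \int d^3r \; P(\mathbf{r}, t) \exp \left[ - \frac{t}{\tau} \right] = \frac{1}{4 \pi \tau} \int_0^{\infty} dt \int d^3r \; \mathbf{E}(\mathbf{r}, t) \cdot \frac{\partial}{\partial t} \mathbf{D}(\mathbf{r}, t) \exp \left[ - \frac{t}{\tau} \right] \tag{1169} $$
where we have inserted an exponential convergence factor. The convergence factor will be absorbed in the displacement and electric fields. The Fourier transform is expressed as
$$ \mathbf{D}(q, \omega) = \frac{1}{V} \int d^3r \int_{-\infty}^{\infty} dt \; \mathbf{D}(\mathbf{r}, t) \exp \left[ - i \, ( \mathbf{q}\cdot\mathbf{r} - \omega t ) \right] \tag{1170} $$
and the inverse Fourier transformation is given by
$$ \mathbf{D}(\mathbf{r}, t) = \frac{V}{( 2 \pi )^3} \int d^3q \int \frac{d\omega}{2 \pi} \; \mathbf{D}(q, \omega) \exp \left[ + i \, ( \mathbf{q}\cdot\mathbf{r} - \omega t ) \right] \tag{1171} $$
On inserting the expressions for the inverse Fourier transforms into the expression for the average power loss, one finds
$$ P = - \frac{i V^2}{2 \tau ( 2 \pi )} \int_{-\infty}^{\infty} \frac{d\omega}{2 \pi} \int \frac{d^3q}{( 2 \pi )^3} \; \mathbf{E}(-q, -\omega) \cdot \omega \, \mathbf{D}(q, \omega) $$
$$ = \frac{i V^2}{2 \tau ( 2 \pi )} \int_{-\infty}^{\infty} \frac{d\omega}{2 \pi} \int \frac{d^3q}{( 2 \pi )^3} \; \omega \, \mathbf{D}^*(q, \omega) \cdot \frac{\mathbf{D}(q, \omega)}{\varepsilon(q, \omega + \frac{i}{2 \tau})} $$
$$ = \frac{i V^2}{2 \tau ( 2 \pi )} \int_{-\infty}^{\infty} \frac{d\omega}{2 \pi} \int \frac{d^3q}{( 2 \pi )^3} \; \frac{\omega}{\varepsilon(q, \omega + \frac{i}{2 \tau})} \, | \mathbf{D}(q, \omega) |^2 \tag{1172} $$
in the limit $\tau \to \infty$. The Fourier transform of the displacement field is given by
$$ \mathbf{D}(q, \omega) = \frac{1}{V} \int d^3r \int_0^{\infty} dt \; \exp \left[ - i \, ( \mathbf{q}\cdot\mathbf{r} - \omega t ) - \frac{t}{2 \tau} \right] \nabla \frac{| e |}{| \mathbf{r} - \mathbf{v} t |} $$
$$ = - \frac{4 \pi i \, \mathbf{q} \, | e |}{V q^2} \int_0^{\infty} dt \; \exp \left[ i \, ( \omega - \mathbf{q}\cdot\mathbf{v} ) \, t - \frac{t}{2 \tau} \right] = - \frac{4 \pi i \, \mathbf{q} \, | e |}{V q^2} \; \frac{2 \tau}{1 - i \, ( \omega - \mathbf{q}\cdot\mathbf{v} ) \, 2 \tau} \tag{1173} $$
Hence,
$$ P = \frac{4 e^2}{2 \tau ( 2 \pi )^3} \int_{-\infty}^{\infty} d\omega \int d^3q \; \frac{i \omega}{q^2 \, \varepsilon(q, \omega + \frac{i}{2 \tau})} \; \frac{1}{\frac{1}{4 \tau^2} + ( \omega - \mathbf{q}\cdot\mathbf{v} )^2} \tag{1174} $$
which in the limit $\tau \to \infty$ reduces to
$$ P = \frac{2 e^2}{( 2 \pi )^2} \int_{-\infty}^{\infty} d\omega \int d^3q \; \frac{i \omega}{q^2 \, \varepsilon(q, \omega + i\delta)} \; \delta( \omega - \mathbf{q}\cdot\mathbf{v} ) \tag{1175} $$
This yields the expression
$$ P = \frac{2 e^2}{( 2 \pi )} \int_{-\infty}^{\infty} d\omega \int \frac{dq}{q v} \; \frac{i \omega}{\varepsilon(q, \omega + i\delta)} \left[ \theta( \omega + q v ) - \theta( \omega - q v ) \right] \tag{1176} $$
Hence, on assuming that the dielectric constant is independent of $q$ from $q = 0$ to an upper cut-off $q = q_0$, one obtains the result
$$ P = - \frac{e^2}{\pi v} \int_{-\infty}^{\infty} d\omega \; \frac{i \omega}{\varepsilon(\omega + i\delta)} \ln \left( \frac{q_0 v}{\omega} \right) \tag{1177} $$
The power loss spectrum, $P(\omega)$, is defined in terms of an integral over positive frequencies
$$ P = \int_0^{\infty} d\omega \; P(\omega) \tag{1178} $$
On using the symmetry properties of the dielectric constant under the transformation $\omega \to - \omega$, one finds that the contribution from the real part of the inverse dielectric constant vanishes. Hence, one obtains the final result for $P(\omega)$
$$ P(\omega) = - \frac{2 e^2}{\pi v} \, \mathrm{Im} \left[ \frac{\omega}{\varepsilon(\omega + i\delta)} \right] \ln \left( \frac{q_0 v}{\omega} \right) \tag{1179} $$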
------------------------------------------------------------
In a typical experiment, monochromatic electron beams with energies $E(k)$ of the order of keV are incident on thin films, and the energies of the scattered electrons, $E(k')$, are analyzed (L. Marton, J.A. Simpson, H.A. Fowler and N. Swanson, Phys. Rev. 126, 182 (1962)). Experimentally, it is found that the fast electron loses energy in almost exact multiples of $\hbar \omega_p$. That is, the energy loss spectrum shows peaks separated by energies which are multiples of $\hbar \omega_p$. The above analysis predicts a single pole near $\omega = \omega_p$. The discrepancy is caused by the use of the Born approximation, which neglects the effects of multiple scattering. The experiments are usually analyzed by fitting the intensities of the peaks to a Poisson distribution
$$ I_n = \frac{1}{n!} \left( \frac{L}{\lambda} \right)^n \exp \left[ - \frac{L}{\lambda} \right] I_0 \tag{1180} $$
where $I_n$ is the intensity of the $n$-th plasmon peak, $L$ is the sample thickness and $\lambda$ is the mean free path. The mean free path is then compared with the theoretically derived inelastic scattering cross-section. The mean free path can be estimated from the scattering cross-section. On expressing the density-density correlation function in terms of the imaginary part of the inverse of the dielectric constant, one finds
$$ \frac{d^2 \sigma}{d\Omega \, d\omega} = - \frac{k'}{k} \left( \frac{m_q e}{\pi \hbar^2 q} \right)^2 \hbar \, V \, \mathrm{Im} \left[ \frac{1}{\varepsilon(q, \omega + i\delta)} \right] \tag{1181} $$
For the frequency range
$$ 2 m \omega > 2 \hbar q k_F + \hbar q^2 \tag{1182} $$
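the imaginary part of $\chi(q, \omega + i\delta)$ vanishes as $\delta \to 0$. Therefore, in this limit, the imaginary part of the dielectric constant also vanishes
$$ \mathrm{Im}\,\varepsilon(q, \omega + i\delta) \to \kappa \, \delta \tag{1183} $$
for some finite value of $\kappa$. Hence, in this limit one has
$$ \mathrm{Im} \left[ \frac{1}{\varepsilon(q, \omega + i\delta)} \right] = - \pi \, \delta( \, \mathrm{Re}\,\varepsilon(q, \omega) \, ) \tag{1184} $$
Furthermore, in this region of $(\omega, q)$ space one has the approximate expression
$$ \varepsilon(q, \omega) = 1 - \frac{\omega_p^2(q)}{\omega^2} \tag{1185} $$
where the plasmon dispersion relation is given by
$$ \omega_p^2(q) = \omega_p^2 + \frac{3}{5} \, q^2 v_F^2 + \ldots \tag{1186} $$
and the plasmon frequency by
$$ \omega_p^2 = \frac{4 \pi \rho e^2}{m} \tag{1187} $$
Thus, one finds the single (plasmon) pole approximation for the inverse dielectric constant
$$ \mathrm{Im} \left[ \frac{1}{\varepsilon(q, \omega + i\delta)} \right] = - \pi \, \delta \left( \frac{\omega^2 - \omega_p^2(q)}{\omega^2} \right) = - \pi \, \frac{\omega^2}{\omega + \omega_p(q)} \, \delta( \omega - \omega_p(q) ) = - \frac{\pi}{2} \, \omega_p(q) \, \delta( \omega - \omega_p(q) ) \tag{1188} $$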
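for positive $\omega$. The plasmon contribution to the differential scattering cross-section is found by integrating over the energy loss $\omega$ and is given by
$$ \frac{d\sigma}{d\Omega} = \frac{k'}{k} \, \frac{2 m_q^2 \rho V}{m \hbar \omega_p} \left( \frac{e^2}{\hbar q} \right)^2 \tag{1189} $$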
Thus, the scattering cross-section is directly proportional to the number of electrons in the solid. In deriving the differential scattering cross-section, we have neglected the $q$ dependence of the plasmon dispersion relation. The total plasmon scattering cross-section is found by integrating the differential cross-section over the scattering angle $\theta$. We note that energy and momentum conservation lead to the two conditions
$$ q^2 = k^2 + k'^2 - 2 k k' \cos \theta = ( k - k' )^2 + 4 k k' \sin^2 \frac{\theta}{2} \tag{1190} $$
and
$$ ( k - k' ) ( k + k' ) = \frac{2 m_q \omega_p}{\hbar} , \qquad ( k - k' ) = \frac{2 m_q \omega_p}{\hbar \, ( k + k' )} \tag{1191} $$
For small scattering angles, $\theta \ll 1$, these can be combined to yield
$$ q^2 \approx k^2 \left[ \frac{\hbar^2 \omega_p^2}{4 E(k)^2} + 4 \sin^2 \frac{\theta}{2} \right] \approx k^2 \left[ \theta^2 + \theta_0^2 \right] \tag{1192} $$
with $\theta_0 = \hbar \omega_p / 2 E(k)$. Hence, one has
$$ \frac{d\sigma}{d\Omega} \approx \frac{k'}{k} \, \frac{2 m_q^2 \rho V}{m \hbar \omega_p} \, \frac{e^4}{\hbar^2 k^2 ( \theta^2 + \theta_0^2 )} \approx \frac{m_q \, e^4 \rho V}{m \, \hbar \omega_p \, E(k) \, ( \theta^2 + \theta_0^2 )} \tag{1193} $$
On integrating over the solid angle $d\Omega$, but restricting the range of $\theta$ from zero to a maximum momentum transfer given by $2 k_F \sim \theta_m k$, one finds the total cross-section, $\sigma$, for plasmon scattering is given by
$$ \sigma \approx 2 \pi \, \frac{e^4 \rho V}{\hbar \omega_p \, E(k)} \, \frac{m_q}{m} \, \ln \left( \frac{\theta_m}{\theta_0} \right) \tag{1194} $$
The mean free path, $\lambda$, is then found by noting that a trajectory of cross-sectional area $\sigma$ covers a volume $\lambda \sigma$ between consecutive collisions, which must equal $V$, the volume of the solid. This leads to the mean free path being given by
$$ \lambda^{-1} \approx 2 \pi \, \frac{e^4 \rho}{\hbar \omega_p \, E(k)} \, \frac{m_q}{m} \, \ln \left( \frac{\theta_m}{\theta_0} \right) \tag{1195} $$
Thus, the mean free path depends linearly on the kinetic energy of the incident electron. This value has been found to track the mean free path obtained by fitting the measured intensities of the multi-plasmon peaks.
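The Poisson analysis of the multi-plasmon peaks, (1180), can be sketched as follows. The function names and the simple ratio-based estimator are hypothetical choices made for illustration; an actual analysis would fit measured intensities with their uncertainties.

```python
import math

def plasmon_peak_intensities(thickness, mean_free_path, n_max, i0=1.0):
    """Intensities I_0 ... I_{n_max} of the multi-plasmon peaks from the Poisson form (1180)."""
    ratio = thickness / mean_free_path
    return [i0 * ratio ** n * math.exp(-ratio) / math.factorial(n)
            for n in range(n_max + 1)]

def mean_free_path_from_peaks(thickness, intensities):
    """Estimate lambda from successive peak ratios, using I_{n+1}/I_n = (L/lambda)/(n+1).

    Averaging the individual estimates is an illustrative choice, not a prescribed method.
    """
    estimates = [(n + 1) * intensities[n + 1] / intensities[n]
                 for n in range(len(intensities) - 1)]
    ratio = sum(estimates) / len(estimates)
    return thickness / ratio

if __name__ == "__main__":
    # Hypothetical film: thickness and mean free path in the same (arbitrary) length unit
    peaks = plasmon_peak_intensities(1000.0, 400.0, 4)
    print(peaks, mean_free_path_from_peaks(1000.0, peaks))
```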
11.3.5 The Transverse Response
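In the Coulomb or radiation gauge, the vector potential describes the transverse electromagnetic field. It satisfies the equation
$$ \left[ - q^2 + \frac{\omega^2}{c^2} \right] \mathbf{A}(q, \omega) = - \frac{4 \pi}{c} \, \mathbf{j}_T(q, \omega) \tag{1196} $$
We consider the situation in which no transverse external currents are impressed on the system, $\mathbf{j}_{T\,ext}(q, \omega) = 0$. Thus, one obtains the microscopic equation
$$ \left[ - q^2 + \frac{\omega^2}{c^2} \right] \mathbf{A}(q, \omega) = - \frac{4 \pi}{c} \, \mathbf{j}_{T\,ind}(q, \omega) \tag{1197} $$
Ohm's law can be expressed in the form
$$ \mathbf{j}_{T\,ind}(q, \omega) = \sigma_T(q, \omega) \, \mathbf{E}_T(q, \omega) \tag{1198} $$
where $\sigma_T$ is the transverse conductivity and the total transverse electric field is given by
$$ \mathbf{E}_T(q, \omega) = i \, \frac{\omega}{c} \, \mathbf{A}(q, \omega) \tag{1199} $$
This leads to
$$ \left[ - q^2 + \frac{\omega^2}{c^2} + i \, \frac{4 \pi \omega}{c^2} \, \sigma_T(q, \omega) \right] \mathbf{A}(q, \omega) = 0 \tag{1200} $$
The transverse dielectric function is identified in terms of the optical conductivity
$$ \varepsilon_T(q, \omega) = 1 + \frac{4 \pi i}{\omega} \, \sigma_T(q, \omega) \tag{1201} $$
The photon dispersion relation can be re-written as
$$ \left( \frac{\omega}{c} \right)^2 \varepsilon_T(q, \omega) = q^2 \tag{1202} $$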
If $\varepsilon_T(q, \omega) > 0$, then there are undamped transverse electromagnetic waves. Otherwise, $q$ would be complex, which implies that $\mathbf{E}_T(\mathbf{r}; t)$ is attenuated as it enters into the sample. In other words, if $\mathrm{Im}\,\varepsilon(q, \omega) = 0$ and $\mathrm{Re}\,\varepsilon(q, \omega) > 0$, the material is transparent to transverse electromagnetic waves. The dispersion relation is given by
$$ \varepsilon_T(q, \omega) = \left( \frac{c q}{\omega} \right)^2 \tag{1203} $$
Thus, the transverse excitations have a completely different character to the longitudinal excitations, especially at large $q$. As $q \to 0$, one expects that $\sigma_L(q, \omega) \to \sigma_T(q, \omega)$ since electrons cannot differentiate between transverse and longitudinal waves in this limit. In this limit the conductivity may be modelled by the complex Drude expression
$$ \sigma(0, \omega) = \frac{\rho e^2 \tau}{m} \, \frac{1}{1 - i \omega \tau} \tag{1204} $$
which leads to the dielectric constant being given by
$$ \varepsilon(0, \omega) = 1 - \frac{4 \pi e^2 \rho}{m \omega^2} \, \frac{1}{1 + \frac{i}{\omega \tau}} \tag{1205} $$
This approximate equality between the longitudinal and transverse dielectric constants implies that the plasmon frequency also sets the frequency scale for the interaction of photons with a metal. The dispersion relation for transverse radiation becomes
$$ \varepsilon(0, \omega) \, \omega^2 = c^2 q^2 \tag{1206} $$
which is given by
$$ \omega^2 - \omega_p^2 = c^2 q^2 \tag{1207} $$
Thus, for $\omega < \omega_p$, the wave will be reflected from a metal. For a typical metal, where $\rho \sim 10^{22}$ electrons/cm$^3$, a typical plasmon frequency is $10^{15}$ sec$^{-1}$. This typical frequency corresponds to a typical wave length of light in vacuum of $\lambda_p \sim 10^{-7}$ m. Incident light with longer wave length will be reflected from the metal. Hence, as $\varepsilon_L(0, \omega) = \varepsilon_T(0, \omega)$, optical experiments that measure $\varepsilon_T(q, \omega)$ produce similar information to characteristic energy loss experiments that determine $\varepsilon_L(q, \omega)$. The transverse conductivity may be evaluated directly by linear response theory. The vector potential couples to the electrons via the interaction
$$ \hat{H}_{int} = \sum_i \frac{| e |}{2 m c} \left[ \hat{\mathbf{p}}_i \cdot \mathbf{A}(\mathbf{r}_i, t) + \mathbf{A}(\mathbf{r}_i, t) \cdot \hat{\mathbf{p}}_i + \frac{| e |}{c} \, \mathbf{A}(\mathbf{r}_i, t)^2 \right] \tag{1208} $$
The interaction contains a paramagnetic contribution that involves a coupling to the momentum density and a diamagnetic contribution that involves a coupling with the density of the charged electrons. The transverse current density $\mathbf{j}(\mathbf{r}, t)$ is the mechanical current density $e \mathbf{v}$ and is given by the sum of a paramagnetic current $\mathbf{j}_p$ and a diamagnetic current $\mathbf{j}_d$
$$ \mathbf{j}(\mathbf{r}, t) = \mathbf{j}_p(\mathbf{r}, t) + \mathbf{j}_d(\mathbf{r}, t) \tag{1209} $$
where the paramagnetic current is given by the symmetric operator
$$ \mathbf{j}_p(\mathbf{r}) = - \frac{| e |}{2 m} \sum_i \left[ \delta^3( \mathbf{r} - \mathbf{r}_i ) \, \hat{\mathbf{p}}_i + \hat{\mathbf{p}}_i \, \delta^3( \mathbf{r} - \mathbf{r}_i ) \right] \tag{1210} $$
and the diamagnetic current is given by
$$ \mathbf{j}_d(\mathbf{r}) = - \frac{| e |^2}{m c} \sum_i \delta^3( \mathbf{r} - \mathbf{r}_i ) \, \mathbf{A}(\mathbf{r}_i) \tag{1211} $$
To linear order in the vector potential, the interaction can be written as
$$ \hat{H}^1_{int} = - \frac{1}{c} \int d^3r \; \mathbf{j}_p(\mathbf{r}) \cdot \mathbf{A}(\mathbf{r}) \tag{1212} $$
The induced paramagnetic current density is then found from linear response theory, in which the ground state is evaluated to first order in the perturbing interaction $\hat{H}^1_{int}$. The components of the induced paramagnetic current are given as a causal convolution of a paramagnetic current - paramagnetic current tensor correlation function and the components of the total vector potential,
$$ j^{\alpha}_{ind\,p}(\mathbf{r}, t) = \frac{1}{c} \sum_{\beta} \int_{-\infty}^{+\infty} dt' \int d^3r' \; R^{\alpha,\beta}_{j,j}(\mathbf{r}, \mathbf{r}', t - t') \, A^{\beta}(\mathbf{r}', t') \tag{1213} $$
where the paramagnetic response function is given by the ground state expectation value
$$ R^{\alpha,\beta}_{j,j}(\mathbf{r}, \mathbf{r}', t - t') = + \frac{i}{\hbar} < \Psi_0 | \left[ \, j^{\alpha}_p(\mathbf{r}, t) , j^{\beta}_p(\mathbf{r}', t') \, \right] | \Psi_0 > \Theta( t - t' ) \tag{1214} $$
This is known as the Kubo formula for the conductivity (R. Kubo, J. Phys. Soc. Jpn. 12, 570 (1957)). The structure of the Kubo formula for the response $R$ is similar to that of the longitudinal response function $\chi$. They both involve the expectation value of a retarded two-time commutator. However, the Lindhard function involves the commutator of the density operator, whereas the Kubo formula involves the current operator. On Fourier transforming the non-local relation between $\mathbf{j}_p$ and $\mathbf{A}$ with respect to space and time, one has
$$ j^{\alpha}_{ind\,p}(q, \omega) = \frac{1}{c} \sum_{\beta} R^{\alpha,\beta}_{j,j}(q, \omega) \, A^{\beta}(q, \omega) \tag{1215} $$
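Hence, to linear order in the vector potential, the total transverse current is given by
$$ j^{\alpha}_{ind}(q, \omega) = \sum_{\beta} \left[ \frac{1}{c} \, R^{\alpha,\beta}_{j,j}(q, \omega) - \delta_{\alpha,\beta} \, \frac{| e |^2}{m c} \, \rho_0 \right] A^{\beta}(q, \omega) \tag{1216} $$
where it is assumed that the electron density is uniform and is given by $\rho_0$. The transverse conductivity is then found with the aid of the relation between the transverse electric field and the vector potential
$$ \mathbf{E}_T(q, \omega) = i \, \frac{\omega}{c} \, \mathbf{A}(q, \omega) \tag{1217} $$
as
$$ \sigma^{\alpha,\beta}_T(q, \omega) = \frac{1}{i \omega} \left[ R^{\alpha,\beta}_{j,j}(q, \omega) - \delta_{\alpha,\beta} \, \frac{| e |^2 \rho_0}{m} \right] \tag{1218} $$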
The conductivity should be evaluated using a microscopic theory, and has a real part and an imaginary part that are connected by causality. The conductivity determines the material's properties and how transverse electromagnetic radiation or light interacts with the electrons in the metal. The energy loss due to a longitudinal field is related to the inverse of the dielectric constant, but the energy loss of a transverse field is related to the conductivity or the imaginary part of the dielectric constant. This can be seen from the expression for the time averaged dissipated power density
$$ P(q, \omega) = \frac{\omega}{4 \pi} \, \mathrm{Im}\,\varepsilon(q, \omega + i\delta) \, | \mathbf{E}_T(q, \omega) |^2 = \frac{\omega^3}{4 \pi c^2} \, \mathrm{Im}\,\varepsilon(q, \omega + i\delta) \, | \mathbf{A}(q, \omega) |^2 \tag{1219} $$
As the imaginary part of the dielectric constant is related to the real part of the conductivity,
$$ \mathrm{Im}\,\varepsilon(q, \omega + i\delta) = \frac{4 \pi}{\omega} \, \mathrm{Re}\,\sigma(q, \omega) \tag{1220} $$
the absorption of light measures the conductivity.
11.3.6 Optical Experiments
The optical conductivity can be measured in optical absorption and reflection experiments. The wave vector of light in the medium is given by the complex number
$$ q = \frac{\omega}{c} \left( 1 + \frac{4 \pi i \, \sigma(\omega)}{\omega} \right)^{\frac{1}{2}} = \frac{\omega}{c} \left( n + i \kappa \right) \tag{1221} $$
and this has the effect that the intensity of light is exponentially attenuated as it passes through the material
$$ \mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 \exp \left[ i \omega \left( \frac{n z}{c} - t \right) \right] \exp \left[ - \frac{\kappa \omega z}{c} \right] \tag{1222} $$
Experiments measure the absorption coefficient $\eta$, which is the fraction of light absorbed in passing through unit thickness of the material
$$ \eta = \frac{\mathrm{Re}\; \mathbf{j} \cdot \mathbf{E}}{n \, | \mathbf{E} |^2} = 2 \, \frac{\kappa \omega}{c} \tag{1223} $$
Another experimental method (ellipsometry) measures the reflectance of light. This involves measuring the ratio of the reflected to the incident intensities, and gives rise to the real reflection coefficient. At oblique incidence, with angle of incidence $\theta$, one distinguishes between s and p polarized light. The s polarized light has its polarization perpendicular to the plane of incidence and the p polarized light has its polarization parallel to the plane of incidence. The reflectances are given in terms of the complex refractive index $\tilde{n} = n + i \kappa$, via the Fresnel formulas
$$ R_s(\theta) = \frac{\cos \theta - ( \tilde{n}^2 - \sin^2 \theta )^{\frac{1}{2}}}{\cos \theta + ( \tilde{n}^2 - \sin^2 \theta )^{\frac{1}{2}}} \tag{1224} $$
and
$$ R_p(\theta) = \frac{\tilde{n}^2 \cos \theta - ( \tilde{n}^2 - \sin^2 \theta )^{\frac{1}{2}}}{\tilde{n}^2 \cos \theta + ( \tilde{n}^2 - \sin^2 \theta )^{\frac{1}{2}}} \tag{1225} $$
The complex refractive index can then be inferred from measurements of $R_s(\theta)$ and $R_p(\theta)$. However, it is usual to infer the real part from the imaginary part via the Kramers-Kronig relation (H.A. Kramers, Nature 117, 775 (1926), R. de L. Kronig, J. Opt. Soc. Am., 12 547 (1926)).
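The Fresnel ratios (1224) and (1225) are straightforward to evaluate for a complex refractive index. The sketch below takes the measured reflectance to be the squared modulus of each ratio; that normalisation, as well as the function names, are assumptions made for this illustration.

```python
import cmath
import math

def fresnel_ratios(n_complex, theta):
    """Return the s and p ratios of (1224) and (1225) for angle of incidence theta (radians)."""
    root = cmath.sqrt(n_complex ** 2 - math.sin(theta) ** 2)
    cos_t = math.cos(theta)
    r_s = (cos_t - root) / (cos_t + root)
    r_p = (n_complex ** 2 * cos_t - root) / (n_complex ** 2 * cos_t + root)
    return r_s, r_p

def reflectances(n_complex, theta):
    """Intensity reflectances, assumed here to be the squared moduli of the Fresnel ratios."""
    r_s, r_p = fresnel_ratios(n_complex, theta)
    return abs(r_s) ** 2, abs(r_p) ** 2

if __name__ == "__main__":
    # A purely imaginary index (negative real dielectric constant, no damping) reflects fully
    print(reflectances(cmath.sqrt(-3.0), 0.3))
    # A transparent medium with real index 1.5 at normal incidence
    print(reflectances(1.5, 0.0))
```

A quick sanity check of the p-polarized formula is that, for a real index, the reflectance vanishes at the Brewster angle $\tan \theta_B = n$.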
11.3.7 Kramers-Kronig Relation
Causality requires that the frequency be continued in the upper half complex plane, $\omega + i \delta$, in the response functions. This has the consequence that the response function is analytic in the upper half complex plane. Also, it is required that the integrand vanishes over a semi-circular contour at infinity in the upper half complex plane. With these restrictions one can consider the Cauchy integral
$$ \varepsilon(q, \omega + i\delta) - 1 = \frac{1}{2 \pi i} \oint_C dz \; \frac{\varepsilon(q, z) - 1}{z - \omega - i\delta} \tag{1226} $$
where the contour is taken around the point $z = \omega + i\delta$. If $\varepsilon(q, z)$ does not have a pole at $z = 0$, the contour of integration can be deformed to the real axis and an infinite semi-circular contour in the upper half complex plane. In this case, one finds
$$ \varepsilon(q, \omega + i\delta) - 1 = \frac{1}{\pi i} \, \mathrm{Pr} \int_{-\infty}^{+\infty} dz \; \frac{\varepsilon(q, z) - 1}{z - \omega} \tag{1227} $$
in which the contribution of the small semi-circle around the pole at $z = \omega + i \delta$ has been cancelled out. On writing
$$ \varepsilon(q, z) = \mathrm{Re}\,\varepsilon(q, z) + i \, \mathrm{Im}\,\varepsilon(q, z) \tag{1228} $$
one finds
$$ \mathrm{Re}\,\varepsilon(q, \omega + i\delta) - 1 = \frac{1}{\pi} \, \mathrm{Pr} \int_{-\infty}^{+\infty} dz \; \frac{\mathrm{Im}\,\varepsilon(q, z)}{z - \omega} \tag{1229} $$
and
$$ \mathrm{Im}\,\varepsilon(q, \omega + i\delta) = - \frac{1}{\pi} \, \mathrm{Pr} \int_{-\infty}^{+\infty} dz \; \frac{\mathrm{Re}\,\varepsilon(q, z) - 1}{z - \omega} \tag{1230} $$
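These relations can be recast in the form
$$ \mathrm{Re}\,\varepsilon(q, \omega + i\delta) - 1 = \frac{2}{\pi} \, \mathrm{Pr} \int_0^{\infty} dz \; \frac{z \, \mathrm{Im}\,\varepsilon(q, z)}{z^2 - \omega^2} \tag{1231} $$
and
$$ \mathrm{Im}\,\varepsilon(q, \omega + i\delta) = - \frac{2 \omega}{\pi} \, \mathrm{Pr} \int_0^{\infty} dz \; \frac{\mathrm{Re}\,\varepsilon(q, z)}{z^2 - \omega^2} \tag{1232} $$
These are the Kramers-Kronig relations (H.A. Kramers, Nature 117, 775 (1926), R. de L. Kronig, J. Opt. Soc. Am., 12 547 (1926)). They can be used to analyze experimental data or as consistency checks.
------------------------------------------------------------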
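As a consistency check of the kind mentioned above, relation (1231) can be tested numerically on a model dielectric function. The sketch below assumes a single damped Lorentz oscillator, which is analytic in the upper half plane; the model, its parameters and the function names are assumptions. The principal value is handled by subtracting the singular part, using $\mathrm{Pr} \int_0^Z dz / ( z^2 - \omega^2 ) = \frac{1}{2 \omega} \ln [ ( Z - \omega ) / ( Z + \omega ) ]$.

```python
import math

def eps_lorentz(omega, omega_p=1.0, omega_0=1.0, gamma=0.2):
    """Model dielectric function (assumed Lorentz oscillator), analytic for Im(omega) > 0."""
    return 1.0 + omega_p ** 2 / (omega_0 ** 2 - omega ** 2 - 1j * gamma * omega)

def kk_real_part(omega, im_eps, z_max=100.0, steps=200000):
    """Re eps(omega) - 1 from Im eps via (1231), for 0 < omega < z_max.

    The singularity at z = omega is removed by subtracting omega * Im eps(omega)
    from the numerator; the subtracted piece is then restored analytically.
    """
    h = z_max / steps
    f_omega = omega * im_eps(omega)
    total = 0.0
    for j in range(steps):
        z = (j + 0.5) * h                      # midpoint rule
        denom = z * z - omega * omega
        if denom == 0.0:                       # skip an exact hit on the removed singularity
            continue
        total += (z * im_eps(z) - f_omega) / denom
    total *= h
    # Analytic principal-value integral of the subtracted term over (0, z_max)
    total += 0.5 * im_eps(omega) * math.log((z_max - omega) / (z_max + omega))
    return 2.0 / math.pi * total

if __name__ == "__main__":
    w = 0.7
    print(kk_real_part(w, lambda z: eps_lorentz(z).imag), eps_lorentz(w).real - 1.0)
```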
11.3.8 Exercise 53
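Derive the form of the Kramers-Kronig relation for the imaginary part of the dielectric constant $\varepsilon(q, \omega)$
$$ \mathrm{Im}\,\varepsilon(q, \omega) = \frac{4 \pi \, \sigma(q, 0)}{\omega} - \frac{2 \omega}{\pi} \, \mathrm{Pr} \int_0^{\infty} dz \; \frac{\mathrm{Re}\,\varepsilon(q, z)}{z^2 - \omega^2} \tag{1233} $$
valid for a material which has a finite d.c. conductivity $\sigma(q, 0)$.
------------------------------------------------------------
Another sum rule, the optical sum rule, is stated as
$$ \int_0^{\infty} d\omega \; \omega \; \mathrm{Im}\,\varepsilon_T(q, \omega + i\delta) = \frac{\pi}{2} \, \omega_p^2 \tag{1234} $$
The optical sum rule can be derived by exact methods. However, it can also be proved by noting that at high frequencies
$$ \lim_{\omega \to \infty} \varepsilon_T(q, \omega) = 1 - \frac{\omega_p^2}{\omega^2} \tag{1235} $$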
On expressing the imaginary part of the dielectric constant in terms of the real part, one can verify the sum rule using contour integration. A more usual form of the optical sum rule is stated in terms of a sum rule for the conductivity
$$ \int_0^{\infty} d\omega \; \sigma(0, \omega) = \frac{\pi}{2} \, \frac{\rho e^2}{m} \tag{1236} $$
where $\rho$ is the electron density. Kramers-Kronig relations and sum rules can be established for a variety of response functions (P.C. Martin, Phys. Rev. 161, 143 (1967)). Since the inverse dielectric constant is the longitudinal response function, $1/\varepsilon(q, \omega) - 1$ also satisfies a Kramers-Kronig relation.
------------------------------------------------------------
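The conductivity sum rule (1236) can be checked numerically for the Drude form $\sigma(0, \omega) = ( \rho e^2 \tau / m ) / ( 1 - i \omega \tau )$. In the sketch below only the real (dissipative) part of $\sigma$ is integrated, which is an assumption about how (1236) is to be read; the function names, the parameter values and the variable substitution $\omega = x / ( 1 - x )$ are illustrative choices.

```python
import math

def drude_re_sigma(omega, rho_e2_over_m=1.0, tau=2.0):
    """Real part of the Drude conductivity; rho_e2_over_m stands for rho e^2 / m."""
    sigma = rho_e2_over_m * tau / (1.0 - 1j * omega * tau)
    return sigma.real

def integrate_half_line(func, steps=20000):
    """Integrate func over (0, infinity) with the midpoint rule after mapping omega = x/(1-x)."""
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) / steps
        omega = x / (1.0 - x)
        total += func(omega) / (1.0 - x) ** 2   # Jacobian d(omega)/dx = 1/(1-x)^2
    return total / steps

if __name__ == "__main__":
    # Should be close to (pi/2) * rho e^2 / m, independent of tau
    print(integrate_half_line(drude_re_sigma), math.pi / 2.0)
```

The result does not depend on the relaxation time, which reflects the fact that the sum rule fixes only the total spectral weight.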
11.3.9 Exercise 54
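The $n$-th moment of the imaginary part of the dielectric constant, $M_n$, is defined by
$$ M_n = \int_0^{\infty} d\omega \; \omega^n \; \mathrm{Im}\,\varepsilon(q, \omega + i\delta) \tag{1237} $$
Show that $M_1$ is given by
$$ M_1 = 2 \pi^2 \, \frac{e^2 \rho}{m} \tag{1238} $$
and that $M_{-1}$ is given by
$$ M_{-1} = \frac{\pi}{2} \left[ \varepsilon(q, 0) - 1 \right] \tag{1239} $$
------------------------------------------------------------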
11.3.10 The Drude Conductivity
Metals have a large conductivity, and as a result, electromagnetic fields only penetrate a small distance into the metal before the energy of the field is absorbed and dissipated as Joule heating. For low frequencies, or slowly spatially varying fields, the penetration depth $\delta$ can be calculated from Maxwell's equations using the frequency dependent Drude electrical conductivity. The Drude conductivity is calculated by assuming that the photon has a long wavelength, so that $q \approx 0$. On assuming that the medium is homogeneous and isotropic, one finds that the conductivity tensor is diagonal
$$ \sigma^{\alpha,\beta}(\omega) = \delta^{\alpha,\beta} \, \frac{e^2 \rho \tau}{m} \, \frac{1}{1 - i \omega \tau} \tag{1240} $$
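The Drude formula for the conductivity can be obtained directly from the Kubo formula, in the case of non-interacting electrons. On expressing the Kubo formula in terms of the complete set of exact eigenstates of the many-particle Hamiltonian $\hat{H}_0$
$$ \hat{H}_0 \, | \Psi_n > = E_n \, | \Psi_n > \tag{1241} $$
for $t > 0$, one finds
$$ R^{\alpha,\beta}(\mathbf{r}, \mathbf{r}', t) = \frac{i}{\hbar} \sum_n < \Psi_0 | j^{\alpha}_p(\mathbf{r}) | \Psi_n > < \Psi_n | j^{\beta}_p(\mathbf{r}') | \Psi_0 > \exp \left[ + i \, ( E_0 - E_n ) \frac{t}{\hbar} \right] $$
$$ - \frac{i}{\hbar} \sum_n < \Psi_0 | j^{\beta}_p(\mathbf{r}') | \Psi_n > < \Psi_n | j^{\alpha}_p(\mathbf{r}) | \Psi_0 > \exp \left[ - i \, ( E_0 - E_n ) \frac{t}{\hbar} \right] \tag{1242} $$
On Fourier transforming the Kubo formula with respect to time, one obtains
$$ R^{\alpha,\beta}(\mathbf{r}, \mathbf{r}', \omega) = - \sum_n \frac{< \Psi_0 | j^{\alpha}_p(\mathbf{r}) | \Psi_n > < \Psi_n | j^{\beta}_p(\mathbf{r}') | \Psi_0 >}{\hbar \omega + i \eta + E_0 - E_n} + \sum_n \frac{< \Psi_0 | j^{\beta}_p(\mathbf{r}') | \Psi_n > < \Psi_n | j^{\alpha}_p(\mathbf{r}) | \Psi_0 >}{\hbar \omega + i \eta + E_n - E_0} \tag{1243} $$
where the convergence factor $\eta$ is to be assigned a physical meaning. This expression is to be evaluated for non-interacting electrons, in which case the states $| \Psi_n >$ can be taken to be Slater determinants. The matrix elements of the current density operators can be expressed in terms of the one-electron wave functions $\phi_{\gamma}(\mathbf{r})$ and $\phi_{\gamma'}(\mathbf{r})$ via
$$ < \Psi_n | j^{\alpha}(\mathbf{r}) | \Psi_0 > = - \frac{| e | \hbar}{2 i m} \left[ \phi^*_{\gamma'}(\mathbf{r}) \nabla^{\alpha} \phi_{\gamma}(\mathbf{r}) - \left( \nabla^{\alpha} \phi^*_{\gamma'}(\mathbf{r}) \right) \phi_{\gamma}(\mathbf{r}) \right] = - \frac{| e | \hbar}{m} \, \mathrm{Im} \left[ \phi^*_{\gamma'}(\mathbf{r}) \nabla^{\alpha} \phi_{\gamma}(\mathbf{r}) \right] \tag{1244} $$
where the electron in the state labelled by the one-electron quantum number $\gamma$ is excited to the state with quantum number $\gamma'$ in the final state. The energy difference between the initial and final states is given by the difference between the initial and final energies of the excited electron
$$ E_n - E_0 = E_{\gamma'} - E_{\gamma} \tag{1245} $$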
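Thus, one has
$$ R^{\alpha,\beta}(\mathbf{r}, \mathbf{r}', \omega) = - \frac{e^2 \hbar^2}{m^2} \sum_{\gamma,\gamma'} \frac{\mathrm{Im} \left[ \phi^*_{\gamma}(\mathbf{r}) \nabla^{\alpha} \phi_{\gamma'}(\mathbf{r}) \right] \mathrm{Im} \left[ \phi^*_{\gamma'}(\mathbf{r}') \nabla^{\beta} \phi_{\gamma}(\mathbf{r}') \right]}{\hbar \omega + i \eta + E_{\gamma} - E_{\gamma'}} + \frac{e^2 \hbar^2}{m^2} \sum_{\gamma,\gamma'} \frac{\mathrm{Im} \left[ \phi^*_{\gamma}(\mathbf{r}') \nabla^{\beta} \phi_{\gamma'}(\mathbf{r}') \right] \mathrm{Im} \left[ \phi^*_{\gamma'}(\mathbf{r}) \nabla^{\alpha} \phi_{\gamma}(\mathbf{r}) \right]}{\hbar \omega + i \eta + E_{\gamma'} - E_{\gamma}} \tag{1246} $$
where $E_{\gamma} < \mu$ and $E_{\gamma'} > \mu$. The conductivity response function will be evaluated for free electrons. On inserting the single-electron wave functions, the response function is found as
$$ R^{\alpha,\beta}(\mathbf{r}, \mathbf{r}', \omega) = - \frac{e^2 \hbar^2}{4 m^2 V^2} \sum_{k,\sigma;k'} ( k^{\alpha} + k'^{\alpha} ) ( k^{\beta} + k'^{\beta} ) \, \frac{\exp \left[ - \frac{i}{\hbar} ( \mathbf{k} - \mathbf{k}' ) \cdot ( \mathbf{r} - \mathbf{r}' ) \right]}{\hbar \omega + i \eta + E_k - E_{k'}} + \frac{e^2 \hbar^2}{4 m^2 V^2} \sum_{k,\sigma;k'} ( k^{\alpha} + k'^{\alpha} ) ( k^{\beta} + k'^{\beta} ) \, \frac{\exp \left[ - \frac{i}{\hbar} ( \mathbf{k} - \mathbf{k}' ) \cdot ( \mathbf{r} - \mathbf{r}' ) \right]}{\hbar \omega + i \eta - E_k + E_{k'}} \tag{1247} $$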
where k and k denote the α and β components of the vector k. The summation over k, σ runs over the occupied states k < kF whereas the sum over k 0 runs over the unoccupied states with the spin σ but with k 0 > kF . The initial and final state spin quantum numbers are identical. On Fourier transforming with respect to the space variable, and re-arranging the summation index in the second term, one finds f (Ek ) − f (Ek+q ) e2 ¯ h2 X Rα,β (q, ω) = − (2k α + q α ) (2k β + q β ) 2 ¯h ω + i η + Ek − Ek+q 4m V k,σ
(1248) In this expression, the effect of the Pauli-exclusion principle is automatically accounted for. Due to the large magnitude of c, for fixed $\omega$, this expression can be evaluated to leading order in $q$. In this case, as space is isotropic, the response function is also isotropic. That is, the response function is diagonal in the indices $\alpha$ and $\beta$ and the diagonal components have equal magnitudes. Hence, the diagonal components can be evaluated from the relation
$$ R^{\alpha,\alpha}(q, \omega) = \frac{1}{3} \sum_{\beta=1}^{3} R^{\beta,\beta}(q, \omega) \tag{1249} $$
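The response function can be expressed as
$$ R^{\alpha,\beta}(q, \omega) = - \delta^{\alpha,\beta} \frac{2 e^2}{3 m V} \sum_{k,\sigma} E_k \, \frac{ f(E_k) - f(E_{k+q}) }{ \hbar \omega + i \eta + E_k - E_{k+q} } \tag{1250} $$
On Taylor expanding the Fermi-function $f(E_{k+q})$ in powers of $(E_{k+q} - E_k)$ one has
$$ \begin{aligned} R^{\alpha,\beta}(q, \omega) & = - \delta^{\alpha,\beta} \frac{2 e^2}{3 m V} \sum_{k,\sigma} E_k \frac{\partial f}{\partial E_k} \, \frac{ E_k - E_{k+q} }{ \hbar \omega + i \eta + E_k - E_{k+q} } \\ & = - \delta^{\alpha,\beta} \frac{2 e^2}{3 m V} \sum_{k,\sigma} E_k \frac{\partial f}{\partial E_k} \left[ 1 - \frac{ \hbar \omega + i \eta }{ \hbar \omega + i \eta + E_k - E_{k+q} } \right] \end{aligned} $$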
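(1251) The first term can be evaluated through integration by parts
$$ \begin{aligned} - \frac{2}{3} \sum_{\sigma,k} E_k \frac{\partial f}{\partial E_k} & = - \frac{2}{3} \sum_\sigma \int_0^\infty dE \, E \, \rho(E) \, \frac{\partial f}{\partial E} \\ & = \frac{2}{3} \sum_\sigma \int_0^\infty dE \, f(E) \, \frac{\partial}{\partial E} \Big( E \, \rho(E) \Big) \end{aligned} \tag{1252} $$
since the boundary terms vanish. Furthermore, since $\rho(E) \propto \sqrt{E}$, this term is evaluated as
$$ - \frac{2}{3} \sum_{\sigma,k} E_k \frac{\partial f}{\partial E_k} = \sum_\sigma \int_0^\infty dE \, f(E) \, \rho(E) = N_e \tag{1253} $$
as the factor of $\frac{2}{3}$ cancels with the factor of $\frac{3}{2}$ from the derivative. Due to this simplification, the response function is given by
$$ R^{\alpha,\alpha}(q, \omega) = \frac{\rho_0 e^2}{m} + \frac{2 e^2}{3 m V} \sum_{k,\sigma} E_k \frac{\partial f}{\partial E_k} \, \frac{ \hbar \omega + i \eta }{ \hbar \omega + i \eta + E_k - E_{k+q} } $$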
(1254) On substituting this expression into the conductivity, one finds the first term cancels with the diamagnetic current. This cancellation is responsible for prohibiting current flow occurring in a metal as a response to an applied magnetic field. In other words, a normal metal does not superconduct due to the cancellation of the diamagnetic current. The conductivity is simply given by
$$ \sigma^{\alpha,\beta}(q, \omega) = \delta^{\alpha,\beta} \frac{2 e^2}{3 i m V} \sum_{k,\sigma} E_k \frac{\partial f}{\partial E_k} \, \frac{ \hbar }{ \hbar \omega + i \eta + E_k - E_{k+q} } \tag{1255} $$
The derivative of the Fermi-function is only non-zero in the vicinity of the Fermi-energy. In the limit $T \to 0$, the derivative may be expressed as
$$ - \frac{\partial f}{\partial E} = \delta( E - E_F ) \tag{1256} $$
The appearance of the derivative of the Fermi-function in the expression for the conductivity is a consequence of the Pauli-exclusion principle. Only electrons close to the Fermi-energy can absorb relatively small amounts of energy and be excited to unoccupied single-electron states and, hence, carry current. A phenomenological relaxation time $\tau$ can be defined via
$$ \frac{\hbar}{\tau} = \eta \tag{1257} $$
The relaxation time $\tau$ can be thought of as the lifetime of the current carrying hole or the current carrying excited electron. This lifetime must be caused by a scattering mechanism. In more rigorous treatments of the conductivity, the scattering rate is calculated using microscopic descriptions of the scattering interaction. When expressed in terms of the relaxation time, the conductivity becomes
$$ \sigma^{\alpha,\beta}(q, \omega) = - \delta^{\alpha,\beta} \frac{e^2 \tau}{m V} \, \frac{2}{3} \sum_{k,\sigma} \frac{ E_k \frac{\partial f}{\partial E_k} }{ 1 - i \tau ( \omega - q \cdot v(k) ) } \tag{1258} $$
In the limit $q \to 0$, one recovers the Drude approximation for the conductivity
$$ \sigma^{\alpha,\beta}(\omega) = \delta^{\alpha,\beta} \frac{\rho e^2 \tau}{m} \, \frac{1}{1 - i \tau \omega} \tag{1259} $$
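where $\rho$ is the electron density. The Drude conductivity is purely real in the d.c. limit, and is given by
$$ \sigma^{\alpha,\beta}(0) = \delta^{\alpha,\beta} \frac{\rho e^2 \tau}{m} \tag{1260} $$
and at finite frequencies has a real part that decays like $\omega^{-2}$
$$ \mathrm{Re} \, \sigma^{\alpha,\beta}(\omega) = \delta^{\alpha,\beta} \frac{\rho e^2 \tau}{m} \, \frac{1}{1 + \omega^2 \tau^2} \tag{1261} $$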
Thus, the Drude conductivity has a peak at zero frequency and the width of the peak is determined by the relaxation time. The integrated strength of the low energy Drude peak in the conductivity is given by
$$ \int_0^\infty d\omega \, \mathrm{Re} \, \sigma^{\alpha,\beta}(\omega) = \delta^{\alpha,\beta} \frac{\pi \rho e^2}{2 m} \tag{1262} $$
Hence, the intensity of the Drude peak provides a measure of the number of conduction electrons in a system of non-interacting electrons. For a metal with interacting electrons, the Drude peak, when integrated over a low frequency range, yields an estimate of the quasi-particle weight.
——————————————————————————————————
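The statement that the integrated Drude weight depends only on $\rho e^2 / m$, and not on $\tau$, can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the value chosen for $\rho e^2/m$ are illustrative assumptions, and the integral over $[0, \infty)$ is mapped onto a finite interval with the substitution $\omega = \tan\theta / \tau$.

```python
import math

def drude_re_sigma(omega, sigma0, tau):
    """Real part of the Drude conductivity, Eq. (1261): sigma0 / (1 + (omega tau)^2)."""
    return sigma0 / (1.0 + (omega * tau) ** 2)

def drude_weight(sigma0, tau, n=2000):
    """Integrate Re sigma over omega in [0, inf) using omega = tan(theta) / tau."""
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h                      # midpoint rule avoids theta = pi/2
        omega = math.tan(theta) / tau
        jac = 1.0 / (tau * math.cos(theta) ** 2)   # d(omega)/d(theta)
        total += drude_re_sigma(omega, sigma0, tau) * jac * h
    return total

if __name__ == "__main__":
    rho_e2_over_m = 1.0  # rho e^2 / m in arbitrary units (assumed value)
    for tau in (0.1, 1.0, 10.0):
        sigma0 = rho_e2_over_m * tau               # d.c. conductivity, Eq. (1260)
        print(tau, drude_weight(sigma0, tau), math.pi * rho_e2_over_m / 2)
```

For each value of $\tau$ the numerical integral should agree with $\pi \rho e^2 / (2m)$, in line with Eq. (1262).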
11.3.11
Exercise 55
Show that microwaves, with low frequency $\omega$, satisfy the equation
$$ - \nabla^2 E(r, \omega) = \frac{4 \pi i \, \sigma(\omega) \, \omega}{c^2} \, E(r, \omega) \tag{1263} $$
where $\sigma(\omega)$ is the diagonal component of the conductivity tensor. Solve this equation for the electric field and hence calculate the classical skin depth $\delta$. The classical skin depth is defined as the distance $\delta$ that an electric field penetrates into a metal before being attenuated.
——————————————————————————————————
The analysis of Exercise 55 is only valid if the electric field varies slowly over distances of the order of a mean free path $\lambda$. The analysis is only valid for low frequencies and dirty metals. However, for good metals, $E(r, \omega)$ varies rapidly in space. This regime corresponds to the anomalous skin effect (A.B. Pippard, Proc. Roy. Soc. A 191, 385 (1947), A.B. Pippard, Proc. Roy. Soc. A 224, 273 (1954)). Since the electrons do not respond to the field instantaneously and locally, the retarded and non-local response function ought to be used. In this case, one should solve Maxwell's equations by solving for the Fourier components of the fields $E(q, \omega)$ and $B(q, \omega)$ and by using an approximate expression for the conductivity tensor in which both the wave vector and frequency dependence are kept (D.C. Mattis and G. Dresselhaus, Phys. Rev. 111, 403 (1958)). This procedure is crucial for the discussion of the anomalous skin effect.
——————————————————————————————————
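As a rough numerical companion to Exercise 55, one can insert a plane-wave form $E \propto \exp( i k z )$ into Eq. (1263), which gives $k^2 = 4 \pi i \sigma \omega / c^2$, and read off the decay length as $1 / \mathrm{Im} \, k$. The sketch below assumes Gaussian units and a real, frequency-independent $\sigma$; the function names and the numerical values of the conductivity and frequency are illustrative assumptions, not taken from the notes.

```python
import cmath
import math

C_LIGHT = 2.998e10  # speed of light in cm/s (Gaussian units)

def complex_wave_vector(sigma, omega):
    """Root k of k^2 = 4 pi i sigma omega / c^2 with Im k > 0 (decaying solution)."""
    return cmath.sqrt(4j * math.pi * sigma * omega) / C_LIGHT

def classical_skin_depth(sigma, omega):
    """Decay length 1 / Im(k) of a field varying as exp(i k z)."""
    return 1.0 / complex_wave_vector(sigma, omega).imag

if __name__ == "__main__":
    sigma = 5.0e17                 # assumed d.c. conductivity in s^-1 (illustrative)
    omega = 2 * math.pi * 1.0e10   # assumed angular frequency in rad/s
    print(classical_skin_depth(sigma, omega), "cm")
```

Under these assumptions the decay length reduces to $c / \sqrt{2 \pi \sigma \omega}$, which is the quantity the exercise asks for.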
11.3.12
Exercise 56
The conductivity tensor can be expressed as an integral over the Fermi-surface,
$$ \sigma^{\alpha,\beta}(q, \omega) = \int \frac{d^2 S}{| \hbar \, v(k) |} \; \frac{ \tau \, v^\alpha(k) \, v^\beta(k) }{ 1 - i \tau ( \omega - q \cdot v(k) ) } \tag{1264} $$
Consider a clean material with a sufficiently long mean free path $\lambda$ such that $q_z \lambda \gg 1$ for fixed $q_z$. Show that the transverse conductivity $\sigma^{x,x}(q_z \hat{e}_z, \omega)$ is given by the approximate expression
$$ \sigma^{x,x}(q_z \hat{e}_z, 0) = \frac{3 \pi}{4} \, \frac{\sigma_0}{| q_z | \lambda} \tag{1265} $$
(G.E.H. Reuter and E.H. Sondheimer, Proc. Roy. Soc. A195, 336 (1948))
——————————————————————————————————
11.3.13
The Anomalous Skin Effect
For clean materials with large mean free-paths $\lambda$, the penetration of an electric field into a metal is governed by the anomalous skin effect. In the low frequency limit, the electric field $E_x(z, \omega)$ is governed by
$$ \frac{\partial^2 E_x}{\partial z^2} + \frac{\omega^2}{c^2} E_x = - \frac{4 \pi i \omega}{c^2} j_x \tag{1266} $$
where the surface of the material is the $z = 0$ plane. We shall assume that electrons are specularly reflected from the surface. This boundary condition can be understood by imagining that the surface of the metal demarcates the boundary between two identical solids. One solid represents an extension of the actual material generated by mirror symmetry. The condition of specular reflection amounts to assuming that the electrons and fields in the mirror image solid behave in the same way as in the actual solid. This leads to the boundary condition for the electric field being given by
$$ \left. \frac{\partial E_x}{\partial z} \right|_{z=\epsilon} = - \left. \frac{\partial E_x}{\partial z} \right|_{z=-\epsilon} \tag{1267} $$
Therefore, on subsuming the boundary condition in the equation of motion for the field, one has
$$ \frac{\partial^2 E_x}{\partial z^2} + \frac{\omega^2}{c^2} E_x = - \frac{4 \pi i \omega}{c^2} j_x + 2 \, \delta(z) \left. \frac{\partial E_x}{\partial z} \right|_{z=0} \tag{1268} $$
Hence, on Fourier transforming with respect to $z$ and using Ohm's law
$$ j_x(q_z, \omega) = \sigma^{x,x}(q_z) \, E_x(q_z, \omega) \tag{1269} $$
one obtains the solution
$$ E_x(q_z, \omega) = \frac{ 2 \left. \frac{\partial E_x}{\partial z} \right|_{z=0} }{ \frac{\omega^2}{c^2} - q_z^2 + \frac{4 \pi i \omega}{c^2} \sigma^{x,x}(q_z, \omega) } \tag{1270} $$
In the limit of extremely long mean free path $\lambda \to \infty$, the conductivity simplifies to
$$ \sigma^{x,x}(q_z, 0) = \frac{3 \pi \, \sigma(0, 0)}{4 \, | q_z | \, \lambda} \tag{1271} $$
(G.E.H. Reuter and E.H. Sondheimer, Proc. Roy. Soc. A195, 336 (1948)). The spatial dependence of the electric field is given by the inverse Fourier transform,
$$ E_x(z, \omega) = \int_{-\infty}^{\infty} \frac{dq_z}{\pi} \; \frac{ \exp \left[ - i q_z z \right] \left. \frac{\partial E_x}{\partial z} \right|_{z=0} }{ \frac{\omega^2}{c^2} - q_z^2 + \frac{3 \pi^2 i \omega}{c^2 \lambda | q_z |} \sigma(0, 0) } \tag{1272} $$
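On defining $\delta$ via
$$ \delta^{-1} = \left( \frac{3 \pi^2 \omega \, \sigma(0, 0)}{c^2 \lambda} \right)^{\frac{1}{3}} \tag{1273} $$
one has
$$ \begin{aligned} E_x(z, \omega) & = \left. \frac{\partial E_x}{\partial z} \right|_{z=0} \int_{-\infty}^{\infty} \frac{dq_z}{\pi} \; \frac{ | q_z | \exp \left[ - i q_z z \right] }{ \frac{\omega^2}{c^2} | q_z | - | q_z^3 | + i \delta^{-3} } \\ & = - 2 i \delta \left. \frac{\partial E_x}{\partial z} \right|_{z=0} \int_0^\infty \frac{dx}{\pi} \; \frac{ x \cos ( x \frac{z}{\delta} ) }{ 1 + i x^3 - i \frac{\omega^2 \delta^2}{c^2} x } \end{aligned} \tag{1274} $$
For low frequencies, the decay of the electric field is governed by $\delta$, the anomalous skin depth. At the surface, the value of the field is given by
$$ \begin{aligned} E_x(0, \omega) & = - 2 i \delta \left. \frac{\partial E_x}{\partial z} \right|_{z=0} \int_0^\infty \frac{dx}{\pi} \; \frac{ x }{ 1 + i x^3 - i \frac{\omega^2 \delta^2}{c^2} x } \\ & \approx - \frac{2 \delta}{3} \left. \frac{\partial E_x}{\partial z} \right|_{z=0} \left( 1 + \frac{i}{\sqrt{3}} \right) \end{aligned} \tag{1275} $$
Far from the surface, the field has an exponential decay
$$ E_x(z, \omega) \sim \left( \frac{\delta}{z} \right)^2 \exp \left[ - \frac{z}{\delta} \right] \tag{1276} $$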
which decays over a distance $\delta$. This result for the anomalous skin depth $\delta$ was first obtained by Pippard, up to a numerical factor (A.B. Pippard, Proc. Roy. Soc. A 191, 385 (1947), A.B. Pippard, Proc. Roy. Soc. A 224, 273 (1954)). Pippard noted that only the fraction $\frac{\delta}{\lambda}$ of the electrons close to the surface may participate in the screening process. That is, only the electrons moving parallel to the surface are strongly affected by the electric field. The electrons that remain within the penetration depth $\delta$ before being scattered subtend an angle of
$$ d\theta \approx \frac{\delta}{\lambda} \tag{1277} $$
The number of electrons which are capable of responding to the field is proportional to the solid angle $d\Omega$,
$$ d\Omega = 2 \pi \sin\theta \, d\theta \sim 2 \pi \, d\theta \tag{1278} $$
since $\theta \approx \frac{\pi}{2}$. Hence, the effective electron density $\rho_{eff}$ is given by
$$ \rho_{eff} \approx \rho \, \frac{\delta}{\lambda} \tag{1279} $$
where $\rho$ is the uniform electron density. This implies that the conductivity parallel to the surface should be reduced by the factor $\frac{\delta}{\lambda}$. Thus, on applying the analysis of the classical skin effect, one recovers the relation
$$ \delta^{-2} \sim \frac{4 \pi \omega}{c^2} \, \frac{\delta}{\lambda} \, \sigma(0) \tag{1280} $$
Hence, one has Pippard's relation
$$ \delta^{-1} \sim \left( \frac{4 \pi \omega \, \sigma(0)}{c^2 \lambda} \right)^{\frac{1}{3}} \tag{1281} $$
which only differs by a numerical factor from the previously given expression for the skin depth.
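The size of the numerical factor separating Eq. (1273) from Pippard's estimate, Eq. (1281), can be made explicit with a short calculation. The sketch below is illustrative only: the function names and the input values for $\omega$ and $\sigma / \lambda$ are assumptions, and Gaussian units are used. Since both expressions depend on the same combination $\omega \sigma / (c^2 \lambda)$, their ratio is a pure number.

```python
import math

C_LIGHT = 2.998e10  # speed of light in cm/s (Gaussian units)

def anomalous_skin_depth(omega, sigma_over_lambda):
    """Eq. (1273): 1/delta = (3 pi^2 omega sigma / (c^2 lambda))^(1/3)."""
    return (3 * math.pi ** 2 * omega * sigma_over_lambda / C_LIGHT ** 2) ** (-1.0 / 3.0)

def pippard_skin_depth(omega, sigma_over_lambda):
    """Eq. (1281): 1/delta ~ (4 pi omega sigma / (c^2 lambda))^(1/3)."""
    return (4 * math.pi * omega * sigma_over_lambda / C_LIGHT ** 2) ** (-1.0 / 3.0)

if __name__ == "__main__":
    omega = 2 * math.pi * 1.0e10   # assumed angular frequency in rad/s
    sigma_over_lambda = 1.0e22     # assumed sigma / lambda in s^-1 cm^-1 (illustrative)
    exact = anomalous_skin_depth(omega, sigma_over_lambda)
    estimate = pippard_skin_depth(omega, sigma_over_lambda)
    print(exact, estimate, estimate / exact)
```

The ratio of the two depths is $(3\pi/4)^{1/3}$ for any input, and both scale as $\omega^{-1/3}$, in contrast to the $\omega^{-1/2}$ dependence of the classical skin depth.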
11.3.14
Inter-Band Transitions
The absorption of a photon of wave vector $q$ may cause an electron to make a transition from the initially occupied state with Bloch wave vector $k$ to a final state with wave vector $k + q$. Since the wave vector of light $q$ is small compared with $k_F$, for a given $\omega$, the final state must be in a different band and must be empty. These are inter-band transitions. Materials with small inter-band gaps can have large dielectric constants. The inter-band contribution to the dielectric constant can be obtained by neglecting $q$, thereby producing vertical transitions between the different Bloch bands. The imaginary part of the dielectric constant due to inter-band transitions can be written as
$$ \mathrm{Im} \, \varepsilon(\omega + i\delta) = 8 \pi^2 \hbar^2 \left( \frac{e}{m \omega} \right)^2 \int \frac{d^3 k}{( 2 \pi )^3} \; \left| \hat{e}_\alpha \cdot M_k \right|^2 \delta( E_{c,k} - E_{v,k} - \hbar \omega ) \tag{1282} $$
The matrix elements for the inter-band transitions are given by
$$ \hat{e}_\alpha \cdot M_k = \hat{e}_\alpha \cdot \int d^3 r \; \exp \left[ - i k \cdot r \right] u_{v,k}(r) \, \nabla \, \exp \left[ + i k \cdot r \right] u_{c,k}(r) \tag{1283} $$
where $u_{n,k}$ are the periodic functions of $r$ in the Bloch functions. The sum over $k$ can be transformed into an integral
$$ \mathrm{Im} \, \varepsilon(\omega + i\delta) = 8 \pi^2 \hbar^2 \left( \frac{e}{m \omega} \right)^2 \int \frac{d^2 S}{( 2 \pi )^3} \; \left. \frac{ \left| \hat{e}_\alpha \cdot M_k \right|^2 }{ \left| \nabla_k ( E_{c,k} - E_{v,k} ) \right| } \right|_{E_c - E_v = \hbar \omega} \tag{1284} $$
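where $d^2 S$ represents an element of the surface in $k$ space defined by the equation $\hbar \omega = E_{c,k} - E_{v,k}$. The quantity
$$ J(\omega) = \int \frac{d^2 S}{( 2 \pi )^3} \; \left. \frac{ \left| \hat{e}_\alpha \cdot M_k \right|^2 }{ \left| \nabla_k ( E_{c,k} - E_{v,k} ) \right| } \right|_{E_c - E_v = \hbar \omega} \tag{1285} $$
is known as the joint density of states. The joint density of states varies rapidly with respect to $\omega$ at the critical points, at which
$$ \left. \nabla_k ( E_{c,k} - E_{v,k} ) \right|_{E_c - E_v = \hbar \omega} = 0 \tag{1286} $$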
The inter-band transitions produce a broad continuum in the absorption spectrum, and only the van Hove singularities may be uniquely identified in the spectrum. The analytic behavior of the dielectric constant near a singularity may be obtained by Taylor expanding about the critical point. When other processes such as electron-phonon scattering are considered, second order time dependent perturbation theory describes indirect transitions. In this case, a phonon may be absorbed or emitted by the lattice while the photon is being absorbed. The emission or absorption of the phonon introduces a change of momentum $q$. Conservation of momentum leads to the momenta of the initial and final state of the electron being related via $q = k' - k$. Since the energy of the phonon is usually negligible compared with the energy of the photon, the energy of the absorbed photon is approximately given by the energy difference of the electron's initial and final states
$$ \hbar \omega \approx E_{c,k} - E_{v,k'} \tag{1287} $$
Thus, since $q$ varies continuously, the indirect inter-band transitions have a continuous spectrum. The threshold energy for the inter-band transition is close to the minimum value of $E_{c,k} - E_{v,k'}$, for all possible values of $k$ and $k'$. If the minimum value of $E_{c,k} - E_{v,k'}$ occurs for $k = k'$ the band gap is known as a direct band gap, whereas if $k \neq k'$ the band gap is called an indirect band gap.
11.4
Measuring the Fermi-Surface
The Fermi-surface determines most of the thermodynamic, transport and optical properties of a solid. The geometry of the Fermi-surface can be determined experimentally, through a variety of techniques. The most powerful of these techniques is the measurement of de Haas - van Alphen oscillations. The de Haas - van Alphen effect is manifested as an oscillatory behavior of the magnetization (W.J. de Haas and P.M. van Alphen, Proc. Amsterdam, Acad. 33, 1106 (1930)). The magnetization is periodic in the inverse of the applied magnetic field, $\frac{1}{H}$. Onsager pointed out that the period in $\frac{1}{H}$ is given by the expression
$$ \Delta \left( \frac{1}{H} \right) = 2 \pi \, \frac{|e|}{\hbar c} \, \frac{1}{A_e} \tag{1288} $$
where $A_e$ is the extremal cross-sectional area of the Fermi-surface in the plane perpendicular to the direction of the applied field $H$ (L. Onsager, Phil. Mag. 43, 1006 (1952)). By observing the period of oscillations for the different directions of the applied field, one can measure the extremal areas for each direction. This information can then be used to reconstruct the three-dimensional Fermi-surface (D. Shoenberg, Proc. Roy. Soc. A 170, 341 (1939)). First, some properties of the electron orbits in an applied field will be examined, then the experimental methods used in the determination of the Fermi-surface will be described.
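The conversion between a measured period in $1/H$ and an extremal area can be sketched in a few lines. The code below assumes the Onsager relation in the form $\Delta(1/H) = 2\pi |e| / (\hbar c A_e)$, which is the form consistent with Eq. (1323) later in this section. It uses Gaussian units; the function names and the free-electron value of $k_F$ are illustrative assumptions.

```python
import math

E_CHARGE = 4.80320e-10   # electron charge magnitude in esu
HBAR = 1.054572e-27      # reduced Planck constant in erg s
C_LIGHT = 2.997925e10    # speed of light in cm/s

def dhva_period(area):
    """Period in 1/H (units of 1/G) for an extremal k-space area given in cm^-2."""
    return 2 * math.pi * E_CHARGE / (HBAR * C_LIGHT * area)

def extremal_area(period):
    """Extremal k-space area in cm^-2 inferred from a measured period in 1/H."""
    return 2 * math.pi * E_CHARGE / (HBAR * C_LIGHT * period)

if __name__ == "__main__":
    k_fermi = 1.36e8                  # assumed free-electron k_F in cm^-1 (illustrative)
    area = math.pi * k_fermi ** 2     # extremal area of a Fermi sphere
    period = dhva_period(area)
    print(period, extremal_area(period) / area)
```

A larger extremal area gives a shorter period in $1/H$, so the largest cross-sections of the Fermi-surface produce the most rapid oscillations.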
11.4.1
Semi-Classical Orbits
In the classical approximation, the Hamilton equations of motion are given by the pair of equations
$$ \dot{r} = v(k) = \frac{1}{\hbar} \nabla_k E_k \tag{1289} $$
and
$$ \hbar \, \dot{k} = - \frac{|e|}{c} \, v(k) \wedge H \tag{1290} $$
From these one finds that $k$ changes in a manner such that it remains on the constant energy surfaces. This is found by observing that, from the above equation, $\dot{k}$ is perpendicular to $\nabla_k E_k$. Also, since $\dot{k}$ is perpendicular to $H$, the $k$ space orbits are sections of the constant energy surfaces with normal along the $z$ axis. That is, the orbits traverse the constant energy surfaces, but $k_z$ is a constant. The real space orbits are perpendicular to the $k$ space orbits. To show this, we shall first prove that $\dot{k}$ is perpendicular to $\dot{r}$. On taking the vector product of the equation of motion with the vector $H$, one finds
$$ \hbar \, H \wedge \dot{k} = - \frac{|e|}{c} \, H \wedge \left( v(k) \wedge H \right) \tag{1291} $$
The component of the velocity perpendicular to the applied field is given by
$$ \dot{r}_\perp = \dot{r} - \hat{H} \left( \dot{r} \cdot \hat{H} \right) = \hat{H} \wedge \left( \dot{r} \wedge \hat{H} \right) \tag{1292} $$
where $\hat{H}$ is the unit vector in the direction of $H$. Hence,
$$ \dot{r}_\perp = - \frac{c \hbar}{|e| H_z} \, \hat{H} \wedge \dot{k} \tag{1293} $$
Thus, on integrating this one finds that the displacement $\Delta r_\perp$ is given in terms of the displacement in $k$ space through
$$ \Delta r_\perp = - \frac{c \hbar}{|e| H_z} \, \hat{H} \wedge \Delta k \tag{1294} $$
Thus, the real space orbit is perpendicular to the $k$ space orbit and is scaled by a factor of $\frac{c \hbar}{|e| H_z}$. The period $T$ at which the orbit is traversed is given by the integral over one orbit
$$ T = \oint \frac{dk}{\dot{k}} \tag{1295} $$
The rate of change of $k$ is given by the Lorentz force
$$ \dot{k} = \frac{|e|}{\hbar^2 c} \left| \nabla_k E \wedge H \right| = \frac{|e|}{\hbar^2 c} \, H_z \left| \nabla_k E_\perp \right| \tag{1296} $$
where $\nabla_k E_\perp$ is the component of the gradient perpendicular to $H$, i.e., the projection in the plane of the orbit. Thus,
$$ T = \frac{\hbar^2 c}{|e| H_z} \oint \frac{dk}{\left| \nabla_k E_\perp \right|} \tag{1297} $$
If semi-classical quantization considerations are applied, then the energies of the orbits become quantized, as do the orbits themselves. The areas enclosed by the orbits are related to the energy, and so the areas are also expected to be quantized. This shall be shown by two methods: in the first, the quantization condition is imposed through the energy - time uncertainty relation, and the second method will utilize the Bohr-Sommerfeld quantization condition.

Quantization Using Energy - Time Uncertainty.

The relationship between the energy and the areas enclosed by the $k$ space orbits can be found from consideration of two classical orbits, one with energy $E$ and another with energy $E + \Delta E$, where both orbits are in the same $k_z$ plane. Then, let $\Delta k$ be the minimum distance between these two orbits. The value of $\Delta k$ is related to $\Delta E$ via
$$ \Delta E = \left| \nabla_k E_\perp \right| \Delta k \tag{1298} $$
This relation can be substituted into the expression for the period to yield
$$ T = \frac{\hbar^2 c}{|e| H_z} \, \frac{1}{\Delta E} \oint \Delta k \, dk \tag{1299} $$
However, the area between the two successive orbits is given by the integral
$$ \Delta A = \oint \Delta k \, dk \tag{1300} $$
Thus, the period can be expressed as
$$ T = \frac{\hbar^2 c}{|e| H_z} \, \frac{\Delta A}{\Delta E} \tag{1301} $$
The orbits can be quantized through the energy uncertainty relation
$$ E_{n+1} - E_n = \frac{2 \pi \hbar}{T} = \frac{2 \pi |e| H_z}{\hbar c} \, \frac{\Delta E}{\Delta A} \tag{1302} $$
Furthermore, as
$$ \frac{\Delta E}{\Delta A} = \frac{E_{n+1} - E_n}{A_{n+1} - A_n} \tag{1303} $$
one can cancel a factor of $\Delta E$ to find that the area enclosed between consecutive Landau orbits is quantized
$$ A_{n+1} - A_n = \frac{2 \pi |e|}{\hbar c} \, H_z \tag{1304} $$
This difference equation can be solved to yield the area enclosed by the n-th Landau orbital as
$$ A_n = \left( n + \lambda \right) \frac{2 \pi |e|}{\hbar c} \, H_z \tag{1305} $$
where $\lambda$ is a constant, independent of $n$. Thus, the area of a Landau orbit in $k$ space is related to $n$ and the applied field $H_z$ through the Onsager equation (L. Onsager, Phil. Mag. 43, 1006 (1952)).

Bohr-Sommerfeld Quantization.

An alternate derivation of the Onsager equation follows from the Bohr-Sommerfeld quantization condition
$$ \oint p \cdot dr = 2 \pi \hbar \left( n + \frac{1}{2} \right) \tag{1306} $$
The canonical momentum is given by
$$ p = \hbar k - \frac{|e|}{c} A \tag{1307} $$
The integral is evaluated over an orbit in the x - y plane perpendicular to $H$. The orbit is obtained from the equation of motion with the Lorentz Force Law
$$ \hbar \, \dot{k} = - \frac{|e|}{c} \, \dot{r} \wedge H \tag{1308} $$
The equation of motion can be integrated with respect to time, to yield
$$ \hbar k = - \frac{|e|}{c} \, r \wedge H \tag{1309} $$
Thus, the Bohr-Sommerfeld quantization condition reduces to
$$ \begin{aligned} - \frac{|e|}{c} \oint \left( r \wedge H + A \right) \cdot dr & = 2 \pi \hbar \left( n + \frac{1}{2} \right) \\ & = \frac{|e|}{c} \left[ H \cdot \oint dr \wedge r - \oint dr \cdot A \right] \end{aligned} \tag{1310} $$
However, the integral
$$ \oint dr \wedge r = 2 A_r \, \hat{e}_z \tag{1311} $$
is just twice the area enclosed by the real space orbit, and the integral of the vector potential around the loop is given by
$$ \oint dr \cdot A = \Phi \tag{1312} $$
where $\Phi = A_r H_z$ is the flux enclosed by the orbit. Hence, the magnitude of the area of the orbit, $A_r$, in real space is quantized and is given by
$$ A_r = \left( n + \frac{1}{2} \right) \frac{2 \pi \hbar c}{|e| H_z} \tag{1313} $$
Since the real space and momentum space orbits are related via
$$ \Delta r = \frac{\hbar c}{|e| H_z} \, \Delta k \tag{1314} $$
one can scale the areas of the real and momentum space orbits. Thus, one recovers the Onsager formula for the area of the momentum space orbit
$$ A_n = \left( n + \frac{1}{2} \right) \frac{2 \pi |e|}{\hbar c} \, H_z \tag{1315} $$
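For free electrons the Onsager area can be cross-checked against the Landau level energies directly. The sketch below is an illustration under stated assumptions (Gaussian units, free-electron mass, hypothetical function names): it computes the $k$ space area of the circle whose transverse kinetic energy is $(n + \frac{1}{2}) \hbar \omega_c$ and compares it with $(n + \frac{1}{2}) \, 2 \pi |e| H_z / (\hbar c)$.

```python
import math

E_CHARGE = 4.80320e-10   # electron charge magnitude in esu
HBAR = 1.054572e-27      # reduced Planck constant in erg s
C_LIGHT = 2.997925e10    # speed of light in cm/s
M_E = 9.10938e-28        # free electron mass in g

def cyclotron_frequency(field):
    """omega_c = |e| H / (m c) for the free electron mass; field in gauss."""
    return E_CHARGE * field / (M_E * C_LIGHT)

def area_from_landau_energy(n, field):
    """pi k_perp^2 for a circular orbit with hbar^2 k_perp^2 / 2m = (n + 1/2) hbar omega_c."""
    e_perp = (n + 0.5) * HBAR * cyclotron_frequency(field)
    k_perp_squared = 2 * M_E * e_perp / HBAR ** 2
    return math.pi * k_perp_squared

def onsager_area(n, field):
    """Quantized k-space area (n + 1/2) 2 pi |e| H / (hbar c)."""
    return (n + 0.5) * 2 * math.pi * E_CHARGE * field / (HBAR * C_LIGHT)

if __name__ == "__main__":
    for n in (0, 1, 10):
        print(n, area_from_landau_energy(n, 1.0e4), onsager_area(n, 1.0e4))
```

The two columns agree, and the difference between consecutive areas is independent of $n$, as required by the difference equation for $A_{n+1} - A_n$.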
11.4.2
de Haas - van Alphen Oscillations
Given a solid with $H_z = 0$, surfaces of constant energy do not intersect when plotted in $k$ space. The consecutive constant energy surfaces, corresponding to the different allowed values of energy, completely fill momentum space. The states on the surfaces which have energy less than $\mu$ will be occupied, and those with energy greater than $\mu$ are empty. On applying a magnetic field, $H_z$, the momentum perpendicular to the field is no longer a constant of motion, but $k_z$ is constant. However, as time evolves, an orbit never leaves its surface of constant energy. The magnetic field quantizes the orbits. In momentum space, the allowed orbits form a nested set of discrete Landau tubes. Orbits in the regions between the tubes are forbidden. For a general Fermi-surface of a three-dimensional crystal, the intersection of the constant energy surfaces with a plane of fixed $k_z$ need not be circular, so that the Landau tubes need not have cylindrical cross-sections. However, for free electrons, the zero field constant energy surfaces are spherical and the Landau tubes are cylindrical. The radius of the tubes is determined by the energy of the x-y motion, while the height is determined by the component of the kinetic energy due to motion in the z-direction. For free electrons, an occupied orbit is specified by $k_z$ and $n$. The energy of the free-electron orbit is given by
$$ E_{n,k_z} = \left( n + \frac{1}{2} \right) \hbar \omega_c + \frac{\hbar^2 k_z^2}{2 m} \tag{1316} $$
where
$$ \omega_c = \frac{|e| H_z}{m c} \tag{1317} $$
The orbit maps out a circle of area
$$ A_n = \frac{2 \pi |e|}{\hbar c} \, H_z \left( n + \frac{1}{2} \right) \tag{1318} $$
so the orbits will consist of concentric circles. On varying $k_z$ but holding $n$ fixed, the consecutive orbits will map out a tube in $k$ space. The occupied portions of $k$ space will lie on portions of a series of tubes. These portions will be contained in a volume similar to the volume of the Fermi-surface, when $H_z = 0$. The bounding volume must reduce to the volume enclosed by the Fermi-surface when the field is decreased. For fields of the order of $H \sim 1$ kG, the Fermi-surface cuts about $10^3$ such tubes, so the quasi-classical approximation can be expected to be valid. As the field increases, the cross-sectional area enclosed by the tubes also increases, as does the number of electrons held by the tubes. The extremal tube may cross the zero field Fermi-surface ( $H = 0$ ) at which point the electrons in the tube will be entirely transferred into the tubes with lower $n$ values. The changing structure gives rise to a loss of tubes from the occupied Fermi-volume when the field changes by amounts $\Delta H$. Thus, if at some value of $k_z$, the occupied Landau tube with the largest area has the largest value $n$ given by the extremal area of the Fermi-surface $A(k_F)$
$$ \frac{2 \pi |e|}{\hbar c} \, H_z \left( n + \frac{1}{2} \right) \sim \pi k_F^2(k_z) = A(k_F) \tag{1319} $$
then, on changing $H_z$ to $H_z + \Delta H$ the tube becomes unoccupied so the largest Landau tube changes from $n$ to $n - 1$. This occurs when
$$ \left( n - \frac{1}{2} \right) ( H_z + \Delta H ) = \left( n + \frac{1}{2} \right) H_z \tag{1320} $$
Thus, the extremal orbit crosses the Fermi-surface when $H_z$ is increased by $\Delta H$, where
$$ n \, \Delta H \approx H_z \tag{1321} $$
which can be used to eliminate $n$ and relate $\frac{\Delta H}{H_z}$ to the momentum space area of the extremal orbit,
$$ A(k_F) = \pi k_F^2(k_z) = \frac{H_z}{\Delta H} \, \frac{2 \pi |e| H_z}{\hbar c} \tag{1322} $$
Thus, $n$ decreases by unity at fields given by
$$ - \frac{\Delta H}{H^2} = - \frac{2 \pi |e|}{\hbar c} \, \frac{1}{A(k_z)} \tag{1323} $$
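In other words, $n$ changes by $-1$ each time $\frac{1}{H_z}$ changes by $\Delta \left( \frac{1}{H_z} \right)$. The non-monotonic variation of the occupancy of the extremal orbits or tubes gives rise to oscillations in the free energy as $H_z$ is varied. This can also be seen from examination of the density of states, per spin polarization, for free electrons
$$ \begin{aligned} \rho(E) & = D \sum_n L_z \int_{-\infty}^{\infty} \frac{dk_z}{2 \pi} \; \delta \left( E - n \hbar \omega_c - \frac{\hbar^2 k_z^2}{2 m} \right) \\ & = \frac{L_x L_y L_z}{4 \pi^2 \hbar} \, m \, \omega_c \sum_n 2 \int_0^\infty dk_z \; \delta \left( E - n \hbar \omega_c - \frac{\hbar^2 k_z^2}{2 m} \right) \\ & = \frac{L_x L_y L_z}{4 \pi^2 \hbar^2} \, 2 \, m^2 \omega_c \sum_n \frac{ \theta( E - n \hbar \omega_c ) }{ \sqrt{ 2 m ( E - n \hbar \omega_c ) } } \end{aligned} \tag{1324} $$
where $D$ is the degeneracy of a Landau orbital. The degeneracy is given by the ratio of the cross-section of the crystal to the real space area enclosed between the Landau orbits
$$ D = \frac{L_x L_y}{\Delta A_r} = \frac{L_x L_y \, m \, \omega_c}{2 \pi \hbar} \tag{1325} $$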
which increases with increasing field. The density of states has equally spaced square root singularities determined by the energies of the Landau levels, but yet still roughly follows the zero field density of states. On changing the field the spacing between the singularities increases. This means that, as the field is increased, successive singularities may cross the Fermi-energy, and give rise to oscillations in physical properties. Physical properties are expressible as averages which are weighted by the product of the Fermi-function and the density of states. For zero spin-orbit coupling, the average of $A$ is given by
$$ \overline{A} = \sum_\sigma \int_{-\infty}^{+\infty} dE \; f(E) \, \rho( E - \mu_B \sigma H_z ) \, A_\sigma(E) \tag{1326} $$
in which the electronic density of states is spin split by the Zeeman field. This splitting is comparable with the effect of $\hbar \omega_c$. Increasing the field will produce regular oscillations in the integrand which will show up in $\overline{A}$. Due to the thermal smearing manifested by the Fermi-function, the oscillations of $\overline{A}$ as a function of $\frac{1}{H_z}$ can only be seen at sufficiently low temperatures such that
$$ k_B T \ll \hbar \omega_c \tag{1327} $$
If this condition is not satisfied, the Fermi-function becomes broad and washes out the peaks in the integrand near $\mu$. As
$$ \frac{|e| \hbar}{m c \, k_B} \sim 1.34 \times 10^{-4} \; \mathrm{K/G} \tag{1328} $$
it is found that, for a typical field of $H = 10$ kG, the oscillations will only be appreciable below $T \sim 2$ K.
——————————————————————————————————
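The temperature scale quoted in Eq. (1328) follows directly from the fundamental constants. The sketch below evaluates $|e| \hbar / (m c \, k_B)$ in Gaussian units and the corresponding value of $\hbar \omega_c / k_B$ at 10 kG; the constant values are standard CGS values and the function names are illustrative.

```python
E_CHARGE = 4.80320e-10   # electron charge magnitude in esu
HBAR = 1.054572e-27      # reduced Planck constant in erg s
C_LIGHT = 2.997925e10    # speed of light in cm/s
M_E = 9.10938e-28        # free electron mass in g
K_B = 1.380649e-16       # Boltzmann constant in erg/K

def cyclotron_temperature_per_gauss(mass=M_E):
    """|e| hbar / (m c k_B): the cyclotron energy per unit field, in kelvin per gauss."""
    return E_CHARGE * HBAR / (mass * C_LIGHT * K_B)

def cyclotron_temperature(field, mass=M_E):
    """hbar omega_c / k_B in kelvin for a field given in gauss."""
    return cyclotron_temperature_per_gauss(mass) * field

if __name__ == "__main__":
    print(cyclotron_temperature_per_gauss())   # about 1.34e-4 K/G
    print(cyclotron_temperature(1.0e4))        # about 1.3 K at 10 kG
```

A band mass larger than the free electron mass lowers this scale in proportion, which is why heavy quasi-particles require lower temperatures or higher fields.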
11.4.3
Exercise 57
A non-uniformity of the magnetic field in a de Haas - van Alphen experiment may cause the oscillations in $M_z$ to be washed out. Calculate the field derivative of the electron energy
$$ \frac{\partial E_{n,k}}{\partial H} \tag{1329} $$
for an extremal orbit. Determine the maximum variation of $H_z$ that is allowable for the oscillations to still be observed. Show that it is given by $\delta H$, where
$$ \frac{\delta H}{H_z^2} < \frac{2 \pi |e|}{\hbar c \, A} \tag{1330} $$
and $A$ is the area of the extremal orbit.
——————————————————————————————————
11.4.4
The Lifshitz-Kosevich Formulae
The de Haas - van Alphen oscillations in the magnetization $M$ can be found from the grand canonical potential $\Omega$
$$ \Omega = - k_B T \sum_\alpha \ln \left( 1 + \exp \left[ - \beta ( E_\alpha - \mu ) \right] \right) \tag{1331} $$
where the sum over $\alpha$ runs over all the one-electron states. The Lifshitz-Kosevich formula describes the oscillatory parts of $M$ (I.M. Lifshitz and A.M. Kosevich, Sov. Phys. J.E.T.P. 2, 636 (1956)). This shall be examined in the $T \to 0$ limit. In the limit $T \to 0$ one has
$$ \lim_{T \to 0} \Omega = \sum_\alpha \left( E_\alpha - \mu \right) \Theta( \mu - E_\alpha ) \tag{1332} $$
where $\Theta(x)$ is the Heaviside step function. Also, the total number of electrons is given by
$$ N_e = \sum_\alpha \Theta( \mu - E_\alpha ) \tag{1333} $$
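The dispersion relation for free electrons in an applied field is given by
$$ E_\alpha = \frac{\hbar^2 k_z^2}{2 m} + \left( n + \frac{1}{2} \right) \hbar \omega_c - \mu_B H_z \sigma \tag{1334} $$
so
$$ \begin{aligned} \Omega = \frac{|e| H_z V}{4 \pi^2 c \hbar} \sum_\sigma \sum_{n=0}^{\infty} \int_{-\infty}^{\infty} dk_z & \left[ \frac{\hbar^2 k_z^2}{2 m} + \left( n + \frac{1}{2} \right) \hbar \omega_c - \mu_B H_z \sigma - \mu \right] \\ & \times \Theta \left( \mu - \frac{\hbar^2 k_z^2}{2 m} - \left( n + \frac{1}{2} \right) \hbar \omega_c + \mu_B H_z \sigma \right) \end{aligned} \tag{1335} $$
For fixed $n$ the step function has the effect that the $k_z$ integration is limited to the range of $k_z$ values, $k_z(\sigma, n) > k_z > - k_z(\sigma, n)$, where
$$ \frac{\hbar^2 k_z(\sigma, n)^2}{2 m} = \mu - \left( n + \frac{1}{2} \right) \hbar \omega_c + \mu_B H_z \sigma \tag{1336} $$
The $k_z$ integration can be performed yielding
$$ \begin{aligned} \Omega = - \frac{4}{3} \, \frac{|e| H_z V}{4 \pi^2 c \hbar} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \sum_\sigma \sum_{n=0}^{\infty} & \left[ \mu - \left( n + \frac{1}{2} \right) \hbar \omega_c + \mu_B H_z \sigma \right]^{\frac{3}{2}} \\ & \times \Theta \left( \mu - \left( n + \frac{1}{2} \right) \hbar \omega_c + \mu_B H_z \sigma \right) \end{aligned} \tag{1337} $$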
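Thus, the summation over $n$ only runs over a finite range of values, where $n$ runs from $0$ to $n_+$, where $n_+$ denotes the integer part of
$$ n_+ = \frac{\mu + \mu_B H_z \sigma}{\hbar \omega_c} - \frac{1}{2} \tag{1338} $$
Hence,
$$ \Omega = - \frac{4}{3} \, \frac{|e| H_z V}{4 \pi^2 c \hbar} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \sum_\sigma \sum_{n=0}^{n_+} \left[ \mu - \left( n + \frac{1}{2} \right) \hbar \omega_c + \mu_B H_z \sigma \right]^{\frac{3}{2}} $$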
(1339) The thermodynamic potential shows oscillatory behavior as $H$ increases, since when $\frac{\mu}{\hbar \omega_c}$ changes by an integer the upper limit of the summation over $n$ also changes by an integer. In order to make the oscillatory nature of the summation more explicit, a periodic function $\beta(x)$ is introduced. The periodic function is defined as
$$ \beta(x) = \sum_{n=-\infty}^{+\infty} \delta \left( x - \left( n + \frac{1}{2} \right) \right) \tag{1340} $$
The summation over $n$ in the thermodynamic potential can be expressed in terms of an integral over $\beta(x)$ via
$$ \Omega = - \frac{4}{3} \, \frac{|e| H_z V}{4 \pi^2 c \hbar} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \sum_\sigma \int_0^{x_+} dx \; \beta(x) \left[ \mu - x \hbar \omega_c + \mu_B H_z \sigma \right]^{\frac{3}{2}} \tag{1341} $$
where the upper limit of integration is given by
$$ x_+ = \frac{\mu}{\hbar \omega_c} + \mu_B \sigma \, \frac{m c}{|e| \hbar} \tag{1342} $$
However, as
$$ \mu_B = \frac{|e| \hbar}{2 m c} \tag{1343} $$
the upper limit of integration becomes
$$ x_+ = \frac{\mu}{\hbar \omega_c} + \frac{\sigma}{2} \tag{1344} $$
in which the mass of the electron has cancelled in the second term. In general, the spin splitting term will depend on the ratio of the mass of the electron to the band mass.
On Fourier analyzing $\beta(x)$ one has
$$ \beta(x) = 1 + 2 \sum_{p=1}^{\infty} \cos 2 \pi p \left( x - \frac{1}{2} \right) \tag{1345} $$
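which on substituting into the expression for $\Omega$ yields the expression
$$ \begin{aligned} \Omega = - \frac{4}{3} \, \frac{|e| H_z V}{4 \pi^2 c \hbar} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \sum_\sigma \Bigg[ & \frac{2}{5 \hbar \omega_c} \left( \mu + \mu_B \sigma H_z \right)^{\frac{5}{2}} \\ & + 2 \sum_{p=1}^{\infty} \int_0^{x_+} dx \; \cos 2 \pi p \left( x - \frac{1}{2} \right) \left( \mu - x \hbar \omega_c + \mu_B H_z \sigma \right)^{\frac{3}{2}} \Bigg] \end{aligned} \tag{1346} $$
The first term is non-oscillatory. The term containing the summation produces the oscillatory terms. The second term can be evaluated by integration by parts
$$ \begin{aligned} I_p & = \int_0^{x_+} dx \; \cos 2 \pi p \left( x - \frac{1}{2} \right) \left[ \frac{\mu}{\hbar \omega_c} - x + \frac{\mu_B H_z \sigma}{\hbar \omega_c} \right]^{\frac{3}{2}} \\ & = \int_0^{x_+} \frac{dx}{2 \pi p} \left[ \frac{d}{dx} \sin 2 \pi p \left( x - \frac{1}{2} \right) \right] \left( x_+ - x \right)^{\frac{3}{2}} \\ & = \frac{3}{2} \int_0^{x_+} \frac{dx}{2 \pi p} \; \sin 2 \pi p \left( x - \frac{1}{2} \right) \left( x_+ - x \right)^{\frac{1}{2}} \end{aligned} \tag{1347} $$
since the boundary term vanishes. Integrating by parts once again
$$ \begin{aligned} I_p & = - \frac{3}{8 \pi^2 p^2} \int_0^{x_+} dx \left[ \frac{d}{dx} \cos 2 \pi p \left( x - \frac{1}{2} \right) \right] \left( x_+ - x \right)^{\frac{1}{2}} \\ & = \frac{3}{8 \pi^2 p^2} \left[ x_+^{\frac{1}{2}} \cos \pi p - \frac{1}{2} \int_0^{x_+} dx \; \cos 2 \pi p \left( x - \frac{1}{2} \right) \left( x_+ - x \right)^{- \frac{1}{2}} \right] \end{aligned} \tag{1348} $$
Changing variables from $x$ to $u$, where
$$ 2 p \left( x_+ - x \right) = \frac{u^2}{2} \tag{1349} $$
so
$$ dx = - \frac{1}{2 p} \, u \, du \tag{1350} $$
Then, the integration becomes
$$ I_p = \frac{3}{8 \pi^2 p^2} \left[ x_+^{\frac{1}{2}} \cos \pi p - \frac{1}{\sqrt{4 p}} \int_0^{u_0} du \; \cos \frac{\pi}{2} \left( u_0^2 - u^2 - 2 p \right) \right] \tag{1351} $$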
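The cosine term can be decomposed as
$$ \cos \left( \frac{\pi u^2}{2} + \phi \right) = \cos \frac{\pi u^2}{2} \cos \phi - \sin \frac{\pi u^2}{2} \sin \phi \tag{1352} $$
so one has the integrals
$$ C(x) = \int_0^x du \; \cos \frac{\pi u^2}{2} \qquad S(x) = \int_0^x du \; \sin \frac{\pi u^2}{2} \tag{1353} $$
which for large $x$ have the limits
$$ C(\infty) = S(\infty) = \frac{1}{2} \tag{1354} $$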
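Thus, the integral is evaluated as
$$ \begin{aligned} I_p & = \frac{3}{8 \pi^2 p^2} \left[ x_+^{\frac{1}{2}} \cos \pi p - \frac{1}{\sqrt{4 p}} \left( C(u_0) \cos \frac{\pi}{2} \left( u_0^2 - 2 p \right) + S(u_0) \sin \frac{\pi}{2} \left( u_0^2 - 2 p \right) \right) \right] \\ & \sim \frac{3}{8 \pi^2 p^2} \left[ x_+^{\frac{1}{2}} \cos \pi p - \frac{1}{\sqrt{8 p}} \cos \frac{\pi}{2} \left( u_0^2 - 2 p - \frac{1}{2} \right) \right] \end{aligned} \tag{1355} $$
Thus, the oscillatory part of the grand canonical potential $\Delta\Omega$ is
$$ \Delta\Omega \sim \left( \hbar \omega_c \right)^{\frac{3}{2}} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \frac{|e| H_z V}{4 \pi^4 c \hbar} \sum_\sigma \sum_{p=1}^{\infty} \frac{2}{( 2 p )^{\frac{5}{2}}} \cos \left[ 2 \pi p \left( \frac{\mu}{\hbar \omega_c} + \frac{\sigma - 1}{2} \right) - \frac{\pi}{4} \right] $$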
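(1356) This depends on the ratio of the extremal cross-section of the zero field Fermi-surface, $A_F = \pi k_F^2$, and the difference in areas of the Landau orbits in momentum space, $\Delta A = \frac{2 \pi |e|}{\hbar c} H_z$,
$$ \begin{aligned} \Delta\Omega & \sim \left( \hbar \omega_c \right)^{\frac{3}{2}} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \frac{|e| H_z V}{4 \pi^4 c \hbar} \sum_\sigma \sum_{p=1}^{\infty} \frac{2}{( 2 p )^{\frac{5}{2}}} \cos \left[ 2 \pi p \left( \frac{c \hbar \, \pi k_F^2}{2 \pi |e| H_z} + \frac{\sigma}{2} - \frac{1}{2} \right) - \frac{\pi}{4} \right] \\ & \sim \left( \hbar \omega_c \right)^{\frac{3}{2}} \left( \frac{2 m}{\hbar^2} \right)^{\frac{1}{2}} \frac{|e| H_z V}{4 \pi^4 c \hbar} \sum_\sigma \sum_{p=1}^{\infty} \frac{2}{( 2 p )^{\frac{5}{2}}} \cos \left[ 2 \pi p \left( \frac{A_F}{\Delta A} + \frac{\sigma - 1}{2} \right) - \frac{\pi}{4} \right] \end{aligned} $$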
(1357) Thus, oscillations in the grand canonical potential occur when the number of Landau orbits inside the extremal cross-sectional area changes. The oscillations occur in the magnetization $M_z$ as it is related to the grand canonical potential $\Omega$ via
$$ M_z = - \frac{\partial \Omega}{\partial H_z} \tag{1358} $$
Thus, the magnetization also has oscillations that are periodic in $\frac{1}{H_z}$. Furthermore, for a free electron gas the extremal area of the Fermi-surface is just $\pi k_F^2$, so the period of oscillations in $\frac{1}{H_z}$ is inversely proportional to the extremal cross-sectional area of the zero field Fermi-sphere. In addition to the fundamental oscillations, there are also higher harmonics which can be observed in experiments. For the more general situation, where the Fermi-surface is non-spherical, different extremal cross-sections will be observed when the magnetic field is applied in different directions. This can be used to map out the Fermi-surface. The Lifshitz-Kosevich formula, valid at finite temperatures, is
$$ \begin{aligned} \Delta\Omega = k_B T \, V \left( \frac{|e| H_z}{2 \pi \hbar c} \right)^{\frac{3}{2}} \frac{1}{( 2 \pi )^{\frac{1}{2}}} \sum_\sigma \sum_p \frac{1}{p^{\frac{3}{2}}} & \, \frac{ \exp \left[ - \frac{\pi p}{\omega_c \tau} \right] }{ \sinh \left( \frac{2 \pi^2 p \, k_B T}{\hbar \omega_c} \right) } \\ & \times \cos \pi p \; \cos \left[ 2 \pi p \left( \frac{A_F}{\Delta A} + \frac{\sigma}{2} \frac{m^*}{m_e} \right) - \frac{\pi}{4} \right] \end{aligned} \tag{1359} $$
On performing the sum over the spin polarizations, one obtains the result
$$ \begin{aligned} \Delta\Omega = 2 \, k_B T \, V \left( \frac{|e| H_z}{2 \pi \hbar c} \right)^{\frac{3}{2}} \frac{1}{( 2 \pi )^{\frac{1}{2}}} \sum_p \frac{1}{p^{\frac{3}{2}}} & \, \frac{ \exp \left[ - \frac{\pi p}{\omega_c \tau} \right] }{ \sinh \left( \frac{2 \pi^2 p \, k_B T}{\hbar \omega_c} \right) } \\ & \times \cos \pi p \; \cos \left( \pi p \frac{m^*}{m_e} \right) \cos \left[ 2 \pi p \frac{A_F}{\Delta A} - \frac{\pi}{4} \right] \end{aligned} \tag{1360} $$
The splitting between the up-spin and down-spin bands has modified the relative phase of the higher harmonics in the oscillations. This can be used to extract the ratio of the band mass of the electron to the electron mass in vacuum. For systems which are on the verge of ferromagnetism, the spin splitting factor should be enhanced by including the effective field on the spins due to the interactions with the other electrons. This formula also includes the exponential damping of the oscillations due to $T$ through the thermal smearing of the Fermi-surface and also has an exponential damping term depending on the rate $\frac{1}{\tau}$ for elastic scattering from impurities. Both these effects reduce the amplitude of the de Haas - van Alphen oscillations (R.B. Dingle, Proc. Roy. Soc. A 211, 257 (1952)). The oscillations can only be seen at low temperatures $T < 1$ K and for samples of high purity, as indicated by small residual resistances. The oscillations are only seen in materials where the zero temperature limit of the resistivity $\rho(0)$ is less than $1 \; \mu\Omega$ cm. The term involving the lifetime comes from the width of the quasi-particle spectrum, and should also be accompanied by the change in quasi-particle energy due to interactions. Therefore, the increase in the quasi-particle mass can also be extracted from the amplitude of the de Haas - van Alphen oscillations. However, the amplitudes of the oscillations from the heavier mass bands are small compared with those from the light quasi-particle bands. In heavy fermion materials such as CeCu$_6$ and UPt$_3$, quasi-particle masses of about 200 free electron masses have been observed in de Haas - van Alphen experiments.
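The two damping mechanisms can be illustrated with a short sketch. It assumes the standard forms of the reduction factors for the $p$-th harmonic, namely the thermal factor $X / \sinh X$ with $X = 2 \pi^2 p \, k_B T / (\hbar \omega_c)$ and the Dingle factor $\exp[ - \pi p / (\omega_c \tau) ]$, normalized so that both equal one in the absence of damping. The function names, field, temperature, lifetime and mass values are illustrative assumptions, and Gaussian units are used.

```python
import math

E_CHARGE = 4.80320e-10   # electron charge magnitude in esu
HBAR = 1.054572e-27      # reduced Planck constant in erg s
C_LIGHT = 2.997925e10    # speed of light in cm/s
M_E = 9.10938e-28        # free electron mass in g
K_B = 1.380649e-16       # Boltzmann constant in erg/K

def cyclotron_frequency(field, mass=M_E):
    """omega_c = |e| H / (m c); field in gauss, mass in grams."""
    return E_CHARGE * field / (mass * C_LIGHT)

def thermal_factor(p, temperature, field, mass=M_E):
    """X / sinh(X) with X = 2 pi^2 p k_B T / (hbar omega_c), in an overflow-safe form."""
    x = 2 * math.pi ** 2 * p * K_B * temperature / (HBAR * cyclotron_frequency(field, mass))
    if x < 1e-8:
        return 1.0                       # limit X -> 0
    return 2 * x * math.exp(-x) / (-math.expm1(-2 * x))

def dingle_factor(p, field, tau, mass=M_E):
    """exp(-pi p / (omega_c tau)) for a scattering lifetime tau in seconds."""
    return math.exp(-math.pi * p / (cyclotron_frequency(field, mass) * tau))

if __name__ == "__main__":
    field = 1.0e5                        # assumed field of 100 kG
    for mass_ratio in (1, 10, 200):      # assumed band masses in units of m_e
        print(mass_ratio,
              thermal_factor(1, 1.0, field, mass_ratio * M_E),
              dingle_factor(1, field, 1.0e-11, mass_ratio * M_E))
```

Both factors fall rapidly as the band mass grows, which is the quantitative content of the remark that oscillations from heavy bands are much weaker than those from light bands.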
11.4.5
Other Fermi-Surface Probes
There are many other probes of the Fermi-surface; these include the attenuation of sound waves. Consider sound waves in a crystal propagating perpendicular to the direction of the applied magnetic field and having a transverse polarization that is also perpendicular to $H_z$. The motion of the ions is accompanied by an electric field of the same frequency, wave vector and polarization. The electrons interact with the sound wave through the electric field. If the mean free path is sufficiently long, the attenuation of the sound waves can be used to determine the Fermi-surface. The electrons follow real space orbits which have projections in the plane perpendicular to $H_z$, which are just cross-sections of the constant energy surfaces in momentum space, but are scaled by $\frac{\hbar c}{|e| H_z}$ and rotated by $\frac{\pi}{2}$. As the velocities of the ions are much smaller than the electrons' velocities, the electric field may be considered to be static. If the inverse of the phonon wave vector, $q^{-1}$, is comparable to the radius of the real space orbit, or more precisely the diameter of the orbit in the direction of q, then the electric field can significantly perturb the electron's motion. This strongly depends on the mismatch between $q^{-1}$ and the diameter of the orbit. When the radius $r_q$ of the orbit is such that
$$ 2 r_q = \frac{\lambda}{2} \tag{1361} $$
then the electron may be accelerated tangentially by the electric field at both extremities of the orbit. The coupling is coherent over the electron's orbit and the coupling is strong. When
$$ 2 r_q = \lambda \tag{1362} $$
the electron is sequentially accelerated and decelerated by the field. The coupling is out of phase on the different segments of the electron's orbit so that the resulting coupling is weak. In general, the condition for strong coupling is that of constructive interference
$$ 2 r_q = \left( n + \frac{1}{2} \right) \lambda \tag{1363} $$
and weak coupling occurs when the interference is destructive
$$ 2 r_q = n \lambda \tag{1364} $$
The period differs slightly from the asymptotic large n variation just described. Assume that the projection of the trajectory on the plane perpendicular to the applied field $H_z$ is circular. The energy transfer between the electron and the electric field of the sound wave in one orbit is given by
$$ \int_0^{2\pi/\omega_c} dt \; \mathbf{E}(\mathbf{r}, t) \cdot \mathbf{v}(t) \tag{1365} $$
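The electric field is assumed to be polarized along the y direction, and q is directed along the x direction. Since the orbit in momentum space is rotated by $\frac{\pi}{2}$ with respect to the real space orbit, one has
$$ v_y(t) = v_F \sin \omega_c t \tag{1366} $$
and
$$ x(t) = r_q \sin \omega_c t \tag{1367} $$
Thus, the energy transfer in a period is evaluated as
$$ E_y v_F \int_0^{2\pi/\omega_c} dt \; \exp\left[ i q r_q \sin \omega_c t \right] \sin \omega_c t = i \, E_y v_F \, \frac{2\pi}{\omega_c} \, J_1( q r_q ) \tag{1368} $$
where the factor of i is an overall phase that does not affect the magnitude of the energy transfer.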
Thus, the resonances occur for phonon wave lengths which match the maxima of the Bessel function $J_1(x)$. Only electrons near the Fermi-surface can absorb energy from the sound wave. The Pauli exclusion principle forbids electrons in other states to undergo low energy excitations, since the slightly higher energy states are already occupied. The electrons with the extremal diameter on the Fermi-surface are more numerous and, therefore, play a dominant role in the attenuation process. Thus, the sound wave attenuation may display an approximately periodic variation in $\frac{1}{\lambda}$, where the asymptotic period is determined by
$$ \Delta \left( \frac{1}{\lambda} \right) = \frac{1}{2 r_q} \tag{1369} $$
By variation of q and H, one can map out the Fermi-surface.
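As a numerical illustration of the orbit integral in (1368): in the variable θ = ω_c t it reduces to the integral of exp(i x sin θ) sin θ over one period, with x = q r_q, and its modulus is 2π |J_1(x)|. The sketch below checks this with the standard library only. The value of x and the helper names are illustrative assumptions, not part of the notes.

```python
import cmath
import math

def bessel_j1(x, terms=40):
    """Power series for the Bessel function of the first kind, J_1(x)."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(k + 1))
               * (x / 2) ** (2 * k + 1) for k in range(terms))

def orbit_integral(x, steps=2000):
    """Integrate exp(i x sin(theta)) * sin(theta) over theta in [0, 2 pi).

    The rectangle rule is used; it converges very rapidly for a smooth
    periodic integrand such as this one.
    """
    dtheta = 2 * math.pi / steps
    return sum(cmath.exp(1j * x * math.sin(k * dtheta)) * math.sin(k * dtheta)
               for k in range(steps)) * dtheta

x = 1.8  # hypothetical value of q * r_q, chosen only for illustration
numeric = orbit_integral(x)
closed_form = 2j * math.pi * bessel_j1(x)  # 2 pi i J_1(x)
print(abs(numeric), abs(closed_form))
```

Dividing by ω_c and multiplying by E_y v_F recovers the energy transfer per period, so the resonances in the attenuation follow the maxima of |J_1(q r_q)|.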
11.4.6
Cyclotron Resonances
This method requires the application of a microwave electric field at the surface of a metal. The field is attenuated as it penetrates into the metal, and is only appreciable within a skin depth δ from the surface. Since the field does not penetrate the bulk, electrons can only pick up energy from the field when they are within the skin depth of the surface. A static (d.c.) magnetic field is applied parallel to the surface, say in the x direction, so that the electrons undergo spiral orbits in real space. The velocity $v_x$ remains constant, but the electrons undergo circular motion in the y-z plane. It is only necessary to consider the electrons that travel in spirals that are close and parallel to the surface, as it is these electrons that couple to the microwave field. The size of the orbit and the electron's mean free path λ should be much larger than the skin depth δ. This holds true when the cyclotron frequency $\omega_c$ is large, and for microwave frequencies ω where the anomalous skin depth phenomenon occurs. The condition of a long mean free path and large cyclotron frequency is necessary for the electrons to undergo well defined spirals, so $\omega_c \tau \gg 1$. The electrons pick up energy from the field only if they are within δ of the surface. The electrons in the spiral orbits only experience the electric field each time they enter the surface region. They enter the skin depth periodically, with period
$$ T_H = \frac{2\pi}{\omega_c} \tag{1370} $$
which is the period of the cyclotron motion. In general, the period is given by
$$ T_H = \frac{\hbar^2 c}{|e| H_x} \frac{\partial A}{\partial E} \tag{1371} $$
The electron will experience an E field with the same phase, if the applied field has completed an integral number of oscillations during each cyclotron period. That is
$$ T_H = n T_E = n \frac{2\pi}{\omega} \tag{1372} $$
Hence, this requires that the frequency of the cyclotron orbit match with the frequency of the a.c. electric field
$$ \omega = n \omega_c \tag{1373} $$
so that the a.c. field resonates with the electronic motion in the uniform field. This condition can be written as
$$ \frac{1}{H_x} = n \, \frac{2\pi |e|}{\hbar^2 c \, \omega \, \frac{\partial A}{\partial E}} \tag{1374} $$
The factor
$$ m_c = \frac{\hbar^2}{2\pi} \frac{\partial A}{\partial E} \tag{1375} $$
is known as the cyclotron mass. For free electrons, $A = \pi k^2$ and $E = \frac{\hbar^2 k^2}{2 m}$, so the cyclotron mass coincides with the electron mass
$$ m_c = m \tag{1376} $$
If the microwave absorption is plotted versus $\frac{1}{H_x}$, a series of uniformly spaced resonance peaks should be found.
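The uniform spacing in 1/H_x can be made concrete with a short sketch. It assumes Gaussian units, the free electron mass as the cyclotron mass, and a hypothetical microwave frequency of 24 GHz; the function name is illustrative only. It uses the resonance condition ω = n ω_c with ω_c = |e| H_x / (m_c c).

```python
import math

# Gaussian (CGS) constants
E_CHARGE = 4.80320425e-10    # esu
C_LIGHT = 2.99792458e10      # cm / s
M_ELECTRON = 9.1093837e-28   # g

def resonance_fields(omega, m_c, n_max):
    """Fields H_n (in gauss) at which omega = n * omega_c, for n = 1 .. n_max."""
    return [m_c * C_LIGHT * omega / (n * E_CHARGE) for n in range(1, n_max + 1)]

omega = 2 * math.pi * 24e9   # hypothetical microwave angular frequency
fields = resonance_fields(omega, M_ELECTRON, 5)
inverse_fields = [1.0 / h for h in fields]
spacings = [b - a for a, b in zip(inverse_fields, inverse_fields[1:])]
print(fields[0], spacings)
```

The spacing of the peaks in 1/H_x equals |e| / (m_c c ω), so a measured spacing at known ω gives the cyclotron mass directly.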
The calculation of the absorption is the simplest in the case when the wave length of the electromagnetic field λ is much larger than the cyclotron orbit and $\omega_c \tau \gg 1$. The geometry shall be considered where a surface has a normal in the z direction and a d.c. magnetic field is applied parallel to the surface in the x direction
$$ \mathbf{H} = \hat{e}_x H_x \tag{1377} $$
and the a.c. electric field is in the y direction
$$ \mathbf{E} = \hat{e}_y E_y \exp\left[ i ( q z - \omega t ) \right] \tag{1378} $$
The electron in its orbit experiences a rapidly alternating electric field. For most values of z the contributions cancel. The cancellation only fails at the extremal values of z where the velocity lies in the plane z = const. To the zero-th order approximation, the z component of the electron's position can be expressed as
$$ z(t) = z_0 + \frac{v}{\omega_c} \sin \omega_c t \tag{1379} $$
The total change in momentum of the electron due to the oscillating field, in one period, can be calculated in the semi-classical approximation. The change of momentum and, hence, the current will be in the direction of the a.c. field. The impulse imparted to the electron is given by the integral of the electric field evaluated at the electron's position
$$ - |e| \int_t^{t + T_H} dt' \; E_y \exp\left[ i \left( q z_0 + \frac{q v}{\omega_c} \sin \omega_c t' - \omega t' \right) \right] \sim - |e| E_y \, \frac{2\pi}{\omega_c} \exp\left[ i ( q z_0 - \omega t ) \right] I_{\frac{\omega}{\omega_c}}\left( \frac{q v}{\omega_c} \right) \tag{1380} $$
where $I_n(x)$ is the modified Bessel function of order n. For large $\frac{q v}{\omega_c}$, this has the asymptotic form
$$ \sim - |e| E_y \left( \frac{2\pi}{q v \omega_c} \right)^{\frac{1}{2}} \exp\left[ i \left( q z_0 - \omega t - \frac{\pi}{4} \right) \right] \tag{1381} $$
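Due to the phase differences around the orbit, only a fraction
$$ \left( \frac{2\pi \omega_c}{q v} \right)^{\frac{1}{2}} \tag{1382} $$
of the orbit contributes to the integral. The energy gain of the electron in one traversal is given by
$$ v \hbar \, \delta k_y = - |e| E_y \left( \frac{2\pi v}{q \omega_c} \right)^{\frac{1}{2}} \exp\left[ i \left( q z_0 - \omega t - \frac{\pi}{4} \right) \right] \tag{1383} $$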
The previous traversal caused a similar displacement, but with $t \to t - \frac{2\pi}{\omega_c}$. However, only the fraction
$$ \exp\left[ - \frac{2\pi}{\omega_c \tau} \right] \tag{1384} $$
of electrons survive traversing one cyclotron orbit without scattering. As these all have a similar form, one can obtain the average energy displacement experienced by an electron between one scattering and the next
$$ v \hbar \, \Delta k_y = - |e| E_y \left( \frac{2\pi v}{q \omega_c} \right)^{\frac{1}{2}} \exp\left[ i \left( q z_0 - \omega t - \frac{\pi}{4} \right) \right] F \tag{1385} $$
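The factor F reflects the sum of the probabilities that the electron survives n orbits without scattering. The value of F is given by
$$ F = \sum_{n=0}^{\infty} \exp\left[ - \frac{2\pi n}{\omega_c \tau} ( 1 + i \omega \tau ) \right] = \frac{1}{1 - \exp\left[ - \frac{2\pi}{\omega_c \tau} ( 1 + i \omega \tau ) \right]} \tag{1386} $$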
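The resonant structure contained in F can be seen numerically. The following sketch evaluates the closed form in (1386), compares it with a truncated version of the sum, and contrasts |F| on resonance (ω = 2 ω_c) with off resonance (ω = 2.5 ω_c). The parameter values ω_c = 1 and τ = 20 are arbitrary illustrative choices, and the function names are not from the notes.

```python
import cmath
import math

def survival_factor(omega, omega_c, tau):
    """Closed form of F in (1386)."""
    z = cmath.exp(-(2 * math.pi / (omega_c * tau)) * (1 + 1j * omega * tau))
    return 1 / (1 - z)

def survival_partial_sum(omega, omega_c, tau, n_terms):
    """Truncated geometric series for F, summed term by term."""
    return sum(cmath.exp(-(2 * math.pi * n / (omega_c * tau)) * (1 + 1j * omega * tau))
               for n in range(n_terms))

omega_c, tau = 1.0, 20.0                      # illustrative values, omega_c * tau >> 1
on_resonance = abs(survival_factor(2.0 * omega_c, omega_c, tau))
off_resonance = abs(survival_factor(2.5 * omega_c, omega_c, tau))
print(on_resonance, off_resonance)
```

On resonance the phases of successive traversals add coherently and |F| approaches ω_c τ / 2π for large ω_c τ, whereas half-way between resonances the terms alternate in sign and |F| is of order one half.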
It is the imaginary part of the probability for survival that causes the resonances in the surface impedance. To calculate the current at a depth $z_0$ one must examine the orbits on the Fermi-surface. The orbits circulate around the Fermi-surface in sections that are perpendicular to the d.c. field. Thus, the orbits are in the y − z plane. The portions of the Fermi-surface orbits which contribute most to the current are those in which the electrons are moving parallel to the surface. These portions of the orbits are those where $k_z = 0$, and form the effective zone. An electron on the effective zone has $k_z = 0$ and is, thus, moving at the extreme of its orbit. Due to the effect of the a.c. field this orbit has been displaced by a distance $\Delta k_y$ from the orbit in which the a.c. field has been turned off. The total current from the orbits in a section of width $dk_x$, around $k_x$, can be calculated by considering the contribution of the electrons around the orbit. Due to the phase differences around the orbits, only those within a distance $k_x \left( \frac{2\pi \omega_c}{q v} \right)^{\frac{1}{2}}$ of the effective zone contribute. These orbits are displaced from their equilibrium positions by an amount $\Delta k_y$. Only the displacements from equilibrium contribute to the current. The contribution to the current density is
$$ \delta J_y = - |e| \, \frac{2}{8 \pi^3} \, k_x \left( \frac{2\pi \omega_c}{q v} \right)^{\frac{1}{2}} \Delta k_y \, v \, dk_x \, \exp\left[ i \frac{\pi}{4} \right] = \frac{1}{2 \pi^2} \frac{e^2}{\hbar q} \, dk_x \, k_x \, F \, E_y \tag{1387} $$
where the phases of $\frac{\pi}{4}$ cancel. The integration over $dk_x$ can be converted into an integral over the effective zone via φ. If the effective mass and cyclotron frequency are constant over the Fermi-surface, then F is also constant. The conductivity σ is proportional to F. The surface impedance is defined as
$$ Z(\omega) = 4 \pi i \omega \, \frac{E_x}{\frac{\partial E_x}{\partial x}} \sim ( 1 - i ) \left( \frac{2\pi \omega}{\sigma} \right)^{\frac{1}{2}} \sim F^{-\frac{1}{2}} \tag{1388} $$
The surface impedance has oscillations with varying field $H_x$. It should be obvious from the above discussion that the oscillations provide information on $\frac{\omega}{\omega_c}$, but the dominant contribution occurs from the extremal parts of the line of intersection of the Fermi-surface with the plane $k_z = 0$. Thus, the cyclotron resonance can be used to study points on the Fermi-surface.
11.5
The Quantum Hall Effect
The Quantum Hall Effect is found in two-dimensional electron systems, in which an electric field is applied perpendicular to the plane where the electrons are confined. Experimentally, the electrons can be confined to a two-dimensional sheet in a metal oxide semiconductor field effect transistor. The application of a strong electric field to the surface of a semiconductor may pull down the conduction band at the surface of the semiconductor. If the energy of these field induced surface states is less than the Fermi-energy of the metal, electrons will tunnel across the insulating oxide barrier and occupy them. After equilibrium has been established, the electrons at the surface of the semiconductor will form a two-dimensional electron gas.
11.5.1
The Integer Quantum Hall Effect
The Integer Quantum Hall Effect can be understood entirely within the framework of non-interacting electrons. The calculation of the Hall coefficient can be performed using the Kubo formula. However, in applying the Kubo formula, one must recognize that the vector potential has two components: an a.c. component responsible for producing the applied electric field and a second component which produces the static magnetic field. The usual derivation only takes the a.c. component of the vector potential into account as a perturbation. The d.c. component of the vector potential must be added to the paramagnetic current operator, yielding the electron velocity operator appropriate for the situation where the weakly perturbing electric field is zero. The application of a static magnetic field $B_z$ perpendicular to the surface will quantize the motion of the electrons parallel to the surface. The motion parallel to the surface is quantized into Landau orbits, and the energy eigenvalue equation reduces to the energy eigenvalue equation of simple harmonic motion. Due to the confinement in the direction perpendicular to the surface, $k_z$ will not be a good quantum number, and the perpendicular component of the energy will form highly degenerate discrete levels $\epsilon_n$. For large enough fields only the lowest level $\epsilon_0$ will be occupied. On choosing a particular asymmetric gauge for the vector potential,
$$ \mathbf{A}(\mathbf{r}) = + \hat{e}_y B_z x \tag{1389} $$
the Hamiltonian for the two-dimensional motion in the x − y plane is given by
$$ \hat{H} = \frac{\hat{p}_x^2}{2 m} + \frac{m \omega_c^2}{2} \left( \hat{x} - \hat{X} \right)^2 \tag{1390} $$
where the cyclotron frequency $\omega_c$ is given by
$$ \omega_c = \frac{|e| B_z}{m c} \tag{1391} $$
The operator $\hat{X}$ is given in terms of the y component of the momentum operator $\hat{p}_y$. Since the Hamiltonian is independent of y, both $p_y$ and X can be taken as constants of motion. The momentum component $\hat{p}_x$ is canonically conjugate to the x component of the particle's position relative to the center of the orbit
$$ \hat{x} - \hat{X} = x + \frac{\hat{p}_y c}{|e| B_z} \tag{1392} $$
The energy eigenvalues of the shifted harmonic oscillator are given by
$$ E_{\nu,0} = \epsilon_0 + \hbar \omega_c \left( \nu + \frac{1}{2} \right) \tag{1393} $$
where ν is the quantum number for the Landau levels. The Landau levels are independent of $k_y$ and, therefore, are degenerate. Since the values of $k_y$ are quantized via
$$ k_y = \frac{2 \pi n_y}{L_y} \tag{1394} $$
for a surface of length $L_y$, then the possible values of $k_y$ are limited by the restriction
$$ L_x > X > 0 \tag{1395} $$
which yields the total number of degenerate states as
$$ D = L_x L_y \frac{B_z |e|}{2 \pi \hbar c} = \frac{\Phi |e|}{2 \pi \hbar c} \tag{1396} $$
where Φ is the total magnetic flux passing through the sample. The fundamental flux quantum $\Phi_0$ is defined as the quantity
$$ \Phi_0 = \frac{2 \pi \hbar c}{|e|} \tag{1397} $$
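To get a feeling for the size of the degeneracy D = Φ / Φ_0, the following sketch evaluates it in SI units, where the flux quantum reads Φ_0 = 2πħ/|e| = h/|e| (the factor of c in (1397) is specific to Gaussian units). The field of 10 T and the sample area of 1 mm² are hypothetical values chosen for illustration.

```python
# SI constants (exact values in the 2019 SI)
E_CHARGE = 1.602176634e-19   # C
PLANCK_H = 6.62607015e-34    # J s

def flux_quantum():
    """Flux quantum h / |e| in webers (SI form of Phi_0 = 2 pi hbar c / |e|)."""
    return PLANCK_H / E_CHARGE

def landau_degeneracy(b_tesla, area_m2):
    """Number of states per Landau level, D = Phi / Phi_0, for one spin species."""
    return b_tesla * area_m2 / flux_quantum()

degeneracy = landau_degeneracy(10.0, 1.0e-6)   # 10 T through 1 mm^2
print(flux_quantum(), degeneracy)
```

Each Landau level therefore holds one state per flux quantum threading the sample, which is of order 10^9 states for the illustrative numbers above.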
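The density of states can be approximately expressed as a discrete set of delta functions
$$ \rho(E) = \frac{|e| B_z L_x L_y}{2 \pi \hbar c} \sum_{\nu} \delta\left( E - \epsilon_0 - \hbar \omega_c \left( \nu + \frac{1}{2} \right) \right) = \frac{m \omega_c L_x L_y}{2 \pi \hbar} \sum_{\nu} \delta\left( E - \epsilon_0 - \hbar \omega_c \left( \nu + \frac{1}{2} \right) \right) \tag{1398} $$
The weight associated with each delta function corresponds to the degeneracy of each Landau level. On defining the cyclotron radius $r_c$ as
$$ r_c = \sqrt{ \frac{\hbar c}{|e| B_z} } \tag{1399} $$
one finds that the relative position operator can be expressed as
$$ \hat{x} - \hat{X} = \frac{r_c}{\sqrt{2}} \left( a^{\dagger}_{k_y} + a_{k_y} \right) \tag{1400} $$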
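and from the Heisenberg equation of motion for $\hat{x}$ one finds that the x component of the velocity is given by
$$ \hat{v}_x = \frac{1}{i \hbar} \left[ \hat{x} , \hat{H} \right] = i \, \frac{r_c \omega_c}{\sqrt{2}} \left( a^{\dagger}_{k_y} - a_{k_y} \right) \tag{1401} $$
as $\hat{p}_y$ can be taken to be diagonal. The y component of the velocity is found from the Heisenberg equation of motion for $\hat{y}$ and is given by
$$ \hat{v}_y = \frac{1}{i \hbar} \left[ \hat{y} , \hat{H} \right] = \omega_c \left( \hat{x} - \hat{X} \right) \tag{1402} $$
since the commutator $\left[ \hat{y} , \hat{X} \right] = - \frac{i \hbar c}{|e| B_z}$. On substituting for the x component of the displacement from the center of the orbit
$$ \hat{v}_y = \frac{r_c \omega_c}{\sqrt{2}} \left( a^{\dagger}_{k_y} + a_{k_y} \right) \tag{1403} $$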
Thus, as the velocity operators are non-diagonal in the quantized Landau level indices, the Landau orbitals do not carry a net current. The Kubo formula can be expressed in terms of the electron velocity operators, which include the diamagnetic current contributions from the static magnetic field. On using the form of the Kubo formula for the conductivity tensor, per unit area, appropriate for single particle excitations
$$ \sigma_{\alpha,\beta}(\omega) = \frac{i e^2}{\omega + i \eta} \, \frac{1}{L_x L_y} \left[ \sum_{\nu,\nu',k_y} \langle \nu, k_y | \hat{v}_{\alpha} | \nu', k_y \rangle \langle \nu', k_y | \hat{v}_{\beta} | \nu, k_y \rangle \, \frac{f(E_{\nu}) - f(E_{\nu'})}{\hbar \omega + i \eta + \hbar \omega_c ( \nu - \nu' )} - \delta_{\alpha,\beta} \sum_{\nu,k_y} \frac{f(E_{\nu})}{m} \right] \tag{1404} $$
one finds that the diagonal component is zero
$$ \mathrm{Re} \, \sigma_{x,x}(0) = 0 \tag{1405} $$
but, nevertheless, the off-diagonal term is finite and quantized
$$ \mathrm{Re} \, \sigma_{x,y}(0) = \frac{e^2}{2 \pi \hbar} ( n + 1 ) \tag{1406} $$
where n is the quantum number for the highest occupied Landau orbital. As the field is changed, the peaks in the density of states associated with the Landau levels sweep through the Fermi-level. The Hall resistivity should, therefore, show a set of steps as the applied field is increased. This phenomenon is the integer quantum Hall effect. Experimentally, it is found that the steps in $\sigma_{x,y}(0)$ are not discontinuous, but instead show a finite slope in the transition region. Furthermore, the diagonal component of the resistivity is not identically zero, but shows spikes for fields in the regions where the transitions between successive plateaus occur. This phenomenon is associated with impurity scattering. The effect of impurities is to broaden the set of delta function peaks in the density of states into a set of Gaussians. This allows the transition between the plateaus to be continuous. In fact, if all the states contributed to the conductivity, the steps of the staircase would be smeared out into a straight line, just as in the Drude theory for three-dimensional metals. Fortunately, the states in the tails of each Gaussian are localized, as the deviation from the ideal Landau level energy indicates that these states experience a larger impurity potential than average. The large potential acts to localize the electrons in the states with energies in the Gaussian tail, so these states do not contribute to the Hall conductivity. In fact, in two dimensions with zero field, one can show that all the states are localized in an infinite sample. However, the samples are finite and have edges. The edges have extended states that carry current. The edge states can be understood in analogy with the classical motion where there are skipping orbits, in which the cyclotron orbits are reflected at the edges. The classical skipping orbits would produce oppositely directed currents at pairs of edges.
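The quantized values of the Hall conductivity in (1406) translate into plateau values of the Hall resistance. The sketch below evaluates them in SI units, where e²/(2πħ) = e²/h, for the first few numbers of filled Landau levels. The function name is illustrative, and the spin and valley degeneracies of a real device are ignored in this sketch.

```python
# SI constants (exact values in the 2019 SI)
E_CHARGE = 1.602176634e-19   # C
PLANCK_H = 6.62607015e-34    # J s

def hall_resistance(filled_levels):
    """Plateau value h / (N e^2) in ohms for N = n + 1 filled Landau levels."""
    return PLANCK_H / (filled_levels * E_CHARGE ** 2)

plateaus = [hall_resistance(n_filled) for n_filled in range(1, 5)]
print(plateaus)
```

The first plateau, h/e², is about 25.8 kΩ, and successive plateaus follow as integer fractions of this value.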
Quantum mechanically, the bulk states do not contribute to the current since the velocity operator is given by
$$ \hat{v}_y = \frac{\hat{p}_y}{m_e} - \frac{q}{m_e c} A_y = \omega_c \left( \hat{x} - \hat{X} \right) \tag{1407} $$
and the probability density for the shifted harmonic oscillator is symmetrically peaked about X. Since the current carried by the state is given by an integral which is almost anti-symmetric
$$ j_{k_y,\nu} = \frac{q \omega_c}{m_e} \int_0^L dx \; | \phi_{\nu}( x - X ) |^2 \, ( x - X ) \tag{1408} $$
it vanishes. On the other hand, for X close to the boundary, say at X = 0, the wave function must vanish at the boundary for a hard core potential and so the current is given by
$$ j_{k_y,\nu} = \frac{q \omega_c}{m_e} \int_0^L dx \; | \phi_{\nu}( x ) |^2 \, x \tag{1409} $$
Since the wave function is cut off at x = 0, the integral is positive and the edge state carries current. The other edge state carries an oppositely directed
current. The presence of the confining potential also lifts the degeneracy of the states in the Landau levels, by increasing the energy of the states close to the boundary. As the wave functions of the odd order excited Landau levels of the homogeneous system vanish at x = X, one finds that the energy of the Landau levels with hard wall confining potentials increases from $( \nu + \frac{1}{2} ) \hbar \omega_c$ to $( 2 \nu + \frac{3}{2} ) \hbar \omega_c$ as $X \to L_x$. The increase in the Hall resistivity only occurs when the Fermi-level sweeps through the itinerant or delocalized portions of the density of states. As the above calculation completely neglects the effect of impurities and localization, Laughlin proposed a gauge theoretic argument which overcomes these shortcomings (R.B. Laughlin, Phys. Rev. B 23, 5632 (1981)). Laughlin envisaged an experiment in which the two-dimensional sample is in the form of the surface of a hollow cylinder of radius R. The axis of the cylinder is taken to be along the z direction. A uniform magnetic field $B_r$ is arranged to pass through the sample in a radial direction. Locally, this field is perpendicular to the plane in which the electrons are confined. A second field is arranged to thread through the cylinder parallel to its axis, but is entirely contained inside the hollow and falls to zero at r = R. This field does not affect the motion of the electrons directly since it is zero inside the sample; however, the associated flux Φ threading through the cylinder does lead to a finite vector potential $\mathbf{A}_{\Phi}$ which satisfies
$$ \nabla \wedge \mathbf{A}_{\Phi} = 0 \tag{1410} $$
which, therefore, can be written in the form
$$ \mathbf{A}_{\Phi} = \nabla \Lambda \tag{1411} $$
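This vector potential does cause the electronic wave functions to acquire an Aharonov-Bohm phase factor of
$$ \exp\left[ i \frac{q}{\hbar c} \Lambda \right] \tag{1412} $$
For the vector potential
$$ \mathbf{A}_{\Phi} = \frac{\Phi}{2 \pi r} \, \hat{e}_{\varphi} \tag{1413} $$
one finds that
$$ \Lambda = \frac{\Phi}{2 \pi} \, \varphi \tag{1414} $$
and, hence, the phase factor is given by
$$ \exp\left[ i \frac{q \Phi}{2 \pi \hbar c} \, \varphi \right] \tag{1415} $$
Thus, on traversing a closed path that encircles the cylinder's axis once, the Aharonov-Bohm flux changes the extended state wave function by a factor of
$$ \exp\left[ i \frac{q \Phi}{2 \pi \hbar c} \, 2 \pi \right] \tag{1416} $$
Since the fundamental flux quantum $\Phi_0$ is defined, as in equation (1397), by
$$ \Phi_0 = \frac{2 \pi \hbar c}{q} \tag{1417} $$
this Aharonov-Bohm factor can be written as
$$ \exp\left[ i \, 2 \pi \frac{\Phi}{\Phi_0} \right] \tag{1418} $$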
Thus, if Φ is an integer multiple of $\Phi_0$, i.e.
$$ \Phi = \mu \Phi_0 \tag{1419} $$
the extended wave functions are single valued. The presence of the perpendicular field $B_r$ within the sample quantizes the motion into Landau levels. These states may either be localized or may be extended throughout the sample. The vector potential at position z is $A_{\varphi} = B_r z + \frac{\Phi}{2 \pi R}$, so the flux enclosed by a loop around the cylinder at height z satisfies
$$ 2 \pi R \left( B_r z + \frac{\Phi}{2 \pi R} \right) = \mu \Phi_0 \tag{1420} $$
which, according to the flux quantization condition, must be an integer multiple of $\Phi_0$ for the phase of an extended wave function to be single valued. If Φ is adiabatically increased by $\Phi_0$, then the extended states in a Landau level must be translated along the z axis by an amount ∆z
$$ \Delta z = - \frac{\Phi_0}{2 \pi R B_r} = - L \frac{\Phi_0}{\Phi} \tag{1421} $$
where, in the last expression, Φ denotes the total flux of $B_r$ through the sample of length L.
The phase of the localized states can shift by arbitrary amounts. The presence of the gap forbids excitation of electrons to states in the higher Landau levels. Since, in the pure systems, the fully occupied Landau level contains a number
$$ m = \frac{L}{| \Delta z |} = \frac{\Phi}{\Phi_0} \tag{1422} $$
of electrons, the adiabatic change of Φ results in the transfer of electrons between neighboring extended states. Hence, in the dirty system, all the delocalized electrons in the Landau level are translated along the z direction by one spacing, skipping over the localized states. The net result is that one electron is translated across the entire length of the sample. In the absence of an applied electric field, the initial and final states have the same energy. Thus, by gauge invariance, adding $\Phi_0$ maps the system back on itself. However, if there is an electric field $E_z$ across the length of the cylinder, this process requires an energy change of
$$ \Delta E = q E_z L \tag{1423} $$
The current around the cylinder $I_{\varphi}$, from all the electrons in a single Landau level, is given by
$$ I_{\varphi} = - c \frac{\partial E}{\partial \Phi} \tag{1424} $$
which leads to a current density
$$ j_{\varphi} = - \frac{c \, q E_z L}{L \Phi_0} = - \frac{q^2}{2 \pi \hbar} E_z \tag{1425} $$
Hence, on summing over all occupied Landau levels, one has
$$ j_{\varphi} = - \frac{q^2}{2 \pi \hbar} \, n \, E_z \tag{1426} $$
In this, n is the number of completely occupied Landau levels with extended states, and the Fermi-energy is in a mobility gap. The Hall conductivity is given by
$$ \sigma_H = - \frac{j_{\varphi}}{E_z} = n \frac{q^2}{2 \pi \hbar} \tag{1427} $$
Hence, as long as there are Landau levels with extended states, there is an integer quantum Hall effect. The integer quantum Hall effect was measured experimentally by von Klitzing in 1980 (K. von Klitzing, G. Dorda, and M. Pepper, Phys. Rev. Lett. 45, 494 (1980)). The steps can only be discerned in very clean samples. At much higher fields, where only the lowest Landau level should be occupied, Gossard, Stormer, and Tsui discovered a similar type of effect which is known as the fractional quantum Hall effect (D.C. Tsui, H.L. Stormer and A.C. Gossard, Phys. Rev. Lett. 48, 1559 (1982)). This phenomenon involves the effect of the Coulomb repulsion between electrons in the Landau levels. Laughlin showed that the energy of the interacting electron states can be minimized by allowing the electrons to form a ground state with a different symmetry from the bulk (R.B. Laughlin, Phys. Rev. Lett. 50, 1395 (1983)).
——————————————————————————————————
11.5.2
Exercise 58
Evaluate the Kubo formula for the real part of the diagonal and off-diagonal quantum Hall conductivities by first taking the limit ω → 0 and then taking the limit η → 0. Also estimate the effects of introducing scattering due to random impurities. The effect of the scattering lifetime can be introduced by including imaginary parts of the energies of the occupied and unoccupied single particle states of the form $\pm i \frac{\hbar}{2 \tau}$. Choose the signs to ensure that the wave functions for the excited states (with an electron - hole pair) will decay to the ground state after a time τ. Compare your result with the conductivities obtained for the three-dimensional Drude model.
——————————————————————————————————
11.5.3
The Fractional Quantum Hall Effect
Consider a particle of mass $m_e$, confined to move in the x − y plane with a uniform magnetic field in the z direction. Using the circularly symmetric gauge, the energy eigenstates are also eigenstates of angular momentum. The single particle wave functions describing the states in the lowest Landau level with angular momentum m can be written as
$$ \phi_m(\mathbf{r}_j) = \sqrt{ \frac{2}{\pi \, m!} } \left( \frac{x_j + i y_j}{2 \xi} \right)^m \exp\left[ - \frac{x_j^2 + y_j^2}{4 \xi^2} \right] \tag{1428} $$
where the length ξ is given by
$$ \xi = \sqrt{ \frac{\hbar c}{q B_z} } \tag{1429} $$
The probability density for finding a particle has a peak which forms a circle around the origin. The radius of the circle depends on m. The many-particle ground state wave function corresponding to the lowest Landau level is constructed as a Slater determinant from the states of different m. The spins of the electrons are assumed to be fully polarized by the applied field. The $N_e$ particle wave function is
$$ \Psi \sim \prod_{i > j} \left[ ( x_i - x_j ) + i ( y_i - y_j ) \right] \prod_{k=1}^{N_e} \exp\left[ - \frac{x_k^2 + y_k^2}{4 \xi^2} \right] \tag{1430} $$
This satisfies the Pauli exclusion principle as the wave function vanishes linearly as $\mathbf{r}_i \to \mathbf{r}_j$. The linear vanishing is a signature that a pair of particles is in a state of relative angular momentum m = 1, together with contributions from states of higher angular momentum. This can be seen by expressing the prefactor as a Vandermonde determinant in the variables $z_j = x_j + i y_j$
$$ \prod_{i > j} ( z_i - z_j ) = \begin{vmatrix} z_1^0 & z_1^1 & z_1^2 & \dots & z_1^{N_e - 1} \\ z_2^0 & z_2^1 & z_2^2 & \dots & z_2^{N_e - 1} \\ \vdots & \vdots & \vdots & & \vdots \\ z_{N_e}^0 & z_{N_e}^1 & z_{N_e}^2 & \dots & z_{N_e}^{N_e - 1} \end{vmatrix} \tag{1431} $$
Hence, the $N_e$ electrons occupy the zero-th Landau level's single particle states with all the angular momentum quantum numbers in the range between m = 0 and m = $N_e − 1$. The wave function is an eigenfunction of the total angular momentum. The total angular momentum about the origin is $M_z = \frac{N_e ( N_e - 1 )}{2} \hbar$. Since each m value is occupied, this state corresponds to a uniform particle density of
$$ \rho = \frac{1}{2 \pi \xi^2} \tag{1432} $$
particles per unit area. Hence, this many-particle state corresponds to the completely filled lowest Landau level. For larger $B_z$ the lowest Landau level is only partially filled. The wave function which minimizes the interactions between pairs of particles is given by the Laughlin trial wave function
$$ \Psi_p \sim \prod_{i > j} \left[ ( x_i - x_j ) + i ( y_i - y_j ) \right]^p \prod_{k=1}^{N_e} \exp\left[ - \frac{x_k^2 + y_k^2}{4 \xi^2} \right] \tag{1433} $$
for odd integers p. The higher power p has the effect of minimizing the interactions between particles, since the square of the wave function vanishes like a power law with power 2p instead of quadratically. This is a consequence of the pairs of particles being in states with relative angular momentum p. The Laughlin state is also an eigenstate of total angular momentum with eigenvalue $M_z = \frac{N_e ( N_e - 1 )}{2} \, p \, \hbar$. Since the linear superposition contains 1 particle for every p values of m, this state corresponds to a uniform particle density of
$$ \rho = \frac{1}{2 \pi \xi^2 p} \tag{1434} $$
particles per unit area. The filling factor, ν, is defined as
$$ \nu = N_e \frac{\Phi_0}{\Phi} = \frac{\rho \, \Phi_0}{B_z} \tag{1435} $$
This state corresponds to a state with the fractional filling, $\nu = \frac{1}{p}$, of the lowest Landau level. The energy of the Laughlin ground state is given by the Coulomb interaction energy. Since the Coulomb potential is a central potential it conserves the relative angular momentum. In the Laughlin state, the energy is evaluated as
$$ \frac{E_g}{N_e} = - \frac{0.78213}{\sqrt{p}} \left( 1 - \frac{0.211}{p^{0.74}} + \frac{0.012}{p^{1.7}} \right) \frac{e^2}{r_c} \tag{1436} $$
where
$$ r_c = \sqrt{ \frac{\hbar}{m_e \omega_c} } \tag{1437} $$
The energy per particle, for small p, is lower than any other candidate state by an amount determined by the Coulomb interaction between states with angular momentum m < p.
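As a quick numerical reading of the energy expression (1436), the following sketch tabulates the energy per particle in units of e²/r_c for the first few odd values of p. The function name is an illustrative choice; the coefficients are those quoted in (1436).

```python
import math

def laughlin_energy_per_particle(p):
    """Energy per particle of the Laughlin state from (1436), in units of e^2 / r_c."""
    return -(0.78213 / math.sqrt(p)) * (1 - 0.211 / p ** 0.74 + 0.012 / p ** 1.7)

energies = {p: laughlin_energy_per_particle(p) for p in (1, 3, 5)}
for p, energy in energies.items():
    print(f"p = {p}: E_g / N_e = {energy:.4f} e^2 / r_c")
```

For p = 1 the expression gives about −0.6265, and for p = 3 about −0.4101, in units of e²/r_c.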
11.5.4
Quasi-Particle Excitations
The quasi-particle excitations of the Laughlin state, like the quasi-particle excitations of a completely filled Landau level, can be obtained from considering the effect of adding a number of flux quanta, Φ, passing through the center of the system. Although the magnetic field generating the extra flux does not act on the electrons directly, it does add an Aharonov-Bohm phase to the system. First we shall consider the filled $n_r = 0$ Landau level. The single particle wave function experiences a vector potential of the form
$$ \mathbf{A} = \left( \frac{B_z r}{2} + \frac{\Phi}{2 \pi r} \right) \hat{e}_{\varphi} \tag{1438} $$
where $B_z$ is the uniform field and Φ is the Aharonov-Bohm flux. The single particle energy eigenstates satisfy
$$ \left[ - \frac{\hbar^2}{2 m_e} \frac{1}{r} \frac{\partial}{\partial r} \left( r \frac{\partial}{\partial r} \right) + \frac{1}{2 m_e} \left( - \frac{i \hbar}{r} \frac{\partial}{\partial \varphi} - \frac{q}{c} \left( \frac{B_z r}{2} + \frac{\Phi}{2 \pi r} \right) \right)^2 - E \right] \phi(r, \varphi) = 0 \tag{1439} $$
Then, with the ansatz
$$ \phi(r, \varphi) = \frac{1}{\sqrt{2 \pi}} \exp\left[ + i \mu \varphi \right] R(r) \tag{1440} $$
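one finds that the radial wave function is given by the solution of
$$ \left[ - \frac{\hbar^2}{2 m_e} \frac{1}{r} \frac{\partial}{\partial r} \left( r \frac{\partial}{\partial r} \right) + \frac{\hbar^2}{2 m_e r^2} \left( \mu - \frac{q \Phi}{c \, 2 \pi \hbar} - \frac{q B_z r^2}{2 \hbar c} \right)^2 - E \right] R(r) = 0 \tag{1441} $$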
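A finite-difference check of the radial equation (1441) is sketched below. It assumes units in which ħ = m_e = ξ = 1 (so that ħω_c = 1 and q B_z r²/(2ħc) = r²/2), writes ν = µ − qΦ/(2πħc) for the shifted angular momentum index, and tests the lowest Landau level form R(r) = r^ν exp(−r²/4). The step size, sample points and function names are illustrative choices.

```python
import math

def radial_function(r, nu):
    """Trial radial function R(r) = r**nu * exp(-r**2 / 4), with xi = 1."""
    return r ** nu * math.exp(-r * r / 4.0)

def local_energy(r, nu, h=1e-4):
    """Apply the radial operator of (1441) to R by central differences and divide by R.

    Units (an assumption of this sketch): hbar = m_e = xi = 1, so hbar * omega_c = 1.
    """
    r0 = radial_function(r, nu)
    r_plus = radial_function(r + h, nu)
    r_minus = radial_function(r - h, nu)
    first = (r_plus - r_minus) / (2 * h)
    second = (r_plus - 2 * r0 + r_minus) / (h * h)
    kinetic = -0.5 * (second + first / r)            # -(1/2) (1/r) d/dr ( r dR/dr )
    centrifugal = 0.5 * (nu - r * r / 2.0) ** 2 / (r * r) * r0
    return (kinetic + centrifugal) / r0

energies = [local_energy(r, nu) for nu in (0.0, 1.0, 2.3) for r in (0.5, 1.0, 2.5)]
print(energies)
```

The local energy comes out as ħω_c/2 at every sampled radius, also for non-integer ν, which is why the flux only relabels the states within the lowest Landau level instead of changing their energy.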
Hence, the solutions for the lowest Landau level are of the form
$$ \phi(r, \varphi) \sim \exp\left[ i \frac{q \Phi}{c \, 2 \pi \hbar} \, \varphi \right] \exp\left[ i \nu \varphi \right] r^{\nu} \exp\left[ - \frac{r^2}{4 \xi^2} \right] \tag{1442} $$
where
$$ \nu = \mu - \frac{q \Phi}{c \, 2 \pi \hbar} \tag{1443} $$
Since the wave function is single valued, µ must be an integer, say m. Thus, on increasing ν the particles move away from the origin. On subtracting one flux quantum, $\Phi_0$, through the center of the loop, where
$$ \Phi_0 = \frac{c \, 2 \pi \hbar}{q} \tag{1444} $$
then the degenerate eigenfunctions transform into themselves, m → m + 1. If the Landau level had been completely filled, then one particle has been pushed to the edge of the system and a hole has been created in the m = 0 orbit. This is the quasi-hole excitation in the filled Landau level. The wave function of the Laughlin state, when a quasi-hole has been added, is given by similar considerations. The insertion of a flux quantum produces an extra Aharonov-Bohm phase. The requirement that the wave function is single valued restricts µ to be an integer, m. The Laughlin state in which the flux is decreased by one flux quantum $\Phi_0$ has m shifted by m → m + 1, which creates a quasi-hole at the origin. The many-particle wave function with a quasi-hole at the origin is given by the expression
$$ \Psi_p^{+} \sim \prod_{i} ( x_i + i y_i ) \prod_{i > j} \left[ ( x_i - x_j ) + i ( y_i - y_j ) \right]^p \prod_{k=1}^{N_e} \exp\left[ - \frac{x_k^2 + y_k^2}{4 \xi^2} \right] \tag{1445} $$
and the wave function with a quasi-hole at $\mathbf{r}_0$ is given by
$$ \Psi_p^{+} \sim \prod_{i=1}^{N_e} \left[ x_i - x_0 + i ( y_i - y_0 ) \right] \prod_{i > j} \left[ ( x_i - x_j ) + i ( y_i - y_j ) \right]^p \prod_{k=1}^{N_e} \exp\left[ - \frac{x_k^2 + y_k^2}{4 \xi^2} \right] \tag{1446} $$
where one flux quantum has also been removed from $\mathbf{r}_0$. This state has angular momentum of
$$ M_z = \frac{N_e ( N_e - 1 )}{2} \, p \, \hbar + N_e \hbar \tag{1447} $$
as there is now an extra zero at point $\mathbf{r}_0$. Due to the zero, the charge density of this state is depleted around $\mathbf{r}_0$. The charge deficiency is smaller than that around the position of any electron by a factor of $\frac{1}{p}$. Hence, the quasi-hole has charge $- \frac{q}{p}$. Alternatively, one may notice that by adding p quasi-holes at the same point and then adding an electron there, one just obtains the Laughlin wave function with one more electron. Hence, p quasi-holes are neutralized by an extra electron. The operator creating a quasi-hole can be written just as
$$ S_p \sim \prod_{i=1}^{N_e} \left[ x_i - x_0 + i ( y_i - y_0 ) \right] \tag{1448} $$
since it just adds zeros to the wave function. Creating a quasi-particle is a little more complicated, as adding a flux quantum results in the transformation m → m − 1. Hence, the circles contract to the origin, but the state with m = 0 is already filled in the initial state. This must be lifted to the next Landau level, $n_r = 1$. An operator $S_p^{\dagger}$ which adds a flux quantum at $\mathbf{r}_0$, creating a quasi-particle, can be written just as
$$ S_p^{\dagger} \sim \prod_{i=1}^{N_e} \left[ \frac{\partial}{\partial x_i} - i \frac{\partial}{\partial y_i} - \frac{x_0 - i y_0}{\xi^2} \right] \tag{1449} $$
where this operator only acts on the polynomial part of the wave function and not the exponential part. It reduces the angular momentum of each single particle state by one unit of ħ and sends the particle at $\mathbf{r}_0$ into the higher Landau levels. This activation process ensures that the quasi-particle excitation spectrum has a gap. Since each quasi-particle of charge $\frac{q}{p}$ is attached to one flux quantum, the statistics are neither fermion nor boson. Two quasi-particles are exchanged in the process whereby one quasi-particle is rotated by π in a semi-circle centered on the other fixed quasi-particle and then the two particles are translated in the same direction along the diameter. This process results in an interchange of the particles between their initial positions. The phase of the wave function changes in this permutation. The rotation of the quasi-particle through π around a flux tube produces an Aharonov-Bohm phase of
$$ \pi \, \frac{q}{p} \, \frac{\Phi_0}{2 \pi c \hbar} = \frac{\pi}{p} \tag{1450} $$
since the quasi-particle has a fractional charge. Thus, the quasi-particles have fractional statistics. These types of fractional or anyon statistics are only possible in two or fewer dimensions. If the permutation of two particles produces a phase difference of $\frac{\pi}{p}$ then the reverse permutation process must yield a phase change of $- \frac{\pi}{p}$. In two dimensions, the permutation process and the reversed process are distinguishable. However, if the two-dimensional process is embedded in three dimensions the processes are no longer distinct. On rotating the plane in which the particles are contained by π, the interchange becomes equivalent to the reverse interchange. Hence,
$$ \exp\left[ + i \frac{\pi}{p} \right] = \exp\left[ - i \frac{\pi}{p} \right] \tag{1451} $$
which yields p = 1.
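The consistency condition (1451) can be illustrated with a few lines of code that compare the exchange phase exp(iπ/p) with the phase of the reversed exchange for several odd p. The list of p values and the function names are illustrative choices.

```python
import cmath
import math

def exchange_phase(p):
    """Phase factor exp(i pi / p) picked up when two quasi-particles are exchanged."""
    return cmath.exp(1j * math.pi / p)

def reverse_equals_forward(p, tol=1e-12):
    """True if exp(+i pi / p) and exp(-i pi / p) coincide, as required in three dimensions."""
    phase = exchange_phase(p)
    return abs(phase - phase.conjugate()) < tol

consistent = [p for p in (1, 3, 5, 7) if reverse_equals_forward(p)]
print(consistent)
```

Only p = 1, the ordinary fermion case with exchange phase −1, survives the embedding in three dimensions; the fractional phases for p = 3, 5, 7 are special to two dimensions.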
11.5.5
Skyrmions
Although we have considered the effect of extremely high magnetic fields, the electronic spin system is not completely polarized, as we have been assuming. The relatively weak coupling between the electronic spin and the magnetic field, compared with the coupling of the orbital motion to the field, is mainly due to the small band mass of the electron. The strength of the orbital coupling is determined by the quantity
$$ \frac{q}{2 m^* c} \tag{1452} $$
where $m^*$ is the band mass, which in GaAs has the value $m^* \approx 0.07 \, m_e$, where $m_e$ is the free electron mass. The strength of the spin coupling is given by
$$ \frac{g q}{m_e c} \tag{1453} $$
where $g$ is the gyro-magnetic ratio, $g \approx 2$. This is further reduced by the strong spin-orbit coupling in GaAs to a value given by the Landé g-factor, $g_L \approx 0.45$. The spin directions are, therefore, determined via the exchange parts of the Coulomb interaction. The magnitude of the Coulomb interaction is given by
$$ \frac{q^2}{\epsilon \, \xi} \tag{1454} $$
where $\epsilon \sim 12$ is the dielectric constant and $\xi$ is the magnetic length. For fields of order $B = 10$ Tesla this is of the same order of magnitude as $\hbar \omega_c$. The spin magnetization $\mathbf{M}$ contributes to the effective magnetic field
$$ \mathbf{B}_{eff} = \mathbf{B} + 4 \pi \mathbf{M} \tag{1455} $$
Hence, if we consider excitations in the spin system, the magnetization will be spatially varying, and so will the effective field. A variation in the effective field will result in a local change in the filling factor. The system will respond to the change in the filling factor by transferring charge. Thus, spin and charge excitations are coupled. The lowest energy coupled spin-charge excitations are skyrmions, not the Laughlin quasi-particles. Consider an electron with spin $\mathbf{S}$ moving in the exchange field of the other fixed electrons. The spin degree of freedom is governed by the effective Zeeman Hamiltonian
$$ \hat{H}_{int} = - g \mu_B \, \mathbf{B}_{eff}(\mathbf{r}) \cdot \mathbf{S} \tag{1456} $$
In the lowest energy state, the electron aligns its spin with the static effective magnetic field. If the electron is moved around a closed contour, the spin will remain aligned with the local magnetic field all along the contour. However, the spin wave function does not return to its initial value but instead acquires a phase, the Berry phase. The Berry phase is related to the solid angle enclosed by the spin's trajectory, as mapped onto the unit sphere in spin space. The solid angle $\Omega$ traced out by the spin when completing the contour is given by
$$ \Omega = \oint d\varphi \, ( 1 - \cos\theta ) \tag{1457} $$
where the spin direction is specified by the polar coordinates $(\theta, \varphi)$. After the contour is traversed, the spin wave function acquires an extra phase of $\frac{S \, \Omega}{\hbar}$. The Berry phase can be illustrated by considering a spin one half in a magnetic field of constant magnitude oriented along the direction $(\theta, \varphi)$. In this
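case, the Zeeman Hamiltonian is given by
$$ \hat{H}_Z = - \mu_B \, ( \mathbf{B} \cdot \boldsymbol{\sigma} ) \tag{1458} $$
which can be expressed as
$$ \hat{H}_Z = - \mu_B B \begin{pmatrix} \cos\theta & \sin\theta \, \exp[ - i \varphi ] \\ \sin\theta \, \exp[ + i \varphi ] & - \cos\theta \end{pmatrix} \tag{1459} $$
which for fixed $(\theta, \varphi)$ has an eigenstate of $\hat{H}_Z$
$$ \chi_+ = \begin{pmatrix} \cos\frac{\theta}{2} \\ \sin\frac{\theta}{2} \, \exp[ + i \varphi ] \end{pmatrix} \tag{1460} $$
which has the eigenvalue
$$ E_0 = - \mu_B B \tag{1461} $$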
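Thus, in this state the spin is aligned parallel to the applied field. For a static field one has the time dependent wave function given by
$$ \chi_+(t) = \exp\left[ + i \frac{\mu_B B}{\hbar} t \right] \begin{pmatrix} \cos\frac{\theta}{2} \\ \sin\frac{\theta}{2} \, \exp[ + i \varphi ] \end{pmatrix} \tag{1462} $$
where the time dependence is given purely by the exponential phase factor. If the direction of the field $(\theta(t), \varphi(t))$ is changed very slowly, one expects the spin will adiabatically follow the field direction. That is, if the field is rotated sufficiently slowly, one does not expect the spin to make a transition to the state with energy $E = + \mu_B B$ where the spin is aligned anti-parallel to the field. However, the wave function may acquire a phase which is different from the time and energy dependent phase factor expected for a static field. This extra phase is the Berry phase $\delta$, and can be calculated from the Schrödinger equation
$$ i \hbar \frac{\partial}{\partial t} \begin{pmatrix} \alpha(t) \\ \beta(t) \end{pmatrix} = - \mu_B B \begin{pmatrix} \cos\theta(t) & \sin\theta(t) \, \exp[ - i \varphi(t) ] \\ \sin\theta(t) \, \exp[ + i \varphi(t) ] & - \cos\theta(t) \end{pmatrix} \begin{pmatrix} \alpha(t) \\ \beta(t) \end{pmatrix} \tag{1463} $$
We shall assume that the wave function takes the adiabatic form
$$ \begin{pmatrix} \alpha(t) \\ \beta(t) \end{pmatrix} = \exp\left[ + i \left( \frac{\mu_B B}{\hbar} t - \delta(t) \right) \right] \begin{pmatrix} \cos\frac{\theta(t)}{2} \\ \sin\frac{\theta(t)}{2} \, \exp[ + i \varphi(t) ] \end{pmatrix} \tag{1464} $$
which instantaneously follows the direction of the field but is also modified by the inclusion of the Berry phase. On substituting this ansatz into the Schrödinger equation, one finds that the non-adiabatic terms satisfy
$$ - \frac{\partial \delta}{\partial t} \begin{pmatrix} \cos\frac{\theta(t)}{2} \\ \sin\frac{\theta(t)}{2} \, \exp[ + i \varphi(t) ] \end{pmatrix} + \frac{\partial \varphi}{\partial t} \begin{pmatrix} 0 \\ \sin\frac{\theta(t)}{2} \, \exp[ + i \varphi(t) ] \end{pmatrix} = \frac{i}{2} \frac{\partial \theta}{\partial t} \begin{pmatrix} - \sin\frac{\theta(t)}{2} \\ \cos\frac{\theta(t)}{2} \, \exp[ + i \varphi(t) ] \end{pmatrix} \tag{1465} $$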
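The above equation is projected onto the adiabatic state by multiplying it by the row matrix
$$ \begin{pmatrix} \cos\frac{\theta(t)}{2} & \sin\frac{\theta(t)}{2} \, \exp[ - i \varphi(t) ] \end{pmatrix} \tag{1466} $$
One finds that the derivative of $\theta$ w.r.t. $t$ cancels and that the equation simplifies to
$$ - \frac{\partial \delta}{\partial t} + \frac{\partial \varphi}{\partial t} \sin^2\frac{\theta}{2} = 0 \tag{1467} $$
Hence, the Berry phase is given by integrating w.r.t. $t$,
$$ \begin{aligned} \delta(t) &= \int_0^t dt' \, \frac{\partial \varphi}{\partial t'} \, \sin^2\frac{\theta(t')}{2} \\ &= \int d\varphi \, \sin^2\frac{\theta}{2} \\ &= \frac{1}{2} \int d\varphi \, ( 1 - \cos\theta ) \end{aligned} \tag{1468} $$
On completing one orbit in spin space, the extra phase is given by
$$ \delta = \frac{\Omega}{2} \tag{1469} $$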
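The result (1469) can be checked numerically. The following Python sketch is not part of the original notes: it assumes NumPy is available, and the names `spinor` and `berry_phase_on_cone` are purely illustrative. It carries the eigenstate (1460) once around a cone of fixed polar angle $\theta$ and accumulates the phase of the overlaps between successive states, which, with the sign convention of (1464), should approximate $\delta = \Omega / 2 = \pi ( 1 - \cos\theta )$.

```python
import numpy as np

def spinor(theta, phi):
    """Spin-1/2 state aligned with the direction (theta, phi), as in eq. (1460)."""
    return np.array([np.cos(theta / 2.0),
                     np.sin(theta / 2.0) * np.exp(1j * phi)])

def berry_phase_on_cone(theta, steps=2000):
    """Phase delta picked up when phi runs from 0 to 2*pi at fixed theta.

    The loop is discretized and the phase of the product of overlaps between
    neighbouring states is returned. This is a sketch of the discrete-overlap
    method, not the derivation used in the text.
    """
    phis = np.linspace(0.0, 2.0 * np.pi, steps + 1)
    product = 1.0 + 0.0j
    for phi_a, phi_b in zip(phis[:-1], phis[1:]):
        # np.vdot conjugates its first argument, giving <chi(phi_a)|chi(phi_b)>
        product *= np.vdot(spinor(theta, phi_a), spinor(theta, phi_b))
    return np.angle(product)

def half_solid_angle(theta):
    """Omega / 2 for a cone of polar angle theta, from eq. (1457)."""
    return np.pi * (1.0 - np.cos(theta))
```

For cones with $\theta < \pi/2$ the returned phase lies inside the principal branch of `np.angle`, so it can be compared directly with `half_solid_angle(theta)`.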
as was claimed. Thus, an inhomogeneous effective field on the spin introduces an extra phase of $\frac{\Omega}{2}$ in the wave function of the electron which is dragged around a contour. This extra phase has the same effect as if the contour contains an additional contribution to the magnetic flux of
$$ \Delta\Phi = \frac{\Omega}{2} \, \frac{\Phi_0}{2 \pi} \tag{1470} $$
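since encircling a flux quantum $\Phi_0$ produces a phase change of $2 \pi$. Furthermore, as the effective flux enclosed in the region is increased by $\Delta\Phi$, and the filling fraction $\nu$ is constant,
$$ \nu = \frac{\Phi_0 \, \Delta N}{\Delta\Phi} \tag{1471} $$
Hence, the contour encloses an extra charge
$$ \begin{aligned} \Delta Q &= q \, \Delta N \\ &= \nu \, q \, \frac{\Delta\Phi}{\Phi_0} \\ &= \nu \, q \, \frac{\Omega}{4 \pi} \end{aligned} \tag{1472} $$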
Thus, the extra charge is determined by the Berry phase and the filling fraction $\nu$, and the spin and charge excitations are coupled.
Due to the coupling of spin and charge, a localized spin-flip excitation of a fully polarized ground state introduces a non-uniform charge density. Consider a skyrmion excitation in the fully filled lowest Landau level. The ground state wave function is written in second quantized form as
$$ | \Psi_0 \rangle = \prod_m a^\dagger_{m,\uparrow} \, | 0 \rangle \tag{1473} $$
The creation of a charged spin-flip excitation at the origin requires adding a down spin electron in $m = 0$. However, to allow the spin excitation to have a finite spatial extent and the charge density to re-adjust, the wave function needs to be able to reduce the charge density and net spin at the origin by redistributing them on neighboring shells. The state is also an eigenstate of the total angular momentum $J_z$. Thus, to a first approximation the excited state wave function can be written as
$$ \begin{aligned} | \Psi_+ \rangle &\approx \left( v_0 + u_0 \, a^\dagger_{1,\downarrow} a_{0,\uparrow} \right) a^\dagger_{0,\downarrow} \prod_{m=0}^{N_e} a^\dagger_{m,\uparrow} \, | 0 \rangle \\ &\approx \left( v_0 \, a^\dagger_{0,\uparrow} + u_0 \, a^\dagger_{1,\downarrow} \right) a^\dagger_{0,\downarrow} \prod_{m \neq 0}^{N_e} a^\dagger_{m,\uparrow} \, | 0 \rangle \end{aligned} \tag{1474} $$
where $v_0$ and $u_0$ are variational parameters which, since the wave function is normalized, must satisfy
$$ | u_0 |^2 + | v_0 |^2 = 1 \tag{1475} $$
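Iterating this process leads to the skyrmion wave function
$$ | \Psi_+ \rangle = \prod_{m=0}^{N_e} \left( v_m \, a^\dagger_{m,\uparrow} + u_m \, a^\dagger_{m+1,\downarrow} \right) a^\dagger_{0,\downarrow} \, | 0 \rangle \tag{1476} $$
where one expects that as $m \to N_e$ then $| v_m | \to 1$ and $| u_m | \to 0$. This is a variational wave function for the excited state, and the parameters $v_m$ and $u_m$ are to be determined by minimizing the expectation value of $\hat{H}$. The Hamiltonian can be approximated by
$$ \hat{H} = \sum_{m,\sigma} \epsilon_{m,\sigma} \, a^\dagger_{m,\sigma} a_{m,\sigma} + \frac{1}{2!} \sum_{m,m'} V_{m,m'} \, a^\dagger_{m,\uparrow} a^\dagger_{m'+1,\downarrow} a_{m',\uparrow} a_{m+1,\downarrow} \tag{1477} $$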
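which is a simplified version of the skyrmion Hamiltonian. On using the relations
$$ \begin{aligned} \langle \Psi_+ | \, a^\dagger_{m,\uparrow} a_{m,\uparrow} \, | \Psi_+ \rangle &= | v_m |^2 \\ \langle \Psi_+ | \, a^\dagger_{m,\uparrow} a_{m+1,\downarrow} \, | \Psi_+ \rangle &= v_m^* \, u_m \\ \langle \Psi_+ | \, a^\dagger_{m+1,\downarrow} a_{m,\uparrow} \, | \Psi_+ \rangle &= u_m^* \, v_m \\ \langle \Psi_+ | \, a^\dagger_{m+1,\downarrow} a_{m+1,\downarrow} \, | \Psi_+ \rangle &= | u_m |^2 \end{aligned} \tag{1478} $$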
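the expectation value of the Hamiltonian is found as
$$ \langle \Psi_+ | \, \hat{H} \, | \Psi_+ \rangle = \sum_m \left( \epsilon_{m,\uparrow} \, | v_m |^2 + \epsilon_{m+1,\downarrow} \, | u_m |^2 \right) + \frac{1}{2!} \sum_{m,m'} V_{m,m'} \, v_m^* \, u_m \, u_{m'}^* \, v_{m'} \tag{1479} $$
The energy of this excited state is to be minimized w.r.t. $u_m$ and $v_m$ subject to the constraint
$$ | v_m |^2 + | u_m |^2 = 1 \tag{1480} $$
The minimization is performed using Lagrange's method of undetermined multipliers, $\lambda_m$. The minimization results in the set of equations
$$ ( \epsilon_{m,\uparrow} - \lambda_m ) \, v_m^* + \frac{1}{2} \, u_m^* \sum_{m'} V_{m,m'} \, u_{m'} \, v_{m'}^* = 0 \tag{1481} $$
and
$$ ( \epsilon_{m+1,\downarrow} - \lambda_m ) \, u_m^* + \frac{1}{2} \, v_m^* \sum_{m'} V_{m,m'} \, v_{m'} \, u_{m'}^* = 0 \tag{1482} $$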
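These sets of equations can be solved to yield the undetermined multipliers
$$ \lambda_m = \frac{\epsilon_{m,\uparrow} + \epsilon_{m+1,\downarrow}}{2} \pm \sqrt{ \left( \frac{\epsilon_{m,\uparrow} - \epsilon_{m+1,\downarrow}}{2} \right)^2 + | \Delta_m |^2 } \tag{1483} $$
where we have defined the parameter
$$ \Delta_m = \frac{1}{2!} \sum_{m'} V_{m,m'} \, v_{m'}^* \, u_{m'} \tag{1484} $$
which we expect will decrease with increasing $m$. The factors $| v_m |^2$ and $| u_m |^2$ are then found as
$$ \begin{aligned} | v_m |^2 &= \frac{1}{2} \left( 1 \mp \frac{\epsilon_{m,\uparrow} - \epsilon_{m+1,\downarrow}}{\sqrt{ ( \epsilon_{m,\uparrow} - \epsilon_{m+1,\downarrow} )^2 + 4 \, | \Delta_m |^2 }} \right) \\ | u_m |^2 &= \frac{1}{2} \left( 1 \pm \frac{\epsilon_{m,\uparrow} - \epsilon_{m+1,\downarrow}}{\sqrt{ ( \epsilon_{m,\uparrow} - \epsilon_{m+1,\downarrow} )^2 + 4 \, | \Delta_m |^2 }} \right) \end{aligned} \tag{1485} $$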
Far from the center of the spin flip excitation one expects that the spins will be polarized parallel to the field and the ground state will be recovered. Hence, we shall use the upper signs. The equations (1484) and (1485) have to be solved self-consistently for $\Delta_m$. We shall use a real solution for the gap. These are combined to yield the "gap" equation
$$ \Delta_m = \frac{1}{2} \sum_{m'} V_{m,m'} \, \frac{\Delta_{m'}}{\sqrt{ ( \epsilon_{m',\uparrow} - \epsilon_{m'+1,\downarrow} )^2 + 4 \, | \Delta_{m'} |^2 }} \tag{1486} $$
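The self-consistency loop implied by (1484) to (1486) can be sketched numerically. The Python fragment below is an illustration only and is not taken from the notes: it assumes NumPy, an $m$-independent Zeeman splitting `zeeman` $= \epsilon_{m+1,\downarrow} - \epsilon_{m,\uparrow}$, and a hypothetical short-ranged kernel $V_{m,m'}$ supplied by the caller. It solves the gap equation by plain fixed-point iteration and then evaluates $| u_m |^2$ and $| v_m |^2$ from (1485) with the upper signs.

```python
import numpy as np

def solve_gap(zeeman, coupling, tol=1e-12, max_iter=1000):
    """Iterate Delta_m = (1/2) sum_m' V[m, m'] Delta_m' / sqrt(Z^2 + 4 Delta_m'^2).

    zeeman   : assumed constant splitting eps_{m+1,down} - eps_{m,up}
    coupling : square array standing in for V_{m,m'} (hypothetical model input)
    """
    delta = np.ones(coupling.shape[0])  # real, positive starting guess
    for _ in range(max_iter):
        new = 0.5 * coupling @ (delta / np.sqrt(zeeman**2 + 4.0 * delta**2))
        if np.max(np.abs(new - delta)) < tol:
            return new
        delta = new
    return delta

def occupations(zeeman, delta):
    """|u_m|^2 and |v_m|^2 from eq. (1485), upper signs."""
    root = np.sqrt(zeeman**2 + 4.0 * delta**2)
    v2 = 0.5 * (1.0 + zeeman / root)  # eps_up - eps_down = -zeeman
    u2 = 0.5 * (1.0 - zeeman / root)
    return u2, v2

def model_coupling(size=30, strength=1.0, width=2.0):
    """Hypothetical kernel V_{m,m'} = strength * exp(-|m - m'| / width)."""
    m = np.arange(size)
    return strength * np.exp(-np.abs(m[:, None] - m[None, :]) / width)
```

Calling `solve_gap(0.5, model_coupling())` and `solve_gap(1.0, model_coupling())` gives a quick way to compare the gap for two values of the splitting. Whether simple iteration converges depends on the kernel, so a production calculation would likely need a more robust solver.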
The solution uniquely determines the wave function up to an undetermined phase. We note that as the Zeeman splitting increases, the magnitude of $\Delta_m$ decreases. Once the wave function has been determined, one can examine the spin distribution. The direction of the spin at the point $(r, \varphi)$ is defined as the direction along which the spin density operator is maximum. The spin density operator, projected along a unit vector $\hat{\eta}$ in an arbitrary direction $(\theta_0, \varphi_0)$, in second quantized form is given by
$$ ( \hat{\eta} \cdot \boldsymbol{\sigma} )(\mathbf{r}) = \Psi^\dagger(\mathbf{r}) \, ( \hat{\eta} \cdot \boldsymbol{\sigma} ) \, \Psi(\mathbf{r}) \tag{1487} $$
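However, the field creation and annihilation operators are given by
$$ \begin{aligned} \Psi^\dagger(\mathbf{r}) &= \sum_{m,\alpha} a^\dagger_{m,\alpha} \, \phi_m^*(\mathbf{r}) \, \chi_\alpha^\dagger \\ \Psi(\mathbf{r}) &= \sum_{m',\beta} a_{m',\beta} \, \phi_{m'}(\mathbf{r}) \, \chi_\beta \end{aligned} \tag{1488} $$
Hence, one finds the component of the spin density operator in the form
$$ \begin{aligned} ( \hat{\eta} \cdot \boldsymbol{\sigma} )(\mathbf{r}) &= \sum_{m,m'} \sum_{\alpha,\beta} \phi_m^*(\mathbf{r}) \, ( \hat{\eta} \cdot \boldsymbol{\sigma} )_{\alpha,\beta} \, \phi_{m'}(\mathbf{r}) \, a^\dagger_{m,\alpha} a_{m',\beta} \\ &= \sum_{m,m'} \phi_m^*(\mathbf{r}) \, \phi_{m'}(\mathbf{r}) \, \sin\theta_0 \left( a^\dagger_{m,\uparrow} a_{m',\downarrow} \exp[ - i \varphi_0 ] + a^\dagger_{m,\downarrow} a_{m',\uparrow} \exp[ + i \varphi_0 ] \right) \\ &\quad + \sum_{m,m'} \phi_m^*(\mathbf{r}) \, \phi_{m'}(\mathbf{r}) \, \cos\theta_0 \left( a^\dagger_{m,\uparrow} a_{m',\uparrow} - a^\dagger_{m,\downarrow} a_{m',\downarrow} \right) \end{aligned} \tag{1489} $$
On taking the expectation value of the spin density operator in the skyrmion state one finds
$$ \begin{aligned} \langle \Psi_+ | \, ( \hat{\eta} \cdot \boldsymbol{\sigma} )(\mathbf{r}) \, | \Psi_+ \rangle &= \sum_m \sin\theta_0 \, \exp[ - i \varphi_0 ] \, \phi_m^*(\mathbf{r}) \, \phi_{m+1}(\mathbf{r}) \, u_m v_m^* \\ &\quad + \sum_m \sin\theta_0 \, \exp[ + i \varphi_0 ] \, \phi_{m+1}^*(\mathbf{r}) \, \phi_m(\mathbf{r}) \, v_m u_m^* \\ &\quad + \sum_m \cos\theta_0 \left( | \phi_m(\mathbf{r}) |^2 \, | v_m |^2 - | \phi_{m+1}(\mathbf{r}) |^2 \, | u_m |^2 \right) \end{aligned} \tag{1490} $$
The wave functions can be expressed in planar polar coordinates as
$$ \phi_m(r, \varphi) = \frac{1}{\sqrt{2 \pi}} \exp[ + i m \varphi ] \, R_m(r) \tag{1491} $$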
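where $R_m(r)$ is real. Hence, we have
$$ \begin{aligned} \langle \Psi_+ | \, ( \hat{\eta} \cdot \boldsymbol{\sigma} )(\mathbf{r}) \, | \Psi_+ \rangle &= \sum_m \sin\theta_0 \, \exp[ - i ( \varphi_0 - \varphi ) ] \, R_m(r) \, R_{m+1}(r) \, u_m v_m^* \\ &\quad + \sum_m \sin\theta_0 \, \exp[ + i ( \varphi_0 - \varphi ) ] \, R_{m+1}(r) \, R_m(r) \, v_m u_m^* \\ &\quad + \sum_m \cos\theta_0 \left( R_m^2(r) \, | v_m |^2 - R_{m+1}^2(r) \, | u_m |^2 \right) \end{aligned} \tag{1492} $$
On maximizing w.r.t. $\varphi_0$ one finds
$$ \exp[ - 2 i ( \varphi_0 - \varphi ) ] \sum_m R_m(r) \, R_{m+1}(r) \, u_m v_m^* = \sum_m R_{m+1}(r) \, R_m(r) \, v_m u_m^* \tag{1493} $$
Hence, $\varphi_0 = \varphi$, that is, the in-plane component of the spin is directed radially outwards. On substituting this relation into the expectation value, one finds that the spin density along this direction simplifies to
$$ \begin{aligned} \langle \Psi_+ | \, ( \hat{\eta} \cdot \boldsymbol{\sigma} )(\mathbf{r}) \, | \Psi_+ \rangle &= \sin\theta_0 \sum_m R_{m+1}(r) \, R_m(r) \, ( v_m u_m^* + v_m^* u_m ) \\ &\quad + \cos\theta_0 \sum_m \left( | v_m |^2 \, R_m^2(r) - | u_m |^2 \, R_{m+1}^2(r) \right) \end{aligned} \tag{1494} $$
The out of plane component of the spin is determined by $\theta_0$. This is found from maximizing w.r.t. $\theta_0$, and leads to
$$ \tan\theta_0 = \frac{ \sum_m ( v_m u_m^* + v_m^* u_m ) \, R_{m+1}(r) \, R_m(r) }{ \sum_{m'} \left( | v_{m'} |^2 \, R_{m'}^2(r) - | u_{m'} |^2 \, R_{m'+1}^2(r) \right) } \tag{1495} $$
At large distances $r$ from the origin, the wave functions are dominated by a range of $m$ values around the value given by
$$ r^2 = 2 \, m \, \xi^2 \tag{1496} $$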
In this case, one has $R_{m+1}(r) \sim \frac{r}{\sqrt{2 m} \, \xi} R_m(r)$, hence, the out of plane angle at the distance $r$ from the origin is governed by
$$ \tan\frac{\theta_0}{2} \sim \frac{r}{\sqrt{2 m} \, \xi} \, \frac{u_m}{v_m} \tag{1497} $$
Since the ratio decreases with increasing $m$, the spin direction varies from $\theta_0 = \pi$ at the origin to $\theta_0 = 0$ as $r \to \infty$. The texture can be expressed empirically as
$$ \tan\frac{\theta_0}{2} = \frac{\lambda}{r} \tag{1498} $$
where $\lambda$ expresses the size of the skyrmion. In fact the size of the skyrmion is determined by the $m$ variation of $u_m$, or more explicitly by the ratio
$$ \frac{ \epsilon_{m+1,\downarrow} - \epsilon_{m,\uparrow} }{ \Delta_m } \tag{1499} $$
The size of the skyrmion decreases as the magnitude of the Zeeman splitting increases. This reflects the fact that the energy required to flip the spins in a region of large spatial extent becomes prohibitively costly as the Zeeman interaction is increased. Skyrmions can also be created in the Laughlin state. The skyrmions have lower energy than the Laughlin quasi-particles for all values of the gyromagnetic ratio $g$. The energy difference is largest for $g \to 0$. However, as $g \to \infty$ the region over which the spin is varying is reduced, and the energy approaches that of the Laughlin quasi-particle. In fact, in this limit, the skyrmion becomes identical to the Laughlin quasi-particle.
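The empirical texture (1498) is easy to tabulate. The short Python sketch below is an addition for illustration; the function names are hypothetical and the skyrmion size $\lambda$ is simply an input parameter. It returns the polar angle $\theta_0(r)$ and the corresponding out-of-plane spin component $\cos\theta_0 = ( r^2 - \lambda^2 ) / ( r^2 + \lambda^2 )$, which follows from (1498) by the half-angle identity.

```python
import math

def polar_angle(r, size):
    """theta_0(r) from tan(theta_0 / 2) = size / r, eq. (1498), for r >= 0."""
    # atan2 handles r = 0 without a division, giving theta_0 = pi at the origin
    return 2.0 * math.atan2(size, r)

def spin_z(r, size):
    """cos(theta_0) implied by eq. (1498): -1 at the origin, +1 far away."""
    return (r**2 - size**2) / (r**2 + size**2)
```

At $r = \lambda$ the spin lies in the plane, which gives one convenient operational definition of the skyrmion radius.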
11.5.6 Composite Fermions
The Laughlin wave function describes states with filling fractions $\frac{1}{p}$, where $p$ is odd. However, the sequence of fillings at which the fractional quantum Hall effect is observed is given by the expressions
$$ \nu = \frac{n}{2 n r \pm 1} \tag{1500} $$
and
$$ \nu = 1 - \frac{n}{2 n r \pm 1} \tag{1501} $$
where $n$ and $r$ are integers. These two filling fractions are related by approximate electron-hole symmetry, in which occupations of only the lowest Landau level are considered.
These other states can be expressed in terms of composite fermions. The general Laughlin state can be written as
$$ \begin{aligned} \Psi_{2r+1} &\sim \prod_{i>j} \Big[ ( x_i - x_j ) + i ( y_i - y_j ) \Big]^{2r+1} \prod_{k=1}^{N_e} \exp\left[ - \frac{x_k^2 + y_k^2}{4 \xi^2} \right] \\ &= \prod_{i>j} \Big[ ( x_i - x_j ) + i ( y_i - y_j ) \Big]^{2r} \times \prod_{i>j} \Big[ ( x_i - x_j ) + i ( y_i - y_j ) \Big] \prod_{k=1}^{N_e} \exp\left[ - \frac{x_k^2 + y_k^2}{4 \xi^2} \right] \end{aligned} \tag{1502} $$
This can be thought of as taking the state in which the lowest Landau level has the filling $\nu = 1$ and attaching an even number, $2 r$, of flux quanta to each particle. Then, the statistics of the composite particle (composed of the $2 r$ flux quanta and the electron) will be that of an electron ($\pi$) plus $\pi$ for each attached flux quantum. Hence, the total exchange phase of the composite particle wave function will be
$$ \pi + 2 r \pi = ( 2 r + 1 ) \pi \tag{1503} $$
and, thus, the composite particle is a fermion. Consider the state with the general filling factor
$$ \nu = \frac{n}{2 r n \pm 1} \tag{1504} $$
If the electrons are attached to $2 r$ flux tubes, one has composite fermions. This has the effect of reducing the magnetic flux from $\Phi$ to a value $\Phi^*$ given by
$$ \Phi^* = \Phi - 2 \, r \, N_e \, \Phi_0 \tag{1505} $$
This effective free flux can be either positive or negative. The fractional quantum Hall effect of electrons with filling factor $\nu$ can be related to the fractional quantum Hall effect of composite fermions with filling factor $\nu^*$. The relation between $\nu$ and $\nu^*$ is found by first inverting the definitions
$$ \begin{aligned} \nu &= \frac{N_e \, \Phi_0}{\Phi} \\ \nu^* &= \frac{N_e \, \Phi_0}{\Phi^*} \end{aligned} \tag{1506} $$
in which $\Phi^*$ and, therefore, $\nu^*$ are assumed to be positive. Then, substituting $\Phi^*$ and $\Phi$ into the relation given by eqn(1505) yields
$$ \frac{1}{\nu^*} = \frac{1}{\nu} - 2 r \tag{1507} $$
or
$$ \nu = \frac{\nu^*}{2 \, \nu^* r + 1} \tag{1508} $$
Hence, the fractional quantum Hall effect at general fillings can be related to the integer quantum Hall effect, with integer fillings $\nu^*$ for composite fermions with $2 r$ flux tubes attached to each electron. The expression for the filling fractions with the minus sign in the denominator can be obtained by considering negative values of $\Phi^*$, for which a minus sign has to be inserted into the definition of $\nu^*$ in order to keep the filling fraction positive.
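As a small worked illustration of (1508), the following Python sketch (an addition, using only the standard library; the function name `electron_filling` is hypothetical) tabulates the electron filling fraction for integer composite-fermion fillings $\nu^*$. The optional `sign` argument covers the case of negative $\Phi^*$ discussed above, where the denominator becomes $2 \nu^* r - 1$.

```python
from fractions import Fraction

def electron_filling(nu_star, r, sign=+1):
    """Electron filling nu = nu* / (2 r nu* + sign).

    sign = +1 reproduces eq. (1508); sign = -1 is the variant obtained for
    negative effective flux, as described in the text.
    """
    return Fraction(nu_star, 2 * r * nu_star + sign)

# First few members of the r = 1 sequences
plus_branch = [electron_filling(n, 1, +1) for n in (1, 2, 3)]
minus_branch = [electron_filling(n, 1, -1) for n in (2, 3)]
```

Using exact fractions avoids rounding and makes it easy to compare the output with the observed sequence of fillings.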
12 Insulators
The existence of band gaps is a natural consequence of Bloch's theorem for periodic crystals. However, the existence of band gaps is a much more universal phenomenon; for example, they also appear in amorphous materials. In these cases, the existence of band gaps can be traced back to the energy gaps separating the discrete bound state energies of isolated atoms. When the atoms are brought together to form a solid, each electron will be shared among all the atoms of the solid, as in a giant molecule of $N$ atoms. The sets of discrete energy levels from each of the $N$ atoms are nearly degenerate. The binding of the $N$ degenerate atomic states into $N$ molecular states involves bonding / anti-bonding splittings that lift the degeneracy. As the energy spread of the bonding and anti-bonding states is fixed, the levels form a dense set of discrete levels, which can be approximated by a continuous energy band. The separation between consecutive bands is roughly determined by the energy separation of the discrete levels of the isolated atom. Generally, the low energy bands have a small energy spread, and a clear correspondence with the atomic levels can be established. However, the higher energy bands tend to have larger band widths, so the bands overlap and the correspondence with the atomic orbitals becomes more obscure. An example is given by the ionic compound LiF, in which the Li ion loses an electron and the F ion gains an electron, so that each ion has only completely filled atomic shells. The Li 1s and F 1s levels are completely filled and are well separated, forming the core levels. The F 2s and 2p levels are also occupied, but have broader band widths and form the occupied valence bands. The unoccupied Li 2s and the unoccupied F 3s and 3p levels have large band widths which strongly overlap, yielding a conduction band which has mixed character.
The density of states from the completely filled valence band states is separated by an energy gap from the completely empty conduction band portion of the density of states. By definition, the Fermi-energy or chemical potential lies somewhere in the energy gap. The existence of a gap in the density of states at the Fermi-energy is the characteristic feature that defines an insulator or semiconductor. The distinction between a semiconductor and an insulator lies only in the magnitude of the energy gap between the lowest unoccupied state and the highest occupied state. In insulators this energy gap is large. In semiconductors this energy gap is small, so the electronic properties are determined by the electronic states close to the bottom of the conduction band and the states close to the top of the valence band. The density of states close to the band gap can usually be parameterized by a few quantities, such as the value of the band gap and the effective masses $m_e$ and $m_h$ for the conduction and valence bands. This is true, since the discontinuities at the band edges are van Hove singularities. Due to the symmetry one can represent the single particle Bloch energies of the
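valence band and conduction band states as
$$ \begin{aligned} E_c(\mathbf{p}) &= E_g + \sum_{i=1}^{d} a_i \, p_i^2 \\ E_v(\mathbf{p}) &= - \sum_{i=1}^{d} b_i \, p_i^2 \end{aligned} \tag{1509} $$
where the zero of energy was chosen to be at the top of the valence band. Using the definition of the effective mass as
$$ \frac{1}{m_{\alpha,\beta}} = \frac{\partial^2 E(\mathbf{p})}{\partial p_\alpha \, \partial p_\beta} \tag{1510} $$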
one finds a diagonal effective mass tensor $m_{\alpha,\beta}$, which is positive for the conduction band and negative for the valence band. There are two types of semiconductors that are frequently encountered: intrinsic semiconductors and extrinsic semiconductors.

A. Intrinsic Semiconductors. These are pure semiconductors, where the density of states consists of a completely filled valence band and a completely empty conduction band at $T = 0$, which are separated by a band gap $E_g$. At temperatures comparable to the band gap,
$$ E_g \sim k_B T \tag{1511} $$
a finite number of electrons can be excited from valence band states to the conduction band. The thermally excited electrons in the conduction band are associated with empty states in the valence band. For each conduction electron there is one empty valence band state. A hole is defined as the absence of an electron in a valence band state. Thus, at finite temperatures one has a finite number of electron-hole pairs. For materials such as Si or Ge the gap in the density of states is of the order of 0.5 to 1 eV. Thus, the number of thermally activated electron-hole pairs is expected to be extremely small under ambient conditions.

B. Extrinsic Semiconductors. Extrinsic semiconductors are semiconductors that contain impurities. Semiconductors with impurities can have discrete atomic levels that have energies which are lower than the empty conduction band and higher than the full valence band. That is, the impurity levels lie within the gap. There are two types of extrinsic or impurity semiconductors.
N type semiconductors have impurities with levels which at $T = 0$ would be filled with electrons. At higher temperatures the electrons can be excited from these levels into the conduction band. These types of impurities are called donors, since at high temperatures they donate electrons to the conduction band. An example of an N type semiconductor is given by semiconducting Si in which As impurities are substituted for some Si ions or, alternatively, semiconducting Ge substitutionally doped with P impurities. These are examples of elemental semiconductors from the IV column of the periodic table doped with impurities taken from the V column. Since the impurity ion has one more electron than the host material, the host bands are completely filled by taking four electrons from each impurity, but the impurity ion can still release one extra electron into the conduction band. For Si doped with a low concentration of As impurities, each As impurity can be considered individually. The As atom contains 5 valence electrons, while the perfect valence band only contains 4 electrons per site. Thus, the As ion has one extra electron which, according to the Pauli exclusion principle, has to be placed in states above the valence band. In the absence of the impurity potential of the ionized As atom, this extra electron would go into the conduction band and would behave very similarly to a free electron with mass $m_e$. However, one must consider the effect of the potential produced by the As$^+$ ion and the spatial correlation that this imposes on the extra electron. The As$^+$ ion has a positive charge, which affects the free conduction electron in much the same way as the positive nuclear charge affects the electron in an H atom. The extra electron becomes bound to the donor atom. In the semiconductor, the binding energy is extremely low and the radius of the orbit is large. This can be seen by examining the potential produced by the As$^+$ ion
$$ V(r) = - \frac{e^2}{\varepsilon \, r} \tag{1512} $$
which is screened by the dielectric constant $\varepsilon$. The dielectric constant for Si has a value of about 12. The mass of the electron is the effective mass $m_e$. The Bohr radius of the donor orbital is given by
$$ a_d(n) = \frac{\varepsilon \, \hbar^2 \, n^2}{m_e \, e^2} \tag{1513} $$
where $n$ is the principal quantum number. The energies of the donor levels below the bottom of the conduction band are given by the expression
$$ E_d(n) = - \frac{m_e \, e^4}{2 \, \varepsilon^2 \, \hbar^2} \, \frac{1}{n^2} \tag{1514} $$
The lowest state of the donor level is occupied at $T = 0$, where the electron is located in an orbit of radius $a_d(1) \sim 30$ Å and the energy of the donor level is $E_d(1) \approx - 0.02$ eV. Thus, there are shallow impurity levels just below the bottom of the conduction band, one of which is occupied at $T = 0$ K. These discrete levels can be represented by a set of delta functions in the density of states. For sufficiently large concentrations these impurities can be represented in terms of a finite impurity band which appears inside the gap in the density of states of the pure host material. Since the energy separation between the donor levels and the conduction band is comparable to $k_B T$ at room temperature, one expects that under ambient conditions there are a finite number of thermally activated conduction electrons available for carrying current.

P type semiconductors have impurities with levels that at $T = 0$ would be empty of electrons. At higher temperatures, electrons from the filled valence band will be excited into the empty impurity levels. These thermally excited electrons will be localized on the impurities (for small concentrations of impurities). However, the holes present in the valence band allow the valence electrons to conduct electricity and contribute to the properties of the semiconductor. The impurities in P type semiconductors are called acceptors, as they accept electrons at finite temperatures. Examples of P type semiconductors are Ga impurities doped substitutionally on the sites of a Si host, or Al impurities substituted for the atoms in a Ge crystal. These are examples of impurities from the III column of the periodic table being substituted for the atoms in a semiconductor composed of an element from the IV column of the periodic table. In this case the type III impurity provides only 3 electrons to the host valence band, which at finite $T$ contains one hole per Ga impurity. The Ga impurity atom shares the electrons of the surrounding Si atoms and becomes negatively charged. The extra hole orbits around the negative ion, producing acceptor levels that lie just above the top of the valence band. A finite concentration of acceptor levels is expected to produce a smeared out acceptor band just above the top of the valence band.
Due to the smallness of the gap between the valence band and the acceptor levels, at room temperature an appreciable number of electrons can be excited from the valence band onto the acceptor levels.
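The hydrogenic estimates (1513) and (1514) for a donor can be evaluated by rescaling the familiar hydrogen values. The Python sketch below is an illustrative addition: the helper names are hypothetical, and the inputs used in the example, a dielectric constant of 12 and an effective mass of 0.3 free-electron masses, are assumed round numbers rather than values quoted in the notes.

```python
BOHR_RADIUS_ANGSTROM = 0.529177  # hydrogen Bohr radius in Angstrom
RYDBERG_EV = 13.6057             # hydrogen ground-state binding energy in eV

def donor_radius(n, epsilon, mass_ratio):
    """a_d(n) of eq. (1513) in Angstrom; mass_ratio = m_e(effective) / m(free)."""
    return BOHR_RADIUS_ANGSTROM * epsilon * n**2 / mass_ratio

def donor_energy(n, epsilon, mass_ratio):
    """E_d(n) of eq. (1514) in eV, measured from the conduction band edge."""
    return -RYDBERG_EV * mass_ratio / (epsilon**2 * n**2)

# Illustrative (assumed) parameters
radius_1 = donor_radius(1, epsilon=12.0, mass_ratio=0.3)
energy_1 = donor_energy(1, epsilon=12.0, mass_ratio=0.3)
```

With these assumed inputs the radius comes out at a few tens of Ångström and the binding energy at a few hundredths of an eV, the same order of magnitude as the estimates quoted in the text.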
12.1 Thermodynamics
The thermodynamic and transport properties of semiconductors are governed by the excitations in the filled valence band or empty conduction band, as the fully filled or empty bands are essentially inert. The excitations of electrons on the localized impurity levels of extrinsic semiconductors do not directly contribute to physical properties. However, the electrons in the conduction band and the valence band are itinerant and do contribute. To develop a theory of the properties of semiconductors it is convenient to focus attention on the few unoccupied states of the valence band rather than the many filled states. This is achieved by reformulating the properties in terms of holes.
12.1.1 Holes
A hole is an unoccupied state in an otherwise completely occupied valence band. The probability of finding a hole in a state of energy $E$, $P_h(E)$, is given by the probability that an electron is not occupying that state
$$ P_h(E) = 1 - f(E) \tag{1515} $$
Thus, as
$$ P_h(E) = 1 - f(E) = \frac{1}{1 + \exp[ - \beta ( E - \mu ) ]} \tag{1516} $$
one finds that the probability of finding a hole in a state of energy $E$ is also given by the Fermi-function except that $E \to - E$ and $\mu \to - \mu$. That is, the energy of the hole is the energy of a missing electron. The momentum of a hole can be found by noting that the momentum of a completely occupied band is zero, since inversion symmetry dictates that for each state with momentum $\mathbf{k}_e$ there is a state with momentum $- \mathbf{k}_e$ with the same energy, and by assumption both are occupied. Then, by definition, the momentum of a hole in $\mathbf{k}_e$ is that of the full band with one electron missing
$$ \mathbf{k}_h = 0 - \mathbf{k}_e \tag{1517} $$
Thus, the momentum of a hole is opposite to that of the missing electron
$$ \mathbf{k}_h = - \mathbf{k}_e \tag{1518} $$
The charge on the electron is $- | e |$. The charge on the hole can be obtained from the quasi-classical form of Newton's laws applied to an electron in an electric field $\mathbf{E}$,
$$ \hbar \frac{d \mathbf{k}_e}{dt} = - | e | \, \mathbf{E} \tag{1519} $$
As the momentum of a hole is given by
$$ \mathbf{k}_h = - \mathbf{k}_e \tag{1520} $$
one finds that
$$ \hbar \frac{d \mathbf{k}_h}{dt} = + | e | \, \mathbf{E} \tag{1521} $$
Thus, the charge of the hole is just $| e |$. This is consistent with considerations of electrical neutrality. In a solid in equilibrium, the charge of the nuclei is equal in magnitude to the charge of the electrons. Thus
$$ | e | \, ( Z N_n - N_e ) = 0 \tag{1522} $$
Thus, in equilibrium the number of electrons is related to the number of nuclei via $N_e = Z N_n$. Now if one electron is removed from the valence band and removed from the solid, there is just one hole. The total charge of the one hole state is given by
$$ Q_h = | e | \, \Big( Z N_n - ( N_e - 1 ) \Big) \tag{1523} $$
where the number of electrons is now $N_e - 1 = Z N_n - 1$. Hence,
$$ Q_h = | e | \tag{1524} $$
Thus, the charge on the hole is minus the charge of the electron. The velocity of a hole $\mathbf{v}_h$ can be found by considering the electrical current carried by the electrons. The electrical current carried by a full band is zero. The electrical current carried by a hole is the electrical current carried by a full band minus one electron.
$$ \mathbf{j}_h = | e | \, \mathbf{v}_h = 0 - ( - | e | \, \mathbf{v}_e ) \tag{1525} $$
Thus,
$$ \mathbf{v}_h = \mathbf{v}_e \tag{1526} $$
Thus, the velocity of the hole is the same as the velocity of the missing electron. The above results indicate that the velocity of the hole is in the same direction as the velocity of the missing electron, but the momenta have opposite directions. This implies that the sign of the hole mass should be opposite to that of the mass of the missing electron. This can also be seen by considering the alternate form of the quasi-classical equation of motion
$$ m_e \frac{d \mathbf{v}_e}{dt} = - | e | \, \mathbf{E} \tag{1527} $$
and since
$$ \mathbf{v}_e = \mathbf{v}_h \tag{1528} $$
and the charge of the hole is $| e |$ one has
$$ - m_e \frac{d \mathbf{v}_h}{dt} = | e | \, \mathbf{E} \tag{1529} $$
Hence,
$$ m_h = - m_e \tag{1530} $$
the mass of the hole is the negative of the mass of the missing electron. The most important number for the thermodynamic and transport properties of a semiconductor is the number of charge carriers. This can be obtained from analysis of the appropriate semiconductor.
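The first of the hole properties derived in this subsection, the occupation probability (1516), can be confirmed with a few lines of Python. The sketch is an addition and the function names are illustrative; energies and $1/\beta$ are taken in the same arbitrary units.

```python
import math

def fermi(energy, mu, beta):
    """Fermi function f(E) = 1 / (exp[beta (E - mu)] + 1)."""
    return 1.0 / (math.exp(beta * (energy - mu)) + 1.0)

def hole_probability(energy, mu, beta):
    """P_h(E) = 1 - f(E), eq. (1515)."""
    return 1.0 - fermi(energy, mu, beta)

def hole_probability_mirrored(energy, mu, beta):
    """The same quantity written as a Fermi function with E -> -E, mu -> -mu."""
    return fermi(-energy, -mu, beta)
```

The two hole functions agree to rounding error for any inputs, which is the content of the statement that holes obey Fermi statistics with reversed energy and chemical potential.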
12.1.2 Intrinsic Semiconductors
The number of electrons thermally excited into the conduction band of an intrinsic semiconductor can be calculated with knowledge of the chemical potential $\mu(T)$. Consider an intrinsic semiconductor like pure Si which has an energy gap $E_g$ of the order of 1 eV. The energy of the top of the valence band shall be set to zero. The energy of the hole is zero at the top of the valence band and increases downwards, as the energy is that of the missing electron. This agrees with the fact that $m_h < 0$, as
$$ \frac{\partial^2 E}{\partial k^2} = \frac{1}{m_h} < 0 \tag{1531} $$
The energy of the conduction electron is given by
$$ E(k) = E_g + \frac{\hbar^2 k^2}{2 \, m_e} \tag{1532} $$
The chemical potential can be obtained from considerations of charge neutrality, which implies that the number of conduction electrons is equal to the number of holes in the valence band. The number of electrons can be calculated from the electron density of states
$$ \rho(E) = \frac{V}{2 \pi^2} \left( \frac{2 \, m_e}{\hbar^2} \right)^{\frac{3}{2}} ( E - E_g )^{\frac{1}{2}} \qquad \text{for} \quad E > E_g \tag{1533} $$
The number of electrons in the conduction band, $N_c$, is given by
$$ N_c = \int_{E_g}^{\infty} dE \, \rho(E) \, f(E) \tag{1534} $$
The Fermi-function can be expressed in terms of $( E - E_g )$ as
$$ f(E) = \frac{1}{ \exp[ \beta ( E - E_g ) ] \, \exp[ - \beta ( \mu - E_g ) ] + 1 } \tag{1535} $$
If it is assumed that $\mu = \frac{E_g}{2}$ then, as the lowest conduction band state is at $E = E_g$, the Fermi-function for the conduction electrons is almost classical since both exponents are positive. Thus, with this assumption
$$ f(E) \sim \exp[ - \beta ( E - \mu ) ] \tag{1536} $$
As shall be shown later, the above assumption is valid. The number of thermally activated conduction electrons can now be obtained by evaluating the integral over the classical Boltzmann distribution by changing variables from $E$ to the
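dimensionless variable $x = \beta ( E - E_g )$. Then
$$ \begin{aligned} N_c &\sim \frac{V}{2 \pi^2} \left( \frac{2 \, m_e k_B T}{\hbar^2} \right)^{\frac{3}{2}} \exp[ - \beta ( E_g - \mu ) ] \int_0^{\infty} dx \, x^{\frac{1}{2}} \exp[ - x ] \\ &= 2 V \left( \frac{2 \pi \, m_e k_B T}{h^2} \right)^{\frac{3}{2}} \exp[ - \beta ( E_g - \mu ) ] \end{aligned} \tag{1537} $$
which is usually expressed in terms of the thermal de Broglie wavelength of the conduction electrons $\lambda_e$ defined by
$$ \lambda_e = \frac{h}{\sqrt{ 2 \pi \, m_e k_B T }} \tag{1538} $$
as
$$ N_c = 2 \, \frac{V}{\lambda_e^3} \exp[ - \beta ( E_g - \mu ) ] \tag{1539} $$
The number of thermally excited conduction electrons depends exponentially on the unknown quantity $\mu$.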
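The number of holes in the valence band can be found from a similar calculation. The occupation number for the holes in the valence band is given by
$$ P_h(E) = 1 - f(E) = \frac{1}{1 + \exp[ - \beta ( E - \mu ) ]} $$
which is the Fermi-function except that $E \to - E$ and $\mu \to - \mu$. The density of states for the holes in the valence band is given by
$$ \rho(E) = \frac{V}{2 \pi^2} \left( \frac{- 2 \, m_h}{\hbar^2} \right)^{\frac{3}{2}} ( - E )^{\frac{1}{2}} \qquad \text{for} \quad E < 0 \tag{1540} $$
Since the value of $\mu$ is assumed to be positive and of the order of $\frac{E_g}{2}$, the Fermi-function can also be treated classically. Thus, the number of holes $N_v$ in the valence band is given by the integral
$$ \begin{aligned} N_v &= \int_{-\infty}^{0} dE \, P_h(E) \, \rho(E) \\ &= \frac{V}{2 \pi^2} \left( \frac{- 2 \, m_h}{\hbar^2} \right)^{\frac{3}{2}} \int_{-\infty}^{0} dE \, \frac{ ( - E )^{\frac{1}{2}} }{ \exp[ - \beta ( E - \mu ) ] + 1 } \\ &\sim 2 V \left( \frac{- 2 \pi \, m_h k_B T}{h^2} \right)^{\frac{3}{2}} \exp[ - \beta \mu ] \end{aligned} \tag{1541} $$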
which also depends on $\mu$. Since in an intrinsic semiconductor the electrons and holes are created in pairs one has
$$ N_c = N_v \tag{1542} $$
Therefore, the equation for the unknown variable $\mu$ is given by
$$ \exp[ 2 \beta \mu ] = \left( \frac{- m_h}{m_e} \right)^{\frac{3}{2}} \exp[ \beta E_g ] \tag{1543} $$
or on taking the logarithm one has
$$ \mu = \frac{E_g}{2} + \frac{3}{4} \, k_B T \, \ln\left( \frac{- m_h}{m_e} \right) \tag{1544} $$
Hence, at $T = 0$, the chemical potential lies half way in the gap and only acquires a temperature dependence if there is an asymmetry in the magnitude of the conduction band density of states and the valence band density of states in the vicinity of the band edges. This result justifies the previous assumption that the distribution functions can be treated classically. The result also shows that the number of thermally activated electrons or holes crucially depends on the gap via the exponential factor
$$ \exp\left[ - \beta \, \frac{E_g}{2} \right] \tag{1545} $$
and is extremely small at room temperature for a semiconductor with a gap that has a magnitude of the order of electron volts.
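The results (1539), (1541) and (1544) can be combined into a short numerical estimate. The Python sketch below is an illustrative addition: the function names are hypothetical, densities are per unit volume in SI units, and the hole mass is passed as a negative number to follow the sign convention $m_h < 0$ used above. The example parameters (a 1 eV gap and carrier masses equal in magnitude to the free electron mass) are assumptions chosen only to show the orders of magnitude.

```python
import math

K_B_EV = 8.617333e-5       # Boltzmann constant in eV/K
K_B_J = 1.380649e-23       # Boltzmann constant in J/K
H_PLANCK = 6.62607015e-34  # Planck constant in J s
M_FREE = 9.1093837e-31     # free electron mass in kg

def chemical_potential(gap_ev, temperature, m_e_ratio, m_h_ratio):
    """mu from eq. (1544); m_h_ratio must be negative (hole mass convention)."""
    return 0.5 * gap_ev + 0.75 * K_B_EV * temperature * math.log(-m_h_ratio / m_e_ratio)

def thermal_wavelength(temperature, mass_ratio):
    """Thermal de Broglie wavelength of eq. (1538) in metres, for a positive mass ratio."""
    return H_PLANCK / math.sqrt(2.0 * math.pi * mass_ratio * M_FREE * K_B_J * temperature)

def electron_density(gap_ev, temperature, m_e_ratio, m_h_ratio):
    """N_c / V from eq. (1539), in m^-3."""
    mu = chemical_potential(gap_ev, temperature, m_e_ratio, m_h_ratio)
    lam = thermal_wavelength(temperature, m_e_ratio)
    return 2.0 / lam**3 * math.exp(-(gap_ev - mu) / (K_B_EV * temperature))

def hole_density(gap_ev, temperature, m_e_ratio, m_h_ratio):
    """N_v / V from the last line of eq. (1541), in m^-3."""
    mu = chemical_potential(gap_ev, temperature, m_e_ratio, m_h_ratio)
    lam = thermal_wavelength(temperature, -m_h_ratio)
    return 2.0 / lam**3 * math.exp(-mu / (K_B_EV * temperature))
```

For example, `electron_density(1.0, 300.0, 1.0, -1.0)` gives the carrier density for the assumed parameters, and comparing it with `hole_density` for unequal masses checks that (1544) does enforce $N_c = N_v$.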
12.1.3
Extrinsic Semiconductors
In an extrinsic semiconductor, the gap in the host material is usually much larger than the gap between the impurity levels and the band edges. Therefore, at ambient temperatures one can neglect one of the bands, as the number of thermally excited electrons or holes in that band is small. For concreteness, consider an N-type semiconductor. The bottom of the conduction band is defined as the reference energy $E = 0$ and the energy of the donor levels is defined as $- E_d$. Let $N_d$ be the number of donor atoms, $n_d$ be the number of electrons remaining in the donor atoms, and $n_c$ be the number of electrons thermally excited to the conduction band. The important physical properties are determined by the number of conduction electrons. This can be calculated from the free energy $F$. The free energy $F$ is given in terms of the energy and entropy via the relation
$$ F = E - T S \tag{1546} $$
where the total energy is
$$ E = - n_d E_d \tag{1547} $$
as the energy of the thermally activated conduction electrons is $E \sim 0$, since they occupy states at the bottom of the band. The entropy $S$ is given by
$$ S = k_B \ln \Omega \tag{1548} $$
where $\Omega$ is the number of possible states of the system. This is just the number of ways of distributing $n_d$ electrons in the $2 N_d$ states of the donor atoms. It is assumed that the donor atoms only have a two-fold spin degeneracy, and that there is no interaction energy between two electrons occupying the two spin states on the same donor atom. With these assumptions, the number of accessible states is given by
$$ \Omega = \frac{( 2 N_d )!}{n_d ! \, ( 2 N_d - n_d )!} \tag{1549} $$
Hence, the free energy can be calculated in the thermodynamic limit (using Stirling's approximation) as
$$ F = - E_d n_d - k_B T \left[ 2 N_d \ln 2 N_d - n_d \ln n_d - ( 2 N_d - n_d ) \ln ( 2 N_d - n_d ) \right] \tag{1550} $$
The chemical potential is found by differentiating the free energy with respect to $n_d$
$$ \mu = \frac{\partial F}{\partial n_d} = - E_d - k_B T \left[ - \ln n_d + \ln ( 2 N_d - n_d ) \right] \tag{1551} $$
Thus, on exponentiating
$$ \exp\left[ \beta ( \mu + E_d ) \right] = \frac{n_d}{2 N_d - n_d} \tag{1552} $$
which leads to the number of electrons occupying the donor orbitals as
$$ n_d = \frac{2 N_d}{\exp\left[ \beta ( - E_d - \mu ) \right] + 1} \tag{1553} $$
which is just governed by the Fermi function $f(- E_d)$ and the number of orbitals $2 N_d$.
The number of thermally excited conduction electrons is given by
$$ \begin{aligned} n_c &= 2 V \int \frac{d^3 k}{( 2 \pi )^3} \, \frac{1}{\exp\left[ \beta ( E(k) - \mu ) \right] + 1} \\ &= 2 V \left( \frac{2 \pi m_e k_B T}{h^2} \right)^{3/2} \exp[ \beta \mu ] \end{aligned} \tag{1554} $$
The unknown chemical potential $\mu$ can be eliminated from the above two equations, and thereby a relation between $n_c$ and $n_d$ can be found
$$ \frac{n_c ( 2 N_d - n_d )}{n_d} = \exp[ - \beta E_d ] \; 2 V \left( \frac{2 \pi m_e k_B T}{h^2} \right)^{3/2} \tag{1555} $$
This is the law of mass action for the dissociation reaction
$$ n_d \to n_c + ( N_d - n_d ) \tag{1556} $$
in which filled donors dissociate into conduction electrons and unoccupied donor atoms. This relation can be written as
$$ \frac{n_c ( 2 N_d - n_d )}{n_d} = \exp[ - \beta E_d ] \, \frac{2 V}{\lambda^3} \tag{1557} $$
where $\lambda$ is the thermal de Broglie wavelength,
$$ \lambda = \frac{h}{\sqrt{2 \pi m_e k_B T}} \tag{1558} $$
On using the condition of electrical neutrality
$$ N_d = n_d + n_c \tag{1559} $$
one can eliminate $n_d$ to find the number of conduction electrons as
$$ \frac{n_c ( N_d + n_c )}{( N_d - n_c )} = \frac{2 V}{\lambda^3} \exp[ - \beta E_d ] \tag{1560} $$
This is a quadratic equation for $n_c$ which can be solved to yield the positive root
$$ n_c = - \frac{1}{2} \left( N_d + \frac{2 V}{\lambda^3} \exp[ - \beta E_d ] \right) + \frac{1}{2} \sqrt{ \left( N_d + \frac{2 V}{\lambda^3} \exp[ - \beta E_d ] \right)^2 + \frac{8 N_d V}{\lambda^3} \exp[ - \beta E_d ] } \tag{1561} $$
With this expression for the number of conduction electrons one can solve for $\mu$ from
$$ \mu = k_B T \ln\left( \frac{n_c \lambda^3}{2 V} \right) \tag{1562} $$
At sufficiently low temperatures, when
$$ \frac{N_d \lambda^3}{2 V} \gg \exp[ - \beta E_d ] \tag{1563} $$
one finds that
$$ \mu = - E_d \tag{1564} $$
as at $T = 0$ the donor level is only partially occupied, since there are $2 N_d$ orbitals including spin and $N_d$ electrons. At sufficiently high temperatures, the donor levels are almost completely ionized and
$$ n_c \simeq N_d \tag{1565} $$
and the chemical potential is found as
$$ \mu = k_B T \ln\left( \frac{N_d \lambda^3}{2 V} \right) \tag{1566} $$
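The two limits quoted above can be checked numerically from the positive root of the quadratic equation. The sketch below is an illustration under stated assumptions rather than part of the original notes: the dimensionless combination `kappa` $= ( 2 V / N_d \lambda^3 ) \exp[ - \beta E_d ]$ and the function names are introduced here for convenience, and `kappa` and $\beta E_d$ are treated as independent inputs.

```python
import math


def conduction_fraction(kappa):
    """n_c / N_d from the positive root of Eq. (1561).

    kappa = (2 V / (N_d lambda^3)) exp(-beta E_d) is a dimensionless
    combination introduced here for illustration.
    """
    b = 1.0 + kappa
    # Algebraically identical to 0.5 * (-b + sqrt(b*b + 4*kappa)), but written
    # so that no cancellation occurs when kappa is very small.
    return 2.0 * kappa / (b + math.sqrt(b * b + 4.0 * kappa))


def beta_mu(kappa, beta_ed):
    """beta * mu from Eq. (1562), using n_c lambda^3 / 2V = y exp(-beta E_d) / kappa."""
    y = conduction_fraction(kappa)
    return math.log(y) - math.log(kappa) - beta_ed


low_t = conduction_fraction(1.0e-8)   # freeze-out regime, kappa << 1
high_t = conduction_fraction(1.0e8)   # saturation regime, kappa >> 1
print(low_t, high_t, beta_mu(1.0e-8, 20.0))
```

For small `kappa` the fraction of ionized donors equals `kappa` and $\beta \mu$ approaches $- \beta E_d$, as in Eq. (1564); for large `kappa` the fraction approaches one, as in Eq. (1565).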
——————————————————————————————————
12.1.4
Exercise 59
Assume that the donor levels are localized to such an extent that the Coulomb repulsion between two opposite spin electrons in the same donor atom is extremely large. This assumption makes the occurrence of doubly occupied donor atoms extremely improbable. Show that the number of accessible states is given by
$$ \Omega = 2^{n_d} \, \frac{N_d !}{n_d ! \, ( N_d - n_d )!} \tag{1567} $$
Hence, show that
$$ n_d = \frac{N_d}{1 + \frac{1}{2} \exp\left[ - \beta ( E_d + \mu ) \right]} \tag{1568} $$
Thus, the interaction affects the statistics of the occupation numbers. Also show that the law of mass action becomes
$$ \frac{n_c ( N_d - n_d )}{n_d} = \frac{V}{\lambda^3} \exp[ - \beta E_d ] \tag{1569} $$
——————————————————————————————————
12.2
Transport Properties
Transport in doped semiconductors is mainly due to scattering from donor impurities. The potential due to the isolated donor impurities is a screened Coulomb interaction
$$ V(r) = - \frac{Z e^2}{\varepsilon r} \exp[ - k_{TF} r ] \tag{1570} $$
The transport scattering rate is given by
$$ \frac{1}{\tau(E(k))} = \frac{c}{\hbar} \int dk' \, k'^2 \, \delta( E(k) - E(k') ) \int d\cos\theta \, ( 1 - \cos\theta ) \left[ \frac{4 \pi Z e^2}{\varepsilon \left( ( k - k' )^2 + k_{TF}^2 \right)} \right]^2 \tag{1571} $$
where $c$ is the impurity concentration and $\theta$ is the scattering angle. The integration over the magnitude of $k'$ can be performed using the delta function. The integration over the angle can be performed as
$$ \frac{1}{\tau(E(k))} = \frac{2 m c}{\hbar^3 k^3} \int d\cos\theta \, ( 1 - \cos\theta ) \left[ \frac{4 \pi Z e^2}{\varepsilon \left( 1 - \cos\theta + ( k_{TF} / k )^2 \right)} \right]^2 \tag{1572} $$
Hence, $\tau(k) \sim k^3$. The conductivity can be evaluated from the formulae
$$ \begin{aligned} \sigma_{x,x} &= - \frac{2 e^2}{V} \sum_k \frac{\hbar^2 k_x^2}{m^2} \, \tau(k) \, \frac{\partial f(E(k))}{\partial E(k)} \\ &= \frac{2 e^2}{V} \sum_k \frac{\hbar^2 k_x^2}{m^2} \, \tau(k) \, \beta \exp\left[ - \beta ( E(k) - \mu ) \right] \\ &\sim \beta \exp[ \beta \mu ] \int dk \, k^7 \exp\left[ - \frac{\beta \hbar^2 k^2}{2 m_e} \right] \end{aligned} \tag{1573} $$
On changing the variable of integration from $k$ to the dimensionless variable
$$ x = \frac{\hbar^2 k^2}{2 m_e k_B T} \tag{1574} $$
one finds that the temperature dependence of the conductivity is governed by
$$ \sigma_{x,x} \sim ( k_B T )^3 \exp[ \beta \mu ] \tag{1575} $$
The conductivity is proportional to the electron density and the scattering time. For Coulomb scattering, the average scattering time is proportional to $T^{3/2}$. On the other hand, scattering from lattice vibrations gives rise to a conductivity that is just proportional to $\exp[ \beta \mu ]$.
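The $T^3$ prefactor can be checked by evaluating the last integral of Eq. (1573) numerically. The following is a small sketch under assumed units $\hbar = m_e = k_B = 1$, with the factor $\exp[ \beta \mu ]$ left out; the function name, the cut-off and the step count are illustrative choices, not part of the original notes. In these units the integral can also be done by hand and equals $48 \, T^3$.

```python
import math


def conductivity_prefactor(temperature, n_steps=20000):
    """beta * integral_0^inf dk k^7 exp(-beta k^2 / 2), in units hbar = m_e = k_B = 1.

    The factor exp(beta mu) of Eq. (1573) is deliberately left out.
    """
    beta = 1.0 / temperature
    k_max = 12.0 * math.sqrt(temperature)  # the integrand is negligible beyond this
    h = k_max / n_steps
    total = 0.0
    # Trapezoidal rule; both end-point contributions are negligible here.
    for i in range(1, n_steps):
        k = i * h
        total += k ** 7 * math.exp(-0.5 * beta * k * k)
    return beta * total * h


ratio = conductivity_prefactor(2.0) / conductivity_prefactor(1.0)
print(conductivity_prefactor(1.0), ratio)
```

Doubling the temperature multiplies the prefactor by eight, which is the $( k_B T )^3$ dependence of Eq. (1575).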
12.3
Optical Properties
13
Phonons
14
Harmonic Phonons
The Hamiltonian describing the motion of the ions can be formulated by assuming that the ions move slowly compared with the electrons. Thus, at any instant of time, the electrons have relaxed into equilibrium positions and the ions are frozen into their instantaneous positions. It is assumed that the ions have definite mean equilibrium positions and that the displacements of the ions from these equilibrium positions are small. First, the situation in which the crystal lattice can be described by a Bravais lattice with a one atom basis is considered. In a later section, the effects of a multi-atom basis will be described. For the case of a one atom basis, the energy of the solid can be calculated for the case in which the ionic positions are displaced by amounts $\mathbf{u}_i$,
$$ \mathbf{r}(\mathbf{R}_i) = \mathbf{R}_i + \mathbf{u}_i \tag{1576} $$
where $\mathbf{u}_i$ is the deviation from the equilibrium position $\mathbf{R}_i$. It is assumed that the total energy can be formulated as a constant plus pairwise interactions which depend on $\mathbf{r}(\mathbf{R}_i) - \mathbf{r}(\mathbf{R}_j)$. The pair-wise interaction is given in terms of the pair-potentials $\Theta( \mathbf{r}(\mathbf{R}_i) - \mathbf{r}(\mathbf{R}_j) )$ via
$$ \begin{aligned} \hat{V} &= \frac{1}{2} \sum_{i,j} \Theta( \mathbf{r}(\mathbf{R}_i) - \mathbf{r}(\mathbf{R}_j) ) \\ &= \frac{1}{2} \sum_{i,j} \Theta( \mathbf{R}_i - \mathbf{R}_j + \mathbf{u}_i - \mathbf{u}_j ) \end{aligned} \tag{1577} $$
The Hamiltonian governing the motion of the ions is just given by the sum of the kinetic energies of the ions of mass $M$ and the pair-wise interactions
$$ \hat{H} = \sum_i \frac{\hat{\mathbf{P}}_i^2}{2 M} + \hat{V} \tag{1578} $$
The harmonic approximation assumes that the $\mathbf{u}_i$ are sufficiently small so that $\hat{H}$ can be expanded in powers of $\mathbf{u}_i$. The terms involving $\mathbf{u}_i$ describe the change in energies of the ions due to the lattice vibrations. The expansion is
$$ \begin{aligned} \hat{H} &= \sum_i \frac{\hat{\mathbf{P}}_i^2}{2 M} + \frac{1}{2} \sum_{i,j} \Theta( \mathbf{R}_i - \mathbf{R}_j ) \\ &\quad + \frac{1}{2} \sum_{i,j} ( \mathbf{u}_i - \mathbf{u}_j ) \cdot \nabla \Theta( \mathbf{R}_i - \mathbf{R}_j ) \\ &\quad + \frac{1}{2} \frac{1}{2!} \sum_{i,j} \left[ ( \mathbf{u}_i - \mathbf{u}_j ) \cdot \nabla \right]^2 \Theta( \mathbf{R}_i - \mathbf{R}_j ) + \ldots \end{aligned} \tag{1579} $$
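However, since the $\mathbf{R}_i$ are equilibrium positions, and the total force on the atom located at $\mathbf{R}_i$ is given by
$$ \sum_j \nabla \Theta( \mathbf{R}_i - \mathbf{R}_j ) \tag{1580} $$
the total force must vanish in equilibrium. Hence, the potential to second order in $\mathbf{u}$ is just given by
$$ \hat{V} = V_{eq} + \hat{V}_{harmonic} \tag{1581} $$
where the harmonic potential is given by
$$ \hat{V}_{harmonic} = \frac{1}{4} \sum_{i,j} \sum_{\mu,\nu} ( u_i^\mu - u_j^\mu ) \, \Theta_\mu^\nu( \mathbf{R}_i - \mathbf{R}_j ) \, ( u_{\nu,i} - u_{\nu,j} ) \tag{1582} $$
In the above equation, the quantities $\Theta_\mu^\nu$ are defined in terms of the second derivatives of the pair potential
$$ \Theta_\mu^\nu( \mathbf{R}_i - \mathbf{R}_j ) = \nabla_\mu \nabla^\nu \Theta( \mathbf{R}_i - \mathbf{R}_j + \mathbf{u}_i - \mathbf{u}_j ) \Big|_{\mathbf{u} \equiv 0} \tag{1583} $$
The harmonic potential $\hat{V}_{harmonic}$ is usually expressed directly in terms of the displacements, and not their differences
$$ \hat{V}_{harmonic} = \frac{1}{2} \sum_{i,j} \sum_{\mu,\nu} u_i^\mu \, D_\mu^\nu( \mathbf{R}_i - \mathbf{R}_j ) \, u_{\nu,j} \tag{1584} $$
where the dynamical matrix is given by
$$ D_\mu^\nu( \mathbf{R}_i - \mathbf{R}_j ) = \delta_{\mathbf{R}_i , \mathbf{R}_j} \sum_{\mathbf{R}''} \Theta_\mu^\nu( \mathbf{R}_i - \mathbf{R}'' ) - \Theta_\mu^\nu( \mathbf{R}_i - \mathbf{R}_j ) \tag{1585} $$
In general, when the interactions are not just pairwise, the harmonic potential can still be defined via the second derivative of the total energy
$$ D_\mu^\nu( \mathbf{R} - \mathbf{R}' ) = \frac{\partial^2 E}{\partial u_i^\mu \, \partial u_{\nu,j}} \tag{1586} $$
For a solid with a mono-atomic Bravais lattice, the dynamical matrix $D_\mu^\nu( \mathbf{R} - \mathbf{R}' )$ possesses the symmetry
$$ D_\mu^\nu( \mathbf{R} - \mathbf{R}' ) = D_\nu^\mu( \mathbf{R}' - \mathbf{R} ) \tag{1587} $$
due to the analyticity of the pair potential. Also, one has
$$ D_\mu^\nu( \mathbf{R} - \mathbf{R}' ) = D_\mu^\nu( \mathbf{R}' - \mathbf{R} ) \tag{1588} $$
which follows as every Bravais lattice has inversion symmetry. Due to translational invariance, one has the sum rule
$$ \sum_{\mathbf{R}} D_\mu^\nu( \mathbf{R} ) = 0 \tag{1589} $$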
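This follows from consideration of the absence of energy change due to a uniform displacement of the solid
$$ \mathbf{r}(\mathbf{R}_i) = \mathbf{R}_i + \boldsymbol{\Delta} \qquad \forall \, i \tag{1590} $$
Under this displacement, the energy change of the solid is zero, and so
$$ 0 = \frac{1}{2} \sum_{i,j} \sum_{\mu,\nu} \Delta^\mu \, D_\mu^\nu( \mathbf{R}_i - \mathbf{R}_j ) \, \Delta_\nu \tag{1591} $$
for an arbitrarily chosen $\Delta^\mu$. Since $u_{i,\mu}$ represents the displacement canonically conjugate to $\hat{P}_{i,\mu}$, the momentum and displacement operators satisfy the commutation relations
$$ [ \, u_{i,\mu} \, , \, \hat{P}_{j,\nu} \, ] = i \hbar \, \delta_{i,j} \, \delta_{\mu,\nu} \tag{1592} $$
and
$$ [ \, u_{i,\mu} \, , \, u_{j,\nu} \, ] = [ \, \hat{P}_{i,\mu} \, , \, \hat{P}_{j,\nu} \, ] = 0 \tag{1593} $$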
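To diagonalize the harmonic Hamiltonian, first a canonical transformation is performed to a representation in which the periodic translational invariance of the lattice is explicit. The displacement is expressed as
$$ \mathbf{u}_i = \frac{1}{\sqrt{N}} \sum_{\mathbf{q}} \mathbf{u}_{\mathbf{q}} \exp[ \, i \, \mathbf{q} \cdot \mathbf{R}_i \, ] \tag{1594} $$
and the momentum operator becomes
$$ \hat{\mathbf{P}}_i = \frac{1}{\sqrt{N}} \sum_{\mathbf{q}} \hat{\mathbf{p}}_{\mathbf{q}} \exp[ \, i \, \mathbf{q} \cdot \mathbf{R}_i \, ] \tag{1595} $$
where $N = N_1 N_2 N_3$ is the number of unit cells in the crystal. On assuming that the lattice displacements satisfy Born-von Karman boundary conditions, the momenta are quantized as
$$ \mathbf{q} = \frac{n_1}{N_1} \mathbf{b}_1 + \frac{n_2}{N_2} \mathbf{b}_2 + \frac{n_3}{N_3} \mathbf{b}_3 \tag{1596} $$
where the $n_i$ are integers such that $0 \le n_i < N_i$. In the Fourier transformed basis, the commutation relations become
$$ [ \, u_{\mathbf{q},\mu} \, , \, \hat{p}_{\mathbf{q}',\nu} \, ] = i \hbar \, \Delta( \mathbf{q} + \mathbf{q}' ) \, \delta_{\mu,\nu} \tag{1597} $$
and
$$ [ \, u_{\mathbf{q},\mu} \, , \, u_{\mathbf{q}',\nu} \, ] = [ \, \hat{p}_{\mathbf{q},\mu} \, , \, \hat{p}_{\mathbf{q}',\nu} \, ] = 0 \tag{1598} $$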
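The quantity $\Delta(\mathbf{q})$ is the Kronecker delta function, modulo reciprocal lattice vectors, and is defined via
$$ \sum_{\mathbf{R}_i} \exp[ \, i \, \mathbf{q} \cdot \mathbf{R}_i \, ] = N \, \Delta(\mathbf{q}) \tag{1599} $$
This is non-zero if $\mathbf{q} = \mathbf{Q}$ is a reciprocal lattice vector, and is zero otherwise. In terms of the new coordinates the Hamiltonian becomes
$$ \hat{H} = \sum_{\mathbf{q}} \left[ \frac{\hat{p}^\dagger_{\mathbf{q},\mu} \, \delta_{\mu,\nu} \, \hat{p}_{\mathbf{q},\nu}}{2 M} + \frac{1}{2} \, u^\dagger_{\mathbf{q},\mu} \, D_\mu^\nu(\mathbf{q}) \, u_{\mathbf{q},\nu} \right] \tag{1600} $$
where $\tilde{D}(\mathbf{q})$ is the Fourier transform of the dynamical matrix
$$ \tilde{D}(\mathbf{q}) = \sum_{\mathbf{R}} \tilde{D}(\mathbf{R}) \exp[ \, - i \, \mathbf{q} \cdot \mathbf{R} \, ] \tag{1601} $$
Thus, the harmonic Hamiltonian is diagonal in the quantum number $\mathbf{q}$. The symmetries of the dynamical matrix $\tilde{D}(\mathbf{R})$ can be used to show that
$$ \begin{aligned} \tilde{D}(\mathbf{q}) &= \sum_{j} \tilde{D}( \mathbf{R}_i - \mathbf{R}_j ) \exp[ \, - i \, \mathbf{q} \cdot ( \mathbf{R}_i - \mathbf{R}_j ) \, ] \\ &= \frac{1}{2} \sum_{\mathbf{R}} \tilde{D}(\mathbf{R}) \left( \exp[ \, - i \, \mathbf{q} \cdot \mathbf{R} \, ] + \exp[ \, + i \, \mathbf{q} \cdot \mathbf{R} \, ] - 2 \right) \\ &= - 2 \sum_{\mathbf{R}} \tilde{D}(\mathbf{R}) \sin^2 \frac{\mathbf{q} \cdot \mathbf{R}}{2} \end{aligned} \tag{1602} $$
where the term $- 2$ in the second line has been added using the sum rule (1589).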
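As a concrete check of Eq. (1602), consider the simplest possible case: a one-dimensional chain with a nearest-neighbour force constant. The sketch below is an illustration and not part of the original notes; the values `GAMMA = 1.0` and `A = 1.0`, and the function names, are assumptions made for the example. The real-space dynamical matrix follows from Eq. (1585) as $D(0) = 2 \gamma$ and $D(\pm a) = - \gamma$, which satisfies the sum rule (1589).

```python
import cmath
import math

GAMMA = 1.0  # nearest-neighbour force constant (illustrative value)
A = 1.0      # lattice spacing (illustrative value)

# Real-space dynamical matrix of the chain: D(0) = 2*GAMMA, D(+-a) = -GAMMA.
D_REAL_SPACE = {0.0: 2.0 * GAMMA, A: -GAMMA, -A: -GAMMA}


def d_fourier(q):
    """Direct Fourier sum of Eq. (1601)."""
    return sum(d * cmath.exp(-1j * q * r) for r, d in D_REAL_SPACE.items())


def d_sin_squared(q):
    """The sin^2 form on the last line of Eq. (1602)."""
    return -2.0 * sum(d * math.sin(0.5 * q * r) ** 2 for r, d in D_REAL_SPACE.items())


for q in (0.0, 0.3, 1.0, math.pi):
    print(q, d_fourier(q).real, d_sin_squared(q))
```

Both forms reduce to $4 \gamma \sin^2 ( q a / 2 )$, which is real, non-negative and vanishes at $q = 0$.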
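Thus, $\tilde{D}(\mathbf{q})$ is a real symmetric matrix. Every real $3 \times 3$ symmetric matrix has three real eigenvalues; thus, one may find three eigenvectors
$$ \tilde{D}(\mathbf{q}) \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) = M \omega_\alpha^2(\mathbf{q}) \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \tag{1603} $$
where $\boldsymbol{\epsilon}_\alpha(\mathbf{q})$ are the eigenvectors and the eigenvalues have been written as $M \omega_\alpha^2(\mathbf{q})$. It is necessary that the eigenvalues are positive for the lattice to be stable. The eigenvectors are usually normalized, such that
$$ \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \cdot \boldsymbol{\epsilon}_\beta(\mathbf{q}) = \delta_{\alpha,\beta} \tag{1604} $$
Since the eigenvectors form a complete orthonormal set, one may expand the displacements and the momentum operators in terms of the eigenvectors as
$$ \mathbf{u}_{\mathbf{q}} = \sum_\alpha Q_{\mathbf{q}}^\alpha \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \tag{1605} $$
and
$$ \hat{\mathbf{p}}_{\mathbf{q}} = \sum_\alpha \hat{P}_{\mathbf{q}}^\alpha \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \tag{1606} $$
This transformation diagonalizes the Hamiltonian in terms of the polarization index $\alpha$. This can be seen as
$$ \begin{aligned} \mathbf{u}^\dagger_{\mathbf{q}} \, \tilde{D}(\mathbf{q}) \, \mathbf{u}_{\mathbf{q}} &= \sum_{\alpha,\beta} Q_{\mathbf{q}}^{\dagger \beta} \, \boldsymbol{\epsilon}_\beta(\mathbf{q}) \, \tilde{D}(\mathbf{q}) \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \, Q_{\mathbf{q}}^\alpha \\ &= \sum_{\alpha,\beta} M \omega_\alpha^2(\mathbf{q}) \, Q_{\mathbf{q}}^{\dagger \beta} \, \boldsymbol{\epsilon}_\beta(\mathbf{q}) \cdot \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \, Q_{\mathbf{q}}^\alpha \\ &= \sum_\alpha M \omega_\alpha^2(\mathbf{q}) \, Q_{\mathbf{q}}^{\dagger \alpha} \, Q_{\mathbf{q}}^\alpha \end{aligned} \tag{1607} $$
and also
$$ \hat{\mathbf{p}}^\dagger_{\mathbf{q}} \cdot \hat{\mathbf{p}}_{\mathbf{q}} = \sum_\alpha \hat{P}_{\mathbf{q}}^{\dagger \alpha} \, \hat{P}_{\mathbf{q}}^\alpha \tag{1608} $$
Hence, the Hamiltonian is diagonal in the polarization indices $\alpha$,
$$ \hat{H} = \sum_{\mathbf{q},\alpha} \left[ \frac{\hat{P}_{\mathbf{q}}^{\dagger \alpha} \, \hat{P}_{\mathbf{q}}^\alpha}{2 M} + \frac{M \omega_\alpha^2(\mathbf{q})}{2} \, Q_{\mathbf{q}}^{\dagger \alpha} \, Q_{\mathbf{q}}^\alpha \right] \tag{1609} $$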
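The Hamiltonian has the form of $3 N$ independent harmonic oscillators. The eigenvalues of the Hamiltonian can be found by introducing boson annihilation and creation operators. The annihilation operator is defined by
$$ a_{\mathbf{q},\alpha} = \sqrt{\frac{M \omega_\alpha(\mathbf{q})}{2 \hbar}} \; Q_{\mathbf{q}}^\alpha + i \, \sqrt{\frac{1}{2 \hbar M \omega_\alpha(\mathbf{q})}} \; P_{\mathbf{q}}^\alpha \tag{1610} $$
and the creation operator is the Hermitian conjugate of the annihilation operator
$$ a^\dagger_{\mathbf{q},\alpha} = \sqrt{\frac{M \omega_\alpha(\mathbf{q})}{2 \hbar}} \; Q_{\mathbf{q}}^{\dagger \alpha} - i \, \sqrt{\frac{1}{2 \hbar M \omega_\alpha(\mathbf{q})}} \; P_{\mathbf{q}}^{\dagger \alpha} \tag{1611} $$
The commutation relations for the boson operators can be calculated from the commutation relations of $P_{\mathbf{q}}^\alpha$ and $Q_{\mathbf{q}}^\alpha$, so
$$ [ \, a_{\mathbf{q},\alpha} \, , \, a^\dagger_{\mathbf{q}',\beta} \, ] = \Delta( \mathbf{q} - \mathbf{q}' ) \, \delta_{\alpha,\beta} \tag{1612} $$
These are the usual commutation relations for boson creation and annihilation operators. In second quantized form, the Hamiltonian is expressed as
$$ \hat{H} = \sum_{\mathbf{q},\alpha} \frac{\hbar \omega_\alpha(\mathbf{q})}{2} \left( a^\dagger_{\mathbf{q},\alpha} \, a_{\mathbf{q},\alpha} + a_{\mathbf{q},\alpha} \, a^\dagger_{\mathbf{q},\alpha} \right) \tag{1613} $$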
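Due to the commutation relations, one can show that the operator $a^\dagger_{\mathbf{q},\alpha}$, when acting on an energy eigenstate, produces an energy eigenstate in which the energy eigenvalue is increased by an amount $\hbar \omega_\alpha(\mathbf{q})$. Likewise, the annihilation operator acting on an energy eigenstate produces an energy eigenstate with an eigenvalue which is lower by $\hbar \omega_\alpha(\mathbf{q})$. If one assumes the existence of a ground state of the oscillator $| 0_{\mathbf{q},\alpha} \rangle$ such that
$$ a_{\mathbf{q},\alpha} \, | 0_{\mathbf{q},\alpha} \rangle = 0 \tag{1614} $$
then the energy eigenvalue of the ground state is just $\frac{1}{2} \hbar \omega_\alpha(\mathbf{q})$. The energies of the excited states, which are generated by repeated application of the raising operator, are found to be $( n_{\mathbf{q},\alpha} + \frac{1}{2} ) \hbar \omega_\alpha(\mathbf{q})$, where $n_{\mathbf{q},\alpha}$ is a positive integer. Thus, the Hamiltonian may be expressed as
$$ \hat{H} = \sum_{\mathbf{q},\alpha} \hbar \omega_\alpha(\mathbf{q}) \left( n_{\mathbf{q},\alpha} + \frac{1}{2} \right) \tag{1615} $$
This implies that each normal mode $(\mathbf{q}, \alpha)$ has quantized excitations. These quantized lattice vibrations are known as phonons. The completeness relation for the polarization vectors is just
$$ \sum_\alpha \epsilon_{\alpha,\mu}(\mathbf{q}) \, \epsilon_{\alpha,\nu}(\mathbf{q}) = \delta_{\mu,\nu} \tag{1616} $$
The original displacements and momenta can be expressed in terms of the phonon creation and annihilation operators via
$$ \mathbf{u}_i = \frac{1}{\sqrt{N}} \sum_{\mathbf{q},\alpha} \sqrt{\frac{\hbar}{2 M \omega_\alpha(\mathbf{q})}} \; \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \left( a_{\mathbf{q},\alpha} + a^\dagger_{-\mathbf{q},\alpha} \right) \exp[ \, i \, \mathbf{q} \cdot \mathbf{R}_i \, ] \tag{1617} $$
and
$$ \hat{\mathbf{P}}_i = \frac{i}{\sqrt{N}} \sum_{\mathbf{q},\alpha} \sqrt{\frac{\hbar M \omega_\alpha(\mathbf{q})}{2}} \; \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \left( a^\dagger_{-\mathbf{q},\alpha} - a_{\mathbf{q},\alpha} \right) \exp[ \, i \, \mathbf{q} \cdot \mathbf{R}_i \, ] \tag{1618} $$
In the Heisenberg representation, the displacements and momentum operators become time dependent. The time dependence occurs through factors of
$$ \exp[ \, \pm \, i \, \omega_\alpha(\mathbf{q}) \, t \, ] \tag{1619} $$
appearing along with the phonon creation and annihilation operators.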
Excited lattice vibration normal modes that resemble classical waves, in that they have a definite displacement and definite phase, cannot be energy eigenstates of the Hamiltonian as they are not eigenstates of the number operator. The classical lattice vibrations are best described in terms of coherent states which are a superposition of states each containing large numbers of phonons. The time dependence contained in the exponential phase factors has the effect that the displacements associated with each excited normal mode oscillate periodically with time. In general, the time dependent factors have the effect that, if the eigenvalues of the dynamical matrix are such that ωα2 (q) > 0, the various expectation values of the displacements can simply be represented as a sum of periodic oscillations. On the other hand, if ωα2 (q) < 0 the displacements have unbounded exponential growth, the harmonic approximation fails and the lattice becomes unstable. The discussion will now be restricted to the case of stable lattice structures.
14.1
Lattice with a Basis
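When the lattice has a basis composed of $p$ atoms, the analysis of the phonon excitations is similar. The displacements are labelled by the lattice vector index $i$ and also by an index $l$ which labels the atoms in the basis. Also, the dynamical matrix becomes a $3 p \times 3 p$ matrix and there are $3 p$ normal modes labelled by $\alpha$. Thus, one has
$$ \mathbf{u}_i^l = \frac{1}{\sqrt{N}} \sum_{\mathbf{q}} \sum_\alpha Q_{\mathbf{q}}^\alpha \, \boldsymbol{\epsilon}_\alpha^l(\mathbf{q}) \exp[ \, i \, \mathbf{q} \cdot \mathbf{R}_i \, ] \tag{1620} $$
The polarization vectors $\boldsymbol{\epsilon}_\alpha^l(\mathbf{q})$ satisfy the generalized ortho-normality condition
$$ \sum_{l=1}^{l=p} \boldsymbol{\epsilon}_\beta^l(\mathbf{q})^* \cdot \boldsymbol{\epsilon}_\alpha^l(\mathbf{q}) \, M_l = \delta_{\alpha,\beta} \tag{1621} $$
where $M_l$ is the mass of the $l$-th atom in the basis.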
14.2
A Sum Rule for the Dispersion Relations
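The eigenvalue equation
$$ \tilde{D}(\mathbf{q}) \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) = M \omega_\alpha^2(\mathbf{q}) \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \tag{1622} $$
can be related to the Fourier transform of the pair potential. The Fourier transform of the dynamical matrix is given by
$$ \tilde{D}(\mathbf{q}) = \sum_{\mathbf{R}} \tilde{D}(\mathbf{R}) \exp[ \, - i \, \mathbf{q} \cdot \mathbf{R} \, ] \tag{1623} $$
Hence, one has
$$ \sum_{\mathbf{R}} \tilde{D}(\mathbf{R}) \exp[ \, - i \, \mathbf{q} \cdot \mathbf{R} \, ] \; \boldsymbol{\epsilon}_\alpha(\mathbf{q}) = M \omega_\alpha^2(\mathbf{q}) \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \tag{1624} $$
The dynamical matrix is given in terms of the pair potential
$$ D_\mu^\nu( \mathbf{R}_i - \mathbf{R}_j ) = \delta_{\mathbf{R}_i , \mathbf{R}_j} \sum_{\mathbf{R}''} \nabla_\mu \nabla^\nu \Theta( \mathbf{R}_i - \mathbf{R}'' ) - \nabla_\mu \nabla^\nu \Theta( \mathbf{R}_i - \mathbf{R}_j ) \tag{1625} $$
and the Fourier transform of the pair potential is given by
$$ \Theta(\mathbf{R}) = \sum_{\mathbf{k}} \exp[ \, i \, \mathbf{k} \cdot \mathbf{R} \, ] \; \Theta(\mathbf{k}) \tag{1626} $$
On substituting the expression for the Fourier transform of the pair potential into the expression for the Fourier transform of the dynamical matrix, one obtains
$$ \begin{aligned} D_\mu^\nu(\mathbf{q}) &= \sum_{\mathbf{R}_j} \sum_{\mathbf{k}} k_\mu k^\nu \, \Theta(\mathbf{k}) \exp[ \, i \, ( \mathbf{k} - \mathbf{q} ) \cdot ( \mathbf{R}_i - \mathbf{R}_j ) \, ] \\ &\quad - \sum_{\mathbf{R}''} \sum_{\mathbf{k}} k_\mu k^\nu \, \Theta(\mathbf{k}) \exp[ \, i \, \mathbf{k} \cdot ( \mathbf{R}_i - \mathbf{R}'' ) \, ] \end{aligned} \tag{1627} $$
Thus, the phonon frequencies satisfy the eigenvalue equation
$$ \begin{aligned} M \omega_\alpha^2(\mathbf{q}) \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) &= \sum_j \sum_{\mathbf{k}} \mathbf{k} \, \Theta(\mathbf{k}) \exp[ \, i \, ( \mathbf{k} - \mathbf{q} ) \cdot ( \mathbf{R}_i - \mathbf{R}_j ) \, ] \; \mathbf{k} \cdot \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \\ &\quad - \sum_j \sum_{\mathbf{k}} \mathbf{k} \, \Theta(\mathbf{k}) \exp[ \, i \, \mathbf{k} \cdot ( \mathbf{R}_i - \mathbf{R}_j ) \, ] \; \mathbf{k} \cdot \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \end{aligned} \tag{1628} $$
On making use of conservation of momentum modulo the reciprocal lattice vectors $\mathbf{Q}$, one has
$$ \begin{aligned} M \omega_\alpha^2(\mathbf{q}) \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) &= N \sum_{\mathbf{Q}} ( \mathbf{q} + \mathbf{Q} ) \, \Theta( \mathbf{q} + \mathbf{Q} ) \; ( \mathbf{q} + \mathbf{Q} ) \cdot \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \\ &\quad - N \sum_{\mathbf{Q}} \mathbf{Q} \, \Theta(\mathbf{Q}) \; ( \mathbf{Q} \cdot \boldsymbol{\epsilon}_\alpha(\mathbf{q}) ) \end{aligned} \tag{1629} $$
It can be seen that the transverse modes only exist because of the periodicity of the lattice. That is, if only the $\mathbf{Q} = 0$ reciprocal lattice vector were included in the sum, the eigenvectors would be longitudinal, as $\boldsymbol{\epsilon}(\mathbf{q})$ would be parallel to $\mathbf{q}$. In this case,
$$ M \omega_\alpha^2(\mathbf{q}) \, \boldsymbol{\epsilon}_\alpha(\mathbf{q}) = N \, \mathbf{q} \, \Theta(\mathbf{q}) \; \mathbf{q} \cdot \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \tag{1630} $$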
which are the longitudinal sound modes. Furthermore, by letting $\mathbf{q} \to 0$ one finds that the phonon frequencies $M \omega_\alpha^2(0)$ must vanish in this limit. This provides an example of the Goldstone mode, which occurs because the continuous translational symmetry of the Hamiltonian is spontaneously broken at the phase transition when the solid is formed. Goldstone's theorem may be roughly stated as "When a continuous symmetry of the Hamiltonian is spontaneously broken in a phase transition, a continuous branch of normal modes appears which extends to zero energy and dynamically restores the broken symmetry". Goldstone's theorem also depends on the condition that long range forces are not present. If long range forces are present, the symmetry restoring mode may acquire a finite frequency at $q = 0$ through the Kibble-Higgs mechanism. In this case the resulting modes are called Higgs bosons.

The Sum Rule. The sum rule is obtained by using the orthogonality relations for the polarization vectors. On taking the scalar product of the eigenvalue equation for the dispersion relation with an eigenvector and summing over the polarizations, one obtains
$$ \begin{aligned} \sum_\alpha M \omega_\alpha^2(\mathbf{q}) &= N \sum_\alpha \sum_{\mathbf{Q}} \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \cdot ( \mathbf{q} + \mathbf{Q} ) \; \Theta( \mathbf{q} + \mathbf{Q} ) \; ( \mathbf{q} + \mathbf{Q} ) \cdot \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \\ &\quad - N \sum_\alpha \sum_{\mathbf{Q}} \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \cdot \mathbf{Q} \; \Theta(\mathbf{Q}) \; \mathbf{Q} \cdot \boldsymbol{\epsilon}_\alpha(\mathbf{q}) \end{aligned} \tag{1631} $$
Utilizing
$$ \sum_\alpha \epsilon_{\alpha,\mu}(\mathbf{q}) \, \epsilon_{\alpha,\nu}(\mathbf{q}) = \delta_{\mu,\nu} \tag{1632} $$
one finds the sum rule
$$ \sum_\alpha M \omega_\alpha^2(\mathbf{q}) = N \sum_{\mathbf{Q}} \Theta( \mathbf{q} + \mathbf{Q} ) \, ( \mathbf{q} + \mathbf{Q} )^2 - N \sum_{\mathbf{Q}} \Theta(\mathbf{Q}) \, ( \mathbf{Q} )^2 \tag{1633} $$
for the phonon modes.
——————————————————————————————————
14.2.1
Exercise 60
Show that if the pair potential is approximated by the non-screened Coulomb potential between the charged nuclei, one finds
$$ \sum_\alpha \omega_\alpha^2(\mathbf{q}) = \frac{4 \pi N Z^2 e^2}{M V} \tag{1634} $$
which defines the plasmon frequency for the ions. In the long wave length limit, one has the longitudinal plasmon mode and at most two transverse modes depending on the presence of long-ranged order. As the longitudinal plasmon mode saturates the sum rule at q = 0, the transverse modes if they exist must be acoustic. ——————————————————————————————————
14.3
The Nature of the Phonon Modes
The long wavelength form of the dynamical matrix can easily be calculated from
$$ \tilde{D}(\mathbf{q}) = - 2 \sum_{\mathbf{R}} \tilde{D}(\mathbf{R}) \sin^2 \frac{\mathbf{q} \cdot \mathbf{R}}{2} \tag{1635} $$
as
$$ \tilde{D}(\mathbf{q}) = - \frac{1}{2} \sum_{\mathbf{R}} \tilde{D}(\mathbf{R}) \left( \mathbf{q} \cdot \mathbf{R} \right)^2 \tag{1636} $$
This implies that $\omega_\alpha(\mathbf{q})^2 \propto q^2$ as $q \to 0$, which gives rise to acoustic modes. These acoustic modes include the two Goldstone modes as well as the longitudinal sound mode, which can be considered as a density fluctuation similar to the sound waves found in fluids. If the crystal is anisotropic, the frequency or sound velocity may depend on the direction of propagation. In an isotropic solid, there should be one longitudinal polarization ($\boldsymbol{\epsilon} \parallel \mathbf{q}$) and two transverse polarizations ($\boldsymbol{\epsilon} \perp \mathbf{q}$). In an anisotropic solid the relation between $\boldsymbol{\epsilon}$ and $\mathbf{q}$ is not so simple, except at high symmetry points. However, because the polarization vectors are continuous functions of $\mathbf{q}$, one may still use the terminology of longitudinal and transverse polarizations in the vicinity of the high symmetry points. The solid has $3 N p$ degrees of freedom, $3 N$ of which are tied up in the acoustic phonon branches. The other $3 ( p - 1 ) N$ modes appear as optic branches.
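The phonon density of states is given by an integral over the first Brillouin zone, and a summation over the polarization index $\alpha$
$$ \rho(\omega) = V \sum_\alpha \int \frac{d^3 q}{( 2 \pi )^3} \; \delta( \omega - \omega_\alpha(\mathbf{q}) ) \tag{1637} $$
This can be written as an integral over a surface of constant $\omega_\alpha(\mathbf{q})$. This surface is denoted by $S_\alpha(\omega)$ and consists of the points $\omega = \omega_\alpha(\mathbf{q})$ where $\mathbf{q}$ is in the first Brillouin zone. This yields
$$ \rho(\omega) = V \sum_\alpha \int_{S_\alpha(\omega)} \frac{d^2 S}{( 2 \pi )^3} \; \frac{1}{| \nabla \omega_\alpha(\mathbf{q}) |} \tag{1638} $$
The van Hove singularities occur when the group velocity vanishes,
$$ \nabla \omega_\alpha(\mathbf{q}) = 0 \tag{1639} $$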
The van Hove singularities are usually integrable in three dimensions, but still give rise to anomalous slopes or discontinuities in the derivatives of ρ(ω). An example of the van Hove singularities in the phonon modes is given by Al which has two transverse and one longitudinal contribution to the density of states. ——————————————————————————————————
14.3.1
Exercise 61
Consider a one-dimensional linear chain, with a unit cell composed of two atoms, one with mass M1 and the other with mass M2 . The atoms interact with their nearest neighbors via a harmonic force, with force constant γ. Find the phonon dispersion relation. ——————————————————————————————————
14.3.2
Exercise 62
Consider a one-dimensional line of ions, with equal mass but alternating charges, such that the charge on the $p$-th ion is
$$ e_p = e \, ( - 1 )^p \tag{1640} $$
Assume that the inter-atomic potential has two contributions:
(A) A short ranged force between nearest neighbors with a force constant $C_1 = \gamma$.
(B) A Coulomb interaction between all the ions
$$ C_p = 2 \, ( - 1 )^p \, \frac{e^2}{p^3 a^3} $$
where $a$ is the atomic spacing.
(i) Show that
$$ \frac{\omega(q)^2}{\omega_0^2} = \sin^2 \frac{q a}{2} + \sigma \sum_{p=1}^{\infty} \frac{( - 1 )^p}{p^3} \left( 1 - \cos q p a \right) \tag{1641} $$
where $\omega_0^2 = 4 \frac{\gamma}{M}$ and $\sigma = \frac{e^2}{\gamma a^3}$.
(ii) Show that $\omega^2(q)$ becomes soft, $\omega^2(q) = 0$, at $q = \frac{\pi}{a}$ if $\sigma > \frac{4}{7 \zeta(3)}$.
(iii) Show that the speed of sound becomes imaginary if $\sigma > \frac{1}{2 \ln 2}$.
Thus, $\omega^2$ goes to zero and the lattice becomes unstable for $q a$ in the interval $(0, \pi)$ if $\sigma$ lies in the range $0.475 < \sigma < 0.721$.
——————————————————————————————————
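The two numerical thresholds quoted in this exercise can be reproduced with a few lines of code. The sketch below is only an illustration and not part of the original notes: the function name, the truncation of the sums at a finite number of terms and the argument convention (`qa` stands for the product $q a$) are assumptions made here.

```python
import math


def omega_sq_ratio(qa, sigma, n_terms=20000):
    """omega(q)^2 / omega_0^2 from Eq. (1641), with the sum over p truncated."""
    coulomb = sum((-1) ** p / p ** 3 * (1.0 - math.cos(qa * p))
                  for p in range(1, n_terms + 1))
    return math.sin(0.5 * qa) ** 2 + sigma * coulomb


# Riemann zeta(3) from its defining series (truncated).
zeta3 = sum(1.0 / p ** 3 for p in range(1, 200001))

sigma_zone_boundary = 4.0 / (7.0 * zeta3)    # threshold of part (ii)
sigma_sound = 1.0 / (2.0 * math.log(2.0))    # threshold of part (iii)
print(sigma_zone_boundary, sigma_sound)
print(omega_sq_ratio(math.pi, sigma_zone_boundary))
```

The first line of output gives the lower and upper ends of the range of $\sigma$ quoted in the text, and the second confirms that $\omega^2$ vanishes at the zone boundary when $\sigma$ equals the lower threshold.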
14.3.3
Exercise 63
Consider a two-dimensional square lattice, with a mono-atomic basis. The atoms have mass $M$ and interact with their nearest neighbors and next nearest neighbors through harmonic forces of strength $\gamma_1$ and $\gamma_2$, respectively. Calculate the frequencies of the longitudinal and transverse phonons at $\mathbf{q} = \left( \frac{\pi}{a} \right) (1, 0)$. ——————————————————————————————————
14.3.4
Exercise 64
(i) Show that the linear chain with nearest neighbor (harmonic) interactions has a dispersion relation
$$ \omega(q) = \omega_0 \left| \sin \frac{q a}{2} \right| \tag{1642} $$
and that the density of states is given by
$$ \rho(\omega) = \frac{2}{\pi a} \, \frac{1}{\sqrt{\omega_0^2 - \omega^2}} \tag{1643} $$
which has a van Hove singularity at $\omega = \omega_0$.
(ii) Show that in three dimensions the van Hove singularities near a maximum of $\omega_\alpha(\mathbf{q})$ give rise to a term in the density of states that varies as
$$ \rho(\omega) \propto \sqrt{\omega_0^2 - \omega^2} \tag{1644} $$
and, thus, has a singularity in the first derivative of the density of states with respect to $\omega$.
——————————————————————————————————
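The one-dimensional density of states in part (i) can be checked by brute-force mode counting. The sketch below is an illustration and not part of the original notes: the function names, the unit value of $\omega_0$ and the grid of wave vectors are assumptions. It compares the fraction of allowed wave vectors with $\omega(q) \le w$ against $a \int_0^w \rho(\omega') \, d\omega' = ( 2 / \pi ) \arcsin ( w / \omega_0 )$, which follows from integrating Eq. (1643).

```python
import math

OMEGA_0 = 1.0  # maximum frequency (illustrative units)


def fraction_below(w, n_q=200000):
    """Fraction of n_q equally spaced wave vectors in the zone with omega(q) <= w."""
    count = 0
    for n in range(n_q):
        qa = -math.pi + 2.0 * math.pi * (n + 0.5) / n_q  # midpoint grid in q*a
        if OMEGA_0 * abs(math.sin(0.5 * qa)) <= w:
            count += 1
    return count / n_q


def integrated_dos_times_a(w):
    """a * integral_0^w rho(omega') d omega', with rho taken from Eq. (1643)."""
    return (2.0 / math.pi) * math.asin(w / OMEGA_0)


for w in (0.3, 0.7, 0.95):
    print(w, fraction_below(w), integrated_dos_times_a(w))
```

The two columns agree to within the resolution of the grid, and the rapid growth of the count as $w \to \omega_0$ reflects the inverse square-root van Hove singularity.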
14.3.5
Exercise 65
(i) Show that if the wave vector $\mathbf{q}$ lies along a 3-, 4- or 6-fold axis, then one normal mode is polarized along $\mathbf{q}$ and the other two modes are degenerate and polarized perpendicular to $\mathbf{q}$. (ii) Show that if $\mathbf{q}$ lies in a plane of mirror symmetry, then one mode has a polarization perpendicular to $\mathbf{q}$ and the plane, and the other two modes have polarizations within the plane. (iii) Show that if $\mathbf{q}$ lies on a Bragg plane that is parallel to a plane of mirror symmetry, then one mode is polarized perpendicular to the Bragg plane, while the other two modes have polarizations lying within the plane. ——————————————————————————————————
14.3.6
Exercise 66
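Consider an f.c.c. mono-atomic Bravais lattice in which the atoms interact via a nearest neighbor pair potential $\Theta$.
(i) Show that the frequencies of the phonon modes are given by the eigenvalues of a $3 \times 3$ matrix given by
$$ \tilde{D}(\mathbf{q}) = \sum_{\mathbf{R}} \sin^2 \frac{\mathbf{q} \cdot \mathbf{R}}{2} \left[ A \, \tilde{I} + B \, \hat{R} \hat{R} \right] \tag{1645} $$
where the sum over $\mathbf{R}$ runs over the 12 lattice sites closest to the site $\mathbf{R} = 0$, and the constants $A$ and $B$ are given in terms of the pair potential and its derivatives at the nearest neighbor separation $d = \frac{a}{\sqrt{2}}$ via
$$ A = 2 \, \Theta'(d) / d \tag{1646} $$
and
$$ B = 2 \left[ \Theta''(d) - \Theta'(d) / d \right] \tag{1647} $$
(ii) Show that when $\mathbf{q} = (q, 0, 0)$ the longitudinal and transverse acoustic phonon frequencies are given by
$$ \omega_l(q) = \sqrt{\frac{8 A + 4 B}{M}} \; \sin \frac{q a}{4} \tag{1648} $$
and
$$ \omega_t(q) = \sqrt{\frac{8 A + 2 B}{M}} \; \sin \frac{q a}{4} \tag{1649} $$
(iii) Find the frequencies when $\mathbf{q} = \frac{q}{\sqrt{3}} (1, 1, 1)$.
(iv) Show that when $\mathbf{q} = \frac{q}{\sqrt{2}} (1, 1, 0)$ the degeneracy between the transverse modes is lifted and the frequencies are given by
$$ \omega_l(q) = \sqrt{ \frac{8 A + 2 B}{M} \sin^2 \frac{q a}{4 \sqrt{2}} + \frac{2 A + 2 B}{M} \sin^2 \frac{q a}{2 \sqrt{2}} } \tag{1650} $$
and the two transverse modes are
$$ \omega_{t1}(q) = \sqrt{ \frac{8 A + 4 B}{M} \sin^2 \frac{q a}{4 \sqrt{2}} + \frac{2 A}{M} \sin^2 \frac{q a}{2 \sqrt{2}} } \tag{1651} $$
and
$$ \omega_{t2}(q) = \sqrt{ \frac{8 A + 2 B}{M} \sin^2 \frac{q a}{4 \sqrt{2}} + \frac{2 A}{M} \sin^2 \frac{q a}{2 \sqrt{2}} } \tag{1652} $$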
——————————————————————————————————
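The results quoted in part (ii) of Exercise 66 can be checked by assembling the $3 \times 3$ matrix numerically. The sketch below is an illustration and not part of the original notes: it assumes the matrix is $\sum_{\mathbf{R}} \sin^2 ( \mathbf{q} \cdot \mathbf{R} / 2 ) [ A \tilde{I} + B \hat{R} \hat{R} ]$ with eigenvalues $M \omega^2$, and the function name and the values chosen for $A$, $B$, $a$ and $q$ are hypothetical.

```python
import math


def fcc_dynamical_matrix(q, a_lat, coeff_a, coeff_b):
    """3x3 matrix sum_R sin^2(q.R/2) [A I + B Rhat Rhat] over the 12 nearest neighbours."""
    half = 0.5 * a_lat
    neighbours = []
    for s1 in (1, -1):
        for s2 in (1, -1):
            neighbours.append((s1 * half, s2 * half, 0.0))
            neighbours.append((s1 * half, 0.0, s2 * half))
            neighbours.append((0.0, s1 * half, s2 * half))
    d_sq = 0.5 * a_lat * a_lat  # squared nearest-neighbour distance, d = a / sqrt(2)
    matrix = [[0.0] * 3 for _ in range(3)]
    for r in neighbours:
        weight = math.sin(0.5 * sum(qi * ri for qi, ri in zip(q, r))) ** 2
        for mu in range(3):
            for nu in range(3):
                identity = 1.0 if mu == nu else 0.0
                matrix[mu][nu] += weight * (coeff_a * identity
                                            + coeff_b * r[mu] * r[nu] / d_sq)
    return matrix


# Hypothetical parameter values, chosen only to exercise the formulae.
A_COEFF, B_COEFF, A_LAT, Q = 1.0, 3.0, 1.0, 0.7
d_matrix = fcc_dynamical_matrix((Q, 0.0, 0.0), A_LAT, A_COEFF, B_COEFF)
for row in d_matrix:
    print(row)
```

For $\mathbf{q}$ along the cube axis the matrix comes out diagonal, with the $xx$ entry equal to $( 8 A + 4 B ) \sin^2 ( q a / 4 )$ and the $yy$ and $zz$ entries equal to $( 8 A + 2 B ) \sin^2 ( q a / 4 )$, in agreement with Eqs. (1648) and (1649).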
14.3.7
Exercise 67
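Consider a phonon with wave vector along the axis of a cubic crystal. Let the sum in
$$ \tilde{D}(\mathbf{q}) = \sum_{\mathbf{R}} \tilde{D}(\mathbf{R}) \sin^2 \frac{\mathbf{q} \cdot \mathbf{R}}{2} \tag{1653} $$
be restricted to the sites in two planes perpendicular to $\mathbf{q}$, separated by a distance $\hat{q} \cdot \mathbf{R}$. In metals there exists a long-ranged interaction between the planes
$$ \begin{aligned} \tilde{D}(\mathbf{q}) &= \sum_{\mathbf{R}} \tilde{D}(\mathbf{R}) \sin^2 \frac{\mathbf{q} \cdot \mathbf{R}}{2} \\ &= A \, \frac{\sin 2 k_F \, \hat{q} \cdot \mathbf{R}}{2 k_F \, \hat{q} \cdot \mathbf{R}} \end{aligned} \tag{1654} $$
where $A$ is a constant.
(i) Find an expression for $\omega^2(q)$ and $\frac{\partial \omega^2(q)}{\partial q}$.
(ii) Show that $\frac{\partial \omega^2(q)}{\partial q}$ is infinite at $q = 2 k_F$. The kink in the dispersion relation at the Fermi-wave vector is the Kohn anomaly.
——————————————————————————————————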
14.4
Thermodynamics
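A harmonic lattice has an energy given by
$$ E_{harmonic} = E_{cl} + \sum_{\mathbf{q},\alpha} \hbar \omega_\alpha(\mathbf{q}) \left( n_{\mathbf{q},\alpha} + \frac{1}{2} \right) \tag{1655} $$
where $E_{cl}$ is the ground state energy of the lattice in the classical approximation. The free energy $F$ is defined in terms of the partition function as
$$ \begin{aligned} Z = \exp[ - \beta F ] &= \exp[ - \beta E_{cl} ] \prod_{\mathbf{q},\alpha} \left( \sum_{n_{\mathbf{q},\alpha}=0}^{\infty} \exp\left[ - \beta \hbar \omega_\alpha(\mathbf{q}) \left( n_{\mathbf{q},\alpha} + \frac{1}{2} \right) \right] \right) \\ &= \exp[ - \beta E_{cl} ] \prod_{\mathbf{q},\alpha} \frac{\exp[ - \frac{1}{2} \beta \hbar \omega_\alpha(\mathbf{q}) ]}{1 - \exp[ - \beta \hbar \omega_\alpha(\mathbf{q}) ]} \end{aligned} \tag{1656} $$
Thus, the free energy is given by
$$ F = E_{cl} + \sum_{\mathbf{q},\alpha} \frac{\hbar \omega_\alpha(\mathbf{q})}{2} + k_B T \sum_{\mathbf{q},\alpha} \ln\left( 1 - \exp[ - \beta \hbar \omega_\alpha(\mathbf{q}) ] \right) \tag{1657} $$
The pressure $P$ is found from the thermodynamic relation
$$ \begin{aligned} dF &= dE - T \, dS - S \, dT \\ &= - S \, dT - P \, dV \end{aligned} \tag{1658} $$
Hence, the pressure is given by
$$ \begin{aligned} P &= - \left( \frac{\partial F}{\partial V} \right)_{T,N} \\ &= - \left( \frac{\partial E_{cl}}{\partial V} \right)_{T,N} - \frac{1}{2} \sum_{\mathbf{q},\alpha} \left( \frac{\partial \hbar \omega_\alpha(\mathbf{q})}{\partial V} \right)_{T,N} - \sum_{\mathbf{q},\alpha} \left( \frac{\partial \hbar \omega_\alpha(\mathbf{q})}{\partial V} \right)_{T,N} \frac{1}{\exp[ \beta \hbar \omega_\alpha(\mathbf{q}) ] - 1} \end{aligned} \tag{1659} $$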
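The first two terms are temperature independent, and the last term depends on temperature through the average phonon occupation numbers. The pressure is only temperature dependent if the phonon frequencies depend on the volume $V$. The thermal volume expansion coefficient $\alpha$ is defined by
$$ \alpha = \frac{1}{V} \left( \frac{\partial V}{\partial T} \right)_P \tag{1660} $$
The equation of state is a relation between pressure, temperature and volume
$$ P = P(T, V) \tag{1661} $$
Thus, the infinitesimal changes are related by
$$ dP = \left( \frac{\partial P}{\partial T} \right)_V dT + \left( \frac{\partial P}{\partial V} \right)_T dV \tag{1662} $$
For a process at constant $P$, $dP = 0$, thus
$$ \left( \frac{\partial V}{\partial T} \right)_P = - \frac{ \left( \frac{\partial P}{\partial T} \right)_V }{ \left( \frac{\partial P}{\partial V} \right)_T } \tag{1663} $$
The bulk modulus $B$, defined by
$$ B = - V \left( \frac{\partial P}{\partial V} \right)_T \tag{1664} $$
is finite as $E_{cl}$ is expected to be volume-dependent. Hence, the denominator is finite. The thermal expansion coefficient is non-zero only if $\left( \frac{\partial P}{\partial T} \right)_V \neq 0$. Within the harmonic approximation, the frequencies must be functions of the volume $V$ if the solid is to undergo thermal expansion. The specific heat at constant pressure is different from the specific heat at constant volume. The relation between them is found by relating the temperature derivative of the entropy at constant volume to the temperature derivative of the entropy at constant pressure. To this end, consider the infinitesimal change in entropy, expressed in terms of either the change in volume or the change in pressure
$$ \begin{aligned} dS &= \left( \frac{\partial S}{\partial T} \right)_V dT + \left( \frac{\partial S}{\partial V} \right)_T dV \\ &= \left( \frac{\partial S}{\partial T} \right)_P dT + \left( \frac{\partial S}{\partial P} \right)_T dP \end{aligned} \tag{1665} $$
Using the equation of state relating $P$ and $V$
$$ V = V(T, P) \tag{1666} $$
one finds
$$ dV = \left( \frac{\partial V}{\partial T} \right)_P dT + \left( \frac{\partial V}{\partial P} \right)_T dP \tag{1667} $$
Thus, combining the expression for $dS$ in terms of $dV$ and $dT$ with the equation for $dV$ results in the expression
$$ dS = \left[ \left( \frac{\partial S}{\partial T} \right)_V + \left( \frac{\partial S}{\partial V} \right)_T \left( \frac{\partial V}{\partial T} \right)_P \right] dT + \left( \frac{\partial S}{\partial V} \right)_T \left( \frac{\partial V}{\partial P} \right)_T dP \tag{1668} $$
Thus, one has the relation
$$ \left( \frac{\partial S}{\partial T} \right)_P = \left( \frac{\partial S}{\partial T} \right)_V + \left( \frac{\partial S}{\partial V} \right)_T \left( \frac{\partial V}{\partial T} \right)_P \tag{1669} $$
A Maxwell relation can be used to eliminate $\left( \frac{\partial S}{\partial V} \right)_T$. The Maxwell relation comes from the analyticity condition on a thermodynamic function with independent variables $(T, V)$, which is $F(T, V)$. Hence, one has
$$ \left( \frac{\partial S}{\partial V} \right)_T = \left( \frac{\partial P}{\partial T} \right)_V \tag{1670} $$
and, thus,
$$ \begin{aligned} T \left( \frac{\partial S}{\partial T} \right)_P &= T \left( \frac{\partial S}{\partial T} \right)_V + T \left( \frac{\partial P}{\partial T} \right)_V \left( \frac{\partial V}{\partial T} \right)_P \\ C_P &= C_V + T \left( \frac{\partial P}{\partial T} \right)_V \left( \frac{\partial V}{\partial T} \right)_P \\ C_P &= C_V - T \, \frac{ \left( \frac{\partial P}{\partial T} \right)_V^2 }{ \left( \frac{\partial P}{\partial V} \right)_T } \end{aligned} \tag{1671} $$
Thus, if there is to be a difference between $C_P$ and $C_V$ the phonon frequencies must be dependent on $V$.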
14.4.1
The Specific Heat
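The specific heat at constant volume can be found from the entropy of the phonon gas
$$ S = k_B \sum_{\mathbf{q},\alpha} \left[ \left( N(\omega_\alpha(\mathbf{q})) + 1 \right) \ln \left( N(\omega_\alpha(\mathbf{q})) + 1 \right) - N(\omega_\alpha(\mathbf{q})) \ln N(\omega_\alpha(\mathbf{q})) \right] \tag{1672} $$
The specific heat is given by
$$ \begin{aligned} C_V &= T \left( \frac{\partial S}{\partial T} \right)_V \\ &= k_B T \sum_{\mathbf{q},\alpha} \frac{\partial N(\omega_\alpha(\mathbf{q}))}{\partial T} \ln \left( \frac{N(\omega_\alpha(\mathbf{q})) + 1}{N(\omega_\alpha(\mathbf{q}))} \right) \\ &= \sum_{\mathbf{q},\alpha} \frac{\partial N(\omega_\alpha(\mathbf{q}))}{\partial T} \, \hbar \omega_\alpha(\mathbf{q}) \\ &= k_B \sum_{\mathbf{q},\alpha} \left( \beta \hbar \omega_\alpha(\mathbf{q}) \right)^2 N(\omega_\alpha(\mathbf{q})) \left( N(\omega_\alpha(\mathbf{q})) + 1 \right) \end{aligned} \tag{1673} $$
This can be expressed as an integral over the phonon density of states $\rho(\omega)$ via
$$ C_V = k_B \int_0^\infty d\omega \, \rho(\omega) \left( \beta \hbar \omega \right)^2 N(\omega) \left( N(\omega) + 1 \right) \tag{1674} $$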
Now some approximate models of the specific heat of the lattice vibrations shall be examined.
14.4.2
The Einstein Model of a Solid
The Einstein model of a solid considers the phonons to have a constant frequency $\omega_0$, and is an approximate representation of the optic phonons. The phonon density of states is given by
$$ \rho(\omega) = 3 N \, \delta( \omega - \omega_0 ) \tag{1675} $$
where there are 3 modes per atom. The specific heat is given by
$$ C_V = 3 N k_B \left( \beta \hbar \omega_0 \right)^2 N(\omega_0) \left( N(\omega_0) + 1 \right) \tag{1676} $$
This vanishes exponentially at low temperatures, $k_B T \ll \hbar \omega_0$, where
$$ N(\omega_0) \approx \exp[ - \beta \hbar \omega_0 ] \tag{1677} $$
and at high temperatures, $k_B T \gg \hbar \omega_0$,
$$ N(\omega_0) \approx \frac{k_B T}{\hbar \omega_0} \tag{1678} $$
so the specific heat saturates to yield the classical result
$$ \lim_{T \to \infty} C_V \to 3 N k_B \tag{1679} $$
The Einstein model of the specific heat fails to describe the lattice contribution to the low-temperature specific heat of a solid. This is because it fails to describe the low energy acoustic phonon excitations, which give rise to a power law temperature variation. The Debye model of a solid provides an approximate description of the low-temperature specific heat of a solid.
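The two limits of the Einstein specific heat can be verified directly from Eq. (1676). The following is a minimal sketch, not part of the original notes; the function name and the use of the reduced temperature $k_B T / \hbar \omega_0$ as the argument are choices made here for illustration.

```python
import math


def einstein_cv_per_3nkb(t_reduced):
    """C_V / (3 N k_B) from Eq. (1676); t_reduced = k_B T / (hbar omega_0)."""
    x = 1.0 / t_reduced          # x = beta * hbar * omega_0
    n = 1.0 / math.expm1(x)      # Bose-Einstein occupation number N(omega_0)
    return x * x * n * (n + 1.0)


high_t = einstein_cv_per_3nkb(100.0)   # k_B T >> hbar omega_0
low_t = einstein_cv_per_3nkb(0.05)     # k_B T << hbar omega_0
print(high_t, low_t, 400.0 * math.exp(-20.0))
```

At high temperature the result approaches one, the classical value of Eq. (1679), while at low temperature it follows $( \beta \hbar \omega_0 )^2 \exp[ - \beta \hbar \omega_0 ]$, the exponential suppression implied by Eq. (1677).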
14.4.3
The Debye Model of a Solid
The Debye model of a solid approximates the phonon density of states for the two transverse acoustic mode and longitudinal acoustic mode in a an isotropic solid. The dispersion relations of the phonon modes are represented by the two-fold degenerate transverse mode ωT (q) = vT q
(1680)
and the singly degenerate longitudinal mode ωL (q) = vL q
(1681)
The transverse sound velocity, in general, will be different from the longitudinal sound velocity, vL 6= vT . There are 3N such phonon modes in the first Brillouin zone. The phonon density of states is given by the integral over a surface area in the first Brillouin zone −1 X Z V dωα 2 ρ(ω) = d Sα (ω) (1682) ( 2 π )3 α dq If the Brillouin zone is approximated as a sphere of radius qD , then the density of states is given by −1 X V dωα 2 ρ(ω) = 4πq ( 2 π )3 α dq
(1683)
for q < qD . Using the form of the dispersion relation, the density of states can be re-written as ρ(ω) =
X ω2 V Θ( vα qD − ω ) ( 2 π 2 ) α vα3
(1684)
This can be further approximated by requiring that the upper limit on the frequency of all three phonon modes be set to the Debye frequency $\omega_D$. In this case the density of states is simply given by
$$ \rho_D(\omega) = \frac{V}{2 \pi^2} \sum_\alpha \frac{\omega^2}{v_\alpha^3} \, \Theta( \omega_D - \omega ) \tag{1685} $$
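The value of the Debye frequency is determined by the condition
$$ \int_0^{\omega_D} d\omega \, \rho_D(\omega) = 3 N = \frac{V}{6 \pi^2} \sum_\alpha \frac{\omega_D^3}{v_\alpha^3} \tag{1686} $$
Using this condition, the Debye density of states is written as
$$ \rho_D(\omega) = 9 N \, \frac{\omega^2}{\omega_D^3} \, \Theta( \omega_D - \omega ) \tag{1687} $$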
Thus, the Debye density of states varies as $\omega^2$ at low frequencies and has a cut off at the maximum frequency $\omega_D$. The temperature dependence of the specific heat of the Debye model is given by
$$ C_V = \frac{9 N k_B}{\omega_D^3} \int_0^{\omega_D} d\omega \, \omega^2 \left( \beta \hbar \omega \right)^2 N(\omega) \left[ N(\omega) + 1 \right] \tag{1688} $$
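The asymptotic low-temperature variation of the specific heat can be found by changing variable to $x = \beta \hbar \omega$. The specific heat can be written in the form
$$ C_V = 9 N k_B \left( \frac{k_B T}{\hbar \omega_D} \right)^3 \int_0^{x_D} dx \, x^4 \, \frac{\exp[ x ]}{\left( \exp[ x ] - 1 \right)^2} \tag{1689} $$
where the upper limit of integration is given by $x_D = \beta \hbar \omega_D$. At sufficiently low temperatures, $k_B T \ll \hbar \omega_D$, the upper limit may be set to infinity, yielding
$$ C_V = 9 N k_B \left( \frac{k_B T}{\hbar \omega_D} \right)^3 \int_0^{\infty} dx \, x^4 \, \frac{\exp[ x ]}{\left( \exp[ x ] - 1 \right)^2} = \frac{12 \pi^4}{5} N k_B \left( \frac{k_B T}{\hbar \omega_D} \right)^3 \tag{1690} $$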
where the integral has been evaluated as $\frac{4 \pi^4}{15}$. Thus, the low-temperature specific heat varies as $T^3$, in agreement with experiment. The asymptotic high temperature specific heat, $k_B T \gg \hbar \omega_D$, is found from
$$ C_V = \frac{9 N k_B}{\omega_D^3} \int_0^{\omega_D} d\omega \, \omega^2 \left( \beta \hbar \omega \right)^2 N(\omega) \left[ N(\omega) + 1 \right] \tag{1691} $$
noting that the number of phonons is given by $N(\omega) \approx \frac{k_B T}{\hbar \omega}$, so
$$ C_V = \frac{9 N k_B}{\omega_D^3} \int_0^{\omega_D} d\omega \, \omega^2 = 3 N k_B \tag{1692} $$
which is the classical limit. Thus, the Debye approximation provides an interpolation between the low-temperature limit and the high temperature limit that is governed by only one parameter, the Debye temperature $k_B T_D = \hbar \omega_D$.
——————————————————————————————————
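The interpolation provided by the Debye model can be illustrated numerically. The sketch below assumes the dimensionless form $C_V / (3 N k_B) = 3 \, (T/T_D)^3 \int_0^{T_D/T} dx \, x^4 e^x / (e^x - 1)^2$ and evaluates the integral with Simpson's rule; the function names and the number of integration steps are illustrative choices, not part of the notes.

```python
import math

def debye_integrand(x):
    """x^4 e^x / (e^x - 1)^2, written so it stays finite for small and large x."""
    if x < 1e-6:
        return x * x                      # leading small-x behaviour
    return x**4 * math.exp(-x) / (-math.expm1(-x)) ** 2

def debye_cv_per_3nkb(t_over_td, steps=2000):
    """C_V / (3 N k_B) for the Debye model; steps must be even (Simpson's rule)."""
    x_d = 1.0 / t_over_td                 # upper limit x_D = T_D / T
    h = x_d / steps
    total = debye_integrand(0.0) + debye_integrand(x_d)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * debye_integrand(i * h)
    integral = total * h / 3.0
    return 3.0 * t_over_td**3 * integral

low_t_law = lambda t: (4.0 * math.pi**4 / 5.0) * t**3   # T^3 law in the same units

for t in (0.02, 0.05, 0.2, 1.0, 20.0):
    print(f"T/T_D = {t:5.2f}  C_V/(3NkB) = {debye_cv_per_3nkb(t):.6f}  T^3 law = {low_t_law(t):.6f}")
```

At the lowest temperatures the numerical result follows the $T^3$ law, and at high temperatures it approaches the classical value of unity.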
14.4.4
Exercise 68
Evaluate the integral
$$ \int_0^{\infty} dx \, x^4 \, \frac{\exp[ x ]}{\left( \exp[ x ] - 1 \right)^2} \tag{1693} $$
needed in the low-temperature limit of the specific heat for the Debye model.
——————————————————————————————————
14.4.5
Exercise 69
Generalize the Debye model to a d-dimensional solid. Determine the high temperature and leading low-temperature variation of the specific heat due to lattice vibrations. ——————————————————————————————————
14.4.6
Exercise 70
Show that the leading high temperature correction to the Dulong and Petit value of the specific heat due to lattice vibrations is given by
$$ \frac{\Delta C_V}{C_V} = - \frac{1}{12} \, \frac{\int d\omega \left( \frac{\hbar \omega}{k_B T} \right)^2 \rho(\omega)}{\int d\omega \, \rho(\omega)} \tag{1694} $$
Also evaluate the moment of the phonon density of states
$$ \int d\omega \, \omega^2 \, \rho(\omega) \tag{1695} $$
in terms of the pair potentials between the ions.
——————————————————————————————————
14.4.7
Exercise 71
Numerically calculate the phonon density of states for a single phonon mode for a two-dimensional lattice with a dispersion
$$ \omega^2(q) = \omega_0^2 \left[ 2 - \cos q_x a - \cos q_y a \right] \tag{1696} $$
and hence obtain the temperature dependence of the specific heat. Compare this with the numerical evaluation of an appropriate Debye model.
——————————————————————————————————
14.4.8
Lindemann Theory of Melting
Lindemann assumed that a lattice melts when the displacements due to lattice vibrations become comparable to the lattice constants. Although this theory does not address the appropriate mechanism, it does give the right order of magnitude for simple metals and transition metals. It shall be assumed that melting occurs at a critical value of the ratio
$$ \gamma_c = \frac{\overline{u_i^2}}{a_i^2} \tag{1697} $$
This determines the estimated melting temperature through
$$ \gamma_c = \frac{\hbar}{2 M N a^2} \sum_{q,\alpha} \frac{2 n_{q,\alpha} + 1}{\omega_\alpha(q)} \tag{1698} $$
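The right hand side can easily be evaluated in two limits, the zero temperature limit and the high temperature limit. At zero temperature, the relative mean squared displacement is given by
$$ \gamma = \frac{\hbar}{2 M N a^2} \sum_{q,\alpha} \frac{1}{v_\alpha q} = \frac{3 \hbar}{2 M N v a^2} \, \frac{V}{( 2 \pi )^3} \int d^3q \, \frac{1}{q} = \frac{3 \hbar}{2 M N v a^2} \, \frac{V}{( 2 \pi )^3} \, 2 \pi q_D^2 \tag{1699} $$
Using the relation
$$ N = \frac{V q_D^3}{6 \pi^2} = \frac{3 V}{4 \pi a^3} \tag{1700} $$
one obtains
$$ \gamma \sim \frac{1}{2} \left( \frac{9}{2 \pi^2} \right)^{\frac{1}{3}} \frac{\hbar q_D}{M v} \sim 0.4 \, \frac{\hbar q_D}{M v} = 0.4 \, \frac{k_B T_D}{M v^2} \tag{1701} $$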
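At high temperatures, the relative mean squared displacement is given by
$$ \gamma = \frac{\hbar}{2 M N a^2} \, \frac{V}{2 \pi^2} \int_0^{q_D} dq \, q^2 \, \frac{6 k_B T}{\hbar v^2 q^2} = \frac{3 V k_B T q_D}{2 \pi^2 M a^2 N v^2} = \frac{9 k_B T}{M a^2 q_D^2 v^2} = \left( \frac{36}{\pi^2} \right)^{\frac{1}{3}} \frac{k_B T}{M v^2} \sim 1.54 \, \frac{k_B T}{M v^2} = 1.54 \, \frac{\hbar^2 q_D^2 \, k_B T}{M k_B^2 T_D^2} \tag{1702} $$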
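The numerical prefactors quoted in the two limits can be checked directly. The sketch below only evaluates the closed forms $\frac{1}{2} (9 / 2\pi^2)^{1/3}$ and $(36/\pi^2)^{1/3}$, and then inverts the high temperature expression for the melting temperature. The helper name and its argument $M v^2 / k_B T_D$ are hypothetical inputs introduced for illustration; no material data are implied.

```python
import math

# Prefactors of hbar*q_D/(M v) at T = 0 and of k_B T/(M v^2) at high T
zero_t_prefactor = 0.5 * (9.0 / (2.0 * math.pi**2)) ** (1.0 / 3.0)
high_t_prefactor = (36.0 / math.pi**2) ** (1.0 / 3.0)

def melting_over_debye(gamma_c, mv2_over_kb_td):
    """T_m / T_D from gamma_c = prefactor * k_B T_m / (M v^2).

    mv2_over_kb_td is the dimensionless ratio M v^2 / (k_B T_D),
    a hypothetical input that must be supplied for a given material.
    """
    return gamma_c * mv2_over_kb_td / high_t_prefactor

print(f"zero temperature prefactor: {zero_t_prefactor:.4f}")   # about 0.385
print(f"high temperature prefactor: {high_t_prefactor:.4f}")   # about 1.539
```

The ratio of the two prefactors is exactly 4, which follows from the common factor $9 / ( a^2 q_D^2 )$ appearing in both limits.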
The Lindemann criterion provides a relation between the Debye temperature and the melting temperature. The experimental data for alkaline metals and transition metals suggest that $\gamma_c$ has a value of $\frac{1}{16}$, independent of the metal. Of course, one expects that anharmonic effects may become important for large displacements of the ions from their equilibrium positions.

Mermin-Wagner Theorem. The Lindemann theory of melting may be extended to provide an example of the Mermin-Wagner theorem. The Mermin-Wagner theorem states that finite temperature phase transitions in which a continuous symmetry is spontaneously broken cannot occur in lower than three dimensions (N.D. Mermin and H. Wagner, Phys. Rev. Lett. 17, 1133 (1966)). Basically, if such a transition occurs then there should be a branch of Goldstone modes that dynamically restores the spontaneously broken symmetry. These normal modes produce fluctuations in the order parameter. In a periodic solid where continuous translational invariance is broken, the Goldstone modes are the transverse sound waves. The transverse sound modes have dispersion relations of the form $\omega(q) = v \, q$. The fluctuations in the order parameter are the fluctuations in the choice of origin of the lattice and, therefore, are just the fluctuations in the positions of any one ion. In $d$ dimensions, at finite temperatures, the fluctuations have contributions from the region of small $q$ which are proportional to
$$ \overline{u_i^2} \sim \int_0^{q_D} d^dq \, \frac{\hbar}{2 M \omega(q)} \, \frac{k_B T}{\hbar \omega(q)} \sim \int_0^{q_D} dq \, q^{d-3} \tag{1703} $$
where $q_D$ is a cut off due to the lattice. The integral diverges logarithmically for $d = 2$, indicating that the fluctuations in the equilibrium lattice positions will be infinitely large, thereby preventing the solid from being formed. Likewise, for lower dimensions, such as one dimension, the integral will also diverge at the lower limit. Therefore, no truly one-dimensional solid is stable against temperature induced fluctuations. An analysis of the zero point fluctuations also rules out the possibility of a one-dimensional lattice forming in the limit of zero temperature.

For a harmonic solid, the phonon frequencies are independent of the volume $V$. This can be seen by considering the energy of a solid which has expanded in the linear dimensions by an amount proportional to $\epsilon$. The energy of a harmonic solid with static displacements about the original equilibrium position is given by the harmonic expression
$$ E = E_{eq} + \frac{1}{2} \sum_{i,j} u_i \, \widetilde{D}(R_i - R_j) \, u_j \tag{1704} $$
Now consider the expanded lattice, in which the displacements are given by
$$ u_i = \epsilon \, R_i + \tilde{u}_i \tag{1705} $$
Here $\tilde{u}_i$ are the new displacements from the new lattice sites of the lattice which has undergone an increase in volume by a factor of $( 1 + \epsilon )^3$, through the application of external forces. The expanded solid has an energy given by
$$ E = E_{eq} + \frac{\epsilon^2}{2} \sum_{i,j} R_i \, \widetilde{D}(R_i - R_j) \, R_j + \frac{1}{2} \sum_{i,j} \tilde{u}_i \, \widetilde{D}(R_i - R_j) \, \tilde{u}_j \tag{1706} $$
The terms linear in $\epsilon$ vanish identically, as the total force on an ion must vanish in equilibrium. The total force is the sum of the internal forces opposing the expansion and the applied external forces that result in the expansion. Since the dynamical matrix that governs the lattice displacements $\tilde{u}_i$ is unchanged, its eigenvalues, which are the phonon frequencies, are unchanged by expansion of a harmonic solid. Thermal expansion only occurs for an anharmonic lattice. Thermal expansion provides a measure of the volume dependence of the phonon frequencies, $\frac{\partial \hbar \omega}{\partial V}$, or the anharmonicity.
14.4.9
Thermal Expansion
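The coefficient of thermal expansion of an insulator can be evaluated from
$$ \left( \frac{\partial V}{\partial T} \right)_P = - \frac{\left( \frac{\partial P}{\partial T} \right)_V}{\left( \frac{\partial P}{\partial V} \right)_T} \tag{1707} $$
where the pressure is found from $P = - \left( \frac{\partial F}{\partial V} \right)_T$. Using the expression for the free energy of the lattice one finds that the coefficient of thermal expansion can be written as
$$ \alpha = - \frac{1}{B} \sum_{q,\alpha} \left( \frac{\partial \hbar \omega_\alpha(q)}{\partial V} \right) \frac{\partial N(\omega_\alpha(q))}{\partial T} \tag{1708} $$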
where $B$ is the bulk modulus and $N(\omega)$ is the Bose-Einstein distribution function. The specific heat can be written as
$$ C_V = \sum_{q,\alpha} \hbar \omega_\alpha(q) \, \frac{\partial N(\omega_\alpha(q))}{\partial T} \tag{1709} $$
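The intermediate steps leading to the expression for $\alpha$ can be sketched as follows. This is an outline under the assumption that the lattice free energy is that of independent phonon modes whose frequencies depend on volume; the symbol $E_{eq}(V)$ for the temperature independent static lattice energy is a label introduced here.

```latex
\begin{align}
F &= E_{eq}(V) + \sum_{q,\alpha} \left[ \frac{\hbar \omega_\alpha(q)}{2}
     + k_B T \ln\left( 1 - e^{- \beta \hbar \omega_\alpha(q)} \right) \right] \\
P &= - \left( \frac{\partial F}{\partial V} \right)_T
   = - \frac{\partial E_{eq}}{\partial V}
     - \sum_{q,\alpha} \frac{\partial \hbar \omega_\alpha(q)}{\partial V}
       \left[ N(\omega_\alpha(q)) + \frac{1}{2} \right] \\
\left( \frac{\partial P}{\partial T} \right)_V
  &= - \sum_{q,\alpha} \frac{\partial \hbar \omega_\alpha(q)}{\partial V} \,
       \frac{\partial N(\omega_\alpha(q))}{\partial T} \\
\alpha &= \frac{1}{V} \left( \frac{\partial V}{\partial T} \right)_P
        = \frac{1}{B} \left( \frac{\partial P}{\partial T} \right)_V ,
\qquad B = - V \left( \frac{\partial P}{\partial V} \right)_T
\end{align}
```

Only the thermally occupied part of each mode contributes to $( \partial P / \partial T )_V$, since the static energy and the zero point term are independent of temperature at fixed volume.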
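On identifying the contributions from each normal mode, one can define a Gruneisen parameter for each normal mode
$$ \gamma_\alpha(q) = - \frac{V}{\omega_\alpha(q)} \, \frac{\partial \omega_\alpha(q)}{\partial V} \tag{1710} $$
which is a dimensionless ratio of $\frac{\alpha B V}{C_V}$. Thus,
$$ \gamma_\alpha(q) = - \left( \frac{\partial \ln \omega_\alpha(q)}{\partial \ln V} \right) \tag{1711} $$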
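The Gruneisen parameter for the entire solid can be expressed as a weighted average of the Gruneisen parameter of each normal mode
$$ \gamma = \frac{\sum_{q,\alpha} \gamma_\alpha(q) \, C_{q,\alpha}}{\sum_{q,\alpha} C_{q,\alpha}} \tag{1712} $$
with weights given by $C_{q,\alpha}$, the contribution of each normal mode to the specific heat. This is consistent with the definition of the Gruneisen parameter in terms of thermodynamic quantities
$$ \gamma = \frac{\alpha B V}{C_V} \tag{1713} $$
For most models $\gamma_\alpha(q)$ is roughly independent of $T$ and is a constant,
$$ \gamma_\alpha(q) \sim \gamma = - \left( \frac{\partial \ln \omega_D}{\partial \ln V} \right) \tag{1714} $$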
Hence, as B is roughly T independent, the specific heat CV tracks the coefficient of thermal expansion α. A typical Gruneisen parameter has a magnitude of ∼ 1 or 2, and a slow temperature variation, which changes on the scale of TD .
14.4.10
Thermal Expansion of Metals
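For a metal, there is an additional contribution to the pressure from the electrons. The electronic contribution to the pressure is calculated as
$$ P_{el} = \frac{2}{3} \, \frac{E^{el}}{V} \tag{1715} $$
and as the electronic energy is temperature dependent, the electronic contribution to the pressure is also temperature dependent. This gives an additional contribution to the rate of change of pressure with respect to temperature
$$ \frac{\partial P_{el}}{\partial T} = \frac{2}{3} \, C_V^{el} \tag{1716} $$
Hence, the coefficient of thermal expansion for a metal is determined from
$$ \left( \frac{\partial V}{\partial T} \right)_P = - \frac{\left( \frac{\partial P}{\partial T} \right)_V}{\left( \frac{\partial P}{\partial V} \right)_T} \tag{1717} $$
Hence,
$$ \alpha = \frac{1}{B} \left( \gamma \, C_V^{latt} + \frac{2}{3} \, C_V^{el} \right) \tag{1718} $$
where $\frac{2}{3}$ is the electronic Gruneisen parameter. In these expressions the specific heats are understood to be taken per unit volume.
14.5
Anharmonicity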
The anharmonic interactions give rise to the lifetime of phonons and provide temperature dependent corrections to the phonon dispersion relations. These may usually be thought of as producing small corrections to the harmonic phonons, except when the system is on the verge of a structural instability, where they play an important role. The phonon modes are not the only excitations of the crystalline lattice; there are also large amplitude excitations like dislocations. Although these excitations may have a large (macroscopic) spatial extent, they do not extend all through the crystal, like the phonon modes, and the deviations of the atoms from the ideal equilibrium positions can be large, comparable to the lattice spacing. If the lattice displacements in the dislocations were considered to be made up of a superposition of coherent states for each phonon mode, then in the absence of the anharmonic interactions the distortions would disperse and the dislocations would lose their shape. The anharmonic interactions are responsible for stabilizing these large amplitude, spatially localized, excitations by balancing the effects of dispersion of the phonon modes. These excitations do have macroscopically large excitation energies but they also have macroscopically large effects. In essence, these dislocations are non-linear excitations, like solitons, and play an extremely important role in determining the actual mechanical properties of any real solid.
——————————————————————————————————
14.5.1
Exercise 72
The full ionic potential of a mono-atomic Bravais Lattice has the form
$$ V_{eq} + \frac{1}{2!} \sum_{R,R'} \sum_{\mu,\nu} u_\mu(R) \, D_{\mu \nu}(R - R') \, u_\nu(R') + \frac{1}{3!} \sum_{R,R',R''} \sum_{\mu,\nu,\lambda} u_\mu(R) \, u_\nu(R') \, u_\lambda(R'') \, D_{\mu \nu \lambda}(R, R', R'') \tag{1719} $$
where $u(R)$ gives the displacement from the equilibrium position $R$.
(i) Show that if an expansion is made about the expanded lattice positions defined by
$$ R_\epsilon = ( 1 + \epsilon ) \, R \tag{1720} $$
then the dynamical matrix is changed to
$$ D^{\epsilon}_{\mu,\nu}(R - R') = D_{\mu,\nu}(R - R') + \epsilon \, \delta D_{\mu,\nu}(R - R') \tag{1721} $$
where the change in the dynamical matrix is given by
$$ \delta D_{\mu,\nu}(R - R') = \sum_{\lambda,R''} D_{\mu \nu \lambda}(R, R', R'') \, R''_\lambda \tag{1722} $$
(ii) Show that the Gruneisen parameter is given by
$$ \gamma_\alpha(q) = \frac{\epsilon_\alpha(q) \, \delta \widetilde{D}(q) \, \epsilon_\alpha(q)}{6 M \omega_\alpha(q)^2} \tag{1723} $$
——————————————————————————————————
15
Phonon Measurements
The spectrum of phonon excitations in a solid can be measured directly, via inelastic neutron scattering or Raman scattering of light.
15.1
Inelastic Neutron Scattering
The neutrons interact with the atomic nuclei by a very short ranged contact interaction
$$ \hat{H}_{int} = \frac{2 \pi \hbar^2}{m_n} \, b \sum_i \delta^3( r - r(R_i) ) \tag{1724} $$
where $r$ is the position of the neutron, and $r(R_i)$ are the positions of the ions. The inelastic neutron scattering cross-section contains information about the ground state and all the excited states of the lattice. The various contributions to the spectrum are analyzed by use of the conservation laws. In inelastic neutron scattering experiments the incident neutron energy is given by
$$ E = \frac{\hbar^2 k^2}{2 m_n} = \frac{P^2}{2 m_n} \tag{1725} $$
and the final energy is given by
$$ E' = \frac{P'^2}{2 m_n} = \frac{\hbar^2 k'^2}{2 m_n} \tag{1726} $$
The energy transfer from the neutron to the sample is given by
$$ \hbar \omega = E - E' \tag{1727} $$
This energy is the energy given to the excited phonon modes
$$ \hbar \omega = \sum_{q,\alpha} \hbar \omega_\alpha(q) \left( n'_{q,\alpha} - n_{q,\alpha} \right) \tag{1728} $$
as found from conservation of energy. Due to the periodic translational invariance of the crystal and the short range of the interaction, the momentum change of the neutron is given by
$$ p - p' = \hbar k - \hbar k' = \sum_{q,\alpha} \hbar q \left( n'_{q,\alpha} - n_{q,\alpha} \right) + \hbar Q \tag{1729} $$
where Q is a reciprocal lattice vector. Thus, even if the scattering is elastic the neutron may still be diffracted. The use of the two conservation laws allows the dispersion relation ωα (q) to be determined.
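To give a feel for the energy scales involved, the kinematic relations for the neutron can be evaluated numerically. The sketch below uses the free neutron dispersion $E = \hbar^2 k^2 / 2 m_n$ with standard values of the constants; the wave vectors in the usage lines are illustrative inputs, not data from an experiment.

```python
HBAR = 1.054571817e-34          # J s
M_NEUTRON = 1.67492749804e-27   # kg
MEV_IN_J = 1.602176634e-22      # one meV expressed in joules

def neutron_energy_mev(k_inv_angstrom):
    """Kinetic energy (meV) of a neutron with wave vector k given in 1/Angstrom."""
    k = k_inv_angstrom * 1.0e10                     # convert to 1/m
    return HBAR**2 * k**2 / (2.0 * M_NEUTRON) / MEV_IN_J

def energy_transfer_mev(k_in, k_out):
    """Energy transfer hbar*omega = E - E' (meV) from neutron to sample."""
    return neutron_energy_mev(k_in) - neutron_energy_mev(k_out)

print(f"E(k = 1 / Angstrom) = {neutron_energy_mev(1.0):.4f} meV")
print(f"energy transfer for k = 2.0, k' = 1.5: {energy_transfer_mev(2.0, 1.5):.4f} meV")
```

Neutrons with wave vectors comparable to reciprocal lattice vectors therefore have energies of a few meV, which is the same scale as typical phonon energies.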
15.1.1
The Scattering Cross-Section
The inelastic scattering cross-section $\frac{d^2\sigma}{d\Omega \, d\omega}$ depends on the scattering geometry through the scattering angle $\theta$, and $d\Omega$ is the solid angle subtended by the detector at the target. The inelastic neutron scattering cross-section is given by the Fermi Golden rule expression. The Fermi Golden rule is derived in the interaction representation, and involves an integral over time of the expression with the matrix elements
$$ \frac{1}{\hbar^2} \, \mathrm{Re} \int_{-\infty}^{+\infty} dt' \, \langle \prod_{q,\alpha} n_{\alpha,q} | \, \langle k | \hat{H}_{int} | k' \rangle \, | \prod_{q,\alpha} n'_{\alpha,q} \rangle \times \langle \prod_{q,\alpha} n'_{\alpha,q} | \, \langle k' | \exp\left[ + \frac{i t'}{\hbar} \hat{H}_0 \right] \hat{H}_{int} \exp\left[ - \frac{i t'}{\hbar} \hat{H}_0 \right] | k \rangle \, | \prod_{q,\alpha} n_{\alpha,q} \rangle \tag{1730} $$
The matrix elements involve the initial and final states, each of which is a product of the neutron states $k$ and the states of the lattice. The initial and final states of the lattice are represented by the number of quanta in each normal mode and are written as $| \prod_{q,\alpha} n_{\alpha,q} \rangle$ and $| \prod_{q,\alpha} n'_{\alpha,q} \rangle$, respectively. The dependence on the states of the lattice is suppressed in the following. The integration over $t'$ gives rise to an energy conserving delta function. This involves the matrix elements of the interaction between the neutron and the nuclei in the solid but, unlike the elastic scattering cross-section previously derived, the nuclei may be displaced from their equilibrium positions by $u_i$ according to
$$ r(R_i) = R_i + u_i \tag{1731} $$
Hence, the interaction Hamiltonian is given by
$$ \hat{H}_{int} = \frac{2 \pi \hbar^2}{m_n} \, b \sum_i \delta^3( r - R_i - u_i ) \tag{1732} $$
and the matrix elements between the initial and final states of the neutron, respectively labelled by momentum $k$ and $k'$, are
$$ \langle k' | \hat{H}_{int} | k \rangle = \frac{2 \pi \hbar^2}{m_n} \, b \sum_i \exp\left[ i ( k - k' ) \cdot ( R_i + u_i ) \right] \tag{1733} $$
If the displacements $u_i$ have sufficiently small magnitudes, compared with the neutron wave length or the lattice spacing, the exponential term can be expanded as a series in powers of $u_i$,
$$ \langle k' | \hat{H}_{int} | k \rangle \approx \frac{2 \pi \hbar^2}{m_n} \, b \sum_i \exp\left[ i ( k - k' ) \cdot R_i \right] \times \left( 1 + i ( k - k' ) \cdot u_i + \ldots \right) \tag{1734} $$
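The first term in the expansion has been previously considered in the discussion of neutron diffraction by a crystalline lattice, and gives rise to Bragg scattering. The next term gives rise to single phonon scattering, while the higher order terms represent scattering from multi-phonon excitations. In the interaction representation, the terms involving the lattice displacements depend on time, via
$$ \hat{H}_{int}(t') = \exp\left[ + \frac{i \hat{H}_0 t'}{\hbar} \right] \hat{H}_{int}(0) \exp\left[ - \frac{i \hat{H}_0 t'}{\hbar} \right] \tag{1735} $$
Thus, the lattice displacements are expressed in terms of the phonon modes as
$$ u_i(t') = \frac{1}{\sqrt{N}} \sum_{q,\alpha} \sqrt{\frac{\hbar}{2 M \omega_\alpha(q)}} \; \epsilon_\alpha(q) \exp\left[ i q \cdot R_i \right] \times \left( a_{q,\alpha} \exp[ - i \omega_\alpha(q) t' ] + a^\dagger_{-q,\alpha} \exp[ + i \omega_\alpha(q) t' ] \right) \tag{1736} $$
where $\omega_\alpha(q)$ are the phonon frequencies. The integrals over $t'$ can be expressed in terms of energy conserving delta functions
$$ \int_{-\infty}^{\infty} \frac{dt'}{2 \pi \hbar} \exp\left[ \frac{i t'}{\hbar} ( E' - E \pm \hbar \omega ) \right] = \delta( E' - E \pm \hbar \omega ) \tag{1737} $$
On evaluating the matrix elements between the initial and final states of the lattice, the scattering cross-section is found as the sum of terms
$$ \frac{d^2\sigma}{d\Omega \, d\omega} \propto \frac{k}{k'} \left( \frac{2 \pi \hbar^2}{m_n} b \right)^2 \Bigg[ \; \left| \sum_i \exp\left[ i ( k - k' ) \cdot R_i \right] \right|^2 \delta( E' - E ) $$
$$ + \frac{1}{N} \sum_{q,\alpha} \frac{\hbar}{2 M \omega_\alpha(q)} \left( n_{-q,\alpha} + 1 \right) \delta( E' - E + \hbar \omega_\alpha(q) ) \times \left| \sum_i \exp\left[ i ( k + q - k' ) \cdot R_i \right] ( k - k' ) \cdot \epsilon_\alpha(q) \right|^2 $$
$$ + \frac{1}{N} \sum_{q,\alpha} \frac{\hbar}{2 M \omega_\alpha(q)} \, n_{q,\alpha} \, \delta( E' - E - \hbar \omega_\alpha(q) ) \times \left| \sum_i \exp\left[ i ( k + q - k' ) \cdot R_i \right] ( k - k' ) \cdot \epsilon_\alpha(q) \right|^2 + \ldots \Bigg] \tag{1738} $$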
The displacements have been expressed in terms of the normal modes, and $n_{q,\alpha}$ is just the number of phonons with momentum $q$ and polarization $\alpha$ in the initial state. On performing the sum over lattice vectors $R_i$, one finds the condition for conservation of crystal momentum,
$$ \sum_i \exp\left[ i ( k - k' - q ) \cdot R_i \right] = N \, \Delta( k - k' - q ) \tag{1739} $$
modulo $Q$. Thus, the summation over $q$ can be performed, so that the second and third terms involve the absorption or emission of a phonon of wave vector $q = ( k - k' + Q )$, where $Q$ is a reciprocal lattice vector. These terms are smaller than the coherent Bragg terms by a factor of $\frac{1}{N}$.
$$ \frac{d^2\sigma}{d\Omega \, d\omega} \propto \frac{k}{k'} \left( \frac{2 \pi \hbar^2}{m_n} b \right)^2 \Bigg[ \; N^2 \, \Delta( k - k' ) \, \delta( E' - E ) $$
$$ + N \sum_{q,\alpha} \Delta( k - k' - q ) \, \delta( E' - E + \hbar \omega_\alpha(q) ) \times \left( n_{q,\alpha} + 1 \right) \frac{\hbar}{2 M \omega_\alpha(q)} \left| ( k - k' ) \cdot \epsilon_\alpha(q) \right|^2 $$
$$ + N \sum_{q,\alpha} \Delta( k - k' + q ) \, \delta( E' - E - \hbar \omega_\alpha(q) ) \times n_{q,\alpha} \, \frac{\hbar}{2 M \omega_\alpha(q)} \left| ( k - k' ) \cdot \epsilon_\alpha(q) \right|^2 + \ldots \Bigg] \tag{1740} $$
The inelastic one phonon contributions are coherent, as they involve the conservation of momentum, but have an intensity that is only proportional to $N$. The energy of the phonon is given by $\hbar \omega_\alpha(q)$. The thermal average of the number of phonons $n_{q,\alpha}$ is given by the Bose-Einstein distribution function, which is a Boltzmann weighted average
$$ N(\omega_\alpha(q)) = \overline{n_{q,\alpha}} = \frac{1}{Z_{q,\alpha}} \sum_{n=0}^{\infty} n \exp[ - \beta n \hbar \omega_\alpha(q) ] = \frac{1}{Z_{q,\alpha}} \, \frac{\exp[ - \beta \hbar \omega_\alpha(q) ]}{\left( 1 - \exp[ - \beta \hbar \omega_\alpha(q) ] \right)^2} \tag{1741} $$
However, the partition function for a single phonon mode $Z_{q,\alpha}$ is given by
$$ Z_{q,\alpha} = \sum_{n=0}^{\infty} \exp[ - \beta n \hbar \omega_\alpha(q) ] = \frac{1}{1 - \exp[ - \beta \hbar \omega_\alpha(q) ]} \tag{1742} $$
Hence, the Bose-Einstein distribution function is found to be
$$ N(\omega_\alpha(q)) = \frac{1}{\exp[ \beta \hbar \omega_\alpha(q) ] - 1} \tag{1743} $$
At low temperatures, the number of thermally activated bosons is small; therefore, the inelastic scattering intensity for processes which lead to an increase in the energy of the neutron, due to the absorption of phonons, is small. On the other hand, the intensity of processes which involve an energy loss by the neutron beam due to the creation of individual phonons is governed by the factor $1 + N(\omega_\alpha(q))$, which is almost unity at low temperatures. The rate for inelastic transitions of the incident neutrons obeys the principle of detailed balance. That is, although the neutron beam is not in equilibrium with the solid, the transition rate is such that it drives the beam towards equilibrium. This can be seen by inspection of the one phonon contribution to the spectrum. The one phonon absorption and emission spectrum is proportional to
$$ \left[ 1 + N(\omega_\alpha(q)) \right] \delta( E - E' - \hbar \omega_\alpha(q) ) + N(\omega_\alpha(q)) \, \delta( E - E' + \hbar \omega_\alpha(q) ) \tag{1744} $$
The first term represents a process in which the neutron loses energy due to the emission of a phonon, whereas the second term represents a process in which the neutron gains energy due to the absorption of a phonon. The ratio of the rate at which the neutron beam gains energy to the rate at which the neutron beam loses energy is given by
$$ \frac{W(E \to E + \hbar \omega)}{W(E + \hbar \omega \to E)} = \frac{N(\omega)}{[ N(\omega) + 1 ]} = \exp\left[ - \beta \hbar \omega \right] \tag{1745} $$
If equilibrium with the beam were to be established, the kinetic energy of the neutron beam would be distributed according to the Boltzmann formula
$$ P(E) = \frac{1}{Z} \exp\left[ - \beta E \right] \tag{1746} $$
such that dynamic equilibrium would be established. In this case the total number of transitions from $E \to E + \hbar \omega$ precisely equals the number of transitions in the reverse direction $E + \hbar \omega \to E$
$$ P(E) \, W(E \to E + \hbar \omega) = P(E + \hbar \omega) \, W(E + \hbar \omega \to E) \tag{1747} $$
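The Boltzmann weighted average and the detailed balance ratio can be verified numerically. The sketch below works with the dimensionless variable $x = \beta \hbar \omega$; the truncation of the sum at `n_max` and the chosen value of $x$ are illustrative assumptions.

```python
import math

def bose(x):
    """Closed form N = 1 / (exp(x) - 1) with x = beta * hbar * omega."""
    return 1.0 / math.expm1(x)

def bose_from_sum(x, n_max=500):
    """Boltzmann weighted average of n, with the sum truncated at n_max."""
    weights = [math.exp(-n * x) for n in range(n_max + 1)]
    z = sum(weights)                                  # single mode partition function
    return sum(n * w for n, w in enumerate(weights)) / z

x = 0.7
n = bose(x)
gain_over_loss = n / (n + 1.0)            # W(E -> E + hw) / W(E + hw -> E)
# With P(E) proportional to exp(-beta E), P(E)/P(E + hw) = exp(x), so the
# product below is the ratio of forward to reverse transition numbers.
balance = math.exp(x) * gain_over_loss

print(f"N from sum    = {bose_from_sum(x):.12f}")
print(f"N closed form = {n:.12f}")
print(f"gain/loss     = {gain_over_loss:.12f}  exp(-x) = {math.exp(-x):.12f}")
print(f"balance       = {balance:.12f}")
```

The balance ratio equals unity, which is the statement that a Boltzmann distributed beam would be in dynamic equilibrium with the sample.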
However, the beam produced by the neutron source is not in equilibrium with the sample, and would only equilibrate if the beam were to traverse an infinite path length through the sample. In addition to the inelastic one phonon scattering cross-section, there are two terms of second order in $u$, which are cross-terms involving the term second order in $u(t')$ from the expansion of one factor
$$ \frac{2 \pi \hbar^2}{m_n} \, b \, \exp\left[ i ( k - k' ) \cdot ( R_i + u_i(t') ) \right] \tag{1748} $$
and the leading zero-th order term from the other factor. This cross term is proportional to
$$ - \frac{1}{2!} \left[ ( k - k' ) \cdot u_i(t') \right]^2 \cdot 1 = - \frac{1}{2!} \, \frac{1}{N} \sum_{q,q',\alpha,\alpha'} \frac{\hbar}{2 M \sqrt{\omega_\alpha(q) \, \omega_{\alpha'}(q')}} \; ( k - k' ) \cdot \epsilon_\alpha(q) \times ( k - k' ) \cdot \epsilon_{\alpha'}(q') \, \exp\left[ i ( q + q' ) \cdot R_i \right] $$
$$ \times \left( a_{q,\alpha} \exp[ - i \omega_\alpha(q) t' ] + a^\dagger_{-q,\alpha} \exp[ + i \omega_\alpha(q) t' ] \right) \times \left( a_{q',\alpha'} \exp[ - i \omega_{\alpha'}(q') t' ] + a^\dagger_{-q',\alpha'} \exp[ + i \omega_{\alpha'}(-q') t' ] \right) \tag{1749} $$
The expectation value of both the cross-terms
$$ - \frac{1}{2!} \left[ ( k - k' ) \cdot u_i(t') \right]^2 \cdot 1 \; - \; 1 \cdot \frac{1}{2!} \left[ ( k - k' ) \cdot u_i(0) \right]^2 \tag{1750} $$
is time independent and is given by
$$ - \frac{1}{N} \sum_{q,\alpha} \frac{\hbar}{2 M \omega_\alpha(q)} \left| ( k - k' ) \cdot \epsilon_\alpha(q) \right|^2 \left( 2 n_{q,\alpha} + 1 \right) \tag{1751} $$
Hence, it is seen that the second order contribution can be decomposed into an elastic and an inelastic one phonon contribution. The elastic contribution involves the scattering intensity from the nuclei when they are displaced from their equilibrium positions, either through the zero point fluctuations or through the effect of a thermally activated population of lattice vibrations. This second order contribution to the elastic scattering has a form similar to the intensity of the coherent Bragg peak, and cannot be distinguished from it through experiments. It is expected that inspection of all the even order terms in the expansion should provide similar contributions, which will modify the intensity of the observed Bragg scattering peak. These contributions, due to the fluctuations of the nuclei from their equilibrium positions, give rise to a reduction in intensity which is governed by the Debye-Waller factor.
15.2
The Debye-Waller Factor
The above second order contribution can be combined with the leading order term, to give the first two terms of the expansion of the elastic scattering cross-section, $W$,
$$ W = 1 - \frac{1}{N} \sum_{q,\alpha} \frac{\hbar}{2 M \omega_\alpha(q)} \left| ( k - k' ) \cdot \epsilon_\alpha(q) \right|^2 \left( 2 N(\omega_\alpha(q)) + 1 \right) + \ldots \tag{1752} $$
which can be exponentiated to yield the Debye-Waller factor
$$ W = \exp\left[ - \frac{1}{N} \sum_{q,\alpha} \frac{\hbar}{2 M \omega_\alpha(q)} \left| ( k - k' ) \cdot \epsilon_\alpha(q) \right|^2 \coth \frac{\beta \hbar \omega_\alpha(q)}{2} \right] \tag{1753} $$
The Debye-Waller factor reduces the intensity of the Bragg peak. The Debye-Waller factor also modifies the intensity of the Bragg peak in x-ray scattering. The effects of the multi-phonon processes, of all orders, can be ascertained by examining the expectation value of the factor in the neutron scattering cross-section given by
$$ \int_{-\infty}^{\infty} dt' \, \langle k | \hat{H}_{int}(0) | k' \rangle \, \langle k' | \hat{H}_{int}(t') | k \rangle = \left( \frac{2 \pi \hbar^2}{m_n} b \right)^2 \sum_{i,j} \exp\left[ - i ( k - k' ) \cdot ( R_i - R_j ) \right] \times \int_{-\infty}^{\infty} dt' \, \exp\left[ - i ( k - k' ) \cdot u_i(0) \right] \exp\left[ + i ( k - k' ) \cdot u_j(t') \right] \tag{1754} $$
The energy conserving delta functions have been expressed as integrals over $t'$ via
$$ \delta( E' - E \pm \hbar \omega ) = \int_{-\infty}^{\infty} \frac{dt'}{2 \pi \hbar} \exp\left[ \frac{i t'}{\hbar} ( E' - E \pm \hbar \omega ) \right] \tag{1755} $$
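and the energies in the exponentials can be expressed in terms of the non-interacting Hamiltonian. Using this identity, the scattering cross-section can be represented as a Fourier Transform of the thermal average two time correlation function
$$ \int_{-\infty}^{\infty} dt' \exp\left[ \frac{i t'}{\hbar} ( E - E' ) \right] \langle | \exp\left[ - i ( k - k' ) \cdot u_i(0) \right] \exp\left[ + i ( k - k' ) \cdot u_j(t') \right] | \rangle \tag{1756} $$
For harmonic phonons, one can express this correlation function as
$$ \int_{-\infty}^{\infty} dt' \exp\left[ i t' \omega \right] \langle | \exp\left[ - i ( k - k' ) \cdot u_i(0) \right] \exp\left[ + i ( k - k' ) \cdot u_j(t') \right] | \rangle $$
$$ = \int_{-\infty}^{\infty} dt' \exp\left[ i t' \omega \right] \exp\left[ - \frac{1}{2} \langle | \left( ( k - k' ) \cdot ( u_i(0) - u_j(t') ) \right)^2 | \rangle \right] $$
$$ = \int_{-\infty}^{\infty} dt' \exp\left[ i t' \omega \right] \exp\left[ - \langle | ( k - k' ) \cdot u_i(0) \; ( k - k' ) \cdot u_j(t') | \rangle \right] \times \exp\left[ - \frac{1}{2} \langle | \left( ( k - k' ) \cdot u_i(0) \right)^2 | \rangle \right] \exp\left[ - \frac{1}{2} \langle | \left( ( k - k' ) \cdot u_j(t') \right)^2 | \rangle \right] \tag{1757} $$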
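where $\hbar \omega = E - E'$ is the energy loss experienced by the neutron. The last two factors are identified with the Debye-Waller factor, which is given by
$$ W = \exp\left[ - \langle | \left( ( k - k' ) \cdot u_i(0) \right)^2 | \rangle \right] = \exp\left[ - \frac{1}{N} \sum_{q,\alpha} \frac{\hbar}{2 M \omega_\alpha(q)} \left| ( k - k' ) \cdot \epsilon_\alpha(q) \right|^2 \coth \frac{\beta \hbar \omega_\alpha(q)}{2} \right] \tag{1758} $$
The frequency dependent factor can be expanded in terms of the number of phonons
$$ \int_{-\infty}^{\infty} dt' \exp\left[ i t' \omega \right] \exp\left[ - \langle | ( k - k' ) \cdot u_i(0) \; ( k - k' ) \cdot u_j(t') | \rangle \right] = \int_{-\infty}^{\infty} dt' \exp\left[ i t' \omega \right] \left( 1 - \langle | ( k - k' ) \cdot u_i(0) \; ( k - k' ) \cdot u_j(t') | \rangle + \ldots \right) \tag{1759} $$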
Thus, all the contributions to the scattering cross-section are reduced in intensity by the Debye-Waller factor. The first term in the expansion is found to be proportional to δ(ω) and gives rise to the elastic scattering. The second term is just the one phonon contribution to the scattering cross-section.
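The temperature dependence of the exponent in the Debye-Waller factor can be illustrated with a simple model. The sketch below assumes an isotropic Debye spectrum with a single sound velocity, for which the sum over polarizations gives $\sum_\alpha | ( k - k' ) \cdot \epsilon_\alpha |^2 = | k - k' |^2$ and the exponent reduces to $\frac{\hbar | k - k' |^2}{2 M \omega_D} f(T/T_D)$ with $f(t) = 3 \int_0^1 dy \, y \coth( y / 2 t )$. The function name and this reduction are assumptions made for illustration, not results quoted in the notes.

```python
import math

def dw_exponent_factor(t_over_td, steps=2000):
    """f(t) = 3 * integral_0^1 dy y coth(y / (2 t)) by Simpson's rule.

    Multiply by hbar |k - k'|^2 / (2 M omega_D) to obtain the Debye-Waller
    exponent in the assumed isotropic Debye model. steps must be even.
    """
    def integrand(y):
        if y == 0.0:
            return 2.0 * t_over_td          # limit of y * coth(y / 2t) as y -> 0
        return y / math.tanh(y / (2.0 * t_over_td))

    h = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0
    return 3.0 * integral

for t in (0.01, 0.1, 0.5, 1.0, 10.0):
    print(f"T/T_D = {t:5.2f}   f = {dw_exponent_factor(t):.5f}")
```

In this model the exponent tends to the zero point value $f \to 3/2$ at low temperatures and grows linearly, $f \approx 6 \, T/T_D$, at high temperatures, so the Bragg intensity is increasingly suppressed on heating.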
15.3
Single Phonon Scattering
The phonon dispersion relation can be inferred from a measurement of the single phonon scattering peak. The scattering cross-section for processes in which a single phonon is emitted has to satisfy the energy and momentum conservation laws
$$ \frac{\hbar^2 k^2}{2 m_n} = \frac{\hbar^2 k'^2}{2 m_n} + \hbar \omega_\alpha(q) \tag{1760} $$
and
$$ k = k' + q + Q \tag{1761} $$
Since $\omega(q)$ is periodic with the periodicity of the reciprocal lattice vectors
$$ \omega_\alpha(q) = \omega_\alpha(q + Q) \tag{1762} $$
one can combine the equations as
$$ \frac{\hbar^2 k^2}{2 m_n} = \frac{\hbar^2 k'^2}{2 m_n} + \hbar \omega_\alpha(k - k') \tag{1763} $$
In the scattering experiments the beam of neutrons is generally collimated to have a definite direction of the $k$ vector, and also a definite initial energy. For a given $k$, the solutions of the above equation for the three components of $k'$ form a two-dimensional surface. For a detector placed in a particular scattering direction, solutions only exist at isolated points. Measuring the scattering cross-section at the various magnitudes of the final momentum $k'$ yields sharp peaks in the spectrum. With knowledge of the magnitude of the final momentum $k'$, one can construct $k' - k$, and also $E' - E$, and hence find $\hbar \omega_\alpha(q)$ for the normal mode. By varying the direction of $k'$ and the magnitude of $E$, one can map out successive surfaces and hence obtain the dispersion relation. Information about the polarization of the phonon modes can be obtained from the dependence of the intensity on the scattering wave-vector $k - k'$, as the scattering cross-section is proportional to
$$ \left| ( k - k' ) \cdot \epsilon_\alpha(q) \right|^2 \tag{1764} $$
The width of the single phonon peak obtained in experiments has two origins: one is the experimental resolution, and the other component is not resolution limited. The second component is due to the lifetime of the phonon $\tau$, which according to the energy-time uncertainty principle gives rise to an energy width of $\frac{\hbar}{\tau}$. The lifetime occurs because the phonons are scattered either by anharmonic processes or by electrons. The small magnitude of the width of the phonon peaks attests to the effectiveness of the harmonic approximation and the Born-Oppenheimer approximation.
15.4
Multi-Phonon Scattering
Processes in which two phonons are absorbed or emitted satisfy the two conservation laws
$$ \frac{\hbar^2 k^2}{2 m_n} = \frac{\hbar^2 k'^2}{2 m_n} \pm \hbar \omega_\alpha(q) \pm \hbar \omega_{\alpha'}(q') \tag{1765} $$
and
$$ k = k' \pm q \pm q' + Q \tag{1766} $$
Conservation of momentum can be used to express $q'$ in terms of $q$; this gives rise to the restriction
$$ \frac{\hbar^2 k^2}{2 m_n} = \frac{\hbar^2 k'^2}{2 m_n} \pm \hbar \omega_\alpha(q) \pm \hbar \omega_{\alpha'}(k - k' \pm q) \tag{1767} $$
There are six quantities, the components of $k'$ and $q$, which are undetermined. Even if the direction of $k'$ is fixed there still remain three unknown quantities, the components of $q$, which produce a continuously varying final neutron energy. Hence, one obtains a continuous spectrum. A similar analysis of the higher order multi-phonon processes also yields a continuous spectrum. Only the one phonon spectrum gives rise to a single peak. Thus, in a general scattering experiment, with a specific scattering direction, the analysis of the energy of the scattered neutrons provides a spectrum which contains a continuous portion with sharp peaks superimposed on it. The spectrum may show an elastic Bragg peak depending on the magnitude of $k$ and $\theta$, or if there are different isotopes one may observe incoherent nuclear scattering at zero energy transfer. The peaks of the one phonon scattering can be used to map out the dispersion relations. This has been performed for f.c.c. lead. However, some branches were not observed. The intensity of the one phonon absorption peak is proportional to the Bose-Einstein distribution function $N(\omega_\alpha(q))$, whereas the one phonon emission process has an intensity proportional to $[ N(\omega_\alpha(q)) + 1 ]$. Thus, it is usual to measure phonon emission at low temperatures.
——————————————————————————————————
15.4.1
Exercise 73
(i) Find a graphical description of the conservation laws for the phonon emission process. (ii) Show that there is a minimum or threshold energy required for phonon emission. ——————————————————————————————————
15.4.2
Exercise 74
(i) Evaluate the Debye-Waller factor for a one, two or three dimensional system of acoustic phonons. (ii) Determine the temperature dependence of the integrated intensity of the scattering cross-section, defined by
$$ I(q) = \int_{-\infty}^{+\infty} d\omega \; \frac{k'}{k} \, \frac{d^2\sigma}{d\omega \, d\Omega} \tag{1768} $$
——————————————————————————————————
15.4.3
Exercise 75
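Consider inelastic neutron scattering from a perfect fluid, described by the Hamiltonian
$$ \hat{H}_0 = \sum_i \frac{\hat{P}_i^2}{2 M} \tag{1769} $$
Show that the inelastic scattering cross-section is proportional to
$$ \frac{d^2\sigma}{d\omega \, d\Omega} \propto \left( \frac{\beta M}{2 \pi \hbar^2 q^2} \right)^{\frac{1}{2}} \exp\left[ - \frac{\beta M}{2 \hbar^2 q^2} \left( \hbar \omega - \frac{\hbar^2 q^2}{2 M} \right)^2 \right] \tag{1770} $$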
——————————————————————————————————
15.5
Raman and Brillouin Scattering of Light
Since the energy of visible light is of the order of eV and the energy of a typical phonon is of the order of meV ($10^{-3}$ eV), it is not possible to observe the phonons by direct absorption or emission of light. However, it is possible to observe the phonons in a solid via light scattering. Even though the scattering processes proceed via the same mechanism, the scattering from optical phonons is called Raman scattering and scattering from acoustic phonons is called Brillouin scattering. As in neutron scattering, the basic process may involve emission of phonons or absorption of phonons. The conservation laws for the one phonon absorption or emission process are conservation of energy
$$ \hbar \omega' = \hbar \omega \pm \hbar \omega_\alpha(q) \tag{1771} $$
and conservation of momentum
$$ \hbar k' n = \hbar k \, n \pm \hbar q + \hbar Q \tag{1772} $$
In these expressions $(k, \omega)$ and $(k', \omega')$ are, respectively, the momentum and energy of the incident beam of photons and of the scattered photons, and $n$ is the refractive index of the medium. The refractive index reflects the change in the wavelength of the light as it enters the solid. The phonon absorption (+) gives rise to the anti-Stokes shifted line, which has an intensity proportional to the number of activated phonons
$$ \propto N(\omega_\alpha(q)) \tag{1773} $$
The phonon emission (−) gives rise to the Stokes line with an intensity proportional to
$$ \propto [ 1 + N(\omega_\alpha(q)) ] \tag{1774} $$
which has contributions from spontaneous and stimulated emission. Since the phonon frequency is at most of the order of the Debye frequency, $\hbar \omega_D \sim 10^{-2}$ eV, which is small compared with a typical photon energy $\hbar c k n \sim 1$ eV, the change in the magnitude of the photon wave vector is small. Thus, as far as the scattering is concerned, the scattering triangle is almost isosceles. The momentum transfer $q$ is given by
$$ | q | = 2 n k \sin \frac{\theta}{2} = 2 n \frac{\omega}{c} \sin \frac{\theta}{2} \tag{1775} $$
Since the directions of $k$ and $k'$ are known from the experimental geometry, the direction of $q$ can be inferred, and the phonon energy follows if the small change in the photon energy is measured. For Brillouin scattering the phonon energy is given by
$$ \omega_\alpha(q) = v_\alpha \, q \tag{1776} $$
where $v_\alpha$ is the velocity of sound. The magnitude of the phonon's momentum is given by
$$ q = \frac{\omega_\alpha(q)}{v_\alpha(q)} = 2 \, \frac{\omega n}{c} \sin \frac{\theta}{2} \tag{1777} $$
However, the energy of the acoustic phonon is equal to the change in photon energy, $\Delta\omega$,
$$ \omega_\alpha(q) = \Delta\omega \tag{1778} $$
Thus, the velocity of the acoustic phonon is found as
$$ v_\alpha(q) = \frac{\Delta\omega}{2 \omega} \, \frac{c}{n} \csc \frac{\theta}{2} \tag{1779} $$
The experimentally determined spectrum has the form of a strong un-scattered laser line, accompanied by a small anti-Stokes line at higher frequencies and a slightly more intense Stokes line at lower frequencies. The Stokes and anti-Stokes lines are both separated from the main line by the same frequency shift $\Delta\omega$.
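The extraction of a sound velocity from a Brillouin shift can be sketched in a few lines. The example below assumes the relation $v = \frac{\Delta\omega}{2 \omega} \frac{c}{n} \csc \frac{\theta}{2}$; the numerical inputs (laser frequency, refractive index, scattering angle and sound velocity) are hypothetical values chosen only to show a round trip between the shift and the velocity.

```python
import math

C_LIGHT = 2.99792458e8   # speed of light in vacuum, m/s

def brillouin_shift(v_sound, omega, n, theta):
    """Frequency shift for an acoustic phonon: 2 (omega n / c) sin(theta/2) v."""
    return 2.0 * (omega * n / C_LIGHT) * math.sin(theta / 2.0) * v_sound

def sound_velocity(delta_omega, omega, n, theta):
    """Sound velocity from the measured shift: (d_omega / 2 omega)(c / n) csc(theta/2)."""
    return (delta_omega / (2.0 * omega)) * (C_LIGHT / n) / math.sin(theta / 2.0)

# Hypothetical inputs, for illustration only
omega_laser = 3.5e15      # angular frequency of the light, rad/s
n_index = 1.5             # refractive index of the medium
theta = math.pi / 2.0     # scattering angle
v_assumed = 5.0e3         # sound velocity, m/s

shift = brillouin_shift(v_assumed, omega_laser, n_index, theta)
print(f"relative shift = {shift / omega_laser:.3e}")
print(f"recovered v    = {sound_velocity(shift, omega_laser, n_index, theta):.1f} m/s")
```

The relative shift is of the order of $v / c$, which shows why a high resolution spectrometer is needed to resolve the Brillouin lines from the un-scattered laser line.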
16
Phonons in Metals
An alternate approach to the phonon dispersion in metals is based on a two component plasma composed of electrons and ions. The approach starts by considering a plasma composed of the positively charged ions, with charge $Z |e|$ and mass $M$. The plasma of ions supports longitudinal charge density oscillations which occur in the absence of an external potential. The total potential is related to the external scalar potential via
\[ \phi(q, \omega) = \frac{\phi_{ext}(q, \omega)}{\varepsilon(q, \omega)} \tag{1780} \]
so if $\phi_{ext}(q, \omega) = 0$ one must have $\varepsilon(q, \omega) = 0$ for $\phi(q, \omega) \neq 0$. In this case, one has spontaneous density fluctuations and an induced longitudinal current
\[ j_L(q, \omega) = \frac{q \omega}{4 \pi} \left[ \phi(q, \omega) - \phi_{ext}(q, \omega) \right] = \frac{q \omega}{4 \pi} \left[ 1 - \varepsilon(q, \omega) \right] \phi(q, \omega) \tag{1781} \]
where Poisson's equation and the continuity condition on the charge density have been used. Using the definition of the longitudinal conductivity one recovers the relation
\[ \varepsilon(q, \omega) = 1 - \frac{4 \pi \sigma(\omega)}{i \omega} \tag{1782} \]
This is combined with the Drude expression for the conductivity of a gas of ions of charge $Z |e|$ and mass $M$
\[ \sigma(\omega) = \frac{Z^2 e^2 \rho_{ions} \tau}{M} \frac{1}{1 - i \omega \tau} \tag{1783} \]
This yields the Drude model for the dielectric constant of the ions, which for $\omega \gg 1/\tau$ becomes
\[ \varepsilon(q, \omega) = 1 - \frac{4 \pi Z e^2 \rho}{M \omega^2} \tag{1784} \]
where the density of ions is given in terms of the electron density $\rho$ via
\[ \rho_{ions} = \frac{\rho}{Z} \tag{1785} \]
The condition for plasmon oscillations is given by
\[ \varepsilon(q, \omega) = 0 \tag{1786} \]
The ionic plasmon frequency $\Omega_p$ may be written in terms of the electronic plasmon frequency $\omega_p$ as
\[ \Omega_p^2 = \frac{Z m}{M} \omega_p^2 \tag{1787} \]
This corresponds to an unscreened phonon frequency. Since the factor $Z m / M \sim 1/4000$ and $\hbar \omega_p \sim 10$ eV, the unscreened phonon energy is approximately $\hbar \Omega_p \sim \frac{1}{10}$ eV.
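The order-of-magnitude estimate above can be checked with a short Python sketch of Eq. (1787). The function name `ionic_plasmon_energy` is a hypothetical helper introduced here; the inputs are the rough values quoted in the text.

```python
import math

def ionic_plasmon_energy(hbar_omega_p_eV, Zm_over_M):
    """Eq. (1787): hbar * Omega_p = sqrt(Z m / M) * hbar * omega_p (in eV)."""
    return math.sqrt(Zm_over_M) * hbar_omega_p_eV

# Values quoted in the text: Z m / M ~ 1/4000 and hbar * omega_p ~ 10 eV.
energy = ionic_plasmon_energy(10.0, 1.0 / 4000.0)
print(f"unscreened ionic plasmon energy ~ {energy:.2f} eV")
```

The result is of the order of a tenth of an electron volt, in line with the estimate in the text.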
16.1
Screened Ionic Plasmons
The above model is inadequate as it neglects the effects of the conduction electrons. The effect of the electrons can be included by screening the Coulomb interaction between the charged nuclei
\[ \frac{4 \pi Z^2 e^2}{q^2} \tag{1788} \]
with the dielectric constant of the electron gas. In the Thomas-Fermi approximation this is given by
\[ \varepsilon_{eg}(q, \omega) = 1 + \frac{k_{TF}^2}{q^2} \tag{1789} \]
Thus, within the Born-Oppenheimer approximation, one obtains the dielectric constant as
\[ \varepsilon(q, \omega) = 1 - \frac{4 \pi Z e^2 \rho}{M \left( 1 + \frac{k_{TF}^2}{q^2} \right) \omega^2} \tag{1790} \]
The screened ionic plasmons have frequencies given by
\[ \varepsilon(q, \omega) = 1 - \frac{4 \pi Z e^2 \rho}{M \left( 1 + \frac{k_{TF}^2}{q^2} \right) \omega^2} = 0 \tag{1791} \]
Thus,
\[ \omega^2 = \frac{Z m}{M} \omega_p^2 \frac{q^2}{q^2 + k_{TF}^2} \tag{1792} \]
This is the Bohm-Staver model of the phonon frequency. For small $q$, this model results in a linear dispersion relation $\omega(q) \approx v q$, where the velocity $v$ is given by
\[ v^2 = \frac{Z m}{M} \frac{\omega_p^2}{k_{TF}^2} \tag{1793} \]
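The two limits of the Bohm-Staver dispersion relation, Eq. (1792), can be checked numerically with the minimal Python sketch below. The function name `bohm_staver_omega` and the dimensionless sample values are assumptions made for illustration; `Omega_p` stands for the ionic plasmon frequency $\sqrt{Z m / M} \, \omega_p$.

```python
import math

def bohm_staver_omega(q, Omega_p, k_TF):
    """Eq. (1792): omega(q) = Omega_p * q / sqrt(q^2 + k_TF^2)."""
    return Omega_p * q / math.sqrt(q * q + k_TF * k_TF)

# Hypothetical units in which Omega_p = 2 and k_TF = 1.
Omega_p, k_TF = 2.0, 1.0
small_q, large_q = 1e-6, 1e6

# Small q: linear dispersion with sound velocity v = Omega_p / k_TF, Eq. (1793).
print(bohm_staver_omega(small_q, Omega_p, k_TF) / small_q)
# Large q: the frequency saturates at the unscreened value Omega_p.
print(bohm_staver_omega(large_q, Omega_p, k_TF))
```

The first printed ratio approaches $\Omega_p / k_{TF}$ and the second value approaches $\Omega_p$, which shows how screening turns the ionic plasmon into an acoustic branch at long wavelengths.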
As the Thomas-Fermi wave vector is given in terms of the Fermi wave vector by
\[ \frac{4 \pi e^2}{k_{TF}^2} \approx \frac{\hbar^2 \pi^2}{m k_F} \tag{1794} \]
and the electron density is given by
\[ \rho = \frac{k_F^3}{3 \pi^2} \tag{1795} \]
the velocity of sound is related to the Fermi-velocity $v_F = \frac{\hbar}{m} k_F$ via
\[ v^2 = \frac{1}{3} \frac{Z m}{M} v_F^2 \tag{1796} \]
Thus, the velocity of sound $v$ is reduced below the Fermi-velocity $v_F$ as $\frac{m}{M} \sim 10^{-3} - 10^{-5}$.
16.1.1
Kohn Anomalies
A more accurate treatment of the phonon frequency replaces the Thomas-Fermi dielectric constant with the Lindhard expression
\[ \varepsilon_{eg}(q, \omega) = 1 - \frac{4 \pi e^2}{q^2} \int \frac{d^3k}{4 \pi^3} \frac{f(E(k+q)) - f(E(k))}{E(k+q) - E(k) + \hbar \omega} \tag{1797} \]
where $f(x)$ is the Fermi-Dirac distribution function. This has singularities in the derivative at $q = 2 k_F$. These singularities correspond to the extremal diameters of the Fermi-surface. Walter Kohn showed that these singularities should appear in the phonon spectrum by producing kinks or infinities in the derivative
\[ \left. \frac{\partial \omega}{\partial q} \right|_{q = 2 k_F} \tag{1798} \]
16.2
Dielectric Constant of a Metal
The dielectric constant of a metal represents the process in which an external charge is screened by the combined effects of the electrons and the ions
\[ \phi_{ext}(q, \omega) = \phi(q, \omega) \, \varepsilon(q, \omega) \tag{1799} \]
A dielectric function can be defined for just the electrons, in which the total potential $\phi(q, \omega)$ is produced as the response to a total potential which is external to the electron gas. That is, the total external potential is considered to be the sum of the applied external potential and the potential due to the ion charge density
\[ \phi_{ext}(q, \omega) + \phi_{ions}(q, \omega) = \phi(q, \omega) \, \varepsilon_{el}(q, \omega) \tag{1800} \]
Analogously, a dielectric function can be defined for the ions as the response of the ions to an external potential composed of the applied potential and the potential due to the electrons
\[ \phi_{ext}(q, \omega) + \phi_{el}(q, \omega) = \phi(q, \omega) \, \varepsilon_{ions}(q, \omega) \tag{1801} \]
This goes beyond the Born-Oppenheimer approximation. The total potential is given by the sum of the potentials due to the external, electron and ion charges
\[ \phi(q, \omega) = \phi_{ext}(q, \omega) + \phi_{ions}(q, \omega) + \phi_{el}(q, \omega) \tag{1802} \]
The dielectric constant of the metal is given in terms of the dielectric constant of the electrons and the dielectric constant of the ions, by adding the two equations defining the electronic and ionic dielectric constants
\[ \left[ \varepsilon_{ions}(q, \omega) + \varepsilon_{el}(q, \omega) \right] \phi(q, \omega) = \phi(q, \omega) + \phi_{ext}(q, \omega) \tag{1803} \]
Then with the definition of the total dielectric constant one has the relation
\[ \varepsilon(q, \omega) = \varepsilon_{ions}(q, \omega) + \varepsilon_{el}(q, \omega) - 1 \tag{1804} \]
The dielectric constant of the ions goes beyond the Born-Oppenheimer approximation. It describes how the ions, alone, screen the potential due to the applied potential and the potential due to the electrons. The dielectric constant due to the ions alone is given by
\[ \varepsilon_{ions}(q, \omega) = 1 - \frac{\Omega_p^2}{\omega^2} \tag{1805} \]
and the electronic dielectric constant (at low frequencies) is given by the Thomas-Fermi approximation
\[ \varepsilon_{el}(q, \omega) = 1 + \frac{k_{TF}^2}{q^2} \tag{1806} \]
Hence, the low frequency dielectric constant is given by
\[ \varepsilon(q, \omega) = 1 + \frac{k_{TF}^2}{q^2} - \frac{\Omega_p^2}{\omega^2} \tag{1807} \]
for $\omega_p \gg \omega$. An alternate definition of the dielectric constant of the ions may be introduced, in which one considers the external potential to be first screened by the electron gas. Secondly, the resulting dressed external potential is screened by the ions. That is, instead of the electron gas screening both the potential of the ions and the applied potential
\[ \phi(q, \omega) = \frac{\phi_{ext}(q, \omega)}{\varepsilon_{el}(q, \omega)} + \frac{\phi_{ions}(q, \omega)}{\varepsilon_{el}(q, \omega)} \tag{1808} \]
one considers only the dressed external potential
\[ \phi_{dressed}(q, \omega) = \frac{\phi_{ext}(q, \omega)}{\varepsilon_{el}(q, \omega)} \tag{1809} \]
It is this dressed external potential that is screened by the ions to produce the total potential. This relation defines the dressed dielectric constant of the ions
\[ \phi(q, \omega) = \frac{\phi_{dressed}(q, \omega)}{\varepsilon^{dressed}_{ions}(q, \omega)} = \frac{\phi_{ext}(q, \omega)}{\varepsilon_{el}(q, \omega) \, \varepsilon^{dressed}_{ions}(q, \omega)} \tag{1810} \]
Hence, the electronic and dressed ionic dielectric constants are related to the dielectric constant via
\[ \varepsilon(q, \omega) = \varepsilon_{el}(q, \omega) \, \varepsilon^{dressed}_{ions}(q, \omega) \tag{1811} \]
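Combining this with the relation of the dielectric constant in terms of the dielectric constants of the electrons and ions
\[ \varepsilon(q, \omega) = \varepsilon_{ions}(q, \omega) + \varepsilon_{el}(q, \omega) - 1 \tag{1812} \]
the dressed ionic dielectric constant can be defined by
\[ \varepsilon^{dressed}_{ions}(q, \omega) = \frac{1}{\varepsilon_{el}(q, \omega)} \left[ \varepsilon_{el}(q, \omega) + \varepsilon_{ions}(q, \omega) - 1 \right] = 1 + \frac{1}{\varepsilon_{el}(q, \omega)} \left[ \varepsilon_{ions}(q, \omega) - 1 \right] \tag{1813} \]
The dressed dielectric constant is calculated as
\[ \varepsilon^{dressed}_{ions}(q, \omega) = 1 + \frac{\varepsilon_{ions}(q, \omega) - 1}{1 + \frac{k_{TF}^2}{q^2}} = 1 - \frac{1}{1 + \frac{k_{TF}^2}{q^2}} \frac{\Omega_p^2}{\omega^2} \tag{1814} \]
This can be written in terms of the phonon dispersion relation $\omega(q)$ as
\[ \varepsilon^{dressed}_{ions}(q, \omega) = 1 - \frac{\omega(q)^2}{\omega^2} \tag{1815} \]
since the phonon oscillations occur when the dielectric constant vanishes
\[ \varepsilon^{dressed}_{ions}(q, \omega(q)) = 0 \tag{1816} \]
By inspection of the dressed dielectric constant the phonon frequency is found as
\[ \omega(q)^2 = \frac{q^2}{q^2 + k_{TF}^2} \Omega_p^2 \tag{1817} \]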
The introduction of screening by the electron gas has reduced the frequency of the ionic density oscillations from the ionic plasmon frequency to a branch of longitudinal acoustic phonons. The total dielectric constant, which is a product of the dressed dielectric constant and the Thomas-Fermi dielectric constant of the electron gas, can now be written in terms of the phonon frequencies as
\[ \frac{1}{\varepsilon(q, \omega)} = \frac{1}{1 + \frac{k_{TF}^2}{q^2}} \, \frac{1}{1 - \frac{\omega(q)^2}{\omega^2}} = \frac{1}{1 + \frac{k_{TF}^2}{q^2}} \, \frac{\omega^2}{\omega^2 - \omega(q)^2} \tag{1818} \]
This is in agreement with the expression discussed earlier.
16.3
The Retarded Electron-Electron Interaction
Consider the screening of the Coulomb interaction between a pair of electrons via the dielectric constant
\[ \frac{4 \pi}{q^2} \rightarrow \frac{4 \pi}{\varepsilon(q, \omega) \, q^2} = \frac{4 \pi}{k_{TF}^2 + q^2} \left[ 1 + \frac{\omega(q)^2}{\omega^2 - \omega(q)^2} \right] \tag{1819} \]
Thus, there is an additional contribution to the effective interaction due to the screening by the ions. The $\omega$ dependence of the interaction indicates that the effective interaction is not instantaneous but instead is a retarded interaction. The effective interaction between a pair of electrons involves a momentum transfer $q = k - k'$ and an energy transfer $\hbar \omega = E(k) - E(k')$. The effective interaction has the following limits.
(i) The interaction reduces to the Thomas-Fermi screened electron-electron interaction when the electron energy transfer is greater than the typical phonon frequency $\omega_D \sim \Omega_p$. In this case, when $\omega > \omega_D$, the phonon correction is unimportant.
(ii) The electron-electron interaction is strongly modified at low frequencies, where $\omega < \omega_D$. The contribution from the phonons is large and of opposite sign to the direct Coulomb repulsion, and exactly cancels it at $\omega = 0$. The important point, however, is that the retarded interaction is attractive at low frequencies. It exhibits the phenomenon of over-screening and can give rise to superconductivity.
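The limits (i) and (ii) follow from the frequency-dependent factor in Eq. (1819). The Python sketch below evaluates that factor; the function name `retardation_factor` is a hypothetical label for the bracket $1 + \omega(q)^2 / (\omega^2 - \omega(q)^2) = \omega^2 / (\omega^2 - \omega(q)^2)$, and the sample frequencies are arbitrary.

```python
def retardation_factor(omega, omega_q):
    """Bracket of Eq. (1819): omega^2 / (omega^2 - omega_q^2).

    It multiplies the Thomas-Fermi screened interaction 4 pi / (k_TF^2 + q^2).
    The factor diverges at omega == omega_q, which is not handled here.
    """
    return omega**2 / (omega**2 - omega_q**2)

omega_q = 1.0  # phonon frequency, used as the unit of frequency
for omega in (0.0, 0.5, 0.9, 2.0, 100.0):
    print(omega, retardation_factor(omega, omega_q))
```

The factor vanishes at $\omega = 0$, is negative (attractive) for $0 < \omega < \omega(q)$, and tends to unity for $\omega \gg \omega(q)$.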
16.4
Phonon Renormalization of Quasi-Particles
The electron-phonon interaction can give rise to a change in the quasi-particle dispersion relation. The Hartree-Fock contribution to the quasi-particle energy from the screened electron-electron interaction is
\begin{align}
\Delta E(k) &= \sum_{k'} f(E(k')) \, \langle k k' | \frac{e^2}{| r - r' |} | k k' \rangle \nonumber \\
&= \sum_{k'} f(E(k')) \int d^3r \int d^3r' \, \frac{e^2}{| r - r' |} \left[ 1 - \exp\left[ i ( k - k' ) \cdot ( r - r' ) \right] \right] \nonumber \\
&= \Delta E_H - \sum_{k'} f(E(k')) \frac{4 \pi e^2}{| k - k' |^2} \tag{1820}
\end{align}
The first term is the Hartree term, which is $k$ independent and can be absorbed into a shift of the chemical potential, and the second term is the exchange term, which depends on $k$. The exchange term affects the quasi-particle dispersion relation. If the effect of phonon screening is included, the exchange term becomes
\[ - \sum_{k'} f(E(k')) \frac{4 \pi e^2}{| k - k' |^2 + k_{TF}^2} \left[ 1 + \frac{\hbar^2 \omega(k - k')^2}{( E(k) - E(k') )^2 - \hbar^2 \omega(k - k')^2} \right] \tag{1821} \]
where the exchange interaction is Thomas-Fermi screened, and there is also a phonon contribution. On utilizing the smallness of the Debye frequency with respect to the Fermi-energy, and integrating over the magnitude of $k'$, one can show that the change in energy due to the electron-phonon interaction is given by
\[ = - \int \frac{d^2S'}{8 \pi^3} \frac{1}{\hbar v(k')} \frac{4 \pi e^2}{| k - k' |^2 + k_{TF}^2} \, \hbar \omega(k - k') \, \ln \left| \frac{\mu - E(k) - \hbar \omega(k - k')}{\mu - E(k) + \hbar \omega(k - k')} \right| \tag{1822} \]
where $k'$ lies on the Fermi-surface. Substitution of $E(k) = \mu$ immediately demonstrates that the value of the Fermi-energy $\mu$ and the shape of the Fermi-surface are unaltered by the coupling to the phonons; in the approximation under consideration they are given by the Thomas-Fermi quasi-particle theory. Secondly, when the quasi-particle energy is within $\hbar \omega_D$ of $\mu$, $| E(k) - \mu | < \hbar \omega_D$, the logarithmic term can be expanded in inverse powers of $\hbar \omega$. Then it is seen that the phonon contribution to the screening produces a change in the dispersion relation
\[ E(k) - \mu = \frac{E^{TF}(k) - \mu}{1 + \lambda} \tag{1823} \]
where $\lambda$ is the wave function renormalization due to the phonons and is given by
\[ \lambda = \int \frac{d^2S'}{8 \pi^3} \frac{1}{\hbar v(k')} \frac{4 \pi e^2}{| k - k' |^2 + k_{TF}^2} \tag{1824} \]
This has the result that the quasi-particle velocity is given by
\[ v(k) = \frac{1}{\hbar} \nabla E(k) = \frac{1}{1 + \lambda} \frac{1}{\hbar} \nabla E^{TF}(k) \tag{1825} \]
Thus, the quasi-particle contribution to the density of states is enhanced by a factor of $1 + \lambda$
\[ \rho(\mu) = ( 1 + \lambda ) \, \rho_{TF}(\mu) \tag{1826} \]
The coupling can be estimated via
\[ \lambda < \frac{4 \pi e^2}{k_{TF}^2} \int \frac{d^2S'}{8 \pi^3} \frac{1}{\hbar v(k')} \tag{1827} \]
but
\[ \frac{4 \pi e^2}{k_{TF}^2} = \left( \frac{\partial \rho}{\partial \mu} \right)^{-1} = \frac{1}{\rho(\mu)} = \left[ \int \frac{d^2S'}{4 \pi^3 \hbar v(k')} \right]^{-1} \tag{1828} \]
Hence, the phonon renormalization factor is usually less than unity
\[ \lambda < 1 \tag{1829} \]
Finally, the phonon corrections are negligible for electron energies far from the Fermi-energy. For example, when
\[ | E(k) - \mu | > \hbar \omega_D \tag{1830} \]
then the dispersion relation suffers only small corrections
\[ E(k) - \mu = E^{TF}(k) - \mu + O \left( \frac{\hbar \omega_D}{E(k) - \mu} \right)^2 \tag{1831} \]
Thus, there has to be a kink in the quasi-particle dispersion relation near the Fermi-energy.
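The consequences of Eqs. (1823), (1825) and (1826) can be summarized in a few lines of Python. The helper names and the sample value $\lambda = 0.5$ are hypothetical; the sketch only restates that the velocity is reduced and the density of states is enhanced by the same factor $1 + \lambda$ close to the Fermi-energy.

```python
def renormalized_excitation_energy(e_tf_minus_mu, lam):
    """Eq. (1823): E(k) - mu = (E_TF(k) - mu) / (1 + lambda), valid near mu."""
    return e_tf_minus_mu / (1.0 + lam)

def velocity_ratio(lam):
    """Eq. (1825): v(k) / v_TF(k) = 1 / (1 + lambda)."""
    return 1.0 / (1.0 + lam)

def density_of_states_ratio(lam):
    """Eq. (1826): rho(mu) / rho_TF(mu) = 1 + lambda."""
    return 1.0 + lam

lam = 0.5  # hypothetical coupling, consistent with the bound lambda < 1
print(velocity_ratio(lam), density_of_states_ratio(lam))
```

The product of the two ratios is unity, as expected for a density of states that scales inversely with the quasi-particle velocity.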
16.5
Electron-Phonon Interactions
The effect of coupling with the phonons on the quasi-particle spectrum can be used to deduce the form of the electron-phonon interaction. The change in the ground state energy of a metal due to the electron-phonon interaction, $\hat{H}_{int}$, can be estimated from second order perturbation theory as
\[ \Delta^2 E = \sum_m \frac{| \langle \Psi_0 | \hat{H}_{int} | \Psi_m \rangle |^2}{E_0 - E_m} \tag{1832} \]
It is assumed that the form of the electron-phonon interaction is dominated by the first non-trivial term in the expansion of the potential acting on the electrons in powers of the displacements of the ions
\[ \hat{H}_{int} = \sum_i \hat{u}_i \cdot \nabla_{R_i} V_{ions}(r) \tag{1833} \]
Thus, the most important excitation process comes from excited states $| \Psi_m \rangle$ in which an electron has been scattered from state $k$ to $k - q$ and, in addition, a phonon of wave vector $q$ has been excited. Hence,
\[ E_m - E_0 = E(k - q) + \hbar \omega(q) - E(k) \tag{1834} \]
Thus, one can express the second order correction to the ground state energy in a phenomenological manner as
\[ \Delta^2 E = - \sum_{k, q} f(E(k)) \, ( 1 - f(E(k - q)) ) \, \frac{| \lambda_q |^2}{E(k - q) + \hbar \omega(q) - E(k)} \tag{1835} \]
where $f(x)$ is the Fermi-function. One can identify an effective electron-electron interaction due to the phonons from the functional derivative of the energy with respect to the Fermi-functions
\[ V_{eff}(q) = \frac{\delta^2 \Delta^2 E}{\delta f(E(k)) \, \delta f(E(k - q))} \tag{1836} \]
Hence,
\begin{align}
V_{eff}(q) &= - \frac{| \lambda_q |^2}{E(k) - E(k - q) - \hbar \omega(q)} - \frac{| \lambda_q |^2}{E(k - q) - E(k) - \hbar \omega(q)} \nonumber \\
&= | \lambda_q |^2 \left[ \frac{2 \hbar \omega(q)}{\hbar^2 \omega(q)^2 - ( E(k) - E(k - q) )^2} \right] \tag{1837}
\end{align}
On identifying the above effective potential with the phonon contribution to the screened interaction between the electrons, one obtains an expression for the effective coupling constant $| \lambda_q |^2$ as
\[ | \lambda_q |^2 = \frac{1}{V} \frac{4 \pi e^2}{q^2 + k_{TF}^2} \frac{\hbar \omega(q)}{2} \tag{1838} \]
For small $q$ the coupling constant vanishes linearly with $q$. Since
\[ \frac{4 \pi e^2}{k_{TF}^2} = \frac{2}{3} \frac{\mu}{\rho} \tag{1839} \]
for $q < k_{TF}$ the coupling constant varies as
\[ | \lambda_q |^2 = \frac{\mu}{3 \rho V} \hbar \omega(q) = \frac{\mu}{3 N Z} \hbar \omega(q) \tag{1840} \]
16.6
Electrical Resistivity due to Phonon Scattering
The electron-phonon scattering contributes to the electrical resistivity. The phonon gas acts as a source or sink for the electron momentum and, thus, the interactions with the electron gas reduce the current flow and hence increase the resistivity. The electron-ion interaction is given by
\[ \hat{H}_{ions} = \sum_R V(r - R) \tag{1841} \]
and the position of the $i$-th ion can be written in terms of the equilibrium position and a displacement
\[ R = R_i + u_i \tag{1842} \]
The potential of the ions is expanded up to linear order in the lattice displacements $u_i$
\[ \hat{H}_{ions} = \sum_i \left[ V(r - R_i) - u_i \cdot \nabla_R V(r - R_i) + \ldots \right] \tag{1843} \]
The first term represents the static lattice and the second term is the electron-phonon interaction. The electron-phonon interaction is given by
\[ \hat{H}_{int} = - \sum_i u_i \cdot \nabla_R V(r - R_i) \tag{1844} \]
Thus, the interaction produces scattering of the electrons between Bloch states and, through $u_i$, involves the absorption or emission of phonons. The condition of conservation of energy yields the selection rule
\[ E(k) = E(k') \pm \hbar \omega(k - k') \tag{1845} \]
This single restriction leads to a two-dimensional surface of Bloch state wave vectors $k'$ that are allowed final states for the electron initially in Bloch state $k$. The momentum transfer for these processes is given by $q = k - k'$. The surface of allowed final states must be close to the surface of initial energy as $\hbar \omega \ll \mu$, hence, $E(k) \sim E(k - q)$. The scattering rate out of the state with momentum $k$ is given by
\begin{align}
\frac{1}{\tau(k \rightarrow k')} = \frac{2 \pi}{\hbar} \sum_\alpha | \lambda^\alpha_q |^2 & f(E(k)) \left[ 1 - f(E(k + q)) \right] \nonumber \\
\times \Big[ & N(\omega_\alpha(q)) \, \delta\left( E(k) - E(k + q) + \hbar \omega_\alpha(q) \right) \nonumber \\
& + \left( 1 + N(\omega_\alpha(q)) \right) \delta\left( E(k) - E(k + q) - \hbar \omega_\alpha(q) \right) \Big] \tag{1846}
\end{align}
The rate for scattering into the momentum state $k$ is given by
\begin{align}
\frac{1}{\tau(k' \rightarrow k)} = \frac{2 \pi}{\hbar} \sum_\alpha | \lambda^\alpha_q |^2 & f(E(k + q)) \left[ 1 - f(E(k)) \right] \nonumber \\
\times \Big[ & N(\omega_\alpha(q)) \, \delta\left( E(k + q) - E(k) + \hbar \omega_\alpha(q) \right) \nonumber \\
& + \left( 1 + N(\omega_\alpha(q)) \right) \delta\left( E(k + q) - E(k) - \hbar \omega_\alpha(q) \right) \Big] \tag{1847}
\end{align}
The transport scattering rate, which is the rate for momentum change of an electron at the Fermi-surface, is defined by
\[ \frac{1}{\tau} ( k \cdot E ) \, f(E(k)) \left[ 1 - f(E(k)) \right] = \sum_{k'} \left[ ( k \cdot E ) \frac{1}{\tau(k \rightarrow k')} - ( k' \cdot E ) \frac{1}{\tau(k' \rightarrow k)} \right] \tag{1848} \]
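The rate for scattering out of state $k$ will be transformed into a form comparable to the rate for scattering in. The rate for scattering out of momentum state $k$ is re-written as
\begin{align}
= \frac{2 \pi}{\hbar} \sum_\alpha | \lambda^\alpha_q |^2 & f(E(k)) \left[ 1 - f(E(k + q)) \right] \exp\left[ \beta ( E(k) - E(k + q) ) \right] \nonumber \\
\times \Big[ & \left( 1 + N(\omega_\alpha(q)) \right) \delta\left( E(k) - E(k + q) + \hbar \omega_\alpha(q) \right) \nonumber \\
& + N(\omega_\alpha(q)) \, \delta\left( E(k) - E(k + q) - \hbar \omega_\alpha(q) \right) \Big] \nonumber \\
= \frac{2 \pi}{\hbar} \sum_\alpha | \lambda^\alpha_q |^2 & f(E(k + q)) \left[ 1 - f(E(k)) \right] \nonumber \\
\times \Big[ & \left( 1 + N(\omega_\alpha(q)) \right) \delta\left( E(k) - E(k + q) + \hbar \omega_\alpha(q) \right) \nonumber \\
& + N(\omega_\alpha(q)) \, \delta\left( E(k) - E(k + q) - \hbar \omega_\alpha(q) \right) \Big] \tag{1849}
\end{align}
Thus, the transport scattering rate can be expressed as
\begin{align}
\frac{1}{\tau} ( k \cdot E ) \, f(E(k)) & \left[ 1 - f(E(k)) \right] \tag{1850} \\
= \frac{2 \pi}{\hbar} \sum_{\alpha, q} ( q \cdot E ) \, | \lambda^\alpha_q |^2 & f(E(k + q)) \left[ 1 - f(E(k)) \right] \nonumber \\
\times \Big[ & \left( 1 + N(\omega_\alpha(q)) \right) \delta\left( E(k) - E(k + q) + \hbar \omega_\alpha(q) \right) \nonumber \\
& + N(\omega_\alpha(q)) \, \delta\left( E(k) - E(k + q) - \hbar \omega_\alpha(q) \right) \Big] \tag{1851}
\end{align}
Furthermore, as
\begin{align}
& f(E(k + q)) \left[ 1 - f(E(k)) \right] \left( 1 + N(\omega_\alpha(q)) \right) \delta\left( E(k) - E(k + q) + \hbar \omega_\alpha(q) \right) \nonumber \\
& = f(E(k)) \left[ 1 - f(E(k + q)) \right] N(\omega_\alpha(q)) \, \delta\left( E(k) - E(k + q) + \hbar \omega_\alpha(q) \right) \tag{1852}
\end{align}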
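the scattering rate can be expressed as
\begin{align}
\frac{1}{\tau} ( k \cdot E ) \, f(E(k)) & \left[ 1 - f(E(k)) \right] \nonumber \\
= \frac{2 \pi}{\hbar} \sum_{\alpha, q} ( q \cdot E ) \, | \lambda^\alpha_q |^2 & N(\omega_\alpha(q)) \nonumber \\
\times \Big[ & f(E(k)) \left[ 1 - f(E(k + q)) \right] \delta\left( E(k) - E(k + q) + \hbar \omega_\alpha(q) \right) \nonumber \\
& + f(E(k + q)) \left[ 1 - f(E(k)) \right] \delta\left( E(k) - E(k + q) - \hbar \omega_\alpha(q) \right) \Big] \nonumber \\
= \frac{2 \pi}{\hbar} \sum_{\alpha, q} ( q \cdot E ) \, | \lambda^\alpha_q |^2 & N(\omega_\alpha(q)) \, N(- \omega_\alpha(q)) \nonumber \\
\times \Big[ & \left( f(E(k + q)) - f(E(k)) \right) \delta\left( E(k) - E(k + q) + \hbar \omega_\alpha(q) \right) \nonumber \\
& + \left( f(E(k)) - f(E(k + q)) \right) \delta\left( E(k) - E(k + q) - \hbar \omega_\alpha(q) \right) \Big] \nonumber \\
= \frac{2 \pi}{\hbar} \sum_{\alpha, q} ( q \cdot E ) \, | \lambda^\alpha_q |^2 & N(\omega_\alpha(q)) \, N(- \omega_\alpha(q)) \nonumber \\
\times \Big[ & \left( f(E(k) + \hbar \omega_\alpha(q)) - f(E(k)) \right) \delta\left( E(k) - E(k + q) + \hbar \omega_\alpha(q) \right) \nonumber \\
& + \left( f(E(k)) - f(E(k) - \hbar \omega_\alpha(q)) \right) \delta\left( E(k) - E(k + q) - \hbar \omega_\alpha(q) \right) \Big] \tag{1853}
\end{align}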
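The summation over $q$ is evaluated by transforming it into an integral
\begin{align}
= \frac{2 \pi}{\hbar} \sum_{\alpha, q} ( q \cdot E ) \, | \lambda^\alpha_q |^2 & N(\omega_\alpha(q)) \, N(- \omega_\alpha(q)) \nonumber \\
\times \Big[ & \left( f(E(k) + \hbar \omega_\alpha(q)) - f(E(k)) \right) \delta\left( \frac{\hbar^2}{m} ( k \cdot q ) + \frac{\hbar^2}{2 m} q^2 - \hbar \omega_\alpha(q) \right) \nonumber \\
& + \left( f(E(k)) - f(E(k) - \hbar \omega_\alpha(q)) \right) \delta\left( \frac{\hbar^2}{m} ( k \cdot q ) + \frac{\hbar^2}{2 m} q^2 + \hbar \omega_\alpha(q) \right) \Big] \tag{1854}
\end{align}
The integration over the direction of $q$ is performed in spherical polar coordinates, in which the direction of $k$ is fixed as the polar axis. The integral over the azimuthal angle results in the terms proportional to $\sin\varphi$ and $\cos\varphi$ in
\[ ( q \cdot E ) = q \cos\theta \, E_z + q \sin\theta \, ( \sin\varphi \, E_y + \cos\varphi \, E_x ) \tag{1855} \]
vanishing. The sole surviving term, proportional to $E_z$, can then be written in a manner independent of the choice of axis as $k \cdot E$, which can be factored out of the integral
\begin{align}
= \frac{2 \pi}{\hbar} \left( \frac{2 m \pi}{\hbar^2 k^2} \right) ( k \cdot E ) \sum_\alpha \int dq \, q^2 \, | \lambda^\alpha_q |^2 & N(\omega_\alpha(q)) \, N(- \omega_\alpha(q)) \nonumber \\
\times \Big[ \left( f(E(k) + \hbar \omega_\alpha(q)) - f(E(k)) \right) & \int_{-1}^{1} d\cos\theta \, \cos\theta \, \delta\left( \cos\theta + \frac{q}{2 k} - \frac{m \omega_\alpha(q)}{\hbar k q} \right) \nonumber \\
+ \left( f(E(k)) - f(E(k) - \hbar \omega_\alpha(q)) \right) & \int_{-1}^{1} d\cos\theta \, \cos\theta \, \delta\left( \cos\theta + \frac{q}{2 k} + \frac{m \omega_\alpha(q)}{\hbar k q} \right) \Big] \tag{1856}
\end{align}
On neglecting the term of order $v_\alpha / v_F$, cancelling the factors of $( k \cdot E )$, and Taylor expanding the Fermi-function factors in powers of the phonon frequencies, one finds that the transport scattering rate for electrons on the Fermi-surface is given by
\[ \frac{1}{\tau} f(E(k)) \left[ 1 - f(E(k)) \right] = - \frac{2 \pi}{\hbar} \left( \frac{2 m \pi}{\hbar^2 k^3} \right) \frac{\partial f(E(k))}{\partial E(k)} \sum_\alpha \int dq \, q^3 \, | \lambda^\alpha_q |^2 \, N(\omega_\alpha(q)) \, N(- \omega_\alpha(q)) \, \hbar \omega_\alpha(q) \tag{1857} \]
On using
\[ \frac{\partial f(E(k))}{\partial E(k)} = - \beta \, f(E(k)) \left[ 1 - f(E(k)) \right] \tag{1858} \]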
one finds
\[ \frac{1}{\tau} = \beta \left( \frac{4 m \pi^2}{\hbar^3 k^3} \right) \sum_\alpha \int dq \, q^3 \, | \lambda^\alpha_q |^2 \, N(\omega_\alpha(q)) \, N(- \omega_\alpha(q)) \, \hbar \omega_\alpha(q) \tag{1859} \]
The temperature dependence of the transport scattering rate can be evaluated using the Debye model for the phonons, and using a linear $q$ dependence of $| \lambda^\alpha_q |^2$. The integral over $q$ is evaluated through the substitution $z = \beta \hbar \omega_\alpha(q)$ and $\omega_\alpha(q) = v_\alpha q$ to yield
\[ \frac{1}{\tau} \propto T^5 \int_0^{T_D / T} dz \, \frac{z^5}{( \exp[z] - 1 ) ( 1 - \exp[-z] )} \tag{1860} \]
At low temperatures, the number of thermally excited phonons is proportional to $T^3$. One would therefore expect that the scattering rate would be proportional to
\[ \frac{1}{\tau} \propto \int dq \, q^2 \, N(\omega(q)) \propto T^3 \tag{1861} \]
However, as forward scattering is ineffective in transport properties, the transport scattering rate is proportional to the change in momentum along the direction of the electric field and, therefore, is proportional to
\[ ( 1 - \cos\theta ) = 2 \sin^2\frac{\theta}{2} \approx \frac{1}{2} \frac{q^2}{k_F^2} \tag{1862} \]
which produces an additional $T^2$ dependence. For low temperatures ( $T < T_D$ ), the upper limit on the integration may be set to infinity, yielding
\[ \sigma(T)^{-1} \propto \left( \frac{T}{T_D} \right)^5 \tag{1863} \]
Thus, the combined effect of the factor $( 1 - \cos\theta )$ and $\frac{1}{\tau} \propto T^3$ produces a $T^5$ temperature dependence in the low-temperature resistivity. At high temperatures ( $T > T_D$ ), the range of integration is less than unity, so the integrand may be expanded in powers of $z$, yielding
\[ \sigma(T)^{-1} \propto \left( \frac{T}{T_D} \right)^5 \int_0^{T_D / T} dz \, z^3 \propto \frac{T}{T_D} \tag{1864} \]
which is the result for the classical limit of the scattering. This can be considered to arise merely because the number of thermally activated phonons is given by the classical expression
\[ N(\omega_\alpha(q)) = \frac{k_B T}{\hbar \omega_\alpha(q)} \tag{1865} \]
The above results were first derived independently by Bloch and Grüneisen, and the resulting formula is known as the Bloch-Grüneisen resistivity due to phonon scattering.
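The crossover between the $T^5$ and the linear regime can be made concrete by evaluating the integral of Eq. (1860) numerically. The Python sketch below uses a composite Simpson rule written out by hand; the function names, the number of Simpson intervals, and the cutoff of the upper limit at $z = 60$ (beyond which the integrand is negligible) are choices made here for illustration and are not part of the text.

```python
import math

def integrand(z):
    """z^5 / ((e^z - 1)(1 - e^-z)); behaves as z^3 for small z."""
    if z == 0.0:
        return 0.0
    return z**5 / (math.expm1(z) * (-math.expm1(-z)))

def bloch_gruneisen_integral(x, n=2000):
    """Composite Simpson estimate of the integral in Eq. (1860) from 0 to x = T_D / T.

    n must be even. The upper limit is capped at 60 to avoid overflow in exp.
    """
    x = min(x, 60.0)
    h = x / n
    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def resistivity_shape(t):
    """Dimensionless temperature dependence t^5 * J(1/t), with t = T / T_D."""
    return t**5 * bloch_gruneisen_integral(1.0 / t)

for t in (0.02, 0.05, 0.5, 1.0, 2.0, 4.0):
    print(t, resistivity_shape(t))
```

For $T \ll T_D$ the integral saturates at $5! \, \zeta(5) \approx 124.4$, so the shape follows $T^5$, while for $T \gg T_D$ the integral approaches $(T_D / T)^4 / 4$ and the shape becomes linear in $T$.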
16.6.1
Umklapp Scattering
Umklapp processes may change the leading low-temperature variation of the resistivity. Umklapp scattering circumvents the factor of $( 1 - \cos\theta )$ which produces the extra $T^2$ factor. When $k_F$ is close to the zone boundary, a small $q$ value may couple the sheets of the Fermi-surface in neighboring Brillouin zones. These are the umklapp processes. They produce a large change in the electron velocity $\Delta v$ by a phonon-induced Bragg reflection.
16.6.2
Phonon Drag
The resistivity could decrease faster than $T^5$ if the system were relatively free of defects and umklapp scattering could be neglected. This would occur if the phonons were allowed to equilibrate with the electronic system in its steady state. The combined system of electrons and phonons would then have a total momentum which is conserved in collisions. As a result, the phonon system would not be able to remove momentum ( or current ) from the electron system, as they drift together.
17
Phonons in Semiconductors
17.1
Resistivity due to Phonon Scattering
The transport scattering rate in a semiconductor can be obtained from the collision integral of the Boltzmann equation
\begin{align}
I\left[ f(k) \right] = \sum_q \lambda_q^2 \Big[ & f(k) \, ( 1 - f(k + q) ) \, N(\omega(q)) \, \delta( E(k) - E(k + q) + \hbar \omega(q) ) \nonumber \\
& - ( 1 - f(k) ) \, f(k + q) \, ( 1 + N(\omega(q)) ) \, \delta( E(k + q) - E(k) - \hbar \omega(q) ) \Big] \tag{1866}
\end{align}
in which $f(k)$ is the non-equilibrium distribution function. Linearizing about the equilibrium Fermi-distribution function
\[ f(k) = f_0(k) + A \, ( k \cdot E ) \, \frac{\partial f_0(k)}{\partial E} \tag{1867} \]
yields the linearized collision integral. Using the identity
\[ ( 1 - f(E(k)) ) \, f(E(k) + \hbar \omega(q)) = \frac{f(E(k)) - f(E(k) + \hbar \omega(q))}{\exp\left[ \beta \hbar \omega(q) \right] - 1} \]
(1868) one finds the result Z 2k 2mAV dq 2 I f (k) = 2 exp − β ( E(k) − µ ) q Ez λ2q N (ω(q)) N (−ω(q)) 2π h kB T k ¯ 0 " m ω(q) q × + 1 − exp − β ¯hω(q) 2k ¯h k q # m ω(q) q − − 1 − exp + β ¯hω(q) 2k ¯h k q (1869) For low frequency acoustic phonons, the Bose-Einstein distribution can be approximated by its high temperature form leading to the collision integral Z 2k q ∂f (k) m V dq 2 2 − q λq I f (k) = A ( k . E ) − 2 β ¯h ω(q) ∂E k3 2π 0 (1870) The transport scattering rate is found by factoring out the non-equilibrium part of the distribution function Z 2k 1 mV dq 3 N = k T q q 2 | V (q) |2 B 2 τ (E) k3 4 π 2 M ω(q) 0 410
=
N V m k | V (0) |2 kB T 4 π c2 M
(1871)
Hence, the conductivity in a semiconductor, in which the scattering rate is dominated by phonon scattering, is given by
\[ \sigma_{x,x} \sim \beta^2 \exp\left[ \beta \mu \right] \int dk \, k^3 \exp\left[ - \beta E(k) \right] \tag{1872} \]
Thus, the conductivity has a temperature dependence given by
\[ \sigma_{x,x} \sim \exp\left[ \beta \mu \right] \tag{1873} \]
Thus, as expected, the conductivity is still dominated by the number of carriers, but the conductivity has an additional temperature dependence of $T^{-\frac{3}{2}}$ above and beyond the prefactor in the number of carriers.
17.2
Polarons
Electron-phonon coupling in semiconductors can be large. For a low density of carriers, each carrier can cause a distortion of the lattice. The carrier and the surrounding distortion form an excitation which is known as a polaron. At low temperatures the polaron appears to have a large effective mass, as the motion of the carrier is hindered by the need to drag the surrounding lattice distortion. Thus, there is a low-temperature regime in which the conductivity is governed by the motion of heavy quasi-particles with an extremely large and temperature dependent effective mass. At high temperatures, the conductivity is dominated by incoherent hopping processes, which are thermally assisted by the presence of a thermal population of phonons.
17.3
Indirect Transitions
In a semiconductor, light can be absorbed in processes whereby an electron is excited from the filled valence band into the empty conduction band. The minimum energy of the photon must be greater than the band gap between the conduction and valence band density of states. Since the speed of light $c$ is so large, the wavelength of the photon absorbed in a transition between two states with an energy difference on the scale of eV is extremely long. Thus, the momentum of the photon is negligible on the scale of the size of the Brillouin zone. This means that in a semiconductor, if only a photon and an electron are involved, momentum conservation only allows transitions in which the initial and final states of the electron have the same $k$ value. This type of transition is called a direct transition.

In some semiconductors the minimum of the conduction band dispersion relation lies vertically above the maximum of the valence band, and the band gap is called a direct gap. In this case, the threshold for direct absorption should coincide with the gap observed in the density of states. On the other hand, if the maximum of the valence band and the minimum of the conduction band dispersion relations occur at different $k$ values, then the threshold energy for the absorption of a photon in a direct transition should be greater than the separation inferred from considering the density of states alone. This second type of semiconductor has two gaps: the indirect band gap inferred from the density of states, and a direct gap inferred for $q = 0$ transitions by consideration of the dispersion relations.

If the ions of the lattice are displaced from their equilibrium positions, simple conservation of momentum arguments do not apply. In this case, it is possible to have absorption at the indirect gap. At the threshold for indirect transitions, the absorption process involves the absorption or emission of a phonon with wave vector equal to the wave vector $Q$ separating the valence band maximum from the conduction band minimum. The transition rate has to be calculated via second order perturbation theory: one power of the interaction involves the absorption of one photon and the other power of the interaction involves either the absorption or the emission of one phonon. The state of the joint system composed of an electron, phonons of wave vector $Q$ and photons of frequency $\omega$ is denoted by $| \Psi \rangle$. This state satisfies the equation of motion
\[ i \hbar \frac{\partial}{\partial t} | \Psi \rangle = \left( \hat{H}_0 + \hat{H}_{int} \right) | \Psi \rangle \tag{1874} \]
The state is decomposed in terms of the eigenstates of $\hat{H}_0$, $| \phi_n \rangle$, with energies $E_n$
\[ | \Psi \rangle = \sum_n C_n(t) \exp\left[ - i \frac{E_n}{\hbar} t \right] | \phi_n \rangle \tag{1875} \]
Then, one finds that the expansion coefficients $C_n(t)$ satisfy the equation
\[ i \hbar \frac{\partial}{\partial t} C_n(t) = \sum_m \langle \phi_n | \hat{H}_{int} | \phi_m \rangle \exp\left[ i \frac{E_n - E_m}{\hbar} t \right] C_m(t) \tag{1876} \]
Since the system is initially in the ground state, the state is subject to the initial condition
\[ C_n(0) = \delta_{n,0} \tag{1877} \]
To first order one has
\[ C^1_n(t) = - \frac{i}{\hbar} \int_0^t dt' \, \langle \phi_n | \hat{H}_{int} | \phi_0 \rangle \exp\left[ i \frac{E_n - E_0}{\hbar} t' \right] \tag{1878} \]
We assume the perturbation has no diagonal elements, therefore, $C^1_0(t) = 0$. To second order one has
\[ i \hbar \frac{\partial}{\partial t} C^2_n(t) = \sum_{m \neq 0} \langle \phi_n | \hat{H}_{int} | \phi_m \rangle \exp\left[ i \frac{E_n - E_m}{\hbar} t \right] C^1_m(t) \tag{1879} \]
18
Impurities and Disorder
If an isolated impurity is introduced into a solid, and the impurity has no low energy degrees of freedom which can be excited, then it can be treated as an impurity potential $V_{imp}(r)$. Since the impurity breaks the periodic translational invariance of the solid, the impurity potential will scatter an electron between Bloch states with different Bloch wave vectors. The non-zero matrix elements of the potential can be written as
\[ \int d^3r \, \phi^*_{k'}(r) \, V_{imp}(r) \, \phi_k(r) = \langle k' | V_{imp} | k \rangle \tag{1880} \]
If the wave function, in the presence of an impurity, is written as the superposition
\[ \psi_\alpha(r) = \sum_k C_\alpha(k) \, \phi_k(r) \tag{1881} \]
the energy eigenvalue equation can be expressed as
\[ ( E_\alpha - E(k) ) \, C_\alpha(k) = \sum_{k'} C_\alpha(k') \, \langle k | V_{imp} | k' \rangle \tag{1882} \]
If the quantity $\sum_{k'} C_\alpha(k') \langle k | V_{imp} | k' \rangle$ is a well defined and non-zero function of $k$, then there exist eigenvalues $E_\alpha$ between every consecutive pair of $E(k)$. For a large system, where the $E(k)$ are very closely spaced, the eigenvalues form a continuum. These eigenstates correspond to weakly perturbed Bloch states. On the other hand, if the potential is attractive and there is a minimum value of $E(k)$, there can be bound states with energies $E_\alpha$ below this minimum. The dependence of the bound state energy on the density of states of the ordered material can be easily found for the case where the matrix elements of the potential are independent of $k$ and $k'$. In this case, the above equations can be solved by writing
\[ \gamma = \sum_{k'} C_\alpha(k') \tag{1883} \]
Therefore, one has
\[ C_\alpha(k) = \frac{V_{imp} \, \gamma}{E_\alpha - E(k)} \tag{1884} \]
The above two equations lead to the self-consistency condition for the bound state energy $E_\alpha$
\[ 1 = \sum_k \frac{V_{imp}}{E_\alpha - E(k)} \tag{1885} \]
which shows that, for an attractive potential, there may be a critical value of $V_{imp}$ needed for a bound state to form.
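The self-consistency condition of Eq. (1885) can be solved numerically for a model band. The Python sketch below is an illustration under stated assumptions: the band is a hypothetical set of `N` equally spaced levels on $(0, W)$, the matrix element is taken as $V_{imp} = -U / N$ so that the sum stays finite as `N` grows, and the root below the band bottom is located by bisection. None of these names or values come from the text.

```python
import math

def self_consistency_sum(E, V_imp, band_energies):
    """Right-hand side of Eq. (1885): sum over k of V_imp / (E - E(k))."""
    return sum(V_imp / (E - Ek) for Ek in band_energies)

def bound_state_energy(V_imp, band_energies, E_low=-100.0, iterations=200):
    """Bisect for E below the band bottom where the sum equals 1.

    Returns None when no root is bracketed between E_low and the band bottom.
    """
    E_high = min(band_energies) - 1e-12
    g = lambda E: self_consistency_sum(E, V_imp, band_energies) - 1.0
    if g(E_low) > 0.0 or g(E_high) < 0.0:
        return None
    for _ in range(iterations):
        E_mid = 0.5 * (E_low + E_high)
        if g(E_mid) > 0.0:   # sum too large: the root lies at lower energy
            E_high = E_mid
        else:
            E_low = E_mid
    return 0.5 * (E_low + E_high)

# Hypothetical model: N equally spaced band levels on (0, W), attraction -U / N.
N, W, U = 1000, 1.0, 0.5
levels = [W * (i + 0.5) / N for i in range(N)]
E_bound = bound_state_energy(-U / N, levels)
continuum_estimate = -W / (math.exp(W / U) - 1.0)  # from (U/W) ln((W - E)/(-E)) = 1
print(E_bound, continuum_estimate)
```

For this flat model density of states the sum is logarithmic, so the bisection result can be compared with the closed form obtained by replacing the sum with an integral. A repulsive potential of the same form gives no solution below the band.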
A more powerful way of solving the same problem involves the use of the one-particle resolvent Green's function, defined by the operator
\[ \hat{G}(z) = ( z - \hat{H} )^{-1} \tag{1886} \]
where $z$ is a complex number. Since $\hat{H}$ is a Hermitean operator, the matrix elements of the Green's function can be expressed in terms of a sum of simple poles at the energy eigenvalues. Since the spectrum of the Hamiltonian is composed of discrete bound states at negative energies and a semi-continuous spectrum at positive energies, the Green's function has a branch cut along the positive real axis. The discontinuity across the real axis, $x = Re \, z$, is given by
\[ \langle \Psi | \hat{G}(x - i \eta) - \hat{G}(x + i \eta) | \Phi \rangle = 2 \pi i \sum_n \langle \Psi | E_n \rangle \langle E_n | \Phi \rangle \, \delta( x - E_n ) \tag{1887} \]
where $\eta$ is a positive infinitesimal and $| E_n \rangle$ is the energy eigenstate corresponding to the energy eigenvalue $E_n$. The resolvent Green's function can be obtained by expressing the Hamiltonian in terms of the unperturbed Hamiltonian $\hat{H}_0$ and the interaction $\hat{H}_{int}$,
\[ \hat{H} = \hat{H}_0 + \hat{H}_{int} \tag{1888} \]
Then, the Green's function
\[ \hat{G}(z) = \left( z - \hat{H} \right)^{-1} = \left( z - \hat{H}_0 - \hat{H}_{int} \right)^{-1} \tag{1889} \]
can be re-written as
\[ \hat{G}(z) = \left( z - \hat{H}_0 \right)^{-1} + \left( z - \hat{H}_0 \right)^{-1} \hat{H}_{int} \, \hat{G}(z) \tag{1890} \]
which can be expressed in terms of the non-interacting resolvent Green's function, $\hat{G}_0(z)$, as
\[ \hat{G}(z) = \hat{G}_0(z) + \hat{G}_0(z) \, \hat{H}_{int} \, \hat{G}(z) \tag{1891} \]
The non-interacting resolvent Green's function is easily evaluated in terms of the matrix elements between the eigenstates of $\hat{H}_0$
\[ \langle \phi_n | \frac{1}{z - \hat{H}_0} | \phi_m \rangle = \langle \phi_n | \frac{1}{z - E_m} | \phi_m \rangle = \frac{\langle \phi_n | \phi_m \rangle}{z - E_m} = \frac{\delta_{n,m}}{z - E_m} \tag{1892} \]
which is diagonal. For simultaneous momentum eigenstates the non-interacting resolvent Green’s function only has the diagonal matrix elements < k0 |
1 ˆ0 z − H
|k > =
414
< k0 | k > z − E(k)
(1893)
The interacting Green’s function can be expressed in terms of the Tˆ(z) matrix as ˆ ˆ 0 (z) + G ˆ 0 (z) Tˆ(z) G ˆ 0 (z) G(z) = G (1894) where the T-matrix is defined as ˆ int Tˆ(z) = H
ˆ 0 (z) H ˆ int 1 − G
−1 (1895)
Thus, the poles of the T-matrix are related to the poles of the Green’s function. ˆ int are independent of k, the matrix elements of the As the matrix elements of H T-matrix between the Bloch states can be evaluated as −1 X Vimp (1896) Tˆ(z) = Vimp 1 − z − E(k) k
Since the imaginary part of the trace of the Green’s function is related to the density of states via
$$ \rho_0(E) = - \frac{1}{\pi} \sum_k \mathrm{Im} \, \langle k | \hat{G}_0(E + i \eta) | k \rangle \tag{1897} $$
where $\eta$ is a positive infinitesimal, one can express the sum over Bloch states as an integral over the density of states
$$ \sum_k \frac{V_{imp}}{z - E(k)} = V_{imp} \int_0^{\infty} dE \, \frac{\rho_0(E)}{z - E} \tag{1898} $$
Thus, the imaginary part of the T-matrix is non-zero for z on the positive real axis. The T-matrix has isolated poles outside the continuum, at the negative energies z which are given by the solutions of
$$ \frac{1}{V_{imp}} = \int_0^{\infty} dE \, \frac{\rho_0(E)}{z - E} \tag{1899} $$
The minimum value of the attractive potential $V_{imp}$ that produces a bound state depends strongly on the form of the density of states at the edge of the continuum. The critical value of $V_{imp}$, denoted as $V_c$, is given by the condition z = 0
$$ \frac{1}{V_c} = - \int_0^{\infty} dE \, \frac{\rho_0(E)}{E} \tag{1900} $$
Since $\rho_0(E) \propto E^{\frac{d-2}{2}}$, the integral converges for three dimensions and higher, but diverges at its lower limit for two and one dimensions. Hence, in one and two dimensions $V_c$ vanishes and an arbitrarily weak attractive potential produces a bound state. The bound states are exponentially localized around the impurity site.
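The dependence of the critical coupling on dimensionality can be checked numerically. The following is a minimal sketch, assuming a model density of states $\rho_0(E) = E^{(d-2)/2}$ with no prefactor, an upper band cut-off `e_max`, and a small lower cut-off that is sent towards zero; the function name and all parameter values are illustrative and are not taken from the text.

```python
import math


def critical_integral(d, cutoff, e_max=1.0, steps=20000):
    """Approximate the integral of rho0(E)/E from `cutoff` to `e_max`.

    Assumes the model density of states rho0(E) = E**((d - 2) / 2).
    Substituting E = exp(u) turns the integrand into exp(p * u) with
    p = (d - 2) / 2, which a midpoint rule on a uniform u grid handles well.
    """
    p = (d - 2) / 2.0
    u_lo, u_hi = math.log(cutoff), math.log(e_max)
    h = (u_hi - u_lo) / steps
    return h * sum(math.exp(p * (u_lo + (i + 0.5) * h)) for i in range(steps))


if __name__ == "__main__":
    # Lower the cut-off and watch whether the integral saturates (d = 3)
    # or keeps growing (d = 1, 2).
    for d in (1, 2, 3):
        row = [critical_integral(d, cutoff) for cutoff in (1e-2, 1e-4, 1e-6)]
        print(d, ["%.3f" % value for value in row])
```

For d = 3 the result saturates as the lower cut-off is reduced, so $1/V_c$ is finite, whereas for d = 1 and d = 2 it grows without bound, which corresponds to $V_c \rightarrow 0$.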
18.1 Scattering By Impurities
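The exact eigenstates of a Hamiltonian containing a scattering potential $\hat{V}_{imp}$ satisfy the equation
$$ \hat{H} \, | \Psi \rangle = ( \hat{H}_0 + \hat{V}_{imp} ) \, | \Psi \rangle = E \, | \Psi \rangle \tag{1901} $$
This can be re-expressed as an integral equation with an initial state given by the incident momentum eigenstate $| k \rangle$. The equation
$$ ( E - \hat{H}_0 + i \eta ) \, | \Psi \rangle = \hat{V}_{imp} \, | \Psi \rangle \tag{1902} $$
has general solutions which are the superposition of the solutions of the homogeneous equation and a particular solution of the inhomogeneous equation
$$ | \Psi \rangle = | k \rangle + \frac{1}{E - \hat{H}_0 + i \eta} \, \hat{V}_{imp} \, | \Psi \rangle \tag{1903} $$
To ensure that $| \Psi \rangle - | k \rangle$ is an outgoing wave, $\eta$ must be chosen as a positive infinitesimal. In the position representation, the equation for the outgoing wave can be expressed as
$$ \Psi(r) = \frac{1}{( 2 \pi \hbar )^{\frac{3}{2}}} \exp\left[ i k \cdot r \right] + \sum_{k'} \int d^3 r' \, \frac{\exp\left[ i k' \cdot ( r - r' ) \right]}{( 2 \pi \hbar )^3 \, ( E - E(k') + i \eta )} \, V_{imp}(r') \, \Psi(r') \tag{1904} $$
which has the well known solution
$$ \Psi(r) = \frac{1}{( 2 \pi \hbar )^{\frac{3}{2}}} \exp\left[ i k \cdot r \right] - \frac{m}{2 \pi} \int d^3 r' \, \frac{\exp\left[ i k \, | r - r' | \right]}{( 2 \pi \hbar )^3 \, | r - r' |} \, V_{imp}(r') \, \Psi(r') \tag{1905} $$
The asymptotic form can be expressed in terms of the phase shifts $\delta_l(k)$ via a partial wave analysis.
$$ \lim_{r \rightarrow \infty} \Psi(r) \sim \sum_{l=0}^{\infty} \frac{( 2 l + 1 )}{k r} \, i^l \exp\left[ i \delta_l \right] P_l(\cos \theta) \, \sin\left( k r - \frac{l \pi}{2} + \delta_l \right) \tag{1906} $$
The T-matrix has matrix elements which satisfy
$$ \langle k' | \hat{T} | k \rangle = \langle k' | \hat{V}_{imp} | \Psi \rangle \tag{1907} $$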
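This leads to
$$ \frac{m}{2 \pi} \langle k' | \hat{T} | k \rangle = \sum_{l=0}^{\infty} P_l(\cos \theta) \, \frac{\exp\left[ i 2 \delta_l \right] - 1}{2 i k} \tag{1908} $$
and the differential scattering cross-section is given by
$$ \frac{d\sigma}{d\Omega} = \left| \frac{m}{2 \pi} \langle k' | \hat{T} | k \rangle \right|^2 \tag{1909} $$
In the limit $k \rightarrow 0$ only the s-wave phase shift $\delta_0$ is significant and one finds
$$ \frac{m}{2 \pi} T = \frac{\exp\left[ i 2 \delta_0 \right] - 1}{2 i k} \tag{1910} $$
and the scattering cross-section is given by
$$ \frac{d\sigma}{d\Omega} = \frac{\sin^2 \delta_0}{k^2} \tag{1911} $$
and the total cross-section $\sigma$ is given by
$$ \sigma = \frac{4 \pi \sin^2 \delta_0}{k^2} \tag{1912} $$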
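The step from the s-wave T-matrix to the cross-sections rests on the identity $| \exp[ i 2 \delta_0 ] - 1 |^2 = 4 \sin^2 \delta_0$. A minimal numerical sketch of this step is given below; the function names are illustrative choices, and the amplitude is simply the right-hand side of the $k \rightarrow 0$ relation quoted above.

```python
import cmath
import math


def s_wave_amplitude(delta0, k):
    """Right-hand side of the k -> 0 relation: (exp(2 i delta0) - 1) / (2 i k)."""
    return (cmath.exp(2j * delta0) - 1.0) / (2j * k)


def differential_cross_section(delta0, k):
    """Modulus squared of the amplitude; reduces to sin^2(delta0) / k^2."""
    return abs(s_wave_amplitude(delta0, k)) ** 2


def total_cross_section(delta0, k):
    """Isotropic s-wave scattering: 4 pi times the differential cross-section."""
    return 4.0 * math.pi * differential_cross_section(delta0, k)
```

The total cross-section takes its largest value, $4 \pi / k^2$, when $\delta_0 = \pi / 2$, which is the resonant case discussed in connection with virtual bound states.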
The density of states due to the impurity can be expressed in terms of the phase shift $\delta_0(k)$. The number of states is evaluated in a volume of radius R, and the wave function is required to vanish at r = R. Hence, the phase shift must satisfy the condition
$$ k R + \delta_0(k) = n \pi \tag{1913} $$
Since successive states satisfy this condition with consecutive integers n and n + 1, the change in k between two states is given by
$$ \Delta k \left( R + \frac{\partial \delta_0}{\partial k} \right) = \pi \tag{1914} $$
Thus, the number of states per k interval is
$$ \frac{1}{\Delta k} = \frac{1}{\pi} \left( R + \frac{d \delta_0(k)}{dk} \right) \tag{1915} $$
The first term is the contribution of the pure host, and the second term is due to the impurity. On integrating this with respect to E one finds that the number of states due to the impurity with energy less than E, N(E), is given by
$$ N(E) = \frac{1}{\pi} \, \delta_0(E) \tag{1916} $$
The impurity density of states, per spin, is given by
$$ \rho_{imp}(E) = \frac{1}{\pi} \frac{\partial \delta_0}{\partial E} \tag{1917} $$
The condition for electrical neutrality for a charge $Z \, | e |$ determines the phase shift at the Fermi-energy, through Friedel’s sum rule
$$ Z = \frac{2}{\pi} \, \delta_0(\mu) \tag{1918} $$
For systems where the phase shift is rapidly varying at the Fermi-energy, there is a large impurity density of states. For resonant scattering, as Friedel has shown, there exists a virtual bound state at the Fermi-energy. This finite density of states at the Fermi-energy gives rise to an impurity contribution to the specific heat and the susceptibility.
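The connection between a rapidly varying phase shift and a large impurity density of states can be illustrated with a short sketch. It assumes a resonant (Breit-Wigner) form for the s-wave phase shift, $\delta_0(E) = \pi / 2 + \tan^{-1} [ ( E - E_r ) / \Gamma ]$; this form, the resonance energy `e_res` and the width `gamma` are assumptions made for illustration and are not derived in the text.

```python
import math


def resonant_phase_shift(energy, e_res, gamma):
    """Assumed Breit-Wigner form: rises from 0 to pi, passing pi/2 at e_res."""
    return math.pi / 2.0 + math.atan((energy - e_res) / gamma)


def impurity_dos(energy, e_res, gamma, step=1e-6):
    """(1/pi) d(delta0)/dE per spin, evaluated by a central finite difference."""
    upper = resonant_phase_shift(energy + step, e_res, gamma)
    lower = resonant_phase_shift(energy - step, e_res, gamma)
    return (upper - lower) / (2.0 * step * math.pi)


def screened_charge(mu, e_res, gamma):
    """Friedel sum rule for s-wave scattering: Z = (2/pi) delta0(mu)."""
    return 2.0 * resonant_phase_shift(mu, e_res, gamma) / math.pi
```

With this form the impurity density of states is a Lorentzian of width $\Gamma$ centered on $E_r$, and a resonance sitting exactly at the Fermi-energy screens one unit of charge, Z = 1.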
18.2 Virtual Bound States
The virtual bound state can be envisaged as an (almost) localized level that has a finite probability amplitude for transitions into the conduction band states. These virtual bound states are most frequently found for 3d transition metal impurities in metals or for mixed valent lanthanide element impurities in metals. In both these cases, the potential well has a large centrifugal barrier
$$ V_l(r) = \frac{\hbar^2 \, l ( l + 1 )}{2 \, m \, r^2} \tag{1919} $$
which prevents the 3d states from being filled until after the 4s states are filled or, in the case of the lanthanide elements, keeps the 4f states unfilled until after the 6s, 5p and 5d states are all occupied. When the nuclear potential is strong enough that the 3d or 4f states can be occupied in the ground state, the ion localizes an electron within the centrifugal barrier in an inner ionic shell. For example, in the Ce atom, although the 4f wave function is localized, in that it has a spatial extent of 0.7 a.u. which lies inside the core-like 5s and 5p orbitals, its ionization energy is small and comparable to the ionization energy of the band-like 6s and 6p orbitals. As the localized state is degenerate with the conduction band states, there is a finite probability amplitude for tunnelling through the barrier. The virtual bound state describes an extended state that, through resonant scattering, builds up a significant local character.

The virtual bound state in a metal may be modelled by a Hamiltonian which is the sum of three terms
$$ \hat{H} = \hat{H}_0 + \hat{H}_V = \hat{H}_c + \hat{H}_d + \hat{H}_V \tag{1920} $$
where $\hat{H}_c$ describes the electrons in the conduction band, the Hamiltonian $\hat{H}_d$ represents the (isolated) localized d level on the impurity and the term $\hat{H}_V$ describes the coupling. The conduction band is expressed in terms of the number of conduction electrons in the Bloch states (k, σ) with dispersion relation E(k) through
$$ \begin{aligned} \hat{H}_c &= \sum_{k,\sigma} E(k) \, \hat{n}_{k,\sigma} \\ &= \sum_{k,\sigma} E(k) \, c^{\dagger}_{k,\sigma} c_{k,\sigma} \end{aligned} \tag{1921} $$
where $c^{\dagger}_{k,\sigma}$ and $c_{k,\sigma}$ respectively create and annihilate an electron in the Bloch state (k, σ). Likewise the energy for an electron in the localized d state is given by the binding energy $E_d$ times the number of d electrons of spin σ,
$$ \begin{aligned} \hat{H}_d &= \sum_{\sigma} E_d \, \hat{n}_{d,\sigma} \\ &= \sum_{\sigma} E_d \, d^{\dagger}_{\sigma} d_{\sigma} \end{aligned} \tag{1922} $$
where $d^{\dagger}_{\sigma}$ and $d_{\sigma}$ respectively create and annihilate an electron of spin σ in the localized d state. The hybridization or coupling term is given by the spin conserving term
$$ \hat{H}_V = \frac{1}{\sqrt{N}} \sum_{k,\sigma} \left[ V(k) \, c^{\dagger}_{k,\sigma} d_{\sigma} + V^{*}(k) \, d^{\dagger}_{\sigma} c_{k,\sigma} \right] \tag{1923} $$
The first term represents a process whereby an electron in the d orbital tunnels into the conduction band, and the Hermitean conjugate term represents the reverse process. It is assumed that the conduction band states have been orthogonalized to the localized states, so that the conduction band fermion operators anti-commute with all the local operators. The resolvent Green’s function can be calculated from the expression
$$ ( z - \hat{H}_0 ) \, \frac{1}{z - \hat{H}} = 1 + \hat{H}_V \, \frac{1}{z - \hat{H}} \tag{1924} $$
and from the analogous expression with the resolvent standing on the left. Evaluating the matrix elements between the eigenstates of $\hat{H}_0$ yields the coupled equations
$$ ( z - E_d ) \, \langle d | \frac{1}{z - \hat{H}} | d \rangle = 1 + \frac{1}{\sqrt{N}} \sum_k V(k) \, \langle d | \frac{1}{z - \hat{H}} | k \rangle \tag{1925} $$
and
$$ ( z - E(k) ) \, \langle d | \frac{1}{z - \hat{H}} | k \rangle = \frac{1}{\sqrt{N}} \, V(k)^{*} \, \langle d | \frac{1}{z - \hat{H}} | d \rangle \tag{1926} $$
These equations can be combined to yield the matrix elements of the resolvent Green’s function as
$$ \begin{aligned} G_{d,d}(z) &= \langle d | \frac{1}{z - \hat{H}} | d \rangle \\ &= \frac{1}{z - E_d - \Sigma(z)} \end{aligned} \tag{1927} $$
where the self-energy Σ(z) is given by
$$ \Sigma(z) = \frac{1}{N} \sum_k \frac{| V(k) |^2}{z - E(k)} \tag{1928} $$
The real part of the self-energy can be interpreted as producing a renormalization of the energy of the localized level $E_d$, and the imaginary part can be interpreted as giving rise to a width or lifetime τ such that
$$ \frac{\hbar}{2 \tau} = \mathrm{Im} \, \Sigma(E_d) \tag{1929} $$
The conduction band resolvent Green’s function is evaluated, from a similar set of coupled equations, as
$$ \begin{aligned} G_{k,k'}(z) &= \langle k | \frac{1}{z - \hat{H}} | k' \rangle \\ &= \frac{\delta_{k,k'}}{z - E(k)} + \frac{1}{N} \, \frac{V(k)}{z - E(k)} \, G_{d,d}(z) \, \frac{V(k')^{*}}{z - E(k')} \end{aligned} \tag{1930} $$
From these equations it can be seen that the density of states of the d level is given in terms of the imaginary part of Σ(E) via
$$ \begin{aligned} \rho_d(E) &= \frac{1}{\pi} \, \mathrm{Im} \, G_{d,d}(E - i \eta) \\ &= \frac{1}{\pi} \, \frac{\mathrm{Im} \, \Sigma(E - i \eta)}{\left( E - E_d - \mathrm{Re} \, \Sigma(E - i \eta) \right)^2 + \left( \mathrm{Im} \, \Sigma(E - i \eta) \right)^2} \end{aligned} \tag{1931} $$
where $\eta$ is a positive infinitesimal. The impurity density of states is approximately in the form of a Lorentzian centered on $E_d$ with a width given by Im Σ(E). The width is given by
$$ \begin{aligned} \mathrm{Im} \, \Sigma(E) &= \frac{\pi}{N} \sum_k | V(k) |^2 \, \delta( E - E(k) ) \\ &\approx \frac{1}{N} \, \pi \, | V |^2 \, \rho(E_d) \end{aligned} \tag{1932} $$
which is related to the Fermi Golden Rule expression for the rate at which the localized electron tunnels into the conduction band with density of states ρ(E). Thus, the virtual bound state can be interpreted in terms of a narrow band density of states which is connected to the extended conduction band states.
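The Lorentzian form of the d level density of states can be checked on a discretized band. The sketch below assumes a flat band of N equally spaced levels with half-width `half_width`, a k-independent hybridization `v`, and a small but finite broadening `eta` in place of the infinitesimal; the density of states per level is then $1 / ( 2 \times$ `half_width` $)$. All names and parameter values are illustrative.

```python
import math


def flat_band(n_states, half_width=1.0):
    """Equally spaced band energies E(k) in (-half_width, half_width)."""
    return [-half_width + 2.0 * half_width * (i + 0.5) / n_states
            for i in range(n_states)]


def self_energy(z, v, band_energies):
    """Sigma(z) = (1/N) sum_k |V|^2 / (z - E(k)) for a k-independent V."""
    n = len(band_energies)
    return sum(abs(v) ** 2 / (z - e_k) for e_k in band_energies) / n


def d_level_dos(energy, e_d, v, band_energies, eta):
    """rho_d(E) = (1/pi) Im G_dd(E - i eta) with G_dd = 1 / (z - E_d - Sigma(z))."""
    z = complex(energy, -eta)
    g_dd = 1.0 / (z - e_d - self_energy(z, v, band_energies))
    return g_dd.imag / math.pi
```

For energies well inside the band the imaginary part of the self-energy evaluated just below the real axis is close to $\pi | V |^2 \rho$, and the d level density of states is peaked at the (slightly shifted) level energy; the finite `eta` adds to the width, so it should be kept larger than the level spacing but smaller than the width itself.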
18.3 Disorder
Given a distribution of impurities in a solid, the potential in the solid will be non-uniform. The thermodynamic properties of the solid can be expressed in terms of the energy eigenvalues, or alternatively the poles of the Green’s function. For a macroscopic sample, the exact distribution of impurities will not be measurable and the thermodynamic properties are expected to be representative of all distributions of impurities. Therefore, the average value of a quantity can be represented by averaging over all configurations of the impurities. It can easily be shown that the configurational averaged density of states is given by the discontinuity across the real axis of the configurational averaged resolvent Green’s function. The Hamiltonian of a binary (A-B) alloy, with site disorder, may be represented by
$$ \hat{H} = \hat{H}_0 + \hat{V} \tag{1933} $$
in which $\hat{H}_0$ describes the tight-binding bands of a pure metal with a dispersion relation
$$ E(k) = - t \sum_{i=1}^{d} \cos k_i a_i \tag{1934} $$
and the randomness appears as a shift of the binding energies of the atomic orbitals
$$ \hat{V} = \sum_R E_R \, | \psi_R \rangle \langle \psi_R | \tag{1935} $$
where $E_R$ can take on the values $E_A$ or $E_B$ depending on the type of atom present at site R. The average Green’s function is given by
$$ \overline{G}(z) = \overline{\left( z - \hat{H}_0 - \hat{V} \right)^{-1}} \tag{1936} $$
where the overline denotes the configurational average, which can be expressed as
$$ \overline{G}(z) = \left( z - \hat{H}_0 - \Sigma(z) \right)^{-1} \tag{1937} $$
where the operator Σ(z) is complex and is known as a self-energy. Since the configurational averaged Green’s function has translational invariance, so does the self-energy. It represents the effect of the randomly distributed impurities on the eigenvalue spectrum. Due to the fluctuations in the random potential, the energy eigenvalues may form continua. The averaged Green’s function can be calculated by expanding the Green’s function in powers of the potential and then performing the configurational average. For strongly fluctuating potentials the resulting power series may be slowly convergent, or it may not even be convergent at all. Therefore, it may be preferable to expand the Green’s function about the self-energy. This procedure leads to the coherent potential approximation.
18.4 Coherent Potential Approximation
The difference between the specific potential due to the impurities and the self-energy can be expressed as
$$ \hat{V}(z) = \hat{V} - \Sigma(z) \tag{1938} $$
The resolvent Green’s function for this type of disordered impurity problem can be expressed as
$$ \hat{G}(z) = \left( z - \hat{H}_0 - \Sigma(z) - \hat{V}(z) \right)^{-1} \tag{1939} $$
which can be expressed in terms of the T-matrix via
$$ \hat{G}(z) = \overline{G}(z) + \overline{G}(z) \, \hat{T}(z) \, \overline{G}(z) \tag{1940} $$
where $\overline{G}(z)$ is the configurational averaged Green’s function and the T-matrix is given by
$$ \hat{T}(z) = \hat{V}(z) \left( 1 - \overline{G}(z) \, \hat{V}(z) \right)^{-1} \tag{1941} $$
On taking the configurational average one finds that the averaged T-matrix must be zero
$$ \overline{T}(z) = \overline{ \hat{V}(z) \left( 1 - \overline{G}(z) \, \hat{V}(z) \right)^{-1} } = 0 \tag{1942} $$
This equation can be used to obtain the self-energy. For the A-B alloy the effective potential is
$$ \hat{V}(z) = \sum_R ( E_R - \Sigma(z) ) \, | \psi_R \rangle \langle \psi_R | \tag{1943} $$
It is assumed that the concentration of A atoms is c and the concentration of B atoms is (1 − c), and that these are randomly distributed. It is also assumed that the T-matrix can be represented as a sum of single site T-matrices, in which the scattering is referenced to an appropriately chosen averaged medium. This is the single site approximation. The averaged T-matrix can be written as
$$ \overline{T}(z) = c \, \frac{E_A - \Sigma(z)}{1 - ( E_A - \Sigma(z) ) \, \langle R_0 | \overline{G}(z) | R_0 \rangle} + ( 1 - c ) \, \frac{E_B - \Sigma(z)}{1 - ( E_B - \Sigma(z) ) \, \langle R_0 | \overline{G}(z) | R_0 \rangle} \tag{1944} $$
The Coherent Potential Approximation (C.P.A.) sets
$$ \overline{T}(z) = 0 \tag{1945} $$
The resulting equations are non-trivial to solve since the Green’s function in the denominator is formed from a sum over the Bloch states and also involves the self-energy.
$$ \begin{aligned} \langle R_0 | \overline{G}(z) | R_0 \rangle &= \frac{1}{N} \sum_k \frac{1}{z - \Sigma(z) - E(k)} \\ &= \langle R_0 | G_0( z - \Sigma(z) ) | R_0 \rangle \end{aligned} \tag{1946} $$
Nevertheless, this can be solved numerically or, alternatively, if the sum over Bloch energies can be evaluated analytically, an analytic solution may be found. The C.P.A. is expected to be valid in the limit of a dilute concentration of impurities $1 \gg c$, for weak scattering $t \gg | E_A - E_B |$ and, fortuitously, in the atomic limit, where the single site approximation is exact. In general the C.P.A. may only be trusted to yield the density of states, and not transport properties. The density of states obtained from this method resembles a smeared version of the weighted sum of the density of states of a solid composed of A atoms and the density of states of a solid composed of B atoms. For small differences in the site energies, the two components overlap, but they separate for large differences in the site energies. When the bands are split, the widths of the component bands are drastically modified from the ideal superposition, reflecting the increasing separation between sites of the same type, which decreases the tendency to form bands. The effect of the impurity scattering is to produce a smearing, which washes out any structure such as van Hove singularities. Transport properties crucially depend on the spatially extended nature of the energy eigenstates, which may be destroyed by long-ranged correlations in the random potentials. This type of phenomenon is completely absent in the C.P.A., and can lead to the energy eigenstates becoming localized.
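A numerical solution of the C.P.A. condition can be sketched as follows. Two assumptions are made that are not in the text: the pure band is replaced by a semicircular model density of states of half-width `half_width`, for which the local Green’s function is known in closed form, and the self-energy is updated with the commonly used iteration $\Sigma \rightarrow \Sigma + \overline{T} / ( 1 + \overline{T} F )$, where F is the site-diagonal element of the averaged Green’s function. The function names are illustrative.

```python
import cmath


def semicircle_g0(z, half_width=1.0):
    """Local Green's function of a semicircular band (an assumed model band)."""
    root = cmath.sqrt(z * z - half_width ** 2)
    if root.imag * z.imag < 0:
        root = -root  # choose the branch for which G0 ~ 1/z at large |z|
    return 2.0 * (z - root) / half_width ** 2


def averaged_t_matrix(sigma, f, e_a, e_b, c):
    """Concentration-weighted single site T-matrix of the A-B alloy."""
    t_a = (e_a - sigma) / (1.0 - (e_a - sigma) * f)
    t_b = (e_b - sigma) / (1.0 - (e_b - sigma) * f)
    return c * t_a + (1.0 - c) * t_b


def cpa_self_energy(z, e_a, e_b, c, tol=1e-12, max_iter=5000):
    """Iterate the self-energy until the averaged T-matrix vanishes."""
    sigma = complex(c * e_a + (1.0 - c) * e_b)  # virtual crystal starting guess
    for _ in range(max_iter):
        f = semicircle_g0(z - sigma)
        t_bar = averaged_t_matrix(sigma, f, e_a, e_b, c)
        if abs(t_bar) < tol:
            break
        sigma += t_bar / (1.0 + t_bar * f)
    return sigma


if __name__ == "__main__":
    z = complex(0.2, 0.05)  # energy just above the real axis
    sigma = cpa_self_energy(z, 0.2, -0.2, 0.5)
    print(sigma, semicircle_g0(z - sigma))
```

The configurational averaged density of states per site then follows from $- \mathrm{Im} \, F / \pi$ evaluated just above the real axis. For $E_A = E_B$ the iteration returns the common site energy in a single step, as it should.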
18.5 Localization
The phenomenon of disorder induced localization is easiest to understand in terms of states at the tail edge of a band. Just as one impurity with a sufficiently strong attractive potential may cause a bound state to form around it, a bound state may also be formed for a number of nearby atoms with weaker attractive interactions, in which case the bound state may be of larger spatial extent. In both cases, they will produce localized states below the band edge of the density of states. A distribution in the spatial separation between impurity atoms will smear these discrete bound states. The localized states manifest themselves as low energy tails to the density of states. As the strength of the disorder is increased, the number of localized states in the tails of the density of states will increase. One surprising feature is that a sharp energy, the mobility edge, separates the states that extend throughout the crystal from the localized states. The length scales over which the states on the localized side of the mobility edge are localized are extraordinarily long; these states cannot be treated by perturbation methods but require renormalization group type approaches. On using the electron-hole symmetry for states at the top of the band, one discovers that the states at the top edge of the band will also become localized due to disorder, and also have a mobility edge. On increasing the strength of the disorder, the mobility edges will move towards the middle of the band. A disorder driven metal insulator transition will occur when the mobility edge crosses the Fermi-energy. This type of transition is known as the Anderson transition. The effect of many-body interactions complicates the physics on the metallic side of the Anderson transition, where weak localization occurs and their effects are most marked. On the insulating side of the transition, conduction will still be possible, but only due to the thermal excitation of electrons to the itinerant states above the mobility edge, or by thermally assisted tunnelling processes. For sufficiently strong disorder all the states in the band will become localized. All states in one-dimensional and two-dimensional systems must become localized, for arbitrarily small strengths of disorder. However, this localization need not show up in experiments if the length scale over which the states are localized is larger than the sample size.
18.5.1 Anderson Model of Localization
In a doped semiconductor such as P doped Si, as the impurity concentration is increased, it is expected that the energy levels of the isolated impurities will broaden and form bands. For large concentrations, the impurity level wave functions are expected to overlap and become extended. Thus, it is expected that a metal insulator transition will occur as a function of impurity concentration. The metal insulator transition can be described by a tight-binding model of a disordered system
$$ \hat{H} = \sum_{i,\sigma} \varepsilon_i \, c^{\dagger}_{i,\sigma} c_{i,\sigma} - t \sum_{i,j,\sigma} c^{\dagger}_{i,\sigma} c_{j,\sigma} \tag{1947} $$
where t is the nearest neighbor tight-binding hopping matrix element and the sum over (i, j) is assumed to run over pairs of nearest neighbor lattice sites. The site energies $\varepsilon_i$ are assumed to be random variables uniformly distributed over an energy width W
$$ P(\varepsilon) = \begin{cases} \dfrac{1}{W} & \text{for } - \dfrac{W}{2} < \varepsilon < \dfrac{W}{2} \\ 0 & \text{otherwise} \end{cases} \tag{1948} $$
The degree of disorder is measured by the dimensionless parameter W/t. For sufficiently large W/t the states are all expected to be localized. The critical value $(W/t)_c$ is expected to depend on the dimensionality of the lattice. For W/t less than the critical value the states around the center of the tight-binding bands are extended, while states near the band edges are localized. There are energies $E_c$, called mobility edges, that separate the localized and extended states. When the chemical potential µ crosses the mobility edge, the states at the Fermi-energy change their character and a metal non-metal transition occurs. This is known as the Anderson transition. The wave functions corresponding to extended and localized states have different characters. A wave function for the disordered solid can be expressed as a linear combination of atomic wave functions
$$ \psi(r) = \sum_R C(R) \, \phi(r - R) \tag{1949} $$
A delocalized wave function has an amplitude C(R) which does not decay to zero at large distances. A localized wave function is expected to decay to zero with an exponential envelope
$$ | C(R) | \sim \exp\left[ - | R | / \xi \right] \tag{1950} $$
The spatial extent of the envelope is given by the correlation length ξ. The correlation length is expected to depend on the energy E of the energy eigenstate, and is expected to diverge as E approaches the mobility edge $E_c$. In the Anderson transition, the density of states of the localized states is expected to be continuous. Numerical studies show that the wave function exhibits long-ranged fluctuations close to the critical value of W/t, and appears to be self similar when viewed at all length scales.
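The exponential envelope of localized states can be illustrated for a one-dimensional chain, where the tight-binding equation for the amplitudes, $E \, C_n = \varepsilon_n C_n - t ( C_{n+1} + C_{n-1} )$, can be iterated site by site. The following is a minimal sketch, assuming a single chain, the uniform distribution of site energies of width W, and a fixed random seed so that the result is reproducible; the growth rate of the iterated amplitude is taken as an estimate of the inverse localization length in units of the inverse lattice spacing. The function name and parameter values are illustrative.

```python
import math
import random


def inverse_localization_length(width, energy=0.0, t=1.0, sites=100000, seed=1):
    """Estimate 1/xi for a 1D chain from the growth rate of the amplitudes.

    Iterates the ratio r_n = C_{n+1} / C_n, which obeys
    r_n = (eps_n - E) / t - 1 / r_{n-1}, and averages ln|r_n| over the chain.
    """
    rng = random.Random(seed)
    ratio = 1.0
    log_growth = 0.0
    for _ in range(sites):
        eps = rng.uniform(-width / 2.0, width / 2.0)
        ratio = (eps - energy) / t - 1.0 / ratio
        if ratio == 0.0:
            ratio = 1e-300  # guard against an exact zero of the amplitude
        log_growth += math.log(abs(ratio))
    return log_growth / sites


if __name__ == "__main__":
    for width in (0.0, 1.0, 2.0):
        print(width, inverse_localization_length(width))
```

For W = 0 the amplitudes do not grow, corresponding to an extended state, while any finite W produces a positive growth rate that increases with the strength of the disorder, in line with the statement that all states in one dimension are localized.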
18.5.2 Scaling Theories of Localization
Since numerical studies of Anderson localization are hampered by finite size effects which tend to obscure the effect of localization, Licciardello and Thouless introduced a number g(L) which describes the sensitivity of the energy eigenvalues to the boundary conditions, for a system with linear dimension L. This number is defined as the ratio
$$ g(L) = \frac{\Delta E}{\delta E} \tag{1951} $$
where ∆E is the shift in energy levels that occurs when the boundary conditions on the wave function are changed from periodic to anti-periodic. The quantity δE is the mean spacing of the energy levels of the finite size sample. If the wave functions are exponentially localized, it is expected that
$$ g(L) \propto \exp\left[ - \frac{2 L}{\xi(E)} \right] \tag{1952} $$
On the other hand, if the wave functions are extended the energy shift due to the different boundary conditions should be of the order of
$$ \frac{\hbar}{\tau} \tag{1953} $$
where τ is the time required for the electron to diffuse to the boundary of the sample. This diffusion time is essentially independent of L. The different dependencies of g(L) on L provide a simple criterion in numerical studies as to whether the states are extended or localized. The quantity g(L) is also found to be equal to a conductance
$$ g(L) = G(L) \, \frac{h}{e^2} \tag{1954} $$
and is related to the conductivity via
$$ g(L) \propto L^{d-2} \, \sigma \tag{1955} $$
The scaling theory of localization is based upon the length dependence of g(L) (Abrahams, Anderson, Licciardello and Ramakrishnan). In a scale change of a d-dimensional system with linear dimension L, the length scale L is changed to b L. It is expected that g(bL) is related to g(L) and the factor b, and nothing else. This is summarized in the formula
$$ g(bL) = f[ b, g(L) ] \tag{1956} $$
where f is a universal scaling function, which only depends on the dimensionality d of the lattice. On considering an infinitesimal scale change
$$ b = 1 + \frac{dL}{L} \tag{1957} $$
one can introduce a scaling function
$$ \frac{d \ln g(L)}{d \ln L} = \frac{\partial f(b, g) / \partial b}{g(L)} = \beta[ g(L) ] \tag{1958} $$
The function β[g] completely specifies the scaling property of the conductivity in disordered systems. It is assumed that β(g) is a smooth continuous function of g.
The asymptotic forms of β can be found in the limits g → 0 and g → ∞. In the strongly localized regime g → 0, where the wave function is exponentially localized, one finds that as
$$ g(L) \propto \exp\left[ - \frac{2 L}{\xi} \right] \tag{1959} $$
then
$$ \beta(g) = - \frac{2 L}{\xi} = \ln g \tag{1960} $$
Thus, β(g) tends to − ∞ as g → 0. In the metallic limit g → ∞, as σ is finite and independent of L, one has
$$ \beta(g) = ( d - 2 ) \tag{1961} $$
From this one finds that the system is localized for all spatial dimensionalities less than or equal to two, d ≤ 2, as β(g) is always negative and, thus, on increasing L, g(L) scales to zero. Thus, in the limit of a large system, no matter how weak the randomness is, the states are always localized in two dimensions. In two dimensions the conductivity decreases with increasing L, logarithmically at large values of g and exponentially at small values of g. Furthermore, for d > 2, there is a critical value $g_c$ such that for $g > g_c$ the system scales to the metallic limit, β(g) = (d − 2), since β(g) is positive and, as L is increased, $\lim_{L \rightarrow \infty} g(L) \rightarrow \infty$. For $g < g_c$, β(g) is negative and, on increasing L, g(L) scales to zero. From the scaling theory one can infer the dependence of the conductivity on the concentration of impurities, $c > c_0$, close to the metal insulator transition
$$ \sigma = \sigma_0 \, ( c - c_0 )^1 \tag{1962} $$
where the exponent of unity can be exactly obtained via perturbation theory.
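The qualitative consequences of the scaling equation can be explored by integrating $d \ln g / d \ln L = \beta(g)$ numerically. Since the text only fixes the two asymptotic limits, the sketch below assumes an interpolating form, $\beta(g) = ( d - 1 ) - ( 1 + g ) \ln ( 1 + 1 / g )$, which reduces to ln g for g → 0 (up to a constant) and to (d − 2) for g → ∞. This particular form is an illustrative assumption and is not derived in the text; the function names are likewise hypothetical.

```python
import math


def beta(g, d):
    """Assumed interpolation between beta ~ ln g (g -> 0) and d - 2 (g -> infinity)."""
    return (d - 1) - (1.0 + g) * math.log(1.0 + 1.0 / g)


def flow(g0, d, log_scale_change, steps=10000):
    """Integrate d ln g / d ln L = beta(g) by the Euler method, starting from g0."""
    ln_g = math.log(g0)
    h = log_scale_change / steps
    for _ in range(steps):
        ln_g += h * beta(math.exp(ln_g), d)
    return math.exp(ln_g)


def critical_conductance(d, lo=1e-6, hi=1e6):
    """Locate the zero of beta(g) by bisection on a logarithmic scale (d > 2)."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if beta(mid, d) < 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

With this form β(g) is negative for every g when d = 2, so g(L) always flows to zero, whereas for d = 3 there is an unstable fixed point $g_c$: starting values above $g_c$ flow towards the metallic limit and values below $g_c$ flow towards zero.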
19 Magnetic Impurities
19.1 Localized Magnetic Impurities in Metals
When transition metal or rare earth impurities are dissolved in simple metals, the electronic states on the impurities hybridize with the conduction band states and form a Friedel virtual bound state. Since the impurity states are localized, the Coulomb interaction U between two electrons occupying these states is large and has to be taken into consideration. The Hamiltonian can be expressed as
$$ \hat{H} = \hat{H}_0 + \hat{H}_{int} \tag{1963} $$
where the Hamiltonian $\hat{H}_0$ represents the non-interacting virtual bound state
$$ \hat{H}_0 = \sum_{k,\sigma} E(k) \, c^{\dagger}_{k,\sigma} c_{k,\sigma} + \sum_{\sigma} E_d \, d^{\dagger}_{\sigma} d_{\sigma} + \sum_{k,\sigma} \left[ V(k) \, c^{\dagger}_{k,\sigma} d_{\sigma} + V(k)^{*} \, d^{\dagger}_{\sigma} c_{k,\sigma} \right] \tag{1964} $$
and the Coulomb interaction U between a pair of electrons in the (spin only degenerate) impurity state is given by
$$ \hat{H}_{int} = U \, d^{\dagger}_{\uparrow} d^{\dagger}_{\downarrow} d_{\downarrow} d_{\uparrow} \tag{1965} $$
This is the Anderson impurity Hamiltonian, which is exactly soluble using numerical renormalization group or Bethe-Ansatz techniques. The mean field solution will be outlined below.
19.2 Mean Field Approximation
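The interaction term $\hat{H}_{int}$ can be represented in terms of the fluctuations about the average value
$$ \Delta \hat{n}_{\sigma} = d^{\dagger}_{\sigma} d_{\sigma} - \langle | d^{\dagger}_{\sigma} d_{\sigma} | \rangle \tag{1966} $$
and the average occupation of each spin state
$$ n_{\sigma} = \langle | d^{\dagger}_{\sigma} d_{\sigma} | \rangle \tag{1967} $$
as
$$ \hat{H}_{int} = U \, \Delta \hat{n}_{\uparrow} \, \Delta \hat{n}_{\downarrow} + U \sum_{\sigma} \Delta \hat{n}_{\sigma} \, n_{-\sigma} + U \, n_{\uparrow} n_{\downarrow} \tag{1968} $$
In the mean field approximation, the term quadratic in the occupation number fluctuations is neglected, yielding
$$ \hat{H}_{MF} = U \sum_{\sigma} \hat{n}_{\sigma} \, n_{-\sigma} - U \, n_{\uparrow} n_{\downarrow} \tag{1969} $$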
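The localized electrons experience an effective spin dependent binding energy given by
$$ \hat{H}_d = \sum_{\sigma} ( E_d + U \, n_{-\sigma} ) \, d^{\dagger}_{\sigma} d_{\sigma} \tag{1970} $$
where the average number of electrons of spin σ in the localized level is found as an integral over the density of states of the virtual bound state, which is given by
$$ n_{\sigma} = \int_{-\infty}^{\infty} d\varepsilon \, f(\varepsilon) \, \rho^{\sigma}_d(\varepsilon) \tag{1971} $$
where
$$ \rho^{\sigma}_d(\varepsilon) = \frac{1}{\pi} \, \mathrm{Im} \left[ \frac{1}{\varepsilon - E_d - U \, n_{-\sigma} - \Sigma(\varepsilon)} \right] \tag{1972} $$
The self-energy can be represented by a constant imaginary part with value ∆ and a small energy shift that can be absorbed into the definition of $E_d$. Hence, the spin dependent density of states can be approximated as
$$ \rho^{\sigma}_d(\varepsilon) = \frac{1}{\pi} \, \frac{\Delta}{\left( \varepsilon - E_d - U \, n_{-\sigma} \right)^2 + \Delta^2} \tag{1973} $$
Thus, at T = 0 one finds that
$$ n_{\sigma} = \frac{1}{\pi} \cot^{-1} \left( \frac{E_d - \mu + U \, n_{-\sigma}}{\Delta} \right) \tag{1974} $$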
where, as cot θ is defined on the interval 0 to π and runs between ∞ and − ∞, $\cot^{-1} x$ runs over the range from π to 0 as x increases from − ∞ to ∞. These two coupled equations have to be solved self-consistently. This can be done by changing to the variables
$$ \begin{aligned} x &= \frac{\mu - E_d}{\Delta} \\ y &= \frac{U}{\Delta} \end{aligned} \tag{1975} $$
which are dimensionless measures of the position of the Fermi-energy relative to the d level and of the Coulomb interaction. The pair of self-consistency equations become
$$ \begin{aligned} \cot \pi n_{\uparrow} &= ( y \, n_{\downarrow} - x ) \\ \cot \pi n_{\downarrow} &= ( y \, n_{\uparrow} - x ) \end{aligned} \tag{1976} $$
There is a non-magnetic solution
$$ n_{\uparrow} = n_{\downarrow} = n \tag{1977} $$
This has a unique solution for 0 < n < 1 given by the solution of
$$ \cot \pi n = ( y \, n - x ) \tag{1978} $$
corresponding to a partial occupation of the localized levels. In this case, the virtual bound state does not possess a magnetic moment. However, if y is large the equations have two magnetic solutions. These solutions only occur for sufficiently large values of y, as can be seen by linearizing the self-consistency equations in terms of the variable m defined by
$$ n_{\sigma} = n + \sigma \, m \tag{1979} $$
On equating the first two terms in the expansion in m, one finds
$$ \begin{aligned} \cot \pi n &= ( y \, n - x ) \\ \frac{\pi}{\sin^2 \pi n} &= y \end{aligned} \tag{1980} $$
These can be re-written as
$$ \begin{aligned} \frac{x}{y} &= \frac{1}{2 \pi} \, ( \theta - \sin \theta ) \\ \frac{1}{y} &= \frac{1}{2 \pi} \, ( 1 - \cos \theta ) \end{aligned} \tag{1981} $$
where θ = 2 π n. This leads to the identification of the line separating the area of phase space in which the impurity is magnetic from the area in which the impurity is non-magnetic. The tendency for magnetism is strongest when the d-d interaction U is large and when n is close to 1/2, i.e., when $E_d$ and $E_d + U$ are positioned symmetrically about the Fermi-level. In this case the total number of d electrons of both spins is almost unity. The non-magnetic solution occurs when U is small or when the d level is either almost completely filled or almost completely empty. For large y, the magnetic solutions are described by
$$ \begin{aligned} n_{\sigma} &= 1 - \frac{1}{\pi \, y} \\ n_{-\sigma} &= \frac{1}{\pi \, ( y - x )} \end{aligned} \tag{1982} $$
These solutions are doubly degenerate and correspond to the spin up and spin down states of the impurity. It is to be expected that the solution should have a continuous symmetry with respect to the orientation of the impurity spin. However, the spin rotational invariance has been specifically broken by the mean field approximation through the choice of a specific quantization axis. Thus, the mean field solution of the Anderson model contains magnetic and non-magnetic solutions. The appearance of magnetic moments of transition metal impurities in metals can be interpreted in terms of the change of position and width of the virtual bound state.
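The pair of self-consistency equations can be solved by direct iteration. The following is a minimal sketch in terms of the dimensionless variables x and y, assuming T = 0 and a slightly spin-polarized starting guess so that a magnetic solution can be reached when one exists; the function names and starting values are illustrative choices.

```python
import math


def arccot(value):
    """Inverse cotangent on the branch that runs from pi to 0."""
    return math.pi / 2.0 - math.atan(value)


def mean_field_occupations(x, y, n_up=0.9, n_down=0.1, iterations=2000):
    """Iterate n_sigma = (1/pi) arccot(y n_{-sigma} - x) for both spins together."""
    for _ in range(iterations):
        n_up, n_down = (arccot(y * n_down - x) / math.pi,
                        arccot(y * n_up - x) / math.pi)
    return n_up, n_down


if __name__ == "__main__":
    # Symmetric case x = y / 2, below and above the mean field threshold.
    for y in (2.0, 6.0):
        print(y, mean_field_occupations(y / 2.0, y))
```

In the symmetric case x = y/2 the linearized condition $\pi / \sin^2 \pi n = y$ with n = 1/2 gives the threshold y = π, so that the iteration relaxes to the non-magnetic solution for y = 2 and to a solution with unequal spin up and spin down occupations for y = 6.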
19.2.1 The Atomic Limit
In the case when the hybridization is set to zero, the d orbital is entirely localized. The local level is entirely decoupled from the conduction band and the model is exactly soluble. The local d level can be described in terms of the eigenstates of the d number operator. There are four basis states: one state which corresponds to the d level being unoccupied, with energy 0; two states which correspond to the d level being occupied by one electron, with energy $E_d$; and one state in which the d level is occupied by two electrons, with energy $2 E_d + U$. The excitation energy required to put an additional particle in the d shell is, therefore, either $E_d$ or $E_d + U$ depending on whether the initial d state of the impurity is unoccupied or singly occupied. The ladder of excitations between these four states is described by the four operators
$$ d^{\dagger}_{\sigma} \, ( 1 - d^{\dagger}_{-\sigma} d_{-\sigma} ) \qquad d^{\dagger}_{\sigma} \, d^{\dagger}_{-\sigma} d_{-\sigma} \tag{1983} $$
The first two take the system from the non-degenerate vacuum state to the doubly degenerate singly occupied states, and the second pair of operators take the system from the doubly degenerate singly occupied states to the non-degenerate doubly occupied state.
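The level scheme of the atomic limit can be tabulated directly. The short sketch below enumerates the four configurations of the d level and the two energies required to add an electron; the function names and the numerical values used in any call are illustrative.

```python
def atomic_limit_levels(e_d, u):
    """Energies of the four d level configurations, keyed by (n_up, n_down)."""
    return {
        (n_up, n_down): e_d * (n_up + n_down) + u * n_up * n_down
        for n_up in (0, 1)
        for n_down in (0, 1)
    }


def addition_energies(e_d, u):
    """Energy to add an electron to the empty and to the singly occupied level."""
    levels = atomic_limit_levels(e_d, u)
    return (levels[(1, 0)] - levels[(0, 0)], levels[(1, 1)] - levels[(0, 1)])
```

The two addition energies are $E_d$ and $E_d + U$, so a singly occupied level is stable when the Fermi-energy lies between them.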
19.3 The Schrieffer-Wolff Transformation
If the local magnetic impurity has a narrow width and is almost completely occupied by one electron, then the Anderson Model can be mapped onto a model of a localized magnetic moment by the Schrieffer-Wolff transformation. The zero-th order Hamiltonian can be considered to be the terms in which the hybridization is set to zero. Thus, for the present purposes one may write
$$ \hat{H} = \hat{H}_0 + \hat{H}_V \tag{1984} $$
where $\hat{H}_0$ describes the ionic d states and the conduction band states.
$$ \hat{H}_0 = \sum_{k,\sigma} E(k) \, c^{\dagger}_{k,\sigma} c_{k,\sigma} + \sum_{\sigma} E_d \, d^{\dagger}_{\sigma} d_{\sigma} + U \, d^{\dagger}_{\uparrow} d^{\dagger}_{\downarrow} d_{\downarrow} d_{\uparrow} \tag{1985} $$
The Hamiltonian $\hat{H}_V$ is the hybridization which couples the local and conduction band states.
$$ \hat{H}_V = \sum_{k,\sigma} \left[ V(k) \, c^{\dagger}_{k,\sigma} d_{\sigma} + V(k)^{*} \, d^{\dagger}_{\sigma} c_{k,\sigma} \right] \tag{1986} $$
The Schrieffer-Wolff transformation is based on a canonical transformation which acts on the operators $\hat{A}$ and is of the form
$$ \hat{A}' = \exp\left[ + \hat{S} \right] \hat{A} \, \exp\left[ - \hat{S} \right] \tag{1987} $$
where $\hat{S}$ is an anti-Hermitean operator. That is, the operator $\hat{S}$ satisfies
$$ \hat{S}^{\dagger} = - \hat{S} \tag{1988} $$
Thus, if the operator $\hat{A}$ is Hermitean then $\hat{A}'$ is also Hermitean. The canonical transformation leads to the same expectation values if the states $| \Psi \rangle$ are also transformed to
$$ | \Psi' \rangle = \exp\left[ + \hat{S} \right] | \Psi \rangle \tag{1989} $$
In particular, the eigenvalues of $\hat{H}'$ and $\hat{H}$ are identical. The Schrieffer-Wolff transformation $\hat{S}$ is chosen such that the terms linear in the hybridization $\hat{H}_V$ vanish in the transformed Hamiltonian $\hat{H}'$. This can only be achieved if $\hat{S}$ is assumed to be of the same order as $\hat{H}_V$. In this case, the transformed Hamiltonian can be expanded in powers of $\hat{H}_V$ and $\hat{S}$. On retaining the terms up to second order, one finds
$$ \hat{H}' = \hat{H}_0 + \hat{H}_V + [ \hat{S} , \hat{H}_0 ] + [ \hat{S} , \hat{H}_V ] + \frac{1}{2!} \, [ \hat{S} , [ \hat{S} , \hat{H}_0 ] ] + \ldots \tag{1990} $$
The operator $\hat{S}$ is chosen such that the terms linear in V(k) vanish. Hence, it is required that $\hat{S}$ satisfies the linear equation
$$ [ \hat{S} , \hat{H}_0 ] = - \hat{H}_V \tag{1991} $$
This is an operator equation, and $\hat{S}$ is determined if all its matrix elements are known. This requires that a complete set of states be used. The simplest complete set consists of the eigenstates of $\hat{H}_0$, $| \phi_n \rangle$, with eigenvalues $E_n$. In this case, the matrix elements are found as
$$ \langle \phi_m | \hat{S} | \phi_n \rangle = \frac{\langle \phi_m | \hat{H}_V | \phi_n \rangle}{E_m - E_n} \tag{1992} $$
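Thus, the operator $\hat{S}$ connects states which differ through the presence of an additional conduction electron and a deficiency of an electron in the local orbital, and vice versa. The energy denominators are of the form $E(k) - E_d$ or $E(k) - E_d - U$ depending on the state of occupation of the local level. Thus, the anti-Hermitean operator $\hat{S}$ can be expressed in terms of the four creation operators for the localized level. The operator $\hat{S}$ is found as
$$ \begin{aligned} \hat{S} = \frac{1}{\sqrt{N}} \sum_{k,\sigma} \Bigg[ \; & V(k) \left( 1 - d^{\dagger}_{-\sigma} d_{-\sigma} \right) \frac{c^{\dagger}_{k,\sigma} d_{\sigma}}{E(k) - E_d} \\ + \; & V(k) \, d^{\dagger}_{-\sigma} d_{-\sigma} \, \frac{c^{\dagger}_{k,\sigma} d_{\sigma}}{E(k) - E_d - U} \\ - \; & V(k)^{*} \left( 1 - d^{\dagger}_{-\sigma} d_{-\sigma} \right) \frac{d^{\dagger}_{\sigma} c_{k,\sigma}}{E(k) - E_d} \\ - \; & V(k)^{*} \, d^{\dagger}_{-\sigma} d_{-\sigma} \, \frac{d^{\dagger}_{\sigma} c_{k,\sigma}}{E(k) - E_d - U} \Bigg] \end{aligned} \tag{1993} $$
Having determined the operator $\hat{S}$, the Hamiltonian to second order in V is given by
$$ \begin{aligned} \hat{H}' &= \hat{H}_0 + [ \hat{S} , \hat{H}_V ] + \frac{1}{2!} \, [ \hat{S} , [ \hat{S} , \hat{H}_0 ] ] + \ldots \\ &= \hat{H}_0 + \frac{1}{2!} \, [ \hat{S} , \hat{H}_V ] + \ldots \end{aligned} \tag{1994} $$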
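The structure of the canonical transformation can be checked on a small matrix example. The sketch below is not the Anderson model itself: it assumes a finite set of non-degenerate levels of $\hat{H}_0$ coupled by a real symmetric matrix that plays the role of $\hat{H}_V$, builds the generator from its matrix elements $\langle \phi_m | \hat{H}_V | \phi_n \rangle / ( E_m - E_n )$, and forms $\hat{H}_0 + \frac{1}{2} [ \hat{S} , \hat{H}_V ]$. The helper names are illustrative, and levels that are coupled must not be degenerate.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(a, b):
    """[a, b] = a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    size = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(size)] for i in range(size)]


def generator(levels, h_v):
    """Matrix elements S_mn = (H_V)_mn / (E_m - E_n), with a zero diagonal."""
    size = len(levels)
    return [[h_v[m][n] / (levels[m] - levels[n]) if m != n else 0.0
             for n in range(size)] for m in range(size)]


def effective_hamiltonian(levels, h_v):
    """H_0 + (1/2) [S, H_V], the transformed Hamiltonian to second order."""
    size = len(levels)
    correction = commutator(generator(levels, h_v), h_v)
    return [[(levels[i] if i == j else 0.0) + 0.5 * correction[i][j]
             for j in range(size)] for i in range(size)]
```

For two levels separated by an energy much larger than the coupling, the transformed Hamiltonian is diagonal and its entries reproduce the exact eigenvalues up to corrections of fourth order in the coupling.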
The transformed Hamiltonian $\hat{H}'$ contains an interaction term whereby the conduction electrons are scattered from the different singly occupied states of the d impurity. On expressing the conduction band factors in terms of the matrix elements of the Pauli-spin matrices
$$ \hat{\sigma}^{\alpha}_{k,k'} = \sum_{\delta,\gamma} c^{\dagger}_{k,\delta} \, \langle \delta | \sigma^{\alpha} | \gamma \rangle \, c_{k',\gamma} \tag{1995} $$
and likewise for the local operators
$$ \hat{S}^{\alpha} = \sum_{\delta,\gamma} d^{\dagger}_{\delta} \, \langle \delta | \sigma^{\alpha} | \gamma \rangle \, d_{\gamma} \tag{1996} $$
one finds that in addition to a potential scattering term there is also an interaction between the components of the spin density operators. The spin-flip contribution of the interaction is of the form
$$ \begin{aligned} \hat{H}_{spin-flip} = \frac{1}{2} \sum_{k,k',\sigma} & \left[ \frac{1}{E(k') - E_d} + \frac{1}{E(k') - E_d - U} \right] \\ & \times \left( V(k) \, V(k')^{*} \, c^{\dagger}_{k',-\sigma} c_{k,\sigma} \, d^{\dagger}_{\sigma} d_{-\sigma} + V(k) \, V(k')^{*} \, c^{\dagger}_{k,\sigma} c_{k',-\sigma} \, d^{\dagger}_{-\sigma} d_{\sigma} \right) \end{aligned} \tag{1997} $$
To describe scattering of electrons close to the Fermi-energy one may set E(k) = E(k'); then the effective exchange interaction has the strength
$$ J_{k,k'} = \mathrm{Re} \left[ V(k) \, V(k')^{*} \right] \left[ \frac{1}{E_d - E(k)} + \frac{1}{U + E_d - E(k)} \right] \tag{1998} $$
The total spin dependent part of the interaction is recognized as just involving the scalar product of the Fourier components of the two spin densities. For a singly occupied level, where $E_d - \mu$ is negative, the coefficient $J_{k,k'}$ also has a negative sign if U is sufficiently large, so that the energy is lowered whenever the expectation values of the two spin density operators are anti-parallel. Thus, classically, the energy is lowered whenever the polarization produced by the conduction electron gas is anti-parallel to the spin of the local moment. This type of coupling is known as an anti-ferromagnetic interaction. The alternative type of coupling occurs when the sign of J is positive, and the ferromagnetic interaction attempts to polarize the conduction electron spin density to be parallel to the local spin density.
19.3.1 The Kondo Hamiltonian

The resulting Hamiltonian is the Kondo Hamiltonian; it contains an interaction between the localized magnetic moment and the spins of the conduction electrons. The Hamiltonian can be expressed as
$$\hat{H} = \hat{H}_0 + \hat{H}_{int} \tag{1999}$$
where $\hat{H}_0$ represents the Hamiltonian for the conduction electrons
$$\hat{H}_0 = \sum_{k,\sigma} E(k) c^{\dagger}_{k,\sigma} c_{k,\sigma} \tag{2000}$$
and the interaction is given by
$$\hat{H}_{int} = - J \, S . \sigma(0) \tag{2001}$$
where $S$ is a local moment and $\sigma(0)$ is the spin of the conduction electrons at the position of the impurity spin. The components of the conduction electron spin are given in terms of matrix elements of the Pauli-spin matrices
$$\sigma^{\alpha}(0) = \frac{1}{N} \sum_{k,k';\gamma,\delta} c^{\dagger}_{k,\delta} \langle \delta | \sigma^{\alpha} | \gamma \rangle c_{k',\gamma} \tag{2002}$$
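It is convenient to write the spin dependent interaction in terms of the spin raising and lowering operators for the local spin and the conduction electron spin density
$$\hat{S}^{\pm} = \hat{S}^{x} \pm i \hat{S}^{y} , \qquad \hat{\sigma}^{\pm} = \hat{\sigma}^{x} \pm i \hat{\sigma}^{y} \tag{2003}$$
with the aid of the identity
$$\hat{S} . \hat{\sigma} = \hat{S}^{z} \hat{\sigma}^{z} + \frac{1}{2} ( \hat{S}^{+} \hat{\sigma}^{-} + \hat{S}^{-} \hat{\sigma}^{+} ) \tag{2004}$$
Hence, the interaction is written as
$$\hat{H}_{int} = - \frac{J}{N} \frac{1}{2} \sum_{k,k'} \left[ S^{z} ( c^{\dagger}_{k,\uparrow} c_{k',\uparrow} - c^{\dagger}_{k,\downarrow} c_{k',\downarrow} ) + S^{+} c^{\dagger}_{k,\downarrow} c_{k',\uparrow} + S^{-} c^{\dagger}_{k,\uparrow} c_{k',\downarrow} \right] \tag{2005}$$

19.4 The Resistance Minimum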
The Kondo effect results in a minimum in the resistivity of metals. The minimum arises from the competition between the increasing $T^5$ resistivity caused by electron-phonon scattering and a decreasing contribution from impurity spin-flip scattering which, in an intermediate temperature regime, follows a $\ln T$ variation
$$\rho(T) = \rho(0) + b \, T^5 - c \, \rho_1 \, J \rho(\mu) \, S ( S + 1 ) \ln ( k_B T \rho(\mu) ) \tag{2006}$$
where $c$ is the concentration of impurities. Then, the resistivity shows a minimum at a concentration-dependent temperature
$$T_{min} = \left( \frac{\rho_1 \, J \rho(\mu) \, S ( S + 1 )}{5 b} \right)^{\frac{1}{5}} c^{\frac{1}{5}} \tag{2007}$$
in agreement with experimental findings. The $\ln T$ term in the resistivity comes from scattering processes to third order in $J$. This can be seen by considering the T-matrix for non-spin-flip scattering of an up-spin electron in second order. The T-matrix will be evaluated on the energy shell $E(k) = E(k')$, and $E$ will be set to the ground state energy. To lowest order, the non-spin-flip scattering matrix elements are given by
$$\langle k' \uparrow | T^{(1)}(E + i \eta) | k \uparrow \rangle = \frac{J}{N} S^{z} \tag{2008}$$
whereas to second order one has four non-zero contributions, two contributions from the spin-flip part ( $\hat{S}^{\pm}$ ) of the interactions and two contributions from
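the non-spin-flip part ( $\hat{S}^{z}$ ). The non-spin-flip part gives rise to a term in $T^{(2)}(E + i \eta)$ of
$$\langle k' \uparrow | T^{(2)}_{zz}(E + i \eta) | k \uparrow \rangle = \left( \frac{J}{N} \right)^2 \left( \frac{S^{z}}{2} \right)^2 \sum_{k_1,k_2} \langle k' \uparrow | ( c^{\dagger}_{k_1,\uparrow} c_{k'_1,\uparrow} - c^{\dagger}_{k_1,\downarrow} c_{k'_1,\downarrow} ) \frac{1}{E - \hat{H}_0 + i \eta} ( c^{\dagger}_{k_2,\uparrow} c_{k'_2,\uparrow} - c^{\dagger}_{k_2,\downarrow} c_{k'_2,\downarrow} ) | k \uparrow \rangle \tag{2009}$$
As only the spin-up terms contribute to the scattering of the spin-up electron, the term simplifies to yield
$$= \left( \frac{J}{N} \right)^2 \left( \frac{S^{z}}{2} \right)^2 \sum_{k_1,k_2} \langle k' \uparrow | c^{\dagger}_{k_1,\uparrow} c_{k'_1,\uparrow} \frac{1}{E - \hat{H}_0 + i \eta} c^{\dagger}_{k_2,\uparrow} c_{k'_2,\uparrow} | k \uparrow \rangle \tag{2010}$$
This has two contributions, one which corresponds to $k' = k_1$ and $k = k'_2$ and the other with $k' = k_2$ and $k = k'_1$. The sum of these terms is evaluated as
$$= \left( \frac{J}{N} \right)^2 \left( \frac{S^{z}}{2} \right)^2 \sum_{k_2} \frac{1 - f(E(k_2))}{E(k) - E(k_2) + i \eta} - \left( \frac{J}{N} \right)^2 \left( \frac{S^{z}}{2} \right)^2 \sum_{k_1} \frac{f(E(k_1))}{E(k_1) - E(k) + i \eta} = \left( \frac{J}{N} \right)^2 \left( \frac{S^{z}}{2} \right)^2 \sum_{k_1} \frac{1}{E(k) - E(k_1) + i \eta} \tag{2011}$$
The singularity at $E(k_1) = E(k)$ yields a finite result when integrated over $k_1$. Thus, there is no non-analytic behavior originating from the $\hat{S}^{z}$ terms in the interaction; their contribution is of the order $J^2 \rho(\mu)$, which is just a factor of $J \rho(\mu)$ smaller than the leading contribution to the T-matrix.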
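The two spin-flip contributions to the T-matrix are given by
$$\langle k' \uparrow | T^{(2)}_{+-}(E + i \eta) | k \uparrow \rangle = \left( \frac{J}{2N} \right)^2 \sum_{k_1,k_2} \langle k' \uparrow | \hat{S}^{+} c^{\dagger}_{k_1,\downarrow} c_{k'_1,\uparrow} \frac{1}{E - \hat{H}_0 + i \eta} \hat{S}^{-} c^{\dagger}_{k_2,\uparrow} c_{k'_2,\downarrow} | k \uparrow \rangle \tag{2012}$$
and
$$\langle k' \uparrow | T^{(2)}_{-+}(E + i \eta) | k \uparrow \rangle = \left( \frac{J}{2N} \right)^2 \sum_{k_1,k_2} \langle k' \uparrow | \hat{S}^{-} c^{\dagger}_{k_1,\uparrow} c_{k'_1,\downarrow} \frac{1}{E - \hat{H}_0 + i \eta} \hat{S}^{+} c^{\dagger}_{k_2,\downarrow} c_{k'_2,\uparrow} | k \uparrow \rangle \tag{2013}$$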
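respectively. These terms are calculated to be
$$\langle k' \uparrow | T^{(2)}_{+-}(E + i \eta) | k \uparrow \rangle = \left( \frac{J}{2N} \right)^2 \sum_{k_1,k_2} \langle k' \uparrow | c^{\dagger}_{k_1,\downarrow} c_{k'_1,\uparrow} \frac{\hat{S}^{+} \hat{S}^{-}}{E - \hat{H}_0 + i \eta} c^{\dagger}_{k_2,\uparrow} c_{k'_2,\downarrow} | k \uparrow \rangle = \left( \frac{J}{2N} \right)^2 \hat{S}^{+} \hat{S}^{-} \sum_{k_2} \frac{f(E(k_2))}{E(k) - E(k_2) + i \eta} \tag{2014}$$
and
$$\langle k' \uparrow | T^{(2)}_{-+}(E + i \eta) | k \uparrow \rangle = \left( \frac{J}{2N} \right)^2 \sum_{k_1,k_2} \langle k' \uparrow | c^{\dagger}_{k_1,\uparrow} c_{k'_1,\downarrow} \frac{\hat{S}^{-} \hat{S}^{+}}{E - \hat{H}_0 + i \eta} c^{\dagger}_{k_2,\downarrow} c_{k'_2,\uparrow} | k \uparrow \rangle = \left( \frac{J}{2N} \right)^2 \hat{S}^{-} \hat{S}^{+} \sum_{k_1} \frac{1 - f(E(k_1))}{E(k) - E(k_1) + i \eta} \tag{2015}$$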
In this case, the two terms cannot be combined to give a result independent of the Fermi-function, as $\hat{S}^{+}$ and $\hat{S}^{-}$ do not commute. Instead, one can use the identities
$$\hat{S}^{+} \hat{S}^{-} = S^2 - ( \hat{S}^{z} )^2 + \hat{S}^{z} , \qquad \hat{S}^{-} \hat{S}^{+} = S^2 - ( \hat{S}^{z} )^2 - \hat{S}^{z} \tag{2016}$$
The terms proportional to $S^2 - ( \hat{S}^{z} )^2$ combine to yield an analytic contribution to the T-matrix of
$$\langle k' \uparrow | T^{(2)}_{sf}(E + i \eta) | k \uparrow \rangle = \left( \frac{J}{2N} \right)^2 ( S^2 - ( \hat{S}^{z} )^2 ) \sum_{k_1} \frac{1}{E(k) - E(k_1) + i \eta} \tag{2017}$$
whereas the remaining contribution is proportional to $\hat{S}^{z}$, and the integration is divergent at $E(k) = E(k_1)$ but is cut off by the Fermi-function.
$$\langle k' \uparrow | T^{(2)}_{sf}(E + i \eta) | k \uparrow \rangle = \left( \frac{J}{2N} \right)^2 \hat{S}^{z} \sum_{k_1} \frac{2 f(E(k_1)) - 1}{E(k) - E(k_1) + i \eta} \tag{2018}$$
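The way the Fermi-function factor $2 f - 1$ in (2018) turns the pole into a logarithm can be illustrated with a small numerical sketch. It is not part of the original derivation and rests on assumptions made purely for illustration: a flat density of states `rho` on a band $[-D, D]$, zero temperature, $\mu = 0$, and an energy $0 < E < D/2$ measured from the Fermi level. Under these assumptions the principal-value integral has the closed form $\rho \ln [ ( D^2 - E^2 ) / E^2 ]$, which grows as $- 2 \rho \ln E$ when $E \rightarrow 0$. The function names are hypothetical.

```python
import math

def spin_flip_sum_closed(E, D, rho):
    """Principal value of rho*(2f-1)/(E-eps) over a flat band [-D, D] at T=0, mu=0.

    Assumes f = 1 for eps < 0, f = 0 for eps > 0, and 0 < E < D/2.
    """
    return rho * math.log((D * D - E * E) / (E * E))

def spin_flip_sum_numeric(E, D, rho, n=100000):
    """Midpoint-rule evaluation of the same integral.

    The window 0 < eps < 2E is dropped: 1/(E-eps) is odd about eps = E there,
    so its principal value vanishes.
    """
    total = 0.0
    h = D / n                      # occupied states, eps in [-D, 0], weight 2f-1 = +1
    for i in range(n):
        eps = -D + (i + 0.5) * h
        total += h / (E - eps)
    h = (D - 2.0 * E) / n          # empty states, eps in [2E, D], weight 2f-1 = -1
    for i in range(n):
        eps = 2.0 * E + (i + 0.5) * h
        total -= h / (E - eps)
    return rho * total
```

Lowering `E` by a factor of ten raises the result by close to `2 * rho * math.log(10)`, which is the logarithmic growth that the Fermi-function cut-off leaves behind.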
At finite temperatures, either the Fermi-function acts as a cut-off for the singularity when the scattered particle is on the Fermi-surface $E(k) = \mu$, or if the scattered particle is off the Fermi-surface, the excitation energy acts as a cut-off. In the latter case, the second order contribution to the real part of the T-matrix can be evaluated as
$$\langle k' \uparrow | T^{(2)}_{sf}(E + i \eta) | k \uparrow \rangle \sim \left( \frac{J}{2N} \right)^2 \hat{S}^{z} \rho(\mu) \ln \left[ ( E(k) - \mu ) \rho(\mu) \right] \tag{2019}$$
which is divergent when $E(k)$ approaches $\mu$. Thus, this second order term can be as large as the first order term, which is also proportional to $\hat{S}^{z}$. The scattering rate which enters into the resistivity is proportional to the square of the T-matrix and is found as
$$\frac{1}{\tau} = \frac{2 \pi}{\hbar} \rho(\mu) J^2 \frac{S ( S + 1 )}{3} \left[ 1 - 4 J \rho(\mu) \ln ( k_B T \rho(\mu) ) + \ldots \right] \tag{2020}$$
which gives the logarithmically increasing resistivity for magnetic impurities in simple metals. Since the logarithmic divergence is caused by spin-flip scattering in the intermediate states, the application of a field should suppress the Kondo effect.

The resistivity and the T-matrix do not diverge at $T = 0$. The logarithmic dependence found in perturbation theory saturates when all the scattering processes are taken into account. The leading order logarithmic coefficient of each term in the perturbation expansion series (in powers of $J \rho(\mu)$) can be calculated by various means (Abrikosov). In the ferromagnetic case, where $J > 0$, the saturation occurs at a characteristic Kondo energy or Kondo temperature $T_K$ given by
$$2 J \rho(\mu) \ln ( k_B T_K \rho(\mu) ) = - 1 \tag{2021}$$
or
$$k_B T_K = \rho(\mu)^{-1} \exp \left[ - \frac{1}{2 J \rho(\mu)} \right] \tag{2022}$$
and all the results are finite. For the case of anti-ferromagnetic coupling, the physics scales to a strong coupling fixed point (Anderson), so the solution must be obtained by other means such as the Bethe Ansatz (Andrei, Wiegmann). The properties of the anti-ferromagnetic solution include the cross-over from a high temperature ( $T > T_K$ ) Curie susceptibility for the free impurity moments to a Pauli paramagnetic susceptibility for $T < T_K$, and the specific heat originating from the impurity changes from a constant value at high temperatures to a low-temperature form having a linear $T$ dependence. This indicates that the magnetic moments of the impurity are being removed and that, at low temperatures, the properties are those of a narrow virtual bound state of width $k_B T_K$ located near the Fermi-energy. In fact, analysis shows that the magnetic moments are being screened by a compensating polarization of conduction electrons, and that the cloud and moment form a singlet bound state of binding energy $k_B T_K$. For $T < T_K$ the conduction electrons occupy the bound state and the moment is screened; for $T > T_K$ the bound state is thermally depopulated and the system exhibits properties of the free moments.

From the perspective of the Anderson impurity model, the density of states that is found at high temperatures follows directly from Anderson's picture of a spin split virtual bound state. However, as $T$ decreases below $T_K$, the density of states shows a sharp peak of width $k_B T_K$ growing in the vicinity of $\mu$. In the low-temperature limit, the height of the Abrikosov-Suhl peak saturates on the order of $( k_B T_K )^{-1}$. Thus, the low-temperature properties can be directly understood in terms of the virtual bound state with a density of states which is very large, $\propto T_K^{-1}$. The properties of this low-temperature Fermi-liquid were established by Nozières.
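As a closing numerical check on the resistance minimum of equations (2006) and (2007), the sketch below locates the minimum of $\rho(T)$ by a golden-section search and compares it with the closed form. The lumped parameter `A` (standing for $\rho_1 J \rho(\mu) S(S+1)$), the choice of units in which $k_B \rho(\mu) = 1$, the function names, and the sample numbers in the usage note are all assumptions made for illustration only.

```python
import math

def resistivity(T, rho0, b, c, A):
    """Equation (2006) with A = rho_1*J*rho(mu)*S*(S+1) and k_B*rho(mu) = 1 (assumed units)."""
    return rho0 + b * T**5 - c * A * math.log(T)

def t_min_closed_form(b, c, A):
    """Equation (2007): T_min = (A / (5*b))**(1/5) * c**(1/5)."""
    return (c * A / (5.0 * b)) ** 0.2

def t_min_numeric(rho0, b, c, A, lo=1e-3, hi=1e3, iters=200):
    """Locate the minimum of rho(T) on [lo, hi] by golden-section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        x1 = hi - g * (hi - lo)
        x2 = lo + g * (hi - lo)
        if resistivity(x1, rho0, b, c, A) < resistivity(x2, rho0, b, c, A):
            hi = x2      # the minimum lies to the left of x2
        else:
            lo = x1      # the minimum lies to the right of x1
    return 0.5 * (lo + hi)
```

For example, `t_min_closed_form(1e-6, 0.01, 1.0)` and `t_min_numeric(1.0, 1e-6, 0.01, 1.0)` should agree closely, and multiplying the concentration `c` by 32 doubles the closed-form result, which is the $c^{1/5}$ scaling of (2007).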
20 Collective Phenomenon

21 Itinerant Magnetism

21.1 Stoner Theory
The Stoner theory of itinerant magnetism examines the stability of a band of electrons to Coulomb interactions (E.C. Stoner, Rep. Prog. in Phys. 11, 43 (1948)). The Hamiltonian is expressed as the sum of two terms, $\hat{H}_0$ describing the non-interacting electrons in the Bloch states and $\hat{H}_{int}$ describing the Coulomb repulsion between the electrons
$$\hat{H} = \hat{H}_0 + \hat{H}_{int} \tag{2023}$$
The Hamiltonian for the non-interacting electrons in the Bloch states is written as
$$\hat{H}_0 = \sum_{k,\sigma} E(k) \, n_{k,\sigma} \tag{2024}$$
The interaction Hamiltonian is given by
$$\hat{H}_{int} = \frac{U}{2} \sum_{i,\sigma} \hat{n}_{i,\sigma} \hat{n}_{i,-\sigma} \tag{2025}$$
where $U$ represents the short ranged Coulomb interaction between a pair of electrons occupying the orbitals on the i-th lattice site. The operator $\hat{n}_{i,\sigma}$ corresponds to the number of electrons of spin $\sigma$ which occupy the i-th lattice site. It is assumed that the band is non-degenerate; therefore, there is only one orbital per lattice site which, due to the limitations of the Pauli exclusion principle, can only hold a maximum of two electrons.

The interaction is treated in the mean field approximation. First it shall be assumed that translational invariance holds, so that the orbitals in each unit cell have the same occupation numbers. Also the Hamiltonian is expanded in powers of the fluctuation operator $\Delta \hat{n}_{i,\sigma} = \hat{n}_{i,\sigma} - n_{\sigma}$ so that
$$\hat{H}_{int} = \frac{U}{2} \sum_{i,\sigma} \left[ \Delta \hat{n}_{i,\sigma} \Delta \hat{n}_{i,-\sigma} + n_{\sigma} \Delta \hat{n}_{i,-\sigma} + n_{-\sigma} \Delta \hat{n}_{i,\sigma} + n_{-\sigma} n_{\sigma} \right] \tag{2026}$$
and then the second order fluctuations are ignored. This leads to the interaction energy being approximated in terms of single particle operators
$$\hat{H}_{int} \approx \frac{U}{2} \sum_{i,\sigma} \left[ \hat{n}_{i,\sigma} n_{-\sigma} + \hat{n}_{i,-\sigma} n_{\sigma} - n_{-\sigma} n_{\sigma} \right] = \frac{U}{2} \sum_{k,\sigma} \left[ \hat{n}_{k,\sigma} n_{-\sigma} + \hat{n}_{k,-\sigma} n_{\sigma} - n_{-\sigma} n_{\sigma} \right] \tag{2027}$$
Thus, in the mean field approximation, the Hamiltonian is given by
$$\hat{H}_{MF} = \sum_{k,\sigma} ( E(k) + U n_{-\sigma} ) \, n_{k,\sigma} - N \frac{U}{2} \sum_{\sigma} n_{-\sigma} n_{\sigma} \tag{2028}$$
The single particles have the spin dependent energy eigenvalues
$$E_{\sigma}(k) = E(k) + U n_{-\sigma} \tag{2029}$$
The magnetization is given by
$$M_z = \frac{1}{2} g \mu_B ( n_{+} - n_{-} ) = \mu_B \int_{-\infty}^{\infty} d\epsilon \, f(\epsilon) \left[ \rho( \epsilon - U n_{-} ) - \rho( \epsilon - U n_{+} ) \right] \tag{2030}$$
This equation has non-magnetic solutions with $n_{+} = n_{-}$ and may have ferromagnetic solutions in which the number of up-spin electrons is greater than the number of down-spin electrons, $n_{+} \neq n_{-}$. In the ferromagnetic state, the Stoner model predicts that the up-spin sub-bands are rigidly shifted relative to the down-spin bands by the exchange splitting, which has a magnitude of $U ( n_{+} - n_{-} )$. On increasing $U$ from zero, the ferromagnetic solutions first become stable when $M_z \sim 0$, in which case the equations can be linearized to yield
$$( n_{+} - n_{-} ) = U ( n_{+} - n_{-} ) \int_{-\infty}^{\infty} d\epsilon \, f(\epsilon) \frac{\partial \rho(\epsilon)}{\partial \epsilon} \tag{2031}$$
The ferromagnetic state has the lowest energy when the self-consistency equation is satisfied. The integral can be performed by integration by parts, yielding
$$1 = U \int_{-\infty}^{\infty} d\epsilon \, f(\epsilon) \frac{\partial \rho(\epsilon)}{\partial \epsilon} = - U \int_{-\infty}^{\infty} d\epsilon \, \rho(\epsilon) \frac{\partial f(\epsilon)}{\partial \epsilon} \tag{2032}$$
At low temperatures, the derivative of the Fermi-function can be replaced by a delta function at the Fermi-energy.
$$- \frac{\partial f(\epsilon)}{\partial \epsilon} = \delta( \epsilon - \mu ) \tag{2033}$$
This yields the Stoner criterion for ferromagnetism as
$$1 < U \rho(\mu) \tag{2034}$$
where $\rho(\mu)$ is the density of states per spin at the Fermi-energy. If the Stoner criterion is satisfied, the paramagnetic state is unstable to the ferromagnetic state, and a spontaneous magnetic moment $M_z$ occurs at $T = 0$. The magnetization is given by the solution of the non-linear equation. The non-linear equation shows that the magnetization increases with increasing $U$, and saturates to a value of one Bohr magneton per electron for low-density materials in which the bands have a filling of less than one electron per atom. In systems which have bands that are more than half filled, the saturation magnetic moment is equal to a Bohr magneton per unoccupied state. At finite temperatures, the value of the magnetization is reduced and disappears at a critical temperature $T_c$. Unfortunately, Stoner theory does not predict reasonable values of the critical temperatures. In the paramagnetic state, Stoner theory predicts that the susceptibility should be exchange enhanced over the non-interacting susceptibility $\chi^0_p$ via
$$\chi_p = \frac{\chi^0_p}{1 - U \rho(\mu)} \tag{2035}$$
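The onset of a spontaneous moment at $U \rho(\mu) = 1$ can be illustrated with a minimal zero-temperature sketch. It is not taken from the text: it assumes a free-electron-like density of states per spin $\rho(\epsilon) = A \sqrt{\epsilon}$, writes the mean-field energy per site as the band energy of each spin species plus $U n_{+} n_{-}$, and minimizes it over $m = n_{+} - n_{-}$ on a grid. The names `A`, `energy`, `magnetization` and `stoner_product` are hypothetical.

```python
import math

A = 1.0  # prefactor of the assumed density of states per spin, rho(eps) = A*sqrt(eps)

def fermi_energy(n_sigma):
    """Energy up to which one spin band is filled: n_sigma = (2/3)*A*eps**1.5."""
    return (1.5 * n_sigma / A) ** (2.0 / 3.0)

def energy(m, n, U):
    """Mean-field energy per site at T = 0 for n_+ - n_- = m and n_+ + n_- = n."""
    total = U * (n + m) * (n - m) / 4.0                    # U * n_+ * n_-
    for n_sigma in ((n + m) / 2.0, (n - m) / 2.0):
        total += 0.6 * n_sigma * fermi_energy(n_sigma)     # band energy (3/5)*n*eps_F
    return total

def magnetization(n, U, steps=2000):
    """Return the m in [0, n] that minimizes the energy on a uniform grid."""
    grid = [n * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda m: energy(m, n, U))

def stoner_product(n, U):
    """U * rho(mu) in the paramagnetic state, with rho the density of states per spin."""
    return U * A * math.sqrt(fermi_energy(n / 2.0))
```

With `n = 1.0`, choosing `U` so that `stoner_product` is slightly below one gives `magnetization(n, U) == 0.0`, while a value slightly above one gives a finite moment that is still below saturation for this assumed band shape.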
For systems which are close to the ferromagnetic instability, the susceptibility should take on large values. This is the case for Pd, in which the d band is almost completely occupied.
——————————————————————————————————
21.1.1 Exercise 76
Determine the critical temperature Tc predicted by Stoner theory. ——————————————————————————————————
21.1.2 Exercise 77
Determine the paramagnetic susceptibility by using Stoner theory. ——————————————————————————————————
21.2 Linear Response Theory
The spatially varying magnetization $M_z(r)$ of a paramagnetic system and a spatially varying applied magnetic field $H_z(r)$ are related by the $z-z$ component of the magnetic susceptibility tensor, via the linear relationship
$$M_z(r) = \int d^3 r' \, \chi^{z,z}(r, r') \, H_z(r') \tag{2036}$$
This is a special case of the more general relation
$$M_{\alpha}(r) = \sum_{\beta} \int d^3 r' \, \chi^{\alpha,\beta}(r, r') \, H_{\beta}(r') \tag{2037}$$
For translationally invariant systems the response function is only a function of the difference $r - r'$. Also, for non-magnetic systems that possess spin rotational invariance, the susceptibility tensor is diagonal and the diagonal components are related via
$$\chi^{x,x}(r - r') = \chi^{y,y}(r - r') = \chi^{z,z}(r - r') \tag{2038}$$
The relation between the magnetic response and the applied field becomes simpler after Fourier transforming. The Fourier transform of the magnetization is defined as
$$M(q) = \int d^3 r \, \exp \left[ - i q . r \right] M(r) \tag{2039}$$
The Fourier transform of the magnetization is related to the Fourier transform of the applied field via
$$M_{\alpha}(q) = \sum_{\beta} \chi^{\alpha,\beta}(q) \, H_{\beta}(q) \tag{2040}$$
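The step from the real-space convolution (2036) to the product form (2040) can be checked on a toy model. The sketch below is an illustration only: it replaces the continuum by a one-dimensional ring of sites with periodic boundaries, takes a single tensor component, and uses made-up values for the response function and the field. The function names are hypothetical.

```python
import cmath

def fourier(values, q_index):
    """Discrete analogue of equation (2039) on a ring of len(values) sites."""
    n = len(values)
    return sum(v * cmath.exp(-2j * cmath.pi * q_index * r / n)
               for r, v in enumerate(values))

def respond(chi, field):
    """Real-space response M(r) = sum over r' of chi(r - r') * H(r'), periodic boundaries."""
    n = len(field)
    return [sum(chi[(r - rp) % n] * field[rp] for rp in range(n)) for r in range(n)]
```

For any `chi` and `field` of equal length, `fourier(respond(chi, field), q)` should agree with `fourier(chi, q) * fourier(field, q)` for every `q`, which is the discrete counterpart of (2040) for a response that depends only on $r - r'$.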
The response function can be evaluated from perturbation theory, in which the Zeeman interaction
$$\hat{H}_{Zeeman} = - \int d^3 r' \, M(r') . H(r') = - \int d^3 q \, M(q) . H(q) \tag{2041}$$
is treated as a small perturbation. For convenience, $\chi^{z,z}(q)$ shall be calculated by reducing it to a previously known case. Consider the change in density of electrons of spin $\sigma$, with Fourier component $q$, produced in response to an applied spin dependent potential. The Fourier component of the potential is given by
$$V_{\sigma}(q) = - \frac{g \mu_B}{2} H^{z}(q) \, \sigma + U \rho_{-\sigma}(q) \tag{2042}$$
Thus, the density is given by two coupled equations
$$\rho_{\sigma}(q) = \chi^0(q) \left[ - \frac{g \mu_B}{2} H_z \, \sigma + U \rho_{-\sigma}(q) \right] \tag{2043}$$
one for each spin polarization. In the above expression, $\chi^0(q)$ is the Lindhard density-density response function, per spin. On combining these equations with $M_z(q) = \frac{g \mu_B}{2} ( \rho_{+}(q) - \rho_{-}(q) )$ one finds that the z component of the magnetization produced by a magnetic field applied along the z direction is
$$M_z(q) = - \chi^0(q) \left[ \frac{g \mu_B}{2} \, g \mu_B H_z(q) + U M_z(q) \right] \tag{2044}$$
Thus, it is found that the Pauli paramagnetic susceptibility is
$$\chi^{z,z}_p(q) = \frac{M_z(q)}{H_z(q)} = - \frac{g^2 \mu_B^2}{4} \frac{2 \chi^0(q)}{1 + U \chi^0(q)} \tag{2045}$$
It is usual to re-write this expression in terms of the reduced non-interacting magnetic susceptibility defined by
$$\chi^{z,z}_0(q) = - 2 \chi^0(q) \tag{2046}$$
instead of the density-density response function $\chi^0(q)$. This yields the result
$$\chi^{z,z}_p(q) = \left( \frac{g \mu_B}{2} \right)^2 \frac{\chi^{z,z}_0(q)}{1 - \frac{U}{2} \chi^{z,z}_0(q)} \tag{2047}$$
Since the reduced non-interacting magnetic susceptibility is positive, and $U$ is positive, the paramagnetic susceptibility is enhanced for sufficiently small values of $U$.
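A few lines of code make the enhancement in (2047) concrete. This is a sketch rather than part of the derivation: the default values `g = 2.0` and `mu_B = 1.0` are an assumed choice of units, and the function names are hypothetical.

```python
def enhanced_susceptibility(chi0_reduced, U, g=2.0, mu_B=1.0):
    """Equation (2047): chi_p = (g*mu_B/2)**2 * chi0 / (1 - U*chi0/2)."""
    denominator = 1.0 - 0.5 * U * chi0_reduced
    if denominator <= 0.0:
        raise ValueError("paramagnetic state is unstable: U*chi0/2 >= 1")
    return (0.5 * g * mu_B) ** 2 * chi0_reduced / denominator

def critical_U(chi0_reduced):
    """Interaction strength at which the denominator of (2047) vanishes."""
    return 2.0 / chi0_reduced
```

With `chi0_reduced = 2.0`, the susceptibility doubles from `2.0` at `U = 0.0` to `4.0` at `U = 0.5`, and the denominator vanishes at `critical_U(2.0) == 1.0`.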
21.3 Magnetic Instabilities

The reduced non-interacting susceptibility may have maxima at certain values of $q$, say $Q$, which are determined by the band structure and the occupancy of the non-interacting bands, through the non-interacting susceptibility $\chi^0(q)$. If the values of the non-interacting susceptibility at these maxima are finite, then the denominator of the Pauli-paramagnetic susceptibility may become small at these $q$ values for sufficiently small values of $U$. This has the effect that, for small $U$, the Pauli-paramagnetic susceptibility is enhanced at these $Q$ values. If $U$ is increased further, there will be a critical value of $U$, $U_c$, at which point the denominator will fall to zero and the susceptibility at $Q$ will become infinite. The divergence of the susceptibility at $Q$ indicates that an infinitesimal applied field can produce a finite staggered magnetization $M^{z}(Q)$. Although the analysis was performed with the z axis being the axis of quantization, a similar analysis could have been performed with any arbitrarily chosen direction. The infinitesimal field may be produced by a spontaneous statistical fluctuation, and have an arbitrary direction. This field will force the system to order magnetically by having a finite $M(Q)$ in the spontaneously chosen direction. The system, by spontaneously choosing a direction for the magnetization, has spontaneously broken the symmetry of the Hamiltonian. The critical value of $U$, above which the paramagnetic state becomes unstable to a state with a modulated spin density $M(Q)$, is given by
$$1 = \frac{U_c}{2} \chi^{z,z}_0(Q) \tag{2048}$$
If a non-interacting system is considered which has a maximum at $Q = 0$, this reduces to the Stoner criterion for ferromagnetism as
$$\lim_{Q \rightarrow 0} \chi^{z,z}_0(Q) \rightarrow 2 \rho(\mu) \tag{2049}$$
where $\rho(\mu)$ is the density of states, per spin, at the Fermi-energy. Thus, it is found that the critical value of $U$ is given by the criterion
$$1 = U_c \, \rho(\mu) \tag{2050}$$
For values of $U$ larger than the critical value, the paramagnetic state is unstable to the formation of a ferromagnetic state. For values of $U$ greater than $U_c$, the mean field analysis has to be modified to include the effect of the spontaneous magnetization. For a ferromagnet, the interaction produces a rigid splitting between the up-spin bands and down-spin bands by an amount $\Delta = U ( n_{\sigma} - n_{-\sigma} )$ called the exchange splitting. For isotropic systems, the magnetic response will crucially depend on the direction of the applied field compared to that of the spontaneous magnetization. For a ferromagnet, the longitudinal response (produced by a field which is parallel to the spontaneous magnetization $M$) will be finite, as this corresponds to processes which excite the system by stretching the magnitude of $M$. However, the transverse response will be infinite, as this corresponds to applying a field that will rotate the direction of the spontaneous magnetization until it aligns with the applied field. As the system is isotropic, this can be achieved without any finite energy excitations. The zero energy excitations that uniformly rotate the magnetization in a ferromagnet are the $q = 0$ Goldstone modes associated with the spontaneously broken continuous spin rotational invariance of the Hamiltonian.

In three-dimensional systems, with almost spherical Fermi-surfaces, the instability can only occur at $2 k_F$. This can lead to a spin density wave which has a periodicity which is incommensurate with the underlying lattice. In low dimensional systems, such as two-dimensional and one-dimensional organic materials, there can be large sheets of the Fermi-surface which can produce a large non-interacting susceptibility at the $Q$ value connecting these sheets. This susceptibility coupled by an interaction can produce a spin density wave in which the magnetization is modulated with this wave vector. For tight-binding bands which satisfy the perfect nesting condition
$$E(k + Q) = - E(k) \tag{2051}$$
for some $Q$, the non-interacting susceptibility can be evaluated as an integral over the density of states
$$\chi^{z,z}_0(Q) = 2 \sum_{k} \frac{f(-E(k)) - f(E(k))}{2 E(k)} = 2 \int_{-\infty}^{\infty} d\epsilon \, \rho(\epsilon) \frac{1 - 2 f(\epsilon)}{2 \epsilon} \tag{2052}$$
From this one can see that if the density of states $\rho(0)$ is non-zero, the susceptibility $\chi(Q)$ will diverge logarithmically or faster than logarithmically when $\mu \rightarrow 0$. The divergence occurs when two large portions of the Fermi-surface are connected by the wave vector $Q$, which allows the system to rearrange the electrons at the Fermi-surface by zero energy excitations involving a momentum change $Q$. Thus, in this case, there is no energy penalty to be incurred in producing a spin density wave $M(Q)$. The perfect nesting condition occurs for $Q = \frac{\pi}{a} (1, 1, 1)$ in non-degenerate tight-binding bands on a simple cubic lattice, where
$$E(k) = - 2 t \sum_{i=1}^{3} \cos k_i a \tag{2053}$$
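The nesting property (2051) of the simple cubic band (2053) follows from $\cos ( x + \pi ) = - \cos x$, and it can be confirmed numerically in a few lines. The sketch below is illustrative only; the function names and the default values `t = 1.0`, `a = 1.0` are assumptions.

```python
import math

def band_energy(k, t=1.0, a=1.0):
    """Equation (2053): E(k) = -2 t (cos kx a + cos ky a + cos kz a)."""
    return -2.0 * t * sum(math.cos(ki * a) for ki in k)

def nesting_mismatch(k, t=1.0, a=1.0):
    """Return E(k + Q) + E(k) for Q = (pi/a)(1, 1, 1); zero means perfect nesting."""
    shifted = tuple(ki + math.pi / a for ki in k)
    return band_energy(shifted, t, a) + band_energy(k, t, a)
```

For any wave vector, for example `nesting_mismatch((0.3, -1.2, 2.5))`, the result should vanish to within rounding error.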
Since the bands are symmetric around $\epsilon = 0$, the non-interacting susceptibility diverges for half filled bands. In this case, the critical value of $U$ is zero. Hence, the paramagnetic state will become unstable to a state in which the magnetization exhibits spatial oscillations with wave vector $Q$, even with an infinitesimally small value of $U$. In real space, the staggered magnetization of this ordered state is given by
$$M(r) = M \cos \left( \pi \sum_{i} \frac{r_i}{a} \right) \tag{2054}$$
The staggered magnetization on the neighboring lattice sites is oppositely oriented, and is anti-ferromagnetically ordered. Since anti-ferromagnetic ordering was first proposed by Louis Néel to describe classical magnets (L. Néel, Ann. de Physique, 17, 64 (1932), Ann. de Physique 5, 256 (1936)), this type of ordering is known as Néel ordering. Unfortunately, the Néel state is not an exact ground state for a quantum system.

The occurrence of an anti-ferromagnetically ordered state may be accompanied by a metal-insulator transition. This process was first discussed by J.C. Slater. Physically, the appearance of anti-ferromagnetic order could result in a doubling of the size of the real space unit cell. The electrons of spin $\sigma$ travelling in the solid experience a periodic potential which contains a contribution due to the interaction with electrons of opposite spin. The doubling of the size of the real space unit cell, produced by the magnetic order, results in the volume of the Brillouin zone being halved. The new periodicity caused by the sub-lattice magnetization shows up as a spin dependent contribution to the potential, and may produce gaps in the electronic dispersion relations at the surface of the new Brillouin zone. If the magnitude of the spin dependent potential is large enough, a gap may occur all around the Brillouin zone, resulting in a gap in the density of states and hence the insulating state. Such insulating anti-ferromagnetic states occur in undoped La$_2$CuO$_4$, which is the parent material of some high temperature superconductors. Although the insulating anti-ferromagnetic state does not have low energy electronic excitations, it does have low energy spin excitations in the form of Goldstone modes. These are spin waves, which have the dispersion relation $\omega = c \, q$.
21.4 Spin Waves

The dynamical response of the magnetization to a time and spatially varying applied magnetic field of wave-vector $q$ and frequency $\omega$ is given by the dynamical response $\chi^{z,z}_p(q; \omega)$. The imaginary part of this response function yields the spectrum of magnetic excitations. The imaginary part of the reduced susceptibility can be measured indirectly by inelastic neutron scattering experiments, in which the neutron's spin interacts with the electronic spin density via a dipole-dipole interaction. A simple extension of our previous analysis shows that, in the mean field approximation,
$$\chi^{z,z}_p(q; \omega) = \left( \frac{g \mu_B}{2} \right)^2 \frac{\chi^{z,z}_0(q, \omega)}{1 - \frac{U}{2} \chi^{z,z}_0(q; \omega)} \tag{2055}$$
Let us examine the imaginary part of the response function for a paramagnetic metal, such as Pd, which is on the verge of an instability to a ferromagnetic state. Then,
$$\mathrm{Im} \, \chi^{z,z}_p(q; \omega) = \left( \frac{g \mu_B}{2} \right)^2 \frac{\mathrm{Im} \, \chi^{z,z}_0(q, \omega)}{\left[ 1 - \frac{U}{2} \mathrm{Re} \, \chi^{z,z}_0(q; \omega) \right]^2 + \left[ \frac{U}{2} \mathrm{Im} \, \chi^{z,z}_0(q; \omega) \right]^2} \tag{2056}$$
which on using the approximation
$$\mathrm{Re} \, \chi^{z,z}_0(q; \omega) = 2 \rho(\mu) , \qquad \mathrm{Im} \, \chi^{z,z}_0(q; \omega) = \rho(\mu) \frac{\pi \omega}{2 q v_F} \tag{2057}$$
for the Lindhard susceptibility, shows that the system exhibits a continuum of quasi-elastic magnetic excitations. As the instability is approached the spectrum is enhanced at low frequencies. These magnetic excitations are known as paramagnons. The characteristic frequency of the paramagnon excitations softens, and their lifetime correspondingly lengthens, as the value of $U$ is increased towards the critical value $U_c$. Basically, this represents a slowing down of the rate at which a small region of ferromagnetically aligned spins relaxes back to the equilibrium (paramagnetic) state. Large amplitude paramagnon fluctuations not only manifest themselves in the inelastic neutron scattering cross-section, which is directly proportional to $\mathrm{Im} [ \chi^{\alpha,\beta}(q; \omega) ]$, and in an enhanced susceptibility, but also lead to a logarithmic enhancement of the linear $T$ term in the electronic specific heat (Berk and Schrieffer, Doniach and Engelsberg), and an enhancement of the $T^2$ term in the electrical resistivity (Lederer and Mills). These characteristics have been observed in metallic Pd (Schindler and Coles).

For values of $U$ greater than $U_c$, the $q = 0$ response shows a sharp zero energy mode that represents the Goldstone mode of the system. In the ferromagnetically ordered state, these excitations form a sharp (delta function like) branch of spin waves which stretch up from $\omega = 0$ at $q = 0$. The transverse response functions are equal
$$\chi^{x,x}(q; \omega) = \chi^{y,y}(q; \omega) \tag{2058}$$
The transverse response is expressed in terms of the spin-flip response function involving the spin raising and lowering operators
$$\hat{M}^{\pm}(q) = \hat{M}^{x}(q) \pm i \hat{M}^{y}(q) \tag{2059}$$
The spin-flip response functions are
$$\chi^{+,-}(q; \omega) = \chi^{-,+}(q; \omega) = 2 \chi^{x,x}(q; \omega) \tag{2060}$$
In the limit $q \rightarrow 0$, the non-interacting transverse response is given by
$$\lim_{q \rightarrow 0} \chi^{+,-}_0(q, \omega) = \frac{\Delta / U}{\hbar \omega + \Delta} \tag{2061}$$
so the full transverse response function is given by
$$\lim_{q \rightarrow 0} \chi^{+,-}(q, \omega) = \lim_{q \rightarrow 0} \frac{\chi^{+,-}_0(q, \omega)}{1 - U \chi^{+,-}_0(q, \omega)} = \frac{\Delta}{U \hbar \omega} \tag{2062}$$
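The algebra behind the last equality of (2062) is short enough to spell out. Substituting the $q \rightarrow 0$ form (2061) into the denominator, the exchange splitting $\Delta$ cancels, which is what moves the pole to zero frequency:

```latex
\begin{align*}
1 - U \chi^{+,-}_0(q,\omega)
  &= 1 - \frac{\Delta}{\hbar\omega + \Delta}
   = \frac{\hbar\omega}{\hbar\omega + \Delta} \\
\frac{\chi^{+,-}_0(q,\omega)}{1 - U \chi^{+,-}_0(q,\omega)}
  &= \frac{\Delta}{U \, ( \hbar\omega + \Delta )} \cdot \frac{\hbar\omega + \Delta}{\hbar\omega}
   = \frac{\Delta}{U \, \hbar\omega}
\end{align*}
```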
Thus, in accordance with Goldstone's theorem, the instability has produced a dynamic response with a zero-frequency pole. For small $q$ and low frequencies $\omega < \Delta$, the spin waves have a dispersion relation of the form
$$\omega = D \, q^2 \tag{2063}$$
The spin waves of the ferromagnetic state become broader for larger $(\omega, q)$ values where the spin wave branch enters and coexists with the continuum of (Stoner) spin-flip electron-hole excitations. An anti-ferromagnet also has Goldstone modes, but unlike the ferromagnet the order parameter (the sub-lattice magnetization) for the anti-ferromagnet is not a constant of motion. This results in the dispersion of the Goldstone modes being linear in $q$, $\omega = c \, q$, similar to the transverse sound waves in a crystalline solid.

The above mean field type of analysis has shown that close to a magnetic instability, there will be large amplitude Gaussian fluctuations. This continuous spectrum of excitations is expected to soften as the instability is approached. The fluctuations are expected to be long-ranged and long-lived. However, the above mean field analysis is expected to fail close to the transition, where critical fluctuations should be taken into account. Unlike most other phase transitions, the phase transition that has just been described occurs at $T = 0$. The critical fluctuations are not thermally excited and cannot be treated classically, but are zero point fluctuations associated with the existence of a quantum critical point.
21.5 The Heisenberg Model

The above model of itinerant magnetism, which is believed to be appropriate for transition metals, involves only one type of electron. Another model is appropriate for materials which contain two types of electrons, such as rare earth materials, in which the magnetic moments occur in the f states, which are inner orbitals buried deep inside the f ion, and the interaction is mediated by the itinerant conduction electrons. The spin localized at site $R_i$ is denoted by $S_i$. The spin at site $i$ interacts with the conduction electron spin $\sigma$ at $i$ via a local exchange interaction
$$\hat{H}_{int} = - J \sum_{i} S_i . \sigma_i \tag{2064}$$
which acts like a localized magnetic field of $\frac{2}{g \mu_B} J S_i$. This localized magnetic field polarizes the conduction electrons, producing a polarization at site $j$ of
$$\frac{2}{g \mu_B} J \, S_i \, \chi^{z,z}(R_i - R_j) \tag{2065}$$
This polarization then interacts with the spin at site $j$ via the interaction, leading to an oscillatory interaction between pairs of localized spins of the form
$$\hat{H} = - \sum_{i,j} J(R_i - R_j) \, S_i . S_j = - \sum_{i,j} \left( \frac{2}{g \mu_B} \right)^2 J^2 \chi^{z,z}(R_i - R_j) \, S_i . S_j \tag{2066}$$
The oscillations are produced by the oscillations of the response function of the conduction electrons. The Fourier transform of $J(R)$ shows oscillations at $2 k_F$. This interaction was discovered independently by Ruderman and Kittel, Kasuya and Yosida.
22 Localized Magnetism

The nearest neighbor Heisenberg exchange interaction couples spins localized on adjacent lattice sites.
$$\hat{H} = - \frac{J}{2} \sum_{R,\delta} S(R + \delta) . S(R) \tag{2067}$$
This interaction Hamiltonian can be derived from the model of itinerant magnetism, for large $U$ in the case when the bands are half filled. In this case, there is a spin at each lattice site and the exchange between the spins is the anti-ferromagnetic super exchange interaction found by Anderson. The exchange constant is given in terms of the tight-binding matrix element $t$ and the Coulomb repulsion via
$$J = - \frac{t^2}{U} \tag{2068}$$
The Heisenberg Hamiltonian can be expressed as
$$\hat{H} = - \frac{J}{2} \sum_{R,\delta} \left[ S^{z}(R) S^{z}(R + \delta) + \frac{1}{2} \left( S^{+}(R) S^{-}(R + \delta) + S^{-}(R) S^{+}(R + \delta) \right) \right] \tag{2069}$$
For a ferromagnetic exchange, $J > 0$, the ground state of a three-dimensional lattice of spins of magnitude $S$ consists of parallel aligned spins. The direction of quantization is chosen as the direction of the magnetization. All the spins have their z components of the spin maximized
$$| \Psi_g \rangle = \prod_{R} | ( m_R = S ) \rangle \tag{2070}$$
This state has a total magnetization proportional to $N S$, and is an eigenstate of the Hamiltonian with energy $E_g = - 3 N J S^2$, since the z component of the Hamiltonian is diagonal and the spin-flip terms vanish, as the effects of the raising operators acting on the fully polarized state
$$S^{+}(R) | m_R = S \rangle = 0 , \qquad S^{+}(R + \delta) | m_{R+\delta} = S \rangle = 0 \tag{2071}$$
are both zero. The excitations of this system are the spin waves. The excited state wave function corresponding to a single spin wave is given by
$$| q \rangle = \frac{1}{\sqrt{N}} \sum_{R} \exp \left[ - i q . R \right] S^{-}(R) | \Psi_g \rangle \tag{2072}$$
It corresponds to a state with total spin $N S - 1$, as one spin is flipped over. This state is a coherent superposition of all states with one spin flipped and has total momentum $q$. The excitation energy is found from
$$( \hat{H} - E_g ) | q \rangle = 2 J S \left[ 3 - \cos q_x a - \cos q_y a - \cos q_z a \right] | q \rangle \tag{2073}$$
At long wave lengths, the spin wave excitation energy $\hbar \omega(q)$ is found as
$$\hbar \omega(q) = 2 J S \left[ 3 - \cos q_x a - \cos q_y a - \cos q_z a \right] \sim J S \, q^2 a^2 \tag{2074}$$
which is the branch of collective Goldstone modes which restore the spontaneously broken spin rotational invariance of the ferromagnet. The vanishing of the frequency at $q = 0$ is a general consequence of the total spin being a constant of motion, and in this limit the spin wave state just corresponds to a reduced value of the total magnetization, $\sum_{R} \exp [ - i q . R ] S^{-}(R) \rightarrow S^{-}$.

The above excitations of the spin system are small amplitude excitations that have a close resemblance to the harmonic phonons of the crystalline lattice. The effects of the interactions could be expected to produce small anharmonic corrections to these excitations, providing them with a lifetime and a renormalized dispersion relation. Not all the excitations can be expressed as small amplitude excitations; some systems have large amplitude soliton excitations that cannot be treated by perturbation theory. However, the small amplitude excitations can be adequately treated as harmonic modes, as can be seen from an analysis based on the Holstein - Primakoff transformation.
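The dispersion relation (2074) and its long wavelength limit are easy to compare numerically. The sketch below is illustrative only; the function names and the default values `J = 1.0`, `S = 0.5`, `a = 1.0` are assumptions.

```python
import math

def spin_wave_energy(q, J=1.0, S=0.5, a=1.0):
    """Equation (2074): hbar*omega(q) = 2 J S (3 - cos qx a - cos qy a - cos qz a)."""
    return 2.0 * J * S * (3.0 - sum(math.cos(qi * a) for qi in q))

def long_wavelength_energy(q, J=1.0, S=0.5, a=1.0):
    """Small-q limit of (2074): hbar*omega(q) ~ J S q^2 a^2."""
    return J * S * a * a * sum(qi * qi for qi in q)
```

At `q = (0.0, 0.0, 0.0)` the energy vanishes, as required for a Goldstone mode, and for small wave vectors such as `(0.01, 0.02, 0.0)` the two functions agree up to corrections of relative order $q^2 a^2$.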
22.1
Holstein - Primakoff Transformation
The Holstein-Primakoff transformation provides a representation of localized spins, which enables the low-temperature properties of an ordered spin system to be analyzed in terms of boson operators. The technique is particularly useful for systems where the magnitude of the spin S is large, $S \gg 1$. The Hamiltonian can be expanded in terms of boson operators, providing a description of the ground state, small amplitude spin fluctuations and the anharmonic interaction between them. The Holstein - Primakoff transformation of the spins represents the effect of the spin operators by a function of boson operators (T. Holstein and H. Primakoff, Phys. Rev. 58, 1098 (1940)). The components of the spin operators at the i-th site $\hat{S}^{\alpha}_i$ can be defined by their action on the eigenstates of $\hat{S}^{z}_i$
$$ \hat{S}^{z}_i \, | m_i \rangle = m_i \hbar \, | m_i \rangle \tag{2075} $$
In particular the spin raising and lowering operators
$$ \hat{S}^{\pm}_i = \hat{S}^{x}_i \pm i \hat{S}^{y}_i \tag{2076} $$
have the commutation relations with $\hat{S}^{z}_i$
$$ [ \hat{S}^{z}_i , \hat{S}^{\pm}_i ] = \pm \hbar \, \hat{S}^{\pm}_i \tag{2077} $$
This can be used to show that the operators $\hat{S}^{\pm}_i$ have the effect of raising and lowering the magnitude of the eigenvalue of $S^z$ by one unit of $\hbar$
$$ \hat{S}^{\pm}_i \, | m_i \rangle = \sqrt{ S ( S + 1 ) - m_i ( m_i \pm 1 ) } \; \hbar \, | m_i \pm 1 \rangle \tag{2078} $$
The Holstein - Primakoff transformation represents the (2 S + 1) basis states $| m_i \rangle$ by the infinite number of basis states of a boson number operator $a^{\dagger}_i a_i$. The boson basis states $| n_i \rangle$ are defined through
$$ a^{\dagger}_i a_i \, | n_i \rangle = n_i \, | n_i \rangle \tag{2079} $$
The relation between the basis spin states and boson states is provided by the boson representation of the z component of the spin operator
$$ \hat{S}^{z}_i = S - a^{\dagger}_i a_i \tag{2080} $$
Thus, the state where the spin is aligned completely along the z axis ($m_i = S$) is the state with boson occupation number of zero, and the state with the lowest eigenvalue of the spin's z component ($m_i = - S$) corresponds to the boson state where 2 S bosons are present. The states with higher numbers of bosons are un-physical and must be projected out of the Hilbert space. The effects of the spin raising and lowering operators are similar to those of the boson annihilation and creation operators, within the space of physical states. The correspondence between the spin raising and lowering operators and the boson operators can be made exact, by multiplying the boson creation and annihilation operators with a function that ensures that only the physically acceptable boson states form the Hilbert space of the spin system. This is achieved by the representation of the spin lowering operator
$$ \hat{S}^{-}_i = a^{\dagger}_i \sqrt{ 2 S - a^{\dagger}_i a_i } \tag{2081} $$
which projects out states with more than 2 S bosons present. The raising operator is given by the Hermitean conjugate
$$ \hat{S}^{+}_i = \sqrt{ 2 S - a^{\dagger}_i a_i } \; a_i \tag{2082} $$
This transformation respects the spin commutation relations. Given the nearest neighbor ferromagnetic ( J > 0 ) Heisenberg Hamiltonian
$$ \hat{H} = - \frac{J}{2} \sum_{R, \delta} \hat{S}(R) \cdot \hat{S}(R + \delta) \tag{2083} $$
this can be expressed as
$$ \hat{H} = - \frac{J}{2} \sum_{R, \delta} \left[ \hat{S}^{z}(R) \hat{S}^{z}(R + \delta) + \frac{1}{2} \left( \hat{S}^{+}(R) \hat{S}^{-}(R + \delta) + \hat{S}^{-}(R) \hat{S}^{+}(R + \delta) \right) \right] \tag{2084} $$
On representing this in terms of the boson operators, and expanding in powers of $1/S$, one finds
$$ \hat{H} = - \frac{J}{2} \sum_{R, \delta} \left[ ( S - a^{\dagger}_R a_R ) ( S - a^{\dagger}_{R+\delta} a_{R+\delta} ) + S \left( a^{\dagger}_R a_{R+\delta} + a^{\dagger}_{R+\delta} a_R \right) \right] \tag{2085} $$
The terms of order $S^2$ just represent the classical ferromagnetic ground state, in which all the spins are aligned. The terms of order S represent excitations from the ground state and can be put in diagonal form by expressing them in terms of the Fourier transformed boson operators. The spatial Fourier transforms of the boson operators are defined as
$$ a_R = \frac{1}{\sqrt{N}} \sum_{q} \exp[ + i q \cdot R ] \, a_q \tag{2086} $$
and the creation operator is the Hermitean conjugate
$$ a^{\dagger}_R = \frac{1}{\sqrt{N}} \sum_{q} \exp[ - i q \cdot R ] \, a^{\dagger}_q \tag{2087} $$
Substitution of these, and performing the sums over the spatial index, yields an approximate expression for the Hamiltonian. The expression is the sum of the ground state energy and the energies of harmonic normal modes that represent the excitations of the spin waves from the ferromagnetic ground state
$$ \hat{H} = - \frac{J}{2} N Z S^2 + J S \sum_{q} a^{\dagger}_q a_q \sum_{\delta} \left( 1 - \cos q \cdot \delta \right) \tag{2088} $$
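The statement that the representation (2080) to (2082) respects the spin commutation relations can be checked numerically for a single site. The sketch below is an illustration only: it builds the operators as matrices in the physical boson subspace n = 0, ..., 2S, uses units with $\hbar = 1$, and the function name `holstein_primakoff` is hypothetical.

```python
import numpy as np

def holstein_primakoff(S):
    """Matrices of S^z, S^+ and S^- in the boson basis |n>, n = 0 .. 2S (hbar = 1)."""
    dim = int(round(2 * S)) + 1
    n = np.arange(dim)
    # Boson annihilation operator restricted to the physical subspace:
    # a |n> = sqrt(n) |n - 1>
    a = np.diag(np.sqrt(n[1:]), k=1)
    number = np.diag(n.astype(float))
    root = np.diag(np.sqrt(2 * S - n))      # sqrt(2S - a^dagger a), zero on |2S>
    Sz = S * np.eye(dim) - number           # Eq. (2080)
    S_minus = a.T @ root                    # Eq. (2081)
    S_plus = root @ a                       # Eq. (2082)
    return Sz, S_plus, S_minus

if __name__ == "__main__":
    Sz, Sp, Sm = holstein_primakoff(1.5)
    print(np.allclose(Sp @ Sm - Sm @ Sp, 2 * Sz))   # [S+, S-] = 2 S^z
    print(np.allclose(Sz @ Sp - Sp @ Sz, Sp))       # [S^z, S+] = + S+
```

The square root factor vanishes on the state with 2S bosons, which is why the truncation to the physical subspace does not spoil the commutators.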
The terms of higher order in $1/S$ yield quantum corrections to the ground state energy and the spin wave energies, and also produce anharmonic interactions between the spin waves. The thermally excited spin waves have the effect of reducing the magnetization from the fully saturated T = 0 value
$$ M^z(T) - N S = - \sum_{q} N(\omega(q)) \tag{2089} $$
where $N(\omega)$ is the Bose - Einstein distribution function. Since the ferromagnetic spin waves have a dispersion relation which is $\omega(q) \sim q^2$ at small q, one finds that the temperature induced change in the magnetization is given by
$$ M(T) - M(0) \sim - \left( \frac{k_B T}{J S} \right)^{\frac{3}{2}} \tag{2090} $$
for a three-dimensional lattice at low temperatures. Likewise, the thermal average value of the energy can be calculated as
$$ E(T) + N Z \frac{J}{2} S^2 = \sum_{q} \hbar \omega(q) \, N(\omega(q)) $$
$$ E(T) - E(0) \sim N J S \left( \frac{k_B T}{J S} \right)^{\frac{5}{2}} \tag{2091} $$
This should result in the low-temperature specific heat being proportional to $T^{\frac{3}{2}}$ for a long range ordered insulating ferromagnet.
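The $T^{3/2}$ law for the number of thermally excited spin waves can be illustrated with a short numerical integration. The sketch below makes several assumptions that are not in the notes: the sum over q is replaced by a continuum integral per unit volume, the dispersion is taken as purely quadratic, $\hbar \omega = D q^2$ with a stiffness D standing in for $J S a^2$, and units with $k_B = 1$ are used. The closed form used for comparison is the standard Bose integral involving $\zeta(3/2)$.

```python
import numpy as np

ZETA_3_2 = 2.612375348685488  # Riemann zeta function at 3/2

def magnon_density(T, D=1.0, n_grid=20000, cutoff=40.0):
    """Thermally excited spin waves per unit volume for hbar*omega = D q^2 (k_B = 1).

    Evaluates (1 / 2 pi^2) * integral of q^2 / (exp(D q^2 / T) - 1) dq with a
    midpoint rule, truncated where D q^2 / T = cutoff.
    """
    q_max = np.sqrt(cutoff * T / D)
    dq = q_max / n_grid
    q = (np.arange(n_grid) + 0.5) * dq
    occupation = 1.0 / np.expm1(D * q**2 / T)   # Bose-Einstein distribution
    return float(np.sum(q**2 * occupation) * dq / (2.0 * np.pi**2))

def magnon_density_exact(T, D=1.0):
    """Closed form zeta(3/2) * (T / (4 pi D))^(3/2) for the same integral."""
    return ZETA_3_2 * (T / (4.0 * np.pi * D)) ** 1.5

if __name__ == "__main__":
    for T in (0.1, 0.2, 0.4):
        print(T, magnon_density(T), magnon_density_exact(T))
```

Doubling the temperature multiplies the result by $2^{3/2}$, which is the scaling quoted in Eq. (2090).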
22.2 Spin Rotational Invariance
In the limit of zero applied magnetic field, the Heisenberg exchange Hamiltonian is invariant under the simultaneous continuous rotation of all the spins. As there is no preferred choice of z axis, the ferromagnetic state of a fully polarized spin system with a total spin $S_T = N \frac{1}{2}$ has the same energy as the state where all the spins are oriented along the direction $(\theta, \varphi)$. The ferromagnetic state where the spins are fully polarized along the z axis is given by
$$ | 0 \rangle = \prod_{j} | m_j = \tfrac{1}{2} \rangle \tag{2092} $$
The state where the polarization is rotated through $(\theta, \varphi)$ is given by
$$ | \theta, \varphi \rangle = \prod_{j} \left( \cos \frac{\theta}{2} + \exp[ + i \varphi ] \, \sin \frac{\theta}{2} \; \hat{S}^{-}_j \right) | 0 \rangle \tag{2093} $$
This can be proved by representing the spin vector operator for one site $\sigma_j$ in terms of its component along the unit vector $\hat{\eta}$ in the direction $(\theta, \varphi)$
$$ \hat{\eta} \cdot \sigma_j = \sin \theta \cos \varphi \; \sigma_{x j} + \sin \theta \sin \varphi \; \sigma_{y j} + \cos \theta \; \sigma_{z j} \tag{2094} $$
where $\sigma_{x j}$, $\sigma_{y j}$ and $\sigma_{z j}$ are the three Pauli spin matrices for the spin at site j. Thus, the representation of the operators for the spins at site j is found as
$$ \begin{pmatrix} \cos \theta & \sin \theta \, \exp[ - i \varphi ] \\ \sin \theta \, \exp[ + i \varphi ] & - \cos \theta \end{pmatrix} $$
This spin operator has two eigenstates
$$ | \theta, \varphi \, + \rangle = \cos \frac{\theta}{2} \, | + \rangle + \exp[ + i \varphi ] \, \sin \frac{\theta}{2} \, | - \rangle \tag{2095} $$
and
$$ | \theta, \varphi \, - \rangle = \cos \frac{\theta}{2} \, | - \rangle - \exp[ - i \varphi ] \, \sin \frac{\theta}{2} \, | + \rangle \tag{2096} $$
The un-rotated ferromagnetic state has all the spins in the $| + \rangle$ spinor state and after rotation all the spins have maximal eigenvalue along the direction $(\theta, \varphi)$ and are in the $| \theta, \varphi \, + \rangle$ spinor state. The rotated ferromagnetic state has all the spins aligned in the same direction. This classical ferromagnetic state has an infinitesimal overlap with the un-rotated ferromagnetic state, as
$$ \langle 0 | \theta, \varphi \rangle = \cos^{N} \frac{\theta}{2} = \exp\left[ N \ln \cos \frac{\theta}{2} \right] \tag{2097} $$
which vanishes in the limit $N \to \infty$. The rotated state can be considered to be a Bose-Einstein condensate of the q = 0 spin waves. For example, on expanding the rotated state in powers of $\exp[ i \varphi ]$ one finds
$$ | \theta, \varphi \rangle = \sum_{n=0}^{\infty} A(n) \, \exp[ i n \varphi ] \, \frac{ ( S^{-}_{Tot} )^n }{ n! } \, | 0 \rangle = \sum_{n=0}^{\infty} \cos^{(N-n)} \frac{\theta}{2} \, \sin^{n} \frac{\theta}{2} \, \exp[ i n \varphi ] \, \frac{ ( S^{-}_{Tot} )^n }{ n! } \, | 0 \rangle \tag{2098} $$
where the total spin lowering operator is given by
$$ S^{-}_{Tot} = \sum_{j} \hat{S}^{-}_j \tag{2099} $$
and the expansion follows since $( S^{-}_j )^2 \equiv 0$. The numbers of q = 0 spin waves, n, are distributed with the binomial probability
$$ P(n) = C\binom{N}{n} \cos^{2(N-n)} \frac{\theta}{2} \, \sin^{2n} \frac{\theta}{2} \tag{2100} $$
Since N is a macroscopic number, the distribution of spin flips is a sharp Gaussian distribution with a peak at $n_{max} = N \sin^2 \frac{\theta}{2}$, and a width given by $\Delta n = N^{\frac{1}{2}} \sin \frac{\theta}{2} \cos \frac{\theta}{2}$. The number of bosons in the q = 0 spin wave mode is macroscopic and of the order N. This is a coherent representation, and it is the quantum state that is as close as possible to a classical state. The coherent state corresponds to the classical state in which the total spin is oriented along the direction $(\theta, \varphi)$. This can be established by examining the matrix elements of the spin operators. The un-normalized states with n spin waves present are found as
$$ | n \rangle = \frac{ ( S^{-}_{Tot} )^n }{ n! } \, | 0 \rangle \tag{2101} $$
These states have the normalization
$$ \langle n | n \rangle = C\binom{N}{n} \tag{2102} $$
The matrix elements of the total spin lowering operator between states with total numbers of q = 0 spin waves close to the maximum of the wave packet are found first by noting that
$$ \langle n + 1 | \hat{S}^{-}_{Tot} | n \rangle = ( N - n ) \, C\binom{N}{n} \tag{2103} $$
since $\hat{S}^{-}$ can only have a non zero effect on the N − n up spins and creates an extra down spin. The expectation value of the spin lowering operator in the rotated state is found to be
$$ \langle \theta, \varphi | S^{-}_{Tot} | \theta, \varphi \rangle = \exp[ - i \varphi ] \sum_{n} A(n) \, A(n+1) \, \langle n + 1 | \hat{S}^{-}_{Tot} | n \rangle \sim \frac{1}{2} N \sin \theta \, \exp[ - i \varphi ] \sum_{n} A(n)^2 \, C\binom{N}{n} = \frac{1}{2} N \sin \theta \, \exp[ - i \varphi ] \tag{2104} $$
and likewise, the matrix elements of the spin raising operator are given by the complex conjugate. Similarly, the matrix elements of $\hat{S}^{z}_{Tot}$ are given by
$$ \langle n | \hat{S}^{z}_{Tot} | n \rangle = \left( \frac{N}{2} - n \right) C\binom{N}{n} \tag{2105} $$
which for $n \sim n_{max} = N \sin^2 \frac{\theta}{2}$ yields
$$ \langle n | \hat{S}^{z}_{Tot} | n \rangle \sim \frac{N}{2} \cos \theta \; C\binom{N}{n} \tag{2106} $$
Thus, one has
$$ \langle \theta, \varphi | S^{z}_{Tot} | \theta, \varphi \rangle = \sum_{n} A(n)^2 \, \langle n | \hat{S}^{z}_{Tot} | n \rangle \sim \frac{N}{2} \cos \theta \sum_{n} A(n)^2 \, C\binom{N}{n} = \frac{N}{2} \cos \theta \tag{2107} $$
Thus, the total spin operator has expectation values in the coherent state that correspond to the classical vector. The different classical ferromagnetic states are therefore represented by coherent states which are superpositions of states with arbitrary numbers of Goldstone modes excited. Strictly, the thermal average in zero field should yield a zero magnetization for a system in the thermodynamic limit, since all orientations are equally weighted. The thermal average therefore has to be taken, in the thermodynamic limit, in the presence of an arbitrarily small magnetic field. In this case the different classical states have zero overlap and can be considered as being in disjoint portions of Hilbert space. In this quasi-static state the field may then be driven to zero, leading to a non-vanishing vector order parameter.
----------------------------------------------------------------------
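The binomial distribution (2100) of the number of q = 0 spin waves in the rotated state, together with its peak position and width, can be tabulated directly. The sketch below is only an illustration: the values N = 200 and θ = 1 are arbitrary assumptions, and the function names are hypothetical.

```python
import math

def spin_flip_distribution(N, theta):
    """Binomial probabilities P(n) of Eq. (2100) for n = 0 .. N flipped spins."""
    c2 = math.cos(theta / 2.0) ** 2
    s2 = math.sin(theta / 2.0) ** 2
    return [math.comb(N, n) * c2 ** (N - n) * s2 ** n for n in range(N + 1)]

def mean_and_width(P):
    """Mean and standard deviation of the distribution P(n)."""
    mean = sum(n * p for n, p in enumerate(P))
    variance = sum((n - mean) ** 2 * p for n, p in enumerate(P))
    return mean, math.sqrt(variance)

if __name__ == "__main__":
    N, theta = 200, 1.0
    P = spin_flip_distribution(N, theta)
    mean, width = mean_and_width(P)
    print(mean, N * math.sin(theta / 2.0) ** 2)
    print(width, math.sqrt(N) * math.sin(theta / 2.0) * math.cos(theta / 2.0))
    # Squared overlap with the un-rotated state, cos^(2N)(theta/2), cf. Eq. (2097)
    print(P[0], math.cos(theta / 2.0) ** (2 * N))
```

The term P(0) is the square of the overlap (2097), which is already extremely small for a few hundred spins.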
22.2.1 Exercise 78
Determine the spin wave spectrum for an isotropic Heisenberg ferromagnet in the presence of an applied magnetic field. Do the conditions of Goldstone’s theorem apply, and what happens to the excitation energy of the q = 0 spin wave?
——————————————————————————————————
22.3 Anti-ferromagnetic Spinwaves
One can perform a similar analysis for an anti-ferromagnet ( J < 0 ) in a Neel state. Neel ordering shall be considered on a crystal structure that can be decomposed into two interpenetrating sub-lattices. The spins on one sub-lattice (the A sub-lattice sites) shall be oriented parallel to the z axis, and the spins on the second sub-lattice (the B sub-lattice) are anti-parallel to the z axis. In order for the bosons to represent excitations, it is necessary to switch the directions of the spins on the B sub-lattice, $S^z_i \to - S^z_i$, $S^x_i \to S^x_i$ and $S^y_i \to - S^y_i$. This is a proper rotation of π about the x axis so that the commutation relations remain the same. The Holstein - Primakoff transformation for the operator representing the z component of the B spins is of the form
$$ \hat{S}^{z}_i = b^{\dagger}_i b_i - S \tag{2108} $$
and the spin raising operators for the B spins are
$$ \hat{S}^{+}_i = b^{\dagger}_i \sqrt{ 2 S - b^{\dagger}_i b_i } \tag{2109} $$
which projects out states with more than 2 S bosons present. The lowering operator is given by the Hermitean conjugate
$$ \hat{S}^{-}_i = \sqrt{ 2 S - b^{\dagger}_i b_i } \; b_i \tag{2110} $$
for the B sub-lattice. The Hamiltonian can be written as
$$ \hat{H} = - J \sum_{i, j} \left[ ( S - a^{\dagger}_i a_i ) ( S - b^{\dagger}_j b_j ) - S \left( a^{\dagger}_i b^{\dagger}_j + a_i b_j \right) \right] \tag{2111} $$
On defining the Fourier transformed operators
$$ a_q = \sqrt{\frac{2}{N}} \sum_{i} \exp[ - i q \cdot R_i ] \, a_i \qquad b_q = \sqrt{\frac{2}{N}} \sum_{j} \exp[ - i q \cdot R_j ] \, b_j \tag{2112} $$
then one has
$$ \hat{H} = - N z J S^2 + J S \sum_{q} \left[ z \left( a^{\dagger}_q a_q + b^{\dagger}_{-q} b_{-q} \right) + \sum_{\delta} \exp[ - i q \cdot \delta ] \left( a^{\dagger}_q b^{\dagger}_{-q} + a_q b_{-q} \right) \right] \tag{2113} $$
This form is still not diagonal; it is necessary to use the Bogoliubov canonical transformation
$$ \alpha_q = \exp[ + \hat{S} ] \, a_q \, \exp[ - \hat{S} ] \qquad \beta_{-q} = \exp[ + \hat{S} ] \, b_{-q} \, \exp[ - \hat{S} ] \tag{2114} $$
where the operator is given by
$$ \hat{S} = \sum_{q} \frac{\theta_q}{2} \left( b^{\dagger}_{-q} a^{\dagger}_q - a_q b_{-q} \right) \tag{2115} $$
and $\theta_q$ has still to be determined. The transformation is evaluated as
$$ \alpha_q = \cosh \frac{\theta}{2} \; a_q - \sinh \frac{\theta}{2} \; b^{\dagger}_{-q} \qquad \beta_{-q} = \cosh \frac{\theta}{2} \; b_{-q} - \sinh \frac{\theta}{2} \; a^{\dagger}_q \tag{2116} $$
in which $\theta_q$ is chosen so that the Hamiltonian is diagonal. The inverse transformation is given by
$$ a_q = \cosh \frac{\theta}{2} \; \alpha_q + \sinh \frac{\theta}{2} \; \beta^{\dagger}_{-q} \qquad b_{-q} = \cosh \frac{\theta}{2} \; \beta_{-q} + \sinh \frac{\theta}{2} \; \alpha^{\dagger}_q \tag{2117} $$
This value of $\theta_q$ is found as
$$ \tanh \theta_q = - \frac{1}{z} \sum_{\delta} \exp[ - i q \cdot \delta ] \tag{2118} $$
The resulting approximate Hamiltonian again can be interpreted in terms of a zero point energy and a sum of harmonic normal modes.
----------------------------------------------------------------------
22.3.1 Exercise 79
Find the approximate dispersion relation for spin waves of a Heisenberg antiferromagnet, J < 0, for spins of magnitude S. The Hamiltonian describes interactions between nearest neighbor spins arranged on a simple cubic lattice,
$$ \hat{H} = - J \sum_{R, \delta} S(R) \cdot S(R + \delta) \tag{2119} $$
Assume that S is large so that the classical Neel state can be considered as being stable. Also calculate the zero point energy.
----------------------------------------------------------------------
Since $\tanh \theta \to - 1$ in the limit $q \to 0$, then $\theta \to - \infty$, so both $\sinh \frac{\theta}{2}$ and $\cosh \frac{\theta}{2}$ diverge. In one dimension the change in the sub-lattice magnetization $\langle \psi | a^{\dagger}_i a_i | \psi \rangle$ diverges logarithmically. In two and three dimensions, the divergence in $\sum_q \langle \psi | a^{\dagger}_q a_q | \psi \rangle$ is integrable and converges. There is an energy change relative to the nominal classical energy of the Neel state, given by the sum of the zero point energies,
$$ E_g = J z N S^2 \left[ 1 + \frac{1}{S} \left( 1 - I_d \right) \right] \tag{2120} $$
where
$$ I_d = \frac{2}{N d} \sum_{q} \sqrt{ d^2 - \left( \sum_{i=1}^{d} \cos q_i a \right)^2 } \tag{2121} $$
The amplitude of the q = 0 spin wave is divergent, but can be neglected in the limit N → ∞. As the q = 0 spin wave has zero frequency, one can compose a state which is a superposition of the q = 0 spin wave excitations. The dynamics of the finite frequency spin wave excitations can be examined for finite time scales before the zero energy amplitude wave packet of q = 0 spin waves diverges re-orienting the sub-lattice magnetization.
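The quantity $I_d$ of Eq. (2121) can be estimated numerically for a hypercubic lattice. The sketch below rests on assumptions that are not spelled out in the notes: the sum over q is taken to run over the N/2 wave vectors of the magnetic Brillouin zone, so that $I_d$ equals the average of $\sqrt{1 - \gamma_q^2}$ with $\gamma_q = \frac{1}{d} \sum_i \cos q_i a$, and, because this integrand is unchanged by a shift of q through $(\pi/a, \ldots)$, the average is taken over the full zone on a midpoint grid with a = 1. The function names are hypothetical.

```python
import numpy as np

def gamma_q(*q_components):
    """Structure factor (1/d) * sum_i cos(q_i) for a hypercubic lattice with a = 1."""
    d = len(q_components)
    return sum(np.cos(q) for q in q_components) / d

def zero_point_integral(d, n=64):
    """Estimate I_d as the Brillouin zone average of sqrt(1 - gamma_q^2)."""
    q = -np.pi + (np.arange(n) + 0.5) * 2.0 * np.pi / n   # midpoint grid
    grids = np.meshgrid(*([q] * d), indexing="ij")
    g = gamma_q(*grids)
    # Clip guards against tiny negative values produced by rounding.
    return float(np.mean(np.sqrt(np.clip(1.0 - g**2, 0.0, None))))

if __name__ == "__main__":
    for d, n in ((1, 2000), (2, 200), (3, 64)):
        I_d = zero_point_integral(d, n)
        print(d, I_d, 1.0 - I_d)
```

In one dimension the average reduces to that of $| \sin q |$, which gives $I_1 = 2/\pi$. The correction factor $1 - I_d$ decreases with increasing dimension, consistent with the Neel state becoming a better approximation in higher dimensions.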
23 Spin Glasses
Spin glasses are found when magnetic impurities are randomly distributed in a metal, such as Fe in Au or Mn in Cu. Due to the random separations between the moment carrying impurities, the R.K.K.Y. interactions between the magnetic moments are also randomly distributed and can take on both ferromagnetic and anti-ferromagnetic signs. The distribution of interactions prevents the local magnetic moments from forming a long-range ordered phase at low temperatures. Nevertheless, the random spin system may freeze into a spin glass state below a critical temperature. At high temperatures the spins are disordered, and as the temperature is reduced, the spins which are most strongly interacting progressively build up their correlations and freeze into clusters. The dynamics of the spin clusters slow down as they grow, and at a critical temperature $T_f$ they lock into the spin glass phase. However, this may not be the lowest energy state, as in order to reach the ground state there may have to be large scale reorientation of the spin clusters. Thus, the spin glass state is not unique but instead is highly degenerate. This occurs as a result of frustration. The concept of frustration is illuminated by imagining that all the spins on the magnetic sites are frozen in fixed directions, except one. There is then a high probability that the long-ranged interactions of the spin under consideration with the fixed spins almost average out to zero. At finite temperatures, the energy is almost degenerate with respect to the orientation of the spin under consideration, as its alignment leads to an insignificant lowering of the energy of the spin glass state. The experimental signatures of spin glass freezing are a plateau in the static susceptibility and a rounded peak in the specific heat. The susceptibility follows a Curie-Weiss law at high temperatures
$$ \chi(T) = c \, \frac{ \mu_B^2 \, S ( S + 1 ) }{ 3 k_B ( T - \Theta ) } \tag{2122} $$
where Θ is the strength of the resultant interaction on an individual spin. For an R.K.K.Y. interaction, the Curie-Weiss temperature Θ should be proportional to c. At lower temperatures where the spins freeze into clusters, the effective moment increases, reflecting the growth of the clusters. The peak in the specific heat encloses an entropy which is a considerable fraction of
$$ \Delta S = c \, k_B \ln ( 2 S + 1 ) \tag{2123} $$
where c is the concentration of magnetic impurities of spin S. This entropy represents the entropy of the spins gradually freezing into clusters, but does not contain the entropy of the frustrated spins. This maximum disappears and is broadened to higher temperatures as a magnetic field is applied. The effect of the applied field is to order the spins at higher temperatures. Crude estimates indicate that about 70 percent of the spins are already ordered above $T_f$. The temperature dependence of the resistivity shows a sharp drop or knee at the spin glass freezing temperature. At this temperature, the majority of spins are frozen in specific directions, preventing the logarithmic increase with decreasing temperature associated with spin flip scattering. As the spin glass phase is not a ground state but is instead a highly degenerate meta-stable state, the most unusual properties occur in the dynamical properties. The low field a.c. susceptibility shows a very sharp cusp at the spin glass freezing temperature. The cusp becomes rounded and the temperature of the peak diminishes as the a.c. frequency is lowered. The susceptibility saturates to a finite value at T = 0 which is roughly half the value of the cusp and has a $T^2$ variation on the low-temperature side. The d.c. susceptibility shows a memory effect, in that, in field cooled samples ( H ≠ 0 ) the susceptibility saturates at $T_f$, and the curve χ(T) is reversible as it is also followed for increasing temperature at fixed field. However, the zero field cooled sample ( H = 0 ) shows a cusp at $T_f$ and, as the susceptibility is zero until the field is applied, by definition the temperature is only allowed to increase. However, the value of the susceptibility increases with increasing measurement time. The magnetization is slowly increasing as the spins slowly adjust to a lower energy state in the presence of the applied field. This is contrasted to the field cooled state in which the spins have already minimized the field energy before the temperature is lowered and they are frozen into the spin glass state. The spin glass freezing resembles a phase transition, but the nature of the order is unclear as the spin glass state involves disorder and is a highly degenerate meta-stable state. Likewise, the description of the low frequency dynamics of the magnetization is complicated by the existence of long-ranged correlations between large groups of spins. Since there is no well defined order parameter, there is no well defined low frequency Goldstone mode. Several important steps in the solution of the thermodynamics of the spin glass problem have been undertaken. These include the discovery of the nature of the order parameter by Edwards and Anderson, and the formulation of a model with an exactly soluble mean field theory by Sherrington and Kirkpatrick. The Sherrington-Kirkpatrick model consists of an Ising interaction
$$ \hat{H} = - \sum_{i, j} J_{i-j} \, S^z_i \, S^z_j \tag{2124} $$
where $J_{i-j}$ is a randomly distributed long-ranged interaction between the spins. The average value of $J_{i-j}$ is zero
$$ \langle J_{i-j} \rangle = 0 \tag{2125} $$
and the average value of the square is given by
$$ \langle J_{i-j}^2 \rangle = \frac{J^2}{N} \tag{2126} $$
The averaging over the randomly distributed interactions is not commutative.
23.1 Mean Field Theory
The simplest mean field approximation is based on a representation of the free energy, for a spin glass with long-ranged interactions between the Ising spins $S = \frac{1}{2}$, in which the exact value of the spin on a site i is replaced by the thermal averaged value $m_i$. The mean field free energy $F[m_i]$ is given by
$$ F[m_i] = - \sum_{i, j} J_{i,j} \, m_i m_j + k_B T \sum_{i} \left[ \frac{1 + m_i}{2} \ln \frac{1 + m_i}{2} + \frac{1 - m_i}{2} \ln \frac{1 - m_i}{2} \right] \tag{2127} $$
where the exchange interactions $J_{i,j}$ are randomly distributed. On minimizing the Free energy, one finds the mean field magnetization at every site. Above the spin glass freezing temperature, the average magnetization at each site is zero. Below the freezing temperature the spin on each site has a non-zero average value; the direction and magnitude vary from site to site and are determined by the non-trivial solution of
$$ 0 = - 2 \sum_{j} J_{i,j} \, m_j + \frac{k_B T}{2} \ln \frac{1 + m_i}{1 - m_i} \tag{2128} $$
On linearizing in $m_i$ ( only valid for $T \ge T_f$ ) one obtains the eigenvalue equation which determines the spin glass freezing temperature
$$ k_B T_f \, m_i = 2 \sum_{j} J_{i,j} \, m_j \tag{2129} $$
in terms of the largest eigenvalue of the random matrix $J_{i,j}$. This is solved by finding a basis λ that diagonalizes the matrix
$$ J_{i,j} = \sum_{\lambda} J_{\lambda} \, \langle i | \lambda \rangle \langle \lambda | j \rangle \tag{2130} $$
The basis gives the set of the spin configurations that the spins will be frozen into below the spin glass freezing temperature. In the limit $N \to \infty$, the eigenvalues of the random exchange matrix are distributed according to a semi-circular law (Edwards and Jones)
$$ \rho(J_{\lambda}) = \frac{1}{2 \pi J^2} \sqrt{ 4 J^2 - J_{\lambda}^2 } \tag{2131} $$
where, obviously, 2 J is the largest eigenvalue. The spin glass freezing temperature is determined as
$$ k_B T_f = 4 J \tag{2132} $$
This mean field theory predicts a transition temperature which is a factor of 2 too large. This is because the mean field theory needs to incorporate a self reaction term. Namely, the reaction term includes the effect of the central spin on the neighbors back on itself, before the thermal averaging is performed (Thouless, Anderson and Palmer).
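The semi-circular law (2131) and the value 2 J of the largest eigenvalue can be illustrated by diagonalizing a finite random exchange matrix. The sketch below is an illustration under stated assumptions: the couplings are taken to be Gaussian with zero mean and variance $J^2 / N$, the matrix is symmetric with vanishing diagonal, and the size N = 400 and the seed are arbitrary. The function names are hypothetical.

```python
import numpy as np

def random_exchange_matrix(N, J=1.0, seed=0):
    """Symmetric random matrix with <J_ij> = 0 and <J_ij^2> = J^2 / N for i != j."""
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, J / np.sqrt(N), size=(N, N))
    M = (A + A.T) / np.sqrt(2.0)     # symmetrize while keeping the variance J^2 / N
    np.fill_diagonal(M, 0.0)         # no self-interaction
    return M

def semicircle_density(x, J=1.0):
    """Semi-circular density of Eq. (2131), zero outside |x| <= 2 J."""
    x = np.asarray(x, dtype=float)
    inside = np.clip(4.0 * J**2 - x**2, 0.0, None)
    return np.sqrt(inside) / (2.0 * np.pi * J**2)

if __name__ == "__main__":
    eigenvalues = np.linalg.eigvalsh(random_exchange_matrix(400))
    print("largest eigenvalue:", eigenvalues.max(), "(expected close to 2 J = 2)")
    hist, edges = np.histogram(eigenvalues, bins=20, range=(-2.0, 2.0), density=True)
    centers = 0.5 * (edges[1:] + edges[:-1])
    for c, h in zip(centers, hist):
        print(f"{c:+.2f}  {h:.3f}  {semicircle_density(c):.3f}")
```

For finite N the largest eigenvalue fluctuates slightly around 2 J, and the histogram only approaches the semi-circle as N grows.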
23.2 The Sherrington-Kirkpatrick Solution
The correct mean field solution for the Sherrington-Kirkpatrick model can be obtained in a systematic manner, starting from the partition function. Although the average value of the partition function Z is easily evaluated, the average value of the Free energy is difficult to evaluate. However, the average of the logarithm of the partition function can be evaluated with the aid of the mathematical identity
$$ - \beta F = \lim_{n \to 0} \frac{ Z^n - 1 }{ n } \tag{2133} $$
For finite integer n the configurational average over $J_{i-j}$ can be evaluated, leading to an expression for the partition function for n replicas of the spin system in which the replicas are interacting. The Gaussian averaged value of $Z^n$ is given by
$$ \overline{Z^n} = \prod_{i-j} \sqrt{\frac{N}{2 \pi J^2}} \int dJ_{i-j} \, \exp\left[ - \frac{N J_{i-j}^2}{2 J^2} \right] \times \left( Trace \, \exp\left[ - \beta \sum_{i-j} J_{i-j} S_i S_j \right] \right)^n $$
$$ = Trace \prod_{i-j} \sqrt{\frac{N}{2 \pi J^2}} \int dJ_{i-j} \, \exp\left[ - \frac{N J_{i-j}^2}{2 J^2} - \beta J_{i-j} \sum_{\alpha} S^{\alpha}_i S^{\alpha}_j \right] $$
$$ \overline{Z^n} = Trace \, \exp\left[ \frac{ ( \beta J )^2 }{ 2 N } \sum_{i, j} \sum_{\alpha, \beta} S^{\alpha}_i S^{\alpha}_j S^{\beta}_i S^{\beta}_j \right] \tag{2134} $$
where α and β are the indices labelling members of the n different replicas. The trace can be evaluated for integer n and then the result can be extrapolated to $n \to 0$. The spin glass order parameter is given by the correlation between the spins of different replicas
$$ q^{\alpha, \beta} = \langle | S^{\alpha}_i S^{\beta}_i | \rangle \tag{2135} $$
which becomes non zero below the freezing temperature. The Free energy is evaluated by re-writing the trace in terms of a Gaussian integral
$$ Trace \, \exp\left[ \frac{ ( \beta J )^2 }{ 2 N } \sum_{i, j} \sum_{\alpha, \beta} S^{\alpha}_i S^{\alpha}_j S^{\beta}_i S^{\beta}_j \right] $$
$$ = Trace \prod_{\alpha, \beta} \int dy_{\alpha,\beta} \, \frac{ \beta J \sqrt{N} }{ \sqrt{2 \pi} } \, \exp\left[ - \frac{ ( \beta J )^2 }{ 2 } \left( N y_{\alpha,\beta}^2 - 2 \, y_{\alpha,\beta} \sum_{i} S^{\alpha}_i S^{\beta}_i \right) \right] $$
$$ = \prod_{\alpha, \beta} \int dy_{\alpha,\beta} \, \frac{ \beta J \sqrt{N} }{ \sqrt{2 \pi} } \, \exp\left[ - \frac{ ( \beta J )^2 }{ 2 } N y_{\alpha,\beta}^2 \right] \times \exp\left[ N \ln Trace \, \exp\left( ( \beta J )^2 \sum_{\alpha, \beta} y_{\alpha,\beta} \, S^{\alpha} S^{\beta} \right) \right] \tag{2136} $$
In this, the thermodynamic limit $N \to \infty$ and the limit $n \to 0$ have been interchanged. Due to the long-ranged nature of the interaction, the trace is over a single spin replicated n times. In the thermodynamic limit $N \to \infty$, this integral can be evaluated by steepest descents. The saddle point value of $y^{\alpha,\beta}$ is denoted by $q^{\alpha,\beta}$. For temperatures above $T_f$ it is easy to show that the
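interaction part of the Free energy originates from the terms with α = β, as the off diagonal terms of $q^{\alpha,\beta}$ are all equal and zero, and leads to
$$ - \beta F = N \ln 2 + N \frac{ ( \beta J )^2 }{ 2 } \tag{2137} $$
Just below the spin glass freezing transition the off diagonal terms $q^{\alpha,\beta}$ are all equal and finite. Separating out the terms where α = β and replacing the integral by the saddle point value, one finds
$$ \overline{Z^n} = \exp\left[ n N \frac{ ( \beta J )^2 }{ 2 } \left( 1 - q^2 ( n - 1 ) \right) \right] \times \exp\left[ N \ln Trace \, \exp\left( ( \beta J )^2 \, q \sum_{\alpha \neq \beta} S^{\alpha} S^{\beta} \right) \right] $$
$$ = \exp\left[ n N \frac{ ( \beta J )^2 }{ 2 } \left( 1 - 2 q - q^2 ( n - 1 ) \right) \right] \times \exp\left[ N \ln Trace \, \exp\left( ( \beta J )^2 \, q \sum_{\alpha, \beta} S^{\alpha} S^{\beta} \right) \right] $$
$$ = \exp\left[ n N \frac{ ( \beta J )^2 }{ 2 } \left( 1 - 2 q - q^2 ( n - 1 ) \right) \right] \times \exp\left[ N \ln Trace \int \frac{dz}{\sqrt{2 \pi}} \, \exp\left( - \frac{z^2}{2} \right) \exp\left( ( \beta J ) \, z \sqrt{2 q} \sum_{\alpha} S^{\alpha} \right) \right] $$
$$ = \exp\left[ n N \frac{ ( \beta J )^2 }{ 2 } \left( 1 - 2 q - q^2 ( n - 1 ) \right) \right] \times \exp\left[ N \ln \int \frac{dz}{\sqrt{2 \pi}} \, \exp\left( - \frac{z^2}{2} \right) 2^n \cosh^n \left( ( \beta J ) \, z \sqrt{2 q} \right) \right] \tag{2138} $$
The saddle point value of q is found by differentiating with respect to q. After an integration by parts and then clearing away fractions, one obtains
$$ \int \frac{dz}{\sqrt{2 \pi}} \, \exp\left( - \frac{z^2}{2} \right) \cosh^n \left( ( \beta J ) z \sqrt{2 q} \right) \left[ 1 + q ( n - 1 ) \right] = \int \frac{dz}{\sqrt{2 \pi}} \, \exp\left( - \frac{z^2}{2} \right) \cosh^n \left( ( \beta J ) z \sqrt{2 q} \right) \left[ 1 + ( n - 1 ) \tanh^2 \left( ( \beta J ) z \sqrt{2 q} \right) \right] \tag{2139} $$
In the limit $n \to 0$, the order parameter is given by the solution of the equation
$$ q(T) = \int_{-\infty}^{\infty} \frac{dz}{\sqrt{2 \pi}} \, \exp\left( - \frac{z^2}{2} \right) \tanh^2 \left( \beta J \sqrt{ 2 q(T) } \; z \right) \tag{2140} $$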
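The self-consistency equation (2140) has no closed form solution, but it can be solved by simple fixed point iteration. The sketch below takes the equation exactly as written above, evaluates the Gaussian average with Gauss-Hermite quadrature from NumPy (`hermegauss`, whose weight function is $\exp( - z^2 / 2 )$), and iterates from an arbitrary starting value. The function name, the number of quadrature nodes and the number of iterations are assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def sk_order_parameter(beta_J, n_nodes=80, n_iter=2000, q0=0.5):
    """Fixed point iteration for q = <tanh^2(beta J sqrt(2 q) z)> over a unit Gaussian z."""
    z, w = hermegauss(n_nodes)        # nodes and weights for the weight exp(-z^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)      # normalize to a unit Gaussian measure
    q = q0
    for _ in range(n_iter):
        q = float(np.sum(w * np.tanh(beta_J * np.sqrt(2.0 * q) * z) ** 2))
    return q

if __name__ == "__main__":
    for beta_J in (0.5, 1.0, 2.0, 5.0):
        print(beta_J, sk_order_parameter(beta_J))
```

For small values of $\beta J$ the iteration collapses onto the trivial solution q = 0, while for large $\beta J$ the order parameter approaches unity. The quadrature becomes less accurate at very low temperatures, where the integrand varies rapidly near z = 0.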
The temperature variation of the order parameter is given by
$$ q(T) = \frac{1}{2} \left[ 1 - \left( \frac{T}{T_f} \right)^2 \right] \quad for \; T < T_f $$
$$ \lim_{T \to 0} q(T) = 1 - \left( \frac{2}{3 \pi} \right)^{\frac{1}{2}} \frac{T}{T_f} \tag{2141} $$
The finite value of the order parameter produces the cusp in the susceptibility and the low-temperature saturation, since one can show that
$$ \chi(T) = \frac{ g^2 \mu_B^2 }{ 3 k_B T } \left( 1 - q(T) \right) \tag{2142} $$
Although the long-ranged model is exactly soluble in the mean field approximation, it is only soluble for all temperatures below the freezing temperature if the symmetry between the different replicas is broken. Replica symmetry breaking is specific to interacting random systems (de Almeida and Thouless), and the exact solution of the mean field model involves repeated replica symmetry breaking (Parisi). This repeated replica symmetry breaking has the consequence that the dynamics of the low-temperature system are frozen and no longer consistent with the ergodic hypothesis.
24 Magnetic Neutron Scattering
The excitations of the electronic system can be probed by inelastic neutron scattering experiments. These experiments provide information about the magnetic character of the excitations, due to the nature of the interaction.
24.1 The Inelastic Scattering Cross-Section
The neutron scattering occurs through the interaction with the magnetic moments of the electronic system.
24.1.1 The Dipole-Dipole Interaction
A neutron has a magnetic moment given by
$$ \boldsymbol{\mu}_n = g_n \, \mu_n \, \sigma_n \tag{2143} $$
where the neutron's gyromagnetic ratio is given by $g_n = 1.91$ and $\mu_n$ on the right hand side is the nuclear magneton. The neutron interacts with the magnetic moments of electrons via dipole-dipole interactions. The magnetic field produced by a single electron is a dipole field given by
$$ H = \nabla \wedge \left( g_e \mu_B \frac{ \sigma_e \wedge r }{ | r |^3 } \right) - \frac{ | e | }{ c } \frac{ v \wedge r }{ | r |^3 } \tag{2144} $$
where r is the position of the field relative to the electron. The interaction between the neutron and the magnetic field is given by the Zeeman interaction
$$ \hat{H}_{int} = - g_n \mu_n \, \sigma_n \cdot \left[ \nabla \wedge \left( g_e \mu_B \frac{ \sigma_e \wedge r }{ | r |^3 } \right) - \frac{ | e | }{ c } \frac{ v \wedge r }{ | r |^3 } \right] $$
$$ = g_n \mu_n \left[ \sigma_n \cdot \nabla \wedge \left( g_e \mu_B \frac{ \sigma_e \wedge r }{ | r |^3 } \right) - \frac{ | e | }{ 2 m_e c } \left( p \cdot \frac{ \sigma_n \wedge r }{ | r |^3 } + \frac{ \sigma_n \wedge r }{ | r |^3 } \cdot p \right) \right] \tag{2145} $$
The first term is a classical dipole - dipole interaction and the second term is a spin - orbit interaction.
24.1.2 The Inelastic Scattering Cross-Section
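The scattering cross-section of a neutron, from an initial state $(k, \sigma_n)$ to a final state $(k', \sigma'_n)$, in which the electron makes a transition from the initial state $| \phi_n \rangle$ to the final state $| \phi_{n'} \rangle$ is given by
$$ \frac{ d^2 \sigma }{ d\omega \, d\Omega } = \frac{k'}{k} \left( \frac{ V m_n }{ 2 \pi \hbar^2 } \right)^2 ( g_e g_n \mu_n \mu_B )^2 \sum_{n, n'} P(n) \left| \langle \phi_{n'} ; k', \sigma'_n | \; \sigma_n \cdot \nabla \wedge \frac{ \sigma_e \wedge r }{ | r |^3 } - \frac{1}{2 \hbar} \left( p \cdot \frac{ \sigma_n \wedge r }{ | r |^3 } + \frac{ \sigma_n \wedge r }{ | r |^3 } \cdot p \right) | \phi_n ; k, \sigma_n \rangle \right|^2 \delta( \hbar \omega + E_n - E_{n'} ) \tag{2146} $$
Here, the probability that the electronic system is in the initial state is represented by P(n). The neutron's energy loss $\hbar \omega$ and the momentum loss or scattering vector are defined via
$$ \hbar \omega = E(k) - E(k') \qquad \hbar q = \hbar k - \hbar k' \tag{2147} $$
As the neutron states are momentum eigenstates, the matrix elements of the interaction can be easily evaluated. The spin component of the magnetic interaction is evaluated by considering the neutron component of the matrix elements
$$ \langle k' | \nabla \wedge \frac{ \sigma_e \wedge r }{ | r |^3 } | k \rangle = - \langle k' | \nabla \wedge \left( \sigma_e \wedge \nabla \frac{1}{ | r | } \right) | k \rangle = - \frac{1}{V} \int d^3 r_n \, \exp[ + i q \cdot r_n ] \; \nabla \wedge \left( \sigma_e \wedge \nabla \right) \frac{1}{ | r | } = \frac{ 4 \pi }{ V } \, \frac{ q \wedge ( \sigma_e \wedge q ) }{ q^2 } \, \exp[ + i q \cdot r_e ] \tag{2148} $$
This shows that the neutron only interacts with the component of the electron's spin σ perpendicular to the scattering vector. Likewise, the orbital component can be evaluated as
$$ \langle k' | \; p_e \cdot \frac{ \sigma_n \wedge r }{ | r |^3 } \; | k \rangle = - \frac{ 4 \pi i }{ V q^2 } \; \sigma_n \cdot ( q \wedge p_e ) \, \exp[ + i q \cdot r_e ] \tag{2149} $$
Furthermore, the operator $( q \wedge p_e )$ commutes with $\exp[ + i q \cdot r_e ]$ as $q \wedge q \equiv 0$. Hence, the neutron scattering cross-section from a multi-electron system can be written as
$$ \frac{ d^2 \sigma }{ d\omega \, d\Omega } = \frac{k'}{k} \left( \frac{ 2 m_n }{ \hbar^2 } \right)^2 ( g_e g_n \mu_n \mu_B )^2 \sum_{n, n'} P(n) \, \delta( \hbar \omega + E_n - E_{n'} ) \times \left| \langle \phi_{n'} ; \sigma'_n | \; \sigma_n \cdot \sum_{e} \left( q \wedge \frac{ \sigma_e \wedge q }{ | q |^2 } - \frac{ i \, q \wedge p_e }{ \hbar | q |^2 } \right) \exp[ + i q \cdot r_e ] \; | \phi_n ; \sigma_n \rangle \right|^2 \tag{2150} $$
Since the nuclear Bohr magneton has the value
$$ \mu_n = \frac{ | e | \hbar }{ 2 m_p c } \tag{2151} $$
the coupling constant can be simplified
$$ \frac{ 2 m_n }{ \hbar^2 } \, g_n g_e \mu_n \mu_B = \frac{ g_n e^2 }{ m_e c^2 } = g_n r_e \tag{2152} $$
to yield $r_e$, the classical radius of the electron, multiplied by $g_n$. Thus, up to the numerical factor $g_n^2$, the scattering cross-section can be written as
$$ \frac{ d^2 \sigma }{ d\omega \, d\Omega } = \frac{k'}{k} \, r_e^2 \, S(q; \omega) \tag{2153} $$
where the response function is given by
$$ S(q; \omega) = \sum_{n, n'} P(n) \, \delta( \hbar \omega + E_n - E_{n'} ) \times \left| \langle \phi_{n'} ; \sigma'_n | \; \sigma_n \cdot \sum_{e} \left( q \wedge \frac{ \sigma_e \wedge q }{ | q |^2 } - \frac{ i \, q \wedge p_e }{ \hbar | q |^2 } \right) \exp[ + i q \cdot r_e ] \; | \phi_n ; \sigma_n \rangle \right|^2 \tag{2154} $$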
This expression still depends on the polarization of the neutrons in the incident beam, and also on the polarization of the detector. Polarized neutron scattering measurements reveal more information about the nature of the excitations of a system. However, due to the reduction of the intensity of the incident beam caused by the polarization process, and the concomitant need to compensate the loss of intensity by increasing the measurement time, it is more convenient to perform measurements with unpolarized beams. For an un-polarized beam of neutrons, the initial polarization must be averaged over. The averaging can be performed with the aid of the identity
$$ \frac{1}{2} \sum_{\sigma_n} \langle \sigma_n | \sigma_n^{\alpha} \sigma_n^{\beta} | \sigma_n \rangle = \delta_{\alpha, \beta} \tag{2155} $$
which follows from the anti-commuting nature of the Pauli spin matrices. For an un-polarized beam of neutrons the response function reduces to
$$ S(q; \omega) = \sum_{n, n'} P(n) \, \delta( \hbar \omega + E_n - E_{n'} ) \sum_{\alpha, \beta} \left( \delta_{\alpha, \beta} - \hat{q}_{\alpha} \hat{q}_{\beta} \right) $$
$$ \times \; \langle \phi_n | \sum_{e} \left[ \sigma_e + \frac{ i \, q \wedge p_e }{ \hbar | q |^2 } \right]_{\alpha} \exp[ - i q \cdot r_e ] \; | \phi_{n'} \rangle \; \times \; \langle \phi_{n'} | \sum_{e} \left[ \sigma_e - \frac{ i \, q \wedge p_e }{ \hbar | q |^2 } \right]_{\beta} \exp[ + i q \cdot r_e ] \; | \phi_n \rangle \tag{2156} $$
where $\hat{q}$ is the unit vector in the direction of q. On defining the spin density operator $\hat{S}_{\alpha}(q)$ via
$$ \hat{S}_{\alpha}(q) = \sum_{e} \left[ \sigma_e + \frac{ i \, q \wedge p_e }{ \hbar | q |^2 } \right]_{\alpha} \exp[ - i q \cdot r_e ] \tag{2157} $$
then the response function can be expressed as a spin - spin correlation function
$$ S(q; \omega) = \sum_{n, n'} P(n) \, \delta( \hbar \omega + E_n - E_{n'} ) \times \sum_{\alpha, \beta} \left( \delta_{\alpha, \beta} - \hat{q}_{\alpha} \hat{q}_{\beta} \right) \langle \phi_n | \hat{S}_{\alpha}(q) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}^{\dagger}_{\beta}(q) | \phi_n \rangle \tag{2158} $$
Thus, the inelastic neutron scattering measures the excitation energies of the system, with intensity governed by the matrix elements $\langle \phi_n | \hat{S}_{\alpha}(q) | \phi_{n'} \rangle$ which filter out the excitations of a non-magnetic nature. Furthermore, the scattering only provides information about the magnetic excitations which have a component of the fluctuation perpendicular to the momentum transfer. In the case where the spin density can be expressed in terms of the atomic spin density due to the unpaired spins in the partially filled shells, such as in transition metals or rare earths, it is convenient to introduce the magnetic atomic ( ionic ) form factor F(q). For a mono-atomic Bravais lattice, this is achieved by decomposing the spin density in terms of the spin density from each unit cell
$$ \hat{S}_{\alpha}(q) = \sum_{e} \left[ \sigma_e + \frac{ i \, q \wedge p_e }{ \hbar | q |^2 } \right]_{\alpha} \exp[ - i q \cdot r_e ] = \sum_{R} \exp[ - i q \cdot R ] \sum_{j} \left[ \sigma_j + \frac{ i \, q \wedge p_j }{ \hbar | q |^2 } \right]_{\alpha} \exp[ - i q \cdot r_j ] \tag{2159} $$
Since the unpaired electrons couple together to give the ionic spin $\hat{S}_R$, the Wigner - Eckart theorem can be used to express the spin density operator
$$ \hat{S}_{\alpha}(q) = \sum_{R} \exp[ - i q \cdot R ] \; F(q) \; \hat{S}_{R, \alpha} \tag{2160} $$
The form factor F(q) is defined as the Fourier transform of the normalized spin density for the ion. By definition
$$ F(0) = 1 \tag{2161} $$
In this case, the inelastic neutron scattering spectrum can be expressed as
$$ \frac{ d^2 \sigma }{ d\omega \, d\Omega } = \frac{k'}{k} \, r_e^2 \, | F(q) |^2 \, S(q; \omega) \tag{2162} $$
where the spin - spin correlation function is expressed in terms of the local ionic spins $\hat{S}_R$. Of course, it is being implicitly assumed that the magnetic scattering can be completely separated from the phonon scattering. Thus, the analysis has ignored the existence of phonon excitations. In the case of zero phonon excitations, the intensity of the magnetic scattering is expected to be reduced by the Debye Waller factor of the phonons.
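The geometrical content of the factor $\delta_{\alpha,\beta} - \hat{q}_{\alpha} \hat{q}_{\beta}$ is that only the component of the spin perpendicular to the scattering vector contributes. The sketch below checks the vector identity $q \wedge ( \sigma \wedge q ) / q^2 = \sigma - \hat{q} ( \hat{q} \cdot \sigma )$ numerically. It treats σ as an ordinary three-component vector of numbers rather than as an operator, which is an assumption made purely for illustration, and the function names are hypothetical.

```python
import numpy as np

def transverse_part(sigma, q):
    """Component of sigma perpendicular to q, written as q ^ (sigma ^ q) / q^2."""
    sigma = np.asarray(sigma, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.cross(q, np.cross(sigma, q)) / np.dot(q, q)

def polarization_factor(q):
    """Matrix delta_{alpha beta} - qhat_alpha qhat_beta projecting onto the plane normal to q."""
    q = np.asarray(q, dtype=float)
    q_hat = q / np.linalg.norm(q)
    return np.eye(3) - np.outer(q_hat, q_hat)

if __name__ == "__main__":
    sigma = np.array([0.3, -1.2, 0.7])   # arbitrary test vector
    q = np.array([1.0, 2.0, -0.5])       # arbitrary scattering vector
    print(transverse_part(sigma, q))
    print(polarization_factor(q) @ sigma)
    print(np.dot(transverse_part(sigma, q), q))  # vanishes up to rounding
```

Both expressions give the same vector, and a spin fluctuation parallel to q is projected out entirely.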
24.2 Time Dependent Spin Correlation Functions
The spin components of the correlation function measured in a scattering experiment can be defined through the function S α,β (q; ω) where S α,β (q; ω)
X
=
P (n) δ( ¯h ω + En − En0 ) ×
n,n0
#
" < φn | Sˆα (q) | φn0 > < φn0 |
×
Sˆβ† (q)
| φn > (2163)
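Using the expression for the energy conserving delta function as an integral over a time variable

$$ \delta( \hbar\omega + E_n - E_{n'} ) = \int_{-\infty}^{\infty} \frac{dt}{2\pi\hbar} \exp\left[ \frac{i}{\hbar} ( \hbar\omega + E_n - E_{n'} ) \, t \right] \tag{2164} $$

the spin-spin correlation function can be written as a Fourier transform of a time dependent correlation function

$$ \begin{aligned} S^{\alpha,\beta}(q;\omega) &= \frac{1}{\hbar} \sum_{n,n'} P(n) \int_{-\infty}^{\infty} \frac{dt}{2\pi} \exp\left[ i \omega t \right] \exp\left[ \frac{i}{\hbar} ( E_n - E_{n'} ) \, t \right] \langle \phi_n | \hat{S}_\alpha(q) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}^\dagger_\beta(q) | \phi_n \rangle \\ &= \int_{-\infty}^{\infty} \frac{dt}{2\pi} \exp\left[ i \omega t \right] \frac{1}{\hbar} \sum_{n,n'} P(n) \exp\left[ \frac{i}{\hbar} ( E_n - E_{n'} ) \, t \right] \langle \phi_n | \hat{S}_\alpha(q) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}^\dagger_\beta(q) | \phi_n \rangle \end{aligned} \tag{2165} $$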
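The product of the phase factor and the matrix elements of $\hat{S}_\alpha(q)$ can be expressed as the matrix elements of an operator in the interaction representation

$$ \exp\left[ \frac{i}{\hbar} ( E_n - E_{n'} ) \, t \right] \langle \phi_n | \hat{S}_\alpha(q) | \phi_{n'} \rangle = \langle \phi_n | \exp\left[ + \frac{i}{\hbar} \hat{H}_0 t \right] \hat{S}_\alpha(q) \exp\left[ - \frac{i}{\hbar} \hat{H}_0 t \right] | \phi_{n'} \rangle = \langle \phi_n | \hat{S}_\alpha(q;t) | \phi_{n'} \rangle \tag{2166} $$

Hence, the spin-spin correlation function is given by

$$ S^{\alpha,\beta}(q;\omega) = \int_{-\infty}^{\infty} \frac{dt}{2\pi} \exp\left[ i \omega t \right] \frac{1}{\hbar} \sum_{n,n'} P(n) \, \langle \phi_n | \hat{S}_\alpha(q;t) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}^\dagger_\beta(q;0) | \phi_n \rangle \tag{2167} $$

The final states form a complete set of states. Therefore, on using the completeness relation, one finds

$$ \begin{aligned} S^{\alpha,\beta}(q;\omega) &= \frac{1}{\hbar} \int_{-\infty}^{\infty} \frac{dt}{2\pi} \exp\left[ i \omega t \right] \sum_n P(n) \, \langle \phi_n | \hat{S}_\alpha(q;t) \, \hat{S}^\dagger_\beta(q;0) | \phi_n \rangle \\ &= \frac{1}{\hbar} \int_{-\infty}^{\infty} \frac{dt}{2\pi} \exp\left[ i \omega t \right] \langle | \hat{S}_\alpha(q;t) \, \hat{S}^\dagger_\beta(q;0) | \rangle \end{aligned} \tag{2168} $$

This is the Fourier transform with respect to time of the thermal averaged spin-spin correlation function. Furthermore, the inverse spatial Fourier transform of the spin density operator and its Hermitean conjugate are given by

$$ \begin{aligned} \hat{S}_\alpha(q) &= \frac{1}{V} \int d^3r \, \exp\left[ - i \, q \cdot r \right] \hat{S}_\alpha(r) \\ \hat{S}^\dagger_\alpha(q) &= \frac{1}{V} \int d^3r' \, \exp\left[ + i \, q \cdot r' \right] \hat{S}_\alpha(r') \end{aligned} \tag{2169} $$

Using this, and the spatial homogeneity of the system, one finds that the inelastic neutron scattering spectrum is related to the spatial and temporal Fourier transform of the spin-spin correlation function

$$ S^{\alpha,\beta}(q;\omega) = \frac{1}{V} \int_{-\infty}^{\infty} \frac{dt}{2\pi} \, \frac{1}{\hbar} \int d^3r \, \exp\left[ i ( \omega t - q \cdot r ) \right] \langle | \hat{S}_\alpha(r;t) \, \hat{S}_\beta(0;0) | \rangle \tag{2170} $$

Thus, inelastic neutron scattering probes the Fourier transform of the equilibrium spin correlation functions.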
24.3 The Fluctuation Dissipation Theorem
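The spin-spin correlation function satisfies the principle of detailed balance. The correlation function is defined by

$$ S^{\alpha,\beta}(q;\omega) = \sum_{n,n'} P(n) \, \delta( \hbar\omega + E_n - E_{n'} ) \, \langle \phi_n | \hat{S}_\alpha(q) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}^\dagger_\beta(q) | \phi_n \rangle \tag{2171} $$

If the equilibrium probability $P(n)$ is given by the Boltzmann expression

$$ P(n) = \frac{1}{Z} \exp\left[ - \beta E_n \right] \tag{2172} $$

then the delta function implies $P(n) = \exp[ \beta \hbar \omega ] \, P(n')$, and the spin-spin correlation function can be re-written as

$$ \begin{aligned} S^{\alpha,\beta}(q;\omega) &= \sum_{n,n'} P(n) \, \delta( \hbar\omega + E_n - E_{n'} ) \, \langle \phi_{n'} | \hat{S}^\dagger_\beta(q) | \phi_n \rangle \langle \phi_n | \hat{S}_\alpha(q) | \phi_{n'} \rangle \\ &= \exp\left[ \beta \hbar \omega \right] \sum_{n,n'} P(n') \, \delta( E_{n'} - E_n - \hbar\omega ) \, \langle \phi_{n'} | \hat{S}^\dagger_\beta(q) | \phi_n \rangle \langle \phi_n | \hat{S}_\alpha(q) | \phi_{n'} \rangle \\ &= \exp\left[ \beta \hbar \omega \right] \sum_{n,n'} P(n) \, \delta( E_n - E_{n'} - \hbar\omega ) \, \langle \phi_n | \hat{S}^\dagger_\beta(q) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}_\alpha(q) | \phi_n \rangle \\ &= \exp\left[ \beta \hbar \omega \right] S^{\beta,\alpha}(-q;-\omega) \end{aligned} \tag{2173} $$

which is a statement of the principle of detailed balance. The correlation function $S^{\alpha,\beta}(q;\omega)$ is also related to the imaginary part of the magnetic susceptibility $\chi^{\alpha,\beta}(q;\omega)$ via the fluctuation dissipation theorem. The reduced dynamical magnetic susceptibility is given by the expression

$$ \chi^{\alpha,\beta}(r,r';t-t') = - \frac{i}{\hbar} \, \langle | \left[ \hat{S}_\alpha(r,t) \, , \, \hat{S}_\beta(r',t') \right] | \rangle \, \Theta( t - t' ) \tag{2174} $$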
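which can be expressed as

$$ \begin{aligned} \chi^{\alpha,\beta}(r;t) = \Theta(t) \Bigg[ &- \frac{i}{\hbar} \sum_{n,n'} P(n) \exp\left[ \frac{i}{\hbar} ( E_n - E_{n'} ) \, t \right] \langle \phi_n | \hat{S}_\alpha(r) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}_\beta(0) | \phi_n \rangle \\ &+ \frac{i}{\hbar} \sum_{n,n'} P(n) \exp\left[ \frac{i}{\hbar} ( E_{n'} - E_n ) \, t \right] \langle \phi_n | \hat{S}_\beta(0) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}_\alpha(r) | \phi_n \rangle \Bigg] \end{aligned} \tag{2175} $$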
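The Fourier transform is defined as

$$ \chi^{\alpha,\beta}(q;\omega) = \frac{1}{V} \int_{-\infty}^{\infty} \frac{dt}{2\pi} \int d^3r \, \exp\left[ i ( \omega t - q \cdot r ) \right] \chi^{\alpha,\beta}(r;t) \tag{2176} $$

and is evaluated as

$$ \chi^{\alpha,\beta}(q;\omega) = \frac{1}{2\pi} \sum_{n,n'} P(n) \left[ \frac{ \langle \phi_n | \hat{S}_\alpha(q) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}^\dagger_\beta(q) | \phi_n \rangle }{ \hbar\omega + E_n - E_{n'} + i \delta } - \frac{ \langle \phi_n | \hat{S}^\dagger_\beta(q) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}_\alpha(q) | \phi_n \rangle }{ \hbar\omega + E_{n'} - E_n + i \delta } \right] \tag{2177} $$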
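The imaginary part of the dynamic susceptibility is given by

$$ \mathrm{Im} \, \chi^{\alpha,\beta}(q;\omega) = - \frac{1}{2} \sum_{n,n'} P(n) \Big[ \delta( \hbar\omega + E_n - E_{n'} ) \, \langle \phi_n | \hat{S}_\alpha(q) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}^\dagger_\beta(q) | \phi_n \rangle - \delta( \hbar\omega + E_{n'} - E_n ) \, \langle \phi_n | \hat{S}^\dagger_\beta(q) | \phi_{n'} \rangle \langle \phi_{n'} | \hat{S}_\alpha(q) | \phi_n \rangle \Big] \tag{2178} $$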
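which can be written as

$$ \mathrm{Im} \, \chi^{\alpha,\beta}(q;\omega + i\delta) = - \frac{1}{2} \, S^{\alpha,\beta}(q;\omega) \left[ 1 - \exp\left[ - \beta \hbar \omega \right] \right] \tag{2179} $$

or

$$ S^{\alpha,\beta}(q;\omega) = 2 \left[ 1 + N(\omega) \right] \mathrm{Im} \, \chi^{\alpha,\beta}(q;\omega - i\delta) \tag{2180} $$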
This is the fluctuation dissipation theorem. It relates the dynamical response of the system to an external perturbation to the naturally occurring excitations in the system, as measured by neutron scattering experiments.
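The detailed balance relation and the thermal factor appearing in the fluctuation dissipation theorem can be checked numerically on a toy model. The sketch below is only an illustration: the three energy levels, the matrix `A` (standing in for a Hermitean spin operator written in the energy eigenbasis) and the value of `BETA` are hypothetical choices in arbitrary units, not quantities taken from the notes.

```python
import math

def boltzmann_weights(energies, beta):
    """Normalized Boltzmann probabilities P(n) = exp(-beta * E_n) / Z."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def spectral_weight(energies, op, beta, n, m):
    """Weight P(n) |<n|A|m>|^2 of the line at energy transfer E_m - E_n."""
    p = boltzmann_weights(energies, beta)
    return p[n] * abs(op[n][m]) ** 2

def bose(x):
    """Bose-Einstein factor N = 1 / (exp(x) - 1), with x = beta * hbar * omega > 0."""
    return 1.0 / math.expm1(x)

# Hypothetical three-level system (arbitrary units).
ENERGIES = [0.0, 1.0, 2.5]
A = [[0.0, 0.7, 0.2],
     [0.7, 0.0, 0.5],
     [0.2, 0.5, 0.0]]
BETA = 1.3

for n in range(3):
    for m in range(n + 1, 3):
        hw = ENERGIES[m] - ENERGIES[n]                    # energy transfer, hw > 0
        up = spectral_weight(ENERGIES, A, BETA, n, m)     # line at +hw
        down = spectral_weight(ENERGIES, A, BETA, m, n)   # line at -hw
        # Detailed balance: the two columns printed last should agree.
        print(n, m, up / down, math.exp(BETA * hw))

# Thermal factor of the fluctuation dissipation theorem:
# 1 / (1 - exp(-x)) equals 1 + N for any x > 0.
x = 0.8
print(1.0 / (1.0 - math.exp(-x)), 1.0 + bose(x))
```

For each pair of levels, the ratio of the weights of the lines at $+\hbar\omega$ and $-\hbar\omega$ reproduces $\exp[\beta\hbar\omega]$, and the last line illustrates why the factor $[ 1 - \exp( - \beta\hbar\omega ) ]^{-1}$ can be written as $1 + N(\omega)$.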
24.4 Magnetic Scattering
The neutron scattering cross-section is given in terms of the components of the spin-spin correlation function. As can be seen by inspection from the Holstein-Primakoff representation of the spins and the spin waves, the spin correlation function is a non-linear function of the spin wave creation operators. The inelastic scattering cross-section can be expanded in powers of the number of spin waves. The lowest order term is time independent and corresponds to Bragg scattering.
24.4.1 Neutron Diffraction
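The $\omega = 0$ component of the inelastic scattering cross-section, given by the limit of

$$ S^{\alpha,\beta}(q;\omega) = \frac{1}{\hbar} \int_{-\infty}^{\infty} \frac{dt}{2\pi} \exp\left[ i \omega t \right] \langle | \hat{S}_\alpha(q;t) \, \hat{S}^\dagger_\beta(q;0) | \rangle \tag{2181} $$

diverges if the integrand does not decay rapidly as $t \to \infty$. In this case the time independent component of the spin-spin correlation function, given by

$$ \lim_{t \to \infty} \frac{1}{\hbar} \, \langle | \hat{S}_\alpha(q;t) \, \hat{S}^\dagger_\beta(q;0) | \rangle \tag{2182} $$

produces a Bragg peak with finite intensity, as

$$ \delta(\omega) = \int_{-\infty}^{\infty} \frac{dt}{2\pi} \exp\left[ i \omega t \right] \tag{2183} $$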
Thus, the intensity of the Bragg peak represents the static correlations. If the ergodic hypothesis holds, then in the long time limit, the correlation function decouples into the product of two expectation values

$$ \lim_{t \to \infty} \frac{1}{\hbar} \, \langle | \hat{S}_\alpha(q;t) \, \hat{S}^\dagger_\beta(q;0) | \rangle = \lim_{t \to \infty} \frac{1}{\hbar} \, \langle | \hat{S}_\alpha(q;t) | \rangle \, \langle | \hat{S}^\dagger_\beta(q;0) | \rangle \tag{2184} $$

and for a static system, for which

$$ \langle | \hat{S}_\alpha(q;t) | \rangle = \langle | \hat{S}_\alpha(q;0) | \rangle \tag{2185} $$

the Bragg peak has an intensity given by

$$ \frac{1}{\hbar} \, \langle | \hat{S}_\alpha(q;0) | \rangle \, \langle | \hat{S}^\dagger_\beta(q;0) | \rangle \tag{2186} $$
For a paramagnetic system the (quasi-stationary) average value of the spin vector is zero

$$ \langle | \hat{S}_\alpha(q;0) | \rangle = 0 \tag{2187} $$

and, thus, there is no magnetic Bragg peak for a paramagnetic system. On the other hand, if there is long-ranged magnetic order with wave vectors $Q$ along certain directions (say $\alpha$), then

$$ \langle | \hat{S}_\alpha(Q;0) | \rangle \neq 0 \tag{2188} $$

and the magnetic Bragg peaks are non-zero. The temperature dependence of the intensity of the Bragg peaks provides a direct measure of the temperature dependence of the magnetic order parameter. For a ferromagnet, the magnetic Bragg peaks coincide with the Bragg peaks due to the crystalline order, so no new peaks emerge. The Bragg scattering cross-section is given by

$$ \left( \frac{d\sigma}{d\Omega} \right)_{Bragg} = r_e^2 \, \frac{( 2 \pi N )^2}{V} \sum_Q \left( 1 - \hat{q}_z^2 \right) \delta( q - Q ) \, \langle | S_z | \rangle^2 \tag{2189} $$
For a small, single domain, single crystal the magnetic elastic scattering is extremely anisotropic: the scattering should be zero for momentum transfers along the direction of the magnetization. For anti-ferromagnetic or spin density wave order, new Bragg peaks may emerge at the vectors of the anti-ferromagnetic reciprocal lattice. Analysis of the anisotropy of the neutron scattering intensity for anisotropic single crystals leads to the determination of the preferred directions of the magnetic moments. ——————————————————————————————————
24.4.2 Exercise 80
Evaluate the elastic scattering cross-section for an anti-ferromagnetic insulator, using the Holstein-Primakoff representation of the low energy spin wave excitations. Discuss the anisotropy and also the temperature dependence of the intensity of the Bragg peaks. ——————————————————————————————————
24.4.3 Exercise 81
Design a neutron diffraction experiment that will determine if a system has a spiral spin density wave order, as opposed to a magnetic moment that is modulated in intensity. How can the direction of spiral be determined? ——————————————————————————————————
24.4.4 Spin Wave Scattering
The spin wave excitations of an ordered magnet show up in the inelastic neutron scattering spectra. In a process whereby a single spin wave of wave vector $q$ is emitted in the scattering, the energy of the final state $E_{n'}$ exceeds the energy of the initial state $E_n$ by the spin wave energy, so that energy conservation leads to

$$ E_{n'} - E_n = \hbar \omega_q \tag{2190} $$
The matrix elements in the spin-spin correlation function can be evaluated as

$$ \langle \phi_n | S_\alpha(q) | \phi_{n'} \rangle = \prod_{q'} \prod_{q''} \langle n_{q'} | S_\alpha(q) | n'_{q''} \rangle \tag{2191} $$

where the number of spin waves in the initial state are related to the number in the final state via

$$ n'_{q'} = n_{q'} \quad \text{for} \quad q' \neq q \tag{2192} $$

and

$$ n'_q = n_q + 1 \tag{2193} $$
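For a ferromagnet the matrix elements are evaluated as

$$ \langle n_q | S_z(q) | n_q + 1 \rangle = 0 \tag{2194} $$

while the transverse matrix elements are

$$ \langle n_q | S_x(q) | n_q + 1 \rangle = \frac{1}{2} \, \langle n_q | \left( S^+(q) + S^-(q) \right) | n_q + 1 \rangle = \frac{1}{2} \sqrt{2 S} \sqrt{ n_q + 1 } \tag{2195} $$

and

$$ \langle n_q | S_y(q) | n_q + 1 \rangle = \frac{1}{2 i} \, \langle n_q | \left( S^+(q) - S^-(q) \right) | n_q + 1 \rangle = + \frac{1}{2 i} \sqrt{2 S} \sqrt{ n_q + 1 } \tag{2196} $$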
Thus, the inelastic neutron scattering from the single spin wave excitations is given purely by the diagonal components of the transverse spin-spin correlation function, as the longitudinal components are zero. The off diagonal terms cancel. The cross-section for the spin wave emission process is given by

$$ \left( \frac{d^2\sigma}{d\omega \, d\Omega} \right)_{emit} = r_e^2 \, \frac{( 2 \pi )^2 N}{V} \, \frac{S}{2} \left( 1 + \hat{q}_z^2 \right) \sum_{q',Q} \delta( \hbar\omega - \hbar\omega(q') ) \, \delta( q - q' - Q ) \left[ 1 + N(\omega(q')) \right] \tag{2197} $$

Likewise, the absorption process has a scattering cross-section given by

$$ \left( \frac{d^2\sigma}{d\omega \, d\Omega} \right)_{abs} = r_e^2 \, \frac{( 2 \pi )^2 N}{V} \, \frac{S}{2} \left( 1 + \hat{q}_z^2 \right) \sum_{q',Q} \delta( \hbar\omega + \hbar\omega(q') ) \, \delta( q - q' - Q ) \, N(\omega(q')) \tag{2198} $$

These satisfy the principle of detailed balance, and give rise to a Stokes and anti-Stokes line in the spectrum of the scattered neutrons. ——————————————————————————————————
24.4.5 Exercise 82
Evaluate the two lowest order terms in the inelastic scattering cross-section for an anti-ferromagnetic insulator, using the Holstein-Primakoff representation of the low energy spin wave excitations. Discuss the differences between the spectrum obtained from magnetic scattering and that found in measurements of the phonon excitations. ——————————————————————————————————
24.4.6 Critical Scattering
Just above the temperature where magnetic ordering occurs, the inelastic neutron scattering cross-section in the paramagnetic phase shows a softening or build up at low frequencies and becomes sharply peaked at $q$ values close to the magnetic Bragg vectors $Q$. Below the ordering temperature, the intensity transforms into the Bragg peak. This phenomenon of the build up of intensity close to the Bragg peak is known as critical scattering. The Bragg peak is extracted from the inelastic scattering spectrum by extracting a delta function $\delta(\omega)$, i.e., the inelastic scattering cross-section is integrated over a small window $d\omega$. On invoking the fluctuation dissipation theorem and noting that, in the paramagnetic phase, the main portion of the scattering occurs at frequencies such that $\beta \hbar \omega \ll 1$, one finds by using the Kramers-Kronig relation that the intensity of the critical scattering is given by the static susceptibility. For $q$ values close to the Bragg peak, the susceptibility varies as

$$ \chi(q) \propto \frac{1}{ ( q - Q )^2 + \xi^{-2} } \tag{2199} $$

where $\xi$, the correlation length, is given in the mean field approximation by

$$ \xi = a \sqrt{ \frac{T_c}{ 6 \, ( T - T_c ) } } \tag{2200} $$

Thus, the critical scattering diverges as $( T - T_c )^{-1}$ as the transition temperature is approached.
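The growth of the critical scattering can be illustrated with a short numerical sketch. It assumes the Lorentzian form of the susceptibility near the Bragg vector and the mean field reading $\xi = a \sqrt{ T_c / ( 6 ( T - T_c ) ) }$ of the correlation length. The function names, the lattice constant `a = 1` and the temperatures are hypothetical choices made only for this illustration.

```python
import math

def correlation_length(t, tc, a=1.0):
    """Mean field correlation length xi = a * sqrt(Tc / (6 (T - Tc))) for T > Tc."""
    if t <= tc:
        raise ValueError("this paramagnetic-phase expression requires T > Tc")
    return a * math.sqrt(tc / (6.0 * (t - tc)))

def critical_intensity(dq, t, tc, a=1.0):
    """Lorentzian 1 / (dq^2 + xi^-2), with dq = |q - Q| (unnormalized)."""
    xi = correlation_length(t, tc, a)
    return 1.0 / (dq ** 2 + xi ** -2)

TC = 100.0  # hypothetical ordering temperature, in kelvin
for t in (150.0, 110.0, 101.0, 100.1):
    xi = correlation_length(t, TC)
    peak = critical_intensity(0.0, t, TC)
    # The product peak * (T - Tc) stays constant: the peak grows as (T - Tc)^-1.
    print(t, xi, peak, peak * (t - TC))
```

The peak intensity at $q = Q$ equals $\xi^2$, so the printed product is independent of temperature, while the width of the Lorentzian in $q$ shrinks as $\xi^{-1}$.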
25 Superconductivity
The electrical resistivity $\rho(T)$ of metals at low temperatures is expected to be described by the Drude model

$$ \rho(T) \propto \frac{m}{e^2 \tau} \tag{2201} $$

The resistivity is expected to vary with temperature according to

$$ \rho(T) \sim \rho(0) + A \, T^2 + B \, T^5 \tag{2202} $$

as the scattering rates for scattering from static impurities, electron-electron scattering and phonon scattering are expected to be additive. The resistivity of a perfect metal, without impurities, may be expected to vanish at $T = 0$. However, it was discovered by Kamerlingh Onnes that the resistivity of a metal may become so small as to effectively vanish for all temperatures below a critical temperature $T_c$ (H. Kamerlingh Onnes, Comm. Phys. Lab. Univ. Leiden, Nos. 119, 120, 122 (1911)). This indicates that the scattering mechanisms suddenly become ineffective for temperatures slightly below the critical temperature, where the metal seems to act like a perfect conductor. The resistivity is so small that persistent electrical currents have been observed to flow without attenuation for very long time periods. The decay time of a super-current in favorable materials is apparently not less than 10,000 years.
25.1 Experimental Manifestation
The first manifestation of superconductivity is zero resistance below $T_c$. Another manifestation of superconductivity, found by Meissner and Ochsenfeld, is flux exclusion (W. Meissner and R. Ochsenfeld, Naturwiss. 21, 787 (1933)). A superconductor excludes the magnetic induction field $B$ from its interior, irrespective of whether it was cooled from above $T_c$ to below $T_c$ in the presence of an applied field, or whether the field is only applied when the temperature is smaller than $T_c$. In other words, the Meissner effect excludes time independent magnetic field solutions from inside the superconductor. The Meissner effect distinguishes superconductivity from perfect conductivity, as a static magnetic field can exist in a perfect conductor. The perfect conductor has the property that the current produced by an applied electric field increases linearly with time. Therefore, a perfect conductor excludes electric fields from within its bulk. Maxwell's equations reduce to

$$ \begin{aligned} - \frac{1}{c} \frac{\partial B}{\partial t} &= 0 \\ \nabla \wedge B &= \frac{4 \pi}{c} \, j \\ \nabla \cdot B &= 0 \end{aligned} \tag{2203} $$
Thus, a perfect conductor only excludes a time varying magnetic field, but not a static magnetic field. The Meissner effect shows that the magnetic induction inside a superconductor is zero. However, the magnetic induction $B$ can be expressed in terms of the applied field $H$ and the magnetization $M$ via

$$ B = H + 4 \pi M \tag{2204} $$

so $B = 0$ implies that

$$ M = - \frac{1}{4 \pi} \, H \tag{2205} $$

so that perfect diamagnetism implies that the susceptibility is given by

$$ \chi = - \frac{1}{4 \pi} \tag{2206} $$
The perfect diamagnetism does not hold for arbitrarily large applied magnetic fields. For fields larger than a critical magnetic field, the induction inside the superconductor becomes non-zero. For a type I superconductor, the applied field fully penetrates into the bulk of the superconductor above the critical field $H_c$. The magnetization drops discontinuously to zero at $H_c$. The value of $H_c$ depends on temperature according to

$$ H_c(T) = H_c(0) \left[ 1 - \frac{T^2}{T_c^2} \right] \tag{2207} $$

For a type II superconductor, the induction first starts penetrating into the bulk at the lower critical field $H_{c1}$. For fields larger than the lower critical field, the magnetization deviates from the linear relation associated with perfect diamagnetism. The magnitude of the magnetization is reduced as the applied field is increased above $H_{c1}$. The magnetization falls to zero at the upper critical field $H_{c2}$, at which point the applied field fully penetrates into the bulk. The experimental observations of a drop in the resistivity and the Meissner effect demonstrate that the transition to the superconducting state is a phase transition, as the properties are independent of the history of the sample. For a type I superconductor, the bulk superconductivity is completely destroyed at $H_c(T)$.
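The type I behavior described above can be sketched numerically, assuming the parabolic law for $H_c(T)$ and ideal Meissner screening below $H_c$. The function names and the sample parameters ($T_c = 7.2$ K and $H_c(0) = 800$ Oe) are illustrative assumptions, not values taken from the notes.

```python
import math

def critical_field(t, tc, hc0):
    """Parabolic critical field Hc(T) = Hc(0) [1 - (T/Tc)^2]; zero at and above Tc."""
    if t >= tc:
        return 0.0
    return hc0 * (1.0 - (t / tc) ** 2)

def type1_magnetization(h, t, tc, hc0):
    """Magnetization of an ideal type I superconductor in Gaussian units.

    Below Hc(T) the sample is perfectly diamagnetic, M = -H / (4 pi).
    At and above Hc(T) the field penetrates fully and M is taken as zero.
    """
    if h < critical_field(t, tc, hc0):
        return -h / (4.0 * math.pi)
    return 0.0

TC, HC0 = 7.2, 800.0  # illustrative values (kelvin, oersted)
for t in (0.0, 3.6, 7.0, 7.2):
    print(t, critical_field(t, TC, HC0))

# The magnetization drops discontinuously to zero at Hc(T).
hc = critical_field(3.6, TC, HC0)
print(type1_magnetization(0.99 * hc, 3.6, TC, HC0), type1_magnetization(1.01 * hc, 3.6, TC, HC0))
```

A type II superconductor would need a different model of the magnetization between $H_{c1}$ and $H_{c2}$, which this sketch does not attempt.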
25.1.1 The London Equations
A phenomenological description of superconductivity was developed by the London brothers. Basically, this description is based on two phenomenological constitutive equations for the electromagnetic field and its relation to current and density. The first London equation is of the form

$$ j(r,t) = - \frac{n_s e^2}{m c} \, A(r,t) \tag{2208} $$
which expresses the perfect conductivity of a superconductor. It relates the microscopic current in the superconductor, which screens the applied magnetic field, to the vector potential. Here, $n_s$ is the density of superfluid electrons. This is slightly different from the condition of perfect conductivity in a metal, which would be a relation between the time derivative of the current and the electric field that drives it. Here, it has been assumed that the electric field is transverse and the condition for perfect conductivity has been integrated with respect to time, thereby allowing constant currents to screen the static applied magnetic field. In order that the continuity equation be satisfied in a steady state, the condition $\nabla \cdot A(r,t) = 0$, which defines the London gauge, must be imposed. The second London equation comes from Maxwell's equations

$$ \nabla \wedge B(r,t) = \frac{j(r,t)}{c} + \frac{1}{c} \frac{\partial}{\partial t} E(r,t) \tag{2209} $$

and with the definitions

$$ \begin{aligned} B(r,t) &= \nabla \wedge A(r,t) \\ E(r,t) &= - \frac{1}{c} \frac{\partial}{\partial t} A(r,t) \end{aligned} \tag{2210} $$

one finds

$$ \left[ \nabla \wedge \nabla \wedge \, + \, \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] A(r,t) = - \frac{n_s e^2}{m c^2} \, A(r,t) \tag{2211} $$

This is referred to as the second London equation. The quantity $\frac{n_s e^2}{m c^2}$ has units of inverse length squared and is used to define the London penetration depth $\lambda_L$, via

$$ \frac{n_s e^2}{m c^2} = \frac{1}{\lambda_L^2} \tag{2212} $$
The second London equation expresses the Meissner effect, namely that a superconductor excludes the magnetic induction field $B$ from the bulk of its volume. However, the field does penetrate the region at the surface and extends over a distance $\lambda_L$ into the superconductor. This can be seen by examining various cases in which a static applied magnetic field is produced near a superconductor. The geometry is considered in which the applied field is parallel to the surface. Let the surface be the plane $z = 0$, which separates the superconductor $z > 0$ from the vacuum $z < 0$. The external field is applied in the $x$ direction, so $B = B_0 \, \hat{x}$ for $z < 0$. The vector potential inside the superconductor must satisfy the boundary condition $A_z(z = 0) = 0$, as no current should flow perpendicular to the surface. The London gauge requires the non-zero components of the vector potential to be $A_x$ and $A_y$. Thus, the vector potential must be parallel to the surface. The static solution for the vector potential that satisfies the boundary conditions on the current for the semi-infinite solid is

$$ A(z) = A_0 \exp\left[ - \frac{z}{\lambda_L} \right] \tag{2213} $$

An additional boundary condition at $z = 0$ is that $B_x$ should be continuous. Hence, as the equation for the magnetic induction simplifies to

$$ B_x(z) = - \frac{\partial A_y(z)}{\partial z} \tag{2214} $$

one finds that the vector potential is directed parallel to the surface, but is also perpendicular to the applied field. The only non-zero component of the vector potential in the superconductor is found as

$$ A_y(z) = + \lambda_L \, B_x(0) \exp\left[ - \frac{z}{\lambda_L} \right] \quad \text{for} \quad z > 0 \tag{2215} $$

London's first equation then implies that a supercurrent, $j_y(z)$, flows in a region near the surface of the superconductor which, through Ampere's law, produces a magnetic field that screens or cancels the applied field. The magnetic induction and the supercurrent are non-zero in the superconductor only within a distance of $\lambda_L$ from the surface. Hence, $\lambda_L$ is called the penetration depth.
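To get a feeling for the size of the penetration depth, the sketch below evaluates it in SI units, where the standard expression is $\lambda_L = \sqrt{ m / ( \mu_0 n_s e^2 ) }$. This SI form is an assumption of the example: it differs by conventional factors from the Gaussian-type expression used in the text. The superfluid density of $10^{28}$ m$^{-3}$ is a hypothetical round number.

```python
import math

ELECTRON_MASS = 9.1093837015e-31      # kg
ELEMENTARY_CHARGE = 1.602176634e-19   # C
MU_0 = 1.25663706212e-6               # vacuum permeability, N / A^2

def london_penetration_depth(n_s):
    """London penetration depth in metres for a superfluid density n_s (m^-3), SI units."""
    return math.sqrt(ELECTRON_MASS / (MU_0 * n_s * ELEMENTARY_CHARGE ** 2))

def induction_profile(z, b0, lam):
    """Magnetic induction B_x(z) = B_0 exp(-z / lambda_L) inside the superconductor (z >= 0)."""
    return b0 * math.exp(-z / lam)

lam = london_penetration_depth(1.0e28)   # hypothetical superfluid density
print(lam)                               # a few tens of nanometres
for z in (0.0, lam, 3.0 * lam, 10.0 * lam):
    print(z, induction_profile(z, 1.0, lam))
```

The exponential profile shows that the induction has fallen to a few percent of its surface value after three penetration depths, which is why a bulk sample appears perfectly diamagnetic.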
25.1.2 Thermodynamics of the Superconducting State
The phase transition to a superconducting state, in zero field, is a second order phase transition. This can be seen by examining the specific heat, which exhibits a discontinuous jump at $T_c$. The absence of any latent heat implies that the entropy is continuous, and since the entropy is obtained as a first order derivative of the free energy, the transition is not first order. The non-analytic behavior of the specific heat, which is obtained from a second derivative of the free energy, defines the transition to be second order. In the presence of a field the transition is first order. The thermodynamic relations are derived from the Gibbs free energy $G$, in which $M$ plays the role of the volume $V$ and the externally applied field $H$ plays the role of the applied pressure $P$. Then, $G(T,H)$ has the infinitesimal change

$$ dG = - S \, dT - M \cdot dH \tag{2216} $$

where $S$ is the entropy and $T$ is the temperature. Since $G$ is continuous across the phase boundary at $( H_c(T), T )$,

$$ G_n(T, H_c(T)) = G_s(T, H_c(T)) \tag{2217} $$
On taking an infinitesimal change in both $T$ and $H = H_c(T)$, so as to stay on the phase boundary, one finds

$$ ( S_s - S_n ) \, dT = ( M_n - M_s ) \, dH_c(T) \tag{2218} $$

The magnetization in the normal state is negligibly small, but the superconducting state is perfectly diamagnetic, so

$$ M_n - M_s = + \frac{1}{4 \pi} \, H_c(T) \tag{2219} $$

This shows that in the presence of an applied field the superconducting transition involves a latent heat $L$ given by

$$ L = T \, ( S_n - S_s ) = - \frac{T}{4 \pi} \, H_c \, \frac{\partial H_c}{\partial T} \tag{2220} $$
Thus, the transition is first order in the presence of an applied field. In the limit that the critical field is reduced to zero, the transition temperature $T$ tends to the zero field value $T_c$, and the entropy becomes continuous at the transition. However, there is a change in slope at $T_c$, which can be found by taking the derivative of $S_n - S_s$ with respect to $T$, and then letting $H_c \to 0$

$$ \frac{\partial}{\partial T} ( S_n - S_s ) = - \frac{1}{4 \pi} \left[ H_c \, \frac{\partial^2 H_c}{\partial T^2} + \left( \frac{\partial H_c}{\partial T} \right)^2 \right] = - \frac{1}{4 \pi} \left( \frac{\partial H_c}{\partial T} \right)^2 \tag{2221} $$

The specific heat may show a discontinuity or jump at $T_c$ that is a measure of the initial slope of the critical field

$$ C_s - C_n = \frac{T}{4 \pi} \left( \frac{\partial H_c}{\partial T} \right)^2 \tag{2222} $$
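As a worked illustration, suppose that the parabolic critical field law of equation (2207) holds all the way up to $T_c$. This is an empirical approximation, adopted here only to obtain explicit numbers. Then the slope of the critical field at $T_c$ and the specific heat jump of equation (2222) become

$$ \left. \frac{\partial H_c}{\partial T} \right|_{T_c} = - \frac{2 \, H_c(0)}{T_c} , \qquad \left. ( C_s - C_n ) \right|_{T_c} = \frac{T_c}{4 \pi} \, \frac{4 \, H_c(0)^2}{T_c^2} = \frac{H_c(0)^2}{\pi \, T_c} $$

Under the same assumption, the latent heat of equation (2220) is

$$ L = - \frac{T}{4 \pi} \, H_c \, \frac{\partial H_c}{\partial T} = \frac{T^2 \, H_c(0)^2}{2 \pi \, T_c^2} \left[ 1 - \frac{T^2}{T_c^2} \right] $$

which is positive for $0 < T < T_c$ and vanishes both at $T = 0$ and at $T = T_c$, in accord with the transition being first order in a field and second order in zero field.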
The discontinuous jump in the (zero field) specific heat is a characteristic of a mean field transition. For temperatures below $T_c$ the specific heat is exponentially activated

$$ C_v \sim \gamma \, T_c \exp\left[ - \frac{\Delta}{k_B T} \right] \tag{2223} $$

The activated exponential behavior of the specific heat suggests that there is an energy gap in the excitation spectrum. The existence of a gap is confirmed by a threshold frequency for photon absorption by a superconductor. Above $T_c$, the absorption spectrum is continuous and photons of arbitrarily low frequency can be absorbed by the metal. However, for temperatures below $T_c$, there is a minimum frequency above which photons can be absorbed. The threshold frequency is related to $\Delta$. In most superconductors, the interaction mechanism that is responsible for pairing is mediated by the electron-phonon coupling. This was first identified through the insight of Fröhlich (H. Fröhlich, Phys. Rev. 79, 845 (1950)), who predicted that the superconducting transition temperature $T_c$ should be proportional to the phonon frequency. Furthermore, as the square of the phonon frequency is inversely proportional to the mass of the ions $M$, the superconducting transition temperature should depend upon the isotopic mass through

$$ T_c \propto M^{-\frac{1}{2}} \tag{2224} $$

This isotope effect was confirmed in later experiments by Maxwell (E. Maxwell, Phys. Rev. 78, 447 (1950), Phys. Rev. 79, 173 (1950)) and Reynolds et al. (C.A. Reynolds, B. Serin, W.H. Wright and L.B. Nesbitt, Phys. Rev. 78, 487 (1950)) on simple metals. However, in transition metals the exponent of the isotope effect is reduced and may become zero, and in α-U the exponent is positive. The occurrence of a positive isotope effect does not necessarily signify the existence of alternate pairing mechanisms, but can indicate the effect of strong electron-electron interactions.
25.2 The Cooper Problem
The electron-electron interaction in a metal is attractive at low frequencies. The attractive interaction originates from the screening of the electrons by the ions, but only occurs for energy transfers less than $\hbar \omega_D$. The effective attraction is retarded, and occurs due to the attraction of a second electron to the slowly evolving polarization of the lattice produced by the first electron. Cooper showed that two electrons which are close to the Fermi-energy will bind into pairs whenever they experience an attractive interaction, no matter how weak the interaction is. Consider a pair of electrons of spin $\sigma$ and $\sigma'$, excited above the Fermi-energy. Due to the interaction between the pair of particles, the center of mass momentum $q$ will be a constant of motion, but the relative momentum will not. Thus, the wave function of the Cooper pair with total momentum $q$ can be written as

$$ | \Psi_q \rangle = \sum_k C(k) \, | \sigma, k + q \, ; \, \sigma', -k \rangle \tag{2225} $$

Due to the Pauli exclusion principle the single particle energies $E(-k)$ and $E(k+q)$ must both be above the Fermi-energy $\mu$. The wave function is normalized such that

$$ \sum_k | C(k) |^2 = 1 \tag{2226} $$
The wave function must be an energy eigenstate of the Hamiltonian $\hat{H}$ and, thus, satisfies

$$ \hat{H} \, | \Psi_q \rangle = E_q \, | \Psi_q \rangle \tag{2227} $$

On projecting out $C(k)$ using the orthogonality of the different momentum states $| \sigma, k + q \, ; \, \sigma', -k \rangle$ one finds the secular equation

$$ \left[ E_q - E(k) - E(k+q) \right] C(k) = - \frac{1}{N} \sum_{k'} V(k,k') \, C(k') \tag{2228} $$

The attractive pairing potential $V$, ( $- V < 0$ ), scatters the pairs of electrons between states of different relative momentum. The summation over $k'$ is restricted to unoccupied Bloch states within $\hbar \omega_D$ of the Fermi-surface, where the interaction is attractive. The above equation has a solution for the amplitude $C(k)$ which is given by

$$ C(k) = \frac{\alpha(k)}{ E_q - E(k) - E(k+q) } \tag{2229} $$

where $\alpha$ is given by

$$ \alpha(k) = - \frac{1}{N} \sum_{k'} V(k,k') \, C(k') \tag{2230} $$

This equation can be solved analytically in the case where the potential is separable, such as the case where $V$ is just a constant. In such cases, the summation over $k'$ can be performed to yield a result which is independent of $k$. For simplicity, the separable potential shall be assumed to have a magnitude of $V$ when both $k$ and $k'$ are within $\hbar \omega_D$ of the Fermi-surface. Then, $\alpha$ is independent of $k$ and

$$ \alpha = \frac{\alpha}{N} \sum_{k'} \frac{ V(k,k') }{ E(k') + E(k'+q) - E_q } \tag{2231} $$

Thus, the energy eigenvalue is determined from the equation

$$ 1 = \frac{1}{N} \sum_{k'} \frac{ V(k,k') }{ E(k') + E(k'+q) - E_q } \tag{2232} $$
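For Cooper pairs with zero total momentum $q = 0$, this equation reduces to

$$ 1 = V \int_{\mu}^{\mu + \hbar \omega_D} d\epsilon \, \frac{ \rho(\epsilon) }{ 2 \epsilon - E } \tag{2233} $$

The density of states $\rho(\epsilon)$ can be approximated by a constant $\rho(\mu)$, and the integral can be inverted to give the energy eigenvalue as

$$ E = 2 \mu - \frac{ 2 \hbar \omega_D }{ \exp\left[ \frac{2}{ V \rho(\mu) } \right] - 1 } \tag{2234} $$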
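The closed form for the pair energy can be checked against the integral equation it came from. The sketch below assumes a constant density of states, for which the integral in (2233) is elementary, and works in units where $\hbar \omega_D = 1$. The function names and the coupling values $V \rho(\mu)$ are hypothetical choices for illustration.

```python
import math

def cooper_binding_energy(v_rho, hbar_omega_d):
    """Binding energy 2 mu - E = 2 hbar omega_D / (exp(2 / (V rho)) - 1)."""
    return 2.0 * hbar_omega_d / math.expm1(2.0 / v_rho)

def pair_equation_rhs(binding, v_rho, hbar_omega_d):
    """Right hand side of 1 = V rho * integral of d eps / (2 eps - E).

    For constant rho the integral over [mu, mu + hbar omega_D] equals
    (1/2) ln[(2 hbar omega_D + binding) / binding], with binding = 2 mu - E.
    """
    return 0.5 * v_rho * math.log((2.0 * hbar_omega_d + binding) / binding)

for v_rho in (0.5, 0.3, 0.2, 0.1):
    binding = cooper_binding_energy(v_rho, 1.0)
    weak_coupling = 2.0 * math.exp(-2.0 / v_rho)   # leading small-coupling form
    # The second column should be 1; the last two columns approach each other.
    print(v_rho, pair_equation_rhs(binding, v_rho, 1.0), binding, weak_coupling)
```

The binding energy is positive for any positive coupling, however small, but it shrinks as $\exp[ - 2 / ( V \rho(\mu) ) ]$, a function with no power series expansion about $V = 0$.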
This eigenvalue is less than the energy of the two independent electrons; thus, the electrons are bound together. It is concluded that, due to the sharp cut off of the integral at the Fermi-energy, the electrons bind to form Cooper pairs no matter how small the attractive interaction is. The binding energy is small and is a non-analytic function of the pairing potential $V$; that is, the binding energy cannot be expanded as a power series in $V$. In the case that the pairing potential is spin rotationally invariant, the total spin of the pair $S$ is a good quantum number. The pairing states can be categorized by the value of their spin quantum number and the projection of the total spin along the $z$ axis. On pairing two spin one half electrons, there are four possible states: a spin singlet state $S = 0$ and a spin triplet state $S = 1$ which is three-fold degenerate. The four Cooper pair wave functions corresponding to these states have to obey the Pauli exclusion principle, so that a spin state which is anti-symmetric under interchange of the two electrons is accompanied by a symmetric orbital part, and vice versa. They are written as

$$ \psi_{S=0}(r_1,r_2) = \sum_k C_{S=0}(k) \, \frac{1}{2} \left[ \phi_k(r_1) \, \phi_{-k}(r_2) + \phi_{-k}(r_1) \, \phi_k(r_2) \right] \left[ \chi^+_1 \chi^-_2 - \chi^-_1 \chi^+_2 \right] \tag{2235} $$

for the spin singlet pairing. The three spin triplet pair wave functions are

$$ \begin{aligned} \psi_{S=1,m=1}(r_1,r_2) &= \sum_k C_{S=1}(k) \, \phi_k(r_1) \, \phi_{-k}(r_2) \, \chi^+_1 \chi^+_2 \\ \psi_{S=1,m=0}(r_1,r_2) &= \sum_k C_{S=1}(k) \, \frac{1}{2} \left[ \phi_k(r_1) \, \phi_{-k}(r_2) - \phi_{-k}(r_1) \, \phi_k(r_2) \right] \left[ \chi^+_1 \chi^-_2 + \chi^-_1 \chi^+_2 \right] \\ \psi_{S=1,m=-1}(r_1,r_2) &= \sum_k C_{S=1}(k) \, \phi_k(r_1) \, \phi_{-k}(r_2) \, \chi^-_1 \chi^-_2 \end{aligned} \tag{2236} $$

Thus, for singlet pairing one must have

$$ C_{S=0}(k) = C_{S=0}(-k) \tag{2237} $$

which requires that, when expanded in spherical harmonics, the expansion only contains even components of orbital angular momentum. For triplet pairing one has

$$ C_{S=1}(k) = - C_{S=1}(-k) \tag{2238} $$
thus, the triplet pair can only be composed of odd values of orbital angular momentum. Most superconductors that have been found have singlet spin pairing and are predominantly in a state of orbital angular momentum $l = 0$. The high $T_c$ superconductors, such as Sr doped La2CuO4 found by Bednorz and Müller in 1986 ($T_c = 35$ K) or YBa2Cu3O7 ($T_c = 90$ K), are slightly exceptional to this rule. These materials evolve from an anti-ferromagnetic insulator phase at zero doping, but as the doping increases they lose the anti-ferromagnetism and become metallic paramagnets. A superconducting phase appears above a small critical doping concentration. The superconductivity is exceptional, not just in the magnitude of the transition temperature $T_c$ but also in that the pairing is singlet, but with an appreciable admixture of a component with $l = 2$ in the pair. Due to this admixture the pairing in high $T_c$ superconductors is sometimes referred to as d wave pairing. In heavy fermion superconductors, such as CeCu2Si2, UBe13, UPt3 and URu2Si2, experimental evidence exists that these materials do not show the exponentially activated behavior characteristic of a gap. Instead the specific heat and susceptibility show power law variations. This and the multiple superconducting transitions found in UPt3 and Th doped UBe13 indicate that the gap is dominated by components with non-zero angular momentum. It is customary to represent the wave function of the Cooper pair in terms of relative coordinates $r = r_1 - r_2$ and center of mass coordinates $R = \frac{ r_1 + r_2 }{2}$. Thus, the Cooper pair wave function is written as

$$ \psi(r_1,r_2) \to \psi(r,R) \tag{2239} $$
and as the pair usually is in a state with zero total momentum, $q = 0$, the center of mass dependence can be ignored. The mean square radius of the Cooper pair wave function is given by

$$ \xi^2 = \int d^3r \, r^2 \, | \psi(r) |^2 \tag{2240} $$

but

$$ \psi(r) = \sum_k C(k) \exp\left[ i \, k \cdot r \right] \tag{2241} $$

Thus,

$$ \begin{aligned} \xi^2 &= \int d^3r \, r^2 \sum_{k,k'} C(k) \, C^*(k') \exp\left[ i ( k - k' ) \cdot r \right] \\ &= \sum_k | \nabla_k C(k) |^2 \\ &= \frac{4}{3} \left( \frac{ \hbar v_F }{ 2 \mu - E } \right)^2 \\ &\approx \frac{4}{3} \left( \frac{ v_F }{ 2 \omega_D } \right)^2 \exp\left[ \frac{4}{ V \rho(\mu) } \right] \end{aligned} \tag{2242} $$

where the last line holds in the weak coupling limit. For a binding energy of order 10 K and a Fermi-velocity $v_F$ of the order of $10^6$ m/sec one obtains a pair size $\xi$ of order $10^4$ Angstroms. The standard weak coupling theory of superconductivity due to Bardeen, Cooper and Schrieffer (B.C.S.) treats the Cooper pairing of all electrons close to the Fermi-surface in a self-consistent manner.
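The order of magnitude quoted for the pair size can be reproduced with a few lines of arithmetic. The sketch assumes the relation $\xi^2 = \frac{4}{3} ( \hbar v_F / ( 2 \mu - E ) )^2$ and converts the binding energy from kelvin to joules with Boltzmann's constant. The function name `pair_size` is a hypothetical helper introduced for this example.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J / K

def pair_size(binding_energy_joule, fermi_velocity):
    """Pair radius xi = sqrt(4/3) * hbar * v_F / (2 mu - E), in metres."""
    return math.sqrt(4.0 / 3.0) * HBAR * fermi_velocity / binding_energy_joule

# Binding energy of order 10 K and Fermi velocity of order 10^6 m/s, as in the text.
xi = pair_size(K_B * 10.0, 1.0e6)
print(xi * 1.0e10)   # pair size in Angstroms, of order 10^4
```

The pair is therefore thousands of lattice spacings across, so a very large number of other pairs have their centers of mass inside the volume occupied by any one pair.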
25.3 Pairing Theory

25.3.1 The Pairing Interaction
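The attractive pairing interaction can be obtained from the electron phonon interaction, via an appropriately chosen canonical transform. The energy of the electron phonon system can be expressed as the sum

$$ \hat{H} = \hat{H}_0 + \hat{H}_{int} \tag{2243} $$

where the non-interacting Hamiltonian is given by

$$ \hat{H}_0 = \sum_{k,\sigma} \varepsilon(k) \, c^\dagger_{k,\sigma} c_{k,\sigma} + \sum_{q,\alpha} \hbar \omega_\alpha(q) \, a^\dagger_{q,\alpha} a_{q,\alpha} \tag{2244} $$

and the interaction term is given by

$$ \hat{H}_{int} = \sum_{k,\sigma} \sum_{q,\alpha} \lambda_q \, c^\dagger_{k+q,\sigma} c_{k,\sigma} \left( a_{q,\alpha} + a^\dagger_{-q,\alpha} \right) \tag{2245} $$

The Hamiltonian will be transformed via

$$ \hat{H}' = \exp\left[ + \hat{S} \right] \hat{H} \exp\left[ - \hat{S} \right] \tag{2246} $$

where $\hat{S}$ is chosen in a way that will eliminate the interaction term (at least to first order). The operator $\hat{S}$ can be thought of as being of the same order as $\hat{H}_{int}$. That is, on expanding the transformed Hamiltonian in powers of $\hat{S}$

$$ \hat{H}' = \hat{H}_0 + \hat{H}_{int} + \left[ \hat{S} \, , \, \hat{H}_0 \right] + \frac{1}{2} \left[ \hat{S} \, , \, \left[ \hat{S} \, , \, \hat{H}_0 \right] \right] + \left[ \hat{S} \, , \, \hat{H}_{int} \right] + O\left( \hat{H}_{int}^3 \right) \tag{2247} $$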
and then $\hat{S}$ is chosen such that

$$ \hat{H}_{int} = \left[ \hat{H}_0 \, , \, \hat{S} \right] \tag{2248} $$

Since this is an operator equation, it is solved for $\hat{S}$ by taking matrix elements between a complete set of states. A convenient complete set is provided by the eigenstates $| \phi_m \rangle$ of $\hat{H}_0$, which have energy eigenvalues $E_m$. Thus, the matrix elements of $\hat{S}$ are found from the algebraic equation

$$ \langle \phi_m | \hat{H}_{int} | \phi_n \rangle = ( E_m - E_n ) \, \langle \phi_m | \hat{S} | \phi_n \rangle \tag{2249} $$
The complete set of energy eigenstates are energy eigenstates of the non interacting electron and phonon Hamiltonian. The non-zero matrix elements only occur between states which involve a difference of unity in the occupation number of one phonon mode ( either q or − q ) , and also a change of state of one electron ( k to k + q ). The anti-Hermitean operator Sˆ can be represented in second quantized form as Sˆ = −
XX
c†k+q,σ
k,σ q,α
ck,σ
λq aq,α ε(k + q) − ε(k) − ¯h ω(q)
+
λq a†−q,α ε(k + q) − ε(k) + h ¯ ω(q)
(2250) The transformed Hamiltonian contains the effects of the interaction only through the higher order terms 3 ˆ0 = H ˆ 0 + 1 Sˆ , H ˆ int + O H ˆ int H 2 (2251) On evaluating the commutation relation one finds a renormalization of the electron dispersion relation of order | λq |2 , and electron-electron interaction terms. The electron-electron interaction terms combine and can be written as 0 ˆ int H =
X k,σ;k0 ,σ 0
X
| λq |2 ¯h ω(q)
q,α
( ε(k + q) − ε(k) )2 − h ¯ 2 ω(q)2
c†k+q,σ ck,σ c†k0 −q,σ0 ck0 ,σ0
(2252) Thus, for electrons within ¯h ω(q) of the Fermi-energy there is an attractive interaction between the electrons. As this interaction depends on the energy transfer between the electrons, and the energy transfer corresponds to a frequency. As the interaction is frequency dependent, it corresponds to a retarded interaction. As the interaction is only attractive at sufficiently small frequencies the interaction is only attractive after long time delays. In the B.C.S. theory this interaction is further approximated. The approximation consists of only retaining scattering between electrons of opposite momentum, as this maximizes the phase space of allowed final states. That is the momenta and spin are restricted such that k = − k0 (2253) 490
and also

$$ \sigma = - \sigma' \tag{2254} $$
This produces a pairing between electrons of opposite spins. The B.C.S. Hamiltonian is composed of the energy of electrons in Bloch states and an attractive interaction between the electrons mediated by the phonons. The B.C.S. Hamiltonian is written as

$$ \hat{H} = \sum_{k,\sigma} \epsilon(k) \, c^\dagger_{k,\sigma} c_{k,\sigma} - \sum_{k,k'} V(k,k') \, c^\dagger_{-k',\downarrow} c^\dagger_{k',\uparrow} c_{k,\uparrow} c_{-k,\downarrow} \tag{2255} $$
25.3.2 The B.C.S. Variational State
The pairing theory of superconductivity considers the ground state to be a state within the grand canonical ensemble. That is, the ground state is composed of a linear superposition of states with different numbers of particles. If required, a ground state in the canonical ensemble can be found by projecting the B.C.S. ground state onto one with a fixed number of particles. The B.C.S. state is chosen variationally, by minimizing the energy. The B.C.S. ground state is found by anti-symmetrizing the many-particle state, which is composed of a product over the wave vectors $k$. For each wave vector $k$ the Cooper pair $((k, \uparrow), (-k, \downarrow))$ is occupied with probability amplitude $u(k)$ and unoccupied with probability amplitude $v(k)$. The probability amplitudes are often referred to as coherence factors.

$$ | \Psi_{BCS} \rangle = \prod_k \left( v(k) + u(k) \, c^\dagger_{k,\uparrow} c^\dagger_{-k,\downarrow} \right) | 0 \rangle \tag{2256} $$
The amplitudes satisfy the constraint

$$ | u(k) |^2 + | v(k) |^2 = 1 \tag{2257} $$
The normal state for non-interacting electrons just corresponds to the special case,

$$ | u(k) |^2 = \Theta( \mu - \epsilon(k) ) \tag{2258} $$

The functions $u(k)$ and $v(k)$ are variational parameters that are found by minimizing the expectation value of the Hamiltonian, which includes the pairing interaction. The expectation value for the appropriate energy, in the B.C.S. state, is given by

$$ \langle \Psi_{BCS} | ( \hat{H} - \mu N ) | \Psi_{BCS} \rangle = \sum_k 2 \, ( \epsilon(k) - \mu ) \, | u(k) |^2 - \sum_{k,k'} V(k,k') \, v(k)^* u(k) \, u(k')^* v(k') \tag{2259} $$

The term involving the double sum is eliminated by introducing a quantity

$$ \Delta(k) = \sum_{k'} V(k,k') \, u(k')^* v(k') \tag{2260} $$
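On minimizing the energy, subject to the constraint of conservation of probability, with respect to $u(k)$ and $v(k)^*$ one finds

$$ \begin{aligned} 0 &= \left[ 2 \, ( \epsilon(k) - \mu ) + \lambda \right] u^*(k) - \Delta(k) \, v^*(k) \\ 0 &= \lambda \, v(k) - \Delta(k) \, u(k) \end{aligned} \tag{2261} $$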
where $\lambda$ is the Lagrange undetermined parameter. These equations can be solved to yield

$$ | u(k) |^2 = \frac{1}{2} \left( 1 - \frac{\epsilon(k) - \mu}{E_{qp}(k)} \right) \tag{2262} $$

and

$$ | v(k) |^2 = \frac{1}{2} \left( 1 + \frac{\epsilon(k) - \mu}{E_{qp}(k)} \right) \tag{2263} $$

where we have defined

$$ u(k) \, v^*(k) = \frac{\Delta(k)}{2 \, E_{qp}(k)} \tag{2264} $$

The first two equations can be multiplied and equated to the modulus squared of the third equation according to the identity

$$ | u(k) |^2 \, | v(k) |^2 = \left( u(k) \, v^*(k) \right) \left( u(k) \, v^*(k) \right)^* \tag{2265} $$

The resulting expression can be solved for $E_{qp}(k)$ to yield

$$ E_{qp}(k) = + \sqrt{ ( \epsilon(k) - \mu )^2 + | \Delta(k) |^2 } \tag{2266} $$
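As a quick numerical illustration of equations (2262) to (2266), the sketch below evaluates the coherence factors for a few values of $\epsilon(k) - \mu$ and checks the normalization (2257) and the product identity (2265). It assumes a real, momentum-independent gap; the function name `coherence_factors` and the numerical values are illustrative choices, not part of the notes.

```python
import math

def coherence_factors(xi, delta):
    """Return (|u|^2, |v|^2, E_qp) for xi = eps(k) - mu and a real, non-zero gap delta."""
    e_qp = math.sqrt(xi**2 + delta**2)      # quasi-particle energy, eq. (2266)
    u2 = 0.5 * (1.0 - xi / e_qp)            # eq. (2262)
    v2 = 0.5 * (1.0 + xi / e_qp)            # eq. (2263)
    return u2, v2, e_qp

delta = 1.0  # gap in arbitrary energy units (illustrative value)
for xi in (-5.0, -1.0, 0.0, 1.0, 5.0):
    u2, v2, e_qp = coherence_factors(xi, delta)
    # Normalization (2257), and identity (2265) combined with (2264).
    assert abs(u2 + v2 - 1.0) < 1e-12
    assert abs(u2 * v2 - (delta / (2.0 * e_qp))**2) < 1e-12
    print(f"xi={xi:+.1f}  |u|^2={u2:.4f}  |v|^2={v2:.4f}  E_qp={e_qp:.4f}")
```

The printed values show $|u(k)|^2$ falling smoothly from nearly one to nearly zero over an energy range of order the gap, rather than jumping at the Fermi-energy.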
The factor $| u(k) |^2$ is the probability of finding an electron of momentum $k$ in the B.C.S. ground state and, therefore, is just $n(k)$. Unlike a Fermi-liquid, where $n(k)$ is discontinuous at the Fermi-surface with magnitude $1/Z_k$, in the superconductor the distribution drops smoothly to zero as $k$ increases above $k_F$. Thus, the concept of a Fermi-surface is not well defined in the superconducting state. The energy $E_{qp}(k)$, relative to $\mu$, turns out to be the energy required to create a quasi-particle of momentum $k$ from the ground state. The quasi-particle is either of the form of an added electron or a hole. With the B.C.S. ground state both of these leave a single unpaired electron in an otherwise perfectly paired B.C.S. state. The minimum energy required to create two quasi-particles, that is two individual electrons, is just $2 \, | \Delta(k_F) |$.
25.3.3 The Gap Equation
The "energy gap" parameter satisfies the non-linear integral equation

$$ \Delta(k) = \sum_{k'} V(k,k') \, \frac{\Delta^*(k')}{2 \, E_{qp}(k')} \tag{2267} $$
where $V(k,k')$ is the attractive pairing interaction mediated by the phonons. The interaction can be approximated by the attractive s-wave potential

$$ V(k,k') = \begin{cases} V & \text{for } | \epsilon(k) - \mu | < \hbar\omega_D \\ 0 & \text{for } | \epsilon(k) - \mu | > \hbar\omega_D \end{cases} \tag{2268} $$
In this case one finds

$$ \Delta(k) = \begin{cases} \Delta(0) & \text{for } | \epsilon(k) - \mu | < \hbar\omega_D \\ 0 & \text{for } | \epsilon(k) - \mu | > \hbar\omega_D \end{cases} \tag{2269} $$
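where the gap in the quasi-particle dispersion relation at the Fermi-energy is given by the solution of

$$ \begin{aligned} 1 &= V \rho(\mu) \int_0^{\hbar\omega_D} d\epsilon \, \frac{1}{\sqrt{ \epsilon^2 + | \Delta(0) |^2 }} \\ &= V \rho(\mu) \, \sinh^{-1} \left( \frac{\hbar\omega_D}{| \Delta(0) |} \right) \end{aligned} \tag{2270} $$

which is solved as

$$ | \Delta(0) | = \frac{\hbar\omega_D}{\sinh \left( \frac{1}{V \rho(\mu)} \right)} \tag{2271} $$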
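The closed form (2271) can be checked against the integral in (2270) numerically. The following is a minimal sketch under stated assumptions: energies are measured in units of $\hbar\omega_D$, the dimensionless coupling $V \rho(\mu) = 0.3$ is an arbitrary weak-coupling value, and the function names are hypothetical.

```python
import math

def gap_zero_temperature(coupling, debye_energy=1.0):
    """Closed form (2271): |Delta(0)| = hbar*omega_D / sinh(1 / (V rho(mu)))."""
    return debye_energy / math.sinh(1.0 / coupling)

def gap_integral(delta, debye_energy=1.0, steps=20000):
    """Midpoint-rule estimate of the integral on the first line of (2270)."""
    h = debye_energy / steps
    return sum(h / math.sqrt(((i + 0.5) * h)**2 + delta**2) for i in range(steps))

coupling = 0.3                                  # V * rho(mu), illustrative value
delta0 = gap_zero_temperature(coupling)
weak_coupling = 2.0 * math.exp(-1.0 / coupling)  # limit of (2271) for small V rho(mu)

print("Delta(0) from (2271):      ", delta0)
print("2 exp(-1/(V rho)) estimate:", weak_coupling)
print("V rho * integral in (2270):", coupling * gap_integral(delta0))
```

For small $V \rho(\mu)$ the hyperbolic sine is dominated by its growing exponential, so the gap approaches $2 \hbar\omega_D \exp( - 1 / V \rho(\mu) )$, which is why the two printed gap values nearly coincide.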
The gap $2 \, | \Delta(0) |$ corresponds to the energy required to break a Cooper pair. At finite temperatures, the superconducting gap satisfies the equation

$$ \begin{aligned} \Delta(k) &= \sum_{k'} V(k,k') \, \frac{\Delta(k')}{2 \, E_{qp}(k')} \left( 1 - 2 f(E_{qp}(k')) \right) \\ &= \sum_{k'} V(k,k') \, \frac{\Delta(k')}{2 \, E_{qp}(k')} \tanh \frac{\beta E_{qp}(k')}{2} \end{aligned} \tag{2272} $$

The tanh factor is a decreasing function for increasing temperature, therefore, for the equation to have a non-trivial solution the denominator has to decrease
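with increasing temperature. This can only happen if $| \Delta(T) |$ decreases with increasing temperature. For sufficiently high temperatures, where $\tanh ( \beta E / 2 ) \approx \beta E / 2$, the equation reduces to

$$ \Delta(T) = \Delta(T) \, V \rho(\mu) \, \frac{\beta \hbar\omega_D}{2} \tag{2273} $$

which only has the trivial solution $\Delta(T) = 0$. The critical temperature where the gap first vanishes, $\Delta(T_c) = 0$, is given by

$$ \begin{aligned} 1 &= \rho(\mu) V \int_0^{\hbar\omega_D} d\epsilon \, \frac{\tanh \frac{\beta_c \epsilon}{2}}{\epsilon} \\ &= \rho(\mu) V \int_0^{\frac{\beta_c \hbar\omega_D}{2}} dz \, \frac{\tanh z}{z} \\ &= \rho(\mu) V \left[ \ln \frac{\beta_c \hbar\omega_D}{2} - \int_0^{\infty} dz \, \ln z \; \mathrm{sech}^2 z \right] \\ &= \rho(\mu) V \left[ \ln \frac{\beta_c \hbar\omega_D}{2} - \ln \frac{\pi}{4 \exp \gamma} \right] \end{aligned} \tag{2274} $$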
The superconducting gap decreases with increasing temperature and vanishes at a critical temperature $T_c$ given by

$$ k_B T_c = 1.14 \, \hbar\omega_D \exp \left( - \frac{1}{V \rho(\mu)} \right) \tag{2275} $$

Since $\omega_D \propto M^{-\frac{1}{2}}$, the critical temperature is proportional to $M^{-\frac{1}{2}}$, as expected from the isotope effect. Above the critical temperature $\Delta(T) = 0$ and the B.C.S. state reduces to the normal state. Just below the critical temperature one has

$$ \Delta(T)^2 = \frac{8 \pi^2}{7 \zeta(3)} \, k_B^2 \, T_c \, ( T_c - T ) \qquad T \to T_c \tag{2276} $$

Thus, the order parameter has a typical mean field variation with an exponent of $\beta = \frac{1}{2}$ close to $T_c$.
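The critical temperature formula (2275) can be compared with a direct numerical solution of the first line of (2274). The sketch below assumes a constant density of states, measures energies in units of $\hbar\omega_D$, and uses an arbitrary coupling $V \rho(\mu) = 0.3$; the function names and the bisection bracket are illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329

def tc_integral(kt, debye_energy=1.0, steps=4000):
    """Midpoint estimate of int_0^{hbar omega_D} d(eps) tanh(eps / (2 kT)) / eps."""
    h = debye_energy / steps
    total = 0.0
    for i in range(steps):
        eps = (i + 0.5) * h
        total += math.tanh(eps / (2.0 * kt)) / eps * h
    return total

def critical_temperature(coupling, debye_energy=1.0):
    """Solve coupling * tc_integral(kT) = 1 for kT by bisection."""
    lo, hi = 1e-4 * debye_energy, debye_energy
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        # The integral decreases as kT grows, so a value above 1 means kT is too small.
        if coupling * tc_integral(mid, debye_energy) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

coupling = 0.3                                    # V * rho(mu), illustrative value
kt_c = critical_temperature(coupling)
prefactor = 2.0 * math.exp(EULER_GAMMA) / math.pi  # the 1.14 (more precisely 1.13) in (2275)
kt_c_formula = prefactor * math.exp(-1.0 / coupling)
delta0 = 1.0 / math.sinh(1.0 / coupling)          # eq. (2271)

print("k_B T_c (numerical):  ", kt_c)
print("k_B T_c from (2275):  ", kt_c_formula)
print("2 Delta(0) / k_B T_c: ", 2.0 * delta0 / kt_c)
```

In the weak-coupling limit the last ratio approaches the universal B.C.S. value of about 3.53, independent of $\hbar\omega_D$ and $V \rho(\mu)$.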
25.3.4 The Ground State Energy
The normal state is unstable to the B.C.S. state only if it has a higher energy. At $T = 0$ the stability can be seen by examining the energy

$$ \langle \Psi_{BCS} | ( \hat{H} - \mu N ) | \Psi_{BCS} \rangle = \sum_k \left[ ( \epsilon(k) - \mu ) - \frac{( \epsilon(k) - \mu )^2}{\sqrt{ ( \epsilon(k) - \mu )^2 + | \Delta(k) |^2 }} \right] - \frac{| \Delta(0) |^2}{V} \tag{2277} $$
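The condensation energy is defined as the difference between the energy of the superconducting state and the normal state

$$ \Delta E = \langle \Psi_{BCS} | ( \hat{H} - \mu N ) | \Psi_{BCS} \rangle - 2 \int_{-\infty}^{0} d\epsilon \; \epsilon \, \rho(\mu + \epsilon) \tag{2278} $$

The condensation energy is evaluated by writing the sum over $k$ as an integral over the density of states.

$$ \begin{aligned} \Delta E &= \int_0^{\hbar\omega_D} d\epsilon \, \rho(\mu + \epsilon) \left[ \epsilon - \frac{\epsilon^2}{\sqrt{ \epsilon^2 + | \Delta(0) |^2 }} \right] \\ &\quad + \int_{-\hbar\omega_D}^{0} d\epsilon \, \rho(\mu + \epsilon) \left[ - \epsilon - \frac{\epsilon^2}{\sqrt{ \epsilon^2 + | \Delta(0) |^2 }} \right] - \frac{| \Delta(0) |^2}{V} \end{aligned} \tag{2279} $$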
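Then the integral over states below the Fermi-energy, $\epsilon < 0$, is transformed to an integral over positive $\epsilon$. This leads to

$$ \begin{aligned} \Delta E &= 2 \int_0^{\hbar\omega_D} d\epsilon \, \rho(\mu + \epsilon) \left[ \epsilon - \frac{\epsilon^2}{\sqrt{ \epsilon^2 + | \Delta(0) |^2 }} \right] - \frac{| \Delta(0) |^2}{V} \\ &= 2 \rho(\mu) \int_0^{\hbar\omega_D} d\epsilon \left[ \epsilon - \sqrt{ \epsilon^2 + | \Delta(0) |^2 } \right] + 2 \rho(\mu) \int_0^{\hbar\omega_D} d\epsilon \, \frac{| \Delta(0) |^2}{\sqrt{ \epsilon^2 + | \Delta(0) |^2 }} - \frac{| \Delta(0) |^2}{V} \end{aligned} \tag{2280} $$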
The integrals are evaluated as

$$ \begin{aligned} \Delta E &= - \rho(\mu) \frac{| \Delta(0) |^2}{2} + \frac{| \Delta(0) |^2}{V} - \frac{| \Delta(0) |^2}{V} \\ &= - \rho(\mu) \frac{| \Delta(0) |^2}{2} \end{aligned} \tag{2281} $$

which shows that the condensation energy comes from the attractive potential that lowers the energy of the pair more than the increase in the kinetic energy
caused by the confinement within the coherence length ξ. The net lowering can be understood in terms of the quasi-particle dispersion relation. The electrons with energy within | ∆(0) | of µ have their energy lowered by an amount | ∆(0) |. The net lowering of energy is just the number of electrons ρ(µ) | ∆(0) | times | ∆(0) |. Therefore, the B.C.S. state has lower energy than the normal state when the gap is non-zero.
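The result (2281) can be checked by integrating the first line of (2280) numerically. This is a sketch under the same assumptions as the derivation (constant density of states, gap given by (2271)), with energies in units of $\hbar\omega_D$; the coupling value and the function name `condensation_energy` are illustrative.

```python
import math

def condensation_energy(delta, coupling, debye_energy=1.0, rho=1.0, steps=20000):
    """Midpoint estimate of Delta E from the first line of (2280), with constant rho."""
    h = debye_energy / steps
    integral = 0.0
    for i in range(steps):
        eps = (i + 0.5) * h
        integral += (eps - eps**2 / math.sqrt(eps**2 + delta**2)) * h
    v = coupling / rho                      # V recovered from the product V * rho(mu)
    return 2.0 * rho * integral - delta**2 / v

coupling = 0.3                              # V * rho(mu), illustrative value
delta0 = 1.0 / math.sinh(1.0 / coupling)    # eq. (2271) with hbar*omega_D = 1
d_e = condensation_energy(delta0, coupling)

print("Delta E (numerical):     ", d_e)
print("-rho Delta(0)^2 / 2:     ", -0.5 * delta0**2)
```

The two numbers agree up to a relative correction of order $( \Delta(0) / \hbar\omega_D )^2$, which is neglected in arriving at (2281).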
25.4 Quasi-Particles
The B.C.S. Hamiltonian can be solved for the quasi-particle excitations, in the mean field approximation, by linearizing the pairing interaction terms. In a normal metal, the only allowed matrix elements are between initial and final states which have the same number of electrons. However, since for a superconductor the average is to be evaluated in the B.C.S. ground state, matrix elements between states with different numbers of pairs are non-zero. These give rise to the anomalous expectation values. For example, the anomalous expectation value associated with adding a pair of electrons $((k', \uparrow), (-k', \downarrow))$ to the superconducting condensate is given by the probability amplitude

$$ \langle \Psi_{BCS} | \, c^\dagger_{k',\uparrow} c^\dagger_{-k',\downarrow} \, | \Psi_{BCS} \rangle = u(k')^* v(k') \tag{2282} $$
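The linearized mean field Hamiltonian is given by

$$ \begin{aligned} \hat{H}_{MF} - \mu N &= \sum_k \left[ ( \epsilon(k) - \mu ) \, c^\dagger_{k,\uparrow} c_{k,\uparrow} + ( \epsilon(-k) - \mu ) \, c^\dagger_{-k,\downarrow} c_{-k,\downarrow} \right] \\ &\quad - \sum_{k,k'} V(k,k') \, \langle \Psi_{BCS} | c^\dagger_{-k',\downarrow} c^\dagger_{k',\uparrow} | \Psi_{BCS} \rangle \; c_{k,\uparrow} c_{-k,\downarrow} \\ &\quad - \sum_{k,k'} V(k,k') \, c^\dagger_{-k',\downarrow} c^\dagger_{k',\uparrow} \; \langle \Psi_{BCS} | c_{k,\uparrow} c_{-k,\downarrow} | \Psi_{BCS} \rangle \\ &\quad + \sum_{k,k'} V(k,k') \, \langle \Psi_{BCS} | c^\dagger_{-k',\downarrow} c^\dagger_{k',\uparrow} | \Psi_{BCS} \rangle \, \langle \Psi_{BCS} | c_{k,\uparrow} c_{-k,\downarrow} | \Psi_{BCS} \rangle \end{aligned} \tag{2283} $$

The anomalous expectation value leads to a term in the Hamiltonian with strength

$$ \Delta(k) = \sum_{k'} V(k,k') \, u(k')^* v(k') \tag{2284} $$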
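which corresponds to a process in which two electrons $((k, \uparrow), (-k, \downarrow))$ are absorbed into the condensate. The mean field Hamiltonian also contains the Hermitean conjugate, which represents the reverse process in which two electrons are emitted from the condensate.

$$ \begin{aligned} \hat{H}_{MF} - \mu N &= \sum_k \left[ ( \epsilon(k) - \mu ) \, c^\dagger_{k,\uparrow} c_{k,\uparrow} + ( \epsilon(-k) - \mu ) \, c^\dagger_{-k,\downarrow} c_{-k,\downarrow} \right] \\ &\quad - \sum_k \left[ \Delta(k) \, c_{k,\uparrow} c_{-k,\downarrow} + c^\dagger_{-k,\downarrow} c^\dagger_{k,\uparrow} \, \Delta(k)^* \right] + \frac{| \Delta(0) |^2}{V} \end{aligned} \tag{2285} $$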
In the absence of an electromagnetic field, the order parameter $\Delta(k)$ can be chosen to be real. The mean field Hamiltonian involves terms in which the condensate emits or absorbs two electrons. This is reminiscent of the treatment of anti-ferromagnetic spin waves, using the method of Holstein and Primakoff, except here the Hamiltonian involves fermions rather than bosons. The quadratic Hamiltonian can be diagonalized by means of a canonical transformation. We shall define two new fermion operators via the transformation

$$ \alpha_k = \exp \left( + \hat{S} \right) c_{k,\uparrow} \exp \left( - \hat{S} \right) \tag{2286} $$

and

$$ \beta^\dagger_k = \exp \left( + \hat{S} \right) c^\dagger_{-k,\downarrow} \exp \left( - \hat{S} \right) \tag{2287} $$
where $\hat{S}$ is an anti-Hermitean operator, $\hat{S}^\dagger = - \hat{S}$. The energy eigenvalues of the Hamiltonian can be found directly from the transformed Hamiltonian

$$ \hat{H}'_{MF} = \exp \left( + \hat{S} \right) \hat{H}_{MF} \exp \left( - \hat{S} \right) \tag{2288} $$

as they have the same eigenvalues and the eigenstates are related via

$$ | \phi'_n \rangle = \exp \left( + \hat{S} \right) | \phi_n \rangle \tag{2289} $$
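The operator $\hat{S}$ is chosen to be of the form

$$ \hat{S} = \sum_k \theta_k \left( c^\dagger_{k,\uparrow} c^\dagger_{-k,\downarrow} - c_{-k,\downarrow} c_{k,\uparrow} \right) \tag{2290} $$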
Explicitly, the transformation yields

$$ \begin{aligned} \alpha_k &= c_{k,\uparrow} \cos\theta_k - c^\dagger_{-k,\downarrow} \sin\theta_k \\ \beta^\dagger_k &= c^\dagger_{-k,\downarrow} \cos\theta_k + c_{k,\uparrow} \sin\theta_k \end{aligned} \tag{2291} $$

Rather than working with the transformed Hamiltonian, we shall express the original Hamiltonian in terms of the transformed operators. Hence, we shall require the inverse transformation which expresses the original electron and hole operators in terms of the new quasi-particles. The inverse transformation
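is expressed in terms of the transformation matrix but with $\theta_k \to - \theta_k$, so one has

$$ \begin{aligned} c_{k,\uparrow} &= \alpha_k \cos\theta_k + \beta^\dagger_k \sin\theta_k \\ c^\dagger_{-k,\downarrow} &= \beta^\dagger_k \cos\theta_k - \alpha_k \sin\theta_k \end{aligned} \tag{2292} $$

The mean field Hamiltonian is expressed in terms of the new operators, and $\theta_k$ is chosen so that the terms that are not represented in terms of the quasi-particle number operators vanish. The normal terms in the Hamiltonian are found as

$$ \begin{aligned} & \sum_k \left[ ( \epsilon(k) - \mu ) \, c^\dagger_{k,\uparrow} c_{k,\uparrow} + ( \epsilon(-k) - \mu ) \, c^\dagger_{-k,\downarrow} c_{-k,\downarrow} \right] \\ &= \sum_k ( \epsilon(k) - \mu ) \left[ \sin^2\theta_k \left( \alpha_k \alpha^\dagger_k + \beta_k \beta^\dagger_k \right) + \cos^2\theta_k \left( \alpha^\dagger_k \alpha_k + \beta^\dagger_k \beta_k \right) \right] \\ &\quad + \sum_k ( \epsilon(k) - \mu ) \sin 2\theta_k \left( \alpha^\dagger_k \beta^\dagger_k + \beta_k \alpha_k \right) \end{aligned} \tag{2293} $$

The anomalous terms are evaluated as

$$ \begin{aligned} & - \sum_k \left[ \Delta(k) \, c_{k,\uparrow} c_{-k,\downarrow} + c^\dagger_{-k,\downarrow} c^\dagger_{k,\uparrow} \, \Delta(k)^* \right] \\ &= - \sum_k \mathrm{Re} \, \Delta(k) \left( \beta^\dagger_k \beta_k - \alpha_k \alpha^\dagger_k \right) \sin 2\theta_k \\ &\quad + \sum_k \mathrm{Re} \, \Delta(k) \left( \alpha^\dagger_k \beta^\dagger_k + \beta_k \alpha_k \right) \cos 2\theta_k \end{aligned} \tag{2294} $$

The off-diagonal terms can be made to vanish by choosing

$$ \tan 2\theta_k = - \frac{\mathrm{Re} \, \Delta(k)}{\epsilon(k) - \mu} \tag{2295} $$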
Thus, $\theta_k$ decreases from a value less than $\frac{\pi}{4}$ to less than $- \frac{\pi}{4}$ as $k$ varies from $\hbar\omega_D$ below $\mu$ to $\hbar\omega_D$ above $\mu$. After this value has been chosen, the Hamiltonian is expressed as the sum of a constant and terms involving the number operators of the $\alpha$ and $\beta$ quasi-particles

$$ \hat{H}_{MF} = E_0 + \sum_k E_{qp}(k) \left( \alpha^\dagger_k \alpha_k + \beta^\dagger_k \beta_k \right) \tag{2296} $$
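The effect of the choice (2295) can be illustrated numerically for a single wave vector. Adding (2293) and (2294) for a real gap, and using the anti-commutation relations to write $\alpha_k \alpha^\dagger_k = 1 - \alpha^\dagger_k \alpha_k$, the number operators acquire the coefficient $( \epsilon(k) - \mu ) \cos 2\theta_k - \Delta(k) \sin 2\theta_k$ while the pair terms acquire $( \epsilon(k) - \mu ) \sin 2\theta_k + \Delta(k) \cos 2\theta_k$. The sketch below evaluates both. Note the assumptions: the branch of $2\theta_k$ is fixed with `atan2` so that the excitation energy comes out positive, which is one possible convention and not necessarily the one used in the notes, and the function name is hypothetical.

```python
import math

def rotated_coefficients(xi, delta):
    """Return (diagonal, off_diagonal) coefficients for xi = eps(k) - mu and real delta.

    diagonal     multiplies the quasi-particle number operators,
    off_diagonal multiplies the pair creation and annihilation terms.
    """
    # One branch of tan(2 theta) = -delta / xi, eq. (2295); an assumed convention.
    two_theta = math.atan2(-delta, xi)
    diagonal = xi * math.cos(two_theta) - delta * math.sin(two_theta)
    off_diagonal = xi * math.sin(two_theta) + delta * math.cos(two_theta)
    return diagonal, off_diagonal

delta = 1.0  # illustrative gap, arbitrary energy units
for xi in (-2.0, -0.5, 0.5, 2.0):
    diagonal, off_diagonal = rotated_coefficients(xi, delta)
    e_qp = math.sqrt(xi**2 + delta**2)
    print(f"xi={xi:+.1f}  diagonal={diagonal:.6f}  E_qp={e_qp:.6f}  off-diagonal={off_diagonal:.1e}")
```

For every value of $\epsilon(k) - \mu$ the pair terms vanish and the coefficient of the number operators equals $\sqrt{ ( \epsilon(k) - \mu )^2 + \Delta(k)^2 }$.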
This procedure shows that the excitations are quasi-particles, as they are still fermions. Furthermore, these quasi-particles have excitation energies with the dispersion relation

$$ E_{qp}(k) = + \sqrt{ ( \epsilon(k) - \mu )^2 + | \Delta(k) |^2 } \tag{2297} $$

The canonical transformation shows that the quasi-particles are part electron-like and part hole-like. This is a consequence of the fact that the quasi-particle excitation consists of a single unpaired electron $(k, \sigma)$ in the presence of the condensate. This specific state can be produced from the ground state either by adding the electron $(k, \sigma)$ to the system or by breaking a Cooper pair by removing the partner electron $(-k, -\sigma)$. We note that the quasi-particles are eigenstates of the spin operator. The $\alpha$ quasi-particle is a spin up excitation, as it is composed of an up spin electron and a down spin hole, whereas the $\beta$ quasi-particle is a spin down excitation. From the dispersion relation, one finds that the B.C.S. superconductor is characterized by the presence of a gap in the excitation spectrum. That is, there is a minimum excitation energy $2 \, | \Delta(k_F) |$ corresponding to breaking a Cooper pair and producing two independent quasi-particles.
——————————————————————————————————
25.4.1 Exercise 83
Evaluate the constant term in the mean field B.C.S. Hamiltonian. Show that the variational B.C.S. ground state is the lowest energy state of the mean field Hamiltonian by showing that the quasi-particle destruction operators annihilate the B.C.S. state

$$ \begin{aligned} \alpha_k \, | \Psi_{BCS} \rangle &= 0 \\ \beta_k \, | \Psi_{BCS} \rangle &= 0 \end{aligned} \tag{2298} $$
——————————————————————————————————
25.5 Thermodynamics
Since the quasi-particles are fermions, the entropy $S$ due to the gas of quasi-particles is given by the formula

$$ S = - 2 \, k_B \sum_k \left\{ \left( 1 - f(E_{qp}(k)) \right) \ln \left[ 1 - f(E_{qp}(k)) \right] + f(E_{qp}(k)) \ln \left[ f(E_{qp}(k)) \right] \right\} \tag{2299} $$
By the usual procedure of minimizing the grand canonical potential $\Omega$ with respect to the distribution $f(E_{qp}(k))$, one can show that the non-interacting quasi-particles are distributed according to the Fermi-Dirac distribution function. Therefore, the quasi-particle contribution to the specific heat is just given by

$$ C_{qp}(T) = - \frac{2}{T} \int_{-\infty}^{+\infty} dE \, \rho_{qp}(E) \left[ E^2 - \frac{T}{2} \frac{\partial \Delta(T)^2}{\partial T} \right] \frac{\partial f}{\partial E} \tag{2300} $$

which involves the average of the temperature derivative of the square of the quasi-particle energy at $\mu$, and the quasi-particle density of states

$$ \rho_{qp}(E) = \sum_k \delta \left( E - E_{qp}(k) \right) \tag{2301} $$
Since, in the mean field approximation, the square of the gap has a finite slope for $T$ just below $T_c$ and is zero above,

$$ \Delta(T)^2 \sim \Delta(0)^2 \left( 1 - \frac{T}{T_c} \right) \Theta( T_c - T ) \tag{2302} $$

the specific heat has a discontinuity at $T_c$. In B.C.S. theory, the magnitude of the specific heat jump has the value $3.03 \, \Delta^2(0) \, \rho(\mu) / T_c$. Thus, the value of the specific heat jump found in weak coupling B.C.S. theory, when normalized to the normal state specific heat, is given by

$$ \frac{\Delta C(T_c)}{C(T_c)} = \frac{C_s - C_n}{C_n} = \frac{12}{7 \zeta(3)} = 1.43 \tag{2303} $$

This ratio is a measure of the quantity

$$ \frac{1}{2 \, k_B^2 \, T_c} \left. \frac{\partial \Delta^2(T)}{\partial T} \right|_{T_c} \sim \left( \frac{\Delta(0)}{k_B T_c} \right)^2 \tag{2304} $$
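The numerical constants in (2276) and (2303) both follow from the Riemann zeta function $\zeta(3)$. A minimal sketch of the arithmetic is given below; the function name `zeta3` and the truncation of the series are illustrative choices.

```python
import math

def zeta3(terms=100000):
    """Riemann zeta(3) from a truncated series; the neglected tail is about 1 / (2 terms^2)."""
    return sum(1.0 / n**3 for n in range(1, terms + 1))

z3 = zeta3()
jump_ratio = 12.0 / (7.0 * z3)               # normalized specific heat jump, eq. (2303)
gap_slope = 8.0 * math.pi**2 / (7.0 * z3)    # coefficient of k_B^2 T_c (T_c - T) in eq. (2276)

print("zeta(3)          =", z3)
print("12 / (7 zeta(3)) =", jump_ratio)
print("8 pi^2 / (7 zeta(3)) =", gap_slope)
```

The first ratio reproduces the value 1.43 quoted in (2303).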
The values of the specific heat jumps for strong coupling materials tend to be higher than the B.C.S. value; for example, the normalized jump for Pb is as large as 2.71. This trend is understood as being due to inelastic scattering processes which tend to suppress $T_c$ more than $\Delta(0)$. In the heavy fermion superconductors, the normalized specific heat discontinuities are significantly smaller than the B.C.S. ratio.

Low Temperatures.
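The gap in the quasi-particle density of states could be expected to show up in an activated exponential dependence of the low-temperature electronic specific heat, for $T \ll T_c$. For these temperatures the order parameter is expected to have saturated, and so if one considers the Fermi-liquid as being well formed then the quasi-particle contribution is given by

$$ C_{qp}(T) = - \frac{2}{T} \int_{-\infty}^{+\infty} dE \, \rho_{qp}(E) \, E^2 \, \frac{\partial f}{\partial E} \tag{2305} $$

The B.C.S. quasi-particle density of states is evaluated as

$$ \begin{aligned} \rho_{qp}(E) &= \sum_k \delta \left( E - E_{qp}(k) \right) \\ &= \int_{-\infty}^{\infty} d\epsilon \, \rho(\epsilon) \, \delta \left( E - \sqrt{ ( \epsilon - \mu )^2 + \Delta(T)^2 } \right) \\ &\sim \rho(\mu) \, \frac{| E |}{| \epsilon - \mu |} \\ &= \rho(\mu) \, \frac{| E |}{\sqrt{ E^2 - \Delta(T)^2 }} \qquad \text{for } | E | > \Delta(T) \end{aligned} \tag{2306} $$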
In evaluating the B.C.S. density of states, the conduction band electron density of states has been approximated by a constant value. The resulting B.C.S. quasi-particle density of states has a gap of magnitude $2 \, \Delta(T)$ around the Fermi-energy. This yields an exponentially activated behavior of the specific heat,

$$ C_{qp}(T) \sim 9.17 \, \gamma \, T_c \exp \left( - \frac{\Delta(0)}{k_B T} \right) \tag{2307} $$

in B.C.S. theory.
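The quasi-particle density of states (2306) diverges at the gap edge but the divergence is integrable, so the states removed from the gap are piled up just above it rather than lost. The sketch below illustrates this under the constant density of states approximation, with $\rho(\mu)$ set to one; the substitution $E = \Delta \cosh t$ is used to remove the square root singularity, and the function names are hypothetical.

```python
import math

def bcs_dos(energy, delta, rho_mu=1.0):
    """Quasi-particle density of states of eq. (2306); zero inside the gap."""
    if abs(energy) <= delta:
        return 0.0
    return rho_mu * abs(energy) / math.sqrt(energy**2 - delta**2)

def states_above_gap(e_max, delta, steps=10000):
    """Integrate bcs_dos from delta to e_max using E = delta * cosh(t), with rho(mu) = 1."""
    t_max = math.acosh(e_max / delta)
    h = t_max / steps
    # With E = delta cosh(t), rho_qp(E) dE becomes delta * cosh(t) dt.
    return sum(delta * math.cosh((i + 0.5) * h) * h for i in range(steps))

delta, e_max = 3.0, 5.0                       # illustrative values, arbitrary energy units
print("rho_qp(E = 5) / rho(mu):", bcs_dos(e_max, delta))
print("states with Delta < E < 5:", states_above_gap(e_max, delta))
print("sqrt(E_max^2 - Delta^2):  ", math.sqrt(e_max**2 - delta**2))
```

The integrated weight equals $\sqrt{ E_{max}^2 - \Delta^2 }$, which is the width of the normal state energy window $| \epsilon - \mu |$ that maps onto quasi-particle energies between $\Delta$ and $E_{max}$.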
25.6 Perfect Conductivity
The current is composed of the sum of a paramagnetic current and a diamagnetic current. The paramagnetic current can be evaluated from the Kubo formula. The paramagnetic current is expressed as

$$ \mathbf{j}_p(q;\omega) = \frac{e^2 \hbar}{4 m^2 c} \frac{1}{V} \sum_k ( 2 \mathbf{k} - \mathbf{q} ) \; ( 2 \mathbf{k} - \mathbf{q} ) \cdot \mathbf{A} \; \left[ \frac{f(E(k)) - f(E(k-q))}{E(k-q) - E(k) + \hbar\omega} + \frac{f(E(k-q)) - f(E(k))}{E(k) - E(k-q) + \hbar\omega} \right] \tag{2308} $$
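In this expression $E(k)$ is the quasi-particle energy in the superconductor. In the static limit with uniform fields, ( $\omega \to 0$, $q \to 0$ ), the paramagnetic current reduces to

$$ \mathbf{j}_p(0;0) = \frac{e^2 \hbar}{m^2 c} \frac{1}{V} \sum_k 2 \, \mathbf{k} \, ( \mathbf{k} \cdot \mathbf{A} ) \left( - \frac{\partial f(E(k))}{\partial E} \right) \tag{2309} $$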
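The total current is found by combining the paramagnetic current with the diamagnetic current

$$ \begin{aligned} \mathbf{j}(0;0) &= \frac{e^2 \hbar}{m^2 c} \frac{1}{V} \sum_k 2 \, \mathbf{k} \, ( \mathbf{k} \cdot \mathbf{A} ) \left( - \frac{\partial f(E(k))}{\partial E} \right) - \frac{\rho e^2}{m c} \mathbf{A} \\ &= \frac{e^2 \hbar}{m^2 c} \frac{2}{6 \pi^2} \int dk \, k^4 \, \mathbf{A} \left( - \frac{\partial f(E(k))}{\partial E} \right) - \frac{\rho e^2}{m c} \mathbf{A} \\ &= \left[ \frac{e^2 \hbar}{m^2 c} \frac{1}{3 \pi^2} \int dk \, k^4 \left( - \frac{\partial f(E(k))}{\partial E} \right) - \frac{\rho e^2}{m c} \right] \mathbf{A} \end{aligned} \tag{2310} $$

In the normal state, where the gap in $E(k)$ vanishes, the derivative of the Fermi-function can be approximated as

$$ - \frac{\partial f(E(k))}{\partial E} = \delta( E(k) - \mu ) \tag{2311} $$

which leads to the vanishing response as

$$ \frac{2 m \mu}{\hbar^2} \frac{k_F^4}{k_F^6} = 1 \tag{2312} $$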
Thus, in the normal state current does not flow in response to a static vector potential. However, in the superconducting state the total current is given by

$$ \mathbf{j} = - \frac{\rho e^2}{m c} \mathbf{A} \left[ 1 - \frac{2 \mu}{k_F^5} \int dk \, k^4 \left( - \frac{\partial f(E(k))}{\partial E} \right) \right] \tag{2313} $$

and as there is a gap on the Fermi-surface, the derivative of the Fermi-function is always exponentially small. Because of the finite superconducting gap, the second term is small and the cancellation does not occur. In the superconducting state, this reduces to the London equation

$$ \mathbf{j} = - \frac{\rho e^2}{m c} \mathbf{A} \tag{2314} $$
This shows that a current will flow in a superconductor in response to a static vector potential, that is the current will screen an applied magnetic field. This leads to the Meissner effect.
25.7 The Meissner Effect
In the superconducting state the susceptibility is expected to be dominated by the diamagnetic susceptibility produced by the supercurrent shielding the external field. The Pauli spin susceptibility will also be modified by the superconductivity, and provide information about the pairing. The zero field susceptibility is defined as a derivative of the magnetization, $\chi_s(T) = \left( \frac{\partial M}{\partial H} \right)$. The magnetization, produced by the electronic spins aligning with a magnetic field applied along the z axis, is given by

$$ M_z = \frac{g \mu_B}{2} \sum_k \left[ f(E_\uparrow(k)) - f(E_\downarrow(k)) \right] \tag{2315} $$
which is given in terms of the Fermi-distribution for quasi-particles with spin $\sigma$ and quasi-particle energy $E_\sigma(k)$. For singlet pairing the magnetic field couples to the spins of the quasi-particles via the Zeeman energies and, as can be seen from inspection of the matrix, only the time reversal partners pair. The quasi-particles consist of broken pairs, i.e. electrons of spin $\sigma$ and holes of spin $- \sigma$. Since a down spin hole has the same Zeeman energy as an up spin electron, the quasi-particle energies depend on field through

$$ E_\sigma(k) = E_{H=0}(k) - \frac{g \mu_B \sigma H}{2} \tag{2316} $$

and so the spin susceptibility takes the usual form

$$ \chi_s(T) = - 2 \left( \frac{g \mu_B}{2} \right)^2 \int_{-\infty}^{+\infty} dE \, \rho_{qp}(E) \, \frac{\partial f}{\partial E} \tag{2317} $$
which involves the B.C.S. quasi-particle density of states. The Pauli susceptibility tends to zero as $T \to 0$ in an exponentially activated way

$$ \chi_p(T) \sim \exp \left( - \frac{\Delta(0)}{T} \right) \tag{2318} $$

The exponential vanishing of the spin susceptibility occurs as the electrons form singlet pairs in the ground state, and the finite spin moment is caused by thermal population of quasi-particles. Thus, in the spin singlet phases, the spin susceptibility could be expected to vanish as $T \to 0$. However, spin-orbit coupling will produce a residual susceptibility that depends on the ratio of the superconducting coherence length, $\xi_0$, to the mean free path due to spin-orbit scattering, $l_{so}$. In the presence of spin-orbit coupling, the spin is no longer a good quantum number for the single particle eigenstates, and the spin up and spin down states are mixed. In
the limit that the strength of the spin-orbit coupling $\lambda \, \vec{L} \cdot \vec{S}$ is so large that $\lambda \gg \Delta_0$, the average value of $\sigma_z$ for a single particle state tends to zero. The spin susceptibility is, therefore, reduced. The scattering has the effect that a significant contribution to the normal state $\chi(T)$ comes from single particle states separated by an energy of the order of the spin-orbit scattering rate, which is by our assumption greater than $\Delta$. As the opening up of a gap at the Fermi-energy is not expected to change the contribution of these higher energy states, one finds that the susceptibility in the superconducting state can remain comparable in magnitude to the normal state value. According to Anderson, the normalized susceptibility should have the two limits,

$$ \frac{\chi_s(0)}{\chi_n} = 1 - \frac{2}{\pi} \frac{l_{so}}{\xi_0} \tag{2319} $$
for strong spin-orbit scattering and for weak spin-orbit scattering one has

$$ \frac{\chi_s(0)}{\chi_n} = \frac{\pi}{6} \frac{\xi_0}{l_{so}} \tag{2320} $$

Hence, a partial Meissner effect at $T = 0$ can be found in a conventional superconductor.

26 Landau-Ginsberg Theory
Superconductors can be divided into two categories, which depend on their macroscopic characteristics when an applied magnetic field is present. The classification is based on the length scale over which the magnetic field is screened, $\lambda_L$, relative to the length scale over which the superconducting order parameter changes. The latter length is given by the spatial extent of the Cooper pair wave function, or coherence length $\xi$,

$$ \xi = \frac{\hbar v_F}{\pi \Delta(T)} \tag{2321} $$
Type I Superconductors.

In simple (non-transition) metals the penetration depth is small, e.g., $\lambda \sim 300$ Å for Al, while the coherence length is large, $\xi \sim 1 \times 10^4$ Å, as $v_F$ is large. Materials where

$$ \kappa = \frac{\lambda}{\xi} < \frac{1}{\sqrt{2}} \tag{2322} $$

are type I superconductors.

Type II Superconductors.
For transition metals, rare earths and intermetallic compounds, where the band mass is very large, $\lambda_L$ is very large ( $\sim 2000$ Å for V$_3$Ga ). As the Fermi-velocity is small ( $v_F \sim 10^4$ m/s ) and $T_c$ and $\Delta(0)$ are high, $\xi$ is small ( $\sim 50$ Å ). Materials where

$$ \kappa = \frac{\lambda}{\xi} > \frac{1}{\sqrt{2}} \tag{2323} $$
are type II superconductors. Since $\lambda_L$ and $\xi$ diverge in the same way at $T_c$, the dimensionless ratio $\kappa$ is approximately temperature independent. If a magnetic field $H < H_c$ is applied to a small superconductor, the field is excluded from the superconductor, but if $H > H_c$ the field will penetrate and the superconductor will undergo a transition to the normal state. If a field is applied normal to the surface of a large superconducting slab then, because $\nabla \cdot B = 0$, the field has to penetrate the slab. In a type I superconductor the magnetic field will concentrate into regions where

$$ | B | = H_c \tag{2324} $$

which are normal and regions where

$$ B = 0 \tag{2325} $$
which are superconducting. The condensation energy density in the superconducting state is $\frac{H_c^2}{8 \pi}$ and the diamagnetic energy of the normal state is also $\frac{H_c^2}{8 \pi}$. These regions are separated by a domain wall which has positive energy. The energetic cost of forming a domain wall of area $A$ can be estimated as

$$ E \sim \left( \xi \, \frac{H_c^2}{8 \pi} - \lambda \, \frac{H_c^2}{8 \pi} \right) A \tag{2326} $$

where the term $\xi \frac{H_c^2}{8 \pi}$ is the energetic cost of setting the order parameter to zero, and the diamagnetic energy is reduced by $\lambda \frac{H_c^2}{8 \pi}$. Because of this positive domain wall energy in a type I superconductor, the number of domains and domain walls will be minimized. The domain pattern will have a scale of subdivision which is intermediate between $\xi$ and the sample size. In type II superconductors, a similar separation occurs, but as the domain wall energy is negative, the superconductor will break up into as many normal regions as possible. These normal regions have the form of magnetic flux carrying tubes that thread the sample, which are known as vortices. Each vortex carries a minimum amount of flux $\Phi_0$; the flux is quantized in units of $\Phi_0$,

$$ \Phi_0 = \frac{h c}{2 e} = 2.07 \times 10^{-7} \ \mathrm{Gauss \ cm^2} \tag{2327} $$
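The numerical value of the flux quantum in (2327) can be reproduced from the fundamental constants in Gaussian (CGS) units. The sketch below uses rounded CODATA values for $h$, $c$ and $e$; the variable names are illustrative.

```python
# Fundamental constants in Gaussian (CGS) units.
PLANCK_H = 6.62607015e-27        # erg s
SPEED_OF_LIGHT = 2.99792458e10   # cm / s
ELECTRON_CHARGE = 4.80320471e-10 # esu (statcoulomb)

# Flux quantum for a charge 2e, eq. (2327); the result is in Gauss cm^2.
flux_quantum = PLANCK_H * SPEED_OF_LIGHT / (2.0 * ELECTRON_CHARGE)
print(f"Phi_0 = {flux_quantum:.4e} Gauss cm^2")
```

The factor of $2e$ in the denominator reflects the charge of a Cooper pair.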
The vortices first enter the superconductor at a critical field $H_{c1}$. The vortices form a triangular lattice. The superconductor becomes saturated with vortices when it becomes completely normal at an upper critical field $H_{c2}$. The magnetization $M$ is linear in field up to $H_{c1}$ with susceptibility $- \frac{1}{4 \pi}$. At $H_{c1}$ the magnitude of the magnetization has a cusp and the magnitude falls to zero at $H_{c2}$.
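The criterion separating type I from type II superconductors, (2322) and (2323), can be sketched as a small function applied to the length scales quoted in this section for Al and V$_3$Ga. The function name `classify` is hypothetical, and the numbers are the order of magnitude estimates from the text rather than precise material data.

```python
import math

def classify(penetration_depth, coherence_length):
    """Return (kappa, type) from the ratio kappa = lambda / xi compared with 1 / sqrt(2)."""
    kappa = penetration_depth / coherence_length
    kind = "type I" if kappa < 1.0 / math.sqrt(2.0) else "type II"
    return kappa, kind

# Lengths in Angstrom, as quoted in the text.
materials = {
    "Al": (300.0, 1.0e4),
    "V3Ga": (2000.0, 50.0),
}
for name, (lam, xi) in materials.items():
    kappa, kind = classify(lam, xi)
    print(f"{name}: kappa = {kappa:.2f} -> {kind}")
```

With these estimates Al falls well below the threshold and V$_3$Ga well above it, so the classification is insensitive to the precise values used.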