Chapter 1: Chemical Bonding Linus Pauling (1901–1994) December 28, 2001
Contents
1 The development of Bands and their filling
2 Different Types of Bonds
  2.1 Covalent Bonding
  2.2 Ionic Bonding
    2.2.1 Madelung Sums
  2.3 Metallic Bonding
  2.4 Van der Waals Bonds
    2.4.1 Van der Waals-London Interaction
[Periodic Table graphic: for each element the table lists the symbol, atomic number, name, and atomic mass, arranged in groups 1 to 18 and periods 1 to 7, with the lanthanides (La to Yb) and actinides (Ac to No) shown separately.]
Solid state physics is the study of mainly periodic systems (or things that are close to periodic) in the thermodynamic limit, ≈ $10^{21}$ atoms/cm$^3$. At first, solving such a large system would appear to be a hopeless task.
Figure 1: The simplest model of a solid is a periodic array of valence orbitals embedded in a matrix of atomic cores.
However, the self-similar, translationally invariant nature of the periodic solid and the fact that the core electrons are very tightly bound at each site (so we may ignore their dynamics) make approximate solutions possible. Thus, the simplest model of a solid is a periodic array of valence orbitals embedded in a matrix of atomic cores. Solving the problem in one of the irreducible elements of the periodic solid (cf. one of the spheres in Fig. 1) is often equivalent to solving the whole system. For this reason we must study the periodicity and the mechanism (chemical bonding) which binds the lattice into a periodic structure. The latter is the emphasis of this chapter.
1 The development of Bands and their filling
nl    elemental solid
1s    H, He
2s    Li, Be
2p    B→Ne
3s    Na, Mg
3p    Al→Ar
4s    K, Ca
3d    transition metals Sc→Zn
4p    Ga→Kr
5s    Rb, Sr
4d    transition metals Y→Cd
5p    In→Xe
6s    Cs, Ba
4f    Rare Earths (Lanthanides) Ce→Lu
5d    transition metals La→Hg
6p    Tl→Rn

Table 1: Orbital filling scheme for the first few atomic orbitals.
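The ordering in Table 1 can be reproduced with a short sketch. It relies on the empirical n + l rule (fill in order of increasing n + l, and of increasing n when there is a tie), which is an assumption brought in here for illustration and is not stated in the text. The names `filling_order` and `table1` are hypothetical.

```python
# Sketch: order atomic subshells with the empirical n + l rule.
# The rule itself is an assumption introduced for illustration.

L_LABELS = "spdf"

def filling_order(n_max=6):
    """Return subshell labels sorted by (n + l, n), restricted to l <= 3."""
    shells = [(n, l) for n in range(1, n_max + 1)
              for l in range(0, min(n, 4))]
    shells.sort(key=lambda nl: (nl[0] + nl[1], nl[0]))
    return [f"{n}{L_LABELS[l]}" for n, l in shells]

# Keep only the subshells that appear in Table 1 (up to and including 6p).
order = filling_order()
table1 = order[:order.index("6p") + 1]
print(" ".join(table1))
```

In this ordering 5s comes before 4d, as in the table.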
We will imagine that each atom (cf. one of the spheres in Fig. 1) is composed of hydrogenic orbitals which we describe by a screened Coulomb potential

$$V(r) = -\frac{Z_{nl} e^2}{r} \qquad (1)$$

where $Z_{nl}$ describes the effective charge seen by each electron (in principle, it will then be a function of $n$ and $l$). As electrons are added to the solid, they then fill up the one-electron states 1s 2s 2p 3s 3p 3d 4s 4p 4d 4f···, where the correspondence between spdf and $l$ is s → $l = 0$, p → $l = 1$, etc. The elemental solids are then made up by filling these orbitals systematically (as shown in Table 1) starting with the lowest energy states, where

$$E_{nl} = -\frac{m e^4 Z_{nl}^2}{2\hbar^2 n^2} .$$

Note that for large $n$, the orbitals do not fill up simply as a function of $n$ as we would expect from a simple hydrogenic model with $E_n = -\frac{m Z^2 e^4}{2\hbar^2 n^2}$ (with all electrons seeing the same nuclear charge $Z$). For example, the 5s orbitals fill before the 4d! This is because the situation is complicated by atomic screening. That is, s-electrons can sample the core and so are not very well screened, whereas d and f states face the angular momentum barrier which keeps them away from the atomic core, so that they feel a potential that is screened by the electrons of smaller $n$ and $l$. To put it another way, the effective $Z_{5s}$ is larger than $Z_{4d}$. A schematic atomic level structure, accounting for screening, is shown in Fig. 2. Now let's consider the process of constructing a periodic solid. The simplest model of a solid is a periodic array of valence orbitals embedded in a matrix of atomic cores (Fig. 1). As a simple model of how
[Figure 2 graphic: atomic levels ordered from top to bottom as 5d 4f 6s 5p 4d 5s 4p 3d 4s 3p 3s 2p 2s 1s; the bare potential V(r) ∝ −Z/r and the effective potential V(r) + C l(l+1)/r² for s and d states; the Ce valence shell (6s, 5d, 4f) is marked.]
Figure 2: Level crossings due to atomic screening. The potential felt by states with large l is screened since they cannot access the nucleus. Thus, orbitals of different principal quantum numbers can be close in energy. For example, in elemental Ce (4f¹5d¹6s²), both the 5d and 4f orbitals may be considered to be in the valence shell, and form metallic bands. However, the 5d orbitals are much larger and of higher symmetry than the 4f ones. Thus, electrons tend to hybridize (move on or off) with the 5d orbitals more effectively. The Coulomb repulsion between electrons on the same 4f orbital will be strong, so the electrons on these orbitals tend to form magnetic moments.
the eigenstates of the individual atoms are modified when brought together to form a solid, consider a pair of isolated orbitals. If they are far apart, each orbital has a Hamiltonian $H_0 = \epsilon n$, where $n$ is the orbital occupancy and we have ignored the effects of electronic correlations (which would contribute terms proportional to $n_\uparrow n_\downarrow$). If we bring them together so that they can exchange electrons, i.e. hybridize, then the degeneracy of the two orbitals is lifted. Suppose the system can gain
Figure 3: Two isolated orbitals. If they are far apart, each has a Hamiltonian $H_0 = \epsilon n$, where $n$ is the orbital occupancy.
Figure 4: Two orbitals close enough to gain energy by hybridization. The hybridization lifts the degeneracy of the orbitals, creating bonding and antibonding states.
an amount of energy $t$ by moving the electrons from site to site (our conclusions will not depend upon the sign of $t$; we will see that $t$ is proportional to the overlap of the atomic orbitals). Then

$$H = \epsilon (n_1 + n_2) - t\,(c_1^\dagger c_2 + c_2^\dagger c_1) \qquad (2)$$

where $c_1$ ($c_1^\dagger$) destroys (creates) an electron on orbital 1. If we rewrite this in matrix form

$$H = \begin{pmatrix} c_1^\dagger & c_2^\dagger \end{pmatrix} \begin{pmatrix} \epsilon & -t \\ -t & \epsilon \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \qquad (3)$$

then it is apparent that the system has eigenenergies $\epsilon \pm t$. Thus the two states split their degeneracy, the splitting is proportional to $|t|$, and they remain centered at $\epsilon$.
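The eigenenergies of the 2×2 matrix in Eq. 3 can be checked with a minimal sketch. The closed form for a real symmetric 2×2 matrix is standard; the function name and the numerical values of ε and t below are illustrative assumptions, not values from the text.

```python
import math

def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues (low, high) of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius

eps, t = -1.0, 0.25          # illustrative numbers, not from the text
low, high = symmetric_2x2_eigenvalues(eps, -t, eps)
print(low, high)             # bonding and antibonding levels, eps -/+ |t|
print(high - low)            # the splitting, equal to 2|t|
```

Replacing t by −t leaves both levels unchanged, which is the sense in which the conclusions do not depend on the sign of t.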
If we continue this process of bringing in more isolated orbitals into the region where they can hybridize with the others, then a band of states is formed, again with width proportional to $t$, and centered around $\epsilon$ (cf. Fig. 5).

[Figure 5 graphic: isolated orbitals added one at a time, + + + ... = Band, drawn along an energy axis E.]
Figure 5: If we bring many orbitals into proximity so that they may exchange electrons (hybridize), then a band is formed centered around the location of the isolated orbital, and with width proportional to the strength of the hybridization.

This, of course, is an oversimplification. Real solids are composed of elements with multiple orbitals that produce multiple bonds. Now imagine what happens if we have several orbitals on each site (i.e. s, p, etc.). As we reduce the separation between the orbitals and increase their overlap, these bonds increase in width and may eventually overlap, forming bands. The valence orbitals, which generally have a greater spatial extent, will overlap more, so their bands will broaden more. Of course, eventually we will stop gaining energy (∼ $t$) from bringing the atoms closer together, due to overlap of the cores. Once we have reached the optimal
point we fill the states, two electrons per state, until we run out of electrons. Electronic correlations complicate this simple picture of band formation since they tend to keep the orbitals from being multiply occupied.
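The growth of a band from many hybridized orbitals can be sketched for an open chain of identical orbitals with nearest-neighbor hopping. This specific chain model, the closed-form levels ε − 2t cos(kπ/(N+1)), and the function names are assumptions made for illustration; the determinant recurrence is used only to confirm that the quoted levels are roots of det(H − E).

```python
import math

def chain_levels(eps, t, n_sites):
    """Analytic levels of an open chain with on-site energy eps and hopping -t."""
    return [eps - 2.0 * t * math.cos(k * math.pi / (n_sites + 1))
            for k in range(1, n_sites + 1)]

def char_poly(energy, eps, t, n_sites):
    """det(H - E) for the tridiagonal chain, via the three-term recurrence."""
    d_prev, d_curr = 1.0, eps - energy
    for _ in range(n_sites - 1):
        d_prev, d_curr = d_curr, (eps - energy) * d_curr - t * t * d_prev
    return d_curr

eps, t = 0.0, 1.0            # illustrative values
for n_sites in (2, 4, 16, 64):
    levels = chain_levels(eps, t, n_sites)
    width = max(levels) - min(levels)
    print(n_sites, round(width, 4))   # the width grows toward 4|t|
```

For two sites the levels reduce to ε ± t, and as sites are added the levels fill in a band centered on ε whose width is set by t.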
2 Different Types of Bonds
These complications aside, the overlap of the orbitals is bonding. The type of bonding is determined to a large degree by the amount of overlap. Three different general categories of bonds form in solids (cf. Table 2).

Bond       Overlap             Lattice                                        Constituents
Ionic      very small (< a)    closest unfrustrated packing                   dissimilar
Covalent   small (∼ a)         determined by the structure of the orbitals    similar
Metallic   very large (≫ a)    closest packed                                 unfilled valence orbitals

Table 2: The type of bond that forms between two orbitals is dictated largely by the amount that these orbitals overlap relative to their separation a.
2.1 Covalent Bonding
Covalent bonding is distinguished as being orientationally sensitive. It is also short ranged so that the interaction between nearest neighbors is of prime importance and that between more distant neighbors is often neglected. It is therefore possible to describe many of its properties using the chemistry of the constituent molecules. Consider a simple diatomic molecule O2 with a single electron and Hamiltonian

$$H = -\frac{\nabla^2}{2m} - \frac{Ze^2}{r_a} - \frac{Ze^2}{r_b} + \frac{Z^2e^2}{R} \qquad (4)$$
We will search for a variational solution to the problem of the molecule ($H\Psi_{mol} = E\Psi_{mol}$), by constructing a variational wavefunction from the atomic orbitals $\psi_a$ and $\psi_b$. Consider the variational molecular wavefunction

$$\Psi' = c_a\psi_a + c_b\psi_b \qquad (5)$$

$$E' = \frac{\int \Psi'^{*} H \Psi'}{\int \Psi'^{*}\Psi'} \ge E \qquad (6)$$

The best $\Psi'$ is that which minimizes $E'$. We now define the quantum integrals

$$S = \int \psi_a^{*}\psi_b \qquad H_{aa} = H_{bb} = \int \psi_a^{*} H \psi_a \qquad H_{ab} = \int \psi_a^{*} H \psi_b . \qquad (7)$$
Note that $1 > S_r > 0$, and that $H_{abr} < 0$ since $\psi_a$ and $\psi_b$ are bound states [where $S_r = \mathrm{Re}\,S$ and $H_{abr} = \mathrm{Re}\,H_{ab}$]. With these definitions,

$$E' = \frac{(c_a^2 + c_b^2)H_{aa} + 2c_ac_bH_{abr}}{c_a^2 + c_b^2 + 2c_ac_bS_r} \qquad (8)$$
and we search for an extremum $\partial E'/\partial c_a = \partial E'/\partial c_b = 0$. From the first condition, $\partial E'/\partial c_a = 0$, after some simplification and re-substitution of $E'$ into the above equation, we get the condition

$$c_a(H_{aa} - E') + c_b(H_{abr} - E'S_r) = 0 . \qquad (9)$$

The second condition, $\partial E'/\partial c_b = 0$, gives

$$c_a(H_{abr} - E'S_r) + c_b(H_{aa} - E') = 0 . \qquad (10)$$
Together, these form a set of secular equations, with eigenvalues

$$E' = \frac{H_{aa} \pm H_{abr}}{1 \pm S_r} . \qquad (11)$$
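As a numerical check on Eq. 11, the secular determinant of Eqs. 9 and 10, $(H_{aa} - E')^2 - (H_{abr} - E'S_r)^2 = 0$, can be solved directly as a quadratic in E'. This is a minimal sketch; the function name and the values chosen for the integrals are hypothetical and only need to satisfy $H_{abr} < 0$ and $0 < S_r < 1$.

```python
import math

def secular_roots(h_aa, h_ab, s):
    """Roots (low, high) of (h_aa - E)^2 - (h_ab - E*s)^2 = 0."""
    a = 1.0 - s * s
    b = -2.0 * (h_aa - h_ab * s)
    c = h_aa * h_aa - h_ab * h_ab
    disc = math.sqrt(max(0.0, b * b - 4.0 * a * c))
    roots = sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)])
    return roots[0], roots[1]

h_aa, h_ab, s = -1.0, -0.5, 0.3     # illustrative values of H_aa, H_abr, S_r
low, high = secular_roots(h_aa, h_ab, s)

bonding = (h_aa + h_ab) / (1.0 + s)      # the + branch of Eq. 11
antibonding = (h_aa - h_ab) / (1.0 - s)  # the - branch of Eq. 11
print(low, bonding)
print(high, antibonding)
```

With a negative $H_{abr}$ the lower root coincides with the + branch.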
Remember, $H_{abr} < 0$, so the lowest energy state is the + state. If we substitute Eq. 11 into Eqs. 9 and 10, we find that the + state corresponds to the eigenvector $c_a = c_b = 1/\sqrt{2}$; i.e. it is the bonding state.

$$\Psi'_{bonding} = \frac{1}{\sqrt{2}}(\psi_a + \psi_b) \qquad E'_{bonding} = \frac{H_{aa} + H_{abr}}{1 + S_r} . \qquad (12)$$

For the −, or antibonding state, $c_a = -c_b = 1/\sqrt{2}$. Thus, in the bonding state, the wavefunctions add between the atoms, which corresponds to a build-up of charge between the oxygen atoms (cf. Fig. 6). In the antibonding state, there is a deficiency of charge between the atoms. Energetically the bonding state is lower and if there are two electrons, both will occupy the lower state (i.e., the molecule gains energy by bonding in a singlet spin configuration!). Energy is lost if there are more electrons which must fill the antibonding states. Thus the covalent
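[Figure 6 graphic: an electron e at distances r_a and r_b from two ions of charge Ze separated by R; curves labeled Ψ_bonding and Ψ_anti-bonding, with the labels spin singlet and spin triplet.]
Figure 6: Two oxygen ions, each with charge Ze, bind an electron with charge e. The electron, which is bound in the oxygen valence orbitals, will form a covalent bond between the oxygens.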
bond is only effective with partially occupied single-atomic orbitals. If the orbitals are full, then the energy loss of occupying the antibonding states would counteract the gain of occupying the bonding state and no (conventional) bond would occur. Of course, in reality it is much worse than this since electronic correlation energies would also increase. The pile-up of charge which is inherent to the covalent bond is important for the lattice symmetry. The reason is that the covalent bond is sensitive to the orientation of the orbitals. For example, as shown in Fig. 7 an S and a Pσ orbital can bond if both are in the same plane;
whereas an S and a Pπ orbital cannot. I.e., covalent bonds are directional! An excellent example of this is diamond (C) in which the
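[Figure 7 graphic: an S orbital beside a Pσ orbital (Bonding) and an S orbital beside a Pπ orbital (No bonding); + and − mark the signs of the lobes.]
Figure 7: A bond between an S and a P orbital can only happen if the P-orbital is oriented with either its plus or minus lobe closer to the S-orbital. I.e., covalent bonds are directional!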
(tetrahedral) lattice structure is dictated by bond symmetry. At first sight one might assume that C, with a 1s² 2s² 2p² configuration, could form only two bonds with the two electrons in the partially filled p-shell. However, significant energy is gained from bonding, and 2s and 2p are close in energy (cf. Fig. 2), so that sufficient energy is gained from the bond to promote one of the 2s electrons. A linear combination of the 2s, 2p_x, 2p_y and 2p_z orbitals forms an sp³ hybridized state, and C often forms structures (diamond) with tetrahedral symmetry. Another example occurs most often in transition metals where the d-orbitals try to form covalent bonds (the larger s-orbitals usually form metallic bonds as described later in this chapter). For example, consider a set of d-orbitals in a metal with a face-centered cubic (fcc) structure,
as shown in Fig. 8. The xy, xz, and yz orbitals all face towards a neighboring site, and can thus form bonds with these sites; however, the x² − y² and 3z² − r² orbitals do not point towards neighboring sites and therefore do not participate in bonding. If the metal had a simple cubic structure, the situation would be reversed: the x² − y² and 3z² − r² orbitals, but not the xy, xz, and yz orbitals, would participate in the bonding. Since energy is gained from bonding, this energetically favors an fcc lattice in the transition metals (although this may not be the dominant factor determining lattice structure).
[Figure 8 graphic: the d_xz, d_xy, d_yz, d_x²−y² and d_3z²−r² orbitals drawn on x, y, z axes, together with the Face Centered Cubic Structure.]
Figure 8: In the fcc structure, the xy, xz, and yz orbitals all face towards a neighboring site, and can thus form bonds with these sites; however, the x² − y² and 3z² − r² orbitals do not point towards neighboring sites and therefore do not participate in bonding.
One can also form covalent bonds from dissimilar atoms, but these will also have some ionic character, since the bonding electron will no longer be shared equally by the bonding atoms.

2.2 Ionic Bonding
The ionic bond occurs by charge transfer between dissimilar atoms which initially have open electronic shells and closed shells afterwards. Bonding then occurs by Coulombic attraction between the ions. The energy of this attraction is called the cohesive energy. This, when added to the ionization energies, yields the energy released when the solid is formed from separated neutral atoms (cf. Fig. 9). The cohesive energy is determined roughly by the ionic radii of the elements. For example, for NaCl

$$E_{cohesive} = \frac{e^2}{a_o}\,\frac{a_o}{r_{Na} + r_{Cl}} = 5.19\ \mathrm{eV} . \qquad (13)$$
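The estimate in Eq. 13 is simple arithmetic and can be reproduced with a short sketch. It assumes the standard value e² ≈ 14.40 eV·Å (Gaussian units) and takes the ionic radii and the three experimental energies from Fig. 9; the variable and function names are hypothetical.

```python
E2_EV_ANGSTROM = 14.40   # e^2 in eV * Angstrom (Gaussian units), standard value

def coulomb_pair_energy(r_cation, r_anion):
    """Coulomb energy (eV) of two unit charges separated by the sum of the radii."""
    return E2_EV_ANGSTROM / (r_cation + r_anion)

r_na, r_cl = 0.97, 1.81                  # ionic radii in Angstrom, from Fig. 9
e_cohesive = coulomb_pair_energy(r_na, r_cl)
print(round(e_cohesive, 2))              # close to the 5.19 eV quoted in Eq. 13

# Bookkeeping from Fig. 9: lattice energy minus the ionization energy of Na
# plus the electron affinity of Cl (all in eV, experimental values).
net_gain = 7.9 - 5.14 + 3.61
print(round(net_gain, 2))
```

The small difference from 5.19 eV in the first number comes only from rounding of the constants used here.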
Note that this does not agree with the experimental figure given in the caption of Fig. 9. This is due to uncertainties in the definitions of the ionic radii, and to oversimplification of the model. However, such calculations are often sufficient to determine the energy of the ionic structure (see below). Clearly, ionic solids are insulators since such a large amount of energy (∼ 10 eV) is required for an electron to move freely. The crystal structure in ionic crystals is determined by balancing the needs of keeping the unlike charges close while keeping like charges apart. For systems with like ionic radii (e.g. CsCl, $r_{Cs} \approx 1.60$ Å, $r_{Cl} \approx 1.81$ Å) this means the crystal structure will be the closest unfrustrated
[Figure 9 graphic, three steps:
  Na + 5.14 eV → Na+ + e−
  e− + Cl → Cl− + 3.61 eV
  Na+ + Cl− → Na+Cl− + 7.9 eV
with ionic radii r_Cl = 1.81 Å and r_Na = 0.97 Å.]
Figure 9: The energy per molecule of a crystal of sodium chloride is (7.9 − 5.1 + 3.6) eV = 6.4 eV lower than the energy of the separated neutral atoms. The cohesive energy with respect to separated ions is 7.9 eV per molecular unit. All values on the figure are experimental. This figure is from Kittel.
packing. Since the face-centered cubic (fcc) structure is frustrated (like charges would be nearest neighbors), this means a body-centered cubic (bcc) structure is favored for systems with like ionic radii (see Fig. 10). For systems with dissimilar radii like NaCl (cf. Fig. 9), a simple cubic structure is favored. This is because the larger Cl atoms require more room. If the cores approach closer than their ionic radii, then since they are filled cores, a covalent bond including both bonding and anti-bonding states would form. As discussed before, Coulomb repulsion makes this energetically unfavorable.

[Figure 10 graphic: three panels, Body Centered Cubic (Cs, Cl), Cubic (Na, Cl), and Face Centered Cubic, each with lattice vectors a, b, c.]
Figure 10: Possible salt lattice structures. In the simple cubic and bcc lattices all the nearest neighbors are of a different species than the element on the site. These ionic lattices are unfrustrated. However, it is not possible to make an unfrustrated fcc lattice using like amounts of each element.
2.2.1 Madelung Sums
This repulsive contribution to the total energy requires a fully-quantum calculation. However, the attractive Coulombic contribution may be easily calculated, and the repulsive potential modeled by a power-law. Thus, the potential between any two sites i and j is approximated by

$$\phi_{ij} = \pm\frac{e^2}{r_{ij}} + \frac{B}{r_{ij}^{n}} \qquad (14)$$

where the first term describes the Coulombic interaction and the plus (minus) sign is for the potential between similar (dissimilar) elements. The second term heuristically describes the repulsion due to the overlap of the electronic clouds, and contains two free parameters n and B (Kittel, pp. 66–71, approximates this heuristic term with an exponential, $B\exp(-r_{ij}/\rho)$, also with two free parameters). These are usually determined from fits to experiment. If a is the separation of nearest neighbors, $r_{ij} = a\,p_{ij}$, and there are N sites in the system, then the total potential energy may be written as
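$$\Phi = N\Phi_i = N\left[-\frac{e^2}{a}\sum_{i\ne j}\frac{\pm}{p_{ij}} + \frac{B}{a^{n}}\sum_{i\ne j}\frac{1}{p_{ij}^{n}}\right] . \qquad (15)$$

The quantity $A = \sum_{i\ne j} \pm/p_{ij}$ is known as the Madelung constant. (With the explicit minus sign in Eq. 15, the plus sign in this sum is taken for unlike charges, so that A is positive.) A depends upon the type of lattice only (not its size). For example $A_{NaCl} = 1.748$, and $A_{CsCl} = 1.763$. Due to the short range of the potential $1/p^{n}$, the second term may be approximated by its nearest neighbor sum.

2.3 Metallic Bonding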
Metallic bonding is characterized by at least some long ranged and nondirectional bonds (typically between s orbitals), closest packed lattice structures and partially filled valence bands. From the first characteristic, we expect some of the valence orbitals to encompass many other lattice sites, as discussed in Fig. 11. Thus, metallic bonds lack the directional sensitivity of the covalent bonds and form non-directional bonds and closest packed lattice structures determined by an optimal filling of space. In addition, since the bands are composed of partially
[Figure 11 graphic: a compact 3d x² − y² orbital drawn inside a much larger 4s orbital.]
Figure 11: In metallic Ni (fcc, 3d⁸4s²), the 4s and 3d bands (orbitals) are almost degenerate (cf. Fig. 2) and thus, both participate in the bonding. However, the 4s orbitals are so large compared to the 3d orbitals that they encompass many other lattice sites, forming non-directional bonds. In addition, they hybridize weakly with the d-orbitals (the different symmetries of the orbitals cause their overlap to almost cancel) which in turn hybridize weakly with each other. Thus, whereas the s orbitals form a broad metallic band, the d orbitals form a narrow one.
filled orbitals, it is always possible to supply a small external electric field and move the valence electrons through the lattice. Thus, metallic bonding leads to a relatively high electronic conductivity. In the transition metals (Ca, Sr, Ba) the d-band is narrow, but the s and p bonds are extensive and result in conduction. Partially filled bands can occur by bond overlap too; e.g., in Be and Mg, since here the full s bonds overlap with the empty p-bands.
2.4 Van der Waals Bonds
As a final subject involving bonds, consider solids formed of Noble gases or composed of molecules with saturated orbitals. Here, of course, there is neither an ionic nor covalent bonding possibility. Furthermore, if the charge distributions on the atoms were rigid, then the interaction between atoms would be zero, because the electrostatic potential of a spherical distribution of electronic charge is canceled outside a neutral atom by the electrostatic potential of the charge on the nucleus. Bonding can result from small quantum fluctuations in the charge which induce electric dipole moments.
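[Figure 12 graphic: two oscillators separated by R along the unit vector n, each binding a + and a − charge, with displacements x_1 and x_2 and momenta P_1 and P_2.]
Figure 12: Noble gases and molecules with saturated orbitals can form short ranged van der Waals bonds by inducing fluctuating electric dipole moments in each other. This may be modeled by two harmonic oscillators binding a positive and negative charge each.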
As shown in Fig. 12 we can model the constituents as either induced dipoles, or more correctly, dipoles formed of harmonic oscillators. Suppose a quantum fluctuation on 1 induces a dipole moment $\mathbf{p}_1$. Then dipole 1 exerts a field

$$\mathbf{E}_1 = \frac{3\mathbf{n}(\mathbf{p}_1\cdot\mathbf{n}) - \mathbf{p}_1}{r^3} \qquad (16)$$

which is felt by 2, which in turn induces a dipole moment $p_2 \propto E_1 \propto 1/r^3$. This, in turn, generates a dipole field $E_2$ felt by 1, with $E_2 \propto p_2/r^3 \propto 1/r^6$. Thus, the energy of the interaction is very small and short ranged.

$$W = -\mathbf{p}_1\cdot\mathbf{E}_2 \propto 1/r^6 \qquad (17)$$

2.4.1 Van der Waals-London Interaction
Of course, a more proper treatment of the van der Waals interaction should account for quantum effects in induced dipoles modeled as harmonic oscillators (here we follow Kittel). As a model we consider two identical linear harmonic oscillators 1 and 2 separated by R. Each oscillator bears charges ±e with separations $x_1$ and $x_2$, as shown in Fig. 12. The particles oscillate along the x axis with frequency $\omega_0$ (the strongest optical absorption line of the atom), and momenta $P_1$ and $P_2$. If we ignore the interaction between the charges (other than the self-interaction between the dipole's charges which is accounted for in the harmonic oscillator potentials), then the Hamiltonian of the system is

$$H_0 = \frac{P_1^2 + P_2^2}{2m} + \frac{1}{2}m\omega_0^2(x_1^2 + x_2^2) . \qquad (18)$$
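If we approximate each pair of charges as point dipoles, then they interact with a Hamiltonian

$$H_1 \approx \frac{-3(\mathbf{p}_2\cdot\mathbf{n})(\mathbf{p}_1\cdot\mathbf{n}) + \mathbf{p}_1\cdot\mathbf{p}_2}{|x_1 + R - x_2|^3} \approx -\frac{2p_1p_2}{R^3} = -\frac{2e^2x_1x_2}{R^3} . \qquad (19)$$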
The total Hamiltonian $H_0 + H_1$ can be diagonalized by a normal mode transformation that isolates the symmetric mode (where both oscillators move together) from the antisymmetric one where they move in opposition

$$x_s = (x_1 + x_2)/\sqrt{2} \qquad x_a = (x_1 - x_2)/\sqrt{2} \qquad (20)$$

$$P_s = (P_1 + P_2)/\sqrt{2} \qquad P_a = (P_1 - P_2)/\sqrt{2} \qquad (21)$$
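After these substitutions, the total Hamiltonian becomes

$$H = \frac{P_s^2 + P_a^2}{2m} + \frac{1}{2}\left(m\omega_0^2 - \frac{2e^2}{R^3}\right)x_s^2 + \frac{1}{2}\left(m\omega_0^2 + \frac{2e^2}{R^3}\right)x_a^2 \qquad (22)$$

The new eigenfrequencies of these two modes are then

$$\omega_s = \left[\omega_0^2 - \frac{2e^2}{mR^3}\right]^{1/2} \qquad \omega_a = \left[\omega_0^2 + \frac{2e^2}{mR^3}\right]^{1/2} \qquad (23)$$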
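The zero point energy of the system is now

$$E_0 = \frac{1}{2}\hbar(\omega_s + \omega_a) \approx \hbar\omega_0\left[1 - \frac{1}{8}\left(\frac{2e^2}{m\omega_0^2R^3}\right)^2 + \cdots\right] \qquad (24)$$

or, the zero point energy is lowered by the dipole interaction by an amount

$$\Delta U \approx \frac{\hbar\omega_0}{8}\left(\frac{2e^2}{m\omega_0^2R^3}\right)^2 \qquad (25)$$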
which is typically a small fraction of an electron volt. This is called the Van der Waals interaction, known also as the London interaction or the induced dipole-dipole interaction. It is the principal attractive interaction in crystals of inert gases and also in crystals of many organic molecules. The interaction is a quantum effect, in the sense that $\Delta U \to 0$ as $\hbar \to 0$.
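The lowering of the zero point energy and its $1/R^6$ dependence can be checked numerically from the two mode frequencies. The sketch below works in units of $\hbar\omega_0$ with the dimensionless coupling $x = 2e^2/(m\omega_0^2R^3)$; expanding the two square roots gives a leading term of $-x^2/8$. The length scale `r0` at which x = 1 is a hypothetical parameter, as are the function names.

```python
import math

def zero_point_shift(x):
    """Exact zero-point energy shift in units of hbar*omega0, for |x| < 1.

    x = 2 e^2 / (m omega0^2 R^3) is the dimensionless dipole coupling.
    """
    return 0.5 * (math.sqrt(1.0 - x) + math.sqrt(1.0 + x)) - 1.0

def leading_order_shift(x):
    """Leading term of the expansion of the two square roots."""
    return -x * x / 8.0

def coupling(r, r0=1.0):
    """x(R) = (r0 / R)^3, with r0 a hypothetical length at which x = 1."""
    return (r0 / r) ** 3

for r in (3.0, 6.0, 12.0):
    x = coupling(r)
    print(r, zero_point_shift(x), leading_order_shift(x))
```

Doubling R reduces the shift by a factor close to 64, which is the $1/R^6$ behavior.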
Chapter 2: Crystal Structures and Symmetry Laue, Bravais December 28, 2001
Contents
1 Lattice Types and Symmetry
  1.1 Two-Dimensional Lattices
  1.2 Three-Dimensional Lattices
2 Point-Group Symmetry
  2.1 Reduction of Quantum Complexity
  2.2 Symmetry in Lattice Summations
  2.3 Group designations
3 Simple Crystal Structures
  3.1 FCC
  3.2 HCP
  3.3 BCC
A theory of the physical properties of solids would be practically impossible if the most stable structures of the elements were not regular crystal lattices. The N-body problem is reduced to manageable proportions by the existence of translational symmetry. This means that there exists a set of basis vectors (a, b, c) such that the atomic structure remains invariant under translations through any vector which is the sum of integral multiples of these vectors. As shown in Fig. 1 this means that one may go from any location in the lattice to an identical location by following a path composed of integral multiples of these vectors.
[Figure 1 graphic: a two-dimensional lattice with vectors a and b separated by the angle α.]
Figure 1: One may go from any location in the lattice to an identical location by following a path composed of integral multiples of the vectors a and b.
Thus, one may label the locations of the "atoms" [1] which compose the lattice with

$$\mathbf{r}_n = n_1\mathbf{a} + n_2\mathbf{b} + n_3\mathbf{c} \qquad (1)$$

where $n_1$, $n_2$, $n_3$ are integers. In this way we may construct any periodic structure.

[1] We will see that the basic building blocks of periodic structures can be more complicated than a single atom. For example in NaCl, the basic building block is composed of one Na and one Cl ion which is repeated in a cubic pattern to make the NaCl structure.
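Eq. 1 translates directly into a few lines of code. The sketch below enumerates the lattice points for integers in a finite range; the function name, the cutoff `n_max`, and the choice of a simple cubic example with unit lattice constant are assumptions made for illustration.

```python
def lattice_points(a, b, c, n_max):
    """All r_n = n1*a + n2*b + n3*c with each integer n_i in [-n_max, n_max]."""
    rng = range(-n_max, n_max + 1)
    return [tuple(n1 * a[k] + n2 * b[k] + n3 * c[k] for k in range(3))
            for n1 in rng for n2 in rng for n3 in rng]

# Simple cubic example with unit lattice constant (illustrative choice).
a, b, c = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
points = lattice_points(a, b, c, n_max=1)
print(len(points))
```

Any other set of basis vectors can be passed in the same way; a basis of several atoms per lattice point would be added by offsetting each returned point.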
1 Lattice Types and Symmetry

1.1 Two-Dimensional Lattices
These structures are classified according to their symmetry. For example, in 2d there are 5 distinct types. The lowest symmetry is an oblique lattice, of which the lattice shown in Fig. 1 is an example if $a \ne b$ and α is not a rational fraction of π. Notice that it is invariant only under rotation of π and 2π. Four other lattices of higher symmetry, shown in Fig. 2, are also possible, and are called special lattice types (square, rectangular, centered, hexagonal). A Bravais lattice is the common name for a distinct lattice type. The primitive cell is the parallelepiped (in 3d) formed by the primitive lattice vectors, which are defined as the lattice vectors which produce the primitive cell with the smallest volume ($\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})$). Notice that the primitive cell does not always capture the symmetry as well as a larger cell, as is the case with the centered lattice type. The centered lattice is special since it may also be considered as a lattice composed of a two-component basis on a rectangular unit cell (shown with a dashed rectangle).

[Figure 2 graphic: four panels with lattice vectors a and b. Square: |a| = |b|, γ = π/2. Rectangular: |a| ≠ |b|, γ = π/2. Centered. Hexagonal: |a| = |b|, γ = π/3.]
Figure 2: Two dimensional lattice types of higher symmetry. These have higher symmetry since some are invariant under rotations of 2π/3, or 2π/6, or 2π/4, etc. The centered lattice is special since it may also be considered as a lattice composed of a two-component basis, and a rectangular unit cell (shown with a dashed rectangle).
[Figure 3 graphic: a CuO2 plane, with the basis (one Cu and two O atoms) and the primitive cell with lattice vectors a and b indicated.]
Figure 3: A square lattice with a complex basis composed of one Cu and two O atoms (cf. cuprate high-temperature superconductors).
To account for more complex structures like molecular solids, salts, etc., one also allows each lattice point to have structure in the form of a basis. A good example of this in two dimensions is the CuO2 planes which characterize the cuprate high temperature superconductors (cf. Fig. 3). Here the basis is composed of two oxygens and one copper atom laid down on a simple square lattice with the Cu atom centered on the lattice points.

1.2 Three-Dimensional Lattices

[Figure 4 graphic: three panels with primitive lattice vectors.
  Cubic:               a = x,           b = y,            c = z
  Body Centered Cubic: a = (x+y−z)/2,   b = (−x+y+z)/2,   c = (x−y+z)/2
  Face Centered Cubic: a = (x+y)/2,     b = (x+z)/2,      c = (y+z)/2]
Figure 4: Three-dimensional cubic lattices. The primitive lattice vectors (a, b, c) are also indicated. Note that the primitive cells of the centered lattices are not the unit cells commonly drawn.
The situation in three-dimensional lattices can be more complicated. Here there are 14 lattice types (or Bravais lattices). For example there are 3 cubic structures, shown in Fig. 4. Note that the primitive cells of the centered lattices are not the unit cells commonly drawn. In addition, there are triclinic, 2 monoclinic, 4 orthorhombic ... Bravais lattices, for a total of 14 in three dimensions.
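One way to see that the primitive cells of the centered lattices differ from the cubes commonly drawn is to compute their volumes, $|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})|$, from the primitive vectors listed in Fig. 4. This is a minimal sketch with the cube edge set to 1; the helper name and the dictionary layout are hypothetical.

```python
def triple_product(a, b, c):
    """Signed volume a . (b x c) of the cell spanned by a, b, c."""
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    return sum(ai * xi for ai, xi in zip(a, cross))

# Primitive vectors from Fig. 4, in units of the cube edge.
PRIMITIVE_VECTORS = {
    "cubic": ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)),
    "bcc": ((0.5, 0.5, -0.5), (-0.5, 0.5, 0.5), (0.5, -0.5, 0.5)),
    "fcc": ((0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)),
}

volumes = {name: abs(triple_product(*vectors))
           for name, vectors in PRIMITIVE_VECTORS.items()}
for name, volume in volumes.items():
    print(name, volume)
```

The primitive cell fills the whole cube for the simple cubic lattice, half of it for bcc, and a quarter of it for fcc, so the conventional cube holds 1, 2, and 4 lattice points respectively.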
2 Point-Group Symmetry

The use of symmetry can greatly simplify a problem.

2.1 Reduction of Quantum Complexity
If a Hamiltonian is invariant under certain symmetry operations, then we may choose to classify the eigenstates as states of the symmetry operation, and H will not connect states of different symmetry. As an example, imagine that a symmetry operation R leaves H invariant, so that

$$RHR^{-1} = H \quad \text{then} \quad [H, R] = 0 \qquad (2)$$

Then if $|j\rangle$ are the eigenstates of R, $\sum_j |j\rangle\langle j|$ is a representation of the identity, and we expand $HR = RH$, and examine its elements

$$\sum_k \langle i|R|k\rangle\langle k|H|j\rangle = \sum_k \langle i|H|k\rangle\langle k|R|j\rangle . \qquad (3)$$

If we recall that $R_{ik} = \langle i|R|k\rangle = R_{ii}\delta_{ik}$ since $|k\rangle$ are eigenstates of R, then Eq. 3 becomes

$$(R_{ii} - R_{jj})\,H_{ij} = 0 . \qquad (4)$$
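Eq. 4 can be illustrated on a small example. The sketch below uses a hypothetical three-site chain with hopping −t, whose mirror operation R exchanges the two end sites; both the model and the names are assumptions introduced for illustration. In the basis of R eigenstates (two even, one odd), the matrix elements of H between states of different R eigenvalue vanish.

```python
import math

def matvec(m, v):
    """Matrix-vector product for small dense lists."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

eps, t = 0.0, 1.0                       # illustrative parameters
H = [[eps, -t, 0.0], [-t, eps, -t], [0.0, -t, eps]]
R = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]   # mirror: site 1 <-> 3

s = 1.0 / math.sqrt(2.0)
# Eigenstates of R with eigenvalues +1, +1, -1 respectively.
basis = [[s, 0.0, s], [0.0, 1.0, 0.0], [s, 0.0, -s]]

# H in the symmetry-adapted basis: a 2x2 even block and a 1x1 odd block.
H_sym = [[dot(u, matvec(H, v)) for v in basis] for u in basis]
for row in H_sym:
    print([round(x, 6) for x in row])
```

The even block can then be diagonalized on its own, independently of the odd state.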
So, $H_{ij} = 0$ if $R_{ii}$ and $R_{jj}$ are different eigenvalues of R. Thus, when the states are classified by their symmetry, the Hamiltonian matrix becomes block diagonal, so that each block may be separately diagonalized.

2.2 Symmetry in Lattice Summations
As another example, consider a Madelung sum in a two-dimensional square centered lattice (i.e. a 2d analog of NaCl). Here we want to calculate

$$\sum_{ij} \frac{\pm}{p_{ij}} . \qquad (5)$$

This may be done by a brute force sum over the lattice, i.e.

$$\lim_{n\to\infty} \sum_{i=-n}^{n}\sum_{j=-n}^{n} \frac{(-1)^{i+j}}{p_{ij}} . \qquad (6)$$
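The brute force sum of Eq. 6 can be sketched as follows for a finite cutoff n. It assumes that $p_{ij} = \sqrt{i^2 + j^2}$ in units of the nearest-neighbor spacing and that the site at the origin is excluded from the sum; the function name is hypothetical.

```python
import math

def madelung_brute_force(n):
    """Sum of (-1)^(i+j) / sqrt(i^2 + j^2) over the square -n..n, origin excluded."""
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            if i == 0 and j == 0:
                continue                      # skip the reference site itself
            sign = -1.0 if (i + j) % 2 else 1.0
            total += sign / math.hypot(i, j)
    return total

for n in (1, 2, 4, 8, 16):
    print(n, madelung_brute_force(n))
```

Each call visits all $(2n+1)^2 - 1$ sites, so the cost grows quadratically with the cutoff.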
Or, we may realize that the lattice has some well defined operations which leave it invariant. For example, this lattice is invariant under inversion (x, y) → (−x, −y), and reflections about the x axis, (x, y) → (x, −y), and the y axis, (x, y) → (−x, y), etc. For these reasons, the eight points highlighted in Fig. 5(a) all contribute an identical amount to the sum in Eq. 5. In fact all such interior points have a degeneracy of 8. Only special points like the point at the origin (which is unique) and points along the symmetry axes (the x = y diagonal and the x axis, each with a degeneracy of
[Figure 5 graphic: (a) eight equivalent lattice sites around the origin O; (b) the irreducible wedge with degeneracy factors 1, 4, and 8.]
Figure 5: Equivalent points and irreducible wedge for the 2-d square lattice. Due to the symmetry of the 2-d square lattice, the eight patterned lattice sites all contribute an identical amount to the Madelung sum calculated around the solid black site. Due to this symmetry, the sum can be reduced to the irreducible wedge (b) if the result at each point is multiplied by the degeneracy factors indicated.
four) have lower degeneracies. Thus, the sum may be restricted to the irreducible wedge, so long as the corresponding terms in the sum are multiplied by the appropriate degeneracy factors, shown in Fig. 5(b). An appropriate algorithm to calculate both the degeneracy table and the sum of Eq. 5 itself is:

c     deg is assumed to be an integer array deg(0:n,0:n);
c     sum and p are assumed to be real
c
c     First calculate the degeneracy table
c
      do i=1,n
        do j=0,i
          if (i.eq.j .or. j.eq.0) then
            deg(i,j)=4
          else
            deg(i,j)=8
          end if
        end do
      end do
      deg(0,0)=1
c
c     Now calculate the Madelung sum over the irreducible wedge
c
      sum=0.0
      do i=1,n
        do j=0,i
          p=sqrt(real(i**2+j**2))
          sum=sum+((-1)**(i+j))*deg(i,j)/p
        end do
      end do

By performing the sum in this way, we saved a factor of 8! In fact, in three dimensions, the savings are much greater, and real band structure calculations (e.g. those of F.J. Pinski) always make use of the point group symmetry to accelerate the calculations. The next question is then, could we do the same thing for a more complicated system (fcc in 3d?). To do this, we need some way of
classifying the symmetries of the system that we want to apply. Group theory allows us to learn the consequences of the symmetry in much more complicated systems. A group S is defined as a set {E, A, B, C, ···} which is closed under a binary operation ∗ (i.e. A ∗ B ∈ S) and for which:
• the binary operation is associative: (A ∗ B) ∗ C = A ∗ (B ∗ C)
• there exists an identity E ∈ S: E ∗ A = A ∗ E = A
• for each A ∈ S, there exists an A⁻¹ ∈ S: A ∗ A⁻¹ = A⁻¹ ∗ A = E
In the point group context, the operations are inversions, reflections, rotations, and improper rotations (inversion rotations). The binary operation is any combination of these; i.e. inversion followed by a rotation. In the example we just considered we may classify the operations that we have already used. Clearly we need 2!·2² = 8 of these (i.e. we can choose to take (x, y) to any permutation of (x, y) and choose either sign ± for each coordinate; in D dimensions, there would be D!·2^D operations). In Table 1, all of these operations are identified. The reflections are self-inverting, as are the inversion and the inversion-reflection, while the rotation and the inversion-rotation are inverses of each other. The set is clearly also closed. Also, since there are 8 operations, clearly the interior points in the irreducible wedge are 8-fold degenerate (w.r.t. the Madelung sum). This is always the case. Using the group operations one may always reduce the calculation to an irreducible wedge. Then the degeneracy of
Operation               Identification
(x, y) → (x, y)         identity
(x, y) → (x, −y)        reflection about x axis
(x, y) → (−x, y)        reflection about y axis
(x, y) → (−x, −y)       inversion
(x, y) → (y, x)         reflection about x = y
(x, y) → (y, −x)        rotation by π/2 about z
(x, y) → (−y, −x)       inversion-reflection
(x, y) → (−y, x)        inversion-rotation
Table 1: Point group symmetry operations for the two-dimensional square lattice. All of the group elements are self-inverting except for the sixth and eighth, which are inverses of each other.
each point in the wedge may be determined: since a group operation takes a point in the wedge either to itself or to an equivalent point in the lattice, and only the latter contributes to the degeneracy, the degeneracy of each point times the number of operations which leave the point invariant must equal the number of symmetry operations in the group. Thus, points with the lowest symmetry (invariant only under the identity) have a degeneracy equal to the group size.
2.3
Group designations
Point groups are usually designated by their Schönflies point group symbol, described in Table 2.

Symbol   Meaning
Cj       (j = 2, 3, 4, 6) j-fold rotation axis
Sj       j-fold rotation-inversion axis
Dj       j 2-fold rotation axes ⊥ to a j-fold principal rotation axis
T        4 three- and 3 two-fold rotation axes, as in a tetrahedron
O        4 three- and 3 four-fold rotation axes, as in an octahedron
Ci       a center of inversion
Cs       a mirror plane
Table 2: The Schönflies point group symbols. These give the classification according to rotation axes and principal mirror planes. In addition, there are suffixes for mirror planes (h: horizontal = perpendicular to the rotation axis, v: vertical = parallel to the main rotation axis in the plane, d: diagonal = parallel to the main rotation axis in the plane bisecting the two-fold rotation axes).

As an example, consider the previous example of a square lattice. It is invariant under
• rotations by π/2 about the axis ⊥ to the page
• mirror planes in the horizontal and vertical (x and y axes)
• mirror planes along the diagonal (x = y, x = −y).
The mirror planes are parallel to the main rotation axis, which is itself a 4-fold axis, and thus the group for the square lattice is C4v.
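The group properties and the degeneracy counting for the square lattice can be checked numerically. The sketch below is an illustration, not part of the original notes: it writes the eight operations of Table 1 as 2×2 integer matrices, verifies closure, identity, and inverses, counts the degeneracy of a point as the size of its orbit, and compares the wedge-restricted Madelung sum against the full sum. All function names are hypothetical, and the lattice spacing is taken as 1.

```python
import math

# The eight operations of Table 1 as 2x2 integer matrices acting on (x, y).
OPS = [
    ((1, 0), (0, 1)),    # identity
    ((1, 0), (0, -1)),   # reflection about x axis
    ((-1, 0), (0, 1)),   # reflection about y axis
    ((-1, 0), (0, -1)),  # inversion
    ((0, 1), (1, 0)),    # reflection about x = y
    ((0, 1), (-1, 0)),   # (x, y) -> (y, -x)
    ((0, -1), (-1, 0)),  # (x, y) -> (-y, -x)
    ((0, -1), (1, 0)),   # (x, y) -> (-y, x)
]
IDENTITY = OPS[0]


def apply(m, pt):
    x, y = pt
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)


def compose(a, b):
    """Matrix product a.b, i.e. apply b first and then a."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def is_group(ops):
    """Closure, identity, and inverses (associativity is automatic for matrices)."""
    elems = set(ops)
    closed = all(compose(a, b) in elems for a in elems for b in elems)
    inverses = all(any(compose(a, b) == IDENTITY for b in elems) for a in elems)
    return IDENTITY in elems and closed and inverses


def degeneracy(pt, ops=OPS):
    """Number of distinct lattice points equivalent to pt."""
    return len({apply(m, pt) for m in ops})


def stabilizer_size(pt, ops=OPS):
    """Number of operations that leave pt invariant."""
    return sum(1 for m in ops if apply(m, pt) == pt)


def wedge_sum(n):
    """Madelung sum restricted to the wedge 0 <= j <= i <= n, i >= 1."""
    total = 0.0
    for i in range(1, n + 1):
        for j in range(0, i + 1):
            sign = -1.0 if (i + j) % 2 else 1.0
            total += sign * degeneracy((i, j)) / math.hypot(i, j)
    return total


def full_sum(n):
    """The same sum taken over the whole square -n..n without symmetry."""
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            if (i, j) != (0, 0):
                sign = -1.0 if (i + j) % 2 else 1.0
                total += sign / math.hypot(i, j)
    return total


if __name__ == "__main__":
    print("forms a group:", is_group(OPS))
    for pt in [(0, 0), (1, 0), (1, 1), (2, 1)]:
        print(pt, degeneracy(pt), stabilizer_size(pt))
    print(wedge_sum(20), full_sum(20))
```

For every point the product of degeneracy and stabilizer size equals the group order, 8, which is the counting argument given in the text.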
3
Simple Crystal Structures
3.1
FCC
[Figure 6: Face Centered Cubic (FCC) structure. Panels: close-packed planes (left); principal lattice vectors a = (x+y)/2, b = (x+z)/2, c = (y+z)/2 (center); 3-fold and 4-fold axes (right).]
Figure 6: The Bravais lattice of a face-centered cubic (FCC) structure. As shown on the left, the fcc structure is composed of parallel planes of atoms, with each atom surrounded by 6 others in the plane. The total coordination number (the number of nearest neighbors) is 12. The principal lattice vectors (center) each have length 1/√2 of the unit cell length. The lattice has four 3-fold axes, and three 4-fold axes as shown on the right. In addition, each plane shown on the left has the principal 6-fold rotation axis ⊥ to it, but since the planes are shifted relative to one another, they do not share 6-fold axes. Thus, four-fold axes are the principal axes, and since they each have a perpendicular mirror plane, the point group for the fcc lattice is Oh.
The fcc structure is one of the close packed structures, appropriate for metals, with 12 nearest neighbors to each site (i.e., a coordination number of 12). The Bravais lattice for the fcc structure is shown in Fig. 6. It is composed of parallel planes of nearest neighbors (with six nearest neighbors to each site in the plane). Metals often form into an fcc structure. There are two reasons for this. First, as discussed before, the s and p bonding is typically very long-ranged and therefore rather non-directional. (In fact, when the p-bonding is short ranged, the bcc structure is favored.) This naturally leads to a close packed structure. Second, to whatever degree there is a d-electron overlap in the transition metals, they prefer the fcc structure. To see this, consider the d-orbitals shown in Fig. 7, centered on one of the face centers with the face in the xy plane. Each lobe of the dxy, dyz, and dxz orbitals points to a near neighbor. The xz, xy, yz triplet forms rather strong bonds. The dx²−y² and d3z²−r² orbitals do not, since they point away from the nearest neighbors. Thus the triplet of states forms strong bonding and anti-bonding bands, while the doublet states do not split. The system can gain energy by occupying the triplet bonding states, and thus many metals form fcc structures. According to Ashcroft and Mermin, these include Ca, Sr, Rh, Ir, Ni, Pd, Pt, Cu, Ag, Au, Al, and Pb. The fcc structure also explains why metals are ductile, since adjacent planes can slide past one another. In addition, each plane has a 6-fold rotation axis perpendicular to it, but since two adjacent planes are shifted relative to one another, the rotation axes perpendicular to the planes are 3-fold, with one along each main diagonal of the unit cell. There are also 4-fold axes through each center of the cube with mirror planes
[Figure 7: the five d-orbitals drawn on x, y, z axes: dxz, dxy, and dyz (top row); dx²−y² and d3z²−r² (bottom row).]
Figure 7: The d-orbitals. In an fcc structure, the triplet of orbitals shown on top all point towards nearest neighbors, whereas the bottom doublet point away. Thus the triplet can form bonding and antibonding states.
perpendicular to it. Thus the fcc point group is Oh. In fact, this same argument also applies to the bcc and sc lattices, so Oh is the appropriate group for all cubic Bravais lattices and is often called the cubic group.
3.2
HCP
As shown in Fig. 8, the Hexagonal Close Packed (HCP) structure is described by the D3h point group. The HCP structure (cf. Fig. 9) is similar to the FCC structure, but it does not correspond to a Bravais lattice (in fact there are five cubic point groups, but only three cubic Bravais lattices). As with fcc, its coordination number is 12. The simplest way to construct it is to form one hexagonal plane and then add two identical ones top and bottom. Thus its stacking is ABABAB... of the planes. This shifting of the planes clearly disrupts the d-orbital bonding advantage gained in fcc; nevertheless many metals form this structure, including Be, Mg, Sc, Y, La, Ti, Zr, Hf, Tc, Re, Ru, Os, Co, Zn, Cd, and Tl.
[Figure 8: the HCP lattice with its 3-fold axis, its mirror plane, and the three 2-fold axes in the plane.]
Figure 8: The symmetry of the HCP lattice. The principal rotation axis is perpendicular to the two-dimensional hexagonal lattices which are stacked to form the hcp structure. In addition, there is a mirror plane centered within one of these hexagonal 2d structures, which contains three 2-fold axes. Thus the point group is D3h.
3.3
BCC
Just like the simple cubic and fcc lattices, the body-centered cubic (BCC) lattice (cf. Fig. 4) has four 3-fold axes and three 4-fold axes, with mirror planes perpendicular to the 4-fold axes, and therefore belongs to the Oh point group. The body-centered cubic structure only has a coordination number of 8. Nevertheless, some metals form into a BCC lattice (Ba, V, Nb, Ta, W, and Mo; in addition, Cr and Fe have bcc phases).
[Figure 9: stacking sequences of close-packed planes, ABCABC... for FCC (left) and ABAB... for HCP (right); in HCP the C sites are left unfilled.]
Figure 9: A comparison of the FCC (left) and HCP (right) close packed structures. The HCP structure does not have a simple Bravais unit cell, but may be constructed by alternately stacking two-dimensional hexagonal lattices. In contrast, the FCC structure may be constructed by sequentially stacking three shifted hexagonal two-dimensional lattices.
Bonding of p-orbitals is ideal in a BCC lattice since the nnn lattice is simply composed of two interpenetrating cubic lattices. This structure allows the next-nearest neighbor p-orbitals to overlap more significantly than an fcc (or hcp) structure would. This increases the effective coordination number by including the next nearest neighbor shell in the bonding (cf. Fig. 10).
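The neighbor-shell counting behind this argument is easy to reproduce. The following is a minimal sketch, assuming the conventional cubic cell with coordinates measured in units of a/2 so that all squared distances are integers; the function name `shell_counts` and its parameters are hypothetical.

```python
from collections import Counter


def shell_counts(kind, n_shells=3, reach=4):
    """Return [(squared distance in units of (a/2)**2, number of sites), ...]
    for the first few neighbor shells around the origin."""
    counts = Counter()
    span = range(-reach, reach + 1)
    for i in span:
        for j in span:
            for k in span:
                if (i, j, k) == (0, 0, 0):
                    continue
                if kind == "fcc":
                    # fcc sites: integer triples with an even coordinate sum
                    on_lattice = (i + j + k) % 2 == 0
                elif kind == "bcc":
                    # bcc sites: all coordinates even or all coordinates odd
                    on_lattice = (i % 2 == j % 2 == k % 2)
                else:
                    raise ValueError("kind must be 'fcc' or 'bcc'")
                if on_lattice:
                    counts[i * i + j * j + k * k] += 1
    return sorted(counts.items())[:n_shells]


if __name__ == "__main__":
    print("fcc:", shell_counts("fcc"))
    print("bcc:", shell_counts("bcc"))
```

In these units the first two bcc shells lie at squared distances 3 and 4, much closer together than the first two fcc shells at 2 and 4, which is why 8 + 6 neighbors can take part in the bcc bonding.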
[Figure 10: plot of R(r) versus r (Å) for the 1s and 2s, 2p orbitals, with the neighbor shells marked for fcc (12, 6, 24 neighbors) and bcc (8, 6, 12 neighbors).]
Figure 10: Absolute square of the radial part of the electronic wavefunction. For the bcc lattice, both the 8 nearest and the 6 next nearest neighbors lie in a region of relatively high electronic density. This favors the formation of a bcc over an fcc lattice for some elemental metals. (This figure was lifted from I&L.)
Chapter 3: The Classical Theory of Crystal Diffraction Bragg December 29, 2001
Contents
1 Classical Theory of diffraction
2 Scattering from Periodic Structures
2.1 The Scattering Intensity for a Crystal
2.2 Bragg and Laue Conditions (Miller Indices)
2.3 The Structure Factor
2.3.1 The Structure Factor of Centered Lattices
2.3.2 Powdered x-ray Diffraction
In the last two chapters, we learned that solids generally form periodic structures of different symmetries and bases. However, given a solid material, how do we learn what its periodic structure is? Typically, this is done by diffraction, where we project a beam (of either particles or radiation) at a solid with a wavelength λ ≈ the characteristic length scale of the lattice (a ≈ twice the atomic or molecular radii of the constituents). Diffraction of waves and particles (with de Broglie wavelength λ = h/p) of λ ≈ a allows us to learn about the periodic structure of crystals. In a diffraction experiment one identifies Bragg peaks which originate from a coherent addition of scattering events in multiple planes within the bulk of the solid.
[Figure 1: incident waves or particles with wave vector k0 and wavelength λ ≈ |a1| or |a2| scatter through the angle θ from lattice planes of spacing d into the wave vector k, with K = k − k0; the path difference per plane is d sin(θ).]
Figure 1: Scattering of waves or particles with wavelength of roughly the same size as the lattice repeat distance allows us to learn about the lattice structure. Coherent addition of two particles or waves requires that 2d sin θ = λ (the Bragg condition), and yields a scattering maximum on a distant screen.
However, not all particles with de Broglie wavelength λ ≈ a will work for this application. For example, most charged particles cannot probe the bulk properties of the crystal, since they lose energy to the scatterer very quickly. Recall, from classical electrodynamics, that the rate at which particles of charge q, mass M, and velocity v lose energy to the electrons of charge e and mass m in the crystal is given roughly by
$$\frac{dE}{dx} \approx -\frac{4\pi n q^2 e^2}{m v^2} \ln\!\left(\frac{m \gamma^2 v^3}{q e \omega_0}\right) \sim \frac{q^2}{v^2} . \tag{1}$$
As an example, consider a non-relativistic electron scattering into a solid with a ≈ 2 Å. If we require that a = λ = h/p = 12.3 × 10⁻⁸ cm/√E when E is measured in electron volts, then E ≈ 50 eV. If we solve Eq. 1 for the distance δx over which the initial energy of the incident electron is lost, requiring that δE = E, when n ≈ 10²³/cm³ we find that δx ≈ 100 Å. Thus, if λ ≈ a, the electrons do not penetrate into the bulk of the sample (typically the first few hundred Å of most materials are oxidized, or distorted by surface reconstruction of the dangling bonds at the surface, etc.; see Fig. 2). Thus, electrons do not make a very good probe of the bulk properties of a crystal (instead, in a process called low-energy electron diffraction, LEED, they may be used to study the surface of especially clean samples, i.e. to study things like surface reconstruction of the dangling bonds, etc.). Thus, although charged particles (electrons or ion beams) are obviously easier to accelerate, they generally do not penetrate into the bulk and so tell us more about the surface properties of solids, which are often not representative of the bulk.
[Figure 2: an electron e⁻ with velocity v approaching a surface covered with oxygen.]
Figure 2: An electron about to scatter from a typical material. However, at the surface of the material, oxidation and surface reconstruction distort the lattice. If the electron scatters from this region, we cannot learn about the structure of the bulk.
Thus the particle of choice to determine bulk properties is the neutron, which is charge neutral and scatters only from the nuclei. Radiation is often also used. Here the choice is only a matter of the wavelength used. X-rays are chosen since then λ ≈ a.
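To get a feeling for the energy scales involved, the requirement λ ≈ a can be turned into an energy for each probe. The sketch below is an illustration under stated assumptions: it takes a = 2 Å, uses the non-relativistic relation E = h²/(2Mλ²) for electrons and neutrons and E = hc/λ for photons, and uses standard values of the physical constants; the function names are hypothetical.

```python
# Physical constants in SI units.
H = 6.62607015e-34        # Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
EV = 1.602176634e-19      # joules per electron volt
M_ELECTRON = 9.1093837015e-31   # kg
M_NEUTRON = 1.67492749804e-27   # kg


def kinetic_energy_ev(wavelength_m, mass_kg):
    """Non-relativistic kinetic energy of a particle with de Broglie wavelength lambda."""
    momentum = H / wavelength_m
    return momentum ** 2 / (2.0 * mass_kg) / EV


def photon_energy_ev(wavelength_m):
    """Energy of a photon of the given wavelength."""
    return H * C / wavelength_m / EV


if __name__ == "__main__":
    lam = 2.0e-10  # assumed lattice scale a = 2 Angstrom
    print("electron: %.1f eV" % kinetic_energy_ev(lam, M_ELECTRON))
    print("neutron:  %.4f eV" % kinetic_energy_ev(lam, M_NEUTRON))
    print("x-ray:    %.0f eV" % photon_energy_ev(lam))
```

With these inputs the electron energy comes out at a few tens of eV, the same order as the E ≈ 50 eV quoted above, while the neutron energy is of order tens of meV and the photon energy is several keV.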
1
Classical Theory of diffraction
In this theory of diffraction we will be making three basic assumptions.
1. That the operator which describes the coupling of the target to the scattered "object" (in this case the operator is the density) commutes with the Hamiltonian. Thus, this will be a classical theory.
2. We will assume some form of Huygens' principle: that every irradiated point of the target will serve as a secondary source of spherical wavelets of the same frequency as the source, and that the amplitude of the diffracted wave is the sum of the wavelets, considering their amplitudes and relative phases. (For light, this is equivalent to assuming that it is unpolarized, and that the diffraction pattern varies quickly with scattering angle θ so that the angular dependence of an unpolarized dipole, 1 + (cos θ)², may be neglected.)
3. We will assume that the resulting spherical waves are not scattered again. In the fully quantum theory which we will derive later for neutron scattering, this will correspond to approximating the scattering rate by Fermi's golden rule (first-order Born approximation).
The basic setup of a scattering experiment is sketched in Fig. 3. Generally, we will also assume that |R| ≫ |r|, so that we may always approximate the amplitude of the incident waves on the target as plane waves,
$$A_P = A_O\, e^{i(\mathbf{k}_0\cdot(\mathbf{R}+\mathbf{r}) - \omega_0 t)} . \tag{2}$$
[Figure 3: source, target, and observer or screen, with the vectors R, r, R′, and R′ − r and the points Q, P, and B labeled.]
Figure 3: Basic setup of a scattering experiment.
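Then, consistent with the second assumption above,
$$A_B(\mathbf{R}') \propto \int d^3r\, A_P\, \rho(\mathbf{r})\, \frac{e^{i\mathbf{k}\cdot(\mathbf{R}'-\mathbf{r})}}{|\mathbf{R}'-\mathbf{r}|} , \tag{3}$$
which, after substitution of Eq. 2, becomes
$$A_B(\mathbf{R}') \propto A_O\, e^{i(\mathbf{k}_0\cdot\mathbf{R} + \mathbf{k}\cdot\mathbf{R}' - \omega_0 t)} \int d^3r\, \rho(\mathbf{r})\, \frac{e^{-i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}}}{|\mathbf{R}'-\mathbf{r}|} . \tag{4}$$
At very large R′ (i.e. in the radiation or far zone)
$$A_B(\mathbf{R}') \propto \frac{A_O\, e^{i(\mathbf{k}_0\cdot\mathbf{R} + \mathbf{k}\cdot\mathbf{R}' - \omega_0 t)}}{R'} \int d^3r\, \rho(\mathbf{r})\, e^{-i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} . \tag{5}$$
Or, in terms of the scattered intensity $I_B \propto |A_B|^2$,
$$I_B \propto \frac{|A_O|^2}{R'^2} \left| \int d^3r\, \rho(\mathbf{r})\, e^{-i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} \right|^2 . \tag{6}$$
The scattering intensity is just the absolute square of the Fourier transform of the density of scatterers. If we let K = k − k0 (cf. Fig. 1), then we get
$$I_B(\mathbf{K}) \propto \frac{|A_O|^2}{R'^2} \left| \int d^3r\, \rho(\mathbf{r})\, e^{-i\mathbf{K}\cdot\mathbf{r}} \right|^2 = \frac{|A_O|^2}{R'^2}\, |\rho(\mathbf{K})|^2 . \tag{7}$$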
From the associated Fourier uncertainty principle ∆k∆x ≈ π, we can see that the resolution of smaller structures requires larger values of K (some combination of large scattering angles and short wavelength of the incident light), consistent with the discussion at the beginning of this chapter.
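Eq. 7 can be tried out on the simplest possible target. The sketch below is an illustration only: it assumes a one-dimensional chain of N identical point scatterers with spacing a, so that the integral over ρ reduces to a sum of phase factors, and drops the prefactor |A_O|²/R′². The function name `intensity` is hypothetical.

```python
import cmath
import math


def intensity(K, positions):
    """|sum_n exp(-i K x_n)|**2 for point scatterers at the given positions."""
    amplitude = sum(cmath.exp(-1j * K * x) for x in positions)
    return abs(amplitude) ** 2


if __name__ == "__main__":
    a = 1.0
    N = 20
    chain = [n * a for n in range(N)]
    for K in (0.5 * math.pi / a, math.pi / a, 2.0 * math.pi / a):
        print("K = %.4f  I = %.4f" % (K, intensity(K, chain)))
```

The intensity reaches N² when K is a multiple of 2π/a, where all the phase factors add coherently, and is much smaller elsewhere.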
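[Figure 4: sketch of a density ρ(r) and the corresponding scattering intensity I(K).]
Figure 4: Since the measured scattering intensity I(K) ∝ |ρ(K)|², the complex phase information is lost. Thus, a scattering experiment does not provide enough information to invert the transform $\rho(\mathbf{r}) = \int \frac{d^3K}{(2\pi)^3}\, \rho(\mathbf{K})\, e^{+i\mathbf{K}\cdot\mathbf{r}}$.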
In experiments the intensity I as a function of the scattering vector K is generally measured. In principle this is incomplete information. In order to invert the Fourier transform (which is a unitary transformation) we would need to know both the real and imaginary parts of
$$\rho(\mathbf{K}) = \int d^3r\, \rho(\mathbf{r})\, e^{-i\mathbf{K}\cdot\mathbf{r}} . \tag{8}$$
Of course, if the experiment just measures I ∝ |ρ(K)|², then we lose the relative phase information (i.e. $\rho(\mathbf{K}) = \rho_K e^{i\theta_K}$ so that $I \propto |\rho_K|^2$, and the phase information $\theta_K$ is lost). So, from a complete experiment, measuring I(K) for all scattering angles, we do not have enough information to get a unique ρ(r) by inverting the Fourier transform. Instead, experimentalists analyze their data by proposing a feasible model structure (i.e. a ρ(r) corresponding to some guess of which one of the 14 Bravais lattices, and which basis, applies), Fourier transforming this, and comparing it to the experimental data. The parameters of the model are then adjusted to obtain a best fit.
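The loss of phase information can be made concrete with a small discrete example. The sketch below is illustrative only: it assumes a density sampled on 8 points of a periodic one-dimensional cell, uses a hand-written discrete Fourier transform in place of Eq. 8, and compares a density with its mirror image. The sample values and function names are made up for the illustration.

```python
import cmath


def dft(rho):
    """Discrete analog of Eq. 8 on a periodic grid of len(rho) points."""
    n = len(rho)
    return [
        sum(rho[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


def intensities(rho):
    """What a scattering experiment would record: |rho(K)|**2 for each K."""
    return [abs(c) ** 2 for c in dft(rho)]


if __name__ == "__main__":
    rho_a = [3.0, 1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0]
    rho_b = list(reversed(rho_a))  # mirror image of rho_a
    print("densities differ:", rho_a != rho_b)
    for i_a, i_b in zip(intensities(rho_a), intensities(rho_b)):
        print("%.6f  %.6f" % (i_a, i_b))
```

The two densities are different, yet every recorded intensity agrees, so the data alone cannot tell them apart. This is why a model structure has to be proposed and fitted.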
2
Scattering from Periodic Structures
Given this procedure, it is important to study the scattering pattern that would arise for various periodic structures. The density in a periodic crystal must have the same periodicity as the crystal,
$$\rho(\mathbf{r} + \mathbf{r}_n) = \rho(\mathbf{r}) \quad \text{where} \quad \mathbf{r}_n = n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3 \tag{9}$$
for integer n1, n2, n3. This also implies that the Fourier coefficients of ρ will be chosen from a discrete set. For example, consider a 1-d periodic structure with lattice constant a,
$$\rho(x + na) = \rho(x) . \tag{10}$$
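Then we must choose the $G_n$ in
$$\rho(x) = \sum_n \rho_n\, e^{iG_n x} , \tag{11}$$
so that
$$\rho(x + ma) = \sum_n \rho_n\, e^{iG_n (x+ma)} = \sum_n \rho_n\, e^{iG_n x}\, e^{iG_n ma} = \sum_n \rho_n\, e^{iG_n x} = \rho(x) , \tag{12}$$
i.e. $e^{iG_n ma} = 1$, or $G_n = 2n\pi/a$ where n is an integer. This may be easily generalized to three dimensions, for which
$$\rho(\mathbf{r}) = \sum_{\mathbf{G}} \rho_{\mathbf{G}}\, e^{i\mathbf{G}\cdot\mathbf{r}} , \tag{13}$$
where the condition of periodicity ρ(r + rn) = ρ(r) means that
$$\mathbf{G}\cdot\mathbf{r}_n = 2\pi m , \quad m \in \mathbb{Z} , \tag{14}$$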
where Z is the group of integers (under addition). Now, let's consider G in some three-dimensional space and decompose it in terms of three independent basis vectors, for which any two are not parallel and the set is not coplanar,
$$\mathbf{G} = h\,\mathbf{g}_1 + k\,\mathbf{g}_2 + l\,\mathbf{g}_3 . \tag{15}$$
The condition of periodicity then requires that
$$(h\,\mathbf{g}_1 + k\,\mathbf{g}_2 + l\,\mathbf{g}_3)\cdot n_1 \mathbf{a}_1 = 2\pi m , \quad m \in \mathbb{Z} , \tag{16}$$
with similar conditions for the other principal lattice vectors a2 and a3. Since g1, g2 and g3 are not parallel or coplanar, the only way to satisfy this constraint for arbitrary n1 is for
$$\mathbf{g}_1\cdot\mathbf{a}_1 = 2\pi , \qquad \mathbf{g}_2\cdot\mathbf{a}_1 = \mathbf{g}_3\cdot\mathbf{a}_1 = 0 , \tag{17}$$
or some other permutation of 1, 2 and 3, which would just amount to a renaming of g1, g2, and g3. The set (g1, g2, g3) is called the basis set for the reciprocal lattice. The basis vectors may be constructed from
$$\mathbf{g}_1 = 2\pi\, \frac{\mathbf{a}_2 \times \mathbf{a}_3}{\mathbf{a}_1\cdot(\mathbf{a}_2 \times \mathbf{a}_3)} \quad \text{plus cyclic permutations.} \tag{18}$$
It is easy to see that this construction satisfies Eq. 17, and that there is a one-to-one correspondence between the lattice and its reciprocal lattice. So, the reciprocal lattice belongs to the same point group as the real-space lattice (see footnote 1).
2.1
The Scattering Intensity for a Crystal
Let's now apply this form for the density
$$\rho(\mathbf{r}) = \sum_{\mathbf{G}} \rho_{\mathbf{G}}\, e^{i\mathbf{G}\cdot\mathbf{r}} \tag{19}$$
to our formula for the scattering intensity,
$$I_B(\mathbf{K}) \propto \frac{|A_O|^2}{R'^2} \left| \int d^3r \sum_{\mathbf{G}} \rho_{\mathbf{G}}\, e^{-i(\mathbf{K}-\mathbf{G})\cdot\mathbf{r}} \right|^2 . \tag{20}$$
The integral above is simply
$$V\,\delta_{\mathbf{G},\mathbf{K}} = \begin{cases} V & \text{if } \mathbf{G} = \mathbf{K} \\ 0 & \text{if } \mathbf{G} \neq \mathbf{K} \end{cases} , \tag{21}$$
where V is the lattice volume, so
$$I_B(\mathbf{K}) \propto \frac{|A_O|^2}{R'^2}\, |\rho_{\mathbf{G}}|^2\, V^2\, \delta_{\mathbf{G},\mathbf{K}} . \tag{22}$$
This is called the Laue condition for scattering. The fact that this is proportional to V² rather than V just indicates that the diffraction spots, in this approximation, are infinitely bright (for a sample in the thermodynamic limit). Of course, this is because the spots are infinitely narrow or fine. When real broadening is taken into account, $I_B(\mathbf{K}) \propto V$ as expected. Then, as G = hg1 + kg2 + lg3, we can label the spots with the three integers (h, k, l), or
$$I_{hkl} \propto |\rho_{hkl}|^2 . \tag{23}$$
[Footnote 1: One should note that this does not mean that the reciprocal lattice must have the same Bravais lattice structure as the real lattice. For example, the reciprocal of an fcc lattice is bcc and vice versa. This is consistent with the statement that the reciprocal lattice belongs to the same point group as the real-space lattice, since fcc and bcc share the Oh point group.]
Traditionally, negative integers are labeled with an overbar, so −h → h̄. Then, as ρ(r) is real, $\rho_{\mathbf{G}} = \rho^*_{-\mathbf{G}}$, or
$$I_{hkl} = I_{\bar{h}\bar{k}\bar{l}} \quad \text{(Friedel's rule).} \tag{24}$$
Most scattering experiments are done with either a rotating crystal, or a powder made up of many crystallites. For these experiments, Friedel's rule has two main consequences:
• For every spot at k − k0 = G, there will be one at k′ − k0 = −G. Thus, for example, if we scatter from a crystal with a 3-fold symmetry axis, we will get a six-fold scattering pattern. Clearly, both spots can satisfy the Laue condition with |k| = |k0| only if the crystal is rotated by π about some axis perpendicular to the three-fold axis. In fact, single-crystal experiments are usually done either by mounting the crystal on a precession stage (essentially like an automotive universal joint, with the drive shaft held fixed, and the joint rotated over all angles), or by holding the crystal fixed and moving the source and diffraction screen around the crystal.
• The scattering pattern always has an inversion center, G → −G, even when none is present in the target!
2.2
Bragg and Laue Conditions (Miller Indices)
Above, we derived the Laue condition for scattering; however, we began this chapter by reviewing the Bragg condition for scattering from adjacent planes. In this subsection we will show that, as expected, these are the same condition. Consider the real-space lattice shown in Fig. 5. Highlighted by the solid lines are the parallel planes formed by (1, 2, 2) translations along the principal lattice vectors (a1, a2, a3), respectively. Typically these integers are labeled by (m, n, o); however, the plane is not typically labeled as the (m, n, o) plane. Rather it is labeled with the inverses
$$h' = 1/m , \qquad k' = 1/n , \qquad l' = 1/o . \tag{25}$$
Since these typically are not integers, they are multiplied by p, the smallest integer such that
$$p\,(h', k', l') = (h, k, l) \in \mathbb{Z} . \tag{26}$$
In this case, p = 2, and the plane is labeled as the (2, 1, 1) plane.
[Figure 5: the h′k′l′ plane and the (211) plane cutting the lattice vectors a1, a2, a3 from the origin O, with the spacing d_h′k′l′, the reciprocal lattice vector G_hkl, and the angle γ between a1 and G_hkl.]
Figure 5: Miller indices identification of planes in a lattice. Highlighted by the solid lines are the parallel planes formed by (1, 2, 2) translations along the principal lattice vectors (a1, a2, a3), respectively. Typically these integers are labeled by (m, n, o); however, the plane is not typically labeled as the (m, n, o) plane. Rather it is labeled with the inverses h′ = 1/m, k′ = 1/n, l′ = 1/o. Since these typically are not integers, they are multiplied by p, the smallest integer such that p(h′, k′, l′) = (h, k, l) ∈ Z. In this case, p = 2, and the plane is labeled as the (2, 1, 1) plane. Note that the plane formed by (2, 4, 4) translations along the principal lattice vectors is parallel to the (2, 1, 1) plane.
One may show that the reciprocal lattice vector G_hkl lies perpendicular to the (h, k, l) plane, and that the distance between adjacent parallel planes is d_hkl = 2π/G_hkl. To show this, note that the plane may be defined
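by two non-parallel vectors v1 and v2 within the plane. Let
$$\mathbf{v}_1 = m\,\mathbf{a}_1 - n\,\mathbf{a}_2 = \mathbf{a}_1/h' - \mathbf{a}_2/k' , \qquad \mathbf{v}_2 = o\,\mathbf{a}_3 - n\,\mathbf{a}_2 = \mathbf{a}_3/l' - \mathbf{a}_2/k' . \tag{27}$$
Clearly the cross product v1 × v2 is perpendicular to the (h, k, l) plane,
$$\mathbf{v}_1 \times \mathbf{v}_2 = -\frac{\mathbf{a}_3 \times \mathbf{a}_1}{h' l'} - \frac{\mathbf{a}_1 \times \mathbf{a}_2}{h' k'} - \frac{\mathbf{a}_2 \times \mathbf{a}_3}{k' l'} . \tag{28}$$
If we multiply this by $-2\pi h' k' l' / \mathbf{a}_1\cdot(\mathbf{a}_2 \times \mathbf{a}_3)$, we get
$$\frac{-2\pi h' k' l'}{\mathbf{a}_1\cdot(\mathbf{a}_2 \times \mathbf{a}_3)}\, \mathbf{v}_1 \times \mathbf{v}_2 = \frac{2\pi p}{p} \left[ k'\, \frac{\mathbf{a}_3 \times \mathbf{a}_1}{\mathbf{a}_1\cdot(\mathbf{a}_2 \times \mathbf{a}_3)} + l'\, \frac{\mathbf{a}_1 \times \mathbf{a}_2}{\mathbf{a}_1\cdot(\mathbf{a}_2 \times \mathbf{a}_3)} + h'\, \frac{\mathbf{a}_2 \times \mathbf{a}_3}{\mathbf{a}_1\cdot(\mathbf{a}_2 \times \mathbf{a}_3)} \right] = \mathbf{G}_{hkl}/p . \tag{29}$$
Thus, G_hkl is ⊥ to the (h, k, l) plane. Now, if γ is the angle between a1 and G_hkl, then the distance d_h′k′l′ from the origin to the (h′, k′, l′) plane is given by
$$d_{h'k'l'} = m\,|\mathbf{a}_1| \cos\gamma = \frac{|\mathbf{a}_1|}{h'}\, \frac{\mathbf{a}_1\cdot\mathbf{G}_{hkl}}{|\mathbf{a}_1|\,|\mathbf{G}_{hkl}|} = \frac{2\pi h}{h'\, G_{hkl}} = \frac{2\pi p}{G_{hkl}} . \tag{30}$$
Then, as there are p planes in this distance (cf. Fig. 5), the distance to the nearest one is
$$d_{hkl} = d_{h'k'l'}/p = 2\pi/G_{hkl} . \tag{31}$$
With this information, we can reexamine the Laue scattering condition K = k − k0 = G_hkl, and show that it is equivalent to the more intuitive Bragg condition. Part of the Laue condition is that |K| = K = |k − k0| = G_hkl. Now
$$K = 2 k_0 \sin\theta = \frac{4\pi \sin\theta}{\lambda} \quad \text{and} \quad G_{hkl} = 2\pi/d_{hkl} ; \tag{32}$$
thus, the Laue condition implies that
$$1/d_{hkl} = 2\sin\theta/\lambda \quad \text{or} \quad \lambda = 2 d_{hkl} \sin\theta , \tag{33}$$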
which is the Bragg condition. Note that the Laue condition is more restrictive than the Bragg condition; it requires that both the magnitude and the direction of G and K be the same. However, there is no inconsistency here, since whenever we apply the Bragg condition, we assume that the plane defined by k and k0 is perpendicular to the scattering planes (cf. Fig. 6).
[Figure 6: the Laue condition (in reciprocal space), K = k − k0 = G_hkl, alongside the Bragg condition (in real space), 2 d_hkl sin(θ) = λ, for scattering from the hkl planes.]
Figure 6: Comparison of the Bragg, λ = 2d_hkl sin θ, and Laue, G_hkl = K_hkl, conditions for scattering.
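The chain from Eq. 18 to Eq. 33 can be checked numerically for a concrete lattice. The sketch below is an illustration under assumptions: it uses a simple cubic lattice with an arbitrary lattice constant of 4 Å and an arbitrary wavelength of 1.54 Å, and all function names are hypothetical.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def reciprocal_basis(a1, a2, a3):
    """Eq. 18: g1 = 2 pi (a2 x a3) / (a1 . (a2 x a3)) plus cyclic permutations."""
    scale = 2.0 * math.pi / dot(a1, cross(a2, a3))
    g1 = tuple(scale * c for c in cross(a2, a3))
    g2 = tuple(scale * c for c in cross(a3, a1))
    g3 = tuple(scale * c for c in cross(a1, a2))
    return g1, g2, g3


def g_vector(hkl, basis):
    """G_hkl = h g1 + k g2 + l g3."""
    return tuple(sum(n * g[i] for n, g in zip(hkl, basis)) for i in range(3))


def plane_spacing(hkl, basis):
    """Eq. 31: d_hkl = 2 pi / |G_hkl|."""
    g = g_vector(hkl, basis)
    return 2.0 * math.pi / math.sqrt(dot(g, g))


def bragg_angle(hkl, basis, wavelength):
    """Eq. 33 solved for theta (radians); requires wavelength <= 2 d_hkl."""
    return math.asin(wavelength / (2.0 * plane_spacing(hkl, basis)))


if __name__ == "__main__":
    a = 4.0  # assumed cubic lattice constant, Angstrom
    a1, a2, a3 = (a, 0.0, 0.0), (0.0, a, 0.0), (0.0, 0.0, a)
    basis = reciprocal_basis(a1, a2, a3)
    hkl = (2, 1, 1)
    G = g_vector(hkl, basis)
    # In-plane vectors of Eq. 27 for (m, n, o) = (1, 2, 2).
    v1 = tuple(1 * p - 2 * q for p, q in zip(a1, a2))
    v2 = tuple(2 * p - 2 * q for p, q in zip(a3, a2))
    print("G.v1 =", dot(G, v1), " G.v2 =", dot(G, v2))
    print("d_211 =", plane_spacing(hkl, basis))
    print("theta (deg) =", math.degrees(bragg_angle(hkl, basis, 1.54)))
```

For the cubic case the spacing reduces to a/√(h² + k² + l²), and the angle obtained from the Bragg condition satisfies 2k0 sin θ = G_hkl, which is the magnitude part of the Laue condition.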
[Figure 7: a CuO2 lattice and a body-centered cubic lattice, with the basis and the vectors r, r_n, and r_α labeled.]
Figure 7: Examples of lattices with non-trivial bases. The CuO2 lattice (left) is characteristic of the cuprate high-temperature superconductors. It has a basis composed of one Cu and two O atoms imposed on a simple cubic lattice. The BCC lattice (right) can be considered as a cubic lattice with a basis including an atom at the corner and one at the center of the cube.
2.3
The Structure Factor
Thus far, we have concentrated on the diffraction pattern for a periodic lattice, ignoring the fine structure of the basis. Examples of non-trivial molecular bases are shown in Fig. 7. Clearly the basis structure will affect the scattering (Fig. 8). For example, there will be interference from the scattering off of the Cu and two O in each cell. In fact, even in the simplest case of a single-element basis composed of a spherical atom of finite extent, scattering from one side of the atom will interfere with that from the other. In each case, the structure of the basis will change the scattering pattern due to interference of the waves scattering from different elements of the basis. The structure factors account for these interference effects. The information about this interference, and the basis structure, is contained in the atomic scattering factor f and the structure factor S.
[Figure 8: rays scattering from the Cu and O atoms of a basis, and from different points on a single atom of radius R.]
Figure 8: Rays scattered from different elements of the basis, and from different places on the atom, interfere, giving the scattered intensity additional structure described by the structure factor S and the atomic form factor f, respectively.
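To show this, reconsider the scattering formula
$$I_{hkl} \propto |\rho_{hkl}|^2 . \tag{34}$$
The Fourier transform of the density may be decomposed into an integral over the basis cell, and a sum over all such cells,
$$\rho_{hkl} = \frac{1}{V} \int d^3r\, \rho(\mathbf{r})\, e^{-i\mathbf{G}_{hkl}\cdot\mathbf{r}} = \frac{1}{V} \sum_{\text{cells}} \int_{\text{cell}} d^3r\, \rho(\mathbf{r})\, e^{-i\mathbf{G}_{hkl}\cdot\mathbf{r}} = \frac{1}{V} \sum_{n_1,n_2,n_3} \int_{\text{cell}} d^3r\, \rho(\mathbf{r})\, e^{-i\mathbf{G}_{hkl}\cdot(\mathbf{r}+\mathbf{r}_n)} , \tag{35}$$
where the location of each cell is given by rn = n1 a1 + n2 a2 + n3 a3. Then, since $\mathbf{G}_{hkl}\cdot\mathbf{r}_n = 2\pi m$, m ∈ Z,
$$\rho_{hkl} = \frac{N}{V} \int_{\text{cell}} d^3r\, \rho(\mathbf{r})\, e^{-i\mathbf{G}_{hkl}\cdot\mathbf{r}} , \tag{36}$$
where N is the number of cells and N/V = 1/Vc, with Vc the volume of a cell. This integral may be further subdivided into an integral over the atomic density of each atom in the unit cell. If α labels the different elements of the basis, each with density ρα(r′),
$$\rho_{hkl} = \frac{1}{V_c} \sum_\alpha e^{-i\mathbf{G}_{hkl}\cdot\mathbf{r}_\alpha} \int d^3r'\, \rho_\alpha(\mathbf{r}')\, e^{-i\mathbf{G}_{hkl}\cdot\mathbf{r}'} . \tag{37}$$
The atomic scattering factor f and the structure factor may then be defined as parts of this integral,
$$f_\alpha = \int d^3r'\, \rho_\alpha(\mathbf{r}')\, e^{-i\mathbf{G}_{hkl}\cdot\mathbf{r}'} , \tag{38}$$
so
$$\rho_{hkl} = \frac{1}{V_c} \sum_\alpha e^{-i\mathbf{G}_{hkl}\cdot\mathbf{r}_\alpha}\, f_\alpha = \frac{S_{hkl}}{V_c} . \tag{39}$$
$f_\alpha$ describes the interference of spherical waves emanating from different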
points within the atom, and $S_{hkl}$ is called the structure factor. Note that for lattices with an elemental basis S = f. If we imagine the crystal to be made up of isolated atoms like that shown on the right in Fig. 8 (which is perhaps accurate for an ionic crystal) then, since the atomic charge density is spherically symmetric about the atom,
$$f_\alpha = \int d^3r'\, \rho_\alpha(r')\, e^{-i\mathbf{G}_{hkl}\cdot\mathbf{r}'} = -\int r'^2\, dr'\, d(\cos\theta)\, d\phi\; \rho_\alpha(r')\, e^{-iG_{hkl} r' \cos\theta} = 4\pi \int r'^2\, dr'\, \rho_\alpha(r')\, \frac{\sin G_{hkl} r'}{G_{hkl} r'} . \tag{40}$$
As an example, consider a spherical atom of charge Ze⁻, radius R, and charge density
$$\rho_\alpha(r') = \frac{3Z}{4\pi R^3}\, \theta(R - r') ; \tag{41}$$
then
$$f_\alpha = \frac{3Z}{R^3} \int_0^R r'^2\, dr'\, \frac{\sin G_{hkl} r'}{G_{hkl} r'} = \frac{3Z}{(G_{hkl} R)^3} \left( \sin(G_{hkl} R) - (G_{hkl} R) \cos(G_{hkl} R) \right) . \tag{42}$$
This has zeroes whenever tan(G_hkl R) = G_hkl R and a maximum when G_hkl = 0, or, in terms of the scattering angle, 2k0 sin θ = G_hkl, when θ = 0, π. In fact, we have that fα(θ = 0) = Z. This is true in general, since
$$f_\alpha(\theta = 0) = f_\alpha(G_{hkl} = 0) = \int d^3r'\, \rho_\alpha(\mathbf{r}') = Z . \tag{43}$$
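The closed form of Eq. 42 and the forward-scattering limit of Eq. 43 can be cross-checked against a direct numerical evaluation of the radial integral in Eq. 40. The sketch below is an illustration only: it assumes the uniform-sphere density of Eq. 41, uses a simple midpoint rule, and the function names and the parameter values (Z = 10, R = 1) are made up for the example.

```python
import math


def form_factor_closed(G, R, Z):
    """Eq. 42 for a uniformly charged sphere; a short series is used near G = 0."""
    x = G * R
    if abs(x) < 1e-4:
        return Z * (1.0 - x * x / 10.0)
    return 3.0 * Z * (math.sin(x) - x * math.cos(x)) / x ** 3


def form_factor_numeric(G, R, Z, steps=4000):
    """Midpoint-rule evaluation of 4 pi int_0^R r^2 rho(r) sin(G r)/(G r) dr."""
    rho = 3.0 * Z / (4.0 * math.pi * R ** 3)  # uniform density of Eq. 41
    dr = R / steps
    total = 0.0
    for n in range(steps):
        r = (n + 0.5) * dr
        x = G * r
        sinc = math.sin(x) / x if x > 1e-12 else 1.0
        total += r * r * rho * sinc * dr
    return 4.0 * math.pi * total


if __name__ == "__main__":
    Z, R = 10.0, 1.0
    for G in (0.0, 1.3, 3.0, 4.4934):
        print(G, form_factor_closed(G, R, Z), form_factor_numeric(G, R, Z))
```

At G = 0 both evaluations return Z, and near G R ≈ 4.4934, the first nonzero solution of tan x = x, both are close to zero.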
Thus, for x-ray scattering, I ∝ Z². For this reason, it is often difficult to see small-Z atoms with x-ray scattering.
2.3.1
The Structure Factor of Centered Lattices
Now let's look at the structure factor. An especially interesting situation occurs for centered lattices. We can consider a BCC lattice as a cubic unit cell, |a1| = |a2| = |a3|, a1 ⊥ a2 ⊥ a3, and a two-atom basis rα = 0.5α(a1 + a2 + a3), α = 0, 1. Now if both sites in the unit cell (α = 0, 1) contain the same atom with the same scattering factor f, then
$$\mathbf{r}_\alpha\cdot\mathbf{G}_{hkl} = 0.5\,\alpha\,(\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3)\cdot(h\,\mathbf{g}_1 + k\,\mathbf{g}_2 + l\,\mathbf{g}_3) = \pi\alpha\,(h + k + l) , \tag{44}$$
so that
$$S_{hkl} = \sum_{\alpha=0,1} f\, e^{-i\pi\alpha(h+k+l)} = f\left(1 + e^{-i\pi(h+k+l)}\right) = \begin{cases} 0 & \text{if } h+k+l \text{ is odd} \\ 2f & \text{if } h+k+l \text{ is even} \end{cases} . \tag{45}$$
This lattice gives rise to extinctions (lines which appear in a cubic lattice, but which are missing here)! If both atoms of the basis are identical (like bcc iron), then the bcc structure leads to extinctions; however, consider CsCl. It does not have these extinctions since fCs+ ≠ fCl−. In fact, to a good approximation fCs+ ≈ fXe and fCl− ≈ fAr. However, CsI (also a bcc structure) comes pretty close to having complete extinctions, since both the Cs and I ions take on the Xe electronic shell. Thus, in the scattering pattern of CsI, the odd h + k + l peaks are much smaller than the even ones. Other centered lattices also lead to extinctions. In fact, this leads us to a rather general conclusion. The shape and dimensions of the unit cell determine the location of the Bragg peaks; however, the content of the unit cell helps determine the relative intensities of the peaks.
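The extinction rule of Eq. 45 can be reproduced directly from the definition of the structure factor in Eq. 39. The sketch below is an illustration under assumptions: basis positions are given in fractional coordinates of the cubic cell, so that G_hkl · rα = 2π(hx + ky + lz), and the scattering factors 54 and 18 used for the CsCl-like case are simply the electron counts of Xe and Ar, standing in for the forward-scattering values of fCs+ and fCl−. The function name is hypothetical.

```python
import cmath


def structure_factor(hkl, basis):
    """S_hkl = sum_alpha f_alpha exp(-i G_hkl . r_alpha).

    basis is a list of ((x, y, z), f) with positions in fractional
    coordinates, for which G_hkl . r_alpha = 2 pi (h x + k y + l z)."""
    h, k, l = hkl
    return sum(
        f * cmath.exp(-2j * cmath.pi * (h * x + k * y + l * z))
        for (x, y, z), f in basis
    )


if __name__ == "__main__":
    bcc_identical = [((0.0, 0.0, 0.0), 1.0), ((0.5, 0.5, 0.5), 1.0)]
    cscl_like = [((0.0, 0.0, 0.0), 54.0), ((0.5, 0.5, 0.5), 18.0)]
    for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
        print(hkl,
              round(abs(structure_factor(hkl, bcc_identical)), 6),
              round(abs(structure_factor(hkl, cscl_like)), 6))
```

With identical atoms the odd h + k + l reflections vanish, while with two different scattering factors they survive with reduced amplitude, as described for CsCl.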
[Figure 9: unit cell of BCC ordered FeCo, with the Fe and Co sites labeled.]
Figure 9: The unit cell of body-centered cubic ordered FeCo.
Extinctions in Binary Alloys. Another, and significantly more interesting, example of extinctions in scattering experiments happens in binary alloys such as FeCo on a centered BCC lattice. Since Fe and Co are adjacent to each other on the periodic chart, and the x-ray form factor is proportional to Z (Z_Co = 27, and Z_Fe = 26),
$$f^{\text{x-ray}}_{\text{Fe}} \approx f^{\text{x-ray}}_{\text{Co}} . \tag{46}$$
However, since one has a closed nuclear shell and the other doesn't, their neutron scattering factors will be quite different,
$$f^{\text{neutron}}_{\text{Fe}} \neq f^{\text{neutron}}_{\text{Co}} . \tag{47}$$
Thus, neutron scattering from the ordered FeCo structure shown in Fig. 9 will not have extinctions; whereas scattering from a disordered structure (where the distribution of Fe and Co is random, so each site has a 50% chance of having Fe or Co, independent of the occupation of the neighboring sites) will have extinctions!
2.3.2
Powdered x-ray Diffraction
If you expose a crystal with a single crystalline domain to a collimated beam of x-rays, you usually will not achieve a diffraction spot. The reason can be seen from the Ewald construction, shown in Fig. 10. For any given $\mathbf{k}_0$, the chances of matching up so as to achieve $\mathbf{G} = \mathbf{k} - \mathbf{k}_0$ are remote. For this reason most people use powdered x-ray diffraction to characterize their samples. This is done by making a powder with randomly distributed crystallites, exposing the powdered sample to x-rays, and recording the pattern. The powdered sample corresponds to averaging over all orientations of the reciprocal lattice. Thus one will observe all peaks that lie within a radius of $2|\mathbf{k}_0|$ of the origin of the reciprocal lattice.
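The criterion $|\mathbf{G}| \le 2|\mathbf{k}_0|$ can be turned into a short list of expected powder lines. The sketch below assumes a cubic conventional cell of side `a`, so that $|\mathbf{G}_{hkl}| = (2\pi/a)\sqrt{h^2+k^2+l^2}$ and the criterion becomes $h^2+k^2+l^2 \le (2a/\lambda)^2$; the function name `powder_lines`, the optional `extinction` callback, and the sample values (a = 2.87 Å, λ = 1.54 Å, roughly bcc iron with Cu Kα radiation) are illustrative assumptions.

```python
import math
from itertools import product

def powder_lines(a, wavelength, extinction=None):
    """List (h^2+k^2+l^2, multiplicity, 2*theta in degrees) for a cubic cell.

    A reflection is kept when |G| <= 2|k0|, i.e. h^2+k^2+l^2 <= (2a/lambda)^2.
    `extinction(h, k, l)` may return True to drop a reflection.
    """
    limit = (2.0 * a / wavelength) ** 2
    hmax = int(math.floor(math.sqrt(limit)))
    counts = {}
    for h, k, l in product(range(-hmax, hmax + 1), repeat=3):
        n2 = h * h + k * k + l * l
        if n2 == 0 or n2 > limit:
            continue
        if extinction is not None and extinction(h, k, l):
            continue
        counts[n2] = counts.get(n2, 0) + 1
    lines = []
    for n2 in sorted(counts):
        # Bragg condition: sin(theta) = |G| / (2 k0) = lambda*sqrt(n2) / (2a)
        sin_theta = min(1.0, wavelength * math.sqrt(n2) / (2.0 * a))
        lines.append((n2, counts[n2], 2.0 * math.degrees(math.asin(sin_theta))))
    return lines

def bcc_extinct(h, k, l):
    """Extinction rule of Eq. 45: odd h+k+l is missing for identical atoms."""
    return (h + k + l) % 2 != 0

for n2, mult, two_theta in powder_lines(2.87, 1.54, extinction=bcc_extinct):
    print(f"h^2+k^2+l^2 = {n2:2d}  multiplicity = {mult:2d}  2theta = {two_theta:6.2f} deg")
```

Every orientation-equivalent reflection lands on the same ring in a powder pattern, which is why the multiplicity is tracked alongside the scattering angle.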
Figure 10: The Ewald construction to determine if the conditions are correct for obtaining a Bragg peak: Select a point in k-space as the origin. Draw the incident wave vector $\mathbf{k}_0$ to the origin. From the base of $\mathbf{k}_0$, spin $\mathbf{k}$ (remember that for elastic scattering $|\mathbf{k}| = |\mathbf{k}_0|$) in all possible directions to form a sphere. At each point where this sphere intersects a lattice point in k-space, there will be a Bragg peak with $\mathbf{G} = \mathbf{k} - \mathbf{k}_0$. In the example above we find 8 Bragg peaks. If, however, we change $\mathbf{k}_0$ by a small amount, then we have none!
Chapter 4: Crystal Lattice Dynamics Debye December 29, 2001
Contents
1 An Adiabatic Theory of Lattice Vibrations
1.1 The Equation of Motion
1.2 Example, a Linear Chain
1.3 The Constraints of Symmetry
1.3.1 Symmetry of the Dispersion
1.3.2 Symmetry and the Need for Acoustic modes
2 The Counting of Modes
2.1 Periodicity and the Quantization of States
2.2 Translational Invariance: First Brillouin Zone
2.3 Point Group Symmetry and Density of States
3 Normal Modes and Quantization
3.1 Quantization and Second Quantization
4 Theory of Neutron Scattering
4.1 Classical Theory of Neutron Scattering
4.2 Quantum Theory of Neutron Scattering
4.2.1 The Debye-Waller Factor
4.2.2 Zero-phonon Elastic Scattering
4.2.3 One-Phonon Inelastic Scattering
A crystal lattice is special due to its long range order. As you explored in the homework, this yields a sharp diffraction pattern, especially in 3-d. However, lattice vibrations are important. Among other things:
• The thermal conductivity of insulators is due to dispersive lattice vibrations, and it can be quite large (in fact, diamond has a thermal conductivity which is about 6 times that of metallic copper).
• In scattering they reduce the spot intensities, and also allow for inelastic scattering where the energy of the scatterer (e.g. a neutron) changes due to the absorption or creation of a phonon in the target.
• Electron-phonon interactions renormalize the properties of electrons (make them heavier).
• Superconductivity (conventional) comes from multiple electron-phonon scattering between time-reversed electrons.
1
An Adiabatic Theory of Lattice Vibrations
At first glance, a theory of lattice vibrations would appear impossibly daunting. We have $N \approx 10^{23}$ ions interacting strongly (with energies of about $e^2/\text{Å}$) with $N$ electrons. However, there is a natural expansion parameter for this problem, which is the ratio of the electronic to the ionic mass:

$$ \frac{m}{M} \ll 1 \tag{1} $$

which allows us to derive an accurate theory. Due to Newton's third law, the forces on the ions and electrons are comparable, $F \sim e^2/a^2$, where $a$ is the lattice constant. If we imagine that, at least for small excursions, the forces binding the electrons and the ions to the lattice may be modeled as harmonic oscillators, then

$$ F \sim e^2/a^2 \sim m\,\omega_{\text{electron}}^2\, a \sim M\,\omega_{\text{ion}}^2\, a \tag{2} $$

This means that

$$ \frac{\omega_{\text{ion}}}{\omega_{\text{electron}}} \sim \left(\frac{m}{M}\right)^{1/2} \sim 10^{-3} \text{ to } 10^{-2} \tag{3} $$

which means that the ion is essentially stationary during the period of the electronic motion. For this reason we may make an adiabatic approximation:
• We treat the ions as stationary at locations $\mathbf{R}_1, \cdots \mathbf{R}_N$ and determine the electronic ground state energy, $E(\mathbf{R}_1, \cdots \mathbf{R}_N)$. This may be done using standard ab-initio band structure techniques such as those used by FJP.
• We then use this as a potential for the ions; i.e., we recalculate $E$ as a function of the ionic locations, always assuming that the electrons remain in their ground state.
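The estimate in Eq. 3 is easy to reproduce. The sketch below evaluates $(m/M)^{1/2}$ for a light, an intermediate, and a heavy ion; the helper name `frequency_ratio` is hypothetical, the atomic masses are the rounded periodic-table values, and the electron mass in atomic mass units is the standard value.

```python
import math

ELECTRON_MASS_U = 5.485799e-4  # electron mass in atomic mass units

def frequency_ratio(ion_mass_u):
    """Return (m/M)^(1/2), the estimate of omega_ion / omega_electron in Eq. 3."""
    return math.sqrt(ELECTRON_MASS_U / ion_mass_u)

for name, mass_u in [("H", 1.008), ("Fe", 55.847), ("Pb", 207.2)]:
    print(f"{name:2s}  M = {mass_u:8.3f} u   (m/M)^(1/2) = {frequency_ratio(mass_u):.2e}")
```

The ratios fall between roughly $10^{-3}$ (heavy ions) and a few times $10^{-2}$ (hydrogen), consistent with the range quoted in Eq. 3.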
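Figure 1: Nomenclature for the lattice vibration problem. $\mathbf{s}_{n,\alpha}$ is the displacement of the atom $\alpha$ within the $n$-th unit cell from its equilibrium position, given by $\mathbf{r}_{n,\alpha} = \mathbf{r}_n + \mathbf{r}_\alpha$, where as usual, $\mathbf{r}_n = n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3$. The instantaneous position of the atom is $\mathbf{R}_{n\alpha} = \mathbf{r}_n + \mathbf{r}_\alpha + \mathbf{s}_\alpha$.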
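Thus the potential energy for the ions is

$$ \phi(\mathbf{R}_1, \cdots \mathbf{R}_N) = E(\mathbf{R}_1, \cdots \mathbf{R}_N) + \text{the ion-ion interaction} \tag{4} $$

We will define the zero of the potential such that when all $\mathbf{R}_n$ are at their equilibrium positions, $\phi = 0$. Then

$$ H = \sum_n \frac{P_n^2}{2M} + \phi(\mathbf{R}_1, \cdots \mathbf{R}_N) \tag{5} $$

Typical lattice vibrations involve small atomic excursions of the order 0.1 Å or smaller, thus we may expand about the equilibrium position of the ions (with a sum over repeated indices implied):

$$ \phi(\{r_{n\alpha i} + s_{n\alpha i}\}) = \phi(\{r_{n\alpha i}\}) + \frac{\partial\phi}{\partial r_{n\alpha i}} s_{n\alpha i} + \frac{1}{2}\frac{\partial^2\phi}{\partial r_{n\alpha i}\,\partial r_{m\beta j}} s_{n\alpha i} s_{m\beta j} \tag{6} $$

The first two terms in the sum are zero; the first by definition, and the second is zero since it is the first derivative of a potential being evaluated at the equilibrium position. We will define the matrix

$$ \Phi^{m\beta j}_{n\alpha i} = \frac{\partial^2\phi}{\partial r_{n\alpha i}\,\partial r_{m\beta j}} \tag{7} $$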
From the different conservation laws (related to symmetries) of the system one may derive some simple relationships for $\Phi$. We will discuss these in detail later. However, one must be introduced now, that is, due to translational invariance,

$$ \Phi^{m\beta j}_{n\alpha i} = \Phi^{(m-n)\beta j}_{0\alpha i} = \frac{\partial^2\phi}{\partial r_{0\alpha i}\,\partial r_{(m-n)\beta j}} \tag{8} $$

i.e., it can depend only upon the relative separation of the two unit cells. This is important for the next subsection.

Figure 2: Since the coefficients of potential between the atoms linked by the blue lines (or the red lines) must be identical, $\Phi^{m\beta j}_{n\alpha i} = \Phi^{(m-n)\beta j}_{0\alpha i}$.
1.1
The Equation of Motion
From the derivative of the potential, we can calculate the force on each site

$$ F_{n\alpha i} = -\frac{\partial\phi(\{r_{m\beta j} + s_{m\beta j}\})}{\partial s_{n\alpha i}} \tag{9} $$

so that the equation of motion is

$$ -\Phi^{m\beta j}_{n\alpha i}\, s_{m\beta j} = M_\alpha \ddot{s}_{n\alpha i} \tag{10} $$

If there are $N$ unit cells, each with $r$ atoms, then this gives $3Nr$ equations of motion. We will take advantage of the periodicity of the lattice by using Fourier transforms to achieve a significant decoupling of these equations. Imagine that the coordinate $s$ of each site is decomposed into its Fourier components. Since the equations are linear, we may just consider one of these components to derive our equations of motion in Fourier space

$$ s_{n\alpha i} = \frac{1}{\sqrt{M_\alpha}}\, u_{\alpha i}(\mathbf{q})\, e^{i(\mathbf{q}\cdot\mathbf{r}_n - \omega t)} \tag{11} $$

where the first two factors on the rhs serve as the polarization vector for the oscillation; $u_{\alpha i}(\mathbf{q})$ is independent of $n$ due to the translational invariance of the system. In a real system the real $s$ would be composed of a sum over all $\mathbf{q}$ and polarizations. With this substitution, the equations of motion become

$$ \omega^2 u_{\alpha i}(\mathbf{q}) = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{m\beta j}_{n\alpha i}\, e^{i\mathbf{q}\cdot(\mathbf{r}_m - \mathbf{r}_n)}\, u_{\beta j}(\mathbf{q}) \quad \text{(sum over repeated indices)} . \tag{12} $$
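Figure 3: $u_{\alpha i}(\mathbf{q})$ is independent of $n$ so that a lattice vibration can propagate and respect the translational invariance of the lattice. (The panels show the wave at $t = 0$, $t = \Delta t$, and $t = 2\Delta t$.)

Recall that $\Phi^{m\beta j}_{n\alpha i} = \Phi^{(m-n)\beta j}_{0\alpha i}$, so that if we identify

$$ D^{\beta j}_{\alpha i} = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{m\beta j}_{n\alpha i}\, e^{i\mathbf{q}\cdot(\mathbf{r}_m - \mathbf{r}_n)} = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{p\beta j}_{0\alpha i}\, e^{i\mathbf{q}\cdot\mathbf{r}_p} \tag{13} $$

where $\mathbf{r}_p = \mathbf{r}_m - \mathbf{r}_n$, then the equation of motion becomes

$$ \omega^2 u_{\alpha i}(\mathbf{q}) = D^{\beta j}_{\alpha i}\, u_{\beta j}(\mathbf{q}) \tag{14} $$

or

$$ \left( D^{\beta j}_{\alpha i} - \omega^2 \delta^{\beta j}_{\alpha i} \right) u_{\beta j}(\mathbf{q}) = 0 \tag{15} $$

which only has nontrivial ($u \neq 0$) solutions if $\det\left( D(\mathbf{q}) - \omega^2 I \right) = 0$. For each $\mathbf{q}$ there are $3r$ different solutions (branches) with eigenvalues $\omega^{(n)}(\mathbf{q})$ (or rather, the $\omega^{(n)}(\mathbf{q})$ are the square roots of the eigenvalues). The dependence of these eigenvalues $\omega^{(n)}(\mathbf{q})$ on $\mathbf{q}$ is known as the dispersion relation.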
Figure 4: A linear chain of oscillators, with lattice constant $a$, composed of a two-element basis ($\alpha = 1, 2$) with different masses, $M_1$ and $M_2$, and equal strength springs with spring constant $f$.
1.2
Example, a Linear Chain
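Consider a linear chain of oscillators composed of a two-element basis with different masses, $M_1$ and $M_2$, and equal strength springs with spring constant $f$. It has the potential energy

$$ \phi = \frac{1}{2} f \sum_n \left[ (s_{n,1} - s_{n,2})^2 + (s_{n,2} - s_{n+1,1})^2 \right] . \tag{16} $$

We may suppress the indices $i$ and $j$, and search for a solution

$$ s_{n\alpha} = \frac{1}{\sqrt{M_\alpha}}\, u_\alpha(q)\, e^{i(q r_n - \omega t)} \tag{17} $$

to the equation of motion

$$ \omega^2 u_\alpha(q) = D^{\beta}_{\alpha}\, u_\beta(q) \quad \text{where} \quad D^{\beta}_{\alpha} = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{p,\beta}_{0,\alpha}\, e^{i q r_p} \tag{18} $$

and

$$ \Phi^{m,\beta}_{n,\alpha} = \frac{\partial^2\phi}{\partial r_{0,\alpha}\,\partial r_{(m-n),\beta}} \tag{19} $$

where nontrivial solutions are found by solving $\det\left( D(q) - \omega^2 I \right) = 0$. The potential matrix has the form

$$ \Phi^{n,1}_{n,1} = \Phi^{n,2}_{n,2} = 2f \tag{20} $$

$$ \Phi^{n,2}_{n,1} = \Phi^{n,1}_{n,2} = \Phi^{n+1,1}_{n,2} = \Phi^{n-1,2}_{n,1} = -f . \tag{21} $$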
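This may be Fourier transformed on the space index $n$ by inspection, so that

$$ D^{\beta}_{\alpha} = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{p\beta}_{0\alpha}\, e^{i q r_p} = \begin{pmatrix} \dfrac{2f}{M_1} & -\dfrac{f}{\sqrt{M_1 M_2}}\left(1 + e^{-iqa}\right) \\[2ex] -\dfrac{f}{\sqrt{M_1 M_2}}\left(1 + e^{+iqa}\right) & \dfrac{2f}{M_2} \end{pmatrix} \tag{22} $$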
Note that the matrix $D$ is hermitian, as it must be to yield real, physical, eigenvalues $\omega^2$ (however, $\omega$ can still be imaginary if $\omega^2$ is negative, indicating an unstable mode). The secular equation $\det\left( D(q) - \omega^2 I \right) = 0$ becomes

$$ \omega^4 - \omega^2\, 2f\left(\frac{1}{M_1} + \frac{1}{M_2}\right) + \frac{4f^2}{M_1 M_2}\sin^2(qa/2) = 0 , \tag{23} $$

with solutions

$$ \omega^2 = f\left(\frac{1}{M_1} + \frac{1}{M_2}\right) \pm f\sqrt{\left(\frac{1}{M_1} + \frac{1}{M_2}\right)^2 - \frac{4\sin^2(qa/2)}{M_1 M_2}} \tag{24} $$
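This equation simplifies significantly in the $q \to 0$ and $qa \to \pi$ limits. Defining the reduced mass through $1/\mu = \left(\frac{1}{M_1} + \frac{1}{M_2}\right)$,

$$ \lim_{q\to 0} \omega_-(q) = qa\sqrt{\frac{f\mu}{2 M_1 M_2}} \tag{25} $$

$$ \lim_{q\to 0} \omega_+(q) = \sqrt{\frac{2f}{\mu}} \tag{26} $$

and, taking $M_1 < M_2$,

$$ \omega_-(q = \pi/a) = \sqrt{2f/M_2} , \qquad \omega_+(q = \pi/a) = \sqrt{2f/M_1} . $$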
Figure 5: Dispersion ($\omega$ versus $q$) of the linear chain of oscillators shown in Fig. 4 when $M_1 = 1$, $M_2 = 2$ and $f = 1$. The upper branch $\omega_+$ is called the optical mode and the lower branch $\omega_-$ is the acoustic mode; near $q = 0$ the acoustic branch is linear, $\omega \sim ck$.
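The dispersion of Eq. 24 and the eigenvector ratio discussed for the optical mode can be cross-checked by diagonalizing the matrix of Eq. 22 directly. The following is a minimal sketch using NumPy, with the Fig. 5 parameters ($M_1 = 1$, $M_2 = 2$, $f = 1$, $a = 1$) as defaults; the function names are illustrative and not part of the notes.

```python
import numpy as np

def dynamical_matrix(q, f=1.0, m1=1.0, m2=2.0, a=1.0):
    """2x2 hermitian dynamical matrix of Eq. 22 for the diatomic chain."""
    off = -f * (1.0 + np.exp(-1j * q * a)) / np.sqrt(m1 * m2)
    return np.array([[2.0 * f / m1, off],
                     [np.conj(off), 2.0 * f / m2]])

def numeric_branches(q, **kw):
    """Return (omega_minus, omega_plus) from the eigenvalues omega^2 of D(q)."""
    w2 = np.linalg.eigvalsh(dynamical_matrix(q, **kw))  # ascending, real
    return np.sqrt(np.clip(w2, 0.0, None))

def analytic_branches(q, f=1.0, m1=1.0, m2=2.0, a=1.0):
    """Return (omega_minus, omega_plus) from the closed form of Eq. 24."""
    s = 1.0 / m1 + 1.0 / m2
    root = np.sqrt(s * s - 4.0 * np.sin(q * a / 2.0) ** 2 / (m1 * m2))
    return np.sqrt(max(f * (s - root), 0.0)), np.sqrt(f * (s + root))

def optical_displacement_ratio(m1=1.0, m2=2.0, f=1.0):
    """s_n1 / s_n2 for the q = 0 optical mode, using s = u / sqrt(M) (Eq. 11)."""
    _, vecs = np.linalg.eigh(dynamical_matrix(0.0, f=f, m1=m1, m2=m2))
    u1, u2 = vecs[:, 1]  # column 1 belongs to the larger eigenvalue (optical)
    return ((u1 / np.sqrt(m1)) / (u2 / np.sqrt(m2))).real

for q in np.linspace(0.0, np.pi, 5):
    print(f"q = {q:5.3f}  numeric = {numeric_branches(q)}  analytic = {analytic_branches(q)}")
print("optical s1/s2 at q = 0:", optical_displacement_ratio())
```

For these parameters the zone-edge frequencies come out as $\sqrt{2f/M_2} = 1$ and $\sqrt{2f/M_1} = \sqrt{2}$, and the optical-mode displacement ratio is $-M_2/M_1 = -2$.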
As a result, the $+$ mode is quite flat; whereas the $-$ mode varies from zero at the Brillouin zone center $q = 0$ to a flat value at the edge of the zone. This behavior is plotted in Fig. 5.

It is also instructive to look at the eigenvectors, since they will tell us how the atoms vibrate. Let's look at the optical mode at $q = 0$, $\omega_+(0) = \sqrt{2f/\mu}$. Here,

$$ D = \begin{pmatrix} 2f/M_1 & -2f/\sqrt{M_1 M_2} \\ -2f/\sqrt{M_1 M_2} & 2f/M_2 \end{pmatrix} . \tag{27} $$

Eigenvectors are non-trivial solutions to $(\omega^2 I - D)u = 0$, or

$$ 0 = \begin{pmatrix} 2f/\mu - 2f/M_1 & 2f/\sqrt{M_1 M_2} \\ 2f/\sqrt{M_1 M_2} & 2f/\mu - 2f/M_2 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} . \tag{28} $$

with the solution $u_1 = -\sqrt{M_2/M_1}\, u_2$. In terms of the actual displacements, Eq. 11,

$$ \frac{s_{n1}}{s_{n2}} = \sqrt{\frac{M_2}{M_1}}\, \frac{u_1}{u_2} \tag{29} $$

or $s_{n1}/s_{n2} = -M_2/M_1$, so that the two atoms in the basis are moving out of phase with amplitudes of motion inversely proportional to their masses. These modes are described as optical modes since these atoms, if oppositely charged, would form an oscillating dipole which would couple to optical fields with $\lambda \sim a$. Not all optical modes are optically active.

Figure 6: Optical Mode (bottom) of the linear chain (top).

1.3
The Constraints of Symmetry
We know a great deal about the dispersion of the lattice vibrations without solving explicitly for them. For example, we know that for each $\mathbf{q}$, there will be $dr$ modes (where $d$ is the lattice dimension, and $r$ is the number of atoms in the basis). We also expect (and implicitly assumed above) that the allowed frequencies are real and positive. However, from simple mathematical identities, the point-group and translational symmetries of the lattice, and its time-reversal invariance, we can learn more about the dispersion without solving any particular problem. The basic symmetries that we will employ are
• The translational invariance of the lattice and reciprocal lattice.
• The point group symmetries of the lattice and reciprocal lattice.
• Time-reversal invariance.
1.3.1
Symmetry of the Dispersion
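Complex Properties of the dispersion and Eigenmodes
First, from the symmetry of the second derivative, one may show that $\omega^2$ is real. Recall that the dispersion is determined by the secular equation $\det\left( D(\mathbf{q}) - \omega^2 I \right) = 0$, so if $D$ is hermitian, then its eigenvalues, $\omega^2$, must be real.

$$ D^{*\beta j}_{\alpha i} = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{p\beta j}_{0\alpha i}\, e^{-i\mathbf{q}\cdot\mathbf{r}_p} \tag{30} $$

$$ \phantom{D^{*\beta j}_{\alpha i}} = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{-p,\beta,j}_{0,\alpha,i}\, e^{i\mathbf{q}\cdot\mathbf{r}_p} \tag{31} $$

Then, due to the symmetric properties of the second derivative,

$$ D^{*\beta j}_{\alpha i} = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{0,\alpha,i}_{-p,\beta,j}\, e^{i\mathbf{q}\cdot\mathbf{r}_p} = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{p,\alpha,i}_{0,\beta,j}\, e^{i\mathbf{q}\cdot\mathbf{r}_p} = D^{\alpha i}_{\beta j} \tag{32} $$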
Thus, $D^{T*} = D^\dagger = D$ so $D$ is hermitian and its eigenvalues $\omega^2$ are real. This means that either $\omega$ are real or they are pure imaginary. We will assume the former. The latter yields pure exponential growth of our Fourier solution, indicating an instability of the lattice to a second-order structural phase transition.

Time-reversal invariance allows us to show related results. We assume a solution of the form

$$ s_{n\alpha i} = \frac{1}{\sqrt{M_\alpha}}\, u_{\alpha i}(\mathbf{q})\, e^{i(\mathbf{q}\cdot\mathbf{r}_n - \omega t)} \tag{33} $$

which is a plane wave. Suppose that the plane wave is moving to the right so that $\mathbf{q} = \hat{\mathbf{x}} q_x$; then the plane of stationary phase travels to the right with

$$ x = \frac{\omega}{q_x}\, t . \tag{34} $$

Clearly then changing the sign of $q_x$ is equivalent to taking $t \to -t$. If the system is to display proper time-reversal invariance, so that the plane wave retraces its path under time-reversal, it must have the same frequency when time, and hence $\mathbf{q}$, is reversed, so

$$ \omega(-\mathbf{q}) = \omega(\mathbf{q}) . \tag{35} $$

Note that this is fully equivalent to the statement that $D^{\alpha i}_{\beta j}(\mathbf{q}) = D^{*\alpha i}_{\beta j}(-\mathbf{q})$, which is clear from the definition of $D$.
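Now, return to the secular equation, Eq. 15,

$$ \left( D^{\beta j}_{\alpha i}(\mathbf{q}) - \omega^2(\mathbf{q})\,\delta^{\beta j}_{\alpha i} \right) \epsilon_{\beta j}(\mathbf{q}) = 0 \tag{36} $$

Let's call the (normalized) eigenvectors of this equation $\epsilon$. They are the elements of a unitary matrix which diagonalizes $D$. As a result, they have orthogonality and completeness relations

$$ \sum_{\alpha,i} \epsilon^{*(n)}_{\alpha,i}(\mathbf{q})\, \epsilon^{(m)}_{\alpha,i}(\mathbf{q}) = \delta_{m,n} \quad \text{(orthogonality)} \tag{37} $$

$$ \sum_{n} \epsilon^{*(n)}_{\alpha,i}(\mathbf{q})\, \epsilon^{(n)}_{\beta,j}(\mathbf{q}) = \delta_{\alpha,\beta}\,\delta_{i,j} \quad \text{(completeness)} \tag{38} $$

If we now take the complex conjugate of the secular equation,

$$ \left( D^{\beta j}_{\alpha i}(-\mathbf{q}) - \omega^2(-\mathbf{q})\,\delta^{\beta j}_{\alpha i} \right) \epsilon^{*}_{\beta j}(\mathbf{q}) = 0 \tag{39} $$

then it must be that

$$ \epsilon^{*}_{\beta j}(\mathbf{q}) \propto \epsilon_{\beta j}(-\mathbf{q}) . \tag{40} $$

Since the $\{\epsilon\}$ are normalized, the constant of proportionality may be chosen as one:

$$ \epsilon^{*}_{\beta j}(\mathbf{q}) = \epsilon_{\beta j}(-\mathbf{q}) . \tag{41} $$

Point-Group Symmetry and the Dispersion
A point group operation takes a crystal back to an identical configuration. Both the original and final lattice must have the same dispersion. Thus, since the reciprocal lattice has the same point group as the real lattice, the dispersion relations have the same point group symmetry as the lattice. For example, the dispersion must share the periodicity of the Brillouin zone. From the definition of $D$,

$$ D^{\beta j}_{\alpha i}(\mathbf{q}) = \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{p\beta j}_{0\alpha i}\, e^{i\mathbf{q}\cdot\mathbf{r}_p} \tag{42} $$

it is easy to see that $D^{\beta j}_{\alpha i}(\mathbf{q} + \mathbf{G}) = D^{\beta j}_{\alpha i}(\mathbf{q})$ (since $\mathbf{G}\cdot\mathbf{r}_p = 2\pi n$, where $n$ is an integer). I.e., $D$ is periodic in k-space, and so its eigenvalues (and eigenvectors) must also be periodic:

$$ \omega^{(n)}(\mathbf{k} + \mathbf{G}) = \omega^{(n)}(\mathbf{k}) \tag{43} $$

$$ \epsilon_{\beta j}(\mathbf{k} + \mathbf{G}) = \epsilon_{\beta j}(\mathbf{k}) . \tag{44} $$

1.3.2
Symmetry and the Need for Acoustic modes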
Applying basic symmetries, we can show that an elemental lattice (that with $r = 1$) must have an acoustic mode. First, look at the translational invariance of $\Phi$. Suppose we make an overall shift of the lattice by an arbitrary displacement $s_{n,\alpha,i}$ for all sites $n$ and elements of the basis $\alpha$ (i.e. $s_{n,\alpha,i} = s_{1,1,i}$). Then, since the interaction is only between ions, the energy of the system should remain unchanged.

Figure 7: If each ion is shifted by $s_{1,1,i}$, then the lattice energy is unchanged.

$$ \delta E = \frac{1}{2}\sum_{m,n,\alpha,\beta,i,j} \Phi^{m-n,\beta,j}_{0,\alpha,i}\, s_{n,\alpha,i}\, s_{m,\beta,j} = 0 \tag{45} $$

$$ \phantom{\delta E} = \frac{1}{2}\sum_{m,n,\alpha,\beta,i,j} \Phi^{m-n,\beta,j}_{0,\alpha,i}\, s_{1,1,i}\, s_{1,1,j} \tag{46} $$

$$ \phantom{\delta E} = \frac{1}{2}\sum_{i,j} s_{1,1,i}\, s_{1,1,j} \sum_{m,n,\alpha,\beta} \Phi^{m-n,\beta,j}_{0,\alpha,i} \tag{47} $$

Since we know that $s_{1,1,i}$ is finite, it must be that

$$ \sum_{m,n,\alpha,\beta} \Phi^{m-n,\beta,j}_{0,\alpha,i} = \sum_{p,\alpha,\beta} \Phi^{p,\beta,j}_{0,\alpha,i} = 0 \tag{48} $$
Now consider a strain on the system $V_{m,\beta,j}$, described by the strain matrix $m^{\alpha,i}_{\beta,j}$:

$$ V_{m,\beta,j} = \sum_{\alpha,i} m^{\alpha,i}_{\beta,j}\, s_{m,\alpha,i} \tag{49} $$

Figure 8: After a stress is applied to a lattice, the movement of each ion (strain) is not only in the direction of the applied stress. The response of the lattice to an applied stress is described by the strain matrix.

After the stress has been applied, the atoms in the bulk of the sample are again in equilibrium (those on the surface are maintained in equilibrium by the stress), and so the net force must be zero. Looking at the central ($n = 0$) atom this means that

$$ 0 = F_{0,\alpha,i} = -\sum_{m,\beta,j,\gamma,k} \Phi^{m,\beta,j}_{0,\alpha,i}\, m^{\gamma,k}_{\beta,j}\, s_{m,\gamma,k} \tag{50} $$

Since this applies for an arbitrary strain matrix $m^{\gamma,k}_{\beta,j}$, the coefficients for each $m^{\gamma,k}_{\beta,j}$ must be zero:

$$ \sum_{m} \Phi^{m,\beta,j}_{0,\alpha,i}\, s_{m,\gamma,k} = 0 \tag{51} $$
An alternative way (cf. Callaway) to show this is to recall that the reflection symmetry of the lattice requires that $\Phi^{m,\beta,j}_{0,\alpha,i}$ be even in $m$; whereas $s_{m,\gamma,k}$ is odd in $m$. Thus the sum over all $m$ yields zero.

Now let's apply these constraints to $D$ for an elemental lattice where $r = 1$, and we may suppress the basis indices $\alpha$:

$$ D^{j}_{i}(\mathbf{q}) = \frac{1}{M}\sum_{p} \Phi^{p,j}_{0,i}\, e^{i\mathbf{q}\cdot\mathbf{r}_p} \tag{52} $$

For small $q$ we may expand $D$:

$$ D^{j}_{i}(\mathbf{q}) = \frac{1}{M}\sum_{p} \Phi^{p,j}_{0,i} \left( 1 + i\mathbf{q}\cdot\mathbf{r}_p - \frac{1}{2}(\mathbf{q}\cdot\mathbf{r}_p)^2 + \cdots \right) \tag{53} $$

We have shown above that the first two terms in this series are zero. Thus,

$$ D^{j}_{i}(\mathbf{q}) \approx -\frac{1}{2M}\sum_{p} \Phi^{p,j}_{0,i}\, (\mathbf{q}\cdot\mathbf{r}_p)^2 \tag{54} $$

Thus, the leading order (small $q$) eigenvalues are $\omega^2(\mathbf{q}) \sim q^2$; i.e., they are acoustic modes. We have shown that all elemental lattices must have acoustic modes for small $q$.

In fact, one may show that all harmonic lattices in which the energy is invariant under a rigid translation of the entire lattice must have at least one acoustic mode. We will not prove this, but rather make a simple argument. The rigid translation of the lattice corresponds to a $q = 0$ translational mode. Since no energy is gained by this translation, it must be that $\omega_s(q = 0) = 0$ for the branch $s$ which contains this mode. The acoustic mode may be obtained by perturbing (in $q$) around this point. Physically this mode corresponds to all of the elements of the basis moving together so as to emulate the motion in the elemental basis.
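The small-$q$ argument can be illustrated on a one-dimensional elemental chain. The sketch below assumes a hypothetical set of force constants $\Phi^p$ (first and second neighbours) chosen to satisfy the sum rule of Eq. 48, evaluates $D(q)$ from Eq. 52, and compares $\omega/q$ at small $q$ with the sound velocity implied by Eq. 54; all names and numerical values are illustrative.

```python
import numpy as np

def omega(q, phi, mass=1.0, a=1.0):
    """omega(q) for a 1-d elemental chain: D(q) = (1/M) sum_p Phi^p e^{i q p a}."""
    d = sum(v * np.exp(1j * q * p * a) for p, v in phi.items()).real / mass
    return np.sqrt(max(d, 0.0))

def sound_velocity(phi, mass=1.0, a=1.0):
    """c from the leading term of Eq. 54: c^2 = -(1/2M) sum_p Phi^p (p a)^2."""
    c2 = -sum(v * (p * a) ** 2 for p, v in phi.items()) / (2.0 * mass)
    return np.sqrt(c2)

# Assumed force constants: springs f1 (nearest) and f2 (next-nearest neighbour).
f1, f2 = 1.0, 0.25
phi = {0: 2.0 * (f1 + f2), 1: -f1, -1: -f1, 2: -f2, -2: -f2}

print("sum rule, Eq. 48:", sum(phi.values()))
print("c from Eq. 54   :", sound_velocity(phi))
for q in (1e-1, 1e-2, 1e-3):
    print(f"q = {q:7.1e}   omega/q = {omega(q, phi) / q:.6f}")
```

Because the force constants sum to zero and are even in $p$, the constant and linear terms of the expansion drop out and $\omega/q$ approaches a constant as $q \to 0$.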
2
The Counting of Modes
In the sections to follow, we need to perform sums (integrals) of functions of the dispersion over the crystal momentum states $\mathbf{k}$ within the reciprocal lattice. However, the translational and point group symmetries of the crystal often greatly reduce the set of points we must sum. In addition, we often approximate very large systems with hypertoroidal models with periodic boundary conditions. This latter approximation becomes valid as the system size diverges so that the surface becomes of zero measure.
2.1
Periodicity and the Quantization of States
A consequence of approximating our system as a finite-sized periodic system is that we now have a discrete sum rather than an integral over $k$. Consider a one-dimensional finite system with $N$ atoms and periodic boundary conditions. We seek solutions to the phonon problem of the type

$$ s_n = \epsilon(q)\, e^{i(q r_n - \omega t)} \quad \text{where } r_n = na \tag{55} $$

and we require that

$$ s_{n+N} = s_n \tag{56} $$

or

$$ q(n + N)a = qna + 2\pi m \quad \text{where } m \text{ is an integer.} \tag{57} $$

Then, the allowed values of $q$ are $q = 2\pi m/Na$. This will allow us to convert the integrals over the Brillouin zone to discrete sums, at least for cubic systems; however, the method is easily generalized for other Bravais lattices.
2.2
Translational Invariance: First Brillouin Zone
We can use the translational invariance of the crystal to reduce the complexity of sums or integrals of functions of the dispersion over the crystal momentum states. As shown above, translationally invariant systems have states which are not independent. It is useful then to define a region of k-space which contains only independent states. Sums over $\mathbf{k}$ may then be confined to this region. This region, defined as the smallest polyhedron centered at the origin of the reciprocal lattice and enclosed by perpendicular bisectors of the $\mathbf{G}$'s, is called the first Brillouin zone (cf. Fig. 9). Typically, we choose to include only half of the bounding surface within the first Brillouin zone, so that it can also be defined as the set of points which contains only independent states. From the discussion in chapter 3 and in this chapter, it is also clear that the reciprocal lattice vectors have some interpretation as momentum. For example, the Laue condition requires that the change in momentum of the scatterer be equal to a reciprocal lattice translation vector. The end points of all vector pairs that satisfy the Bragg condition $\mathbf{k} - \mathbf{k}_0 = \mathbf{G}_{hkl}$ lie on the perpendicular bisector of $\mathbf{G}_{hkl}$. Thus, the FBZ is also the set of points which cannot satisfy the Bragg condition.

Figure 9: The First Brillouin Zone. The end points of all vector pairs that satisfy the Bragg condition $\mathbf{k} - \mathbf{k}_0 = \mathbf{G}_{hkl}$ lie on the perpendicular bisector of $\mathbf{G}_{hkl}$. The smallest polyhedron centered at the origin of the reciprocal lattice and enclosed by perpendicular bisectors of the $\mathbf{G}$'s is called the first Brillouin zone.
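The quantization $q = 2\pi m/Na$ together with the restriction to the first Brillouin zone can be made concrete for the one-dimensional chain. The sketch below, with an illustrative helper name `allowed_q`, lists the $N$ independent values of $q$, keeping only one of the two zone-boundary points $\pm\pi/a$ in line with the convention of including half of the bounding surface.

```python
import numpy as np

def allowed_q(n_cells, a=1.0):
    """Allowed wave vectors q = 2*pi*m/(N*a) inside the first Brillouin zone.

    The N consecutive integers m are chosen so that -pi/a <= q < pi/a, i.e.
    only one of the two equivalent zone-boundary points is kept.
    """
    m = np.arange(-(n_cells // 2), n_cells - n_cells // 2)
    return 2.0 * np.pi * m / (n_cells * a)

for n_cells in (5, 8):
    q = allowed_q(n_cells)
    print(f"N = {n_cells}: {len(q)} states, q*a/pi =", np.round(q / np.pi, 3))
    # Periodic boundary condition s_{n+N} = s_n requires exp(i q N a) = 1.
    print("  boundary condition satisfied:", np.allclose(np.exp(1j * q * n_cells), 1.0))
```

The count of independent $q$ values equals the number of unit cells $N$, so a sum over the zone has exactly as many terms as there are cells.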
2.3
Point Group Symmetry and Density of States
Two other tricks to reduce the complexity of these sums are worth mentioning here although they are discussed in detail elsewhere. The first is the use of the point group symmetry of the system. As is clear from their definition in chapter 3, the reciprocal lattice vectors have the same point group symmetry as the lattice. As we discussed in chapter 2, the knowledge of the group elements and corresponding degeneracies may be used to reduce the sums over $\mathbf{k}$ to the irreducible wedge within the first Brillouin zone. For example, for a cubic system, this wedge is only $1/(2^3\, 3!)$ or 1/48th of the FBZ! The second is to introduce a phonon density of states to reduce the multidimensional sum over $\mathbf{k}$ to a one-dimensional integral over energy. This will be discussed in chapter 5.
3
Normal Modes and Quantization
In this section we will derive the equations of motion for the lattice, determine the canonically conjugate variables (in the sense of Lagrangian mechanics), and use this information to both first and second quantize the system.

Any lattice displacement may be expressed as a sum over the eigenvectors of the dynamical matrix $D$:

$$ s_{n,\alpha,i} = \frac{1}{\sqrt{M_\alpha N}} \sum_{\mathbf{q},s} Q_s(\mathbf{q}, t)\, \epsilon^{s}_{\alpha,i}(\mathbf{q})\, e^{i\mathbf{q}\cdot\mathbf{r}_n} \tag{58} $$
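Recall that $\epsilon^{s}_{\alpha,i}(\mathbf{q})$ are distinguished from $u^{s}_{\alpha,i}(\mathbf{q})$ only in that they are normalized. Also, since $\mathbf{q} + \mathbf{G}$ is equivalent to $\mathbf{q}$, we need sum only over the first Brillouin zone. Finally, we will assume that $Q_s(\mathbf{q}, t)$ contains the harmonic time dependence and, since $s_{n,\alpha,i}$ is real, $Q^*_s(\mathbf{q}) = Q_s(-\mathbf{q})$. We may rewrite both the kinetic and potential energy of the system as sums over $Q$. For example, the kinetic energy of the lattice is

$$ T = \frac{1}{2}\sum_{n,\alpha,i} M_\alpha \left(\dot{s}_{n\alpha,i}\right)^2 \tag{59} $$

$$ \phantom{T} = \frac{1}{2N}\sum_{n,\alpha,i}\ \sum_{\mathbf{q},\mathbf{k},r,s} \dot{Q}_r(\mathbf{q})\, \epsilon^{r}_{\alpha,i}(\mathbf{q})\, e^{i\mathbf{q}\cdot\mathbf{r}_n}\, \dot{Q}_s(\mathbf{k})\, \epsilon^{s}_{\alpha,i}(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}_n} \tag{60} $$

Then as

$$ \frac{1}{N}\sum_{n} e^{i(\mathbf{k}+\mathbf{q})\cdot\mathbf{r}_n} = \delta_{\mathbf{k},-\mathbf{q}} \quad \text{and} \quad \sum_{\alpha,i} \epsilon^{r}_{\alpha,i}\, \epsilon^{*s}_{\alpha,i} = \delta_{rs} \tag{61} $$

the kinetic energy may be reduced to

$$ T = \frac{1}{2}\sum_{\mathbf{q},r} \left| \dot{Q}_r(\mathbf{q}) \right|^2 \tag{62} $$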
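The potential energy may be rewritten in a similar fashion:

$$ V = \frac{1}{2}\sum_{n,m,\alpha,\beta,i,j} \Phi^{m,\beta,j}_{n,\alpha,i}\, s_{n,\alpha,i}\, s_{m,\beta,j} = \frac{1}{2}\sum_{n,m,\alpha,\beta,i,j} \frac{\Phi^{m-n,\beta,j}_{0,\alpha,i}}{N\sqrt{M_\alpha M_\beta}} \sum_{\mathbf{q},\mathbf{k},s,r} Q_s(\mathbf{q},t)\, \epsilon^{s}_{\alpha,i}(\mathbf{q})\, e^{i\mathbf{q}\cdot\mathbf{r}_n}\, Q_r(\mathbf{k},t)\, \epsilon^{r}_{\beta,j}(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}_m} \tag{63} $$

Let $\mathbf{r}_l = \mathbf{r}_m - \mathbf{r}_n$:

$$ V = \frac{1}{2}\sum_{n,l,\alpha,\beta,i,j} \frac{\Phi^{l,\beta,j}_{0,\alpha,i}}{N\sqrt{M_\alpha M_\beta}} \sum_{\mathbf{q},\mathbf{k},s,r} Q_s(\mathbf{q},t)\, \epsilon^{s}_{\alpha,i}(\mathbf{q})\, e^{i\mathbf{q}\cdot\mathbf{r}_n}\, Q_r(\mathbf{k},t)\, \epsilon^{r}_{\beta,j}(\mathbf{k})\, e^{i\mathbf{k}\cdot(\mathbf{r}_l + \mathbf{r}_n)} \tag{64} $$

and sum over $n$ to obtain the delta function $\delta_{\mathbf{k},-\mathbf{q}}$ so that

$$ V = \frac{1}{2}\sum_{\mathbf{k},l,\alpha,\beta,i,j,s,r} Q_s(-\mathbf{k})\, \epsilon^{s}_{\alpha,i}(-\mathbf{k})\, Q_r(\mathbf{k})\, \epsilon^{r}_{\beta,j}(\mathbf{k})\, \frac{1}{\sqrt{M_\alpha M_\beta}}\, \Phi^{l,\beta,j}_{0,\alpha,i}\, e^{i\mathbf{k}\cdot\mathbf{r}_l} . \tag{65} $$

Note that the sum over $l$ on the last three terms yields $D$, so that

$$ V = \frac{1}{2}\sum_{\mathbf{k},\alpha,\beta,i,j,s,r} D^{\beta,j}_{\alpha,i}(\mathbf{k})\, Q_s(-\mathbf{k})\, \epsilon^{s}_{\alpha,i}(-\mathbf{k})\, Q_r(\mathbf{k})\, \epsilon^{r}_{\beta,j}(\mathbf{k}) . \tag{66} $$

Then, since $\sum_{\beta,j} D^{\beta j}_{\alpha i}(\mathbf{k})\, \epsilon^{r}_{\beta j}(\mathbf{k}) = \omega_r^2(\mathbf{k})\, \epsilon^{r}_{\alpha,i}(\mathbf{k})$ and $\epsilon^{s}_{\alpha,i}(-\mathbf{k}) = \epsilon^{*s}_{\alpha,i}(\mathbf{k})$,

$$ V = \frac{1}{2}\sum_{\alpha,i,\mathbf{k},r,s} \epsilon^{r}_{\alpha,i}(\mathbf{k})\, \epsilon^{*s}_{\alpha,i}(\mathbf{k})\, \omega_r^2(\mathbf{k})\, Q^*_s(\mathbf{k})\, Q_r(\mathbf{k}) \tag{67} $$

Finally, since $\sum_{\alpha,i} \epsilon^{r}_{\alpha,i}(\mathbf{k})\, \epsilon^{*s}_{\alpha,i}(\mathbf{k}) = \delta_{r,s}$,

$$ V = \frac{1}{2}\sum_{\mathbf{k},s} \omega_s^2(\mathbf{k}) \left| Q_s(\mathbf{k}) \right|^2 \tag{68} $$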
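Thus we may write the Lagrangian of the ionic system as

$$ L = T - V = \frac{1}{2}\sum_{\mathbf{k},s} \left( \left| \dot{Q}_s(\mathbf{k}) \right|^2 - \omega_s^2(\mathbf{k}) \left| Q_s(\mathbf{k}) \right|^2 \right) , \tag{69} $$

where the $Q_s(\mathbf{k})$ may be regarded as canonical coordinates, and

$$ P^*_r(\mathbf{k}) = \frac{\partial L}{\partial \dot{Q}_r(\mathbf{k})} = \dot{Q}^*_r(\mathbf{k}) \tag{70} $$

(no factor of 1/2 since $Q^*_s(\mathbf{k}) = Q_s(-\mathbf{k})$) are the canonically conjugate momenta. The equations of motion are

$$ \frac{d}{dt}\frac{\partial L}{\partial \dot{Q}^*_s(\mathbf{k})} - \frac{\partial L}{\partial Q^*_s(\mathbf{k})} = 0 \quad \text{or} \quad \ddot{Q}_s(\mathbf{k}) + \omega_s^2(\mathbf{k})\, Q_s(\mathbf{k}) = 0 \tag{71} $$

for each $\mathbf{k}$, $s$. These are the equations of motion for $3rN$ independent harmonic oscillators. Since going to the $Q$-coordinates accomplishes the decoupling of these equations, the $\{Q_s(\mathbf{k})\}$ are referred to as normal coordinates.
3.1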
Quantization and Second Quantization
P.A.M. Dirac laid down the rules of quantization, from classical Hamilton-Jacobi mechanics to Hamiltonian-based quantum mechanics, following the path (Dirac p.84-89):
1. First, identify the classical canonically conjugate set of variables $\{q_i, p_i\}$.
2. These have Poisson brackets

$$ \{\{u, v\}\} = \sum_i \left( \frac{\partial u}{\partial q_i}\frac{\partial v}{\partial p_i} - \frac{\partial u}{\partial p_i}\frac{\partial v}{\partial q_i} \right) \tag{72} $$

$$ \{\{q_i, p_j\}\} = \delta_{i,j} \qquad \{\{p_i, p_j\}\} = \{\{q_i, q_j\}\} = 0 \tag{73} $$

3. Then define the quantum Poisson bracket (the commutator)

$$ [u, v] = uv - vu = i\hbar\{\{u, v\}\} \tag{74} $$

4. In particular, $[q_i, p_j] = i\hbar\,\delta_{i,j}$, and $[q_i, q_j] = [p_i, p_j] = 0$.
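Thus, following Dirac, we may now quantize the normal coordinates

$$ [Q^*_r(\mathbf{k}), P_s(\mathbf{q})] = i\hbar\,\delta_{\mathbf{k},\mathbf{q}}\,\delta_{r,s} \quad \text{where the other commutators vanish.} \tag{75} $$

Furthermore, since we have a system of $3rN$ uncoupled harmonic oscillators we may immediately second quantize by introducing

$$ a_s(\mathbf{k}) = \frac{1}{\sqrt{2\hbar}} \left( \sqrt{\omega_s(\mathbf{k})}\, Q_s(\mathbf{k}) + \frac{i}{\sqrt{\omega_s(\mathbf{k})}}\, P_s(\mathbf{k}) \right) \tag{76} $$

$$ a^\dagger_s(\mathbf{k}) = \frac{1}{\sqrt{2\hbar}} \left( \sqrt{\omega_s(\mathbf{k})}\, Q^*_s(\mathbf{k}) - \frac{i}{\sqrt{\omega_s(\mathbf{k})}}\, P^*_s(\mathbf{k}) \right) , \tag{77} $$

or

$$ Q_s(\mathbf{k}) = \sqrt{\frac{\hbar}{2\omega_s(\mathbf{k})}} \left( a_s(\mathbf{k}) + a^\dagger_s(-\mathbf{k}) \right) , \qquad P_s(\mathbf{k}) = -i\sqrt{\frac{\hbar\,\omega_s(\mathbf{k})}{2}} \left( a_s(\mathbf{k}) - a^\dagger_s(-\mathbf{k}) \right) \tag{78} $$

where

$$ [a_s(\mathbf{k}), a^\dagger_r(\mathbf{q})] = \delta_{r,s}\,\delta_{\mathbf{q},\mathbf{k}} \tag{79} $$

$$ [a_s(\mathbf{k}), a_r(\mathbf{q})] = [a^\dagger_s(\mathbf{k}), a^\dagger_r(\mathbf{q})] = 0 \tag{80} $$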
This transformation $\{Q, P\} \to \{a, a^\dagger\}$ is canonical, since it preserves the commutator algebra Eq. 75, and the Hamiltonian becomes

$$ H = \sum_{\mathbf{k},s} \hbar\omega_s(\mathbf{k}) \left( a^\dagger_s(\mathbf{k})\, a_s(\mathbf{k}) + \frac{1}{2} \right) \tag{81} $$

which is a sum over $3rN$ independent quantum oscillators, each one referred to as a phonon mode! The number of phonons in state $(\mathbf{k}, s)$ is given by the operator

$$ n_s(\mathbf{k}) = a^\dagger_s(\mathbf{k})\, a_s(\mathbf{k}) \tag{82} $$
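and $a^\dagger_s(\mathbf{k})$ and $a_s(\mathbf{k})$ create and destroy phonons, respectively, in the state $(\mathbf{k}, s)$:

$$ a^\dagger_s(\mathbf{k}) \left| n_s(\mathbf{k}) \right\rangle = \sqrt{n_s(\mathbf{k}) + 1}\, \left| n_s(\mathbf{k}) + 1 \right\rangle \tag{83} $$

$$ a_s(\mathbf{k}) \left| n_s(\mathbf{k}) \right\rangle = \sqrt{n_s(\mathbf{k})}\, \left| n_s(\mathbf{k}) - 1 \right\rangle \tag{84} $$

If $|0\rangle$ is the normalized state with no phonons present, then the state with $\{n_s(\mathbf{k})\}$ phonons in each state $(\mathbf{k}, s)$ is

$$ \left| \{n_s(\mathbf{k})\} \right\rangle = \prod_{\mathbf{k},s} \left( \frac{1}{n_s(\mathbf{k})!} \right)^{1/2} \left( a^\dagger_s(\mathbf{k}) \right)^{n_s(\mathbf{k})} |0\rangle \tag{85} $$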
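The ladder-operator relations of Eqs. 79 to 84 can be checked for a single phonon mode by representing $a$ and $a^\dagger$ as matrices in a truncated number basis. This is a minimal sketch with illustrative names (`ladder_operators`, `mode_hamiltonian`); the truncation at `n_max` is an assumption of the sketch, and it spoils the commutator only in the last basis state.

```python
import numpy as np

def ladder_operators(n_max):
    """Matrices of a and a^dagger in the basis |0>, |1>, ..., |n_max>."""
    a = np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)  # a|n> = sqrt(n)|n-1>
    return a, a.T.copy()

def mode_hamiltonian(n_max, hbar_omega=1.0):
    """Single-mode term of Eq. 81: hbar*omega*(a^dagger a + 1/2)."""
    a, adag = ladder_operators(n_max)
    return hbar_omega * (adag @ a + 0.5 * np.eye(n_max + 1))

a, adag = ladder_operators(5)
number = adag @ a                      # Eq. 82
commutator = a @ adag - adag @ a       # Eq. 79, exact away from the cutoff

state_2 = np.zeros(6)
state_2[2] = 1.0                       # the state |2>
print("a^dagger |2> =", adag @ state_2)    # weight on |3>
print("a        |2> =", a @ state_2)       # weight on |1>
print("n eigenvalues:", np.diag(number))
print("H eigenvalues:", np.diag(mode_hamiltonian(5)))
```

The energies come out as $\hbar\omega(n + 1/2)$, and the only deviation of $[a, a^\dagger]$ from the identity sits in the highest retained state, which is an artifact of cutting off the basis.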
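Finally the lattice point displacement

$$ s_{n,\alpha,i} = \frac{1}{\sqrt{M_\alpha N}} \sum_{\mathbf{q},s} \sqrt{\frac{\hbar}{2\omega_s(\mathbf{q})}} \left( a_s(\mathbf{q}) + a^\dagger_s(-\mathbf{q}) \right) \epsilon^{s}_{\alpha,i}(\mathbf{q})\, e^{i\mathbf{q}\cdot\mathbf{r}_n} \tag{86} $$

will be important in the next section, especially with respect to zero-point motion (i.e. $\left\langle s^2 \right\rangle_{T=0} \neq 0$).

4
Theory of Neutron Scattering
To “see” the lattice with neutrons, we want their De Broglie wavelength $\lambda = h/p$,

$$ \lambda_{\text{neutron}} = \frac{0.29\ \text{Å}}{\sqrt{E}} \quad (E \text{ measured in eV}) \tag{87} $$

to be of the same length as the intersite distance on the lattice. This means that their kinetic energy $E \approx \frac{1}{2}Mv^2 \approx 0.1$ eV, or $E/k_B \approx 1000$ K; i.e. thermal neutrons.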
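The numbers in Eq. 87 follow directly from $\lambda = h/\sqrt{2ME}$. The sketch below evaluates the wavelength and the equivalent temperature for a few energies; the function names are illustrative, and the physical constants are standard SI values entered by hand.

```python
import math

PLANCK_H = 6.62607015e-34          # J s
NEUTRON_MASS = 1.67492749804e-27   # kg
ELECTRON_VOLT = 1.602176634e-19    # J
BOLTZMANN_EV = 8.617333262e-5      # eV / K

def neutron_wavelength_angstrom(energy_ev):
    """De Broglie wavelength lambda = h / sqrt(2 M E), returned in Angstrom."""
    momentum = math.sqrt(2.0 * NEUTRON_MASS * energy_ev * ELECTRON_VOLT)
    return PLANCK_H / momentum * 1e10

def equivalent_temperature_kelvin(energy_ev):
    """Temperature E / k_B corresponding to a kinetic energy E."""
    return energy_ev / BOLTZMANN_EV

for energy_ev in (1.0, 0.1, 0.025):
    print(f"E = {energy_ev:5.3f} eV   lambda = {neutron_wavelength_angstrom(energy_ev):.3f} A"
          f"   E/k_B = {equivalent_temperature_kelvin(energy_ev):.0f} K")
```

At $E = 1$ eV the prefactor is about 0.286 Å, matching the rounded 0.29 Å in Eq. 87, and at $E \approx 0.1$ eV the wavelength is close to 1 Å, comparable to typical lattice constants.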
Figure 10: Neutron Scattering. A source of thermal neutrons illuminates a lattice with primitive vectors $\mathbf{a}_1$ and $\mathbf{a}_2$, with $\lambda \approx |\mathbf{a}_1|$ or $|\mathbf{a}_2|$. The De Broglie wavelength of the neutrons must be roughly the same size as the lattice constants in order to learn about the lattice structure and its vibrational modes from the experiment. This dictates the use of thermal neutrons.
Since the neutron is chargeless, it only interacts with the atomic nucleus through a short-ranged nuclear interaction (ignoring any spin-spin interaction). The range of this interaction is 1 Fermi ($10^{-13}$ cm) or about the radius of the atomic nucleus. Thus

$$ \lambda \sim \text{Å} \gg \text{range of the interaction} \sim 10^{-13}\ \text{cm} . \tag{88} $$

Thus the neutron cannot “see” the detailed structure of the nucleus, and so we may approximate the neutron-ion interaction potential as a contact interaction

$$ V(\mathbf{r}) = \sum_{\mathbf{r}_n} V_n\, \delta(\mathbf{r} - \mathbf{r}_n) \tag{89} $$

i.e., we may ignore the angular dependence of the scattering factor $f$.
4.1
Classical Theory of Neutron Scattering
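Due to the importance of lattice vibrations, which are inherently quantum in nature, there is a limit to what we can learn from a classical theory of diffraction. Nevertheless it is useful to compare the classical result to what we will develop for the quantum problems. For the classical problem we will assume that the lattice is elemental ($r = 1$) and start with a generalization of the formalism developed in the last chapter

$$ I \propto |\rho(\mathbf{K}, \Omega)|^2 \tag{90} $$

where $\mathbf{K} = \mathbf{k}_0 - \mathbf{k}$ and $\Omega = \omega_0 - \omega$. Furthermore, we take

$$ \rho(\mathbf{r}(t)) \propto \sum_n \delta(\mathbf{r} - \mathbf{r}_n(t)) \tag{91} $$

where

$$ \mathbf{r}_n(t) = \mathbf{r}_n + \mathbf{s}_n(t) \quad \text{and} \quad \mathbf{s}_n(t) = \frac{1}{\sqrt{M}}\, \mathbf{u}(\mathbf{q})\, e^{i(\mathbf{q}\cdot\mathbf{r}_n - \omega(\mathbf{q})t)} \tag{92} $$

describes the harmonic motion of the $s$-mode with wave-vector $\mathbf{q}$. Then

$$ \rho(\mathbf{K}, \Omega) \propto \sum_n \int dt\, e^{i[\mathbf{K}\cdot(\mathbf{r}_n + \mathbf{s}_n(t)) - \Omega t]} . \tag{93} $$

For $|\mathbf{K}| \sim 2\pi/\text{Å}$ and $s_n(t) \ll \text{Å}$ we may expand

$$ \rho(\mathbf{K}, \Omega) \propto \sum_n \int dt\, e^{i[\mathbf{K}\cdot\mathbf{r}_n - \Omega t]} \left( 1 + i\mathbf{K}\cdot\mathbf{s}_n(t) + \cdots \right) \tag{94} $$

The first term yields a finite contribution only when

$$ \mathbf{K} = \mathbf{k}_0 - \mathbf{k} = \mathbf{G} \quad \text{and} \quad \Omega = \omega_0 - \omega = 0 \tag{95} $$

which are the familiar Bragg conditions for elastic scattering. The second term, however, yields something new. It only yields a finite result when

$$ \mathbf{K} \pm \mathbf{q} = \mathbf{k}_0 - \mathbf{k} \pm \mathbf{q} = \mathbf{G} \quad \text{and} \quad \Omega \pm \omega_s(\mathbf{q}) = \omega_0 - \omega \pm \omega_s(\mathbf{q}) = 0 \tag{96} $$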
When multiplied by $\hbar$, these can be interpreted as conditions for the conservation of (crystal) momentum and energy when the scattering event involves the creation (destruction) of a lattice excitation (phonon). These processes are called Stokes and anti-Stokes processes, respectively, and are illustrated in Fig. 11.

Figure 11: Stokes and anti-Stokes processes in inelastic neutron scattering involving the creation or absorption of a lattice phonon. In the Stokes process (phonon creation) a neutron with $\mathbf{k}_0, \omega_0$ leaves with $\mathbf{k} = \mathbf{k}_0 - \mathbf{q}$, $\omega = \omega_0 - \omega_q$; in the anti-Stokes process (phonon absorption) it leaves with $\mathbf{k} = \mathbf{k}_0 + \mathbf{q}$, $\omega = \omega_0 + \omega_q$.

Clearly, the anti-Stokes process can only happen at finite temperatures where real (as opposed to virtual) phonons are excited. Thus, our classical formalism does not correctly describe the temperature dependence of the scattering. Several other things are missing, including:
• Security in the validity of the result.
• The effects of zero-point motion.
• Correct temperature dependence.
4.2
Quantum Theory of Neutron Scattering
To address these concerns, we will do a fully quantum calculation. Several useful references for this calculation include
• Ashcroft and Mermin, Appendix N, p. 790
• Callaway, p. 36–
• Hook and Hall (for experiment), Ch. 12, p. 342–

We will imagine that the scattering shown in Fig. 12 occurs in a box of volume $V$. The momentum transfer from the neutron to the lattice is $\mathbf{K} = \mathbf{k}_0 - \mathbf{k}_f$, and the energy transfer, which is finite for inelastic scattering, is $\hbar\Omega = \hbar(\omega_0 - \omega_f)$.

Figure 12: The initial (left) and final (right) states of the neutron and lattice during a scattering event. The initial system state is given by $\Psi_0 = \phi_0\psi_0$, with neutron state $\psi_0 = \frac{1}{\sqrt{V}} e^{i(\mathbf{k}_0\cdot\mathbf{r} - \omega_0 t)}$ and energy $\epsilon_0 = E_0 + \hbar\omega_0$ where $\omega_0 = k_0^2/2M$. The final system state is given by $\Psi_f = \phi_f\psi_f$, with neutron state $\psi_f = \frac{1}{\sqrt{V}} e^{i(\mathbf{k}_f\cdot\mathbf{r} - \omega_f t)}$ and energy $\epsilon_f = E_f + \hbar\omega_f$ where $\omega_f = k_f^2/2M$.

Again we will take the neutron-lattice interaction to be local

$$ V(\mathbf{r}) = \sum_{\mathbf{r}_n} V(\mathbf{r} - \mathbf{r}_n) = \frac{1}{N}\sum_{\mathbf{q},n} V(\mathbf{q})\, e^{i\mathbf{q}\cdot(\mathbf{r} - \mathbf{r}_n)} = \int \frac{d^3q}{V}\, V_0 \sum_n e^{i\mathbf{q}\cdot(\mathbf{r} - \mathbf{r}_n)} \tag{97} $$

where the locality of the interaction ($V(\mathbf{r} - \mathbf{r}_n) \propto \delta(\mathbf{r} - \mathbf{r}_n)$) indicates that $V(\mathbf{q}) = V(0) = V_0$. Consistent with Ashcroft and Mermin, we will take

$$ V(\mathbf{r}) = \frac{2\pi\hbar^2 a}{M}\, \frac{1}{V} \sum_{\mathbf{r}_n} \int d^3q\, e^{i\mathbf{q}\cdot(\mathbf{r} - \mathbf{r}_n)} \tag{98} $$

where $a$ is the scattering length, and $V_0 = \frac{2\pi\hbar^2 a}{M}$ is chosen such that the total cross section $\sigma = 4\pi a^2$.

To formulate our quantum theory, we will use Fermi's golden rule for time dependent perturbation theory. (This is fully equivalent to the lowest-order Born approximation.) The probability per unit time for a neutron to scatter from state $\mathbf{k}_0$ to $\mathbf{k}_f$ is given by

$$ P = \frac{2\pi}{\hbar}\sum_f \delta(\epsilon_0 - \epsilon_f) \left| \left\langle \Psi_0 \right| V \left| \Psi_f \right\rangle \right|^2 \tag{99} $$

$$ \phantom{P} = \frac{2\pi}{\hbar}\sum_f \delta(E_0 + \hbar\omega_0 - E_f - \hbar\omega_f) \left| \frac{1}{V}\int d^3r\, e^{i(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{r}} \left\langle \phi_0 \right| V(\mathbf{r}) \left| \phi_f \right\rangle \right|^2 \tag{100} $$
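If we now substitute in the ion-neutron potential Eq. 98, then the integral over $\mathbf{r}$ will yield a delta function $V\delta(\mathbf{q} + \mathbf{k} - \mathbf{k}_0)$ which allows the $\mathbf{q}$ integral to be evaluated:

$$ P = a^2\, \frac{(2\pi\hbar)^3}{(MV)^2} \sum_f \delta(E_0 - E_f + \hbar\Omega) \left| \sum_{\mathbf{r}_n} \left\langle \phi_0 \right| e^{-i\mathbf{K}\cdot\mathbf{r}_n} \left| \phi_f \right\rangle \right|^2 \tag{101} $$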
Now, before proceeding to a calculation of the differential cross section $\frac{d\sigma}{d\Omega\,dE}$ we must be able to convert this probability (rate) for eigenstates into a flux of neutrons of energy $E$ and momentum $p$. A differential volume element of momentum space $d^3p$ contains $V d^3p/(2\pi\hbar)^3$ neutron states. While this is a natural consequence of the uncertainty principle, it is useful to show this in a more quantitative sense: Imagine a cubic volume $V = L^3$ with periodic boundary conditions so that for any state $\Psi$ in $V$, $\Psi(x + L, y, z) = \Psi(x, y, z)$. If we write
$$\Psi(r) = \frac{1}{N}\sum_q e^{iq\cdot r}\,\Psi(q), \tag{102}$$
then it must be that
$$q_x L = 2\pi m \quad\text{where } m \text{ is an integer} \tag{103}$$
with similar relations for the $y$ and $z$ components. So for each volume element of $q$-space $\left(\frac{2\pi}{L}\right)^3$ there is one such state. In terms of states $p = \hbar q$, the volume of a state is $(2\pi\hbar/L)^3$. Thus $d^3p$ contains $V d^3p/(2\pi\hbar)^3$ states. The incident neutron flux of states (velocity times density) is
$$j = \frac{\hbar k_0}{M}\,|\Psi_0|^2 = \frac{\hbar k_0}{M}\left|\frac{1}{\sqrt{V}}\,e^{ik_0\cdot r}\right|^2 = \frac{\hbar k_0}{MV} \tag{104}$$
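Then since the number of neutrons is conserved
$$j\,\frac{d\sigma}{dE\,d\Omega}\,dE\,d\Omega = \frac{\hbar k_0}{MV}\,\frac{d\sigma}{dE\,d\Omega}\,dE\,d\Omega = PV\,\frac{d^3p}{(2\pi\hbar)^3} = PV\,\frac{p^2\,dp\,d\Omega}{(2\pi\hbar)^3} \tag{105}$$
And for thermal (non-relativistic) neutrons $E = p^2/2M$, so $dE = p\,dp/M$, and with $p = \hbar k$
$$\frac{\hbar k_0}{MV}\,\frac{d\sigma}{dE\,d\Omega}\,dE\,d\Omega = PV\,\frac{\hbar k M\,dE\,d\Omega}{(2\pi\hbar)^3} \tag{106}$$
or
$$\frac{d\sigma}{dE\,d\Omega} = P\,\frac{k}{k_0}\,\frac{(MV)^2}{(2\pi\hbar)^3}. \tag{107}$$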
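Substituting in the previous result for $P$
$$\frac{d\sigma}{dE\,d\Omega} = \frac{k}{k_0}\,\frac{(MV)^2}{(2\pi\hbar)^3}\,a^2\,\frac{(2\pi\hbar)^3}{(MV)^2}\sum_f \delta(E_0 - E_f + \hbar\Omega)\,\left|\sum_{r_n}\langle\phi_0|e^{-iK\cdot r_n}|\phi_f\rangle\right|^2 \tag{108}$$
or
$$\frac{d\sigma}{dE\,d\Omega} = \frac{k}{k_0}\,\frac{N a^2}{\hbar}\,S(K, \Omega) \tag{109}$$
where, using $\delta(E_0 - E_f + \hbar\Omega) = \frac{1}{\hbar}\,\delta\big((E_0 - E_f)/\hbar + \Omega\big)$ to account for the factor of $1/\hbar$,
$$S(K, \Omega) = \frac{1}{N}\sum_f \delta\big((E_0 - E_f)/\hbar + \Omega\big)\,\left|\sum_{r_n}\langle\phi_0|e^{-iK\cdot r_n}|\phi_f\rangle\right|^2. \tag{110}$$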
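We may deal with the Dirac delta function by substituting
$$\delta(\Omega) = \int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i\Omega t}, \tag{111}$$
so that
$$S(K, \Omega) = \frac{1}{N}\sum_f\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i((E_0 - E_f)/\hbar + \Omega)t}\sum_{r_n, r_m}\langle\phi_0|e^{iK\cdot r_n}|\phi_f\rangle\,\langle\phi_f|e^{-iK\cdot r_m}|\phi_0\rangle. \tag{112}$$
Then as
$$e^{-iHt/\hbar}\,|\phi_l\rangle = e^{-iE_l t/\hbar}\,|\phi_l\rangle \tag{113}$$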
where $H$ is the lattice Hamiltonian, we can write this as
$$S(K, \Omega) = \frac{1}{N}\sum_f\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i\Omega t}\sum_{r_n, r_m}\langle\phi_0|e^{iHt/\hbar}\,e^{iK\cdot r_n}\,e^{-iHt/\hbar}|\phi_f\rangle\,\langle\phi_f|e^{-iK\cdot r_m}|\phi_0\rangle, \tag{114}$$
and the argument in the first expectation value is the time-dependent operator $e^{iK\cdot r_n}$ in the Heisenberg representation
$$e^{iK\cdot r_n(t)} = e^{iHt/\hbar}\,e^{iK\cdot r_n}\,e^{-iHt/\hbar}. \tag{115}$$
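Thus, using the completeness of the final lattice states, $\sum_f |\phi_f\rangle\langle\phi_f| = 1$,
$$S(K, \Omega) = \frac{1}{N}\sum_f\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i\Omega t}\sum_{r_n, r_m}\langle\phi_0|e^{iK\cdot r_n(t)}|\phi_f\rangle\,\langle\phi_f|e^{-iK\cdot r_m}|\phi_0\rangle$$
$$\phantom{S(K, \Omega)} = \frac{1}{N}\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i\Omega t}\sum_{r_n, r_m}\langle\phi_0|e^{iK\cdot r_n(t)}\,e^{-iK\cdot r_m}|\phi_0\rangle. \tag{116}$$
Now since $r_n(t) = r_n + s_n(t)$ (with the equilibrium position $r_n$ time independent),
$$S(K, \Omega) = \frac{1}{N}\sum_{n,m}\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i(K\cdot(r_n - r_m) + \Omega t)}\,\langle\phi_0|e^{iK\cdot s_n(t)}\,e^{-iK\cdot s_m}|\phi_0\rangle. \tag{117}$$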
This formula is correct at zero temperature. In order to describe finite $T$ effects (i.e., anti-Stokes processes involving phonon absorption) we must introduce a thermal average over all states
$$\langle\phi_0|A|\phi_0\rangle \rightarrow \langle A\rangle = \sum_l e^{-\beta E_l}\,\langle\phi_l|A|\phi_l\rangle \Big/ \sum_l e^{-\beta E_l}. \tag{118}$$
With this substitution,
$$S(K, \Omega) = \frac{1}{N}\sum_{n,m}\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i(K\cdot(r_n - r_m) + \Omega t)}\,\left\langle e^{iK\cdot s_n(t)}\,e^{-iK\cdot s_m}\right\rangle \tag{119}$$
and $S(K, \Omega)$ is called the dynamical structure factor.
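4.2.1 The Debye-Waller Factor

To simplify this relation further, recall that the exponentiated operators within the brackets are linear functions of the creation and annihilation operators $a^\dagger$ and $a$.
$$s_{n,\alpha}(t) = \frac{1}{\sqrt{M_\alpha N}}\sum_{q,s}\sqrt{\frac{\hbar}{2\omega_s(q)}}\;\epsilon^s_\alpha(q)\left(a_s(q)(t) + a^\dagger_s(-q)(t)\right)e^{iq\cdot r_n} \tag{120}$$
So that, in particular, $\langle s_{n,\alpha,i}(t)\rangle = \langle s_{n,\alpha,i}(0)\rangle = 0$. Then let $A = iK\cdot s_{n,\alpha,i}(t)$ and $B = -iK\cdot s_{m,\alpha,i}(0)$ (the sign of $B$ matching the factor $e^{-iK\cdot s_m}$ in Eq. 119) and suppose that the expectation values of $A$ and $B$ are small. Then
$$\left\langle e^A e^B\right\rangle = \left\langle\left(1 + A + \tfrac12 A^2 + \cdots\right)\left(1 + B + \tfrac12 B^2 + \cdots\right)\right\rangle$$
$$\approx \left\langle 1 + A + B + AB + \tfrac12 A^2 + \tfrac12 B^2 + \cdots\right\rangle$$
$$\approx 1 + \tfrac12\left\langle 2AB + A^2 + B^2\right\rangle + \cdots$$
$$\approx e^{\frac12\langle 2AB + A^2 + B^2\rangle} \tag{121}$$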
This relation is in fact true to all orders, as long as $A$ and $B$ are linear functions of $a^\dagger$ and $a$ (cf. Ashcroft and Mermin, p. 792, Callaway pp. 41-48). Thus
$$\left\langle e^{iK\cdot s_n(t)}\,e^{-iK\cdot s_m}\right\rangle = e^{-\frac12\langle(K\cdot s_n(t))^2\rangle}\,e^{-\frac12\langle(K\cdot s_m)^2\rangle}\,e^{\langle K\cdot s_n(t)\,K\cdot s_m\rangle}. \tag{122}$$
Since the Hamiltonian has no time dependence, and the lattice is invariant under translations $r_n$,
$$\left\langle e^{iK\cdot s_n(t)}\,e^{-iK\cdot s_m}\right\rangle = e^{-\langle(K\cdot s_n)^2\rangle}\,e^{\langle K\cdot s_{n-m}(t)\,K\cdot s_0\rangle}, \tag{123}$$
where the first term is called the Debye-Waller factor $e^{-2W}$,
$$e^{-2W} = e^{-\langle(K\cdot s_n)^2\rangle}. \tag{124}$$
Thus letting $l = n - m$
$$S(K, \Omega) = e^{-2W}\sum_l\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i(K\cdot r_l + \Omega t)}\,e^{\langle K\cdot s_l(t)\,K\cdot s_0\rangle}. \tag{125}$$
Here the Debye-Waller factor contains much of the crucial quantum physics. The exponent $\langle(K\cdot s_n)^2\rangle$ is finite even at $T = 0$ due to zero-point fluctuations, and since $\langle(K\cdot s_n)^2\rangle$ will increase with temperature, the total strength of the Bragg peaks will diminish with increasing $T$. However, as long as a crystal has long-ranged order, the Debye-Waller factor will remain finite.

4.2.2 Zero-phonon Elastic Scattering
One may disentangle the elastic and inelastic processes by expanding the exponential in the equation above.
$$e^{\langle K\cdot s_l(t)\,K\cdot s_0\rangle} = \sum_m\frac{1}{m!}\left(\langle K\cdot s_l(t)\,K\cdot s_0\rangle\right)^m \tag{126}$$
If we approximate the exponential by 1, i.e. take only the first, $m = 0$, term, then
$$S_0(K, \Omega) = e^{-2W}\sum_l\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i(K\cdot r_l + \Omega t)}. \tag{127}$$
And we recover the lowest order classical result (modified by the Debye-Waller factor) which gives us the Bragg conditions that $S_0(K, \Omega)$ is only finite when $K = G$ and $\Omega = \omega_0 - \omega_f = 0$.
$$S_0(K, \Omega) = e^{-2W}\,\delta(\Omega)\,N\sum_G\delta_{K,G}, \tag{128}$$
$$\frac{d\sigma_0}{dE\,d\Omega} = \frac{k}{k_0}\,\frac{N a^2}{\hbar}\,e^{-2W}\,\delta(\Omega)\,N\sum_G\delta_{K,G} \tag{129}$$
However, now the scattering intensity is reduced by the Debye-Waller factor $e^{-2W}$, which accounts for zero-point motion and thermal fluctuations.

4.2.3 One-Phonon Inelastic Scattering
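When $m = 1$, the scattering involves either the absorption or creation of a phonon. To evaluate
$$S_1(K, \Omega) = e^{-2W}\sum_l\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i(K\cdot r_l + \Omega t)}\,\langle K\cdot s_l(t)\,K\cdot s_0(0)\rangle \tag{130}$$
we need
$$s_{n,\alpha}(t) = \frac{1}{\sqrt{M_\alpha N}}\sum_{q,s}\sqrt{\frac{\hbar}{2\omega_s(q)}}\;\epsilon^s_\alpha(q)\left(a_s(q, t) + a^\dagger_s(-q, t)\right)e^{iq\cdot r_n} \tag{131}$$
in the Heisenberg representation, and therefore we need
$$a(q, t) = e^{iHt/\hbar}\,a(q)\,e^{-iHt/\hbar}$$
$$\phantom{a(q, t)} = e^{i\omega(q)t\,a^\dagger(q)a(q)}\,a(q)\,e^{-i\omega(q)t\,a^\dagger(q)a(q)}$$
$$\phantom{a(q, t)} = a(q)\,e^{i\omega(q)t\,(a^\dagger(q)a(q) - 1)}\,e^{-i\omega(q)t\,a^\dagger(q)a(q)}$$
$$\phantom{a(q, t)} = a(q)\,e^{-i\omega(q)t} \tag{132}$$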
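where we have used the fact that $(a^\dagger a)^n a = (a^\dagger a)^{n-1}a^\dagger a a = (a^\dagger a)^{n-1}(a a^\dagger - 1)a = (a^\dagger a)^{n-1}a(a^\dagger a - 1) = a(a^\dagger a - 1)^n$. Similarly $a^\dagger(q, t) = a^\dagger(q)\,e^{i\omega(q)t}$. Thus,
$$s_{n,\alpha}(t) = \frac{1}{\sqrt{M_\alpha N}}\sum_{q,s}\sqrt{\frac{\hbar}{2\omega_s(q)}}\;\epsilon^s_\alpha(q)\,e^{iq\cdot r_n}\left(a_s(q)\,e^{-i\omega_s(q)t} + a^\dagger_s(-q)\,e^{i\omega_s(q)t}\right) \tag{133}$$
and
$$s_{0,\alpha}(0) = \frac{1}{\sqrt{M_\alpha N}}\sum_{p,r}\sqrt{\frac{\hbar}{2\omega_r(p)}}\;\epsilon^r_\alpha(p)\left(a_r(p) + a^\dagger_r(-p)\right). \tag{134}$$
Recall, we want to evaluate
$$S_1(K, \Omega) = e^{-2W}\sum_l\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i(K\cdot r_l + \Omega t)}\,\langle K\cdot s_l(t)\,K\cdot s_0(0)\rangle. \tag{135}$$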
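Clearly, the only terms which survive in $\langle K\cdot s_l(t)\,K\cdot s_0(0)\rangle$ are those with $r = s$ and $p = -q$. Furthermore, the sum over $l$ yields a delta function $N\sum_G\delta_{K+q,G}$. Then as $\epsilon(G - k) = \epsilon(-k) = \epsilon^*(k)$, and $\omega(G - k) = \omega(k)$,
$$S_1(K, \Omega) = e^{-2W}\int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{i\Omega t}\sum_{q,G,s}\frac{\hbar\,|K\cdot\epsilon_s(K)|^2}{2\omega_s(q)M}\,\delta_{K+q,G}\left[e^{-i\omega_s(q)t}\left\langle a_s(-K)\,a^\dagger_s(-K)\right\rangle + e^{i\omega_s(q)t}\left\langle a^\dagger_s(-K)\,a_s(-K)\right\rangle\right] \tag{136}$$
The occupancy of each mode $n(q)$ is given by the Bose factor
$$\langle n(q)\rangle = \frac{1}{e^{\beta\hbar\omega(q)} - 1} \tag{137}$$
So, finally
$$S_1(K, \Omega) = e^{-2W}\sum_s\frac{\hbar\,|K\cdot\epsilon_s(K)|^2}{2M\omega_s(K)}\Big[(1 + n_s(K))\,\delta(-\Omega + \omega_s(K)) + n_s(K)\,\delta(\Omega + \omega_s(K))\Big]. \tag{138}$$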
For the first term, we get a contribution only when $\Omega - \omega_s(K) = \omega_0 - \omega_f - \omega_s(K) = 0$; i.e., the final energy of the neutron is smaller than the initial energy. The energy is lost in the creation of a phonon. Note that this can happen at any temperature, since $(1 + n_s(K)) \neq 0$ at any $T$. The second term is only finite when $\Omega + \omega_s(K) = \omega_0 - \omega_f + \omega_s(K) = 0$; i.e., the final energy of the neutron is larger than the initial energy. The additional energy comes from the absorption of a phonon. Thus phonon absorption is only allowed at finite temperatures, and in fact, the factor $n_s(K) = 0$ at zero temperature. These terms correspond to the Stokes and anti-Stokes processes, respectively, illustrated in Fig. 13.

[Figure 13 panels. Stokes process (phonon creation): incoming neutron $(k_0, \omega_0)$, outgoing neutron $k = k_0 - q$, $\omega = \omega_0 - \omega_q$, emitted phonon $(q, \omega_q)$, weight $1 + n_s(K)$. Anti-Stokes process (phonon absorption): incoming neutron $(k_0, \omega_0)$, outgoing neutron $k = k_0 + q$, $\omega = \omega_0 + \omega_q$, absorbed phonon $(q, \omega_q)$, weight $n_s(K)$.]
Figure 13: Stokes and anti-Stokes processes in inelastic neutron scattering involving the creation or absorption of a lattice phonon. The anti-Stokes process can only occur at finite $T$, when $n_s(K) \neq 0$.
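The thermal weights of the two terms in Eq. 138 can be tabulated directly. The following is a minimal sketch, not part of the original notes: it works in units with ħ = k_B = 1, and the function names `bose` and `one_phonon_weights` are illustrative choices.

```python
import math

def bose(omega, T):
    """Bose occupation 1/(exp(omega/T) - 1) in units hbar = kB = 1; zero at T = 0."""
    if T <= 0.0:
        return 0.0
    return 1.0 / math.expm1(omega / T)

def one_phonon_weights(omega, T):
    """Return the (Stokes, anti-Stokes) thermal weights (1 + n, n) of Eq. 138."""
    n = bose(omega, T)
    return 1.0 + n, n

# Phonon creation survives at T = 0; phonon absorption does not.
for T in (0.0, 0.5, 2.0):
    stokes, anti = one_phonon_weights(1.0, T)
    print(f"T = {T}: Stokes weight = {stokes:.4f}, anti-Stokes weight = {anti:.4f}")
```

The ratio of the two weights is n/(1 + n) = exp(-ħω/k_B T), so the anti-Stokes line is exponentially weak whenever k_B T is small compared with the phonon energy.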
If we were to continue our expansion of the exponential to larger values of m, we would find multiple-phonon scattering processes. However, these terms are usually of minimal contribution to the total cross section, due to the fact that the average ionic excursion s is small, and are usually neglected.
Chapter 5: Thermal Properties of Crystal Lattices Debye December 22, 2000
Contents
1 Formalism
1.1 The Virial Theorem
1.2 The Phonon Density of States
2 Models of Lattice Dispersion
2.1 The Debye Model
2.2 The Einstein Model
3 Thermodynamics of Crystal Lattices
3.1 Long-Range Order
3.2 Thermodynamics
3.3 Thermal Expansion, the Gruneisen Parameter
3.4 Thermal Conductivity
In the previous chapter, we have shown that the motion of a harmonic crystal can be described by a set of decoupled harmonic oscillators.
$$H = \frac12\sum_{k,s}\left(|P_s(k)|^2 + \omega_s^2(k)\,|Q_s(k)|^2\right) \tag{1}$$
At a given temperature $T$, the occupancy of a given mode is
$$\langle n_s(k)\rangle = \frac{1}{e^{\beta\hbar\omega_s(k)} - 1} \tag{2}$$
In this chapter, we will apply this information to calculate the thermodynamic properties of the ionic lattice, in addition to addressing questions regarding its long-range order in the presence of lattice vibrations (i.e., do phonons destroy the order?). In order to evaluate the different formulas for these quantities, we will first discuss two matters of formal convenience.

1 Formalism

To evaluate some of these properties we can use the virial theorem and integrals over the density of states.
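1.1 The Virial Theorem

Consider the Hamiltonian for a quantum system $H(x, p)$, where $x$ and $p$ are the canonically conjugate variables. Then the expectation value of any function of these canonically conjugate variables $f(x, p)$ in a stationary state (eigenstate) is constant in time. Consider
$$\frac{d}{dt}\langle x\cdot p\rangle = \frac{i}{\hbar}\langle[H, x\cdot p]\rangle = \frac{i}{\hbar}\langle H\,x\cdot p - x\cdot p\,H\rangle = \frac{iE}{\hbar}\langle x\cdot p - x\cdot p\rangle = 0 \tag{3}$$
where $E$ is the eigenenergy of the stationary state. Let
$$H = \frac{p^2}{2m} + V(x), \tag{4}$$
then
$$0 = \left\langle\left[\frac{p^2}{2m} + V(x),\; x\cdot p\right]\right\rangle = \left\langle\left[\frac{p^2}{2m},\; x\cdot p\right] + [V(x),\; x\cdot p]\right\rangle = \left\langle\frac{1}{2m}\left[p^2, x\right]\cdot p + x\cdot[V(x), p]\right\rangle \tag{5}$$
Then as $\left[p^2, x\right] = p\,[p, x] + [p, x]\,p = -2i\hbar p$ and $p = -i\hbar\nabla_x$,
$$0 = \left\langle -\frac{i\hbar}{m}\,p^2 + i\hbar\,x\cdot\nabla_x V(x)\right\rangle \tag{6}$$
or, the Virial theorem:
$$2\langle T\rangle = \langle x\cdot\nabla_x V(x)\rangle \tag{7}$$
Now, let's apply this to a harmonic oscillator where $V = \frac12 m\omega^2 x^2$ and $T = p^2/2m$, $\langle H\rangle = \langle T\rangle + \langle V\rangle = \hbar\omega(n + \frac12)$, and $x\cdot\nabla_x V = 2V$. We get
$$\langle T\rangle = \langle V(x)\rangle = \frac12\hbar\omega\left(n + \frac12\right) \tag{8}$$
Let's now apply this to find the RMS excursion of a lattice site in an elemental lattice ($r = 1$):
$$\langle s^2\rangle = \frac{1}{N}\sum_{n,i}\langle s_{n,i}^2\rangle$$
$$= \frac{1}{N}\sum_{n,i}\left\langle\frac{1}{NM}\sum_{q,s,k,r} Q_r(k)\,\epsilon^r_i(k)\,e^{ik\cdot r_n}\,Q_s(q)\,\epsilon^s_i(q)\,e^{iq\cdot r_n}\right\rangle$$
$$= \frac{1}{N}\sum_{n,i}\left\langle\frac{1}{NM}\sum_{q,s,k,r} Q_r(-k)\,\epsilon^r_i(-k)\,e^{-ik\cdot r_n}\,Q_s(q)\,\epsilon^s_i(q)\,e^{iq\cdot r_n}\right\rangle$$
$$= \frac{1}{N}\sum_{n,i}\left\langle\frac{1}{NM}\sum_{q,s,k,r} Q^*_r(k)\,\epsilon^{*r}_i(k)\,e^{-ik\cdot r_n}\,Q_s(q)\,\epsilon^s_i(q)\,e^{iq\cdot r_n}\right\rangle$$
$$= \frac{1}{NM}\sum_{q,s}\left\langle|Q_s(q)|^2\right\rangle$$
$$= \frac{1}{NM}\sum_{q,s}\frac{2}{\omega_s^2(q)}\left\langle\frac12\omega_s^2(q)\,|Q_s(q)|^2\right\rangle$$
$$= \frac{1}{NM}\sum_{q,s}\frac{2}{\omega_s^2(q)}\,\frac12\hbar\omega_s(q)\left(n_s(q) + \frac12\right)$$
$$\langle s^2\rangle = \frac{1}{NM}\sum_{q,s}\frac{\hbar}{\omega_s(q)}\left(n_s(q) + \frac12\right) \tag{9}$$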
This sum, which must be finite in order for the system to have long-range order, is still difficult to perform. However, the summand may be written as a function of $\omega_s(q)$ only.
$$\langle s^2\rangle = \frac{1}{NM}\sum_{q,s}\frac{\hbar}{\omega_s(q)}\left(\frac{1}{e^{\beta\hbar\omega_s(q)} - 1} + \frac12\right) \tag{10}$$
It would be convenient, therefore, to introduce a density of phonon states
$$Z(\omega) = \frac{1}{N}\sum_{q,s}\delta(\omega - \omega_s(q)) \tag{11}$$
so that
$$\langle s^2\rangle = \frac{\hbar}{M}\int d\omega\,Z(\omega)\,\frac{1}{\omega}\left(n(\omega) + \frac12\right) \tag{12}$$

1.2 The Phonon Density of States

In addition to the calculation of $\langle s^2\rangle$, the density of states $Z(\omega)$ is also useful in the calculation of $E = \langle H\rangle$, the partition function, and the related thermodynamic properties. In order to calculate
$$Z(\omega) = \frac{1}{N}\sum_{q,s}\delta(\omega - \omega_s(q)) \tag{13}$$
we must first better define the sum over $q$. As we discussed last chapter for a 3-d cubic system, we will assume that we have a periodic finite lattice of $N$ basis points, or $N^{1/3}$ in each of the principal lattice directions $a_1$, $a_2$, and $a_3$. Then, we want the Fourier representation to respect the periodic boundary conditions (pbc), so
$$e^{iq\cdot(r + N^{1/3}(a_1 + a_2 + a_3))} = e^{iq\cdot r} \tag{14}$$
This means that $q_i = 2\pi m/N^{1/3}$, where $m$ is an integer, and $i$ indicates one of the coordinates $x$, $y$ or $z$. In addition, we only want unique values of $q_i$, so we will choose those within the first Brillouin zone, so that
$$G\cdot q \le \frac12 G^2 \tag{15}$$

Figure 1: First Brillouin zone of the square lattice (labels: G vector, bisector, first Brillouin zone).
The size of this region is the same as that of a unit cell of the reciprocal lattice, $g_1\cdot(g_2\times g_3)$. Since there are $N$ states in this region, the density of $q$ states is
$$\frac{N}{g_1\cdot(g_2\times g_3)} = \frac{N V_c}{(2\pi)^3} = \frac{V}{(2\pi)^3} \tag{16}$$
where $V_c$ is the volume of a Bravais lattice cell ($a_1\cdot a_2\times a_3$), and $V$ is the lattice volume. Clearly as $N \to \infty$ the density increases until a continuum of states is formed (all that we need here is that the spacing between $q$-states be much smaller than any physically relevant value of $q$). The number of states in a frequency interval $d\omega$ is then given by the volume of $q$-space between the surfaces defined by $\omega = \omega_s(q)$ and $\omega = \omega_s(q) + d\omega$ multiplied by $V/(2\pi)^3$
$$Z(\omega)\,d\omega = \frac{V}{(2\pi)^3}\int_\omega^{\omega + d\omega} d^3q = \frac{V}{(2\pi)^3}\sum_s\int d^3q\,\delta(\omega - \omega_s(q))\,d\omega \tag{17}$$

Figure 2: States in $q$-space (axes $q_x$, $q_y$). $S_\omega$ is the surface of constant $\omega = \omega_s(q)$, so that $d^3q = dS_\omega\,dq_\perp = \frac{dS_\omega\,d\omega}{|\nabla_q\omega_s(q)|}$.

As shown in Fig. 2, $d\omega = |\nabla_q\omega_s(q)|\,dq_\perp$, and
$$d^3q = dS_\omega\,dq_\perp = \frac{dS_\omega\,d\omega}{|\nabla_q\omega_s(q)|}, \tag{18}$$
where $S_\omega$ is the surface in $q$-space of constant $\omega = \omega_s(q)$. Then
$$Z(\omega)\,d\omega = \frac{V}{(2\pi)^3}\sum_s\int_{\omega = \omega_s(q)}\frac{dS_\omega}{|\nabla_q\omega_s(q)|}\,d\omega \tag{19}$$
Thus the density of states is high in regions where the dispersion is flat so that $\nabla_q\omega_s(q)$ is small. As an example, consider the 1-d harmonic chain shown in Fig. 3.

[Figure 3 panels. Left: chain of masses $M$ with spacing $a$, and its dispersion $\omega(q)$ for $-\pi/a \le q \le \pi/a$, linear near $q = 0$ and flat at the maximum $\omega_0$. Right: the density of states $Z(\omega)$ versus $\omega$, up to $\omega_0$.]
Figure 3: Linear harmonic chain. The phonon dispersion of this chain must include an acoustic mode, so $\omega(q)$ will be linear near $q = 0$, and it must be symmetric about $q = 0$ and $q = \pi/a$ due to the point-group symmetry. Thus, the density of states (DOS) will be flat near $\omega = 0$ corresponding to the acoustic mode (for which $\nabla_q\omega(q) =$ constant), and will be divergent near $\omega = \omega_0$ corresponding to the peak of the dispersion (where $\nabla_q\omega(q) = 0$).

Real phonon dispersions have maxima which are not at a zone boundary, with corresponding peaks in the phonon DOS. However, any point within the Brillouin zone for which $\nabla_q\omega(q) = 0$ (cusp, maxima, minima) will yield an integrable singularity in the DOS.
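The behavior of the 1-d chain DOS can be checked numerically. The sketch below is an illustration, not part of the original notes: it assumes the standard nearest-neighbor dispersion ω(q) = ω0 |sin(qa/2)|, which the text does not write out, and bins the mode frequencies on a uniform q grid. For that assumed dispersion the N → ∞ result is Z(ω) = 2 / (π sqrt(ω0² − ω²)), used here for comparison. All function names are illustrative.

```python
import math

def chain_dos_histogram(n_q=200_000, n_bins=20, omega0=1.0, a=1.0):
    """Histogram estimate of Z(omega) = (1/N) sum_q delta(omega - omega(q))."""
    width = omega0 / n_bins
    counts = [0] * n_bins
    for j in range(n_q):
        # midpoint grid over the first Brillouin zone (-pi/a, pi/a)
        q = -math.pi / a + (j + 0.5) * 2.0 * math.pi / (a * n_q)
        omega = omega0 * abs(math.sin(q * a / 2.0))  # assumed dispersion
        counts[min(int(omega / width), n_bins - 1)] += 1
    return [c / (n_q * width) for c in counts]

def chain_dos_exact(omega, omega0=1.0):
    """Continuum DOS for the assumed dispersion, normalized to one mode per site."""
    return 2.0 / (math.pi * math.sqrt(omega0**2 - omega**2))

dos = chain_dos_histogram()
width = 1.0 / len(dos)
print("lowest bin :", dos[0], "exact at bin center:", chain_dos_exact(0.5 * width))
print("highest bin:", dos[-1])
```

The lowest bins are nearly constant (the linear, acoustic part of the dispersion), while the highest bin is several times larger, reflecting the integrable divergence where the group velocity vanishes.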
2 Models of Lattice Dispersion

2.1 The Debye Model

For most thermodynamic properties, we are interested in the modes $\hbar\omega(q) \sim k_B T$, which are low frequency modes in general. From a very general set of (symmetry) constraints we have argued that all interacting lattices in which the total energy is invariant to an overall arbitrary rigid shift in the location of the lattice must have at least one acoustic mode, where for small $q$, $\omega_s(q) = c_s|q|$. Thus, for the thermodynamic properties of the lattice, we care predominantly about the limit $\omega(q) \to 0$. This physics is rather accurately described by the Debye model.

[Figure 4 plot: $\omega$ versus $q$ for the branches $\omega_+$ and $\omega_-$, with the linear Debye-model dispersion overlaid.]
Figure 4: Dispersion for the diatomic linear chain. In the Debye model, we replace the acoustic mode by a purely linear mode with the same initial dispersion and ignore any optical modes.

In the Debye model, we will assume that all modes are acoustic (elastic), so that $\omega_s(q) = c_s|q|$ for all $s$ and $q$. Then $|\nabla_q\omega_s(q)| = c_s$ for all $s$ and $q$, and
$$Z(\omega) = \frac{V}{(2\pi)^3}\sum_s\int\frac{dS_\omega}{|\nabla_q\omega_s(q)|} = \frac{V}{(2\pi)^3}\sum_s\int\frac{dS_\omega}{c_s} \tag{20}$$
The surface integral may be evaluated, and yields a constant $\int dS_\omega = S_s$ for each branch. Typically $c_s$ is different for different modes. However, we will assume that the system is isotropic, so $c_s = c$. If the dispersion is isotropic, then the surface of constant $\omega_s(q)$ is just a sphere, so the surface integral is trivial
$$S_s(\omega = \omega(q)) = \begin{cases} 2 & \text{for } d = 1 \\ 2\pi q = 2\pi\omega/c & \text{for } d = 2 \\ 4\pi q^2 = 4\pi\omega^2/c^2 & \text{for } d = 3 \end{cases} \tag{21}$$
then, since the number of modes is $d$, and with $V$ the $d$-dimensional volume of the lattice,
$$Z(\omega) = \frac{V}{(2\pi)^d}\begin{cases} 2/c & \text{for } d = 1 \\ 4\pi\omega/c^2 & \text{for } d = 2 \\ 12\pi\omega^2/c^3 & \text{for } d = 3 \end{cases} \qquad 0 < \omega < \omega_D \tag{22}$$
Note that since the total number of states is finite, we have introduced a cutoff $\omega_D$ on the frequency.

2.2 The Einstein Model
“Real” two dimensional systems, i.e., a monolayer of gas (He) deposited on an atomically perfect surface (Vycor), may be better described by an Einstein model where each atom oscillates with a frequency ω0 and does not interact with its neighbors. The model is dispersionless ω(q) = ω0, and the DOS for this system is a delta function Z(ω) = cδ(ω −ω0). Note that it does not have an acoustic mode; however, this is not in violation of the discussion in the last chapter. Why?
Figure 5: Helium adsorbed on a Vycor surface. Each He atom is attracted weakly to the surface by a van der Waals attraction and sits in a local minimum of the surface lattice potential.
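With a delta-function DOS the mean-square displacement of Eq. 12 collapses to a single term, ⟨s²⟩ = (ħ c / M ω0)(n(ω0) + 1/2). The following is a minimal sketch of that evaluation, not part of the original notes. It assumes units with ħ = k_B = 1, takes the weight c of the delta function as a free parameter (set to 3 here purely for illustration), and uses the hypothetical function name `einstein_msd`.

```python
import math

def einstein_msd(T, omega0=1.0, M=1.0, c=3.0):
    """<s^2> from Eq. 12 with Z(omega) = c * delta(omega - omega0), hbar = kB = 1."""
    n = 0.0 if T <= 0.0 else 1.0 / math.expm1(omega0 / T)
    return c / (M * omega0) * (n + 0.5)

for T in (0.0, 0.5, 5.0, 100.0):
    print(f"T = {T}: <s^2> = {einstein_msd(T):.4f}")
```

Because ω0 is finite there is no 1/ω singularity to integrate over: the result starts at the zero-point value c/(2 M ω0) and grows like c T/(M ω0²) at high temperature, but it is finite at every T.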
3 Thermodynamics of Crystal Lattices
We are now in a position to calculate many of the thermodynamic properties of crystal lattices. However, before addressing such questions as the lattice energy, free energy, and specific heat, we should see if our model has long-range order, i.e., whether it is consistent with our initial assumptions.

3.1 Long-Range Order

For simplicity, we will work on an elemental lattice model. We may define long-ranged order (LRO) as a finite value of
$$\langle s^2\rangle = \frac{\hbar}{MN}\sum_{q,s}\frac{1}{\omega_s(q)}\left(\frac{1}{e^{\beta\hbar\omega_s(q)} - 1} + \frac12\right) = \frac{\hbar}{2MN}\sum_{q,s}\frac{\cosh(\beta\hbar\omega_s(q)/2)}{\omega_s(q)\,\sinh(\beta\hbar\omega_s(q)/2)} \tag{23}$$
Since we expect all lattices to melt for some high temperature, we are interested only in the $T \to 0$ limit. Clearly also, given the factor of $\frac{1}{\omega_s(q)}$ in the summand, we are most interested in acoustic modes since they are the ones which will cause a divergence. Keeping the leading thermal contribution of the low-frequency modes together with the zero-point term,
$$\lim_{\beta\to\infty}\langle s^2\rangle = \lim_{\beta\to\infty}\frac{\hbar}{MN}\sum_{q,s}\frac{1}{\omega_s(q)}\left(\frac{1}{\beta\hbar\omega_s(q)} + \frac12\right) \tag{24}$$
Clearly the low frequency modes are most important, so a Debye model may be used
$$\lim_{\beta\to\infty}\langle s^2\rangle \approx \frac{\hbar}{MN}\int_0^{\omega_D} d\omega\,Z(\omega)\,\frac{1}{\omega}\left(\frac{1}{\beta\hbar\omega} + \frac12\right), \tag{25}$$
where Z(ω) is the same as was defined above in Eq. 22.
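$$Z(\omega) = \frac{V}{(2\pi)^d}\begin{cases} 2/c & \text{for } d = 1 \\ 4\pi\omega/c^2 & \text{for } d = 2 \\ 12\pi\omega^2/c^3 & \text{for } d = 3 \end{cases} \qquad 0 < \omega < \omega_D \tag{26}$$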
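How the Debye DOS competes with the 1/ω factor can be probed numerically by cutting the frequency integral off at a small ω_min and watching what happens as ω_min shrinks. The sketch below is an illustration under stated assumptions, not part of the original notes: units with ħ = k_B = 1, all constant prefactors dropped so that Z(ω) ∝ ω^(d−1), the exact Bose factor used for n(ω), and the hypothetical function name `msd_integral`.

```python
import math

def msd_integral(d, T, omega_min, omega_D=1.0, n_steps=20000):
    """Integrate omega^(d-2) * (n(omega) + 1/2) from omega_min to omega_D.

    Uses a midpoint rule on a logarithmic grid; prefactors are dropped.
    """
    log_lo, log_hi = math.log(omega_min), math.log(omega_D)
    h = (log_hi - log_lo) / n_steps
    total = 0.0
    for j in range(n_steps):
        w = math.exp(log_lo + (j + 0.5) * h)
        n = 0.0 if T <= 0.0 else 1.0 / math.expm1(w / T)
        total += w ** (d - 2) * (n + 0.5) * w * h  # d(omega) = omega d(ln omega)
    return total

for d in (1, 2, 3):
    for T in (0.0, 0.1):
        coarse = msd_integral(d, T, 1e-4)
        fine = msd_integral(d, T, 1e-8)
        print(f"d = {d}, T = {T}: cutoff 1e-4 -> {coarse:.4g}, cutoff 1e-8 -> {fine:.4g}")
```

An integral that keeps growing as the cutoff is lowered signals a divergent ⟨s²⟩; one that stops changing signals a finite value.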
Thus in 3-d the DOS always cancels the $1/\omega$ singularity, but in two dimensions the singularity is only cancelled when $T = 0$ ($\beta = \infty$), and in one dimension $\langle s^2\rangle = \infty$ for all $T$. This is summarized in Table 1.

⟨s²⟩    T = 0     T ≠ 0
d = 1   ∞         ∞
d = 2   finite    ∞
d = 3   finite    finite

Table 1: $\langle s^2\rangle$ for lattices of different dimension, assuming the presence of an acoustic mode.

This is a specific case of the Mermin-Wagner Theorem. We should emphasize that the result $\langle s^2\rangle = \infty$ does not mean that our theory has failed. The harmonic approximation requires that the near-neighbor strains must be small, not the displacements. Physically, it is easy to understand why one-dimensional systems do not have long range order, since as you go along the chain, the displacements of the atoms can accumulate to produce a very large rms displacement. In higher dimensional systems, the displacements in any direction are constrained by the neighbors in orthogonal directions.

Figure 6: Random fluctuations of atoms in a 1-d lattice may accumulate to produce a very large average rms displacement of the atoms from small interatomic displacements.

"Real" two dimensional systems, e.g., a monolayer of gas deposited on an atomically perfect surface, do have long-range order even at finite temperatures due to the surface potential (corrugation of the surface). These may be better described by an Einstein model where each atom oscillates with a frequency $\omega_0$ and does not interact with its neighbors. The DOS for this system is a delta function as described above. For such a DOS, $\langle s^2\rangle$ is always finite. You will explore this physics, in much more detail, in your homework.

3.2 Thermodynamics
We will assume that our system is in equilibrium with a heat bath at temperature $T$. This system is described by the canonical ensemble, which may be justified by dividing an infinite system into a finite number of smaller subsystems. Each subsystem is expected to interact weakly with the remaining system, which also acts as the subsystem's heat bath. The probability that any state in the subsystem is occupied is given by
$$P(\{n_s(k)\}) \propto e^{-\beta E(\{n_s(k)\})} \tag{27}$$
Thus the partition function is given by
$$Z = \sum_{\{n_s(k)\}} e^{-\beta E(\{n_s(k)\})} = \sum_{\{n_s(k)\}} e^{-\beta\sum_{k,s}\hbar\omega_s(k)\left(n_s(k) + \frac12\right)} = \prod_{s,k} Z_s(k) \tag{28}$$
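where $Z_s(k)$ is the partition function for the mode $s, k$; i.e., the modes are independent and decouple.
$$Z_s(k) = \sum_n e^{-\beta\hbar\omega_s(k)\left(n + \frac12\right)} = e^{-\beta\hbar\omega_s(k)/2}\sum_n e^{-\beta\hbar\omega_s(k)\,n} = \frac{e^{-\beta\hbar\omega_s(k)/2}}{1 - e^{-\beta\hbar\omega_s(k)}} = \frac{1}{2\sinh(\beta\hbar\omega_s(k)/2)} \tag{29}$$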
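The closed form of the single-mode partition function can be checked against the direct sum over occupation numbers. This is a minimal sketch, not part of the original notes; the dimensionless variable x = βħω and the function names are illustrative, and the sum is truncated at a finite `n_max` chosen large enough that the neglected tail is negligible for the values of x used.

```python
import math

def z_mode_sum(x, n_max=2000):
    """Direct (truncated) sum over n of exp(-x (n + 1/2)), with x = beta*hbar*omega."""
    return sum(math.exp(-x * (n + 0.5)) for n in range(n_max))

def z_mode_closed(x):
    """Closed form 1 / (2 sinh(x / 2)) from the geometric series."""
    return 1.0 / (2.0 * math.sinh(x / 2.0))

for x in (0.1, 1.0, 5.0):
    print(f"x = {x}: sum = {z_mode_sum(x):.10f}, closed form = {z_mode_closed(x):.10f}")
```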
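The free energy is given by
$$F = -k_B T\ln(Z) = k_B T\sum_{k,s}\ln\left(2\sinh(\beta\hbar\omega_s(k)/2)\right) \tag{30}$$
Since $dE = T\,dS - P\,dV$ and $dF = T\,dS - P\,dV - T\,dS - S\,dT$, the entropy is
$$S = -\left(\frac{\partial F}{\partial T}\right)_V, \tag{31}$$
and the system energy is then given by
$$E = F + TS = F - T\left(\frac{\partial F}{\partial T}\right)_V \tag{32}$$
where constant volume $V$ is guaranteed by the harmonic approximation (since $\langle s\rangle = 0$).
$$E = \sum_{k,s}\frac12\hbar\omega_s(k)\coth(\beta\hbar\omega_s(k)/2) \tag{33}$$
The specific heat is then given by
$$C = \left(\frac{dE}{dT}\right)_V = k_B\sum_{k,s}\left(\beta\hbar\omega_s(k)/2\right)^2\mathrm{csch}^2(\beta\hbar\omega_s(k)/2) \tag{34}$$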
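where $\mathrm{csch}(x) = 1/\sinh(x)$. Consider the specific heat of our 3-dimensional Debye model.
$$C = k_B\int_0^{\omega_D} d\omega\,Z(\omega)\left(\beta\hbar\omega/2\right)^2\mathrm{csch}^2(\beta\hbar\omega/2) = k_B\int_0^{\omega_D} d\omega\,\frac{12V\pi\omega^2}{(2\pi c)^3}\left(\beta\hbar\omega/2\right)^2\mathrm{csch}^2(\beta\hbar\omega/2) \tag{35}$$
where the Debye frequency $\omega_D$ is determined by the requirement that
$$3rN = \int_0^{\omega_D} d\omega\,Z(\omega) = \int_0^{\omega_D} d\omega\,\frac{12V\pi\omega^2}{(2\pi c)^3}, \tag{36}$$
or $V/(2\pi c)^3 = 3rN/(4\pi\omega_D^3)$.

Clearly the integral for $C$ is a mess, except in the high and low $T$ limits. At high temperatures, $\beta\hbar\omega_D/2 \ll 1$, the factor $(\beta\hbar\omega/2)^2\,\mathrm{csch}^2(\beta\hbar\omega/2) \to 1$ and
$$C \approx k_B\int_0^{\omega_D} d\omega\,Z(\omega) = 3Nrk_B \tag{37}$$
This is the well known classical result (equipartition theorem) which attributes $(1/2)k_B$ of the specific heat to each quadratic degree of freedom. Here for each element of the basis we have 6 quadratic degrees of freedom (three coordinates and three momenta).

At low temperatures, $\beta\hbar\omega_D/2 \gg 1$, the high frequency modes are exponentially suppressed, since for $\beta\hbar\omega/2 \gg 1$
$$\mathrm{csch}^2(\beta\hbar\omega/2) \approx 4e^{-\beta\hbar\omega} \tag{38}$$
Thus, at low $T$, only the low frequency modes contribute, so the upper bound of integration may be extended to $\infty$
$$C \approx 12\pi k_B\,\frac{3rN}{4\pi\omega_D^3}\int_0^\infty d\omega\,\omega^2\left(\frac{\beta\hbar\omega}{2}\right)^2\mathrm{csch}^2(\beta\hbar\omega/2). \tag{39}$$
If we make the change of variables $x = \beta\hbar\omega/2$, we get
$$C \approx 72\,k_B rN\left(\frac{1}{\beta\hbar\omega_D}\right)^3\int_0^\infty dx\,x^4\,\mathrm{csch}^2(x) \tag{40}$$
$$\phantom{C} = \frac{12\pi^4}{5}\,k_B rN\left(\frac{1}{\beta\hbar\omega_D}\right)^3 \tag{41}$$
where we have used $\int_0^\infty dx\,x^4\,\mathrm{csch}^2(x) = \pi^4/30$. Then, if we identify the Debye temperature $\theta_D = \hbar\omega_D/k_B$, we get
$$C \approx \frac{12\pi^4}{5}\,rNk_B\left(\frac{T}{\theta_D}\right)^3 \tag{42}$$
$C \propto T^3$ at low temperature is the characteristic signature of low-energy phonon excitations.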
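Although the Debye integral has no closed form at intermediate temperatures, it is easy to evaluate numerically and to compare against both limits. The sketch below is an illustration, not part of the original notes. It assumes the normalized 3-d Debye DOS Z(ω) = 9 r N ω²/ω_D³ (which integrates to 3rN), measures temperature as t = T/θ_D, and returns C in units of r N k_B. The low-temperature reference is the standard Debye law C = (12π⁴/5) r N k_B (T/θ_D)³. The function name `debye_heat_capacity` is a hypothetical choice.

```python
import math

def debye_heat_capacity(t, n_steps=20000):
    """C / (r N kB) for the 3-d Debye model at reduced temperature t = T / theta_D."""
    y_max = 1.0 / t                      # y = beta * hbar * omega, cut off at omega_D
    h = y_max / n_steps
    total = 0.0
    for j in range(n_steps):
        y = (j + 0.5) * h                # midpoint rule
        half = 0.5 * y
        if half > 300.0:
            continue                     # integrand is negligible; avoids overflow in sinh
        total += y * y * (half / math.sinh(half)) ** 2 * h
    return 9.0 * t**3 * total

for t in (0.02, 0.1, 0.5, 1.0, 10.0):
    low_t_law = 12.0 * math.pi**4 / 5.0 * t**3
    print(f"T/theta_D = {t}: C/(r N kB) = {debye_heat_capacity(t):.5g}, T^3 law = {low_t_law:.5g}")
```

At large t the result approaches the equipartition value 3, and at small t it follows the T³ law; in between, the T³ law overshoots the true heat capacity.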
3.3 Thermal Expansion, the Gruneisen Parameter

Consider a cubic system of linear dimension $L$. If unconstrained, we expect that the volume of this system will change with temperature (generally expanding with increasing $T$, but not always; cf. ice or Si). We define the coefficient of free expansion ($P = 0$) as
$$\alpha_L = \frac{1}{L}\frac{dL}{dT} \quad\text{or}\quad \alpha_V = 3\alpha_L = \frac{1}{V}\frac{dV}{dT}. \tag{43}$$
Of course, this measurement only makes sense in equilibrium.
$$P = -\left(\frac{dF}{dV}\right)_T = 0 \tag{44}$$
As mentioned earlier, since $\langle s\rangle = 0$ in the harmonic approximation, a harmonic crystal does not expand when heated. Of course, real crystals do, so the lack of thermal expansion of a harmonic crystal can be considered a limitation of the harmonic theory. To address this limitation, we can make a quasiharmonic approximation. Consider a more general potential between the ions, of the form
$$V(x) = bx + cx^3 + \frac12 m\omega^2 x^2 \tag{45}$$
and let's see if any of these terms will produce a temperature dependent displacement. The last term is the usual harmonic term, which we have already shown does not produce a $T$-dependent $\langle x\rangle$. Also the first term does not have the desired effect! It does correspond to a temperature-independent shift in the oscillator, as can be seen by completing the square
$$\frac12 m\omega^2 x^2 + bx = \frac12 m\omega^2\left(x + \frac{b}{m\omega^2}\right)^2 - \frac{b^2}{2m\omega^2} \tag{46}$$
Clearly $\langle x\rangle = -\frac{b}{m\omega^2}$, independent of the temperature; that is, assuming that $b$ is temperature-independent. What we need is a temperature dependent coefficient $b$! The cubic term has the desired effect. As can be seen in Fig. 7, as the average energy (temperature) of a particle trapped in a cubic potential increases, the mean position of the particle shifts. However, it also destroys the solubility of the model. To get around this, approximate the cubic term with a mean-field decomposition
$$cx^3 \approx c\eta\,x\left\langle x^2\right\rangle + c(1 - \eta)\,x^2\langle x\rangle \tag{47}$$
and treat these two terms separately (the new parameter $\eta$ is to be determined self-consistently, usually by minimizing the free energy with respect to $\eta$).

[Figure 7 plot: $V(x)$ versus $x$ for $c = 0.0$ and $c = 0.1$.]
Figure 7: Plot of the potential $V(x) = \frac12 m\omega^2 x^2 + cx^3$ when $m = \omega = 1$ and $c = 0.0, 0.1$. The average position of a particle $\langle x\rangle$ in the anharmonic potential, $c = 0.1$, will shift to the left as the energy (temperature) is increased; whereas that in the harmonic potential, $c = 0$, is fixed at $\langle x\rangle = 0$.

The first term yields the needed temperature dependent shift of $\langle x\rangle$
$$\frac12 m\omega^2 x^2 + c\eta\,x\left\langle x^2\right\rangle = \frac12 m\omega^2\left(x + \frac{c\eta\langle x^2\rangle}{m\omega^2}\right)^2 - \frac{c^2\eta^2\langle x^2\rangle^2}{2m\omega^2} \tag{48}$$
so that $\langle x\rangle = -\frac{c\eta\langle x^2\rangle}{m\omega^2}$. Clearly the renormalization of the equilibrium position of the harmonic oscillator will be temperature dependent. The second term, $c(1 - \eta)\,x^2\langle x\rangle$, yields a shift in the frequency $\omega \to \omega'$
$$\omega' = \omega\left(1 + \frac{2c(1 - \eta)\langle x\rangle}{m\omega^2}\right)^{1/2} \tag{49}$$
which is a function of the equilibrium position. Thus a mean-field description of the cubic term is consistent with the observed physics. In what follows, we will approximate the effect of the anharmonic cubic term as a shift in the equilibrium position of the lattice (and hence the lattice potential) and a change of $\omega$ to $\omega'$; however, we imagine that the energy levels remain of the form
$$E_n = \hbar\omega'(\langle x\rangle)\left(n + \frac12\right), \tag{50}$$
and that $\langle x\rangle$ varies with temperature, consistent with the mean-field approximation just described.

To proceed, imagine the cubic system to be made up of oscillators which are independent. Since the final result can be formulated as a sum over these independent modes, consider only one. In equilibrium, where $P = -\left(\frac{dF}{dV}\right)_T = 0$, the free energy of one of the modes is
$$F = \Phi + \frac12\hbar\omega + k_B T\ln\left(1 - e^{-\beta\hbar\omega}\right) \tag{51}$$
and (following the notation of Ibach and Lüth), let the lattice potential
$$\Phi = \Phi_0 + \frac12 f(a - a_0)^2 + \cdots \tag{52}$$
where $f$ is the spring constant. Then
$$0 = P = \left(\frac{dF}{da}\right)_T = f(a - a_0) + \frac{1}{\omega}\frac{\partial\omega}{\partial a}\left(\frac12\hbar\omega + \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}\right). \tag{53}$$
If we identify the last term in parentheses as $\epsilon(\omega, T)$, and solve for $a$, then
$$a = a_0 - \frac{1}{\omega f}\frac{\partial\omega}{\partial a}\,\epsilon(\omega, T) \tag{54}$$
Since we now know $a(T)$ for a single mode, we may calculate the linear expansion coefficient for this mode
$$\alpha_L = \frac{1}{a_0}\frac{da}{dT} = -\frac{1}{a_0^2 f}\,\frac{\partial\ln\omega}{\partial\ln a}\,\frac{\partial\epsilon(\omega, T)}{\partial T} \tag{55}$$
To generalize this to a solid let $\alpha_L \to \alpha_V$ (as discussed above) and $a_0^2 f \to V^2\frac{dP}{dV} = V\kappa$ ($\kappa$ is the bulk modulus) and sum over all the modes $k, s$
$$\alpha_V = \frac{1}{V}\frac{dV}{dT} = \frac{1}{\kappa V}\sum_{k,s} -\frac{\partial\ln\omega_s(k)}{\partial\ln V}\,\frac{\partial\epsilon(\omega_s(k), T)}{\partial T}. \tag{56}$$
Clearly (due to the factor of $\frac{\partial\epsilon}{\partial T}$), $\alpha_V$ will have a behavior similar to that of the specific heat ($\alpha_V \sim T^3$ for low $T$, and $\alpha_V =$ constant for high $T$). In addition, for many lattices, the Gruneisen number
$$\gamma = -\frac{\partial\ln\omega_s(k)}{\partial\ln V} \tag{57}$$
shows a weak dependence upon $s, k$, and may be replaced by its average, called the Gruneisen parameter
$$\langle\gamma\rangle = \left\langle -\frac{\partial\ln\omega_s(k)}{\partial\ln V}\right\rangle, \tag{58}$$
typically on the order of two.

Before proceeding to the next section, I would like to reexamine the cubic term in a crystal where
$$\sum_l s_l^3 = \frac{\hbar^{3/2}}{(2MN)^{3/2}}\sum_{l,k,q,p}\frac{1}{\sqrt{\omega(q)\,\omega(k)\,\omega(p)}}\;e^{i(p + q + k - G)\cdot r_l}\left(a(k) + a^\dagger(-k)\right)\left(a(p) + a^\dagger(-p)\right)\left(a(q) + a^\dagger(-q)\right). \tag{59}$$
The sum over $l$ yields a delta function $\delta_{p+q+k,G}$ (i.e., crystal momentum conservation). Physically, these processes correspond to phonon decay in which a phonon can decompose into two others. As we shall see, these anharmonic processes are crucial to the calculation of the thermal conductivity, $\kappa$, of crystals.

Figure 8: Three-phonon processes resulting from cubic terms in the inter-ion potential (diagram labels: $q$, $-q$, $q - k$, $k\,(+G)$). Six other three-phonon processes are possible.

3.4 Thermal Conductivity

Metals predominantly carry heat with free electrons, and are considered to be good conductors. Insulators, which lack free electrons, predominantly carry heat with lattice vibrations, i.e., phonons. Nevertheless, some very hard insulating crystals, such as diamond (C), have very high thermal conductivities, which are often highly temperature dependent. However, most insulators are not good thermal conductors.

material   273.2 K   298.2 K
C          26.2      23.2
Cu         4.03      4.01

Table 2: The thermal conductivities of copper and diamond (CRC), in W/(cm K).

This subsection will be devoted to understanding what makes stiff crystals like diamond such good conductors of heat. The thermal conductivity $\kappa$ is measured by setting up a small steady thermal gradient across the material, then
$$Q = -\kappa\nabla T \tag{60}$$
where $Q$ is the thermal current density; i.e., the energy density times the velocity. If the thermal current is in the $x$-direction, then
$$Q_x = \frac{1}{V}\sum_{q,s}\hbar\omega_s(q)\,\langle n_s(q)\rangle\,v_{sx}(q) \tag{61}$$
where the group velocity is given by $v_{sx}(q) = \frac{\partial\omega_s(q)}{\partial q_x}$. Since we assume $\nabla T$ is small, we will only look at the linear response of the system where $\langle n_s(q)\rangle$ deviates little from its equilibrium value $\langle n_s(q)\rangle_0$. Furthermore, since $\omega_s(q) = \omega_s(-q)$,
$$v_{sx}(-q) = \frac{\partial\omega_s(-q)}{\partial(-q_x)} = -v_{sx}(q) \tag{62}$$
Thus as $\langle n_s(-q)\rangle_0 = \langle n_s(q)\rangle_0$
$$Q^0_x = \frac{1}{V}\sum_{q,s}\hbar\omega_s(q)\,\langle n_s(q)\rangle_0\,v_{sx}(q) = 0 \tag{63}$$
since the sum is over all $q$ in the B.Z. Thus if we expand
$$\langle n_s(q)\rangle = \langle n_s(q)\rangle_0 + \langle n_s(q)\rangle_1 + \cdots \tag{64}$$
we get
$$Q_x \approx \frac{1}{V}\sum_{q,s}\hbar\omega_s(q)\,\langle n_s(q)\rangle_1\,v_{sx}(q) \tag{65}$$
Since we presumably already know $\omega_s(k)$, the calculation of $Q$ and hence $\kappa$ reduces to the evaluation of the linear change in $\langle n\rangle$. Within a region, $\langle n\rangle$ can change in two ways. Either phonons can diffuse into the region, or they can decay through an anharmonic (cubic) term into other modes. So
$$\frac{d\langle n\rangle}{dt} = \left.\frac{\partial\langle n\rangle}{\partial t}\right|_{\text{diffusion}} + \left.\frac{\partial\langle n\rangle}{\partial t}\right|_{\text{decay}} \tag{66}$$
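Figure 9: Change of phonon density within a trapezoidal region. $\langle n_s(q)\rangle$ can change either by phonon decay or by phonon diffusion into and out of the region.

However $\frac{d\langle n\rangle}{dt} = 0$ since we are in a steady state. The decay process is usually described by a relaxation time $\tau$ (or a mean-free path $l = v\tau$)
$$\left.\frac{\partial\langle n\rangle}{\partial t}\right|_{\text{decay}} = -\frac{\langle n\rangle - \langle n\rangle_0}{\tau} \approx -\frac{\langle n\rangle_1}{\tau} \tag{67}$$
The diffusion part of $\frac{d\langle n\rangle}{dt}$ is addressed pictorially in Fig. 10. Formally,
$$\left.\frac{\partial\langle n\rangle}{\partial t}\right|_{\text{diffusion}} \approx \frac{\langle n(x - v_x\Delta t)\rangle - \langle n(x)\rangle}{\Delta t} \approx -v_x\frac{\partial\langle n(x)\rangle}{\partial x} \approx -v_x\frac{\partial\langle n(x)\rangle}{\partial T}\frac{\partial T}{\partial x} \approx -v_x\frac{\partial\left(\langle n(x)\rangle_0 + \langle n(x)\rangle_1\right)}{\partial T}\frac{\partial T}{\partial x} \tag{68}$$
Keeping only the lowest order term,
$$\left.\frac{\partial\langle n\rangle}{\partial t}\right|_{\text{diffusion}} \approx -v_x\frac{\partial\langle n(x)\rangle_0}{\partial T}\frac{\partial T}{\partial x} \tag{69}$$

Figure 10: Phonon diffusion. In time $\Delta t$, all the phonons in the left, source region (of width $v\Delta t$), will travel into the region of interest on the right, while those in the right region will all travel out in time $\Delta t$. Thus, $\Delta n/\Delta t = (n_{\text{left}} - n_{\text{right}})/\Delta t$.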
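Then as $\frac{d\langle n\rangle}{dt} = 0$,
$$\left.\frac{\partial\langle n\rangle}{\partial t}\right|_{\text{diffusion}} = -\left.\frac{\partial\langle n\rangle}{\partial t}\right|_{\text{decay}} \tag{70}$$
or
$$\langle n\rangle_1 = -v_x\,\tau_s(q)\,\frac{\partial\langle n(x)\rangle_0}{\partial T}\frac{\partial T}{\partial x}. \tag{71}$$
Thus
$$Q_x \approx -\frac{1}{V}\sum_{q,s}\hbar\omega_s(q)\,v_{sx}^2(q)\,\tau_s(q)\,\frac{\partial\langle n(x)\rangle_0}{\partial T}\frac{\partial T}{\partial x}, \tag{72}$$
and since $Q = -\kappa\nabla T$
$$\kappa \approx \frac{1}{V}\sum_{q,s}\hbar\omega_s(q)\,v_{sx}^2(q)\,\tau_s(q)\,\frac{\partial\langle n(x)\rangle_0}{\partial T}. \tag{73}$$
From this relationship we can learn several things. First, since $\kappa \sim v_{sx}^2(q)$, phonons near the zone boundary or optical modes with small $v_s(q) = \nabla_q\omega_s(q)$ contribute little to the thermal conductivity. Also, stiff materials, with very fast acoustic modes $v_{sx}(q) \approx c$, will have a large $\kappa$. Second, since $\kappa \sim \tau_s(q)$, and $l_s(q) = v_{sx}(q)\tau_s(q)$, $\kappa$ will be small for materials with short mean-free paths. The mean-free path is affected by defects, anharmonic Umklapp processes, etc. We will explore this effect, especially its temperature dependence, in more detail. At low $T$, only low-energy, acoustic modes can be excited (those with $\hbar\omega_s(q) \sim k_B T$). These modes have
$$v_s(q) = c_s \tag{74}$$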
In addition, since the momentum of these modes q ¿ G, we only have to worry about anharmonic processes which do not involve a reciprocal lattice vector G in lattice momentum conservation. Consider one of the three-phonon anharmonic processes of phonon decay shown in Fig. 8 (with G = 0). For these processes Q ∼ h ¯ ωc so the thermal current is not disturbed by anharmonic processes. Thus the anharmonic terms at low T 32
do not affect the mean-free path, so the thermal resistivity (the inverse of the conductivity) is dominated by scattering from impurities in the bulk and surface imperfections at low temperatures. At high T momentum conservation in an anharmonic process may involve a reciprocal lattice vector G if the q1 of an excited mode is large enough and there exists a sufficiently small G so that q1 > G/2 (c.f. Fig. 11). This is called an Umklapp process,
Figure 11: Umklapp processes involve a reciprocal lattice vector $G$ in lattice momentum conservation, $q_1 = q_2 + q_3 + G$. They are possible whenever $q_1 > G/2$, for some $G$, and involve a virtual reversal of the momentum and heat carried by the phonons (far right).
and it involves a very large change in the heat current (almost a reversal). Thus the mean-free path $l$ and $\kappa$ are very much smaller for high temperatures where $q_1$ can be larger than half the smallest $G$. So what about diamond? It is very hard and very stiff, so the sound velocities $c_s$ are large, and so thermally excited modes for which $k_BT \sim \hbar\omega \sim \hbar cq$ involve small $q_1$ for which Umklapp processes are irrelevant. Second, $\kappa \sim c^2$, which is large. Thus $\kappa$ for diamond is huge!
Chapter 6: The Fermi Liquid L.D. Landau December 22, 2000
Contents
1 Introduction: The Electronic Fermi Liquid
2 The Non-Interacting Fermi Gas
2.1 Infinite-Square-Well Potential
2.2 The Fermi Gas
2.2.1 T = 0, The Pauli Principle
2.2.2 T ≠ 0, Fermi Statistics
3 The Weakly Correlated Electronic Liquid
3.1 Thomas-Fermi Screening
3.2 Fermi liquids
3.3 Quasi-particles
3.3.1 Particles and Holes
3.3.2 Quasiparticles and Quasiholes at T = 0
3.4 Energy of Quasiparticles
4 Interactions between Particles: Landau Fermi Liquid
4.1 The free energy, and interparticle interactions
4.2 Local Energy of a Quasiparticle
4.2.1 Equilibrium Distribution of Quasiparticles at Finite T
4.3 Effective Mass m∗ of Quasiparticles
1 Introduction: The Electronic Fermi Liquid
As we have seen, the electronic and lattice degrees of freedom decouple, to a good approximation, in solids. This is due to the different time scales involved in these systems.
$$\tau_{\rm ion} \sim 1/\omega_D \gg \tau_{\rm electron} \sim \frac{\hbar}{E_F} \tag{1}$$
where $E_F$ is the electronic Fermi energy. The electrons may be thought of as instantly reacting to the (slow) motion of the lattice, while remaining essentially in the electronic ground state. Thus, to a good approximation the electronic and lattice degrees of freedom separate, and the small electron-lattice (phonon) interaction (responsible for resistivity, superconductivity, etc.) may be treated as a perturbation (with $\omega_D/E_F$ as an expansion parameter); that is, if we are capable of solving the problem of the remaining purely electronic system. At first glance the remaining electronic problem would also appear to be hopeless, since the (non-perturbative) electron-electron interactions are as large as the combined electronic kinetic energy and the potential energy due to interactions with the static ions (the latter energy, or rather the corresponding part of the Hamiltonian, composes the solvable portion of the problem). However, the Pauli principle keeps low-lying orbitals from being multiply occupied, so it is often justified to ignore the electron-electron interactions, or treat them as a renormalization of the non-interacting problem (effective mass), etc. This will be the initial assumption of this chapter, in which we will cover
• the non-interacting Fermi liquid, and
• the renormalized Landau Fermi liquid (Pines and Nozières).
These relatively simple theories resolved some of the most important puzzles involving metals at the turn of the century. Perhaps the most intriguing of these is the metallic specific heat. Except in certain “heavy fermion” metals, the electronic contribution to the specific heat is always orders of magnitude smaller than the phonon contribution. However, from the classical theorem of equipartition, if each lattice site contributes just one electron to the conduction band, one would expect the contributions from these sources to be similar ($C_{\rm electron} \approx C_{\rm phonon} \approx 3Nrk_B$). This puzzle is resolved at the simplest level: that of the non-interacting Fermi gas.

2 The Non-Interacting Fermi Gas

2.1 Infinite-Square-Well Potential
We will proceed to treat the electronic degrees of freedom, ignoring the electron-electron interaction, and even the electron-lattice interaction. In general, the electronic degrees of freedom are split into electrons which are bound to their atomic cores with wavefunctions which are essentially atomic, unaffected by the lattice, and those valence (or near valence) electrons which react and adapt to their environment. For the most part, we are only interested in the valence electrons. Their environment is described by the potential due to the ions and the core electrons: the core potential. Thus, ignoring the electron-electron interactions, the electronic Hamiltonian is
$$H = \frac{P^2}{2m} + V(\mathbf{r}). \tag{2}$$
As shown in Fig. 1, the core potential $V(\mathbf{r})$, like the lattice, is periodic: $V(r + a) = V(r)$.

Figure 1: Schematic core potential (solid line) for a one-dimensional lattice with lattice constant a.
For the moment, ignore the core potential; then the electronic wave functions are plane waves $\psi \sim e^{i\mathbf{k}\cdot\mathbf{r}}$. Now consider the core potential as a perturbation. The electrons will be strongly affected by the periodicity of the potential when $\lambda = 2\pi/k \sim a$ (see footnote 1). However, when $k$ is small so that $\lambda \gg a$ (or when $k$ is large, so $\lambda \ll a$) the structure of the potential may be neglected, or we can assume $V(\mathbf{r}) = V_0$ anywhere within the material. The potential still acts to confine the electrons (and so maintain charge neutrality), so $V(\mathbf{r}) = \infty$ anywhere outside the material.

Footnote 1: Interestingly, when $\lambda \sim a$, the Bragg condition $2d\sin\theta \approx a \approx \lambda$ may easily be satisfied, so the electrons, which may be thought of as de Broglie waves, scatter off of the lattice. Consequently states for which $\lambda = 2\pi/k \sim a$ are often forbidden. This is the source of gaps in the band structure, to be discussed in the next chapter.
Figure 2: Infinite square-well potential. V (r) = V0 within the well, and V (r) = ∞
outside to confine the electrons and maintain charge neutrality.
Thus we will approximate the potential of a cubic solid with linear dimension L as an infinite square-well potential.
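$$V(\mathbf{r}) = \begin{cases} V_0 & 0 < r_i < L \\ \infty & \text{otherwise} \end{cases} \tag{3}$$
The electronic wavefunctions in this potential satisfy
$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) = (E' - V_0)\,\psi(\mathbf{r}) = E\psi(\mathbf{r}) \tag{4}$$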
The normalized solution to this model is
$$\psi(\mathbf{r}) = \prod_{i=1}^{3}\left(\frac{2}{L}\right)^{1/2}\sin k_ix_i \quad\text{where } i = x, y, \text{ or } z \tag{5}$$
and $k_iL = n_i\pi$ in order to satisfy the boundary condition that $\psi = 0$ on the surface of the cube. Furthermore, solutions with $n_i < 0$ are not independent of solutions with $n_i > 0$ and may be excluded. Solutions with $n_i = 0$ cannot be normalized and are excluded (they correspond to no electron in the state). The eigenenergies of the wavefunctions are
$$E = \frac{\hbar^2}{2m}\sum_i k_i^2 = \frac{\hbar^2\pi^2}{2mL^2}\left(n_x^2 + n_y^2 + n_z^2\right) \tag{6}$$
and as a result of these restrictions, states in k-space are confined to the first octant (c.f. Fig. 3).

Figure 3: Allowed k-states for an electron confined by an infinite-square potential. Each state has a volume of $(\pi/L)^3$ in k-space.

Each state has a volume $(\pi/L)^3$ of k-space. Thus as $L \to \infty$, the number of states with energies $E(k) < E < E(k) + dE$ is
$$dZ' = \frac{(4\pi k^2\,dk)/8}{(\pi/L)^3}.$$
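Then, since $E = \frac{\hbar^2k^2}{2m}$, so that $k^2\,dk = \frac{m}{\hbar^2}\sqrt{2mE/\hbar^2}\,dE$,
$$dZ = dZ'/L^3 = \frac{1}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}E^{1/2}\,dE, \tag{8}$$
where the relation between $k^2\,dk$ and $dE$ above is Eq. (7); or, the density of states per unit volume is
$$D(E) = \frac{dZ}{dE} = \frac{1}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}E^{1/2}. \tag{9}$$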
Up until now, we have ignored the properties of electrons. However, for the DOS, it is useful to recall that the electrons are spin-1/2, thus $2S + 1 = 2$ electrons can fill each orbital or k-state, one of spin up, the other spin down. If we account for this spin degeneracy in $D$, then
$$D(E) = \frac{1}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}E^{1/2}. \tag{10}$$
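The state counting behind the density of states can be checked by brute force. The following is a minimal sketch, not part of the original notes: it works in units where $\hbar^2\pi^2/2mL^2 = 1$, so that $E = n_x^2 + n_y^2 + n_z^2$, and compares a direct count of orbitals (without spin) against the continuum estimate, one eighth of a sphere of radius $n_{\max}$. The function names and the choice $n_{\max} = 60$ are illustrative assumptions.

```python
import numpy as np

def count_box_states(n_max):
    """Count orbitals (n_x, n_y, n_z), n_i >= 1, with n_x^2 + n_y^2 + n_z^2 <= n_max^2."""
    n = np.arange(1, n_max + 1)
    n2 = n[:, None, None] ** 2 + n[None, :, None] ** 2 + n[None, None, :] ** 2
    return int(np.count_nonzero(n2 <= n_max ** 2))

def octant_estimate(n_max):
    """Continuum estimate: volume of one octant of a sphere of radius n_max
    (one orbital per unit cell of the integer lattice)."""
    return np.pi * n_max ** 3 / 6.0

n_max = 60  # illustrative cutoff, i.e. E_max = n_max**2 in these units
exact = count_box_states(n_max)
estimate = octant_estimate(n_max)
ratio = exact / estimate
print(f"counted orbitals: {exact}, continuum estimate: {estimate:.0f}, ratio: {ratio:.4f}")
```

The count falls slightly below the continuum value because states with any $n_i = 0$ are excluded; this surface correction shrinks relative to the total as $n_{\max}$ (that is, $L$) grows, which is why the continuum form is used as $L \to \infty$.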
2.2 The Fermi Gas

2.2.1 T = 0, The Pauli Principle
Electrons, as are all half-integer spin particles, are Fermions. Thus, by the Pauli Principle, no two of them may occupy the same state. For example, if we calculate the density of electrons per unit volume
$$n = \int_0^\infty D(E)f(E,T)\,dE, \tag{11}$$
where $f(E,T)$ is the probability that a state of energy $E$ is occupied, the factor $f(E,T)$ must enforce this restriction. However, $f$ is just the statistical factor; c.f.
$$f(E,T) = e^{-E/k_BT} \quad\text{for classical particles}, \tag{12}$$
which for $T = 0$ would require all the electrons to go into the ground state, $f(0,0) = 1$. Clearly, this violates the Pauli principle. At $T = 0$ we need to put just one particle in each state, starting from the lowest energy state, until we are out of particles. Since $E \propto k^2$ in our simple square-well model, we will fill up all
k-states until we reach some Fermi radius $k_F$, corresponding to some Fermi energy $E_F$,
$$E_F = \frac{\hbar^2k_F^2}{2m}, \tag{13}$$
thus,
$$f(E, T = 0) = \theta(E_F - E) \tag{14}$$
and
$$n = \int_0^\infty D(E)f(E,T)\,dE = \int_0^{E_F}D(E)\,dE = \frac{1}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^{E_F}E^{1/2}\,dE = \frac{1}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\frac{2}{3}E_F^{3/2}, \tag{15}$$
or
$$E_F = \frac{\hbar^2}{2m}\left(3\pi^2n\right)^{2/3} = k_BT_F \tag{16}$$
which also defines the Fermi temperature $T_F$.

Figure 4: Due to the Pauli principle, all k-states up to $k_F$, and all states with energies up to $E_F = \hbar^2k_F^2/2m$, are filled at zero temperature.

Thus for metals, in which $n \approx 10^{23}/{\rm cm}^3$, $E_F \approx 10^{-11}\,{\rm erg} \approx 10\,{\rm eV} \approx k_B\,10^5\,{\rm K}$. Notice that due to the Pauli principle, the average energy of the electrons will be finite, even at $T = 0$!
$$E = \int_0^{E_F}D(E)\,E\,dE = \frac{3}{5}nE_F. \tag{17}$$
However, it is the electrons near $E_F$ in energy which may be excited and are therefore important. These have a de Broglie wavelength of roughly
$$\lambda_e = \frac{12.3\,\text{Å}}{\left(E({\rm eV})\right)^{1/2}} \approx 4\,\text{Å} \tag{18}$$
thus our original approximation of a square well potential, ignoring the lattice structure, is questionable for electrons near the Fermi surface, and should be regarded as yielding only qualitative results.
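The order-of-magnitude estimates above can be reproduced directly from Eqs. (13) and (16). The sketch below is an illustration rather than part of the original notes; it uses SI values of the physical constants and the density $n = 10^{23}\,{\rm cm}^{-3}$ quoted in the text, and the function name `fermi_parameters` is an assumption made for this example.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
K_B = 1.380649e-23       # Boltzmann constant, J/K
EV = 1.602176634e-19     # one electron volt in J

def fermi_parameters(n_per_cm3):
    """Free-electron Fermi wavevector, energy, temperature and wavelength
    for a given electron density (in cm^-3), including spin degeneracy."""
    n = n_per_cm3 * 1e6                              # convert to m^-3
    k_f = (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)    # from n = k_F^3 / (3 pi^2)
    e_f = HBAR ** 2 * k_f ** 2 / (2.0 * M_E)         # Eq. (13)
    return {
        "k_F_per_m": k_f,
        "E_F_eV": e_f / EV,
        "T_F_K": e_f / K_B,
        "lambda_F_angstrom": 2.0 * math.pi / k_f * 1e10,
    }

params = fermi_parameters(1e23)
for name, value in params.items():
    print(f"{name}: {value:.4g}")
```

For this density the free-electron values come out at several eV, a Fermi temperature near $10^5$ K, and a wavelength of a few ångström, comparable to a typical lattice constant, which is the reason the square-well model is only qualitative for electrons near the Fermi surface.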
2.2.2 T ≠ 0, Fermi Statistics
At finite temperatures some of the states will be thermally excited. The energy available for these excitations is roughly $k_BT$, and the only possible excitations are from filled to unfilled electronic states. Therefore, only the states within $k_BT$ ($E_F - k_BT < E < E_F + k_BT$) of the Fermi surface may be excited. $f(E,T)$ must be modified accordingly. What we need is then $f(E,T)$ at finite $T$ which also satisfies the Pauli principle. Let's return to our model of a periodic solid which is constructed by bringing individual atoms together from an infinite separation. First, just consider a solid constructed from only two atoms, each with a single orbital (Fig. 5). For
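this system, in equilibrium,
$$0 = \delta F = \sum_i\frac{\partial F}{\partial n_i}\delta n_i \tag{19}$$
electrons are conserved so $\sum_i\delta n_i = 0$. Thus, for our two orbital system
$$\frac{\partial F}{\partial n_1}\delta n_1 + \frac{\partial F}{\partial n_2}\delta n_2 = 0 \quad\text{and}\quad \delta n_1 + \delta n_2 = 0 \tag{20}$$
or
$$\frac{\partial F}{\partial n_1} = \frac{\partial F}{\partial n_2} \tag{21}$$

Figure 5: Exchange of electrons in a solid composed of two orbitals ($\delta n_1 = \pm 1$).

A similar relation holds for an arbitrary number of particles. Apparently this quantity, the increased free energy needed to add a particle to the system, is a constant
$$\frac{\partial F}{\partial n_i} = \mu \tag{22}$$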
for all $i$. $\mu$ is called the chemical potential. Now consider an ensemble of orbitals. We will treat the thermodynamics of this system within the canonical ensemble (i.e. the system is in contact with a thermal bath, and the particle number is conserved) for which $F = E - TS$ is the appropriate potential. The system energy $E$ and entropy $S$ may be written as functions of the orbital energies $E_i$ and occupancies $n_i$ and the degeneracy $g_i$ of the state of energy $E_i$. For example,
$$E = \sum_i n_iE_i. \tag{23}$$

Figure 6: States from an ensemble of orbitals. Each level $E_i$ has a degeneracy $g_i$ and an occupancy $n_i$.
The entropy $S$ requires a bit more thought. If $P$ is the number of ways of distributing the electrons among the states, then
$$S = k_B\ln P. \tag{24}$$
Consider a set of $g_i$ states with energy $E_i$. The number of ways of distributing the first electron in these states is $g_i$. For a second electron we then have $g_i - 1$ ways, etc. So for $n_i$ electrons there are
$$\frac{g_i!}{n_i!\,(g_i - n_i)!} \tag{25}$$
possible ways of accommodating the $n_i$ (indistinguishable) electrons in $g_i$ states. The number of ways of making the whole system (i.e., filling energy levels with $E_i \neq E_j$) is then
$$P = \prod_i\frac{g_i!}{n_i!\,(g_i - n_i)!}, \tag{26}$$
and so, the entropy
$$S = k_B\sum_i\left[\ln g_i! - \ln n_i! - \ln(g_i - n_i)!\right]. \tag{27}$$
For large $n$, $\ln n! \approx n\ln n - n$, so
$$S = k_B\sum_i\left[g_i\ln g_i - n_i\ln n_i - (g_i - n_i)\ln(g_i - n_i)\right] \tag{28}$$
and
$$F = \sum_i n_iE_i - k_BT\sum_i\left[g_i\ln g_i - n_i\ln n_i - (g_i - n_i)\ln(g_i - n_i)\right] \tag{29}$$
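The step from Eq. (27) to Eq. (28) relies on Stirling's approximation, which is only accurate for large occupancies and degeneracies. As a rough numerical check (a sketch added for illustration, with arbitrary example values of $g$ and $n$), the exact logarithm of the number of arrangements in Eq. (25) can be compared with the Stirling form for a single level:

```python
import math

def ln_ways_exact(g, n):
    """ln[g! / (n! (g - n)!)] evaluated with log-gamma, as in Eq. (25)."""
    return math.lgamma(g + 1) - math.lgamma(n + 1) - math.lgamma(g - n + 1)

def ln_ways_stirling(g, n):
    """Stirling form used in Eq. (28): g ln g - n ln n - (g - n) ln(g - n)."""
    return g * math.log(g) - n * math.log(n) - (g - n) * math.log(g - n)

# Illustrative (g, n) pairs: a small level and a large one.
for g, n in [(10, 3), (1000, 300)]:
    exact = ln_ways_exact(g, n)
    approx = ln_ways_stirling(g, n)
    rel_err = abs(approx - exact) / exact
    print(f"g={g}, n={n}: exact={exact:.3f}, Stirling={approx:.3f}, rel. error={rel_err:.3%}")
```

The relative error is sizeable for a handful of states but drops below one percent once $g$ and $n$ are of order $10^3$, which is far smaller than the numbers relevant for a macroscopic solid.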
We will want to use the chemical potential $\mu$ in our thermodynamic calculations
$$\mu = \frac{\partial F}{\partial n_k} = E_k + k_BT\left(\ln n_k + 1 - \ln(g_k - n_k) - 1\right). \tag{30}$$
Solving for $n_k$, with $\beta = 1/k_BT$,
$$n_k = \frac{g_k}{1 + e^{\beta(E_k - \mu)}}. \tag{31}$$
Thus the probability that a quantum state with energy $E$ is occupied is (the Fermi function)
$$f(E,T) = \frac{1}{1 + e^{\beta(E - \mu)}}. \tag{32}$$
At $T = 0$, $\beta = \infty$, and $f(E, 0) = \theta(\mu - E)$. Thus $\mu(T = 0) = E_F$. However, in general $\mu$ is temperature dependent, since it must be adjusted to keep the particle number fixed. In addition, when $T \neq 0$, $f$ becomes less sharp at energies $E \approx \mu$. This reflects the fact that particles with energies $E - \mu \approx k_BT$ may be excited to higher energy states.

Figure 7: Plot of the Fermi function $1/\left(e^{\beta(\omega - \mu)} + 1\right)$ when $\beta = 1/k_BT = 20$ and $\mu = 1$. Note that at energies $\omega \approx \mu$ the Fermi function displays a smooth step of width $\approx k_BT = 0.05$. This allows thermal excitations of particles near the Fermi surface.
Specific Heat

The form of $f(E,T)$ also clarifies why the electronic specific heat of metals is so small compared to the classical result $C_{\rm classical} = \frac{3}{2}nk_B$. The reason is simple: only the electrons with energies within about $k_BT$ of the Fermi surface may be excited (about $\frac{k_BT}{E_F}$ of the electron density), each with excitation energy of about $k_BT$. Therefore,
$$U_{\rm excitation} \approx k_BT\,n\,\frac{k_BT}{E_F} = nk_BT\frac{T}{T_F} \tag{33}$$
so
$$C \approx nk_B\frac{T}{T_F} \tag{34}$$
Then as $T \ll T_F$ ($T_F$ is typically about $10^5$ K in most metals; see footnote 2), $C \approx nk_B\frac{T}{T_F} \ll C_{\rm classical} \approx nk_B$. Thus at temperatures where the phonons contribute essentially a classical result to the specific heat, the electronic contribution is vanishingly small. In general this holds except at very low $T$, where the phonon contribution $C_{\rm phonon} \sim T^3$ goes to zero faster than the electronic contribution to the specific heat.

Footnote 2: Heavy Fermion systems are the exception to this rule. There $T_F$ can be as small as a fraction of a degree Kelvin. As a result, they may have very large electronic specific heats.
Specific Heat Calculation

Of course, since we know the free energy of the non-interacting Fermi gas, we can calculate the form of the specific heat. Here we will follow Ibach and Lüth and Kittel; however, since the chemical potential does depend upon the temperature, I would like to make the approximations we make a bit more explicit. Upon heating from $T = 0$ to finite $T$, the Fermi gas will gain energy
$$U(T) = \int_0^\infty dE\,E\,D(E)f(E,T) - \int_0^{E_F}dE\,E\,D(E) \tag{35}$$
so
$$C_V = \frac{dU}{dT} = \int_0^\infty dE\,E\,D(E)\frac{df(E,T)}{dT}. \tag{36}$$
Then, since at constant volume the electronic density is constant, $\frac{dn}{dT} = 0$, and $n = \int_0^\infty dE\,D(E)f(E,T)$,
$$0 = E_F\frac{dn}{dT} = \int_0^\infty dE\,E_F\,D(E)\frac{df(E,T)}{dT} \tag{37}$$
so we may write
$$C_V = \int_0^\infty dE\,(E - E_F)\,D(E)\frac{df}{dT}. \tag{38}$$
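In $f$, the temperature $T$ enters through both $\beta = 1/k_BT$ and $\mu$:
$$\frac{df}{dT} = \frac{\partial f}{\partial\beta}\frac{\partial\beta}{\partial T} + \frac{\partial f}{\partial\mu}\frac{\partial\mu}{\partial T} = \frac{\beta e^{\beta(E-\mu)}}{\left(e^{\beta(E-\mu)} + 1\right)^2}\left[\frac{E - \mu}{T} + \frac{\partial\mu}{\partial T}\right] \tag{39}$$
However, $\frac{\partial\mu}{\partial T}$ depends upon the details of the density of states near the Fermi surface, which can differ greatly from material to material. Furthermore, $\frac{\partial\mu}{\partial T} < 1$, especially in common metals at temperatures $T \ll T_F$, and the first term $\frac{E-\mu}{T}$ is of order one (c.f. Fig. 7). Thus, for now we will neglect $\frac{\partial\mu}{\partial T}$ relative to $\frac{E-\mu}{T}$ (you will explore the validity of this approximation in your homework), and, consistent with this approximation, replace $\mu$ by $E_F$, so
$$\frac{df}{dT} \approx \frac{\beta e^{\beta(E-E_F)}}{\left(e^{\beta(E-E_F)} + 1\right)^2}\,\frac{E - E_F}{T}, \tag{40}$$
and, letting $x = \beta(E - E_F)$,
$$C_V \approx \frac{1}{k_BT^2}\int_0^\infty dE\,\frac{(E - E_F)^2\,D(E)\,e^{\beta(E-E_F)}}{\left(e^{\beta(E-E_F)} + 1\right)^2} \approx k_B^2T\int_{-\beta E_F}^\infty dx\,D\!\left(\frac{x}{\beta} + E_F\right)\frac{x^2e^x}{(e^x + 1)^2} \tag{41}$$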
As shown in Fig. 8, the function $x^2\frac{e^x}{(e^x+1)^2}$ is only large in the region $-10 < x < 10$. In this region, and for temperatures $T \ll T_F$, $D\!\left(\frac{x}{\beta} + E_F\right) \approx D(E_F)$, since the density of states usually does not have features which are sharp on the energy scale of $10k_BT$.

Figure 8: Plot of $x^2\frac{e^x}{(e^x+1)^2}$ vs. $x$. Note that this function is only appreciable for roughly $-10 < x < 10$. Thus, at temperatures $T \ll T_F \sim 10^5$ K, we can approximate $D\!\left(\frac{x}{\beta} + E_F\right) \approx D(E_F)$ in Eq. 42.

Thus
$$C_V \approx k_B^2T\,D(E_F)\int_{-\beta E_F}^\infty dx\,\frac{x^2e^x}{(e^x + 1)^2} \approx \frac{\pi^2}{3}k_B^2T\,D(E_F). \tag{42}$$
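The value $\pi^2/3$ used in Eq. (42) follows from extending the lower limit $-\beta E_F$ to $-\infty$, which is harmless when $T \ll T_F$. A quick numerical check is sketched below; it is an illustration added here, with the integration window of $\pm 50$ and the grid spacing chosen arbitrarily, and a hand-written trapezoid rule so that no particular NumPy integration helper is assumed.

```python
import numpy as np

def sommerfeld_integrand(x):
    """x^2 e^x / (e^x + 1)^2, rewritten as x^2 / (4 cosh^2(x/2)) to avoid overflow."""
    return x ** 2 / (4.0 * np.cosh(0.5 * x) ** 2)

def trapezoid(y, dx):
    """Composite trapezoid rule on a uniform grid."""
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))

# Full integral, approximating (-inf, inf) by (-50, 50).
x = np.linspace(-50.0, 50.0, 100001)
integral = trapezoid(sommerfeld_integrand(x), x[1] - x[0])

# Weight carried by the window -10 < x < 10 highlighted in Fig. 8.
x10 = np.linspace(-10.0, 10.0, 20001)
inside = trapezoid(sommerfeld_integrand(x10), x10[1] - x10[0])

print(f"integral = {integral:.8f}, pi^2/3 = {np.pi ** 2 / 3:.8f}")
print(f"fraction inside |x| < 10: {inside / integral:.4f}")
```

Nearly all of the weight lies within $|x| < 10$, which is the quantitative content of the statement that only $D(E)$ within about $10k_BT$ of $E_F$ matters.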
Note that no assumption about the form of $D(E)$ was made other than the assumption that it is smooth within $k_BT$ of the Fermi surface. Thus, experimental measurements of the specific heat at constant volume of the electrons give us information about the density of electronic states at the Fermi surface. Now let's reconsider the DOS for the 3-D box potential.
$$D(E) = \frac{1}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}E^{1/2} = D(E_F)\left(\frac{E}{E_F}\right)^{1/2} \tag{43}$$
For which $n = \int_0^{E_F}D(E)\,dE = D(E_F)\frac{2}{3}E_F$, so
$$C_V = \frac{\pi^2}{2}nk_B\frac{T}{T_F} \ll \frac{3}{2}nk_B \tag{44}$$
where the last term on the right is the classical result. For room temperatures $T \sim 300$ K, which is also of the same order of magnitude as the Debye temperatures $\theta_D$,
$$C_{V\,{\rm phonon}} \sim \frac{3}{2}nk_B \gg C_{V\,{\rm electron}} \tag{45}$$
So, the only way to measure the electronic specific heat in most materials is to go to very low temperatures $T \ll \theta_D$, for which $C_{V\,{\rm phonon}} \sim T^3$. Here the total specific heat
$$C_V \approx \gamma T + \beta T^3 \tag{46}$$
We will see that $\gamma$ gives us some measurement of the electronic effective mass for our Fermi liquid theory; i.e., it tells us something about electron-electron interactions.
3 The Weakly Correlated Electronic Liquid

3.1 Thomas-Fermi Screening
As an introduction to the effect of electronic correlations, consider the effect of a charged oxygen defect in one of the copper-oxygen planes of a cuprate superconductor shown in Fig. 9. Assume that the oxygen defect captures two electrons from the metallic band, going from a $2s^22p^4$ to a $2s^22p^6$ configuration. The defect will then become an anion, and have a net charge of two electrons.

Figure 9: A charged oxygen defect (charge $q = 2e$) is introduced into one of the copper-oxygen planes of a cuprate superconductor. The oxygen defect captures two electrons from the metallic band, going from a $2s^22p^4$ to a $2s^22p^6$ configuration.

In the vicinity of this oxygen defect, the electrostatic potential and the electronic charge density will be reduced. If we model the electronic density of states in this material with our box-potential DOS, we can think of this reduction in the local charge density in terms of raising the DOS parabola near the defect (cf. Fig. 10). This will cause the free electronic charge to flow away from the defect.

Figure 10: The shift in the DOS parabola (by $-e\delta U$) near a charged defect, compared with the parabola away from the defect.

Near the defect (since $e < 0$ and hence $e\delta U(r_{\rm near}) < 0$)
$$n(r_{\rm near}) \approx \int_0^{E_F + e\delta U(r_{\rm near})}D(E)\,dE \tag{47}$$
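While away from the defect, $\delta U(r_{\rm away}) = 0$, so
$$n(r_{\rm away}) \approx \int_0^{E_F}D(E)\,dE \tag{48}$$
or
$$\delta n(r) \approx \int_0^{E_F + e\delta U(r)}D(E)\,dE - \int_0^{E_F}D(E)\,dE \tag{49}$$
If $|e\delta U| \ll E_F$, then
$$\delta n(r) \approx D(E_F)\left[E_F + e\delta U - E_F\right] = e\delta U\,D(E_F). \tag{50}$$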
We can solve for the change in the electrostatic potential by solving Poisson's equation.
$$\nabla^2\delta U = 4\pi\delta\rho = 4\pi e\,\delta n = 4\pi e^2D(E_F)\,\delta U. \tag{51}$$
Let $\lambda^2 = 4\pi e^2D(E_F)$; then $\nabla^2\delta U = \lambda^2\delta U$ has the solution (see footnote 3)
$$\delta U(r) = \frac{qe^{-\lambda r}}{r} \tag{52}$$
The length $1/\lambda = r_{TF}$ is known as the Thomas-Fermi screening length.
$$r_{TF} = \left(4\pi e^2D(E_F)\right)^{-1/2} \tag{53}$$

Footnote 3: The solution is actually $Ce^{-\lambda r}/r$, where $C$ is a constant. $C$ may be determined by letting $D(E_F) = 0$, so the medium in which the charge is embedded becomes vacuum. Then the potential of the charge is $q/r$, so $C = q$.
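That the screened form in Eq. (52) solves $\nabla^2\delta U = \lambda^2\delta U$ away from the origin can be confirmed with a finite-difference check. The sketch below is illustrative: it uses the radial identity $\nabla^2U = \frac{1}{r}\frac{d^2(rU)}{dr^2}$ for a spherically symmetric function, and the values $q = 1$, $\lambda = 2$ and the radial grid are arbitrary assumptions in dimensionless units.

```python
import numpy as np

def screened_potential(r, q=1.0, lam=2.0):
    """Screened potential of Eq. (52): q * exp(-lam * r) / r."""
    return q * np.exp(-lam * r) / r

def radial_laplacian(U, r):
    """Laplacian of a spherically symmetric function, (1/r) d^2(r U)/dr^2,
    by central differences on a uniform grid. Returns interior points only."""
    h = r[1] - r[0]
    rU = r * U
    second_derivative = (rU[2:] - 2.0 * rU[1:-1] + rU[:-2]) / h ** 2
    return second_derivative / r[1:-1]

lam = 2.0
r = np.linspace(0.5, 5.0, 4501)       # stay away from the singular point r = 0
U = screened_potential(r, lam=lam)
lhs = radial_laplacian(U, r)          # numerical Laplacian
rhs = lam ** 2 * U[1:-1]              # lambda^2 * U at the same points
max_rel_err = np.max(np.abs(lhs - rhs) / np.abs(rhs))
print(f"max relative deviation: {max_rel_err:.2e}")
```

The residual is set by the grid spacing alone, so refining the grid drives it down, as expected for an exact solution.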
Let's estimate this distance for our square-well model,
$$r_{TF}^2 = \frac{a_0\pi}{4(3\pi^2n)^{1/3}} \approx \frac{a_0}{4n^{1/3}}, \qquad r_{TF} \approx \frac{1}{2}\left(\frac{n}{a_0^3}\right)^{-1/6} \tag{54}$$
In Cu, for which $n \approx 10^{23}\,{\rm cm}^{-3}$ (and since $a_0 = 0.53$ Å)
$$r_{TF}^{\rm Cu} \approx \frac{1}{2}\frac{\left(10^{23}\right)^{-1/6}}{\left(0.5\times10^{-8}\right)^{-1/2}} \approx 0.5\times10^{-8}\,{\rm cm} = 0.5\,\text{Å} \tag{55}$$
Thus, if we add a charge defect to Cu metal, the effect of the defect's ionic potential is screened away for distances $r > \frac{1}{2}$ Å.

Figure 11: Screened defect potentials $-e^{-r/r_{TF}}/r$ for $r_{TF} = 1$ and $r_{TF} = 1/4$, with $r_{TF} \sim n^{-1/6}$. As the screening length increases, states that were free become bound.
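The estimate of the Thomas-Fermi length can be reproduced numerically. The sketch below is an illustration in CGS-style units (lengths in cm), assuming the free-electron relations $D(E_F) = 3n/2E_F$ and $a_0 = \hbar^2/me^2$ that underlie Eq. (54); the function names are hypothetical and the density is the value quoted for Cu in the text.

```python
import math

A0_CM = 0.529e-8  # Bohr radius in cm

def thomas_fermi_length_cm(n_per_cm3, a0_cm=A0_CM):
    """r_TF from r_TF^2 = pi * a0 / (4 k_F), with k_F = (3 pi^2 n)^(1/3)."""
    k_f = (3.0 * math.pi ** 2 * n_per_cm3) ** (1.0 / 3.0)
    return math.sqrt(math.pi * a0_cm / (4.0 * k_f))

def thomas_fermi_length_approx_cm(n_per_cm3, a0_cm=A0_CM):
    """Simplified form r_TF ~ (1/2) (n / a0^3)^(-1/6)."""
    return 0.5 * (n_per_cm3 / a0_cm ** 3) ** (-1.0 / 6.0)

n_cu = 1e23  # conduction electron density used in the text, cm^-3
r_full = thomas_fermi_length_cm(n_cu)
r_approx = thomas_fermi_length_approx_cm(n_cu)
print(f"r_TF (full expression)  = {r_full * 1e8:.3f} angstrom")
print(f"r_TF (approximate form) = {r_approx * 1e8:.3f} angstrom")
```

Both forms give about half an ångström, and the weak $n^{-1/6}$ dependence means that even a tenfold change in carrier density alters the screening length by less than a factor of 1.5.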
Now consider an electron bound to an ion in Cu or some other metal. As shown in Fig. 11, as the screening length decreases, bound states rise up in energy. In a weak metal (i.e. something like YBCO), in which the valence state is barely free, a reduction in the number of carriers (electrons) will increase the screening length, since
$$r_{TF} \sim n^{-1/6}. \tag{56}$$
This will extend the range of the potential, causing it to trap or bind more states, making the one free valence state bound. Now imagine that instead of a single defect, we have a concentrated system of such ions, and suppose that we decrease the density of carriers (e.g., in Si-based semiconductors, this is done by adding certain compensating dopants, or even by modulating the pressure). This will, in turn, increase the screening length, causing some states that were free to become bound. The result is an abrupt transition from a metal to an insulator, which is believed to explain the MI transition in some transition-metal oxides, glasses, amorphous semiconductors, etc.
3.2 Fermi liquids
The purpose of these next several lectures is to introduce you to the theory of the Fermi liquid, which is, in its simplest form, a collection of Fermions in a box plus interactions. In reality, the only physical analog is a gas of $^3$He, which due to its nuclear spin (the nucleus has two protons, one neutron), obeys Fermi statistics for sufficiently low energies or temperatures. In addition, simple metals, from the first or second column of the periodic table, for which we may approximate the ionic potential
$$V(R) = V_0 \tag{57}$$
are a close approximant to Fermi liquids. Moreover, Fermi liquid theory only describes the "gaseous" phase of these quantum fermion systems. For example, $^3$He also has a superfluid (triplet) phase, and, at least in $^4$He-$^3$He mixtures, a solid phase exists which is not described by Fermi liquid theory. One should note, however, that the Fermi liquid state does serve as the starting point for the theories of superconductivity and superfluidity. One may construct Fermi liquid theory starting from either a many-body diagrammatic or a phenomenological viewpoint. We, as Landau, will choose the latter. Fermi liquid theory has 3 basic tenets:
1. momentum and spin remain good quantum numbers to describe the (quasi) particles.
2. the interacting system may be obtained by adiabatically turning on a particle-particle interaction over some time $t$.
3. the resulting excitations may be described as quasi-particles with lifetimes $\gg t$.

3.3 Quasi-particles

The last assumption involves a new concept, that of the quasiparticles, which requires some explanation.

3.3.1 Particles and Holes
Particles and Holes are excitations of the non-interacting system at zero temperature. Consider a system of $N$ free Fermions each of mass $m$ in a volume $V$. The eigenstates are the antisymmetrized combinations (Slater determinants) of $N$ different single particle states.
$$\psi_p(\mathbf{r}) = \frac{1}{\sqrt{V}}e^{i\mathbf{p}\cdot\mathbf{r}/\hbar} \tag{58}$$
The occupation of each of these states is given by $n_p = \theta(p_F - p)$ where $p_F$ is the radius of the Fermi sphere. The energy of the system is
$$E = \sum_p n_p\frac{p^2}{2m} \tag{59}$$
and $p_F$ is given by
$$\frac{N}{V} = \frac{1}{3\pi^2}\left(\frac{p_F}{\hbar}\right)^3 \tag{60}$$
Now let's add a particle to the lowest available state $p = p_F$; then, for $T = 0$,
$$\mu = E_0(N + 1) - E_0(N) = \frac{\partial E_0}{\partial N} = \frac{p_F^2}{2m}. \tag{61}$$
If we now excite the system, we will promote a certain number of particles across the Fermi surface $S_F$, yielding particles above and an equal number of vacancies or holes below the Fermi surface. These are our elementary excitations, and they are quantified by $\delta n_p = n_p - n_p^0$,
$$\delta n_p = \begin{cases} \delta_{p,p'} & \text{for a particle } p' > p_F \\ -\delta_{p,p'} & \text{for a hole } p' < p_F \end{cases}. \tag{62}$$
If we consider excitations created by thermal fluctuations, then $\delta n_p \sim 1$ only for excitations of energy within $k_BT$ of $E_F$.

Figure 12: Particle ($\delta n_{p'} = 1$) and hole ($\delta n_p = -1$) excitations of the Fermi gas.

The energy of the non-interacting system is completely characterized as a functional of the occupation
$$E - E_0 = \sum_p\frac{p^2}{2m}\left(n_p - n_p^0\right) = \sum_p\frac{p^2}{2m}\delta n_p. \tag{63}$$
Now let's take our system and place it in contact with a particle bath. Then the appropriate potential is the free energy, which for $T = 0$ is $F = E - \mu N$, and
$$F - F_0 = \sum_p\left(\frac{p^2}{2m} - \mu\right)\delta n_p. \tag{64}$$

Figure 13: Since $\mu = p_F^2/2m$, the free energy of a particle or a hole is $\delta F = |p^2/2m - \mu| > 0$, so the system is stable to these excitations.

The free energy of a particle, with momentum $p$ and $\delta n_{p'} = \delta_{p,p'}$, is $\frac{p^2}{2m} - \mu$ and it corresponds to an excitation outside $S_F$. The free energy of a hole, $\delta n_{p'} = -\delta_{p,p'}$, is $\mu - \frac{p^2}{2m}$, which corresponds to an excitation within $S_F$. However, since $\mu = p_F^2/2m$, the free energy of either at $p = p_F$ is zero, hence the free energy of an excitation is
$$\left|p^2/2m - \mu\right|, \tag{65}$$
which is always positive; i.e., the system is stable to excitations.
3.3.2 Quasiparticles and Quasiholes at T = 0

Figure 14: Model for a Fermi liquid: a set of interacting particles an average distance $a$ apart, with interaction energy $U \approx \frac{e^2}{a}e^{-a/r_{TF}}$, bound within an infinite square-well potential.

Now let's consider a system with interacting particles an average distance $a$ apart, so that the characteristic energy of interaction is $\frac{e^2}{a}e^{-a/r_{TF}}$. We will imagine that this system evolves slowly from an ideal or noninteracting system in time $t$ (i.e. the interaction $U \approx \frac{e^2}{a}e^{-a/r_{TF}}$ is turned on slowly, so that the noninteracting system evolves, while remaining in the ground state, into an interacting system in time $t$). If the eigenstate of the ideal system is characterized by $n_p^0$, then the interacting system eigenstate will evolve quasistatically from $n_p^0$ to $n_p$. In fact, if the system is isotropic and remains in its ground state, then $n_p^0 = n_p$. However, clearly in some situations (superconductivity, magnetism) we will neglect some eigenstates of the interacting system in this way.

Figure 15: We add a particle with momentum $p$ to our noninteracting ($U = 0$) Fermi liquid at time $t = 0$, and slowly increase the interaction to its full value $U = (e^2/a)\exp(-a/r_{TF})$ at time $t$. As the particle and system evolve, the particle becomes dressed by interactions with the system (shown as a shaded ellipse), which changes the effective mass but not the momentum of this single-particle excitation (now called a quasi-particle).

Now let's add a particle of momentum $p$ to the non-interacting ideal system, and slowly turn on the interaction. As $U$ is switched on, we slowly begin to perturb the particles close to the additional particle, so the particle becomes dressed by these interactions. However, since momentum is conserved, we have created an excitation (particle and its cloud) of momentum $p$. We call this particle and cloud a quasiparticle. In the same way,
if we had introduced a hole of momentum $p$ below the Fermi surface, and slowly turned on the interaction, we would have produced a quasihole. Note that this adiabatic switching on procedure will have difficulties if the lifetime of the quasi-particle $\tau < t$. If so, then the process is not reversible. If we shorten $t$ so that again $\tau \gg t$, then the switching on of $U$ may not be adiabatic (i.e., we will evolve to a system which is not in its ground state). Such difficulties do not arise so long as the energy of the particle is close to the Fermi energy. Here there are few states accessible for creating particle-hole excitations. In fact, one can formulate a perturbative argument that the scattering rate $1/\tau$ of a quasi-particle is proportional to the square of its excitation energy above the Fermi energy, $\varepsilon = \frac{p^2}{2m} - \frac{p_F^2}{2m} \approx v(p - p_F)$. To estimate this lifetime consider the following argument from AGD: A particle with momentum $p_1$ above the Fermi surface ($p_1 > p_F$) interacts with one of the particles below the Fermi surface with momentum $p_2$. As a result, two new particles appear above the Fermi surface (all other states are full)
with momenta $p_3$ and $p_4$. This may also be interpreted as a particle of momentum $p_1$ decaying into particles with momenta $p_3$ and $p_4$ and a hole with momentum $p_2$.

Figure 16: A particle with momentum $p_1$ above the Fermi surface ($p_1 > p_F$) interacts with one of the particles below the Fermi surface with momentum $p_2$. As a result, two new particles appear above the Fermi surface (all other states are full) with momenta $p_3$ and $p_4$.

By Fermi's golden rule, the total probability of such a process is proportional to
$$\frac{1}{\tau} \propto \int\delta\left(\varepsilon_1 + \varepsilon_2 - \varepsilon_3 - \varepsilon_4\right)d^3p_2\,d^3p_3 \tag{66}$$
where $\varepsilon_1 = \frac{p_1^2}{2m} - E_F$, and the integral is subject to the constraints of energy and momentum conservation and that
$$p_2 < p_F, \qquad p_3 > p_F, \qquad p_4 = |\mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_3| > p_F \tag{67}$$
It must be that $\varepsilon_1 + \varepsilon_2 = \varepsilon_3 + \varepsilon_4 > 0$ since both particles 3 and 4 must be above the Fermi surface. However, since $\varepsilon_2 < 0$, if $\varepsilon_1$ is small, then $|\varepsilon_2| \lesssim \varepsilon_1$ is also small, so only a fraction of order $\varepsilon_1/E_F$ of the states may scatter with the state $k_1$, conserve energy, and obey the Pauli principle. This restricts $\varepsilon_2$ to a narrow shell of width $\varepsilon_1/E_F$ near the Fermi surface, and reduces the scattering probability $1/\tau$ by the same factor. Now consider the constraints placed on states $k_3$ and $k_4$ by momentum conservation
$$\mathbf{k}_1 - \mathbf{k}_3 = \mathbf{k}_4 - \mathbf{k}_2. \tag{68}$$
Since $\varepsilon_1$ and $\varepsilon_2$ are confined to a narrow shell around the Fermi surface, so too are $\varepsilon_3$ and $\varepsilon_4$. This can be seen in Fig. 17, where the requirement that $\mathbf{k}_1 - \mathbf{k}_3 = \mathbf{k}_4 - \mathbf{k}_2$ limits the allowed states for particles 3 and 4. If we take $\mathbf{k}_1$ fixed, then the allowed states for 2 and 3 are obtained by rotating the vectors $\mathbf{k}_1 - \mathbf{k}_3 = \mathbf{k}_4 - \mathbf{k}_2$; however, this rotation is severely limited by the fact that particle 3 must remain above, and particle 2 below, the Fermi surface. This restriction on the final states further reduces the scattering probability by a factor of $\varepsilon_1/E_F$.
Figure 17: A quasiparticle of momentum $p_1$ decays via a particle-hole excitation into a quasiparticle of momentum $p_4$. This may also be interpreted as a particle of momentum $p_1$ decaying into particles with momenta $p_3$ and $p_4$ and a hole with momentum $p_2$. Energy conservation requires $|\varepsilon_2| \lesssim \varepsilon_1$, restricting $\varepsilon_2$ to a narrow shell of width $\varepsilon_1/E_F$ near the Fermi surface. Momentum conservation $\mathbf{k}_1 - \mathbf{k}_3 = \mathbf{k}_4 - \mathbf{k}_2$ further restricts the available states by a factor of about $\varepsilon_1/E_F$. Thus the lifetime of a quasiparticle is proportional to $\left(\frac{\varepsilon_1}{E_F}\right)^{-2}$.

Thus, the scattering rate $1/\tau$ is proportional to $\left(\frac{\varepsilon_1}{E_F}\right)^2$, so that excitations of sufficiently small energy will always be sufficiently long lived to satisfy the constraints of reversibility. Finally, the fact that the quasiparticle only interacts with a small number of other particles due to Thomas-Fermi screening (i.e. those within a distance $\approx r_{TF}$) also significantly reduces the scattering rate.
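The quadratic dependence of the scattering rate on excitation energy can be illustrated with a deliberately simplified toy model. The sketch below is not the argument of the text: it assumes a constant density of states, ignores momentum conservation and matrix elements entirely, and simply measures the area of the $(\varepsilon_2, \varepsilon_3)$ plane allowed by energy conservation and the Pauli principle ($\varepsilon_2 < 0$, $\varepsilon_3 > 0$, $\varepsilon_4 = \varepsilon_1 + \varepsilon_2 - \varepsilon_3 > 0$). The function name, bandwidth and grid size are illustrative choices.

```python
import numpy as np

def allowed_phase_space(eps1, half_bandwidth=1.0, n_grid=2000):
    """Area of (eps2, eps3) satisfying eps2 < 0, eps3 > 0 and
    eps4 = eps1 + eps2 - eps3 > 0, estimated on a midpoint grid.
    Energies are measured from the Fermi energy; constant DOS is assumed."""
    h = half_bandwidth / n_grid
    eps2 = -(np.arange(n_grid) + 0.5) * h    # occupied states below the Fermi surface
    eps3 = (np.arange(n_grid) + 0.5) * h     # empty states above the Fermi surface
    eps4 = eps1 + eps2[:, None] - eps3[None, :]
    return np.count_nonzero(eps4 > 0.0) * h * h

area_small = allowed_phase_space(0.1)
area_large = allowed_phase_space(0.2)
print(f"phase space at eps1 = 0.1: {area_small:.5f} (eps1^2 / 2 = {0.1 ** 2 / 2:.5f})")
print(f"phase space at eps1 = 0.2: {area_large:.5f} (eps1^2 / 2 = {0.2 ** 2 / 2:.5f})")
print(f"ratio: {area_large / area_small:.3f}")
```

Doubling $\varepsilon_1$ roughly quadruples the allowed area, which follows the $\varepsilon_1^2/2$ triangle expected analytically and mirrors the $(\varepsilon_1/E_F)^2$ scaling of $1/\tau$ obtained above.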
3.4
Energy of Quasiparticles.
As in the non-interacting system, excitations will be quantified by the deviation of the occupation from the ground state occupation $n_p^0$
$$ \delta n_p = n_p - n_p^0 . \tag{69} $$
At low temperatures $\delta n_p \sim 1$ only for $p \approx p_F$, where the particles are sufficiently long lived that $\tau \gg t$. It is important to emphasize that only $\delta n_p$, not $n_p^0$ or $n_p$, will be physically relevant. This is important since it does not make much sense to talk about quasiparticle states, described by $n_p$, far from the Fermi surface since they are not stable. For the ideal system
$$ E - E_0 = \sum_p \frac{p^2}{2m}\,\delta n_p . \tag{70} $$
For the interacting system $E[n_p]$ becomes much more complicated. If, however, $\delta n_p$ is small (so that the system is close to its ground state) then we may expand:
$$ E[n_p] = E_0 + \sum_p \epsilon_p\,\delta n_p + O(\delta n_p^2) , \tag{71} $$
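where $\epsilon_p = \delta E/\delta n_p$. Note that $\epsilon_p$ is intensive (i.e., it is independent of the system volume). If $\delta n_p = \delta_{p,p'}$, then $E \approx E_0 + \epsilon_{p'}$; i.e., the energy of the quasiparticle of momentum $p'$ is $\epsilon_{p'}$.
In practice we will only need $\epsilon_p$ near the Fermi surface where $\delta n_p$ is finite. So we may approximate
$$ \epsilon_p \approx \mu + (\mathbf{p} - \mathbf{p}_F)\cdot\nabla_p\epsilon_p\big|_{p_F} \tag{72} $$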
where $\nabla_p\epsilon_p = \mathbf{v}_p$, the group velocity of the quasiparticle. The ground state of the $N+1$ particle system is obtained by adding a particle with $\epsilon_p = \epsilon_F = \mu = \partial E_0/\partial N$ (at zero temperature), which defines the chemical potential $\mu$. We may learn more about $\epsilon_p$ by employing the symmetries of our system. If we explicitly display the spin-dependence,
$$ \epsilon_{p,\sigma} = \epsilon_{-p,-\sigma} \quad \text{under time-reversal} \tag{73} $$
$$ \epsilon_{p,\sigma} = \epsilon_{-p,\sigma} \quad \text{under BZ reflection} \tag{74} $$
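So $\epsilon_{p,\sigma} = \epsilon_{-p,\sigma} = \epsilon_{p,-\sigma}$; i.e., in the absence of an external magnetic field, $\epsilon_{p,\sigma}$ does not depend upon $\sigma$. Furthermore, for an isotropic system $\epsilon_p$ depends only upon the magnitude of $\mathbf{p}$, $|\mathbf{p}|$, so $\mathbf{p}$ and
$$ \mathbf{v}_p = \nabla\epsilon_p(|\mathbf{p}|) = \frac{\mathbf{p}}{|\mathbf{p}|}\,\frac{d\epsilon_p(|\mathbf{p}|)}{d|\mathbf{p}|} $$
are parallel. Let us define $m^*$ as the constant of proportionality at the Fermi surface
$$ v_{p_F} = p_F/m^* \tag{75} $$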
Using $m^*$ it is useful to define the density of states at the Fermi surface. Recall that in the non-interacting system,
$$ D(E_F) = \frac{1}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2} E_F^{1/2} = \frac{m p_F}{\pi^2\hbar^3} \tag{76} $$
where $p = \hbar k$, and $E = p^2/2m$. Thus, for the interacting system at the Fermi surface
$$ D_{\rm interacting}(E_F) = \frac{m^* p_F}{\pi^2\hbar^3} , \tag{77} $$
where the $m^*$ (generally $> m$, but not always) accounts for the fact that the quasiparticle may be viewed as a dressed particle, and must “drag” this dressing along with it. I.e., the effective mass to some extent accounts for the interaction between the particles.
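As a quick numerical check of Eqs. 76 and 77, the sketch below evaluates both forms of the free-electron density of states (per unit volume, both spins) and confirms that they agree. The constants are standard SI values; the 5 eV Fermi energy and the ratio $m^*/m = 1.3$ are illustrative assumptions, and $p_F$ is held fixed when the mass is replaced by $m^*$.

```python
import math

HBAR = 1.054571817e-34   # J s
M_E = 9.1093837015e-31   # kg
EV = 1.602176634e-19     # J


def dos_from_energy(e_fermi, mass):
    """D(E_F) = (1 / 2 pi^2) (2m / hbar^2)^(3/2) sqrt(E_F), first form of Eq. 76."""
    return (2.0 * mass / HBAR**2) ** 1.5 * math.sqrt(e_fermi) / (2.0 * math.pi**2)


def dos_from_momentum(p_fermi, mass):
    """D(E_F) = m p_F / (pi^2 hbar^3), second form of Eq. 76 (Eq. 77 with m -> m*)."""
    return mass * p_fermi / (math.pi**2 * HBAR**3)


e_fermi = 5.0 * EV                       # assumed Fermi energy
p_fermi = math.sqrt(2.0 * M_E * e_fermi)  # from E = p^2 / 2m
d_energy_form = dos_from_energy(e_fermi, M_E)
d_momentum_form = dos_from_momentum(p_fermi, M_E)

mass_ratio = 1.3                          # assumed m*/m, for illustration only
d_interacting = dos_from_momentum(p_fermi, mass_ratio * M_E)
print(d_energy_form, d_momentum_form, d_interacting / d_momentum_form)
```

With $p_F$ fixed, the interacting density of states is simply enhanced by the factor $m^*/m$.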
4
Interactions between Particles: Landau Fermi Liquid
4.1
The free energy, and interparticle interactions
The thermodynamics of the system depends upon the free energy $F$, which at zero temperature is
$$ F - F_0 = E - E_0 - \mu(N - N_0) . \tag{78} $$
Since our quasiparticles are formed by adiabatically switching on the interaction in the $N+1$ particle ideal system, adding one quasiparticle to the system adds one real particle. Thus,
$$ N - N_0 = \sum_p \delta n_p , \tag{79} $$
and since
$$ E - E_0 \approx \sum_p \epsilon_p\,\delta n_p , \tag{80} $$
we get
$$ F - F_0 \approx \sum_p (\epsilon_p - \mu)\,\delta n_p . \tag{81} $$
As shown in Fig. 18, we will be interested in excitations of the system which distort the Fermi surface by an amount proportional to $\delta$. For our theory/expansion to remain valid, we must have
$$ \frac{1}{N}\sum_p |\delta n_p| \ll 1 . \tag{82} $$
Figure 18: We consider small distortions of the Fermi surface, proportional to $\delta$, so that $\frac{1}{N}\sum_p |\delta n_p| \ll 1$.
Where $\delta n_p \neq 0$, $\epsilon_p - \mu$ will also be of order $\delta$. Thus,
$$ \sum_p (\epsilon_p - \mu)\,\delta n_p \sim O(\delta^2) , \tag{83} $$
so, to be consistent we must add the next term in the Taylor series expansion of the energy to the expression for the free energy.
$$ F - F_0 = \sum_p (\epsilon_p - \mu)\,\delta n_p + \frac{1}{2}\sum_{p,p'} f_{p,p'}\,\delta n_p\,\delta n_{p'} + O(\delta^3) \tag{84} $$
where
$$ f_{p,p'} = \frac{\delta^2 E}{\delta n_p\,\delta n_{p'}} . \tag{85} $$
The term proportional to $f_{p,p'}$ was added (to the Sommerfeld theory) by L.D. Landau. Since each sum over $p$ is proportional to the volume $V$, as is $F$, it must be that $f_{p,p'} \sim 1/V$. However, it is also clear that $f_{p,p'}$ is an interaction between quasiparticles, each of which is spread out over the whole volume $V$, so the probability that they will interact is $\sim r_{TF}^3/V$, thus
$$ f_{p,p'} \sim r_{TF}^3/V \tag{86} $$
In general, since $\delta n_p$ is only of order one near the Fermi surface, we will only care about $f_{p,p'}$ on the Fermi surface (assuming that it is continuous and changes slowly as we cross the Fermi surface). That is, we are interested only in
$$ f_{p,p'}\big|_{\epsilon_p = \epsilon_{p'} = \mu} . \tag{87} $$
Given this, we can reduce the spin dependence of $f_{p,p'}$ to a symmetric and an antisymmetric part. First, in the absence of an external field, the system should be invariant under time-reversal, so
$$ f_{p\sigma,p'\sigma'} = f_{-p-\sigma,-p'-\sigma'} , \tag{88} $$
and, in a system with reflection symmetry,
$$ f_{p\sigma,p'\sigma'} = f_{-p\sigma,-p'\sigma'} . \tag{89} $$
Then
$$ f_{p\sigma,p'\sigma'} = f_{p-\sigma,p'-\sigma'} . \tag{90} $$
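It must be then that $f$ depends only upon the relative orientations of the spins $\sigma$ and $\sigma'$, so there are only two independent components, $f_{p\uparrow,p'\uparrow}$ and $f_{p\uparrow,p'\downarrow}$. We can split these into symmetric and antisymmetric parts.
$$ f^a_{p,p'} = \frac{1}{2}\left(f_{p\uparrow,p'\uparrow} - f_{p\uparrow,p'\downarrow}\right) , \qquad f^s_{p,p'} = \frac{1}{2}\left(f_{p\uparrow,p'\uparrow} + f_{p\uparrow,p'\downarrow}\right) . \tag{91} $$
$f^a_{p,p'}$ may be interpreted as an exchange interaction, or
$$ f_{p\sigma,p'\sigma'} = f^s_{p,p'} + \sigma\cdot\sigma'\, f^a_{p,p'} \tag{92} $$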
where $\sigma$ and $\sigma'$ are the Pauli matrices for the spins.
Our ideal system is isotropic in momentum. Thus, $f^a_{p,p'}$ and $f^s_{p,p'}$ will only depend upon the angle $\theta$ between $\mathbf{p}$ and $\mathbf{p}'$, and so we may expand either $f^a_{p,p'}$ or $f^s_{p,p'}$:
$$ f^\alpha_{p,p'} = \sum_{l=0}^{\infty} f_l^\alpha P_l(\cos\theta) . \tag{93} $$
Conventionally these $f$ parameters are expressed in terms of reduced units.
$$ D(E_F)\, f_l^\alpha = \frac{V m^* p_F}{\pi^2\hbar^3}\, f_l^\alpha = F_l^\alpha . \tag{94} $$
4.2
Local Energy of a Quasiparticle
Figure 19: The addition of another particle to a homogeneous system yields forces on the quasiparticle which tend to restore equilibrium.
Now consider an interacting system with a certain distribution of excited quasiparticles $\delta n_{p'}$. To this, add another quasiparticle of momentum $p$ ($\delta n_{p'} \to \delta n_{p'} + \delta_{p,p'}$). From Eq. 84 the free energy of the additional quasiparticle is
$$ \tilde\epsilon_p - \mu = \epsilon_p - \mu + \sum_{p'} f_{p',p}\,\delta n_{p'} , \tag{95} $$
(recall that $f_{p,p'} = f_{p',p}$). Both terms here are $O(\delta)$. The second term describes the free energy of a quasiparticle due to the other quasiparticles in the system (some sort of Hartree-like term). The term $\tilde\epsilon_p$ plays the part of the local energy of a quasiparticle. For example, the negative gradient of $\tilde\epsilon_p$ is the force the system exerts on the additional quasiparticle. When the quasiparticle is added to the system, the system is inhomogeneous so that $\delta n_{p'} = \delta n_{p'}(\mathbf{r})$. The system will react to this inhomogeneity by minimizing its free energy so that $\nabla_r F = 0$. However, only the additional free energy due to the added particle (Eq. 95) is inhomogeneous, and has a non-zero gradient. Thus, the system will exert a force
$$ -\nabla_r\tilde\epsilon_p = -\nabla_r \sum_{p'} f_{p',p}\,\delta n_{p'}(\mathbf{r}) \tag{96} $$
on the added quasiparticle resulting from interactions with other quasiparticles.
4.2.1
Equilibrium Distribution of Quasiparticles at Finite T
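$\tilde\epsilon_p$ also plays an important role in the finite-temperature properties of the system. We write
$$ E - E_0 = \sum_p \epsilon_p\,\delta n_p + \frac{1}{2}\sum_{p,p'} f_{p',p}\,\delta n_{p'}\,\delta n_p . \tag{97} $$
Now suppose that $\sum_p |\langle\delta n_p\rangle| \ll N$, as indeed it must be for the expansion above to be valid, so that
$$ \delta n_p = \langle\delta n_p\rangle + \left(\delta n_p - \langle\delta n_p\rangle\right) \tag{98} $$
where the first term is $O(\delta)$, and the second $O(\delta^2)$. Thus,
$$ \delta n_p\,\delta n_{p'} \approx -\langle\delta n_p\rangle\langle\delta n_{p'}\rangle + \langle\delta n_p\rangle\,\delta n_{p'} + \langle\delta n_{p'}\rangle\,\delta n_p \tag{99} $$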
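We may use this to rewrite the energy of our interacting system
$$ \begin{aligned}
E - E_0 &\approx \sum_p \epsilon_p\,\delta n_p - \frac{1}{2}\sum_{p,p'} f_{p',p}\langle\delta n_p\rangle\langle\delta n_{p'}\rangle + \sum_{p,p'} f_{p',p}\langle\delta n_p\rangle\,\delta n_{p'} \\
&\approx \sum_p \Big(\epsilon_p + \sum_{p'} f_{p',p}\langle\delta n_{p'}\rangle\Big)\delta n_p - \frac{1}{2}\sum_{p,p'} f_{p',p}\langle\delta n_p\rangle\langle\delta n_{p'}\rangle \\
&\approx \sum_p \langle\tilde\epsilon_p\rangle\,\delta n_p - \frac{1}{2}\sum_{p,p'} f_{p',p}\langle\delta n_p\rangle\langle\delta n_{p'}\rangle + O(\delta^4)
\end{aligned} \tag{100} $$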
At this point, we may repeat the arguments made earlier to determine the fermion occupation probability for non-interacting Fermions (the constant term on the right-hand side has no effect). We will obtain
$$ n_p(T,\mu) = \frac{1}{1 + \exp\beta(\langle\tilde\epsilon_p\rangle - \mu)} , \tag{101} $$
or
$$ \delta n_p(T,\mu) = \frac{1}{1 + \exp\beta(\langle\tilde\epsilon_p\rangle - \mu)} - \theta(p_F - p) . \tag{102} $$
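To illustrate Eqs. 101 and 102, the following sketch evaluates the occupation and its deviation from the filled Fermi sea on a few energies around $\mu$. It is only a sketch under stated assumptions: the function names are hypothetical, the energy argument stands in for $\langle\tilde\epsilon_p\rangle$, the step $\theta(p_F - p)$ is represented by the condition energy $< \mu$ (a monotonic dispersion is assumed), and all numbers are in arbitrary units.

```python
import numpy as np


def fermi_occupation(energy, mu, kT):
    """n = 1 / (1 + exp((energy - mu)/kT)), written with tanh to avoid overflow."""
    x = (np.asarray(energy, dtype=float) - mu) / kT
    return 0.5 * (1.0 - np.tanh(0.5 * x))


def delta_n(energy, mu, kT):
    """Deviation from the T = 0 occupation, which is 1 below mu and 0 above."""
    energy = np.asarray(energy, dtype=float)
    ground_state = (energy < mu).astype(float)
    return fermi_occupation(energy, mu, kT) - ground_state


mu, kT = 1.0, 0.01                      # assumed values, arbitrary units
energies = np.array([0.5, 0.99, 1.01, 1.5])
dn = delta_n(energies, mu, kT)
n_at_mu = float(fermi_occupation(mu, mu, kT))
print(dn, n_at_mu)
```

The deviation is appreciable only within a few $k_BT$ of the Fermi surface and is antisymmetric about $\mu$, consistent with the statement that $\delta n_p$ is of order one only near $p_F$.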
However, at least for an isotropic system, this expression bears closer investigation. Here, the molecular field (evaluated within $k_BT$ of the Fermi surface)
$$ \langle\tilde\epsilon_p - \epsilon_p\rangle = \sum_{p'} f_{p',p}\langle\delta n_{p'}\rangle \tag{103} $$
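must be independent of the location of $p$ on the Fermi surface (and, of course, spin), and is thus constant. To see this, reconsider the Legendre polynomial expansion discussed earlier
$$ \begin{aligned}
\langle\tilde\epsilon_p - \epsilon_p\rangle &= \sum_{p'} f_{p',p}\langle\delta n_{p'}\rangle \\
&\propto \sum_l \int d^3p'\, f_l\, P_l(\cos\theta)\,\langle\delta n_{p'}\rangle \\
&\propto f_0 \int d^3p'\,\langle\delta n_{p'}\rangle = 0
\end{aligned} \tag{104} $$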
In going from the second to the third line above, we made use of the isotropy of the system, so that $\langle\delta n_{p'}\rangle$ is independent of the angle $\theta$. The evaluation in the third line follows from particle number conservation. Thus, to lowest order in $\delta$
$$ n_p(T,\mu) = \frac{1}{1 + \exp\beta(\epsilon_p - \mu)} + O(\delta^3) \tag{105} $$
4.3
Effective Mass m∗ of Quasiparticles
This argument most closely follows that of AGD, and we will follow their notation as closely as possible (without introducing any new symbols). In particular, since an integration by parts is necessary, we will use a momentum integral (as opposed to a momentum sum) notation
$$ \sum_p \to V\int\frac{d^3p}{(2\pi\hbar)^3} . \tag{106} $$
The net momentum of the volume $V$ of quasiparticles is
$$ \mathbf{P}_{qp} = 2V\int\frac{d^3p}{(2\pi\hbar)^3}\,\mathbf{p}\,n_p \quad \text{(net quasiparticle momentum)} \tag{107} $$
which is also the momentum of the Fermi liquid. On the other hand, since the number of particles equals the number of quasiparticles, the quasiparticle and particle currents must also be equal
$$ \mathbf{J}_{qp} = \mathbf{J}_p = 2V\int\frac{d^3p}{(2\pi\hbar)^3}\,\mathbf{v}_p\,n_p \quad \text{(net quasiparticle and particle current)} \tag{108} $$
or, since the momentum is just the particle mass times this current
$$ \mathbf{P}_p = 2Vm\int\frac{d^3p}{(2\pi\hbar)^3}\,\mathbf{v}_p\,n_p \quad \text{(net particle momentum)} \tag{109} $$
where $\mathbf{v}_p = \nabla_p\tilde\epsilon_p$ is the velocity of the quasiparticle. So
$$ \int\frac{d^3p}{(2\pi\hbar)^3}\,\mathbf{p}\,n_p = m\int\frac{d^3p}{(2\pi\hbar)^3}\,\nabla_p\tilde\epsilon_p\,n_p \tag{110} $$
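Now make an arbitrary change of $n_p$ and recall that $\tilde\epsilon_p$ depends upon $n_p$, so that
$$ \delta\tilde\epsilon_p = V\sum_{\sigma'}\int\frac{d^3p'}{(2\pi\hbar)^3}\, f_{p,p'}\,\delta n_{p'} . \tag{111} $$
For Eq. 110, this means that
$$ \int\frac{d^3p}{(2\pi\hbar)^3}\,\mathbf{p}\,\delta n_p = m\int\frac{d^3p}{(2\pi\hbar)^3}\,\nabla_p\tilde\epsilon_p\,\delta n_p + mV\int\frac{d^3p}{(2\pi\hbar)^3}\sum_{\sigma'}\int\frac{d^3p'}{(2\pi\hbar)^3}\,\nabla_p\!\left(f_{p,p'}\,\delta n_{p'}\right) n_p , \tag{112} $$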
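or integrating by parts (and renaming $p \to p'$ in the last part), we get
$$ \int\frac{d^3p}{(2\pi\hbar)^3}\,\frac{\mathbf{p}}{m}\,\delta n_p = \int\frac{d^3p}{(2\pi\hbar)^3}\,\nabla_p\tilde\epsilon_p\,\delta n_p - V\sum_{\sigma'}\int\frac{d^3p}{(2\pi\hbar)^3}\int\frac{d^3p'}{(2\pi\hbar)^3}\,\delta n_p\, f_{p,p'}\,\nabla_{p'} n_{p'} , \tag{113} $$
Then, since $\delta n_p$ is arbitrary, it must be that the integrands themselves are equal
$$ \frac{\mathbf{p}}{m} = \nabla_p\tilde\epsilon_p - V\sum_{\sigma'}\int\frac{d^3p'}{(2\pi\hbar)^3}\, f_{p,p'}\,\nabla_{p'} n_{p'} \tag{114} $$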
The factor $\nabla_{p'} n_{p'} = -\frac{\mathbf{p}'}{p'}\,\delta(p' - p_F)$. The integral may be evaluated by taking advantage of the system isotropy, and setting $\mathbf{p}$ parallel to the z-axis. Since we are mostly interested in the properties of the system on the Fermi surface we take $p = p_F$, let $\theta$ be the angle between $\mathbf{p}$ (or the z-axis) and $\mathbf{p}'$, and finally note that on the Fermi surface $|\nabla_p\tilde\epsilon_p|_{p=p_F} = v_F = p_F/m^*$. Thus, taking the z-component,
$$ \frac{p_F}{m} = \frac{p_F}{m^*} + V\sum_{\sigma'}\int\frac{p'^2\,dp'\,d\Omega}{(2\pi\hbar)^3}\, f_{p\sigma,p'\sigma'}\,\frac{p'_z}{p'}\,\delta(p' - p_F) \tag{115} $$
However, since both $\mathbf{p}$ and $\mathbf{p}'$ are restricted to the Fermi surface, $p'_z/p' = \cos\theta$, and evaluating the integral over $p'$, we get
$$ \frac{1}{m} = \frac{1}{m^*} + \frac{V p_F}{2}\sum_{\sigma,\sigma'}\int\frac{d\Omega}{(2\pi\hbar)^3}\, f_{p\sigma,p'\sigma'}\cos\theta , \tag{116} $$
where the additional factor of $\frac{1}{2}$ compensates for the additional spin sum. If we now sum over both spins, $\sigma$ and $\sigma'$, only the symmetric part of $f$ survives (the sum yields $4f^s$), so
$$ \frac{1}{m} = \frac{1}{m^*} + \frac{4\pi V p_F}{(2\pi\hbar)^3}\int d(\cos\theta)\, f^s(\theta)\cos\theta , \tag{117} $$
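We now expand $f$ in a Legendre polynomial series
$$ f^\alpha(\theta) = \sum_l f_l^\alpha P_l(\cos\theta) , \tag{118} $$
and recall that $P_0(x) = 1$, $P_1(x) = x$, ..., that
$$ \int_{-1}^{1} dx\, P_n(x) P_m(x) = \frac{2}{2n+1}\,\delta_{nm} \tag{119} $$
and finally that
$$ D(0)\, f_l^\alpha = \frac{V m^* p_F}{\pi^2\hbar^3}\, f_l^\alpha = F_l^\alpha , \tag{120} $$
we find that
$$ \frac{1}{m} = \frac{1}{m^*} + \frac{F_1^s}{3m^*} , \tag{121} $$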
Quantity | Fermi Liquid | Fermi Liquid/Fermi Gas
Specific Heat | $C_V = \frac{m^* p_F}{3\hbar^3} k_B^2 T$ | $\frac{C_V}{C_{V0}} = \frac{m^*}{m} = 1 + F_1^s/3$
Compressibility | | $\frac{\kappa}{\kappa_0} = \frac{1 + F_1^s/3}{1 + F_0^s}$
Sound Velocity | $c^2 = \frac{p_F^2}{3mm^*}(1 + F_0^s)$ | $\left(\frac{c}{c_0}\right)^2 = \frac{1 + F_0^s}{1 + F_1^s/3}$
Spin Susceptibility | $\chi = \frac{m^* p_F}{\pi^2\hbar^3}\,\frac{\beta^2}{1 + F_0^a}$ | $\frac{\chi}{\chi_0} = \frac{1 + F_1^s/3}{1 + F_0^a}$
Table 1: Fermi Liquid relations between the Landau parameters $F_n^\alpha$ and some experimentally measurable quantities. For the latter, a zero subscript indicates the value for the non-interacting Fermi gas.
or $m^*/m = 1 + F_1^s/3$. The effective mass cannot be experimentally measured directly; however, it appears in many physically relevant measurable quantities, including the specific heat
$$ C_V = \left(\frac{\partial (E/V)}{\partial T}\right)_{V,N} = \frac{1}{V}\frac{\partial}{\partial T}\sum_p \tilde\epsilon_p n_p . \tag{122} $$
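To lowest order in $\delta$, we may neglect $f_{p,p'}$ in both $\tilde\epsilon_p$ and $n_p$, so
$$ C_V = \frac{1}{V}\sum_p \epsilon_p\,\frac{\partial n_p}{\partial T} . \tag{123} $$
Recall that the density of states $D(E) = \sum_p \delta(E - \epsilon_p)$, and making the same assumption that we made for the non-interacting system, that $\partial\mu/\partial T$ is negligible, we get
$$ C_V = \frac{1}{V}\int d\epsilon\, D(\epsilon)\,\epsilon\,\frac{\partial}{\partial T}\,\frac{1}{\exp\beta(\epsilon - \mu) + 1} . \tag{124} $$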
This integral is identical to the one we had to evaluate for the non-interacting system, and yields the result
$$ C_V = \frac{\pi^2}{3V}\, k_B^2 T\, D(E_F) = \frac{k_B^2 T\, m^* p_F}{3\hbar^3} . \tag{125} $$
Thus, measuring the electronic contribution to the specific heat $C_V$ yields information about the effective mass $m^*$, and hence $F_1^s$. Other measurements are related to some of the remaining Landau parameters, as summarized in Table 1.
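The Legendre projection behind Eqs. 117 to 121 can be sketched numerically. The example below projects an angular interaction function onto $P_l(\cos\theta)$ with Gauss-Legendre quadrature and then applies $m^*/m = 1 + F_1^s/3$. The toy function `toy_F_s` (already expressed in reduced units) and its coefficients are purely illustrative assumptions, not values from the text.

```python
import numpy as np


def legendre_coefficient(f_of_theta, l, n_nodes=64):
    """f_l = (2l + 1)/2 * integral_{-1}^{1} f(theta) P_l(x) dx, with x = cos(theta)."""
    x, w = np.polynomial.legendre.leggauss(n_nodes)
    p_l = np.polynomial.legendre.Legendre.basis(l)(x)
    return 0.5 * (2 * l + 1) * np.sum(w * f_of_theta(np.arccos(x)) * p_l)


def mass_ratio(F1s):
    """m*/m = 1 + F_1^s / 3 (Eq. 121 rearranged)."""
    return 1.0 + F1s / 3.0


def toy_F_s(theta):
    # Hypothetical reduced interaction F^s(theta) = 10 P_0 + 6 P_1(cos theta).
    return 10.0 + 6.0 * np.cos(theta)


F0s = legendre_coefficient(toy_F_s, 0)
F1s = legendre_coefficient(toy_F_s, 1)
ratio = mass_ratio(F1s)   # also equals C_V / C_V0 according to Table 1
print(F0s, F1s, ratio)
```

For this toy input the projection recovers the coefficients that were put in, and the $l = 1$ component alone fixes the mass enhancement (and hence the specific heat ratio), while $F_0^s$ does not enter.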
Chapter 7: The Electronic Band Structure of Solids Bloch & Slater April 2, 2001
Contents
1 Symmetry of ψ(r)
2 The nearly free Electron Approximation
2.1 The Origin of Band Gaps
3 Tight Binding Approximation
4 Photo-Emission Spectroscopy
Figure 1: The additional effects of the lattice potential can have a profound effect on the electronic density of states (RIGHT) compared to the free-electron result (LEFT). [Panel labels: "Free electrons - FLT" with V(r) = V_0; "Band Structure" with V(r); axes D(E) versus E, with E_f marked for a metal, an insulator, and a "heavy" metal.]
In the last chapter, we ignored the lattice potential and considered the effects of a small electronic potential $U$. In this chapter we will set $U = 0$, and consider the effects of the ion potential $V(\mathbf{r})$. As shown in Fig. 1, additional effects of the lattice potential can have a profound effect on the electronic density of states compared to the free-electron result, and depending on the location of the Fermi energy, the resulting system can be a metal, a semimetal, an insulator, or a metal with an enhanced electronic mass.
1
Symmetry of ψ(r)
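From the symmetry of the electronic potential $V(\mathbf{r})$ one may infer some of the properties of the electronic wave functions $\psi(\mathbf{r})$. Due to the translational symmetry of the lattice, $V(\mathbf{r})$ is periodic
$$ V(\mathbf{r}) = V(\mathbf{r} + \mathbf{r}_n) , \qquad \mathbf{r}_n = n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3 \tag{1} $$
and may then be expanded in a Fourier expansion
$$ V(\mathbf{r}) = \sum_G V_G\, e^{i\mathbf{G}\cdot\mathbf{r}} , \qquad \mathbf{G} = h\mathbf{g}_1 + k\mathbf{g}_2 + l\mathbf{g}_3 , \tag{2} $$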
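which, since $\mathbf{G}\cdot\mathbf{r}_n = 2\pi m$ ($m \in \mathbb{Z}$), guarantees $V(\mathbf{r}) = V(\mathbf{r} + \mathbf{r}_n)$. Given this, and letting $\psi(\mathbf{r}) = \sum_k C_k e^{i\mathbf{k}\cdot\mathbf{r}}$, the Schroedinger equation becomes
$$ H\psi(\mathbf{r}) = \left[-\frac{\hbar^2\nabla^2}{2m} + V(\mathbf{r})\right]\psi = E\psi \tag{3} $$
$$ \Rightarrow\quad \sum_k \frac{\hbar^2k^2}{2m}\, C_k e^{i\mathbf{k}\cdot\mathbf{r}} + \sum_{k'}\sum_G C_{k'} V_G\, e^{i(\mathbf{k}'+\mathbf{G})\cdot\mathbf{r}} = E\sum_k C_k e^{i\mathbf{k}\cdot\mathbf{r}} , \qquad k' \to k - G \tag{4} $$
or
$$ \sum_k e^{i\mathbf{k}\cdot\mathbf{r}}\left[\left(\frac{\hbar^2k^2}{2m} - E\right)C_k + \sum_G V_G C_{k-G}\right] = 0 \quad \forall\,\mathbf{r} \tag{5} $$
Since this is true for any $\mathbf{r}$, it must be that
$$ \left(\frac{\hbar^2k^2}{2m} - E\right)C_k + \sum_G V_G C_{k-G} = 0 , \quad \forall\,\mathbf{k} \tag{6} $$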
Thus the potential acts to couple each $C_k$ only with its reciprocal space translations $C_{k+G}$, and the problem decouples into $N$ independent problems, one for each $k$ in the first BZ. I.e., each of the $N$ problems has a solution which is a sum over plane waves whose wave vectors differ only by $G$'s. Thus the eigenvalues may be indexed by $k$.
$$ E_k = E(\mathbf{k}) , \quad \text{i.e., } k \text{ is still a good quantum number!} \tag{7} $$
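We may now sum over $G$ to get $\psi_k$, with the eigenvector sum restricted to reciprocal lattice sites $k, k+G, \ldots$
$$ \psi_k(\mathbf{r}) = \sum_G C_{k-G}\, e^{i(\mathbf{k}-\mathbf{G})\cdot\mathbf{r}} = \left(\sum_G C_{k-G}\, e^{-i\mathbf{G}\cdot\mathbf{r}}\right) e^{i\mathbf{k}\cdot\mathbf{r}} \tag{8} $$
$$ \psi_k(\mathbf{r}) = U_k(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}} , \quad \text{where } U_k(\mathbf{r}) = U_k(\mathbf{r} + \mathbf{r}_n) \tag{9} $$
Note that if $V(\mathbf{r}) = 0$, $U(\mathbf{r}) = 1/\sqrt{V}$. This result is called Bloch’s Theorem; i.e., $\psi$ may be resolved into a plane wave and a periodic function. Its consequences are as follows:
$$ \psi_{k+G}(\mathbf{r}) = \sum_{G'} C_{k+G-G'}\, e^{-i(\mathbf{G}'-\mathbf{k}-\mathbf{G})\cdot\mathbf{r}} = \left(\sum_{G''} C_{k-G''}\, e^{-i\mathbf{G}''\cdot\mathbf{r}}\right) e^{i\mathbf{k}\cdot\mathbf{r}} = \psi_k(\mathbf{r}) , \quad \text{where } G'' \equiv G' - G \tag{10} $$
Figure 2: The potential acts to couple each $C_k$ with its reciprocal space translations $C_{k+G}$ (i.e. x → x, • → •, and ° → °) and the problem decouples into $N$ independent problems for each $k$ in the first BZ.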
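I.e., $\psi_{k+G}(\mathbf{r}) = \psi_k(\mathbf{r})$ and as a result
$$ H\psi_k = E(\mathbf{k})\,\psi_k \tag{11} $$
$$ H\psi_{k+G} = E(\mathbf{k}+\mathbf{G})\,\psi_{k+G} \quad\Rightarrow\quad H\psi_k = E(\mathbf{k}+\mathbf{G})\,\psi_k \tag{12} $$
Thus $E(\mathbf{k}+\mathbf{G}) = E(\mathbf{k})$: $E(\mathbf{k})$ is periodic. Since both $\psi_k(\mathbf{r})$ and $E(\mathbf{k})$ are periodic in reciprocal space, one only needs knowledge of them in the first BZ to know them everywhere.
2
The nearly free Electron Approximation.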
If the potential is weak, $V_G \approx 0\ \forall G$, then we may solve the $V_G = 0$ problem, subject to our constraints of periodicity, and treat $V_G$ as a perturbation. When $V_G = 0$, then
$$ E(\mathbf{k}) = \frac{\hbar^2k^2}{2m} \quad \text{(free electron)} \tag{13} $$
However, we must also have that (if $V_G \neq 0$)
$$ E(\mathbf{k}) = E(\mathbf{k}+\mathbf{G}) \approx \frac{\hbar^2}{2m}|\mathbf{k}+\mathbf{G}|^2 \tag{14} $$
I.e., the possible electron states are not restricted to a single parabola, but can be found equally well on parabolas shifted by any $G$ vector. In 1-d, since $E(k) = E(k+G)$, it is sufficient to represent this in the first zone only. For example, in a 3-D cubic lattice the energy band structure along $k_x$ ($k_y = k_z = 0$) is already rather complicated within the first zone. (See Fig. 4.)
Figure 3: For small $V_G$, we may approximate the band structure as composed of $N$ parabolic bands. Of course, it is sufficient to consider this in the first Brillouin zone, where the parabolas centered at finite $G$ cross at high energies. To understand the effects of the perturbation $V_G$, consider the special $k$ at the edge of the BZ where the parabolas cross.
The effect of $V_G$ can now be discussed. Let’s return to the 1-d problem and consider the edges of the zone where the parabolas intersect. (See Fig. 3.) An electron state with $k = \pi/a$ will involve at least the two $G$ values $G = 0, 2\pi/a$. Of course, the exact solution must involve all $G$ since
$$ \left(\frac{\hbar^2k^2}{2m} - E_k\right)C_k + \sum_G V_G C_{k-G} = 0 \tag{15} $$
Figure 4: The situation becomes more complicated in three dimensions since there are many more bands and so they can cross the first zone at lower energies. For example, in a 3-D cubic lattice the energy band structure along $k_x$ ($k_y = k_z = 0$) is already rather complicated within the first zone.
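We can generally take $V_0 = 0$ since this just sets a zero for the potential. Then, those $G$ for which $E_k = E_{k-G} \approx \hbar^2k^2/2m$ are going to give the largest contribution since
$$ C_k = \sum_G \frac{V_G\, C_{k-G}}{\frac{\hbar^2k^2}{2m} - E_{k-G}} \tag{16} $$
$$ C_k \sim \frac{V_{G_1}\, C_{k-G_1}}{\frac{\hbar^2k^2}{2m} - E_{k-G_1}} \tag{17} $$
$$ C_{k-G_1} = \sum_G \frac{V_G\, C_{k-G_1-G}}{\frac{\hbar^2k^2}{2m} - E_{k-G-G_1}} \tag{18} $$
$$ C_{k-G_1} \sim \frac{V_{-G_1}\, C_k}{\frac{\hbar^2k^2}{2m} - E_k} \tag{19} $$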
Thus to a first approximation, we may neglect the other $C_{k-G}$, and since $V_G = V_{-G}$ (so that $V(\mathbf{r})$ is real) $|C_k| \approx |C_{k-G_1}| \gg$ other $C_{k-G}$
$$ \psi_k(\mathbf{r}) = \sum_G C_{k-G}\, e^{i(\mathbf{k}-\mathbf{G})\cdot\mathbf{r}} \sim \begin{cases} \left(e^{iGx/2} + e^{-iGx/2}\right) \sim \cos(\pi x/a) \\ \left(e^{iGx/2} - e^{-iGx/2}\right) \sim \sin(\pi x/a) \end{cases} \tag{20} $$
The corresponding electron densities are sketched in Fig. 5. Clearly $\rho_+$ has higher density near the ionic cores, and will be more tightly bound, thus $E_+ < E_-$. Thus a gap opens in $E_k$ near $k = G/2$.
2.1
The Origin of Band Gaps
Now let’s reexamine this gap at $k = G_1/2$ in a quantitative manner. Start with the eigenvalue equation shifted by $G$.
$$ \left(E_k - \frac{\hbar^2}{2m}|\mathbf{k}-\mathbf{G}|^2\right)C_{k-G} = \sum_{G'} V_{G'}\, C_{k-G-G'} = \sum_{G'} V_{G'-G}\, C_{k-G'} \tag{21} $$
$$ C_{k-G} = \frac{\sum_{G'} V_{G'-G}\, C_{k-G'}}{E_k - \frac{\hbar^2}{2m}|\mathbf{k}-\mathbf{G}|^2} \tag{22} $$
Figure 5: $\rho_+ \sim \cos^2(\pi x/a)$ has higher density near the ionic cores, and will be more tightly bound, thus $E_+ < E_-$. Thus a gap opens in $E_k$ near $k = G/2$.
To a first approximation ($V_G \simeq 0$) let’s set $E = \hbar^2k^2/2m$ (a free-electron energy) and ignore all but the largest $C_{k-G}$; i.e., those for which the denominator vanishes.
$$ k^2 = |\mathbf{k} - \mathbf{G}|^2 , \tag{23} $$
or in 1-d
$$ k^2 = \left(k - \frac{2\pi}{a}\right)^2 \quad \text{or} \quad k = \frac{\pi}{a} \tag{24} $$
(and $k = -\pi/a$ for $G = -2\pi/a$). This is just the Laue condition, which was shown to be equivalent to the Bragg condition. I.e., the strongest perturbation to the free-electron picture occurs for states at the edge of the first B.Z. Thus the equation above also tells us that $C_k$ and $C_{k-G_1}$ are the most important coefficients (if this electronic state was unperturbed, only $C_k$ would be important).
Figure 6: We can satisfy the condition $E_k \simeq E_{k-G}$ only for $k$ on the edge of the B.Z. Here the lattice potential strongly perturbs the electronic states (i.e. more than one $C_{k-G}$ is finite); elsewhere the electrons are essentially free.
Thus approximately for $V_G \sim 0$, $V_0 \equiv 0$ and for $k$ near the zone boundary
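$$ G = 0: \quad \left(E - \frac{\hbar^2k^2}{2m}\right)C_k = V_{G_1}\, C_{k-G_1} \tag{25} $$
$$ G = G_1: \quad \left(E - \frac{\hbar^2|\mathbf{k}-\mathbf{G}_1|^2}{2m}\right)C_{k-G_1} = V_{-G_1}\, C_k , \tag{26} $$
Again, ignore all other $C_G$. This is a secular equation which has a nontrivial solution iff
$$ \begin{vmatrix} \frac{\hbar^2k^2}{2m} - E & V_{G_1} \\ V_{-G_1} & \frac{\hbar^2|\mathbf{k}-\mathbf{G}_1|^2}{2m} - E \end{vmatrix} = 0 \tag{27} $$
or
$$ \begin{vmatrix} E_k^0 - E & V_{G_1} \\ V_{-G_1} & E^0_{k-G_1} - E \end{vmatrix} = 0 \tag{28} $$
($V_{-G} = V_G^*$, so that $V(\mathbf{r}) \in \mathbb{R}$)
$$ (E_k^0 - E)(E^0_{k-G_1} - E) - |V_{G_1}|^2 = 0 \tag{29} $$
$$ E_k^0 E^0_{k-G_1} - E\left(E_k^0 + E^0_{k-G_1}\right) + E^2 - |V_{G_1}|^2 = 0 \tag{30} $$
$$ E^\pm = \frac{1}{2}\left(E^0_{k-G_1} + E_k^0\right) \pm \left[\frac{1}{4}\left(E^0_{k-G_1} - E_k^0\right)^2 + |V_{G_1}|^2\right]^{1/2} \tag{31} $$
At the zone boundary, where $E^0_{k-G_1} = E_k^0$, the gap is
$$ \Delta E = E_+ - E_- = 2|V_{G_1}| \tag{32} $$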
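The two-band result can be checked against a direct diagonalization of the plane-wave equations. The sketch below, in units where $\hbar^2/2m = 1$ and $a = 1$, evaluates Eq. 31 and compares it with the eigenvalues of a truncated 1-d plane-wave Hamiltonian in which only $V_{\pm G_1}$ is nonzero. The function names, the potential strength, and the truncation to 11 plane waves are illustrative assumptions.

```python
import numpy as np


def two_band_energies(k, G1, V, hbar2_over_2m=1.0):
    """E^- and E^+ of Eq. 31 from the free-electron energies at k and k - G1."""
    e_k = hbar2_over_2m * k**2
    e_kG = hbar2_over_2m * (k - G1) ** 2
    mean = 0.5 * (e_kG + e_k)
    split = np.sqrt(0.25 * (e_kG - e_k) ** 2 + abs(V) ** 2)
    return mean - split, mean + split


def plane_wave_bands(k, a, V, n_G=5, hbar2_over_2m=1.0):
    """Eigenvalues of the 1-d plane-wave Hamiltonian with V_{+-2pi/a} = V (real)."""
    G = (2.0 * np.pi / a) * np.arange(-n_G, n_G + 1)
    H = np.diag(hbar2_over_2m * (k - G) ** 2)      # kinetic energy on the diagonal
    for i in range(len(G) - 1):                    # couple neighbouring G vectors
        H[i, i + 1] = H[i + 1, i] = V
    return np.linalg.eigvalsh(H)                   # sorted ascending


a, V = 1.0, 0.2                 # assumed lattice constant and potential strength
k_edge, G1 = np.pi / a, 2.0 * np.pi / a
e_minus, e_plus = two_band_energies(k_edge, G1, V)
gap_two_band = e_plus - e_minus
bands = plane_wave_bands(k_edge, a, V)
gap_full = bands[1] - bands[0]
print(gap_two_band, gap_full)
```

For a weak potential the two estimates of the gap at the zone boundary agree closely, which is the sense in which the other $C_{k-G}$ may be neglected there.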
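Figure 7: [Sketch of E versus k between −π/a and π/a, with a gap of 2|V_G| at the zone boundary.]
And the band structure looks something like Fig. 7. Within this approximation, the gaps, or forbidden regions in which there are no electronic states, arise when the Bragg condition ($\mathbf{k}_f - \mathbf{k}_0 = \mathbf{G}$) is satisfied.
$$ |-\mathbf{k}| \approx |\mathbf{k} + \mathbf{G}| \tag{33} $$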
The interpretation is clear: the high degree of back scattering for these k-values destroys the electronic states. Thus, by treating the lattice potential as a perturbation to the free electron problem, we see that gaps arise due to enhanced electron-lattice back scattering for $k$ near the zone edge. However, in chapter one, we considered band structure qualitatively and determined that gaps could arise from perturbing about the atomic limit. This, in fact, is another natural way of constructing a band structure theory. It is called the tight-binding approximation.
Figure 8: Band gaps in the electronic DOS naturally emerge when perturbing around the atomic limit. As we bring more atoms together (left) or bring the atoms in the lattice closer together (right), bands form from mixing of the orbital states. If the band broadening is small enough, gaps remain between the bands. [Axes: state energies versus separation.]
Figure 9: In the tight-binding approximation, we generally ignore the core electron dynamics and consider only the ionic core potential. For now let’s assume that there is only one valence orbital $\phi_i$ on each atom.
3
Tight Binding Approximation
In the tight-binding approximation, we generally ignore the core electron dynamics and consider only the ionic core potential. For now let’s assume that there is only one valence orbital $\phi_i$ on each atom. We will also assume that the atomic problem is solved, and perturb around this solution. The atomic problem has valence eigenstates $\phi_i$, and eigenenergies $E_i$. The unperturbed Schroedinger equation for the $n$th atom is
$$ H_A(\mathbf{r}-\mathbf{r}_n)\,\phi_i(\mathbf{r}-\mathbf{r}_n) = E_i\,\phi_i(\mathbf{r}-\mathbf{r}_n) \tag{34} $$
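There is a weak perturbation $v(\mathbf{r}-\mathbf{r}_n)$ coming from the atomic potentials of the other atoms $\mathbf{r}_m \neq \mathbf{r}_n$
$$ H = H_A + v = -\frac{\hbar^2\nabla^2}{2m} + V_A(\mathbf{r}-\mathbf{r}_n) + v(\mathbf{r}-\mathbf{r}_n) \tag{35} $$
$$ v(\mathbf{r}-\mathbf{r}_n) = \sum_{m\neq n} V_A(\mathbf{r}-\mathbf{r}_m) \tag{36} $$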
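We now seek solutions of the Schroedinger equation indexed by $k$ (Bloch’s theorem)
$$ H\psi_k(\mathbf{r}) = E(\mathbf{k})\,\psi_k(\mathbf{r}) \tag{37} $$
$$ \Rightarrow\quad E(\mathbf{k}) = \frac{\langle\psi_k|H|\psi_k\rangle}{\langle\psi_k|\psi_k\rangle} \tag{38} $$
where
$$ \langle\psi_k|\psi_k\rangle \equiv \int d^3r\,\psi_k^*(\mathbf{r})\,\psi_k(\mathbf{r}) , \qquad \langle\psi_k|H|\psi_k\rangle \equiv \int d^3r\,\psi_k^*(\mathbf{r})\,H\psi_k(\mathbf{r}) \tag{39} $$
Of course, this problem is almost hopelessly complicated. We cannot solve for $\psi_k$. Rather, we will solve for some $\phi_k \simeq \psi_k$ where the parameters of $\phi_k$ are determined by minimizing
$$ \frac{\langle\phi_k|H|\phi_k\rangle}{\langle\phi_k|\phi_k\rangle} \geq E(\mathbf{k}) . \tag{40} $$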
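This is called the Rayleigh-Ritz variational principle. Consistent with our original motivation, we will approximate $\psi_k$ with a sum over atomic states.
$$ \psi_k \simeq \phi_k = \sum_n a_n\,\phi_i(\mathbf{r}-\mathbf{r}_n) = \sum_n e^{i\mathbf{k}\cdot\mathbf{r}_n}\,\phi_i(\mathbf{r}-\mathbf{r}_n) \tag{41} $$
where $\phi_k$ must be a Bloch state, $\psi_k(\mathbf{r}) = U_k(\mathbf{r})e^{i\mathbf{k}\cdot\mathbf{r}}$ with $\psi_k(\mathbf{r}) = \psi_{k+G}(\mathbf{r})$, so that $\phi_{k+G} = \phi_k$, which dictates our choice $a_n = e^{i\mathbf{k}\cdot\mathbf{r}_n}$. Thus at this level of approximation we have no free parameters to vary to minimize $\langle\phi_k|H|\phi_k\rangle/\langle\phi_k|\phi_k\rangle \approx E(\mathbf{k})$. Using $\phi_k$ as an approximate state, the energy denominator $\langle\phi_k|\phi_k\rangle$ becomes
$$ \langle\phi_k|\phi_k\rangle = \sum_{n,m} e^{i\mathbf{k}\cdot(\mathbf{r}_n-\mathbf{r}_m)}\int d^3r\,\phi_i^*(\mathbf{r}-\mathbf{r}_m)\,\phi_i(\mathbf{r}-\mathbf{r}_n) \tag{42} $$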
Let’s imagine that the valence orbital of interest, $\phi_i$, has a very small overlap with adjacent atoms so that
$$ \langle\phi_k|\phi_k\rangle \simeq \sum_n\int d^3r\,\phi_i^*(\mathbf{r}-\mathbf{r}_n)\,\phi_i(\mathbf{r}-\mathbf{r}_n) = N \tag{43} $$
The last identity follows since $\phi_i$ is normalized.
Figure 10: In the tight binding approximation, we assume that the atomic orbitals of adjacent sites, $\phi_i(\mathbf{r}-\mathbf{r}_1)$ and $\phi_i(\mathbf{r}-\mathbf{r}_2)$, have a very small overlap with each other.
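The energy for our approximate wave function is then
$$ E(\mathbf{k}) \approx \frac{1}{N}\sum_{n,m} e^{i\mathbf{k}\cdot(\mathbf{r}_n-\mathbf{r}_m)}\int d^3r\,\phi_i^*(\mathbf{r}-\mathbf{r}_m)\left\{E_i + v(\mathbf{r}-\mathbf{r}_n)\right\}\phi_i(\mathbf{r}-\mathbf{r}_n) . \tag{44} $$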
Again, in the first part (involving $E_i$), we may neglect orbital overlap. For the second term, involving $v(\mathbf{r}-\mathbf{r}_n)$, the overlap should be included, but only to the nearest neighbors of each atom (why?). In the simplest case, where the orbitals $\phi_i$ are s-orbitals, we can use this symmetry to reduce the complexity of the problem to just two more integrals, since the hybridization ($B_i$) will be the same in all directions.
$$ A_i = -\int\phi_i^*(\mathbf{r}-\mathbf{r}_n)\,v(\mathbf{r}-\mathbf{r}_n)\,\phi_i(\mathbf{r}-\mathbf{r}_n)\,d^3r \quad \text{(renormalizes } E_i\text{)} \tag{45} $$
$$ B_i = -\int\phi_i^*(\mathbf{r}-\mathbf{r}_m)\,v(\mathbf{r}-\mathbf{r}_n)\,\phi_i(\mathbf{r}-\mathbf{r}_n)\,d^3r \tag{46} $$
$B_i$ describes the hybridization of adjacent orbitals.
$$ A_i,\ B_i > 0 , \quad \text{since } v(\mathbf{r}-\mathbf{r}_n) < 0 \tag{47} $$
Figure 11: A simple cubic tight binding lattice composed of s-orbitals, with overlap integral $B_i$.
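Thus
$$ E(\mathbf{k}) \simeq E_i - A_i - B_i\sum_m e^{i\mathbf{k}\cdot(\mathbf{r}_n-\mathbf{r}_m)} \quad \text{(sum over } m \text{ n.n. to } n\text{)} \tag{48} $$
Now, if we have a cubic lattice, then
$$ (\mathbf{r}_n - \mathbf{r}_m) = (\pm a, 0, 0),\ (0, \pm a, 0),\ (0, 0, \pm a) \tag{49} $$
so
$$ E(\mathbf{k}) = E_i - A_i - 2B_i\left\{\cos k_xa + \cos k_ya + \cos k_za\right\} \tag{50} $$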
Thus a band centered about $E_i - A_i$ of width $12B_i$ is formed. Near the bottom of the band, for k-vectors near the center of the zone, we can expand the cosines $\cos ka \simeq 1 - \frac{1}{2}(ka)^2 + \cdots$ and let $k^2 = k_x^2 + k_y^2 + k_z^2$, so that
$$ E(\mathbf{k}) \simeq E_i - A_i - 6B_i + B_ia^2k^2 \tag{51} $$
The electrons near the zone center act as if they were free with a renormalized mass.
$$ \frac{\hbar^2k^2}{2m^*} = B_ia^2k^2 , \quad \text{i.e.} \quad \frac{1}{m^*} \propto \text{curvature of band} \tag{52} $$
For this reason, the hybridization term $B_i$ is often associated with kinetic energy. This makes sense, from its origins of wave function overlap and thus electronic transfer. The width of the band, $12B_i$, will increase as the electronic overlap increases, and the more localized orbitals (core orbitals or valence f and d orbitals) will tend to form narrow bands with high effective masses (small $B_i$).
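The simple cubic dispersion of Eq. 50 is easy to explore numerically. The following sketch evaluates the band, its width, and the curvature near the zone center, and compares the latter with $B_ia^2$ as in Eq. 52. The parameter values are arbitrary illustrative choices in units where $\hbar = 1$, and the function names are hypothetical.

```python
import numpy as np


def tight_binding_energy(k, E_i, A_i, B_i, a):
    """E(k) = E_i - A_i - 2 B_i (cos kx a + cos ky a + cos kz a), Eq. 50."""
    k = np.asarray(k, dtype=float)
    return E_i - A_i - 2.0 * B_i * np.sum(np.cos(k * a), axis=-1)


def effective_mass(B_i, a, hbar=1.0):
    """m* from hbar^2 k^2 / (2 m*) = B_i a^2 k^2, Eq. 52."""
    return hbar**2 / (2.0 * B_i * a**2)


E_i, A_i, B_i, a = 0.0, 0.1, 0.25, 1.0     # assumed parameters
E_bottom = tight_binding_energy([0.0, 0.0, 0.0], E_i, A_i, B_i, a)
E_top = tight_binding_energy([np.pi / a] * 3, E_i, A_i, B_i, a)
bandwidth = E_top - E_bottom               # expected to equal 12 B_i

small_k = 1e-2
E_near = tight_binding_energy([small_k, 0.0, 0.0], E_i, A_i, B_i, a)
curvature = (E_near - E_bottom) / small_k**2   # approaches B_i a^2 as k -> 0
m_star = effective_mass(B_i, a)
print(bandwidth, curvature, m_star)
```

Reducing `B_i` in this sketch narrows the band and increases `m_star`, which is the trend described above for compact orbitals.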
Figure 12: Electronic states for a cubic lattice near the center of the B.Z. act like free electrons with a renormalized mass. Hence, if the band is partially filled, the Fermi surface will be spherical.
The bands are then filled by placing two electrons in each band state (with spins up and down). A metal forms when the valence band is partially full. E.g., for Na, with a $1s^2 2s^2 2p^6 3s^1$ atomic configuration, the 1s, 2s and 2p orbitals evolve into (narrow) filled bands, but the 3s band will only be half full, and thus it evolves into a metal. Mg ($1s^2 2s^2 2p^6 3s^2$) is also a metal, since the p and s bands overlap the unfilled d-band. There are exceptions to this rule. Consider C, with atomic configuration $1s^2 2s^2 2p^2$. Its valence s and p states form a strong $sp^3$ hybrid band which is further split into a bonding and an anti-bonding band. (See Fig. 14.) Here, the gap is not tied to the periodicity of the lattice, and so an amorphous material of C may also display a gap.
Figure 13: In the tight-binding approximation, bands form from overlapping orbital states (states of the atomic potential). The bandwidth is proportional to the hybridization $B$ ($12B$ for a SC lattice). More localized, compact, atomic states tend to form narrower bands.
The tight-binding picture can also explain the variety of features seen in the DOS of real materials. For example, in Cu ((Ar) $3d^{10}4s$) the d-orbitals are rather small whereas the valence s-orbitals have a large extent. As a result the s-s hybridization $B_i^{ss}$ is strong and $B_i^{dd}$ is weak.
$$ B_i^{dd} \ll B_i^{ss} \tag{53} $$
Figure 14: C (diamond) with atomic configuration of $1s^2 2s^2 2p^2$. Its valence s and p states form a strong $sp^3$ hybrid band which is split into a bonding and an anti-bonding band.
In addition the s-d hybridization is inhibited by the opposing symmetry of the s and d orbitals.
$$ B_i^{sd} = \int\phi_i^s(\mathbf{r}-\mathbf{r}_1)\,v(\mathbf{r}-\mathbf{r}_2)\,\phi_i^d(\mathbf{r}-\mathbf{r}_2)\,d^3r \ll B_i^{ss} \tag{54} $$
where $\phi_i^s$ is essentially even and $\phi_i^d$ is essentially odd. So $B_i^{sd} \ll B_i^{ss}$. Thus, to a first approximation the s-orbitals will form a very wide band of mostly s-character and the d-orbitals will form a very narrow band of mostly d-character. Since both the s and d bands are valence, they will overlap, leading to a DOS with both d and s features superimposed.
Figure 15: Schematic DOS of Cu $3d^{10}4s^1$. The narrow d-band feature is split due to crystal fields.
4
Photo-Emission Spectroscopy
The density of electronic states (especially for occupied states), and to a lesser extent the band structure, are very important for illuminating the interesting physics of materials. As we saw in Chap. 6, an enhanced DOS at the Fermi surface indicates an enhanced electronic mass, and if $D(E_F) = 0$, we have an insulator (semiconductor). The effective electronic mass also varies inversely with the curvature of the bands. The density of states away from the Fermi surface can allow us to predict the properties of the material upon doping, or it can yield information about core-level states. Thus it is important to be able to measure $D(E)$. This may be done by x-ray photoemission (XPS), UPS or PS in general. The band dispersion $E(\mathbf{k})$ may also be measured using angle-resolved photoemission (ARPES), where the angle between the incident radiation and the detector is also measured.
Figure 16: XPS Experiment: By varying the voltage one may select the kinetic energy of the electrons reaching the counting detector.
The basic idea is that a photon (usually an x-ray) is used to knock an electron out of the system (see Fig. 17). Of course, in order for an electron at an energy of $E_b$ below the Fermi surface to escape the material, the incident photon must have an energy which exceeds $E_b$ and the work function $\phi$ of the material. If $\hbar\omega > \phi$, then the emitted electrons will have a distribution of kinetic energies $E_{kin}$, extending from zero to $\hbar\omega - \phi$.
Figure 17: Let the binding energy be defined so that $E_b > 0$, and let $\phi$ be the work function; then the detected electron intensity $I(E_{kin})$ at $E_{kin} = \hbar\omega - E_b - \phi$ is $\propto D(-E_b)f(-E_b)$.
From Fermi’s golden rule, we know that the probability per unit time of an electron being ejected is proportional to the density of electronic states times the probability (Fermi function) that the electronic state is occupied
$$ I(E_{kin}) = \frac{1}{\tau(E_{kin})} \propto D(-E_b)\,f(-E_b) \propto D(E_{kin} + \phi - \hbar\omega)\,f(E_{kin} + \phi - \hbar\omega) \tag{55} $$
Thus if we measure the energy and number of ejected particles, then we know $D(-E_b)$.
Figure 18: Left: Origin of the background in $I(E_{kin})$ (secondary electrons produced by scattering from phonons and by Coulombic interaction with other electrons). Right: Electrons excited deep within the bulk scatter so often that they rarely escape. Thus, most of the signal $I$ originates at the surface, which must be clean and representative of the bulk.
There are several problems with this procedure. First, some of the photon-excited particles will scatter off phonons and electronic excitations within the material. Since these processes can occur over a very wide range of energies, they will produce a broad featureless background in $I(E_{kin})$. Second, due to these secondary scattering processes, it is very unlikely that an electron which is excited deep within the bulk will ever escape from the material. Thus, we only learn about $D(E)$ near the surface of the material. Therefore it is important for this surface to be “clean” so that it is representative of the bulk. For this reason these experiments are often carried out in ultra-high vacuum conditions.
Figure 19: In photoemission, we measure the rate of ejected electrons as a function of their kinetic energy. The raw data contains a background. Once this is subtracted off, the subtracted data is proportional to the electronic density of states multiplied by a Fermi function, $I(E_{kin}) \propto D(E_{kin} + \phi - \hbar\omega)\,f(E_{kin} + \phi - \hbar\omega)$. [Axes: $I(E_{kin})$ versus $E_{kin}$, from 0 to $\hbar\omega - \phi$; curves: "raw" data, background, background-subtracted.]
We can also learn about the electronic states $D(E)$ above the Fermi surface, $E > E_F$, using inverse photoemission. Here, an electron beam is focused on the surface and the outgoing flux of photons is measured.
Figure 20: BIS. $E_{kin} = \hbar\omega - E_b - \phi$, $E_b = \hbar\omega - E_{kin} - \phi$.
Chapter 8: Magnetism Holstein & Primakoff April 13, 2001
Contents
1 Introduction
1.1 The Relevance of Magnetostatics
1.2 Non-interacting Magnetic Systems
2 Coulombic Correlation Effects
2.1 Moment Formation
2.2 Magnetism and Intersite Correlations
2.2.1 The Exchange Interaction Between Localized Spins
2.2.2 Exchange Interaction for Delocalized Spins
3 Band Model of Ferromagnetism
3.1 Enhancement of χ
3.2 Finite T Behavior of a Band Ferromagnet
3.2.1 Effect of B
4 Mean-Field Theory of Magnetism
4.1 Ferromagnetism for localized electrons (MFT)
4.2 Mean-Field Theory of Antiferromagnets
5 Spin Waves
5.1 Quantization of Ferromagnetic Spin Waves
5.2 Antiferromagnetic Spin Waves

1 Introduction
Magnetism is one of the most interesting subjects in condensed matter physics. Magnetic effects are responsible for heavy fermion behavior, ferromagnetism, antiferromagnetism, ferrimagnetism and probably high temperature superconductivity. Unlike our previous studies, most magnetic systems are not well described by simple models which ignore intersite correlations. The reason is simple: magnetism is inherently due to electronic correlations of moments on different sites. As we will see, systems without these inter-spin correlations (or those without well defined moments to be correlated) have uninteresting and energetically unimportant magnetic properties.

Figure 1: Both moment formation and the correlation between these moments (J) are due to Coulombic effects.

1.1 The Relevance of Magnetostatics
Perhaps the term magnetism is a misnomer, or rather describes only the external probe (B) which we use to study magnetic behavior. Magnetic effects are due to electronic correlations, those mediated by or due to Coulombic effects, and not due to magnetic correlations between moments (these are smaller by orders of v/c, i.e., they are relativistic corrections). For example, consider the magnetic interaction between two moments separated by a couple of Angstroms,

$$ U = \frac{1}{r^3}\left[\mathbf{m}_1\cdot\mathbf{m}_2 - 3(\mathbf{m}_1\cdot\mathbf{n})(\mathbf{m}_2\cdot\mathbf{n})\right] \tag{1} $$

then

$$ U_{dipole-dipole} \approx \frac{m_1 m_2}{r^3} \tag{2} $$

If we let m_1 ∼ m_2 ∼ gμ_B ∼ eħ/mc (Gaussian units) and r ≈ 2 Å, then

$$ U \sim \frac{(g\mu_B)^2}{r^3} \sim \left(\frac{e^2}{\hbar c}\right)^2 \left(\frac{a_0}{r}\right)^3 \frac{e^2}{a_0} \sim 10^{-4}\,\mathrm{eV} \tag{3} $$

or roughly 1 K! Magnetic correlations due to magnetic interactions would be destroyed by thermal fluctuations at very low temperatures.

1.2 Non-interacting Magnetic Systems
We will define non-interacting magnetic systems as those for which the independent moments do not interact with each other, and only interact with the probing field. For the moment let’s consider only the magnetic moments due to electrons (as we will see, since they can interact with each other, they are by far the most important moments in the system). They have a moment

$$ \mathbf{m} \approx \mu_B(\mathbf{L} + 2\mathbf{S}) \tag{4} $$

The system energy will change by an amount

$$ \Delta E \sim g\mu_B B(L + 2S) \sim \mu_B B, \qquad \mu_B = \frac{e\hbar}{2m} \sim 5.8\times 10^{-5}\,\frac{\mathrm{eV}}{\mathrm{tesla}} \tag{5} $$

in an external field. The largest field which can regularly be produced in a lab is ∼ 10 tesla (100 tesla or more can be produced at LANL by blowing things up), thus

$$ \Delta E \lesssim 10^{-3}\,\mathrm{eV} \quad \text{or} \quad \sim k_B\,(10\,\mathrm{K}) \tag{6} $$

This is a very small energy. Thus magnetic effects are wiped out by thermal fluctuations for

$$ k_B T > \mu_B B \tag{7} $$

that is, above about 10 K! Thus experiments which measure the magnetism of non-interacting systems must be carried out at low temperatures. These experiments typically measure the susceptibility of the system with a Faraday balance or a magnetometer (SQUID). For a collection of isolated moments (spin s = 1/2), the susceptibility may be calculated from the moment

$$ \langle m \rangle = g\mu_B\,\frac{1}{2}\,\frac{e^{\beta\frac{1}{2}g\mu_B B} - e^{-\beta\frac{1}{2}g\mu_B B}}{e^{-\beta\frac{1}{2}g\mu_B B} + e^{\beta\frac{1}{2}g\mu_B B}} \simeq g\,\frac{\mu_B}{2}\tanh(\beta\mu_B B) \approx \frac{\mu_B^2 B}{k_B T} \qquad (g \sim 2) \tag{8} $$

$$ \chi = \frac{\partial\langle m\rangle}{\partial B} \approx \frac{\mu_B^2}{k_B T} \tag{9} $$
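The estimates in Eqs. 5 to 9 can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, the constants are rounded textbook values, and g = 2 is assumed as in Eq. 8.

```python
import math

MU_B = 5.788e-5  # Bohr magneton in eV/tesla (rounded)
K_B = 8.617e-5   # Boltzmann constant in eV/K (rounded)


def zeeman_temperature(b_tesla):
    """Temperature (K) at which k_B T equals mu_B B, cf. Eq. 7."""
    return MU_B * b_tesla / K_B


def mean_moment(b_tesla, t_kelvin):
    """Thermal moment of an isolated spin 1/2 with g = 2, in units of mu_B (Eq. 8)."""
    return math.tanh(MU_B * b_tesla / (K_B * t_kelvin))


def curie_susceptibility(t_kelvin):
    """Curie susceptibility per moment, mu_B^2 / (k_B T), in eV/tesla^2 (Eq. 9)."""
    return MU_B ** 2 / (K_B * t_kelvin)


if __name__ == "__main__":
    print("mu_B B / k_B at 10 tesla (K):", zeeman_temperature(10.0))
    print("<m>/mu_B at 10 tesla, 300 K:", mean_moment(10.0, 300.0))
```

At room temperature the moment is only a few percent of μ_B even in a 10 tesla field, which is the point made in the text: isolated moments respond weakly unless T is of order a few kelvin.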
Once again, the energy of the moment-field interaction is roughly

$$ E \sim \frac{\mu_B^2 B^2}{T} \sim \frac{10^{-8}\,(\mathrm{eV/tesla})^2}{10^{-4}\,\mathrm{eV/K}}\,\frac{B^2}{T}, \qquad (k_B = 1) \tag{10} $$

When E ∼ T, thermal fluctuations destroy the orientation of the moments with the external field. If B ∼ 10 tesla,

$$ E \sim \frac{100\,\mathrm{K}^2}{T} \tag{11} $$

or E ∼ T at 10 K! However, we know that systems such as iron exist for which a small field can induce a relatively large moment at room temperature. This is surprising since for a metal, or a free electron gas, the susceptibility is much smaller than the free-moment result of Eq. 9, since only the spins near the Fermi surface can participate: χ ≈ μ_B²/E_F. Note that this is smaller than the free-moment result by a factor of k_B T/E_F ≪ 1!

Figure 2: In a metal, only the electrons near the Fermi surface (a fraction ∼ kT/E_F), which are not paired into singlets, contribute to the bulk susceptibility χ ∼ (μ_B²/k_B T)(k_B T/E_F) ∼ μ_B² D(E_F).

2 Coulombic Correlation Effects

2.1 Moment Formation
Of course real materials are not composed of free isolated electrons. Nevertheless some insulators act almost as if they are composed of non-interacting atoms (ions) with moments given by Hund’s rules: maximum S, then maximum L, which leads to large atomic moments. Hund’s rules reflect the atom’s attempt to lower its Coulombic energy, see Fig. 3. By maximizing the total spin S, the spin part of the wavefunction becomes symmetric under electron exchange (e.g., for two electrons with s = 1/2 the maximum value of total spin is S = 1, with a spin wavefunction such as |↑↓⟩ + |↓↑⟩). Then, since the total wavefunction must be antisymmetric under exchange, the spatial part must be antisymmetric, requiring it to have a node. The node keeps the electrons apart, minimizing their Coulomb energy. The second Hund’s rule is also due to Coulombic interactions. Maximizing L tends to keep the electrons apart, much like a centrifuge (alternatively, the radial Schrödinger equation acquires an angular momentum barrier ∝ L(L + 1)/r²).

Figure 3: Hund’s rules, maximize S and L, both result from minimizing the Coulomb energy.

To illustrate how band formation modifies this scenario, let’s consider a simple tight-binding model (see Fig. 4). By Fermi’s golden rule, each level acquires a width (uncertainty in its energy) Γ = π|B|²D(E_F), and each level can be in one of the four states shown in Fig. 5. Clearly, the empty state |0⟩ and the doubly occupied state |↑↓⟩ do not have a moment, and the singly occupied states |↑⟩ and |↓⟩ do. If these states mix equally a moment will not form. The mixing between the states with moments is only through one of the other two states (|0⟩ or |↑↓⟩) and may be suppressed, as can the occupancy of the moment-less states, by increasing the energy of the states without moments. I.e., a moment will form on each site if −ε ≫ Γ and ε + U ≫ Γ.

Figure 4: A simple tight-binding model with hopping B, site energy ε and a local Coulomb repulsion U. If U = 0, the rate at which electrons hop on and off any site may be approximated using Fermi’s golden rule, ∼ π|B|²D(E_F) ∼ 1/Δt. Then by the uncertainty principle ΔEΔt ∼ ħ, so each site energy acquires an uncertainty or width ΔE ∼ ħ/Δt ∼ π|B|²D(E_F) ≡ Γ. The sites will form moments (see Fig. 5) if Γ ≪ U, |ε|.

Figure 5: A moment forms on an orbital (levels at ε and 2ε + U on either side of the chemical potential μ, each with width Γ) provided that Γ ≪ U, |ε|. Here U ∼ (e²/a) e^{−a/r_TF}; r_TF is small for a metal, large for an insulator.
In this limit, U ≫ |B|, the system will act more like a system of free moments than a free electron gas. Thus, one might expect

$$ \chi_{insulator} \gg \chi_{metal} \tag{12} $$

for noninteracting systems. However, this is not the case. The reason is that I have only told you half of the story. A real atom, or a system composed of such atoms, has a diamagnetic response due to the angular momentum L of the rotating electrons. This effect is due to Lenz’s law: any applied magnetic induction will induce an EMF and hence a current that opposes the electron current, which reduces the moment ⇒ diamagnetism with χ ∝ a².

Figure 6: Here ξ = −(1/c) ∂φ/∂t, m = −μ_B L and φ = Bπa².

In the free electron limit (see J.M. Ziman)

$$ \chi = \mu_B^2 D(E_F)\left[1 - \frac{1}{3}\left(\frac{m}{m^*}\right)^2\right] \tag{13} $$

$$ \Delta E \sim 10^{-8}\left(\frac{\mathrm{eV}}{\mathrm{tesla}}\right)^2 \times 10^{-1}\,\mathrm{eV}^{-1}\,B^2 \sim 10^{-9}\,\mathrm{eV}\;B^2(\mathrm{tesla}) \tag{14} $$

For insulators, often with m*/m < 1, the diamagnetism wins; whereas for metals, generally with m*/m ≥ 1, the Pauli paramagnetism wins.

2.2 Magnetism and Intersite Correlations
From both Hund’s rules and a simple tight-binding picture, we argued that moment formation in solids results from local Coulomb correlations between electrons. We also saw that a collection of such isolated moments is rather boring since all magnetic behavior is washed out by thermal fluctuations at very low temperatures. Consider once again an isolated moment of magnitude mμ_B in an external field:

$$ \chi \approx \frac{(m\mu_B)^2}{k_B T} \tag{15} $$

$$ E \approx \frac{(m\mu_B B)^2}{k_B T} \tag{16} $$

Figure 7: If the magnetic moments in a small volume (of radius ξ, lattice spacing a) are correlated, then the magnetic susceptibility is strongly enhanced.

For magnetism to be significant at room temperature (300 K) we must increase the energy of our system in a field. This may be accomplished by increasing the effective moment m by correlating adjacent moments. If the range of this correlation is ξ, so that roughly 4πξ³/(3a³) moments are correlated, then let ξ/a = 3 so that ∼ 10² moments are correlated and m ∼ 10². This increases E by about 10⁴, so that E ∼ k_B T at T ∼ 10³ K. The observed (measured) susceptibility also then increases by about 10⁴, all by only correlating moments in a range of 3 lattice spacings.

Clearly correlations between adjacent spins can make magnetism in materials relevant. Such correlations are due to electronic effects and are hence usually short ranged due to electronic (Thomas-Fermi) screening. If we consider two s = 1/2 spins, ↑₁ ↓₂, then the correlation is usually parameterized by the Heisenberg exchange Hamiltonian,

$$ H = -2J\,\sigma_1\cdot\sigma_2 \tag{17} $$

where J is the exchange splitting between the singlet and triplet energies:

$$ \text{triplet } (E_t): \quad |\uparrow\uparrow\rangle,\quad |\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle,\quad |\downarrow\downarrow\rangle \tag{18} $$

$$ \text{singlet } (E_s): \quad |\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle \tag{19} $$

$$ E_t - E_s = -J \tag{20} $$

The trick then is to calculate J!
Figure 8: Geometry of two electrons, 1 and 2, bound to two ions A and B (distances r_1A, r_1B, r_2A, r_2B, r_12 and R_AB).

2.2.1 The Exchange Interaction Between Localized Spins
Imagine that we have two hydrogen atoms A and B which localize two electrons 1 and 2. As these two electrons approach, their spins will become correlated.

$$ H = H_1 + H_2 + H_{12} \tag{21} $$

$$ H_1 = -\frac{\hbar^2}{2m}\nabla_1^2 - \frac{e^2}{r_{1A}} - \frac{e^2}{r_{1B}} \tag{22} $$

$$ H_{12} = \frac{e^2}{r_{12}} + \frac{e^2}{R_{AB}} \tag{23} $$

As we did in Chap. 1 to describe binding, we will use the atomic wave functions to approximate the molecular wavefunction ψ_12:

$$ \psi_{12} = \left(\phi_A(1) + \phi_B(1)\right)\left(\phi_A(2) + \phi_B(2)\right) \otimes \text{spin part} = \left(\phi_A(1)\phi_A(2) + \phi_B(1)\phi_B(2) + \phi_A(1)\phi_B(2) + \phi_A(2)\phi_B(1)\right) \otimes \text{spin part} \tag{24} $$

If e²/r_12 is strong (it is), then the first two states, with both electrons on the same ion, are suppressed, especially if the ions are far apart. Thus we neglect them, and make the Heitler-London approximation; for example

$$ \psi_{12} \simeq \left(\phi_A(1)\phi_B(2) + \phi_B(1)\phi_A(2)\right) \otimes \text{spin singlet} \tag{25} $$

The spatial wave function is symmetric, and thus appropriate for the spin singlet state since the total electronic wave function must be antisymmetric. For the symmetric spin triplet states, the electronic wave function is

$$ \psi_{12} = \left(\phi_A(1)\phi_B(2) - \phi_B(1)\phi_A(2)\right) \otimes \text{spin triplet} \tag{26} $$

or

$$ \psi_{12} = \left(\phi_A(1)\phi_B(2) \pm \phi_B(1)\phi_A(2)\right) \otimes \text{spin part} \tag{27} $$

The energy of these states may then be calculated by evaluating ⟨ψ_12|H|ψ_12⟩/⟨ψ_12|ψ_12⟩:

$$ E = \frac{\langle\psi_{12}|H|\psi_{12}\rangle}{\langle\psi_{12}|\psi_{12}\rangle} = 2E_I + \frac{C \pm A}{1 \pm S}, \qquad +\ \text{singlet},\ -\ \text{triplet} \tag{28} $$

where

$$ E_I = \int d^3r_1\, \phi_A^*(1)\left\{-\frac{\hbar^2}{2m}\nabla_1^2 - \frac{e^2}{r_{1A}}\right\}\phi_A(1) < 0 \tag{29} $$

the Coulomb integral

$$ C = e^2\int d^3r_1\, d^3r_2 \left\{\frac{1}{R_{AB}} + \frac{1}{r_{12}} - \frac{1}{r_{2A}} - \frac{1}{r_{1B}}\right\} |\phi_A(1)|^2 |\phi_B(2)|^2 < 0 \tag{30} $$

the exchange integral

$$ A = e^2\int d^3r_1\, d^3r_2 \left\{\frac{1}{R_{AB}} + \frac{1}{r_{12}} - \frac{1}{r_{2A}} - \frac{1}{r_{1B}}\right\} \phi_A^*(1)\phi_A(2)\phi_B(1)\phi_B^*(2) \tag{31} $$

and finally, the overlap integral is

$$ S = \int d^3r_1\, d^3r_2\, \phi_A^*(1)\phi_A(2)\phi_B(1)\phi_B^*(2) \qquad (0 < S < 1) \tag{32} $$

All of E_I, C, A and S are real. So

$$ -J = E_t - E_s = \left(2E_I + \frac{C - A}{1 - S}\right) - \left(2E_I + \frac{C + A}{1 + S}\right) = \frac{C - A}{1 - S} - \frac{C + A}{1 + S} > 0, \qquad J = 2\,\frac{A - SC}{1 - S^2} < 0 \tag{33} $$

where the inequality follows since the last two terms in the {} dominate the integral for A, and in the Heitler-London approximation S ≪ 1. Or, for the effective Hamiltonian,

$$ H = -2J\,\sigma_1\cdot\sigma_2, \qquad J < 0 \tag{34} $$

Figure 9:
Clearly this favors an antiparallel or antiferromagnetic alignment of the spins (see Fig. 9), since then (classically) σ_1 · σ_2 < 0 and E < 0, so minimizing the energy. This type of interaction is clearly appropriate for insulators, which may be approximated as a collection of isolated atoms. Indeed antiferromagnets are generally insulators for this and other reasons.

$$ H = -2J\sum_{\langle ij\rangle} \sigma_i\cdot\sigma_j, \qquad J < 0 \tag{35} $$
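The algebra connecting Eq. 28 to Eq. 33 can be verified numerically. In the sketch below the function name and the input values of E_I, C, A and S are hypothetical placeholders (they are not computed from hydrogen orbitals); only the relations between them are taken from the text.

```python
def heitler_london(e_i, c, a, s):
    """Return (E_singlet, E_triplet, J) from Eqs. 28 and 33.

    e_i, c, a are energies in the same units; s is the overlap, 0 < s < 1.
    """
    e_singlet = 2.0 * e_i + (c + a) / (1.0 + s)  # "+" branch of Eq. 28
    e_triplet = 2.0 * e_i + (c - a) / (1.0 - s)  # "-" branch of Eq. 28
    j = 2.0 * (a - s * c) / (1.0 - s ** 2)       # Eq. 33
    return e_singlet, e_triplet, j


if __name__ == "__main__":
    # Illustrative numbers only (eV): negative C and A, small overlap.
    e_s, e_t, j = heitler_london(e_i=-13.6, c=-0.5, a=-1.0, s=0.2)
    print("E_s =", e_s, " E_t =", e_t, " J =", j)
```

With these placeholder inputs the singlet lies below the triplet and J is negative, consistent with the antiferromagnetic sign discussed above.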
2.2.2 Exchange Interaction for Delocalized Spins

Ferromagnetism, where adjacent spins tend to align forming a bulk magnetic moment, is most often seen in conducting metals such as Fe. As we will see in this section, the Pauli principle, the Coulomb interactions, and the itinerancy of free (metallic) electrons favor a ferromagnetic (J > 0) exchange interaction. Consider two like-spin (triplet) free electrons in a volume V (see Fig. 10). If we describe the spatial part of their wave function with plane waves, then

$$ \psi_{ij} = \frac{1}{\sqrt{2}\,V}\left\{e^{i\mathbf{k}_i\cdot\mathbf{r}_i}e^{i\mathbf{k}_j\cdot\mathbf{r}_j} - e^{i\mathbf{k}_i\cdot\mathbf{r}_j}e^{i\mathbf{k}_j\cdot\mathbf{r}_i}\right\} = \frac{1}{\sqrt{2}\,V}\,e^{i\mathbf{k}_i\cdot\mathbf{r}_i}e^{i\mathbf{k}_j\cdot\mathbf{r}_j}\left\{1 - e^{-i(\mathbf{k}_i - \mathbf{k}_j)\cdot(\mathbf{r}_i - \mathbf{r}_j)}\right\} \tag{36} $$

Figure 10: Two electrons at r_i and r_j in a volume V, in the |↑↑⟩ triplet (symmetric) spin state.

The probability that the electrons are in volumes d³r_i and d³r_j is

$$ |\psi_{ij}|^2\, d^3r_i\, d^3r_j = \frac{1}{V^2}\left\{1 - \cos\left[(\mathbf{k}_i - \mathbf{k}_j)\cdot(\mathbf{r}_i - \mathbf{r}_j)\right]\right\} d^3r_i\, d^3r_j \tag{37} $$

As required by the Pauli principle, this probability vanishes when r_i = r_j. This would not be the case for electrons in the singlet spin state (if the Coulomb interaction continues to be ignored). Thus there is a hole, called the “exchange hole”, in the probability density for r_i ≈ r_j for triplet spin electrons, but not singlet spin ones.

Now consider the effects of the electron-ion and the electron-electron Coulomb interactions (see Figure 11). If one electron comes near an ion, it will screen the potential of that ion seen by other electrons, thereby raising their energy. Thus the effect of allowing electrons to approach each other is to increase the electron-ion Coulomb energy, and of course the electron-electron Coulomb energy. Thus, anything which keeps them apart without an energy cost, like the exchange hole for triplet spin electrons, will reduce their energy. As a result, like-spin electrons have lower energy and are thermodynamically favored ⇒ ferromagnetism.

Figure 11: Electron a screens the potential (of an ion of charge Ze) seen by electron b, raising its energy. Anything which keeps pairs of electrons apart, but costs no energy like the exchange hole for the electronic triplet, will lower the energy of the system. Thus, triplet formation is favored thermodynamically.

To determine the range of this FM exchange interaction, we must average the effect over the Fermi sea. If one of the electrons is fixed at the origin (see Figure 12), then the probability that a second is located a distance r away, in a volume element d³r, is

$$ P_{\uparrow\uparrow}(r)\, d^3r = n_\uparrow\, d^3r\, \underbrace{\overline{\left(1 - \cos\left[(\mathbf{k}_i - \mathbf{k}_j)\cdot\mathbf{r}\right]\right)}}_{\text{Fermi sea average}} \tag{38} $$

$$ n_\uparrow = \frac{1}{2}n = \frac{1}{2}\,\frac{\#\,\text{electrons}}{\text{volume}} \tag{39} $$

Figure 12: Geometry to calculate the exchange interaction: one electron is fixed at the origin (r_i ≡ 0) and the other is at r = r_j.

In terms of an electronic charge density, this is

$$ \rho_{ex}(r) = \frac{en}{2}\,\overline{\left(1 - \cos\left[(\mathbf{k}_i - \mathbf{k}_j)\cdot\mathbf{r}\right]\right)} $$
$$ = \frac{en}{2}\left\{1 - \frac{1}{\left(\frac{4}{3}\pi k_F^3\right)^2}\int_0^{k_F} d^3k_i\, d^3k_j\, \frac{1}{2}\left(e^{i(\mathbf{k}_i - \mathbf{k}_j)\cdot\mathbf{r}} + e^{-i(\mathbf{k}_i - \mathbf{k}_j)\cdot\mathbf{r}}\right)\right\} $$
$$ = \frac{en}{2}\left\{1 - \left(\frac{4}{3}\pi k_F^3\right)^{-2}\int_0^{k_F} d^3k_i\, e^{i\mathbf{k}_i\cdot\mathbf{r}}\int_0^{k_F} d^3k_j\, e^{i\mathbf{k}_j\cdot\mathbf{r}}\right\} $$
$$ = \frac{en}{2}\left\{1 - 9\,\frac{\left(\sin k_F r - k_F r\cos k_F r\right)^2}{(k_F r)^6}\right\} \tag{40} $$
Note that both of the exponential terms in the second line are the same, since we integrate over all k_i and k_j and may relabel k_i → −k_i and k_j → −k_j. Since we have only been considering Pauli-principle effects, the electronic density of spin down electrons remains unchanged. Thus, the total charge density around the up spin electron fixed at the origin is

$$ \rho_{eff}(r) = en\left\{1 - \frac{9}{2}\,\frac{\left(\sin k_F r - k_F r\cos k_F r\right)^2}{(k_F r)^6}\right\} \tag{41} $$

Figure 13: Electron density ρ_eff/en versus k_F r near an electron fixed at the origin; it rises from 1/2 at r = 0 towards 1. Coulomb effects would reduce the density for small r further, but would not significantly affect the size of the exchange hole or the range of the corresponding potential, both ∼ 1/k_F.
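To see the size of the exchange hole, Eq. 41 can simply be tabulated against x = k_F r. The following is a minimal sketch (the function name is hypothetical); a short series expansion is used near x = 0 to avoid dividing by a vanishing x⁶.

```python
import math


def exchange_hole_density(x):
    """rho_eff / (e n) from Eq. 41, as a function of x = k_F r."""
    if x < 1e-3:
        # sin x - x cos x = x^3/3 - x^5/30 + ..., so g -> 1/3 as x -> 0
        g = 1.0 / 3.0 - x * x / 30.0
    else:
        g = (math.sin(x) - x * math.cos(x)) / x ** 3
    return 1.0 - 4.5 * g * g


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.0, 3.0, 4.0, 6.0):
        print(f"k_F r = {x:3.1f}   rho_eff/(e n) = {exchange_hole_density(x):.4f}")
```

The density starts at one half of its bulk value at the origin (all like-spin electrons are excluded, the opposite-spin ones are unaffected) and recovers within a few units of 1/k_F.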
The size of the exchange hole, and the range of the corresponding ferromagnetic exchange potential, is ∼ 1/k_F ∼ a, which is rather short.

3 Band Model of Ferromagnetism
Due to the short range of this potential, its Fourier transform is essentially flat in k. This fact may be used to construct a band theory of FM where the mean effect of a spin-up electron is to lower the energy of all other band states of spin-up electrons by a small amount, independent of k:

$$ E_\uparrow(k) = E(k) - \frac{I N_\uparrow}{N}; \qquad I \lesssim 1\,\mathrm{eV} \tag{42} $$

Likewise for spin down

$$ E_\downarrow(k) = E(k) - \frac{I N_\downarrow}{N} \tag{43} $$

where I, the Stoner parameter, quantifies the exchange hole energy. The relative spin occupation R is related to the bulk moment:

$$ R = \frac{N_\uparrow - N_\downarrow}{N}, \qquad M = \mu_B\frac{N}{V}R \tag{44} $$

Then

$$ E_\sigma(k) = E(k) - \frac{I(N_\uparrow + N_\downarrow)}{2N} - \frac{\sigma I R}{2}, \qquad (\sigma = \pm) \tag{45} $$

$$ \equiv \tilde{E}(k) - \frac{\sigma I R}{2} \tag{46} $$

If R is nonzero and real, then we have ferromagnetism.

$$ R = \frac{1}{N}\sum_k \left\{\frac{1}{\exp\left[(\tilde{E}(k) - IR/2 - E_F)/k_B T\right] + 1} - \frac{1}{\exp\left[(\tilde{E}(k) + IR/2 - E_F)/k_B T\right] + 1}\right\} \tag{47} $$
For small R, we may expand around Ẽ(k) = E_F:

$$ f(x - a) - f(x + a) = -2af' - \frac{2}{3!}a^3 f''' \tag{48} $$

All derivatives will be evaluated at Ẽ(k) = E_F, so f′ < 0 and f‴ > 0.

Figure 14: The Fermi function f and its derivatives f′, f″ and f‴ as functions of E near E_F.

Thus,

$$ R = -2\,\frac{IR}{2}\,\frac{1}{N}\sum_k \left.\frac{\partial f}{\partial\tilde{E}(k)}\right|_{E_F} - \frac{2}{6}\left(\frac{IR}{2}\right)^3\frac{1}{N}\sum_k \left.\frac{\partial^3 f}{\partial\tilde{E}^3(k)}\right|_{E_F} \tag{49} $$

This is a quadratic equation in R,

$$ -1 - \frac{I}{N}\sum_k \left.\frac{\partial f}{\partial\tilde{E}(k)}\right|_{E_F} = \frac{1}{24}\,I^3R^2\,\frac{1}{N}\sum_k \left.\frac{\partial^3 f}{\partial\tilde{E}^3(k)}\right|_{E_F} \tag{50} $$

which has a real solution iff

$$ -1 - \frac{I}{N}\sum_k \left.\frac{\partial f}{\partial\tilde{E}(k)}\right|_{E_F} > 0 \tag{51} $$

Or, the derivative of the Fermi function summed over the BZ must be enough to overcome the −1 and produce a positive result. Clearly this is most likely to happen at T = 0, where

$$ T = 0: \quad \left.\frac{\partial f}{\partial\tilde{E}(k)}\right|_{E_F} \to -\delta(\tilde{E} - E_F), \qquad -\frac{1}{N}\sum_k \frac{\partial f}{\partial\tilde{E}_k} = \frac{V}{2N}\int d\tilde{E}\, D(\tilde{E})\,\delta(\tilde{E} - E_F) = \frac{V}{2N}D(E_F) \equiv \tilde{D}(E_F) \tag{52} $$

So, the condition for FM at T = 0 is I D̃(E_F) > 1. This is known as the Stoner criterion. I is essentially flat as a function of the atomic number, thus materials such as Fe, Co and Ni with a large D̃(E_F) are favored to be FM.

Figure 15: I (eV) and D̃(E_F) (eV⁻¹) versus atomic number Z, with Li, Na, Fe, Co and Ni marked.

3.1 Enhancement of χ
Even those systems without a FM ground state have their susceptibility strongly enhanced by this mechanism. Let us reconsider the effect of an external field (g S = 1) on the band energies:

$$ E_\sigma(k) = E(k) - \frac{I n_\sigma}{N} - \mu_B\sigma B \tag{53} $$

Then

$$ R = -\frac{1}{N}\sum_k \frac{\partial f}{\partial\tilde{E}_k}\,(IR + 2\mu_B B) = \tilde{D}(E_F)\,(IR + 2\mu_B B) \tag{54} $$

or, as M = μ_B (N/V) R, we get

$$ M = 2\mu_B^2\,\frac{N}{V}\,\frac{\tilde{D}(E_F)}{1 - I\tilde{D}(E_F)}\,B \tag{55} $$

or

$$ \chi = \frac{\partial M}{\partial B} = \frac{\chi_0}{1 - I\tilde{D}(E_F)} \tag{56} $$

$$ \chi_0 = 2\mu_B^2\,\frac{N}{V}\,\tilde{D}(E_F) = \mu_B^2 D(E_F) \tag{57, 58} $$

Thus, when I D̃(E_F) ≲ 1, the susceptibility can be considerably enhanced over the non-interacting result χ_0. However, this approximation usually overestimates χ since it neglects diamagnetic contributions and spin fluctuations (at T ≠ 0). As we will see, the latter especially are important for estimating T_c.
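As a rough illustration of Eq. 56, the enhancement factor χ/χ_0 can be tabulated for a few values of the product I·D̃(E_F). The function name and the sample values below are hypothetical and are not material parameters.

```python
def stoner_enhancement(i_ev, d_tilde_per_ev):
    """Return chi / chi_0 = 1 / (1 - I * D~(E_F)) from Eq. 56.

    Raises ValueError when I * D~(E_F) >= 1, i.e. when the Stoner criterion
    for a ferromagnetic ground state is met and Eq. 56 no longer applies.
    """
    product = i_ev * d_tilde_per_ev
    if product >= 1.0:
        raise ValueError("I * D~(E_F) >= 1: Stoner criterion satisfied")
    return 1.0 / (1.0 - product)


if __name__ == "__main__":
    for product in (0.0, 0.5, 0.8, 0.9, 0.99):
        print(f"I*D~ = {product:4.2f}   chi/chi_0 = {stoner_enhancement(1.0, product):7.2f}")
```

The factor grows without bound as I·D̃(E_F) approaches 1 from below, which is why nearly ferromagnetic metals show strongly enhanced paramagnetism.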
Figure 16: Spin fluctuations can reduce the total moment within the correlated region (of size ξ, lattice spacing a), and even reduce ξ itself. Both effects lead to a reduction in the bulk susceptibility χ ∼ (moment)².

3.2 Finite T Behavior of a Band Ferromagnet
In principle, one could start from an ab-initio calculation of the electronic band structure E(k) and I for a material such as Ni, and calculate the temperature dependence of R (and hence the magnetization) using

$$ R = \frac{1}{N}\sum_k \left[f\left(\tilde{E}_k - \frac{IR}{2} - \mu_B B_0 - E_F\right) - f\left(\tilde{E}_k + \frac{IR}{2} + \mu_B B_0 - E_F\right)\right] \tag{59} $$

with f(x) = 1/(e^{βx} + 1). However, this would be pointless since all of the approximations made to this point have destroyed the quantitative validity of the calculation. It still retains a qualitative use, though.

Figure 17: Density of states D(E) of metallic Ni, showing the uncorrelated s electrons (no moments) and the correlated d electrons (with moments). In metallic Ni, the d-orbitals are compact and hybridize weakly, both with the s-orbitals (due to symmetry) and with each other (due to low overlap). Thus, moments tend to form on the d-orbitals and they contribute narrow features in the electronic density of states. The s-orbitals hybridize strongly and form a broad metallic band.

For Ni, we can do this by approximating the very narrow d-electron feature in D(E) as a δ function and performing the integral. Only the d-electrons have a strong exchange splitting I, and hence only they will tend to contribute to the magnetization. Thus our D̃(E) should reflect only the d-electron contribution; we will accommodate this by setting

$$ \tilde{D}(E) \approx C\,\delta(E - E_F), \qquad (C < 1) \tag{60} $$

C, an unknown constant, will be determined by the T = 0 behavior. Then

$$ R = C\left\{f\left(-\frac{IR}{2} - \mu_B B_0\right) - f\left(\frac{IR}{2} + \mu_B B_0\right)\right\} $$

Let R/C ≡ R̃ and T_c = IC/(4k_B); then if B_0 = 0,

$$ \tilde{R} = \frac{1}{\exp\left(-\frac{2\tilde{R}T_c}{T}\right) + 1} - \frac{1}{\exp\left(\frac{2\tilde{R}T_c}{T}\right) + 1} = \tanh\frac{\tilde{R}T_c}{T} \tag{61} $$

If T = 0, then

$$ \tilde{R} = 1 = \frac{R}{C} = \frac{1}{C}\,\frac{n_\uparrow - n_\downarrow}{N} \tag{62} $$

For Ni, the measured ground state magnetization per Ni atom is μ_eff/μ_B = 0.54 = (n_↑ − n_↓)/N. Therefore, C = 0.54 = μ_eff/μ_B.

For small x, tanh x ≃ x − x³/3, and for large x

$$ \tanh x = \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}} = \frac{1 - e^{-2x}}{1 + e^{-2x}} \simeq (1 - e^{-2x})(1 - e^{-2x}) \simeq 1 - 2e^{-2x} \tag{63} $$

Thus,

$$ \tilde{R} = 1 - 2e^{-2T_c/T}, \qquad \text{for } T \ll T_c \tag{64} $$

$$ \tilde{R} = \sqrt{3}\left(1 - \frac{T}{T_c}\right)^{1/2}, \qquad \text{for small } \tilde{R} \text{ or } T \lesssim T_c \tag{65} $$

Figure 18: R̃ versus T/T_c, with Eq. 64 describing the low-temperature region and Eq. 65 the region near T_c.

However, neither of the formulas is verified by experiment. In real systems, the critical exponent β = 1/2 in Eq. 65 is found to be reduced to ≈ 1/3, and Eq. 64 loses its exponential form. Using a more realistic D̃(E) or values of I will not correct these problems. Clearly something fundamental is missing from this model (spin waves).
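The self-consistency condition of Eq. 61 has no closed-form solution, but it is easy to solve numerically and compare with the limiting forms in Eqs. 64 and 65. The sketch below uses plain bisection; the function names are hypothetical and t denotes T/T_c.

```python
import math


def reduced_magnetization(t, tol=1e-13):
    """Nonzero root of R~ = tanh(R~ / t), Eq. 61, with t = T / T_c.

    Returns 0.0 for t >= 1, where only the trivial solution exists.
    """
    if t >= 1.0:
        return 0.0
    lo, hi = 1e-12, 1.0  # tanh(r/t) - r is positive at lo and non-positive at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.tanh(mid / t) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def low_t_form(t):
    """Eq. 64: R~ = 1 - 2 exp(-2 T_c / T)."""
    return 1.0 - 2.0 * math.exp(-2.0 / t)


def near_tc_form(t):
    """Eq. 65: R~ = sqrt(3) (1 - T / T_c)^(1/2)."""
    return math.sqrt(3.0 * (1.0 - t))


if __name__ == "__main__":
    for t in (0.2, 0.5, 0.9, 0.99):
        print(t, reduced_magnetization(t), low_t_form(t), near_tc_form(t))
```

Comparing the three columns shows where each asymptotic form is useful: Eq. 64 is accurate only well below T_c, and Eq. 65 only close to it.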
Figure 19: Elementary excitations: a local spin flip in the field B_0 (left, included) and a spin wave (right, not included). Local spin-flip excitations due to thermal fluctuations are properly treated by mean-field-like theories such as the one discussed in Secs. 3 and 4. However, non-local spin fluctuations due to intersite correlations between the spins are neglected in mean-field theories. These low-energy excitations can fundamentally change the nature of the transition.

3.2.1 Effect of B
If there is an external field B_0 ≠ 0, then

$$ \tilde{R} = \tanh\left(\frac{\tilde{R}T_c + \mu_B B_0/2k_B}{T}\right) \tag{66} $$

Or, for small R̃ and B_0 (or rather, large T ≫ T_c),

$$ \tilde{R} = \frac{\mu_B B_0}{2kT} + \frac{T_c}{T}\tilde{R} \quad\Rightarrow\quad \tilde{R} = \frac{\mu_B}{2k}\,\frac{B_0}{T - T_c} \tag{67} $$

Thus, since M = μ_B (N/V) R = C μ_B (N/V) R̃,

$$ \chi = \frac{\partial M}{\partial B_0} = \frac{C\mu_B^2 N}{2kV}\,\frac{1}{T - T_c} $$

This form for χ,

$$ \chi = \frac{\text{Const}}{T - T_c} \tag{68} $$

is called the Curie-Weiss form, which is qualitatively satisfied for T ≫ T_c; however, the values of Const and T_c predicted by band structure are inaccurate. Again, this is due to the neglect of low-energy excitations.

4 Mean-Field Theory of Magnetism

4.1 Ferromagnetism for localized electrons (MFT)
Figure 20: Terms in the Heisenberg Hamiltonian H = −Σ_{iδ} J_{iδ} S_i · S_{iδ} − gμ_B B_0 · Σ_i S_i. Here i refers to the sites and δ (= 1, 2, 3, 4 in the sketch) refers to the neighbors of site i.

Some of the rare earth metals or ionic materials with valence d or f electrons are both ferromagnetic and have largely localized electrons, for which the band theory of FM is inappropriate (a good example is CeSi_{2−x}, with x > 0.2). As we have seen, systems with localized spins are described by the Heisenberg Hamiltonian

$$ H = -\sum_{i\delta} J_{i\delta}\,\mathbf{S}_i\cdot\mathbf{S}_{i\delta} - g\mu_B \mathbf{B}_0\cdot\sum_i \mathbf{S}_i \tag{69} $$

In general, this Hamiltonian cannot be solved exactly, and we must resort to (further) approximation. In this case, we will approximate the field (exchange plus external magnetic) felt by each spin as the average field due to the neighbors of that spin and the external field (see Fig. 21). Then

$$ H \approx -\sum_i g\mu_B \mathbf{B}_i^{eff}\cdot\mathbf{S}_i = -\sum_i \mathbf{S}_i\cdot\left[\sum_\delta J_{i\delta}\langle\mathbf{S}_{i\delta}\rangle + g\mu_B\mathbf{B}_0\right] \tag{70} $$

Figure 21: Each site has ν nearest neighbors with exchange interaction J. The mean or average field felt by a spin S_i at site i, due to both its neighbors and the external magnetic field, is (1/gμ_B)⟨Σ_δ J_{iδ} S_{iδ}⟩ + B_0 = B_i^eff, where ⟨Σ_δ J_{iδ} S_{iδ}⟩ is the internal field due to the neighbors of site i.

If J_{iδ} = J is a constant (independent of i and δ) describing the exchange between the spin at site i and its ν nearest neighbors, then

$$ \mathbf{B}_i^{eff} = \frac{J\sum_\delta\langle\mathbf{S}_{i\delta}\rangle + g\mu_B\mathbf{B}_0}{g\mu_B} = \frac{J\nu}{g\mu_B}\langle\mathbf{S}\rangle + \mathbf{B}_0; \qquad \nu = \#\,\text{nn} \tag{71, 72} $$

For a homogeneous, ordered system, M = gμ_B (N/V)⟨S⟩, so

$$ B_{eff} = \frac{V}{N g^2\mu_B^2}\,\nu J M + B_0 = B_{MF} + B_0 \tag{73} $$

and

$$ H \approx -g\mu_B \mathbf{B}_{eff}\cdot\sum_i \mathbf{S}_i \tag{74} $$

i.e., a system of independent spins in a field B_eff. The probability that a particular spin is up is then

$$ P_\uparrow \propto e^{-\beta\left(-g\mu_B B_{eff}\frac{1}{2}\right)} \tag{75} $$

and

$$ P_\downarrow \propto e^{-\beta\left(+g\mu_B B_{eff}\frac{1}{2}\right)} \tag{76} $$

so, on average,

$$ \frac{N_\downarrow}{N_\uparrow} = e^{-\beta g\mu_B B_{eff}} \tag{77} $$

and, since N_↑ + N_↓ = N,

$$ M = \frac{1}{2}g\mu_B\frac{N_\uparrow - N_\downarrow}{V} = \frac{1}{2}g\mu_B\frac{N}{V}\tanh\left(\frac{\beta}{2}g\mu_B B_{eff}\right) \tag{78} $$

Since tanh is odd and B_eff ∝ M, this will only have nontrivial solutions if J > 0 (if B_0 = 0). If we identify

$$ M_s = \frac{1}{2}\frac{N}{V}g\mu_B; \qquad T_c = \frac{1}{4}\frac{\nu J}{k} \tag{79} $$

$$ \frac{M}{M_s} = \tanh\left(\frac{T_c}{T}\frac{M}{M_s}\right) \tag{80} $$

and again for T = 0, M(T = 0) = M_s, and for T ≲ T_c

$$ \frac{M}{M_s} \simeq \sqrt{3}\left(1 - \frac{T}{T_c}\right)^{1/2} \tag{81} $$

Figure 22: Graphical solution of a = tanh(ba): the line y = a intersects the curve y = tanh(ba), which has initial slope b. Equations of the form a = tanh(ba), i.e. Eq. 80, have nontrivial solutions (a ≠ 0) for all b > 1.
Again, we get the same (wrong) exponent β = 1/2. When is this approximation good? When each spin really feels an “average” field. Suppose we have an ordered solid, so that

$$ B_{MF} = \frac{J\nu}{2g\mu_B} = B_{real} \tag{82} $$

Now, consider one spin-flip excitation adjacent to site i only, Fig. 23. If there are an infinite number of spins then B_MF remains unchanged, but for ν < ∞

$$ B_{real} = \frac{J\nu}{2g\mu_B}\,\frac{\nu - 2}{\nu} \neq B_{MF} = \frac{J\nu}{2g\mu_B} \tag{83} $$

Figure 23: The flip of a single spin adjacent to site i makes a significant change in the effective exchange field felt by spin S_i if the site has few nearest neighbors.

Clearly, for this approximation to remain valid, we need B_real ≈ B_MF, which will only happen if (ν − 2)/ν ≈ 1, or ν ≫ 2. The more nearest neighbors to each spin, the better MFT is! (This remains true even when we consider other, lower energy excitations than a local spin flip, such as spin waves.)

4.2 Mean-Field Theory of Antiferromagnets
Oxides of Fe, Co, Ni and of course Cu often display antiferromagnetic coupling between the transition-metal d orbitals. Let’s assume we have such a magnetic system on a bipartite lattice composed of two inter-penetrating sublattices, like bcc. We consider the magnetization of each sublattice separately. For example, the central site shown in Fig. 24 feels a mean field from the ν = 8 near-neighbor spins on the “down” sublattice.

Figure 24: Antiferromagnetism (the Neel state, J < 0) on a bcc lattice is composed of two interpenetrating sc sublattices, the “up” sublattice and the “down” sublattice.

Figure 25: The mean field due to the “down” sublattice, B_MF⁻ = V νJ M⁻/(N⁻ g² μ_B²).

So

$$ M^+ = \frac{1}{2}g\mu_B\frac{N^+}{V}\tanh\left(\frac{g\mu_B}{2kT}\,\frac{V}{N^- g^2\mu_B^2}\,\nu J M^-\right) \tag{84} $$

$$ M^- = (+ \leftrightarrow -) \tag{85} $$

where M⁺ is the magnetization of the up sublattice. These equations have the same form as that for the ferromagnetic case! We can make a closer analogy by realizing that N⁺ = N⁻ and M⁺ = −M⁻, so that

$$ M^+ = \frac{1}{2}g\mu_B\frac{N^+}{V}\tanh\left(-\frac{V\nu J M^+}{2kT\,N^+ g\mu_B}\right), \qquad J < 0 \tag{86} $$

$$ M^- = -M^+ \tag{87} $$

Again, these equations will saturate at

$$ M_s^+ = -M_s^- = \frac{1}{2}g\mu_B\frac{N^+}{V} \tag{88} $$

so

$$ \frac{M^+}{M_s^+} = \tanh\left(\frac{T_N}{T}\frac{M^+}{M_s^+}\right) \tag{89} $$

where T_N = −νJ/(4k_B).
Now consider the effect of a small external field B_0. This will yield a small increase or decrease in each sublattice’s magnetization, ΔM^±:

$$ M^+ + \Delta M^+ = \frac{1}{2}g\mu_B\frac{N^+}{V}\tanh\left\{\frac{g\mu_B}{2kT}\left[B_0 + \frac{V\nu J}{N^- g^2\mu_B^2}\left(M^- + \Delta M^-\right)\right]\right\}, \qquad M^- + \Delta M^- = (+ \leftrightarrow -) \tag{90} $$

Or, since d/dx tanh x = 1/cosh² x (so that ΔM = (∂M/∂B_eff) ΔB_eff),

$$ \Delta M = \Delta M^+ + \Delta M^- = \frac{g^2\mu_B^2 N^+}{2VkT}\,\frac{1}{\cosh^2 x}\left[B_0 + \frac{V\nu J}{2N^- g^2\mu_B^2}\,\Delta M\right] \tag{91} $$

where x = (T_N/T)(M⁺/M_s⁺). For T > T_N, M⁺ = 0 and so x = 0, and, using N⁺ = N⁻ = N/2 and νJ = −4k_B T_N,

$$ \Delta M = \frac{g^2\mu_B^2 N}{4Vk_B T}\left[B_0 - \frac{4k_B T_N V}{N g^2\mu_B^2}\,\Delta M\right] \tag{92} $$

$$ T\,\Delta M = \frac{g^2\mu_B^2 N}{4Vk_B}\,B_0 - T_N\,\Delta M \tag{93} $$

$$ \Delta M = \frac{g^2\mu_B^2 N}{4Vk_B(T + T_N)}\,B_0 \tag{94} $$

$$ \chi = \frac{g^2\mu_B^2 N}{4Vk_B(T + T_N)} \tag{95} $$

Figure 26: Sketch of χ = Const/(T + T_N) versus T. Unlike the ferromagnetic case, the bulk susceptibility χ does not diverge at the transition. However, as we will see, this equation only applies for the paramagnetic state (T > T_N), and even here, there are important corrections.
Below the transition, T < T_N, the susceptibility displays different behaviors depending upon the orientation of the applied field. For T ≪ T_N and a small B_0 parallel to the axis of the sublattice magnetization, we can approximate M⁺(T) ≈ M_s⁺ and x ≈ T_N/T in Eq. 91:

$$ \chi_\parallel \simeq \frac{g^2\mu_B^2 N}{4Vk_B}\,\frac{1}{T\cosh^2\left(\frac{T_N}{T}\right) + T_N} \tag{96} $$

Figure 27: When T ≪ T_N, a weak field B_0 applied parallel to the sublattice magnetization axis only weakly perturbs the spins. Here M⁺(T) ≈ M_s⁺ and x ≈ T_N/T.

Since cosh²(T_N/T) ≈ e^{2T_N/T}/4 for T ≪ T_N, this becomes

$$ \chi_\parallel \simeq \frac{g^2\mu_B^2 N}{Vk_B T}\,e^{-2T_N/T} \tag{97} $$

Now consider the case where B_0 is perpendicular to the magnetic axis. The external field will cause each spin to rotate by a small angle α (see Fig. 28).

Figure 28: When T ≪ T_N, a weak field B_0 applied perpendicular to the sublattice magnetization can still cause a rotation of each spin by an angle α proportional to B_0/B_MF.

The energy of each spin in this external field and the mean field ∝ νJ/(gμ_B), to first order in B_0, is

$$ E = -\frac{1}{2}g\mu_B B_0\sin\alpha + \frac{1}{2}\nu J\cos\alpha \tag{98} $$

Equilibrium is obtained when ∂E/∂α = 0. Since B_0 is taken as small, α ≪ 1 and

$$ E \sim -\frac{1}{2}g\mu_B B_0\,\alpha + \frac{1}{2}\nu J\left(1 - \frac{1}{2}\alpha^2\right) \tag{99} $$

or

$$ \frac{\partial E}{\partial\alpha} = 0 = -\frac{1}{2}g\mu_B B_0 - \frac{1}{2}\nu J\alpha \quad\Rightarrow\quad \alpha = -\frac{g\mu_B B_0}{\nu J} \tag{100} $$

The induced magnetization is then

$$ \Delta M = \frac{1}{2}g\mu_B\frac{N}{V}\,\alpha = -\frac{g^2\mu_B^2 N B_0}{2\nu J V} \tag{101} $$

so

$$ \chi_\perp = \frac{g^2\mu_B^2 N}{2\nu|J|V} = \text{constant} \tag{102} $$

Of course, in general, in a powdered sample, the susceptibility will reflect an average of the two forms; see for example Fig. 30.
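The three regimes (Eq. 95 above T_N, Eqs. 96 and 102 below) can be put on one scale by measuring χ in units of g²μ_B²N/(4Vk_B T_N) and T in units of T_N. The sketch below goes slightly beyond the text in two labeled ways: it uses the self-consistent sublattice magnetization from Eq. 89 in x = (T_N/T)(M⁺/M_s⁺) rather than the T ≪ T_N approximation, and it assumes the usual powder weighting of one third parallel and two thirds perpendicular, which the text does not specify. All function names are hypothetical.

```python
import math


def sublattice_magnetization(t):
    """Nonzero root of m = tanh(m / t), Eq. 89, with t = T / T_N (0 for t >= 1)."""
    if t >= 1.0:
        return 0.0
    lo, hi = 1e-12, 1.0
    for _ in range(200):  # bisection on tanh(m/t) - m
        mid = 0.5 * (lo + hi)
        if math.tanh(mid / t) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chi_parallel(t):
    """Eq. 91 solved for chi, in units of g^2 mu_B^2 N / (4 V k_B T_N)."""
    x = sublattice_magnetization(t) / t
    if x > 300.0:  # cosh would overflow; chi is negligibly small here
        return 0.0
    return 1.0 / (t * math.cosh(x) ** 2 + 1.0)


def chi_perpendicular():
    """Eq. 102 with nu |J| = 4 k_B T_N, in the same units."""
    return 0.5


def chi_powder(t):
    """Assumed powder average: (chi_par + 2 chi_perp) / 3 below T_N."""
    if t >= 1.0:
        return chi_parallel(t)  # reduces to Eq. 95, 1 / (t + 1)
    return (chi_parallel(t) + 2.0 * chi_perpendicular()) / 3.0


if __name__ == "__main__":
    for t in (0.1, 0.5, 0.9, 1.0, 1.5, 2.0):
        print(t, chi_parallel(t), chi_perpendicular(), chi_powder(t))
```

In these units the parallel and perpendicular susceptibilities meet at T_N, where both equal the paramagnetic value of Eq. 95, and the powder average has a cusp there.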
Figure 29: Below the Neel transition, the lattice responds very differently to a field applied parallel (χ_∥) or perpendicular (χ_⊥) to the sublattice magnetization. However, in a powdered sample, or for a field applied in an arbitrary direction, the susceptibility looks something like the sketch on the right.

5 Spin Waves
We have discussed the failings of our mean-field approaches to magnetism in terms of their inability to account for low-energy processes, such as the flipping of spins (S^α → −S^α). However, we have yet to discuss the lowest energy spin-flip processes, which are spin waves.

We will approach spin waves two ways. First, following Ibach and Lüth, we will determine a spin wave in a ferromagnet. Second, we will argue that they should be quantized and then introduce a (canonical) transformation to a Boson representation.

Figure 30: Susceptibility versus T (K), from 0 to 95 K, of the high temperature superconductor Y123 with 25% Fe substituted on Cu, courtesy W. Joiner, data from a SQUID magnetometer.

Consider a ferromagnetic Heisenberg model

$$ H = -J\sum_{i\delta}\mathbf{S}_i\cdot\mathbf{S}_{i+\delta} \tag{103} $$

where S_i = x̂ S_i^x + ŷ S_i^y + ẑ S_i^z. If we define

$$ |\alpha\rangle = \begin{pmatrix}1\\0\end{pmatrix}\ (\text{i.e. } |\uparrow\rangle), \qquad |\beta\rangle = \begin{pmatrix}0\\1\end{pmatrix}\ (\text{i.e. } |\downarrow\rangle), \qquad \text{so that}\quad S^z\begin{pmatrix}0\\1\end{pmatrix} = -\frac{1}{2}\begin{pmatrix}0\\1\end{pmatrix}\ \cdots \tag{104} $$

and

$$ S^x = \frac{1}{2}\begin{pmatrix}0 & 1\\1 & 0\end{pmatrix}, \qquad S^y = \frac{1}{2}\begin{pmatrix}0 & -i\\i & 0\end{pmatrix}, \qquad S^z = \frac{1}{2}\begin{pmatrix}1 & 0\\0 & -1\end{pmatrix} \tag{105} $$

$$ \left[S^\alpha, S^\beta\right] = i\epsilon_{\alpha\beta\gamma}S^\gamma \tag{106} $$

It is often convenient to introduce spin lowering and raising operators S⁻ and S⁺:

$$ S^+ = S^x + iS^y = \begin{pmatrix}0 & 1\\0 & 0\end{pmatrix}, \qquad \left[S^z, S^\pm\right] = \pm S^\pm \tag{107} $$

$$ S^- = S^x - iS^y = \begin{pmatrix}0 & 0\\1 & 0\end{pmatrix}, \qquad \left[S^+, S^-\right] = 2S^z \tag{108} $$

$$ \left[S^2, S^\pm\right] = 0, \qquad S^+\begin{pmatrix}1\\0\end{pmatrix} = 0\ \cdots \tag{109} $$

They allow us to rewrite H as

$$ H = -J\sum_{i\delta}\left[S_i^z S_{i+\delta}^z + \frac{1}{2}\left(S_i^+ S_{i+\delta}^- + S_i^- S_{i+\delta}^+\right)\right] \tag{110} $$
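The commutation relations and the ladder-operator form of S_i · S_{i+δ} can be verified directly with 2×2 and 4×4 matrices. The following is a small self-contained check in pure Python; the helper names are hypothetical, and the two-site operators are built with an explicit Kronecker product.

```python
def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(c, a):
    return [[c * x for x in row] for row in a]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def commutator(a, b):
    return mat_add(mat_mul(a, b), mat_scale(-1, mat_mul(b, a)))


def kron(a, b):
    """Kronecker product: operator a on site 1 times operator b on site 2."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Spin-1/2 matrices of Eq. 105
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]
S_PLUS = mat_add(SX, mat_scale(1j, SY))    # S^x + i S^y
S_MINUS = mat_add(SX, mat_scale(-1j, SY))  # S^x - i S^y

# S_1 . S_2 written two ways for a pair of sites
DOT = mat_add(mat_add(kron(SX, SX), kron(SY, SY)), kron(SZ, SZ))
LADDER = mat_add(kron(SZ, SZ),
                 mat_scale(0.5, mat_add(kron(S_PLUS, S_MINUS), kron(S_MINUS, S_PLUS))))

if __name__ == "__main__":
    print("[Sx, Sy] = i Sz:", close(commutator(SX, SY), mat_scale(1j, SZ)))
    print("[Sz, S+] = +S+ :", close(commutator(SZ, S_PLUS), S_PLUS))
    print("S1.S2 equals ladder form:", close(DOT, LADDER))
```

The last comparison is the identity used to pass from Eq. 103 to the ladder-operator form of H.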
Since J > 0, the ground state is composed of all spins oriented in the same direction, for example

$$ |0\rangle = \prod_i |\alpha\rangle_i \qquad (\text{i.e. all up}) \tag{111} $$

This is an eigenstate of H, since

$$ S_i^+ S_{i+\delta}^-|0\rangle = 0 \tag{112} $$

and

$$ S_i^z S_{i+\delta}^z|0\rangle = \frac{1}{4}|0\rangle \tag{113} $$

so

$$ H|0\rangle = -\frac{1}{4}J\nu N|0\rangle \equiv E_0|0\rangle \tag{114} $$

where N is the number of spins, each with ν nearest neighbors. Now consider a spin-flip excitation (see Fig. 31):

$$ |\downarrow_j\rangle = S_j^-\prod_n |\alpha\rangle_n \tag{115} $$

Figure 31: A single local spin-flip excitation at site j of a ferromagnetic system. The resulting state is not an eigenstate of the Heisenberg Hamiltonian.
− This is not an eigenstate since the Hamiltonian operator Sj+Sj+δ
will move the flipped spin to an adjacent site, and hence create another state. However, if we delocalize this spin-flip excitation, then we can create a lower energy excitation (due to the non-linear nature of the inter-spin potential) which is an eigenstate. Consider the
j Figure 32: If we spread out the spin-flip over a wider region then we can create a lower energy excitation. A spin-wave is the completely delocalized analog of this with one net spin flip.
state 1 |ki = √ N It is an eigenstate. Consider:
X
j
eik·rj |↓j i .
(116)
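\[ H |k\rangle = \frac{1}{\sqrt{N}} \sum_j e^{i\mathbf{k}\cdot\mathbf{r}_j} \left\{ \left[ -\frac{1}{4} \nu J (N-2) + \frac{1}{2} \nu J \right] |\downarrow_j\rangle - \frac{1}{2} J \sum_\delta \left( |\downarrow_{j+\delta}\rangle + |\downarrow_{j-\delta}\rangle \right) \right\} \tag{117} \]
where the sum in the last two terms on the right is over the near-neighbors $\delta$ of site $j$. The last two terms may be rewritten:
\[ \frac{1}{\sqrt{N}} \sum_j e^{i\mathbf{k}\cdot\mathbf{r}_j} |\downarrow_{j+\delta}\rangle = \frac{1}{\sqrt{N}} \sum_m e^{i\mathbf{k}\cdot(\mathbf{r}_m - \mathbf{r}_\delta)} |\downarrow_m\rangle \tag{118} \]
where
\[ \mathbf{r}_{j+\delta} = \mathbf{r}_j + \mathbf{r}_\delta = \mathbf{r}_m \tag{119} \]
so
\[ H |k\rangle = \left[ -\frac{1}{4} \nu J N + \nu J - \frac{1}{2} J \sum_\delta \left( e^{i\mathbf{k}\cdot\mathbf{r}_\delta} + e^{-i\mathbf{k}\cdot\mathbf{r}_\delta} \right) \right] \frac{1}{\sqrt{N}} \sum_j e^{i\mathbf{k}\cdot\mathbf{r}_j} |\downarrow_j\rangle \tag{120} \]
Thus $|k\rangle$ is an eigenstate with eigenvalue
\[ E = E_0 + J \nu \left[ 1 - \frac{1}{\nu} \sum_\delta \cos \mathbf{k}\cdot\mathbf{r}_\delta \right] \tag{121} \]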
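The eigenvalue of Eq. (121) can be verified by brute force on a small system. The sketch below assumes a one-dimensional ring of $N = 6$ spin-1/2 sites with unit lattice constant and $\nu = 2$ (so that $\frac{1}{\nu}\sum_\delta \cos k r_\delta = \cos k$); these choices, and all function names, are illustrative and not taken from the notes. The Hamiltonian counts every bond twice, matching the $\sum_{i\delta}$ convention used above.

```python
import numpy as np

SX = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
SY = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)


def site_operator(op, site, n_sites):
    """Embed a single-site 2x2 operator at `site` in the 2**n_sites space."""
    full = np.eye(1, dtype=complex)
    for i in range(n_sites):
        full = np.kron(full, op if i == site else np.eye(2))
    return full


def heisenberg_ring(n_sites, J=1.0):
    """H = -J sum_{i,delta} S_i . S_{i+delta}, delta = +1, -1 (bonds counted twice)."""
    dim = 2 ** n_sites
    H = np.zeros((dim, dim), dtype=complex)
    for i in range(n_sites):
        for delta in (+1, -1):
            j = (i + delta) % n_sites
            for s in (SX, SY, SZ):
                H -= J * site_operator(s, i, n_sites) @ site_operator(s, j, n_sites)
    return H


def flipped(j, n_sites):
    """Basis vector with spin j down and all other spins up (Eq. 115)."""
    vec = np.zeros(2 ** n_sites, dtype=complex)
    vec[2 ** (n_sites - 1 - j)] = 1.0  # site 0 is the most significant bit
    return vec


def spin_wave(k, n_sites):
    """|k> = N^{-1/2} sum_j exp(i k j) |down_j> (Eq. 116)."""
    state = sum(np.exp(1j * k * j) * flipped(j, n_sites) for j in range(n_sites))
    return state / np.sqrt(n_sites)


def spin_wave_energy(k, n_sites, J=1.0, nu=2):
    """E0 + J nu (1 - cos k): the 1D form of Eq. (121), with E0 = -J nu N / 4."""
    return -0.25 * J * nu * n_sites + J * nu * (1.0 - np.cos(k))
```

With `H = heisenberg_ring(6)` and an allowed wavevector `k = 2 * np.pi / 6`, the vector `H @ spin_wave(k, 6)` should equal `spin_wave_energy(k, 6) * spin_wave(k, 6)`, whereas the localized state `flipped(0, 6)` is not an eigenvector.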
Apparently the energy of the excitation described by $|k\rangle$ vanishes as $k \to 0$. What is $|k\rangle$? First, consider
\[ S^z |k\rangle = \sum_i S_i^z |k\rangle = \sum_i S_i^z \frac{1}{\sqrt{N}} \sum_j e^{i\mathbf{k}\cdot\mathbf{r}_j} |\downarrow_j\rangle = \frac{1}{\sqrt{N}} \sum_j e^{i\mathbf{k}\cdot\mathbf{r}_j} \sum_i S_i^z |\downarrow_j\rangle = (SN - 1) |k\rangle \tag{122} \]
I.e., it is an excitation of the ground state with one spin flipped. Apparently, since $E_{k=0} = E_0$, the energy to flip a spin in this way vanishes as $k \to 0$.
5.1 Quantization of Ferromagnetic Spin Waves

In the ground state all of the spins are up. If we flip a spin, using a spin-wave excitation, then
\[ S^z |k\rangle = (SN - 1) |k\rangle, \qquad S^z |0\rangle = SN |0\rangle \tag{123} \]
If we add another spin wave, then $\langle S^z \rangle = (SN - 2)$. For spin $\frac{1}{2}$, $\langle S^z \rangle = \frac{N}{2} - n$, where $n$ is the number of spin waves. Since $S^z$ is quantized, so must be the number of spin waves in each mode. Thus, we may describe spin waves using Boson creation and annihilation operators $a^\dagger$ and $a$. By specifying the number $n_k$ of excitations in each mode $k$, the corresponding excited spin state can be described by a Boson state vector $|n_1 n_2 \ldots n_N\rangle$.

We can introduce creation and annihilation operators to describe the spin excitations on each site. Suppose that, in the ground state, the spin is saturated in the state $S^z = S$; then $n = 0$. If $S^z = S - 1$, then $n = 1$, and so on. Apparently
\[ S_i^z = S - a_i^\dagger a_i, \qquad S_i^+ \propto a_i, \qquad S_i^- \propto a_i^\dagger \tag{124} \]
  S^z     n
  3/2     0
  1/2     1
 -1/2     2
 -3/2     3

Table 1: The correspondence between $S^z$ and the number of spin-wave excitations on a site with $S = 3/2$.

If these excitations are Bose-like, then
\[ \left[ a_i, a_i^\dagger \right] = 1 \tag{125} \]
\[ a_i |n\rangle = \sqrt{n_i}\, |n - 1\rangle \tag{126} \]
\[ a_i^\dagger |n\rangle = \sqrt{n_i + 1}\, |n + 1\rangle \tag{127} \]
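This transformation is faithful (canonical) and will maintain the dynamical properties of the system (given by $\frac{\partial \theta}{\partial t} = \frac{i}{\hbar} [H, \theta]$) if it preserves the commutator algebra
\[ \left[ S_i^+, S_i^- \right] = 2 S_i^z, \qquad \left[ S_i^-, S_i^z \right] = S_i^-, \qquad \left[ S_i^+, S_i^z \right] = -S_i^+ \tag{128} \]
(the last two follow from $[S^z, S^\pm] = \pm S^\pm$). Consider
\[ \left\{ S^+ S^- - S^- S^+ \right\} |n\rangle = 2 S^z |n\rangle = 2 (S - n) |n\rangle \tag{129} \]
If $S^+ = a$ and $S^- = a^\dagger$, then the left-hand side of the above equation would be $\{(n + 1) - n\} |n\rangle = |n\rangle \neq 2 (S - n) |n\rangle$. In order to maintain the commutators, we need
\[ S^+ = \sqrt{2S - n}\; a, \qquad S^- = a^\dagger \sqrt{2S - n} \tag{130} \]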
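Then
\[ \left[ S^+, S^- \right] |n\rangle = S^+ S^- |n\rangle - S^- S^+ |n\rangle = \sqrt{2S - a^\dagger a}\; a a^\dagger \sqrt{2S - a^\dagger a}\, |n\rangle - a^\dagger \left( 2S - a^\dagger a \right) a |n\rangle \]
\[ = (2S - n)(n + 1) |n\rangle - n \left( 2S - (n - 1) \right) |n\rangle = \left( 2Sn + 2S - n^2 - n - 2Sn + n^2 - n \right) |n\rangle = 2 (S - n) |n\rangle \tag{131} \]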
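The algebra of Eq. (131) can be checked with explicit matrices. This sketch assumes the boson space on each site is truncated to the physical states $n = 0, \ldots, 2S$, which is where the transformation is meant to hold; the function names are illustrative.

```python
import numpy as np


def holstein_primakoff(S):
    """Return (S+, S-, Sz) built from boson matrices on the states n = 0..2S."""
    dim = int(round(2 * S)) + 1
    n = np.arange(dim)
    # Annihilation operator: a|n> = sqrt(n)|n-1>, so a[n-1, n] = sqrt(n).
    a = np.diag(np.sqrt(n[1:].astype(float)), k=1)
    root = np.diag(np.sqrt(2 * S - n))   # sqrt(2S - a^dagger a), diagonal in n
    s_plus = root @ a                    # S+ = sqrt(2S - n) a
    s_minus = a.T @ root                 # S- = a^dagger sqrt(2S - n)
    s_z = np.diag(S - n)                 # Sz = S - a^dagger a
    return s_plus, s_minus, s_z


def commutator(x, y):
    """Matrix commutator [x, y]."""
    return x @ y - y @ x
```

For any spin, `commutator(s_plus, s_minus)` should equal `2 * s_z`, and `commutator(s_z, s_plus)` should equal `s_plus`. For $S = 1/2$ the matrices reduce to the $2 \times 2$ raising and lowering operators.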
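You can check that this transformation preserves the other commutators. Of course, we need one other constraint: since $-S \le S^z \le S$, we also must have
\[ n \le 2S \tag{132} \]
This transformation,
\[ S_i^+ = \sqrt{2S - a_i^\dagger a_i}\; a_i, \qquad S_i^- = a_i^\dagger \sqrt{2S - a_i^\dagger a_i}, \qquad S_i^z = S - a_i^\dagger a_i \tag{133} \]
is called the Holstein-Primakoff transformation. If we Fourier transform these operators,
\[ a_i^\dagger = \frac{1}{\sqrt{N}} \sum_k e^{i\mathbf{k}\cdot\mathbf{R}_i} a_k^\dagger ; \qquad a_i = \frac{1}{\sqrt{N}} \sum_k e^{-i\mathbf{k}\cdot\mathbf{R}_i} a_k \tag{134} \]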
then (since the Fourier transform is unitary) these new operators satisfy the same commutation relations
\[ \left[ a_k, a_{k'}^\dagger \right] = \delta_{kk'}, \qquad \left[ a_k^\dagger, a_{k'}^\dagger \right] = \left[ a_k, a_{k'} \right] = 0 \tag{135} \]
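To convert the Hamiltonian into this form, assume the number of magnons in each mode is small and expand
\[ S_i^+ = \sqrt{2S - n_i}\; a_i \simeq \sqrt{2S} \left( 1 - \frac{n_i}{4S} \right) a_i \approx \sqrt{2S} \left[ \frac{1}{\sqrt{N}} \sum_k e^{-i\mathbf{k}\cdot\mathbf{R}_i} a_k - \frac{1}{4S N^{3/2}} \sum_{kpq} e^{i(\mathbf{k} - \mathbf{p} - \mathbf{q})\cdot\mathbf{R}_i} a_k^\dagger a_p a_q \right] \tag{136} \]
Of course this is only accurate for $n_i \ll 2S$, i.e. for low $T$, where there are few spin excitations, and large $S$ (the classical spin limit). In this limit
\[ S_i^+ \simeq \sqrt{\frac{2S}{N}} \sum_k e^{-i\mathbf{k}\cdot\mathbf{R}_i} a_k \tag{137} \]
\[ S_i^- \simeq \sqrt{\frac{2S}{N}} \sum_k e^{i\mathbf{k}\cdot\mathbf{R}_i} a_k^\dagger \tag{138} \]
\[ S_i^z = S - \frac{1}{N} \sum_{kk'} e^{i(\mathbf{k} - \mathbf{k}')\cdot\mathbf{R}_i} a_k^\dagger a_{k'} , \tag{139} \]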
the Hamiltonian
\[ H = -J \sum_{i\delta} \left[ S_i^z S_{i+\delta}^z + \frac{1}{2} \left( S_i^+ S_{i+\delta}^- + S_i^- S_{i+\delta}^+ \right) \right] \tag{140} \]
may be approximated as
\[ H \simeq -N J \nu S^2 + 2 J \nu S \sum_k a_k^\dagger a_k - 2 J \nu S \sum_k \gamma_k a_k^\dagger a_k + O(a_k^4) \tag{141} \]
\[ H \simeq E_0 + \sum_k 2 J \nu S (1 - \gamma_k) a_k^\dagger a_k + O(a_k^4) \tag{142} \]
where $\gamma_k = \frac{1}{\nu} \sum_\delta e^{i\mathbf{k}\cdot\mathbf{R}_\delta}$. This is the Hamiltonian of a collection of harmonic oscillators plus some other term of order $O(a_k^4)$ which corresponds to interactions between the spin waves.

Figure 33: The fourth order correction to Eq. 142 corresponds to interactions between the spin waves, giving them a finite lifetime. (Diagram: incoming magnons $k$ and $k'$ scatter to $k - q$ and $k' + q$.)

These interactions are a result of our definition of a spin-wave as an itinerant spin flip in an otherwise perfect ferromagnet. Once we have one magnon, another cannot be created in a "perfect" ferromagnetic background. Clearly if the number of such excitations is small ($T$ small) and $S$ is large, then our approximation should be valid. Furthermore, since these are the lowest energy excitations of our spin system, they should dominate the low-$T$ thermodynamic properties of the system such as the specific heat and the magnetization. Consider
\[ \langle E \rangle = \sum_k \frac{\hbar\omega_k}{e^{\beta\hbar\omega_k} - 1} . \tag{143} \]
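The dispersion $\hbar\omega_k = 2J\nu S(1 - \gamma_k)$ entering Eqs. (142) and (143) is easy to evaluate. The sketch below assumes a simple (hyper)cubic lattice with unit lattice constant, so that $\nu = 2d$ and $\gamma_k$ is the average of $\cos k_i$ over the $d$ Cartesian components; the function names and default parameters are illustrative.

```python
import numpy as np


def gamma_k(k):
    """gamma_k = (1/nu) sum_delta exp(i k.R_delta) on a cubic lattice with a = 1."""
    k = np.asarray(k, dtype=float)
    return float(np.mean(np.cos(k)))  # (2/nu) * sum_i cos(k_i), with nu = 2d


def magnon_energy(k, J=1.0, S=0.5):
    """hbar*omega_k = 2 J nu S (1 - gamma_k) for a ferromagnet (Eq. 142)."""
    k = np.asarray(k, dtype=float)
    nu = 2 * k.size
    return 2.0 * J * nu * S * (1.0 - gamma_k(k))
```

For small $|k|$, $1 - \gamma_k \approx k^2/\nu$, so `magnon_energy(k)` approaches $2JSk^2$, and it vanishes at $k = 0$ as expected for a gapless spin wave.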
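For small $k$ on a cubic lattice (with the lattice constant set to one),
\[ \gamma_k = \frac{2}{\nu} \left( \cos k_x + \cos k_y + \cdots \right) \approx 1 - \frac{k^2}{\nu} \tag{144} \]
so that $\hbar\omega_k = 2J\nu S (1 - \gamma_k) \approx 2JSk^2$. Let us assume that $k$-space is isotropic, so that $d^3k \sim k^2 dk$; then
\[ \langle E \rangle \approx \sum_k \frac{2JSk^2}{e^{\beta 2JSk^2} - 1} \propto \int_0^\infty \frac{k^4\, dk}{e^{\beta\alpha k^2} - 1}, \qquad \alpha \equiv 2JS . \tag{145} \]
Substituting $x = \beta\alpha k^2$, so that
\[ k = \left( \frac{x}{\beta\alpha} \right)^{1/2}, \qquad dk = \frac{1}{2} \left( \frac{1}{\beta\alpha} \right)^{1/2} x^{-1/2}\, dx \tag{146} \]
\[ \langle E \rangle \propto \beta^{-2} \beta^{-1/2} \int_0^\infty \frac{x^{3/2}\, dx}{e^x - 1} \sim T^{5/2} \tag{147} \]
Thus, the specific heat at constant volume $C_V \sim T^{3/2}$, which is in agreement with experiment.

Figure 34: The magnetization in a ferromagnet versus temperature (sketch of $M/M(0)$ against $T$, falling as $1 - T^{3/2}$). At low temperatures, the spin waves reduce the magnetization by an amount proportional to $T^{3/2}$, which dominates the reduction due to local spin fluctuations derived from our mean-field theory. This result is also consistent with experiment.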
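The temperature scaling of Eq. (147) can be confirmed by direct quadrature. The sketch below is a plain trapezoidal estimate in units with $k_B = 1$; the cutoff `x_max`, the grid size, and the function name `bose_moment` are assumptions made for illustration. With `power=4` it evaluates the energy integral; with `power=2` it evaluates the magnon-number integral that controls the magnetization discussed in the text that follows.

```python
import numpy as np


def bose_moment(T, power, alpha=1.0, n_points=20001, x_max=60.0):
    """Trapezoidal estimate of int_0^inf k^power dk / (exp(alpha k^2 / T) - 1)."""
    # Cut the integral off where alpha k^2 / T = x_max; the integrand is negligible there.
    k_max = np.sqrt(x_max * T / alpha)
    k = np.linspace(1e-8 * k_max, k_max, n_points)  # avoid the 0/0 point at k = 0
    integrand = k**power / np.expm1(alpha * k**2 / T)
    dk = k[1] - k[0]
    return dk * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))


def scaling_exponent(power, T1=1.0, T2=2.0):
    """Effective exponent p in integral ~ T^p, from two temperatures."""
    ratio = bose_moment(T2, power) / bose_moment(T1, power)
    return float(np.log(ratio) / np.log(T2 / T1))
```

`scaling_exponent(4)` should return 5/2 (the energy), and `scaling_exponent(2)` should return 3/2 (the number of thermally excited magnons).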
If we increase the temperature from zero, then the change in the magnetization is proportional to the number of magnons generated,
\[ M(0) - M(T) = \left\langle \sum_k n_k \right\rangle \frac{g\mu_B}{V} \tag{148} \]
since each magnon corresponds to one spin flip. Thus
\[ M(T) - M(0) \sim -\int \frac{k^2\, dk}{e^{\beta\alpha k^2} - 1} \sim -T^{3/2} \tag{149} \]
which clearly dominates the exponential form found in MFT, $\left( 1 - 2e^{-2T_c/T} \right)$. This is also consistent with experiment!

5.2 Antiferromagnetic Spin Waves
Since the antiferromagnetic ground state is unknown, the spin wave theory will perturb around the Neel mean-field state, in which there are both spin up and spin down sublattices.

Figure 35: To formulate an antiferromagnetic spin-wave theory, we once again consider a bipartite lattice, which may be decomposed into interpenetrating spin up and spin down sublattices.

Spin operators can then be written in terms of the Boson creation and annihilation operators as before:
\[ \text{"up" sublattice:} \qquad S_i^z = S - n_i, \qquad S_i^+ = \left( S_i^- \right)^\dagger = \sqrt{2S}\, f_i(S)\, a_i \]
\[ \text{"down" sublattice:} \qquad S_i^z = -S + n_i, \qquad S_i^+ = \left( S_i^- \right)^\dagger = \sqrt{2S}\, a_i^\dagger f_i(S) \tag{150} \]
where
\[ f_i(S) = \sqrt{1 - \frac{n_i}{2S}} \qquad \text{and} \qquad n_i = a_i^\dagger a_i \tag{151} \]
Again this transformation is exact (canonical) within the manifold of allowed states
\[ 0 \le n_i \le 2S \iff -S \le S_z \le S . \tag{152} \]
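The Hamiltonian
\[ H = -J \sum_{i\delta} \left[ S_i^z S_{i+\delta}^z + \frac{1}{2} \left( S_i^+ S_{i+\delta}^- + S_i^- S_{i+\delta}^+ \right) \right] \tag{153} \]
may be rewritten in terms of Boson operators as
\[ H = +J S^2 N \nu + J \sum_{i\delta} a_i^\dagger a_i a_{i+\delta}^\dagger a_{i+\delta} - JS \sum_{i\delta} \left\{ a_i^\dagger a_i + a_{i+\delta}^\dagger a_{i+\delta} + f_i(S) a_i f_{i+\delta}(S) a_{i+\delta} + a_i^\dagger f_i(S) a_{i+\delta}^\dagger f_{i+\delta}(S) \right\} \tag{154} \]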
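Once again, we will expand
\[ f_i(S) = \sqrt{1 - \frac{n_i}{2S}} = 1 - \frac{n_i}{4S} - \frac{n_i^2}{32 S^2} - \cdots \tag{155} \]
and include terms in $H$ only to $O(a^2)$:
\[ H \simeq J S^2 N \nu - JS \sum_{i\delta} \left\{ a_i^\dagger a_i + a_{i+\delta}^\dagger a_{i+\delta} + a_i a_{i+\delta} + a_i^\dagger a_{i+\delta}^\dagger \right\} \tag{156} \]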
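This Hamiltonian may be diagonalized using a Fourier transform
\[ a_i = \frac{1}{\sqrt{N}} \sum_k e^{-i\mathbf{k}\cdot\mathbf{R}_i} a_k \tag{157} \]
and the Bogoliubov transform
\[ a_k = \alpha_k \cosh u_k - \alpha_{-k}^\dagger \sinh u_k \tag{158} \]
\[ a_k^\dagger = \alpha_k^\dagger \cosh u_k - \alpha_{-k} \sinh u_k \tag{159} \]
\[ \tanh 2u_k = -\gamma_k \tag{160} \]
\[ \gamma_k = \frac{1}{\nu} \sum_\delta e^{i\mathbf{k}\cdot\mathbf{R}_\delta} \tag{161} \]
Here the $\alpha_k$ are also Boson operators, $\left[ \alpha_k, \alpha_k^\dagger \right] = 1$. To see if this transform is canonical, we must ensure that the commutators are preserved:
\[ \left[ a_k, a_{k'}^\dagger \right] = \left[ \alpha_k C_k - \alpha_{-k}^\dagger S_k,\; \alpha_{k'}^\dagger C_{k'} - \alpha_{-k'} S_{k'} \right] = \left\{ C_k^2 \left[ \alpha_k, \alpha_k^\dagger \right] + S_k^2 \left[ \alpha_{-k}^\dagger, \alpha_{-k} \right] \right\} \delta_{kk'} = \left\{ C_k^2 - S_k^2 \right\} \delta_{kk'} = \delta_{kk'} \tag{162} \]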
where $C_k$ ($S_k$) is shorthand for $\cosh u_k$ ($\sinh u_k$). You should check that the other relations, $[a_k, a_{k'}] = [a_k^\dagger, a_{k'}^\dagger] = 0$, are preserved. After this transformation,
\[ H \approx J N \nu S (S + 1) + \sum_k \hbar\omega_k \left( \alpha_k^\dagger \alpha_k + \frac{1}{2} \right) + O(a^4) \tag{163} \]
where $\hbar\omega_k = -2JS\nu \sqrt{1 - \gamma_k^2}$. Notice that for small $k$, where $\gamma_k \approx 1 - k^2/\nu$ on a cubic lattice with unit lattice constant, the dispersion is linear: $\hbar\omega_k \approx -2JS\sqrt{2\nu}\, k \equiv Ck$, where $C = -2JS\sqrt{2\nu}$ (positive, since $J < 0$) sets the spin-wave velocity. The ground state energy of this system (no magnons) is
\[ E_0 = J N \nu S (S + 1) - JS\nu \sum_k \sqrt{1 - \gamma_k^2} \tag{164} \]
If $\gamma_k = 0$, then each spin decouples from the fluctuations of its neighbors and $E_0 = J N \nu S^2$ ($J < 0$), which is the energy $E_N$ of the Neel state. However, since $\gamma_k \neq 0$, the ground state energy $E_0 < E_N$. Thus the ground state is not the Neel state, and is not composed of perfectly antiparallel aligned spins. Each sublattice has a small amount of disorder $\propto \langle n_i \rangle$ in its spin alignment.

Figure 36: The Neel state of an antiferromagnetic lattice, with energy $E_N = J N \nu S^2$. Due to zero point motion, this is not the ground state of the Heisenberg Hamiltonian when $J < 0$ and $S$ is finite.
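The antiferromagnetic magnon energy $\hbar\omega_k = -2JS\nu\sqrt{1 - \gamma_k^2}$ can be cross-checked without carrying out the Bogoliubov algebra by hand. The sketch below uses the standard bosonic prescription, which is an assumption added here rather than the route taken in the notes: write the quadratic part of Eq. (156) for each $k$ as $\frac{1}{2}\Psi_k^\dagger M \Psi_k$ with $\Psi_k = (a_k, a_{-k}^\dagger)^T$ and $M = A \begin{pmatrix} 1 & \gamma_k \\ \gamma_k & 1 \end{pmatrix}$, $A = -2JS\nu$, and take the positive eigenvalue of $\eta M$ with $\eta = \mathrm{diag}(1, -1)$.

```python
import numpy as np


def afm_magnon_energy(gamma, J=-1.0, S=0.5, nu=6):
    """Closed form hbar*omega_k = -2 J S nu sqrt(1 - gamma_k^2), valid for J < 0."""
    return -2.0 * J * S * nu * np.sqrt(1.0 - gamma**2)


def bogoliubov_frequency(gamma, J=-1.0, S=0.5, nu=6):
    """Positive eigenvalue of eta @ M for the 2x2 boson problem (assumed form of M)."""
    A = -2.0 * J * S * nu
    M = A * np.array([[1.0, gamma], [gamma, 1.0]])
    eta = np.diag([1.0, -1.0])           # boson metric: [a, a^dagger] = +1
    eigenvalues = np.linalg.eigvals(eta @ M)
    return float(np.max(eigenvalues.real))
```

For $|\gamma_k| < 1$ the two functions should agree, and inserting $\gamma_k = 1 - k^2/\nu$ for small $k$ reproduces the linear dispersion $-2JS\sqrt{2\nu}\,k$.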
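The linear dispersion of the antiferromagnet means that its bulk thermodynamic properties will emulate those of a phonon lattice. For example, writing $\hbar\omega_k \simeq \alpha k$,
\[ \langle E \rangle \simeq \sum_k \frac{\hbar\omega_k}{e^{\beta\hbar\omega_k} - 1} \simeq \sum_k \frac{\alpha k}{e^{\beta\alpha k} - 1} \sim \int_0^\infty \frac{\alpha k^3\, dk}{e^{\beta\alpha k} - 1} \tag{165} \]
\[ \langle E \rangle \sim T^4 \int_0^\infty \frac{x^3\, dx}{e^x - 1}, \qquad C = \frac{\partial \langle E \rangle}{\partial T} \sim T^3 \quad \text{like phonons!} \tag{166} \]
This means that a calorimeter experiment cannot distinguish phonon and magnon excitations of an antiferromagnet.

Figure 37: Polarized neutrons are used for two reasons. First, if we look at only spin flip events, then we can discriminate between phonon and magnon contributions to $S(\mathbf{k}, \omega)$. Second, the dispersion may be anisotropic, so excitations with orthogonal polarizations may disperse differently. (Sketch: thermal spin-polarized neutrons of energy $E_i$ scatter from the sample through an angle $2\theta$ to energy $E_f$. The measured cross section is $\frac{d\sigma}{d\Omega\, d\omega} \propto S(\mathbf{k}, \omega) \propto \mathrm{Im} \left\{ F \left( -i \langle [a(t), a^\dagger(0)] \rangle \right) \right\}$, with $2\theta \propto \mathbf{k} = \mathbf{k}_i - \mathbf{k}_f$ and $\hbar\omega = E_i - E_f$.)

Therefore, perhaps the most distinctive experiment one may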
perform on an antiferromagnet is inelastic neutron scattering. If spin-polarized neutrons are scattered from a sample, then only those with flipped spins have created a magnon. If the neutron creates a phonon, then its spin remains unchanged. The time of flight of the neutron allows us to determine the energy loss or gain of the neutron. Thus, if we plot the differential cross section of neutrons with flipped spins, we learn about magnon dispersion and lifetime. Notice that the peak in $S(\mathbf{k}, \omega)$ has a width. This is not just due to the instrumental resolution of the experiment; rather it also reflects the fact that magnons have a finite lifetime $\delta t$, which broadens their neutron signature by a linewidth $\gamma_k$ (not to be confused with the structure factor $\gamma_k$ of Eq. 161):
\[ \gamma_k\, \delta t \sim \hbar, \qquad \delta t \sim \frac{\hbar}{\gamma_k} \tag{167} \]

Figure 38: Sketch of neutron structure factor from scattering off of a magnetic system. The spin-wave peak is centered on the magnon dispersion $\langle \omega_k \rangle$. It has a width $\gamma$ due to the finite lifetime of magnon excitations. (Panels: $S(\mathbf{k}, \omega)$ versus $\omega$ for spin-flip neutrons ($n\downarrow$, magnon) and non-spin-flip neutrons ($n\uparrow$, phonon "junk" subtracted off); dispersion $\omega$ versus $k$, with $k$ set by the scattering angle $2\theta$.)
However, in the quadratic spin wave approximation the lifetime of the modes $\hbar\omega_k$ is infinite. It is the neglected terms in $H$, of order $O(a^4)$ and higher, which give the magnons a finite lifetime.

Figure 39: The quartic term $a_i^\dagger a_i a_{i+\delta}^\dagger a_{i+\delta}$ corresponds to a magnon-magnon scattering vertex.
Chapter 9: Electronic Transport Onsager April 23, 2001
Contents

1 Quasiparticle Propagation
  1.1 Quasiparticle Equation of Motion and Effective Mass
2 Currents in Bands
  2.1 Current in an Insulator
  2.2 Currents in a Metal
3 Scattering of Electrons in Bands
4 The Boltzmann Equation
  4.1 Relaxation Time Approximation
  4.2 Linear Boltzmann Equation
5 Conductivity of Metals
  5.1 Drude Approximation
  5.2 Conductivity Using the Linear Boltzmann Equation
6 Thermoelectric Effects
  6.1 Linearized Boltzmann Equation
  6.2 Electric Current
  6.3 Thermal and Energy Currents
  6.4 Seebeck Effect, Thermocouples
  6.5 Peltier Effect
7 The Wiedemann-Franz Law (for good metals)
As we have seen, transport in insulators (of heat mostly) is dominated by phonons. The thermal conductivity of some insulators can be quite large (cf. diamond). However, most insulators have small and uninteresting transport properties. Metals, on the other hand, with transport dominated by electrons, generally conduct both heat and charge quite well. In addition, the ability to conduct thermal, charge, and entropy currents leads to interesting phenomena such as thermoelectric effects.

1 Quasiparticle Propagation

In order to understand the transport of metals, we must understand how the metallic state propagates electrons: i.e., we must know the electronic dispersion $\omega(\mathbf{k})$. The dispersion is obtained from band structure, $E(\mathbf{k}) = \hbar\omega(\mathbf{k})$, in which the metal is approximated as an almost free gas of electrons interacting weakly with a lattice potential $V(\mathbf{r})$, but not with each other.

Figure 1: The dispersion is obtained from band structure $E(\mathbf{k}) = \hbar\omega(\mathbf{k})$ in which the metal is approximated as an almost free gas of electrons interacting weakly with a lattice potential $V(\mathbf{r})$, but not with each other.
The Bloch states of this system
\[ \phi_k(\mathbf{r}) = U_k(\mathbf{r}) e^{i\mathbf{k}\cdot\mathbf{r}}, \qquad U_k(\mathbf{r}) = U_k(\mathbf{r} + \mathbf{r}_n) \tag{1} \]
may be approximated as plane waves, $U_k(\mathbf{r}) = U_k$. Then the state describing a single quasiparticle may be expanded as
\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, U(k)\, e^{i(k \cdot x - \omega(k) t)} \tag{2} \]
If $U(k) = c\,\delta(k - k_0)$ then $\psi(x, t) \propto e^{i(k_0 \cdot x - \omega t)}$ and the quasiparticle is delocalized. On the other hand, if $U(k) = \text{constant}$ then $\psi(x, 0) \propto \delta(x)$ and the quasiparticle is perfectly localized. This is an expression of the uncertainty principle
\[ \Delta k\, \Delta x \sim 1 \qquad \text{or} \qquad \Delta p\, \Delta x \sim \hbar , \tag{3} \]
so that we cannot know both the momentum and location of the quasiparticle to arbitrary precision.

Figure 2: If $U(k) = c\,\delta(k - k_0)$ then $\psi(x, t) \propto e^{i(k_0 \cdot x - \omega t)}$ and the quasiparticle is delocalized. (Plot of $\mathrm{Re}\,\psi(x)$.)

In general $\omega(k) \neq \text{constant}$, so the different components propagate with different phase velocities, and the quasiparticle spreads as it propagates. This is also the reason why the group velocity of the quasiparticle is not the phase velocity. Consider the propagation of $\psi(x, t)$, which at $t = 0$ is
\[ \psi(x, 0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, U(k)\, e^{ik \cdot x} . \tag{4} \]
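The statements above, that a packet built from Eq. (2) spreads and travels at the group velocity rather than the phase velocity, can be illustrated numerically. The sketch below assumes a Gaussian $U(k)$ centered on $k_0$ and the free-particle dispersion $\omega(k) = k^2/2$ (units with $\hbar = m = 1$); both are illustrative choices, not taken from the notes. For this dispersion the group velocity at $k_0$ is $k_0$, while the phase velocity is $k_0/2$.

```python
import numpy as np


def wave_packet(x, t, k0=5.0, sigma_k=0.5, n_k=401):
    """psi(x, t) from Eq. (2) by direct summation over k, with Gaussian U(k).

    Assumes omega(k) = k**2 / 2 (hbar = m = 1); this is an illustrative choice.
    """
    k = np.linspace(k0 - 8 * sigma_k, k0 + 8 * sigma_k, n_k)
    dk = k[1] - k[0]
    U = np.exp(-((k - k0) ** 2) / (2 * sigma_k**2))
    omega = 0.5 * k**2
    phase = np.exp(1j * (np.outer(x, k) - omega * t))  # shape (len(x), n_k)
    return (phase @ U) * dk / np.sqrt(2 * np.pi)


def packet_center(x, psi):
    """Mean position weighted by |psi|^2."""
    prob = np.abs(psi) ** 2
    return float(np.sum(x * prob) / np.sum(prob))


def packet_width(x, psi):
    """Root-mean-square width of |psi|^2."""
    prob = np.abs(psi) ** 2
    center = np.sum(x * prob) / np.sum(prob)
    return float(np.sqrt(np.sum((x - center) ** 2 * prob) / np.sum(prob)))
```

On a grid such as `x = np.linspace(-20, 60, 801)`, the packet center at `t = 4` should sit near `k0 * t = 20` (the group-velocity prediction) rather than near `10` (the phase-velocity prediction), and the width at `t = 4` should exceed the width at `t = 0`.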
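Suppose that $U(k)$ has a well-defined dominant peak at $k_0$ (see Fig. 3), so that
\[ \omega(k) \simeq \omega(k_0) + \nabla_k \omega(k)|_{k_0} \cdot (k - k_0) \tag{5} \]
then, with $\omega_0 \equiv \omega(k_0)$,
\[ \psi(x, t) \simeq \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, U(k)\, e^{i \left( k \cdot x - \omega_0 t - \nabla_k \omega(k)|_{k_0} \cdot (k - k_0) t \right)} \tag{6} \]

Figure 3: The distribution of plane wave states that make up a quasiparticle ($U(k)$ versus $k$, peaked at $k_0$).

\[ \psi(x, t) \simeq \frac{e^{i \left( k_0 \cdot \nabla_k \omega(k)|_{k_0} - \omega_0 \right) t}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, U(k)\, e^{ik \cdot \left( x - \nabla_k \omega(k)|_{k_0} t \right)} \simeq \psi \left( x - \nabla_k \omega(k)|_{k_0} t,\, 0 \right) e^{i \left( k_0 \cdot \nabla_k \omega(k)|_{k_0} - \omega_0 \right) t} \tag{7} \]
I.e., aside from a phase factor, the quasiparticle travels along with velocity $\nabla_k \omega(k)|_{k_0} = v_g$. (If we had considered higher order terms, we would have seen that the quasiparticle distorts as it propagates; cf. Jackson p. 305.) In general,
\[ v_g = \nabla_k \omega(k) = \frac{1}{\hbar} \nabla_k E(k) \tag{8} \]

1.1 Quasiparticle Equation of Motion and Effective Mass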
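We are ultimately interested in the transport, i.e. the response of this quasiparticle to an external electric field $\mathbf{E}$, from which it gains energy
\[ \delta E = -e \mathbf{E} \cdot \mathbf{v}\, \delta t \qquad (\text{i.e. force} \times \text{distance}) \tag{9} \]
This energy is reflected by the quasiparticle ascending to higher energy $k$ states:
\[ \delta E = \nabla_k E(k) \cdot \delta \mathbf{k} = \hbar \mathbf{v} \cdot \delta \mathbf{k} \tag{10} \]
So
\[ \hbar\, \delta \mathbf{k} = -e \mathbf{E}\, \delta t \tag{11} \]
\[ \hbar \dot{\mathbf{k}} = -e \mathbf{E} \qquad \text{E.O.M.} \tag{12} \]
This equation of motion is identical to that for free electrons (cf. Jackson); however, it may be shown to be applicable to general Bloch states provided that $\mathbf{E}$ is smaller than the atomic fields and does not vary in space or time too fast. We may put this EOM in a more familiar form:
\[ \dot{v}_i = \frac{1}{\hbar} \frac{d}{dt} \left( \nabla_k E(k) \right)_i = \frac{1}{\hbar} \sum_j \frac{\partial^2 E}{\partial k_i \partial k_j} \frac{dk_j}{dt} = \frac{1}{\hbar^2} \sum_j \frac{\partial^2 E}{\partial k_i \partial k_j} (-e E_j) \tag{13} \]
This will have the form $F = ma$ if we define the mass tensor
\[ \left( \frac{1}{m^*} \right)_{ij} = \left( \frac{1}{m^*} \right)_{ji} = \frac{1}{\hbar^2} \frac{\partial^2 E}{\partial k_i \partial k_j} = \text{symmetric and real} \tag{14} \]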
which may be diagonalized to define three principal axes. In the simple cubic case, the matrix will have the same element along each principal direction and
\[ m^* = \frac{\hbar^2}{\dfrac{d^2 E}{dk^2}} . \tag{15} \]
In this way the effective mass of electrons on a lattice can vary strongly: the larger $\frac{d^2 E}{dk^2}$ is, the smaller $\frac{m^*}{m}$ is. Consider the simple 1-d case (see Fig. 4).

Figure 4: (Sketch of a 1-d band $E(k)$ for $-\pi/a \le k \le \pi/a$, marking the regions where $\frac{d^2 E}{dk^2} > 0$, $= 0$, and $< 0$, together with the corresponding $m^*$ versus $k$.)
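The 1-d behaviour of Eq. (15) can be made concrete with a model band. The sketch below assumes a nearest-neighbour tight-binding dispersion $E(k) = -2t\cos(ka)$, which is not taken from the notes, and estimates the curvature with a central finite difference; `t_hop`, `a`, and the function names are hypothetical.

```python
import numpy as np


def band_energy(k, t_hop=1.0, a=1.0):
    """Assumed 1D tight-binding band E(k) = -2 t cos(k a); illustrative only."""
    return -2.0 * t_hop * np.cos(k * a)


def effective_mass(k, hbar=1.0, dk=1e-4, t_hop=1.0, a=1.0):
    """m* = hbar^2 / (d^2E/dk^2), Eq. (15), with the curvature from finite differences."""
    curvature = (
        band_energy(k + dk, t_hop, a)
        - 2.0 * band_energy(k, t_hop, a)
        + band_energy(k - dk, t_hop, a)
    ) / dk**2
    return hbar**2 / curvature
```

For this band the mass is positive at the bottom ($k = 0$, $m^* = \hbar^2 / 2ta^2$), negative at the top ($k = \pi/a$), and grows without bound near the inflection point $k = \pi/2a$ where the curvature vanishes.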
2 Currents in Bands

Our previous discussion of the motion of an electron (or a quasiparticle) in a metal under the influence of an applied field $\mathbf{E}$ ignored the presence of other electrons and the Pauli principle.

2.1 Current in an Insulator

The Pauli principle ensures that a full band of states is insulating. Consider the electric current due to the states in $d^3k$:
\[ d\mathbf{J} = -e \mathbf{v}(\mathbf{k}) \left( \frac{L}{2\pi} \right)^3 d^3k \tag{16} \]
The current density is then
\[ d\mathbf{j} = \frac{-e}{(2\pi)^3} \frac{1}{\hbar} \nabla_k E(\mathbf{k})\, d^3k \tag{17} \]
i.e., different occupied states in the Brillouin zone contribute differently to the current. The net current density $\mathbf{j}$ is then the integral over all occupied states, which for our full band is the integral over the first Brillouin zone:
\[ \mathbf{j} = -\frac{e}{8\pi^3 \hbar} \int_{\text{1st B.Z.}} \nabla_k E(\mathbf{k})\, d^3k . \tag{18} \]
Since the first Brillouin zone is symmetric, for each $\mathbf{k}$ vector in the integral there is also $-\mathbf{k}$.
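Figure 5: Different occupied states make different contributions to the current density. (Sketch of the first Brillouin zone and the Fermi surface.)

Now consider a lattice with inversion symmetry $\mathbf{k} \to -\mathbf{k}$, so that $E(\mathbf{k}) = E(-\mathbf{k})$. Alternatively, recall that time-reversal invariance requires that
\[ E_\uparrow(\mathbf{k}) = E_\downarrow(-\mathbf{k}) , \tag{19} \]
but since $E_\uparrow(\mathbf{k}) = E_\downarrow(\mathbf{k})$ due to spin degeneracy, we must have $E(\mathbf{k}) = E(-\mathbf{k})$. Thus
\[ \mathbf{v}_{-k} = \frac{1}{\hbar} \nabla_{-k} E(-\mathbf{k}) = -\frac{1}{\hbar} \nabla_k E(\mathbf{k}) = -\mathbf{v}_k ! \tag{20} \]
i.e., for the insulator
\[ \mathbf{j} = \frac{-e}{8\pi^3 \hbar} \int_{\text{1st B.Z.}} d^3k\, \nabla_k E(\mathbf{k}) \equiv 0 \tag{21} \]
Now imagine that the band is not full (see Fig. 6, left). Then, if we apply an external field $\mathbf{E}$, so that
\[ \dot{\mathbf{k}} = -\frac{e \mathbf{E}}{\hbar} \qquad (e > 0) \tag{22} \]
the electrons will redistribute as the Fermi surface shifts (see Fig. 6, right).

Figure 6: The Fermi sea of a partially filled band will shift under the influence of an applied field $\mathbf{E}$. This destroys the inversion symmetry of the Fermi sea, causing a net current. (Panels in the $k_x$-$k_y$ plane showing occupied and empty states, without and with a field $E_x$.)

\[ \mathbf{j} = \frac{-e}{8\pi^3} \int_{\mathbf{k}\ \text{occupied}} \mathbf{v}(\mathbf{k})\, d^3k = \frac{-e}{8\pi^3} \int_{\text{1st B.Z.}} d^3k\, \mathbf{v}(\mathbf{k}) - \frac{-e}{8\pi^3} \int_{\text{empty}} d^3k\, \mathbf{v}(\mathbf{k}) = \frac{e}{8\pi^3} \int_{\text{empty}} d^3k\, \mathbf{v}(\mathbf{k}) \tag{23} \]
Thus the current may be formally described as a current of positively charged particles (holes) assigned to the unoccupied states in the band.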
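The cancellation in Eq. (21) and the hole picture of Eq. (23) can be demonstrated on a discretized 1-d band. The sketch below assumes the tight-binding dispersion $E(k) = -2t\cos(ka)$ (an illustrative model, not from the notes) and represents the current as charge times the sum of velocities over the chosen states, in arbitrary units.

```python
import numpy as np


def velocities(n_k=200, t_hop=1.0, a=1.0, hbar=1.0):
    """Uniform k-grid over the first Brillouin zone and v(k) = (1/hbar) dE/dk.

    Assumes E(k) = -2 t cos(k a), so v(k) = 2 t a sin(k a) / hbar.
    """
    k = 2 * np.pi * np.arange(-n_k // 2, n_k // 2) / (n_k * a)
    v = 2.0 * t_hop * a * np.sin(k * a) / hbar
    return k, v


def current(v, states, charge=-1.0):
    """Current (arbitrary units) carried by the selected states: charge * sum of v."""
    return charge * float(np.sum(v[states]))
```

With `k, v = velocities()`, a full band (`np.ones_like(k, dtype=bool)`) carries no current, a symmetric Fermi sea (`np.abs(k) < np.pi / 2`) carries none either, and a shifted sea such as `occupied = np.abs(k - 0.3) < np.pi / 2` carries a current equal to that of positive charges placed in the empty states, `current(v, ~occupied, charge=+1.0)`.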
2.2 Currents in a Metal

Now imagine that the band is almost full.

Figure 7: Left: A nearly full simple band. States near the Fermi surface that can be thermally excited have negative mass. Right: Density of states $D(E)$ with holes at the top, which have positive charge and mass.

Near the top of the band $\frac{d^2 E}{dk^2} < 0$, so the mass is negative, and the dispersion at the top of the band is approximately parabolic, so
\[ E(k) = E_0 - \frac{\hbar^2 k^2}{2 |m^*|} \qquad (k = \text{deviation from top!}) \tag{24} \]
or
\[ \dot{\mathbf{v}} = \frac{1}{\hbar} \frac{d}{dt} \nabla_k E_k = -\frac{\hbar \dot{\mathbf{k}}}{|m^*|} = \frac{e \mathbf{E}}{|m^*|} \tag{25} \]
This is the EOM of a positively charged particle with positive mass in an electric field $\mathbf{E}$. I.e., holes at the top of the band have positive mass.
We have just shown that a material with full bands is an insulator (see Fig. 8, left). I.e., it carries no current, at least at $T = 0$ and for a small field $\mathbf{E}$. However, we ignored the presence of other bands. If there is a conduction band, then for $T \neq 0$ and a reasonably small gap $E_g$ there will be conductivity due to a small number of thermally excited holes and electrons, $n \sim \exp(-E_g / k_B T)$ (see Fig. 8, right). Thus perhaps a better definition of an insulator is a material for which the conductivity increases with $T$.

Figure 8: An insulator forms when the Fermi energy $E_f$ falls in a gap of $D(E)$. As the temperature is raised, electrons are promoted over the gap $E_g$ from the valence band to the conduction band, and both the electrons and holes contribute to the conductivity, which increases with temperature. (Panels: $D(E)$ at $T = 0$, with full and empty bands, and at $T \neq 0$, with conduction-band electrons and valence-band holes.)
3 Scattering of Electrons in Bands

According to the EOM for a hole at the top of a band,
\[ \dot{\mathbf{v}} = \frac{e \mathbf{E}}{|m^*|} \tag{26} \]
as long as $\mathbf{E}$ is finite, these holes will continue to accelerate and $\mathbf{j}$ will increase accordingly. Of course, this does not happen. Rather the material simply heats up (i.e., has a finite $R$). In addition, if $\mathbf{E}$ is returned to zero, then $\mathbf{j}$ likewise returns to zero. Why? In 1900 Drude assumed that the electrons scatter from the lattice, yielding resistivity.

Figure 9: Drude thought that electrons scatter off the lattice yielding resistivity. Bloch showed this to be wrong.

Of course, as we have seen, the quasiparticle state may be defined from a sum over Bloch waves (described by $\mathbf{k}$), each of which is a stationary state and describes the unperturbed propagation of electrons. Thus a perfect lattice yields no resistivity. We can get resistivity in two ways.

1. Deviations from a perfect lattice
   (a) Defects (see Fig. 10a)
   (b) Lattice vibrations = phonons (see Fig. 10b)
2. Electron-electron interactions (see Fig. 11)

Figure 10: Electrons do scatter from defects in the lattice or lattice vibrations. They contribute to the resistivity, with the phonon contribution increasing with temperature, and the defect contribution more-or-less constant. (Panel (a): an electron scattering from a defect. Panel (b): an electron scattering from $\mathbf{k}$ to $\mathbf{k} + \mathbf{q}$ by emitting a phonon $-\mathbf{q}$ or absorbing a phonon $\mathbf{q}$, with $E(\mathbf{k}) - E(\mathbf{k} + \mathbf{q}) = \hbar\omega(\mathbf{q})$; the processes $c_{k+q}^\dagger c_k a_{-q}^\dagger$ or $c_{k+q}^\dagger c_k a_q$ combine into $c_{k+q}^\dagger c_k (a_q + a_{-q}^\dagger)$.)
Due to the strength of the electron-electron interaction and the density of electrons, (2) should dominate. However, it is easy to show, using the Pauli principle, that the effect of (2) is quite often negligible, so that we may return to regarding the pure electronic system as a (perhaps renormalized) non-interacting Fermi gas. According to momentum and energy conservation (Fig. 11),
\[ E_1 + E_2 = E_3 + E_4, \qquad \mathbf{k}_1 + \mathbf{k}_2 = \mathbf{k}_3 + \mathbf{k}_4 . \tag{27} \]
(Of course, momentum conservation is only up to a reciprocal lattice vector $\mathbf{G}$, $\mathbf{k}_1 + \mathbf{k}_2 = \mathbf{k}_3 + \mathbf{k}_4 + \mathbf{G}$; however, as with phonon conductivity, these processes with finite $\mathbf{G}$ involve much higher energies, and may be neglected near $T = 0$.)

Figure 11: $E_1 + E_2 = E_3 + E_4$ and $\mathbf{k}_1 + \mathbf{k}_2 = \mathbf{k}_3 + \mathbf{k}_4$. Electron-electron interactions also contribute to the resistivity (from simple order of magnitude arguments based on relative strengths of the interactions, their contribution should dominate, but due to the Pauli principle, it does not).

Furthermore, since all states up to $E_F$ are occupied, $E_3, E_4 > E_F$! Suppose $E_1$ is (thermally) excited, so $E_1 > E_F$, and it collides with an occupied state $E_2 < E_F$. Then
\[ (E_1 - E_F) + (E_2 - E_F) = (E_3 - E_F) + (E_4 - E_F) > 0 \tag{28} \]
\[ \epsilon_1 + \epsilon_2 = \epsilon_3 + \epsilon_4 > 0, \qquad \epsilon_3, \epsilon_4 > 0 \tag{29} \]
or $\epsilon_1 + \epsilon_2 > 0$. However, since $\epsilon_2 < 0$, if $\epsilon_1$ is small, then $|\epsilon_2| \le \epsilon_1$ is also small, so only a fraction $\frac{|\epsilon_2|}{E_F} \le \frac{\epsilon_1}{E_F}$ of the states may scatter with the state $\mathbf{k}_1$ while conserving energy and obeying the Pauli principle, thus restricting $\epsilon_2$ to a narrow shell of width $\epsilon_1$ around the Fermi surface. Now consider the restrictions placed on the states 3 and 4 by momentum conservation:
\[ \mathbf{k}_1 - \mathbf{k}_3 = \mathbf{k}_4 - \mathbf{k}_2 \tag{30} \]
I.e., $\mathbf{k}_1 - \mathbf{k}_3$ and $\mathbf{k}_4 - \mathbf{k}_2$ must remain parallel, and since $\mathbf{k}_1$ is fixed, this restriction on the final states further reduces the scattering probability by a factor of $\frac{\epsilon_1}{E_F}$.

Figure 12: Momentum and energy conservation severely restrict the states that an electron can scatter with and into. (Sketch of the states 1 to 4 relative to $E_F$ in $D(E)$ and in the $k_x$-$k_y$ plane, with $\mathbf{k}_1 - \mathbf{k}_3$ parallel to $\mathbf{k}_4 - \mathbf{k}_2$.)
Thus the total scattering cross section $\sigma$ is reduced from the classical result $\sigma_0$ by $\left( \frac{\epsilon_1}{E_F} \right)^2$. If the initial excitation $\epsilon_1$ is due to thermal effects, then $\epsilon_1 \sim k_B T$ and
\[ \frac{\sigma}{\sigma_0} \sim \left( \frac{k_B T}{E_F} \right)^2 \ll 1 ! \tag{31} \]
The total scattering due to electron-electron repulsion is very small. Therefore, unless $E_F$ can be made small, the dominant contribution to a material's resistivity is due to defects and phonons.
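To get a feeling for the size of the suppression factor in Eq. (31), it can be evaluated for representative numbers. In the sketch below, the Fermi energy of about 7 eV (a typical textbook value for copper) and room temperature are illustrative assumptions, not values quoted in the notes.

```python
K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def ee_suppression(temperature_kelvin, fermi_energy_ev):
    """Pauli suppression factor (k_B T / E_F)^2 of Eq. (31)."""
    return (K_B_EV * temperature_kelvin / fermi_energy_ev) ** 2
```

`ee_suppression(300.0, 7.0)` is of order $10^{-5}$, which is why electron-electron scattering is usually negligible in a good metal at room temperature, and it falls as $T^2$ on cooling.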
4 The Boltzmann Equation

The nonequilibrium (but steady-state) situation of an electronic current in a metal driven by an external field is described by the Boltzmann equation.

Figure 13: Electronic transport due to an applied field $\mathbf{E}$ is limited by inelastic collisions with lattice defects and phonons. (Sketch: a sample of length $L$ and volume $V$ carrying a current $\mathbf{j}$ of electrons past a defect.)

This differs from the situation of a system in equilibrium in that a constant deterministic current differs from random particle number fluctuations due to coupling to a heat and particle bath. Away from equilibrium ($\mathbf{E} \neq 0$) the distribution function may depend upon $\mathbf{r}$ and $t$ as well as $\mathbf{k}$ (or $E(\mathbf{k})$). Nevertheless, when $\mathbf{E} = 0$ we expect the distribution function of the particles in $V$ to return to
\[ f_0(\mathbf{k}) = f(\mathbf{r}, \mathbf{k}, t)|_{\mathbf{E} = 0} = \frac{1}{e^{\beta(E(\mathbf{k}) - E_F)} + 1} \tag{32} \]
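To derive a form for $f(\mathbf{r}, \mathbf{k}, t)$, we will consider length scales larger than atomic distances (of order Å), but smaller than distances over which the field changes significantly. In this way the system is considered essentially homogeneous, with any inhomogeneity driven by the external field. Now imagine that there is no scattering (no defects, phonons); then, since electrons are conserved,
\[ f(\mathbf{r}, \mathbf{k}, t) = f \left( \mathbf{r} - \mathbf{v}\, dt,\; \mathbf{k} + \frac{e \mathbf{E}}{\hbar} dt,\; t - dt \right) \tag{33} \]

Figure 14: In lieu of scattering, particles flow without decay. (Sketch: states at $\mathbf{r} - \frac{d\mathbf{r}}{dt} dt$, $\mathbf{k} - \frac{d\mathbf{k}}{dt} dt$ at time $t - dt$ flow to $\mathbf{r}$, $\mathbf{k}$ at time $t$, with $\hbar \dot{\mathbf{k}} = -e \mathbf{E}$.)

Now consider defects and phonons (see Fig. 15), which can scatter a quasiparticle in one state at $\mathbf{r} - \mathbf{v}\, dt$ and time $t - dt$ to another at $\mathbf{r}$ and time $t$, so that $f(\mathbf{r}, \mathbf{k}, t) \neq f \left( \mathbf{r} - \mathbf{v}\, dt,\; \mathbf{k} + \frac{e \mathbf{E}}{\hbar} dt,\; t - dt \right)$. We will express this scattering by adding a term.

Figure 15: Scattering leads to quasiparticle decay.

\[ f(\mathbf{r}, \mathbf{k}, t) = f \left( \mathbf{r} - \mathbf{v}\, dt,\; \mathbf{k} + \frac{e \mathbf{E}}{\hbar} dt,\; t - dt \right) + \left( \frac{\partial f}{\partial t} \right)_S dt \tag{34} \]
For small $dt$ we may expand
\[ f(\mathbf{r}, \mathbf{k}, t) = f(\mathbf{r}, \mathbf{k}, t) - \mathbf{v} \cdot \nabla_r f\, dt + \frac{e \mathbf{E}}{\hbar} \cdot \nabla_k f\, dt - \frac{\partial f}{\partial t} dt + \left( \frac{\partial f}{\partial t} \right)_S dt \tag{35} \]
or
\[ \frac{\partial f}{\partial t} + \mathbf{v} \cdot \nabla_r f - \frac{e}{\hbar} \mathbf{E} \cdot \nabla_k f = \left( \frac{\partial f}{\partial t} \right)_S \qquad \text{Boltzmann Equation} \tag{36} \]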
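If the phonon and defect perturbations are small, time-independent, and described by $H$, then the scattering rate from a Bloch state $\mathbf{k}$ to $\mathbf{k}'$ (occupied to unoccupied) is $w_{k'k} = \frac{2\pi}{\hbar} \left| \langle \mathbf{k}' | H | \mathbf{k} \rangle \right|^2$. Then
\[ \left( \frac{\partial f(\mathbf{k})}{\partial t} \right)_S = \left( \frac{L}{2\pi} \right)^3 \int d^3k' \left\{ (1 - f(\mathbf{k}))\, w_{kk'} f(\mathbf{k}') - (1 - f(\mathbf{k}'))\, w_{k'k} f(\mathbf{k}) \right\} \tag{37} \]
Needless to say, it is extremely difficult to solve these last two coupled equations.

4.1 Relaxation Time Approximation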
As a result we make a series of approximations and ansatz. The first of these is the relaxation time approximation: the rate at which a system returns to equilibrium $f_0$ is proportional to its deviation from equilibrium,
\[ \left( \frac{\partial f}{\partial t} \right)_S = -\frac{f(\mathbf{k}) - f_0(\mathbf{k})}{\tau(\mathbf{k})} . \tag{38} \]
Here $\tau(\mathbf{k})$ is called the relaxation time (for a spatially inhomogeneous system $\tau$ will also depend upon $\mathbf{r}$). I.e., we make the assumption that scattering merely acts to drive a nonequilibrium system back to equilibrium. If $\mathbf{E} \neq 0$ for $t < 0$ and then at $t = 0$ it is switched off, so that for $t > 0$ $\mathbf{E} = 0$, then for a homogeneous system
\[ \frac{\partial f}{\partial t} = \left( \frac{\partial f}{\partial t} \right)_S = -\frac{f - f_0}{\tau} \tag{39} \]
so that
\[ f - f_0 = \left( f(t = 0) - f_0 \right) e^{-t/\tau} \tag{40} \]
i.e., $\tau$ is the time constant with which the system returns to equilibrium. Now consider the steady-state situation of a metallic system in a time-independent external field $\mathbf{E} = E \hat{x}$. Then
\[ \frac{\partial f}{\partial t} = 0 \tag{41} \]
Furthermore, since the system is homogeneous,
\[ \nabla_r f = 0 \tag{42} \]
then
\[ -\frac{e}{\hbar} \mathbf{E} \cdot \nabla_k f(\mathbf{k}) = \left( \frac{\partial f}{\partial t} \right)_S = -\frac{f(\mathbf{k}) - f_0(\mathbf{k})}{\tau(\mathbf{k})} \tag{43} \]
i.e.
\[ f(\mathbf{k}) = f_0(\mathbf{k}) + \frac{e}{\hbar} \tau(\mathbf{k})\, \mathbf{E} \cdot \nabla_k f(\mathbf{k}) \tag{44} \]
which may be solved iteratively, generating a power series in $\mathbf{E}$ (or $E_x$).

4.2 Linear Boltzmann Equation
For small $\mathbf{E}$ (Ohmic conditions)
\[ f(\mathbf{k}) \simeq f_0(\mathbf{k}) + \frac{e}{\hbar} \tau(\mathbf{k})\, \mathbf{E} \cdot \nabla_k f_0(\mathbf{k}) \qquad \text{linear Boltzmann Eqn.} \tag{45} \]
I.e., the lowest order Taylor series of $f(\mathbf{k})$. Or equivalently, if $\mathbf{E} = E_x \hat{x}$,
\[ f(\mathbf{k}) \simeq f_0 \left( \mathbf{k} + \frac{e}{\hbar} \tau(\mathbf{k}) \mathbf{E} \right) \tag{46} \]
I.e., the effect is to shift the Fermi surface from its equilibrium position by an amount
\[ \delta k_x = -e\tau \frac{E_x}{\hbar} \tag{47} \]

Figure 16: According to the linear Boltzmann equation, the effect of a field $E_x$ is to shift the Fermi surface by $\delta k_x = -e\tau E_x / \hbar$. (Sketch of the shifted Fermi surface in the $k_x$-$k_y$ plane of the first Brillouin zone.)

From the discussion in Sec.??, it is clear that a finite current results. Interesting! Note that elastic scattering, $|\mathbf{k}| = |\mathbf{k}'|$, cannot restore equilibrium. Rather it would only cause the Fermi surface to expand. Inelastic scattering (i.e. from phonons) is needed to explain relaxation.

Figure 17: Note that elastic scattering $|\mathbf{k}| = |\mathbf{k}'|$ cannot restore equilibrium. Rather it would only cause the Fermi surface to expand. (Sketch: states A and B on the shifted Fermi surface.)
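The relaxation-time picture of Sections 4.1 and 4.2 can be illustrated with two small calculations: the exponential decay of Eq. (40), compared against a direct time-stepping of Eq. (39), and the size of the Fermi-surface shift of Eq. (47). The relaxation time of $10^{-14}$ s and the field of 1 V/m used in the usage note are illustrative assumptions, not values from the notes.

```python
import numpy as np

HBAR = 1.054571817e-34      # J s
E_CHARGE = 1.602176634e-19  # C


def relax(f_start, f_eq, tau, t):
    """Closed-form solution of df/dt = -(f - f_eq)/tau (Eq. 40)."""
    return f_eq + (f_start - f_eq) * np.exp(-t / tau)


def relax_euler(f_start, f_eq, tau, t_end, n_steps=10000):
    """Explicit Euler integration of Eq. (39), for comparison with relax()."""
    dt = t_end / n_steps
    f = f_start
    for _ in range(n_steps):
        f += -dt * (f - f_eq) / tau
    return f


def fermi_surface_shift(tau, field_x):
    """delta k_x = -e tau E_x / hbar (Eq. 47), in 1/m for SI inputs."""
    return -E_CHARGE * tau * field_x / HBAR
```

`relax_euler(1.0, 0.0, 1.0, 3.0)` should agree closely with `relax(1.0, 0.0, 1.0, 3.0)`. For `tau = 1e-14` s and a field of 1 V/m, `fermi_surface_shift` gives a shift of roughly 15 m$^{-1}$, many orders of magnitude smaller than a typical metallic Fermi wavevector of order $10^{10}$ m$^{-1}$, which is why the linearization of Eq. (45) is so well justified.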
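5 Conductivity of Metals

5.1 Drude Approximation

As mentioned above, Drude calculated the conductivity of metals assuming that

• all free electrons participate, and
• electron-lattice scattering yields a scattering rate $1/\tau$.

Under these assumptions, the EOM is
\[ m \dot{\mathbf{v}} + \frac{m}{\tau} \left( \mathbf{v} - \mathbf{v}_{\text{therm}} \right) = -e \mathbf{E} \tag{48} \]
where $\mathbf{v} - \mathbf{v}_{\text{therm}} = \mathbf{v}_D$ is the drift velocity, and $\frac{m}{\tau} \mathbf{v}_D$ is friction. When $\mathbf{E} = 0$, we again have an exponential decay of $\mathbf{v}$, so $\tau$ is again the relaxation time. In steady state, $\dot{\mathbf{v}} = 0$ and
\[ \mathbf{v}_D = \frac{-e\tau \mathbf{E}}{m} \tag{49} \]
so that
\[ \mathbf{j} = -e n \mathbf{v}_D = \frac{n e^2 \tau}{m} \mathbf{E} \tag{50} \]
or, defining $\mathbf{j} = \sigma \mathbf{E}$ and $\sigma = \mu n e$,
\[ \sigma = \frac{n e^2 \tau}{m}, \qquad \mu = \frac{e\tau}{m} \tag{51} \]

5.2 Conductivity Using the Linear Boltzmann Equation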
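Of course, this is wrong, since not all free electrons participate in $\sigma$, due to the Pauli principle. A more careful derivation, using the Boltzmann equation, is required. Again, the relationship between $\mathbf{j}$ and $f(\mathbf{k})$ is
\[ \mathbf{j} = \frac{-e}{8\pi^3} \int d^3k\, \mathbf{v}(\mathbf{k}) f(\mathbf{k}) \tag{52} \]
\[ \simeq \frac{-e}{8\pi^3} \int d^3k\, \mathbf{v}(\mathbf{k}) \left[ f_0(\mathbf{k}) + \frac{e\tau(\mathbf{k})}{\hbar} E_x \frac{\partial f_0}{\partial k_x} \right] \tag{53} \]

Figure 18: To calculate the conductivity, we apply a field in the x-direction only, $\mathbf{E} = E \hat{x}$, and use the linearized Boltzmann Eqn.

For an isotropic material $j_z = j_y = 0$, and the equation becomes scalar. Furthermore, again
\[ \int \mathbf{v}(\mathbf{k}) f_0(\mathbf{k})\, d^3k = 0 \tag{54} \]
since $\mathbf{v}_{-k} = -\mathbf{v}_k$. Then, as
\[ \frac{\partial f_0}{\partial k_x} = \frac{\partial f_0}{\partial E} \frac{\partial E}{\partial k_x} = \frac{\partial f_0}{\partial E} \hbar v_x \tag{55} \]
\[ j_x \simeq -\frac{e^2}{8\pi^3} E_x \int d^3k\, v_x^2\, \tau(\mathbf{k}) \frac{\partial f_0}{\partial E} \tag{56} \]
Then, as $\frac{\partial f_0}{\partial E} \simeq -\delta(E - E_F)$ for $k_B T \ll E_F$, the integral in $\mathbf{k}$ is confined to the surface of constant $E$, and
\[ d^3k = dS_E\, dk_\perp = dS_E \frac{dE}{\hbar v(\mathbf{k})} \tag{57} \]

Figure 19: (Sketch of a constant energy surface in $k$-space, with $k_\perp$ along $\nabla_k E$ and $dk_\perp = \frac{dE}{|\nabla_k E|}$.)

then
\[ \sigma = \frac{j_x}{E_x} = \frac{e^2}{8\pi^3 \hbar} \int dS_E\, dE\, \frac{v_x^2(\mathbf{k})}{v(\mathbf{k})} \tau(\mathbf{k})\, \delta(E - E_F) \tag{58} \]
\[ = \frac{e^2}{8\pi^3 \hbar} \int_{E = E_F} dS_E\, \frac{v_x^2(\mathbf{k})}{v(\mathbf{k})} \tau(\mathbf{k}) . \tag{59} \]
As expected, only the properties of the electrons on the Fermi surface are relevant.

Figure 20: Only the electrons near the Fermi surface participate in the transport. Far below the Fermi surface, pairs of states $\mathbf{k}$ and $-\mathbf{k}$ are occupied. Their contribution to the conductivity cancels, leaving contributions from only the occupied states near the Fermi surface. (Panels: Fermi sea for $\mathbf{E} = 0$ and $\mathbf{E} \neq 0$.)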
We can now calculate the conductivity of a metal by averaging $\frac{v_x^2}{v} \tau(\mathbf{k})$ over the Fermi surface. Consider a simple system with a spherical Fermi surface; then
\[ \int dS_E\, \frac{\tau(\mathbf{k}) v_x^2(\mathbf{k})}{v(\mathbf{k})} = \frac{4\pi}{3} k_F^2\, \tau(E_F)\, v(E_F) = \frac{4\pi}{3} k_F^2\, \tau(E_F) \frac{\hbar k_F}{m^*} \tag{60} \]
or
\[ \sigma = \frac{e^2}{8\pi^3 \hbar} \frac{4\pi}{3} k_F^2\, \tau(E_F) \frac{\hbar k_F}{m^*} \tag{61} \]
then, as $k_B T \ll E_F$, $N = 2 \frac{\frac{4}{3} \pi k_F^3}{\left( \frac{2\pi}{L} \right)^3} \Rightarrow k_F^3 = 3\pi^2 n$, we find that
\[ \sigma = \frac{n e^2 \tau(E_F)}{m^*}, \qquad \mu = \frac{e\tau(E_F)}{m^*} \tag{62} \]
For semiconductors, where $n$ is $T$ dependent, and for more realistic materials, where the Fermi surface is not a sphere, the formula is more complicated. However, for metals the temperature dependence of $\sigma$ is dominated by that of $\tau$, i.e., by the temperature dependence of phonons. Before we can calculate $\sigma(T)$, we must first disentangle the phonon from the defect scattering. Assuming that the two mechanisms are independent, their rates must add:
\[ \frac{1}{\tau} = \frac{1}{\tau_{\text{ph}}} + \frac{1}{\tau_{\text{defect}}} \tag{63} \]
i.e.
\[ \rho = \rho_{\text{ph}} + \rho_{\text{defect}} \qquad \text{Matthiessen's Rule} \tag{64} \]
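The conductivity formula $\sigma = n e^2 \tau / m^*$ and Matthiessen's rule can be combined in a few lines. In the sketch below the free-electron mass is used for $m^*$, and the carrier density and relaxation times in the usage note are illustrative assumptions (of the order usually quoted for copper at room temperature), not values given in the notes.

```python
E_CHARGE = 1.602176634e-19  # C
M_ELECTRON = 9.1093837e-31  # kg


def conductivity(n, tau, m_eff=M_ELECTRON):
    """sigma = n e^2 tau / m* in S/m, for n in 1/m^3 and tau in s."""
    return n * E_CHARGE**2 * tau / m_eff


def mobility(tau, m_eff=M_ELECTRON):
    """mu = e tau / m* in m^2/(V s)."""
    return E_CHARGE * tau / m_eff


def combined_tau(tau_ph, tau_defect):
    """Total relaxation time when independent scattering rates add (Eq. 63)."""
    return 1.0 / (1.0 / tau_ph + 1.0 / tau_defect)


def resistivity(n, tau, m_eff=M_ELECTRON):
    """rho = 1 / sigma in Ohm m."""
    return 1.0 / conductivity(n, tau, m_eff)
```

With `n = 8.47e28` m$^{-3}$ and `tau = 2.5e-14` s, `conductivity` returns about $6 \times 10^7$ S/m. Because $\rho \propto 1/\tau$, `resistivity(n, combined_tau(tau_ph, tau_defect))` equals `resistivity(n, tau_ph) + resistivity(n, tau_defect)`, which is Eq. (64).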
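The defect contribution is proportional to the defect cross section $\Sigma_{\text{defect}}$ and the current, or $v(E_F)$: $\frac{1}{\tau_{\text{defect}}} \propto \Sigma_{\text{defect}}\, v(E_F)$. It is roughly temperature independent, since the cross section $\Sigma_{\text{defect}}$ and $v(E_F)$ are. The phonon contribution, on the other hand, is highly temperature dependent, since at zero temperature there are no phonons. The scattering cross section is roughly proportional to the mean-square phonon excursion $\left\langle S^2(\mathbf{q}) \right\rangle$. However, from the equipartition theorem
\[ \frac{1}{2} M \omega_q^2 \left\langle S^2(\mathbf{q}) \right\rangle = \frac{k_B T}{2}, \qquad T \gg \theta_D . \tag{65} \]
Thus
\[ \frac{1}{\tau_{\text{ph}}} \propto \left\langle S^2(\mathbf{q}) \right\rangle \propto \frac{k_B T}{M \omega_q^2} \tag{66} \]
I.e., at high temperatures, all modes contribute a scattering linear in $T$ to $\frac{1}{\tau_{\text{ph}}}$. Therefore, at $T \gg \theta_D$ ($\theta_D$ = Debye temperature)
\[ \rho = aT + \rho_{\text{defect}} \tag{67} \]

Figure 21: The phonon and defect contributions to the resistivity add (left), and the phonon contribution is linear at high temperatures $T \gg \theta_D$. (Left panel: $\rho$ versus $T$ for Cu, Cu + 1.12% Ni, Cu + 2.16% Ni, and Cu + 3.32% Ni. Right panel: $R$ versus $T / \Theta_D$, crossing over from $\propto T^5$ at low temperature to $\propto T$ above about $0.1\, \Theta_D$.)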
6 Thermoelectric Effects

Until now, we have assumed that the transport system is thermally homogeneous. Of course this need not be the case, since we can maintain both an electrical and a thermal current. Here, each electron can carry a charge current $\sim ev \sim e^2 E$ and a thermal current $\sim k_B T v \sim k_B^2 T\,\nabla T$. In fact, a heat current can be used to induce an electrical potential (Seebeck or thermoelectric effect) and, conversely, an electric current can be used to move heat (Peltier effect), which makes solid state refrigeration possible.

Figure 22: Thermoelectric effects are important in systems with both electric potential and thermal gradients (here $T_1 \neq T_2$). We will assume both are in the x-direction.

6.1 Linearized Boltzmann Equation
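To allow for a thermal gradient $\nabla T$, our formalism must be modified. Imagine that $\nabla T$ and $\mathbf{E}$ are fixed in time; then the Boltzmann equation becomes

$$\frac{\partial f}{\partial t} + \mathbf{v}\cdot\nabla_r f - \frac{e}{\hbar}\,\mathbf{E}\cdot\nabla_k f = \left(\frac{\partial f}{\partial t}\right)_S = -\frac{f(k) - f_0(k)}{\tau(k)} \tag{68}$$

where in steady state $\partial f/\partial t \to 0$. After linearizing (replacing $f$ by $f_0$ in the left-hand side), we get

$$f(k) \simeq f_0(k) - \tau(k)\left\{\mathbf{v}\cdot\nabla_r f_0 - \frac{e}{\hbar}\,\mathbf{E}\cdot\nabla_k f_0\right\} \tag{69}$$

Then, as before

$$\frac{e}{\hbar}\,\mathbf{E}\cdot\nabla_k f_0 = \frac{e}{\hbar}\,\mathbf{E}\cdot\hbar\mathbf{v}\,\frac{\partial f_0}{\partial E} = \frac{e}{\hbar}\,E_x\,\hbar v_x\,\frac{\partial f_0}{\partial E} \tag{70}$$

The spatial inhomogeneity is through $\nabla T$, and in a semiconductor for which $E_F$ depends strongly upon $T$, through $\nabla E_F$:

$$\mathbf{v}\cdot\nabla_r f_0 = \mathbf{v}\cdot\left\{\frac{\partial f_0}{\partial T}\nabla T + \frac{\partial f_0}{\partial E_F}\nabla E_F\right\} = \mathbf{v}\cdot\left\{\frac{\partial f_0}{\partial T}\nabla T - \frac{\partial f_0}{\partial E}\nabla E_F\right\} \tag{71}$$

Apparently $\nabla E_F$ only contributes a term which modifies the electric field dependence:

$$v_x\,\frac{\partial f_0}{\partial E}\left\{eE_x + (\nabla E_F)_x\right\} \equiv v_x\,\frac{\partial f_0}{\partial E}\,eE'_x \tag{72}$$

Of course, in a metal $E' = E$.

6.2 Electric Current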
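Thus, we now have

$$f(k) = f_0(k) - \tau\,\frac{\partial f_0}{\partial T}\frac{\partial T}{\partial x}\,v_x + \tau\,\frac{e}{\hbar}\,E'_x\,\hbar v_x\,\frac{\partial f_0}{\partial E} \tag{73}$$

Then for the current

$$j_x = -\frac{e}{8\pi^3}\int d^3k\; v_x(k)\,f(k) = -\frac{e}{8\pi^3}\int d^3k\; v_x(k)\left\{f_0(k) - \tau\,\frac{\partial f_0}{\partial T}\frac{\partial T}{\partial x}\,v_x + \tau\,\frac{e}{\hbar}\,E'_x\,\hbar v_x\,\frac{\partial f_0}{\partial E}\right\} \tag{74}$$

recall that the last term yielded $\sigma$ last time (and still will), so

$$j_x = \sigma E'_x + \frac{e}{8\pi^3}\int d^3k\; v_x^2\,\tau\,\frac{\partial f_0}{\partial T}\frac{\partial T}{\partial x} \tag{75}$$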
Again we will calculate the second term assuming a spherical Fermi surface. The term $\partial f_0/\partial T$ confines the integral to the Fermi sphere, and so again it effectively amounts to a Fermi-surface average, so

$$v_x^2 \to \frac{1}{3}v^2 \approx \frac{2}{3m^*}\,\frac{1}{2}m^*v^2 = \frac{2}{3m^*}\,E ,$$

or, changing to an integral over the DOS,

$$j_x = \sigma E'_x + \frac{2}{3}\frac{e}{m^*}\int dE\; \tau(E)\,E\,D(E)\,\frac{\partial f_0}{\partial T}\frac{\partial T}{\partial x} \tag{76}$$

Figure 23: The Fermi function $f_0(E)$ at $T = 0$ and $T \neq 0$. The derivative $\partial f_0/\partial E$ is only significant near the Fermi surface.
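Assuming that $\tau(E) \sim \tau(E_F)$, we get

$$j_x = \sigma E'_x + \frac{2}{3}\frac{e}{m^*}\,\tau(E_F)\,\frac{\partial T}{\partial x}\int dE\; E\,D(E)\,\frac{\partial f_0}{\partial T} \tag{77}$$

$$j_x = \sigma E'_x + \frac{2}{3}\frac{e}{m^*}\,\tau(E_F)\,c_v(T)\,\frac{\partial T}{\partial x}, \qquad c_v \propto m^* \tag{78}$$

In general, this intuitive form is rewritten as

$$j_x = \sigma E'_x + L^{12}_{xx}\left(-\frac{\partial T}{\partial x}\right) \tag{79}$$

and from it we see that both an electric field (or the generalized field strength $E'$) and the thermal gradient contribute to the electron current $j_x$.

6.3 Thermal and Energy Currents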
Of course one can have a thermal current without having an electric current (same number of electrons moving right and left, but more of the hot ones moving right). Thermodynamics is needed to quantify this, though, since these electrons will carry entropy as well as energy and heat. Imagine that a small subsection of our material is in thermal equilibrium and then some electrons are introduced/taken away, so that

$$dQ = T\,dS = dU - \mu\,dN \qquad \text{(First Law of Thermodynamics)} \tag{80}$$

In terms of particle flow

$$j_Q = j_E - E_F\,j_n, \qquad -e\,j_n = j \tag{81}$$

where this equation defines $j_Q$, the thermal current, and

$$j_E = \int \frac{d^3k}{8\pi^3}\; E(k)\,v(k)\,f(k, r) . \tag{82}$$

Again one could work out the form of $j_Q$ for the spherical Fermi surface using the linearized Boltzmann equation. However, one must obtain a form like

$$j = L^{11}E' + L^{12}(-\nabla T) \tag{83}$$

$$j_Q = L^{21}E' + L^{22}(-\nabla T) \tag{84}$$
(The fact that $L^{12} = L^{21}$ is referred to as the Onsager relation.) The relationship between the $L$'s and the transport coefficients depends upon what experiment is being done. For example, in Fig. 24 there is a potential gradient ($V \neq 0$) but no thermal gradient since $T_1 = T_2$. The electric field drives both electric and thermal currents:

$$j = L^{11}E, \qquad j_Q = L^{12}E \tag{85}$$

Figure 24: Here, there is a potential gradient ($V \neq 0$, $I \neq 0$) but no thermal gradient since $T_1 = T_2$. The electric field drives both electric and thermal currents ($j = L^{11}E' = \sigma E$, $j_Q = L^{12}E'$, $\rho = \sigma^{-1}$). Thus, a heat bath is required to keep both sides of the sample at the same temperature.

Thus, we may identify

$$\sigma = \frac{ne^2\tau}{m^*} = L^{11} \tag{86}$$

Note that since there is a thermal current induced by the potential gradient, a heat bath is required to keep both sides of the sample at the same temperature.
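In Fig. 25 we maintain a thermal gradient, but turn off the electric current. Here,

$$j = 0 = L^{11}E' + L^{12}(-\nabla T) \tag{87}$$

and

$$j_Q = L^{21}E' + L^{22}(-\nabla T) = \left[-L^{21}\,\frac{L^{12}}{L^{11}} + L^{22}\right](-\nabla T) \tag{88}$$

Figure 25: Here $T_1 \neq T_2$ and $I = 0$, so that $j = 0 = L^{11}E' + L^{12}(-\nabla T)$ and $j_Q = \left[-L^{12}\left(\frac{L^{12}}{L^{11}}\right) + L^{22}\right](-\nabla T)$, where $-L^{12}\left(\frac{L^{12}}{L^{11}}\right) + L^{22} = \kappa$.

Since (you will show)

$$L^{12} = -\frac{2}{3}\frac{e\,\tau\,c_v}{m^*} = L^{21} \tag{89}$$

we find

$$\kappa = L^{22} - \frac{\left(L^{12}\right)^2}{L^{11}} \tag{90}$$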
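The open-circuit condition in Eqs. (87)-(90) can be checked numerically. The sketch below is a minimal illustration, assuming scalar transport coefficients in arbitrary consistent units; the function name and the sample values are hypothetical.

```python
# Sketch of Eqs. (83)-(90): linear response with the electric current switched off.
# Assumption: scalar coefficients in arbitrary consistent units; names are illustrative.


def open_circuit_response(L11, L12, L21, L22, grad_T):
    """Return (E_prime, j, j_Q, kappa) for a sample carrying no electric current."""
    minus_grad_T = -grad_T
    # Eq. (87): j = 0 fixes the field that builds up across the sample.
    E_prime = -(L12 / L11) * minus_grad_T
    j = L11 * E_prime + L12 * minus_grad_T    # Eq. (83), should vanish
    j_Q = L21 * E_prime + L22 * minus_grad_T  # Eq. (84)
    kappa = L22 - L21 * L12 / L11             # Eq. (90) when L12 = L21
    return E_prime, j, j_Q, kappa


if __name__ == "__main__":
    E_prime, j, j_Q, kappa = open_circuit_response(4.0, -2.0, -2.0, 3.0, grad_T=1.0)
    print(E_prime, j, j_Q, kappa)  # j is zero and j_Q equals kappa * (-grad_T)
```

The induced field $E' = (L^{12}/L^{11})\,\nabla T$ is exactly what cancels the thermally driven electric current, and it is this field that reduces the measured thermal conductivity below $L^{22}$.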
We could also measure the thermal conductivity by driving a heat current through the sample, maintaining the ends at the same potential (see Fig. 26 right). Here, we would find

$$\kappa = L^{22} . \tag{91}$$

Figure 26: Two methods for measuring $\kappa$. Left: $T_1 \neq T_2$, $V_1 \neq V_2$, no electric current, giving $\kappa = L^{22} - (L^{12})^2/L^{11}$. Right: $T_1 \neq T_2$, $V_1 = V_2$, with $j = L^{12}(-\nabla T)$ and $j_Q = L^{22}(-\nabla T)$, giving $\kappa = L^{22}$.

Thus, we can identify

$$\kappa = L^{22} \quad \text{or} \quad L^{22} - \frac{\left(L^{12}\right)^2}{L^{11}} \tag{92}$$

depending upon the experiment. These are the same if the sample is a good metal where $L^{11} = \sigma$ is large (Young Kim). David Mast measures $\kappa$ by the method on the left of Fig. 26. This yields the more conventional definition of $\kappa$.

6.4 Seebeck Effect, Thermocouples
These relations result in some interesting physical effects. Consider a bimetallic conducting loop with two junctions maintained at temperatures $T_1$ and $T_2$. Let metal A be different than B, so that $L^{ij}_A \neq L^{ij}_B$. If no current flows around the loop, then

$$j = 0 = L^{11}E_x + L^{12}(-\nabla T) \;\Rightarrow\; E_x = \frac{L^{12}}{L^{11}}\frac{dT}{dx} \tag{93}$$

where $S = L^{12}/L^{11}$ is called the thermopower and is a property of a material.

Figure 27: A bimetallic conducting loop (metal A between junctions 1 and 2, metal B from the junctions to the voltmeter at $T_0$) with junctions maintained at $T_1$ and $T_2$. If $T_1 \neq T_2$, and $L^{12}_A/L^{11}_A \neq L^{12}_B/L^{11}_B$, then the heat current induces a potential $V \propto T_2 - T_1$.

The potential measured around the loop is given by

$$V = \int_0^1 E_B\,dx + \int_1^2 E_A\,dx + \int_2^0 E_B\,dx \tag{94}$$

or

$$V = S_B\int_0^1 \frac{\partial T}{\partial x}\,dx + S_A\int_1^2 \frac{\partial T}{\partial x}\,dx + S_B\int_2^0 \frac{\partial T}{\partial x}\,dx \tag{95}$$

$$= S_B\int_2^1 \frac{\partial T}{\partial x}\,dx + S_A\int_1^2 \frac{\partial T}{\partial x}\,dx \tag{96}$$

$$= (S_A - S_B)\int_{T_1}^{T_2} dT = (S_A - S_B)(T_2 - T_1) \tag{97}$$

Or, if $S_A$ and $S_B$ are not $T$-independent,

$$V = \int_{T_1}^{T_2} dT\,(S_A - S_B).$$
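The thermocouple voltage with temperature-dependent thermopowers can be evaluated by simple quadrature. The following is a minimal sketch using the trapezoid rule; the thermopower functions passed in are hypothetical placeholders, not measured data.

```python
# Sketch of V = integral_{T1}^{T2} (S_A - S_B) dT, evaluated with the trapezoid rule.
# Assumption: S_A and S_B are user-supplied callables S(T); values below are made up.


def thermocouple_voltage(S_A, S_B, T1, T2, steps=1000):
    """Integrate the thermopower difference between the two junction temperatures."""
    h = (T2 - T1) / steps
    total = 0.0
    for i in range(steps):
        Ta = T1 + i * h
        Tb = Ta + h
        # Trapezoid on the sub-interval [Ta, Tb].
        total += 0.5 * h * ((S_A(Ta) - S_B(Ta)) + (S_A(Tb) - S_B(Tb)))
    return total


if __name__ == "__main__":
    # Constant thermopowers reproduce Eq. (97): (S_A - S_B) * (T2 - T1).
    V = thermocouple_voltage(lambda T: 5e-6, lambda T: 2e-6, 273.15, 373.15)
    print(V)
```

With an ice bath fixing $T_1$, inverting $V(T_2)$ is what turns the loop into a thermometer.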
So if $T_1 \neq T_2$ and $S_A \neq S_B$, then the heat current induces an emf! This is called the Seebeck effect ⇒ (solid state thermometer with ice H2O as a reference).

6.5 Peltier Effect
Now consider the inverse situation where an electrical current $j$ is driven through the loop, which is held at a fixed temperature ($\partial T/\partial x = 0$). Then

$$j = L^{11}E, \qquad j_Q = L^{21}E \tag{98}$$

$$j_Q = \frac{L^{21}}{L^{11}}\,j = \pi j \tag{99}$$

Figure 28: An electrical current $j$ is driven through the loop (metal A between junctions 1 and 2, metal B elsewhere), which is held at a fixed temperature $T_0$.

This is known as the Peltier effect, whereby heat is carried from one junction to the other; that is, an electric current is accompanied by a heat current. One may use this effect to create an extremely simple (and similarly inefficient) refrigerator.

Figure 29: If $dT/dx = 0$ and $\pi_A = L^{21}_A/L^{11}_A \neq L^{21}_B/L^{11}_B = \pi_B$, then the electric current also induces a heat current from one junction to another. Panel labels: $j_Q = \pi_A j$ in metal A, $j_Q = \pi_B j$ in metal B, and $j_Q = (\pi_A - \pi_B)\,j$ at the junctions.
7 The Wiedemann-Franz Law (for good metals)

One may independently measure the thermal ($\kappa$) and electrical ($\sigma$) conductivities. However, in general one expects that $\kappa \propto \sigma T$, since in electrical conduction each electron carries a charge $e$ and is acted on by a force $-eE$. The current per unit electric field is proportional to $e^2$. In thermal conduction each electron carries a thermal energy $k_B T$ and is acted on by a thermal force $-k_B\nabla T$. The heat current per unit thermal gradient is proportional to $k_B^2 T$; thus one expects

$$\kappa \propto \frac{k_B^2}{e^2}\,T\,\sigma \tag{100}$$

Figure 30: The thermal and electrical conductivities may be measured independently. Left: $j_Q = \kappa(-\nabla T)$ with $T_1 \neq T_2$ and $V_1 = V_2$ (thermal field $-\partial T/\partial x$). Right: $j = \sigma E$ with $V_1 \neq V_2$ and $T_1 = T_2$ (electric field).
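Due to the simplicity of these arguments, our formalism should reproduce this relationship. As we discussed before

$$j_Q = j_E - E_F\,j_n \tag{101}$$

$$= \frac{1}{8\pi^3}\int d^3k\; (E - E_F)\,v(k)\,f(k) \tag{102}$$

In the linear approximation to the Boltzmann equation for $E'_x = 0$, we get

$$j_Q = \frac{1}{8\pi^3}\int d^3k\; (E - E_F)\,\frac{\partial f_0}{\partial T}\,v_x^2\,\tau\left(-\frac{\partial T}{\partial x}\right) \tag{103}$$

where $E - E_F$ and $\partial f_0/\partial T$ are both odd in $(E - E_F)$, so that their product is even. For the Fermi liquid

$$j_Q \simeq -\frac{1}{3}v_F^2\,\tau_F\,\frac{\partial T}{\partial x}\int \frac{d^3k}{8\pi^3}\,(E - E_F)\,\frac{\partial f_0}{\partial T} \tag{104}$$

$$j_Q = -\frac{1}{3}v_F^2\,\tau_F\,c_v\,\frac{\partial T}{\partial x} \tag{105}$$

Figure 31: The function $(E - E_F)\,\partial f_0/\partial T$ is sharply peaked at the Fermi surface and even in $E - E_F$.

$$\kappa = \frac{1}{3}v_F^2\,\tau_F\,c_v \tag{106}$$

Now recall that for the Fermi liquid $c_v = \frac{\pi^2}{2}\,n k_B\,\frac{T}{T_F}$, with $k_B T_F = E_F = \frac{1}{2}m^*v_F^2$, so that

$$\kappa = \frac{1}{3}\,\frac{2}{m^*}\,\frac{1}{2}m^*v_F^2\;\tau_F\;\frac{\pi^2}{2}\,n k_B\,\frac{k_B T}{E_F} = \frac{\pi^2}{3}\,\tau_F\,\frac{n k_B^2 T}{m^*} \tag{107}$$

However, also for the Fermi liquid, we found that $\sigma = e^2\tau_F\,n/m^*$, so

$$\frac{\kappa}{\sigma} = \frac{\pi^2}{3}\left(\frac{k_B}{e}\right)^2 T \tag{108}$$

Of course this relationship only holds in a good metal. There are two reasons for this. First, we are neglecting terms like $(L^{12})^2/\sigma$ in $\kappa$, which are unimportant for a good metal (or if we electrically short the sample). Second, we are assuming that $\kappa$ is dominated by electronic transport.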
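The prefactor in Eq. (108) is a universal constant (often called the Lorenz number), so it can be evaluated directly. The sketch below assumes SI values for $k_B$ and $e$; the function names and the sample conductivity are illustrative assumptions.

```python
# Sketch of Eq. (108): kappa / sigma = (pi^2 / 3) (k_B / e)^2 T.
# Assumption: SI units; names and the sample conductivity are illustrative.
import math

K_B = 1.380649e-23          # Boltzmann constant in J/K
E_CHARGE = 1.602176634e-19  # elementary charge in C


def lorenz_number():
    """The ratio kappa / (sigma T) predicted by Eq. (108), in W Ohm / K^2."""
    return (math.pi ** 2 / 3.0) * (K_B / E_CHARGE) ** 2


def electronic_thermal_conductivity(sigma, T):
    """Electronic kappa implied by Eq. (108) for a given sigma (S/m) and T (K)."""
    return lorenz_number() * sigma * T


if __name__ == "__main__":
    print(lorenz_number())  # about 2.44e-8 W Ohm / K^2
    print(electronic_thermal_conductivity(6.0e7, 300.0))  # sigma value is assumed
```

As the text notes, this estimate only applies where electrons dominate the heat transport, i.e. in a good metal.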
Chapter 10: Superconductivity Bardeen, Cooper, & Schrieffer May 9, 2001
Contents

1 Introduction
  1.1 Evidence of a Phase Transition
  1.2 Meissner Effect
2 The London Equations
3 Cooper Pairing
  3.1 The Retarded Pairing Potential
  3.2 Scattering of Cooper Pairs
  3.3 The Cooper Instability of the Fermi Sea
4 The BCS Ground State
  4.1 The Energy of the BCS Ground State
  4.2 The BCS Gap
5 Consequences of BCS and Experiment
  5.1 Specific Heat
  5.2 Microwave Absorption and Reflection
  5.3 The Isotope Effect
6 BCS ⇒ Superconducting Phenomenology
7 Coherence of the Superconductor ⇒ Meissner effects
8 Quantization of Magnetic Flux
9 Tunnel Junctions

1 Introduction
From what we have learned about transport, we know that there is no such thing as an ideal ($\rho = 0$) conventional conductor. All materials have defects and phonons (and, to a lesser degree of importance, electron-electron interactions). As a result, from our basic understanding of metallic conduction, $\rho$ must be finite, even at $T = 0$. Nevertheless many superconductors, for which $\rho = 0$, exist. The first one, Hg, was discovered by Onnes in 1911. It becomes superconducting for $T < 4.2$ K. Clearly this superconducting state must be fundamentally different than the "normal" metallic state. I.e., the superconducting state must be a different phase, separated by a phase transition, from the normal state.

1.1 Evidence of a Phase Transition

Evidence of the phase transition can be seen in the specific heat (see Fig. 1). The jump in the superconducting specific heat $C_s$ indicates that there is a phase transition without a latent heat (i.e. the transition is continuous or second order). Furthermore, the activated nature of $C$ for $T < T_c$

$$C_s \sim e^{-\beta\Delta} \tag{1}$$

gives us a clue to the nature of the superconducting state. It is as if excitations require a minimum energy $\Delta$.

Figure 1: The specific heat $C$ (J/mol K) versus $T$ of a superconductor, $C_S$, and a normal metal, $C_n \sim \gamma T$. Below the transition at $T_c$, the superconductor specific heat shows activated behavior, as if there is a minimum energy for thermal excitations.

1.2 Meissner Effect
There is another, much more fundamental characteristic which distinguishes the superconductor from a normal, but ideal, conductor. The superconductor expels magnetic flux, i.e., $\mathbf{B} = 0$ within the bulk of a superconductor. This is fundamentally different than an ideal conductor, for which $\dot{\mathbf{B}} = 0$, since for any closed path

$$0 = IR = V = \oint \mathbf{E}\cdot d\mathbf{l} = \int_S \nabla\times\mathbf{E}\cdot d\mathbf{S} = -\frac{1}{c}\int_S \frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S} , \tag{2}$$

or, since $S$ and $C$ are arbitrary,

$$0 = -\frac{1}{c}\,\dot{\mathbf{B}}\cdot\mathbf{S} \;\Rightarrow\; \dot{\mathbf{B}} = 0 \tag{3}$$

Figure 2: A closed path $C$ and the surface $S$ it contains within a superconductor.
Thus, for an ideal conductor, it matters if it is field cooled or zero field cooled. Whereas for a superconductor, regardless of the external field and its history, if $T < T_c$, then $\mathbf{B} = 0$ inside the bulk. This effect, which uniquely distinguishes an ideal conductor from a superconductor, is called the Meissner effect.

Figure 3: For an ideal conductor, flux penetration in the ground state depends on whether the sample was cooled in a field through the transition. Zero-field cooled: $B = 0$ for $T > T_c$ and $B = 0$ for $T < T_c$. Field cooled: $B \neq 0$ for $T > T_c$ and $B \neq 0$ for $T < T_c$.

For this reason a superconductor is an ideal diamagnet. I.e.

$$B = \mu H = 0 \;\Rightarrow\; \mu = 0, \qquad M = \chi H = \frac{\mu - 1}{4\pi}\,H \tag{4}$$

$$\chi_{SC} = -\frac{1}{4\pi} \tag{5}$$

I.e., the measured $\chi$, Fig. 4, in a superconducting metal is very large and negative (diamagnetic).

Figure 4: LEFT: A sketch of the magnetic susceptibility versus temperature of a superconductor; $\chi$ drops from the Pauli value ($\propto D(E_F)$) to $-1/4\pi$ below $T_c$. RIGHT: Surface currents $j_s$ on a superconductor are induced to expel the external flux. The diamagnetic response of a superconductor is orders of magnitude larger than the Pauli paramagnetic response of the normal metal at $T > T_c$.

This can also be interpreted as the presence of persistent surface currents which maintain a magnetization of

$$M = -\frac{1}{4\pi}\,H_{ext} \tag{6}$$

in the interior of the superconductor in a direction opposite to the applied field. The energy associated with these currents increases with $H_{ext}$. At some point it is then more favorable (i.e., a lower free energy is obtained) if the system returns to a normal metallic state and these screening currents abate. Thus there exists an upper critical field $H_c$.

Figure 5: Superconductivity is destroyed by either raising the temperature or by applying a magnetic field (phase diagram in the $H$-$T$ plane: superconducting below $H_c(T)$ and $T_c$, normal otherwise).

2 The London Equations
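London and London derived a phenomenological theory of superconductivity which correctly describes the Meissner effect. They assumed that the electrons move in a frictionless state, so that

$$m\dot{\mathbf{v}} = -e\mathbf{E} \tag{7}$$

or, since $\partial\mathbf{j}/\partial t = -e n_s\dot{\mathbf{v}}$,

$$\frac{\partial\mathbf{j}_s}{\partial t} = \frac{e^2 n_s}{m}\,\mathbf{E} \qquad \text{(First London Eqn.)} \tag{8}$$

Then, using the Maxwell equation

$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} \;\Rightarrow\; \frac{m}{n_s e^2}\,\nabla\times\frac{\partial\mathbf{j}_s}{\partial t} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = 0 \tag{9}$$

or

$$\frac{\partial}{\partial t}\left(\frac{m}{n_s e^2}\,\nabla\times\mathbf{j}_s + \frac{1}{c}\,\mathbf{B}\right) = 0 \tag{10}$$

This describes the behavior of an ideal conductor (for which $\rho = 0$), but not the Meissner effect. To describe this, the constant of integration must be chosen to be zero. Then

$$\nabla\times\mathbf{j}_s = -\frac{n_s e^2}{mc}\,\mathbf{B} \qquad \text{(Second London Eqn.)} \tag{11}$$

or, defining $\lambda_L = \dfrac{m}{n_s e^2}$, the London equations become

$$\frac{\mathbf{B}}{c} = -\lambda_L\,\nabla\times\mathbf{j}_s, \qquad \mathbf{E} = \lambda_L\,\frac{\partial\mathbf{j}_s}{\partial t} \tag{12}$$

If we now apply the Maxwell equation $\nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j} \Rightarrow \nabla\times\mathbf{B} = \frac{4\pi}{c}\,\mu\,\mathbf{j}$, then we get

$$\nabla\times(\nabla\times\mathbf{B}) = \frac{4\pi}{c}\,\mu\,\nabla\times\mathbf{j} = -\frac{4\pi\mu}{c^2\lambda_L}\,\mathbf{B} \tag{13}$$

and

$$\nabla\times(\nabla\times\mathbf{j}) = -\frac{1}{\lambda_L c}\,\nabla\times\mathbf{B} = -\frac{4\pi\mu}{c^2\lambda_L}\,\mathbf{j} \tag{14}$$

or, since $\nabla\cdot\mathbf{B} = 0$, $\nabla\cdot\mathbf{j} = -\partial\rho/\partial t = 0$, and $\nabla\times(\nabla\times\mathbf{a}) = \nabla(\nabla\cdot\mathbf{a}) - \nabla^2\mathbf{a}$, we get

$$\nabla^2\mathbf{B} - \frac{4\pi\mu}{c^2\lambda_L}\,\mathbf{B} = 0, \qquad \nabla^2\mathbf{j} - \frac{4\pi\mu}{c^2\lambda_L}\,\mathbf{j} = 0 \tag{15}$$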
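Figure 6: A superconducting slab (occupying $z > 0$) in an external field $\mathbf{B}$ along $x$, with the screening current $\mathbf{j}_s \propto \nabla\times\mathbf{B} \propto \hat{z}\times\hat{x}$ along $y$. The field penetrates into the slab a distance $\Lambda_L = \sqrt{\dfrac{mc^2}{4\pi n e^2\mu}}$.

Now consider the superconductor in an external field shown in Fig. 6. The field is only in the x-direction, and can vary in space only in the z-direction; then, since $\nabla\times\mathbf{B} = \frac{4\pi}{c}\,\mu\,\mathbf{j}$, the current is in the y-direction, so

$$\frac{\partial^2 B_x}{\partial z^2} - \frac{4\pi\mu}{c^2\lambda_L}\,B_x = 0, \qquad \frac{\partial^2 j_{sy}}{\partial z^2} - \frac{4\pi\mu}{c^2\lambda_L}\,j_{sy} = 0 \tag{16}$$

with the solutions

$$B_x = B_x^0\,e^{-z/\Lambda_L}, \qquad j_{sy} = j_{sy}^0\,e^{-z/\Lambda_L} \tag{17}$$

where

$$\Lambda_L = \sqrt{\frac{c^2\lambda_L}{4\pi\mu}} = \sqrt{\frac{mc^2}{4\pi n e^2\mu}}$$

is the penetration depth.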
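To get a feeling for the size of $\Lambda_L$, the following minimal sketch evaluates the formula above in Gaussian (cgs) units, matching the conventions of Eqs. (11)-(17). The carrier density used in the demonstration is an assumed, order-of-magnitude value, and the function names are illustrative.

```python
# Sketch: London penetration depth Lambda_L = sqrt(m c^2 / (4 pi n e^2 mu)), Gaussian units.
# Assumption: free-electron mass and an order-of-magnitude carrier density.
import math

M_E = 9.1093837e-28      # electron mass in g
C_LIGHT = 2.99792458e10  # speed of light in cm/s
E_ESU = 4.80320471e-10   # elementary charge in esu


def london_penetration_depth(n, mu=1.0):
    """Penetration depth in cm for a superfluid density n in cm^-3."""
    return math.sqrt(M_E * C_LIGHT ** 2 / (4.0 * math.pi * n * E_ESU ** 2 * mu))


def field_profile(B0, z, penetration_depth):
    """Eq. (17): the field decays exponentially into the slab."""
    return B0 * math.exp(-z / penetration_depth)


if __name__ == "__main__":
    lam = london_penetration_depth(1.0e22)  # assumed density, cm^-3
    print(lam)                              # a few hundred Angstrom, expressed in cm
    print(field_profile(1.0, lam, lam))     # field reduced by 1/e one depth inside
```

Since $\Lambda_L \propto n^{-1/2}$, a lower superfluid density lets the field penetrate further.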
3 Cooper Pairing

The superconducting state is fundamentally different than any possible normal metallic state (i.e., a perfect metal at $T = 0$). Thus, the transition from the normal metal state to the superconducting state must be a phase transition. A phase transition is accompanied by an instability of the normal state. Cooper first quantified this instability as due to a small attractive(!?) interaction between two electrons above the Fermi surface.

3.1 The Retarded Pairing Potential
The attraction comes from the exchange of phonons. The lattice deforms slowly on the time scale of the electron. It reaches its maximum deformation at a time $\tau \sim 2\pi/\omega_D \sim 10^{-13}$ s after the electron has passed. In this time the first electron has traveled $\sim v_F\tau \sim 10^8\ \mathrm{cm/s}\cdot 10^{-13}\ \mathrm{s} \sim 1000$ Å. The positive charge of the lattice deformation can then attract another electron without feeling the Coulomb repulsion of the first electron. Due to retardation, the electron-electron Coulomb repulsion may be neglected!

Figure 7: Origin of the retarded attractive potential. Electrons at the Fermi surface travel with a high velocity $v_F \sim 10^8$ cm/s. As they pass through the lattice (left), the positive ions respond slowly. By the time they have reached their maximum excursion, the first electron is far away, leaving behind a region of positive charge which attracts a second electron.
The net effect of the phonons is then to create an attractive interaction which tends to pair time-reversed quasiparticle states. They form an antisymmetric spin singlet so that the spatial part of the wave function can be symmetric and nodeless, and so take advantage of the attractive interaction. Furthermore, they tend to pair in a zero center of mass (cm) state so that the two electrons can chase each other around the lattice.

Figure 8: To take full advantage of the attractive potential illustrated in Fig. 7, the spatial part of the electronic pair wave function (for the pair $k\uparrow$, $-k\downarrow$, of size $\xi \sim 1000$ Å) is symmetric and hence nodeless. To obey the Pauli principle, the spin part must then be antisymmetric, or a singlet.

3.2 Scattering of Cooper Pairs
This latter point may be quantified a bit better by considering two electrons above a filled Fermi sphere. These two electrons are attracted by the exchange of phonons. However, the maximum energy which may be exchanged in this way is $\sim\hbar\omega_D$. Thus the scattering in phase space is restricted to a narrow shell of energy width $\hbar\omega_D$.

Figure 9: Pair states scattered by the exchange of phonons ($k_1, k_2 \to k'_1, k'_2$, with $E_k \sim k^2$) are restricted to a narrow scattering shell of width $\hbar\omega_D$ around the Fermi surface.

Furthermore, the momentum in this scattering process is also conserved:

$$\mathbf{k}_1 + \mathbf{k}_2 = \mathbf{k}'_1 + \mathbf{k}'_2 = \mathbf{K} \tag{18}$$

Thus the scattering of $\mathbf{k}_1$ and $\mathbf{k}_2$ into $\mathbf{k}'_1$ and $\mathbf{k}'_2$ is restricted to the overlap of the two scattering shells. Clearly this is negligible unless $\mathbf{K} \approx 0$. Thus the interaction is strongest (most likely) if $\mathbf{k}_1 = -\mathbf{k}_2$ and $\sigma_1 = -\sigma_2$; i.e., pairing is primarily between time-reversed eigenstates.

Figure 10: If the pair has a finite center of mass momentum, so that $\mathbf{k}_1 + \mathbf{k}_2 = \mathbf{K}$, then there are few states which it can scatter into through the exchange of a phonon.

3.3 The Cooper Instability of the Fermi Sea
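Now consider these two electrons above the Fermi surface. They will obey the Schroedinger equation

$$-\frac{\hbar^2}{2m}\left(\nabla_1^2 + \nabla_2^2\right)\psi(r_1 r_2) + V(r_1 r_2)\,\psi(r_1 r_2) = (\epsilon + 2E_F)\,\psi(r_1 r_2) \tag{19}$$

If $V = 0$, then $\epsilon = 0$, and

$$\psi_{V=0} = \frac{1}{L^{3/2}}\,e^{i\mathbf{k}_1\cdot\mathbf{r}_1}\;\frac{1}{L^{3/2}}\,e^{i\mathbf{k}_2\cdot\mathbf{r}_2} = \frac{1}{L^3}\,e^{i\mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2)} , \tag{20}$$

where we assume that $\mathbf{k}_1 = -\mathbf{k}_2 = \mathbf{k}$. For small $V$, we will perturb around the $V = 0$ state, so that

$$\psi(r_1 r_2) = \frac{1}{L^3}\sum_k g(k)\,e^{i\mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2)} \tag{21}$$

The sum must be restricted so that

$$E_F < \frac{\hbar^2k^2}{2m} < E_F + \hbar\omega_D \tag{22}$$

This may be imposed by $g(k)$, since $|g(k)|^2$ is the probability of finding one electron in a state $k$ and the other in $-k$. Thus we take

$$g(k) = 0 \quad \text{for} \quad k < k_F \quad \text{or} \quad k > \frac{\sqrt{2m(E_F + \hbar\omega_D)}}{\hbar} \tag{23}$$

The Schroedinger equation may be converted to a k-space equation by multiplying it by

$$\frac{1}{L^3}\int d^3r\; e^{-i\mathbf{k}'\cdot\mathbf{r}} \;\Rightarrow\; \text{S.E.} \tag{24}$$

so that

$$\frac{\hbar^2k^2}{m}\,g(k) + \frac{1}{L^3}\sum_{k'} g(k')\,V_{kk'} = (\epsilon + 2E_F)\,g(k) \tag{25}$$

where

$$V_{kk'} = \int V(\mathbf{r})\,e^{-i(\mathbf{k} - \mathbf{k}')\cdot\mathbf{r}}\,d^3r \tag{26}$$

now describes the scattering from $(k, -k)$ to $(k', -k')$. It is usually approximated as a constant for all $k$ and $k'$ which obey the Pauli-principle and scattering shell restrictions

$$V_{kk'} = \begin{cases} -V_0 & E_F < \dfrac{\hbar^2k^2}{2m},\ \dfrac{\hbar^2k'^2}{2m} < E_F + \hbar\omega_D \\[2mm] 0 & \text{otherwise} \end{cases} \tag{27}$$

so

$$\left(-\frac{\hbar^2k^2}{m} + \epsilon + 2E_F\right)g(k) = -\frac{V_0}{L^3}\sum_{k'} g(k') \equiv -A \tag{28}$$

or

$$g(k) = \frac{-A}{-\frac{\hbar^2k^2}{m} + \epsilon + 2E_F} \qquad \left(\text{i.e. for } E_F < \frac{\hbar^2k^2}{2m} < E_F + \hbar\omega_D\right) \tag{29}$$

Summing over $k$

$$\frac{V_0}{L^3}\sum_k \frac{A}{\frac{\hbar^2k^2}{m} - \epsilon - 2E_F} = +A \tag{30}$$

or

$$1 = \frac{V_0}{L^3}\sum_k \frac{1}{\frac{\hbar^2k^2}{m} - \epsilon - 2E_F} \tag{31}$$

This may be converted to a density of states integral on $E = \hbar^2k^2/2m$:

$$1 = V_0\,Z(E_F)\int_{E_F}^{E_F + \hbar\omega_D} \frac{dE}{2E - \epsilon - 2E_F} \tag{32}$$

$$1 = \frac{1}{2}\,V_0\,Z(E_F)\,\ln\left(\frac{\epsilon - 2\hbar\omega_D}{\epsilon}\right) \tag{33}$$

$$\epsilon = \frac{2\hbar\omega_D}{1 - e^{2/(V_0 Z(E_F))}} \simeq -2\hbar\omega_D\,e^{-2/(V_0 Z(E_F))} < 0, \quad \text{as } V_0 \to 0 \tag{34}$$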
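The exact and weak-coupling forms in Eq. (34) can be compared numerically. This is a minimal sketch, assuming the dimensionless coupling $V_0 Z(E_F)$ is passed as a single number and energies are measured in units of $\hbar\omega_D$; the function names are illustrative.

```python
# Sketch of Eq. (34): Cooper pair energy relative to 2 E_F, exact and weak-coupling forms.
# Assumption: `coupling` stands for V0 * Z(E_F); energies are in the units of hbar_omega_D.
import math


def cooper_pair_energy(hbar_omega_D, coupling):
    """Exact solution of Eq. (33) for epsilon (always negative: a bound pair).

    Note: math.exp overflows for very small couplings (2 / coupling above ~709).
    """
    return 2.0 * hbar_omega_D / (1.0 - math.exp(2.0 / coupling))


def cooper_pair_energy_weak(hbar_omega_D, coupling):
    """Weak-coupling limit of Eq. (34)."""
    return -2.0 * hbar_omega_D * math.exp(-2.0 / coupling)


if __name__ == "__main__":
    for coupling in (0.2, 0.3, 0.5):
        exact = cooper_pair_energy(1.0, coupling)
        weak = cooper_pair_energy_weak(1.0, coupling)
        print(coupling, exact, weak)
```

The binding energy is nonzero for any attractive $V_0 > 0$ but is not analytic at $V_0 = 0$, which is why it cannot be obtained by ordinary perturbation theory in $V_0$.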
4 The BCS Ground State
In the preceding section, we saw that the weak phonon-mediated attractive interaction was sufficient to destabilize the Fermi sea, and promote the formation of a Cooper pair $(k\uparrow, -k\downarrow)$. The scattering

$$(k\uparrow, -k\downarrow) \to (k'\uparrow, -k'\downarrow) \tag{35}$$

yields an energy $V_0$ if $k$ and $k'$ are in the scattering shell $E_F < E_k, E_{k'} < E_F + \hbar\omega_D$. Many electrons can participate in this process and many Cooper pairs are formed, yielding a new state (phase) of the system. The energy of this new state is not just $\frac{N}{2}\epsilon$ less than that of the old state, since the Fermi surface is renormalized by the formation of each Cooper pair.

4.1 The Energy of the BCS Ground State
Of course, to study the thermodynamics of this new phase, it is necessary to determine its energy. It will have both kinetic and potential contributions. Since pairing only occurs for electrons above the Fermi surface, the kinetic energy actually increases: if $w_k$ is the probability that a pair state $(k\uparrow, -k\downarrow)$ is occupied, then

$$E_{kin} = 2\sum_k w_k\,\xi_k, \qquad \xi_k = \frac{\hbar^2k^2}{2m} - E_F \tag{36}$$

The potential energy requires a bit more thought. It may be written in terms of annihilation and creation operators for the pair states labeled by $k$:

$$|1\rangle_k \qquad (k\uparrow, -k\downarrow)\ \text{occupied} \tag{37}$$

$$|0\rangle_k \qquad (k\uparrow, -k\downarrow)\ \text{unoccupied} \tag{38}$$

or

$$|\psi_k\rangle = u_k\,|0\rangle_k + v_k\,|1\rangle_k \tag{39}$$

where $v_k^2 = w_k$ and $u_k^2 = 1 - w_k$. Then the BCS state, which is a collection of these pairs, may be written as

$$|\phi_{BCS}\rangle \simeq \prod_k \left\{u_k\,|0\rangle_k + v_k\,|1\rangle_k\right\} . \tag{40}$$
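We will assume that $u_k, v_k \in \mathbb{R}$. Physically this amounts to taking the phase of the order parameter to be zero (or $\pi$), so that it is real. However, the validity of this assumption can only be verified for a more microscopically based theory.

By the Pauli principle, the state $(k\uparrow, -k\downarrow)$ can be, at most, singly occupied; thus a ($s = \frac{1}{2}$) Pauli representation is possible:

$$|1\rangle_k = \begin{pmatrix} 1 \\ 0 \end{pmatrix}_k, \qquad |0\rangle_k = \begin{pmatrix} 0 \\ 1 \end{pmatrix}_k \tag{41}$$

where $\sigma_k^+$ and $\sigma_k^-$ describe the creation and annihilation of the state $(k\uparrow, -k\downarrow)$:

$$\sigma_k^+ = \frac{1}{2}\left(\sigma_k^1 + i\sigma_k^2\right) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \tag{42}$$

$$\sigma_k^- = \frac{1}{2}\left(\sigma_k^1 - i\sigma_k^2\right) = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \tag{43}$$

Of course

$$\sigma_k^+\,|1\rangle_k = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}_k = 0, \qquad \sigma_k^+\,|0\rangle_k = |1\rangle_k \tag{44}$$

$$\sigma_k^-\,|1\rangle_k = |0\rangle_k, \qquad \sigma_k^-\,|0\rangle_k = 0 \tag{45}$$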
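The action of the pair creation and annihilation operators in Eqs. (41)-(45) can be verified by direct matrix multiplication. The sketch below uses plain Python lists; the constant and function names are illustrative.

```python
# Sketch of Eqs. (41)-(45): pair creation/annihilation operators as 2x2 matrices.
# Assumption: basis ordering follows Eq. (41), with |1> = (1, 0) and |0> = (0, 1).

SIGMA_PLUS = [[0, 1], [0, 0]]   # Eq. (42): creates the pair (k up, -k down)
SIGMA_MINUS = [[0, 0], [1, 0]]  # Eq. (43): annihilates the pair
OCCUPIED = [1, 0]               # |1>_k
EMPTY = [0, 1]                  # |0>_k


def apply_operator(op, state):
    """Multiply a 2x2 matrix into a 2-component state vector."""
    return [sum(op[i][j] * state[j] for j in range(2)) for i in range(2)]


if __name__ == "__main__":
    print(apply_operator(SIGMA_PLUS, EMPTY))      # gives the occupied state
    print(apply_operator(SIGMA_PLUS, OCCUPIED))   # gives the zero vector (Pauli principle)
    print(apply_operator(SIGMA_MINUS, OCCUPIED))  # gives the empty state
    print(apply_operator(SIGMA_MINUS, EMPTY))     # gives the zero vector
```

That $\sigma^+$ applied twice gives zero is the matrix statement that a pair state can be at most singly occupied.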
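The process $(k\uparrow, -k\downarrow) \to (k'\uparrow, -k'\downarrow)$, if allowed, is associated with an energy reduction $V_0$. In our Pauli matrix representation this process is represented by the operators $\sigma_{k'}^+\sigma_k^-$, so

$$V = -\frac{V_0}{L^3}\sum_{kk'} \sigma_{k'}^+\,\sigma_k^- \qquad \text{(Note that this is Hermitian)} \tag{46}$$

Thus the reduction of the potential energy is given by $\langle\phi_{BCS}|V|\phi_{BCS}\rangle$:

$$\langle\phi_{BCS}|V|\phi_{BCS}\rangle = -\frac{V_0}{L^3}\left[\prod_p \left(u_p\,\langle 0| + v_p\,\langle 1|\right)\right]\sum_{kk'} \sigma_k^+\,\sigma_{k'}^-\left[\prod_{p'} \left(u_{p'}\,|0\rangle_{p'} + v_{p'}\,|1\rangle_{p'}\right)\right] \tag{47}$$

Then, as ${}_k\langle 1|1\rangle_{k'} = \delta_{kk'}$, ${}_k\langle 0|0\rangle_{k'} = \delta_{kk'}$ and ${}_k\langle 0|1\rangle_{k'} = 0$,

$$\langle\phi_{BCS}|V|\phi_{BCS}\rangle = -\frac{V_0}{L^3}\sum_{kk'} v_k\,u_{k'}\,u_k\,v_{k'} \tag{48}$$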
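Thus, the total energy (kinetic plus potential) of the system of Cooper pairs is

$$W_{BCS} = 2\sum_k v_k^2\,\xi_k - \frac{V_0}{L^3}\sum_{kk'} v_k\,u_{k'}\,u_k\,v_{k'} \tag{49}$$

As yet $v_k$ and $u_k$ are unknown. They may be treated as variational parameters. Since $w_k = v_k^2$ and $1 - w_k = u_k^2$, we may impose this constraint by choosing

$$v_k = \cos\theta_k, \qquad u_k = \sin\theta_k \tag{50}$$

At $T = 0$, we require $W_{BCS}$ to be a minimum.

$$W_{BCS} = \sum_k 2\xi_k\cos^2\theta_k - \frac{V_0}{L^3}\sum_{kk'} \cos\theta_k\,\sin\theta_{k'}\,\cos\theta_{k'}\,\sin\theta_k = \sum_k 2\xi_k\cos^2\theta_k - \frac{V_0}{L^3}\sum_{kk'} \frac{1}{4}\sin 2\theta_k\,\sin 2\theta_{k'} \tag{51}$$

$$\frac{\partial W_{BCS}}{\partial\theta_k} = 0 = -4\xi_k\cos\theta_k\sin\theta_k - \frac{V_0}{L^3}\sum_{k'} \cos 2\theta_k\,\sin 2\theta_{k'} \tag{52}$$

$$\xi_k\tan 2\theta_k = -\frac{1}{2}\frac{V_0}{L^3}\sum_{k'} \sin 2\theta_{k'} \tag{53}$$

Conventionally, one introduces the parameters $E_k = \sqrt{\xi_k^2 + \Delta^2}$ and $\Delta = \frac{V_0}{L^3}\sum_k u_k v_k = \frac{V_0}{L^3}\sum_k \cos\theta_k\sin\theta_k$. Then we get

$$\xi_k\tan 2\theta_k = -\Delta \;\Rightarrow\; 2u_k v_k = \sin 2\theta_k = \frac{\Delta}{E_k} \tag{54}$$

$$\cos 2\theta_k = \frac{-\xi_k}{E_k} = \cos^2\theta_k - \sin^2\theta_k = v_k^2 - u_k^2 = 2v_k^2 - 1 \tag{55}$$

$$w_k = v_k^2 = \frac{1}{2}\left(1 - \frac{\xi_k}{E_k}\right) = \frac{1}{2}\left(1 - \frac{\xi_k}{\sqrt{\xi_k^2 + \Delta^2}}\right) \tag{56}$$

If we now make these substitutions ($2u_k v_k = \frac{\Delta}{E_k}$, $v_k^2 = \frac{1}{2}\left(1 - \frac{\xi_k}{E_k}\right)$) into $W_{BCS}$, then we get

$$W_{BCS} = \sum_k \xi_k\left(1 - \frac{\xi_k}{E_k}\right) - \frac{L^3}{V_0}\,\Delta^2 . \tag{57}$$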
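The variational solution in Eqs. (54)-(56) can be checked numerically for any pair of values $(\xi_k, \Delta)$. The following is a minimal sketch; the function name is illustrative and $\Delta > 0$ is assumed.

```python
# Sketch of Eqs. (54)-(56): BCS coherence factors u_k, v_k for given xi_k and Delta.
# Assumption: Delta > 0, with u_k and v_k chosen real and non-negative.
import math


def coherence_factors(xi, delta):
    """Return (u_k, v_k, E_k) with v_k^2 = (1 - xi/E)/2 and u_k^2 = (1 + xi/E)/2."""
    E = math.hypot(xi, delta)        # E_k = sqrt(xi^2 + Delta^2)
    v_squared = 0.5 * (1.0 - xi / E)  # Eq. (56): pair occupation probability w_k
    u_squared = 0.5 * (1.0 + xi / E)
    return math.sqrt(u_squared), math.sqrt(v_squared), E


if __name__ == "__main__":
    for xi in (-5.0, -1.0, 0.0, 1.0, 5.0):
        u, v, E = coherence_factors(xi, delta=1.0)
        # Occupation falls smoothly from ~1 to ~0 over a width ~Delta around xi = 0.
        print(xi, v * v, 2.0 * u * v, 1.0 / E)
```

The last two printed columns coincide, which is Eq. (54): $2u_k v_k = \Delta/E_k$.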
Figure 11: Sketch of the ground state pair distribution function $w_k = v_k^2$ versus $\xi_k = \frac{\hbar^2k^2}{2m} - E_F$ at $T = 0$. Pair states above the Fermi surface are partially occupied, so the kinetic energy clearly increases.

Compare this to the normal state energy, again measured relative to $E_F$

$$W_n = \sum_{k<k_F} 2\xi_k \tag{58}$$

or

$$\frac{W_{BCS} - W_n}{L^3} = -\frac{1}{L^3}\sum_k \xi_k\left(1 + \frac{\xi_k}{E_k}\right) - \frac{\Delta^2}{V_0} \tag{59}$$

$$\approx -\frac{1}{2}\,Z(E_F)\,\Delta^2 < 0. \tag{60}$$

So the formation of superconductivity reduces the ground state energy. This can also be interpreted as $\Delta Z(E_F)$ electron pairs per unit volume condensed into a state $\Delta$ below $E_F$. The average energy gain per electron is $\Delta/2$.

4.2 The BCS Gap
The gap parameter $\Delta$ is fundamental to the BCS theory. It tells us both the energy gain of the BCS state, and about its excitations. Thus $\Delta$ is usually what is measured by experiments. To see this consider

$$W_{BCS} = \sum_k 2\xi_k\,\frac{1}{2}\left(1 - \frac{\xi_k}{E_k}\right) - \frac{L^3}{V_0}\,\Delta^2 \tag{61}$$

↓ Lots of algebra (see I&L)

$$W_{BCS} = -\sum_k 2E_k\,v_k^4 \tag{62}$$

Now recall that the probability that the Cooper state $(k\uparrow, -k\downarrow)$ was occupied is given by $w_k = v_k^2$. Thus the first pair-breaking excitation takes $v_{k'}^2 = 1$ to $v_{k'}^2 = 0$, for a change in energy

$$\Delta E = -\sum_{k \neq k'} 2v_k^4\,E_k + \sum_k 2v_k^4\,E_k = 2E_{k'} = 2\sqrt{\xi_{k'}^2 + \Delta^2} \tag{63}$$

Then, since $\xi_{k'} = \frac{\hbar^2k'^2}{2m} - E_F$, the smallest such excitation is just

$$\Delta E_{min} = 2\Delta \tag{64}$$

This is the minimum energy required to break a pair, or create an excitation in the BCS ground state. It is what is measured by the specific heat $C \sim e^{-\beta 2\Delta}$ for $T < T_c$.
Figure 12: Breaking a pair $(k'\uparrow, -k'\downarrow)$, i.e. taking $w_{k'} = v_{k'}^2 = 1$ to $v_{k'}^2 = 0$, requires an energy $2\sqrt{\xi_{k'}^2 + \Delta^2} \geq 2\Delta$.

Now consider some experiment which adds a single electron, or perhaps a few unpaired electrons, to a superconductor (i.e., tunneling). This additional electron cannot find a partner for pairing. Thus it must enter one of the excited states discussed above. Since it is a single electron, its energy will be

$$E_k = \sqrt{\xi_k^2 + \Delta^2} \tag{65}$$

Figure 13: A normal metal in contact with a superconductor.

For $\xi_k \gg \Delta$, $E_k = \xi_k = \frac{\hbar^2k^2}{2m} - E_F$, which is just the energy of a normal metal state. Thus for energies well above the gap, the normal metal continuum is recovered for unpaired electrons. To calculate the density of unpaired electron states, recall that the density of states was determined by counting k-states. These are unaffected by any phase transition. Thus it must be that the number of states in $d^3k$ is equal:

$$D_s(E_k)\,dE_k = D_n(\xi_k)\,d\xi_k \tag{66}$$

Figure 14: The number of k-states within a volume $d^3k$ of k-space is unaffected by any phase transition.

In the vicinity of $\Delta \sim \xi_k$, $D_n(\xi_k) \approx D_n(E_F)$ since $|\Delta| \ll E_F$ (we shall see that $\Delta \leq 2\hbar\omega_D$). Thus for $\xi_k \sim \Delta$

$$\frac{D_s(E_k)}{D_n(E_F)} = \frac{d\xi_k}{dE_k} = \frac{d}{dE_k}\sqrt{E_k^2 - \Delta^2} = \frac{E_k}{\sqrt{E_k^2 - \Delta^2}}, \qquad E_k > \Delta \tag{67}$$

Figure 15: The ratio $D_s/D_n$ versus $E$; it diverges at $E = \Delta$ and approaches 1 at high energy. Density of additional electron states only!
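Eq. (67) is easy to tabulate. The sketch below returns the ratio $D_s/D_n$ and, as an added assumption consistent with the existence of a gap, returns zero for energies inside the gap; the function name is illustrative.

```python
# Sketch of Eq. (67): density of single-electron states relative to the normal metal.
# Assumption: no states for |E| <= Delta (inside the gap), where Eq. (67) does not apply.
import math


def dos_ratio(E, delta):
    """Return D_s(E) / D_n(E_F) for an excitation energy E measured from E_F."""
    if abs(E) <= delta:
        return 0.0
    return abs(E) / math.sqrt(E * E - delta * delta)


if __name__ == "__main__":
    for E in (0.5, 1.01, 1.5, 2.0, 10.0):
        print(E, dos_ratio(E, delta=1.0))  # large just above the gap, tends to 1 far above
```

The states removed from the gap pile up just above $\Delta$, so the total number of states is conserved, in line with Eq. (66).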
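Given the experimental and theoretical importance of $\Delta$, it should be calculated.

$$\Delta = \frac{V_0}{L^3}\sum_k \sin\theta_k\cos\theta_k = \frac{V_0}{L^3}\sum_k u_k v_k = \frac{V_0}{L^3}\sum_k \frac{\Delta}{2E_k} \tag{68}$$

$$\Delta = \frac{1}{2}\frac{V_0}{L^3}\sum_k \frac{\Delta}{\sqrt{\xi_k^2 + \Delta^2}} \tag{69}$$

Convert this to a sum over energy states (at $T = 0$ all states with $\xi < 0$ are occupied, since $\xi_k = \frac{\hbar^2k^2}{2m} - E_F$).

$$\Delta = \frac{V_0}{2}\,\Delta\int_{-\hbar\omega_D}^{\hbar\omega_D} \frac{Z(E_F + \xi)\,d\xi}{\sqrt{\xi^2 + \Delta^2}} \tag{70}$$

$$\frac{1}{V_0 Z(E_F)} = \int_0^{\hbar\omega_D} \frac{d\xi}{\sqrt{\xi^2 + \Delta^2}} \tag{71}$$

$$\frac{1}{V_0 Z(E_F)} = \sinh^{-1}\left(\frac{\hbar\omega_D}{\Delta}\right) \tag{72}$$

For small $\Delta$, using $\sinh x \simeq \frac{1}{2}e^x$ for large $x$ (Figure 16: sketch of $\sinh x \sim e^x$),

$$\frac{\hbar\omega_D}{\Delta} \simeq \frac{1}{2}\,e^{1/(V_0 Z(E_F))} \tag{73}$$

$$\Delta \simeq 2\hbar\omega_D\,e^{-1/(V_0 Z(E_F))} \tag{74}$$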
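The exact zero-temperature result, Eq. (72), and its weak-coupling limit can be compared directly. This is a minimal sketch, assuming the dimensionless coupling $V_0 Z(E_F)$ is supplied as one number; the function names are illustrative.

```python
# Sketch of Eqs. (72)-(74): zero-temperature BCS gap, exact and weak-coupling forms.
# Assumption: `coupling` stands for V0 * Z(E_F); energies are in the units of hbar_omega_D.
import math


def gap_zero_temperature(hbar_omega_D, coupling):
    """Invert Eq. (72): Delta = hbar_omega_D / sinh(1 / coupling)."""
    return hbar_omega_D / math.sinh(1.0 / coupling)


def gap_weak_coupling(hbar_omega_D, coupling):
    """Weak-coupling limit: Delta ~ 2 hbar_omega_D exp(-1 / coupling)."""
    return 2.0 * hbar_omega_D * math.exp(-1.0 / coupling)


if __name__ == "__main__":
    for coupling in (0.18, 0.39):  # couplings of the size listed for Al and Pb in Table 1
        print(coupling, gap_zero_temperature(1.0, coupling), gap_weak_coupling(1.0, coupling))
```

For couplings of the size quoted later in Table 1 the two expressions agree closely, and the gap is always a small fraction of $\hbar\omega_D$.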
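5 Consequences of BCS and Experiment

5.1 Specific Heat

As mentioned before, the gap $\Delta$ is fundamental to experiment. The simplest excitation which can be induced in a superconductor has energy $2\Delta$. Thus

$$\Delta E \sim 2\Delta\,e^{-\beta 2\Delta}, \qquad T \ll T_c \tag{75}$$

$$C \sim \frac{\partial\Delta E}{\partial\beta}\frac{\partial\beta}{\partial T} \sim \frac{\Delta^2}{T^2}\,e^{-\beta 2\Delta} \tag{76}$$

5.2 Microwave Absorption and Reflection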
Another direct measurement of the gap is reflectivity/absorption. A photon impacting a superconductor can either be reflected or absorbed. Unless $\hbar\omega > 2\Delta$, the photon cannot create an excitation and is reflected. Only if $\hbar\omega > 2\Delta$ is there absorption. Consider a small cavity within a superconductor. The cavity has a small hole which allows microwave radiation to enter the cavity. If $\hbar\omega < 2\Delta$ and if $B < B_c$, then the microwave intensity is high, $I = I_s$. On the other hand, if $\hbar\omega > 2\Delta$, or $B > B_c$, then the intensity in the cavity falls to $I = I_n$ due to absorption by the walls. Note that this also allows us to measure $\Delta$ as a function of $T$.

Figure 17: If $B > B_c$ or $\hbar\omega > 2\Delta$, then absorption reduces the intensity to the normal-state value $I = I_n$. For $B = 0$ the microwave intensity within the cavity is large so long as $\hbar\omega < 2\Delta$ (plot of $(I_s - I_n)/I_n$ versus $\hbar\omega$, dropping at $\hbar\omega = 2\Delta$).

At $T = T_c$, $\Delta = 0$, since thermal excitations reduce the number of Cooper pairs and increase the number of unpaired electrons, which obey Fermi statistics. The size of (Eqn. 71) is only affected by the presence of a Cooper pair. The probability that an electron is unpaired is

$$f\left(\sqrt{\xi^2 + \Delta^2} + E_F,\, T\right) = \frac{1}{\exp\left(\beta\sqrt{\xi^2 + \Delta^2}\right) + 1} ,$$

so the probability that a Cooper pair exists is $1 - 2f\left(\sqrt{\xi^2 + \Delta^2} + E_F,\, T\right)$.

Figure 18: A thermal energy $k_B T \sim 2\Delta$ breaks a pair $(k'\uparrow, -k'\downarrow)$.

Thus for $T \neq 0$

$$\frac{1}{V_0 Z(E_F)} = \int_0^{\hbar\omega_D} \frac{d\xi}{\sqrt{\xi^2 + \Delta^2}}\left\{1 - 2f\left(\sqrt{\xi^2 + \Delta^2} + E_F,\, T\right)\right\} \tag{77}$$

Note that as $\sqrt{\xi^2 + \Delta^2} \geq 0$, when $\beta \to \infty$ we recover the $T = 0$ result. This equation may be solved for $\Delta(T)$ and for $T_c$.

Figure 19: The evolution of the gap $\Delta(T)/\Delta(0)$ (as measured by reflectivity) as a function of temperature $T/T_c$, with data for In, Sn and Pb. The BCS approximation is in reasonably good agreement with experiment.

To find $T_c$, consider this equation as $T/T_c \to 1$; the first solution to the gap equation, with $\Delta = 0^+$, occurs at $T = T_c$. Here

$$\frac{1}{V_0 Z(E_F)} = \int_0^{\hbar\omega_D} \frac{d\xi}{\xi}\,\tanh\left(\frac{\xi}{2k_B T_c}\right) \tag{78}$$

which may be solved numerically to yield

$$1 = V_0 Z(E_F)\,\ln\left(\frac{1.14\,\hbar\omega_D}{k_B T_c}\right) \tag{79}$$

$$k_B T_c = 1.14\,\hbar\omega_D\,e^{-1/\{V_0 Z(E_F)\}} \tag{80}$$

but recall that $\Delta = 2\hbar\omega_D\,e^{-1/\{V_0 Z(E_F)\}}$, so

$$\frac{\Delta(0)}{k_B T_c} = \frac{2}{1.14} = 1.764 \tag{81}$$

metal   Tc (K)   Z(E_F)V0   ∆(0)/kB Tc
Zn      0.9      0.18       1.6
Al      1.2      0.18       1.7
Pb      7.22     0.39       2.15

Table 1: Note that the value 2.15 for $\Delta(0)/k_B T_c$ for Pb is higher than BCS predicts. Such systems are labeled strong coupling superconductors and are better described by the Eliashberg-Migdal theory.
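Eq. (78) is described as being solved numerically; one way this could be done is sketched below, using a midpoint-rule integral and a bisection search on $k_B T_c$. This is an illustrative approach under stated assumptions (energies in units of $\hbar\omega_D$, coupling $V_0 Z(E_F)$ given as a number, couplings roughly between 0.15 and 1), not necessarily the method used for the quoted result.

```python
# Sketch: solve Eq. (78) for k_B T_c by midpoint integration plus bisection.
# Assumptions: `coupling` stands for V0 * Z(E_F); energies in units of hbar_omega_D;
# the bracket below is adequate for couplings roughly between 0.15 and 1.
import math


def tc_integral(kT, hbar_omega_D=1.0, steps=4000):
    """Right-hand side of Eq. (78) evaluated with the midpoint rule."""
    h = hbar_omega_D / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * h  # midpoints avoid the removable singularity at xi = 0
        total += math.tanh(xi / (2.0 * kT)) / xi
    return total * h


def critical_temperature(coupling, hbar_omega_D=1.0, steps=4000):
    """Find k_B T_c such that the integral equals 1 / coupling."""
    target = 1.0 / coupling
    lo, hi = 1e-4 * hbar_omega_D, 10.0 * hbar_omega_D
    for _ in range(50):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        if tc_integral(mid, hbar_omega_D, steps) > target:
            lo = mid  # integral too large means the temperature is too low
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    coupling = 0.3
    kTc = critical_temperature(coupling)
    gap0 = 1.0 / math.sinh(1.0 / coupling)  # T = 0 gap from Eq. (72)
    print(kTc / math.exp(-1.0 / coupling))  # prefactor of Eq. (80)
    print(gap0 / kTc)                       # ratio of Eq. (81)
```

In the weak-coupling regime both printed numbers are independent of the coupling, which is why $\Delta(0)/k_B T_c$ serves as a universal test of the theory in Table 1.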
5.3 The Isotope Effect

Finally, one should discuss the isotope effect. We know that $V_{kk'}$ results from phonon exchange. If we change the mass of one of the vibrating members but not its charge, then $V_0 N(E_F)$, etc. are unchanged, but

$$\omega_D \sim \sqrt{\frac{k}{M}} \sim M^{-\frac{1}{2}} . \tag{82}$$

Thus $T_c \sim M^{-\frac{1}{2}}$. This has been confirmed for most normal superconductors, and is considered a "smoking gun" for phonon mediated superconductivity.

6 BCS ⇒ Superconducting Phenomenology
Using Maxwell's equations, we may establish a relation between the critical current and the critical field necessary to destroy the superconducting state. Consider a long thick wire (with radius $r_0 \gg \Lambda_L$) and integrate the equation
$$\nabla\times\mathbf{H} = \frac{4\pi}{c}\mathbf{j} \qquad (83)$$
along the contour shown in Fig. 20.

[Figure 20 sketch: cross section of the wire of radius r0 with surface S and contour element dl; the field H circulates and the current density is j = j0 e^{(r − r0)/ΛL}.]
Figure 20: Integration contour within a long thick superconducting wire perpendicular to a circulating magnetic field. The field only penetrates into the wire a distance ΛL.

$$\int \nabla\times\mathbf{H}\cdot d\mathbf{S} = \oint \mathbf{H}\cdot d\mathbf{l} = \frac{4\pi}{c}\int \mathbf{j}\cdot d\mathbf{s} \qquad (84)$$
$$2\pi r_0 H = \frac{4\pi}{c}\,2\pi r_0\Lambda_L j_0 \qquad (85)$$
If $j_0 = j_c$ ($j_c$ is the critical current), then
$$H_c = \frac{4\pi}{c}\Lambda_L j_c \qquad (86)$$
Since both Hc and jc ∝ ∆, they will share the temperature dependence of ∆. At T = 0, we could also get an expression for Hc by noting
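that, since the superconducting state excludes all flux,
$$\frac{1}{L^3}(W_n - W_{BCS}) = \frac{1}{8\pi}H_c^2 \qquad (87)$$
However, since we have earlier
$$\frac{1}{L^3}(W_n - W_{BCS}) = \frac{1}{2}N(0)\Delta^2, \qquad (88)$$
we get
$$H_c = 2\Delta\sqrt{\pi N(0)} \qquad (89)$$
We can use this, and the relation derived above, $j_c = \frac{c}{4\pi\Lambda_L}H_c$, to get a (properly derived) relationship for jc.
$$j_c = \frac{c}{4\pi\Lambda_L}\,2\Delta\sqrt{\pi N(0)} \qquad (90)$$
However, for most metals
$$N(0) \simeq \frac{n}{E_F} \qquad (91)$$
$$\Lambda_L = \sqrt{\frac{mc^2}{4\pi ne^2\mu}} \qquad (92)$$
taking µ = 1
$$j_c = 2\Delta\,\frac{c}{4\pi}\sqrt{\frac{4\pi ne^2}{mc^2}}\sqrt{\frac{\pi n\,2m}{\hbar^2k_F^2}} = \sqrt{2}\,\Delta\frac{ne}{\hbar k_F} \qquad (93)$$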
This gives a similar result to what Ibach and Lüth get, but for a completely different reason. Their argument is similar to one originally proposed by Landau. Imagine that you have a fluid which must flow around an obstacle of mass M. From the perspective of the fluid, this is the same as an obstacle moving in it.

[Figure 21 sketch: an obstacle of mass M and energy E in a fluid moving with velocity v, equivalent to the obstacle moving with momentum P through the fluid.]
Figure 21: A superconducting fluid which must flow around an obstacle of mass M. From the perspective of the fluid, this is the same as an obstacle, with a velocity equal and opposite to the fluid's, moving in it.

Suppose the obstacle makes an excitation of energy ε and momentum p in the fluid, then
$$E' = E - \epsilon, \qquad \mathbf{P}' = \mathbf{P} - \mathbf{p} \qquad (94)$$
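or, from squaring the second equation and dividing by 2M,

[Figure 22 sketch: (a) mass M with energy E and momentum P; (b) mass M with energy E′ and momentum P′ together with an excitation of energy ε and momentum p.]
Figure 22: A large mass M moving with momentum P in a superfluid (a), creates an excitation (b) of the fluid of energy ε and momentum p.

$$\frac{P'^2}{2M} - \frac{P^2}{2M} = -\frac{\mathbf{P}\cdot\mathbf{p}}{M} + \frac{p^2}{2M} = E' - E = -\epsilon \qquad (95)$$

[Figure 23 sketch: the momenta P, P′, and p, with θ the angle between p and P, and v = P/M.]
Figure 23:

$$\epsilon = \frac{pP\cos\theta}{M} - \frac{p^2}{2M} \qquad (96)$$
$$\epsilon = pv\cos\theta - \frac{p^2}{2M} \qquad (97)$$
If M → ∞ (a defect in the tube which carries the fluid could have essentially an infinite mass) then
$$\frac{\epsilon}{p} = v\cos\theta \qquad (98)$$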
Then since cos θ ≤ 1
$$v \geq \frac{\epsilon}{p} \qquad (99)$$
Thus, if there is some minimum ε, then there is also a minimum velocity below which such excitations of the fluid cannot happen. For the superconductor
$$v_c = \frac{\epsilon_{min}}{p} = \frac{2\Delta}{2\hbar k_F} \qquad (100)$$
Or
$$j_c = e n v_c = \Delta\frac{ne}{\hbar k_F} \qquad (101)$$
This is the same relation as we obtained with the previous thermodynamic argument (within a factor $\sqrt{2}$). However, the former argument is more proper, since it would apply even for gapless superconductors, and it takes into account the fact that the S.C. state is a collective phenomenon, i.e., a minuet, not a waltz of electron pairs.
7 Coherence of the Superconductor ⇒ Meissner effect
Superconductivity is the Meissner effect, but thus far, we have not yet shown that the BCS theory leads to the second London equation which describes flux exclusion. In this subsection, we will see that this requires an additional assumption: the rigidity of the BCS wave function.
In the BCS approximation, the superconducting wave function is taken to be composed of products of Cooper pairs. One can estimate the size of the pairs from the uncertainty principle
$$2\Delta = \delta\left(\frac{p^2}{2m}\right) \sim \frac{p_F}{m}\delta p \;\Rightarrow\; \delta p \sim \frac{2m\Delta}{p_F} \qquad (102)$$
$$\xi_{cp} \sim \delta x \sim \frac{\hbar}{\delta p} = \frac{\hbar p_F}{2m\Delta} = \frac{\hbar^2 k_F}{2m\Delta} = \frac{E_F}{k_F\Delta} \qquad (103)$$
$$\xi_{cp} \sim 10^3 - 10^4\ \text{Å} \sim \text{size of Cooper pair wave function} \qquad (104)$$
Thus in the radius of the Cooper pair, about
$$\frac{4\pi}{3}\xi_{cp}^3\,\frac{n}{2} \sim 10^8 \qquad (105)$$
other pairs have their center of mass.
Figure 24: Many electron pairs fall within the volume of a Cooper wavefunction. This leads to a degree of correlation between the pairs and to rigidity of the pair wavefunction.
The pairs are thus not independent of each other (regardless of the BCS wave function approximation). In fact they are specifically anchored to each other; i.e., they maintain coherence over a length scale of at least ξcp.

[Figure 25 sketch: |φBCS|² across a normal metal / SC interface, with ξcoh > ξcp.]
Figure 25:
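In light of this coherence, let's reconsider the supercurrent
$$\mathbf{j} = -\frac{2e}{4m}\left\{\psi\,\mathbf{p}^*\psi^* + \psi^*\mathbf{p}\,\psi\right\} \qquad (106)$$
where pair mass = 2m and pair charge = −2e, so that
$$\mathbf{p} = -i\hbar\nabla + \frac{2e}{c}\mathbf{A} \qquad (107)$$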
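A current, or a CM momentum K, modifies the single pair state
$$\psi(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{L^3}\sum_k g(k)\,e^{i\mathbf{K}\cdot(\mathbf{r}_1+\mathbf{r}_2)/2}\,e^{i\mathbf{k}\cdot(\mathbf{r}_1-\mathbf{r}_2)} \qquad (108)$$
$$\psi(\mathbf{K},\mathbf{r}_1,\mathbf{r}_2) = \psi(\mathbf{K}=0,\mathbf{r}_1,\mathbf{r}_2)\,e^{i\mathbf{K}\cdot\mathbf{R}} \qquad (109)$$
where $\mathbf{R} = (\mathbf{r}_1+\mathbf{r}_2)/2$ is the cm coordinate and ħK is the cm momentum. Thus
$$\Phi_{BCS} \simeq e^{i\phi}\Phi_{BCS}(K=0) = e^{i\phi}\Phi(0) \qquad (110)$$
$$\phi = \mathbf{K}\cdot(\mathbf{R}_1 + \mathbf{R}_2 + \cdots) \qquad (111)$$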
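(In principle, we should also antisymmetrize this wave function; however, we will see soon that this effect is negligible). Due to the rigidity of the BCS state it is valid to approximate
$$\nabla = \nabla_R + \nabla_r \approx \nabla_R \qquad (112)$$
Thus
$$\mathbf{j}_s \approx -\frac{2e}{4m}\sum_\nu\left\{\Phi^*_{BCS}\left(-i\hbar\nabla_{R_\nu} + \frac{2e\mathbf{A}}{c}\right)\Phi_{BCS} + \Phi_{BCS}\left(i\hbar\nabla_{R_\nu} + \frac{2e\mathbf{A}}{c}\right)\Phi^*_{BCS}\right\} \qquad (113)$$
or
$$\mathbf{j}_s = -\frac{2e}{4m}|\Phi(0)|^2\left\{\frac{4e\mathbf{A}}{c} + 2\hbar\sum_\nu\nabla_{R_\nu}\phi\right\} \qquad (114)$$
Then since for any ψ, ∇ × ∇ψ = 0
$$\nabla\times\mathbf{j}_s = -\frac{2e^2}{mc}|\Phi(0)|^2\,\nabla\times\mathbf{A} \qquad (115)$$
or since $|\Phi(0)|^2 = n_s/2$
$$\nabla\times\mathbf{j} = -\frac{n_s e^2}{mc}\mathbf{B} \qquad (116)$$
which is the second London equation which as we saw in Sec.?? leads to the Meissner effect. Thus the second London equation can only be derived from the BCS theory by assuming that the BCS state is spatially homogeneous.
8 Quantization of Magnetic Flux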
The rigidity of the wave function (superconducting coherence) also guarantees that the flux penetrating a superconducting loop is quantized. This may be seen by integrating Eq. 114 along a contour within the superconducting bulk (at least a distance ΛL from the surface).
$$\mathbf{j}_s = -\frac{e^2 n_s}{mc}\mathbf{A} - \frac{e\hbar n_s}{2m}\sum_\nu\nabla_{R_\nu}\phi \qquad (117)$$
$$\oint\mathbf{j}_s\cdot d\mathbf{l} = -\frac{e^2 n_s}{mc}\oint\mathbf{A}\cdot d\mathbf{l} - \frac{e\hbar n_s}{2m}\sum_\nu\oint\nabla_{R_\nu}\phi\cdot d\mathbf{l} \qquad (118)$$

[Figure 26 sketch: a superconducting loop threaded by the field B, with the contour C a depth ΛL inside the surface.]
Figure 26: Magnetic flux penetrating a superconducting loop is quantized. This may be seen by integrating Eq. 114 along a contour within the superconducting bulk (a distance ΛL from the surface).

Presumably the phase of the BCS state $\Phi_{BCS} = e^{i\phi}\Phi(0)$ is single valued, so
$$\sum_\nu\oint\nabla_{R_\nu}\phi\cdot d\mathbf{l} = 2\pi N, \qquad N \in \mathbb{Z} \qquad (119)$$
Also since the path l may be taken inside the superconductor by a depth of more than ΛL, where js = 0, we have that
$$\oint\mathbf{j}_s\cdot d\mathbf{l} = 0 \qquad (120)$$
so
$$-\frac{e^2 n_s}{mc}\oint\mathbf{A}\cdot d\mathbf{l} = -\frac{e^2 n_s}{mc}\int\mathbf{B}\cdot d\mathbf{s} = 2N\pi\,\frac{e\hbar n_s}{2m} \qquad (121)$$
I.e., the flux in the loop is quantized, in units of hc/2e.
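Solving Eq. 121 for the flux gives a magnitude N·hc/2e. A minimal sketch of the size of this flux quantum follows. The function names are illustrative. The constants are the exact SI values of h and e, and the conversion 1 Wb = 10⁸ G cm² is used to return to the Gaussian units of the text.

```python
# Exact SI values of the Planck constant and the elementary charge.
PLANCK_H = 6.62607015e-34            # J s
ELEMENTARY_CHARGE = 1.602176634e-19  # C

def flux_quantum_si():
    """Flux quantum h/(2e) in webers (the SI form of hc/2e)."""
    return PLANCK_H / (2.0 * ELEMENTARY_CHARGE)

def flux_quantum_gaussian():
    """The same quantity in gauss cm^2, using 1 Wb = 1e8 G cm^2."""
    return flux_quantum_si() * 1.0e8

def loop_flux_gaussian(n_quanta):
    """Magnitude of the flux through the loop for integer N in Eq. 119."""
    return n_quanta * flux_quantum_gaussian()

print(flux_quantum_si(), flux_quantum_gaussian(), loop_flux_gaussian(3))
```

The factor of 2 in the denominator reflects the pair charge 2e used throughout this section.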
9 Tunnel Junctions
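Imagine that we have an insulating gap between two metals, and that a plane wave (electronic Bloch state) is propagating towards this barrier from the left.

[Figure 27 sketch: potential V versus x with a barrier of height V0 between x = 0 and x = d. Regions a and c are metal, where d²ψ/dx² + (2m/ħ²)Eψ = 0; region b is the insulator, where d²ψ/dx² + (2m/ħ²)(E − V0)ψ = 0.]
Figure 27:

$$\psi_a = A_1e^{ikx} + B_1e^{-ikx}, \qquad \psi_b = A_2e^{ik'x} + B_2e^{-ik'x}, \qquad \psi_c = B_3e^{-ikx} \qquad (122)$$
These are solutions to the S.E. if
$$k = \frac{\sqrt{2mE}}{\hbar} \quad \text{in a \& c} \qquad (123)$$
$$k' = \frac{\sqrt{2m(E - V_0)}}{\hbar} \quad \text{in b} \qquad (124)$$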
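The coefficients are determined by the BC of continuity of ψ and ψ′ at the barriers x = 0 and x = d. If we take B3 = 1 and E < V0, so that
$$k' = i\kappa = \frac{\sqrt{2m(E - V_0)}}{\hbar} \qquad (125)$$
then the probability of having a particle tunnel from left to right is
$$P_{l\to r} = \frac{|B_3|^2}{|B_1|^2} = \frac{1}{|B_1|^2} \propto \left\{\frac{1}{2} - \frac{1}{8}\left(\frac{k}{\kappa} - \frac{\kappa}{k}\right)^2 + \frac{1}{8}\left(\frac{k}{\kappa} + \frac{\kappa}{k}\right)^2\cosh 2\kappa d\right\}^{-1} \qquad (126)$$
For large κd
$$P_{l\to r} \propto 8\left(\frac{k}{\kappa} + \frac{\kappa}{k}\right)^{-2}e^{-2\kappa d} \qquad (127)$$
$$\propto 8\left(\frac{k}{\kappa} + \frac{\kappa}{k}\right)^{-2}\exp\left\{-\frac{2d}{\hbar}\sqrt{2m(V_0 - E)}\right\} \qquad (128)$$
I.e., the tunneling probability falls exponentially with distance.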
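The exponential fall-off can be checked numerically. The sketch below takes the proportionality in Eq. 126 as an equality, an assumption which gives unit transmission for a barrier of zero width, and compares it with the thick-barrier form obtained from cosh 2κd ≈ e^{2κd}/2. The function names and the test values of k, κ, and d are illustrative.

```python
import math

def barrier_transmission(k, kappa, d):
    """Transmission through a rectangular barrier for E < V0.

    k is the wave vector outside the barrier, kappa the decay constant
    inside it, and d the width. This is the inverse of the bracket in
    Eq. 126, with the proportionality constant assumed to be 1.
    """
    minus = (k / kappa - kappa / k) ** 2
    plus = (k / kappa + kappa / k) ** 2
    denom = 0.5 - minus / 8.0 + plus / 8.0 * math.cosh(2.0 * kappa * d)
    return 1.0 / denom

def thick_barrier_transmission(k, kappa, d):
    """Large kappa*d limit, using cosh(2 kappa d) ~ exp(2 kappa d) / 2."""
    plus = (k / kappa + kappa / k) ** 2
    return 16.0 / plus * math.exp(-2.0 * kappa * d)

for width in (0.0, 0.5, 1.0, 2.0, 5.0):
    exact = barrier_transmission(1.0, 2.0, width)
    approx = thick_barrier_transmission(1.0, 2.0, width)
    print(width, exact, approx)
```

With this normalization the numerical prefactor of the asymptotic form is 16 rather than the 8 quoted in Eq. 127. Since Eqs. 126 to 128 are stated only as proportionalities, the exponential dependence on d is the part being illustrated.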
Of course, this explains the physics of a single electron tunneling across a barrier, assuming that an appropriate state is filled on the left-hand side and available on the right-hand side. This, as can be seen in Fig. 28, is not always the case, especially in a conductor. Here, we must take into account the densities of states and their occupation probabilities f. We will be interested in applied voltages V which will shift the chemical potential by eV. To study the gap we will apply
$$eV \sim \Delta \qquad (129)$$

[Figure 28 sketch: densities of states N(E) versus energy E for a superconductor (S) and a normal metal (N) separated by an insulator (I), offset by eV.]
Figure 28: Electrons cannot tunnel across the barrier since no unoccupied states are available on the left which correspond in energy to occupied states on the right (and vice-versa). However, the application of an appropriate bias voltage will promote the state on the right in energy, inducing a current.

We know that $2\Delta/k_BT_c \sim 4$, so $\Delta \sim 4k_BT_c/2 \sim 10$ K. However, typical metallic densities of states have features on the scale of electron-volts ∼ 10⁴ K. Thus, on this energy scale we may approximate the metallic density of states as featureless.
$$N_r(\epsilon) = N_{metal}(\epsilon) \approx N_{metal}(E_F) \qquad (130)$$
The tunneling current is then, roughly,
$$I \propto P\int d\epsilon\,f(\epsilon - eV)N_r(E_F)N_l(\epsilon)(1 - f(\epsilon)) - P\int d\epsilon\,f(\epsilon)N_l(\epsilon)N_r(E_F)(1 - f(\epsilon - eV)) \qquad (131)$$
For eV = 0, clearly I = 0, i.e., a balance is achieved.

[Figure 29 sketch: occupied and unoccupied states on the two sides of the junction near EF.]
Figure 29: If eV = 0, but there is a small overlap of occupied and unoccupied states on the left and right sides, then there still will be no current due to a balance of particle hopping.

For eV ≠ 0 a current may occur. Let's assume that eV > 0 and kBT ≪ ∆. Then the rightward motion of electrons is suppressed. Then
$$I \sim P\,N_r(E_F)\int d\epsilon\,f(\epsilon - eV)N_l(\epsilon) \qquad (132)$$
and
$$\frac{dI}{dV} \sim P\,N_r(E_F)\int d\epsilon\,\frac{\partial f(\epsilon - eV)}{\partial V}N_l(\epsilon) \qquad (133)$$
$$\frac{\partial f}{\partial V} \sim e\,\delta(\epsilon - eV - E_F) \qquad (T \ll E_F) \qquad (134)$$
$$\frac{dI}{dV} \simeq P\,N_r(E_F)N_l(eV + E_F) \qquad (135)$$
Thus the low temperature differential conductance dI/dV is a measure of the superconducting density of states.

[Figure 30 plots: I versus V and dI/dV versus V, each with its onset at V = ∆/e.]
Figure 30: At low temperatures, the differential conductance in a normal metal-superconductor tunnel junction is a measure of the quasiparticle density of states.
Chapter 11: Dielectric Properties of Materials Lindhardt May 8, 2002
Contents
1 Classical Dielectric Response of Materials
1.1 Conditions on ε
1.2 Kramers-Kronig Relations
2 Absorption of E and M radiation
2.1 Transmission
2.2 Reflectivity
2.3 Model Dielectric Response
3 The Free-electron gas
4 Excitons
Electromagnetic fields are essential probes of material properties
• IR absorption
• Spectroscopy
The interaction of the field and material may be described either classically or quantum mechanically. We will first do the former.
1 Classical Dielectric Response of Materials
Classically, materials are characterized by their dielectric response of either the bound or free charge. Both are described by Maxwell's equations
$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\times\mathbf{H} = \frac{4\pi}{c}\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t} \qquad (1)$$
and Ohm's law
$$\mathbf{j} = \sigma\mathbf{E}. \qquad (2)$$
Both effects may be combined into an effective dielectric constant ε̃, which we will now show.

[Figure 1 sketch: an electron e displaced by x ≪ λ about x0, so that E(x, t) ≅ E(x0, t); for k ≪ G, ε(k, ω) ≅ ε(ω).]
Figure 1: If the average excursion of the electron is small compared to the wavelength of the radiation, ⟨x⟩ ≪ λ, then we may ignore the wave-vector dependence of the radiation so that ε(k, ω) ≈ ε(ω).

For an isotropic medium, we have
$$\mathbf{D}(\omega) = \epsilon(\omega)\mathbf{E}(\omega) \qquad (3)$$
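where
$$\mathbf{E}(t) = \int d\omega\,e^{-i\omega t}\mathbf{E}(\omega), \qquad \mathbf{H}(t) = \int d\omega\,e^{-i\omega t}\mathbf{H}(\omega) \qquad (4)$$
$$\mathbf{D}(t) = \int d\omega\,e^{-i\omega t}\mathbf{D}(\omega), \qquad \mathbf{B}(t) = \int d\omega\,e^{-i\omega t}\mathbf{B}(\omega) \qquad (5)$$
$$\mathbf{E}(\omega) = \mathbf{E}^*(-\omega) \;\Rightarrow\; \mathbf{E}(t) \in \Re \qquad (6)$$
Then $\nabla\times\mathbf{H} = \frac{4\pi}{c}\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}$ ⇒
$$\nabla\times\int d\omega\,e^{-i\omega t}\mathbf{H}(\omega) = \frac{4\pi}{c}\int d\omega\,e^{-i\omega t}\mathbf{j}(\omega) + \frac{1}{c}\frac{\partial}{\partial t}\int d\omega\,e^{-i\omega t}\mathbf{D}(\omega) \qquad (7)$$
$$\int d\omega\,e^{-i\omega t}\left\{\nabla\times\mathbf{H}(\omega) - \frac{4\pi}{c}\mathbf{j}(\omega) - \frac{1}{c}(-i\omega)\mathbf{D}(\omega)\right\} = 0 \qquad (8)$$
$$\nabla\times\mathbf{H} = \frac{4\pi}{c}\mathbf{j} - i\frac{\omega}{c}\mathbf{D}(\omega) = \frac{4\pi\sigma}{c}\mathbf{E} - i\frac{\omega}{c}\epsilon\mathbf{E}$$
$$= -i\frac{\omega}{c}\left(\epsilon + i\frac{4\pi\sigma}{\omega}\right)\mathbf{E} = -i\frac{\omega}{c}\tilde\epsilon\,\mathbf{E}$$
$$= \frac{4\pi}{c}\left(\sigma - \frac{i\omega\epsilon}{4\pi}\right)\mathbf{E} = \frac{4\pi}{c}\tilde\sigma\,\mathbf{E} \qquad (9)$$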
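The two ways of grouping the terms in Eq. 9 can be checked against each other numerically. The sketch below is only an illustration: the function names and the sample values of ε, σ, and ω are assumptions, and Gaussian units are used as in the text.

```python
import math

def effective_dielectric(eps, sigma, omega):
    """eps_tilde = eps + i 4 pi sigma / omega, which absorbs the conduction term."""
    return eps + 1j * 4.0 * math.pi * sigma / omega

def effective_conductivity(eps, sigma, omega):
    """sigma_tilde = sigma - i omega eps / (4 pi), which absorbs the dielectric term."""
    return sigma - 1j * omega * eps / (4.0 * math.pi)

def curl_h_coefficients(eps, sigma, omega, c=2.998e10):
    """Return the coefficient of E in Eq. 9 computed both ways (c in cm/s)."""
    via_eps = -1j * omega / c * effective_dielectric(eps, sigma, omega)
    via_sigma = 4.0 * math.pi / c * effective_conductivity(eps, sigma, omega)
    return via_eps, via_sigma

via_eps, via_sigma = curl_h_coefficients(eps=2.5, sigma=1.0e15, omega=3.0e15)
print(via_eps, via_sigma)
```

Both coefficients are the same complex number, which is the content of the last two lines of Eq. 9.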
Thus we could either define an effective conductivity $\tilde\sigma = \sigma - \frac{i\omega\epsilon}{4\pi}$, which takes into account dielectric effects, or an effective dielectric constant $\tilde\epsilon = \epsilon + i\frac{4\pi\sigma}{\omega}$, which accounts for conduction.
1.1 Conditions on ε
From the reality of D(t) and E(t), one has that E(+ω) = E*(−ω) and D(ω) = D*(−ω), hence for D = εE,
$$\epsilon(\omega) = \epsilon^*(-\omega) \qquad (10)$$
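Additional constraints are obtained from causality
$$D(t) = \int d\omega\,\epsilon(\omega)E(\omega)e^{-i\omega t} = \int d\omega\,\epsilon(\omega)e^{-i\omega t}\int\frac{dt'}{2\pi}e^{i\omega t'}E(t') = \frac{1}{2\pi}\int dt'\,d\omega\,(\epsilon(\omega) - 1 + 1)E(t')e^{-i\omega(t-t')} \qquad (11)$$
then we make the substitution
$$\chi(\omega) \equiv \frac{\epsilon - 1}{4\pi} \qquad (12)$$
$$D(t) = 2\int dt'\,d\omega\left(\chi(\omega) + \frac{1}{4\pi}\right)E(t')e^{-i\omega(t-t')}$$
$$D(t) = E(t) + 2\int dt'\,d\omega\,\chi(\omega)E(t')e^{-i\omega(t-t')} \qquad (13)$$
Define $G(t) \equiv 2\int d\omega\,\chi(\omega)e^{-i\omega t}$, then
$$D(t) = E(t) + \int_{-\infty}^{\infty}dt'\,G(t - t')E(t') \qquad (14)$$
Thus, the electric displacement at time t depends upon the field at other times; however, it cannot depend upon times t′ > t by causality. Hence
$$G(\tau) \equiv 0, \qquad \tau < 0 \qquad (15)$$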
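[Figure 2 sketch: complex ω plane; for τ < 0 the contour closes in the upper half plane, which has no poles; the poles (×) lie in the lower half plane.]
Figure 2: If χ(ω) is analytic in the upper half plane, then causality is assured.

This can be enforced if χ(ω) is analytic in the upper half plane. Then, for τ < 0, we may close the contour in the upper half plane,
$$G(\tau) = 2\int_{-\infty}^{\infty}d\omega\,\chi(\omega)e^{-i\omega\tau} = 2\oint d\omega\,\chi(\omega)e^{-i\omega\tau} \equiv 0 \qquad (16)$$
and obtain zero for the integral since χ is analytic within and on the contour.
1.2 Kramers-Kronig Relations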
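One may also derive an important relation between the real and imaginary parts of the dielectric function ε(ω) using this analytic property. From the Cauchy integral formula, if χ is analytic inside and on the contour C, then
$$\chi(z) = \frac{1}{2\pi i}\oint_C\frac{\chi(\omega')}{\omega' - z}d\omega' \qquad (17)$$

[Figure 3 sketch: contour in the complex ω′ plane running along the real axis, indented around the point ω, and closed by a large semicircle in the upper half plane.]
Figure 3: The contour (left) used to demonstrate the Kramers-Kronig relations. Since the contour must contain ω, we must deform the contour so that it avoids the pole 1/(ω′ − ω).

Then taking the contour shown in Fig. 3 and assuming that χ(ω) ≲ 1/ω for large ω (in fact Re χ ≲ 1/ω and Im χ ≲ 1/ω²) we may ignore the large semicircle. Let z = ω + i0⁺, so that
$$\chi(\omega + i0^+) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{\chi(\omega')d\omega'}{\omega' - \omega - i0^+} \qquad (18)$$
$$2\pi i\,\chi(\omega + i0^+) = P\int_{-\infty}^{\infty}\frac{\chi(\omega')d\omega'}{\omega' - \omega} + i\pi\chi(\omega) \qquad (19)$$
$$\chi(\omega) = \frac{1}{i\pi}P\int_{-\infty}^{\infty}d\omega'\frac{\chi(\omega')}{\omega' - \omega} \qquad (20)$$
Then let 4πχ(ω) = ε(ω) − 1 = ε1 + iε2 − 1 and we get
$$\epsilon_1(\omega) + i\epsilon_2(\omega) - 1 = \frac{1}{\pi}P\int_{-\infty}^{\infty}\frac{-i\epsilon_1(\omega') + \epsilon_2(\omega') + i}{\omega' - \omega}d\omega' \qquad (21)$$
or
$$\epsilon_1(\omega) - 1 = \frac{1}{\pi}P\int_{-\infty}^{\infty}\frac{\epsilon_2(\omega')}{\omega' - \omega}d\omega' \qquad (22)$$
$$\epsilon_2(\omega) = -\frac{1}{\pi}P\int_{-\infty}^{\infty}\frac{\epsilon_1(\omega') - 1}{\omega' - \omega}d\omega' \qquad (23)$$
which are known as the Kramers-Kronig relations. Many experiments measure ε2 and from Eq. 22 we can calculate ε1!
2 Absorption of E and M radiation
2.1 Transmission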
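[Figure 4 sketch: a beam with field ξ̃ = ξ0 e^{−iω(t − ñx/c)} passes from vacuum (ñ = 1) through a slab and back into vacuum (ñ = 1). In the slab ñ = n + iκ = √ε(ω), so ñ² = n² + 2inκ − κ² = ε1 + iε2, with ε1 = n² − κ² and ε2 = 2nκ.]
Figure 4: In transmission experiments a laser beam is focused on a thin slab of some material we wish to study.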
Imagine that a laser beam of known frequency is normally incident upon a thin slab of some material (see Fig. 4), and we are able to measure its transmitted intensity. Upon passing through a boundary, part of the beam is reflected and part is transmitted (see Fig. 5). In the case of normal incidence, it is easy to calculate the related coefficients from the conditions of continuity of E⊥ and H⊥ = B⊥ (µ = 1).

[Figure 5 sketch: incident fields (ξ0, B0), transmitted fields (ξ′, B′), and reflected fields (ξ″, B″) at a surface between two media with µ = 1.]
Figure 5: The assumed orientation of electromagnetic fields incident on a surface.

$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \qquad i\mathbf{k}\times\mathbf{E} = \frac{i\omega}{c}\mathbf{B}, \qquad |nE_\perp| = |B_\perp| \qquad (24)$$
$$E_0 + E'' = E', \qquad (E_0 - E'') = nE' \qquad (25)$$
$$t = \frac{E'}{E_0} = \frac{2}{n+1} \qquad (26)$$
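In this way the other coefficients may be calculated (see Fig. 6).

[Figure 6 sketch: successive transmissions and reflections in the slab, with t1 = 2/(ñ + 1), t2 = 2ñ/(ñ + 1), and r2 = (ñ − 1)/(ñ + 1).]
Figure 6: Multiple events contribute to the radiation transmitted through and reflected from a thin slab.

Accounting for multiple reflections the total transmitted field is
$$E = E_0t_1t_2e^{ikd} + E_0t_1r_2r_2t_2e^{3ikd} + \cdots, \qquad \text{where } k = \frac{\tilde n\omega}{c} \qquad (27)$$
$$E = \frac{E_0t_1t_2e^{id\tilde n\omega/c}}{1 - r_2^2e^{2id\tilde n\omega/c}} \qquad (28)$$
If ñ ∼ 1 and d is small, then
$$E \simeq E_0e^{id\tilde n\omega/c} \qquad (29)$$
$$I \simeq I_0e^{id\omega(\tilde n - \tilde n^*)/c} = I_0e^{-d\omega 2\kappa/c} \qquad (30)$$
$$\kappa = \frac{\epsilon_2}{2\sqrt{\epsilon_1}} \qquad (31)$$
$$I \simeq I_0e^{-(\epsilon_2\omega/c)d} \simeq I_0\left(1 - \frac{\epsilon_2\omega}{c}d\right) \qquad (32)$$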
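The step from the series of Eq. 27 to the closed form of Eq. 28 is a geometric sum, which can be sketched numerically. The function names, the sample complex index, and the number of terms are illustrative assumptions. The argument `omega_d_over_c` stands for the dimensionless combination ωd/c.

```python
import cmath

def slab_coefficients(n_complex):
    """Interface coefficients t1, t2, r2 as labeled in Fig. 6."""
    t1 = 2.0 / (n_complex + 1.0)
    t2 = 2.0 * n_complex / (n_complex + 1.0)
    r2 = (n_complex - 1.0) / (n_complex + 1.0)
    return t1, t2, r2

def slab_transmission_closed(n_complex, omega_d_over_c):
    """E/E0 from the closed form of Eq. 28."""
    t1, t2, r2 = slab_coefficients(n_complex)
    phase = cmath.exp(1j * n_complex * omega_d_over_c)
    return t1 * t2 * phase / (1.0 - r2 ** 2 * phase ** 2)

def slab_transmission_series(n_complex, omega_d_over_c, terms=50):
    """Partial sum of Eq. 27; each further term adds two internal reflections."""
    t1, t2, r2 = slab_coefficients(n_complex)
    phase = cmath.exp(1j * n_complex * omega_d_over_c)
    total = 0.0 + 0.0j
    for m in range(terms):
        total += t1 * t2 * r2 ** (2 * m) * phase ** (2 * m + 1)
    return total

n_sample = 1.5 + 0.1j  # assumed complex index, for illustration only
print(slab_transmission_series(n_sample, 2.0))
print(slab_transmission_closed(n_sample, 2.0))
```

Because |r2² e^{2ikd}| < 1 for an absorbing slab, the partial sums converge quickly to the closed form.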
If the thickness d is known, then the quantity
$$\frac{\omega\epsilon_2(\omega)}{c} = \frac{4\pi\sigma_1(\omega)}{c} \qquad (33)$$
may be measured (the absorption coefficient). The real part of the dielectric response, ε1(ω), may be calculated with the Kramers-Kronig relation
$$\epsilon_1(\omega) = 1 + \frac{1}{\pi}P\int_{-\infty}^{\infty}\frac{\epsilon_2(\omega')}{\omega' - \omega}d\omega' \qquad (34)$$
According to Young Kim, this analysis works for sufficiently thin samples in the optical regime, but typically fails in the IR where ε2 becomes large.
2.2 Reflectivity
Of course, we could also have performed the experiment on a very thick slab of the material, and measured the reflectivity R/I. However, this is a much more complicated experiment since R/I depends upon both ε1 and ε2(ω), and is hence much more difficult to analyze [See Frederick Wooten, Optical Properties of Solids, (Academic Press, San Diego, 1972)].

[Figure 7 sketch: a beam incident from vacuum (ñ = 1) on a thick slab of index ñ, with R/I = |(ñ − 1)/(ñ + 1)|².]
Figure 7: The coefficient of reflectivity (the ratio of the reflected to the incident intensities) R/I depends upon both ε1(ω) and ε2(ω), making the analysis more complicated than in the transmission experiment.
To analyze these experiments, one must first measure the reflectivity, R(ω), over the entire frequency range. We then write
$$R(\omega) = r(\omega)r^*(\omega) = \frac{(n-1)^2 + \kappa^2}{(n+1)^2 + \kappa^2} \qquad (35)$$
where ñ = n + iκ and
$$r(\omega) = \rho(\omega)e^{i\theta} \qquad (36)$$
so that R(ω) = ρ(ω)². If ρ(ω) → 1 and θ → 0 fast enough as the frequency increases, then we may employ the Kramers-Kronig relations replacing χ with ln r(ω) = ln ρ(ω) + iθ(ω), so that
$$\theta(\omega) = -\frac{2\omega}{\pi}P\int_0^{\infty}\frac{\ln\sqrt{R(\omega')}}{\omega'^2 - \omega^2}d\omega' \qquad (37)$$
Thus, if R(ω) is measured over the entire frequency range where it is finite, then we can calculate θ(ω). This complete knowledge of R is generally not available, and various extrapolation and fitting schemes are used on R(ω) so that the integral above may be completed. We may then use Eq. 35 above to relate R and θ to the real and imaginary parts of the refractive index,
$$n(\omega) = \frac{1 - R(\omega)}{1 + R(\omega) - 2\sqrt{R(\omega)}\cos\theta(\omega)} \qquad (38)$$
$$\kappa(\omega) = \frac{2\sqrt{R(\omega)}\sin\theta(\omega)}{1 + R(\omega) - 2\sqrt{R(\omega)}\cos\theta(\omega)}, \qquad (39)$$
and therefore the dielectric response ε(ω) = (n(ω) + iκ(ω))².
2.3 Model Dielectric Response
We have seen that EM radiation is a sensitive probe of the dielectric properties of materials. Absorption and reflectivity experiments allow us to measure some combination of ε1 or ε2, with the remainder reconstructed by the Kramers-Kronig relations.
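As an illustration of the last step of the reflectivity analysis, the sketch below converts a reflectivity R and phase θ into n, κ, and ε using Eqs. 38 and 39, and checks the result by a round trip through r = (ñ − 1)/(ñ + 1). The function names and the sample index ñ = 2 + 0.5i are assumptions made for the example.

```python
import cmath
import math

def index_from_reflectivity(R, theta):
    """Return (n, kappa) from the reflectivity R and phase theta (Eqs. 38, 39)."""
    denom = 1.0 + R - 2.0 * math.sqrt(R) * math.cos(theta)
    n = (1.0 - R) / denom
    kappa = 2.0 * math.sqrt(R) * math.sin(theta) / denom
    return n, kappa

def dielectric_from_index(n, kappa):
    """Complex dielectric response eps = (n + i kappa)^2."""
    return (n + 1j * kappa) ** 2

# Round trip: start from an assumed complex index, build r, then recover it.
n_true = 2.0 + 0.5j
r = (n_true - 1.0) / (n_true + 1.0)
R, theta = abs(r) ** 2, cmath.phase(r)
n, kappa = index_from_reflectivity(R, theta)
print(n, kappa, dielectric_from_index(n, kappa))
```

In a real analysis θ(ω) would come from the Kramers-Kronig integral over the measured R(ω). Here it is taken directly from r to isolate the algebra.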
In order to learn more from such measurements, we need to have detailed models of the materials and their corresponding dielectric properties. For example, the electric field will interact with the moving charges associated with lattice vibrations. At the simplest level, we can model this as the interaction of isolated dipoles composed of bound damped charge e* and length s
$$p = e^*s \qquad (40)$$
The equation of motion for this system is
$$\mu\ddot{s} = -\mu\gamma\dot{s} - \mu\omega_0^2s + e^*E \qquad (41)$$
where the first term on the right-hand side is the damping force, the second term is the restoring force, and the third term is the external field. Furthermore, the polarization P is
$$P = \frac{N}{V}e^*s + \frac{N}{V}\alpha E \qquad (42)$$
where α is the polarizability of the different molecules which make up the material. It is included to represent the polarizability of rigid bodies. For example (see Fig. 8), a metallic sphere has
$$\alpha = a^3 \qquad (43)$$

[Figure 8 sketch: a metallic sphere of radius a, with E = 0 inside.]
Figure 8: We may model our material as a system of harmonically bound charge and of metallic spheres with polarizability α = a³.
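If we F.T. these two equations, we get
$$-\mu\omega^2s = i\omega\mu\gamma s - \mu\omega_0^2s + e^*E(\omega) \qquad (44)$$
$$\Rightarrow\; s(\omega)\left\{\omega_0^2 - \omega^2 - i\omega\gamma\right\} = \frac{e^*E(\omega)}{\mu} \qquad (45)$$
and
$$P(\omega) = ne^*s(\omega) + n\alpha E(\omega) \qquad (46)$$
or
$$P(\omega) = E(\omega)\left[\frac{ne^{*2}/\mu}{\omega_0^2 - \omega^2 - i\omega\gamma} + n\alpha\right] \qquad (47)$$
The term in brackets is the complex electric susceptibility of the system, χ = P/E.
$$\chi(\omega) = n\alpha + \frac{ne^{*2}/\mu}{\omega_0^2 - \omega^2 - i\omega\gamma} = \frac{1}{4\pi}(\epsilon(\omega) - 1) \qquad (48)$$
$$\epsilon(\omega) = 1 + 4\pi n\alpha + \frac{4\pi ne^{*2}/\mu}{\omega_0^2 - \omega^2 - i\gamma\omega}, \qquad \text{where } \omega_p^2 = \frac{4\pi ne^{*2}}{\mu} \qquad (49)$$
Or, introducing the high and zero frequency limits
$$\epsilon_\infty = 1 + 4\pi n\alpha \qquad (50)$$
$$\epsilon_0 = 1 + 4\pi n\alpha + \frac{\omega_p^2}{\omega_0^2} = \epsilon_\infty + \frac{\omega_p^2}{\omega_0^2} \qquad (51)$$
$$\epsilon(\omega) = \epsilon_\infty + \frac{\omega_0^2(\epsilon_0 - \epsilon_\infty)}{\omega_0^2 - \omega^2 - i\gamma\omega} \qquad (52)$$
For our causality arguments we must have no poles in the upper complex half plane. This is satisfied since γ > 0. In addition, we need
$$\chi = \frac{1}{4\pi}(\epsilon(\omega) - 1) \to 0, \qquad \text{as } \omega \to \infty \qquad (53)$$
This means that ε∞ = 1 + 4πnα = 1. Of course α is finite. The problem is that in making α = constant, we neglected the electron mass. I.e., for our example of a metallic sphere, α < a³ for very high ω (see Fig. 9) due to the finite electronic mass. To analyze experiments ε is separated into real ε1 and imaginary parts ε2
$$\epsilon_1(\omega) = -\frac{4\pi}{\omega}\sigma_2 = \epsilon_\infty + \frac{(\epsilon_0 - \epsilon_\infty)\omega_0^2(\omega_0^2 - \omega^2)}{(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2} \qquad (54)$$
$$\epsilon_2(\omega) = \frac{4\pi}{\omega}\sigma_1 = \frac{(\epsilon_0 - \epsilon_\infty)\omega_0^2\gamma\omega}{(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2} \qquad (55)$$

[Figure 9 sketch: a metallic sphere of radius a with induced surface charge in an oscillating field ξ(ω).]
Figure 9: For very high ω, α < a³ since the electrons have a finite mass and hence cannot respond instantaneously to changes in the field.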
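The separation of Eq. 52 into the real and imaginary parts of Eqs. 54 and 55 can be verified with a few lines of code. This is a minimal sketch: the function names and the sample parameter values (in arbitrary frequency units) are assumptions made for illustration.

```python
def lorentz_epsilon(omega, omega0, gamma, eps0, eps_inf):
    """Complex dielectric function of the bound-charge model, Eq. 52."""
    return eps_inf + omega0 ** 2 * (eps0 - eps_inf) / (
        omega0 ** 2 - omega ** 2 - 1j * gamma * omega
    )

def lorentz_parts(omega, omega0, gamma, eps0, eps_inf):
    """Return (eps1, eps2) from the explicit expressions of Eqs. 54 and 55."""
    denom = (omega0 ** 2 - omega ** 2) ** 2 + gamma ** 2 * omega ** 2
    eps1 = eps_inf + (eps0 - eps_inf) * omega0 ** 2 * (omega0 ** 2 - omega ** 2) / denom
    eps2 = (eps0 - eps_inf) * omega0 ** 2 * gamma * omega / denom
    return eps1, eps2

# Assumed sample parameters: resonance at 1.0, damping 0.1.
for omega in (0.5, 1.0, 1.5):
    eps = lorentz_epsilon(omega, 1.0, 0.1, 12.0, 10.0)
    print(omega, eps, lorentz_parts(omega, 1.0, 0.1, 12.0, 10.0))
```

At ω = ω0 the real part reduces to ε∞ and ε2 takes the value (ε0 − ε∞)ω0/γ, close to its peak when γ ≪ ω0.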
Thus a phonon mode will give a roughly Lorentzian-like line shape in the optical conductivity σ1 centered roughly at the phonon frequency. In addition, this form may be used to construct a model reflectivity R = |ñ − 1|²/|ñ + 1|², with ñ = √ε, which is often appended to the high end of the reflectivity data, so that the Kramers-Kronig integrals may be completed.
3 The Free-electron gas
Metals have a distinct feature in their optical conductivity σ1(ω) which may be emulated by the free-electron gas. The equation of motion of the free-electron gas is
$$nm\ddot{s} = -\gamma\dot{s} - neE \qquad (56)$$

[Figure 10 sketch: ε1(ω) and ε2(ω) versus ω, with the limits ε0 and ε∞ and a peak of width γ at ω0.]
Figure 10: Sketch of the real and imaginary parts of the dielectric response of the harmonically bound charge model.
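In steady state γṡ = −neE; however
$$-ne\dot{s} = j = \sigma E = \frac{ne^2\tau}{m}E \qquad (57)$$
in the relaxation time approximation, so
$$\dot{s} = -\frac{e\tau}{m}E = -\frac{ne}{\gamma}E \qquad \text{or} \qquad \gamma = \frac{nm}{\tau}. \qquad (58)$$
Thus, the equation of motion is
$$nm\ddot{s} = -\frac{nm}{\tau}\dot{s} - neE \qquad (59)$$
If we work in a Fourier representation, then
$$-m\omega^2s(\omega) = \frac{i\omega m}{\tau}s(\omega) - eE(\omega) \qquad (60)$$
However, since P = −ens
$$\left(m\omega^2 + \frac{i\omega m}{\tau}\right)P = -ne^2E(\omega) \qquad (61)$$
or
$$P = -\frac{ne^2/m}{\omega^2 + i\omega/\tau}E = \chi E = \frac{1}{4\pi}(\epsilon - 1)E \qquad (62)$$
$$\epsilon = 1 - \frac{4\pi ne^2}{m\omega}\frac{1}{\omega + i/\tau} = 1 - \frac{\omega_p^2}{\omega}\frac{\omega - i/\tau}{\omega^2 + 1/\tau^2} \qquad (63)$$
$$\epsilon_1 = 1 - \frac{\omega_p^2}{\omega^2 + 1/\tau^2}, \qquad \epsilon_2 = \frac{\omega_p^2}{\omega\tau}\frac{1}{\omega^2 + 1/\tau^2} \qquad (64)$$
However, recall that ε1 = −4πσ2/ω and ε2 = 4πσ1/ω, so that
$$\sigma_1(\omega) = \frac{\omega_p^2}{4\pi}\frac{1/\tau}{\omega^2 + 1/\tau^2}. \qquad (65)$$
4 Excitons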
One of the most dramatic effects of the dielectric properties of semiconductors is the formation of excitons. Put simply, an exciton is a hydrogenic bound state made up of a hole and an electron. The Hamiltonian for such a system is
$$H = \frac{1}{2}m_h^*v_h^2 + \frac{1}{2}m_e^*v_e^2 - \frac{e^2}{\epsilon r} \qquad (66)$$

[Figure 11 sketch: σ1(ω) versus ω, showing a Drude peak (electronic) of width 1/τ with ∫σ1(ω)dω = ωp²/4, and a higher-frequency phonon peak.]
Figure 11: A sketch of the optical conductivity of a metal at finite temperatures. The low-frequency Drude peak is due to the coherent transport of electrons. The higher frequency peak is due to incoherent scattering of electrons from phonons or electron-electron interactions.

As usual, we will work in the center of mass, so
$$\frac{1}{2}m_h^*v_h^2 + \frac{1}{2}m_e^*v_e^2 = \frac{1}{2}(m_h^* + m_e^*)\dot{R}^2 + \frac{1}{2}\mu^*\dot{r}^2 \qquad (67)$$
$$\bar{R} = \frac{m_e^*\bar{x}_e + m_h^*\bar{x}_h}{m_h^* + m_e^*}, \qquad \bar{r} = \bar{x}_e - \bar{x}_h \qquad (68)$$
The eigenenergies may be obtained from the Bohr atom solution $E_n = -\mu(Ze^2)^2/(2\hbar^2n^2)$ by making the substitutions
$$Ze^2 \to \frac{e^2}{\epsilon} \qquad (69)$$
$$\mu \to \mu^* = \frac{m_h^*m_e^*}{m_h^* + m_e^*} \qquad (70)$$
In addition there is a cm kinetic energy; let ħK = P, then
$$E_{nK} = \frac{\hbar^2K^2}{2(m_h^* + m_e^*)} + E_g - \frac{\mu^*e^4}{\epsilon^2\,2\hbar^2n^2} \qquad (71)$$
[Figure 12 sketch: left, an electron and a hole separated by the gap Eg; right, the exciton coordinates relative to the origin O, with center-of-mass coordinate R, relative coordinate r, and cm wave vector K.]
Figure 12: Excitons are hydrogenic bound states of an electron and a hole. In semiconductors with an indirect gap, such as Si, they can be very long lived, since a phonon or defect must be involved in the recombination process to conserve momentum. For example, in ultra-pure Si (≈ 10¹² impurities per cc) the lifetime can exceed τSi ≈ 10⁻⁵ s; whereas in direct-gap GaAs, the lifetime is much shorter, τGaAs ≈ 10⁻⁹ s, and is generally limited by surface states. On the right, the coordinates of the exciton in the center of mass are shown.
The binding energy is strongly reduced by the dielectric effects, since ε ≃ 10 (it is also reduced by the effective masses, since typically m* < m). Thus, typically the binding energy is a small fraction of a Rydberg. Similarly, the size of the exciton is much larger than the hydrogen atom
$$a \simeq \epsilon\frac{m_e}{\mu^*}a_0 \qquad (72)$$
This in fact is the justification for the hydrogenic approximation. Since the orbit contains many sites the lattice may be approximated as a continuum and thus the exciton is well approximated as a hydrogenic atom.

[Figure 13 sketch: an electron orbiting a hole (+) at a radius ∼ 10 Å, spanning many lattice sites.]
Figure 13: An exciton may be approximated as a hydrogenic atom if its radius is large compared to the lattice spacing. In Si, the radius is large due to the reduced hole and electron masses and the enhanced dielectric constant ε ≈ 10.
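The hydrogenic scaling can be made concrete with a small sketch. It assumes the binding energy of Eq. 71, written as a Rydberg scaled by (µ*/m)/ε², and a radius scaled as a = ε(m/µ*)a0. The function names and the effective masses in the example (0.2 m for both carriers) are hypothetical values chosen only for illustration.

```python
RYDBERG_EV = 13.6057             # hydrogen ground-state binding energy in eV
BOHR_RADIUS_ANGSTROM = 0.529177  # hydrogen Bohr radius in angstroms

def reduced_mass(m_e_eff, m_h_eff):
    """Reduced mass of Eq. 70, in units of the free-electron mass."""
    return m_e_eff * m_h_eff / (m_e_eff + m_h_eff)

def exciton_binding_energy_ev(mu_ratio, eps, n=1):
    """Binding term of Eq. 71: mu* e^4 / (2 eps^2 hbar^2 n^2), in eV."""
    return RYDBERG_EV * mu_ratio / (eps ** 2 * n ** 2)

def exciton_radius_angstrom(mu_ratio, eps):
    """Hydrogenic radius a = eps * (m / mu*) * a0, in angstroms."""
    return BOHR_RADIUS_ANGSTROM * eps / mu_ratio

mu = reduced_mass(0.2, 0.2)  # hypothetical effective masses
print(mu, exciton_binding_energy_ev(mu, 10.0), exciton_radius_angstrom(mu, 10.0))
```

For any µ*/m < 1 and ε ∼ 10 the binding energy comes out as a small fraction of a Rydberg and the radius as many Bohr radii, in line with the discussion above.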
Chapter 12: Semiconductors Bardeen & Shottky May 18, 2001
Contents
1 Band Structure
2 Charge Carrier Density in Intrinsic Semiconductors
3 Doping of Semiconductors
4 Carrier Densities in Doped semiconductor
Semiconductors are of obvious technological importance, so much so that a whole chapter will be dedicated to them. Semiconductors are distinguished from metals in that they have a gap at the Fermi surface, and are distinguished from insulators in that the gap is small, ≲ 1 eV.

[Figure 1 sketch: band fillings of a metal, a semiconductor, and an insulator.]
Figure 1: There is no band gap at the Fermi energy in a metal, while there is a band gap in an insulator. Semiconductors on the other hand have a band gap, but it is much smaller than those found in insulators.

Most condensed matter physicists make the distinction on the basis of the conductivity and its temperature dependence. In the Drude model (parabolic band)
$$\sigma = \frac{ne^2\tau}{m^*}, \qquad \mu = \frac{e\tau}{m^*}, \qquad \sigma = ne\mu \qquad (1)$$
Almost always 1/τ increases as T increases, i.e., the thermal excitations increase the scattering rate and decrease the lifetime of the quasiparticle. For example, we have seen that 1/τ ∼ T at high temperatures due to electron-phonon interactions. In metals, n is about constant, so the temperature dependence of metals is dictated by τ.
$$\text{metals:} \qquad \sigma \sim \tau \sim \frac{1}{T}, \qquad \sigma\downarrow \text{ as } T\uparrow \qquad (2)$$
However, in semiconductors, the population of free carriers n is temperature dependent.

[Figure 2 sketch: electrons excited across the gap Eg, with n ∼ e^{−Eg/2kT}.]
Figure 2: This shows the temperature dependence for the excitation of electrons, thus allowing the number of free carriers to vary with changes in temperature.

The exponential always will dominate the power law dependence of τ.
$$\sigma \sim \tau n \sim \frac{1}{T}e^{-E_g/2kT} \qquad (3)$$
$$\sigma\uparrow \text{ as } T\uparrow \qquad (4)$$
The same is true for insulators, of course, except here n is so small that for all realistic purposes σ ∼ 0.
1 Band Structure
Clearly the band structure of the semiconductors is crucial for their device applications. Semiconductors fall into several categories, depending upon their composition. The simplest, type IV, include silicon and germanium. The type refers to their valence.

[Figure 3 sketch: energy E versus Si-Si separation r. The atomic s and p levels broaden into bonding (B) and antibonding (AB) sp³ bands with 4 electrons per band, forming the valence band and the conduction band, separated by Eg at the equilibrium separation r0.]
Figure 3: Sketch of the sp³ bands in Si vs. Si-Si separation.

Recall that Si and Ge have a s²p² atomic shell, which forms highly directional sp³ hybrid bonds in the solid state (with tetrahedral symmetry). It is the covalent bonding, or rather the splitting between the bonding and antibonding bands, that forms the gap. The band structure is also quite rich.

[Figure 4 sketch: left, the Brillouin zone with the high symmetry points Γ (000), X (100), K (110), and L (111); right, the Si bands along L, Γ, X, K, Γ with the gap Eg.]
Figure 4: Sketch of quasiparticle bands in Si (right) along the high symmetry directions (left).

$$\frac{1}{m^*_{ij}} = \frac{1}{\hbar^2}\frac{\partial^2E(k)}{\partial k_i\partial k_j} \qquad (5)$$
The situation in III-V semiconductors such as GaAs is similar, in that covalent sp³ bands still form. However, the gap is direct. For this reason GaAs makes more efficient optical devices than does either Si or Ge. A particle-hole excitation across the gap can readily recombine, emit a photon (which has essentially no momentum) and conserve momentum in GaAs; whereas, in an indirect gap semiconductor, this recombination requires the additional creation or absorption of a phonon or some other lattice excitation to conserve momentum. For the same reason, excitons live much longer in Si and especially Ge than they do in GaAs.

[Figure 5 sketch: the Ge bands along Γ, L, X, K, Γ with the gap Eg.]
Figure 5: Sketch of quasiparticle bands in Ge along the high symmetry directions. Note the indirect, roughly Γ → L, minimum gap energy.

material   τexciton
GaAs       1 ns (10⁻⁹ s)
Si         19 µs (10⁻⁵ s)
Ge         1 ms (10⁻³ s)

Table 1:
[Figure 6 sketch: the GaAs bands along L, Γ, X, K, Γ with Eg ∼ 1.5 eV.]
Figure 6: Sketch of quasiparticle bands in GaAs along the high symmetry directions of the Brillouin zone. Note the direct, Γ → Γ, minimum gap energy. The nature of the gap can be tuned with Al doping.

2 Charge Carrier Density in Intrinsic Semiconductors
Both electrons and holes contribute to the conductivity with the same sign. Here the mobilities are assumed to be constant. This is valid since for semiconductors all of the conducting carriers fall near the top or bottom of bands, where $E_k \sim \hbar^2k^2/2m^*$ and the effective mass approximation is valid. Here, we found that µ ∼ eτ/m*.

[Figure 7 sketch: conduction and valence bands for a direct gap (left), where recombination emits a photon with ħω ∼ Eg and ck ≅ ω ≅ Eg/ħ, and an indirect gap (right), where a phonon with vs k ≅ ω is also involved.]
Figure 7: A particle-hole excitation across the gap can readily recombine, emit a photon (which has essentially no momentum) and conserve momentum in a direct gap semiconductor (left) such as GaAs. Whereas, in an indirect gap semiconductor (right), this recombination requires the additional creation or absorption of a phonon or some other lattice excitation to conserve momentum.

[Figure 8 sketch: electrons (e⁻) and holes (e⁺) drifting with velocity v in a field E, both giving a current j along E; n = # electrons/volume, p = # holes/volume, σ = e(nµn + pµp).]
Figure 8: The contribution of the electrons and holes to the conductivity.

However, as mentioned before, the carrier concentrations are highly T-dependent since all of the carriers in an intrinsic (undoped) semiconductor are thermally induced (i.e. n = p = 0 at T = 0).
$$n = \int_{E_c}^{E_{top}}D_C(E)f(E,T)\,dE \to \int_{E_c}^{\infty}D_C(E)f(E,T)\,dE \qquad (6)$$
$$p = \int_{E_{bottom}}^{E_v}D_V(E)\{1 - f(E,T)\}\,dE \to \int_{-\infty}^{E_v}D_V(E)\{1 - f(E,T)\}\,dE \qquad (7)$$
To proceed further we need forms for DC and DV. Recall that in
the parabolic approximation $E_k \simeq \hbar^2k^2/2m^*$ we found that $D(E) = \frac{(2m^*)^{3/2}}{2\pi^2\hbar^3}\sqrt{E}$. Thus,
$$D_C(E) = \frac{(2m_n^*)^{3/2}}{2\pi^2\hbar^3}\sqrt{E - E_C} \qquad (8)$$
$$D_V(E) = \frac{(2m_p^*)^{3/2}}{2\pi^2\hbar^3}\sqrt{E_V - E} \qquad (9)$$
for E > EC and E < EV respectively, and zero otherwise (EV < E < EC).

[Figure 9 sketch: conduction band with E_k ∼ ħ²k²/2m* above Ec holding n electrons, and valence band below Ev holding p holes, separated by Eg; µ ∼ eτ/m*.]
Figure 9:

In an intrinsic (undoped) semiconductor n = p, and so EF must lie in the band gap. However, if m*n ≠ m*p (i.e., DC ≠ DV), then the chemical potential, EF, must be adjusted up or down from the center of the gap so that n = p.
Furthermore, the carriers which are induced across the gap are relatively high in energy, compared to kBT, since typically
$E_g = E_C - E_V \gg k_BT$.

material   Eg (eV)   ni (cm⁻³) (300 K)
Ge         0.67      2.4 × 10¹³
Si         1.1       1.5 × 10¹⁰
GaAs       1.43      5 × 10⁷

Table 2:

$$\frac{1\,\text{eV}}{k_B} \sim 10000\ \text{K} \gg 300\ \text{K} \sim T \qquad (10)$$
Thus, assuming that $E - E_F \gtrsim E_g/2 \gg k_BT$
$$\frac{1}{e^{(E-E_F)/k_BT} + 1} \simeq \frac{1}{e^{(E-E_F)/k_BT}} = e^{-(E-E_F)/k_BT} \qquad (11)$$
i.e., Boltzmann statistics. A similar relationship holds for holes where $-(E - E_F) \gtrsim E_g/2 \gg k_BT$
$$1 - \frac{1}{e^{(E-E_F)/k_BT} + 1} \simeq 1 - \left\{1 - e^{(E-E_F)/k_BT}\right\} = e^{(E-E_F)/k_BT} \qquad (12)$$
since (1 − f(E)) = f(−E) and $e^{(E-E_F)/k_BT}$ is small. Thus, the concentration of electrons n
$$n \simeq \frac{(2m_n^*)^{3/2}}{2\pi^2\hbar^3}e^{E_F/k_BT}\int_{E_C}^{\infty}\sqrt{E - E_C}\,e^{-E/k_BT}\,dE \qquad (13)$$
$$= \frac{(2m_n^*)^{3/2}}{2\pi^2\hbar^3}(k_BT)^{3/2}e^{-\beta(E_C - E_F)}\int_0^{\infty}x^{1/2}e^{-x}\,dx \qquad (14)$$
$$= 2\left(\frac{2\pi m_n^*k_BT}{h^2}\right)^{3/2}e^{-\beta(E_C - E_F)} = N^C_{eff}\,e^{-\beta(E_C - E_F)} \qquad (15)$$
Similarly
$$p = 2\left(\frac{2\pi m_p^*k_BT}{h^2}\right)^{3/2}e^{\beta(E_V - E_F)} = N^V_{eff}\,e^{\beta(E_V - E_F)} \qquad (16)$$
where $N^V_{eff}$ and $N^C_{eff}$ are the partition functions for a classical gas in 3-d and can be regarded as "effective densities of states" which are temperature-dependent. Within this interpretation, we can regard the hole and electron statistics as classical. This holds so long as n and p are small, so that the Pauli principle may be ignored, the so-called nondegenerate limit. In general, in the nondegenerate limit,
$$np = 4\left(\frac{k_BT}{2\pi\hbar^2}\right)^3\left(m_n^*m_p^*\right)^{3/2}e^{-\beta E_g} \qquad (17)$$
this, the law of mass action, holds for both doped and intrinsic semiconductors so long as we remain in the nondegenerate limit. However, for an intrinsic semiconductor, where n = p, it gives us further information.
$$n_i = p_i = 2\left(\frac{k_BT}{2\pi\hbar^2}\right)^{3/2}\left(m_n^*m_p^*\right)^{3/4}e^{-\beta E_g/2} \qquad (18)$$
However, we already have relationships for n and p involving EC and EV
$$n = p = N^C_{eff}\,e^{-\beta(E_C - E_F)} = N^V_{eff}\,e^{\beta(E_V - E_F)} \qquad (19)$$
$$e^{2\beta E_F} = \frac{N^V_{eff}}{N^C_{eff}}e^{\beta(E_V + E_C)} \qquad (20)$$
or
$$E_F = \frac{1}{2}(E_V + E_C) + \frac{1}{2}k_BT\ln\frac{N^V_{eff}}{N^C_{eff}} \qquad (21)$$
$$E_F = \frac{1}{2}(E_V + E_C) + \frac{3}{4}k_BT\ln\frac{m_p^*}{m_n^*} \qquad (22)$$
Thus if m*p ≠ m*n, the chemical potential EF in a semiconductor is temperature dependent. Recall that this T-dependence was important for the transport of a semiconductor in the presence of a thermal gradient ∇T.
3 Doping of Semiconductors
σ = neµ, so the conductivity depends linearly upon the doping (it may also affect µ in some materials, leading to a non-linear doping dependence). A typical metal has
$$n_{metal} \sim 10^{23}/\text{cm}^3 \qquad (23)$$
whereas we have seen that a typical semiconductor has
$$n_i^{SeC} \sim \frac{10^{10}}{\text{cm}^3} \qquad \text{at } T \simeq 300\ \text{K} \qquad (24)$$
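The order of magnitude quoted for the intrinsic density can be checked against Eq. 18, and the shift of the chemical potential against Eq. 22. The sketch below is illustrative only: the function names are invented here, and the default effective masses (both equal to the free-electron mass) are a simplifying assumption, not measured band masses.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J/K
M_E = 9.1093837015e-31   # kg
EV = 1.602176634e-19     # J

def intrinsic_density_cm3(gap_ev, temperature, mn_ratio=1.0, mp_ratio=1.0):
    """n_i from Eq. 18 in cm^-3; masses are in units of the free-electron mass."""
    kt = K_B * temperature
    prefactor = 2.0 * (kt / (2.0 * math.pi * HBAR ** 2)) ** 1.5
    masses = (mn_ratio * mp_ratio * M_E ** 2) ** 0.75
    n_per_m3 = prefactor * masses * math.exp(-gap_ev * EV / (2.0 * kt))
    return n_per_m3 * 1.0e-6  # convert m^-3 to cm^-3

def fermi_level_shift_ev(temperature, mn_ratio, mp_ratio):
    """E_F - (E_V + E_C)/2 from Eq. 22, in eV."""
    return 0.75 * K_B * temperature / EV * math.log(mp_ratio / mn_ratio)

print(intrinsic_density_cm3(1.1, 300.0))      # gap of Si from Table 2
print(fermi_level_shift_ev(300.0, 0.5, 1.0))  # hypothetical mass ratio
```

With a gap of 1.1 eV at 300 K this gives a density of order 10¹⁰ cm⁻³, many orders of magnitude below the metallic value of Eq. 23.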
Thus the conductivity of an intrinsic semiconductor is quite small! To increase $n$ (or $p$) to $\sim 10^{18}$ or more, dopants are used. For example, in Si the elements used as dopants have either an $s^2p^1$ or an $s^2p^3$ atomic valence. Thus, in the tetrahedral bonding of Si there is either an extra electron (half bond) or an unsatisfied bond, i.e. a hole. Thus P or B will either donate or absorb an additional electron (with the latter called the creation of a hole).

[Figure 10: Si lattice with substitutional P (donor) and B (acceptor) impurities and their bound carriers (labels e-, e+). Valence configurations: Si $3s^2 3p^2$, P $3s^2 3p^3$, B $3s^2 3p^1$. The bound carrier orbits at a large radius $r = \hbar^2 \epsilon / (m^* e^2)$.]

As in an exciton, these additional charges will be localized around the donor or acceptor ion. The difference is that here the donor/acceptor is fixed and may be treated as having infinite mass, thus the binding energy is given by

$$ E = \frac{m^* e^4}{2 \epsilon^2 \hbar^2 n^2} \tag{25} $$

$$ m^* = \begin{cases} \text{hole mass} & \text{acceptor (B)} \\ \text{electron mass} & \text{donor (P)} \end{cases} \tag{26} $$

Again, since $m^*/m < 1$ and $\epsilon \sim 10$, these energies are often much less than 13.6 eV; cf. in Si $E \sim 30$ meV $\sim 300$ K, or in Ge $E \sim 6$ meV $\sim 60$ K. Thus thermal excitations will often ionize these dopant sites. In terms of energy levels:

[Figure 11: Energy-level diagrams for an n-type (n-SeC) and a p-type (p-SeC) semiconductor. n-type: $E_C$, $E_F$, donor level $E_D$ (occupied at $T = 0$), $E_V$. p-type: $E_C$, acceptor level $E_A$ (unoccupied at $T = 0$, i.e. occupied by holes at $T = 0$), $E_F$, $E_V$.]
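The hydrogenic estimate of Eq. (25) can be checked numerically by rescaling the hydrogen values: the binding energy scales as $(m^*/m)/\epsilon^2$ times 13.6 eV and the orbit radius as $\epsilon/(m^*/m)$ times the Bohr radius. The following is a minimal sketch; the function names and the inputs `mass_ratio = 0.3` and `eps = 12.0` are illustrative assumptions, not values taken from the text.

```python
# Hydrogen reference values and Boltzmann constant
RYDBERG_EV = 13.6057          # hydrogen ground-state binding energy, eV
BOHR_RADIUS_NM = 0.0529177    # hydrogen Bohr radius, nm
K_B_EV_PER_K = 8.617333e-5    # Boltzmann constant, eV/K


def dopant_binding_energy_ev(mass_ratio, eps, n=1):
    """Eq. (25) rescaled from hydrogen: E = 13.6 eV * (m*/m) / (eps^2 n^2)."""
    return RYDBERG_EV * mass_ratio / (eps**2 * n**2)


def dopant_orbit_radius_nm(mass_ratio, eps):
    """Orbit radius r = hbar^2 eps / (m* e^2) = a_0 * eps / (m*/m)."""
    return BOHR_RADIUS_NM * eps / mass_ratio


# Illustrative (assumed) inputs: m*/m = 0.3, eps = 12
mass_ratio = 0.3
eps = 12.0
energy = dopant_binding_energy_ev(mass_ratio, eps)
radius = dopant_orbit_radius_nm(mass_ratio, eps)
print(f"E = {1e3 * energy:.1f} meV")
print(f"E / k_B = {energy / K_B_EV_PER_K:.0f} K")
print(f"r = {radius:.2f} nm")
```

With these assumed inputs the binding energy comes out at a few tens of meV and the radius at a few nm, i.e. many lattice spacings, which is what justifies treating the host crystal as a dielectric continuum.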
4
Carrier Densities in Doped Semiconductors
The law of mass action is valid so long as the use of Boltzmann statistics is valid, i.e., if the degeneracy is small. Thus, even for a doped semiconductor,

$$ np = N^C_{\mathrm{eff}} N^V_{\mathrm{eff}}\, e^{-\beta E_g} = n_i^2 = p_i^2 \tag{27} $$
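To get a feel for the magnitudes, the effective densities of states of Eq. (16), the intrinsic density of Eq. (18), the intrinsic Fermi-level shift of Eq. (22) and the mass-action relation of Eq. (27) can be evaluated directly. This is a minimal sketch in SI units; the function names, the mass ratios of 1, the gap of 1.1 eV and the doping level of $10^{16}\ \mathrm{cm}^{-3}$ are assumptions chosen for illustration only.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # free-electron mass, kg
EV = 1.602176634e-19     # 1 eV in J


def n_eff(mass_ratio, temperature):
    """Effective density of states 2 (m* k_B T / (2 pi hbar^2))^(3/2), in cm^-3."""
    per_m3 = 2.0 * (mass_ratio * M_E * K_B * temperature
                    / (2.0 * math.pi * HBAR**2)) ** 1.5
    return per_m3 * 1e-6  # convert m^-3 to cm^-3


def intrinsic_density(mn_ratio, mp_ratio, gap_ev, temperature):
    """Eq. (18): n_i = sqrt(N_C N_V) exp(-E_g / (2 k_B T)), in cm^-3."""
    beta = 1.0 / (K_B * temperature)
    prefactor = math.sqrt(n_eff(mn_ratio, temperature) * n_eff(mp_ratio, temperature))
    return prefactor * math.exp(-0.5 * beta * gap_ev * EV)


def intrinsic_fermi_shift_ev(mn_ratio, mp_ratio, temperature):
    """Eq. (22): E_F - (E_V + E_C)/2 = (3/4) k_B T ln(m_p*/m_n*), in eV."""
    return 0.75 * K_B * temperature * math.log(mp_ratio / mn_ratio) / EV


def minority_density(majority_density, n_i):
    """Eq. (27): the other carrier density follows from n p = n_i^2."""
    return n_i**2 / majority_density


# Illustrative (assumed) inputs
T = 300.0
n_i = intrinsic_density(1.0, 1.0, 1.1, T)
print(f"N_eff(m*/m = 1, 300 K) = {n_eff(1.0, T):.2e} cm^-3")
print(f"n_i = {n_i:.2e} cm^-3")
print(f"E_F shift for m_p*/m_n* = 2: {1e3 * intrinsic_fermi_shift_ev(1.0, 2.0, T):.1f} meV")
print(f"p for n = 1e16 cm^-3: {minority_density(1e16, n_i):.2e} cm^-3")
```

With the assumed 1.1 eV gap the intrinsic density comes out of order $10^{10}\ \mathrm{cm}^{-3}$, in line with Eq. (24), and doping to $10^{16}\ \mathrm{cm}^{-3}$ suppresses the minority carrier density by many orders of magnitude.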
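Now imagine the temperature is finite so that some of the donors or acceptors are ionized. Furthermore, in equilibrium, the semiconductor is charge neutral so that

$$ n + N_A^- = p + N_D^+ \tag{28} $$

[Figure 12: Energy-level diagram showing $E_C$, $E_D$, $E_F$, $E_A$ and $E_V$, with the donors split into ionized and un-ionized populations, $N_D = N_D^+ + N_D^0$, and likewise the acceptors, $N_A = N_A^- + N_A^0$.]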
The probability that a donor/acceptor is occupied by an electron is determined by Fermi statistics:

$$ n_D = N_D^0 = N_D\, \frac{1}{1 + e^{\beta(E_D - E_F)}} \tag{29} $$

$$ p_A = N_A^0 = N_A \left(1 - f(E_A)\right) = N_A\, \frac{1}{1 + e^{\beta(E_F - E_A)}} \tag{30} $$
To provide a solvable example, imagine that we have an n-type semiconductor (no p-type dopants) so that $N_A = N_A^0 = N_A^- = 0$. Then

$$ n = N^C_{\mathrm{eff}}\, e^{-\beta(E_C - E_F)} \tag{31} $$

$$ N_D = N_D^0 + N_D^+ \tag{32} $$

$$ N_D^0 = N_D\, \frac{1}{e^{\beta(E_D - E_F)} + 1} \tag{33} $$
Furthermore, charge neutrality requires that

$$ n = p + N_D^+ \tag{34} $$
An excellent approximation is to assume that for a (commercially) doped semiconductor

$$ N_D^+ \gg n_i \tag{35} $$

i.e., many more carriers are provided by dopants than are thermally excited over the entire gap. Then, as $np = n_i^2$, it must be that $N_D^+ \gg p$, so that

$$ n \approx N_D^+ = N_D - N_D^0 \tag{36} $$

$$ n \approx N_D \left(1 - \frac{1}{e^{\beta(E_D - E_F)} + 1}\right) \tag{37} $$
If we recall that thermally induced carriers obey Boltzmann statistics,

$$ n = N^C_{\mathrm{eff}}\, e^{\beta(E_F - E_C)} \tag{38} $$

we can eliminate $E_F$ in $n$ (where $E_d = E_C - E_D$):

$$ n = \frac{N_D}{1 + e^{\beta E_d}\, n / N^C_{\mathrm{eff}}} \tag{39} $$
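The elimination of $E_F$ between Eqs. (37) and (38) can be spelled out in three steps. This is a sketch of the intermediate algebra, using only the symbols already defined above:

```latex
\begin{align}
1 - \frac{1}{e^{\beta(E_D - E_F)} + 1}
  &= \frac{1}{1 + e^{\beta(E_F - E_D)}}, \\
e^{\beta(E_F - E_D)}
  &= e^{\beta(E_F - E_C)}\, e^{\beta(E_C - E_D)}
   = \frac{n}{N^C_{\mathrm{eff}}}\, e^{\beta E_d}, \\
\Rightarrow \quad
\frac{e^{\beta E_d}}{N^C_{\mathrm{eff}}}\, n^2 + n - N_D &= 0 .
\end{align}
```

The first line rewrites the bracket in Eq. (37), the second uses Eq. (38) to replace the Fermi factor, and the third is Eq. (39) multiplied out.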
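This quadratic equation has only one meaningful solution:

$$ n = \frac{2 N_D}{1 + \sqrt{1 + 4 \left(N_D / N^C_{\mathrm{eff}}\right) e^{\beta E_d}}} \tag{40} $$

At low $T \ll E_d / k_B$,

$$ n \simeq \sqrt{N_D N^C_{\mathrm{eff}}}\; e^{-\beta E_d / 2} \tag{41} $$

while at higher $T \gg E_d / k_B$,

$$ n = N_D \tag{42} $$

At still higher $T$ our approximation $N_D^+ \gg n_i$ breaks down, since thermally excited carriers will dominate.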
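The crossover between the two limits can be seen by evaluating Eq. (40) over a range of temperatures and comparing with Eqs. (41) and (42). The sketch below assumes that $N^C_{\mathrm{eff}}$ scales as $T^{3/2}$, as in Eq. (16), anchored at a hypothetical value of $2.5 \times 10^{19}\ \mathrm{cm}^{-3}$ at 300 K; the donor density and the donor depth $E_d = 30$ meV are likewise illustrative assumptions.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant, eV/K


def n_c_eff(temperature, n_c_300=2.5e19):
    """Conduction-band effective density of states, cm^-3.

    Assumes the T^(3/2) scaling of Eq. (16), anchored at a hypothetical
    300 K value n_c_300.
    """
    return n_c_300 * (temperature / 300.0) ** 1.5


def electron_density(temperature, n_d, e_d_ev, n_c_300=2.5e19):
    """Eq. (40): physical root of the quadratic, Eq. (39)."""
    x = (4.0 * n_d / n_c_eff(temperature, n_c_300)
         * math.exp(e_d_ev / (K_B_EV * temperature)))
    return 2.0 * n_d / (1.0 + math.sqrt(1.0 + x))


def freeze_out_limit(temperature, n_d, e_d_ev, n_c_300=2.5e19):
    """Eq. (41): low-temperature limit sqrt(N_D N_C) exp(-E_d / (2 k_B T))."""
    return (math.sqrt(n_d * n_c_eff(temperature, n_c_300))
            * math.exp(-e_d_ev / (2.0 * K_B_EV * temperature)))


# Illustrative (assumed) inputs: N_D = 1e16 cm^-3, E_d = 30 meV
N_D = 1e16
E_D_EV = 0.030
for T in (20.0, 50.0, 100.0, 300.0):
    n = electron_density(T, N_D, E_D_EV)
    n_low = freeze_out_limit(T, N_D, E_D_EV)
    print(f"T = {T:5.0f} K   n/N_D = {n / N_D:.3e}   Eq.(41)/N_D = {n_low / N_D:.3e}")
```

At the lowest temperatures the exact result follows Eq. (41), and by room temperature $n / N_D$ approaches 1 as in Eq. (42). The sketch does not include the intrinsic regime at still higher $T$, where Eq. (35) no longer holds.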