Kluwer

  • December 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Kluwer as PDF for free.

More details

  • Words: 127,253
  • Pages: 400
The Metrical Theory of Continued fractions Marius Iosifescu and Cor Kraaikamp

Contents Preface

ix

Frequently Used Notation

xv

1 Basic properties of the continued fraction expansion 1.1 A generalization of Euclid’s algorithm . . . . . . . . . 1.1.1 The continued fraction transformation τ . . . . 1.1.2 Continuants and convergents . . . . . . . . . . 1.1.3 Some special continued fraction expansions . . 1.2 Basic metric properties . . . . . . . . . . . . . . . . . . 1.2.1 Defining random variables of interest . . . . . . 1.2.2 Gauss’ problem and measure . . . . . . . . . . 1.2.3 Fundamental intervals, and applications . . . . 1.3 The natural extension of τ . . . . . . . . . . . . . . . . 1.3.1 Definition and basic properties . . . . . . . . . 1.3.2 Approximation coefficients . . . . . . . . . . . . 1.3.3 Extended random variables . . . . . . . . . . . 1.3.4 The conditional probability measures . . . . . . 1.3.5 Paul L´evy’s solution to Gauss’ problem . . . . 1.3.6 Mixing properties . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

1 1 1 4 11 14 14 15 17 25 25 27 31 36 39 43

2 Solving Gauss’ problem 2.0 Banach space preliminaries . . . . . . 2.0.1 A few classical Banach spaces . 2.0.2 Bounded essential variation . . 2.1 The Perron–Frobenius operator . . . . 2.1.1 Definition and basic properties 2.1.2 Asymptotic behaviour . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

53 53 53 55 56 56 62

v

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

vi

CONTENTS 2.1.3

2.2

2.3

2.4

2.5

Restricting the domain of the Perron–Frobenius operator . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1.4 A solution to Gauss’ problem for probability measures with densities . . . . . . . . . . . . . . . . . . . . . . . 2.1.5 Computing variances of certain sums . . . . . . . . . . Wirsing’s solution to Gauss’ problem . . . . . . . . . . . . . . 2.2.1 Elementary considerations . . . . . . . . . . . . . . . . 2.2.2 A functional-theoretic approach . . . . . . . . . . . . . 2.2.3 The case of Lipschitz densities . . . . . . . . . . . . . Babenko’s solution to Gauss’ problem . . . . . . . . . . . . . 2.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . 2.3.2 A symmetric linear operator . . . . . . . . . . . . . . . 2.3.3 An ‘exact’ Gauss–Kuzmin–L´evy theorem . . . . . . . 2.3.4 ψ-mixing revisited . . . . . . . . . . . . . . . . . . . . Extending Babenko’s and Wirsing’s work . . . . . . . . . . . 2.4.1 The Mayer–Roepstorff Hilbert space approach . . . . 2.4.2 The Mayer–Roepstorff Banach space approach . . . . 2.4.3 Mayer–Ruelle operators . . . . . . . . . . . . . . . . . The Markov chain associated with the continued fraction expansion . . . . . . . . . . . . . . . . . . 2.5.1 The Perron–Frobenius operator on BV (I) . . . . . . . 2.5.2 An upper bound . . . . . . . . . . . . . . . . . . . . . 2.5.3 Two asymptotic distributions . . . . . . . . . . . . . . 2.5.4 A generalization of a result of A. Denjoy . . . . . . . .

3 Limit theorems 3.0 Preliminaries . . . . . . . . . . . . . . . . . . . 3.1 The Poisson law . . . . . . . . . . . . . . . . . 3.1.1 The case of incomplete quotients . . . . 3.1.2 The case of associated random variable 3.1.3 Some extreme value theory . . . . . . . 3.2 Normal convergence . . . . . . . . . . . . . . . 3.2.1 Two general invariance principles . . . . 3.2.2 The case of incomplete quotients . . . . 3.2.3 The case of associated random variables 3.3 Convergence to non-normal stable laws . . . . . 3.3.1 The case of incomplete quotients . . . . 3.3.2 Sums of incomplete quotients . . . . . . 3.3.3 The case of associated random variables 3.4 Fluctuation results . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

64 70 71 79 79 85 95 101 101 103 111 119 120 120 127 130 135 135 139 151 156 165 165 169 169 171 173 179 179 182 188 196 196 202 207 213

CONTENTS

vii

3.4.1 3.4.2

The case of incomplete quotients . . . . . . . . . . . . 213 The case of associated random variables . . . . . . . . 215

4 Ergodic theory of continued fractions 219 4.0 Ergodic theory preliminaries . . . . . . . . . . . . . . . . . . . 219 4.0.1 A few general concepts . . . . . . . . . . . . . . . . . . 219 4.0.2 The special case of the transformations τ and τ . . . . 224 4.1 Classical results and generalizations . . . . . . . . . . . . . . 225 4.1.1 The case of incomplete quotients . . . . . . . . . . . . 225 4.1.2 Empirical evidence, and normal continued fraction numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 240 4.1.3 The case of associated and extended random variables 244 4.2 Other continued fraction expansions . . . . . . . . . . . . . . 257 4.2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . 257 4.2.2 Semi-regular continued fraction expansions . . . . . . 260 4.2.3 The singularization process . . . . . . . . . . . . . . . 264 4.2.4 S-expansions . . . . . . . . . . . . . . . . . . . . . . . 266 4.2.5 Ergodic properties of S-expansions . . . . . . . . . . . 273 4.3 Examples of S-expansions . . . . . . . . . . . . . . . . . . . . 281 4.3.1 Nakada’s α-expansions . . . . . . . . . . . . . . . . . . 281 4.3.2 Minkowski’s diagonal continued fraction expansion . . 289 4.3.3 Bosma’s optimal continued fraction expansion . . . . . 292 4.4 Continued fraction expansions with σ-finite, infinite invariant measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 4.4.1 The insertion process . . . . . . . . . . . . . . . . . . . 299 4.4.2 The Lehner and Farey continued fraction expansions . 300 4.4.3 The backward continued fraction expansion . . . . . . 307 Appendix 1: Spaces, functions, and measures A1.1 . . . . . . . . . . . . . . . . . . . . . . . . . A1.2 . . . . . . . . . . . . . . . . . . . . . . . . . A1.3 . . . . . . . . . . . . . . . . . . . . . . . . . A1.4 . . . . . . . . . . . . . . . . . . . . . . . . . A1.5 . . . . . . . . . . . . . . . . . . . . . . . . . A1.6 . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 2: Regularly A2.1 . . . . . . . . . . A2.2 . . . . . . . . . . A2.3 . . . . . . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

313 . 313 . 313 . 314 . 314 . 316 . 319

varying functions 321 . . . . . . . . . . . . . . . . . . . . . . . . . 321 . . . . . . . . . . . . . . . . . . . . . . . . . 323 . . . . . . . . . . . . . . . . . . . . . . . . . 324

viii Appendix 3: Limit theorems for A3.1 . . . . . . . . . . . . . . . . A3.2 . . . . . . . . . . . . . . . . A3.3 . . . . . . . . . . . . . . . .

CONTENTS mixing . . . . . . . . . . . . . . .

random variables 325 . . . . . . . . . . . . . . 325 . . . . . . . . . . . . . . 327 . . . . . . . . . . . . . . 328

Notes and Comments

333

References

347

Index

377

Preface This monograph is intended to be a complete treatment of the metrical theory of the (regular) continued fraction expansion and related representations of real numbers. We have attempted to give the best possible results known so far, with proofs which are the simplest and most direct. The book has had a long gestation period because we first decided to write it in March 1994. This gave us the possibility of essentially improving the initial versions of many parts of it. Even if the two authors are different in style and approach, every effort has been made to hide the differences. Let Ω denote the set of irrationals in I = [0, 1]. Define the (regular) continued fraction transformation τ by τ (ω) = fractional part of 1/ω, ω ∈ Ω. Write τ n for the nth iterate of τ, n ∈ N = {0, 1, · · · }, with τ 0 = identity map. The positive integers an (ω) = a1 (τ n−1 (ω)), n ∈ N+ = {1, 2 · · · } , where a1 (ω) = integer part of 1/ω, ω ∈ Ω, are called the (regular continued fraction) digits of ω. Writing [x1 ] = 1/x1 ,

[x1 , · · · , xn ] = 1/(x1 + [x2 , · · · , xn ]),

n ≥ 2,

for arbitrary indeterminates xi , 1 ≤ i ≤ n, we have ω = lim [a1 (ω), · · · , an (ω)] , n→∞

ω ∈ Ω,

thus explaining the name of τ . The above equation will be also written as ω = lim [a1 (ω), a2 (ω), · · · ], n→∞

ω ∈ Ω.

The an , n ∈ N, to be called incomplete quotients, are clearly positive integervalued random variables which are defined almost surely on (I, BI ) with respect to any probability measure assigning probability 0 to the set I\Ω of rationals in I. (Here BI denotes the σ-algebra of Borel subsets of I.) The metrical theory of the (regular) continued fraction expansion is about the sequence (an )n∈N+ of its incomplete quotients, and related sequences. ix

x

Preface C.F. Gauss stated in 1812 that, in current notation, lim λ(τ −n ([0, x))) = γ([0, x]),

n→∞

x ∈ I,

where λ denotes Lebesgue measure and γ is what we now call Gauss’ measure, defined by Z dx 1 γ(A) = , A ∈ BI . log 2 A x + 1 Gauss asked for an estimate of the convergence rate in the above limiting relation, and this has actually been the first problem of the metrical theory of continued fractions. Ramifications of this problem, which was given a first solution only in 1928, still pervade the current developments. Chapter 2 contains a detailed treatment of Gauss’ problem by an elementary approach and functional-theoretic methods as well. The latter are applied to the Perron–Frobenius operator associated with τ , considered as acting on various Banach spaces including that of functions of bounded variation on I. Gauss’ measure is important since it is preserved by τ , that is, γ(τ −1 (A)) = γ(A) for any A ∈ BI . This implies that, by its very definition, the sequence (an )n∈N+ is strictly stationary under γ. As such, there should exist a doubly infinite version of it, say (¯ a` )`∈Z , Z = { · · · , −1, 0, 1, · · · }, defined on a richer probability space. It appears that this doubly infinite version can be effectively constructed on (I 2 , BI2 , γ¯ ), where γ¯ is the so called extended Gauss’ measure defined by ZZ 1 dxdy γ¯ (B) = , B ∈ BI2 . log 2 B (xy + 1)2 Put a ¯−n (ω, θ) = an+1 (θ), a ¯0 (ω, θ) = a1 (θ), a ¯n (ω, θ) = an (ω) for any n ∈ N+ and (ω, θ) ∈ Ω2 . Then whatever ` ∈ Z, k ∈ N, and n ∈ N+ the probability distribution of the random vector (¯ a` , · · · , a ¯`+k ) under γ¯ is identical with that of the random vector (an , · · · , an+k ) under γ, that is, (¯ a` )`∈Z under γ¯ is a doubly infinite version of (an )n∈N+ under γ. A distinctive feature of our treatment is the consistent use of the extended incomplete quotients a ¯` , ` ∈ Z. It appears that γ¯ ( [0, x] × I | a ¯0 , a ¯−1 , · · · ) =

(a + 1)x ax + 1

γ¯ -a.s.

for any x ∈ I, where a = [¯ a0 , a ¯−1 , · · · ], which in turn implies that γ¯ (¯ a`+1 = i | a ¯` , a ¯`−1 , · · · ) =

a+1 (a + i)(a + i + 1)

γ¯ -a.s.

Preface

xi

for any i ∈ N+ and ` ∈ Z. The last equation emphasizes a ‘chain of infinite order’ structure of the incomplete quotients when properly defined on a richer probability space. This idea goes back to W. Doeblin (1940) and, hopefully, is fully clarified by our treatment. Also, the considerations above motivate the introduction of the family (γa )a∈I of probability measures on BI defined by their distribution functions γa ([0, x]) =

(a + 1)x , ax + 1

x ∈ I.

In particular, γ0 = λ. Besides γ, these probability measures, which we call conditional, are the most natural ones associated with the regular continued fraction expansion. It appears that (¯ a` )`∈Z is ψ-mixing under γ¯ while (an )n∈N+ is ψ-mixing under γ and any γa , a ∈ I, and that the ψ-mixing coefficients of the latter under γ (which are equal to the corresponding ones of the former under γ¯ ) can in principle be exactly calculated. The facts just described are part of our Chapter 1. Chapter 3 is devoted to limit theorems for incomplete quotients, related random variables, and their extended versions. These include weak convergence to the Poisson, normal, and non-normal stable laws as well as the law of the iterated logarithm, in both classical and functional approaches, and are essentially based, in general, on the ψ-mixing property of both (¯ a` )`∈Z and (an )n∈N+ . The ergodic properties of the regular continued fraction expansion, leading to strong laws of large numbers, is deferred to Chapter 4. The reason is that whilst these properties are inherited by the continued fraction expansions which can be derived from the regular continued fraction expansion by the procedures called singularization and insertion, the limit properties in Chapter 3 do not transfer automatically to continued fraction expansions so derived. We give applications of the ergodic properties of the continued fraction transformation τ and its natural extension τ¯. After an introduction, in which several general ergodic theoretical concepts and results—such as Birkhoff’s ergodic theorem—are described, various classical results and important recent results, based on the natural extension, are derived. It is then shown that—via singularization and insertion—the ergodic properties of very many other continued fraction expansions can easily be obtained. In particular, the ergodic properties of the so called S-expansions are described in detail. Several examples of S-expansions are studied, such as Nakada’s α-expansions, Minkowski’s diagonal continued fraction expansion and Bosma’s optimal continued fraction expansion. Also, the connection between the regular continued fraction expansion and continued fraction

xii

Preface

expansions with σ-finite, infinite invariant measures, such as the backward continued fraction expansion and Lehner’s continued fraction expansion, is explained. To make the book self-contained as reasonably as possible, we have included three appendices containing less known notions and results from measure theory, regularly varying functions, and limit theorems for mixing sequences of random variables, which we use frequently, especially in Chapter 3. We urge the reader to become familiar with the appendices early on so as to be aware of what can be found there as needed. We also warn the reader that Chapter 3 and some subsections of Chapter 2 are more involved or more abstract, and thus they make more difficult reading. The concluding notes and comments aim at giving credit, pointing out to results not included in the main text, or tracing historical developments. The references list greatly exceeds the number of works quoted in the course of the book. It should be consulted with the purpose of discovering historical sources, parallel research, and starting points for new investigations. For what our work is not, the reader is referred to the books by Brezinski (1991) and von Plato (1994)—for the history of continued fractions—Jones and Thron (1980), Lorenzen and Waadeland (1992), Olds (1963), Perron (1954, 1957), Rockett and Sz¨ usz (1992), Schmidt (1980), Sprindˇzuk (1979), Sudan (1959), and Wall (1948)—for various, mainly non-metric, aspects of the theory of continued fractions. Acknowledgements Much of our original work included in this book has been carried out in the framework of our association with the Bucharest ‘Gheorghe Mihoc’ Centre for Mathematical Statistics of the Romanian Academy, and the Department of Probability and Statistics (CROSS), Faculty ITS, of the Delft University of Technology. Many institutions and persons have helped us in various ways. The first of us wishes to acknowledge the hospitality of Universit´e Ren´e Descartes – Paris 5, Universit´e des Sciences et des Technologies de Lille, and Universit´e Victor Segalen – Bordeaux 2. He is grateful to Bui Trong Lieu, Michel Schreiber (both of Paris 5), George Haiman (Lille), and JeanMarc Deshouillers (Bordeaux 2) for their kind invitations at these locations where his stays in the period 1996–1999 were very helpful in completing parts of the book. He is also grateful to the Nederlandse Organisatie voor

Preface

xiii

Wetenschappelijk Onderzoek (NWO)—the Dutch organization for scientific research—for two one-month research grants in the years 2000 and 2001, and to the Department of Probability and Statistics (CROSS) for invitations allowing several short stays in Delft during which much of the joint work on the book was done. A short stay in the spring of 2000 at the Department of Mathematics of Uppsala University, for which he is grateful to Allan Gut, was very beneficial for gathering recent literature on the subject. Last, but not the least, he gratefully acknowledges generous financial support in the years 2000 and 2001 from a French–Romanian CNRS International Project of Scientific Cooperation (PICS) directed by Ha¨ım Brezis and Doina Cior˘anescu (both of Universit´e Pierre et Marie Curie – Paris 6). This allowed him to spend more time in Delft, which was decisive for completing the book. Finally, he wishes to acknowledge the technical help he has received from Adriana Gr˘adinaru who changed his handwritten, hardly legible drafts into a camera ready copy. The second author would also like to thank the Romanian Academy for their support during his visits to Bucharest. Adriana Berechet read several versions of the typescript, and with her penetrating mind detected some inaccuracies and slips. Expressing our indebtedness to her, we wish to make it clear that any remaining errors are our own. Finally, we must thank all the people with Kluwer Academic Publishers who helped during the development and production of this book project.

Delft, November 2001

M.I. C.K.

xiv

Preface

Frequently Used Notation Abbreviations a.e. = almost everywhere (with respect to Lebesgue measure) a.s. = almost surely (with respect to any other measure) Cov = covariance g.c.d. = greatest common divisor i.i.d.= independent identically distributed i.o. = infinitely often log = natural logarithm p.m. = probability measure s.i. = strongly infinitesimal r.v. = random variable var = total variation Var = variance 2 = end of example, proof, or remark

xv

xvi

Frequently Used Notation

Symbols N = {0, 1, 2, · · · } , N+ = {1, 2, · · · } , −N = {· · · , −2, −1, 0} Z = (−N) ∪ N+ = {· · · , −1, 0, 1, · · · } Q = the set of rational numbers R = the set of real numbers bac = integer part of a ∈ R {a} = fractional part of a ∈ R R+ = (x ∈ R : x ≥ 0) , R++ = (x ∈ R : x > 0) I = [0, 1] = the unit interval of R Ω = I \ Q = the set of irrationals in I C = the set of complex numbers i=

√ −1 (imaginary unit)

z ∗ = complex conjugate of z ∈ C Rn = real n-vector space, or Euclidean n-space, n ∈ N+ ; R1 = R B n = σ-algebra of Borel sets in Rn ; B1 = B BM = Bn ∩ M := (B ∩ M : B ∈ B n ), M ∈ B n , n ∈ N+ BI = B ∩ I = σ-algebra of Borel sets in I BI2 = BI 2 = σ-algebra of Borel sets in I 2 Ac = complementary set of the set A

Frequently Used Notation

xvii

IA = indicator function of the set A ∂A = boundary of the Borel set A δx = p.m. concentrated at the point x λ = Lebesgue measure on B λ2 = Lebesgue measure on B2 N (0, 1) = standard normal distribution Φ = standard normal distribution function P (θ) = Poisson distribution with parameter θ P f −1 = P -distribution of r.v. f ∗ = convolution of measures ⊗ = product of σ-algebras or measures C = 0.577 215 · · · (Euler’s constant) Fn = nth Fibonacci number: F0 = F1 = 1, Fn+1 = Fn + Fn−1 ,

n ∈ N+

√ g = ( 5 − 1)/2, G = g + 1 (‘golden ratios’) K0 = 2.685 452 · · · (Khinchin’s constant) K−1 = 1.745 405 · · · (Khinchin’s constant) λ0 = 0.303 663 002 898 732 568 · · · (Wirsing’s constant) ζ(2) =

P i∈N+

i−2 = π 2 /6

xviii

Frequently Used Notation an , 3, 14

|| · || L , 54

a ¯` , 31

Lp , 55

B(I), 53

||·||p , 55

|| · ||, 53

L∞ , 55

BEV (I), 55

||.||∞ , 55

||·||v , 56

Lpµ , 54

||·||v,µ , 56

||·||p,µ , 54

BV (I), 54

L∞ µ , 55

|| · || v , 54

||.||∞,µ , 55

C, 319

m(X ), 314

C(I), 53

µ-ess sup, 55

C 1 (I), 53

να , 197

|| · || 1 , 53

Pλ , 60

cτ Pois µ, 317

Pi , 22

γ, 16

Pi1 ···in , 136

γ¯ , 26

Pµ , 57

γa , 36

pn , 4, 19

d0 , 319

pen , 261

dP , 315

peen , 265

D = D(I), 319

Pois µ, 317

ess sup, 55

pr(X ), 314

F, 324

qn , 4, 19

G, Gan , 39

qne , 261

L(I), 53

qene , 265

Frequently Used Notation

xix

Qν , 328

W , 319

rn , 14

yn , 15

r¯` , 34

y ` , 34

s (f ), 53 sn , 14 san , 36 s¯` , 34 σ (C), 313 σ ((fi )i∈I ), 314 ten , 263 e ten , 273 τ, 2 τ , 25 Θn , 27 Θ0n , 251 Θen , 263 e e , 280 Θ n ¡ (n) ¢ u i , 18 U := Pγ , 59 un , 14 uan , 38 u ¯` , 34 v (f ), 55 ¡ ¢ v i(n) , 18

xx

Frequently Used Notation

Chapter 1

Basic properties of the continued fraction expansion In this chapter the (regular) continued fraction expansion is introduced and notation fixed. Some basic properties to be used in subsequent chapters are also derived.

1.1 1.1.1

A generalization of Euclid’s algorithm The continued fraction transformation τ

In Proposition 2 of Book VII, Euclid gave an algorithm—now bearing his name—for finding the greatest common divisor (g.c.d.) of two given integers: let a, b ∈ Z and assume for convenience that a > b > 0. Put v0 := a,

v1 := b,

and determine a1 ∈ N+ , v2 ∈ N, such that v0 = a1 v1 + v2 , where 0 ≤ v2 < v1 . If v2 6= 0 then we repeat this procedure and obtain v1 = a2 v2 + v3 , where 0 ≤ v3 < v2 . In general, if vm 6= 0 for some m ≥ 2, then we obtain vm−1 = am vm + vm+1 , 1

(1.1.1)

2

Chapter 1

where 0 ≤ vm+1 < vm . Clearly, the procedure should stop after finitely many steps: there exists n ∈ N+ such that vn 6= 0 and vn+1 = 0. Then, as is well known, we have vn = g.c.d. (a, b) . Remark. The running time of Euclid’s algorithm depends on the number of division steps required to get the g.c.d. of the given positive integers v0 > v1 . In an 1844 paper of the French mathematician Gabriel Lam´e it is essentially shown that (i) given n ∈ N+ , if Euclid’s algorithm applied to v0 and v1 requires exactly n division steps and v0 is as small as possible satisfying this condition, then v0 = Fn+1 and v1 = Fn ; (ii) if v1 < v0 < m ∈ N+ , then the number of division steps required by Euclid’s algorithm when applied to v0 and v1 is at most k j √ √ log( 5m)/ log(( 5 + 1)/2) − 2 ≈ b2.078 log m + 1.672c − 2, where b c : R → Z is the greatest integer function, that is, bxc = greatest integer not exceeding x ∈ R. For historical details we refer the reader to Shallit (1994), and for recent developments to Knuth (1981, Section 4.5.3) and Hensley (1994). It should be noted that the latter are based on results to be proved in this and later chapters. 2 To consider Euclid’s algorithm more closely we define the so called continued fraction transformation τ : I → I by ½ −1 x − bx−1 c if x 6= 0, τ (x) = 0 if x = 0. Then putting x = b/a we obviously have a1 = a1 (x) = bv0 /v1 c, and

vm vm−1

= τ m−1 (x) ,

··· ,

an = an (x) = bvn−1 /vn c

1 ≤ m ≤ n,

τ n (x) = 0,

where τ 0 = identity map and τ ` , ` ∈ N+ , is the composition of τ with itself ` times. Note that ¡ ¢ am (x) = a1 τ m−1 (x) , 1 ≤ m ≤ n. (1.1.2)

Basic properties

3

As vm−1 = am vm + vm+1 , we have 1 τ m−1 (x)

= am + τ m (x) ,

1 ≤ m ≤ n.

If for arbitrary indeterminates xi , 1 ≤ i ≤ n, n ∈ N+ , we write [x1 ] =

1 , x1

[x1 , · · · , xn ] =

1 , x1 + [x2 , · · · , xn ]

n ≥ 2,

then it follows that x = [a1 + τ (x)] = [a1 , · · · , am−1 , am + τ m (x)] = [a1 , · · · , an ]

(1.1.3)

for 1 < m ≤ n. An expression as on the right hand side of (1.1.3) is called a finite (regular ) continued fraction (RCF for short). It follows from Euclid’s algorithm that each rational number x ∈ / Z can be written as x = a0 + [a1 , . · · · , an ] ,

(1.1.4)

where a0 = bxc. (Note that for any x ∈ R, x ∈ / Z, the fractionary part x − bxc of x is a number in the open interval (0, 1) !) The right hand side of (1.1.4) will be denoted by [a0 ; a1 , · · · , an ] . Euclid’s algorithm yields an ≥ 2. Hence each rational number x ∈ / Z has two continued fraction expansions, namely, [a0 ; a1 , · · · , an ] = [a0 ; a1 , · · · , an − 1, 1] . Of course, there is no reason whatsoever to stick to rationals. Let x ∈ R\Q and, as in the case of rationals, put a0 = bxc. It follows from the very definition of τ that τ n (x − a0 ) ∈ Ω = I\Q,

n ∈ N.

Let us define an = an (x) = b1/τ n−1 (x − a0 )c,

n ∈ N+ ,

so that, similarly to (1.1.2), ¡ ¢ an (x) = a1 τ n−1 (x − a0 ) ,

n ∈ N+ .

(1.1.20 )

4

Chapter 1

Hence x = [a0 ; a1 + τ (x − a0 )] = · · · = [a0 ; a1 , · · · , an−1 , an + τ n (x − a0 )] (1.1.5) for any n ≥ 2. The two cases x ∈ Q and x ∈ R\Q can be treated in a unitary manner if we define a1 (0) = ∞, the symbol ∞ being subject to the rules 1/∞ = 0, 1/0 = ∞. Equations (1.1.5) are then valid for any x ∈ R. Clearly, for any x ∈ Q there exists n = n (x) ∈ N+ such that am (x) = ∞ for any m ≥ n. The integers a1 (x), a2 (x), · · · will be called the (continued fraction) digits of x ∈ R whilst the functions x → ai (x) ∈ N+ ∪ {∞}, x ∈ R, i ∈ N+ , will be called the incomplete (or partial ) quotients of the continued fraction expansion. Euclid’s algorithm implies that x ∈ R has finitely many finite continued fraction digits if and only if x ∈ Q.

1.1.2

Continuants and convergents

Throughout the first three chapters, without express mention to the contrary, we will assume that x ∈ [0, 1), which implies that a0 = 0, and write [0; a1 , · · · , an ] = [a1 , · · · , an ] ,

n ∈ N+ .

We will usually drop the dependence on x in the notation. Define ω0 = 0,

ωn = ωn (x) = [a1 , · · · , an ] , x ∈ [0, 1),

n ∈ N+ .

Clearly, ωn ∈ Q, say

pn , n ∈ N+ , qn where pn , qn ∈ N+ and g.c.d. (pn , qn ) = 1. The number ωn ∈ ωn (x) is called the nth (regular continued fraction) (RCF) convergent of x, n ∈ N. As a rule, in the first three chapters the specification RCF will be dropped. Clearly, for any x ∈ Q there exists n = n (x) ∈ N such that ωm (x) = x for any m ≥ n. We shall show that for any irrational ω ∈ Ω := I\Q we have ωn =

lim ωn (ω) = ω.

n→∞

For that we need some preparation. Define recursively polynomials Qn of n variables, n ∈ N, by  if n = 0,  1 x1 if n = 1, Qn (x1 , · · · , xn ) =  x1 Qn−1 (x2 , · · · , xn ) + Qn−2 (x3 , · · · , xn ) if n ≥ 2.

Basic properties

5

Thus Q2 (x1 , x2 ) = x1 x2 + 1, Q3 (x1 , x2 , x3 ) = x1 x2 x3 + x1 + x3 , Q4 (x1 , x2 , x3 , x4 ) = x1 x2 x3 x4 + x1 x2 + x1 x4 + x3 x4 + 1, etc. In general, as noted by Leonhard Euler, for any n ∈ N+ , Qn (x1 , · · · , xn ) is the sum of all terms which can be obtained starting from x1 · · · xn and deleting zero or more non-overlapping pairs (xi , xi+1 ) of consecutive variables. There are Fn such terms. (Prove it!) The polynomials Qn , n ∈ N, are called continuants, and their basic property is that [x1 , · · · , xn ] =

Qn−1 (x2 , · · · , xn ) , Qn (x1 , · · · , xn )

n ∈ N+ .

(1.1.6)

The proof by induction is immediate and is left to the reader. The continuants enjoy the symmetry property Qn (x1 , · · · , xn ) = Qn (xn , · · · , x1 ) ,

n ∈ N+ .

(1.1.7)

This follows from Euler’s remark above. Hence Qn (x1 , · · · , xn ) = xn Qn−1 (x1 , · · · , xn−1 ) + Qn−2 (x1 , · · · , xn−2 )

(1.1.8)

for any n ≥ 2. The continuants also satisfy the equation Qn (x1 , · · · , xn ) Qn (x2 , · · · , xn+1 ) n

− Qn+1 (x1 , · · · , xn+1 ) Qn−1 (x2 , · · · , xn ) = (−1) ,

(1.1.9) n ∈ N+ .

The proof is immediate. For n = 1 equation (1.1.9) is true. By the very definition of Qn , for any n ≥ 2 we have Qn (x1 , · · · , xn ) Qn (x2 , · · · , xn+1 )−Qn+1 (x1 , · · · , xn+1 ) Qn−1 (x2 , · · · , xn ) = (x1 Qn−1 (x2 , · · · , xn ) + Qn−2 (x3 , · · · , xn )) Qn (x2 , · · · , xn+1 ) − (x1 Qn (x2 , · · · , xn+1 ) + Qn−1 (x3 , · · · , xn+1 )) Qn−1 (x2 , · · · , xn ) = (−1) Qn−1 (x2 , · · · , xn ) Qn−1 (x3 , · · · , xn+1 ) −(−1)Qn (x2 , · · · , xn+1 ) Qn−2 (x3 , · · · , xn ) = · · · = (−1)n−1 (Q1 (xn ) Q1 (xn+1 ) − Q2 (xn , xn+1 )) = (−1)n .

6

Chapter 1

Now, let ω ∈ Ω = I\Q have digits a1 (ω), a2 (ω), · · · . It follows from (1.1.6) and (1.1.9) that ωn (ω) =

Qn−1 (a2 , · · · , an ) , Qn (a1 , · · · , an )

(1.1.10)

pn = Qn−1 (a2 , · · · , an ), qn = Qn (a1 , · · · , an ),

n ∈ N+ .

Hence pn (ω) = qn−1 (τ (ω)), n ∈ N+ , ω ∈ Ω, and using (1.1.8) we obtain qn = an qn−1 + qn−2 , n ≥ 2, pn = an pn−1 + pn−2 , n ≥ 3,

(1.1.11)

with q0 = 1, q1 = a1 , p1 = 1, p2 = a2 . If we define p0 = q−1 = 0, p−1 = 1, then equations (1.1.11) hold for any n ∈ N+ . It follows from (1.1.9) and (1.1.10) that pn qn−1 − pn−1 qn = (−1)n+1 , n ∈ N. (1.1.12) Clearly, either (1.1.10) or (1.1.11) implies that pn+1 ≥ Fn ,

qn ≥ Fn ,

n ∈ N.

(1.1.13)

Notice that by (1.1.5), (1.1.6), (1.1.7), (1.1.10), and (1.1.11) we also have ω = [a1 + τ (ω)]

ω=

=

£ ¤ a1 , a2 + τ 2 (ω) =

1 a1 + τ (ω)

=

p1 + τ (ω) p0 , q1 + τ (ω) q0

a2 + τ 2 (ω) a1 a2 + 1 + a1 τ 2 (ω)

=

p2 + τ 2 (ω) p1 , q2 + τ 2 (ω) q1

and for n ≥ 3, ω = [a1 , · · · , an−1 , an + τ n (ω)] =

Qn−1 (an + τ n (ω) , an−1 , · · · , a2 ) Qn (an + τ n (ω) , an−1 , · · · , a1 )

=

(an + τ n (ω)) Qn−2 (a2 , · · · , an−1 ) + Qn−3 (a2 , · · · , an−2 ) (an + τ n (ω)) Qn−1 (a1 , · · · , an−1 ) + Qn−2 (a1 , · · · , an−2 )

=

pn + τ n (ω) pn−1 an pn−1 + pn−2 + τ n (ω) pn−1 = . an qn−1 + qn−2 + τ n (ω) qn−1 qn + τ n (ω) qn−1

Therefore we can assert that ω=

pn + τ n (ω) pn−1 , qn + τ n (ω) qn−1

ω ∈ Ω, n ∈ N,

(1.1.14)

Basic properties

7

and remark that (1.1.14) also holds for any rational ω in [0, 1). Remark. A matrix approach to equations (1.1.12) and (1.1.14) is as follows. Consider the matrices ¶ µ pn−1 pn Mn = , n ∈ N, qn−1 qn so that M0 = identity matrix, and define µ ¶ 0 1 M−1 = . 1 0 Then equations (1.1.11) imply that Mn = Mn−1 An , where

µ An =

0 1 1 an

n ∈ N,

¶ ,

n ∈ N,

with a0 = 0. Hence µ Mn =

0 1 1 0

¶Y n µ i=0

0 1 1 ai

¶ ,

n ∈ N,

and (1.1.12) is nothing but the equation det Mn = (−1)n ,

n ∈ N.

Clearly, M−1 , Mn , An ∈ SL (2, Z), n ∈ N, that is, the entries of these 2 × 2 matrices belong to Z and their determinants are equal either to 1 or −1 . Recall that any matrix µ ¶ a b M= ∈ SL (2, Z) c d can be viewed as a M¨obius transformation denoted by the same letter of the compactified complex plane C∗ , which is defined by µ ¶ az + b a b M (z) = (z) := , z ∈ C∗ . c d cz + d With T denoting transpose we also have M (z) =

(1, 0) M (z, 1)T (0, 1) M (z, 1)T

,

z ∈ C∗ ,

8

Chapter 1

which implies at once that ¡ ¢ M 0 M 00 (z) = M 0 M 00 (z) ,

z ∈ C∗ ,

for any M 0 , M 00 ∈ SL (2, Z) . Next, for any z ∈ C and n ∈ N we have ¶ µ ¶ µ ¶ µ z z pn + zpn−1 = Mn = Mn−1 An 1 1 qn + zqn−1 µ ¶ 1 = Mn−1 . an + z In particular, for z = 0 we have ¶ µ ¶ µ ¶ µ 1 pn 0 , = Mn = Mn−1 1 an qn

(1.1.100 )

n ∈ N,

whence Mn (0)

=

(1, 0) Mn−1 (1, an )T T

=

pn qn

(0, 1) Mn−1 (1, an ) ½ [a1 , · · · , an ] if n ∈ N+ , := 0 if n = 0.

It follows that Mn (z) =

pn + zpn−1 = [a1 , · · · , an−1 , an + z] , qn + zqn−1

n ≥ 2,

for any z ∈ C, z 6= −qn /qn−1 , and 1 M1 (z) = a1 + z

µ ¶ p1 + zp0 = q1 + zq0

for any z ∈ C, z 6= −a1 . Now, (1.1.14) follows from the last two equations by taking z = τ n (ω) , n ≥ 2, respectively z = τ (ω), ω ∈ Ω. Finally, it is obvious by (1.1.100 ) that pn and qn , n ∈ N+ , can be actually defined as µ ¶ µ ¶ µ ¶µ ¶ pn 0 1 0 1 0 = ··· . qn 1 a1 1 an 1 It is worth mentioning that any irrational number ω = [a0 ; a1 , a2 , · · · ] ∈ R

Basic properties

9

can be represented in terms of only two elements of SL(2, Z), namely µ ¶ µ ¶ 0 1 1 1 Q= and R = , −1 0 0 1 so that Q(z) = −1/z, R(z) = z + 1, z ∈ C. It is not hard to check that Q and R generate SL(2, Z) and that ω = lim Ra0 QR−a1 QRa2 Q · · · R−a2n−1 Q Ra2n (z0 ) n→∞

for any z0 ∈ C. This simple remark is the starting point for understanding by the use of elementary results about continued fractions the behaviour of the geodesic flow on a certain Riemann surface. For details see Series (1982, 1991). See also Adler (1991), Faivre (1993), and Nakada (1995). For another representation of irrationals ω ∈ R in terms of matrices R and L = (P Q)2 Q see Raney (1973). 2 We can now prove the result announced before defining the continuants. Proposition 1.1.1 For any x ∈ [0, 1) we have x − ωn (x) =

(−1)n τ n (x) , qn (qn + τ n (x) qn−1 )

n ∈ N.

(1.1.15)

For any ω ∈ Ω we have 1 1 < |ω − ωn (ω)| < , qn (qn+1 + qn ) qn qn+1

n ∈ N,

(1.1.16)

and lim ωn (ω) = ω.

n→∞

(1.1.17)

Proof. Equation (1.1.15) follows from (1.1.12) and (1.1.14). Next, since 1 τ n (ω)

= an+1 + τ n+1 (ω) ,

n ∈ N, ω ∈ Ω,

by (1.1.11) we have τ n (ω) qn (qn + τ n (ω) qn−1 )

=

1 qn (qn (an+1 + τ n+1 (ω)) + qn−1 )

=

1 , qn (qn+1 + qn τ n+1 (ω))

10

Chapter 1

and (1.1.16) follows. Finally, (1.1.17) follows from (1.1.16) and (1.1.13).

2

Remark. It is easy to see that (1.1.15) implies |x − ωn (x)| ≤

1 , qn qn+1

n ∈ N,

for any x ∈ [0, 1). Of course, for a rational x the inequality above is meaningful just for finitely many values of n ∈ N. 2 Notice that (1.1.12) implies that ωn − ωn−1 =

(−1)n+1 , qn qn−1

n ∈ N+ , ω ∈ Ω,

(1.1.18)

which in conjunction with (1.1.15) yields 0 = ω0 < ω2 < ω4 < · · · < ω3 < ω1 < 1

(1.1.19)

for any ω ∈ Ω. Clearly, the above inequalities also hold for any rational ω ∈ [0, 1) with some inequality signs ‘<’ replaced by ‘≤’. In what follows we shall write ω = [a1 , a2 , · · · ] ,

ω ∈ Ω,

to mean precisely equation (1.1.17). The next result shows that the continued fraction expansion of an irrational number is unique in a certain sense. Proposition 1.1.2 Let (in )n∈N+ be a sequence of positive integers. Define the rational numbers ωn = [i1 , · · · , in ] ,

n ∈ N+ .

Then the limit lim ωn = ω

n→∞

exists, where ω ∈ Ω and, moreover, the in , n ∈ N+ , are the continued fraction digits of ω. Proof. Writing ωn = pn /qn , n ∈ N+ , ω0 = 0, where pn , qn ∈ N+ and g.c.d.(pn , qn ) = 1, it follows from (1.1.18) that ωn =

n X (−1)k+1 k=1

qk−1 qk

,

n ∈ N+ .

Basic properties

11

As qk increases with k, Leibnitz’s theorem ensures the existence of limn→∞ ωn , say, ω, and (1.1.19) shows that 0 < ω < 1. It remains to show that an (ω) = in , n ∈ N+ . This will also prove that ω ∈ Ω, since if ω ∈ Q then we should have am (ω) = am+1 (ω) = · · · ∞ for some m ∈ N+ . As ωn =

1 , i1 + [i2 , · · · , in ]

n ≥ 2,

(1.1.20)

it is sufficient to show that a1 (ω) = b1/ωc = i1 . This follows from (1.1.20) letting n → ∞ and noting that limn→∞ [i2 , · · · , in ] exists and lies in the open interval (0, 1). 2

1.1.3

Some special continued fraction expansions

The continued fraction expansion of a real number is a fundamental representation of it through its connection with the Euclidean algorithm and with ‘best’ rational approximations [see, e.g., Hardy and Wright (1979, Ch. 11)]. At the same time very little is known about the explicit continued fraction expansions of some interesting numbers. We already know that these expansions are finite (i.e., terminating) exactly for rational numbers. Also, by a well known theorem of J.-L. Lagrange [for all classical non-metric results the basic reference is Perron (1954, 1957)], the sequence of digits of an irrational number x is eventually periodic if and only if x is a quadratic irrationality. Here ‘eventually periodic’ means that if x = [a0 ; a1 , a2 , · · · ] , then there exist k ∈ N and ` ∈ N+ such that an = an+` we use the notation  if  [a0 ; · · · , a`−1 ] [a0 ; a1 , · · · , a` ] if x=  [a0 ; a1 , · · · , ak−1 , ak , · · · , ak+`−1 ] if

for any n ≥ k, and k = 0, k = 1, k≥2

as a convenient abbreviation. The smallest such ` ∈ N+ is called the period length of x. If we can take k = 0, then x is called purely periodic. Next, a quadratic irrationality is a number of the form √ a+ b x= , c

12

Chapter 1

where b ∈ N+ is not a perfect square, and a, c ∈ Z, c 6= 0. Then x0 = ³ √ ´ a − b /c is called the algebraic conjugate of x. A purely periodic quadratic irrationality x is characterized by the inequalities x > 1, −1 < x0 < 0. We have, for example, √ ¤ 1+ 7 £ = 1; 1, 4, 1 2 and √ ¤ 1+ 2 £ = 1, 4, 8 . 3 The first quadratic irrationality above is purely periodic and has period length 4 while the second one has period length 2 but is not purely periodic. Apart from that, the continued fraction expansion of even a single additional algebraic number is not explicitly known. We do not know even whether the sequence of digits is unbounded for such a number. [In connection with this matter see, however, Brjuno (1964) and Richtmyer (1975).] For transcendental numbers of interest it is not clear when to expect a continuous fraction expansion with a good ‘pattern’. For example, in a paper titled De Fractionibus Continuis, published Pin 1737, Leonhard Euler gave a nice continued fraction expansion for e = n∈N 1/n!, namely e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, · · · , 1, 2n, 1, · · · ] . In this expansion the digits are eventually comprised of a meshing of two arithmetic progressions, one of which has zero common difference while the other has difference two. Generalizing the above result, Euler showed—the overline in the notation indicates infinite arithmetic progressions— that e1/n = [1; n − 1 + 2in, 1]i∈N = [1; n − 1, 1, 1, 3n − 1, 1, 1, 5n − 1, 1, · · · ] for any 1 < n ∈ N+ , and e2/n = [1; (n − 1)/2 + 3in, 6n + 12in, (5n − 1)/2 + 3in, 1]i∈N = [1; (n − 1)/2, 6n, (5n − 1)/2, 1, 1, (7n − 1)/2, 18n, (11n − 1)/2, 1, · · · ] for any odd n ∈ N+ greater than 1. Recently, Clemens et al. (1995) have given explicit formulae relating continued fraction expansions with almost periodic or almost symmetric patterns in their digits, and series whose terms satisfy certain recurrence relations. The method developed by these authors ties together as a single

Basic properties

13

phenomenon previous results by Davison and Shallit (1991), K¨ohler (1980), Peth˝o (1982), Shallit (1979, 1982 a,b), van der Poorten and Shallit (1992), and Tamura (1991), who have found continued fraction expansions for numbers expressed by certain types of series. On the other hand, nobody has made any sense out of the pattern in the continued fraction expansion for π : π = [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, · · · ] . The digits of π do not appear to follow any pattern and are widely suspected to be in some sense random. There is a vague folklore statement [cf. Thakur (1996)] that the nice patterns come from the connection with hypergeometric functions and the representation of the latter by certain generalized continued fraction expansions. For more on that see Chudnovsky and Chudnovsky (1991, 1993). Remark. Using the continued fraction expansion for e, Alzer (1998) proved that ¯ ¯ p ¯¯ q 2 log q ¯¯ e− ¯ min p,q∈N+ ,q≥3 log log q ¯ q exists and is only attained at the 19th convergent of e p19 28 245 729 = , q19 10 391 013 thus it is equal to ¯ ¯ (10 391 013)2 log 10 391 013 ¯¯ 28 245 729 ¯¯ ¯e − 10 391 013 ¯ log log 10 391 013 = 0.386 249 199 819 · · · . Further, the inequality

¯ q 2 log q ¯¯ e− log log q ¯

¯ p ¯¯
has infinitely many solutions in integers p, q ∈ N+ if and only if c ≥ 1/2. For further developments see Elsner (1999). 2

14

1.2 1.2.1

Chapter 1

Basic metric properties Defining random variables of interest

By (1.1.20 ) the incomplete quotients an , n ∈ N+ , of the irrationals in I are defined by a1 (ω) = b1/ωc,

¡ ¢ an (ω) = a1 τ n−1 (ω) ,

ω ∈ Ω, n ∈ N+ .

If we define a1 (0) = ∞ then the above equations also define the incomplete quotients for the rational numbers in [0, 1). As we have noted in Subsection 1.1.1, for any rational x ∈ [0, 1) there exists n = n (x) ∈ N+ such that am (x) = ∞ for any m ≥ n. The metric point of view in studying the sequence (an )n∈N+ is to consider that the an , n ∈ N+ , are N+ -valued random variables on (I, BI ) which are defined µ-a.s. in I for any probability measure µ on BI assigning measure 0 to the rationals in I . (Such a µ is clearly Lebesgue measure λ.) Alternatively, we can look at the an , n ∈ N+ , as N+ ∪ {∞}-valued random variables which are defined everywhere in [0, 1). It is clear, for example, that ¶ µ 1 ,1 , a1 (0) = ∞, a1 (x) = 1, x ∈ 2 ¸ µ 1 1 , , i ≥ 2, a1 (x) = i, x ∈ i+1 i µ ¶ 1 = ∞, i ≥ 2, a2 (0) = a2 i ¶ [ µ 1 1 a2 (x) = 1, x ∈ , , i + 1 i + 1/2 i∈N+ ¶ [ · 1 1 a2 (x) = j, x ∈ , , j ≥ 2. i + 1/j i + 1/ (j + 1) i∈N+

The distinction between the two cases is nevertheless immaterial as we shall only consider probability measures on BI assigning measure 0 to the rationals in I . The probability structure of (an )n∈N+ under λ will be given later. See Proposition 1.2.7. Let us define some related random variables. For any n ∈ N+ put rn =

1 τ n−1

= [an ; an+1 , an+2 , · · · ],

(1.2.1)

Basic properties

15 qn−1 1 , yn = , qn sn ¯ ¯−1 ¯ pn−1 ¯¯ −2 ¯ un (ω) = qn−1 ω − , ω ∈ Ω, ¯ qn−1 ¯ sn =

(1.2.2) (1.2.3)

where, as usual, pn /qn = [a1 , · · · , an ] , n ∈ N+ , is the nth convergent, p0 = 0, q0 = 1. Note that qn = y1 · · · yn = (s1 · · · sn )−1 , n ∈ N+ . Next, it follows from the first equation (1.1.11) that 1 = an + sn−1 , sn

n ∈ N+ ,

with s0 = 0. Hence sn = [an , · · · , a1 ] ,

n ∈ N+ .

(1.2.20 )

Finally, using (1.1.15) it is easy to see that un = sn−1 + rn ,

n ∈ N+ .

(1.2.30 )

In what follows we shall refer to the qn , rn , sn , un , yn , n ∈ N+ , as associated (with (an )n∈N+ ) random variables. It is clear that 0 < sn < 1 whilst rn , un , yn > 1, n ∈ N+ . We defer to Subsection 1.2.3 the study of distributional properties under λ of the associated random variables.

1.2.2

Gauss’ problem and measure

Of paramount importance for the metric theory of the continued fraction expansion, actually its first basic result, is the asymptotic behaviour of the distribution function Fn (x) = λ (τ n < x) = λ (τ −n ([0, x))), x ∈ I, of τ n as n → ∞. C.F. Gauss wrote on 25th October 1800 in his diary that (in modern notation) lim Fn (x) =

n→∞

log (x + 1) , log 2

x ∈ I.

Gauss’ proof has never been found. Later, in a letter dated 30th January 1812, Gauss asked Laplace what we now call: Gauss’ Problem. Estimate the error en (x) := Fn (x) −

log(x + 1) , log 2

n ∈ N, x ∈ I.

16

Chapter 1

Gauss’ letter has been published on pages 371–372 of his Werke, Volume 1, Section 1, Teubner, Leipzig, 1917. Almost the whole letter is reproduced on pages 396–397 of J.V. Uspensky’s Introduction to Mathematical Probability, McGraw-Hill, New York, 1937. See also Gray (1984, p. 123) for other historical details about Gauss’ problem. The first one to give a solution to Gauss’ problem (implicitly proving Gauss’ 1800 assertion) was R.O.√Kuzmin, who showed in 1928 [see Kuzmin (1928, 1932)] that en (x) = O(q n ) as n → ∞, with 0 < q < 1, uniformly in x ∈ I. Kuzmin’s proof is reproduced in Khintchine (1956, 1963, 1964). Independently, Paul L´evy showed one year later [see L´evy (1929) and√also L´evy (1954, Ch.IX)] that |en (x)| ≤ q n , n ∈ N+ , x ∈ I, with q = 3.5−2 2 = 0.67157 · · · . We present a slightly improved version of L´evy’s solution in Subsection 1.3.5. Using Kuzmin’s approach, Sz˝ usz (1961) claimed to have lowered the L´evy estimate for q to 0.4. Actually, Sz˝ usz’s argument yields just 0.485 rather than 0.4. The optimal value of q was determined by Wirsing (1974), who found that it was equal to 0.303 663 002 · · · . Chapter 2 is devoted to a thorough treatment of Gauss’ problem. In particular, Corollary 2.3.6 provides a complete solution to a generalization of it, where the interval [0, x), x ∈ I, is replaced by an arbitrary set A ∈ BI . The limiting distribution function log(x + 1)/ log 2, x ∈ I, occurring in Gauss’ problem motivates the introduction of what we now call Gauss’ measure γ, which is defined on BI by Z 1 dx γ (A) = , A ∈ BI . log 2 A x + 1 Then clearly γ([0, x]) = log(x+1)/ log 2, x ∈ I. We are going to prove that γ and τ enjoy an important property. First, we note that τ does not preserve λ. This means that we do not have λ(τ −1 (A)) = λ (A) for any A ∈ BI . Indeed, for, e.g., A = (1/2, 1) we have ¶ [ µ 1 1 −1 , τ (A) = i + 1 i + 1/2 i∈N+

and

¶ ¶ X µ 1 1 1 1 − =2 − i + 1/2 i + 1 2i + 1 2i + 2 i∈N+ i∈N+ µ ¶ 1 = 2 log 2 − 1 + = 2 log 2 − 1 2

¡ ¢ λ τ −1 (A) =

while λ (A) = 1/2.

X µ

Basic properties

17

Instead, τ does preserve γ and we state formally this result, which is a basic one in the metric theory of the RCF expansion. Theorem 1.2.1 Gauss’ measure γ is preserved by τ, and the sequence (an )n∈N+ is strictly stationary under γ. Proof. We should show that ¡ ¢ γ τ −1 (A) = γ(A),

A ∈ BI .

For this it is enough to show that the above equation holds for any interval A = (0, u], 0 < u ≤ 1. As [ · 1 1¶ −1 τ ((0, u]) = , , u+i i i∈N+

we only need to verify that Z u X Z 1/i dx dx = , 1/(u+i) x + 1 0 x+1 i∈N+

which is an easy exercise. Since an = a1 ◦ τ n−1 , n ∈ N+ , the second assertion is obvious.

2

Remark. The expectation of a1 under γ is infinite. Indeed Z 1 Z 1/i 1 a1 (x) 1 X dx dx = i = ∞. log 2 0 x + 1 log 2 1/(i+1) x + 1 i∈N+

2

1.2.3

Fundamental intervals, and applications

For any n ∈ N+ and i(n) = (i1 , · · · , in ) ∈ Nn+ define I(i(n) ) = ( ω ∈ Ω : ak (ω) = ik , 1 ≤ k ≤ n ) . For example, for any i ∈ N+ we have µ I (i) = ( ω ∈ Ω : a1 (ω) = i ) = Ω ∩

1 1 , i+1 i

¶ .

We are going to prove that any I(i(n) ) is the set of irrationals from a certain open interval with rational endpoints. The sets I(i(n) ), i(n) ∈ Nn+ , are

18

Chapter 1

called fundamental intervals of rank n. Let us make the convention that I(i(0) ) = Ω. Theorem 1.2.2 For any n ∈ N+ and i(n) = (i1 , · · · , in ) ∈ Nn+ let pn−1 = [i1 , · · · , in−1 ] , qn−1

pn = [i1 , · · · , in ] qn

with g.c.d. (pn−1 , qn−1 ) = g.c.d. (pn , qn ) = 1, p0 = 0, q0 = 1. Then I(i(n) ) = Ω ∩ (u(i(n) ), v(i(n) )), where u(i

(n)

)=

v(i(n) ) =

We have

 pn + pn−1     qn + qn−1

if n is odd,

   

if n is even,

    

pn qn pn qn

if n is odd,

 pn + pn−1    qn + qn−1

 

pn + pn−1 =  qn + qn−1

(1.2.4)

if n is even.

[i1 + 1]

if n = 1,

[i1 , · · · , in−1 , in + 1] if n > 1,

λ(I(i(n) )) =

1 qn (qn + qn−1 )

(1.2.5)

and max λ(I(i(n) )) = λ (I (1(n))) =

i(n) ∈Nn +

1 , Fn Fn+1

n ∈ N+ ,

(1.2.6)

with 1(n) = (i1 , · · · , in ), where i1 = · · · = in = 1. ¡ ¢ Proof. Since [i1 , · · · , in−1 , in + ω] ∈ I i(n) , n ≥ 2, and [i1 + ω] ∈ I (i1 ) ¡ ¡ ¢¢ for any ω ∈ Ω, we have τ n I i(n) = Ω for any n ∈ N+ and i(n) ∈ Nn+ . In conjunction with (1.1.14) this proves (1.2.4). It thus appears that I(i(n) ) is the image of Ω under the map ω→

pn + ωpn−1 , qn + ωqn−1

ω ∈ Ω.

Basic properties

19

Next, (1.2.5) follows from (1.2.4) and (1.1.12). Finally, (1.2.6) is an immediate consequence of (1.2.5), as the minimum of qn is attained for i1 = · · · = in = 1 [cf.(1.1.13)]. 2 Remark. When denoting by pn and qn , n ∈ N+ , quantities seemingly different from those already defined in Subsection 1.1.2, we clearly abused the notation. However, it should be noted that according to the context pn and qn will appear either functions of ω ∈ Ω or of i(n) ∈ Nn+ as well. ¡ (n) ¢ to be ¡ (n) ¢ Actually, pn i (qn i ) is the common value of pn (qn ) as defined in ¡ ¢ Subsection 1.1.2 at all points ω ∈ I i(n) , n ∈ N+ . 2 Corollary 1.2.3 For p, q ∈ N+ with p < q and g.c.d. (p, q) = 1 let p = [i1 , · · · , in ] = [i1 , · · · , in−1 , in − 1, 1] q for some n = n (p/q) ∈ N+ , where in ≥ 2. Define pn−1 = [i1 , · · · , in−1 ] , qn−1

p− n = [i1 , · · · , in−1 , in − 1] qn−

− with g.c.d. (pn−1 , qn−1 ) = g.c.d. (p− n , qn ) = 1, p0 = 0, q0 = 1, and µ ¶ p Ip/q = ω ∈ Ω : is a convergent of ω . q

Then Ip/q = I (i1 , · · · , in ) ∪ I (i1 , · · · , in−1 , in − 1, 1)  µ ¶ p + pn−1 p + p−  n  Ω ∩ , if n is odd,    q + qn−1 q + qn− = µ ¶   p + p−  n p + pn−1   Ω ∩ , if n is even q + qn− q + qn−1 and

¡ ¢ λ Ip/q =

(1.2.7)

3 ¡ ¢. (q + qn−1 ) q + qn−

We have max

{p,q∈N+ : n(p,q)=n}

¡ ¢ ¡ ¢ λ Ip/q = λ IFn /Fn+1 =

3 , (Fn−1 + Fn+1 ) Fn+2

n ∈ N+ .

20

Chapter 1 Proof. By (1.1.11) we have p = p− n + pn−1 ,

q = qn− + qn−1 .

It then follows from (1.2.4) that

I (i1 , · · · , in−1 , in − 1, 1) =

 ¶ µ p p + p−  n  if n is odd, , Ω ∩    q q + qn− µ ¶   p + p−  n p   Ω ∩ , if n is even q + qn− q

while, by (1.2.4) again,

I (i1 , · · · , in ) =

 µ ¶ p + pn−1 p   Ω ∩ , if n is odd    q + qn−1 q ¶ µ   p p + pn−1   , if n is even.  Ω ∩ q q + qn−1

The last two equations ¡ ¢show that (1.2.7) holds. To compute λ Ip/q we have to¡ use ¢(1.1.12) three times. Finally, we should note that the maximum of λ Ip/q is obtained for i1 = · · · = in−1 = 1, in = 2. 2 Corollary 1.2.4 (Legendre’s theorem) For ω ∈ Ω and p, q ∈ N+ with p < q and g.c.d. (p, q) = 1 let p = [i1 , · · · , in ] , q

pn−1 = [i1 , · · · , in−1 ] qn−1

with p0 = 0, q0 = 1, where the length n = n (p/q) ∈ N+ of the continued fraction expansion of p/q is chosen in such a way that it is even if p/q < ω and odd otherwise. Define ¯ ¯ ¯ p ¯¯ 2¯ Θ = q ¯ ω − ¯. q Then Θ<

q p if and only if is a convergent of ω. q + qn−1 q

In particular, if Θ ≤ 1/2 then p/q is a convergent of ω . Proof. If p/q is a convergent of ω, then by (1.1.15) we have ¯ ¯ ¯ p ¯¯ q τ n (ω) q 2¯ Θ=q ¯ω− ¯= < . n q q + τ (ω) qn−1 q + qn−1

Basic properties

21

Conversely, if Θ < q / (q + qn−1 ) then ¯ ¯ ¯ ¯ 1 ¯ω − p ¯ < . ¯ ¯ q q (q + qn−1 )

(1.2.8)

Assuming that p/q < ω, that is, n is even, from (1.2.8) we obtain p p 1 p + pn−1 <ω< + = q q q (q + qn−1 ) q + qn−1 [by (1.1.12)]. Similarly, assuming that p/q > ω, that is, n is odd, we obtain p p 1 p + pn−1 >ω> − = q q q (q + qn−1 ) q + qn−1 [by (1.1.12) again]. In both cases we thus have ω ∈ I (i1 , · · · , in ). Hence p/q = [i1 , · · · , in ] is a convergent of ω. The special case follows from the inequality q / (q + qn−1 ) > 1/2 which holds since q > qn−1 . 2 Corollary 1.2.5 For any n ∈ N+ and i(n) = (i1 , · · · , in ) ∈ N+ we have γ (ak = i1 , · · · , ak+n−1 = in ) =

1 1 + v(i(n) ) log , log 2 1 + u(i(n) )

k ∈ N+ .

In particular, µ ¶ 1 (i + 1)2 1 1 γ (ak = i) = log = log 1 + log 2 i (i + 2) log 2 i (i + 2)

(1.2.9)

for any k, i ∈ N+ . Proof. Theorem 1.2.1 and equation (1.2.4).

2

Corollary 1.2.6 (Brod´en–Borel–L´evy formula) For any n ∈ N+ we have (sn + 1) x λ (τ n < x | a1 , · · · , an ) = , x ∈ I, (1.2.10) sn x + 1 where sn is defined by (1.2.2) or (1.2.20 ). Proof. Clearly, for any n ∈ N+ and x ∈ I, λ (τ n < x | a1 , · · · , an ) =

λ ((τ n < x) ∩ I (a1 , · · · , an )) . λ (I (a1 , · · · , an ))

22

Chapter 1

By (1.1.14) and (1.2.4) we have (τ n < x) ∩ I (a1 , · · · , an )

=

 µ ¶ pn pn + xpn−1   <ω< ω∈Ω:    qn + xqn−1 qn

if n is odd,

µ ¶   pn pn + xpn−1   <ω< if n is even.  ω∈Ω: qn qn + xqn−1

Hence, using (1.2.5) and (1.1.12), λ (τ n < x | a1 , · · · , an ) =

qn (qn + qn−1 ) x (sn + 1) x = qn (qn + xqn−1 ) sn x + 1

for any n ∈ N+ and x ∈ I, and the proof is complete.

2

Remark. For x ∈ N+ equation (1.2.10) has been obtained by the Swedish mathematician T. Brod´en as early as 1900 [see Brod´en (1900, p. 246)], nine ´ Borel [see Borel (1909)]. L´evy (1929) also obtained and years before E. used (1.2.10). This equation was called the Borel-L´evy formula by Doeblin (1940). A generalization of (1.2.10) will be given in Proposition 1.3.8. 2 The Brod´en–Borel–L´evy formula (1.2.10) allows us to determine the probability structure of (an )n∈N+ under λ. Proposition 1.2.7 For any i, n ∈ N+ we have λ (a1 = i) =

1 , i (i + 1)

(1.2.11)

λ (an+1 = i | a1 , · · · , an ) = Pi (sn ) ,

(1.2.12)

x+1 , (x + i) (x + i + 1)

(1.2.13)

where Pi (x) =

x ∈ I.

Proof. As we have already noted, µ ( ω ∈ Ω : a1 (ω ) = i ) = Ω ∩ and (1.2.11) follows at once.

¶ 1 1 , , i+1 i

i ∈ N+ ,

Basic properties

23

Since τ n (ω) = [an+1 (ω) , an+2 (ω) , · · · ] , n ∈ N+ , ω ∈ Ω, we have µ µ ¶¶ 1 1 n ( ω ∈ Ω : an+1 (ω) = i ) = ω ∈ Ω : τ (ω) ∈ , i+1 i for any n, i ∈ N+ so that λ (an+1

µ µ ¶¯ ¶ 1 1 ¯¯ n = i | a1 , · · · , an ) = λ τ ∈ , a1 , · · · , an , i+1 i ¯

and (1.2.12) follows from (1.2.10).

2

Remark. Proposition 1.2.7 is the starting point of an approach to the metrical theory of the continued fraction expansion via dependence with complete connections. See Iosifescu and Grigorescu (1990, Section 5.2). 2 Corollary 1.2.8 The sequence (sn )n∈N+ with s0 = 0 is a Q ∩ I-valued Markov chain on (I, BI , λ) with the following transition mechanism: from state s ∈ Q ∩ I the possible transitions are to any state 1/ (s + i) with corresponding transition probability Pi (s), i ∈ N+ . We conclude this subsection by considering the random variables rn and un , n ∈ N+ , introduced in Subsection 1.2.1. Proposition 1.2.9 For any n ∈ N+ and x ≥ 1 we have 1 , x

(1.2.14)

sn + 1 , sn + x

(1.2.15)

λ (r1 < x) = λ (u1 < x) = 1 − λ (rn+1 < x | a1 , · · · , an ) = 1 − (

0 if x ≤ sn + 1, (1.2.16) sn + 1 if x > sn + 1. 1− x Proof. Equations (1.2.14) are obvious since r1 = u1 = 1/τ 0 . Then for any n ∈ N+ and x ≥ 1 we have ¶ µ 1 ¯¯ λ (rn+1 < x | a1 , · · · , an ) = λ τ n > ¯ a1 , · · · , an x λ (un+1 < x | a1 , · · · , an ) =

and λ (un+1 < x | a1 , · · · , an ) = λ (rn+1 < x − sn | a1 , · · · , an ) µ = λ τn >

¶ 1 ¯¯ ¯ a1 , · · · , a n . x − sn

24

Chapter 1

To obtain equations (1.2.15) and (1.2.16) it remains to use (1.2.10).

2

Corollary 1.2.10 For any n ∈ N+ let Gn (s) = λ(sn < s), s ∈ R, G0 (s) = 0 or 1 according as s ≤ 0 or s > 0. For any n ∈ N+ and x ≥ 1 we have Z 1 x−1 λ (rn < x) = dGn−1 (s) 0 s+x (1.2.17) ¶ µ Z 1 Gn−1 (s) ds 1 , = (x − 1) + x+1 (s + x)2 0  Z x−1 µ ¶ s+1   dGn−1 (s) if 1 ≤ x ≤ 2, 1−    0 x λ (un < x) = (1.2.18) ¶ Z 1µ   s + 1    dGn−1 (s) if x > 2 1− x 0  Z 1 x−1   Gn−1 (s) ds if 1 ≤ x ≤ 2,    x 0 = Z   1 1 2    1− + Gn−1 (s) ds if x > 2 x x 0

=

1 x

d λ (rn < x) = dx

Z Z

x−1

Gn−1 (s) ds,

0 1

(s + 1) dGn−1 (s) (s + x)2 0 Z 1 2 (s − x + 2) Gn−1 (s) ds = . 2 + (x + 1) (s + x)3 0 Also, for any n ∈ N+ we have d 1 1 λ (un < x) = Gn−1 (x − 1) − 2 dx x x

=

Z 0

x−1

Gn−1 (s) ds

 Z x−1 1 1   G (x − 1) − 2 Gn−1 (s) ds if 1 ≤ x ≤ 2,    x n−1 x 0 µ ¶ Z 1    1   2 2− Gn−1 (s) ds x 0

(1.2.19)

if x > 2

(1.2.20)

Basic properties

25

a.e. in [1, ∞). Proof. The first equality in (1.2.17) follows at once from (1.2.15). To obtain the second one we integrate by parts noting that Gn (0) = 0 and Gn (1) = 1 for any n ∈ N. Similarly, the first equality in (1.2.18) follows at once from (1.2.16). To obtain the second and third ones we integrate by parts and then note that Gn (s) = 1 for any n ∈ N and s ≥ 1. Finally, equations (1.2.19) and (1.2.20) follow immediately from (1.2.17) and (1.2.18), respectively. 2

1.3

The natural extension of τ

1.3.1

Definition and basic properties

The incomplete quotients an , n ∈ N+ , are expressed in terms of a1 and the powers of the continued fraction transformation τ . Such a thing is not possible for the variables sn or un , n ∈ N+ . To rule out this inconvenience we consider the so called natural extension τ of τ which is a transformation of (0, 1) × I defined by µ ¶ 1 τ (ω, θ) = τ (ω) , , (ω, θ) ∈ (0, 1) × I. (1.3.1) a1 (ω) + θ This is a one-to-one transformation of Ω2 with inverse ¶ µ 1 −1 , τ (θ) , (ω, θ) ∈ Ω2 . τ (ω, θ) = a1 (θ) + ω

(1.3.2)

It is easy to see that for any n ≥ 2 we have τ n (ω, θ) = (τ n (ω) , [an (ω) , · · · , a2 (ω) , a1 (ω) + θ])

(1.3.10 )

whatever (ω, θ) ∈ Ω × I, and τ −n (ω, θ) = ([an (θ) , · · · , a2 (θ) , a1 (θ) + ω], τ n (θ))

(1.3.20 )

whatever (ω, θ) ∈ Ω2 . Equations (1.3.1) and (1.3.10 ) imply that τ n (ω, 0) = (τ n (ω) , sn (ω)) ,

n ∈ N+ ,

(1.3.3)

26

Chapter 1

for any ω ∈ Ω. Note that the above equation also hold for n = 0 if we define τ 0 =identity map. Now, define the extended Gauss measure γ on BI2 by ZZ 1 dxdy γ (B) = , B ∈ BI2 . log 2 B (xy + 1)2 Note that γ (A × I) = γ (I × A) = γ (A)

(1.3.4)

for any A ∈ B_I. The result below shows that γ̄ plays with respect to τ̄ the part played by γ with respect to τ (cf. Theorem 1.2.1).

Theorem 1.3.1 The extended Gauss measure γ̄ is preserved by τ̄.

Proof. We should show that γ̄(τ̄⁻¹(B)) = γ̄(B) for any B ∈ B_{I²} or, equivalently, since τ̄ is invertible on Ω², that γ̄(τ̄(B)) = γ̄(B) for any B ∈ B_{I²}. As the set of Cartesian products I(i^{(m)}) × I(j^{(n)}), i^{(m)} ∈ N₊^m, j^{(n)} ∈ N₊^n, m, n ∈ N, generates the σ-algebra B_{I²}, it is enough to show that

    γ̄(τ̄(I(i^{(m)}) × I(j^{(n)}))) = γ̄(I(i^{(m)}) × I(j^{(n)}))                          (1.3.5)

for any i^{(m)} ∈ N₊^m, j^{(n)} ∈ N₊^n, m, n ∈ N. It follows from (1.3.4) and Theorem 1.2.1 that (1.3.5) holds for m = 0 and n ∈ N. If m ∈ N+ then it is easy to see that

    τ̄(I(i^{(m)}) × I(j^{(n)})) = I(i₂, · · · , i_m) × I(i₁, j₁, · · · , j_n),   n ∈ N+,

where I(i₂, · · · , i_m) equals Ω for m = 1. Also, if I(i^{(m)}) = Ω ∩ (a, b) and I(j^{(n)}) = Ω ∩ (c, d), with a, b, c, d ∈ Q ∩ I, then I(i₂, · · · , i_m) = Ω ∩ (b⁻¹ − i₁, a⁻¹ − i₁) and I(i₁, j₁, · · · , j_n) = Ω ∩ ((d + i₁)⁻¹, (c + i₁)⁻¹). A simple computation yields

    γ̄((a, b) × (c, d)) = (1/log 2) log [ (bd + 1)(ac + 1) / ((bc + 1)(ad + 1)) ],

and then

    γ̄((b⁻¹ − i₁, a⁻¹ − i₁) × ((d + i₁)⁻¹, (c + i₁)⁻¹))
      = (1/log 2) log [ ((a⁻¹ − i₁)(c + i₁)⁻¹ + 1)((b⁻¹ − i₁)(d + i₁)⁻¹ + 1) / (((a⁻¹ − i₁)(d + i₁)⁻¹ + 1)((b⁻¹ − i₁)(c + i₁)⁻¹ + 1)) ]
      = (1/log 2) log [ (bd + 1)(ac + 1) / ((bc + 1)(ad + 1)) ],

that is, (1.3.5) holds. □
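The rectangle identity used in the proof can be confirmed numerically. The sketch below is ours and purely illustrative; the function name and the particular rectangle inside I(3) are arbitrary choices, not part of the text.

```python
import math

def gauss_bar(a, b, c, d):
    # extended Gauss measure of the rectangle (a,b) x (c,d), by the closed form in the proof
    return math.log((b*d + 1) * (a*c + 1) / ((b*c + 1) * (a*d + 1))) / math.log(2)

i1 = 3
a, b = 0.26, 0.30          # an interval contained in I(3) = (1/4, 1/3]
c, d = 0.40, 0.55          # any subinterval of I on the second coordinate

lhs = gauss_bar(a, b, c, d)
# image of the rectangle under tau_bar: (1/b - i1, 1/a - i1) x (1/(d + i1), 1/(c + i1))
rhs = gauss_bar(1/b - i1, 1/a - i1, 1/(d + i1), 1/(c + i1))
print(lhs, rhs)            # the two values agree, illustrating (1.3.5)
```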

For more details on natural extensions we refer the reader to Subsection 4.0.1.
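Before moving on, relation (1.3.3) is also easy to check by direct computation. The sketch below is ours (the helper names are not from the text): it iterates τ̄ starting from (ω, 0) and compares the second component with s_n = q_{n−1}/q_n obtained from the convergents of ω.

```python
from fractions import Fraction
import math

def tau(x):                 # continued fraction transformation: tau(x) = 1/x - floor(1/x)
    return 1 / x - math.floor(1 / x)

def tau_bar(x, theta):      # natural extension (1.3.1)
    return tau(x), 1 / (math.floor(1 / x) + theta)

def s_n(omega, n):          # s_n = q_{n-1}/q_n from the first n partial quotients of omega
    q_prev, q, x = Fraction(0), Fraction(1), omega
    for _ in range(n):
        a = math.floor(1 / x)
        q_prev, q = q, a * q + q_prev
        x = tau(x)
    return q_prev / q

omega = math.pi - 3          # a sample number in (0, 1)
x, theta = omega, 0.0
for n in range(1, 11):
    x, theta = tau_bar(x, theta)                       # now (x, theta) = tau_bar^n(omega, 0)
    assert abs(theta - float(s_n(omega, n))) < 1e-9    # checks (1.3.3)
print("relation (1.3.3) verified for n = 1,...,10")
```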

1.3.2 Approximation coefficients

On account of Legendre's theorem (see Corollary 1.2.4), for any ω ∈ Ω we define the approximation coefficients Θn = Θn(ω) as

    Θn = Θn(ω) = q_n² | ω − p_n/q_n |,   n ∈ N.

Clearly, Θ₀(ω) = ω, ω ∈ Ω, and by (1.2.3) we have

    Θn = u_{n+1}⁻¹,   n ∈ N.                                                            (1.3.6)

Hence

    0 < Θn < 1,   n ∈ N.

It is rather easy to obtain more information about Θn, n ∈ N. It follows from (1.2.3′) and (1.2.1) that

    Θn = 1/(s_n + r_{n+1}) = τⁿ/(s_n τⁿ + 1),   n ∈ N.

Moreover, as s_n⁻¹ = a_n + s_{n−1} and r_n = a_n + r_{n+1}⁻¹, n ∈ N+, we also have

    Θ_{n−1} = 1/(s_{n−1} + r_n) = 1/(s_{n−1} + a_n + r_{n+1}⁻¹) = s_n/(s_n τⁿ + 1),   n ∈ N+.

Thus it appears that

    (Θ_{n−1}, Θn) = Ψ(τⁿ, s_n),   n ∈ N+,                                               (1.3.7)

the function Ψ : I² → R₊² being defined by

    Ψ(x, y) = ( y/(xy + 1), x/(xy + 1) ),   (x, y) ∈ I².

Clearly, Ψ is a C¹-diffeomorphism between the interior of I² and the interior of the triangle ∆ with vertices (0, 0), (1, 0) and (0, 1). It then follows from (1.3.7) that

    Θ_{n−1} + Θn < 1,   n ∈ N+,

whence

    min(Θ_{n−1}, Θn) < 1/2,   n ∈ N+,


a well known result due to Vahlen (1895). The inverse Ψ⁻¹ of Ψ is given by

    Ψ⁻¹(α, β) = ( 2β/(1 + √(1 − 4αβ)), 2α/(1 + √(1 − 4αβ)) ),   (α, β) ∈ ∆.

For i ∈ N+ put Vi = I(i) × Ω and Hi = Ω × I(i). It follows from the definition of τ̄ that

    τ̄(Vi) = Hi,   Vi = τ̄⁻¹(Hi),   i ∈ N+,

and that for any i ∈ N+ we have

    τ̄ⁿ ∈ Vi if and only if a_{n+1} = i,   n ∈ N,                                       (1.3.8)

    τ̄ⁿ ∈ Hi if and only if a_n = i,   n ∈ N+.                                          (1.3.9)

Furthermore, the set Vi* = ΨVi is a quadrangle with vertices

    (0, 1/i),  (i/(i + 1), 1/(i + 1)),  ((i + 1)/(i + 2), 1/(i + 2))  and  (0, 1/(i + 1)),

and its symmetric image with respect to the diagonal α = β is Hi* = ΨHi, i ∈ N+. (For i = 1 both quadrangles are in fact triangles.) Define the mapping F : ∆ → ∆ as F = Ψ τ̄ Ψ⁻¹. It is easy to check that for any i ∈ N+ we have

    (α, β) ∈ Vi*  ⇒  F(α, β) = ( β, α + i √(1 − 4αβ) − i²β ).                           (1.3.10)

Now, by (1.3.7) we have Ψ⁻¹(Θ_{n−1}, Θn) = (τⁿ, s_n), whence

    τ̄(Ψ⁻¹(Θ_{n−1}, Θn)) = (τ^{n+1}, s_{n+1}),   n ∈ N+.

Therefore, by (1.3.7) again,

    F(Θ_{n−1}, Θn) = Ψ τ̄ Ψ⁻¹(Θ_{n−1}, Θn) = Ψ(τ^{n+1}, s_{n+1}) = (Θn, Θ_{n+1}),   n ∈ N+.   (1.3.11)


Hence, by (1.3.3), (1.3.8), and (1.3.10),

    Θ_{n+1} = Θ_{n−1} + a_{n+1} √(1 − 4Θ_{n−1}Θn) − a²_{n+1} Θn,   n ∈ N+.              (1.3.12)

Similarly, for any i ∈ N+ we have

    (α, β) ∈ Hi*  ⇒  F⁻¹(α, β) = ( β + i √(1 − 4αβ) − i²α, α ).                         (1.3.13)

As by (1.3.3), (1.3.9), and (1.3.13) we have

    F⁻¹(Θn, Θ_{n+1}) = (Θ_{n−1}, Θn),   n ∈ N+,

we obtain

    Θ_{n−1} = Θ_{n+1} + a_{n+1} √(1 − 4Θn Θ_{n+1}) − a²_{n+1} Θn,   n ∈ N+.             (1.3.12′)

We note that both (1.3.12) and (1.3.12′) can be established by direct computation using the relationships between Θn, rn, sn, and an, n ∈ N+. We are now able to derive some classical results in Diophantine approximation. Put

    f_i(α, β) = α + i √(1 − 4αβ) − i²β,   i ∈ N+,

so that (1.3.10) can be rewritten as

    (α, β) ∈ Vi*  ⇒  F(α, β) = (β, f_i(α, β)).

It is easy to check that

    ∂f_i(α, β)/∂α < 0,   ∂f_i(α, β)/∂β < 0,   (α, β) ∈ Vi*,  i ∈ N+.

The only fixed point of τ̄ in Vi is (ξ_i, ξ_i), where

    ξ_i = [i, i, i, · · · ] = (−i + √(i² + 4))/2,   i ∈ N+,                             (1.3.14)

while the only fixed point of F in Vi* = ΨVi is (ξ_i*, ξ_i*), where

    (ξ_i*, ξ_i*) = Ψ(ξ_i, ξ_i) = ( 1/√(i² + 4), 1/√(i² + 4) ),   i ∈ N+.                (1.3.15)

Note that by (1.3.11) we have (Θ_{n−1}, Θn, Θ_{n+1}) = (Θ_{n−1}, F(Θ_{n−1}, Θn)), n ∈ N+. Hence, for any i, n ∈ N+,

    (Θ_{n−1}, Θn, Θ_{n+1}) = (Θ_{n−1}, Θn, f_i(Θ_{n−1}, Θn))

if and only if (Θ_{n−1}, Θn) ∈ Vi*, that is, by (1.3.7), if and only if a_{n+1} = i. Finally, note that

    Θ_{n−1}(ξ_i*) ≠ Θn(ξ_i*)                                                            (1.3.16)

for any i, n ∈ N+. Now, on account of (1.3.14) through (1.3.16) we can state the following result.

Theorem 1.3.2 For any ω ∈ Ω and n ∈ N+ we have

    min(Θ_{n−1}, Θn, Θ_{n+1}) < 1/√(a²_{n+1} + 4)                                        (1.3.17)

and

    max(Θ_{n−1}, Θn, Θ_{n+1}) > 1/√(a²_{n+1} + 4).                                       (1.3.18)
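The recursion (1.3.12) and the bounds (1.3.17) and (1.3.18) are easy to test numerically. The sketch below is ours (variable names are not from the text): it computes the approximation coefficients of a sample number exactly from its convergents and then checks the recursion and the two inequalities.

```python
from fractions import Fraction
import math

omega = Fraction(math.pi - 3)      # a sample number in (0, 1), stored exactly as a rational
N = 10

a, theta = [], [float(omega)]      # a[k] = a_{k+1}; theta[k] = Theta_k, with Theta_0 = omega
p_prev, p = Fraction(1), Fraction(0)
q_prev, q = Fraction(0), Fraction(1)
x = omega
for _ in range(N):
    d = int(1 / x)                 # partial quotient
    a.append(d)
    p_prev, p = p, d * p + p_prev
    q_prev, q = q, d * q + q_prev
    theta.append(float(q * q * abs(omega - p / q)))   # Theta_k = q_k^2 |omega - p_k/q_k|
    x = 1 / x - d

for n in range(1, N):
    rhs = theta[n-1] + a[n] * math.sqrt(1 - 4 * theta[n-1] * theta[n]) - a[n] ** 2 * theta[n]
    assert abs(theta[n+1] - rhs) < 1e-9               # recursion (1.3.12); a[n] is a_{n+1}
    bound = 1 / math.sqrt(a[n] ** 2 + 4)
    assert min(theta[n-1:n+2]) < bound < max(theta[n-1:n+2])   # (1.3.17) and (1.3.18)
print("(1.3.12), (1.3.17), (1.3.18) verified for n = 1, ...,", N - 1)
```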

Inequality (1.3.17) generalizes a result of Borel (1903) according to which

    min(Θ_{n−1}, Θn, Θ_{n+1}) < 1/√5,   n ∈ N+.

A great number of people independently found (1.3.17). See, e.g., Bagemihl and McLaughlin (1966), Obrechkoff (1951), Sendov (1959/60). Inequality (1.3.18) is due to Tong (1983). Actually, the method sketched above yields easy proofs of generalizations of a great number of classical results by M. Fujiwara, B. Segre, J. LeVeque, P. Szüsz, and others. We will mention here a generalization of a result of B. Segre. For other results the reader is referred to Jager and Kraaikamp (1989) and Kraaikamp (1991).

Theorem 1.3.3 Let ρ ≥ 0 and n ∈ N+. Then of the three inequalities

    Θ_{2n−1} < ρ/√(a²_{2n+1} + 4ρ),   Θ_{2n} < 1/√(a²_{2n+1} + 4ρ),   Θ_{2n+1} < ρ/√(a²_{2n+1} + 4ρ)

at least one is satisfied and at least one is not satisfied.

Corollary 1.3.4 [Segre (1945)] Let ρ ≥ 0 and ω ∈ Ω. Then there are infinitely many rational numbers p/q with p < q and g.c.d.(p, q) = 1 satisfying the inequalities

    −ρ/(√(1 + 4ρ) q²) < ω − p/q < 1/(√(1 + 4ρ) q²).


Remark. Tong (1994) proved the optimal version of Theorem 1.3.2 by showing that for any ω ∈ Ω and n ∈ N+ we have

    min(Θ_{n−1}, Θn, Θ_{n+1}) < 1/√( (a_{n+1} + |τ^{n+1} − s_n|)² + 4 )

and

    max(Θ_{n−1}, Θn, Θ_{n+1}) > 1/√( (a_{n+1} − |τ^{n+1} − s_n|)² + 4 ). □

1.3.3 Extended random variables

It is well known [see, e.g., Doob (1953, p. 456)] that a doubly infinite version of (an)_{n∈N+} under γ (i.e., when the process is a strictly stationary one, see Theorem 1.2.1) should exist on a richer probability space. It is possible to construct it effectively by using the natural extension τ̄ as follows. Define extended incomplete quotients ā_ℓ, ℓ ∈ Z, on Ω² by

    ā_{ℓ+1}(ω, θ) = a₁(τ̄^ℓ(ω, θ)),   ℓ ∈ Z,

with ā₁(ω, θ) = a₁(ω), (ω, θ) ∈ Ω². Clearly, by (1.3.1′) and (1.3.2′) we have

    ā_n(ω, θ) = a_n(ω),   ā₀(ω, θ) = a₁(θ),   ā_{−n}(ω, θ) = a_{n+1}(θ),   n ∈ N+, (ω, θ) ∈ Ω².

Similarly to the interpretation of the a_n, n ∈ N+, in Subsection 1.2.1, we can consider the ā_ℓ, ℓ ∈ Z, as N+-valued random variables on (I², B_{I²}) which are defined µ-a.s. in I² for any probability measure µ on B_{I²} assigning measure 0 to I² \ Ω². (Such a µ is clearly γ̄.) Alternatively, we can look at the ā_ℓ, ℓ ∈ Z, as N+ ∪ {∞}-valued random variables which are defined everywhere in [0, 1)², as the a_n, n ∈ N+, can be defined everywhere in [0, 1) (cf. Subsection 1.2.1). In the latter case a typical trajectory of (ā_ℓ)_{ℓ∈Z} is either
— a doubly infinite sequence of natural numbers;
— a doubly infinite sequence of elements of N+ ∪ {∞} in which the natural numbers appear finitely many times in consecutive positions;
— a doubly infinite sequence of elements of N+ ∪ {∞} in which the natural numbers appear in consecutive positions from a certain rank on or up to a certain rank.


The distinction between the two cases is again immaterial. Since τ̄ preserves γ̄, the doubly infinite sequence (ā_ℓ)_{ℓ∈Z} is strictly stationary under γ̄. It is indeed a doubly infinite version of (a_n)_{n∈N+} under γ, that is, the distribution of (ā_h, · · · , ā_{h+m}) under γ̄ and that of (a_k, · · · , a_{k+m}) under γ are identical for any h ∈ Z, m ∈ N, and k ∈ N+. The probability structure of (ā_ℓ)_{ℓ∈Z} under γ̄ is described by Corollary 1.3.6 to Theorem 1.3.5 below. The latter also brings to light an important family of probability measures on B_I, to be called conditional, which we shall consider in some detail in the next subsection.

Theorem 1.3.5 For any x ∈ I we have

    γ̄([0, x] × I | ā₀, ā₋₁, · · · ) = (a + 1)x/(ax + 1)   γ̄-a.s.,

where a = [ā₀, ā₋₁, · · · ].

Proof. As is well known,

    γ̄([0, x] × I | ā₀, ā₋₁, · · · ) = lim_{n→∞} γ̄([0, x] × I | ā₀, · · · , ā₋ₙ)   γ̄-a.s.

For typographical convenience let us denote by I_n the fundamental interval I(ā₀, · · · , ā₋ₙ) for any arbitrarily fixed values of the ā_i, i = 0, −1, . . . , −n. Then we have

    γ̄([0, x] × I | ā₀, · · · , ā₋ₙ) = γ̄([0, x] × I_n) / γ̄(I × I_n)
      = [ (log 2)⁻¹ ∫_{I_n} dy ∫₀ˣ du/(uy + 1)² ] / γ(I_n)
      = [ (log 2)⁻¹ ∫_{I_n} ( x(y + 1)/(xy + 1) ) dy/(y + 1) ] / γ(I_n)
      = (1/γ(I_n)) ∫_{I_n} ( x(y + 1)/(xy + 1) ) γ(dy)
      = x(y_n + 1)/(x y_n + 1)

for some y_n ∈ I_n. Since

    lim_{n→∞} y_n = [ā₀, ā₋₁, · · · ] = a,

the proof is complete. □
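The conditional distribution function just obtained is easy to explore numerically. The sketch below is ours: it tabulates the masses that x ↦ (a+1)x/(ax+1) assigns to the cylinder sets (a₁ = i); Corollary 1.3.6 below identifies these masses as P_i(a), and for reference we compare them with (a+1)/((a+i)(a+i+1)), the expression recovered in its proof.

```python
import math

def F(x, a):
    # conditional distribution function from Theorem 1.3.5
    return (a + 1) * x / (a * x + 1)

def P(i, a):
    # P_i(a) = (a+1)/((a+i)(a+i+1)), the form obtained in the proof of Corollary 1.3.6
    return (a + 1) / ((a + i) * (a + i + 1))

a = 0.3183                  # an arbitrary value of a = [a_0, a_-1, ...] in I
total = 0.0
for i in range(1, 200):
    mass = F(1 / i, a) - F(1 / (i + 1), a)      # conditional mass of the event (a_1 = i)
    assert abs(mass - P(i, a)) < 1e-12
    total += mass
print("sum of the first 199 masses:", total)     # close to 1, as it must be
```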


Corollary 1.3.6 For any i ∈ N+ we have γ̄(ā₁ = i | ā₀, ā₋₁, · · · ) = P_i(a) γ̄-a.s., where a = [ā₀, ā₋₁, · · · ] and the functions P_i, i ∈ N+, are defined by (1.2.13).

Proof. We have

    (ā₁ = i) = (1/2, 1) × [0, 1)            if i = 1,
    (ā₁ = i) = (1/(i + 1), 1/i] × [0, 1)    if i ≥ 2.

Hence the conditional probability in the statement is γ̄-a.s. equal to

    ((a + 1)/i) / (1 + a/i) − ((a + 1)/(i + 1)) / (1 + a/(i + 1)) = P_i(a). □

Remarks. 1. The strict stationarity of (ā_ℓ)_{ℓ∈Z} under γ̄ implies that the conditional probability

    γ̄(ā_{ℓ+1} = i | ā_ℓ, ā_{ℓ−1}, · · · ),   i ∈ N+,

does not depend on ℓ ∈ Z and is γ̄-a.s. equal to P_i(a), where a = [ā_ℓ, ā_{ℓ−1}, · · · ]. Thus Proposition 1.2.7 and Corollary 1.3.6 provide interpretations of P_i(x) for all x ∈ [0, 1).
2. The process (ā_ℓ)_{ℓ∈Z} is an example of what is called an infinite-order chain in the theory of dependence with complete connections, see Section 5.5 in Iosifescu and Grigorescu (1990). The existence of such chains is not obvious. To ensure the existence several restrictions should be imposed. See, e.g., Theorems 5.5.1 and 5.5.2 in Iosifescu and Grigorescu (op. cit.). The latter refers to N+-valued infinite-order chains and makes explicit use of the continued fraction expansion. The simple effective construction of (ā_ℓ)_{ℓ∈Z} on the probability space (I², B_{I²}, γ̄) fully clarifies an idea of Wolfgang Doeblin [see Doeblin (1940)], who was the first to use dependence with complete connections in the metric theory of the continued fraction expansion. □
Note that by its very construction (ā_ℓ)_{ℓ∈Z} is a reversible process, that is, the finite dimensional distributions under γ̄ of (ā_ℓ)_{ℓ∈Z} and (ā_{−ℓ})_{ℓ∈Z} are identical. A similar property holds for (a_n)_{n∈N+} under γ, as is shown by the following result.


Proposition 1.3.7 The random sequence (a_n)_{n∈N+} on (I, B_I, γ) is reversible, i.e., the distributions of (a_ℓ : m ≤ ℓ ≤ n) and (a_{m+n−ℓ} : m ≤ ℓ ≤ n) are identical for any m, n ∈ N+, m ≤ n.

Proof. By the strict stationarity under γ̄ of (ā_ℓ)_{ℓ∈Z}, the distribution of (ā_ℓ : m ≤ ℓ ≤ n) is identical with the distribution of (ā_{ℓ−m−n+1} : m ≤ ℓ ≤ n) (both under γ̄). But by the very definition of (ā_ℓ)_{ℓ∈Z} the first distribution is identical with that of (a_ℓ : m ≤ ℓ ≤ n) while the second one is identical with that of (a_{m+n−ℓ} : m ≤ ℓ ≤ n) (both under γ). □

Remark. The result stated in Proposition 1.3.7 amounts to the fact that the γ-measures of the fundamental intervals I(i₁, · · · , i_n) and I(i_n, · · · , i₁) are equal for any n ∈ N+ and i₁, · · · , i_n ∈ N+. This can be also proved by direct computation using results from Subsection 1.2.3. See Philipp (1967) and Dürner (1992). □

Define extended associated random variables s̄_ℓ, ȳ_ℓ, r̄_ℓ and ū_ℓ as

    s̄_ℓ = [ā_ℓ, ā_{ℓ−1}, · · · ],   ȳ_ℓ = 1/s̄_ℓ,   r̄_ℓ = [ā_ℓ; ā_{ℓ+1}, ā_{ℓ+2}, · · · ],   ū_ℓ = s̄_{ℓ−1} + r̄_ℓ,   ℓ ∈ Z.

Clearly,

    s̄_ℓ = s̄₀ ∘ τ̄^ℓ,   ȳ_ℓ = ȳ₀ ∘ τ̄^ℓ,   r̄_ℓ = r̄₀ ∘ τ̄^ℓ,   ū_ℓ = s̄₀ ∘ τ̄^{ℓ−1} + r̄₁ ∘ τ̄^{ℓ−1},   ℓ ∈ Z.

x ∈ I.

Similar considerations can be made about the¡ process ¢(y ` )`∈Z . This is a strictly stationary Ω0 -valued Markov process on I 2 , BI2 , γ , where Ω0 = the set of irrationals in [1, ∞). The transition mechanism of (y ` )`∈Z is as follows: from state y ∈ Ω0 the only possible transitions are to any state y −1 + i with corresponding transition probability Pi (1/y), i ∈ N. For any ` ∈ Z we have γ(y ` < x) = γ(y 0 < x) = γ([x−1 , 1]) = γ 0 ([1, x]) ,

x ∈ [1, ∞),

Basic properties

35

where γ 0 is the probability measure on B[1,∞] defined by Z ¡ 0¢ 1 dy 0 γ A = , A0 ∈ B[1,∞) . log 2 A0 y (y + 1) Next, the process¡(r` )`∈Z is¢ a strictly stationary Ω0 -valued ‘deterministic’ Markov process ¡on ¡ I 2 ,¢¢ BI2 , γ in which state r ∈ Ω0 is followed by state −1 1/(r − brc) = 1/ τ r . Obviously, for any ` ∈ Z we have γ (r` < x) = γ (r1 < x) = γ (r1 < x) = γ 0 ([1, x)),

x ∈ [1, ∞).

Note that by the reversibility of (ā_ℓ)_{ℓ∈Z} the finite-dimensional distributions under γ̄ of (s̄_ℓ)_{ℓ∈Z} and (r̄_ℓ⁻¹)_{ℓ∈Z} are identical.
Finally, the process (s̄_{ℓ−1}, r̄_ℓ⁻¹)_{ℓ∈Z} is a strictly stationary Ω²-valued 'deterministic' Markov process on (I², B_{I²}, γ̄) in which state (s, ω) ∈ Ω² is followed by state

    τ̄⁻¹(s, ω) = ( 1/(s + ⌊ω⁻¹⌋), ω⁻¹ − ⌊ω⁻¹⌋ ).

For any ℓ ∈ Z we have

    γ̄(s̄_{ℓ−1} < x, r̄_ℓ⁻¹ < y) = γ̄(s̄₀ < x, r̄₁⁻¹ < y) = (1/log 2) ∫₀^y ∫₀^x du dv/(uv + 1)² = log(xy + 1)/log 2,   x, y ∈ I.

The process (u` )`∈Z , which is a functional of (s`−1 , r−1 ` )`∈Z (note that u` = s`−1 +r` , ` ∈ Z), is no longer Markovian but is still a strictly stationary one. For any ` ∈ Z we have γ (u` < x) = γ (u1 < x) = γ (s0 + r1 < x) =

1 log 2

ZZ D

dudv , (uv + 1)2

x ∈ [1, ∞),

¡ ¢ where D = (u, v) ∈ I 2 : u + v −1 < x . Hence  µ ¶ 1 x−1   log x − if 1 ≤ x ≤ 2,    log 2 x γ (u` < x) = µ ¶   1 1   log 2 − if x ≥ 2.  log 2 x

36

Chapter 1

1.3.4

The conditional probability measures

Motivated by Theorem 1.3.5 we shall consider the family of (conditional ) probability measures (γa )a∈I on BI defined by their distribution functions γa ([0, x]) =

(a + 1) x , ax + 1

x ∈ I, a ∈ I.

In particular, γ0 = λ. The density ha of γa is ha (x) =

a+1 , (ax + 1)2

x ∈ I, a ∈ I,

and [see, e.g., Billingsley (1968, p. 224)] we then have Z 1 sup |γa (A) − γb (A)| = |ha (x) − hb (x)| dx 2 I A∈BI ¯ Z ¯¯ (ab + a + b) x2 + 2x − 1¯ 1 |b − a| = dx 2 (ax + 1)2 (bx + 1)2 I Z α 1 − 2x − (ab + a + b) x2 = |b − a| dx (ax + 1)2 (bx + 1)2 0 α (1 − α) |b − a| , = (αa + 1) (αb + 1) ³ ´−1 p , a, b ∈ I. Hence where α = 1 + (a + 1) (b + 1) sup |γa (A) − γb (A)| ≤

A∈BI

1 |b − a| , 4

a, b ∈ I .

(1.3.19)

It is easy to see that we also have sup |γa ([0, x]) − γb ([0, x])| = x∈I

α (1 − α) |b − a| , (1 + αa) (1 + αb)

a, b ∈ I.

For any a ∈ I put sa0 = a and san =

1 , san−1 + an

n ∈ N+ .

(1.3.20)

It follows from the properties just described of the process (s` )`∈Z that the sequence (san )n∈N+ is an I-valued Markov chain on (I, BI , γa ) which starts at sa0 = a and has the following transition mechanism: from state

Basic properties

37

s ∈ I the possible transitions are to any state 1/ (s + i) with corresponding transition probability Pi (s) , i ∈ N+ . [Strictly speaking, this only holds for any a ∈ E ⊂ Ω, for some E ∈ BI with λ (E) = 1, as (san )n∈N under γa is a version of (sn )n∈N under γ( · |s0 = a), a ∈ E. The validity of the above assertion for the remaining a ∈ I \ E follows by continuity on account of (1.3.19).] Proposition 1.3.8 (Generalized Brod´en–Borel–L´evy formula) For any a ∈ I and n ∈ N+ we have γa (τ n < x | a1 , · · · , an ) =

(san + 1) x , san x + 1

x ∈ I.

(1.3.21)

Proof. For any n ∈ N+ and x ∈ I consider the conditional probability ¡ ¢ γ τ −n ([0, x] × I)| an , · · · , a1 , a0 , a−1 , · · · . (1.3.22) Put a = [a0 , a−1 , · · · ] − actually, a (ω, θ) = θ, (ω, θ) ∈ Ω2 − and note that [an , · · · , a1 , a0 , a−1 , · · · ] = san . On the one hand, it follows from Theorems 1.3.1 and 1.3.5 (see also Remark 1 after Corollary 1.3.6) that the conditional probability (1.3.22) is γ-a.s. equal to (san + 1) x . san x + 1 On the other hand, putting γ¯a ( · ) = γ ( · | a0 , a−1 , · · · ) , it is clear that (1.3.22) is γ-a.s. equal to γ¯a (τ −n ([0, x] × I) ∩ (I (a1 , · · · , an ) × I)) . γ¯a (I(a1 , · · · , an ) × I)

(1.3.23)

Since τ −n ([0, x] × I) = τ −n ([0, x]) × I and γ¯a (A × I) = γa (A) , A ∈ BI , the fraction in (1.3.23) is equal to ¡ ¢ γa τ −n ([0, x]) |I(a1 , · · · , an ) = γa (τ n < x | a1 , · · · , an ) . Therefore (1.3.21) holds for any a ∈ E ⊂ Ω, for some E ∈ BI with λ (E) = 1, hence by continuity [use (1.3.19)] for the remaining a ∈ I\E. 2

38

Chapter 1

Remark. Equation (1.3.21) can be also proved by direct computation (cf. the proof of Corollary 1.2.6). 2 Corollary 1.3.9 For any a ∈ I and n ∈ N+ we have γa (A | a1 , · · · , an ) = γsan (τ n (A))

(1.3.24)

whatever the set A belonging to the σ-algebra generated by the random variables an+1 , an+2 , · · · , that is, τ −n (BI ). We now give a generalization of Proposition 1.2.9, where Lebesgue measure λ(= γ0 ) is replaced by γa , a ∈ I. Define first the random variables uan as uan = san−1 + rn ,

n ∈ N+ , a ∈ I.

Proposition 1.3.10 For any a ∈ I, n ∈ N+ , and x ≥ 1 we have γa (r1 < x) = 1 −

γa (ua1

   < x) =

0

  1− a+1 x

a+1 , x+a if

x ≤ a + 1,

if

x > a + 1,

γa (rn+1 < x | a1 , . . . , an ) = 1 −

γa (uan+1 < x | a1 , . . . , an ) =

  

0

a   1 − sn + 1 x

san + 1 , x + san if

x ≤ san + 1,

if

x > san + 1.

The proof is entirely similar to that of Proposition 1.2.9.

2

Corollary 1.3.11 For any a ∈ I and n ∈ N+ let Gan (s) = γa (san < s), s ∈ R, Ga0 (s) = 0 or 1 according as s ≤ a or s > a. For any a ∈ I, n ∈

Basic properties

39

N+ , and x ≥ 1 we have Z γa (rn < x)

1

= 0

x−1 a dG (s) s + x n−1 µ

= (x − 1)

1 + x+1

Z

1 Ga n−1 (s)

ds

¶ ,

(s + x)2

0

 Z x−1 µ ¶ s+1   dGan−1 (s) if 1−    0 x

γa (uan < x) =

Z     



s+1 1− x

0

1 x

=

Z

x−1

0

1 ≤ x ≤ 2,

¶ dGan−1 (s)

if

x>2

Gan−1 (s) ds.

Equations similar to (1.2.19) and (1.2.20) hold, too.

1.3.5

Paul L´ evy’s solution to Gauss’ problem

We now present the elegant solution given by L´evy (1929) to Gauss’ problem. Actually, as L´evy has done in the case a = 0, we shall obtain estimates for both ‘errors’ Fna − G and Gan − G, a ∈ I, n ∈ N, where Fna (x) = γa (τ n < x),

Gan (s) = γa (san < s),

x ∈ I,

s ∈ R,

and G(s) = 0, γ([0, s]), or 1 according as s < 0, s ∈ I, or s > 1. It follows from Corollary 1.3.11 that Z Fna (x)

= 0

1

x(s + 1) a dGn (s) xs + 1

(1.3.25)

for any a, x ∈ I and n ∈ N. It is easy to check that Z

1

G (x) = 0

and

µ Gan+1

1 m



µ =

Fna

x(s + 1) dG(s), xs + 1

1 m

x ∈ I,

(1.3.26)

¶ , m, n ∈ N+ ,

a ∈ I.

(1.3.27)

40

Chapter 1

The last equation is still valid for n = 0 and a 6= 0 while µ ¶ µ ¶ 1 1 1 0 0 G1 = F0 = , m ∈ N+ . m m+1 m+1

(1.3.270 )

Since (san )n∈N is a Markov chain on (I, BI , γa )—see the preceding subsection— for any m, n ∈ N+ , a ∈ I, and θ ∈ [0, 1) we have µ ¶ µ ¶ 1 1 a a Gn+1 − Gn+1 m m+θ µ ¶ 1 1 a = γa ≤ sn+1 < m+θ m µ µ = E γa Z

θ

= 0

while Ga1

µ

1 m



µ −

Ga1

¯ ¶¶ 1 1 ¯¯ a a ≤ sn+1 < ¯ sn m+θ m

(1.3.28)

Pm (s) dGan (s)

1 m+θ



µ = γa Z

θ

= 0+

1 1 1 ≤ < m+θ a1 + a m



Pm (s) dGa0 (s),

that is, (1.3.28) also holds for n = 0 if a 6= 0. It is easy to check that µ ¶ µ ¶ Z θ 1 1 Pm (s)dG(s) = G −G m m+θ 0 for any m ∈ N+ and θ ∈ [0, 1). Now, by (1.3.25) and (1.3.26) we have Z 1 x(s + 1) a d(Gan (s) − G(s)) Fn (x) − G(x) = xs + 1 0 µ ¶ Z 1 ∂ x(s + 1) a ds = − (Gn (s) − G(s)) ∂s xs + 1 0 for any a, x ∈ I and n ∈ N. Setting αna = sup |Gan (s) − G(s)| , s∈I

(1.3.280 )

a ∈ I, n ∈ N,

(1.3.29)

Basic properties

41

we obtain Z |Fna (x) hence

− G(x)| ≤

αna

0

1

x(1 − x) a x(1 − x) ds = α , 2 (xs + 1) x+1 n

√ |Fna (x) − G(x)| ≤ (3 − 2 2)αna

(1.3.30)

for any a, x ∈ I and n ∈ N. Let us note that α0a = max (G(a), 1 − G(a)),

a ∈ I.

Theorem 1.3.12 For any n ∈ N+ and a ∈ I we have √ √ 1 sup |Fna (x) − G(x)| ≤ (3 − 2 2)(3.5 − 2 2)n−1 , 2 x∈I √ 1 sup |Gan (x) − G(x)| ≤ (3.5 − 2 2)n−1 . 2 x∈I Proof. By (1.3.27) through (1.3.30), for any m, n ∈ N+ , a ∈ I, and θ ∈ [0, 1)—also for n = 0 and any m ∈ N+ , a ∈ (0, 1], and θ ∈ [0, 1)—we have ¯ ¶ µ ¶¯ µ ¯ ¯ a 1 1 ¯ ¯Gn+1 −G ¯ m+θ m+θ ¯ ¯ µ ¶¯ µ ¶ ¯ 1 ¯¯ 1 −G ≤ ¯¯Gan+1 m m ¯ ¯ µ ¶ µ ¶ µ ¶¯ µ ¶ ¯ ¯ a 1 1 1 1 a ¯ ¯ − Gn+1 −G +G + ¯Gn+1 m m+θ m m+θ ¯ ¯ µ ¶ ¯ µ ¶¯ ¯Z θ ¯ a 1 ¯ 1 ¯¯ ¯¯ a ¯ = ¯Fn −G +¯ Pm (s) d (Gn (s) − G(s))¯¯ ¯ m m 0 ³ √ ´ a ≤ 3 − 2 2 αn ¯Z θ ¯ ¯ ¯ a a ¯ +¯ (G(s) − Gn (s)) dPm (s) + Pm (θ)(Gn (θ) − G(θ))¯¯ 0 √ ≤ (3 − 2 2 + β(m, θ))αna , where

Z β(m, θ) = 0

θ

¯ ¯ ¯ dPm (s) ¯ ¯ ¯ ¯ ds ¯ ds + Pm (θ).

42

Chapter 1

It is easy to check that β(m, θ) ≤ 1/2 for Actually,  1/2          4/(3 + θ) − 2/(2 + θ) − 1/6 β(m, θ) = √    6 − 4 2 − 1/6       2Pm (θ) − 1/m(m + 1) Hence a αn+1

=

sup m∈N+ , θ∈[0,1)

any m ∈ N+ and θ ∈ [0, 1). if m = 1, if m = 2 and θ ≤

√ 2 − 1,

if m = 2 and θ ≥

√ 2 − 1,

if m ≥ 3.

¯ µ ¶ µ ¶¯ ¯ a ¯ 1 1 ¯G ¯ − G ¯ n+1 m + θ m+θ ¯

(1.3.31)

√ ≤ (3.5 − 2 2)αna for any a ∈ I and n ∈ N+ . Finally, by (1.3.27), (1.3.270 ), and (1.3.280 ), µ ¶ µ ¶ 1 1 1 G01 = G01 = m+θ m m+1 and µ Ga1

1 m+θ



µ

=

Ga1      

=

=

¶ Z θ 1 − Pm (s)dGa0 (s) m 0 µ ¶ 1 F0a − Pm (a) if 0 ≤ θ ≤ a, m

µ ¶   1  a   F0 m  a+1     a+m+1   a+1   a+m

if θ > a if 0 ≤ θ ≤ a, if θ > a

for any a ∈ (0, 1], θ ∈ [0, 1), and m ∈ N+ . It is easy to see that ¯ µ ¶ µ ¶¯ ¯ 1 ¯ a 1 1 a ¯ ≤ , a ∈ I. ¯G1 − G α1 = sup ¯ m+θ m+θ ¯ 2 m∈N+ , θ∈[0,1)

(1.3.32)

Basic properties

43

It follows from (1.3.31) and (1.3.32) that √ 1 αna ≤ (3.5 − 2 2)n−1 , 2

n ∈ N+ , a ∈ I.

By (1.3.30) the proof is complete.

2

Theorem 1.3.12 shows that both Fna and Gan converge very fast to Gauss’ distribution function G. Actually, the convergence is even considerably faster. See Corollary 2.3.6 and Theorem 2.5.5.

1.3.6

Mixing properties

We conclude this section by studying the ψ-mixing coefficients of (an )n∈N+ under either γa , a ∈ I, or γ. Theorem 1.3.12 plays here an important part. For any k ∈ N+ let B1k = σ (a1 , · · · , ak ) and Bk∞ = σ (ak , ak+1 , · · · ) denote the σ-algebras generated by the random variables a1 , · · · , ak , respectively, ak , ak+1 , · · · . Clearly, B1k is the σ-algebra generated by the closures of the fundamental intervals of rank k while Bk∞ = τ −k+1 (BI ), k ∈ N+ . For any µ ∈ pr (BI ) consider the ψ-mixing coefficients (cf. Section A3.1) ¯ ¯ ¯ ¯ µ (A ∩ B) ¯ − 1¯¯ , n ∈ N+ , ψµ (n) = sup ¯ µ (A) µ (B) ∞ where the supremum is taken over all A ∈ B1k and B ∈ Bk+n such that µ (A) µ (B) 6= 0, and k ∈ N+ . Define ¯ ¯ ¯ ¯ γa (B) ¯ − 1¯¯ , n ∈ N+ , εn = sup ¯ γ (B)

where the supremum is taken over all a ∈ I and B ∈ Bn∞ with γ (B) > 0. ∞ ⊂ B ∞ for any Note that the sequence (εn )n∈N+ is non-increasing since Bn+1 n a , a ∈ I, n ∈ N+ . We shall show that εn can be expressed in terms of Fn−1 and G, namely, εn = ε0n with ¯ ¯ a ¯ ¯ dFn−1 (x) /dx 0 − 1¯¯ , n ∈ N+ , εn = sup ¯¯ g (x) a,x∈I where g (x) = G 0 (x) = (log 2)−1 / (x + 1) , x ∈ I. Indeed, by the very definition of ε0n , for any a, x ∈ I we have ¯ a ¯ ¯ dFn−1 (x) ¯ 0 ¯ εn g (x) ≥ ¯ − g (x)¯¯ . dx

44

Chapter 1

By integrating the above inequality over B ∈ Bn∞ we obtain γ

(B) ε0n

¯ Z ¯ a ¯ dFn−1 (x) ¯ ¯ ¯ dx ≥ − g (x) ¯ ¯ dx B ¯Z ¯ Z ¯ ¯ a ¯ ≥ ¯ dFn−1 (x) − g (x) dx¯¯ = |γa (B) − γ (B)| B

B

for any B ∈ Bn∞ , n ∈ N+ , and a ∈ I. Hence ε0n ≥ εn , n ∈ N+ . On the other + hand, for any arbitrarily given n ∈ N+ let Bx,h = (x ≤ τ n−1 < x + h) ∈ Bn∞ , − with x ∈ [0, 1), h > 0, x + h ∈ I, and Bx,h = (x − h ≤ τ n−1 < x) ∈ Bn∞ , with x ∈ (0, 1], h > 0, x − h ∈ I. Clearly, ¯ ¯ ¯! ï ¯ γa (B + ) ¯ ¯ γa (B − ) ¯ ¯ ¯ ¯ ¯ x,h x,h εn ≥ max ¯ − 1 , − 1 ¯ ¯ ¯ + − ¯ γ(Bx,h ¯ ¯ γ(Bx,h ¯ ) ) for any a ∈ I and suitable x ∈ I and h > 0. Letting h → 0 we get εn ≥ ε0n , n ∈ N+ . Therefore εn = ε0n , n ∈ N+ . ¡ ¢ It is easy to compute ε01 = ε1 and ε02 = ε2 . Since F0a (x) = γa τ 0 < x = γa ([0, x]) , a, x ∈ I, we have ¯ a ¯ ¯ ¯ ¯ dF0 (x) /dx ¯ ¯ (a + 1) (x + 1) ¯ ε1 = sup ¯¯ − 1¯¯ = sup ¯¯ log 2 − 1¯¯ . 2 g (x) (ax + 1) a,x∈I a,x∈I As 1≤

(a + 1) (x + 1) ≤ 2, (ax + 1)2

a, x ∈ I,

it follows that ε1 = 2 log 2 − 1 = 0.38629 · · · . Next, as γa (sa1 = 1/(a + i)) = Pi (a), a ∈ I, i ∈ N+ , by Proposition 1.3.8 we have F1a (x) =

X (a + i + 1)x a+1 x + a + i (a + i)(a + i + 1)

i∈N+

=

X i∈N+

(a + 1)x , (x + a + i)(a + i)

a, x ∈ I.

Basic properties Then ε2

45

¯ a ¯ ¯ dF1 (x)/dx ¯ ¯ = sup ¯ − 1¯¯ g(x) a,x∈I ¯ ¯ ¯ ¯ X ¯ ¯ 1 − 1¯¯ . = sup ¯¯(log 2)(a + 1)(x + 1) 2 (x + a + i) a,x∈I ¯ ¯ i∈N+

It is not difficult to check that 2(ζ(2) − 1) ≤ (a + 1)(x + 1)

X i∈N+

Hence

1 ≤ ζ(2), (x + a + i)2

a, x ∈ I.

ε2 = max(ζ(2) log 2 − 1, 1 − 2(ζ(2) − 1) log 2) = ζ(2) log 2 − 1 = 0.14018 · · · .

For n ≥ 3 the computation of εn becomes forbidding. Instead, Theorem 1.3.12 can be used to derive good upper bounds for εn whatever n ∈ N+ . Proposition 1.3.13 We have ε1 < log 2 and 1 εn ≤ (log 2)cn−2 , 2

n ≥ 2,

√ where c = 3.5 − 2 2 = 0.67157 · · · .

Proof. It follows from (1.3.25) and (1.3.26) that Z 1 s+1 dFna (x) = dGan (s) 2 dx (xs + 1) 0 and

Z g(x) = 0

1

s+1 dG(s) (xs + 1)2

for any a, x ∈ I and n ∈ N. Using the last two equations, integration by parts yields ¯Z 1 ¯ a ¯ ¯ ¯ ¯ dFn (x) ¯ ¯ s + 1 a ¯ ¯ ¯ d(Gn (s) − G(s))¯¯ ¯ dx − g(x)¯ = ¯ 2 0 (xs + 1) ¯Z 1 µ ¶ ¯ ¯ ¯ ∂ s+1 a ¯ ((Gn (s) − G(s)) = ¯ ds¯¯ 2 ∂s (xs + 1) 0 Z 1 |x(s + 2) − 1| ≤ sup | Gan (s) − G(s)| ds. (xs + 1)3 s∈I 0

46

Chapter 1

But Z

1

|x(s + 2) − 1| ds (xs + 1)3 0  Z 1 1 − x(s + 2)    ds if 0 ≤ x ≤ 13 ,  3  (xs + 1)  0      Z 1  Z (1−2x)/x 1 − x(s + 2) 1 − x(s + 2) = ds − ds if 13 ≤ x ≤ 12 , 3 3  (xs + 1) (xs + 1) 0 (1−2x)/x       Z 1   x(s + 2) − 1   ds if 12 ≤ x ≤ 1  (xs + 1)3 0  2(x + 1)−2 − 1 if 0 ≤ x ≤ 13 ,      −2(x + 1)−2 − 1 + (2x(1 − x))−1 if 13 ≤ x ≤ 12 , =      1 − 2(x + 1)−2 if 12 ≤ x ≤ 1

and

Z (x + 1) 0

1

|x(s + 2) − 1| ds = (xs + 1)3

 2(x + 1)−1 − (x + 1) if 0 ≤ x ≤ 13      −2(x + 1)−1 − (x + 1) + (x + 1)(2x(1 − x))−1 if 31 ≤ x ≤ 12 =      x + 1 − 2(x + 1)−1 if 21 ≤ x ≤ 1 ≤ 1. Therefore ¯ ¯ a ¯ dFn (x)/dx ¯ ¯ sup ¯ − 1¯¯ ≤ (log 2) sup |Gan (s) − G(s)| , g(x) a,x∈I a,s∈I Then ε01 = ε1 ≤ log 2 and, by Theorem 1.3.12, 1 ε0n+1 = εn+1 ≤ (log 2)cn−1 , 2

n ∈ N+ .

n ∈ N.

Basic properties

47 2

Theorem 1.3.14 For any a ∈ I we have ψγa (n) ≤

εn + εn+1 , 1 − εn+1

n ∈ N+ .

(1.3.33)

Also, ψγ (n) = εn ,

n ∈ N+ .

Proof. It follows from (1.3.24) that for any a ∈ I we have ¯ ¡ ¯ ¯ γ B|I(i(k) )¢ ¯ a ¯ ¯ εn = sup ¯ − 1¯ , n ∈ N+ , ¯ ¯ γ(B)

(1.3.34)

(1.3.35)

∞ with γ(B) > 0, i(k) ∈ Nk , where the supremum is taken over all B ∈ Bk+n + and k ∈ N. For arbitrarily given k, `, n ∈ N+ , i(k) ∈ Nk+ , and j (`) ∈ N`+ put A = I(i(k) ), B = ((ak+n , · · · , ak+n+`−1 ) = j (`) ))

and note that γa (A) γa (B) 6= 0 for any a ∈ I. By (1.3.35) we have |γa (B|A) − γ (B)| ≤ εn γ (B)

(1.3.36)

|γa (B) − γ (B)| ≤ εn+k γ (B) .

(1.3.37)

and

It follows from (1.3.36) and (1.3.37) that |γa (B|A) − γa (B)| ≤ (εn + εn+k ) γ (B) , whence |γa (A ∩ B) − γa (A) γa (B)| ≤ (εn + εn+k ) γa (A) γ (B) . Finally, note that (1.3.37) yields γ (B) ≤

γa (B) . 1 − εn+k

Since the sequence (εn )n∈N+ is non-increasing, we have εn + εn+1 εn + εn+k ≤ , 1 − εn+k 1 − εn+1

k, n ∈ N+ ,

48

Chapter 1

which completes the proof of (1.3.33). To prove (1.3.34) we first note that putting A = I(i(k) ) for any given k ∈ N+ and i(k) ∈ Nk+ , by (1.3.35) we have |γa (A ∩ B) − γa (A) γ (B)| ≤ εn γa (A) γ (B) ∞ , and n ∈ N . By integrating the above inequality for any a ∈ I, B ∈ Bk+n + over a ∈ I with respect to γ and taking into account that Z γa (E) γ(da) = γ (E) , E ∈ BI , I

we obtain ψγ (n) ≤ εn , n ∈ N+ . To prove the converse inequality remark that the ψ-mixing coefficients under the extended Gauss measure γ¯ of the doubly infinite sequence (¯ a` )`∈Z of extended incomplete quotients, are equal to the corresponding ψ-mixing coefficients under γ of (an )n∈N+ . This is obvious by the very definitions of (¯ a` )`∈Z and ψ-mixing coefficients. See Subsection 1.3.3 and Section A3.1. As (¯ a` )`∈Z is strictly stationary under γ¯ , we have ¯ ¯ ¯ γ¯ (A ∩ B) ¯ ψγ (n) = ψγ¯ (n) = sup ¯¯ − 1¯¯ , γ¯ (A) γ¯ (B)

n ∈ N+ ,

¯ ∈ where the upper bound is taken over all A¯ = σ(¯ an , a ¯n+1 , · · · ) and B σ(¯ a0 , a ¯−1 , · · · ) for which γ¯ (A) γ¯ (B) 6= 0. Clearly, A = A × I and B = I × B, with A ∈ Bn∞ = τ −n+1 (BI ) and B ∈ BI . Then ¯ ¯ ¯ ¯ γ¯ (A × B) ¯ − 1¯¯ , n ∈ N+ . ψγ (n) = sup (1.3.38) ¯ A ∈ τ −n+1 (BI ), B ∈ BI γ(A) γ(B) γ(A)γ(B) 6= 0

Now, it is easy to check that Z Z γ¯ (A × B) = γ(da)γa (B) = γ(db)γb (A) A

B

for any A, B ∈ BI . It then follows from (1.3.38) and the very definition of εn that ¯ ¯ ¯ γb (A) ¯ ¯ ψγ (n) ≥ sup − 1¯¯ = εn , n ∈ N+ . ¯ b ∈ I, A ∈ τ −n+1 (BI ) γ(A) γ(A) 6= 0

Basic properties

49

This completes the proof of (1.3.34).

2

Corollary 1.3.15 The sequence (an )n∈N+ is ψ-mixing under γ and any γa , a ∈ I. For any a ∈ I we have ψγa (1) ≤ (ε1 + ε2 )/(1 − ε2 ) = 0.61231 · · · and (log 2)cn−2 (1 + c) ψγa (n) ≤ , n ≥ 2. 2 − (log 2)cn−1 Also, ψγ (1) = 2 log 2 − 1 = 0.38629 · · · , ψγ (2) = ζ(2) log 2 − 1 = 0.14018 · · · and 1 ψγ (n) ≤ (log 2)cn−2 , n ≥ 3. 2 The doubly infinite sequence (¯ a` )`∈Z of extended incomplete quotients is ψ-mixing under the extended Gauss measure γ¯ , and its ψ-mixing coefficients are equal to the corresponding ψ-mixing coefficients under γ of (an )n∈N+ . The proof follows from Proposition 1.3.13 and Theorem 1.3.14. As already noted, the last assertion is obvious by the very definitions of (¯ a` )`∈Z and ψ-mixing coefficients. 2 Remark. The above result will be improved in Chapter 2. See Proposition 2.3.7. 2 Proposition 1.3.16 (F. Bernstein’s theorem) Let (cn )n∈N+ be a sequence of positive numbers. The random event (an ≥ cnP ) occurs infinitely often with γ-probability 0 or 1, according as the series n∈N+ 1/cn converges or diverges. In other words, γ(a ≥ c i.o.) is either 0 or 1 according n n P as the series n∈N+ 1/cn converges or diverges. Proof. We can clearly assume that cn ≥ 1, n ∈ N+ . Let En = (an ≥ cn ), n ∈ N+ . By (1.2.9) we have γ(En ) = γ(an ≥ cn ) = γ (a1 ≥ cn ) = γ(a1 ≥ c0n ) =

µ ¶ 1 1 log 1 + 0 , log 2 cn

where either c0n = bcn c + 1 or c0n = bcn c. Hence 1 2 ≤ γ(En ) ≤ , 2cn cn log 2

n ∈ N+ ,

P since x log 2 ≤ log(1 + x) ≤ x for any x ∈ I. Thus if n∈N+ 1/cn converges, then the result stated follows from the Borel–Cantelli lemma.

50

Chapter 1

P Assume now that n∈N+ 1/cn diverges. It follows from Theorem 1.3.14 that for any k, n ∈ N+ such that k ≤ n we have |γ (Ekc ∩ · · · ∩ Enc ∩ En+1 ) − γ (Ekc ∩ · · · ∩ Enc ) γ (En+1 )| ≤ ε1 γ (Ekc ∩ · · · ∩ Enc ) γ (En+1 ) , where ε1 = 2 log 2 − 1 = 0.38629 · · · . Hence γ ( En+1 | Ekc ∩ · · · ∩ Enc ) ≥ (1 − ε1 )γ(En+1 ) ≥ therefore

1 − ε1 , 2cn+1

¡ c ¯ c ¢ ¯ E ∩ · · · ∩ Enc ≤ 1 − 1 − ε1 γ En+1 k 2cn+1

for any k, n ∈ N+ such that k ≤ n. It follows that for any k, m ∈ N+ we have ¶ m µ ¢ Y ¡ 1 − ε1 c ≤ 1− γ Ekc ∩ · · · ∩ Ek+m , 2ck+i i=0

whence

¶ m µ Y ¡ c ¢ 1 − ε1 c 1− =0 γ Ek ∩ Ek+1 ∩ · · · ≤ lim m→∞ 2ck+i i=0

P

since n∈N+ 1/cn diverges. Finally, γ (an ≥ cn i.o.) = γ(∩k∈N+ ∪ i≥k Ei ) =

lim γ(∪ i≥k Ei ) = lim γ((∩i≥k Eic )c )

k→∞

k→∞

¡ ¢ c = 1 − lim γ Ekc ∩ Ek+1 ∩ · · · = 1. k→∞

2 In Chapter 3 we shall need the following result. Corollary 1.3.17 Let bn , n ∈ N+ , be real-valued random variables on (I, BI ) such that an ≤ bn ≤ an + c, n ∈ N+ , for some c ∈ R+ . Let (cn )n∈N+ be a sequence of positive P numbers. Then γ (bn ≥ cn i.o.) is either 0 or 1 according as the series n∈N+ 1/cn converges or diverges.

Basic properties

51

Proof. Clearly, (an ≥ cn i.o.) ⊂ (bn ≥ cn i.o.) ⊂ (an ≥ max(1, cn − c) i.o.), P P and the series n∈N+ 1/cn and n∈N+ 1/ max(1, cn −c) are both convergent or divergent. 2

52

Chapter 1

Chapter 2

Solving Gauss' problem

In this chapter a generalization of Gauss' problem stated in Subsection 1.2.1 is solved. Several applications are also given.
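As a purely numerical preview of what solving Gauss' problem amounts to (a sketch of ours, not part of the book's apparatus): if µ_n denotes the distribution of τⁿ under Lebesgue measure, then µ_n([0, x]) approaches the Gauss value log(1 + x)/log 2 very quickly. The Monte Carlo check below illustrates this; the operator-theoretic explanation occupies the rest of the chapter.

```python
import math, random

def tau(x):
    return 1 / x - math.floor(1 / x)

rng = random.Random(0)
N, n = 200000, 5                     # sample size and number of iterations of tau
sample = [rng.random() for _ in range(N)]
for _ in range(n):
    sample = [tau(x) for x in sample if x > 0]

for x in (0.25, 0.5, 0.75):
    emp = sum(s < x for s in sample) / len(sample)
    print(x, emp, math.log(1 + x) / math.log(2))   # empirical law of tau^n vs Gauss measure
```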

2.0 Banach space preliminaries

2.0.1 A few classical Banach spaces

In this subsection we describe some Banach spaces which are often mentioned throughout the book. We consider just functions defined on I, but almost all considerations below can be easily extended to more general cases.
We denote by B(I) the collection of all bounded measurable functions f : I → C. This is a commutative Banach algebra with unit under the supremum norm

    || f || = sup_{x∈I} |f(x)|,   f ∈ B(I).

We denote by C (I) the collection of all continuous functions f : I → C . This is a commutative Banach algebra with unit under the supremum norm. We denote by C 1 (I) the collection of all functions f : I → C which have a continuous derivative. This is a commutative Banach algebra with unit under the norm || f || 1 = || f || + || f 0 || , f ∈ C 1 (I) . We denote by L (I) the collection of all Lipschitz functions f : I → C, that is, those for which s (f ) := sup

x0 6=x00

|f (x0 ) − f (x00 ) | < ∞· |x0 − x00 | 53

54

Chapter 2

This is a commutative Banach algebra with unit under the norm || f || L = || f || + s (f ) ,

f ∈ L (I) .

Clearly, C 1 (I) ⊂ L (I) ⊂ C (I) ⊂ B (I) . The variation varA f over A ⊂ I of a function f : I → C is defined as sup

k−1 X

|f (ti ) − f (ti−1 )| ,

i=1

the supremum being taken over t1 < · · · < tk , ti ∈ A, 1 ≤ i ≤ k, and k ≥ 2. We write simply var f for varI f . If var f < ∞ then f is called a function of bounded variation. The collection BV (I) of all functions f : I → C of bounded variation is a commutative Banach algebra with unit under the norm || f || v = || f || + var f, f ∈ BV (I) . Clearly, L (I) ⊂ BV (I) ⊂ B (I) . Let µ be a measure on BI . Two measurable functions f : I → C and g : I → C are said to be µ-indistinguishable, or to be µ-versions of each other, if and only if µ (f 6= g) = 0. Let us partition the collection of all measurable complex-valued functions defined on I into (equivalence) classes of µ-indistinguishable functions. For any real number p ≥ 1 we denote by Lp (I, BI , µ) = Lpµ the collection of all such classes of µ-indistinguishable R 0 functions f : I → C for which I |f |p dµ < ∞. Clearly, Lpµ ⊂ Lpµ if p ≥ p0 ≥ 1. Next, Lpµ is a Banach space under the norm µZ ||f ||p,µ =

¶1/p |f | dµ , p

I

f ∈ Lpµ .

(Note that the value of the integral is the same for all functions in an equivalence class.) To define L∞ µ we should first define the µ-essential supremum. For a measurable function f : I → R, its µ-essential supremum, which is denoted µ-ess sup f , is defined as inf {a ∈ R : µ (f > a) = 0} .

Solving Gauss’ problem

55

A measurable function f : I → C is said to be µ-essentially bounded if and only if µ-ess sup|f | < ∞. Note that

µ-ess sup|f | = inf || fe || ,

where the lower bound is taken over all µ-versions fe or f . We denote by L∞ (I, BI , µ) = L∞ µ the collection of all classes of µ-essentially bounded complex-valued µ-indistinguishable functions defined on I ; L∞ µ is a commutative Banach algebra with unit under the norm ||f ||∞,µ = µ-ess sup |f |,

f ∈ L∞ µ .

(Note that the value of the essential supremum is the same for all functions p in an equivalence class.) Clearly, L∞ µ ⊂ Lµ for any p ≥ 1. The special case p = 2 is an important one: L2µ can be also considered as a Hilbert space with inner product (·, ·)µ defined by Z (f, g)µ = f g ∗ dµ, f, g ∈ L2µ . I

In the case where µ = λ we simply write Lp , ||f ||p , L∞ , ||f ||∞ , and ess sup f instead of Lpλ , ||f ||p,λ , L∞ λ , ||f ||∞,λ , and λ-ess sup f , respectively.

2.0.2 Bounded essential variation e the infimum is defined as v (f ) = inf var f, e f of f . If v (f ) < ∞ then f ∈ L∞ is called variation. It can be shown that Z 1 1 v (f ) = lim |f (u + a) − f (u) |du, 0
A variation v (f ) for f ∈ L∞ being taken over all λ-versions a function of bounded essential

where for x rel="nofollow"> 1 we define f (x) = f (1). Clearly, if f ∈ BV (I) then, in general, v (f ) ≤ var f . This is a special instance of the following more general result due to Stadje (1985). If v (f ) < ∞ then the limit 1 fe(t) = lim 0
Z

t+a

f (u) du t

exists for any t ∈ I, the function fe is a right-continuous λ-version of f , and var fe = v (f ). The collection BEV (I) of all functions f ∈ L∞ of bounded

56

Chapter 2

essential variation is a commutative Banach algebra with unit under any of the norms ||f ||v,µ = v (f ) + ||f ||1,µ , f ∈ BEV (I) , with µ ∈ pr (BI ) such that µ ≡ λ. See R˘aut¸u and Zb˘aganu (1989). In the case where µ = λ we simply write ||f ||v instead of ||f ||v,λ . Proposition 2.0.1 (i) Let µ ∈ pr (BI ). If f ∈ BV (I) then ¯Z ¯ ¯ ¯ ¯ || f || ≤ var f + ¯ f dµ¯¯ .

(2.0.1)

(ii) Let µ ∈ pr (BI ) with µ ≡ λ. If f ∈ BEV (I) then ¯Z ¯ ¯ ¯ ¯ µ-ess sup |f | ≤ v (f ) + ¯ f dµ¯¯ .

(2.0.2)

I

I

Proof. (i) For any x ∈ I we can write ¯Z ¯ ¯ ¯ ¯Z ¯ Z ¯ ¯ ¯ ¯ ¯ ¯ ¯ ¯ ¯ ¯ ¯ |f (x)| − ¯ f dµ¯ ≤ ¯f (x) − f dµ¯ = ¯ (f (x) − f (u)) µ (du)¯¯ ≤ var f, I

I

I

from which (2.0.1) follows at once. (ii) (2.0.2) follows from (2.0.1) since µ-ess sup |f | = inf || fe|| , v (f ) = inf var fe, fe

fe

the infimum being taken over all µ-versions fe of f , and Z Z e f dµ = f dµ I

I

for such an fe.

2.1 2.1.1

2

The Perron–Frobenius operator Definition and basic properties

Let µ ∈ pr (BI ) such that ¡ ¢ µ τ −1 (A) = 0 whenever µ (A) = 0, A ∈ BI ,

(2.1.1)

where τ is the continued fraction transformation defined in Subsection 1.1.1.

Solving Gauss’ problem

57

In particular, this condition ¡ −1 ¢is satisfied if τ is µ-preserving, that is, = µ, to mean µ τ (A) = µ (A) for any A ∈ BI . In general, assuming that µ ¿ λ and putting h = dµ/dλ, it is easy to check that (2.1.1) holds if and only if λ (E) = 0, where E = (x ∈ I : h (x) = 0). The Perron–Frobenius operator Pµ of τ under µ is defined as the bounded linear operator on L1µ which takes f ∈ L1µ into Pµ f ∈ L1µ with µτ −1

Z A

or, equivalently,

Z Pµ f dµ =

f dµ , τ −1 (A)

Z I

A ∈ BI ,

Z gPµ f dµ =

(g ◦ τ ) f dµ

(2.1.2)

I

for any f ∈ L1µ and g ∈ L∞ µ . The existence of Pµ f is ensured by the Radon– Nikodym theorem on account of (2.1.1). Actually, Pµ so defined takes Lpµ into itself for any p ≥ 1 and p = ∞. So, (2.1.2) holds for any f ∈ Lpµ and g ∈ Lqµ , with p rel="nofollow"> 1 and q = p/ (p − 1). In particular, (2.1.2) holds for any f, g ∈ L2µ . The probabilistic interpretation of Pµ is immediate : if an R I-valued random variable ξ Ron I has µ-density h, that is, µ (ξ ∈ A) = A hdµ, A ∈ BI , with h ≥ 0 and I hdµ = 1, then τ ◦ ξ has µ-density Pµ h. In the special case µ = λ we obviously have Z d f dλ a.e. in I. Pλ f (x) = dx τ −1 ([0,x])

Proposition 2.1.1 The following properties hold : (i) Pµ is positive, that is, Pµ f ≥ 0 if f ≥ 0; (ii) Pµ preserves integrals, that is, Z Z Pµ f dµ = f dµ, f ∈ L1µ ; I

I

(iii) kPµ kp,µ := sup (||Pµ f ||p,µ : f ∈ Lpµ , ||f ||p,µ = 1) ≤ 1 for any p ≥ 1 and p = ∞; (iv) for any n ∈ N+ the nth power Pµn of Pµ is the Perron–Frobenius operator of the nth iterate τ n of τ under µ ; (v) (Pµ f )∗ = Pµ f ∗ for any f ∈ L1µ ; (vi) Pµ ((g ◦ τ ) f ) = gPµ f for any f ∈ L1µ and g ∈ L∞ µ and for any f ∈ p Lµ and g ∈ Lqµ with p > 1 and q = p/ (p − 1);

58

Chapter 2

(vii) PRµ f = f if and only if τ is ν-preserving, where ν is defined by ν (A) = A f dµ, A ∈ BI . In particular, Pµ 1 = 1 if and only if τ is µpreserving. For the proof see Boyarski and G´ora (1997, Ch. 4), Lasota and Mackey (1985, Ch. 3) or Mackey (1992, Ch. 4). 2 Remark. The above considerations on the Perron–Frobenius operator of the continued fraction transformation τ under different probability measures on BI apply mutatis mutandis to the general case of a transformation of an arbitrary probability space. For example, in the case of the natural extension τ of ¡ τ2 ¢(see Subsection 1.3.1) we should start by considering measures µ ∈ pr BI such that ¡ ¢ µ τ −1 (B) = 0 whenever µ (B) = 0, B ∈ BI2 . (2.1.10 ) Assuming that µ ¿ λ2 (two-dimensional Lebesgue measure) and¡ putting ¢ h = dµ/dλ2¡, it is easy to check that ¢(2.1.10 ) holds if and only if λ2 E = 0, where E = (x, y) ∈ I 2 : h (x, y) = 0 . The Perron–Frobenius operator P µ of linear ¡ ¢ ¡ τ¢ under µ is the ¡bounded ¢ operator on L1µ I 2 which takes f ∈ L1µ I 2 into P µ f ∈ L1µ I 2 with Z Z P µ f dµ = f dµ, B ∈ BI2 . B

τ¯−1 (B)

It is also quite easy to check that if µ ≤ λ2 and h = dµ/dλ2 > 0 a.e. in I 2 , then ¡ ¢ ¡ ¢ h ◦ τ −1 (x, y) f ◦ τ −1 (x, y) P µ f (x, y) = y 2 (x + b1/yc)2 h (x, y) a.e. in I 2 . Alternatively, ¢ ¶ ¡ µ x ¢ s1 (y) 2 h ◦ τ −1 (x, y) ¡ P µ f (x, y) = f ◦ τ −1 (x, y) 0 τ (y) h (x, y) a.e. in I 2 . In particular, for µ = γ when h (x, y) =

1 1 , log 2 (xy + 1)2

x, y ∈ I 2 ,

we have P γ f = f ◦ τ −1 a.e. in I 2 . Hence ¢ ¶ ¡ µ x ¢ s1 (y) · · · sxn (y) 2 h ◦ τ −n (x, y) ¡ n P µ f (x, y) = f ◦ τ −n (x, y) , 0 n−1 τ (y) · · · τ (y) h (x, y)

Solving Gauss’ problem

59 n

P γ f = f ◦ τ −n a.e. in I 2 for any n ∈ N+ . We should, however, note that the Perron–Frobenius operator of an invertible transformation, like τ¯, is not of great value for deriving asymptotic properties of its nth power as n → ∞. For an interesting discussion of the Perron–Frobenius operator of τ¯ in connection with the time evolution of certain spatially homogeneous cosmologies (‘mixmaster universe’), we refer the reader to Mayer (1987). 2 Proposition 2.1.2 The Perron–Frobenius operator Pγ := U of τ under γ is given a.e. in I by the equation µ ¶ X 1 U f (x) = Pi (x) f , f ∈ L1γ . (2.1.3) x+i i∈N+

Proof. Let τi : Ii → I denote the restriction of τ to the interval Ii = (1/ (i + 1) , 1/i], i ∈ N+ , that is, τi (u) =

1 − i, u

u ∈ Ii .

For any f ∈ L1γ and any A ∈ BI we have Z X Z X Z f dγ = f dγ = τ −1 (A)

i∈N+

τ −1 (A∩Ii )

i∈N+

τi−1 (A)

f dγ.

(2.1.4)

For any i ∈ N+ , by the change of variable x = τi−1 (y) = (y + i)−1 we successively obtain Z Z 1 f (x) dx f dγ = log 2 τi−1 (A) x + 1 τi−1 (A) =

=

1 log 2 1 log 2

µ

Z f A

A

A



µ

Z Pi (y) f µ

Z =

1 y+i

Pi (y) f

1 y+i

1 dy −1 (y + i) + 1 (y + i)2

1 y+i



(2.1.5)

dy y+1

¶ γ (dy) .

Now, (2.1.3) follows from (2.1.4) and (2.1.5).

2

60

Chapter 2

Proposition 2.1.3 Let µ ∈ pr (BI ). Assume that µ ¿ λ and h = dµ/dλ > 0 a.e. in I . Then the Perron–Frobenius operator Pµ of τ under µ is given a.e. in I by the equation ³ ´ −1 µ ¶ h (x + i) 1 X 1 Pµ f (x) = f 2 h (x) x+i (x + i) i∈N+ (2.1.6) =

U g (x) , (x + 1) h (x)

f ∈ L1µ ,

where g (x) = (x + 1) h (x) f (x), x ∈ I. The powers of Pµ are given a.e. in I by the equation Pµn f (x) =

U n g (x) , (x + 1) h (x)

f ∈ L1µ , n ∈ N+ .

(2.1.7)

Proof. The proof of (2.1.6) is entirely similar to that of (2.1.3), and is left to the reader. Note that f ∈ L1µ entails g ∈ L1γ . To prove (2.1.7) note that it holds for n = 1. Assuming that (2.1.7) holds for some n ∈ N+ , we have µ ¶ ¡ n ¢ U ng n+1 Pµ f (x) = Pµ Pµ f (x) = Pµ (x) (· + 1) h ³ ´ −1 ¶ µ ¶ ³ µ ´ h (x + i) X 1 1 1 −1 n = / + 1 h (x + i) U g h (x) x+i x+i (x + i)2 i∈N+

=

µ ¶ X 1 U n+1 g (x) 1 n Pi (x) U g = x+i (x + 1) h (x) (x + 1) h (x)

a.e. in I,

i∈N+

and the proof is complete.

2

Corollary 2.1.4 The Perron–Frobenius operator Pλ of τ under λ is given a.e. in I by the equation ¶ µ X 1 1 , f ∈ L1 . Pλ f (x) = 2f x + i (x + i) i∈N+ The powers of Pλ are given a.e. in I by the equation Pλn f (x) =

U n g (x) , x+1

f ∈ L1 , n ∈ N+ ,

Solving Gauss’ problem

61

where g (x) = (x + 1) f (x), x ∈ I. Proposition 2.1.5 Let µ ∈ pr (BI ). Assume that µ ¿ λ and let h = dµ/dλ. Then Z ¡ −n ¢ U n f (x) µ τ (A) = dx (2.1.8) A x+1 for any n ∈ N and A ∈ BI , where f (x) = (x + 1) h(x), x ∈ I. Proof. For n = 0 equation (2.1.8) reduces to Z µ (A) =

h (x) dx, A

A ∈ BI ,

which is obviously true. Assume that (2.1.8) holds for some n ∈ N. Then ³ ´ ¡ ¡ ¢¢ µ τ −(n+1) (A) = µ τ −n τ −1 (A) Z Z U n f (x) = dx = (log 2) U n f dγ. x + 1 −1 −1 τ (A) τ (A) By the very definition of the Perron–Frobenius operator U = Pγ we have Z

Z n

U n+1 f dγ.

U f dγ = τ −1 (A)

A

Therefore Z Z ³ ´ U n+1 f (x) −(n+1) n+1 µ τ (A) = (log 2) U f dγ = dx, x+1 A A and the proof is complete.

2

Remark. It should be noted that (2.1.8) holds without assuming that h > 0 a.e. Since Z ¡ −n ¢ n µ (τ ∈ A) = µ τ (A) = Pµn 1 dµ, n ∈ N, A ∈ BI , A

it is possible to derive (2.1.8) from Proposition 2.1.3 assuming that h > 0 a.e., which clearly restricts the generality of the result. 2

62

Chapter 2

2.1.2

Asymptotic behaviour

It is easy to check that

1 x+1 is an eigenfunction of Pλ corresponding to the eigenvalue 1. Define on L1 the linear operators Π1 and T0 by (log 2)−1 Π1 f (x) = x+1

Z f ∈ L1 , x ∈ I,

f dλ, I

T0 = Pλ − Π1 . Hence Π21 = Π1 ,

Pλ Π1 = Π1 Pλ = Π1 ,

T0 Π1 = Π1 T0 = 0.

(2.1.9)

It follows from the last equation (2.1.9) that Pλn = Π1 + T0n ,

n ∈ N+ .

(2.1.10)

Theorem 2.1.6 The only eigenvalue of modulus 1 of Pλ : L1 → is 1 and this eigenvalue is simple. The operator T0 has the following properties: (i) T0 (BEV (I)) ⊂ BEV (I); (ii) there exists 0 < q < 1 such that ||T0n ||v = O (q n ) as n → ∞ (equivalently, the spectral radius of T0 in BEV (I) under || · ||v is less than 1); (iii) supn∈N+ ||T0n ||1 < ∞ and limn→∞ ||T0n h||1 = 0 for any h ∈ L1 .

L1

Proof. This is a special case of Theorem 5.3.12 in Iosifescu and Grigorescu (1990). 2 The result just stated concerning the asymptotic behaviour of T0n as n → ∞ can be used to derive the asymptotic behaviour of U n as n → ∞. It follows from Corollary 2.1.4 and equation (2.1.10) that µ ¶ g U n g (x) = U ∞ g + (x + 1) T0n (x) (2.1.11) ·+1 a.e. in I for any g ∈ L1γ , where Z ∞

U g=

gdγ. I

Solving Gauss’ problem

63

It is obvious that U ∞ U ∞ = U U ∞ = U ∞ . Using the last equation (2.1.9) it is easy to check that U ∞U = U ∞ . (2.1.12) Now, defining the linear operator T : L1γ → L1γ by µ T g (x) = (x + 1) T0

g ·+1

¶ (x),

g ∈ L1γ ,

a.e. in I, it is easy to check that µ n

T g (x) = (x +

1) T0n

g ·+1

¶ (x),

g ∈ L1γ ,

(2.1.13)

a.e. in I for any n ∈ N+ , and T U ∞ = U ∞ T = 0.

(2.1.14)

It follows from (2.1.11) and (2.1.13) that U n = U ∞ + T n,

n ∈ N+ .

Proposition 2.1.7 The only eigenvalue of modulus 1 of U : L1γ → L1γ is 1 and this eigenvalue is simple. The corresponding eigenspace consists of the a.e. constant functions on I. The linear operator T : L1γ → L1γ has the following properties: (i) T (BEV (I)) ⊂ BEV (I); (ii) there exists 0 < q < 1 such that ||T n ||v,γ = O (q n ) as n → ∞ (equivalently, the spectral radius of T in BEV (I) under || · ||v,γ is less than 1); (iii) supn∈N+ ||T n ||1,γ < ∞ and limn→∞ ||T n h||1,γ = 0 for any h ∈ L1γ . Proof. By (2.1.11) and (2.1.13), all the conclusions are immediate consequences of the corresponding conclusions of Theorem 2.1.6. In checking (ii) we have to use Proposition 2.0.1(ii). 2 Remark. Since λ (A) λ (A) ≤ γ (A) ≤ , 2 log 2 log 2

A ∈ BI ,

the domains of the operators U, U ∞ and T can be as well taken to be L1 and then in (ii) and (iii) the norms || · ||v,γ and || · ||1,γ should be replaced by the norms || · ||v and || · ||1 , respectively. 2

64

Chapter 2 Corollary 2.1.8 For any h ∈ L1 we have Z Z n ∞ lim |U h − U h|dγ = lim |U n h − U ∞ h|dλ = 0. n→∞ I

n→∞ I

Hence, for any h ∈ L1 , Z lim

n→∞ A

U n h dµ = µ (A) U ∞ h

(2.1.15)

uniformly with respect to A ∈ BI , where µ stands for either λ or γ. Proof. For any A ∈ BI we have ¯Z ¯ ¯Z ¯ ¯ ¯ ¯ ¯ n ∞ n ∞ ¯ U h dµ − µ (A) U h¯ = ¯ (U h − U h) dµ¯ ¯ ¯ ¯ ¯ A ZA ≤ |U n h − U ∞ h| dµ ZA ≤ |U n h − U ∞ h| dµ −→ 0 I

as n → ∞, and the proof is complete.

2

Remark. It is not possible to show that U n h → U ∞ h a.e. as n → ∞ by using (2.1.15). It is an open problem whether this is actually true. Cf. Petek (1989) and Iosifescu (1992, p. 912). 2

2.1.3

Restricting the domain of the Perron–Frobenius operator

The asymptotic properties of the Perron–Frobenius operator U : L1γ → L1γ as described by Proposition 2.1.7, are not strong enough for to lead to a satisfactory solution to Gauss’ problem, whilst when restricting U to BEV (I) they are substantially better. See further Proposition 2.1.17. In the next sections the domain of U will be successively restricted to various Banach spaces. In this subsection we show that U , defined by ¶ µ X 1 (2.1.16) U f (x) = Pi (x) f x+i i∈N+

for any x ∈ I, is a bounded linear operator on any of the Banach spaces B (I) , C (I) , BV (I), L (I), and C 1 (I).

Solving Gauss’ problem

65

Proposition 2.1.9 The operator U defined by (2.1.16) is a bounded linear operator of norm 1 on both B (I) and C (I). Proof. It is obvious that if f ∈ B (I) then U f ∈ B (I) and || U f || ≤ || f ||. Next, if f ∈ C (I) then U f ∈ C (I) since the series defining U f is uniformly convergent, it being dominated by a convergent series of positive constants. We also have || U f || ≤ || f || , f ∈ C (I) ⊂ B (I), as a consequence of the validity of this inequality for f ∈ B (I). In both cases || U || = 1 since U preserves the constant functions. 2 A different interpretation is available for the operator U : B (I) → B (I). Proposition 2.1.10 The operator U : B (I) → B (I) is the transition a) operator of both the Markov chain on (I, BI , γa ), for any a ∈ I, and n n∈N ¡ 2 (s ¢ 2 the Markov chain (s` )`∈Z on I , BI , γ . Proof. As noted in Subsection 1.3.4, for any a ∈ I the sequence (san )n∈N is an I-valued Markov chain with the following transition mechanism: from state s ∈ I the possible transitions are to any state 1/ (s + i) with corresponding transition probability Pi (s), i ∈ N+ . Then the transition operator of (san )n∈N takes f ∈ B (I) to the function defined by ¶ µ X ¡ ¡ a ¢ a ¢ 1 = U f (s), s ∈ I, Pi (s) f E f sn+1 |sn = s = s+i i∈N+

that is, it coincides with the operator U whatever a ∈ I. A similar reasoning is valid for the case of the Markov chain (s` )`∈Z , whose transition mechanism is identical with that of (san )n∈N . (See Subsection 1.3.3.) 2 To prove a result similar to Proposition 2.1.9 for the Banach spaces BV (I), L (I), and C 1 (I) we need some preparation. We first prove that the operator U : B (I) → B (I) preserves monotonicity. Proposition 2.1.11 If f ∈ B (I) is non-decreasing (non-increasing), then U f is non-increasing (non-decreasing). Proof. To make a choice assume that f is non-decreasing. Let y > x, x, y ∈ I. We have U f (y) − U f (x) = S1 + S2 , where µ µ ¶ µ ¶¶ X 1 1 S1 = Pi (y) f −f , y+i x+i i∈N+ ¶ µ X 1 . S2 = (Pi (y) − Pi (x)) f x+i i∈N+

66

Chapter 2

Clearly, S1 ≤ 0. We shall prove that S2 ≤ 0, too. Since X Pi (u) = 1, u ∈ I, i∈N+

we can write X µ µ f S2 = − i∈N+

1 x+1



µ −f

1 x+i

¶¶ (Pi (y) − Pi (x)) .

As is easy to see, the function P1 is decreasing while the functions Pi , i ≥ 3, are all increasing. Note also that µ ¶ µ ¶ µ ¶ µ ¶ 1 1 1 1 −f ≥f −f ≥ 0, i ≥ 2. f x+1 x+i x+1 x+2 Therefore S2

Xµ µ = − f

¶ µ ¶¶ 1 1 −f (Pi (y) − Pi (x)) x+1 x+i i≥2 µ µ ¶ µ ¶¶ X 1 1 ≤ − f −f (Pi (y) − Pi (x)) x+1 x+2 i≥2 µ µ ¶ µ ¶¶ 1 1 = f −f (P1 (y) − P1 (x)) ≤ 0, x+1 x+2

as claimed. Thus U f (y) − U f (x) ≤ 0, and the proof is complete.

2

Remark. It is possible to show more generally that if f ∈ L1 is nondecreasing (non-increasing), then U f is non-increasing (non-decreasing). The proof, along the same lines as above, is left to the reader. 2 Proposition 2.1.12 If f ∈ B (I) is monotone, then var U f ≤

1 var f. 2

The constant 1/2 cannot be lowered. Proof. Assume, with no loss of generality, that f is non-decreasing. [Note that if f is non-increasing, then −f is non-decreasing while var U (−f ) = var U f and var (−f ) = var f .] Then by Proposition 2.1.11 we have µ ¶¶ µ ¶ X µ 1 1 − Pi (1) f . var U f = U f (0) − U f (1) = Pi (0) f i i+1 i∈N+

Solving Gauss’ problem

67

Since Pi (1) = 2Pi+1 (0), i ∈ N+ , it follows that X

var U f = P1 (0) f (1) −

µ Pi+1 (0) f

i∈N+

As

X

P1 (0) =

Pi+1 (0) =

i∈N+

and

µ f

1 i+1

1 i+1

¶ .

1 2

¶ ≥ f (0) ,

i ∈ N+ ,

we finally obtain var U f ≤

1 1 (f (1) − f (0)) = var f. 2 2

Since for f defined by f (x) = 0, 0 ≤ x < 1, and f (1) = 1 we have var U f = (var f ) /2, it follows that the constant 1/2 cannot be lowered. 2 Corollary 2.1.13 If f ∈ BV (I) is real-valued, then 1 var U f ≤ var f. 2 The constant 1/2 cannot be lowered. Proof. By Hahn’s decomposition of a signed measure, for any f ∈ BV (I) there exist monotone functions f1 , f2 ∈ B (I) such that f = f1 − f2 and var f = var f1 + var f2 . [To obtain this consider the signed measure µ on BI defined by µ ((a, b]) = f (b) − f (a), a < b, a, b ∈ I.] Then by Proposition 2.1.12 we have var U f

= var (U f1 − U f2 ) ≤ var U f1 + var U f2 1 1 (var f1 + var f2 ) = var f . ≤ 2 2

The optimality of the constant 1/2 follows from Proposition 2.1.12.

2

Proposition 2.1.14 We have s (U f ) ≤ (2ζ (3) − ζ (2)) s (f )

(2.1.17)

for any f ∈ L (I). The constant θ = 2ζ (3) − ζ (2) = 0.7594798 · · · cannot be lowered.

68

Chapter 2 Proof. For x 6= y, x, y ∈ I, we have X Pi (y) − Pi (x) µ 1 ¶ U f (y) − U f (x) = f y−x y−x x+i

(2.1.18)

i∈N+

³ −

X

Pi (y)

f

i∈N+

´

1 y+i 1 y+i

³ −f −

1 x+i

´ 1 . (x + i) (y + i)

1 x+i

Next, remark that Pi (x) =

i i−1 − , x+i+1 x+i

i ∈ N+ ,

and then Pi (y) − Pi (x) i−1 i = − , y−x (x + i) (y + i) (x + i + 1) (y + i + 1)

i ∈ N+ .

Hence X Pi (y) − Pi (x) µ 1 ¶ f y−x x+i

i∈N+

=

X i∈N+

i (x + i + 1) (y + i + 1)

µ µ f

1 x+i+1



µ −f

1 x+i

(2.1.19)

¶¶ .

Assume that x > y. It then follows from (2.1.18) and (2.1.19) that ¯ ¯ ¶ X µ Pi (y) ¯ U f (y) − U f (x) ¯ i ¯ ≤ s (f ) ¯ + . ¯ ¯ y−x (y + i)2 (y + i) (y + i + 1)3 i∈N+

Now, the function g defined by g (y) =

X i∈N+

Pi (y) , (y + i)2

y ∈ I,

is precisely U h for h (y) = y 2 , y ∈ I. Since h is increasing, g is decreasing by Proposition 2.1.11. Therefore for any y ∈ I we have ¶ X µ Pi (y) i + (y + i)2 (y + i) (y + i + 1)3 i∈N+

Solving Gauss’ problem ≤

X µ i∈N+

=

69 1 1 + i3 (i + 1) (i + 1)3



¶ X µ1 1 1 1 1 − + − + i3 i2 i i + 1 (i + 1)3 i∈N+

= ζ (3) − ζ (2) + 1 + ζ (3) − 1 = 2ζ (3) − ζ (2) . As clearly

¯ ¯ ¯ U f (y) − U f (x) ¯ ¯ ¯ = s (U f ) , sup ¯ ¯ y−x x,y∈I, x>y

we obtain (2.1.17). Finally, it is easy to check that for f (x) = x, x ∈ I, we have s (f ) = 1 and s (U f ) = 2ζ (3) − ζ (2). The proof is complete. 2 Proposition 2.1.15 We have || (U f )0 || ≤ (2ζ (3) − ζ (2)) || f 0 ||

(2.1.20)

for any f ∈ C 1 (I). The constant θ = 2ζ (3) − ζ (2) = 0.7594798 · · · cannot be lowered. Proof. Equations (2.1.19) and (2.1.18) show that for f ∈ C 1 (I) the series defining U f can be differentiated term by term since the series of the derivatives is uniformly convergent, it being dominated by a convergent series of positive constants (cf.further Subsection 2.2.1). Then (2.1.20) follows from (2.1.17) since for any f ∈ C 1 (I) we have s (f ) = || f 0 ||. 2 Now, we can state the result announced. Proposition 2.1.16 The operator U defined by (2.1.16) is a bounded linear operator of norm 1 on any of the Banach spaces BV (I), L (I), and C 1 (I). Proof. The result follows from Corollary 2.1.13 and Propositions 2.1.14 and 2.1.15, having in view that U preserves the constant functions. In the case of BV (I) we should note that for a complex-valued f ∈ BV (I) we have max (var Re f, var Im f ) ≤ var f ≤ var Re f + var Im f. Hence by Corollary 2.1.13 we have var U f ≤ var f for such an f.

2

70

Chapter 2

2.1.4

A solution to Gauss’ problem for probability measures with densities

Let µ ∈ pr (BI ) such that µ ¿ λ. By Proposition 2.1.5 for any n ∈ N we have Z ¡ ¢ U n f0 (x) µ τ −n (A) = dx, A ∈ BI , (2.1.21) A x+1 with f0 (x) = (x + 1) F00 (x) , x ∈ I, where F00 = dµ/dλ. We shall consider Gauss’ problem in a more general form, namely, that of the asymptotic behaviour of µ(τ −n (A)) as n → ∞ for any A ∈ BI . Equation (2.1.21) shows that solving this more general Gauss’ problem for a given µ ∈ pr (BI ) amounts to studying the behaviour of the nth power of the Perron–Frobenius operator U on a suitable Banach space. On account of the results obtained in Subsection 2.1.2 we can state the following result. Proposition 2.1.17 Let µ ∈ pr (BI ) such that µ ¿ λ. We have ¯ ¡ ¯ ¢ lim sup ¯µ τ −n (A) − γ (A)¯ = 0. (2.1.22) n→∞ A∈B

I

If F00 = dµ/dλ ∈ BEV (I) then there exists a constant C ∈ R+ such that ¯ ¡ −n ¯ ¢ ¯µ τ (A) − γ (A)¯ ≤ C q n γ (A) (2.1.23) for any n ∈ N+ and A ∈ BI . Here 0 < q < 1 is the constant occurring in Proposition 2.1.7(ii). Proof. We have ¡ ¢ µ τ −n (A) − γ (A) =

Z A

since

Z U ∞ f0 =

I

f0 dγ =

U n f0 (x) − U ∞ f0 dx x+1 1 log 2

Z I

F00 dλ =

(2.1.24)

1 , log 2

and equation (2.1.22) follows by (2.1.15). If F00 ∈ BEV (I) then for some C0 ∈ R+ by Proposition 2.1.7(ii) we have kU n f0 − U ∞ f0 kv ≤ C0 q n ||f0 ||v , n ∈ N+ . It then follows from Proposition 2.0.1(ii) that ess sup |U n f0 − U ∞ f0 | ≤ C0 q n ||f0 ||v , Now, (2.1.23) follows from (2.1.24) and (2.1.25).

n ∈ N+ .

(2.1.25) 2

Solving Gauss’ problem

71

2 √ Remark. As for q, we conjecture that its (optimal) value is g = (3 − 5)/2 = 0.38196 · · · , as in a further related result, namely, Corollary 2.5.7.2

F00

In the next three sections we will take up Gauss’ problem assuming that = dµ/dλ belongs to Banach spaces ‘smaller’ than BEV (I).

2.1.5

Computing variances of certain sums

In this subsection, using properties of the Perron–Frobenius operator $U$ on $BEV(I)$, we give some results concerning the variances of certain sums of random variables constructed starting from either the $\bar a_\ell$, $\ell \in \mathbb{Z}$, or the $a_n$, $n \in \mathbb{N}_+$. These results will be used in Chapter 3.

Let $H$ be a real-valued function on $\mathbb{N}_+^{\mathbb{Z}}$. Set $H_\ell = H_1 \circ \bar\tau^{\ell-1}$, $\ell \in \mathbb{Z}$, where
$$H_1 = H(\cdots, \bar a_{-2}, \bar a_{-1}, \bar a_0, \bar a_1, \bar a_2, \cdots).$$
Clearly, $(H_\ell)_{\ell\in\mathbb{Z}}$ is a strictly stationary process on $(I^2, \mathcal{B}_{I^2}, \bar\gamma)$. Set $S_0 = 0$, $S_n = \sum_{i=1}^n H_i$, $n \in \mathbb{N}_+$. We start with some well known results.

Theorem 2.1.18 If $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and $\lim_{n\to\infty} E_{\bar\gamma} H_1 H_n = 0$, then the finite or infinite limit $\lim_{n\to\infty} E_{\bar\gamma} S_n^2$ exists. We have $\lim_{n\to\infty} E_{\bar\gamma} S_n^2 < \infty$ if and only if there exists $g \in L^2_{\bar\gamma}(I^2)$ such that $H_1 = g \circ \bar\tau - g$ a.e. in $I^2$.

This is a special case of Theorem 18.2.2 in Ibragimov and Linnik (1971).

Proposition 2.1.19 If $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and the series
$$\sigma^2 = E_{\bar\gamma} H_1^2 + 2\sum_{n\in\mathbb{N}_+} E_{\bar\gamma} H_1 H_{n+1} \qquad (2.1.26)$$
converges absolutely, then $\sigma^2 \ge 0$ and
$$E_{\bar\gamma} S_n^2 = n\,(\sigma^2 + o(1)) \qquad (2.1.27)$$
as $n \to \infty$. If the stronger assumption $\sum_{n\in\mathbb{N}_+} n\,|E_{\bar\gamma} H_1 H_{n+1}| < \infty$ holds, then
$$E_{\bar\gamma} S_n^2 = n\,(\sigma^2 + O(n^{-1})) \qquad (2.1.28)$$
as $n \to \infty$.

Proof. By strict stationarity, for any $n > 1$ we have
$$E_{\bar\gamma} S_n^2 = \sum_{i,j=1}^n E_{\bar\gamma} H_i H_j = n E_{\bar\gamma} H_1^2 + 2\sum_{j=1}^{n-1}(n-j)\,E_{\bar\gamma} H_1 H_{j+1}.$$
Therefore
$$\frac1n\bigl|E_{\bar\gamma} S_n^2 - n\sigma^2\bigr| \le 2\left(\frac{\sum_{j=1}^{n-1} j\,|E_{\bar\gamma} H_1 H_{j+1}|}{n} + \sum_{j\ge n}|E_{\bar\gamma} H_1 H_{j+1}|\right),$$
and the right hand side is $o(1)$ as $n \to \infty$ when $\sum_{n\in\mathbb{N}_+}|E_{\bar\gamma} H_1 H_{n+1}| < \infty$ (note that $\sum_{n\in\mathbb{N}_+}|u_n| < \infty$ implies $\lim_{n\to\infty}\sum_{j=1}^n j\,|u_j|/n = 0$), so that (2.1.27) holds. Finally, since
$$\frac{\sum_{j=1}^{n-1} j\,|E_{\bar\gamma} H_1 H_{j+1}|}{n} + \sum_{j\ge n}|E_{\bar\gamma} H_1 H_{j+1}| \le \frac{\sum_{j\in\mathbb{N}_+} j\,|E_{\bar\gamma} H_1 H_{j+1}|}{n},$$
equation (2.1.28) holds, too, under our stronger assumption. $\Box$

Corollary 2.1.20 Assume that $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and
$$\sum_{n\in\mathbb{N}_+} n\,|E_{\bar\gamma} H_1 H_{n+1}| < \infty.$$
Then $\sigma = 0$ if and only if there exists $g \in L^2_{\bar\gamma}(I^2)$ such that $H_1 = g \circ \bar\tau - g$ a.e. in $I^2$.

Proposition 2.1.21 If $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and
$$\sum_{n\in\mathbb{N}_+} E_{\bar\gamma}^{1/2}\bigl[H_1 - E_{\bar\gamma}(H_1\mid \bar a_{-n},\cdots,\bar a_n)\bigr]^2 < \infty, \qquad (2.1.29)$$
then series (2.1.26) converges absolutely.

On account of Corollary 1.3.15, this is a transcription of part of Theorem 18.6.1 in Ibragimov and Linnik (1971) for the special case of the doubly infinite sequence $(\bar a_\ell)_{\ell\in\mathbb{Z}}$. $\Box$

Note that both the conditional mean value occurring in (2.1.29) and $\sigma^2$ can be expressed in terms of the random variable $h$ on $(I^2, \mathcal{B}_{I^2})$ defined on $\Omega^2$ (thus a.e. in $I^2$) by
$$h([i_1, i_2, \cdots],[i_0, i_{-1}, \cdots]) = H(\cdots, i_{-1}, i_0, i_1, \cdots)$$
for any $(i_\ell)_{\ell\in\mathbb{Z}} \in \mathbb{N}_+^{\mathbb{Z}}$. Clearly,
$$E_{\bar\gamma} H_1 = \int_{I^2} h\,d\bar\gamma, \qquad E_{\bar\gamma} H_1^2 = \int_{I^2} h^2\,d\bar\gamma,$$
$$E_{\bar\gamma}(H_1\mid \bar a_{-n},\cdots,\bar a_n)(\omega,\theta) = \frac{1}{\bar\gamma(I^2(i_{-n},\cdots,i_n))}\int_{I^2(i_{-n},\cdots,i_n)} h\,d\bar\gamma$$
for $(\omega,\theta) \in I^2(i_{-n},\cdots,i_n)$, where
$$I^2(i_{-n},\cdots,i_n) = I(i_1,\cdots,i_n)\times I(i_0, i_{-1},\cdots,i_{-n})$$
for any $i_k \in \mathbb{N}_+$, $-n \le k \le n$, $n \in \mathbb{N}_+$, and
$$\sigma^2 = \int_{I^2} h^2\,d\bar\gamma + 2\sum_{n\in\mathbb{N}_+}\int_{I^2} h\,(h\circ\bar\tau^n)\,d\bar\gamma.$$

Condition (2.1.29) is fulfilled for a large class of functions $h$ as shown by the following result.

Proposition 2.1.22 Put
$$c_n = \sup\bigl|h(\omega,\theta) - h(\omega',\theta')\bigr|, \quad n \in \mathbb{N}_+,$$
where the upper bound is taken over all $(\omega,\theta),(\omega',\theta') \in I^2(i_{-n},\cdots,i_n)$ and $i_k \in \mathbb{N}_+$, $-n \le k \le n$. Assume that $E_{\bar\gamma} H_1^2 = \int_{I^2} h^2\,d\bar\gamma < \infty$ and $\sum_{n\in\mathbb{N}_+} c_n < \infty$. Then (2.1.29) holds.

Proof. For any $n \in \mathbb{N}_+$ we have
$$E_{\bar\gamma}\bigl[H_1 - E_{\bar\gamma}(H_1\mid \bar a_{-n},\cdots,\bar a_n)\bigr]^2$$
$$= \sum_{i_{-n},\cdots,i_n\in\mathbb{N}_+}\int_{I^2(i_{-n},\cdots,i_n)}\left(h(\omega,\theta) - \int_{I^2(i_{-n},\cdots,i_n)} h(\omega',\theta')\,\frac{\bar\gamma(d\omega',d\theta')}{\bar\gamma(I^2(i_{-n},\cdots,i_n))}\right)^2\bar\gamma(d\omega,d\theta)$$
$$= \sum_{i_{-n},\cdots,i_n\in\mathbb{N}_+}\int_{I^2(i_{-n},\cdots,i_n)}\left(\int_{I^2(i_{-n},\cdots,i_n)}\bigl(h(\omega',\theta') - h(\omega,\theta)\bigr)\bar\gamma(d\omega',d\theta')\right)^2\frac{\bar\gamma(d\omega,d\theta)}{\bar\gamma^2(I^2(i_{-n},\cdots,i_n))}$$
$$\le \sum_{i_{-n},\cdots,i_n\in\mathbb{N}_+}\bar\gamma\bigl(I^2(i_{-n},\cdots,i_n)\bigr)\,c_n^2 = c_n^2.$$
Hence the series occurring in (2.1.29) is dominated by the convergent series $\sum_{n\in\mathbb{N}_+} c_n$, which completes the proof. $\Box$

Remark. If for some positive constants $c$ and $\varepsilon$ we have
$$\bigl|h(\omega,\theta) - h(\omega',\theta')\bigr| \le c\left(\left|\frac1\omega - \frac1{\omega'}\right| + \left|\frac1\theta - \frac1{\theta'}\right|\right)^{\varepsilon} \qquad (2.1.30)$$
for any $(\omega,\theta),(\omega',\theta') \in \Omega^2$, then the assumption of Proposition 2.1.22 holds. Indeed, for $(\omega,\theta),(\omega',\theta') \in I^2(i_{-n},\cdots,i_n)$ we have
$$\left|\frac1\theta - \frac1{\theta'}\right| \le \sup_{i_{-1},\cdots,i_{-n}\in\mathbb{N}_+}\lambda\bigl(I(i_{-1},\cdots,i_{-n})\bigr) = (F_nF_{n+1})^{-1}$$
and similarly
$$\left|\frac1\omega - \frac1{\omega'}\right| \le (F_{n-1}F_n)^{-1}.$$
Hence
$$\bigl|h(\omega,\theta) - h(\omega',\theta')\bigr| \le c\,2^{\varepsilon}(F_{n-1}F_n)^{-\varepsilon}, \quad n \in \mathbb{N}_+,$$
for any $(\omega,\theta),(\omega',\theta') \in I^2(i_{-n},\cdots,i_n)$, $i_k \in \mathbb{N}_+$, $-n \le k \le n$, and clearly the series $\sum_{n\in\mathbb{N}_+}(F_{n-1}F_n)^{-\varepsilon}$ is convergent. In particular, (2.1.30) holds if $h$ satisfies a Hölder condition of order $\varepsilon > 0$, that is,
$$\sup_{(\omega,\theta),(\omega',\theta')\in\Omega^2}\frac{|h(\omega,\theta) - h(\omega',\theta')|}{(|\omega'-\omega| + |\theta'-\theta|)^{\varepsilon}} < \infty. \qquad\Box$$

The results above clearly apply to the special case where $H$ is a real-valued function on $\mathbb{N}_+^{\mathbb{N}_+}$. In this case we set
$$H_n = H(a_n, a_{n+1}, \cdots) = H_1 \circ \tau^{n-1}, \quad n \in \mathbb{N}_+.$$
Then $(H_n)_{n\in\mathbb{N}_+}$ is a strictly stationary sequence on $(I, \mathcal{B}_I, \gamma)$. Theorem 2.1.18, Proposition 2.1.19, Corollary 2.1.20, and Proposition 2.1.21 hold in the present case if in their statements we replace $\bar\gamma$ by $\gamma$, $I^2$ by $I$, $\bar\tau$ by $\tau$, and inequality (2.1.29) by
$$\sum_{n\in\mathbb{N}_+} E_\gamma^{1/2}\bigl[H_1 - E_\gamma(H_1\mid a_1,\cdots,a_n)\bigr]^2 < \infty. \qquad (2.1.31)$$
In the present case the conditional mean value occurring in (2.1.31) and $\sigma^2$ can be expressed in terms of the random variable $h$ on $(I, \mathcal{B}_I)$ defined on $\Omega$ (thus a.e. in $I$) by
$$h([i_1, i_2, \cdots]) = H(i_1, i_2, \cdots)$$
for any $(i_\ell)_{\ell\in\mathbb{N}_+} \in \mathbb{N}_+^{\mathbb{N}_+}$. Clearly,
$$E_\gamma H_1 = \int_I h\,d\gamma, \qquad E_\gamma H_1^2 = \int_I h^2\,d\gamma,$$
$$E_\gamma(H_1\mid a_1,\cdots,a_n)(\omega) = \frac{1}{\gamma(I(i^{(n)}))}\int_{I(i^{(n)})} h\,d\gamma$$
for any $\omega \in I(i^{(n)})$, $i^{(n)} \in \mathbb{N}_+^n$, $n \in \mathbb{N}_+$, and
$$\sigma^2 = \int_I h^2\,d\gamma + 2\sum_{n\in\mathbb{N}_+}\int_I h\,(h\circ\tau^n)\,d\gamma = \int_I h^2\,d\gamma + 2\sum_{n\in\mathbb{N}_+}\int_I h\,U^n h\,d\gamma$$
[the last equation is a consequence of (2.1.2)].

It follows from Proposition 2.1.22 that condition (2.1.31) is fulfilled if we assume that $\int_I h^2\,d\gamma < \infty$ and $\sum_{n\in\mathbb{N}_+} c_n < \infty$, where
$$c_n = \sup_{i^{(n)}\in\mathbb{N}_+^n}\ \sup_{\omega,\omega'\in I(i^{(n)})}\bigl|h(\omega) - h(\omega')\bigr|, \quad n \in \mathbb{N}_+.$$

In turn, the second assumption holds if for some positive constants $c$ and $\varepsilon$ we have
$$\bigl|h(\omega) - h(\omega')\bigr| \le c\left|\frac1\omega - \frac1{\omega'}\right|^{\varepsilon}, \quad \omega, \omega' \in \Omega. \qquad (2.1.32)$$
In particular, (2.1.32) holds if $h$ satisfies a Hölder condition of order $\varepsilon > 0$, that is,
$$\sup_{\omega,\omega'\in\Omega}\frac{|h(\omega) - h(\omega')|}{|\omega - \omega'|^{\varepsilon}} < \infty.$$
To indicate another class of functions $h$ for which (2.1.31) holds let us recall that a function $h : I \to \mathbb{C}$ is said to be of bounded $p$-variation, $p \ge 1$, on $A \subset I$ if and only if
$$\operatorname{var}^{(p)}_A h := \sup \sum_{i=1}^{k-1}|h(t_{i+1}) - h(t_i)|^p < \infty,$$
the supremum being taken over $t_1 < \cdots < t_k$, $t_i \in A$, $1 \le i \le k$, and $k \ge 2$. We write simply $\operatorname{var}^{(p)} h$ for $\operatorname{var}^{(p)}_I h$. If $\operatorname{var}^{(p)} h < \infty$ then $h$ is called a function of bounded $p$-variation. Clearly, $\operatorname{var}^{(1)} h = \operatorname{var} h$ and a function of bounded variation is also a function of bounded $p$-variation for any $p > 1$. (The converse of this assertion is in general not true.) More generally, a function of bounded $p$-variation, $p \ge 1$, is also a function of bounded $p'$-variation, $p' > p$.

Proposition 2.1.23 If $h$ is a function of bounded $p$-variation on $\Omega$, then (2.1.31) holds.

Proof. Without any loss of generality, on account of the last assertion above we can assume that $p \ge 2$. It is obvious that
$$\bigl|h(\omega) - h(\omega')\bigr| \le \bigl(\operatorname{var}^{(p)}_A h\bigr)^{1/p}$$
for any $A \subset \Omega$ and $\omega, \omega' \in A$. Then
$$E_\gamma^{1/2}\bigl[H_1 - E_\gamma(H_1\mid a_1,\cdots,a_n)\bigr]^2 \le E_\gamma^{1/p}\bigl|H_1 - E_\gamma(H_1\mid a_1,\cdots,a_n)\bigr|^p$$
$$= \left(\sum_{i^{(n)}\in\mathbb{N}_+^n}\int_{I(i^{(n)})}\biggl|h(\omega) - \frac{1}{\gamma(I(i^{(n)}))}\int_{I(i^{(n)})} h(\omega')\,\gamma(d\omega')\biggr|^p\gamma(d\omega)\right)^{1/p}$$
$$= \left(\sum_{i^{(n)}\in\mathbb{N}_+^n}\frac{1}{\gamma^p(I(i^{(n)}))}\int_{I(i^{(n)})}\biggl|\int_{I(i^{(n)})}\bigl(h(\omega) - h(\omega')\bigr)\gamma(d\omega')\biggr|^p\gamma(d\omega)\right)^{1/p}$$
$$\le \left(\max_{i^{(n)}\in\mathbb{N}_+^n}\gamma(I(i^{(n)}))\sum_{i^{(n)}\in\mathbb{N}_+^n}\operatorname{var}^{(p)}_{I(i^{(n)})} h\right)^{1/p} \le \left(\frac{(F_nF_{n+1})^{-1}}{\log 2}\right)^{1/p}\bigl(\operatorname{var}^{(p)}_\Omega h\bigr)^{1/p}.$$
Hence the series occurring in (2.1.31) is dominated by
$$\frac{\bigl(\operatorname{var}^{(p)}_\Omega h\bigr)^{1/p}}{(\log 2)^{1/p}}\sum_{n\in\mathbb{N}_+}(F_nF_{n+1})^{-1/p},$$
and clearly the last series is convergent. $\Box$

It is important to know when $\sigma^2$, defined in terms of $H$ or, equivalently, in terms of $h$, is non-zero. In the result below the function $h$, which is only defined on $\Omega$, is considered as the representative of a class of $\lambda$-indistinguishable functions on $I$, after having been extended in an arbitrary manner to the whole of $I$.

Proposition 2.1.24 Assume that $h \in L^2_\gamma(I)$, $\int_I h\,d\gamma = 0$, and $Uh \in BEV(I)$. Then the series
$$\sigma^2 = \int_I h^2\,d\gamma + 2\sum_{n\in\mathbb{N}_+}\int_I h\,U^n h\,d\gamma \qquad (2.1.33)$$

converges absolutely, and we have σ = 0 if and only if there exists b ∈ L2γ (I) such that h = b ◦ τ − b a.e. in I. In particular, if h is essentially unbounded then σ 6= 0. Proof. By (2.0.2) and Proposition 2.1.7(ii) we have ess sup |U n h| ≤ ||U n h||v ≤ q n−1 ||U h||v ,

n ∈ N+ ,

(2.1.34)

for some positive q < P1. This Rclearly entails the absolute convergence of both series (2.1.33) and n∈N+ n I h U n h dγ. Then Corollary 2.1.20 completes the proof of the first two assertions concerning σ. Without appealing to Corollary 2.1.20, the characterization of the case P σ = 0 can be given a direct proof as follows. Put h1 = n∈N+ U n h. By (2.1.34) this series converges in BEV (I), and we have h1 = U h + U h1 = U (h + h1 ). Writing g = h + h1 we note that U g ∈ BEV (I) and Z Z ³ ´ ¢ ¡ 2 2 σ = h + 2hh1 dγ = g 2 − (U g)2 dγ. I

I

By (2.1.2) we have Z

Z 2

(U g) dγ = I

and

Z

((U g) ◦ τ )g dγ I

Z (U g)2 dγ =

I

((U g) ◦ τ )2 dγ. I

R R [Note that (2.1.2) implies in general that I f dγ = I f ◦ τ dγ, f ∈ L1γ , which also follows from the fact that τ is γ-preserving.] Consequently, we can write Z Z Z 2 2 σ = g dγ − 2 ((U g) ◦ τ ) g dγ + ((U g) ◦ τ )2 dγ I I I Z = (g − (U g) ◦ τ )2 dγ. I


Now, if σ = 0 then g = (U g) ◦ τ a.e. in I. Hence h = (U g) ◦ τ − U g a.e. in I,

(2.1.35)

that is, we can take b = U g. Conversely, if h = b ◦ τ − b a.e. in I then Sn = b ◦ τ n − b a.e. in I for any n ∈ N+ . Hence Z −1 2 −1 n Eγ Sn ≤ 4n b2 dγ → 0 as n → ∞, I

that is, σ = 0. Finally, since U g ∈ BEV (I) as shown above, equation (2.1.35) cannot hold in the case where h is essentially unbounded, that is, we cannot have σ = 0. 2 Corollary 2.1.25 Let f : N+ → R such that Eγ f 2 (a1 ) < ∞, Eγ f (a1 ) = 0. Put X Eγ f (a1 ) f (an+1 ) (2.1.36) σ 2 = Eγ f 2 (a1 ) + 2 n∈N+

Then σ = 0 if and only if f = 0. Proof. As a special case of (2.1.26) with (2.1.31) trivially satisfied, series (2.1.36) is absolutely convergent. Moreover, in the present case h is defined by h (ω) = f (b1/ωc) , ω ∈ Ω, R and by hypothesis h ∈ L2γ (I) and I hdγ = 0. We then have X Pi (ω)f (i), ω ∈ Ω, U h (ω) = i∈N+

and v (U h) ≤

X i∈N+

|f (i)| var Pi ≤ C

X |f (i)| i2

i∈N+

for some C > 0. The last series is convergent since Eγ |f (a1 )| < ∞, so that U h ∈ BEV (I). Then by Proposition 2.1.24 we have σ = 0 if and only if there exists b ∈ L2γ (I) such that h = b ◦ τ − b a.e. in I, and we have to show that this happens if and only if f = 0. Clearly, if f = 0 then σ = 0. To prove the converse we first note that U h = U (b ◦ τ ) − U b = b − U b a.e. in I. P This equation holds for b equal to h1 = n∈N+ U n h ∈ BEV (I). Putting b = b1 + h1 we get b1 = U b1 . But by Proposition 2.1.7 the last equation


only holds for a.e. constant functions $b_1$. This shows that actually $b \in BEV(I)$. Next, whatever $i \in \mathbb{N}_+$, for $u \in (1/(i+1), 1/i)$ the equation $h(u) = (b\circ\tau)(u) - b(u)$ a.e. in $I$ implies
$$f(i) = b(x) - b\!\left(\frac{1}{x+i}\right)$$
a.e. in $I$. Hence $nf(i) = b(x) - b([i(n-1), x+i])$ a.e. in $I$ for any $n \ge 2$, where $i(n-1) = (i_1,\cdots,i_{n-1})$ with $i_1 = \cdots = i_{n-1} = i$. If $f(i) \ne 0$ then this contradicts the fact that $b \in BEV(I)$. The proof is complete. $\Box$

We note another criterion for $\sigma \ne 0$ under stronger assumptions than in Proposition 2.1.24.

Proposition 2.1.26 Let $h : I \to \mathbb{R}$ be continuous except for a finite number of points of $I$ and assume that $\inf_{x\in(0,\delta)}|h(x)| > 0$ for some $\delta > 0$, $\int_I h\,d\gamma = 0$, and $\int_I h^2\,d\gamma < \infty$. If the series defining $Uh$ is uniformly convergent in $I$ and $Uh \in BV(I)$, then $\sigma$ defined by (2.1.33) is non-zero.

For the proof see Samur (1996). $\Box$
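To make the quantities in (2.1.36) concrete, here is a small simulation sketch (an illustration of ours, not from the text) for the particular choice $f(i) = 1_{\{i=1\}} - \gamma(a_1 = 1)$, which is bounded and centred, so Corollary 2.1.25 guarantees $\sigma \ne 0$. It samples the stationary digit sequence by starting from the Gauss measure and truncates the autocovariance series at a modest lag; the sample size and the lag cut-off are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
M, LAGS = 200_000, 20
x = np.maximum(2.0 ** rng.random(M) - 1.0, 1e-15)   # x ~ gamma (stationary start)
p1 = np.log(4.0 / 3.0) / np.log(2.0)                # gamma(a_1 = 1)

f_vals = np.empty((LAGS + 1, M))
for k in range(LAGS + 1):
    a = np.floor(1.0 / x)
    f_vals[k] = (a == 1.0) - p1                     # f(a_{k+1}) = 1_{a=1} - gamma(a_1=1)
    x = np.maximum(1.0 / x - a, 1e-15)              # next digit via the Gauss map

cov = [np.mean(f_vals[0] * f_vals[k]) for k in range(LAGS + 1)]
print(cov[0] + 2.0 * sum(cov[1:]))                  # Monte Carlo estimate of sigma^2 in (2.1.36)
```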

2.2 Wirsing’s solution to Gauss’ problem

2.2.1 Elementary considerations

Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. For any $n \in \mathbb{N}$ put
$$F_n(x) = \mu(\tau^n < x), \quad x \in I,$$
with $\tau^0 =$ the identity map. As $(\tau^n < x) = \tau^{-n}((0,x))$, by Proposition 2.1.5 we have
$$F_n(x) = \int_0^x \frac{U^n f_0(u)}{u+1}\,du, \quad n \in \mathbb{N},\ x \in I, \qquad (2.2.1)$$
with $f_0(x) = (x+1)F_0'(x)$, $x \in I$, where $F_0' = d\mu/d\lambda$. [Clearly, (2.2.1) is a special case of (2.1.21).] In this subsection we will assume that $F_0' \in C^1(I)$. In other words, we study the behaviour of $U^n$ as $n \to \infty$, assuming that the domain of $U$ is $C^1(I)$.

Let $f \in C^1(I)$. Then
$$Uf(x) = \sum_{i\in\mathbb{N}_+} P_i(x)\,f\!\left(\frac1{x+i}\right) = \sum_{i\in\mathbb{N}_+}\left(\frac{i}{x+i+1} - \frac{i-1}{x+i}\right) f\!\left(\frac1{x+i}\right), \quad x \in I,$$

can be differentiated term by term to give
$$(Uf)'(x) = \sum_{i\in\mathbb{N}_+}\left(-\left(\frac{i}{(x+i+1)^2} - \frac{i-1}{(x+i)^2}\right)f\!\left(\frac1{x+i}\right) - \left(\frac{i}{x+i+1} - \frac{i-1}{x+i}\right)\frac{1}{(x+i)^2}f'\!\left(\frac1{x+i}\right)\right)$$
$$= -\sum_{i\in\mathbb{N}_+}\left(\frac{i}{(x+i+1)^2}\left(f\!\left(\frac1{x+i}\right) - f\!\left(\frac1{x+i+1}\right)\right) + \frac{x+1}{(x+i)^3(x+i+1)}f'\!\left(\frac1{x+i}\right)\right), \quad x \in I,$$
since the series of derivatives is uniformly convergent, it being dominated by a convergent series of positive constants. Hence
$$(Uf)' = -Vf', \quad f \in C^1(I), \qquad (2.2.2)$$
where $V : C(I) \to C(I)$ is defined by
$$Vg(x) = \sum_{i\in\mathbb{N}_+}\left(\frac{i}{(x+i+1)^2}\int_{1/(x+i+1)}^{1/(x+i)} g(u)\,du + \frac{x+1}{(x+i)^3(x+i+1)}\,g\!\left(\frac1{x+i}\right)\right), \quad g \in C(I),\ x \in I.$$
Clearly,
$$(U^n f)' = (-1)^n V^n f', \quad n \in \mathbb{N}_+,\ f \in C^1(I). \qquad (2.2.3)$$
We are going to show that $V^n$ takes certain functions into functions with very small values when $n \in \mathbb{N}_+$ is large.

Proposition 2.2.1 There are positive constants $v > 0.29017$ and $w < 0.30796$, and a real-valued function $\varphi \in C(I)$ such that $v\varphi \le V\varphi \le w\varphi$.

81

Proof. Let h : R+ → R be a continuous bounded function such that limx→∞ h (x) /x = 0. We look for a function g : (0, 1] → R such that U g = h, assuming that the equation µ ¶ X 1 U g (x) = Pi (x) g = h (x) (2.2.4) x+i i∈N+

holds for x ∈ R+ . Then (2.2.4) yields h (x) h (x + 1) 1 − = g x+1 x+2 (x + 1) (x + 2) Hence

µ g (u) =

µ

1 x+1

¶ µ ¶ µ ¶ 1 1 1 1 +1 h −1 − h , u u u u

¶ ,

x ∈ R+ .

u ∈ (0, 1],

and we indeed have U g = h since ¶ X µ h (x + i − 1) h (x + i) U g (x) = (x + 1) − x+i x+i+1 i∈N+

µ

= (x + 1)

h (x) h (x + i) − lim x + 1 i→∞ x + i + 1

¶ = h (x) ,

x ∈ R+ .

In particular, for any fixed a ∈ I we consider the function ha : R+ → R defined by 1 , x ∈ R+ . ha (x) = x+a+1 We have just seen that the function ga : (0, 1] → R defined by µ ¶ µ ¶ µ ¶ 1 1 1 1 ga (x) = + 1 ha − 1 − ha x x x x =

1 x+1 − , ax + 1 (a + 1) x + 1

x ∈ (0, 1],

satisfies U ga (x) = ha (x),

x ∈ I.

We come to V via (2.2.2). Setting ϕa (x) = ga0 (x) =

1−a a+1 , 2 + (ax + 1) ((a + 1) x + 1)2

x ∈ I,


we have V ϕa (x) = − (U ga )0 (x) =

1 , (x + a + 1)2

x ∈ I.

Let us choose a by asking that ϕa ϕa (0) = (1) . V ϕa V ϕa This amounts to (a + 1)3 (2a + 1) + (a − 1) (a + 2)2 = 0 or

2 (a + 1)4 − 3 (a + 1) − 2 = 0,

which yields as unique acceptable solution a = 0.3126597 · · · . For this value of a the function ϕa /V ϕa attains its maximum equal to 2 (a + 1)2 = 3.44615 · · · at x = 0 and at x = 1, and has a minimum equal to ¶ µ ¡ ¢ a+1 m (a) = a3 + a2 − a + 1 + 3a (a + 2) 1 − a − a2 (1 − a) δ + δ = 3.247229 · · · at x = (δ − 1) / (1 − a (δ − 1)) = 0.3655 · · · , where µ δ=

a (a + 1) (a + 2) (1 − a) (1 − a − a2 )

¶1/3 = 1.328024 · · · .

It follows that for $\varphi = \varphi_a$ with $a = 0.3126597\cdots$ we have
$$\frac{\varphi}{2(a+1)^2} \le V\varphi \le \frac{\varphi}{m(a)},$$
that is, $v\varphi \le V\varphi \le w\varphi$, where
$$v = \frac{1}{2(a+1)^2} > 0.29017, \qquad w = \frac{1}{m(a)} < 0.30796. \qquad\Box$$
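The numerical constants in the proof are easy to reproduce. The sketch below (ours; plain NumPy, grid size chosen ad hoc) solves $2(a+1)^4 - 3(a+1) - 2 = 0$, evaluates the ratio $\varphi_a/V\varphi_a$ on a fine grid using the closed forms obtained above, and recovers $v \approx 0.29018$ and $w \approx 0.30795$ together with the location of the minimum.

```python
import numpy as np

roots = np.roots([2.0, 0.0, 0.0, -3.0, -2.0])        # 2(a+1)^4 - 3(a+1) - 2 = 0
a = max(r.real for r in roots if abs(r.imag) < 1e-9) - 1.0
print("a =", a)                                      # about 0.3126597

phi = lambda x: (1 - a) / (a * x + 1) ** 2 + (a + 1) / ((a + 1) * x + 1) ** 2
V_phi = lambda x: 1.0 / (x + a + 1) ** 2             # V(phi_a), computed in the proof

x = np.linspace(0.0, 1.0, 100_001)
r = phi(x) / V_phi(x)                                # ratio phi_a / V phi_a on I
print("v =", 1.0 / r.max())                          # about 0.29018  (> 0.29017)
print("w =", 1.0 / r.min())                          # about 0.30795  (< 0.30796)
print("argmin =", x[r.argmin()])                     # about 0.3655
```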


Remark. As noted by Wirsing (1974, p. 513), a better choice of ϕ is ϕ = 8ϕa0 − 7ϕa00 with a0 = 0.6247 and a00 = 0.7, which yields v = 0.3020, w = 0.3043. 2 Corollary 2.2.2 Let f0 ∈ C 1 (I) such that f00 > 0. Put α = min x∈I

Then

ϕ (x) , f00 (x)

β = max x∈I

α n 0 β v f0 ≤ V n f00 ≤ wn f00 , β α

ϕ (x) . f00 (x) n ∈ N+ .

(2.2.4)

Proof. Since V is a positive operator (that is, takes non-negative functions into non-negative functions) we have v n ϕ ≤ V n ϕ ≤ wn ϕ,

n ∈ N+ .

Noting that αf00 ≤ ϕ ≤ βf00 we then can write α n 0 v f0 ≤ β ≤

1 n 1 1 v ϕ ≤ V n ϕ ≤ V n f00 ≤ V n ϕ β β α 1 n β n 0 w ϕ ≤ w f0 , n ∈ N+ , α α

which shows that (2.2.4) holds.

2

Remark. A similar result holds if f0 ∈ C 1 (I) and f00 < 0.

2

Theorem 2.2.3 (Near-optimal solution to Gauss’ problem) Let f0 ∈ C 1 (I) such that f00 > 0. For any n ∈ N+ and x ∈ I we have (log 2)2 α minx∈I f00 (x) n v G (x) (1 − G (x)) 2β ≤ |µ (τ n < x) − G (x)| ≤

(log 2)2 β maxx∈I f00 (x) n w G (x) (1 − G(x)), α

where α, β, v, and w are defined in Proposition 2.2.1 and Corollary 2.2.2. In particular, for any n ∈ N+ and x ∈ I we have 0.07739 v n G (x) (1 − G (x)) ≤ |λ (τ n < x) − G (x)| ≤ 1.49132 wn G (x) (1 − G (x)) .


¡ ¢ Proof. For any n ∈ N and y ∈ I set dn (y) = µ τ n < ey log 2 − 1 − y so that dn (G (x)) = µ (τ n < x) − G(x), x ∈ I. Then by (2.2.1) we have Z dn (G (x)) =

x

0

U n f0 (u) du − G(x). u+1

Differentiating twice with respect to x yields d0n (G (x))

1 (x + 1) log 2

=

(U n f0 (x))0 =

U n f0 (x) 1 − , x+1 (x + 1) log 2 1 d00n (G (x)) , n ∈ N, x ∈ I. (log 2)2 x + 1

Hence, by (2.2.3), d00n (G (x)) = (−1)n (log 2)2 (x + 1) V n f00 (x),

n ∈ N, x ∈ I.

Since dn (0) = dn (1) = 0, it follows from a well known interpolation formula that y (1 − y) 00 dn (θ), n ∈ N, y ∈ I, dn (y) = − 2 for a suitable θ = θ (n, y) ∈ I. Therefore µ (τ n < x) − G (x) = (−1)n+1 (log 2)2

θ+1 n 0 V f0 (θ) G (x) (1 − G (x)) 2

for any n ∈ N and x ∈ I, and another suitable θ = θ (n, x) ∈ I. The result stated follows now from Corollary 2.2.2. In the special case µ = λ we have f0 (x) = x + 1, x ∈ I. Then with a = 0.3126597 · · · we have α = min x∈I

ϕ (x) 1−a a+1 = = 0.644333 · · · , 2 + f00 (x) (a + 1) (a + 2)2 β = max x∈I

ϕ (x) = 2, f00 (x)

so that (log 2)2 α = 0.07739 · · · , 2β

(log 2)2 β = 1.49131 · · · . α


The proof is complete.

2

Remark. It follows from the above proof that for any n ∈ N the difference µ (τ n < x) − G (x) has a constant sign equal to (−1)n+1 whatever 0 < x < 1.
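Theorem 2.2.3 is sharp enough to be checked numerically. The sketch below (ours, with the grid size and the truncation of the series defining $U$ chosen ad hoc) iterates $U$ on $f_0(x) = x+1$, computes $\lambda(\tau^n < x)$ from (2.2.1) by the trapezoidal rule, and verifies that the ratio $|\lambda(\tau^n < x) - G(x)|/(G(x)(1-G(x)))$ stays between $0.07739\,v^n$ and $1.49132\,w^n$ away from the endpoints (near 0 and 1 both numerator and denominator vanish, so rounding noise would dominate there).

```python
import numpy as np

xs = np.linspace(0.0, 1.0, 4001)
G = np.log1p(xs) / np.log(2.0)
f = xs + 1.0                                   # f0(x) = x + 1, i.e. mu = lambda

def U(f_vals, x, M=5000):
    """(Uf)(x) = sum_i (x+1)/((x+i)(x+i+1)) f(1/(x+i)); the tail i > M is
    approximated by f(0) * (x+1)/(x+M+1)."""
    s = np.zeros_like(x)
    for i in range(1, M + 1):
        s += (x + 1.0) / ((x + i) * (x + i + 1.0)) * np.interp(1.0 / (x + i), xs, f_vals)
    return s + (x + 1.0) / (x + M + 1.0) * f_vals[0]

v, w = 0.29017, 0.30796
sl = slice(200, 3801)                          # keep away from x = 0 and x = 1
dx = xs[1] - xs[0]
for n in range(1, 5):
    f = U(f, xs)
    g = f / (xs + 1.0)
    Fn = np.concatenate(([0.0], np.cumsum((g[1:] + g[:-1]) / 2.0) * dx))   # (2.2.1)
    ratio = np.abs(Fn - G)[sl] / (G * (1.0 - G))[sl]
    print(n, 0.07739 * v**n, ratio.min(), ratio.max(), 1.49132 * w**n)
```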

2.2.2 A functional-theoretic approach

The question naturally arises whether the operator V has an eigenvalue λ0 such that v ≤ λ0 ≤ w (see Theorem 2.2.3). This will indeed follow from the result below. Let B be a collection of bounded real-valued functions defined on a set X, with the following properties: (i) B is a linear space over R; (ii) B is complete with respect to the supremum norm, and (iii) B contains the constant functions. Theorem 2.2.4 Let V : B → B be a positive bounded linear operator and F : B → R a positive bounded linear functional such that V ≥ F.

(2.2.5)

Assume that there exist ϕ ∈ B with m (ϕ) = inf ϕ (x) > 0 x∈X

and two positive numbers v and w, v ≤ w, such that v≤

V ϕ (x) ≤ w, ϕ (x)

x ∈ X,

(2.2.6)

and

³ v´ F (ϕ) > 1 − || V ϕ ||. (2.2.7) w Then V has an eigenvalue λ0 ∈ [v, w] with corresponding positive eigenfunction ψ ∈ B such that ψ ≥ ϕ ≥ m (ϕ) > 0,

0<w

F (ϕ) F (ψ) − (w − v) ≤ ≤ λ0 , || V ϕ || || ψ ||

and for any n ∈ N and f ∈ B we have n

V f =G

(f ) λn0 ψ

f + osc ψ

µ ¶ F (ψ) n λ0 − θn ψ, || ψ ||

(2.2.8)


where G : B → R is a positive bounded linear functional with || G || ≤ 1/m (ϕ), and θn : X → R is a function satisfying |θn | ≤ 1. Proof. Define ϕn = V n ϕ, n ∈ N, ϕ0 = ϕ. Since V is positive, from (2.2.6) we get vϕn ≤ ϕn+1 ≤ wϕn , n ∈ N. It follows that inf ϕn (x) > 0,

x∈X

n ∈ N.

Set v0 = v, w0 = w, and vn = inf

ϕn+1 ϕn+1 , wn = sup , ϕn ϕn

n ∈ N+ .

Then vn ϕn ≤ ϕn+1 ≤ wn ϕn ,

n ∈ N,

(2.2.9)

whence vn V ϕn ≤ V ϕn+1 ≤ wn V ϕn , that is, vn ϕn+1 ≤ ϕn+2 ≤ wn ϕn+1 . Therefore vn+1 ≥ vn and wn+1 ≤ wn , n ∈ N. We are going to improve these inequalities. It follows from (2.2.5) and (2.2.9) that ϕn+2 − vn ϕn+1 = V (ϕn+1 − vn ϕn ) ≥ F(ϕn+1 − vn ϕn ) ϕn+1 F(ϕn+1 − vn ϕn ), ≥ || ϕn+1 || whence vn+1 ≥ vn +

F(ϕn+1 − vn ϕn ) , || ϕn+1 ||

n ∈ N.

(2.2.10)

Similarly, wn ϕn+1 − ϕn+2 = V (wn ϕn − ϕn+1 ) ≥ F (wn ϕn − ϕn+1 ) ϕn+1 F(wn ϕn − ϕn+1 ), ≥ || ϕn+1 || whence wn+1 ≤ wn −

F (wn ϕn − ϕn+1 ) , || ϕn+1 ||

n ∈ N.

(2.2.100 )


Putting dn = wn − vn and en = F (ϕn ) /|| ϕn+1 || , n ∈ N, it follows from (2.2.10) and (2.2.100 ) that dn+1 ≤ dn (1 − en ),

n ∈ N,

(2.2.11)

which shows that en ≤ 1, n ∈ N. Now, note that (2.2.9) implies F (ϕn+1 ) ≥ vn F (ϕn ) and || ϕn+2 || ≤ wn+1 || ϕn+1 || , Hence en+1 ≥

vn en , wn+1

n ∈ N.

n ∈ N.

(2.2.12)

In conjunction with (2.2.11) and (2.2.12), assumption (2.2.7) which can be written as d0 e0 − > 0, w0 ensures exponential decrease of the dn , n ∈ N, since wn+1 en+1 − dn+1 ≥ vn en − dn (1 − en ) = wn en − dn ,

n ∈ N,

whence wn en − dn ≥ w0 e0 − d0 , 1 ≥ en ≥ and

d0 1 (w0 e0 − d0 ) ≥ e0 − > 0, wn w0

µ ¶ d0 n dn ≤ d0 1 − e0 + , w0

n ∈ N.

(2.2.13)

(2.2.14)

Put λ0 = limn→∞ vn = limn→∞ wn , and define ϕ e0 = ϕ0 = ϕ, ϕ en = ϕn (v0 · · · vn−1 )−1 ,

n ∈ N+ .

Then (2.2.9) amounts to ϕ en ≤ ϕ en+1

wn ≤ ϕ en = vn

µ ¶ µ ¶ dn dn 1+ ϕ en ≤ 1 + ϕ en , vn v0

and (2.2.14) implies that ¶ Yµ dn < ∞. A = 1+ v0 n∈N

n ∈ N,

(2.2.15)


Hence ϕ en ≤

n−1 Yµ

1+

i=0

di v0

¶ ϕ e0 ≤ A ϕ e0 ,

n ∈ N+ .

(2.2.16)

It follows from (2.2.15) and (2.2.16) that dn dn A ϕ en ≤ ϕ e0 , n ∈ N. v0 v0 P Therefore by (2.2.14) the series en+1 − ϕ en || converges. By the n∈N || ϕ completeness of B the limit ψ = limn→∞ ϕ en exists. Letting n → ∞ in vn ϕ en ≤ V ϕ en ≤ wn ϕ en , n ∈ N, yields V ψ = λ0 ψ. Since ϕ en+1 ≥ ϕ en ≥ · · · ≥ ϕ e0 = ϕ, e we have ψ ≥ ϕ. As 1 ≥ en = F (ϕn ) /|| ϕn+1 || = F (ϕ en ) /|| V ϕ en || , n ∈ N, letting n → ∞ yields 1 ≥ F (ψ) /λ0 || ψ || . Finally, by (2.2.13) we have 0≤ϕ en+1 − ϕ en ≤

λ0 F (ψ) F (ϕ) F (ψ) = = lim wn en ≥ w0 e0 − d0 = w − w + v > 0. n→∞ || ψ || || V ψ || || V ϕ || To prove (2.2.8) let f ∈ B and define fn = V n f, n ∈ N, f0 = f, fn , λn0 ψ

ven = inf

w en = sup

fn , λn0 ψ

n ∈ N.

Hence fn+1 − ven λn+1 ψ = V (fn − ven λn0 ψ) 0 ≥ F (fn − ven λn0 ψ) ≥

ψ F(fn − ven λn0 ψ), || ψ ||

which yields ven+1 ≥ ven +

1

F λn+1 0 || ψ ||

(fn − ven λn0 ψ) ≥ ven ,

n ∈ N.

Similarly, w en+1 ≤ w en −

1 F(w en λn0 ψ n+1 λ0 || ψ ||

− fn ) ≤ w en ,

n ∈ N.

Therefore w en+1 − ven+1

µ ¶ F (ψ) ≤ (w en − ven ) 1 − , λ0 || ψ ||

n ∈ N,


whence w en − ven ≤ osc

f ψ

µ ¶ F (ψ) n 1− , λ0 || ψ ||

since w e0 − ve0 = sup

n ∈ N,

f f f − inf = osc . ψ ψ ψ

If we denote by G (f ) the common limit of ven and w en as n → ∞, then we have µ ¶ F (ψ) n f e 1− , n ∈ N, ven , w en = G (f ) + θn osc ψ λ0|| ψ || ¯ ¯ ¯ ¯ e with a suitable θn ∈ R satisfying ¯θen ¯ ≤ 1. Hence, by the very definition of the ven and w en , n ∈ N, equation (2.2.8) should hold. Since |G(f )| ≤ max (|e v0 | , |w e0 |) ≤ it follows that || G || = sup f ∈B

|| f || , inf ψ

f ∈ B,

|G (f )| 1 ≤ . || f || inf ψ

The fact that G is a positive linear functional is an immediate consequence of equation (2.2.8). 2 Let us show that Theorem 2.2.4 applies to Gauss’ problem as considered in Subsection 2.2.1. The space B is Cr (I), the collection of all real-valued functions in C (I) , and the operator V the one denoted there by the same letter. As function ϕ we could use the function ϕa constructed in Subsection 2.2.1 with a = 0.3126597 · · · . Nevertheless, it is more convenient to use V ϕa instead, for which the same values of v and w apply. Thus we take ϕ (x) =

1 , (x + a + 1)2

x ∈ I,

with a = 0.3126597 · · · . Finally, the functional F can be constructed as follows. Let f ∈ Cr (I) , f ≥ 0. [Note that actually the considerations below hold for any non-negative f ∈ B(I).] Then V f (x) ≥

X i∈N+

Z

=

i (x + i + 1)2

Z

f (y) dy 1/(x+i+1)

1

k (x, y) f (y) dy, 0

1/(x+i)

x ∈ I,


where k (x, 0) = 0, k (x, y) =

x ∈ I, by −1 − xc

(x + by −1 − xc + 1)2

,

x ∈ I, y ∈ (0, 1].

If 0 < y ≤ 1/3 then by −1 − xc ≥ 2, and since t → (t + x + 1)−2 ,

t ≥ 2,

is a decreasing function, we have k (x, y) ≥

y −1 − x 2

(y −1 + 1)



y −1 − 1 (y −1 + 1)

2

=

y (1 − y) (y + 1)2

for x ∈ I, 0 ≤ y ≤ 1/3. If 1/3 < y ≤ 1/2 then either k (x, y) = (2 + x)−2 or k (x, y) = 2 (3 + x)−2 . Hence k (x, y) ≥ 1/9 for x ∈ I, 1/3 < y ≤ 1/2. Thus we have V f ≥ F (f ), where Z

1/3

F (f ) = 0

y (1 − y) 1 2 f (y) dy + 9 (y + 1)

Z

1/2

f (y) dy. 1/3

Elementary calculations yield Z

1/3

F (ϕ) = 0

Z

1/3 µ

= 0

1 + 9

Z

Z

1/2

1/3

dy (y + a + 1)2

3a + 4 2 a2 + 3a + 2 3a + 4 − − − a3 (y + 1) a2 (y + 1)2 a3 (y + a + 1) a2 (y + a + 1)2

1/2

1/3

y (1 − y) dy 1 2 2 + 9 (y + 1) (y + a + 1)

¶ dy

3a + 4 88a2 + 279a + 216 dy 4 (a + 1) = − . log a3 3a + 4 18a2 (2a + 3) (3a + 4) (y + a + 1)2

As V ϕ ≤ wϕ, we have F (ϕ) w F (ϕ) ≥ = (a + 1)2 F (ϕ) > 0.033184. || V ϕ || || ϕ ||

(2.2.17)

Since w − v < 0.01779, inequality (2.2.7) holds. Thus Theorem 2.2.4 applies and we have F (ψ) ≥ (a + 1)2 F (ϕ) − (w − v) > 0.01539. || ψ ||

(2.2.18)


To state the result corresponding to Theorem 2.2.3 we should first introduce some notation. Let
$$\Psi(x) = \int_0^x \psi(u)\,du$$
and
$$\tilde\psi(x) = \int_0^x \frac{\Psi(u) - U^\infty\Psi}{u+1}\,du, \quad x \in I.$$
It is easy to check that
$$\bigl((x+1)\tilde\psi'(x)\bigr)' = \psi(x), \quad x \in I,$$

x ∈ I,

and ψe (0) = ψe (1) = 0. Remarks. 1. As noted by Wirsing (1974, p. 521), using as function ϕ the function V (8ϕa0 − 7ϕa00 ) with a0 = 0.6247 and a00 = 0.7 one can improve (2.2.18) to F(ψ)/|| ψ || ≥ 0.031. 2. Wirsing (1974, § 5) proved that the functions ψ and ψe are analytic. Their analytic continuations are holomorphic in the whole complex plane with a cut along the negative real axis from ∞ to −1, which is the natural boundary of these functions. 2 Theorem 2.2.5 Let f0 ∈ C 1 (I) (equivalently, dµ/dλ = F00 ∈ C 1 (I)). For any n ∈ N and x ∈ I we have ¯ ¯ ¯ ¯ n ¡ ¢ ¯ µ (τ n < x) − G (x) − (−λ0 ) G f00 ψe (x)¯ f00 (log 2)2 (λ0 − 0.01539)n G (x) (1 − G (x)) , ψ where λ0 = 0.303 663 002 898 732 658 · · · , ≤ || ψ || osc

1 3.41 , 2 ≤ ψ (x) ≤ (x + a + 1) (x + a + 1)2

x ∈ I,

with a = 0.3126597, and G is a positive bounded functional on Cr (I) such that 1 || G || ≤ ≤ (a + 2)2 = 5.34839 · · · . inf ψ In particular, for any n ∈ N and x ∈ I we have ¯ ¯ ¯ ¯ n (2.2.19) ¯λ (τ n < x) − G (x) − (−λ0 ) G (1) ψe (x)¯

92

Chapter 2 ≤ 4.605 (λ0 − 0.01539)n G (x) (1 − G (x)) .

Proof. We use the same in the¢proof of Theorem 2.2.3. n ∈¢ N ¡ ntrickyas ¡ y logFor log 2 n 0 2 e and y ∈ I set dn (y) = µ τ < e − 1 − y − (λ0 ) G(f0 )ψ e − 1 so that ¡ ¢ dn (G (x)) = µ (τ n < x) − G (x) − (−λ0 )n G f00 ψe (x) , x ∈ I. Differentiating twice with respect to x yields 1 d00n (G (x)) (log 2)2 x + 1

´0 ¡ ¢³ = (U n f0 )0 (x) − (−λ0 )n G f00 (x + 1) ψe0 (x) ¡ ¢ = (−1)n V n f00 (x) − (−λ0 )n G f00 ψ (x) .

Hence, by Theorem 2.2.4 and (2.2.18), 0 ¯ 00 ¯ ¯dn (G (x))¯ ≤ 2 || ψ || osc f0 (log 2)2 (λ0 − 0.01539)n , ψ

n ∈ N, x ∈ I.

Since dn (0) = dn (1) = 0, the first inequality in the statement follows (cf. the proof of Theorem 2.2.3). In principle, Theorem 2.2.4 provides the means for computing λ0 to any accuracy. It follows from that theorem that for any real-valued f ∈ C 1 (I) and n ∈ N we have U n f (1) − U n f (0) n

= (−1)

λn0

¡ ¢ G f0

Z 0

1

f0 ψdλ + (λ0 − 0.01539) osc ψ

Z

n

0

1

θen ψ dλ

with a suitable θen : I → R satisfying |θen | ≤ 1. Therefore if f 0 > 0 then µµ ¶ ¶ U n f (1) − U n f (0) λ0 − 0.01539 n = −λ0 + O U n−1 f (1) − U n−1 f (0) λ0 as n → ∞. Using this equation Wirsing (1974) has obtained the value given in the statement. Note that in Knuth (1981, p. 350) the first 20 (RCF) digits of λ0 are given as 3, 3, 2, 2, 3, 13, 1, 174, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1. The 20th convergent equals 227 769 828 , 750 074 345 which yields 14 exact significant digits of λ0 .
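The convergent quoted from Knuth is easy to recompute from the listed partial quotients with the standard recurrences $p_n = a_n p_{n-1} + p_{n-2}$, $q_n = a_n q_{n-1} + q_{n-2}$; a minimal sketch of ours:

```python
from fractions import Fraction

digits = [3, 3, 2, 2, 3, 13, 1, 174, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1]
p, p_prev = 0, 1           # p_0 = 0 (integer part of lambda_0), p_{-1} = 1
q, q_prev = 1, 0           # q_0 = 1, q_{-1} = 0
for a in digits:
    p, p_prev = a * p + p_prev, p
    q, q_prev = a * q + q_prev, q
print(p, q)                                # 227769828 750074345
print(float(Fraction(p, q)))               # 0.30366300289873..., 14 correct digits
```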


Now, we refer to the proof of Theorem 2.2.4. It is shown there that ϕ ≤ ψ ≤ Aϕ, with ¶ Yµ dn , A = 1+ v n∈N ¶ µ w−v n dn ≤ (w − v) 1 − e0 + , n ∈ N, w where in the present case v > 0.29017 and w < 0.30796. Then since by (2.2.17) we have we0 =

wF (ϕ) ≥ (a + 1)2 F (ϕ) ≥ 0.033184, || V (ϕ) ||

it follows that P A ≤ exp

n∈N dn

v

≤ exp

w (w − v) ≤ 3.409 · · · . v (we0 − (w − v))

In the special case µ = λ we have osc

f00 1 1 1 (a + 1)2 = osc = − ≤ (a + 2)2 − = 4.843094 · · · , ψ ψ inf ψ sup ψ 3.41

and (2.2.19) follows.

2

Theorem 2.2.6 Let f ∈

C 1 (I)

be real-valued. For any n ∈ N we have

|| U n f − U ∞ f || (2.2.20) ¶ µ Z Z x ¯ ¡ ¢¯ f0 γ(dx) ψ dλ ≤ λn0 ¯G f 0 ¯ + osc (λ0 − 0.01539)n ψ I 0 and || U n f − U ∞ f || (2.2.21) µ ¶ Z Z x ¯ ¡ ¢¯ f0 ≥ λn0 ¯G f 0 ¯ − osc (λ0 − 0.01539)n γ(dx) ψ dλ . ψ I 0 Here G is a positive bounded linear functional on Cr (I) with || G || ≤ 5.34839 · · · , and the last inequality is meaningful for n ∈ N+ large enough. Proof. It follows from (2.2.3) and (2.2.8) that U n f (x) − U n f (y) =

94

Chapter 2 µ Z = (−1)n G(f 0 )λn0

y

ψ dλ + osc

x

f0 (λ0 − 0.01539)n ψ

Z x

y

¶ θen ψ dλ

for any n ∈ N and x, y ∈ I with a suitable θen : I → R satisfying |θen | < 1. Integrating over y ∈ I with respect to γ, on account of (2.1.12) we obtain µ Z Z y n ∞ n 0 n U f (x) − U f = (−1) G(f )λ0 γ(dy) ψ dλ (2.2.22) I

f0 + osc (λ0 − 0.01539)n ψ

Z

x

Z

y

γ(dy) I

x



θen ψ dλ

for any n ∈ N and x ∈ I. Hence (2.2.20) and (2.2.21) follow at once. For the lower bound (2.2.21) we should note that || U n f − U ∞ f || ≥ |U n f (0) − U ∞ f | . 2 Remarks. 1. Equation (2.2.22) shows that whatever f ∈ C 1 (I) the exact rate of convergence of U n f (x)−U ∞ f to 0 as n → ∞ is O(λn0 ) for any x ∈ / E, where µ ¶ Z Z y E = x ∈ I : γ(dy) ψ dλ = 0 . I

x

Clearly, E is not empty since Z Z y Z Z γ(dy) ψ dλ > 0 and γ(dy) I

0

I

y

ψ dλ < 0.

1

2. By (2.1.12) and Proposition 2.0.1(i) with µ = γ, for any f ∈ C 1 (I) we have || U n f − U ∞ f || ≤ var U n f, n ∈ N. Next, since U n f (1) − U n f (0) = U n f (1) − U ∞ f − (U n f (0) − U ∞ f ), we have |U n f (1) − U n f (0)| ≤ 2 || U n f − U ∞ f ||. Finally, noting that by (2.2.3) we have Z Z ¯ n 0¯ ¯ n 0¯ n ¯ ¯ ¯V f ¯ dλ, var U f = (U f ) dλ = I

I

¯ ¯Z ¯ ¯Z ¯ ¯ ¯ ¯ n 0 n n n 0 ¯ ¯ ¯ |U f (1) − U f (0)| = ¯ (U f ) dλ¯ = ¯ V f dλ¯¯ , I

I


from (2.2.8) we obtain µ ¶Z ¯ ¯ f0 n¯ 0 ¯ n || U f − U f || ≤ λ0 G(f ) + osc (λ0 − 0.01539) ψ dλ ψ I n



and 1 || U f − U f || ≥ 2 n



µ ¶Z ¯ ¯ f0 n¯ 0 ¯ n λ0 G(f ) − osc (λ0 − 0.01539) ψ dλ ψ I

for any n ∈ N and any real-valued f ∈ C 1 (I). Since Z x Z Z γ(dx) ψ dλ < ψ dλ, I

|| U n f

0

I

U ∞f

the upper bound for − || just derived is slightly worse than that given in Theorem 2.2.6. The comparison of the lower bounds forR || U n f − U ∞ fR|| , here Rand in Theorem 2.2.6, amounts to a comparison of I ψ dλ/2 x and I γ(dx) 0 ψ dλ, a question we cannot answer. 2 Corollary 2.2.7 The spectral radius of the operator U − U ∞ in C 1 (I) is equal to λ0 . Proof. We should show that à n

lim || U −

n→∞

1/n U ∞ ||1

= lim

n→∞

|| U n f − U ∞ f ||1 sup || f ||1 06=f ∈C 1 (I)

!1/n = λ0 .

This follows easily using Theorem 2.2.6 and equations (2.2.3) and (2.2.8). The details are left to the reader. 2

2.2.3 The case of Lipschitz densities

Theorem 2.2.4 can be also used to solve Gauss’ problem in the case where F00 = dµ/dλ ∈ L(I). In other words, Theorem 2.2.4 enables us to study the behaviour of U n as n → ∞ assuming that the domain of U is L(I). Let f ∈ L(I). Then the derivative f 0 exists a.e. in I and is bounded by s(f ). Abusing the notation, we will also denote by f 0 the extension to I of the derivative of f , which is obtained by assigning the value 0 at the points where f is not differentiable. It is obvious that the operator V : C(I) → C(I) introduced in Subsection 2.2.1 can be extended to B(I) with V g, g ∈ B(I), defined by the same


formula as in the case of a continuous g. The point is that, as is easy to see, equations (2.2.2) and (2.2.3) hold now a.e. in I, that is, (U n f )0 = (−1)n V n f 0 , f ∈ L(I),

n ∈ N+ ,

(2.2.23)

a.e. in I, with the null set of exempted points depending on f and n. Let us now apply Theorem 2.2.4 to our V in the case where B is Br (I), the collection of all real-valued functions in B(I), with the same function ϕ and functional F as in the case where B = Cr (I) ⊂ Br (I), which has been considered in Subsection 2.2.2. It follows that the operator V : Br (I) → Br (I) has an eigenvalue λ0 = 0.303 663 002 898 732 658 · · · with corresponding positive eigenfunction ψ ∈ C(I) satisfying 3.41 1 ≤ ψ(x) ≤ , 2 (x + a + 1) (x + a + 1)2

x ∈ I,

where a = 0.3126597 · · · , and V n g = G(g)λn0 ψ + osc

g (λ0 − 0.01539)n θn ψ ψ

(2.2.24)

for any n ∈ N and g ∈ Br (I). Here G : Br (I) → R is a positive bounded linear functional with || G || ≤ (a + 2)2 and θn : I → R is a function satisfying |θn | ≤ 1. Theorem 2.2.8 Let f ∈ L(I) be real-valued. For any n ∈ N+ we have µ ¶Z Z x ¯ ¯ f0 n ∞ n¯ 0 ¯ n || U f − U f || ≤ λ0 G(f ) + osc (λ0 − 0.01539) γ(dx) ψdλ ψ I 0 and || U n f − U ∞ f || ≥

µ ¶Z Z x ¯ ¯ f0 λn0 ¯G(f 0 )¯ − osc (λ0 − 0.01539)n γ(dx) ψdλ. ψ I 0

Here G is a positive bounded functional on Br (I) with || G || < 5.34839 · · · , and the last inequality is meaningful for n ∈ N+ large enough. The proof is identical with that of Theorem 2.2.6. Instead of (2.2.3) and (2.2.8) we should use (2.2.23) and (2.2.24). In particular, equation (2.2.22) holds for f ∈ L(I), too. 2 Remark. The contents of Remarks 1 and 2 following the proof of Theorem 2.2.6 apply mutatis mutandis to the present L(I) framework. 2


Corollary 2.2.9 Let f0 ∈ L(I) (equivalently, dµ/dλ = F00 ∈ L(I)). For any n ∈ N and A ∈ BI we have ¯ ¯ ¯µ(τ −n (A)) − γ(A)¯ (2.2.25) ¶ µ ¯ ¯ f00 0 ¯ n n¯ ≤ (1 − log 2) λ0 G(f0 ) + osc (λ0 − 0.01539) || ψ || min(γ(A), 1 − γ(A)). ψ Proof. By Proposition 2.1.5, for any n ∈ N and A ∈ BI we have Z U n f0 (x) − U ∞ f0 µ(τ −n (A)) − γ(A) = dx (2.2.26) x+1 A

since

Z ∞

U f0 =

1 f0 dγ = log 2

I

Note that Z Z γ(dx) I

x

0

Z F00 dλ =

1 . log 2

I

|| ψ || ψ dλ ≤ log 2

Z 0

1

µ ¶ x dx 1 = || ψ || −1 x+1 log 2

(2.2.27)

and µ(τ −n (A)) − γ(A) = γ(Ac ) − µ(τ −n (Ac ))

(2.2.28)

for any n ∈ N and A ∈ BI . Now, (2.2.25) follows from (2.2.26) through (2.2.28) and Theorem 2.2.8. 2 Corollary 2.2.10 The spectral radius of the operator U − U ∞ in L(I) equals λ0 . Proof. Obvious by Theorem 2.2.8.

2

As an application of Theorem 2.2.8 we shall derive the asymptotic behaviour of γa (uan < x), x ≥ 1, as n → ∞ for any a ∈ I. While it is natural to think that for any a ∈ I the limit distribution function lim γa (uan < x)

n→∞

is the common distribution function γ¯ (¯ u1 < x), x ≥ 1, of the extended random variables u ¯` , ` ∈ Z,—cf. the last paragraph of Subsection 1.3.3—it


is somewhat surprising to find out that the (exact) convergence rate is O(λn0 ) for most a ∈ I. Theorem 2.2.11 For any n ∈ N+ and x ≥ 1 we have ¯ ¯ sup ¯γa (uan+1 < x) − H(x)¯

(2.2.29)

a∈I

I(1,∞) (x) n λ0 (1 + (0.94932)n ), x µ ¶ x−1 1 log x − if 1 ≤ x ≤ 2, log 2 x

≤ 3.2228 where

      H(x) =

    

1 log 2

µ ¶ 1 log 2 − x

if

x ≥ 2.

In (2.2.29), λ0 cannot be replaced by a smaller constant, and the exact convergence rate to 0 of the left hand side of (2.2.29) is O(λn0 ). Proof. By Proposition 1.3.10, for any a ∈ I, x ≥ 1, and n ∈ N+ we have µ ¶ san + 1 a γa (un+1 < x|a1 , . . . , an ) = 1 − I(san +1,∞) (x). x Hence γa

µ uan+1

¯ ¶ µ ¶ 1 ¯¯ 1 a ≥ ¯ a1 , . . . , an = 1 − (1 − t(sn + 1))I(san +1,∞) t t a a = min(1, t(sn + 1)) = ft (sn )

for any a ∈ I, t ∈ (0, 1], and n ∈ N+ , with ft (y) = min(1, t(y + 1)),

y ∈ I.

Therefore, by Proposition 2.1.10, ¯ ¶ µ µ µ ¶¶ 1 ¯¯ 1 1 a = E γa un+1 ≥ ¯ a1 , . . . , an γa un+1 ≥ = U n ft (a), (2.2.30) t t for any a ∈ I, t ∈ (0, 1], and n ∈ N+ . It is easy to check that (2.2.30) holds for n = 0, too. Clearly, ft ∈ L(I) for any t ∈ (0, 1], and  t  if 0 < t ≤ 1/2,   Z  log 2 U ∞ ft = ft (y)γ(dy) =  I  1   (1 − t + log(2t)) if 1/2 ≤ t ≤ 1. log 2


Next, 0 ≤ ft0 (y) ≤ tI(0,1) (t), t ∈ (0, 1], y ∈ I. Hence osc and

ft0 ≤ 5.348396 tI(0,1) (t) ψ

¯ ¯ ¯G(ft0 )¯ ≤ || G || || ft0 || ≤ 5.348396 tI(0,1) (t)

for any t ∈ (0, 1]. Finally, Z

Z γ(dx)

I

x

ψdλ ≤ 0

=

¶ Z µ 1 3.41 1 dx − log 2 I 1.312659 x + 1.312659 x + 1 µ ¶ 1 3.41 2.312659 1 log − 0.312659 log 2 1.312659 1.312659

≤ 0.60256. Consequently, Theorem 2.2.8 yields ¯ µ ¯ ¶ ¯ ¯ 1 ∞ ¯ a ¯ sup ¯γa un+1 ≥ − U ft ¯ ≤ 3.2228 t I(0,1) (t)λn0 (1 + (0.94932)n ) t a∈I for any n ∈ N and t ∈ (0, 1]. Hence, by putting 1/t = x, (2.2.29) follows. Finally, the assertion concerning the optimality of λ0 also follows from Theorem 2.2.8. 2 Remarks. 1. The convergence of λ(un < x) to H(x), x ≥ 1, as n → ∞ was first sketchy proved by Doeblin (1940, p. 365) with an unspecified convergence rate. A detailed proof following Doeblin’s suggestions was given by Samur (1989, Lemma 4.5) together with a slower convergence rate than that occurring in Theorem 2.2.11. 2. Theorem 2.2.8 shows that the convergence rate to 0 as n → ∞ of ¯ ¯ sup sup ¯γa (uan+1 < x) − H(x)¯ a∈I x≥1

is O(λn0 ). It is possible for some a ∈ I that the convergence rate to 0 as n → ∞ of ¯ ¯ sup ¯γa (uan+1 < x) − H(x)¯ x≥1

is O(αn ) with 0 < α < λ0 . It follows from equation (2.2.22), which is valid for f ∈ L(I) too, that this happens if and only if a ∈ E, with E defined in


Remark 1 following Theorem 2.2.6. In particular, 0 and 1 do not belong to E, thus sup |λ(un+1 < x) − H(x)| = O(λn0 ) x≥1

and

¯ ¯ sup ¯γ1 (u1n+1 < x) − H(x)¯ = O(λn0 ) x≥1

as n → ∞. It would be interesting to effectively determine elements of E.2 The asymptotic behaviour as n → ∞ of the probability density of n ∈ N+ , a ∈ I, which exists a.e. by Corollary 1.3.11, can be established using a result to be proved later in Subsection 2.5.3. Set  x−1  if 1 ≤ x ≤ 2,   2 log 2  x dH (x) h (x) = =  dx  1   if x ≥ 2. 2 x log 2

uan ,

Recalling that

 0        log(x + 1) G (x) =  log 2       1

if x ≤ 0, if 0 ≤ x ≤ 1, if x > 1,

it is easy to check that 1 H (x) = x

Z

x−1

G (s) ds,

x ≥ 1.

0

Corollary 1.3.11 then yields γa (uan

1 < x) − H (x) = x

Z 0

x−1 ¡

¢ Gan−1 (s) − G (s) ds

(2.2.31)

for any $a \in I$, $n \in \mathbb{N}_+$, and $x \ge 1$. Letting $D_x\gamma_a(u^a_n < x)$ denote any one of the four (two for $x = 1$) unilateral derivatives of $\gamma_a(u^a_n < x)$ at $x$, we can state the following result.

Proposition 2.2.12 For any $n \in \mathbb{N}_+$, $a \in I$, and $x \ge 1$ we have
$$\bigl|D_x\gamma_a(u^a_n < x) - h(x)\bigr| \le \frac{k_0\,\bigl[\min(x-1,1) + x\,I_{(1,2]}(x)\bigr]}{x^2\,F_{n-1}F_n},$$


where $k_0$ is a constant not exceeding 14.8.

The proof follows from (2.2.31) and Theorem 2.5.5. The details are left to the reader. $\Box$

Remark. The upper bound in Proposition 2.2.12 is $O(g^{2n})$ as $n \to \infty$ with $g = (\sqrt5 - 1)/2$, $g^2 = (3 - \sqrt5)/2 = 0.38196\cdots$. It is an open problem whether this yields the optimal convergence rate. $\Box$

Theorem 2.2.11 and Proposition 2.2.12 can be restated in terms of the approximation coefficients defined in Subsection 1.3.2. Indeed, by (1.3.6) we have $u_{n+1} = u^0_{n+1} = \Theta_n^{-1}$, $n \in \mathbb{N}$, and the results below are easily checked.

Theorem 2.2.13 For any $n \in \mathbb{N}_+$ and $t \in I$ we have
$$\bigl|\lambda(\Theta_n \le t) - \tilde H(t)\bigr| \le 3.2228\,t\,I_{(0,1)}(t)\,\lambda_0^n\bigl(1 + (0.94932)^n\bigr)$$
and
$$\bigl|D_t\lambda(\Theta_n \le t) - \tilde h(t)\bigr| \le k_0\,\frac{\min(t^{-1}-1,1) + t^{-1}I_{[1/2,1)}(t)}{F_nF_{n+1}},$$
where
$$\tilde H(t) = \begin{cases} \dfrac{t}{\log 2} & \text{if } 0 \le t \le 1/2,\\[2mm] \dfrac{1}{\log 2}\bigl(1 - t + \log(2t)\bigr) & \text{if } 1/2 \le t \le 1, \end{cases}$$
and
$$\tilde h(t) = \frac{d\tilde H}{dt} = \begin{cases} \dfrac{1}{\log 2} & \text{if } 0 \le t \le 1/2,\\[2mm] \dfrac{1}{\log 2}\left(\dfrac1t - 1\right) & \text{if } 1/2 \le t \le 1. \end{cases}$$

Remark. The first result above improves on the convergence rate obtained by Faivre (1998a) while the second one on that obtained by Knuth (1984). 2
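For a quick empirical look at Theorem 2.2.13, one can simulate $\Theta_n$ directly: draw $x$ uniformly, build the convergents $p_n/q_n$ with the usual recurrences, and compare the empirical distribution of $\Theta_n = q_n^2|x - p_n/q_n|$ with $\tilde H$. The sketch below is ours; the sample size, the value of $n$, and the floating-point computation of the convergents are pragmatic choices that are adequate only for small $n$, and it already matches $\tilde H$ to about two decimals at $n = 5$.

```python
import numpy as np

rng = np.random.default_rng(2)
M, n = 200_000, 5
x = rng.random(M)

# Convergents p_n/q_n of each sample via p_k = a_k p_{k-1} + p_{k-2}, etc.
p_old, q_old = np.ones(M), np.zeros(M)          # (p_{-1}, q_{-1})
p, q = np.zeros(M), np.ones(M)                  # (p_0, q_0)
y = x.copy()
for _ in range(n):
    a = np.floor(1.0 / y)
    y = np.maximum(1.0 / y - a, 1e-18)          # guard against an exact 0
    p_old, q_old, p, q = p, q, a * p + p_old, a * q + q_old

theta = q * np.abs(q * x - p)                   # Theta_n = q_n^2 |x - p_n/q_n|

H_t = lambda t: t / np.log(2) if t <= 0.5 else (1 - t + np.log(2 * t)) / np.log(2)
for t in (0.2, 0.4, 0.5, 0.7, 0.9):
    print(t, np.mean(theta <= t), H_t(t))       # empirical cdf vs H-tilde
```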

2.3 Babenko’s solution to Gauss’ problem

2.3.1 Preliminaries

Let H−1/2 = H denote the collection of all complex-valued functions f which are holomorphic in the half-plane Re z > −1/2, bounded in every half-plane


Re z > −1/2 + ε, ε > 0, and which satisfy ¶¯2 Z ¯ µ ¯ ¯ 1 ¯f − + iy ¯ dy < ∞. ¯ ¯ 2 R Note that H is known [see Duren (1970)] as the ordinary Hardy space of functions holomorphic in the half-plane Re z > −1/2, which is a Hilbert space with inner product (·, ·)H defined by µ ¶ µ ¶ Z 1 1 1 ∗ (f, g)H = f − + iy g − + iy dy, f, g ∈ H, 2π R 2 2 therefore a Banach space under the norm || · || H defined by à || f || H =

1 2π

¶¯2 !1/2 Z ¯ µ ¯ ¯ 1 ¯f − + iy ¯ dy , ¯ ¯ 2

f ∈ H.

R

Let L2 (R+ , BR+ , λ) = L2 (R+ ) denote the Hilbert space of square λintegrable functions ϕ : R+ → C with the usual scalar product Z (ϕ, ψ) = ϕψ ∗ dλ, ϕ, ψ ∈ L2 (R+ ) , R+

and norm

||ϕ||2 = (ϕ, ϕ)1/2 ,

ϕ ∈ L2 (R+ ) .

A Paley–Wiener theorem holds, giving a simple characterization of the elements of H [see Duren (1970)]: f ∈ H if and only if there exists ϕ ∈ L2 (R+ ) such that Z f (z) = e−zs−s/2 ϕ (s) ds, Re z > −1/2; R+

the function ϕ is unique (in the L2 -sense) and || f || H = || ϕ ||2 . In other words, the linear operator M : L2 (R+ ) → H defined by Z M ϕ (z) = e−zs−s/2 ϕ (s) ds, ϕ ∈ L2 (R+ ) , Re z > −1/2, R+

is an isometry and the image under M of L2 (R+ ) is H.

(2.3.1)


Notice that in Babenko (1978) an equivalent definition of H is considered. We follow here Mayer (1991). See also Hensley [(1992, p. 344) and (1994, p. 145)]. It is easy to check that the Perron–Frobenius operator Pλ of τ under λ takes H into itself. Obviously, for f ∈ H we define Pλ f by µ ¶ X 1 1 Pλ f (z) = f , Re z > −1/2. z+i (z + i)2 i∈N+

2.3.2 A symmetric linear operator

Consider the linear operator S : L2 (R+ ) → L2 (R+ ) defined by µ Sϕ (s) =

1 − e−s s

¶1/2 ϕ (s) ,

ϕ ∈ L2 (R+ ) , s ∈ R+ .

Clearly, S is invertible and µ S

−1

ϕ (s) =

s 1 − e−s

¶1/2 ϕ (s) ,

¡ ¢ ϕ ∈ S L2 (R+ ) , s ∈ R+ .

Consider also the linear operator A = SM −1 : H → L2 (R+ ) with inverse

¡ ¢ A−1 = M S −1 : S L2 (R+ ) → H.

Proposition 2.3.1 Define the symmetric linear operator K : L2 (R+ ) → by Z Kϕ (s) = k (s, t) ϕ (t) dt , ϕ ∈ L2 (R+ ) , s ∈ R+ ,

L2 (R+ )

R+

where k (s, t) =

¡ √ ¢ J1 2 st ((es − 1) (et − 1))1/2

,

s, t ∈ R+ ,

and J1 is the Bessel function of order 1 defined by J1 (s) =

s X (−1)k ³ s ´2k , 2 k! (k + 1)! 2 k∈N

s ∈ R+ .


Then Pλ = A−1 K A.

(2.3.2)

¡ ¢ Proof. Note first that the range of K is included in S L2 (R+ ) . Let ϕ ∈ L2 (R+ ) and put f = M ϕ ∈ H. We have A−1 K A f = M S −1 K S ϕ. But µ

¡ −1 ¢ S KSϕ (s) =

s 1 − e−s

Z = R+

¶1/2 Z

µ k (s, t)

R+

³ s ´1/2 t

e

s−t 2

1 − e−t t

¡ √ ¢ J1 2 st ϕ (t) dt, es − 1

¶1/2 ϕ (t) dt s ∈ R+ ,

whence ¡ √ ¢ t −zs − ³ s ´1/2 J1 2 st 2 e ϕ (t) dsdt t es − 1 R2+   ¶ µ Z X 1 t t   ϕ (t) dt, = − exp − z+k 2 (z + k)2 R+ k∈N

¡ ¢ M S −1 KSϕ (z) =

Z

+

for Re z > −1/2, on account of the identity X k∈N+

¡ √ ¢ µ ¶ Z ³ ´ 1 t s 1/2 −zs J1 2 st exp − = e ds z+k es − 1 (z + k)2 R+ t

which is valid for t ∈ R+ and Re z > −1 [see Watson (1944, formula 7.13.9)]. It remains to note that µ ¶ µ ¶ µ ¶ Z t 1 1 t − dt = (M ϕ) =f ϕ (t) exp − z+k 2 z+k z+k R+ for any k ∈ N+ and Re z > −1, to obtain µ ¶ X ¡ −1 ¢ 1 1 A KAf (z) = f = (Pλ f ) (z), z+k (z + k)2

Re z > −1/2.

k∈N+

2


As an integral symmetric linear operator with continuous kernel, K is a compact operator on L2 (R+ ) with only real eigenvalues λj , j ∈ N+ , satisfying lim |λj | = 0. j→∞

See, e.g., Kanwal (1997, Ch.7). Note that 0 cannot be an eigenvalue since Kϕ = 0 implies that ϕ = 0 by the invertibility of the Hankel transform. See, e.g., Magnus et al. (1966, Ch. 11). As usual, we order the eigenvalues according to their absolute values, that is, |λ1 | ≥ |λ2 | ≥ ... , where we list each eigenvalue according to its multiplicity. We then have X Kϕ = λj (ϕ, ϕj ) ϕj , ϕ ∈ L2 (R+ ) , (2.3.3) j∈N+

where ϕj is a (real-valued) eigenfunction corresponding to λj , that is Kϕj = λj ϕj , j ∈ N+ , and the ϕj , j ∈ N+ , define an orthonormal system in L2 (R+ ). Note that this system is complete since 0 is not an eigenvalue of K. We actually can prove more about K. For that we recall that a linear operator L on a Banach space B of norm || · || is called nuclear of order 0 (or of trace class) if and only if it can be written as X Lx = yi (x)xi , x ∈ B, i∈I

with

X (||yi || ||xi ||)r < ∞ i∈I

for any r > 0. Here I is a countable set while xi ∈ B and yi ∈ B ∗ = the dual Banach space of B (consisting of all bounded linear functional on B) for any i ∈ I. Such operators have been introduced and studied by Grothendieck (1955, 1956). They are compact and thus have discrete spectra. Moreover, most of matrix algebra can be extended to them. In particular, one can define the trace of such an operator as X X yi (xi ) = λj , Tr L = (2.3.4) i∈I

j∈N+

where λj , j ∈ N+ , are the eigenvalues of L, each of them counted with its multiplicity. The traces of the powers Ln , n ≥ 2, are also well defined. The analog of the characteristic polynomial of a matrix for a nuclear operator of


order 0, is known as the Fredholm determinant, which is an entire function of z ∈ C given by the formula Y det (Id − zL) = (1 − λj z). j∈N+

Then the equation 

 X zk det(Id − zL) = exp(−Tr log(Id − zL)) = exp − TrLk  k k∈N+

holds for |z| < 1. Hence Tr Ln =

X

λnj ,

n ∈ N+ .

j∈N+

Moreover, generalized traces defined as X |λj |ε j∈N+

exist for any ε > 0. Let us finally note that in some Banach spaces every bounded linear operator is nuclear of order 0. A typical example of such a Banach space is A∞ (D1 ), to be defined in Subsection 2.4.3. Proposition 2.3.2 K is a nuclear operator of trace class. Hence X |λj |ε < ∞ j∈N+

for any $\varepsilon > 0$. We have
$$\mathrm{Tr}\,K = \sum_{j\in\mathbb{N}_+}\lambda_j = \int_{\mathbb{R}_+} k(s,s)\,ds = \int_{\mathbb{R}_+}\frac{J_1(2s)}{e^s - 1}\,ds = 0.7711255237\cdots,$$
$$\mathrm{Tr}\,K^2 = \sum_{j\in\mathbb{N}_+}\lambda_j^2 = \iint_{\mathbb{R}_+^2} k(s,t)\,k(t,s)\,ds\,dt = \iint_{\mathbb{R}_+^2}\frac{J_1^2(2\sqrt{st})}{(e^s-1)(e^t-1)}\,ds\,dt = 1.103839654\cdots. \qquad (2.3.5)$$
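The numerical values in (2.3.5) can be reproduced from the series representations recorded in the Remark following the proof, namely $\mathrm{Tr}\,K = \tfrac12\sum_i(1 - i/\sqrt{i^2+4})$ and $\mathrm{Tr}\,K^2 = \tfrac12\sum_k\bigl((k+2)/\sqrt{k(k+4)} - 1\bigr)t(k)$. The truncation levels in the sketch below are arbitrary choices of ours; the slowly convergent second series only yields the first few decimals.

```python
import numpy as np

i = np.arange(1, 2_000_001, dtype=float)
print(0.5 * np.sum(1.0 - i / np.sqrt(i * i + 4.0)))        # Tr K, about 0.77112 after truncation

K = 100_000
t = np.zeros(K + 1, dtype=np.int64)
for d in range(1, K + 1):
    t[d::d] += 1                                           # t[k] = number of divisors of k
k = np.arange(1, K + 1, dtype=float)
print(0.5 * np.sum(((k + 2.0) / np.sqrt(k * (k + 4.0)) - 1.0) * t[1:]))   # Tr K^2, about 1.1037
```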


Proof. Consider the Laguerre polynomials L1n (s)

= (n + 1)!

n X

(−1)m

m=0

We have

Z R+

sm , (m + 1)!m! (n − m)!

¡ ¢2 se−s L1n (s) ds = n + 1,

n ∈ N, s ∈ R+ .

n ∈ N,

Z R+

se−s L1m (s) L1n (s) ds = 0,

m, n ∈ N, m 6= n.

¡ √ ¢ √ See, e.g., Magnus et al. (1966, Ch. 5). We expand J1 2 st / st, s, t ∈ R+ , in terms of the L1n (s) , n ∈ N, to obtain ¡ √ ¢ X J1 2 st √ = L1n (s) Cn (t), st n∈N

s, t ∈ R+ ,

where Cn (t) =

1 n+1

= n!

Z

n X

R+

X

m=0 k∈N

=

e−t tn , (n + 1)!

It follows that Kϕ =

X

se−s L1n (s)

¡ √ ¢ J1 2 st √ ds st

(−1)m+k (m + k + 1)!tk k! (k + 1)!m! (m + 1)! (n − m)! n ∈ N, t ∈ R+ .

(ϕ, βn ) αn ,

ϕ ∈ L2 (R+ ) ,

(2.3.6)

n∈N

where αn , βn ∈ L2 (R+ ) are given by αn (s) =

s1/2 L1n (s)

tn+1/2 e−t , β (t) = , n (es − 1)1/2 (et − 1)1/2 (n + 1)!

To prove the first assertion we should show that X n∈N

(||αn ||2 ||βn ||2 )r < ∞

s, t ∈ R+ .


P for any r > 0. Since (es − 1)−1 = k∈N+ e−ks , s ∈ R++ , the computation of ||αn ||2 reduces to that of a standard integral: X Z ¡ ¢2 ||αn ||22 = se−ks L1n (s) ds R+

k∈N+

=

¶µ ¶ n µ X n+1 X n+1 n (k − 1)2p , k 2n+2 p p p=0

k∈N+

¡n+1¢

≤ 2n+1 , 0 ≤ p ≤ n, we obtain ³ ´n 2 X (k − 1) + 1 ||αn ||22 ≤ 2n+1 (n + 1) ≤ 2n+1 (n + 1) ζ (2) . k 2n+2

and since

p

k∈N+

Z sm e−s ds = m!, m ∈ N, we have

Next, as R+

||βn ||22

= =

XZ 1 s2n+1 e−ks ds ((n + 1)!)2 k≥3 R+ ¡2n+1¢ X 1 (2n + 1)! X 1 n+1 = , 2 n+1 k 2n+2 ((n + 1)!) k≥3 k 2n+2 k≥3

n ∈ N.

Since X k≥3

1 k 2n+2

=

2 X X j=0 `∈N+

X

≤ 3

`∈N+

and

µ ¶ 2n + 1 ≤ 22n+1 , n+1

we obtain ||βn ||22

1 (3` + j)2n+2

1 = 3−2n−1 ζ (2n + 2) (3`)2n+2

ζ (2n + 2) ≤ ζ(2),

ζ (2) ≤ n+1

µ ¶2n+1 2 , 3

n ∈ N,

n ∈ N.

Finally, for any r > 0 we have X n∈N

µ (||αn ||2 ||βn ||2 )r ≤

¶r X ÃÃ √ !r !n 2 2 2 √ ζ (2) < ∞. 3 3 n∈N


The formulae for Tr K and Tr K 2 in the statement follow from (2.3.4) and (2.3.6) which as easily checked yield Z X Tr K = (αn , βn ) = k(s, s)ds, R+ n∈N ZZ X 2 k(s, t)k(t, s)dsdt. Tr K = (αm , βn )(αn , βm ) = R2+

m,n∈N

Concerning the numerical values of Tr K and Tr K 2 we refer the reader to Mayer and Roepstorff (1987, Section 3). 2 Remark. There is an interesting relationship between Tr K n and the non-zero fixed points of τ n for any n ∈ N+ . It can be shown [see Mayer and Roepstorff (1987, Section 3) and (1988, Section 3)] that " #−1 n Y X −2 −2 n n xi1 ···in xik ···in i1 ···ik−1 − (−1) Tr K = , i1 ,... ,in ∈N+

k=2

¤ with k=2 = 1, where xi1 ···in = i1 , . . . , in , i1 , . . . , in ∈ N+ . (For notation see Subsection 1.1.3.) Clearly, these quadratic irrationalities are all non-zero solutions of the equation τ n x = x. Hence ¡ ¢1/2 ´ 1 ³ xi1 ···in = pn−1 − qn + (pn−1 + qn )2 + 4(−1)n−1 2qn−1 Q1

£

for any n ∈ N+ and i1 , . . . , in ∈ N+ . Here, as usual, pn = [i1 , . . . , in ] , g.c.d.(pn , qn ) = 1, qn

n ∈ N+ ,

with p0 = 0, q0 = 1. In particular, µ 2 ¶1/2 i i xi = +1 − , i ∈ N+ , 4 2 µ 2 ¶1/2 j j j xij = + − , i, j ∈ N+ . 4 i 2 It is asserted in Babenko (1978, p. 140) that for any n ∈ N+ , in our notation, we have   n−1 X (−1) pn−1 + qn   Tr K n = 1 − ³ ´1/2  . 2 i1 ,... ,in ∈N+ (pn−1 + qn )2 + 4(−1)n−1


For n = 1 and n = 2 this is in agreement with the Mayer–Roepstorff formula, as easily checked. Clearly, Babenko’s formula is much simpler than Mayer– Roepstorff’s. It can be shown that it is true for any n ∈ N+ . See Subsection 2.4.3. Let us finally note that by the above we have µ ¶ i 1 X 1− √ Tr K = 2 i2 + 4 i∈N +

and Tr K 2 =

1 X 2

à p

i,j∈N+

=

1 X 2

k∈N+

Ã

ij + 2 ij (ij + 4)

! −1 !

k+2

p − 1 t(k), k(k + 4)

Q where t(k) is the number of divisors of k, equal to α (nα + 1) if 1 < k = Q nα 2 α pα is the factorization of k into distinct primes, and t(1) = 1. Corollary 2.3.3 The dominant eigenvalue λ1 of K is simple and is equal to 1. The corresponding eigenfunction ϕ1 is defined by ϕ1 (s) =

µ

1 (log 2)1/2

1 − e−s s

¶1/2 e−s/2 ,

s ∈ R+ .

Z sk e−s ds = k!, k ∈ N, we have

Proof. Since R+

Kϕ1 (s) = =

=

1

Z

(log 2)1/2 (es − 1)1/2 s1/2 (log 2)1/2 (es − 1)1/2

R+

³ √ ´ J1 2 st t−1/2 e−t dt

X (−1)k sk Z tk e−t dt k! (k + 1)! R+

k∈N

s1/2 (1 − e−s ) (log 2)1/2 (es − 1)1/2 s

= ϕ1 (s) ,

s ∈ R+ ,


and ||ϕ1 ||22 = =

Z

(1 − e−s ) e−s ds s R+ Z 1 X (−1)k+1 sk−1 e−s ds log 2 k! R+ 1 log 2

k∈N+

=

1 X (−1)k+1 = 1. log 2 k k∈N+

Thus 1 is an eigenvalue of K with corresponding eigenfunction ϕ1 . It should be the dominant eigenvalue since λn = 1 implies Tr K 2 ≥ n, which contradicts (2.3.5) unless n = 1. It should also be simple since λ1 = λ2 implies Tr K 2 ≥ 2, which contradicts again (2.3.5). 2 Concerning the remaining eigenvalues λn , n ≥ 2, we first have λ2 = −λ0 = −0.30366 30028 98732 65859 · · · (this follows from Theorem 2.2.5 and Theorem 2.3.5 below). Next, extensive computations [cf. Daud´e et al. (1997, Section 6) and MacLeod (1993)] yield λ3

= 0.10088 45092 93104 07530

··· ,

λ4

= −0.03549 61590 21659 84540 · · · ,

λ5

= 0.01284 37903 62440 26481

λ6

= −0.00471 77775 11571 03107 · · · ,

λ7

= 0.00174 86751 24305 51191

λ8

= −0.00065 20208 58320 50290 · · · ,

λ9

= 0.00024 41314 65524 51581

··· , ··· , ··· ,

λ10 = −0.00009 16890 83768 59330 · · · . It has been conjectured in Babenko (1978) that all eigenvalues λj , j ∈ N+ , are simple. Another conjecture [Mayer and Roepstorff (1988)] is that (−1)j+1 λj > 0, j ∈ N+ .
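The eigenvalues listed above can be approximated by a standard polynomial collocation of the transfer operator $U$: represent functions by their values at Chebyshev points of $I$, apply $U$ to the interpolating polynomial, and diagonalise the resulting matrix. This is a sketch of ours, not the method used by Babenko or MacLeod; the collocation order and the truncation of the series defining $U$ are arbitrary, and only the leading eigenvalues of the matrix are meaningful.

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

N = 24                                                        # collocation order
nodes = 0.5 - 0.5 * np.cos(np.pi * (np.arange(N) + 0.5) / N)  # Chebyshev points in [0, 1]

def U(f, x, M=2000):
    """(Uf)(x) = sum_{i>=1} (x+1)/((x+i)(x+i+1)) f(1/(x+i)); tail estimated by f(0)."""
    s = np.zeros_like(x)
    for i in range(1, M + 1):
        s += (x + 1.0) / ((x + i) * (x + i + 1.0)) * f(1.0 / (x + i))
    return s + (x + 1.0) / (x + M + 1.0) * f(0.0)

A = np.empty((N, N))
for k in range(N):                                            # k-th column: U applied to the
    e = np.zeros(N); e[k] = 1.0                               # polynomial interpolating e_k
    A[:, k] = U(Chebyshev.fit(nodes, e, deg=N - 1, domain=[0.0, 1.0]), nodes)

vals = np.linalg.eigvals(A)
vals = vals[np.argsort(-np.abs(vals))]
print(np.round(vals[:6].real, 6))   # expect about 1, -0.303663, 0.100884, -0.035497, ...
```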

2.3.3 An ‘exact’ Gauss–Kuzmin–Lévy theorem

Let us define the functions ψj ∈ H, j ∈ N+ , by µ ¶1/2 Z ¡ −1 ¢ s −zs−s/2 ψj (z) = A ϕj (z) = e ϕj (s) ds, 1 − e−s R+

Re z > −1/2.


Note that since λj ϕj = Kϕj implies |ϕj (s)| ≤ Cj s1/2 e−s/2 ,

s ∈ R+ ,

for some suitable Cj ∈ R+ , it follows that ψj is regular in the halfplane Re z > −1. It is possible to show that actually the ψj , j ∈ N+ , are regular outside a cut along the negative axis from −1 to ∞, which is the natural boundary of them. In particular, ¯∞ Z 1 1 e−(z+1)s ¯¯ −zs−s e ds = − ψ1 (z) = ¯ (log 2)1/2 R+ (log 2)1/2 z + 1 ¯0 (2.3.7) 1 1 = , Re z > −1. 1/2 z + 1 (log 2) Proposition 2.3.4 We have X

|ψj (z)|2 =

j∈N+

j∈N+

µ max |ψj (x)| ≤ x∈I

X

1 , (2 Re z + j)2

1 π2 − 6 4 log 2

Re z > −1/2,

(2.3.8)

¶1/2 = 1.13325209315 · · · ,

j ≥ 2. (2.3.9)

Proof. For any fixed z with Re z > −1/2 consider the function µ ¶1/2 s −zs−s/2 ϕ (s) = e , s ∈ R+ , 1 − e−s which clearly belongs to L2 (R+ ). On account of the completeness of the system (ϕj )j∈N+ , whose properties are described in the lines following equation (2.3.3), we can write X ej ϕ j , ϕ= j∈N+

where ej = (ϕ, ϕj ) = ψj (z) , Parseval’s equation then yields X j∈N+

j ∈ N+ .

|ej |2 = ||ϕ||22 .

Solving Gauss’ problem But

Z ||ϕ||22 =

113

¯ ¯ ¯ −zs−s/2 ¯2 ¯e ¯

R+

Z =

e

−2sRez

R+

=



s ds 1 − e−s

X Z s ds = e−(2 Re z+j)s s ds es − 1 R+ j∈N+

X

µ e−(2 Re z+j)s

j∈N+

=

X j∈N+

¯ ¶¯∞ ¯ s 1 ¯ + 2 2 Re z + j (2 Re z + j) ¯¯ 0

1 , (2 Re z + j)2

Re z > −1/2,

and (2.3.8) follows. Finally, (2.3.9) follows from (2.3.7) and (2.3.8) since min ψ1 (x) = x∈I

1 2 (log 2)1/2

. 2

Remarks. 1. It is conjectured in Babenko (1978, p.140) that ψj (0) 6= 0 and |ψj (0)| = maxx∈I |ψj (x)| , j ≥ 2. Note that ψ2 (0) 6= 0 is implicit in Wirsing (1974). 2. If ψj (0) 6= 0 for some j ≥ 2, then ψj (−i − [i1 , . . . , in ] + z) =

ψj (0) (−1)n+1 + O(1) n+2 (1 − λj ) z λj

as z → 0 for any n ∈ N+ , i, i1 , . . . , in ∈ N+ , in ≥ 2, with ε < |arg z| < π − ε whatever ε > 0. This was proved by Wirsing (1974) for j = 2, thus establishing the cut along the negative real axis from −1 to ∞ as the natural boundary of the functions ψ and ψe in Subsection 2.2.2. (See Remark 2 before Theorem 2.2.5.) It is asserted in Babenko & Jur0 ev (1978) that Wirsing’s reasoning also works for any j ≥ 3. 2 We are now able to prove an ‘exact’ Gauss–Kuzmin–L´evy theorem for the measures γa , a ∈ I (cf. Subsection 1.3.4). Theorem 2.3.5 For any a ∈ I, A ∈ BI , and n ∈ N+ we have Z X ¡ −n ¢ n−1 γa τ (A) − γ (A) = (a + 1) λj ψj (a) ψj dλ. (2.3.10) j≥2

A

114

Chapter 2 Z

Next, I

ψj dλ = 0, j ≥ 2, and ¯ ¯ ¯ ¯ Z `−1 X ¯ ¡ −n ¯ ¢ n−1 ¯γa τ (A) − γ (A) − (a + 1) ψj dλ¯¯ λj ψj (a) ¯ A ¯ ¯ j=2

¶ π 2 log 2 − 1 |λ` |n−1 min (γ (A) , 1 − γ (A)) ≤ 6 P for any a ∈ I, A ∈ BI , ` ≥ 2, and n ∈ N+ . (Clearly, 1j=2 = 0.) µ

Proof. For any a ∈ I consider the function ha defined by ha (z) =

a+1 , (az + 1)2

Re z > −1/2.

Note that h0 does not belong to H. Instead, the function Pλ ha (z) = (a + 1)

X i∈N+

1 , (z + a + i)2

Re z > −1/2,

does belong to H for any a ∈ I. By (2.3.2) and (2.3.3) for any g ∈ H and n ∈ N we have 

 Pλn g = A−1 K n A g = A−1 

X

λnj (Ag, ϕj ) ϕj  =

X

λnj (Ag, ϕj ) ψj .

j∈N+

j∈N+

Hence, for any n ∈ N+ and a ∈ I, Pλn ha = Pλn−1 (Pλ ha ) =

X

λn−1 (APλ ha , ϕj ) ψj . j

(2.3.11)

j∈N+

We assert that for any a ∈ I we have µ (APλ ha ) (s) = (a + 1) e

−s/2−as

s 1 − e−s

¶1/2 , s ∈ R+ .

(2.3.12)

This can be checked as follows. Since Pλ ha = M S −1 (APλ ha ), we have to

Solving Gauss’ problem

115

prove that this last equation holds with APλ ha given by (2.3.12). We have s e−s/2−as , s ∈ R+ , 1 − e−s Z ¡ ¢ se−s e−(z+a)s ds M S −1 APλ ha (z) = (a + 1) 1 − e−s R+ X Z se−(z+j+a)s ds = (a + 1) S −1 (APλ ha ) (s) = (a + 1)

R+

j∈N+

X

= (a + 1)

j∈N+

= Pλ ha (z),

1 (z + j + a)2

Re z > −1/2.

Thus (2.3.12) holds and we then have (APλ ha , ϕj ) = (a + 1) ψj (a),

a ∈ I, j ∈ N+ .

Therefore (2.3.11) and (2.3.13) imply that X λn−1 ψj (a) ψj , Pλn ha = (a + 1) j

(2.3.13)

a ∈ I, n ∈ N+ .

j∈N+

The last equation P holds in H . By (2.3.9), Proposition 2.3.2, and Corollary 2.3.3, the series j∈N+ λn−1 ψj (a) ψj is uniformly and absolutely convergent j in I for any a ∈ I and n ∈ N+ . Hence whatever a ∈ I and n ∈ N+ by (2.3.7) we have Pλn ha (x) −

X 1 = (a + 1) λn−1 ψj (a) ψ(x), j (x + 1) log 2

x ∈ I.

(2.3.14)

j≥2

Equation (2.3.10) follows by integrating the last equation over A ∈ BI since by the very definition of the Perron–Frobenius operator we can write Z Z Z Pλn ha dλ = ha dλ = dγa = γa (τ −n (A)), n ∈ N. τ −n (A)

A

Since Z I

τ −n (A)

¡ ¢ γ (da) γa τ −n (A) = γ(τ −n (A)) = γ (A) ,

n ∈ N, A ∈ BI ,

116

Chapter 2

if we divide equation (2.3.10) by (a + 1) (log 2) and integrate the equation obtained over a ∈ I, then we obtain Z Z X n−1 ψj dλ, n ∈ N+ , A ∈ BI . 0= λj ψj dλ j≥2

I

A

Taking A = I and n = 1 we deduce that Z ψj dλ = 0, j ≥ 2. I

Finally, for a ∈ I, A ∈ BI , ` ≥ 2, and n ∈ N+ set Da,`,n (A) = D (A) ¯ ¯ ¯ ¯ Z `−1 X ¯ ¡ −n ¯ ¢ n−1 ¯ = ¯γa τ (A) − γ (A) − (a + 1) ψj dλ¯¯ λj ψj (a) A ¯ ¯ j=2 and note that D (A) = D (I \ A). It follows from (2.3.10) that   Z X  D (A) ≤ (a + 1) |λ` |n−1 |ψj (a)| |ψj (x)| dx A

Z ≤ (a + 1) |λ` |n−1

A

Z = (log 2) |λ` |n−1

A

j≥`

1/2 1/2   X X  ψj2 (x) dx ψj2 (a)  j≥`

j≥`



1/2 X (a + 1)2 ψj2 (a) j≥`



1/2 X × (x + 1)2 ψj2 (x) γ (dx) . j≥`

Now, equation (2.3.8) implies  X X (a + 1)2 ψj2 (a) ≤ (a + 1)2 



1 1  2 − 2 (2a + j) (a + 1) log 2 j∈N+

j≥`

≤ ζ (2) −

1 log 2

(2.3.15)

Solving Gauss’ problem

117

for any a ∈ I and ` ≥ 2. (The last inequality can be easily checked.) We therefore obtain µ 2 ¶ π log 2 D (A) ≤ − 1 |λ` |n−1 γ (A) . 6 Since D (A) = D (I \ A) we conclude that µ D (A) ≤

¶ π 2 log 2 − 1 |λ` |n−1 min (γ(A), 1 − γ (A)) . 6

Note that

π 2 log 2 − 1 = 0.14018 · · · = ε2 6

(cf. Subsection 1.3.6 ).

2

Corollary 2.3.6 For any $a,x\in I$, $n\in\mathbb{N}_+$, and $\ell\ge 2$ we have
$$\gamma_a(\tau^n<x)-\gamma([0,x])=(a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\int_0^x\psi_j\,d\lambda,$$
$$\frac{d}{dx}\,\gamma_a(\tau^n<x)-\frac{1}{(x+1)\log 2}=(a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\,\psi_j(x),$$
$$\left|\gamma_a(\tau^n<x)-\gamma([0,x])-(a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\int_0^x\psi_j\,d\lambda\right|\le\left(\frac{\pi^2\log 2}{6}-1\right)|\lambda_\ell|^{n-1}\left(\frac12-\left|\frac12-\gamma([0,x])\right|\right),$$
$$\left|\frac{d}{dx}\,\gamma_a(\tau^n<x)-\frac{1}{(x+1)\log 2}-(a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\,\psi_j(x)\right|\le\left(\frac{\pi^2}{6}-\frac{1}{\log 2}\right)|\lambda_\ell|^{n-1}\,\frac{1}{x+1}.$$
Next (cf. Corollary 1.2.5), for any $a\in I$, $n,k\in\mathbb{N}_+$, and $i^{(k)}\in\mathbb{N}_+^k$ we have
$$\left|\frac{\gamma_a\big((a_{n+1},\cdots,a_{n+k})=i^{(k)}\big)}{\gamma\big(\big[u(i^{(k)}),v(i^{(k)})\big]\big)}-1\right|\le\left(\frac{\pi^2\log 2}{6}-1\right)\lambda_0^{n-1},$$
which for $k=1$ reduces to
$$\left|\frac{\gamma_a(a_{n+1}=i)}{(\log 2)^{-1}\log\big(1+1/i(i+2)\big)}-1\right|\le\left(\frac{\pi^2\log 2}{6}-1\right)\lambda_0^{n-1}$$
for any $a\in I$ and $i,n\in\mathbb{N}_+$.

Proof. The first equation is (2.3.10) for $A=[0,x)$, $x\in I$, while the second one is simply (2.3.14). (Clearly, the latter can be obtained from the former by differentiation.) The first inequality is that occurring in Theorem 2.3.5 for $A=[0,x)$, $x\in I$, while the second one is easily obtained using (2.3.15). Finally, the last inequality (the general case) is that occurring in Theorem 2.3.5 for $A=\big[u(i^{(k)}),v(i^{(k)})\big]$ and $\ell=2$. □

It is interesting to compare Theorem 2.2.5 (with $\mu=\gamma_a$, $a\in I$) and Corollary 2.3.6. It is easy to see that for any $a,x\in I$ we have
$$-\lambda_0\,G(f_a')\,\tilde\psi(x)=\psi_2(a)\int_0^x\psi_2\,d\lambda, \tag{2.3.16}$$
where
$$f_a(x)=\frac{x+1}{(ax+1)^2},\quad a,x\in I.$$
Differentiating (2.3.16) with respect to $x$ and then putting $x=a$ yields
$$\psi_2^2(a)=-\lambda_0\,G(f_a')\,\tilde\psi'(a),\quad a\in I.$$
In particular, $\psi_2^2(0)=-\lambda_0\,G(1)\,\tilde\psi'(0)=\lambda_0\,G(1)\,U^\infty\Psi\neq 0$ (since $G(1)>0$). Now, it follows from (2.3.16) that for any $x\in I$ such that $\tilde\psi'(x)\neq 0$ the ratio $\psi_2(x)/\tilde\psi'(x)$ has a constant value equal to $-(\operatorname{sgn}\psi_2(0))\big(\lambda_0 G(1)/U^\infty\Psi\big)^{1/2}$, and that for any $a\in I$ such that $\psi_2(a)\neq 0$ the ratio $G(f_a')/\psi_2(a)$ has a constant value equal to $G(1)/\psi_2(0)$. Then
$$\tilde\psi(x)=-(\operatorname{sgn}\psi_2(0))\left(\frac{U^\infty\Psi}{\lambda_0\,G(1)}\right)^{1/2}\int_0^x\psi_2\,d\lambda
\quad\text{and}\quad
\psi_2(x)=-(\operatorname{sgn}\psi_2(0))\left(\frac{\lambda_0\,G(1)}{U^\infty\Psi}\right)^{1/2}\tilde\psi'(x)$$
for any $x\in I$.

Remark. It follows from Corollary 2.3.6 that the exact convergence rate to 0 as $n\to\infty$ of
$$\sup_{x\in I}\,|\gamma_a(\tau^n<x)-\gamma([0,x])|,\quad a\in I, \tag{2.3.17}$$
is $O(\lambda_0^n)$ as long as $\psi_2(a)\neq 0$. In particular this holds for $a=0$ since, as we have just shown, $\psi_2(0)\neq 0$. If $\psi_2(a)=\cdots=\psi_{j-1}(a)=0$ and $\psi_j(a)\neq 0$ for some $j\ge 3$, then the exact convergence rate to 0 as $n\to\infty$ of (2.3.17) is $O(\lambda_j^n)$. The high accuracy computations of MacLeod (1993) show, however, that the only possible value of $j$ is $j=3$, since there exists a unique $a\in I$, very close to 0.4, with $\psi_2(a)=0$ while $\psi_3(a)\neq 0$. □

2.3.4 ψ-mixing revisited

Theorem 2.3.5 allows for an important improvement of Corollary 1.3.15. With the notation in Subsection 1.3.6, it follows from Theorem 2.3.5 that
$$\varepsilon_{n+1}\le\left(\frac{\pi^2\log 2}{6}-1\right)\lambda_0^{n-1},\quad n\in\mathbb{N}_+. \tag{2.3.18}$$
It is easy to check that for $n=1$ we actually have equality in (2.3.18), that is,
$$\varepsilon_2=\frac{\pi^2\log 2}{6}-1=0.14018\cdots,$$
in accordance with the result obtained in Subsection 1.3.6. We can thus reformulate Corollary 1.3.15 as follows.

Proposition 2.3.7 The sequence $(a_n)_{n\in\mathbb{N}_+}$ is ψ-mixing under $\gamma$ and any $\gamma_a$, $a\in I$. For any $a\in I$ we have $\psi_{\gamma_a}(1)\le 0.61231\cdots$ and
$$\psi_{\gamma_a}(n)\le\frac{\varepsilon_2\lambda_0^{n-2}(1+\lambda_0)}{1-\varepsilon_2\lambda_0^{n-1}},\quad n\ge 2.$$
In particular $\psi_{\gamma_a}(2)\le\varepsilon_2(1+\lambda_0)/(1-\varepsilon_2\lambda_0)=0.19087\cdots$ for any $a\in I$. Also, $\psi_\gamma(1)=\varepsilon_1=2\log 2-1=0.38629\cdots$, $\psi_\gamma(2)=\varepsilon_2=0.14018\cdots$, and $\psi_\gamma(n)\le\varepsilon_2\lambda_0^{n-2}$, $n\ge 3$.
The doubly infinite sequence $(\bar a_\ell)_{\ell\in\mathbb{Z}}$ of extended incomplete quotients is ψ-mixing under the extended Gauss measure $\bar\gamma$ and its ψ-mixing coefficients are equal to the corresponding ψ-mixing coefficients under $\gamma$ of $(a_n)_{n\in\mathbb{N}_+}$.

Remark. From Theorem 2.3.5 we can also obtain a formula expressing the ψ-mixing coefficients $\psi_\gamma(n)$, $n\ge 2$, in terms of the eigenvalues $\lambda_j$ and functions $\psi_j$, $j\ge 2$, as
$$\psi_\gamma(n+1)=(\log 2)\sup_{a,b\in I}(a+1)(b+1)\left|\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\,\psi_j(b)\right|,\quad n\in\mathbb{N}_+.$$
It is not difficult to check that the above formula yields $\psi_\gamma(2)=\varepsilon_2$. Otherwise it seems to be of little value. □
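The bounds of Proposition 2.3.7 are explicit enough to evaluate numerically. The sketch below (Python; it assumes only the numerical values of $\varepsilon_2=\pi^2\log 2/6-1$ and of Wirsing's constant $\lambda_0$ quoted above, and the tabulated range of $n$ is our own choice) prints the upper bounds for $\psi_{\gamma_a}(n)$ and $\psi_\gamma(n)$ for small $n$; for $n=2$ it should reproduce the value $0.19087\cdots$ mentioned in the proposition.

```python
import math

# Constants taken from the text: Wirsing's constant lambda_0 and
# eps_2 = pi^2 log 2 / 6 - 1 (= psi_gamma(2)).
lam0 = 0.303663
eps2 = math.pi**2 * math.log(2) / 6 - 1      # 0.14018...
eps1 = 2 * math.log(2) - 1                    # 0.38629... = psi_gamma(1)

print(f"eps_1 = {eps1:.5f}, eps_2 = {eps2:.5f}")
for n in range(2, 9):
    # Bound for psi_{gamma_a}(n), n >= 2, from Proposition 2.3.7.
    bound_a = eps2 * lam0**(n - 2) * (1 + lam0) / (1 - eps2 * lam0**(n - 1))
    # psi_gamma(2) = eps2 exactly; psi_gamma(n) <= eps2 * lam0^(n-2) for n >= 3.
    bound_g = eps2 if n == 2 else eps2 * lam0**(n - 2)
    print(f"n = {n}: psi_gamma_a(n) <= {bound_a:.5f},  psi_gamma(n) <= {bound_g:.5f}")
```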

2.4 Extending Babenko's and Wirsing's work

2.4.1 The Mayer–Roepstorff Hilbert space approach

In this subsection we describe the setting devised by Mayer and Roepstorff (1987) for Babenko’s work which is thus simplified and extended. Proofs are in general not given, and for them the reader is referred to the original paper. Let m denote the measure on BR+ with density dm t = t , dt e −1

t ∈ R+ .

Note that Z m (R+ ) =

X

t R+

e−kt dt =

X 1 = ζ (2) . k2

k∈N+

k∈N+

¡ ¢ Consider the Hilbert space L2 R+ , BR+ , m = L2m (R+ ) of m-square integrable functions f : R+ → C with inner product (·, ·)m defined by Z (ϕ, ψ)m = ϕψ ∗ dm, ϕ, ψ ∈ L2m (R+ ), R+

and norm

µZ |ϕ|2 dm

kϕk2,m =

R+

¶1/2 ,

ϕ ∈ L2m (R+ ).

Let D denote the half-plane Re z > −1/2 and consider the measure ν on BD with density  1 1 1   if − < x < 0, y ∈ R,  2 2 π (x + 1) + y 2 dν = dxdy    0 otherwise. Note that 1 ν (D) = π

Z

Z

0

dx −1/2

R

dy = (x + 1)2 + y 2

Z

0

−1/2

dx = log 2. x+1


Consider the Hilbert space H 2 (ν) of functions f holomorphic in D such ¯ ¯ ¯ ¯ that ¯(z + 1)−1 f (z)¯ is bounded in every half-plane Re z > −1/2 + ε, ε > 0, and µZ ¶ |f |2 dν

kf k2,ν =

1/2

< ∞,

D

with inner product (·, ·)ν defined by Z (f, g)ν = f g ∗ dν,

f, g ∈ H 2 (ν) .

D

Thus H 2 (ν) is a Banach space under the norm k·k2,ν . Let fe denote the restriction of f ∈ H 2 (ν) to I. Then Z

(f, 1)ν fedγ = log 2 I

U fe = ∞

and

° ° ° e° °f °

2,γ

≤ kf k2,ν .

(2.4.1)

(2.4.2)

Next, the linear mapping M : L2m (R+ ) → H 2 (ν) defined by Z M ϕ (z) = (z + 1) e−zt ϕ (t) m(dt), ϕ ∈ L2m (R+ ) , z ∈ D, R+

is an isometry and the image under M of L2m (R+ ) is H 2 (ν). The Perron–Frobenius operator U takes H 2 (ν) into itself. Obviously, for f ∈ H 2 (ν) we define U f by µ ¶ X 1 Pi (z) f U f (z) = , z ∈ D. z+i i∈N+

b : ϕ → Kϕ, b where The mapping K Z ³ √ ´ ϕ (t) b Kϕ (s) = J1 2 st √ m (dt) , st R+

ϕ ∈ L2m (R+ ) , s ∈ R+ ,

defines on L2m (R+ ) an integral symmetric linear operator with continuous kernel ¡ √ ¢ X (−1)n J1 2 st b √ k (s, t) = (st)n , s, t ∈ R+ . = n! (n + 1)! st n∈N


b has infinite-dimensional range, is nuclear (of trace class) and, therefore, K b and K (introduced in Subseccompact. The spectra of the operators K tion 2.3.2) coincide. Thus with the notation from Subsection 2.3.2 for the eigenvalues of K we have X b = Kϕ λk (ϕ, ϕ bk )m ϕ bk , ϕ ∈ L2m (R+ ), (2.4.3) k∈N+

bϕ where ϕ bk is an eigenfunction corresponding to λk , that is, K bk = λk ϕ bk , 2 k ∈ N+ , and the ϕ bk , k ∈ N+ , define an orthonormal basis in Lm (R+ ). Actually, ¡ ¢1/2 ϕ bk (t) = t−1/2 et − 1 ϕk (t) , k ∈ N+ , t ∈ R+ , where the ϕk , k ∈ N+ , are those introduced in Subsection 2.3.2. b and U are connected by the equation U = M KM b −1 . The operators M, K Hence b n M −1 , n ∈ N+ . Un = MK (2.4.4) From (2.4.3) we have X b nϕ = λnk (ϕ, ϕ bk )m ϕ bk , K

n ∈ N+ , ϕ ∈ L2m (R+ ) .

(2.4.5)

k∈N+

It then follows from (2.4.4) and (2.4.5) that X λnk (M −1 g, ϕ bk )m M ϕ bk , n ∈ N+ , g ∈ H 2 (ν) . U ng = k∈N+

Alternatively, U ng =

X

λnk (g, M ϕ bk )ν M ϕ bk ,

n ∈ N+ , g ∈ H 2 (ν) .

k∈N+

For k = 1 we have λ1 = 1 and ϕ b1 (t) =

1 1/2

(log 2)

Therefore

¡ ¢ t−1 et − 1 e−t ,

t ∈ R+ .

Z

Mϕ b1 (z) = (z + 1) =

R+

(z + 1) 1/2

(log 2)

e−zt ϕ b1 (t) m (dt)

Z

e−(z+1)t dt = R+

1 (log 2)1/2

,

z ∈ D,


and, by (2.4.1), (g, M ϕ b1 )ν M ϕ b1 =

1 (g, 1)ν = U ∞ ge, log 2

g ∈ H 2 (ν) .

b we also have As 0 is not an eigenvalue of K, X M −1 g = (M −1 g, ϕ bk )m ϕ bk , g ∈ H 2 (ν) , k∈N+

or, alternatively, g=

X

(g, M ϕ bk )ν M ϕ bk ,

g ∈ H 2 (ν) .

k∈N+

Then X ¯ X ° −1 °2 ¯2 2 −1 °M g ° ¯ ¯ |(g, M ϕ bk )ν |2 = ||g|| = (M g, ϕ b ) = m k 2,ν 2,m k∈N+

k∈N+

for any g ∈ H 2 (ν). Therefore ||U n g − U ∞ ge||22,ν

=

P

2n bk )ν |2 k≥2 |λk | |(g, M ϕ

³ ´ ≤ ||g||22,ν − |U ∞ ge|2 log 2 |λ2 |2n ,

(2.4.6)

for any n ∈ N+ and g ∈ H 2 (ν). Inequalities (2.4.2) and (2.4.6) imply the following result. Proposition 2.4.1 Let g ∈ H 2 (ν). Then for any n ∈ N+ we have ³ ´1/2 kU n ge − U ∞ gek2,γ ≤ ||g||22,ν − |U ∞ ge|2 log 2 |λ2 |n . Corollary 2.4.2 (L2 -version of the Gauss–Kuzmin–L´evy theorem) Let h : D → C such that the function z → (z + 1) h(z), z ∈ D, belongs to H 2 (ν) and the restriction of h to I is the Radon–Nikodym derivative with respect to λ of a probability measure µ on BI . Then |µ (τ −n (A)) − γ (A)| ≤ (log 2) γ

1/2

µ ZZ ¶1/2 1 1 2 (A) |h (x + iy)| dxdy − |λ2 |n π D log 2

(2.4.7)


for any n ∈ N+ and A ∈ BI . Proof. Let g (z) = (z + 1) h(z), z ∈ D. For any A ∈ BI and n ∈ N+ we have ¶1/2 ¯ ¯ µZ ¯ ¯ n ∞ 2 IA dγ kU n ge − U ∞ gek2,γ . ¯(IA , U ge − U ge)γ ¯ ≤

(2.4.8)

I

But 1 (IA , U ge − U ge)γ = log 2 n

Z



A

U n ge (x) − U ∞ ge dx x+1

and, by Proposition 2.1.5, Z A

¡ ¢ U n ge (x) − U ∞ ge dx = µ τ −n (A) − γ (A) x+1

since 1 U ge = log 2

Z



I

1 (x + 1) h (x) dx = . x+1 log 2

Therefore (2.4.8) amounts to ¯ ¡ −n ¯ ¢ ¯µ τ (A) − γ (A)¯ ≤ (log 2) γ 1/2 (A) kU n ge − U ∞ gek

2,γ

(2.4.9)

for any n ∈ N+ and A ∈ BI . Now, (2.4.7) follows from (2.4.9) and Proposition 2.4.1. 2 Remark. Inequality (2.4.6) can be obviously generalized as follows. For any n, ` ∈ N+ and g ∈ H 2 (ν) we have ¯¯ ¯¯2 ¯¯ ¯¯ X ¯¯ n ¯¯ n ¯¯U g − U ∞ ge − λk (g, M ϕ bk )ν M ϕ bk ¯¯¯¯ ¯¯ ¯¯ ¯¯ 2≤k≤`

2,ν

 ≤ ||g||22,ν − |U ∞ ge|2 log 2 −

 X

|(g, M ϕ bk )ν |2  |λ`+1 |2n

2≤k≤`

with the usual convention which assigns value 0 to a sum over the empty set. Proposition 2.4.1 and Corollary 2.4.2 can be accordingly generalized. 2


We can again derive the ‘exact’ Gauss–Kuzmin–L´evy Theorem 2.3.5. First, we clearly have Z ψbk (z) := M ϕ bk (z) = (z + 1) e−zt ϕ bk (t) m (dt) R+

Z =

(z + 1) R+

=

¡ ¢−1/2 e−zt t1/2 et − 1 ϕk (t) dt

(z + 1) ψk (z),

(2.4.10)

k ∈ N+ , z ∈ D.

Second, the function ga , a ∈ I, defined by ga (z) =

(a + 1) (z + 1) , (az + 1)2

z ∈ D,

does not belong to H 2 (ν) for a = 0. Instead, the function U ga (z) = (a + 1) (z + 1)

X j∈N+

1 , (z + a + j)2

z ∈ D,

does belong to H 2 (ν) for any a ∈ I. Then U n ga = U n−1 (U ga ) =

X

λn−1 (M −1 U ga , ϕ bk )m ψbk k

k∈N+

for any a ∈ I and n ∈ N+ . Now, it is easy to check that M −1 U ga (t) = (a + 1) e−at ,

a ∈ I, t ∈ R+ .

(2.4.11)

Hence Z (M

−1

U ga , ϕ bk )m = (a + 1)

R+

e−at ϕ bk (t) m(dt) = ψbk (a),

a ∈ I, k ∈ N+ .

Therefore U n ga =

X

λn−1 ψbk (a)ψbk , k

n ∈ N+ , a ∈ I,

k∈N+

which by (2.4.10) is identical with (2.3.14).

(2.4.12)

Note that by (2.4.11) for any $a\in I$ we have
$$\|Ug_a\|_{2,\nu}^2=\big\|M^{-1}Ug_a\big\|_{2,m}^2=(a+1)^2\int_{\mathbb{R}_+}\frac{t\,e^{-2at}}{e^t-1}\,dt=(a+1)^2\sum_{k\in\mathbb{N}_+}\int_{\mathbb{R}_+}t\,e^{-(2a+k)t}\,dt=(a+1)^2\sum_{k\in\mathbb{N}_+}\frac{1}{(2a+k)^2}, \tag{2.4.13}$$
that is, by Proposition 2.3.4,
$$\|Ug_a\|_{2,\nu}^2=(a+1)^2\sum_{k\in\mathbb{N}_+}|\psi_k(a)|^2.$$
This result is not at all surprising. It can be derived immediately from (2.4.12) with $n=1$ on account of the fact that $(\hat\psi_k)_{k\in\mathbb{N}_+}$ is an orthonormal basis in $H^2(\nu)$. (Remark that the $\psi_k$, $k\in\mathbb{N}_+$, are not pairwise orthogonal in $H$!) Next,
$$U^\infty U\tilde g_a=U^\infty\tilde g_a=\frac{1}{\log 2},\quad a\in I. \tag{2.4.14}$$
It then follows from Proposition 2.4.1 that for any $n\in\mathbb{N}_+$ we have
$$\|U^n\tilde g_a-U^\infty\tilde g_a\|_{2,\gamma}=\big\|U^{n-1}(U\tilde g_a)-U^\infty\tilde g_a\big\|_{2,\gamma}\le\Big(\|Ug_a\|_{2,\nu}^2-|U^\infty\tilde g_a|^2\log 2\Big)^{1/2}|\lambda_2|^{n-1}. \tag{2.4.15}$$

Proposition 2.4.3 For any $a\in I$, $n\in\mathbb{N}_+$ and $A\in\mathcal{B}_I$ we have
$$\big|\gamma_a(\tau^{-n}(A))-\gamma(A)\big|\le(\log 2)\,\gamma^{1/2}(A)\left((a+1)^2\sum_{k\in\mathbb{N}_+}\frac{1}{(2a+k)^2}-\frac{1}{\log 2}\right)^{1/2}|\lambda_2|^{n-1}. \tag{2.4.16}$$
Proof. The function
$$\tilde h_a(x)=\frac{\tilde g_a(x)}{x+1}=\frac{a+1}{(ax+1)^2},\quad x\in I,$$
is just the Radon–Nikodym derivative $d\gamma_a/d\lambda$. Now, (2.4.16) follows from (2.4.9) and (2.4.13) through (2.4.15). □

Remarks. 1. On account of the remark following Corollary 2.4.2, inequality (2.4.16) can be generalized as follows. For any $a\in I$, $\ell,n\in\mathbb{N}_+$, and $A\in\mathcal{B}_I$ we have
$$\left|\gamma_a\big(\tau^{-n}(A)\big)-\gamma(A)-(\log 2)\sum_{2\le k\le\ell}\lambda_k^{n-1}\hat\psi_k(a)\int_A\hat\psi_k\,d\gamma\right|
\le(\log 2)\,\gamma^{1/2}(A)\left((a+1)^2\sum_{k\in\mathbb{N}_+}\frac{1}{(2a+k)^2}-\sum_{1\le k\le\ell}\hat\psi_k^2(a)\right)^{1/2}|\lambda_{\ell+1}|^{n-1}. \tag{2.4.17}$$
2. It is instructive to compare the inequality in Theorem 2.3.5 with (2.4.17). The difference between them reflects the difference between the Hilbert spaces $H$ and $H^2(\nu)$. □
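The spectral picture described in this subsection can be probed numerically. The sketch below is a minimal Nyström discretization in Python (numpy and scipy assumed available; the truncation point T, the number of nodes and the quadrature rule are our own choices, not part of the text): it approximates the symmetric operator with kernel $J_1(2\sqrt{st})/\sqrt{st}$ on $L^2_m(\mathbb{R}_+)$ and lists the eigenvalues of largest modulus. One expects $\lambda_1=1$ and $\lambda_2\approx-0.3036$ (minus Wirsing's constant), the further eigenvalues decreasing rapidly in modulus.

```python
import numpy as np
from scipy.special import j1
from numpy.polynomial.legendre import leggauss

# Nystrom discretization of \hat K on L^2_m(R_+), m(dt) = t/(e^t - 1) dt,
# with kernel k(s,t) = J_1(2*sqrt(st))/sqrt(st).  We truncate R_+ at T;
# the density t/(e^t - 1) is exponentially small beyond that point.
T, N = 40.0, 400
x, w = leggauss(N)                      # Gauss-Legendre nodes/weights on [-1, 1]
t = 0.5 * T * (x + 1.0)                 # map nodes to (0, T)
w = 0.5 * T * w * t / np.expm1(t)       # quadrature weights for the measure m

st = np.sqrt(np.outer(t, t))
kernel = j1(2.0 * st) / st              # J_1(u) ~ u/2, so the kernel tends to 1 at 0

# Eigenvalues of W^(1/2) K W^(1/2) coincide with those of the Nystrom operator.
A = np.sqrt(w)[:, None] * kernel * np.sqrt(w)[None, :]
eig = np.linalg.eigvalsh(A)
eig = eig[np.argsort(-np.abs(eig))]
print(eig[:5])                          # expect approximately [1, -0.3036..., ...]
```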

2.4.2 The Mayer–Roepstorff Banach space approach

In this subsection we give a summary of the work of Mayer and Roepstorff (1988) on the u0 -positivity of the Perron–Frobenius operators Pλ and U = Pγ on a suitable Banach space. Let us first recall a few concepts concerning positive operators with respect to a cone in a real Banach space B. A closed convex subset C of B is called a cone if and only if (i) x ∈ C and a ∈ R+ imply ax ∈ C, and (ii) x ∈ C and −x ∈ C imply x = 0. A cone C induces a partial order ≤C (≤ for short): x ≤ y if and only if y − x ∈ C. A cone C is said to be reproducing if and only if B = C − C, that is, any z ∈ B can be written as z = x − y with x, y ∈ C. A linear operator T : B → B is said to be positive with respect to a cone C if and only if T C ⊂ C. Let C be a cone and 0 6= u0 ∈ C. A positive with respect to C operator T is said to be u0 -positive if and only if for any 0 6= x ∈ C there exist p ∈ N+ and α, β ∈ R++ such that αu0 ≤ T p x ≤ βu0 . Compact operators on the complexification of B, which are positive with respect to a reproducing cone C ⊂ B and u0 -positive for some 0 6= u0 ∈ C, enjoy properties similar to those of finite positive matrices. They obey


a generalization of the Perron–Frobenius theorem for such matrices. For details the reader is referred to Krasnoselskii (1964). Coming back to our problem, let D1 = (z ∈ C : |z − 1| < 3/2) and consider the collection A (D1 ) of all holomorphic functions in D1 which together with their first derivatives are continuous in D1 ; A (D1 ) is a Banach space under the norm à ! ¯ 0 ¯ kf k = max sup |f (z)| , sup ¯f (z)¯ , f ∈ A (D1 ) . z∈D1

z∈D1

Both operators Pλ and U take A (D1 ) into itself. Obviously, for f ∈ A (D1 ) we define Pλ f and U f by ¶ µ X 1 1 , z ∈ D1 , Pλ f (z) = f z+i (z + i)2 i∈N+

and U f (z) =

X

µ Pi (z) f

i∈N+

1 z+i

¶ ,

z ∈ D1 ,

respectively. Both Pλ and U are nuclear operators of trace class on A (D1 ). Let us write (compare with Subsection 2.1.2) Pλ = Π1 + T0 , where Z Π1 f (z) = f1 (z) f dλ, f ∈ A(D1 ), z ∈ D1 , I

and f1 (z) =

(log 2)−1 , z+1

z ∈ D1 .

Since Pλ (f1 f ) = f1 U f, f ∈ A (D1 ), the spectra of the operators Pλ and U on A (D1 ) are identical, algebraic multiplicities of the eigenvalues included. Theorem 2.4.4 The spectra of U on A (D1 ) and on H 2 (ν) (see Subsection 2.4.1) are identical, algebraic multiplicities of the eigenvalues included. Consider the subspaces µ ¶ Z ⊥ ∞ A (D1 ) = f ∈ A (D1 ) : U f = f dγ = 0 I


and e⊥ (D1 ) = A

µ ¶ Z f ∈ A(D1 ) : f dλ = 0 I

³ ´ ³ ´ e⊥ e⊥ (D1 ) of A (D1 ) and the real subspaces A⊥ of A⊥ (D1 ) A r (D1 ) Ar (D1 ) consisting of functions that take real values on R ∩ D1 = [−1/2, 5/2]. Note that by Proposition 2.1.1(ii) U leaves invariant both subspaces A⊥ (D1 ) and e⊥ (D1 ) and A e⊥ A⊥ A r (D1 ) while Pλ leaves invariant³both subspaces ´ ³ r (D1´). ⊥ e⊥ e⊥ (D1 ) . A A The complexification of A⊥ r (D1 ) is just A (D1 ) r (D1 ) e⊥ (D1 ) is identical with the spectrum of U on Also, the spectrum of T0 on A ⊥ A (D1 ). The set ³ ´ 0 C = f ∈ A⊥ r (D1 ) : f ≥ 0 on [−1/2, 5/2] is a reproducing cone in A⊥ r (D1 ) . Define u0 ∈ A(D1 ) by u0 (z) = z + 1 −

1 , log 2

z ∈ D1 .

Clearly, $u_0\in C$.

Theorem 2.4.5 The operator $-U$ on $A_r^\perp(D_1)$ is positive with respect to the cone $C$. Moreover, $-U$ is $u_0$-positive. Hence the operator $-U+U^\infty$ on $A(D_1)$ has a simple positive dominant eigenvalue equal to $\lambda_0$ (cf. Theorem 2.2.5) with eigenfunction $f_2$ in the interior $C^o$ of $C$. There is no other eigenfunction in $C$.

Corollary 2.4.6 The operator $-T_0$ on $\tilde A_r^\perp(D_1)$ is positive with respect to the (reproducing) cone $f_1C=(f_1f:f\in C)$. Moreover, $-T_0$ is $f_1u_0$-positive. Hence the operator $-T_0$ on $A(D_1)$ has a simple positive dominant eigenvalue equal to $\lambda_0$ with eigenfunction $\hat f_2=f_1f_2$. There is no other eigenfunction in $f_1C$.

Note that a minimax principle for $-\lambda_0$ holds. We namely have
$$\min_{f\in C^o}\;\max_{-1/2\le x\le 5/2}\frac{(Uf)'(x)}{f'(x)}=-\lambda_0=\max_{f\in C^o}\;\min_{-1/2\le x\le 5/2}\frac{(Uf)'(x)}{f'(x)}.$$
Hence
$$\min_{-1/2\le x\le 5/2}\frac{(Uf)'(x)}{f'(x)}\le-\lambda_0\le\max_{-1/2\le x\le 5/2}\frac{(Uf)'(x)}{f'(x)}$$
for any $f\in C^o$. For example, taking
$$f(z)=\frac{z+1}{z+1.14617}-c,\quad z\in D_1,$$
with $c$ chosen such that $f\in A^\perp(D_1)$, we obtain $0.2995\le\lambda_0\le 0.3038$, that is, an approximation which is good enough.
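The two-sided estimate just quoted is easy to reproduce. The sketch below (Python; the series cutoff, the grid on $[-1/2,5/2]$ and the finite-difference step are our own choices) evaluates the ratio $(Uf)'(x)/f'(x)$ for the trial function above; by the minimax principle, $-\lambda_0$ lies between the minimum and the maximum of this ratio, so the printed interval should be close to $0.2995\le\lambda_0\le 0.3038$.

```python
import numpy as np

# Trial function from the text; the additive constant c does not affect
# the ratio (Uf)'/f', since U maps constants to constants.
def f(x):
    return (x + 1.0) / (x + 1.14617)

def Uf(x, imax=100_000):
    # U f(x) = sum_{i>=1} (x+1)/((x+i)(x+i+1)) * f(1/(x+i));
    # the tail beyond imax is of order 1/imax and is ignored here.
    i = np.arange(1, imax + 1, dtype=float)
    return np.sum((x + 1.0) / ((x + i) * (x + i + 1.0)) * f(1.0 / (x + i)))

xs = np.linspace(-0.5, 2.5, 301)
h = 1e-5
ratios = []
for x in xs:
    dUf = (Uf(x + h) - Uf(x - h)) / (2 * h)      # central differences
    df = (f(x + h) - f(x - h)) / (2 * h)
    ratios.append(dUf / df)
ratios = np.array(ratios)
print(-ratios.max(), "<= lambda_0 <=", -ratios.min())
```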

2.4.3 Mayer–Ruelle operators

Statistical mechanics problems motivated the consideration of a class of operators including as a special case the Perron–Frobenius operator Pλ of τ under λ. This class has been thoroughly studied by Mayer (1990, 1991). Nowadays, these operators are named after him and D. Ruelle. Let D1 = (z ∈ C : |z − 1| < 3/2) and consider the collection A∞ (D1 ) of all holomorphic functions in D1 which are continuous in D1 ; A∞ (D1 ) is a Banach space under the supremum norm || f || = sup |f (z)| ,

f ∈ A∞ (D1 ).

z∈D1

For any β ∈ C with Re β > 1 and f ∈ A∞ (D1 ) define ¶ µ X 1 1 , z ∈ D1 . Gβ f (z) = f z+i (z + i)β i∈N+

It is easy to check that $G_\beta$ is a bounded linear operator on $A_\infty(D_1)$. Hence, as mentioned when discussing nuclear operators in Subsection 2.3.2, $G_\beta$ is nuclear of order 0 and thus has a discrete spectrum. For $\beta=2$, $G_\beta$ has the same analytical expression as $P_\lambda$. In what follows we give without proofs the most important properties of the Mayer–Ruelle operator $G_\beta$ for $\operatorname{Re}\beta>1$, which generalize those of $P_\lambda$. For proofs we refer the reader to Mayer (1990, 1991). See also Daudé et al. (1997), Faivre (1992), Flajolet and Vallée (1998, 2000), and Vallée (1997).

Theorem 2.4.7 Let $\beta$ be real, strictly greater than 1.
(i) The operator $G_\beta:A_\infty(D_1)\to A_\infty(D_1)$ has a positive dominant eigenvalue $\lambda(\beta)$ which is simple and strictly greater in absolute value than all other eigenvalues. The corresponding eigenfunction $g_\beta\in A_\infty(D_1)$ is strictly positive on $D_1\cap\mathbb{R}=[-1/2,5/2]$.
(ii) The map $\beta\to\lambda(\beta)$ defines on $(1,\infty)$ a strictly decreasing and log-concave function with
$$\lim_{\beta\downarrow 1}\lambda(\beta)=\infty,\quad\lambda(2)=1,\quad\lim_{\beta\to\infty}\frac{\log\lambda(\beta)}{\beta}=\log\frac{\sqrt5-1}{2}.$$
Moreover,
$$\lambda(\beta+u)\le\left(\frac{\sqrt5-1}{2}\right)^u\lambda(\beta),\quad u\in\mathbb{R}_+.$$

(iii) There exists a linear functional `β on A∞ (D1 ) with `β (gβ ) = 1 and `β (f ) > 0 for any f ∈ A∞ (D1 ) such that f |[−1/2,5/2] > 0 (here f |[−1/2,5/2] denotes the restriction of f to [−1/2, 5/2]). If Π1β denotes the projection defined as Π1β f = `β (f )gβ , f ∈ A∞ (D1 ), then Gβ = λ(β)Π1β + T0β with Π1β T0β = T0β Π1β = 0. Hence n Gnβ = λn (β)Π1β + T0β ,

n ∈ N+ .

(iv) The spectral radius ρ(β) of the linear operator T0β : A∞ (D1 ) → A∞ (D1 ) is strictly smaller than λ(β), and for any f ∈ A∞ (D1 ) such that f |[−1/2,5/2] > 0 we have Gnβ f (z) λn (β)`β (f )gβ (z)

µµ =1+O

ρ(β) λ(β)

¶n ¶

as n → ∞, where the constant implied in O is independent of z ∈ D1 (but dependent on f and β). (v) There exists ε = ε(β) > 0 such that for any α ∈ C satisfying |α − β| ≤ ε the dominant spectral properties of Gβ : A∞ (D1 ) → A∞ (D1 ) transfer to Gα : A∞ (D1 ) → A∞ (D1 ) : quantities λ(α), ρ(α), gα , `α (thus Π1α ) and T0α can be defined to represent the dominant spectral objects associated with Gα , and all of them are analytical with respect to α. Moreover, let a ∈ (ρ(β)/λ(β), 1) . For any f ∈ A∞ (D1 ) such that f |[−1/2,5/2] > 0 we have Gnα f (z) = 1 + O(an ) λn (α)`α (f )gα (z) as n → ∞, where the constant implied in O is independent of z ∈ D1 and α satisfying |α − β| ≤ ε, but depends on a, f , and β. Finally, ρ(β + it) < ρ(β) for t ∈ [−ε, ε] , t 6= 0. The proof is the same Perron–Frobenius type of argument used in the case β = 2, which has been sketched in the preceding subsection. There the


existence of a dominant simple real (in fact, negative) eigenvalue of T02 = T0 followed by considering the subspace A(D1 ) ⊂ A∞ (D1 ). 2 As in the special case β = 2, the Mayer–Ruelle operators enjoy better properties when defined on suitable Hilbert spaces. Let Re β > 1. Consider the collection H (β) of functions f which are holomorphic in the half plane Re z > −1/2, bounded in any half-plane Re z > −1/2 + ε, ε > 0, and can be represented in the form Z f (z) = e−zs ϕ(s)(β−1)/2 m0 (ds), Re z > −1/2, (2.4.18) R+

where m0 is the measure on BR+ with density  1  if s > 0, dm0  es − 1 =  ds  0 if s = 0, for some ϕ ∈ L2m0 (R+ ), the Hilbert space of m0 -square integrable functions ϕ : R+ → C with inner product (·, ·)m0 defined by Z (ϕ, ψ)m0 = ϕψ ∗ dm0 , ϕ, ψ ∈ L2m0 (R+ ) R+

and norm µZ ||ϕ||2,m0 =

2

|ϕ| dm R+

0

¶1/2 ,

ϕ ∈ L2m0 (R+ ).

Introducing the inner product (f1 , f2 )(β) = (ϕ1 , ϕ2 )m0 , where ϕi is associated with fi , i = 1, 2, by (2.4.18), H (β) is made a Hilbert space with norm || f ||(β) = ||ϕ||2,m0 , f ∈ H (β) , where f and ϕ are again associated by (2.4.18). Theorem 2.4.8 Let Re β > 1. (i) The linear operator Gβ takes boundedly H (β) into itself. (ii) For any f ∈ H (β) we have Z Gβ f (z) = e−zs Kβ ϕ(s)s(β−1)/2 m0 (ds), Re z > −1/2, R+


where Kβ : L2m0 (R+ ) → L2m0 (R+ ) is a symmetric integral operator defined by Z ³ √ ´ Kβ ϕ(s) = Jβ−1 2 st ϕ(t)m0 (dt), ϕ ∈ L2m0 (R+ ), s ∈ R+ . R+

Here Jβ−1 is the Bessel function of order β − 1 defined by Jβ−1 (u) =

³ u ´β−1 X 2

k∈N

³ u ´2k (−1)k , k! Γ(k + β) 2

u ∈ R+ .

Hence Gβ : H (β) → H (β) can be diagonalized in an orthonormal basis of H (β) . Moreover, if β ∈ R then Gβ is self-adjoint and its spectrum is real. (iii) The spectra of the operators Gβ : A∞ (D1 ) → A∞ (D1 ), Gβ : H (β) → H (β) and Kβ : L2m0 (R+ ) → L2m0 (R+ ) are identical. Hence for any real β > 1 these spectra are all real. Let us note in particular that for β = 2 the symmetric operator K2 from Theorem 2.4.8 is different from the symmetric operator K from Proposition 2.3.1. They are related by the simple relation K2 = SKS −1 , where S : L2 (R+ ) → L2m0 (R+ ) is an invertible linear operator defined by S ϕ(s) = (es − 1)1/2 ϕ(s),

s ∈ R+ .

Hence the spectra of $K$ and $K_2$ are identical.

As for $K$, formulae for the trace of $K_\beta$ and its powers are available. Denoting by $\lambda_i(\beta)$, $i\in\mathbb{N}_+$, the eigenvalues of $K_\beta$ taken in order of decreasing moduli and counting their multiplicity, we have
$$\operatorname{Tr}K_\beta=\sum_{i\in\mathbb{N}_+}\lambda_i(\beta)=\sum_{i\in\mathbb{N}_+}\frac{1}{y_i^{\beta-2}\,(y_i^2+1)},$$
where $y_i=\big(i+\sqrt{i^2+4}\big)/2$, $i\in\mathbb{N}_+$, and, in general,
$$\operatorname{Tr}K_\beta^n=\sum_{i\in\mathbb{N}_+}\lambda_i^n(\beta)=\sum_{i_1,\cdots,i_n\in\mathbb{N}_+}\frac{1}{y_{i_1\cdots i_n}^{\beta-2}\big(y_{i_1\cdots i_n}^2+(-1)^{n-1}\big)},$$
where
$$y_{i_1\cdots i_n}=\frac{p_{n-1}+q_n+\sqrt{(p_{n-1}+q_n)^2+4(-1)^{n-1}}}{2}$$
with, as usual, $p_n/q_n=[i_1,\cdots,i_n]$, $\gcd(p_n,q_n)=1$, $p_0=0$, $q_0=1$, for any $n\in\mathbb{N}_+$ and $i_1,\cdots,i_n\in\mathbb{N}_+$. Let us note that for $\beta=2$ we recover Babenko's formula for $\operatorname{Tr}K^n$, $n\in\mathbb{N}_+$. See the remark following the proof of Proposition 2.3.2. In particular [see Daudé et al. (1997), where $\operatorname{Tr}K_4$ is also expressed in closed form as an alternating series involving binomial coefficients and the values $\zeta(2i)-1-2^{-2i}$, $i\ge 2$], we have
$$\operatorname{Tr}K_4=0.14446\ 23962\ 46160\ 81588\cdots,\qquad\operatorname{Tr}K_4^2=0.04647\ 18256\ 42727\ 93983\cdots,$$
and
$$\lambda_1(4)=0.19945\ 88183\ 43767\ 26019\cdots,\quad\lambda_2(4)=-0.07573\ 95140\ 84360\ 60892\cdots,\quad\lambda_3(4)=0.02856\ 64037\ 69818\ 52783\cdots,$$
$$\lambda_4(4)=-0.01077\ 74165\ 76612\ 69829\cdots,\quad\lambda_5(4)=0.00407\ 09406\ 93426\ 42144\cdots.$$
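The first trace formula is easily checked numerically. The sketch below (Python; the cutoff is our own choice — for $\beta=4$ the terms decay like $i^{-4}$, so the truncation error is negligible) reproduces the value of $\operatorname{Tr}K_4$ quoted above.

```python
import numpy as np

def trace_K(beta, nmax=100_000):
    # Tr K_beta = sum_{i>=1} 1 / (y_i^(beta-2) * (y_i^2 + 1)),
    # with y_i = (i + sqrt(i^2 + 4)) / 2.
    i = np.arange(1, nmax + 1, dtype=float)
    y = 0.5 * (i + np.sqrt(i * i + 4.0))
    return np.sum(1.0 / (y ** (beta - 2) * (y * y + 1.0)))

print(trace_K(4.0))   # 0.14446 2396..., in agreement with the value quoted above
print(trace_K(2.0))   # Tr K = Tr K_2, Babenko's case (terms decay only like i^-2)
```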

To conclude this brief discussion of Mayer–Ruelle operators we mention two generalizations of them. a. For any subset M of N+ define GM,β f (z) =

X i∈M

1 f (z + i)β

µ

1 z+i

¶ ,

z ∈ D1 ,

whatever β ∈ C with Re β > 1 and f ∈ A∞ (D1 ). Clearly, GM,β is a bounded linear operator on A∞ (D1 ), hence a nuclear one of trace class, which coincides with Gβ when M = N+ . Now, for an arbitrarily fixed k ∈ N+ , let Mi , 1 ≤ i ≤ k, be subsets of N+ and write M = (M1 , . . . , Mk ). Consider the linear operator GM,β : A∞ (D1 ) → A∞ (D1 ) defined as GM,β = GMk ,β ◦ · · · ◦ GM1 ,β , which is nuclear of trace class, too. The operators GM,β for various M control the dynamics of continued fraction expansions of irrationals subject to periodical constraints. Their spectral properties are entirely similar to those of Gβ . For details see Vall´ee (1998), who considered systematically such operators. See, however, Fluch (1986, 1992) for special cases.


b. The second generalization has been motivated by the study of the transformation ¹ º 1 1 z → − Re , 0 6= z ∈ C, z z which extends to the complex domain the continued fraction transformation τ . Let µ ¶ 5 D2 = z : |z − 1| < , 4 and consider the collection B∞ (D2 ) of all functions F which are holomorphic 2 in D22 and continuous in D2 . Under the supremum norm || F || =

sup

|F (z, w)| ,

2 (z,w)∈D2

B∞ (D2 ) is a Banach space. Then for any (α, β) ∈ C2 with Re (α + β) > 1 a linear bounded operator Gα,β : B∞ (D2 ) → B∞ (D2 ) is defined by Gα,β F (z, w) =

X i∈N+

1 F α (z + i) (w + i)β

µ

1 1 , z+i w+i



for any F ∈ B∞ (D2 ) and (z, w) ∈ D22 . The spectral properties of Gα,β , which is positive and nuclear of trace class, are strongly related to those of Gα+β+2` , ` ∈ N. For details see Vall´ee (1997).

2.5 The Markov chain associated with the continued fraction expansion

2.5.1 The Perron–Frobenius operator on BV(I)

In this section we study the Perron–Frobenius operator $U$ on $BV(I)$. This is motivated by Proposition 2.1.10, which establishes $U$ as the transition operator of certain Markov chains. Throughout, except for Corollary 2.5.7, we consider just real-valued functions in $BV(I)$.

By Proposition 2.1.16, the operator $U$ defined by (2.1.16) is a bounded linear operator of norm 1 on $BV(I)$. Moreover, by Corollary 2.1.13 we have
$$\operatorname{var}Uf\le\tfrac12\operatorname{var}f$$
for any $f\in BV(I)$, the constant $1/2$ being optimal. Hence $\operatorname{var}U^nf\le 2^{-n}\operatorname{var}f$ for any $f\in BV(I)$ and $n\in\mathbb{N}_+$. As might be expected, we shall see that the constant $2^{-n}$ is not optimal for $n>1$. A natural problem thus arises: what is the upper bound of $\operatorname{var}U^nf/\operatorname{var}f$ over non-constant $f\in BV(I)$? A satisfactory answer to this problem will be given in Theorem 2.5.3 and Corollary 2.5.6.

It is easy to check by induction with respect to $n\in\mathbb{N}_+$ that
$$U^nf(x)=\sum_{i_1,\cdots,i_n\in\mathbb{N}_+}P_{i_1\cdots i_n}(x)\,f(u_{i_n\cdots i_1}(x)),\quad x\in I, \tag{2.5.1}$$
where
$$u_{i_n\cdots i_1}=u_{i_n}\circ\cdots\circ u_{i_1},\qquad P_{i_1\cdots i_n}(x)=P_{i_1}(x)\,P_{i_2}(u_{i_1}(x))\cdots P_{i_n}(u_{i_{n-1}\cdots i_1}(x)),\quad n\ge 2, \tag{2.5.2}$$
and the functions $u_i$ and $P_i$, $i\in\mathbb{N}_+$, are defined by
$$u_i(x)=\frac{1}{x+i},\qquad P_i(x)=\frac{x+1}{(x+i)(x+i+1)},\quad x\in I.$$
Note that by Proposition 2.1.10 we have $U^nf(x)=\mathrm{E}_x(f(s_n^x))$ for any $n\in\mathbb{N}$, $f\in B(I)$, and $x\in I$ (remember that $s_0^x=x$, $x\in I$), where $\mathrm{E}_x$ denotes the mean value operator with respect to the probability measure $\gamma_x$. As $s_n^x=u_{a_n\cdots a_1}(x)$, $x\in I$, $n\in\mathbb{N}_+$, we thus have
$$U^nf(x)=\sum_{i^{(n)}\in\mathbb{N}_+^n}\gamma_x\big((a_1,\cdots,a_n)=i^{(n)}\big)\,f(u_{i_n\cdots i_1}(x)) \tag{2.5.3}$$
for any $n\in\mathbb{N}_+$, $f\in B(I)$, and $x\in I$. Hence
$$P_{i_1\cdots i_n}(x)=\gamma_x\big(I(i^{(n)})\big) \tag{2.5.4}$$
for any $x\in I$, $n\in\mathbb{N}_+$, and $(i_1,\cdots,i_n)=i^{(n)}\in\mathbb{N}_+^n$. Of course, equation (2.5.4) could also be obtained by direct computation.
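Equation (2.5.4) can be checked numerically for particular digit blocks. The sketch below (Python; the blocks and the test points are arbitrary) computes $P_{i_1\cdots i_n}(x)$ from the product formula (2.5.2) and compares it with $\gamma_x(I(i^{(n)}))$, obtained from the endpoints $p_n/q_n$ and $(p_n+p_{n-1})/(q_n+q_{n-1})$ of the fundamental interval and the density $(x+1)/(xt+1)^2$ of $\gamma_x$; the two printed values should agree to machine precision.

```python
from fractions import Fraction

def u(i, x):                  # u_i(x) = 1/(x+i)
    return 1 / (x + i)

def P(i, x):                  # P_i(x) = (x+1)/((x+i)(x+i+1))
    return (x + 1) / ((x + i) * (x + i + 1))

def P_block(digits, x):       # product formula (2.5.2)
    prod, y = 1.0, x
    for i in digits:
        prod *= P(i, y)
        y = u(i, y)
    return prod

def endpoints(digits):        # p_n/q_n and (p_n+p_{n-1})/(q_n+q_{n-1})
    p0, q0, p1, q1 = 1, 0, 0, 1          # (p_{-1}, q_{-1}), (p_0, q_0)
    for i in digits:
        p0, q0, p1, q1 = p1, q1, i * p1 + p0, i * q1 + q0
    return Fraction(p1, q1), Fraction(p0 + p1, q0 + q1)

def gamma_x(x, a, b):         # gamma_x([a,b]) for the density (x+1)/(xt+1)^2
    a, b = min(a, b), max(a, b)
    return float((x + 1) * (b - a) / ((x * a + 1) * (x * b + 1)))

for digits in [(1, 1, 1), (2, 1, 3), (3, 5, 2, 7)]:
    for x in (0.0, 0.3, 1.0):
        e1, e2 = endpoints(digits)
        print(digits, x, P_block(digits, x), gamma_x(x, float(e1), float(e2)))
```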


Now, by (1.2.4), I(i(n) ) is the set of irrationals in the interval with endpoints pn /qn and (pn + pn−1 )/(qn + qn−1 ). Since  1/i1 if n = 1,    pn = [i1 , · · · , in ] = 1  qn  if n > 1  i1 + pn−1 (i2 , · · · , in )/qn−1 (i2 , · · · , in ) and pn + pn−1 qn + qn−1

  =

=



1/(i1 + 1)

if n = 1,

[i1 , · · · , in−1 , in + 1]

if n > 1

   

1/(i1 + 1)

if n = 1,

  

1 i1 + pn (i2 , · · · , in , 1)/qn (i2 , · · · , in , 1)

if n > 1

we can write Pi1 ···in (x) = (x + 1) × ×

1 × qn−1 (i2 , · · · , in )(x + i1 ) + pn−1 (i2 , · · · , in )

1 qn (i2 , · · · , in , 1)(x + i1 ) + pn (i2 , · · · , in , 1)

(2.5.5)

for any n ≥ 2, i(n) ∈ Nn+ , and x ∈ I. A useful alternative representation of U n f, n ∈ N+ , when f ∈ BV (I) is available. Proposition 2.5.1 If f ∈ BV (I) then for any n ∈ N+ and x ∈ I we have Z n U f (x) = U n I(a,1] (x)df (a) + f (0) with

R

[0,x) df

[0,1)

= f (x) − f (0), x ∈ I.

Proof. Since f can be represented as the difference of two non-decreasing functions, we may and shall assume that f is non-decreasing. Then for any x ∈ I we have Z f (x) − f (0) = [0,1)

I(a,1] (x)df (a).


By (2.5.1), using the above equation and Fubini’s theorem we obtain X U n f (x) = Pi1 ···in (x)f (uin ···i1 (x)) i1 ,··· ,in ∈N+

X

=

Z

Pi1 ···in (x)

[0,1)

i1 ,··· ,in

I(a,1] (uin ···i1 (x))df (a) + f (0)



Z

 X



= [0,1)

Pi1 ···in (x)I(a,1] (uin ···i1 (x)) df (a) + f (0)

i1 ,··· ,in ∈N+

Z = [0,1)

U n I(a.1] (x)df (a) + f (0)

for any n ∈ N+ and x ∈ I.

2

Corollary 2.5.2 For any n ∈ N+ we have var U n f f ∈BV (I) var f sup

= =

var U n f = f ∈B(I),f ↑ var f sup

var U n f f ∈B(I)f ↓ var f sup

sup var U n I(a,1] ,

a∈[0,1)

where the first three upper bounds are taken over non-constant functions f , and f ↑ (↓) means that f is non-decreasing (non-increasing). Proof. It is clear that var U n f var U n f = sup f ∈B(I),f ↓ var f f ∈B(I),f ↑ var f sup

since

var U n (−f ) var U n f = . var(−f ) var f

Next, let vn =

var U n f , f ∈B(I),f ↑ var f sup

n ∈ N+ .

Then (cf. the proof of Corollary 2.1.13) for any non-constant f ∈ BV (I) there exist two non-decreasing functions f1 and f2 such that f = f1 − f2 and var f = var f1 + var f2 . Therefore var U n f

≤ var U n f1 + var U n f2 ≤ vn (var f1 + var f2 ) = vn var f,

n ∈ N+ .


Hence

var U n f ≤ vn f ∈BV (I) var f sup

and since

var U n f var U n f ≥ sup = vn , f ∈BV (I) var f f ∈B(I),f ↑ var f sup

the first equation should hold. To derive the last equation let f ∈ B(I) be non-decreasing. Then U n f is a monotone function by Proposition 2.1.11, and Proposition 2.5.1 implies that Z ¡ n ¢ n n U f (1) − U f (0) = U I(a,1] (1) − U n I(a,1] (0) df (a) [0,1)

for any n ∈ N+ . Noting that I(a,1] : I → I is also a non-decreasing function for any a ∈ [0, 1), we obtain à ! var U n f ≤

sup var U n I(a,1]

var f.

a∈[0,1)

Hence, for any a ∈ [0, 1) and n ∈ N+ , var U n I(a,1] ≤

var U n f ≤ sup var U n I(a,1] f ∈B(I),f ↑ var f a∈(0,1] sup

and the proof is complete.

2.5.2

2

An upper bound

On account of Corollary 2.5.2 our guess for the upper bound of var U n f /var f over non-constant f ∈ BV (I) is given in the conjecture below. UB Conjecture. For any n ∈ N+ we have vn = sup var U n I(a,1] = var U n I(g,1] , a∈[0,1)

√ where g = [1, 1, 1, · · · ] = ( 5 − 1)/2 = 0.6180339 · · · . Without any loss of generality, throughout this subsection we assume that f ∈ BV (I) is non-decreasing. To simplify the writing put Pi1 ···in (0) = αi1 ···in , ui1 ···in (0) = βi1 ···in ,

i1 , · · · , in ∈ N+ .


If n is odd then by Proposition 2.1.11 and equations (2.5.1), (2.5.2), and (2.5.5) we have var U n f = U n f (0) − U n f (1) X

=

(2.5.6)

[Pi1 ···in (0)f (uin ···i1 (0)) − Pi1 ···in (1)f (uin ···i1 (1))]

i1 ,··· ,in ∈N+

X

=

i1 ,··· ,in ∈N+

X

=

[Pi1 ···in (0)f (uin ···i1 (0)) − 2P(i1 +1)i2 ···in (0)f (uin ···i2 (i1 +1) (0))] 



α1i2 ···in f (βin ···i2 1 ) −

i2 ,··· ,in ∈N+

X

α(i1 +1)i2 ···in f (βin ···i2 (i1 +1) ) .

i1 ∈N+

Similarly, if n is even then we have var U n f = U n f (1) − U n f (0) (2.5.7)   X X  α(i1 +1)i2 ···in f (βin ···i2 (i1 +1) ) − α1i2 ···in f (βin ···i2 1)  . = i2 ,··· ,in ∈N+

i1 ∈N+

It is easy to see that if n is odd then var U n I(a,1] has a constant value for ¶  · 1 1   , if n = 1,  j1 + 1 j1 a∈    [ [j1 , · · · , jn−1 , jn + 1], [j1 , · · · , jn ] ) if n > 1 while if n is even then var U n I(a,1] has a constant value for a ∈ [ [j1 , · · · , jn ], [j1 , · · · , jn−1 , jn + 1] ) , that is, in both cases, on the closure without the right endpoint of any fundamental interval I(j (n) ), j (n) = (j1 , · · · , jn ) ∈ Nn+ . Write 1(n) for (j1 , · · · , jn ) with jk = 1, 1 ≤ k ≤ n, n ∈ N+ . Then in particular for a ∈ [ [1(2m + 2)], [1(2m + 1)]) , that is,

·

F2m+1 F2m a∈ , F2m+2 F2m+1

m ∈ N,

¶ ,

m ∈ N,

(2.5.8)


we have v10 := var U I(a,1] = 1/2, X

v30 := var U 3 I(a,1] =



 X

α1i2 1 −

i2 ∈N+

α(i1 +1)i2 1  +

i1 ∈N+

X

α(i1 +1)11 ,

i1 ∈N+

and 0 v2m+1 := var U 2m+1 I(a,1]

=

m−2 X

X

h α1i2 i3 ···i2m−2q−1 (i2m−2q +1)1···1

q=0 i2 ,··· ,i2m−2q ∈N+



+

α(i1 +1)i2 i3 ···i2m−2q−1 (i2m−2q +1)1···1

i1 ∈N+

 X

i

X

α1i2 1···1 −

X



i1 ∈N+

i2 ∈N+

X

α(i1 +1)i2 1···1  +

α(i1 +1)1···1

i1 ∈N+

for m ≥ 2. (In the last equation the number of subscripts of the α’s is 2m + 1.) Similarly, for a ∈ [ [1(2m + 2)], [1(2m + 3)]) , that is,

·

F2m+1 F2m+2 , a∈ F2m+2 F2m+3 we have

¶ , X

v20 := var U 2 I(a,1] =

m ∈ N,

m ∈ N,

(2.5.9)

α(i1 +1)1 ,

i1 ∈N+ 0 v2m+2 := var U 2m+2 I(a,1]

=

m−1 X

X

q=0 i2 ,··· ,i2m−2q+1 ∈N+

+

X i1 ∈N+

h X

α(i1 +1)i2 i3 ···i2m−2q (i2m−2q+1 +1)1···1

i1 ∈N+

i − α1i2 i3 ···i2m−2q (i2m−2q+1 +1)1···1

α(i1 +1)1···1


for m ∈ N+ . (In the last equation the number of subscripts of the α’s is 2m + 2.) Since g belongs to all intervals (2.5.8) and (2.5.9), the UB Conjecture amounts to vn = vn0 , n ∈ N+ . The case n = 1. This case was dealt with in Proposition 2.1.12. Actually, writing i for i1 , equation (2.5.6) yields X αi+1 f (βi+1 ). var U f = α1 f (β1 ) − i∈N+

Hence var U I(a,1] = ·

1 1 for a ∈ , i+1 i



1 i+1

, i ∈ N+ and

v1 = sup var U I(a,1] = a∈[0,1)

1 = var U I(g,1] = v10 2

as g ∈ [1/2, 1). Thus in this case the UB Conjecture holds. The case n = 2. Write i for i1 and j for i2 . Then we have 1 , (ij + 1)(i(j + 1) + 1)

αij =

i, j ∈ N+ ,

and equation (2.5.7) yields   X X  α(i+1)j f (βj(i+1) ) − α1j f (βj1 ) var U 2 f = j∈N+

=

X

i∈N+

(2.5.10)

α(i+1)1 f (β1(i+1) )

i∈N+

 X

+

j∈N+



 X

α(i+1)(j+1) f (β(j+1)(i+1) ) − α1j f (βj1 ) .

i∈N+

Clearly, β(j+1)(i+1) < βj1 for any i, j ∈ N+ . Hence   X X X var U 2 f ≤ f (1) α(i+1)1 + f (βj1 )  α(i+1)(j+1) − α1j  . i∈N+

j∈N+

j∈N+


But X

X

α(i+1)(j+1) =

i∈N+

i∈N+



1 ((i + 1)(j + 1) + 1) ((i + 1)(j + 2) + 1)

X 1 1 (j + 1)(j + 2) (i + 1)2

(2.5.11)

i∈N+

= (ζ(2) − 1) α1j < α1j for any j ∈ N+ . Since f (βj1 ) ≥ f (0), j ∈ N+ , and   X X X  α(i+1)1 , α(i+1)(j+1) − α1j  = − j∈N+

i∈N+

i∈N+

(2.5.10) and (2.5.11) imply that X X α(i+1)1 var f α(i+1)1 (f (1) − f (0)) = var U 2 f ≤

(2.5.12)

i∈N+

i∈N+

for any non-decreasing f ∈ B(I). Now, note that for f = I(a,1] with a ∈ [1/2, 2/3), in particular for a = g, we have X α(i+1)1 , var U 2 I(a,1] = i∈N+

that is, the constant X

α(i+1)1 =

i∈N+

X i∈N+

¶ X µ 1 1 1 =2 − (i + 2)(2i + 3) 2i + 3 2i + 4 i∈N+

µ ¶ 1 1 1 7 = 2 log 2 − 1 + − + = log 4 − = 0.21962 · · · 2 3 4 6 occurring in (2.5.12) cannot be lowered. Therefore for n = 2 we have v2 = log 4 −

7 = 0.21962 · · · , 6

and the UB Conjecture holds in this case. The case n ≥ 3. We could try to treat this case similarly to the case n = 2. Using (2.5.5) it is not difficult to generalize inequality (2.5.11) to X α(i1 +1)(i2 +1)i3 ···in ≤ (ζ(2) − 1)α1i2 ···in < α1i2 ···in (2.5.13) i1 ∈N+


for any n ≥ 3 and i2 , · · · , in ∈ N+ . Next, to make a choice let us assume that n is odd. Then it is easy to see that βin ···i3 (i2 +1)(i1 +1) > βin ···i3 i2 1 , βin ···i3 1(i1 +1) > βin ···i3 1 , βin ···i3 i2 1 < βin ···i2 for any i1 , · · · , in ∈ N+ . Then by (2.5.6) and (2.5.13) we have  var U n f

X



−

i3 ,··· ,in ∈N+

X

¡ ¢ α(i1 +1)1i3 ···in f βin ···i3 1(i1 +1)

i1 ∈N+

 X

+



α1i2 i3 ···in −

i2 ∈N+

 

i3 ,··· ,in ∈N+

α(i1 +1)(i2 +1)i3 ···in  f (βin ···i3 i21 )

 X

 X

α1i2 i3 ···in −

i2 ∈N+

α(i1 +1)i2 i3 ···in  f (βin ···i3 )

i1 ∈N



 +



i1 ∈N

X



X

X



α(i1 +1)1i3 ···in  (f (βin ···i3 ) − f (βin ···i3 1 )) .

i1 ∈N+

(2.5.14) For an even n the corresponding inequality is   X X X   α(i +1)i i var U n f ≤ 1

i3 ,··· ,in ∈N

i2 ∈N+

 +



2 3 ···in

i1 ∈N+

 X

− α1i2 i3 ···in  f (βin ···i3 ) 

α(i1 +1)1i3 ···in  (f (βin ···i3 1 ) − f (βin ···i3 )) .

(2.5.15)

i1 ∈N+

Put  δi3 ···in = (−1)n−1

X i2 ∈N+

α1i2 i3 ···in −

 X i1 ∈N

α(i1 +1)i2 i3 ···in ) 


for any i3 , · · · , in ∈ N+ . Note that  X

δi3 ···in = (−1)n−1 α1 −

i3 ,··· ,in ∈N+

 X

αi1 +1  = 0.

(2.5.16)

i1 ∈N+

Using (2.5.5), which implies   Pi1 ···in (0) = (−1)n  



 1 1  − pn (i2 , · · · , in , 1) pn−1 (i2 , · · · , in )  i1 + i2 + qn (i2 , · · · , in , 1) qn−1 (i2 , · · · , in )

for any n ≥ 2 and i1 , · · · , in ∈ N+ , it is easy to see that δi3 ···in can be expressed in terms of the digamma function ψ as ¶ µ ¶ µ p0n−2 + p0n−3 p0n−2 −ψ 2+ 0 δi3 ···in = ψ 2 + 0 0 qn−2 qn−2 + qn−3      +

 X    ψ 2 +  

i∈N+

   1 1    − ψ 2 +   ,  0 0 0 pn−2  pn−2 + pn−3   i+ 0 i+ 0 0 qn−2 qn−2 + qn−3

0 = q (i , · · · , i 0 where p0m = pm (i3 , · · · , im+2 ), qm m 3 m+2 ), m ∈ N+ , and p0 = 0 0, q0 = 1. Let us recall that the digamma function can be expressed by the convergent series ¶ X X µ1 z−1 1 − = −C + ψ(z) = −C + j j+z−1 j(j + z − 1) j∈N+

j∈N+

for z 6= 0, −1, −2, · · · , where C = 0.57721 · · · is the Euler constant. As is well known, ψ satisfies the equation ψ(z + 1) = ψ(z) +

1 z

for z 6= 0, −1, −2, · · · . Tables for ψ can be found in Abramowitz and Stegun (1964). Putting X δ (n) (f ) = δi3 ···in f (βin ···i3 ), i3 ,··· ,in ∈N+


inequalities (2.5.14) and (2.5.15) imply that X var U n f ≤ δ (n) (f ) + α(i1 +1)1···1 var f

(2.5.17)

i1 ∈N+

for any n ≥ 3. Here we used the fact that α(i1 +1)1i3 ···in < α(i1 +1)11···1 for any n ≥ 3 and (i3 , · · · , in ) 6= 1(n − 2), which follows at once from (2.5.5). First, note that by (2.5.16) we have δ (n) (f ) ≤ Since

1 2

1 2

X

|δi3 ···in | (f (1) − f (0)).

(2.5.18)

i3 ,··· ,in ∈N+

X

X

|δi3 ···in | = sup

i3 ,··· ,in ∈N+

δi3 ···in ,

(i3 ,··· ,in )∈A

where the supremum is taken over all A ⊂ Nn−2 + , it follows that 1 2

X i3 ,··· ,in ∈N+

|δi3 ···in | ≥

1 X |δi | . 2 i∈N+

Hence the right hand side of (2.5.17) does not tend to 0 as n → ∞, and (2.5.18) is useless for n ≥ 3. As a matter of fact, it is a general result which does not take into account that f is non-decreasing. If for some given n ≥ 3 the inequality δ (n) (I(a,1] ) ≤ δ (n) (I(g,1] )

(2.5.19)

holds for any a ∈ [0, 1), then by (2.5.17) we have X var U n I(a,1] ≤ δ (n) (I(g,1] ) + α(i1 +1)1···1

(2.5.20)

i1 ∈N+

for any a ∈ [0, 1). It is easy to see that the right hand side of (2.5.20) is equal to vn0 . Since whatever n ∈ N+ we have var U n I(g,1) = vn0 , it follows from (2.5.20) that vn = vn0 . Thus if (2.5.19) holds then for the given n the UB Conjecture holds, too. In particular for n = 3, writing i, j, k for i1 , i2 , i3 , respectively, we have αijk =

1 , (i(jk + 1) + k)(i(j(k + 1) + 1) + k + 1)

i, j, k ∈ N+ .


It has been proved in Iosifescu (1994) that   X X α1jk − δk = α(i+1)jk  j∈N+

i∈N+

is positive for k = 1 and negative for k > 1. Then (2.5.19) clearly holds in this case. Hence the UB Conjecture holds for n = 3 and X α(i+1)11 v3 = δ1 + i∈N+

=

X µ j∈N+

¶ µ ¶¶ µ 2 1 1 −ψ 2+ +ψ 2+ (j + 2)(2j + 3) j+1 2j + 1

µ ¶ µ ¶ 2 1 +ψ 2 + −ψ 2+ 3 2 ¶ µ ¶¶ X µ µ 7 1 2 = log 4 − + ψ 2+ −ψ 2+ 6 j+1 2j + 1 j∈N+

3 3 + + +ψ 5 2

µ ¶ µ ¶ 2 2 1 − −2−ψ 3 3 2

4 π 7 17 − + log √ + √ 6 30 27 2 3 µ µ ¶ µ ¶¶ X 1 2 ψ 2+ + −ψ 2+ . j+1 2j + 1

= log 4 −

j∈N+

We have [see Iosifescu (1994, p. 115)] $0.09104<v_3<0.09759$, while a computation using MATHEMATICA yields $0.09436<v_3<0.09445$.

Returning to the general case, a good upper bound for $v_n$, $n\in\mathbb{N}_+$, is available. For a lower bound see further Corollary 2.5.6.

Theorem 2.5.3 We have
$$v_n\le\frac{k_0}{F_nF_{n+1}} \tag{2.5.21}$$
for any $n\in\mathbb{N}_+$. Here and throughout the remainder of this section, $k_0$ is a constant not exceeding 14.8.

Proof. Clearly, (2.5.21) holds for $n=1,2,3$ as was shown before. By Corollary 2.5.2 and on account of the constancy of the function $a\to\operatorname{var}U^nI_{(a,1]}$ on any fundamental interval of order $n$, we have
$$v_n=\sup_{a\in\Omega}\operatorname{var}U^nI_{(a,1]},\quad n\in\mathbb{N}_+.$$

If to make a choice we assume that n ∈ N+ is odd, then by Proposition 2.1.11 and equation (2.5.3) for any a ∈ I we have var U n I(a,1] = U n I(a,1] (0) − U n I(a,1] (1) X ¡ ¢ γ0 (I(i(n) ) − γ1 (I(i(n) )) I(a,1] (uin ···i1 (1))

=

i(n) ∈Nn +

+

X

γ0 (I(i(n) ))(I(a,1] (uin ···i1 (0)) − I(a,1] (uin ···i1 (1)).

i(n) ∈Nn +

(2.5.22) Note that if a ∈ Ω then just one of the differences I(a,1] (uin ···i1 (0)) − I(a,1] (uin ···i1 (1)),

i(n) ∈ Nn+ ,

is 6= 0 (and equal to 1). Also, for an arbitrarily given a = [j1 , j2 , · · · ] ∈ Ω the set n o (n) n i ∈ N+ : uin ···i1 (1) > a consists of the i(n) = (i1 , . . . , in ) ∈ Nn+ satisfying (i1 < j1 ),

if n = 1;

(i3 < j1 ) ∪ (i3 = j1 , i2 > j2 ) ∪ (i3 = j1 , i2 = j2 , i1 < j3 ),

if n = 3;

(in < j1 ) ∪ (in = j1 , in−1 > j2 ) ∪ (in = j1 , in−1 = j2 , in−2 < j3 ) ∪ · · · ∪(in = j1 , · · · , i3 = jn−2 , i2 > jn−1 ) ∪ (in = j1 , · · · , i2 = jn−1 , i1 < jn ), if n ≥ 5. Therefore, putting µ = γ0 − γ1 , it follows from (2.5.22) that for


a = [j1 , j2 , . . . ] ∈ Ω and any odd n ≥ 5 we have var U n I(a,1] ≤ |µ(an < j1 )| + |µ(an = j1 , an−1 > j2 )| + |µ(an = j1 , an−1 = j2 , an−2 < j3 )| + · · · + |µ(an = j1 , · · · , a3 = jn−2 , a2 > jn−1 |

(2.5.23)

+ |µ(an = j1 , · · · , a2 = jn−1 , a1 < jn | + maxi(n) ∈Nn γ0 (I(i(n) )). +

We shall use the inequalities |γ0 (A) − γ1 (A)| ≤ (log 2)γ(A), (2.5.24) |γa

(τ −n (A))

− γ(A)| ≤ (ζ (2) log 2 −

1)λ0n−1 γ(A),

which are valid for any a ∈ I, A ∈ BI , and n ∈ N+ , with λ0 = 0.303663 · · · (Wirsing’s constant). The first inequality follows by integrating over A the double inequality 1 2 1 − ≤1− , x ∈ I, ≤ 2 x+1 (x + 1) x+1 while the second one is the inequality in Theorem 2.3.5 for ` = 2. Note that (an < j1 ) = τ −n+1 (a1 < j1 ), (an = j1 , an−1 > j2 ) = τ −n+2 (a2 = j1 , a1 > j2 ), (an = j1 , an−1 = j2 , an−2 < j3 ) = τ −n+3 (a3 = j1 , a2 = j2 , a1 < j3 ), ···························································· (an = j1 , · · · , a3 = jn−2 , a2 > jn−1 ) = τ −1 (an−1 = j1 , · · · , a2 = jn−2 , a1 > jn−1 ) and (a2 = j1 , a1 > j2 ) ⊂ (a2 = j1 ) (a3 = j1 , a2 = j2 , a1 < j3 ) ⊂ (a3 = j2 , a2 = j2 ) ···························································· (an−1 = j1 , · · · , a2 = jn−2 , a1 > jn−1 ) ⊂ (an−1 = j1 , · · · , a2 = jn−2 ) (an = j1 , · · · , a2 = jn−1 , a1 < jn ) ⊂ (an = j1 , · · · , a2 = jn−1 ).

Next, by Theorem 1.2.2 we have max γ0 (I(i(n) )) =

i(n) ∈Nn +

1 (:= σ(n)), Fn Fn+1

and then max γ(I(i(n) )) ≤

i(n) ∈Nn +

2 σ(n), 3 log 2

n ∈ N+ ,

n ∈ N+ .

(2.5.26)

Now, by (2.5.24) through (2.5.26), with k1 = log 2 = 0.69315 · · · , k2 √ ¢ ¡√ ¢2 ¡ = ζ(2) log 2 − 1 = 0.14018, · · · , θ = g2 = 5 − 1 /4 = 3 − 5 /2 = 0.38196 · · · , it follows from (2.5.23) that var U n I(a,1] µ ¶ 4k2 k1 n−2 n−3 σ (n − 1) + σ (n) ≤ λ0 σ (0) + λ0 σ (1) + · · · + σ (n − 2) + 3 log 2 2k2 µ 4k2 σ (0) σ (1) = σ (n − 1) λn−2 + λ0n−3 + ··· 0 3 log 2 σ (n − 1) σ (n − 1) ¶ σ(n − 2) k1 + + + σ(n). σ(n − 1) 2k2 Since

σ (k) 1 ≤ θk−n−1 , σ (n) 2

and

8 σ (n − 1) ≤ , σ (n) 3

k, n ∈ N,

n ≥ 3,

we finally obtain n

var U I(a,1]

µ ¶ 16k1 16k2 ≤ 1+ + σ (n) . 9 log 2 9θ (θ − λ0 ) log 2

We have 16k1 16k2 16 1+ + =1+ 9 log 2 9θ (θ − λ0 ) log 2 9 log 2 32 = 1+ 9 log 2

Ã

µ k1 +

k2 θ (θ − λ0 )

log 2 ζ (2) log 2 − 1 √ √ ¢ ¡ + 2 7 − 3 5 − 3 − 5 λ0

= 14.780 · · · < 14.8

!




and the proof is complete for any odd n. The case of an even n can be treated similarly. 2 Corollary 2.5.4 Let f ∈ BV (I). For any n ∈ N we have || U n f − U ∞ f || ≤

k0 var f . Fn Fn+1

Proof. By (2.1.12) and Proposition 2.0.1 (i) we have || U n f − U ∞ f || ≤ var U n f,

n ∈ N,

and the result stated is implied by Theorem 2.5.3 for n ∈ N+ . The case n = 0 can be checked directly. 2 Remark. It was claimed in Iosifescu (1997, p.76) that Theorem 2.5.3 holds with k0 = 1/ log 2 for all n ∈ N large enough. (This is clearly true for n = 1, 2, or 3.) A flaw detected by Adriana Berechet in the method of proof in that paper invalidates the conclusion. We conjecture, however, that both Theorem 2.5.3 and Corollary 2.5.4 hold with k0 = 1/ log 2 for any n ∈ N. 2
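It is instructive to compare the bound (2.5.21) with the exact values obtained above for small $n$. The sketch below (Python; the truncation of the double series for $v_3$ at 10 000 terms is our own choice and limits the accuracy to roughly four decimals) evaluates $v_1=1/2$, $v_2=\log 4-7/6$ and $v_3=\delta_1+\sum_i\alpha_{(i+1)11}$ directly from the formula for $\alpha_{ijk}$, and sets them against $k_0/(F_nF_{n+1})$ with $k_0=14.8$ and against the conjectured constant $1/\log 2$.

```python
import math
import numpy as np

# Fibonacci numbers in the convention of the text: F_0 = F_1 = 1, F_2 = 2, ...
F = [1, 1, 2, 3, 5, 8]

def alpha(i, j, k=1):
    # alpha_{ijk} = 1/((i(jk+1)+k)(i(j(k+1)+1)+k+1)), as given in the text.
    return 1.0 / ((i * (j * k + 1) + k) * (i * (j * (k + 1) + 1) + k + 1))

# Brute-force v_3 = delta_1 + sum_i alpha_{(i+1)11}; N is our own cutoff.
N = 10_000
i = np.arange(2, N + 2, dtype=float)            # i+1 runs over 2, 3, ..., N+1
delta1 = 0.0
for j in range(1, N + 1):
    delta1 += alpha(1, j) - np.sum(alpha(i, j))
v3 = delta1 + np.sum(alpha(i, 1))

exact = {1: 0.5, 2: math.log(4) - 7.0 / 6.0, 3: v3}
for n in (1, 2, 3):
    bound = 14.8 / (F[n] * F[n + 1])
    conj = 1.0 / math.log(2) / (F[n] * F[n + 1])
    print(f"n={n}: v_n ~ {exact[n]:.5f}, bound (2.5.21) = {bound:.3f}, "
          f"conjectured (1/log 2)/(F_n F_n+1) = {conj:.4f}")
```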

2.5.3 Two asymptotic distributions

We are now able to derive the asymptotic behaviour of $\gamma_a(s_n^a\le x)$ as $n\to\infty$ for any $a,x\in I$.

Theorem 2.5.5 For any $a\in I$ and $n\in\mathbb{N}$ we have
$$\frac{a+1}{2(F_n+aF_{n-1})(F_{n+1}+aF_n)}\le\sup_{x\in I}|\gamma_a(s_n^a\le x)-G(x)|\le\frac{k_0}{F_nF_{n+1}}.$$
Proof. (i) The upper bound. We have already used in Subsection 2.5.1 the property of $U$ of being the transition operator of the Markov chain $(s_n^a)_{n\in\mathbb{N}}$ for any $a\in I$. Therefore in particular
$$U^nI_{[0,x]}(a)=\mathrm{E}_a\big(I_{[0,x]}(s_n^a)\big)=\gamma_a(s_n^a\le x)$$
for any $a,x\in I$ and $n\in\mathbb{N}$. As
$$U^\infty I_{[0,x]}=\int_I I_{[0,x]}\,d\gamma=\gamma([0,x])=G(x),\quad x\in I,$$


Corollary 2.5.4 yields the upper bound announced . (ii) The lower bound. We start with two simple remarks. First, using the continuity of G and the equations limh↓0 γa (san ≤ x − h) = γa (san < x) and limh↓0 γa (san < x + h) = γa (san ≤ x), x ∈ I, it is easy to see that sup |γa (san ≤ x) − G(x)| = sup |γa (san < x) − G(x)| x∈I

x∈I

for any a ∈ I and n ∈ N. Second, for any s ∈ I we have γa (san = s) = γa (san ≤ s) − G(s) − (γa (san < s) − G(s)) ≤ sup |γa (san ≤ x) − G(x)| + sup |γa (san < x) − G(x)| x∈I

x∈I

= 2 sup |γa (san ≤ x) − G(x)| . x∈I

Hence sup |γa (san ≤ x) − G(x)| ≥ x∈I

1 sup γa (san = s) 2 s∈I

(2.5.27)

for any a ∈ I and n ∈ N. Next, recall (see Subsection 2.5.1) that γa (san = [in , . . . , i2 , i1 + a]) = γa (I(i(n) )) = Pi1 ···in (a), µ γa sa1 =

1 i1 + a

n ≥ 2,

¶ = γa (I(i1 )) = Pi1 (a)

for any a ∈ I and (i1 , · · · , in ) = i(n) ∈ Nn+ . By (2.5.5) and (2.5.27) we then have sup γa (san = s) = P1(n) (a), a ∈ I, n ∈ N+ , (2.5.28) s∈I

where we write 1(n) for (i1 , · · · , in ) with i1 = · · · = in = 1, n ∈ N+ . With the convention F−1 = 0, by equation (2.5.5) again, P1(n) (a) = =

a+1 ((a + 1)Fn−1 + Fn−2 )((a + 1)Fn + Fn−1 ) a+1 , (Fn + aFn−1 )(Fn+1 + aFn )

(2.5.29)

a ∈ I, n ∈ N+ .

The lower bound announced now follows from (2.5.27) through (2.5.29). The case n = 0 can be checked directly. 2


Remarks. 1. It is easy to see that $P_{1(n)}(\cdot)$ is a decreasing function. Hence
$$P_{1(n)}(a)\ge P_{1(n)}(1)=\frac{2}{F_{n+1}F_{n+2}} \tag{2.5.30}$$
for any $a\in I$ and $n\in\mathbb{N}_+$.
2. Both lower and upper bounds in Theorem 2.5.5 are $O(g^{2n})$ as $n\to\infty$ with $g=(\sqrt5-1)/2$, $g^2=(3-\sqrt5)/2=0.38196\cdots$. Thus the optimal convergence rate has been obtained. □
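Theorem 2.5.5, in the form given for the measure $\lambda$ by Corollary 2.5.13 below, is easy to observe experimentally. The sketch below (Python; the sample size, the value of $n$ and the evaluation grid are our own choices) samples $x$ uniformly, computes $s_n^0=q_{n-1}/q_n$ from the first $n$ incomplete quotients of $x$, and compares the empirical distribution function with $G(x)=\log(1+x)/\log 2$; for $n=10$ the discrepancy is already at the level of the Monte Carlo noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def s_n(x, n):
    # s_n^0 = q_{n-1}/q_n, built from the digits a_1, ..., a_n of x.
    q_prev, q = 0, 1
    for _ in range(n):
        if x < 1e-12:            # guard: floating-point x may hit a rational
            break
        a = int(1.0 / x)
        x = 1.0 / x - a
        q_prev, q = q, a * q + q_prev
    return q_prev / q

n, samples = 10, 200_000
vals = np.sort([s_n(x, n) for x in rng.random(samples)])

grid = np.linspace(0.05, 0.95, 10)
G = np.log1p(grid) / np.log(2.0)                  # Gauss distribution function
emp = np.searchsorted(vals, grid) / samples       # empirical distribution function
print(np.max(np.abs(emp - G)))                    # typically of order 1/sqrt(samples)
```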

2 . Fn+1 Fn+2

Proof. As noted in the proof of Theorem 2.5.5, we have γa (san ≤ x) = U n I[0,x] (a), G(x) = U ∞ I[0,x] for any a, x ∈ I and n ∈ N. Then Theorem 2.5.5, inequality (2.5.30), and the argument used in the proof of Corollary 2.5.4 yield 2 ≤ sup || U n I[0,x] − U ∞ I[0,x] || ≤ sup var U n I[0,x] Fn+1 Fn+2 x∈I x∈I

(2.5.31)

for any n ∈ N. By Corollary 2.5.2 the proof is complete.

2

Remark. Theorem 2.5.3 and Corollary 2.5.6 show that vn = n → ∞, and this convergence rate is optimal.

O(g2n )

as 2

Corollary 2.5.7 The spectral radius of the operator U − U ∞ in BV (I) √ equals g2 = (3 − 5)/2 = 0.38196 · · · . Proof. We should show that à n

lim || U −

n→∞

U ∞ ||1/n v

= lim

n→∞

|| U n f − U ∞ f ||v sup || f ||v 06=f ∈BV (I)

!1/n = g2 .

The argument used in the proof of Corollary 2.5.4, and Theorem 2.5.3 yield || U n f − U ∞ f ||v = || U n f − U ∞ f || + var U n f ≤ 2 var U n f ≤

4k0 4k0 var f ≤ || f ||v Fn Fn+1 Fn Fn+1


for any n ∈ N and f ∈ BV (I). (We took into account that, as mentioned at the beginning of this section, here f is complex-valued. See the proof of Proposition 2.1.16.) Hence lim || U n − U ∞ ||1/n ≤ g2 . v

n→∞

The converse inequality follows by taking f = I[0,x] , x ∈ I, and using (2.5.31). 2 Theorem 2.5.5 allows a quick derivation of the asymptotic behaviour of γa (τ n ≤ x, san ≤ y) as n → ∞ for any a, x, y ∈ I, and of the (optimal) convergence rate, the same as above. Theorem 2.5.8 For any a ∈ I and n ∈ N we have a+1 2(Fn + aFn−1 )(Fn+1 + aFn ) ¯ ¯ ¯ log(xy + 1) ¯¯ n a ¯ ≤ sup ¯γa (τ ≤ x, sn ≤ y) − ¯ log 2 x,y∈I ≤

k0 . Fn Fn+1

Proof. Set Gan (y) = γa (san ≤ y), Hna (y) = Gan (y) − G(y), a, y ∈ I, n ∈ N. Theorem 2.5.5 yields

|Hna (y) | ≤

k0 , Fn Fn+1

a, y ∈ I, n ∈ N.

(2.5.32)

By the generalized Brod´en–Borel–L´evy formula (1.3.21), for any a, x, y ∈ I

and n ∈ N we have γa (τ n ≤ x, san

Z

y

≤ y) = 0

Z

γa (τ n ≤ x|san = z) dGan (z)

y

(z + 1)x a dGn (z) zx + 1 0 Z y Z y (z + 1)x dz (z + 1)x 1 = + dHna (z) log 2 0 zx + 1 z + 1 zx + 1 0 ¯ log(xy + 1) (z + 1)x a ¯¯z=y + H (z) = log 2 zx + 1 n ¯z=0 Z y x − x2 H a (z)dz. − 2 n 0 (zx + 1) =

[When applying formula (1.3.21) we used the fact that the σ-algebras generated by (a1 , · · · , an ) and by san are identical for any a ∈ I and n ∈ N+ .] Hence, by (2.5.32), ¯ ¯ ¯ ¯ ¯γa (τ n ≤ x, san ≤ y) − log(xy + 1) ¯ ¯ ¯ log 2 µ ¶ k0 (y + 1)x (x − x2 )y k0 ≤ + ≤ Fn Fn+1 xy + 1 xy + 1 Fn Fn+1 for any a, x, y ∈ I and n ∈ N, so that the upper bound holds. To get the lower bound we note that by Theorem 2.5.5 for any a ∈ I and n ∈ N we have ¯ ¯ ¯ log(xy + 1) ¯¯ n a ¯ sup ¯γa (τ ≤ x, sn ≤ y) − ¯ log 2 x,y∈I ¯ ¯ ¯ log(y + 1) ¯¯ n a ¯ ≥ sup ¯γa (τ ≤ 1, sn ≤ y) − log 2 ¯ y∈I = sup |γa (san ≤ y) − G(y)| ≥ y∈I

a+1 . 2(Fn + aFn−1 )(Fn+1 + aFn ) 2

Remarks. 1. We can replace γa (τ n ≤ x, san ≤ y) by λ(τ n ≤ x, san ≤ y) in the statement of ¯Theorem it is possible to relate these quantities ¯ 2.5.8 since 2 0 a ¯ ¯ by noticing that sn − sn ≤ 1/Fn , n ∈ N, a ∈ I. The new upper and lower bounds are of order O(g2n ) as n → ∞, too.


2. As noted at the end of Subsection 1.3.3, log(xy + 1)/ log 2, x, y ∈ I, is the joint distribution function under γ¯ of the extended random variables τ¯n and s¯n . 2

2.5.4

A generalization of a result of A. Denjoy

Sixty five years ago, A. Denjoy published a Comptes Rendus Note [see Denjoy (1936 b)] in which he sketched a proof of the fact that (in our notation) lim λ([a1 , · · · , an ] ≤ x, s0n ≤ y) =

n→∞

x log(y + 1) log 2

(2.5.33)

uniformly with respect to x, y ∈ I. Of course, for x = 1 this follows at once from Theorem 2.5.5. In this subsection we prove that (2.5.33) holds with λ replaced by any probability measure µ on BI absolutely continuous with respect to λ, in particular with λ replaced by any γa , a ∈ (0, 1]. An estimate of the convergence rate is also given . These will follow from Theorem 2.5.9 below. Since |[a1 , · · · , an ] − τ 0 | ≤ (Fn Fn+1 )−1 , n ∈ N+ , it is easy to see that for any probability measure µ on BI absolutely continuous with respect to λ, we have ¯ ¯ ¯µ([a1 , · · · , an ] ≤ x, s0n ≤ y) − µ(τ 0 ≤ x, s0n ≤ y)¯ ≤ max(µ(x − (Fn Fn+1 )−1 < τ 0 ≤ x), µ(x < τ 0 ≤ x + (Fn Fn+1 )−1 )) → 0 uniformly with respect to x, y ∈ I as n → ∞. This allows us to replace [a1 , · · · , an ] by τ 0 in (2.5.33) and its generalizations. Fix a ∈ I arbitrarily. Let f be a λ-integrable complex-valued function on I. Since γa is equivalent to λ for any a ∈ I, f is γa -integrable, too. Denote by Ek , k ∈ N, the set consisting of the endpoints of all fundamental intervals of rank `, 0 ≤ ` ≤ k. For any n ∈ N we associate with f a function fna which hasR a constant value on each fundamental interval of rank n. Specifically, f0a = I f dγa and Z 1 a fn (x) = f dγa , x ∈ I(i(n) ), i(n) ∈ Nn+ , γa (I(i(n) )) I(i(n) ) for n ∈ N+ . Clearly, Z I

Z fna dγa

= I

f dγa ,

n ∈ N.

(2.5.34)


Since for any n ∈ N+ and x ∈ I \ En there is a unique i(n) ∈ Nn+ such that x ∈ I(i(n) ) and since max γa (I(i(n) )) → 0 i(n) ∈Nn +

as n → ∞, by a well known property of the Lebesgue integral we have lim f a (x) n→∞ n

= f (x)

(2.5.35)

a.e. in I. It follows from (2.5.34) and (2.5.35) that Z lim |f − fna |dγa = 0.

(2.5.36)

n→∞ I

By (2.5.36) the right hand side of (2.5.37) below converges to 0 as n → ∞. Remark. It is easy to check that (fna )n∈N is a martingale on (I, BI , γa ) whatever a ∈ I. 2 Theorem 2.5.9 Let f be a λ-integrable complex valued function on I and let h ∈ BV (I) be real-valued. Then ¯Z ¯ Z Z ¯ ¯ ¯ f (h ◦ san ) dγa − f dγa hdγ ¯ ¯ ¯ I

I

I

µ Z ≤ inf || h || |f − fka | dγa + 0≤k≤n

I

(2.5.37)

¶ Z k0 var h |f |dγa Fn−k Fn−k+1 I

for any a ∈ I and n ∈ N. Proof. For any a ∈ I and k, n ∈ N+ , k ≤ n, we have Z f (h ◦ san )dγa I

=

X i(k) ∈Nk+

ÃZ (f − I(i(k) )

!

Z fka )(h



san )dγa

+ I(i(k) )

fka (h



san )dγa

Clearly, ¯ ¯ ¯ ¯ Z ¯ ¯ X Z ¯ ¯ a a (f − fk )(h ◦ sn )dγa ¯ ≤ || h || |f − fka |dγa . ¯ ¯ ¯ (k) k I(i(k) ) I ¯ ¯i ∈N+

(2.5.38) .

(2.5.39)


Next, for any fixed i(k) ∈ Nk+ we can write Z I(i(k) )

fka (h



san )dγa

1 = γa (I(i(k) )

Z

Z

I(i(k) )

f dγa

I(i(k) )

(h ◦ san )dγa . (2.5.40)

It is easy to check that a+1 , (qk + apk )(qk + qk−1 + a(pk + pk−1 ))

γa (I(i(k) )) = where

pk = [i1 , . . . , ik ], qk

g.c.d. (pk , qk ) = 1,

k ∈ N+ ,

and p0 = 0, q0 = 1. With the change of variable u=

pk + t pk−1 , qk + t qn−1

t ∈ I,

noting that 0

san (u) = san−k (t) for t ∈ Ω, where ½ a

0

= =

[ik , . . . , i2 , i1 + a] if k > 1, 1/(i1 + a) if k = 1

qk−1 + apk−1 , qk + apk

we obtain Z I(i(k) )

Z h(san (u))γa (du)

= (a + 1) I(i(k) )

Z

h(san (u))du (au + 1)2 0

= (a + 1) I

h(san−k (t))dt . (t(qk−1 + apk−1 ) + qk + apk )2

Hence 1 γa (I(i(k) ))

Z I(i(k) )

h(san (u))γa (du)

Z I

0 Z h(san−k (t))dt + 1) 0 2 I (a t + 1)

Z a0

=

=

(a0

(h ◦ sn−k )dγa0 =

a0

I

h(v) dGn−k (v),

(2.5.41)

Solving Gauss’ problem 0

159

0

where Gam (v) = γa0 (sam < v), m ∈ N, v ∈ I. By Theorem 2.5.5 we have k0 Fm Fm+1

|Gam (v) − G(v)| ≤ for any a, v ∈ I and m ∈ N. Then ¯Z ¯ Z ¯ ¯ 0 a ¯ h(v)dG hdγ ¯¯ n−k (v) − ¯ I

I

¯Z ¯ Z ¯ ¯ k0 var h 0 a = ¯¯ Gn−k (v)dh(v) − G(v)dh(v)¯¯ ≤ . F n−k Fn−k+1 I I It follows from (2.5.40) through (2.5.42) that ¯ ¯ ¯ ¯ Z Z ¯ X Z ¯ ¯ ¯ a a f (h ◦ s )dγ − f dγ hdγ ¯ ¯ a a n ¯ (k) k I(i(k) ) k ¯ I I ¯i ∈N+ ¯ k0 var h ≤ Fn−k Fn−k+1

(2.5.42)

(2.5.43)

Z I

|f |dγa .

Finally, (2.5.38), (2.5.39), and (2.5.43) for k = 0 and n ∈ N should be replaced by Z Z Z f (h ◦ san )dγa = (f − f0a )(h ◦ san )dγa + f0a (h ◦ san )dγa , (2.5.380 ) I

I

(2.5.390 )

¯ Z ¯ Z Z Z ¯ a ¯ a ¯f0 (h ◦ sn )dγa − f dγa hdγ ¯ ≤ k0 var h |f |dγa , ¯ ¯ Fn Fn+1 I I I I

(2.5.430 )

I

and

I

¯Z ¯ Z ¯ ¯ ¯ (f − f0a )(h ◦ san )dγa ¯ ≤ || h || |f − f0a | dγa , ¯ ¯ I

respectively. Now, (2.5.37) follows from (2.5.38), (2.5.380 ) (2.5.39), (2.5.390 ), (2.5.43), and (2.5.430 ). 2 Corollary 2.5.10 For any a, x, y ∈ I and n ∈ N we have ¯ ¯ ¯γa (τ 0 ≤ x, san ≤ y) − γa ([0, x])G(y)¯ µ ≤ inf δka (x) + 0≤k≤n

¶ k0 γa ([0, x]) Fn−k Fn−k+1

(2.5.44)


where

  0 2(a + 1)(x − ak )(bk − x) δka (x) =  (bk − ak )(ax + 1)2

if x ∈ Ek , if x ∈ (ak , bk ),

and [ak , bk ] is the closure of the (unique) fundamental interval of order k ∈ N containing x ∈ I \ Ek . Proof. Clearly, Z 0

γa (τ ≤

x, san

≤ y) = I

I[0,x] (I[0,y] ◦ san ) dγa

for any a, x, y ∈ I and n ∈ N. Theorem 2.5.9 applies with f = I[0,x] and h = I[0,y] , x, y ∈ I, yielding (2.5.44) since as is easy to see, in the present case Z |f − fka |dγa = δka (x), k ∈ N, a, x ∈ I. I

2 Corollary 2.5.11 For any a ∈ I and n ∈ N we have ¯ ¯ a+1 ≤ sup ¯γa (τ 0 ≤ x, san ≤ y) − γa ([0, x])G(y)¯ 2(Fn + aFn−1 )(Fn+1 + aFn ) x,y∈I µ ≤

a+1 + k0 2



1 . Fbn/2c Fbn/2c+1

(2.5.45)

k ∈ N, a, x ∈ I.

(2.5.46)

Proof. We clearly have δka (x) ≤

a+1 a+1 max λ(I(i(k) )) = , (k) 2 i 2Fk Fk+1

The upper bound from (2.5.45) follows by using (2.5.46) and taking k = bn/2c. Next, as in the proof of Theorem 2.5.8, we get ¯ ¯ supx,y∈I ¯γa (τ 0 ≤ x, san ≤ y) − γa ([0, x])G(y)¯ ≥ supy∈I |γa (san ≤ y) − G(y)| ≥

a+1 2(Fn + aFn−1 )(Fn+1 + aFn )

for any a ∈ I and n ∈ N, and so the lower bound holds, too.

2


Remark. The upper bound in Corollary 2.5.11 is O(gn ) as n → ∞, with √ g = ( 5 − 1)/2. The lower bound is O(g2n ) as n → ∞ so that the problem of the exact rate of convergence is unsettled. 2 Corollary 2.5.12 Let µ ∈ pr (BI ) such that µ ¿ λ and let ga = dµ/dγa , a ∈ I. Then we have ¯ ¯ ¯µ(τ 0 ≤ x, san ≤ y) − µ([0, x])G(y)¯ µZ ≤ inf

0≤k≤n

I

¯ ¯ ¯ga I[0,x] − (ga I[0,x] )a ¯ dγa + k

¶ k0 µ([0, x]) Fn−k Fn−k+1

(2.5.47)

for any a, x, y ∈ I and n ∈ N. In particular, if ga has a version gea of bounded variation, then Z ¯ ¯ ¯ga I[0,x] − (ga I[0,x] )a ¯ dγa (2.5.48) k I

 (a + 1)var[0,x] gea      (Fk + aFk−1 )(Fk+1 + aFk ) ≤

    

if x ∈ Ek

(a + 1)var[0,x] gea +2 (Fk + aFk−1 )(Fk+1 + aFk )

Z

x

ak

ga (t)γa (dt) if x ∈ (ak , bk ),

where [ak , bk ] is the closure of the (unique) fundamental interval of order k ∈ N containing x ∈ I \ Ek . Proof. We have

Z 0

µ(τ ≤

x, san

≤ y) = I

I[0,x] (I[0,y] ◦ san )ga dγa

for any a, x, y ∈ I and n ∈ N. Theorem 2.5.9 applies with f = ga I[0,x] and h = I[0,y] , x, y ∈ I, yielding (2.5.47). Next, (2.5.48) can be obtained noting that (i) for a typical fundamental interval I(i(k) ) of order k ∈ N contained in [0, x] we have Z ¯ ¯ ¯ga I[0,x] − (ga I[0,x] )a ¯ dγa k I(i(k) )

¯ ¯ Z ¯ ¯ 1 ¯ ¯ g (s)γ (ds) ¯ γa (dt) ¯ga (t) − a a (k) )) ¯ ¯ (k) (k) γ (I(i a I(i ) I(i ) ¯ ¯ Z ¯ ¯Z 1 ¯ ¯ (e g (t) − g e (s)) γ (ds) ¯ γa (dt) ¯ a a a ¯ γa (I(i(k) )) I(i(k) ) ¯ I(i(k) )

Z = =

≤ γa (I(i(k) ))varI (i(k) ) gea ,


and (ii) for x ∈ (ak , bk ) we have Z bk ¯ ¯ ¯ga I[0,x] − (ga I[0,x] )a ¯ dγa k ak

¯ Z x ¯ 1 = ga (s)γa (ds)¯¯ γa (dt) γa ([ak , bk ]) ak ak ¯ Z b k ¯Z x ¯ ¯ 1 ¯ ga (s)γa (ds)¯¯ γa (dt) + ¯ γa ([ak , bk ]) x ak ¶ Z x Z bk µZ x 1 ≤ ga (t)γa (dt) + ga (s)γa (ds) γa (dt) γa ([ak , bk ]) ak ak ak Z x = 2 ga (t)γa (dt). Z

¯ ¯ga (t) − ¯



ak

The proof is complete.

2

Corollary 2.5.13 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ and let $g_a = d\mu/d\gamma_a$, $a \in I$. Then we have
$$\bigl|\mu(s_n^a \le x) - G(x)\bigr| \le \inf_{0 \le k \le n}\left(\int_I \bigl|g_a - (g_a)_k^a\bigr|\, d\gamma_a + \frac{k_0}{F_{n-k} F_{n-k+1}}\right)$$
for any $a, x \in I$ and $n \in \mathbb{N}$. If $g_a$ has a version of bounded variation, then the right-hand side of the above inequality is $O(g^n)$ as $n \to \infty$ uniformly with respect to $a, x \in I$.

Proof. Take $x = 1$ in (2.5.47), and then $x = 1$ and $k = \lfloor n/2\rfloor$ in (2.5.48). □

Remark. Corollary 2.5.13 shows that the limiting distribution as $n \to \infty$ of $s_n^a$ under a probability measure on $\mathcal{B}_I$ absolutely continuous with respect to $\lambda$ is always Gauss' measure $\gamma$, for any $a \in I$. The problem of the exact rate of convergence, which should normally depend on $g_a$, remains unsettled. □

Other special cases of Theorem 2.5.9 and its corollaries can be considered. For example, we can check that
$$\lim_{n\to\infty} \gamma(\tau^0 \le x,\ s_n^a \le y) = G(x)\,G(y), \qquad a, x, y \in I. \qquad (2.5.49)$$
It is interesting to note that (2.5.49) points to the asymptotic independence of $\tau^0$ and $s_n^a$ under $\gamma$ as $n \to \infty$.


As already noted at the beginning of this subsection, we can easily obtain the results corresponding to Corollaries 2.5.10 through 2.5.12 in the case considered by A. Denjoy, where $\tau^0$ is replaced by $[a_1, \cdots, a_n]$. The only difference occurs in the convergence rates; the limiting probabilities are not altered.


Chapter 3

Limit theorems

This chapter is devoted to functional versions of the central limit theorem and other weak limit theorems, and of the law of the iterated logarithm, for the incomplete quotients and associated random variables. The reader should keep in mind throughout that the sequence $(a_n)_{n \in \mathbb{N}_+}$ of incomplete quotients is $\psi$-mixing under different probability measures on $\mathcal{B}_I$ (see Subsections 1.3.6 and 2.3.4), while frequent reference is made to the three appendices at the end of the book.

3.0 Preliminaries

As in Subsection 2.5.4 let $g$ be a $\lambda$-integrable complex-valued function on $I$. We particularize here the framework considered there, taking $a = 0$ and accordingly $\gamma_0 = \lambda$. Denote by $E_k$, $k \in \mathbb{N}$, the set consisting of the endpoints of all fundamental intervals of rank $\ell$, $0 \le \ell \le k$. For any $n \in \mathbb{N}_+$ we associate with $g$ a function $g_n$ which has a constant value on each fundamental interval $I(i^{(n)})$, $i^{(n)} \in \mathbb{N}_+^n$, of rank $n$. Specifically,
$$g_n(x) = \frac{1}{\lambda\bigl(I(i^{(n)})\bigr)} \int_{I(i^{(n)})} g\, d\lambda, \qquad x \in I(i^{(n)}),\ i^{(n)} \in \mathbb{N}_+^n,\ n \in \mathbb{N}_+. \qquad (3.0.1)$$
Then
$$\int_I g_n\, d\lambda = \int_I g\, d\lambda, \qquad n \in \mathbb{N}_+, \qquad (3.0.2)$$
and
$$\lim_{n\to\infty} g_n(x) = g(x) \quad \text{a.e. in } I. \qquad (3.0.3)$$
It follows from (3.0.2) and (3.0.3) that $\lim_{n\to\infty} \int_I |g - g_n|\, d\lambda = 0$. Hence
$$\omega_{g,A}(n) = \int_A |g - g_n|\, d\lambda \to 0 \qquad (3.0.4)$$
uniformly with respect to $A \in \mathcal{B}_I$ as $n \to \infty$.
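The construction (3.0.1) is easy to experiment with. The sketch below is ours, not from the book: it takes the Gauss density as a test function $g$, locates the rank-$n$ fundamental interval containing a point $x$ from the first $n$ digits of $x$ (its closure has endpoints $p_n/q_n$ and $(p_n+p_{n-1})/(q_n+q_{n-1})$), and averages $g$ over it by a crude midpoint rule, so the printed values illustrate (3.0.3). Double-precision arithmetic limits the usable rank $n$ to small values.

```python
import math

def cf_digits(x, n):
    """First n regular continued fraction digits of x in (0,1)."""
    digits = []
    for _ in range(n):
        if x == 0:
            break
        a = int(1 // x)
        digits.append(a)
        x = 1 / x - a
    return digits

def fundamental_interval(digits):
    """Closure of I(a_1,...,a_n): endpoints p_n/q_n and (p_n+p_{n-1})/(q_n+q_{n-1})."""
    p_prev, p = 1, 0          # p_{-1}, p_0
    q_prev, q = 0, 1          # q_{-1}, q_0
    for a in digits:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    u, v = p / q, (p + p_prev) / (q + q_prev)
    return min(u, v), max(u, v)

def g(x):                     # test density: the Gauss density
    return 1.0 / ((1.0 + x) * math.log(2))

def g_n(x, n, quad_points=1000):
    """Value at x of the step function g_n of (3.0.1): the mean of g over the
    rank-n fundamental interval containing x (midpoint-rule quadrature)."""
    a, b = fundamental_interval(cf_digits(x, n))
    h = (b - a) / quad_points
    integral = sum(g(a + (i + 0.5) * h) for i in range(quad_points)) * h
    return integral / (b - a)

x = (5 ** 0.5 - 1) / 2 + 1e-7   # a sample point of I
for n in (1, 2, 4, 8):
    print(n, g(x), g_n(x, n))    # g_n(x) -> g(x), as in (3.0.3)
```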

We shall now prove a result which, in a sense, is dual to Theorem 2.5.9.

Lemma 3.0.1 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ and let $g = d\mu/d\lambda$. For any $n \in \mathbb{N}_+$ and $A \in \mathcal{B}_n^{\infty} = \tau^{-n+1}(\mathcal{B}_I)$ we have
$$|\mu(A) - \gamma(A)| \le \inf_{1 \le s < n}\bigl(\gamma(A^c)\,\omega_{g,A}(s) + \gamma(A)\,\omega_{g,A^c}(s) + \gamma(A)\,\varepsilon_{n-s}\bigr),$$
with $\varepsilon_n$, $n \in \mathbb{N}_+$, defined as in Subsection 1.3.6. Hence
$$\lim_{n\to\infty}\ \sup_{A \in \mathcal{B}_n^{\infty}} |\mu(A) - \gamma(A)| = 0.$$

Proof. Put $h = I_A - \gamma(A)$, $A \in \mathcal{B}_n^{\infty}$. Then
$$\mu(A) - \gamma(A) = \int_I gh\, d\lambda$$
and
$$\Bigl|\int_I gh\, d\lambda\Bigr| \le \int_I |g_s - g|\,|h|\, d\lambda + \Bigl|\int_I g_s h\, d\lambda\Bigr|,$$
where $g_s$ is defined by (3.0.1) and $s < n$, $s \in \mathbb{N}_+$, is arbitrary. Since $|h| = 1 - \gamma(A) = \gamma(A^c)$ on $A$ and $|h| = \gamma(A)$ on $A^c$, we have
$$\int_I |g_s - g|\,|h|\, d\lambda \le \gamma(A^c)\,\omega_{g,A}(s) + \gamma(A)\,\omega_{g,A^c}(s). \qquad (3.0.5)$$
Next,
$$\Bigl|\int_I g_s h\, d\lambda\Bigr| = \Bigl|\sum_{i^{(s)} \in \mathbb{N}_+^s} \int_{I(i^{(s)})} g_s h\, d\lambda\Bigr| = \Bigl|\sum_{i^{(s)} \in \mathbb{N}_+^s} \Bigl(\frac{1}{\lambda(I(i^{(s)}))}\int_{I(i^{(s)})} g\, d\lambda\Bigr)\int_{I(i^{(s)})} h\, d\lambda\Bigr| = \Bigl|\sum_{i^{(s)} \in \mathbb{N}_+^s} \frac{\mu(I(i^{(s)}))}{\lambda(I(i^{(s)}))}\bigl(\lambda(I(i^{(s)}) \cap A) - \lambda(I(i^{(s)}))\,\gamma(A)\bigr)\Bigr|.$$
It then follows from equation (1.3.35) that
$$\Bigl|\int_I g_s h\, d\lambda\Bigr| \le \gamma(A)\,\varepsilon_{n-s}. \qquad (3.0.6)$$
Now, the result stated follows from (3.0.5), (3.0.6), and (3.0.4). □
2

Let fn : N+ → R, n ∈ N+ , and define Xnj = fn (aj ), Sn0 = 0,

Snk =

k X

Xnj ,

1 ≤ j ≤ n,

1 ≤ k ≤ n,

Snn = Sn ,

n ∈ N+ .

j=1

For any n ∈ N+ define the process ξn = ((ξn (t))t∈I by ξn (t) = Snbntc , t ∈ I. Lemma 3.0.2 Let µ ∈ pr (BI ) such that µ ¿ λ. Assume that the array X = {Xnj , 1 ≤ j ≤ n, n ∈ N+ } is s.i. under γ. ¡ ¢ ¡ ¢ (i) If either γξn−1 n∈N+ or µξn−1 n∈N+ converges weakly in BD , then both sequences converge ¡ ¢weakly in ¡ BD and ¢ have the same limit. (ii) If either γSn−1 n∈N+ or µSn−1 n∈N+ converges weakly in B, then both sequences converge weakly in B and have the same limit. Proof. Clearly, (ii) is an immediate consequence of (i). Let us therefore prove the latter. Take a sequence (kn )n∈N+ such that kn ≤ n, limn→∞ kn /n = 0, and limn→∞ kn = ∞. As X is s.i. under γ, we have lim γ (|Snkn | > ε) = 0

n→∞

(3.0.7)

for any ε > 0. Let us first show that µ lim

n→∞

¶ max |Snk | > ε = 0

1≤k≤kn

(3.0.8)

for any ε > 0. It follows from Proposition A3.5 (see also Section A1.4) that whatever ε > 0 we have ¡ −1 ¢ ε dP γSnk , δ0 ≤ , 1 ≤ k ≤ kn , 4 for any n large enough (≥ nε ). Therefore for some θ ≤ ε/4 we have −1 δ0 (A) < γSnk (Aθ ) + θ

168

Chapter 3

for any n ≥ nε , 1 ≤ k ≤ kn , and A ∈ B. Hence, with A = (−θ, θ) for which Aθ = (−2θ, 2θ), we obtain ³³ ε ε ´´ ³ ´ ε −1 −1 γSnk − , > γSnk Aθ > 1 − θ ≥ 1 − 2 2 4 for any n ≥ nε and 1 ≤ k ≤ kn . Equivalently, ³ ε´ ε min γ |Snk | < >1− , 1≤k≤kn 2 4

n ≥ nε .

If ε is small enough so that 1−

ε > ϕγ (1), 4

then by an Ottaviani type inequality [see Lemma 1.1.6 in Iosifescu and Theodorescu (1969)] we can write ¢ ¡ µ ¶ γ |Snkn | ≥ 2ε γ max |Snk | > ε ≤ 1≤k≤kn 1 − 4ε − ϕγ (1) for any n ≥ nε . Hence (3.0.8) holds on account of (3.0.7). Next, for any n ∈ N+ consider the process ξen = (ξen (t))t∈I defined by ξen (t) = Snbntc − Sn min(bntc,kn ) ,

t ∈ I.

Note that ξen is Bk∞n +1 -measurable and then by Lemma 3.0.1 and Lemma 2.1.1 in Iosifescu and Grigorescu (1990) we have µZ ¶ Z −1 −1 e e lim hd(γ ξn ) − hd(µξn ) = 0 (3.0.9) n→∞

D

D

for any bounded continuous real-valued function h on D. On the other hand (see Section A1.6), for any n ∈ N+ we have d0 (ξn , ξen ) ≤ sup |ξn (t) − ξen (t)| ≤ max |Snk |. t∈I

1≤k≤kn

It then follows from (3.0.8) that ∼

d0 (ξn , ξ n ) converges to 0 in γ-probability as n → ∞.

(3.0.10)

Hence as µ ¿ γ we also have that ∼

d0 (ξn , ξ n ) converges to 0 in µ-probability as n → ∞.

(3.0.11)

Limit theorems

169

We can now conclude the proof using (3.0.9) through (3.0.11). If, for w example, γξn−1 → ν for some ν ∈ pr (BD ), then it follows from (3.0.10) that w w γ ξen−1 → ν, too. Next, (3.0.9) implies that µξen−1 → ν, which in conjunction w with (3.0.11) yields µξn−1 → ν. 2 Remark. Lemma ¡ ¢ 3.0.2 still holds when the process ξn is replaced by the process ξnC = ξnC (t) t∈I defined by ¡ ¢ ξnC (t) = Snbntc + (nt − bntc) Sn(bntc+1) − Snbntc ,

t ∈ I,

with the convention Sn0 = 0, n ∈ N+ .

3.1 3.1.1

2

The Poisson law The case of incomplete quotients

Let θ ∈ R++ and α ∈ R be arbitrarily given. Consider the array X = {Xnj , 1 ≤ j ≤ n, n∈ N+ }, where Xnj =

³ a ´α j

n

I(aj >θn) .

(3.1.1)

For this array we have −α

Snk = n

k X

aαj I(aj >θn) ,

1 ≤ k ≤ n,

Sn = Snn ,

n ∈ N+ . (3.1.2)

j=1

Proposition 3.1.1 The array (3.1.1) is s.i. under γ. Proof. We only consider the case α ∈ R++ . The other cases can be treated similarly. We have γ (|Snk | > ε) ≤

k ³ ³ X ε´ ε´ γ |Xnj | > = kγ |Xn1 | > k k j=1

µ µ ³ ´ ¶¶ ε 1/α = kγ a1 > n max θ, k ≤ kγ(a1 > nθ),

1 ≤ k ≤ n.

170

Chapter 3

Hence Xn1 converges in γ-probability to 0 as n → ∞, and for any 0 < a < 1 we have lim sup max γ (|Snk | > ε) ≤ lim a n γ (a1 > nθ) = n→∞

n→∞ 1≤k≤an

a , θ log 2

which is less than 1 if we choose 0 < a < min(1, θ log 2). On account of Proposition A3.6 the proof is complete.

2

Theorem 3.1.2 We have w

γSn−1 → ν in B,

(3.1.3)

where: (i) if α ∈ R++ then ν = Pois ρ with dρ x−1−1/α (x) = δx ((θα , ∞)) , dλ α log 2

x ∈ R;

(ii) if −α ∈ R++ then ν = Pois ρ with x−1−1/α dρ (x) = −δx ((0, θα )) , x ∈ R; dλ α log 2 ³ ´ (iii) if α = 0 then ν = Pois (θ log 2)−1 δ1 , that is, ν is the Poisson ³ ´ distribution P (θ log 2)−1 with parameter (θ log 2)−1 . Proof. We only prove (i), the proofs of (ii) and (iii) being completely similar. Consider the measures µn on B defined by ³³ a ´α ´ 1 µn (A) = γ ∈ A, a1 > θn , A ∈ B, n∈ N+ . n Clearly, µn (R) = γ (a1 > θn) ≤ 1, µn ([−θα , θα ]) = 0,

n ∈ N+ ,

and γ(Xn1 ∈ A) = γ (a1 ≤ θn) δ0 (A) + µn (A),

A ∈ B, n∈ N+ .

Limit theorems

171

Also, for any x ∈ R we have ³ ´ lim n µn ((x, ∞)) = lim n γ a1 > n (max(x, θα ))1/α

n→∞

n→∞

=

  1 1 k lim n log 1 + j log 2 n→∞ n (max(x, θα ))1/α + 1

=

1 1 = ρ ((x, ∞)) . log 2 (max(x, θα ))1/α

Finally, lim n µn (R) = lim n γ(a1 > n θ) =

n→∞

n→∞

1 = ρ(R). θ log 2

Therefore all hypotheses of Theorem A3.10 are fulfilled, and (3.1.3) holds. 2 Now, on account of Proposition 3.1.1, Theorem 3.1.2, Lemma 3.0.2, and Theorem A3.7 we can state the following result. (See Section A3.3 for notation.) w

Corollary 3.1.3 Let µ ∈ pr(BI ) such that µ ≤ λ. Then µξn−1 → Qν w in BD , hence µSn−1 → ν in B, where ξn = (Snbntc )t∈I , with the convention Sn0 = 0, n ∈ N+ .

3.1.2

The case of associated random variable

We shall now show that both Theorem 3.1.2 and Corollary 3.1.3 still hold when aj is replaced by either yj , rj , or uj , 1 ≤ j ≤ n, in (3.1.1) and (3.1.2). This will follow from the result below. Lemma 3.1.4 Let bn , n ∈ N+ , be real-valued random variables on (I,BI ) such that an ≤ bn ≤ an + c, n ∈ N+ , for some c ∈ R+ . For any n ∈ N+ consider the stochastic processes ξn = 0 )t∈I , where Snk , 1 ≤ k ≤ n, is defined by (3.1.2) (Snbntc )t∈I and ξn0 = (Snbntc and k X 0 −α Snk = n bαj I(bj >θn) , 1 ≤ k ≤ n, j=1

172

Chapter 3

0 = 0. Then d (ξ , ξ 0 ) converges to 0 in γ-probability with the convention Sn0 0 n n as n → ∞.

Proof. For any n ∈ N+ we have d0 (ξn , ξn0 ) where

≤ sup |Snbntc − t∈I

0 Snbntc |



n X

|δnj |,

j=1

³ ´ δnj = n−α bαj I(bj >θn) − aαj I(aj >θn) ,

1 ≤ j ≤ n.

Notice that (aj > θn) ⊂ (bj > θn), 1 ≤ j ≤ n, and put δn0

=n

−α

n X

bαj

n ³ ´ X −α I(bj >θn) − I(aj >θn) = n bαj I(bj >θn,aj ≤θn) , j=1

j=1

δn00 = n−α Pn

n X

|bαj − aαj |I(aj >θn) .

j=1

δn0 + δn00 ,

and we are going to prove that δn0 and δn00 both Then j=1 |δnj | ≤ converge to 0 in γ-probability as n → ∞. We have γ(δn0 > 0) ≤ nγ(θn − c < a1 ≤ θn) → 0 as n → ∞ while

 δn00 ≤ cα n−1 n−(α−1)

n X

 aα−1 I(aj >θn)  , j

j=1

where

  cα(1 + c)α−1 if α ≥ 1, cα =



c|α|

if α < 1. ¡ ¢ [We have used the inequality (1+a)α −1 ≤ a {α} + bαc(1 + a)α−1 , valid for non-negative a and α, which implies 1 − (1 + a)−α ≤ aα.] By Theorem 3.1.2, δn00 converges to 0 in γ-probability as n → ∞. It follows that d0 (ξn , ξn0 ) is dominated by the sum of two non-negative random variables both converging in γ-probability to 0 as n → ∞. The proof is complete. 2 Corollary 3.1.5 Let bn denote either yn , rn , or un , n ∈ N+ . Put 0 Snk

−α

=n

k X j=1

bαj I(bj >θn) ,

1 ≤ k ≤ n,

Limit theorems

173

0 and for any n ∈ N+ consider the stochastic process ξn0 = (Snbntc )t∈I , with the w

0 = 0. Let µ ∈ pr(B ) such that µ ¿ λ. Then µξ 0−1 → Q in convention Sn0 ν I n w 0−1 −→ BD , hence µSnn ν in B.

Proof. Lemma 3.1.4 applies with c = 1 in the case of yn and rn and with c = 2 in the case of un . Since µ ¿ γ, the distance d0 (ξn , ξn0 ) converges to 0 in µ-probability, too, as n → ∞. This property and Corollary 3.1.3 imply the result stated. 2 Let bn denote either an , yn , rn or un , n ∈ N+ , and consider the special case α = 0. By Corollaries 3.1.3 and 3.1.5, under any µ ∈ pr(BI ) such that µ ¿ λ, the random variable Sn0 =

n X

I(bj >θn)

j=1

¡ ¢ is asymptotically P (θ log 2)−1 as n → ∞. It is possible to estimate the rate of convergence of γ(Sn0 = k), k ∈ N, to its Poisson limit. The following result holds. Theorem 3.1.6 Let k ∈ N and 0 < δ < 1 be fixed. We have |γ(Sn0 = k) − e−θ θk /k!| ≤ c exp(−(log n)δ ),

n ∈ N+ ,

for θ = O(na ), 0 ≤ a < 1, where c only depends, perhaps, on δ, a, and k. The proof for the case bn = an , n ∈ N+ , k = 0, can be found in Philipp (1976, p. 382), where the proviso θ = O(na ), 0 ≤ a ≤ 1, does not appear. Cf. Galambos (1972) and Iosifescu (1978, p. 35).

3.1.3

Some extreme value theory

Throughout this subsection let again bn denote either an , yn , rn or un , n ∈ (k) N+ . For 1 ≤ k ≤ n let Mn be the kth largest of b1 , · · · , bn . Clearly, (1) Mn = Mn is the maximum of b1 , · · · , bn . The asymptotic distribution of (k) Mn as n → ∞ for any fixed k can be easily obtained from previous results as shown below. Proposition 3.1.7 Let µ ∈ pr(BI ) such that µ ¿ λ. For any fixed k ∈ N+ we have à ! k−1 −j (k) X Mn log 2 x − x1 lim µ ≤x =e , x ∈ R++ . (3.1.4) n→∞ n j! j=0

174

Chapter 3

In particular, µ lim µ

n→∞

¶ 1 Mn log 2 ≤ x = e− x , n

x ∈ R++ .

Proof. Let 1 ≤ k ≤ n. It is easy to see that Sn0 = than k if and only if

Pn

j=1 I(bj >θn)

is less

(k) Mn

does not exceed θn, that is, ³ ´ ¡ ¢ (k) Mn ≤ θn = Sn0 < k

(3.1.5)

for any θ ∈ R++ and n ∈ N+ . Hence, by Corollaries 3.1.3 and 3.1.5, ³ ´ µ Mn(k) ≤ θn

=

k−1 ¡ 0 ¢ X ¡ ¢ µ Sn < k = µ Sn0 = j j=0 −(θ log 2)−1

→ e

k−1 X j=0

1 j!(θ log 2)j

as n → ∞ for any fixed k ∈ N+ . Putting x = θ log 2 we obtain the result stated. 2 Remark. The limit distribution for the special case k = 1 is known as Type II Extreme Value distribution for sequences of i.i.d. random variables. See, e.g., de Haan (1970). The same result can also be obtained from general results of Loynes (1965) for mixing strictly stationary sequences. 2 In what follows we give some almost sure asymptotic properties of Mn due to Philipp (1976), which improve upon results of Galambos (1974). We start with a F. Bernstein type theorem (see Proposition 1.3.16). Proposition 3.1.8 Let (cn )n∈N+ be a non-decreasing sequence of positive numbers. Then γ(Mn ≥ cn i.o.) P is either 0 or 1 according as the series n∈N+ 1/cn converges or diverges. Proof. We have (bn ≥ cn i.o.) ⊂ (Mn ≥ cn i.o.) since bn (ω) ≥ cn for some n ∈ N+ and ω ∈ Ω implies Mn (ω) ≥ cn . Conversely, if Mn (ω) ≥ cn for some n ∈ N+ and ω ∈ Ω, then there exists n0 ≤ n such that Mn (ω) = bn0 (ω) ≥ cn ≥ cn0 . Hence (Mn ≥ cn i.o.) ⊂ (bn ≥ cn i.o.). Therefore (Mn ≥ cn i.o.) = (bn ≥ cn i.o.) , and the conclusion follows from Corollary 1.3.17. 2

Limit theorems

175

Corollary 3.1.9 Let (cn )n∈N+ be as in Proposition 3.1.8. Then either Mn = 0 a.e. n→∞ cn lim

or lim sup P

n→∞

(3.1.6)

Mn = ∞ a.e. cn

(3.1.7)

1/cn converges or diverges. P Proof. First, assume that s = n∈N+ 1/cn < ∞. P Choose positive numbers dn , n ∈ N+ , with limn→∞ dn = ∞ such that n∈N+ dn /cn < ∞. Pn This is always possible. Indeed, put sn = i=1 1/ci , n ∈ N+ , and define according as the series

n∈N+

E1 = {j ∈ N+ : sj ≤ 3s/4}, ( En =

j ∈ N+ : 3s

n−1 X

−i

4

< sj ≤ 3s

i=1

n X

) −i

4

,

n ≥ 2.

i=1

Consider the increasing sequence (nk )k∈N+ of indices n for which En 6= ∅Pand take dj = 2nk−1 if j ∈ Enk , k ∈ N+ , with n0 = 0. Then we have −nk + 4−(nk −1) + · · · + 4−nk−1 ) ≤ 4−nk−1 +1 s, k ∈ N , + j∈Enk 1/cj ≤ 3s(4 P P P P −n k−1 ≤ 8s. By hence n∈N+ dn /cn = k∈N+ j∈En dj /cj ≤ 4s k∈N+ 2 k Proposition 3.1.8 we have µ ¶ Mn 1 γ ≥ i.o. = 0, cn dn which is equivalent to (3.1.6). P Second, assume that n∈N+ 1/cP n = ∞. Choose positive numbers dn , n ∈ N+ , with limn→∞ dn = 0 such that n∈N+ dn /cn = ∞. This is again always P possible. Indeed, put sn = ni=1 1/ci , n ∈ N+ , and define E1 = {j ∈ N+ : sj ≤ 4} , En =

© ª j ∈ N+ : 4n−1 < sj ≤ 4n ,

n ≥ 2.

Consider the increasing sequence (nk )k∈N+ of indices n for which En 6= ∅ −nk−1 if j ∈ E · , with n0 = 0. Then and nk ∪ Enk+1 , k = 1, 3, · · P P take dj = 2 n n n k−1 k−1 k whence dj /cj ≥ ≥ 3·4 1/cj ≥ 4 − 4 j∈En ∪En j∈En ∪En k

k+1

k

k+1

176

Chapter 3

P 3 · 2nk−1 , k = 1, 3, · · · . Clearly, this implies n∈N+ dn /cn = ∞. By Proposition 3.1.8 we have µ ¶ Mn 1 γ ≥ i.o. = 1, cn dn which is equivalent to (3.1.7).

2

Theorem 3.1.10 Let (cn )n∈N+ be a non-decreasing sequence of positive numbers such that the sequence (n/cn )n∈N+ is non-decreasing. Then µ ¶ n γ Mn ≤ i.o. cn log 2 is either 0 or 1 according as the series X log log n n exp cn

n∈N+

converges or diverges. The proof is completely similar to that given for the i.i.d. case in Barndorff–Nielsen (1961). Theorem 3.1.6 plays an essential part in the present case. For details in the case bn = an , n ∈ N+ , see Philipp (1976, pp. 384–385). 2 Corollary 3.1.11 We have lim sup(inf) n→∞

log Mn − log n = 1(0) a.e., log log n

whence lim

n→∞

log Mn = 1 a.e.. log n

Proof. For the lim sup case we should show that for any ε > 0 we have µ ¶ log Mn − log n γ ≥ 1 + ε i.o. = 0 log log n and

µ γ

or, equivalently,

¶ log Mn − log n ≥ 1 − ε i.o. = 1 log log n

¡ ¢ γ Mn ≥ n(log n)1+ε i.o. = 0

Limit theorems

177

and

¡ ¢ γ Mn ≥ n(log n)1−ε i.o. = 1.

These equations clearly hold by Proposition 3.1.8. For the lim inf case we should show that for any ε > 0 we have µ ¶ log Mn − log n γ ≤ ε i.o. = 1 log log n and

µ γ

¶ log Mn − log n ≤ −ε i.o. = 0 log log n

or, equivalently, γ (Mn ≤ n(log n)ε i.o.) = 1 and

¡ ¢ γ Mn ≤ n(log n)−ε i.o. = 0

It is easy to check that these equations hold by Theorem 3.1.10.

2

Corollary 3.1.12 We have lim inf n→∞

Mn log log n 1 = a.e.. n log 2

Proof. We should show that for any ε > 0 we have µ ¶ Mn log log n 1 γ − ≤ ε i.o. = 1 n log 2 and

µ γ

¶ Mn log log n 1 − ≤ −ε i.o. = 0 n log 2

or, equivalently,

and

µ γ Mn ≤

¶ n(1 + ε0 ) i.o. = 1 (log log n)(log 2)

µ γ Mn ≤

¶ n(1 − ε0 ) i.o. = 0, (log log n)(log 2)

where ε0 = ε log 2. This follows immediately from Theorem 3.1.10.

2

178

Chapter 3 (k)

To conclude this subsection we consider the kth smallest mn of b1 , · · · , bn , (1) (n) (k) 1 ≤ k ≤ n, n ∈ N+ . Clearly, mn = Mn . In general, we have mn = (n−k+1) Mn , 1 ≤ k ≤ n. Then by (3.1.5) we have ¡ 0 ¢ (m(k) n ≤ θn) = Sn < n − k + 1 for any θ ∈ R++ and n ∈ N+ . Hence, for any µ ∈ pr(BI ) such that µ ¿ λ, ³ ´ ¡ ¢ µ m(k) ≤ θn = µ Sn0 < n − k + 1 n =

n−k X

µ(Sn0

n X

= j) = 1 −

j=0

µ(Sn0 = j).

j=n−k+1

Since n−1 Sn0 converges to 0 in µ-probability as n → ∞ by Corollaries 3.1.3 and 3.1.5, we have ¡ ¢ lim µ Sn0 = n − m = 0 n→∞

for any fixed m ∈ N. Consequently, ³ ´ lim µ m(k) ≤ θn =1 n n→∞

(3.1.8)

for any fixed k ∈ N+ . This result is not at all surprising. Indeed, by Proposition 4.1.1 we have lim a(k) n→∞ n

= 1 a.e.

(k)

for any fixed k ∈ N+ , where an denotes the kth smallest of a1 , · · · , an . (k) (k) As mn ≤ an + 2, n ∈ N+ , 1 ≤ k ≤ n, it follows that (k)

mn = 0 a.e. n→∞ n lim

for any fixed k ∈ N+ , which clearly entails (3.1.8). Remark. It is proved in Iosifescu (1977) that if (ηn )n∈N+ is a strictly stationary ψ-mixing sequence of positive random variables on a probability space (Ω , K, P ) such that for some real-valued function g on N+ there exists the positive finite limit lim nP (ηn < g(n)) = θ,

n→∞

Limit theorems

179

say, then P (ηk < g(n) for p values k, 1 ≤ k ≤ n) → e−θ θp /p! as n → ∞ for any fixed p ∈ N. In particular this result applies to a sequence (ηn )n∈N+ for which P (η1 ≥ x) = log(1 + 1/x)/ log 2,

x ≥ 1,

with

2θ log 2 , n ∈ N+ . n For such a sequence, similarly to (3.1.4) we can write g(n) = 1 +

à lim P

n→∞

! k−1 j (k) X n(ηn − 1) x ≥ x = e−x , 2 log 2 j!

x ∈ R++ ,

(3.1.9)

j=0

(k)

for any fixed k ∈ N+ , where ηn denotes the kth smallest of η1 , · · · , ηn , 1 ≤ k ≤ n. We cannot assert that (3.1.9) is true for ηn = an , n ∈ N+ , since the equation γ (a1 ≥ x) = log (1 + 1/x) / log 2 holds just for x ∈ N+ . It is conjectured in Iosifescu (1978) that (3.1.9) holds true for ηn = rn , n ∈ N+ , under any P ¿ λ. [Notice that γ (r1 ≥ x) = log (1 + 1/x) / log 2 for any 2 x ≥ 1, but the sequence (rn )n∈N+ is not ψ-mixing under γ.]

3.2 3.2.1

Normal convergence Two general invariance principles

Assume the framework of Subsection 2.1.5. Thus let H be a real-valued l−1 , l ∈ Z, where function on NZ + . Set Hl = H1 ◦ τ H1 = H( · · · , a−2 , a−1 , a0 , a1 , a2 , · · · ). Then (Hl )l∈Z is a strictly stationary process on (I 2 , BI2 , γ). Set S0 = 0, Sn = P n i=1 Hi − nEγ H1 , n ∈ N+ , assuming that the mean value Eγ H1 exists and is finite. For any n ∈ N+ let us define the stochastic processes ξnC = (ξnC (t))t∈I and ξnD = (ξnD (t))t∈I by ξnC (t) =

¢ 1 ¡ √ Sbntc + (nt − bntc)(Hbntc+1 − Eγ H1 ) , σ n

ξnD (t) =

1 √ Sbntc , σ n

t ∈ I,

180

Chapter 3

where σ = σ(H) is a positive number which will be specified later. We start with a weak invariance principle. Theorem 3.2.1 Assume that Eγ H12 < ∞ and X 1/2 Eγ [H1 − Eγ (H1 |a−n , · · · , an )]2 < ∞

(3.2.1)

n∈N+

so that by Propositions 2.1.19 and 2.1.21 1 Eγ Sn2 = σ 2 ≥ 0 n→∞ n lim

exists finitely and is given by the absolutely convergent series X ¡ ¢ σ 2 = Eγ H12 − Eγ2 H1 + 2 Eγ H1 Hn+1 − Eγ2 H1 .

(3.2.2)

n∈N+ w

If σ > 0 then γξn−1 −→ W in both C and D, where ξn stands for either ξnC or ξnD . The last conclusion still holds when γ is replaced by any µ ∈ pr(BI2 ) such that µ ¿ λ2 . Proof. This is a transcription of Theorem 21.1 in Billingsley (1968), with an improvement by Popescu (1978) (concerning the possibility of replacing γ by µ), for the special case of the doubly infinite sequence (al )l∈Z . Note that in Proposition 2.1.22 a class of functions H is indicated, for which (3.2.1) holds. 2 Next, we state a strong invariance principle. Theorem 3.2.2 Assume that there exist constants 0 < δ ≤ 2 and c > 0 such that Eγ |H1 |2+δ < ∞ and 1/(2+δ)



|H1 − Eγ (H1 |a−n , · · · , an )|2+δ ≤ cn−(2+7/δ) ,

n ∈ N+ ,

(3.2.3)

so that (3.2.1) holds and lim

n→∞

1 Eγ Sn2 = σ 2 ≥ 0 n

exists finitely and is given by the absolutely convergent series (3.2.2). If σ > 0 then the strong invariance principle holds for the stochastic processes ξnC and ξnD , n ∈ N+ . That is, without changing their distributions, we can redefine these processes on a common richer probability space together with a standard Brownian motion process (w(t))t∈I such that sup |ξn (t) − w(t)| = O(n−a ) a.s. t∈I

Limit theorems

181

as n → ∞, with a random constant implied in O, for each a > 0 small enough, depending on δ. Here ξn stands for either ξnC or ξnD . Proof. This is a transcription of Theorem 7.1.1 in Philipp and Stout (1975) for the special case of the doubly infinite sequence (al )l∈Z . 2 For further reference we also consider the special case where H only depends on the coordinates with positive indices of a current point in NZ +, N+ i.e., H is a real-valued function on N+ . (Completely similar considerations can be made in the case where H only depends on the coordinates with nonpositive indices of a current point in NZ + , i.e., H is a real-valued function on (−N) N+ .) In this case we set Hn = H1 ◦ τ n−1 , n ∈ N+ , where H1 = H(a1 , a2 , · · · ), and we have a strictly stationary sequence (Hn )n∈N+ on (I, BI , γ). With the same definitions as before for Sn , ξnC and ξnD , n ∈ N+ , where Eγ H1 is replaced by Eγ H1 , we can state the following special cases of Theorems 3.2.1 and 3.2.2. Theorem 3.2.10 Assume that Eγ H12 < ∞ and X Eγ1/2 [H1 − Eγ (H1 |a1 , · · · , an )]2 < ∞

(3.2.10 )

n∈N+

so that

1 Eγ Sn2 = σ 2 ≥ 0 n→∞ n exists finitely and is given by the absolutely convergent series X ¡ ¢ Eγ H1 Hn+1 − Eγ2 H1 . σ 2 = Eγ H12 − Eγ2 H1 + 2 lim

(3.2.20 )

n∈N+ w

If σ > 0 then γξn−1 −→ W in both C and D, where ξn stands for either ξnC or ξnD . The last conclusion still holds when γ is replaced by any µ ∈ pr(BI ) such that µ ¿ λ. Note that inequality (2.1.32) and Proposition 2.1.23 describe two classes of functions H for which (3.2.10 ) holds. Theorem 3.2.20 Assume that there exist constants 0 < δ ≤ 2 and c > 0 such that Eγ |H1 |2+δ < ∞ and Eγ1/(2+δ) |H1 − Eγ (H1 |a1 , · · · , an )|2+δ ≤ cn−(2+7/δ) ,

n ∈ N+ ,

(3.2.30 )

182

Chapter 3

so that (3.2.10 ) holds and lim

n→∞

1 Eγ Sn2 = σ 2 ≥ 0 n

exists finitely and is given by the absolutely convergent series (3.2.20 ). If σ > 0 then the strong invariance principle holds for the stochastic processes ξnC and ξnD , n ∈ N+ . That is, without changing their distributions, we can redefine these processes on a common richer probability space together with a standard Brownian motion process (w(t))t∈I such that sup |ξn (t) − w(t)| = O(n−a ) a.s. t∈I

as n → ∞, with a random constant implied in O, for each a > 0 small enough, depending on δ. Here ξn stands for either ξnC or ξnD .

3.2.2

The case of incomplete quotients

An important special case of Theorem 3.2.10 is obtained when the function N H only depends on finitely many coordinates of a current point of N+ + , i.e., when H is a real-valued function on Nk+ for a given k ∈ N+ . In this case Hn = H(an , ..., an+k−1 ), n ∈ N+ , assumption (3.2.10 ) is trivially satisfied, and by Corollary 1.2.5 we have Eγ H1r =

1 log 2

X

H r (i(k) ) log

i(k) ∈Nk+

1 + v(i(k) ) 1 + u(i(k) )

with r = 1 or 2, and σ 2 = Eγ H12 − Eγ2 H1  X +2 

X

n∈N+

i(n+k) ∈Nn+k +

(3.2.200 ) 

H(i(k) )H(in+1 , · · · , in+k ) 1 + v(i(n+k) )  log − Eγ2 H1  . (n+k) log 2 1 + u(i )

Note that in the case k = 1 by either Corollary 2.1.25 or Proposition A3.4 we have σ = 0 if and only if H =const. It is an open problem to find necessary and sufficient conditions in terms of H in the case k > 1 for to have σ = 0. The special framework assumed allows for an estimate of the convergence rate in the classical central limit theorem. Thus we have the following result.

Limit theorems

183

Theorem 3.2.3 If σ > 0 and Eγ |H1 |2+δ =

1 log 2

i

¯2+δ X ¯¯ 1 + v(i(k) ) ¯ log <∞ ¯H(i(k) )¯ 1 + u(i(k) ) k (k) ∈N+

for some δ > 0, then there exist two positive constants a < 1 and c such that ¯ Ã Pn ¯ ! ¯ ¯ ¯ ¯ j=1 Hj − nEγ H1 √ < x − Φ(x)¯ ≤ c n−a ¯γ ¯ ¯ σ n for any x ∈ R and n ∈ N+ . Proof. This is a transcription of Theorem 1 in Iosifescu (1968) for the special case of the sequence (an )n∈N+ of incomplete quotients. 2 Remark. It is an open problem to determine the optimal value of a in Theorem 3.2.3. We conjecture that a = δ/2, that is, the same value as in the case of i.i.d. random variables with finite (2 + δ)-absolute moment. 2 In what follows, by restricting the class of functions H we give more precise results in the case k = 1. To emphasize this special framework we change the notation by using the letter f instead of H. Theorem 3.2.4 Let f : N+ → R, An ∈ R, Bn ∈ R++ , n ∈ N+ , with limn→∞ Bn = ∞, and define Xnj Sn0

= Bn−1 (f (aj ) − An ) , 1 ≤ j ≤ n, k X = 0, Snk = Xnj , 1 ≤ k ≤ n, Snn = Sn ,

F (x) =

1 log 2

n ∈ N+ ,

j=1

X

f 2 (k)k −2 ,

{k:|f (k)|≤x}

Fe(x) = Eγ f 2 (a1 )I(|f (a1 )|≤x) =

1 log 2

X

f 2 (k) log

µ 1+

{k:|f (k)|≤x}

1 k(k + 2)

¶ ,

x ∈ R+ .

(i) The following assertions are equivalent. (I) The stochastic process ξnD = ξn = (ξn (t))t∈I defined for any n ∈ N+ by ξn (t) = Snbntc , t ∈ I, satisfies w

γξn−1 −→ WD in BD ,

184

Chapter 3 where WD is the Wiener measure on BD . w

(II) γSn−1 −→ N (0, 1), and the array X = {Xnj , 1 ≤ j ≤ n, n ∈ N+ } is s.i. under γ. (ii) When limx→∞ Fe(x) = Eγ f 2 (a1 ) = ∞, assertion (I) above holds with a bounded sequence (An )n∈N+ if and only if X x2 k −2 lim

n→∞

{k:|f (k)|>x}

X

f 2 (k)k −2

=0

(3.2.4)

{k:|f (k)|≤x}

or, equivalently (see Theorem A2.5), if and only if F is slowly varying. If this is the case, then we can take An = Eγ f (a1 ), n ∈ N+ ,and any sequence (Bn )n∈N+ such that limn→∞ nBn−2 F (Bn ) = 1. When Eγ f 2 (a1 ) < ∞, assertion (I) holds with a bounded sequence (An )n∈N+ if and only if f 6=const. If this is the case, then we can take √ 1/2 An = Eγ f (a1 ) and Bn = nσ(0) Eγ f 2 (a1 ), n ∈ N+ , for some σ(0) > 0. (iii) If either (I) or (II) holds, then γ can be replaced in (i) by any µ ∈ pr(BI ) such that µ ¿ λ. Proof. (i) and (iii) follow from Theorem A3.7 and Lemma 3.0.2, respectively. We thus have to only prove (ii). First, since ´ ³ 1 log 1 + k(k+2) lim = 1, k→∞ k −2 either F and Fe both tend to ∞ as x → ∞ and limx→∞ F (x)/Fe (x) = 1 or both have finite limits as x → ∞. Consequently, F is slowly varying if and only if Fe is. Assume that (3.2.4) holds. Note that this does always happen when 0 < Eγ f 2 (a1 ) = lim Fe(x) < ∞. x→∞

Then Theorem A3.12 applies with Xn = f (an ), n ∈ N+ , and  Eγ2 f (a1 )   if Eγ f 2 (a1 ) < ∞,  2 (a ) E f 2 γ 1 m (X1 ) =    0 if Eγ f 2 (a1 ) = ∞,

Limit theorems

(0) ϕ1

= 1,

185

ϕ(0) n

 Eγ f (a1 )f (an )    Eγ f 2 (a1 ) =    0

if Eγ f 2 (a1 ) < ∞, if Eγ f 2 (a1 ) = ∞

2 equals either for n ≥ 2 [use Proposition A3.1 and equation (A3.2)], and σ(0) ¡ ¢ P Eγ f 2 (a1 ) − Eγ2 f (a1 ) + 2 n∈N+ Eγ f (a1 )f (an+1 ) − Eγ2 f (a1 )

Eγ f 2 (a1 ) or 1 according as Eγ f 2 (a1 ) < ∞ or Eγ f 2 (a1 ) = ∞. Noting that when Eγ f 2 (a1 ) < ∞ by Corollary 2.1.25 we have σ(0) 6= 0 if and only if f 6= const., we conclude that with An and Bn , n ∈ N+ , as indicated we have w γξn−1 −→ WD , that is, (I) holds with a bounded sequence (An )n∈N+ . Next, assume that (I) or, equivalently, (II) holds with a bounded sequence (An )n∈N+ . Clearly, this cannot happen if f = const. It thus remains to show that Fe is slowly varying when lim Fe (x) = ∞.

x→∞

(3.2.5)

Fix δ ∈ (0, 1) and put Xnjδ = Xnj I(|Xnj |≤δ) − Eγ Xnj I(|Xnj |≤δ) for any w 1 ≤ j ≤ n, n ∈ N+ . As γSn−1 → N (0, 1) by (II), it follows from Theorem A3.11(i) that 2  n X (3.2.6) Xnjδ  = 1. lim Eγ  n→∞

j=1

On the other hand, it follows from Corollary A3.2 that   2  n X X 2 ψ(k) nEγ Xn1 I(|Xn1 |≤δ) , n ∈ N+ . (3.2.7) Eγ  Xnjδ  ≤ 1 + 2 j=1

k∈N+

Now, note that |f (i) − An | ≤ δBn entails ¡ ¢ |f (i)| ≤ |An | + δBn = Bn |An |Bn−1 + δ ≤ Bn for any n large enough since δ ∈ (0, 1), (An )n∈N+ is bounded, and limn→∞ Bn = ∞. Then for such an n we have 2 Eγ Xn1 I(|Xn1 |≤δ)

≤ Bn−2 Eγ (f (a1 ) − An )2 I(|f (a1 )|≤Bn ) ³ ´ ≤ 2Bn−2 Fe (Bn ) + A2n ,

186

Chapter 3

whence, by (3.2.5), 2 Eγ Xn1 I(|Xn1 |≤δ) ≤ 4Bn−2 Fe(Bn )

(3.2.8)

for any n large enough. It follows from (3.2.6) through (3.2.8) that there exist c > 0 and n0 ∈ N+ such that nBn−2 Fe(Bn ) ≥ c,

n ≥ n0 .

(3.2.9)

Finally, by Theorem A3.11 we also have lim nγ (|Xn1 | > ε) = 0

n→∞

for any ε > 0. Since (|Xn1 | > ε) = (|f (a1 ) − An | > εBn ) ⊃ (|f (a1 )| > |An | + εBn ) and limn→∞ (|An | + εBn ) /εBn = 1, we then have lim nγ (|f (a1 )| > Bn ) = 0.

x→∞

(3.2.10)

It follows from (3.2.9) and (3.2.10) that Bn2 γ (|f (a1 )| > Bn ) = 0. n→∞ Eγ f 2 (a1 )I(|f (a )|≤B ) n 1 lim

Noting that limn→∞ Bn+1 /Bn = 1 (this follows from, e.g., Theorem A3.9, but a direct proof can be also easily given), the last equation implies x2 γ (|f (a1 )| > x) = 0, x→∞ Eγ f 2 (a1 )I(|f (a )|≤x) 1 lim

which shows by Theorem A2.5 that Fe is slowly varying. Remarks. 1. Theorem 3.2.4 still holds if we replace D by C, WD by WC , and the stochastic process ξnD by the stochastic process ξnC defined by ¡ ¢ ξnC (t) = Snbntc + (nt − bntc) Sn(bntc+1) − Snbntc , t ∈ I, n ∈ N+ . This follows from Theorem A3.8. 2. For the many consequences of Theorem 3.2.4 (as well as of other similar further results) concerning, e.g., the asymptotic behaviour as n → ∞ of random variables as min0≤k≤n Snk , max0≤k≤n Snk , max0≤k≤n |Snk |, Un = number of indices k, 1 ≤ k ≤ n, for which Snk > 0, we refer the reader to

Limit theorems

187

Billingsley (1968, § 11). In particular, in the last case we have an arc-sine law µ ¶ √ 2 Un < a = arcsin a, 0 ≤ a ≤ 1, lim µ n→∞ n π for any µ ∈ pr(BI ) such that µ ¿ λ.

2

Example 3.2.5 Let f (n) = na+1/2 , n ∈ N+ , with a ∈ R. Clearly, for 2 2 a < 0 we have EP γ f (a1 ) < ∞. For a = 0 we have Eγ f (a1 ) = ∞, F (x) ∼ 2 −2 2 log x/ log 2, x = O(1) as x → ∞. Thus (3.2.4) holds and {k:|f (k)|>x} k we can take ¶ µ ³ ´ 1 X 1/2 1 1/2 An = Eγ a1 = k log 1 + log 2 k(k + 2) k∈N+

and Bn = (n log n/ log 2)1/2 , n ∈ N+ . It is easy to check that ζ(3/2)/6 log 2 < An < ζ(3/2)/ log 2 and that we can also write ´ X³ √ √ √ An = 2 k − 1 − k − k − 2 log k, n ∈ N+ . k≥2

Finally, for a > 0 we have F (x) ∼ x4a/(2a+1) /2a log 2 and x2 ∼ x4a/(2a+1) as x → ∞, that is, (3.2.4) does not hold.

P

{k:|f (k)|>x} k

−2

2

As a special case of Theorem 3.2.20 we note the following result. Proposition 3.2.6 Let f : N+ → R be a non-constant function. As2+δ < ∞. Put sume that there Pnexists a constant δ > 0 such that Eγ |f (a1 )| S0 = 0, Sn = i=1 f (ai ) − nEγ f (a1 ), n ∈ N+ . Let σ 2 = Eγ f 2 (a1 ) − Eγ2 f (a1 ) + 2

X ¡ ¢ Eγ f (a1 )f (an+1 ) − Eγ2 f (a1 ) , n∈N+

which by Corollary 2.1.25 is positive. Then the strong invariance principle holds for the stochastic processes ξnC and ξnD , n ∈ N+ . That is, without changing their distributions we can redefine these processes on a common richer probability space together with a standard Brownian motion process (w(t))t∈I such that sup |ξn (t) − w(t)| = O(n−a ) t∈I

a.s.

(3.2.11)

188

Chapter 3

as n → ∞, with a random constant implied in O, for each a > 0 small enough, depending on δ. Here ξn stands for either ξnC or ξnD . Remark. It follows from a general result of Heyde and Scott (1973) that if we only assume Eγ f 2 (a1 ) < ∞, then instead of (3.2.11) we only can assert that ³ ´ sup |ξn (t) − w(t)| = o (log log n)1/2 a.s. t∈I

as n → ∞, with a random constant implied in o.

3.2.3

2

The case of associated random variables

Write bn for either yn , rn or un , n ∈ N+ , respectively bl for either y l , rl or ul , l ∈ Z. We now give a partial extension of Theorem 3.2.4 to the sequence (bn )n∈N+ in the case of infinite variance. Theorem 3.2.7 Assume f : [1, ∞) → R+ is regularly varying ¡R x of index¢ 1/2, Eγ f 2 (a1 ) = ∞, and f (x) = x1/2 L(x), where L(x) = c exp 1 ε(t)t−1 dt , x ≥ 1, with c > 0, ε : [1, ∞) → R+ continuous, and limt→∞ ε(t) = 0. For 0 0 any n ∈ N+ define the stochastic process ξn = (ξ (t))t∈I by 0

ξn (t) =

¢ 1 X ¡ f (bj ) − Eγ (b0 ) , Bn

t ∈ I,

j≤bntc

with the usual convention which assigns value 0 to a sum over the empty set, where (Bn )n∈N+ is any sequence satisfying limn→∞ nBn−2 F (Bn ) = 1 with F defined as in Theorem 3.2.4, and Eγ (b0 ) is equal to Z ∞ Z ∞ 1 f (x)dx 1 f (x)dx , Eγ f (r0 ) = Eγ f (r1 ) = Eγ f (y 0 ) = log 2 1 x(x + 1) log 2 1 x(x + 1) or

1 Eγ f (u0 ) = log 2

µZ 1

2

(x − 1)f (x)dx + x2

Z



2

f (x)dx x2



according as bn denotes yn , rn or un , n ∈ N+ . Then 0

w

µξn−1 → WD in BD for any µ ∈ pr(BI ) such that µ ¿ λ. The proof of Theorem 3.2.7 for the cases where bn = rn or bn = un , n ∈ N+ , can be found in Samur (1989, pp. 75–77). The case where bn = yn , n ∈ N+ , can be treated in a similar manner. 2

Limit theorems

189

We note that the hypothesis of a slowly varying F occurring in Theorem 3.2.4 is replaced here by stronger hypotheses. [By Corollary A2.7(ii) the assumptions on f imply that F is slowly varying.] And even the Karamata representation of f is assumed to present special features (compare with Theorem A2.1). Example 3.2.8 Let f (x) = x1/2 , x ∈ [1, ∞) (cf. Example 3.2.5). Theorem 3.2.7 holds with Bn = (n log n/ log 2)1/2 , n ∈ N+ , and Z ∞ dx 1 π √ Eγ f (y 0 ) = Eγ f (r1 ) = = , log 2 1 2 log 2 x(x + 1) ¡√ ¢ µZ 2 ¶ Z ∞ 4 2−1 1 (x − 1)dx dx Eγ f (u0 ) = + = . log 2 log 2 x3/2 x3/2 1 2 2 The next result covers the case of finite variance. Theorem 3.2.9 Let f : [1, ∞) → R. Assume that either (i) f satisfies a Lipschitz condition of order 0 < ε ≤ 1, that is, |f (x) − f (y)| := sε (f ) < ∞, |x − y|ε x6=y, x,y≥1 sup

Z



and

|f (x)|2+δ x−2 dx < ∞ for some δ ≥ 0

1

or (ii) f = I(b,∞) for some b > 1. P Put S00 = 0, Sn0 = ni=1 (f (bi ) − Eγ f (b0 )), n ∈ N+ , and for any n ∈ N+ define the stochastic processes ξn0C = (ξn0C (t))t∈I and ξn0D = (ξn0D (t))t∈I on (I, BI , γ) by ξn0C (t) =

1 √ (S 0 + (nt − bntc)(f (bbntc+1 ) − Eγ f (b0 ))), σ(f ) n bntc

ξn0D (t) =

0 Sbntc √ , σ(f ) n

t ∈ I,

where σ(f ) is a positive number which is defined by (3.2.12) below. Then à n !2 X¡ ¢ 1 = σ 2 (f ) ≥ 0 (3.2.12) lim Eγ f (bi ) − Eγ f (b0 ) n→∞ n i=1

190

Chapter 3

exists finitely. If σ(f ) > 0 then (a) assuming that δ = 0, for any µ ∈ pr(BI ) such that µ ¿ λ we have w

µξn0−1 → W in both BC and BD , where ξn0 stands for either ξn0C or ξn0D ; (b) assuming that δ > 0, the strong invariance principle holds for the stochastic processes ξn0C and ξn0D , n ∈ N+ . That is, without changing their distributions we can redefine these processes on a richer common probability space together with a standard Brownian motion process (w(t))t∈I such that ¯ 0 ¯ ¯ ¯ sup ¯ξn (t) − w(t)¯ = O(n−a ) a.s. t∈I

as n → ∞, with a random constant implied in O, for each a > 0 small enough, depending on δ. Here ξn0 stands for either ξn0C or ξn0D . Proof. We shall show that (a) and (b) follow from Theorems 3.2.1 and 3.2.2, respectively. We use the notation of Subsection 2.1.5 . Define ¢ ¡ H ((il )l∈Z ) = f b1 ([i1 , i2 , · · · ], [i0 , i−1 , · · · ]) , (il )l∈Z ∈ NZ +, (3.2.13) H1 = H((al )l∈Z ), Hm = H1 ◦ τ m−1 , m ∈ N+ . Hence

      h(ω, θ) =

    

f (1/θ)

in the case where bl = y l , l ∈ Z,

f (1/ω)

in the case where bl = rl , l ∈ Z,

f (θ + 1/ω) in the case where

bl = ul , l ∈ Z

for (ω, θ) ∈ Ω2 . Also, as in the proof of Proposition 2.1.22 we easily obtain Eγ |H1 − Eγ (H1 | a−n , · · · , an )|2+δ =

X

1

i−n ,··· ,in ∈N+

γ¯ 2+δ (I 2 (i−n , · · · , in ))

¯Z ¯ ¯ ׯ ¯

Z γ¯ (dω 0 , dθ0 ) I 2 (i−n ,··· ,in )

(3.2.14)

¯2+δ ¯ ¯ . (h(ω 0 , θ0 ) − h(ω, θ))¯ γ (dω, dθ)¯ ¯ 2 I (i−n ,··· ,in )

Now, under (i) it is easy to check that h satisfies an inequality of the form (2.1.30), which yields cn ≤ crn , n ∈ N+ , for some c > 0 and 0 < r < 1,

Limit theorems

191

with cn , n ∈ N+ , defined as in Proposition 2.1.22. It follows from (3.2.14) that 1/(2+δ)



|H1 − Eγ (H1 | a−n , · · · , an )|2+δ ≤ crn ,

n ∈ N+ .

Hence (3.2.3) clearly holds. Next, we are going to show that under (ii) condition (3.2.3) also holds. In the case where bl = y l , l ∈ Z, for any given n ∈ N+ there is at most one fundamental interval I(i0 , i−1 , ..., i−n ) such that 1/b ∈ I (i0 , i−1 , ..., i−n ). Similarly, in the case where bl = rl , l ∈ Z, for any given n ∈ N+ , there is at most one fundamental interval I(i1 , ..., in ) such that 1/b ∈ I (i1 , ..., in ). Therefore by (3.2.14) in both these cases Eγ |H1 − Eγ (H1 |a−n , ..., an )|2+δ does not exceed (Fn Fn+1 log 2)−1 for all n ∈ N+ , hence (3.2.3) holds. In the case where bl = ul , l ∈ Z, the last integral in (3.2.14) may be different from 0 only for those rectangles I 2 (i−n , ..., in ) which are intersected by the hyperbola y + 1/x = 1/b. It is easy to see that for n large enough the total Euclidean area of them does not exceed (Fn Fn+1 )−1 so that (3.2.3) holds in this case, too. To prove (a) note that for δ = 0 by Theorem 3.2.1 we have w

µξn−1 −→ W in both BC and BD

(3.2.15)

for any µ ∈ pr(BI2 ) such that µ ¿ λ2 , where ξn stands for either ξnC or ξnD defined as in Section 3.2.1, for our special H given by (3.2.13) and with σ(f ) = σ(H) defined by (3.2.12). But ¯ ¯ ¯bn (ω) − bn (ω, θ)¯ ≤ (Fn−1 Fn )−1 ,

n ∈ N+ , (ω, θ) ∈ Ω2 .

[In the case where bn = rn , n ∈ N+ , we even have bn (ω) = bn (ω, θ), n ∈ N+ , (ω, θ) ∈ Ω2 .] Thus under (i) we have ¯ ¯ sup ¯ξn0 (t, ω) − ξn (t, (ω, θ))¯ ≤ t∈I



¯ ¯ 1 √ max ¯Si0 (ω) − Si (ω, θ)¯ σ(f ) n 1≤i≤n n X ¯ ¡ ¢¯ 1 ¯f (bi (ω)) − f bi (ω, θ) ¯ √ σ(f ) n i=1

¯ε sε (f ) X ¯¯ √ bi (ω) − bi (ω, θ)¯ σ(f ) n i=1 ³ ´ = O n−1/2 ≤

192

Chapter 3

as n → ∞, with a non-random constant independent of (ω, θ) ∈ Ω2 implied in O, while under (ii) it is easy to see that ¯ ¯ sup ¯ξn0 (t, ω) − ξn (t, (ω, θ))¯ ≤ t∈I

n X ¯ ¯ 1 ¯I(b,∞) (bi (ω)) − I(b,∞) (bi (ω, θ))¯ √ σ(f ) n i=1



³ ´ O(1) √ = O n−1/2 γ-a.s. σ(f ) n

with a random constant implied in O. Therefore in both cases ³ ´ ¯ ¯ sup ¯ξn0 (t, ω) − ξn (t, (ω, θ))¯ = O n−1/2 µ-a.s.

(3.2.16)

t∈I

for any µ ∈ pr(BI2 ) such that µ ¿ λ2 . Now, (3.2.15) and (3.2.16) imply at once that 0 w µξn−1 −→ W in both BC and BD for any µ ∈ pr(BI ) such that µ ¿ λ. To prove (b) note that for δ > 0 by Theorem 3.2.2 we have sup |ξn (t) − w(t)| = O(n−a ) a.s. t∈I

as n → ∞. By (3.2.16) it is obvious that the strong invariance principle holds as stated for the stochastic processes ξn0C or ξn0D , n ∈ N+ . 2 In the case where bn = rn , n ∈ N+ , under different assumptions on f , we can derive from Theorems 3.2.10 and 3.2.20 the following result. Theorem 3.2.10 Let f : [1, ∞) → R and define the function g by g(u) = f (1/u) , u ∈ (0, 1]. Assume that g is a function of bounded pvariation, p ≥ 1. Put S00

= 0,

Sn0

=

n X

f (ri ) − nEγ f (r1 ),

n ∈ N+ .

i=1

Then the series à µZ ¶2 µZ ¶2 ! Z X Z σ 2 (f ) = g 2 dγ − gdγ + 2 g U n gdγ − gdγ I

I

n∈N+

I

I

converges absolutely. If σ(f ) 6= 0 then both the weak and strong invariance principles hold as described in Theorems 3.2.10 and 3.2.20 for the stochastic

Limit theorems

193

processes ξn0C and ξn0D , n ∈ N+ , defined as in Theorem 3.2.9 with bn = rn , n ∈ N+ . Proof. In this case the function H considered in Theorems 3.2.10 and 3.2.20 is defined by H (i1 , i2 , ...) = g ([i1 , i2 , ...]) ,

N

(in )n∈N+ ∈ N+ + .

It follows from Proposition 2.1.23 and its proof that both (3.2.10 ) and (3.2.30 ) hold in our special case, hence the present statement. 2 Remark. Convergence rates in the central limit theorem are available for P the sequence ( ni=1 f (ri ) − nEγ f (r1 ))n∈N+ . Hofbauer and Keller (1982, p. 133) proved that ¯ µ Pn ¯ ¶ ¯ ¯ nEγ f (r1 ) i=1 f (ri ) − ¯ √ sup ¯γ < x − Φ(x)¯¯ = O(n−a ) σ(f ) n x∈R as n → ∞ for some 0 < a ≤ 1/2. Rousseau-Eg`ele (1983) showed that in the case p = 1 we can take a = 1/2. See also Iosifescu and Grigorescu (1990, pp. 212–213) and Miseviˇcius (1971). 2 Example 3.2.11 Let f (x) = log x, x ∈ [1, ∞). This is clearly a Lip0 schitz function ¯ since ¯α f (x) = 1/x ≤ 1 for any x ∈ [1, ∞). Also, it is easy to ¯ ¯ see that Eγ f (b0 ) < ∞ for any α ∈ R+ . In the cases where bn = yn or bn = rn , n ∈ N+ , Theorem 3.2.9 holds with Z ∞ 1 log x dx Eγ f (b0 ) = log 2 1 x(x + 1) ¯ µ µ ¶ ¶ Z ∞ x + 1 ¯¯∞ 1 1 1 − log x log + log 1 + dx = log 2 x ¯1 x x 1 Z 1 X (−1)k+1 ∞ dx = log 2 k xk+1 1 k∈N+

=

1 X (−1)k+1 log 2 k2 k∈N+

=

π2 12 log 2

while the corresponding σ(f ) = σ < ∞ is non-zero. This can be shown as follows. By the reversibility of (¯ a` )`∈Z —see Subsection 1.3.3—the finite

194

Chapter 3

dimensional distributions under γ¯ of (¯ y` )`∈Z and (¯ r` )`∈Z are identical. Then σ2

à n µ X 1E log y i − = lim n γ n→∞

i=1

1E = lim n γ n→∞

à n µ X log ri −

1E = lim n γ n→∞

à n µ X log ri −

i=1

i=1

2

π 12 log 2 π2 12 log 2 π2 12 log 2

¶!2

¶!2

¶!2 .

So, σ 2 coincides with (2.1.33) in the case where the function h is defined by h(ω) = log

π2 1 − , ω 12 log 2

ω ∈ Ω.

It is easy to check that U h ∈ BV (I) while h is essentially unbounded. Hence σ 6= 0 by Proposition 2.1.24. It is worth mentioning that Mayer (1990) showed that −π 2 /12 log 2 is the value at β = 2 of the first derivative of the dominant eigenvalue λ(β) of the Mayer–Ruelle operator Gβ . See Theorem 2.4.7. Also, Hensley (1994) showed that σ 2 = λ00 (2) − (λ0 (2))2 > 1/6. Note that in the case where bn = yn , n ∈ N+ , we have Sn0 =

n X i=1

log yi −

nπ 2 nπ 2 = log qn − , 12 log 2 12 log 2

n ∈ N+ .

In this case convergence rates in the central limit theorem are available. Miseviˇcius (1981) proved that ¯ ¯ µ ¶ µ ¶ 2 /12 log 2 ¯ ¯ log n log q − nπ n √ sup ¯¯λ < x − Φ(x)¯¯ = O √ (3.2.17) σ n n x∈R as n → ∞. Vall´ee (1997) was able to obtain the optimal convergence rate in (3.2.17) using Mayer–Ruelle operators. She proved that for µ ∈ pr(BI ) such that µ ¿ λ and the Radon–Nikodym derivative dµ/dλ is analytic and strictly positive in I, we have ¯ µ ¯ µ ¶ ¶ ¯ ¯ log qn − nπ 2 /12 log 2 1 ¯ ¯ √ sup ¯µ < x − Φ(x)¯ = O √ (3.2.18) σ n n x∈R

Limit theorems

195

as n → ∞. The same result for µ = λ had been also obtained by Morita (1994). For further results on the sequence (log qn )n∈N+ see Miseviˇcius (1992) and Vall´ee (1997). See also Example 3.4.6. From (3.2.18), using the double inequality ¯ ¯ ¯ pn (ω) ¯¯ 1 1 ¯ ≤ ¯ω − ≤ 2 , ω ∈ Ω, n ∈ N+ , ¯ 2 qn (ω) qn (ω) 2qn+1 (ω) we can derive the corresponding result for the random variable zn defined by ¯ ¯ ¯ ¯ p (ω) n ¯ , ω ∈ Ω, n ∈ N+ . zn (ω) = ¯¯ω − qn (ω) ¯ We have ¯ µ ¯ ¶ ¶ µ 2 /6 log 2 ¯ ¯ 1 ¯µ log zn + nπ ¯ √ < x − Φ(x)¯ = O √ ¯ 2σ n n as n → ∞. The details are left to the reader. In the case where bn = un , n ∈ N+ , Theorem 3.2.9 should hold with µZ 2 ¶ Z ∞ (x − 1) log x 1 log x dx Eγ f (b0 ) = dx + log 2 x2 x2 1 2 µ ¶ 1 1 1 1 1 2 2 2 ∞ (log x − 1) |1 + (log x) |1 − (log x − 1) |2 = 1 + log 2 = log 2 x 2 x 2 while we conjecture that σ(f ) is non-zero.

2

Example 3.2.12 Let f (x) = 1/x, x ∈ [1, ∞). This is also a Lips0 chitz function since | f (x) | = 1/x2 ≤ 1 for all x ∈ [1, ∞) while g(ω) = f (1/ω), ω ∈ Ω, is a function of bounded variation. Both Theorems 3.2.9, in the case where bn = rn , n ∈ N+ , and 3.2.10 hold with Z ∞ 1 dx 1 Eγ f (r1 ) = Eγ f (r0 ) = = −1 2 log 2 1 x (x + 1) log 2 while the corresponding σ(f ) = σ is non-zero. Indeed, σ 2 coincides with (2.1.33) in the case where the function h is defined by h(ω) = ω − and Proposition 2.1.26 applies.

1 + 1, log 2

ω ∈ Ω, 2

196

Chapter 3

3.3

Convergence to non-normal stable laws

3.3.1

The case of incomplete quotients

We start with a result which parallels Theorem 3.2.4. Theorem 3.3.1 Let f : N+ → R, An ∈ R, Bn ∈ R++ , n ∈ N+ , with limn→∞ Bn = ∞, and define Xnj = Bn−1 (f (aj ) − An ) , Sn0 = 0,

Snk =

k X

1 ≤ j ≤ n,

Xnj ,

1 ≤ k ≤ n,

Snn = Sn ,

n ∈ N+ .

j=1

Let k1 , k2 ≥ 0, k1 + k2 > 0, α ∈ (0, 2), and denote by ν = ν(k1 , k2 , α) the stable p.m. c1 Pois µ(k1 , k2 , α) (see Section A1.5). (i) The following assertions are equivalent. (I) The stochastic process ξnD = ξn = (ξn (t))t∈I defined for any n ∈ N+ by ξn (t) = Snbntc , t ∈ I, satisfies w

γξn−1 → Qν in BD , where the p.m. Qν is defined as in Section A3.3. w

(II) γSn−1 → ν, under γ.

and the array X = {Xnj , 1 ≤ j ≤ n, n ∈ N+ } is s.i.

(ii) Assertion (I) above holds if and only if X k −2 , x ∈ R+ , is regularly varying of index − α (3.3.1) Fe(x) = {k:|f (k)|>x}

and lim

x→∞

lim

x→∞

1

X

Fe(x) {k:f (k)>x} 1

k −2 =

X

Fe(x) {k:f (k)<−x}

k1 , k1 + k2 (3.3.2)

k −2

k2 = k1 + k2

or, equivalently (see Theorem A2.5), if and only if X F (x) = (log 2)−1 f 2 (k)k −2 , {k:|f (k)|≤x}

x ∈ R+ ,

Limit theorems

197

is regularly varying of index 2 − α and (3.3.2) holds or, equivalently, if and only if x2 Fe(x) 2−α lim = log 2 x→∞ F (x) α and (3.3.2) holds. If this is the case, then we can take An = Eγ f (a1 )I(|f (a1 )|≤Bn ) ,

n ∈ N+ ,

and any sequence (Bn )n∈N+ such that lim nBn−2 F (Bn ) = (k1 + k2 )/(2 − α).

n→∞

(iii) If either (I) or (II) above holds, then γ can be replaced in (i) by any µ ∈ pr (BI ) such that µ ¿ λ. Proof. (i) and (iii) follows from Theorem A3.7 and Lemma 3.0.2, respectively. The proof of (ii) is entirely similar to that working in the case of i.i.d. random variables. See Samur (1989, p. 62) and Araujo and Gin´e (1980, pp. 81, 84–85, 87–88). 2 Remark. In principle, from Theorem 3.3.1 we might derive the asymptotic behaviour as n → ∞ of random variables as, e.g., min Snk ,

0≤k≤n

max Snk ,

0≤k≤n

or

max |Snk |.

0≤k≤n

This depends on the possibility of determining the distribution of the random vector µ ¶ inf ξν (t), sup ξν (t), ξν (1) , t∈I

t∈I

where ξν = (ξν (t))t∈I is a stochastic process with stationary independent increments, ξν (0) = 0 a.s., trajectories in D, and ξν (1) having probability distribution ν (see Section A3.3). Note that this problem could be solved in the case of normal convergence, when ν is the standard normal distribution and ξν is the standard Brownian motion process—see Remark 2 following Theorem 3.2.4. 2 Corollary 3.3.2 Let k1 , k2 , α, and ν = ν(k1 , k2 , α) be as in Theorem 3.3.1. (i) Let f ∈ F (see Section A2.3). Then (3.3.1) and (3.3.2) hold if and only if f is regularly varying of index 1/α.

198

Chapter 3

(ii) Assume f : [1, ∞) → R++ is bounded on finite intervals and regularly varying of index 1/α. Let  ¶ µ α   , 0, α if α 6= 1, δ ∗ν    α/(1−α) log 2 log 2 να = µ ¶   1   ν , 0, 1 if α = 1,  log 2 and for any n ∈ N+ define the stochastic process ηn = (ηn (t))t∈I by  X  f (aj ) if α < 1,     j≤bntc       ¢  X ¡ 1 f (aj ) − Eγ f (a1 )I(f (a1 )≤f (n)) if α = 1, ηn (t) = × f (n)  j≤bntc      X    (f (aj ) − Eγ f (a1 )) if α > 1,    j≤bntc

with the usual convention which assigns value 0 to a sum over the empty set. Then w µηn−1 → Qνα in BD for any µ ∈ pr(BI ) such that µ ¿ λ. Proof. (i) By Lemma A2.6(iii) it is sufficient to show that X k −2 ∼ (f1 (x))−1 as x → ∞.

(3.3.3)

{k:f (k)>x}

For any x ≥ 1 by the definition of f1 and f2 (see Section A2.3) we have {k : k > f2 (x)} ⊂ {k : f (k) > x} ⊂ {k : k ≥ f1 (x)}. Hence

X 1≤

X

k −2

{k:f (k)>x}

X

k

k>f2 (x)

−2

≤1+

k −2

f1 (x)≤k≤f2 (x)

X

(3.3.4)

k −2

(3.3.5)

k>f2 (x)

for any x ≥ 1. But X f1 (x)≤k≤f2 (x)

k −2 ≤ (f1 (x) − 1)−1 − (f2 (x))−1 ,

(3.3.6)

Limit theorems

199 X

k −2 ≥ (f2 (x) + 1)−1

(3.3.7)

k>f2 (x)

for any x ≥ 1, and (f1 (x))−1 ∼ (f2 (x))−1 ∼

X

k −2 as x → ∞.

(3.3.8)

k>f2 (x)

Now, (3.3.3) follows from (3.3.5) through (3.3.8). (ii) By Lemma A2.6(ii) we have f ∈ F. It follows from (i) above and Theorem 3.3.1 that w µξn−1 → Qνα in BD for any µ ∈ pr(BI ) such that µ ¿ λ, where for any n ∈ N+ the process ξn = (ξn (t))t∈I is defined by ξn (t) =

¢ 1 X ¡ f (aj ) − Eγ f (a1 )I(f (a1 )≤Bn ) , t ∈ I, Bn j≤bntc

with Bn satisfying lim n Bn−2 F (Bn ) =

n→∞

k1 + k2 . 2−α

(3.3.9)

It is therefore sufficient to prove that in (3.3.9) we can take Bn = f (n), n ∈ N+ , k1 = α/ log 2, k2 = 0, and that lim Eγ (ηn (1) − ξn (1))

n→∞

  Eγ f (a1 )I(f (a1 )≤f (n)) n = lim × n→∞ f (n)  −Eγ f (a1 )I(f (a1 )>f (n)) =

if α < 1, if α > 1

(3.3.10)

α . (1 − α) log 2

To proceed notice first that by the very definition of f1 and f2 we have f1 (f (n) − 1) ≤ n ≤ f2 (f (n)) ,

n ∈ N+ .

Since f1 is regularly varying, by Corollary A2.2(i) we have f1 (f (n) − 1) ∼ f1 (f (n)) as n → ∞.

200

Chapter 3

As f1 ∼ f2 , it follows that fi (f (n)) ∼ n as n → ∞,

i = 1, 2.

(3.3.11)

Taking up (3.3.9) we begin by noting that (3.3.4) implies that X X f 2 (k)k −2 f 2 (k)k −2 k
X

2

f (k)k

−2



k≤f2 (x)

{k:f (k)<x}

X

f 2 (k)k −2

≤1

(3.3.12)

k≤f2 (x)

for all x ≥ 1. Next, we use Theorem A2.3 taking L(x) = x−2/α f 2 (bxc) (bxc + 1) /bxc,

x ≥ 1,

which is a slowly varying function. We easily obtain P x k≤x f 2 (k)k −2 α = . (3.3.13) lim x→∞ f 2 (x) 2−α P P Clearly, (3.3.13) also holds when k≤x is replaced by k<x . Because f1 ∼ f2 and f is regularly varying, it follows from (3.3.13) that the first fraction in (3.3.12) tends to 1 as x → ∞. Then by (3.3.13) again and (3.3.11) we obtain X n n F (f (n)) ∼ f 2 (k)k −2 f 2 (n) f 2 (n) log 2 k≤f2 (f (n))



1 n f 2 (f2 (f (n))) α log 2 f2 (f (n)) f 2 (n) 2−α



α as n → ∞, (2 − α) log 2

that is, (3.3.9) is satisfied as stated. Now, coming to (3.3.10) assume first α < 1. Then since ³ ´ 1 log 1 + k(k+2) =1 lim k→∞ k −2 P and k∈N+ f (k)k −2 = ∞, we have Eγ f (a1 )I(f (a1 )≤f (n)) ∼

1 log 2

X {k:f (k)≤f (n)}

(3.3.14)

(3.3.15)

f (k)k −2 as n → ∞.

Limit theorems

201

Therefore the asymptotic behaviour of n Eγ f (a1 )I(f (a1 )≤f (n)) f (n) as n → ∞ can be obtained from (3.3.14) by replacing f 2 by f, thus α by 2α (note that while f 2 is regularly varying of index 2/α, f is regularly varying of index 1/α). Thus α n Eγ f (a1 )I(f (a1 )≤f (n)) ∼ as n → ∞, f (n) (1 − α) log 2 that is, (3.3.10) holds when α < 1. Finally, let α > 1. We now use Theorem A2.4 taking L(x) = x−1/α f (bxc) (bxc + 1) /bxc,

x ≥ 1,

which is a slowly varying function. We easily obtain P x k≥x f (k)k −2 α lim = . (3.3.16) x→∞ f (x) α−1 P P Clearly, (3.3.16) also holds when k≥x is replaced by k>x . By (3.3.4), similarly to (3.3.12) we have Eγ f (a1 )I(a1 >f2 (f (n))) Eγ f (a1 )I(f (a1 )>f (n)) ≤ ≤ 1, Eγ f (a1 )I(a1 ≥f1 (f (n))) Eγ f (a1 )I(a1 ≥f1 (f (n)))

n ∈ N+ .

(3.3.17)

It follows from (3.3.16) that P the first fraction in (3.3.17) tends to 1 as n → ∞. Notice then that since k∈N+ f (k)k −2 < ∞, by (3.3.15 ) we have Eγ f (a1 )I(a1 ≥f1 (f (n))) ∼

1 log 2

X

f (k)k −2 as n → ∞.

k≥f1 (f (n))

Using (3.3.16) again we thus obtain n Eγ f (a1 )I(f (a1 )>f (n)) ∼ f (n)

n f (n) log 2

X

f (k)k −2

k≥f1 (f (n))



1 n f (f1 (f (n))) α log 2 f1 (f (n)) f (n) α−1



α as n → ∞, (α − 1) log 2

202

Chapter 3

that is, (3.3.10) holds when α > 1, too.

2

To complete the remark following Theorem 3.3.1 we note that Corollary 3.3.2 allows to derive in some cases the asymptotic behaviour as n → ∞ of the random variable Un = number of indices k, 1 ≤ k ≤ n, for which Snk > 0. Proposition 3.3.3 Assume f is bounded on finite intervals and regularly varying of index 1/α with 1 < α < 2. Then µ ¶ Un lim µ <x (3.3.18) n→∞ n n o   P card 1 ≤ k ≤ n : kj=1 f (aj ) > kEγ f (a1 ) = lim µ  < x n→∞ n =

sin(π/α) π

Z 0

x

dt , − t)1/α

t1−1/α (1

0 ≤ x ≤ 1,

for any µ ∈ pr(BI ) such that µ ¿ λ. Proof. It is easy to check that να defined in Corollary 3.3.2 is a strictly stable probability and να ((0, ∞)) = 1/α for any 1 < α < 2. Then (3.3.18) is an immediate consequence of Theorem 5.1 in de Acosta (1982). 2 Remarks. 1. Proposition 3.3.3 holds for α = 2, too. In this case the limiting distribution is the classical arc-sine law mentioned in Remark 2 following Theorem 3.2.4. However, the assumption on f in Proposition 3.3.3 is slightly stronger [cf. Corollary A2.7(ii)] than the assumption on f in Theorem 3.2.4, under which the arc-sine law holds. 2. It follows from Proposition 3.3.3 [cf. Theorem 5.2 in de Acosta (1982)] that Z sin(π/α) x dt µ (λ (t ∈ I : ξνα (t) > 0) < x) = , 0 ≤ x ≤ 1, 1−1/α π (1 − t)1/α 0 t for any 1 < α < 2. This generalizes P. L´evy’s arc-sine law for Brownian motion. 2

3.3.2

Sums of incomplete quotients

P From Corollary 3.3.2 we can derive results for the sums tn = nj=1 aj , n ∈ N+ , of incomplete coefficients by taking f (x) = x, x ∈ [1, ∞). In this case

Limit theorems

203

we have n

An = Eγ a1 I(a1 ≤n)

1 X (j + 1)2 = j log log 2 j(j + 2) j=1

=

µ ¶ 1 n+2 log(n + 2) − (n + 1) log , log 2 n+1

Hence An =

n ∈ N+ .

1 (log n − 1 + o(1)) log 2

(3.3.19)

as n → ∞. For any µ ∈ pr(BI ) such that µ ¿ λ by Corollary 3.3.2(ii) we have w µ (ηn (1))−1 → ν1 , (3.3.20) where

n

ηn (1) =

1X (aj − An ) , n

n ∈ N+ .

j=1

It follows from (3.3.19) and (3.3.20) that w

µ (ζn (1))−1 → δ(C−1)/ log 2 ∗ ν1 := ν 0 , where

¶ n µ 1X C − log n ζn (1) = aj + , n log 2

(3.3.21)

n ∈ N+ ,

j=1

and C = 0.57722 · · · is Euler’s constant. Note that the ch.f. of ν 0 is µ ¶ ¶ µ 2 π 0 1 + i sgn t log |t| |t| , t ∈ R, νb (t) = exp − 2 log 2 π see Section A1.5. Hence ν 0 is strictly stable. A convergence rate in (3.3.21) is available in the special case where µ = γ. Heinrich (1987) proved that there exists c0 ∈ R++ such that 2 ¯ ¯ ¯γ (ζn (1) < x) − ν 0 ((−∞, x))¯ ≤ c0 (log n) n

for any n ∈ N+ and x ∈ R. To conclude let us note that (3.3.21) is a special case of w

µζn−1 −→ Qν 0 in BD ,

(3.3.22)

204

Chapter 3

where for any n ∈ N+ the process ζn = (ζn (t))t∈I is defined by µ ¶ C − log n 1 X aj + ζn (t) = , t ∈ I. n log 2 j≤bntc

As a consequence (compare with Remark 2 following Proposition 3.3.3) we have à ! P card{1 ≤ k ≤ n : kj=1 aj > k(log n − C)/ log 2} lim µ <x n→∞ n = µ (λ(t ∈ I : ξν 0 (t) > 0) < x) ,

0 ≤ x ≤ 1.

An explicit expression of the last distribution function is not known. Immediate consequences of (3.3.21) and (3.3.22) are that (i) for any µ ∈ pr(BI ) such that µ ¿ λ we have tn 1 −→ n log n log 2

in µ-probability as n → ∞,

(3.3.23)

and (ii) for any ε > 0 and n ∈ N+ we have ¯ µ¯ ¶ ¯ ¯ tn 1 ¯≤ε − γ ¯¯ n log n log 2 ¯ ¸¶ µ· C 2c0 (log n)2 C , ε log n + − . ≥ν −ε log n + log 2 log 2 n P Khintchine (1934/35) proved using (3.3.23) that the series n∈N+ 1/tn is divergent a.e. in I. A stronger result is Theorem 3.3.4 below. This was stated by Doeblin (1940), but his proof is incorrect. We reproduce here the proof of Iosifescu (1996). 0

Theorem 3.3.4 The series Xµ 1 n≥2

log 2 − tn n log n



is absolutely convergent a.e. in I . Proof. In what follows, the letter c with different indices will denote suitable positive constants. Let h : N+ → N+ be a function such that limn→∞ h(n) = ∞. For any n ∈ N+ put tn (h) =

n X i=1

ai I(ai ≤h(n)) .

Limit theorems

205

It follows from (3.3.19) and the strict stationarity of (an )n∈N+ under γ that Eγ tn (h) =

n (log h(n) − 1 + o(1)) log 2

(3.3.24)

as n → ∞. Next, for any n ∈ N+ we have Eγ a21 I(a1 ≤n)

µ ¶ n 1 1 X 2 j log 1 + ≤ c1 n, = log 2 j(j + 2) j=1

and Corollary A3.2 yields Eγ (tn (h) − Eγ tn (h))2 ≤ c2 nh(n),

n ∈ N+ .

(3.3.25)

³ ´ Now, write t¯n = tn (h) for h(n) = n blog4/3 nc + 1 and t0n = tn (h) for h(n) = n, n ∈ N+ . For any n ≥ 3 by (3.3.24) we have ¯ ¯ ¯ 1 log 2 ¯¯ log log n ¯ ¯ E t − n log n ¯ ≤ c3 n log2 n . γ n P 2 Since the series n≥3 (log log n)/n log n is convergent, it is sufficient to prove that the series ¶ Xµ 1 1 − (3.3.26) tn Eγ tn n≥2

is absolutely convergent a.e. in I. For any n ≥ 2 consider the random events ¢ ¡ ¢ ¡ A2 (n) = A2 = tn < 12 Eγ tn , A1 (n) = A1 = tn > 32 Eγ tn , A3 (n) = A3 = A4 (n) = A4 =

¡1

¢ ¡ ¢ ≤ tn ≤ 32 Eγ tn ∩ tn 6= tn ,

¡1

¢ ¡ ¢ ≤ tn ≤ 32 Eγ tn ∩ tn = tn .

2 Eγ tn 2 Eγ tn

Let us find upper bounds for the γ-probabilities of A1 , A2 , and A3 .We have ¯ ¡ ¢ ¡¯ ¢ A1 = tn − Eγ tn > 12 Eγ tn ⊂ ¯tn − Eγ tn ¯ > 12 Eγ tn . By (3.3.24) and (3.3.25) the Bienaym´e–Chebyshev inequality implies ³ ´ ¡ ¢2 γ(A1 ) ≤ 4c2 n2 blog4/3 nc + 1 / Eγ tn ≤ c4 (log n)−2/3 . (3.3.27)

206

Chapter 3

¡ ¢ Since t0n ≤ tn , n ∈ N+ and Eγ tn /2 − Eγ t0n < 0 for n large enough, for such an n we have ¡ ¢ ¡ ¢ ¡ ¢ A2 = tn < 12 Eγ tn ⊂ t0n < 12 Eγ tn = t0n − Eγ t0n < 12 Eγ tn − Eγ t0n ¯ ¢ ¡¯ 0 ¯tn − Eγ t0n ¯ > Eγ t0n − 1 Eγ tn . 2



Again by (3.3.24) and (3.3.25), the Bienaym´e–Chebyshev inequality implies c02 n2 −2 γ(A2 ) ≤ ¡ ¢2 ≤ c5 (log n) . Eγ t0n − Eγ tn /2 Noting that (tn 6= tn ) =

n ³ [

(3.3.28)

´ ai > n(blog4/3 nc + 1) ,

i=1

whence

³ ³ ´´ γ(tn 6= tn ) ≤ nγ a1 > n blog4/3 nc + 1 ≤ c6 (log n)−4/3 ,

(3.3.29)

we obviously have γ(A3 ) ≤ c6 (log n)−4/3 .

(3.3.30)

Next, let us find an upper bound for ¯ ¯ 4 ¯1 1 ¯¯ X ¯ Eγ ¯ − = Ii (n), tn Eγ tn ¯ i=1 where

¯ ¯ ¯1 ¯ 1 ¯ − ¯ Ii (n) = ¯ tn E t ¯ dγ, γ n Ai Z

1 ≤ i ≤ 4.

Since tn ≤ tn , n ∈ N+ , on A1 we have 1 1 2 ≤ < . tn tn 3Eγ tn

(3.3.31)

It follows from (3.3.24), (3.3.27), and (3.3.31) that I1 (n) ≤ c7 n−1 (log n)−5/3 .

(3.3.32)

Since tn ≥ n, n ∈ N+ , by (3.3.24), (3.3.28), and (3.3.30) we have I2 (n) ≤ c8 n−1 (log n)−2 , I3 (n) ≤ c9 n−1 (log n)−4/3 .

(3.3.33)

Limit theorems

207

Finally, set wn = (tn − Eγ tn )/Eγ tn and note that by (3.3.24) and (3.3.25) we have Eγ |wn | ≤ Eγ1/2 wn2 ≤ c10 (log n)−1/3 . Since on A4 we have Z I4 (n) = ≤

tn = tn and 2/3 ≤ 1/(1 + wn ) ≤ 2, it follows that ¯ ¯ Z ¯ ¯1 |wn | ¯ − 1 ¯ dγ = dγ ¯ tn E t ¯ (1 + wn )Eγ tn γ n A4 A4 (3.3.34) 2 Eγ |wn | ≤ c11 n−1 (log n)−4/3 . Eγ t¯n

Therefore by (3.3.32) through (3.3.34) we have ¯ ¯ ³ ´ ¯1 1 ¯¯ −1 −4/3 ¯ Eγ ¯ − = O n (log n) tn Eγ tn ¯ P −1 −4/3 is convergent, by Beppo as n → ∞. As the series n≥2 n (log n) Levy’s theorem series (3.3.26) is absolutely convergent a.e. in I. The proof is complete. 2 Corollary 3.3.5 We have Pn i=1 1/ti lim = log 2 n→∞ log log n

a.e..

Proof. This follows immediately from Theorem 3.3.4 since, as is well known, ! Ã n X 1 lim − log log n n→∞ i log i i=1

exists and is finite. 2 For further results on the sums tn , n ∈ N+ , see Theorem 4.1.9 and its corollaries.

3.3.3

The case of associated random variables

We shall now show that Corollary 3.3.2 still holds in the case where α < 1 when aj is replaced by either yj , rj , or uj , j ∈ N+ . This will follow from the result below (compare with Lemma 3.1.4).

208

Chapter 3

Lemma 3.3.6 Let bn , n ∈ N+ , be real-valued random variables on (I, BI ) such that an ≤ bn ≤ an + c, n ∈ N+ , for some c ∈ R+ . For any n ∈ N+ consider the stochastic processes ηn = (ηn (t))t∈I and ηn0 = (ηn0 (t))t∈I defined by 1 X 1 X ηn (t) = f (aj ), ηn0 (t) = f (bj ), t ∈ I, f (n) f (n) j≤bntc

j≤bntc

with the usual convention which assigns value 0 to a sum over the empty set, where f : [1, ∞) → R++ is bounded on finite intervals and regularly varying of index β > 1. Then d0 (ηn , ηn0 ) converges to 0 in γ-probability as n → ∞. Proof. Write f (x) = xβ L(x), x ∈ [1, ∞), where L is slowly varying. For any n ∈ N+ we have d0 (ηn , ηn0 ) ≤ sup |ηn (t) − ηn0 (t)| t∈I



n 1 P |f (a ) − f (b )| ≤ δ 0 + δ 00 , j j n n f (n) j=1

(3.3.35)

where ´ 1 X³ β bj − aβj L(aj ), f (n) n

δn0 =

j=1

n

δn00 =

1 X β bj |L(bj ) − L(aj )| . f (n) j=1

¡ ¢ Using the inequality (1 + a)α − 1 ≤ a {α} + bαc(1 + a)α−1 , valid for nonnegative a and α, we obtain bβj − aβj ≤ cβ(1 + c)β−1 aβ−1 , j whence

1 ≤ j ≤ n,

n

δn0 ≤ cβ(1 + c)β−1

1 X −1 aj f (aj ). f (n) j=1

Writing −1 −1 a−1 j f (aj ) = aj f (aj )I(aj ≤M ) + aj f (aj )I(aj >M ) ,

1 ≤ j ≤ n,

for an arbitrarily given M ≥ 1, we easily obtain   n X n f (i) 1 1 δn0 ≤ cβ(1 + c)β−1  max + f (aj ) . f (n) 1≤i≤M i M f (n) j=1

Limit theorems

209

Then for any ε > 0 by Corollary 3.3.2(ii) we have ³ ´ lim sup γ δn0 > cβ(1 + c)β−1 ε n→∞

≤ lim sup γ (ηn (1) > M ε/2) n→∞

µ· ≤ ν1/β

¶¶ Mε ,∞ −→ 0 as M → ∞. 2

Hence δn0 converges to 0 in γ-probability as n → ∞. Next, for any fixed M ≥ 1 we can write  ! Ã µ ¶β n X b 1 j  δn00 ≤ f (aj ) I(aj ≤M ) f (bj ) + f (n) aj j=1

+

¶ n µ X bj β j=1

aj

à ³ ≤

 ¯ ¯ ¯ L(bj ) ¯ f (aj ) ¯¯ − 1¯¯ I(aj >M )  L(aj )

β

1 + (1 + c)

!

´ sup

f (x)

1≤x≤M +c

(1 + c)β + f (n)

n f (n)

¯ ¯X ¯ L(x + s) ¯ n ¯ ¯ sup f (aj ). ¯ L(x) − 1¯ 0≤s≤c, x>M j=1

Given η > 0, choose M ≥ 1 such that ¯ ¯ ¯ L(x + s) ¯ ¯ sup ¯ − 1¯¯ ≤ η L(x) 0≤s≤c for x > M, which is possible by the Karamata representation of L (see Theorem A2.1). Then for any ε > 0 by Corollary 3.3.2(ii) again we have ³ ´ lim sup γ(δn00 > ε) ≤ lim sup γ ηn (1) > η −1 (1 + c)−β ε/2 n→∞

n→∞

µ· ≤ ν1/β

¶¶ η −1 (1 + c)−β ε ,∞ −→ 0 as η → 0. 2

Hence δn00 converges to 0 in γ-probability as n → ∞.

210

Chapter 3 By (3.3.35) the proof is complete.

2

Corollary 3.3.7 Let bn denote either yn , rn or un , n ∈ N+ . For any n ∈ N+ consider the stochastic process   X 1 ηn0 =  f (bj ) f (n) j≤bntc

t∈I

with the usual convention which assigns value 0 to a sum over the empty set, where f : [1, ∞) → R++ is bounded on finite intervals and regularly varying of index 1/α, 0 < α < 1. Let µ ∈ pr(BI ) such that µ ¿ λ. Then w

µηn0−1 → Qνα in BD . Proof. Lemma 3.3.6 applies with c = 1 in the case of yn and rn and with c = 2 in the case of un . Since µ ¿ λ, the distance d0 (ηn , ηn0 ) converges to 0 in µ-probability, too, as n → ∞. This property and Corollary 3.3.2(ii) imply the result stated. 2 In the case where α ≥ 1 we have results which complement Theorem 3.2.7. Write b0 for either y 0 , r0 or u0 . Theorem 3.3.8 Let bn denote either yn , rn or un . Assume f : 2 [1, ∞) → R++ is regularly varying of index 1/α, µ x α ∈ [1, 2), ¶ Eγ f (a1 ) = ∞, R and f (x) = x1/α L(x), where L(x) = c exp ε(t)t−1 dt , x ≥ 1, with 1

c > 0, ε : [1, ∞) → R continuous, and limt→∞ ε(t) = 0. For any n ∈ N+ define the process η¯n0 = (¯ ηn0 (t))t∈I by

η¯n0 (t) =

 X ¡ ¢ f (bj ) − m(f, b0 ) − Eγ f (a1 )I(f (a1 )≤f (n))      j≤bntc

1 × X ¡ ¢ f (n)    f (b ) − E f (b )  j γ 0 

if α = 1,

if α > 1

j≤bntc

with the usual convention which assigns value 0 to a sum over the empty set, where m(f, b0 ) and Eγ f (b0 ) are equal to m(f, y 0 ) = m(f, r0 ) = Eγ (f (r0 ) − f (a0 )) = Eγ (f (r1 ) − f (a1 )) =

1 log 2

Z 1



(f (x) − f (bxc)) dx , x(x + 1)

Limit theorems

211

m(f, u0 ) = Eγ (f (u0 ) − f (a0 )) =

1 log 2

Z

∞Z ∞µ

1

µ ¶ ¶ 1 f x+ − f (bxc) (xy + 1)−2 dxdy y

1

µZ

=

2 1 (f (x) − f (1)) (x − 1) dx log 2 x2 1 ¶ Z ∞ f (x) − (bxc − x + 1)f (bx − 1c) − (x − bxc)f (bxc) + dx , x2 2

Z ∞ f (x)dx 1 Eγ f (y 0 ) = Eγ f (r0 ) = Eγ f (r1 ) = , log 2 1 x(x + 1) µZ 2 ¶ Z ∞ 1 f (x)(x − 1)dx f (x)dx Eγ f (u0 ) = + , log 2 x2 x2 1 2 according as bn denotes yn , rn or un , n ∈ N+ . Then w

µη 0−1 n −→ Qνα in BD for any µ ∈ pr(BI ) such that µ ¿ λ, where να is defined as in Corollary 3.3.2(ii). The proof of Theorem 3.3.8 for the cases bn = rn or bn = un , n ∈ N+ , can be found in Samur (1989, pp. 75–77). The case where bn = yn , n ∈ N+ , can be treated in a similar manner. 2 Example 3.3.9 Let f (x) = x1/α , x ∈ [1, ∞), where α ∈ (1, 2). (For the case α = 2 see Example 3.2.8.) Theorem 3.3.8 holds with Eγ f (y 0 ) = Eγ f (r0 ) = Eγ f (r1 ) = =

=

1 log 2 1 log 2

Z



1

X j∈N+

1 2 log 2

x1/α dx 1 = x(x + 1) log 2

Z 0

1

v −1/α dv v+1

1 (2j − 1 − 1/α)(2j − 1/α)

µ µ ¶ µ ¶¶ 1 1 1 ψ 1− −ψ − , 2α 2 2α

212

Chapter 3

where ψ is the digamma function—see p. 145—and ¶ µZ 2 Z ∞ 1 (x − 1)dx dx α2 (21/α − 1) Eγ f (u0 ) = + = . log 2 (α − 1) log 2 x2−1/α x2−1/α 1 2 2 Example 3.3.10 Let f (x) = x, x ∈ [1, ∞). Theorem 3.3.8 holds with Z ∞ 1 (x − bxc) dx m(f, y 0 ) = m(f, r0 ) = log 2 1 x(x + 1) Z ∞ 1 dx = = (log 2)−1 − 1, log 2 1 x2 (x + 1) −1 m(f, u0 ) = Eγ (r0 − a0 + y −1 0 ) = m(f, r 0 ) + Eγ (y 0 )

=

2 log 2

Z



1

¡ ¢ dx −1 = 2 (log 2) − 1 . x2 (x + 1) 0

by

0

It follows that if for any n ∈ N+ the process ζn = (ζn (t))t∈I is defined µ ¶ C − log n 1 X bj + , ζn (t) = n log 2 0

t ∈ I,

j≤bntc

where bn denotes either yn , rn or un , n ∈ N+ , then for any µ ∈ pr(BI ) such that µ ¿ λ we have 0 w µζn−1 −→ Qν 00 in BD in the cases where bn = yn or bn = rn , n ∈ N+ , with ν 00 = δC/ log 2−1 ∗ ν1 , and 0 w µζn−1 −→ Qν 000 in BD in the case where bn = un , n ∈ N+ , with ν 000 = δ(C+1)/ log 2−2 ∗ ν1 . As a consequence (compare with the similar result for the incomplete quotients an , n ∈ N+ , in Subsection 3.3.2) we have ! à P card{1 ≤ k ≤ n : kj=1 yj > k(log n − C)/ log 2} <x lim µ n→∞ n à = lim µ

card{1 ≤ k ≤ n :

Pk

j=1 rj

n→∞

= µ (λ(t ∈ I : ξν 00 (t) > 0) < x) ,

> k(log n − C)/ log 2}

n 0 ≤ x ≤ 1,

! <x

Limit theorems

213

and à lim µ

card{1 ≤ k ≤ n :

Pk

j=1 uj

> k(log n − C)/ log 2}

n

n→∞

¡ ¢ = µ λ(t ∈ I : ξν 000 (t) > 0) < x ,

! <x

0 ≤ x ≤ 1. 2

3.4

Fluctuation results

3.4.1

The case of incomplete quotients

We start with a direct consequence of Theorem 3.2.20 . Let K ⊂ C be the collection of all absolutely continuous functions x ∈ C R1 for which x(0) = 0 and 0 [x0 (t)]2 dt ≤ 1. Here x0 stands for the derivative of x which exists a.e. in I. N Let H be a real-valued function on N+ + . Set Hn = H (an , an+1 , · · · ) , n ∈ 2 0 N Denoting Sn = P+n, and assume that Eγ H1 < ∞ and (3.2.1 ) holds. 2 defined by (3.2.20 ) is H − nE H , n ∈ N , and assuming that σ n γ 1 + i=1 non-zero, for any n ≥ 3 put θn (t) = =

¢¢ ¡ ¡ 1 √ Sbntc + (nt − bntc) Hbntc+1 − Eγ H1 σ 2n log log n 1 √ ξC , 2n log log n n

t ∈ I.

Theorem 3.4.1 (Strassen’s law of the iterated logarithm). Assume that Eγ |H1 |2+δ < ∞ for some constant δ > 0, (3.2.30 ) holds, and σ 2 defined by (3.2.20 ) is non-zero. Then the sequence (θn )n≥3 , viewed as a subset of C , is a relatively compact set whose derived set coincides a.e. with K. Proof. The result follows from Strassen’s law of the iterated logarithm for standard Brownian motion [see Theorem 1 in Strassen (1964)] and Theorem 3.2.20 . 2 Corollary 3.4.2 (Classical law of the iterated logarithm). Under the assumptions of Theorem 3.4.1 the set of accumulation points of the sequence ³ ´ p Sn /σ 2n log log n n≥3

214

Chapter 3

coincides a.e. with the segment [−1, 1]. In the special case where H only depends on finitely many coordinates N of a current point of N+ + , i.e., when H is a real-valued function on Nk+ for a given k ∈ N+ , certain assumptions in Theorem 3.4.1 are no longer necessary. In this case Hn = H (an , · · · , an+k−1 ), n ∈ N+ , and (3.2.30 ) is trivially satisfied. Also, σ 2 reduces to (3.2.200 ) and when k = 1 by Corollary 2.1.25 we have σ 2 = 0 if and only if H = const. Finally, it is enough to assume that Eγ H12 < ∞. This follows from the work of Heyde and Scott (1973). Cf. the remark following Proposition 3.2.6. We state a most striking result. Proposition 3.4.3 Let f : N+ P → R be a nonconstant function. Assume that Eγ f 2 (a1 ) < ∞ and put Sn = ni=1 f (ai ) − nEγ f (a1 ) , n ∈ N+ . Let X ¡ ¢ σ 2 = Eγ f 2 (a1 ) − Eγ2 f (a1 ) + 2 Eγ f (a1 ) f (an+1 ) − Eγ2 f (a1 ) , n∈N+

which by Corollary 2.1.25 is non-zero. For any n ≥ 3 put ¡ ¢ 1 Sbntc + (nt − bntc) (fbntc+1 − Eγ f (a1 )) , θn (t) = √ σ 2n log log n

t ∈ I.

Then the sequence (θn )n≥3 , viewed as a subset of C, is a relatively compact set whose derived set coincides a.e. with √ K. In particular, the set of accumulation points of the sequence (Sn /σ 2n log log n)n≥3 coincides a.e. with the segment [−1, 1]. The almost sure invariance principle is instrumental in establishing integral tests which characterize the asymptotic growth rates of partial sums and maximum absolute partial sums. Proposition 3.4.4 Let θ : [1, ∞) → R++ be non-decreasing. Then under the assumptions of Theorem 3.4.1 the following assertions hold: √ (i) γ (Sn > σ n θ (n) i.o.) = 0 or 1 according as µ 2 ¶ Z ∞ θ (t) θ (t) exp − dt t 2 1 converges or diverges. (ii) according as

√ γ (max1≤i≤n |Si | < σ n/θ(n) i.o.) = 0 or 1 Z 1



µ 2 2 ¶ θ2 (t) π θ (t) exp − dt t 8

Limit theorems

215

converges or diverges. Proof. These results follow from Theorem 3.2.20 and properties of standard Brownian motion. See Jain and Taylor (1973) and Jain, Jogdeo and Stout (1975) [cf. Philipp and Stout (1975)]. 2 Except for the sufficiency of the moment assumption Eγ H12 < ∞ in the case considered there, the considerations on Theorem 3.4.1 following Corollary 3.4.2 are valid for Proposition 3.4.4, too. We note that Proposition 3.4.4(i) implies the classical law of the iterated logarithm µ ¶ Sn γ lim sup √ = 1 = 1. (3.4.1) n→∞ σ 2n log log n √ To obtain (3.4.1) √ we should take successively θ(n) = (1 + ε) 2 log log n and θ(n) = (1 − ε) 2 log log n, 0 < ε < 1, n ∈ N+ . Also, Proposition 3.4.4(ii) implies Chung’s law of the iterated logarithm for maximum absolute partial sums ! Ã π max1≤i≤n |Si | = 1. (3.4.2) =√ γ lim inf p n→∞ σ n/(log log n) 8 √ √ To obtain (3.4.2) we should take successively θ(n) = ( 8/π)(1+ε) log log n √ √ and θ(n) = ( 8/π)(1 − ε) log log n, 0 < ε < 1, n ∈ N+ . We conjecture that in the special case where H only depends on finitely N many coordinates of a current point in N+ + , Chung’s law of the iterated logarithm (3.4.2) holds only assuming that Eγ H12 < ∞ [as (3.4.1) does]. See Jain and Pruitt (1975) for the i.i.d. case.

3.4.2

The case of associated random variables

Write bn for either yn , rn or un , n ∈ N+ , respectively b0 for either y 0 , r0 or u0 . Theorem 3.4.5 Let f : [1, ∞) → R satisfy either (i) or (ii) of Theorem 3.2.9. With the notation of that theorem assume that σ(f ) > 0 and put 1 θn0 (t) = √ ξ 0C (t), 2n log log n n

n ≥ 3, t ∈ I.

If δ > 0 then the sequence (θn0 )n≥3 , viewed as a subset of C, is a relatively compact set whose derived set coincides a.e. √ with K. In particular, the set of 0 accumulation points of the sequence (Sn /σ 2n log log n)n≥3 coincides a.e. with the segment [−1, 1].

216

Chapter 3

Proof. The results follow at once from Theorem 3.2.9(b) and Strassen’s law of the iterated logarithm for standard Brownian motion [see Theorem 1 in Strassen (1964)]. 2 Note that in the present context we cannot make considerations similar to those following Corollary 3.4.2. Example 3.4.6 Let f (x) = log x, x ∈ [1, ∞). As we have seen in Example 3.2.11, in the cases where bn = yn or bn = rn , n ∈ N+ , we have Eγ f (b0 ) =

π2 12 log 2

and σ(f ) = σ < ∞ is non-zero. It follows that Strassen’s law of the iterated logarithm holds for the corresponding processes θn0 , n ∈ N+ . In particular, the classical law of the iterated logarithm µ ¶ log qn − nπ 2 /12 log 2 √ γ lim sup =1 =1 σ 2n log log n n→∞ holds. This had been proved by Gordin and Reznik (1970) and Philipp and Stackelberg (1969). 2 A result similar to Proposition 3.4.4 holds. Proposition 3.4.7 Let θ : [1, ∞) → R++ be non-decreasing. Then under the assumptions of Theorem 3.2.9 the following assertions hold: √ (i) γ(Sn0 > σ(f ) n θ(n) i.o.) = 0 or 1 according as µ 2 ¶ Z ∞ θ (t) θ(t) exp − dt t 2 1 converges or diverges. √ (ii) γ (max1≤i≤n |Si0 | < σ(f ) n/θ(n) i.o.) = 0 or 1 according as µ 2 2 ¶ Z ∞ 2 π θ (t) θ (t) exp − dt t 8 1 converges or diverges. Proof. These results follow from Theorem 3.2.9 and properties of standard Brownian motion. See Jain and Taylor (1973) and Jain, Jogdeo and Stout (1975) [cf. Philipp and Stout (1975)]. 2 The remarks following Proposition 3.4.4 concerning the classical and Chung’s laws of the iterated logarithm apply mutatis mutandis in the present context, too.

Limit theorems

217

It is obvious that all the results stated in this section still hold when γ is replaced by any µ ∈ pr(BI ) such that µ ¿ λ.

218

Chapter 3

Chapter 4

Ergodic theory of continued fractions In this chapter applications of the ergodic properties of the continued fraction transformation τ and its natural extension τ are given. Next, two operations (‘singularization’ and ‘insertion’) on incomplete quotients are introduced, which allow to obtain most of the continued fraction expansions related to the RCF expansion. Ergodic properties of these expansions are also derived.

4.0 Ergodic theory preliminaries 4.0.1 A few general concepts Let (X, X , µ) be a probability space. An X-valued random variable on X, i.e., an (X , X )-measurable map from X into itself (see Section A1.2), is called a transformation of X. A transformation T of X is said to be µ-non-singular if and only if µ(T −1 (A)) = 0 for any A ∈ X for which µ(A) = 0; it is said to be measure preserving if and only if µT −1 = µ, i.e., µ(T −1 (A)) = µ(A) for any A ∈ X – see Section A1.3. (When the probability µ should be emphasized we shall say that T is µ-preserving.) Clearly, any µ-preserving transformation of X is µ-non-singular. A pair (T, µ), where T is a µ-preserving transformation of X, is called an endomorphism of X. An endomorphism (T, µ) of X is called an automorphism if and only if T is bijective [that is, T (X) = X and T −1 exists] and T −1 is (X , X )-measurable. A quadruple (X, X , T, µ), where (T, µ) is an endomorphism of X, is called a (measurable) dynamical system. 219

220

Chapter 4

A transformation T of X is said to be ergodic (or metrically transitive, or indecomposable) under µ if and only if the sets A ∈ X with T −1 (A) = A, which are called T -invariant, satisfy either µ(A) = 0 or µ(A) = 1. An equivalent definition, even if seemingly more general, is that ¡ ¢ µ (T −1 (A) \ A) ∪ (A \ T −1 (A)) = 0 for A ∈ X if and only if either µ(A) = 0 or µ(A) = 1. Finally, in terms of functions this is equivalent to f = f ◦ T µ-a.s. for an X-valued random variable f on X if and only if f is constant µ-a.s. In particular, T is ergodic under µ if it is strongly mixing under µ, that is, lim µ(T −n (A) ∩ B) = µ(A)µ(B) n→∞

for any sets A, B ∈ X . This is equivalent to Z Z Z lim (f ◦ T n )g dµ = f dµ g dµ n→∞ X

X

X

for any f ∈ L∞ (X, X , µ) and g ∈ L1 (X, X , µ). Proposition 4.0.1 Let T be a µ-non-singular transformation of X. If T is ergodic under µ, then there exists at most one probability measure ν on X such that ν ¿ µ and (T, ν) is an endomorphism of X. Conversely, if there exists a unique measure ν on X with ν ¿ µ and dν/dµ > 0 µ-a.s. such that (T, ν) is an endomorphism of X, then T is ergodic under µ. The proof of Proposition 4.0.1, which entails the concept of the Perron– Frobenius operator of T (cf. Section 2.1), can be found in Lasota and Mackey (1985). 2 An endomorphism (T, µ) of X is said to be exact if and only if, putting Xn =

¡

¢ T −n (A) : A ∈ X ,

n ∈ N,

T where T 0 is the identity map, the tail σ-algebra n∈N Xn is µ-trivial, i.e., it contains only sets A for which either µ(A) = 0 or µ(A) = 1. If an endomorphism (T, µ) of X is exact, then T is ergodic under µ; also, for any A ∈ X for which µ(A) > 0 and T n (A) ∈ X , n ∈ N+ , we have lim µ (T n (A)) = 1.

n→∞

Ergodic theory of continued fractions

221

Proposition 4.0.2 Let T be a µ-preserving transformation of X for which T (A) ∈ X for any A ∈ X . Then the endomorphism (T, µ) is exact if and only if Z lim ||P n f −

n→∞

X

f dµ||1,µ = 0

for any non-negative f ∈ L1 (X, X , µ), where P is the Perron–Frobenius operator of T under µ (cf. Section 2.1). For the proof see Boyarski and G´ora (1997, p. 82).

2

Theorem 4.0.3 (Birkhoff’s individual ergodic theorem) Let T be a µpreserving transformation of X. Then for any f ∈ L1 (X, X , µ) there exists f˜ ∈ L1 (X, X , µ) such that n−1

1X f (T k (x)) = f˜ µ-a.s. n→∞ n lim

k=0

and R

f˜ ◦ T = f˜ µ-a.s.

R

Moreover, X f˜ dµ = X f dµ and if, in addition, T is ergodic under µ, then R f˜ is µ-a.s. a constant equal to X f dµ. A proof of the ergodic theorem can be found in, e.g., Billingsley (1965), Walters (1982), Petersen (1983) or Cornfeld et al. (1982). In particular, in Keane (1991) a short proof, essentially based on an idea of Kamae (1982), is outlined. See also Katznelson and Weiss (1982). 2 Under suitable assumptions it is possible to refine Birkhoff’s theorem by giving an estimate of the convergence rate to the limit f˜. The result stated below is a special case of Theorem 3 of G´al and Koksma (1950). Proposition 4.0.4 Let T be a µ-preserving transformation of X which is ergodic under µ. Assume that Z Ãn−1 X X

Z f ◦ Tκ − n

!2 f dµ

dµ = O(Ψ(n))

X

κ=0

as n → ∞, where Ψ : N+ → R is a function such that the sequence (Ψ(n)/n)n∈N+ is non-decreasing. Then whatever ε > 0 we have n−1 X κ=0

Z κ

f (T (x)) = n X

´ ³ 3+ε f dµ + o Ψ1/2 (n) log 2 n

µ-a.s.

222

Chapter 4

as n → ∞. Here the constant implied in o depends on ε and the current point x ∈ X. Given a transformation T of X we can define its so called natural extension T as follows. Let ¡ ¢ XT = (xi )i∈N ∈ X N : xi = T (xi+1 ), i ∈ N and define T : XT → XT by T ((xi )i∈N ) = (T (x0 ), x0 , x1 , · · · ) for any (xi )i∈N = (x0 , x1 , · · · ) ∈ XT . It is easy to check that T is bijective. If T is µ-preserving, then we can also define a measure µ on the σ-algebra XT ⊂ X N generated by the cylinder sets C(A0 , . . . , An ) = ((xi )i∈N ∈ XT : xj ∈ Aj , 0 ≤ j ≤ n) , where Aj ∈ X , 0 ≤ j ≤ n, n ∈ N, by setting   \ µ(C(A0 , . . . , An )) = µ  T −n+j (Aj ),

n ∈ N.

0≤j≤n

Proposition 4.0.5 If T is µ-preserving, then T is µ-preserving; T is ergodic (strongly mixing) under µ if and only if T is ergodic (strongly mixing) under µ. Clearly, if (T, µ) is an endomorphism of X, then (T¯, µ ¯) is an automorphism of XT . Remarks. 1. The definition just given of the natural extension T of T is a constructive one. More generally, starting from a transformation T of X which is µ-preserving (µT −1 = µ), a bijective transformation T : X → X is called a natural extension of T if and only if (i) there exists a measurable space (X, X ) and a probability measure µ on X such that T is µ-preserving, and (ii) there exists a random variable f : X → X such that S n the σ-algebra generated by n∈N T f −1 (X )—see Section A1.1—coincides with X up to sets of µ ¯-probability 0, f ◦ T = T ◦ f µ ¯-a.s., and µ ¯f −1 = µ. The natural extension is unique up to isomorphism. By this we mean that if T i : X i → X i , i = 1, 2, are natural extensions of T : X → X, with X i being µi -preserving for a probability measure µi on X i (the σ-algebra in

Ergodic theory of continued fractions

223

X i ), i = 1, 2, then there exist Ei ∈ X i with µ(Ei ) = 0, i = 1, 2, and a one-to-one random variable g : X 1 \ E1 → X 2 \ E2 such that gT 1 = T 2 g on X 1 \ E1 and µ1 (g −1 (E)) = µ2 (E) for any set E in X 2 which is included in X 2 \ E2 . In the case of the constructive definition we clearly have X = XT while f is defined by f ((xi )i∈N ) = x0 ,

(xi )i∈N ∈ XT .

Note that the definition of isomorphism of two natural extensions of a given endomorphism also applies to the case of two arbitrary endomorphisms or dynamical systems. 2. Unlike ergodicity or strong mixing, exactness does not transfer from ¯). As T is invertible, an endomorphism (T, µ) to its natural extension (T , µ (T , µ ¯) cannot be exact since ³ ´ ¡ ¢ −1 µ ¯ T (A) = µ ¯ T (T (A)) = µ ¯(A), ¡ n ¢ ¡ ¢ hence µ ¯ T (A) = µ ¯(A) for any n ∈ N+ and A ∈ X . Instead, T , µ ¯ always is a K-automorphism, which means that there exists an algebra A ⊂ X S −1 n such that T (A) ⊂ A, n∈N+ T (A) generates X , and the tail σ-algebra T −n (A) is µ ¯-trivial. Cf. Petersen (1983, Section 2.5) 2 n∈N+ T Finally, let us consider together with the probability space (X, X , µ) and a transformation T : X → X, a family of probability spaces ((Y, Y, νx ))x∈X and a family (Tx )x∈X of transformations of Y such that the map (x, y) ∈ X × Y → Tx (y) ∈ Y is an Y -valued random variable on X × Y . The map S : X × Y → X × Y defined by S(x, y) = (T (x), Tx (y)) ,

(x, y) ∈ X × Y,

is called a skew product of T and (Tx )x∈X . In many cases the natural extensions are constructed as skew products. Several examples can be found in the next sections. Assuming that T is µ-preserving and Tx is νx -preserving for any x ∈ X, we might expect the skew-product S to be ν-preserving, where ν is the probability measure on X ⊗ Y defined by Z ν(A × B) = νx (B) µ(dx), A ∈ X , B ∈ Y. A

Unfortunately, such a result does not hold even if it is claimed in Boyarski and G´ora (1997, p. 64). It is contradicted, e.g., by the case of the natural extension τ¯ of τ . Cf. the next subsection.

224

Chapter 4

4.0.2 The special case of the transformations τ and τ It is possible to give a direct proof of the ergodicity under γ of the continued fraction transformation τ . See, e.g., Billingsley (1965, pp. 44–45). Results proved in Chapter 2 allow us to assert that actually τ is strongly mixing under γ and any γa , a ∈ I, thus in particular under γ0 = λ. This is a direct consequence of Corollary 1.3.15. Therefore τ is also ergodic under γ and any γa , a ∈ I. Moreover, the endomorphism (τ, γ) is exact by Corollary 2.1.8 and Proposition 4.0.2. It follows from Proposition 4.0.1 that any ν ¿ λ for which τ is ν-preserving should coincide with γ. As for τ , we shall show that it can be viewed as the natural extension of τ in the meaning of the constructive definition given in the preceding subsection. Indeed, in our case XT from the preceding subsection is Ωτ = {(ωi )i∈N ∈ ΩN : ωi = τ (ωi+1 ), i ∈ N}, and the natural extension of τ appears to be—we are bound to change notation—the transformation given by τe ((ωi )i∈N ) = (τ (ω0 ), ω0 , ω1 , · · · ) for any (ωi )i∈N = (ω0 , ω1 , · · · ) ∈ Ωτ . Let us remark that by the very definition of Ωτ we have ωi+1 = 1/(κi + ωi ) for some κi ∈ N+ whatever i ∈ N. Hence Ωτ can be viewed as the Cartesian product N

Ω × N+ + or, equivalently, Ω × Ω = Ω2 . More precisely, there is a one-to-one correspondence between Ωτ and Ω2 given by (ωi )i∈N ∈ Ωτ ↔ (ω0 , [bω1−1 c, bω2−1 c, · · · ] ) ∈ Ω2 . Then there also is a one-to-one correspondence between τe ((ωi )i∈N ) = (τ (ω0 ), ω0 , ω1 , · · · ) ∈ Ωτ and

µ τ (ω0 ),

1 bω0−1 c + [bω1−1 c, bω2−1 c, · · · ]

¶ ∈ Ω2 .

These considerations show that we can identify τe : Ωτ → Ωτ and τ : Ω2 → Ω2 defined as in Subsection 1.3.1 by µ ¶ 1 τ (ω, θ) = τ (ω), , (ω, θ) ∈ Ω2 . a1 (ω) + θ

Ergodic theory of continued fractions

225

It follows from Proposition 4.0.5 that τ¯ is strongly mixing (thus ergodic) under γ¯ . Also, (¯ τ , γ¯ ) is a K-automorphism. Clearly, τ¯ can be viewed as a skew product.

4.1 4.1.1

Classical results and generalizations The case of incomplete quotients

Since τ is γ-preserving and ergodic under γ, it follows from Theorem 4.0.3 that Z 1 n−1 f (x) 1X 1 κ lim f ◦τ = dx a.e. (4.1.1) n→∞ n log 2 0 x + 1 κ=0 R for any measurable function f : I → R such that I |f | dλ < ∞. It is clear that under suitable further assumptions on f , Proposition 4.0.4 should lead to estimates of convergence rates in (4.1.1). We now state several classical results which can be derived from (4.1.1) by specializing f , together with the corresponding estimates of the convergence rates, when available. Let us note that throughout this subsection the constants implied in o will depend on ε, the current point in Ω, and the other variables involved. Proposition 4.1.1 [Asymptotic relative digit frequencies – L´evy (1929)] For any i ∈ N+ we have µ ¶ 1 1 card{κ : aκ = i, 1 ≤ κ ≤ n} = log 1 + a.e.. lim n→∞ n log 2 i(i + 2) More precisely, whatever ε > 0, for any i ∈ N+ we have card{κ : aκ = i, 1 ≤ κ ≤ n} n µ ¶ ³ 1 ´ 1 1 = log 1 + + o n− 2 log(3+ε)/2 n log 2 i(i + 2)

a.e.

as n → ∞. Proof. The first equation in the above statement follows from (4.1.1) by taking f = I(a1 =i) , hence f ◦ τ κ = I(a1 ◦τ κ =i) = I(aκ+1 =i) , κ ∈ N. The second equation follows from Proposition 4.0.4 on account of Corollaries 1.3.15 and A3.3 which yield Ψ(n) = n, n ∈ N+ . 2

226

Chapter 4

A more general result yielding the asymptotic relative m-digit block frequencies is also easily obtained. Proposition 4.1.2 Whatever ε > 0, for any m ∈ N+ and i(m) = (i1 , · · · , im ) ∈ Nm + we have card{κ : (aκ , · · · , aκ+m−1 ) = i(m) , 1 ≤ κ ≤ n} n =

³ 1 ´ 1 1 + v(i(m) ) −2 (3+ε)/2 + o n log n log log 2 1 + u(i(m) )

a.e.

as n → ∞. The proof is quite similar to that of the preceding proposition. In (4.1.1) we should take f = I((a1 ,··· ,am )=i(m) ) . 2 It is important to note that the asymptotic relative digit frequencies as well as the asymptotic relative m-digit block frequencies, m ≥ 2, constitute probability distributions on N+ respectively Nm + . This is quite easily checked in the first case and not so easily in the second one (induction on m!). Actually, this follows from (4.1.1) on account of the countable additivity of the integral there with respect to the integrand. We now give other results related to asymptotic relative digit frequencies. Corollary 4.1.3 (Asymptotic relative frequencies of digits between two given values) For any i, j ∈ N+ such that i ≤ j we have 1 (i + 1)(j + 1) card{κ : i ≤ aκ ≤ j, 1 ≤ κ ≤ n} = log n→∞ n log 2 i(j + 2) lim

a.e..

More precisely, whatever ε > 0, for any i, j ∈ N+ such that i ≤ j we have card{κ : i ≤ aκ ≤ j, 1 ≤ κ ≤ n} n =

³ 1 ´ 3+ε (i + 1)(j + 1) 1 log + o n− 2 log 2 n log 2 i(j + 2)

a.e.

as n → ∞. This is a direct consequence of Proposition 4.1.1, which can be also obtained from (4.1.1) by taking f = I(i≤a1 ≤j) .

Ergodic theory of continued fractions

227

Proposition 4.1.4 (Asymptotic relative frequencies of digits exceeding a given value) For any i ∈ N+ we have 1 i+1 card{κ : aκ ≥ i, 1 ≤ κ ≤ n} = log n→∞ n log 2 i lim

a.e..

More precisely, whatever ε > 0, for any i ∈ N+ we have ´ ³ 1 3+ε card{κ : aκ ≥ i, 1 ≤ κ ≤ n} 1 i+1 = log + o n− 2 log 2 n n log 2 i as n → ∞.

a.e.

The proof is quite similar to that of Proposition 4.1.1. In (4.1.1) we should take f = I(a1 ≥i) . 2 Let us note that on account of the complete additivity of the asymptotic relative digit frequencies, the first half of Proposition 4.1.4 is a direct consequence of the first half of Proposition 4.1.1. Now, let m ∈ N+ such that m ≥ 2, and fix arbitrarily an ` ∈ N+ not exceeding m. It then follows from Proposition 4.1.1 that lim

n→∞

card{κ : aκ ≡ ` mod m, 1 ≤ κ ≤ n} n ∞

(` + pm + 1)2 1 X log = log 2 (` + pm)(` + pm + 2)

a.e..

p=0

[By taking f = I(a1 ≡` mod m) in (4.1.1), an estimate of the convergence rate can be also obtained.] It has been shown that the sum of the series above can be expressed in terms of Euler’s Gamma-function. To be precise, the following result holds. Proposition 4.1.5 [Nolte (1990)] We have ∞

1 X (` + pm + 1)2 1 log = log log 2 (` + pm)(` + pm + 2) log 2 p=0

Ã

` Γ( m )Γ( `+2 m )

Γ2 ( `+1 m )

! .

The proof rests on a special case of a result from Whittaker and Watson (1927, Section 12.13), which reads as follows. Let αi , βi ∈ C \ N+ , 1 ≤ i ≤ r, for a given r ∈ N+ . Then the infinite product Y (n − α1 )(n − α2 ) · · · (n − αr ) (n − β1 )(n − β2 ) · · · (n − βr ) n∈N+

228

Chapter 4

converges if and only if then

Pr

i=1 αi

=

Pr

i=1 βi .

If this condition is fulfilled,

r Y Y (n − α1 )(n − α2 ) · · · (n − αr ) Γ(1 − βi ) = . (n − β1 )(n − β2 ) · · · (n − βr ) Γ(1 − αi )

(4.1.2)

i=1

n∈N+

2 For example, using the well known relations Γ(z)Γ(1 − z) = π/ sin πz, z 6∈ Z, and Γ(z + 1) = zΓ(z), z 6∈ −N, if we take m = 2 and ` = 1 then we find that card{κ : aκ ≡ 1 mod 2, 1 ≤ κ ≤ n} n→∞ n lim

=

1 Γ(1/2)Γ(3/2) log π log = − 1 = 0.6514 · · · 2 log 2 Γ (1) log 2

a.e.,

i.e., about 65 % of the occurring digits are odd a.e.. Next, using the same relations for the function Γ, for m = 4 and ` = 1 we find that lim

n→∞

card{κ : aκ ≡ 1 mod 4, 1 ≤ κ ≤ n} n =

Γ(1/4)Γ(3/4) 1 1 log = 2 log 2 Γ (1/2) 2

a.e.,

i.e., about half of the occurring digits are ≡ 1 mod 4 a.e.. Similar considerations can be made about 2-digit blocks. For example, we have lim

n→∞

card{κ : (aκ , aκ+1 ) ≡ (0, 0) mod 2, 1 ≤ κ ≤ n} n =

1 X X (4ij + 1)(4ij + 2i + 2j + 2) log log 2 (4ij + 2i + 1)(4ij + 2j + 1) i∈N+ j∈N+

which by (4.1.2) is equal to 1 Γ(1 + 2i+1 1 X 4i )Γ(1 + 4i+2 ) log . 1 i+1 log 2 Γ(1 + 4i )Γ(1 + 2i+1 ) i∈N +

a.e.,

Ergodic theory of continued fractions

229

Nolte (op. cit.) proved that the last quantity can be expressed as µ ¶ 1 X 2n−1 − 1 n ζ(n) − 1 2−n 2−2n α+ (−1) (2 −2 − 1)(ζ(n) − 1) + 2n−2 , log 2 n 2 n≥2

where √ 2 4 α = log 2 − 1 + log 6 2π − log Γ log 2 log 2

µ ¶ 1 = 0.08167 · · · . 4

Setting y = 2 − log π/ log 2 = 0.3485 . . . , Nolte’s computations show that lim

n→∞

card{κ : (aκ , aκ+1 ) ≡ (a, b) mod 2, 1 ≤ κ ≤ n} n

is a.e. equal to z = 0.11694 · · ·

for (a, b) = (0, 0);

y − z = 0.23156 · · ·

for (a, b) = (0, 1) or (1, 0);

1 − 2y + z = 0.41993 · · ·

for (a, b) = (1, 1).

Actually, all the results we have proved so far are special cases of the following result. Proposition 4.1.6 Given m ∈ N+ , let H : Nm + → R be such that X

|H(i(m) )|(v(i(m) ) − u(i(m) )) < ∞

i(m) ∈Nm +

[which is equivalent to Eγ |H(a1 , · · · , am )| < ∞]. Then we have n−1

1X lim H(aκ , · · · , aκ+m−1 ) = αm n→∞ n

a.e.,

κ=0

where αm =

1 log 2

X i(m) ∈Nm +

H(i(m) ) log

1 + v(i(m) ) . 1 + u(i(m) )

If, in addition, Eλ H 2 (a1 , · · · , am ) =

X i(m) ∈Nm +

H 2 (i(m) )(v(i(m) ) − u(i(m) )) < ∞

230

Chapter 4

[which is equivalent to Eγ H 2 (a1 , · · · , am ) < ∞], then whatever ε > 0 we have n−1 ³ 1 ´ 1X H(aκ , · · · , aκ+m−1 ) = αm + o n− 2 log(3+ε)/2 n n

a.e.

κ=0

as n → ∞. For the proof this time the choice of f in (4.1.1) is f (ω) = H(a1 (ω), · · · , am (ω)),

ω ∈ Ω,

while Corollaries 1.3.15 and A3.3 should be also invoked.

2

Remark. A generalization of the second half of Proposition 4.1.6 was given by Philipp (1967). It allows the integer m vary in relation to n, and reads as follows. S Proposition 4.1.7 Let H : m∈N+ Nm + → R be such that Eλ H 2 (a1 , · · · , am ) < ∞ for any m ∈ N+ . Whatever ε > 0, if 2m ≤ n < 2m+1 then ³ 1 ´ 1X 2 H(aκ , · · · , aκ+m−1 ) = αm + o n− 2 αm log2+ε n n n−1

a.e.

κ=0

as n → ∞.

2

We shall now consider other important special cases of Proposition 4.1.6. With m = 1 and  p if p < 1, p 6= 0,  i H(i) = Hp (i) =  log i if p = 0 for i ∈ N+ , we obtain the following results. Proposition 4.1.8 We have lim (a1 · · · an )1/n = K0

n→∞

and

µ lim

n→∞

ap1 + · · · + apn n

a.e.

¶1/p = Kp

a.e.

Ergodic theory of continued fractions

231

for any p < 1, p 6= 0, where µ ¶log i/ log 2 ¶ Z 1 Y µ 1 1 logb1/tc K0 = 1+ = exp dt i(i + 2) log 2 0 1 + t i∈N+

= 2.685452 · · · and  1 Kp =  log 2

X i∈N+

µ p i log 1 +

 ¶ 1/p µ ¶1/p Z 1 1 1 (b1/tc)p  = dt . i(i + 2) log 2 0 1 + t

In particular, K−1 = 1.745405 · · · , K−2 = 1.450340 · · · ,

K−3 = 1.313507 · · · ,

K−4 = 1.236961 · · · , K−5 = 1.189003 · · · ,

K−6 = 1.156552 · · · ,

K−7 = 1.133323 · · · , K−8 = 1.115964 · · · ,

K−9 = 1.102543 · · · ,

K−10 = 1.091877 · · · . More precisely, whatever ε > 0 we have 1

(a1 · · · an )1/n = K0 + o(n− 2 log

3+ε 2

n) a.e.

as n → ∞, and µ p ¶1/p 3+ε 1 a1 + · · · + apn = Kp + o(n− 2 log 2 n) n

a.e.

for any p < 1/2, p 6= 0, as n → ∞. The cases p = 0 and p = −1 leading to the asymptotic a.e. values K0 and K−1 of the geometric, respectively, harmonic mean of the first n incomplete quotients as n → ∞ , were studied by Khintchine (1934/35). Ever since its discovery much effort has been put in the numerical evaluation of K0 . See Lehmer (1939), Pedersen (1959), Shanks and Wrench, Jr. (1959), Wrench, Jr. (1960). In the last reference K0 has been evaluated to 155 decimal places. Recently, using work by Wrench, Jr. and Shanks (1996), Bailey et al. (1997) have presented rapidly converging series for any Kp , p < 1, allowing them to evaluate K0 and K−1 to 7,350 decimal places and Kp for p = −2, −3, · · · , −10 to 50 decimal places. Setting ζ(s, n) = ζ(s) −

n X i=1

i−s ,

s > 1, n ∈ N+ ,

232

Chapter 4

the following identities hold: (i) for any n ∈ N+ we have   µ ¶ µ ¶ X X 1 1  Ai 1  log K0 = ζ(2i, n) − log 1 − log 1 + , log 2 i i i 2≤i≤n

i∈N+

where Ai =

2i−1 X

(−1)κ−1 /κ ,

i ∈ N+ ;

κ=1

(ii) whatever the negative integer p, for any n ∈ N+ we have µ ¶  P j−p−1 ζ(2i + j − p, n) j∈N X −p − 1 1   Kpp = log 2  i i∈N+



X

 µ ¶ 1 (i − 1)p log 1 − 2  ; i

2≤i≤n

(iii) in particular, for any n ∈ N+ we have   P2i −1 − −2 X X ζ(j, n) n 1  log(1 − i )  1 j=2 = − . K−1 log 2 i i−1 2≤i≤n

i∈N+

P

Clearly, for n = 1 the sums 2≤i≤n occurring above are empty, thus zero, so that both Kpp log 2 whatever the negative integer p and (log K0 )(log 2) can be cast in terms of series involving values of the Riemann zeta function and rationals. From (i) above, the elegant integral representation Z 1 1 log[sin(πt)/πt] log K0 = − dt log 2 0 t(t + 1) can be derived. Let us note that we also have Z 1 log[πt(1 − t2 )/ sin πt] 1 dt , log K0 = log 2 + log 2 0 t(t + 1) as shown in Shanks and Wrench, Jr. (1959). Actually, the second equation for log K0 follows from the first one since Z 1 log(1 − t2 ) dt = − log2 2. t(t + 1) 0

Ergodic theory of continued fractions

233

See Bailey et al. (op. cit. p. 419).

P Remarks. 1. Whatever p ∈ R the series i∈N+ api is divergent a.e. For p < 0 the assertion follows immediately fromPProposition 4.1.8 while for p ≥ 0 it is obvious since in this case clearly ni=1 api ≥ n, n ∈ N+ . For p < 0 arbitrarily large in absolute value this might seem strange at first sight. Actually, things are quite natural since by Proposition 4.1.1 any digit i ∈ N+ occurs a.e. infinitely often (and thus there is no need to invoke Proposition 4.1.8). ˇ at (1969, 1984) that from a topological 2. It has been proved by Sal´ standpoint the sets of probability 1 in Propositions 4.1.1 and 4.1.8 (for p = 0) are only of the first Baire category, i.e., they are countable unions of nowhere dense subsets of I. 3. A set which is ‘small’ in the measure theoretical sense, can be quite ‘large’ from the point of view of topology. Consider, for example, the set E2 of all numbers in [0, 1) whose RCF digits are 1 or 2. It is a trivial consequence of Proposition 4.1.1 that λ(E2 ) = γ(E2 ) = 0. On the other hand, it is also clear that E2 has the power of the continuum. To express the ‘topological size’ of sets like E2 the concepts of Hausdorff measure and Hausdorff dimension are suitable. We first recall their formal definitions and then outline two applications of these concepts to continued fractions. Given a subset E of Rn , for any ε, δ > 0 put ( ) X δ δ Hε (E) = inf diam(Ui ) , U

i

where the infimum is taken over all open coverings U = {Ui }i of E such that diam(Ui ) ≤ ε. The Hausdorff measure H δ (E) and the Hausdorff dimension dimH (E) of E are then defined as n o H δ (E) = lim Hεδ (E), dimH (E) = inf δ : H δ (E) = 0 . ε→0

See Falconer (1986, 1990), Harman (1998), and Rogers (1998). It follows from Proposition 1.1.1—see also Corollary 4.1.30—that for any ω ∈ Ω the inequality ¯ ¯ ¯ ¯ ¯ω − p ¯ < 1 ¯ q¯ q2 has infinitely many solutions in integers p, q ∈ N+ with g.c.d. (p, q) = 1. Let then Mc denote the set of all x ∈ [0, 1) satisfying ¯ ¯ ¯ ¯ ¯x − p ¯ < 1 ¯ q¯ qc

234

Chapter 4

for infinitely many pairs (p, q) of positive integers. Clearly, if c ≤ 2 then Mc = [0, 1), but what happens when c > 2? It is fairly easy to show that λ(Mc ) = 0 for c > 2. On the other hand, V. Jarn´ık proved in 1929 that dimH (Mc ) = 2/c for any c > 2. A simplified proof of this result can be found in Falconer (1990, p. 142). Using iterated function systems (IFS)—which is another name for dependence with complete connections—it is possible to calculate the Hausdorff dimension of sets defined by number-theoretic properties. For instance, the set E2 just defined is the attractor of the IFS consisting of the two (nonlinear) contractions u1 (x) =

1 1+x

It was first shown by Jarn´ık that Pollicott (2001) found that

and u2 (x) = 1 3

1 . 2+x

≤ dimH (E2 ) ≤ 23 , but Jenkinson and

dimH (E2 ) = 0.53128 05062 77205 14162 44686 · · · , an approximation accurate to 25 decimal places, which improves earlier estimates of Hensley (1996). A striking feature of Jenkinson and Pollicott’s method is that successive approximations of dimH (E2 ) converge at a superexponential rate. Their method can be also used to efficiently compute the Hausdorff dimension of other sets consisting of numbers whose RCF digits are constrained to belong to any given finite subset of N+ . 2 The case p = 1 is not settled by Proposition 4.1.8. For H(i) = i, i ∈ N+ , the series X 1 X X i = |H(i)|(v(i) − u(i)) = i(i + 1) i+1 i∈N+

i∈N+

i∈N+

is divergent. In this case Eγ H(a1 ) = ∞ but, however, we have lim

n→∞

a1 + · · · + an = ∞ a.e.. n

Before proving this (see Corollary 4.1.10 and Remark 1 following it) let us recall that in Subsection 3.3.2 we noted that, writing tn = a1 + · · · + an , n ∈ N+ , tn /n log n converges in µ-probability to 1/ log 2 as n → ∞ for any µ ∈ pr(BI ) such that µ ¿ λ. It follows that tnκ /nκ log nκ converges a.e. to 1/ log 2 as κ → ∞, where (nκ )κ∈N+ is some sequence of positive integers with limκ→∞ nκ = ∞. Hence tnκ /nκ converges a.e. to ∞ as κ → ∞.

Ergodic theory of continued fractions

235

Thus lim supn→∞ tn /n = ∞ a.e. and it remains to show that lim sup can be replaced by lim. Actually, we shall prove much more. Theorem 4.1.9 [Diamond and Vaaler (1986)] We have tn =

1 + o(1) n log n + θn max ai 1≤i≤n log 2

a.e.

as n → ∞, where θn is an I-valued random variable for any n ∈ N+ . Proof. Given ε > 0 and n ∈ N+ set a0i = ai I(ai ≤h(n)) ,

1 ≤ i ≤ n, 1

where h : N+ → R is defined by h(n) = n log 2 +ε n, and t0n = a01 + · · · + a0n . Then µ ¶ bh(n)c n X 1 0 Eγ tn = j log 1 + log 2 j(j + 2) j=1

=

bh(n)c n X 1 (1 + o(1)) = n logbh(n)c(1 + o(1))/ log 2 log 2 j j=1

as n → ∞. By Corollaries 1.3.15 and A3.2 we have Varγ t0n = O(nVarγ t01 ) = O(nEγ (t01 )2 ) as n → ∞. But Eγ (t01 )2

¶ µ bh(n)c 1 X 2 1 = = bh(n)c(1 + o(1))/ log 2 j log 1 + log 2 j(j + 2) j=1

as n → ∞. Therefore Varγ t0n = O(nbh(n)c) as n → ∞. Now, consider the sequence (nκ )κ∈N+ defined as nκ = bexp κ1−ε c , Note that

κ ∈ N+ .

¡ ¢ nκ−1 = 1 + O(κ−ε ) nκ

as κ → ∞ so that nκ−1 /nκ and h(nκ−1 )/h(nκ ) both converge to 1 as κ → ∞. By the choice of the nκ it is obvious that the series with general term Eγ (t0nκ − Eγ t0nκ )2 , nκ h(nκ )κ1+ε

κ ∈ N+ ,

236

Chapter 4

is convergent. Hence by Beppo Levi’s theorem the random series with general term (t0nκ − Eγ t0nκ )2 , κ ∈ N+ , nκ h(nκ )κ1+ε is convergent a.e. Therefore ³ ´ |t0nκ − Eγ t0nκ | = o nκ κ(1+ε)/2 log(1+2ε)/4 nκ

a.e.

as κ → ∞. Now, it is easy to check that µ ¶ ¡ ¢ Eγ t0nκ (1+ε)/2 (1+2ε)/4 nκ κ log nκ = O = o Eγ t0nκ ε/3 log nκ

a.e.

as κ → ∞ provided that ε < 0.126. Thus t0nκ = (1 + o(1))Eγ t0nκ

a.e.

as κ → ∞. Next, for any n ∈ N+ satisfying nκ−1 < n ≤ nκ for some κ ∈ N+ we clearly have t0nκ−1 ≤ t0n ≤ t0nκ , so that (1 + o(1))Eγ t0nκ−1 ≤ t0n ≤ (1 + o(1))Eγ t0nκ

a.e.

as k → ∞. On account of the properties already noted of the sequence (nκ )κ∈N+ we easily obtain t0n = (1 + o(1))Eγ t0n

a.e.

as n → ∞, and since n logbh(n)c − n log n = o(n log n) as n → ∞, we can also write t0n = (1 + o(1))

n log n log 2

a.e.

(4.1.3)

as n → ∞. To complete the proof we shall show that a.e. there exist at most finitely many integers n ∈ N+ for which the inequalities ai > h(n),

aj > h(n)

Ergodic theory of continued fractions

237

hold for two distinct indices i, j ≤ n. To proceed fix i < j. It follows from Corollary 1.3.15 that γ(ai > h(n), aj > h(n))

= O(γ(ai > h(n))γ(aj > h(n))) = O(γ 2 (a1 > h(n))) = O((h(n))−2 ) = O(n−2 (log n)−1−2ε )

as n → ∞. Hence the probability of the random event (ai > h(n), aj > h(n) for distinct indices i, j ≤ 2n) is of order at most (log n)−1−2ε . For κ ∈ N+ let [ (ai > h(2` ), aj > h(2` ) for distinct indices i, j ≤ 2`+1 ) . Eκ = `≥κ

P Then γ(Eκ ) = O( `≥κ `−1−2ε ) → 0 as κ → ∞. It is now clear that for ω 6∈ Eκ and n > 2κ+1 there exists at most one index i ≤ n for which ai (ω) > h(n). Consequently, we can assert that 0 ≤ tn − t0n ≤ max ai 1≤i≤n

a.e.

(4.1.4)

for all sufficiently large n. By (4.1.3) and (4.1.4) the proof is complete.

2

Remarks. 1. It is now clear from the above theorem and Proposition 3.1.7 why tn /n log n converges in probability, rather than a.e., to 1/ log 2 as n → ∞. The obstacle to a.e. convergence is the occurrence of a single large value of the digits. At the same time, a.e. convergence can be obtained by excluding at most one summand. 2. It is interesting to compare Theorems 3.3.4 and 4.1.9 (see also Corollary 3.1.11). 2 Corollary 4.1.10 Whatever 0 ≤ ε < 1 we have a1 + · · · + an = ∞ a.e.. n→∞ n(log n)ε lim

Remarks. 1. The equation lim

n→∞

a1 + · · · + an = ∞ a.e. n

238

Chapter 4

can be also derived from a slight generalization of equation (4.1.1). Hartman R (1951) proved that if f : I → R+ is measurable and I f dλ = ∞, then the limit in (4.1.1) exists and is equal to ∞ a.e.. The equation above then follows by taking f (ω) = a1 (ω), ω ∈ Ω. It is interesting to note that if we take f (ω) = a2 (ω)/a1 (ω) or f (ω) = a1 (ω)/a2 (ω), ω ∈ Ω, then we obrain 1 X ai 1 X ai+1 = lim = ∞ a.e.. n→∞ n n→∞ n ai ai+1 lim

i∈N+

i∈N+

2. Salem (1943) proved that the celebrated Minkowski’s ? function can be expressed in terms of the tn , n ∈ N, as X ?(x) = (−1)i−1 21−ti (x) i∈N+

for any x ∈ I, if we consider that ai (x) = ∞ for any large enough i ∈ N+ when x ∈ I \ Ω. It is known that ? is a strictly increasing singular function, that is, ?0 (x) = 0 a.e. in I. Recently, Viader et al. (1998) have shown that ¶ µ ¡ ¢ tn (x) = ∞ ∩ x ∈ I : ?0 (x) exists finitely x ∈ I : lim n→∞ n ¡ ¢ ⊂ x ∈ I : ?0 (x) = 0 , thus making more precise the set where the derivative of ? vanishes. Note that the sequence (an )n∈N+ is i.i.d. with common µ-distribution −m (2 : m ∈ N+ ) under the probability measure µ induced by ? on BI . Cf. Lagarias (1992, p. 45). 3. Vardi (1995, 1997) discussed an interesting relationship between the St. Petersburg game [see, e.g., Feller (1968, X.4)] and the sequence (an )n∈N+ , on account of the properties of the sequence (tn )n∈N+ . That game is a well known example of a sequence of independent identically distributed random variables with infinite mean value, and was considered as a paradox since no ‘fair’ entry fee exists. It appears that (an )n∈N+ makes a reasonable choice of entry fees for the St. Petersburg game. 2 Corollary 4.1.11 P Let (cn )n∈N+ be a non-decreasing sequence of positive numbers satisfying n∈N+ c−1 n < ∞. Then tn =

1 + o(1) n log n + θn cn log 2

a.e.

as n → ∞, where θn is an I-valued random variable for any n ∈ N+ .

Ergodic theory of continued fractions

239

Proof. This is an immediate consequence of Theorem 4.1.9 and Proposition 1.3.16 (F. Bernstein’s theorem). 2 Corollary 4.1.12 Set dn = exp(κ log2 κ)κ log2 κ for exp((κ − 1) log2 (κ − 1)) < n ≤ exp(κ log2 κ) , Then lim sup n→∞

a1 + · · · + an 1 = dn log 2

κ ≥ 2.

(4.1.5)

a.e..

Proof. In Corollary 4.1.11 set cn = dn /(log log 10κ) P for n in the range (4.1.5). It is easy to check that n∈N+ c−1 n < ∞ and that (4.1.5) implies n log n ≤ dn , n ∈ N+ . Then by Corollary 4.1.11 we have tn ≤

dn 1 + o(1) dn + log 2 log log 10κ

a.e.

as κ → ∞, so lim supn→∞ tn /dn ≤ 1/ log 2 a.e. To complete the proof we note that setting nκ = exp((κ + 1) log2 (κ + 1)) we have dnκ = nκ log nκ , κ ∈ N+ , and limκ→∞ tnκ /dnκ = 1/ log 2. 2 Remarks. 1. Philipp (1988, Theorem 1) proved that (i) for any seP quence (cn )n∈N+ of positive numbers such that n∈N+ c−1 n < ∞, we have lim supn→∞ tn /cn = 0 a.e., and (ii) for any sequence (cn )n∈N+ of positive such that the sequence (cn /n)n∈N+ is non-decreasing and P numbers −1 n∈N+ cn = ∞, we have lim supn→∞ tn /cn = ∞ a.e. Corollary 4.1.11 shows that the condition on the sequence (cn /n)n∈N+ in (ii) cannot be dispensed with. 2. It is easy to show, see Diamond and Vaaler (op. cit., pp. 81–82), that if (cn )n∈N+ is as in Corollary 4.1.11, then setting S = {n ∈ N+ : cn < n log n} , we have

1 x→∞ log x lim

X n≤x, n∈S

1 = 0, n

240

Chapter 4

that is, S has logarithmic density zero. It then follows from Corollary 4.1.11 that a1 + · · · + an = O(cn ) as n → ∞ for all integers n outside a set of logarithmic density 0. See also Corollary 3.1.9. 3. Theorem 4.1.9 can be easily generalized for a function H : N+ → R++ satisfying  

  X

H 2 (i)/i2  / 

1≤i≤n

X

2

´ ³ 3 H(i)/i2  = O n log− 2 −ε n

1≤i≤n

as n → ∞ for some ε > 0. [Clearly, H(i) = i, i ∈ N+ , satisfies the condition above.] For such a function H we have n X i=1

H(ai ) =

µ ¶ (1 + o(1)) X 1 n H(i) log 1 + log 2 i(i + 2) 1≤i≤n

+ θn max H(ai ) 1≤i≤n

a.e.,

where θn is an I-valued random variable for any n ∈ N+ . The proof can be found in Diamond and Vaaler (op. cit.). 2

4.1.2

Empirical evidence, and normal continued fraction numbers

We shall now discuss the important amount of empirical evidence already accumulated on continued fraction expansions of certain real numbers. The interest of such computations lies in comparing statistics of such expansions with known theoretical limiting distributions. It is clear that, for instance, contained in the exceptional set in Proposition 4.1.8 are all quadratic irrationalities and the number e − 2. See Subsection 1.1.3. Clearly, all the numbers just mentioned are also contained in the exceptional set in Proposition 4.1.1. As we have already mentioned in Subsection 1.1.3, in the opposite direction seems to lie π − 3 whose continued fraction expansion is π − 3 = [ 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, · · · ] .

Ergodic theory of continued fractions

241

In Bailey et al. (1997, p. 423) it is asserted that, based on the first 17,001,303 continued fraction digits of π − 3, the geometric mean is 2.68639 and the harmonic mean is 1.745882, which are reasonably close to K0 and K−1 —see Proposition 4.1.8. Clearly, no conclusion can be drawn beyond this. For computations concerning the continued fraction digits of various irrationals in I we refer the reader to Alexandrov (1978), Brjuno (1964), Choong, Daykin and Rathbone (1971) (see nevertheless D. Shanks’ review [MR 52 # 7073] of this paper), Lang and Trotter (1972), Richtmyer (1975), Shiu (1995), and J.O. Shallit’s review [MR 96b: 11165] of this last paper. Presenting an algorithm for computing the continued fraction expansion of numbers which are zeroes of differentiable functions, Shiu (1995) obtained √ statistics of the√ first 10000 digits of irrationals in I such as 3 2 − 1, π − 3, π 2 − 9, log 2, 2 2 − 2. Table 1 below is compiled from his Table 1. The last column contains the (theoretical) asymptotic relative digit frequencies

µ ¶ 1 1 log 1 + , log 2 i(i + 2)

1 ≤ i ≤ 10,

in the first 10 lines, the asymptotic relative frequency

1 12 × 101 log log 2 11 × 102

of the digits in the range [11, 100] in the 11th line, and the asymptotic relative frequency 1 102 log log 2 101

of the digits exceeding 100 in the last line. Cf. Propositions 4.1.1, 4.1.3, and 4.1.4.

242

Chapter 4

Frequency of occurrence of i in 10000 digits of Digit i 1 2 3 4 5 6 7 8 9 10 11 − 100

≥ 101

√ 3 2−1

π−3

π2 − 9

log 2

4173 1675 946 636 421 295 240 163 122 118 1060 151

4206 1672 882 597 443 282 224 186 143 123 1113 129

4134 1706 948 581 401 302 232 185 138 117 1111 145

4149 1666 905 600 390 334 226 187 142 137 1113 151



2

2

Theoretical asymptotic relative frequency

−2

4192 1639 933 616 390 278 213 190 135 135 1130 149

0.415037499 · · · 0.169925001 · · · 0.093109404 · · · 0.058893689 · · · 0.040641984 · · · 0.029747343 · · · 0.022720076 · · · 0.017921908 · · · 0.014499569 · · · 0.011972641 · · · 0.111317022 · · · 0.014213859 · · ·

Table 1 It is also interesting to note that setting M10000 (ω) = max1≤κ≤10000 aκ (ω) (cf. Subsection 3.1.3) we have √ √ M10000 ( 3 2 − 1) = a1990 ( 3 2 − 1) = 12737, M10000 (π − 3) = a431 (π − 3) = 20776, M10000 (π 2 − 9) = a1234 (π 2 − 9) = 12013, M10000 (log 2) = a9168 (log 2) = 963664, √

M10000 (2

2



− 2) = a6342 (2

2

− 2) = 44122 ,

and that in all cases just considered there exist digits not exceeding 100 which do not appear, viz. √ 74, 86, 91, 96, 97, 99, and 100 for 3 2 − 1; 90, 91, and 96

for π − 3;

91 and 92

for π 2 − 9;

55, 73, 76, 96, and 97

for log 2;

79, 80, 81, 82, 91, 94, 97, and 99

for 2

√ 2

− 2.

Ergodic theory of continued fractions

243

Concerning Khinchin’s constant K0 , computations of 1

K0 (ω, n) = (a1 (ω) · · · an (ω)) n for n ≤ 10000 and various ω ∈ Ω, including those considered above, suggest that, e.g., π − 3 is not in the exceptional set. However, it should be pointed out that if even there might be convergence the rate has to be very slow. It was found that K0 (π − 3, 10000) differs from K0 by more than K0 (π − 3, 100) does! The existence of the asymptotic relative digit and, more generally, mdigit block frequencies (Propositions 4.1.1 and 4.1.2) raises naturally the question of normality for the continued fraction expansion. ´ Borel in 1909, is an attempt The idea of normality, first introduced by E. to formalize the notion of a real number being random. A real number x ∈ I is said to be normal in base b, b ∈ N+ , b ≥ 2, if and only if in its representation in base b all digits 0, 1, · · · , b−1 appear asymptotically equally often, i.e., with asymptotic relative frequencies all equal to 1/b. In addition, for each m ∈ N+ the bm different m-digit blocks must occur equally often. In other words, for any m ∈ N+ we should have µ ¶ 1 number of occurrences of a given m-digit = b−m lim n→∞ n block in the first n + m − 1 base-b digits of x whatever the given m-digit block. Actually, the above equation holds for all x ∈ I except for a set of Lebesgue measure zero. This can easily be seen by applying Birkhoff’s ergodic theorem to the transformation T x = bx mod 1 of I. A number that is normal in all bases b ∈ N+ , b ≥ 2, is called normal. However, even if there are lots of normal numbers, when we are given a ‘concrete’ number x ∈ I the existence result just mentioned does not help to decide whether x is normal or not. Such a problem cannot be handled by methods known today. (Will it ever be solved?) For instance, it is not known whether π − 3, e − 2, or any irrational algebraic number is normal or not. The first example of a normal number in base 10 was given by Champernowne (1933). His number is x = 0. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 · · · but an explicit example of a normal number is still lacking. Clearly, a similar problem can be considered for the continued fraction expansion (which has the advantage of not being related to any base). An irrational ω ∈ I is said to be a normal continued fraction number if and only

244

Chapter 4

if all its asymptotic relative m-digit block frequencies exist and are equal to those occurring in Proposition 4.1.2 for any m ∈ N+ . In other words, ω is a normal continued fraction number if it does not belong to the exceptional sets of λ-measure zero excluded in Proposition 4.1.2 for any m ∈ N+ . For instance, the quadratic irrationalities are not normal since they eventually have periodic expansions, and neither is e − 2. A construction of the Champernowne type for a normal continued fraction number was given by Adler, Keane, and Smorodinsky (1981). Their example is as follows. Let (rn )n∈N+ be the sequence of rationals in (0,1) obtained by first writing r1 = 1/2, then r2 = 1/3 and r3 = 2/3, then r4 = 1/4, r5 = 2/4, r6 = 3/4, etc., at each stage m ∈ N+ writing all quotients with denominator m + 1 in increasing order. Let ri = [ai,1 , ai,2 , . . . , ai,ni ] be the continued fraction expansion of ri , with ai,ni 6= 1, i ∈ N+ . The irrational ω with continued fraction expansion [a1,1 , a2,1 , a3,1 , a3,2 , a4,1 , a5,1 , a6,1 , a6,2 , a7,1 , a8,1 , a8,2 , a9,1 , a9,2 , a9,3 , · · · ], which is obtained by concatenating the expansions of r1 , r2 , · · · in the given order, is a normal continued fraction number. The first 14 digits of ω are 2, 3, 1, 2, 4, 2, 1, 3, 5, 2, 2, 1, 1, 2. Another example of a different nature had been given by Postnikov (1960). We should emphasize that even if the empirical evidence pleads in favour of normality for the continued fraction expansion of algebraic irrationals of degree exceeding 2, or of π − 3, π 2 − 9 etc., the only mathematical results proved so far are the examples of normal continued fraction numbers just discussed. Finally, a few words about the empirical evidence concerning Theorem √ 3 2 − 1) and 4.1.9. Von Neumann and Tuckerman (1955) computed t ( n √ 3 n log n/ log 2 for n = 100(100)2000. It appears that tn ( 2 − 1) log 2/n log n is most of the time greater than 1 and often nearly 2. As tn log 2/n log n converges just in probability to 1 as n → ∞, these deviations cannot be seen as significant.

4.1.3

The case of associated and extended random variables

Since τ¯ is γ¯ -preserving and ergodic under γ¯ (see Subsection 4.0.2), it follows again from Theorem 4.0.3 that Z 1 Z 1 ¯ n−1 1 f (x, y) 1X¯ k f ◦ τ¯ = dx dy a.e. in I 2 (4.1.6) lim 2 n→∞ n log 2 0 0 (xy + 1) k=0

Ergodic theory of continued fractions

245

RR for any measurable function f¯ : I 2 → R such that I 2 |f¯| dλ2 < ∞. As in Subsection 4.1.1, for suitable choices of f¯, Proposition 4.0.4 will lead to estimates of convergence rates in (4.1.6). We now give several results which can be derived from (4.1.6). Proposition 4.1.13 For any B ∈ BI2 we have n−1

1X 1 lim IB (τ k , s¯k ) = n→∞ n log 2 k=0

ZZ B

dx dy (xy + 1)2

a.e. in I 2 .

Proof. The equation above follows from (4.1.6) by taking f¯ = IB , B ∈ BI2 , and noting that by the very definition of the extended incomplete quotients (see Subsection 1.3.3), equations (1.3.1) and (1.3.10 ) can be written as τ¯n (ω, θ) = (τ n (ω), s¯n (ω, θ)) , (ω, θ) ∈ Ω × I, for any n ∈ N+ . (The last equation holds for n = 0, too.)

2

Corollary 4.1.14 For any A ∈ BI we have n−1

1X lim IA (τ k ) = γ(A) n→∞ n

a.e. in I,

k=0

and

n−1

1X IA (¯ sk ) = γ(A) a.e. in I 2 . lim n→∞ n k=0

Proof. The first equation follows by taking B = A × I. [It might be also derived from equation (4.1.1).] The second equation follows by taking B = I × A. 2 It follows by dominated convergence from Proposition 4.1.13 that for any µ ¯ ∈ pr(BI2 ) we have n−1 ´ 1 X ³ −k µ ¯ τ¯ (B) = γ¯ (B), n→∞ n

lim

B ∈ BI2 .

(4.1.7)

k=0

In particular, n−1 n−1 ´ 1 X ³ −k 1X µ ¯ τ¯ (I × A) = lim µ ¯ (¯ sk ∈ A) n→∞ n n→∞ n

lim

k=0

k=0

= γ(A),

A ∈ BI .

(4.1.8)

246

Chapter 4

We are going to show under suitable assumptions that in (4.1.7) actual convergence holds instead of C´esaro convergence while in (4.1.8) the extended random variable s¯k can be replaced by sak , k ∈ N, for a fixed a ∈ I. Proposition 4.1.15 Let µ ¯ ∈ pr(BI2 ) such that µ ¯ ¿ λ2 . Then ¡ ¢ lim µ ¯ τ¯−n (B) = γ¯ (B) n→∞

(4.1.9)

for any B ∈ BI2 . ¯ = d¯ Proof. Let h µ/dλ2 . Then for any B ∈ BI2 we have ZZ ZZ −n n ¯ g ) d¯ µ ¯(¯ τ (B)) = IB ◦ τ¯ d¯ µ = (IB ◦ τ¯n )(h/¯ γ, I2

I2

where g¯ = d¯ γ /dλ2 , that is, g¯(x, y) =

1 1 , log 2 (xy + 1)2

(x, y) ∈ I 2 .

Now, since τ¯ is strongly mixing (see Subsections 4.0.1 and 4.0.2), the last integral in the equations above converges to ZZ ZZ ¯ g ) d¯ IB d¯ γ (h/¯ γ = γ¯ (B)¯ µ(I 2 ) = γ¯ (B) I2

I2

as n → ∞.

2

Remarks. 1. Proposition 2.1.5 shows that measures µτ −n , n ∈ N, can be expressed in terms of the Perron–Frobenius operator Pγ = U of τ with respect to γ. A similar representation holds for the case of a measure µ ¯ as in Proposition 4.1.15. It is easy to check that we have ZZ −n µ ¯(¯ τ (B)) = P¯γ¯n f¯ d¯ γ , B ∈ BI2 , B

¯ g and P¯γ¯ is the Perron–Frobenius operator of τ¯ under γ¯ . See where f¯ = h/¯ the Remark following Proposition 2.1.1. If the endomorphism (¯ τ , γ¯ ) were exact, then from Proposition 4.0.2 we might have deduced that convergence in (4.1.9) is uniform with respect to B ∈ BI2 . Since (¯ τ , γ¯ ) is not exact, such a conclusion cannot be reached this way. It is an open problem whether this is really true. 2. Proposition 4.1.15 is a first step towards the solution of what can be called Gauss’ problem for the natural extension τ¯ of τ . 2

Ergodic theory of continued fractions

247

Theorem 4.1.16 Let µ ∈ pr(BI ) such that µ ¿ λ. For any B ∈ BI2 such that λ2 (∂B) = 0 we have (i)

lim µ (¯ τ n ( · , a) ∈ B) = γ¯ (B)

n→∞

uniformly with respect to a ∈ I; n−1 1X (ii) lim IB (τ k , sak ) = γ¯ (B) n→∞ n

a.e. in I

k=0

uniformly with respect to a ∈ I. Proof. (i) For any θ ∈ I and B ∈ BI2 set hn (θ, B) = µ (¯ τ n ( · , θ) ∈ B) , n ∈ N+ . By Fubini’s theorem we have ZZ ¡ −n ¢ (µ ⊗ λ) τ¯ (B) = IB (¯ τ n (ω, θ)) µ(dω) dθ I2

Z

Z

1

=

1

dθ 0

Z

0

Z

1

=

IB (¯ τ n (ω, θ)) µ(dω)

n

1

µ (¯ τ ( · , θ) ∈ B) dθ = 0

0

hn (θ, B) dθ.

Since µ ⊗ λ ¿ λ2 , it follows from Proposition 4.1.15 that Z 1 lim hn (θ, B) dθ = γ¯ (B) n→∞ 0

(4.1.10)

for any B ∈ BI2 . Now, note that—letting d denote the Euclidean distance in I 2 —by Theorem 1.2.2 we have d (¯ τ n (ω, θ), τ¯n (ω, a)) ≤

max I(i(n) ) =

i(n) ∈Nn +

1 , Fn Fn+1

n ∈ N+ ,

(4.1.11)

for any θ, a ∈ I. Given ε > 0, let Bε+ =

[

Dε (x, y),

(x,y)∈B

where Dε (x, y) is the open disk of radius ε centered at (x, y) ∈ I 2 , and Bε− = ((x, y) ∈ B : Dε (x, y) ⊂ B) .

248

Chapter 4

By (4.1.11), for n ≥ n0 (ε) great enough and any θ, a ∈ I we have (ω : τ¯n (ω, θ) ∈ Bε− ) ⊂ (ω : τ¯n (ω, a) ∈ B) (4.1.12) ⊂ (ω : τ¯n (ω, θ) ∈ Bε+ ) . On the other hand, for any n ∈ N and θ ∈ I we trivially have (ω : τ¯n (ω, θ) ∈ Bε− ) ⊂ (ω : τ¯n (ω, θ) ∈ B)

(4.1.13)

⊂ (ω : τ¯n (ω, θ) ∈ Bε+ ) . Hence ¡ ¢ ¡ ¢ −hn θ, Bε+ \ Bε− ≤ hn (θ, B) − hn (a, B) ≤ hn θ, Bε+ \ Bε− for any n ≥ n0 (ε) and θ, a ∈ I. Integrating the double inequality above over θ ∈ I yields ¯Z 1 ¯ Z 1 ¯ ¯ ¡ ¢ ¯ ¯ hn (θ, B) dθ − hn (a, B)¯ ≤ hn θ, Bε+ \ Bε− dθ ¯ 0

0

for any n ≥ n0 (ε) whatever a ∈ I. Finally, let first n → ∞ then ε → 0 in the last inequality. By (4.1.10) we obtain lim sup sup |¯ γ (B) − hn (a, B)| ≤ lim γ¯ (Bε+ \ Bε− ) = γ¯ (∂B) = 0 n→∞

ε→0

a∈I

since λ2 (∂B) = 0, and the proof of (i) is complete. (ii) It is easy to check that (4.1.12) and (4.1.13) imply the inequalities ³ ´ ³ ´ ³ ´ IBε− τ k , s¯k ≤ IB τ k , sak ≤ IBε+ τ k , s¯k for any a ∈ I, (ω, θ) ∈ Ω × I, and any k ≥ n0 (ε) great enough. Also, we trivially have ³ ´ ³ ´ ³ ´ IBε− τ k , s¯k ≤ IB τ k , s¯k ≤ IBε+ τ k , s¯k for any k ∈ N and (ω, θ) ∈ Ω × I. Hence ¯ ¯ ¯ ¯ ¯IB (τ k , s¯k ) − IB (τ k , sak )¯ ≤ IBε+ \Bε− (τ k , s¯k )

(4.1.14)

for any k ≥ n0 (ε), a ∈ I, and (ω, θ) ∈ Ω × I. By Proposition 4.1.13 we have n−1

1X IB (τ k , s¯k ) = γ¯ (B) n→∞ n lim

k=0

Ergodic theory of continued fractions and

249

n−1

1X IBε+ \Bε− (τ k , s¯k ) = γ¯ (Bε+ \ Bε− ) n→∞ n lim

a.e. in I 2 .

k=0

Since

λ2 (∂B)

= 0, we have lim γ¯ (Bε+ \ Bε− ) = γ¯ (∂B) = 0.

ε→0

It is now easy to see that (4.1.14) and the last three equations imply the result stated. 2 Remark. Theorem 4.1.16(i) has been proved by Barbolosi and Faivre (1995) while (ii) is implicit (or implicitly used) in many papers by Dutch authors. See, e.g., Bosma et al. (1983) or Jager (1986). 2 Theorem 4.1.16 has a host of consequences. We state some of them. Corollary 4.1.17 Let µ ∈ pr(BI ) such that µ ¿ λ. For any B ∈ BI2 such that λ2 (∂B) = 0 we have lim µ((τ n , san ) ∈ B) = γ¯ (B)

n→∞

(4.1.15)

uniformly with respect to a ∈ I. Proof. This is just a transcription of the result stated in Theorem 4.1.16(i) as τ¯n (ω, a) = (τ n (ω), s¯n (ω, a)) = (τ n (ω), san (ω)),

(ω, a) ∈ Ω × I,

for any n ∈ N.

2

Let us note that in Theorem 2.5.8 the (optimal) convergence rate in (4.1.15) has been obtained in the case where µ = γa for the class of rectangles B = [0, x] × [0, y], x, y ∈ I. Using this result we can prove Proposition 4.1.18 Let B be a simply connected subset of I 2 such that Sm ∂B = i=1 `i for some m ∈ N+ , where either `i := ( (x, fi (x)) : ai ≤ x ≤ bi ) with 0 ≤ ai < bi ≤ 1 and fi : [ai , bi ] → I continuous and monotone, or ¡ ¢ `i := (ci , y) : a0i ≤ y ≤ b0i with ci ∈ I and 0 ≤ a0i < b0i ≤ 1. Then γa ((τ n , san ) ∈ B) = γ¯ (B) + O(gn )

250

Chapter 4

as n → ∞, where the constant implied in O depends on m and the quantities defining the `i , 1 ≤ i ≤ m. The proof in the case a = 0 can be found in Dajani and Kraaikamp (1994). 2 By particularizing the set B in Corollary 4.1.17 and Proposition 4.1.18 we obtain results originally derived by ad hoc methods. We shall state below some of them leaving the calculation details to the reader. Corollary 4.1.19 For any µ ∈ pr(BI ) such that µ ¿ λ and any t ∈ I we have e lim µ (Θn ≤ t) = H(t), n→∞

e has been defined in Theorem 2.2.13. For µ = λ the convergence where H rate in the equation above is O(gn ) as n → ∞. Proof. This follows from Corollary 4.1.17 with a = 0 and ¶ µ x 2 ≤ t , t ∈ I, B = (x, y) ∈ I : xy + 1 and Proposition 4.1.18, as Θn = τ n /(sn τ n + 1), n ∈ N, by equation (1.3.7). Note that, however, Theorem 2.2.13 yields a better convergence rate! 2 Corollary 4.1.20 For any µ ∈ pr(BI ) such that µ ¿ λ and any (t1 , t2 ) ∈ I 2 we have lim µ (Θn−1 ≤ t1 , Θn ≤ t2 ) = H(t1 , t2 ),

n→∞

where H is the distribution function with density  1 1   log 2 √1 − 4t t if t1 ≥ 0, t2 ≥ 0, t1 + t2 < 1, 1 2

 

0

elsewhere.

For µ = λ the convergence rate in the equation above is O(gn ) as n → ∞. Proof. This follows from Corollary 4.1.17 with a = 0 and ¶ µ x y 2 ≤ t1 , ≤ t2 , (t1 , t2 ) ∈ I 2 , B = (x, y) ∈ I : xy + 1 xy + 1 and Proposition 4.1.18, as Θn−1 =

sn , sn τ n + 1

Θn =

τn , sn τ n + 1

n ∈ N,

Ergodic theory of continued fractions

251

by equation (1.3.7).

2

Let us define random variables ρn and Θn0 by ¯ ¯ ¯ pn+1 ¯ ¯ ¯ ¯ω − qn+1 ¯ ¯ pn ¯¯ 0 ¯ ¯ , ρn (ω) = ¯¯ Θn = qn qn+1 ¯ω − ¯ , p ¯ qn ¯ω − qnn ¯

ω ∈ Ω, n ∈ N.

It is easy to see that ρn = sn+1 τ n+1 and Θ0n = 1/(sn+1 τ n+1 + 1) so that Θ0n = 1/(ρn + 1), n ∈ N. Corollary 4.1.21 For any µ ∈ pr(BI ) such that µ ¿ λ we have µ ¶ 1 t log t lim µ(ρn ≤ t) = log(t + 1) − , t ∈ I, n→∞ log 2 t+1

lim µ(Θ0n ≤ t) =

n→∞

 0   

if 0 ≤ t ≤ 1/2,

 log(2tt (1 − t)1−t )   log 2

if 1/2 ≤ t ≤ 1.

For µ = λ the convergence rate in the equations above is O(gn ) as n → ∞. The proof is left to the reader.

2

For other results of the same type, which can be derived as before, we refer the reader to Bosma et al. (1983), Jager (1986), Kraaikamp (1994). Corollary 4.1.22 For any t, t1 , t2 ∈ I the limits lim

1 card{k : Θk ≤ t, 0 ≤ k ≤ n − 1 }, n

lim

1 card{k : Θk ≤ t1 , Θk+1 ≤ t2 , 0 ≤ k ≤ n − 1 }, n

lim

1 card{k : ρk ≤ t, 0 ≤ k ≤ n − 1 }, n

n→∞

n→∞

n→∞

and

1 card{k : Θ0k ≤ t, 0 ≤ k ≤ n − 1 }, n all exist a.e. in I and are equal to the corresponding values of the limiting distribution functions occurring in Corollaries 4.1.19, 4.1.20, and 4.1.21, respectively. lim

n→∞

252

Chapter 4

The proof is immediate on account of Theorem 4.1.16(ii) and the corollaries referred to in the statement. 2 Remarks. 1. It has been proved by Hensley (1998) that if (kn )n∈N+ is a strictly increasing sequence of positive integers, then for any t ∈ I we have 1 e lim card{j : Θkj ≤ t, 0 ≤ j ≤ n − 1 } = H(t) a.e. in I, (4.1.16) n→∞ n e has been defined in Theorem 2.2.13. Corollary 4.1.22 only covers where H the case kn = n, n ∈ N+ . 2. In the case kn = n, n ∈ N+ , equation (4.1.16) has been conjectured by H.W. Lenstra Jr. Actually, this conjecture is implicit in Doeblin (1940), which enables us to call it after both Doeblin and Lenstra. The Doeblin– Lenstra conjecture has been proved by Bosma et al. (1983) by using, even if not explicitly, Theorem 4.1.16(ii) in a special case. 2 Corollary 4.1.23 The equations n−1

1X Θk = lim n→∞ n

1 = 0.36067 · · · 4 log 2

k=0

n−1

1X Θk Θk+1 = n→∞ n

1 6

lim

k=0

n−1

1X ρk = n→∞ n

1 4 log 2

¶ = 0.10655 · · ·

π2 − 1 = 0.18656 · · · 12 log 2

lim

k=0

and

µ 1−

n−1

1 1 1X 0 Θk = + = 0.86067 · · · n→∞ n 2 4 log 2 lim

k=0

all hold a.e. in I. Proof. We consider just the first equation, leaving the calculation details to the reader, as the same idea underlies the proofs in the other cases. By Corollary 4.1.22 we have n−1

1X e lim I[0,t] (Θk ) = H(t) n→∞ n k=0

a.e. in I for any t ∈ I ∩ Q. Hence for any fixed ω ∈ Ω not belonging to the exceptional set the distribution function n−1

Fn (t) :=

1X I[0,t] (Θk ), n k=0

t ∈ I,

Ergodic theory of continued fractions

253

e as n → ∞. Consequently, converges weakly to H Z

n−1

1X t dFn (t) = Θk n I k=0

should converge to

Z e t dH(t) = I

1 4 log 2

as n → ∞ for any ω ∈ Ω not belonging to the exceptional set, thus a.e. in I. While for the last two equations the reasoning is quite similar, in the case of the second equation we should consider RR two-dimensional distribution functions, and the value of the limit equals I 2 t1 t2 dH(t1 , t2 ). 2 We turn now to limit properties of certain associated random variables. It follows from (4.1.6) that for any measurable real-valued function f on I R such that I |f | dλ < ∞ we have n−1

1X f (¯ sk ) = lim n→∞ n k=0

Z f dγ

a.e. in I 2 .

(4.1.17)

I

From (4.1.17) we can derive a weaker result for the sequences (san )n∈N , a ∈ I. Theorem 4.1.24 Let f : I → R be continuous. Then for any a ∈ I we have Z n−1 1X a f (sk ) = f dγ a.e. in I. lim n→∞ n I k=0

Proof. We have |¯ sk − sak | ≤ (Fk Fk+1 )−1 for any k ∈ N, (ω, θ) ∈ Ω × I, a ∈ I. The result then follows from (4.1.17) and the uniform continuity of f on I. 2 Remarks. 1. The above result also follows from a theorem of Breiman (1960) on account of the Markov property of the sequences (san )n∈N , a ∈ I. 2. The corresponding result for yna = 1/san , n ∈ N+ , a ∈ I, can be easily stated. In this form it can be found in Elton (1987) and Grigorescu and Popescu (1989). 2 Corollary 4.1.25 For any m ∈ N+ and a ∈ I we have n−1 1X a m 1 X (−1)i−1 lim (sk ) = n→∞ n log 2 (m + i) k=0

i∈N+

a.e. in I.

254

Chapter 4

In particular, for m = 1 the value of the limit is (1/ log 2) − 1. The proof amounts to computing the integral Z 1 m 1 t dt, log 2 0 t + 1 which yields the result stated.

2

Taking f (x) = log x, x ∈ I, in (4.1.17) and noting that Z Z 1 1 log x dx log xγ(dx) = log 2 x+1 I 0 µ ¶ Z 1 1 log(x + 1) dx 1 = log(x + 1) log x|0 − log 2 x 0 Z 1 X (−1)k 1 k 1 X (−1)k = − x dx = − log 2 k+1 0 log 2 (k + 1)2 k∈N

µ ¶ 2 π2 1 ζ(2) − ζ(2) = − , = − log 2 4 12 log 2

k∈N

we obtain 1 π2 log(¯ s0 s¯1 · · · s¯n−1 ) = − n→∞ n 12 log 2 lim

a.e. in Ω

or, equivalently 1 π2 log(¯ y0 y¯1 · · · y¯n−1 ) = n→∞ n 12 log 2 lim

a.e. in Ω.

In the last equation we can give an estimate of the convergence rate. We have shown in Example 3.2.11 that Ãn−1 µ ¶!2 X 1 π2 lim Eγ¯ log y¯i − > 0. n→∞ n 12 log 2 i=0

Then for any ε > 0 by Theorem 4.0.4 we obtain n−1

n−1

k=0

k=0

1X 1X log y¯k = − log s¯k n n =

π2 12 log 2

(4.1.18) ³ 1 ´ + o n− 2 log(3+ε)/2 n a.e. in Ω

Ergodic theory of continued fractions

255

as n → ∞, where the constant implied in o depends on ε and the current point (ω, θ) ∈ Ω2 . While we cannot take f (x) = log x, x ∈ I, in Proposition 4.1.24 since this is not a continuous function on I, we can however replace s¯k by sak , k ∈ N, a ∈ I, in (4.1.18) as shown below. Theorem 4.1.26 For any a ∈ I we have 1 π2 log(sa1 sa2 · · · san ) = − n→∞ n 12 log 2 lim

a.e. in Ω.

More precisely, whatever ε > 0, for any a ∈ I we have ³ 1 ´ 1 π2 log(sa1 sa2 · · · san ) = − + o n− 2 log(3+ε)/2 n n 12 log 2

a.e. in Ω

as n → ∞, where the constant implied in o depends on both ε and the current point ω ∈ Ω. In particular, for a = 0 the above equations amount to lim

n→∞

√ 2 n qn = eπ /12 log 2

a.e. in Ω

(4.1.19)

and ´ ³ 1 √ 2 n qn = eπ /12 log 2 + o n− 2 log(3+ε)/2 n

a.e. in Ω

(4.1.20)

as n → ∞, respectively. Proof. By the mean value theorem we have ¯ ¯ ¯ log x − log y ¯ 1 ¯ ≤ ¯ ¯ ¯ x−y min(x, y) for any 0 < x, y ≤ 1, x 6= y. Next, note that µ ¶ v(i(k) ) 1 1 0 < − 1 ≤ max , Fk−1 Fk+1 F2k u(i(k) ) ¡ ¢ for any fundamental interval I(i(k) ) = Ω ∩ u(i(k) ), v(i(k) ) , i(k) ∈ Nk+ , k ∈ N+ . This follows easily from (1.1.12), (1.1.13), and Theorem 1.1.2. Consequently, for any k ∈ N+ and a ∈ I we have ¶ µ 1 1 a = O(g2k ) (4.1.21) |log s¯k − log sk | ≤ max , Fk−1 Fk+1 F2k

256

Chapter 4

as n → ∞, whatever the current point (ω, θ) ∈ Ω. Clearly, by (4.1.18) and (4.1.21) the proof is complete for any a ∈ I. In the special case a = 0 we only should note that s0k =

qk−1 , qk

k ∈ N+ . 2

Remark. The convergence rate in Theorem 4.1.26 with a = 0 is slightly better than that derived by Philipp (1967, p. 122). Equation (4.1.19) was first derived by L´evy (1929) using a different method. 2 Corollary 4.1.27 We have ¯ ¯ ¯ π2 1 pn ¯¯ ¯ lim log ¯ω − ¯ = − n→∞ n qn 6 log 2

a.e. in Ω

and, for any ε > 0, ¯ ¯ ³ ´ ¯ ¯ 1 p π2 n log ¯¯ω − ¯¯ = − + o n−1/2 log(3+ε)/2 n n qn 6 log 2

a.e. in Ω

as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω. Proof. It follows from (1.1.16) that for any ω ∈ Ω and n ∈ N we have ¯ ¯ ¯ 1 1 pn ¯¯ ¯ < ¯ω − ¯ < 2 . 2 qn qn 2qn+1 Then the results stated are immediate consequences of equations (4.1.19) and (4.1.20). 2 Corollary 4.1.28 We have 1 π2 log λ (I(a1 , · · · , an )) = − n→∞ n 6 log 2 lim

a.e. in Ω

and, for any ε > 0, ³ ´ 1 π2 −1/2 (3+ε)/2 log λ (I(a1 , · · · , an )) = − +o n log n n 6 log 2

a.e. in Ω

as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω. Proof. By (1.2.2) and (1.2.5) we have log λ (I(a1 , · · · , an )) = −2 log qn − log(sn + 1),

n ∈ N+ .

Ergodic theory of continued fractions

257

Since sn ∈ I, the results stated are again immediate consequences of equations (4.1.19) and (4.1.20). 2 Remark. The result above implies that the entropy H(τ ) of the continued fraction transformation τ is equal to π 2 /6 log 2. See e.g., Billingsley (1965, p. 134). 2 Corollary 4.1.29 For any ε > 0 we have ³ 1 ´ p 2 n pn (ω) = ω 1/n eπ /12 log 2 + o n− 2 log(3+ε)/2 n

a.e. in Ω

as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω. The proof follows from the inequality ¯ ¯p p ¯n ¯ n ¯ pn (ω) − ω qn (ω)¯ ≤

1 (n−1)/n

Fn+1 Fn

,

ω ∈ Ω, n ∈ N+ ,

which can be easily checked.

2

Corollary 4.1.30 (Khinchin’s fundamental theorem of Diophantine approximation) Let f : N+ → R++ . P (i) If i∈N+ f (i) = ∞ and if (i) ≥ (i + 1)f (i + 1), i ∈ N+ , then a.e. in Ω the inequality ¯ ¯ ¯ ¯ ¯ω − p ¯ < f (q) ¯ q¯ q has infinitely many solutions in integers p, q ∈ N+ with g.c.d.(p, q) = 1. P (ii) If i∈N+ f (i) < ∞, then a.e. in Ω the above inequality has at most finitely many solutions in integers p, q ∈ N+ with g.c.d.(p, q) = 1. The proof follows from Theorem 4.1.26 with a = 0 and F. Bernstein’s theorem (Proposition 1.3.16). See, e.g., Billingsley (1965, p. 48). 2

4.2 4.2.1

Other continued fraction expansions Preliminaries

In this section we study a large class of continued fraction expansions which can be derived from the RCF expansion. Before defining them formally let us briefly describe the underlying idea.

258

Chapter 4

The following rather old and well known remark is fundamental. For a ∈ Z, b ∈ N+ and x ∈ [0, 1) we have 1

a+ 1+

1 b+x

=a+1+

−1 . b+1+x

This operation is called a singularization. We have singularized the digit 1 in [ · ; · · · , a, 1, b, · · · ] The effect of a singularization is that a new and shorter continued fraction expansion is obtained. Moreover, we will see that the sequence of convergents associated with the ‘new’ continued fraction expansion is a subsequence of the sequence of convergents of the ‘old’ one. For example, given n ∈ N+ , if we singularize the digit an+1 (ω) = 1 in the RCF expansion of some ω ∈ Ω, then the sequence of convergents of the ‘new’ continued fraction expansion is obtained by deleting the nth term from the sequence of RCF convergents of ω. Obviously, the ‘new’ continued fraction expansion is no longer an RCF expansion! Starting from the RCF expansion of a given x ∈ [0, 1) it is not possible (i) to singularize two consecutive digits equal to 1, and (ii) to singularize digits other than 1. It is also important to note that once we have singled out digits equal to 1 to be singularized, the order in which they are singularized has no impact on the final result. Of course, just one singularization does not make the new expansion ‘really faster’ than the old one. However, many algorithms can be devised such that for almost all x ∈ [0, 1) infinitely many convergents are skipped. Before considering such algorithms, let us fix notation. Let x ∈ [0, 1) with RCF expansion x = [a1 , a2 , · · · ] . Any finite or infinite string of consecutive digits ak (x) = 1,

ak+1 (x) = 1,

··· ,

ak+n−1 (x) = 1,

k ∈ N+ , n ∈ N+ ∪{∞}

is called a 1-block if either k = 1 and ak+n (x) 6= 1 (if n is finite) or k > 1 and ak−1 (x) 6= 1, ak+n (x) 6= 1 (if n is finite). The first algorithm we consider is: A For any x ∈ [0, 1) singularize the first, third, fifth, etc., components in any 1-block.

Ergodic theory of continued fractions

259

Applying algorithm A to a (finite or infinite) RCF expansion [a1 , a2 , · · · ] yields a (finite or infinite) continued fraction of the form e1

b0 + b1 +

(4.2.1)

e2 . b2 + . .

or [b0 ; e1 /b1 , e2 /b2 , · · · ], for short. In (4.2.1) we have b0 ∈ {0, 1}, bn ∈ N+ , en ∈ {−1, 1}, and bn + en+1 ≥ 2, n ∈ N+ . √ Example 4.2.1 Let x = (−3 + 17)/2 = 0.56155 · · · . As a quadratic irrationality x should have a periodic RCF expansion (see Subsection 1.1.3). We easily find that £ ¤ x = [0; 1, 1, 3, 1, 1, 3, · · · ] = 0; 1, 1, 3 . Applying algorithm A to the RCF expansion of x yields x = [1; −1/2, 1/4, −1/2, 1/4, · · · ] h i or x = 1; −1/2, 1/4 , for short.

2

By the very construction, the convergents pen := b0 + qne b1 +

e1

, e2 en . b2 + . . + bn

n = 1, 2, · · · ,

of (4.2.1) are a subset of the convergents of [a1 , a2 , · · · ]. Therefore in the case of an infinite RCF expansion we have pen = [a1 , a2 , · · · ] . e n→∞ qn lim

Several questions naturally arise : (i) Are there other algorithms yielding continued fraction expansions with the property above? (ii) Does algorithm A always yield fastest continued fraction expansions? Closest expansions? (The precise meaning of these terms will be explained later. See Subsection 4.3.3. Informally, one would like the denominators qne , n ∈ N+ , to grow as fast as possible while the approximation coefficients associated with the new expansion to be as small as possible.)

260

Chapter 4

(iii) Is there an underlying ergodic transformation? We can easily answer the first question. The second algorithm we consider is: B For any x ∈ [0, 1) singularize the last, third from last, fifth from last, etc., components in any 1-block. Example 4.2.2 Let x be as in Example 4.2.1. Applying algorithm B to the RCF expansion of x yields x = [1; 1/2, −1/4, 1/2, −1/4, · · · ] , h i or x = 1; 1/2, −1/4 , for short.

2

Clearly, in general, algorithms A and B yield different results. Actually it is possible to show that, in a sense, one cannot do better than either of these algorithms. Since one can singularize just digits equal to 1, and since two consecutive 1’s cannot be both singularized, it is not possible to go faster than either algorithms A or B. Slower algorithms are trivially at hand. Here is an example of such an algorithm: C For any x ∈ [0, 1) singularize all digits an+1 (x) = 1 for which Θn (x) ≥ 1/2 (see Subsection 1.3.2) whatever n ∈ N. In Subsection 4.3.2 it is shown that algorithm C is well defined, that is, not in conflict with the requirements of the singularization procedure. Example 4.2.3 Let x be as in Example 4.2.1. A simple calculation shows that the first four digits equal to 1 in the RCF expansion of x should not be singularized if we apply algorithm C to it. 2 From this example it is clear that, in general, algorithm C does not yield expansions which are fastest. In Subsection 4.3.3 we will discuss an algorithm which yields both fastest and closest expansions. This algorithm was introduced by Selenius (1960) and—independently—by Bosma (1987), and is called the optimal continued fraction (OCF) expansion. Finally, in Subsection 4.2.5 we will answer question (iii) above.

4.2.2

Semi-regular continued fraction expansions

Apart from the RCF expansion there exist many so called semi-regular continued fraction expansions. To define the latter we start by defining a continued fraction (CF) as a pair of two sets e = (ek )k∈M and (ak )k∈{0}∪M of

Ergodic theory of continued fractions

261

integers with ek ∈ {−1, 1} and a0 ∈ Z, ak ∈ N+ , k ∈ M , where either M = {k : 1 ≤ k ≤ n} for some n ∈ N+ or M = N+ . Next, for arbitrary indeterminates xi , yi , 1 ≤ i ≤ n, n ∈ N+ , write [y1 /x1 ] =

y1 , x1

[y1 /x1 , · · · , yn /xn ] =

y1 , x1 + [y2 /x2 , · · · , yn /xn ]

n ≥ 2.

If card M = n ∈ N+ then we say that the CF considered has length n and assign it the value [a0 ; e1 /a1 , · · · , en /an ] := a0 + [e1 /a1 , · · · , en /an ] =

e1

a0 + a1 +

e2

∈ R ∪ {−∞, ∞}.

en . a2 + . . + an

If M = N+ then we say that the CF considered is infinite and look at it as the sequence ((ek )1≤k≤n , (ak )0≤k≤n )n∈N+ of all finite CF’s which are obtained by finite truncation. In both cases we can associate with a CF its convergents pe0 := a0 , q0e

pek := [a0 ; e1 /a1 , · · · , ek /ak ] , qke

1 ≤ k ≤ n,

for either some n ∈ N+ or any n ∈ N+ , with pe0 = a0 , q0e = 1, pek ∈ Z, qke ∈ N+ , g.c.d. (|pek |, qke ) = 1, 1 ≤ k ≤ n. To ensure the convergence of the sequence of convergents of an infinite CF, which would enable us to speak of a CF expansion, additional conditions should be imposed on the ek and ak , k ∈ N+ . One possibility, yielding the so called semi-regular continued fraction (SRCF ) expansion, is to ask that ei+1 + ai ≥ 1, i ∈ N+ , and ei+1 + ai ≥ 2 infinitely often (in the infinite case). It can be shown that the sequence of convergents of an infinite SRCF expansion converges to an irrational number. See Tietze (1913) [cf. Perron (1954, §37)]. This will be written as pek e := [a0 ; e1 /a1 , e2 /a2 , · · · ] . k→∞ qk lim

As in the RCF expansion case a matrix theory is associated with an SRCF expansion (or, more generally, with a CF). Consider (cf. Remark

262

Chapter 4

preceding Proposition 1.1.1) the matrices µ ¶µ ¶ µ ¶ 0 1 0 1 1 a0 e A0 := = , 1 0 1 a0 0 1

µ Aen

:=

0 en 1 an

¶ ,

n ∈ N+ ,

and Mne := Ae0 · · · Aen ,

n ∈ N.

Clearly, det M0e = 1, det Mne = (−1)n e1 · · · en ,

n ∈ N+ .

(4.2.2)

One can prove that µ Mne

=

pen−1 pen e qn−1 qne

¶ ,

n ∈ N,

(4.2.3)

e with pe−1 = 1, q−1 = 0, which implies that the sequences (pen )n∈N and e (qn )n∈N satisfy the recurrence relations

pen = an pen−1 + en pen−2 ,

e e qne = an qn−1 + en qn−2 ,

n ∈ N+ .

The second equation above implies at once that e qn−1 = [1/an , en /an−1 , · · · , e2 /a1 ] , qne

sen :=

n > 1,

(4.2.4)

and clearly se1 := q0e /q1e = 1/a1 . It follows from (4.2.2) and (4.2.3) that e pe−1 q0e − pe0 q−1 = 1, e pen−1 qne − pen qn−1 = (−1)n e1 · · · en ,

n ∈ N+ ,

showing that indeed g.c.d (|pen |, qne ) = 1, n ∈ N. Next (see again the RCF expansion case), looking at Mne as a M¨obius transformation one can show that Mne (0) =

pen , qne

n ∈ N.

More generally, Mne (z) =

pen + zpen−1 = [a0 ; e1 /a1 , · · · , en−1 /an−1 , en /(an + z)] , e qne + zqn−1

n ≥ 2,

Ergodic theory of continued fractions

263

for any z ∈ C, z 6= −1/sen , and µ ¶ pe1 + zpe0 e1 e = M1 (z) = e a0 + a1 + z q1 + zq0e for any z ∈ C, z 6= −1/se1 . It follows that putting ten = [en+1 /an+1 , · · · ] , n ∈ N, we have pe + ten pen−1 a0 + te0 = ne , n ∈ N. e qn + ten qn−1 Finally, defining Θen (a0

+

te0 )

=

¯ ¯

¯ pen ¯¯ − e ¯, qn

n ∈ N,

en+1 ten |ten | = , sen ten + 1 sen ten + 1

n ∈ N.

(qne )2 ¯¯a0

+

te0

it is easy to check that Θen (a0 + te0 ) =

(4.2.5)

Since (ten )−1 = en+1 (an+1 + ten+1 ), sen + en+1 an+1 = we also have Θen (a0 + te0 ) =

sen+1 , sen+1 ten+1 + 1

en+1 , sen+1

n ∈ N,

n ∈ N.

(4.2.6)

The numbers Θen , n ∈ N, associated with a (finite or infinite) SRCF expansion are called its approximation coefficients. Compare with the RCF expansion case in Subsection 1.3.2. We conclude this subsection with a few examples of well known SRCF expansions. 1. The RCF expansion: this is the SRCF expansion for which en = 1 for any n ∈ N+ . 2. Nakada’s α-expansions for α ∈ [1/2, 1]: see Subsection 4.3.1. 3. The nearest integer continued fraction (NICF) expansion: this is the SRCF expansion for which en+1 +an ≥ 2 for any n ∈ N+ . It was introduced by Minnigerode (1873) and studied by Hurwitz (1889). Actually, the NICF expansion is the 1/2-expansion, and is obtained by applying algorithm A defined in Subsection 4.2.1 to the RCF expansion.

264

Chapter 4

4. The singular continued fraction (SCF) expansion: this is the SRCF expansion for which en + an ≥ 2, n ∈ N+ . It was introduced by Hurwitz √ (1889). Actually, the SCF expansion is the g-expansion with g = ( 5−1)/2, the golden ratio, and is obtained by applying algorithm B defined in Subsection 4.2.1 to the RCF expansion. 5. Minkowski’s diagonal continued fraction (DCF) expansion: this is the SRCF expansion which is obtained by applying algorithm C defined in Subsection 4.2.1 to the RCF expansion. See Subsection 4.3.2. 6. The continued fraction with odd incomplete quotients (Odd CF) expansion: this is the SRCF expansion for which e1 = 1, an ≡ 1 mod 2, en+1 + an ≥ 2, n ∈ N+ . It was introduced by Rieger (1981a) [see also Barbolosi (1990), Hartono and Kraaikamp (2002), and Schweiger (1995, Ch. 3)]. 7. The continued fraction with even incomplete quotients (Even CF) expansion: this is the SRCF expansion for which e1 = 1, an ≡ 0 mod 2, en+1 + an ≥ 2, n ∈ N+ . See also Kraaikamp and Lopes (1996) and Schweiger (1995, Ch. 3).

4.2.3

The singularization process

The following two easily checked identities are fundamental for the theory which we develop in this section: µ

µ

1 a 0 1

0 c 1 a

¶µ

¶µ

0 c 1 1

0 d 1 1

¶µ

¶µ

0 1 1 b

0 1 1 b



µ =



µ =

1 a+c 0 1

0 c 1 a+d

¶µ

¶µ

0 −c 1 b+1

0 −d 1 b+1

where a, b, c and d are arbitrary real or complex numbers. Let (ek )k∈M , (ak )k∈{0}∪M

¶ ,

(4.2.7)

,

(4.2.8)



(4.2.9)

be a (finite or infinite) CF with a`+1 = 1, e`+2 = 1 for some ` ∈ N for which ` + 2 ∈ M . The transformation σ` which takes (4.2.9) into the CF (e ek )k∈M \{`+1} , (e ak )k∈{0}∪(M \{`+1})

(4.2.10)

Ergodic theory of continued fractions

265

with eek = ek , k ∈ M, k < ` + 1 or k ≥ ` + 3, ee`+2 = −e`+1 , e ak = ak , k ∈ {0} ∪ M, k < ` or k ≥ ` + 3, e a` = a` + e`+1 , e a`+2 = a`+2 + 1, is called a singularization of the pair (a`+1 , e`+2 ). Let (pek /qke )k∈{0}∪M and (e pek /e qke )k∈{0}∪(M \{`+1}) be the sets of convergents associated with (4.2.9) and (4.2.10), respectively. We are going to derive the fe )k∈{0}∪(M \{`+1}) relationship between these sets. Let (Mke )k∈{0}∪M and (M k be the sets of matrices defined in the preceding subsection, associated with (4.2.9) and (4.2.10), respectively. We have µ e ¶ µ ¶ pek 0 e f = Mk , k ∈ {0} ∪ (M \{` + 1}). qeke 1 fe = M e for k < ` and, moreover, by (4.2.7) and (4.2.8) we have Clearly, M k k e f = Mk+1 for k ≥ ` + 1. The matrix M fe will then be given by M k ` ¶ µ 0 e` e e f M` = M`−1 1 a` + e`+1 µ ¶ 0 1 e := with M−1 and e0 = 1. Hence 1 0 µ fe = M e M ` `+1 µ =

e M`+1

0 e`+1 1 1

¶−1 µ

−e`+1 0 e`+1 1

0 e` 1 a`

¶−1 µ

0 e` 1 a` + e`+1



¶ .

Therefore µ

pee` qe`e



µ =

e M`+1

−e`+1 0 e`+1 1

¶µ

0 1



µ =

pe`+1 e q`+1

¶ ,

and we can state the following result. Proposition 4.2.4 Let ` ∈ N such that ` + 2 ∈ M . The set of convergents (e pek /e qke )k∈{0}∪(M \{`+1}) resulting after the singularization σ` of the pair (a`+1 , e`+2 ) = (1, 1), is obtained by deleting pe` /q`e from the set (pek /qke )k∈{0}∪M . In what follows a singularization process will consist of a set S of continued fractions and a rule which determines in an unambiguous way the pairs a`+1 = 1, e`+2 = 1 that should be singularized for any member of S.

266

Chapter 4

Remark. For an infinite CF the sequence of convergents of the ‘new’ CF obtained after singularization, is a subsequence of the sequence of convergents of the ‘old’ one. Therefore if the ‘old’ CF converged to x, so does the ‘new’ one, and it converges faster. In particular, this holds for any SRCF expansion to be singularized.

4.2.4

S-expansions

From now on we will concentrate on one special singularization process. The set S of continued fraction expansions to be singularized is the set of all (finite or infinite) RCF expansions. Since in this case all the e’s are +1, we will speak of singularizing a`+1 = 1 instead of singularizing the pair a`+1 = 1, e`+2 = 1. Before describing the general rule (as we should according to the definition just given) remark that Example 4.2.1 actually describes a singularization process: S plus algorithm A yield the NICF expansion! Now, notice that algorithm A is equivalent to singularize a`+1 = 1 if and only if (τ ` , s` ) ∈ SA ,

` ∈ N,

where (cf. Subsection 1.3) τ ` = [a`+1 , a`+2 , · · · ], s` = [a` , · · · , a1 ], ` ∈ N, with s0 = 0, and SA = [1/2, 1) × [0, g] ⊂ I 2 . We recall that the golden ratios g and G are defined as √ 5−1 , G = g + 1. g= 2 Similarly, we can verify that algorithm B—leading to Hurwitz’ SCF expansion— is equivalent to singularize a`+1 = 1 if and only if (τ ` , s` ) ∈ SB ,

` ∈ N,

where SB = [g, 1) × I ⊂ I 2 . Finally, using properties of the approximation coefficients Θn , n ∈ N, defined in Subsection 1.3.2, we can also show that algorithm C—leading to Minkowski’s DCF expansion—is equivalent to singularize a`+1 = 1 if and only if (τ ` , s` ) ∈ SC ,

` ∈ N,

Ergodic theory of continued fractions where SC

½ = (x, y) ∈ I 2 ;

267

x 1 ≥ xy + 1 2

¾ .

These three examples lead to the idea of prescribing by a subset S ⊂ I 2 which digits 1 = a`+1 are to be singularized in the RCF expansion in the form of the condition (τ ` , s` ) ∈ S, ` ∈ N. Such an S cannot be just any set but must satisfy the conditions S ⊂ [1/2, 1) × I, since otherwise a`+1 would not be equal to 1, and S ∩ τ¯(S) ⊂ {(g,g)}, since otherwise one would be forced to singularize two consecutive digits both equal to 1, which is impossible. Thus we are lead—in a natural way— to the following definition which exactly describes all S-expansions. Definition 4.2.5 A subset S of I 2 is said to be a singularization area if and only if (i) S ∈ BI2 and γ¯ (∂S) = 0; (ii) S ⊂ [1/2, 1) × I; (iii) S ∩ τ¯(S) ⊂ {(g,g)}. If S is a singularization area, then the S-expansion of ω ∈ Ω is defined as the SRCF expansion converging to ω which is obtained from the RCF expansion of ω by singularizing a digit 1 = a`+1 = a`+1 (ω) if and only if (τ ` , s` ) ∈ S, whatever ` ∈ N. Remarks. 1. We need the continuity condition γ¯ (∂S) = 0 in order to be able to draw the following conclusion. Let A(S, n) be the random variable defined as A(S, n) = card{j : (τ j , sj ) ∈ S, 1 ≤ j ≤ n}, By Theorem 4.1.16(ii) we then have lim

n→∞

A(S, n) = γ¯ (S) a.e.. n

n ∈ N+ .

268

Chapter 4

2. Actually, the sets SA and SB do not satisfy condition (iii). Indeed, in both cases, S ∩ τ¯(S) is a line segment. Of course, this can be easily repaired by taking ∗ SA = ([1/2, g] × [0, g]) ∪ ((g, 1) × [0, g)) and ∗ SB = ([g, 1) × [0, g]) ∪ ((g, 1) × (g, 1])

instead of SA and SB , respectively. 3. Since γ([1/2, 1) × I) = (log 2)−1 log

4 = 0.41503 · · · , 3

a singularization area S never can have γ-measure greater that 0.41503 · · · . But condition (iii) forces the maximal possible γ-measure of a singularization area S to be essentially smaller than 0.41503 · · · as shown below. Proposition 4.2.6 For any singularization area S we have γ(S) ≤ 1 −

log G = 0.30575 · · · , log 2

where the bound is sharp. ∗ with S ∗ as before and M = ([0, g) × (g, 1]) ∪ Proof. Define M1 = SA 2 A ([g, 1) × [g, 1]). It is easy to check that M2 = τ¯(M1 ) and

γ(M1 ) = γ(M2 ) = 1 −

log G . log 2

Next, put S1 = S ∩ M1 and S2 = S ∩ M2 . Clearly, τ¯(S1 ) ∪ S2 ⊂ M2 and by Definition 4.2.5(iii) we have τ¯(S1 ) ∩ S2 ⊂ {(g, g)} , see also Figure 4.1. We now see that γ(S) = γ(S1 ) + γ(S2 ) = γ(¯ τ (S1 )) + γ(S2 ) = γ(¯ τ (S1 ) ∪ S2 ) ≤ γ(M2 ) = 1 −

log G . log 2

Ergodic theory of continued fractions 1

269

..... .... ...... ......... ............ . . . . .......... ..... .. ...... ... ...... ... ...... ..... . . . . . . .. ....... ... ...... ... ....... ... ....... ....... .. . . . . . . . . . .. . ........ ... ......... .. ......... .. .......... ... .......... . . . . . . . . . . . . . .......... ... ............ ............. ... ... .. . ... ... ... .. . ... ... ... .... .. ... ... .. . ... ... .. ...

τ¯(S1 )

S2

g

S1

0

1 2

g

1

Figure 4.1: S = S1 ∪ S2 and τ¯(S1 ) That a singularization area actually can have γ¯-measure 1 − (log 2)−1 (log G) ∗ and S ∗ . is shown by the cases of SA 2 B On account of Proposition 4.2.6, a singularization area S will be called maximal if log G . γ(S) = 1 − log 2 Given a singularization area S, let BS be a subset of I 2 such that whatever ω = [a1 , a2 , · · · ] ∈ Ω any digit 1 = a`+1 = a`+1 (ω) is unchanged by Ssingularization if and only if (τ ` , s` ) ∈ BS , ` ∈ N. Clearly, such a set—which determines the occurrence of digits equal to 1 in the S-expansion—should have the following properties: (1) BS ⊂ [1/2, 1) × I since a`+1 = 1; (2) BS ∩ S = ∅ since a`+1 = 1 is not singularized; (3) τ¯−1 (BS ) ∩ S = ∅ since a` is not singularized; (4) τ¯(BS ) ∩ S = ∅ since a`+2 is not singularized. On account of the considerations above, the subset BS of I 2 defined as BS = ([1/2, 1) × I) \ (S ∪ τ¯−1 (S) ∪ τ¯(S)) is called the preservation area of 1’s. We have the following result.

270

Chapter 4

Proposition 4.2.7 If S is maximal, then γ(BS ) = 0. In general, the converse of this statement does not hold. Proof. Let M1 , M2 , S1 and S2 be as in the proof of Proposition 4.2.6. Put moreover B1 = BS ∩ M1 , B2 = BS ∩ M2 . It is now easy to see that τ¯(B1 ) ∩ (¯ τ (S1 ) ∪ S2 ) = ∅, τ¯(B1 ) ∪ τ¯(S1 ) ∪ S2 ⊂ M2 , B2 ∩ (¯ τ (S1 ) ∪ S2 ) = ∅, B2 ∪ τ¯(S1 ) ∪ S2 ⊂ M2 . Hence, since S is maximal, γ¯ (B2 ) = 0,

γ¯ (B1 ) = γ¯ (¯ τ (B1 )) = 0,

which completes the proof. (The reader is invited to give an example where the converse does not hold.) 2 We conclude this subsection by deriving a number of results, which are obtained as easy spin-off. Let S be a singularization area and ω ∈ Ω. As the sequence (e pek /e qke )k∈N+ of S-convergents of ω is a subsequence of the sequence (pn /qn )n∈N+ of its RCF convergents, there exists an increasing random function nS : N+ → N+ such that µ e ¶ µ ¶ pnS (k) pek = , k ∈ N+ . qeke qnS (k) Theorem 4.2.8 Let S be a singularization area. Then lim

k→∞

1 nS (k) = k 1 − γ(S)

a.e..

Proof. It follows from the definition of nS that nS (k)

nS (k) = k +

X

IS (τ j , sj ) .

j=1

Since γ¯ (∂S) = 0, by Theorem 4.1.16(ii) we have 1 =

nS (k) X k 1 + lim IS (τ j , sj ) k→∞ nS (k) k→∞ nS (k)

lim

j=1

=

k + γ¯ (S) a.e., k→∞ nS (k) lim

Ergodic theory of continued fractions

271

whence the result stated.

2

Remark. Theorem 4.2.8 implies that nS (k) log 2 ≤ = 1.4404 · · · k→∞ k log G lim

a.e.,

the upper bound being attained if and only if S is maximal. In words: sparsest sequences of S-convergents are given by maximal singularization ∗ which yields the NICF is maximal, we areas. As the singularization area SA have thus re-proved a theorem of Adams (1979), see also Jager (1982) and Nakada (1981). 2 The following corollary gives the S-expansion analogues of two classical results of P. L´evy in Subsection 4.1.3. Corollary 4.2.9 Let S be a singularization area and let (e pek /e qke )k∈N+ be the corresponding sequence of S-convergents. Then 1 log qeke = k→∞ k ¯ ¯ e¯ ¯ p e 1 k lim log ¯¯ω − e ¯¯ = k→∞ k qek lim

1 π2 1 − γ¯ (S) 12 log 2 −π 2 1 1 − γ¯ (S) 6 log 2

a.e., a.e.

Proof. This is an immediate consequence of Theorems 4.1.26 and 4.2.8. We have nS (k) 1 1 π2 1 log qeke = lim log qnS (k) = k→∞ k→∞ k k nS (k) 1 − γ¯ (S) 12 log 2 lim

and similarly for the second equation.

a.e., 2

By the mechanism of singularization the collection of RCF convergents that are deleted to obtain the S-convergents has the same cardinality as the set of the ee` , ` ∈ N+ , which are equal to −1. It is easy to see that à ! k X 1 nS (k) − k = k− ee` . 2 `=1

Therefore we can state the following result. Corollary 4.2.10 We have k 1X 1 − 3γ(S) ee` = k→∞ k 1 − γ(S)

lim

`=1

a.e..

272

Chapter 4

The minimum of the limit above is attained if and only if S is maximal, and is equal to 1 G3 log = 0.11915 · · · . log G 4 We conclude this subsection by giving the S-expansion analogue of Legendre’s theorem—see Corollary 1.2.4. Theorem 4.2.11 Let ¡ ¢ A(t) = (x, y) ∈ I 2 : x/(xy + 1) < t, y ∈ Q ,

0 < t ≤ 1,

and define cS = sup (t ∈ (0, 1] : A(t) ∩ S = ∅) . Put LS = min(cS , 1/2) . Let ω ∈ Ω and p, q ∈ N+ with g.c.d.(p, q) = 1, p < q. If ¯ ¯ ¯ ¯ p 2 e = Θ(ω, e Θ p/q) = q ¯¯ω − ¯¯ < LS , q then p/q is an S-convergent of ω. The constant LS is best possible. e Proof. Suppose that Θ(ω, p/q) < LS and that p/q is not an S-convergent of ω. Since LS ≤ 1/2, p/q is an RCF convergent of ω by Corollary 1.2.4, i.e., there exists n ∈ N+ such that p/q = pn /qn . Now, since pn /qn is not an S-convergent, by the very definition of an S-expansion we have (τ n , sn ) ∈ S. The definition of LS then implies τn ≥ cS ≥ LS , sn τ n + 1 which by the definition of the approximation coefficients in Subsection 1.3.2 yields e Θ(ω, p/q) = Θn ≥ LS , contrary to the hypothesis. Finally, it follows from the definition of LS and Corollary 1.2.4 that LS is best possible. 2 Remarks. 1. Rieger (1979) and Adams (1979) gave a proof of Corollary 4.2.10 for the special case of the NICF expansion, using a formula of Spence and Abel for the dilogarithm. We see that these transcendent techniques can be avoided, which was also observed by Jager (1982).

Ergodic theory of continued fractions

273

∗ (the singularization area 2. An easy calculation shows that for S = SA yielding the NICF expansion) we have

LS = g2 = 0.38166 · · · . This value was also found by Ito (1987) and by Jager and Kraaikamp (1989). Their methods are different. Ito (op. cit.) developed a theory for determining the Legendre constants for a class of continued fractions, larger than the class of S-expansions. Unfortunately, his method is rather complicated.

4.2.5

Ergodic properties of S-expansions

In this subsection we show that for any S-expansion there exists an ‘underlying’ two-dimensional ergodic dynamical system. These systems will be obtained via an induced transformation from (I 2 , BI2 , τ¯, γ¯ ), the two-dimensional ergodic dynamical system underlying the RCF expansion. Using the ergodic dynamical systems thus obtained we will then deduce more metric and arithmetic properties of S-expansions. Let S be a singularization area and let x = [ a0 ; a1 , a2 , · · · ] = a0 + [ a1 , a2 , · · · ], a0 ∈ Z, [ a1 , a2 , · · · ] ∈ Ω. Denote by [e a0 ; ee1 /e a1 , ee2 /e an , · · · ] the S-expansion of x (cf. Subsection 4.2.3). Recall that this is an SRCFexpansion satisfying een+1 + e an ≥ 1, n ∈ N+ . As before let τ n = [ an+1 , an+2 , · · · ] , sn = [an , · · · , a1 ] ,

n ∈ N,

n ∈ N+ , s0 = 0,

and put e ten = [ een+1 /e an+1 , · · · ] ,

seen

n ∈ N,

 0 if n = 0,      1/e a1 if n = 1, =      [1/e an , een /e an−1 , · · · , ee2 /e a1 ] if n > 1.

By equations (1.2.2) and (4.2.4) we have sn = qn−1 /qn ,

e seen = qen−1 /e qne ,

n ∈ N,

274

Chapter 4

where (pn /qn )n∈N and (e pen /e qne )n∈N are the sequences of RCF convergents and S-convergents of x, respectively. Also,  pn + τ n pn−1   , n    qn + τ qn−1 x = (4.2.11) e e e  e  p e + t p e k k−1  k   e ee e qek + tk qek−1 e = 0. Finally, put for any k, n ∈ N, with p−1 = pee−1 = 1, and q−1 = qe−1

∆ := I 2 \ S ,

∆− = τ¯(S),

∆+ = ∆ \ ∆− .

Theorem 4.2.12 For any n ∈ N+ the following assertions hold: (i) (τ n , sn ) ∈ S

if and only if pn /qn is not an S-convergent;

(ii) if pn /qn is not an S-convergent, then both pn−1 /qn−1 and pn+1 /qn+1 are S-convergents; (iii) (τ n , sn ) ∈ ∆+ is equivalent to the existence of k = k(n) ∈ N such that  e  pek−1 = pn−1 , peek = pn , 

e qek−1

=

qn−1 , qeke

and

= qn ,

 e tk = τ n (⇒ ek+1 = +1),  e 

seek = sn ;

(iv) (τ n , sn ) ∈ ∆− is equivalent to the existence of k = k(n) ∈ N such that  e  pek−1 = pn−2 , peek = pn , 

e qek−1

=

qn−2 , qeke

= qn ,

and

 e tk = −τ n /(τ n + 1) (⇒ ek+1 = −1),  e 

seek = 1 − sn .

Proof. (i) This follows directly from Definition 4.2.5 and 4.2.4. (ii) This follows from the fact that in the sequence of RCF we cannot remove two or more consecutive convergents and sequence of convergents of some srcf. (iii) If (τ n , sn ) ∈ ∆+ then the very definition of ∆+ implies (τ n−1 , sn−1 ) 6∈ S and (τ n , sn ) 6∈ S .

Proposition convergents still have a that

Ergodic theory of continued fractions

275

Hence neither an nor an+1 is singularized and therefore both pn−1 /qn−1 and pn /qn are S-convergents. But then there exists k ∈ N+ such that peek−1 pn−1 = , e qek−1 qn−1

peek pn = . e qek qn

Since all the fractions are in their lowest terms and their denominators are positive we should have peek−1 = pn−1 ,

peek = pn ,

e qek−1 = qn−1 ,

qeke = qn .

Then (4.2.11) implies that pn + e tek pn−1 pn + τ n pn−1 = , qn + τ n qn−1 qn + e tek qn−1 hence e tek = τ n . Finally, we have seek =

e qek−1 qn−1 = = sn . e qek qn

The converse is obvious. (iv) If (τ n , sn ) ∈ ∆− then the very definition of ∆− implies that (τ n−1 , sn−1 ) ∈ S and (τ n , sn ) 6∈ S . Hence an = 1, and it should be singularized according to Definition 4.2.5. Then pn−2 /qn−2 and pn /qn are consecutive S-convergents by (ii). Again, there exists k ∈ N+ such that

Since

peek−1 = pn−2 ,

peek = pn ,

e qek−1 = qn−2 ,

qeke = qn .

pn = an pn−1 + pn−2 = pn−1 + pn−2 , (4.2.12) qn = an qn−1 + qn−2 = qn−1 + qn−2

we have seek =

qn−2 qn − qn−1 = = 1 − sn . qn qn

276

Chapter 4

Next, from (4.2.11) we have pn + e tek pn−2 pn + τ n pn−1 = , n qn + τ qn−1 qn + e tek qn−2 and using equations (4.2.12) and (1.1.12) we obtain e tek + e tek τ n + τ n = 0 , whence

τn e tek = − n . τ +1

The converse is obvious.

2

Now, define the transformation τ¯∆ : ∆ → ∆ as   τ¯(x, y) if τ¯(x, y) 6∈ S, τ¯∆ (x, y) =  2 τ¯ (x, y) if τ¯(x, y) ∈ S for any (x, y) ∈ ∆ = I 2 \ S. This is a very simple instance of an induced transformation. Cf., e.g., Petersen (1983, Sections 2.3 and 2.4). According to the general theory, it follows that (∆, B∆ , τ¯∆ , γ¯∆ ) is an ergodic dynamical system. Here γ¯∆ is the probability measure on B∆ with density 1 1 , γ¯ (∆) log 2 (xy + 1)2

R2

(x, y) ∈ ∆.

Next, Theorem 4.2.12 leads us naturally to consider the map M : ∆ → defined by  (x, y) ∈ ∆+ ,  (x, y), M (x, y) =  (−x/(x + 1), 1 − y) (x, y) ∈ ∆− .

Set AS = M (∆). Clearly, AS consists of ∆+ = I 2 \ (S ∪ τ¯(S)) and the image M (¯ τ (S)) of ∆− = τ¯(S) under M , which lies in the second quadrant of the plane. Also, M : ∆ → AS is one-to-one. We can then define the transformation τ¯S : AS → AS as τ¯S = M τ¯∆ M −1 , and Theorem 4.2.12 implies that ¡e ¢ ¡ e e¢ e tk+1 , seek+1 = τ¯S e tk , sek , k ∈ N. (4.2.13)

Ergodic theory of continued fractions

277

It is immediate that the determinant of the Jacobian J of M |∆− is equal to 1/(x + 1)2 > 0. For (x, y) ∈ ∆− we have −1

|J|

1 = (xy + 1)2

µ

x+1 xy + 1

¶2 =

1 , (st + 1)2

where t = −x/(x + 1) and s = 1 − y. This shows that ZZ ZZ dx dy ds dt 1 1 |J| |J|−1 = log 2 M (∆− ) (st + 1)2 log 2 ∆− (xy + 1)2

(4.2.14)

= γ¯ (¯ τ (S)) = γ¯ (S). Note also that ¡ ¢ γ¯ ∆+ = 1 − γ¯ (S) − γ¯ (¯ τ (S)) = 1 − 2¯ γ (S) .

(4.2.15)

Theorem 4.2.13 Let ρ be the probability measure on BAS with density 1 1 , (1 − γ¯ (S)) log 2 (xy + 1)2

(x, y) ∈ AS .

Then (AS , BAS , τ¯S , ρ) is an ergodic dynamical system which underlies the corresponding S-expansion. Proof. The conclusion follows on account of equations (4.2.13) through (4.2.15) noting that the dynamical systems (∆, B∆ , τ¯∆ , γ¯∆ ) and (AS , BAS , τ¯S , ρ) are isomorphic by the very definition of the latter. See Remark 1 following Proposition 4.0.5 and Petersen (1983, Sections 1.3 and 2.3). 2 Remark. The entropy of the maps τ¯∆ and τ¯S can be easily obtained using Abramov’s formula [see e.g. Petersen (1983, p. 257)]. Since H(τ ) = π 2 /6 log 2 (see Remark following Corollary 4.1.28), we have 1 π2 H(τ ) = = H(¯ τS ), γ¯ (∆) 1 − γ¯ (S) 6 log 2 ¡ ¢ which shows that entropy is maximal π 2 /6 log G for maximal singularization areas. 2 H(¯ τ∆ ) =

At first sight the dynamical system (AS , BAS , τ¯S , ρ) looks very intricate. However, it is quite helpful. We have the following result.

278

Chapter 4 Theorem 4.2.14 Let the map f : AS → R ∪ {∞} be defined by ¯ ¯ (1) f (x, y) = ¯x−1 ¯ − τ¯S (x, y),

(x, y) ∈ AS ,

(1)

where τ¯S (x, y) is the first coordinate of τ¯S (x, y). Let a : [0, 1) → N+ ∪ {∞} be defined as in Chapter 1, that is,  −1  bt c if t ∈ (0, 1), a(t) =





if t = 0.

We have  a(x)          a(x) + 1 f (x, y) =

if sgn x = 1

and τ¯(x, y) 6∈ S,

if sgn x = 1

and τ¯(x, y) ∈ S,

  a(−x/(x + 1)) + 1 if sgn x = −1 and τ¯(M −1 (x, y)) 6∈ S,        a(−x/(x + 1)) + 2 if sgn x = −1 and τ¯(M −1 (x, y)) ∈ S

and ¡ ¢ τ¯S (x, y) = |x−1 | − f (x, y), (y f (x, y) + sgn x)−1 ,

(x, y) ∈ AS .

Proof. We should distinguish four cases, of which only two will be considered here. The other two cases can be treated similarly. Cf. Kraaikamp (1991, p. 26). 1. Let (x, y) ∈ ∆+ and τ¯(x, y) ∈ S. Then sgn x = 1 and µ τ¯∆ (M

−1

2

(x, y)) = τ¯ (x, y) = τ¯ µ

1 1 − a(x), x a(x) + y



¶ 1 1 = − 1, x−1 − a(x) 1 + 1/(a(x) + y) µ ¶ x − 1 + xa(x) a(x) + y = , ∈ ∆− . 1 − xa(x) a(x) + y + 1

Ergodic theory of continued fractions

279

Therefore 

 a(x) + y  τ¯S (x, y) = M (¯ τ∆ (M −1 (x, y))) =  , 1− a(x) + y + 1 1 + x−1+xa(x) 1−xa(x) µ ¶ 1 1 = − (a(x) + 1), . x a(x) + y + 1 − x−1+xa(x) 1−xa(x)

Thus we see that τ¯S (x, y) =

´ ³¯ ¯ ¯x−1 ¯ − f (x, y), (f (x, y) + y sgn x)−1 ,

where f (x, y) = a(x) + 1. ¡ ¢ 2. Let (x, y) ∈ M (∆− ) and τ¯ M −1 (x, y) 6∈ S. Then sgn x = −1 and we have ¶ µ x −1 −1 ,1 − y τ¯S (x, y) = M τ¯M (x, y) = τ¯M (x, y) = τ¯ − x+1 µ µ ¶ ¶ 1 x 1 = − −a − , x/(x + 1) x+1 a(−x/(x + 1)) + 1 − y µ µ ¶ ¶ 1 x 1 = − −1−a − , . x x+1 a(−x/(x + 1)) + 1 + y sgn x Thus we see that τ¯S (x, y) =

³¯ ´ ¯ ¯x−1 ¯ − f (x, y), (f (x, y) + y sgn x)−1 ,

where f (x, y) = a(−x/(x + 1)) + 1. 2 Corollary 4.2.15 We have (i) f (x, y) ∈ N+ for (x, y) ∈ AS , x 6= 0; te0 , see0 ) = (x − e a0 , 0). (ii) e ak+1 = f (e tek , seek ), k ∈ N, with (e

280

Chapter 4

Let AiS , i = 1, 2, be the projections of AS onto the two axes and let λAi S denote the probability measure defined by ¡ ¢ λ A ∩ AiS ¡ ¢ , λAi (A) = S λ AiS

A ∈ BAi , i = 1, 2. S

³ ´ Proposition 4.2.16 Let µ ∈ pr BA1 such that µ ¿ λA1 . For any S S B ∈ BAS such that λA1 ⊗ λA2 (∂B) = 0 we have S

S

¡ e e ¢ lim µ (e tn , sen ) ∈ B = ρS (B),

n→∞

lim 1 n→∞ n

n−1 X

IB (e tek , seek ) = ρS (B) a.e. in A1S .

k=0

Proof. This is the result corresponding to Theorem 4.1.16 and Corollary 4.1.17 for the ergodic dynamical system (AS , BAS , τ¯S , ρ). It is easy to see that the proof of Theorem 4.1.16 for the case of the ergodic dynamical system (I 2 , BI2 , τ¯, γ¯ ) carries over to the present case. 2 Corollary 4.2.17 Consider the approximation coefficients ¯ e ¯ e en = (e Θ qne )2 ¯e a0 + e te0 − peen /e qne ¯ ,

n ∈ N.

For any µ ∈ pr(BA1 ) such that µ ¿ λA1 and any (t1 , t2 ) ∈ I 2 we have S

S

³ ´ e e ≤ t1 , Θ e e ≤ t2 = ρ(B), lim µ Θ n n−1

n→∞

1 card{k : Θ e e ≤ t1 , Θ e e ≤ t2 , 0 ≤ k ≤ n − 1} = ρ(B) lim n k k+1

n→∞

where

µ B =

y |x| (x, y) ∈ AS ; ≤ t1 , ≤ t2 xy + 1 xy + 1

a.e. in A1S ,

¶ .

Proof. The results stated follow from Proposition 4.2.16 on account of equations (4.2.5) and (4.2.6). 2

Ergodic theory of continued fractions

4.3 4.3.1

281

Examples of S-expansions Nakada’s α-expansions

Let Iα = [α−1, α], α ∈ R, so that I1 = I. In this subsection we will consider transformations Nα : Iα → Iα defined by ¥ ¦  −1  |x | − |x−1 | + 1 − α if x 6= 0 Nα (x) =  0 if x = 0 for x ∈ Iα , with α ∈ [1/2, 1]. Any irrational number x ∈ Iα has an infinite SRCF expansion called α-expansion, of the form e1 b1 +

e2

:= [ e1 /b1 , e2 /b2 , · · · ] ,

. b2 + . .

where ¡ ¢ (en , bn ) = (en (x), bn (x)) = e1 (Nαn−1 (x)), b1 (Nαn−1 (x)) , with

¡ ¥ ¦¢ (e1 (x), b1 (x)) = sgn x, |x−1 | + 1 − α ,

n ∈ N+ ,

x ∈ Iα .

Here Nαn denotes the composition of Nα with itself n times while Nα0 is the identity map. The theory of α-expansions can be developed by parallelling that of the RCF expansion. This has been done by Nakada (1981), Nakada et al. (1977), Bosma et al. (1983), and Popescu (2000). Originally, these expansions were defined by McKinney (1907). Our approach here consists in putting any α-expansion in the framework of the S-expansion theory by giving a suitable singularization area Sα , α ∈ [1/2, 1]. This will allow us to retrieve results derived by Nakada and coworkers (op. cit.) using different methods. We should distinguish two cases: (i) α ∈ [1/2, g] and (ii) α ∈ (g, 1]. Case (i). Before giving the singularization areas Sα , α ∈ [1/2, g], we first return to the special case α = 1/2 which yields the NICF expansion. Recall that the NICF expansion of an irrational number can be obtained from its RCF expansion by applying algorithm A from Subsection 4.2.1 to the latter. We noticed in Subsection 4.2.4 that this is equivalent to singularize a`+1 = 1 if and only if (τ ` , s` ) ∈ SA ,

` ∈ N,

282

Chapter 4

where SA = [1/2, 1) × [0, g] . For α ∈ (1/2, g], notice that τ¯ ([1/2, α) × [0, g]) = ((1 − α)/α, 1] × [g, 1] . In particular, for α = g we have (SA \ ([1/2, α) × [0, g])) ∪ ((1 − α)/α, 1] × [g, 1]) = (SA \ ([1/2, g) × [0, g])) ∪ ((g, 1] × [g, 1]) = ([g, 1) × [0, g]) ∪ ((g, 1] × [g, 1]) , ∗ of Hurwitz’s SCF which only slightly differs from the singularizaton area SB expansion, which coincides with the g-expansion. See Remark 2 following Definition 4.2.5. It therefore seems natural to try as singularization areas Sα for α ∈ [1/2, g] the sets

Sα = ([α, g) × [0, g)) ∪ ([g, (1 − α)/α] × [0, g]) (4.3.1) ∪ ((1 − α)/α, 1] × I) . Hence τ¯(Sα ) = ([0, (2α − 1)/(1 − α)) × [1/2, 1]) ∪ ([(2α − 1)/(1 − α), g] × [g, 1]) ∪ ((g, (1 − α)/α] × (g, 1]) . It is easy to check that Sα is indeed a singularization area: obviously, γ¯ (∂Sα ) = 0, Sα ⊂ [1/2, 1] × I, and clearly Sα ∩ τ¯(Sα ) = {(g, g)}. Also, γ¯ (Sα ) = 1 −

log G , log 2

hence Sα is maximal for any α ∈ [1/2, g]. Notice that with M defined as in Subsection 4.2.5 we have M (¯ τ (Sα )) = ([α − 1, g − 1) × [0, 1 − g)) ∪ ([g − 1, (1 − 2α)/α] × [0, 1 − g]) ∪ ((1 − 2α)/α, 0] × [0, 1/2]) .

Ergodic theory of continued fractions

283

Writing Aα for ASα —see again the general case in Subsection 4.2.5—we take Aα =

¡ 2 ¢ I \ (Sα ∪ τ¯(Sα )) ∪ (M (¯ τ (Sα )) \ ({0} × [0, 1/2]))

= ([α − 1, g − 1) × [0, 1 − g)) ∪ ([g − 1, (1 − 2α)/α] × [0, 1 − g]) ∪ (((1 − 2α)/α, 0) × [0, 1/2]) ∪ ([0, (2α − 1)/(1 − α)] × [0, 1/2)) ∪ (((2α − 1)/(1 − α), α) × [0, g)) . If we denote by fα : Aα → R ∪ {∞} the function corresponding to the function f in Theorem 4.2.14, then it easy to see that actually fα maps Aα into N+ and that ¯ −1 ¯ ¯x ¯ − fα (x, y) ∈ [α − 1, α), x ∈ [α − 1, α) \ {0}. Since there exists only one n ∈ N+ such that ¯ −1 ¯ ¯x ¯ − n ∈ [α − 1, α), we deduce that fα (x, y) does not depend on y and that we should have ¯ ¯ fα (x, y) = b¯x−1 ¯ + 1 − αc, (x, y) ∈ Aα , x 6= 0. ¯ ¯ Hence x → ¯x−1 ¯ − fα (x, y) is Nakada’s transformation Nα . On account of Theorem 4.2.14 we can therefore state the main result for the case α ∈ [1/2, g]. Theorem 4.3.1 [Nakada (1981)] Let measure γ¯α on BAα with density 1 1 , log G (xy + 1)2

1 2

≤ α ≤ g. Consider the probability

(x, y) ∈ Aα ,

¯α : Aα → Aα defined by and the transformation N ³ ¡ ¢ ´ ¯α (x, y) = |x−1 | − b|x−1 | + 1 − αc, b|x−1 | + 1 − αc + y sgn x −1 , N ¯α , γ¯α ) is an ergodic dynamical system where (x, y) ∈ Aα . Then (Aα , BAα , N underlying the corresponding α-expansion. Taking projection onto the first axis, we deduce from Theorem 4.3.1 the following result.

284

Chapter 4

Corollary 4.3.2 Let 12 ≤ on BIα with density  1/(x + G + 1)      1 1/(x + 2) × log G      1/(x + G)

α ≤ g. Consider the probability measure µα if x ∈ [α − 1, (1 − 2α)/α], if x ∈ ((1 − 2α)/α, (2α − 1)/(1 − α)), if x ∈ [(2α − 1)/(1 − α), α] .

Then (Iα , BIα , Nα , µα ) is an ergodic dynamical system. Remark. For α = 1/2 we obtain the NICF expansion, and the corresponding result has been derived independently by Rieger (1979) and Rockett (1980). 2 1

g



1α 2

g

Figure 4.2: Sα for

1 2

0

1−α α

1

≤α≤g

From Figure 4.2 it is clear that the vertices (α, g) and ((1 − α)/α, 1) of Sα determine the value of the Legendre constant Lα := LSα . See Theorem 4.2.11. More precisely, we have the following result. Theorem 4.3.3 Let

1 2

≤ α ≤ g. Then

Lα = min(α/(1 + αg), 1 − α).

Remark. Notice that for the values of α ∈ [1/2, g] under consideration we have τ¯([1/2, α) × [0, g)) ⊂ Sα .

Ergodic theory of continued fractions

285

. It follows at once from this and (4.3.1) that BSα = ∅, which is consistent with Proposition 4.2.7. 2 Case (ii). Let α ∈ (g, 1]. Put Sα = [α, 1] × I .

(4.3.2)

Hence τ¯(Sα ) = [0, (1−α)/α]×[1/2, 1], and Sα ∩ τ¯ (Sα ) = ∅ since for α ∈ (g, 1] we have (1 − α)/α < α . It is then easy to check that Sα is indeed a singularization area. However, a simple calculation shows that γ¯ (Sα ) = 1 −

log(1 + α) , log 2

thus for no value of α under consideration here the singularization area Sα is maximal. Next, with M defined as in Subsection 4.2.5 we have M (¯ τ (Sα )) = [α − 1, 0] × [0, 1/2] . Define Aα exactly as in case (i) and denote by fα : Aα → R ∪ {∞} the function corresponding to the function f in Theorem 4.2.14. The expression of Aα is now simpler, namely, Aα = ([α − 1, 0) × [0, 1/2])∪([0, (1 − α)/α] × [0, 1/2))∪(((1 − α)/α, α) × I) , see Figure 4.3. Similarly to case (i) we find that fα (x, y) is independent of y and that in fact we have again fα (x, y) = b|x−1 | + 1 − αc ,

(x, y) ∈ Aα , x 6= 0.

Thus we can state the main result for the case α ∈ (g, 1]. Theorem 4.3.4 [Nakada (1981)] Let g < α ≤ 1. Consider the probability measure γ¯α on BAα with density 1 1 , (x, y) ∈ Aα , log(1 + α) (xy + 1)2 ¯α : Aα → Aα defined as in Theorem 4.3.1. Then and the transformation N ¯α , γ¯α ) is an ergodic dynamical system. (Aα , BAα , N

286

Chapter 4

1

τ¯(Sα )

1/2



M (¯ τ (Sα ))

α−1

1−α α 1/2

0

α

1

Figure 4.3: Sα for g ≤ α ≤ 1 Taking again projection onto the first axis, we deduce from Theorem 4.3.4 the following result. Corollary 4.3.5 Let g < α ≤ 1. Consider the probability measure µα on BIα with density   1/(x + 2) if x ∈ [α − 1, (1 − α)/α], 1 × log(1 + α)  1/(x + 1) if x ∈ ((1 − α)/α, α]. Then (Iα , BIα , Nα , µα ) is an ergodic dynamical system. We conclude the discussion of case (ii) with some results from Kraaikamp (1991). It is obvious that the vertex (α, 1) of Sα determines the value of the Legendre constant Lα := LSα . As min(α/(α + 1), 1/2) = α/(α + 1) in case (ii), we have the following result. See again Theorem 4.2.11. Theorem 4.3.6 Let g < α ≤ 1. Then Lα =

α . α+1

Ergodic theory of continued fractions

287

Next, it is easy to check that τ¯−1 (Sα ) ∩ ([1/2, 1] × I) = [1/2, 1/(1 + α)] × I. Since for our values of α we have (1 − α)/α < 1/(1 + α), we find that the set Bα := BSα from Proposition 4.2.7 is (1/(1 + α), α) × I. Then γ¯α (Bα ) = 2 −

log(2 + α) , log(1 + α)

and we can state the following result. Theorem 4.3.7 Let g < α ≤ 1. For the α-expansion [e e1 /e a1 , ee2 /e a2 , · · · ] = [e1 /b1 , e2 /b2 , · · · ] of irrationals in Iα we have lim

n→∞

log(2 + α) 1 card{k ; e ak = 1, 1 ≤ k ≤ n} = 2 − a.e.. n log(1 + α)

Remarks. 1. The case α = 1 gives the classical result from Proposition 4.1.1. 2. For α ∈ [g, 1] the limit 2 − log(2+α) log(1+α) increases monotonically from 0 3 to 2 − log log 2 = 0.4150 · · · , the asymptotic relative frequency of digit 1 in the RCF expansion. At α = 0.76292 · · · we have already lost half of the original 1’s. 3. It follows from Corollary 4.2.10 that for the α-expansion with α ∈ (g, 1] we have n log 4 1X eek = 3 − lim a.e.. n→∞ n log(1 + α) k=1

2 We conclude this subsection by giving the analogue of Vahlen’s theorem— see Subsection 1.3.2—for α-expansions with α ∈ [1/2, 1]. For the NICF and Hurwitz’ SCF expansions this analogue was independently given by Kurosu (1924) and Sendov (1959/60). Kraaikamp (1990) proved the Kurosu–Sendov ee , Θ e e ) always lies. results by giving a domain in R2 where the point (Θ n n−1 For the two expansions just mentioned, that is for α = 1/2 and α = g, we have e en−1 , Θ e en ) < 2g3 = 0.4721 · · · , min(Θ and the constant 2g3 is best possible.

288

Chapter 4

However, one might ask whether there are values of α for which still smaller values can be obtained for the corresponding approximation coeffiee ee cients √ Θn (α) = Θn , n ∈ N. Beforehand it is clear that a value smaller than 1/ 5 = 0.447 · · · can never be found by a classical result of √ A. Hurwitz [see Perron (1954, p. 49)], according to which for every θ < 1/ 5 there exist irrational numbers x such that the inequality q 2 |x − (p/q)| < θ is verified only for finitely many p/q ∈ Q. The above-mentioned method from Kraaikamp (1990) can easily be adapted for S-expansions. As an example we will mention here the case of αexpansions, for which the first result below is due to Bosma et al. (1983). Theorem 4.3.8 Let α ∈ [1/2, 1]. For any irrational number in Iα and any n ∈ N+ we have e en < c(α) Θ and

ee , Θ e e ) < V (α) , min(Θ n−1 n

where the functions c, V : [1/2, 1] → R are defined by µ ¶ 1−α 1 c(α) = max G , α , ≤ α ≤ 1, gα + 1 2 and

 Ã !  g   max , 4α − 2    1 + gα  V (α) =

if

1 2

≤ α ≤ g,

à !    α 2(1 − α)   , 2 if g ≤ α ≤ 1.   max α+1 α +1

The bounds c(α) and V (α) are best possible. For the proof see Kraaikamp (1991). Remark. A simple calculation yields minα c(α) = c(α0 ) = α0 , with µ ¶ q √ √ 1 α0 = −2 − 5 + 6 5 + 15 = 0.5473 · · · . 2 Moreover, we √ have minα V (α) = V (α1 ) = 0.4484 · · · , a constant slightly larger than 1/ 5, where √ 1 − 3g + 10 − 11g α1 = = 0.6121 · · · < g. 4g2 2

Ergodic theory of continued fractions

4.3.2

289

Minkowski’s diagonal continued fraction expansion

Let x ∈ R such that both x and 2x 6∈ Z. Consider the sequence σ of all irreducible fractions p/q ∈ Q with q ∈ N+ satisfying ¯ ¯ ¯ ¯ ¯x − p ¯ < 1 , ¯ q¯ 2q 2 ordered in such a way that their denominators form an increasing sequence. It can be shown [see, e.g., Perron (1954, §45)] that there exists a unique SRCF expansion whose sequence of convergents coincides with σ. Legendre’s theorem (see Corollary 1.2.4) implies that we take precisely those RCF convergents for which Θn < 1/2. By (4.2.5) this SRCF expansion—which is called Minkowski’s diagonal continued fraction (DCF ) expansion—is an S-expansion with singularization area ½ ¾ x 1 2 S = SDCF := (x, y) ∈ I : ≥ . xy + 1 2 Since min(Θn , Θn+1 ) < 1/2—cf. Subsection 1.3.2—the DCF expansion picks at least one out of two consecutive RCF convergents. Since γ¯ (SDCF ) = 1 −

1 , 2 log 2

the singularization area SDCF is not maximal. Also, by Theorem 4.2.8 we have nSDCF (k) = 2 log 2 = 1.3862 · · · a.e.. lim k→∞ k It can be shown [cf. Kraaikamp (1989, p. 210)] that the DCF expansion of any ω ∈ Ω can be obtained from its RCF expansion [a1 , a2 , · · · ] by singularizing any digit ak+1 (ω) = 1 if and only if one of the following four conditions is fulfilled: (i) k = 0, that is, a1 = 1; (ii) ak , ak+2 6= 1, k ∈ N+ ; (iii) ak 6= 1, ak+2 = 1, and [ak+3 , ak+4 , · · · ] > [ak −1, · · · , a1 ], k ∈ N+ , with the convention that the value of [ak − 1, · · · , a1 ] for k = 1 is [a1 − 1]; (iv) ak = 1, ak+2 6= 1, and [ak−1 , · · · , a1 ] > [ak+2 − 1, ak+3 , · · · ], k ≥ 2.

290

Chapter 4

It is also interesting to note that the DCF expansion of a quadratic irrationality is periodic. The general theory developed in Subsections 4.2.4 and 4.2.5 allows us to state the following results. For detailed proofs the reader is referred to Kraaikamp (op. cit.). With the notation in Subsection 4.2.5, for the DCF expansion case we have µ ¶ x 1 y 1 2 ∆+ = (x, y) ∈ R : < , < , ++ DCF xy + 1 2 xy + 1 2 µ ¶ (x + 1)(1 − y) 1 1 2 M (¯ τ (SDCF )) = (x, y) ∈ R : ≤ , − ≤ x ≤ 0, y ≥ 0 , xy + 1 2 2 ADCF := ASDCF = ∆+ τ (SDCF )) , DCF ∪ M (¯ see also Figure 4.4. 1

.... ..... ..... ............ . . . . . ......... ......... ...... ... ...... ... ...... ..... . . . . . . . ...... ... ....... ... ....... ... ....... ... ....... . . . . . . . . . . ........ ... ......... ... ......... ... ......... ......... .. . . . . . . . . . . . ... ........... ... ........... ... ............. .. ............. .. ........... . . . . . . . . . . ....... ... ........ ... ....... .. ...... .. ..... . . . . . . .. .... ... .... ... ... ... ... .. ... . . ... ... ... ... ... ... .. .. . . ... ... ... ... ... ... ... ...

τ¯(SDCF )

1/2

SDCF

M (¯ τ (SDCF ))

−1/2

0

1/2

1

Figure 4.4: SDCF Furthermore, writing fDCF for fSDCF and τ¯DCF for τ¯SDCF we have % $ ¯ −1 ¯ ¯x ¯c + y sgn x − 1 ¯ −1 ¯ b , fDCF (x, y) = ¯x ¯ + 2(b|x−1 | + y sgn x) − 1 and τ¯DCF (x, y) =

³¯ ´ ¯ ¯x−1 ¯ − fDCF (x, y), (fDCF (x, y) + y sgnx)−1

Ergodic theory of continued fractions

291

for (x, y) ∈ ADCF . Proposition 4.3.9 Let ρDCF be the probability measure on BSDCF with density 2 , (x, y) ∈ ADCF . (xy + 1)2 Then (ADCF , BSDCF , τ¯DCF , ρDCF ) is an ergodic dynamical system which underlies the DCF expansion. ¡ ¢ Proposition 4.3.10 For any µ ∈ pr B[−1/2,1] such that µ ¿ λ and any (t1 , t2 ) ∈ I 2 we have ³ ´ e e ≤ t1 , Θ e e ≤ t2 = H(t1 , t2 ). lim µ Θ n−1 n n→∞

Here H is the distribution function with density d1 + d2 , where 2IB (x, y) , d1 (x, y) = √ 1 1 − 4xy

2IB (x, y) d2 (x, y) = √ 2 , 1 + 4xy

with B1 = [0, 1/2] × [0, 1/2], ¡ ¢ B2 = B1 ∩ (x, y) ∈ E1 : 0 ≤ (x − y)2 + x + y ≤ 3/4 . The result above can be also stated in an equivalent form concerning the existence for any (t1 , t2 ) ∈ I 2 of the limit a.e. equal to H(t1 , t2 ) of 1 e e ≤ t1 , Θ e e ≤ t2 , 0 ≤ k ≤ n − 1} card{k : Θ k k+1 n as n → ∞. It then follows, e.g., that n−1

1 1 X ee lim Θk = n→∞ n 4

a.e..

k=0

We also note the following results. Proposition 4.3.11 An RCF digit ak+1 equal to 1 does not disappear in the DCF expansion if and only if µ ¶ 1 − 2x 2x − 1 1 k (τ , sk ) ∈ B = (x, y) ∈ ADCF : y < ,y> ,y< 3x − 2 x 2−x

292

Chapter 4

whatever k ∈ N. Note that γ¯ (B) is equal to ÃZ ! Z 1/(2−t) Z 2−√2 Z 1/(2−t) 1 1 du du dt − dt 2 2 log 2 1/2 (2t−1)/t (tu + 1) 1/2 (2t−1)/(2−3t) (tu + 1) 1 = log 2

µ ¶ √ √ 1 log( 2 − 1) + 2 − = 0.0473 · · · . 2

Corollary 4.3.12 Let [e ae0 ; e ae1 , e ae2 , · · · ] be the DCF expansion of an irrational number. Then lim

n→∞

1 card{k : e aek = 1, 1 ≤ k ≤ n} n = ρDCF (B) =

γ¯ (B) 1 − γ¯ (SDCF )

µ ¶ √ √ 1 = 2 log( 2 − 1) + 2 − = 0.0656 · · · 2

a.e..

This asymptotic relative frequency (6.56 · · · %) should be compared with 3 the asymptotic relative frequency of digit 1 in the RCF expansion (2− log log 2 = 41.50 · · · %). See Proposition 4.1.1 and Subsection 4.1.2.

4.3.3

Bosma’s optimal continued fraction expansion

A remarkable geometrical interpretation of the RCF expansion of an irrational number was given by Klein (1895). The idea behind it is to represent any irreducible p/q ∈ Q ∩ I by an integer-valued vector in R2+ , namely, by the point (q, p) ∈ R2+ , and to represent an irrational number ω ∈ Ω by a half-line L with slope ω. The approximation of ω by its RCF convergents amounts to systematically finding integer-valued vectors close to L. More precisely, starting from V−1 = (0, 1) and V1 = (1, 0) we define Vn recursively by Vn = an Vn−1 + Vn−2 , n ∈ N+ , where an ∈ N+ is maximal with respect to the property that Vn is on the same side of L as Vn−2 . It then appears that the positive integers a1 , a2 , · · · are in fact the RCF digits of ω, that is, ω = [a1 , a2 , · · · ].

Ergodic theory of continued fractions

293

Bosma (1987) gave a similar interpretation of α-expansions and, inspired by this, presented a very interesting SRCF expansion formally defined as follows. pee−1

Definition 4.3.13 Let −1/2 < x < 1/2. Put e ae0 = 0, e te0 = x, ee1 = sgn e te0 , e e e e = 1, qe−1 = 0, pe0 = 0, qe0 = 1, se0 = 0, and define recursively   ¯ e ¯−1   e e ¯ ¯ e ¯ e ¯−1  b tk c + eek+1 sek e ¯ ¯  , e ´ e ak+1 = tk + ³¯ ¯−1 e e e ¯ ¯ e 2 tk c + eek+1 sek + 1 ¯ e ¯−1 e tek+1 = ¯e tk ¯ − e aek+1 ,

eek+2 = sgn e tek+1 ,

e e peek+1 = e aek+1 peek + eek+1 peek−1 , qek+1 =e aek+1 qeke + eek+1 qek−1 , e seek+1 = qeke /e qk+1 , k ∈ N.

The optimal continued fraction (OCF ) expansion of x, denoted OCF(x), is the SRCF expansion [e e1 /e ae1 , ee2 /e ae2 , · · · ]. For an irrational x ∈ R such that 2x 6∈ Z, OCF(x)=[e ae0 ; ee1 /e ae1 , ee2 /e ae2 , · · · ] is defined as e ae0 + [e e1 /e ae1 , ee2 /e ae2 , · · · ], e e e e where e a0 ∈ Z is such that −1/2 < x − e a0 < 1/2, and [e e1 /e a1 , ee2 /e a2 , · · · ] = OCF(x − e ae0 ). is,

It is not difficult to see that the e tek and seek have the usual meaning, that e tek = [e ek+1 /e aek+1 , · · · ],

seek

k ∈ N,

 0 if k = 0,      1/e ae1 if k = 1, =      [1/e aek , eek /e aek−1 , · · · , ee2 /e ae1 ] if k ≥ 2

and peek /e qke , k ∈ N, are the OCF convergents of x. qke )k∈N is a subsequence of Next, the sequence of OCF convergents (e pek /e the sequence (pn /qn )n∈N of RCF convergents. If we define n(k) in such a qke = pn(k) /qn(k) , k ∈ N+ , then way that peek /e   n(k) + 1 if eek+2 = 1, n(k + 1) =  n(k) + 2 if eek+2 = −1

294

Chapter 4

with

  0 if x > 0, n(0) =



1 if x < 0.

Finally, it appears that the OCF expansion gives approximation coeffie en = (e cients Θ qne )2 |x − (e pen /e qne )| < 1/2 for any n ∈ N and, at the same time, it is a fastest expansion. Fastest SRCF expansions for which all convergents are RCF convergents can be defined as those in which always the maximal number of RCF convergents is skipped, meaning that whenever a 1-block of length m ∈ N+ occurs in the RCF expansion, exactly b(m + 1)/2c out of the m 1’s are skipped. (Note that this implies that for fastest SRCF expansions only a choice is left in deciding which RCF convergents will be skipped when m is even.) A still more precise definition of ‘fastest’ is as follows. Writing nα (k) := nSα (k), k ∈ N+ , α ∈ [1/2, 1], by Theorem 4.2.8 we have a.e.  log 2    log G = 1.44092 · · ·

nα (k) = lim  k→∞ k  

log 2 log(α + 1)

if 1/2 ≤ α ≤ g, if g < α ≤ 1.

Then an (arbitrary) SRCF expansion is said to be fastest if and only if nSRCF (k) = n1/2 (k) for infinitely many k ∈ N+ . Here the non-decreasing function nSRCF : N+ → N+ is defined by qnSRCF (k) ≤ qeke < qnSRCF (k)+1 ,

k ∈ N+ ,

where the qi and qeie , i ∈ N+ , are associated with the RCF expansion and the SRCF expansion considered, respectively. Cf. Bosma (1987, p. 364). The next result [cf. Bosma and Kraaikamp (1990)] places OCF expansions in the context of the S-expansion theory. More precisely, it shows how singularizing appropriately the RCF expansion yields the OCF expansion. (Note that it is for this reason that we have anticipated notation by denoting the OCF expansion as an S-expansion.) Lemma 4.3.14 Let ω ∈ Ω have RCF expansion [a1 , a2 , · · · ], RCF convergents pn /qn , and RCF approximation coefficients Θn , n ∈ N. Consider the set µ µ ¶¶ 2x − 1 2 SOCF = (x, y) ∈ I ; y < min x, . 1−x Then for any n ∈ N+ the following three assertions are equivalent:

Ergodic theory of continued fractions

295

(i) pn /qn is not an OCF convergent of ω; (ii) an+1 = 1 , Θn−1 < Θn and Θn > Θn+1 ; (iii) (τ n , sn ) ∈ SOCF . Proof. For the proof of the equivalence of (i) and (ii) we refer the reader to Corollary (4.20) of Bosma (1987). Here we show that (ii) and (iii) are equivalent. Since sn τn Θn−1 = , Θ = , n ∈ N+ , (4.3.2) n sn τ n + 1 sn τ n + 1 we have

|qn ω − pn | Θn qn−1 = = τ n < 1, |qn−1 ω − pn−1 | Θn−1 qn

ω ∈ Ω.

(4.3.3)

Also Θn−1 < Θn

if and only if

τ n > sn .

(4.3.4)

Furthermore, if an+1 = 1 then pn+1 = pn + pn−1 and qn+1 = qn + qn−1 , and by (4.3.3) we have Θn+1 = qn+1 |qn+1 ω − pn+1 | = (qn + qn−1 )|(qn + qn−1 )ω − (pn + pn−1 )| = (qn + qn−1 )|(qn−1 ω − pn−1 ) + (qn ω − pn )| = (qn + qn−1 )(|qn−1 ω − pn−1 | − |qn ω − pn |) since qn ω − pn and qn−1 ω − pn−1 have different signs, as shown by equation (1.1.18). Thus µ ¶ µ ¶ qn qn−1 Θn+1 = Θn−1 1 + − Θn 1 + . qn−1 qn It follows from (4.3.3) that an+1 = 1 and Θn+1 < Θn

if and only if

sn <

2τ n − 1 . 1 − τn

(4.3.5)

Combining (4.3.4) and (4.3.5) with the definition of SOCF completes the proof. 2

296

Chapter 4 Remarks. 1. It is easy to check that γ¯ (SOCF ) = 1 −

log G , log 2

so SOCF is a maximal singularization area. See Figure 4.5. Notice that SOCF contains SDCF , hence any sequence of OCF convergents is a subsequence of the corresponding sequence of DCF convergents. Since τ¯ (SOCF ) ⊂ I 2 \SOCF , the set BSOCF of the OCF preservation area of 1’s is empty. Hence any OCF incomplete quotient (or digit) is greater than or equal to 2. 2. It now appears that the function n : N+ → N+ considered above is in fact nSOCF . It then follows from Theorem 4.2.8 that n(k) log 2 = = 1.4404 · · · a.e.. k→∞ k log G lim

2 As in the DCF expansion case, the general theory developed in Subsections 4.2.4 and 4.2.5 allows us to state the following results. For detailed proofs the reader is referred to Bosma and Kraaikamp (1990, 1991). With the notation in Subsection 4.2.5, for the OCF case we have µ µ ¶¶ 2x − 1 2 2 ∆OCF = I \ SOCF = (x, y) ∈ I : y ≥ min x, , 1−x ¡ ¢ ∆− ¯ (SOCF ) = (x, y) ∈ I 2 : (y, x) ∈ SOCF , OCF = τ that is, reflecting SOCF in the diagonal y = x yields ∆− OCF , and AOCF := ASOCF = M (∆OCF ) =

µ µ ¶ 2x + 1 x + 1 (x, y) ∈ (−1/2, g) × [0, g] : y ≤ min , x+1 x+2 µ ¶¶ 2x − 1 and y ≥ max 0, , 1−x

see Figure 4.5. Furthermore, writing fOCF for fSOCF and τ¯OCF for τ¯SOCF we have º ¹ ¯ −1 ¯ b|x−1 |c + y sgn x ¯ ¯ , fOCF (x, y) = x + 2(b|x−1 |c + y sgn x) + 1 ³¯ ´ ¯ ¯x−1 ¯ − fOCF (x, y), (fOCF (x, y) + y sgn x)−1 τ¯OCF (x, y) =

Ergodic theory of continued fractions

297

1

..... ..... ..... ..... .... . . . . .... ..... ..... ..... ..... . . . . ..... .... ..... ..... ..... . . . . . ..... .... .................... .................... .. .................. . . . . . . . . . . . . . . . . . . ...... ............... ... .............. ... ............. .. ............ .. ........... . . . . . . . . . . . . .. ........... .. .......... .. ......... .. ... .. .. . . . .. ... .. ... .. ... .. ... .... . .. .. .. ... .. ... ... .. . . . ... ... ... ... .. .. ... ...

τ¯(SOCF )

1/2

SOCF

M (¯ τ (SOCF ))

−1/2

0

1/2 g

1

Figure 4.5: SOCF for (x, y) ∈ AOCF . Theorem 4.3.15 Let ρOCF be the probability measure on BAOCF with density 1 1 , (x, y) ∈ AOCF . log G (xy + 1)2 Then (AOCF , BAOCF , τ¯OCF , ρOCF ) is an ergodic dynamical system which underlies the OCF expansion. Remark. For both DCF and OCF expansions the two-dimensional sets ADCF and AOCF have curved boundaries. This implies that the functions fDCF and fOCF depend on both their arguments x and y, and not only on x as in the case of α-expansions, α ∈ [1/2, 1]. As a result, no one-dimensional ergodic dynamical system exists for either DCF or OCF expansion. 2 ¡ ¢ Proposition 4.3.16 For any µ ∈ pr B[−1/2,g] such that µ ¿ λ and any (t1 , t2 ) ∈ I 2 we have ³ ´ e en−1 ≤ t1 , Θ e en ≤ t2 = H(t1 , t2 ). lim µ Θ n→∞

Here H is the distribution function with density µ ¶   1 1 1  +p if (x, y) ∈ Π,  log G p 1 − 4xy 1 + 4xy    0 elsewhere,

298

Chapter 4

¡ ¢ where Π = (x, y) ∈ R2++ : 4x2 + y 2 < 1, x2 + 4y 2 < 1 . The result above can be also stated in an equivalent form concerning the existence for any (t1 , t2 ) ∈ I 2 of the limit a.e. equal to H(t1 , t2 ) of 1 e e ≤ t1 , Θ e e ≤ t2 , 0 ≤ k ≤ n − 1} card{k : Θ k k+1 n as n → ∞. It then follows, e.g., that n

arctan 12 1 X ee lim Θk = = 0.24087 · · · n→∞ n 4 log G

a.e..

(4.3.6)

k=1

Other consequences are that for any irrational number we have e en < 1/2, n ∈ N+ ; (i) 0 < Θ √ √ ee + Θ e en < 2/ 5, hence min (Θ ee , Θ e en ) < 1/ 5, n ∈ N+ . (ii) 0 < Θ n−1 n−1 √ In connection with (ii) above, it should be noted that the constant 1/ 5 in the second inequality is ‘best possible’ by A. Hurwitz’s result mentioned just before Theorem 4.3.8. Remark. The a.e. asymptotic arithmetic mean (4.3.6) should be compared with the corresponding values 1 = 0.36067 · · · 4 log 2 1 = 0.25 4 √ 5−2 = 0.24528 · · · 2 log G √ 8G + 6 − 2G − 1 = 0.24195 · · · log G

for the RCF expansion,

for the DCF expansion,

for the NICF and SCF expansions,

for the α0 -expansion,

where α0 = 0.55821 · · · . See Corollary 4.1.23 and Proposition 4.3.10 for the first two values, and Bosma et al. (1983) for the last two ones. Note how close the value in (4.3.6) is to 1 − γ¯ (SOCF ) =

log G = 0.24061 · · · . 2

Ergodic theory of continued fractions

299

The latter gives an a priori bound for the a.e. asymptotic arithmetic mean of the approximation coefficients. It can be shown that the value in (4.3.6) is in fact ‘the best one can get’ for any irrational number. More precisely, we have the following result. Theorem 4.3.17 [Bosma and Kraaikamp (1991)] Whatever the SRCF expansion with convergents pen /qne and approximation coefficients Θen , n ∈ N, we have m n 1 X ee 1 X e Θ k , n ∈ N+ , Θk ≥ m n k=1

k=1

e for any irrational number, where m = card{k : qke < qen+1 , k ∈ N+ } and e e e n , n ∈ N+ , are associated with the OCF expansion. qen and Θ

4.4 4.4.1

Continued fraction expansions with σ-finite, infinite invariant measure The insertion process

We have seen in previous subsections how the concept of singularization leads to a class of SRCF expansions for which the underlying ergodic theory can be developed. The idea of adding a convergent instead of removing one (as singularization does) leads to the concept of insertion, to some extent the opposite of that of singularization. Now, the fundamental identity is a+

1 = a+1 + b+x

−1 1+

1 b−1+x

,

where a ∈ Z, b ∈ N+ , b > 1, x ∈ [0, 1). Let (cf. Subsection 4.2.2) (ek )k∈M ,

(ak )k∈{0}∪M

(4.4.1)

be a (finite or infinite) CF with a`+1 > 1, e`+1 = 1 for some ` ∈ N for which ` + 1 ∈ M . The transformation ι` which takes (4.4.1) into the CF (e ek )k∈M f,

(e ak )k∈{0}∪M f,

(4.4.2)

f = M if M = N+ and M f = {k : 1 ≤ k ≤ n + 1} if M = where M {k : 1 ≤ k ≤ n}, n ∈ N+ , with eek = ek , k ∈ M , k ≤ `, ee`+1 = −1,

300

Chapter 4

ee`+2 = 1, eek = ek−1 , k ∈ M , k ≥ ` + 3, e ak = ak , k ∈ {0} ∪ M , k ≤ ` − 1, e a` = a` + 1, e a`+1 = 1, e a`+2 = a`+1 − 1, e ak = ak−1 , k ≥ ` + 3, is called an insertion of the pair (1, −1) before a`+1 , e`+1 . Let (pek /qke )k∈{0}∪M and (e pek /e qke )k∈{0}∪M f be the sets associated with (4.4.1) and (4.4.2), respectively. The result corresponding to Proposition 4.2.4 can be stated as follows. Proposition 4.4.1 Let ` ∈ N such that ` + 1 ∈ M . The set of convergents (e pek /e qke )k∈{0}∪M f resulting after the insertion ι` of the pair (1, −1) before a`+1 (> 1), e`+1 e ) in the set (= 1), is obtained by inserting the term (pe` + pe`−1 )/(q`e + q`−1 e e e e e = (pk /qk )k∈{0}∪M before the convergent p` /q` . As usual, here pe−1 = 1, q−1 0. The proof is similar to that of Proposition 4.2.4 by using appropriate matrix identities. 2 Starting from the RCF expansion, by appropriate insertions we can obtain many classical SRCF expansions, and also continued fraction algorithms which are not SRCF expansions. Amongst the former we mention the Lehner continued fraction (LCF) expansion, and amongst the latter the Farey continued fraction (FCF) expansion. Both these expansions will be studied in the next subsection. In particular, we can obtain this way the OddCF and EvenCF expansions —see the examples of SRCF expansions at the end of Subsection 4.2.2—as well as the backward continued fraction (BCF) expansion that we will study in Subsection 4.4.3.

4.4.2

The Lehner and Farey continued fraction expansions

Lehner (1994) showed that any number x ∈ [1, 2) has a unique infinite SRCF expansion of the form e1 b0 + := [ b0 ; e1 /b1 , e2 /b2 , · · · ] , (4.4.3) e2 b1 + . b2 + . . where (bn , en+1 ) is equal to either (1, 1) or (2, −1), n ∈ N. We shall call this expansion the Lehner continued fraction (LCF ) expansion. Dajani and Kraaikamp (2000) called it the Lehner fraction or the Lehner expansion, and showed that if we define the transformation L : [1, 2) → [1, 2) by L(x) =

e(x) , x − b(x)

x ∈ [1, 2),

Ergodic theory of continued fractions where (b(x), e(x)) =

301

  (2, −1) if 1 ≤ x < 32 , 

(1, 1)

if

3 2

≤ x < 2,

then (bn (x), en+1 (x)) = (b(Ln (x)), e(Ln (x))) ,

x ∈ [1, 2),

for any n ∈ N. Here Ln , n ∈ N+ , denotes the composition of L with itself n times while L0 is the identity map. Denoting as usual the RCF convergents of a real number x = [a0 ; a1 , a2 , · · · ] by (pn /qn )n∈N and defining the mediant convergents of x by kpn + pn−1 , kqn + qn−1

1 ≤ k < an+1 , n = 1, 2, · · ·

(so that if an+1 = 1 then there is no mediant convergent), we will see that the set of LCF convergents of x is the union of the sets of RCF and mediant convergents of x. It is for this reason that the LCF expansion was called the mother of all SRCF expansions in Dajani and Kraaikamp (op. cit.). Proposition 4.4.2 Let x ∈ [1, 2) \ Q, with RCF expansion [ 1; a1 , a2 , · · · ]. Then the LCF expansion (4.4.3) of x is given by the following algorithm. (i) Let n be the smallest m ∈ N for which am+1 > 1. If n = 0, that is, a1 > 1 then we replace [1; a1 , a2 · · · ] by [ 2; −1/2, · · · , −1/2, −1/1, 1/1, 1/a2 , · · · ] . | {z } (a1 −2) times

If n ≥ 1 then we replace [ 1; 1, · · · , 1, an+1 , · · · ] by ιn+an+1 −1 ( · · · (ιn+1 (ιn ([ 1; 1, · · · , 1, an+1 , · · · ])) · · · ) = [ 1; 1/1, · · · , 1/1, , 1/2, −1/2, · · · , −1/2, −1/1, 1/1, 1/an+2 , · · · ] , | {z } | {z } (n−1) times

(an+1 −2) times

where ιn is defined as in Subsection 4.4.1. Denote the SRCF expansion of x thus obtained by

302

Chapter 4

[ b00 ; e01 /b01 , e02 /b02 , · · · ].

(4.4.4)

(ii) Let n0 > n be the smallest integer m0 > n for which e0m0 +1 = 1 and b0m0 +1 > 1. Apply to (4.4.4) the procedure from (i) to b0n0 +1 . The proof is easy and left to the reader.

2

Remark. It follows from the very insertion mechanism that any RCF or mediant convergent is an LCF convergent. Conversely, the sequence of LCF convergents is obtained after all mediant convergents have been inserted into the sequence of RCF convergents. Another immediate consequence is that the LCF expansion of a quadratic irrationality is (eventually) periodic. 2 Note that the transformation L [which is implicit in Lehner (1994)] is isomorphic to the transformation I : [0, 1) → [0, 1) defined by  x  if 0 ≤ x < 1/2,   1−x I(x) =    1 − x if 1/2 ≤ x < 1, x which was used by Ito (1989) to generate the RCF and mediant convergents of any x ∈ [0, 1). More precisely, we have L(x) = I(x − 1) + 1,

x ∈ [1, 2).

1 , I (h(x − 1))

x ∈ [1, 2),

We also have L(x) =

where the bijective function h : [0, 1) → [1/3, 2/3) is defined by  1    2−x h(x) =    x x+1

if 0 ≤ x < 1/2, if 1/2 ≤ x < 1.

Ito (op. cit.) showed that I is ν-preserving, where ν is the ¡ σ-finite, infinite ¢ −1 measure on B[0,1) with density x , x ∈ (0, 1), and that [0, 1), B[0,1) , I, ν is an ergodic dynamical system. This implies that L is µ-preserving, where µ is the σ-finite, infinite measure on B[1,2) with density (x − 1)−1 , x ∈ (1, 2), ¡ ¢ and that [1, 2), B[1,2) , L, µ , is an ergodic dynamical system underlying the LCF expansion.

Ergodic theory of continued fractions

303

We will now exhibit the relationship between the LCF expansion and an algorithm yielding the so called Farey continued fraction (FCF ) expansion. The latter is an infinite CF expansion of any x ∈ [−1, 0) ∪ (0, ∞) of the form f1 d1 +

:= [ f1 /d1 , f2 /d2 , · · · ] ,

f2

(4.4.5)

. d2 + . .

where (dn , fn ) is equal to either (1, 1) or (2, −1), n ∈ N+ . Formally, as shown by Dajani and Kraaikamp (op. cit.), if we define the transformation F : [−1, ∞) → [−1, ∞) by    f (x) x − d(x) if x 6= 0, F(x) =   0 if x = 0, where

  (2, −1) if − 1 ≤ x < 0, (d(x), f (x)) =



(1, 1)

if x ≥ 0,

then ¡ ¢ (dn (x), fn (x)) = d(Fn−1 (x)), f (Fn−1 (x)) ,

x ∈ [−1, ∞),

for any n ∈ N+ . Here Fn , n ∈ N+ , denotes the composition of F with itself n times while F0 is the identity map. By its very definition the FCF expansion is not an SRCF expansion since the condition fn+1 + dn ≥ 1, n ∈ N+ , is violated. ¯ : D → D by Put D = [1, 2) × [−1, ∞), and define the transformation L µ ¶ e(x) ¯ y) = L(x), L(x, , (x, y) ∈ D. b(x) + y ¯ is a one-to-one transformation of D0 := [1, 2) × It is easy to check that L ([−1, 0) ∪ (0, ∞)) with inverse ¶ µ f (y) −1 ¯ + d(y), F(y) , (x, y) ∈ D0 . L (x, y) = x Also, for any n ≥ 2 we have ¯ n (x, y) = (Ln (x), [en (x)/bn−1 (x), · · · , e2 (x)/b1 (x), e1 (x)/(b0 (x) + y)]) L

304

Chapter 4

whatever (x, y) ∈ D, and ¯ −n (x, y) = ([dn (y); fn (y)/dn−1 (y), · · · , f2 (y)/d1 (y), f1 (y)/x], Fn (y)) L whatever (x, y) ∈ D0 . Remark. It is interesting to compare the last two equations above with (1.3.10 ) and (1.3.20 ). This might suggests developments similar to those in Section 1.3. 2 ¡ ¢ ¯ µ Theorem 4.4.3 The quadruple D, BD , L, ¯ is an ergodic dynamical system which is a natural extension of the dynamical system ¡ ¢ [1, 2), B[1,2) , L, µ . Here µ ¯ is the σ-finite, infinite measure on BD with density (x+y)−2 , (x, y) ∈ D = [1, 2) × [−1, ∞). Proof. Let π1 : [1, 2) × [−1, ∞) → [1, 2) denote the projection onto the first axis. Cf. Remark 1 after Proposition 4.0.5. Then it is easy to check ¯ = L ◦ π1 , and that that π1 ◦ L ¡ ¢ µ ¯ π1−1 (A) = µ(A),

A ∈ B[1,2) .

¯ is µ We should next show that L ¯-preserving and, finally, that the σ-algebra generated by [ ¡ ¢ ¯ n π −1 B[1,2) L 1 n∈N

coincides with BD . We leave the details to the reader, who can find them in Dajani and Kraaikamp (op. cit.). 2 Let us denote by φ the σ-finite, infinite measure on B[−1,∞) with density (x + 1)−1 − (x + 2)−1 , x ∈ (−1, ∞). It is easy to check that F is φ-preserving. Theorem 4.4.4 The map ξ : [−1, 0) ∪ (0, ∞) → [1, 2) defined by ξ(x) = [ d1 ; f1 /d2 , f2 /d3 , · · · ] , if x ∈ [−1, 0) ∪ (0, ∞) has FCF expansion x = [ f1 /d1 , f2 /d2 , · · · ] ¡ ¢ ¡ ¢ is an isomorphism from [−1, ∞), B[−1,∞) , F, φ to [1, 2), B[1,2) , L, µ .

Ergodic theory of continued fractions

305

Proof. It is clear that ξ is bijective. Since L (ξ(x)) = L ([ d1 ; f1 /d2 , f2 /d3 , · · · ]) = [ d2 ; f2 /d3 , f3 /d4 , · · · ] = ξ ([ f2 /d2 , f3 /d3 , · · · ]) = ξ (F(x)) , ¡ ¢ we only need to show that ξ is measurable and that µ(A) = φ ξ −1 (A) for any A ∈ B[1,2) . Whilst measurability is obvious, the equation above can be easily checked. The details can be found in Dajani and Kraaikamp (op. cit.). 2 An immediate consequence of Theorems 4.4.3 and 4.4.4 is that ¡ ¢ [−1, ∞), B[−1,∞) , F, φ is an ergodic dynamical system underlying the FCF expansion. Remark. Corollary 4.1.10 in conjunction with the insertion concept pro-¢ ¡ vides a heuristic argument why the dynamical system [1, 2), B[1,2) , L, µ should be ergodic, where L is µ-preserving for a σ-finite, infinite measure µ. After all, an insertion before a digit > 1 is simply building a tower over the RCF cylinder corresponding to that digit. Since the LCF expansion is obtained by using insertion as many times as possible in order to ‘shrink away’ any RCF digit > 1, it follows that the system thus obtained should be ergodic (it includes the RCF dynamical system as an induced system), but by Corollary 4.1.10 it should have infinite mass. 2 The next result corresponds to Proposition 4.1.8 for the values p = −1, 0, 1 there. Theorem 4.4.5 Let x ∈ [1, 2) \ Q with LCF expansion [ b0 ; e1 /b1 , e2 /b2 , · · · ]. Then

n 1 1 + ··· + b1 bn √ lim n b1 · · · bn lim

n→∞

n→∞

lim

n→∞

b1 + · · · + bn n

= 2

a.e.,

= 2

a.e.,

= 2

a.e.

306

Chapter 4

Proof. Let [1; a1 , a2 , · · · ] be the RCF expansion of x. For any given sufficiently large m ∈ N+ there (uniquely) exist integers k ∈ N+ and j ∈ N such that m = a1 + · · · + ak + j , 0 ≤ j < ak+1 . By Proposition 4.4.2 the LCF expansion is obtained by replacing any RCF digit ` by a block of LCF digits of length ` consisting of (` − 1) 2’s followed by one 1. Then k

1 1 1X j m+k + ··· + = k+ (ai − 1) + = . b1 bm 2 2 2 i=1

This implies that m 1 1 + ··· + b1 bm

=



1

k 1+ m 2

´.

Since 0 ≤ j < ak+1 , we have k ≤ m

1 k

1 Pk

i=1 ai

,

which converges a.e. to 0 by Corollary 4.1.10. Hence m

lim

m→∞

1 1 + ··· + b1 bm

= 2.

Since any bn , n ∈ N+ , is equal to either 1 or 2, recalling the classical inequalities m 1 1 + ··· + b1 bm



p b1 + · · · + bm b1 · · · bm ≤ (≤ 2) , m

m

the result follows.

2

Corollary 4.4.6 Let x ∈ [−1, ∞) \ Q, with FCF expansion [ f1 /d1 , f2 /d2 , · · · ].

Ergodic theory of continued fractions Then

n 1 1 + ··· + d1 dn √ lim n d1 · · · dn

lim

n→∞

n→∞

lim

n→∞

d1 + · · · + dn n

307

= 2

a.e.,

= 2

a.e.,

= 2

a.e..

The proof follows from Theorems 4.4.4 and 4.4.5.

4.4.3

2

The backward continued fraction expansion

Until now we have used only the insertion mechanism in this section. As an example of combining singularization and insertion we discuss here the backward continued fraction (BCF ) expansion. Any irrational number ω ∈ I has an infinite CF expansion of the form 1

1− c1 −

1

:= [ 1; −1/c1 , −1/c2 , · · · ] ,

(4.4.6)

. c2 + . .

where 2 ≤ cn = cn (ω) ∈ N+ , so that (4.4.6) is an SRCF expansion. There is a transformation β : I → I naturally associated with the RCF transformation τ , which is defined by ¥ ¦   (x − 1)−1 − (x − 1)−1 if x ∈ [0, 1), β(x) =  0 if x = 1. The graph of β can be obtained from that of τ by reflecting the latter in the line x = 1/2. It is for this reason that (4.4.6) has been called ‘backward’. Note also that β(x) = −N0 (x − 1), x ∈ I, where N0 is defined in Subsection 4.3.1. In terms of β, the incomplete BCF ¦quotients are given by cn = ¡ ¢ ¥ c1 β n−1 (ω) , n ∈ N+ , with c1 = (1 − ω)−1 , ω ∈ Ω. Here β n , n ∈ N+ , denotes the composition of β with itself n times while β 0 is the identity map. R´enyi (1957) showed that β is ν-preserving, where ν is Ito’s σ-finite, infinite measure with density x−1 , x ∈ (0, 1), which has been considered in Subsection 4.4.2, and that the dynamical system (I, BI , β, ν) is ergodic. See also Adler and Flatto (1984).

308

Chapter 4

As with Proposition 4.4.2 we leave to the reader the proof of the following result. Proposition 4.4.7 Let ω ∈ Ω with RCF expansion [a1 , a2 , · · · ]. Then the BCF expansion (4.4.6) of ω is given by the following algorithm. (i) If a1 = 1 then singularize a1 to arrive at [ 1; −1/(a2 + 1), 1/a3 , · · · ] as a new SRCF expansion of ω. If a1 > 1 then insert (a1 − 1) times −1/1 before a1 to arrive at [ 1; −1/2, · · · , −1/2, −1/1, 1/1, 1/a2 , · · · ] | {z } (a1 −2) times

as a new SRCF expansion of ω, and then singularize the digit 1 appearing before 1/a2 in this expansion of ω. In either case we obtain as SRCF expansion of ω [ 1; (−1/2)a1 −1 , −1/(a2 + 1), 1/a3 , · · · ] ,

(4.4.7)

where (−1/2)a1 −1 abbreviates −1/2, · · · , −1/2. | {z } (a1 −1)times

(ii) Let n be the smallest integer m ∈ N+ for which em = 1 in (4.4.7). Apply to the latter expansion the procedure from (i) to an . Remarks. 1. The above insertion/singularization mechanism implies that ω has a BCF expansion [ 1; (−1/2)a1 −1 , −1/(a2 + 2), (−1/2)a3 −1 , 1/(a4 + 2), · · · ] .

(4.4.8)

See also Zagier (1981, Aufgabe 3, p. 131). It also follows easily from (4.4.8) that every quadratic irrationality has an (eventually) periodic BCF expansion. 2. Again, as for the LCF expansion, it heuristically follows from Corollary 4.1.10 and the insertion mechanism that the BCF transformation β should be ergodic, with invariant σ-finite, infinite measure. 2 √ For the LCF expansion it was intuitively clear that n b1 · · · bn → 2 a.e. as n → ∞ since the only digits are 1 and 2, and ‘there are very few 1’s against

Ergodic theory of continued fractions

309

the 2’s’ (by Corollary 4.1.10). For the BCF expansion such an argument clearly does not work. However, we have the following result. Theorem 4.4.8 Let ω ∈ Ω with BCF expansion (4.4.6). Then √ lim n c1 · · · cn = 2 a.e. n→∞

and

n

lim

n→∞

1 1 + ··· + c1 cn

= 2 a.e..

Proof. Let [a1 , a2 , · · · ] be the RCF expansion of ω. For any given sufficiently large m ∈ N+ there (uniquely) exist integers k ∈ N+ and j ∈ N such that m = a1 + a3 + · · · + a2k−1 + j,

0 ≤ j < a2k+1 .

It follows from (4.4.8) that Pk

c1 · · · cm = 2

i=1 (a2i−1 −1)+j−1

k Y (a2i + 2) , i=1

and therefore m

1 X log ci = m i=1

log 2 m

à k X

! a2i−1 − k + j − 1

i=1



k

+



1X log(a2i + 2) m i=1

k X

  log(a2i + 2)   k+1   i=1 = (log 2) 1 − k . + k   X X  a2i−1 + j  a2i−1 + j i=1

Since

k+1 k X

i=1

1

= 1 k+1

a2i−1 + j

i=1

→ 0

k X i=1

j a2i−1 + k+1

as m → ∞, and k X

log(a2i + 2)

i=1 k X

a2i−1 + j

i=1

→ 0

a.e.

a.e.

310

Chapter 4

as m → ∞, we deduce that √ c1 · · · cm → 2

a.e.

m

as m → ∞. Next, since cn ≥ 2, n ∈ N+ , we have m ≥ 2. 1 1 + ··· + c1 cm Using the same inequalities as in the proof of Theorem 4.4.5 we therefore obtain 2 ≤ lim

m→∞

√ m ≤ lim m c1 · c2 · · · · · cm = 2, 1 1 m→∞ + ··· + c1 cm

that is, lim

m→∞

m

= 2

1 1 + ··· + c1 cm

a.e.. 2

Remark. The asymptotic behaviour of the arithmetic mean c1 + · · · + cm m as m → ∞ was posed as an open problem in Dajani and Kraaikamp (2000). If we write m as before, then an easy calculation yields k X

c1 + · · · + cm = 2+ m

a2i

i=1 k X

j+

, a2i−1

i=1

with 0 ≤ j < a2k+1 . Thus we need to study the behaviour of k X i=1 k X

a2i

a2i−1

i=1

(4.4.9)

Ergodic theory of continued fractions

311

as k → ∞. The asymptotic behaviour of the numerator in (4.4.9) is the same of that of the denominator, and Aaronson (1986) showed that the fraction converges to 1 in probability. However, one expects that infinitely often the denominator is much larger that the numerator, and vice-versa. Thus Dajani and Kraaikamp (op. cit.) conjectured that the lim inf and lim sup of (4.4.9) are a.e. equal to 0 and +∞, respectively. Recently, Aaronson and Nakada (2001) have proved this conjecture. 2

312

Chapter 4

Appendix 1: Spaces, functions, and measures A1.1 Let X be an arbitrary non-empty set. A non-empty collection X of subsets of X is said to be a σ-algebra (in X) if and only if it is closed under the formation of complements and countable unions. Clearly, ∅ and X both belong to X , and X is also closed under the formation of countable intersections. For any non-empty collection C of subsets of X the σ-algebra generated by C, denoted σ(C), is defined as the smallest σ-algebra in X which contains C. Clearly, σ(C) is the intersection of all σ-algebras in X which contain C. A pair (X, X ) consisting of a non-empty set X and a σ-algebra X in X is called a measurable space. In the special case where X is a denumerable set the usual σ-algebra in X is P(X), the collection of all subsets of X. Clearly, P(X) is generated by the elements of X : P(X) = σ ({x} : x ∈ X). The product of two measurable spaces (X, X ) and (Y, Y) is the measurable space (X × Y, X ⊗ Y), where the product σ-algebra X ⊗ Y is defined as σ(C) with C = (A × B : A ∈ X , B ∈ Y).

A1.2 Let (X, X ) and (Y, Y) be two measurable spaces. A map f : X → Y from X into Y is said to be (X , Y)-measurable or a Y -valued random variable (r.v.) on X if and only if the inverse image f −1 (A) = (x ∈ X : f (x) ∈ A) of every set A ∈ Y is in X . Setting f −1 (Y) = (f −1 (A) : A ∈ Y), the above condition can be compactly written as f −1 (Y) ⊂ X . [Note that f −1 (Y) is always a σ-algebra in X whatever f : X → Y ! ] Let (X, X ) be a measurable space, let ((Yi , Yi ))i∈I be a family of measurable spaces, and for any i ∈ I let fi be a Yi -valued r.v. on X. Then 313

314

Appendix 1

¡ ¢ the σ-algebra σ ∪i∈I fi−1 (Yi ) is called the σ-algebra generated by the family (fi )i∈I and is denoted σ((fi )i∈I ). Clearly, this is the smallest σ-algebra S⊂X having the property that fi is (S, Yi )-measurable for any i ∈ I.

A1.3 Let (X, X ) be a measurable space. A function µ : X → R+ is said to be a (finite) measure on X if and only if it is completely additive, that is,¡ for any sequence ¢ P (Ai )i∈N+ of pairwise disjoint elements of X we have µ ∪i∈N+ Ai = i∈N+ µ(Ai ). Complete additivity is equivalent to finite additivity [that is, for any finite collection A1 , . . . , An of pairwise disjoint P elements of X , we have µ (∪ni=1 Ai ) = ni=1 µ(Ai )] in conjunction with continuity at ∅ (that is, for any decreasing sequence A1 ⊃ A2 ⊃ . . . of elements of X with ∩i∈N+ Ai = ∅ we have limn→∞ µ(An ) = 0 ). Clearly, finite additivity implies µ (∅) = 0. In the special case where X is a denumerable set a measure µ on P(X) is defined by simply giving the values µ ({x}) for the elements x ∈ X. A probability on X is a measure P on X satisfying P (X) = 1. An important example of a probability on X is that of the probability δx concentrated at x for any given x ∈ X, which is defined by δx (A) = IA (x), A ∈ X . The collection of all measures (probabilities) on X will be denoted m(X ) (pr(X )). A triple (X, X , P ) consisting of a measurable space (X, X ) and a probability P on X is called a probability space. [The traditional notation for a probability space is (Ω, K, P ). The points ω ∈ Ω are interpreted as the possible outcomes (elementary events) of a random experiment, and the sets A ∈ K as the (random) events associated with it; these are the subsets of Ω arising as the truth sets of certain statements concerning the experiment.] We say that A ∈ X occurs P -almost surely, and write A P -a.s., if and only if P (A) = 1. Let (Y, Y) be a measurable space and let f be a Y -valued of f is the probability P f −1 on Y defined by ¡r.v. on ¢X. The P -distribution −1 −1 Pf (A) = P (f (A)), A ∈ Y. Let (X, X ) and (Y, Y) be two measurable spaces. The product measure of µ ∈ m(X ) and ν ∈ m(Y) is the (unique) measure µ ⊗ ν ∈ m (X ⊗ Y) satisfying the equation µ ⊗ ν(A × B) = µ(A)ν(B) for any A ∈ X and B ∈ Y.

A1.4 Let X be a metric space with metric d. The usual σ-algebra in X, denoted BX , is that of Borel subsets of X, that is, the σ-algebra generated by the

Spaces, functions, and measures

315

collection of all open subsets of X. In the special case where X = Rn (ndimensional Euclidean space) we write Bn for BRn , n ∈ N+ , and B = B 1 . Further, if X is a Borel subset M of Rn , then BM = B n ∩ M = (A ∩ M : A ∈ B n ), n ∈ N+ . A sequence (µn )n∈N+ of measures on BX is said to converge weakly to a w measure µ on BX , and we write µn → µ, if and only if Z Z lim hdµn = hdµ n→∞ X

X

for any h ∈ Cr (X) = the set of all real-valued bounded continuous functions on (X, d). An equivalent definition is obtained by asking that lim µn (A) = µ(A)

(A1.1)

n→∞

for any A ∈ BX for which µ (∂A) = 0, where ∂A is the boundary of A defined as the closure of A minus the interior of A. In the special case where X = R, putting Fn (x) = µn ((−∞, x]) and F (x) = µ ((−∞, x]), x ∈ R, equation (A1.1) holds if and only if limn→∞ µn (R) = µ(R) and limn→∞ Fn (x) = F (x) for any point of continuity x of F . The Prokhorov metric dP on pr(BX ) is defined by dP (P, Q) = inf(ε > 0 : P (A) ≤ Q(Aε )+ε, A ⊂ X, A closed), P, Q ∈ pr(BX ), where Aε = (x : d(x, A) < ε) and d(x, A) = inf(d(x, y) : y ∈ A). If the metric space (X, d) is separable, then for P, Pn ∈ pr(BX ), n ∈ N+ , the weak convergence of Pn to P is equivalent to limn→∞ dP (Pn , P ) = 0. Let (X, d) and (Y, d0 ) be two metric spaces. Consider a Y -valued r.v. f on X. The set Df of all discontinuity points of f belongs to BX since it can be written as ∪ε ∩δ Aε,δ , where ε and δ vary over the positive rational numbers, and Aε,δ is the (open) set of all points x ∈ X for which there exist x0 , x00 ∈ X such that d(x, x0 ) < δ, d(x, x00 ) < δ and d0 (f (x0 ), f (x00 )) ≥ ε. w

Proposition A1.1 If Pn , P ∈ pr (BX ), Pn → P , and P (Df ) = 0, then w

Pn f −1 → P f −1 . In particular, the above result holds for a continuous f for which clearly Df = ∅. For a characterization via weak convergence of almost everywhere continuous functions f , that is, such that P (Df ) = 0, see Mazzone (1995/96).

316

Appendix 1

A1.5 In this section (X, d) is the real line with the usual Euclidean distance. The characteristic function (ch.f.) or Fourier transform of a measure ∧ µ ∈ m(B) is the complex-valued function µ defined on R by Z ∧ µ (t) = e itx µ(dx), t ∈ R. R ∧



If µ = ν for two measures µ, ν ∈ m(B), then µ = ν. Proposition A1.2 (L´evy-Cram´er continuity theorem) Let P, Pn ∈ pr(B), n ∈ N+ . w (i) Pn → P ∈ pr(B) implies limn→∞ Pbn = Pb pointwise, and the convergence of ch.f.s is uniform on compact subsets of R. ∧

(ii) If limn→∞ P n = h pointwise and h is continuous at 0, then h is the w ch.f. of a probability P ∈ pr(B) and Pn → P . Let µ, ν ∈ m(B). The convolution µ ∗ ν is the measure on B defined by Z µ ∗ ν(A) = µ(A − x)ν(dx), A ∈ B, R

where A − x := (y − x : y ∈ A) , x ∈ R. The convolution operator ∗ is associative and commutative. We have µ[ ∗ν =µ b νb,

µ, ν ∈ m(B).

For any n ∈ N+ let fi , 1 ≤ i ≤ n, be real-valued r.v.s on a probability space (Ω, K, P ). The fi are said to be independent if and only if the σ-algebras fi−1 (B), 1 ≤ i ≤ n, are P -independent, that is, ! Ãn n \ Y P Ai = P (Ai ) i=1

i=1

for any Ai ∈ fi−1 (B), 1 ≤ i ≤ n. For independent real-valued r.v.s fi , 1 ≤ i ≤ P P n, the ch.f. of the P -distribution P ( ni=1 fi )−1 of the sum ni=1 fi is equal to the product of the ch.f.s of the P -distributions P fi−1 of the summands, P 1 ≤ i ≤ n. Also, P ( ni=1 fi )−1 is the convolution of the P fi−1 , 1 ≤ i ≤ n. Let µ ∈ m(B). For any n ∈ N+ the nth convolution µ∗n of µ with itself is defined recursively by µ∗1 = µ and µ∗n = µ∗(n−1) ∗ µ for n ≥ 2. Define also µ∗0 as δ0 .

Spaces, functions, and measures

317

Let µ ∈ m(B). The Poisson probability Pois µ associated with µ is defined as Pois µ = e −µ(R)

X µ∗n = e µ−µ(R) . n!

n∈N ∧



d µ = exp(µ − µ (0)). The classical Poisson distribution P (θ) Its ch.f. is Pois with parameter θ > 0 is Pois(θδ1 ). A measure on B ¡ is 2said ¢ to be a L´evy measure if and only if it integrates the function min 1, x on the whole of R. Given a L´evy measure µ, the τ -centered Poisson probability cτ Pois µ, τ > 0, is defined as the probability with characteristic function µZ ¶ ¡ itx ¢ exp e − 1 − itx I[−τ,τ ] (x) µ(dx) . R

We have cτ Pois µ = (Pois µ) ∗ δb(τ ) , where Z τ b(τ ) = − xµ(dx). −τ

A probability P ∈ pr(B) is said to be infinitely divisible if and only if for any n ∈ N+ there exists Pn ∈ pr(B) such that Pn∗n = P . Proposition A1.3 (L´evy–Khinchin representation) P ∈ pr(B) is infinitely divisible if and only if there exist σ ≥ 0 and a L´evy measure ν, and for any τ > 0 there exists aτ ∈ R such that ¶ µ Z ¡ itx ¢ ∧ σ 2 t2 e − 1 − itx I[−τ,τ ] (x) ν(dx) , t ∈ R. + P (t) = exp itaτ − 2 R It follows from Proposition A1.3 that an infinitely divisible probability is the convolution of a normal distribution N (aτ , σ 2 ) and a τ -centered Poisson probability cτ Pois ν. Either of the two terms can be degenerate, that is, the cases σ = 0 and ν ≡ 0 are allowed. An important special class of infinitely divisible probabilities on B is that of stable probabilities. A probability P ∈ pr(B) is said to be stable if and only if for any n ∈ N+ there exist An ∈ R++ and Bn ∈ R such that P ∗n = P fn−1 , where fn is the affine function on R defined by fn (x) = An x + Bn ,

x ∈ R.

(A1.2)

318

Appendix 1

If Bn = 0 for any n ∈ N+ , then P is said to be strictly stable. It appears that the only constants An allowed in (A1.2) are An = n1/α , n ∈ N+ , with α ∈ (0, 2], and then α is called the order of µ. A probability P ∈ pr(B) is ∧

stable of order α if and only if its ch.f. P has the form ∧

α P (t) = exp [i at − c|t| (1 − i b sgn t σ (t, α))] ,

t ∈ R,

where a, b, c ∈ R with |b| ≤ 1 and c ≥ 0, and  πα if α 6= 1,  tg 2 σ(t, α) =  2 π log |t| if α = 1. In particular, a stable probability has order 2 if and only if it is normal. An important example of a stable probability is that of the 1-centered Poisson probability c1 Pois µk1 ,k2 ,α , 0 < α < 2, k1 , k2 ≥ 0, k1 + k2 > 0, whose L´evy measure has density ¢ µk1 ,k2, α (dx) ¡ = k2 I(−∞,0) (x) + k1 I(0,∞) (x) |x|−1−α , dx

x 6= 0.

The ch.f. hk1 ,k2 ,α of c1 Pois µk1 ,k2 ,α is   Z0 ¡ ¢ e itx − 1 − itx I[−1,0) (x) |x|−1−α dx hk1 ,k2 ,α (t) = exp k2  −∞

Z∞ + k1

 ¡ itx ¢ −1−α  e − 1 − itx I(0,1] (x) x dx , 

t ∈ R,

0

which can be expressed in terms of elementary functions as follows. We have hk1 ,k2 ,1 (t) ½ µ ¶ ¾ π(k1 + k2 ) 2 k1 − k2 = exp i(k2 − k1 )(C − 1)t − 1 + i sgn t log |t| |t| , 2 π k1 + k2 where C = 0.57721... is Euler’s constant, while for α 6= 1, 0 < α < 2, ½ i(k2 − k1 )t hk1 ,k2 ,α (t) = exp 1−α µ ¶ ¾ Γ(2 − α) πα k1 − k2 πα α +(k1 + k2 ) cos 1 + i sgn t tg |t| , α(α − 1) 2 k1 + k2 2

Spaces, functions, and measures

319

where Γ is the classical gamma function. Actually, any stable probability of order α 6= 2 has the form δa ∗ c1 Pois µk1 ,k2 ,α with a ∈ R, k1 , k2 ≥ 0, k1 + k2 > 0.

A1.6 Let C = Cr (I) be the metric space of real-valued continuous functions on I = [0, 1] with the uniform metric d(x, y) = sup |x(t) − y(t)| ,

x, y ∈ C.

t∈I

The space C is complete and separable. The σ-algebra BC of Borel sets in (C, d) coincides with the σ-algebra B I ∩ C. Here BI denotes the σalgebra in RI generated by the collection of its subsets of the form Πt∈I At , where At ∈ B, t ∈ I, and At 6= R for finitely many t ∈ I. Of paramount importance is the probability W on BC known as the Wiener measure, for which W (x : x(0) = 0) = 1, W (x : x(ti ) − x(ti−1 ) ≤ ai , 1 ≤ i≤ k) Z ai k Y 1 2 p = e−u /2(ti −ti−1 ) du 2π (ti − ti−1 ) −∞ i=1 for any k ∈ N+ , 0 ≤ t0 < t1 < · · · < tk ≤ 1, ai ∈ R, 1 ≤ i ≤ k. Let D = D(I)(⊃ Cr (I)) be the metric space of real-valued functions on I which are right continuous and have left limits, with the Skorohod metric d0 to be defined below. Clearly, we can also consider the uniform metric d in D which is defined similarly to that in C, that is, d(x, y) = supt∈I |x(t) − y(t)| , x, y ∈ D. Let L denote the set of all strictly increasing continuous functions ` : I → I with `(0) = 0, `(1) = 1, and put s0 (`) = sup |log [(`(t) − `(s)) / (t − s)]| s6=t

for any ` ∈ L. The distance d0 (x, y)(≤ d(x, y)) for x, y ∈ D is defined as the infimum of all ε > 0 for which there exists ` ∈ L such that s0 (`) ≤ ε

320

Appendix 1

and supt∈I |x(t) − y (`(t))| ≤ ε. The metrics d0 and d generate the same topology in D. Nevertheless, while D is complete and separable under d0 , separability does not hold under d. The σ-algebra BD of Borel sets in (D, d0 ) coincides with the σ-algebra B I ∩ D. Wiener measure W can be immediately extended from BC to BD as the topologies induced in D by the metrics d0 and d are identical. Hence A∩C ∈ BC for any A ∈ BD . This allows us to define W (A) = W (A ∩ C), A ∈ BD . Clearly, C is the support of W in D, that is, the smallest closed subset of D whose W -measure equals 1. General references: Araujo and Gin´e (1980), Billingsley (1968), Halmos (1950), Hoffmann–Jørgensen (1994), Samorodnitsky and Taqqu (1994).

Appendix 2: Regularly varying functions A2.1 A measurable function R : [r, ∞) → R+ , where r ∈ R+ , is said to be regularly varying (at ∞) of index α ∈ R if and only if there exists x0 ≥ r such that R([x0 , ∞)) ⊂ R++ and lim

x→∞

R(tx) = tα R(x)

for any t ∈ R++ . A regularly varying function of index 0 is called a slowly varying function. It is obvious that R is regularly varying of index α if and only if it can be written in the form R(x) = xα L(x),

x ∈ (r, ∞),

where L is a slowly varying function. The general form of a slowly varying function is described by the celebrated Karamata theorem below [cf. Seneta (1976, Theorem 1.2 and its Corollary)]. Theorem A2.1 (Representation theorem) Let r ∈ R+ . A function L : [r, ∞) → R+ is slowly varying if and only if µZ x ¶ ε(t) L(x) = c(x) exp dt , x ≥ x0 , x0 t for some x0 ≥ r, where the function c : [x0 , ∞) → R+ is bounded and measurable and limx→∞ c(x) = c > 0 while the function ε : [x0 , ∞) → R is continuous and limx→∞ ε(x) = 0. Corollary A2.2 If L is a slowly varying function, then 321

322

Appendix 2

(i) limx→∞ L(x + y)/L(x) = 1 for any y ∈ R++ ; (ii) limx→∞ xε L(x) = ∞ and limx→∞ x−ε L(x) = 0 for any ε > 0; (iii) L is bounded on finite intervals in [x0 , ∞) if x0 ≥ r is large enough. There exist necessary or sufficient integral conditions for slow variation which are easy to check and use for theoretical and practical purposes. Here are two such results. See, e.g., Seneta (1976, pp. 53-56 and 86-88). Theorem A2.3 Let r ∈ R+ . If L : [r, ∞) → R+ is a slowly varying function and x0 ≥ r so large that L is bounded on finite intervals in [r, ∞), then for any α ≥ −1 we have lim Z

xα+1 L(x)

x→∞

x

=α+1

(A2.1)

y α L(y)dy

x0

Z

x

while the function x → x0

y α L(y)dy, x > x0 , is regularly varying of index

α + 1. Conversely, if L : [r, ∞) → R+ is measurable and bounded on finite intervals in [x0 , ∞) for some x0 ≥ r and (A2.1) holds Z for some α > −1, then x

L is a slowly varying function while the function x → x0

y α L(y)dy, x > x0 ,

is regularly varying of index α + 1. The last assertion also holds for α = −1. Theorem A2.4 Let r ∈ R+ . If L : [r, ∞) → R+ is a slowly varying function, then Z ∞ lim y α L(y) dy < ∞ (A2.2) x→∞ x Z ∞

for any α < −1. If have

y −1 L(y) dy < ∞ then for any α ≤ −1 we

lim

x→∞ x

lim Z

x→∞

xα+1 L(x) ∞

= −(α + 1)

(A2.3)

y α L(y)dy

Z while the function x → x

x ∞

y α L(y) dy, for x large enough, is regularly

varying of index α + 1. Conversely, if L : [r, ∞) → R+ is measurable, satisfies (A2.2), and (A2.3) holds for some α < −1, then L is a slowly varying function while

Regularly varying functions Z



the function x →

323

y α L(y)dy, for x large enough, is regularly varying of

x

index α + 1.

A2.2 An important class of pairs of regularly varying functions is defined as follows. Let ξ be a non-degenerate real-valued random variable on a probability space (Ω, K, P ), and define real-valued functions F and Fe on [0, ∞) by F (x) = E(ξ 2 I(|ξ|≤x) ), Fe(x) = P (|ξ| > x),

x ∈ R+ .

Clearly, F is non-decreasing and Fe non-increasing. It is easy to check that Z F (x) = −

x

Z



u2 dFe (u), Fe (x) =

0

u−2 dF (u),

x

x ∈ R+ ,

whence by integrating by parts we obtain Z F (x) + x Fe (x) = 2 2

x

u Fe(u)du,

(A2.4)

0

Z x Fe (x) + F (x) = 2x2 2



u−3 F (u)du,

x

x ∈ R+ .

(A2.5)

Theorem A2.5 If either F or Fe varies regularly, then the limit x2 Fe (x) =c x→∞ F (x)

(A2.6)

lim

exists and 0 ≤ c ≤ ∞. Conversely, if (A2.6) holds with 0 < c < ∞, then 2

F (x) ∼ x2− 1+c L(x),

2

Fe (x) ∼ cx− 1+c L(x)

as x → ∞, where L is a slowly varying function. Finally, (A2.6) holds with c = 0 if and only if F is slowly varying while (A2.6) holds with c = ∞ if and only if Fe is slowly varying. The proof follows immediately from equations (A2.4) and (A2.5) by using Theorems A2.3 and A2.4. 2

324

Appendix 2

A2.3 Let f : [1, ∞) → R++ be a measurable function which is bounded on finite intervals and such that limx→∞ f (x) = ∞. For any y ∈ [f (1), ∞) define f0 (y) = inf{x ≥ 1 : f (x) ≥ y}, f1 (y) = inf{x ≥ 1 : f (x) > y}, f2 (y) = sup{x ≥ 1 : f (x) ≤ y}. Clearly, the functions fi : [f (1), ∞) → [1, ∞), i = 0, 1, 2, are well defined, any of them is non-decreasing, 1 ≤ f0 ≤ f1 ≤ f2 , and limy→∞ fi (y) = ∞, i = 0, 1, 2. We say that f ∈ F if and only if f1 (y) = 1. y→∞ f2 (y) lim

Lemma A2.6 [Samur (1989, Lemma 2.11)] (i) If f : [1, ∞) → R++ is non-decreasing and limx→∞ f (x) = ∞, then f ∈ F. (ii) If f : [1, ∞) → R++ is bounded on finite intervals and regularly varying of index α > 0, then f ∈ F. Moreover, f0 (y) =1 y→∞ f2 (y) lim

and fi is regularly varying of index 1/α, i = 0, 1, 2. (iii) If f ∈ F and f1 is regularly varying of index 1/α for some α > 0, then f is regularly varying of index α. Corollary A2.7 Let f ∈ F, and define a real-valued function F on R+ by F (x) = (log 2)−1

X

f 2 (k)k −2 ,

x ∈ R+ .

{k∈N+ : |f (k)|≤x}

(i) F is slowly varying if and only if lim

x→∞

x

f 2 (x) X = 0. f 2 (k)k −2

(A2.7)

{k∈N+ : k≤x}

(ii) If f ∈ F is regularly varying of index 1/2, then (A2.7) holds, that is, F is slowly varying.

Appendix 3: Limit theorems for mixing random variables A3.1 Let (Ω, K, P ) be a probability space. For any two σ-algebras K1 and K2 included in the σ-algebra K define the dependence coefficients α(K1 , K2 ) = sup (|P (A1 ∩ A2 ) − P (A1 )P (A2 )| : Ai ∈ Ki , i = 1, 2) , ϕ(K1 , K2 ) = sup (|P (A2 |A1 ) − P (A2 )| : Ai ∈ Ki , i = 1, 2, P (A1 ) > 0) , ¯ µ¯ ¶ ¯ P (A2 |A1 ) ¯ ψ(K1 , K2 ) = sup ¯¯ − 1¯¯ : Ai ∈ Ki , P (Ai ) > 0, i = 1, 2 . P (A2 ) Clearly, α(K1 , K2 ) ≤ ϕ(K1 , K2 ) ≤ ψ(K1 , K2 ) and 0 ≤ α(K1 , K2 ),

ϕ(K1 , K2 ) ≤ 1,

0 ≤ ψ(K1 , K2 ) ≤ ∞.

Let (X, X ) be a measurable space and consider an array X = {Xnj , 1 ≤ j ≤ jn , jn ∈ N+ , n ∈ N+ }

(A3.1)

of X-valued r.v.s defined on (Ω, K, P ). [An infinite sequence (Xn )n∈N+ of X-valued r.v.s can be seen as the (triangular) array {Xnj ≡ Xj , 1 ≤ j ≤ n, n ∈ N+ } .] For such an array define the dependence coefficients δ (k) = sup (k) n∈N+

max

1≤h≤jn −k

δ(σ (Xnj , 1 ≤ j ≤ h), σ (Xnj , h + k ≤ j ≤ jn )) , 325

326

Appendix 3 (k)

where N+ = {n ∈ N+ : jn > k} , k ∈ N+ , and δ stands for either α, ϕ or ψ. Clearly, in the case of an infinite sequence (Xn )n∈N+ we can write δ(k) = sup δ(σ(Xj , 1 ≤ j ≤ h), σ(Xj , h + k ≤ j ≤ h + k + `)). h,`∈N+

It is obvious that the sequence (δ(k))k∈N+ is non-increasing. An array (resp. sequence) of r.v.s is said to be δ-mixing if and only if limk→∞ δ(k) = 0. It can be shown [Bradley (1986, p. 184)] that ϕ(1) < 1 whenever ψ(1) < ∞. A finite collection (Xi )1≤i≤n , n ≥ 2, of X-valued r.v.s is said to be strictly stationary if and only if the probability distribution of (Xk+1 , · · · , Xk+h ), 0 ≤ k ≤ n − h, does not depend on k whatever 1 ≤ h < n. A sequence (Xn )n∈N+ of Xvalued r.v.s is said to be strictly stationary if and only if the probability distribution of (Xk+1 , · · · , Xk+h ) does not depend on k ∈ N whatever h ∈ N+ . An array of X-valued r.v.s is said to be strictly stationary if and only if any row of it is strictly stationary. Proposition A3.1 Let (A3.1) be a ψ-mixing array of X-valued r.v.s. Let ξ and η be real-valued random variables which are σ(Xnj , 1 ≤ j ≤ h)and σ (Xnj , h + k ≤ j ≤ jn )-measurable, respectively, for some h, k, n ∈ N+ . Assume that E |ξ| , E |η| < ∞ and ψ(k) < ∞. Then Cov (ξ, η) exists and |Cov (ξ, η)| ≤ ψ(k)E |ξ| E |η| . In particular, if Eξ 2 < ∞ and Eη 2 < ∞ then |Cov (ξ, η)| ≤ ψ(k) Var1/2 ξ Var1/2 η. Corollary A3.2 Let (A3.1) be a ψ-mixing strictly stationary array of 2 < ∞ for some n ∈ N . real-valued r.v.s with ψ(1) < ∞. Assume that EXn1 + Then   jn k X X Var Xnj < k 1 + 2 ψ(j) Var Xn1 , 1 ≤ k ≤ jn . j=1

j=1

Corollary A3.3 Let (Xn )n∈N+ be a P ψ-mixing strictly stationary sequence of X-valued r.v.s. Assume that n∈N+ ψ(n) < ∞. Let f be a

Limit theorems

327

real-valued r.v. on (X, X ), and assume that Ef 2 (X1 ) < ∞. Then the series X σ 2 = Ef 2 (X1 ) − E 2 f (X1 ) + 2 E(f (X1 ) − Ef (X1 ))(f (Xn+1 ) − Ef (X1 )) n∈N+

is absolutely convergent and σ ≥ 0. We have Var

n X

f (Xj ) = n(σ 2 + o(1))

j=1

as n → ∞. The above results are already folklore. See, e.g., Doukhan (1994, Ch. 1). Proposition A3.4 [Gordin (1971, Remark 3)] In addition to the hypotheses of Corollary A3.3 assume that ψ(1) < 1. Then σ = 0 if and only if f = const.

A3.2 For an array (A3.1) of real-valued r.v.s on (Ω, K, P ) set Snk =

k X

Xnj ,

1 ≤ k ≤ jn ,

Snjn = Sn ,

n ∈ N+ .

j=1

Then such an array is said to be strongly infinitesimal (s.i. for short) if and only if it is strictly stationary and for any sequence (kn )n∈N+ of natural integers such that kn ≤ jn , n ∈ N+ , and limn→∞ kn /jn = 0 the sum Snkn converges in P -probability to 0 as n → ∞. All results given below were proved by J.D. Samur, as indicated at appropriate places, in the more general case of Banach valued random variables. Proposition A3.5 If (A3.1) is a ϕ-mixing s.i. array of real-valued r.v.s, then ¡ −1 ¢ lim max dP P Snk , δ0 = 0 n→∞ 1≤k≤kn

for any sequence (kn )n∈N+ of natural integers such that kn ≤ jn , n ∈ N+ , and limn→∞ kn /jn = 0. This is a consequence of a more general result [Samur (1984, Theorem 3.3)].

328

Appendix 3

Proposition A3.6 [Samur (1987, § 3.4.3.2)] Let (A3.1) be a ϕ-mixing strictly stationary array of real-valued r.v.s such that P Sn−1 converges weakly to some probability measure on B. Then the array (A3.1) is s.i. if and only if Xn1 converges in P -probability to 0 as n → ∞, and for any ε > 0 there exists 0 < a = a(ε) < 1 such that lim sup max P (|Snk | > ε) < 1. n→∞ 1≤k≤ajn

A3.3 Let ν be an infinitely divisible probability on B. We denote by Q_ν the distribution (on B_D) of a stochastic process ξ_ν = (ξ_ν(t))_{t∈I} with stationary independent increments, ξ_ν(0) = 0 a.s., trajectories in D, and ξ_ν(1) having probability distribution ν. When ν is Gaussian the process ξ_ν can be taken with trajectories in C. In this case the distribution of ξ_ν is concentrated on B_C, and we shall denote it by Q_ν^0.

Given an array (A3.1) of real-valued r.v.s, for any n ∈ N_+ define the stochastic processes ξ_n^D = (ξ_n^D(t))_{t∈I} and ξ_n^C = (ξ_n^C(t))_{t∈I} by

    ξ_n^D(t) = S_{n⌊j_n t⌋},
    ξ_n^C(t) = S_{n⌊j_n t⌋} + (j_n t − ⌊j_n t⌋)(S_{n(⌊j_n t⌋+1)} − S_{n⌊j_n t⌋}),    t ∈ I,

with the convention S_{n0} = 0, n ∈ N_+. Clearly, for any n ∈ N_+ the trajectories of ξ_n^D and ξ_n^C are in D and C, respectively.

Theorem A3.7 [Samur (1987, Theorem 3.2 and Corollary 3.3)] Let (A3.1) be a ϕ-mixing strictly stationary array of real-valued r.v.s such that ψ(1) < ∞. Let ν be a probability measure on B. Then the following statements are equivalent:

I. P S_n^{−1} →^w ν and the array (A3.1) is s.i.

II. ν is infinitely divisible and P(ξ_n^D)^{−1} →^w Q_ν in B_D.

Remark. If the assumption ψ(1) < ∞ does not hold, then Theorem A3.7 still holds with statement I replaced by

I′. P S_n^{−1} →^w ν, the array (A3.1) is s.i., and

    sup_{n∈N_+} j_n P(|X_{n1}| > ε) < ∞,    lim_{n→∞} j_n P(|X_{n1}| > ε, |X_{nj}| > ε) = 0

for any ε > 0 and any integer j ≥ 2. □


Theorem A3.8 [Samur (1987, Corollary 3.5 and § 3.6.4)] Let (A3.1) be a ϕ-mixing strictly stationary array of real-valued r.v.s. Let ν be a probability measure on B. Then the following statements are equivalent:

I. P S_n^{−1} →^w ν, the array (A3.1) is s.i., and lim_{n→∞} j_n P(|X_{n1}| > ε) = 0 for any ε > 0.

II. ν is Gaussian and P(ξ_n^D)^{−1} →^w Q_ν in B_D.

III. ν is Gaussian and P(ξ_n^C)^{−1} →^w Q_ν^0 in B_C.

IV. ν is Gaussian, and on a common probability space (Ω′, K′, P′) there exist an array

    X′ = {X′_{nj}, 1 ≤ j ≤ j_n, j_n ∈ N_+, n ∈ N_+}

of real-valued r.v.s and a stochastic process ζ = (ζ(t))_{t∈I} with trajectories in C which satisfy

    P′(X′_{n1}, ..., X′_{nj_n})^{−1} = P(X_{n1}, ..., X_{nj_n})^{−1},    n ∈ N_+,

    P′ζ^{−1} = Q_ν^0,

    max_{1≤k≤j_n} | ∑_{j=1}^{k} X′_{nj} − ζ(k/j_n) | → 0    P′-a.s. as n → ∞.

Remark. If ϕ(1) < 1 and ν is Gaussian, then statement I above can be replaced by

I′. P S_n^{−1} →^w ν, and the array (A3.1) is s.i. □

Theorem A3.9 [Samur (1987, § 3.4.3.1)] Let (X_n)_{n∈N_+} be a ϕ-mixing strictly stationary sequence of real-valued r.v.s. Let (B_n)_{n∈N_+} be a sequence of positive numbers such that lim_{n→∞} B_n = ∞, and let (A_n)_{n∈N_+} be a sequence of real numbers. Assume that

    P( (1/B_n) ∑_{j=1}^{n} (X_j − A_n) )^{−1} →^w ν,

where ν is a non-degenerate probability measure on B. Then ν is stable. Let α ∈ (0, 2] be the order of ν and write

    X_{nj} = (1/B_n)(X_j − A_n),    1 ≤ j ≤ n, n ∈ N_+.

The array X = {X_{nj}, 1 ≤ j ≤ n, n ∈ N_+} is s.i. if and only if: (i) B_n = n^{1/α} L(n), n ∈ N_+, for some slowly varying function L : R_+ → R_{++} integrable over finite intervals, and (ii) for any sequence (r_n)_{n∈N_+} of natural integers such that r_n ≤ n and lim_{n→∞} r_n/n = 0 we have

    lim_{n→∞} (r_n/B_n)(A_{r_n} − A_n) = 0.

Theorem A3.10 [Samur (1984, Theorem 5.6)] Let (A3.1) be a ϕ-mixing strictly stationary array of real-valued r.v.s such that ψ(1) < ∞. Assume there exist positive measures µ_n on B, n ∈ N_+, such that µ_n(R) ≤ 1 and µ_n([−t, t]) = 0, n ∈ N_+, for some t ∈ R_{++}. If P X_{n1}^{−1} = (1 − µ_n(R))δ_0 + µ_n and j_n µ_n converges weakly to a finite measure µ on B, then P S_n^{−1} →^w Pois µ.

Theorem A3.11 [Samur (1984, Theorems 4.1 and 4.2)] Let (A3.1) be a ϕ-mixing strictly stationary s.i. array of real-valued r.v.s such that ϕ(1) < 1. Assume that P S_n^{−1} converges weakly to a probability measure ν on B. Then ν is Gaussian if and only if

    lim_{n→∞} j_n P(|X_{n1}| > ε) = 0

for any ε > 0. If ν = N(m, σ²) then for any ε > 0 we have

    (i) lim_{n→∞} E( ∑_{j=1}^{j_n} ( X_{nj} I_{(|X_{nj}|≤ε)} − E X_{nj} I_{(|X_{nj}|≤ε)} ) )² = σ²

and

    (ii) lim_{n→∞} E ∑_{j=1}^{j_n} X_{nj} I_{(|X_{nj}|≤ε)} = m.

For any real-valued r.v. η put

    m²(η) = E²η / Eη²  if 0 < Eη² < ∞,    m²(η) = 0  if Eη² = ∞.


It can be proved that if Eη² = ∞ then

    lim_{x→∞} E²( η I_{(|η|≤x)} ) / E( η² I_{(|η|≤x)} ) = 0.    (A3.2)

See, e.g., Araujo and Giné (1980, p. 80).

Theorem A3.12 [Samur (1985), Corollary 3.4] Let (X_n)_{n∈N_+} be a ϕ-mixing strictly stationary sequence of real-valued r.v.s for which

    ∑_{n∈N_+} ϕ^{1/2}(n) < ∞.

Assume that 0 < EX_1² ≤ ∞,

    lim_{x→∞} x² P(|X_1| > x) / E( X_1² I_{(|X_1|≤x)} ) = 0,

and the limits

    ϕ_n^{(0)} := lim_{x→∞} E( X_1 X_n I_{(|X_1|≤x, |X_n|≤x)} ) / E( X_1² I_{(|X_1|≤x)} ),    n ∈ N_+,

exist and are all finite. Put S_n = ∑_{i=1}^{n} X_i, n ∈ N_+, S_0 = 0. Then the following assertions hold:

(i) E|X_1| < ∞.

(ii) The series

    σ²_{(0)} = ϕ_1^{(0)} − m²(X_1) + 2 ∑_{n≥2} ( ϕ_n^{(0)} − m²(X_1) )

converges absolutely and its sum is non-negative.

(iii) If σ_{(0)} ≠ 0 then for any sequence (B_n)_{n∈N_+} of positive numbers with lim_{n→∞} B_n = ∞ satisfying

    lim_{n→∞} n B_n^{−2} E( X_1² I_{(|X_1|≤B_n)} ) = 1

we have P ξ̄_n^{−1} →^w W_D in B_D, where

    ξ̄_n(t) = ( S_{⌊nt⌋} − ⌊nt⌋ EX_1 ) / ( σ_{(0)} B_n ),    n ∈ N_+,    t ∈ I.

When a² = EX_1² < ∞ we can take B_n = |a| n^{1/2}, n ∈ N_+.


Notes and Comments

1.1 As we have noted, the basic reference for classical non-metric results on different types of continued fraction expansions is Perron (1954, 1957).

There exist several metrical results about Euclid's algorithm. Let b, n ∈ N_+ with 1 ≤ b < n. Then b/n = [a_1, ..., a_{τ(b,n)}] with a_{τ(b,n)} ≥ 2, and τ(b, n) ∈ N_+ is the number of division steps occurring when b and n are input to the algorithm. Since Euclid's algorithm applied to b and n behaves essentially the same as when applied to b/g.c.d.(b, n) and n/g.c.d.(b, n), it is convenient to consider the average number τ_n of division steps when b is relatively prime to n and chosen at random, that is, probability 1/ϕ(n) is given to any integer in the range [1, n] which is prime to n. Here ϕ is Euler's ϕ-function defined by

    ϕ(n) = n ∏_{p|n} (1 − 1/p),    n ≥ 2,

and ϕ(1) = 1, where the product is taken over all prime numbers p which divide n. Clearly,

    τ_n = (1/ϕ(n)) ∑_{1≤k≤n, g.c.d.(k,n)=1} τ(k, n).

Porter (1975) and Knuth (1976) showed that

    τ_n = (12 log 2/π²) log n + c + O(n^{−1/6+ε})

as n → ∞ for any ε > 0, with

    c = (6 log 2/π²)( 3 log 2 + 4C − 24π^{−2} ζ′(2) − 2 ) − 1/2 = 1.467078... .


The leading coefficient (12 log 2)/π² = 0.84276... was independently derived by Dixon (1970, 1971) and Heilbronn (1969). A very interesting discussion of this topic can be found in Knuth (1981, Section 4.5.3). See also Lochs (1961), Szüsz (1980), and Tonkov (1974). For recent generalizations of Dixon's and Heilbronn's results, see Hensley (1994).

The largest quotient

    max_{1≤k≤τ(b,n)} a_k

occurring in Euclid's algorithm when b and n are input to the algorithm has been studied by Hensley (1991).

The continued fraction transformation τ underlies a chaotic discrete dynamical system which exhibits in an accessible manner all the common features of such systems. See, e.g., Corless (1992).
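The Porter–Knuth asymptotics above are easy to probe numerically. The following Python sketch (ours, not part of the book; all function names are ours) computes τ(b, n) by running Euclid's algorithm, averages over the b prime to n, and compares the result with (12 log 2/π²) log n + c; for individual n the average fluctuates around this asymptote, so only rough agreement should be expected.

from math import gcd, log, pi

def euclid_steps(b, n):
    # number of division steps tau(b, n) of Euclid's algorithm, 1 <= b < n
    count = 0
    while b != 0:
        n, b = b, n % b
        count += 1
    return count

def tau_bar(n):
    # average of tau(b, n) over 1 <= b <= n with gcd(b, n) = 1
    coprime = [b for b in range(1, n + 1) if gcd(b, n) == 1]
    return sum(euclid_steps(b, n) for b in coprime) / len(coprime)

PORTER_C = 1.467078   # the constant c quoted above
for n in (10**3, 10**4, 10**5):
    print(n, round(tau_bar(n), 3), round(12 * log(2) / pi**2 * log(n) + PORTER_C, 3))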

1.2 Whole sections or chapters on the metrical theory of continued fractions can be found in the books by Billingsley (1965), Ibragimov and Linnik (1971), Iosifescu and Grigorescu (1990), Kac (1959), Khin(t)chin(e) (1956, 1963, 1964), Knuth (1981), Koksma (1936), Lévy (1954), Rockett and Szüsz (1992), Sinai (1994), Urban (1923).

1.3 The natural extension τ̄ of τ has been introduced in a more general context by Nakada (1981) in order to derive ergodic properties of associated random variables. See Sections 4.0 and 4.1. The extended incomplete quotients have been first introduced by Faivre (1996) and, in general, the extended random variables by Iosifescu (1997), who proved Theorem 1.3.5 which motivates the consideration of the conditional probability measures γ_a, a ∈ I. Proposition 1.3.8 and Corollary 1.3.9 can also be found in the latter reference. Subsections 1.3.5 and 1.3.6 rely on the work of Iosifescu (1989, 2000 b). It is worth mentioning that to our knowledge it is the first time that mixing coefficients have been computed exactly. A first estimation, ψ(n) ≤ (0.8)^n, n ∈ N_+, of the ψ-mixing coefficients is due to Philipp (1988). As to other types of mixing, it seems possible to prove a kind of α-mixing for (r̄_ℓ)_{ℓ∈Z} using the Markovian structure of (s̄_ℓ)_{ℓ∈Z} and the reversibility of (ā_ℓ)_{ℓ∈Z}.


It is the appropriate place to mention that the sequence (a_n)_{n∈N_+} enjoys another mixing property known as the almost Markov property, a concept introduced by the Lithuanian school—see especially the references to the papers by V.A. Statulevičius and B. Riauba in Heinrich (1987) and Misevičius (1971). See also Saulis and Statulevičius (1991). Let µ ∈ pr(B_I) and for k, n ∈ N_+ define the random variable

    α_{k,n}(µ) = sup | µ(B | σ(a_1, ..., a_{k+n−1})) − µ(B | σ(a_{k+1}, ..., a_{k+n−1})) |,

where the supremum is taken over all B ∈ σ(a_{k+n}, a_{k+n+1}, ...). Put

    χ_µ(n) = sup_{k∈N_+} ess sup α_{k,n}(µ).

Then as shown in Heinrich (op. cit.)—for a slightly weaker form of this result see Misevičius (1981)—assuming that µ ≪ λ and that f = dµ/dλ ∈ L(I) and is bounded away from 0, we have

    χ_µ(n) ≤ 2^{−n+1}( 24 + s(f)/inf_{x∈I} f(x) ),    n ∈ N_+.

Finally, note that it has not been usual to prove F. Bernstein’s theorem (Proposition 1.3.16) as an application of ψ-mixing of the sequence of incomplete quotients.

2.1 Theorem 2.1.6 and Proposition 2.1.7 are in fact corollaries of the ergodic theorem of Ionescu Tulcea and Marinescu (1950) [see also Hennion (1993)], which is a deep generalization of an ergodic theorem of Doeblin and Fortet (1937). Cf. Iosifescu (1993b). As noted by Iosifescu (1993a), it is hard to understand how Doeblin (1940) missed a geometric rate solution to Gauss' problem, which could have been obtained by using the latter theorem. Subsection 2.1.3 relies on the work of Iosifescu (1992, 1993, 1994). In particular, Propositions 2.1.11 and 2.1.12 have allowed for the simplest solution known to date to Gauss' problem, which is included in the first two references just quoted. Proposition 2.1.11 has also been proved by Szüsz (1961) for f ∈ C¹(I). In connection with Proposition 2.1.17 we note that in the case of a singular µ ∈ pr(B_I) the solution to the corresponding Gauss' problem has not yet been systematically studied. See Remark 2 following Corollary 4.1.10 for a case where the limit clearly differs from Gauss' measure.


2.2 Subsections 2.2.1 and 2.2.2 contain a very detailed presentation of E. Wirsing's celebrated 1974 paper. This also includes the effective computation of numerical constants occurring there. Subsection 2.2.3 relies on the work of Iosifescu (2000 a, c). That Theorem 2.2.6 holds for f ∈ L(I), that is, that Theorem 2.2.8 holds, had been announced in Iosifescu (1992) and subsequently used by Faivre (1998a). We stress again the importance of a study of the set E defined in Remark 1 following Theorem 2.2.6. (See also Remark 2 following Theorem 2.2.11.)

2.3 This section contains a detailed presentation of K.I. Babenko's work on Gauss' problem, with some improvements and generalizations. Information about the life and work of K.I. Babenko (1919–1987) can be found in Russian Math. Surveys 35 (1980), no. 2, 265–275, and 43 (1988), no. 2, 138–151.

Proposition 2.3.2 and its proof are due to Mayer and Roepstorff (1987). For a = 0, that is, under Lebesgue measure λ = γ_0, the exact Gauss–Kuzmin–Lévy Theorem 2.3.5 has been proved by Babenko (1978). The general case a ∈ I has been announced by Iosifescu (2000 b). Note that equation (2.3.14) is equivalent to equation (3.6) in Hensley (1992). We stress the fact that for some a ∈ I the exact convergence rate in Gauss' problem under γ_a is faster than Wirsing's optimal rate O(λ_0^n) as n → ∞. See the Remark after the proof of Corollary 2.3.6. It should be noted that by Proposition 2.1.17 for any i^{(k)} ∈ N_+^k the limit of µ[(a_{n+1}, ..., a_{n+k}) = i^{(k)}] as n → ∞ exists and is equal to γ(I(i^{(k)})) whatever µ ∈ pr(B_I) such that µ ≪ λ. Corollary 2.3.6 shows that in the case where µ = γ_a, a ∈ I, a good convergence rate also holds.

A note of historical nature is in order concerning the equation

    lim_{n→∞} λ(a_n = k) = (1/log 2) log( 1 + 1/(k(k+2)) ),    k ∈ N_+,

which is a weaker form of a result given in Corollary 2.3.6. This formula was first obtained as early as 1900. Two papers of the Swedish astronomer Hugo Gyldén, whose understanding of the approximate computation of planetary motions led him in 1888 to study the asymptotic behaviour of λ(a_n = k), k ∈ N_+, as n → ∞, were taken up for revision by his fellow-countrymen Torsten Brodén and Anders Wiman, both mathematicians associated with Lund University.


Wiman (1900) finally got the correct result after Sisyphical computations. Two subsequent papers, both published in 1901, of Brodén and Wiman were then considered by Émile Borel as the first ones to notice the applicability of measure theory in probability. The reader will find precise references and all the necessary details in von Plato (1994, Ch. 2). This book is a fascinating account of the emergence of measure-theoretic probability in the first third of the 20th century (until the publication of A.N. Kolmogorov's Grundbegriffe der Wahrscheinlichkeitsrechnung in 1933). It is convincingly argued there that the theory of the continued fraction expansion should be counted among the fields that brought infinitary events and the idea of measure 0 into probability.
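The limit formula for λ(a_n = k) displayed above is easy to check by simulation: draw starting points uniformly at random, read off their n-th partial quotient by iterating the Gauss map, and tally the frequencies. The Python sketch below is ours, not from the book, and uses pseudo-random doubles as a stand-in for Lebesgue measure.

import random
from collections import Counter
from math import log

def nth_digit(x, n):
    # n-th RCF partial quotient a_n(x), computed by iterating the Gauss map
    for _ in range(n - 1):
        x = 1 / x
        x -= int(x)                  # tau(x) = {1/x}
        if x == 0:                   # guard against a (rare) rational float
            x = random.random()
    return int(1 / x)

random.seed(0)
n, samples = 10, 200000
counts = Counter(nth_digit(random.random(), n) for _ in range(samples))
for k in range(1, 6):
    empirical = counts[k] / samples
    exact = log(1 + 1 / (k * (k + 2))) / log(2)
    print(k, round(empirical, 4), round(exact, 4))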

2.5 This section relies on the work of Iosifescu (1994, 1997, 1999). For a = 0, that is, under Lebesgue measure λ = γ_0, the optimal convergence rate O(g^{2n}) in Theorem 2.5.5 (without explicit lower and upper bounds) has been first shown by Dürner (1992) using a different approach. For a = 0, too, Theorem 2.2.8 with just an upper bound O(g^n) [instead of the optimal one O(g^{2n})] has been proved by a different method by Dajani and Kraaikamp (1994). The proof given here emphasizes the importance of the generalized Brodén–Borel–Lévy formula (1.3.21). It is hard to understand why A. Denjoy's 1936 Comptes Rendus Notes went unnoticed for so many years. The method of proving and generalizing Denjoy's results here is quite different from that suggested by him.

3.0 The idea underlying Lemma 3.0.1 goes back to Philipp (1970). Lemma 3.0.2 is a special case of a result of Samur (1989, Lemma 2.3).

3.1 Except for Theorem 3.1.6, the results in Subsections 3.1.1 and 3.1.2 have been proved by Samur (1989). The classical Poisson law [Theorem 3.1.2 (iii)] under any µ ≪ λ has been first given a complete proof by Iosifescu (1977), who filled a gap in an incomplete proof by Doeblin (1940, p. 358).


3.2 & 3.3 Subsections 3.2.2 and 3.2.3 mainly rely on the work of Samur (1989, 1996), who applied his earlier results for different mixing random variables to the special case of random variables occurring in the metrical theory of continued fractions. The presentation here is more transparent due to the consistent use of the extended random variables, which only appear in an implicit manner in Samur's treatment. For the first versions of most of the results in these sections credit should be given to Doeblin (1940). An extensive analysis of Doeblin's paper has been made by Iosifescu (1990, 1993 a,b), where the reader can find a comprehensive evaluation of Doeblin's important contributions to the metrical theory of continued fractions as compared with subsequent work in the field.

It should be noted that Samur (1989) has also dealt with more general partial sums S_n defined as follows. Let (f_n)_{n∈N_+} be a sequence of H-valued functions on N_+, where H is a separable Hilbert space, and put S_n = ∑_{i=1}^{n} f_i(a_i), n ∈ N_+. He derived sufficient conditions for the laws of certain random functions associated with the S_n, n ∈ N_+, to converge weakly (in the Skorohod space of H-valued functions on I) to an infinitely divisible probability measure on H.

Another generalization of the case considered in Theorem 3.2.4 is that of partial sums

    S_n = ∑_{i=1}^{n} f_i(a_i),

where (f_n)_{n∈N_+} is a sequence of real-valued functions on N_+. A very special case has been taken up by Doeblin (1940, p. 360), with f_n(j) = 1 or 0 according as j ≥ c_n or j < c_n, n, j ∈ N_+. Here (c_n)_{n∈N_+} is a sequence of positive numbers. In this case S_n is the number of occurrences of the random events (a_i ≥ c_i), 1 ≤ i ≤ n. By F. Bernstein's theorem—see Corollary 1.3.16—lim_{n→∞} S_n < ∞ or = ∞ a.e. in I according as the series ∑_{n∈N_+} 1/c_n converges or diverges. Doeblin gave valid hints for a proof that if ∑_{n∈N_+} 1/c_n = ∞ then (S_n)_{n∈N_+} obeys the central limit theorem under λ. More precisely, (S_n − A_n)/√A_n is asymptotically N(0, 1) under λ as n → ∞, with

    A_n = (1/log 2) ∑_{i=1}^{n} log(1 + 1/c_i),    n ∈ N_+.

A complete proof with an estimate of the convergence rate under any µ ≪ λ has been given by Philipp (1970). This result has been improved by Zuparov (1981). The functional version of this central limit theorem was proved by Philipp and Webb (1973).
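Doeblin's counting sums can be experimented with directly. In the Python sketch below (ours, not from the book) we take c_i = ⌈√i⌉, so that ∑ 1/c_i diverges, count the events (a_i ≥ c_i) along pseudo-random orbits of τ, and compare the standardized counts (S_n − A_n)/√A_n with the N(0, 1) prediction; the normal approximation sets in slowly, so the sample variance typically sits somewhat below 1 at moderate n.

import random
from math import ceil, log, sqrt

def count_events(n):
    # S_n = #{i <= n : a_i >= c_i} with c_i = ceil(sqrt(i)), along a
    # pseudo-random Gauss-map orbit; floating-point iteration is only a
    # statistical stand-in for a true orbit
    x = random.random()
    s = 0
    for i in range(1, n + 1):
        a = int(1 / x)
        if a >= ceil(sqrt(i)):
            s += 1
        x = 1 / x - a
        if x == 0:                   # rare floating-point accident
            x = random.random()
    return s

random.seed(1)
n, samples = 10000, 500
A = sum(log(1 + 1 / ceil(sqrt(i))) for i in range(1, n + 1)) / log(2)
z = [(count_events(n) - A) / sqrt(A) for _ in range(samples)]
mean = sum(z) / samples
var = sum((v - mean) ** 2 for v in z) / samples
print("A_n =", round(A, 1), " sample mean =", round(mean, 2), " sample variance =", round(var, 2))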

3.4 We only mention here a result not covered by those given in this section. It is about Doeblin's sequence (S_n)_{n∈N_+} just discussed. Doeblin (1940, p. 361) asserted the validity of the law of the iterated logarithm

    λ( lim sup_{n→∞} (S_n − A_n)/√(2 A_n log log A_n) = 1 ) = 1.

A complete proof was again given by Philipp (1970). The functional version of this law of the iterated logarithm might follow from a more general result in Szüsz and Volkmann (1982, p. 458).

4.0 Most of the results stated for probability measures are still valid for finite measures and even for σ-finite, infinite measures. See, e.g., Aaronson (1997).

4.1 Khin(t)chin(e) [1934/35, 1936; 1963 (or 1964), Ch. 3] proved the a.e. convergence of the arithmetic means ∑_{i=1}^{n} f(a_i, ..., a_{i+k−1})/n, n ∈ N_+, for some fixed k ∈ N, under an unnecessarily strong assumption on the function f : N_+^k → R. His proofs are quite intricate since he made no use of the Birkhoff–Khinchin (!) ergodic theorem which, as we have seen, provides short and elegant proofs. (This should certainly be associated with the fact that ergodic theory at the time was restricted to invertible transformations. But even so a way out could perhaps have been found.) Unlike Khinchin, Doeblin (1940, p. 366) did make use of the ergodic theorem. He proved that the continued fraction transformation τ is ergodic under λ [a different proof had been given earlier by Knopp (1926), see also Martin (1934)]. Since τ is γ-preserving, this enabled him to derive (in an equivalent form) equation (4.1.1), and thus to retrieve Khinchin's results under weaker assumptions in a straightforward manner. It is the appropriate place to note that, in spite of the fact that, e.g., Billingsley (1965, p. 49) fully credits Doeblin for the idea leading to (4.1.1), many authors assert that this idea is due to


Ryll-Nardzewski (1951). Actually, the only real advance made after 1940 in using ergodic theorems in the metric theory of the RCF expansion originated with Nakada (1981) who, as already mentioned, introduced the natural extension τ̄ of τ, allowing one to derive equation (4.1.6).

It is again really surprising that Doeblin (1940, p. 365) asserts that his version of Theorem 2.2.11—see Remark 1 following that theorem—implies that

    lim_{n→∞} (1/n) card{k : Θ_k^{−1} < x, 1 ≤ k ≤ n} = H(x),    x ≥ 1,

and that n^{−1} ∑_{i=1}^{n} Θ_i converges a.e. as n → ∞ to a constant (not indicated). Now, Doeblin's first assertion above is equivalent to the first case considered in Corollary 4.1.22, while the second one is the first equation in Corollary 4.1.23 without the value of the limit. How did Doeblin guess these results whose proofs involve the use of τ̄?

It should be noted that special cases of the Khinchin–Doeblin results have been known before. For example, as already noted, Proposition 4.1.1 and its consequences were first proved (without convergence rates) by Lévy (1929). The application of the Gál–Koksma theorem to the RCF expansion, yielding the convergence rates indicated, is due to de Vroedt (1962, 1964).

Let us finally mention that in Philipp (1967) a more general problem is considered. Given an arbitrary sequence (I_n)_{n∈N_+} of intervals contained in I, it is shown there that for any ε > 0 the random variable

    card{k : τ^k ∈ I_k, 1 ≤ k ≤ n},    n ∈ N_+,

is equal to

    ∑_{k=1}^{n} γ(I_k) + O( ( ∑_{k=1}^{n} γ(I_k) )^{1/2} log^{(3+ε)/2} ∑_{k=1}^{n} γ(I_k) )    a.e.

as n → ∞, where the constant implied in O depends on both ε and the current point ω ∈ Ω.

Moeckel (1982), then Jager and Liardet (1988), using quite different methods showed—amongst other things—that if we consider modulo 2 the sequence (q_n)_{n∈N_+} of the denominators of the RCF convergents of any given ω ∈ Ω, then the asymptotic relative frequencies of the digit blocks 01, 10, and 11 are all a.e. equal to 1/3. [Note that the digit block 00 cannot occur since |p_{n−1}q_n − p_n q_{n−1}| = 1, n ∈ N_+.] Jager and Liardet (op. cit.) showed


that results of this kind can be easily derived from the ergodicity of a certain skew product. To define it we need some notation. For any integer m ≥ 2 let G(m) denote the finite group of 2 × 2 matrices with entries from Z/mZ (the classes of remainders modulo m) and determinant equal to ±1, that is,

    G(m) = { (a b; c d) : a, b, c, d ∈ Z/mZ, ad − bc = ±1 },

where (a b; c d) denotes the 2 × 2 matrix with rows (a, b) and (c, d). It is known that the cardinality of G(m) is given by the formula

    card G(m) = 2J(2) = 6  if m = 2,    card G(m) = 2mJ(m)  if m ≥ 3,

where J is Jordan's arithmetical totient function defined by

    J(m) = m² ∏_{p|m} (1 − 1/p²),    m ≥ 2.

Here the product is taken over all prime numbers p which divide m. Jager and Liardet's skew product T_m : Ω × G(m) → Ω × G(m) is then defined by

    T_m(ω, A) = ( τ(ω), A (0 1; 1 a_1(ω)) mod m ),    (ω, A) ∈ Ω × G(m).

These authors showed that T_m is γ ⊗ h_m-preserving, where h_m is the Haar measure on G(m), that is, the uniform one assigning measure 1/card G(m) to any element of G(m), and that (T_m, γ ⊗ h_m) is an ergodic endomorphism. Hence they deduced, e.g., that given integers m ≥ 2, a, b ∈ N_+, with g.c.d.(a, b, m) = 1, we have

    lim_{n→∞} (1/n) card{k : p_k ≡ a, q_k ≡ b mod m, 1 ≤ k ≤ n} = 1/J(m)    a.e.,

a result also obtained by Moeckel (1982). Subsequently, Nolte (1990) gave other interesting applications of Jager and Liardet's endomorphism.

A natural extension T̄_m of T_m was obtained and studied by Dajani and Kraaikamp (1998). It appears that we can take T̄_m : Ω² × G(m) → Ω² × G(m) defined by

    T̄_m((ω, θ), A) = ( τ̄(ω, θ), A (0 1; 1 a_1(ω)) mod m )

for (ω, θ, A) ∈ Ω² × G(m). Then T̄_m is γ̄ ⊗ h_m-preserving, and (T̄_m, γ̄ ⊗ h_m) is an ergodic automorphism. Hence Dajani and Kraaikamp (op. cit.) deduced, e.g., that for any integers m ≥ 2, 0 ≤ a, b ≤ m − 1, with g.c.d.(a, b, m) = 1 and for any (t_1, t_2) ∈ I² we have

    lim_{n→∞} (1/n) card{k : Θ_{k−1} < t_1, Θ_k < t_2, p_k ≡ a, q_k ≡ b mod m, 1 ≤ k ≤ n} = H(t_1, t_2)/J(m)    a.e.,

where the distribution function H has been defined in Corollary 4.1.20. Their paper contains a host of other results. They also showed that these results can be extended to S-expansions (cf. Sections 4.2 and 4.3). It is interesting to note that the sequences of numerators and denominators of the S-convergents have (mod m) the same asymptotic behaviour as that just indicated for the sequences of numerators and denominators of the RCF convergents.

It may seem difficult to compare, e.g., the decimal expansion with the RCF expansion, since their dynamics are different. However, Lochs (1964) obtained a then surprising result that had to serve as a prototype for further results of the same kind. Let ω ∈ Ω and consider the rational number x_n = x_n(ω) := ⌊10^n ω⌋/10^n, which yields the first n decimal digits of ω, and y_n = x_n + 10^{−n}, n ∈ N_+. Clearly, for n large enough we have y_n < 1. Next, let ω = [a_1, a_2, ...], x_n = [b_1, ..., b_k], and y_n = [c_1, ..., c_ℓ] be the RCF expansions of ω, x_n, and y_n, respectively, and for n ∈ N_+ large enough put

    m_n = m_n(ω) = max{i ≤ max(k, ℓ) : b_j = c_j, 1 ≤ j ≤ i}.

In other words, m_n(ω) is the largest integer such that the closed interval [x_n, y_n] is contained in the closure of the fundamental interval I(a_1, ..., a_{m_n(ω)}) (containing ω). For example, if ω = 2^{1/3} − 1 = 0.259921··· then x_5 = 0.25992, y_5 = 0.25993, ω = [3, 1, 5, 1, 1, ...], x_5 = [3, 1, 5, 1, 1, 4, 2, 5, 1, 3], and y_5 = [3, 1, 5, 1, 1, 5, 5, 1, 2, 1, 4, 3]. Therefore m_5(ω) = 5, that is, from the first 5 decimal digits of ω we obtain its first 5 RCF digits. Using arithmetic properties of τ and Paul Lévy's result (4.1.19), Lochs (op. cit.) proved that

    lim_{n→∞} m_n/n = (6 log 2 log 10)/π² = 0.97027014···    a.e.

This means that, roughly speaking, usually around 97% of the RCF digits are determined by the decimal digits. Using an early mainframe computer, by way of example, Lochs (1963) calculated that the first 1000 decimal digits of π determine 968 RCF digits of it!

Lochs' result was generalized to a wider class of transformations of I by Bosma et al. (1999). Their results are based on the Shannon–McMillan–Breiman theorem in information theory [see Billingsley (1965, p. 129)], while Lochs' limit appears in fact to be the ratio of the entropies of the transformations S : I → I, defined as Sx = 10x mod 1, x ∈ I, underlying the decimal expansion, and τ. Finally, Dajani and Fieldsteel (2001) gave wider applications and simpler proofs of results describing the rate at which the digits of one number theoretical expansion determine those of another. Their proofs are based on general measure-theoretic covering arguments and not on the dynamics of specific maps.

We mention that Lochs' problem was also considered by Faivre (1997, 1998b), who showed that (i) for any ε > 0 there exist positive constants a < 1 and A such that

    λ( |m_n/n − (6 log 2 log 10)/π²| ≥ ε ) ≤ A a^n,    n ∈ N_+,

and (ii) the random variable (m_n − 6(log 2)(log 10)n/π²)/√n is asymptotically N(0, σ) for some σ > 0 (which is related to the constant denoted by the same letter in Example 3.2.11). Clearly, Lochs' result is implied by (i) via the Borel–Cantelli lemma.

Cassels (1959) showed that there exist numbers x which are normal in base 3 but non-normal in any base that is not a power of 3. This result was generalized by Schmidt (1960) as follows. Let the notation r ∼ s stand for r, s ∈ N_+ being powers of the same integer. It is fairly obvious that if r ∼ s then normality of x in base r and normality of x in base s imply each other. If r ≁ s then this implication does not hold. In fact, Schmidt (op. cit.) showed that in the latter case there is a set of numbers x, of the power of the continuum, which are normal in base r but not even simply normal in base s. (Simple normality means that each single digit occurs with the proper frequency.) Motivated by this, Schweiger (1969) defined two number theoretical transformations T and S on I (or I^d, the d-dimensional unit cube, d ∈ N_+) to be equivalent (T ∼ S) if there exist positive integers m, n ∈ N_+ such that T^m = S^n. Schweiger then showed that T ∼ S implies that every T-normal number is S-normal, and conjectured that T ≁ S implies the opposite conclusion.

Surprisingly, Kraaikamp and Nakada (2000) proved that the RCF and NICF expansions share the same set of normal numbers. Clearly, in itself this is not a counter-example to Schweiger's conjecture, since the RCF transformation τ and the NICF transformation N_{1/2} 'live' on different intervals. However, in Kraaikamp and Nakada (2001) two counter-examples are given.
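Lochs' quantity m_n(ω) is straightforward to compute for small n with exact rational arithmetic on x_n and y_n. The Python sketch below (ours, not from the book) reproduces the worked example ω = 2^{1/3} − 1, n = 5 quoted earlier in this subsection.

from fractions import Fraction

def cf_digits(x):
    # regular continued fraction digits of a rational 0 < x < 1
    digits = []
    while x != 0:
        a = int(1 / x)
        digits.append(a)
        x = 1 / x - a
    return digits

def lochs_m(decimal_digits, n):
    # decimal_digits: string of decimal digits of omega after "0."
    x = Fraction(int(decimal_digits[:n]), 10 ** n)       # x_n
    y = x + Fraction(1, 10 ** n)                         # y_n
    bx, by = cf_digits(x), cf_digits(y)
    m = 0
    while m < min(len(bx), len(by)) and bx[m] == by[m]:
        m += 1
    return m

# omega = 2**(1/3) - 1 = 0.259921049894873...
print(lochs_m("259921049894873", 5))   # prints 5, matching the example in the text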

4.2 & 4.3 Section 4.2 fully relies on the work of Kraaikamp (1991); see also his 1989 paper. There exists a host of CF expansions which would have deserved to be discussed here. Two such expansions are the Rosen continued fraction expansions and the α-expansions of Tanaka and Ito (1981). We will briefly discuss both of them.

Although Rosen (1954) introduced his CF expansions in the mid-1950s, it is only very recently that there has been any investigation of their metric properties—see Burton et al. (2000), Gröchenig and Haas (1996), Nakada (1995), Sebe (2002), and Schmidt (1993). The groups which underlie the Rosen continued fraction expansions are Fuchsian groups of the first kind—discrete subgroups of PSL(2, R) acting upon the Poincaré upper half-plane by Möbius (fractional linear) transformations, with all of R as their limit sets. Let λ = λ_q = 2 cos(π/q) for q ∈ {3, 4, ...}, and put

    A = (1 λ; 0 1),    B = (0 −1; 1 0).

Then the group G_q generated by A and B is called the Hecke (triangle) group of index q. Rosen (op. cit.) defined a CF expansion related to G_q, q ≥ 4. (Note that for q = 3 we have the modular group.) Fix some such q and let J_q = [−λ/2, λ/2]. Then the transformation f_q : J_q → J_q defined by

    f_q(x) = (sgn x)/x − λ ⌊ (sgn x)/(λx) + 1/2 ⌋,    x ∈ J_q \ {0},    f_q(0) = 0,

leads to a CF expansion of the form

    x = e_1/(b_1 λ + e_2/(b_2 λ + ···)),

where e_i is equal to either 1 or −1 and b_i ∈ N, i ∈ N_+. We call this the Rosen, or λ-continued fraction (λ-CF), expansion of x ∈ J_q \ {0}.
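To make the definition concrete, here is a small Python sketch (ours, not from the book) that reads the digits off the map: from the displayed formula for f_q one has, at each step, e_i = sgn x and b_i = ⌊1/(λ|x|) + 1/2⌋, after which x is replaced by f_q(x); evaluating the resulting finite λ-CF gives back an approximation of the starting point.

from math import cos, pi, floor

def rosen_digits(x, q, n):
    # first n Rosen lambda-CF digits (e_i, b_i) of x in J_q \ {0}
    lam = 2 * cos(pi / q)
    digits = []
    for _ in range(n):
        if x == 0:
            break
        e = 1 if x > 0 else -1               # e_i = sgn x
        b = floor(1 / (lam * abs(x)) + 0.5)  # b_i
        digits.append((e, b))
        x = 1 / abs(x) - lam * b             # x <- f_q(x)
    return digits

def rosen_value(digits, q):
    # evaluate e_1/(b_1*lam + e_2/(b_2*lam + ...)) from the digit list
    lam = 2 * cos(pi / q)
    value = 0.0
    for e, b in reversed(digits):
        value = e / (b * lam + value)
    return value

x, q = 0.7, 5
d = rosen_digits(x, q, 12)
print(d)
print(x, rosen_value(d, q))   # the truncated expansion approximates x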


In Burton et al. (op. cit.) the natural extension of the ergodic dynamical system underlying the λ-CF expansion was obtained for any q ≥ 3—the case q = 3 is in fact the NICF expansion. [Previously, Nakada (op. cit.) obtained a similar result for any even q.] From this a large number of results similar to those holding for the RCF expansion were obtained for the λ-CF expansion.

At first sight Nakada's α-expansions and those of Tanaka and Ito (1981) bear a close resemblance. Let α ∈ [1/2, 1], I_α = [α − 1, α], and define the transformation T_α : I_α → I_α by

    T_α(x) = x^{−1} − ⌊x^{−1} + 1 − α⌋  if x ∈ I_α \ {0},    T_α(x) = 0  if x = 0.

It yields a unique Tanaka–Ito α-expansion of the form

    x = 1/(b_1 + 1/(b_2 + ···)),    x ∈ I_α \ {0},

which is finite if and only if x is rational, and where b_i ∈ Z \ {0}, i ∈ N_+. In spite of the similarities it is much harder to obtain results for the Tanaka–Ito α-expansions as compared to the Nakada α-expansions discussed in Subsection 4.3.1. E.g., Tanaka and Ito (op. cit.) were only able to give the explicit form of the density of the invariant measure for 1/2 ≤ α ≤ g. For these values of α they were also able to derive the entropy of T_α. It is interesting to note that the latter is independent of α ∈ [1/2, g], and is equal to π²/(6 log g), which is the value corresponding to an S-expansion with maximal singularization area.

It should be noted that limit properties such as those in Chapter 3 for CF expansions other than the RCF expansion need the corresponding Gauss–Kuzmin–Lévy theorems (implying ψ-mixing of the sequence of their incomplete quotients). In this respect we mention the papers of Dajani and Kraaikamp (1999), Iosifescu and Kalpazidou (1993), Kalpazidou (1985a, c, 1986d, e, 1987b), Popescu (1997a, b, 1999, 2000), Rieger (1978, 1979), Rockett (1980), and Sebe (2000a, b, 2001a, b, 2002). It appears, as noted in the Preface, that for any single CF expansion a specific approach is required, which has to more or less mimic that working for the RCF expansion.

We conclude by briefly discussing a generalization of the RCF expansion known as f-expansions (which, in general, are not CF expansions). Let f be


a continuous strictly decreasing (increasing) real-valued function defined on [1, β], where either 2 < β ∈ N_+ or β = ∞ (on [0, β], where either 1 < β ∈ N_+ or β = ∞), such that f(1) = 1 and f(β) = 0 (f(0) = 0 and f(β) = 1), with the convention f(β) = lim_{x→β} f(x) for β = ∞. Denote by f^{−1} the inverse function of f, which is defined on I. Such a function f can be used to represent most real numbers t ∈ I as

    t = f(a_1(t) + f(a_2(t) + ···)) := lim_{n→∞} f_n(a_1(t), ..., a_n(t)),

where f_n is defined recursively by

    f_1(x_1) = f(x_1),    f_2(x_1, x_2) = f_1(x_1 + f(x_2)),

and

    f_{n+1}(x_1, ..., x_{n+1}) = f_n(x_1, ..., x_{n−1}, x_n + f(x_{n+1})),    n ≥ 2.

Here the 'incomplete quotients' a_n(t) are defined recursively as

    a_n(t) = ⌊f^{−1}({r_{n−1}(t)})⌋  with  r_0(t) = t,  r_n(t) = f^{−1}({r_{n−1}(t)}),    n ∈ N_+.

Note that

    r_n(t) = a_n(t) + f(a_{n+1}(t) + f(a_{n+2}(t) + ···)),    n ∈ N_+.

The above representation of t is called its f-expansion. Clearly, the RCF expansion is obtained for f(x) = 1/x, x ≥ 1, and the part of the continued fraction transformation τ is now played by the f-expansion transformation τ_f of I defined by τ_f(t) = {f^{−1}(t)}, t ∈ I. [Some caution is necessary in the case where β = ∞, when either τ_f(0) or τ_f(1) should be given the value 0.] Also, the natural extension τ̄_f of τ_f is defined by τ̄_f(t, u) = (τ_f(t), f(a_1(t) + u)) for the points (t, u) of a suitable subset of I² of Lebesgue measure 1. The f-expansions were first considered by Kakeya (1924), who proved that if f^{−1} is absolutely continuous and |(f^{−1})′| > 1 a.e. in I then, save possibly a countable subset of I, any other t ∈ I has an f-expansion. A metrical theory of f-expansions parallelling that of the RCF expansion is available. See, e.g., Iosifescu and Grigorescu (1990, Section 5.4) and the references therein. Finally, if β does not belong to N_+ ∪ {∞}, then the corresponding f leads to a so-called f-expansion with dependent digits. For recent results on such f-expansions, see Barrionuevo et al. (1994), Dajani and Kraaikamp (1996, 2001), and Dajani et al. (1994).
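The digit recursion for f-expansions translates directly into a few lines of code. The Python sketch below (ours, not from the book) produces a_1(t), ..., a_n(t) for a given f^{−1}; with f(x) = 1/x it returns the RCF partial quotients, as the text notes. The starting value used in the example is our own choice for illustration.

from math import floor

def f_expansion_digits(t, f_inv, n):
    # a_n(t) = floor(r_n), where r_0 = t and r_n = f_inv({r_{n-1}})
    digits = []
    r = t
    for _ in range(n):
        frac = r - floor(r)          # {r_{n-1}}
        if frac == 0:                # the expansion has terminated
            break
        r = f_inv(frac)              # r_n = f_inv({r_{n-1}})
        digits.append(floor(r))      # a_n = floor(r_n)
    return digits

# RCF digits of (sqrt(5) - 1)/2: all partial quotients equal 1.
phi_minus_1 = (5 ** 0.5 - 1) / 2
print(f_expansion_digits(phi_minus_1, lambda y: 1 / y, 10))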

References

Aaronson, J. (1986) Random f-expansions. Ann. Probab. 14, 1037–1057.

Aaronson, J. (1997) An Introduction to Infinite Ergodic Theory. Mathematical Surveys and Monographs 50. Amer. Math. Soc., Providence, RI.

Aaronson, J. and Nakada, H. (2001) Sums without maxima. Preprint.

Abramov, L.M. (1959) Entropy of induced automorphisms. Dokl. Akad. Nauk SSSR 128, 647–650. (Russian)

Abramowitz, M. and Stegun, I.A. (Eds.) (1964) Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. National Bureau of Standards, Washington, D.C.

de Acosta, A. (1982) Invariance principles in probability for triangular arrays of B-valued random vectors and some applications. Ann. Probab. 10, 346–373.

Adams, W.W. (1979) On a relationship between the convergents of the nearest integer and regular continued fractions. Math. Comp. 33, 1321–1331.

Adler, R.L. (1991) Geodesic flows, interval maps, and symbolic dynamics. In: Bedford, T. et al. (Eds.) (1991), 93–123.

Adler, R.L. and Flatto, L. (1984) The backward continued fraction map and geodesic flow. Ergodic Theory and Dynamical Systems 4, 487–492.

Adler, R., Keane, M., and Smorodinsky, M. (1981) A construction of a normal number for the continued fraction transformation. J. Number Theory 13, 95–105.


References Alexandrov, A.G. (1978) Computer investigation of continued fractions. Algoritmic Studies in Combinatorics, 142–161. Nauka, Moscow. (Russian) Aliev, I., Kanemitsu, S., and Schinzel, A. (1998) On the metric theory of continued fractions. Colloq. Math. 77, 141–146. Alzer, H. (1998) On rational approximation to e. J. Number Theory 68, 57–62. Araujo, A. and Gin´e, E. (1980) The Central Limit Theorem for Real and Banach Valued Random Variables. Wiley, New York. Babenko, K.I. (1978) On a problem of Gauss. Soviet Math. Dokl. 19, 136–140. Babenko, K.I. and Jur0 ev, S.P. (1978) On the discretization of a problem of Gauss. Soviet Math. Dokl. 19, 731–735. Bagemilhl, F. and McLaughlin, J.R. (1966) Generalization of some classical theorems concerning triples of consecutive convergents to simple continued fractions. J. Reine Angew. Math. 221, 146–149. Bailey, D.H., Borwein, J.M., and Crandall, R.E. (1997) On the Khintchine constant. Math. Comp. 66, 417–431. Baladi, V. and Keller, G. (1990) Zeta functions and transfer operators for piecewise monotonic transformations. Comm. Math. Phys. 127, 459–477. Barbolosi, D. (1990) Sur le d´eveloppement en fractions continues `a quotients partiels impairs. Monatsh. Math. 109, 25–37. Barbolosi, D. (1993) Automates et fractions continues. J. Th´eor. Nombres Bordeaux 5, 1–22. Barbolosi, D. (1997) Une application du th´eor`eme ergodique sousadditif `a la th´eorie m´etrique des fractions continues. J. Number Theory 66, 172–182. Barbolosi, D. (1999) Sur l’ordre de grandeur des quotients partiels du d´eveloppement en fractions continues r´eguli`eres. Monatsh. Math. 128, 189–200.


Barbolosi, D. and Faivre, C. (1995) Metrical properties of some random variables connected with the continued fraction expansion. Indag. Math. (N.S.) 6, 257–265. Barndorff–Nielsen, O. (1961) On the rate of growth of the partial maxima of a sequence of independent identically distributed random variables. Math. Scand. 9, 383–394. Barrionuevo, J., Burton, R.M., Dajani, K., and Kraaikamp, C. (1996) Ergodic properties of generalized L¨ uroth series. Acta Arith. 74, 311– 327. Bedford, T., Keane, M., and Series, C. (Eds.) (1991) Ergodic Theory, Symbolic Dynamics and Hyperbolic Spaces. Oxford University Press, Oxford. Berechet, A. (2001a) A Kuzmin-type theorem with exponential convergence for a class of fibred systems. Ergodic Theory and Dynamical Systems 21, 673–688. Berechet, A. (2001b) Perron–Frobenius operators acting on BV(I) as contractors. Ergodic Theory and Dynamical Systems 21, 1609–1624. ¨ Bernstein, F. (1911) Uber eine Anwendung der Mengenlehre auf ein aus der Theorie der s¨akularen St¨orungen herr¨ uhrendes Problem. Math. Ann. 71, 417–439. Billingsley, P. (1965) Ergodic Theory and Information. Wiley, New York. Billingsley, P. (1968) Convergence of Probability Measures. Wiley, New York. ´ (1903) Contribution `a l’analyse arithm´etique du continu. Borel, E. J. Math. Pures Appl. (5) 9, 329–375. ´ (1909) Les probabilit´es d´enombrables et leurs applications Borel, E. arithm´etiques. Rend. Circ. Mat. Palermo 27, 247–271. Bosma, W. (1987) Optimal continued fractions. Indag. Math. 49, 353–379. Bosma, W. and Kraaikamp, C. (1990) Metrical theory for optimal continued fractions. J. Number Theory 34, 251–270.


References Bosma, W. and Kraaikamp, C. (1991) Optimal approximation by continued fractions. J. Austral. Math. Soc. Ser. A 50, 481–504. Bosma, W., Dajani, K., and Kraaikamp, C. (1999) Entropy and counting correct digits. Report No. 9925 (June), Univ. Nijmegen, Dept. of Math., Nijmegen (The Netherlands). Bosma, W., Jager, H., and Wiedijk, F. (1983) Some metrical observations on the approximation by continued fractions. Indag. Math. 45, 281–299. Bowman, K.O. and Shenton, L.R. (1989) Continued Fractions in Statistical Applications. Marcel Dekker, New York. Boyarsky, A. and G´ora, P. (1997) Laws of Chaos: Invariant Measures and Dynamical Systems in One Dimension. Birkh¨auser, Boston. Bradley, R.C. (1986) Basic properties of strong mixing conditions. In: Eberlein, E. and Taqqu, M.S. (Eds.) Dependence in Probability and Statistics, 165–192. Birkh¨auser, Boston. Breiman, L. (1960) A strong law of large numbers for a class of Markov chains. Ann. Math. Statist. 31, 801–803. Brezinski, C. (1991) History of Continued Fractions and Pad´e Approximants. Springer–Verlag, Berlin. Brjuno, A.D. (1964) The expansion of algebraic numbers into continued fractions. Z. Vyˇcisl. Mat. i Mat. Fiz. 4, 211–221. (Russian) Brod´en, T. (1900) Wahrscheinlichkeitsbestimmungen bei der gew¨ohn¨ lichen Kettenbruchentwickelung reeller Zahlen. Ofversigt af Kongl. Svenska Vetenskaps-Akademiens F¨ orhandlingar 57, 239–266. Brown, G. and Yin, Q. (1996) Metrical theory for Farey continued fractions. Osaka J. Math. 33, 951–970. Bruckheimer, M. and Arcavi, A. (1995) Farey series and Pick’s area theorem. Math. Intelligencer 17, no. 4, 64–67. de Bruijn, N.G. and Post, K.A. (1968) A remark on uniformly distributed sequences and Riemann integrability. Indag. Math. 30, 149– 150.


Bunimovich, L.A. (1996) Continued fractions and geometrical optics. Amer. Math. Soc. Transl. (2) 171, 45–55. Burton, R.M., Kraaikamp, C., and Schmidt, T.A. (2000) Natural extensions for the Rosen fractions. Trans. Amer. Math. Soc. 352, 1277– 1298. Cassels, J.W.S. (1959) On a problem of Steinhaus about normal numbers. Colloq. Math. 7, 95–101. Chaitin, G.J. (1998) The Limits of Mathematics: A Course on Information Theory and the Limits of Formal Reasoning. Springer–Verlag Singapore, Singapore. Champernowne, D.G. (1933) The construction of decimals normal in the scale of ten. J. London Math. Soc. 8, 254–260. Chatterji, S.D. (1966) Masse, die von regelm¨assigen Kettenbr¨ uchen induziert sind. Math. Ann. 164, 113–117. Choong, K.Y., Daykin, D.E., and Rathbone, C.R. (1971) Rational approximations to π. Math. Comp. 25, 387–392. Chudnovsky, D.V. and Chudnovsky, G.V. (1991) Classical constants and functions: computations and continued fraction expansions. In: Chudnovsky, D.V. et al. (Eds.) Number Theory (New York, 1989/1990), 13–74. Springer–Verlag, New York. Chudnovsky, D.V. and Chudnovsky, G.V. (1993) Hypergeometric and modular function identities, and new rational approximations to and continued fraction expansions of classical constants and functions. In: Knopp, M. and Sheingorn, M. (Eds.) (1993), 117–162. Clemens, L.E. , Merrill, K.D., and Roeder, D.W. (1995) Continued fractions and series. J. Number Theory 54, 309–317. Cohn, H. (Ed.) (1993) Doeblin and Modern Probability (Blaubeuren, Germany, 1991). Contemporary Mathematics 149. Amer. Math. Soc., Providence, RI. Corless, R.M. (1992) Continued fractions and chaos. Amer. Math. Monthly 99, 203–215. Cornfeld, I.P., Fomin, S.V., and Sinai, Ya.G. (1982) Ergodic Theory. Springer–Verlag, Berlin.


References Dajani, K. and Fieldsteel, A. (2001) Equipartition of interval partitions and an application to number theory. Proc. Amer. Math. Soc. 129, 3453–3460. Dajani, K. and Kraaikamp, C. (1994) Generalization of a theorem of Kusmin. Monatsh. Math. 118, 55–73. Dajani, K. and Kraaikamp, C. (1996) On approximation by L¨ uroth series. J. Th´eor. Nombres Bordeaux 8, 331–346. Dajani, K. and Kraaikamp, C. (1998) A note of the approximation by continued fractions under an extra condition. New York J. Math. 3A, 69–80. Dajani, K. and Kraaikamp, C. (1999) A Gauss–Kusmin theorem for optimal continued fractions. Trans. Amer. Math. Soc. 351, 2055– 2079. Dajani, K. and Kraaikamp, C. (2000) ‘The mother of all continued fractions’. Colloq. Math. 84/85, 109–123. Dajani, K. and Kraaikamp, C. (2001) From greedy to lazy expansions and their driving dynamics. Preprint No. 1186, Utrecht Univ., Dept. of Math., Utrecht. Dajani, K., Kraaikamp, C., and Solomyak, B. (1996) The natural extension of the β-transformation. Acta Math. Hungar. 73, 97–109. Daud´e, H., Flajolet, P., and Vall´ee, B. (1997) An average-case analysis of the Gaussian algorithm for lattice reduction. Combinatorics, Probability and Computing 6, 397–433. Davenport, H. (1999) The Higher Arithmetic: An Introduction to the Theory of Numbers, 7th Edition. Cambridge Univ. Press, Cambridge. Davison, J.L. and Shallit, J.O. (1991) Continued fractions for some alternating series. Monatsh. Math. 111, 119–126. Delmer, F. and Deshouillers, J-M. (1993) On a generalization of Farey sequences, I. In: Knopp, M. and Sheingorn, M. (Eds.) (1993), 243– 246. Delmer, F. and Deshouillers, J-M. (1995) On a generalization of Farey sequences. II. J. Number Theory 55, 60–67.


Denker, M. and Jakubowski, A. (1989) Stable limit distributions for strongly mixing sequences. Statist. Probab. Lett. 8, 477–483. Denjoy, A. (1936 a) Sur les fractions continues. C.R. Acad. Sci. Paris 202, 371–374. Denjoy, A. (1936 b) Sur une formule de Gauss. C.R. Acad. Sci. Paris 202, 537–540. Diamond, H.G. and Vaaler, J.D. (1986) Estimates for partial sums of continued fraction partial quotients. Pacific J. Math. 122, 73–82. Dixon, J.D. (1970) The number of steps in the Euclidean algorithm. J. Number Theory 2, 414–422. Dixon, J. D. (1971) A simple estimate for the number of steps in the Euclidean algorithm. Amer. Math. Monthly 78, 374–376. Doeblin, W. (1940) Remarques sur la th´eorie m´etrique des fractions continues. Compositio Math. 7, 353–371. Doeblin, W. and Fortet, R. (1937) Sur des chaˆınes `a liaisons compl`etes. Bull. Soc. Math. France 65, 132–148. Doob, J.L. (1953) Stochastic Processes. Wiley, New York. Doukhan, P. (1994) Mixing: Properties and Examples. Lecture Notes in Statist. 85. Springer–Verlag, New York. Duren, P.L. (1970) Theory of H p Spaces. Academic Press, New York. D¨ urner, A. (1992) On a theorem of Gauss–Kuzmin–L´evy. Arch. Math. (Basel ) 58, 251–256. Elsner, C. (1999) On arithmetic properties of the convergents of Euler’s number. Colloq. Math. 79, 133–145. Elton, H.J. (1987) An ergodic theorem for iterated maps. Ergodic Theory and Dynamical Systems 7, 481–488. Faivre, C. (1992) Distribution of L´evy constants for quadratic numbers. Acta Arith. 61, 13–34. Faivre, C. (1993) Sur la mesure invariante de l’extension naturelle de la transformation des fractions continues. J. Th´eor. Nombres Bordeaux 5, 323–332.


References Faivre, C. (1996) On the central limit theorem for random variables related to the continued fraction expansion. Colloq. Math. 71, 153– 159. Faivre, C. (1997) On decimal and continued fraction expansions of a real number. Acta Arith. 82, 119–128. Faivre, C. (1998a) The rate of convergence of approximations of a continued fraction. J. Number Theory 68, 21–28. Faivre, C. (1998b) A central limit theorem related to decimal and continued fraction expansions. Arch. Math. (Basel ) 70, 455–463. Falconer, K.J. (1986) The Geometry of Fractal Sets. Cambridge Univ. Press, Cambridge. Falconer, K. (1990) Fractal Geometry: Mathematical Foundations and Applications. Wiley, Chichester. Feller, W. (1968) An Introduction to Probability Theory and Its Applications, Vol. I, 3rd Edition. Wiley, New York. Finch, S. (1995) Favorite Mathematical Constants. Available at: http: //www.mathsoft.com/asolve/constant/constant.html Flajolet, P. and Vall´ee, B. (1998) Continued fractions algorithms, functional operators, and structure constants. Theoret. Comput. Sci. 194, 1–34. Flajolet, P. and Vall´ee, B. (2000) Continued fractions, comparison algorithms, and fine structure constants. Constructive, Experimental, and Nonlinear Analysis (Limoges, 1999), 53–82. Amer. Math. Soc., Providence, RI. Fluch, W. (1986) Eine Verallgemeinerung des Kuz’min-Theorems. Anz. ¨ Osterreich. Akad. Wiss. Math.-Natur. Kl. Sitzungsber. II 195, 325– 339. ¨ Fluch, W. (1992) Ein Operator der Kettenbruchtheorie. Anz. Osterreich. Akad. Wiss. Math.-Natur. Kl. 129, 39–49. G´al, I.S. and Koksma, J.F. (1950) Sur l’ordre de grandeur des fonctions sommables. Indag. Math. 12, 638–653.


Galambos, J. (1972) The distribution of the largest coefficient in continued fraction expansions. Quart. J. Math. Oxford Ser. (2) 23, 147– 151. Galambos, J. (1973) The largest coefficient in continued fractions and related problems. In: Osgood, Ch. (Ed.) Diophantine Approximation and its Applications (Proc. Conf., Washington, D.C., 1972), 101–109. Academic Press, New York. Galambos, J. (1994) An iterated logarithm type theorem for the largest coefficient in continued fractions. Acta Arith. 25, 359–364. Gologan, R.-N. (1989) Applications of Ergodic Theory. Technical Publishing House, Bucharest. (Romanian) Gordin, M.I. (1971) On the behavior of the variances of sums of random variables forming a stationary process. Theory Probab. Appl. 16, 474–484. Gordin, M.I. and Reznik, M.H. (1970) The law of the iterated logarithm for the denominators of continued fractions. Vestnik Leningrad. Univ. 25, no. 13, 28–33. (Russian) Gray, J.J. (1984) A commentary on Gauss’ mathematical diary, 1796– 1814, with an English translation. Exposition. Math. 2, 97–130. Grigorescu, S. and Popescu, G. (1989) Random systems with complete connections as a framework for fractals. Stud. Cerc. Mat. 41, 481–489. Gr¨ochenig, K. and Haas, A. (1996) Backward continued fractions and their invariant measures. Canad. Math. Bull. 39, 186–198. Grothendieck, A. (1955) Produits tensoriels topologiques et espaces nucl´eaires. Mem. Amer. Math. Soc. 16. Amer. Math. Soc., Providence, RI. Grothendieck, A. (1956) La th´eorie de Fredholm. Bull. Soc. Math. France 84, 319–384. de Haan, L. (1970) On Regular Variation and its Application to the Weak Convergence of Sample Extremes. Math. Centre Tracts 32. Math. Centrum, Amsterdam. Halmos, P.R. (1950) Measure Theory. Van Nostrand, New York. (Reprinted 1974 by Springer–Verlag, New York)


References Hardy, G.H. and Wright, E. (1979) An Introduction to the Theory of Numbers, 5th Edition. Clarendon Press, Oxford. [Reprinted (with corrections) 1983] Harman, G. (1998) Metric Number Theory. Oxford University Press, New York. Harman, G. and Wong, K.C. (2000) A note on the metrical theory of continued fractions. Amer. Math. Monthly 107, 834–837. Hartman, S. (1951) Quelques propri´et´es ergodiques des fractions continues. Studia Math. 12, 271–278. Hartono, Y. and Kraaikamp, C. (2002) On continued fractions with odd partial quotients. Rev. Roumaine Math. Pures Appl. 47, no. 1. Heilbronn, H. (1969) On the average length of a class of finite continued fractions. Number Theory and Analysis (Papers in Honor of Edmund Landau), 87–96. Plenum, New York. Heinrich, H. (1987) Rates of convergence in stable limit theorems for sums of exponentially ψ-mixing random variables with an application to metric theory of continued fractions. Math. Nachr. 131, 149–165. Hennion, H. (1993) Sur un th´eor`eme spectral et son application aux noyaux lipschitziens. Proc. Amer. Math. Soc. 118, 627–634. Hensley, D. (1988) A truncated Gauss–Kuzmin law. Trans. Amer. Math. Soc. 306, 307–327. Hensley, D. (1991) The largest digit in the continued fraction expansion of a rational number. Pacific J. Math. 151, 237–255. Hensley, D. (1992) Continued fraction Cantor sets, Hausdorff dimension, and functional analysis. J. Number Theory 40, 336–358. Hensley, D. (1994) The number of steps in the Euclidean algorithm. J. Number Theory 49, 142–182. Hensley, D. (1996) A polynomial time algorithm for the Hausdorff dimension of continued fraction Cantor sets. J. Number Theory 58, 9–45. Hensley, D. (1998) Metric Diophantine approximation and probability. New York J. Math. 4, 249–257.


Hensley, D. (2000) The statistics of the continued fraction digit sum. Pacific J. Math. 192, 103–120. Heyde, C.C. and Scott, D.J. (1973) Invariance principles for the law of the iterated logarithm for martingales and processes with stationary increments. Ann. Probab. 1, 428–436. Hofbauer, F. and Keller, G. (1982) Ergodic properties of invariant measures for piecewise monotonic transformations. Math. Z. 180, 119– 140. Hoffmann-Jørgensen, J.(1994) Probability with a View toward Statistics, Vols. I and II. Chapman & Hall, New York. ¨ Hurwitz, A. (1889) Uber eine besondere Art der Kettenbruch-Entwicklung reeller Gr¨ossen. Acta Math. 12, 367–405. Ibragimov, I.A. and Linnik, Yu.V. (1971) Independent and Stationary Sequences of Random Variables. Wolters–Noordhoff, Groningen. Ionescu Tulcea, C. T. and Marinescu, G. (1950) Th´eorie ergodique pour des classes d’op´erations non compl`etement continues. Ann. of Math. (2) 52, 140–147. Iosifescu, M. (1968) The law of the iterated logarithm for a class of dependent random variables. Theory Probab. Appl. 13, 304–313. Addendum, ibid. 15 (1970), 160. Iosifescu, M. (1972) On Strassen’s version of the loglog law for some classes of dependent random variables. Z. Wahrsch. Verw. Gebiete 24, 155–158. Iosifescu, M. (1977) A Poisson law for φ-mixing sequences establishing the truth of a Doeblin statement. Rev. Roumaine Math. Pures Appl. 22, 1441–1447. Iosifescu, M. (1978) Recent advances in the metric theory of continued fractions. Trans. Eighth Prague Conf. on Information Theory, Statistical Decision Functions, Random Processes (Prague, 1978), Vol. A, 27–40. Reidel, Dordrecht. Iosifescu, M. (1989) On mixing coefficients for the continued fraction expansion. Stud. Cerc. Mat. 41, 491–499.


References Iosifescu, M. (1990) A survey of the metric theory of continued fractions, fifty years after Doeblin’s 1940 paper. In: Grigelionis, B. et al. (Eds.) Probability Theory and Mathematical Statistics (Proc. Fifth Vilnius Conference, 1989), Vol. I, 550–572. Mokslas, Vilnius & VSP, Utrecht. Iosifescu, M. (1992) A very simple proof of a generalization of the Gauss–Kuzmin–L´evy theorem on continued fractions, and related questions. Rev. Roumaine Math. Pures Appl. 37, 901–914. Iosifescu, M. (1993a) Doeblin and the metric theory of continued fractions: a functional theoretical approach to Gauss’ 1812 problem. In: Cohn, H. (Ed.) (1993), 97–110. Iosifescu, M. (1993b) A basic tool in mathematical chaos theory: Doeblin and Fortet’s ergodic theorem and Ionescu Tulcea and Marinescu’s generalization. In: Cohn, H. (Ed.) (1993), 111–124. Iosifescu, M. (1994) On the Gauss–Kuzmin–L´evy theorem, I. Rev. Roumaine Math. Pures Appl. 39, 97–117. Iosifescu, M. (1995) On the Gauss–Kuzmin–L´evy theorem, II. Rev. Roumaine Math. Pures Appl. 40, 91–105. Iosifescu, M. (1996) On some series involving sums of incomplete quotients of continued fractions. Stud. Cerc. Mat. 48, 31–36. Corrigendum, ibid. 48, 146. Iosifescu, M. (1997a) On the Gauss–Kuzmin–L´evy theorem, III. Rev. Roumaine Math. Pures Appl. 42, 71–88. Iosifescu, M. (1997b) A reversible random sequence arising in the metric theory of the continued fraction expansion. Rev. Anal. Num´er. Th´eor. Approx. 26, 91–93. Iosifescu, M. (1999) On a 1936 paper of Arnaud Denjoy on the metrical theory of the continued fraction expansion. Rev. Roumaine Math. Pures Appl. 44, 777–792. Iosifescu, M. (2000a) An exact convergence rate result with application to Gauss’ 1812 problem. Proc. Romanian Acad. Ser. A 1, 11–13. Iosifescu, M. (2000b) Exact values of ψ-mixing coefficients of the sequence of incomplete quotients of the continued fraction expansion. Proc. Romanian Acad. Ser. A 1, 67–69.


Iosifescu, M. (2000c) On the distribution of continued fraction approximations: optimal rates. Proc. Romanian Acad. Ser. A 1, 143–145. Iosifescu, M. and Grigorescu, S. (1990) Dependence with Complete Connections and its Applications. Cambridge Univ. Press, Cambridge. Iosifescu, M. and Kalpazidou, S. (1993) The nearest integer continued fraction expansion: an approach in the spirit of Doeblin. In: Cohn, H. (Ed.) (1993), 125–137. Iosifescu, M. and Kraaikamp, C. (2001) On Denjoy’s canonical continued fraction expansion. Submitted. Iosifescu, M. and Theodorescu, R. (1969) Random Processes and Learning. Springer–Verlag, Berlin. Ito, Sh. (1987) On Legendre’s theorem related to Diophantine approximations. S´eminaire de Th´eorie des Nombres, 1987–1988 (Talence, 1987–1988), Exp. No. 44, 19 pp. Ito, Sh. (1989) Algorithms with mediant convergents and their metrical theory. Osaka J. Math. 26, 557–578. Jager, H. (1982) On the speed of convergence of the nearest integer continued fraction. Math. Comp. 39, 555–558. Jager, H. (1985) Metrical results for the nearest integer continued fraction. Indag. Math. 47, 417–427. Jager, H. (1986a) The distribution of certain sequences connected with the continued fraction. Indag. Math. 48, 61–69. Jager, H. (1986b) Continued fractions and ergodic theory. Transcendental Number Theory and Related Topics, 55–59. RIMS Kokyuroku 599. Kyoto Univ., Kyoto. Jager, H. and Kraaikamp, C. (1989) On the approximation by continued fractions. Indag. Math. 51, 289–307. Jager, H. and Liardet, P. (1988) Distributions arithm´etiques des d´enominateurs de convergents de fractions continues. Indag. Math. 50, 181–197. Jain, N.C. and Pruitt, W.E. (1975) The other law of the iterated logarithm. Ann. Probab. 3, 1046–1049.


References Jain, N.C. and Taylor, S.J. (1973) Local asymptotic laws for Brownian motion. Ann. Probab. 1, 527–549. Jenkinson, O. and Pollicott, M. (2001) Computing the dimension of dynamically defined sets: E2 and bounded continued fractions. Ergodic Theory and Dynamical Systems 21, 1429–1445. Jain, N.C., Jodgeo, K., and Stout, W.F. (1975) Upper and lower functions for martingales and mixing processes. Ann. Probab. 3, 119–145. Jones, W.B. and Thron, W.J. (1980) Continued Fractions: Analytic Theory and Applications. Addison-Wesley, Reading, Mass. Kac, M. (1959) Statistical Independence in Probability and Statistics. Wiley, New York. Kaijser, T. (1983) A note on random continued fractions. Probability and Mathematical Statistics : Essays in Honour of Carl-Gustav Esseen, 74–84. Uppsala Univ., Dept. of Math., Uppsala. Kakeya, S. (1924) On a generalized scale of notations. Japan J. Math. 1, 95-108. Kalpazidou, S. (1985a) On a random system with complete connections associated with the continued fraction to the nearer integer expansion. Rev. Roumaine Math. Pures Appl. 30, 527–537. Kalpazidou, S. (1985b) On some bidimensional denumerable chains of infinite order. Stochastic Process. Appl. 19, 341–357. Kalpazidou, S. (1985c) Denumerable chains of infinite order and Hurwitz expansion. Selected Papers Presented at the 16th European Meeting of Statisticians (Marburg, 1994). Statist. Decisions, Suppl. Issue no. 2, 83–87. Kalpazidou, S. (1986a) A class of Markov chains arising in the metrical theory of the continued fraction to the nearer integer expansion. Rev. Roumaine Math. Pures Appl. 31, 877–890. Kalpazidou, S. (1986b) Some asymptotic results on digits of the nearest integer continued fraction. J. Number Theory 22, 271–279. Kalpazidou, S. (1986c) On nearest continued fractions with stochastically independent and identically distributed digits. J. Number Theory 24, 114–125.



Kalpazidou, S. (1986d) On a problem of Gauss–Kuzmin type for continued fractions with odd partial quotients. Pacific J. Math. 123, 103–114.
Kalpazidou, S. (1986e) A Gaussian measure for certain continued fractions. Proc. Amer. Math. Soc. 96, 629–635.
Kalpazidou, S. (1987a) On the entropy of the expansion with odd partial quotients. In: Grigelionis, B. et al. (Eds.) Probability Theory and Mathematical Statistics (Proc. Fourth Vilnius Conf., 1985), Vol. II, 55–62. VNU Science Press, Utrecht.
Kalpazidou, S. (1987b) On the application of dependence with complete connections to the metrical theory of G-continued fractions. Lithuanian Math. J. 27, no. 1, 32–40.
Kamae, T. (1982) A simple proof of the ergodic theorem using nonstandard analysis. Israel J. Math. 42, 284–290.
Kanwal, R.P. (1997) Linear Integral Equations: Theory and Technique, 2nd Edition. Birkhäuser, Boston.
Kargaev, P. and Zhigljavsky, A. (1997) Asymptotic distribution of the distance function to the Farey points. J. Number Theory 65, 130–149.
Katznelson, Y. and Weiss, B. (1982) A simple proof of some ergodic theorems. Israel J. Math. 42, 291–296.
Keane, M.S. (1991) Ergodic theory and subshifts of finite type. In: Bedford, T. et al. (Eds.) (1991), 35–70.
Keller, G. (1984) On the rate of convergence to equilibrium in one-dimensional systems. Comm. Math. Phys. 96, 181–193.
Khintchine, A. (1934/35) Metrische Kettenbruchprobleme. Compositio Math. 1, 361–382.
Khintchine, A. (1936) Zur metrischen Kettenbruchtheorie. Compositio Math. 3, 276–285.
Khintchine, A.J. (1956) Kettenbrüche. Teubner, Leipzig. [Translation of the 2nd (1949) Russian Edition; 1st Russian Edition 1935]
Khintchine, A.Ya. (1963) Continued Fractions. Noordhoff, Groningen. [Translation of the 3rd (1961) Russian Edition]


Khinchin, A.Ya. (1964) Continued Fractions. Univ. Chicago Press, Chicago. [Translation of the 3rd (1961) Russian Edition]
Klein, F. (1895) Über eine geometrische Auffassung der gewöhnlichen Kettenbruchentwicklung. Nachr. König. Gesellsch. Wiss. Göttingen Math.-Phys. Kl. 45, 357–359. [French version (1896) Sur une représentation géométrique du développement en fraction continue ordinaire. Nouvelles Ann. Math. (3) 15, 327–331]
Knopp, K. (1926) Mengentheoretische Behandlung einiger Probleme der diophantischen Approximationen und der transfiniten Wahrscheinlichkeiten. Math. Ann. 95, 409–426.
Knopp, M. and Sheingorn, M. (Eds.) (1993) A Tribute to Emil Grosswald: Number Theory and Related Analysis. Contemporary Mathematics 143. Amer. Math. Soc., Providence, RI.
Knuth, D.E. (1976) Evaluation of Porter’s constant. Comput. Math. Appl. 2, 137–139.
Knuth, D.E. (1981) The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 2nd Edition. Addison-Wesley, Reading, Mass.
Knuth, D.E. (1984) The distribution of continued fraction approximations. J. Number Theory 19, 443–448.
Köhler, G. (1980) Some more predictable continued fractions. Monatsh. Math. 89, 95–100.
Koksma, J.F. (1936) Diophantische Approximationen. J. Springer, Berlin.
Kraaikamp, C. (1987) The distribution of some sequences connected with the nearest integer continued fraction. Indag. Math. 49, 177–191.
Kraaikamp, C. (1989) Statistic and ergodic properties of Minkowski’s diagonal continued fraction. Theoret. Comput. Sci. 65, 197–212.
Kraaikamp, C. (1990) On the approximation by continued fractions, II. Indag. Math. (N.S.) 1, 63–75.
Kraaikamp, C. (1991) A new class of continued fractions. Acta Arith. 57, 1–39.



Kraaikamp, C. (1993) Maximal S-expansions are Bernoulli shifts. Bull. Soc. Math. France 121, 117–131.
Kraaikamp, C. (1994) On symmetric and asymmetric Diophantine approximation by continued fractions. J. Number Theory 46, 137–157.
Kraaikamp, C. and Liardet, P. (1991) Good approximations and continued fractions. Proc. Amer. Math. Soc. 112, 303–309.
Kraaikamp, C. and Lopes, A. (1996) The theta group and the continued fraction expansion with even partial quotients. Geometriae Dedicata 59, 293–333.
Kraaikamp, C. and Meester, R. (1998) Convergence of continued fraction type algorithms and generators. Monatsh. Math. 125, 1–14.
Kraaikamp, C. and Nakada, H. (2000) On normal numbers for continued fractions. Ergodic Theory and Dynamical Systems 20, 1405–1421.
Kraaikamp, C. and Nakada, H. (2001) On a problem of Schweiger concerning normal numbers. J. Number Theory 86, 330–340.
Krasnoselskii, M. (1964) Positive Solutions of Operator Equations. Noordhoff, Groningen.
Krengel, U. (1985) Ergodic Theorems (with a Supplement by Antoine Brunel). W. de Gruyter, Berlin.
Kuipers, L. and Niederreiter, H. (1974) Uniform Distribution of Sequences. Wiley, New York.
Kurosu, K. (1924) Notes on some points in the theory of continued fractions. Japan J. Math. 1, 17–21. Corrigendum, ibid. 2 (1926), 64.
Kuzmin, R.O. (1928) On a problem of Gauss. Dokl. Akad. Nauk SSSR Ser. A, 375–380. [Russian; French version in Atti Congr. Internaz. Mat. (Bologna, 1928), Tomo VI, 83–89. Zanichelli, Bologna, 1932]
Lagarias, J.C. (1992) Number theory and dynamical systems. In: Burr, S.A. (Ed.) The Unreasonable Effectiveness of Number Theory, 35–72. Proc. Sympos. Appl. Math. 46. Amer. Math. Soc., Providence, RI.


Lang, S. and Trotter, H. (1972) Continued fractions for some algebraic numbers. J. Reine Angew. Math. 255, 112–134. Addendum, ibid. 267 (1974), 219–220.
Lasota, A. and Mackey, M.C. (1985) Probabilistic Properties of Deterministic Systems. Cambridge Univ. Press, Cambridge. [2nd Edition (1994) Chaos, Fractals, and Noise: Stochastic Aspects of Dynamics. Applied Mathematical Sciences 97. Springer–Verlag, New York]
Legendre, A.M. (1798) Essai sur la théorie des nombres. Duprat, Paris. [2ème édition (1808), Courcier, Paris; 3ème édition (1830), Didot, Paris; reprinted (1955), Blanchard, Paris]
Lehmer, D. (1939) Note on an absolute constant of Khintchine. Amer. Math. Monthly 46, 148–152.
Lehner, J. (1994) Semiregular continued fractions whose partial denominators are 1 or 2. In: Abikoff, W. et al. (Eds.) The Mathematical Legacy of Wilhelm Magnus: Groups, Geometry and Special Functions (Brooklyn, NY, 1992), 407–410. Contemporary Mathematics 169. Amer. Math. Soc., Providence, RI.
Lévy, P. (1929) Sur les lois de probabilité dont dépendent les quotients complets et incomplets d’une fraction continue. Bull. Soc. Math. France 57, 178–194.
Lévy, P. (1936) Sur le développement en fraction continue d’un nombre choisi au hasard. Compositio Math. 3, 286–303.
Lévy, P. (1952) Fractions continues aléatoires. Rend. Circ. Mat. Palermo (2) 1, 170–208.
Lévy, P. (1954) Théorie de l’addition des variables aléatoires, 2ème édition. Gauthier-Villars, Paris. (1ère édition 1937)
Liardet, P. and Stambul, P. (2000) Séries de Engel et fractions continues. J. Théor. Nombres Bordeaux 12, 37–68.
Lin, M. (1978) Quasi-compactness and uniform ergodicity of positive operators. Israel J. Math. 29, 309–311.
Lochs, G. (1961) Statistik der Teilnenner der zu den echten Brüchen gehörigen regelmässigen Kettenbrüche. Monatsh. Math. 65, 27–52.



Lochs, G. (1963) Die ersten 968 Kettenbruchnenner von π. Monatsh. Math. 67, 311–316.
Lochs, G. (1964) Vergleich der Genauigkeit von Dezimalbruch und Kettenbruch. Abh. Math. Sem. Hamburg 27, 142–144.
Lorenzen, L. and Waadeland, H. (1992) Continued Fractions and Applications. North-Holland, Amsterdam.
Loynes, R.M. (1965) Extreme values in uniformly mixing stationary stochastic processes. Ann. Math. Statist. 36, 993–999.
Lyons, R. (2000) Singularity of some random continued fractions. J. Theoret. Probab. 13, 535–545.
Mackey, M.C. (1992) Time’s Arrow: The Origins of Thermodynamic Behavior. Springer–Verlag, New York.
MacLeod, A.J. (1993) High-accuracy numerical values in the Gauss–Kuzmin continued fraction problem. Comput. Math. Appl. 26, 37–44.
Magnus, W., Oberhettinger, F., and Soni, R.P. (1966) Formulas and Theorems for the Special Functions of Mathematical Physics, 3rd Edition. Springer–Verlag, Berlin.
Marcus, S. (1961) Les approximations diophantiennes et la catégorie de Baire. Math. Z. 76, 42–45.
Marques Henriques, J. (1966) On probability measures generated by regular continued fractions. Gaz. Mat. (Lisboa) 27, no. 103–104, 16–22.
Martin, M.H. (1934) Metrically transitive point transformations. Bull. Amer. Math. Soc. 40, 606–612.
Mayer, D.H. (1987) Relaxation properties of the mixmaster universe. Physics Lett. A 122, 390–394.
Mayer, D. (1990) On the thermodynamic formalism for the Gauss map. Comm. Math. Phys. 130, 311–333.
Mayer, D. (1991) Continued fractions and related transformations. In: Bedford, T. et al. (Eds.) (1991), 175–222.


Mayer, D. and Roepstorff, G. (1987) On the relaxation time of Gauss’ continued-fraction map. I. The Hilbert space approach (Koopmanism). J. Statist. Phys. 47, 149–171.
Mayer, D. and Roepstorff, G. (1988) On the relaxation time of Gauss’ continued-fraction map. II. The Banach space approach (transfer operator method). J. Statist. Phys. 50, 331–344.
Mazzone, F. (1995/96) A characterization of almost everywhere continuous functions. Real Anal. Exchange 21, no. 1, 317–319.
McKinney, T.E. (1907) Concerning a certain type of continued fractions depending on a variable parameter. Amer. J. Math. 29, 213–278.
Minkowski, H. (1900) Über die Annäherung an eine reelle Grösse durch rationale Zahlen. Math. Ann. 54, 91–124.
Minnigerode, B. (1873) Über eine neue Methode, die Pell’sche Gleichung aufzulösen. Nachr. König. Gesellsch. Wiss. Göttingen Math.-Phys. Kl. 23, 619–652.
Misevičius, G. (1971) Asymptotic expansions for the distribution functions of sums of the form $\sum_{j=0}^{n-1} f(T^j t)$. Ann. Univ. Sci. Budapest Eötvös Sect. Math. 14, 77–92. (Russian)
Misevičius, G. (1981) Estimate of the remainder term in the limit theorem for the denominators of continued fractions. Lithuanian Math. J. 21, 245–253.
Misevičius, G. (1992) The optimal zone for large deviations of the denominators of continued fractions. New Trends in Probability and Statistics (Palanga, 1991), Vol. 2, 83–90. VSP, Utrecht.
Moeckel, R. (1982) Geodesics on modular surfaces and continued fractions. Ergodic Theory and Dynamical Systems 2, 69–83.
Mollin, R.A. (1999) Continued fraction gems. Nieuw Arch. Wiskunde (4) 17, 383–405.
Morita, T. (1994) Local limit theorem and distribution of periodic orbits of Lasota–Yorke transformations with infinite Markov partitions. J. Math. Soc. Japan 46, 309–343. Errata, ibid. 47 (1995), 191–192.



Nakada, H. (1981) Metrical theory for a class of continued fraction transformations and their natural extensions. Tokyo J. Math. 4, 399–426.
Nakada, H. (1990) The metrical theory of complex continued fractions. Acta Arith. 56, 279–289.
Nakada, H. (1995) Continued fractions, geodesic flows and Ford circles. In: Takahashi, Y. (Ed.), Algorithms, Fractals and Dynamics, 179–191. Plenum, New York.
Nakada, H., Ito, Sh., and Tanaka, S. (1977) On the invariant measure for the transformations associated with some real continued fraction. Keio Engrg. Rep. 30, 159–175.
von Neumann, J. and Tuckerman, B. (1955) Continued fraction expansion of $2^{1/3}$. Math. Tables Aids Comput. 9, 23–24.
Nolte, V.N. (1990) Some probabilistic results on the convergents of continued fractions. Indag. Math. (N.S.) 1, 381–389.
Obrechkoff, N. (1951) Sur l’approximation des nombres irrationnels par des nombres rationnels. C.R. Acad. Bulgare Sci. 3, no. 1, 1–4.
Olds, C.D. (1963) Continued Fractions. Random House, Toronto.
Pedersen, P. (1959) On the expansion of π in a regular continued fraction. II. Nordisk Mat. Tidskr. 7, 165–168.
Perron, O. (1954, 1957) Die Lehre von den Kettenbrüchen. Band I: Elementare Kettenbrüche; Band II: Analytisch-funktionentheoretische Kettenbrüche. Teubner, Stuttgart. (1st Edition 1913; 2nd Edition 1929)
Petek, P. (1989) The continued fraction of a random variable. Exposition. Math. 7, 369–378.
Petersen, K. (1983) Ergodic Theory. Cambridge Univ. Press, Cambridge.
Pethő, A. (1982) Simple continued fractions for the Fredholm numbers. J. Number Theory 14, 232–236.
Philipp, W. (1967) Some metrical theorems in number theory. Pacific J. Math. 20, 109–127.


Philipp, W. (1970) Some metrical theorems in number theory II. Duke Math. J. 37, 447–458. Errata, ibid. 37, 788.
Philipp, W. (1976) A conjecture of Erdős on continued fractions. Acta Arith. 28, 379–386.
Philipp, W. (1988) Limit theorems for sums of partial quotients of continued fractions. Monatsh. Math. 105, 195–206.
Philipp, W. and Stackelberg, O.P. (1969) Zwei Grenzwertsätze für Kettenbrüche. Math. Ann. 181, 152–156.
Philipp, W. and Stout, W. (1975) Almost Sure Invariance Principles for Partial Sums of Weakly Dependent Random Variables. Mem. Amer. Math. Soc. 161. Amer. Math. Soc., Providence, RI.
Philipp, W. and Webb, G.R. (1973) An invariance principle for mixing sequences of random variables. Z. Wahrsch. Verw. Gebiete 25, 223–237.
von Plato, J. (1994) Creating Modern Probability: Its Mathematics, Physics and Philosophy in Historical Perspective. Cambridge Univ. Press, Cambridge.
van der Poorten, A. and Shallit, J. (1992) Folded continued fractions. J. Number Theory 40, 237–250.
Popescu, C. (1997a) Continued fractions with odd partial quotients: an approach in the spirit of Doeblin. Stud. Cerc. Mat. 49, 107–117.
Popescu, C. (1997b) On the rate of convergence in Gauss’ problem for the continued fraction expansion with odd partial quotients. Stud. Cerc. Mat. 49, 231–244.
Popescu, C. (1999) On the rate of convergence in Gauss’ problem for the nearest integer continued fraction expansion. Rev. Roumaine Math. Pures Appl. 44, 257–267.
Popescu, C. (2000) On a Gauss–Kuzmin problem for the α-continued fractions. Rev. Roumaine Math. Pures Appl. 45, 993–1004.
Popescu, G. (1978) Asymptotic behaviour of random systems with complete connections, I, II. Stud. Cerc. Mat. 30, 37–68, 181–215. (Romanian)



Porter, J.W. (1975) On a theorem of Heilbronn. Mathematika 22, 20–28.
Postnikov, A.G. (1960) Arithmetic Modeling of Random Processes. Trudy Mat. Inst. Steklov. 57. Nauka, Moscow. [Russian; English translation Selected Transl. in Math. Statist. and Probab. 13 (1973), 41–122]
Raney, G.N. (1973) On continued fractions and finite automata. Math. Ann. 206, 265–283.
Răuţu, G. and Zbăganu, G. (1989) Some Banach algebras of functions of bounded variation. Stud. Cerc. Mat. 41, 513–519.
Rényi, A. (1957) Representations for real numbers and their ergodic properties. Acta Math. Acad. Sci. Hungar. 8, 477–493.
Richtmyer, R.D. (1975) Continued fraction expansion of algebraic numbers. Adv. in Math. 16, 362–367.
Rieger, G.J. (1977) Die metrische Theorie der Kettenbrüche seit Gauss. Abh. Braunschweig. Wiss. Gesellsch. 27, 103–117.
Rieger, G.J. (1978) Ein Gauss–Kusmin–Lévy–Satz für Kettenbrüche nach nächsten Ganzen. Manuscripta Math. 24, 437–448.
Rieger, G.J. (1979) Mischung und Ergodizität bei Kettenbrüchen nach nächsten Ganzen. J. Reine Angew. Math. 310, 171–181.
Rieger, G.J. (1981a) Ein Heilbronn–Satz für Kettenbrüche mit ungeraden Teilnennern. Math. Nachr. 101, 295–307.
Rieger, G.J. (1981b) Über die Länge von Kettenbrüchen mit ungeraden Teilnennern. Abh. Braunschweig. Wiss. Gesellsch. 32, 61–69.
Rieger, G.J. (1984) On the metrical theory of the continued fractions with odd partial quotients. Topics in Classical Number Theory (Budapest, 1981), Vol. II, 1371–1418. Colloq. Math. Soc. János Bolyai 34. North-Holland, Amsterdam.
Rivat, J. (1999) On the metric theory of continued fractions. Colloq. Math. 79, 9–15.
Rockett, A.M. (1980) The metrical theory of continued fractions to the nearer integer. Acta Arith. 38, 97–103.


Rockett, A.M. and Szűsz, P. (1992) Continued Fractions. World Scientific, Singapore.
Rogers, C.A. (1998) Hausdorff Measures, 2nd Printing, with a Foreword by K. Falconer. Cambridge Univ. Press, Cambridge.
Rosen, D. (1954) A class of continued fractions associated with certain properly discontinuous groups. Duke Math. J. 21, 549–563.
Rousseau-Egèle, J. (1983) Un théorème de la limite locale pour une classe de transformations dilatantes et monotones par morceaux. Ann. Probab. 11, 772–788.
Ruelle, D. (1978) Thermodynamic Formalism. The Mathematical Structures of Classical Equilibrium Statistical Mechanics. Addison-Wesley, Reading, Mass.
Ryll–Nardzewski, C. (1951) On the ergodic theorems. II. Ergodic theory of continued fractions. Studia Math. 12, 74–79.
Šalát, T. (1967) Remarks on the ergodic theory of the continued fractions. Mat. Časopis Sloven. Akad. Vied 17, 121–130.
Šalát, T. (1969) Bemerkung zu einem Satz von P. Lévy in der metrischen Theorie der Kettenbrüche. Math. Nachr. 41, 91–94.
Šalát, T. (1984) On a metric result in the theory of continued fractions. Acta Math. Univ. Comenian. 44–45, 49–53.
Salem, R. (1943) On some singular monotonic functions which are strictly increasing. Trans. Amer. Math. Soc. 53, 427–439.
Samorodnitsky, G. and Taqqu, M.S. (1994) Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. Chapman & Hall, New York.
Samur, J.D. (1984) Convergence of sums of mixing triangular arrays of random vectors with stationary rows. Ann. Probab. 12, 390–426.
Samur, J.D. (1985) A note on the convergence to Gaussian laws of sums of stationary ϕ-mixing triangular arrays. Probability in Banach Spaces V (Proceedings, Medford, 1984), 387–399. Lecture Notes in Math. 1153. Springer–Verlag, Berlin.



Samur, J.D. (1987) On the invariance principle for stationary ϕ-mixing triangular arrays with infinitely divisible limits. Probab. Theory Related Fields 75, 245–259.
Samur, J.D. (1989) On some limit theorems for continued fractions. Trans. Amer. Math. Soc. 316, 53–79.
Samur, J.D. (1991) A functional central limit theorem in Diophantine approximation. Proc. Amer. Math. Soc. 111, 901–911.
Samur, J.D. (1996) Some remarks on a probability limit theorem for continued fractions. Trans. Amer. Math. Soc. 348, 1411–1428.
Saulis, L. and Statulevičius, V. (1991) Limit Theorems for Large Deviations. Kluwer, Dordrecht.
Schmidt, A.L. (1975) Diophantine approximation of complex numbers. Acta Math. 134, 1–85.
Schmidt, A.L. (1983) Ergodic theory for complex continued fractions. Monatsh. Math. 93, 39–62.
Schmidt, T.A. (1993) Remarks on the Rosen λ-continued fractions. In: Pollington, A. and Moran, W. (Eds.), Number Theory with an Emphasis on the Markoff Spectrum, 227–238. Marcel Dekker, New York.
Schmidt, W.M. (1960) On normal numbers. Pacific J. Math. 10, 661–672.
Schmidt, W.M. (1980) Diophantine Approximation. Lecture Notes in Math. 785. Springer–Verlag, Berlin.
Schweiger, F. (1969) Eine Bemerkung zu einer Arbeit von S.D. Chatterji. Mat. Časopis Sloven. Akad. Vied 19, 89–91.
Schweiger, F. (1995) Ergodic Theory of Fibred Systems and Metric Number Theory. Clarendon Press, Oxford.
Schweiger, F. (2000a) Kuzmin’s theorem revisited. Ergodic Theory and Dynamical Systems 20, 557–565.
Schweiger, F. (2000b) Multidimensional Continued Fractions. Oxford Univ. Press, Oxford.


Sebe, G.I. (1999) Spectral analysis of the Ruelle operator associated with the topological infinite order chain of the continued fraction expansion. Rev. Roumaine Math. Pures Appl. 44, 277–291.
Sebe, G.I. (2000a) The Gauss–Kuzmin theorem for Hurwitz’s singular continued fraction expansion. Rev. Roumaine Math. Pures Appl. 45, 495–514.
Sebe, G.I. (2000b) A two-dimensional Gauss–Kuzmin theorem for singular continued fractions. Indag. Math. (N.S.) 11, 593–605.
Sebe, G.I. (2001a) On convergence rate in the Gauss–Kuzmin problem for the grotesque continued fractions. Monatsh. Math. 133, 241–254.
Sebe, G.I. (2001b) Gauss’ problem for the continued fraction expansion with odd partial quotients revisited. Rev. Roumaine Math. Pures Appl. 46, 839–852.
Sebe, G.I. (2002) A Gauss–Kuzmin theorem for the Rosen fractions. J. Théor. Nombres Bordeaux 14.
Segre, B. (1945) Lattice points in infinite domains, and asymmetric Diophantine approximation. Duke Math. J. 12, 337–365.
Selenius, C.-O. (1960) Konstruktion und Theorie halbregelmässiger Kettenbrüche mit idealer relativer Approximationen. Acta Acad. Abo. Math. Phys. 22, no. 2, 1–75.
Sendov, B. (1959/60) Der Vahlensatz über die singulären Kettenbrüche und die Kettenbrüche nach nächsten Ganzen. Annuaire Univ. Sofia Fac. Sci. Phys. Math. Livre 1 Math. 54, 251–258.
Seneta, E. (1976) Regularly Varying Functions. Lecture Notes in Math. 508. Springer–Verlag, Berlin.

Series, C. (1982) Non-Euclidean geometry, continued fractions, and ergodic theory. Math. Intelligencer 4, no. 1, 24–31.
Series, C. (1991) Geometrical methods of symbolic coding. In: Bedford, T. et al. (Eds.) (1991), 125–151.
Shallit, J. (1979) Simple continued fractions for some irrational numbers. J. Number Theory 11, 209–217.



Shallit, J.O. (1982a) Simple continued fractions for some irrational numbers, II. J. Number Theory 14, 228–231.
Shallit, J.O. (1982b) Explicit descriptions of some continued fractions. Fibonacci Quart. 20, 77–81.
Shallit, J. (1994) Origins of the analysis of the Euclidean algorithm. Historia Math. 21, 401–419.
Shanks, D. and Wrench, J.W., Jr. (1959) Khintchine’s constant. Amer. Math. Monthly 66, 276–279.
Shiu, P. (1995) Computation of continued fractions without input values. Math. Comp. 64, 1307–1317.
Sinai, Ya.G. (1994) Topics in Ergodic Theory. Princeton Univ. Press, Princeton, NJ.
Sloane, N.J.A. and Plouffe, S. (1995) The Encyclopedia of Integer Sequences. Academic Press, San Diego.
Sprindžuk, V.G. (1979) Metric Theory of Diophantine Approximations. Wiley, New York.
Stadje, W. (1985) Bemerkung zu einem Satz von Akcoglu und Krengel. Studia Math. 81, 307–310.
Strassen, V. (1964) An invariance principle for the law of the iterated logarithm. Z. Wahrsch. Verw. Gebiete 3, 211–226.
Sudan, G. (1959) The Geometry of Continued Fractions. Technical Publishing House, Bucharest. (Romanian)
Szűsz, P. (1961) Über einen Kusminschen Satz. Acta Math. Acad. Sci. Hungar. 12, 447–453.
Szűsz, P. (1962) Verallgemeinerung und Anwendungen eines Kusminschen Satzes. Acta Arith. 7, 149–160.
Szűsz, P. (1980) On the length of continued fractions representing a rational number with given denominator. Acta Arith. 37, 55–59.
Szűsz, P. and Volkmann, B. (1982) On Strassen’s law of the iterated logarithm. Z. Wahrsch. Verw. Gebiete 61, 453–458.


Tamura, J. (1991) Symmetric continued fractions related to certain series. J. Number Theory 38, 251–264.
Tanaka, S. and Ito, Sh. (1981) On a family of continued-fraction transformations and their ergodic properties. Tokyo J. Math. 4, 153–175.
Thakur, D.S. (1996) Exponential and continued fractions. J. Number Theory 59, 248–261.
Tietze, H. (1913) Über die raschesten Kettenbruchentwicklungen reeller Zahlen. Monatsh. Math. Phys. 24, 209–242.
Tong, J. (1983) The conjugate property of the Borel theorem on Diophantine approximation. Math. Z. 184, 151–153.
Tong, J. (1994) The best approximation function to irrational numbers. J. Number Theory 49, 89–94.
Tonkov, T. (1974) On the average length of finite continued fractions. Acta Arith. 26, 47–57.
Urban, F.M. (1923) Grundlagen der Wahrscheinlichkeitsrechnung und der Theorie der Beobachtungsfehler. Teubner, Leipzig.
Urbański, M. (2001) Porosity in conformal infinite iterated function systems. J. Number Theory 88, 283–312.
Vahlen, K.T. (1895) Über Näherungswerthe und Kettenbrüche. J. Reine Angew. Math. 115, 221–233.
Vajda, S. (1989) Fibonacci and Lucas Numbers, and the Golden Section: Theory and Applications. E. Horwood, Chichester.
Vallée, B. (1997) Opérateurs de Ruelle–Mayer généralisés et analyse des algorithmes d’Euclide et de Gauss. Acta Arith. 81, 101–144.
Vallée, B. (1998) Dynamique des fractions continues à contraintes périodiques. J. Number Theory 72, 183–235.
Vallée, B. (2000) Digits and continuants in Euclidean algorithms. Ergodic versus Tauberian theorems. J. Théor. Nombres Bordeaux 12, 531–570.



Vardi, I. (1995) The limiting distribution of the St. Petersburg game. Proc. Amer. Math. Soc. 123, 2875–2882.
Vardi, I. (1997) The St. Petersburg game and continued fractions. C.R. Acad. Sci. Paris Ser. I Math. 324, 913–918.
Veech, W.A. (1982) Gauss measures for transformations on the space of interval exchange maps. Ann. of Math. (2) 115, 201–242.
Vershik, A.M. and Sidorov, N.A. (1993) Arithmetic expansions associated with the rotation of a circle. Algebra i Analiz 5, no. 6, 97–115. (Russian)
Viader, P., Paradis, J., and Bibiloni, L. (1998) A new light on Minkowski’s ?(x)-function. J. Number Theory 73, 212–227.
Viswanath, D. (2000) Random Fibonacci sequences and the number 1.13198824... Math. Comp. 69, 1131–1155.
de Vroedt, C. (1962) Measure-theoretical investigations concerning continued fractions. Indag. Math. 24, 583–591.
de Vroedt, C. (1964) Metrical problems concerning continued fractions. Compositio Math. 16, 191–195.
Wall, H.S. (1948) Analytic Theory of Continued Fractions. Van Nostrand, New York.
Walters, P. (1982) An Introduction to Ergodic Theory. Graduate Texts in Mathematics 79. Springer–Verlag, New York.
Watson, G.N. (1944) A Treatise on the Theory of Bessel Functions, 2nd Edition. Cambridge Univ. Press, Cambridge.
Whittaker, E.T. and Watson, G.N. (1927) A Course of Modern Analysis. Cambridge Univ. Press, Cambridge.
Wiman, A. (1900) Über eine Wahrscheinlichkeitsaufgabe bei Kettenbruchentwickelungen. Öfversigt af Kongl. Svenska Vetenskaps-Akademiens Förhandlingar 57, 829–841.
Wirsing, E. (1974) On the theorem of Gauss–Kusmin–Lévy and a Frobenius type theorem for function spaces. Acta Arith. 24, 507–528.


Wrench, J.W., Jr. (1960) Further evaluation of Khintchine’s constant. Math. Comp. 14, 370–371.
Wrench, J.W., Jr. and Shanks, D. (1966) Questions concerning Khintchine’s constant and the efficient computation of regular continued fractions. Math. Comp. 20, 444–448.
Zagier, D.B. (1981) Zetafunktionen und quadratische Körper. Eine Einführung in die höhere Zahlentheorie. Springer–Verlag, Berlin–New York.
Zuparov, T.M. (1981) On a theorem from the metric theory of continued fractions. Izv. Akad. UzSSR Ser. Fiz.-Mat. Nauk no. 6, 9–12. (Russian)

Index

Aaronson, J., 311, 339 Abramov’s formula, 277 Acosta, A. de, 202 Adams, W.W., 271, 272 Adler, R.L., 9, 244, 307 α-expansion, 281, 344, 345 Alexandrov, A.G., 241 algorithm A, 259 algorithm B, 260 algorithm C, 260 almost Markov property, 335 Alzer, H., 13 approximation coefficient, 27, 263 Araujo, A., 197, 320, 331 arc-sine law, 187 generalization of, 202 array, 325 strictly stationary, 326 strongly infinitesimal (s.i.), 327 associated random variables, 15 extended, 34 automorphism, 219

Babenko, K.I., 103, 109, 111, 113, 336 backward continued fraction (BCF) expansion, 307 Bagemihl, F., 30 Bailey, D.H., 231, 233, 241 Barbolosi, D., 249, 264 Barndorff–Nielsen, O., 176 Barrionuevo, J., 346

Berechet, A., xiii, 151 Bernstein, F. F. Bernstein’s theorem, 49, 174 Bibiloni, L., 238 Billingsley, P., 36, 180, 187, 221, 224, 257, 320, 334, 343 Birkhoff’s individual ergodic theorem, 221 Borel, É., 22, 30, 243, 337 Borel sets, 314 Borwein, J.M., 231, 233, 241 Bosma, W., 249, 251, 252, 260, 281, 288, 293–296, 298, 299, 343 boundary, 315 bounded essential variation, 55 bounded p-variation, 75 Boyarski, A., 58, 221, 223 Bradley, R.C., 326 Breiman, L., 253 Brezinski, C., xii Brjuno, A.D., 12, 241 Brodén, T., 22, 336, 337 Brodén–Borel–Lévy formula, 21 generalized, 37 Burton, R.M., 344, 345

Cassels, J.W.S., 343 Champernowne, D.G., 243 characteristic function, 316 Choong, K.Y., 241 Chudnovsky, D.V., 13

Chudnovsky, G.V., 13 Clemens, L.E., 12 conditional probability measures, 36 continuant, 5 continued fraction (CF), 260 continued fraction digits, 4 continued fraction expansion, 4 continued fraction expansion for e, 12 continued fraction expansion for π, 13 continued fraction transformation, 2 natural extension of, 25 continued fraction with even incomplete quotients (Even CF) expansion, 264 continued fraction with odd incomplete quotients (Odd CF) expansion, 264 convolution, 316 Corless, R.M., 334 Cornfeld, I.P., 221 Crandall, R.E., 231, 233, 241 Dajani, K., 250, 300, 303–305, 310, 311, 337, 341, 343, 345, 346 Daudé, H., 111, 130, 134 Davison, J.L., 13 Daykin, D.E., 241 Denjoy, A., 156, 163, 337 dependence coefficients, 325 dependence with complete connections, 23, 234 diagonal continued fraction (DCF) expansion, 289 Diamond, H.G., 235, 239, 240 digamma function ψ, 145

Diophantine approximation, 29 fundamental theorem of, 257 Dixon, J.D., 334 δ-mixing, 326 Doeblin, W., xi, 22, 33, 99, 204, 252, 335, 337–340 Doeblin–Lenstra conjecture, 252 Doob, J.L., 31 Doukhan, P., 327 Duren, P.L., 102 Dürner, A., 34, 337 dynamical system, 219 Elsner, C., 13 Elton, H.J., 253 endomorphism, 219 entropy, 257, 277 Euclid’s algorithm, 1, 2 Euler, L., 5, 12 Faivre, C., 9, 101, 130, 249, 334, 336, 343 Falconer, K.J., 233, 234 Farey continued fraction (FCF) expansion, 303 Feller, W., 238 f-expansion, 346 with dependent digits, 346 Fieldsteel, A., 343 Flajolet, P., 111, 130, 134 Flatto, L., 307 Fluch, W., 134 Fortet, R., 335 Fourier transform, 316 Fujiwara, M., 30 fundamental interval, 18 Gál, I.S., 221, 340 Góra, P., 58, 221, 223 Galambos, J., 173, 174 Gauss, C.F., x, 15

Gauss–Kusmin–Lévy theorem ‘exact’, 111, 125 L²-version, 123 Gauss’ measure, 16 extended, 26 Gauss’ Problem, 15 Babenko’s solution to, 101f Paul Lévy’s solution to, 39f Wirsing’s solution to, 79f Gauss’ problem for τ̄, 246 geodesic flow, 9 Giné, E., 197, 320, 331 Gordin, M.I., 216, 327 Gröchenig, K., 344 Gray, J.J., 16 Grigorescu, S., 23, 33, 62, 168, 193, 253, 334, 346 Grothendieck, A., 105 Gyldén, H., 336 Haan, L. de, 174 Haas, A., 344 Halmos, P.R., 320 Hardy, G.H., 11 Harman, G., 233 Hartman, S., 238 Hartono, Y., 264 Hausdorff dimension, 233 Hausdorff measure, 233 Heilbronn, H., 334 Heinrich, H., 203, 335 Hennion, H., 335 Hensley, D., 2, 103, 194, 234, 252, 334, 336 Heyde, C.C., 188, 214 Hofbauer, F., 193 Hoffmann–Jørgensen, J., 320 Hurwitz, A., 263, 264, 288, 298 Ibragimov, I.A., 71, 72, 334

infinite-order chain, 33 insertion, 300 Ionescu Tulcea, C.T., 335 Iosifescu, M., 23, 33, 62, 64, 147, 151, 168, 173, 178, 179, 183, 193, 204, 334–337, 345, 346 isomorphism, 222 iterated function systems, 234 Ito, Sh., 273, 281, 302, 344, 345 Jager, H., 30, 249, 251, 252, 271–273, 281, 288, 298, 340, 341 Jain, N.C., 215, 216 Jarník, V., 234 Jenkinson, O., 234 Jogdeo, K., 215, 216 Jones, W.B., xii Jur’ev, S.P., 113 Kac, M., 334 Kakeya, S., 346 Kalpazidou, S., 345 Kamae, T., 221 Kanwal, R.P., 105 Karamata theorem, 321 Katznelson, Y., 221 K-automorphism, 223 Keane, M.S., 221, 244 Keller, G., 193 Khin(t)chin(e), A.Ya., 16, 204, 231, 257, 334, 339, 340 Knopp, K., 339 Knuth, D.E., 2, 92, 101, 333, 334 Köhler, G., 13 Koksma, J.F., 221, 334, 340 Kolmogorov, A.N., 337 Kraaikamp, C., 30, 250, 251, 264, 273, 278, 286–290, 294, 296,

299, 300, 303–305, 310, 311, 337, 341–346 Krasnoselskii, M., 128 Kurosu, K., 287 Kuzmin, R.O., 16 Lagarias, J.C., 238 Lagrange, J.-L., 11 Lamé, G., 2 Lang, S., 241 Laplace, P.S., 15 Lasota, A., 58, 220 λ-continued fraction (λ-CF) expansion, 344 Law of the iterated logarithm Chung’s, 215 classical, 213 Strassen’s, 213, 216 Legendre constants, 273 Legendre’s theorem, 20 Lehmer, D., 231 Lehner continued fraction (LCF) expansion, 300 Lehner, J., 300, 302 Lenstra, H.W., 252 LeVeque, J., 30 Lévy-Cramér continuity theorem, 316 Lévy–Khinchin representation, 317 Lévy measure, 317 Lévy, Paul, 16, 22, 39, 256, 271, 334, 340, 342 Liardet, P., 340, 341 Linnik, Yu.V., 71, 72, 334 Lochs, G., 334, 342 Lopes, A., 264 Lorenzen, L., xii Loynes, R.M., 174 Mackey, M.C., 58, 220

MacLeod, A.J., 111, 119 Magnus, W., 105, 107 Marinescu, G., 335 Martin, M.H., 339 matrix approach, 7 Mayer, D.H., 59, 103, 109, 111, 120, 127, 130, 194, 336 Mazzone, F., 315 McKinney, T.E., 281 McLaughlin, J.R., 30 measurable space, 313 measure, 314 mediant convergents, 301 Merrill, K.D., 12 Minnigerode, B., 263 Misevičius, G., 193, 194, 335 Möbius transformation, 7 Moeckel, R., 340, 341 Morita, T., 195 ‘Mother of all SRCF expansions’, 301 Nakada, H., 9, 271, 281, 283, 285, 311, 334, 340, 343–345 nearest integer continued fraction (NICF) expansion, 263 Neumann, J. von, 244 Nolte, V.N., 227, 229, 341 normal continued fraction number, 243 normal number, 243 number normal in base b, 243 Oberhettinger, F., 105, 107 Obrechkoff, N., 30 Olds, C.D., xii 1–block, 258 Operator Mayer–Ruelle, 130 generalization of, 134

nuclear of order 0 (of trace class), 105 trace of, 105 Perron–Frobenius, 57, 58 transition, 65 optimal continued fraction (OCF) expansion, 293 Paradis, J., 238 Pedersen, P., 231 Perron, O., xii, 11, 261, 288, 289, 333 Petek, P., 64 Petersen, K., 221, 223, 276, 277 Pethő, A., 13 Philipp, W., 34, 173, 174, 176, 181, 215, 216, 230, 239, 256, 334, 337–340 Plato, J. von, xii, 337 Poisson probability, 317 τ-centered, 317 Pollicott, M., 234 Poorten, A. van der, 13 Popescu, C., 180, 281, 345 Popescu, G., 253 Porter, J.W., 333 Postnikov, A.G., 244 preservation area, 269 probability, 314 infinitely divisible, 317 stable, 317 order of, 318 strictly stable, 318 probability space, 314 Prokhorov metric, 315 Pruitt, W.E., 215 ψ-mixing coefficient, 43 quadratic irrationality, 11 random variable (r.v.), 313

independent, 316 P-distribution of, 314 Raney, G.N., 9 Rathbone, C.R., 241 Răuţu, G., 56 (regular) continued fraction (RCF), 3, 4 convergents of [= (RCF) convergents], 4 digits of, 4 asymptotic relative digit frequencies, 225 asymptotic relative frequencies of digits between two given values, 226 asymptotic relative frequencies of digits exceeding a given value, 227 asymptotic relative m-digit block frequencies, 226 incomplete (partial) quotients of, 4 extended, 31 regularly varying function, 321 index of, 321 Reznik, M.H., 216 Riauba, R., 335 Richtmyer, R.D., 12, 241 Rieger, G.J., 264, 272, 284, 345 Rockett, A.M., xii, 284, 334, 345 Roeder, D.W., 12 Roepstorff, G., 109, 111, 120, 127, 336 Rogers, C.A., 233 Rosen continued fraction expansion, 344 Rosen, D., 344 Rousseau-Egèle, J., 193 Ruelle, D., 130 Ryll–Nardzewski, C., 340

Šalát, T., 233 Salem, R., 238 σ-algebra, 313 Samorodnitsky, G., 320 Samur, J.D., 79, 99, 188, 197, 211, 324, 327–331, 337, 338 Saulis, L., 335 Schmidt, T.A., 344 Schmidt, W.M., xii, 343 Schweiger, F., 264, 343 S-convergent, 270 Scott, D.J., 188, 214 Sebe, G.I., 344, 345 Segre, B., 30 Selenius, C.-O., 260 semi-regular continued fraction (SRCF) expansion, 261 closest, 259 fastest, 259, 294 Sendov, B., 30, 287 Seneta, E., 321, 322 Series, C., 9 S-expansion, 267 Shallit, J.O., 2, 13, 241 Shanks, D., 231, 232, 241 Shiu, P., 241 Sinai, Ya.G., 334 singular continued fraction (SCF) expansion, 264 singularization, 258, 265 singularization area, 267 maximal, 269 singularization process, 265 skew product, 223 Jager and Liardet’s, 341 Skorohod metric d₀, 319 slowly varying function, 321 representation theorem, 321 Smorodinsky, M., 244 Soni, R.P., 105, 107

Index spectral radius, 95 Sprindˇzuk, V.G., xii St. Petersburg game, 238 Stackelberg, O.P., 216 Stadje, W., 55 Statulevi˘cius, V.A., 335 Stout, W.F., 181, 215, 216 Strassen, V., 213, 216 Sudan, G., xii Sz˝ usz, P., xii, 16, 30, 334, 335, 339 Tamura, J., 13 Tanaka, S., 281, 344, 345 Taqqu, M.S., 320 Taylor, S.J., 215, 216 Thakur, D.S., 13 Theodorescu, R., 168 Thron, W.J., xii Tietze, H., 261 Tong, J., 30, 31 Tonkov, T., 334 transformation, 219 ergodic, 220 exact, 220 measure preserving, 219 natural extension of, 222 non-singular, 219 strongly mixing, 220 Trotter, H., 241 Tuckerman, B., 244 UB Conjecture, 139 Urban, F.M., 334 Uspensky, J.V., 16 Vaaler, J.D., 235, 239, 240 Vahlen, K.T., 28 Vall´ee, B., 111, 130, 134, 135, 194 Vardi, I., 238 Viader, P., 238 Volkmann, B., 339

Vroedt, C. de, 340 Waadeland, H., xii Wall, H.S., xii Walters, P., 221 Watson, G.N., 104, 227 weak convergence, 315 Webb, G.R., 339 Weiss, B., 221 Whittaker, E.T., 227 Wiedijk, F., 249, 251, 252, 281, 288, 298 Wiener measure, 319 Wiman, A., 336, 337 Wirsing, E., 16, 83, 91, 92, 113, 336 Wrench, J.W., 231, 232 Wright, E., 11 Zagier, D.B., 308 Zbăganu, G., 56 Zuparov, T.M., 338

