Higher order derivatives of the inverse function

Andrej Liptaj∗

May 7, 2007
Abstract
The general formula for higher order derivatives of the inverse function is presented. The formula is then used in a couple of mathematical applications: expansion of the inverse function into a Taylor series, solving equations, constructing random numbers with a given distribution from random numbers with the uniform distribution, and predicting the value of a function, from its properties at a single point, in the region beyond the radius of convergence of its Taylor series.
1 Introduction
Every reader probably knows the formula for the derivative of the inverse function
$$\frac{d}{dy}\, g(y)\Big|_{y=f(x_0)} = \frac{1}{\frac{d}{dx} f(x)\big|_{x=x_0}}, \qquad g = f^{-1},$$
which is usually part of a basic course of mathematical analysis. One could ask about analogous formulas for higher order derivatives. Even though these formulas are simple, they are rarely presented. In this article we are going to write down the general formula for all higher order derivatives of the inverse function and show a couple of related examples and applications.
2 Higher order derivatives of the inverse function

2.1 Notations
For the purpose of this article it is convenient to introduce new notations. This is mostly motivated by the fact that higher order derivatives and higher powers will occur often in the text and would thus make formulas difficult to read. Let us consider a real function of a real variable f(x) on the interval I. Let us suppose that f is bijective on I. Let x0 be an internal point of I and g(y) be the inverse function of f, g(f(x1)) = g(y1) = x1 for any x1 belonging to I. We define y0 = f(x0). In this article we will use the following notations:

∗ iliºská 1, Bratislava, Slovakia; E-mail: [email protected]
Figure 1: f (x) and g(y).
$$fn = \frac{d^n}{dx^n} f(x)\Big|_{x=x_0}, \qquad gn = \frac{d^n}{dy^n} g(y)\Big|_{y=y_0}.$$
The described objects are depicted in Figure 1.
2.2 Formulas
The main result of this paper is the formula for higher order derivatives of the inverse function. It has a recursive limit form and reads
$$g1 = \frac{1}{f1}, \qquad gn = \lim_{\Delta x \to 0} n!\, \frac{\Delta x - \sum_{i=1}^{n-1} \frac{gi}{i!}\left[\sum_{j=1}^{n} \frac{fj}{j!}(\Delta x)^j\right]^i}{\left[\sum_{i=1}^{n} \frac{fi}{i!}(\Delta x)^i\right]^n}. \tag{1}$$
This gives us for the first five derivatives
$$g1 = \frac{1}{f1},$$
$$g2 = -\frac{f2}{(f1)^3},$$
$$g3 = \frac{3(f2)^2 - (f1)(f3)}{(f1)^5}, \tag{2}$$
$$g4 = -\frac{15(f2)^3 - 10(f1)(f2)(f3) + (f1)^2(f4)}{(f1)^7},$$
$$g5 = \frac{105(f2)^4 - 105(f1)(f2)^2(f3) + 10(f1)^2(f3)^2 + 15(f1)^2(f2)(f4) - (f1)^3(f5)}{(f1)^9}.$$
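The explicit formulas above are straightforward to turn into code. Below is a minimal Python sketch (the function name and the numerical check are my own illustration, not part of the paper) implementing g1 ... g4; the check uses f(x) = x² at x0 = 2, the example treated in the next subsection, where g(y) = √y.

```python
def inverse_derivs(f1, f2=0.0, f3=0.0, f4=0.0):
    """First four derivatives of the inverse function from f1..f4 (formulas 2)."""
    g1 = 1.0 / f1
    g2 = -f2 / f1**3
    g3 = (3.0 * f2**2 - f1 * f3) / f1**5
    g4 = -(15.0 * f2**3 - 10.0 * f1 * f2 * f3 + f1**2 * f4) / f1**7
    return g1, g2, g3, g4

# f(x) = x^2 at x0 = 2: f1 = 4, f2 = 2, higher derivatives vanish
g1, g2, g3, g4 = inverse_derivs(4.0, 2.0)
print(g1, g2, g3, g4)  # 0.25 -0.03125 0.01171875 -0.00732421875
```

The printed values are 1/4, −1/32, 3/256 and −15/2048, which one can confirm by differentiating g(y) = √y at y0 = 4 directly.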
2.3 Examples
For the sake of convenience let us adopt the notation gn = F[f1, ..., fn]. This notation means that one uses formula 1 to calculate gn as a function of f1, ..., fn.
2.3.1 Example 1: f(x) = x², x0 = 2
Let us now calculate g1, g2 and g3 based on formula 1. We have
$$f1 = \frac{d}{dx} f(x)\Big|_{x=2} = 2x\big|_{x=2} = 4,$$
$$f2 = \frac{d^2}{dx^2} f(x)\Big|_{x=2} = 2\big|_{x=2} = 2,$$
$$f3 = \frac{d^3}{dx^3} f(x)\Big|_{x=2} = 0\big|_{x=2} = 0.$$
Using now the formulas 2 (which follow from 1) one obtains
$$g1 = F[f1] = \frac{1}{f1} = \frac{1}{4},$$
$$g2 = F[f1, f2] = -\frac{f2}{(f1)^3} = -\frac{2}{4^3} = -\frac{1}{32},$$
$$g3 = F[f1, f2, f3] = \frac{3(f2)^2 - (f1)(f3)}{(f1)^5} = \frac{3(2)^2 - (4)(0)}{(4)^5} = \frac{12}{1024} = \frac{3}{256}.$$
Since we know the corresponding inverse function g(y) = f⁻¹(y) = √y, we can explicitly verify the obtained results. We calculate the derivatives at the point y0 = f(x0) = f(2) = 4:
$$g1 = \frac{d}{dy} g(y)\Big|_{y=4} = \frac{1}{2\sqrt{y}}\Big|_{y=4} = \frac{1}{2\sqrt{4}} = \frac{1}{4} = F[f1],$$
$$g2 = \frac{d^2}{dy^2} g(y)\Big|_{y=4} = -\frac{1}{4\sqrt{y^3}}\Big|_{y=4} = -\frac{1}{4\sqrt{4^3}} = -\frac{1}{32} = F[f1, f2],$$
$$g3 = \frac{d^3}{dy^3} g(y)\Big|_{y=4} = \frac{3}{8\sqrt{y^5}}\Big|_{y=4} = \frac{3}{8\sqrt{4^5}} = \frac{3}{256} = F[f1, f2, f3].$$
We conclude that we obtain the same results either by using formula 1 or by explicit differentiation of the inverse function. It may however happen that the inverse function is not an elementary function. In such cases we can rely only on formula 1.
2.3.2 Example 2: f(x) = sin(x), x0 = 0
One has
$$f1 = \frac{d}{dx} f(x)\Big|_{x=0} = \cos(x)\big|_{x=0} = 1,$$
$$f2 = \frac{d^2}{dx^2} f(x)\Big|_{x=0} = -\sin(x)\big|_{x=0} = 0,$$
$$f3 = \frac{d^3}{dx^3} f(x)\Big|_{x=0} = -\cos(x)\big|_{x=0} = -1.$$
Using the formulas for the derivatives of the inverse function one gets
$$g1 = F[f1] = \frac{1}{f1} = \frac{1}{1} = 1,$$
$$g2 = F[f1, f2] = -\frac{f2}{(f1)^3} = -\frac{0}{1^3} = 0,$$
$$g3 = F[f1, f2, f3] = \frac{3(f2)^2 - (f1)(f3)}{(f1)^5} = \frac{3(0)^2 - (1)(-1)}{(1)^5} = \frac{1}{1} = 1.$$
Using the explicit form of the inverse function g(y) = arcsin(y) and calculating the derivatives at y0 = sin(0) = 0, we obtain once more an agreement:
$$g1 = \frac{d}{dy} g(y)\Big|_{y=0} = \frac{1}{\sqrt{1-y^2}}\Big|_{y=0} = \frac{1}{\sqrt{1-0^2}} = 1 = F[f1],$$
$$g2 = \frac{d^2}{dy^2} g(y)\Big|_{y=0} = \frac{y}{\sqrt{(1-y^2)^3}}\Big|_{y=0} = \frac{0}{\sqrt{(1-0^2)^3}} = 0 = F[f1, f2],$$
$$g3 = \frac{d^3}{dy^3} g(y)\Big|_{y=0} = \left[\frac{1}{\sqrt{(1-y^2)^3}} + \frac{3y^2}{\sqrt{(1-y^2)^5}}\right]_{y=0} = 1 = F[f1, f2, f3].$$
2.3.3 Example 3: f(x) = ln(x), x0 = 1
The last presented example - the natural logarithm - will serve us later in the last section. The first three derivatives are
$$f1 = \frac{d}{dx} f(x)\Big|_{x=1} = \frac{1}{x}\Big|_{x=1} = 1,$$
$$f2 = \frac{d^2}{dx^2} f(x)\Big|_{x=1} = -\frac{1}{x^2}\Big|_{x=1} = -1,$$
$$f3 = \frac{d^3}{dx^3} f(x)\Big|_{x=1} = \frac{2}{x^3}\Big|_{x=1} = 2.$$
The formulas 2 yield
$$g1 = F[f1] = \frac{1}{f1} = \frac{1}{1} = 1,$$
$$g2 = F[f1, f2] = -\frac{f2}{(f1)^3} = -\frac{(-1)}{1^3} = 1,$$
$$g3 = F[f1, f2, f3] = \frac{3(f2)^2 - (f1)(f3)}{(f1)^5} = \frac{3(-1)^2 - (1)(2)}{(1)^5} = \frac{1}{1} = 1.$$
One immediately recognizes the derivatives of the exponential function g(y) = f⁻¹(y) = e^y at the point y0 = ln(1) = 0. Once more the results agree.
2.4 A few comments

2.4.1 Properties of the formulas
Calculating the explicit form of the formulas for the first few higher order derivatives of the inverse function (see 2), one can make the following observations:

• Formula 1 has a recursive limit form. Observing the formulas 2, one would think of an expression that is neither recursive nor in the form of a limit. Although some terms are easy to guess (for example, the denominator of each gn is (f1)^{2n−1}), it does not seem straightforward to find such a formula.

• The formulas gn = F[f1, ..., fn] contain only basic arithmetical operations (addition, subtraction, multiplication and division). Thus one can - for example - expect them (at least in the first few cases) to be rapidly calculable on a computer.

• Although the formulas gn = F[f1, ..., fn] are simple, they grow rapidly. The formula for g10 already has 30 additive terms. We will comment on this property in the section dedicated to solving equations.
2.4.2 Property of duality
The operation of inversion has the property of duality. By that we understand that applying the inversion twice to a function leads back to the initial function: the inversion squared gives the identity. This property propagates into formula 1. Let g be the inverse function of f and h the inverse function of g. One actually has f ≡ h, but let us for a while think of f and h separately. Then one expects
$$hn = F[g1, \ldots, gn] = F[F[f1], \ldots, F[f1, \ldots, fn]] = fn.$$
We do not need to prove this equality, since it is a consequence of f, g and h being inverse (we however still need to prove formula 1). In spite of that, it is interesting to cross-check it. Let us focus on the first three cases:
$$h1 = \frac{1}{g1} = \frac{1}{\frac{1}{f1}} = f1,$$
$$h2 = -\frac{g2}{(g1)^3} = -\frac{-\frac{f2}{(f1)^3}}{\left(\frac{1}{f1}\right)^3} = f2,$$
$$h3 = \frac{3(g2)^2 - (g1)(g3)}{(g1)^5} = \frac{3\left(-\frac{f2}{(f1)^3}\right)^2 - \left(\frac{1}{f1}\right)\frac{3(f2)^2 - (f1)(f3)}{(f1)^5}}{\left(\frac{1}{f1}\right)^5} = f3.$$
This property was explicitly verified on a computer to be valid up to n = 8. Setting aside the knowledge that this property holds for every n because it is inherited from the inversion operation, it does not seem straightforward to prove it on a purely algebraic basis.
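The numerical side of this cross-check is easy to repeat. A small sketch (my own illustration) applies the explicit formulas for n ≤ 3 twice, with the values of Example 1 (f(x) = x² at x0 = 2):

```python
def F(d1, d2, d3):
    """gn = F[f1, ..., fn] for n = 1, 2, 3 (formulas 2)."""
    return 1/d1, -d2/d1**3, (3*d2**2 - d1*d3)/d1**5

f1, f2, f3 = 4.0, 2.0, 0.0          # f(x) = x^2 at x0 = 2
h1, h2, h3 = F(*F(f1, f2, f3))      # apply the inversion formulas twice
# Duality: inverting twice recovers the original derivatives
print(h1, h2, h3)  # 4.0 2.0 0.0
```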
3 Derivation of the higher-order derivatives formula for the inverse function
In this section we derive the general formula 1. We use the same notations as in the previous section and start with the derivation of the formulas for g1 and g2. The general case gn is treated afterwards.

3.1 Formulas for g1 and g2
The case of g1 is the well-known formula for the derivative of the inverse function. We nevertheless derive it in order to show the general scheme, which will then be applied to derive the formulas for the higher derivatives. Let us suppose that the function f can be approximated on some interval around x0 by its Taylor series. Taking into account only the linear term, one has
$$f(x_0 + \Delta x) \approx f(x_0) + a\,\Delta x$$
for ∆x small. Doing the same for g,
$$g(y_0 + \Delta y) \approx g(y_0) + b\,\Delta y,$$
and using the fact that f and g are inverse (considered true also for their linear approximations when ∆x is small), one obtains
$$x_0 + \Delta x = g[f(x_0 + \Delta x)] = g[f(x_0) + a\,\Delta x] = x_0 + ba\,\Delta x,$$
where we used a∆x = ∆y. Comparing the starting and the ending expression, one immediately has b = 1/a, which gives the relation between the coefficients a and b. The coefficients a and b are also the first derivatives of f and g at x0 and y0 respectively: a = f1 and b = g1. Thus one has
$$g1 = \frac{1}{f1}.$$
The derivation of the formula for g2 is somewhat more complicated; it however follows the scheme of the previous case. Let us approximate f and g up to the quadratic terms:
$$f(x_0 + \Delta x) \approx f(x_0) + a_1 \Delta x + a_2 \Delta x^2,$$
$$g(y_0 + \Delta y) \approx g(y_0) + b_1 \Delta y + b_2 \Delta y^2.$$
We already know the relation between the first derivatives, so one can write
$$g(y_0 + \Delta y) \approx g(y_0) + \frac{1}{a_1} \Delta y + b_2 \Delta y^2.$$
Now, applying the inverse-function property, which in our approximation holds for ∆x → 0 (and ∆y → 0), one has
$$x_0 + \Delta x = g[f(x_0 + \Delta x)] = g[f(x_0) + a_1 \Delta x + a_2 \Delta x^2] \qquad [\Delta y = a_1 \Delta x + a_2 \Delta x^2]$$
$$= g[f(x_0)] + \frac{1}{a_1}(a_1 \Delta x + a_2 \Delta x^2) + b_2 (a_1 \Delta x + a_2 \Delta x^2)^2$$
$$= x_0 + \Delta x + \frac{a_2}{a_1} \Delta x^2 + b_2 (a_1^2 \Delta x^2 + a_2^2 \Delta x^4 + 2 a_1 a_2 \Delta x^3).$$
Comparing the first and the last expression in the previous equations, one arrives at
$$b_2 = -\frac{a_2 \Delta x^2}{a_1 (a_1^2 \Delta x^2 + a_2^2 \Delta x^4 + 2 a_1 a_2 \Delta x^3)}.$$
This expression does depend on ∆x (because, unlike in the case of the linear approximation, the inverse of a quadratic function is not a quadratic function). However, as we stressed before, our calculations are valid only for ∆x → 0, where the power approximations of the functions hold. Taking in addition into account the relations between the polynomial coefficients and the derivatives, b2 = g2/2!, a2 = f2/2! and a1 = f1, one ends up with
$$g2 = \lim_{\Delta x \to 0} \frac{-2!\,\frac{f2}{2!}\Delta x^2}{(f1)\left[(f1)^2 \Delta x^2 + \left(\frac{f2}{2!}\right)^2 \Delta x^4 + 2(f1)\frac{f2}{2!}\Delta x^3\right]} = -\frac{f2}{(f1)^3}.$$
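This limit can also be checked symbolically. A short sketch (assuming SymPy is available; not part of the paper's derivation) reproduces g2 = −f2/(f1)³:

```python
import sympy as sp

f1, f2, dx = sp.symbols('f1 f2 dx', positive=True)
dy = f1*dx + (f2/2)*dx**2                      # quadratic Taylor increment of f
g2 = sp.limit(-2 * (f2/2) * dx**2 / (f1 * dy**2), dx, 0)
assert sp.simplify(g2 + f2/f1**3) == 0         # g2 equals -f2/(f1)^3
```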
3.2 General case
Following the scheme of the previous section, we can expand the functions f and g up to the n-th order:
$$f(x_0 + \Delta x) \approx f(x_0) + a_1 \Delta x + a_2 \Delta x^2 + \ldots + a_n \Delta x^n = f(x_0) + \sum_{j=1}^{n} a_j \Delta x^j,$$
$$g(y_0 + \Delta y) \approx g(y_0) + b_1 \Delta y + b_2 \Delta y^2 + \ldots + b_n \Delta y^n = g(y_0) + \sum_{i=1}^{n} b_i \Delta y^i.$$
Then, using the inverse-function property at the point x0 + ∆x, one has for small ∆x
$$x_0 + \Delta x = g[f(x_0 + \Delta x)] = g\left[f(x_0) + \sum_{j=1}^{n} a_j \Delta x^j\right] \qquad \left[\Delta y = \sum_{j=1}^{n} a_j \Delta x^j\right]$$
$$= x_0 + \sum_{i=1}^{n} b_i \left[\sum_{j=1}^{n} a_j \Delta x^j\right]^i = x_0 + \sum_{i=1}^{n-1} b_i \left[\sum_{j=1}^{n} a_j \Delta x^j\right]^i + b_n \left[\sum_{j=1}^{n} a_j \Delta x^j\right]^n.$$
Comparing the starting and the ending expression, one gets
$$b_n = \frac{\Delta x - \sum_{i=1}^{n-1} b_i \left[\sum_{j=1}^{n} a_j \Delta x^j\right]^i}{\left[\sum_{j=1}^{n} a_j \Delta x^j\right]^n}.$$
Taking into account that our calculations are valid only for small ∆x, applying the limit ∆x → 0 and using the relation between the polynomial coefficients and the derivatives, a_j = fj/j! and b_i = gi/i!, one arrives at the expression
$$gn = \lim_{\Delta x \to 0} n!\, \frac{\Delta x - \sum_{i=1}^{n-1} \frac{gi}{i!}\left[\sum_{j=1}^{n} \frac{fj}{j!}\Delta x^j\right]^i}{\left[\sum_{j=1}^{n} \frac{fj}{j!}\Delta x^j\right]^n},$$
which is identical to formula 1. Although the presented derivation of formula 1 might need some fine-tuning (careful treatment of approximations and limits) to be regarded as a rigorous mathematical proof, it represents the essence of such a proof.
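The derivation translates directly into a small recursive procedure. Below is a sketch (assuming SymPy; the function name is my own) that computes g1, ..., gn from f1, ..., fn by evaluating the limit in formula 1 symbolically:

```python
import sympy as sp

def inverse_derivatives(f_derivs):
    """Compute [g1, ..., gn] from [f1, ..., fn] via the recursive limit formula (1)."""
    dx = sp.Symbol('dx')
    n = len(f_derivs)
    # Taylor increment of f: dy = sum_j fj/j! * dx^j
    dy = sum(f_derivs[j-1] * dx**j / sp.factorial(j) for j in range(1, n + 1))
    g = []
    for m in range(1, n + 1):
        num = dx - sum(g[i-1] * dy**i / sp.factorial(i) for i in range(1, m))
        g.append(sp.limit(sp.factorial(m) * num / dy**m, dx, 0))
    return g

# f(x) = x^2 at x0 = 2: f1 = 4, f2 = 2, f3 = 0
print(inverse_derivatives([4, 2, 0]))  # [1/4, -1/32, 3/256]
```

The output agrees with the values obtained in Example 1 by differentiating g(y) = √y directly.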
4 Inverse function Taylor series, solving equations

4.1 Inverse function Taylor series
The knowledge of the derivatives of the inverse function enables us to expand the inverse function into a Taylor series. Knowing the initial function f, let us in addition suppose that we also know the value of its inverse function g at some point y0, g(y0) = x0. Calculating the numbers g1, g2, g3, ... from formula 1, one can express the value of g at some different point y1:
$$g(y_1) = g(y_0) + (g1)(y_1 - y_0) + \frac{1}{2!}(g2)(y_1 - y_0)^2 + \ldots = x_0 + \sum_{i=1}^{\infty} \frac{1}{i!}(gi)(y_1 - y_0)^i. \tag{3}$$
This naturally holds only on the condition that the Taylor series of g converges to g at y1.
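As a quick illustration (my own, using the derivatives found in section 2.3.2 for f = sin at x0 = 0, i.e. g1 = 1, g2 = 0, g3 = 1), the truncated series (3) already reproduces arcsin near y0 = 0:

```python
import math

x0, y0 = 0.0, 0.0
g = [1.0, 0.0, 1.0]                  # g1, g2, g3 for f = sin at x0 = 0
y1 = 0.1
approx = x0 + sum(gi / math.factorial(i + 1) * (y1 - y0)**(i + 1)
                  for i, gi in enumerate(g))
# Compare with the exact inverse function, arcsin:
print(abs(approx - math.asin(y1)))   # ~7.5e-7, three terms already suffice
```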
Figure 2: Solving equations
4.2 Solving equations
The knowledge of the inverse function (supposing it can be approximated by its Taylor series) immediately makes us think of solving real equations of a real variable. The equation f(xs) = 0 can be simply solved by applying the inverse function g to both sides of the equation: xs = g[f(xs)] = g(0). The situation is actually a little more complicated, because in the general case we do not know the value of g at any point. Consider the situation depicted in Figure 2. We can express g(0) through g(y0):
$$x_s = g(0) = g(y_0) + \sum_{i=1}^{\infty} \frac{1}{i!}(gi)(-y_0)^i.$$
To know g(y0), one cannot choose y0 arbitrarily (otherwise one would directly choose y0 = 0). The only possibility is to choose arbitrarily an x0 in some interval around xs where f is bijective. Then, using y0 = f(x0), one rewrites the previous equation in the form
$$x_s = g(0) = x_0 + \sum_{i=1}^{\infty} \frac{1}{i!}(gi)[-f(x_0)]^i, \tag{4}$$
or once more in the form
$$x_s = g(0) = x_0 + \sum_{i=1}^{\infty} \frac{1}{i!} F[f1, \ldots, fi]\,[-f(x_0)]^i,$$
where we explicitly see that all information needed to solve the equation f(xs) = 0 is contained in the function f itself, and the function f is supposed to be known. As already mentioned, this conclusion relies on the supposition that the Taylor series of the inverse function g converges to g at y = 0.
4.3 A few comments
• Many transcendental equations do not have a solution which could be expressed in closed form in terms of elementary functions. Thus the presented method provides us with the most general solution of an equation in the form of a (Taylor) series - one cannot expect better. The presented method has however two drawbacks: the arbitrary choice of x0 and the fact that the Taylor series of the inverse function need not converge at y = 0.

• Although we said that the choice of x0 is arbitrary, this is actually not completely true. One should try to find an x0 which is as close as possible to the solution value xs. If x0 is close to xs, then y0 is close to 0 and hence the series has a good chance to converge, and to converge rapidly.

• One can use a recursive approach to solve equations. Since the length of the formulas for higher order derivatives of the inverse function grows rapidly (see the comment in 2.4.1), one might prefer to use the Taylor polynomial of the inverse function with a smaller number of terms recursively, rather than to use it once with a big number of terms. If x0 is our starting point, one could for example use as an approximation of the inverse function a Taylor polynomial with 10 terms to calculate the approximate value of the solution xs,approx. One could however also use a polynomial with 5 terms and calculate the first approximate value of the solution xs,approx1. Then, in the next iteration, one would choose xs,approx1 as the starting point, recalculate all the derivatives of the initial and the inverse function, construct the corresponding inverse-function polynomial and arrive at a more accurate approximation of the solution xs,approx2. In this way one avoids the long formulas for the higher order inverse derivatives. We will show this approach in the examples which follow.
4.4 Examples
In this section we demonstrate the previous ideas on a couple of examples. Only the first example will be presented in full detail and will show both the method and its results. The second example exactly follows the method of the first one and thus focuses only on results.
4.4.1 Example 1 - solving an equation
Let us solve the equation x^x = 7. We need to transform it into the form f(x) = x^x − 7 = 0. As explained before, we should have some reasonable guess for the solution. Let the guess be x0 = 2 [f(x0) = y0 = 2² − 7 = −3]. Having our x0, we need to calculate the derivatives fn = dⁿ/dxⁿ (x^x − 7)|_{x=x0=2} up to the desired order - we will consider only the first 8 derivatives. Then, using formula 1, we calculate g1 ... g8 - the derivatives of the inverse function g(y) at the corresponding point y0 = −3. The numerical results for fn and gn are summarized in Table 1.

fn = dⁿ/dxⁿ (x^x − 7)|_{x=x0=2}      gn = dⁿ/dyⁿ g(y)|_{y=y0=−3}
f1 ≈ 6.77258                          g1 ≈ 0.14765
f2 ≈ 13.46698                         g2 ≈ −0.04335
f3 ≈ 28.57418                         g3 ≈ 0.02460
f4 ≈ 64.50134                         g4 ≈ −0.02070
f5 ≈ 151.34073                        g5 ≈ 0.02313
f6 ≈ 370.18129                        g6 ≈ −0.03227
f7 ≈ 929.27934                        g7 ≈ 0.05406
f8 ≈ 2409.09527                       g8 ≈ −0.10583

Table 1: Numerical values for f1 ... f8 and g1 ... g8.
fn = dⁿ/dxⁿ (x^x − 7)|_{x=x1=2.288709...}      gn = dⁿ/dyⁿ g(y)|_{y=y1=−0.34729...}
f1 ≈ 12.16106                                   g1 ≈ 0.08222
f2 ≈ 25.13703                                   g2 ≈ −0.01397
f3 ≈ 55.30717                                   g3 ≈ 0.00459
f4 ≈ 128.19501                                  g4 ≈ −0.00224

Table 2: Numerical values for f1, ..., f4 and g1, ..., g4 at x1 = 2.288709... and y1 = −0.34729....

Now we just need to insert the calculated values into formula 4:
$$x_{solution} \approx 2 + (0.14765)(3) + \frac{1}{2!}(-0.04335)(3)^2 + \ldots + \frac{1}{8!}(-0.10583)(3)^8,$$
$$x_{solution} \approx 2.309117\ldots.$$
We cross-check the obtained value: 2.309117...^{2.309117...} = 6.90620.... We see that our approach indeed gives a good approximation of the solution. For a more precise value, one should either include higher orders, have a better initial guess, or use an iterative procedure, as we are going to demonstrate now. If we proceed in the same way as in the previous paragraph, but include only 4 terms in the Taylor polynomial, we obtain
$$x_1 \approx 2 + (0.14765)(3) + \frac{1}{2!}(-0.04335)(3)^2 + \ldots + \frac{1}{4!}(-0.02070)(3)^4,$$
$$x_1 \approx 2.288709\ldots.$$
We take the value x1 = 2.288709... as our starting point and recalculate at this point the first four derivatives f1, ..., f4 of f. Subsequently we calculate the first four derivatives g1, ..., g4 of g at the corresponding point y1 = 2.288709...^{2.288709...} − 7 = −0.34729.... The results are summarized in Table 2. Using formula 4 we get
$$x_{solution} \approx 2.288709 + (0.08222)(0.34729\ldots) + \ldots + \frac{1}{4!}(-0.00224)(0.34729\ldots)^4,$$
$$x_{solution} \approx 2.31645\ldots.$$
Calculating 2.31645...^{2.31645...} = 6.999999205..., we observe that the precision becomes very good, better than in the previous calculation. From this one can learn that one does not need to include high orders in order to arrive at a good approximation of the solution. It seems more efficient to use a smaller number of orders in an iterative way. For the theory it is however interesting to have the general expression for all higher orders, even if they may not be used in practice.
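The first pass of this example can be reproduced in a few lines. The sketch below (assuming SymPy; the variable names are my own) evaluates the limit in formula 1 by extracting the leading ∆x-coefficient, which is equivalent for these expressions since the numerator is O(∆x^m) and the denominator behaves like (f1)^m ∆x^m, and then applies formula 4:

```python
import sympy as sp

x, dx = sp.symbols('x dx')
f = x**x - 7
x0, N = 2, 8

# f1..fN at x0 (evaluated numerically to keep the expansions small)
fd = [sp.N(sp.diff(f, x, k).subs(x, x0)) for k in range(1, N + 1)]

# g1..gN from formula (1): the limit equals m! * [dx^m-coefficient] / f1^m
dy = sum(fd[j-1] * dx**j / sp.factorial(j) for j in range(1, N + 1))
g = []
for m in range(1, N + 1):
    num = sp.expand(dx - sum(g[i-1] * dy**i / sp.factorial(i)
                             for i in range(1, m)))
    g.append(sp.factorial(m) * num.coeff(dx, m) / fd[0]**m)

# Formula (4): x_s = x0 + sum_i gi/i! * (-f(x0))^i
y0 = f.subs(x, x0)   # = -3
xs = x0 + sum(g[i-1] * (-y0)**i / sp.factorial(i) for i in range(1, N + 1))
print(sp.N(xs))      # ~2.3091
```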
4.4.2 Example 2 - calculating π and e
One can use the previous approach to calculate approximate values of important mathematical constants like π or e. One just needs to formulate equations that have these constants as solutions.
For π one can think of solving cos(x/4) = √2/2. So one defines f(x) = cos(x/4) − √2/2 and applies the inverse-function method to solve f(x) = 0. If we choose x0 = 3 and include the first 8 derivatives in our calculations, we obtain as an approximation of π
$$\pi \approx 3.14159265357636280318084094735\ldots,$$
where 10 decimal places are correct. If we use the iterative way, including 4 derivatives and iterating just twice (as in the example in 4.4.1), we get
$$\pi \approx 3.14159265358979323846269213732\ldots,$$
where 22 decimal digits are correct.
For e one immediately thinks of the equation ln(e) = 1, from which one defines f(x) = ln(x) − 1 and solves f(x) = 0. Taking x0 = 2 and using the inverse-function method with 8 derivatives, we get
$$e \approx 2.71828182832191473183292924148\ldots,$$
where 9 decimal places are correct. Using the iterative method with 4 derivatives, we obtain
$$e \approx 2.71828182845904444095135772192\ldots,$$
where 14 decimal digits are correct.
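The case of e is particularly transparent: with f(x) = ln(x) − 1 and x0 = 2, the inverse is g(y) = e^{y+1}, so every derivative gn at y0 = ln 2 − 1 equals e^{y0+1} = 2 (the formulas 2 give the same values from f1 = 1/2, f2 = −1/4, ...), and formula 4 reduces to a plain exponential series. A minimal sketch (my own illustration):

```python
import math

x0 = 2.0
y0 = math.log(x0) - 1.0          # f(x0) for f(x) = ln(x) - 1
gn = 2.0                         # every derivative of g(y) = e^(y+1) at y0
N = 15
e_approx = x0 + sum(gn * (-y0)**i / math.factorial(i) for i in range(1, N + 1))
print(e_approx)                  # ~2.718281828459045
```

Taking more terms than in the paper's 8-derivative run, the truncated series already reaches machine precision.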
4.4.3 Example 3 - generating random numbers
Higher order derivatives of the inverse function and the expression of the inverse function in the form of a Taylor series can serve not only for solving equations but also for other purposes. We provide such an example in this section. Let us formulate the problem: given a random number generator which provides uniformly distributed random numbers from the interval [0, 1), we would like to obtain numbers which are distributed according to a given probability density function ρ(x). This kind of problem often appears in practice, since computers usually offer uniformly distributed numbers from the interval [0, 1) and it is up to the user/programmer to obtain from them numbers that are distributed according to a given probability density function. The solution of the problem is simple:

• construct the distribution function f(x) = ∫_{−∞}^{x} ρ(y) dy,

• construct the function g inverse to f: g[f(x)] = f[g(x)] = x,

• if the numbers x1, x2, ..., xn from the interval [0, 1) respect the uniform distribution, then the numbers y1 = g(x1), y2 = g(x2), ..., yn = g(xn) respect the distribution ρ(x).

The methods presented in this article allow us to apply this solution, supposing that we know the value of the distribution function f at least at one point x0 (this is often the case - for example, for symmetric distributions f(x0 = 0) = 1/2). We simply construct the Taylor series for g using formula 3. The example we are going to present will in the end turn out not to be so convincing; it will however demonstrate our ideas well.
Let our aim be the construction of a random number generator respecting the normal distribution ρ(x) = (1/√(2π)) exp(−x²/2). The corresponding distribution function f(x) = (1/√(2π)) ∫_{−∞}^{x} exp(−y²/2) dy is not an elementary function. We however know its value and the values of all its derivatives fn at the point x0 = 0. Using formula 1, we calculate the derivatives gn of the inverse function g at the point y0 = f(x0 = 0) = 1/2. Taking into account the first 9 derivatives, we get the numerical results summarized in Table 3.

fn = dⁿ/dxⁿ f(x)|_{x=x0=0}        gn = dⁿ/dyⁿ g(y)|_{y=y0=0.5}
f0 = f(x0 = 0) = 0.5              g0 = g(y0 = 0.5) = 0
f1 ≈ 0.398942                     g1 ≈ 2.506628
f2 = 0                            g2 = 0
f3 ≈ −0.398942                    g3 ≈ 15.749609
f4 = 0                            g4 = 0
f5 ≈ 1.196826                     g5 ≈ 692.704024
f6 = 0                            g6 = 0
f7 ≈ −5.984134                    g7 ≈ 78964.749174
f8 = 0                            g8 = 0
f9 ≈ 41.888939                    g9 ≈ 17068346.560783

Table 3: Numerical values for fn and gn at x0 = 0 and y0 = 0.5, with f(x) = (1/√(2π)) ∫_{−∞}^{x} exp(−y²/2) dy and g = f⁻¹.

Straightforward use of formula 3 (we use x for the variable name rather than y) yields
$$g(x) \approx 0 + (2.506628)(x - 0.5) + \frac{1}{3!}(15.749609)(x - 0.5)^3 + \ldots + \frac{1}{9!}(17068346.560783)(x - 0.5)^9.$$

Figure 3: The approximation of the function g, g = f⁻¹, f(x) = (1/√(2π)) ∫_{−∞}^{x} exp(−y²/2) dy.

The graph of the obtained function is plotted in Figure 3. One clearly sees that our random number generator would not produce numbers smaller than approximately −2 or bigger than approximately 2. Since it should in principle produce numbers from −∞ to +∞, one cannot be really satisfied with the result. The difference between our generator and a standard one can be well observed in Figure 4, where distributions with high statistics from both generators are compared. A simple way of healing our generator would be to include higher order terms in the Taylor polynomial approximating the function g. The calculation of such terms however relies on the formulas for gn, and the calculation of these formulas starts to become complicated (= time consuming) with increasing n (n > 10), even when computers are used. Nevertheless, this could be done once and then used forever. One should also check whether the Taylor series for g converges to g. Although the presented example does not approximate the normal-distribution random number generator with great accuracy, the ideas presented in this paragraph provide a very general and powerful method of producing random numbers with a desired distribution from uniformly distributed random numbers.
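The generator described above can be sketched as follows (a minimal illustration using the Table 3 coefficients; the function and variable names are mine):

```python
import math
import random

# Taylor coefficients g1..g9 of g = f^{-1} at y0 = 0.5 (Table 3)
G = [2.506628, 0.0, 15.749609, 0.0, 692.704024,
     0.0, 78964.749174, 0.0, 17068346.560783]

def approx_normal(u):
    """Map a uniform number u from [0, 1) to an approximately normal deviate."""
    t = u - 0.5
    return sum(gn / math.factorial(n + 1) * t**(n + 1) for n, gn in enumerate(G))

sample = [approx_normal(random.random()) for _ in range(10)]
```

As discussed above, the truncated polynomial saturates near ±2, so the tails of the resulting distribution are cut off.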
5 Approximating the function beyond the radius of convergence of its Taylor series
As the last point we would like to mention an interesting property which can be explored using the techniques presented in this article. These techniques allow us - at least for some functions - to predict in a canonical way the value of the function
Figure 4: Plots produced with high statistics of random numbers. Dashed line - standard Gaussian; solid line - our approximation of the Gaussian.
f at some point x1 only from the value of f and its derivatives at the point x0, even if x1 is beyond the radius of convergence of the Taylor series! One can demonstrate it very well on the example from section 2.3.3 - and we will do so. Let f(x) = ln(x) and x0 = 1. There is a lot of knowledge about the function ln(x), but let us in this section allow only the knowledge of its value and the values of its derivatives at x0 = 1. Using only this knowledge, let us try to predict the value of f(x) at the point x1 = 3. Having the set of numbers f0, f1, f2, ..., one would naturally think of constructing the Taylor series. This idea would however fail. It is a well-known fact that the radius of convergence of the Taylor series of the function ln(x) at x0 = 1 is 1. The point x1 = 3 is simply too far from 1; the series diverges at this point. There is however another possibility of calculating f(3). Let us first calculate the derivatives of the inverse function g = f⁻¹ = exp(y) at y0 = f(x0 = 1) = 0. The result is obvious: 1 = g0 = g1 = g2 = ... (see 2.3.3). Then one just needs to expand g = exp(y) into a Taylor series. It is a well-known fact that this series converges for every y ∈ R! Thus we can draw the graph of the function g, and it is just a technicality to find the value y1 such that g(y1) = x1 = 3. The number y1 is the number we are looking for. Let us briefly summarize in a general way:

• We have the knowledge of the numbers f0, f1, f2, ... - the value and the derivatives of the function f at the point x0.

• We calculate g1, g2, ... - the derivatives of its inverse function g at the point y0 = f(x0).

• We expand the function g into a Taylor series. If it converges on an appropriate interval, we can calculate the value of f at some point x1 - we may use for that purpose some iterative procedure.

The iterative procedure in our example with f(x) = ln(x) and g(y) = exp(y) could look like
1. y := 0, step := 1, precision := 0.001
2. if [ g(y) < 3 and step > 0 ] then (y := y + step)
3. if [ g(y) < 3 and step < 0 ] then (step := −0.5 ∗ step and y := y + step)
4. if [ g(y) > 3 and step < 0 ] then (y := y + step)
5. if [ g(y) > 3 and step > 0 ] then (step := −0.5 ∗ step and y := y + step)
6. if [ |g(y) − 3| < precision ] return y, else goto 2.

Many problems in mathematics or the natural sciences are solved using power series. In some cases the series does not converge simply because the parameter in which it is built is too big (for example, the series in powers of the strong coupling constant αs in particle physics at low energy). One may wonder whether the presented approach would be applicable to this kind of problems. It is however for the moment not clear for which class of functions such an approach would work.
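The pseudocode above translates directly into Python. The sketch below (my own; the function names are illustrative) evaluates g by its everywhere-convergent Taylor series and recovers f(3) = ln 3:

```python
import math

def g(y, terms=40):
    """g(y) = exp(y) via its Taylor series, which converges for every real y."""
    return sum(y**n / math.factorial(n) for n in range(terms))

def find_y(target=3.0, precision=1e-9):
    """Step-halving search for y with g(y) = target (steps 1-6 above)."""
    y, step = 0.0, 1.0
    while abs(g(y) - target) >= precision:
        if (g(y) < target) != (step > 0):   # stepping the wrong way: turn around
            step = -0.5 * step
        y += step
    return y

print(find_y())   # ~1.0986 (= ln 3)
```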
6 Conclusion
In this article we presented the formula for the calculation of higher order derivatives of the inverse function. We also tried to demonstrate on examples that the knowledge of the derivatives of the inverse function leads to interesting mathematical applications. There are still many open questions. What is the relation between the radii of convergence of the Taylor series of the initial and the inverse function? Is it possible to generalize the presented ideas to more complex mathematical objects, for example operators? Since many problems in practice lead to functional equations, such a generalization could be of great interest.