Consider the following function to optimize. $$ f(x) = x^2 + x -2\sqrt{x} $$
f(x) = x^2 + x - 2*sqrt(x)
I just created a very cool function!
We can rewrite this function, and add a second method for integer arguments to illustrate Julia's multiple dispatch.
g(x::Float64) = x*(x+1) - 2*sqrt(x)   # same function as f, in factored form
g(x::Int) = 1+1                       # a separate method, selected when the argument is an integer
g(1.0)
g(1)
methods(g)
using Plots
pyplot()   # select the PyPlot plotting backend (requires the PyPlot package)
xmin = 0.0
xmax = 1.5
plot(g, xmin, xmax)
We first compute the Fibonacci numbers, which the Fibonacci search method below relies on.
N = 50
F = ones(N)   # F[1] = F[2] = 1
for i = 3:N
    F[i] = F[i-1] + F[i-2]
end
F
F[length(F)]
F[0]
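Note that Julia arrays are 1-indexed, so F[0] throws a BoundsError.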
Assume that we know that the minimizer lies in $[0,1]$.
xmin = 0
xmax = 1.0
verbose = true
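The Fibonacci search method keeps two probe points inside the current interval $[x_{\min}, x_{\max}]$ of width $d$. At step $k$ they are placed at $$ x_G = x_{\min} + \frac{F_{N-k-1}}{F_{N-k+1}}\, d, \qquad x_D = x_{\min} + \frac{F_{N-k}}{F_{N-k+1}}\, d, $$ and the interval is reduced to the part that must contain the minimizer by comparing $g(x_G)$ and $g(x_D)$. The implementation below follows this scheme, using the vector F of Fibonacci numbers computed above as a global.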
function fibonacci(g::Function, xmin, xmax, verbose::Bool = false)
    # Fibonacci search; uses the global vector F of Fibonacci numbers and its length N
    k = 1
    i = 1
    d = xmax - xmin
    xG = xmin+(F[N-2]/F[N])*d
    xD = xmin+(F[N-1]/F[N])*d
    fG = g(xG)
    fD = g(xD)
    if (verbose)
        println("Iteration $k.\nxmin = $xmin, xmax = $xmax")
        println("xG = $xG, fG = $fG")
        println("xD = $xD, fD = $fD")
        println("d = $d")
    end
    while (k < N-2)
        k += 1
        i += 1
        if fG < fD
            # The minimizer lies in [xmin, xD]
            xmax = xD
            d = xmax - xmin
            xD = xG
            fD = fG
            xG = xmin+(F[N-k-1]/F[N-k+1])*d
            fG = g(xG)
        elseif fG > fD
            # The minimizer lies in [xG, xmax]
            xmin = xG
            d = xmax - xmin
            xG = xD
            fG = fD
            xD = xmin+(F[N-k]/F[N-k+1])*d
            fD = g(xD)
        elseif fG == fD
            # The minimizer lies in [xG, xD]; both probe points must be recomputed
            k += 1
            if (k < N-2)
                xmin = xG
                xmax = xD
                d = xmax - xmin
                xG = xmin+(F[N-k-1]/F[N-k+1])*d
                xD = xmin+(F[N-k]/F[N-k+1])*d
                fG = g(xG)
                fD = g(xD)
            end
        end
        if verbose
            println("Iteration $i.\nxmin = $xmin, xmax = $xmax, k = $k")
            println("xG = $xG, fG = $fG")
            println("xD = $xD, fD = $fD")
            println("d = $d")
        end
    end
    return [xmin, xmax]
end
methods(fibonacci)
bounds = fibonacci(g, xmin, xmax, true)
bounds[2]-bounds[1]
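The golden section search replaces the Fibonacci ratios by the constant golden ratio $\varphi = (1+\sqrt{5})/2$: at each iteration the interval is reduced by a factor $1/\varphi \approx 0.618$. A simple implementation follows.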
function golden(f::Function, a, b, tol::Float64 = 1e-6)
    # Golden section search on [a, b]
    if (b < a)
        # Swap the bounds so that a <= b
        t = b
        b = a
        a = t
    end
    d = b - a
    # Golden ratio
    gr = (sqrt(5) + 1) / 2
    # Initial probe points
    c = b - d / gr
    d = a + d / gr
    while (abs(c - d) > tol)
        if f(c) < f(d)
            b = d
        else
            a = c
        end
        # Recompute the probe points from the reduced interval
        c = b - (b - a) / gr
        d = a + (b - a) / gr
    end
    return a, b, (b + a) / 2
end
golden(g, 0.0, 1.0)
golden(g, 0.0, 1.0, 1e-8)
Some optimization routines are directly available in the Optim package, which can be installed and loaded with the commands
using Pkg
Pkg.add("Optim")
using Optim
The basic routine is optimize: it takes the function to minimize as its first argument and, for univariate functions, the lower and upper bounds of the search interval as its second and third arguments.
The Golden Section search method can be seen as the limit of the Fibonacci approach as $N \rightarrow \infty$, so that $N$ no longer needs to be specified. We select it by passing GoldenSection() as the fourth argument.
result = optimize(g, 0, 1, GoldenSection())
Optim.minimizer(result)
bounds
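The result object exposes more information. As a sketch, assuming the standard Optim accessors, we could also query the minimum value, the number of iterations, and the convergence flag.
Optim.minimum(result)
Optim.iterations(result)
Optim.converged(result)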
The derivative of $f$ is $$ f'(x) = 2x+1-\frac{1}{\sqrt{x}}, $$ which can be translated in Julia as
df(x) = 2x+1-1.0/sqrt(x)
Set $f'(x) = 0$, i.e. $$ \frac{1}{\sqrt{x}} = 2x+1, $$ or, squaring both sides, $$ \frac{1}{x} = 4x^2 + 4x + 1. $$ Multiplying by $x$, we therefore have to look for the roots of the polynomial $$ 4x^3 + 4x^2 + x - 1 = 0. $$ Not easy! We will use the Roots root-finding library.
# Pkg.add("Roots")
using Roots
h(x) = x*(4x*(x+1)+1)-1
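Here h is the cubic $4x^3 + 4x^2 + x - 1$ written in Horner form, which reduces the number of multiplications. As a quick sanity check (at an arbitrarily chosen test point), we can verify that it matches the expanded polynomial.
h(0.3) ≈ 4*0.3^3 + 4*0.3^2 + 0.3 - 1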
The function fzeros aims to find all the zeros of a function over an interval, but it is quite slow. We will first look for a single zero of the function in the interval $[0,1]$.
?fzero
fzero(h, 0, 1)
fzeros(h, 0, 1)
We can do it explicitly by coding our own bisection function.
function bisection(f::Function, a::Float64, b::Float64, δ::Float64 = 1e-8)
    k = 1
    # Swap the bounds so that a <= b
    if (a > b)
        c = a
        a = b
        b = c
    end
    fa = f(a)
    fb = f(b)
    if fa == 0
        return k, fa, a, a
    elseif fb == 0
        return k, fb, b, b
    end
    if fa*fb > 0
        println("The function must be of opposite signs at the bounds")
        return
    end
    d = b-a
    c = a+d/2
    fc = f(c)
    while (d > δ)
        if (verbose)   # relies on the global verbose flag
            println("$k. a = $a, b = $b, d = $d, c = $c, fc = $fc")
        end
        k += 1
        if (fc == 0)
            a = b = c
            break
        elseif (fc*fa < 0)
            b = c
            fb = fc
        else
            a = c
            fa = fc
        end
        d = b-a
        c = a+d/2
        fc = f(c)
    end
    return k, fc, a, b
end
methods(bisection)
X = bisection(df, 0.0, 1.0)
X = bisection(df, 0.0, 1.0, 1e-11)
df(1.0)
df(2.0)
X = bisection(df, 1.0, 2.0, 1e-11)
X
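Since df is positive at both 1.0 and 2.0, bisection detects that the function values have the same sign at the bounds, prints a warning, and returns nothing.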
The second derivative of $f$ is $$ f''(x) = 2+\frac{1}{2}x^{-\frac{3}{2}}. $$
function d2f(x::Float64)
    return 2+x^(-3/2)/2
end
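Newton's method searches for a stationary point of $f$ by iterating $$ x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}, $$ stopping when $|f'(x_k)| \leq \delta$ or when a maximum number of iterations is reached.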
A basic implementation of the Newton approach follows.
function Newton(f::Function, df::Function, d2f::Function, xstart::Float64, δ::Float64 = 1e-8, nmax::Int64 = 100)
    k = 1
    x = xstart
    if (verbose)   # relies on the global verbose flag
        fx = f(x)
        println("$k. x = $x, f(x) = $fx")
    end
    dfx = df(x)
    while (abs(dfx) > δ && k <= nmax)
        k += 1
        x = x-dfx/d2f(x)   # Newton step
        dfx = df(x)        # derivative at the new iterate, used in the stopping test
        if (verbose)
            fx = f(x)
            println("$k. x = $x, f(x) = $fx")
        end
    end
    return x
end
verbose = true
Newton(f, df, d2f, 0.1)
verbose = true
Newton(f, df, d2f, 100.0)
x0 = 3.0
x1 = x0-df(x0)/d2f(x0)
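From $x_0 = 3.0$, a single Newton step already lands at a negative value, outside the domain of $f$, which explains why the method breaks down when started too far from the minimizer.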
verbose = true
Newton(f, df, d2f, 0.5)
Starting from a point closer to the minimizer, we converge faster to the optimal solution.
It is not always easy to explicitly compute the derivative of a function. It is however possible to exploit the definition of the derivative in order to numerically approximate it. Let $f$ be differentiable at $x$. The derivative is defined as $$ f'(x) = \lim_{\epsilon \rightarrow 0} \frac{f(x+\epsilon)-f(x)}{\epsilon}. $$ We can therefore approximate the derivative by choosing $\epsilon$ small enough and computing $$ f'(x) \approx \frac{f(x+\epsilon)-f(x)}{\epsilon}, $$ for instance
ϵ = 1e-4
dgfd(x) = (g(x+ϵ)-g(x))/ϵ
Applying the bisection method to that approximation, we obtain
fzero(dgfd, 0, 1)
We cannot choose $\epsilon$ arbitrarily small, as illustrated below.
# fd: Finite difference
dffd(x, ϵ) = (f(x+ϵ)-f(x))/ϵ
x = 1.0
errfd(ϵ) = abs(df(x)-dffd(x, ϵ))
plot(errfd, 1e-14,1e-12)
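For such small values of $\epsilon$, the difference $f(x+\epsilon)-f(x)$ is dominated by floating-point cancellation, so the error of the forward difference grows as $\epsilon$ decreases.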
ϵ = 1e-40
dgfd(x) = (g(x+ϵ)-g(x))/ϵ
fzero(dgfd,0,1)
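With $\epsilon = 10^{-40}$, $x+\epsilon$ rounds to $x$ in double precision for any $x$ not extremely close to zero, so the difference quotient collapses to zero and the root finder cannot locate the stationary point.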
The method can be refined using the central difference, defined as $$ f'(x) \approx \frac{f(x+\epsilon)-f(x-\epsilon)}{2\epsilon} $$
dfcd(x, ϵ=1e-6) = (f(x+ϵ)-f(x-ϵ))/(2*ϵ)
x = 1.0
errcd(ϵ) = abs(df(x)-dfcd(x, ϵ))
plot(errcd, 1e-18,0.1e-14)
The central difference provides smaller numerical errors, but at the expense of one additional function evaluation.
Numerical derivatives are often expensive to compute, especially in multivariate problems, so we will instead turn to automatic differentiation.
Newton(f, dfcd, d2f, 1.1)
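We can also approximate the second derivative by applying a forward difference to the central-difference approximation of the first derivative.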
dfhd(x, ϵ=1e-4) = (dfcd(x+ϵ)-dfcd(x))/ϵ
Newton(f, dfcd, dfhd, 1.1)
import Pkg
Pkg.add("ForwardDiff")
using ForwardDiff
g2 = x -> ForwardDiff.derivative(f, x)
Newton(f, g2, dfhd, 1.1)
errfd(x) = abs(df(x)-g2(x))   # error of the automatic derivative with respect to the analytic one
plot(errfd, 1,1.1)
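As a sketch, assuming nested ForwardDiff.derivative calls (which ForwardDiff supports through its dual numbers), we could also obtain the second derivative by automatic differentiation and use it in Newton's method.
d2g = x -> ForwardDiff.derivative(g2, x)   # second derivative of f via nested automatic differentiation
Newton(f, g2, d2g, 1.1)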