{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Computer session 2\n",
"\n",
"The goal of this computer class is to get a good feel of the Newton method and its\n",
"variants. In a (maybe) surprising way, we actually start with the dichotomy method\n",
"in the one-dimensional case.\n",
"\n",
"## The dichotomy method in the one-dimensional case\n",
"\n",
"When trying to solve the equation $\\phi(x) = 0$ in the one-dimensional case, the\n",
"most naive method, which actually turns out to be quite efficient, is the dichotomy\n",
"method. Namely, starting from an initial pair $(a_L , a_R ) \\in \\mathbb{R}^2$ with $a_L < a_R$ such\n",
"that $\\phi(a_L)\\phi(a_R)<0$, we set $b =\\frac{a_L+a_R}{2}$. If $\\phi(b) = 0$, the algorithm stops. If\n",
"$\\phi(a_L)\\phi(b) < 0$ we set $a_L\\to a_L$ and $a_R \\to b$. In this way, we obtain a linearly\n",
"converging algorithm. In particular, it is globally converging.\n",
"\n",
"\n",
"Write a function `Dichotomy(phi,aL,aR,eps)` that take sas argument\n",
"a function `phi`, an initial guess `aL,aR` and a tolerance `eps` and that runs the dichotomy\n",
"algorithm. Your argument should check that the condition $\\phi(a_L)\\phi(a_R) < 0$ is satisfied,\n",
"stop when the function `phi` reaches a value lower than `eps` and return the number\n",
"of iteration. Run your algorithm on the function $f = tanh$ with initial guesses\n",
"$a_L = 20$ , $a_R = 3$.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T16:19:14.314484Z",
"start_time": "2025-03-18T16:19:13.728014Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(-2.3283064365386963e-09, 31)\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def dichotomy(phi, aL, aR, eps=1e-8):\n",
" iter = 0\n",
" b = (aL + aR) / 2\n",
" while (aR - aL) / 2 > eps and phi(b) != 0:\n",
" b = (aL + aR) / 2\n",
" if phi(aL) * phi(b) < 0:\n",
" aR = b\n",
" else:\n",
" aL = b\n",
" iter += 1\n",
" return b, iter\n",
"\n",
"\n",
"f = lambda x: np.tanh(x)\n",
"aL, aR = -20, 3\n",
"print(dichotomy(f, aL, aR))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Solving one-dimensional equation with the Newton and the secant method\n",
"\n",
"We work again in the one-dimensional case with a function φ we want to find the\n",
"zeros of.\n",
"\n",
"### Newton method\n",
"\n",
"Write a function `Newton(phi,dphi,x0,eps)` that takes, as arguments, a function\n",
"`phi`, its derivative `dphi`, an initial guess `x0` and a tolerance `eps`\n",
"and that returns an approximation of the solutions of the equation $\\phi(x) = 0$. The\n",
"tolerance criterion should again be that $|\\phi| ≤\\text{\\texttt{eps}}$. Your\n",
"algorithm should return an error message in the following cases:\n",
"1. If the derivative is zero (look up the `try` and `except` commands in Python).\n",
"2. If the method diverges.\n",
"\n",
"Apply this code to the minimisation of $x\\mapsto \\ln(e^x + e^{x})$, with initial condition `x0=1.8`.\n",
"Compare this with the results of Exercise 3.10."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T16:19:17.447647Z",
"start_time": "2025-03-18T16:19:17.442560Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(inf, 'Method diverges')\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/tp/_ld5_pzs6nx6mv1pbjhq1l740000gn/T/ipykernel_25957/3868809151.py:14: RuntimeWarning: overflow encountered in exp\n",
" f = lambda x: np.log(np.exp(x) + np.exp(-x))\n"
]
}
],
"source": [
"def Newton(phi, dphi, x0, eps=1e-10):\n",
" iter = 0\n",
" while abs(phi(x0)) > eps:\n",
" try:\n",
" x0 -= phi(x0) / dphi(x0)\n",
" except ZeroDivisionError:\n",
" return np.inf, \"Derivative is zero\"\n",
" iter += 1\n",
" if iter > 10000 or phi(x0) == np.inf:\n",
" return np.inf, \"Method diverges\"\n",
" return x0, iter\n",
"\n",
"\n",
"f = lambda x: np.log(np.exp(x) + np.exp(-x))\n",
"x0 = 1.8\n",
"df = lambda x: (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))\n",
"print(Newton(f, df, x0))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Secant method\n",
"\n",
"Write a function `Secant(phi,x0,x1,eps)` that takes, as arguments, a function `phi`, two initial positions `x0`, `x1` and a tolerance\n",
"`eps` and that returns an approximation of the solutions of the equation\n",
"$\\phi(x) = 0$. The tolerance criterion should again be that $|\\phi| ≤\\text{\\texttt{eps}}$. Apply this code to the minimisation\n",
"of $x\\mapsto \\ln(e^x + e^{x})$, with initial conditions `x0=1`, `x1=1.9`, then\n",
"`x0=1`, `x1=2.3` and\n",
"`x0=1`, `x1=2.4`. Compare with the results of Exercise 3.10."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T16:19:19.649523Z",
"start_time": "2025-03-18T16:19:19.456149Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(inf, 'Method diverges')\n",
"(inf, 'Method diverges')\n",
"(inf, 'Method diverges')\n"
]
}
],
"source": [
"def Secant(phi, x0, x1, eps=1e-8):\n",
" iter = 0\n",
" while abs(phi(x1)) > eps:\n",
" x1, x0 = x1 - phi(x1) * (x1 - x0) / (phi(x1) - phi(x0)), x0\n",
" iter += 1\n",
" if iter > 10000:\n",
" return np.inf, \"Method diverges\"\n",
" return x0, iter\n",
"\n",
"\n",
"f = lambda x: np.log(np.exp(x) + np.exp(-x))\n",
"xx = [(1, 1.9), (1, 2.3), (1, 2.4)]\n",
"\n",
"for x0, x1 in xx:\n",
" print(Secant(f, x0, x1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Combining dichotomy and the Newton method\n",
"\n",
"A possibility to leverage the advantages of dichotomy (the global convergence of\n",
"the method) and of the Newton method (the quadratic convergence rate) is to\n",
"combine both: start from an initial interval `[aL,aR]` of length `InitialLength`\n",
"with $\\phi(a_L)\\phi(a_R)<0$ and fix a real\n",
"number $s \\in [0; 1]$. Run the dichotomy algorithm until the new interval is of length\n",
"`s*InitialLength`. From this point on, apply the Newton method.\n",
"\n",
"Implement this algorithm with `s = 0.1`. Include a possibility to switch\n",
"back to the dichotomy method if, when switching to the Newton method, the new\n",
"iterate falls outside of the computed interval `[aL,aR]`. Apply this to the minimisation\n",
"of the function $f x\\mapsto \\ln(e^x + e^{x})$ with an initial condition that made the Newton\n",
"method diverge. What can you say about the number of iterations?"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T16:44:41.592150Z",
"start_time": "2025-03-18T16:44:41.584318Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(inf, 'Method diverges')\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/tp/_ld5_pzs6nx6mv1pbjhq1l740000gn/T/ipykernel_25957/1578277506.py:23: RuntimeWarning: overflow encountered in exp\n",
" f = lambda x: np.log(np.exp(x) + np.exp(-x))\n"
]
}
],
"source": [
"def DichotomyNewton(phi, dphi, aL, aR, s=0.1, eps=1e-10):\n",
" iter = 0\n",
" inital_length = aR - aL\n",
" while (aR - aL) >= s * inital_length:\n",
" b = (aL + aR) / 2\n",
" if phi(aL) * phi(b) < 0:\n",
" aR = b\n",
" else:\n",
" aL = b\n",
" iter += 1\n",
" x0 = (aL + aR) / 2\n",
" while abs(phi(x0)) > eps:\n",
" try:\n",
" x0 -= phi(x0) / dphi(x0)\n",
" except ZeroDivisionError:\n",
" return np.inf, \"Derivative is zero\"\n",
" iter += 1\n",
" if iter > 10000 or phi(x0) == np.inf:\n",
" return np.inf, \"Method diverges\"\n",
" return x0, iter\n",
"\n",
"\n",
"f = lambda x: np.log(np.exp(x) + np.exp(-x))\n",
"df = lambda x: (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))\n",
"print(DichotomyNewton(f, df, -20, 3))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Solving an optimisation problem using the Newton method\n",
"\n",
"An island (denoted by a point $I$ below) is situated 2 kilometers from the shore (its projection on the shore\n",
"is a point $P$). A guest staying at a nearby hotel $H$ wants to go from the hotel to the\n",
"island and decides that he will run at 8km/hr for a distance $x$, before swimming at\n",
"speed 3km/hr to reach the island.\n",
"\n",
"![illustration of the problem](./images/optiNewton.png)\n",
"\n",
"Taking into account the fact that there are 6 kilometers between the hotel $H$ and $P$,\n",
"how far should the visitor run before swimming?\n",
"\n",
"Model the situation as a minimisation problem, and solve it numerically.\n",
"Compare the efficiency of the dichotomy method and of the Newton algorithm."
]
},
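{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to set up the model (a sketch of the reasoning behind the code below): if the guest runs a distance $x$ along the shore from $H$ towards $P$, the remaining distance to $P$ is $6-x$ and the island lies 2 km offshore, so the swim has length $u(x)=\\sqrt{(6-x)^2+4}$. The total travel time in hours is\n",
"$$T(x)=\\frac{x}{8}+\\frac{u(x)}{3},\\qquad T'(x)=\\frac{1}{8}-\\frac{6-x}{3\\,u(x)},\\qquad T''(x)=\\frac{4}{3\\,u(x)^3}.$$\n",
"Setting $T'(x)=0$ gives $9\\left((6-x)^2+4\\right)=64\\,(6-x)^2$, i.e. $(6-x)^2=36/55$, so $x^*=6-6/\\sqrt{55}\\approx 5.19$ km."
]
},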
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T17:43:43.061916Z",
"start_time": "2025-03-18T17:43:43.042625Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optimal point (Newton): 0.9299901531755377\n",
"Objective function value at optimal point (Newton): 1.9329918821224974\n",
"Number of iterations (Newton): 100\n",
"Optimal point (Dichotomy): inf\n",
"Objective function value at optimal point (Dichotomy): inf\n",
"Number of iterations (Dichotomy): Method diverges\n"
]
}
],
"source": [
"\n",
"u = lambda x: np.sqrt((6 - x) ** 2 + 4)\n",
"\n",
"\n",
"def objective_function(x):\n",
" return x / 8 + u(x) / 3\n",
"\n",
"\n",
"def gradient(x):\n",
" return 1 / 8 + (6 - x) / (3 * u(x))\n",
"\n",
"\n",
"def hessian(x):\n",
" return (12 * u(x) - (2 * x - 12) ** 2 / 12 * u(x)) / 36 * u(x) ** 2\n",
"\n",
"\n",
"# Newton's method for optimization\n",
"def newton_method(initial_guess, tolerance=1e-6, max_iterations=100):\n",
" x = initial_guess\n",
" iterations = 0\n",
" while iterations < max_iterations:\n",
" grad = gradient(x)\n",
" hess = hessian(x)\n",
" if np.abs(grad) < tolerance:\n",
" break\n",
" x -= grad / hess\n",
" iterations += 1\n",
" return x, iterations\n",
"\n",
"\n",
"# Dichotomy method for optimization\n",
"def dichotomy_method(aL, aR, eps=1e-6, max_iterations=1000):\n",
" iterations = 0\n",
" x0 = (aL + aR) / 2\n",
" grad = gradient(x0)\n",
" hess = hessian(x0)\n",
" while abs(grad) > eps:\n",
" try:\n",
" x0 -= grad / hess\n",
" except ZeroDivisionError:\n",
" return np.inf, \"Derivative is zero\"\n",
" iterations += 1\n",
" if iterations > max_iterations or grad == np.inf:\n",
" return np.inf, \"Method diverges\"\n",
" grad = gradient(x0)\n",
" hess = hessian(x0)\n",
" return x0, iterations\n",
"\n",
"\n",
"# Initial guess for Newton's method\n",
"initial_guess_newton = 4\n",
"\n",
"# Run Newton's method\n",
"optimal_point_newton, iterations_newton = newton_method(initial_guess_newton)\n",
"print(f\"Optimal point (Newton): {optimal_point_newton}\")\n",
"print(\n",
" f\"Objective function value at optimal point (Newton): {objective_function(optimal_point_newton)}\"\n",
")\n",
"print(f\"Number of iterations (Newton): {iterations_newton}\")\n",
"\n",
"# Initial interval for dichotomy method\n",
"aL, aR = 0, 6\n",
"\n",
"# Run dichotomy method\n",
"optimal_point_dichotomy, iterations_dichotomy = dichotomy_method(aL, aR)\n",
"print(f\"Optimal point (Dichotomy): {optimal_point_dichotomy}\")\n",
"print(\n",
" f\"Objective function value at optimal point (Dichotomy): {objective_function(optimal_point_dichotomy)}\"\n",
")\n",
"print(f\"Number of iterations (Dichotomy): {iterations_dichotomy}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Newton method to solve boundary value problems\n",
"\n",
"### Shooting method\n",
"\n",
"We consider the following non-linear ODE\n",
"\n",
"\\begin{equation}\n",
"y''= f(x,y,y'),\\quad x\\in[a,b],\\quad y(a)=\\alpha, y(b)=\\beta.\n",
"\\end{equation}\n",
"\n",
"\n",
"To classically integrate such an ODE, we usually dont have endpoints for $y$,\n",
"but initial values for $y$ and $y'$. So, we cannot start at $x=a $ and integrate\n",
"up to $x=b$. This is a _boundary value problem_.\n",
"\n",
"One approach is to approximate $y$ by somme finite difference and then arrive at\n",
"a system for the discrete values $y(x_i)$ and finally solve large linear\n",
"systems.\n",
"\n",
"Here, we will see how we can formulate the problem as a _shooting_ method, and\n",
"use Newton method so solve it.\n",
"\n",
"The idea is to use a _guess_ for the initial value of $y'$. Let $s$ be a\n",
"parameter for a fonction $y(\\;\\cdot\\;;s)$ solution of (1) such that\n",
"$$y(a;s)=\\alpha,\\text{ and }y'(a;s)=s.$$\n",
"\n",
"There is no chance that $y(b;s)=y(b)=\\beta$ but we can adjust the value of $s$,\n",
"refining the guess until it is (nearly equal to) the right value.\n",
"\n",
"This method is known as _shooting_ method in analogy to shooting a ball at a\n",
"goal, determining the unknown correct velocity by throwing it too fast/too\n",
"slow until it hits the goal exactly.\n",
"\n",
"#### In Practice\n",
"\n",
"For the parameter $s$, we integrate the following ODE:\n",
"\n",
"\\begin{equation}\n",
"y''= f(x,y,y'),\\quad x\\in[a,b],\\quad y(a)=\\alpha, y'(a)=s.\n",
"\\end{equation}\n",
"\n",
"We denote $y(\\;\\cdot\\;;s)$ solution of (2).\n",
"\n",
"Let us now define the _goal function_. Here, we want that $y(b;s)=\\beta$, hence,\n",
"we define:\n",
"$$g:s\\mapsto \\left.y(x;s)\\right|_{x=b}-\\beta$$\n",
"\n",
"We seek $s^*$ such that $g(s^*)=0$.\n",
"\n",
"Note that computing $g(s)$ involves the integration of an ODE, so each\n",
"evaluation of $g$ is expensive. Newtons method seems then to be a good\n",
"way due to its fast convergence.\n",
"\n",
"To be able to code a Newtons method, we need to compute the derivative of $g$.\n",
"For this purpose, let define\n",
"$$z(x;s)=\\frac{\\partial y(x;s)}{\\partial s}.$$\n",
"\n",
"Then by differentiating (2) with respect to $s$, we get\n",
"$$z''=\\frac{\\partial f}{\\partial y}z+\\frac{\\partial f}{\\partial y'}z',\\quad\n",
"z(a;s)=0,\\text{ and }z'(a;s)=1.$$\n",
"\n",
"The derivative of $g$ can now be expressed in term of $z$:\n",
"$$g'(z)=z(b;s).$$\n",
"\n",
"Putting this together, we can code the Newton's method:\n",
"$$s_{n+1}=s_n-\\frac{g(s_n)}{g'(s_n)}.$$\n",
"\n",
"To sum up, a shooting method requires an ODE solver and a Newton solver.\n",
"\n",
"#### Example\n",
"\n",
"Apply this method to\n",
"$$y''=2y^3-6y-2x^3,\\quad y(1)=2,y(2)=5/2,$$\n",
"with standard library for integration, and your own Newton implementation.\n",
"\n",
"Note that you may want to express this with one order ODE. Moreover, it may be\n",
"simpler to solve only one ODE for both $g$ and $g'$.\n",
"\n",
"With python, you can use `scipy.integrate.solve_ivp` function:\n",
"https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.solve_ivp.html#scipy.integrate.solve_ivp\n",
"\n",
"Plot the solution $y$.\n",
"\n",
"For numerical parameters, compute the solution up to a precision of $10^{-8}$\n",
"and get the function on a grid of 1000 points.\n",
"\n",
"## Finite differences\n",
"\n",
"Here, we are going to use a different approach to solve the boundary value\n",
"problem:\n",
"\n",
"\\begin{equation}\n",
"y''= f(x,y,y'),\\quad x\\in[a,b],\\quad y(a)=\\alpha, y(b)=\\beta.\n",
"\\end{equation}\n",
"\n",
"This problem can be solved by the following direct process:\n",
"\n",
"1. We discretize the domain choosing an integer $N$, grid points $\\{x_n\\}_{n=0,\\dots,N}$ and\n",
" we define the discrete solution $\\{y_n\\}_{n=0,\\dots,N}$.\n",
"1. We discretize the ODE using derivative approximation with finite differences\n",
" in the interior of the domain.\n",
"1. We inject the boundary conditions (here $y_0=\\alpha$ and $y_N=\\beta$) in the\n",
" discretized ODE.\n",
"1. Solve the system of equation for the unknows $\\{y_n\\}_{n=1,\\dots,N-1}$.\n",
"\n",
"We use here a uniform grid :\n",
"$$h:=(b-a)/N, \\quad \\forall n=0,\\dots, N\\quad x_n=hn.$$\n",
"If we use a centered difference formula for $y''$ and $y'$, we obtain:\n",
"$$\\forall n=1,\\dots,N-1,\\quad \\frac{y_{n+1}-2y_n+y_{n-1}}{h^2}=f\\left(x_n,y_n,\\frac{y_{n+1}-y_{n-1}}{2h}\\right).$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The result is a system of equations for $\\mathbf{y}=(y_1,\\dots,y_{N-1})$ :\n",
"$$G(\\mathbf{y})=0,\\quad G:\\mathbb{R}^{N-1}\\to\\mathbb{R}^{N-1}.$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This system can be solved using Newton's method. Note that the Jacobian\n",
"$\\partial G/\\partial \\mathbb{y}$ is _tridiagonal_.\n",
"\n",
"Of course, here, we are in the multidimensional context, so you will have to\n",
"code a Newton algorithm well suited.\n",
"\n",
"#### Example\n",
"\n",
"Apply this method to\n",
"$$y''=2y^3-6y-2x^3,\\quad y(1)=2,y(2)=5/2.$$\n",
"\n",
"Plot the solution $y$.\n",
"\n",
"For numerical parameters, compute the solution up to a precision of $10^{-8}$\n",
"and get the function on a grid of 1000 points."
]
},
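{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the finite-difference solver for the same example (the helper `fd_solve` is our own; the banded solver `scipy.linalg.solve_banded` is one possible choice for the tridiagonal Newton step). Starting from the straight line joining the boundary values, each Newton step solves the tridiagonal system $J\\,\\delta = -G(\\mathbf{y})$."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"from scipy.linalg import solve_banded\n",
"\n",
"\n",
"def fd_solve(N=1000, eps=1e-8, max_iter=50):\n",
"    a, b, alpha, beta = 1.0, 2.0, 2.0, 2.5\n",
"    h = (b - a) / N\n",
"    x = np.linspace(a, b, N + 1)\n",
"    y = np.linspace(alpha, beta, N + 1)  # linear initial guess matching the BCs\n",
"    for _ in range(max_iter):\n",
"        yi, xi = y[1:-1], x[1:-1]\n",
"        # G(y): centered second difference minus the right-hand side.\n",
"        G = (y[2:] - 2 * yi + y[:-2]) / h**2 - (2 * yi**3 - 6 * yi - 2 * xi**3)\n",
"        if np.max(np.abs(G)) < eps:\n",
"            break\n",
"        # Tridiagonal Jacobian in banded storage: super-, main and sub-diagonal.\n",
"        ab = np.zeros((3, N - 1))\n",
"        ab[0, 1:] = 1 / h**2\n",
"        ab[1, :] = -2 / h**2 - (6 * yi**2 - 6)\n",
"        ab[2, :-1] = 1 / h**2\n",
"        # One Newton step: solve J delta = -G; the boundary values stay fixed.\n",
"        y[1:-1] += solve_banded((1, 1), ab, -G)\n",
"    return x, y\n",
"\n",
"\n",
"xs, ys = fd_solve()\n",
"plt.plot(xs, ys, label=\"finite differences\")\n",
"plt.plot(xs, xs + 1 / xs, \"--\", label=\"exact $y = x + 1/x$\")\n",
"plt.legend()\n",
"plt.show()"
]
},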
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Remark:** In the context of numerical optimal control, these two numerical\n",
"methods are often called _indirect method_ (for the shooting method) and _direct\n",
"method_ (for the finite difference method)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Whos the best?\n",
"\n",
"Compare the two methods playing with parameters (grid discretization, precision,\n",
"initialization, etc.). Mesure the time computation."
]
},
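{
"cell_type": "markdown",
"metadata": {},
"source": [
"A possible timing harness (a sketch, assuming the `shooting_solve` and `fd_solve` helpers from the sketches above have been run):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"for name, solver in [(\"shooting\", shooting_solve), (\"finite differences\", fd_solve)]:\n",
"    t0 = time.perf_counter()\n",
"    solver()\n",
"    print(f\"{name}: {time.perf_counter() - t0:.3f} s\")"
]
},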
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}