From a26a87522805fb26b2a66e54194b86fc011bb25c Mon Sep 17 00:00:00 2001
From: Nando Farchmin <nando.farchmin@gmail.com>
Date: Fri, 1 Jul 2022 19:30:12 +0200
Subject: [PATCH] Fix typo

---
 doc/basics.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/doc/basics.md b/doc/basics.md
index eda06f9..f507a5f 100644
--- a/doc/basics.md
+++ b/doc/basics.md
@@ -119,10 +119,17 @@ Let $`x^{(1)},\dots,x^{(N)}\sim\pi`$ be independent (random) samples and assume
 
 The empirical regression problem then reads
+```math
+\text{Find}\qquad \Psi_\vartheta
+= \operatorname*{arg\, min}_{\Psi_\theta\in\mathcal{M}_{d,\varphi}} \frac{1}{N} \sum_{i=1}^N \bigl(f^{(i)} - \Psi_\theta(x^{(i)})\bigr)^2
+=: \operatorname*{arg\, min}_{\Psi_\theta\in\mathcal{M}_{d,\varphi}} \mathcal{L}_N(\Psi_\theta)
+```
 
 > **Definition** (loss function):
 > A _loss functions_ is any function, which measures how good a neural network approximates the target values.
 
+**TODO: Is there a maximum number of inline math?**
+
 Typical loss functions for regression and classification tasks are
 - mean-square error (MSE, standard $`L^2`$-error)
 - weighted $`L^p`$- or $`H^k`$-norms (solutions of PDEs)
@@ -170,6 +177,7 @@ The best metaphor to remember the difference (I know of) is the following:
 > Your friend, however, drank a little to much and is not capable of planning anymore.
 > So they stagger down the mountain in a more or less random direction.
 > Each step they take is with little thought, but it takes them a long time overall to get back home (or at least close to it).
+>
 > <img src="sgd.png" title="sgd" alt="sgd" height=400 />
 
 What remains is the computation of $`\operatorname{\nabla}_\vartheta\Psi_{\vartheta^{(i)}}`$ for $`i`\in\Gamma_j\subset\{1,\dots,N\}$ in each step.
--
GitLab
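The patch above adds the empirical loss $`\mathcal{L}_N`$ and the surrounding text refers to mini-batch index sets $`\Gamma_j`$ used by SGD. As a rough illustration only, not part of the patch or of doc/basics.md, here is a minimal NumPy sketch of minimising such an empirical mean-square loss with mini-batch SGD; the linear model `psi`, the synthetic data, and all parameter values are assumptions made for the example.

```python
# Illustrative sketch (not from the repository): minimise the empirical loss
# L_N(Psi_theta) = 1/N * sum_i (f^(i) - Psi_theta(x^(i)))^2 with mini-batch SGD
# on a toy linear model Psi_theta(x) = w*x + b. All names and values are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Samples x^(1), ..., x^(N) ~ pi and target values f^(i) = f(x^(i)).
N = 200
x = rng.uniform(-1.0, 1.0, size=N)
f = 2.0 * x + 0.5 + 0.05 * rng.normal(size=N)  # stand-in target function

theta = np.zeros(2)  # theta = (w, b)

def psi(theta, x):
    """Model Psi_theta evaluated at the sample points."""
    w, b = theta
    return w * x + b

def loss(theta, x, f):
    """Empirical loss: mean squared residual over the given samples."""
    return np.mean((f - psi(theta, x)) ** 2)

def grad(theta, x, f):
    """Gradient of the (mini-batch) loss w.r.t. theta, using nabla_theta Psi = (x, 1)."""
    r = f - psi(theta, x)                    # residuals
    return np.array([-2.0 * np.mean(r * x),  # d/dw
                     -2.0 * np.mean(r)])     # d/db

eta, batch_size = 0.1, 16
for step in range(500):
    gamma = rng.choice(N, size=batch_size, replace=False)  # mini-batch Gamma_j
    theta -= eta * grad(theta, x[gamma], f[gamma])

print(f"final loss: {loss(theta, x, f):.4f}, theta: {theta}")
```

With a step size around 0.1 and a few hundred mini-batch steps the fitted parameters should end up close to the values used to generate the data; replacing `psi` and `grad` by a neural network and its backpropagated gradients recovers the setting described in the patched section.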