From a26a87522805fb26b2a66e54194b86fc011bb25c Mon Sep 17 00:00:00 2001
From: Nando Farchmin <nando.farchmin@gmail.com>
Date: Fri, 1 Jul 2022 19:30:12 +0200
Subject: [PATCH] Fix typo

---
 doc/basics.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/doc/basics.md b/doc/basics.md
index eda06f9..f507a5f 100644
--- a/doc/basics.md
+++ b/doc/basics.md
@@ -119,10 +119,17 @@ Let $`x^{(1)},\dots,x^{(N)}\sim\pi`$ be independent (random) samples and assume
 
 The empirical regression problem then reads
 
+```math
+\text{Find}\qquad \Psi_\vartheta
+= \operatorname*{arg\, min}_{\Psi_\theta\in\mathcal{M}_{d,\varphi}} \frac{1}{N} \sum_{i=1}^N \bigl(f^{(i)} - \Psi_\theta(x^{(i)})\bigr)^2
+=: \operatorname*{arg\, min}_{\Psi_\theta\in\mathcal{M}_{d,\varphi}} \mathcal{L}_N(\Psi_\theta)
+```
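+
+As a quick illustration (not part of the original text), the following minimal numpy sketch evaluates the empirical loss $`\mathcal{L}_N`$ for a toy one-neuron model; the synthetic data, the model `psi`, and the crude random search are placeholder assumptions, not a prescribed implementation.
+
+```python
+import numpy as np
+
+# Hypothetical data: N samples x^(i) ~ pi with target values f^(i) = f(x^(i)).
+rng = np.random.default_rng(0)
+N = 100
+x = rng.uniform(-1.0, 1.0, size=N)
+f = np.sin(np.pi * x)  # placeholder target function
+
+def psi(theta, x):
+    """Toy model standing in for an element of M_{d, phi}: a single tanh neuron."""
+    w, b, c = theta
+    return c * np.tanh(w * x + b)
+
+def empirical_loss(theta):
+    """L_N(Psi_theta) = 1/N * sum_i (f^(i) - Psi_theta(x^(i)))^2."""
+    return np.mean((f - psi(theta, x)) ** 2)
+
+# Crude stand-in for the arg min: keep the best of a few random parameter draws.
+candidates = [rng.normal(size=3) for _ in range(1000)]
+theta_best = min(candidates, key=empirical_loss)
+print(empirical_loss(theta_best))
+```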
 
 > **Definition** (loss function):
 > A _loss function_ is any function that measures how well a neural network approximates the target values.
 
+**TODO: Is there a maximum number of inline math expressions?**
+
 Typical loss functions for regression and classification tasks are
   - mean-square error (MSE, standard $`L^2`$-error)
   - weighted $`L^p`$- or $`H^k`$-norms (solutions of PDEs)
@@ -170,6 +177,7 @@ The best metaphor to remember the difference (I know of) is the following:
 > Your friend, however, drank a little too much and is no longer capable of planning.
 > So they stagger down the mountain in a more or less random direction.
 > Each step they take requires little thought, but overall it takes them a long time to get back home (or at least close to it).
+>
 > <img src="sgd.png" title="sgd" alt="sgd" height=400 />
 
 What remains is the computation of $`\nabla_\vartheta\Psi_{\vartheta^{(i)}}`$ for $`i\in\Gamma_j\subset\{1,\dots,N\}`$ in each step.
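+
+To make the mini-batch update concrete, here is a hedged numpy sketch of one SGD step that uses the gradient of the model on the samples $`i\in\Gamma_j`$, for the same toy one-neuron model as in the earlier sketch; the learning rate, batch size, and synthetic data are assumptions for illustration only.
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(0)
+N = 100
+x = rng.uniform(-1.0, 1.0, size=N)
+f = np.sin(np.pi * x)  # placeholder target values f^(i)
+
+def psi(theta, x):
+    """Toy model: a single tanh neuron with parameters theta = (w, b, c)."""
+    w, b, c = theta
+    return c * np.tanh(w * x + b)
+
+def grad_psi(theta, x):
+    """Gradient of Psi_theta(x) with respect to theta = (w, b, c)."""
+    w, b, c = theta
+    t = np.tanh(w * x + b)
+    return np.array([c * (1 - t**2) * x, c * (1 - t**2), t])
+
+def sgd_step(theta, batch, lr=0.1):
+    """One update using only the samples i in Gamma_j (the mini-batch)."""
+    residual = psi(theta, x[batch]) - f[batch]            # Psi_theta(x^(i)) - f^(i)
+    grad_loss = 2.0 * grad_psi(theta, x[batch]) @ residual / batch.size
+    return theta - lr * grad_loss
+
+theta = rng.normal(size=3)
+for j in range(200):
+    batch = rng.choice(N, size=10, replace=False)         # Gamma_j subset of {1, ..., N}
+    theta = sgd_step(theta, batch)
+print(np.mean((f - psi(theta, x)) ** 2))                  # final empirical loss
+```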
-- 
GitLab