Commit a26a8752 authored by Nando Farchmin

Fix typo

parent cbc6cef1
1 merge request: !1 Update math to conform with gitlab markdown
@@ -119,10 +119,17 @@ Let $`x^{(1)},\dots,x^{(N)}\sim\pi`$ be independent (random) samples and assume
The empirical regression problem then reads
```math
\text{Find}\qquad \Psi_\vartheta
= \operatorname*{arg\, min}_{\Psi_\theta\in\mathcal{M}_{d,\varphi}} \frac{1}{N} \sum_{i=1}^N \bigl(f^{(i)} - \Psi_\theta(x^{(i)})\bigr)^2
=: \operatorname*{arg\, min}_{\Psi_\theta\in\mathcal{M}_{d,\varphi}} \mathcal{L}_N(\Psi_\theta)
```
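For concreteness, here is a minimal sketch of the empirical loss $`\mathcal{L}_N`$ in plain Python/NumPy; the placeholder model `psi`, the data, and the function names are illustrative assumptions, not part of these notes.

```python
import numpy as np

def empirical_loss(psi, xs, fs):
    """Empirical MSE loss L_N: mean of (f^(i) - psi(x^(i)))^2 over all samples."""
    residuals = np.asarray(fs) - np.array([psi(x) for x in xs])
    return np.mean(residuals ** 2)

# Toy usage with placeholder data and a placeholder "model".
xs = np.random.uniform(-1.0, 1.0, size=50)   # samples x^(1), ..., x^(N)
fs = np.sin(np.pi * xs)                      # target values f^(i) = f(x^(i))
print(empirical_loss(np.cos, xs, fs))        # loss of the (poor) model psi = cos
```

Minimizing this quantity over the network parameters is exactly the empirical regression problem stated above.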
> **Definition** (loss function):
> A _loss function_ is any function that measures how well a neural network approximates the target values.
Typical loss functions for regression and classification tasks are
- mean-square error (MSE, standard $`L^2`$-error)
- weighted $`L^p`$- or $`H^k`$-norms (solutions of PDEs)
@@ -170,6 +177,7 @@ The best metaphor to remember the difference (I know of) is the following:
> Your friend, however, drank a little too much and is not capable of planning anymore.
> So they stagger down the mountain in a more or less random direction.
> Each step they take requires little thought, but it takes them a long time overall to get back home (or at least close to it).
>
> <img src="sgd.png" title="sgd" alt="sgd" height=400 />
What remains is the computation of $`\operatorname{\nabla}_\vartheta\Psi_{\vartheta^{(i)}}`$ for $`i\in\Gamma_j\subset\{1,\dots,N\}`$ in each step.
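In practice these gradients are obtained by backpropagation (automatic differentiation). The following is a minimal sketch of a mini-batch SGD loop in PyTorch; the network architecture, learning rate, batch size, and data are assumptions made purely for illustration.

```python
import torch

# Placeholder network Psi_theta (architecture is an assumption, not from the notes).
model = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# Placeholder data standing in for the samples x^(i) and targets f^(i).
x = torch.linspace(-1.0, 1.0, 256).unsqueeze(1)
f = torch.sin(torch.pi * x)

batch_size = 32
for epoch in range(100):
    perm = torch.randperm(x.shape[0])
    for j in range(0, x.shape[0], batch_size):
        idx = perm[j:j + batch_size]                    # mini-batch Gamma_j
        loss = ((f[idx] - model(x[idx])) ** 2).mean()   # empirical MSE on the batch
        optimizer.zero_grad()
        loss.backward()      # gradients w.r.t. the parameters via backpropagation
        optimizer.step()     # SGD update: theta <- theta - lr * gradient
```

Each inner iteration evaluates the loss only on the current mini-batch, which is what distinguishes stochastic gradient descent from full-batch gradient descent in the metaphor above.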