Commit a26a8752 authored by Nando Farchmin

Fix typo

parent cbc6cef1
1 merge request: !1 Update math to conform with gitlab markdown
@@ -119,10 +119,17 @@ Let $`x^{(1)},\dots,x^{(N)}\sim\pi`$ be independent (random) samples and assume
The empirical regression problem then reads
```math
\text{Find}\qquad \Psi_\vartheta
= \operatorname*{arg\, min}_{\Psi_\theta\in\mathcal{M}_{d,\varphi}} \frac{1}{N} \sum_{i=1}^N \bigl(f^{(i)} - \Psi_\theta(x^{(i)})\bigr)^2
=: \operatorname*{arg\, min}_{\Psi_\theta\in\mathcal{M}_{d,\varphi}} \mathcal{L}_N(\Psi_\theta)
```
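For concreteness, here is a minimal sketch of the empirical loss $`\mathcal{L}_N`$ in plain Python/NumPy; the placeholder model `psi`, the data, and the function names are illustrative assumptions, not part of these notes.

```python
import numpy as np

def empirical_loss(psi, xs, fs):
    """Empirical MSE loss L_N: mean of (f^(i) - psi(x^(i)))^2 over all samples."""
    residuals = np.asarray(fs) - np.array([psi(x) for x in xs])
    return np.mean(residuals ** 2)

# Toy usage with placeholder data and a placeholder "model".
xs = np.random.uniform(-1.0, 1.0, size=50)   # samples x^(1), ..., x^(N)
fs = np.sin(np.pi * xs)                      # target values f^(i) = f(x^(i))
print(empirical_loss(np.cos, xs, fs))        # loss of the (poor) model psi = cos
```

Minimizing this quantity over the network parameters is exactly the empirical regression problem stated above.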
> **Definition** (loss function):
> A _loss function_ is any function that measures how well a neural network approximates the target values.
Typical loss functions for regression and classification tasks are
- mean-square error (MSE, standard $`L^2`$-error)
- weighted $`L^p`$- or $`H^k`$-norms (solutions of PDEs)
@@ -170,6 +177,7 @@ The best metaphor to remember the difference (I know of) is the following:
> Your friend, however, drank a little too much and is not capable of planning anymore.
> So they stagger down the mountain in a more or less random direction.
> Each step they take requires little thought, but it takes them a long time overall to get back home (or at least close to it).
>
> <img src="sgd.png" title="sgd" alt="sgd" height=400 />
What remains is the computation of $`\operatorname{\nabla}_\vartheta\Psi_{\vartheta^{(i)}}`$ for $`i\in\Gamma_j\subset\{1,\dots,N\}`$ in each step.
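In practice these gradients are obtained by backpropagation (automatic differentiation). The following is a minimal sketch of a mini-batch SGD loop in PyTorch; the network architecture, learning rate, batch size, and data are assumptions made purely for illustration.

```python
import torch

# Placeholder network Psi_theta (architecture is an assumption, not from the notes).
model = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# Placeholder data standing in for the samples x^(i) and targets f^(i).
x = torch.linspace(-1.0, 1.0, 256).unsqueeze(1)
f = torch.sin(torch.pi * x)

batch_size = 32
for epoch in range(100):
    perm = torch.randperm(x.shape[0])
    for j in range(0, x.shape[0], batch_size):
        idx = perm[j:j + batch_size]                    # mini-batch Gamma_j
        loss = ((f[idx] - model(x[idx])) ** 2).mean()   # empirical MSE on the batch
        optimizer.zero_grad()
        loss.backward()      # gradients w.r.t. the parameters via backpropagation
        optimizer.step()     # SGD update: theta <- theta - lr * gradient
```

Each inner iteration evaluates the loss only on the current mini-batch, which is what distinguishes stochastic gradient descent from full-batch gradient descent in the metaphor above.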