Euler
The Euler sampler, in the machine learning context, is a stochastic gradient descent algorithm used to minimize loss functions in deep learning models. It follows this general approach (sketched in code after the list):
- Sample a random initialization for the model parameters (weights, biases)
- Calculate the gradient of the loss function with respect to each parameter
- Take a step against the gradient direction to modify each parameter
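As a concrete sketch of that loop, assuming nothing more than NumPy and a toy quadratic loss standing in for a real model (the noise term that distinguishes the Euler sampler is added in the update equation below):

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_grad(theta):
    # Gradient of a toy quadratic loss L(theta) = 0.5 * ||theta||**2;
    # a real model would compute this by backpropagation.
    return theta

theta = rng.normal(size=4)      # 1. random initialization of the parameters
eta = 0.1                       # learning rate

for _ in range(100):
    grad = loss_grad(theta)     # 2. gradient of the loss w.r.t. each parameter
    theta = theta - eta * grad  # 3. step against the gradient direction
```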
The key characteristic of the Euler sampler is that each parameter update includes a noise term sampled from a Gaussian distribution with mean 0 and a standard deviation proportional to the learning rate.
Specifically, the parameter update equation is:
θ ← θ - η*∇L(θ) + α*N(0, σ^2)
Where:
θ - model parameter
η - learning rate
N(0, σ^2) - Gaussian noise with 0 mean and variance σ^2
α - noise scaling constant, proportional to the learning rate η
∇L(θ) - gradient of loss L w.r.t. θ
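Translated into code, the update maps one-to-one onto these symbols. A minimal sketch, assuming theta and grad are NumPy arrays; euler_update is a hypothetical helper name for this example, not a library function:

```python
import numpy as np

rng = np.random.default_rng()

def euler_update(theta, grad, eta, alpha, sigma):
    # theta <- theta - eta * gradient + alpha * N(0, sigma^2)
    noise = rng.normal(loc=0.0, scale=sigma, size=theta.shape)
    return theta - eta * grad + alpha * noise

# Usage with the toy gradient above, e.g.:
# theta = euler_update(theta, loss_grad(theta), eta=0.1, alpha=0.1, sigma=1.0)
```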
Injecting Gaussian noise encourages exploration and helps prevent the optimizer from getting stuck in local optima. The scale of the noise is controlled by the learning rate.
Compared to vanilla stochastic gradient descent, the Euler sampler takes larger, more stochastic steps that can escape sharp local minima. This enables more efficient exploration and can improve generalization in neural networks.
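For intuition, here is a toy comparison on a one-dimensional double-well loss; the loss, constants, and step count are made up for illustration. Plain gradient descent started in the shallow basin stays there, while the noisy update can cross the barrier toward the deeper minimum:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad(theta):
    # Gradient of the double-well loss L(theta) = (theta**2 - 1)**2 + 0.3*theta,
    # which has a shallow minimum near theta = +1 and a deeper one near theta = -1.
    return 4.0 * theta * (theta**2 - 1.0) + 0.3

eta, alpha, sigma = 0.05, 0.2, 1.0
theta_plain = theta_noisy = 1.0   # both start in the shallow basin

for _ in range(2000):
    theta_plain = theta_plain - eta * grad(theta_plain)  # vanilla gradient descent
    theta_noisy = theta_noisy - eta * grad(theta_noisy) + alpha * rng.normal(0.0, sigma)

print(f"plain GD ends at     {theta_plain:+.2f}")  # trapped near +1
print(f"noisy update ends at {theta_noisy:+.2f}")  # can end near the deeper minimum -1
```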
Overall, the Euler sampler modifies gradient descent by injecting Gaussian noise into the parameter updates. This helps when optimizing the highly non-convex loss functions of deep learning models.