Activation Functions Visualizer

Non-linearity and Gradient Flow
Function $f(x)$ Derivative $f'(x)$

Vanishing Gradients: Watch the blue dashed line. When it approaches zero, the weight updates in early layers of a deep network will effectively stall.