Function response
The blue point moves on the activation curve as the input changes.
[Plot: ReLU activation curve, with the current input-output point marked]
Derivative response
The derivative is what explains saturation, dead neurons, and gradient flow.
[Plot: derivative curve, with the current derivative point marked in the active region]
Choose activation
Drag the input control to see how the output and the local slope change together.
Current output: a = g(z) = 1.20
Current derivative: g'(z) = 1.00
Function
g(z) = max(0, z)
Derivative
g'(z) = 1 for z > 0, 0 for z < 0 (undefined at z = 0; implementations commonly use 0 there)
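The function and its derivative can be written out directly. The following is a minimal sketch in plain Python. The names relu and relu_grad are chosen here for illustration, and returning 0 for the slope at z = 0 is an assumed convention, since the derivative is not defined at that point.

```python
def relu(z: float) -> float:
    """ReLU activation: g(z) = max(0, z)."""
    return max(0.0, z)


def relu_grad(z: float) -> float:
    """Local slope g'(z): 1 for z > 0, 0 for z < 0.

    The derivative is undefined at z = 0; returning 0 there is a
    convention assumed for this sketch.
    """
    return 1.0 if z > 0 else 0.0


# Evaluate one inactive input, the boundary, and one active input.
for z in (-2.0, 0.0, 1.2):
    print(f"z = {z:5.2f}  a = {relu(z):.2f}  g'(z) = {relu_grad(z):.2f}")
```

For z = 1.2 the sketch gives an output of 1.20 and a slope of 1.00, matching the readout for a point in the active region.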
Why ReLU is common
On the positive side, the slope is exactly 1, so a gradient passing back through the unit is not scaled down.
On the negative side, the neuron outputs zero and its slope is zero, so it passes no gradient back.
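One way to see both effects is to multiply the local slopes along a chain of units, which is what backpropagation does through the activations. The sketch below is a simplification: it ignores weights entirely and treats each layer as a single scalar unit. The helper chain_slope and the sigmoid comparison are introduced here only for illustration.

```python
import math
from typing import Callable, Sequence


def relu_grad(z: float) -> float:
    """Slope of ReLU; 0 at z = 0 is an assumed convention."""
    return 1.0 if z > 0 else 0.0


def sigmoid_grad(z: float) -> float:
    """Slope of the logistic sigmoid, shown only as a saturating contrast."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


def chain_slope(grad: Callable[[float], float], zs: Sequence[float]) -> float:
    """Product of local slopes along a chain of units (weights ignored)."""
    total = 1.0
    for z in zs:
        total *= grad(z)
    return total


active = [1.2] * 10                   # ten units, all in the active region
one_dead = [1.2] * 9 + [-0.5]         # same chain with one inactive unit

print(chain_slope(relu_grad, active))     # every slope is 1, product stays 1
print(chain_slope(relu_grad, one_dead))   # a single zero slope blocks the gradient
print(chain_slope(sigmoid_grad, active))  # each slope is below 0.25, product shrinks fast
```

In a real network the weight matrices also scale the gradient, so this only isolates the contribution of the activation's slope.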