Activation functions introduce nonlinearity into a neural network. Without them, stacked layers compose into a single linear mapping no matter how deep the network is; with them, the network can model complex relationships. After a neuron computes a weighted sum of its inputs, the activation function determines how much of that signal passes forward to the next layer.
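The collapse into a single linear mapping can be checked on a toy example. The sketch below is illustrative only: the weight matrices `W1` and `W2`, the input `x`, and the helper names are made up for this example, and biases are left out to keep the arithmetic short.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrix A by matrix B (both given as lists of rows)."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def relu(v):
    """Apply ReLU element-wise: negative entries become zero."""
    return [max(0.0, z) for z in v]


# Hypothetical weights and input, chosen only for illustration.
W1 = [[1.0, -2.0], [0.5, 1.0]]
W2 = [[1.0, 1.0], [-1.0, 2.0]]
x = [1.0, 2.0]

# Two layers with no activation in between...
stacked_linear = matvec(W2, matvec(W1, x))
# ...give the same result as one layer whose weights are the product W2 @ W1.
collapsed = matvec(matmul(W2, W1), x)
# Putting ReLU between the layers breaks that equivalence.
with_relu = matvec(W2, relu(matvec(W1, x)))

print(stacked_linear, collapsed, with_relu)
```

Here the first layer produces a negative entry, which ReLU zeroes out, so the second layer sees a different vector than it would in the purely linear stack.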
Different activations suit different roles. ReLU is common in hidden layers because it is cheap to compute and usually trains well. Sigmoid and tanh are bounded: sigmoid maps to (0, 1), which suits probabilities, and tanh maps to (-1, 1), which suits zero-centered outputs. Both saturate for large-magnitude inputs, where their gradients approach zero, and this can slow learning in deep networks.
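To make the saturation point concrete, here is a minimal sketch of the three activations and their derivatives using only Python's standard library. The function names are chosen for this example and are not taken from any particular framework; real libraries operate on whole tensors rather than single floats.

```python
import math


def relu(z):
    """ReLU: passes positive inputs through, clamps negatives to zero."""
    return max(0.0, z)


def sigmoid(z):
    """Logistic sigmoid with output in (0, 1); branches to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def tanh(z):
    """Hyperbolic tangent with output in (-1, 1), centered at zero."""
    return math.tanh(z)


def relu_grad(z):
    """Derivative of ReLU (taken as 0 at z == 0 by convention here)."""
    return 1.0 if z > 0 else 0.0


def sigmoid_grad(z):
    """Derivative of sigmoid: s * (1 - s), at most 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh_grad(z):
    """Derivative of tanh: 1 - tanh(z)^2, at most 1."""
    t = math.tanh(z)
    return 1.0 - t * t


# Compare gradients near zero and far from zero.
for z in (-10.0, 0.0, 10.0):
    print(z, relu_grad(z), sigmoid_grad(z), tanh_grad(z))
```

At an input of 10 the sigmoid and tanh gradients are already tiny, while the ReLU gradient is still 1. Multiplying many such small factors during backpropagation is what slows learning in deep networks that rely on saturating activations.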