ReLU

ReLU (Rectified Linear Unit) is an activation function that outputs the input directly if it's positive, and zero otherwise.

Equation:
f(x)=max(0,x)

Python Implementation:

import numpy as np

def relu(x):
    # Element-wise ReLU: keep positive values, replace negatives with zero
    return np.maximum(0, x)
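
For example (the sample values below are purely illustrative), applying relu to an array zeroes the negative entries and leaves the positive ones unchanged:

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))  # -> [0. 0. 0. 1.5 3.]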

Why used:

  1. Simplicity: ReLU is computationally efficient, being a simple max(0,x) operation.

  2. Sparsity: It naturally creates sparse representations, as negative inputs become zero.

  3. Gradient flow: ReLU allows for better gradient flow during backpropagation, mitigating the vanishing gradient problem (see the short sketch after this list).

  4. Non-linearity: It introduces non-linearity without saturating for positive inputs the way sigmoid or tanh do.

  5. Biological plausibility: ReLU's one-sided, thresholded response is closer to the firing behavior of biological neurons than earlier activation functions.
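
As a rough sketch of points 2 and 3 (assuming NumPy; the input values are illustrative), the subgradient of ReLU is 1 for positive inputs and 0 otherwise, so gradients pass through active units unchanged while inactive units produce sparse, zero outputs:

import numpy as np

def relu(x):
    return np.maximum(0, x)

def relu_grad(x):
    # Subgradient of ReLU: 1 where x > 0, 0 elsewhere (0 used at x = 0 by convention)
    return (x > 0).astype(float)

x = np.array([-1.0, 0.5, 2.0, -3.0])
print(relu(x))       # sparse output: negative inputs map to 0
print(relu_grad(x))  # 1 for active units, 0 for inactive ones; no saturation for large x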

The success of ReLU led to several variants and improvements: