Updated by mohitverma0491 on Apr 18, 2020

ReLU activation function in neural networks

State-of-the-art neural networks are capable of learning amazing things. ReLU is one of the activation functions used to train neural networks, and it has several advantages over other activation functions.

Source: https://deeplearninguniversity.com/

ReLU as an Activation Function in Neural Networks - Deep Learning University

ReLU outputs the maximum of 0 and x: when x is negative the output is 0, and when x is positive the output is x.
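That definition is a one-liner in code. A minimal sketch in plain Python (the function name is my own choice, not from the source):

```python
def relu(x):
    # ReLU(x) = max(0, x): 0 for negative inputs, the identity for positive ones.
    return max(0.0, x)

print(relu(-2.0))  # 0.0
print(relu(3.5))   # 3.5
```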

Sigmoid as an Activation Function in Neural Networks - Deep Learning University

The sigmoid activation function, also known as the logistic function, is another activation function used in neural networks.
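For comparison, the logistic function is also a short sketch in plain Python (the name is my own choice):

```python
import math

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + e^(-x)): squashes any real input into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))    # 0.5
print(sigmoid(-10.0))  # close to 0
print(sigmoid(10.0))   # close to 1
```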

What is Deep Learning? - Deep Learning University

Deep Learning is a subfield of Machine Learning in which a layered architecture learns representations, with each successive layer working on increasingly meaningful features.

Importance of activation functions in neural networks

Activation functions are among the most important parts of a neural network. The ReLU activation function has some significant advantages over alternatives such as hyperbolic tangent, sigmoid, leaky ReLU, parameterised ReLU, ELU, and SELU.

Disadvantages of the sigmoid activation function compared to ReLU

The sigmoid activation function has some serious disadvantages compared to ReLU, including but not limited to:
Its range lies between 0 and 1, which causes saturation for large positive or negative inputs.
It cannot be used as the output neuron for unbounded regression targets.
It suffers from the vanishing (dying) gradient problem, since its derivative is at most 0.25.
Its outputs are not zero-centred: their mean is around 0.5 rather than 0.
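The saturation point can be checked numerically: the sigmoid's derivative is s(x)(1 - s(x)), which peaks at 0.25 and collapses for large |x|. A small sketch (helper names are my own):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)); maximum of 0.25 at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)

for x in (0.0, 5.0, 10.0):
    print(x, sigmoid_grad(x))  # the gradient shrinks rapidly as |x| grows
```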

Disadvantages of ReLU

Disadvantages of ReLU:
It can suffer from the dying ReLU problem.
The function is continuous over the reals, but it is not differentiable at 0.
There is no gradient to work with when the input is less than 0, since the derivative is 0 there.
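That zero-gradient region is exactly why variants such as leaky ReLU (mentioned above) exist: they keep a small slope on the negative side. A hedged sketch, with function names of my own choosing:

```python
def relu_grad(x):
    # ReLU's derivative: 1 for x > 0, 0 for x < 0 (0 chosen by convention at x = 0).
    return 1.0 if x > 0 else 0.0

def leaky_relu_grad(x, alpha=0.01):
    # Leaky ReLU keeps a small slope alpha for negative inputs, so the gradient never dies.
    return 1.0 if x > 0 else alpha

print(relu_grad(-3.0))        # 0.0  -> a "dead" unit gets no learning signal
print(leaky_relu_grad(-3.0))  # 0.01 -> a small signal still flows
```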

Advantages of ReLU

Advantages of ReLU:
It is easy and fast to compute.
It doesn't suffer from dying (vanishing) gradients for positive inputs.
It doesn't suffer from exploding gradients.
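These advantages show up directly when comparing gradients: for large positive pre-activations, the sigmoid's gradient vanishes while ReLU's stays at 1. A quick sketch (helper names are mine):

```python
import math

def sigmoid_grad(x):
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def relu_grad(x):
    return 1.0 if x > 0 else 0.0

# ReLU's gradient is constant for positive inputs; sigmoid's collapses toward 0.
for x in (1.0, 10.0, 50.0):
    print(x, relu_grad(x), sigmoid_grad(x))
```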