Deep Learning Networks: Advantages of ReLU over Sigmoid Function

Sigmoid: tends to suffer from vanishing gradients, because the gradient shrinks as the input "a" of the sigmoid grows. When "a" becomes very large, S(a) approaches 1, so S′(a) = S(a)(1 − S(a)) ≈ 1 × (1 − 1) = 0.
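A quick numerical check makes this concrete. The sketch below (a minimal illustration using NumPy; the function names `sigmoid_grad` and `relu_grad` are our own) evaluates the gradient of both activations at increasingly large inputs: the sigmoid gradient collapses toward 0, while the ReLU gradient stays at 1 for any positive input.

```python
import numpy as np

def sigmoid(a):
    # S(a) = 1 / (1 + e^(-a))
    return 1.0 / (1.0 + np.exp(-a))

def sigmoid_grad(a):
    # S'(a) = S(a) * (1 - S(a)), which goes to 0 as |a| grows
    s = sigmoid(a)
    return s * (1.0 - s)

def relu_grad(a):
    # ReLU'(a) = 1 for a > 0, 0 otherwise (gradient does not shrink for large positive a)
    return (a > 0).astype(float)

a = np.array([0.0, 2.0, 5.0, 10.0, 20.0])
print("sigmoid grad:", sigmoid_grad(a))  # -> 0.25, 0.105, 0.0066, 4.5e-05, ~2e-09
print("relu grad:   ", relu_grad(a))     # -> 0., 1., 1., 1., 1.
```

In a deep network these per-layer gradients are multiplied together during backpropagation, so factors much smaller than 1 at each sigmoid layer quickly drive the overall gradient toward zero, whereas ReLU layers with positive inputs pass the gradient through unchanged.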