A Gentle Introduction To Sigmoid Function
Whether you implement a neural network yourself or use a built-in library for neural network learning, it is of paramount importance to understand the significance of the sigmoid function. The sigmoid function is the key to understanding how a neural network learns complex problems. This function also served as a basis for discovering other functions that lead to efficient and good solutions for supervised learning in deep learning architectures.
In this tutorial, you will discover the sigmoid function and its role in learning from examples in neural networks.
After completing this tutorial, you will know:
- The sigmoid function
- Linear vs. non-linear separability
- Why a neural network can make complex decision boundaries if a sigmoid unit is used
Let’s get started.

A Gentle Introduction to sigmoid function. Photo by Mehreen Saeed, some rights reserved.
Tutorial Overview
This tutorial is divided into three parts; they are:
- The sigmoid function and its properties
- Linear vs. non-linearly separable problems
- Using a sigmoid as an activation function in neural networks
Sigmoid Function
The sigmoid function is a special case of the logistic function and is usually denoted by σ(x) or sig(x). It is given by:
σ(x) = 1/(1+exp(-x))
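As a quick illustration, the formula can be translated directly into a few lines of NumPy. This is a minimal sketch; the helper name `sigmoid` is just a convenient choice, not part of any particular library:

```python
import numpy as np

def sigmoid(x):
    """Sigmoid of x: 1 / (1 + exp(-x)), computed element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))                          # 0.5
print(sigmoid(np.array([-2.0, 0.0, 2.0])))   # each value lies strictly between 0 and 1
```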
Properties and Identities Of Sigmoid Function
The graph of the sigmoid function is an S-shaped curve, as shown by the green line in the graph below. The figure also shows the graph of the derivative in pink. The expression for the derivative, along with some important properties, is shown on the right.

Graph of the sigmoid function and its derivative. Some important properties are also shown.
A few other properties include:
- Domain: (-∞, +∞)
- Range: (0, +1)
- σ(0) = 0.5
- The function is monotonically increasing.
- The function is continuous everywhere.
- The function is differentiable everywhere in its domain.
- Numerically, it is enough to compute this function’s value over a small range of numbers, e.g., [-10, +10]. For values less than -10, the function’s value is almost zero. For values greater than 10, the function’s values are almost one. (See the quick numerical check after this list.)
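The snippet below is a minimal sketch of how these properties can be verified numerically with NumPy; the grid over [-10, +10] mirrors the range mentioned above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-10, 10, 1001)
y = sigmoid(x)

print(y.min() > 0 and y.max() < 1)    # True: the values stay inside (0, 1)
print(np.all(np.diff(y) > 0))         # True: monotonically increasing on this grid
print(sigmoid(0.0))                   # 0.5
print(sigmoid(-10.0), sigmoid(10.0))  # ~4.5e-05 and ~0.99995: effectively 0 and 1
```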
The Sigmoid As A Squashing Function
The sigmoid function is also called a squashing function, as its domain is the set of all real numbers and its range is (0, 1). Hence, if the input to the function is either a very large negative number or a very large positive number, the output is always between 0 and 1. The same goes for any number between -∞ and +∞.
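As a side note, computing exp(-x) naively can overflow for very large negative inputs, so a split formulation is often used in practice. The sketch below is my own illustration (not code from the tutorial) showing that even extreme inputs are squashed into (0, 1):

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid computed without overflow: uses exp(-x) for x >= 0 and exp(x) for x < 0."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    e = np.exp(x[~pos])
    out[~pos] = e / (1.0 + e)
    return out

print(stable_sigmoid([-1000.0, -5.0, 0.0, 5.0, 1000.0]))
# [0.         0.00669285 0.5        0.99330715 1.        ]
```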
Sigmoid As An Activation Function In Neural Networks
The sigmoid function is used as an activation function in neural networks. Just to review what an activation function is, the figure below shows the role of an activation function in one layer of a neural network. A weighted sum of inputs is passed through an activation function, and this output serves as an input to the next layer.

A sigmoid unit in a neural network
When the activation function for a neuron is a sigmoid function, it is a guarantee that the output of this unit will always be between 0 and 1. Also, as the sigmoid is a non-linear function, the output of this unit is a non-linear function of the weighted sum of inputs. Such a neuron that employs a sigmoid function as an activation function is called a sigmoid unit.
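A minimal sketch of a single sigmoid unit follows; the input values, weights, and bias are arbitrary numbers chosen purely for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([0.5, -1.2, 3.0])   # inputs to the unit (arbitrary values)
w = np.array([0.4, 0.7, -0.2])   # one weight per input (arbitrary values)
b = 0.1                          # bias term

z = np.dot(w, x) + b             # weighted sum of inputs
a = sigmoid(z)                   # output of the sigmoid unit, always in (0, 1)
print(z, a)
```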
Linear Vs. Non-Linear Separability
Suppose we have a typical classification problem, where we have a set of points in space and each point is assigned a class label. If a straight line (or a hyperplane in an n-dimensional space) can divide the two classes, then we have a linearly separable problem. On the other hand, if a straight line is not enough to divide the two classes, then we have a non-linearly separable problem. The figure below shows data in a two-dimensional space. Each point is assigned a red or blue class label. The left figure shows a linearly separable problem that requires a linear boundary to distinguish between the two classes. The right figure shows a non-linearly separable problem, where a non-linear decision boundary is required.

Linear vs. non-linearly separable problems
For a three-dimensional space, a linear decision boundary can be described by the equation of a plane. For an n-dimensional space, the linear decision boundary is described by the equation of a hyperplane.
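To make the distinction concrete, here is a small sketch that generates two toy datasets with NumPy: one where a straight line separates the classes, and one where the boundary is a circle. The specific labelling rules are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linearly separable: the class depends on which side of the line x1 + x2 = 0 a point falls
X_linear = rng.normal(size=(200, 2))
y_linear = (X_linear[:, 0] + X_linear[:, 1] > 0).astype(int)

# Non-linearly separable: the class depends on the distance from the origin (circular boundary)
X_circle = rng.normal(size=(200, 2))
y_circle = (np.hypot(X_circle[:, 0], X_circle[:, 1]) > 1.0).astype(int)

print(np.bincount(y_linear), np.bincount(y_circle))  # class counts for each toy dataset
```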
Why Is The Sigmoid Function Important In Neural Networks?
If we use a linear activation function in a neural network, then this model can only learn linearly separable problems. However, with the addition of just one hidden layer and a sigmoid activation function in the hidden layer, the neural network can easily learn a non-linearly separable problem. Using a non-linear function produces non-linear boundaries, and hence the sigmoid function can be used in neural networks for learning complex decision functions.
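As an illustrative sketch (assuming scikit-learn is available), a small network with one hidden layer of sigmoid units can fit the XOR problem, which no linear model can separate. The hidden layer size, solver, and random seed are arbitrary choices; with a different seed the small network may need more units or iterations to fit XOR exactly:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# XOR: the classic non-linearly separable problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# One hidden layer of sigmoid ("logistic") units
clf = MLPClassifier(hidden_layer_sizes=(8,), activation="logistic",
                    solver="lbfgs", max_iter=5000, random_state=0)
clf.fit(X, y)
print(clf.predict(X))   # ideally [0 1 1 0]
```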
A non-linear function that can be used as an activation function in a neural network must be monotonically increasing. So, for example, sin(x) or cos(x) cannot be used as activation functions. Also, the activation function should be defined everywhere and should be continuous everywhere in the space of real numbers. The function is also required to be differentiable over the entire space of real numbers.
Typically, a back propagation algorithm uses gradient descent to learn the weights of a neural network. To derive this algorithm, the derivative of the activation function is required.
The fact that the sigmoid function is monotonic, continuous and differentiable everywhere, coupled with the property that its derivative can be expressed in terms of itself, makes it easy to derive the update equations for learning the weights in a neural network via the back propagation algorithm.
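The derivative identity σ'(x) = σ(x)(1 − σ(x)) and a single gradient-descent update for one sigmoid unit can be sketched as below. The inputs, weights, target, and learning rate are arbitrary illustrative values, and this is a toy single-unit update rather than a full back propagation implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # The derivative expressed in terms of the sigmoid itself
    s = sigmoid(x)
    return s * (1.0 - s)

# One gradient-descent step for a single sigmoid unit with squared-error loss
x = np.array([0.5, -1.0])   # inputs (arbitrary)
w = np.array([0.1, 0.2])    # weights (arbitrary)
target = 1.0
lr = 0.5                    # learning rate

z = np.dot(w, x)            # weighted sum
a = sigmoid(z)              # unit output
grad_w = (a - target) * sigmoid_derivative(z) * x   # chain rule: dL/dw for L = 0.5*(a - target)**2
w_new = w - lr * grad_w     # gradient-descent update of the weights
print(a, grad_w, w_new)
```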
Extensions
This section lists some ideas for extending the tutorial that you may wish to explore.
- Other non-linear activation functions, e.g., the tanh function
- A Gentle Introduction to the Rectified Linear Unit (ReLU)
- Deep learning
If you explore any of these extensions, I’d love to know. Post your findings in the comments below.
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Tutorials
- Calculus in action: Neural networks
- A gentle introduction to gradient descent procedure
- Neural networks are function approximation algorithms
- How to Choose an Activation Function for Deep Learning
Resources
- Jason Brownlee’s excellent resource on Calculus Books for Machine Learning
Books
- Pattern recognition and machine learning by Christopher M. Bishop.
- Deep learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville.
- Thomas’ Calculus, 14th edition, 2017 (based on the original works of George B. Thomas, revised by Joel Hass, Christopher Heil, Maurice Weir).
Summary
In this tutorial, you discovered what the sigmoid function is. Specifically, you learned:
- The sigmoid function and its properties
- Linear vs. non-linear decision boundaries
- Why adding a sigmoid function at the hidden layer enables a neural network to learn complex non-linear boundaries
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.