A Gentle Introduction To Sigmoid Function
Whether you implement a neural network yourself or use a built-in library for neural network learning, it is of paramount importance to understand the significance of the sigmoid function. The sigmoid function is the key to understanding how a neural network learns complex problems. This function also served as a basis for discovering other functions that lead to efficient and good solutions for supervised learning in deep learning architectures.
In this tutorial, you will discover the sigmoid function and its role in learning from examples in neural networks.
After completing this tutorial, you will know:
- The sigmoid function
- Linear vs. non-linear separability
- Why a neural network can make complex decision boundaries if a sigmoid unit is used
Let’s get started.

A Gentle Introduction to sigmoid function. Photo by Mehreen Saeed, some rights reserved.
Tutorial Overview
This tutorial is divided into three parts; they are:
- The sigmoid function and its properties
- Linear vs. non-linearly separable problems
- Using a sigmoid as an activation function in neural networks
Sigmoid Function
The sigmoid function is a special case of the logistic function and is usually denoted by σ(x) or sig(x). It is given by:
σ(x) = 1/(1+exp(-x))
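As a quick illustration, the formula can be translated directly into a few lines of NumPy. This is a minimal sketch; the helper name `sigmoid` is just a convenient choice, not part of any particular library:

```python
import numpy as np

def sigmoid(x):
    """Sigmoid of x: 1 / (1 + exp(-x)), computed element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))                          # 0.5
print(sigmoid(np.array([-2.0, 0.0, 2.0])))   # each value lies strictly between 0 and 1
```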
Properties and Identities Of Sigmoid Function
The graph of the sigmoid function is an S-shaped curve, as shown by the green line in the graph below. The figure also shows the graph of the derivative in pink. The expression for the derivative, along with some important properties, is shown on the right.

Graph of the sigmoid function and its derivative. Some important properties are also shown.
A few other properties include:
- Domain: (-∞, +∞)
- Range: (0, +1)
- σ(0) = 0.5
- The function is monotonically increasing.
- The function is continuous everywhere.
- The function is differentiable everywhere in its domain.
- Numerically, it is enough to compute this function’s value over a small range of numbers, e.g., [-10, +10]. For values less than -10, the function’s value is almost zero. For values greater than 10, the function’s values are almost one. (See the quick numerical check after this list.)
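The snippet below is a minimal sketch of how these properties can be verified numerically with NumPy; the grid over [-10, +10] mirrors the range mentioned above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-10, 10, 1001)
y = sigmoid(x)

print(y.min() > 0 and y.max() < 1)    # True: the values stay inside (0, 1)
print(np.all(np.diff(y) > 0))         # True: monotonically increasing on this grid
print(sigmoid(0.0))                   # 0.5
print(sigmoid(-10.0), sigmoid(10.0))  # ~4.5e-05 and ~0.99995: effectively 0 and 1
```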
The Sigmoid As A Squashing Function
The sigmoid function is also called a squashing function, as its domain is the set of all real numbers and its range is (0, 1). Hence, if the input to the function is either a very large negative number or a very large positive number, the output is always between 0 and 1. The same goes for any number between -∞ and +∞.
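As a side note, computing exp(-x) naively can overflow for very large negative inputs, so a split formulation is often used in practice. The sketch below is my own illustration (not code from the tutorial) showing that even extreme inputs are squashed into (0, 1):

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid computed without overflow: uses exp(-x) for x >= 0 and exp(x) for x < 0."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    e = np.exp(x[~pos])
    out[~pos] = e / (1.0 + e)
    return out

print(stable_sigmoid([-1000.0, -5.0, 0.0, 5.0, 1000.0]))
# [0.         0.00669285 0.5        0.99330715 1.        ]
```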
Sigmoid As An Activation Function In Neural Networks
The sigmoid function is used as an activation function in neural networks. Just to review what an activation function is, the figure below shows the role of an activation function in one layer of a neural network. A weighted sum of inputs is passed through an activation function, and this output serves as an input to the next layer.

A sigmoid unit in a neural network
When the activation function for a neuron is a sigmoid function, it is a guarantee that the output of this unit will always be between 0 and 1. Also, as the sigmoid is a non-linear function, the output of this unit is a non-linear function of the weighted sum of inputs. Such a neuron that employs a sigmoid function as an activation function is called a sigmoid unit.
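A minimal sketch of a single sigmoid unit follows; the input values, weights, and bias are arbitrary numbers chosen purely for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([0.5, -1.2, 3.0])   # inputs to the unit (arbitrary values)
w = np.array([0.4, 0.7, -0.2])   # one weight per input (arbitrary values)
b = 0.1                          # bias term

z = np.dot(w, x) + b             # weighted sum of inputs
a = sigmoid(z)                   # output of the sigmoid unit, always in (0, 1)
print(z, a)
```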
Linear Vs. Non-Linear Separability
Suppose we have a typical classification problem, where we have a set of points in space and each point is assigned a class label. If a straight line (or a hyperplane in an n-dimensional space) can divide the two classes, then we have a linearly separable problem. On the other hand, if a straight line is not enough to divide the two classes, then we have a non-linearly separable problem. The figure below shows data in a two-dimensional space. Each point is assigned a red or blue class label. The left figure shows a linearly separable problem that requires a linear boundary to distinguish between the two classes. The right figure shows a non-linearly separable problem, where a non-linear decision boundary is required.

Linear vs. non-linearly separable problems
For a three-dimensional space, a linear decision boundary can be described by the equation of a plane. For an n-dimensional space, the linear decision boundary is described by the equation of a hyperplane.
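To make the distinction concrete, here is a small sketch that generates two toy datasets with NumPy: one where a straight line separates the classes, and one where the boundary is a circle. The specific labelling rules are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linearly separable: the class depends on which side of the line x1 + x2 = 0 a point falls
X_linear = rng.normal(size=(200, 2))
y_linear = (X_linear[:, 0] + X_linear[:, 1] > 0).astype(int)

# Non-linearly separable: the class depends on the distance from the origin (circular boundary)
X_circle = rng.normal(size=(200, 2))
y_circle = (np.hypot(X_circle[:, 0], X_circle[:, 1]) > 1.0).astype(int)

print(np.bincount(y_linear), np.bincount(y_circle))  # class counts for each toy dataset
```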
Why Is The Sigmoid Function Important In Neural Networks?
If we use a linear activation function in a neural network, then this model can only learn linearly separable problems. However, with the addition of just one hidden layer and a sigmoid activation function in the hidden layer, the neural network can easily learn a non-linearly separable problem. Using a non-linear function produces non-linear boundaries, and hence the sigmoid function can be used in neural networks for learning complex decision functions.
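As an illustrative sketch (assuming scikit-learn is available), a small network with one hidden layer of sigmoid units can fit the XOR problem, which no linear model can separate. The hidden layer size, solver, and random seed are arbitrary choices; with a different seed the small network may need more units or iterations to fit XOR exactly:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# XOR: the classic non-linearly separable problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# One hidden layer of sigmoid ("logistic") units
clf = MLPClassifier(hidden_layer_sizes=(8,), activation="logistic",
                    solver="lbfgs", max_iter=5000, random_state=0)
clf.fit(X, y)
print(clf.predict(X))   # ideally [0 1 1 0]
```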
A non-linear function that can be used as an activation function in a neural network must be monotonically increasing. So, for example, sin(x) or cos(x) cannot be used as activation functions. Also, the activation function should be defined everywhere and should be continuous everywhere in the space of real numbers. The function is also required to be differentiable over the entire space of real numbers.
Typically, a back propagation algorithm uses gradient descent to learn the weights of a neural network. To derive this algorithm, the derivative of the activation function is required.
The fact that the sigmoid function is monotonic, continuous and differentiable everywhere, coupled with the property that its derivative can be expressed in terms of itself, makes it easy to derive the update equations for learning the weights in a neural network via the back propagation algorithm.
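The derivative identity σ'(x) = σ(x)(1 − σ(x)) and a single gradient-descent update for one sigmoid unit can be sketched as below. The inputs, weights, target, and learning rate are arbitrary illustrative values, and this is a toy single-unit update rather than a full back propagation implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # The derivative expressed in terms of the sigmoid itself
    s = sigmoid(x)
    return s * (1.0 - s)

# One gradient-descent step for a single sigmoid unit with squared-error loss
x = np.array([0.5, -1.0])   # inputs (arbitrary)
w = np.array([0.1, 0.2])    # weights (arbitrary)
target = 1.0
lr = 0.5                    # learning rate

z = np.dot(w, x)            # weighted sum
a = sigmoid(z)              # unit output
grad_w = (a - target) * sigmoid_derivative(z) * x   # chain rule: dL/dw for L = 0.5*(a - target)**2
w_new = w - lr * grad_w     # gradient-descent update of the weights
print(a, grad_w, w_new)
```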
Extensions
This section lists some ideas for extending the tutorial that you may wish to explore.
- Other non-linear activation functions, e.g., the tanh function
- A Gentle Introduction to the Rectified Linear Unit (ReLU)
- Deep learning
If you explore any of these extensions, I’d love to know. Post your findings in the comments below.
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Tutorials
- Calculus in action: Neural networks
- A gentle introduction to gradient descent procedure
- Neural networks are function approximation algorithms
- How to Choose an Activation Function for Deep Learning
Resources
- Jason Brownlee’s excellent resource on Calculus Books for Machine Learning
Books
- Pattern recognition and machine learning by Christopher M. Bishop.
- Deep learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville.
- Thomas’ Calculus, 14th edition, 2017 (based on the original works of George B. Thomas, revised by Joel Hass, Christopher Heil, Maurice Weir).
Summary
In this tutorial, you discovered what the sigmoid function is. Specifically, you learned:
- The sigmoid function and its properties
- Linear vs. non-linear decision boundaries
- Why adding a sigmoid function at the hidden layer enables a neural network to learn complex non-linear boundaries
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.