ML - SVM (Support Vector Machine) Technique in a Nutshell

ss7dec · July 1, 2020, 8:36pm

Introduction
Support Vector Machines (SVMs) are supervised machine learning algorithms. They are chiefly used for:

a) Linear Classification
b) Non-Linear Classification
c) SVM Regression
d) Outlier detection

The basic principle behind SVM is simple: it finds a hyperplane that separates the dataset into classes. In two dimensions that hyperplane is a line, and the goal is to place it so that the classes are divided as clearly as possible. A single SVM separates two classes; problems with more classes are handled by combining several binary classifiers.

Support Vectors

Vectors (training samples) located closest to the classifier's decision boundary.
Equivalently, the training samples located at the edge of the street (the margin).
A dataset suits SVM well when its classes are well separated, which allows a wider street.

Here we try to fit the widest possible street between the two classes. The data points that lie on the edge of the street are called Support Vectors.
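As a minimal sketch of the idea above, assuming scikit-learn is installed, the snippet below fits a linear SVM on a small made-up dataset and prints the points that ended up as support vectors. The data values and the large C (used here to approximate a hard margin) are illustrative assumptions, not part of the original write-up.

```python
from sklearn.svm import SVC

# Hypothetical toy data: two well-separated groups in 2-D.
X = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
     [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]]
y = [0, 0, 0, 1, 1, 1]

# A large C approximates a hard margin (almost no points allowed inside the street).
clf = SVC(kernel="linear", C=1000.0)
clf.fit(X, y)

# Only the points on the edge of the street are kept as support vectors.
print("Support vectors:")
print(clf.support_vectors_)

predictions = clf.predict([[1.5, 1.5], [4.5, 4.5]])
print("Predictions:", predictions)
```

Only a few of the six training points should be reported as support vectors; the rest could be removed without moving the street.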

Before training an SVM, it is necessary to scale the features. SVMs are sensitive to feature scales: if the dataset is not scaled, the resulting street tends to be much narrower.
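One common way to apply the scaling step is to chain a scaler and the SVM in a single pipeline, so the same scaling is reused at prediction time. The sketch below assumes scikit-learn; the synthetic dataset and the factor of 1000 applied to one feature are made up purely to imitate badly scaled data.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data (an assumption for illustration), with one feature
# deliberately blown up to a much larger scale than the others.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X[:, 0] *= 1000.0

# StandardScaler brings every feature to zero mean and unit variance
# before the SVM sees it.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X, y)

# Inspect what the SVM actually receives after scaling.
scaled_X = model.named_steps["standardscaler"].transform(X)
print("Feature means after scaling:", scaled_X.mean(axis=0))
print("Feature std devs after scaling:", scaled_X.std(axis=0))
```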

Kernel:
In SVM, a kernel is a mathematical function that transforms the input data into the form required for the learning task, typically a higher-dimensional feature space in which the classes become easier to separate. Choosing a suitable kernel makes it possible to obtain accurate classifiers.

Kernel methods are used for Pattern Analysis, which studies general types of patterns or relations in data, viz.:
• Classification
• Correlation
• Clustering
• Ranking &
• Principal Components

Most kernel algorithms are based on Convex Optimization or Eigenvalue problems.

Functions of the Kernel:
a) To take the data as the input
b) Thereafter, to transform that data into the form required for the task

Different SVM algorithms use different types of kernel functions. Some of the commonly used kernel types are listed below:

• Polynomial kernel – Popular in image processing.
• Gaussian kernel – Used as a general-purpose kernel, especially when there is no prior knowledge about the data.
• Gaussian Radial Basis Function (RBF) – Another name for the Gaussian kernel, with the same general-purpose use when there is no prior knowledge about the data.
• Hyperbolic tangent kernel – Used in Neural Networks.
• Sigmoid kernel – Used as a proxy for Neural Networks; in scikit-learn it is defined with the hyperbolic tangent function.
• ANOVA radial basis kernel – Used in Regression problems.
• Linear splines kernel in one dimension – Useful when the vectorized data is sparse, for example in text categorization. Also used for Regression.

Amongst them, the most commonly used kernel function is RBF (Radial Basis Function). It is also the default kernel in scikit-learn's SVC. This is because it has a localized and finite response along the entire x-axis.
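The "localized" response can be seen from the RBF formula K(x, z) = exp(-gamma * ||x - z||^2), which is the form given in the scikit-learn documentation. The plain-Python sketch below is only an illustration; the function name rbf_kernel_value and the sample points are my own choices.

```python
import math

def rbf_kernel_value(x, z, gamma=1.0):
    """RBF kernel between two points: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

origin = [0.0, 0.0]
same = rbf_kernel_value(origin, [0.0, 0.0])  # identical points
near = rbf_kernel_value(origin, [0.5, 0.0])  # a close neighbour
far = rbf_kernel_value(origin, [5.0, 0.0])   # a distant point

print("same:", same)
print("near:", near)
print("far:", far)
```

The value is highest for identical points and shrinks towards zero as the points move apart, so each training point mainly influences its own neighbourhood.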

Kernel Trick

Helps in solving non-linear problems, such as polynomial or quadratic ones.
Avoids the explicit feature mapping that linear algorithms would otherwise need in order to learn a non-linear function.
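To make the Kernel Trick concrete, here is a small worked example in plain Python for a degree-2 polynomial kernel on 2-D points. The explicit feature map phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2) is the standard textbook mapping for K(x, z) = (x . z)^2; the function names and sample points are assumptions made for this sketch.

```python
import math

def explicit_map(x):
    """Explicit degree-2 feature map for a 2-D point (3-D output)."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """Same quantity computed in the original 2-D space: (x . z)^2."""
    return dot(x, z) ** 2

x = [1.0, 2.0]
z = [3.0, 4.0]

# Route 1: map both points to the higher-dimensional space, then take the dot product.
explicit = dot(explicit_map(x), explicit_map(z))

# Route 2: the kernel trick, which never builds the mapped vectors.
implicit = poly2_kernel(x, z)

print("explicit mapping:", explicit)
print("kernel trick:", implicit)
```

Both routes give the same number, but the second one never leaves the original 2-D space. For high-degree or RBF kernels the mapped space is very large or infinite, which is why avoiding the explicit mapping matters.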

Hyperparameters of SVM ML Technique
The following hyperparameters need to be tuned:

a) Kernel – Transforms the input data into the required form. Using an appropriate kernel helps in obtaining accurate classifiers.

b) Regularization – Controlled by the C parameter.

Higher “C” – Narrower street and fewer margin violations. If C is too high, the model may not generalize well (risk of overfitting).
Lower “C” – Wider street and more margin violations. The model usually generalizes better, but if C is too low it may underfit.
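The effect of C on the street width can be checked with a linear SVM, where the width equals 2 / ||w||. The sketch below assumes scikit-learn and NumPy; the overlapping blob data and the two C values are made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Two overlapping clusters (synthetic data, an assumption for this sketch).
X, y = make_blobs(n_samples=200, centers=[[-1.0, -1.0], [1.0, 1.0]],
                  cluster_std=1.0, random_state=0)
X = StandardScaler().fit_transform(X)

results = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # For a linear SVM the street width is 2 / ||w||.
    width = 2.0 / np.linalg.norm(clf.coef_)
    results[C] = (width, int(clf.n_support_.sum()))
    print(f"C={C}: street width={width:.3f}, support vectors={results[C][1]}")
```

The run with the smaller C should report the wider street, at the price of more points sitting inside it.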

c) Gamma – It defines how far the influence of a single training example reaches.

A low value of Gamma means far reach: points at greater distances are also considered.
A high value of Gamma means close reach: only nearby points are considered.

Thus in brief, decreasing Gamma lets more distant data points influence the decision boundary, which makes it smoother. Increasing Gamma makes the boundary follow individual points more closely, which can lead to overfitting.
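A rough way to see the Gamma behaviour is to train the same RBF SVM with a low, a moderate and a very high Gamma and compare training and test accuracy. This is only a sketch assuming scikit-learn; the make_moons data, the split and the three Gamma values are illustrative choices.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Noisy, non-linear synthetic data (an assumption for this sketch).
X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

scores = {}
for gamma in (0.01, 1.0, 1000.0):
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X_train, y_train)
    scores[gamma] = (
        clf.score(X_train, y_train),  # accuracy on data the model has seen
        clf.score(X_test, y_test),    # accuracy on held-out data
        len(clf.support_),            # number of support vectors
    )
    print(f"gamma={gamma}: train={scores[gamma][0]:.2f}, "
          f"test={scores[gamma][1]:.2f}, support vectors={scores[gamma][2]}")
```

With a very high Gamma each training point only influences its immediate surroundings, so the model tends to memorize the training set (high training accuracy, many support vectors) without a matching gain on the test set.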

Advantages of SVM

Guaranteed global optimum, because training is a convex optimization problem.
Can be applied to both linear and non-linear data.
Compatible with semi-supervised learning: it can be used in areas where some of the data is labeled and some is unlabeled.
The Kernel Trick handles feature mapping implicitly.
Uses only a subset of the training points (the support vectors) in the decision function, which makes it memory efficient.
Different kernel functions can be specified for the decision function, and custom kernels can also be supplied as per the requirements.
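On the point about custom kernels: scikit-learn's SVC accepts a Python function as the kernel, provided it returns the kernel (Gram) matrix for two arrays of samples. The sketch below uses a hand-written dot-product kernel, named my_linear_kernel here purely for illustration, and compares it with the built-in linear kernel on made-up data.

```python
import numpy as np
from sklearn.svm import SVC

def my_linear_kernel(A, B):
    """Hypothetical custom kernel: plain dot product, returned as a Gram matrix."""
    return np.dot(A, B.T)

# Small made-up dataset with two well-separated groups.
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

custom = SVC(kernel=my_linear_kernel).fit(X, y)
builtin = SVC(kernel="linear").fit(X, y)

X_new = np.array([[1.5, 1.0], [4.5, 5.0]])
custom_pred = custom.predict(X_new)
builtin_pred = builtin.predict(X_new)
print("custom kernel:", custom_pred)
print("built-in linear kernel:", builtin_pred)
```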

Disadvantages of SVM

Does not directly return a probabilistic confidence value the way logistic regression does.
Cannot handle raw text directly; text must first be converted into numeric feature vectors.
Choice of the kernel is perhaps the biggest hurdle in SVM modelling. As there are many varieties of kernels, it is difficult to choose the most appropriate one for a given dataset.
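One practical way to soften the kernel-selection problem is to let cross-validation compare a few candidates. The sketch below assumes scikit-learn; the dataset, the candidate kernels and the C and gamma grids are illustrative assumptions, not recommended values.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic non-linear data (an assumption for this sketch).
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Scaling and the SVM are tuned together as one pipeline.
pipe = make_pipeline(StandardScaler(), SVC())

# Candidate settings to compare; gamma is ignored by the linear kernel.
param_grid = {
    "svc__kernel": ["linear", "poly", "rbf", "sigmoid"],
    "svc__C": [0.1, 1.0, 10.0],
    "svc__gamma": ["scale", 0.1, 1.0],
}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print("Best settings:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

This does not remove the need for judgment, since the search only covers the kernels and values placed in the grid, but it replaces guesswork with a measured comparison.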

Uses of SVM in Real-World Scenarios

Face Detection & Identification
Fingerprint identification
Classification of images, especially in Security, Defence, Crime, Legal, Supply-Chain Management, Cargo Shipments etc.
Bioinformatics
Handwriting analysis
Text & Hypertext Categorization
Geospatial Science
Environmental Science

References:

https://en.wikipedia.org/wiki/Kernel_method

https://scikit-learn.org/stable/modules/svm.html#kernel-functions

NOTE: This is purely my understanding of the concepts used in SVM, based on reading from various sources. Looking forward to further suggestions in case anything has been overlooked.