Support Vector Machine Classifier¶
Note
Support Vector Machine Classifier is also available with oneAPI interfaces.
Support Vector Machine (SVM) is among the most popular classification algorithms. It belongs to the family of generalized linear classifiers. SVM itself is a binary classifier: it covers two-class problems only. For a multiclass case, use SVM in conjunction with the Multi-Class Classifier framework of the library.
Details¶
Given \(n\) feature vectors \(x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\) of size \(p\) and a vector of class labels \(y = (y_1, \ldots, y_n)\), where \(y_i \in \{-1, 1\}\) describes the class to which the feature vector \(x_i\) belongs, the problem is to build a two-class Support Vector Machine (SVM) classifier.
Training Stage¶
oneDAL provides two methods to train the SVM model:
Boser method: a performance-oriented variant of the Boser [Boser92] and Platt [Platt98] algorithms
Thunder method [Wen2018]
The SVM model is trained to solve the quadratic optimization problem

\[\min_{\alpha} \frac{1}{2} \alpha^T Q \alpha - e^T \alpha\]

with \(0 \leq \alpha_i \leq C\), \(i = 1, \ldots, n\), \(y^T \alpha = 0\), where \(e\) is the vector of ones, \(C\) is the upper bound of the coordinates of the vector \(\alpha\), \(Q\) is a symmetric matrix of size \(n \times n\) with \(Q_{ij} = y_i y_j K(x_i, x_j)\), and \(K(x,y)\) is a kernel function.
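As an illustration of how \(Q\) is assembled, the following NumPy sketch (the function names are illustrative, not part of the oneDAL API) builds \(Q_{ij} = y_i y_j K(x_i, x_j)\) for a linear kernel \(K(x, y) = \langle x, y \rangle\):

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix K with K_ij = <x_i, x_j> (linear kernel)."""
    return X @ X.T

def build_Q(X, y):
    """Matrix of the dual problem: Q_ij = y_i * y_j * K(x_i, x_j)."""
    return np.outer(y, y) * linear_kernel(X)

# Toy data: three feature vectors of size p = 2 with labels in {-1, 1}.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([1.0, -1.0, 1.0])
Q = build_Q(X, y)

# Q is symmetric by construction, as the formulation requires.
assert np.allclose(Q, Q.T)
print(Q)
```

Note that \(Q\) has \(n^2\) entries, which is why the cache size discussed below is bounded by \(n^2\) elements.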
The working subset of \(\alpha\), updated on each iteration of the algorithm, is selected according to the Working Set Selection (WSS) 3 scheme [Fan05]. The scheme can be optimized using one of these techniques or both:
Cache: the implementation can allocate a predefined amount of memory to store intermediate results of the kernel computation.
Shrinking: the implementation can try to decrease the amount of kernel-related computations (see [Joachims99]).
The solution of the problem defines the separating hyperplane and corresponding decision function \(D(x)= \sum_{k} {y_k \alpha_k K(x_k, x)} + b\) where only those \(x_k\) that correspond to nonzero \(\alpha_k\) appear in the sum, and \(b\) is a bias. Each nonzero \(\alpha_k\) is called a classification coefficient and the corresponding \(x_k\) is called a support vector.
Prediction Stage¶
Given the SVM classifier and \(r\) feature vectors \(x_1, \ldots, x_r\), the problem is to calculate the signed value of the decision function \(D(x_i)\), \(i=1, \ldots, r\). The sign of the value defines the class of the feature vector, and the absolute value of the function is a multiple of the distance between the feature vector and separating hyperplane.
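The prediction rule can be sketched in plain NumPy (a toy model for illustration, not the library's API): evaluate \(D(x) = \sum_{k} y_k \alpha_k K(x_k, x) + b\) over the support vectors, then classify by the sign of the result.

```python
import numpy as np

def decision_function(x, support_vectors, y_sv, alpha_sv, b):
    """D(x) = sum_k y_k * alpha_k * K(x_k, x) + b with a linear kernel."""
    k = support_vectors @ np.asarray(x, dtype=float)  # K(x_k, x) = <x_k, x>
    return float(np.sum(y_sv * alpha_sv * k) + b)

# Hypothetical trained model: two support vectors with non-zero coefficients.
support_vectors = np.array([[1.0, 0.0], [0.0, 1.0]])
y_sv = np.array([1.0, -1.0])      # labels of the support vectors
alpha_sv = np.array([0.5, 0.5])   # non-zero alpha_k
b = 0.0                           # bias

d = decision_function([2.0, 0.0], support_vectors, y_sv, alpha_sv, b)
label = 1 if d >= 0 else -1       # the sign of D(x) defines the class
```

Here `d` is \(1.0\), so the vector is assigned to class \(+1\); its magnitude is proportional to the distance from the separating hyperplane.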
Usage of Training Alternative¶
To build a Support Vector Machine (SVM) Classifier model using methods of the Model Builder class of SVM Classifier, complete the following steps:

1. Create an SVM Classifier model builder using a constructor with the required number of support vectors and features.
2. In any sequence:
   - Use the setSupportVectors, setClassificationCoefficients, and setSupportIndices methods to add pre-calculated support vectors, classification coefficients, and support indices (optional), respectively, to the model. For each method, specify random access iterators to the first and the last element of the corresponding set of values [ISO/IEC 14882:2011 § 24.2.7].
   - Use setBias to add a bias term to the model.
3. Use the getModel method to get the trained SVM Classifier model.
4. Use the getStatus method to check the status of the model building process. If the DAAL_NOTHROW_EXCEPTIONS macro is defined, the status report contains the list of errors that describe problems the API encountered (in case of an API runtime failure).

Note

If you call the setBias, setSupportVectors, setClassificationCoefficients, or setSupportIndices methods after calling getModel, the initial model is automatically updated with the new set of parameters.
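The workflow above can be mirrored in a small Python sketch. The class and method names below only mimic the Model Builder steps; they are not the oneDAL C++/Java interface:

```python
import numpy as np

class SvmModelBuilder:
    """Toy analogue of the Model Builder workflow (illustrative only)."""

    def __init__(self, n_support_vectors, n_features):
        self.sv = np.zeros((n_support_vectors, n_features))
        self.coeffs = np.zeros(n_support_vectors)  # y_k * alpha_k per vector
        self.bias = 0.0

    def set_support_vectors(self, sv):
        self.sv = np.asarray(sv, dtype=float)
        return self

    def set_classification_coefficients(self, coeffs):
        self.coeffs = np.asarray(coeffs, dtype=float)
        return self

    def set_bias(self, bias):
        self.bias = float(bias)
        return self

    def get_model(self):
        # The "model" is the decision function D(x) = sum_k c_k <x_k, x> + b.
        sv, coeffs, bias = self.sv, self.coeffs, self.bias
        def decision(x):
            return float(coeffs @ (sv @ np.asarray(x, dtype=float)) + bias)
        return decision

# Assemble a model from pre-calculated values, in any setter order.
model = (SvmModelBuilder(n_support_vectors=2, n_features=2)
         .set_support_vectors([[1.0, 0.0], [0.0, 1.0]])
         .set_classification_coefficients([0.5, -0.5])
         .set_bias(0.0)
         .get_model())

d = model([2.0, 0.0])  # signed value of the decision function
```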
Examples¶
Note
There is no support for Java on GPU.
Batch Processing¶
SVM classifier follows the general workflow described in Classification Usage Model.
Training¶
For a description of the input and output, refer to Usage Model: Training and Prediction.
At the training stage, SVM classifier has the following parameters:
| Parameter | Default Value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | The computation method used by the SVM classifier. Available methods for the training stage: for CPU, defaultDense (the Boser method) or thunder; for GPU, thunder. |
| nClasses | \(2\) | The number of classes. |
| C | \(1.0\) | The upper bound in conditions of the quadratic optimization problem. |
| accuracyThreshold | \(0.001\) | The training accuracy. |
| tau | \(1.0 \cdot 10^{-6}\) | Tau parameter of the WSS scheme. |
| maxIterations | \(1000000\) | Maximal number of iterations for the algorithm. |
| cacheSize | \(8000000\) | The size of cache in bytes for storing values of the kernel matrix. A non-zero value enables use of a cache optimization technique. |
| doShrinking | true | A flag that enables use of a shrinking optimization technique. Note: this parameter is only supported for the defaultDense method. |
| kernel | Pointer to an object of the KernelIface class | The kernel function. By default, the algorithm uses a linear kernel. |
Prediction¶
For a description of the input and output, refer to Usage Model: Training and Prediction.
At the prediction stage, SVM classifier has the following parameters:
| Parameter | Default Value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Performance-oriented computation method, the only prediction method supported by the algorithm. |
| nClasses | \(2\) | The number of classes. |
| kernel | Pointer to an object of the KernelIface class | The kernel function. By default, the algorithm uses a linear kernel. |
Examples¶
Performance Considerations¶
For the best performance of the SVM classifier, use homogeneous numeric tables if your input data set is homogeneous or SOA numeric tables otherwise.
Performance of the SVM algorithm greatly depends on the cache size (the cacheSize parameter). A larger cache size typically results in better performance. For the best SVM algorithm performance, set cacheSize equal to \(n^2 \cdot \text{sizeof(algorithmFPType)}\). However, avoid setting the cache size to a value larger than the number of bytes required to store \(n^2\) data elements, because the algorithm does not fully utilize the cache in this case.
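The recommendation above amounts to a one-line computation. This hypothetical helper (not part of the library) shows the resulting byte count for \(n = 1000\) training vectors:

```python
import numpy as np

def recommended_cache_size(n, fp_type=np.float64):
    """cacheSize = n^2 * sizeof(algorithmFPType) bytes: enough to hold the
    full n-by-n kernel matrix; anything larger goes unused."""
    return n * n * np.dtype(fp_type).itemsize

print(recommended_cache_size(1000))              # double: 8 bytes/element -> 8000000
print(recommended_cache_size(1000, np.float32))  # float: 4 bytes/element -> 4000000
```

For \(n = 1000\) and double precision this matches the default cacheSize of \(8000000\) bytes.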
Optimization Notice 

Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804