# Naïve Bayes Classifier

Naïve Bayes is a family of simple yet powerful classification methods often used for text classification, medical diagnosis, and other classification problems. Despite the naïve assumption that features are independent of one another, Naïve Bayes classifiers often perform well even when this assumption does not hold. An advantage of this method is that it requires only a small amount of training data to estimate model parameters.

## Details

The library provides a multinomial Naïve Bayes classifier [Renie03].

Let $$J$$ be the number of classes, indexed $$0, 1, \ldots, J-1$$. The integer-valued feature vector $$x_i = (x_{i1}, \ldots, x_{ip})$$, $$i=1, \ldots, n$$, contains scaled frequencies: the value of $$x_{ik}$$ is the number of times the $$k$$-th feature is observed in the vector $$x_i$$ (in terms of the document classification problem, $$x_{ik}$$ is the number of occurrences of the word indexed $$k$$ in the document $$x_i$$). For a given data set of $$n$$ documents, $$(x_1, \ldots, x_n)$$, the problem is to train a Naïve Bayes classifier.
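As a concrete illustration of this setup, the sketch below builds such count feature vectors from a toy corpus with a bag-of-words representation. The corpus, vocabulary, and helper function are invented for illustration and are not part of the library:

```python
from collections import Counter

# Toy corpus: each document becomes an integer count vector x_i,
# where x_ik is the number of occurrences of the word indexed k.
docs = [
    "spam spam ham",
    "ham eggs ham",
]

# Vocabulary: every distinct word across the corpus, in sorted order.
vocab = sorted({w for d in docs for w in d.split()})  # ['eggs', 'ham', 'spam']

def count_vector(doc, vocab):
    """Map a document to its count feature vector over the vocabulary."""
    counts = Counter(doc.split())
    return [counts[w] for w in vocab]

X = [count_vector(d, vocab) for d in docs]
print(X)  # [[0, 1, 2], [1, 2, 0]]
```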

### Training Stage

The Training stage involves calculation of these parameters:

• $$\mathrm{log}\left({\theta }_{jk}\right)=\mathrm{log}\left(\frac{{N}_{jk}+{\alpha }_{k}}{{N}_{j}+\alpha }\right)$$, where $$N_{jk}$$ is the number of occurrences of the feature $$k$$ in the class $$j$$, $$N_j$$ is the total number of occurrences of all features in the class, $$\alpha_k$$ is a smoothing parameter (for example, $$\alpha_k = 1$$ gives Laplace smoothing), and $$\alpha$$ is the sum of all $$\alpha_k$$.

• $$\mathrm{log}\left(p\left({\theta }_{j}\right)\right)$$, where $$p(\theta_j)$$ is the prior class estimate.
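The two estimates above can be sketched in a few lines of NumPy. This is an illustrative reimplementation of the formulas, not the library's API; the toy arrays `X` and `y` are assumptions:

```python
import numpy as np

# Toy count features X (rows are documents) and class labels y.
X = np.array([[2, 1, 0],
              [1, 3, 0],
              [0, 0, 4],
              [0, 1, 3]], dtype=float)
y = np.array([0, 0, 1, 1])
J, p = 2, X.shape[1]

alpha_k = np.ones(p)   # e.g. alpha_k = 1 (Laplace smoothing)
alpha = alpha_k.sum()  # alpha is the sum of all alpha_k

# N_jk: occurrences of feature k in class j; N_j: total occurrences in class j.
N_jk = np.array([X[y == j].sum(axis=0) for j in range(J)])
N_j = N_jk.sum(axis=1, keepdims=True)

# log(theta_jk) = log((N_jk + alpha_k) / (N_j + alpha))
log_theta_jk = np.log((N_jk + alpha_k) / (N_j + alpha))

# log(p(theta_j)): log of the prior class estimate (here, class frequencies).
log_prior_j = np.log(np.bincount(y) / len(y))
```

Note that each row of `exp(log_theta_jk)` sums to one by construction, since the smoothing terms $$\alpha_k$$ in the numerators sum to the $$\alpha$$ in the denominator.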

### Prediction Stage

Given a new feature vector $$x_i$$, the classifier determines the class the vector belongs to:

$$\mathrm{class}\left({x}_{i}\right)=\mathrm{arg}\,{\mathrm{max}}_{j}\left(\mathrm{log}\left(p\left({\theta }_{j}\right)\right)+{\sum }_{k}{x}_{ik}\,\mathrm{log}\left({\theta }_{jk}\right)\right).$$
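The decision rule above is a single matrix-vector product followed by an argmax. A minimal sketch, assuming already-trained parameters (the numeric values below are invented for illustration and are not library output):

```python
import numpy as np

# Assumed trained parameters: log priors and per-class feature log-probabilities.
log_prior_j = np.log(np.array([0.5, 0.5]))
log_theta_jk = np.log(np.array([[0.4, 0.5, 0.1],
                                [0.1, 0.2, 0.7]]))

# New count feature vector x_i to classify.
x_i = np.array([0, 1, 3])

# Score per class: log p(theta_j) + sum_k x_ik * log(theta_jk).
scores = log_prior_j + log_theta_jk @ x_i
predicted = int(np.argmax(scores))
print(predicted)  # 1 — x_i is dominated by feature 2, which is likeliest in class 1
```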

## Computation

The following computation modes are available:

• Batch processing

• Online processing

• Distributed processing

## Examples

Batch Processing:

Online Processing:

Distributed Processing:

Note

There is no support for Java on GPU.


## Performance Considerations

### Training Stage

To get the best overall performance at the Naïve Bayes classifier training stage:

• If input data is homogeneous:

  • For the training data set, use a homogeneous numeric table of the same type as specified in the `algorithmFPType` class template parameter.

  • For class labels, use a homogeneous numeric table of type `int`.

• If input data is non-homogeneous, use AOS (Array of Structures) layout rather than SOA (Structure of Arrays) layout.

The training stage of the Naïve Bayes classifier algorithm is memory access bound in most cases. Therefore, use an efficient data layout whenever possible.

### Prediction Stage

To get the best overall performance at the Naïve Bayes classifier prediction stage:

• For the working data set, use a homogeneous numeric table of the same type as specified in the `algorithmFPType` class template parameter.

• For predicted labels, use a homogeneous numeric table of type `int`.

Optimization Notice

Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804