Correlation and Variance-Covariance Matrices

Variance-covariance and correlation matrices are among the most important quantitative measures of a data set: they characterize the statistical dependence between its variables.

Specifically, the covariance measures the extent to which two variables “fluctuate together” (that is, co-vary). The correlation is the covariance normalized by the standard deviations of the two variables, so that it lies between -1 and +1. A positive correlation indicates the extent to which the variables increase or decrease simultaneously. A negative correlation indicates the extent to which one variable increases while the other decreases. Values close to +1 or -1 indicate a high degree of linear dependence between the variables.

Details

Given a set \(X\) of \(n\) feature vectors \(x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\) of dimension \(p\), the problem is to compute the sample means and variance-covariance matrix or correlation matrix:

  • Means: \(M = (m(1), \ldots, m(p))\), where \(m(j)=\frac{1}{n}\sum_{i=1}^{n}x_{ij}\), \(j=\overline{1,p}\)

  • Variance-covariance matrix: \(Cov = (v_{ij})\), where \(v_{ij}=\frac{1}{n-1}\sum_{k=1}^{n}(x_{ki}-m(i))(x_{kj}-m(j))\), \(i=\overline{1,p}\), \(j=\overline{1,p}\)

  • Correlation matrix: \(Cor = (c_{ij})\), where \(c_{ij}=\frac{v_{ij}}{\sqrt{v_{ii}\cdot v_{jj}}}\), \(i=\overline{1,p}\), \(j=\overline{1,p}\)
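The definitions above can be checked on a small data set with a direct, unoptimized translation into Python. This is only a sketch for illustration, not the library's implementation: the function names and the sample data are hypothetical, and the code assumes \(n \geq 2\) and that no feature is constant (a zero variance would make the correlation undefined).

```python
import math


def means(X):
    """Column means: m(j) = (1/n) * sum over i of x_ij."""
    n = len(X)
    p = len(X[0])
    return [sum(row[j] for row in X) / n for j in range(p)]


def covariance_matrix(X):
    """Sample variance-covariance matrix with the 1/(n-1) normalization."""
    n = len(X)
    p = len(X[0])
    m = means(X)
    return [
        [sum((row[i] - m[i]) * (row[j] - m[j]) for row in X) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def correlation_matrix(X):
    """Correlation matrix: c_ij = v_ij / sqrt(v_ii * v_jj)."""
    v = covariance_matrix(X)
    p = len(v)
    return [
        [v[i][j] / math.sqrt(v[i][i] * v[j][j]) for j in range(p)]
        for i in range(p)
    ]


# Hypothetical data: n = 4 feature vectors of dimension p = 3.
# Feature 1 is twice feature 0; feature 2 decreases as feature 0 increases.
X = [
    [1.0, 2.0, 8.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 2.0],
]

M = means(X)
Cov = covariance_matrix(X)
Cor = correlation_matrix(X)
```

With this data, features 0 and 1 are perfectly linearly related in the same direction, so their correlation is +1 up to rounding, while features 0 and 2 move in opposite directions, so their correlation is -1 up to rounding.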

Computation

The following computation modes are available: batch processing, online processing, and distributed processing.

Examples

Note: There is no support for Java on GPU.

Batch Processing:

Online Processing:

Distributed Processing:
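This page does not describe how the online and distributed modes work internally. One common way to compute the same statistics from data that arrives in blocks, or that is split across nodes, is to accumulate sums and cross-products and combine them at the end, using the identity \(\sum_{k}(x_{ki}-m(i))(x_{kj}-m(j)) = \sum_{k}x_{ki}x_{kj} - \frac{1}{n}\sum_{k}x_{ki}\sum_{k}x_{kj}\). The sketch below illustrates that idea only. The class `PartialResult` and its methods are hypothetical names, not part of the library's API, and the raw-sums formula can lose precision when the means are large compared with the spread of the data.

```python
class PartialResult:
    """Running sums for one data stream or one node (hypothetical helper)."""

    def __init__(self, p):
        self.n = 0                                   # observations seen so far
        self.sums = [0.0] * p                        # sum over k of x_ki
        self.cross = [[0.0] * p for _ in range(p)]   # sum over k of x_ki * x_kj

    def update(self, block):
        """Fold one block of feature vectors into the running sums."""
        p = len(self.sums)
        for row in block:
            self.n += 1
            for i in range(p):
                self.sums[i] += row[i]
                for j in range(p):
                    self.cross[i][j] += row[i] * row[j]

    def merge(self, other):
        """Combine this partial result with one computed elsewhere."""
        p = len(self.sums)
        self.n += other.n
        for i in range(p):
            self.sums[i] += other.sums[i]
            for j in range(p):
                self.cross[i][j] += other.cross[i][j]

    def finalize(self):
        """Return (means, covariance matrix) from the accumulated sums."""
        n, p = self.n, len(self.sums)
        m = [s / n for s in self.sums]
        cov = [
            [(self.cross[i][j] - self.sums[i] * self.sums[j] / n) / (n - 1)
             for j in range(p)]
            for i in range(p)
        ]
        return m, cov


# Hypothetical data split into two blocks of two feature vectors each.
blocks = [
    [[1.0, 2.0, 8.0], [2.0, 4.0, 6.0]],
    [[3.0, 6.0, 4.0], [4.0, 8.0, 2.0]],
]

# Online style: a single accumulator updated block by block.
online = PartialResult(3)
for block in blocks:
    online.update(block)
online_means, online_cov = online.finalize()

# Distributed style: one accumulator per node, merged on a master node.
partials = []
for block in blocks:
    node = PartialResult(3)
    node.update(block)
    partials.append(node)
master = PartialResult(3)
for node in partials:
    master.merge(node)
dist_means, dist_cov = master.finalize()
```

Both paths see the same four feature vectors, so they produce the same means and the same covariance matrix as a single batch pass over the full data set, up to floating-point rounding.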

Performance Considerations

To get the best overall performance when computing correlation or variance-covariance matrices:

  • If input data is homogeneous, provide the input data and store results in homogeneous numeric tables of the same type as specified in the algorithmFPType class template parameter.

  • If input data is non-homogeneous, use the array-of-structures (AOS) layout rather than the structure-of-arrays (SOA) layout.
