Intel Math Kernel Library Changelog

What's new in Intel Math Kernel Library 11.2

Aug 29, 2014

Intel MKL now provides optimizations for all Intel® Atom™ processors that support Intel® Streaming SIMD Extensions 4.1 (Intel® SSE4.1) and Intel® Streaming SIMD Extensions 4.2 (Intel® SSE4.2) instruction sets
Introduced support for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instruction set with limited optimizations in BLAS, DFT and VML
Introduced Verbose support for BLAS and LAPACK domains, which enables users to capture the input parameters to Intel MKL function calls
Introduced support for Intel® MPI Library 5.0
Introduced the Intel Math Kernel Library Cookbook (http://software.intel.com/en-us/mkl_cookbook), a new document that describes how to use Intel MKL routines to solve certain complex problems
Introduced a compilation feature, enabled by defining MKL_DIRECT_CALL or MKL_DIRECT_CALL_SEQ, that improves ?GEMM performance on small matrices for all processors (see the Intel® Math Kernel Library User's Guide for more details)
Added the ability to link a Single Dynamic Library (mkl_rt) on Intel® Many Integrated Core Architecture (Intel® MIC Architecture)
Added a customizable error handler. See the Intel Math Kernel Library Reference Manual description of mkl_set_exit_handler() for further details
Extended the Intel® Xeon Phi™ coprocessor Automatic Offload feature with a resource sharing mechanism. See the Intel Math Kernel Library Reference Manual descriptions of the mkl_mic_set_resource_limit() function and the MKL_MIC_RESOURCE_LIMIT environment variable for further details
Parallel Direct Sparse Solver for Clusters:
Introduced Parallel Direct Sparse Solver for Clusters, a distributed memory version of Intel MKL PARDISO direct sparse solver
Improved performance of the matrix gather step for distributed matrices
Enabled reuse of reordering information on multiple factorization steps
Added the distributed CSR format and support for distributed matrices, right-hand sides (RHS), and solutions
Added support for solving systems with multiple right-hand sides
Added cluster support of factorization and solving steps
Added support for pure MPI mode and support for single OpenMP thread in hybrid configurations
BLAS:
Improved threaded performance of ?GEMM for all 64-bit architectures supporting Intel® Advanced Vector Extensions 2 (Intel® AVX2)
Optimized ?GEMM, ?TRSM, DTRMM for the Intel AVX-512 instruction set
Improved performance of ?GEMM for outer product [large m, large n, small k] and tall skinny matrices [large m, medium n, small k] on Intel MIC Architecture
Improved performance of ?TRSM and ?SYMM in Automatic Offload mode on Intel MIC Architecture
Improved performance of Level 3 BLAS functions for 64-bit processors supporting Intel AVX2
Improved ?GEMM performance on small matrices for all processors when MKL_DIRECT_CALL or MKL_DIRECT_CALL_SEQ is defined during compilation (see the Intel® Math Kernel Library User’s Guide for more details)
Improved performance of DGER and DGEMM for the beta=1, k=1 case for 64-bit processors supporting Intel SSE4.2, Intel® Advanced Vector Extensions (Intel® AVX), and Intel AVX2 instruction sets
Optimized (D/Z)AXPY for the Intel AVX-512 instruction set
Optimized ?COPY for the Intel AVX2 and Intel AVX-512 instruction sets
Optimized DGEMV for the Intel AVX-512 instruction set
Improved performance of SSYR2K for 64-bit processors supporting Intel AVX and Intel AVX2
Improved threaded performance of ?AXPBY for all Intel processors
Improved DTRMM performance for the side=R, uplo={U,L}, transa=N, diag={N,U} cases for Intel AVX-512
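
The DGER and DGEMM item above pairs the two routines because, for beta=1 and k=1, the matrix product reduces to a rank-1 update. The following is a minimal sketch in plain C, not Intel MKL code: ref_dgemm and ref_dger are hypothetical, unoptimized reference loops that assume column-major storage, no transposition, and unit vector increments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference GEMM (not the MKL routine), column-major, no transposes:
   C := alpha * A * B + beta * C, with A m-by-k, B k-by-n, C m-by-n. */
void ref_dgemm(size_t m, size_t n, size_t k, double alpha,
               const double *a, size_t lda,
               const double *b, size_t ldb,
               double beta, double *c, size_t ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p)
                sum += a[i + p * lda] * b[p + j * ldb];
            c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
        }
    }
}

/* Hypothetical reference rank-1 update (not the MKL routine), column-major:
   A := alpha * x * y^T + A, with x of length m and y of length n. */
void ref_dger(size_t m, size_t n, double alpha,
              const double *x, const double *y,
              double *a, size_t lda)
{
    assert(lda >= m);
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < m; ++i)
            a[i + j * lda] += alpha * x[i] * y[j];
}
```

With k=1, the m-by-1 matrix A plays the role of x (lda = m) and the 1-by-n matrix B plays the role of y (ldb = 1). For x = (1, 2, 3), y = (4, 5), alpha = 2, and a 3-by-2 matrix of ones, both routines produce the columns (9, 17, 25) and (11, 21, 31).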
LINPACK:
Improved performance of matrix generation in the heterogeneous Intel® Optimized MP LINPACK Benchmark for Clusters
The Intel MIC Architecture offload option of the Intel Optimized MP LINPACK Benchmark for Clusters package now supports Intel AVX2 hosts
Improved performance of the Intel Optimized MP LINPACK for Clusters package for 64-bit processors supporting Intel AVX2
LAPACK:
Improved performance of ?(SY/HE)RDB
Improved performance of ?(SY/HE)(EV/EVD) when eigenvectors are needed
Improved performance of ?(SY/HE)(EV/EVR/EVD) when eigenvectors are not needed
Improved performance of ?GELQF, ?GELS and ?GELSS for the underdetermined case (M less than N)
Improved performance of ?GEHRD, ?GEEV and ?GEES
Improved performance of NaN checkers in LAPACKE interfaces
Improved performance of ?GELSX, ?GGSVP
Improved performance of ?GETRF
Improved performance of (S/D)GE(SVD/SDD) when M>=N and singular vectors are not needed
Improved performance of ?POTRF UPLO=U in Automatic Offload mode on Intel MIC Architecture
Added Automatic Offload for ?SYRDB on Intel MIC Architecture, which speeds up ?SY(EV/EVD/EVR) when eigenvectors are not needed
PBLAS and ScaLAPACK:
Enabled Automatic Offload in P?GEMM routines for large distribution blocking factors
Sparse BLAS:
Optimized SpMV kernels for Intel AVX-512 instruction set
Added release example for diagonal format use in Sparse BLAS
Improved Sparse BLAS level 2 and 3 performance for systems supporting Intel SSE4.2, Intel AVX and Intel AVX2 instruction sets
Intel MKL PARDISO:
Added the ability to store the Intel MKL PARDISO handle on disk for future use at any solver stage
Added pivot control support for unsymmetric matrices and out-of-core mode
Added diagonal extraction support for unsymmetric matrices and out-of-core mode
Added example demonstrating use of Intel MKL PARDISO as iterative solver for non-linear systems
Added capability to free memory taken by original matrix after factorization stage if iterative refinement is disabled
Improved memory estimation of the out-of-core (OOC) portion size for the reordering algorithm, leading to improved factorization-solve performance in OOC mode
Improved message output from Intel MKL PARDISO
Added support of zero pivot during factorization for structurally symmetric cases
Poisson library:
Added an example demonstrating use of the Intel MKL Poisson library as a preconditioner for linear system solves
Extended Eigensolver:
Improved message output
Improved examples
Added input and output iparm parameters in predefined interfaces for solving sparse problems
FFT:
Optimized FFTs for the Intel AVX-512 instruction set
Improved performance for non-power-of-2 length on Intel® MIC Architecture
VML: Added the v[d|s]Frac function, which computes the fractional part of each vector element
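
To illustrate what an element-wise fractional-part operation computes, here is a minimal sketch in plain C. It is not the Intel MKL routine: ref_frac is a hypothetical helper, and it assumes the signed, truncation-based definition y[i] = a[i] - trunc(a[i]), so the result keeps the sign of the input. Consult the Reference Manual for the exact behavior of v[d|s]Frac, including special values.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical reference helper (not the MKL routine):
   writes the signed fractional part of each a[i] into y[i]. */
void ref_frac(size_t n, const double *a, double *y)
{
    assert(a != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        double int_part;               /* integral part, discarded here */
        y[i] = modf(a[i], &int_part);  /* modf returns the signed fractional part */
    }
}
```

Under this definition, the inputs 2.75, -2.75, and 3.0 map to 0.75, -0.75, and 0.0.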
VSL RNG:
Added support of ntrial=0 in Binomial Random Number Generator
Improved performance of MRG32K3A and MT2203 BRNGs on Intel MIC Architecture
Improved performance of MT2203 BRNG on CPUs supporting Intel AVX and Intel AVX2 instruction sets
VSL Summary Statistics:
Added support for group/pooled mean estimates (VSL_SS_GROUP_MEAN/VSL_SS_POOLED_MEAN)
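
As a rough illustration of group and pooled means, the sketch below computes a per-group mean and then a pooled mean as the size-weighted average of the group means. This is plain C rather than the VSL Summary Statistics interface: ref_group_pooled_mean and its parameters are hypothetical, and whether VSL_SS_POOLED_MEAN uses exactly this weighting is an assumption; the Reference Manual is authoritative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper (not the VSL API).
   x[i] is an observation and group[i] is its group index in [0, n_groups).
   On return, group_mean[g] and group_count[g] hold the mean and size of group g
   (the mean of an empty group is left at 0). The return value is the pooled mean,
   taken here to be the size-weighted average of the group means. */
double ref_group_pooled_mean(size_t n, const double *x, const size_t *group,
                             size_t n_groups, double *group_mean,
                             size_t *group_count)
{
    for (size_t g = 0; g < n_groups; ++g) {
        group_mean[g] = 0.0;
        group_count[g] = 0;
    }
    /* Accumulate per-group sums and counts. */
    for (size_t i = 0; i < n; ++i) {
        assert(group[i] < n_groups);
        group_mean[group[i]] += x[i];
        group_count[group[i]] += 1;
    }
    double weighted_sum = 0.0;
    size_t total = 0;
    for (size_t g = 0; g < n_groups; ++g) {
        if (group_count[g] > 0) {
            group_mean[g] /= (double)group_count[g];
            weighted_sum += (double)group_count[g] * group_mean[g];
            total += group_count[g];
        }
    }
    return total > 0 ? weighted_sum / (double)total : 0.0;
}
```

For the observations 1, 2, 3 in group 0 and 10, 14 in group 1, the group means are 2 and 12, and the pooled mean under this definition is (3*2 + 2*12)/5 = 6.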
Data Fitting: Fixed incorrect behavior of the natural cubic spline construction function when the number of breakpoints is 2 or 3
Introduced an Intel MKL mode that ignores all settings specified by Intel MKL environment variables
Users enable the mode by calling the mkl_set_env_mode() routine, which directs Intel MKL to ignore all environment settings specific to Intel MKL, such as MKL_NUM_THREADS, MKL_DYNAMIC, and MKL_MIC_ENABLE. Users can instead set the needed parameters via Intel MKL service routines such as mkl_set_num_threads() and mkl_mic_enable()

New in Intel Math Kernel Library 9.1.022 (Jun 25, 2007)