Intel® Math Kernel Library (Intel® MKL) offers highly optimized, extensively threaded math routines for scientific, engineering, and financial applications that require maximum performance.
Features
Outstanding performance on Intel® processors Achieve outstanding performance with the math library that is highly optimized for Intel® Itanium®, Intel® Xeon®, Intel® Pentium® 4, and Intel® Core™2 Duo processor-based systems. Special attention has been paid to optimizing multi-threaded performance for the new Quad-Core Intel® Xeon® processor 5300 series. Intel MKL performance is competitive with that of other math software packages on non-Intel processors.
Multi-core ready
Excellent scaling on multiprocessor systems Use the built-in parallelism of Intel MKL to automatically obtain excellent scaling on multiprocessors including the latest dual and quad-core systems. Intel MKL Level-3 BLAS, Fast Fourier transforms, and Vector Math are threaded using OpenMP*.
Thread-Safety All Intel MKL functions are thread-safe. A non-threaded serial version of Intel MKL is also provided.
Automatic runtime processor detection A runtime check is performed so that processor-specific optimized code is executed, ensuring that your application achieves optimal performance on whatever system it is executing on.
Support for C and Fortran interfaces Unlike some alternative math libraries that require you to purchase multiple products to get C and Fortran interfaces, Intel MKL includes both.
Support for all Intel® processors in one package Alternative math libraries require you to purchase multiple products for support of Intel Itanium 2, Intel Xeon, and Pentium 4 processors. Intel MKL includes support for ALL of these processors in a single, inexpensive package.
Royalty-free distribution rights Redistribute unlimited copies of the runtime libraries with your software.
Intel® Premier Support Receive one year of world-class technical support with every purchase of Intel MKL. During this period, you can download product upgrades free of charge, including major version releases. For more information, visit the Intel Registration Center.
Functionality:
Linear Algebra - BLAS and LAPACK Deploy BLAS and LAPACK routines that are highly optimized for Intel processors, and that provide significant performance improvements over alternative implementations. Intel MKL 10.0 is compliant with the new 3.1 release of LAPACK.
Linear Algebra - ScaLAPACK The Intel MKL implementation of ScaLAPACK can provide significant performance improvements over the standard NETLIB implementation.
Linear Algebra - Sparse Solvers Solve large, sparse linear systems of equations with the PARDISO Direct Sparse Solver, an easy-to-use, thread-safe, high-performance, and memory-efficient software library licensed from the University of Basel. Intel MKL also includes Conjugate Gradient and FGMRES iterative sparse solvers.
Fast Fourier Transforms (FFT) Utilize our multi-dimensional FFT routines (1D up to 7D) with a modern, easy-to-use C/Fortran interface. Intel MKL supports distributed memory clusters with the same API, enabling you to improve your performance by distributing the work over a large number of processors with minimal effort. Intel MKL also provides a set of C routines ("wrappers") that mimic the FFTW 2.x and 3.0 interfaces, making it easy for current FFTW users to plug Intel MKL into their existing applications.
Vector Math Library Increase application speeds with vectorized implementations of computationally intensive core mathematical functions (power, trigonometric, exponential, hyperbolic, logarithmic, and more).
Vector Random Number Generators Speed up your simulations using our vector random number generators, which can provide substantial performance improvements over scalar random number generator alternatives.
LINPACK Benchmark Intel provides free LINPACK benchmark packages to help you obtain the highest possible benchmark results for your Intel® architecture-based systems.
New in This Release
In this release of Intel Math Kernel Library (Intel MKL), we have focused on three primary objectives. First and always foremost is providing optimized multi-threaded performance for the newest Intel® processors (Quad-Core Intel® Xeon® processor 5300 series and its close relative the Dual-Core Intel® Xeon® processor 5100 series). Secondly, we have re-architected Intel MKL to have a new “layered” architecture to better support the varied usage models of our users. Lastly, we have merged the standard and cluster editions of Intel MKL so we now have a single, comprehensive package.
Optimizations for the new Quad-Core Intel Xeon processor 5300 series For more information see section “Performance Improvements in Version 10.0” below.
New "Layered" Architecture In Version 10.0 of Intel MKL we have re-architected the product to provide multiple layers so that the base Intel MKL package supports numerous configurations of interfaces, compilers, and processors in a single package. Many other library vendors have specific versions that must be found, downloaded, installed, and tested depending on the particular configuration of your development environment. This new Intel MKL architecture is intended to provide maximum support for our varied customers’ needs, while minimizing the effort it takes to obtain and utilize the great performance of Intel MKL. For more information, please refer to the “Using Intel MKL Parallelism” section of the Intel MKL User’s Guide.
Threading Layer All Intel MKL threading has been isolated to this layer. Link to the version of this layer that matches your development environment and rest assured that Intel MKL will not have threading incompatibilities with the threading in your application.
Fully Compliant with Microsoft, GCC, and Intel Compiler Threading Separate versions of this layer are provided that have been compiled with different compilers (Intel, MSFT, GCC) enabling Intel MKL to be fully compliant with threading mechanisms used by whatever development environment your overall application is using.
Serial Version of Intel MKL A version of the threading layer that has no threading is also provided. This ensures that Intel MKL will not conflict in any way with your application should you choose not to use the threading within Intel MKL.
Interface Layer This layer enables:
LP64 and ILP64 interfaces. An ILP64 (64-bit integer data) interface is now included in the base package of Intel MKL, and is no longer a separate download. Intel MKL 10.0’s new layered architecture has made this possible with a minimal increase in product package size.
Separate layers are provided for the parameter-passing and return-value conventions of different compilers (Intel, GCC, MSFT)
Cray-style naming support
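The practical difference between the two interfaces is the width of integer arguments. The sketch below is plain Python, not Intel MKL code, and the helper name interface_for is hypothetical. It assumes the usual meaning of the terms: LP64 passes integer arguments (dimensions, lengths, strides) as signed 32-bit values, while ILP64 passes them as signed 64-bit values.

```python
# Illustrative only: plain Python, not part of Intel MKL.
# Assumption: LP64 uses signed 32-bit integer arguments, ILP64 signed 64-bit.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def interface_for(*int_args):
    """Return the narrowest integer model that can represent every argument.

    Hypothetical helper: returns "LP64" if all values fit in a signed
    32-bit integer, "ILP64" if they need 64 bits, and raises OverflowError
    if any value does not fit in 64 bits either.
    """
    if all(INT32_MIN <= v <= INT32_MAX for v in int_args):
        return "LP64"
    if all(INT64_MIN <= v <= INT64_MAX for v in int_args):
        return "ILP64"
    raise OverflowError("argument does not fit in a 64-bit integer")
```

For example, a vector of 3,000,000,000 elements has a length that no longer fits in 32 bits, so interface_for(3_000_000_000) returns "ILP64".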
Computational Layer This layer forms the heart of Intel MKL. A runtime check is performed so that processor-specific optimized code is executed. Users can build custom shared objects to include only the specific code needed and thus reduce the size of this layer if size is an issue.
PARDISO Direct Sparse Solver
New support for out-of-core memory
Sparse BLAS
Sparse 0-based indexing
Single precision support added
Level-3 Sparse BLAS triangular solvers were threaded
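To show what 0-based versus 1-based indexing means for sparse data, here is a minimal sketch in plain Python. It is not the Intel MKL Sparse BLAS API: the function names are hypothetical, and the common three-array compressed sparse row (CSR) layout is assumed for illustration.

```python
# Illustrative only: plain Python, not the Intel MKL Sparse BLAS interface.
def csr_matvec(values, columns, row_ptr, x, index_base=0):
    """Compute y = A * x for a matrix A stored in three-array CSR form.

    index_base is 0 for C-style (0-based) arrays and 1 for Fortran-style
    (1-based) arrays; the same loop serves both once the base is subtracted.
    """
    if index_base not in (0, 1):
        raise ValueError("index_base must be 0 or 1")
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        start = row_ptr[i] - index_base
        end = row_ptr[i + 1] - index_base
        for k in range(start, end):
            y[i] += values[k] * x[columns[k] - index_base]
    return y


def to_one_based(columns, row_ptr):
    """Convert 0-based CSR index arrays to their 1-based equivalents."""
    return [c + 1 for c in columns], [p + 1 for p in row_ptr]
```

The values array is identical in both conventions; only the column indices and row pointers shift by one.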
Iterative Solver Preconditioner
ILUT accelerator/preconditioner for the Intel MKL RCI iterative solvers
Vector Math Functions
New Mul, Conj, MulbyConj, CIS, Abs functions
New “Enhanced Performance” mode EP Mode is for applications where math function inaccuracies don’t dominate parameter inaccuracies (e.g. Monte Carlo simulations and Media applications)
All VML functions are now threaded
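The element-wise operations behind the new function names can be sketched in plain Python as follows. These are not the Intel MKL entry points: the v_* names are hypothetical, and the semantics are assumed from the function names (CIS as cos x + i sin x, MulbyConj as a times the complex conjugate of b).

```python
# Illustrative only: plain Python, not the Intel MKL VML interface.
import math


def _same_length(a, b):
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")


def v_mul(a, b):
    """Element-wise product a[i] * b[i]."""
    _same_length(a, b)
    return [x * y for x, y in zip(a, b)]


def v_conj(a):
    """Element-wise complex conjugate."""
    return [complex(x).conjugate() for x in a]


def v_mul_by_conj(a, b):
    """Element-wise a[i] * conj(b[i]) (assumed meaning of MulbyConj)."""
    _same_length(a, b)
    return [x * complex(y).conjugate() for x, y in zip(a, b)]


def v_cis(a):
    """Element-wise cos(x) + i*sin(x) for real inputs (assumed meaning of CIS)."""
    return [complex(math.cos(x), math.sin(x)) for x in a]


def v_abs(a):
    """Element-wise absolute value (modulus for complex inputs)."""
    return [abs(x) for x in a]
```

A vectorized library applies each of these to whole arrays in one call, which is where the speed-up over a scalar loop comes from.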
User’s Guide
We have greatly improved our Intel MKL User’s Guide. It is an indispensable tool for working with Intel MKL. Visit the Documentation page to download it or view it online.
Performance Improvements in Version 10.0
Performance optimizations were done in all areas of the library. Below are some specific measured performance gains. A list of performance improvements in past versions of Intel MKL is available on the Performance Improvements page. Performance charts are shown on each product domain page (BLAS/LAPACK, FFT, VML, etc.)
BLAS
Threading of DGEMM was improved for small and middle sizes - outer product sizes by 10%, square sizes by 80%
DGEMM/SGEMM Large square and large outer product sizes were improved by 4-5% on 1 thread and 10-15% on 8 threads
DTRSM, DTRMM, and DSYRK were improved by 5-30%
Other level 3 real functions were improved by 2-4% on large sizes
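For readers unfamiliar with the routine names, DGEMM is the Level-3 BLAS general matrix multiply, C = alpha*A*B + beta*C. The reference sketch below is plain Python written for clarity, not speed, and is not the Intel MKL implementation; the function name dgemm_reference is hypothetical. It assumes that "outer product sizes" refers to shapes where the inner dimension k is small relative to m and n, and "square sizes" to m = n = k.

```python
# Illustrative only: a textbook triple loop, not the Intel MKL DGEMM.
def dgemm_reference(alpha, a, b, beta, c):
    """Return alpha * (A @ B) + beta * C for row-major lists of lists.

    A is m x k, B is k x n, C is m x n. The input C is not modified.
    """
    m, k = len(a), len(b)
    if m == 0 or k == 0:
        raise ValueError("matrices must be non-empty")
    n = len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("A must have as many columns as B has rows")
    if any(len(row) != n for row in b):
        raise ValueError("B must be rectangular")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("C must be m x n")
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]
```

With k = 1 the product reduces to a single outer product of a column and a row, which is the extreme case of the "outer product" shape.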
LAPACK
Several linear equation solvers (?spsv/?hpsv/?ppsv, ?pbsv/?gbsv, ?gtsv/?ptsv, ?sysv/?hesv) have dramatically improved in performance. Banded and packed storage format and multiple right-hand sides cases see speed-ups of up to 100 times
All symmetric eigensolvers (?syev/?heev, ?syevd/?heevd, ?syevx/?heevx, ?syevr/?heevr) have significantly improved, since the tridiagonalization routine (?sytrd/?hetrd) is now up to 4 times faster
All symmetric eigensolvers in packed storage (?spev/?hpev, ?spevd/?hpevd, ?spevx/?hpevx) have significantly improved, since the tridiagonalization routine in packed storage (?sptrd/?hptrd) is now up to 3 times faster
A number of routines applying orthogonal/unitary transformations (?ormqr/?unmqr, ?ormrq/?unmrq, ?ormql/?unmql, ?ormlq/?unmlq) have improved up to 2 times
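Several of the routines above operate on matrices in packed storage, where only one triangle of a symmetric or Hermitian matrix is kept in a one-dimensional array. The sketch below shows the standard LAPACK upper-triangle, column-by-column layout translated to 0-based indices. It is plain Python for illustration, and the helper names are hypothetical.

```python
# Illustrative only: the standard LAPACK "upper" packed layout, 0-based.
def pack_upper(a):
    """Pack the upper triangle of an n x n matrix column by column.

    Element a[i][j] with i <= j is stored at index i + j*(j+1)//2,
    so the packed array holds n*(n+1)//2 values instead of n*n.
    """
    n = len(a)
    if any(len(row) != n for row in a):
        raise ValueError("matrix must be square")
    ap = [0.0] * (n * (n + 1) // 2)
    for j in range(n):
        for i in range(j + 1):
            ap[i + j * (j + 1) // 2] = a[i][j]
    return ap


def packed_get(ap, i, j):
    """Read element (i, j) of a symmetric matrix from its packed upper triangle."""
    if i > j:
        i, j = j, i  # symmetry: a[i][j] == a[j][i]
    return ap[i + j * (j + 1) // 2]
```

A 3 x 3 symmetric matrix therefore needs 6 stored values rather than 9, and the saving approaches one half as n grows.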
FFTs
Performance of complex 1D FFTs for power-of-two sizes was improved by up to 1.8 times on 1 thread
On Intel® 64 architecture-based systems running in 64-bit mode, single precision complex backward 1D FFTs for data sizes greater than 2^22 elements have been sped up by up to 2 times on 4 threads and up to 2.4 times on 8 threads on Intel® Itanium® processors
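As background for the terms "power-of-two sizes" and "backward" transform, here is a textbook recursive radix-2 Cooley-Tukey FFT in plain Python. It is not how Intel MKL implements its FFTs, and the function name fft_radix2 is hypothetical. The sketch assumes an unscaled backward transform, so applying forward then backward returns the input multiplied by its length.

```python
# Illustrative only: a textbook radix-2 FFT, not the Intel MKL implementation.
import cmath


def fft_radix2(x, backward=False):
    """Complex 1D FFT for power-of-two lengths.

    Forward uses the exponent -2*pi*i*k*n/N; backward uses the opposite
    sign and applies no 1/N scaling (an assumption made for this sketch).
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    sign = 1 if backward else -1
    even = fft_radix2(x[0::2], backward)  # transform of even-indexed samples
    odd = fft_radix2(x[1::2], backward)   # transform of odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out
```

Power-of-two lengths are the natural fit for this scheme because the input can be halved repeatedly down to single elements.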
VML/VSL
Performance of VSL functions is improved on non-Intel processors by approximately 2 times on average
Performance of VML vdExp, vdSin, and vdCos functions is improved on non-Intel processors by 18% on average
Performance of VSL functions is improved on IA-32 and Intel® 64 architecture by 7% on average
Compatibility
Operating Systems
Intel MKL 10.0 supports Linux*, Windows* and Mac OS* X. Linux variants include: Red Hat*, Suse*, Debian*, Ubuntu*, Asianux*, and other Linux Standard Base 3.1 variants. For a complete list, please see the System Requirements page.
Development Environments
Intel MKL is easily used and integrated with popular development tools and environments, such as Microsoft Visual Studio*, Xcode*, Eclipse*, and the GNU Compiler Collection (GCC).
Processors
Intel MKL 10.0 supports the following families of Intel processors:
Information