Kernels from Library Calls
GPU Coder™ supports libraries optimized for CUDA® GPUs, such as the cuBLAS, cuSOLVER, cuFFT, Thrust, cuDNN, and TensorRT libraries.
The cuBLAS library is an implementation of Basic Linear Algebra Subprograms (BLAS) on top of the NVIDIA® CUDA runtime. It allows you to access the computational resources of the NVIDIA GPU.
The cuSOLVER library is a high-level package based on the cuBLAS and cuSPARSE libraries. It provides useful LAPACK-like features, such as common matrix factorization and triangular solve routines for dense matrices, a sparse least-squares solver, and an eigenvalue solver.
The cuFFT library provides a high-performance implementation of the Fast Fourier Transform (FFT) algorithm on NVIDIA GPUs. The cuBLAS, cuSOLVER, and cuFFT libraries are part of the NVIDIA CUDA Toolkit.
Thrust is a C++ template library for CUDA. The Thrust library ships with the CUDA Toolkit and lets you use GPU-accelerated primitives, such as sort, to implement complex high-performance parallel applications.
The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. NVIDIA TensorRT is a high-performance deep learning inference optimizer and runtime library.
GPU Coder does not require a special pragma to generate kernel calls to libraries. During code generation, when you select the Enable cuBLAS option in the GPU Coder app or set config_object.GpuConfig.EnableCUBLAS = true at the command line, GPU Coder replaces some functionality with calls to the cuBLAS library. Likewise, when you select the Enable cuSOLVER option in the GPU Coder app or set config_object.GpuConfig.EnableCUSOLVER = true at the command line, GPU Coder replaces some functionality with calls to the cuSOLVER library. For GPU Coder to replace high-level math functions with library calls, the following conditions must be met:
A GPU-specific library replacement must exist for these functions.
MATLAB® Coder™ data size thresholds must be satisfied.
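The command-line settings described above can be sketched as follows. This is a minimal illustration, not a definitive workflow: the entry-point function `matmulSolve` and the 512-element sizes are hypothetical, and whether a given operation is actually replaced by a library call still depends on the two conditions listed above.

```matlab
% Hypothetical entry-point function, saved separately as matmulSolve.m:
%
%   function [C, x] = matmulSolve(A, B, b) %#codegen
%       C = A * B;   % matrix multiply, a candidate for a cuBLAS call
%       x = A \ b;   % linear solve, a candidate for a cuSOLVER call
%   end

% Create a GPU code configuration object for MEX generation.
cfg = coder.gpuConfig('mex');

% Allow GPU Coder to replace supported functionality with library calls.
cfg.GpuConfig.EnableCUBLAS   = true;
cfg.GpuConfig.EnableCUSOLVER = true;

% Example inputs (sizes chosen arbitrarily for illustration).
A = rand(512);
B = rand(512);
b = rand(512, 1);

% Generate code for the hypothetical entry point.
codegen -config cfg -args {A, B, b} matmulSolve
```

Inspecting the generated code or the code generation report is one way to check which operations were mapped to library calls.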
GPU Coder supports cuFFT, cuSOLVER, and cuBLAS library replacements for the functions listed in the table. For functions that have no replacements in CUDA, GPU Coder uses portable MATLAB functions that are mapped to the GPU.
| MATLAB Function | Description | MATLAB Coder LAPACK Support | cuBLAS, cuSOLVER, cuFFT, Thrust Support |
|---|---|---|---|
| | Matrix multiply | Yes | Yes |
| | Solve system of linear equations | Yes | Yes |
| | LU matrix factorization | Yes | Yes |
| | Orthogonal-triangular decomposition | Yes | Partial |
| | Matrix determinant | Yes | Yes |
| | Cholesky factorization | Yes | Yes |
| | Reciprocal condition number | Yes | Yes |
| | Solve system of linear equations | Yes | Yes |
| | Eigenvalues and eigenvectors | Yes | No |
| | Schur decomposition | Yes | No |
| | Singular value decomposition | Yes | Partial |
| | Fast Fourier transform | Yes | Yes |
| | Inverse fast Fourier transform | Yes | Yes |
| | Sort array elements | | Yes, using |
When you select the Enable cuFFT option in the GPU Coder app or set config_object.GpuConfig.EnableCUFFT = true at the command line, GPU Coder maps fft, ifft, fft2, ifft2, fftn, and ifftn function calls in your MATLAB code to the appropriate cuFFT library calls. For 2-D transforms and higher, GPU Coder creates multiple 1-D batched transforms. These batched transforms have higher performance than single transforms. GPU Coder supports only out-of-place transforms. If Enable cuFFT is not selected, GPU Coder uses C FFTW libraries where available or generates kernels from the portable MATLAB fft.
Both single-precision and double-precision data types are supported. Input and output can be real or complex-valued, but real-valued transforms are faster. The cuFFT library supports input sizes that are typically specified as a power of 2 or as a value that can be factored into a product of small prime numbers. In general, the smaller the prime factors, the better the performance.
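As a rough sketch of the cuFFT setting and the input-size guidance above, the following enables cuFFT and checks that the transform length factors into small primes before generating code. The entry-point function `spectrum2d` is hypothetical, and the bound of 7 on the largest prime factor is an illustrative choice, not a documented GPU Coder limit.

```matlab
% Hypothetical entry-point function, saved separately as spectrum2d.m:
%
%   function Y = spectrum2d(X) %#codegen
%       Y = fft2(X);   % 2-D transform, a candidate for cuFFT calls
%   end

% Create a GPU code configuration object and enable cuFFT replacement.
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.EnableCUFFT = true;

% Pick a transform length and inspect its prime factors.
n = 1024;                              % power of 2
largestPrime = max(factor(n));         % factor returns the prime factors of n
if largestPrime > 7                    % illustrative threshold (assumption)
    warning('Length %d has a large prime factor (%d); expect slower FFTs.', ...
        n, largestPrime);
end

% Real-valued, single-precision input of the chosen size.
X = rand(n, n, 'single');

% Generate code for the hypothetical entry point.
codegen -config cfg -args {X} spectrum2d
```

Padding the input to a nearby length with smaller prime factors is a common way to apply this guidance when the signal length is not fixed by the application.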
Note
Using CUDA library names such as cufft, cublas, and cudnn as the names of your MATLAB function results in code generation errors.