Code Generation by Using the GPU Coder App
The easiest way to create CUDA® kernels is to place the coder.gpu.kernelfun pragma into your primary MATLAB® function. The primary function is also known as the top-level or entry-point function. When GPU Coder™ encounters the kernelfun pragma, it attempts to parallelize all the computation within this function and then maps it to the GPU. For more information about GPU kernels, see GPU Programming Paradigm.
Learning Objectives
In this tutorial, you learn how to:
Prepare your MATLAB code for CUDA code generation by using the kernelfun pragma.
Create and set up a GPU Coder project.
Define function input properties.
Check for code generation readiness and run-time issues.
Specify code generation properties.
Generate CUDA code by using the GPU Coder app.
Tutorial Prerequisites
This tutorial requires the following products:
MATLAB
MATLAB Coder™
GPU Coder
C compiler
NVIDIA® GPU enabled for CUDA
CUDA toolkit and driver
Environment variables for the compilers and libraries
Example: The Mandelbrot Set
Description
The Mandelbrot set is the region in the complex plane consisting of the values $z_0$ for which the trajectories defined by the equation
$z_{k+1} = z_k^2 + z_0, \quad k = 0, 1, \dots$
remain bounded as k→∞.
The overall geometry of the Mandelbrot set is shown in the figure. This view does not have the resolution to show the richly detailed structure of the fringe just outside the boundary of the set. At increasing magnifications, the Mandelbrot set exhibits an elaborate boundary that reveals progressively finer recursive detail.
Algorithm
For this tutorial, pick a set of limits that specify a highly zoomed part of the Mandelbrot set in the valley between the main cardioid and the p/q bulb to its left. A 1000-by-1000 grid of real parts (x) and imaginary parts (y) is created between these two limits. The Mandelbrot algorithm is then iterated at each grid location. A maximum of 500 iterations is enough to render the image in full resolution.
maxiterations = 500;
gridsize = 1000;
xlim = [-0.748766713922161,-0.748766707771757];
ylim = [0.123640844894862,0.123640851045266];
This tutorial starts from an implementation of the Mandelbrot set that uses standard MATLAB commands running on the CPU. The calculation is vectorized so that every grid location is updated at once in each iteration.
Tutorial Files
Create a MATLAB function called mandelbrot_count.m with the following lines of code. This code is a baseline vectorized MATLAB implementation of the Mandelbrot set. For every point (xgrid,ygrid) in the grid, it calculates the iteration index count at which the trajectory defined by the equation reaches a distance of 2 from the origin. It then returns the natural logarithm of count, which is used to generate the color-coded plot of the Mandelbrot set. Later in this tutorial, you modify this file to make it suitable for code generation.
function count = mandelbrot_count(maxiterations,xgrid,ygrid)
% mandelbrot computation

z0 = xgrid + 1i*ygrid;
count = ones(size(z0));

z = z0;
for n = 0:maxiterations
    z = z.*z + z0;
    inside = abs(z)<=2;
    count = count + inside;
end
count = log(count);
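To make the meaning of count concrete, the following minimal sketch applies the same loop to three hand-picked real points, before the logarithm is taken. The sample points and the reduced iteration limit of 10 are illustrative assumptions, not values from the tutorial.

```matlab
% Illustrative sample points (assumed): 0 and -1 stay bounded, 1 escapes.
z0 = [0, 1, -1];
maxiterations = 10;          % small limit chosen only for this sketch

count = ones(size(z0));
z = z0;
for n = 0:maxiterations
    z = z.*z + z0;           % one step of the trajectory at every point
    inside = abs(z) <= 2;    % logical mask of points still within distance 2
    count = count + inside;  % points that are inside accumulate one more count
end
disp(count)                  % counts are 12, 2, and 12
```

Points that never leave the disk of radius 2 reach the largest possible count, maxiterations + 2, while a point that escapes early keeps a small count.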
Create a MATLAB script called mandelbrot_test.m with the following lines of code. The script generates a 1000-by-1000 grid of real parts (x) and imaginary parts (y) between the limits specified by xlim and ylim. It also calls the mandelbrot_count function and plots the resulting Mandelbrot set.
maxiterations = 500;
gridsize = 1000;
xlim = [-0.748766713922161,-0.748766707771757];
ylim = [0.123640844894862,0.123640851045266];

x = linspace(xlim(1),xlim(2),gridsize);
y = linspace(ylim(1),ylim(2),gridsize);
[xgrid,ygrid] = meshgrid(x,y);

%% mandelbrot computation in matlab
count = mandelbrot_count(maxiterations,xgrid,ygrid);

% show
figure(1)
imagesc(x,y,count);
colormap([jet();flipud(jet());0 0 0]);
axis off
title('Mandelbrot set with MATLAB');
Run the Original MATLAB Code
Run the Mandelbrot Example
Before making the MATLAB version of the Mandelbrot set algorithm suitable for code generation, you can test the functionality of the original code.
Change the current MATLAB working folder to the location that contains mandelbrot_count.m and mandelbrot_test.m. GPU Coder places generated code in this folder. Change your current working folder if you do not have full access to this folder.
Run the mandelbrot_test script.
The test script runs and shows the geometry of the Mandelbrot set within the boundary set by the variables xlim and ylim.
Prepare MATLAB Code for Code Generation
Before you generate code with GPU Coder, check for coding issues in the original MATLAB code.
Check for Issues at Design Time
Two tools help you detect code generation issues at design time:
Code Analyzer
Code generation readiness tool
The Code Analyzer is a tool incorporated into the MATLAB Editor that continuously checks your code as you enter it. It reports issues and recommends modifications to maximize the performance and maintainability of your code. To identify the warnings and errors specific to code generation from your MATLAB code, add the %#codegen directive to your MATLAB file.
Note
The Code Analyzer does not detect all code generation issues. After eliminating the errors or warnings that the Code Analyzer detects, compile your code with GPU Coder to determine if the code has other compliance issues.
The code generation readiness tool screens the MATLAB code for features and functions that are not supported for code generation. This tool provides a report that lists issues and recommendations for making the MATLAB code suitable for code generation. You can access the code generation readiness tool in these ways:
In the current folder browser: right-click the MATLAB file that contains the entry-point function.
At the command line: use the code generation readiness function with the -gpu flag.
In the GPU Coder app: after you specify the entry-point files, the app runs the Code Analyzer and the code generation readiness tool.
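The command-line route can be sketched as follows. The function name is missing from the text above; this sketch assumes it is coder.screener, which accepts a -gpu option, and assumes that mandelbrot_count.m is on the MATLAB path.

```matlab
% Assumption: coder.screener is the readiness function referred to above.
% Opens the code generation readiness report for the entry-point function,
% with the checks that are specific to GPU code generation enabled.
coder.screener('-gpu','mandelbrot_count');
```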
Check for Issues at Code Generation Time
You can use GPU Coder to check for issues at code generation time. When GPU Coder detects errors or warnings, it generates an error report that describes the issues and provides links to the problematic MATLAB code.
Make the MATLAB Code Suitable for Code Generation
To begin the process of making your MATLAB code suitable for code generation, use the file mandelbrot_count.m.
Set your MATLAB current folder to the work folder that contains your files for this tutorial.
In the MATLAB Editor, open mandelbrot_count.m. The Code Analyzer message indicator at the top right corner of the MATLAB Editor is green. The analyzer did not detect errors, warnings, or opportunities for improvement in the code.
After the function declaration, add the %#codegen directive to turn on the error checking that is specific to code generation.
function count = mandelbrot_count(maxiterations,xgrid,ygrid) %#codegen
The Code Analyzer message indicator remains green, indicating that it has not detected any code generation issues.
To map the mandelbrot_count function to a CUDA kernel, modify the original MATLAB code by placing the coder.gpu.kernelfun pragma in the body of the function.
function count = mandelbrot_count(maxiterations,xgrid,ygrid) %#codegen

% Add kernelfun pragma to trigger kernel creation
coder.gpu.kernelfun;

% mandelbrot computation
z0 = xgrid + 1i*ygrid;
count = ones(size(z0));

z = z0;
for n = 0:maxiterations
    z = z.*z + z0;
    inside = abs(z)<=2;
    count = count + inside;
end
count = log(count);
If you use the coder.gpu.kernelfun pragma, GPU Coder attempts to map the computations in the function mandelbrot_count to the GPU.
Save the file. You are now ready to compile your code by using the GPU Coder app.
Generate Code by Using the GPU Coder App
Open the GPU Coder App
On the MATLAB toolstrip Apps tab, under Code Generation, click the GPU Coder app icon. You can also open the app from the MATLAB Command Window. The app opens the Select Source Files page.
Select Source Files
On the Select Source Files page, enter or select the name of the primary function, mandelbrot_count. The primary function is also known as the top-level or entry-point function. The app creates a project with the default name mandelbrot_count.prj in the current folder.
Click Next to go to the Define Input Types step. The app analyzes the function for coding issues and code generation readiness. If the app identifies issues, it opens the Review Code Generation Readiness page, where you can review and fix issues. In this example, because the app does not detect issues, it opens the Define Input Types page.
Define Input Types
The code generator must determine the data types of all the variables in the MATLAB files at compile time. Therefore, you must specify the data types of all the input variables. You can specify the input data types in one of these two ways:
Provide a test file that calls the project entry-point functions. The GPU Coder app can infer the input argument types by running the test file.
Enter the input types directly.
In this example, to define the properties of the inputs maxiterations, xgrid, and ygrid, specify the test file mandelbrot_test.m:
Enter or select the test file mandelbrot_test.m.
Click Autodefine Input Types.
The test file mandelbrot_test.m calls the entry-point function, mandelbrot_count.m, with the expected input types. The app infers that the input maxiterations is double(1x1) and the inputs xgrid and ygrid are double(1000x1000).
Click Next to go to the Check for Run-Time Issues step.
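When a test file is not available, the same input types can be described programmatically. The following sketch uses coder.typeof from MATLAB Coder to build type descriptions that match what the app infers; the variable names scalar_type, grid_type, and args are illustrative assumptions.

```matlab
% Type descriptions matching the inputs that the app infers from the test file.
scalar_type = coder.typeof(0);               % double(1x1) for maxiterations
grid_type   = coder.typeof(0,[1000 1000]);   % fixed-size double(1000x1000)

% Hypothetical cell array, in the argument order of mandelbrot_count.
args = {scalar_type, grid_type, grid_type};
```

A cell array like this can be passed to the -args option of codegen in a command-line workflow.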
Check for Run-Time Issues
The Check for Run-Time Issues step generates a MEX file from your entry-point functions, runs the MEX function, and reports issues. This step is optional, but performing it is a best practice because it lets you detect and fix defects that are harder to diagnose in the generated GPU code.
GPU Coder provides the option to perform GPU-specific checks at this point. When you select this option, GPU Coder generates CUDA code and a MEX file from your entry-point functions, runs the MEX function, and reports issues. The GPU-specific run-time checks include:
Checks for register spills.
Stack size conformance checks.
Note
Certain MATLAB constructs in your code can cause the Check for Run-Time Issues step to fail the CPU-specific checks but pass the GPU-specific checks.
To open the Check for Run-Time Issues dialog box, click the Check for Issues arrow.
In the Check for Run-Time Issues dialog box, specify a test file or enter code that calls the entry-point function with example inputs. For this example, use the test file mandelbrot_test.m that you used to define the input types.
To enable GPU-specific checks, select the GPU option button. Click Check for Issues.
The app generates a MEX function. It runs the test script mandelbrot_test, replacing calls to mandelbrot_count with calls to the generated MEX function. If the app detects issues during MEX function generation or execution, it provides warning and error messages. You can click these messages to navigate to the problematic code and fix the issue. In this example, the app does not detect issues. The MEX function has the same functionality as the original mandelbrot_count function.
Click Next to go to the Generate Code step.
Generate CUDA Code
To open the Generate dialog box, click the Generate arrow.
In the Generate dialog box, you can select the type of build that you want GPU Coder to perform. The available options are listed in this table.
Build Type         Description
Source Code        CUDA source code to integrate with an external project.
MEX                Compiled code to run inside MATLAB.
Static Library     Binary library for static linking with an external project.
Dynamic Library    Binary library for dynamic linking with an external project.
Executable         Standalone program (requires a custom CUDA main file).
For this tutorial, set Build type to MEX (.mex). By generating a MEX output, you can check the correctness of the generated CUDA code from within MATLAB. The MEX build type does not require additional settings such as toolchain and hardware board. It also does not provide the option to generate only the source code. GPU Coder can automatically select an available CUDA toolchain as long as the environment variables are set properly.
To view advanced options, select More Settings > GPU Code. To the Compiler Flags option, add --fmad=false. This flag, when passed to nvcc, instructs the compiler to disable the floating-point multiply-add (FMAD) optimization. This option is set to prevent numerical mismatch in the generated code because of architectural differences between the CPU and the GPU. For more information, see Numerical Differences Between CPU and GPU.
Click Generate.
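For comparison, a command-line sketch of the same build is shown here. It assumes that mandelbrot_count.m with the kernelfun pragma is on the path and that the GPU Coder prerequisites are installed; the variable name cfg is arbitrary.

```matlab
% Example inputs with the same types as in mandelbrot_test.m.
maxiterations = 500;
gridsize = 1000;
xlim = [-0.748766713922161,-0.748766707771757];
ylim = [0.123640844894862,0.123640851045266];
x = linspace(xlim(1),xlim(2),gridsize);
y = linspace(ylim(1),ylim(2),gridsize);
[xgrid,ygrid] = meshgrid(x,y);

% GPU code configuration object for a MEX build.
cfg = coder.gpuConfig('mex');
% Pass --fmad=false to nvcc to disable the multiply-add optimization.
cfg.GpuConfig.CompilerFlags = '--fmad=false';

% Generate mandelbrot_count_mex, inferring input types from the example values.
codegen -config cfg -args {maxiterations,xgrid,ygrid} mandelbrot_count
```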
GPU Coder generates the MEX executable mandelbrot_count_mex in your working folder. The \codegen\mex\mandelbrot_count folder contains all the other generated files, including the CUDA source (*.cu) and header files. The GPU Coder app indicates that the code generation succeeded. It displays the source MATLAB files and generated output files on the left side of the page. On the Variables tab, it displays information about the MATLAB source variables. On the Target Build Log tab, it displays the build log, including compiler warnings and errors. By default, in the code window, the app displays the CUDA source file mandelbrot_count.cu. To view a different file, in the Source Code or Output Files pane, click the file name.
To view the code generation report, click View Report. The report provides links to your MATLAB code and the generated CUDA (*.cu) files. It also provides compile-time information for the variables and expressions in your MATLAB code. This information helps you find the sources of errors and warnings and debug code generation issues in your code.
The GPU Kernels section on the Generated Code tab provides a list of kernels created during GPU code generation. The items in this list link to the relevant source code. For example, when you click mandelbrot_count_kernel1, the code section for this kernel is shown in the code browser window.
After you review the report, you can close the Code Generation Report window. To view the report later, open report.mldatx in the \codegen\mex\mandelbrot_count\html folder.
The \codegen\mex\mandelbrot_count folder contains the gpu_codegen_info.mat MAT-file, which holds the statistics for the generated GPU code. This MAT-file contains the cuda_Kernel variable, which has information about the thread and block sizes, shared and constant memory usage, and input and output arguments of each kernel. The cudaMalloc and cudaMemcpy variables contain information about the size of all the GPU variables and the number of memcpy calls between the host and the device.
In the GPU Coder app, click Next to open the Finish Workflow page.
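The statistics file can be inspected from the MATLAB command line. This minimal sketch assumes that code generation has already run in the current folder; it lists the stored variables rather than assuming their exact names, and info_file and info are illustrative names.

```matlab
% Path to the statistics file, relative to the folder where code was generated.
info_file = fullfile('codegen','mex','mandelbrot_count','gpu_codegen_info.mat');

if isfile(info_file)
    info = load(info_file);     % each saved variable becomes a struct field
    disp(fieldnames(info));     % list the variables stored in the MAT-file
else
    disp('gpu_codegen_info.mat not found; generate the MEX function first.');
end
```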
Review the Finish Workflow Page
The Finish Workflow page indicates that the code generation succeeded. It provides a project summary and links to the MATLAB source files, the code generation report, and the generated output binaries. You can save the configuration parameters of the current GPU Coder project as a MATLAB script.
Verify Correctness of the Generated Code
To verify the correctness of the generated MEX file, see Verify Correctness of the Generated Code.
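One possible check, sketched under the assumption that mandelbrot_count_mex has been generated in the current folder, compares the MEX output with the original MATLAB function on the tutorial inputs. No tolerance is prescribed here; the sketch only reports the largest difference, and the names count_ref, count_gpu, and max_diff are illustrative.

```matlab
% Rebuild the tutorial inputs.
maxiterations = 500;
gridsize = 1000;
xlim = [-0.748766713922161,-0.748766707771757];
ylim = [0.123640844894862,0.123640851045266];
x = linspace(xlim(1),xlim(2),gridsize);
y = linspace(ylim(1),ylim(2),gridsize);
[xgrid,ygrid] = meshgrid(x,y);

% Run the original MATLAB function on the CPU and the generated MEX on the GPU.
count_ref = mandelbrot_count(maxiterations,xgrid,ygrid);
count_gpu = mandelbrot_count_mex(maxiterations,xgrid,ygrid);

% Report the largest elementwise difference between the two results.
max_diff = max(abs(count_ref(:) - count_gpu(:)));
fprintf('Largest absolute difference: %g\n', max_diff);
```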
See Also
Functions
codegen