Performance
Some of the most common reasons why GPU Coder™ generated code is not performing as expected are:
CUDA® kernels are not created.
Host-to-device and device-to-host memory transfers (cudaMemcpy) are throttling performance.
Not enough parallelism or device issues.
These topics elaborate on the common causes of these symptoms and describe how to use the built-in screener to detect them. They also explain how to work around these issues and generate more efficient CUDA code.
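As a starting point for checking whether kernels are created at all, the following is a minimal sketch, not a definitive workflow. The function name scaleAdd and the input sizes are hypothetical and chosen only for illustration. It assumes GPU Coder and a supported CUDA toolkit are installed. The entry-point function goes in its own file, scaleAdd.m:

```matlab
function out = scaleAdd(a, b) %#codegen
% Hypothetical entry-point function used only for illustration.
% The kernelfun pragma asks GPU Coder to map the computation
% in this function to CUDA kernels where it can.
coder.gpu.kernelfun();
out = 2 .* a + b;
end
```

Code generation with a report can then be requested from the MATLAB command line:

```matlab
% Create a GPU code configuration object for a MEX target.
cfg = coder.gpuConfig('mex');

% Generate CUDA MEX code for two 1-by-1024 single-precision inputs
% and produce a code generation report for inspection.
codegen -config cfg -args {ones(1,1024,'single'), ones(1,1024,'single')} -report scaleAdd
```

If the generated code in the report contains no kernel launches, or if cudaMemcpy calls surround every small operation, the topics listed on this page are the place to look next.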
Topics
GPU Coder troubleshooting workflow.
Create and view reports generated during code generation.
Trace between generated CUDA code and MATLAB source code.
Highlight sections of MATLAB code that run on the GPU.
Create and explore a GPU static code metrics report.
Visualize code metrics and identify optimization and tuning opportunities in your code.
Suggestions for debugging CUDA MEX functions.
Recommendations for generating efficient CUDA kernels.
Reduce memory bottleneck issues when using GPU Coder.
Improve performance by using the information obtained from NVIDIA Visual Profiler (nvvp).
Troubleshoot compilation failures due to a register count nvlink error.