Performance

Troubleshoot code generation issues, improve code execution time, and reduce memory usage of generated code

Some of the most common reasons why code generated by GPU Coder™ does not perform as expected are:

  • CUDA® kernels are not created.

  • Host-to-device and device-to-host memory transfers (cudaMemcpy) are throttling performance.

  • There is not enough parallelism, or there are device issues.

These topics elaborate on the common causes of these symptoms and describe how to use the built-in screener to detect them. They also explain how to work around these issues and generate more efficient CUDA code.
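As a starting point for checking the first symptom, the following is a minimal sketch rather than a definitive workflow. It assumes a hypothetical entry-point function `scaleAdd` (not part of this page) saved in its own file, and it assumes that GPU Coder and a supported CUDA toolchain are installed. It uses the `coder.gpu.kernelfun` pragma, a `coder.gpuConfig` configuration object, and `codegen` with report generation enabled, so that the resulting report can be inspected for created kernels and `cudaMemcpy` calls.

```matlab
% --- File: scaleAdd.m (hypothetical entry-point function, for illustration only) ---
function out = scaleAdd(a, b) %#codegen
% Ask GPU Coder to map the computation in this function to CUDA kernels.
coder.gpu.kernelfun;
out = 2 .* a + b;
end

% --- Run at the MATLAB command prompt ---
% Create a GPU code configuration object for a MEX target.
cfg = coder.gpuConfig('mex');
% Produce a code generation report for inspecting kernels and memory transfers.
cfg.GenerateReport = true;
% Example input: the size and type here are assumptions chosen for the sketch.
x = ones(1024, 1, 'single');
codegen -config cfg -args {x, x} scaleAdd
```

If the report shows no kernels for a function like this, the troubleshooting topics listed on this page are the place to look for the cause.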

Apps

Generate GPU code from MATLAB code
Verify and set up GPU code generation environment

Functions

codegen: Generate C/C++ code from MATLAB code
Open GPU Coder app
Analyze and optimize performance of the generated code
Pragma that maps for-loops to GPU kernels
Pragma that maps function to GPU kernels
Pragma to disable kernel creation for loops
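
The loop-level pragmas listed above can be sketched as follows. In GPU Coder, `coder.gpu.kernel` requests a kernel for the for-loop that immediately follows it, and `coder.gpu.nokernel` disables kernel creation for the loop that follows it. The function `loopDemo` and its variables are hypothetical names introduced only for this illustration, and the sketch assumes a non-empty input vector.

```matlab
% --- File: loopDemo.m (hypothetical function, for illustration only) ---
function [s, c] = loopDemo(x) %#codegen
n = numel(x);

% Independent iterations: request that this loop be mapped to a GPU kernel.
s = zeros(size(x), 'like', x);
coder.gpu.kernel();
for i = 1:n
    s(i) = x(i) * x(i);
end

% Each iteration depends on the previous one (running sum), so this
% sketch keeps the loop from being turned into a kernel.
c = zeros(size(x), 'like', x);
c(1) = x(1);   % assumes x is non-empty
coder.gpu.nokernel();
for i = 2:n
    c(i) = c(i-1) + x(i);
end
end
```

When this function runs in MATLAB without code generation, the pragmas have no effect, so the same file can be used to check results before generating CUDA code.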

Objects

Configuration parameters for CUDA code generation from MATLAB code by using GPU Coder
Configuration parameters for C/C++ code generation from MATLAB code
Configuration parameters for C/C++ code generation from MATLAB code with Embedded Coder
Create configuration object containing the parameters passed to coder.checkGpuInstall for performing GPU code generation environment checks
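
A minimal sketch of the environment check mentioned in the last entry is shown below. It assumes a host (desktop) GPU target and that the environment configuration object is created with `coder.gpuEnvConfig`. The particular checks that are enabled are choices made for this illustration, not recommendations from this page.

```matlab
% Create an environment configuration object for the host GPU.
envCfg = coder.gpuEnvConfig('host');

% Enable a basic code generation check and a basic code execution check.
envCfg.BasicCodegen = 1;
envCfg.BasicCodeexec = 1;

% Suppress command-line output; inspect the returned structure instead.
envCfg.Quiet = 1;

% Run the checks and display the results.
results = coder.checkGpuInstall(envCfg);
disp(results)
```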

Topics

  • GPU Coder troubleshooting workflow.

  • Create and view reports generated during code generation.

  • Trace between generated CUDA code and MATLAB source code

    Highlight sections of MATLAB code that run on the GPU.

  • Create and explore GPU static code metrics report.

  • Visualize code metrics and identify optimization and tuning opportunities in your code.

  • Suggestions for debugging CUDA MEX functions.

  • Recommendations for generating efficient CUDA kernels.

  • Reduce memory bottleneck issues when using GPU Coder.

  • Improve performance by using the information obtained from NVIDIA profiler (nvvp).

  • Troubleshoot compilation failures due to a register count nvlink error.
