CUDA Code from CWT
This example shows how to generate a MEX file to perform the continuous wavelet transform (CWT) using generated CUDA® code.
First, ensure that you have a CUDA-enabled GPU and the NVCC compiler. See the GPU Coder documentation to verify that your configuration is correct.
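As a rough sketch of what such a configuration check could look like from the command line, the snippet below uses `coder.gpuEnvConfig` and `coder.checkGpuInstall` from GPU Coder. The particular property settings are assumptions chosen for illustration and are not part of the original example.

```matlab
% Sketch: check the GPU code generation environment before running codegen.
% Assumes GPU Coder is installed; the property choices below are illustrative.
envCfg = coder.gpuEnvConfig('host');   % check the GPU on the host machine
envCfg.BasicCodegen = 1;               % test that basic code generation works
envCfg.Quiet = 1;                      % suppress verbose command-line output
results = coder.checkGpuInstall(envCfg);
disp(results)                          % structure of pass/fail flags per check
```

If any of the reported checks fail, resolve them before moving on to the code generation step.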
Create a GPU Coder configuration object.
cfg = coder.gpuConfig("mex");
Generate a signal of 100,000 samples at 1,000 Hz. The signal consists of two cosine waves with disjoint time supports, plus additive Gaussian noise.
t = 0:.001:(1e5*0.001)-0.001;
x = cos(2*pi*32*t).*(t > 10 & t<=50)+ ...
    cos(2*pi*64*t).*(t >= 60 & t < 90)+ ...
    0.2*randn(size(t));
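The sample count, sampling rate, and time supports described above can be sanity-checked with a short sketch. The names `Fs`, `N`, `on32`, and `on64` are introduced here for illustration only. Building the time vector from integer sample indices is one possible alternative to the colon expression with a fractional step, and gives the same grid up to floating-point rounding.

```matlab
Fs = 1e3;            % sampling frequency in Hz (from the example)
N  = 1e5;            % number of samples (from the example)
t  = (0:N-1)/Fs;     % time grid built from integer sample indices

on32 = t > 10 & t <= 50;   % time support of the 32 Hz tone
on64 = t >= 60 & t < 90;   % time support of the 64 Hz tone

fprintf('samples: %d, duration: %g s\n', numel(t), N/Fs);
fprintf('tones overlap: %d\n', any(on32 & on64));
```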
Cast the signal to single precision. GPU calculations are often more efficient in single precision. However, you can also generate code for double precision if your NVIDIA® GPU supports it.
x = single(x);
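To make the trade-off concrete, the following sketch compares the memory footprint and precision of a double-precision vector and its single-precision copy. The data here is a hypothetical stand-in (`xd`, `xs`), not the example's signal.

```matlab
xd = 0.2*randn(1,1e5);     % stand-in double-precision data (hypothetical)
xs = single(xd);           % single-precision copy

infoD = whos('xd');
infoS = whos('xs');
fprintf('double: %d bytes, single: %d bytes\n', infoD.bytes, infoS.bytes);
fprintf('eps(double) = %g, eps(single) = %g\n', eps('double'), eps('single'));
fprintf('max cast error: %g\n', max(abs(double(xs) - xd)));
```

Single precision halves the storage per element at the cost of a larger machine epsilon, which is usually acceptable for a scalogram.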
Generate the GPU MEX file and a code generation report. To allow generation of the MEX file, you must specify the properties (class, size, and complexity) of the three input parameters:
coder.typeof(single(0),[1 1e5]) specifies a row vector of length 100,000 containing real single values.
coder.typeof('c',[1 Inf]) specifies a character array of arbitrary length.
coder.typeof(0) specifies a real double value.
sig = coder.typeof(single(0),[1 1e5]);
wav = coder.typeof('c',[1 Inf]);
sfrq = coder.typeof(0);
codegen cwt -config cfg -args {sig,wav,sfrq} -report
code generation successful: view report
The -report flag is optional. Using -report generates a code generation report. In the Summary tab of the report, you can find a GPU code metrics link, which provides detailed information such as the number of CUDA kernels generated and how much memory was allocated.
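As a check on the -args specification passed to codegen, the type objects can be inspected before code generation. This is a minimal sketch that assumes MATLAB Coder is installed and uses the `ClassName`, `SizeVector`, and `VariableDims` properties of the objects returned by `coder.typeof`.

```matlab
sig  = coder.typeof(single(0),[1 1e5]);
wav  = coder.typeof('c',[1 Inf]);
sfrq = coder.typeof(0);

disp(sig.ClassName)      % class of the signal argument
disp(sig.SizeVector)     % upper bound on each dimension of the signal
disp(wav.VariableDims)   % true where a dimension is variable-size
disp(sfrq.ClassName)     % class of the sampling-frequency argument
```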
Run the MEX file on the data and plot the scalogram. Confirm the plot is consistent with the two disjoint cosine waves.
[cfs,f] = cwt_mex(x,'morse',1e3);
image("XData",t,"YData",f,"CData",abs(cfs),"CDataMapping","scaled")
set(gca,"YScale","log")
axis tight
xlabel("Time (Seconds)")
ylabel("Frequency (Hz)")
title("Scalogram of Two-Tone Signal")
Run the cwt command above without appending _mex. Confirm that the MATLAB® and the GPU MEX results agree to within a small numerical tolerance.
[cfs2,f2] = cwt(x,'morse',1e3);
max(abs(cfs2(:)-cfs(:)))
ans = single
7.3380e-07
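A maximum absolute difference is easier to interpret when scaled by the magnitude of the reference coefficients. The sketch below shows one way to do that. The helper `relErr`, the stand-in arrays, and the tolerance are all assumptions introduced for illustration. With the variables from this example you would call `relErr(cfs,cfs2)` instead.

```matlab
% Hypothetical helper: max absolute error relative to the largest reference value.
relErr = @(a,b) max(abs(a(:)-b(:))) / max(abs(b(:)));

% Stand-in data so the snippet runs on its own (not the example's coefficients).
ref    = single([1+2i, 3-1i, 0.5i]);
approx = ref + single(1e-7);

tol = 10*eps('single');   % assumed tolerance, not taken from the example
fprintf('relative error: %g (tolerance %g)\n', relErr(approx,ref), tol);
```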