Generates a multi-threaded MEX file from a MATLAB function
Description
dspunfold generates a multi-threaded MEX file from the entry-point MATLAB® function specified by file, using the unfolding technology.
Unfolding is a technique to improve throughput through parallelization. The multi-threaded MEX file leverages the multicore CPU architecture of the host computer and can improve speed significantly. In addition to the multi-threaded MEX file, the function generates a single-threaded MEX file, a self-diagnostic analyzer function, and the corresponding help files.
Input Arguments
Output Files
When you invoke dspunfold on an entry-point MATLAB function, dspunfold generates the following files.
File | Value | Description | Examples |
---|---|---|---|
Multi-threaded MEX file | MEX file | Multi-threaded MEX file generated from the entry-point MATLAB function. The MEX file inherits the | |
Help file for the multi-threaded MEX file | MATLAB file | MATLAB help file for the multi-threaded MEX file. The help file has the same name as the MEX file, but with an '.m' extension. To invoke the help file, type help followed by the name of the MEX file. This help file displays information on how to invoke the MEX file, its syntax, | |
Single-threaded MEX file | MEX file | Single-threaded MEX file generated from the entry-point MATLAB function. The MEX file inherits the | |
Help file for the single-threaded MEX file | MATLAB file | MATLAB help file for the single-threaded MEX file. The help file has the same name as the MEX file, but with an '.m' extension. To invoke the help file, type help followed by the name of the MEX file. The help file displays information on how to invoke the MEX file, its syntax, and types (size, class, and complexity) of the inputs to the MEX file. The syntax to invoke the MEX file should be the same as the syntax shown in the help file. | |
Self-diagnostic analyzer function | P-coded file | The first dimension of the analyzer inputs must be a multiple of the first dimension of the corresponding inputs given to the MEX file. The analyzer inherits the | |
Help file for the self-diagnostic analyzer function | MATLAB file | Help file for the self-diagnostic analyzer function. The help file has the same name as the MEX file, but with an '.m' extension. To invoke the help file, type help followed by the file name. The help file for the self-diagnostic analyzer function displays information on how to invoke the analyzer function, its syntax, and types (size, class, and complexity) of the inputs to the analyzer function. The syntax to invoke the analyzer function should be the same as the syntax shown in the help file. | |
Limitations
General Limitations:
- On Windows and Linux, you must use a compiler that supports the Open Multiprocessing (OpenMP) application interface.
- On macOS with Xcode version 12.0 or later, the dspunfold function is not supported.
- If the input MATLAB function has runtime errors, the errors are not caught when you run the multi-threaded MEX file. Before you use the dspunfold function, call codegen on the MATLAB function and make sure that the MEX file is generated successfully.
- If the generated code uses a large amount of memory to store the local variables, around 4 MB on the Windows platform, the generated multi-threaded MEX file can have unexpected behavior. This limit varies with each platform. As a workaround, reduce the size of the input signals or restructure the MATLAB function to use less local memory.
- dspunfold does not support:
  - and inside the MATLAB function
  - Variable-size inputs and outputs
  - Input signals with an arbitrary frame length to System objects that use the DecimationFactor property. The input signal is considered to have an arbitrary frame length when its frame length is not a multiple of the decimation factor. In that case, the output of the object in the generated code is a variable-size signal, and dspunfold does not support variable-size output signals. In the case of the object, you can determine the decimation factor using the function.
  - P-coded entry-point MATLAB functions
  - Cell arrays as inputs and outputs
Analyzer Limitations:
The following limitations apply to the analyzer function generated by the dspunfold function. For more information on the analyzer function, see 'Self-Diagnostic Analyzer' in the 'More About' section of dspunfold.
- If multiple frames of the analyzer input are identical, the analyzer might throw false positive pass results. It is recommended that you provide at least two different frames for each input of the analyzer.
- If the algorithm in the entry-point MATLAB function chooses its state length based on the input values, the analyzer might provide different pass results for different input values. For an example, see the fir_mean function.
- If the input to the entry-point MATLAB function does not affect the output immediately, the analyzer might throw false positive pass results. For an example, see the input_output function.
- If the output results of the multi-threaded MEX file and single-threaded MEX file match statistically but do not match numerically, the analyzer does not pass. Consider the filternoise function that follows, which filters a random noise signal with an FIR filter. The function calls randn from within itself to generate random noise. Hence, the output results of the filternoise function match statistically but do not match numerically.

    function output = filternoise(x)
    % Keep the filter in a persistent variable so its state survives between calls.
    persistent firfilter
    if isempty(firfilter)
        firfilter = dsp.FIRFilter('Numerator',fir1(12,0.4));
    end
    % The noise is drawn inside the function, so each call sees different values.
    output = firfilter(x + randn(1000,1));
    end

When you run the automatic state length detection tool on filternoise, the tool detects an infinite state length. Because the tool cannot find a numerical match for a finite state length, it chooses an infinite state length.

    dspunfold filternoise -args {randn(1000,1)} -s auto

    analyzing input matlab function filternoise
    creating single-threaded mex file filternoise_st.mexw64
    searching for minimal state length (this might take a while)
    checking stateless ... insufficient
    checking 1 ... insufficient
    checking infinite ... sufficient
    checking 2 ... insufficient
    minimal state length is inf
    creating multi-threaded mex file filternoise_mt.mexw64
    warning: the multi-threading was disabled due to performance considerations. this happens when the state length is greater than or equal to (threads-1)*repetition frames (3 frames in this case).
    > in coder.internal.warning (line 8)
      in unfoldingengine/buildparallelsolution (line 25)
      in unfoldingengine/generate (line 207)
      in dspunfold (line 234)
    creating analyzer file filternoise_analyzer

The algorithm does not need an infinite state. The state length of the FIR filter, and hence of the algorithm, is 12. Call dspunfold with the state length set to 12.

    dspunfold filternoise -args {randn(1000,1)} -s 12 -f true

    analyzing input matlab function filternoise
    creating single-threaded mex file filternoise_st.mexw64
    creating multi-threaded mex file filternoise_mt.mexw64
    creating analyzer file filternoise_analyzer

Run the analyzer function.

    filternoise_analyzer(randn(1000*4,1))

    analyzing multi-threaded mex file filternoise_mt.mexw64 ...
    latency = 8 frames
    speedup = 0.5x
    warning: the output results of the multi-threaded mex file filternoise_mt.mexw64 do not match the output results of the single-threaded mex file filternoise_st.mexw64. check that you provided the correct state length value to the dspunfold function when you generated the multi-threaded mex file filternoise_mt.mexw64. for best practices and possible solutions to this problem, see the 'tips' section in the dspunfold function reference page.
    > in coder.internal.warning (line 8)
      in filternoise_analyzer
    ans =
        latency: 8
        speedup: 0.4970
        pass: 0

The analyzer looks for a numerical match and fails the verification, even though the generated multi-threaded MEX file is valid.
Speedup Limitations:
- If the entry-point MATLAB function contains code with low complexity, the MATLAB overhead or the multi-threaded MEX overhead overshadows any performance gains. In such cases, do not use dspunfold.
- If the number of operations in the input MATLAB function is small compared to the size of the input or output data, the multi-threaded MEX file does not provide any speedup gain. Sometimes, it can result in a speedup loss, even if the repetition value is increased. In such cases, do not use dspunfold.
More About
Tips
General
- Do not display plots or scopes, or execute other user interface operations, from within the multi-threaded MEX file. The generated MEX file can have unexpected behavior.
- Do not use coder.extrinsic inside the input MATLAB function. The generated MEX file can have unexpected behavior.
- When the state length is less than or equal to (threads - 1) × repetition frames:
  - Do not use a random number inside the MATLAB function. The outputs of the single-threaded MEX file and the multi-threaded MEX file might not match. Also, the outputs of consecutive executions of the multi-threaded MEX file might not match. The analyzer might not pass the numerical match verification.
  - It is recommended that you generate the random number outside the entry-point MATLAB function and pass it as an argument to the function.
- Do not use global or persistent variables anywhere other than in the entry-point MATLAB function. For example, avoid using persistent variables in subfunctions. The generated MEX file can produce inaccurate results. In general, global variables are not recommended.
- Do not access I/O resources from within the multi-threaded MEX file. The generated MEX file can have unexpected behavior. These resources include file writers and readers, UDP sockets, and audio players and recorders.
- Do not use functions with interactive inputs (for example, the keyboard) inside the multi-threaded MEX file. The generated MEX file can have unexpected behavior.
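The recommendation to generate random numbers outside the entry-point function can be sketched as follows. This is a minimal illustration, not a definitive implementation: filternoise_ext and its second input noise are hypothetical names that do not appear in the original, adapted from the filternoise function in the analyzer limitations. The sketch assumes that DSP System Toolbox (dsp.FIRFilter) and Signal Processing Toolbox (fir1) are available.

```matlab
function output = filternoise_ext(x, noise)
% FILTERNOISE_EXT Hypothetical variant of filternoise (illustration only).
% The caller generates the noise and passes it in, so the single-threaded
% and multi-threaded MEX files receive identical data.
%
% Possible invocation at the command prompt (an assumption, not from the
% original page):
%   dspunfold filternoise_ext -args {randn(1000,1),randn(1000,1)} -s 12 -f true

% The filter lives in a persistent variable of the entry-point function,
% so its state carries over from one frame to the next.
persistent firfilter
if isempty(firfilter)
    firfilter = dsp.FIRFilter('Numerator', fir1(12, 0.4));
end

% No call to randn here: the only randomness is the noise argument.
output = firfilter(x + noise);
end
```

Because the function is deterministic for given inputs, a numerical comparison between the two MEX files becomes meaningful. The invocation in the help comment assumes that a single -f value applies to both inputs.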
Workflow
To generate a valid multi-threaded MEX file with the required speedup and latency, follow this workflow:
- Before using dspunfold, call codegen on the entry-point MATLAB function and make sure that the function generates a MEX file successfully.
- After generating the multi-threaded MEX file using dspunfold, run the analyzer function. Make sure that the analyzer function passes. The exception to this rule is when the algorithm produces results that match statistically, but not numerically. In that case, the analyzer function does not pass, even though the dspunfold function generates a valid multi-threaded MEX file. See 'Analyzer Limitations' for an example.
- For help on using the MEX file and analyzer, at the MATLAB command prompt, enter help followed by the name of the MEX file or of the analyzer function.
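At the command prompt, this workflow might look like the sketch below. The entry-point function myfilter is hypothetical (it is not part of the original); it is assumed to be on the MATLAB path, to accept a single 1000-by-1 double frame, and to be supported for code generation. The generated names myfilter_mt and myfilter_analyzer assume the same naming pattern as the filternoise_mt and filternoise_analyzer files shown in the analyzer limitations.

```matlab
% Hypothetical entry-point function: myfilter.m, one 1000-by-1 input.

% Step 1: confirm that plain MEX generation succeeds before unfolding.
codegen myfilter -args {randn(1000,1)}

% Step 2: generate the multi-threaded MEX file, the single-threaded MEX
% file, and the analyzer. -s auto requests automatic state length detection.
dspunfold myfilter -args {randn(1000,1)} -s auto

% Step 3: run the analyzer on several different frames appended along the
% first dimension, then inspect the reported latency, speedup, and pass flag.
result = myfilter_analyzer(randn(1000*4,1));
disp(result)

% Step 4: display the generated help for the MEX file and the analyzer.
help myfilter_mt
help myfilter_analyzer
```

If the analyzer does not pass, the state length tips that follow suggest increasing the state length and regenerating the MEX file.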
State Length
- If you choose a state length that is greater than or equal to the value of the exact state length, the analyzer passes. If the analyzer fails, increase the state length, regenerate the MEX file, and verify again.
- If the state length is greater than 0, the inputs marked as frames (through the -f option) must all have the same dimensions.
- When generating the MEX file and running the analyzer, use inputs that invoke the same state length.
Automatic State Length Detection
When you set -s to auto:
- If the algorithm in the entry-point MATLAB function chooses a code path based on the input values, use inputs that choose the code path with the longest state length.
- Provide random inputs to -args.
- Choose inputs that have an immediate effect on the output.
Analyzer
- Make sure the outputs of the multi-threaded MEX file and the single-threaded MEX file do not contain NaN or Inf. If they do, the analyzer cannot do numeric checks and returns pass as false. The automatic state length detection tool detects an infinite state length and displays a warning:
  warning: the output results of the multi-threaded mex file do not match the output results of the single-threaded mex file even for infinite state length. a possible reason is that input matlab function generates different output results between consecutive runs even for the same input values.
- Provide multiple frames with different values for each input of the analyzer. To improve the analyzer effectiveness, append successive frames along the first dimension.
- Provide inputs to the analyzer that lead to efficient code coverage.
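As a sketch of the advice above, the following builds an analyzer input from several distinct frames. The frame length of 1000 and the count of four frames are assumptions borrowed from the filternoise example, and the variable names are illustrative.

```matlab
frameLength = 1000;   % first dimension of the input given to -args (assumed)
numFrames   = 4;      % number of successive frames to append (assumed)

% Draw each frame independently so that no two frames are identical.
frames = cell(numFrames, 1);
for k = 1:numFrames
    frames{k} = randn(frameLength, 1);
end

% Append the frames along the first dimension.
analyzerInput = vertcat(frames{:});

% The first dimension is now a multiple of the frame length.
assert(mod(size(analyzerInput, 1), frameLength) == 0)
```

The resulting analyzerInput could then be passed to a generated analyzer, for example filternoise_analyzer(analyzerInput).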
Speedup
- To improve the speedup of the multi-threaded MEX file, specify the exact state length in samples. You can specify the state length in samples by setting at least one entry of frameinputs to true. The use of samples reduces the overhead and increases the speedup.
- To increase the speedup at the cost of larger latency, you can:
  - Increase the repetition factor. Use the -r option.
  - Increase the number of threads. Use the -t option.
- For each input that can be divided into samples without altering the algorithm behavior, set frame status to true using the -f option. The input is then considered in samples, which can increase the speedup of the generated multi-threaded MEX file.
Algorithms
The multi-threaded MEX file buffers multiple input signal frames into a buffer of 2 × threads × repetition frames, where threads is the number of threads, and repetition is the repetition factor. The MEX file processes these frames simultaneously, using multiple cores. This process introduces some deterministic latency, where latency = 2 × threads × repetition. Latency is traded off with the speedup you might gain by increasing the number of threads or the repetition factor.
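The latency relation can be checked with a few lines of arithmetic. In this sketch, the values threads = 4 and repetition = 1 are assumptions. They are chosen because they reproduce the 8-frame latency and the 3-frame threshold reported in the filternoise output in the analyzer limitations. The variable names are illustrative.

```matlab
threads     = 4;      % assumed number of threads
repetition  = 1;      % assumed repetition factor
frameLength = 1000;   % samples per frame, as in the filternoise example

bufferFrames   = 2 * threads * repetition;   % frames buffered by the MEX file
latencyFrames  = bufferFrames;               % latency = 2 x threads x repetition
latencySamples = latencyFrames * frameLength;

% Multi-threading is disabled when the state length reaches this many frames.
disableThreshold = (threads - 1) * repetition;

fprintf('Latency: %d frames (%d samples)\n', latencyFrames, latencySamples);
fprintf('Multi-threading threshold: %d frames\n', disableThreshold);
```

Doubling either threads or repetition doubles the latency, which is the cost side of the trade-off described above.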
Version History
Introduced in R2015b
See Also
Topics
- MATLAB Algorithm Acceleration (MATLAB Coder)