speed vs. area tradeoffs
overview of speed or area optimizations
the coder provides options that extend your control over speed vs. area tradeoffs in the realization of filter designs. to achieve the desired tradeoff, you can either specify a fully parallel architecture for generated hdl filter code, or choose one of several serial architectures. these architectures are described in parallel and serial architectures.
this table summarizes the filter types that are available for parallel and serial architecture choices.
architecture | available for filter types |
---|---|
fully parallel (default) | filter types that are supported for hdl code generation |
fully serial | |
partly serial | |
cascade serial | |
the coder supports the full range of parallel and serial architecture options via properties passed in to the generatehdl function, as described in specifying speed vs. area tradeoffs via generatehdl properties.
alternatively, you can use the architecture pop-up menu on the generate hdl tool to choose parallel and serial architecture options, as described in select architectures in the generate hdl tool.
note
the coder also supports distributed arithmetic (da), another highly efficient architecture for realizing filters.
parallel and serial architectures
fully parallel architecture
this option is the default selection. a fully parallel architecture uses a dedicated multiplier and adder for each filter tap; the taps execute in parallel. this type of architecture is optimal for speed. however, it requires more multipliers and adders than a serial architecture, and therefore consumes more chip area.
serial architectures
serial architectures reuse hardware resources in time, saving chip area. the coder provides a range of serial architecture options. these architectures have a latency of one clock period (see latency in serial architectures).
you can select from these serial architecture options:
fully serial: a fully serial architecture conserves area by reusing multiplier and adder resources sequentially. for example, a four-tap filter design would use a single multiplier and adder, executing a multiply/accumulate operation once for each tap. the multiply/accumulate section of the design runs at four times the input/output sample rate. this type of architecture saves area at the cost of some speed loss and higher power consumption.
in a fully serial architecture, the system clock runs at a much higher rate than the sample rate of the filter. thus, for a given filter design, the maximum speed achievable by a fully serial architecture is less than the maximum speed of a parallel architecture.
partly serial: partly serial architectures cover the full range of speed vs. area tradeoffs that lie between fully parallel and fully serial architectures.
in a partly serial architecture, the filter taps are grouped into serial partitions. the taps within each partition execute serially, but the partitions execute together in parallel. the outputs of the partitions are summed at the final output.
when you select a partly serial architecture for a filter, you can define the serial partitioning in these ways:
define the serial partitions directly, as a vector of integers. each element of the vector specifies the length of the corresponding partition.
specify the desired hardware folding factor ff, an integer greater than 1. given the folding factor, the coder computes the serial partition and the number of multipliers.
specify the desired number of multipliers nmults, an integer greater than 1. given the number of multipliers, the coder computes the serial partition and the folding factor.
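as a rough illustration of how the three parameters relate, the sketch below derives a multiplier count and a folding factor from a partition vector. it assumes one multiply/accumulate unit per partition (because taps within a partition execute serially) and a folding factor equal to the longest partition; this is a reading of the description above, not the coder's documented algorithm, and the partition itself is a made-up example.

```matlab
% Hypothetical partition of an 8-tap filter into three serial partitions.
partition = [3 3 2];

% Assumption: each partition shares one multiplier among its taps.
numMultipliers = numel(partition);

% Assumption: the longest partition sets the number of clock cycles
% needed per input sample, that is, the folding factor.
foldingFactor = max(partition);

fprintf('taps: %d, multipliers: %d, folding factor: %d\n', ...
    sum(partition), numMultipliers, foldingFactor);
```

for the values the coder actually computes for a given filter, use hdlfilterserialinfo or the generate hdl tool.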
the generate hdl tool lets you specify a partly serial architecture in terms of these three parameters. you can then view how a change in one parameter interacts with the other two. the coder also provides hdlfilterserialinfo, an informational function that helps you define an optimal serial partition for a filter.
cascade-serial: a cascade-serial architecture closely resembles a partly serial architecture. as in a partly serial architecture, the filter taps are grouped into several serial partitions that execute together in parallel. however, the accumulated output of each partition cascades to the accumulator of the previous partition. the output of the partitions is therefore computed at the accumulator of the first partition. this technique is termed accumulator reuse. you do not require a final adder, which saves area.
the cascade-serial architecture requires an extra cycle of the system clock to complete the final summation to the output. therefore, the frequency of the system clock must be increased slightly with respect to the clock used in a noncascade partly serial architecture.
to generate a cascade-serial architecture, you specify a partly serial architecture with accumulator reuse enabled. if you do not specify the serial partitions, the coder automatically selects an optimal partitioning.
latency in serial architectures
serialization of a filter increases the total latency of the design by one clock cycle. the serial architectures use an accumulator (an adder with a register) to add the sequential products. an additional final register is used to store the summed result of each of the serial partitions. the operation requires an extra clock cycle.
holding input data in a valid state
serial architectures implement internal clock rates higher than the input rate. in such filter implementations, there are n cycles (n >= 2) of the base clock for each input sample. you can specify how many clock cycles the test bench holds the input data values in a valid state.
when you select hold input data between samples (the default), the test bench holds the input data values in a valid state for n clock cycles.
when you clear hold input data between samples, the test bench holds input data values in a valid state for only one clock cycle. for the next n-1 cycles, the test bench drives the data to an unknown state (expressed as 'x') until the next input sample is clocked in. forcing the input data to an unknown state verifies that the generated filter code registers the input data only on the first cycle.
the figure shows the test bench pane of the generate hdl tool, with hold input data between samples set to its default setting.
use the equivalent property when you call the generatehdl function.
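a minimal sketch of the command-line equivalent follows. the text above does not name the property; to the best of my knowledge it is 'HoldInputDataBetweenSamples', with the test bench enabled through 'GenerateHDLTestbench', but treat both names as assumptions and confirm them against the generatehdl reference for your release. the 9-tap filter and the input data type are also made up for illustration.

```matlab
% Assumed example filter: a 9-tap direct-form FIR System object.
firfilt = dsp.FIRFilter('Numerator', fir1(8, 0.4));

% Fully serial code plus a test bench that does NOT hold the input data:
% after the first cycle of each sample, the test bench drives 'x'.
generatehdl(firfilt, ...
    'InputDataType', numerictype(1,16,15), ... % assumed fixed-point input type
    'SerialPartition', 9, ...                  % fully serial: one partition of all 9 taps
    'GenerateHDLTestbench', 'on', ...          % assumed property name
    'HoldInputDataBetweenSamples', 'off');     % assumed property name
```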
specifying speed vs. area tradeoffs via generatehdl properties
by default, generatehdl generates filter code using a fully parallel architecture. if you want a fully parallel architecture, you do not have to specify it explicitly.
two properties specify serial architecture options to the generatehdl function:
serialpartition: this property specifies the serial partitioning of the filter.
reuseaccum: this property enables or disables accumulator reuse.
the table summarizes how to set these properties to generate the desired architecture.
to generate this architecture... | set serialpartition to... | set reuseaccum to... |
---|---|---|
fully parallel | omit this property | omit this property |
fully serial | n, where n is the length of the filter | not specified, or 'off' |
partly serial | a vector of integers, in which each element specifies the length of the corresponding partition. you can also specify a serial architecture in terms of a desired hardware folding factor, or in terms of the desired number of multipliers. | 'off' |
cascade-serial with explicitly specified partitioning | [p1 p2 p3...pn]: a vector of integers having n elements, where n is the number of serial partitions. each element of the vector specifies the length of the corresponding partition. the sum of the vector elements must equal the length of the filter. the values of the vector elements must appear in descending order, except that the last two elements may be equal. for example, for a filter of length 9, partitions such as [5 4] or [4 3 2] would be legal, but the partitions [3 3 3] or [3 2 4] raise an error at code generation time. | 'on' |
cascade-serial with automatically optimized partitioning | omit this property | 'on' |
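the ordering rule for explicit cascade-serial partitions can be checked before code generation. the sketch below is a hypothetical helper, not a coder function. it encodes only the rule as read from the examples in the table (elements sum to the filter length and descend strictly, except that the last element may equal the one before it); the coder may apply further checks.

```matlab
% Hypothetical validity check for a cascade-serial partition vector p
% against a filter of length len. Not part of the coder.
isValidCascade = @(p, len) ...
    sum(p) == len && ...                      % partitions must cover every tap
    all(diff(p(1:end-1)) < 0) && ...          % strictly descending up to the second-to-last element
    (numel(p) < 2 || p(end) <= p(end-1));     % last element may equal, but not exceed, its predecessor

% The four examples from the table, for a filter of length 9.
disp(isValidCascade([5 4],   9))   % legal
disp(isValidCascade([4 3 2], 9))   % legal
disp(isValidCascade([3 3 3], 9))   % illegal: first two elements are equal
disp(isValidCascade([3 2 4], 9))   % illegal: not in descending order
```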
you can use the hdlfilterserialinfo helper function to explore possible partitions for your filter.
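the property settings in the table can be sketched as generatehdl calls, one per architecture. the filter (a 9-tap fir System object built with fir1) and the input data type are assumptions chosen so that the partitions from the table apply; the property names are the ones listed in the table.

```matlab
% Assumed example filter: 9 coefficients, so the filter length is 9.
firfilt = dsp.FIRFilter('Numerator', fir1(8, 0.4));
inType  = numerictype(1,16,15);   % assumed fixed-point input type

% Note: each call regenerates the HDL files for the same filter.

% Fully parallel (default): omit both properties.
generatehdl(firfilt, 'InputDataType', inType);

% Fully serial: one partition as long as the filter.
generatehdl(firfilt, 'InputDataType', inType, 'SerialPartition', 9);

% Partly serial: explicit partitions, accumulator reuse off.
generatehdl(firfilt, 'InputDataType', inType, ...
    'SerialPartition', [5 4], 'ReuseAccum', 'off');

% Cascade-serial with explicitly specified partitioning.
generatehdl(firfilt, 'InputDataType', inType, ...
    'SerialPartition', [5 4], 'ReuseAccum', 'on');

% Cascade-serial with automatically optimized partitioning.
generatehdl(firfilt, 'InputDataType', inType, 'ReuseAccum', 'on');
```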
serial architectures for iir sos filters
to specify a partly or fully serial architecture for an iir sos filter structure (df1sos or dsp.biquadfilter), specify one of these parameters:
'foldingfactor',ff: specify the desired hardware folding factor ff, an integer greater than 1. given the folding factor, the coder computes the number of multipliers.
'nummultipliers',nmults: specify the desired number of multipliers nmults, an integer greater than 1. given the number of multipliers, the coder computes the folding factor.
to obtain information about the folding factor options and the corresponding number of multipliers for a filter, call the hdlfilterserialinfo function.
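a minimal sketch of this workflow for an iir sos filter follows. it reuses the fifth-order butterworth design and the folding factor of 18 from the direct form i sos example later in this topic; the 'InputDataType' value is an assumption, and the class of the object that design returns may differ between releases.

```matlab
% Fifth-order lowpass Butterworth filter in direct form I SOS structure.
Fs = 48e3;      % sampling frequency
Fc = 10.8e3;    % cut-off frequency
N  = 5;         % filter order
f_lp = fdesign.lowpass('N,F3dB', N, Fc, Fs);
filt = design(f_lp, 'butter', 'FilterStructure', 'df1sos', 'SystemObject', true);

inType = numerictype(1,16,15);   % assumed fixed-point input type

% List the folding factor options and the matching multiplier counts.
hdlfilterserialinfo(filt, 'InputDataType', inType);

% Generate a partly serial implementation from a chosen folding factor.
foldingFactor = 18;
generatehdl(filt, 'InputDataType', inType, 'FoldingFactor', foldingFactor);
```

specify 'NumMultipliers' instead of 'FoldingFactor' to fix the multiplier count and let the coder compute the folding factor.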
select architectures in the generate hdl tool
the architecture pop-up menu, in the generate hdl tool, lets you select parallel and serial architecture. these topics describe the ui options you must set for each architecture choice.
specifying a fully parallel architecture
the default architecture setting is fully parallel, as shown.
specifying a fully serial architecture
when you select the fully serial architecture option, the generate hdl tool displays additional information about the folding factor, number of multipliers, and serial partitioning. because these parameters depend on the length of the filter, they display in a read-only format, as shown in this figure.
the generate hdl tool also displays a view details link. when you click this link, the coder displays an html report in a separate window. the report displays an exhaustive table of folding factor, multiplier, and serial partition settings for the current filter. you can use the table to help you choose optimal settings for your design.
specify partitions for a partly serial architecture
when you select the partly serial architecture option, the generate hdl tool displays additional information and data entry fields related to serial partitioning.
the generate hdl tool also displays a view details link. when you click this link, the coder displays an html report in a separate window. the report displays an exhaustive table of folding factor, multiplier, and serial partition settings for the current filter. you can use the table to help you choose optimal settings for your design.
the specified by drop-down menu lets you decide how you define the partly serial architecture. select one of these options:
folding factor: the drop-down menu to the right of folding factor contains an exhaustive list of folding factors for the filter. when you select a value, the display of the current folding factor, multiplier, and serial partition settings updates.
multipliers: the drop-down menu to the right of multipliers contains an exhaustive list of value options for the number of multipliers for the filter. when you select a value, the display of the current folding factor, multiplier, and serial partition settings updates.
serial partition: the drop-down menu to the right of serial partition contains an exhaustive list of serial partition options for the filter. when you select a value, the display of the current folding factor, multiplier, and serial partition settings updates.
specifying a cascade serial architecture
when you select the cascade serial architecture option, the generate hdl tool displays the serial partition field, as shown in this figure.
the specified by menu lets you define the number and size of the serial partitions according to different criteria, as described in specifying speed vs. area tradeoffs via generatehdl properties.
specifying serial architectures for iir sos filters
to specify a partly or fully serial architecture for an iir sos filter structure in the ui, set these options:
architecture: select fully parallel (the default), fully serial, or partly serial. if you select partly serial, the ui displays the specified by drop-down menu.
specified by: select one of these methods:
folding factor: specify the desired hardware folding factor, ff, an integer greater than 1. given the folding factor, the coder computes the number of multipliers.
multipliers: specify the desired number of multipliers, nmults, an integer greater than 1. given the number of multipliers, the coder computes the folding factor.
example: direct form i sos filter. this example creates a direct form i sos (df1sos) filter design and opens the ui. the figure following the code example shows the coder options configured for a partly serial architecture specified with a folding factor of 18.
fs = 48e3      % sampling frequency
fc = 10.8e3    % cut-off frequency
n = 5          % filter order
f_lp = fdesign.lowpass('n,f3db',n,fc,fs)
filt = design(f_lp,'butter','filterstructure','df1sos','systemobject',true)
fdhdltool(filt,numerictype(1,16,15))
example: direct form ii sos filter. this example creates a direct form ii sos (df2sos) filter design using filter builder.
the filter is a lowpass df2sos filter with a filter order of 6. the filter arithmetic is set to fixed-point.
on the code generation tab, the generate hdl button activates the filter design hdl coder™ ui. this figure shows the hdl coder options configured for this filter, using partly serial architecture with a folding factor of 9.
specifying a distributed arithmetic architecture
the architecture pop-up menu also includes the distributed arithmetic (da) option.
interactions between architecture options and other hdl options
selecting certain architecture menu options can change or disable other options.
when the fully serial option is selected, these options are set to their default values and disabled: coefficient multipliers, add pipeline registers, and fir adder style.
when the partly serial option is selected:
the coefficient multipliers option is set to its default value and disabled.
if the filter is multirate, the clock inputs option is set to single and disabled.
when the cascade serial option is selected, these options are set to their default values and disabled: coefficient multipliers, add pipeline registers, and fir adder style.