speed vs. area tradeoffs

overview of speed or area optimizations

the coder provides options that extend your control over speed vs. area tradeoffs in the realization of filter designs. to achieve the desired tradeoff, you can either specify a fully parallel architecture for generated hdl filter code, or choose one of several serial architectures. these architectures are described in parallel and serial architectures.

this table summarizes the filter types that are available for parallel and serial architecture choices.

architecture	available for filter types...
`fully parallel` (default)	filter types that are supported for hdl code generation
`fully serial`	direct form, direct form symmetric, direct form asymmetric, direct form i sos, direct form ii sos
`partly serial`	direct form, direct form symmetric, direct form asymmetric, direct form i sos, direct form ii sos
`cascade serial`	direct form, direct form symmetric, direct form asymmetric

the coder supports the full range of parallel and serial architecture options via properties passed in to the generatehdl function, as described in specifying speed vs. area tradeoffs via generatehdl properties.

alternatively, you can use the architecture pop-up menu on the generate hdl tool to choose parallel and serial architecture options, as described in select architectures in the generate hdl tool.

note

the coder also supports distributed arithmetic (da), another highly efficient architecture for realizing filters.

parallel and serial architectures

fully parallel architecture

this option is the default selection. a fully parallel architecture uses a dedicated multiplier and adder for each filter tap; the taps execute in parallel. this type of architecture is optimal for speed. however, it requires more multipliers and adders than a serial architecture, and therefore consumes more chip area.

serial architectures

serial architectures reuse hardware resources in time, saving chip area. the coder provides a range of serial architecture options. each of these architectures adds one clock period of latency to the design (see latency in serial architectures).

you can select from these serial architecture options:

fully serial: a fully serial architecture conserves area by reusing multiplier and adder resources sequentially. for example, a four-tap filter design would use a single multiplier and adder, executing a multiply/accumulate operation once for each tap. the multiply/accumulate section of the design runs at four times the input/output sample rate. this type of architecture saves area at the cost of some speed loss and higher power consumption.
in a fully serial architecture, the system clock runs at a much higher rate than the sample rate of the filter. thus, for a given filter design, the maximum speed achievable by a fully serial architecture is less than the maximum speed of a parallel architecture.
partly serial: partly serial architectures cover the full range of speed vs. area tradeoffs that lie between fully parallel and fully serial architectures.
in a partly serial architecture, the filter taps are grouped into serial partitions. the taps within each partition execute serially, but the partitions execute together in parallel. the outputs of the partitions are summed at the final output.
when you select a partly serial architecture for a filter, you can define the serial partitioning in these ways:
- define the serial partitions directly, as a vector of integers. each element of the vector specifies the length of the corresponding partition.
- specify the desired hardware folding factor ff, an integer greater than 1. given the folding factor, the coder computes the serial partition and the number of multipliers.
- specify the desired number of multipliers nmults, an integer greater than 1. given the number of multipliers, the coder computes the serial partition and the folding factor.
the generate hdl tool lets you specify a partly serial architecture in terms of these three parameters. you can then view how a change in one parameter interacts with the other two. the coder also provides hdlfilterserialinfo, an informational function that helps you define an optimal serial partition for a filter.
cascade-serial: a cascade-serial architecture closely resembles a partly serial architecture. as in a partly serial architecture, the filter taps are grouped into several serial partitions that execute together in parallel. however, the accumulated output of each partition cascades to the accumulator of the previous partition. the output of the partitions is therefore computed at the accumulator of the first partition. this technique is termed accumulator reuse. you do not require a final adder, which saves area.
the cascade-serial architecture requires an extra cycle of the system clock to complete the final summation to the output. therefore, the frequency of the system clock must be increased slightly with respect to the clock used in a noncascade partly serial architecture.
to generate a cascade-serial architecture, you specify a partly serial architecture with accumulator reuse enabled. if you do not specify the serial partitions, the coder automatically selects an optimal partitioning.
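
the partitioning scheme described above can be modeled behaviorally in plain matlab. the following is a minimal sketch, not the code that the coder generates: the coefficients, input samples, and variable names are hypothetical, and the mapping from a partition vector to the folding factor (largest partition) and the multiplier count (number of partitions) is inferred from the descriptions in this section. the extra clock cycle of the cascade-serial variant is not modeled.

```matlab
% behavioral sketch of a partly serial fir filter (not generated hdl)
b = [1 2 3 4 5 4 3 2 1] / 25;        % hypothetical 9-tap coefficients
partition = [4 3 2];                 % partition lengths; sum equals the filter length
x = (1:20)';                         % arbitrary input samples

numMultipliers = numel(partition);   % one multiplier per partition
foldingFactor  = max(partition);     % base clock cycles per input sample

edges = [0 cumsum(partition)];       % tap index boundaries of each partition
delayLine = zeros(numel(b), 1);      % tapped delay line, newest sample first
y = zeros(size(x));
for n = 1:numel(x)
    delayLine = [x(n); delayLine(1:end-1)];
    acc = zeros(1, numMultipliers);  % one accumulator per partition
    for cycle = 1:foldingFactor      % partitions run together, one tap per cycle
        for p = 1:numMultipliers
            if cycle <= partition(p) % shorter partitions idle on late cycles
                tap = edges(p) + cycle;
                acc(p) = acc(p) + b(tap) * delayLine(tap);
            end
        end
    end
    y(n) = sum(acc);                 % final adder sums the partition outputs
end

yRef = filter(b, 1, x);              % reference: direct evaluation of the same filter
maxError = max(abs(y - yRef));       % differs only by floating-point rounding
```

with `partition = 9` the same loop models a fully serial filter (one multiplier, nine cycles per sample), and with `partition = ones(1,9)` it degenerates to the fully parallel case.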

latency in serial architectures

serialization of a filter increases the total latency of the design by one clock cycle. the serial architectures use an accumulator (an adder with a register) to add the sequential products. an additional final register is used to store the summed result of each of the serial partitions. the operation requires an extra clock cycle.

holding input data in a valid state

serial architectures implement internal clock rates higher than the input rate. in such filter implementations, there are n cycles (n >= 2) of the base clock for each input sample. you can specify how many clock cycles the test bench holds the input data values in a valid state.

when you select hold input data between samples (the default), the test bench holds the input data values in a valid state for n clock cycles.
when you clear hold input data between samples, the test bench holds input data values in a valid state for only one clock cycle. for the next n-1 cycles, the test bench drives the data to an unknown state (expressed as 'x') until the next input sample is clocked in. forcing the input data to an unknown state verifies that the generated filter code registers the input data only on the first cycle.

the figure shows the test bench pane of the generate hdl tool, with hold input data between samples set to its default setting.

[figure: test bench tab of the generate hdl tool]

to set this option from the command line, use the equivalent `HoldInputDataBetweenSamples` property when you call the `generatehdl` function.
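
as a minimal sketch, the call below generates a fully serial filter together with a test bench that drives the input to 'x' between samples. the coefficients, the input data type, and the choice of a `dsp.FIRFilter` object are assumptions made for illustration; running it requires filter design hdl coder.

```matlab
% hypothetical 4-tap direct form fir filter
firFilt = dsp.FIRFilter('Numerator', [0.1 0.4 0.4 0.1]);

generatehdl(firFilt, ...
    'InputDataType', numerictype(1,16,15), ... % assumed fixed-point input type
    'SerialPartition', 4, ...                  % fully serial: one partition of length 4
    'GenerateHDLTestbench', 'on', ...          % also generate the test bench
    'HoldInputDataBetweenSamples', 'off');     % data valid for 1 cycle, then 'x' for n-1 cycles
```

leaving `HoldInputDataBetweenSamples` at its default of `'on'` keeps the input valid for all n cycles instead.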

specifying speed vs. area tradeoffs via `generatehdl` properties

by default, `generatehdl` generates filter code using a fully parallel architecture, so you do not have to specify this architecture explicitly.

two properties specify serial architecture options to the generatehdl function:

`SerialPartition`: this property specifies the serial partitioning of the filter.
`ReuseAccum`: this property enables or disables accumulator reuse.

the table summarizes how to set these properties to generate the desired architecture.

to generate this architecture...	set `SerialPartition` to...	set `ReuseAccum` to...
fully parallel	omit this property	omit this property
fully serial	`n`, where `n` is the length of the filter	not specified, or `'off'`
partly serial	`[p1 p2 p3...pn]`: a vector of `n` integer elements, where `n` is the number of serial partitions. each element of the vector specifies the length of the corresponding partition, and the sum of the vector elements must equal the length of the filter. when you define the partitioning for a partly serial architecture, consider these points: (1) divide the filter length as uniformly as you can into a vector whose length equals the intended number of multipliers. for example, if your design requires a filter of length 9 with 2 multipliers, the recommended partition is `[5 4]`; with 3 multipliers, the recommended partition is `[3 3 3]` rather than a less uniform division such as `[1 4 4]` or `[3 4 2]`. (2) if your design must compute each output value (corresponding to each input value) in an exact number `k` of clock cycles, use `k` as the largest partition size and partition the remaining taps as uniformly as you can. for example, if the filter length is 9 and your design requires exactly 4 cycles to compute the output, define the partition as `[4 3 2]`. this partition executes in 4 clock cycles, at the cost of 3 multipliers. you can also specify a serial architecture in terms of a desired hardware folding factor or a desired number of multipliers.	`'off'`
cascade-serial with explicitly specified partitioning	`[p1 p2 p3...pn]`: a vector of `n` integer elements, where `n` is the number of serial partitions. each element of the vector specifies the length of the corresponding partition. the sum of the vector elements must equal the length of the filter. the values of the vector elements must appear in descending order, except that the last two elements can be equal. for example, for a filter of length 9, partitions such as `[5 4]` or `[4 3 2]` are legal, but the partitions `[3 3 3]` and `[3 2 4]` raise an error at code generation time.	`'on'`
cascade-serial with automatically optimized partitioning	omit this property	`'on'`
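
the partitioning guidelines in the table can be tried out with a few lines of plain matlab. in this sketch, `uniformPartition` and `isCascadeLegal` are hypothetical helper names, not coder functions. the legality check encodes only the sum and ordering rules as read from the table's examples (strictly descending, with the last element allowed to equal its predecessor); the coder may apply further checks.

```matlab
% split a filter of length len as uniformly as possible over nmults partitions
uniformPartition = @(len, nmults) ...
    floor(len / nmults) + ((1:nmults) <= mod(len, nmults));

% cascade-serial rule: lengths sum to the filter length and are strictly
% descending, except that the last element may equal the one before it
isCascadeLegal = @(p, len) sum(p) == len && ...
    all(diff(p(1:end-1)) < 0) && (numel(p) < 2 || p(end) <= p(end-1));

p2 = uniformPartition(9, 2);            % [5 4]
p3 = uniformPartition(9, 3);            % [3 3 3]

legalA = isCascadeLegal([5 4], 9);      % true
legalB = isCascadeLegal([4 3 2], 9);    % true
legalC = isCascadeLegal([3 3 3], 9);    % false: first two elements are equal
legalD = isCascadeLegal([3 2 4], 9);    % false: last element is larger
```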

you can use the `hdlfilterserialinfo` helper function to explore possible partitions for your filter.
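
the rows of the table map onto `generatehdl` calls roughly as follows. this is a sketch under assumptions: the 9-tap coefficients, the `dsp.FIRFilter` object, and the input data type are made up for illustration, and each call regenerates code for the same filter. running it requires filter design hdl coder.

```matlab
firFilt = dsp.FIRFilter('Numerator', [1 2 3 4 5 4 3 2 1] / 25);  % hypothetical 9-tap filter
inType  = numerictype(1,16,15);                                  % assumed input data type

% list folding factor, multiplier, and serial partition options for this filter
hdlfilterserialinfo(firFilt, 'InputDataType', inType);

% fully parallel (default): omit both properties
generatehdl(firFilt, 'InputDataType', inType);

% fully serial: a single partition as long as the filter
generatehdl(firFilt, 'InputDataType', inType, 'SerialPartition', 9);

% partly serial: three partitions, accumulator reuse off
generatehdl(firFilt, 'InputDataType', inType, ...
    'SerialPartition', [3 3 3], 'ReuseAccum', 'off');

% cascade-serial with explicitly specified partitioning
generatehdl(firFilt, 'InputDataType', inType, ...
    'SerialPartition', [4 3 2], 'ReuseAccum', 'on');

% cascade-serial with automatically optimized partitioning
generatehdl(firFilt, 'InputDataType', inType, 'ReuseAccum', 'on');
```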

serial architectures for iir sos filters

to specify a partly or fully serial architecture for an iir sos filter structure (df1sos or `dsp.BiquadFilter`), specify one of these parameters:

`'FoldingFactor',ff`: specify the desired hardware folding factor `ff`, an integer greater than 1. given the folding factor, the coder computes the number of multipliers.
`'NumMultipliers',nmults`: specify the desired number of multipliers `nmults`, an integer greater than 1. given the number of multipliers, the coder computes the folding factor.

to obtain information about the folding factor options and the corresponding number of multipliers for a filter, call the hdlfilterserialinfo function.
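
for instance, the following sketch designs the same direct form i sos filter used in the ui example later in this topic and generates serial code for it from the command line. the input data type is the one used in that example, and the folding factor of 18 is the value shown there; which other folding factors are valid depends on the filter, so treat the value as an assumption and check the `hdlfilterserialinfo` output first. running it requires filter design hdl coder.

```matlab
fs = 48e3;            % sampling frequency
fc = 10.8e3;          % cut-off frequency
n  = 5;               % filter order
f_lp = fdesign.lowpass('N,F3dB', n, fc, fs);
filt = design(f_lp, 'butter', 'FilterStructure', 'df1sos', 'SystemObject', true);
inType = numerictype(1,16,15);

% list the folding factor options and the corresponding multiplier counts
hdlfilterserialinfo(filt, 'InputDataType', inType);

% serial architecture specified by folding factor
generatehdl(filt, 'InputDataType', inType, 'FoldingFactor', 18);
```

to specify the architecture by multiplier count instead, replace the last name-value pair with `'NumMultipliers',nmults`, using a value that `hdlfilterserialinfo` lists for the filter.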

select architectures in the generate hdl tool

the architecture pop-up menu in the generate hdl tool lets you select parallel and serial architectures. these topics describe the ui options you must set for each architecture choice.

specifying a fully parallel architecture

the default architecture setting is fully parallel, as shown.

[figure: generate hdl tool]

specifying a fully serial architecture

when you select the fully serial architecture option, the generate hdl tool displays additional information about the folding factor, number of multipliers, and serial partitioning. because these parameters depend on the length of the filter, they display in a read-only format, as shown in this figure.

the generate hdl tool also displays a view details link. when you click this link, the coder displays an html report in a separate window. the report displays an exhaustive table of folding factor, multiplier, and serial partition settings for the current filter. you can use the table to help you choose optimal settings for your design.

[figure: generate hdl tool]

specify partitions for a partly serial architecture

when you select the partly serial architecture option, the generate hdl tool displays additional information and data entry fields related to serial partitioning.

[figure: generate hdl tool]

the specified by drop-down menu lets you decide how you define the partly serial architecture. select one of these options:

folding factor: the drop-down menu to the right of folding factor contains an exhaustive list of folding factors for the filter. when you select a value, the display of the current folding factor, multiplier, and serial partition settings updates.
multipliers: the drop-down menu to the right of multipliers contains an exhaustive list of value options for the number of multipliers for the filter. when you select a value, the display of the current folding factor, multiplier, and serial partition settings updates.
serial partition: the drop-down menu to the right of serial partition contains an exhaustive list of serial partition options for the filter. when you select a value, the display of the current folding factor, multiplier, and serial partition settings updates.

specifying a cascade serial architecture

when you select the cascade serial architecture option, the generate hdl tool displays the serial partition field, as shown in this figure.

[figure: generate hdl tool]

the specified by menu lets you define the number and size of the serial partitions according to different criteria, as described in specifying speed vs. area tradeoffs via generatehdl properties.

specifying serial architectures for iir sos filters

to specify a partly or fully serial architecture for an iir sos filter structure in the ui, set these options:

architecture — select fully parallel (the default), fully serial, or partly serial. if you select partly serial, the ui displays the specified by drop-down menu.
specified by — select one of these methods:
- folding factor — specify the desired hardware folding factor, ff, an integer greater than 1. given the folding factor, the coder computes the number of multipliers.
- multipliers — specify the desired number of multipliers, nmults, an integer greater than 1. given the number of multipliers, the coder computes the folding factor.

example: direct form i sos filter. this example creates a direct form i sos (df1sos) filter design and opens the ui. the figure following the code example shows the coder options configured for a partly serial architecture specified with a folding factor of 18.

fs = 48e3;            % sampling frequency
fc = 10.8e3;          % cut-off frequency
n = 5;                % filter order
f_lp = fdesign.lowpass('N,F3dB',n,fc,fs);
filt = design(f_lp,'butter','FilterStructure','df1sos','SystemObject',true);
fdhdltool(filt,numerictype(1,16,15))   % open the generate hdl tool with a signed 16-bit input type

[figure: generate hdl tool]

example: direct form ii sos filter. this example creates a direct form ii sos (df2sos) filter design using filter builder.

[figure: lowpass design dialog box]

the filter is a lowpass df2sos filter with a filter order of 6. the filter arithmetic is set to fixed-point.

on the code generation tab, the generate hdl button activates the filter design hdl coder™ ui. this figure shows the hdl coder options configured for this filter, using partly serial architecture with a folding factor of 9.

[figure: generate hdl tool]

specifying a distributed arithmetic architecture

the architecture pop-up menu also includes the distributed arithmetic (da) option. see the distributed arithmetic documentation for information about this architecture.

interactions between architecture options and other hdl options

selecting certain architecture menu options can change or disable other options.

when the fully serial option is selected, these options are set to their default values and disabled:
- coefficient multipliers
- add pipeline registers
- fir adder style
when the partly serial option is selected:
- the coefficient multipliers option is set to its default value and disabled.
- if the filter is multirate, the clock inputs option is set to single and disabled.
when the cascade serial option is selected, these options are set to their default values and disabled:
- coefficient multipliers
- add pipeline registers
- fir adder style
