XS3 Filter Types

struct xs3_filter_fir_s32_t
 #include <xs3_filters.h>
32-bit Discrete-Time Finite Impulse Response (FIR) Filter
 Todo:
Move most of this information out to higher-level documentation.
 Filter Model

This struct represents an N-tap 32-bit discrete-time FIR Filter.
At each time step, the FIR filter consumes a single 32-bit input sample and produces a single 32-bit output sample.
To process a new input sample and compute a new output sample, use xs3_filter_fir_s32(). To add a new input sample to the filter without computing a new output sample, use xs3_filter_fir_s32_add_sample().
An N-tap FIR filter contains N 32-bit coefficients (pointed to by coef) and N words of state data (pointed to by state). The state data is a vector of the N most recent input samples. When processing a new input sample at time step t, x[t] is the new input sample, x[t-1] is the previous input sample, and so on, up to x[t-(N-1)], which is the oldest input considered when computing the new output sample (see note 1 below). The coefficients form a vector b[], where b[k] is the coefficient by which x[t-k], the input sample from k time steps ago, is multiplied. There is an additional parameter shift which scales the output as described below. Both the coefficients and shift are considered to be constants which do not change after initialization (although nothing should break if they are changed to new valid values).

At time step t, the output sample y[t] is computed based on the inner product (i.e. sum of element-wise products) of the coefficients and state data as follows (a more detailed description is below):

    acc  = x[t-0] * b[0] + x[t-1] * b[1] + x[t-2] * b[2] + ... + x[t-(N-1)] * b[N-1]
    y[t] = acc >> shift

Importantly, all three of the operators above (addition, multiplication and the rightwards bit-shift) have slightly idiosyncratic meanings.

The products have a built-in rounding arithmetic right-shift of 30 bits, where ties round toward positive infinity. This is a hardware feature which allows for longer filters (larger N) without sacrificing coefficient precision. These element-wise products accumulate into 8 40-bit accumulators which saturate the sums at symmetric 40-bit bounds (see Symmetrically Saturating Arithmetic). The order in which the taps are accumulated is unspecified (see note 2 below).

After each tap has been accumulated, the 8 accumulators are added together to get a 64-bit penultimate result (with 43 useful bits). Finally, an unsigned rounding arithmetic right-shift of shift bits is applied to the 64-bit sum, and the final result is saturated to the symmetric 32-bit range (-INT32_MAX to INT32_MAX inclusive).

Below is a more detailed description of the operations performed (not including the saturation logic applied by the accumulators).
\[\begin{split} & y[t] = sat_{32} \left( round \left( \left( \sum_{k=0}^{N-1} round(x[t-k] \cdot b[k] \cdot 2^{-30}) \right) \cdot 2^{-shift} \right) \right) \\ & \qquad\text{where } sat_{32}() \text{ saturates to } \pm(2^{31}-1) \\ & \qquad\text{ and } round() \text{ rounds to the nearest integer, with ties rounding towards } +\!\infty \end{split}\]
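The formula above can be sketched as a plain C reference model. This is only an illustration for checking expected outputs on a host machine, not the library's implementation: the names round_shr, sat32 and fir_s32_model are hypothetical, the history is passed newest-first (x_hist[k] holds x[t-k]), and, like the formula, the sketch ignores the saturation applied by the 40-bit accumulators.

```c
#include <stdint.h>

/* Rounding arithmetic right-shift: divide by 2^shr and round to the nearest
 * integer, with ties rounding toward +infinity. Assumes that >> on a
 * negative int64_t is an arithmetic shift (true on common compilers). */
int64_t round_shr(int64_t v, unsigned shr)
{
    if (shr == 0)
        return v;
    return (v + ((int64_t)1 << (shr - 1))) >> shr;
}

/* Saturate to the symmetric 32-bit range [-INT32_MAX, INT32_MAX]. */
int32_t sat32(int64_t v)
{
    if (v > INT32_MAX)
        return INT32_MAX;
    if (v < -INT32_MAX)
        return -INT32_MAX;
    return (int32_t)v;
}

/* Hypothetical reference model of one output sample.
 * x_hist[k] holds x[t-k] (newest first); b[k] is the matching coefficient.
 * Accumulator saturation is NOT modelled. */
int32_t fir_s32_model(const int32_t *x_hist, const int32_t *b,
                      unsigned n_taps, unsigned shift)
{
    int64_t acc = 0;
    for (unsigned k = 0; k < n_taps; k++)
        acc += round_shr((int64_t)x_hist[k] * b[k], 30); /* per-tap 30-bit rounding shift */
    return sat32(round_shr(acc, shift));                  /* final shift, then saturation */
}
```

Because the per-tap rounding happens before the sum, the result can differ slightly from shifting the exact sum by 30 + shift bits in one step.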
 Operations

Initialize: An xs3_filter_fir_s32_t filter is initialized with a call to xs3_filter_fir_s32_init(). The caller supplies information about the filter, including the number of taps and pointers to the coefficients and a state buffer. It is typically recommended that the state buffer be cleared to all 0s before initializing.
Add Sample: To add a new input sample without computing a new output sample, use xs3_filter_fir_s32_add_sample(). This is a constant-time operation which does not depend on the number of filter taps. This may be useful in some situations, for example, to quickly pre-load the filter’s state buffer with multiple samples, without incurring the cost of computing an output with each added sample.
Process Sample: To process a new input sample and produce a new output sample, use xs3_filter_fir_s32().
 Fields

After initialization via xs3_filter_fir_s32_init(), the contents of the xs3_filter_fir_s32_t struct are considered to be opaque, and may change between major versions. In general, user code should not need to access its members.

num_taps is the order of the filter, or the number of taps. It is also the (minimum) size of the buffers to which coef and state point, in elements (where each element is 4 bytes). The time required to process an input sample and produce an output sample is approximately linear in num_taps (see Performance below).

head is the index into state at which the next sample will be added.

shift is the unsigned arithmetic rounding saturating right-shift applied to the internal accumulator to get a final output.

coef is a pointer to a buffer (supplied by the user at initialization) containing the tap coefficients. The coefficients are stored in forward order, with lower indices corresponding to newer samples. coef[0], then, corresponds to b[0], coef[1] to b[1], and so on. None of the functions which operate on xs3_filter_fir_s32_t structs in this library will modify the contents of the buffer to which coef points. This buffer must be at least num_taps words long.

state is a pointer to a buffer (supplied by the user at initialization) containing the state data, a history of the num_taps most recent input samples. state is used in a circular fashion with head indicating the index at which the next sample will be inserted.
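The circular use of state and head described above can be illustrated with a small stand-alone model. Everything here is hypothetical: the struct fir_state_model_t and its helpers are not part of the library, the real xs3_filter_fir_s32_t layout is opaque, and the direction in which head advances is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the filter's state bookkeeping. The real
 * xs3_filter_fir_s32_t layout is opaque and may differ. */
typedef struct {
    unsigned num_taps;  /* capacity of the state buffer, in samples */
    unsigned head;      /* index at which the next sample is written */
    int32_t *state;     /* caller-supplied history buffer */
} fir_state_model_t;

/* Constant-time insert: overwrite the oldest sample, then advance head.
 * The direction in which head moves is an assumption of this sketch. */
void fir_state_model_add(fir_state_model_t *f, int32_t sample)
{
    f->state[f->head] = sample;
    f->head = (f->head + 1) % f->num_taps;
}

/* Fetch x[t-k], the sample from k steps ago (k = 0 is the newest). */
int32_t fir_state_model_get(const fir_state_model_t *f, unsigned k)
{
    unsigned idx = (f->head + f->num_taps - 1 - (k % f->num_taps)) % f->num_taps;
    return f->state[idx];
}
```

The insert touches one element and one index regardless of num_taps, which is consistent with adding a sample being a constant-time operation for the 32-bit filter.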
 Performance

More work remains to fully characterize the time performance of this FIR filter, but asymptotically (i.e. with a large number of filter taps) processing a new input sample to produce a new output sample takes approximately 3 thread cycles per 8 filter taps.
That assumes that both the coefficients (pointed to by coef) and state buffer (pointed to by state) are stored directly in SRAM.
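As a rough planning aid, the asymptotic figure above can be turned into a small estimator. This is a sketch under stated assumptions: the function name fir_cycles_estimate is hypothetical, partial groups of taps are assumed to cost as much as full groups, and any fixed per-call overhead is ignored because it has not been characterized.

```c
/* Asymptotic estimate only: 3 thread cycles per group of taps_per_group
 * taps (8 for the 32-bit filter). Partial groups are rounded up, which is
 * an assumption; fixed per-call overhead is not included. */
unsigned fir_cycles_estimate(unsigned num_taps, unsigned taps_per_group)
{
    unsigned groups = (num_taps + taps_per_group - 1u) / taps_per_group;
    return 3u * groups;
}
```

For example, fir_cycles_estimate(256, 8) gives 96 thread cycles for a 256-tap 32-bit filter. Treat the result as a lower bound for large filters rather than a measured timing.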
 Todo:
If the function takes S * num_taps + V thread cycles, what is V?
When there are fewer than M taps, it is more efficient to just use a C implementation of an FIR filter. What is M?
Brief explanation of how thread cycles correspond to actual time.
 Coefficient Scaling

Suppose you’re starting with a floating-point FIR filter model with coefficients B[k] which operates on a sequence of 32-bit integer input samples x[t] to get a result Y[t] where

    Y[t] = x[t-0] * B[0] + x[t-1] * B[1] + ... + x[t-(N-1)] * B[N-1]

Because of the 30-bit right-shift and the right-shift of the final accumulator by shift bits, the coefficients b[k] to use with this library can be thought of as fixed-point values with 30 + shift fractional bits.

The floating-point coefficients B[k] can then be naively converted to fixed-point coefficients b[k] as follows:

    shift = 0
    b[k] = (int32_t) round(ldexp(B[k], 30))

After this, any further doubling of the coefficients can be compensated for without changing the overall gain by incrementing shift.

To maximize precision, you’ll typically want shift to be as large as possible such that the worst case under consideration neither saturates the internal accumulator (which, for safety, should generally be assumed to be 42 bits) nor saturates the final 32-bit output when shift is applied.

The details depend on factors such as your filter’s gain and the statistics of the sequence x[t] (e.g. any headroom x[t] is known a priori to have).
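One possible way to automate the procedure above is sketched below. It is not the conversion script shipped with the library: the function fir_s32_quantize and helper pow2 are hypothetical, the input is assumed to have no headroom (worst case |x[t]| = 2^31), the accumulator is assumed to be a signed 42-bit value, and saturation of the final 32-bit output is not checked because it depends on the filter’s gain rather than on shift. To stay free of library dependencies the sketch multiplies by a power of two instead of calling ldexp().

```c
#include <stdint.h>

/* 2^n as a double, for 0 <= n <= 62 (equivalent to ldexp(1.0, n)). */
double pow2(int n)
{
    return (double)((int64_t)1 << n);
}

/* Hypothetical helper: quantize floating-point coefficients B[] into b[]
 * with 30 + shift fractional bits and return the chosen shift, or -1 if
 * the coefficients do not fit even with shift = 0. */
int fir_s32_quantize(const double *B, unsigned n_taps, int32_t *b)
{
    double max_abs = 0.0, sum_abs = 0.0;
    for (unsigned k = 0; k < n_taps; k++) {
        double a = (B[k] < 0.0) ? -B[k] : B[k];
        if (a > max_abs)
            max_abs = a;
        sum_abs += a;
    }

    int shift = -1;
    for (int s = 0; s <= 31; s++) {
        /* The largest coefficient must fit in an int32_t. */
        if (max_abs * pow2(30 + s) > (double)INT32_MAX)
            break;
        /* Worst-case accumulator magnitude, sum|b[k]| * 2^31 * 2^-30,
         * must stay inside an assumed signed 42-bit range. */
        if (sum_abs * pow2(s + 31) >= pow2(41))
            break;
        shift = s;
    }
    if (shift < 0)
        return -1;

    for (unsigned k = 0; k < n_taps; k++) {
        double v = B[k] * pow2(30 + shift);
        b[k] = (int32_t)(v < 0.0 ? v - 0.5 : v + 0.5); /* round to nearest */
    }
    return shift;
}
```

If the input sequence is known to have headroom, the accumulator bound can be relaxed accordingly, which may allow a larger shift.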
 Filter Conversion

This library includes a Python script which converts existing floating-point FIR filter coefficients into a suitable representation and generates code for easily initializing and executing the filter. See Note: Digital Filter Conversion for more.
 Usage Example

    #define N 256                        // Tap count
    #define B_VAL ldexp(1.0/N, 30+7)     // Value for (all) coefficients

    const int32_t b[N] =                 // The filter coefficients
        { B_VAL, B_VAL, B_VAL, ..., B_VAL };
    const right_shift_t shift = 7;       // The (unsigned) right-shift applied to the final accumulator
    int32_t state_buff[N] = { 0 };       // Filter state buffer, initialized to 0's
    xs3_filter_fir_s32_t filter;         // The filter struct

    #define SAMPLE_COUNT 1024
    int32_t x[SAMPLE_COUNT] = { ... };   // Some sequence of input samples

    // Initialize
    xs3_filter_fir_s32_init(&filter, state_buff, N, b, shift);

    // Just add the first 64 without processing output samples. (not necessary)
    for(unsigned i = 0; i < 64; i++)
        xs3_filter_fir_s32_add_sample(&filter, x[i]);

    // Process the rest, generating a sequence of filtered output samples
    int32_t y[SAMPLE_COUNT] = { 0 };     // Output samples (first 64 never get updated here)
    for(unsigned i = 64; i < SAMPLE_COUNT; i++)
        y[i] = xs3_filter_fir_s32(&filter, x[i]);

    // Do something with output sequence ...
This example creates a simple 256-tap filter which averages the most recent 256 samples.
Each b[k] is \(2^{29}\), and the final accumulator is right-shifted 7 bits. In the worst case, all input samples are \(-2^{31}\). In that case, the final accumulator value is \( 256 \cdot (2^{29} \cdot -2^{31} \cdot 2^{-30}) = -2^{38} \), whose magnitude is well below the saturation limit of the accumulator. After shift is applied, that becomes \(-2^{38} \cdot 2^{-7} = -2^{31}\). Finally, the 32-bit symmetric saturation logic is applied, making the final output value \(-2^{31}+1\).
 Notes

1. state is a circular buffer, and so the index of x[t] within state changes with each input sample. The state field of this struct is considered to be opaque; its exact usage may change between versions.
2. Ordinarily integer sums are associative, so the order in which elements are added does not affect the final result. The sum that the FIR filters use, however, is saturating, with the saturation logic being applied throughout the sum. This saturation is a hard nonlinearity and is not associative. The details of exactly when each tap is accumulated and into which accumulator are complicated and subject to change. It is best to construct a filter such that no ordering of the taps will saturate the accumulators.

struct xs3_filter_fir_s16_t
 #include <xs3_filters.h>
16-bit Discrete-Time Finite Impulse Response (FIR) Filter
 Filter Model

This struct represents an N-tap 16-bit discrete-time FIR Filter.
At each time step, the FIR filter consumes a single 16-bit input sample and produces a single 16-bit output sample.
To process a new input sample and compute a new output sample, use xs3_filter_fir_s16(). To add a new input sample to the filter without computing a new output sample, use xs3_filter_fir_s16_add_sample().
An N-tap FIR filter contains N 16-bit coefficients (pointed to by coef) and N int16_ts of state data (pointed to by state). The state data is a vector of the N most recent input samples. When processing a new input sample at time step t, x[t] is the new input sample, x[t-1] is the previous input sample, and so on, up to x[t-(N-1)], which is the oldest input considered when computing the new output sample. The coefficients form a vector b[], where b[k] is the coefficient by which x[t-k], the input sample from k time steps ago, is multiplied. There is an additional parameter shift which scales the output as described below. Both the coefficients and shift are considered to be constants which do not change after initialization (although nothing should break if they are changed to new valid values).

At time step t, the output sample y[t] is computed based on the inner product (i.e. sum of element-wise products) of the coefficients and state data as follows (a more detailed description is below):

    acc  = x[t-0] * b[0] + x[t-1] * b[1] + x[t-2] * b[2] + ... + x[t-(N-1)] * b[N-1]
    y[t] = acc >> shift

Unlike the 32-bit FIR filters (see xs3_filter_fir_s32_t), the products x[t-k] * b[k] are the raw 32-bit products of the 16-bit elements. These element-wise products accumulate into a 32-bit accumulator which saturates the sums at symmetric 32-bit bounds (see Symmetrically Saturating Arithmetic).

After all taps have been accumulated, a rounding arithmetic right-shift of shift bits is applied to the accumulated sum, and the final result is saturated to the symmetric 16-bit range (-INT16_MAX to INT16_MAX inclusive).

Below is a more detailed description of the operations performed (not including the saturation logic applied by the accumulators).
\[\begin{split} & y[t] = sat_{16} \left( round \left( \left( \sum_{k=0}^{N-1} round(x[t-k] \cdot b[k]) \right) \cdot 2^{-shift} \right) \right) \\ & \qquad\text{where } sat_{16}() \text{ saturates to } \pm(2^{15}-1) \\ & \qquad\text{ and } round() \text{ rounds to the nearest integer, with ties rounding towards } +\!\infty \end{split}\]
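For comparison with the 32-bit filter, the 16-bit model can be sketched in portable C, this time including the saturating 32-bit accumulator. This is an illustration only: fir_s16_model is a hypothetical name, the history is passed newest-first, and the taps are accumulated in index order, which is an assumption since the order used by the library is not specified here.

```c
#include <stdint.h>

/* Hypothetical reference model of one 16-bit output sample.
 * x_hist[k] holds x[t-k] (newest first); b[k] is the matching coefficient. */
int16_t fir_s16_model(const int16_t *x_hist, const int16_t *b,
                      unsigned n_taps, unsigned shift)
{
    int64_t acc = 0;
    for (unsigned k = 0; k < n_taps; k++) {
        acc += (int32_t)x_hist[k] * b[k];   /* raw 32-bit product */
        if (acc > INT32_MAX)                /* symmetric 32-bit saturation */
            acc = INT32_MAX;
        if (acc < -INT32_MAX)
            acc = -INT32_MAX;
    }

    /* Rounding right-shift, ties toward +infinity. Assumes >> on a negative
     * value is an arithmetic shift (true on common compilers). */
    if (shift > 0)
        acc = (acc + ((int64_t)1 << (shift - 1))) >> shift;

    /* Saturate to the symmetric 16-bit range. */
    if (acc > INT16_MAX)
        acc = INT16_MAX;
    if (acc < -INT16_MAX)
        acc = -INT16_MAX;
    return (int16_t)acc;
}
```

With no built-in 30-bit shift on the products, the coefficients here carry only shift fractional bits, so a four-tap averaging filter could use b[k] = 8192 with shift = 15.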
 Operations

Initialize: An xs3_filter_fir_s16_t filter is initialized with a call to xs3_filter_fir_s16_init(). The caller supplies information about the filter, including the number of taps and pointers to the coefficients and a state buffer. It is typically recommended that the state buffer be cleared to all 0s before initializing.
Add Sample: To add a new input sample without computing a new output sample, use xs3_filter_fir_s16_add_sample(). Unlike xs3_filter_fir_s32_add_sample(), this is not a constant-time operation, and does depend on the number of filter taps. Nevertheless, this is faster than computing output samples, and may be useful in some situations, for example, to more quickly pre-load the filter’s state buffer with multiple samples, without incurring the cost of computing an output with each added sample.
Process Sample: To process a new input sample and produce a new output sample, use xs3_filter_fir_s16().
 Fields

After initialization via xs3_filter_fir_s16_init(), the contents of the xs3_filter_fir_s16_t struct are considered to be opaque, and may change between major versions. In general, user code should not need to access its members.

num_taps is the order of the filter, or the number of taps. It is also the (minimum) size of the buffers to which coef and state point, in elements (where each element is 2 bytes). The time required to process an input sample and produce an output sample is approximately linear in num_taps.

shift is the unsigned arithmetic rounding saturating right-shift applied to the internal accumulator to get a final output.

coef is a pointer to a buffer (supplied by the user at initialization) containing the tap coefficients. The coefficients are stored in forward order, with lower indices corresponding to newer samples. coef[0], then, corresponds to b[0], coef[1] to b[1], and so on. None of the functions which operate on xs3_filter_fir_s16_t structs in this library will modify the contents of the buffer to which coef points. This buffer must be at least num_taps elements long, and must begin at a word-aligned address.

state is a pointer to a buffer (supplied by the user at initialization) containing the state data, a history of the num_taps most recent input samples. state must begin at a word-aligned address.
 Coefficient Scaling
 Filter Conversion

This library includes a Python script which converts existing floating-point FIR filter coefficients into a suitable representation and generates code for easily initializing and executing the filter. See Note: Digital Filter Conversion for more.
 Todo:
 Usage Example

struct xs3_biquad_filter_s32_t
 #include <xs3_filters.h>
A biquad filter block.
Contains the coefficient and state information for a cascade of up to 8 biquad filter sections.
To process a new input sample, xs3_filter_biquad_s32() can be used with a pointer to one of these structs.
For longer cascades, an array of xs3_biquad_filter_s32_t structs can be used with xs3_filter_biquads_s32().
 Filter Conversion

This library includes a Python script which converts existing floating-point cascaded biquad filter coefficients into a suitable representation and generates code for easily initializing and executing the filter. See Note: Digital Filter Conversion for more.
XS3 Filter Functions

void xs3_filter_fir_s32_init(xs3_filter_fir_s32_t *filter, int32_t *sample_buffer, const unsigned tap_count, const int32_t *coefficients, const right_shift_t shift)
Initialize a 32-bit FIR filter.
Before xs3_filter_fir_s32() or xs3_filter_fir_s32_add_sample() can be used on a filter it must be initialized with a call to this function.
sample_buffer and coefficients must be at least 4 * tap_count bytes long, and aligned to a 4-byte (word) boundary.
See xs3_filter_fir_s32_t for more information about 32-bit FIR filters and their operation.
 Parameters
filter – [out] Filter struct to be initialized
sample_buffer – [in] Buffer used by the filter to contain state information. Must be at least tap_count elements long
tap_count – [in] Order of the FIR filter; number of filter taps
coefficients – [in] Array containing filter coefficients
shift – [in] Unsigned arithmetic right-shift applied to accumulator to get filter output sample

void xs3_filter_fir_s32_add_sample(xs3_filter_fir_s32_t *filter, const int32_t new_sample)
Add a new input sample to a 32-bit FIR filter without processing an output sample.
This function adds a new input sample to filter’s state without computing a new output sample. This is a constant-time operation and can be used to quickly pre-load a filter with sample data.
See xs3_filter_fir_s32_t for more information about FIR filters and their operation.
 Parameters
filter – [inout] Filter struct to have the sample added
new_sample – [in] Sample to be added to filter’s history

int32_t xs3_filter_fir_s32(xs3_filter_fir_s32_t *filter, const int32_t new_sample)
This function implements a Finite Impulse Response (FIR) filter.
The new input sample new_sample is added to this filter’s state, and a new output sample is computed and returned as specified in xs3_filter_fir_s32_t.
With a large number of filter taps, this function takes approximately 3 thread cycles per 8 filter taps.
 Parameters
filter – [inout] Filter to be processed
new_sample – [in] New input sample to be processed by filter
 Returns
Next filtered output sample

void xs3_filter_fir_s16_init(xs3_filter_fir_s16_t *filter, int16_t *sample_buffer, const unsigned tap_count, const int16_t *coefficients, const right_shift_t shift)
Initialize a 16-bit FIR filter.
Before xs3_filter_fir_s16() or xs3_filter_fir_s16_add_sample() can be used on a filter it must be initialized with a call to this function.
sample_buffer and coefficients must be at least 2 * tap_count bytes long, and aligned to a 4-byte (word) boundary.
See xs3_filter_fir_s16_t for more information about 16-bit FIR filters and their operation.
 Parameters
filter – [out] Filter struct to be initialized
sample_buffer – [in] Buffer used by the filter to contain state information. Must be at least tap_count elements long
tap_count – [in] Order of the FIR filter; number of filter taps
coefficients – [in] Array containing filter coefficients
shift – [in] Unsigned arithmetic right-shift applied to accumulator to get filter output sample
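Because int16_t on its own only guarantees 2-byte alignment, the word-alignment requirement for the 16-bit buffers has to be requested explicitly. One way to do that in C11 is sketched below; the buffer names, TAP_COUNT and the helper is_word_aligned are hypothetical, and a toolchain-specific alignment attribute would serve equally well.

```c
#include <stdalign.h>
#include <stdint.h>

#define TAP_COUNT 35  /* hypothetical tap count, chosen only for illustration */

/* Request 4-byte alignment explicitly. As file-scope objects, both buffers
 * also start out zero-filled. */
alignas(4) int16_t coef_buf[TAP_COUNT];
alignas(4) int16_t state_buf[TAP_COUNT];

/* Returns 1 if p sits on a 4-byte (word) boundary, 0 otherwise. */
int is_word_aligned(const void *p)
{
    return ((uintptr_t)p % 4u) == 0;
}
```

A check such as is_word_aligned(state_buf) can be placed in an assertion before the buffers are handed to xs3_filter_fir_s16_init().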

void xs3_filter_fir_s16_add_sample(xs3_filter_fir_s16_t *filter, const int16_t new_sample)
Add a new input sample to a 16-bit FIR filter without processing an output sample.
This function adds a new input sample to filter’s state without computing a new output sample.
See xs3_filter_fir_s16_t for more information about FIR filters and their operation.
 Parameters
filter – [inout] Filter struct to have the sample added
new_sample – [in] Sample to be added to filter’s history

int16_t xs3_filter_fir_s16(xs3_filter_fir_s16_t *filter, const int16_t new_sample)
This function implements a Finite Impulse Response (FIR) filter.
The new input sample new_sample is added to this filter’s state, and a new output sample is computed and returned as specified in xs3_filter_fir_s16_t.
With a large number of filter taps, this function takes approximately 3 thread cycles per 16 filter taps.
 Parameters
filter – [inout] Filter to be processed
new_sample – [in] New input sample to be processed by filter
 Returns
Next filtered output sample

int32_t xs3_filter_biquad_s32(xs3_biquad_filter_s32_t *filter, const int32_t new_sample)
This function implements a 32-bit Biquad filter.
The new input sample new_sample is added to this filter’s state, and a new output sample is computed and returned as specified in xs3_biquad_filter_s32_t.
This function processes a single filter block containing (up to) 8 biquad filter sections. For biquad filters containing 2 or more filter blocks (more than 8 biquad filter sections), see xs3_filter_biquads_s32().
 Parameters
filter – [inout] Filter to be processed
new_sample – [in] New input sample to be processed by filter
 Returns
Next filtered output sample

int32_t xs3_filter_biquads_s32(xs3_biquad_filter_s32_t biquads[], const unsigned block_count, const int32_t new_sample)
This function implements a 32-bit Biquad filter.
The new input sample new_sample is added to this filter’s state, and a new output sample is computed and returned as specified in xs3_biquad_filter_s32_t.
This function processes one or more filter blocks, with each block containing up to 8 biquad filter sections.
 Parameters
biquads – [inout] Filter blocks to be processed
block_count – [in] Number of filter blocks in biquads
new_sample – [in] New input sample to be processed by the filter
 Returns
Next filtered output sample
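When sizing the biquads array, the number of blocks follows from the number of biquad sections and the limit of 8 sections per block. The helper below is a hypothetical convenience, not a library function, and it assumes every block except possibly the last is filled completely.

```c
/* Number of xs3_biquad_filter_s32_t blocks needed to hold section_count
 * biquad sections, at up to 8 sections per block. Assumes blocks are
 * filled completely before a new one is started. */
unsigned biquad_block_count(unsigned section_count)
{
    return (section_count + 7u) / 8u;
}
```

For example, a 20-section cascade needs 3 blocks, so xs3_filter_biquads_s32() would be called with block_count set to 3.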