XS3 Filter Types¶
-
struct xs3_filter_fir_s32_t¶
- #include <xs3_filters.h>
32-bit Discrete-Time Finite Impulse Response (FIR) Filter
- Todo:
Move most of this information out to higher-level documentation
- Filter Model
-
This struct represents an N-tap 32-bit discrete-time FIR Filter.
At each time step, the FIR filter consumes a single 32-bit input sample and produces a single 32-bit output sample.
To process a new input sample and compute a new output sample, use xs3_filter_fir_s32(). To add a new input sample to the filter without computing a new output sample, use xs3_filter_fir_s32_add_sample().
An
N
-tap FIR filter containsN
32-bit cofficients (pointed to bycoef
) andN
words of state data (pointed to bystate
. The state data is a vector of theN
most recent input samples. When processing a new input sample at time stept
,x[t]
is the new input sample,x[t-1]
is the previous input sample, and so on, up tox[t-(N-1)]
, which is the oldest input considered when computing the new output sample (see note 1 below). The coefficients form a vectorb[]
, whereb[k]
is the coefficient by which thek
th oldest input sample is multiplied. There is an additional parametershift
which scales the output as described below. Both the coefficients andshift
are considered to be constants which do not change after initialization (although nothing should break if they are changed to new valid values).At time step
t
, the output sampley[t]
is computed based on the inner product (i.e. sum of element-wise products) of the coefficients and state data as follows (a more detailed description is below):Importantly, all three of the operators above (addition, multiplication and the rightwards bit-shift) have slightly ideosyncratic meanings.acc = x[t-0] * b[0] + x[t-1] * b[1] + x[t-2] * b[2] + ... + x[t-(N-1)] * b[N-1] y[t] = acc >> shift
The products have a built-in rounding arithmetic right-shift of 30 bits, where ties round toward positive infinity. This is a hardware feature which allows for longer filters (larger
N
) without sacrificing coefficient precision. These element-wise products accumulate into 8 40-bit accumulators saturate the sums at symmetric 40-bit bounds (see Symmetrically Saturating Arithmetic). The order in which the taps are accumulated is unspecified (see note 2 below).After each tap has been accumulated, the 8 accumulators are then added together to get a 64-bit penultimate result (with 43 useful bits). Finally, an unsigned rounding arithmetic right-shift of
shift
bits is applied to the 64-bit sum, and the final result is saturated to the symmetric 32-bit range (-INT32_MAX
toINT32_MAX
inclusive).Below is a more detailed description of the operations performed (not including the saturation logic applied by the accumulators).
\[\begin{split} & y[t] = sat_{32} \left( round \left( \left( \sum_{k=0}^{N-1} round(x[t-k] \cdot b[k] \cdot 2^{-30}) \right) \cdot 2^{-shift} \right) \right) \\ & \qquad\text{where } sat_{32}() \text{ saturates to } \pm(2^{31}-1) \\ & \qquad\text{ and } round() \text{ rounds to the nearest integer, with ties rounding towards } +\!\infty \end{split}\]
- Operations
-
Initialize: A
xs3_filter_fir_s32_t
filter is initialized with a call to xs3_filter_fir_s32_init(). The caller supplies information about the filter, including the number of taps and pointers the coefficients and a state buffer. It is typically recommended that the state buffer be cleared to all0
s before initializing.Add Sample: To add a new input sample without computing a new output sample, use xs3_filter_fir_s32_add_sample(). This is a constant-time operation which does not depend on the number of filter taps. This may be useful in some situations, for example, to quickly pre-load the filter’s state buffer with multiple samples, without incurring the cost of computing an output with each added sample.
Process Sample: To process a new input sample and produce a new output sample, use xs3_filter_fir_s32().
- Fields
-
After initialization via xs3_filter_fir_s32_init(), the contents of the
xs3_filter_fir_s32_t
struct are considered to be opaque, and may change between major versions. In general, user code should not need to access its members.num_taps
is the order of the filter, or the number of taps. It is also the (minimum) size of the buffers to whichcoef
andstate
point, in elements (where each element is 4 bytes). The time required to process an input sample and produce an output sample is approximately linear innum_taps
(see Performance below).head
is the index intostate
at which the next sample will be added.shift
is the unsigned arithmetic rounding saturating right-shift applied to internal accumulator to get a final output.coef
is a pointer to a buffer (supplied by the user at initialization) containing the tap coefficients. The coefficients are stored in forward order, with lower indices corresponding to newer samples.coef[0]
, then, corresponds tob[0]
,coef[1]
tob[1]
, and so on. None of the functions which operate onxs3_filter_fir_s32_t
structs in this library will modify the contents of the buffer to whichcoef
points. This buffer must be at leastnum_taps
words long.state
is a pointer to a buffer (supplied by the user at initialization) containing the state data — a history of thenum_taps
most recent input samples.state
is used in a circular fashion withhead
indicating the index at which the next sample will be inserted.
- Performance
-
More work remains to fully characterize the time performance of this FIR filter, but asymptotically (i.e. with a large number of filter taps) processing a new input sample to produce a new output sample takes approximately 3 thread cycles per 8 filter taps.
That assumes that both the coefficients (pointed to by
coef
) and state buffer (pointed to bystate
) are stored directly in SRAM.
- Todo:
If the function takes
S * num_taps + V
thread cycles, what isV
?When there are fewer than
M
taps, it is more efficient to just use a C implementation of an FIR filter. What isM
?Brief explanation of how thread cycles correspond to actual time.
- Coefficient Scaling
-
Suppose you’re starting with a floating-point FIR filter model with coefficients
B[k]
which operates on a sequence of 32-bit integer input samplesx[t]
to get a resultY[t]
whereBecause of the 30-bit right-shift and the right-shift of the final accumulator byY[t] = x[t-0] * B[0] + x[t-1] * B[1] + ... + x[t-(N-1)] * B[N-1]
shift
bits, the coefficientsb[k]
to use with this library can be thought of as fixed-point values with30 + shift
fractional bits.The floating-point coefficients
B[k]
can then be naively converted to fixed-point coefficientsb[k]
After this, any further doubling of the coefficients can be compensated for without changing the overall gain by incrementingshift = 0 b[k] = (int32_t) round(ldexp(B[k], 30)
shift
.To maximize precision, you’ll typically want
shift
to be as large as possible while in the worst case to be considered neither saturates the internal accumulator (which, for safety, should generally be assumed to be 42 bits), nor saturates the final 32-bit output whenshift
is applied.The details of this depend on various details, such as your filter’s gain and the statistics of the sequence
x[t]
(e.g. any headroomx[t]
is known a priori to have).
- Filter Conversion
-
This library includes a python script which converts existing floating-point FIR filter coefficients into a suitable representation and generates code for easily initializing and executing the filter. See Note: Digital Filter Conversion for more.
- Usage Example
-
#define N 256 // Tap count #define B_VAL ldexp(1.0/N, 30+7) // Value for (all) coefficients const int32_t b[TAPS] = // The filter coefficients { B_VAL, B_VAL, B_VAL, ..., B_VAL }; const right_shift_t shift = 7; // The (unsigned) right-shift applied to the final accumulator int32_t state_buff[TAPS] = { 0 }; // Filter state buffer, initialized to 0's xs3_filter_fir_s32_t filter; // The filter struct #define SAMPLE_COUNT 1024 int32_t x[SAMPLE_COUNT] = { ... }; // Some sequence of input samples // Initialize xs3_filter_fir_s32_init(&filter, state_buff, N, b, shift); // Just add the first 64 without processing output samples. (not necessary) for(unsigned i = 0; i < 64; i++) xs3_filter_fir_s32_add_sample(&filter, x[i]); // Process the rest, generating a sequence of filtered output samples int32_t y[SAMPLE_COUNT] = { 0 }; //Output samples (first 64 never get updated here) for(unsigned i = 64; i < SAMPLE_COUNT; i++) y[i] = xs3_filter_fir_s32(&filter, x[i]); // Do something with output sequence ...
This example creates a simple 256-tap filter which averages the most recent 256 samples.
Each
b[k]
is \(2^{29}\), and the final accumulator is right-shifted 7 bits. In the worst case, all input samples are \(-2^{31}\). In that case, the final accumulator value is \( 256 \cdot (2^{29} \cdot -2^{31} \cdot 2^{-30}) = -2^{38} \), well below the saturation limit of the accumulator. Aftershift
is applied, that becomes \(-2^{38} \cdot 2^{-7} = -2^{31}\). Finally, the 32-bit symmetric saturation logic is applied, making the final output value \(-2^{31}+1\).
- Notes
-
state
is a circular buffer, and so the index ofx[t]
withinstate
changes with each input sample. Thestate
field of this struct is considered to be opaque — its exact usage may change between versions.Ordinarily integer sums are associative, so the order in which elements are added added does not affect the final result. The sum that the FIR filters use, however, is saturating, with the saturation logic being applied throughout the sum. This saturation is a hard non-linearity and is not associative. The details of exactly when each tap is accumulated and into which accumulator are complicated and subject to change. It is best to construct a filter such that no ordering of the taps will saturate the accumulators.
-
struct xs3_filter_fir_s16_t¶
- #include <xs3_filters.h>
16-bit Discrete-Time Finite Impulse Response (FIR) Filter
- Filter Model
-
This struct represents an N-tap 16-bit discrete-time FIR Filter.
At each time step, the FIR filter consumes a single 16-bit input sample and produces a single 16-bit output sample.
To process a new input sample and compute a new output sample, use xs3_filter_fir_s16(). To add a new input sample to the filter without computing a new output sample, use xs3_filter_fir_s16_add_sample().
An
N
-tap FIR filter containsN
16-bit cofficients (pointed to bycoef
) andN
int16_t
s of state data (pointed to bystate
. The state data is a vector of theN
most recent input samples. When processing a new input sample at time stept
,x[t]
is the new input sample,x[t-1]
is the previous input sample, and so on, up tox[t-(N-1)]
, which is the oldest input considered when computing the new output sample (see note 1 below). The coefficients form a vectorb[]
, whereb[k]
is the coefficient by which thek
th oldest input sample is multiplied. There is an additional parametershift
which scales the output as described below. Both the coefficients andshift
are considered to be constants which do not change after initialization (although nothing should break if they are changed to new valid values).At time step
t
, the output sampley[t]
is computed based on the inner product (i.e. sum of element-wise products) of the coefficients and state data as follows (a more detailed description is below):Unlike the 32-bit FIR filters (seeacc = x[t-0] * b[0] + x[t-1] * b[1] + x[t-2] * b[2] + ... + x[t-(N-1)] * b[N-1] y[t] = acc >> shift
xs3_filter_fir_s16_t
), the productsx[t-k] * b[k]
are the raw 32-bit products of the 16-bit elements. These element-wise products accumulate into a 32-bit accumulator which saturates the sums at symmetric 32-bit bounds (see Symmetrically Saturating Arithmetic).After all taps have been accumulated, a rounding arithmetic right-shift of
shift
bits is applied to the 64-bit sum, and the final result is saturated to the symmetric 16-bit range (-INT16_MAX
toINT16_MAX
inclusive).Below is a more detailed description of the operations performed (not including the saturation logic applied by the accumulators).
\[\begin{split} & y[t] = sat_{16} \left( round \left( \left( \sum_{k=0}^{N-1} round(x[t-k] \cdot b[k]) \right) \cdot 2^{-shift} \right) \right) \\ & \qquad\text{where } sat_{32}() \text{ saturates to } \pm(2^{15}-1) \\ & \qquad\text{ and } round() \text{ rounds to the nearest integer, with ties rounding towards } +\!\infty \end{split}\]
- Operations
-
Initialize: A
xs3_filter_fir_s16_t
filter is initialized with a call to xs3_filter_fir_s16_init(). The caller supplies information about the filter, including the number of taps and pointers the coefficients and a state buffer. It is typically recommended that the state buffer be cleared to all0
s before initializing.Add Sample: To add a new input sample without computing a new output sample, use xs3_filter_fir_s16_add_sample(). Unlike xs3_filter_fir_s32_add_sample(), this is not a constant-time operation, and does depend on the number of filter taps. Nevertheless, this is faster than computing output samples, and may be useful in some situations, for example, to moer quickly pre-load the filter’s state buffer with multiple samples, without incurring the cost of computing an output with each added sample.
Process Sample: To process a new input sample and produce a new output sample, use xs3_filter_fir_s16().
- Fields
-
After initialization via xs3_filter_fir_s16_init(), the contents of the
xs3_filter_fir_s16_t
struct are considered to be opaque, and may change between major versions. In general, user code should not need to access its members.num_taps
is the order of the filter, or the number of taps. It is also the (minimum) size of the buffers to whichcoef
andstate
point, in elements (where each element is 2 bytes). The time required to process an input sample and produce an output sample is approximately linear innum_taps
(see Performance below).shift
is the unsigned arithmetic rounding saturating right-shift applied to internal accumulator to get a final output.coef
is a pointer to a buffer (supplied by the user at initialization) containing the tap coefficients. The coefficients are stored in forward order, with lower indices corresponding to newer samples.coef[0]
, then, corresponds tob[0]
,coef[1]
tob[1]
, and so on. None of the functions which operate onxs3_filter_fir_s16_t
structs in this library will modify the contents of the buffer to whichcoef
points. This buffer must be at leastnum_taps
elements long, and must begin at a word-aligned address.state
is a pointer to a buffer (supplied by the user at initialization) containing the state data — a history of thenum_taps
most recent input samples.state
must begin at a word-aligned address.
- Coefficient Scaling
- Filter Conversion
-
This library includes a python script which converts existing floating-point FIR filter coefficients into a suitable representation and generates code for easily initializing and executing the filter. See Note: Digital Filter Conversion for more.
- Todo:
- Usage Example
-
struct xs3_biquad_filter_s32_t¶
- #include <xs3_filters.h>
A biquad filter block.
Contains the coeffient and state information for a cascade of up to 8 biquad filter sections.
To process a new input sample, xs3_filter_biquad_s32() can be used with a pointer to one of these structs.
For longer cascades, an array of
xs3_biquad_filter_s32_t
structs can be used with xs3_filter_biquads_s32().- Filter Conversion
-
This library includes a python script which converts existing floating-point cascaed biquad filter coefficients into a suitable representation and generates code for easily initializing and executing the filter. See Note: Digital Filter Conversion for more.
XS3 Filter Functions¶
-
void xs3_filter_fir_s32_init(xs3_filter_fir_s32_t *filter, int32_t *sample_buffer, const unsigned tap_count, const int32_t *coefficients, const right_shift_t shift)¶
Initialize a 32-bit FIR filter.
Before xs3_filter_fir_s32() or xs3_filter_fir_s32_add_sample() can be used on a filter it must be initialized with a call to this function.
sample_buffer
andcoefficients
must be at least4 * tap_count
bytes long, and aligned to a 4-byte (word) boundary.See
xs3_filter_fir_s32_t
for more information about 32-bit FIR filters and their operation.See also
- Parameters
filter – [out] Filter struct to be initialized
sample_buffer – [in] Buffer used by the filter to contain state information. Must be at least
tap_count
elements longtap_count – [in] Order of the FIR filter; number of filter taps
coefficients – [in] Array containing filter coefficients.
shift – [in] Unsigned arithmetic right-shift applied to accumulator to get filter output sample
-
void xs3_filter_fir_s32_add_sample(xs3_filter_fir_s32_t *filter, const int32_t new_sample)¶
Add a new input sample to a 32-bit FIR filter without processing an output sample.
This function adds a new input sample to
filter
’s state without computing a new output sample. This is a constant- time operation and can be used to quickly pre-load a filter with sample data.See
xs3_filter_fir_s32_t
for more information about FIR filters and their operation.See also
- Parameters
filter – [inout] Filter struct to have the sample added
new_sample – [in] Sample to be added to
filter
’s history
-
int32_t xs3_filter_fir_s32(xs3_filter_fir_s32_t *filter, const int32_t new_sample)¶
This function implements a Finite Impulse Response (FIR) filter.
The new input sample
new_sample
is added to this filter’s state, and a new output sample is computed and returned as specified inxs3_filter_fir_s32_t
.With a large number of filter taps, this function takes approximately 3 thread cycles per 8 filter taps.
See also
- Parameters
filter – [inout] Filter to be processed
new_sample – [in] New input sample to be processed by
filter
- Returns
Next filtered output sample
-
void xs3_filter_fir_s16_init(xs3_filter_fir_s16_t *filter, int16_t *sample_buffer, const unsigned tap_count, const int16_t *coefficients, const right_shift_t shift)¶
Initialize a 16-bit FIR filter.
Before xs3_filter_fir_s16() or xs3_filter_fir_s16_add_sample() can be used on a filter it must be initialized with a call to this function.
sample_buffer
andcoefficients
must be at least2 * tap_count
bytes long, and aligned to a 4-byte (word) boundary.See
xs3_filter_fir_s16_t
for more information about 16-bit FIR filters and their operation.See also
- Parameters
filter – [out] Filter struct to be initialized
sample_buffer – [in] Buffer used by the filter to contain state information. Must be at least
tap_count
elements longtap_count – [in] Order of the FIR filter; number of filter taps
coefficients – [in] Array containing filter coefficients
shift – [in] Unsigned arithmetic right-shift applied to accumulator to get filter output sample
-
void xs3_filter_fir_s16_add_sample(xs3_filter_fir_s16_t *filter, const int16_t new_sample)¶
Add a new input sample to a 16-bit FIR filter without processing an output sample.
This function adds a new input sample to
filter
’s state without computing a new output sample.See
xs3_filter_fir_s16_t
for more information about FIR filters and their operation.See also
- Parameters
filter – [inout] Filter struct to have the sample added
new_sample – [in] Sample to be added to
filter
’s history
-
int16_t xs3_filter_fir_s16(xs3_filter_fir_s16_t *filter, const int16_t new_sample)¶
This function implements a Finite Impulse Response (FIR) filter.
The new input sample
new_sample
is added to this filter’s state, and a new output sample is computed and returned as specified inxs3_filter_fir_s16_t
.With a large number of filter taps, this function takes approximately 3 thread cycles per 16 filter taps.
See also
- Parameters
filter – [inout] Filter to be processed
new_sample – [in] New input sample to be processed by
filter
- Returns
Next filtered output sample
-
int32_t xs3_filter_biquad_s32(xs3_biquad_filter_s32_t *filter, const int32_t new_sample)¶
This function implements a 32-bit Biquad filter.
The new input sample
new_sample
is added to this filter’s state, and a new output sample is computed and returned as specified inxs3_biquad_filter_s32_t
.This function processes a single filter block containing (up to) 8 biquad filter sections. For biquad filters containing 2 or more filter blocks (more than 8 biquad filter sections), see xs3_filter_biquads_s32().
- Parameters
filter – [inout] Filter to be processed
new_sample – [in] New input sample to be processed by
filter
- Returns
Next filtered output sample
-
int32_t xs3_filter_biquads_s32(xs3_biquad_filter_s32_t biquads[], const unsigned block_count, const int32_t new_sample)¶
This function implements a 32-bit Biquad filter.
The new input sample
new_sample
is added to this filter’s state, and a new output sample is computed and returned as specified inxs3_biquad_filter_s32_t
.This function processes one or more filter blocks, with each block containing up to 8 biquad filter sections.
See also
- Parameters
biquads – [inout] Filter blocks to be processed
block_count – [in] Number of filter blocks in
biquads
new_sample – [in] New input sample to be processed by
filter
- Returns
Next filtered output sample