XS3 Float Vector Functions

exponent_t xs3_vect_f32_max_exponent(const float b[], const unsigned length)
Get the maximum (32-bit BFP) exponent from a vector of IEEE754 floats.
This function is used to determine the BFP exponent to use when converting a vector of IEEE754 single-precision floats into a 32-bit BFP vector.
The exponent returned, if used with xs3_vect_f32_to_s32(), is the one which will result in no headroom in the BFP vector; that is, it is the minimum permissible exponent for the BFP vector. The minimum permissible exponent is derived from the maximum exponent found in the float elements themselves.
More specifically, the FSEXP instruction is used on each element to determine its exponent. The value returned is the maximum exponent given by the FSEXP instruction plus 30.
b[] must begin at a double-word-aligned address.
Note
If required, when converting to a 32-bit BFP vector, additional headroom can be included by adding the amount of required headroom to the exponent returned by this function.
 Parameters
b – [in] Input vector of IEEE754 singleprecision floats \(\bar b\)
length – [in] Number of elements in \(\bar b\)
 Throws
ET_LOAD_STORE – Raised if b is not double-word-aligned (See Note: Vector Alignment)
ET_ARITHMETIC – Raised if any element of b is infinite or not-a-number.
 Returns
Exponent used for converting to 32-bit BFP vector.
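The zero-headroom property described above can be illustrated with a small portable model. This is only a sketch under stated assumptions: `model_f32_max_exponent` is a hypothetical name, it uses the standard `frexpf()` rather than the `FSEXP` instruction, and it does not attempt to reproduce the instruction's exponent convention or the constant offset quoted above. It only picks the exponent for which the largest-magnitude element maps to a mantissa magnitude in [2^30, 2^31).

```c
#include <assert.h>
#include <math.h>

/* Hypothetical portable model, not the library routine (which uses FSEXP). */
int model_f32_max_exponent(const float b[], unsigned length)
{
    int max_exp = 0;
    int found = 0;                  /* set once a nonzero element is seen */

    for (unsigned k = 0; k < length; k++) {
        assert(isfinite(b[k]));     /* the real function raises ET_ARITHMETIC instead */
        if (b[k] == 0.0f)
            continue;               /* zero places no constraint on the exponent */

        int e;
        (void) frexpf(b[k], &e);    /* |b[k]| = f * 2^e with 0.5 <= f < 1 */
        if (!found || e > max_exp) {
            max_exp = e;
            found = 1;
        }
    }

    /* With this exponent the largest element has magnitude in [2^30, 2^31). */
    /* For all-zero input, returning 0 is an arbitrary choice of this model. */
    return found ? max_exp - 31 : 0;
}
```

In this model, extra headroom is requested exactly as the note above describes: using `model_f32_max_exponent(b, n) + 2` as the conversion exponent would leave two bits of headroom in the resulting mantissas.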

void xs3_vect_f32_to_s32(int32_t a[], const float b[], const unsigned length, const exponent_t a_exp)
Convert a vector of IEEE754 single-precision floats into a 32-bit BFP vector.
This function converts a vector of IEEE754 single-precision floats \(\bar b\) into the mantissa vector \(\bar a\) of a 32-bit BFP vector, given BFP vector exponent \(a\_exp\). Conceptually, the elements of output vector \(\bar{a} \cdot 2^{a\_exp}\) represent the same values as those of the input vector.
Because the output exponent \(a\_exp\) is shared by all elements of the output vector, even though the output vector has 32-bit mantissas, precision may be lost on some elements if the exponents of the input elements \(b_k\) span a wide range.
The function xs3_vect_f32_max_exponent() can be used to determine the value for \(a\_exp\) which minimizes headroom of the output vector.
 Operation Performed:
 \[\begin{split}\begin{align*} & a_k \leftarrow round(\frac{b_k}{2^{a\_exp}}) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) \end{align*}\end{split}\]
 Parameter Details

a[] represents the 32-bit output mantissa vector \(\bar a\).
b[] represents the IEEE754 float input vector \(\bar b\).
a[] and b[] must each begin at a double-word-aligned address. b[] can be safely updated in-place.
length is the number of elements in each of the vectors.
a_exp is the exponent associated with the output vector \(\bar a\).
 Parameters
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
a_exp – [in] Exponent \(a\_exp\) of output vector \(\bar a\)
 Throws
ET_LOAD_STORE – Raised if a or b is not double-word-aligned (See Note: Vector Alignment)
ET_ARITHMETIC – Raised if any element of b is infinite or not-a-number.

void xs3_vect_s32_to_f32(float a[], const int32_t b[], const unsigned length, const exponent_t b_exp)
Convert a 32-bit BFP vector into a vector of IEEE754 single-precision floats.
This function converts a 32-bit mantissa vector and exponent \(\bar b \cdot 2^{b\_exp}\) into a vector of 32-bit IEEE754 single-precision floating-point elements \(\bar a\). Conceptually, the elements of output vector \(\bar a\) represent the same values as those of the input vector.
Because IEEE754 single-precision floats hold fewer mantissa bits (a 24-bit significand) than the 32-bit BFP mantissas, this operation may result in a loss of precision for some elements.
 Operation Performed:
 \[\begin{split}\begin{align*} & a_k \leftarrow b_k \cdot 2^{b\_exp} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) \end{align*}\end{split}\]
 Parameter Details

a[] represents the output IEEE754 float vector \(\bar a\).
b[] represents the 32-bit input mantissa vector \(\bar b\).
a[] and b[] must each begin at a double-word-aligned address. b[] can be safely updated in-place.
length is the number of elements in each of the vectors.
b_exp is the exponent associated with the input vector \(\bar b\).
 Parameters
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
b_exp – [in] Exponent \(b\_exp\) of input vector \(\bar b\)
 Throws
ET_LOAD_STORE – Raised if a or b is not double-word-aligned (See Note: Vector Alignment)
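The reverse conversion \(a_k \leftarrow b_k \cdot 2^{b\_exp}\) can be sketched the same way. `model_s32_to_f32` is a hypothetical name, and the rounding applied when a mantissa does not fit in a float's 24-bit significand is whatever the host's integer-to-float conversion does (round-to-nearest-even by default on typical hosts), which may differ from the library.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical portable model of the BFP mantissa -> float conversion. */
void model_s32_to_f32(float a[], const int32_t b[], unsigned length, int b_exp)
{
    for (unsigned k = 0; k < length; k++) {
        /* The cast rounds the mantissa to 24 significant bits;
           ldexpf then applies the shared exponent 2^b_exp. */
        a[k] = ldexpf((float) b[k], b_exp);
    }
}
```

For example, the mantissa `1073741825` (that is, 2^30 + 1) with `b_exp = -30` converts to exactly `1.0f` in this model: the low-order bit cannot be represented in a single-precision float and is lost.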

float xs3_vect_f32_dot(const float b[], const float c[], const unsigned length)
Compute the inner product of two IEEE754 float vectors.
This function takes two vectors of IEEE754 single-precision floats and computes their inner product, that is, the sum of the element-wise products. The FMACC instruction is used, granting full precision in the addition.
The inner product \(a\) is returned.
 Operation Performed:
 \[\begin{align*} & a \leftarrow \sum_{k=0}^{length-1} ( b_k \cdot c_k ) \end{align*}\]
 Parameters
b – [in] Input vector \(\bar b\)
c – [in] Input vector \(\bar c\)
length – [in] Number of elements in vectors \(\bar b\) and \(\bar c\)
 Returns
The inner product

complex_float_t *xs3_vect_f32_fft_forward(float x[], const unsigned fft_length)
Perform forward FFT on a vector of IEEE754 floats.
This function takes real input vector \(\bar x\) and performs a forward FFT on the signal in-place to get output vector \(\bar{X} = FFT\{\bar{x}\}\). This implementation is accelerated by converting the IEEE754 float vector into a block floating-point representation to compute the FFT. The resulting BFP spectrum is then converted back to IEEE754 single-precision floats. The operation is performed in-place on x[].
See bfp_fft_forward_mono() for the details of the FFT.
Whereas the input x[] is an array of fft_length float elements, the output (placed in x[]) is an array of fft_length/2 complex_float_t elements, so after the call x[] should be accessed through a complex_float_t pointer, either by casting x or by using the returned pointer.
```c
const unsigned FFT_N = 512;
float time_series[FFT_N] = { ... };
xs3_vect_f32_fft_forward(time_series, FFT_N);
complex_float_t* freq_spectrum = (complex_float_t*) &time_series[0];
const unsigned FREQ_BINS = FFT_N/2;
// e.g. freq_spectrum[FREQ_BINS-1].re
```
x[] must begin at a double-word-aligned address.
 Operation Performed:
 \[\begin{align*} & \bar{X} \leftarrow FFT\{\bar{x}\} \end{align*}\]
 Parameters
x – [inout] Input vector \(\bar x\)
fft_length – [in] The length of \(\bar x\)
 Throws
ET_LOAD_STORE – Raised if x is not double-word-aligned (See Note: Vector Alignment)
 Returns
Pointer to frequency-domain spectrum (i.e. ((complex_float_t*) &x[0]))
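The alignment and in-place reinterpretation requirements can be sketched without calling the library. Everything here is an assumption for illustration: `model_complex_float_t` is a stand-in for the library's complex_float_t (assumed to hold `re` followed by `im`), `as_spectrum` is a hypothetical helper, and "double-word-aligned" is taken to mean 8-byte alignment on a target with 32-bit words.

```c
#include <stdalign.h>
#include <stdint.h>

#define FFT_N 512

/* Stand-in for the library's complex_float_t (assumed layout: re, then im). */
typedef struct {
    float re;
    float im;
} model_complex_float_t;

/* Request 8-byte alignment explicitly; a plain float array only guarantees 4. */
alignas(8) float time_series[FFT_N];

/* View the same storage as FFT_N/2 complex bins, as the cast in the text does. */
model_complex_float_t *as_spectrum(float x[])
{
    return (model_complex_float_t *) &x[0];
}
```

Because `FFT_N` floats and `FFT_N/2` two-float structures occupy the same number of bytes, the spectrum fits exactly in the buffer that held the time-domain signal.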

float *xs3_vect_f32_fft_inverse(complex_float_t X[], const unsigned fft_length)
Perform inverse FFT on a vector of complex_float_t.
This function takes complex input vector \(\bar X\) and performs an inverse real FFT on the spectrum in-place to get output vector \(\bar{x} = IFFT\{\bar{X}\}\). This implementation is accelerated by converting the IEEE754 float vector into a block floating-point representation to compute the IFFT. The resulting BFP signal is then converted back to IEEE754 single-precision floats. The operation is performed in-place on X[].
See bfp_fft_inverse_mono() for the details of the IFFT.
Input X[] is an array of fft_length/2 complex_float_t elements. The output (placed in X[]) is an array of fft_length float elements.
```c
const unsigned FFT_N = 512;
complex_float_t freq_spectrum[FFT_N/2] = { ... };
xs3_vect_f32_fft_inverse(freq_spectrum, FFT_N);
float* time_series = (float*) &freq_spectrum[0];
```
X[] must begin at a double-word-aligned address.
 Parameters
X – [inout] Input vector \(\bar X\)
fft_length – [in] The FFT length. Twice the element count of \(\bar X\).
 Throws
ET_LOAD_STORE – Raised if X is not double-word-aligned (See Note: Vector Alignment)
 Returns
Pointer to time-domain signal (i.e. ((float*) &X[0]))