XCORE SDK
XCORE Software Development Kit
Functions
exponent_t | xs3_vect_f32_max_exponent (const float b[], const unsigned length) |
Get the maximum (32-bit BFP) exponent from a vector of IEEE754 floats.
void | xs3_vect_f32_to_s32 (int32_t a[], const float b[], const unsigned length, const exponent_t a_exp) |
Convert a vector of IEEE754 single-precision floats into a 32-bit BFP vector.
void | xs3_vect_s32_to_f32 (float a[], const int32_t b[], const unsigned length, const exponent_t b_exp) |
Convert a 32-bit BFP vector into a vector of IEEE754 single-precision floats.
float | xs3_vect_f32_dot (const float b[], const float c[], const unsigned length) |
Compute the inner product of two IEEE754 float vectors.
complex_float_t * | xs3_vect_f32_fft_forward (float x[], const unsigned fft_length) |
Perform forward FFT on a vector of IEEE754 floats.
float * | xs3_vect_f32_fft_inverse (complex_float_t X[], const unsigned fft_length) |
Perform inverse FFT on a vector of complex_float_t.
float xs3_vect_f32_dot (const float b[], const float c[], const unsigned length)
Compute the inner product of two IEEE754 float vectors.
This function takes two vectors of IEEE754 single-precision floats and computes their inner product, that is, the sum of the elementwise products. The FMACC instruction is used, granting full precision in the addition.
The inner product \(a\) is returned.
\begin{align*} & a \leftarrow \sum_{k=0}^{length-1} ( b_k \cdot c_k ) \end{align*}
[in] | b | Input vector \(\bar b\) |
[in] | c | Input vector \(\bar c\) |
[in] | length | Number of elements in vectors \(\bar b\) and \(\bar c\) |
complex_float_t* xs3_vect_f32_fft_forward (float x[], const unsigned fft_length)
Perform forward FFT on a vector of IEEE754 floats.
This function takes a real input vector \(\bar x\) and performs a forward FFT on the signal in-place to get the output vector \(\bar{X} = FFT\{\bar{x}\}\). The implementation is accelerated by converting the IEEE754 float vector into a block floating-point (BFP) representation to compute the FFT. The resulting BFP spectrum is then converted back to IEEE754 single-precision floats. The operation is performed in-place on x[].
See bfp_fft_forward_mono() for the details of the FFT.
Whereas the input x[] is an array of fft_length float elements, the output (placed in x[]) is an array of fft_length/2 complex_float_t elements, so x[] should be cast to complex_float_t* after the call.
x[] must begin at a double-word-aligned address.
\begin{align*} & \bar{X} \leftarrow FFT\{\bar{x}\} \end{align*}
[in,out] | x | Input vector \(\bar x\) |
[in] | fft_length | The length of \(\bar x\) |
Returns a pointer to the spectrum in x[], i.e. ((complex_float_t*) &x[0])
ET_LOAD_STORE | Raised if x is not double-word-aligned (See Note: Vector Alignment) |
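The alignment requirement and the post-call cast can be sketched in portable C11 as follows. The struct layout shown for complex_float_t (real part first, then imaginary part) is an assumption made for this sketch and should be checked against the library headers; FFT_N, fft_buffer, is_double_word_aligned and spectrum_view are hypothetical names. No FFT is performed here.

```c
#include <stdint.h>

/* Assumed layout for this sketch only; check the library headers. */
typedef struct { float re; float im; } complex_float_t;

#define FFT_N 8  /* hypothetical FFT length */

/* _Alignas(8) places the buffer on a double-word (8-byte) boundary. */
_Alignas(8) float fft_buffer[FFT_N];

/* Returns 1 if p sits on an 8-byte boundary, 0 otherwise. */
int is_double_word_aligned(const void *p)
{
  return ((uintptr_t) p % 8u) == 0u;
}

/* View the in-place result: fft_length floats occupy the same
 * storage as fft_length/2 complex elements. */
complex_float_t *spectrum_view(float x[])
{
  return (complex_float_t *) &x[0];
}
```

Declaring the buffer with an explicit alignment, rather than relying on the default alignment of float, is one way to avoid the ET_LOAD_STORE exception described above.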
float* xs3_vect_f32_fft_inverse (complex_float_t X[], const unsigned fft_length)
Perform inverse FFT on a vector of complex_float_t.
This function takes a complex input vector \(\bar X\) and performs an inverse real FFT on the spectrum in-place to get the output vector \(\bar{x} = IFFT\{\bar{X}\}\). The implementation is accelerated by converting the IEEE754 float vector into a block floating-point (BFP) representation to compute the IFFT. The resulting BFP signal is then converted back to IEEE754 single-precision floats. The operation is performed in-place on X[].
See bfp_fft_inverse_mono() for the details of the IFFT.
Input X[] is an array of fft_length/2 complex_float_t elements. The output (placed in X[]) is an array of fft_length float elements.
X[] must begin at a double-word-aligned address.
[in,out] | X | Input vector \(\bar X\) |
[in] | fft_length | The FFT length. Twice the element count of \(\bar X\). |
Returns a pointer to the time-domain signal in X[], i.e. ((float*) &X[0])
ET_LOAD_STORE | Raised if X is not double-word-aligned (See Note: Vector Alignment) |
exponent_t xs3_vect_f32_max_exponent (const float b[], const unsigned length)
Get the maximum (32-bit BFP) exponent from a vector of IEEE754 floats.
This function is used to determine the BFP exponent to use when converting a vector of IEEE754 single-precision floats into a 32-bit BFP vector.
The exponent returned, if used with xs3_vect_f32_to_s32(), is the one which results in no headroom in the BFP vector; that is, it is the minimum permissible exponent for the BFP vector. The minimum permissible exponent is derived from the maximum exponent found in the float elements themselves.
More specifically, the FSEXP instruction is used on each element to determine its exponent. The value returned is the maximum exponent given by the FSEXP instruction plus 30.
b[] must begin at a double-word-aligned address.
[in] | b | Input vector of IEEE754 single-precision floats \(\bar b\) |
[in] | length | Number of elements in \(\bar b\) |
ET_LOAD_STORE | Raised if b is not double-word-aligned (See Note: Vector Alignment) |
ET_ARITHMETIC | Raised if any element of b is infinite or not-a-number. |
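The idea of a minimum permissible exponent can be sketched in portable C with the standard frexpf() function. This is a hypothetical host-side model, not the library's FSEXP-based implementation: the name ref_f32_max_exponent is invented here, frexpf() uses its own exponent convention (hence the offset of 31 in this sketch), and the value returned for an all-zero vector is an arbitrary choice of the sketch.

```c
#include <math.h>

/* Hypothetical reference model; assumes every element is finite.
 * Returns the smallest exponent e such that b[k] / 2^e fits in a
 * signed 32-bit mantissa for every k. */
int ref_f32_max_exponent(const float b[], const unsigned length)
{
  int max_exp = -1024;  /* sentinel below any float exponent */
  for (unsigned k = 0; k < length; k++) {
    if (b[k] == 0.0f) continue;  /* zero does not constrain the exponent */
    int e;
    (void) frexpf(b[k], &e);     /* b[k] = m * 2^e with 0.5 <= |m| < 1 */
    if (e > max_exp) max_exp = e;
  }
  /* |m| * 2^31 lies in [2^30, 2^31), so the largest mantissa has no headroom. */
  return max_exp - 31;
}
```

For b = {1.0f, -3.5f, 0.25f} this model returns -29: the largest element, -3.5, becomes the mantissa -3.5 * 2^29, whose magnitude lies between 2^30 and 2^31.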
void xs3_vect_f32_to_s32 (int32_t a[], const float b[], const unsigned length, const exponent_t a_exp)
Convert a vector of IEEE754 single-precision floats into a 32-bit BFP vector.
This function converts a vector of IEEE754 single-precision floats \(\bar b\) into the mantissa vector \(\bar a\) of a 32-bit BFP vector, given BFP vector exponent \(a\_exp\). Conceptually, the elements of output vector \(\bar{a} \cdot 2^{a\_exp}\) represent the same values as those of the input vector.
Because the output exponent \(a\_exp\) is shared by all elements of the output vector, even though the output vector has 32-bit mantissas, precision may be lost on some elements if the exponents of the input elements \(b_k\) span a wide range.
The function xs3_vect_f32_max_exponent() can be used to determine the value for \(a\_exp\) which minimizes headroom of the output vector.
\begin{align*} & a_k \leftarrow round(\frac{b_k}{2^{a\_exp}}) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) \end{align*}
a[] represents the 32-bit output mantissa vector \(\bar a\).
b[] represents the IEEE754 float input vector \(\bar b\).
a[] and b[] must each begin at a double-word-aligned address.
b[] can be safely updated in-place.
length is the number of elements in each of the vectors.
a_exp is the exponent associated with the output vector \(\bar a\).
[out] | a | Output vector \(\bar a\) |
[in] | b | Input vector \(\bar b\) |
[in] | length | Number of elements in vectors \(\bar a\) and \(\bar b\) |
[in] | a_exp | Exponent \(a\_exp\) of output vector \(\bar a\) |
ET_LOAD_STORE | Raised if a or b is not double-word-aligned (See Note: Vector Alignment) |
ET_ARITHMETIC | Raised if any element of b is infinite or not-a-number. |
void xs3_vect_s32_to_f32 (float a[], const int32_t b[], const unsigned length, const exponent_t b_exp)
Convert a 32-bit BFP vector into a vector of IEEE754 single-precision floats.
This function converts a 32-bit mantissa vector and exponent \(\bar b \cdot 2^{b\_exp}\) into a vector of 32-bit IEEE754 single-precision floating-point elements \(\bar a\). Conceptually, the elements of output vector \(\bar a\) represent the same values as those of the input vector.
Because IEEE754 single-precision floats hold fewer mantissa bits, this operation may result in a loss of precision for some elements.
\begin{align*} & a_k \leftarrow b_k \cdot 2^{b\_exp} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) \end{align*}
a[] represents the output IEEE754 float vector \(\bar a\).
b[] represents the 32-bit input mantissa vector \(\bar b\).
a[] and b[] must each begin at a double-word-aligned address.
b[] can be safely updated in-place.
length is the number of elements in each of the vectors.
b_exp is the exponent associated with the input vector \(\bar b\).
[out] | a | Output vector \(\bar a\) |
[in] | b | Input vector \(\bar b\) |
[in] | length | Number of elements in vectors \(\bar a\) and \(\bar b\) |
[in] | b_exp | Exponent \(b\_exp\) of input vector \(\bar b\) |
ET_LOAD_STORE | Raised if a or b is not double-word-aligned (See Note: Vector Alignment) |
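A portable C model of this conversion is sketched below, mainly to show where precision can be lost. The name ref_s32_to_f32 is hypothetical and the code is not the library's implementation.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical reference model of a_k <- b_k * 2^b_exp. */
void ref_s32_to_f32(float a[], const int32_t b[], const unsigned length, const int b_exp)
{
  for (unsigned k = 0; k < length; k++) {
    /* Converting int32_t to float rounds to a 24-bit significand;
     * this is the step where low-order mantissa bits can be dropped. */
    a[k] = ldexpf((float) b[k], b_exp);
  }
}
```

For example, the mantissa 16777217 (2^24 + 1) with b_exp = 0 comes out as 16777216.0f in this model, because a single-precision float cannot represent the low-order bit.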