16-Bit Vector API
 group vect_s16_api
Functions

headroom_t vect_s16_abs(int16_t a[], const int16_t b[], const unsigned length)
Compute the element-wise absolute value of a 16-bit vector.
a[] and b[] represent the 16-bit vectors \(\bar a\) and \(\bar b\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[].
length is the number of elements in each of the vectors.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a_k \leftarrow sat_{16}(\left| b_k \right|) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the output vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp\).
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of the output vector \(\bar a\).

int32_t vect_s16_abs_sum(const int16_t b[], const unsigned length)
Compute the sum of the absolute values of elements of a 16-bit vector.
b[] represents the 16-bit vector \(\bar b\). b[] must begin at a word-aligned address.
length is the number of elements in \(\bar b\).
 Operation Performed:
 \[\begin{flalign*} a \leftarrow \sum_{k=0}^{length-1} \left| b_k \right| && \end{flalign*}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the returned value \(a\) is the 32-bit mantissa of floating-point value \(a \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp\).
 Parameters:
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if b is not word-aligned (See Note: Vector Alignment)
 Returns:
The 32-bit sum \(a\)

headroom_t vect_s16_add(int16_t a[], const int16_t b[], const int16_t c[], const unsigned length, const right_shift_t b_shr, const right_shift_t c_shr)
Add one 16-bit BFP vector to another.
a[], b[] and c[] represent the 16-bit vectors \(\bar a\), \(\bar b\) and \(\bar c\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[] or c[].
length is the number of elements in each of the vectors.
b_shr and c_shr are the signed arithmetic right-shifts applied to each element of \(\bar b\) and \(\bar c\) respectively.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & b_k' = sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & c_k' = sat_{16}(\lfloor c_k \cdot 2^{-c\_shr} \rfloor) \\ & a_k \leftarrow sat_{16}\!\left( b_k' + c_k' \right) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) and \(\bar c\) are the mantissas of BFP vectors \( \bar{b} \cdot 2^{b\_exp} \) and \(\bar{c} \cdot 2^{c\_exp}\), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\).
In this case, \(b\_shr\) and \(c\_shr\) must be chosen so that \(a\_exp = b\_exp + b\_shr = c\_exp + c\_shr\). Adding or subtracting mantissas only makes sense if they are associated with the same exponent.
The function vect_s16_add_prepare() can be used to obtain values for \(a\_exp\), \(b\_shr\) and \(c\_shr\) based on the input exponents \(b\_exp\) and \(c\_exp\) and the input headrooms \(b\_hr\) and \(c\_hr\).
See also
vect_s16_add_prepare
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
c – [in] Input vector \(\bar c\)
length – [in] Number of elements in vectors \(\bar a\), \(\bar b\) and \(\bar c\)
b_shr – [in] Right-shift applied to \(\bar b\)
c_shr – [in] Right-shift applied to \(\bar c\)
 Throws ET_LOAD_STORE:
Raised if a, b or c is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of the output vector \(\bar a\).
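As a rough illustration of the operation above, the following plain C sketch models the shift, saturate and add steps on a host machine. The names ref_add_s16, sat16 and shr_floor are hypothetical helpers and not part of the library. Clamping to the full int16_t range is an assumption; the library's actual saturation bounds, its alignment requirements and its headroom return value are not modelled here.

```c
#include <stdint.h>

/* Assumed saturation: clamp to the int16_t range. */
static int16_t sat16(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* floor(x * 2^-shr); a negative shr acts as a left-shift. Assumes |shr| <= 31. */
static int64_t shr_floor(int32_t x, int shr)
{
    if (shr < 0) return (int64_t)x * ((int64_t)1 << -shr);
    int64_t d = (int64_t)1 << shr;
    int64_t q = x / d;                 /* C division truncates toward zero */
    if (x % d != 0 && x < 0) q -= 1;   /* step down to the floor for negatives */
    return q;
}

/* Hypothetical reference model of the "Operation Performed" equations. */
void ref_add_s16(int16_t a[], const int16_t b[], const int16_t c[],
                 unsigned length, int b_shr, int c_shr)
{
    for (unsigned k = 0; k < length; k++) {
        int16_t bk = sat16(shr_floor(b[k], b_shr));
        int16_t ck = sat16(shr_floor(c[k], c_shr));
        a[k] = sat16((int64_t)bk + ck);
    }
}
```

For instance, if \(b\_exp = -3\) and \(c\_exp = -2\), then choosing \(b\_shr = 1\) and \(c\_shr = 0\) gives both shifted operands the common exponent \(a\_exp = -2\).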

headroom_t vect_s16_add_scalar(int16_t a[], const int16_t b[], const int16_t c, const unsigned length, const right_shift_t b_shr)
Add a scalar to a 16-bit vector.
a[] and b[] represent the 16-bit mantissa vectors \(\bar a\) and \(\bar b\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[].
c is the scalar \(c\) to be added to each element of \(\bar b\).
length is the number of elements in each of the vectors.
b_shr is the signed arithmetic right-shift applied to each element of \(\bar b\).
 Operation Performed:
 \[\begin{split}\begin{flalign*} & b_k' = sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & a_k \leftarrow sat_{16}\!\left( b_k' + c \right) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If elements of \(\bar b\) are the mantissas of BFP vector \( \bar{b} \cdot 2^{b\_exp} \), and \(c\) is the mantissa of floating-point value \(c \cdot 2^{c\_exp}\), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\).
In this case, \(b\_shr\) and \(c\_shr\) must be chosen so that \(a\_exp = b\_exp + b\_shr = c\_exp + c\_shr\). Adding or subtracting mantissas only makes sense if they are associated with the same exponent.
The function vect_s16_add_scalar_prepare() can be used to obtain values for \(a\_exp\), \(b\_shr\) and \(c\_shr\) based on the input exponents \(b\_exp\) and \(c\_exp\) and the input headrooms \(b\_hr\) and \(c\_hr\).
Note that \(c\_shr\) is an output of vect_s16_add_scalar_prepare(), but is not a parameter to this function. The \(c\_shr\) produced by vect_s16_add_scalar_prepare() is to be applied by the user, and the result passed as input c.
See also
vect_s16_add_scalar_prepare()
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
c – [in] Input scalar \(c\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
b_shr – [in] Right-shift applied to \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of the output vector \(\bar a\).

unsigned vect_s16_argmax(const int16_t b[], const unsigned length)
Obtain the array index of the maximum element of a 16-bit vector.
b[] represents the 16-bit input vector \(\bar b\). It must begin at a word-aligned address.
length is the number of elements in \(\bar b\).
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a \leftarrow argmax_k\{ b_k \} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Parameters:
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if b is not word-aligned (See Note: Vector Alignment)
 Returns:
\(a\), the index of the maximum element of vector \(\bar b\). If there is a tie for the maximum value, the lowest tying index is returned.

unsigned vect_s16_argmin(const int16_t b[], const unsigned length)
Obtain the array index of the minimum element of a 16-bit vector.
b[] represents the 16-bit input vector \(\bar b\). It must begin at a word-aligned address.
length is the number of elements in \(\bar b\).
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a \leftarrow argmin_k\{ b_k \} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Parameters:
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if b is not word-aligned (See Note: Vector Alignment)
 Returns:
\(a\), the index of the minimum element of vector \(\bar b\). If there is a tie for the minimum value, the lowest tying index is returned.
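The tie-breaking rule for the two arg functions can be pictured with a small host-side sketch. The names ref_argmax_s16 and ref_argmin_s16 are hypothetical and only model the documented result, not the library's implementation; the behaviour for length equal to 0 is not documented here, and this sketch simply returns 0 in that case.

```c
#include <stdint.h>

/* Hypothetical model: index of the maximum, lowest index wins on ties. */
unsigned ref_argmax_s16(const int16_t b[], unsigned length)
{
    unsigned best = 0;
    for (unsigned k = 1; k < length; k++)
        if (b[k] > b[best]) best = k;   /* strict '>' keeps the earlier index */
    return best;
}

/* Hypothetical model: index of the minimum, lowest index wins on ties. */
unsigned ref_argmin_s16(const int16_t b[], unsigned length)
{
    unsigned best = 0;
    for (unsigned k = 1; k < length; k++)
        if (b[k] < b[best]) best = k;   /* strict '<' keeps the earlier index */
    return best;
}
```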

headroom_t vect_s16_clip(int16_t a[], const int16_t b[], const unsigned length, const int16_t lower_bound, const int16_t upper_bound, const right_shift_t b_shr)
Clamp the elements of a 16-bit vector to a specified range.
a[] and b[] represent the 16-bit vectors \(\bar a\) and \(\bar b\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[].
length is the number of elements in each of the vectors.
lower_bound and upper_bound are the lower and upper bounds of the clipping range respectively. These bounds are checked for each element of \(\bar b\) only after b_shr is applied.
b_shr is the signed arithmetic right-shift applied to elements of \(\bar b\) before being compared to the upper and lower bounds.
If \(\bar b\) are the mantissas for a BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the exponent \(a\_exp\) of the output BFP vector \(\bar{a} \cdot 2^{a\_exp}\) is given by \(a\_exp = b\_exp + b\_shr\).
 Operation Performed:
 \[\begin{split}\begin{flalign*} & b_k' \leftarrow sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & a_k \leftarrow \begin{cases} lower\_bound & b_k' \le lower\_bound \\ upper\_bound & b_k' \ge upper\_bound \\ b_k' & otherwise \end{cases} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the output vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp + b\_shr\).
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
lower_bound – [in] Lower bound of clipping range
upper_bound – [in] Upper bound of clipping range
b_shr – [in] Arithmetic right-shift applied to elements of \(\bar b\) prior to clipping
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of output vector \(\bar a\)
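The order of operations in the clip (shift first, then compare against the bounds) can be sketched in plain C as below. ref_clip_s16, sat16 and shr_floor are hypothetical names introduced for illustration, and clamping to the full int16_t range is an assumption about the saturation step.

```c
#include <stdint.h>

/* Assumed saturation: clamp to the int16_t range. */
static int16_t sat16(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* floor(x * 2^-shr); a negative shr acts as a left-shift. Assumes |shr| <= 31. */
static int64_t shr_floor(int32_t x, int shr)
{
    if (shr < 0) return (int64_t)x * ((int64_t)1 << -shr);
    int64_t d = (int64_t)1 << shr;
    int64_t q = x / d;
    if (x % d != 0 && x < 0) q -= 1;
    return q;
}

/* Hypothetical model: the bounds are compared against the shifted value. */
void ref_clip_s16(int16_t a[], const int16_t b[], unsigned length,
                  int16_t lower_bound, int16_t upper_bound, int b_shr)
{
    for (unsigned k = 0; k < length; k++) {
        int16_t bk = sat16(shr_floor(b[k], b_shr));
        if (bk <= lower_bound)      a[k] = lower_bound;
        else if (bk >= upper_bound) a[k] = upper_bound;
        else                        a[k] = bk;
    }
}
```

Because the comparison happens after the shift, the bounds are expressed in terms of the output exponent \(a\_exp = b\_exp + b\_shr\), not the input exponent.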

int64_t vect_s16_dot(const int16_t b[], const int16_t c[], const unsigned length)
Compute the inner product of two 16-bit vectors.
b[] and c[] represent the 16-bit vectors \(\bar b\) and \(\bar c\) respectively. Each must begin at a word-aligned address.
length is the number of elements in each of the vectors.
 Operation Performed:
 \[\begin{flalign*} a \leftarrow \sum_{k=0}^{length-1}\left( b_k \cdot c_k \right) && \end{flalign*}\]
 Block Floating-Point

If \(\bar b\) and \(\bar c\) are the mantissas of the BFP vectors \( \bar{b} \cdot 2^{b\_exp}\) and \(\bar{c}\cdot 2^{c\_exp}\), then result \(a\) is the mantissa of the result \(a \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp + c\_exp\).
If needed, the bit-depth of \(a\) can then be reduced to 16 or 32 bits to get a new result \(a' \cdot 2^{a\_exp'}\) where \(a' = a \cdot 2^{-a\_shr}\) and \(a\_exp' = a\_exp + a\_shr\).
 Notes

The sum \(a\) is accumulated simultaneously into 16 48-bit accumulators which are summed together at the final step. So long as length is less than roughly 2 million, no overflow or saturation of the resulting sum is possible.
 Parameters:
b – [in] Input vector \(\bar b\)
c – [in] Input vector \(\bar c\)
length – [in] Number of elements in vectors \(\bar b\) and \(\bar c\)
 Throws ET_LOAD_STORE:
Raised if b or c is not word-aligned (See Note: Vector Alignment)
 Returns:
\(a\), the inner product of vectors \(\bar b\) and \(\bar c\).
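To make the exponent bookkeeping concrete, here is a minimal host-side sketch of the inner product and of choosing a shift for reducing the result to 32 bits. ref_dot_s16 and ref_shr_to_s32 are hypothetical helpers, not library functions, and the sketch accumulates directly in a 64-bit integer instead of modelling the hardware accumulators.

```c
#include <stdint.h>

/* Hypothetical model of the inner product, accumulated in 64 bits. */
int64_t ref_dot_s16(const int16_t b[], const int16_t c[], unsigned length)
{
    int64_t acc = 0;
    for (unsigned k = 0; k < length; k++)
        acc += (int32_t)b[k] * (int32_t)c[k];   /* each product fits in 32 bits */
    return acc;
}

/* Smallest a_shr >= 0 such that a / 2^a_shr, truncated toward zero,
 * fits in an int32_t. */
unsigned ref_shr_to_s32(int64_t a)
{
    unsigned a_shr = 0;
    while (a > INT32_MAX || a < INT32_MIN) {
        a /= 2;
        a_shr++;
    }
    return a_shr;
}
```

With the shift found this way, the reduced mantissa is associated with the exponent \(a\_exp' = b\_exp + c\_exp + a\_shr\).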

int32_t vect_s16_energy(const int16_t b[], const unsigned length, const right_shift_t b_shr)
Calculate the energy (sum of squares of elements) of a 16-bit vector.
b[] represents the 16-bit vector \(\bar b\). b[] must begin at a word-aligned address.
length is the number of elements in \(\bar b\).
b_shr is the signed arithmetic right-shift applied to elements of \(\bar b\). b_shr should be chosen to avoid the possibility of saturation. See the note below.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & b_k' \leftarrow sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & a \leftarrow \sum_{k=0}^{length-1} (b_k')^2 && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of the BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the floating-point result is \(a \cdot 2^{a\_exp}\), where the 32-bit mantissa \(a\) is returned by this function, and \(a\_exp = 2 \cdot (b\_exp + b\_shr) \).
 Additional Details

If \(\bar b\) has \(b\_hr\) bits of headroom, then each product \((b_k')^2\) can be a maximum of \( 2^{30 - 2 \cdot (b\_hr + b\_shr)}\). So long as length is less than \(1 + 2\cdot (b\_hr + b\_shr) \), saturation of the sum should not be possible. Each increase of \(b\_shr\) by \(1\) doubles the number of elements that can be summed without risk of overflow.
If the caller’s mantissa vector is longer than that, the full result can be found by calling this function multiple times for partial results on subsequences of the input, and adding the results in user code.
In many situations the caller may have a priori knowledge that saturation is impossible (or very nearly so), in which case this guideline may be disregarded. However, such situations are application-specific and are well beyond the scope of this documentation, and as such are left to the user’s discretion.
 Parameters:
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in \(\bar b\)
b_shr – [in] Right-shift applied to \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if b is not word-aligned (See Note: Vector Alignment)
 Returns:
32-bit mantissa of vector \(\bar b\)’s energy
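The advice about splitting a long vector into subsequences and adding the partial results in user code might look roughly like the sketch below. ref_energy_s16 stands in for the library call and, like ref_energy_chunked, sat16 and shr_floor, is a hypothetical host-side model. The sketch assumes each chunk is short enough for its partial sum to fit in 32 bits, and that chunk is non-zero. With the real function, each subsequence would also need to start at a word-aligned address, which an even chunk length would preserve if a word is 4 bytes (an assumption here).

```c
#include <stdint.h>

static int16_t sat16(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* floor(x * 2^-shr); a negative shr acts as a left-shift. Assumes |shr| <= 31. */
static int64_t shr_floor(int32_t x, int shr)
{
    if (shr < 0) return (int64_t)x * ((int64_t)1 << -shr);
    int64_t d = (int64_t)1 << shr;
    int64_t q = x / d;
    if (x % d != 0 && x < 0) q -= 1;
    return q;
}

/* Hypothetical stand-in for one energy call on a short subsequence.
 * Assumes length and b_shr were chosen so the sum fits in an int32_t. */
static int32_t ref_energy_s16(const int16_t b[], unsigned length, int b_shr)
{
    int64_t acc = 0;
    for (unsigned k = 0; k < length; k++) {
        int64_t bk = sat16(shr_floor(b[k], b_shr));
        acc += bk * bk;
    }
    return (int32_t)acc;
}

/* Sum partial energies over chunks, widening to 64 bits in user code. */
int64_t ref_energy_chunked(const int16_t b[], unsigned length,
                           unsigned chunk, int b_shr)
{
    int64_t total = 0;
    for (unsigned start = 0; start < length; start += chunk) {
        unsigned n = (length - start < chunk) ? (length - start) : chunk;
        total += ref_energy_s16(&b[start], n, b_shr);
    }
    return total;
}
```

Every partial result shares the exponent \(2 \cdot (b\_exp + b\_shr)\), so the partial mantissas can be added directly.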

headroom_t vect_s16_headroom(const int16_t b[], const unsigned length)
Calculate the headroom of a 16-bit vector.
The headroom of an N-bit integer is the number of bits that the integer’s value may be left-shifted without any information being lost. Equivalently, it is one less than the number of leading sign bits.
The headroom of an int16_t array is the minimum of the headroom of each of its int16_t elements.
This function efficiently traverses the elements of b[] to determine its headroom.
b[] represents the 16-bit vector \(\bar b\). b[] must begin at a word-aligned address.
length is the number of elements in b[].
 Operation Performed:
 \[\begin{flalign*} a \leftarrow min\!\{ HR_{16}\left(b_0\right), HR_{16}\left(b_1\right), ..., HR_{16}\left(b_{length-1}\right) \} && \end{flalign*}\]
 Parameters:
b – [in] Input vector \(\bar b\)
length – [in] The number of elements in vector \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if b is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of vector \(\bar b\)
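The definition of headroom given above can be checked with a short host-side sketch. hr16 and ref_headroom_s16 are hypothetical names; the loop counts how many times a value can be doubled while still fitting in an int16_t, which matches "the number of bits the value may be left-shifted without losing information". Returning 15 for an empty vector is a choice made for this sketch only.

```c
#include <stdint.h>

/* Headroom of one 16-bit value: number of lossless left-shifts, at most 15. */
static unsigned hr16(int16_t x)
{
    unsigned hr = 0;
    int32_t v = x;
    while (hr < 15 && v * 2 >= INT16_MIN && v * 2 <= INT16_MAX) {
        v *= 2;
        hr++;
    }
    return hr;
}

/* Hypothetical model: vector headroom is the minimum element headroom. */
unsigned ref_headroom_s16(const int16_t b[], unsigned length)
{
    unsigned hr = 15;
    for (unsigned k = 0; k < length; k++) {
        unsigned h = hr16(b[k]);
        if (h < hr) hr = h;
    }
    return hr;
}
```

For the vector {100, -3, 2000} the element headrooms are 8, 13 and 4, so the vector headroom is 4.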

void vect_s16_inverse(int16_t a[], const int16_t b[], const unsigned length, const unsigned scale)
Compute the inverse of elements of a 16-bit vector.
a[] and b[] represent the 16-bit mantissa vectors \(\bar a\) and \(\bar b\) respectively. This operation can be performed safely in-place on b[].
length is the number of elements in each of the vectors.
scale is a scaling parameter used to maximize the precision of the result.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a_k \leftarrow \lfloor\frac{2^{scale}}{b_k}\rfloor \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = -scale - b\_exp\).
The function vect_s16_inverse_prepare() can be used to obtain values for \(a\_exp\) and \(scale\).
See also
vect_s16_inverse_prepare
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
scale – [in] Scale factor applied to dividend when computing inverse
 Returns:
Headroom of output vector \(\bar a\)
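A small host-side sketch of the inverse operation is given below. ref_inverse_s16 and sat16 are hypothetical names, and the sketch assumes scale is at most 30, every b[k] is non-zero, and that clamping to the int16_t range is acceptable where the quotient does not fit. None of these assumptions are taken from the library. Since \(1 / (b_k \cdot 2^{b\_exp}) = (2^{scale} / b_k) \cdot 2^{-scale - b\_exp}\), the mantissas produced by this model carry the exponent \(-scale - b\_exp\).

```c
#include <stdint.h>

static int16_t sat16(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Hypothetical model of a_k = floor(2^scale / b_k).
 * Assumes scale <= 30 and b[k] != 0 for all k. */
void ref_inverse_s16(int16_t a[], const int16_t b[], unsigned length,
                     unsigned scale)
{
    const int32_t num = (int32_t)1 << scale;
    for (unsigned k = 0; k < length; k++) {
        int32_t q = num / b[k];                    /* truncates toward zero */
        if (num % b[k] != 0 && b[k] < 0) q -= 1;   /* floor for negative quotients */
        a[k] = sat16(q);
    }
}
```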

int16_t vect_s16_max(const int16_t b[], const unsigned length)
Find the maximum value in a 16-bit vector.
b[] represents the 16-bit vector \(\bar b\). It must begin at a word-aligned address.
length is the number of elements in \(\bar b\).
 Operation Performed:
 \[\begin{flalign*} a \leftarrow max\{ b_0, b_1, ..., b_{length-1} \} && \end{flalign*}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the returned value \(a\) is the 16-bit mantissa of floating-point value \(a \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp\).
 Parameters:
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if b is not word-aligned (See Note: Vector Alignment)
 Returns:
Maximum value from \(\bar b\)

headroom_t vect_s16_max_elementwise(int16_t a[], const int16_t b[], const int16_t c[], const unsigned length, const right_shift_t b_shr, const right_shift_t c_shr)
Get the element-wise maximum of two 16-bit vectors.
a[], b[] and c[] represent the 16-bit mantissa vectors \(\bar a\), \(\bar b\) and \(\bar c\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[], but not on c[].
length is the number of elements in each of the vectors.
b_shr and c_shr are the signed arithmetic right-shifts applied to each element of \(\bar b\) and \(\bar c\) respectively.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & b_k' \leftarrow sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & c_k' \leftarrow sat_{16}(\lfloor c_k \cdot 2^{-c\_shr} \rfloor) \\ & a_k \leftarrow max(b_k', c_k') \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) and \(\bar c\) are the mantissas of BFP vectors \( \bar{b} \cdot 2^{b\_exp} \) and \(\bar{c} \cdot 2^{c\_exp}\), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp + b\_shr = c\_exp + c\_shr\).
The function vect_2vec_prepare() can be used to obtain values for \(a\_exp\), \(b\_shr\) and \(c\_shr\) based on the input exponents \(b\_exp\) and \(c\_exp\) and the input headrooms \(b\_hr\) and \(c\_hr\).
Warning
For correct operation, this function requires at least 1 bit of headroom in each mantissa vector after the shifts have been applied.
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
c – [in] Input vector \(\bar c\)
length – [in] Number of elements in vectors \(\bar a\), \(\bar b\) and \(\bar c\)
b_shr – [in] Right-shift applied to \(\bar b\)
c_shr – [in] Right-shift applied to \(\bar c\)
 Throws ET_LOAD_STORE:
Raised if a, b or c is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of vector \(\bar a\)

int16_t vect_s16_min(const int16_t b[], const unsigned length)
Find the minimum value in a 16-bit vector.
b[] represents the 16-bit vector \(\bar b\). It must begin at a word-aligned address.
length is the number of elements in \(\bar b\).
 Operation Performed:
 \[\begin{flalign*} a \leftarrow min\{ b_0, b_1, ..., b_{length-1} \} && \end{flalign*}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the returned value \(a\) is the 16-bit mantissa of floating-point value \(a \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp\).
 Parameters:
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if b is not word-aligned (See Note: Vector Alignment)
 Returns:
Minimum value from \(\bar b\)

headroom_t vect_s16_min_elementwise(int16_t a[], const int16_t b[], const int16_t c[], const unsigned length, const right_shift_t b_shr, const right_shift_t c_shr)
Get the element-wise minimum of two 16-bit vectors.
a[], b[] and c[] represent the 16-bit mantissa vectors \(\bar a\), \(\bar b\) and \(\bar c\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[], but not on c[].
length is the number of elements in each of the vectors.
b_shr and c_shr are the signed arithmetic right-shifts applied to each element of \(\bar b\) and \(\bar c\) respectively.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & b_k' \leftarrow sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & c_k' \leftarrow sat_{16}(\lfloor c_k \cdot 2^{-c\_shr} \rfloor) \\ & a_k \leftarrow min(b_k', c_k') \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) and \(\bar c\) are the mantissas of BFP vectors \( \bar{b} \cdot 2^{b\_exp} \) and \(\bar{c} \cdot 2^{c\_exp}\), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp + b\_shr = c\_exp + c\_shr\).
The function vect_2vec_prepare() can be used to obtain values for \(a\_exp\), \(b\_shr\) and \(c\_shr\) based on the input exponents \(b\_exp\) and \(c\_exp\) and the input headrooms \(b\_hr\) and \(c\_hr\).
Warning
For correct operation, this function requires at least 1 bit of headroom in each mantissa vector after the shifts have been applied.
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
c – [in] Input vector \(\bar c\)
length – [in] Number of elements in vectors \(\bar a\), \(\bar b\) and \(\bar c\)
b_shr – [in] Right-shift applied to \(\bar b\)
c_shr – [in] Right-shift applied to \(\bar c\)
 Throws ET_LOAD_STORE:
Raised if a, b or c is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of vector \(\bar a\)

headroom_t vect_s16_macc(int16_t acc[], const int16_t b[], const int16_t c[], const unsigned length, const right_shift_t acc_shr, const right_shift_t bc_sat)
Multiply one 16-bit vector element-wise by another, and add the result to an accumulator.
acc[] represents the 16-bit accumulator mantissa vector \(\bar a\). Each \(a_k\) is acc[k].
b[] and c[] represent the 16-bit input mantissa vectors \(\bar b\) and \(\bar c\), where each \(b_k\) is b[k] and each \(c_k\) is c[k].
Each of the input vectors must begin at a word-aligned address.
length is the number of elements in each of the vectors.
acc_shr is the signed arithmetic right-shift applied to the accumulators \(a_k\) prior to accumulation.
bc_sat is the unsigned arithmetic right-shift applied to the product of \(b_k\) and \(c_k\) before accumulation.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & v_k \leftarrow round( sat_{16}( b_k \cdot c_k \cdot 2^{-bc\_sat} ) ) \\ & \hat{a}_k \leftarrow sat_{16}( a_k \cdot 2^{-acc\_shr} ) \\ & a_k \leftarrow sat_{16}( \hat{a}_k + v_k ) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If inputs \(\bar b\) and \(\bar c\) are the mantissas of BFP vectors \( \bar{b} \cdot 2^{b\_exp} \) and \(\bar{c} \cdot 2^{c\_exp}\), and input \(\bar a\) is the accumulator BFP vector \(\bar{a} \cdot 2^{a\_exp}\), then the output values of \(\bar a\) have the exponent \(2^{a\_exp + acc\_shr}\).
For accumulation to make sense mathematically, \(bc\_sat\) must be chosen such that \( a\_exp + acc\_shr = b\_exp + c\_exp + bc\_sat \).
The function vect_s16_macc_prepare() can be used to obtain values for \(a\_exp\), \(acc\_shr\) and \(bc\_sat\) based on the input exponents \(a\_exp\), \(b\_exp\) and \(c\_exp\) and the input headrooms \(a\_hr\), \(b\_hr\) and \(c\_hr\).
See also
vect_s16_macc_prepare
 Parameters:
acc – [inout] Accumulator \(\bar a\)
b – [in] Input vector \(\bar b\)
c – [in] Input vector \(\bar c\)
length – [in] Number of elements in vectors \(\bar a\), \(\bar b\) and \(\bar c\)
acc_shr – [in] Signed arithmetic right-shift applied to accumulator elements.
bc_sat – [in] Unsigned arithmetic right-shift applied to the products of elements \(b_k\) and \(c_k\)
 Throws ET_LOAD_STORE:
Raised if acc, b or c is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of the output vector \(\bar a\)
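The multiply-accumulate steps can be modelled on a host with the sketch below. ref_macc_s16 and its helpers are hypothetical names, not library functions. The sketch makes several assumptions that the text above does not settle: rounding is taken to be round-half-up applied before saturation, the accumulator shift is taken to floor, and saturation clamps to the full int16_t range.

```c
#include <stdint.h>

static int16_t sat16(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* floor(x / 2^shr) for shr >= 0, portable for negative x. Assumes shr <= 31. */
static int64_t floor_div_pow2(int64_t x, unsigned shr)
{
    int64_t d = (int64_t)1 << shr;
    int64_t q = x / d;
    if (x % d != 0 && x < 0) q -= 1;
    return q;
}

/* Assumed rounding: add half an LSB, then floor. */
static int64_t shr_round(int64_t x, unsigned shr)
{
    if (shr == 0) return x;
    return floor_div_pow2(x + ((int64_t)1 << (shr - 1)), shr);
}

/* Signed shift: a negative shr acts as a left-shift. Assumes |shr| <= 31. */
static int64_t shr_signed(int64_t x, int shr)
{
    if (shr < 0) return x * ((int64_t)1 << -shr);
    return floor_div_pow2(x, (unsigned)shr);
}

/* Hypothetical model of the accumulate equations. */
void ref_macc_s16(int16_t acc[], const int16_t b[], const int16_t c[],
                  unsigned length, int acc_shr, unsigned bc_sat)
{
    for (unsigned k = 0; k < length; k++) {
        int64_t p = (int64_t)b[k] * c[k];
        int16_t v = sat16(shr_round(p, bc_sat));
        int16_t a_hat = sat16(shr_signed(acc[k], acc_shr));
        acc[k] = sat16((int64_t)a_hat + v);
    }
}
```

As an example of the exponent constraint, with \(a\_exp = -11\), \(b\_exp = -8\) and \(c\_exp = -6\), the choice \(acc\_shr = 1\) and \(bc\_sat = 4\) gives \(-11 + 1 = -8 - 6 + 4 = -10\) on both sides.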

headroom_t vect_s16_nmacc(int16_t acc[], const int16_t b[], const int16_t c[], const unsigned length, const right_shift_t acc_shr, const right_shift_t bc_sat)
Multiply one 16-bit vector element-wise by another, and subtract the result from an accumulator.
acc[] represents the 16-bit accumulator mantissa vector \(\bar a\). Each \(a_k\) is acc[k].
b[] and c[] represent the 16-bit input mantissa vectors \(\bar b\) and \(\bar c\), where each \(b_k\) is b[k] and each \(c_k\) is c[k].
Each of the input vectors must begin at a word-aligned address.
length is the number of elements in each of the vectors.
acc_shr is the signed arithmetic right-shift applied to the accumulators \(a_k\) prior to accumulation.
bc_sat is the unsigned arithmetic right-shift applied to the product of \(b_k\) and \(c_k\) before accumulation.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & v_k \leftarrow round( sat_{16}( b_k \cdot c_k \cdot 2^{-bc\_sat} ) ) \\ & \hat{a}_k \leftarrow sat_{16}( a_k \cdot 2^{-acc\_shr} ) \\ & a_k \leftarrow sat_{16}( \hat{a}_k - v_k ) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If inputs \(\bar b\) and \(\bar c\) are the mantissas of BFP vectors \( \bar{b} \cdot 2^{b\_exp} \) and \(\bar{c} \cdot 2^{c\_exp}\), and input \(\bar a\) is the accumulator BFP vector \(\bar{a} \cdot 2^{a\_exp}\), then the output values of \(\bar a\) have the exponent \(2^{a\_exp + acc\_shr}\).
For accumulation to make sense mathematically, \(bc\_sat\) must be chosen such that \( a\_exp + acc\_shr = b\_exp + c\_exp + bc\_sat \).
The function vect_s16_nmacc_prepare() can be used to obtain values for \(a\_exp\), \(acc\_shr\) and \(bc\_sat\) based on the input exponents \(a\_exp\), \(b\_exp\) and \(c\_exp\) and the input headrooms \(a\_hr\), \(b\_hr\) and \(c\_hr\).
See also
vect_s16_nmacc_prepare
 Parameters:
acc – [inout] Accumulator \(\bar a\)
b – [in] Input vector \(\bar b\)
c – [in] Input vector \(\bar c\)
length – [in] Number of elements in vectors \(\bar a\), \(\bar b\) and \(\bar c\)
acc_shr – [in] Signed arithmetic right-shift applied to accumulator elements.
bc_sat – [in] Unsigned arithmetic right-shift applied to the products of elements \(b_k\) and \(c_k\)
 Throws ET_LOAD_STORE:
Raised if acc, b or c is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of the output vector \(\bar a\)

headroom_t vect_s16_mul(int16_t a[], const int16_t b[], const int16_t c[], const unsigned length, const right_shift_t a_shr)
Multiply two 16-bit vectors together element-wise.
a[], b[] and c[] represent the 16-bit vectors \(\bar a\), \(\bar b\) and \(\bar c\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[] or c[].
length is the number of elements in each of the vectors.
a_shr is an unsigned arithmetic right-shift applied to the 32-bit accumulators holding the penultimate results.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a_k' \leftarrow b_k \cdot c_k \\ & a_k \leftarrow sat_{16}(round(a_k' \cdot 2^{-a\_shr})) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) and \(\bar c\) are the mantissas of BFP vectors \( \bar{b} \cdot 2^{b\_exp} \) and \(\bar{c} \cdot 2^{c\_exp}\), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp + c\_exp + a\_shr\).
The function vect_s16_mul_prepare() can be used to obtain values for \(a\_exp\) and \(a\_shr\) based on the input exponents \(b\_exp\) and \(c\_exp\) and the input headrooms \(b\_hr\) and \(c\_hr\).
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
c – [in] Input vector \(\bar c\)
length – [in] Number of elements in vectors \(\bar a\), \(\bar b\) and \(\bar c\)
a_shr – [in] Right-shift applied to 32-bit products
 Throws ET_LOAD_STORE:
Raised if a, b or c is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of vector \(\bar a\)
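The multiply, shift, round and saturate sequence can be sketched on a host as follows. ref_mul_s16, sat16 and shr_round are hypothetical names, and round-half-up is an assumption made for this sketch, since the rounding mode is not specified in the text above.

```c
#include <stdint.h>

static int16_t sat16(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Assumed rounding: add half an LSB, then floor. Assumes shr <= 31. */
static int64_t shr_round(int64_t x, unsigned shr)
{
    if (shr == 0) return x;
    int64_t y = x + ((int64_t)1 << (shr - 1));
    int64_t d = (int64_t)1 << shr;
    int64_t q = y / d;
    if (y % d != 0 && y < 0) q -= 1;   /* floor for negative values */
    return q;
}

/* Hypothetical model: 32-bit product, rounding right-shift, 16-bit saturation. */
void ref_mul_s16(int16_t a[], const int16_t b[], const int16_t c[],
                 unsigned length, unsigned a_shr)
{
    for (unsigned k = 0; k < length; k++) {
        int64_t p = (int64_t)b[k] * c[k];
        a[k] = sat16(shr_round(p, a_shr));
    }
}
```

The product of two 16-bit mantissas needs up to 31 bits, so a_shr is what brings the result back into the 16-bit range; the exponent grows by the same amount.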

headroom_t vect_s16_rect(int16_t a[], const int16_t b[], const unsigned length)
Rectify the elements of a 16-bit vector.
Rectification ensures that all outputs are non-negative, changing negative values to 0.
a[] and b[] represent the 16-bit vectors \(\bar a\) and \(\bar b\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[].
length is the number of elements in each of the vectors.
Each output element a[k] is set to the value of the corresponding input element b[k] if it is positive, and a[k] is set to zero otherwise.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a_k \leftarrow \begin{cases} b_k & b_k > 0 \\ 0 & b_k \leq 0\end{cases} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the output vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp\).
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of the output vector \(\bar a\).

headroom_t vect_s16_scale(int16_t a[], const int16_t b[], const unsigned length, const int16_t c, const right_shift_t a_shr)
Multiply a 16-bit vector by a 16-bit scalar.
a[] and b[] represent the 16-bit vectors \(\bar a\) and \(\bar b\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[].
length is the number of elements in each of the vectors.
c is the 16-bit scalar \(c\) by which elements of \(\bar b\) are multiplied.
a_shr is an unsigned arithmetic right-shift applied to the 32-bit accumulators holding the penultimate results.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a_k' \leftarrow b_k \cdot c \\ & a_k \leftarrow sat_{16}(round(a_k' \cdot 2^{-a\_shr})) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of a BFP vector \( \bar{b} \cdot 2^{b\_exp} \) and \(c\) is the mantissa of floating-point value \(c \cdot 2^{c\_exp}\), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp + c\_exp + a\_shr\).
The function vect_s16_scale_prepare() can be used to obtain values for \(a\_exp\) and \(a\_shr\) based on the input exponents \(b\_exp\) and \(c\_exp\) and the input headrooms \(b\_hr\) and \(c\_hr\).
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
c – [in] Input scalar \(c\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
a_shr – [in] Right-shift applied to 32-bit products
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of vector \(\bar a\)

void vect_s16_set(int16_t a[], const int16_t b, const unsigned length)
Set all elements of a 16-bit vector to the specified value.
a[] represents the 16-bit vector \(\bar a\). It must begin at a word-aligned address.
b is the value elements of \(\bar a\) are set to.
length is the number of elements in a[].
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a_k \leftarrow b \\ & \qquad\text{for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(b\) is the mantissa of floating-point value \(b \cdot 2^{b\_exp}\), then the output vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp\).
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input value \(b\)
length – [in] Number of elements in vector \(\bar a\)
 Throws ET_LOAD_STORE:
Raised if a is not word-aligned (See Note: Vector Alignment)

headroom_t vect_s16_shl(int16_t a[], const int16_t b[], const unsigned length, const left_shift_t b_shl)
Left-shift the elements of a 16-bit vector by a specified number of bits.
a[] and b[] represent the 16-bit vectors \(\bar a\) and \(\bar b\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[].
length is the number of elements in vectors \(\bar a\) and \(\bar b\).
b_shl is the signed arithmetic left-shift applied to each element of \(\bar b\).
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a_k \leftarrow sat_{16}(\lfloor b_k \cdot 2^{b\_shl} \rfloor) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of a BFP vector \( \bar{b} \cdot 2^{b\_exp} \), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(\bar{a} = \bar{b} \cdot 2^{b\_shl}\) and \(a\_exp = b\_exp\).
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
b_shl – [in] Arithmetic left-shift applied to elements of \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of output vector \(\bar a\)
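The saturating shift and the returned headroom can be illustrated with a small portable model. This is only a sketch of the documented arithmetic, not the library's VPU implementation: the names sat16, headroom16 and shl_model are hypothetical, and clamping to the full int16_t range is an assumption (the hardware's saturation bounds may differ).

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a wide intermediate to the int16_t range (assumed saturation bounds). */
static int16_t sat16(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Number of bits x can be left-shifted without overflowing int16_t (max 15). */
static unsigned headroom16(int16_t x)
{
    unsigned hr = 0;
    int32_t v = x;
    while (hr < 15 && v * 2 >= INT16_MIN && v * 2 <= INT16_MAX) {
        v *= 2;
        hr++;
    }
    return hr;
}

/* Hypothetical model: a_k = sat16(floor(b_k * 2^b_shl)); returns the headroom of a[]. */
static unsigned shl_model(int16_t a[], const int16_t b[], unsigned length, int b_shl)
{
    assert(b_shl >= -15 && b_shl <= 15);
    unsigned hr = 15;
    for (unsigned k = 0; k < length; k++) {
        int64_t v = b[k];
        if (b_shl >= 0) {
            v *= (int64_t)1 << b_shl;
        } else {
            const int64_t d = (int64_t)1 << -b_shl;
            const int64_t q = v / d;
            v = (v % d != 0 && v < 0) ? q - 1 : q;   /* round toward -infinity */
        }
        a[k] = sat16(v);                             /* b[k] was read first, so in-place use is safe */
        const unsigned h = headroom16(a[k]);
        if (h < hr) hr = h;
    }
    return hr;
}
```

The vector headroom is the minimum over the elements, so a single saturated element reduces it to 0.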

headroom_t vect_s16_shr(int16_t a[], const int16_t b[], const unsigned length, const right_shift_t b_shr)#
Right-shift the elements of a 16-bit vector by a specified number of bits.
a[] and b[] represent the 16-bit vectors \(\bar a\) and \(\bar b\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[].
length is the number of elements in vectors \(\bar a\) and \(\bar b\).
b_shr is the signed arithmetic right-shift applied to each element of \(\bar b\).
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a_k \leftarrow sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of a BFP vector \( \bar{b} \cdot 2^{b\_exp} \), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(\bar{a} = \bar{b} \cdot 2^{-b\_shr}\) and \(a\_exp = b\_exp\).
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
b_shr – [in] Arithmetic right-shift applied to elements of \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of output vector \(\bar a\)
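Two details of the signed right-shift are easy to miss: the floor rounds negative values toward negative infinity, and a negative b_shr acts as a left-shift that can saturate. The scalar sketch below models one element under those rules. The name shr_model is hypothetical and the clamp to the full int16_t range is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a_k = sat16(floor(b_k * 2^(-b_shr))). */
static int16_t shr_model(int16_t b, int b_shr)
{
    assert(b_shr >= -15 && b_shr <= 15);
    int64_t v = b;
    if (b_shr >= 0) {
        /* Portable floor division; plain C division truncates toward zero. */
        const int64_t d = (int64_t)1 << b_shr;
        const int64_t q = v / d;
        v = (v % d != 0 && v < 0) ? q - 1 : q;
    } else {
        /* A negative right-shift is a left-shift. */
        v *= (int64_t)1 << -b_shr;
    }
    if (v > INT16_MAX) v = INT16_MAX;   /* assumed saturation bounds */
    if (v < INT16_MIN) v = INT16_MIN;
    return (int16_t)v;
}
```

For example, shifting -3 right by one bit gives -2 rather than -1, because floor(-1.5) is -2.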

headroom_t vect_s16_sqrt(int16_t a[], const int16_t b[], const unsigned length, const right_shift_t b_shr, const unsigned depth)#
Compute the square roots of elements of a 16-bit vector.
a[] and b[] represent the 16-bit vectors \(\bar a\) and \(\bar b\) respectively. Each vector must begin at a word-aligned address. This operation can be performed safely in-place on b[].
length is the number of elements in each of the vectors.
b_shr is the signed arithmetic right-shift applied to elements of \(\bar b\).
depth
is the number of most significant bits to compute for each \(a_k\). For example, a depth value of 8 computes only the 8 most significant bits of the result, leaving the remaining bits as 0. The maximum value for this parameter is VECT_SQRT_S16_MAX_DEPTH (15). The time cost of this operation is approximately proportional to the number of bits computed.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & b_k' \leftarrow sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & a_k \leftarrow \begin{cases} \sqrt{ b_k' } & b_k' \geq 0 \\ 0 & \text{otherwise}\end{cases} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) \\ & \qquad\text{ where } \sqrt{\cdot} \text{ computes the most significant } depth \text{ bits of the square root.} && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = (b\_exp + b\_shr - 14)/2\).
Because exponents must be integers, \(b\_exp + b\_shr\) must be even.
The function vect_s16_sqrt_prepare() can be used to obtain values for \(a\_exp\) and \(b\_shr\) based on the input exponent \(b\_exp\) and headroom \(b\_hr\).
 Notes
This function assumes roots are real. Negative input elements will result in corresponding outputs of 0.
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
b_shr – [in] Right-shift applied to \(\bar b\)
depth – [in] Number of bits of each output value to compute
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of output vector \(\bar a\)
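The exponent rule implies that each output mantissa approximates \(\sqrt{b_k' \cdot 2^{14}}\). The sketch below models that with a bit-by-bit search that stops after depth bits. It is not the library's implementation: sqrt_s16_model and sqrt_out_exp are hypothetical names, and starting the search at bit 14 is an assumption about which bits count as most significant.

```c
#include <assert.h>
#include <stdint.h>

/* Depth-limited square root of one already-shifted mantissa b' (model only).
   The result approximates sqrt(b' * 2^14), which is the scaling implied by
   a_exp = (b_exp + b_shr - 14) / 2. */
static int16_t sqrt_s16_model(int16_t b_shifted, unsigned depth)
{
    assert(depth >= 1 && depth <= 15);
    if (b_shifted < 0) return 0;                 /* negative inputs give 0 */
    const int64_t target = (int64_t)b_shifted << 14;
    int32_t a = 0;
    for (unsigned i = 0; i < depth; i++) {
        /* Try setting the next bit, starting from bit 14 (assumed MSB). */
        const int32_t trial = a | (1 << (14 - i));
        if ((int64_t)trial * trial <= target) a = trial;
    }
    return (int16_t)a;
}

/* Output exponent for a given input exponent and shift; their sum must be even. */
static int sqrt_out_exp(int b_exp, int b_shr)
{
    assert(((b_exp + b_shr) % 2) == 0);
    return (b_exp + b_shr - 14) / 2;
}
```

With b_exp = -14 and b_shr = 0, the mantissa 0x4000 represents 1.0; the model returns 0x4000 with a_exp = -14, which is again 1.0. Lowering depth trades accuracy for fewer iterations: for b' = 2 the full-depth result is 181, while depth 8 stops at 128.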

headroom_t vect_s16_sub(int16_t a[], const int16_t b[], const int16_t c[], const unsigned length, const right_shift_t b_shr, const right_shift_t c_shr)#
Subtract one 16-bit BFP vector from another.
a[], b[] and c[] represent the 16-bit vectors \(\bar a\), \(\bar b\) and \(\bar c\) respectively. Each must begin at a word-aligned address. This operation can be performed safely in-place on b[] or c[].
length is the number of elements in each of the vectors.
b_shr and c_shr are the signed arithmetic right-shifts applied to each element of \(\bar b\) and \(\bar c\) respectively.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & b_k' = sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & c_k' = sat_{16}(\lfloor c_k \cdot 2^{-c\_shr} \rfloor) \\ & a_k \leftarrow sat_{16}\!\left( b_k' - c_k' \right) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) and \(\bar c\) are the mantissas of BFP vectors \( \bar{b} \cdot 2^{b\_exp} \) and \(\bar{c} \cdot 2^{c\_exp}\), then the resulting vector \(\bar a\) are the mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\).
In this case, \(b\_shr\) and \(c\_shr\) must be chosen so that \(a\_exp = b\_exp + b\_shr = c\_exp + c\_shr\). Adding or subtracting mantissas only makes sense if they are associated with the same exponent.
The function vect_s16_sub_prepare() can be used to obtain values for \(a\_exp\), \(b\_shr\) and \(c\_shr\) based on the input exponents \(b\_exp\) and \(c\_exp\) and the input headrooms \(b\_hr\) and \(c\_hr\).
See also
vect_s16_sub_prepare
 Parameters:
a – [out] Output vector \(\bar a\)
b – [in] Input vector \(\bar b\)
c – [in] Input vector \(\bar c\)
length – [in] Number of elements in vectors \(\bar a\), \(\bar b\) and \(\bar c\)
b_shr – [in] Right-shift applied to \(\bar b\)
c_shr – [in] Right-shift applied to \(\bar c\)
 Throws ET_LOAD_STORE:
Raised if a, b or c is not word-aligned (See Note: Vector Alignment)
 Returns:
Headroom of the output vector \(\bar a\).
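The exponent-matching requirement can be shown with a worked example. The sketch below is a portable model of the documented arithmetic, not the library routine. The names are hypothetical, the headroom return value is omitted, and the clamp to the full int16_t range is an assumption. In the test values used here, a_exp is simply set to the larger input exponent; in practice vect_s16_sub_prepare() makes that choice using the headrooms as well.

```c
#include <assert.h>
#include <stdint.h>

/* floor(x / 2^shr); a negative shr is a left shift. */
static int64_t floor_shr(int64_t x, int shr)
{
    if (shr < 0) return x * ((int64_t)1 << -shr);
    const int64_t d = (int64_t)1 << shr;
    const int64_t q = x / d;
    return (x % d != 0 && x < 0) ? q - 1 : q;
}

/* Clamp to the int16_t range (assumed saturation bounds). */
static int16_t sat16(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Hypothetical model: shift both inputs to a common exponent, then subtract. */
static void sub_model(int16_t a[], const int16_t b[], const int16_t c[],
                      unsigned length, int b_shr, int c_shr)
{
    assert(b_shr >= -15 && b_shr <= 15);
    assert(c_shr >= -15 && c_shr <= 15);
    for (unsigned k = 0; k < length; k++) {
        const int16_t bk = sat16(floor_shr(b[k], b_shr));
        const int16_t ck = sat16(floor_shr(c[k], c_shr));
        a[k] = sat16((int64_t)bk - ck);
    }
}
```

With b_exp = -10, c_exp = -12 and a_exp = -10, the shifts are b_shr = a_exp - b_exp = 0 and c_shr = a_exp - c_exp = 2. A mantissa of 1024 in each input then represents 1.0 and 0.25, and the output mantissa 768 represents 0.75.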

int32_t vect_s16_sum(const int16_t b[], const unsigned length)#
Get the sum of elements of a 16-bit vector.
b[] represents the 16-bit vector \(\bar b\). b[] must begin at a word-aligned address.
length is the number of elements in \(\bar b\).
 Operation Performed:
 \[\begin{flalign*} a \leftarrow \sum_{k=0}^{length-1} b_k && \end{flalign*}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the returned value \(a\) is the 32-bit mantissa of floating-point value \(a \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp\).
 Parameters:
b – [in] Input vector \(\bar b\)
length – [in] Number of elements in \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if b is not word-aligned (See Note: Vector Alignment)
 Returns:
The 32-bit sum \(a\)

unsigned chunk_s16_accumulate(split_acc_s32_t *acc, const int16_t b[VPU_INT16_EPV], const right_shift_t b_shr, const unsigned vpu_ctrl)#
Accumulate a 16-bit vector chunk into a 32-bit accumulator chunk.
16-bit vector chunk \(\bar b\) is shifted and accumulated into 32-bit accumulator vector chunk \(\bar a\) (acc). This function is used for efficiently accumulating multiple (possibly many) 16-bit vectors together.
The accumulator vector \(\bar a\) stores its elements across two 16-bit vector chunks, which corresponds to how the accumulators are stored internally across VPU registers vD and vR. See split_acc_s32_t for details about the accumulator structure.
The signed arithmetic right-shift b_shr is applied to \(\bar b\) prior to being accumulated into \(\bar a\). When \(\bar b\) and \(\bar a\) are the mantissas of block floating-point vectors, using b_shr allows those vectors to have different exponents. This is also important when this function is called periodically and each \(\bar b\) may have a different exponent.
b_shr must meet the condition -14 <= b_shr <= 14 or the behavior of this function is undefined.
 Operation Performed:
 \[\begin{flalign*} & a_k \leftarrow a_k + floor( \frac{b_k}{2^{\mathtt{b\_shr}}} ) && \end{flalign*}\]
The input vpu_ctrl tracks the VPU’s control register state during accumulation. In particular, it is used for keeping track of the headroom of the accumulator vector \(\bar a\). When beginning a sequence of accumulation calls, the value passed in should be initialized to VPU_INT16_CTRL_INIT. On completion, this function returns the updated VPU control register state, which should be passed in as vpu_ctrl on the next accumulation call.
 VPU Control Value
Each call to this function processes only a single ‘chunk’ (in 16-bit mode, a 16-element block) at a time, but the caller usually wants to know the headroom of a whole vector, which may comprise many such chunks. So vpu_ctrl is a value which persists through each of these calls to track the whole vector.
Once all chunks have been accumulated, the VPU_INT16_HEADROOM_FROM_CTRL() macro can be used to get the headroom of the accumulator vector. Note that this will produce a maximum value of 15.
If many vector chunks \(\bar b\) are accumulated into the same accumulators (when using block floating-point, it may be only a few accumulations if the exponent associated with \(\bar b\) is significantly larger than that associated with \(\bar a\)), saturation becomes possible.
 Accumulating Many Values
When saturation is possible, the user must monitor the headroom of \(\bar a\) (using the returned value and VPU_INT16_HEADROOM_FROM_CTRL()) to detect when there is no further headroom. As long as there is at least 1 bit of headroom, a call to this function cannot saturate.
Typically, when using block floating-point, this will be handled by:
Converting \(\bar a\) to a standard vector of int32_t using vect_s32_merge_accs()
Right-shifting the values of \(\bar a\) using vect_s32_shr()
Incrementing the exponent associated with \(\bar a\) by the number of bits right-shifted
Converting \(\bar a\) back into the split accumulator format using vect_s32_split_accs()
When accumulating, setting b_shr to the exponent associated with \(\bar a\) minus the exponent associated with \(\bar b\) (so that \(a\_exp = b\_exp + b\_shr\), consistent with the operation above) will automatically adjust for the new exponent of \(\bar a\).
 Parameters:
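acc – [inout] 32-bit split accumulator chunk \(\bar a\)
b – [in] 16-bit input vector chunk \(\bar b\)
b_shr – [in] Signed arithmetic right-shift applied to \(\bar b\) before accumulation
vpu_ctrl – [in] VPU control register state from the previous accumulation call (VPU_INT16_CTRL_INIT on the first call)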
 Throws ET_LOAD_STORE:
Raised if acc or b is not word-aligned (See Note: Vector Alignment)
 Returns:
Current state of VPU control register.
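The accumulate, monitor and renormalise cycle described above can be sketched with a simplified model. This is an illustration under stated assumptions, not the library's API: it uses a flat int32_t array in place of split_acc_s32_t, computes headroom directly instead of through vpu_ctrl, and all function names are hypothetical. The shift follows from the operation definition: adding \(b_k / 2^{b\_shr}\) to a mantissa with exponent a_exp requires b_shr = a_exp - b_exp.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_LEN 16   /* elements per chunk in 16-bit mode, per the text above */

/* Redundant leading sign bits of a 32-bit value (0 means no headroom). */
static unsigned headroom32(int32_t x)
{
    unsigned hr = 0;
    int64_t v = x;
    while (hr < 31 && v * 2 >= INT32_MIN && v * 2 <= INT32_MAX) {
        v *= 2;
        hr++;
    }
    return hr;
}

/* floor(x / 2^shr); a negative shr is a left shift. */
static int64_t floor_shr(int64_t x, int shr)
{
    if (shr < 0) return x * ((int64_t)1 << -shr);
    const int64_t d = (int64_t)1 << shr;
    const int64_t q = x / d;
    return (x % d != 0 && x < 0) ? q - 1 : q;
}

/* Model of one call: acc_k += floor(b_k / 2^b_shr); returns the headroom of acc. */
static unsigned accumulate_model(int32_t acc[CHUNK_LEN],
                                 const int16_t b[CHUNK_LEN], int b_shr)
{
    assert(b_shr >= -14 && b_shr <= 14);
    unsigned hr = 31;
    for (unsigned k = 0; k < CHUNK_LEN; k++) {
        int64_t sum = (int64_t)acc[k] + floor_shr(b[k], b_shr);
        if (sum > INT32_MAX) sum = INT32_MAX;   /* saturate instead of overflowing */
        if (sum < INT32_MIN) sum = INT32_MIN;
        acc[k] = (int32_t)sum;
        const unsigned h = headroom32(acc[k]);
        if (h < hr) hr = h;
    }
    return hr;
}

/* Renormalise: shift the accumulators right by 1 bit and raise the exponent to match. */
static void renormalise(int32_t acc[CHUNK_LEN], int *a_exp)
{
    for (unsigned k = 0; k < CHUNK_LEN; k++)
        acc[k] = (int32_t)floor_shr(acc[k], 1);
    *a_exp += 1;
}
```

A caller would invoke renormalise() whenever the returned headroom reaches 0 and recompute b_shr from the new a_exp before the next accumulation.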

void vect_s16_to_vect_s32(int32_t a[], const int16_t b[], const unsigned length)#
Convert a 16-bit vector to a 32-bit vector.
a[] represents the 32-bit output vector \(\bar a\).
b[] represents the 16-bit input vector \(\bar b\).
Each vector must begin at a word-aligned address.
length is the number of elements in each of the vectors.
 Operation Performed:
 \[\begin{split}\begin{flalign*} & a_k \leftarrow b_k \cdot 2^{8} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) && \end{flalign*}\end{split}\]
 Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the resulting vector \(\bar a\) are the 32-bit mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\). If \(a\_exp = b\_exp - 8\), then this operation has effectively not changed the values represented.
 Notes
The multiplication by \(2^8\) is an artifact of the VPU’s behavior. It turns out to be significantly more efficient to include the factor of \(2^8\). If this is unwanted, vect_s32_shr() can be used with a b_shr value of 8 to remove the scaling afterwards.
The headroom of output vector \(\bar a\) is not returned by this function. The headroom of the output is always 8 bits greater than the headroom of the input.
 Parameters:
a – [out] 32bit output vector \(\bar a\)
b – [in] 16bit input vector \(\bar b\)
length – [in] Number of elements in vectors \(\bar a\) and \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
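The built-in scaling can be checked with a short model. The function name to_s32_model is hypothetical and the code only mirrors the documented arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a_k = b_k * 2^8. */
static void to_s32_model(int32_t a[], const int16_t b[], unsigned length)
{
    assert(length == 0 || (a && b));
    for (unsigned k = 0; k < length; k++)
        a[k] = (int32_t)b[k] * 256;   /* multiply rather than shift, so negatives are well-defined */
}
```

Because every output is an exact multiple of 256, pairing the result with a_exp = b_exp - 8 leaves the represented values unchanged, and dividing by 256 recovers the original mantissas exactly.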

void vect_s16_extract_high_byte(int8_t a[], const int16_t b[], const unsigned len)#
Extract an 8-bit vector containing the most significant byte of a 16-bit vector.
This is a utility function used, for example, in optimizing mixed-width products. The most significant byte of each element is extracted (without rounding or saturation) and inserted into the output vector.
 Parameters:
a – [out] 8-bit output vector \(\bar a\)
b – [in] 16-bit input vector \(\bar b\)
len – [in] The number of elements in \(\bar a\) and \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)

void vect_s16_extract_low_byte(int8_t a[], const int16_t b[], const unsigned len)#
Extract an 8-bit vector containing the least significant byte of a 16-bit vector.
This is a utility function used, for example, in optimizing mixed-width products. The least significant byte of each element is extracted (without rounding or saturation) and inserted into the output vector.
 Parameters:
a – [out] 8-bit output vector \(\bar a\)
b – [in] 16-bit input vector \(\bar b\)
len – [in] The number of elements in \(\bar a\) and \(\bar b\)
 Throws ET_LOAD_STORE:
Raised if a or b is not word-aligned (See Note: Vector Alignment)
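The two byte-extraction functions can be modelled together. The sketch below uses hypothetical names and is not the library's implementation. It takes each byte from the two's-complement bit pattern and reinterprets it as a signed 8-bit value, with no rounding or saturation.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a raw byte as a signed 8-bit value without relying on
   implementation-defined narrowing conversions. */
static int8_t to_int8(uint8_t byte)
{
    return (int8_t)(byte < 128 ? (int)byte : (int)byte - 256);
}

/* Hypothetical model of both extractions: hi[] gets the most significant
   byte of each element, lo[] the least significant byte. */
static void extract_bytes_model(int8_t hi[], int8_t lo[],
                                const int16_t b[], unsigned len)
{
    assert(len == 0 || (hi && lo && b));
    for (unsigned k = 0; k < len; k++) {
        const uint16_t u = (uint16_t)b[k];   /* well-defined modular conversion */
        hi[k] = to_int8((uint8_t)(u >> 8));
        lo[k] = to_int8((uint8_t)(u & 0xFF));
    }
}
```

The original element can be rebuilt as hi * 256 plus the low byte taken as unsigned, which is the decomposition that mixed-width products rely on.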

headroom_t vect_s16_abs(int16_t a[], const int16_t b[], const unsigned length)#