16-bit Scalar API

group scalar_s16_api

Functions

int32_t s16_to_s32(exponent_t *a_exp, const int16_t b, const exponent_t b_exp, const unsigned remove_hr)

Convert a 16-bit floating-point scalar to a 32-bit floating-point scalar.

Converts a 16-bit floating-point scalar, represented by the 16-bit mantissa b and exponent b_exp, into a 32-bit floating-point scalar, represented by the 32-bit returned mantissa and output exponent a_exp.

If remove_hr is nonzero, the output mantissa is shifted left so that it has no headroom, and a_exp is adjusted so that the represented value is unchanged. Otherwise, the output mantissa has the same value as the input mantissa.

Parameters:
• a_exp [out]: Output exponent

• b [in]: 16-bit input mantissa

• b_exp [in]: Input exponent

• remove_hr [in]: Whether to remove headroom from the output mantissa

Returns:

32-bit output mantissa
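
The conversion described above can be sketched in portable C. This is only an illustration of the documented behaviour under stated assumptions, not the library's implementation: `sketch_s16_to_s32` is a hypothetical name, `exponent_t` is assumed to be a plain `int`, and leaving the exponent unchanged for a zero mantissa is a choice made here for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int exponent_t; /* assumption: the exponent type is a signed int */

/* Hypothetical reference sketch of the behaviour documented for s16_to_s32(). */
static int32_t sketch_s16_to_s32(exponent_t *a_exp, int16_t b,
                                 exponent_t b_exp, unsigned remove_hr)
{
    assert(a_exp != NULL);

    int32_t a = b;        /* widening keeps the value, so the exponent is unchanged */
    exponent_t e = b_exp;

    if (remove_hr) {
        /* Double the mantissa while the result still fits in 32 bits,
           lowering the exponent each time so that a * 2^e is preserved.
           Zero is skipped (assumption): doubling it never removes headroom. */
        const int32_t limit = INT32_C(1) << 30;
        while (a != 0 && a >= -limit && a < limit) {
            a *= 2;
            e -= 1;
        }
    }

    *a_exp = e;
    return a;
}
```

With this sketch, the value 1.0 stored as mantissa 0x1000 with exponent -12 is returned as 0x1000 with exponent -12 when remove_hr is 0, and as 0x40000000 with exponent -30 when remove_hr is nonzero.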

int16_t s16_inverse(exponent_t *a_exp, const int16_t b)

Compute the multiplicative inverse (reciprocal) of a 16-bit integer.

b represents the integer $$b$$. a and a_exp together represent the result $$a \cdot 2^{a\_exp}$$.

Operation Performed

$$a \cdot 2^{a\_exp} \leftarrow \frac{1}{b}$$

Parameters:
• a_exp [out]: Output exponent $$a\_exp$$

• b [in]: Input integer $$b$$

Returns:

Output mantissa $$a$$
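
One way to picture the operation is as an integer division of a power of two by b. The sketch below is a hypothetical reference model, not the library's code: the name `sketch_s16_inverse`, the `int` typedef for `exponent_t`, the starting scale of 28 and the rejection of b = 0 are all assumptions made for illustration. The library may choose a different mantissa and exponent pair for the same value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int exponent_t; /* assumption: the exponent type is a signed int */

/* Hypothetical sketch: approximate 1/b as a * 2^(a_exp) with a 16-bit mantissa. */
static int16_t sketch_s16_inverse(exponent_t *a_exp, int16_t b)
{
    assert(a_exp != NULL);
    assert(b != 0); /* assumption: 1/0 is rejected, it has no representation */

    /* Compute 2^scale / b, starting from an arbitrary large scale and
       lowering it until the quotient fits in 16 bits. */
    int scale = 28;
    int32_t q = (INT32_C(1) << scale) / b;
    while (q > INT16_MAX || q < INT16_MIN) {
        scale -= 1;
        q = (INT32_C(1) << scale) / b;
    }

    *a_exp = -scale; /* q * 2^(-scale) approximates 1/b */
    return (int16_t)q;
}
```

With this sketch, b = 4 yields mantissa 16384 with exponent -16 (exactly 0.25), while b = 3 yields mantissa 21845 with exponent -16 (about 0.33333), which shows the truncation error of the integer division.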

int16_t s16_mul(exponent_t *a_exp, const int16_t b, const int16_t c, const exponent_t b_exp, const exponent_t c_exp)

Compute the product of two 16-bit floating-point scalars.

a and a_exp together represent the result $$a \cdot 2^{a\_exp}$$.

b and b_exp together represent the first input $$b \cdot 2^{b\_exp}$$.

c and c_exp together represent the second input $$c \cdot 2^{c\_exp}$$.

Operation Performed

$$a \cdot 2^{a\_exp} \leftarrow \left( b \cdot 2^{b\_exp} \right) \cdot \left( c \cdot 2^{c\_exp} \right)$$

Parameters:
• a_exp [out]: Output exponent $$a\_exp$$

• b [in]: First input mantissa $$b$$

• c [in]: Second input mantissa $$c$$

• b_exp [in]: First input exponent $$b\_exp$$

• c_exp [in]: Second input exponent $$c\_exp$$

Returns:

Output mantissa $$a$$
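
The product can be modelled by multiplying the mantissas exactly in 32 bits, adding the exponents, and then scaling the result back into 16 bits. The following is a minimal sketch under assumptions, not the library's implementation: `sketch_s16_mul` is a hypothetical name, `exponent_t` is assumed to be `int`, and the rounding (truncation toward zero) is chosen here for simplicity and may differ from the library's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int exponent_t; /* assumption: the exponent type is a signed int */

/* Hypothetical sketch: (b * 2^b_exp) * (c * 2^c_exp) as a 16-bit mantissa and exponent. */
static int16_t sketch_s16_mul(exponent_t *a_exp, int16_t b, int16_t c,
                              exponent_t b_exp, exponent_t c_exp)
{
    assert(a_exp != NULL);

    /* The product of two 16-bit values always fits in 32 bits. */
    int32_t p = (int32_t)b * (int32_t)c;
    exponent_t e = b_exp + c_exp;

    /* Halve the product until it fits in 16 bits, raising the exponent
       each time so that p * 2^e stays (approximately) the same value. */
    while (p > INT16_MAX || p < INT16_MIN) {
        p /= 2; /* truncates toward zero */
        e += 1;
    }

    *a_exp = e;
    return (int16_t)p;
}
```

With this sketch, multiplying 1.0 (mantissa 0x4000, exponent -14) by 1.5 (mantissa 0x6000, exponent -14) gives mantissa 0x6000 with exponent -14, which is 1.5.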