16-bit scalar API

group 16-bit Scalar API

Functions

static inline int32_t s16_to_s32(exponent_t *a_exp, const int16_t b, const exponent_t b_exp, const unsigned remove_hr)

Convert a 16-bit floating-point scalar to a 32-bit floating-point scalar.

Converts a 16-bit floating-point scalar, represented by the 16-bit mantissa b and exponent b_exp, into a 32-bit floating-point scalar, represented by the 32-bit returned mantissa and output exponent a_exp.

If remove_hr is nonzero, the output mantissa is scaled so that it has no headroom, and the output exponent a_exp is adjusted so that the represented value is unchanged. Otherwise, the output mantissa is the same as the input mantissa.

Parameters:

a_exp – [out] Output exponent
b – [in] 16-bit input mantissa
b_exp – [in] Input exponent
remove_hr – [in] Whether to remove headroom in output

Returns:

32-bit output mantissa
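
The conversion described above can be illustrated with a minimal reference sketch in portable C. This is not the library's implementation: `ref_s16_to_s32` and `ref_exponent_t` are hypothetical names, `exponent_t` is assumed to be a signed integer type, and leaving a zero mantissa untouched is an assumption, since the text above does not say how zero is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for exponent_t (assumed to be a signed integer). */
typedef int ref_exponent_t;

/* Reference sketch only: widens a 16-bit mantissa to 32 bits and,
 * if requested, removes all headroom from the result. */
int32_t ref_s16_to_s32(ref_exponent_t *a_exp, int16_t b,
                       ref_exponent_t b_exp, unsigned remove_hr)
{
    assert(a_exp != NULL);

    int32_t a = b;      /* sign-extend the mantissa */
    *a_exp = b_exp;     /* same value, so same exponent */

    /* Assumption: a zero mantissa is returned unchanged. */
    if (!remove_hr || a == 0)
        return a;

    /* While doubling cannot overflow, double the mantissa and
     * decrement the exponent; the represented value is preserved. */
    while (a >= -(INT32_C(1) << 30) && a < (INT32_C(1) << 30)) {
        a *= 2;
        *a_exp -= 1;
    }
    return a;
}
```

With these assumptions, a mantissa of 3 with exponent 0 becomes 0x60000000 with exponent -29 when remove_hr is nonzero, which still represents the value 3.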

int16_t s16_inverse(exponent_t *a_exp, const int16_t b)

Compute the multiplicative inverse (reciprocal) of a 16-bit integer.

b represents the integer \(b\). a and a_exp together represent the result \(a \cdot 2^{a\_exp}\).

Operation Performed

\[\begin{aligned} a \cdot 2^{a\_exp} \leftarrow \frac{1}{b} \end{aligned}\]

Parameters:

a_exp – [out] Output exponent \(a\_exp\)
b – [in] Input integer \(b\)

Returns:

Output mantissa \(a\)
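
One way to picture the operation \(a \cdot 2^{a\_exp} \leftarrow 1/b\) is the following sketch, which divides a fixed power of two by b and then normalizes the quotient into 16 bits. It is an illustration under stated assumptions, not the library's algorithm: `ref_s16_inverse` and `ref_exponent_t` are hypothetical names, the choice of \(2^{30}\) as the dividend and truncation toward zero are arbitrary, and b is assumed to be nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for exponent_t (assumed to be a signed integer). */
typedef int ref_exponent_t;

/* Reference sketch only: returns a mantissa a and writes an exponent
 * such that a * 2^(*a_exp) approximates 1 / b. */
int16_t ref_s16_inverse(ref_exponent_t *a_exp, int16_t b)
{
    assert(a_exp != NULL);
    assert(b != 0);     /* assumption: the reciprocal of zero is not handled */

    /* q * 2^-30 approximates 1 / b (integer division truncates). */
    int32_t q = (INT32_C(1) << 30) / b;
    ref_exponent_t exp = -30;

    /* Halve the mantissa until it fits in 16 bits, raising the
     * exponent each time so the represented value stays the same. */
    while (q > INT16_MAX || q < INT16_MIN) {
        q /= 2;
        exp += 1;
    }

    *a_exp = exp;
    return (int16_t)q;
}
```

Under these assumptions, b = 2 yields the mantissa 16384 with exponent -15, that is \(16384 \cdot 2^{-15} = 0.5\).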

int16_t s16_mul(exponent_t *a_exp, const int16_t b, const int16_t c, const exponent_t b_exp, const exponent_t c_exp)

Compute the product of two 16-bit floating-point scalars.

a and a_exp together represent the result \(a \cdot 2^{a\_exp}\).

b and b_exp together represent the first input \(b \cdot 2^{b\_exp}\).

c and c_exp together represent the second input \(c \cdot 2^{c\_exp}\).

Operation Performed

\[\begin{aligned} a \cdot 2^{a\_exp} \leftarrow \left( b\cdot 2^{b\_exp} \right) \cdot \left( c\cdot 2^{c\_exp} \right) \end{aligned}\]

Parameters:

a_exp – [out] Output exponent \(a\_exp\)
b – [in] First input mantissa \(b\)
c – [in] Second input mantissa \(c\)
b_exp – [in] First input exponent \(b\_exp\)
c_exp – [in] Second input exponent \(c\_exp\)

Returns:

Output mantissa \(a\)
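
The product above amounts to multiplying the mantissas and adding the exponents, then fitting the result back into 16 bits. The sketch below shows that idea in portable C. It is not the library's implementation: `ref_s16_mul` and `ref_exponent_t` are hypothetical names, and the normalization by repeated halving with truncation toward zero is an assumption (a real implementation may round differently).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for exponent_t (assumed to be a signed integer). */
typedef int ref_exponent_t;

/* Reference sketch only: a * 2^(*a_exp) approximates
 * (b * 2^b_exp) * (c * 2^c_exp). */
int16_t ref_s16_mul(ref_exponent_t *a_exp, int16_t b, int16_t c,
                    ref_exponent_t b_exp, ref_exponent_t c_exp)
{
    assert(a_exp != NULL);

    /* The full product of two 16-bit values always fits in 32 bits. */
    int32_t p = (int32_t)b * (int32_t)c;
    ref_exponent_t exp = b_exp + c_exp;

    /* Halve the product until it fits in 16 bits, raising the
     * exponent each time to compensate. */
    while (p > INT16_MAX || p < INT16_MIN) {
        p /= 2;
        exp += 1;
    }

    *a_exp = exp;
    return (int16_t)p;
}
```

For example, multiplying \(16384 \cdot 2^{-14}\) by itself (that is, 1.0 by 1.0) gives the mantissa 16384 with exponent -14 under these assumptions.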

int16_t s16_ashr(const int16_t x, const right_shift_t shr)

Arithmetic shift right of a 16-bit integer.

When a positive shr is given, returns x right-shifted by shr bits, filling the most significant bits with the sign bit. If shr is larger than 15, returns 0 for non-negative x or -1 for negative x.

When a negative shr is given, returns x left-shifted by |shr| bits, saturating to INT16_MAX or INT16_MIN if the result overflows.

Parameters:

x – [in] Input value
shr – [in] Right shift to apply to the input

Returns:

Shifted result
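
The shift semantics described for s16_ashr (sign-filling right shift, saturating left shift for negative shr) can be modeled with the portable sketch below. `ref_s16_ashr` is a hypothetical name, and `int` is assumed as a stand-in for `right_shift_t`; the sketch follows the behavior stated above and is not taken from the library source.

```c
#include <assert.h>
#include <stdint.h>

/* Reference sketch only: signed shift with saturation on left shifts.
 * `int` is an assumed stand-in for right_shift_t. */
int16_t ref_s16_ashr(int16_t x, int shr)
{
    if (shr >= 0) {
        /* Shifting out every value bit leaves only the sign. */
        if (shr > 15)
            return (int16_t)((x < 0) ? -1 : 0);
        /* For negative x, complement, shift, complement again; this
         * fills with ones without relying on >> of a negative value. */
        int v = x;
        return (int16_t)((v < 0) ? ~((~v) >> shr) : (v >> shr));
    }

    /* Negative shr: left shift by |shr| bits with saturation. */
    if (shr <= -16)
        return (int16_t)((x > 0) ? INT16_MAX : (x < 0) ? INT16_MIN : 0);

    /* |x| <= 2^15 and the factor is at most 2^15, so this fits in 32 bits. */
    int32_t wide = (int32_t)x * (INT32_C(1) << -shr);
    if (wide > INT16_MAX) return INT16_MAX;
    if (wide < INT16_MIN) return INT16_MIN;
    return (int16_t)wide;
}

/* A few checks that follow directly from the description above. */
void ref_s16_ashr_examples(void)
{
    assert(ref_s16_ashr(-4, 1) == -2);              /* sign-filling right shift */
    assert(ref_s16_ashr(-1, 20) == -1);             /* shr > 15, negative input */
    assert(ref_s16_ashr(100, 16) == 0);             /* shr > 15, non-negative input */
    assert(ref_s16_ashr(1, -3) == 8);               /* left shift by 3 */
    assert(ref_s16_ashr(0x4000, -1) == INT16_MAX);  /* positive overflow saturates */
}
```

Multiplication by a power of two is used for the left shift because left-shifting a negative signed value is undefined behavior in C.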