XCORE SDK
XCORE Software Development Kit
XS3 16-Bit Prepare Functions

Macros

#define xs3_vect_complex_s16_add_prepare   xs3_vect_s32_add_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_add().

#define xs3_vect_complex_s16_add_scalar_prepare   xs3_vect_s32_add_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_add_scalar().

#define xs3_vect_complex_s16_conj_mul_prepare   xs3_vect_complex_s16_mul_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_conj_mul().

#define xs3_vect_complex_s16_nmacc_prepare   xs3_vect_complex_s16_macc_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_nmacc().

#define xs3_vect_complex_s16_conj_macc_prepare   xs3_vect_complex_s16_macc_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_conj_macc().

#define xs3_vect_complex_s16_conj_nmacc_prepare   xs3_vect_complex_s16_macc_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_conj_nmacc().

#define xs3_vect_complex_s16_mag_prepare   xs3_vect_complex_s32_mag_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_mag().

#define xs3_vect_complex_s16_real_scale_prepare   xs3_vect_s16_scale_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_real_scale().

#define xs3_vect_complex_s16_scale_prepare   xs3_vect_complex_s16_mul_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_scale().

#define xs3_vect_complex_s16_sub_prepare   xs3_vect_s32_add_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_sub().

#define xs3_vect_s16_add_prepare   xs3_vect_s32_add_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_s16_add().

#define xs3_vect_s16_add_scalar_prepare   xs3_vect_s32_add_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_s16_add_scalar().

#define xs3_vect_s16_nmacc_prepare   xs3_vect_s16_macc_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_s16_nmacc().

#define xs3_vect_s16_sub_prepare   xs3_vect_s32_add_prepare
 Obtain the output exponent and shifts required for a call to xs3_vect_s16_sub().


Functions

void xs3_vect_complex_s16_macc_prepare (exponent_t *new_acc_exp, right_shift_t *acc_shr, right_shift_t *bc_sat, const exponent_t acc_exp, const exponent_t b_exp, const exponent_t c_exp, const headroom_t acc_hr, const headroom_t b_hr, const headroom_t c_hr)
 Obtain the output exponent and shifts needed by xs3_vect_complex_s16_macc().

void xs3_vect_complex_s16_mul_prepare (exponent_t *a_exp, right_shift_t *a_shr, const exponent_t b_exp, const exponent_t c_exp, const headroom_t b_hr, const headroom_t c_hr)
 Obtain the output exponent and output shift used by xs3_vect_complex_s16_mul() and xs3_vect_complex_s16_conj_mul().

void xs3_vect_complex_s16_real_mul_prepare (exponent_t *a_exp, right_shift_t *a_shr, const exponent_t b_exp, const exponent_t c_exp, const headroom_t b_hr, const headroom_t c_hr)
 Obtain the output exponent and output shift used by xs3_vect_complex_s16_real_mul().

void xs3_vect_complex_s16_squared_mag_prepare (exponent_t *a_exp, right_shift_t *a_shr, const exponent_t b_exp, const headroom_t b_hr)
 Obtain the output exponent and input shift used by xs3_vect_complex_s16_squared_mag().

void xs3_vect_s16_clip_prepare (exponent_t *a_exp, right_shift_t *b_shr, int16_t *lower_bound, int16_t *upper_bound, const exponent_t b_exp, const exponent_t bound_exp, const headroom_t b_hr)
 Obtain the output exponent, input shift and modified bounds used by xs3_vect_s16_clip().

void xs3_vect_s16_inverse_prepare (exponent_t *a_exp, unsigned *scale, const int16_t b[], const exponent_t b_exp, const unsigned length)
 Obtain the output exponent and scaling parameter used by xs3_vect_s16_inverse().

void xs3_vect_s16_macc_prepare (exponent_t *new_acc_exp, right_shift_t *acc_shr, right_shift_t *bc_sat, const exponent_t acc_exp, const exponent_t b_exp, const exponent_t c_exp, const headroom_t acc_hr, const headroom_t b_hr, const headroom_t c_hr)
 Obtain the output exponent and shifts needed by xs3_vect_s16_macc().

void xs3_vect_s16_mul_prepare (exponent_t *a_exp, right_shift_t *a_shr, const exponent_t b_exp, const exponent_t c_exp, const headroom_t b_hr, const headroom_t c_hr)
 Obtain the output exponent and output shift used by xs3_vect_s16_mul().

void xs3_vect_s16_scale_prepare (exponent_t *a_exp, right_shift_t *a_shr, const exponent_t b_exp, const exponent_t c_exp, const headroom_t b_hr, const headroom_t c_hr)
 Obtain the output exponent and output shift used by xs3_vect_s16_scale().

void xs3_vect_s16_sqrt_prepare (exponent_t *a_exp, right_shift_t *b_shr, const exponent_t b_exp, const right_shift_t b_hr)
 Obtain the output exponent and shift parameter used by xs3_vect_s16_sqrt().


Detailed Description

Macro Definition Documentation

◆ xs3_vect_complex_s16_add_prepare

#define xs3_vect_complex_s16_add_prepare   xs3_vect_s32_add_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_add().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_add() is identical to that for xs3_vect_s32_add().

This macro is provided as a convenience to developers and to make the code more readable.
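
Because each alias on this page is a plain object-like macro, the preprocessor substitutes the target function name before compilation, so a call through either spelling reaches the same function. The following is a minimal sketch of that mechanism only. The `demo_*` names and the placeholder body are hypothetical and are not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared prepare function. The body is
   placeholder logic only and does not reflect the library's algorithm. */
static void demo_s32_add_prepare(int *a_exp, int b_exp, int c_exp)
{
    assert(a_exp != NULL);
    *a_exp = (b_exp > c_exp) ? b_exp : c_exp;
}

/* Object-like macro alias, in the same style as the macros on this page. */
#define demo_complex_s16_add_prepare  demo_s32_add_prepare

/* Calls the function through the alias and returns the exponent it wrote. */
static int demo_call_through_alias(int b_exp, int c_exp)
{
    int a_exp = 0;
    demo_complex_s16_add_prepare(&a_exp, b_exp, c_exp);
    return a_exp;
}
```

Since the alias takes no arguments of its own, it can also be used wherever the function name is valid, for example when taking its address.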

See also
xs3_vect_s32_add_prepare()

◆ xs3_vect_complex_s16_add_scalar_prepare

#define xs3_vect_complex_s16_add_scalar_prepare   xs3_vect_s32_add_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_add_scalar().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_add_scalar() is identical to that for xs3_vect_s32_add().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_s32_add_prepare()

◆ xs3_vect_complex_s16_conj_macc_prepare

#define xs3_vect_complex_s16_conj_macc_prepare   xs3_vect_complex_s16_macc_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_conj_macc().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_conj_macc() is identical to that for xs3_vect_complex_s16_macc().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_complex_s16_macc_prepare(), xs3_vect_complex_s16_conj_macc()

◆ xs3_vect_complex_s16_conj_mul_prepare

#define xs3_vect_complex_s16_conj_mul_prepare   xs3_vect_complex_s16_mul_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_conj_mul().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_conj_mul() is identical to that for xs3_vect_complex_s16_mul().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_complex_s16_mul_prepare()

◆ xs3_vect_complex_s16_conj_nmacc_prepare

#define xs3_vect_complex_s16_conj_nmacc_prepare   xs3_vect_complex_s16_macc_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_conj_nmacc().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_conj_nmacc() is identical to that for xs3_vect_complex_s16_macc().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_complex_s16_macc_prepare(), xs3_vect_complex_s16_conj_nmacc()

◆ xs3_vect_complex_s16_mag_prepare

#define xs3_vect_complex_s16_mag_prepare   xs3_vect_complex_s32_mag_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_mag().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_mag() is identical to that for xs3_vect_complex_s32_mag().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_complex_s32_mag_prepare()

◆ xs3_vect_complex_s16_nmacc_prepare

#define xs3_vect_complex_s16_nmacc_prepare   xs3_vect_complex_s16_macc_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_nmacc().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_nmacc() is identical to that for xs3_vect_complex_s16_macc().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_complex_s16_macc_prepare(), xs3_vect_complex_s16_nmacc()

◆ xs3_vect_complex_s16_real_scale_prepare

#define xs3_vect_complex_s16_real_scale_prepare   xs3_vect_s16_scale_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_real_scale().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_real_scale() is identical to that for xs3_vect_s16_scale().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_s16_scale_prepare()

◆ xs3_vect_complex_s16_scale_prepare

#define xs3_vect_complex_s16_scale_prepare   xs3_vect_complex_s16_mul_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_scale().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_scale() is identical to that for xs3_vect_complex_s16_mul().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_complex_s16_mul_prepare()

◆ xs3_vect_complex_s16_sub_prepare

#define xs3_vect_complex_s16_sub_prepare   xs3_vect_s32_add_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_complex_s16_sub().

The logic for computing the shifts and exponents of xs3_vect_complex_s16_sub() is identical to that for xs3_vect_s32_add().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_s32_add_prepare()

◆ xs3_vect_s16_add_prepare

#define xs3_vect_s16_add_prepare   xs3_vect_s32_add_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_s16_add().

The logic for computing the shifts and exponents of xs3_vect_s16_add() is identical to that for xs3_vect_s32_add().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_s32_add_prepare()

◆ xs3_vect_s16_add_scalar_prepare

#define xs3_vect_s16_add_scalar_prepare   xs3_vect_s32_add_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_s16_add_scalar().

The logic for computing the shifts and exponents of xs3_vect_s16_add_scalar() is identical to that for xs3_vect_s32_add().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_s32_add_prepare()

◆ xs3_vect_s16_nmacc_prepare

#define xs3_vect_s16_nmacc_prepare   xs3_vect_s16_macc_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_s16_nmacc().

The logic for computing the shifts and exponents of xs3_vect_s16_nmacc() is identical to that for xs3_vect_s16_macc().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_s16_macc_prepare(), xs3_vect_s16_nmacc()

◆ xs3_vect_s16_sub_prepare

#define xs3_vect_s16_sub_prepare   xs3_vect_s32_add_prepare

Obtain the output exponent and shifts required for a call to xs3_vect_s16_sub().

The logic for computing the shifts and exponents of xs3_vect_s16_sub() is identical to that for xs3_vect_s32_add().

This macro is provided as a convenience to developers and to make the code more readable.

See also
xs3_vect_s32_add_prepare()

Function Documentation

◆ xs3_vect_complex_s16_macc_prepare()

void xs3_vect_complex_s16_macc_prepare ( exponent_t *  new_acc_exp,
right_shift_t *  acc_shr,
right_shift_t *  bc_sat,
const exponent_t  acc_exp,
const exponent_t  b_exp,
const exponent_t  c_exp,
const headroom_t  acc_hr,
const headroom_t  b_hr,
const headroom_t  c_hr 
)

Obtain the output exponent and shifts needed by xs3_vect_complex_s16_macc().

This function is used in conjunction with xs3_vect_complex_s16_macc() to perform an element-wise multiply-accumulate of complex 16-bit BFP vectors.

This function computes new_acc_exp, acc_shr and bc_sat, which are selected to maximize precision in the resulting accumulator vector without causing saturation of final or intermediate values. Normally the caller will pass these outputs to their corresponding inputs of xs3_vect_complex_s16_macc().

acc_exp is the exponent associated with the accumulator mantissa vector \(\bar a\) prior to the operation, whereas new_acc_exp is the exponent corresponding to the updated accumulator vector.

b_exp and c_exp are the exponents associated with the complex input mantissa vectors \(\bar b\) and \(\bar c\) respectively.

acc_hr, b_hr and c_hr are the headrooms of \(\bar a\), \(\bar b\) and \(\bar c\) respectively. If the headroom of any of these vectors is unknown, it can be obtained by calling xs3_vect_complex_s16_headroom(). Alternatively, the value 0 can always be safely used (but may result in reduced precision).
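
To make the headroom inputs concrete, here is a minimal sketch that counts headroom as the number of times a 16-bit value can be doubled without leaving the `int16_t` range, and takes the minimum over a vector. The helper names are hypothetical, and the convention that an all-zero input reports 15 bits is an assumption of this sketch rather than a statement about xs3_vect_complex_s16_headroom(). For a complex vector the same minimum would be taken over both the real and imaginary parts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: headroom of one 16-bit value, counted as the number
   of doublings that keep it inside [INT16_MIN, INT16_MAX], capped at 15. */
static unsigned sketch_s16_headroom(int16_t x)
{
    unsigned hr = 0;
    int32_t v = x;
    while (hr < 15) {
        int32_t doubled = v * 2;
        if (doubled < INT16_MIN || doubled > INT16_MAX) break;
        v = doubled;
        hr++;
    }
    return hr;
}

/* Hypothetical helper: headroom of a vector is the minimum element headroom. */
static unsigned sketch_vect_s16_headroom(const int16_t b[], size_t length)
{
    assert(b != NULL || length == 0);
    unsigned hr = 15;
    for (size_t k = 0; k < length; k++) {
        unsigned h = sketch_s16_headroom(b[k]);
        if (h < hr) hr = h;
    }
    return hr;
}
```

Passing 0 instead of the measured headroom is always safe because it only tells the prepare function to assume the mantissas may already use the full 16-bit range.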

Adjusting Output Exponents

If a specific output exponent desired_exp is needed for the result (e.g. for emulating fixed-point arithmetic), the acc_shr and bc_sat produced by this function can be adjusted according to the following:

// Presumed to be set somewhere
exponent_t acc_exp, b_exp, c_exp;
headroom_t acc_hr, b_hr, c_hr;
exponent_t desired_exp;
...
// Call prepare
right_shift_t acc_shr, bc_sat;
xs3_vect_complex_s16_macc_prepare(&acc_exp, &acc_shr, &bc_sat,
acc_exp, b_exp, c_exp,
acc_hr, b_hr, c_hr);
// Modify results
right_shift_t mant_shr = desired_exp - acc_exp;
acc_exp += mant_shr;
acc_shr += mant_shr;
bc_sat += mant_shr;
// acc_shr and bc_sat may now be used in a call to xs3_vect_complex_s16_macc()

When applying the above adjustment, the following conditions should be maintained:

  • bc_sat >= 0 (bc_sat is an unsigned right-shift)
  • acc_shr > -acc_hr (Shifting any further left may cause saturation)

It is up to the user to ensure any such modification does not result in saturation or unacceptable loss of precision.
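
The adjustment and its two conditions can be wrapped in a small helper that refuses to apply a change which would violate them. This is a sketch under stated assumptions: `sketch_macc_retarget` is a hypothetical name, the typedefs are local stand-ins for the library types, and the checks are exactly the two conditions listed above and nothing more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the library typedefs. */
typedef int exponent_t;
typedef int right_shift_t;
typedef unsigned headroom_t;

/* Hypothetical helper: move the prepared outputs to desired_exp.
   Returns false and leaves the outputs untouched if either documented
   condition (bc_sat >= 0, acc_shr > -acc_hr) would be violated. */
static bool sketch_macc_retarget(exponent_t *acc_exp, right_shift_t *acc_shr,
                                 right_shift_t *bc_sat, exponent_t desired_exp,
                                 headroom_t acc_hr)
{
    assert(acc_exp != NULL && acc_shr != NULL && bc_sat != NULL);
    const right_shift_t mant_shr = desired_exp - *acc_exp;
    const right_shift_t new_acc_shr = *acc_shr + mant_shr;
    const right_shift_t new_bc_sat = *bc_sat + mant_shr;

    if (new_bc_sat < 0) return false;                         /* unsigned shift */
    if (new_acc_shr <= -(right_shift_t)acc_hr) return false;  /* would saturate */

    *acc_exp = desired_exp;
    *acc_shr = new_acc_shr;
    *bc_sat = new_bc_sat;
    return true;
}
```

A caller would run the prepare function first, then call this helper, and fall back to the unadjusted outputs when it returns false.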

Parameters
[out] new_acc_exp  Exponent associated with output mantissa vector \(\bar a\) (after macc)
[out] acc_shr  Signed arithmetic right-shift used for \(\bar a\) in xs3_vect_complex_s16_macc()
[out] bc_sat  Unsigned arithmetic right-shift applied to the product of elements \(b_k\) and \(c_k\) in xs3_vect_complex_s16_macc()
[in] acc_exp  Exponent associated with input mantissa vector \(\bar a\) (before macc)
[in] b_exp  Exponent associated with input mantissa vector \(\bar b\)
[in] c_exp  Exponent associated with input mantissa vector \(\bar c\)
[in] acc_hr  Headroom of input mantissa vector \(\bar a\) (before macc)
[in] b_hr  Headroom of input mantissa vector \(\bar b\)
[in] c_hr  Headroom of input mantissa vector \(\bar c\)
See also
xs3_vect_complex_s16_macc

◆ xs3_vect_complex_s16_mul_prepare()

void xs3_vect_complex_s16_mul_prepare ( exponent_t *  a_exp,
right_shift_t *  a_shr,
const exponent_t  b_exp,
const exponent_t  c_exp,
const headroom_t  b_hr,
const headroom_t  c_hr 
)

Obtain the output exponent and output shift used by xs3_vect_complex_s16_mul() and xs3_vect_complex_s16_conj_mul().

This function is used in conjunction with xs3_vect_complex_s16_mul() to perform a complex element-wise multiplication of two complex 16-bit BFP vectors.

This function computes a_exp and a_shr.

a_exp is the exponent associated with mantissa vector \(\bar a\), and must be chosen to be large enough to avoid overflow when elements of \(\bar a\) are computed. To maximize precision, this function chooses a_exp to be the smallest exponent known to avoid saturation (see exception below). The a_exp chosen by this function is derived from the exponents and headrooms associated with the input vectors.

a_shr is the shift parameter required by xs3_vect_complex_s16_mul() to achieve the chosen output exponent a_exp.

b_exp and c_exp are the exponents associated with the input mantissa vectors \(\bar b\) and \(\bar c\) respectively.

b_hr and c_hr are the headrooms of \(\bar b\) and \(\bar c\) respectively. If the headroom of \(\bar b\) or \(\bar c\) is unknown, it can be obtained by calling xs3_vect_complex_s16_headroom(). Alternatively, the value 0 can always be safely used (but may result in reduced precision).

Adjusting Output Exponents

If a specific output exponent desired_exp is needed for the result (e.g. for emulating fixed-point arithmetic), the a_shr produced by this function can be adjusted according to the following:

exponent_t desired_exp = ...; // Value known a priori
right_shift_t new_a_shr = a_shr + (desired_exp - a_exp);

When applying the above adjustment, the following condition should be maintained:

  • new_a_shr >= 0

Be aware that using smaller values than strictly necessary for a_shr can result in saturation, and using larger values may result in unnecessary underflows or loss of precision.

Notes

  • Using the outputs of this function, an output mantissa which would otherwise be INT16_MIN will instead saturate to -INT16_MAX. This is due to the symmetric saturation logic employed by the VPU and is a hardware feature. This is a corner case which is usually unlikely and results in 1 LSb of error when it occurs.
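
The symmetric saturation described in this note can be illustrated in plain C. This is only a sketch of the arithmetic and not the VPU implementation; `sketch_sat_s16_symmetric` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp a 32-bit intermediate to the symmetric 16-bit
   range [-INT16_MAX, INT16_MAX], so INT16_MIN is never produced. */
static int16_t sketch_sat_s16_symmetric(int32_t x)
{
    if (x > INT16_MAX)  return INT16_MAX;
    if (x < -INT16_MAX) return (int16_t)(-INT16_MAX);
    return (int16_t)x;
}

/* Self-check for the corner case: only INT16_MIN itself is altered. */
static void sketch_sat_selfcheck(void)
{
    assert(sketch_sat_s16_symmetric(INT16_MIN) == -INT16_MAX);  /* 1 LSb of error */
    assert(sketch_sat_s16_symmetric(-INT16_MAX) == -INT16_MAX); /* unchanged */
}
```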
Parameters
[out] a_exp  Exponent associated with output mantissa vector \(\bar a\)
[out] a_shr  Unsigned arithmetic right-shift for \(\bar a\) used by xs3_vect_complex_s16_mul()
[in] b_exp  Exponent associated with input mantissa vector \(\bar b\)
[in] c_exp  Exponent associated with input mantissa vector \(\bar c\)
[in] b_hr  Headroom of input mantissa vector \(\bar b\)
[in] c_hr  Headroom of input mantissa vector \(\bar c\)
See also
xs3_vect_complex_s16_conj_mul, xs3_vect_complex_s16_mul

◆ xs3_vect_complex_s16_real_mul_prepare()

void xs3_vect_complex_s16_real_mul_prepare ( exponent_t *  a_exp,
right_shift_t *  a_shr,
const exponent_t  b_exp,
const exponent_t  c_exp,
const headroom_t  b_hr,
const headroom_t  c_hr 
)

Obtain the output exponent and output shift used by xs3_vect_complex_s16_real_mul().

This function is used in conjunction with xs3_vect_complex_s16_real_mul() to perform a complex element-wise multiplication of a complex 16-bit BFP vector by a real 16-bit vector.

This function computes a_exp and a_shr.

a_exp is the exponent associated with mantissa vector \(\bar a\), and must be chosen to be large enough to avoid overflow when elements of \(\bar a\) are computed. To maximize precision, this function chooses a_exp to be the smallest exponent known to avoid saturation (see exception below). The a_exp chosen by this function is derived from the exponents and headrooms associated with the input vectors.

a_shr is the shift parameter required by xs3_vect_complex_s16_real_mul() to achieve the chosen output exponent a_exp.

b_exp and c_exp are the exponents associated with the input mantissa vectors \(\bar b\) and \(\bar c\) respectively.

b_hr and c_hr are the headrooms of \(\bar b\) and \(\bar c\) respectively. If the headroom of \(\bar b\) or \(\bar c\) is unknown, it can be obtained by calling xs3_vect_complex_s16_headroom(). Alternatively, the value 0 can always be safely used (but may result in reduced precision).

Adjusting Output Exponents

If a specific output exponent desired_exp is needed for the result (e.g. for emulating fixed-point arithmetic), the a_shr produced by this function can be adjusted according to the following:

exponent_t desired_exp = ...; // Value known a priori
right_shift_t new_a_shr = a_shr + (desired_exp - a_exp);

When applying the above adjustment, the following condition should be maintained:

  • new_a_shr >= 0

Be aware that using smaller values than strictly necessary for a_shr can result in saturation, and using larger values may result in unnecessary underflows or loss of precision.

Notes

  • Using the outputs of this function, an output mantissa which would otherwise be INT16_MIN will instead saturate to -INT16_MAX. This is due to the symmetric saturation logic employed by the VPU and is a hardware feature. This is a corner case which is usually unlikely and results in 1 LSb of error when it occurs.
Parameters
[out] a_exp  Exponent associated with output mantissa vector \(\bar a\)
[out] a_shr  Unsigned arithmetic right-shift for \(\bar a\) used by xs3_vect_complex_s16_real_mul()
[in] b_exp  Exponent associated with input mantissa vector \(\bar b\)
[in] c_exp  Exponent associated with input mantissa vector \(\bar c\)
[in] b_hr  Headroom of input mantissa vector \(\bar b\)
[in] c_hr  Headroom of input mantissa vector \(\bar c\)
See also
xs3_vect_complex_s16_real_mul

◆ xs3_vect_complex_s16_squared_mag_prepare()

void xs3_vect_complex_s16_squared_mag_prepare ( exponent_t *  a_exp,
right_shift_t *  a_shr,
const exponent_t  b_exp,
const headroom_t  b_hr 
)

Obtain the output exponent and input shift used by xs3_vect_complex_s16_squared_mag().

This function is used in conjunction with xs3_vect_complex_s16_squared_mag() to compute the squared magnitude of each element of a complex 16-bit BFP vector.

This function computes a_exp and a_shr.

a_exp is the exponent associated with mantissa vector \(\bar a\), and is chosen to maximize precision when elements of \(\bar a\) are computed. The a_exp chosen by this function is derived from the exponent and headroom associated with the input vector.

a_shr is the shift parameter required by xs3_vect_complex_s16_squared_mag() to achieve the chosen output exponent a_exp.

b_exp is the exponent associated with the input mantissa vector \(\bar b\).

b_hr is the headroom of \(\bar b\). If the headroom of \(\bar b\) is unknown it can be calculated using xs3_vect_complex_s16_headroom(). Alternatively, the value 0 can always be safely used (but may result in reduced precision).
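
As an illustration of how a_exp, a_shr and b_exp relate, the sketch below computes one squared magnitude in plain C. It assumes that the shift is applied to the sum of squares and that the output exponent is therefore `2 * b_exp + a_shr`. Both are assumptions of this sketch, and the struct and function names are hypothetical. The library routine may order its operations differently.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the library typedefs. */
typedef int exponent_t;
typedef int right_shift_t;

/* Hypothetical complex 16-bit element. */
typedef struct { int16_t re; int16_t im; } sketch_complex_s16_t;

/* Hypothetical helper: (re^2 + im^2) >> a_shr, saturated to INT16_MAX.
   A 64-bit accumulator is used because the sum can reach 2^31. */
static int16_t sketch_squared_mag_elem(sketch_complex_s16_t b, right_shift_t a_shr)
{
    assert(a_shr >= 0 && a_shr < 32);
    int64_t sum = (int64_t)b.re * b.re + (int64_t)b.im * b.im;
    sum >>= a_shr;                    /* sum is non-negative here */
    if (sum > INT16_MAX) sum = INT16_MAX;
    return (int16_t)sum;
}

/* Exponent of the result under the assumption stated above. */
static exponent_t sketch_squared_mag_exp(exponent_t b_exp, right_shift_t a_shr)
{
    return 2 * b_exp + a_shr;
}
```

For example, the element 300 + 400j with b_exp = -8 has squared magnitude 250000 · 2^-16, which the sketch represents exactly as 15625 · 2^-12 when a_shr = 4.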

Adjusting Output Exponents

If a specific output exponent desired_exp is needed for the result (e.g. for emulating fixed-point arithmetic), the a_shr produced by this function can be adjusted according to the following:

exponent_t a_exp;
right_shift_t a_shr;
xs3_vect_complex_s16_squared_mag_prepare(&a_exp, &a_shr, b_exp, b_hr);
exponent_t desired_exp = ...; // Value known a priori
a_shr = a_shr + (desired_exp - a_exp);
a_exp = desired_exp;

When applying the above adjustment, the following condition should be maintained:

  • a_shr >= 0

Using larger values than strictly necessary for a_shr may result in unnecessary underflows or loss of precision.

Parameters
[out] a_exp  Output exponent associated with output mantissa vector \(\bar a\)
[out] a_shr  Unsigned arithmetic right-shift for \(\bar a\) used by xs3_vect_complex_s16_squared_mag()
[in] b_exp  Exponent associated with input mantissa vector \(\bar b\)
[in] b_hr  Headroom of input mantissa vector \(\bar b\)
See also
xs3_vect_complex_s16_squared_mag()

◆ xs3_vect_s16_clip_prepare()

void xs3_vect_s16_clip_prepare ( exponent_t *  a_exp,
right_shift_t *  b_shr,
int16_t *  lower_bound,
int16_t *  upper_bound,
const exponent_t  b_exp,
const exponent_t  bound_exp,
const headroom_t  b_hr 
)

Obtain the output exponent, input shift and modified bounds used by xs3_vect_s16_clip().

This function is used in conjunction with xs3_vect_s16_clip() to bound the elements of a 16-bit BFP vector to a specified range.

This function computes a_exp, b_shr, lower_bound and upper_bound.

a_exp is the exponent associated with the 16-bit mantissa vector \(\bar a\) computed by xs3_vect_s16_clip().

b_shr is the shift parameter required by xs3_vect_s16_clip() to achieve the output exponent a_exp.

lower_bound and upper_bound are the 16-bit mantissas which indicate the lower and upper clipping bounds respectively. The values are modified by this function, and the resulting values should be passed along to xs3_vect_s16_clip().

b_exp is the exponent associated with the input mantissa vector \(\bar b\).

bound_exp is the exponent associated with the bound mantissas lower_bound and upper_bound.

b_hr is the headroom of \(\bar b\). If unknown, it can be obtained using xs3_vect_s16_headroom(). Alternatively, the value 0 can always be safely used (but may result in reduced precision).
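
Since the bounds arrive with their own exponent bound_exp, they have to be re-expressed before they can be compared against mantissas that use a different exponent. The sketch below shows one way such a re-expression could be done for a single bound. It is an illustration of the idea only, not the logic used by xs3_vect_s16_clip_prepare(), and `sketch_rescale_bound` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the library typedef. */
typedef int exponent_t;

/* Hypothetical helper: express bound * 2^bound_exp as a 16-bit mantissa
   with exponent target_exp, rounding toward negative infinity and
   saturating to the int16_t range. */
static int16_t sketch_rescale_bound(int16_t bound, exponent_t bound_exp,
                                    exponent_t target_exp)
{
    int shift = target_exp - bound_exp;   /* > 0 means shift right */
    int64_t v = bound;
    if (shift >= 0) {
        if (shift > 31) shift = 31;
        /* floor(v / 2^shift) without relying on signed right-shift behaviour */
        v = (v >= 0) ? (v >> shift) : ~((~v) >> shift);
    } else {
        int left = -shift;
        if (left > 31) left = 31;
        v *= ((int64_t)1 << left);
    }
    if (v > INT16_MAX) v = INT16_MAX;
    if (v < INT16_MIN) v = INT16_MIN;
    assert(v >= INT16_MIN && v <= INT16_MAX);
    return (int16_t)v;
}
```

A bound that saturates here lies outside the range representable at the target exponent, which in practice means it cannot clip any element on that side.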

Parameters
[out] a_exp  Exponent associated with output mantissa vector \(\bar a\)
[out] b_shr  Signed arithmetic right-shift for \(\bar b\) used by xs3_vect_s16_clip()
[in,out] lower_bound  Lower bound of clipping range
[in,out] upper_bound  Upper bound of clipping range
[in] b_exp  Exponent associated with input mantissa vector \(\bar b\)
[in] bound_exp  Exponent associated with clipping bounds lower_bound and upper_bound
[in] b_hr  Headroom of input mantissa vector \(\bar b\)
See also
xs3_vect_s16_clip

◆ xs3_vect_s16_inverse_prepare()

void xs3_vect_s16_inverse_prepare ( exponent_t *  a_exp,
unsigned *  scale,
const int16_t  b[],
const exponent_t  b_exp,
const unsigned  length 
)

Obtain the output exponent and scaling parameter used by xs3_vect_s16_inverse().

This function is used in conjunction with xs3_vect_s16_inverse() to compute the inverse of elements of a 16-bit BFP vector.

This function computes a_exp and scale.

a_exp is the exponent associated with output mantissa vector \(\bar a\), and must be chosen to avoid overflow in the smallest element of the input vector, which when inverted becomes the largest output element. To maximize precision, this function chooses a_exp to be the smallest exponent known to avoid saturation. The a_exp chosen by this function is derived from the exponent and smallest element of the input vector.

scale is a scaling parameter used by xs3_vect_s16_inverse() to achieve the chosen output exponent.

b[] is the input mantissa vector \(\bar b\).

b_exp is the exponent associated with the input mantissa vector \(\bar b\).

length is the number of elements in \(\bar b\).
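
The reasoning above (the smallest input element fixes the largest output element) can be sketched as follows. The sketch assumes each output mantissa is computed as `2^scale / b[k]`, from which `a_exp = -scale - b_exp` follows, and it picks the largest scale for which the largest output still fits in 16 bits. The function name is hypothetical, zero elements are simply ignored, and the library may select scale differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the library typedef. */
typedef int exponent_t;

/* Hypothetical sketch: choose scale from the smallest non-zero |b[k]|,
   assuming output mantissas are computed as 2^scale / b[k]. */
static void sketch_inverse_prepare(exponent_t *a_exp, unsigned *scale,
                                   const int16_t b[], exponent_t b_exp,
                                   unsigned length)
{
    assert(a_exp != NULL && scale != NULL && b != NULL && length > 0);

    int32_t min_abs = (int32_t)INT16_MAX + 1;   /* larger than any |b[k]| kept */
    for (unsigned k = 0; k < length; k++) {
        int32_t m = (b[k] < 0) ? -(int32_t)b[k] : (int32_t)b[k];
        if (m != 0 && m < min_abs) min_abs = m;
    }

    /* Largest s such that 2^s / min_abs still fits in a signed 16-bit value. */
    unsigned s = 0;
    while (s < 30 && ((int32_t)1 << (s + 1)) / min_abs <= INT16_MAX) s++;

    *scale = s;
    *a_exp = -(exponent_t)s - b_exp;
}
```

With b = {4, 8, 100} and b_exp = -2, the smallest element represents 1.0. The sketch picks scale = 16 and a_exp = -14, so its inverse is stored as 16384 · 2^-14.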

Todo:
In lib_dsp, the inverse function has a floor, which prevents tiny values from completely dominating the output behavior. Perhaps I should include that?
Parameters
[out] a_exp  Exponent of output vector \(\bar a\)
[out] scale  Scale factor to be applied when computing inverse
[in] b  Input vector \(\bar b\)
[in] b_exp  Exponent of \(\bar b\)
[in] length  Number of elements in vector \(\bar b\)
See also
xs3_vect_s16_inverse

◆ xs3_vect_s16_macc_prepare()

void xs3_vect_s16_macc_prepare ( exponent_t *  new_acc_exp,
right_shift_t *  acc_shr,
right_shift_t *  bc_sat,
const exponent_t  acc_exp,
const exponent_t  b_exp,
const exponent_t  c_exp,
const headroom_t  acc_hr,
const headroom_t  b_hr,
const headroom_t  c_hr 
)

Obtain the output exponent and shifts needed by xs3_vect_s16_macc().

This function is used in conjunction with xs3_vect_s16_macc() to perform an element-wise multiply-accumulate of 16-bit BFP vectors.

This function computes new_acc_exp, acc_shr and bc_sat, which are selected to maximize precision in the resulting accumulator vector without causing saturation of final or intermediate values. Normally the caller will pass these outputs to their corresponding inputs of xs3_vect_s16_macc().

acc_exp is the exponent associated with the accumulator mantissa vector \(\bar a\) prior to the operation, whereas new_acc_exp is the exponent corresponding to the updated accumulator vector.

b_exp and c_exp are the exponents associated with the input mantissa vectors \(\bar b\) and \(\bar c\) respectively.

acc_hr, b_hr and c_hr are the headrooms of \(\bar a\), \(\bar b\) and \(\bar c\) respectively. If the headroom of any of these vectors is unknown, it can be obtained by calling xs3_vect_s16_headroom(). Alternatively, the value 0 can always be safely used (but may result in reduced precision).

Adjusting Output Exponents

If a specific output exponent desired_exp is needed for the result (e.g. for emulating fixed-point arithmetic), the acc_shr and bc_sat produced by this function can be adjusted according to the following:

// Presumed to be set somewhere
exponent_t acc_exp, b_exp, c_exp;
headroom_t acc_hr, b_hr, c_hr;
exponent_t desired_exp;
...
// Call prepare
right_shift_t acc_shr, bc_sat;
xs3_vect_s16_macc_prepare(&acc_exp, &acc_shr, &bc_sat,
acc_exp, b_exp, c_exp,
acc_hr, b_hr, c_hr);
// Modify results
right_shift_t mant_shr = desired_exp - acc_exp;
acc_exp += mant_shr;
acc_shr += mant_shr;
bc_sat += mant_shr;
// acc_shr and bc_sat may now be used in a call to xs3_vect_s16_macc()

When applying the above adjustment, the following conditions should be maintained:

  • bc_sat >= 0 (bc_sat is an unsigned right-shift)
  • acc_shr > -acc_hr (Shifting any further left may cause saturation)

It is up to the user to ensure any such modification does not result in saturation or unacceptable loss of precision.

Parameters
[out] new_acc_exp  Exponent associated with output mantissa vector \(\bar a\) (after macc)
[out] acc_shr  Signed arithmetic right-shift used for \(\bar a\) in xs3_vect_s16_macc()
[out] bc_sat  Unsigned arithmetic right-shift applied to the product of elements \(b_k\) and \(c_k\) in xs3_vect_s16_macc()
[in] acc_exp  Exponent associated with input mantissa vector \(\bar a\) (before macc)
[in] b_exp  Exponent associated with input mantissa vector \(\bar b\)
[in] c_exp  Exponent associated with input mantissa vector \(\bar c\)
[in] acc_hr  Headroom of input mantissa vector \(\bar a\) (before macc)
[in] b_hr  Headroom of input mantissa vector \(\bar b\)
[in] c_hr  Headroom of input mantissa vector \(\bar c\)
See also
xs3_vect_s16_macc

◆ xs3_vect_s16_mul_prepare()

void xs3_vect_s16_mul_prepare ( exponent_t *  a_exp,
right_shift_t *  a_shr,
const exponent_t  b_exp,
const exponent_t  c_exp,
const headroom_t  b_hr,
const headroom_t  c_hr 
)

Obtain the output exponent and output shift used by xs3_vect_s16_mul().

This function is used in conjunction with xs3_vect_s16_mul() to perform an element-wise multiplication of two 16-bit BFP vectors.

This function computes a_exp and a_shr.

a_exp is the exponent associated with mantissa vector \(\bar a\), and must be chosen to be large enough to avoid overflow when elements of \(\bar a\) are computed. To maximize precision, this function chooses a_exp to be the smallest exponent known to avoid saturation (see exception below). The a_exp chosen by this function is derived from the exponents and headrooms associated with the input vectors.

a_shr is an arithmetic right-shift applied by xs3_vect_s16_mul() to the 32-bit products of input elements to achieve the chosen output exponent a_exp.

b_exp and c_exp are the exponents associated with the input mantissa vectors \(\bar b\) and \(\bar c\) respectively.

b_hr and c_hr are the headrooms of \(\bar b\) and \(\bar c\) respectively. If the headroom of \(\bar b\) or \(\bar c\) is unknown, it can be obtained by calling xs3_vect_s16_headroom(). Alternatively, the value 0 can always be safely used (but may result in reduced precision).
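
The role of the headroom inputs can be seen in a short bound argument. With headroom h, a 16-bit mantissa has magnitude at most 2^(15-h), so a 32-bit product has magnitude at most 2^(30 - b_hr - c_hr). Shifting it right by 15 - (b_hr + c_hr) brings it to at most 2^15, which exceeds the 16-bit range only in the extreme corner case. The sketch below encodes that bound. It is one plausible choice made for illustration, with a hypothetical function name, and should not be read as the library's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the library typedefs. */
typedef int exponent_t;
typedef int right_shift_t;
typedef unsigned headroom_t;

/* Hypothetical sketch: pick the smallest non-negative shift that keeps
   |b_k * c_k| >> a_shr within about 2^15, then derive the output exponent. */
static void sketch_mul_prepare(exponent_t *a_exp, right_shift_t *a_shr,
                               exponent_t b_exp, exponent_t c_exp,
                               headroom_t b_hr, headroom_t c_hr)
{
    assert(a_exp != NULL && a_shr != NULL);
    right_shift_t shr = 15 - (right_shift_t)(b_hr + c_hr);
    if (shr < 0) shr = 0;              /* never shift the product left */
    *a_shr = shr;
    *a_exp = b_exp + c_exp + shr;      /* each right-shift bit raises the exponent by 1 */
}
```

Passing 0 for both headrooms gives the largest shift (15), which is always safe but discards the most low-order bits.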

Adjusting Output Exponents

If a specific output exponent desired_exp is needed for the result (e.g. for emulating fixed-point arithmetic), the a_shr produced by this function can be adjusted according to the following:

exponent_t a_exp;
right_shift_t a_shr;
xs3_vect_s16_mul_prepare(&a_exp, &a_shr, b_exp, c_exp, b_hr, c_hr);
exponent_t desired_exp = ...; // Value known a priori
a_shr = a_shr + (desired_exp - a_exp); // Shift further right for each step up in exponent
a_exp = desired_exp;

When applying the above adjustment, the following condition should be maintained:

  • a_shr >= 0

Be aware that using a smaller value than strictly necessary for a_shr can result in saturation, and using larger values may result in unnecessary underflows or loss of precision.
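The adjustment and its a_shr >= 0 condition can be wrapped in a small helper, sketched below. The helper name adjust_output_exponent is hypothetical, and the two typedefs are stand-ins assumed here so that the block compiles on its own; when building against the library, use its own definitions instead.

```c
typedef int exponent_t;    /* stand-in for the library typedef (assumption) */
typedef int right_shift_t; /* stand-in for the library typedef (assumption) */

/* Hypothetical helper: retarget the prepared outputs to desired_exp.
 * Returns 0 on success, or -1 (leaving both outputs untouched) if the
 * adjusted shift would be negative. */
static int adjust_output_exponent(exponent_t *a_exp, right_shift_t *a_shr,
                                  exponent_t desired_exp)
{
    right_shift_t new_shr = *a_shr + (desired_exp - *a_exp);
    if (new_shr < 0) {
        return -1; /* would violate a_shr >= 0 */
    }
    *a_shr = new_shr;
    *a_exp = desired_exp;
    return 0;
}
```

A successful return only means the shift is non-negative. A desired_exp below the prepared a_exp can still saturate, as cautioned above.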

Notes

  • Using the outputs of this function, an output mantissa which would otherwise be INT16_MIN will instead saturate to -INT16_MAX. This is due to the symmetric saturation logic employed by the VPU and is a hardware feature. This corner case is unlikely and results in 1 LSb of error when it occurs.
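The symmetric saturation described in this note can be modeled in portable C as follows. This is an illustrative sketch only; the function name sat_s16_symmetric is hypothetical and the code does not claim to reproduce the VPU's internal behavior beyond the clamping range stated above.

```c
#include <stdint.h>

/* Hypothetical model of symmetric 16-bit saturation: the most negative
 * result is -INT16_MAX (-32767) rather than INT16_MIN (-32768). */
static int16_t sat_s16_symmetric(int32_t v)
{
    if (v > INT16_MAX) {
        return INT16_MAX;
    }
    if (v < -INT16_MAX) {
        return -INT16_MAX;
    }
    return (int16_t)v;
}
```

Under this model, an exact result of INT16_MIN comes back as -INT16_MAX, which is the 1 LSb error mentioned in the note.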
Parameters
[out] a_exp Exponent of output elements of xs3_vect_s16_mul()
[out] a_shr Right-shift supplied to xs3_vect_s16_mul()
[in] b_exp Exponent associated with \(\bar b\)
[in] c_exp Exponent associated with \(\bar c\)
[in] b_hr Headroom of \(\bar b\)
[in] c_hr Headroom of \(\bar c\)
See also
xs3_vect_s16_mul

◆ xs3_vect_s16_scale_prepare()

void xs3_vect_s16_scale_prepare ( exponent_t *  a_exp,
right_shift_t *  a_shr,
const exponent_t  b_exp,
const exponent_t  c_exp,
const headroom_t  b_hr,
const headroom_t  c_hr 
)

Obtain the output exponent and output shift used by xs3_vect_s16_scale().

This function is used in conjunction with xs3_vect_s16_scale() to perform multiplication of a 16-bit BFP vector \(\bar{b} \cdot 2^{b\_exp}\) by a 16-bit scalar \(c \cdot 2^{c\_exp}\). The result is another 16-bit BFP vector \(\bar{a} \cdot 2^{a\_exp}\).

This function computes a_exp and a_shr.

a_exp is the exponent associated with mantissa vector \(\bar a\), and must be chosen to be large enough to avoid overflow when elements of \(\bar a\) are computed. To maximize precision, this function chooses a_exp to be the smallest exponent known to avoid saturation (see the exception under Notes below). The a_exp chosen by this function is derived from the exponents and headrooms associated with the inputs.

a_shr is an arithmetic right-shift applied by xs3_vect_s16_scale() to the 32-bit products of input elements to achieve the chosen output exponent a_exp.
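To make the relationship between a_shr and a_exp concrete, here is a scalar reference model of the vector-by-scalar multiplication. It is a sketch under stated assumptions, not the library routine: the names scale_s16_model and scale_s16_model_exp are hypothetical, rounding is not modeled (the shift floors), and a_shr is assumed to lie in 0..31. Because each 32-bit product carries the exponent b_exp + c_exp, shifting it right by a_shr bits gives a result with exponent b_exp + c_exp + a_shr.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model: multiply each mantissa b[k] by the scalar
 * mantissa c, shift the 32-bit product right by a_shr bits, and saturate
 * symmetrically to 16 bits. */
static void scale_s16_model(int16_t *a, const int16_t *b, size_t n,
                            int16_t c, unsigned a_shr)
{
    for (size_t k = 0; k < n; ++k) {
        int32_t p = (int32_t)b[k] * (int32_t)c;
        /* Portable floor shift; avoids >> on a negative value. */
        int32_t q = (p >= 0) ? (p >> a_shr) : ~((~p) >> a_shr);
        if (q > INT16_MAX) {
            q = INT16_MAX;
        }
        if (q < -INT16_MAX) {
            q = -INT16_MAX;
        }
        a[k] = (int16_t)q;
    }
}

/* Exponent of the shifted products: a_exp = b_exp + c_exp + a_shr. */
static int scale_s16_model_exp(int b_exp, int c_exp, unsigned a_shr)
{
    return b_exp + c_exp + (int)a_shr;
}
```

For example, with b = {16384, -8192} at b_exp = -14 (values 1.0 and -0.5), c = 16384 at c_exp = -15 (value 0.5) and a_shr = 14, the model yields mantissas {16384, -8192} with exponent -15, i.e. 0.5 and -0.25.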

b_exp and c_exp are the exponents associated with \(\bar b\) and \(c\) respectively.

b_hr and c_hr are the headroom of \(\bar b\) and \(c\) respectively. If the headroom of \(\bar b\) or \(c\) is unknown, it can be obtained by calling xs3_vect_s16_headroom(). Alternatively, the value 0 can always be safely used (but may result in reduced precision).

Adjusting Output Exponents

If a specific output exponent desired_exp is needed for the result (e.g. for emulating fixed-point arithmetic), the a_shr produced by this function can be adjusted according to the following:

exponent_t a_exp;
right_shift_t a_shr;
xs3_vect_s16_scale_prepare(&a_exp, &a_shr, b_exp, c_exp, b_hr, c_hr);
exponent_t desired_exp = ...; // Value known a priori
a_shr = a_shr + (desired_exp - a_exp); // Shift further right for each step up in exponent
a_exp = desired_exp;

When applying the above adjustment, the following condition should be maintained:

  • a_shr >= 0

Be aware that using a smaller value than strictly necessary for a_shr can result in saturation, and using larger values may result in unnecessary underflows or loss of precision.

Notes

  • Using the outputs of this function, an output mantissa which would otherwise be INT16_MIN will instead saturate to -INT16_MAX. This is due to the symmetric saturation logic employed by the VPU and is a hardware feature. This corner case is unlikely and results in 1 LSb of error when it occurs.
Parameters
[out] a_exp Exponent of output elements of xs3_vect_s16_scale()
[out] a_shr Right-shift supplied to xs3_vect_s16_scale()
[in] b_exp Exponent associated with \(\bar b\)
[in] c_exp Exponent associated with \(c\)
[in] b_hr Headroom of \(\bar b\)
[in] c_hr Headroom of \(c\)
See also
xs3_vect_s16_scale

◆ xs3_vect_s16_sqrt_prepare()

void xs3_vect_s16_sqrt_prepare ( exponent_t *  a_exp,
right_shift_t *  b_shr,
const exponent_t  b_exp,
const right_shift_t  b_hr 
)

Obtain the output exponent and shift parameter used by xs3_vect_s16_sqrt().

This function is used in conjunction with xs3_vect_s16_sqrt() to compute the square root of elements of a 16-bit BFP vector.

This function computes a_exp and b_shr.

a_exp is the exponent associated with output mantissa vector \(\bar a\), and should be chosen to maximize the precision of the results. To that end, this function chooses a_exp to be the smallest exponent known to avoid saturation of the resulting mantissa vector \(\bar a\). It is derived from the exponent and headroom of the input BFP vector.

b_shr is the shift parameter required by xs3_vect_s16_sqrt() to achieve the chosen output exponent a_exp.

b_exp is the exponent associated with the input mantissa vector \(\bar b\).

b_hr is the headroom of \(\bar b\). If it is unknown, it can be obtained using xs3_vect_s16_headroom(). Alternatively, the value 0 can always be safely used (but may result in reduced precision).

Adjusting Output Exponents

If a specific output exponent desired_exp is needed for the result (e.g. for emulating fixed-point arithmetic), the b_shr produced by this function can be adjusted according to the following:

exponent_t a_exp;
right_shift_t b_shr;
xs3_vect_s16_sqrt_prepare(&a_exp, &b_shr, b_exp, b_hr);
exponent_t desired_exp = ...; // Value known a priori
b_shr = b_shr + (desired_exp - a_exp);
a_exp = desired_exp;

When applying the above adjustment, the following condition should be maintained:

  • b_hr + b_shr >= 0

Be aware that using smaller values than strictly necessary for b_shr can result in saturation, and using larger values may result in unnecessary underflows or loss of precision.

Also, if a larger exponent is used than necessary, a larger depth parameter (see xs3_vect_s16_sqrt()) will be required to achieve the same precision, as the results are computed bit by bit, starting with the most significant bit.
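The depth trade-off can be illustrated with a generic bit-by-bit integer square root. This is only a sketch of the general technique, not the library's implementation; the name isqrt_depth_sketch is hypothetical, and the meaning given to depth here (how many of the 16 result bits are resolved, starting from the most significant) is an assumption made for the example.

```c
#include <stdint.h>

/* Generic bit-by-bit integer square root. Only the `depth` most significant
 * of the 16 result bits are resolved; the remaining low bits stay zero. */
static uint16_t isqrt_depth_sketch(uint32_t x, unsigned depth)
{
    uint32_t r = 0;
    for (unsigned i = 0; i < depth && i < 16; ++i) {
        uint32_t trial = r | (UINT32_C(1) << (15 - i));
        /* Keep the bit if the trial root does not overshoot x. */
        if ((uint64_t)trial * trial <= x) {
            r = trial;
        }
    }
    return (uint16_t)r;
}
```

With x = 1000000 the exact root is 1000, whose highest set bit is bit 9. A depth of 16 returns 1000 and a depth of 8 returns 768, while a depth of 6 returns 0 because every iteration is spent on leading zero bits. A larger-than-necessary exponent shrinks the mantissas in the same way, so more iterations are needed to reach the significant bits.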

Parameters
[out] a_exp Exponent of outputs of xs3_vect_s16_sqrt()
[out] b_shr Right-shift to be applied to elements of \(\bar b\)
[in] b_exp Exponent of \(\bar b\)
[in] b_hr Headroom of \(\bar b\)
See also
xs3_vect_s16_sqrt