Mixed-Precision Vector API
- group vect_mixed_api
void mat_mul_s8_x_s16_yield_s32(int32_t output[], const int8_t matrix[], const int16_t input_vect[], const unsigned M_rows, const unsigned N_cols, int8_t scratch[])
Multiply an 8-bit matrix by a 16-bit vector for a 32-bit result vector.
This function multiplies an 8-bit \(M \times N\) matrix \(\bar W\) by a 16-bit \(N\)-element column vector \(\bar v\) and returns the result as a 32-bit \(M\)-element vector \(\bar a\).
output is the output vector \(\bar a\).
matrix is the matrix \(\bar W\).
input_vect is the vector \(\bar v\).
matrix and input_vect must both begin at word-aligned offsets.
M_rows and N_cols are the dimensions \(M\) and \(N\) of matrix \(\bar W\). \(M\) must be a multiple of 16, and \(N\) must be a multiple of 32.
scratch is a pointer to a word-aligned buffer that this function may use to store intermediate results. This buffer must be at least \(N\) bytes long.
The result of this multiplication is exact, so long as saturation does not occur.
output – [inout] The output vector \(\bar a\)
matrix – [in] The weight matrix \(\bar W\)
input_vect – [in] The input vector \(\bar v\)
M_rows – [in] The number of rows \(M\) in matrix \(\bar W\)
N_cols – [in] The number of columns \(N\) in matrix \(\bar W\)
scratch – [in] Scratch buffer required by this function.
- Throws ET_LOAD_STORE:
Raised if matrix or input_vect is not word-aligned (See Note: Vector Alignment)
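As a rough illustration of the arithmetic and preconditions described for this function, the following portable C sketch models the same matrix-vector product. The name ref_mat_mul_s8_x_s16, the row-major matrix layout, the 64-bit accumulator, the clamping bounds and the return codes are all assumptions made for this example; none of them come from the library, and the sketch does not use the scratch buffer.

```c
#include <stdint.h>

/* Hypothetical reference model, NOT the library implementation.
 * Assumes W is stored row-major: W[m][n] == matrix[m * N_cols + n].
 * Returns 0 if exact, 1 if any element had to be clamped, <0 on bad input. */
static int ref_mat_mul_s8_x_s16(int32_t output[], const int8_t matrix[],
                                const int16_t input_vect[],
                                unsigned M_rows, unsigned N_cols)
{
    /* Documented constraints: M a multiple of 16, N a multiple of 32. */
    if (M_rows % 16 != 0 || N_cols % 32 != 0)
        return -1;
    /* Documented constraint: inputs start at word-aligned (4-byte) addresses. */
    if ((uintptr_t)matrix % 4 != 0 || (uintptr_t)input_vect % 4 != 0)
        return -2;

    int clamped = 0;
    for (unsigned m = 0; m < M_rows; m++) {
        int64_t acc = 0; /* wide accumulator so the sum itself cannot overflow */
        for (unsigned n = 0; n < N_cols; n++)
            acc += (int32_t)matrix[m * N_cols + n] * (int32_t)input_vect[n];

        /* Assumed clamping bounds; the real saturation limits may differ. */
        if (acc > INT32_MAX) { acc = INT32_MAX; clamped = 1; }
        if (acc < INT32_MIN) { acc = INT32_MIN; clamped = 1; }
        output[m] = (int32_t)acc;
    }
    return clamped;
}
```

A return value of 0 corresponds to the case in which the result is exact; a model like this can be handy for checking the output of the optimized function on a host machine.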
unsigned vect_sXX_add_scalar(int32_t a[], const int32_t b[], const unsigned length_bytes, const int32_t c, const int32_t d, const right_shift_t b_shr, const unsigned mode_bits)
Add a scalar to a vector.
This works for 8-, 16- or 32-bit vectors, real or complex.
length_bytes is the total number of bytes to be output. So, for 16-bit vectors, length_bytes is twice the number of elements, whereas for complex 32-bit vectors, length_bytes is 8 times the number of elements.
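The byte-count rule above can be sketched as a small helper. The function name vect_length_bytes and its parameters are hypothetical and are not part of the library.

```c
/* Hypothetical helper (not part of the library): total number of output
 * bytes for a vector of n_elements, given the element width in bits and
 * whether each element is complex (real + imaginary part). */
static unsigned vect_length_bytes(unsigned n_elements, unsigned elem_bits,
                                  int is_complex)
{
    unsigned bytes_per_element = elem_bits / 8; /* 1, 2 or 4 */
    if (is_complex)
        bytes_per_element *= 2;                 /* two parts per element */
    return n_elements * bytes_per_element;
}
```

For example, 10 real 16-bit elements give 20 bytes, and 10 complex 32-bit elements give 80 bytes.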
c and d are the values that populate the internal buffer to be added to the input vector, as follows: internally, an 8-word (32-byte) buffer is allocated (on the stack). Even-indexed words are populated with c and odd-indexed words are populated with d. For real vectors, c and d should be the same value; d exists only to allow this same function to work for complex 32-bit vectors. This also means that for 16-bit vectors, the value to be added needs to be duplicated in both the higher 2 bytes and lower 2 bytes of the word.
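To make the buffer layout concrete, the sketch below packs a 16-bit scalar into both halves of a word and fills an 8-word buffer in the even/odd pattern described above. Both function names are hypothetical; this models the description only and is not the library's internal code.

```c
#include <stdint.h>

/* Hypothetical helper: place the same 16-bit value in the upper and lower
 * 2 bytes of a 32-bit word, as described for 16-bit vectors. */
static int32_t pack_s16_pair(int16_t x)
{
    /* Upper half: x shifted up by 16 bits; lower half: x's 16-bit pattern. */
    return (int32_t)x * 65536 + (int32_t)(uint16_t)x;
}

/* Model of the 8-word buffer: even-indexed words hold c,
 * odd-indexed words hold d. */
static void fill_scalar_buffer(int32_t buf[8], int32_t c, int32_t d)
{
    for (int i = 0; i < 8; i++)
        buf[i] = (i % 2 == 0) ? c : d;
}
```

For a real 16-bit vector, the same packed word would be passed for both values, for instance `fill_scalar_buffer(buf, pack_s16_pair(x), pack_s16_pair(x))`.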
mode_bits should be 0x0000 for 32-bit mode, 0x0100 for 16-bit mode or 0x0200 for 8-bit mode.
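The three mode values listed above could be wrapped in a small lookup so that callers select them by element width. The helper name mode_bits_for_width and its error value are hypothetical and are not part of the library.

```c
/* Hypothetical helper: map an element width in bits to the mode value
 * listed above. Returns ~0u for a width that is not 8, 16 or 32. */
static unsigned mode_bits_for_width(unsigned elem_bits)
{
    switch (elem_bits) {
    case 32: return 0x0000u;
    case 16: return 0x0100u;
    case 8:  return 0x0200u;
    default: return ~0u; /* unsupported width */
    }
}
```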