# 32-Bit Vector Chunk (8-Element) API

Functions

int32_t chunk_s32_dot(const int32_t b[VPU_INT32_EPV], const q2_30 c[VPU_INT32_EPV])

Compute the inner product between two vector chunks.

This function computes the inner product of two vector chunks, $$\bar b$$ and $$\bar c$$.

Conceptually, elements of $$\bar b$$ may have any number of fractional bits (integers, fixed-point values, or mantissas of a BFP vector), as long as every element uses the same number. Elements of $$\bar c$$ are Q2.30 fixed-point values. Because each product is scaled by $$2^{-30}$$, the returned value $$a$$ has the same number of fractional bits as $$\bar b$$.

Only the lowest 32 bits of the sum $$a$$ are returned.

Operation Performed

\begin{aligned} & a \leftarrow \sum_{k=0}^{\mathtt{VPU\_INT32\_EPV}-1} \left( round\left( \frac{b_k\cdot{}c_k}{2^{30}} \right) \right) \end{aligned}

Parameters:
• b[in] Input chunk $$\bar b$$

• c[in] Input chunk $$\bar c$$

Returns:

$$a$$
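The operation above can be modelled in portable C for checking expected results on a host machine. This is a minimal sketch, not the library's implementation: `ref_chunk_s32_dot` and `round_q30` are hypothetical names, `VPU_INT32_EPV` is assumed to be 8 (per the page title), `q2_30` is assumed to be an `int32_t`, and halves are assumed to round toward positive infinity.

```c
#include <assert.h>
#include <stdint.h>

#define VPU_INT32_EPV 8   /* assumed: 8 elements per chunk, per the page title */
typedef int32_t q2_30;    /* assumed: Q2.30 value stored in an int32_t */

/* round(x / 2^30), rounding halves toward +infinity (assumed rounding mode). */
static int64_t round_q30(int64_t x)
{
    const int64_t one = (int64_t)1 << 30;
    int64_t y = x + (one >> 1);
    int64_t q = y / one;
    if (y % one < 0) q -= 1;  /* C division truncates; step down to get floor */
    return q;
}

/* Hypothetical reference model of the documented operation (not library code). */
static int32_t ref_chunk_s32_dot(const int32_t b[VPU_INT32_EPV],
                                 const q2_30 c[VPU_INT32_EPV])
{
    assert(b && c);
    int64_t acc = 0;
    for (int k = 0; k < VPU_INT32_EPV; k++)
        acc += round_q30((int64_t)b[k] * c[k]);
    /* Keep only the lowest 32 bits of the sum, as the text specifies. */
    return (int32_t)(uint32_t)acc;
}
```

With every element of `c` set to `1 << 30` (1.0 in Q2.30), the model returns the plain sum of `b`, which shows why the result keeps the fractional-bit count of $$\bar b$$.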

void chunk_s32_log(q8_24 a[VPU_INT32_EPV], const int32_t b[VPU_INT32_EPV], const exponent_t b_exp)

Compute the natural log of a vector chunk of 32-bit values.

This function computes the natural logarithm of each of the 8 elements in vector chunk $$\bar b$$. The result is returned as an 8-element chunk $$\bar a$$ of Q8.24 values.

b_exp is the exponent associated with elements of $$\bar b$$.

Any input $$b_k \le 0$$ will result in a corresponding output $$a_k = \mathtt{INT32\_MIN}$$.

Operation Performed

\begin{split}\begin{aligned} & a_k \leftarrow \ \begin{cases} log(b_k\cdot{}2^{\mathtt{b\_exp}}) & b_k > 0 \\ \mathtt{INT32\_MIN} & \text{otherwise} \\ \end{cases} \\ & \qquad\text{for }k \in {0..\mathtt{VPU\_INT32\_EPV}-1} \end{aligned}\end{split}

Parameters:
• a[out] Output vector chunk $$\bar a$$

• b[in] Input vector chunk $$\bar b$$

• b_exp[in] Exponent associated with $$\bar b$$

Exceptions:

Raised if b or a is not double-word-aligned (see Note: Vector Alignment).
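For host-side comparison, the documented behaviour can be approximated in floating point. The sketch below is not the library's fixed-point algorithm: `ref_chunk_s32_log` and `ln_pos` are hypothetical names, `q8_24` is assumed to be an `int32_t`, `exponent_t` is assumed to be an `int`, and the result is assumed to fit the Q8.24 range. It uses the identity $$\ln(b_k\cdot 2^{\mathtt{b\_exp}}) = \ln(b_k) + \mathtt{b\_exp}\cdot\ln 2$$ and avoids the C math library.

```c
#include <assert.h>
#include <stdint.h>

#define VPU_INT32_EPV 8    /* assumed chunk length */
typedef int32_t q8_24;     /* assumed: Q8.24 value stored in an int32_t */
typedef int exponent_t;    /* assumed underlying type */

#define LN2 0.6931471805599453

/* ln(x) for x > 0: scale x into [1,2), then sum the series
 * ln(m) = 2 * (t + t^3/3 + t^5/5 + ...), with t = (m-1)/(m+1). */
static double ln_pos(double x)
{
    assert(x > 0.0);
    int e = 0;
    while (x >= 2.0) { x *= 0.5; e++; }
    while (x < 1.0)  { x *= 2.0; e--; }
    const double t = (x - 1.0) / (x + 1.0);
    const double t2 = t * t;
    double term = t, sum = 0.0;
    for (int n = 1; n < 40; n += 2) {
        sum += term / n;
        term *= t2;
    }
    return 2.0 * sum + e * LN2;
}

/* Hypothetical floating-point model of the documented operation. */
static void ref_chunk_s32_log(q8_24 a[VPU_INT32_EPV],
                              const int32_t b[VPU_INT32_EPV],
                              const exponent_t b_exp)
{
    for (int k = 0; k < VPU_INT32_EPV; k++) {
        if (b[k] <= 0) {          /* log undefined: documented sentinel */
            a[k] = INT32_MIN;
            continue;
        }
        const double v = ln_pos((double)b[k]) + b_exp * LN2;
        const double scaled = v * 16777216.0;   /* 2^24; assumes |v| < 128 */
        a[k] = (q8_24)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
    }
}
```

For example, an input of 2 with `b_exp` equal to 0 yields 11629080 in this model, which is $$\ln 2$$ rounded to 24 fractional bits.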

void chunk_float_s32_log(q8_24 a[VPU_INT32_EPV], const float_s32_t b[VPU_INT32_EPV])

Compute the natural log of a vector chunk of float_s32_t.

This function computes the natural logarithm of each of the VPU_INT32_EPV elements in vector chunk $$\bar b$$. The result is returned as an 8-element chunk $$\bar a$$ of Q8.24 values.

Any input $$b_k \le 0$$ will result in a corresponding output $$a_k = \mathtt{INT32\_MIN}$$.

Operation Performed

\begin{split}\begin{aligned} & a_k \leftarrow \ \begin{cases} log(b_k) & b_k > 0 \\ \mathtt{INT32\_MIN} & \text{otherwise} \\ \end{cases} \\ & \qquad\text{for }k \in {0..\mathtt{VPU\_INT32\_EPV}-1} \end{aligned}\end{split}

Parameters:
• a[out] Output vector chunk $$\bar a$$

• b[in] Input vector chunk $$\bar b$$

Exceptions:

Raised if b or a is not double-word-aligned (see Note: Vector Alignment).
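On the caller's side, the Q8.24 output usually needs decoding, and the `INT32_MIN` sentinel needs to be told apart from a real result. The following is a minimal sketch of one way to do that. `decode_log_chunk` and `q8_24_to_double` are hypothetical helpers, and `q8_24` is assumed to be an `int32_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VPU_INT32_EPV 8   /* assumed chunk length */
typedef int32_t q8_24;    /* assumed: Q8.24 value stored in an int32_t */

/* Convert a Q8.24 value to a real number by dividing by 2^24. */
static double q8_24_to_double(q8_24 x)
{
    return (double)x / 16777216.0;
}

/* Decode a chunk produced by one of the log functions.
 * valid[k] is false where the input was <= 0 (output is INT32_MIN);
 * out[k] is set to 0.0 for those elements.
 * Returns the number of invalid elements. */
static unsigned decode_log_chunk(double out[VPU_INT32_EPV],
                                 bool valid[VPU_INT32_EPV],
                                 const q8_24 a[VPU_INT32_EPV])
{
    assert(out && valid && a);
    unsigned invalid = 0;
    for (int k = 0; k < VPU_INT32_EPV; k++) {
        valid[k] = (a[k] != INT32_MIN);
        out[k] = valid[k] ? q8_24_to_double(a[k]) : 0.0;
        if (!valid[k]) invalid++;
    }
    return invalid;
}
```

Checking the sentinel first matters because `INT32_MIN` would otherwise decode to -128.0, which looks like a legitimate Q8.24 value.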

void chunk_q30_power_series(int32_t a[VPU_INT32_EPV], const q2_30 b[VPU_INT32_EPV], const int32_t c[], const unsigned term_count)

Compute a power series on a vector chunk of Q2.30 values.

This function evaluates a power series on a vector chunk (VPU_INT32_EPV-element vector) $$\bar b$$ of Q2.30 values. $$\bar c$$ holds the coefficients that multiply successive powers of $$\bar b$$, and may have any associated exponent. The output vector chunk $$\bar a$$ has the same exponent as $$\bar c$$.

c[] is an array with shape (term_count, VPU_INT32_EPV), where the second axis contains the same value replicated across all VPU_INT32_EPV elements. That is, c[k][i] = c[k][j] for i and j in 0..(VPU_INT32_EPV-1). This is for performance reasons. (For the purpose of this explanation, $$\bar c$$ is considered to be single-dimensional, without redundancy.)

Operation Performed

\begin{split}\begin{aligned} & b_{k,0} = 2^{30} \\ & b_{k,i} = round\left(\frac{b_{k,i-1}\cdot{}b_k}{2^{30}}\right) \\ & \qquad\text{for }i \in {1..(N-1)} \\ & a_k \leftarrow \sum_{i=0}^{N-1} round\left( \frac{b_{k,i}\cdot c_i}{2^{30}} \right) \\ & \qquad\text{for }k \in {0..\mathtt{VPU\_INT32\_EPV}-1} \end{aligned}\end{split}

Parameters:
• a[out] Output vector chunk $$\bar a$$

• b[in] Input vector chunk $$\bar b$$

• c[in] Coefficient vector $$\bar c$$

• term_count[in] Number of power series terms, $$N$$
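The coefficient layout and the recurrence above can be sketched in portable C as follows. This is a reference model under stated assumptions rather than the library's code: `replicate_coeffs`, `ref_chunk_q30_power_series` and `round_q30` are hypothetical names, `VPU_INT32_EPV` is assumed to be 8, `q2_30` is assumed to be an `int32_t`, and halves are assumed to round toward positive infinity.

```c
#include <assert.h>
#include <stdint.h>

#define VPU_INT32_EPV 8   /* assumed chunk length */
typedef int32_t q2_30;    /* assumed: Q2.30 value stored in an int32_t */

/* round(x / 2^30), rounding halves toward +infinity (assumed rounding mode). */
static int64_t round_q30(int64_t x)
{
    const int64_t one = (int64_t)1 << 30;
    int64_t y = x + (one >> 1);
    int64_t q = y / one;
    if (y % one < 0) q -= 1;  /* C division truncates; step down to get floor */
    return q;
}

/* Build the (n, VPU_INT32_EPV) table: each coefficient repeated across a row. */
static void replicate_coeffs(int32_t c[], const int32_t coef[], unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        for (unsigned k = 0; k < VPU_INT32_EPV; k++)
            c[i * VPU_INT32_EPV + k] = coef[i];
}

/* Hypothetical reference model of the documented operation. */
static void ref_chunk_q30_power_series(int32_t a[VPU_INT32_EPV],
                                       const q2_30 b[VPU_INT32_EPV],
                                       const int32_t c[],
                                       const unsigned term_count)
{
    assert(a && b && c);
    for (unsigned k = 0; k < VPU_INT32_EPV; k++) {
        int64_t bpow = (int64_t)1 << 30;  /* b_{k,0} = 2^30, i.e. 1.0 in Q2.30 */
        int64_t acc = 0;
        for (unsigned i = 0; i < term_count; i++) {
            acc += round_q30(bpow * c[i * VPU_INT32_EPV + k]);
            bpow = round_q30(bpow * b[k]);  /* next power, b_{k,i+1} */
        }
        a[k] = (int32_t)(uint32_t)acc;      /* keep the low 32 bits */
    }
}
```

As a usage example, coefficients `{3 << 16, 2 << 16, 1 << 16}` (exponent -16) describe $$3 + 2x + x^2$$. For an input of 0.5 (`1 << 29`) the model gives 278528, which is 4.25 at exponent -16.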

void chunk_q30_exp_small(q2_30 a[VPU_INT32_EPV], const q2_30 b[VPU_INT32_EPV])

Compute $$e^b$$ on a vector chunk of Q2.30 values.

This function computes $$e^{b_k}$$ for each element of a vector chunk (VPU_INT32_EPV-element vector) $$\bar b$$ of Q2.30 values near $$0$$. The result is computed using the power series approximation of $$e^x$$ near zero. It is recommended that this function only be used for $$-0.5 \le b_k\cdot{}2^{-30} \le 0.5$$.

The output vector chunk $$\bar a$$ is also in a Q2.30 format.

Operation Performed

\begin{split}\begin{aligned} & a_k \leftarrow e^{b_k\cdot{}2^{-30}} \\ & \qquad\text{for }k \in {0..\mathtt{VPU\_INT32\_EPV}-1} \end{aligned}\end{split}

Parameters:
• a[out] Output vector chunk $$\bar a$$

• b[in] Input vector chunk $$\bar b$$
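To illustrate the power series approach and the recommended input range, the sketch below evaluates a truncated Taylor series of $$e^x$$ in Q2.30 integer arithmetic. The coefficients $$1/i!$$, the term count of 12, and the rounding mode are assumptions made for illustration; the source does not state which coefficients or how many terms the library uses. `ref_chunk_q30_exp_small`, `exp_small_in_range` and `round_q30` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VPU_INT32_EPV 8   /* assumed chunk length */
#define EXP_TERMS 12      /* assumed number of series terms */
typedef int32_t q2_30;    /* assumed: Q2.30 value stored in an int32_t */

/* round(x / 2^30), rounding halves toward +infinity (assumed rounding mode). */
static int64_t round_q30(int64_t x)
{
    const int64_t one = (int64_t)1 << 30;
    int64_t y = x + (one >> 1);
    int64_t q = y / one;
    if (y % one < 0) q -= 1;  /* C division truncates; step down to get floor */
    return q;
}

/* True if b lies in the recommended input range [-0.5, 0.5] (Q2.30). */
static bool exp_small_in_range(q2_30 b)
{
    return b >= -(1 << 29) && b <= (1 << 29);
}

/* Hypothetical model: sum of b^i / i! for i = 0..EXP_TERMS-1, in Q2.30.
 * Only meaningful while the result stays below 2.0, which holds for the
 * recommended input range. */
static void ref_chunk_q30_exp_small(q2_30 a[VPU_INT32_EPV],
                                    const q2_30 b[VPU_INT32_EPV])
{
    assert(a && b);
    for (int k = 0; k < VPU_INT32_EPV; k++) {
        int64_t bpow = (int64_t)1 << 30;   /* b^0 = 1.0 in Q2.30 */
        int64_t fact = 1;                  /* i! */
        int64_t acc = 0;
        for (int i = 0; i < EXP_TERMS; i++) {
            if (i > 0) fact *= i;
            /* Q2.30 coefficient: round(2^30 / i!) */
            const int64_t coef = (((int64_t)1 << 30) + fact / 2) / fact;
            acc += round_q30(bpow * coef);
            bpow = round_q30(bpow * b[k]);
        }
        a[k] = (q2_30)acc;
    }
}
```

An input of 0 yields exactly `1 << 30` (1.0 in Q2.30) in this model. Inputs outside the recommended range can be screened with `exp_small_in_range` before calling the library function.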