Matrix multiplication is a powerful and flexible way of transforming data. It encompasses numerous operations—averaging, filtering, re-routing, dimensionality changes, geometric transformations, and more. Below are several intuitive ways to see what matrix multiplication can do and how to interpret it in familiar contexts.
Each output element is a weighted sum of the input elements.
$$ y_i = \sum_{j} \mathbf{W}_{ij} \, x_j \quad \implies \quad\mathbf{y} = \mathbf{W} \, \mathbf{x}. $$
Example: Suppose you have a 3-element input $\mathbf{x}$ representing three sensor readings $[x_1, x_2, x_3]$. If you want a scalar output $y_1$ that is the average of all three sensors, then
$$ y_1 = \tfrac13 (x_1 + x_2 + x_3). $$
In matrix form,
$$ \mathbf{W} = \begin{bmatrix}\frac13 & \frac13 & \frac13\end{bmatrix}, $$
so that
$$ y_1 = \begin{bmatrix} \frac13 & \frac13 & \frac13 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}. $$
This basic perspective generalizes to all the more complex operations below.
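The weighted-sum view can be sketched in a few lines of plain Python. The helper name `matvec` and the sensor values are assumptions made for illustration; in practice a library such as NumPy would compute the same product with `W @ x`.

```python
def matvec(W, x):
    """Return y = W x, where y[i] = sum over j of W[i][j] * x[j]."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

# Hypothetical sensor readings, chosen only for illustration.
x = [3.0, 6.0, 9.0]

# 1x3 averaging matrix: a single row with equal weights of 1/3.
W = [[1 / 3, 1 / 3, 1 / 3]]

y = matvec(W, x)
print(y)  # one output element, approximately the mean of the three readings
```

Adding more rows to `W` produces more output elements, each with its own set of weights over the same input.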
Often in audio or image processing, different channels (e.g., 3 color channels $\text{RGB}$, or multiple microphones) need to be blended or separated.
Blending Colors: In an image with RGB channels, a $3 \times 3$ matrix can remix the three color channels into three new ones. Converting from RGB to grayscale is the special case where a $1 \times 3$ matrix weights red, green, and blue differently to yield a single channel.
A common luminance-based grayscale conversion uses weights based on the ITU-R BT.601 luma coefficients:
$$ y_\text{gray} = \begin{bmatrix} 0.2989 & 0.5870 & 0.1140 \end{bmatrix} \begin{bmatrix} \left[x_\text{R}\right] \\ \left[x_\text{G}\right] \\ \left[x_\text{B}\right] \end{bmatrix} $$
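As a rough sketch, the same $1 \times 3$ weighting can be applied to each pixel in turn. The pixel values, the `to_gray` helper, and the tuple-per-pixel layout are assumptions for illustration, not a prescribed image format.

```python
# Weights for red, green, and blue, taken from the formula above.
LUMA_WEIGHTS = [0.2989, 0.5870, 0.1140]

def to_gray(pixel):
    """Collapse one (R, G, B) pixel to a single gray value via a weighted sum."""
    return sum(w * c for w, c in zip(LUMA_WEIGHTS, pixel))

# Hypothetical pixels: pure red, pure green, pure blue, and white.
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]

gray = [to_gray(p) for p in pixels]
print(gray)
```

With these weights, pure green maps to a brighter gray than pure red, and pure blue to the darkest of the three, which reflects how differently the three channels contribute to the result.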
Audio Stereo to Mono: To combine left and right channels into a single mono channel, you can multiply by a $1 \times 2$ matrix like $\left[0.5 \quad 0.5\right]$.
$$ y_\text{mono} = \begin{bmatrix} 0.5& 0.5 \end{bmatrix} \begin{bmatrix} \left[x_\text{left}\right] \\ \left[x_\text{right}\right] \end{bmatrix} $$
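A minimal sketch of the stereo-to-mono mix, assuming the two channels are stored as equal-length Python lists (the sample values and variable names are made up for illustration):

```python
# Hypothetical left and right channel samples.
left = [0.2, 0.4, -0.6]
right = [0.0, 0.2, 0.6]

# The 1x2 mixing matrix from the formula above.
M = [0.5, 0.5]

# Apply the same 1x2 matrix to every (left, right) sample pair.
mono = [M[0] * l + M[1] * r for l, r in zip(left, right)]
print(mono)
```

Changing the entries of `M` changes the balance; for example, `[1.0, 0.0]` would keep only the left channel.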
Simple Averaging Downsample: You’ve already seen the example of taking each pair of consecutive samples and averaging them. In 1D (for audio or time-series), a matrix can be formed so that each row picks two adjacent samples from the input and averages them.
$$ \mathbf{y} = \frac{1}{2} \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ \end{bmatrix} = \begin{bmatrix} \frac{x_1 + x_2}{2} \\ \frac{x_3 + x_4}{2} \end{bmatrix}. $$
This effectively halves the signal length, creating an output of dimension 2 from an input of dimension 4.
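The $2 \times 4$ matrix above extends to any even input length. The sketch below builds such a matrix and applies it; `averaging_matrix`, `matvec`, and the sample values are hypothetical names and data used only for illustration.

```python
def averaging_matrix(n):
    """Build an (n/2) x n matrix whose row i averages samples 2i and 2i+1."""
    if n % 2 != 0:
        raise ValueError("input length must be even")
    rows = []
    for i in range(n // 2):
        row = [0.0] * n
        row[2 * i] = 0.5
        row[2 * i + 1] = 0.5
        rows.append(row)
    return rows

def matvec(W, x):
    """Return y = W x as a list."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

x = [1.0, 3.0, 5.0, 9.0]   # hypothetical 4-sample signal
D = averaging_matrix(len(x))
y = matvec(D, x)
print(y)  # [2.0, 7.0]
```

Each row contains exactly two nonzero entries, so the matrix is mostly zeros; for long signals a direct pairwise average would be cheaper than a full matrix product, but the matrix form makes the structure of the operation explicit.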
Spreading and Duplicating: A matrix can also replicate a downsampled signal multiple times. For instance, repeating the averaging rows in the matrix duplicates the averaged samples, so the output contains repeated blocks of the same information.
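One way to read this, sketched below, is to stack the averaging matrix on top of itself so the output holds two copies of the downsampled signal. The stacking layout, the `matvec` helper, and the sample values are assumptions for illustration; other duplication patterns (such as repeating each row in place) work the same way.

```python
def matvec(W, x):
    """Return y = W x as a list."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

# 2x4 averaging matrix: each row averages one pair of adjacent samples.
D = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]

# Concatenating the row lists stacks D on itself, giving a 4x4 matrix.
S = D + D

x = [1.0, 3.0, 5.0, 9.0]   # hypothetical 4-sample signal
y = matvec(S, x)
print(y)  # [2.0, 7.0, 2.0, 7.0]
```

The output has the same length as the input here, but it carries only two distinct values: the information was reduced by the averaging and then copied, not recovered.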