In count regression, we aim to predict how many times an event happens in a fixed window, meaning the target is a non-negative integer $y^{(i)} \in \{0, 1, 2, \dots\}$.
Count regression is widely used across various fields to model phenomena where the outcome is a discrete count of events. Common real-world examples include:
The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space.
It is built on the assumption that these events occur with a known constant mean rate and independently of the time since the last event. If a random variable $Y$ follows a Poisson distribution with a rate parameter $\lambda$, the probability of observing exactly $k$ events is given by the probability mass function (PMF):
$$ P(Y = k) = \frac{\lambda^k e^{-\lambda}}{k!} $$
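Where $k \in \{0, 1, 2, \dots\}$ is the number of observed events, $\lambda > 0$ is the rate (the expected number of events in the interval), $e$ is Euler's number, and $k!$ is the factorial of $k$.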
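As a quick numerical check of the PMF above, the following minimal sketch evaluates it with the Python standard library. The function name `poisson_pmf` and the example rate of 3.0 are illustrative choices, not part of the original text, and the formula is evaluated in log space only to avoid overflow in $\lambda^k$ and $k!$ for large $k$.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(Y = k) for Y ~ Poisson(lam).

    Computes k*log(lam) - lam - log(k!) and exponentiates,
    which is algebraically the same as lam**k * exp(-lam) / k!.
    """
    if lam <= 0:
        raise ValueError("the rate lam must be positive")
    if k < 0:
        return 0.0  # negative counts have zero probability
    log_p = k * math.log(lam) - lam - math.lgamma(k + 1)
    return math.exp(log_p)


if __name__ == "__main__":
    lam = 3.0  # illustrative rate, chosen arbitrarily
    probs = [poisson_pmf(k, lam) for k in range(50)]
    print("P(Y = 2):", probs[2])
    print("sum of probabilities for k < 50:", sum(probs))
    print("mean for k < 50:", sum(k * p for k, p in enumerate(probs)))
```

Summing the probabilities over a wide enough range of $k$ should give a value very close to 1, and the probability-weighted mean of $k$ should be very close to $\lambda$, which is a useful sanity check that the rate parameter is also the expected count.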
To model count data probabilistically, we assume that the target $y^{(i)}$, given the input $x^{(i)}$, follows a Poisson distribution with rate $\lambda = z^{(i)} > 0$. For consistency, the rest of the text writes $z^{(i)}$ instead of $\lambda$ for the rate.
$$ y^{(i)} \mid x^{(i)},\theta \sim \text{Poisson}(z^{(i)}) $$
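Our neural network takes the input $x^{(i)}$ and outputs the predicted rate $z^{(i)} = f_\theta(x^{(i)})$, which is also the expected count because the mean of a Poisson distribution equals its rate. The output must be constrained so that $z^{(i)} > 0$.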
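One way to make the positivity constraint concrete is sketched below. The text does not say how $f_\theta$ keeps its output positive, so everything here is an assumption for illustration: a single linear layer stands in for the network, a softplus on its output keeps the rate positive (an exponential is another common choice), and the names `softplus`, `predict_rate`, and `poisson_log_likelihood` are hypothetical.

```python
import math


def softplus(a: float) -> float:
    """Stable log(1 + exp(a)).

    Positive for ordinary inputs; for extremely negative a the result
    underflows to 0.0 in floating point, so real code may add a small floor.
    """
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def predict_rate(x: list[float], w: list[float], b: float) -> float:
    """Hypothetical stand-in for f_theta: linear layer followed by softplus."""
    pre_activation = sum(wj * xj for wj, xj in zip(w, x)) + b
    return softplus(pre_activation)


def poisson_log_likelihood(y: int, z: float) -> float:
    """log P(Y = y) for Y ~ Poisson(z), with z > 0."""
    return y * math.log(z) - z - math.lgamma(y + 1)


if __name__ == "__main__":
    # Made-up parameters and a single made-up example (x, y).
    w, b = [0.5, -0.25], 0.1
    x, y = [1.0, 2.0], 2
    z = predict_rate(x, w, b)
    print("predicted rate z:", z)
    print("log-likelihood of y under Poisson(z):", poisson_log_likelihood(y, z))
```

The softplus grows roughly linearly for large inputs, whereas an exponential output grows much faster; which one suits a given problem is a modelling choice that this sketch does not settle.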