Bayesian estimation is also a form of posterior estimation, but it differs from maximum a posteriori (MAP) estimation. Its goal is not to maximize the posterior, but to compute the posterior expectation of the parameter, as shown below:

$$ \hat{\theta}_{\text{Bayes}} = E[\theta \mid \mathbf{x}] = \int \theta p(\theta \mid \mathbf{x}) d\theta $$

However, it is important to note that the goal of Bayesian estimation is often not just to obtain the point estimate $\hat{\theta}_{\text{Bayes}}$, but to obtain the full posterior distribution of the parameter $\theta$ itself.
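To make the contrast with MAP concrete, here is a minimal numerical sketch (all values are illustrative; the unnormalized posterior $p^7 (1-p)^3$ anticipates the coin example below). On a grid, the Bayes estimate is the mean of the normalized posterior, while the MAP estimate is its mode:

```python
import numpy as np

# Grid approximation of a posterior over theta in [0, 1].
theta = np.linspace(0.0, 1.0, 10_001)
unnorm_post = theta**7 * (1 - theta)**3   # prior * likelihood (unnormalized)

# Normalize the grid weights to sum to 1, then take the mean.
post = unnorm_post / unnorm_post.sum()
theta_bayes = (theta * post).sum()        # E[theta | x], the Bayes estimate
theta_map = theta[np.argmax(post)]        # posterior mode, the MAP estimate

print(f"Bayes (posterior mean): {theta_bayes:.4f}")  # ~0.6667
print(f"MAP   (posterior mode): {theta_map:.4f}")    # ~0.7000
```

The two estimates differ whenever the posterior is skewed; they coincide only when the posterior is symmetric and unimodal.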

Example

Let's demonstrate the process of Bayesian estimation with a simple example: estimating the probability $p$ of a coin landing heads up. This is a typical binomial distribution problem, and we can use Bayesian methods to update our belief about $p$.

Assumptions: We don't know whether the coin is fair. Therefore, our prior belief about $p$ is that it could be any value between 0 and 1, with all values equally likely (i.e., the prior distribution is uniform). Mathematically, we can express this as $p \sim U(0, 1)$.

Experimental Observation: Suppose we flip the coin 10 times and observe 7 heads and 3 tails. We want to update our knowledge about $p$.

Bayesian Update Steps:

  1. Define the prior distribution: $\text{Prior}: p \sim U(0, 1)$. In Bayesian analysis, the uniform distribution is a special case of the Beta distribution, specifically $\text{Beta}(1, 1)$.
  2. Likelihood function: The coin flips follow a binomial distribution, and the likelihood function $P(X = 7 \mid p)$ is: $P(X = 7 \mid p) = \binom{10}{7} p^7 (1-p)^3$, where $\binom{10}{7}$ is the binomial coefficient, the number of ways to choose 7 heads from 10 flips.
  3. Posterior distribution: In Bayesian updating, the posterior distribution is proportional to the product of the prior and the likelihood: $\text{Posterior} \propto \text{Prior} \times \text{Likelihood}$, i.e., $\text{Posterior} \propto p^7 (1-p)^3$. Since the $\text{Beta}(1, 1)$ prior is conjugate to the binomial likelihood, the posterior has a closed-form solution: $\text{Posterior} = \text{Beta}(7 + 1, 3 + 1) = \text{Beta}(8, 4)$.
  4. Calculate the expectation of the posterior distribution: The expectation of a $\text{Beta}(\alpha, \beta)$ distribution is $E[p] = \frac{\alpha}{\alpha + \beta}$. Therefore, for $\text{Beta}(8, 4)$: $E[p] = \frac{8}{8 + 4} = \frac{8}{12} = \frac{2}{3} \approx 0.667$ (verified numerically in the sketch after this list).
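These steps can be checked in a few lines with scipy.stats (a minimal sketch; scipy is assumed available):

```python
from scipy import stats

alpha_prior, beta_prior = 1, 1    # Beta(1, 1) prior == Uniform(0, 1)
heads, tails = 7, 3               # observed data: 7 heads, 3 tails

# Conjugate update: Beta prior + binomial likelihood -> Beta posterior.
posterior = stats.beta(alpha_prior + heads, beta_prior + tails)  # Beta(8, 4)

print(posterior.mean())           # 8 / 12 ≈ 0.667, the Bayes estimate
print(posterior.interval(0.95))   # a 95% credible interval for p
```

Because we recover the full posterior rather than a single number, a credible interval comes for free, which is exactly the point made above about obtaining the distribution of $\theta$ itself.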

Conclusion

Based on the observed data, we have updated our knowledge about the probability of the coin landing heads up, $p$. Initially, we knew nothing about $p$ and assumed it was uniformly distributed on $[0, 1]$. The observed data updated our belief to a $\text{Beta}(8, 4)$ distribution, with an expected value of approximately 0.667. This indicates that, based on the current data, the probability of the coin landing heads up is about $2/3$.


This approach extends naturally to Bayesian linear regression: instead of a single probability $p$, we place a prior over the regression weights and compute their posterior distribution in the same prior-times-likelihood fashion.
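As a sketch of that extension (all names and values below are illustrative assumptions, not part of the example above): with a Gaussian prior $\mathbf{w} \sim N(\mathbf{0}, \tau^2 I)$ and a known noise variance $\sigma^2$, the Gaussian prior is conjugate to the Gaussian likelihood, so the posterior over the weights is again Gaussian with a closed-form mean and covariance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from y = 1 + 2x + noise (illustrative ground truth).
n, sigma2 = 50, 0.25                        # sigma2: assumed known noise variance
x = rng.uniform(-1.0, 1.0, n)
X = np.column_stack([np.ones(n), x])        # design matrix with a bias column
y = X @ np.array([1.0, 2.0]) + rng.normal(0.0, np.sqrt(sigma2), n)

# Conjugate Gaussian prior w ~ N(0, tau2 * I) gives a Gaussian posterior:
#   Sigma_post = (X^T X / sigma2 + I / tau2)^(-1)
#   mu_post    = Sigma_post @ X^T y / sigma2
tau2 = 10.0
Sigma_post = np.linalg.inv(X.T @ X / sigma2 + np.eye(2) / tau2)
mu_post = Sigma_post @ X.T @ y / sigma2

print("posterior mean of w:", mu_post)        # the Bayes estimate of the weights
print("posterior covariance:\n", Sigma_post)  # full uncertainty over the weights
```

Just as the Beta posterior gave both a point estimate and a credible interval for $p$, the Gaussian posterior here gives both a point estimate (the posterior mean) and a full covariance quantifying our uncertainty about the weights.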