Suppose we observe one draw from the random variable $X$, which is distributed with normal distribution $\mathcal{N}(\mu,\sigma^2)$. The variance $\sigma^2$ is known, $\mu$ isn’t. We want to estimate $\mu$.

Suppose further that the prior distribution is a truncated normal distribution $\mathcal{N}(\mu_0,\sigma^2_0,t)$, i.e., with density $f(\mu)=(c/\sigma_0)\,\phi((\mu-\mu_0)/\sigma_0)$ if $\mu<t$, and $f(\mu)=0$ otherwise, where $t$ is the truncation point and $c$ is a normalizing constant. (Interpretation: we receive noisy signals about $\mu$, which are known to be normally distributed with known variance; this is the draw of $X$. But we have prior knowledge that values $\mu\ge t$ are impossible.)

In this setup, is the resulting posterior also a truncated normal distribution (truncated at $t$ like the prior)? I tried to adapt the derivation of the posterior for the well-known conjugate normal pair (e.g., here and here), and it seems to work. Do you see any mistake in this derivation?

The likelihood function is given by

$$f(x|\mu)=\frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2} \right\} $$

The prior density is ($\Phi(.)$ is the cdf of the standard normal distribution)

$$f(\mu)=\begin{cases}
\frac{1}{\sigma_0\sqrt{2\pi}\,\Phi((t-\mu_0)/\sigma_0)} \exp\left\{-\frac{(\mu-\mu_0)^2}{2\sigma_0^2} \right\} &\text{ if } \mu\le t, \\
0 & \text{otherwise}.
\end{cases}$$

The prior density can be rewritten as

$$f(\mu)=c \phi((\mu-\mu_0)/\sigma_0)\mathbf{1}\{\mu<t\},$$

where $c$ is the normalizing constant (independent of $\mu$, but dependent on $t$). Now, by Bayes’ rule,

\begin{align}
f(\mu|x)&\propto f(x|\mu)\, f(\mu)\propto\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2} \right\} \exp\left\{-\frac{(\mu-\mu_0)^2}{2\sigma_0^2} \right\}\mathbf{1}\{\mu<t\} \\
&=\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2} -\frac{(\mu-\mu_0)^2}{2\sigma_0^2} \right\}\mathbf{1}\{\mu<t\} \\
&\propto \exp\left\{-\frac{1}{2\sigma^2\sigma_0^2/(\sigma^2+\sigma_0^2)} \left(\mu-\frac{\sigma^2\mu_0+\sigma_0^2 x}{\sigma^2+\sigma_0^2}\right)^2 \right\}\mathbf{1}\{\mu<t\},
\end{align}

where the last step completes the square in $\mu$ and drops all factors that do not depend on $\mu$.

This is the kernel of a normal distribution with the usual posterior mean and variance (as if the prior were untruncated), truncated above at $t$. In other words, ignoring the truncation in the prior, applying the usual updating rule for the conjugate normal pair, and then reimposing the truncation gives the same result as the derivation above (assuming it is correct). Is it correct? All I do is carry along the indicator function (and adapt the normalizing constant); does that introduce problems anywhere?
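The claim can also be checked numerically: compute the posterior by brute force on a grid (likelihood times truncated prior, renormalized) and compare it with the truncated normal obtained from the usual conjugate update. This is a minimal sketch with made-up parameter values; all the specific numbers below are assumptions chosen only for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical parameter values, chosen only for illustration.
mu0, sigma0 = 1.0, 2.0   # prior location and scale (before truncation)
sigma = 1.5              # known standard deviation of the signal
t = 2.0                  # upper truncation point of the prior
x = 3.0                  # the single observed draw

# Brute-force posterior on a grid: likelihood times truncated prior, renormalized.
mu = np.linspace(-10.0, t, 200001)   # the prior is zero above t, so the grid stops there
dx = mu[1] - mu[0]
post = stats.norm.pdf(x, mu, sigma) * stats.norm.pdf(mu, mu0, sigma0)
post /= post.sum() * dx

# The claimed closed form: normal with the usual conjugate update, truncated above at t.
mu1 = (sigma**2 * mu0 + sigma0**2 * x) / (sigma**2 + sigma0**2)
sd1 = np.sqrt(sigma**2 * sigma0**2 / (sigma**2 + sigma0**2))
claimed = stats.truncnorm.pdf(mu, a=-np.inf, b=(t - mu1) / sd1, loc=mu1, scale=sd1)

print(np.max(np.abs(post - claimed)))  # small: only grid-discretization error remains
```

Note that `scipy.stats.truncnorm` takes its bounds `a`, `b` in standardized units, i.e. relative to `loc` and `scale`.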

Your derivation is correct.

I think the result is also very intuitive. As you pointed out, if you have a prior which is a normal distribution and a likelihood which is also normal, then the posterior will be another normal distribution.

$$f(\mu|x)\propto f(x|\mu) f(\mu)$$

Now suppose I came along, set a region of $f(\mu)$ to zero, and rescaled by $c$ to renormalize. For values of $\mu$ where the prior was not set to zero, the right-hand side of the above equation is unchanged except that $f(\mu) \to c f(\mu)$. Therefore, the left-hand side is also just scaled by $c$, but retains the exact shape of a normal distribution. So we end up with a rescaled normal density, except of course at points where $f(\mu)$ is zero and the left-hand side is also zero.

It won’t cause problems, since it’s correct, although the posterior might not be a nice function to work with if you’re trying to derive something analytically. For example, its mean is a very long, complicated expression (which I was able to find in Mathematica).

The new density is again a normal distribution truncated above at $t$, with new mean and variance $$\mu_1=\frac{\sigma^2\mu_0+\sigma_0^2 x}{\sigma^2+\sigma_0^2},\qquad \sigma_1^2=\frac{\sigma^2\sigma_0^2}{\sigma^2+\sigma_0^2}.$$ The new normalizing constant is $$\Phi\left(\frac{t-\mu_1}{\sigma_1}\right);$$ note that the denominator is the standard deviation $\sigma_1$, not the variance $\sigma_1^2$.
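In fact, writing $\mu_1$ and $\sigma_1$ for the updated posterior mean and standard deviation before truncation, and $\beta=(t-\mu_1)/\sigma_1$ for the standardized truncation point, the posterior mean has the standard upper-truncated-normal form $\mu_1-\sigma_1\,\phi(\beta)/\Phi(\beta)$. A quick sketch with illustrative (made-up) parameter values, cross-checked against `scipy.stats.truncnorm`:

```python
import numpy as np
from scipy import stats

# Illustrative (made-up) parameter values.
mu0, sigma0, sigma, t, x = 1.0, 2.0, 1.5, 2.0, 3.0

# Conjugate update as if the prior were untruncated...
mu1 = (sigma**2 * mu0 + sigma0**2 * x) / (sigma**2 + sigma0**2)
sd1 = np.sqrt(sigma**2 * sigma0**2 / (sigma**2 + sigma0**2))

# ...then reimpose the upper truncation at t.
beta = (t - mu1) / sd1  # standardized truncation point

# Standard formula for the mean of an upper-truncated normal.
mean_formula = mu1 - sd1 * stats.norm.pdf(beta) / stats.norm.cdf(beta)

# Cross-check against scipy's truncated-normal distribution.
mean_scipy = stats.truncnorm.mean(a=-np.inf, b=beta, loc=mu1, scale=sd1)
print(mean_formula, mean_scipy)
```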

So the interesting thing is that conjugacy is preserved under truncation of the prior for the mean.

It would be nice to study these posteriors for a fixed $t$ and different values of the prior parameters.
