Articles on Bayesian probability

Prove these two conditional probabilities are equivalent

I have seen people use the equivalence $$P(X|\mu) P(\mu | D) = P(X,\mu|D)$$ How can I prove it is valid? My attempt: \begin{align} P(X|\mu) P(\mu | D) &= P(X|\mu) \frac{P(\mu,D)}{P(D)}\\ &= P(X|\mu) \frac{P(D|\mu) P(\mu)}{P(D)}\\ &=\frac{P(X,\mu) P(D|\mu)}{P(D)} \end{align} where $P(X,\mu|D) = \frac{P(X,\mu,D)}{P(D)}$. I am stuck here and cannot figure out why $P(X,\mu,D) = P(X,\mu) P(D|\mu)$. Edit (additional): If instead […]
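
A sketch of the missing step, which only goes through under an assumption the question leaves implicit, namely that $X$ and $D$ are conditionally independent given $\mu$. By the chain rule,
$$P(X,\mu \mid D) = P(X \mid \mu, D)\, P(\mu \mid D),$$
so the claimed equivalence holds exactly when $P(X \mid \mu, D) = P(X \mid \mu)$. Equivalently, writing $P(X,\mu,D) = P(D \mid X,\mu)\,P(X,\mu)$, conditional independence gives $P(D \mid X,\mu) = P(D \mid \mu)$, which is precisely the step the attempt above is stuck on.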

Let $X \sim \mathcal{U}(0,1)$. Given $X = x$, let $Y \sim \mathcal{U}(0,x)$. How can I calculate $\mathbb{E}(X|Y = y)$?

Suppose that $X$ is uniformly distributed over $[0,1]$. Given $X = x$, let $Y$ be uniformly distributed over $[0,x]$. Is it possible to calculate the “expected value of $X$ given $Y = y$”, i.e., $\mathbb{E}(X|Y = y)$? Intuitively, it seems that if $y = 0$, we gain no information, and so […]
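
A sketch of the standard computation, assuming the setup above (so $f_{Y|X}(y \mid x) = 1/x$ on $(0,x)$): \begin{align} f_{X,Y}(x,y) &= f_X(x)\, f_{Y|X}(y \mid x) = \frac{1}{x}, \qquad 0 < y < x < 1,\\ f_Y(y) &= \int_y^1 \frac{dx}{x} = -\ln y,\\ \mathbb{E}(X \mid Y = y) &= \frac{1}{-\ln y}\int_y^1 x \cdot \frac{1}{x}\, dx = \frac{1-y}{-\ln y}. \end{align} As $y \to 1^-$ this tends to $1$, and as $y \to 0^+$ it tends to $0$: a very small observation $y$ is in fact evidence for a small $x$, since small values of $X$ concentrate $Y$ near $0$.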

Facebook Question (Data Science)

Out of curiosity, here’s a question from Glassdoor (Facebook Data Science Interview): You’re about to get on a plane to Seattle. You want to know if you should bring an umbrella. You call 3 random friends of yours who live there and ask each independently if it’s raining. Each of your friends has a 2/3 […]
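
In the usual version of this puzzle (an assumption here, since the excerpt is cut off: each friend independently tells the truth with probability 2/3 and lies with probability 1/3, and all three say it is raining), Bayes’ rule with prior $p = P(\text{rain})$ gives $$P(\text{rain} \mid \text{three yeses}) = \frac{p\,(2/3)^3}{p\,(2/3)^3 + (1-p)\,(1/3)^3} = \frac{8p}{7p+1}.$$ For instance, a prior of $p = 1/4$ yields a posterior of $8/11 \approx 0.73$.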

Bayesian posterior with truncated normal prior

Suppose we observe one draw from the random variable $X$, which follows the normal distribution $\mathcal{N}(\mu,\sigma^2)$. The variance $\sigma^2$ is known; $\mu$ is not. We want to estimate $\mu$. Suppose further that the prior distribution is the truncated normal distribution $\mathcal{N}(\mu_0,\sigma^2_0,t)$, i.e., with density $f(\mu)=\frac{c}{\sigma_0}\,\phi\!\left(\frac{\mu-\mu_0}{\sigma_0}\right)$ if $\mu<t$, and $f(\mu)=0$ otherwise, where $t>\mu_0$ and $c$ […]
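
A sketch of where the conjugate algebra leads: the likelihood is a Gaussian in $\mu$, so multiplying it by the truncated prior leaves the truncation untouched, $$f(\mu \mid x) \propto \phi\!\left(\frac{x-\mu}{\sigma}\right)\phi\!\left(\frac{\mu-\mu_0}{\sigma_0}\right)\mathbf{1}\{\mu<t\} \propto \phi\!\left(\frac{\mu-\mu_1}{\sigma_1}\right)\mathbf{1}\{\mu<t\},$$ with the familiar normal-normal updates $$\mu_1 = \frac{\sigma_0^2\, x + \sigma^2 \mu_0}{\sigma^2+\sigma_0^2}, \qquad \sigma_1^2 = \frac{\sigma^2\sigma_0^2}{\sigma^2+\sigma_0^2}.$$ The posterior is therefore a truncated normal $\mathcal{N}(\mu_1,\sigma_1^2,t)$ with the same truncation point $t$.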

Hillary Clinton's Iowa Caucus Coin Toss Wins and Bayesian Inference

In yesterday’s Iowa Caucus, Hillary Clinton beat Bernie Sanders in six out of six tied counties by a coin-toss*. I believe we would have heard the uproar about it by now if it had somehow been rigged in her favor, but I wanted to calculate the odds of this happening, assuming she really was that lucky, […]
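
Assuming six independent tosses of a fair coin, the arithmetic for one pre-specified candidate winning all of them is $$\left(\tfrac{1}{2}\right)^6 = \tfrac{1}{64} \approx 1.6\%,$$ and the probability that either candidate sweeps all six is twice that, $1/32 \approx 3.1\%$.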

Bayesian probability and measure theory

I am reading, more or less simultaneously, about Bayesian probability theory (from Jaynes’ Probability Theory: The Logic of Science) and about the measure-theoretic approach to probability. Now, I casually overheard a Bayesian (it may have been Jaynes) say that measure theory as a basis for probability is “ad hoc”. So this left me wondering: what is […]

Is there an introduction to probability and statistics that balances frequentist and Bayesian views?

Roughly speaking, I might be described as an advanced undergraduate in mathematics. However, I have not studied statistics and have learned only elementary probability. Does there exist a book or monograph that introduces probability and statistics at this level and still covers the frequentist and Bayesian views (philosophy?) in a balanced manner? It appears to me (but […]

Bayes' rule with multiple conditions

I am wondering how I would apply Bayes' rule to expand an expression with multiple variables on either side of the conditioning bar. In another forum post, for example, I read that you can expand $P(a,z \mid b)$ using Bayes' rule like this (see Summing over conditional probabilities): $$P( a,z \mid b) = P(a \mid […]
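
A sketch of the identity presumably being quoted: by the chain rule, with $b$ carried along behind the bar throughout, $$P(a, z \mid b) = P(a \mid z, b)\, P(z \mid b),$$ and Bayes' rule conditioned on $b$ expands the first factor as $$P(a \mid z, b) = \frac{P(z \mid a, b)\, P(a \mid b)}{P(z \mid b)}.$$ The general rule of thumb: any identity of ordinary probability remains valid when the same extra condition is appended behind every conditioning bar.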

Bayesian Parameter Estimation – Parameters and Data Jointly Continuous?

This is a follow-up to my previous question regarding viewing parameters as random variables in a Bayesian framework. If we apply Bayes’ theorem to model parameters $\mathbf{\Theta} \in \mathbb{R}^n$ and data $\mathbf{Y} \in \mathbb{R}^m$, we get $$ g(\boldsymbol\theta|\mathbf{y}) = \frac{h(\mathbf{y}|\boldsymbol\theta) g(\boldsymbol\theta)}{h(\mathbf{y})}, $$ where $g$ is the marginal pdf of $\mathbf{\Theta}$, $h$ is the marginal […]
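
For completeness, the denominator here is presumably the usual marginal density of the data, obtained by integrating the parameters out: $$h(\mathbf{y}) = \int_{\mathbb{R}^n} h(\mathbf{y} \mid \boldsymbol\theta)\, g(\boldsymbol\theta)\, d\boldsymbol\theta,$$ which is exactly the normalizing constant that makes $g(\boldsymbol\theta \mid \mathbf{y})$ integrate to $1$ over $\boldsymbol\theta$.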

Bayes' Rule for Parameter Estimation – Parameters are Random Variables?

Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $\mathbf{X}: \Omega \to \mathbb{R}^n$, $\mathbf{Y}: \Omega \to \mathbb{R}^m$ be jointly continuous random vectors. That is, there exists a $\mathcal{B}(\mathbb{R}^n \times \mathbb{R}^m)$-measurable function $f: \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ such that for all $A \in \mathcal{B}(\mathbb{R}^n)$ and $B \in \mathcal{B}(\mathbb{R}^m)$, $$ P((\mathbf{X}, \mathbf{Y}) \in A \times […]
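
The truncated display is presumably the standard definition of joint continuity, which would read $$P((\mathbf{X}, \mathbf{Y}) \in A \times B) = \int_A \int_B f(\mathbf{x}, \mathbf{y})\, d\mathbf{y}\, d\mathbf{x},$$ so that $f$ is the joint density of $(\mathbf{X}, \mathbf{Y})$.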