Understanding the measurability of conditional expectations

My question is about the conditional expectation of a random variable with respect to a $\sigma$-algebra. I am having trouble building intuition for the definition.

I know that if $X$ is $\mathcal{G}$-measurable then $\mathbb{E} [X| \mathcal{G}] = X$, but what if $X$ is not $\mathcal{G}$-measurable? Is this expression just not defined?

Furthermore, in the following definition:

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, $\mathcal{G}$ a sub $\sigma$-field of $\mathcal{F}$ and $X$ be a random variable with $\mathbb{E}|X| < \infty$.

Definition (Conditional Expectation):
The conditional expectation of $X$ given $\mathcal{G}$, denoted by $\mathbb{E}[X|\mathcal{G}]$, is defined as any random variable $Y$ which satisfies

  1. $Y$ is $\mathcal{G}$-measurable
  2. $\int_A X d\mathbb{P} = \int_A Y d\mathbb{P}$ for all $A \in \mathcal{G}$

Remark:

  1. Any random variable $Y$ satisfying $(1)$ and $(2)$ in the definition is called a version of $\mathbb{E}[X|\mathcal{G}]$.

I don’t know why the second condition is stated this way. Can someone give an argument showing that something undesirable happens if the condition is stated differently, or an argument showing that this condition gives desirable results?

Answer:

Let $(\Omega,\mathcal F,\mu)$ be a probability space. The idea of conditional expectation is the following: we have an integrable random variable $X$ and a sub-$\sigma$-algebra $\mathcal G$ of $\mathcal F$. The random variable $X$ is not necessarily measurable with respect to this smaller $\sigma$-algebra. We would like to find a random variable which is $\mathcal G$-measurable and close in some sense to $X$.
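
As a small illustration of this idea (an example chosen here, not taken from the question), consider a fair die: $\Omega=\{1,\dots,6\}$, $\mathcal F$ the collection of all subsets, $\mu$ the uniform measure, $X(\omega)=\omega$, and $\mathcal G=\{\emptyset,\{1,3,5\},\{2,4,6\},\Omega\}$. A $\mathcal G$-measurable function must be constant on the odd outcomes and constant on the even outcomes, so $X$ is not $\mathcal G$-measurable. A natural candidate for the $\mathcal G$-measurable random variable close to $X$ is the average of $X$ over each of the two sets:
$$Y(\omega)=\begin{cases}3, & \omega\in\{1,3,5\},\\ 4, & \omega\in\{2,4,6\}.\end{cases}$$
Condition 2. can then be checked by hand on the two non-trivial sets of $\mathcal G$:
$$\int_{\{1,3,5\}} X\,d\mu=\frac{1+3+5}{6}=\frac32=3\cdot\frac12=\int_{\{1,3,5\}} Y\,d\mu,\qquad \int_{\{2,4,6\}} X\,d\mu=\frac{2+4+6}{6}=2=4\cdot\frac12=\int_{\{2,4,6\}} Y\,d\mu.$$
Adding the two identities gives the case $A=\Omega$, and $A=\emptyset$ is trivial, so this $Y$ is a version of $\mathbb E[X|\mathcal G]$ even though $X$ itself is not $\mathcal G$-measurable.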

Assume that $Y$ satisfies conditions 1. and 2. Then
$$X=\color{red}{Y}+\color{blue}{X-Y}.$$
The red random variable is $\mathcal G$-measurable, and if $\varphi$ is a bounded $\mathcal G$-measurable function, then $\mathbb E[(\color{blue}{X-Y})\varphi]=0$. Indeed, condition 2. is exactly the case $\varphi=\mathbf 1_A$ with $A\in\mathcal G$, and the general case follows by linearity and approximation by simple functions. Hence we have written $X$ as the sum of a $\mathcal G$-measurable random variable and another one whose integral over every set in $\mathcal G$ vanishes. There is an idea of projection here, which can be made more concrete when $X$ belongs to $\mathbb L^2$.
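
A sketch of how the projection idea works in the $\mathbb L^2$ case, under two assumptions that are not proved here: $X\in\mathbb L^2$ implies $Y\in\mathbb L^2$, and the identity $\mathbb E[(X-Y)\varphi]=0$ extends from bounded to square-integrable $\mathcal G$-measurable $\varphi$. Let $Z$ be any $\mathcal G$-measurable random variable in $\mathbb L^2$ (the symbol $Z$ is introduced only for this illustration). Since $Y-Z$ is $\mathcal G$-measurable, the cross term below vanishes:
$$\mathbb E[(X-Z)^2]=\mathbb E[(X-Y)^2]+2\,\mathbb E[(X-Y)(Y-Z)]+\mathbb E[(Y-Z)^2]=\mathbb E[(X-Y)^2]+\mathbb E[(Y-Z)^2]\geqslant \mathbb E[(X-Y)^2].$$
So among all square-integrable $\mathcal G$-measurable random variables, $Y$ is the one closest to $X$ in mean square, which is one way to see why condition 2. is the right requirement: it is precisely the orthogonality condition characterising this best approximation.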