Conditional Probability Density Functions
[Note: This post was created as part of a lecture for STAT 131 at UCSC.]
Recall that for two continuous random variables \(X\) and \(Y\), we work with the joint probability density function \(f(x,y)\). Below, we plot \(f(x,y)\) for a pair of random variables.
In some cases, we might be interested in probabilities for one random variable, conditioning on the other. For example, for \(X\) and \(Y\) in the previous plot, we might want to know how the probabilities for \(X\) change given that we know \(Y=1\). Our intuition might tell us to look at the joint pdf along the line \(y=1\) (i.e., \(f(x,1)\) as a function of \(x\)). This is plotted below as the orange curve. Notice that the vertical slice between the \(xy\)-plane and the orange curve looks very similar to a probability density for a single variable.
The primary issue here is that the area under this slice is not one, so \(f(x,1)\) is not by itself a valid probability density. However, we can rescale the slice so that it does integrate to one. The area under the slice is \(\int_{-\infty}^{\infty} f(x,1)\,dx = f_Y(1)\), which is the marginal density of \(Y\) evaluated at \(y=1\), so the scaling constant is \(c=1/f_Y(1)\) (assuming \(f_Y(1)>0\)). After rescaling, we plot the new curve in green. Note that this is a valid probability density. In fact, we call this the conditional probability density of \(X\) given that \(Y=1\), or \(f_{X|Y=1}(x|1)\).
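The rescaling step can be checked numerically. The sketch below is only an illustration: it assumes a standard bivariate normal joint density with correlation 0.6, which is not necessarily the density plotted above, and the names `joint_pdf` and `integrate` are made up for this example. It integrates the slice \(f(x,1)\) over \(x\), compares the result with the known marginal \(f_Y(1)\), and confirms that the rescaled slice integrates to one.

```python
import numpy as np

RHO = 0.6  # assumed correlation, chosen only for illustration


def joint_pdf(x, y, rho=RHO):
    """Standard bivariate normal density with correlation rho (an assumed example)."""
    norm_const = 1.0 / (2.0 * np.pi * np.sqrt(1.0 - rho**2))
    quad = (x**2 - 2.0 * rho * x * y + y**2) / (1.0 - rho**2)
    return norm_const * np.exp(-0.5 * quad)


def integrate(values, dx):
    """Trapezoidal rule on an evenly spaced grid."""
    return dx * (values.sum() - 0.5 * (values[0] + values[-1]))


# Grid wide enough that the density is negligible at the endpoints.
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]

# The slice f(x, 1): the "orange curve" for this example density.
slice_at_1 = joint_pdf(x, 1.0)
slice_area = integrate(slice_at_1, dx)

# For this example Y is standard normal, so f_Y(1) is known in closed form.
marginal_at_1 = np.exp(-0.5) / np.sqrt(2.0 * np.pi)

# Rescale by c = 1 / f_Y(1): the "green curve" for this example density.
conditional = slice_at_1 / marginal_at_1
conditional_area = integrate(conditional, dx)

print("area under slice f(x,1):", slice_area)
print("marginal f_Y(1):        ", marginal_at_1)
print("area after rescaling:   ", conditional_area)
```

The first two printed values should agree closely and be well below one, while the third should be very close to one.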
This example shows that we obtain a valid conditional density by rescaling the joint density along the set of points where \(y=1\), \[ f_{X|Y=1}(x|1) = \frac{f(x,1)}{f_Y(1)}. \] We could repeat this process conditioning on any other value \(Y=y\) for which \(f_Y(y)>0\), resulting in the general formula \[ f_{X|Y}(x|y) = \frac{f(x,y)}{f_Y(y)}. \]
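As a small worked example (the density here is chosen purely for illustration and is not the one plotted above), suppose \(f(x,y) = x + y\) for \(0 \le x \le 1\) and \(0 \le y \le 1\), and \(f(x,y)=0\) otherwise. The marginal density of \(Y\) is \[ f_Y(y) = \int_0^1 (x+y)\,dx = \frac{1}{2} + y, \qquad 0 \le y \le 1, \] so the conditional density of \(X\) given \(Y=y\) is \[ f_{X|Y}(x|y) = \frac{x+y}{\frac{1}{2}+y}, \qquad 0 \le x \le 1. \] In particular, for \(y=1\) we get \(f_{X|Y}(x|1) = \frac{2}{3}(x+1)\), and we can check that it integrates to one: \[ \int_0^1 \frac{2}{3}(x+1)\,dx = \frac{2}{3}\left(\frac{1}{2}+1\right) = 1. \]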