---
jupytext:
  formats: md:myst
  text_representation:
    extension: .md
    format_name: myst
    name: python3
---

(sec:2dPDFs)=
# Two-dimensional PDFs

## Visualizing correlated Gaussian distributions

A multivariate Gaussian distribution for an $N$-dimensional vector $\boldsymbol{x} = \{x_1, \ldots, x_N\}$ with mean $\boldsymbol{\mu} = \{\mu_1, \ldots, \mu_N\}$ and positive-definite *covariance matrix* $\Sigma$ is

$$
p(\boldsymbol{x}|\boldsymbol{\mu},\Sigma) = \frac{1}{\sqrt{\det(2\pi\Sigma)}} e^{-\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^\intercal\Sigma^{-1}(\boldsymbol{x}-\boldsymbol{\mu})}
$$

For the one-dimensional case, it reduces to the familiar

$$
p(x_1|\mu_1,\sigma_1) = \frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(x_1-\mu_1)^2}{2\sigma_1^2}}
$$

with $\Sigma = \sigma_1^2$. For the bivariate (two-dimensional) case,

$$
\boldsymbol{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\quad\mbox{and}\quad
\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}
\quad\mbox{and}\quad
\Sigma = \begin{pmatrix} \sigma_1^2 & \rho_{12} \sigma_1\sigma_2 \\ \rho_{12}\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}
\quad\mbox{with}\ 0 \leq \rho_{12}^2 < 1 ,
$$

which ensures that $\Sigma$ is positive definite.

**Widget user interface features**:

* Set the mean position $(\mu_1, \mu_2)$ and variances $(\Sigma_{11}, \Sigma_{22})$ with the sliders.
* Set the correlation $\rho_{12}$ with the slider. This controls the covariance $\Sigma_{12} = \rho_{12} \sqrt{ \Sigma_{11} \Sigma_{22}}$.
* Four presets are available.
* The corner plot shows samples from the bivariate PDF and histograms for the two marginal distributions. Control the number of samples with the slider.
* Dashed lines on the marginals mark the 16th, 50th, and 84th percentiles. For a one-dimensional Gaussian these correspond to the mean and the $\pm 1\sigma$ points.

```{raw} html
```

The solid and dashed ellipses in the joint panel are iso-probability contours. These correspond to fixed values of the squared Mahalanobis distance

$$
\Delta^2 = (\boldsymbol{x} - \boldsymbol{\mu})^\intercal \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}).
$$

In the widget they are drawn at $\Delta = 1$ and $\Delta = 2$. Their semi-axes are aligned with the eigenvectors of $\Sigma$ and have lengths $\sqrt{\lambda_i}$ (the "$1\sigma$" ellipse) and $2\sqrt{\lambda_i}$ (the "$2\sigma$" ellipse), where $\lambda_i$ are the eigenvalues of $\Sigma$.

:::{admonition} A subtlety worth highlighting
:class: note
In one dimension the $\pm 1\sigma$ and $\pm 2\sigma$ intervals contain 68.3% and 95.4% of the probability mass. In two dimensions, the corresponding *ellipses* contain considerably less, because $\Delta^2$ follows a chi-squared distribution with two degrees of freedom, $\Delta^2 \sim \chi^2_2$, and

$$
\prob(\Delta^2 \le k^2) = 1 - e^{-k^2/2}.
$$

| Ellipse | Probability mass (2D) | For comparison: 1D interval |
|---|---|---|
| $\Delta = 1$ ("$1\sigma$") | 39.3% | 68.3% |
| $\Delta = 2$ ("$2\sigma$") | 86.5% | 95.4% |
| $\Delta = 3$ ("$3\sigma$") | 98.9% | 99.7% |
:::

::::{admonition} Questions to consider:
:class: my-checkpoint
1. *What does "positive definite" mean and why is this a requirement for the covariance matrix $\Sigma$?*
   :::{admonition} Answer
   :class: dropdown, my-answer
   A symmetric matrix (such as a covariance matrix) is positive definite if all of its eigenvalues are greater than zero. This ensures that
   * the variance of any nonzero linear combination of the random variables is strictly positive;
   * no variable is a linear combination of the others (no collinear variables);
   * the covariance matrix is invertible.
   :::

1. *What is plotted in each part of the graph (called a "corner plot")?*
1. *What effect does changing $\mu_1$ or $\mu_2$ have?*
1. *What effect does changing $\Sigma_{11}$ or $\Sigma_{22}$ have? What if the scales for $x_1$ and $x_2$ were the same?*
1. *What happens if $\rho_{12}$ is set to $0$, then $+0.7$, then $-0.7$?*
1. *What would happen if you were allowed to set $|\rho_{12}| = 1$? Explain what goes wrong.*
1.
   *So what characterizes independent (uncorrelated) variables versus positively correlated versus negatively correlated variables?*
::::

## 2D PDF with a quadratic approximation

Consider the logarithm $L(X,Y)$ of a two-dimensional posterior PDF. We'll analyze it in a quadratic approximation. First, find the mode $(X_0, Y_0)$ (the best estimate) by setting the first partial derivatives to zero:

$$\begin{align}
L(X,Y) &= \log p(X,Y|\{\text{data}\}, I) \\
\quad&\Longrightarrow\quad
\left.\frac{\partial L}{\partial X}\right|_{X_0,Y_0} = 0, \
\left.\frac{\partial L}{\partial Y}\right|_{X_0,Y_0} = 0
\end{align}$$

To assess the reliability of this estimate, Taylor expand $L$ around $(X_0,Y_0)$; the first-order terms vanish at the mode:

$$\begin{align}
L &= L(X_0,Y_0) + \frac{1}{2}\Bigl[
\left.\frac{\partial^2L}{\partial X^2}\right|_{X_0,Y_0}(X-X_0)^2
+ \left.\frac{\partial^2L}{\partial Y^2}\right|_{X_0,Y_0}(Y-Y_0)^2 \\
& \qquad\qquad\qquad
+ 2 \left.\frac{\partial^2L}{\partial X\partial Y}\right|_{X_0,Y_0}(X-X_0)(Y-Y_0)
\Bigr] + \ldots \\
&\equiv L(X_0, Y_0) + \frac{1}{2}Q + \ldots
\end{align}$$

It makes sense to do this in (symmetric) matrix notation:

$$
Q = \begin{pmatrix} X-X_0 & Y-Y_0 \end{pmatrix}
\begin{pmatrix} A & C \\ C & B \end{pmatrix}
\begin{pmatrix} X-X_0 \\ Y-Y_0 \end{pmatrix}
$$

$$
\Longrightarrow
A = \left.\frac{\partial^2L}{\partial X^2}\right|_{X_0,Y_0}, \quad
B = \left.\frac{\partial^2L}{\partial Y^2}\right|_{X_0,Y_0}, \quad
C = \left.\frac{\partial^2L}{\partial X\partial Y}\right|_{X_0,Y_0}
$$

```{image} ../assets/posterior_ellipse.png
:alt: posterior ellipse
:class: bg-primary
:width: 400px
:align: center
```

So in a quadratic approximation, the contour $Q=k$ for some constant $k$ is an ellipse centered at $(X_0, Y_0)$ (as in the figure). The orientation and eccentricity are determined by $A$, $B$, and $C$.
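
As a rough numerical illustration of how $A$, $B$, and $C$ enter $Q$, the sketch below estimates the three second derivatives by central finite differences. Everything in it is an assumption made for the example: the log posterior is taken to be an exact correlated Gaussian with a made-up mode and covariance, and the names `log_post`, `hessian_at`, and `Q` are hypothetical helpers, not part of any library.

```python
import numpy as np

# Hypothetical example: a correlated Gaussian log posterior with known covariance.
mode = np.array([1.0, -0.5])             # assumed (X_0, Y_0)
cov = np.array([[1.0, 0.6],
                [0.6, 2.0]])             # assumed covariance, for illustration only
cov_inv = np.linalg.inv(cov)

def log_post(X, Y):
    """Log posterior L(X, Y), up to an additive constant."""
    d = np.array([X, Y]) - mode
    return -0.5 * d @ cov_inv @ d

def hessian_at(f, X0, Y0, h=1e-4):
    """Central finite-difference estimates of A, B, C at (X0, Y0)."""
    A = (f(X0 + h, Y0) - 2 * f(X0, Y0) + f(X0 - h, Y0)) / h**2
    B = (f(X0, Y0 + h) - 2 * f(X0, Y0) + f(X0, Y0 - h)) / h**2
    C = (f(X0 + h, Y0 + h) - f(X0 + h, Y0 - h)
         - f(X0 - h, Y0 + h) + f(X0 - h, Y0 - h)) / (4 * h**2)
    return A, B, C

A, B, C = hessian_at(log_post, *mode)
hessian = np.array([[A, C], [C, B]])

def Q(X, Y):
    """Quadratic form Q built from A, B, C in matrix notation."""
    d = np.array([X, Y]) - mode
    return d @ hessian @ d

# Compare the quadratic approximation with the log posterior away from the mode.
point = (1.7, 0.3)
approx = log_post(*mode) + 0.5 * Q(*point)
print("A, B, C =", A, B, C)
print("approximation:", approx, " log posterior:", log_post(*point))
```

Because the assumed log posterior is itself quadratic, the approximation reproduces it to within rounding error, and the estimated matrix of second derivatives agrees with minus the inverse of the assumed covariance. For a non-Gaussian posterior the two printed values would agree only near the mode.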
The principal axes are found from the eigenvectors of the Hessian matrix $\begin{pmatrix} A & C \\ C & B \end{pmatrix}$:

$$
\begin{pmatrix} A & C \\ C & B \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= \lambda \begin{pmatrix} x \\ y \end{pmatrix}
\quad\Longrightarrow\quad
\lambda_1,\lambda_2 < 0 \ \mbox{so $(X_0,Y_0)$ is a maximum}
$$

If the major and minor axes of the ellipse are aligned with the $X$-axis and $Y$-axis (so $C=0$), the analysis is simple: the eigenvalues are $A$ and $B$, and the error bars for $X_0$ and $Y_0$ are inversely proportional to the square roots of their magnitudes, $\sigma_X = 1/\sqrt{|A|}$ and $\sigma_Y = 1/\sqrt{|B|}$.

What if the ellipse is skewed? See Sivia section 3.3 {cite}`Sivia2006` for a thorough treatment.
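
The eigenvalue analysis can be tried out numerically. The following is a minimal sketch, assuming made-up values of $A$, $B$, and $C$ and a hypothetical helper `error_bars`. For the skewed case it relies on the standard result of the quadratic (Gaussian) approximation that the covariance matrix is minus the inverse of the Hessian. It is meant as an illustration, not as a replacement for the full treatment.

```python
import numpy as np

def error_bars(A, B, C):
    """Marginal error bars in the Gaussian approximation, cov = -H^{-1}."""
    hessian = np.array([[A, C], [C, B]])
    cov = -np.linalg.inv(hessian)
    sigma_X, sigma_Y = np.sqrt(np.diag(cov))
    return sigma_X, sigma_Y, cov

# Hypothetical second derivatives at the mode, chosen only for illustration.
A, B, C = -2.0, -1.0, 0.8
hessian = np.array([[A, C], [C, B]])

# eigh is intended for symmetric matrices: eigenvalues come back in ascending
# order and the columns of eigvecs are the principal-axis directions.
eigvals, eigvecs = np.linalg.eigh(hessian)
is_maximum = bool(np.all(eigvals < 0))

# Semi-axes of the contour Q = -1 along each principal axis.
semi_axes = 1.0 / np.sqrt(np.abs(eigvals))

# Aligned ellipse (C = 0): error bars reduce to 1/sqrt(|A|) and 1/sqrt(|B|).
aligned_X, aligned_Y, _ = error_bars(A, B, 0.0)

# Skewed ellipse (C != 0): marginal error bars and the implied correlation.
sigma_X, sigma_Y, cov = error_bars(A, B, C)
rho = cov[0, 1] / (sigma_X * sigma_Y)

print("both eigenvalues negative:", is_maximum)
print("semi-axes of Q = -1:", semi_axes)
print("aligned error bars:", aligned_X, aligned_Y)
print("skewed error bars:", sigma_X, sigma_Y, " correlation:", rho)
```

With these illustrative numbers, the marginal error bars for the skewed ellipse come out larger than $1/\sqrt{|A|}$ and $1/\sqrt{|B|}$, which is why the off-diagonal term $C$ cannot simply be ignored when quoting uncertainties.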