# Study Guide: Undirected Graphical Models (Markov Random Fields)

**Date:** 2025.11.27
**Topic:** Potential Functions, Partition Function, and Conditional Independence in MRFs.

---

### **1. Recap: Decomposition in Undirected Graphs**

Unlike Directed Graphical Models (Bayesian Networks), which factorize the joint distribution into conditional probabilities, **Undirected Graphical Models (Markov Random Fields, MRFs)** cannot use conditional probabilities directly because edges carry no direction (and hence no parent/child or causal ordering). Instead, they decompose the joint distribution over **Maximal Cliques**.

* **Clique:** A subset of nodes in which every node is connected to every other node.
* **Maximal Clique:** A clique that cannot be extended by adding another node (in the lecture's example graph, the maximal cliques together cover the graph).
* **Decomposition Rule:** The joint distribution is the product of functions defined over these maximal cliques.

---

### **2. Potential Functions ($\psi$)**

* **Definition:** For each maximal clique $C$, we define a **Potential Function** $\psi_C(x_C)$ (sometimes written $\phi_C$).
* It is a **non-negative function** ($\psi_C(x_C) \ge 0$, in practice usually strictly positive) mapping each configuration of the clique's variables to a real-valued score.
* It expresses the "compatibility" of that configuration (in energy-based formulations, the potential is the exponential of a negative energy).
* **Key Distinction:** A potential function is **NOT a probability**. It does not sum to 1; it is just a non-negative score.
  * *Example:* $\psi_{12}(x_1, x_2)$ scores the interaction between $x_1$ and $x_2$.

---

### **3. The Partition Function ($Z$)**

Since the product of potential functions is not a probability distribution (it does not sum to 1), we must normalize it.

* **Definition:** The normalization constant is called the **Partition Function** ($Z$):
  $$Z = \sum_{x} \prod_{C} \psi_C(x_C)$$
* **Role:** It ensures that the resulting distribution sums to 1, making it a valid probability distribution:
  $$P(x) = \frac{1}{Z} \prod_{C} \psi_C(x_C)$$
* **Calculation:** To find $Z$, we sum the product of potentials over **all possible states** (joint configurations) of the random variables. This summation is often the computationally expensive part.

#### **Example Calculation**

The lecture walks through a simple example with 4 binary variables and two cliques: $\{x_1, x_2, x_3\}$ and $\{x_3, x_4\}$.

* **Step 1:** Define potential tables for $\psi_{123}$ and $\psi_{34}$.
* **Step 2:** Compute the score $\psi_{123}(x_1, x_2, x_3)\,\psi_{34}(x_3, x_4)$ for every one of the $2^4$ combinations.
* **Step 3:** Sum all scores to get $Z$. In the lecture's example, $Z = 10$.
* **Step 4:** The probability of any specific state, e.g. $P(1,0,0,0)$, is its score divided by $Z$ (e.g. $(1 \times 3)/10$ or similar, depending on the table values).

A brute-force code sketch of this computation is given after Section 4.

---

### **4. Parameter Estimation**

* **Discrete Case:** If the variables are discrete (as in the email spam example), the parameters are the entries of the potential tables; we estimate them from data by maximizing the likelihood.
* **Continuous Case:** If the variables are continuous, the potentials are typically chosen to be Gaussian, and we estimate means and covariances.
* **Reduction:** Just as in Bayesian Networks, the graph structure reduces the number of parameters.
  * *Without the graph:* a full joint table for 4 binary variables needs $2^4 = 16$ entries.
  * *With the graph:* we only need tables for the cliques ($2^3 + 2^2 = 12$ entries in the example above), and the saving grows rapidly with the number of variables.
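To make the normalization and table-based parameterization concrete, here is a minimal brute-force sketch in Python for the two-clique example ($\{x_1, x_2, x_3\}$ and $\{x_3, x_4\}$). The potential values below are made up for illustration (they do not reproduce the lecture's tables or its $Z = 10$); only the procedure — score every joint state, sum the scores to get $Z$, then divide — follows the steps above.

```python
import itertools

# Hypothetical potential tables (illustrative values only, not the lecture's).
# psi_123[(x1, x2, x3)] and psi_34[(x3, x4)] assign a non-negative score
# to each configuration of their clique.
psi_123 = {cfg: 1.0 + cfg[0] + 2 * cfg[1] * cfg[2]
           for cfg in itertools.product([0, 1], repeat=3)}
psi_34 = {cfg: 2.0 if cfg[0] == cfg[1] else 0.5
          for cfg in itertools.product([0, 1], repeat=2)}

def score(x1, x2, x3, x4):
    """Unnormalized score of one joint state: product of clique potentials."""
    return psi_123[(x1, x2, x3)] * psi_34[(x3, x4)]

# Steps 2-3: score every one of the 2^4 joint states and sum to get Z.
Z = sum(score(*state) for state in itertools.product([0, 1], repeat=4))

# Step 4: the probability of a specific state is its score divided by Z.
p_1000 = score(1, 0, 0, 0) / Z
print(f"Z = {Z:.3f}   P(x = 1,0,0,0) = {p_1000:.4f}")
```

Because the sum runs over all $2^n$ joint states, this brute-force approach only scales to small examples, which is why the lecture flags the computation of $Z$ as expensive.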
---

### **5. Verifying Conditional Independence**

The lecture demonstrates analytically that the potential-function formulation preserves the conditional independence properties encoded by the graph.

* **Scenario:** Chain-structured graph $x_1 - x_2 - x_3 - x_4$ (maximal cliques $\{x_1, x_2\}$, $\{x_2, x_3\}$, $\{x_3, x_4\}$).
  * Is $x_4$ independent of $x_1$ given $x_3$?
* **Analytical Check:**
  * Compute $P(x_4 = 1 \mid x_1 = 0, x_2 = 1, x_3 = 1)$.
  * Compute $P(x_4 = 1 \mid x_1 = 0, x_2 = 0, x_3 = 1)$.
* **Result:** As long as $x_3$ is fixed (given), the potentials involving $x_1$ and $x_2$ appear in both numerator and denominator and cancel out of the ratio:
  $$P(x_4 = 1 \mid x_1, x_2, x_3) = \frac{\psi_{34}(x_3, 1)}{\psi_{34}(x_3, 0) + \psi_{34}(x_3, 1)},$$
  which depends only on the potential over the clique $\{x_3, x_4\}$.
* **Conclusion:** This confirms that $x_4 \perp \{x_1, x_2\} \mid x_3$; the formulation correctly encodes the (global) Markov property of the graph.
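The same cancellation can be checked numerically. The sketch below uses made-up pairwise potentials for the chain's three maximal cliques (these values are not from the lecture); whatever values the tables hold, $P(x_4 = 1 \mid x_1, x_2, x_3)$ comes out identical for both conditioning settings above and equals the closed-form ratio that involves only $\psi_{34}$.

```python
import itertools

# Hypothetical pairwise potentials for the chain x1 - x2 - x3 - x4
# (any non-negative values would do; these are purely illustrative).
psi_12 = {(a, b): 1.0 + 2 * a * b for a, b in itertools.product([0, 1], repeat=2)}
psi_23 = {(a, b): 3.0 if a == b else 1.0 for a, b in itertools.product([0, 1], repeat=2)}
psi_34 = {(a, b): 2.0 if a != b else 0.5 for a, b in itertools.product([0, 1], repeat=2)}

def joint_score(x1, x2, x3, x4):
    """Unnormalized joint: product over the three maximal cliques."""
    return psi_12[(x1, x2)] * psi_23[(x2, x3)] * psi_34[(x3, x4)]

def cond_x4(x1, x2, x3):
    """P(x4 = 1 | x1, x2, x3): the partition function cancels in this ratio."""
    s1 = joint_score(x1, x2, x3, 1)
    s0 = joint_score(x1, x2, x3, 0)
    return s1 / (s0 + s1)

# Identical conditionals once x3 is fixed, regardless of x1 and x2:
print(cond_x4(0, 1, 1))   # P(x4=1 | x1=0, x2=1, x3=1)
print(cond_x4(0, 0, 1))   # P(x4=1 | x1=0, x2=0, x3=1)

# Both equal the closed form that depends only on psi_34 and x3 = 1:
print(psi_34[(1, 1)] / (psi_34[(1, 0)] + psi_34[(1, 1)]))
```

Note that $Z$ is never computed here: it cancels in the conditional ratio, which is exactly why only the $\{x_3, x_4\}$ potential survives.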