
Study Guide: Undirected Graphical Models (Markov Random Fields)

Date: 2025.11.27
Topic: Potential Functions, Partition Function, and Conditional Independence in MRFs.


1. Recap: Decomposition in Undirected Graphs

Unlike Directed Graphical Models (Bayesian Networks), which factorize the joint distribution into conditional probabilities, Undirected Graphical Models (Markov Random Fields, MRFs) cannot use conditional probabilities directly because their edges have no direction or causal interpretation. Instead, they decompose the joint distribution over Maximal Cliques.

  • Cliques: Subsets of nodes where every node is connected to every other node.
  • Maximal Clique: A clique that cannot be enlarged by adding another node while remaining fully connected (in the example graph, the maximal cliques together cover the whole graph; see the enumeration sketch after this list).
  • Decomposition Rule: The joint distribution is the product of functions defined over these maximal cliques.
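
To make the clique terminology concrete, here is a minimal sketch (the graph below is a stand-in, not necessarily the lecture's exact example) that uses networkx to enumerate maximal cliques:

```python
import networkx as nx

# A small stand-in graph: a triangle {1, 2, 3} plus the edge 3-4.
G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4)])

# nx.find_cliques enumerates the maximal cliques of an undirected graph.
print(sorted(map(sorted, nx.find_cliques(G))))  # [[1, 2, 3], [3, 4]]
```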

2. Potential Functions (\psi)

  • Definition: For each maximal clique C, we define a Potential Function \psi_C(x_C) (often denoted as \phi or \psi).
    • It is a non-negative function (\psi_C(x_C) \ge 0, typically required to be strictly positive) mapping each configuration of the clique variables to a real-valued score.
    • It represents the "compatibility" of that configuration: higher values mean more favorable (lower-energy) configurations.
  • Key Distinction: A potential function is NOT a probability: its values need not sum to 1. It is simply a non-negative score.
    • Example: \psi_{12}(x_1, x_2) scores the interaction between x_1 and x_2 (see the small table sketch below).
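
A tiny numeric sketch (the values are placeholders, not from the lecture): a potential table over a pair of binary variables is just a non-negative 2x2 array, and its entries need not sum to 1.

```python
import numpy as np

# Hypothetical potential table psi_12(x1, x2): rows index x1, columns index x2.
psi_12 = np.array([[4.0, 1.0],
                   [1.0, 2.0]])  # higher score = more "compatible" configuration

print(psi_12.sum())  # 8.0 -- the entries do not sum to 1, so this is not a probability table
```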

3. The Partition Function (Z)

Since the product of potential functions is not a probability distribution (it doesn't sum to 1), we must normalize it.

  • Definition: The normalization constant is called the Partition Function (Z). Z = \sum_{x} \prod_{C} \psi_C(x_C)
  • Role: It ensures that the resulting distribution sums to 1, making it a valid probability distribution. P(x) = \frac{1}{Z} \prod_{C} \psi_C(x_C)
  • Calculation: To find Z, we must sum the product of potentials over every possible joint state (combination) of the random variables. This summation is often computationally expensive (a brute-force sketch follows this list).
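
A brute-force sketch of this definition (the clique/potential data structure and the function name are assumptions for illustration, not the lecture's notation): enumerate every joint state of the binary variables, multiply the clique potentials, and sum.

```python
import itertools

def partition_function(potentials, n_vars):
    """Z = sum over all joint states of prod_C psi_C(x_C), for binary variables.

    `potentials` maps each clique (a tuple of variable indices) to a table
    (e.g. a numpy array) indexed by those variables' values.
    """
    Z = 0.0
    for state in itertools.product([0, 1], repeat=n_vars):  # all 2^n_vars joint states
        score = 1.0
        for clique, psi in potentials.items():
            score *= psi[tuple(state[i] for i in clique)]
        Z += score
    return Z  # the loop is exponential in n_vars, which is why Z is expensive in general
```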

Example Calculation

The lecture walks through a simple example with 4 binary variables and two cliques: \{x_1, x_2, x_3\} and \{x_3, x_4\}.

  • Step 1: Define potential tables for \psi_{123} and \psi_{34}.
  • Step 2: Calculate the score for every combination.
  • Step 3: Sum all scores to get Z. In the example, Z=10.
  • Step 4: The probability of any specific state (e.g., P(1,0,0,0)) is that state's score (the product of the relevant potential entries) divided by Z, e.g. something like (1 \times 3)/10 depending on the table values (see the sketch below).
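
Reusing the partition_function sketch above, the same steps can be run with placeholder tables (arbitrary positive values, not the lecture's, so Z will not come out to 10 here):

```python
import numpy as np

# Placeholder potential tables (arbitrary positive scores, NOT the lecture's values).
psi_123 = np.arange(1.0, 9.0).reshape(2, 2, 2)  # psi_123(x1, x2, x3)
psi_34 = np.array([[3.0, 1.0],                  # psi_34(x3, x4)
                   [1.0, 2.0]])

# Cliques {x1, x2, x3} and {x3, x4}, with variables indexed 0..3.
potentials = {(0, 1, 2): psi_123, (2, 3): psi_34}
Z = partition_function(potentials, n_vars=4)    # Steps 2-3: score and sum all 2^4 states

# Step 4: P(x1=1, x2=0, x3=0, x4=0) is that state's score divided by Z.
print(psi_123[1, 0, 0] * psi_34[0, 0] / Z)
```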

4. Parameter Estimation

  • Discrete Case: If variables are discrete (like the email spam example), the parameters are the entries in the potential tables. We estimate these values from data to maximize the likelihood.
  • Continuous Case: If variables are continuous, potential functions are typically Gaussian distributions. We estimate means and covariances.
  • Reduction: Just like in Bayesian Networks, using the graph structure reduces the number of parameters.
    • Without Graph: A full table for 4 binary variables needs 2^4 = 16 entries.
    • With Graph: We only need tables for the cliques, significantly reducing complexity (see the worked count below).
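
As a worked count for the two-clique example from Section 3 (assuming one table entry per clique configuration): the clique tables need 2^3 + 2^2 = 12 entries, versus 2^4 = 16 for the full joint table, and the gap widens as the number of variables grows (for graphs with small cliques).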

5. Verifying Conditional Independence

The lecture demonstrates analytically that the potential function formulation preserves the conditional independence properties of the graph.

  • Scenario: Graph with structure x_1 - x_2 - x_3 - x_4.
    • Is x_4 independent of x_1 given x_3?
  • Analytical Check:
    • We calculate P(x_4=1 | x_1=0, x_2=1, x_3=1).
    • We also calculate P(x_4=1 | x_1=0, x_2=0, x_3=1).
  • Result: The calculation shows that, as long as x_3 is fixed (given), every potential involving x_1 and x_2 appears in both the numerator and the denominator and cancels out of the ratio.
    • P(x_4 = 1 \mid x_1, x_2, x_3) = \frac{\psi_{34}(x_3, 1)}{\psi_{34}(x_3, 0) + \psi_{34}(x_3, 1)}, which depends only on the potential over the clique \{x_3, x_4\}.
  • Conclusion: This confirms that x_4 \perp \{x_1, x_2\} \mid x_3. The formulation correctly encodes the global Markov property (a numerical check is sketched below).
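
A numerical sketch of this check (the chain's pairwise potentials below are randomly generated placeholders, not the lecture's values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical strictly positive pairwise potentials for the chain x1 - x2 - x3 - x4.
psi_12, psi_23, psi_34 = (rng.uniform(0.5, 2.0, size=(2, 2)) for _ in range(3))

def joint_score(x1, x2, x3, x4):
    # Unnormalized score: product of the pairwise clique potentials.
    return psi_12[x1, x2] * psi_23[x2, x3] * psi_34[x3, x4]

def cond_x4(x1, x2, x3):
    # P(x4 = 1 | x1, x2, x3): Z cancels, so normalizing over x4 alone is enough.
    s0, s1 = joint_score(x1, x2, x3, 0), joint_score(x1, x2, x3, 1)
    return s1 / (s0 + s1)

# With x3 fixed, the values of x1 and x2 do not matter:
print(cond_x4(0, 1, 1), cond_x4(0, 0, 1))            # identical
print(psi_34[1, 1] / (psi_34[1, 0] + psi_34[1, 1]))  # matches the closed form above
```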