# Study Guide: Bayesian Networks & Probabilistic Inference

**Date:** 2025.12.01 (Final Lecture)

**Topic:** Bayesian Networks, Probabilistic Inference Examples, Marginalization.

---
### **1. Recap: Directed vs. Undirected Models**

The lecture begins by briefly contrasting the two types of graphical models discussed:

* **Undirected Graphs (MRFs):** Use potential functions ($\psi$) defined on maximal cliques. They require a normalization constant (the partition function $Z$) to become a probability distribution.
* **Directed Graphs (Bayesian Networks):** Use conditional probability distributions (CPDs). The joint distribution is the product of the local conditional probabilities:

$$P(X) = \prod_{i} P(x_i | \text{parents}(x_i))$$
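To make the factorization concrete, here is a minimal Python sketch for a hypothetical three-node chain $A \rightarrow B \rightarrow C$; the structure and all numbers are assumptions for illustration only, not values from the lecture.

```python
# Directed factorization: the joint probability of a full assignment is the
# product of each node's CPD evaluated at its value and its parents' values.
# Hypothetical chain A -> B -> C with made-up numbers (ASSUMED).

p_A = {True: 0.3, False: 0.7}          # P(A)
p_B_given_A = {True: 0.9, False: 0.2}  # P(B=True | A)
p_C_given_B = {True: 0.6, False: 0.1}  # P(C=True | B)

def joint(a: bool, b: bool, c: bool) -> float:
    """P(A=a, B=b, C=c) = P(a) * P(b|a) * P(c|b)."""
    p_b = p_B_given_A[a] if b else 1 - p_B_given_A[a]
    p_c = p_C_given_B[b] if c else 1 - p_C_given_B[b]
    return p_A[a] * p_b * p_c

print(joint(True, True, False))  # 0.3 * 0.9 * (1 - 0.6) = 0.108
```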
---

### **2. Example 1: The "Alarm" Network (Burglary/Earthquake)**

This is a classic example used to demonstrate inference in Bayesian Networks.

#### **Scenario & Structure**
* **Nodes:**
    * **B:** Burglary (root node, no parents).
    * **E:** Earthquake (root node, no parents).
    * **A:** Alarm (triggered by a Burglary or an Earthquake).
    * **J:** JohnCalls (triggered by the Alarm).
    * **M:** MaryCalls (triggered by the Alarm).
* **Dependencies:** $B \rightarrow A \leftarrow E$, $A \rightarrow J$, $A \rightarrow M$.
* **Probabilities (Given):**
    * $P(B) = 0.05$, $P(E) = 0.1$.
    * $P(A|B, E)$: table given (e.g., $P(A|B, \neg E) = 0.85$, $P(A|\neg B, \neg E) = 0.05$, etc.).
    * $P(J|A) = 0.7$, $P(M|A) = 0.8$.
#### **Task 1: Calculate a Specific Joint Probability**

Calculate the probability of the event: **Burglary, no Earthquake, Alarm rings, John calls, Mary does not call**.

$$P(B, \neg E, A, J, \neg M)$$

* **Decomposition:** Apply the chain rule according to the graph structure:

$$= P(B) \cdot P(\neg E) \cdot P(A | B, \neg E) \cdot P(J | A) \cdot P(\neg M | A)$$

* **Calculation:**

$$= 0.05 \times 0.9 \times 0.85 \times 0.7 \times 0.2 = 0.005355$$
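A quick numeric check of this product in Python, using only the probabilities given above:

```python
# Task 1: P(B, ~E, A, J, ~M) as a single product of the given CPT entries.
p_B = 0.05                # P(B)
p_not_E = 1 - 0.1         # P(~E)
p_A_given_B_notE = 0.85   # P(A | B, ~E)
p_J_given_A = 0.7         # P(J | A)
p_notM_given_A = 1 - 0.8  # P(~M | A)

print(p_B * p_not_E * p_A_given_B_notE * p_J_given_A * p_notM_given_A)  # ≈ 0.005355
```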
#### **Task 2: Inference (Conditional Probability)**

Calculate the probability that a **Burglary occurred**, given that **John called** and **Mary did not call**:

$$P(B | J, \neg M)$$

* **Formula (Bayes Rule):**

$$P(B | J, \neg M) = \frac{P(B, J, \neg M)}{P(J, \neg M)}$$

* **Numerator Calculation ($P(B, J, \neg M)$):**

  We must **marginalize out** the unobserved variables ($A$ and $E$) from the joint distribution.

$$P(B, J, \neg M) = \sum_{A \in \{T,F\}} \sum_{E \in \{T,F\}} P(B, E, A, J, \neg M)$$

  This involves summing 4 terms (one per combination of $A$ and $E$).

* **Denominator Calculation ($P(J, \neg M)$):**

  We further marginalize out $B$ from the numerator result.

$$P(J, \neg M) = P(B, J, \neg M) + P(\neg B, J, \neg M)$$
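A minimal enumeration sketch of this computation. The CPT entries not quoted in these notes (the rest of the $P(A|B,E)$ table and $P(J|\neg A)$, $P(M|\neg A)$) are marked as assumed placeholders; with the lecture's actual values substituted, `num / den` gives $P(B|J,\neg M)$.

```python
# Enumeration sketch for P(B | J, ~M) on the alarm network.
# Entries marked ASSUMED are placeholders for CPT values not quoted in the notes.

p_B = 0.05
p_E = 0.1

p_A = {  # P(A=True | B, E)
    (True,  True):  0.95,   # ASSUMED
    (True,  False): 0.85,   # given
    (False, True):  0.30,   # ASSUMED
    (False, False): 0.05,   # given
}
p_J = {True: 0.7, False: 0.10}  # P(J=True | A); P(J | ~A) is ASSUMED
p_M = {True: 0.8, False: 0.05}  # P(M=True | A); P(M | ~A) is ASSUMED

def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) via the chain rule on the graph."""
    pb = p_B if b else 1 - p_B
    pe = p_E if e else 1 - p_E
    pa = p_A[(b, e)] if a else 1 - p_A[(b, e)]
    pj = p_J[a] if j else 1 - p_J[a]
    pm = p_M[a] if m else 1 - p_M[a]
    return pb * pe * pa * pj * pm

# Numerator: marginalize A and E, with B = True and evidence J = True, M = False.
num = sum(joint(True, e, a, True, False) for a in (True, False) for e in (True, False))
# Denominator: additionally marginalize B.
den = num + sum(joint(False, e, a, True, False) for a in (True, False) for e in (True, False))
print(num / den)  # P(B | J, ~M)
```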
---

### **3. Example 2: 4-Node Tree Structure**

A simpler example that demonstrates how sums simplify during marginalization.

#### **Scenario & Structure**

* **Nodes:** $X_1, X_2, X_3, X_4 \in \{0, 1\}$ (binary).
* **Dependencies:**
    * $X_1 \rightarrow X_2$
    * $X_2 \rightarrow X_3$
    * $X_2 \rightarrow X_4$
* **Decomposition:** $P(X) = P(X_1)P(X_2|X_1)P(X_3|X_2)P(X_4|X_2)$.
* **Given Tables:** Probabilities for the prior $P(X_1)$ and all conditionals are provided in the lecture.
#### **Task: Calculate the Marginal Probability $P(X_3 = 1)$**

We need the probability that $X_3 = 1$ regardless of the values of the other variables.

* **Definition:** Sum the joint probability over all other variables ($X_1, X_2, X_4$):

$$P(X_3=1) = \sum_{x_1} \sum_{x_2} \sum_{x_4} P(x_1, x_2, X_3=1, x_4)$$

* **Step 1: Expand Using the Graph Structure**

$$= \sum_{x_1} \sum_{x_2} \sum_{x_4} P(x_1)P(x_2|x_1)P(X_3=1|x_2)P(x_4|x_2)$$

* **Step 2: Simplify (Key Insight)**

  Push each summation sign as far to the right as possible. The sum over $x_4$ only affects the last factor, $P(x_4|x_2)$:

$$= \sum_{x_1} \sum_{x_2} P(x_1)P(x_2|x_1)P(X_3=1|x_2) \left[ \sum_{x_4} P(x_4|x_2) \right]$$

  * **Property:** $\sum_{x_4} P(x_4|x_2) = 1$ (a conditional distribution always sums to 1 over its own variable).
  * Therefore, the $X_4$ factor vanishes. Intuitively, $X_4$ is a leaf node hanging off $X_2$: if nothing about $X_4$ is observed, it carries no information about $X_3$.

* **Step 3: Final Calculation**

  We are left with a sum over $X_1$ and $X_2$:

$$= \sum_{x_1} \sum_{x_2} P(x_1)P(x_2|x_1)P(X_3=1|x_2)$$

  This expands to 4 terms (the combinations of $x_1 \in \{0,1\}$ and $x_2 \in \{0,1\}$); a numerical sketch follows below.
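A short Python sketch of this computation. The lecture's actual CPT values are not reproduced in these notes, so the numbers below are assumptions chosen only to show that the full triple sum and the simplified double sum agree:

```python
# Sketch of the P(X3 = 1) marginalization with made-up CPTs (all numbers ASSUMED).

p_x1 = {0: 0.4, 1: 0.6}              # P(X1)
p_x2 = {0: {1: 0.3}, 1: {1: 0.8}}    # P(X2=1 | X1)
p_x3 = {0: {1: 0.5}, 1: {1: 0.9}}    # P(X3=1 | X2)
p_x4 = {0: {1: 0.2}, 1: {1: 0.7}}    # P(X4=1 | X2)

def cond(table, parent, value):
    """Return P(child=value | parent) from a table storing only P(child=1 | parent)."""
    p1 = table[parent][1]
    return p1 if value == 1 else 1 - p1

# Brute force: sum the full joint over x1, x2, x4 with X3 fixed to 1.
brute = sum(
    p_x1[x1] * cond(p_x2, x1, x2) * cond(p_x3, x2, 1) * cond(p_x4, x2, x4)
    for x1 in (0, 1) for x2 in (0, 1) for x4 in (0, 1)
)

# Simplified: the inner sum over x4 of P(x4 | x2) equals 1, so that factor drops out.
simplified = sum(
    p_x1[x1] * cond(p_x2, x1, x2) * cond(p_x3, x2, 1)
    for x1 in (0, 1) for x2 in (0, 1)
)

print(brute, simplified)  # equal (up to floating-point rounding)
```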
---

### **4. Semester Summary & Conclusion**

The lecture concludes the semester's material.

* **Key Themes Covered:**
    * **Discriminative vs. Generative Methods:** The fundamental difference in approach (modeling the decision boundary vs. modeling the data distribution).
    * **Objective Functions:** Designing loss functions vs. likelihood functions.
    * **Optimization:** Parameter estimation via derivatives (MLE).
    * **Graphical Models:** Reducing parameter complexity using independence assumptions (Bayes Nets, MRFs).
* **Final Exam:** Scheduled for Thursday, December 11th. It will cover the concepts discussed, with an emphasis on understanding the fundamentals (e.g., likelihood, generative principles) rather than rote memorization.