update notes in midterm2

2025-10-27 17:19:12 +09:00
parent 13b27cc21e
commit 2a7cb64862
2 changed files with 125 additions and 148 deletions
--- a/notes/3.md
+++ b/notes/3.md
@@ -210,7 +210,7 @@ As there can be conflicts: For a given state(stack + input) there can be multipl

 ### LR Grammars

-* LR(k): left-toright scanning, right most derivation and $k$ symbol lookahead
+* LR(k): left-to-right scanning, rightmost derivation (in reverse), and $k$-symbol lookahead
 * LR(0) Grammar
 LR(0) indicates grammars that can determine actions without any lookahead.
 There are **no reduce-reduce or shift-reduce conflicts**, because every action must be determined by the stack alone.
@@ -243,10 +243,86 @@ And table consists of four different actions:
 * `reduce x -> a`: pop a from the stack and push <`x`, `goto[curr_state, x]`>
 * accept(`S' -> S$.`) / Error

-DFA states are converted to index of each rows.
+Each DFA state also becomes the index of one row of the table.
+
+However, LR(0) breaks down when a table cell could be filled with more than one action; such conflicts are resolved with **lookahead**.

 ### SLR(1) Parsing

 A simple extension of LR(0).

-For each reduction `X -> b`, look at the next symbol `c` and then apply reduction only if `c` is in `Follow(X)`.
+For each reduction `X -> b`, look at the next input symbol `c` (the lookahead) and apply the reduction **only if `c` is in `Follow(X)`**.
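+
+The SLR(1) reduce condition can be sketched as a single lookup. This is only an illustration: the `follow` map and the name `should_reduce` are hypothetical, and the Follow sets are assumed to have been computed beforehand.
+
+```cpp
+#include <map>
+#include <set>
+
+// Hypothetical helper: `follow` maps each nonterminal to its Follow set.
+// Returns true when the reduction `lhs -> ...` is allowed under SLR(1),
+// i.e. when the lookahead symbol is in Follow(lhs).
+bool should_reduce(const std::map<char, std::set<char>>& follow,
+                   char lhs, char lookahead) {
+    auto it = follow.find(lhs);
+    return it != follow.end() && it->second.count(lookahead) > 0;
+}
+```
+
+With `Follow(E) = {+, ), $}`, a completed item for `E` is reduced on `+` but not on `*`.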
+
+### LR(1) Parsing 
+
+LR(1) uses lookahead more precisely. Its items carry a lookahead symbol, as in `X -> a.b,c`, which means:
+1. `a` has already been matched and is on top of the stack
+2. the parser next expects to see `b`, followed by `c`
+
+Also, `X -> a.b,{x1,x2,...,xn}` is shorthand for:
+* for all `i` in `{x1,...,xn}`, `X -> a.b,i`
+
+To build the automaton, we extend the $\epsilon$-closure and `goto` operations so that they carry lookaheads.
+
+LR(1) closure computation:
+* start with `Closure(S) = S`
+* for each item `[X -> a.Bb,c]` in `Closure(S)` and each production `B -> y`
+  * add `[B -> .y,First(bc)]`
+* repeat until no new items are added
+* the initial state is the closure of `[S' -> .S,$]`
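+
+The closure steps above can be sketched with a worklist. This is a minimal sketch under stated assumptions, not a full implementation: nonterminals are uppercase characters, terminals are any other character, the grammar has **no $\epsilon$-productions** (so `First(bc)` is `First(b)` when `b` is non-empty and `{c}` otherwise), and `Production`, `Item`, `first_of` and `closure` are illustrative names.
+
+```cpp
+#include <cctype>
+#include <set>
+#include <string>
+#include <tuple>
+#include <vector>
+
+struct Production { char lhs; std::string rhs; };
+// An LR(1) item: (production index, dot position, lookahead terminal).
+using Item = std::tuple<int, int, char>;
+
+// Collects First(sym) into `out`. Assumes no epsilon-productions;
+// `seen` guards against looping on left-recursive nonterminals.
+void first_of(const std::vector<Production>& g, char sym,
+              std::set<char>& out, std::set<char>& seen) {
+    if (!std::isupper(static_cast<unsigned char>(sym))) { out.insert(sym); return; }
+    if (!seen.insert(sym).second) return;
+    for (const Production& p : g)
+        if (p.lhs == sym) first_of(g, p.rhs[0], out, seen);
+}
+
+// Expands `items` until no new LR(1) item can be added.
+std::set<Item> closure(const std::vector<Production>& g, std::set<Item> items) {
+    std::vector<Item> work(items.begin(), items.end());
+    while (!work.empty()) {
+        Item cur = work.back();
+        work.pop_back();
+        const std::string& rhs = g[std::get<0>(cur)].rhs;
+        int dot = std::get<1>(cur);
+        // Only items of the form [X -> a.Bb,c] with B a nonterminal expand.
+        if (dot >= static_cast<int>(rhs.size())) continue;
+        char B = rhs[dot];
+        if (!std::isupper(static_cast<unsigned char>(B))) continue;
+        // Lookaheads for the new items: First(bc).
+        std::set<char> lookaheads, seen;
+        if (dot + 1 < static_cast<int>(rhs.size()))
+            first_of(g, rhs[dot + 1], lookaheads, seen);
+        else
+            lookaheads.insert(std::get<2>(cur));
+        for (int k = 0; k < static_cast<int>(g.size()); ++k) {
+            if (g[k].lhs != B) continue;
+            for (char c : lookaheads) {
+                Item next(k, 0, c);
+                if (items.insert(next).second) work.push_back(next);
+            }
+        }
+    }
+    return items;
+}
+```
+
+For the grammar `Z -> S`, `S -> CC`, `C -> cC | d` (with `Z` standing in for `S'`), the closure of `[Z -> .S,$]` contains `[S -> .CC,$]` plus the `C` items with lookaheads `c` and `d`.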
+
+LR(1) `goto`:
+For every item `[X -> a.Bb,c]` in the state `I`, `Goto/Shift(I, B)` is the closure of the advanced items `[X -> aB.b,c]`.
+
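+The LR(1) parsing table is built the same way as the LR(0) table except for **reductions**: an item `[X -> a.,c]` adds `reduce X -> a` only in the column of its lookahead `c`.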
+
+### LALR(1) Parsing
+
+LR(1) has too many states; LALR(1) parsing addresses this.
+
+LALR(1) stands for **LookAhead LR**.
+It constructs the LR(1) DFA and merges any two LR(1) states whose items are identical except for their lookaheads. This reduces the number of parser table entries, but it is theoretically less powerful than LR(1).
+
+LALR(1) generally has the same number of states as SLR(1), and far fewer than LR(1).
+We will not go into the details of LALR(1).
+
+### LL/LR Grammars
+
+1. LL Parsing Tables
+   * Table[NT, T] = Production to apply
+   * Computed using the First and Follow sets
+2. LR Parsing Tables
+   * Table[LR State, Term] = shift/reduce/error/accept
+   * Table[LR State, NT] = goto/err
+   * Computed using the closure and goto operations on LR states
+
+## Automatic Disambiguation
+
+Writing an unambiguous grammar that encodes precedence and associativity is highly complex. Instead, we can keep the ambiguous grammar and declare precedence, which resolves its shift-reduce conflicts: the parser compares the precedence of the terminal on the stack with that of the terminal on the input.
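+
+A minimal sketch of such a rule, assuming each operator terminal has a declared precedence level and associativity. The names `OpInfo` and `resolve` are hypothetical, and the tie-breaking by associativity follows the convention used by yacc-style generators rather than anything fixed by these notes.
+
+```cpp
+#include <map>
+
+enum class Action { Shift, Reduce, Error };
+enum class Assoc { Left, Right, NonAssoc };
+
+// Declared precedence (higher binds tighter) and associativity of an operator.
+struct OpInfo { int prec; Assoc assoc; };
+
+// Resolves a shift-reduce conflict between the operator already on the
+// stack and the operator waiting on the input.
+Action resolve(const std::map<char, OpInfo>& ops, char stack_op, char input_op) {
+    const OpInfo& s = ops.at(stack_op);
+    const OpInfo& i = ops.at(input_op);
+    if (s.prec > i.prec) return Action::Reduce;  // stack operator binds tighter
+    if (s.prec < i.prec) return Action::Shift;   // input operator binds tighter
+    // Equal precedence: associativity decides.
+    switch (s.assoc) {
+        case Assoc::Left:     return Action::Reduce;
+        case Assoc::Right:    return Action::Shift;
+        case Assoc::NonAssoc: return Action::Error;
+    }
+    return Action::Error;
+}
+```
+
+For example, with `+` below `*` and both left-associative, seeing `*` after `E + E` shifts, while seeing `+` after `E * E` or after `E + E` reduces.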
+
+
+## AST Data Structure
+
+LL/LR parsing implicitly builds the AST.
+* LL parsing: the AST is represented by the productions applied
+* LR parsing: the AST is represented by the reductions performed
+
+### AST Construction in LL
+
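+```cpp
+// Parses S; `token` is the current lookahead token.
+// `expr` is assumed to be a pointer type that can hold the new S node.
+expr parse_S() {
+    switch (token) {
+        case num:
+        case '(': {
+            // Braces scope the locals so that `default` does not jump
+            // past their initialization.
+            expr child1 = parse_E();
+            expr child2 = parse_S_();
+            return new S(child1, child2);  // AST node built from the children
+        }
+        default:
+            ParseError();
+            return nullptr;  // reached only if ParseError() returns
+    }
+}
+```
+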
+### AST Construction in LR
+
+Construction mechanism:
+* Store parts of the tree on the stack
+* For each nonterminal `X` on the stack, store the sub-tree for `X`
+* After each reduce operation for a production `X -> a`, create an AST node for `X` whose children are the sub-trees popped for `a`
+
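+The mechanism above can be sketched with a stack of tree nodes. This is a simplified illustration: the stack holds only nodes (a real LR stack also holds state numbers), every shifted token becomes a leaf, and `Node`, `shift` and `reduce` are hypothetical names.
+
+```cpp
+#include <cassert>
+#include <cstddef>
+#include <memory>
+#include <string>
+#include <utility>
+#include <vector>
+
+struct Node {
+    std::string label;                            // terminal or nonterminal name
+    std::vector<std::unique_ptr<Node>> children;  // empty for leaves
+};
+
+using Stack = std::vector<std::unique_ptr<Node>>;
+
+// Shift: push a leaf node for the terminal.
+void shift(Stack& st, const std::string& tok) {
+    std::unique_ptr<Node> leaf(new Node());
+    leaf->label = tok;
+    st.push_back(std::move(leaf));
+}
+
+// Reduce X -> a: pop the |a| sub-trees and push one node for X above them.
+void reduce(Stack& st, const std::string& lhs, std::size_t rhs_len) {
+    assert(st.size() >= rhs_len);
+    std::unique_ptr<Node> node(new Node());
+    node->label = lhs;
+    Stack::iterator first = st.end() - static_cast<std::ptrdiff_t>(rhs_len);
+    for (Stack::iterator it = first; it != st.end(); ++it)
+        node->children.push_back(std::move(*it));  // keep left-to-right order
+    st.erase(first, st.end());
+    st.push_back(std::move(node));
+}
+```
+
+Parsing `1 + 2` with `E -> E + E | num` would call `shift`, `reduce("E", 1)`, `shift`, `shift`, `reduce("E", 1)`, `reduce("E", 3)`, leaving a single `E` node with three children on the stack. A real AST builder would typically drop punctuation leaves instead of keeping every token.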