# Dataflow Analysis

Optimization means improving resource utilization without changing what the program computes. Resource utilization can mean many things:

* **Execution time**
* Code size
* Network messages sent

## Basic Block (BB)

A BB is a maximal sequence of instructions with **no labels** (except at the first instruction) and **no jumps** (except in the last instruction).

All instructions in a BB have fixed control flow: execution enters at the first instruction and runs straight through to the last.

## Control Flow Graph

```mermaid
flowchart TD
    Entry --> A
    A[BB1] --> B[BB2]
    A --> C[BB3]
    B --> D[BB4]
    C --> D
    D --> E[BB5]
    E --> G[BB7]
    E --> F[BB6]
    G --> Exit
    F --> Exit
```

## Local Optimization

### Algebraic Simplification

* `x := x + 0` -> can be deleted (it has no effect)
* `x := x * 0` -> `x := 0`
* `y := y ** 2` -> `y := y * y`
* `x := x * 8` -> `x := x << 3`
* `x := x * 15` -> `t := x << 4; x := t - x`

### Constant Folding

* `x := 2 + 2` -> `x := 4`
* `if 2 < 0 jump L` -> `nop`
* `if 2 > 0 jump L` -> `jump L`

Constant folding can be dangerous in cross-compilation: the machine running the compiler may evaluate an expression with a different precision than the target machine would.

### Unreachable Code

### Dead Code Elimination

## Global Optimization

Global optimization is not truly global: its scope is an entire control flow graph rather than a single basic block.

A basic block typically contains only a few instructions (4-8), so local optimization alone leaves little room for improvement. Many optimizations become possible only when the analysis spans the entire CFG.

## Dataflow Analysis

* Local Dataflow Analysis
* Global Dataflow Analysis

Analysis of Reaching Definitions

Effect of an Instruction: `IN[b]` and `OUT[b]` are the sets of definitions that reach the entry and the exit of `b`.

Meet Operator: `IN[b] = union(OUT[p1]...OUT[pn])`, where `p1...pn` are the predecessors of `b`.

```c
// init
OUT[entry] = {}
```

## Liveness Analysis

A variable is live at a program point if its current value may be used in the future. Liveness information helps **eliminate dead code**.

Transfer function

* `USE[b]` set of variables used in `b` (before any definition in `b`)
* `DEF[b]` set of variables defined in `b`

So the transfer function `f_b` for a basic block `b` is:

```
IN[b] = USE[b] + (OUT[b] - DEF[b])
```

As with reaching definitions, the meet operator is a union, but liveness flows backward, so it combines the `IN` sets of the successors `s1...sn` of `b`:

```
OUT[b] = union(IN[s1]...IN[sn])
```

To support cyclic graphs (loops), the equations must be recomputed repeatedly until the `IN` and `OUT` sets stop changing.
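
The repeated computation for liveness can be sketched as a simple fixed-point loop. This is a minimal Python sketch under stated assumptions, not a definitive implementation: the function name `liveness`, the dictionaries `succ`, `use`, and `defs`, and the four-block example CFG (`B1` to `B4`, containing a loop between `B2` and `B3`) are all hypothetical and chosen only for illustration. `USE[b]` is taken to mean the variables read in `b` before any definition in `b`.

```python
def liveness(succ, use, defs):
    """Iterate the liveness equations until IN/OUT stop changing.

    succ, use, defs are dicts keyed by block name (hypothetical layout).
    """
    live_in = {b: set() for b in succ}
    live_out = {b: set() for b in succ}
    changed = True
    while changed:
        changed = False
        for b in succ:
            # Meet: OUT[b] = union of IN[s] over the successors s of b.
            out_b = set()
            for s in succ[b]:
                out_b |= live_in[s]
            # Transfer: IN[b] = USE[b] + (OUT[b] - DEF[b])
            in_b = use[b] | (out_b - defs[b])
            if in_b != live_in[b] or out_b != live_out[b]:
                live_in[b], live_out[b] = in_b, out_b
                changed = True
    return live_in, live_out


# Assumed example program (not from the notes):
#   B1: i := 0; s := 0
#   B2: if i >= n jump B4
#   B3: s := s + i; i := i + 1; jump B2
#   B4: return s
succ = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}
use = {"B1": set(), "B2": {"i", "n"}, "B3": {"s", "i"}, "B4": {"s"}}
defs = {"B1": {"i", "s"}, "B2": set(), "B3": {"s", "i"}, "B4": set()}

live_in, live_out = liveness(succ, use, defs)
for b in succ:
    print(b, "IN =", sorted(live_in[b]), "OUT =", sorted(live_out[b]))
```

Because of the back edge from `B3` to `B2`, a single pass over the blocks is not enough: `s` becomes live at the entry of `B2` only after `IN[B3]` and `IN[B4]` have been computed. The loop ends because the sets only ever grow and are bounded by the finite set of variables.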