# Optimization

There's more to performance than asymptotic complexity (time complexity). Not all instructions take the same amount of time, so constant factors matter too! To optimize performance, we need to understand the system:

* How programs are compiled and executed
* How modern processors and memory systems operate
* How to measure performance and identify bottlenecks
* How to improve performance without destroying code modularity and generality

An optimizing compiler provides an efficient mapping of the program to machine code:

* Register allocation
* Code selection and ordering (scheduling)
* Dead code elimination
* Eliminating minor inefficiencies

**Compilers don't improve asymptotic efficiency**.

## Generally Useful Optimizations

### Code Motion (Hoisting)

Reduce the frequency with which a computation is performed. If it will always produce the same result, move it to a place where it is computed once and reused. This applies especially to moving code out of loops.

```c {cmd=gcc args=[-Og -x c -c $input_file -o 4_1.o]}
void set_row(double *a, double *b, long i, long n) {
    long j;
    for (j = 0; j < n; j++) {
        a[i * n + j] = b[j];
    }
}
```

**Default**

```c {cmd=gcc args=[-O1 -x c -c $input_file -o 4_2.o]}
void set_row(double *a, double *b, long i, long n) {
    long j;
    for (j = 0; j < n; j++) {
        a[i * n + j] = b[j];
    }
}
```

**Optimized**

```c
void set_row_opt(double *a, double *b, long i, long n) {
    long j;
    long ni = n * i; /* loop-invariant product, computed once */
    for (j = 0; j < n; j++) {
        a[ni + j] = b[j];
    }
}
```

Disassembly of `4_1.o` (the default source compiled with `-Og`):

```sh {cmd hide}
while ! [ -r 4_1.o ]; do sleep .1; done; objdump -d 4_1.o
```

Here `imul` is located inside the loop.

Disassembly of `4_2.o` (the default source compiled with `-O1`):

```sh {cmd hide}
while ! [ -r 4_2.o ]; do sleep .1; done; objdump -d 4_2.o
```

Here we can see that `imul` has been moved out of the loop, which is the same transformation `set_row_opt` performs by hand.
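
To make the claim that hoisting preserves behavior concrete, here is a minimal sketch that runs both versions on the same input and compares the results. The helper `rows_match` is hypothetical (it is not part of the original notes), and `set_row_opt` is written with a `long` index on the assumption that the offset should have the same width as `n` and `i`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Version from the text: i * n is recomputed on every iteration. */
void set_row(double *a, double *b, long i, long n) {
    long j;
    for (j = 0; j < n; j++) {
        a[i * n + j] = b[j];
    }
}

/* Hand-hoisted version: the loop-invariant product is computed once. */
void set_row_opt(double *a, double *b, long i, long n) {
    long j;
    long ni = n * i;
    for (j = 0; j < n; j++) {
        a[ni + j] = b[j];
    }
}

/* Hypothetical helper: copies b into row i of two n x n matrices, one
 * with each version. Returns 1 if the matrices are identical, 0 if
 * they differ, and -1 if allocation fails. */
int rows_match(long i, long n) {
    assert(n > 0 && i >= 0 && i < n);
    size_t cells = (size_t)n * (size_t)n;
    double *a1 = calloc(cells, sizeof *a1);
    double *a2 = calloc(cells, sizeof *a2);
    double *b = malloc((size_t)n * sizeof *b);
    int same = -1;
    if (a1 && a2 && b) {
        for (long j = 0; j < n; j++) {
            b[j] = (double)(j + 1);
        }
        set_row(a1, b, i, n);
        set_row_opt(a2, b, i, n);
        same = memcmp(a1, a2, cells * sizeof *a1) == 0;
    }
    free(a1);
    free(a2);
    free(b);
    return same;
}
```

Calling `rows_match(3, 8)` from a test driver should return 1; the check says nothing about speed, only that the two versions write the same data.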
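
### Reduction in Strength

Replace a costly operation with a simpler one. Here the multiplication `n * i` in each outer iteration is replaced by a running sum that adds `n` at the end of each iteration.

**Default**

```c
void test_reduction(double *a, double *b, long n) {
    long i, j;
    for (i = 0; i < n; i++) {
        long ni = n * i; /* one multiplication per row */
        for (j = 0; j < n; j++) {
            a[ni + j] = b[j];
        }
    }
}
```

**Optimized**

```c
void test_reduction_opt(double *a, double *b, long n) {
    long i, j;
    long ni = 0; /* always equal to n * i */
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            a[ni + j] = b[j];
        }
        ni += n; /* addition replaces the multiplication */
    }
}
```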
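
The following sketch checks that replacing the multiplication with a running sum leaves the result unchanged. It assumes a signature without the `i` parameter (so that `i` can be a local loop counter) and `long` indices throughout; the helper `reduction_match` is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Multiplies once per row to find the row offset. */
void test_reduction(double *a, double *b, long n) {
    long i, j;
    for (i = 0; i < n; i++) {
        long ni = n * i;
        for (j = 0; j < n; j++) {
            a[ni + j] = b[j];
        }
    }
}

/* Keeps a running offset instead: ni == n * i at the top of each row. */
void test_reduction_opt(double *a, double *b, long n) {
    long i, j;
    long ni = 0;
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            a[ni + j] = b[j];
        }
        ni += n;
    }
}

/* Hypothetical helper: fills two n x n matrices, one with each version.
 * Returns 1 if they are identical, 0 if not, -1 if allocation fails. */
int reduction_match(long n) {
    assert(n > 0);
    size_t cells = (size_t)n * (size_t)n;
    double *a1 = calloc(cells, sizeof *a1);
    double *a2 = calloc(cells, sizeof *a2);
    double *b = malloc((size_t)n * sizeof *b);
    int same = -1;
    if (a1 && a2 && b) {
        for (long j = 0; j < n; j++) {
            b[j] = (double)(2 * j + 1);
        }
        test_reduction(a1, b, n);
        test_reduction_opt(a2, b, n);
        same = memcmp(a1, a2, cells * sizeof *a1) == 0;
    }
    free(a1);
    free(a2);
    free(b);
    return same;
}
```

A driver can call `reduction_match(16)` and expect 1. The running sum works because the offsets of consecutive rows differ by exactly `n`.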

### Share Common Subexpressions

Reuse portions of expressions. The default version multiplies by `n` for every neighbor index, while the optimized version computes `i * n + j` once and derives each neighbor by adding or subtracting `n` or `1`.

**Default**

```c {cmd=gcc args=[-O1 -x c -c $input_file -o 4_3.o]}
double test_scs(double *val, long i, long j, long n) {
    double up, down, left, right;
    up = val[(i - 1) * n + j];
    down = val[(i + 1) * n + j];
    left = val[i * n + (j - 1)];
    right = val[i * n + (j + 1)];
    return up + down + left + right;
}
```

**Optimized**

```c
double test_scs_opt(double *val, long i, long j, long n) {
    double up, down, left, right;
    long inj = i * n + j; /* shared subexpression */
    up = val[inj - n];
    down = val[inj + n];
    left = val[inj - 1];
    right = val[inj + 1];
    return up + down + left + right;
}
```
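
As a sanity check on the index arithmetic, the sketch below compares both versions on every interior cell of a small grid. It assumes both functions take the same `(val, i, j, n)` parameters; the helper `scs_match` and the fill pattern are hypothetical choices made for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Computes each neighbor index from scratch. */
double test_scs(double *val, long i, long j, long n) {
    double up, down, left, right;
    up = val[(i - 1) * n + j];
    down = val[(i + 1) * n + j];
    left = val[i * n + (j - 1)];
    right = val[i * n + (j + 1)];
    return up + down + left + right;
}

/* Computes i * n + j once and offsets it for each neighbor. */
double test_scs_opt(double *val, long i, long j, long n) {
    double up, down, left, right;
    long inj = i * n + j;
    up = val[inj - n];
    down = val[inj + n];
    left = val[inj - 1];
    right = val[inj + 1];
    return up + down + left + right;
}

/* Hypothetical helper: returns 1 if both versions agree on every
 * interior cell of an n x n grid, 0 if any cell differs, and -1 if
 * allocation fails. Border cells are skipped because their neighbors
 * would fall outside the grid. */
int scs_match(long n) {
    assert(n >= 3);
    size_t cells = (size_t)n * (size_t)n;
    double *val = malloc(cells * sizeof *val);
    if (!val) {
        return -1;
    }
    /* Small integer values keep the sums exact in double precision. */
    for (size_t k = 0; k < cells; k++) {
        val[k] = (double)(k % 97);
    }
    int same = 1;
    for (long i = 1; i < n - 1; i++) {
        for (long j = 1; j < n - 1; j++) {
            if (test_scs(val, i, j, n) != test_scs_opt(val, i, j, n)) {
                same = 0;
            }
        }
    }
    free(val);
    return same;
}
```

The two versions agree because `(i - 1) * n + j` equals `i * n + j - n`, and likewise for the other three neighbors, so `scs_match(5)` should return 1.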