# [Notes] LICM (Loop Invariant Code Motion) and its implementation in LLVM

## A Simple Lead-in

1 | T *p = /* loop invariant address */ |

1 | a = /* loop invariant */; // hoist |

## Loop

### Dominators

Node \(a\) ** dominates** node \(b\) \((a \operatorname{dom} b)\) in a graph if every path from the entry node to \(b\) goes through \(a\).

Node \(a\) ** immediately dominates** node \(b\) \((a \operatorname{idom} b)\) if \(\nexists c: a \operatorname{dom} c \wedge c \operatorname{dom} b\).

- The (immediately) dominant relationships between nodes form a tree with the entry node as its root.
- The dominator tree of a flow graph can be constructed using an algorithm developed by Lengauer and Tarjan with \(O(|E|\alpha(|E|, |V|))\) complexity

### Back Edges

** back edge**: an edge \((\textit{tail} \rightarrow \textit{head})\) with condition \(\textit{head} \operatorname{dom} \textit{tail}\) satisfied.

### Natural Loops

** the natural loop of a back edge** \((\textit{tail} \rightarrow \textit{head})\): all nodes that can reach \(\textit{tail}\) after removing \(\textit{head}\), including \(\textit{head}\) itself. the \(\textit{head}\) node is its

**single entry-point**.

### Find Loops

A intuitional simple process:

- Construct the dominator tree of the control flow graph
- Find out the back edges
- and the natural loops associated with them

more details:

- Traverse the tree in the depth-first order (Depth-first spanning tree)
- When traverse an edge \(\textit{tail} \rightarrow \textit{head}\) where \(\textit{head}\) is an ancestor of \(\textit{tail}\) in the depth-first spanning tree, check whether \(\textit{head}\) dominates \(\textit{tail}\). If true, the edge is a back edge.
- Delete \(\textit{head}\) from the graph, and find nodes that can reach \(\textit{tail}\) still.

## Loop Invariant

** Loop-Invariant Computation**: a computation which computes the same result each time evaluated. The result of a Loop-Invariant Computation is a Loop Invariant.

### Detecting Loop Invariants

- Basic Cases
- it's a constant
**all**of it's**def**s are outside the loop (read-only in the loop)

- Inductive Cases
- it’s an idempotent computation all of whose args are Loop Invariants
- there is
**exactly one**reaching**def**for it in the loop, and the def's rhs is a Loop Invariant

- Repeat this process, until no more Loop Invariants can be detected.

## Code Motion

- Types of Code Motion
- hoist instructions to the loop preheader (the predecessor of the loop header), if used within loop
- sink instructions to the exit blocks of the loop, if only used after loop
- extra: promote memory accesses to scalars (
`promoteLoopAccessesToScalars`

)

- Conditions
- Aimed at improving performance of the program
- the instruction should not be executed more often after motion
- before moving an instruction, check if the destination block is "colder" than the original block. (
`worthSinkOrHoistInst`

)

- Can not change semantics of the program
- the candidate instruction should dominates
**all**the exits of the loop (otherwise it may be wrongly executed, e.g. the code in a conditional branch) - and should dominates
**all**its users in the entire loop (when hoisting) - the destination of the instruction
**must**dominates**all**its users too

- the candidate instruction should dominates

- Aimed at improving performance of the program
- some more aggressive optimizations

### Algorithm

- depth-first search on the control-flow graph, sink or hoist candidate instructions if all the invariant operations it depends upon have been moved

## Implementation in LLVM

1 | fn LICMPass::run() |

1 | fn llvm::promoteLoopAccessesToScalars() |

1 | fn isLoadInvariantInLoop() |