[Notes] LICM (Loop Invariant Code Motion) and its implementation in LLVM
A Simple Lead-in
1  | T *p = /* loop invariant address */  | 
1  | a = /* loop invariant */; // hoist  | 
Loop
Dominators
Node \(a\) dominates node \(b\) \((a \operatorname{dom} b)\) in a graph if every path from the entry node to \(b\) goes through \(a\).
Node \(a\) immediately dominates node \(b\) \((a \operatorname{idom} b)\) if \(\nexists c: a \operatorname{dom} c \wedge c \operatorname{dom} b\).
- The (immediately) dominant relationships between nodes form a tree with the entry node as its root.
 - The dominator tree of a flow graph can be constructed using an algorithm developed by Lengauer and Tarjan with \(O(|E|\alpha(|E|, |V|))\) complexity
 
Back Edges
back edge: an edge \((\textit{tail} \rightarrow \textit{head})\) with condition \(\textit{head} \operatorname{dom} \textit{tail}\) satisfied.
Natural Loops
the natural loop of a back edge \((\textit{tail} \rightarrow \textit{head})\): all nodes that can reach \(\textit{tail}\) after removing \(\textit{head}\), including \(\textit{head}\) itself. the \(\textit{head}\) node is its single entry-point.
Find Loops
A intuitional simple process:
- Construct the dominator tree of the control flow graph
 - Find out the back edges
 - and the natural loops associated with them
 
more details:
- Traverse the tree in the depth-first order (Depth-first spanning tree)
 - When traverse an edge \(\textit{tail} \rightarrow \textit{head}\) where \(\textit{head}\) is an ancestor of \(\textit{tail}\) in the depth-first spanning tree, check whether \(\textit{head}\) dominates \(\textit{tail}\). If true, the edge is a back edge.
 - Delete \(\textit{head}\) from the graph, and find nodes that can reach \(\textit{tail}\) still.
 
Loop Invariant
Loop-Invariant Computation: a computation which computes the same result each time evaluated. The result of a Loop-Invariant Computation is a Loop Invariant.
Detecting Loop Invariants
- Basic Cases
- it's a constant
 - all of it's defs are outside the loop (read-only in the loop)
 
 - Inductive Cases
- it’s an idempotent computation all of whose args are Loop Invariants
 - there is exactly one reaching def for it in the loop, and the def's rhs is a Loop Invariant
 
 - Repeat this process, until no more Loop Invariants can be detected.
 
Code Motion
- Types of Code Motion
- hoist instructions to the loop preheader (the predecessor of the loop header), if used within loop
 - sink instructions to the exit blocks of the loop, if only used after loop
 - extra: promote memory accesses to scalars (
promoteLoopAccessesToScalars) 
 - Conditions
- Aimed at improving performance of the program
- the instruction should not be executed more often after motion
 - before moving an instruction, check if the destination block is "colder" than the original block. (
worthSinkOrHoistInst) 
 - Can not change semantics of the program
- the candidate instruction should dominates all the exits of the loop (otherwise it may be wrongly executed, e.g. the code in a conditional branch)
 - and should dominates all its users in the entire loop (when hoisting)
 - the destination of the instruction must dominates all its users too
 
 
 - Aimed at improving performance of the program
 - some more aggressive optimizations
 
Algorithm
- depth-first search on the control-flow graph, sink or hoist candidate instructions if all the invariant operations it depends upon have been moved
 
Implementation in LLVM
1  | fn LICMPass::run()  | 
1  | fn llvm::promoteLoopAccessesToScalars()  | 
1  | fn isLoadInvariantInLoop()  |