Author: | Andreas Rumpf |
---|
Term rewriting macros are macros or templates that have not only a name but also a pattern that is searched for after the semantic checking phase of the compiler: This means they provide an easy way to enhance the compilation pipeline with user defined optimizations:
template optMul{`*`(a, 2)}(a: int): int = a+a let x = 3 echo x * 2
The compiler now rewrites x * 2 as x + x. The code inside the curlies is the pattern to match against. The operators *, **, |, ~ have a special meaning in patterns if they are written in infix notation, so to match verbatim against * the ordinary function call syntax needs to be used.
Unfortunately optimizations are hard to get right and even the tiny example is wrong:
template optMul{`*`(a, 2)}(a: int): int = a+a proc f(): int = echo "side effect!" result = 55 echo f() * 2
We cannot duplicate 'a' if it denotes an expression that has a side effect! Fortunately Nimrod supports side effect analysis:
template optMul{`*`(a, 2)}(a: int{noSideEffect}): int = a+a proc f(): int = echo "side effect!" result = 55 echo f() * 2 # not optimized ;-)
So what about 2 * a? We should tell the compiler * is commutative. We cannot really do that however as the following code only swaps arguments blindly:
template mulIsCommutative{`*`(a, b)}(a, b: int): int = b*a
What optimizers really need to do is a canonicalization:
template canonMul{`*`(a, b)}(a: int{lit}, b: int): int = b*a
The int{lit} parameter pattern matches against an expression of type int, but only if it's a literal.
The parameter constraint expression can use the operators | (or), & (and) and ~ (not) and the following predicates:
Predicate | Meaning |
---|---|
atom | The matching node has no children. |
lit | The matching node is a literal like "abc", 12. |
sym | The matching node must be a symbol (a bound identifier). |
ident | The matching node must be an identifier (an unbound identifier). |
call | The matching AST must be a call/apply expression. |
lvalue | The matching AST must be an lvalue. |
sideeffect | The matching AST must have a side effect. |
nosideeffect | The matching AST must have no side effect. |
param | A symbol which is a parameter. |
genericparam | A symbol which is a generic parameter. |
module | A symbol which is a module. |
type | A symbol which is a type. |
var | A symbol which is a variable. |
let | A symbol which is a let variable. |
const | A symbol which is a constant. |
result | The special result variable. |
proc | A symbol which is a proc. |
method | A symbol which is a method. |
iterator | A symbol which is an iterator. |
converter | A symbol which is a converter. |
macro | A symbol which is a macro. |
template | A symbol which is a template. |
field | A symbol which is a field in a tuple or an object. |
enumfield | A symbol which is a field in an enumeration. |
forvar | A for loop variable. |
label | A label (used in block statements). |
nk* | The matching AST must have the specified kind. (Example: nkIfStmt denotes an if statement.) |
alias | States that the marked parameter needs to alias with some other parameter. |
noalias | States that every other parameter must not alias with the marked parameter. |
The alias and noalias predicates refer not only to the matching AST, but also to every other bound parameter; syntactially they need to occur after the ordinary AST predicates:
template ex{a = b + c}(a: int{noalias}, b, c: int) = # this transformation is only valid if 'b' and 'c' do not alias 'a': a = b inc a, b
The operators *, **, |, ~ have a special meaning in patterns if they are written in infix notation.
The | operator if used as infix operator creates an ordered choice:
template t{0|1}(): expr = 3 let a = 1 # outputs 3: echo a
The matching is performed after the compiler performed some optimizations like constant folding, so the following does not work:
template t{0|1}(): expr = 3 # outputs 1: echo 1
The reason is that the compiler already transformed the 1 into "1" for the echo statement. However, a term rewriting macro should not change the semantics anyway. In fact they can be deactived with the --patterns:off command line option or temporarily with the patterns pragma.
A pattern expression can be bound to a pattern parameter via the expr{param} notation:
template t{(0|1|2){x}}(x: expr): expr = x+1 let a = 1 # outputs 2: echo a
The ~ operator is the not operator in patterns:
template t{x = (~x){y} and (~x){z}}(x, y, z: bool): stmt = x = y if x: x = z var a = false b = true c = false a = b and c echo a
The * operator can flatten a nested binary expression like a & b & c to &(a, b, c):
var calls = 0 proc `&&`(s: varargs[string]): string = result = s[0] for i in 1..len(s)-1: result.add s[i] inc calls template optConc{ `&&` * a }(a: string): expr = &&a let space = " " echo "my" && (space & "awe" && "some " ) && "concat" # check that it's been optimized properly: doAssert calls == 1
The second operator of * must be a parameter; it is used to gather all the arguments. The expression "my" && (space & "awe" && "some " ) && "concat" is passed to optConc in a as a special list (of kind nkArgList) which is flattened into a call expression; thus the invocation of optConc produces:
`&&`("my", space & "awe", "some ", "concat")
The ** is much like the * operator, except that it gathers not only all the arguments, but also the matched operators in reverse polish notation:
import macros type TMatrix = object dummy: int proc `*`(a, b: TMatrix): TMatrix = nil proc `+`(a, b: TMatrix): TMatrix = nil proc `-`(a, b: TMatrix): TMatrix = nil proc `$`(a: TMatrix): string = result = $a.dummy proc mat21(): TMatrix = result.dummy = 21 macro optM{ (`+`|`-`|`*`) ** a }(a: TMatrix): expr = echo treeRepr(a) result = newCall(bindSym"mat21") var x, y, z: TMatrix echo x + y * z - x
This passes the expression x + y * z - x to the optM macro as an nnkArgList node containing:
Arglist Sym "x" Sym "y" Sym "z" Sym "*" Sym "+" Sym "x" Sym "-"
(Which is the reverse polish notation of x + y * z - x.)
Parameters in a pattern are type checked in the matching process. If a parameter is of the type varargs it is treated specially and it can match 0 or more arguments in the AST to be matched against:
template optWrite{ write(f, x) ((write|writeln){w})(f, y) }(x, y: varargs[expr], f: TFile, w: expr) = w(f, x, y)Generated: 2014-03-11 21:26:34 UTC