As I hinted in my previous post, I have (very) slowly been working on implementing a simple JIT-based evaluator for a toy language, which I’ve called Nickel. Nickel, which I’ve now released on GitHub, uses the LLVM C library to leverage the power of the LLVM optimising compiler.
For full details, see the repo on GitHub, but I’ll try in this post to give a brief overview. Consider the following Nickel program[1], which computes and prints a numeric value:
# Saved to input.nkl...
def g(x)
9 * x
end
def f(x, y)
(57005 << x) + g(y + 1)
end
puts f(16, 5430)
We can evaluate this program using the nickel evaluator in simple interpreter mode:
$ ./nickel --interpreter < input.nkl
3735928559
However, the value printed will look more familiar in hexadecimal:
$ printf "0x%x\n" $(./nickel --interpreter < input.nkl)
0xdeadbeef
A JIT evaluator performs compilation during execution - essentially generating code to execute at run-time. We can witness the generated code by using a debugger such as lldb to break at the point at which we obtain a pointer to the dynamically generated code:
int (*func)(void) = (int (*)(void))LLVMGetFunctionAddress(engine, "__anon_tl"); //<=
func();
Let’s run lldb and break before we call func:
$ lldb ./nickel -- --jit
[...]
(lldb) breakpoint set --file jit.c --line 284
[...]
(lldb) process launch -i input.nkl
[...]
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x00000001000035f9 nickel`jit(p=0x0000000101a08940) at jit.c:284
281 }
282
283 int (*func)(void) = (int (*)(void))LLVMGetFunctionAddress(engine, "__anon_tl");
-> 284 func();
285
286 LLVMDisposeExecutionEngine(engine);
287 }
[...]
Now, we can ask lldb to disassemble the func function, which will reveal the function’s implementation:
(lldb) disassemble -s func
0x101943030: movabsq $0x101944000, %rdi ; imm = 0x101944000
0x10194303a: movabsq $0x7fff67760710, %rcx ; imm = 0x7FFF67760710
0x101943044: movl $0xdeadbeef, %esi ; imm = 0xDEADBEEF
0x101943049: xorl %eax, %eax
0x10194304b: jmpq *%rcx
0x10194304d: addb %al, (%rax)
And indeed, we can see the literal value 0xdeadbeef, which means our Nickel program has been fully optimised away by LLVM. All that remains is code that loads a single literal value into a register and jumps to an external routine, presumably the one that prints it - very cool!
[1] Nickel syntax is superficially similar to Ruby.