Nickel, an exercise in JIT evaluation

As I hinted in my previous post, I have (very) slowly been working on implementing a simple JIT-based evaluator for a toy language, which I've called Nickel. Nickel, which I've now released on GitHub, uses the LLVM C library to leverage the power of the LLVM optimising compiler.

For full details, see the repo on GitHub, but I'll try in this post to give a brief overview. Consider the following Nickel program [1], which computes and prints a numeric value:

# Saved to input.nkl...

def g(x)
  9 * x
end

def f(x, y)
  (57005 << x) + g(y + 1)
end

puts f(16, 5430)

We can evaluate this program using the nickel evaluator in simple interpreter mode:

$ ./nickel --interpreter < input.nkl
3735928559

However, the value printed will look more familiar in hexadecimal:

$ printf "0x%x\n" $(./nickel --interpreter < input.nkl)
0xdeadbeef
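To see where that value comes from, the same arithmetic can be sketched in plain C. This is only an illustration: it assumes 64-bit unsigned arithmetic (the post does not say which integer width Nickel uses), and print_result is a hypothetical helper, not part of Nickel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors Nickel's g(x): 9 * x. */
uint64_t g(uint64_t x) {
    return 9 * x;
}

/* Mirrors Nickel's f(x, y): (57005 << x) + g(y + 1).
 * 64-bit unsigned arithmetic is an assumption made for this sketch. */
uint64_t f(uint64_t x, uint64_t y) {
    assert(x < 64); /* shifting a 64-bit value by 64 or more is undefined in C */
    return ((uint64_t)57005 << x) + g(y + 1);
}

/* Hypothetical helper: prints f(16, 5430) in decimal, then in hex. */
void print_result(void) {
    uint64_t v = f(16, 5430);
    printf("%llu\n0x%llx\n", (unsigned long long)v, (unsigned long long)v);
}
```

Here 57005 is 0xdead, so shifting it left by 16 gives 0xdead0000, and 9 * 5431 is 48879, which is 0xbeef. Adding the two halves gives 0xdeadbeef.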

A JIT evaluator performs compilation during execution - essentially generating code to execute at run-time. We can inspect the generated code by using a debugger such as lldb to break just after we obtain a pointer to the dynamically generated code:

int (*func)(void) = (int (*)(void))LLVMGetFunctionAddress(engine, "__anon_tl"); //<=
func();

Let's run lldb and break before we call func:

$ lldb ./nickel -- --jit
[...]
(lldb) breakpoint set --file jit.c --line 284
[...]
(lldb) process launch -i input.nkl
[...]
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00000001000035f9 nickel`jit(p=0x0000000101a08940) at jit.c:284
   281      }
   282
   283      int (*func)(void) = (int (*)(void))LLVMGetFunctionAddress(engine, "__anon_tl");
-> 284      func();
   285
   286      LLVMDisposeExecutionEngine(engine);
   287  }
[...]

Now, we can ask lldb to disassemble the code that func points to, which will reveal the generated implementation:

(lldb) disassemble -s func
    0x101943030: movabsq $0x101944000, %rdi        ; imm = 0x101944000
    0x10194303a: movabsq $0x7fff67760710, %rcx     ; imm = 0x7FFF67760710
    0x101943044: movl   $0xdeadbeef, %esi         ; imm = 0xDEADBEEF
    0x101943049: xorl   %eax, %eax
    0x10194304b: jmpq   *%rcx
    0x10194304d: addb   %al, (%rax)

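and indeed, we can see the literal value 0xdeadbeef, which means LLVM has optimised our entire Nickel program down to a handful of instructions: the precomputed constant is loaded into %esi as an argument, and the code jumps straight to the routine whose address is in %rcx (presumably the one that prints it) - very cool!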
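The collapse into a single constant is what you would expect once g has been inlined into f and every operand is known at compile time. As a rough illustration of the constant-folding idea only, here is a minimal sketch in C. The Expr type, fold and fold_example are hypothetical names invented for this example; LLVM does this work on its own IR rather than on a tree like this, and none of this is taken from the Nickel sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical expression node; not taken from the Nickel sources. */
typedef enum { EXPR_CONST, EXPR_ADD, EXPR_MUL, EXPR_SHL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    uint64_t value;          /* used only when kind == EXPR_CONST */
    const struct Expr *lhs;  /* operands for binary nodes */
    const struct Expr *rhs;
} Expr;

/* Recursively evaluates a tree whose leaves are all constants. */
uint64_t fold(const Expr *e) {
    assert(e != NULL);
    switch (e->kind) {
    case EXPR_CONST: return e->value;
    case EXPR_ADD:   return fold(e->lhs) + fold(e->rhs);
    case EXPR_MUL:   return fold(e->lhs) * fold(e->rhs);
    /* Mask the shift amount so the C shift is always well defined. */
    case EXPR_SHL:   return fold(e->lhs) << (fold(e->rhs) & 63);
    }
    assert(0 && "unknown expression kind");
    return 0;
}

/* f(16, 5430) with g inlined: (57005 << 16) + 9 * (5430 + 1). */
uint64_t fold_example(void) {
    const Expr c57005 = { EXPR_CONST, 57005, NULL, NULL };
    const Expr c16    = { EXPR_CONST, 16, NULL, NULL };
    const Expr c9     = { EXPR_CONST, 9, NULL, NULL };
    const Expr c5430  = { EXPR_CONST, 5430, NULL, NULL };
    const Expr c1     = { EXPR_CONST, 1, NULL, NULL };
    const Expr shl    = { EXPR_SHL, 0, &c57005, &c16 };
    const Expr inc    = { EXPR_ADD, 0, &c5430, &c1 };
    const Expr mul    = { EXPR_MUL, 0, &c9, &inc };
    const Expr sum    = { EXPR_ADD, 0, &shl, &mul };
    return fold(&sum);
}
```

Calling fold_example() walks the tree once and yields the same 0xdeadbeef that appears as an immediate in the disassembly, which is why no arithmetic is left to do at run-time.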


  1. Nickel syntax is superficially similar to Ruby.