BrainFuck Optimizing JIT
What is BrainFuck? BrainFuck is an esoteric programming language designed specifically to be easy to compile. The environment provides the programmer with an “infinite” array of bytes (traditionally just 30,000) and a data pointer. There are only 8 single character commands:
-
+
: Increment the current memory cell by 1 (with wrapping overflow) -
-
: Decrement the current memory cell by 1 (with wrapping underflow) -
>
: Shift the data pointer to the next memory cell -
<
: Shift the data pointer to the previous memory cell -
.
: Output the current memory cell as an ASCII character -
,
: Read one ASCII character from stdin -
[
: Jump to the matching]
if the current memory cell is0
-
]
: Jump to the matching[
if the current memory cell is not0
Implementation
Optimization
The lowest hanging fruit here is to perform run-length encoding on the instructions. Sequential +
, -
, >
and <
commands can be combined before they are executed. Internally this is done by compiling to an intermediate language - which is stored as a vector of Instr
s:
pub struct Program {
pub data: Vec<Instr>,
}
pub enum Instr {
Incr(u8),
Decr(u8),
Next(usize),
Prev(usize),
Print,
Read,
BeginLoop(Option<usize>),
EndLoop(Option<usize>),
}
Without any other optimizations performed (unless you count stripping out comments before execution) this alone results in a ~3x speedup when benchmarked against a BrainFuck Mandelbrot set renderer.
What’s next? The more complicated BrainFuck programs are generated from a high level macro language. Decompiling from BrainFuck back to this language could allow me to do more intelligent code execution.
JIT Compiling
While impossible to read BrainFuck code itself, BrainFuck is probably the simplest turing-complete language. This makes it an ideal candidate for exploring JIT compilation.
The first six of our instructions defined in Instr
are pretty straight-forward to implement in x86-64.
+
:
add BYTE PTR [r10],n
Where:
-
r10
is used as the data pointer -
n
is the same value that is held byIncr
in theInstr
enum
-
, >
and <
are equally simple.
Print
and Read
are slightly more complex but don’t require us to do any control flow ourselves.
Where we start to get into trouble is with [
and ]
. To avoid the difficulty of tracking labels and linking them together before execution, all instructions’ x86-64 machine code is padded with nop
s.
while bytes.len() < BF_INSTR_SIZE as usize {
// nop
bytes.push(0x90);
}
This means that the jump targets can be easily found as long as you know the target position (in the Program
data vector), current position, and unpadded size of the current instruction:
let begin_loop_size: i32 = 10; // Bytes
let offset = (*pos as i32 - this_pos as i32) * BF_INSTR_SIZE - begin_loop_size;
let offset_bytes: [u8; mem::size_of::<i32>()] = unsafe { mem::transmute(offset) };
// Check if the current memory cell equals zero.
// cmp BYTE PTR [r10],0x0
bytes.push(0x41);
bytes.push(0x80);
bytes.push(0x3a);
bytes.push(0x00);
// Jump to the end of the loop if equal.
// je offset
bytes.push(0x0f);
bytes.push(0x84);
bytes.push(offset_bytes[0]);
bytes.push(offset_bytes[1]);
bytes.push(offset_bytes[2]);
bytes.push(offset_bytes[3]);
Benchmarks
Ran on mandelbrot.bf
Version | Runtime |
---|---|
Naive Interpreter | 56.824s |
Optimized Interpreter | 19.055s |
Optimized JIT | 5.484s |