OCaml FFI Sharp Edges -- and How to Avoid Them!

Hello again,

Today I’ll be talking about the OCaml FFI, mostly through the lens of the excellent Ctypes library, some troubles I’ve had, and how to deal with them. This post is most likely to be of use to people looking to use the FFI in a safe way.

Context

I have recently been fixing a number of strange bugs in godotcaml, where malloc claims that memory has been corrupted during an OCaml GC cycle. This kind of report is usually due to one of two things: a so-called “double-free” error, or writing to memory you shouldn’t (a buffer overflow or the like). In my case, it turned out to be a bit of both: it was a “use-after-free” error, something you probably wouldn’t expect in a garbage-collected language! This happens when memory is freed back to the allocator, but you end up writing to it again later anyway.

This bug was rather nasty to chase down, and I’ll show you what I did to find it.

The Bug

The problem arose with code that looked like this:

let my_struct_pointer = Ctypes.allocate_n ~count:1 ?finalise:None MyStructure.typ in
let my_int_pointer = Ctypes.allocate_n ~count:1 ?finalise:None Ctypes.int in
my_int_pointer <-@ 3;
my_struct_pointer |-> MyStructure.int_ptr_field <-@ my_int_pointer;
(* Details matter here, but we'll explore a couple of examples *)
...
(* Use the pointer. *)
let ret = a_foreign_call_into_c my_struct_pointer in
ret
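
The snippet never shows MyStructure itself. For anyone who wants to compile it, here is a guess at a minimal definition. The C struct name my_structure and the field name int_ptr are placeholders of mine and are not taken from godotcaml; only the Ctypes calls (structure, field, seal) are real.

```ocaml
open Ctypes

(* Hypothetical layout, assumed for illustration only:
     struct my_structure { int *int_ptr; };                          *)
module MyStructure = struct
  type t

  (* Describe the struct to Ctypes. *)
  let typ : t structure typ = structure "my_structure"

  (* A single field holding a pointer to an int. *)
  let int_ptr_field = field typ "int_ptr" (ptr int)

  (* No more fields may be added after sealing. *)
  let () = seal typ
end

let () =
  (* With one pointer field, the struct should be pointer-sized. *)
  Printf.printf "sizeof my_structure = %d\n" (sizeof MyStructure.typ)
```

With something like this in scope, the allocate_n and |-> lines above should type-check as written.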

The problem is that if the garbage collector runs in the code labeled ..., my_int_pointer will likely be collected, which leaves my_struct_pointer invalid, since the struct still holds the address of the freed int. The garbage collector has no knowledge of the dependence of my_struct_pointer on my_int_pointer; <-@ is just another function, and it essentially does a raw write of the address into the struct's memory. It is still unclear to me whether there is a danger if the garbage collector moves my_int_pointer during compaction but doesn’t actually free it. From my testing and thinking, it would seem that this case is safe when using Ctypes’s allocate and allocate_n functions: the OCaml value which contains the address of my_int_pointer might be moved, but the memory it points to should live off the OCaml heap. Feel free to comment on the discourse if you know the answer to this for sure!
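
The GC's blind spot can be imitated with nothing but the standard library. In the sketch below (an analogy of mine, not Ctypes code), a Weak.t slot plays the role of the struct field: like a raw address written into C memory, it refers to a value without counting as a reference the GC has to honour. The names struct_field, store_int and field_is_valid are hypothetical.

```ocaml
(* Analogy only: a weak slot stands in for the pointer field of the C struct.
   The GC does not treat it as a reason to keep the target alive. *)
let struct_field : int ref Weak.t = Weak.create 1

(* Allocate a heap cell, record it in the "struct", then drop our only
   strong reference to it by returning. *)
let[@inline never] store_int () =
  let cell = ref 3 in
  Weak.set struct_field 0 (Some cell)

(* True while the "struct" still refers to a live cell. *)
let field_is_valid () = Weak.check struct_field 0

let () =
  store_int ();
  Printf.printf "before GC: valid = %b\n" (field_is_valid ());
  Gc.full_major ();
  Printf.printf "after GC:  valid = %b\n" (field_is_valid ())
```

The difference from the real bug is that a weak slot is cleared politely, while a C struct simply keeps the stale address around.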

An Example

The simplest example is this: suppose the above ... is replaced with

print_endline "running GC";
Gc.compact ();

This forces the GC to perform the heaviest type of collection, a compaction. We choose this because we want to reveal as many bugs as possible. If memory can be moved by the GC, for example, a compaction should move it, and so we should see any bugs that arise from assuming that it doesn’t move.

The code then looks like

print_endline "start of function";
let my_struct_pointer = Ctypes.allocate_n ~count:1 ?finalise:None MyStructure.typ in
let my_int_pointer = Ctypes.allocate_n ~count:1 ?finalise:None Ctypes.int in
my_int_pointer <-@ 3;
my_struct_pointer |-> MyStructure.int_ptr_field <-@ my_int_pointer;
print_endline "running GC";
Gc.compact ();
print_endline "creating return";
let ret = a_foreign_call_into_c my_struct_pointer in
ret

Why does this code fail, and only when the GC runs? Because my_int_pointer is not used anywhere in the rest of the sequence expression after its address is stored in the struct:

print_endline "running GC";
Gc.compact ();
print_endline "creating return";
let ret = a_foreign_call_into_c my_struct_pointer in
ret

We can see this in action by setting the finalise values above to non-None values with obvious side effects:

let my_int_finaliser _ = print_endline "finalise the int" in
let my_struct_finaliser _ = print_endline "finalise the struct" in
let my_struct_pointer = Ctypes.allocate_n ~count:1 ~finalise:my_struct_finaliser MyStructure.typ in
let my_int_pointer = Ctypes.allocate_n ~count:1 ~finalise:my_int_finaliser Ctypes.int in
...

If you run code looking like the above, you will see this in the console:

$ dune exec ./run_it.exe
start of function
running GC
finalise the int
(some sort of malloc memory corruption error hopefully (it's technically UB) and a crash)
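
The same experiment can be tried without MyStructure or any C code. This is a minimal sketch that uses Stdlib's Gc.finalise as a stand-in for the ~finalise argument of allocate_n; make_buffer, scenario, say and log are hypothetical names, and a plain Bytes buffer replaces the Ctypes pointer.

```ocaml
(* Messages are also recorded in a list so the order of events can be checked. *)
let log : string list ref = ref []

let say msg =
  log := msg :: !log;
  print_endline msg

(* Stand-in for allocate_n ~finalise: a heap block with a finaliser attached. *)
let[@inline never] make_buffer name =
  let buf = Bytes.create 16 in
  Gc.finalise (fun _ -> say ("finalise the " ^ name)) buf;
  buf

(* The buffer is written once and then never mentioned again, just like
   my_int_pointer after the <-@ assignment. *)
let[@inline never] scenario () =
  let buf = make_buffer "int" in
  Bytes.set buf 0 'x'

let () =
  say "start of function";
  scenario ();
  say "running GC";
  Gc.full_major ();
  say "creating return"
```

If my reading of the Gc documentation is right, the finaliser message should show up between "running GC" and "creating return", mirroring the console output above minus the crash.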

The problem, as said before, is the GC has no knowledge that my_int_pointer is still in use. So let’s use it!

A Naive Solution

I tried this, and it doesn’t work with optimizations enabled.

print_endline "start of function";
let my_int_finaliser _ = print_endline "finalise the int" in
let my_struct_finaliser _ = print_endline "finalise the struct" in
let my_struct_pointer = Ctypes.allocate_n ~count:1 ~finalise:my_struct_finaliser MyStructure.typ in
let my_int_pointer = Ctypes.allocate_n ~count:1 ~finalise:my_int_finaliser Ctypes.int in
my_int_pointer <-@ 3;
my_struct_pointer |-> MyStructure.int_ptr_field <-@ my_int_pointer;
print_endline "running GC";
Gc.compact ();
print_endline "creating return";
let ret = a_foreign_call_into_c my_struct_pointer in
ignore my_int_pointer; (* Look GC, it's being used here! *)
ret

However, if you know much about functional language compilers, you know that the inliner is basically your best friend for optimization. As far as the compiler can tell, it can simply inline ignore = fun _ -> (), apply it to my_int_pointer, replace the call with its result (), and then drop this useless unit. These are all valid program transformations, assuming that there is no hidden dependency in our program that the compiler can’t see (which, as we’ve established, there is). Therefore, the code will actually print the exact same thing as before: after the inliner has done its thing, the rest of the sequence expression again contains no references to my_int_pointer, so it is collected as garbage. The OCaml optimizer is too clever for our own good!

In Haskell there is a nice function that lets you keep alive data that it is applied to. This would locally allow you to keep data alive up to a clearly delineated point. I do not think OCaml has such a function (I hope I’m wrong though, please tell me if so!), and so we must resort to using so-called GC “roots”.

A Working Solution

Here is an actual solution.

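let static_allocate_n ~count ?finalise (typ: 'a typ) : 'a ptr =
    (* Allocate as usual, then register the pointer value as a global GC root. *)
    let original_ptr = Ctypes.allocate_n ~count ?finalise typ in
    let root_ptr = Ctypes.Root.create original_ptr in
    (* Read the pointer back out of the root; the root itself is never released. *)
    Ctypes.Root.get root_ptr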

(Note that we give static_allocate_n a signature so that the pointer it returns is determined by typ, and not by the calling context.) This function:

Creates the pointer we actually want.
Registers it as a root of the GC, meaning it will never be collected.
Returns the contents of that registered root, which should be equal to the original pointer.
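
The three steps can also be sketched one at a time. This assumes the root API lives in the Ctypes.Root module with create, get and release; root_roundtrip is a hypothetical name. Unlike static_allocate_n, the sketch keeps hold of the root, so it can release it at the end.

```ocaml
open Ctypes

let root_roundtrip () : int =
  (* 1. Create the pointer we actually want and store a value through it. *)
  let p = allocate_n ~count:1 int in
  p <-@ 3;
  (* 2. Register the OCaml pointer value as a GC root. *)
  let root = Root.create p in
  (* From here on we only go through the root, never through p. *)
  Gc.compact ();
  (* 3. Read the registered value back; it should be the original pointer. *)
  let (q : int ptr) = Root.get root in
  let v = !@q in
  (* Not possible with static_allocate_n: releasing the root makes the
     pointer collectable again. *)
  Root.release root;
  v

let () = Printf.printf "read back %d\n" (root_roundtrip ())
```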

Replacing allocate_n with static_allocate_n in the definition of my_int_pointer, e.g.

let my_int_pointer = static_allocate_n ~count:1 ~finalise:my_int_finaliser Ctypes.int in
...

and running the program will produce the correct output.

A Theoretically Better Solution

The above code now has a memory leak: root_ptr is not returned from static_allocate_n, so we can never call Root.release on it, and that int lives forever. There are ways around this, but they are rather clunky. What I would really like, instead of using a global root, is to be able to just tell the GC, “hey, don’t collect this variable until at least this point”. Something like the ignore trick above, but that actually works:

print_endline "start of function";
let my_int_finaliser _ = print_endline "finalise the int" in
let my_struct_finaliser _ = print_endline "finalise the struct" in
let my_struct_pointer = Ctypes.allocate_n ~count:1 ~finalise:my_struct_finaliser MyStructure.typ in
let my_int_pointer = Ctypes.allocate_n ~count:1 ~finalise:my_int_finaliser Ctypes.int in
my_int_pointer <-@ 3;
my_struct_pointer |-> MyStructure.int_ptr_field <-@ my_int_pointer;
print_endline "running GC";
Gc.compact ();
print_endline "creating return";
let ret = a_foreign_call_into_c my_struct_pointer in
Ctypes.keep_alive my_int_pointer; (* Hypothetical function. Look GC, it's being used here! *)
ret

Does anyone know if this is possible? If so, let me know on the OCaml discourse! I’m @Fizzixnerd there. If not, that might be a neat way to get my hands dirty in some OCaml compiler magic. I have some ideas for how to do this, either using PPX (bad) or possibly just defining keep_alive to be the same as ignore, but not inlineable. Maybe I’ve completely misunderstood things as well; I would be happy to hear criticism!
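
One candidate for the "ignore, but not inlineable" idea may already exist in the standard library: Sys.opaque_identity (OCaml 4.03 and later), which is documented as being treated like an unknown function for the purposes of optimization. The sketch below is untested against Ctypes and godotcaml; it only checks the idea on a plain Bytes buffer with a Gc.finalise hook. keep_alive and alive_across_gc are hypothetical names, and whether this is guaranteed to keep a value reachable in every compiler configuration is an assumption on my part.

```ocaml
(* Assumption: because Sys.opaque_identity is opaque to the optimizer, this
   call is not simplified away the way a bare [ignore x] can be. *)
let keep_alive (x : 'a) : unit = ignore (Sys.opaque_identity x)

(* Returns whether the buffer was still alive right after a forced GC. *)
let[@inline never] alive_across_gc () : bool =
  let finalised = ref false in
  let buf = Bytes.create 16 in
  Gc.finalise (fun _ -> finalised := true) buf;
  Gc.full_major ();
  let alive = not !finalised in
  (* buf is still mentioned here, so it should stay reachable until now. *)
  keep_alive buf;
  alive

let () = Printf.printf "alive across GC: %b\n" (alive_across_gc ())
```

If this holds up, replacing the hypothetical Ctypes.keep_alive call above with such a keep_alive would be the obvious next thing to try.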

Best,

Matt
