I was working on an OCaml binding for libsrt last summer, to add support for SRT real-time input and output to liquidsoap, and came across the need to access the [sys/socket.h](https://pubs.opengroup.org/onlinepubs/7908799/xns/syssocket.h.html) C API.
I had already decided to use the very elegant [ocaml-ctypes](https://github.com/ocamllabs/ocaml-ctypes) module for the SRT binding, so I went with it and created an [ocaml-sys-socket](https://github.com/toots/ocaml-sys-socket) module using it as well. It was a very interesting experience that I would like to describe here!
ocaml-ctypes
The idea behind OCaml ctypes is to create a binding against a C library without having to write C code, or at least as little as possible. The most straightforward way of using it is via [libffi](https://github.com/libffi/libffi), providing access to dynamically-loaded libraries.
The second way of using it is by letting the module generate the basic C stubs required to build and link against a shared library. This is the mode that we’re going to use here. In this mode, the programmer has to describe the C headers of the library they intend to bind to using dedicated OCaml modules, operators and types. From that description, ocaml-ctypes is able to generate the required glue for the binding.
One advantage of using ocaml-ctypes is that the created bindings make as few assumptions as possible about the OCaml C interfacing API. This is pretty nice, in particular since the OCaml compiler is moving pretty quickly these days (which is awesome!) and also if, perhaps one day, support for multi-core is added to the compiler, which will undoubtedly change the C interface API quite a bit.
dune
[dune](https://github.com/ocaml/dune) (formerly `jbuilder`) is a build system for OCaml projects that has recently risen to great popularity, particularly due to its tight integration with the rest of the OCaml ecosystem, such as [ocamlfind](http://projects.camlcity.org/projects/findlib.html) and [opam](https://opam.ocaml.org/).
My personal motto in programming in general is that “Simple things should be simple, but complex things should be possible”. `dune` certainly does not fit into that category but, rather, makes some complex things extremely easy to set up. It’s the kind of tool that will make your life much easier when what you intend to do fits well within its workflow, but it might not be easy to bend to some very specific niche use. We will see one such case below.
At any rate, it’s been an amazing experience getting to learn how to use `dune`, and the resulting code and build system are remarkably short and elegant, yet very powerful.
socket.h
`socket.h` is the Unix header that describes the C API for various socket operations, covering IP version 4 and 6 as well as Unix file sockets. There is also a Windows API mimicking it, which makes most code using it easily portable to Windows.
Most network-based C libraries refer to `socket.h` to describe the type of socket that can be used with their API, so it’s an important entry point for a lot of network operations and one that would be nice to support as generically as possible in OCaml.
The catch, though, is that, most likely for historical reasons¹, the POSIX specification only partially defines some of the required data structures and types. This makes it possible to write C code using them, but does not give enough information to write C bindings without using the compiler to parse the actual system-specific headers of the host.
For instance, here’s how the `sockaddr` structure is specified:

> The `<sys/socket.h>` header defines the sockaddr structure that includes at least the following members:
>
> - `sa_family_t sa_family`: address family
> - `char sa_data[]`: socket address (variable-length data)

Likewise, here’s what is specified about the size of the `socklen_t` data type:

> `<sys/socket.h>` makes available a type, socklen_t, which is an unsigned opaque integral type of length of at least 32 bits.

Thus, in order to know the exact offset of `sa_family` inside the `sockaddr` structure or the actual size of a `socklen_t` integer, one has to include the OS-specific header and let the compiler parse its definitions for that specific OS. Only then is it possible to compute that offset or data size. Let’s see how it’s done in our binding now!
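To make the layout problem concrete, here is a small, self-contained sketch that is not part of the binding. It lays out the leading fields of two `sockaddr` variants the way a C compiler would for simple scalar fields. The two layouts are my own simplified assumptions (a Linux-style structure starting with a 2-byte `sa_family`, and a BSD/macOS-style one starting with a 1-byte `sa_len` followed by a 1-byte `sa_family`); the names `field`, `align_up` and `offsets` are made up for the example.

```ocaml
(* A field is described by its name and its size in bytes. For the scalar
   fields used here, alignment is assumed to equal size. *)
type field = { name : string; size : int }

(* Round [n] up to the next multiple of [align]. *)
let align_up n align = (n + align - 1) / align * align

(* Compute each field's offset by laying fields out in declaration order. *)
let offsets fields =
  let _, acc =
    List.fold_left
      (fun (pos, acc) f ->
        let off = align_up pos f.size in
        (off + f.size, (f.name, off) :: acc))
      (0, []) fields
  in
  List.rev acc

(* Assumed Linux-style prefix: a 2-byte sa_family comes first. *)
let linux_like = [ { name = "sa_family"; size = 2 } ]

(* Assumed BSD/macOS-style prefix: 1-byte sa_len, then 1-byte sa_family. *)
let bsd_like =
  [ { name = "sa_len"; size = 1 }; { name = "sa_family"; size = 1 } ]

let () =
  List.iter
    (fun (label, layout) ->
      Printf.printf "%s: sa_family at offset %d\n" label
        (List.assoc "sa_family" (offsets layout)))
    [ ("linux-like", linux_like); ("bsd-like", bsd_like) ]
```

The same field name ends up at a different offset and with a different width depending on the system, which is exactly the information a binding cannot hard-code.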
Putting it together
The C binding requires 4 separate passes:
- The [constants](https://github.com/toots/ocaml-sys-socket/tree/master/src/sys-socket/constants) pass, which computes and exports some specific constants and data sizes, computed from the C headers.
- The [types](https://github.com/toots/ocaml-sys-socket/tree/master/src/sys-socket/types) pass, which, given the system-specific constants and sizes exported in the previous phase, defines the actual C data structure bindings.
- The [stubs](https://github.com/toots/ocaml-sys-socket/tree/master/src/sys-socket/stubs) pass, where we define the actual bindings to the C functions that we wish to export in our API.
- Finally, the last pass does a cleanup of the `stubs` pass to export a relevant, OCaml- and `ocaml-ctypes`-specific public API that is to be used by users of the module.

`dune` makes each of these steps fairly easy to integrate into the next one, defining compilation elements and binaries to build before moving to the next pass.
Constants pass
During that pass, we compute and export all required C values defined in the headers. We also add our own constants, which give us the sizes that the POSIX specifications leave up to the OS. Here’s the OCaml code for it:

    module Def (S : Cstubs.Types.TYPE) = struct
    ...

Pretty straightforward! Some of these constants are defined by the POSIX headers and some are custom defined for our needs, for instance `SOCKLEN_T_LEN`. Here’s how they are extracted, using the `dune` build configuration for [gen_constants_c](https://github.com/toots/ocaml-sys-socket/blob/master/src/sys-socket/generator/gen_constants_c.ml):

    let c_headers = "
    ...

This OCaml code makes use of `ocaml-ctypes` to build a binary that exports the OCaml interface defined by `Sys_socket_constants.Def`. Once compiled, its output looks like this:

    include Ctypes
    ...
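The trick of describing constants once and applying that description twice can be sketched without `ocaml-ctypes` at all. The code below is a simplified illustration under my own assumptions, not the library’s implementation: the `TYPE` signature is a cut-down stand-in for `Cstubs.Types.TYPE`, the `Collect` and `Values` modules are hypothetical, and the numeric values in `Values.table` are placeholders for what a compiled C program would print on a given host.

```ocaml
(* Cut-down stand-in for Cstubs.Types.TYPE: just enough to declare
   integer constants by name. *)
module type TYPE = sig
  type 'a typ
  type 'a const
  val int : int typ
  val constant : string -> 'a typ -> 'a const
end

(* The description is written once, against the abstract signature. *)
module Def (S : TYPE) = struct
  let af_inet = S.constant "AF_INET" S.int
  let socklen_t_len = S.constant "SOCKLEN_T_LEN" S.int
end

(* First application: no values yet, only record the requested names. *)
module Collect = struct
  type 'a typ = unit
  type 'a const = unit
  let requested = ref []
  let int = ()
  let constant name () = requested := name :: !requested
end

module Collected = Def (Collect)

(* One C statement per requested constant. Compiled against the real
   headers, such statements would print the system-specific values. *)
let c_statements =
  List.rev_map
    (fun name -> Printf.sprintf "printf(\"%s = %%d\\n\", (int)%s);" name name)
    !Collect.requested

(* Second application: the values are now known (placeholders here). *)
module Values = struct
  type 'a typ = Int : int typ
  type 'a const = 'a
  let int = Int
  let table = [ ("AF_INET", 2); ("SOCKLEN_T_LEN", 4) ]
  let constant : type a. string -> a typ -> a const =
   fun name Int -> List.assoc name table
end

module Constants = Def (Values)

let () =
  List.iter print_endline c_statements;
  Printf.printf "AF_INET = %d, SOCKLEN_T_LEN = %d\n" Constants.af_inet
    Constants.socklen_t_len
```

The point of the design is that `Def` never changes: only the module it is applied to does, first to discover what to ask the C compiler, then to consume the answers.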
The files used to describe how to build this binary using `dune` are located in a separate [generator](https://github.com/toots/ocaml-sys-socket/tree/master/src/sys-socket/generator) directory. Here’s the entry to build this one:

    (executable
    ...

This executable is compiled during the next phase. Let’s move on to it now!
Types pass
During that phase, we use the constants exported during the previous phase to describe the various C structures and types. This is by far the most complex part of the code, making use of first-class modules and several OCaml tricks.
First, let’s look at how we tell `dune` that we need to generate the `.ml` file exporting our required constants from the previous pass:

    (rule
    ...

With only this information, if the code refers to a `Sys_socket_generated_constants` module, `dune` will know that this module needs to be generated and how to do it. We will explain the use of the `exec.sh` wrapper later.
Now that we can make use of the exported constants in our OCaml code, let’s see how we define the `Socklen` module, exporting abstract types and an interface to use `socklen_t` integers:

    module Constants = Sys_socket_constants.Def(Sys_socket_generated_constants)
    ...

As you can see, we make use of first-class modules and the size of the `socklen_t` integer to define the right API for the compiling host. Now let’s see how we define the `sockaddr` interface:

    module type SaFamily = sig
    ...

Here, too, we make use of the size of `sa_family` as exported previously to define the right structure fields.
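The size-driven selection used for `Socklen` and `sa_family` can be illustrated with a minimal, standalone sketch. It is an approximation under stated assumptions rather than the module’s actual code: the `Int_like` signature, the `Int32_repr` and `Int64_repr` modules and `repr_of_size` are hypothetical names, the standard library’s signed `Int32` and `Int64` stand in for proper unsigned integer modules, and `socklen_t_len` is hard-coded where the binding would read a generated constant.

```ocaml
(* Minimal integer interface shared by the candidate representations. *)
module type Int_like = sig
  type t
  val size : int
  val of_int : int -> t
  val to_int : t -> int
end

module Int32_repr : Int_like = struct
  type t = int32
  let size = 4
  let of_int = Int32.of_int
  let to_int = Int32.to_int
end

module Int64_repr : Int_like = struct
  type t = int64
  let size = 8
  let of_int = Int64.of_int
  let to_int = Int64.to_int
end

(* Placeholder value: in a real binding this would come from the
   constants generated for the host. *)
let socklen_t_len = 4

(* Pick the representation whose width matches the reported size. *)
let repr_of_size : int -> (module Int_like) = function
  | 4 -> (module Int32_repr)
  | 8 -> (module Int64_repr)
  | n -> failwith (Printf.sprintf "unsupported integer size: %d" n)

(* Unpack the chosen first-class module; [Socklen.t] stays abstract. *)
module Socklen = (val repr_of_size socklen_t_len : Int_like)

let () =
  let len = Socklen.of_int 16 in
  Printf.printf "socklen_t: %d bytes, value %d\n" Socklen.size
    (Socklen.to_int len)
```

Because `Socklen.t` is abstract, code written against it keeps compiling unchanged whichever width the host ends up selecting.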
Next step, we need to compile this interface again to export the right offsets for the various structures that have been defined. That’s `dune`’s job again!
First, the generator code:

    let c_headers = "
    ...

And the build instructions:

    (executable
    ...

Once compiled, the exported `.ml` looks like this:

    include Ctypes
    ...

As you can see, this exports all the offsets required to access the fields inside a `sockaddr_t` structure. We’re now ready to move to the final stage, which is the actual binding stubs!
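To show why exported offsets and sizes are enough to access a field, here is a small sketch that reads `sa_family` out of a raw byte buffer. Everything in it is illustrative and assumed: the `(offset, size)` pairs are made-up examples of what a generated module could report on two different systems, the buffers are hand-written, and integers are read as little-endian for determinism. `ocaml-ctypes` does this kind of work for you; the sketch only mimics the idea.

```ocaml
(* Read an unsigned little-endian integer of [size] bytes at [offset]. *)
let read_uint ~offset ~size buf =
  match size with
  | 1 -> Bytes.get_uint8 buf offset
  | 2 -> Bytes.get_uint16_le buf offset
  | n -> invalid_arg (Printf.sprintf "unsupported field size: %d" n)

(* Assumed (offset, size) of sa_family on two hypothetical systems. *)
let linux_like = (0, 2)
let bsd_like = (1, 1)

(* Read the address family given a layout description. *)
let family_of (offset, size) buf = read_uint ~offset ~size buf

let () =
  (* The same address family (2), stored under the two layouts. *)
  let linux_buf = Bytes.of_string "\x02\x00\x00\x50" in
  let bsd_buf = Bytes.of_string "\x10\x02\x00\x50" in
  Printf.printf "linux-like: %d, bsd-like: %d\n"
    (family_of linux_like linux_buf)
    (family_of bsd_like bsd_buf)
```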
Binding stubs
First step in this pass, just like with the previous ones, we need to configure `dune` to be able to build the exported `.ml` code from the `types` pass:

    (rule
    ...

And we can now define the proper bindings. Here’s what it looks like:

    open Ctypes
    ...

As you can see, we’re exporting the `getnameinfo` function, taking various arguments, including a pointer to a `sockaddr_t` structure and a couple of `socklen_t` integers, making use of all the various data types and structures previously defined. The exact specifications of this function can be found here. We can now define our top-level API.
Final API
Building upon the previous modules, we export various idiomatic OCaml APIs that the binding user can now use to build new bindings against the `socket.h` APIs.
Just like with the previous steps, first we need to configure the build system:

    (rule
    ...

This time, we need `ocaml-ctypes` to generate two compilation units: a `.ml` file describing the API exported during the `stubs` phase, as well as the C code to glue it with the C APIs. Here’s the code for that generator:

    let c_headers = "
    ...
The exported `.ml` and `.c` files are omitted here for simplicity, but readers can generate them themselves from the [ocaml-sys-socket](https://github.com/toots/ocaml-sys-socket) repository if they are curious about their actual content.
We can now export our top-level API:

    open Ctypes
    ...

    open Ctypes
    ...
That’s it! We now have `ocaml-ctypes`-specific data types and structures that can be used to interface with the host’s native `socket.h` APIs. Note that we also worked on top of the original low-level binding to `getnameinfo` to export a higher-level function more idiomatic to the OCaml language.
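As an illustration of what “more idiomatic” can mean, here is a hedged sketch of wrapping a C-style call that fills caller-provided buffers and returns an error code. It does not reproduce the module’s real API: `getnameinfo_stub` is a fake stand-in that takes a port number instead of a socket address, the buffer sizes are arbitrary, and returning a `result` is just one possible design.

```ocaml
(* Fake stand-in for a low-level stub: it fills the two caller-provided
   buffers with NUL-terminated strings and returns 0 on success, mirroring
   the C calling convention. A negative port simulates a failure. *)
let getnameinfo_stub (port : int) host hostlen serv servlen =
  let fill buf len s =
    let n = min (String.length s) (len - 1) in
    Bytes.blit_string s 0 buf 0 n;
    Bytes.set buf n '\000'
  in
  if port < 0 then 1
  else begin
    fill host hostlen "localhost";
    fill serv servlen (string_of_int port);
    0
  end

(* Extract the C string stored at the start of [buf]. *)
let c_string buf =
  let len = try Bytes.index buf '\000' with Not_found -> Bytes.length buf in
  Bytes.sub_string buf 0 len

(* Wrapper: buffers and error codes stay hidden behind a result value. *)
let getnameinfo port =
  let hostlen = 1024 and servlen = 64 in
  let host = Bytes.make hostlen '\000' in
  let serv = Bytes.make servlen '\000' in
  match getnameinfo_stub port host hostlen serv servlen with
  | 0 -> Ok (c_string host, c_string serv)
  | code -> Error code

let () =
  match getnameinfo 80 with
  | Ok (host, serv) -> Printf.printf "host=%s service=%s\n" host serv
  | Error code -> Printf.printf "lookup failed with code %d\n" code
```

The caller never sees a buffer length or a raw return code, which is the kind of cleanup the top-level API layer is there for.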
Lagniappe: cross-compilation to Windows
On Windows platforms, `liquidsoap` is compiled using [ocaml-cross-windows](https://github.com/ocaml-cross/opam-cross-windows) and, since Windows does have compatible socket APIs, we wanted to also look at cross-compiling for the Windows target, which is where we hit a snag with the current `dune` support.
The problem is that, at each intermediate step, in the case of a cross-compilation, the compiled binaries need to use the target OS’s headers and not the host’s headers, otherwise we end up using offsets specific to e.g. Debian but for a Windows binary.
In this case, this means that the compiled `.exe` binaries need to be Windows binaries and that we need to execute them as native Windows binaries, using [wine](https://www.winehq.org/).
`dune` has truly amazing support for cross-compiling, which we do not cover here but, unfortunately, its primitives for building and executing binaries do not yet cover this use case. Thus we had to trick it into compiling things the way we wanted, which is why we are using the `exec.sh` wrapper. Its code lives in the [ocaml-sys-socket](https://github.com/toots/ocaml-sys-socket) repository.
Now, you can go back to the previous `dune` files and see how this wrapper makes it possible to execute binaries according to the system that the corresponding `ocamlopt` compiler has been configured to build for.
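The dispatch such a wrapper has to perform can be sketched in a few lines. This is my own approximation written in OCaml, not the content of `exec.sh`: the function name `command_for` is hypothetical, and I am assuming that the target is identified by the compiler’s `system` configuration value, with `mingw` and `mingw64` denoting Windows targets.

```ocaml
(* Build the command line used to run a freshly compiled generator.
   [system] is assumed to be the target system reported by the compiler
   configuration, e.g. "linux" or "mingw64". *)
let command_for ~system argv =
  match system with
  | "mingw" | "mingw64" -> "wine" :: argv (* Windows binary: run via wine *)
  | _ -> argv (* native binary: run directly *)

let () =
  List.iter
    (fun system ->
      Printf.printf "%s: %s\n" system
        (String.concat " " (command_for ~system [ "./gen_constants_c.exe" ])))
    [ "linux"; "mingw64" ]
```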
Conclusion
It’s been a fun time working on this binding! It’s amazing to see the level of detail that can be expressed through `ocaml-ctypes` using its provided primitives. Ultimately, the binding is very clean and elegant, with very few low-level assumptions.
Likewise, the simplicity and power of the `dune` build system make this very fluid to build. Without it, each of the steps described above would have been much more painful to execute and compile.
[1]: My bet is that, at the time the POSIX specifications were being written, there were already several inconsistent `socket.h` headers out in the wild among the various historical UNIX flavors.