Let's start Scheme

2015-05-20

Introduction of Portable Foreign Function Interface (pffi) library

I think it has kinda fixed for API wise, so let me introduce Portable Foreign Function Interface (pffi)

The library is written mostly R6RS portable. Of course, it's impossible to write FFI without implementations' support so it just provides some of greatest common interfaces. One of the purpose is that if you want to write bindings written in C for R6RS Scheme, you always need to refer FFI APIs per implementations. This is pain in the ass. Most of the R6RS implementations provide the way to write compatible layer, so if there is an abstraction layer then users can write portable code without the pain. (don't ask me, how many people want to write portable code.) Currently the following implementations are supported:

  • Sagittarius (0.6.4)
  • Mosh (0.2.7, only SRFI Mosh, so called NMosh)
  • Vicare (0.3d7)
  • Racket (6.1.1, plt-r6rs)
  • Guile (2.0.11)
This is more than 50% of R6RS implementations which support FFI, so I think I'm allowed to call it portable.

How to use


Suppose you have the following useful function written in C.
int plus(int a, int b)
{
  return a+b;
}
Very useful isn't it? Now it's compiled to libtest.so. Then the Scheme code which uses this C function would look like this:
#!r6rs
(import (rnrs) (pffi))

;; load C library
(define lib (open-shared-object "libtest.so"))

;; load C function
(define plus (foreign-procedure lib int plus (int int)))

;; loaded C function can be called as if it's Scheme procedure.
(plus 1 2) ;; => 3
This code works for all implementations listed above. Simple, easy, isn't it?

Callback


If a C function requires function pointer such as quicksort, then you want to write it in Scheme. Suppose our quicksort has the following signature:
int quicksort(void *base, const size_t num, const size_t size,
              int (*compare)(const void *, const void *));
The first one is the array of elements you don't know the size, the second one is the number of the elements, the third one is the size of an element, then the last one is comparison function. To use this in Scheme, then you can write like this:
;; suppose this is in libtest.so
(define qsort
  (foreign-procedure lib int quicksort
     (pointer unsigned-long unsigned-long (callback int (pointer pointer)))))

(let ((callback (c-callback int
                            ((pointer a) (pointer b))
                            (lambda (a b)
                              (let ((ia (pointer-ref-c-uint8 a 0))
                                    (ib (pointer-ref-c-uint8 b 0)))
                                (- ia ib)))))
      (bv (u8-list->bytevector '(9 1 2 8 3 7 4 6 5))))
  (qsort (bytevector->pointer bv) (bytevector-length bv) 1 callback)
  (free-c-callback callback)
  (bytevector->u8-list bv));; => (1 2 3 4 5 6 7 8 9)
It's slightly uglier than the plus since you need to specify the callback procedure's types twice (if you want to write portable thing, sometimes this kind of things are inevitable). Some implementations doesn't free callback either, so you need to release it explicitly with free-c-callback procedure (of course, if the callback is created globally, then you don't have to since it's bound forever).

Global variables


I'm not sure if you want to handle this kind of code, but sometimes it's inevitable. Suppose your C library exports the following variable:
int externed_variable = 10;
Now, you need to get this value and modify this value from Scheme world. You can do like this:
(define-foreign-variable lib int externed_variable)
;; can also be written to specify the binding name
;; the name is converted to Scheme way if the macro is only given 3 arguments.
(define-foreign-variable lib int externed_variable global-variable)

externed-variable ;; => 10
global-variable   ;; => 10

;; this overwrites the value of C world
(set! global-variable (* global-variable 10)) ;; unspecified
global-variable   ;; => 100

;; so if the address is shared, then it's got affected.
externed-variable ;; => 100
I hope there is no actual use case for this but it's always good to have some workaround.

C Structure


Most of C functions requires pointer(s) of structure. C's structure is mere chunk of memory so you can simply pass a bytevector and get its value calculating offset/padding of the structure. But this is kinda pain in the ass. This library provides define-foreign-structure macro which is pretty much similar with define-record-type and calculates padding/offset for you.

Suppose you have the following C structure and its operations:
struct st1
{
  int count;
  void *elements;
};

struct st2
{
  struct st1 p;
  short attr;
};

static int values[] = {0,1,2,3,4,5,6,7,8,9};

#define CONST 1L
#define INT_ARRAY 1L<<1;

void fill_struct(struct st2 *s)
{
  s->p.count = sizeof(values)/sizeof(values[0]);
  s->p.elements = (void *)values;
  s->attr = CONST | INT_ARRAY;
}
To use this, you can write like this:
(let ()
  (define-foreign-struct st
    (fields (int count)
            (pointer elements)))
  ;; either way is fine
  (define-foreign-struct st2
    (fields (st p)
            (short attr)))
  ;; if the first member of struct is a struct
  ;; then you can also use 'parent' clause
  (define-foreign-struct st2*
    (fields (short attr))
    (parent st))

  (let ((st  (make-st2 (make-st 0 (integer->pointer 0)) 0))
        (st* (make-st2* 0 (integer->pointer 0) 0))
        (fillter (foreign-procedure lib void fill_struct (pointer))))
    ;; created struct is mere bytevector
    (fillter (bytevector->pointer st))
    (fillter (bytevector->pointer st*))
    ;; accessors are just accessing calculated offset
    (st-count st) ;; => 10
    ;; so this is also fine
    (st-count (st2-p st)) ;; => 10
    (st2-attr st) ;; => 3
    ;; this can also work. NB different accessor
    (st2-attr st*) ;; => 3
    (st2-p st) ;; => bytevector
    ))
parent might look weird since there is no hierarchy mechanism on C struct. But in practice, you see a lot of code which uses this kind of technique to emulate sub class thing. So I thought it might be useful. If you don't want to use it, you don't have to. You can also specify the struct member.

You might not want initial value for structures, then you can also use protocol like this:
(let ()
  (define-foreign-struct st
    (fields (int count)
            (pointer elements))
    (protocol
     (lambda (p)
       (lambda (size)
         (p size (integer->pointer 0))))))
  ;; either way is fine
  (define-foreign-struct st2
    (fields (short attr))
    (parent st)
    ;; child must have protocol if the parent has it
    (protocol
     (lambda (p)
       (lambda ()
         ((p 0) 0)))))

  (let ((st  (make-st2))
        (fillter (foreign-procedure lib void fill_struct (pointer))))
    (fillter (bytevector->pointer st))
    ;; accessors are just accessing calculated offset
    (st-count st) ;; => 10
    (st2-attr st) ;; => 3
    ))
The protocol is just the same as define-record-type's one. So the same rules are applied.

The define-foreign-struct also creates size-of-struct-name variable which contains the size of the structure. So you don't have to allocate memory to know the size.

What can't do


  • Currently, there is no way to pass address of pointers.
  • Union is not supported.
  • Bit field is not supported.
Probably many more, but I think it can still cover basic usages.

As usual, your pull requests / feedbacks are always welcome.

No comments:

Post a Comment