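Source code written in Scheme also differs quite a bit from implementation to implementation. For example, Mosh, Vicare and Sagittarius are basically organised as libraries, so once you find the code you can call the library directly. Larceny is slightly different: procedures and macros that live outside a library have to be referred to as (primitives foo ...). This is common to van Tonder's macro expander, so NMosh uses this form as well.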



find . -name '*.c' | xargs grep 'keyword'








Introduction of Portable Foreign Function Interface (pffi) library

I think the API is more or less fixed now, so let me introduce the Portable Foreign Function Interface (pffi) library.

The library is written mostly in portable R6RS. Of course, it's impossible to write an FFI without support from the implementation, so it just provides the greatest common subset of the interfaces. One motivation is this: if you want to write bindings to a C library for R6RS Scheme, you always have to consult the FFI API of each implementation. This is a pain in the ass. Most R6RS implementations provide a way to write a compatibility layer, so if there is an abstraction layer, users can write portable code without the pain. (Don't ask me how many people want to write portable code.) Currently the following implementations are supported:

  • Sagittarius (0.6.4)
  • Mosh (0.2.7, only SRFI Mosh, so called NMosh)
  • Vicare (0.3d7)
  • Racket (6.1.1, plt-r6rs)
  • Guile (2.0.11)
This is more than 50% of R6RS implementations which support FFI, so I think I'm allowed to call it portable.

How to use

Suppose you have the following useful function written in C.
int plus(int a, int b)
{
  return a+b;
}
Very useful, isn't it? Suppose it has been compiled into a shared object. Then the Scheme code which uses this C function would look like this:
(import (rnrs) (pffi))

;; load C library
(define lib (open-shared-object ""))

;; load C function
(define plus (foreign-procedure lib int plus (int int)))

;; loaded C function can be called as if it's Scheme procedure.
(plus 1 2) ;; => 3
This code works for all implementations listed above. Simple, easy, isn't it?


If a C function takes a function pointer, as quicksort does, you will want to write that callback in Scheme. Suppose our quicksort has the following signature:
int quicksort(void *base, const size_t num, const size_t size,
              int (*compare)(const void *, const void *));
The first argument is an array of elements whose type the function doesn't know, the second is the number of elements, the third is the size of one element, and the last is the comparison function. To use this from Scheme, you can write something like this:
;; suppose this is in
(define qsort
  (foreign-procedure lib int quicksort
     (pointer unsigned-long unsigned-long (callback int (pointer pointer)))))

(let ((callback (c-callback int
                            ((pointer a) (pointer b))
                            (lambda (a b)
                              (let ((ia (pointer-ref-c-uint8 a 0))
                                    (ib (pointer-ref-c-uint8 b 0)))
                                (- ia ib)))))
      (bv (u8-list->bytevector '(9 1 2 8 3 7 4 6 5))))
  (qsort (bytevector->pointer bv) (bytevector-length bv) 1 callback)
  (free-c-callback callback)
  (bytevector->u8-list bv));; => (1 2 3 4 5 6 7 8 9)
It's slightly uglier than plus since you need to specify the callback procedure's types twice (if you want to write portable code, this kind of thing is sometimes inevitable). Some implementations don't free callbacks automatically either, so you need to release them explicitly with the free-c-callback procedure (of course, if the callback is created globally, you don't have to, since it's bound forever).
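
For reference, the C side of this example could look like the sketch below. The post only gives the signature, so the body is an assumption: it simply delegates to the standard qsort and returns 0 on success. compare_u8 is a hypothetical C counterpart of the Scheme callback above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Same signature as in the post. The body is an assumption:
   it delegates to the standard qsort and reports success as 0. */
int quicksort(void *base, const size_t num, const size_t size,
              int (*compare)(const void *, const void *))
{
  if (base == NULL || compare == NULL) return -1;
  qsort(base, num, size, compare);
  return 0;
}

/* Hypothetical C counterpart of the Scheme callback:
   read one unsigned byte from each pointer and subtract. */
int compare_u8(const void *a, const void *b)
{
  const unsigned char ia = *(const unsigned char *)a;
  const unsigned char ib = *(const unsigned char *)b;
  return (int)ia - (int)ib;
}
```

Calling quicksort on the nine bytes 9 1 2 8 3 7 4 6 5 with an element size of 1 and compare_u8 sorts them in place, which is what the Scheme version does through its callback.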

Global variables

I'm not sure if you want to handle this kind of code, but sometimes it's inevitable. Suppose your C library exports the following variable:
int externed_variable = 10;
Now you need to read and modify this value from the Scheme world. You can do it like this:
(define-foreign-variable lib int externed_variable)
;; can also be written to specify the binding name
;; the name is converted to Scheme way if the macro is only given 3 arguments.
(define-foreign-variable lib int externed_variable global-variable)

externed-variable ;; => 10
global-variable   ;; => 10

;; this overwrites the value of C world
(set! global-variable (* global-variable 10)) ;; unspecified
global-variable   ;; => 100

;; both bindings share the same address, so this one is affected too.
externed-variable ;; => 100
I hope there is no actual use case for this but it's always good to have some workaround.
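
The reason both bindings change together is that each one reads and writes through the address of the same C symbol. A minimal C sketch of that idea follows. The two pointer variables and the helper names are hypothetical stand-ins for what an FFI binding holds; this is not pffi's actual implementation.

```c
/* The exported variable from the post. */
int externed_variable = 10;

/* Hypothetical stand-ins for the two Scheme bindings: each one
   just remembers the address of the same C symbol. */
static int *binding_externed_variable = &externed_variable;
static int *binding_global_variable = &externed_variable;

/* Reading a binding dereferences the stored address. */
int read_binding(const int *binding)
{
  return *binding;
}

/* set! on a binding writes through the stored address. */
void write_binding(int *binding, int value)
{
  *binding = value;
}

/* Mirrors the Scheme session: multiply by 10 through one binding,
   then observe the change through the other one. */
int demo_shared_address(void)
{
  write_binding(binding_global_variable,
                read_binding(binding_global_variable) * 10);
  return read_binding(binding_externed_variable);
}
```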

C Structure

Most C functions require pointer(s) to structures. A C structure is a mere chunk of memory, so you can simply pass a bytevector and get its values by calculating the offset/padding of each member yourself. But this is kind of a pain in the ass. This library provides the define-foreign-struct macro, which is pretty similar to define-record-type and calculates padding/offset for you.

Suppose you have the following C structure and its operations:
struct st1
{
  int count;
  void *elements;
};

struct st2
{
  struct st1 p;
  short attr;
};

static int values[] = {0,1,2,3,4,5,6,7,8,9};

#define CONST 1L
#define INT_ARRAY (1L<<1)

void fill_struct(struct st2 *s)
{
  s->p.count = sizeof(values)/sizeof(values[0]);
  s->p.elements = (void *)values;
  s->attr = CONST | INT_ARRAY;
}
To use this from Scheme, you can write the following:
(let ()
  (define-foreign-struct st
    (fields (int count)
            (pointer elements)))
  ;; either way is fine
  (define-foreign-struct st2
    (fields (st p)
            (short attr)))
  ;; if the first member of struct is a struct
  ;; then you can also use 'parent' clause
  (define-foreign-struct st2*
    (fields (short attr))
    (parent st))

  (let ((st  (make-st2 (make-st 0 (integer->pointer 0)) 0))
        (st* (make-st2* 0 (integer->pointer 0) 0))
        (fillter (foreign-procedure lib void fill_struct (pointer))))
    ;; created struct is mere bytevector
    (fillter (bytevector->pointer st))
    (fillter (bytevector->pointer st*))
    ;; accessors are just accessing calculated offset
    (st-count st) ;; => 10
    ;; so this is also fine
    (st-count (st2-p st)) ;; => 10
    (st2-attr st) ;; => 3
    ;; this can also work. NB different accessor
    (st2-attr st*) ;; => 3
    (st2-p st))) ;; => bytevector
The parent clause might look weird since C structs have no hierarchy mechanism. But in practice, you see a lot of code which uses this kind of technique to emulate subclassing, so I thought it might be useful. If you don't want to use it, you don't have to; you can also specify the struct as an ordinary member.
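
The C technique that the parent clause mirrors can be sketched as follows. This is only an illustration: the function names are made up here, and it relies on the C rule that a pointer to a struct can be converted to a pointer to its first member.

```c
#include <stddef.h>

struct st1 {
  int count;
  void *elements;
};

/* st2 "inherits" from st1 by embedding it as the first member. */
struct st2 {
  struct st1 p;
  short attr;
};

/* A function written against the "base" type only. */
int st1_count(const struct st1 *s)
{
  return s->count;
}

/* Because p is the first member, a pointer to st2 can be
   converted to a pointer to st1 and handed to base functions. */
int st2_count(const struct st2 *s)
{
  return st1_count((const struct st1 *)s);
}
```

This is why a struct declared with a parent and one declared with the parent as its first field end up with the same layout.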

You might not want to pass initial values for structures. In that case you can use a protocol like this:
(let ()
  (define-foreign-struct st
    (fields (int count)
            (pointer elements))
    (protocol
     (lambda (p)
       (lambda (size)
         (p size (integer->pointer 0))))))
  ;; either way is fine
  (define-foreign-struct st2
    (fields (short attr))
    (parent st)
    ;; child must have protocol if the parent has it
    (protocol
     (lambda (p)
       (lambda ()
         ((p 0) 0)))))

  (let ((st  (make-st2))
        (fillter (foreign-procedure lib void fill_struct (pointer))))
    (fillter (bytevector->pointer st))
    ;; accessors are just accessing calculated offset
    (st-count st) ;; => 10
    (st2-attr st))) ;; => 3
The protocol is just the same as define-record-type's, so the same rules apply.

define-foreign-struct also creates a size-of-struct-name variable which contains the size of the structure, so you don't have to allocate memory to know the size.
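
To get a feel for what the macro has to compute, the C sketch below reads fields out of a raw byte buffer by offset, which is roughly what an accessor over a bytevector does. The functions size_of_st2, read_attr and read_count are hypothetical names, and the exact offsets and sizes depend on the platform ABI, so nothing is hard-coded.

```c
#include <stddef.h>
#include <string.h>

struct st1 {
  int count;
  void *elements;
};

struct st2 {
  struct st1 p;
  short attr;
};

/* What a size-of-st2 variable would hold: the full size, padding included. */
size_t size_of_st2(void)
{
  return sizeof(struct st2);
}

/* Hypothetical accessor over raw bytes: copy the field out of the
   buffer at the offset the compiler chose for it. */
short read_attr(const unsigned char *bytes)
{
  short attr;
  memcpy(&attr, bytes + offsetof(struct st2, attr), sizeof attr);
  return attr;
}

/* Same idea for a nested field: offset of p plus offset of count. */
int read_count(const unsigned char *bytes)
{
  int count;
  memcpy(&count,
         bytes + offsetof(struct st2, p) + offsetof(struct st1, count),
         sizeof count);
  return count;
}
```

Getting these offsets right by hand on every platform is exactly the chore the macro is meant to remove.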

What it can't do

  • Currently, there is no way to pass address of pointers.
  • Union is not supported.
  • Bit field is not supported.
Probably many more, but I think it can still cover basic usages.

As usual, your pull requests and feedback are always welcome.


FFI library comparison for R6RS implementations

I'm writing the Portable Foreign Function Interface library for R6RS and have found some interesting things to share. Currently the library supports the following implementations, and this article is mainly about them:
  • Sagittarius (0.6.4)
  • Vicare (0.3d7)
  • Mosh (0.2.7)
  • Racket (plt-r6rs 6.1.1)
  • Guile (2.0.11)

API wise

Most of the implementations provide the same basic procedures under different names. Only Mosh, Sagittarius and Vicare provide similar APIs. (Well, Sagittarius at least took some of its API names from Mosh; I don't know about Vicare.) Unfortunately, Mosh has far fewer APIs and seems incomplete. For example, Mosh doesn't have any API to convert bytevectors to pointers, nor APIs to set an unsigned char/short/int/long through a pointer. This might be critical in some cases. Vicare and Sagittarius have enough APIs to do almost everything.

Racket has an interesting API. Almost every foreign value needs to have a type. For example, it distinguishes char * from void *, whilst the three above don't really.

Guile has only a limited set of pointer operations. It seems users are expected to convert pointers to bytevectors all the time, which I think is pretty inconvenient. It also doesn't accept a bytevector directly where a pointer is expected, so the bytevector needs to be converted with the bytevector->pointer procedure.

Documentation wise

Racket provides excellent documentation, so I could write the library without referring to its source code.

Guile provides good documentation, but it's not so user friendly, which means I needed some trial and error to figure out how things work (in particular, the documentation doesn't give the library name, and it took me some time to figure it out).

Mosh has OK documentation, though I also referred to the source code (because of the lack of APIs, I hoped there were hidden ones).

Sagittarius also has OK documentation. However, the documentation for some APIs is missing, so it would be hard to write this library without referring to the source code.

Vicare has poor documentation for FFI (unlike its other documentation). It only has a shallow introduction. This basically means there is a chance that the APIs will change. To write the library, I needed to refer to the source code.

Other implementations

There are 4 more R6RS implementations which support FFI: Chez, Ypsilon, IronScheme and Larceny. These are the reasons why I didn't support them in the first place:

Chez: I can only use Petite Chez Scheme, which doesn't support FFI. So to support it I would need the commercial one, which is not available anymore.

Ypsilon: The released version of Ypsilon has very limited APIs to create foreign functions/variables, and no documentation. The trunk version has far more APIs but is neither maintained nor released (I think it's sad but I can't help it). I just gave up.

IronScheme: .NET ... well, I'm simply not familiar enough with it.

Larceny: I might support it, and it even has good documentation. But there is no proper install script, so it's kinda hard to run/test.


I don't mean to say which APIs are the best or worst. I just figured out that if I want to write a portable library, it is better to have rather low level APIs exposed. And if low level APIs are there, then most of the concepts can be shared even if they look completely different. (In this sense, Ypsilon's APIs are too high level to handle, unfortunately.)



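My sister asked me over WhatsApp about things like the working environment in the Netherlands, and I thought "this could be blog material", so I decided to write it up. I'm not well versed in the legal side, and this is only my own experience plus what I've heard, so please assume the accuracy is low. I use assertive wording, but please mentally append "I think" or "it should be" to each statement.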



















Breaking privacy

R6RS has a record system which is much more flexible than the one R7RS provides. You can consider the R7RS one a subset of the R6RS one. One of the differences between R6RS records and R7RS records is that R6RS provides an inspection layer. Using it gives us a sort of peeping Tom ability.


Suppose you want a record type which is only used inside the library that defines it. Later on, you notice that it might be convenient if users could pass such a record in, so you decide to export its constructor.
(library (private-record)
    (export make-private2)
    (import (rnrs))

  (define-record-type private
    (fields field1))
  (define-record-type private2
    (fields field2)
    (parent private))

  ;; you may want to do some
  ;; private operation here
  )
So far so good. It seems no one can access the record fields except the library itself. So you can have some privacy here.

Is that really so?

Actually not. You would still have peeping Tom. Let's see how we can do it.
(import (rnrs) (private-record))

(define private2 (make-private2 1 2))
(define private2-rtd (record-rtd private2))

;; predicate
(define private? (record-predicate (record-type-parent private2-rtd)))
(define private2? (record-predicate private2-rtd))

;; get field accessors
(define (find-accessor rtd field)
  (let loop ((rtd rtd))
    (if rtd
        (let ((fields (record-type-field-names rtd)))
          (let lp ((i 0))
            (cond ((= i (vector-length fields)) (loop (record-type-parent rtd)))
                  ((eq? (vector-ref fields i) field)
                   (record-accessor rtd i))
                  (else (lp (+ i 1))))))
        (lambda (o) (error 'find-accessor "no such field" field)))))

;; parent field
(define private2-field1 (find-accessor private2-rtd 'field1))
;; record field
(define private2-field2 (find-accessor private2-rtd 'field2))

(display (private2? private2)) (newline)
(display (private? private2)) (newline)

(display (private2-field1 private2)) (newline)
(display (private2-field2 private2)) (newline)
Even though the library only exports the constructor, we can still access the fields, including the parent's, and check whether an object is an instance of a particular record type.

The inspection layer is really convenient in some cases. Suppose you want to write a destructuring matcher which can also handle records. The essence of this kind of macro would be like the following:
(define-syntax destructuring-record
  ;; actually we don't need to put 'record' keyword at all
  ;; to show only this...
  (syntax-rules (record)
    ((_ (record r field ...) body ...)
     (let ((tmp r))
       (when (record? tmp)
         (let* ((rtd (record-rtd tmp))
                ;; definition of find-accessor is above
                (field ((find-accessor rtd 'field) tmp)) ...)
           body ...))))))

;; use it
(destructuring-record (record private2 field2) (display field2) (newline))
If you know the field names of records, then you can write without calling predefined accessors. (Though, this would cost a bit of performance since it creates closures each time.)

How to prevent it?

If you are handling something you really don't want to expose, such as a password (I won't argue about holding passwords here), you just need to specify (opaque #t) in the record definition.


R7RS or SRFI-9 define-record-type can be implemented on top of the R6RS one. Since the R7RS one doesn't specify record inspection, there is no way to prevent this unless the wrapper specifies (opaque #t). One such implementation, made by Derick Eddington, allows records to be inspected. Should this behaviour be the case for the R7RS one?





;; version 1
(define-syntax test-values
  (syntax-rules ()
    ((_ (expected ...) expr)
     (test-values 'expr (expected ...) expr))
    ((_ name (expected ...) expr)
     (test-equal name '(expected ...) (let-values ((results expr)) results)))))


;; version 2
(define-syntax test-values
  (syntax-rules ()
    ((_ "tmp" name (e e* ...) (expected ...) (var ...) expr)
     (test-values "tmp" name (e* ...) (expected ... e) (var ... t) expr))
    ((_ "tmp" name () (expected ...) (var ...) expr)
     (let-values (((var ...) expr))
       (test-equal '(name expected) 'expected var) ...))
    ((_ (expected ...) expr)
     (test-values expr (expected ...) expr))
    ((_ name (expected ...) expr)
     (test-values "tmp" name (expected ...) () () expr))))


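I use this macro while writing SASM, but because of x86's lack of consistency, there are cases where more than one result can legitimately be returned (e.g. ADD). If the results get swapped every time I touch something, I have to rewrite the tests each time. That would defeat the purpose of writing tests, so I made the macro handle such cases as well.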
;; version 3
(define-syntax test-values
  (syntax-rules (or)
    ((_ "tmp" name (e e* ...) (expected ...) (var ...) (var2 ... ) expr)
     (test-values "tmp" name (e* ...) (expected ... e)
                  (var ... t) (var2 ... t2) expr))
    ((_ "tmp" name () (expected ...) (var ...) (var2 ...) expr)
     (let ((var #f) ...)
       (test-assert 'expr
                    (let-values (((var2 ...) expr))
                      (set! var var2) ...
                      ;; pass once expr has returned the right number of values
                      #t))
       (test-values "equal" name (expected ...) (var ...))))
    ;; compare
    ((_ "equal" name () ()) (values))
    ((_ "equal" name ((or e ...) e* ...) (v1 v* ...))
     (begin
       (test-assert '(name (or e ...)) (member v1 '(e ...)))
       (test-values "equal" name (e* ...) (v* ...))))
    ((_ "equal" name (e e* ...) (v1 v* ...))
     (begin
       (test-equal '(name e) e v1)
       (test-values "equal" name (e* ...) (v* ...))))
    ((_ (expected ...) expr)
     (test-values expr (expected ...) expr))
    ((_ name (expected ...) expr)
     (test-values "tmp" name (expected ...) () () () expr))))


;; version 4
(define-syntax test-values
  (syntax-rules (or ?)
    ((_ "tmp" name (e e* ...) (expected ...) (var ...) (var2 ... ) expr)
     (test-values "tmp" name (e* ...) (expected ... e)
                  (var ... t) (var2 ... t2) expr))
    ((_ "tmp" name () (expected ...) (var ...) (var2 ...) expr)
     (let ((var #f) ...)
       (test-assert 'expr
                    (let-values (((var2 ...) expr))
                      (set! var var2) ...
                      ;; pass once expr has returned the right number of values
                      #t))
       (test-values "equal" name (expected ...) (var ...))))
    ;; compare
    ((_ "equal" name () ()) (values))
    ((_ "equal" name ((? pred) e* ...) (v1 v* ...))
     (begin
       (test-assert '(name pred) (pred v1))
       (test-values "equal" name (e* ...) (v* ...))))
    ((_ "equal" name ((or e ...) e* ...) (v1 v* ...))
     (begin
       (test-assert '(name (or e ...)) (member v1 '(e ...)))
       (test-values "equal" name (e* ...) (v* ...))))
    ((_ "equal" name (e e* ...) (v1 v* ...))
     (begin
       (test-equal '(name e) e v1)
       (test-values "equal" name (e* ...) (v* ...))))
    ((_ (expected ...) expr)
     (test-values expr (expected ...) expr))
    ((_ name (expected ...) expr)
     (test-values "tmp" name (expected ...) () () () expr))))








(import (rnrs))

(define (convert-port input file transcoder)
  (call-with-port (open-file-output-port file (file-options no-fail) 
                                         (buffer-mode block) transcoder)
    (lambda (out)
      (put-string out (get-string-all input)))))

(call-with-input-file "utf-8.txt"
  (lambda (in)
    (convert-port in "utf-16.txt" 
     (make-transcoder (utf-16-codec) (eol-style crlf)
                      (error-handling-mode raise)))))
