時の羅針盤＠blog

2016-09-28

C translator (2)

Starting 0.7.8, Sagittarius provides experimental C translator. Currently it's more or less toy quality (though, just using is not a problem). There are 2 way of translations:

Dump all required library cache
Doing the same as precompiled files

Both solutions have problems. The first one requires huge amount of static area and generated files are huge. The second one can't translate macros.

Dumping cache file is not a good solution so I want to discard it. To do so, second solution must support macro translation. The challenge of doing it is that macro contains shared structures and may contain subrs (Scheme procedures defined in C). Translating shared structure to C may not be so difficult but subrs. Subrs themselves don't have any information where it's defined, so they only have their name and C pointer. As far as I know, there's no subr in macro so this may not be a problem as long as users (me mostly) don't do some black magic (and I know I can...).

One of the benefits (or maybe this is the only) of translating C is that considerably short amount of starting time. Sagittarius is, unfortunately, not so fast implementation, and its starting time is not so fast even when it loads only cached libraries. For example, one of my daily routine scripts takes 500ms to just showing help message (8000ms if it's not cached) and translated version takes 130ms. It's not that much difference (yet) but still it's almost 5 times faster than raw Scheme.

However, I think it's still slow or doing some unnecessary things. Consider the following script:

(import (rnrs))

(define (main args)
  (display args) (newline))

If I run this script, then it loads 32 libraries currently. But the only bindings required in this script are display and newline. So if the C translator is smart enough to detect them, then the result script should look like this:

;; imaginary script... 
(define (main args)
  (#{(core)#display} args) (#{(core)#newline}))

Then it doesn't require any unnecessary library loading.

Hmm, I think I have a lot to do.

2016-09-19

Scheme Workshop 2016 奈良

行ってきた。朝一で奈良入りでもいけなくはなかった気がしないでもないが、せっかくだしということで奈良に２泊３日してきた。奈良の観光の話でもいいけど、流石にどこも回れなかったのでワークショップの話。

前日の夜にAlexからスケジュールの変更のメールが届く。発表者二人が飛行機に乗り遅れたので、その二人を最後に回すため。結局間に合わずWillとKathyによりカラオケスライドメソッド(後述)が発動した。

ホテルから徒歩１５分という距離なのに９時１５分のオープニングに間に合わず。油断しすぎでした。

【招待講演１】
プログラムの正当性を検証するプログラムにLispを使ったという話(だと思う)。スケジュールが移動して自分の発表が一発目になったのでそれに意識が行き過ぎてたから話が頭に入らなかった…これ聞きながら、今年のはこんなにアカデミックなのかぁとものすごく緊張した記憶しかない。

【自分の】
そのうちスライド挙げます。単にライブラリの紹介。

【Nash: a tracing JIT for Extension Language】
GuileにトレーシングJITを実装したという話。プログラムで使用される時間の大部分はタイトなループによって発生しているということに注目したJITらしい。本家にマージされるかは不明らしい。されてほしいような、されてほしくないような(Sagittariusが選ばれる可能性が下がるという意味で)。

【Ghosts in the machine】
現在のプログラミング環境はコンピュータ黎明期と比較して劣っている、ということをClojureのREPLを例にして示す話。時間内に終わらなかった発表一つ目。休憩中に発表者に質問してどこを目指しているのか質問して聞いたりしていた。視点としては面白いし、一意見として終わらせるには惜しい気がするけど、何から始めるといいんだろう？となるくらいには壮大なビジョンだった。

【R7RS update】
AlexによるR7RSの近況。まだ見ぬSRFI-142とSRFI-143が並んでいたのでArthurで止まっているのか、Johnがまだ提出してないのかは謎。Red Docketは終わったらしいのだが、正式な発表はまだされてない気がするけど、c.l.sで見逃したかな？

【招待講演2】
Guixという比較的新しいパッケージマネージャの話。インストール履歴をリビジョンで管理しているのでうっかり何かを壊しても動くリビジョンに戻せるというのが他のパッケージマネージャとの大きな違いかな。後はGuileで書かれているのでパッケージの定義もS式(Guileで動くプログラム)というのも大きな特徴かな。

【Function compose, Type cut, And the Algebra of logic】
発表者のコミュニケーション能力があまり高くなかったので何が何だかさっぱり分からなかった。ただ、発表資料自体は面白いことが書いてあったので、論文を後で読むことにする。

【Multi-purpose web framework design based on websocket over HTTP Gateway】
SagittariusにWebsocketを入れたこともあり一番気になってた発表なんだけど、今一発表が雲を掴む様な感じでよく分からなかった。また、コミュニケーションの問題も多少あり、納得のいく答えも得られず。後で論文を読む。

ここからカラオケスライドメソッドが発動する。ちなみに、カラオケスライドとは自分が作ったスライドではないものを発表するもと思えばよい。命名はWill。日本のカラオケが流れてくる文字を歌うことに引っ掛けた上手い名前だと個人的には思った。

【miniAdapton: A Minimal Implementation of Incremental Computation in Scheme】
AdaptonというMemoriseのテクニックに一つをSchemeで実装した話。入力ツリーの一部が変更されても共通部分は再利用しているような感じだった。リポジトリも公開されてるし、ソース見た方が早いかな。しかし、代理の発表(論文の共著者であるが) なのにえらくうまいことやるなぁと前回見た時も思ったなぁ。あれくらいうまくしゃべれるようになりたい(なんか違う)。

【Deriving Pure, Functional One-Pass Operations for Processing Tail-Aligned Lists】
リストの共通サフィックス部分を調べるスクリプトをエレガントに書いたという話。ベンチマークがないのでこれが実際にどこまで効くのかは微妙。ナイーブな実装、エレガントな実装ともにO(N)なので、コンスタントの部分だけなのだが、正直論文の実装だとヘルパー手続きの作成にかかるコストの方が高い気がしないでもない。まぁ、コードは論文に載っているので手元でベンチマークとってJasonにメール投げてみればいい話な気がする。

どうでもいいのだが、僕が喋る英語はオランダ語訛りらしい(Arthur談)。どうも子音の発音がネイティブに比べて強いのだそうだ。普通母語に引っ張られるが、第三言語(第二が英語)に引っ張られるのは珍しいんじゃないの？という話をしていた。確かに寡聞にして聞かない話ではある。

2016-08-30

ログ

そういえばSagittariusにはずっとログを吐き出すためのライブラリがないということをふと思い出した。附属させてるライブラリがログを吐き出すのはさすがにどうかと思うのであんまり考えてなかったのだが、Paellaみたいなのが何のログも吐かないというのはいろいろ面倒だなぁと思ってはいた。

あまりいいデザインというのも思い浮かばないんだけど、なんとなくこんな感じのロガーがあればいいかなぁと思い30分くらいで作ってみた。こんな感じで使える。

(import (srfi :18) (util logging) (util file))

(define (print-log logger)
  (trace-log logger "trace")
  (debug-log logger "debug")
  (info-log logger "info")
  (warn-log logger "warn")
  (error-log logger "error")
  (fatal-log logger "fatal")
  (terminate-logger! logger))

(print-log (make-logger +info-level+ (make-appender "[~l] ~w4 ~m")))
(print)
(print-log (make-async-logger +debug-level+ 
         (make-appender "[~l] ~w4 ~m")
         (make-file-appender "[~l] ~w4 ~m" "log.log")))

(print (file->string "log.log"))
#|
[info] 2016-08-30T16:02:29+0200 info
[warn] 2016-08-30T16:02:29+0200 warn
[error] 2016-08-30T16:02:29+0200 error
[fatal] 2016-08-30T16:02:29+0200 fatal

[debug] 2016-08-30T16:02:29+0200 debug
[info] 2016-08-30T16:02:29+0200 info
[warn] 2016-08-30T16:02:29+0200 warn
[error] 2016-08-30T16:02:29+0200 error
[fatal] 2016-08-30T16:02:29+0200 fatal
[debug] 2016-08-30T16:02:29+0200 debug
[info] 2016-08-30T16:02:29+0200 info
[warn] 2016-08-30T16:02:29+0200 warn
[error] 2016-08-30T16:02:29+0200 error
[fatal] 2016-08-30T16:02:29+0200 fatal
|#

ロガーがログレベルと同期をコントロールして、アペンダーが実際にログを吐き出す。どこかでみたことあるようなモデルではある。まぁ、本職はその言語を使うのでなんとなく似通ったのだろう。make-appenderでつくられるアペンダーは標準出力（正確にはcurrent-output-portが返す値）に吐き出す。アペンダーの第一引数はログのフォーマット。現状手の込んだ出力はできないが、当面はこれでも問題ないだろう。ちなみに、~wの後ろに続く4はSRFI−19で定義されているフォーマットの一つ。現在は一文字しかみないが、そのうちなんとかするかもしれない。

make-async-loggerはスレッドを一つ消費する代わりにログの書き出しをバックグラウンドでやってくれる。アペンダーが複数あったり、処理が重い（メール送るとか）等あるときに役に立つと思う。

とりあえずこれを適当に使ってるスクリプト等に組み込んで使用感を確かめていくことにしよう。

2016-08-29

Ephemeron and reference barrier

Recently, there's the post on SRFI-124 ML about reference barrier. The SRFI doesn't require reference-barrier procedure due to the non-trivial work to implement. Now, there's a post which shows how to implement portably, like this:

(define last-reference 0)
(define (reference-barrier x)
  (let ((y last-reference))
    (set! last-reference x)
    y))

It seems ok, but is it? The SRFI says like this:

This procedure ensures that the garbage collector does not break an ephemeron containing an unreferenced key before a certain point in a program. The program can invoke a reference barrier on the key by calling this procedure, which guarantees that even if the program does not use the key, it will be considered strongly reachable until after reference-barrier returns.

Due to the lack of knowledge and imagination, I can't imagine how it should work precisely on like this situation:

(let loop ()
  (when (some-condition)
    (reference-barrier key1)
    (reference-barrier key2)
    (do-something-with-ephemerons)
    (loop)))

Should key1 and key2 be guaranteed not to be garbage collected or only the last one (key2 in this case)? If the first case should be applied, then the proposed implementation doesn't work since it can only hold one reference. To me, it's rather rational to hold multiple reference since it's not always only one key needs to be preserved. However if you extend to hold multiple values, then how do you release the references without calling explicit release procedure? If the references would not be released, then ephemerons would also not release its keys. Thus calling reference-barrier causes eternal preservation.

Suppose this assumption is correct, then reference-barrier should work like alloca with automated push so that the pushed reference is on the location where garbage collector can see them as live objects. In this sense, allocated references are released automatically once caller of reference-barrier is returned and multiple references can be pushed. Though, it's indeed non-trivial task to implement. I hope there's no errata which says reference-barrier is *not* optional, otherwise it would be either a huge challenge or dropping the support...

2016-08-24

暗黙の総称関数 (部分解決編)

昨日の続き

総称関数が暗黙的に大域定義されるという話なのだが、昨日のアイデアを元に直してみた。現在のHEADでは以下のようにしても大域には定義されない。

(import (rnrs) (clos user))

(let ()
  (define-generic foo)
  foo)
foo ;; -> &undefined

define-genericは今まで無条件に大域に束縛を挿入していたが、現在はトップレベルで定義された際のみとしている。これは別に難しいことではなかったので割愛(実現するのに他の不具合を直す必要があったが)。

問題はdefine-methodである。現状ではほぼうまく動くのだが、以下のような使い方をすると動かない。

(import (rnrs) (clos user))

(let ((foo 'foo))
  (define-method foo (o) o)
  (print (foo 1)))
;; -> error

define-methodが局所定義された際には局所定義された総称関数をまず探し、なければ大域定義されたものを探すという手順を取っている。そして両方とも見つからなかったらdefine-genericを挿入する。ちなみにここで挿入されたものは局所定義になるので、大域には定義されない。上記の例がうまく動かないのは、メソッド名と局所変数名が同一だから。マクロ展開は当然コンパイル時に行われるので、同名の変数を探すことはできるが、その変数が何を指しているかというのは、指しているものがリテラルである場合を除き、実行時まで分からない。この使い方をすることはそうないだろうと踏んで、現状ではこれで妥協することにした。何か妙案が浮かんだ時にまた直すかもしれない。(どうでもいいが、投げられるエラーが&assertionなのはおかしい気がしてきた。少なくとも&errorじゃないと…)

これとは直接は関係ないのだが、変数名の重複をチェックするようにした。以下のようなものが動なる。

(let ((a 1) (a 2)) a) ;; -> error
(let (("a" 1)) 1)     ;; -> error

もともと単にサボっていただけなのだが、総称関数が以下のように定義された際に混乱を招きそうだったので：

(let ()
  (define foo)
  (define-generic foo)
  (define-method foo (o) o)
  (foo 1))
;; ???

そもそもエラーなので何が起きても問題ないんだけど、変な混乱を招くよりはコンパイラに怒られた方がいろいろ楽な気がしたというのが本音。

完璧な解決ではないが、メソッドの局所定義とか(個人的には)やらないだろうし、9割くらいの混乱を未然に防げるんじゃないかなぁ。

2016-08-23

Implicit generic function creation

define-method would create a generic function implicitly if there's none defined yet. It's convenient and I've written libraries (probably only one) depending on this behaviour. (c.f. (binary data))

Convenience would usually be a trade-off of consistency, at least in my case. For example, this is a long standing bug (though, I've just issued):

(import (clos user))

(let ()
  (define-generic foo))
foo
;; -> should be &undefined

This is because define-method would create a generic function during macro expansion, and it would be an unexpected result in this case if it didn't:

(begin
  (define-generic foo)
  (define-method foo (o) #t))

In this case, define-method should not make implicit generic function, but macro expansions are done in the same compilation time as define-generic. Thus, it's impossible to know if it should create or not. To let define-method know it shouldn't, define-generic also inserts binding into current environment (a library) during macro expansion.

Can't I do better? Creating global binding during macro expansion is rather ugly, but I don't I can get rid of it (or maybe the future?). Yet, I think I can avoid to create unwanted one like above example. It's still just an idea but if define-generic and define-method can see if they are used in a scope, then it seems there's a way. Since Sagittarius has current-usage-env and current-macro-env procedures, it is possible to access compile time environment during macro expansion. Thus, it should even be able to detect whether or not define-method should create an implicit generic function or not.

This should work, let's see.

2016-08-22

Cトランスレータ

ふとスタンドアロンのバイナリができるといいかなぁと思ってこんなものを書いてみた(使うには0.7.8のHEADが必要)。以下のように使う：

$ ./scheme2c -o out.c foo.scm
$ gcc `sagittarius-config -L -I -l` -O2 out.c

中身はほぼなんでもいいんだけど、引数を受け取るにはSRFI-22のmainがないとたぶんうまいこと使えないはず(command-line手続きで引数が取れない)。ちなみに出来上がりCファイルは超巨大になる可能性がある。参考までに、このスクリプト自信をCに変換したら45MBのファイルになった(64bit環境)。ファイルが巨大になるのはこのスクリプトが何をやっているかを見ればすぐにわかるのだが、端的に言うとキャッシュファイルを引っ張ってきてCのバイト列にしてるから。

なんでこんなものを書いたかというと、特に理由はないんだけど、バイナリ一個で動くといいかなぁと思ったから。今のところランタイムとして最低でも DLL か .so (OSXなら .dylib) がいるがこの辺はそのうち気が向いたら何とかする予定。

キャッシュをそのままダンプしているので、性能的なメリットは一切ない。強いて言えば多少ファイルアクセスが減るくらい。今のところsagittarius-configは Windows 版にはつけてないので VC でバイナリを作りたい場合は多少の工夫が必要。ちなみに、インストール時にパスが決まるからインストーラでやらないといけないというのがついてない理由。

作っておけば気が向いたときに改善されていくのではないかというメソッドなので、しばらく実用にはならない可能性が高い。

2016-08-11

不要レジスタの除去とSchemeコード

VMにレジスタを追加するとスレッドセーフなパラメタと同様な感じで使えるということがあり、SagittariusではVMにいくつかマクロ展開器用のレジスタを持たせていた。あまりいい手ではないし、無駄にVMのサイズを増やすことになるのでいつかは何とかせねばと思いつつもかなりの期間放置していた。っが、最近いい方法を思いついたのでえいや！と除去。VMのサイズが５ワード減って752バイト(64ビット環境)になった。焼け石に水もいいところである。

【やったこと】
基本的には非常に簡単で、VMのレジスタをScheme側でパラメータ化してそれらを使っていたCの実装を全てScheme側に移動させただけ。Scheme側といってもプレビルドされているものなので、本体サイズが小さくなるということは残念ながらない。単にVMのサイズが多少減るのと、自己満足度が多少上がっただけである。

言うは易し行うは、キャッシュのせいで、多少難かった。除去したVMレジスタはマクロ展開器ようにしぶしぶ追加したもの。こいつのせいでマクロのキャッシュが無駄に複雑になっていた(今でも不必要に複雑だが)。SchemeのパラメータをCから呼び出すということは可能な限り避けたいと思ったので、マクロ展開器に関するコードをごそっとScheme側に移動させる必要があった。そうすると、Cで作られた展開器と密接につながっていたキャッシュの読み出しと書き出しが壊れる。キャッシュ機構自体が無駄に複雑なので面倒なデバッグに突入。後は気合でって感じだった。

これによるパフォーマンスの低下があるかなぁと思ったけど、顕著に表れるようなものはなかったのでよし。

【Schemeコードの混在】
VMのレジスタにはリーダーマクロやR7RSのincludeのためにあるものもあった(取っ払ってやった、更に2ワード減ったぜ)。こいつらはコンパイラが依存していたり、load手続きが使用していたりするので、単純にプリコンパイルされたSchemeに放り込むことができなかった。気合でパラメータ関連をCで書いて何とかするという手もなくはなかったんだけど、年を取ると楽な方に逃げたくなる。ということで、スタブファイル内にSchemeコードが書けるようにしてみた。

Schemeコードのプリコンパイル自体はすでにあるのだから、ちょっと手を入れればいけるだろうと思って手を入れてみたら動いたという感じ。残念ながらCで書かれた手続きからSchemeの手続きを呼ぶことはできないし、SchemeからCの関数を直接呼び出すこともできない。それでも、混在可能というのは非常に楽である。依存関係があるので純粋にSchemeで書くより多少気は使うものの(ちなみにマクロは使えない)、Cでごにょごにょやるより遥かに楽である。これを使って(sagittarius)ライブラリ内に簡易パラメータを作り、load周りのVMレジスタをそいつで実装。それに伴って関連するCのコードをごそっと消してやった。すっきり。

流石にこれはパフォーマンスに影響がでて、例えば(rfc http)のように大量に他のライブラリ(数えたら131個あった)に依存するものをキャッシュなしで読み込むと大体10～15％程度パフォーマンスが落ちた。Cでは手続き内でやっていた処理がScheme側に移動したことで手続き呼び出しに変わったこと等によるオーバーヘッドだろうなぁとは思っているものの、まぁしょうがないか。キャッシュになってさえいればほぼ変わらないので(キャッシュはCなので当たり前だが)、初回起動のみが遅いと思えばそこまで気にすることでもないかなぁ。そもそもコンパイルとマクロ展開が遅いでそっちを先に何とかしないとという話である。(text sql)のコンパイルとか環境によっては10秒かかるし…2000行以上にわたるマクロを10秒で展開すると思えばそこまで悪くないともいえるのか？っがコンパイル待ってる間にコーヒー飲み終わる勢いなのは流石になぁ…

この調子でVMの不要もしくはあってほしくないレジスタを削除していきたいところである。

2016-08-03

WebSocketクライアント

連投の二つ目

なんとなくやる気とか刺激とか取り戻すためにWebsocketのライブラリを書いてみた。WebSocketはRFC6455で定義されているプロトコル(詳細はRFC読むべし)で、実装もそんな大変そうでもないので書いてみた。

書いてる途中でSaitoAtsushiさんの実装の存在を思い出してライブラリの名前を確認したら、(rfc websocket)とまるかぶりだった。Pegasus用のformulaまで書いていただいているのにこのままぶつけるのもなぁと思い確認してみたところ

@tk_riple 実際のところ、すでに入れている人がどれだけいるのかなぁという気もするんですよね。本家でやるなら私自身も本家を使うようになる (私の実装はもうメンテされず、相対的に価値が暴落する) ので、そういう「普通の名前」は本家の方で取って欲しいという気持ちがあります。
— 齊藤敦志 (@SaitoAtsushi) 2 August 2016

とのことだったので、遠慮なくぶつけさせていただいた。他の名前の候補としては、(rfc :6455 websocket)とか(rfc websockets)（Cのlibwebsocketsに倣って)とかあったけど、「普通の名前」（RFCの名前）を使うことにした。

簡単な使い方。

(import (rnrs) (rfc websocket))

;; Creates WebSocket object
(define websocket (make-websocket "wss://echo.websocket.org"))

;; Sets event handlers
(websocket-on-open websocket
  (lambda (ws) (display 'CONNECTED) (newline)))
(websocket-on-text-message websocket
  (lambda (ws text) (display text) (newline)))

;; Connects to the server
(websocket-open websocket)

;; Sends a message
(websocket-send websocket "Hello")

;; Close it
(websocket-close websocket)

ユーザーレベルAPIは基本的にWebSocketオブジェクトを返すので、以下のようにも書ける。

(websocket-close 
 (websocket-send 
  (websocket-open 
   (websocket-on-text-message
    (websocket-on-open (make-websocket "wss://echo.websocket.org")
     (lambda (ws) (display 'CONNECTED) (newline)))
    (lambda (ws text) (display text) (newline))))
  "Hello"))

どっちがいいかは好みだろうけど（流石に下のはあまり使わないか？）。ドラフトのまま絶賛放置中（期限切れてるから破棄されてるの？）のWebSocket over HTTP/2にも頑張れば対応できるようにはしてある（プラグイン書くだけ）。

すでにあるライブラリにぶつけにいったということもあり、例外とかかなり頑張って作っている。こんなに例外階層作って、しかも例外ハンドリングをまじめにやったのってたぶん初めてじゃないかなぁ。ライブラリ自体の構成は割とスタンダードで、ユーザーレベルAPI、中間レベルAPI、低レベルAPIという感じになっている。低レベルAPIはサーバー書くときに便利に使え（現状ではサーバーをどうするかあんまり考えていない）、中間レベルは何かしらプログラム的にやるのに、ユーザーレベルはJavaScriptのWebsocketみたいな感じで使える。

作って２日なので、作りこみが足りない部分はあるかもしれないが、簡単なチャットサーバーとクライアントみたいなのは作れたので紹介してみた。

例外ハンドリング

連投の一つ目。

一つ前の記事でguardの例外の再送出をwith-exception-handlerで受けると無限ループに陥る問題を解決した話。結論を先に書くと、raise、raise-continuable及びwith-exception-handlerをSchemeで実装して、継続の境界を作らないようにした。

ぼ～っと考えて実装したら動いちゃった系の解決方法で、特に苦労とかなかったんだけど、実装前に気になっていた点が以下：

Cでwith-exception-handlerを使っているか

使ってなかった。なんでこれC側にあるんだろう状態だった。

C側で例外投げたらどうなるの

この記事の肝、気になったら続きを読んで。

一つ目は特筆することもなく。Cで書くと複雑怪奇になるんだけど（なってた）、使ってないしSchemeに移動させてコードがかなりすっきりした。VMのサイズも１ワード減っていい感じだと思われる。(core)ライブラリだけでは使えなくなったけど、これに依存するコードはないと思うのでまぁ、問題ないだろう。

二つ目はC側のコールスタック。継続の境界はSg_Apply系の関数で作られるんだけど、コールスタックの深いところで例外が投げられたらどうする？ってのをぼ～っと考えていた。はい、longjmpを使うだけでした。

具体的にはこんな感じで例外が投げられたとする。

Call stack
  +----------+  +---------+  +---------+  +---------+
--| Sg_Apply |--| C func1 |--| C func2 |--| C func3 |
  +----------+  +---------+  +---------+  +---------+
                                            ^^^^^^^
                                             C error

こんな感じでCのraiseは呼ばれた時点でVMのスタックにSchemeのraiseを呼び出す継続フレームを入れる。

Before
   PC = CALL
   Stack
   +--------------------------+
   | Return to previous frame |
   +--------------------------+
                :
After
   PC = RET
   Stack
   +--------------------------+
   | Return to calling raise  | --> PC to return = CALL raise
   +--------------------------+
   | Return To previous frame |
   +--------------------------+
                :

with-exception-handlerとraise-continuableの組み合わせだと、呼び出し元に戻る必要があるので一つ前のフレームを飛ばすわけにはいかない。この状態にしておいて、longjmpを呼び出し、VMのループを再起動する。もともとVM自体はVMループへのjmpbufを持っているのでそれを使うだけ。割とお手軽に解決できてしまった。

このバグのおかげで別のバグをつぶすこともできたし、YpsilonとMoshのバグを発見することもできたのでいいバグ（？）だった。

2016-07-29

Exception handling in C world

The following piece of code goes into infinite loop on Sagittarius.

(import (rnrs))
(with-exception-handler
 (lambda (k) #t)
 (lambda ()
   (guard (e (#f #f))
     (error 'test "msg"))))

I couldn't figure it out why this never returned at first glance. And it turned out to be very interesting problem.

The problem is related to continuation boundary and exception handlers. If you print the k, then only the very first time you'd see error object with "msg" after that it'd be "attempt to return from C continuation boundary.". But why?

This happens when you capture a continuation and invoke it in the different C level apply. Each time C level apply is called, then VM put a boundary mark on its stack to synchronise the C level stack and VM stack. This is needed because there's no way to restore C level stack when the function is returned. When a continuation is captured on the C apply which is already returned, then invocation of this continuation would cause unexpected result. To avoid this, the VM checks if the continuation is captured on the same C level apply.

Still, why this combination causes this? The answer is raise is implemented in C world and exception handlers are invoked by C level apply. The whole step of the infinite loop is like this:

guard without else clause captures a continuation.
When error is called, then guard re-raise and invoke the captured continuation. (required by R6RS/R7RS)
Exception handler is invoked by C level apply.
VM check the continuation boundary and raises an error on the same dynamic environment as #3
Goes to #3. Hurray!

There are 2 things I can do:

Create a specific condition and when with-exception-handler received it, then it wouldn't restore the exception handler. [AdHoc]
Let raise use the Scheme level apply. (Big task)

#1 probably wouldn't work properly. #2 is really a huge task. I need to think about it.

2016-07-12

リモートREPLとライブラリ依存関係

最近ちょっとしたツールを作るのにPaella(に付属しているPlato)を使っているのだが、リモートREPLの問題とリロードの問題が面倒だなぁと思ってきたのでちょっとメモ。

リモートREPLの問題

これは非常に簡単で#!read-macro=...のようなのが送れない。問題も分かっていて、#!から始まるリーダーマクロは値を返さないで次の値を読みにいく。マクロはリーダー内で解決される。スクリプトや通常のREPLなら特に問題ないんだけど、それ自体を送りたい場合にはあまり嬉しくない。解決方法はぱっと思いつくだけで以下：

リモートREPL用のリーダーを作る

面倒

リモートREPL用ポートを作って、リードマクロを読ませる

アドホックだけど、悪くない気がする

一つ目のは既存のリーダーマクロ機構をどうするんだ？という話になるので、最後の手段にしたい(作れば確実に動くのは分かっているので)。二つ目は何とかいけそうな気がしないでもないんだけど、ソースポート(この場合は標準入力ポート)まで影響が出るかが疑問(出ない気がしてる)。そもそもソースポートにまで影響はでる必要はない気もするなぁ。試してみるか。

ライブラリ依存関係

Platoを使う最大の理由はREPLでリロードができるからなんだけど、ハンドラ(と呼んでいるライブラリ)が依存しているライブラリを変更した際に上手いことその変更を反映する方法がない。ハンドラであればリロードが可能なのだが、依存ライブラリだとロードパスの問題とかも出てきて嬉しくない。手動でloadを呼ぶとか、コード自体をリモートREPLに貼り付けるとか(上記の問題が出ることもあるが)方法がないこともないけど今一面倒。
依存関係の子に当たる部分を親が解決できればなんとかなりそうなんだけど、現状の仕組みでは子が親を探すことができても親は子を知る術がない。突き詰めていくとincludeされたファイルの変更は追跡できないとかあるし。
あるとよさそうなものとして、

ライブラリファイルの変更を検知したらリロードする機構
依存関係の親が子を知る手段

かなぁ。一つ目はそういえばファイルシステムの監視機構を入れたからやろうと思えばやれなくもないのか(自動監視だと任意のタイミングでやれないから微妙かな)。二つ目はC側に手を入れてやればやれるけど、必要な場面がかなり限定されているので単なるオーバーヘッドにしかならない気がするなぁ(メモリも喰うだろうし)。こっちはかなり考えないとだめっぽい。いいアイデア募集。

2016-07-08

Syntax parameters (aka SRFI-139)

Marc Nieper-Wißkirchen submitted an interesting SRFI. The SRFI was based on the paper 'Keeping it Clean with Syntax Parameters'. The paper mentioned some corner case of breaking hygiene with datum->syntax, which I faced before (if I knew this paper at that moment!).

In 'Implementation' section, the SRFI mentions that this is implemented on 'Rapid Scheme', Guile and Racket. Unfortunately there's no portable implementation. In my very prejudiced perspective, Guile isn't so fascinated by macro, so the syntax parameters is probably implemented on top of existing macro expander (I haven't check since Guile is released under GPL, and if I see the code it may violates the license). If so, it might be able to be implemented on syntax-case.

Without any deep thinking, I've written the very sloppy implementation like this:

#!r6rs
(library (srfi :139 syntax-parameters)
  (export define-syntax-parameter
          syntax-parameterize)
  (import (rnrs))

(define-syntax define-syntax-parameter
  (syntax-rules ()
    ((_ keyword transformer)
     (define-syntax keyword transformer))))

(define-syntax syntax-parameterize
  (lambda (x)
    (define (rewrite k body keys)
      (syntax-case body ()
        (() '())
        ((a . d)
         #`(#,(rewrite k #'a keys). #,(rewrite k #'d keys)))
        (#(e ...)
         #`#(#,@(rewrite k #'(e ...) keys)))
        (e
         (and (identifier? #'e)
              (exists (lambda (o) (free-identifier=? #'e o)) keys))
         (datum->syntax k (syntax->datum #'e)))
        (e #'e)))
      
    (syntax-case x ()
      ((k ((keyword spec) ...) body1 body* ...)
       (with-syntax (((n* ...)
                      (map (lambda (n) (datum->syntax #'k (syntax->datum n)))
                           #'(keyword ...)))
                     ((nb1 nb* ...)
                      (rewrite #'k #'(body1 body* ...) #'(keyword ...))))
         #'(letrec-syntax ((n* spec) ...) nb1 nb* ...))))))
)

And this can be used like this (taken from example of the SRFI):

#!r6rs
(import (rnrs) (srfi :139 syntax-parameters))

(define-syntax-parameter abort
  (syntax-rules ()
    ((_ . _)
     (syntax-error "abort used outside of a loop"))))

(define-syntax forever
  (syntax-rules ()
    ((forever body1 body2 ...)
     (call-with-current-continuation
      (lambda (escape) 
 (syntax-parameterize
  ((abort
    (syntax-rules ()
      ((abort value (... ...))
       (escape value (... ...))))))
  (let loop ()
    body1 body2 ... (loop))))))))

(define i 0)
(forever
 (display i)
 (newline)
 (set! i (+ 1 i))
 (when (= i 10)
   (abort)))

(define-syntax-parameter return
  (syntax-rules ()
    ((_ . _)
     (syntax-error "return used outside of a lambda^"))))

(define-syntax lambda^
  (syntax-rules ()
    ((lambda^ formals body1 body2 ...)
     (lambda formals
       (call-with-current-continuation
 (lambda (escape)
          (syntax-parameterize
    ((return
      (syntax-rules ()
        ((return value (... ...))
  (escape value (... ...))))))
    body1 body2 ...)))))))

(define product
  (lambda^ (list)
    (fold-left (lambda (n o)
   (if (zero? n)
       (return 0)
       (* n o)))
        1 list)))

(display (product '(1 2 3 4 5))) (newline)

I've tested on Chez, Larceny, Mosh and Sagittarius.

This implementation violates some of 'MUST' specified in the SRFI.

keyword bound on syntax-parameterize doesn't have to be syntax parameter. (on the SRFI it MUST be)
keyword on syntax-parameterize doesn't have to have binding.

And these are the sloppy part:

define-syntax-parameter does nothing
syntax-parameterize traverses the given expression.

If there's concrete test cases and the above implementation passes all, I might send it as a sample implementation for R6RS.

2016-07-07

Weirdness of self evaluating vector

Scheme's vector has a history of being non-self evaluating datum and self evaluating datum. The first one is on R6RS, and the latter one is R7RS (not sure about R5RS). Most of the time, you don't really care about the difference other than it requires ' (quote) or not. However, you may need to think about the difference and maybe also think self evaluating causes more trouble. One of the particular case (and this is the only case I think self evaluating vector is evil) is when vector is used in macro.

Have a look at this case:

(import (rnrs))

(define-syntax foo
  (syntax-case ()
    ((_ e) e)))

(foo #(a b c))

What do you think how it behaves? The answer is depending on the standard. On R6RS, vectors are not self evaluating data so this should be an error. So you can't complain if you'd get a daemon from your nose. On R7RS (of course you should change the importing library name to (scheme base)), on the other hand, vectors are self evaluating data so this should return the input vector.

Now, how about this case?

(import (scheme base) (scheme write))

(define-syntax foo
  (syntax-rules ()
    ((_ "go" (v ...) ())         #(v ...))
    ((_ "go" (v ...) (e e* ...)) (foo "go" (v ... t) (e* ...)))
    ((_ e ...)                   (foo "go" () (e ...)))))

(foo a b c d e)

What would be the expansion result of macro foo? I think this is totally up to implementations (if it's not please let me know). For example, Chibi returns vector of something like {Sc #22 #<Environment 4365836288> () t} (syntactic closure, I think), Sagittarius returns vector of identifier, and Larceny returns vector of symbol t. If you put ' (quote) to the result template then the expansion result should be the same as Larceny returns (though, Chibi still returned a vector of syntactic closure, so this might not be defined, either).

Back to the first case. The first case sometimes bites me when I write/use R6RS macro in R7RS context. For example, SRFI-64 is implemented in R6RS macro and using it like this:

(import (scheme base) (srfi 64))

(test-begin "foo")

(test-equal "boom!" #(a b) (vector 'a 'b))
;; FAIL!!

(test-end)

On Sagittarius, R6RS macro transformer first converts all symbols into identifiers, then syntax information will be stripped only if expressions have quote. Now, SRFI-64 is implemented on the R6RS macro transformer and the vector doesn't have quote. Thus, symbols inside of the vector are converted to identifiers. If it's R6RS, then it's an error. But if it's R7RS, it should be a valid script.

I have sort of solution (not sure if I do it or not): Internally, symbol and the identifiers converted from symbols without any context (c.f. not using datum->syntax) are theoretically the same. So if compiler sees such an identifier, then it should be able to unwrap it safely.

I haven't decided how it should be. So for now, just a memo and let it sleep.

2016-06-23

価値観の違い

立場が違えば価値観は当然違う。良い悪いという話ではなく、そういうものだと思っている。一時間半に及ぶ最終面接(といって良いのかあれ？)でふと思ったこととをつらつら書いてみる。

現在絶賛転職活動中の僕は今週に1回(今日終わった)、来週に2回とえらい密度で面接が組まれている。別にここまで積極的にするつもりはなかったんだけどタイミング的にこうなった、正直辛い。っで、今日あった最終面接はその会社のオーナーとだったんだけど、なかなか面白い意見だなぁと思った。曰く

その会社で絶対働きたいという熱意がいる

会社名のタトゥーをいれるくらいとか
40時間越えて働いても残業代請求しないとか

金が欲しい開発者ならGoogleにでも行け

熱意があるやつが欲しいそうな

給料が高すぎるとよくないから(全社員共通の)上限がある

ちなみに上限はAmazonのオファーより1万5千ユーロほど低かった
そして上限を上げるつもりはないらしい(といわれた)

オファーを比較して自分の市場価値を探るのは好きではない

自分もやってたけど意味無いってさ
比較せずさっさと受けろという意味だとは思うけど

経営者の視点というか、これだけ見るとブラック企業(違法企業)にしか見えないが、条件自体はそこまで悪くない。ただ、3年で給料の上限にぶちあたる可能性(4%の昇給があったとして)があるのでよくもない。面白い話として、その会社の社員の一人は週末にプログラミングの講師をしているそうだ。っで、もし給料が今の倍、もしくはGoogleに準ずるレベルだったらその彼は週末を遊んで過ごすだろう、とも。個人的にこの話で何を説得したかはさっぱり分からないけど、僕の意見としては：

副業をしなければいけないほど給料が安い
そうでないのなら、Google並みの給料もらってもやってると思う

というもの。言わなかったけど。面白い意見だなぁと思ったけど、個人的には萎える意見の類ではある。

もう一つ別に引っかかったのは、「半年もすれば会社で一番の開発者になれる可能性がある」というもの。職場で勉強しようとかそういう気持ちはないんだけど、お山の大将になるつもりもなくて、「全力で追いかけても追いつけない」くらいの人がいる職場の方がいいんだけどなぁ、とか。たった一文の中で矛盾する二つの意見があるというのは置いておいて。同じ条件なら後者を選ぶという程度ではあるが。最先端を常に追いかけてるテック企業といってる割には、僕程度が半年で頂点に立てるレベルというのも今一矛盾しているような。

ここからは個人的な被雇用者としてあり方なんだけど：

労働を提供する対価を雇用者に求めている
仕事のやりがいと称して対価を下げる行為を嫌っている
対価に勝る評価方法はない
僕の労働をより評価してくれる雇用者に容易に移る

要するに、金ですよ。日本で働くことはすごく辛かったけど、一つだけ同意した言葉が今でもある。それが「金に勝る評価方法ない」というもの。どれだけ口で感謝されてもそれでは生きていけないのですよ、ワ○ミの社員ではないんで。もちろん、僕は僕とて市場価値を上げるように努力しているつもりではいるが。まぁ、プログラム書くこと(もしくはそれに関わること、論文読むとか)＝趣味なので努力とは違う気もしないでもないが(それだけでは年齢的にも頭打ちな感じがしないでもないところが辛いところ、かといってこれという何かもないけど)。

会社自体は創業20何年だけどオーナーが変わったのが去年らしく、どうにもスタートアップ的なのか体育会系的なのか分からないが妙にこう会社に尽くす人が欲しい的な雰囲気を前面に押し出す感があった(体育会系じゃないスタートアップに失礼か？)。来週までに辞意を今の会社に告げないと開始時期が9月になるという時期でもあるので、オファー出したら即日もしくは週末を挟んでの月曜日に返事が欲しいとか(流石にもう少し待ってもらうことになったけど)すごい勢いで急かされた感もある。

来週の面接の出来次第かなぁ。条件自体は今より多少上がるし。

絶賛転職活動中なので興味があれば声をかけていただけると嬉しいです。オランダでの勤務もしくは完全リモートが絶対条件だけど。

2016-06-18

R7RSレコード

WG2の議論でレコードの健全性について出てる(c.f. record types)。個人的にレコードが暗黙的にアクセサを作るかどうかというのはとりあえずどうでもよくて(R6RSでは明示されなければ暗黙的に作るってなってるし)、健全性のテストコードが問題だった。

R7RS-smallのレコードは基本SRFI-9なのだが、SRFI-9ではコンストラクタタグとフィールドが同一の識別子でないとエラーとなっている。SagittariusのSRFI-9の実装はこの条件をbound-identifier=?でチェックしているので、テストコードもまぁ問題なく動くだろうなぁと思っていた。のだが、そうもいかなかった。

テストコードではtmpという識別子がコンストラクタタグとフィールドに並ぶ形になる。これらはfree-identifier=?を満たすが、bound-identifier=?は満たさない(識別子の挿入されるタイミングが違うため)。そのためSRFI-9の実装的には問題ない。何が問題か？R6RS版のdefine-record-typeとrecord-accessorの実装が問題だった。

R6RS版のdefine-record-typeは構文情報を引き剥がして下請けのレコード手続きに定義されたレコードの情報を渡すため、テストコードが生成するレコードが持つフィールド名は全てtmpになる(これはR6RS的には許容されている)。っで、record-accessorは与えられたレコードのk番目のフィールドの値を返す手続きを返すんだけど、このk番目にアクセスするためにスロット名でアクセスしていたのがまずかった。構文情報の引き剥がされてるので、単なるシンボルの比較になるんだけど、全部同じ名前なので常に最初にヒットしたスロットの値を返す。

どうしたか？普通にk番目でアクセスするように変更した。昔レコードをCLOSと統合した際に既にあった機能な気がしないでもないんだけど、なんでスロット名で引くようにしたんだろう？当時は気付いていなかったか、実はこの機能は後から外に出されたかのどっちかだろう。

一見するとマクロのバグのようで実は違ったという話。

2016-06-12

Because it's fun

Couple of days ago, I've had a job interview, and one of the interviewer said something very interesting. I don't remember exact sentence but something like this: If I make a framework for hobby, it's okay. I didn't understand what the purpose of this comment, so I said "if there's no framework then no choice, right?" Then, he said "if you need to use this framework for work, then you need to consider a lot of things such as buffer overflow.". Well, sort of agree and sort of disagree.

The reason why I needed to make loads of framework is basically because nobody would make other than me. If it's major language such as Java, then you just need to google it and find something. But I'm using Scheme, more specifically Sagittarius. Sagittarius, unfortunately, doesn't have many libraries. Of course, I'm trying to write something useful, but there's only one resource so it doesn't increase the number drastically. And even more unfortunate thing is that it's not so popular so there's not many users. If number of users is small, then not many libraries are written. (It's rather my fault because I still think it only needs to fit my hand and didn't advertise much...) Then if you need something, you gotta write it.

The part I agree with the opinion is the cost of coding. If there are well maintained libraries for you purpose, using them would reduce some time to redevelop the same functionality. Especially if the library is mature enough, then it takes a lot of resource to make your own implementation such level. (e.g. Spring Framework or so)

But should it only be like this? If you are a programmer, you want to write something from scratch because it's fun, don't you? I hear almost every year new trend framework. Not sure the actual motivations are, might be unmatched framework, might be just for fun, etc. And most of the time, it has some bugs. What I want to say is not something like new frameworks are buggy, but they can be famous even the initial version is buggy. So it's pity if you don't write/show something useful because it might not be perfect.

The very first version may only confirm your own requirement. I usually write such libraries and just put it on GitHub (you'll find loads of junks on my GitHub repository). After using it, I usually notice what's missing or not considered. If you have choice to write whatever you want to, then don't hesitate to write especially under the reason of "buffer overflow". (It's more for me, though)

2016-06-04

生産性

ツイッターでも呟いたのだが、体力系と瞬発力系の二つのコーディングテストを最近受けた。体力系は現在進行形だが、コーディング自体は要求された機能を最低限満たしたのでよしとしてしまった。(ネタは面白いんだけど、プライベートでJavaを長時間書く気になれんのよ…)

先週の週末から時間が取れるときにやってたんだけど、途中でJavaに嫌気が差してSchemeで書いてその後Javaに戻ったりしたので、この2言語間(正確にはSagittariusとJavaだが)の生産性の違いが書けそうだなぁと思い、愚痴をこめつつ書くことにした。

先に結論を書くと、Javaは極めて冗長に書く必要があるのでプロトタイプ的なものを作るには向かない。(ので、体力勝負系のコーディングテストに持ってこられると非常に面倒。) 成果はGithubかBitbucketに置けと言われたので、晒しても問題ないと判断して晒す(まさかプライベートリポジトリに置けって意味ではないと思うし)。とりあえず以下はなんとなくな比較

	Scheme	Java
実装時間	6時間	16時間
ファイル数	6(自動生成含む)	61(テストファイル含む)
開発環境	Emacs	Eclipse

Schemeの実装時間にはテーブル構成の考案にHTML(+Javascript)の時間も含むので多少多めではある(Scheme書いてた時間は3時間くらいかなぁ)。Javaの16時間のうち少なくない時間を割いたのがMavanリポジトリを探す作業。2年近く触ってなかったからいろいろ忘れていた。後はSpring、Hibernateの初期設定とか。一回書くと忘れる系のものは都度Google先生にお伺いを立ててたので。

そうは言っても割と大きめな時間の差があると個人的には思う。言語の習熟度とか、ライブラリ習熟度の差なのかもしれないけど、一応これでも職業Java屋歴10年以上あるのでそこまでの差はないとしたいところ(むしろScheme歴は6-7年なのでJava歴より短い)。個人的に最も生産性(ここでは時間のことを指す)に響いたのはREPLの存在だったと思う。SchemeではリモートREPLを使ってサーバに変更を即時反映させて挙動を確認していたのに対し、Javaでは毎回ビルドするという切ない状況だったのは大きい。対話的に確認できることの偉大さというのを改めて実感した気がする。次いで大きかったのはデザインパターンの有無。JavaだとなんとなくGeneric DAOパターンを使わないといけないかなぁという脅迫観念に襲われたので、とりあえずそれを使ったんだけど、このパターンは恐ろしく生産性が低い。あらかじめあるものを使うのならばいいのだが、エンティティ毎にDAO書いてサービス書いてとかやってるもんだから、非常に面倒だった。(この辺IDEがワンクリックでやってくれると違うのかもしれない。)細かくボディーブローのように効いたのはJavaの1クラス1ファイルの仕様や、キーボードから指が離れる瞬間が多いこと。別にEmacs最高とかいうつもりはないけど、こういうところで地味に時間を取られた気がする。

こういう体力勝負のコーディングテストは最低でも自分が好きな言語でやらせてくれないと途中で息切れするなぁと思った。問題は、僕が好きな言語でやると9割方読めないということだろうか。Schemeいい言語だと思うのだが、なんでこうも人気がないのだろう？

2016-05-25

File monitoring on OS X using FSEvents

(sagittarius filewatch) on OS X is using kqueue (2) currently. Using kqueue isn't bad just having couple of limitation such as no capability of directory monitoring (due to my laziness). It is OK on BSD environment since this is the only choice to do it. However, on OS X, there are FSEvents APIs which allow users to monitor filesystem. When I research it, it doesn't require loads of file descriptors nor limit file/directory only monitoring. So I thought this might be a good underlying implementation for OS X and implemented like this.

If it ends without any problem, as usual, I don't write any blog post. Yes, there's a huge problem. It doesn't allow me to write tail emulator. I first thought my implementation has an issue. So I've written this small piece of code to check if it works as my expect.

#include <CoreServices/CoreServices.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

static void callback(ConstFSEventStreamRef stream,
                     void *callbackInfo,
                     size_t numEvents,
                     void *evPaths,
                     const FSEventStreamEventFlags evFlags[],
                     const FSEventStreamEventId evIds[])
{
  FILE *fp = (FILE *)callbackInfo;
  char buf[1024];
  const char **paths = (const char **)evPaths;
  for (int i = 0; i < numEvents; i++) {
    while (1) {
      int n = fread(buf, 1, sizeof(buf), fp);
      fwrite(buf, 1, n, stdout);
      if (feof(fp)) break;
    }
    fflush(stdout);
  }
}

int main(int argc, char **args)
{
  if (argc != 2) {
    fputs("fstail file", stderr);
    exit(-1);
  }
  FILE *fp = fopen(args[1], "r");
  fseek(fp, 0, SEEK_END);
  
  CFStringRef s = CFStringCreateWithCString(kCFAllocatorDefault, args[1],
                                            kCFStringEncodingUTF8);
  CFArrayRef ar = CFArrayCreate(NULL, (const void **)&s, 1, NULL);
  FSEventStreamContext ctx = {0, fp, NULL, NULL, NULL};
  int flags = kFSEventStreamCreateFlagFileEvents;
  FSEventStreamRef stream = FSEventStreamCreate(NULL, &callback, &ctx, ar,
                                                kFSEventStreamEventIdSinceNow,
                                                0, flags);
  FSEventStreamScheduleWithRunLoop(stream, CFRunLoopGetCurrent(),
                                   kCFRunLoopDefaultMode);
  FSEventStreamStart(stream);
  CFRunLoopRun();
  return 0;
}

This didn't work like tail command, unfortunately. It didn't receive any event after it got first event. If you get a technical problem, you probably search Google or Stack Overflow. Yes, I've found the very similar issue: No callback get called from FSEventStreamCreate with modifications created by self in watched file. It seems the question didn't get any useful answer. Feeling like I'm missing something, but I don't know what. I've also changed latency argument to non zero value but got no luck.

As long as this problem is not solved, I can't use FSEvents. So for now, I use kqueue, which works perfectly fine for my purpose, on OS X...

2016-05-14

equal?の挙動

二週間前くらいにバグの報告を受けた(それ自体は修正済み)。バグの原因を突き詰めた際に「これはブログのネタになる」と思っていたのだが、それから随分時間が経ってしまった。多少賞味期限が切れてしまった気がしないでもないが、ちょっとした後方互換を壊す修正でもあるし、適当に記録に残しておく。ちなみにバグの報告はこれ。

バグの原因を要約すると以下の二つ：

eqv?が循環構造のあるレコードを受け取るとSEGVを起こす。
equal?がレコードの中身を比較する(R6RS的には規格違反)

随分前に(R7RSがまだドラフトだったときじゃないかなぁ)、レコードの中身をチェックするようにしていたのだが、それが噛み付いてきた感じである。eqv?がレコードの中身をチェックするのは、実はR6RS、R7RS両方で規格違反なのだが(明確にアドレス比較のみと書いてある)、equal?を変更した際にR6RSのテストスイートを通すために必要だったという経緯がある。(テストケースの意味もよく分からないんだけど、フィールドの存在しないレコードはコンストラクタが同一のオブジェクトを返してもいいってことなのかなぁ？)

1.に関してはレコード周りを取り除いてしまえば直るのは明白だったのだが、2.との兼ね合いでどうしようかなぁとい感じになるもの。そもそも、eqv?が中身を見るというのはいろいろおかしい感じがするので(リストの中身とか見ないわけだし)、取り除いてしまいたい感はあった。となると2.との兼ね合いだけなんだけど、これ自体はもともと便利だからという理由で入れてあっただけなので、正直取り除いてもそんなに影響ないだろうと思っていた。実際R7RS的には未規定なわけだし。が、SRFI-116に落とし穴が潜んでいた。

SRFI-116はイミュータブルなリストを定義しているんだけど、その参照実装のテストケースがequal?にレコードの中身を検査することを要求するコードになっていた。参照実装はChibiとChickenの2つの処理系で動くように実装されているのだが、どうもこの2つはレコードの中身を見るみたいである。新しいSRFIだからまだ使ってる人は少ないだろうし、暗黙的な要求なので無視しても問題ない気はしたのだが、人気処理系のうちの二つでやられているのなるとなぁという気持ちの方が大きかった。

じゃあどうしたか？実装をR6RSの

equal?とそれ以外という風に分けた。ポータブルなコードを書くという上ではR6RSの方がより細かく規定してあるのでいいのだろうけど、これくらいの処理系拡張は許して欲しいという気持ちもある。そうすると実装を二つにする以外に方法が思いつかなかったのだ。ということで、(rnrs base)と(scheme base)で定義されているequal?は別物になった。この動作に依存したコードを書くというのはあんまりないと思うけど、ハッシュテーブルをequal?で作ってキーにレコードを使用している場合は影響がでるという話。

Syntax highlighter