時の羅針盤＠blog

2014-08-30

2014-08-27

最適化

ちょっとした最適化をコンパイルに入れた。具体的にはインポートした束縛でキャッシュ可能なものは定数展開してしまうというもの。以下がその例。

(library (foo)
    (export a)
    (import (rnrs))
  (define a 'a))
(import (rnrs) (foo))
;;(set! a 'b) ;; uncomment this would also work...
(disasm (lambda () (print a)))

こんな感じのライブラリをインポートすると

;; size: 5
;;    0: CONST_PUSH a
;;    2: GREF_TAIL_CALL(1) #<identifier print#user (0x80402708):0>; (print a)
;;    4: RET

こんな感じで、GREF_PUSHの変わりにCONST_PUSHが使われるというもの。これだけだと、実はあんまり嬉しくないんだけど、これにコンパイル時の手続き呼び出しが加わるとかなり最適化ができる。例えばaが文字列"1234"で、スクリプトが(print (string->number a))だとすると、string->numberがコンパイル時に解決されて数値1234が直接VMインストラクションとして出される。まぁ、そんなに上手いことはまるケースは少ないだろうけど、こういうのは入れておくと後で効いてくるというのが経験側から学んだことなので入れておいて損はないだろう。

現状で気に入らないのはset!の挙動なのだが、再定義と代入は別ものとしてエラーにした方が精神衛生上いいような気がする。スクリプト上の環境(+evalの環境)は再定義+代入可能にしているのでこれが可能なのだが(#!compatibleをつけても同様)、禁止するとREPL上でも禁止になるという弊害もある。そうは入ってもREPL上で(set! + 'hoge)とかやるかという話にもなるのだが、例えばMoshとChezはこれを禁止している(psyntaxがかね？)。ちなみに、NMoshとYpsilonは許容している(NMoshは奇妙な挙動をするけど)。

ちなみに、再定義もしくは代入された後は定数展開されない。というか、同一ライブラリで定義された場合はされないという話。これは、例えばGambit benchmarkのcompilerみたいなの対策だったりする(展開されると正しい結果が出ない場合)。ただし、library及びdefine-libraryフォーム内では別のロジックが走るので、展開される可能性はある。でも、よく考えれば、定数しかやらないので展開してしまってもいいかもしれない。後で考えよう。

追記
いや、やっぱり同一ライブラリ内では駄目だな。以下のようなのが困る。

(define b 'b)
(define (foo) (print b))
(foo)
(set! b 'c)
(foo)

これを定数展開してしまうと、二度目のfooが正しく動かない。

追記2
インポートされた束縛に対するset!は禁止することにした。再定義はOK。そういえばこの変更でR6RS/R7RS準拠のevalが書けなくもないのだが(定義禁止等)、まぁそこまでする必要は今のところないかなぁ。

2014-08-23

SRFI-26の紹介

(LISP Library 365参加エントリ)

SRFI-26はカリー化を除いたパラメタの特殊化表記です。日本語にすると意味が分かりませんが(正しく訳せているのかすら疑問) 、要するにcutとcuteです。どちらも手続きを作成するマクロです。使い方は以下：

(cut cons a <>)
;; -> (lambda (tmp) (cons a tmp))

(cut list a <> <>)
;; -> (lambda (tmp1 tmp2) (list a tmp1 tmp2))

(cut list a <> <...>)
;; -> (lambda (tmp1 . xs) (apply list a tmp1 xs))

(cute cons (+ a 1) <>)
;; -> (let ((a1 (+ a 1))) (lambda (tmp) (cons a1 tmp)))

どちらのマクロも第一引数を手続きとみなし、残りをその手続きの引数とします。<>及び<...>はプレースホルダーで、作成された手続きに渡される引数に置換されます。出現する順番どおりに引数に展開されます。cutは引数をそのままlambdaに展開し、cuteの方はプレースホルダーである<>以外の引数を評価を一度だけ行うことを保障します。

プレースホルダーの使い方ではまるのが以下の例です。

(cut cons a (list b <>))
;; -> (lambda () (cons a (list b <>)))

プレースホルダーはcutもしくはcuteと同じ深さにいないといけません。個人的にこれは不便だなぁと思う点ではあるのですが、ネストを考慮するとsyntax-rulesで書けない(もしくはものすごく頑張らないといけない)ので、まぁしょうがないのではないでしょうか。

今回はSRF-26を紹介しました。

2014-08-19

Performance turning - I/O

The R7RS benchmark showed that I/O was slow in Sagittarius. Well not quite, in R7RS library read-line is defined in Scheme whilst get-line is defined in C.This is one of the reason why it's slow. There is another reason that makes I/O slow which is port locking.

Sagittarius guarantees that passing a object to the port then it writes it in order even it's in multi threading script. For example;

(import (rnrs) (srfi :18) (sagittarius threads))

(let-values (((out extract) (open-string-output-port)))
  (let ((threads (map (lambda (v) 
   (make-thread (lambda ()
           (sys-nanosleep 3)
           (put-string out v)
           (newline out))))
        '("hello world"
   "bye bye world"
   "the red fox bla bla"
   "now what?"))))
    (for-each thread-start! threads)
    (for-each thread-join! threads)
    (display (extract))))

This script won't have shuffled values but (maybe random order) whole sentence.

To make this, each I/O call from Scheme locks the given port. However if the reading/writing value is a byte then the locking is not needed. Now we need to consider 2 things, one is a character and the other one is custom ports. Reading/writing a character may have multiple I/O because we need to handle Unicode. And we can't know what custom port would do ahead. Thus for binary port, we don't have to lock unless it's a custom port. And for textual port, we can use string port without lock.

Now how much performance impact with this change? Following is the result of current version and HEAD version:

% ./bench sagittarius tail
Testing tail under Sagittarius
Compiling...
Running...
Running tail:10

real    0m26.155s
user    0m25.568s
sys     0m0.936s

% env SAGITTARIUS=../../build/sagittarius ./bench sagittarius tail

Testing tail under Sagittarius
Compiling...
Running...
Running tail:10

real    0m19.417s
user    0m18.703s
sys     0m0.904s

Well not too bad. Plus this change is not for this particular benchmarking which uses read-line but for generic performance improvements. Now we can finally change the subjective procedures implementation. The difference between get-line and read-line is that handling end of line. R7RS decided to handle '\r', '\n' and '\r\n' as end of line for convenience whilst R6RS only needs to handle '\n'. Following is the result of implementing read-line in C.

% env SAGITTARIUS="../../build/sagittarius" ./bench -o -L../../sitelib sagittarius tail

Testing tail under Sagittarius
Compiling...
Running...
Running tail:10

real    0m5.031s
user    0m4.492s
sys     0m0.795s

Well it's as I expected so no surprise.

Performance turning - Windows (MSVC)

It turned out that Windows version of Sagittarius is extremely slow. If I try ack on Cygwin version then it can run about 60 sec (Core i7 3.40 GHz). Following is the result of three implementations that previous article mentioned.

% time gosh -r7 ack.scm
gosh -r7 test.scm  130.14s user 4.26s system 120% cpu 1:51.70 total

% time sagittarius ack.scm
sagittarius test.scm  54.48s user 3.01s system 124% cpu 46.022 total

% time chibi-scheme ack.scm
chibi-scheme test.scm  49.84s user 0.00s system 99% cpu 50.170 total

And this is the inside of "acm.scm".

(import (scheme base))
(define (ack m n)
  (cond ((= m 0) (+ n 1))
        ((= n 0) (ack (- m 1) 1))
        (else (ack (- m 1) (ack m (- n 1))))))

(ack 3 12)

Apparently Chibi is the fastest. And Sagittarius is not so slow. Well, I don't know why Gauche is slow in Cygwin environment. I think this is simply the performance of VM and its stack overflow handling. (at least on Sagittarius, not sure for others.)

Now let's see how is the Windows version's performance. Following is the result of ack using Cygwin's time command to see how slow it is.

D:>\cygwin\bin\time sash.exe ack.scm
0.00user 0.00system 1:12.13elapsed 0%CPU (0avgtext+0avgdata 220928maxresident)k
0inputs+0outputs (901major+0minor)pagefaults 0swaps

I don't know why it has some garbage but 30% slower. So indeed it is Windows version's issue. But why?

I think the reason why is it does not use direct threading due to the limitation of MSVC (if we use it should improve approx 30% of performance, so about 21 sec in this case?). However I've found this article: How not to make a virtual machine (label-based threading). Even though this article concluded not to emulate direct threading on MSVC however it wouldn't hurt to give it a shot.So I've modified VM a bit to emulate it (sources are in msvc32-emulate-direct-threading branch). Then run the above script.

Here is the result;

D:>\cygwin\bin\time.exe build\sash.exe ack.scm
0.00user 0.00system 1:40.56elapsed 0%CPU (0avgtext+0avgdata 220928maxresident)k
0inputs+0outputs (901major+0minor)pagefaults 0swaps

Well, it's always better to believe what people have done, I guess...So this really doesn't work. Now we only have 2 options, giving up or adopt to MinGW. Hmmmm.

2014-08-17

性能差

@SaitoAtsushiさんがLarcenyにあるR6RSベンチマークをR7RS用にして走らせた結果をツイートしていた。

R7RS 処理系のベンチマークとった。 Gauche 激速！ Sagittarius が次点。 Chibi はとんでもなく遅い。 Foment や Husk は測定するのが非現実的なくらい遅くて諦めた。 pic.twitter.com/eduG04M3FD
— 齊藤敦志 (@SaitoAtsushi) August 17, 2014

Gaucheが速いのは分かっていたことなのだが(いずれ追い越す予定…未定…)、Chibiより遅いベンチがあるのは正直まずい。Chibiはランタイムが小さいためほとんどの手続きがSchemeで書かれているという特徴があるのだが、それより遅いのがあるということは基本的なVM等の動作が遅い可能性がある。

とりあえずChibiより遅いのが以下のベンチ

ack
cat
nucliec
quicksort
simplex
slatex
gcbench
mperm

こいつらに関わる手続きはすぐにでもパフォーマンスチューニングする必要がある。特にackなんて本当にシンプルなのでVMの性能がまずいということになる・・・(涙)

次いで、Gaucheに圧倒的に負けてるやつ

ctak
sum
sumfp
mbrot
mbrotZ
pnpoly
tail
read1
lattice
nqueens

結構あるのだが、このうち浮動小数点が関わってくるやつ(sumfp、mbrot及びmbrotZ)はちと分が悪いので後回し。Gaucheは浮動小数点数をある程度レジスタに持つのでキャッシュが効くうちはメモリ割付が発生しない(はず)。テキストポートが絡んでくるI/O周りも多少分が悪いところがあるのではあるが、何とかしたいところ。nqueensはどちらも絡まないのでVMもしくはコンパイラということになる。

所詮マイクロベンチマークという見方もあるかもしれないが、こういう地味な性能差は地道なチューニングの結果なのでここで手を抜くのはまずいというのが僕の考え方。

2014-08-12

What do you expect to be printed?

I was testing how include on Sagittarius works and found out something interesting.

Firstly, I've prepared 4 files related to library. The contents are very simple. Like this;

;; foo.sld
(define-library (foo)
    (import (scheme base))
    (export bar)
  (include "impl/bar.scm"))

;; impl/bar.scm
(include "buzz.scm")

;; impl/buzz.scm
(define bar 'bar)

;; buzz.scm
(define bar 'buzz)

Now I put those files following directory structure;

./ + foo.sld
   + impl
       + bar.scm
       + buzz.scm
   + buzz.scm

Then, wrote following script;

(import (scheme base) (scheme write) (foo))
(display bar) (newline)

The question is, what would be the expected value to be printed 'bar or 'buzz? I first expected to be print 'bar with Sagittarius but it was 'buzz. Then checked with Chibi which also printed 'buzz. At this point I thought, oh R7RS de-facto would be 'buzz. Well it wasn't. I've also check with Gauche and Foment and these printed 'bar.

I know why this happens actually. When the first include was processed then the current loading path returned to the same directly as "foo.sld". Then second include "buzz.scm" is processed. I don't know how Gauche and Foment do this properly (well R7RS doesn't specify how to resolve the path so both are proper behaviour though).

Well, by now I've never seen this type of code in anywhere other than in my test. But I would like to go more intuitive behaviour.

13th August, 2014
I've modified compiler to be able to handle above case. What I've changed is that

include and include-ci returns read expression and filename
compiler sets current load path after include is done according to above filenam

It's a bit ugly way to fix so I need to think how we can clean it up now.

2014-08-11

SQL in S-Expression

There is SXML which represents XML in S-Expression. If you're a professional programmer, it is very difficult to avoid using SQL (well, if you could, you are very lucky).

Writing SQL isn't so bad if the file is separated or the editor is smart enough to edit. However it's usually none of the case so I always need to suffer especially writing it in double quote. Now, Emacs is de-facto editor for all programmers (vim? why are you talking about jeans here? V.I.M.) and it's by default very good with editing s-expression. So if you can write SQL in s-expression then the world would be happier than before.

Now, there are bunch of projects that have the same idea I've got, CLSQL, SxQL, S-SQL for example (I think Clojure also has something similar but I'm not so familiar). The problem of them are very simple. It's using keyword which is not in R6RS/R7RS. Moreover, those use either reader macro or CLOS so not pure s-expression. What I want is light weight but easy to edit. (well, it is actually ok for me, Sagittarius has all of them anyway.)

So I'm thinking something like following;

(define ssql 
  '(select ((p id) (as (p first_name) name) (a street))
    (from (as person p))
    (inner-join (as address a) (on (= (p id) (a id_person))))
    (where (= (a country) "Nederlands"))))

(ssql->sql ssql)
;; "select p.id, p.first_name as name, a.street
;;  from person as p
;;  inner join address as a on p.id = a.id_person
;;  where a.count = 'Netherlands'"

Hmmm, it looks parentheses are a bit too many so the readability is a bit lowered. Basically it doesn't have to be fancy but readable and easy to remember. So one-by-one mapping is ok. (if I want OR-mapper, then I just need to construct it on top of it.)

If there is something similar in Scheme or better idea, I would like to know.

2014-08-08

SRFI-25の紹介

(LISP Library 365参加エントリ)

SRFI-25は多次元配列を扱うSRFIです。多次元配列自体はリストでもベクタでも表現できますが、このSRFIは2次元以上の配列の扱いをより自然にすることを目的としています。

使い方を見てみましょう。

(import (srfi :25))
;; create 3 dimension array, each dimension
;; contains bound start with 0 and end with 4
;; (exclusive)
(define arr (make-array (shape 0 4 0 4 0 4)))

;; set 'e1 on position x = 2, y = 2, z = 2
(array-set! arr 2 2 2 'e1)
;; -> unspecified value

;; ref it
(array-ref arr 2 2 2)
;; -> e1

;; returns dimensions of array
(array-rank arr)
;; -> 3

ベクタやリストを使うとアクセスが複雑になるところが、より直感的に操作可能になります。

もう少し複雑な例を見てみましょう。手続きshapeは配列境界のペアを受け取ります。例えば以下のように書けば0以外の数字から始まる配列も書くことが可能です。(配列が3次元を超えるを僕の理解の範疇を超えてくるので2次元で・・・)

;;; first dimension has 2 elements start with 4 end with 5
;;; second dimension has also 2 elements start with 5 end with 6
;;; the array is initialised with given objects as following
;;;   5  6 
;;; 4 nw ne
;;; 5 sw se
(array (shape 4 6 5 7) 'nw 'ne 'sw 'se)

このSRFI自体には含まれないのですが、このSRFIの著者はarlib.scmをSRFIに付随させており、その中にはtubulate-arrayやarray-equal?といった便利手続きが定義されています。(なんでSRFI自体には含めなかったんだろう？)

今回はSRFI-25を紹介しました。

2014-08-05

LaTeX用環境構築

Schem Workshop 2014の主催者から「Sagittariusについてペーパー書かない？」というメールが来たのが昨日。これも何かの縁だろうということで頑張って書こうかと決めたのが昨日。Windows用のLaTeX環境が絶望的に乏しいことに気付いたのも昨日。昨日はなんだかいろいろあったなぁw

世の中にはCloud LaTeXなるものもあるらしく、おとなしくそれを使っておけという話なのかもしれないが、それらがEmacsキーバインドをサポートしているかどうかというのと、期日が9月5日という1ヶ月しかない中でサービス探しつつLaTeXを覚える(そして英語論文の書き方も・・・)というのをこなすのはちときついので、Emacsに環境を作ってしまおうという方に傾いてみることにした。軽くググッてみると世の中にはAUCTeXなるものがあるらしい。とりあえずこいつをLaTeXのプリビューが出来るまで動かせれば勝ちである。

とりあえず簡単な手順。

ダウンロードページからAUCTeXを落としてくる。
libpng14-14.dllとzlib1.dllの32ビット版をEmacsのbinフォルダに置く(PNGサポート)
MikTeXをインストールしてパスを通す
64ビット版を入れたが正しいかは不明
Ghostscriptの32ビット版をインストールしてパスを通す
最初64ビット版を入れて、GSWIN32C.EXEがないと言われてはまった

これでいいはずなんだけど、スクリーンショットのpreview-latexのようには見えない件。error in process sentinel: preview-reraise-error: End of Preview snippet 9 unexpectedなんていわれる。何かが足りてないのだろうが、何が足りてないのかは不明・・・しかし、なんで選択肢がMicorsoft WordかLaTeXというのがいけないんだ。OOoならあるのに・・・

ペーパー書く以外にも問題があるのだけど、そいつらは書けてから考えることにする。(USに入るための手続きとか、クレジットカード作らないととか、パスポートの更新とか、まぁ割りとたくさんあるなぁ・・・)

2014-08-04

Gaucheライクな文字セットリーダをサポートしてみた

事の発端は(text parse)のテストケースがなかったこと。

この数日バイナリの扱いを楽にするためのライブラリを書いていたのだが、(text parse)と対になる(binary parse)を格段になってテストケースがないことに気付いた。っで、Gaucheからテストケースを拝借しようと思ったら、文字セットのところで躓いたという話。Sagittariusはリーダーマクロをサポートしているのでさくっと書いてしまえばいい話ではあるのだが、この形式の読み取りで躓いたのはこれで何度目か分からないので。

以下のように使える。

#!read-macro=char-set
#[a-zA-Z]
;; => #<char-set ...>

あくまで読み取りで、書き出しの方は何もしてない。(pp)は変更してあるのでそれを使うということで。

実装は以下。

(library (char-set)
    (export :export-reader-macro)
    (import (rnrs)
            (sagittarius reader)
            ;; we don't want to export regex reader macro with this
            (only (sagittarius regex) parse-char-set-string))

  ;; Gauche compatible charset
  (define-dispatch-macro #\# #\[ (char-set-reader port subchar param)
    (define (read-string in out)
      (let loop ((depth 0))
        (let ((c (get-char in)))
          ;; we just need to put all chars until we got #\]
          ;; c could be eof so check
          (when (char? c) (put-char out c))
          (cond ((eof-object? c) 
                 (raise (condition
                         (make-lexical-violation)
                         (make-who-condition 'char-set-reader)
                         (make-message-condition "unexpected EOF"))))
                ((char=? c #\[) (loop (+ depth 1)))
                ((char=? c #\])
                 ;; done?
                 (unless (zero? depth) (loop (- depth 1))))
                (else (loop depth))))))
    (let-values (((out extract) (open-string-output-port)))
      ;; need this unfortunately
      (put-char out #\[)
      (read-string port out)
      (parse-char-set-string (extract))))
)

parse-char-set-stringは新たに足された手続き。インポートの際に問答無用でインポートしたライブラリのリーダーマクロを引き継ぐようになっていたのを修正。

スクリプト内で使う分にはこれで十分なはず。

2014-08-01

SRFI-23の紹介

(LISP Library 365参加エントリ)

SRFI-23はエラー通知について定めたSRFIです。要するにerror手続きです。使い方は以下のようになります。

(error "error reason" some error related objects)

たったこれだけです。このSRFIではエラーが通知された際に何が投げられる等のことは定義されていません。参照実装は潔く単にメッセージと原因のオブジェクトをプリントした後、処理系が何かしらエラーを投げると思われる手続きの呼び方をして終わりです。(例:(car 'a)等)

振る舞いはともかくとして、このSRFIはR7RSに丸ごと取り入れられています。R7RSではエラーオブジェクトを投げるとされていますが、エラーオブジェクトが何かということは触れられていません。

ちなみに、R6RSにも同名の手続きがありますが、こちはら引数の個数が違うので注意が必要です。また、投げられる例外オブジェクトについても細かく既定されています。これもR6RSとR7RSの非互換の一つです。

今回はSRFI-23を紹介しました。

2014-07-30

Industriaのテスト

IndustriaというR6RS Scheme用のポータブルなライブラリをご存知だろうか？Göran Weinholt氏が作成しているライブラリで、暗号器からX86のアセンブラまであるという巨大なライブラリである。(GitHubで最後にコミットがあったのが去年の8月なのでひょっとしたら開発が停止しているかもしれない。)リポジトリのweinholt/netディレクトリの中にあるtcp.slsからサポートしている、もしくは検討された処理系を知ることができる。以下がそのリスト。

Chez
Guile
ikarus
IronScheme
Larceny
Mosh
mzscheme (Racket)
Sagittarius
vicare (ikarusの後継)
Ypsilon

Biwa Schemeを除いたr6rs.orgに挙げられている処理系のリストほぼそのままである。可搬性のない部分のみをうまく切り出して、あとはR6RSの仕様どおりに書かれているということであろう。

以前このライブラリで定義されているマクロがうまく動かないことがあったこともあり、マクロ展開器のバグを直すたびくらいにテストを走らせていたのだが、ふと他の処理系ではどうなのかと思い試してみた。依存しているSRFIとCygwinという関係上、テストを走らせることが出来たのは以下の3つ。

Sagittarius (0.5.7 HEAD)
Mosh (0.2.7)
Racket (6.0.1 Windows x86_64)

GitHubのIssueに挙げたのだが、weinholt/assembler/x86.slsにあるfind-instruction-encoding手続きで定義されているbailoutが0引数で呼ばれている箇所があるため、直してやる必要がある。(でないとSagittariusはコンパイルエラーを挙げる。)毎回コマンドを叩くのもあほらしいので以下のようなスクリプトを用意した。

#!/bin/sh

usage() {
    echo "run-test.sh CONFIG"
    echo " CONFIG argument must be a shell script file"
    echo " contains $program and $load_flag"
}

if [ -f $1 ]; then
    . `pwd`/$1
    if [ "$?" != "0" ]; then
 echo "source command failed"
 exit -1
    fi
else
    usage;
    exit -1
fi

load_path=`pwd`

init $load_path

cd tests
for f in *.sps
do
    echo "Testing $f"
# hack for Windows racket...
    temp_file=$f.tmp.scm
    echo -n ';;' | cat - $f >  $temp_file
    echo "$program $load_flag $f"
    "$program" $load_flag $temp_file
    rm $temp_file
done

っで、以下の設定ファイルをそれぞれ用意。

# sash.conf
#!/bin/sh
# -*- mode:shell; coding:utf-8; -*-

program=/opt/bin/sash
load_flag=

init() {
load_flag=-L$1
}

# mosh.conf
#!/bin/sh
# -*- mode:shell; coding:utf-8; -*-

program=mosh
load_flag=

init() {
    load_flag='--loadpath='$1
}

# racket.conf
#!/bin/sh
# -*- mode:shell; coding:utf-8; -*-

program='/cygdrive/c/Program Files/Racket/plt-r6rs'
load_flag=

init() {
    load_flag="++path ../"
}

後は以下のようなコマンドで走らせる。

$ ./run-tests.sh sash.conf

テストを走らせた結果、意外にもSagittariusだけが全てのテストをパスするということが起きた。(これを書きたかった。)

Moshはpsyntaxのエラーがあったり、行末文字に起因すると思われるエラーがあったりでいくつかのテストが失敗する。またSRFI-25をサポートしていないのでライブラリがないというエラーも見受けられた。NMoshで走らせた場合はマクロ展開に起因するエラーは発生しないものの、通常のMoshと同様のエラーは発生する。

Racketは恐らく行末文字に起因するエラーが一つといくつかのテストが走ることさえなく落ちていくという現象が起きていた。

また、テスト完了までの時間もSagittariusが最速であった。以下がtimeコマンドを用いたテストケースの起動結果。(Cygwin 32 bit on Windows 7 64 bit Core 7i)

% time ./run-tests.sh sash.conf > sash.log 2>&1              [~/work/industria]
./run-tests.sh sash.conf > sash.log 2>&1  68.94s user 6.14s system 103% cpu 1:12.23 total

% time ./run-tests.sh mosh.conf > mosh.log 2>&1              [~/work/industria]
./run-tests.sh mosh.conf > mosh.log 2>&1  111.95s user 2.98s system 99% cpu 1:55.56 total

% time ./run-tests.sh racket.conf > racket.log 2>&1          [~/work/industria]
./run-tests.sh racket.conf > racket.log 2>&1  0.21s user 0.71s system 0% cpu 5:56.48 total

以外なのはRacketが5倍近く遅かったことである。

この結果を受けて着実に使える処理系になっていっているということと勝手に解釈することにする。

2014-07-27

みんな仲良く

ドラクエ４の作戦みたいだｗ博愛主義とは違うんだろうけど、こういうのなんていうんだろう？

生きていれば、好きになれない人、どうにも馬が合わない人、関わりたくない人なんていくらでも出てくると思う。そんな人との付き合いが発生したときにどうするかという話。あくまで僕個人の話で、これが唯一絶対ではないし、一般論ですらないかもしれないけど・・・(どちらかと言えば単なる愚痴というか精神衛生を保つために何かしら吐き出すという行為に近いw)

こういう話を一般論でするのは多少危険なので、少しだけ具体的な状況を記しておくことにする。相手は、僕のことを散々人格否定した人。ちなみにそれに関しての謝罪はない。状況としては第三者から今のギクシャクした関係をなんとかできないかという話を振られた。これ以上具体的に書くのは辛いのでこれくらい。

個人的にはそんな話が第三者からでてくること自体がおかしな話だと思うのだが、まぁ理解できなくもない範囲であるし、こういった場合にどう答えるのかというのが趣旨なのでそれは置いておく。僕の答えはもちろん「No」である。できるわけがない。

感情的に言えば、相手は僕の人生の中に存在しないレベルでいてほしいくらいになっている。そもそも謝罪もされていないのだから相手方にそんな意思はないと捕らえるのが普通で、そんな人間との関係をどうこうするつもりはない。まぁ、謝罪されたからといって許すつもりは全然ないし、謝罪がほしいとすら思っていない。そもそも、存在してほしくないのだ。

理性的には、そうするメリットがない。幸か不幸か相手の状況を大体把握しているのと、相手から学ぶことはない(IT界隈の人ではないという意味)のと、相手が持つコネクションは僕にとって価値があるものではない。最悪の場合、デメリットが発生する可能性の方が高い(金の無心等)。打算的にみて一切の利がないのであれば、関係を改善する理由はない。

感情的にも理性的にも「No」という答えがでてくるのだ、それこそどうしようもない。こういった場合に求められるのは「大人の対応」ができるかどうかだけである。人それぞれ「大人の対応」の定義に違いがあるとは思うが僕の定義では、必要最小限の会話で波風を立てずただやり過ごす、である。必要最小限の会話なんてビジネスの場でないのであればそれこそ挨拶程度でいいし、それも他に誰かいるときだけでいい。道端でばったり出会ったときは無視すればいい。社交性が求められるのは自分と相手の第三者がいるときだけである。

綺麗ごとだけで生きていける人からすれば、僕の対応は間違っているのかもしれないけれど、残念ながらそんな幸せな世界では生きていないみたいである。全ての人と仲良くする必要はないし、全ての人を許す必要もない、嫌な人は変な波風を立てないように適当にあしらえばいい。あんまり褒められたものではないが、30余年の人生で学んだ処世術の一つである。

2014-07-26

SRFI-22の紹介

(LISP Library 365参加エントリ)

SRFI-22はSchemeスクリプトをUnix上で実行形式であるかのうようにして走らせるためのSRFIです(日本語難しいな・・・)。何を定義しているかといえば、スクリプトの先頭に現れた#! /usr/local/bin/sashのような形式のラインを無視することと、main手続きの呼び出しです。

具体的な例を見てみましょう。(この例では/usr/local/binにSagittariusがインストールされていることを前提としています。)

#! /usr/local/bin/sash

(import (rnrs))

(define (main args)
  (display "hello world!") (newline))

このスクリプトを仮にsrfi-22.scmとして保存し実行権を付与すれば、Unixライクなシェル上で以下のように実行することができます。

$ ./srfi-22.scm

このSRFIで重要になるのは、実は#!の部分だけで、それ以外は処理をトップレベルにおいてしまえば動きます。(R5RS時代にcommand-line手続きがなかったのであればmainの意味は大きいとは思いますが、R6RS以降であれば単なるショートカットくらいの意味しかないでしょう。)

個人的にはもっともよく使っているSRFIの一つですが、割と賛否両論あるみたいです。

今回はSRFI-22を紹介しました。

2014-07-25

R7RS implementation differences

Recently, Gauche 0.9.4 has been out and it now supports R7RS. Now, I have couple of R7RS implementation in my hand, like followings;

Sagittarius (of course)
Chibi Scheme
Gauche
Foment

Well, only 3. I've also tried ~~foment and~~ picrin which ~~are~~ is supposed to be R7RS implementations however it didn't work on Cygwin which is my main environment. (Sorry, I'm Windows user...)

July 28th 2014: Foment compiled on VC2010 has been added.

The purpose of this article is to show some corner cases of R7RS. Sagittarius is R6RS/R7RS and Chibi and Gauche are R5RS/R7RS implementation so there should be something but how much? So I've prepared this test script.

;; #!r7rs is supported only Gauche and Sagittarius...
;; #!r7rs
(import (scheme base)
        (scheme write)
        (scheme read)
        (scheme file)
        (scheme inexact)
        (scheme lazy))

(define (print . args) (for-each display args) (newline))

;; string
;; allow #\null in string?
(print "String range")
(write "abc\x0;def")(newline)

;; file
;; are binary and textual port separated?
(print "Port separation")
(define-syntax check-port
  (syntax-rules ()
    ((_ opener reader data)
     (let ((in (opener data)))
       (guard (e (#t (print "ERROR: "(error-object-message e))
                     (close-input-port in)))
         (print 'opener ":" 'reader)
         (reader in)
         (close-input-port in))))))

;; files
(check-port open-binary-input-file read-char "r7rs-comp.scm")
(check-port open-binary-input-file read-u8 "r7rs-comp.scm")
(check-port open-input-file read-char "r7rs-comp.scm")
(check-port open-input-file read-u8 "r7rs-comp.scm")
;; memory
(check-port open-input-bytevector read-u8 #u8(0 1 2 3))
(check-port open-input-bytevector read-char #u8(0 1 2 3))
(check-port open-input-string read-u8 "abc")
(check-port open-input-string read-char "abc")

;; error object
(print "Error object")
(define-syntax check-error
  (syntax-rules ()
    ((_ pred expr)
     (guard (e (#t (print 'pred ":" (pred e))))
       expr))))
(check-error error-object? (car 'a))
(check-error read-error? (read (open-input-string "#0#")))
(check-error file-error? (open-input-file "this doesn't exist.txt"))

(define-syntax check
  (syntax-rules ()
    ((_ expr)
     (guard (e (#t (print 'expr ":" (error-object-message e))))
       (print 'expr ":" expr)))))

;; sqrt
(print "Procedures")
(check (sqrt 4))

;; promise
(print "Promise")
(check (+ (delay (* 3 7)) 13))

The script is constructed based on what I thought could be a corner case. Mostly port related procedures and error object. Then this is the output;

Sagittarius
String range
"abc\x0;def"
Port separation
open-binary-input-file:read-char
ERROR: textual port required, but got #<file-binary-input-port r7rs-comp.scm>
open-binary-input-file:read-u8
open-input-file:read-char
open-input-file:read-u8
ERROR: "binary-port" required, but got #<transcoded-textual-input-port r7rs-comp.scm utf8-codec>
open-input-bytevector:read-u8
open-input-bytevector:read-char
ERROR: textual port required, but got #<bytearray-binary-input-port>
open-input-string:read-u8
ERROR: "binary-port" required, but got #<string-textual-input-port>
open-input-string:read-char
Error object
error-object?:#t
file-error?:#t
Procedures
(sqrt 4):2
Promise
(+ (delay (* 3 7)) 13):"number" required, but got (13 #<<promise> 0x80599100>)
---
Chibi Scheme
String range
"abc\x00;def"
Port separation
open-binary-input-file:read-char
open-binary-input-file:read-u8
open-input-file:read-char
open-input-file:read-u8
open-input-bytevector:read-u8
open-input-bytevector:read-char
open-input-string:read-u8
ERROR: not a binary port
open-input-string:read-char
Error object
error-object?:#t
read-error?:#t
file-error?:#t
Procedures
(sqrt 4):2
Promise
(+ (delay (* 3 7)) 13):invalid type, expected Number
---
Gauche
String range
"abc\0def"
Port separation
open-binary-input-file:read-char
open-binary-input-file:read-u8
open-input-file:read-char
open-input-file:read-u8
open-input-bytevector:read-u8
open-input-bytevector:read-char
open-input-string:read-u8
open-input-string:read-char
Error object
error-object?:#t
read-error?:#t
file-error?:#t
Procedures
(sqrt 4):2
Promise
(+ (delay (* 3 7)) 13):operation + is not defined between 13 and #<promise 0x80445480>

Following it the output from Foment

String range
"abc^@def"
Port separation
open-binary-input-file:read-char
ERROR: read-char: expected an open textual input port
open-binary-input-file:read-u8
open-input-file:read-char
open-input-file:read-u8
ERROR: read-u8: expected an open binary input port
open-input-bytevector:read-u8
open-input-bytevector:read-char
ERROR: read-char: expected an open textual input port
open-input-string:read-u8
ERROR: read-u8: expected an open binary input port
open-input-string:read-char
Error object
error-object?:#t
file-error?:#t
Procedures
(sqrt 4):2
Promise
(+ (delay (* 3 7)) 13):+: expected a number

Hmmm, not much difference than I expected. The only difference is port related procedures. Gauche is the most tolerant then Chibi, Sagittarius is the strictest one. Well, Sagittarius is an R6RS implementation and it's required to be. Gauche's behaviour is a bit too flexible to me but still consistent. Chibi looks a bit tricky to me. It actually reject to pass texutal port to read-u8 however read-char accept binary port.

Even though, R7RS explicitly says that implementation may reject #\null in strings however none of implementations does it. Interestingly, Foment doesn't support writing #\null. So it writes as it is. It would be better to consider that it doesn't support properly or particially supported. So to write portable string library, users should not depend on null character.

I thought error object could have some difference but not really (or the test script doesn't cover that much).

I was hoping one of the implementations (Chibi or Gauche though) have implicit forcing but the result was none of them.

The difference is small and it's avoidable if users are consistent. So if you write R7RS scripts and it could be run on Sagittarius, then most likely it would run on the other implementations.

2014-07-24

バグは寝かせてみるといい

かなり長めに寝かせていたバグの糸口を掴んだのでメモ。

問題になるのは以下のようなコード

(import (rnrs))
(define-syntax wrap-let
  (syntax-rules ()
    ((_ expr)
     (let-syntax ((foo (lambda (x) (syntax-case x () ((_) #'expr)))))
       (foo)))))

(define-syntax wrap
  (lambda (x)
    (syntax-case x ()
      ((_ expr)
       #'(let ((temp expr))
           (wrap-let (car temp)))))))

(wrap '(a b c))

これがなぜ問題になるのかというのが重要で、展開された後を見るとなんとなく答えが出てくる。以下が展開形コードのイメージ。

(let ((temp.0 '(a b c)))
  (let-syntax ((foo (lambda (x) (syntax-case x () ((_) #'(car temp.0))))))
    (foo)))

問題はここからで、let-syntaxの中にある#'(car temp.0)が原因。Sagittariusのマクロ展開器は現状syntax-caseのパターン及びテンプレートを全てリネームする。さらにsyntaxが展開されるときにもリネームする。そうすると、元々のtemp.0は持っていない構文情報を付加されてしまい、結果的に一意の識別子になってしまう。

ならば、テンプレート変数と分かっている場合はリネームしなければいいだけのように見えるのだが、そうも行かないのが面倒なところ。テンプレート変数はリネームされなければならないのだ。とりあえずマクロ展開形が局所マクロを含んでいるというのは(おそらく)なんとか分かる気がするので、その場合のみを特殊扱いするという感じでいこうかと思う。既に不安材料が見えているのでそれが正しいのかは今一よく分からないのではあるが・・・

このバグ、6ヶ月くらい寝かせてあったのだが、最近になって気運が高まってきたのか原因が掴めるところまできた。どうやら困難なバグほど寝かせておいた方が角が取れて消化しやすくなるらしい。

2014-07-22

packratの機能追加

SagittariusにはChicken由来のpackratライブラリがあるのだが、このライブラリ数量指定子が使えなくて非常に使い出が悪い。例えば、0個以上の何かを取るというのを書くのに以下のようにする必要がある。

(import (rnrs) (packrat))

(define (generator tokens)
  (let ((stream tokens))
    (lambda ()
      (if (null? stream)
          (values #f #f)
          (let ((base-token (car stream)))
            (set! stream (cdr stream))
            (values #f base-token))))))

(define quantifier (packrat-parser
                    top
                    (top ((vs <- item*) vs))
                    (item* ((i <- item i* <- item*) (cons i i*))
                           (() '()))
                    (item (('+) '+))))


(let* ((q (generator '((+) (+) (+))))
       (r (quantifier (base-generator->results q))))
  (parse-result-semantic-value r))
;; -> (+ + +)

BNFで書かれた定義を持ってくる際に数量指定子がないというのは非常に面倒で、パーサを書くのを何度も何度も億劫にしてくれた。quantifierの定義はこう書けた方が直感的である。

(define quantifier (packrat-parser
                    top
                    (top ((vs <- (* item)) vs))
                    (item (('+) '+))))

ずっとほしいと思っていたのだが、そろそろあった方がいいなぁという気運が高まってきたので、えいや！っと中身を読むことにした。

中身を読んで分かったのは、複数回マッチするというのを何とかするAPIが足りていないのが原因ぽかったので、こんなのを追加してマクロ側にも手を入れてみた。

(define (packrat-many parser n m k)
  (lambda (results)
    (define (>=? many max) (and max (>= many max)))
    (define (return-results vs result)
      ;; if we inline map then for some reason step on a bug...
      (let ((vals (map parse-result-semantic-value vs)))
        (merge-result-errors ((k vals)
                              (parse-result-next result))
                             (parse-result-error result))))
    (when (and (zero? n) (eqv? n m))
      (error 'packrat-many "both min and max are zero" n m))
    (let loop ((i 0) (vs '()) (s #f) (results results))
      (if (>=? i m)
          (return-results (reverse! vs) s)
          (let ((result (parser results)))
            (cond ((parse-result-successful? result)
                   (loop (+ i 1) (cons result vs) result
                         (parse-result-next result)))
                  ((<= n i)
                   (return-results (reverse! vs) result))
                  (else result)))))))

なぜか、インライナーのバグを踏んだのはまぁご愛嬌ということで・・・
マクロの方は+、*、?に加えてマッチ回数を指定できる=も入っている。長いのでここには載せない。これで多少便利になる気がする。

2014-07-14

ライブラリの作られるタイミング

Chaton Gauche部屋から流れてきたこれ

REPLからエラーの起きるライブラリを繰り返しr7rs#importすると，エラーが起きなくなります．

;; temp.scm
(define-library (temp)
(begin
(f)))

;; REPL
gosh> (import (temp))
*** ERROR: Compile Error: Compile Error: unbound variable: f

"(standard input)":1:(import (temp))

Stack Trace:
_______________________________________
0 (eval expr env)
At line 179 of "/usr/local/share/gauche-0.9/0.9.4-rc2/lib/gauche/interactive.scm"
gosh> (import (temp))
#<undef>
gosh[r7rs.user]>

全く同じ問題がSagittariusでも起きる。原因も分かっていて、コンパイラがコンパイル時にライブラリを作ってしまうけど、エラー自体はランタイムで起きるのでimportが呼ばれた際にはライブラリはあるものとして扱われてしまうから。

あまりうまい解決策が思いつかないんだけど、とりあえずぱっと思いついたのがVM側で何とかするというもの。っが、実際に実装しかけては無理だなぁということが分かってしまったので破棄。案としては、evalの中身をwith-exception-handlerで囲ってしまい、エラーが起きた際に現在のライブラリと保存されたライブラリが異なっていたら現在のライブラリを破棄するというもの、だったのだが、これはライブラリの変更がインストラクションでのみ発生するということに依存している上に、それ以外の方法で変更することが可能なので火を見るより明らかにだめだろうという・・・

次に思いついたのが、ライブラリのコンパイル全体をwith-exception-handlerで囲ってしまうとうもの。実際の手順としては以下のようになる予定。

ダミーのライブラリを作る
実行コードをwith-exception-handlerで囲う
コンパイル及び実行

エラーがあればダミーを破棄
なければ正式名で作成及び中身をコピー

こっちはまだ実装をしていないのだが、見えているだけで以下のデメリットががが

エラーが存在しない場合でも呼ばれる

キャッシュがでかくなる

余分な手続きを2つ作る必要があるため、性能に影響が出そう
コピーが重い
マクロ展開がライブラリを使用しているのだけれど・・・

まぁ、ダミーのライブラリを作る必要はないかもしれないのでそこは下2つは無視できるかもしれないが。

REPLでしか起きない問題なので多少以上にやる気が低いのも問題だろう・・・

2014-07-11

SRFI-19の紹介

(LISP Library 365参加エントリ)

SRFI-19は日時を扱うライブラリです。関連性が強いのでSRFI-18と同じ著者だと思っていたのですが、実は違うということに今日気づきました。新たな発見です。

このSRFIでは時間と日付の両方を扱います。まずは簡単に使い方を見てみましょう。

(import (rnrs) (srfi :19))
;; simple stop watch
(define-syntax time
  (syntax-rules ()
    ((_ expr)
     (let* ((s (current-time))
            (r expr)
            (e (current-time)))
       (display "execution time: ")
       (display (time-difference e s))
       (newline)
       r))))

(define (tak x y z)
  (if (not (< y x))
      z
      (tak (tak (- x 1) y z)
           (tak (- y 1) z x)
           (tak (- z 1) x y))))

(time (tak 21 14 7))
;; prints
;; <time-duration 0.0000000>
;; -> 14

どのような形で表示されるかは処理系依存なので、値を真面目に取得するにはtime-secondやtime-nanosecond等のアクセサを使います。

current-timeで現在の時間オブジェクトを取得するのがよく使われると思いますが、計算して時間を求めることも可能です。その際にはmake-time及び計算用の手続きを使います。

(define one-hour (make-time time-duration 0 3600))
(let ((c (current-time)))
  ;; I don't want to do anything for an hour...
  (thread-sleep! (add-duration c one-hour)))

make-timeは最初に時間タイプを受け取ります。このSRFIでは以下のタイプが定義されています。

time-duration
時間の差分を扱います
time-monotonic
Monotonicな時間を扱います
time-process
現在のプロセス時間を扱います
time-tai
TAI時間を扱います
time-thread
現在のスレッド時間を扱います
time-utc
UTC時間を扱います

処理系によってはプロセス時間とスレッド時間はサポートしていない可能性があります。

日付を扱う手続きのいくつかを見てみましょう。

(current-date)
;; -> date object

(date->string (current-date))
;; -> "Sat Jul 12 19:03:24+0100 2014"

(date->string (current-date) "~Y-~m-~d")
;; -> "2014-07-12"

(string->date "2014-07-12" "~Y-~m-~d")
;; -> date object

(date->julian-day (current-date))
;; -> 7075728736710773/2880000000

(julian-day->date 7075728736710773/2880000000)
;; -> date object

上記以外にも日付⇔時間、修正ユリウス暦⇔日付or時間、TAI時間⇔UTC時間等の変換手続きが用意されています。

個人的に日付⇔文字列の変換は拡張性が無く、またXSD等が採用しているISO-8601のフォーマットのタイムゾーン形式に対応していなかったりと、多少使いにくい点があります。日付のアクセサは用意されているのでその辺は自前で書けということなのかもしれませんが・・・

今回はSRFI-19を紹介しました。

Syntax highlighter