時の羅針盤＠blog: 2014-08

2014-08-30

2014-08-27

最適化

ちょっとした最適化をコンパイルに入れた。具体的にはインポートした束縛でキャッシュ可能なものは定数展開してしまうというもの。以下がその例。

(library (foo)
    (export a)
    (import (rnrs))
  (define a 'a))
(import (rnrs) (foo))
;;(set! a 'b) ;; uncomment this would also work...
(disasm (lambda () (print a)))

こんな感じのライブラリをインポートすると

;; size: 5
;;    0: CONST_PUSH a
;;    2: GREF_TAIL_CALL(1) #<identifier print#user (0x80402708):0>; (print a)
;;    4: RET

こんな感じで、GREF_PUSHの変わりにCONST_PUSHが使われるというもの。これだけだと、実はあんまり嬉しくないんだけど、これにコンパイル時の手続き呼び出しが加わるとかなり最適化ができる。例えばaが文字列"1234"で、スクリプトが(print (string->number a))だとすると、string->numberがコンパイル時に解決されて数値1234が直接VMインストラクションとして出される。まぁ、そんなに上手いことはまるケースは少ないだろうけど、こういうのは入れておくと後で効いてくるというのが経験側から学んだことなので入れておいて損はないだろう。

現状で気に入らないのはset!の挙動なのだが、再定義と代入は別ものとしてエラーにした方が精神衛生上いいような気がする。スクリプト上の環境(+evalの環境)は再定義+代入可能にしているのでこれが可能なのだが(#!compatibleをつけても同様)、禁止するとREPL上でも禁止になるという弊害もある。そうは入ってもREPL上で(set! + 'hoge)とかやるかという話にもなるのだが、例えばMoshとChezはこれを禁止している(psyntaxがかね？)。ちなみに、NMoshとYpsilonは許容している(NMoshは奇妙な挙動をするけど)。

ちなみに、再定義もしくは代入された後は定数展開されない。というか、同一ライブラリで定義された場合はされないという話。これは、例えばGambit benchmarkのcompilerみたいなの対策だったりする(展開されると正しい結果が出ない場合)。ただし、library及びdefine-libraryフォーム内では別のロジックが走るので、展開される可能性はある。でも、よく考えれば、定数しかやらないので展開してしまってもいいかもしれない。後で考えよう。

追記
いや、やっぱり同一ライブラリ内では駄目だな。以下のようなのが困る。

(define b 'b)
(define (foo) (print b))
(foo)
(set! b 'c)
(foo)

これを定数展開してしまうと、二度目のfooが正しく動かない。

追記2
インポートされた束縛に対するset!は禁止することにした。再定義はOK。そういえばこの変更でR6RS/R7RS準拠のevalが書けなくもないのだが(定義禁止等)、まぁそこまでする必要は今のところないかなぁ。

2014-08-23

SRFI-26の紹介

(LISP Library 365参加エントリ)

SRFI-26はカリー化を除いたパラメタの特殊化表記です。日本語にすると意味が分かりませんが(正しく訳せているのかすら疑問) 、要するにcutとcuteです。どちらも手続きを作成するマクロです。使い方は以下：

(cut cons a <>)
;; -> (lambda (tmp) (cons a tmp))

(cut list a <> <>)
;; -> (lambda (tmp1 tmp2) (list a tmp1 tmp2))

(cut list a <> <...>)
;; -> (lambda (tmp1 . xs) (apply list a tmp1 xs))

(cute cons (+ a 1) <>)
;; -> (let ((a1 (+ a 1))) (lambda (tmp) (cons a1 tmp)))

どちらのマクロも第一引数を手続きとみなし、残りをその手続きの引数とします。<>及び<...>はプレースホルダーで、作成された手続きに渡される引数に置換されます。出現する順番どおりに引数に展開されます。cutは引数をそのままlambdaに展開し、cuteの方はプレースホルダーである<>以外の引数を評価を一度だけ行うことを保障します。

プレースホルダーの使い方ではまるのが以下の例です。

(cut cons a (list b <>))
;; -> (lambda () (cons a (list b <>)))

プレースホルダーはcutもしくはcuteと同じ深さにいないといけません。個人的にこれは不便だなぁと思う点ではあるのですが、ネストを考慮するとsyntax-rulesで書けない(もしくはものすごく頑張らないといけない)ので、まぁしょうがないのではないでしょうか。

今回はSRF-26を紹介しました。

2014-08-19

Performance turning - I/O

The R7RS benchmark showed that I/O was slow in Sagittarius. Well not quite, in R7RS library read-line is defined in Scheme whilst get-line is defined in C.This is one of the reason why it's slow. There is another reason that makes I/O slow which is port locking.

Sagittarius guarantees that passing a object to the port then it writes it in order even it's in multi threading script. For example;

(import (rnrs) (srfi :18) (sagittarius threads))

(let-values (((out extract) (open-string-output-port)))
  (let ((threads (map (lambda (v) 
   (make-thread (lambda ()
           (sys-nanosleep 3)
           (put-string out v)
           (newline out))))
        '("hello world"
   "bye bye world"
   "the red fox bla bla"
   "now what?"))))
    (for-each thread-start! threads)
    (for-each thread-join! threads)
    (display (extract))))

This script won't have shuffled values but (maybe random order) whole sentence.

To make this, each I/O call from Scheme locks the given port. However if the reading/writing value is a byte then the locking is not needed. Now we need to consider 2 things, one is a character and the other one is custom ports. Reading/writing a character may have multiple I/O because we need to handle Unicode. And we can't know what custom port would do ahead. Thus for binary port, we don't have to lock unless it's a custom port. And for textual port, we can use string port without lock.

Now how much performance impact with this change? Following is the result of current version and HEAD version:

% ./bench sagittarius tail
Testing tail under Sagittarius
Compiling...
Running...
Running tail:10

real    0m26.155s
user    0m25.568s
sys     0m0.936s

% env SAGITTARIUS=../../build/sagittarius ./bench sagittarius tail

Testing tail under Sagittarius
Compiling...
Running...
Running tail:10

real    0m19.417s
user    0m18.703s
sys     0m0.904s

Well not too bad. Plus this change is not for this particular benchmarking which uses read-line but for generic performance improvements. Now we can finally change the subjective procedures implementation. The difference between get-line and read-line is that handling end of line. R7RS decided to handle '\r', '\n' and '\r\n' as end of line for convenience whilst R6RS only needs to handle '\n'. Following is the result of implementing read-line in C.

% env SAGITTARIUS="../../build/sagittarius" ./bench -o -L../../sitelib sagittarius tail

Testing tail under Sagittarius
Compiling...
Running...
Running tail:10

real    0m5.031s
user    0m4.492s
sys     0m0.795s

Well it's as I expected so no surprise.

Performance turning - Windows (MSVC)

It turned out that Windows version of Sagittarius is extremely slow. If I try ack on Cygwin version then it can run about 60 sec (Core i7 3.40 GHz). Following is the result of three implementations that previous article mentioned.

% time gosh -r7 ack.scm
gosh -r7 test.scm  130.14s user 4.26s system 120% cpu 1:51.70 total

% time sagittarius ack.scm
sagittarius test.scm  54.48s user 3.01s system 124% cpu 46.022 total

% time chibi-scheme ack.scm
chibi-scheme test.scm  49.84s user 0.00s system 99% cpu 50.170 total

And this is the inside of "acm.scm".

(import (scheme base))
(define (ack m n)
  (cond ((= m 0) (+ n 1))
        ((= n 0) (ack (- m 1) 1))
        (else (ack (- m 1) (ack m (- n 1))))))

(ack 3 12)

Apparently Chibi is the fastest. And Sagittarius is not so slow. Well, I don't know why Gauche is slow in Cygwin environment. I think this is simply the performance of VM and its stack overflow handling. (at least on Sagittarius, not sure for others.)

Now let's see how is the Windows version's performance. Following is the result of ack using Cygwin's time command to see how slow it is.

D:>\cygwin\bin\time sash.exe ack.scm
0.00user 0.00system 1:12.13elapsed 0%CPU (0avgtext+0avgdata 220928maxresident)k
0inputs+0outputs (901major+0minor)pagefaults 0swaps

I don't know why it has some garbage but 30% slower. So indeed it is Windows version's issue. But why?

I think the reason why is it does not use direct threading due to the limitation of MSVC (if we use it should improve approx 30% of performance, so about 21 sec in this case?). However I've found this article: How not to make a virtual machine (label-based threading). Even though this article concluded not to emulate direct threading on MSVC however it wouldn't hurt to give it a shot.So I've modified VM a bit to emulate it (sources are in msvc32-emulate-direct-threading branch). Then run the above script.

Here is the result;

D:>\cygwin\bin\time.exe build\sash.exe ack.scm
0.00user 0.00system 1:40.56elapsed 0%CPU (0avgtext+0avgdata 220928maxresident)k
0inputs+0outputs (901major+0minor)pagefaults 0swaps

Well, it's always better to believe what people have done, I guess...So this really doesn't work. Now we only have 2 options, giving up or adopt to MinGW. Hmmmm.

2014-08-17

性能差

@SaitoAtsushiさんがLarcenyにあるR6RSベンチマークをR7RS用にして走らせた結果をツイートしていた。

R7RS 処理系のベンチマークとった。 Gauche 激速！ Sagittarius が次点。 Chibi はとんでもなく遅い。 Foment や Husk は測定するのが非現実的なくらい遅くて諦めた。 pic.twitter.com/eduG04M3FD
— 齊藤敦志 (@SaitoAtsushi) August 17, 2014

Gaucheが速いのは分かっていたことなのだが(いずれ追い越す予定…未定…)、Chibiより遅いベンチがあるのは正直まずい。Chibiはランタイムが小さいためほとんどの手続きがSchemeで書かれているという特徴があるのだが、それより遅いのがあるということは基本的なVM等の動作が遅い可能性がある。

とりあえずChibiより遅いのが以下のベンチ

ack
cat
nucliec
quicksort
simplex
slatex
gcbench
mperm

こいつらに関わる手続きはすぐにでもパフォーマンスチューニングする必要がある。特にackなんて本当にシンプルなのでVMの性能がまずいということになる・・・(涙)

次いで、Gaucheに圧倒的に負けてるやつ

ctak
sum
sumfp
mbrot
mbrotZ
pnpoly
tail
read1
lattice
nqueens

結構あるのだが、このうち浮動小数点が関わってくるやつ(sumfp、mbrot及びmbrotZ)はちと分が悪いので後回し。Gaucheは浮動小数点数をある程度レジスタに持つのでキャッシュが効くうちはメモリ割付が発生しない(はず)。テキストポートが絡んでくるI/O周りも多少分が悪いところがあるのではあるが、何とかしたいところ。nqueensはどちらも絡まないのでVMもしくはコンパイラということになる。

所詮マイクロベンチマークという見方もあるかもしれないが、こういう地味な性能差は地道なチューニングの結果なのでここで手を抜くのはまずいというのが僕の考え方。

2014-08-12

What do you expect to be printed?

I was testing how include on Sagittarius works and found out something interesting.

Firstly, I've prepared 4 files related to library. The contents are very simple. Like this;

;; foo.sld
(define-library (foo)
    (import (scheme base))
    (export bar)
  (include "impl/bar.scm"))

;; impl/bar.scm
(include "buzz.scm")

;; impl/buzz.scm
(define bar 'bar)

;; buzz.scm
(define bar 'buzz)

Now I put those files following directory structure;

./ + foo.sld
   + impl
       + bar.scm
       + buzz.scm
   + buzz.scm

Then, wrote following script;

(import (scheme base) (scheme write) (foo))
(display bar) (newline)

The question is, what would be the expected value to be printed 'bar or 'buzz? I first expected to be print 'bar with Sagittarius but it was 'buzz. Then checked with Chibi which also printed 'buzz. At this point I thought, oh R7RS de-facto would be 'buzz. Well it wasn't. I've also check with Gauche and Foment and these printed 'bar.

I know why this happens actually. When the first include was processed then the current loading path returned to the same directly as "foo.sld". Then second include "buzz.scm" is processed. I don't know how Gauche and Foment do this properly (well R7RS doesn't specify how to resolve the path so both are proper behaviour though).

Well, by now I've never seen this type of code in anywhere other than in my test. But I would like to go more intuitive behaviour.

13th August, 2014
I've modified compiler to be able to handle above case. What I've changed is that

include and include-ci returns read expression and filename
compiler sets current load path after include is done according to above filenam

It's a bit ugly way to fix so I need to think how we can clean it up now.

2014-08-11

SQL in S-Expression

There is SXML which represents XML in S-Expression. If you're a professional programmer, it is very difficult to avoid using SQL (well, if you could, you are very lucky).

Writing SQL isn't so bad if the file is separated or the editor is smart enough to edit. However it's usually none of the case so I always need to suffer especially writing it in double quote. Now, Emacs is de-facto editor for all programmers (vim? why are you talking about jeans here? V.I.M.) and it's by default very good with editing s-expression. So if you can write SQL in s-expression then the world would be happier than before.

Now, there are bunch of projects that have the same idea I've got, CLSQL, SxQL, S-SQL for example (I think Clojure also has something similar but I'm not so familiar). The problem of them are very simple. It's using keyword which is not in R6RS/R7RS. Moreover, those use either reader macro or CLOS so not pure s-expression. What I want is light weight but easy to edit. (well, it is actually ok for me, Sagittarius has all of them anyway.)

So I'm thinking something like following;

(define ssql 
  '(select ((p id) (as (p first_name) name) (a street))
    (from (as person p))
    (inner-join (as address a) (on (= (p id) (a id_person))))
    (where (= (a country) "Nederlands"))))

(ssql->sql ssql)
;; "select p.id, p.first_name as name, a.street
;;  from person as p
;;  inner join address as a on p.id = a.id_person
;;  where a.count = 'Netherlands'"

Hmmm, it looks parentheses are a bit too many so the readability is a bit lowered. Basically it doesn't have to be fancy but readable and easy to remember. So one-by-one mapping is ok. (if I want OR-mapper, then I just need to construct it on top of it.)

If there is something similar in Scheme or better idea, I would like to know.

2014-08-08

SRFI-25の紹介

(LISP Library 365参加エントリ)

SRFI-25は多次元配列を扱うSRFIです。多次元配列自体はリストでもベクタでも表現できますが、このSRFIは2次元以上の配列の扱いをより自然にすることを目的としています。

使い方を見てみましょう。

(import (srfi :25))
;; create 3 dimension array, each dimension
;; contains bound start with 0 and end with 4
;; (exclusive)
(define arr (make-array (shape 0 4 0 4 0 4)))

;; set 'e1 on position x = 2, y = 2, z = 2
(array-set! arr 2 2 2 'e1)
;; -> unspecified value

;; ref it
(array-ref arr 2 2 2)
;; -> e1

;; returns dimensions of array
(array-rank arr)
;; -> 3

ベクタやリストを使うとアクセスが複雑になるところが、より直感的に操作可能になります。

もう少し複雑な例を見てみましょう。手続きshapeは配列境界のペアを受け取ります。例えば以下のように書けば0以外の数字から始まる配列も書くことが可能です。(配列が3次元を超えるを僕の理解の範疇を超えてくるので2次元で・・・)

;;; first dimension has 2 elements start with 4 end with 5
;;; second dimension has also 2 elements start with 5 end with 6
;;; the array is initialised with given objects as following
;;;   5  6 
;;; 4 nw ne
;;; 5 sw se
(array (shape 4 6 5 7) 'nw 'ne 'sw 'se)

このSRFI自体には含まれないのですが、このSRFIの著者はarlib.scmをSRFIに付随させており、その中にはtubulate-arrayやarray-equal?といった便利手続きが定義されています。(なんでSRFI自体には含めなかったんだろう？)

今回はSRFI-25を紹介しました。

2014-08-05

LaTeX用環境構築

Schem Workshop 2014の主催者から「Sagittariusについてペーパー書かない？」というメールが来たのが昨日。これも何かの縁だろうということで頑張って書こうかと決めたのが昨日。Windows用のLaTeX環境が絶望的に乏しいことに気付いたのも昨日。昨日はなんだかいろいろあったなぁw

世の中にはCloud LaTeXなるものもあるらしく、おとなしくそれを使っておけという話なのかもしれないが、それらがEmacsキーバインドをサポートしているかどうかというのと、期日が9月5日という1ヶ月しかない中でサービス探しつつLaTeXを覚える(そして英語論文の書き方も・・・)というのをこなすのはちときついので、Emacsに環境を作ってしまおうという方に傾いてみることにした。軽くググッてみると世の中にはAUCTeXなるものがあるらしい。とりあえずこいつをLaTeXのプリビューが出来るまで動かせれば勝ちである。

とりあえず簡単な手順。

ダウンロードページからAUCTeXを落としてくる。
libpng14-14.dllとzlib1.dllの32ビット版をEmacsのbinフォルダに置く(PNGサポート)
MikTeXをインストールしてパスを通す
64ビット版を入れたが正しいかは不明
Ghostscriptの32ビット版をインストールしてパスを通す
最初64ビット版を入れて、GSWIN32C.EXEがないと言われてはまった

これでいいはずなんだけど、スクリーンショットのpreview-latexのようには見えない件。error in process sentinel: preview-reraise-error: End of Preview snippet 9 unexpectedなんていわれる。何かが足りてないのだろうが、何が足りてないのかは不明・・・しかし、なんで選択肢がMicorsoft WordかLaTeXというのがいけないんだ。OOoならあるのに・・・

ペーパー書く以外にも問題があるのだけど、そいつらは書けてから考えることにする。(USに入るための手続きとか、クレジットカード作らないととか、パスポートの更新とか、まぁ割りとたくさんあるなぁ・・・)

2014-08-04

Gaucheライクな文字セットリーダをサポートしてみた

事の発端は(text parse)のテストケースがなかったこと。

この数日バイナリの扱いを楽にするためのライブラリを書いていたのだが、(text parse)と対になる(binary parse)を格段になってテストケースがないことに気付いた。っで、Gaucheからテストケースを拝借しようと思ったら、文字セットのところで躓いたという話。Sagittariusはリーダーマクロをサポートしているのでさくっと書いてしまえばいい話ではあるのだが、この形式の読み取りで躓いたのはこれで何度目か分からないので。

以下のように使える。

#!read-macro=char-set
#[a-zA-Z]
;; => #<char-set ...>

あくまで読み取りで、書き出しの方は何もしてない。(pp)は変更してあるのでそれを使うということで。

実装は以下。

(library (char-set)
    (export :export-reader-macro)
    (import (rnrs)
            (sagittarius reader)
            ;; we don't want to export regex reader macro with this
            (only (sagittarius regex) parse-char-set-string))

  ;; Gauche compatible charset
  (define-dispatch-macro #\# #\[ (char-set-reader port subchar param)
    (define (read-string in out)
      (let loop ((depth 0))
        (let ((c (get-char in)))
          ;; we just need to put all chars until we got #\]
          ;; c could be eof so check
          (when (char? c) (put-char out c))
          (cond ((eof-object? c) 
                 (raise (condition
                         (make-lexical-violation)
                         (make-who-condition 'char-set-reader)
                         (make-message-condition "unexpected EOF"))))
                ((char=? c #\[) (loop (+ depth 1)))
                ((char=? c #\])
                 ;; done?
                 (unless (zero? depth) (loop (- depth 1))))
                (else (loop depth))))))
    (let-values (((out extract) (open-string-output-port)))
      ;; need this unfortunately
      (put-char out #\[)
      (read-string port out)
      (parse-char-set-string (extract))))
)

parse-char-set-stringは新たに足された手続き。インポートの際に問答無用でインポートしたライブラリのリーダーマクロを引き継ぐようになっていたのを修正。

スクリプト内で使う分にはこれで十分なはず。

2014-08-01

SRFI-23の紹介

(LISP Library 365参加エントリ)

SRFI-23はエラー通知について定めたSRFIです。要するにerror手続きです。使い方は以下のようになります。

(error "error reason" some error related objects)

たったこれだけです。このSRFIではエラーが通知された際に何が投げられる等のことは定義されていません。参照実装は潔く単にメッセージと原因のオブジェクトをプリントした後、処理系が何かしらエラーを投げると思われる手続きの呼び方をして終わりです。(例:(car 'a)等)

振る舞いはともかくとして、このSRFIはR7RSに丸ごと取り入れられています。R7RSではエラーオブジェクトを投げるとされていますが、エラーオブジェクトが何かということは触れられていません。

ちなみに、R6RSにも同名の手続きがありますが、こちはら引数の個数が違うので注意が必要です。また、投げられる例外オブジェクトについても細かく既定されています。これもR6RSとR7RSの非互換の一つです。

今回はSRFI-23を紹介しました。

時の羅針盤＠blog

Let's start Scheme