The R7RS benchmark showed that I/O was slow in Sagittarius. Well, not quite: in the R7RS library, read-line is defined in Scheme whilst get-line is defined in C. This is one of the reasons why it's slow. There is another reason that makes I/O slow, which is port locking.
Sagittarius guarantees that when an object is passed to a port, it is written whole and in order, even in a multi-threaded script. For example:
(import (rnrs) (srfi :18) (sagittarius threads))
(let-values (((out extract) (open-string-output-port)))
  (let ((threads (map (lambda (v)
                        (make-thread (lambda ()
                                       (sys-nanosleep 3)
                                       (put-string out v)
                                       (newline out))))
                      '("hello world"
                        "bye bye world"
                        "the red fox bla bla"
                        "now what?"))))
    (for-each thread-start! threads)
    (for-each thread-join! threads)
    (display (extract))))
This script won't print shuffled characters: each sentence is written whole, though the sentences themselves may come out in any order.
To make this work, each I/O call from Scheme locks the given port. However, if the value being read or written is a single byte, then locking is not needed. Now we need to consider two things: one is characters and the other is custom ports. Reading or writing a character may involve multiple I/O operations because we need to handle Unicode, and we can't know ahead of time what a custom port would do. Thus, for a binary port, we don't have to lock unless it's a custom port, and for a textual port, we can skip the lock if it's a string port.
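The rule above can be written down as a small decision procedure. This is only a sketch of the policy as described in this post, not the actual C code in Sagittarius: the name needs-lock? and the symbols used to describe a port ('binary, 'textual, 'file, 'string, 'custom) are made up for illustration.

```scheme
(import (rnrs))

;; Hypothetical model of a port, for illustration only:
;;   type   : 'binary or 'textual
;;   source : 'file, 'string or 'custom
;; Returns #t when an I/O call on such a port should take the port lock.
(define (needs-lock? type source)
  (cond ((eq? source 'custom) #t)          ; can't know what a custom port does
        ((eq? type 'binary) #f)            ; non-custom binary port: no lock
        ((and (eq? type 'textual)
              (eq? source 'string)) #f)    ; textual string port: no lock
        (else #t)))                        ; any other textual port keeps the lock
```

With this model, (needs-lock? 'binary 'file) and (needs-lock? 'textual 'string) are #f, while (needs-lock? 'textual 'file) and anything with 'custom are #t.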
Now, how much performance impact does this change have? The following are the results for the current version and the HEAD version:
% ./bench sagittarius tail
Testing tail under Sagittarius
Compiling...
Running...
Running tail:10
real 0m26.155s
user 0m25.568s
sys 0m0.936s
% env SAGITTARIUS=../../build/sagittarius ./bench sagittarius tail
Testing tail under Sagittarius
Compiling...
Running...
Running tail:10
real 0m19.417s
user 0m18.703s
sys 0m0.904s
Well, not too bad: roughly 26% less wall-clock time. Plus, this change is not specific to this particular benchmark, which uses read-line, but is a generic performance improvement. Now we can finally change the implementation of the procedure in question. The difference between get-line and read-line is how they handle end of line. R7RS decided to handle '\r', '\n' and '\r\n' as end of line for convenience, whilst R6RS only needs to handle '\n'. The following is the result of implementing read-line in C:
% env SAGITTARIUS="../../build/sagittarius" ./bench -o -L../../sitelib sagittarius tail
Testing tail under Sagittarius
Compiling...
Running...
Running tail:10
real 0m5.031s
user 0m4.492s
sys 0m0.795s
Well, it's as I expected, so no surprise.
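For reference, the end-of-line difference between get-line and read-line described above can be sketched in portable R6RS Scheme. This is not the C implementation that was benchmarked, nor the Scheme one it replaced; the names read-line/r7rs-eol and lines-of are hypothetical helpers written just for this illustration.

```scheme
(import (rnrs))

;; Hypothetical name, not the actual Sagittarius implementation.
;; Reads one line from a textual input port, treating "\n", "\r"
;; and "\r\n" as end of line.
(define (read-line/r7rs-eol port)
  (let loop ((chars '()))
    (let ((c (get-char port)))
      (cond ((eof-object? c)
             ;; EOF with nothing read yields the EOF object itself
             (if (null? chars) c (list->string (reverse chars))))
            ((char=? c #\linefeed)
             (list->string (reverse chars)))
            ((char=? c #\return)
             ;; swallow the "\n" of a "\r\n" pair, if there is one
             (let ((next (lookahead-char port)))
               (when (and (char? next) (char=? next #\linefeed))
                 (get-char port)))
             (list->string (reverse chars)))
            (else (loop (cons c chars)))))))

;; Collects every line of str using the given line reader.
(define (lines-of reader str)
  (let ((in (open-string-input-port str)))
    (let loop ((acc '()))
      (let ((line (reader in)))
        (if (eof-object? line)
            (reverse acc)
            (loop (cons line acc)))))))
```

Calling (lines-of read-line/r7rs-eol "a\r\nb\rc\nd") gives ("a" "b" "c" "d"). Passing get-line instead should, going by the R6RS definition, split only at '\n' and leave the '\r' characters inside the lines.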