(net server)
and on top of this library I've written a simple HTTP server and web framework, Paella. I don't use them in performance-critical situations, so performance doesn't really matter for now. However, when you write something, you want to know how good or bad it is, don't you? So I ran a simple benchmark and found out it's horrible.

I created a very simple static page with Plato, which is a web application framework bundled with Paella. It just returns an HTML file (although it does have some overhead...). It looks like this:
(library (plato webapp benchmark)
    (export entry-point support-methods)
    (import (rnrs) (paella) (plato) (util file))

  (define (support-methods) '(GET))

  (define (entry-point req)
    (values 200 'file
            (build-path (plato-current-path (*plato-current-context*))
                        "index.html")
            '("content-type" "text/html"))))

The index.html file contains 200 bytes of data. I don't have a modern HTTP benchmarking tool like ApacheBench installed (because I'm lazy), so I just used cURL and a shell script. The script looks like this:
#!/bin/sh
invoke () {
    curl http://localhost:8500/benchmark > /dev/null 2>&1
}
call () {
    for i in `seq 1 1000`; do
        invoke &
    done
}
call
wait

It just creates 1000 background processes and waits for all of them.
The benchmark was run with the default startup script that Plato generates, so the number of server threads is 10. This is the result:
$ time ./benchmark.sh
./benchmark.sh  4.89s user 3.77s system 313% cpu 2.764 total

I ran it a couple of times and the average is approximately 3 seconds per 1000 requests, so about 300 req/s. It's slow.
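As a sanity check on the numbers: the req/s figure is just the request count divided by the wall-clock total reported by time. A quick back-of-the-envelope calculation for the run above:

```python
# Back-of-the-envelope throughput check; the figures come from the
# benchmark run quoted above ("2.764 total").
requests = 1000
total_seconds = 2.764            # wall-clock total of the 1000-request run
throughput = requests / total_seconds
print(round(throughput))         # roughly 360 req/s for this particular run
```

This particular run works out to roughly 360 req/s; the ~3 second average across runs is where the ~300 req/s figure comes from.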
If I run the same benchmark with 10 requests, the result looks like this:
$ time ./benchmark.sh
./benchmark.sh  0.05s user 0.05s system 249% cpu 0.040 total

And with 1 request it looks like this:
$ time ./benchmark.sh
./benchmark.sh  0.01s user 0.01s system 77% cpu 0.025 total

So up to the number of threads, I can assume it does reasonably well; at least the time doesn't increase 10 times. But with 100 requests, it's about 7 times more:
$ time ./benchmark.sh
./benchmark.sh  0.49s user 0.35s system 285% cpu 0.293 total

Going from 1 to 10 requests roughly doubles the time, but 10 to 100 is about 7 times, and 100 to 1000 is about 10 times. Something doesn't seem right to me.
Why is it so slow, and why does it get slower as the number of requests grows? I think there are a couple of reasons. The
(net server)
library uses a combination of select(2) and multithreading. When the server accepts a connection, it tries to find the least used thread and pushes the socket to that thread. The thread calls select to check if there is something to read, then invokes the user-defined procedure. After the invocation, it checks whether any sockets have been closed and waits for input with select again. So the flow is like this (n = number of threads, m = number of sockets per thread):
- Find the least used thread: O(nm) (best case O(1), if none of the threads is in use)
- Push the socket to that thread: O(1)
- Handle the request: O(m)
- Clean up sockets: O(m)
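To illustrate why the first step is the expensive one, here is a rough sketch of the dispatch pattern described above. This is not the actual (net server) code; Worker, find_least_used and dispatch are invented names, and the comments only mirror the cost analysis in the list:

```python
# Hypothetical sketch of the dispatch loop described above, NOT the
# actual (net server) implementation.

class Worker:
    def __init__(self):
        self.sockets = []          # sockets currently assigned to this thread

    def load(self):
        # In the real server, measuring a thread's load can mean inspecting
        # its socket list, which is what makes the scan O(nm) overall.
        return len(self.sockets)

def find_least_used(workers):
    # Every worker is examined on every accepted connection.
    best = workers[0]
    for w in workers[1:]:
        if w.load() < best.load():
            best = w
    return best

def dispatch(workers, sock):
    # Called once per accepted connection; the push itself is O(1).
    find_least_used(workers).sockets.append(sock)

workers = [Worker() for _ in range(10)]
for conn in range(25):                 # pretend 25 connections were accepted
    dispatch(workers, f"sock-{conn}")
print([w.load() for w in workers])     # connections end up spread evenly
```

The scan keeps the load balanced, but its cost grows with both the thread count and the per-thread socket count, which matches the degradation seen as the request count rises.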
To make it faster, I made the following changes:
- Added a load-balancing thread which simply manages a priority queue
- Made the dispatcher just ask the queue which thread is least loaded
- Cleaned up the code
- Used
(util concurrent shared-queue)
instead of manually managing sockets and locks
- Stopped assuming that a socket whose write side has been shut down is no longer in use
- more...
After these changes, the result looks like this:

$ time ./benchmark.sh
./benchmark.sh  4.61s user 3.76s system 317% cpu 2.633 total

YAHOOOOO!!!! 100ms faster!!! ... WHAAAATTTT!???
Well, on average it's now 2.6 seconds per 1000 requests, so it is a bit faster, by about 300ms-400ms. And using
(util concurrent)
made the server itself more robust (it sometimes hung before). I think the server framework itself is not too bad; the HTTP server on top of it is the problem. So that would be the next step.