Let's start Scheme

2015-06-05

getaddrinfo on Cygwin

I think I found a bug on Cygwin or could already be known issue but it kinda took my half a day so might be good to share.

The problem is related to getaddrinfo as this article's title says. The story starts from that I've re-written the executor on (util concurrent), then for some reason getting the error which says 'Name or service not known' which return code is '8'. This error has never happened before even though I was using the library heavily on my testing (if you see the test runner, it uses this library). So first I thought this was because of the re-writing.

So I've checked the code which raises the error message and reached the procedure get-addrinfo. The procedure is just a thin wrapper of getaddrinfo(7) so I've started been doubting. So I've asked my precious body, Google, why this happens, then I've found this Python issue. Sounds pretty much similar. Could it be? So I've wrote the following script to check if this is related to multi threading but my new library.
(import (rnrs) 
        (srfi :18)
        (sagittarius socket))

(define s 
  (thread-start!
   (make-thread
    (lambda ()
      (define sock (make-server-socket "5000"))
      (print 'up)
      (let loop ()
        (let ((cs (socket-accept sock)))
          (socket-send cs (string->utf8 "hello"))
          (loop)))))))

(thread-yield!)
(thread-sleep! 0.1)

(print 'try)

;; uncomment this would make next thread work.
#;(let ((cs (make-client-socket "localhost" "5000")))
  (print (utf8->string (socket-recv cs 255))))

(define c 
  (thread-start!
   (make-thread
    (lambda ()
      (let ((cs (make-client-socket "localhost" "5000")))
        (print (utf8->string (socket-recv cs 255))))))))

(print (thread-join! c))
#|
up
try
Unhandled exception
  Condition components:
  &uncaught-exception
    reason: #<condition
 #<&i/o>
 #<&who get-addrinfo>
 #<&message Name or service not known>
 #<&irritants ((localhost 5000))>
>
    thread: #<thread thread-4 terminated 0x806c8c80>
stack trace:
  [1] thread-join!
  [2] #f
    src: (thread-join! c)
    "test.scm":31
  [3] load
|#
As the comment mentions, if I create a socket on the main thread (or even the thread which creates the server thread?), it works. On POSIX, it says getaddrinfo is thread safe. See, freeaddrinfo, getaddrinfo - get address information. So I think whatever I do on multi threading environment, it should work as I expect. So it's not my library, case closed.

...
......
.........

NO, IT'S NOT!!!

The problem is that Cygwin is one of my most used environment and I can't give up this type of error. However I have no idea how I can resolve this. I've tried to lock before calling getaddrinfo so that the process itself can be atomic. But the result is the same. Now I need to find a workaround for this crappy thing. *sigh*

No comments:

Post a Comment