Let's start Scheme

2023-05-04

Remote debugger(ish)

In the previous post, I mentioned that I needed to collect all threads to see which one was hanging. After implementing that, I noticed that all of the threads were hanging, so I decided to implement a better remote debugger(ish).

CAVEAT

The procedures and interface are experimental, so they may change in the future.

Suppose we have this script:

(import (rnrs)
        (srfi :1)
        (srfi :18)
        (sagittarius)
        (sagittarius debug))

;; Creating a remote debugger. You can specify a specific port number as well
(define remote-debugger (make-remote-debugger "0"))
;; Showing the port.
(print "Debugger port: " (remote-debugger-port remote-debugger))

(define ((sleep-in-deep name time))
  (define (before-sleep)
    (format #t "~a wants to sleep~%" name))
  (define (after-sleep time)
    (format #t "~a slept for ~a but wants to sleep more~%" name time))
  (define (sleep time) (thread-sleep! time))
  (before-sleep)
  (sleep time)
  (after-sleep time))

(define threads
  (map thread-start!
       (map (lambda (i)
              (make-thread (sleep-in-deep i (* i 1800))
                           (string-append "sleep-" (number->string i))))
            (iota 3))))
(for-each thread-join! threads)

It does nothing but sleep. If you run the script, it prints something like this on the console:

Debugger port: 50368
0 wants to sleep
0 slept for 0 but wants to sleep more
2 wants to sleep
1 wants to sleep
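
One detail of the script that is easy to miss: sleep-in-deep is written with a curried define, so (sleep-in-deep i (* i 1800)) does not sleep by itself but returns a thunk, which is what make-thread expects. As a rough sketch (not the code from the script, and with the internal helper procedures inlined for brevity), the explicit equivalent would look like this:

```scheme
(import (rnrs)
        (srfi :18)
        (sagittarius))

;; Explicit equivalent of (define ((sleep-in-deep name time)) ...):
;; the outer procedure takes the arguments, the inner lambda is the
;; zero-argument thunk that the thread actually runs.
(define (sleep-in-deep name time)
  (lambda ()
    (format #t "~a wants to sleep~%" name)
    (thread-sleep! time)
    (format #t "~a slept for ~a but wants to sleep more~%" name time)))
```

With this form the inner procedure is an anonymous lambda, so the frame names shown by a backtrace may differ from the ones in the transcripts of this post.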

At this moment, the remote debugger is just a kind of remote REPL, so open Emacs and run Sagittarius (an older version works too). Then do this:

sash> (import (sagittarius remote-repl))
sash> (connect-remote-repl "localhost" "50368")

connect: localhost:50368 (enter ^D to exit) 

Now that it's connected, I want to check the hanging threads.

sash> (for-each print (sleeping-threads))
#<thread root runnable 0x10ac6dc80>
#<thread remote-debugger-thread-1295 runnable 0x10bdb8640>
#<thread sleep-1 runnable 0x10bdb8000>
#<thread sleep-2 runnable 0x10c6e0c80>
#<unspecified>

Four threads are hanging. root is the main thread, so I can ignore it. remote-debugger-thread-1295 (the number varies) is the remote debugger's server thread, so I can ignore it as well. The remaining threads are the targets. I want to filter them for later use, which I can do simply like this:

sash> (define t* (filter (lambda (t) (and (string? (thread-name t)) (string-prefix? "sleep-" (thread-name t)))) (sleeping-threads)))
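
The one-liner above is a bit dense, so here is a sketch of the same filter with a named predicate. sleep-thread? is a hypothetical helper name of mine, and I'm assuming it is typed into the same remote REPL session, where filter, string-prefix?, thread-name and sleeping-threads are already available as in the line above.

```scheme
;; Hypothetical helper: true only for the threads created by the script.
(define (sleep-thread? t)
  (let ((name (thread-name t)))
    ;; SRFI 18 allows any object as a thread name,
    ;; so make sure it is a string before looking at its prefix.
    (and (string? name)
         (string-prefix? "sleep-" name))))

;; Same result as the one-liner: only the sleep-N threads remain.
(define t* (filter sleep-thread? (sleeping-threads)))
```

The string? check is there because a thread created without a name does not necessarily have a string as its name, and string-prefix? would raise an error on a non-string.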

Now, let's see the backtrace of the threads.

sash> (for-each print (map thread->pretty-backtrace-string t*))
Thread sleep-1
stack trace:
  [1] thread-sleep!
  [2] sleep-in-deep
    src: (thread-sleep! time)
    "sleep.scm":15

Thread sleep-2
stack trace:
  [1] thread-sleep!
  [2] sleep-in-deep
    src: (thread-sleep! time)
    "sleep.scm":15

#<unspecified>

It seems that sleep-in-deep is calling the thread-sleep! procedure. Okay, what is the value of time?

To see those variables, I need to collect the backtraces of the threads, like this:

sash> (define bt* (map thread-backtrace t*))

A backtrace contains multiple frames; in this example, each thread's backtrace contains 2 frames. I want to see the first one, which is the thread-sleep! frame, so I do this:

sash> (for-each print (map (lambda (bt) (thread-backtrace-arguments bt 1)) bt*))
((local (0 . 1800)))
((local (0 . 3600)))
#<unspecified>

1800 and 3600 seconds! That's why the threads are hanging (obviously...).
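
The two steps above (collecting the backtraces, then asking for the arguments of one frame) can be folded into a small helper that also prints the thread name next to the values. This is only a sketch: print-frame-arguments is a hypothetical name, and it assumes the experimental thread-backtrace and thread-backtrace-arguments procedures behave as in the transcript above.

```scheme
;; Hypothetical helper: for each thread, print its name and the
;; arguments of the given frame. `frame` is the frame number as
;; shown in the stack trace (1 is the thread-sleep! frame here).
(define (print-frame-arguments threads frame)
  (for-each
   (lambda (t)
     (print (thread-name t) ": "
            (thread-backtrace-arguments (thread-backtrace t) frame)))
   threads))

;; Usage at the remote REPL, with t* defined as before.
(print-frame-arguments t* 1)
```

Since the interface is experimental, the helper may need adjusting if the procedure names or the frame numbering change.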

In this example, I only showed how to see the arguments (local variables and free variables), but the remote debugger has a bit more functionality, such as inspecting objects and accessing their slots. Using this debugger, I found the root cause of a bug that had made me suffer for a couple of weeks.
