How do debuggers like GDB really work?



Hey all, here is another post in the Into the wild collection. This time we're going to talk about GDB in general and ptrace() in particular:
how GDB works, and which internal functions and syscalls it relies on in order to debug your favorite program.
I am assuming that you already know the basics of debugging from a usage point of view, so I won't introduce the usual commands for interacting with GDB.
In general, a debugger is a tool that has vast control over the execution of a piece of software: it can talk directly to the process of your program and manage certain actions.
The question that comes to anyone's mind here is: how can GDB do all of that?
Before answering this question, let's first go the opposite way and see what GDB doesn't do.
Usually, GDB does not simulate the execution of the target software. In other words, it doesn't read and interpret the binary instructions of your software and execute them itself; that would be very, very slow.
However, there are other tools that do work that way, for example Valgrind. The same applies to the way QEMU works.
GDB works by attaching to processes and executing system calls that give it almost full control over the process of your software.
I am also assuming that you know how processes work under Linux. One of the syscalls that GDB calls is ptrace. Ptrace (Process Trace) is a system call that provides a powerful mechanism by which one process (the tracer, usually the parent) can observe and control the execution of other processes. It also lets the tracer examine and change the memory and the registers of your software, which is what makes it possible to set breakpoints and trace syscalls.
So, to sum up, any debugger on Linux needs support from the kernel side in order to do what it must do. I bet anyone would be interested to know the black magic that the kernel does, and hopefully this is what I am going to cover here as well :)
Here is a simple prototype of ptrace:
long ptrace(enum __ptrace_request request, pid_t pid,
                   void *addr, void *data);

Additionally, ptrace allows a debugger process to access the low-level state of another process (the debuggee).
Before we get into how ptrace is implemented in the Linux kernel, we need to mention the requests (I like to call them sub-syscalls) that ptrace supports. As we can see from the prototype, ptrace takes four arguments: enum __ptrace_request request, pid_t pid, void *addr and void *data.
The first argument, enum __ptrace_request request, determines the behaviour of ptrace itself and how the other arguments are used. It is defined in sys/ptrace.h.
In other words, this argument is the most important one, as it specifies which operation we are going to perform with ptrace(), whether that is reading a register of the debuggee's process
or getting values from its memory.
The second argument, pid_t pid, specifies the PID of the debuggee (our target program). Note that a single process can do its magic on several other processes at once.
The last two arguments, void *addr and void *data, are interpreted according to the request, and some requests ignore them entirely.
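To make the roles of the four arguments a bit more concrete, here is a minimal sketch (certainly not how GDB itself is written). It assumes Linux; the variable shared_word and the function peek_and_poke() are names I made up for illustration. The child asks to be traced and stops itself, then the parent reads one word from the child's memory and writes it back incremented.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variable that the tracer reads and overwrites in the child. */
static volatile long shared_word = 41;

/* Returns the value the child saw after the poke (its exit status), or -1. */
int peek_and_poke(void)
{
    int status;
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Only `request` matters here: pid, addr and data are ignored. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);              /* stop so the parent can inspect us */
        _exit((int)shared_word);     /* report the value we see afterwards */
    }

    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* PEEKDATA uses pid and addr; the word comes back as the return value,
     * so errno is the only way to tell a real -1 from an error. */
    errno = 0;
    long value = ptrace(PTRACE_PEEKDATA, child, (void *)&shared_word, NULL);
    if (value == -1 && errno != 0)
        goto fail;

    /* POKEDATA uses all four arguments: data carries the word to write. */
    if (ptrace(PTRACE_POKEDATA, child, (void *)&shared_word,
               (void *)(value + 1)) == -1)
        goto fail;

    /* data == NULL means "deliver no signal", so the SIGSTOP is dropped. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        goto fail;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;

fail:
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return -1;
}
```

If ptrace is permitted on your system, peek_and_poke() should return 42, while the parent's own copy of shared_word stays at 41, because the poke only touches the child's memory.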

In the end, ptrace performs one of the following actions, or what I like to call sub-syscalls. Which PTRACE_* request runs is selected by the first argument of ptrace() that we talked about.
- PTRACE_TRACEME: Indicates that this process is to be traced by its parent. It is only used by the child process.
- PTRACE_PEEKTEXT and PTRACE_PEEKDATA: Both read a word at the location addr in the child's memory and return it as the result of the ptrace() call.
- PTRACE_PEEKUSER: Reads a word at offset addr in the tracee's USER area, which holds the registers and other useful information.
- PTRACE_POKEUSER: Copies the word data to offset addr in the child's USER area.
- PTRACE_POKETEXT and PTRACE_POKEDATA: Both copy the word data to location addr in the child's memory.
- PTRACE_GETREGS and PTRACE_GETFPREGS: Copy the child's general-purpose or floating-point registers, respectively, to location data in the parent.
- PTRACE_SETREGS and PTRACE_SETFPREGS: The reverse of PTRACE_GETREGS and PTRACE_GETFPREGS: they set the child's general-purpose or floating-point registers from location data in the parent. Some register modifications may be disallowed.
- PTRACE_CONT: Restarts the stopped child process.
- PTRACE_SYSCALL: Restarts the stopped child as PTRACE_CONT does, but makes the kernel stop it again at the next syscall entry or exit.
- PTRACE_SINGLESTEP: Restarts the stopped child as PTRACE_CONT does, but makes the kernel stop it again after it has executed a single instruction.
- PTRACE_ATTACH: Attaches to the process specified in pid, making it a traced "child" of the current process.
- PTRACE_DETACH: Restarts the stopped child as for PTRACE_CONT, but first detaches from the process, undoing the reparenting effect of PTRACE_ATTACH, and the effects of PTRACE_TRACEME.
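
The difference between PTRACE_CONT, PTRACE_SYSCALL and PTRACE_SINGLESTEP is easier to see by counting how often the child stops. The sketch below assumes Linux; count_stops() is a hypothetical helper of mine, and as a simplification it always resumes the child without delivering any signal.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep resuming a traced child with `request`
 * (PTRACE_CONT, PTRACE_SYSCALL or PTRACE_SINGLESTEP) until it exits.
 * Returns how many times it stopped again on the way, or -1 on error.
 */
long count_stops(int request)
{
    int status;
    long stops = 0;
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);               /* initial stop, not counted below */
        for (int i = 0; i < 3; i++)
            syscall(SYS_getppid);     /* a few syscalls to observe */
        _exit(0);
    }

    /* Wait for the initial SIGSTOP stop. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    for (;;) {
        /* Simplification: data == NULL, so no signal reaches the child. */
        if (ptrace(request, child, NULL, NULL) == -1) {
            kill(child, SIGKILL);
            waitpid(child, &status, 0);
            return -1;
        }
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (!WIFSTOPPED(status))
            break;                    /* the child exited or was killed */
        stops++;
    }
    return stops;
}
```

count_stops(PTRACE_CONT) should give 0 because nothing interrupts the child, count_stops(PTRACE_SYSCALL) gives a small number (the syscall entries and exits), and count_stops(PTRACE_SINGLESTEP) gives one stop per executed instruction, so a much larger number. The exact values depend on your libc and architecture.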

This is not the full list of what ptrace() can do, but for educational purposes and for the sake of simplicity, I am only going to talk about the PTRACE_* requests that matter in this article.
I also want to mention that using ptrace() involves a few other syscalls as well. It is really fun to see how they fit together, so let me go through them briefly.
First of all, we need a tracee process that is attached to our debugger. One way to get it is to call the fork() syscall and have the resulting child do a PTRACE_TRACEME,
followed by another syscall, exec(), to load the program we want to debug. Alternatively, a debugger like GDB can attach to an already running process by doing a PTRACE_ATTACH.
However, our mission isn't complete yet. The tracee stops each time a signal is delivered to it. The parent process learns about this by calling the wait() syscall.
Then it can inspect or even modify the child process while it is stopped. Once the parent process has the information and the values that it wants from the tracee, it can
resume the tracee and let it continue its work.
Finally, when the debugger has finished tracing, it can either terminate the child process with PTRACE_KILL, or let the tracee carry on in its normal mode by detaching from it with PTRACE_DETACH.
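
The whole sequence above (fork, PTRACE_TRACEME, exec, wait, inspect, resume) can be sketched roughly like this. It is only a skeleton under my own assumptions: run_traced() is a made-up name, the "inspect" step is left as a comment, and the tracee simply runs to completion instead of being killed or detached.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical skeleton of a tracer: start `path` under ptrace, resume it
 * at every stop, and return its exit status (-1 on error).
 */
int run_traced(const char *path, char *const argv[])
{
    int status;
    int first_stop = 1;
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* The child volunteers to be traced, then becomes the program. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        execv(path, argv);
        _exit(127);                       /* only reached if exec failed */
    }

    for (;;) {
        /* Block until the tracee stops or terminates. */
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status);
        if (WIFSIGNALED(status))
            return -1;

        /* The tracee is stopped: a real debugger would read registers
         * and memory here before letting it go on. */
        int sig = WSTOPSIG(status);
        if (first_stop && sig == SIGTRAP)
            sig = 0;                      /* exec notification, not a real signal */
        first_stop = 0;

        /* Resume, passing on whatever signal caused the stop. */
        if (ptrace(PTRACE_CONT, child, NULL, (void *)(long)sig) == -1) {
            kill(child, SIGKILL);
            waitpid(child, &status, 0);
            return -1;
        }
    }
}
```

For example, calling run_traced("/bin/sh", argv) with argv set to {"sh", "-c", "exit 7", NULL} should return 7 on a system where /bin/sh exists.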

Okay, at this point we know how to trace our debuggee's process and attach it to the debugger, which is our GDB.
This is good, but unfortunately it is not enough to start your journey with debugging. We need to put breakpoints where we can interrupt our debuggee's process,
so that we can examine its stack and manage its information, registers, etc. In order to do this we need signals.
Fortunately, this is managed by the kernel on behalf of our debugger (GDB): the kernel notifies GDB about the events that occur in our debuggee.
For PTRACE_ATTACH, the kernel hits the debuggee (the tracee's process) with a SIGSTOP, and we in turn receive a SIGCHLD. For PTRACE_TRACEME, the freshly forked process marks itself as traced when
PTRACE_TRACEME executes, and it then gets a SIGTRAP once it performs an exec() or execve() syscall.
Lastly, we are notified through SIGCHLD about the process that got spawned and traced.
ptrace() does a lot of cool things, but unfortunately it is not sufficient on its own to start debugging: the ptrace() syscall has no request that lets us or our debugger set a breakpoint directly.
This is why we use signals alongside the ptrace() syscall.
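Here is a small sketch of the first signal the tracer gets to see in the two setups: a SIGTRAP after exec() in a child that did PTRACE_TRACEME, and a SIGSTOP after PTRACE_ATTACH. It assumes Linux and that your system lets a process attach to its own child; all function names are mine. I use kill() with SIGKILL to get rid of the tracee, since the man page discourages PTRACE_KILL.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for the tracee's first stop, kill it, return the stop signal. */
static int first_stop_then_kill(pid_t child)
{
    int status;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;                /* it already terminated, nothing to kill */
    int sig = WSTOPSIG(status);
    kill(child, SIGKILL);
    waitpid(child, &status, 0);   /* reap the killed tracee */
    return sig;
}

/* Child does PTRACE_TRACEME and exec(): the first stop should be SIGTRAP. */
int exec_stop_signal(const char *path)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        execl(path, path, (char *)NULL);
        _exit(127);               /* only reached if exec failed */
    }
    return first_stop_then_kill(child);
}

/* Parent attaches to a running child: the first stop should be SIGSTOP. */
int attach_stop_signal(void)
{
    int status;
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        for (;;)
            pause();              /* just sit there until we get killed */
    }
    if (ptrace(PTRACE_ATTACH, child, NULL, NULL) == -1) {
        kill(child, SIGKILL);
        waitpid(child, &status, 0);
        return -1;
    }
    return first_stop_then_kill(child);
}
```

On a typical Linux box, exec_stop_signal("/bin/sh") should return SIGTRAP and attach_stop_signal() should return SIGSTOP; both return -1 if ptrace is not permitted.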
In the next part I am going to cover some aspects of breakpoints, the way any debugger like GDB handles them.


Cheers o/