Behind the scenes!



Hey guys, this is the first article of the collection Into the wild! I will try my best to clarify the obscure things
about Assembly on Linux. Like any soldier going into battle, you need to be ready :P so you need to download some tools first.
Go duckduck how to install them for your distro; we will mostly use gcc, vim and objdump.
I will mention any other tools in the next articles.

The name of this post is "Behind the scenes"; keep reading and you will see why!
I assume that everyone can write a simple C program that prints hello world:
$ vim hello.c

#include <stdio.h>

int main(void) {
    printf("hello world!\n");
    return 0;
}

Up to here it is easy: writing programs in a high-level language is no big deal. But the question you need to ask yourself is:
What happens behind the scenes? How does the damn processor know that we want to print the sentence "hello world!"?
I will try to answer those questions and clarify as much as I can.
OK, so we want to compile this little program. We use gcc here:

$ gcc hello.c -o hello.o

We told gcc to compile hello.c and to name the executable hello.o. I picked the o at the end of the name to stand for "out";
keep in mind that .o normally marks an unlinked object file, while what we get here is a complete, linked executable.
To execute it:

$ ./hello.o

Alright, now we want to disassemble the program and see the assembly lines:

$ objdump -d hello.o
By default objdump gives you AT&T syntax, but we will use Intel syntax here, so we add an argument to the objdump command:

$ objdump -M intel -d hello.o
<_start>:
400400: 31 ed xor ebp,ebp
400402: 49 89 d1 mov r9,rdx
400405: 5e pop rsi
400406: 48 89 e2 mov rdx,rsp
400409: 48 83 e4 f0 and rsp,0xfffffffffffffff0
40040d: 50 push rax
40040e: 54 push rsp
40040f: 49 c7 c0 80 05 40 00 mov r8,0x400580
400416: 48 c7 c1 10 05 40 00 mov rcx,0x400510
40041d: 48 c7 c7 f6 04 40 00 mov rdi,0x4004f6
400424: ff 15 c6 0b 20 00 call QWORD PTR [rip+0x200bc6] # 600ff0 <_DYNAMIC+0x1d0>
40042a: f4 hlt
40042b: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]


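I have pasted only a part of the .text section (the _start routine); it is all we need right now. There are other sections too, and two of them matter most to us.
The data section is used for declaring initialized data such as global variables. Its contents can change at runtime; read-only constants usually live in a separate section, .rodata. The syntax for declaring the data section in NASM is: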

section .data

The other important section is the one I have already pasted, the text section. It contains our code, the instructions that the CPU executes.
That section should start with the declaration global _start, which makes the _start symbol visible to the linker. The linker records it as the entry point, and that is how our sweet kernel knows where the program execution begins. :)
Let's try declaring the text section:

section .text
global _start
_start:


Alright, so let's take a look at our objdump output and understand it line by line! :D
OK, so xor ebp,ebp: here we clear the register EBP, which means we set it to 0 (on x86_64, writing EBP also zeroes the upper half of RBP).
But wait! What is EBP in the first place!? EBP is the stack frame pointer (base pointer). If you have no idea what I am talking about here, you can duckduck it
or just wait until the next article. ;)
The second line is mov r9,rdx. Here we copy the value of rdx into r9; both are registers of the x86_64 arch, and in Intel syntax the destination comes first.
Then we find the pop instruction, which copies the value at the top of the stack into RSI and then adds 8 to the stack pointer. When a Linux program starts, the top of the stack holds argc, so RSI now holds argc.
mov rdx,rsp copies the stack pointer into RDX; after the pop it points at argv.
I will not go deeper into the stack pointer and how it works here; I will try my best to explain it in other articles. :)
and rsp,0xfffffffffffffff0 is the stack alignment: it clears the lowest 4 bits of RSP, so the stack pointer becomes a multiple of 16.
The push instruction decrements the stack pointer and stores a value on the stack. It is very useful for passing arguments, saving registers..
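The three mov instructions that follow load addresses into R8, RCX and RDI; together with RSI, RDX and R9 they are the arguments for the function we are about to call (in a typical glibc build this is __libc_start_main, and the address in RDI is our main).
call pushes the address of the next instruction onto the stack, so the callee knows where to return, and then jumps to the target.
hlt normally puts the CPU into the HALT state until the next interrupt, but it is a privileged instruction: in a user program it just makes the process crash. It sits here as a safety net, because the called function is never supposed to return.
The last line is nop, which in human language means no operation. It does nothing, but it is processed by the CPU like any other instruction:
it is read from memory and the instruction pointer is incremented. Here it is padding, so that the next function starts at an aligned address. A nop can also be handy for debugging, e.g. we can set a breakpoint there.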

This is almost everything for the moment. If anything is unclear, do not hesitate to let me know. In the next article I will proceed with writing programs
using NASM, and there the fun will begin! xD I hope I have clarified at least some things for you!

Cheers o/