(NOTES) Intro to Assembly and Reverse Engineering
A few commands:
readelf -a file.out
objdump -d file.out # the memory addresses in the disassembly match the ones in the ELF headers, of course
Notes:
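- Dynamic Linking: If the ELF program headers contain an INTERP entry, the binary names a dynamic loader there (on x86-64 Linux typically /lib64/ld-linux-x86-64.so.2), which loads and links the shared object files at startup. If INTERP is absent, the program is not using that loader, which usually means it is statically linked.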
- global _start: It means _start becomes an exported symbol, i.e. visible to the linker. The linker looks for it and EXPECTS IT when it sets the entry point of the final binary (_start is the default entry point label, but you can choose another with ld -e _somethingelse).
Register list
64 bit: rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8 , r9 , r10 , r11 , r12 , r13 , r14 , r15
32 bit: eax, ebx, ecx, edx, esi, edi, ebp, esp, r8d, r9d, r10d, r11d, r12d, r13d, r14d, r15d
16 bit: ax, bx, cx, dx, si, di, bp, sp, r8w, r9w, r10w, r11w, r12w, r13w, r14w, r15w
8 bit: al, bl, cl, dl, sil, dil, bpl, spl, r8b, r9b, r10b, r11b, r12b, r13b, r14b, r15b
Flags list
CF (Carry), PF (Parity), ZF (Zero), SF (Sign), OF (Overflow), AF (Adjust), IF (Interrupt Enable)
A couple of pointers
rip (eip, ip) - instruction pointer - holds the address of the next instruction to be executed in the control flow
rsp (esp, sp) - stack pointer - points to the top of the stack (the lowest address in use, since the stack grows downward)
rbp (ebp, bp) - stack base pointer - by convention points to the base of the current stack frame
List of System Calls: https://filippo.io/linux-syscall-table/
System Call Inputs by Register
- rax: ID
- rdi: 1st argument
- rsi: 2nd argument
- rdx: 3rd argument
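- r10: 4th argument (ordinary function calls use rcx here, but the syscall instruction overwrites rcx and r11, so the kernel takes r10 instead)
- r8: 5th argument # note the order: r10, then r8, then r9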
- r9: 6th argument
System Call Examples
Write data to stdout, where the label text points to a 14-byte string:
mov rax, 1 ; ID of syscall, 1 is write
mov rdi, 1 ; file descriptor: 0 is stdin, 1 is stdout, 2 is stderr (use 0 with the read syscall to take input)
mov rsi, text ; starting memory address of the data (text is a label for that address)
mov rdx, 14 ; length in bytes, counted from that address; a length larger than the string makes write read past its end (buffer over-read)
syscall
Exit program with status code 0:
mov rax, 60 ; ID of syscall, 60 is exit
mov rdi, 0 ; exit code, 0 -> no error, anything else -> error
syscall
Sections
.data - readable and writable data whose initial values are set at assembly time
.text - the instructions that will be run
.bss - fixed-size variables reserved for runtime BUT NOT given a value at assembly time (unlike .data)
Assembly Operations
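Putting the sections and the two system calls above together, a minimal complete program might look like the sketch below. The file name hello.asm, the string contents and the build commands in the comments are assumptions for illustration (the notes only say the string is 14 bytes long); the operations it uses are described in the list that follows.

```nasm
; hello.asm (assumed file name)
; Build, assuming nasm and ld are installed on x86-64 Linux:
;   nasm -f elf64 hello.asm -o hello.o
;   ld hello.o -o hello

section .data
    text db "Hello, World!", 10     ; assumed contents: 13 characters + newline = 14 bytes

section .text
    global _start                   ; export the entry point for the linker

_start:
    mov rax, 1                      ; syscall ID 1 = write
    mov rdi, 1                      ; file descriptor 1 = stdout
    mov rsi, text                   ; address of the string
    mov rdx, 14                     ; number of bytes to write
    syscall

    mov rax, 60                     ; syscall ID 60 = exit
    mov rdi, 0                      ; exit code 0 = no error
    syscall
```

Running readelf -a and objdump -d on the resulting binary is a quick way to see the _start symbol and the .data and .text sections described above.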
<label> db <text> - under .data section, dEFINE bYTES
<label> resb <num_bytes> - under .bss section, resERVE bYTES
mov <to>, <from> - move data from one place to another
syscall - make system call to kernel space using the system call input registers' values
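jmp <label> - stores the memory address of the label in the rip register and continues execution there, no return
call <label> - same as jmp BUT first pushes the address of the next instruction onto the stack, so ret can come back to it
ret - pops that return address off the stack into rip, so execution continues right after the call, cool!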
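To make the difference between jmp and call/ret concrete, here is a small sketch that calls the same subroutine twice and then jumps to the exit code. The label names print_text and exit_ok and the string contents are hypothetical, chosen only for this example.

```nasm
section .data
    text db "Hello, World!", 10     ; assumed contents, 14 bytes

section .text
    global _start

_start:
    call print_text     ; pushes the return address, then jumps to print_text
    call print_text     ; runs after the first call has returned
    jmp exit_ok         ; plain jump: nothing is pushed, so there is no way back

print_text:             ; hypothetical subroutine: write text to stdout
    mov rax, 1          ; write
    mov rdi, 1          ; stdout
    mov rsi, text
    mov rdx, 14
    syscall
    ret                 ; pops the return address into rip

exit_ok:
    mov rax, 60         ; exit
    mov rdi, 0
    syscall
```

Without the jmp exit_ok, execution would fall through into print_text a third time and its ret would pop whatever happens to be on the stack, which is why subroutines are usually kept out of the straight-line path.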
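cmp <a>, <b> - computes a - b and sets the flags without storing the result (a is a register or memory operand, b can also be an integer). if same, ZF set =1 . if not same, ZF reset =0 AND SF=msb(a-b)
je <label> - jump if a=b (ZF=1)
jne <label> - jump if a!=b (ZF=0)
jg <label> - jump if a>b (signed comparison)
jge <label> - jump if a>=b (signed comparison)
jl <label> - jump if a<b (signed comparison)
jle <label> - jump if a<=b (signed comparison)
jz <label> - jump if ZF=1, i.e. the last result was 0 (same instruction as je)
jo <label> - jump if overflow occurred (OF=1)
jno <label> - jump if overflow did not occur (OF=0)
js <label> - jump if the sign flag is set (SF=1, i.e. the last result was negative)
jns <label> - jump if the sign flag is not set (SF=0)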
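A small sketch of cmp followed by a conditional jump: it exits with the larger of two numbers as the status code. The sample values 7 and 3 and the label done are assumptions for illustration.

```nasm
section .text
    global _start

_start:
    mov rdi, 7          ; a (assumed sample value)
    mov rsi, 3          ; b (assumed sample value)
    cmp rdi, rsi        ; computes a - b, sets the flags, discards the result
    jge done            ; signed a >= b: rdi already holds the larger value
    mov rdi, rsi        ; otherwise the larger value is b
done:
    mov rax, 60         ; exit, with rdi as the status code
    syscall
```

After running the binary, echo $? in the shell shows the exit status, which is 7 for these values.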
mov rax, rdi <- C: rax = rdi
mov rax, [rdi] <- C: rax = *rdi (where rdi is a pointer), it treats the value in rdi as a memory address and retrieves the data stored at THAT address
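The two forms of mov can be compared side by side in a short sketch. The variable name value and its contents are hypothetical; dq is the NASM directive for defining an 8-byte value, in the same family as db.

```nasm
section .data
    value dq 42         ; hypothetical 8-byte variable

section .text
    global _start

_start:
    mov rdi, value      ; rdi = address of value (a pointer)
    mov rax, rdi        ; rax = rdi: copies the address itself
    mov rax, [rdi]      ; rax = *rdi: loads the 8 bytes stored at that address
    mov rdi, rax        ; use the loaded value as the exit code
    mov rax, 60         ; exit
    syscall
```

The exit status is 42, the value stored in memory, not the address held in rdi.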
add a, b - a = a + b ; without carry
sub a, b - a = a - b ; without carry
adc a, b - a = a + b + CF ; with carry
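sbb a, b - a = a - b - CF ; with borrow
mul <reg> - rdx:rax = rax * <reg> (unsigned; the high half of the result goes to rdx)
div <reg> - rax = rdx:rax / <reg>, rdx = remainder (unsigned; set rdx to 0 first when dividing a plain 64-bit value)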
neg <reg> - <reg> = -1 * <reg>
inc <reg> - <reg> = <reg> + 1
dec <reg> - <reg> = <reg> - 1
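As a sketch of how mul and div use the rdx:rax register pair, the program below computes 6 * 7, divides the result by 4, and exits with quotient + remainder as the status code. The numbers are arbitrary sample values.

```nasm
section .text
    global _start

_start:
    mov rax, 6
    mov rbx, 7
    mul rbx             ; rdx:rax = 6 * 7 = 42 (rdx holds the high half, here 0)

    mov rbx, 4
    mov rdx, 0          ; div divides rdx:rax, so make sure the high half is 0
    div rbx             ; rax = 10 (quotient), rdx = 2 (remainder)

    add rax, rdx        ; rax = 10 + 2 = 12
    mov rdi, rax        ; exit code
    mov rax, 60         ; exit
    syscall
```

The exit status is 12. Leaving stale data in rdx before div is a common source of wrong quotients or a divide error, which is why the sketch clears it explicitly.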