In the second part of this tutorial I perform the key file operations (create, open, write, read, and so on) using only Linux system calls, without any libraries.
I also demonstrate how to use the GNU debugger (gdb) to step through the code and analyze program behavior in depth.
Let's begin by running the program (fops) in the terminal:
    geo@fermat:/home/work/asm/nasm$ ./fops write.txt
    This is George, and his nasm line.
    geo@fermat:/home/work/asm/nasm$ l write.txt
    -rwxr-xr-x 1 geo geo 60 May 22 22:29 write.txt
    geo@fermat:/home/work/asm/nasm$ cat write.txt
    This is George, and his nasm line.
    This is line number 2.
    geo@fermat:/home/work/asm/nasm$
When you enter the program name and its parameters, Linux places the program
parameters on the stack (from last to first), then the program name itself,
and finally the total number of command line items, so that this count ends up
on top of the stack. The count includes the program name, which is why the code
below compares it with 2.
Let's have a look at the corresponding assembly code:
    ;----------------------------------------------------------
    ; fops.s demonstrates NASM file operations:
    ; create, write, read, sync, open, close
    ;
    ; Copyright 2011 by George Matveev
    ;
    ; www.matveev.se
    ;----------------------------------------------------------
    section .text
    global _start

    _start:
        nop                 ;required for debugging purposes
        pop ebx             ;number of command line parameters
        cmp ebx, 2          ;check if total number is 2
        jne error           ;exit
        pop ebx             ;name of the program
        pop ebx             ;name of the file to be created
        cmp ebx, 0          ;check if ebx is not zero (ok)
        jbe error           ;exit
        mov [file], ebx     ;store file name in local variable
What we are doing here is popping the entered values off the stack into the
ebx register. We verify that a file name was actually provided by checking the
argument count (cmp ebx, 2), and we also check that the file name pointer is
not zero (cmp ebx, 0). Compare this approach with the direct use of the esp
register in the fibonacci program in part one of the tutorial.
The purpose of the nop instruction is explained in the debugging section below.
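For comparison, here is a minimal sketch of the esp-based approach: it reads the argument count and the first argument in place, without popping anything. The file name args.s and the labels usage and quit are made up for this illustration and are not part of fops.s.

```nasm
; args.s (hypothetical name) - read argc and argv[1] through esp
; 32-bit Linux, int 80h interface; the stack is left untouched
section .text
global _start

_start:
    nop                     ;kept for debugging, as in fops.s
    mov eax, [esp]          ;argument count (program name included)
    cmp eax, 2              ;expect exactly one parameter
    jne usage
    mov ebx, [esp + 8]      ;argv[1]; [esp + 4] holds argv[0]
    test ebx, ebx           ;null pointer means no parameter
    jz usage
    xor ebx, ebx            ;exit code 0: parameter is present
    jmp quit
usage:
    mov ebx, 1              ;exit code 1: wrong number of parameters
quit:
    mov eax, 1              ;sys_exit
    int 80h
```

Because nothing is popped, the values remain available on the stack for later use, at the cost of having to remember the offsets.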
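If the input is ok, we save it in the local variable file defined in the .bss
section for future use. Notice that the labels bssbuf and file receive an
address no matter how many bytes we reserve, so the program happens to run
even with a reservation of zero. That only works because memory for .bss is
mapped in whole pages; to be safe, reserve at least as many bytes as you intend
to store (len bytes for bssbuf, 4 bytes for the pointer kept in file):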
    section .data
    line    db "This is George,"
            db " and his nasm line.", 0xa, 0
    len     equ $ - line
    line2   db "This is line number 2.", 0xa, 0
    len2    equ $ - line2

    section .bss
    bssbuf: resb len        ;room for the line we read back
    file:   resb 4          ;room for the file name pointer
Now we are ready to actually create the file, here's how it is done:
        ; create file with (-rwxr-xr-x) access rights
        ; access modes of new file:
        ;1=--x, 5=--r-x, 6=--r, 7=--r-x, 15=-xr-x
        ;127=---xr-xr-x, 128=--w-------
        ;255=--wxr-xr-x, 256=-r--------
        ;511=-rwxr-xr-x, 512=---------T
        ;666=--w---x--T, 755=--wxr----t
        mov eax, 8          ;sys_creat
        mov ecx, 511        ;access rights (decimal 511 = octal 777)
        int 80h
        cmp eax, 0          ;check if file was created
        jle error           ;error creating file (signed: errors are negative)
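First we put the appropriate Linux system call number (sys_creat = 8) into eax.
Second, the ebx register already contains the name of the file we want to create.
Finally, we move the access mode into the ecx register, using the standard Unix
combination of access codes for user/group/others. Note that 511 is a decimal
literal equal to octal 777 (rwxrwxrwx); the file in the listing above shows up
as rwxr-xr-x, presumably because the process umask (typically 022) cleared the
write bits for group and others. Then we check the result of the system call,
which is always returned in eax; a negative value indicates an error, so the
comparison has to be a signed one.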
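Since access modes are normally written in octal, it may be easier to let NASM do the conversion with its octal suffix (755o). The following is only a sketch of that idea: the file name demo.txt and the labels fail and quit are invented for the example.

```nasm
; mode.s (hypothetical name) - create a file with an octal access mode
section .data
fname   db "demo.txt", 0    ;made-up file name for this sketch

section .text
global _start

_start:
    mov eax, 8              ;sys_creat
    mov ebx, fname          ;pointer to zero-terminated name
    mov ecx, 755o           ;rwxr-xr-x written in octal
    int 80h
    test eax, eax           ;set flags from the return value
    js fail                 ;negative value is an error code
    mov ebx, eax            ;descriptor returned by sys_creat
    mov eax, 6              ;sys_close
    int 80h
    xor ebx, ebx            ;exit code 0
    jmp quit
fail:
    neg eax                 ;turn the negative result into a positive code
    mov ebx, eax            ;and report it as the exit code
quit:
    mov eax, 1              ;sys_exit
    int 80h
```

The umask still applies, so the mode you request is an upper bound on the permissions the file actually gets.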
Now that the file was created we can open it in write mode:
        ; open file in write-only mode
        mov eax, 5          ;sys_open file with file name in ebx
        mov ebx, [file]     ;name of the file to be opened
        mov ecx, 1          ;O_WRONLY
        int 80h
        cmp eax, 0          ;check if fd in eax > 0 (ok)
        jle error           ;cannot open file (signed: errors are negative)
        mov ebx, eax        ;store file descriptor of new file
Again we put the system call number into the eax register and move the file name
and the required mode into the ebx and ecx registers respectively. The value 1
in ecx is O_WRONLY (0 is O_RDONLY and 2 is O_RDWR). Then we trigger
Linux interrupt 80h, which transfers control to the kernel. We check that the
file descriptor returned in eax is a positive integer by comparing eax with 0
(cmp eax, 0) and then store the file descriptor in ebx.
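One way to keep the open modes readable is to give them names with equ. The sketch below assumes the file name arrives as the first command line parameter; the labels bad and quit are invented for the illustration. It also tests eax explicitly after the call, because int 80h does not set the flags from the return value.

```nasm
; open.s (hypothetical name) - open argv[1] using named mode constants
O_RDONLY equ 0
O_WRONLY equ 1
O_RDWR   equ 2

section .text
global _start

_start:
    mov ebx, [esp + 8]      ;argv[1], or null if no parameter was given
    test ebx, ebx
    jz bad
    mov eax, 5              ;sys_open
    mov ecx, O_WRONLY       ;same value as "mov ecx, 1"
    int 80h
    test eax, eax           ;set flags from the returned value
    js bad                  ;negative value: open failed
    mov ebx, eax            ;file descriptor
    mov eax, 6              ;sys_close
    int 80h
    xor ebx, ebx            ;exit code 0
    jmp quit
bad:
    mov ebx, 1              ;exit code 1
quit:
    mov eax, 1              ;sys_exit
    int 80h
```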
Now we are ready to write two strings to the file, using file descriptor in ebx:
        ; write line1 to the file pointer we keep in ebx
        mov eax, 4          ;sys_write
        mov edx, len
        mov ecx, line
        int 80h

        ; write second line to the file
        mov eax, 4          ;sys_write
        mov edx, len2
        mov ecx, line2
        int 80h

        ; sync all write buffers with files
        mov eax, 36         ;sys_sync
        int 80h

        ; close file, fd in ebx may not be valid anymore
        mov eax, 6          ;sys_close
        int 80h
The write operation requires the sys_write call number (4) in eax,
the string pointer in ecx and the number of bytes to be written in edx.
After we are done writing, we flush the kernel buffers to disk using the
system call sys_sync (36), which takes no arguments and applies to all files,
and close the file using sys_close (6).
During these operations the ebx register keeps our file descriptor, since the
kernel preserves it across int 80h, so we do not have to save it on the stack.
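As a variation, the sketch below checks the byte count returned by sys_write and flushes only the one file with sys_fsync (call number 118 on 32-bit x86) instead of sys_sync. The file name demo.txt, the message and the labels are assumptions made for the example; it also relies on the fact that sys_creat itself returns a descriptor open for writing.

```nasm
; fsync.s (hypothetical name) - write one line and flush only this file
section .data
fname   db "demo.txt", 0            ;made-up file name
msg     db "hello from nasm", 0xa   ;made-up text
mlen    equ $ - msg

section .text
global _start

_start:
    mov eax, 8              ;sys_creat, returns a write-only descriptor
    mov ebx, fname
    mov ecx, 644o           ;rw-r--r--
    int 80h
    test eax, eax
    js fail
    mov ebx, eax            ;fd stays in ebx for the calls below

    mov eax, 4              ;sys_write
    mov ecx, msg
    mov edx, mlen
    int 80h
    cmp eax, mlen           ;anything else is a short or failed write
    jne fail_close

    mov eax, 118            ;sys_fsync: flush this descriptor only
    int 80h

    mov eax, 6              ;sys_close
    int 80h
    xor ebx, ebx            ;exit code 0
    jmp quit
fail_close:
    mov eax, 6              ;close the file before reporting failure
    int 80h
fail:
    mov ebx, 1              ;exit code 1
quit:
    mov eax, 1              ;sys_exit
    int 80h
```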
Now let's re-open our newly created file and read the data into
the buffer (bssbuf) we reserved in the bss section of the program:
        ; re-open same file in read-only mode
        mov eax, 5          ;sys_open file with file name in ebx
        mov ebx, [file]     ;file to be re-opened
        mov ecx, 0          ;O_RDONLY
        int 80h
        cmp eax, 0          ;check if fd in eax > 0 (ok)
        jle error           ;can not open file
        mov ebx, eax        ;store new (!) fd of the same file

        ; read from file into bss data buffer
        mov eax, 3          ;sys_read
        mov ecx, bssbuf     ;pointer to destination buffer
        mov edx, len        ;length of data to be read
        int 80h
        test eax, eax       ;int 80h does not set flags from eax
        js error            ;file is open but cannot be read
        cmp eax, len        ;check number of bytes read
        jb close            ;must close file first

        ; write bss data buffer to stderr
        mov eax, 4          ;sys_write
        push ebx            ;save fd on stack for sys_close
        mov ebx, 2          ;fd of stderr which is unbuffered
        mov ecx, bssbuf     ;pointer to buffer with data
        mov edx, len        ;length of data to be written
        int 80h
        pop ebx             ;restore fd in ebx from stack

    close:
        mov eax, 6          ;sys_close file
        int 80h
        mov eax, 1          ;sys_exit
        mov ebx, 0          ;ok
        int 80h

    error:
        mov ebx, eax        ;exit code = sys call result
        mov eax, 1          ;sys_exit
        int 80h
Notice that this time we use the file name stored in the file variable.
We re-open the file in read-only mode (mov ecx, 0) and then
read data from the file into the bss buffer using the Unix/Linux system call
sys_read (mov eax, 3), with the number of bytes in edx (mov edx, len).
The number of bytes read must be at least the length of the line in the file:
the program compares the value returned in eax with len and, on a short read,
jumps straight to close, so the line will not appear on the console.
After the line has been read, for test/demo purposes, we write the data from
the initially empty bss buffer to the standard error descriptor (mov ebx, 2),
which is unbuffered, keeping our file descriptor on the stack meanwhile.
Here one can specify any (non-negative) number of bytes in edx. Keep in mind
that sys_write sends exactly edx bytes and does not stop at the terminating
zero; with a count larger than len, the bytes that follow the line in memory
are written too, although a terminal does not display zero bytes.
Finally we close the file using its file descriptor and exit the program,
passing the exit code in the ebx register (either 0 or eax in case of error).
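A slightly more forgiving variant, sketched below, echoes exactly as many bytes as sys_read returned instead of insisting on a fixed length. It assumes the file name is passed as the first command line parameter; BUFSZ, buf and the labels are names invented for this example.

```nasm
; readn.s (hypothetical name) - echo to stderr exactly the bytes read
BUFSZ   equ 64              ;made-up buffer size

section .bss
buf:    resb BUFSZ

section .text
global _start

_start:
    mov ebx, [esp + 8]      ;argv[1], or null if no parameter was given
    test ebx, ebx
    jz bad
    mov eax, 5              ;sys_open
    mov ecx, 0              ;O_RDONLY
    int 80h
    test eax, eax
    js bad
    mov esi, eax            ;keep fd in esi, preserved across int 80h
    mov edi, 1              ;exit code, assume failure until proven ok

    mov eax, 3              ;sys_read
    mov ebx, esi
    mov ecx, buf
    mov edx, BUFSZ          ;read at most BUFSZ bytes
    int 80h
    test eax, eax
    js done                 ;read failed, still close the file

    mov edx, eax            ;number of bytes actually read
    mov eax, 4              ;sys_write
    mov ebx, 2              ;stderr
    mov ecx, buf
    int 80h
    xor edi, edi            ;exit code 0
done:
    mov eax, 6              ;sys_close
    mov ebx, esi
    int 80h
    mov ebx, edi
    jmp quit
bad:
    mov ebx, 1              ;exit code 1
quit:
    mov eax, 1              ;sys_exit
    int 80h
```

Keeping the descriptor in esi avoids the push/pop pair around the write to stderr; whether that is clearer than the stack-based version is a matter of taste.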
To be able to debug this program we first need to insert a nop (no operation)
instruction into the code right after the _start label and then assemble it with the -g option.
Without the nop we may not be able to set a breakpoint in our code.
    geo@fermat:/home/work/asm/nasm$ nasm -f elf -g fops.s
    geo@fermat:/home/work/asm/nasm$ ld -o fops fops.o
    geo@fermat:/home/work/asm/nasm$ l fops*
    -rwxr-xr-x 1 geo geo 2011 May 22 21:26 fops
    -rw-r--r-- 1 geo geo 2560 May 22 21:26 fops.o
    -rw-r--r-- 1 geo geo 2788 May 22 21:19 fops.s
    geo@fermat:/home/work/asm/nasm$
Here's how we run the program under debugger:
    geo@fermat:/home/work/asm/nasm/geonasm$ gdb -q fops
    Reading symbols from /home/work/asm/nasm/geonasm/fops...done.
    (gdb) b *_start+1
    Breakpoint 1 at 0x8048081
    (gdb) run write.txt
    Starting program: /home/work/asm/nasm/geonasm/fops write.txt

    Breakpoint 1, 0x08048081 in _start ()
    (gdb) i r
    eax            0x0          0
    ecx            0x0          0
    edx            0x0          0
    ebx            0x0          0
    esp            0xbfffef40   0xbfffef40
    ebp            0x0          0x0
    esi            0x0          0
    edi            0x0          0
    eip            0x8048081    0x8048081 <_start+1>
    eflags         0x292        [ AF SF IF ]
    cs             0x73         115
    ss             0x7b         123
    ds             0x7b         123
    es             0x7b         123
    fs             0x0          0
    gs             0x0          0
    (gdb) x/4c &bssbuf
    0x80491a0 <bssbuf>: 0 '\000'  0 '\000'  0 '\000'  0 '\000'
    (gdb) x/4c &file
    0x80491c4 <file>:   0 '\000'  0 '\000'  0 '\000'  0 '\000'
    (gdb) x/36c &line
    0x8049164 <line>:   84 'T'   104 'h'  105 'i'  115 's'  32 ' '   105 'i'  115 's'  32 ' '
    0x804916c:          71 'G'   101 'e'  111 'o'  114 'r'  103 'g'  101 'e'  44 ','   32 ' '
    0x8049174:          97 'a'   110 'n'  100 'd'  32 ' '   104 'h'  105 'i'  115 's'  32 ' '
    0x804917c:          110 'n'  97 'a'   115 's'  109 'm'  32 ' '   108 'l'  105 'i'  110 'n'
    0x8049184:          101 'e'  46 '.'   10 '\n'  0 '\000'
    (gdb) x/24c &line2
    0x8049188 <line2>:  84 'T'   104 'h'  105 'i'  115 's'  32 ' '   105 'i'  115 's'  32 ' '
    0x8049190:          108 'l'  105 'i'  110 'n'  101 'e'  32 ' '   110 'n'  117 'u'  109 'm'
    0x8049198:          98 'b'   101 'e'  114 'r'  32 ' '   50 '2'   46 '.'   10 '\n'  0 '\000'
    (gdb) s
    Single stepping until exit from function _start,
    which has no line number information.
    This is George, and his nasm line.
    0x08048147 in close ()
    (gdb) x/36c &bssbuf
    0x80491a0 <bssbuf>: 84 'T'   104 'h'  105 'i'  115 's'  32 ' '   105 'i'  115 's'  32 ' '
    0x80491a8:          71 'G'   101 'e'  111 'o'  114 'r'  103 'g'  101 'e'  44 ','   32 ' '
    0x80491b0:          97 'a'   110 'n'  100 'd'  32 ' '   104 'h'  105 'i'  115 's'  32 ' '
    0x80491b8:          110 'n'  97 'a'   115 's'  109 'm'  32 ' '   108 'l'  105 'i'  110 'n'
    0x80491c0:          101 'e'  46 '.'   10 '\n'  0 '\000'
    (gdb) x/4c &file
    0x80491c4 <file>:   9 '\t'  -15 '\361'  -1 '\377'  -65 '\277'
    (gdb) i r
    eax            0x24         36
    ecx            0x80491a0    134517152
    edx            0x24         36
    ebx            0x6          6
    esp            0xbfffef4c   0xbfffef4c
    ebp            0x0          0x0
    esi            0x0          0
    edi            0x0          0
    eip            0x8048147    0x8048147
    eflags         0x246        [ PF ZF IF ]
    cs             0x73         115
    ss             0x7b         123
    ds             0x7b         123
    es             0x7b         123
    fs             0x0          0
    gs             0x0          0
    (gdb) s
    Single stepping until exit from function close,
    which has no line number information.

    Program exited normally.
    (gdb)
As you see, we set a breakpoint (b *_start+1) on the instruction that immediately
follows the nop at _start, then we step (s) through the code, view the registers
(i r, short for info registers) and examine the contents of memory around the
local variables (x/nc).
If you reserve zero bytes in the bss section for bssbuf and/or file, the program
will still run ok here, but the debugger will say "No symbol "bssbuf" in current context."
in response to "(gdb) x/4c &bssbuf".
To speed up building and testing the code I use a simple shell script.
When the script runs the program, the unbuffered stderr output comes first, the
program exit code (ebx) comes after the ls listing, and finally the contents of the write.txt file:
    geo@fermat:/home/work/asm/nasm$ ./runfops.sh
    This is George, and his nasm line.
    -rwxr-xr-x 1 geo geo 2011 May 22 21:49 fops
    -rw-r--r-- 1 geo geo 2281 May 22 21:41 fops-gdb.txt
    -rw-r--r-- 1 geo geo 2560 May 22 21:49 fops.o
    -rw-r--r-- 1 geo geo 2788 May 22 21:19 fops.s
    -rwxr-xr-x 1 geo geo 60 May 22 21:49 write.txt
    exit code = ebx = 0
    This is George, and his nasm line.
    This is line number 2.
    geo@fermat:/home/work/asm/nasm$
You can download source code, binary and shell script of this tutorial from here.
For this tutorial I used NASM 2.09 on Debian Squeeze.
    geo@fermat:/home/work/asm/nasm$ nasm -v
    NASM version 2.09.04 compiled on Feb 12 2011
    geo@fermat:/home/work/asm/nasm$