it is available on both Linux and other operating systems (DOS/Windows), its syntax is pretty close to MASM, and it can be easier to read and write than GAS.
1. As our first example, let's write a program that outputs a message on the console:
    ; hello.s
    section .text
        global _start               ; required for linker (ld)

    _start:                         ; entry point
        mov edx,len                 ; string length
        mov ecx,mes                 ; string to write
        mov ebx,1                   ; file descriptor (stdout)
        mov eax,4                   ; syscall number (sys_write)
        int 0x80                    ; call kernel

        mov eax,1                   ; syscall number (sys_exit)
        int 0x80                    ; call kernel

    section .data
    mes db 'Hello, NASM!', 0xa, 0   ; null terminated string to be printed
    len equ $ - mes                 ; length of string
Essentially we move the address of the string to ecx, its length to edx, the file descriptor
to be used for output to ebx and the Linux sys_write system call number to eax.
After we set up the environment we simply raise interrupt 0x80, which switches context
to the kernel and performs the actions implemented in the kernel sys_write call.
The second syscall we use in this program requires even less setup: only the syscall
number needs to be moved to eax before calling the kernel. Note that sys_exit takes
its exit status from ebx, which still holds 1 from the previous call, so the program
exits with status 1; load 0 into ebx first if you want a clean exit status.
To see all system calls available in your kernel have a look into
linux/arch/architecture/include/asm/unistd.h, where architecture stands for the name of your platform.
Linux kernel 2.6.32 contains about 360 syscalls for the ARM architecture
and about 330 syscalls in the x86-specific unistd_32.h.
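For comparison, the same kernel entry point can be reached from C without the usual write() wrapper. The sketch below is only an illustration, not part of the original examples: raw_write is a hypothetical helper name, and it assumes a Linux system with glibc, whose syscall() function and SYS_write constant come from <unistd.h> and <sys/syscall.h>.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write, SYS_exit: the numbers from unistd.h */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: write len bytes from buf to fd through the raw
 * system call interface. It plays the role of the four mov instructions
 * plus "int 0x80" in hello.s. Returns the number of bytes written, or -1. */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write(1, "Hello, NASM!\n", 13) prints the message on stdout. On 32-bit x86, SYS_write is 4 and SYS_exit is 1, the values loaded into eax in hello.s; on x86-64 they are 1 and 60, so the numbers must always be taken from the headers for the target architecture.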
Here's how you build and run the program (on a 64-bit host, ld additionally needs -m elf_i386 to link the 32-bit object):
    geo@fermat:/home/work/asm/nasm$ nasm -f elf hello.s
    geo@fermat:/home/work/asm/nasm$ ld -o hello hello.o
    geo@fermat:/home/work/asm/nasm$ l hello*
    -rwxr-xr-x 1 geo geo 664 Feb 17 15:43 hello
    -rw-r--r-- 1 geo geo 624 Feb 17 15:43 hello.o
    -rw-r--r-- 1 geo geo 432 Feb 17 15:33 hello.s
    geo@fermat:/home/work/asm/nasm$ ./hello
    Hello, NASM!
    geo@fermat:/home/work/asm/nasm$
Notice that we don't use any high-level library API in doing this, only kernel syscalls.
If you change the length of the message to be greater than the actual string length,
sys_write keeps reading past the end of the string and
you may be able to see something really interesting on screen ;-)
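To make the length computation concrete, here is a rough C counterpart of the mes/len pair. It is only a sketch: the helper names are made up for illustration. The expression $ - mes is evaluated by the assembler and covers every byte emitted by the db line, including the line feed and the terminating zero, which corresponds to sizeof in C rather than strlen.

```c
#include <stddef.h>
#include <string.h>

/* Same bytes as: mes db 'Hello, NASM!', 0xa, 0 */
static const char mes[] = "Hello, NASM!\n";

/* Counterpart of "len equ $ - mes": all bytes, terminating zero included. */
size_t mes_total_bytes(void)
{
    return sizeof mes;      /* 14 */
}

/* The number of bytes a reader actually sees on the console. */
size_t mes_visible_bytes(void)
{
    return strlen(mes);     /* 13 */
}
```

So hello.s hands 14 bytes to sys_write, one of them being the invisible zero byte. Any length above 14 makes the kernel read whatever happens to follow the string in memory.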
2. Our second example (based on a corrected example from the NASM distribution) is a bit more interesting since we are going
to call assembly functions and use variables defined in libgeo.s from a C application (testlib.c).
Here's the assembly code first:
    ; libgeo.s
            GLOBAL lrotate
            GLOBAL printasm
            GLOBAL asmstr
            GLOBAL textptr
            GLOBAL selfptr
            GLOBAL integer
            EXTERN printf
            COMMON commvar 4        ; size in bytes (stored as a dword below)

            SECTION .text

    ; prototype: long lrotate(long x, int num)
    lrotate:
            push ebp
            mov ebp,esp
            mov eax,[ebp+8]         ; x
            mov ecx,[ebp+12]        ; num
    .rot:   rol eax,1
            loop .rot
            mov esp,ebp
            pop ebp
            ret

    ; prototype: void printasm(void)
    printasm:
            mov eax,[integer]
            inc eax
            mov [localint],eax      ; localint = integer + 1
            inc eax
            mov [commvar],eax       ; commvar = integer + 2
            push dword [commvar]    ; pushing 3 int variables (commvar, localptr, integer)
            mov eax,[localptr]      ; on stack in reverse order using eax
            push dword [eax]
            push dword [integer]
            push dword printfstr    ; pushing printf format string
            call printf
            add esp,16
            ret

            SECTION .data

    ; long string with new line symbol 0xa
    asmstr  db 'Hello George,'
            db ' how'
            db ' are'
            db ' you!'
            db 0xa, 0xa, 0

    ; format string for printf,
    ; less than 4 int in db will result in alignment warning from gcc
    printfstr db 'integer=%d, localint=%d, commvar=%d'
              db 0xa, 0xa, 0xa, 0

    integer  dd 5

    ; local pointers
    localptr dd localint            ; localptr points to localint
    textptr  dd printasm            ; textptr points to printasm()
    selfptr  dd selfptr             ; points to itself

            SECTION .bss

    ; uninitialized integer
    localint resd 1
There are two functions implemented in assembly language here (lrotate and printasm),
several pointers and integers and a format string. Since we are going to use them
in C code they are declared as GLOBAL. We also make use of the libc printf function,
which is declared as EXTERN. Some other things worth noticing here:
when we need to call a function with parameters we push them on the stack first,
in reverse order, then push the format string and finally call the function (printf);
after the call we remove the arguments from the stack ourselves (add esp,16).
Long strings can be split into several parts using the db pseudo-instruction, with
formatting bytes such as line feed (0xa or 10) and a terminating zero at the end.
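As a cross-check of what lrotate computes, here is a plain C version of the same rotate-one-bit-at-a-time loop. It is a reference sketch, not the code used in the tutorial; the name lrotate_ref is made up, and it uses uint32_t to match the 32-bit eax register.

```c
#include <stdint.h>

/* Reference version of lrotate: rotate x left by num bits, one bit per
 * iteration, mirroring the "rol eax,1 / loop .rot" pair in libgeo.s. */
uint32_t lrotate_ref(uint32_t x, unsigned num)
{
    while (num-- > 0) {
        /* the bit leaving at the top re-enters at the bottom */
        x = (x << 1) | (x >> 31);
    }
    return x;
}
```

lrotate_ref(0x40000, 4) gives 0x00400000 and lrotate_ref(0x40000, 14) gives 0x00000001, the values the test program expects. One difference to keep in mind: with num equal to 0 this sketch returns immediately, while the loop instruction decrements ecx before testing it and would therefore run about 2^32 times.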
The C application (testlib.c) that makes use of our assembly code is quite straightforward:
    #include <stdio.h>
    #include <stdlib.h>

    long lrotate(long, int);
    void printasm(void);

    // variables defined in libgeo.s
    // asmstr without [] - Segmentation fault
    extern char asmstr[];
    extern void *textptr;
    extern void *selfptr;
    extern int integer;

    int main(void)
    {
        printf("\nTesting lrotate: expect 0x00400000 and 0x00000001\n");
        printf("lrotate(0x00040000, 4) = 0x%08lx\n", lrotate(0x40000, 4));
        printf("lrotate(0x00040000, 14) = 0x%08lx\n\n", lrotate(0x40000, 14));
        printf("Pointers from asm:\n");
        printf("textptr = %p, selfptr = %p\n", textptr, selfptr);
        printf("&printasm() = %p\n\n", (void *)&printasm);
        printf("Long assembly string: %s", asmstr);
        printf("Printasm: ");
        printasm();
        exit(0);
    }
We declare all assembly functions and variables first, using extern for the variables so that
the C compiler refers to the definitions in libgeo.s instead of creating its own (without extern,
recent GCC versions report multiple definitions at link time). Notice that asmstr, which is
a multi-part string in assembly, is declared as a char array. If you delete the brackets and
declare it as a pointer, the program will crash with "Segmentation fault", because the label names
the string bytes themselves, not a pointer to them. With extern no array length is needed.
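The crash is easier to understand once we look at what a pointer declaration would make the compiler do with the bytes at the asmstr label. The following sketch is an illustration only: asm_bytes is a stand-in copy of the assembly string and both function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the bytes that the asmstr label names in libgeo.s. */
static const char asm_bytes[] = "Hello George, how are you!\n\n";

/* Array view: the symbol itself is the address of the first character. */
const char *bytes_as_array(void)
{
    return asm_bytes;
}

/* Pointer view: the first sizeof(char *) bytes stored at the symbol are
 * loaded and treated as an address. Here they are text, not an address. */
uintptr_t bytes_misread_as_pointer(void)
{
    uintptr_t bogus = 0;
    memcpy(&bogus, asm_bytes, sizeof bogus);
    return bogus;
}
```

On 32-bit little-endian x86 the four bytes 'H', 'e', 'l', 'l' read this way give 0x6c6c6548. That is the "address" printf would try to follow for %s if asmstr were declared as char *, which is why the program dies with a segmentation fault.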
To build and run the program enter the following on the command line:
    geo@fermat:/home/work/asm/nasm$ nasm -f elf libgeo.s
    geo@fermat:/home/work/asm/nasm$ gcc -o testlib testlib.c libgeo.o
    geo@fermat:/home/work/asm/nasm$ l libgeo* testlib*
    -rw-r--r-- 1 geo geo 1468 Feb 15 23:13 libgeo.a
    -rw-r--r-- 1 geo geo 1168 Feb 17 17:16 libgeo.o
    -rw-r--r-- 1 geo geo 1239 Feb 17 14:36 libgeo.s
    -rwxr-xr-x 1 geo geo 2748 Feb 15 23:13 libgeo.so
    -rwxr-xr-x 1 geo geo 5548 Feb 17 17:16 testlib
    -rw-r--r-- 1 geo geo  816 Feb 17 11:14 testlib.c
    -rw-r--r-- 1 geo geo 1684 Feb 15 23:13 testlib.o
    geo@fermat:/home/work/asm/nasm$ ./testlib

    Testing lrotate: expect 0x00400000 and 0x00000001
    lrotate(0x00040000, 4) = 0x00400000
    lrotate(0x00040000, 14) = 0x00000001

    Pointers from asm:
    textptr = 0x8048511, selfptr = 0x8049858
    &printasm() = 0x8048511

    Long assembly string: Hello George, how are you!

    Printasm: integer=5, localint=6, commvar=7

    geo@fermat:/home/work/asm/nasm$
3. The third program we describe here illustrates the use of command line parameters in assembly language
as well as some basic error checking. It prints the first n numbers of the famous Fibonacci sequence.
Here's the program:
    ;------------------------------------------------------------------------------
    ; fibonacci.s
    ;
    ; George Matveev
    ;
    ; www.matveev.se
    ;------------------------------------------------------------------------------
            global _start
            extern printf
            extern atoi
            extern exit

            section .text
    _start:
            ; [esp]   contains total number of command line arguments
            ; [esp+4] contains pointer to the name of the program
            ; [esp+8] contains pointer to the first parameter

            ; compare nbr of arguments with 2
            mov eax, [esp]
            cmp eax, 2
            jne badarg

            ; print title with program name and argument
            mov eax, [esp+4]        ; name of the program
            mov ecx, [esp+8]        ; first parameter (string)
            push ecx
            push eax
            push title
            call printf
            add esp, 4
            pop eax
            pop ecx

            push ecx
            call atoi
            add esp, 4
            mov ecx, eax            ; ecx is a counter

            ; start fibonacci calculus
            xor eax, eax            ; first number
            inc eax                 ; eax is 1
            xor ebx, ebx            ; second number
            inc ebx                 ; ebx is 1
    print:
            push eax                ; save current number and counter
            push ecx                ; since printf will need those registers
            push eax                ; number to be printed
            push format             ; format string
            call printf
            add esp, 8              ; get out of stack
            pop ecx                 ; restore counter
            pop eax                 ; and current number
            mov edx, eax            ; save the current number
            mov eax, ebx            ; next number is now current
            add ebx, edx            ; get the new next number
            dec ecx                 ; count down
            jnz print               ; if ecx is not zero print more
            push dword 0            ; exit status 0
            call exit
    badarg:
            push eax
            push badArg
            call printf
            add esp, 8
            push dword 1            ; exit status 1
            call exit

            section .data
    format  db '%10d', 0xa, 0
    badArg  db 'wrong number of args: %5d', 0xa, 0
    title   db 'program %s is running with arg %s:', 0xa, 0
First we declare the external C functions we need (printf, atoi and exit),
then we read the total number of command line parameters (including the program name)
from the top of the stack ([esp]). If this number is not two (that is, the program name and an integer),
we go to the badarg label and print an error message which contains the (wrong) number of parameters.
If the input is ok, we print the program name and the parameter entered, convert the string value
of the parameter to int using atoi, initialize the eax and ebx registers to 1 using xor and inc, and start
the Fibonacci calculation in the print loop, printing each new member with printf.
Because printf may overwrite eax and ecx, we save them on the stack around each call.
When the counter (ecx) becomes zero we call the C exit function.
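The register shuffle inside the print loop may be easier to follow in C. This is a sketch under stated assumptions, not the tutorial's code: fib_fill is a hypothetical name, the variables are named after the registers they stand for, and the numbers are stored in an array instead of being printed.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill out[0..n-1] with the first n Fibonacci numbers (1, 1, 2, 3, ...),
 * following the same steps as the print loop in fibonacci.s. */
void fib_fill(uint32_t *out, size_t n)
{
    uint32_t cur = 1;    /* eax: number printed in this iteration */
    uint32_t next = 1;   /* ebx: number printed in the next iteration */

    for (size_t i = 0; i < n; i++) {
        out[i] = cur;            /* call printf */
        uint32_t prev = cur;     /* mov edx, eax */
        cur = next;              /* mov eax, ebx */
        next += prev;            /* add ebx, edx */
    }
}
```

Unlike this sketch, the assembly loop prints before it tests the counter, so an argument of 0 (or text that atoi turns into 0) still prints one number, after which dec ecx wraps around and the loop keeps going. Also note that %10d prints signed 32-bit values, so only the first 46 members fit without overflow.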
Here's how you can build the program:
    geo@fermat:/home/work/asm/nasm$ nasm -f elf fibonacci.s
    geo@fermat:/home/work/asm/nasm$ ld -dynamic-linker /lib/ld-linux.so.2 -o fibonacci -lc fibonacci.o
    geo@fermat:/home/work/asm/nasm$ l fibonacci*
    -rwxr-xr-x 1 geo geo 2238 Feb 17 17:36 fibonacci
    -rw-r--r-- 1 geo geo  976 Feb 17 17:36 fibonacci.o
    -rw-r--r-- 1 geo geo 1552 Feb 17 14:47 fibonacci.s
    geo@fermat:/home/work/asm/nasm$
Notice that this time we tell ld which dynamic linker (/lib/ld-linux.so.2) the ELF executable
should use and link it against libc with -lc.
This is a required step if you want to keep your executable slim and load
all required libraries (libc in this case) dynamically at runtime.
Let's run the program (first with the wrong number of parameters):
    geo@fermat:/home/work/asm/nasm$ ./fibonacci 5 6
    wrong number of args:     3
    geo@fermat:/home/work/asm/nasm$ ./fibonacci 22
    program ./fibonacci is running with arg 22:
             1
             1
             2
             3
             5
             8
            13
            21
            34
            55
            89
           144
           233
           377
           610
           987
          1597
          2584
          4181
          6765
         10946
         17711
    geo@fermat:/home/work/asm/nasm$
So when we provide the wrong number of parameters (2 instead of 1, which the program
counts as 3 arguments including its own name) our error check is triggered and
the program exits after printing a message.
And if the input is ok we get the first n members of the Fibonacci sequence,
where each number is the sum of the previous two.
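The check above only counts the arguments; it does not look at what the argument contains, and atoi silently returns 0 for text that is not a number. A stricter check could look like the sketch below. It swaps atoi for strtol, which is not what fibonacci.s does, and parse_count is a hypothetical helper name.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse text as a positive decimal count.
 * Returns 0 and stores the value in *count on success, -1 otherwise. */
int parse_count(const char *text, long *count)
{
    char *end = NULL;

    errno = 0;
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0') {
        return -1;               /* no digits at all, or trailing garbage */
    }
    if (errno == ERANGE || value < 1) {
        return -1;               /* out of range, zero or negative */
    }
    *count = value;
    return 0;
}
```

With this helper "22" is accepted, while "abc", "5x", "0" and "-3" are rejected before the loop ever starts. The same idea can be carried over to assembly by calling strtol instead of atoi and testing its results.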
You can download source code and binaries of examples from here.
For this tutorial I used NASM 2.09 on Debian Squeeze.
    geo@fermat:/home/work/asm/nasm$ nasm -v
    NASM version 2.09.04 compiled on Feb 12 2011
    geo@fermat:/home/work/asm/nasm$
Part 2 of NASM tutorial: NASM file operations with Linux system calls, debugging.