Assembly Programming

Alexander Neville

2024-01-23

Assembly languages include the panoply of low-level, architecture-specific systems languages. There is no single assembly language specification, though such languages generally have in common a sequential structure and an immediate correspondence between their mnemonic or abbreviated assembly instructions and the architecture’s machine code instructions.

Sections

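Assembly programs are divided into sections, which provide information to the linker and ensure that data in a process is arranged sensibly in memory. Three common sections are:

.data, which holds initialised data
.bss, which reserves space for uninitialised data
.text, which holds the program's instructions

These sections are introduced with the statements: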

section .data
section .bss
section .text

The .text section should define the label _start and declare it global. This label marks the start of the program and is the entry point that the linker looks for by default when it creates executable machine code. _start is an example of a label.

section .text

    global _start

_start:
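
To show the three sections working together, here is a minimal sketch of a complete program. The names initial and buffer are hypothetical, chosen only for illustration, and the movzx instruction and byte size keyword are not otherwise covered in this document. The program copies one byte from .data into space reserved in .bss and uses it as the exit status.

```nasm
section .data

    initial db 42               ; one initialised byte (hypothetical name)

section .bss

    buffer resb 1               ; one reserved, uninitialised byte (hypothetical name)

section .text

    global _start

_start:

    mov al, [initial]           ; read the initialised byte from .data
    mov [buffer], al            ; store it in the space reserved in .bss

    movzx rdi, byte [buffer]    ; read it back, zero-extended, as the exit status
    mov rax, 60                 ; sys_exit
    syscall
```

Assembled and linked as described under Assembling a Program, this should exit with status 42, which a shell reports with echo $?.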

Registers

Assembly languages allow the processor's registers to be addressed directly. In the x86_64 architecture, each general-purpose register is 64 bits wide. Narrower registers are available as the low-order part of each 64 bit register. These registers have the names:

8 bit   16 bit  32 bit  64 bit
al      ax      eax     rax
bl      bx      ebx     rbx
cl      cx      ecx     rcx
dl      dx      edx     rdx
sil     si      esi     rsi
dil     di      edi     rdi
bpl     bp      ebp     rbp
spl     sp      esp     rsp
r8b     r8w     r8d     r8
r9b     r9w     r9d     r9
r10b    r10w    r10d    r10
r11b    r11w    r11d    r11
r12b    r12w    r12d    r12
r13b    r13w    r13d    r13
r14b    r14w    r14d    r14
r15b    r15w    r15d    r15
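
Because the narrower names refer to parts of the same physical register, writing to one changes the 64 bit register. The sketch below illustrates this under one assumption worth stating: on x86_64, a write to a 32 bit register clears the upper 32 bits of the full register, whereas 8 bit and 16 bit writes leave the remaining bits untouched.

```nasm
section .text

    global _start

_start:

    mov rax, 0x1122334455667788 ; fill all 64 bits of rax
    mov al, 0xFF                ; only the low byte changes:     rax = 0x11223344556677FF
    mov ax, 0x0001              ; only the low 16 bits change:   rax = 0x1122334455660001
    mov eax, 0x00000007         ; a 32 bit write zeroes the top: rax = 0x0000000000000007

    mov rdi, rax                ; use the final value (7) as the exit status
    mov rax, 60                 ; sys_exit
    syscall
```

Run from a shell, the exit status reported by echo $? should be 7.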

Pointers

Pointers are another type of on-chip register. They are used to store memory addresses. These are the most fundamental pointers:

16 bit  32 bit  64 bit  Purpose
IP      EIP     RIP     Holds the address of the next instruction to be fetched.
SP      ESP     RSP     Holds the address of the top of the stack.
BP      EBP     RBP     Holds the address of the base of the current stack frame.

Many of these registers are used to manage control flow, which is typically sequential. During normal operation, RIP advances by the length of each instruction as it is executed (x86 instructions vary in length), so that the program progresses to the next instruction.

Flags

Flags, like registers, are a type of on-chip data storage. Unlike other registers, a flag holds a single bit. Each flag is part of a larger register, the status register. Like other registers, flags are referred to with unique mnemonic identifiers within assembly programs.

Symbol  Purpose
CF      carry bit
PF      parity bit
ZF      zero
SF      sign bit
OF      overflow
AF      adjust
IF      interrupt enabled
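
As an illustration of how flags record the outcome of an instruction, the sketch below adds 1 to the 8 bit value 255. The result wraps around to zero, which sets both CF and ZF. The setc and setz instructions are not covered elsewhere in this document; they are used here on the assumption that the reader's assembler accepts them (NASM does), and each copies one flag into a byte register.

```nasm
section .text

    global _start

_start:

    mov al, 255                 ; largest unsigned 8 bit value
    add al, 1                   ; result wraps to 0: CF = 1 (carry out), ZF = 1 (result is zero)

    setc bl                     ; bl = 1 if CF is set, otherwise 0
    setz cl                     ; cl = 1 if ZF is set, otherwise 0

    movzx rdi, bl               ; zero-extend the carry indicator into rdi
    movzx rcx, cl               ; zero-extend the zero indicator into rcx
    add rdi, rcx                ; rdi = CF + ZF = 2

    mov rax, 60                 ; sys_exit, with the sum as the exit status
    syscall
```

The exit status should be 2, showing that both flags were set by the addition.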

Mathematical Operations

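Here is a table detailing some of the available arithmetic operations. rXX means any 64 bit register. b could be a register or a hard-coded value. The mul and div instructions are unsigned and use rdx alongside rax: mul leaves the high half of the product in rdx, and div divides the 128 bit value rdx:rax, leaving the remainder in rdx.

Operation  Parameters  Description
add        rXX, b      rXX = rXX + b
sub        rXX, b      rXX = rXX - b
mul        rXX         rdx:rax = rax * rXX
div        rXX         rax = rdx:rax / rXX, rdx = remainder
neg        rXX         rXX = -rXX
inc        rXX         rXX = rXX + 1
dec        rXX         rXX = rXX - 1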
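
The div and mul instructions involve rdx as well as rax, which is easy to overlook. The following sketch divides 100 by 7 and then rebuilds 100 from the quotient and remainder; the choice of rbx and rcx as scratch registers is arbitrary.

```nasm
section .text

    global _start

_start:

    mov rax, 100                ; dividend (low half of rdx:rax)
    mov rdx, 0                  ; div uses rdx:rax, so clear the high half first
    mov rbx, 7                  ; divisor
    div rbx                     ; rax = 14 (quotient), rdx = 2 (remainder)

    mov rcx, rdx                ; keep the remainder, because mul overwrites rdx
    mul rbx                     ; rdx:rax = 14 * 7, so rax = 98 and rdx = 0
    add rax, rcx                ; 98 + 2 = 100

    mov rdi, rax                ; use the result as the exit status
    mov rax, 60                 ; sys_exit
    syscall
```

The exit status should be 100. Forgetting to zero rdx before div is a common source of wrong quotients or a divide error.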

Labels & Jumps

Labels name the address of an instruction in memory. They are used in conjunction with jump commands to manipulate the control flow of a program. _start is an example of a label. When a jump command is encountered, the address of the instruction associated with the label is loaded into RIP and hence the control flow is changed. The syntax of a jump is illustrated below:

_start:

    jmp _start

This program performs an infinite loop.

Comparisons & Conditional Jumps

Used with jump commands, comparisons introduce complex flow control into assembly programs. In the examples here, comparisons are drawn between one register and either a literal operand or another register; the cmp instruction also accepts a memory operand.

cmp r15, 37
cmp r15, r14

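After a comparison is made, flags are set in the status register. Conditional jumps are made depending on the state of these flags, so a jump command may directly follow a comparison operation. Here are some common conditional jump commands, where a and b are the first and second operands of the preceding cmp. jg, jge, jl and jle treat the operands as signed values.

Symbol  Purpose
je      jump if a = b
jne     jump if a != b
jg      jump if a > b
jge     jump if a >= b
jl      jump if a < b
jle     jump if a <= b
jz      jump if the last result was zero (ZF set)
jnz     jump if the last result was not zero (ZF clear)
jo      jump if overflow occurred (OF set)
jno     jump if overflow did not occur (OF clear)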
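
A comparison followed by a conditional jump back to an earlier label is the usual way to build a counted loop. As a minimal sketch, the program below sums the integers 1 to 10; the label _sum is a hypothetical name used only in this example.

```nasm
section .text

    global _start

_start:

    mov rax, 0                  ; running total
    mov rcx, 1                  ; counter, starting at 1

_sum:

    add rax, rcx                ; add the counter to the total
    inc rcx                     ; move on to the next integer
    cmp rcx, 10
    jle _sum                    ; repeat while the counter is <= 10

    mov rdi, rax                ; total (55) becomes the exit status
    mov rax, 60                 ; sys_exit
    syscall
```

The exit status should be 55, the sum of 1 through 10.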

System Calls

A system call is a request made of the operating system kernel to service the program. The nature and identity of these calls are operating system dependent. In an assembly language program, a system call will also have parameters. Operands are passed to the kernel by placing them in a number of registers, specified in the table below. The ID of the system call is placed in the register rax and the list of operands should be placed in order in the subsequent registers.

Argument Register
ID rax
1 rdi
2 rsi
3 rdx
4 r10
5 r8
6 r9

This table applies to Linux on the 64 bit x86 architecture. A full list of the system calls supported by Linux is given in the syscalls(2) manual page. With the operands placed in the correct registers, the syscall instruction is issued to trap into the kernel.
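
A system call also hands a result back to the program: on x86_64 Linux the return value is placed in rax, and the syscall instruction itself overwrites rcx and r11. The sketch below, which assumes standard output is writable, issues sys_write (ID 1) and then passes the number of bytes written to sys_exit (ID 60). The names message and message_length are hypothetical.

```nasm
section .data

    message db "ok",10          ; three bytes: 'o', 'k' and a newline
    message_length equ $ - message

section .text

    global _start

_start:

    mov rax, 1                  ; syscall ID 1: sys_write
    mov rdi, 1                  ; argument 1: file descriptor 1 (standard output)
    mov rsi, message            ; argument 2: address of the bytes to write
    mov rdx, message_length     ; argument 3: number of bytes to write
    syscall                     ; rax now holds the return value (bytes written, or a negative error)

    mov rdi, rax                ; argument 1 of sys_exit: the return value of sys_write
    mov rax, 60                 ; syscall ID 60: sys_exit
    syscall
```

If the write succeeds in full, the program prints ok and exits with status 3.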

Assembling a Program

Commonly encountered x86 assemblers for the Linux platform include GNU AS and NASM. The first step in building an executable is producing the object code. Here, the -f flag and elf64 option set the format of the output object file to the 64 bit Executable and Linkable Format.

nasm -f elf64 -o example.o example.asm

As with the compilation of a program written in a more abstract language, linking is required to generate the executable.

ld example.o -o example

Examples

The examples presented here are written in Intel x86 syntax as understood by NASM.

Hello World

This example prints a string of known length, created and initialised in the .data section.

section .data

    greeting db "Hello, world!",10
    length_of_greeting equ $ - greeting ; find the length of the string.

section .text

    global _start

_start:

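    mov rax, 1                  ; the syscall ID (1, sys_write) is stored in the rax register
    mov rdi, 1                  ; rdi holds the first argument: file descriptor 1 (standard output)
    mov rsi, greeting           ; rsi holds the second argument: the address of the string
    mov rdx, length_of_greeting ; rdx holds the third argument: the number of bytes to write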
    syscall

    mov rax, 60
    mov rdi, 0
    syscall                     ; exit the program

Assembled and executed, this program outputs:

Hello, world!

Jumps & Loops

This program features an example of the jmp instruction, which is not conditional. Consequently, this program does not exit.

section .data
    greeting db "Hello, World!",10
    length_of_greeting equ $ - greeting

section .text
    global _start

_start:
    mov rax, 1
    mov rdi, 1
    mov rsi, greeting
    mov rdx, length_of_greeting
    syscall
    jmp _start                  ; jump back to _start unconditionally

    mov rax, 60                 ; never reached: the jmp above always branches
    mov rdi, 0
    syscall

Assembled and executed, this program prints the same message as before indefinitely:

Hello, World!
Hello, World!
Hello, World!
...

Conditional Statements

Here is an example of a conditional statement. The contents of r15 and r14 are set to 1 and 3 respectively. The contents of the two registers are compared and the program will jump conditionally.

section .data

    equal db "equal to",10
    length_equal equ $ - equal
    less db "less than",10
    length_less equ $ - less
    more db "more than",10
    length_more equ $ - more

section .text

    global _start

_start:

    mov r15, 1
    mov r14, 3
    cmp r15, r14
    jl _less
    jg _more

    mov rax, 1
    mov rdi, 1
    mov rsi, equal
    mov rdx, length_equal
    syscall
    call _exit

_less:

    mov rax, 1
    mov rdi, 1
    mov rsi, less
    mov rdx, length_less
    syscall
    call _exit

_more:

    mov rax, 1
    mov rdi, 1
    mov rsi, more
    mov rdx, length_more
    syscall
    call _exit

_exit:

    mov rax, 60
    mov rdi, 0
    syscall

In this case, the value in r15 is less than that in r14, so the program follows the jl instruction and begins executing from the label _less, outputting:

less than

In the previous examples, the message printed by the assembly program was of known length and initialised in the .data section. This program will print an integer in decimal format, using as many characters as necessary. The code is commented and divided into labelled subroutines.

section .bss

    string resb 100             ; hold the string itself
    position resb 8             ; hold the current position along the string

section .text

    global _start

_start:

    ; use rax register as the value will be divided repeatedly
    mov rax, 43                 ; put a number to print in rax
    call _print                 ; call the print subroutine
    mov rax, 1037                ; put a number to print in rax
    call _print                 ; call the print subroutine
    call _exit                  ; call the exit subroutine

_print:                         ; define a subroutine which prints the value in rax.

    mov rcx, string
    mov byte [rcx], 10          ; put a newline character at the beginning of the string
    inc rcx                     ; increment the position along the string
    mov [position], rcx         ; store the position in memory

_reverse_number:

    mov rdx, 0                  ; zero the rdx register before div/mod operation
    mov rbx, 10                 ; converting to base 10, so divide by 10 each iteration.
    div rbx                     ; divide rdx:rax by rbx (10); quotient in rax, remainder in rdx
    add rdx, 48                 ; to get the ascii value of the digit, add 48

    mov rcx, [position]
    mov [rcx], dl               ; move the least significant byte of rdx to the address held by rcx
    inc rcx                     ; increment the position along the string.
    mov [position], rcx         ; store the position back in memory

    cmp rax, 0                  ; if there are whole parts left after division, loop again.
    jne _reverse_number

_display:

    mov rcx, [position]
    dec rcx                     ; step back to the most recently stored character
    mov [position], rcx

    mov rax, 1
    mov rdi, 1
    mov rsi, rcx                ; address of the single character to print
    mov rdx, 1
    syscall

    mov rcx, [position]         ; reload the position: syscall does not preserve rcx
    cmp rcx, string
    jg _display                 ; if the position is not yet back at the start, print the next character.

    ret                         ; end of subroutine, value of rax has been printed.

_exit:

    mov rax, 60
    mov rdi, 0
    syscall                     ; exit the program with sys_exit

Assembled and executed, this program outputs:

43
1037

Along the same lines as the last example, this example prints a string whose length is not known when the program is assembled. The length is found at run time by searching for the zero byte that terminates the string.

section .data

    test_string_1 db "Hello, world! My name is Alexander.",10,0 ; define a test string
    test_string_2 db "Goodbye!",10,0 ; define a test string

section .text

    global _start

_start:

    mov r15, test_string_1      ; load the address of a string into r15
    call _print                 ; call the _print subroutine
    mov r15, test_string_2      ; load the address of a string into r15
    call _print                 ; call the _print subroutine
    jmp _exit                   ; exit the program

_print:

    push r15                    ; put the beginning of the string on the stack
    mov rbx, 0                  ; keep track of the length

_iteration:

    mov cl, [r15]               ; copy the character at r15 into cl
    cmp cl, 0
    je _write                   ; stop at the terminating zero byte (also handles an empty string)
    inc r15
    inc rbx
    jmp _iteration              ; otherwise move on to the next character

_write:

    mov rax, 1                  ; put together a sys_write call
    mov rdi, 1
    pop rsi                     ; retrieve the start of the string from the stack
    mov rdx, rbx                ; copy the final length of the string into the rdx register
    syscall

    ret                         ; return to the caller

_exit:

    mov rax, 60
    mov rdi, 0
    syscall                     ; exit the program with sys_exit

Assembled and executed, this program outputs:

Hello, world! My name is Alexander.
Goodbye!

Fibonacci Numbers

Using many of the routines defined above, this example calculates and prints some Fibonacci numbers.

section .bss

    string resb 100             ; hold the string itself
    position resb 8             ; hold the current position along the string

section .text

    global _start

_start:

    mov r13, 1                  ; the current number
    mov r14, 0                  ; the last number
    mov r15, 0                  ; initialise a counter

_loop:

    mov r12, r13                ; backup the value in r13
    add r13, r14                ; add the previous number to this number
    mov r14, r12                ; the backed-up value becomes the previous number
    mov rax, r13                ; move the current number into rax
    call _print                 ; call the print subroutine
    inc r15                     ; increment the counter
    cmp r15, 10
    jl _loop                    ; loop if iterations < 10
    call _exit                  ; call the exit subroutine

_print:                         ; define a subroutine which prints the value in rax.

    mov rcx, string
    mov byte [rcx], 10          ; put a newline character at the beginning of the string
    inc rcx                     ; increment the position along the string
    mov [position], rcx         ; store the position in memory

_reverse_number:

    mov rdx, 0                  ; zero the rdx register before div/mod operation
    mov rbx, 10                 ; converting to base 10, so divide by 10 each iteration.
    div rbx                     ; divide rdx:rax by rbx (10); quotient in rax, remainder in rdx
    add rdx, 48                 ; to get the ascii value of the digit, add 48

    mov rcx, [position]
    mov [rcx], dl               ; move the least significant byte of rdx to the address held by rcx
    inc rcx                     ; increment the position along the string.
    mov [position], rcx         ; store the position back in memory

    cmp rax, 0                  ; if there are whole parts left after division, loop again.
    jne _reverse_number

_display:

    mov rcx, [position]
    dec rcx                     ; step back to the most recently stored character
    mov [position], rcx

    mov rax, 1
    mov rdi, 1
    mov rsi, rcx                ; address of the single character to print
    mov rdx, 1
    syscall

    mov rcx, [position]         ; reload the position: syscall does not preserve rcx
    cmp rcx, string
    jg _display                 ; if the position is not yet back at the start, print the next character.

    ret                         ; end of subroutine, value of rax has been printed.

_exit:

    mov rax, 60
    mov rdi, 0
    syscall                     ; exit the program with sys_exit

This program prints the first ten numbers of the Fibonacci sequence (starting with 1 and 2):

1
2
3
5
8
13
21
34
55
89
