Personal.X-Istence.com

Bert JW Regeer (畢傑龍)

Happenings and assembly

School has started again, and I am taking six classes. I am now really gearing up to finally leave UAT and go out into the work force. I have this semester and two more to go, and then comes the decision: graduate school or work? I am not entirely sure yet which I want to do, but that is what life is all about: trying to figure out what would be best, possibly finding out it was a mistake, and growing and learning from it.

Other things have been going on lately as well: I have been spending a lot of time behind a computer screen programming. The intellectual challenges it provides are very enjoyable and a great way to stay mentally fit. As a perfectionist by nature, I find it really hard to start working on something and then put it down to go to sleep; it has to be perfect. That is probably also the reason why I now have a 300 line patch set instead of 500 lines. Yay for removing redundant code, and re-thinking the problem/solution.

Another thing I have been working on is a challenge that my roommate Victor (Pancho) gave me. It requires recursion to solve. In itself this does not generally present a problem, as the maximum recursion depth is going to be limited to a fairly reasonable number; however, in C/C++ there is still a massive amount of overhead when doing a function call. This includes things like setting up the stack frame, creating local variables, and saving certain state: extra garbage that you want to eliminate, especially in tight loops. The problem, however, cannot be solved using for/while loops, or at least not in any way that I can figure out (if you want to know what it is that I am working on, contact me. Victor is going to make the challenge public later so I don't want to spoil it for everyone). So I have been writing some self-modifying code that unrolls loops on the fly. I will post code sometime in the near future when I have it partially working; I am not entirely sure yet how I want to proceed right now. More on that later!

Here is a bit of Intel assembly written in AT&T syntax for GCC to grok for Mac OS X/FreeBSD machines that use a stack based syscall infrastructure:

__asm__("pushl  $21;\n"
    "pushl  %0;\n"
    "pushl  $0x1;\n"
    "movl   $4, %%eax;\n"
    "pushl  %%eax;\n"
    "int    $0x80;\n"
    "addl   $16, %%esp;\n"
    :
    : "r" (answer)
    : "%eax");

Note that the length of the string is 21 characters (20 characters plus a newline, which I appended before printing), and that answer is the char *. See my previous Intel post to see what the code does, and why. It is exactly the same printing routine; the only difference is the assembly syntax. It really is too bad that GCC for Darwin does not grok .intel_syntax in inline assembly; it would have made things much easier.
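
For comparison, here is roughly what that routine asks the kernel to do, sketched in plain C. This is only an illustration, assuming a POSIX system; print_bytes is a made-up helper name, not something from my original routine. The assembly pushes its arguments in reverse (length, then pointer, then file descriptor), which lines up with the left-to-right parameter order of write(2):

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: the libc-level equivalent of the inline assembly.
 * Returns the number of bytes written, or -1 on error, just like write(2). */
ssize_t print_bytes(int fd, const char *buf, size_t len)
{
    /* The assembly version pushes len, buf and fd, then the syscall
     * number 4, and traps into the kernel with int 0x80. The libc
     * wrapper hides all of that behind an ordinary function call. */
    return write(fd, buf, len);
}
```

With this, the first assembly snippet would correspond to something like print_bytes(1, answer, 21).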

You should be able to modify it to accept an integer to push onto the stack, however I have been unable to figure out why the following is not working:

__asm__("pushl  %0;\n"
    "pushl  %1;\n"
    "pushl  $0x1;\n"
    "movl   $4, %%eax;\n"
    "pushl  %%eax;\n"
    "int    $0x80;\n"
    "addl   $16, %%esp;\n"
    :
    : "r" (length), "r" (answer)
    : "%eax");

This assumes length is an integer and answer is a character pointer. It is kind of frustrating, because there are not many good resources out there on the web that explain inline assembly for GCC. The few that I have found have been rather vague and have not explained very much. Maybe I am just using the wrong words, and/or my Google-Fu is not strong enough.

Speaking of inline assembly, I have heard from some friends who are running 64-bit systems that if you compile 64-bit binaries for Windows in Visual C++ 2008 you are not allowed to use inline assembly; support for it has been removed. Does that seem wrong to anyone else? It seems like it would frustrate many developers of tight code that runs many times faster in hand-written assembly than as compiled code. Sure, compilers have been getting better for ages, but in assembly the programmer still has a lot more control over the CPU, and over where time is going to be spent, than any other way.

Oh, Erlang is my new shiny programming language to learn.

addnumbers.asm updated

If you have no clue what I am talking about, check out my previous post. I have fixed the flaws: the program now parses the arguments it is handed on the command line, converts each one into an integer, adds them together, and prints the result to stdout.

This has been an awesome learning experience for me, especially with regard to how to do recursion and how to debug a pure assembly program with gdb and whatnot.

Here comes the code. As I said before, see my previous post if you don't know what I am talking about; it also has instructions on how to compile the program.

; File: addnumbers2.asm
; Bert JW Regeer
; 
; 2008-01-27
; 
; Function:
;   Add numbers together that are provided as arguments to the program in argv[1] and argv[2].
;
; Known limitations:
;   Floating point numbers will not work.
;   Any input that is larger than an integer will cause overflows, and thus will not work.

section .data

    ; Define some strings that are going to be used throughout the program

    ; This string is to let the user know they failed to provide the proper amount of arguments.
args    db  "Program addnumbers: ", 0xa, 0x9, "addnumbers <number 1> <number 2>", 0xa, 0x9, "Arguments 1 and 2 are required.", 0xa, 0x9, "Anything that will cause addition to overflow an int (2,147,483,647), will fail! :P", 0xa
largs   equ $ - args

    ; This string contains the first part of the output that we are going to send to the terminal.
    ; The answer itself is written separately, from the answer buffer.
msg db  'Answer: ', 0
lmsg    equ $ - msg

num1    dd  0
num2    dd  0

section .bss
    ; This is where I am going to store the output of my conversion from an integer to a char
answer  resb    64

section .text
global start                ; Linker defined entry point. Mac OS X this is start. 
global _start               ; FreeBSD and others _start.

_start:
start:
    push    ebp         ; 
    mov ebp, esp        ; Set up the stack frame

    mov ecx, [ebp + 4]      ; Get argc; below we check that it is at least 3
    mov edx, ebp        ; Put the base pointer into edx, so we can use that in 
                    ; our dereferences coming up
    add edx, 8          ; Add 8. We want to skip ebp and argc

    cmp ecx, 3          ; Check if we have at least 3 arguments to the program. 
                    ; At least two arguments are required, and the 3rd one is 
                    ; the name of the program
    jl  exit            ; If the value in ecx is less than 3, jump to exit

    mov esi, 1          ; Set the index to 1

    mov eax, [edx + esi * 4]    ; Move the pointer to the character array into eax
    push    eax         ; Push eax onto the stack
    push    num1            ; Push the pointer to num1 onto the stack
    call    ctoi            ; Call my char to int function
    add esp, byte 8     ; Put the stack pointer back to where it was.

    inc esi         ; Increase the index

    mov eax, [edx + esi * 4]    ; Move the pointer to the character array into eax
    push    eax         ; Push eax onto the stack
    push    num2            ; Push the pointer to num2 onto the stack
    call    ctoi            ; Call my char to int function
    add esp, byte 8     ; Put the stack pointer back to where it was.

    mov eax, [num1]     ; Move value stored in num1 into eax
    add eax, [num2]     ; Add num2 to eax, this will now be stored in eax

    push    eax         ; Push the new calculated number onto the stack
    call    itoa            ; Convert the integer to a character array

    push    dword lmsg      ; Push the length of the string
    push    msg         ; Push the location of the string in memory
    push    dword 0x1       ; Push the file descriptor to write to
    mov eax,4           ; Move the syscall number into eax
    push    eax         ; Push the syscall onto the stack
    int 0x80            ; Interrupt 80, go to kernel
    add esp, byte 16        ; Clean up the stack

    push    answer          ; Push answer onto the stack
    call    len         ; Get its length (returned in edi)

    push    edi         ; Push the length onto the stack
    push    answer          ; Push the pointer to the character string onto the stack
    push    dword 0x1       ; Push the file descriptor to write to
    mov eax,4           ; Move the syscall number into eax
    push    eax         ; Push the syscall onto the stack
    int 0x80            ; Interrupt 80, go to kernel
    add esp, byte 16        ; Clean up the stack

    jmp done            ; Program is done. Jump to done

exit:
    ; This label is jumped to when we want to exit the program and let the user know how
    ; to run the program. For instance, what parameters to send the program.
                    ; Call sys_write
    push    dword largs     ; Push the length of the string
    push    dword args      ; Push the location of the string in memory
    push    dword 0x1       ; Push the file descriptor to write to
    mov eax,4           ; Move the syscall number into eax
    push    eax         ; Push the syscall onto the stack
    int 0x80            ; Interrupt 80, go to kernel
    add esp, byte 16        ; Clean up the stack

done:
    ; This is the label we jump to when we want to exit the program, we set the exit code
    ; to 0.
                    ; Call sys_exit
    push    dword 0x0       ; Push the value to return to the operating system
    mov eax,1           ; Move the syscall number into eax
    push    eax         ; Push the syscall onto the stack
    int 0x80            ; Interrupt 80, go to kernel

    ; We never return to this function, so no need to clean the stack. :P

ctoi:
    ; char to i. We actually convert entire character arrays to integers.
    ;
    ; We get two parameters on the stack. The first one we grab is the pointer to the place to store
    ; the number. The second is the pointer to the character array.

    push    ebp         ; Push the old base pointer onto the stack
    mov ebp, esp        ; Create a new base pointer
    push    esi         ; Store all the original registers
    push    eax
    push    ebx
    push    ecx 
    push    edx         ; Push edx, so that we can overwrite it
    sub esp, 4          ; We get another storage space on the stack
    mov [esp], dword 10     ; This is the number we are going to multiply by

    mov eax, [ebp + 12]     ; Move the pointer to the character array into eax
    push    eax         ; Push the pointer to the character array onto the stack
    call    len         ; Call the string length function; the length comes back in edi
    add esp, byte 4     ; Reclaim the space we lost when we pushed eax onto the stack

    mov ebx, [ebp + 8]      ; This is where we are going to store the numbers
    mov esi, [ebp + 12]     ; This is the pointer to the character array
    movzx   ecx, di         ; Zero-extend the length that len returned in di into ecx, our loop counter
    mov edi, 0          ; Clean up edi

    ctoi_loop:
    mov eax, [ebx]      ; Move the value stored at the address in ebx into eax
    mul dword [esp]     ; Multiply by 10 to move it over a 10s place (this clobbers edx)
    mov [ebx], eax      ; Move the new number back to the address in ebx

    movzx   eax, byte [esi + edi]   ; Move the character into eax
    movsx   eax, al         ; We just want the lower part of the character

    sub eax, 0x30       ; Subtract 0x30, ASCII 0 so that it is an actual number
    add [ebx], eax      ; Add the new number to the old number that has been multiplied by 10
    inc edi         ; Increase the counter
    loop ctoi_loop          ; Loop until ecx is 0

    add esp, byte 4
    pop edx         ; Restore all the registers
    pop ecx
    pop ebx
    pop eax
    pop esi
    mov esp, ebp        ; Make esp the original base pointer again
    pop ebp         ; Pop the original base pointer into the register
    ret             ; Return caller

itoa:
    ; This is going to convert the integer to a character array. It does not actually call itself:
    ; it loops, and uses the stack to reverse the order of the digits.
    push    ebp         ; Setup a new stack frame
    mov ebp, esp
    push    eax         ; Save the registers
    push    ebx
    push    ecx
    push    edx

    mov eax, [ebp + 8]      ; eax is going to contain the integer
    mov ebx, dword 10       ; This is our "stop" value as well as our value to divide with
    mov ecx, answer     ; Put a pointer to answer into ecx
    push    ebx         ; Push ebx on the field for our "stop" value

    itoa_loop:
    cmp eax, ebx        ; Compare eax, and ebx
    jl  itoa_unroll     ; Jump if eax is less than ebx (which is 10)
    xor edx, edx        ; Clear edx
    div ebx         ; Divide by ebx (10)
    push    edx         ; Push the remainder onto the stack
    jmp itoa_loop       ; Jump back to the top of the loop
    itoa_unroll:            
    add al, 0x30        ; Add 0x30 to the bottom part of eax to make it an ASCII char
    mov [ecx], byte al      ; Move the ASCII char into the memory referenced by ecx
    inc ecx         ; Increment ecx
    pop eax         ; Pop the next variable from the stack
    cmp eax, ebx        ; Compare eax with ebx (10, our "stop" value)
    jne itoa_unroll     ; If they are not equal, we jump back to the unroll loop
                    ; else we are done, and we execute the next few commands
    mov [ecx], byte 0xa     ; Add a newline character to the end of the character array
    inc ecx         ; Increment ecx
    mov [ecx], byte 0       ; Add a null byte to ecx, so that when we pass it to our
                    ; len function it will properly give us a length

    pop edx         ; Restore registers
    pop ecx
    pop ebx
    pop eax
    mov esp, ebp        
    pop ebp
    ret

len:
    ; Returns the length of a string. The string has to be null terminated. Otherwise this function
    ; will fail miserably. 
    ; Upon return. edi will contain the length of the string.

    push    ebp         ; Save the previous stack pointer. We restore it on return
    mov ebp, esp        ; We setup a new stack frame
    push    eax         ; Save registers we are going to use. edi returns the length of the string
    push    ecx

    mov ecx,  [ebp + 8]     ; Move the string pointer into ecx; [ebp + 8] skips the saved ebp and the return address

    mov edi, 0          ; Set the counter to 0. We are going to increment this each loop

    len_loop:           ; Just a quick label to jump to
    movzx   eax, byte [ecx + edi]   ; Move the character to eax.
    movsx   eax, al         ; Move al to eax. al is part of eax.
    inc di          ; Increase di.
    cmp eax, 0          ; Compare eax to 0.
    jnz     len_loop        ; If it is not zero, we jump back to len_loop and repeat.

    dec di          ; Remove one from the count

    pop ecx         ; Restore registers
    pop eax
    mov esp, ebp        ; Set esp back to what ebp used to be.
    pop ebp         ; Restore the stack frame
    ret             ; Return to caller
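
For anyone who finds the assembly hard to follow, the same three routines can be sketched in C. Treat this as a rough translation under a few assumptions rather than a line-for-line port: the function names are made up for this example, the values are treated as unsigned 32-bit numbers, and a small local array with a depth counter stands in for the CPU stack and the pushed "stop" value of 10.

```c
#include <stddef.h>
#include <stdint.h>

/* Counterpart of len: count the bytes up to, but not including, the NUL. */
size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Counterpart of ctoi: for every character, move the running total over
 * a 10s place and add the new digit. Like the assembly, it does no
 * validation: the input must contain only '0'..'9' and must not overflow. */
uint32_t chars_to_uint(const char *s)
{
    uint32_t value = 0;
    size_t n = str_len(s);
    for (size_t i = 0; i < n; i++)
        value = value * 10 + (uint32_t)(s[i] - '0');
    return value;
}

/* Counterpart of itoa: divide by 10 and stack up the remainders, then
 * unwind the stack so the digits come out most significant first.
 * out must have room for 12 bytes: 10 digits, a newline and the NUL. */
void uint_to_chars(uint32_t value, char *out)
{
    char stack[10];     /* a 32-bit value has at most 10 decimal digits */
    size_t depth = 0;

    while (value >= 10) {
        stack[depth++] = (char)('0' + value % 10);
        value /= 10;
    }
    *out++ = (char)('0' + value);   /* what is left is the leading digit */
    while (depth > 0)
        *out++ = stack[--depth];
    *out++ = '\n';                  /* same trailing newline as the assembly */
    *out = '\0';
}
```

Calling uint_to_chars(chars_to_uint("1") + chars_to_uint("5"), buf) leaves "6" followed by a newline in buf, which is what ./addnumbers 1 5 prints after "Answer: ".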

Introduction to Intel Assembly

This semester I am taking a class on Intel assembly, because I want more of an insight into how the computer works, and it will allow me to better reverse engineer new viruses and spyware. The class is also required if one is a Software Engineering major, so that means I have to take it.

The professor who teaches it absolutely sucks at teaching. He gets up in front of the class and mumbles through some PowerPoint slides, which provide no real information, and then goes on and on about his days at Motorola. It really sucks. Oh, the best part is this quote:

"I think that is how Intel processors do it. I don't know I have not read up on it yet"

Well, we had our first assignment: sum two numbers and then output the result to the screen. We were supposed to write inline assembly using Visual Studio C++, but if we are to do an assembly class, then we should learn how to write assembly, not have some parts in assembly and other parts left to the compiler. Sure, it makes things easy, as you get immediate access to the standard C library, but if you want that, you can just link against it.

The following code examples were written on Mac OS X, and will work on FreeBSD. Linux uses a different calling convention for its syscalls, and as such this code will not run on Linux unless it is modified. Do note, you need an Intel Mac for this to work. This is Intel assembly.

Compile the code with (Mac OS X):

nasm -f macho addnumbers.asm
ld -o addnumbers addnumbers.o

or (FreeBSD)

nasm -f elf addnumbers.asm
ld -o addnumbers addnumbers.o

Then you can run it with:

./addnumbers 1 5

As you can see in the comments of the source code, there are still some limitations, but the rest of the source code should be made readable by the comments that are provided.

Porting to Linux: Please double check that all the syscall numbers are the same. There are some differences between Linux and FreeBSD/Mac OS X in that regard.
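
One way to double check the numbers is to ask the system headers instead of trusting a table on the web. The following is a small sketch, assuming a Unix-like system that ships <sys/syscall.h> and the generic syscall() wrapper; the function names are made up for this example, and recent versions of macOS flag syscall() as deprecated, so expect a warning there.

```c
#define _DEFAULT_SOURCE     /* glibc: expose syscall() in strict -std modes */
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: call write by number through the generic wrapper,
 * which is the closest C gets to "put the number in eax and trap". */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}

/* Print the numbers this platform uses for the two syscalls that the
 * assembly hard-codes as 4 (write) and 1 (exit). */
void print_syscall_numbers(void)
{
    printf("SYS_write = %d\n", (int)SYS_write);
    printf("SYS_exit  = %d\n", (int)SYS_exit);
}
```

On 64-bit Linux, for example, this reports 1 and 60 rather than 4 and 1, a good reminder that the numbers depend on the architecture as well as the operating system.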

; File: addnumbers.asm
; Bert JW Regeer
; 
; 2008-01-27
; 
; Function:
;   Add numbers together that are provided as arguments to the program in argv[1] and argv[2].
;
; Known limitations:
;   As of right now, the numbers that are provided may not add up to anything more than 9.
;   This will hopefully be fixed in the next revision. Floating point numbers will not work.
;
; Todo:
;   Write conversion routine, to convert a string of numbers into a real integer on which
;   math may be performed.

section .data

    ; Define some strings that are going to be used throughout the program

    ; This string is to let the user know they failed to provide the proper amount of arguments.
args    db  "You failed to provide the proper amount of arguments", 0xa
largs   equ $ - args

    ; This string contains part of the output that we are going to send to the terminal. The last two
    ; bytes will be filled automatically by the program, before it is output to stdout.
msg db  'Answer: ', 0,  0
lmsg    equ $ - msg

section .text
global start                ; Linker defined entry point. Mac OS X this is start. FreeBSD and others _start.
global _start

_start:
start:

; Start the program here.

    add esp, byte 8     ; We don't care about argc or argv[0]

    pop     ecx         ; Get the first argument or argv[1]
    jecxz   exit            ; If there was no argument. Exit. Let the user know why

    ; Change the number from a character to an actual dword
    movzx   eax, byte [ecx]     ; Move only the first character into eax (zero-extended) so we can manipulate it
    sub eax, 0x30       ; Remove 0x30 from the character. To make it an actual number, not an ASCII number.

    pop ecx         ; get the second argument or argv[2]
    jecxz   exit            ; If there was no second argument. Exit. Let the user know why

    movzx   ebx, byte [ecx]     ; Move only the first character into ebx (zero-extended) so we can manipulate it
    sub ebx, 0x30       ; Remove 0x30 from the character. To make it an actual number, not an ASCII number.

    add eax, ebx        ; Add the two numbers together
    add eax, 0x30       ; Make it an ASCII number again

    mov [msg+lmsg-2], al    ; Replace the first null character in the msg with the answer (one byte only)
    mov [msg+lmsg-1], byte 0xa  ; Add a newline character so that when it spits it out it is neatly formatted

                    ; Call sys_write
    push    dword lmsg      ; Push the length of the string
    push    msg         ; Push the location of the string in memory
    push    dword 0x1       ; Push the file descriptor to write to
    mov eax,4           ; Move the syscall number into eax
    push    eax         ; Push the syscall onto the stack
    int 0x80            ; Interrupt 80, go to kernel
    add esp, byte 16        ; Advance the stack pointer

    jmp done            ; Program is done. Jump to done

exit:
                    ; Call sys_write
    push    dword largs     ; Push the length of the string
    push    dword args      ; Push the location of the string in memory
    push    dword 0x1       ; Push the file descriptor to write to
    mov eax,4           ; Move the syscall number into eax
    push    eax         ; Push the syscall onto the stack
    int 0x80            ; Interrupt 80, go to kernel
    add esp, byte 16        ; Advance esp past the part we were just at

done:
                    ; sys_exit
    push    dword 0x1       ; Push the value to return to the operating system
    mov eax,1           ; Move the syscall number into eax
    push    eax         ; Push the syscall onto the stack
    int 0x80            ; Interrupt 80, go to kernel

    ; We never return to this function, so no need to clean the stack.
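
The "may not add up to anything more than 9" limitation is easier to see when the arithmetic is written out in C. This is just a sketch of what this first version does with the first character of each argument; add_single_digits is a made-up name for the example.

```c
/* Mirrors the arithmetic of the first version: one ASCII digit per argument. */
char add_single_digits(char a, char b)
{
    int sum = (a - '0') + (b - '0');    /* sub eax, 0x30 and sub ebx, 0x30 */
    return (char)(sum + '0');           /* add eax, 0x30 */
}
```

add_single_digits('1', '5') gives '6', as expected, but add_single_digits('7', '5') gives '<', because 12 + 0x30 is 0x3c, which is past the ASCII digits. That is why the second version needs proper string to integer conversion routines.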