How do I make an array of strings in assembler and work with it?

I tried:

arrayOfWords BYTE "BICYCLE", "CANOE", "SCATEBOARD", "OFFSIDE", "TENNIS"

Afterwards I want to print the second word, but it doesn't work:

    mov edx, offset arrayOfWords[2]
    call WriteString

Instead, it prints all the words.

4
  • 1
    If you use Kip Irvine's library, please tag the question with [irvine32] and [masm]. Commented Jun 3, 2017 at 8:37
  • irvine programming.msjc.edu/asm/help/… Commented Jun 3, 2017 at 10:15
  • Then click on "edit" and add [masm] and [irvine32] (without brackets) under "Tags". After that you'll get some examples from me to copy and paste ;-) Commented Jun 3, 2017 at 10:18
  • OK, I edited my post. I am making a hangman game in Irvine32. :) Commented Jun 3, 2017 at 10:21

2 Answers

3
arrayOfWords BYTE "BICYCLE", "CANOE", "SCATEBOARD", "OFFSIDE", "TENNIS"

is just another way to write

arrayOfWords BYTE "BICYCLECANOESCATEBOARDOFFSIDETENNIS"

and this is far from being an array.
Furthermore, mov edx, offset arrayOfWords[2] is not array indexing.
Brackets in assembly denote an addressing mode, not array indexing.
That's why I can't stress enough NOT to use the syntax <symbol>[<displacement>] (your arrayOfWords[2]); see footnote 1. It is a very silly and confusing way to write [<symbol> + <displacement>] (in your case [arrayOfWords + 2]).

You can see that mov edx, OFFSET [arrayOfWords + 2] (that in my opinion is clearer written as mov edx, OFFSET arrayOfWords + 2 since the instruction is not accessing any memory) is just loading edx with the address of the C character in BICYCLE (the third char of the big string).
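The same memory layout can be modeled in C, where adjacent string literals are also merged into one run of characters. This is only a sketch for illustration: the helper name offset_of is hypothetical, and unlike the MASM definition, C appends a single terminating zero at the very end.

```c
#include <assert.h>
#include <stddef.h>

/* Adjacent literals are concatenated by the compiler, much like the
   comma-separated BYTE operands: one block of characters, no separators.
   Note: C adds one '\0' at the end; the MASM line above adds none. */
static const char arrayOfWords[] =
    "BICYCLE" "CANOE" "SCATEBOARD" "OFFSIDE" "TENNIS";

/* Models OFFSET arrayOfWords + displacement: plain address arithmetic,
   measured in bytes from the start of the block (hypothetical helper). */
const char *offset_of(size_t displacement)
{
    assert(displacement < sizeof arrayOfWords);
    return arrayOfWords + displacement;
}
```

With this model, offset_of(2) points at the C of BICYCLE, and a routine that prints until the next zero byte would run through every word that follows.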

MASM has a lot of high-level machinery that I never bothered learning, but after a quick glance at the manual linked in the footnotes, it seems that it has no high-level support for arrays.
That's a good thing: we can write cleaner assembly.

An array of strings is not a contiguous block of strings; it is a contiguous block of pointers to strings.
The strings can be anywhere.

arrayOfWords  DWORD  OFFSET strBicycle, 
                     OFFSET strCanoe,
                     OFFSET strSkateboard,
                     OFFSET strOffside,
                     OFFSET strTennis

strBicycle    BYTE "BICYCLE",0
strCanoe      BYTE "CANOE", 0
strSkateboard BYTE "SKATEBOARD", 0
strOffside    BYTE "OFFSIDE", 0
strTennis     BYTE "TENNIS", 0

Remember: the nice feature of arrays is constant access time. If the strings were all packed together we'd get a more compact data structure but lose constant access time, since the only way to find where a string starts would be to scan the whole thing.
With pointers we keep constant access time because every element has the same size; in general, we require all the elements of an array to be homogeneous, as the pointers are.

To load the address of the i-th string in the array (counting from zero, see footnote 2) we simply read the i-th pointer.
Suppose i is in ecx then

mov edx, DWORD PTR [arrayOfWords + ecx*4]
call WriteString

since each pointer is four bytes.
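For readers more at home in C, here is a rough model of that scaled access. It is a sketch under assumptions: load_word and WORD_COUNT are names made up for this example, and the scale factor is written as sizeof of one table entry, which is 4 on a 32-bit target (matching ecx*4) but 8 on a typical 64-bit one.

```c
#include <assert.h>
#include <stddef.h>

static const char strBicycle[]    = "BICYCLE";
static const char strCanoe[]      = "CANOE";
static const char strSkateboard[] = "SKATEBOARD";
static const char strOffside[]    = "OFFSIDE";
static const char strTennis[]     = "TENNIS";

/* The table of pointers; the strings themselves could live anywhere. */
static const char *const arrayOfWords[] = {
    strBicycle, strCanoe, strSkateboard, strOffside, strTennis
};

#define WORD_COUNT (sizeof arrayOfWords / sizeof arrayOfWords[0])

/* Models: mov edx, DWORD PTR [arrayOfWords + ecx*4]
   The byte offset of slot i is i times the size of one pointer. */
const char *load_word(size_t i)
{
    assert(i < WORD_COUNT);   /* bounds check added here; the asm has none */
    const unsigned char *base = (const unsigned char *)arrayOfWords;
    const char *const *slot =
        (const char *const *)(base + i * sizeof arrayOfWords[0]);
    return *slot;
}
```

Writing the byte arithmetic out by hand is only meant to mirror the addressing mode; in ordinary C one would simply write arrayOfWords[i].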

If you want to read the byte j of the string i then, supposing j is in ebx and i in ecx:

mov esi, DWORD PTR [arrayOfWords + ecx*4]
mov al, BYTE PTR [esi + ebx]

The registers used are arbitrary.
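The two-step access (first load the pointer, then the byte) looks roughly like this in C. Again a sketch only: byte_at is a hypothetical helper, and both bounds checks are additions that the two assembly instructions do not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char *const arrayOfWords[] = {
    "BICYCLE", "CANOE", "SKATEBOARD", "OFFSIDE", "TENNIS"
};

/* Returns byte j of string i, both counted from zero. */
char byte_at(size_t i, size_t j)
{
    assert(i < sizeof arrayOfWords / sizeof arrayOfWords[0]);
    const char *word = arrayOfWords[i];  /* mov esi, [arrayOfWords + ecx*4] */
    assert(j <= strlen(word));           /* allows reading the terminator */
    return word[j];                      /* mov al, [esi + ebx] */
}
```

For example, byte_at(1, 0) is the first letter of CANOE, and reading index 7 of BICYCLE yields the terminating zero.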


1 Despite what Microsoft writes in its MASM 6.1 manual:

Referencing Arrays
Each element in an array is referenced with an index number, beginning with zero. The array index appears in brackets after the array name, as in

array[9]

Assembly-language indexes differ from indexes in high-level languages, where the index number always corresponds to the element’s position. In C, for example, array[9] references the array’s tenth element, regardless of whether each element is 1 byte or 8 bytes in size. In assembly language, an element’s index refers to the number of bytes between the element and the start of the array.

2 Counting from zero.

1 Comment

Thanks a lot. This is the best explanation I have read. :) Now I will try to solve my problem like that. I am making a hangman game for my university project. It's 30% of my mark.
2

arrayOfWords is not an array, not even a variable. It's just a label that tells the assembler where it can find something, in this case a bunch of characters. Irvine's WriteString expects a null-terminated bunch of characters as a string. There are two ways to treat that bunch of characters as a string array.

  1. Search the memory for the address of the desired string. A new string begins after every null byte.

    INCLUDE Irvine32.inc
    
    .DATA
    manyWords BYTE "BICYCLE", 0
        BYTE "CANOE", 0
        BYTE "SCATEBOARD", 0
        BYTE "OFFSIDE", 0
        BYTE "TENNIS", 0
        BYTE 0                              ; End of list
    len equ $ - manyWords
    
    .CODE
    main PROC
    
        mov edx, 2                          ; Index
        call find_str                       ; Returns EDI = pointer to string
    
        mov edx, edi
        call WriteString                    ; Irvine32: Write a string pointed to by EDX
    
        exit                                ; Irvine32: ExitProcess
    main ENDP
    
    find_str PROC                           ; ARG: EDX = index
    
        lea edi, manyWords                  ; Address of string list
    
        mov ecx, len                        ; Maximal number of bytes to scan
        xor al, al                          ; Scan for 0
    
        @@:
        sub edx, 1
        jc done                             ; No index left to scan = string found
        repne scasb                         ; Scan for AL
        jmp @B                              ; Next string
    
        done:
        ret
    find_str ENDP                           ; RESULT: EDI pointer to string[edx]
    
    END main
    
  2. Build an array of pointers to the strings:

    INCLUDE Irvine32.inc
    
    .DATA
    wrd0 BYTE "BICYCLE", 0
    wrd1 BYTE "CANOE", 0
    wrd2 BYTE "SCATEBOARD", 0
    wrd3 BYTE "OFFSIDE", 0
    wrd4 BYTE "TENNIS", 0
    
    pointers DWORD OFFSET wrd0, OFFSET wrd1, OFFSET wrd2, OFFSET wrd3, OFFSET wrd4
    
    .CODE
    main PROC
    
        mov ecx, 2                          ; Index
        lea edx, [pointers + ecx * 4]       ; Address of pointers[index]
        mov edx, [edx]                      ; Address of string
        call WriteString
    
        exit                                ; Irvine32: ExitProcess
    main ENDP
    
    END main
    

BTW: As in other languages, the index starts at 0. The second string would be index = 1, the third index = 2.
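The scanning approach from method 1 can be sketched in C as well. This is an illustration under assumptions, not a translation of the routine above: the function keeps the name find_str for readability, it detects the end of the list by the empty string instead of a byte count, and it returns NULL for an index past the last word, which the assembly version does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero-terminated strings packed back to back. The last explicit \0 plus
   the implicit one from the literal form the empty "end of list" string. */
static const char manyWords[] =
    "BICYCLE\0" "CANOE\0" "SCATEBOARD\0" "OFFSIDE\0" "TENNIS\0";

/* Returns a pointer to string number `index` (counted from zero),
   or NULL if the list holds fewer strings than that. */
const char *find_str(size_t index)
{
    const char *p = manyWords;
    while (*p != '\0') {        /* an empty string marks the end of the list */
        if (index == 0)
            return p;           /* no index left to skip: string found */
        p += strlen(p) + 1;     /* skip one string and its terminator */
        index--;
    }
    return NULL;
}
```

Looking up a word this way costs time proportional to the number of bytes in front of it, which is the price of the more compact layout compared with the pointer table in method 2.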

4 Comments

lea edi, manyWords --> how would I do this in nasm or fasm?
@Jodimoro: NASM: lea edi, [manyWords] or mov edi, manyWords. I guess it is identical in FASM. I'm unsure whether and how irvine32.lib works with NASM or FASM. It doesn't work on Linux anyway. If you ask a new question, I'll work on it ;-)
@Jodimoro: Why not what? I don't understand the question.
@Jodimoro: After the operation, EDI should contain the address of manyWords. mov edi, [manyWords] would load the value, not the address. mov edi, manyWords loads the address. Casually speaking, lea edi, [manyWords] calculates the address of the value [manyWords]. MASM is different: manyWords without brackets is the value, and "OFFSET manyWords" expresses the address. I prefer LEA because it can also calculate the address of local variables at runtime, but that isn't relevant in this case.
