I've been following the book OS01 by Tuhdo and wrote a simple bootloader that loads the kernel. Then I added some code to set up the GDT, and I'm seeing unusual behavior in gdb.

Here I stopped inside the _start subroutine, about to call the main() function (main then calls gdt_init).

[Screenshot: in the _start subroutine, about to call main]

But if I hit si once more to step to main's first instruction, it stops two instructions above main instead.

[Screenshot: main is not called properly]

At first, I naively thought that main actually was called, and that it ran so fast that gdt_init had already been called and returned. But I had set a breakpoint at gdt_init when I started gdb, so execution would have stopped there before reaching the leave and ret instructions. Also, the backtrace shows that main was never called.


I've spent about a week trying to find the cause, with no luck. Please save my soul!


The following are the main parts of my code.

Here is my bootloader.asm:

;********************************************
; bootloader.asm
; A Simple Bootloader
;********************************************
bits 16
start: jmp boot

;; constant and variable definitions
welcome_msg db "Welcome to NyanOS!", 0ah, 0dh, 0h
err_msg db "E", 0h

boot:
    cli ; no interrupts
    cld ; all that we need to init
    
    mov bh, 0xe
    mov bl, 0x18
    call MovCursor
    
    mov si, welcome_msg
    call Print

    mov ax, 0x100

    ;; set the buffer at es:bx (0x100:0x0, i.e. physical address 0x1000)
    mov es, ax  
    xor bx, bx
    
    mov al, 18  ; read 18 sectors
    mov ch, 0   ; track 0
    mov cl, 2   ; sector to start reading (from the second sector)
    mov dh, 0   ; head number
    mov dl, 0   ; drive number

    mov ah, 0x02    ;read sectors from disk 
    int 0x13    ; call the BIOS routine
    
    jc .disk_err    ; failed to read disk

    jmp [0x1000 + 0x18]; jump and execute the sector!
    
.disk_err:
    mov bh, 0xf
    mov bl, 0x15
    call MovCursor

    mov si, err_msg
    call Print
    
    hlt ; halt the system

%include "io.asm"

; We have to be 512 bytes. Clear the rest of the bytes with 0
times 510 - ($-$$) db 0
dw 0xAA55

bootloader.lds:

OUTPUT(bootloader);

SECTIONS
{
    .text 0x7c00:
    {
        *(.text)
    }
    .data :
    {
        *(.data)
    }
}

start.s, in which the _start subroutine is called by the bootloader.asm above (jmp [0x1000 + 0x18]):

section .text
extern main
global _start
_start:
    mov esp, stack_top
    call main
    hlt

section .bss
align 16
stack_bottom:
    resb 4096
stack_top:

kernel.c:

#include "gdt.h"

void main() 
{
    gdt_init();
    while (1) {};
}

kernel.lds:

ENTRY(_start);

PHDRS
{
    code PT_LOAD FLAGS(rx); 
}

SECTIONS
{
    .text 0x2000: ALIGN(0x1000) { *(.text) } :code
    .data : { *(.data) }
    .bss : { *(.bss) } 
    /DISCARD/ : { * (.eh_frame) }
}

Here is some of the debug output after compiling.

readelf -h build/kernel/kernel

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x2140
  Start of program headers:          52 (bytes into file)
  Start of section headers:          22816 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         1
  Size of section headers:           40 (bytes)
  Number of section headers:         13
  Section header string table index: 12

objdump -d build/kernel/kernel


build/kernel/kernel:     file format elf32-i386


Disassembly of section .text:

00002000 <gdt_set_entry>:
    2000:   55                      push   %ebp
    2001:   89 e5                   mov    %esp,%ebp
    2003:   83 ec 08                sub    $0x8,%esp
    2006:   8b 55 14                mov    0x14(%ebp),%edx
    2009:   8b 45 18                mov    0x18(%ebp),%eax
    200c:   88 55 fc                mov    %dl,-0x4(%ebp)
    200f:   88 45 f8                mov    %al,-0x8(%ebp)
    2012:   8b 45 0c                mov    0xc(%ebp),%eax
    2015:   89 c2                   mov    %eax,%edx
    2017:   8b 45 08                mov    0x8(%ebp),%eax
    201a:   66 89 14 c5 52 21 00    mov    %dx,0x2152(,%eax,8)
    2021:   00 
    2022:   8b 45 0c                mov    0xc(%ebp),%eax
    2025:   c1 e8 10                shr    $0x10,%eax
    2028:   89 c2                   mov    %eax,%edx
    202a:   8b 45 08                mov    0x8(%ebp),%eax
    202d:   88 14 c5 54 21 00 00    mov    %dl,0x2154(,%eax,8)
    2034:   8b 45 0c                mov    0xc(%ebp),%eax
    2037:   c1 e8 18                shr    $0x18,%eax
    203a:   89 c2                   mov    %eax,%edx
    203c:   8b 45 08                mov    0x8(%ebp),%eax
    203f:   88 14 c5 57 21 00 00    mov    %dl,0x2157(,%eax,8)
    2046:   8b 45 10                mov    0x10(%ebp),%eax
    2049:   89 c2                   mov    %eax,%edx
    204b:   8b 45 08                mov    0x8(%ebp),%eax
    204e:   66 89 14 c5 50 21 00    mov    %dx,0x2150(,%eax,8)
    2055:   00 
    2056:   8b 45 10                mov    0x10(%ebp),%eax
    2059:   c1 e8 10                shr    $0x10,%eax
    205c:   83 e0 0f                and    $0xf,%eax
    205f:   89 c2                   mov    %eax,%edx
    2061:   8b 45 08                mov    0x8(%ebp),%eax
    2064:   88 14 c5 56 21 00 00    mov    %dl,0x2156(,%eax,8)
    206b:   8b 45 08                mov    0x8(%ebp),%eax
    206e:   0f b6 04 c5 56 21 00    movzbl 0x2156(,%eax,8),%eax
    2075:   00 
    2076:   89 c2                   mov    %eax,%edx
    2078:   0f b6 45 f8             movzbl -0x8(%ebp),%eax
    207c:   83 e0 f0                and    $0xfffffff0,%eax
    207f:   09 d0                   or     %edx,%eax
    2081:   89 c2                   mov    %eax,%edx
    2083:   8b 45 08                mov    0x8(%ebp),%eax
    2086:   88 14 c5 56 21 00 00    mov    %dl,0x2156(,%eax,8)
    208d:   8b 45 08                mov    0x8(%ebp),%eax
    2090:   0f b6 55 fc             movzbl -0x4(%ebp),%edx
    2094:   88 14 c5 55 21 00 00    mov    %dl,0x2155(,%eax,8)
    209b:   90                      nop
    209c:   c9                      leave
    209d:   c3                      ret

0000209e <gdt_init>:
    209e:   55                      push   %ebp
    209f:   89 e5                   mov    %esp,%ebp
    20a1:   83 ec 08                sub    $0x8,%esp
    20a4:   66 c7 05 68 21 00 00    movw   $0x17,0x2168
    20ab:   17 00 
    20ad:   b8 50 21 00 00          mov    $0x2150,%eax
    20b2:   a3 6a 21 00 00          mov    %eax,0x216a
    20b7:   6a 00                   push   $0x0
    20b9:   6a 00                   push   $0x0
    20bb:   6a 00                   push   $0x0
    20bd:   6a 00                   push   $0x0
    20bf:   6a 00                   push   $0x0
    20c1:   e8 3a ff ff ff          call   2000 <gdt_set_entry>
    20c6:   83 c4 14                add    $0x14,%esp
    20c9:   68 cf 00 00 00          push   $0xcf
    20ce:   68 9a 00 00 00          push   $0x9a
    20d3:   6a ff                   push   $0xffffffff
    20d5:   6a 00                   push   $0x0
    20d7:   6a 01                   push   $0x1
    20d9:   e8 22 ff ff ff          call   2000 <gdt_set_entry>
    20de:   83 c4 14                add    $0x14,%esp
    20e1:   68 cf 00 00 00          push   $0xcf
    20e6:   68 92 00 00 00          push   $0x92
    20eb:   6a ff                   push   $0xffffffff
    20ed:   6a 00                   push   $0x0
    20ef:   6a 02                   push   $0x2
    20f1:   e8 0a ff ff ff          call   2000 <gdt_set_entry>
    20f6:   83 c4 14                add    $0x14,%esp
    20f9:   b8 68 21 00 00          mov    $0x2168,%eax
    20fe:   83 ec 0c                sub    $0xc,%esp
    2101:   50                      push   %eax
    2102:   e8 19 00 00 00          call   2120 <gdt_load>
    2107:   83 c4 10                add    $0x10,%esp
    210a:   90                      nop
    210b:   c9                      leave
    210c:   c3                      ret

0000210d <main>:
    210d:   55                      push   %ebp
    210e:   89 e5                   mov    %esp,%ebp
    2110:   83 e4 f0                and    $0xfffffff0,%esp
    2113:   e8 86 ff ff ff          call   209e <gdt_init>
    2118:   90                      nop
    2119:   eb fd                   jmp    2118 <main+0xb>
    211b:   66 90                   xchg   %ax,%ax
    211d:   66 90                   xchg   %ax,%ax
    211f:   90                      nop

00002120 <gdt_load>:
    2120:   8b 44 24 04             mov    0x4(%esp),%eax
    2124:   0f 01 10                lgdtl  (%eax)
    2127:   66 b8 10 00             mov    $0x10,%ax
    212b:   8e d8                   mov    %eax,%ds
    212d:   8e c0                   mov    %eax,%es
    212f:   8e e0                   mov    %eax,%fs
    2131:   8e e8                   mov    %eax,%gs
    2133:   8e d0                   mov    %eax,%ss
    2135:   ea 3c 21 00 00 08 00    ljmp   $0x8,$0x213c

0000213c <gdt_load.reload_cs>:
    213c:   c3                      ret
    213d:   66 90                   xchg   %ax,%ax
    213f:   90                      nop

00002140 <_start>:
    2140:   bc 70 31 00 00          mov    $0x3170,%esp
    2145:   e8 c3 ff ff ff          call   210d <main>
    214a:   f4                      hlt

Let me know if you need any other debug output, details of the GDB session, or the binaries. Many thanks!

1 Answer

You've mixed up your operating modes. Your kernel starts executing in 16-bit mode (just as the bootloader did before it), but your code (including _start) is built for 32-bit mode. You'd expect gdb to give you a hint or two about that as you step through; however, gdb is notoriously bad at handling operating modes, at least with qemu's gdbserver stub.

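So your code is built for 32-bit mode, and both gdb and objdump disassemble it as 32-bit code, but it is actually being executed as 16-bit code. The CPU therefore does not execute e8 c3 ff ff ff; it executes e8 c3 ff, a call with a 16-bit displacement. The displacement is the same (-0x3d), but the instruction is 3 bytes long instead of 5, so it is applied to a next-instruction address of 0x2148 rather than 0x214a, and the call lands at 0x210b, two bytes before main. That is the leave / ret pair at the end of gdt_init, which is why it stops just above main. Landing on main wouldn't help anyway, of course, as that 32-bit machine code is gibberish to a CPU in 16-bit mode.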

By the way, the switch to 32-bit mode is completely absent from your code; there's just the GDT load... Which makes me think your kernel code was supposed to start up in 32-bit mode, like a multiboot kernel, maybe? But then you added a BIOS bootloader to it, and that can't work. You need to either add a "pre-kernel" switch to 32-bit mode in the bootloader, or compile some of your kernel code for 16-bit mode.

EDIT: Having skimmed through the book, I can now see how you'd arrive at this error. Its last (semi?)completed chapter makes exactly this mistake (there is a bit of text about how gdb is in 16-bit mode and objdump is in 32-bit mode and how their output differs, but it is easy to miss). I assume the plan was to expand it and talk about adding 16-to-32-bit switch code to the bootloader, but that never happened. The accompanying example code does do that, however: it has a bit of code in the bootloader to set up an initial GDT and switch to 32-bit mode.

Gotta say, not a great book... There's just way too much irrelevant information being thrown at the reader. Details of ModR/M encoding? 7 pages explaining ELF's symtab? 200 pages before the reader even gets to write their first line of code, and most of it is filled with (useful per se but) mostly inconsequential junk.

3 Comments

Bochs's built-in debugger always knows what mode the machine is in, because it's built-in to the emulator and designed specifically for x86 (so it's aware of modes and even segmentation.) You mentioned that GDB + QEMU was bad at this, but there is a good alternative.
That's true, bochsdbg is much better when debugging 16-bit mode, bios and interrupt/exception handling.
Thanks very much. I agree that the book is incomplete and introduces potential issues. In fact, the linker script sample code has a bug as well; I had to adapt it and search for solutions. I guess I'll need to switch to the OSDev wiki and start all over again, probably targeting 64-bit.
