Directly accessing the disk using the ATA/ATAPI interface allows bypassing BIOS interrupts for disk operations.

ATA, also known as IDE (Integrated Drive Electronics), is a standard interface for connecting storage devices like hard drives and SSDs to a computer system. It uses a series of I/O ports to send commands and data between the CPU and the disk.

ATAPI (ATA Packet Interface) is an extension of ATA that supports devices such as CD-ROM drives. These interfaces provide a set of commands and protocols for reading and writing data to and from storage devices.

Unlike BIOS interrupts, which provide a layer of abstraction for disk operations, ATA/ATAPI allows direct communication between the CPU and the disk controller, offering more direct control over the storage device.

Why We Need to Read from the Disk

When a computer boots, the BIOS (Basic Input/Output System) initializes hardware components and searches for a bootable device. The BIOS loads a small program called the bootloader from the boot sector of the device. The bootloader’s job is to load the operating system's kernel into memory, thus starting the operating system.

Understanding ATA/ATAPI:

  • ATA and ATAPI are interfaces used for communication between storage devices (such as hard drives and optical drives) and the host system.
  • ATA is primarily used for accessing hard disk drives (HDDs) and solid-state drives (SSDs), while ATAPI extends ATA to support devices like CD/DVD drives.
  • These interfaces define command sets and protocols for tasks such as reading/writing data sectors, querying device information, and managing disk operations.

Benefits of Direct Disk Access:

  • Enhanced Performance: Direct disk access bypasses the BIOS layer, allowing for more efficient disk operations and reduced overhead.
  • Greater Control: Bootloader developers have fine-grained control over disk access parameters, enabling optimization for specific hardware configurations and performance requirements.
  • Flexibility: Direct access facilitates advanced disk operations beyond basic read/write, such as querying device features, performing diagnostics, and implementing custom protocols.

Implementing Direct Disk Access

Implementing ATA access in PIO (programmed I/O) mode involves writing low-level code that directly interacts with hardware registers to read data from a disk. This task is typically done in assembly language or low-level C.

Overview:

  1. Setup Drive and Disk Parameters: Configure the drive, head, cylinder, and sector.
  2. Send Commands to the Disk: Use the ATA command set to issue read commands.
  3. Poll the Disk: Wait for the disk to be ready to transfer data.
  4. Read Data from the Disk: Transfer the data from the disk to memory.
  5. Handle Errors: Ensure that errors are handled appropriately.

Key ATA Ports and Commands

Ports:

  • 0x1F0 - Data port (read/write data).
  • 0x1F1 - Error register (read) / Features register (write).
  • 0x1F2 - Sector count (number of sectors to read/write).
  • 0x1F3 - Sector number (starting sector).
  • 0x1F4 - Cylinder low byte.
  • 0x1F5 - Cylinder high byte.
  • 0x1F6 - Drive/head register.
  • 0x1F7 - Status register (read) / Command register (write).
  • 0x3F6 - Alternate status register (read).

Commands:

  • 0x20 - Read sectors with retries.
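
The port list above follows a regular pattern: every task-file register sits at a fixed offset from the channel's base port (0x1F0 for the primary channel). The following C sketch captures that pattern. The names (ata_reg, ata_port, and the ATA_* constants) are illustrative assumptions, not part of any existing header.

```c
#include <stdint.h>

/* Register offsets from the channel's I/O base (0x1F0 on the primary channel).
   All names here are hypothetical and chosen for this sketch only. */
enum ata_reg {
    ATA_REG_DATA           = 0, /* 0x1F0: data port */
    ATA_REG_ERROR_FEATURES = 1, /* 0x1F1: error (read) / features (write) */
    ATA_REG_SECTOR_COUNT   = 2, /* 0x1F2: sector count */
    ATA_REG_SECTOR_NUMBER  = 3, /* 0x1F3: sector number */
    ATA_REG_CYL_LOW        = 4, /* 0x1F4: cylinder low byte */
    ATA_REG_CYL_HIGH       = 5, /* 0x1F5: cylinder high byte */
    ATA_REG_DRIVE_HEAD     = 6, /* 0x1F6: drive/head */
    ATA_REG_STATUS_COMMAND = 7  /* 0x1F7: status (read) / command (write) */
};

#define ATA_PRIMARY_BASE       0x1F0
#define ATA_PRIMARY_ALT_STATUS 0x3F6 /* not an offset from the base port */
#define ATA_CMD_READ_SECTORS   0x20  /* read sectors with retries */

/* Compute the absolute port number of a register on a given channel. */
static inline uint16_t ata_port(uint16_t base, enum ata_reg reg)
{
    return (uint16_t)(base + reg);
}
```

For example, ata_port(ATA_PRIMARY_BASE, ATA_REG_STATUS_COMMAND) evaluates to 0x1F7, the port used for both the status and command registers.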

Example:

1 Initialization:

We will read just a single sector. The loop counter goes in the bl register as the number of sectors minus one, and di holds the offset at which the sector is loaded. Because rep insw stores to es:di, es must be 0 for the data to land at 0x0000:0x8000.

mov bl, 0 	; Sectors to read, minus one (0 = 1 sector)
mov di, 0x8000	; Offset to load sectors at (es:di)

2 Set Drive and Head:

Port 0x1F6 selects the drive (master or slave) and, in CHS mode, the head to read from.

; Set drive and head
    mov dx, 0x1F6        ; Head & drive # port
    mov al, 0xA0         ; Master drive, CHS mode, head 0
    out dx, al           ; Output to port
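
As a rough illustration of how the value written to 0x1F6 is put together, here is a small C sketch. It assumes the usual layout: the high nibble is 0xA for CHS mode or 0xE for LBA mode, bit 4 selects the slave drive, and the low nibble holds the head number (or LBA bits 24-27). The function names are hypothetical.

```c
#include <stdint.h>

/* Build the drive/head byte for CHS addressing.
   0xA0 = master drive; setting bit 4 (0x10) selects the slave, giving 0xB0. */
static uint8_t ata_drive_head_chs(int is_slave, uint8_t head)
{
    return (uint8_t)(0xA0 | (is_slave ? 0x10 : 0x00) | (head & 0x0F));
}

/* Build the drive/head byte for 28-bit LBA addressing.
   0xE0 = master drive in LBA mode; the low nibble carries LBA bits 24-27. */
static uint8_t ata_drive_head_lba28(int is_slave, uint32_t lba)
{
    return (uint8_t)(0xE0 | (is_slave ? 0x10 : 0x00) | ((lba >> 24) & 0x0F));
}
```

With these helpers, ata_drive_head_chs(0, 0) gives 0xA0, the value used in the assembly above.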

3 Set Sector Count:

Port 0x1F2 is the sector count register of the ATA interface. It specifies how many sectors we want to read or write. The value must match the number of sectors the read loop actually transfers (bl + 1), otherwise the drive is left waiting with data still pending.

; Set sector count
    mov dx, 0x1F2        ; Sector count port
    mov al, 0x01         ; 1 sector to read (matches bl = 0)
    out dx, al           ; Output to port

4 Set Starting Sector:

Port 0x1F3 is the sector number register. It specifies the starting sector for read/write operations on the disk.

  • In CHS addressing, sector numbers are 1-based. This means:
    • Sector 1 is the first sector.
    • Sector 2 is the second sector.

; Starting sector
    mov dx, 0x1F3        ; Sector # port
    mov al, 0x02         ; Start at sector 2
    out dx, al           ; Output to port

5 Set Cylinder number:

The cylinder number is part of the Cylinder-Head-Sector (CHS) addressing scheme used in traditional hard drives. It specifies which cylinder (track) on the disk to read from or write to. In the ATA interface, the cylinder number is divided into two parts: the low byte and the high byte. These are set using the 0x1F4 and 0x1F5 ports respectively.

Here's how you set the cylinder number in the ATA interface:

  1. Cylinder Low Byte Port (0x1F4): holds bits 0-7 of the cylinder number.
  2. Cylinder High Byte Port (0x1F5): holds bits 8-15 of the cylinder number.

In the code example below, we assume the cylinder number is 0 (both low and high bytes are 0). If you need to set a specific cylinder number, you would use the appropriate values for the low and high bytes of that number.

  • Set Cylinder Low Byte (0x1F4):
; Set cylinder number low byte
    mov dx, 0x1F4        ; Cylinder low port
    xor al, al           ; Cylinder low byte = 0
    out dx, al           ; Output to port
  • Set Cylinder High Byte (0x1F5):
; Set cylinder number high byte
    mov dx, 0x1F5        ; Cylinder high port
    xor al, al           ; Cylinder high byte = 0
    out dx, al           ; Output to port
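
To make the CHS values less abstract, the sketch below converts a linear sector index (LBA) into cylinder, head, and sector, and splits the cylinder into the two bytes written to 0x1F4 and 0x1F5. The conversion is the standard one; the geometry (heads per cylinder, sectors per track) is an assumption the caller has to supply, and all names are hypothetical.

```c
#include <stdint.h>

struct chs {
    uint16_t cylinder;
    uint8_t  head;
    uint8_t  sector;   /* 1-based, as in CHS addressing */
};

/* Convert a 0-based LBA to CHS for a given (assumed) disk geometry. */
static struct chs lba_to_chs(uint32_t lba, uint32_t heads, uint32_t sectors_per_track)
{
    struct chs out;
    out.cylinder = (uint16_t)(lba / (heads * sectors_per_track));
    out.head     = (uint8_t)((lba / sectors_per_track) % heads);
    out.sector   = (uint8_t)(lba % sectors_per_track + 1);
    return out;
}

/* Byte written to the cylinder low port (0x1F4). */
static uint8_t cylinder_low(uint16_t cylinder)
{
    return (uint8_t)(cylinder & 0xFF);
}

/* Byte written to the cylinder high port (0x1F5). */
static uint8_t cylinder_high(uint16_t cylinder)
{
    return (uint8_t)(cylinder >> 8);
}
```

With 16 heads and 63 sectors per track, LBA 1 maps to cylinder 0, head 0, sector 2, which is the starting position used in this example.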

6 Sending the Read Command:

Port 0x1F7 is the command register when written. It is used to send commands to the disk controller.

  • 0x20 is the command code for reading sectors with retries. This command tells the disk controller to read sectors from the disk and to retry the read operation in case of errors.
; Send read command
    mov dx, 0x1F7        ; Command port
    mov al, 0x20         ; Read sectors with retries
    out dx, al           ; Output to port

The above code writes 0x20 to port 0x1F7. This instructs the disk controller to begin the read operation based on the parameters set in the other registers (such as the sector count, starting sector, and cylinder number).

7 Polling the Status Port:

When read, port 0x1F7 is the status register. It contains status bits that indicate the current state of the controller.

read_loop:
    ; Poll status port
    mov dx, 0x1F7        ; Status port
.wait_status:
    in al, dx            ; Read status register
    test al, 0x08        ; Check if data buffer is ready
    jz .wait_status      ; Loop until ready (je is the same instruction as jz)
  • read_loop: This label marks the beginning of the loop where the status port will be polled.
  • mov dx, 0x1F7: This instruction loads the dx register with the port address 0x1F7, which is the status port of the ATA interface.
  • .wait_status: This is a local label indicating the start of the loop iteration.
  • in al, dx: This instruction reads the status register of the ATA interface and stores the result in the al register. The status register contains various status flags, including the one indicating whether the data buffer is ready for reading.
  • test al, 0x08: This instruction performs a bitwise AND between the value in al and 0x08 (binary 00001000) without storing the result. It tests bit 3 of the status register (the fourth bit from the right), the DRQ (data request) flag, which is set when the data buffer is ready.
  • jz .wait_status: This instruction jumps back to the .wait_status label if the test result is zero (Z flag is set), meaning the data buffer is not yet ready. This creates a loop that continues until the data buffer is ready for reading.
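
The loop above only looks at DRQ. A slightly more defensive poll also checks BSY (0x80) first and treats ERR (0x01) or DF (0x20) as a failure; these bit values are the ones tested in the longer listings later in this document. The following C sketch shows that decision logic on a single status byte. The enum and function names are hypothetical.

```c
#include <stdint.h>

#define ATA_SR_BSY 0x80  /* controller busy: other bits are not meaningful */
#define ATA_SR_DF  0x20  /* drive fault */
#define ATA_SR_DRQ 0x08  /* data request: sector buffer is ready */
#define ATA_SR_ERR 0x01  /* error */

enum ata_poll_result { ATA_POLL_WAIT, ATA_POLL_READY, ATA_POLL_ERROR };

/* Decide what a polling loop should do with one status byte. */
static enum ata_poll_result ata_poll_decision(uint8_t status)
{
    if (status & ATA_SR_BSY)
        return ATA_POLL_WAIT;               /* keep polling */
    if (status & (ATA_SR_ERR | ATA_SR_DF))
        return ATA_POLL_ERROR;              /* give up or reset the drive */
    if (status & ATA_SR_DRQ)
        return ATA_POLL_READY;              /* safe to read the data port */
    return ATA_POLL_WAIT;
}
```

A driver would call this in a loop on each value read from 0x1F7 and start transferring data only on ATA_POLL_READY.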

8 Read Sector Data:

; Read sector data
    mov cx, 256          ; 256 words (512 bytes)
    mov dx, 0x1F0        ; Data port
    rep insw             ; Read words from port to memory
  • mov cx, 256: This instruction moves the value 256 into the cx register. Here cx is the number of words to read from the data port. Since each word is 2 bytes, 256 words equal 512 bytes, which is the size of one sector.
  • mov dx, 0x1F0: This instruction loads the dx register with the port address 0x1F0. In the ATA interface, 0x1F0 is the port used for reading data from the disk.
  • rep insw: The insw (input string word) instruction reads a word (2 bytes) from the port specified in dx, stores it at the memory location es:di, and advances di by 2. The rep prefix repeats insw cx times, so 256 words are read from the port into consecutive memory locations.

9 Delay of Approximately 400 nanoseconds:

After a sector has been transferred, the drive needs a short time (about 400 nanoseconds) to update its status register, so a status read issued too early can return stale BSY and DRQ values. A common way to wait is to read the alternate status port (0x3F6) four times and discard the results. The alternate status register returns the same bits as the status register at 0x1F7, but reading it does not acknowledge a pending interrupt.

Each in al, dx instruction is commonly assumed to take roughly 100 nanoseconds on the I/O bus, so four reads in a row add up to approximately 400 nanoseconds.

 ; 400ns delay
    mov dx, 0x3F6        ; Alternate status port
    in al, dx
    in al, dx
    in al, dx
    in al, dx
  • mov dx, 0x3F6: This instruction loads the dx register with the port address 0x3F6. In this context, 0x3F6 is considered the alternate status port.
  • in al, dx: These instructions read a byte from the port specified in the dx register and store it in the al register. They are executed four times in succession.

Putting the polling, read, delay, and loop-control code together gives the complete read loop:

read_loop:
    ; Poll status port
    mov dx, 0x1F7        ; Status port
.wait_status:
    in al, dx            ; Read status register
    test al, 0x08        ; Check if data buffer is ready
    jz .wait_status      ; Loop until ready

    ; Read sector data
    mov cx, 256          ; 256 words (512 bytes)
    mov dx, 0x1F0        ; Data port
    rep insw             ; Read CX words from port DX into memory at ES:DI

    ; 400ns delay
    mov dx, 0x3F6        ; Alternate status port
    in al, dx
    in al, dx
    in al, dx
    in al, dx

    ; Check if more sectors to read
    dec bl               ; Decrement remaining-sector counter
    js jump_to_kernel    ; If it went negative, all sectors are read
    jmp read_loop        ; Otherwise, read next sector

jump_to_kernel:
    jmp 0x0000:0x8000    ; Jump to the loaded kernel

10 Determine If more Sectors to Read:

; Check if more sectors to read
    dec bl               ; Decrement remaining-sector counter
    js jump_to_kernel    ; If it went negative, all sectors are read
    jmp read_loop        ; Otherwise, read next sector
  • dec bl: This instruction decrements the value stored in the bl register by 1. bl holds the number of sectors still to read, minus one. Because we want to read just a single sector, we set bl to 0 initially.
  • js jump_to_kernel: This instruction checks the sign flag (SF) and jumps to the label jump_to_kernel if it is set. The sign flag is set when the result of an arithmetic operation is negative.
  • If bl was initially 0, dec bl leaves 0xFF, which is -1 in two's complement representation. The sign flag is therefore set and the jump to jump_to_kernel is taken.
  • jmp read_loop: This instruction is reached only when bl is still non-negative after the decrement, that is, when more sectors remain. It jumps back to read_loop to read the next sector.
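
The counter convention is easy to get wrong by one, so here is a small C model of the dec bl / js sequence. It only simulates the 8-bit arithmetic to count how many passes the loop makes for a given starting value of bl; the function name is hypothetical.

```c
#include <stdint.h>

/* Count how many sectors the read loop transfers for an initial bl value.
   Models: read one sector, dec bl, then js (exit when bit 7 of bl is set). */
static int sectors_read_for(uint8_t bl)
{
    int count = 0;
    for (;;) {
        count++;                  /* one pass of read_loop reads one sector */
        bl = (uint8_t)(bl - 1);   /* dec bl, wrapping from 0x00 to 0xFF */
        if (bl & 0x80)            /* js: sign flag reflects bit 7 of the result */
            break;
    }
    return count;
}
```

Starting with bl = 0 yields one sector, and bl = 9 yields ten, which is why the counter is loaded with the sector count minus one.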

 

In summary, a direct disk read through ATA/ATAPI goes through the following stages:

  • Identifying the Disk Controller: Determine the type of disk controller (ATA or ATAPI) present in the system.
  • Initializing the Controller: Configure controller registers, select appropriate I/O ports or memory-mapped registers, and set communication parameters.
  • Sending ATA/ATAPI Commands: Construct and send commands (e.g., READ SECTORS) to the disk controller, specifying the sector(s) to read and the destination buffer for storing data.
  • Handling Data Transfer: Monitor controller status signals, retrieve read data from controller registers or buffers, and manage error conditions and recovery mechanisms.
  • Finalization: Clean up resources, perform additional processing on read data (if needed), and ensure proper error handling and validation.

Difference Between Reading Using ATA/ATAPI and Reading Using a File System

Reading using ATA/ATAPI:

  1. Low-Level Access: ATA/ATAPI provides low-level access to storage devices, such as hard drives and optical drives. It allows direct communication with the device's hardware components, including reading and writing raw data sectors.
  2. Sector-Based: Reading using ATA/ATAPI involves specifying the physical location of data on the storage device in terms of sectors. Sectors are the smallest addressable units on the disk, typically consisting of 512 bytes each.
  3. No File System Required: When reading using ATA/ATAPI, you bypass the file system layer altogether. You access the data directly from the disk without any interpretation or organization imposed by a file system.
  4. Limited Metadata: ATA/ATAPI does not provide access to file metadata such as file names, sizes, or directory structures. It only retrieves the raw data stored on the disk.

Reading Using a File System:

  1. High-Level Abstraction: File systems provide a higher level of abstraction over storage devices. They organize data into files, directories, and metadata, providing a structured way to store and access information.
  2. File-Based Access: When reading using a file system, you specify the file's name or path rather than its physical location on the disk. The file system handles translating this logical information into physical disk sectors.
  3. Interprets Metadata: File systems store metadata associated with files, such as file names, sizes, timestamps, and permissions. When reading a file, the file system interprets this metadata to provide additional context about the data being accessed.
  4. Disk Interface: The file system operates at a higher level of abstraction, using ATA/ATAPI (or other storage interfaces) to read and write the actual data on the disk.
  5. File System Drivers: Reading using a file system requires file system drivers or libraries to be present in the operating system or application. These drivers understand the file system's structure and implement algorithms to navigate directories, locate files, and retrieve data efficiently.

Interaction Between FS and ATA/ATAPI:

  • When an application requests to read or write a file, the file system translates this high-level operation into low-level disk operations.
  • The file system determines the location of the file on the disk, which involves understanding the file system's structures like the file allocation table (FAT) or inode table.
  • The file system then issues ATA/ATAPI commands to the storage device to read or write the necessary sectors.

Example Workflow:

  • Read Operation:
    1. An application requests to read a file.
    2. The file system locates the file's metadata (e.g., in the FAT or inode table) to determine where the file's data is stored on the disk.
    3. The file system calculates which disk sectors need to be read.
    4. The file system issues ATA/ATAPI commands to read those sectors from the disk.
    5. The file system collects the data from these sectors and presents it to the application as a coherent file.
  • Write Operation:
    1. An application requests to write a file.
    2. The file system allocates space on the disk for the file.
    3. The file system updates its structures to reflect the new file's location and metadata.
    4. The file system calculates which disk sectors need to be written.
    5. The file system issues ATA/ATAPI commands to write data to those sectors on the disk.
    6. The file system updates its structures to reflect the new state of the disk.
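
The step "calculates which disk sectors need to be read" can be sketched in C as follows. This is a simplified model that assumes the file's data is stored contiguously starting at a known sector and that sectors are 512 bytes; real file systems such as FAT follow cluster chains instead. The struct and function names are hypothetical.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

struct sector_range {
    uint32_t first_lba;  /* first sector to read */
    uint32_t count;      /* number of sectors to read */
};

/* Map a byte range inside a contiguous file to the sectors that cover it.
   file_start_lba: sector where the file's data begins (assumed known).
   offset, length: byte range requested by the application (length > 0). */
static struct sector_range sectors_for_read(uint32_t file_start_lba,
                                            uint32_t offset, uint32_t length)
{
    struct sector_range r;
    uint32_t first = offset / SECTOR_SIZE;
    uint32_t last  = (offset + length - 1) / SECTOR_SIZE;
    r.first_lba = file_start_lba + first;
    r.count     = last - first + 1;
    return r;
}
```

For instance, a 100-byte read at offset 500 straddles a sector boundary, so it needs two sectors even though it is shorter than one.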

 

 

 

[BITS 16]
[ORG 0x7E00]

jmp start

start:
    ; Initialize parameters
    xor ax, ax
    mov es, ax             ; rep insw stores to ES:DI, so ES = 0
    mov di, 0x1000         ; Buffer offset (adjust as needed)
    mov bl, 0x1            ; Number of sectors to read
    mov cl, 0x1            ; Starting sector (CHS, 1-based; adjust as needed)

    ; Issue ATA command to read sectors
    call read_sectors

    ; Jump to kernel (assuming loaded at 0x0000:0x1000)
    jmp 0x0000:0x1000

read_sectors:
    ; Wait for disk to be ready (BSY clear) before touching the registers
    call wait_disk_ready

    ; Select master drive, CHS mode, head 0
    mov dx, 0x1F6          ; Drive/head register
    mov al, 0xA0
    out dx, al

    ; Sector count and starting sector
    mov dx, 0x1F2          ; Sector count register
    mov al, bl
    out dx, al
    mov dx, 0x1F3          ; Sector number register
    mov al, cl
    out dx, al

    ; Cylinder 0
    xor al, al
    mov dx, 0x1F4          ; Cylinder low register
    out dx, al
    mov dx, 0x1F5          ; Cylinder high register
    out dx, al

    ; Send the Read Sector(s) command to the command register
    mov dx, 0x1F7          ; ATA command port
    mov al, 0x20           ; ATA Read Sector(s) command
    out dx, al

.read_loop:
    call wait_disk_data    ; Returns with the status byte in AL
    test al, 0x21          ; Check the ERR and DF bits
    jnz .error             ; If either is set, there's an error

    mov dx, 0x1F0          ; ATA data port
    mov cx, 256            ; 256 words = 512 bytes per sector
    rep insw               ; Read a sector to ES:DI (DI advances by 512)
    dec bl                 ; Decrement sector counter
    jnz .read_loop         ; Continue reading if more sectors remain
    ret

.error:
    ; Handle disk error
    hlt
    jmp .error

wait_disk_ready:
    ; Poll the status register until BSY is clear
    mov dx, 0x1F7          ; ATA status port
.poll:
    in al, dx
    test al, 0x80          ; Check BSY bit
    jnz .poll              ; Loop until BSY bit is clear
    ret

wait_disk_data:
    ; 400ns delay: read the alternate status register four times
    mov dx, 0x3F6
    in al, dx
    in al, dx
    in al, dx
    in al, dx
    ; Poll until BSY is clear and DRQ, ERR, or DF is set
    mov dx, 0x1F7          ; ATA status port
.poll:
    in al, dx
    test al, 0x80          ; Check BSY bit
    jnz .poll
    test al, 0x29          ; DRQ, ERR, or DF set?
    jz .poll
    ret

 

; Initialize parameters
    xor ax, ax
    mov es, ax            ; rep insw stores to ES:DI, so ES = 0
    mov di, 0x1000        ; Buffer offset (adjust as needed)
    mov bx, 0x1           ; Number of sectors to read
    mov cx, 0x0           ; Starting LBA, low 16 bits (adjust as needed)

    ; Issue ATA command to read sectors
    call ata_pio_read_sectors
    jmp endless_loop      ; Stop here once the read returns


ata_pio_read_sectors:
    pusha

    ; Wait for the drive to stop being busy before programming it
    call wait_ata_not_busy

    ; Select drive and head
    mov dx, 0x1F6          ; Drive/Head register
    mov al, 0xE0           ; Select master drive, LBA mode
    out dx, al

    ; Set sector count
    mov dx, 0x1F2          ; Sector Count register
    mov al, bl             ; Number of sectors to read
    out dx, al

    ; Set LBA low byte
    mov dx, 0x1F3          ; LBA low register
    mov al, cl
    out dx, al

    ; Set LBA mid byte
    mov dx, 0x1F4          ; LBA mid register
    mov al, ch
    out dx, al

    ; Set LBA high byte
    mov dx, 0x1F5          ; LBA high register
    mov al, 0x00           ; Assuming the LBA fits in 16 bits
    out dx, al

    ; Send read sectors command
    mov dx, 0x1F7          ; Command register
    mov al, 0x20           ; Read sectors command
    out dx, al

.read_sector:
    call wait_ata_ready    ; Clobbers AL and DX

    mov dx, 0x1F0          ; Data register
    mov cx, 256            ; 512 bytes per sector / 2 (word size)
    rep insw               ; Read sector data into the buffer at ES:DI

    ; 400ns delay so the drive can update BSY and DRQ
    mov dx, 0x3F6          ; Alternate status register
    in al, dx
    in al, dx
    in al, dx
    in al, dx

    dec bx
    jnz .read_sector       ; Loop if more sectors to read

    popa                   ; Restores all registers, including DI
    ret

wait_ata_not_busy:
    ; Wait until the BSY bit is clear
    mov dx, 0x1F7          ; Status register
.wait_loop:
    in al, dx
    test al, 0x80          ; Check BSY bit
    jnz .wait_loop
    ret

wait_ata_ready:
    ; Wait until the drive has data ready (BSY clear, DRQ set)
    mov dx, 0x1F7          ; Status register
.wait_loop:
    in al, dx
    test al, 0x80          ; Check BSY bit
    jnz .wait_loop         ; Loop until BSY bit is clear

    test al, 0x08          ; Check DRQ bit
    jz .wait_loop          ; Loop until DRQ bit is set
    ret

endless_loop:
    hlt
    jmp endless_loop

; do a singletasking PIO ATA read
; inputs: ebx = # of sectors to read, edi -> dest buffer, esi -> driverdata struct, ebp = 4b LBA
; Note: ebp is a "relative" LBA -- the offset from the beginning of the partition
; outputs: ebp, edi incremented past read; ebx = 0
; flags: zero flag set on success, carry set on failure (redundant)
read_ata_st:
	push edx
	push ecx
	push eax
	test ebx, ebx			; # of sectors < 0 is a "reset" request from software
	js short .reset
	cmp ebx, 0x3fffff		; read will be bigger than 2GB? (error)
	stc
	jg short .r_don
	mov edx, [esi + dd_prtlen]	; get the total partition length (sectors)
	dec edx				; (to avoid handling "equality" case)
	cmp edx, ebp			; verify ebp is legal (within partition limit)
	jb short .r_don			; (carry is set automatically on an error)
	cmp edx, ebx			; verify ebx is legal (forget about the ebx = edx case)
	jb short .r_don
	sub edx, ebx			; verify ebp + ebx - 1 is legal
	inc edx
	cmp edx, ebp			; (the test actually checks ebp <= edx - ebx + 1)
	jb short .r_don
	mov dx, [esi + dd_dcr]		; dx = alt status/DCR
	in al, dx			; get the current status
	test al, 0x88			; check the BSY and DRQ bits -- both must be clear
	je short .stat_ok
.reset:
	call srst_ata_st
	test ebx, ebx			; bypass any read on a "reset" request
	jns short .stat_ok
	xor ebx, ebx			; force zero flag on, carry clear
	jmp short .r_don
.stat_ok:
; preferentially use the 28bit routine, because it's a little faster
; if ebp > 28bit or esi.stLBA > 28bit or stLBA+ebp > 28bit or stLBA+ebp+ebx > 28bit, use 48 bit
	cmp ebp, 0xfffffff
	jg short .setreg
	mov eax, [esi + dd_stLBA]
	cmp eax, 0xfffffff
	jg short .setreg
	add eax, ebp
	cmp eax, 0xfffffff
	jg short .setreg
	add eax, ebx
	cmp eax, 0xfffffff
.setreg:
	mov dx, [esi + dd_tf]		; dx = IO port base ("task file")
	jle short .read28		; test the flags from the eax cmp's above
.read48:
	test ebx, ebx		; no more sectors to read?
	je short .r_don
	call pio48_read		; read up to 256 more sectors, updating registers
	je short .read48	; if successful, is there more to read?
	jmp short .r_don
.read28:
	test ebx, ebx		; no more sectors to read?
	je short .r_don
	call pio28_read		; read up to 256 more sectors, updating registers
	je short .read28	; if successful, is there more to read?
.r_don:
	pop eax
	pop ecx
	pop edx
	ret
 
 
;ATA PIO 28bit singletasking disk read function (up to 256 sectors)
; inputs: ESI -> driverdata info, EDI -> destination buffer
; BL = sectors to read, DX = base bus I/O port (0x1F0, 0x170, ...), EBP = 28bit "relative" LBA
; BSY and DRQ ATA status bits must already be known to be clear on both slave and master
; outputs: data stored in EDI; EDI and EBP advanced, EBX decremented
; flags: on success Zero flag set, Carry clear
pio28_read:
	add ebp, [esi + dd_stLBA]	; convert relative LBA to absolute LBA
	mov ecx, ebp			; save a working copy
	mov al, bl		; set al= sector count (0 means 256 sectors)
	or dl, 2		; dx = sectorcount port -- usually port 1f2
	out dx, al
	mov al, cl		; ecx currently holds LBA
	inc edx			; port 1f3 -- LBAlow
	out dx, al
	mov al, ch
	inc edx			; port 1f4 -- LBAmid
	out dx, al
	bswap ecx
	mov al, ch		; bits 16 to 23 of LBA
	inc edx			; port 1f5 -- LBAhigh
	out dx, al
	mov al, cl			; bits 24 to 27 of LBA
	or al, byte [esi + dd_sbits]	; master/slave flag | 0xe0
	inc edx				; port 1f6 -- drive select
	out dx, al
 
	inc edx			; port 1f7 -- command/status
	mov al, 0x20		; send "read" command to drive
	out dx, al
 
; ignore the error bit for the first 4 status reads -- ie. implement 400ns delay on ERR only
; wait for BSY clear and DRQ set
	mov ecx, 4
.lp1:
	in al, dx		; grab a status byte
	test al, 0x80		; BSY flag set?
	jne short .retry
	test al, 8		; DRQ set?
	jne short .data_rdy
.retry:
	dec ecx
	jg short .lp1
; need to wait some more -- loop until BSY clears or ERR sets (error exit if ERR sets)
 
.pior_l:
	in al, dx		; grab a status byte
	test al, 0x80		; BSY flag set?
	jne short .pior_l	; (all other flags are meaningless if BSY is set)
	test al, 0x21		; ERR or DF set?
	jne short .fail
.data_rdy:
; if BSY and ERR are clear then DRQ must be set -- go and read the data
	sub dl, 7		; read from data port (ie. 0x1f0)
	mov cx, 256
	rep insw		; gulp one 512b sector into edi
	or dl, 7		; "point" dx back at the status register
	in al, dx		; delay 400ns to allow drive to set new values of BSY and DRQ
	in al, dx
	in al, dx
	in al, dx
 
; After each DRQ data block it is mandatory to either:
; receive and ack the IRQ -- or poll the status port all over again
 
	inc ebp			; increment the current absolute LBA
	dec ebx			; decrement the "sectors to read" count
	test bl, bl		; check if the low byte just turned 0 (more sectors to read?)
	jne short .pior_l
 
	sub dx, 7		; "point" dx back at the base IO port, so it's unchanged
	sub ebp, [esi + dd_stLBA]	; convert absolute lba back to relative
; "test" sets the zero flag for a "success" return -- also clears the carry flag
	test al, 0x21		; test the last status ERR bits
	je short .done
.fail:
	stc
.done:
	ret
 
 
;ATA PIO 33bit singletasking disk read function (up to 64K sectors, using 48bit mode)
; inputs: bx = sectors to read (0 means 64K sectors), edi -> destination buffer
; esi -> driverdata info, dx = base bus I/O port (0x1F0, 0x170, ...), ebp = 32bit "relative" LBA
; BSY and DRQ ATA status bits must already be known to be clear on both slave and master
; outputs: data stored in edi; edi and ebp advanced, ebx decremented
; flags: on success Zero flag set, Carry clear
pio48_read:
	xor eax, eax
	add ebp, [esi + dd_stLBA]	; convert relative LBA to absolute LBA
; special case: did the addition overflow 32 bits (carry set)?
	adc ah, 0			; if so, ah = LBA byte #5 = 1
	mov ecx, ebp			; save a working copy of 32 bit absolute LBA
 
; for speed purposes, never OUT to the same port twice in a row -- avoiding it is messy but best
;outb (0x1F2, sectorcount high)
;outb (0x1F3, LBA4)
;outb (0x1F4, LBA5)			-- value = 0 or 1 only
;outb (0x1F5, LBA6)			-- value = 0 always
;outb (0x1F2, sectorcount low)
;outb (0x1F3, LBA1)
;outb (0x1F4, LBA2)
;outb (0x1F5, LBA3)
	bswap ecx		; make LBA4 and LBA3 easy to access (cl, ch)
	or dl, 2		; dx = sectorcount port -- usually port 1f2
	mov al, bh		; sectorcount -- high byte
	out dx, al
	mov al, cl
	inc edx
	out dx, al		; LBA4 = LBAlow, high byte (1f3)
	inc edx
	mov al, ah		; LBA5 was calculated above
	out dx, al		; LBA5 = LBAmid, high byte (1f4)
	inc edx
	mov al, 0		; LBA6 is always 0 in 32 bit mode
	out dx, al		; LBA6 = LBAhigh, high byte (1f5)
 
	sub dl, 3
	mov al, bl		; sectorcount -- low byte (1f2)
	out dx, al
	mov ax, bp		; get LBA1 and LBA2 into ax
	inc edx
	out dx, al		; LBA1 = LBAlow, low byte (1f3)
	mov al, ah		; LBA2
	inc edx
	out dx, al		; LBA2 = LBAmid, low byte (1f4)
	mov al, ch		; LBA3
	inc edx
	out dx, al		; LBA3 = LBAhigh, low byte (1f5)
 
	mov al, byte [esi + dd_sbits]	; master/slave flag | 0xe0
	inc edx
	and al, 0x50		; get rid of extraneous LBA28 bits in drive selector
	out dx, al		; drive select (1f6)
 
	inc edx
	mov al, 0x24		; send "read ext" command to drive
	out dx, al		; command (1f7)
 
; ignore the error bit for the first 4 status reads -- ie. implement 400ns delay on ERR only
; wait for BSY clear and DRQ set
	mov ecx, 4
.lp1:
	in al, dx		; grab a status byte
	test al, 0x80		; BSY flag set?
	jne short .retry
	test al, 8		; DRQ set?
	jne short .data_rdy
.retry:
	dec ecx
	jg short .lp1
; need to wait some more -- loop until BSY clears or ERR sets (error exit if ERR sets)
 
.pior_l:
	in al, dx		; grab a status byte
	test al, 0x80		; BSY flag set?
	jne short .pior_l	; (all other flags are meaningless if BSY is set)
	test al, 0x21		; ERR or DF set?
	jne short .fail
.data_rdy:
; if BSY and ERR are clear then DRQ must be set -- go and read the data
	sub dl, 7		; read from data port (ie. 0x1f0)
	mov cx, 256
	rep insw		; gulp one 512b sector into edi
	or dl, 7		; "point" dx back at the status register
	in al, dx		; delay 400ns to allow drive to set new values of BSY and DRQ
	in al, dx
	in al, dx
	in al, dx
 
; After each DRQ data block it is mandatory to either:
; receive and ack the IRQ -- or poll the status port all over again
 
	inc ebp			; increment the current absolute LBA (overflowing is OK!)
	dec ebx			; decrement the "sectors to read" count
	test bx, bx		; check if "sectorcount" just decremented to 0
	jne short .pior_l
 
	sub dx, 7		; "point" dx back at the base IO port, so it's unchanged
	sub ebp, [esi + dd_stLBA]	; convert absolute lba back to relative
; this sub handles the >32bit overflow cases correctly, too
; "test" sets the zero flag for a "success" return -- also clears the carry flag
	test al, 0x21		; test the last status ERR bits
	je short .done
.fail:
	stc
.done:
	ret
 
 
; do a singletasking PIO ata "software reset" with DCR in dx
srst_ata_st:
	push eax
	mov al, 4
	out dx, al			; do a "software reset" on the bus
	xor eax, eax
	out dx, al			; reset the bus to normal operation
	in al, dx			; it might take 4 tries for status bits to reset
	in al, dx			; ie. do a 400ns delay
	in al, dx
	in al, dx
.rdylp:
	in al, dx
	and al, 0xc0			; check BSY and RDY
	cmp al, 0x40			; want BSY clear and RDY set
	jne short .rdylp
	pop eax
	ret

 

 


 

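[BITS 16]
[ORG 0x7C00]

start:
    cli
    xor ax, ax
    mov ds, ax              ; DS = ES = SS = 0 to match ORG 0x7C00
    mov es, ax
    mov ss, ax
    mov sp, 0x7C00

    ; Load second stage bootloader (assuming it's located at LBA 1)
    mov ebx, 1              ; Number of sectors to read
    mov edi, 0x9000         ; Destination buffer (ES:DI = 0x0000:0x9000)
    lea esi, [drive_data]   ; Driver data struct
    mov ebp, 1              ; LBA (logical block address) of the second stage

    call read_ata_st
    jc disk_error           ; Carry set means the read failed

    ; Jump to the second stage bootloader
    jmp 0x0000:0x9000

disk_error:
    hlt
    jmp disk_error

; Field offsets within the driver data struct, as used by read_ata_st
; through [esi + dd_xxx]
dd_tf     equ 0             ; Task file IO port base (word)
dd_dcr    equ 2             ; Alternate status/DCR port (word)
dd_sbits  equ 4             ; Drive select bits (byte)
dd_prtlen equ 5             ; Partition length in sectors (dword)
dd_stLBA  equ 9             ; Starting LBA of the partition (dword)

drive_data:
    dw 0x1F0                ; Task file IO port base
    dw 0x3F6                ; Alternate status/DCR
    db 0xE0                 ; Master drive
    dd 0x10000              ; Partition length (adjust as needed)
    dd 0                    ; Starting LBA

; Include the read_ata_st function before the padding so that it is part of
; the 512-byte boot sector the BIOS loads. Everything must fit in the first
; 510 bytes; if it does not, NASM rejects the negative TIMES count below and
; the included routines have to be trimmed.
%include "read_ata.inc"

times 510-($-$$) db 0
dw 0xAA55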

 

[BITS 16]
[ORG 0x9000]

start:
    ; Make sure DS is 0 so DS:SI matches the ORG 0x9000 addresses
    xor ax, ax
    mov ds, ax

    ; Print a message
    mov si, hello_msg
    call print_string

    ; Further code to load the kernel or additional tasks

    ; Hang the system
    cli
    hlt

print_string:
    .loop:
        lodsb
        or al, al
        jz .done
        mov ah, 0x0E
        int 0x10
        jmp .loop
    .done:
        ret

hello_msg db "Second stage bootloader loaded successfully!", 0

times 512-($-$$) db 0

 

 


[BITS 16]
[ORG 0x7C00]

start:
    cli                 ; Clear interrupts
    xor ax, ax
    mov ds, ax          ; DS = ES = 0 to match ORG 0x7C00
    mov es, ax
    mov ss, ax
    mov sp, 0x7C00      ; Set up stack

    ; Load second stage bootloader from LBA 1
    mov bx, 0x8000      ; Buffer offset in memory (ES:BX = 0x0000:0x8000)
    call read_sectors

    ; Jump to the second stage bootloader
    jmp 0x0000:0x8000

read_sectors:
    pusha
    mov si, 1           ; LBA (logical block addressing) sector to read

    ; Wait until the drive is no longer busy
    mov dx, 0x1F7       ; Status register of the primary ATA channel
.wait_bsy:
    in al, dx
    test al, 0x80       ; BSY bit
    jnz .wait_bsy

    ; Send commands to read sectors
    mov dx, 0x1F6       ; Drive/head register
    mov al, 0xE0        ; LBA mode, master drive
    out dx, al
    mov dx, 0x1F2       ; Sector count register
    mov al, 1           ; Number of sectors to read
    out dx, al
    mov ax, si          ; AL = LBA bits 0-7, AH = LBA bits 8-15
    mov dx, 0x1F3
    out dx, al          ; LBA low byte
    mov al, ah
    mov dx, 0x1F4
    out dx, al          ; LBA mid byte
    xor al, al
    mov dx, 0x1F5
    out dx, al          ; LBA high byte (bits 16-23 are 0)
    mov dx, 0x1F7       ; Command register
    mov al, 0x20        ; Read command
    out dx, al

    ; Wait for the drive to have data ready (BSY clear, DRQ set)
.wait_data:
    in al, dx
    test al, 0x80
    jnz .wait_data
    test al, 0x08
    jz .wait_data

    ; Read data from disk
    mov dx, 0x1F0       ; Data register
    mov cx, 256         ; 512 bytes = 256 words
.read_loop:
    in ax, dx           ; Read word from disk
    mov [es:bx], ax     ; Store word in memory
    add bx, 2
    loop .read_loop

    popa
    ret

times 510-($-$$) db 0   ; Pad the rest of the sector with zeros
dw 0xAA55               ; Boot signature

[BITS 16]
[ORG 0x8000]

start:
    xor ax, ax      ; DS = ES = 0 so addresses match ORG 0x8000
    mov ds, ax
    mov es, ax

    mov si, msg     ; Load the address of the message string
    call print_str  ; Call the function to print the string

    jmp $           ; Infinite loop

print_char:
    ; Print character in AL
    mov ah, 0x0E    ; BIOS teletype function
    mov bh, 0x00    ; Page number (0 for mode 3)
    mov bl, 0x07    ; Text attribute (0x07 is white on black)
    int 0x10        ; Call BIOS interrupt

    ret

print_str:
    ; Print null-terminated string at DS:SI
    .next_char:
        lodsb       ; Load next byte of string into AL
        test al, al ; Check for null terminator
        jz .done    ; If null terminator, we're done
        call print_char
        jmp .next_char
    .done:
        ret

msg db 'Hello, Second Stage Bootloader!', 0

times 512-($-$$) db 0   ; Pad the rest of the sector with zeros
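
To assemble both stages and write them to a disk (replace /dev/sdX with the target device; this overwrites its first sectors, and second_stage.asm is the assumed name of the second-stage source file):

nasm -f bin -o bootloader.bin bootloader.asm
nasm -f bin -o second_stage.bin second_stage.asm
dd if=bootloader.bin of=/dev/sdX bs=512 count=1
dd if=second_stage.bin of=/dev/sdX bs=512 seek=1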

 

 

 

 

 

 


 ;; READ KERNEL INTO MEMORY SECOND
    mov bl, 9           ; Will be reading 10 sectors
    mov di, 2000h       ; Memory address to read sectors into (0000h:2000h)

    mov dx, 1F6h        ; Head & drive # port
    mov al, [drive_num] ; Drive # - hard disk 1
    and al, 0Fh         ; Head # (low nibble)
    or al, 0A0h         ; default high nibble to 'primary' drive (drive 1), 'secondary' drive (drive 2) would be hex B or 1011b
    out dx, al          ; Send head/drive #

    mov dx, 1F2h        ; Sector count port
    mov al, 0Ah         ; # of sectors to read
    out dx, al

    mov dx, 1F3h        ; Sector # port
    mov al, 2           ; Sector to start reading at (CHS sectors are 1-based)
    out dx, al

    mov dx, 1F4h        ; Cylinder low port
    xor al, al          ; Cylinder low #
    out dx, al

    mov dx, 1F5h        ; Cylinder high port
    xor al, al          ; Cylinder high #
    out dx, al

    mov dx, 1F7h        ; Command port (writing port 1F7h)
    mov al, 20h         ; Read with retry
    out dx, al

;; Poll status port after reading 1 sector
kernel_loop:
    in al, dx           ; Status register (reading port 1F7h)
    test al, 8          ; Sector buffer requires servicing
    je kernel_loop     ; Keep trying until sector buffer is ready

    mov cx, 256         ; # of words to read for 1 sector
    mov dx, 1F0h        ; Data port, reading 
    rep insw            ; Read CX words from port DX into memory at ES:DI
    
    ;; 400ns delay - Read alternate status register
    mov dx, 3F6h
    in al, dx
    in al, dx
    in al, dx
    in al, dx

    cmp bl, 0
    je jump_to_kernel

    dec bl
    mov dx, 1F7h
    jmp kernel_loop

jump_to_kernel: