Directly accessing the disk using the ATA/ATAPI interface allows bypassing BIOS interrupts for disk operations.
ATA, also known as IDE (Integrated Drive Electronics), is a standard interface for connecting storage devices like hard drives and SSDs to a computer system. It uses a series of I/O ports to send commands and data between the CPU and the disk.
ATAPI (ATA Packet Interface) is an extension of ATA that supports devices such as CD-ROM drives. These interfaces provide a set of commands and protocols for reading and writing data to and from storage devices.
Unlike BIOS interrupts, which provide a layer of abstraction for disk operations, ATA/ATAPI allows direct communication between the CPU and the disk controller, offering more direct control over the storage device.
Why We Need to Read from the Disk
When a computer boots, the BIOS (Basic Input/Output System) initializes hardware components and searches for a bootable device. The BIOS loads a small program called the bootloader from the boot sector of the device. The bootloader’s job is to load the operating system's kernel into memory, thus starting the operating system.
Understanding ATA/ATAPI:
- ATA and ATAPI are interfaces used for communication between storage devices (such as hard drives and optical drives) and the host system.
- ATA is primarily used for accessing hard disk drives (HDDs) and solid-state drives (SSDs), while ATAPI extends ATA to support devices like CD/DVD drives.
- These interfaces define command sets and protocols for tasks such as reading/writing data sectors, querying device information, and managing disk operations.
Benefits of Direct Disk Access:
- Enhanced Performance: Direct disk access bypasses the BIOS layer, allowing for more efficient disk operations and reduced overhead.
- Greater Control: Bootloader developers have fine-grained control over disk access parameters, enabling optimization for specific hardware configurations and performance requirements.
- Flexibility: Direct access facilitates advanced disk operations beyond basic read/write, such as querying device features, performing diagnostics, and implementing custom protocols.
Implementing Direct Disk Access
Implementing ATA PIO (programmed I/O) access involves writing low-level code that directly interacts with hardware registers to read data from a disk. This task is typically done in assembly language or low-level C.
Overview:
- Setup Drive and Disk Parameters: Configure the drive, head, cylinder, and sector.
- Send Commands to the Disk: Use the ATA command set to issue read commands.
- Poll the Disk: Wait for the disk to be ready to transfer data.
- Read Data from the Disk: Transfer the data from the disk to memory.
- Handle Errors: Ensure that errors are handled appropriately.
Key ATA Ports and Commands
Ports:
- 0x1F0: Data port (read/write data).
- 0x1F1: Error register (read) / Features register (write).
- 0x1F2: Sector count (number of sectors to read/write).
- 0x1F3: Sector number (starting sector).
- 0x1F4: Cylinder low byte.
- 0x1F5: Cylinder high byte.
- 0x1F6: Drive/head register.
- 0x1F7: Status register (read) / Command register (write).
- 0x3F6: Alternate status register (read).
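The port list can also be written as a base address plus register offsets, which is how the same layout applies to the secondary channel at 0x170. The following is a minimal sketch in C, not a definitive driver header. The names (ATA_PRIMARY_BASE, ata_port, and so on) are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Base I/O ports of the two legacy ATA channels. */
enum { ATA_PRIMARY_BASE = 0x1F0, ATA_SECONDARY_BASE = 0x170 };

/* Register offsets from the channel base (illustrative names). */
enum ata_reg {
    ATA_REG_DATA         = 0, /* read/write data                  */
    ATA_REG_ERROR_FEAT   = 1, /* error (read) / features (write)  */
    ATA_REG_SECTOR_COUNT = 2,
    ATA_REG_SECTOR_NUM   = 3, /* CHS sector number / LBA low      */
    ATA_REG_CYL_LOW      = 4, /* CHS cylinder low / LBA mid       */
    ATA_REG_CYL_HIGH     = 5, /* CHS cylinder high / LBA high     */
    ATA_REG_DRIVE_HEAD   = 6,
    ATA_REG_STATUS_CMD   = 7  /* status (read) / command (write)  */
};

/* Compute the absolute I/O port for a register on a given channel. */
static inline uint16_t ata_port(uint16_t base, enum ata_reg reg)
{
    return (uint16_t)(base + (uint16_t)reg);
}
```

For example, ata_port(ATA_PRIMARY_BASE, ATA_REG_STATUS_CMD) yields 0x1F7, the status/command port used throughout this article.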
Commands:
- 0x20: Read sectors with retries.
Example:
1 Initialization:
We will read just a single sector, so we pass the remaining-sector count in the bl register (0-based, so 0 means one sector) and the memory address where the sector should be loaded in di.
mov bl, 0 ; will be reading 1 sector (0-based indexing)
mov di, 0x8000 ; Memory address to load sectors
2 Set Drive and Head:
The port 0x1F6 is used to select the drive and its head.
; Set drive and head
mov dx, 0x1F6 ; Head & drive # port
mov al, 0xA0 ; Master drive, CHS mode, head 0
out dx, al ; Output to port
3 Set Sector Count:
The port 0x1F2 is the sector count register of the ATA interface. It specifies how many sectors we want to read or write. The value written here must match the number of sectors the read loop actually transfers.
; Set sector count
mov dx, 0x1F2 ; Sector count port
mov al, 0x01 ; 1 sector to read (bl + 1)
out dx, al ; Output to port
4 Set Starting Sector:
The port 0x1F3 is the sector number register. It specifies the starting sector for read/write operations on the disk.
- In CHS addressing, sector numbers are 1-based. This means:
- Sector 1 is the first sector (the boot sector).
- Sector 2 is the second sector.
; Starting sector
mov dx, 0x1F3 ; Sector # port
mov al, 0x02 ; Start at sector 2
out dx, al ; Output to port
5 Set Cylinder number:
The cylinder number is part of the Cylinder-Head-Sector (CHS) addressing scheme used in traditional hard drives. It specifies which cylinder (track) on the disk to read from or write to. In the ATA interface, the cylinder number is split into a low byte and a high byte, which are written to ports 0x1F4 and 0x1F5 respectively.
In the code example below, we assume the cylinder number is 0 (both low and high bytes are 0). To address a different cylinder, write the low and high bytes of that number instead.
- Set Cylinder Low Byte (0x1F4):
; Set cylinder number low byte
mov dx, 0x1F4 ; Cylinder low port
xor al, al ; Cylinder low byte = 0
out dx, al ; Output to port
- Set Cylinder High Byte (0x1F5):
; Set cylinder number high byte
mov dx, 0x1F5 ; Cylinder high port
xor al, al ; Cylinder high byte = 0
out dx, al ; Output to port
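To see where the values for the sector, head, and cylinder registers come from, here is a hedged sketch in C that converts a linear sector index into CHS and splits the cylinder into its two register bytes. The geometry parameters (heads, sectors_per_track) and all function names are assumptions made for illustration. Real drives report their geometry, and modern ones are normally addressed with LBA instead.

```c
#include <stdint.h>

struct chs {
    uint16_t cylinder; /* 0-based */
    uint8_t  head;     /* 0-based */
    uint8_t  sector;   /* 1-based, as CHS requires */
};

/* Convert a 0-based linear sector index to CHS for an assumed geometry. */
static struct chs lba_to_chs(uint32_t lba, uint32_t heads,
                             uint32_t sectors_per_track)
{
    struct chs r;
    r.cylinder = (uint16_t)(lba / (heads * sectors_per_track));
    r.head     = (uint8_t)((lba / sectors_per_track) % heads);
    r.sector   = (uint8_t)((lba % sectors_per_track) + 1);
    return r;
}

/* Byte written to the cylinder low register (0x1F4). */
static uint8_t cylinder_low(uint16_t cylinder)
{
    return (uint8_t)(cylinder & 0xFF);
}

/* Byte written to the cylinder high register (0x1F5). */
static uint8_t cylinder_high(uint16_t cylinder)
{
    return (uint8_t)(cylinder >> 8);
}
```

With an assumed geometry of 16 heads and 63 sectors per track, linear sector 1 (the sector right after the boot sector) maps to cylinder 0, head 0, sector 2, which are the values used in the steps above.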
6 Sending the Read Command:
Port 0x1F7 is the command register when written to. The command register is used to send commands to the disk controller.
0x20 is the command code for reading sectors with retries. This command tells the disk controller to read sectors from the disk, and it will retry the read operation in case of errors.
; Send read command
mov dx, 0x1F7 ; Command port
mov al, 0x20 ; Read sectors with retries
out dx, al ; Output to port
The above code writes 0x20 to port 0x1F7. This instructs the disk controller to begin the read operation based on the parameters set in the other registers (such as the sector count, starting sector, and cylinder number).
7 Polling the Status Port:
When read, port 0x1F7 is the status register. The status register contains various status bits indicating the current state of the controller.
read_loop:
; Poll status port
mov dx, 0x1F7 ; Status port
.wait_status:
in al, dx ; Read status register
test al, 0x08 ; Check if data buffer is ready
jz .wait_status ; Loop until ready (je is an equivalent mnemonic)
- read_loop: This label marks the beginning of the loop where the status port will be polled.
- mov dx, 0x1F7: This instruction loads the dx register with the port address 0x1F7, which is the status port of the ATA interface.
- .wait_status: This is a local label indicating the start of the loop iteration.
- in al, dx: This instruction reads the status register of the ATA interface and stores the result in the al register. The status register contains various status flags, including the one indicating whether the data buffer is ready for reading.
- test al, 0x08: This instruction performs a bitwise AND between the value in the al register and 0x08 (binary 00001000) without storing the result. This tests bit 3 (the 4th bit from the right) of the status register, the DRQ flag, which indicates whether the data buffer is ready.
- jz .wait_status: This instruction jumps back to the .wait_status label if the result of the test instruction is zero (the zero flag, ZF, is set), meaning the data buffer is not yet ready. This creates a loop that continues until the data buffer is ready for reading.
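The status byte carries more than the DRQ bit tested above. As a minimal sketch in C, the helpers below decode the bits that matter for a polled read. The constant and function names are hypothetical, and the bit values are the ones used by the assembly listings in this article (0x80 for BSY, 0x08 for DRQ, 0x01 for ERR, 0x20 for DF).

```c
#include <stdbool.h>
#include <stdint.h>

/* Status register bits (illustrative names). */
enum {
    ATA_SR_ERR  = 0x01, /* error occurred                      */
    ATA_SR_DRQ  = 0x08, /* data request: buffer ready          */
    ATA_SR_DF   = 0x20, /* drive fault                         */
    ATA_SR_DRDY = 0x40, /* drive ready                         */
    ATA_SR_BSY  = 0x80  /* busy: other bits are not meaningful */
};

/* True when the drive is not busy and has a sector ready to transfer. */
static bool ata_data_ready(uint8_t status)
{
    if (status & ATA_SR_BSY)
        return false;
    return (status & ATA_SR_DRQ) != 0;
}

/* True when the drive is not busy and reports an error or fault. */
static bool ata_failed(uint8_t status)
{
    if (status & ATA_SR_BSY)
        return false;
    return (status & (ATA_SR_ERR | ATA_SR_DF)) != 0;
}
```

A polling loop built on these helpers would keep reading the status port until ata_data_ready() or ata_failed() returns true. Checking BSY first matters because the other bits are undefined while the drive is busy.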
8 Read Sector Data:
; Read sector data
mov cx, 256 ; 256 words (512 bytes)
mov dx, 0x1F0 ; Data port
rep insw ; Read words from port to memory
- mov cx, 256: This instruction moves the value 256 into the cx register. Here cx is the number of words to read from the data port. Since each word is 2 bytes, 256 words equal 512 bytes, which is the size of a sector in the ATA interface.
- mov dx, 0x1F0: This instruction loads the dx register with the port address 0x1F0, the port used for reading data from the disk.
- rep insw: The insw (input string word) instruction reads a word (2 bytes) from the port specified in dx, stores it at the memory location es:di, and advances di by 2 (with the direction flag clear). The rep prefix repeats insw cx times, so 256 words are read from the port into consecutive memory locations.
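The effect of rep insw can be modelled in C as a loop that reads 256 words from a port and stores each one low byte first. This is only a sketch for reasoning about the transfer: the port access is passed in as a callback (port_read16_fn, a hypothetical name), and counting_port is a stand-in for real hardware so the code can run on an ordinary machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Callback that performs one 16-bit port read (stand-in for "in ax, dx"). */
typedef uint16_t (*port_read16_fn)(uint16_t port, void *ctx);

/* Read one 512-byte sector as 256 words, like "mov cx, 256 / rep insw". */
static size_t read_sector_words(port_read16_fn read16, void *ctx,
                                uint16_t port, uint8_t *dest)
{
    for (size_t i = 0; i < 256; i++) {
        uint16_t word = read16(port, ctx);
        dest[2 * i]     = (uint8_t)(word & 0xFF); /* low byte first (x86) */
        dest[2 * i + 1] = (uint8_t)(word >> 8);
    }
    return 512; /* bytes stored */
}

/* Fake port for demonstration: returns 0, 1, 2, ... on successive reads. */
static uint16_t counting_port(uint16_t port, void *ctx)
{
    uint16_t *next = (uint16_t *)ctx;
    (void)port;
    return (*next)++;
}
```

Calling read_sector_words(counting_port, &counter, 0x1F0, buffer) with counter starting at 0 fills the buffer with the words 0 through 255 and leaves counter at 256, mirroring how cx reaches zero after the transfer.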
9 Delay of Approximately 400 nanoseconds:
After a sector has been transferred, the drive needs a short time (up to about 400 ns) before its status bits are valid again. A common way to wait is to read the alternate status port (0x3F6) several times and discard the results: each port read is commonly assumed to take roughly 100 ns, so four reads give the required delay. The alternate status register returns the same value as the regular status register, but reading it does not clear a pending interrupt.
The sequence of four in al, dx instructions below implements this delay.
; 400ns delay
mov dx, 0x3F6 ; Alternate status port
in al, dx
in al, dx
in al, dx
in al, dx
- mov dx, 0x3F6: This instruction loads the dx register with the port address 0x3F6, the alternate status port.
- in al, dx: This instruction reads a byte from the port specified in dx and stores it in al. It is executed four times in succession, and the value read is discarded.
read_loop:
; Poll status port
mov dx, 0x1F7 ; Status port
.wait_status:
in al, dx ; Read status register
test al, 0x08 ; Check if data buffer is ready
jz .wait_status ; Loop until ready
; Read sector data
mov cx, 256 ; 256 words (512 bytes)
mov dx, 0x1F0 ; Data port
rep insw ; Read CX words from port DX into memory at ES:DI
; 400ns delay
mov dx, 0x3F6 ; Alternate status port
in al, dx
in al, dx
in al, dx
in al, dx
; Check if more sectors to read
dec bl ; Decrement sector count
js jump_to_kernel ; If the count went negative, all sectors are read
jmp read_loop ; Otherwise, read next sector
jump_to_kernel:
jmp 0x0000:0x8000 ; Jump to the loaded kernel
10 Determine If more Sectors to Read:
; Check if more sectors to read
dec bl ; Decrement sector count
js jump_to_kernel ; If the count went negative, all sectors are read
jmp read_loop ; Otherwise, read next sector
- dec bl: This instruction decrements the value stored in the bl register by 1. In this code, bl holds the number of sectors remaining to be read, minus one. As we want to read just a single sector, we initially set bl to 0.
- If bl was initially set to 0, then after executing dec bl it becomes -1 (0xFF). The sign flag (SF) is set, because -1 is a negative number in two's complement representation. Therefore, the jump to jump_to_kernel will be taken.
- js jump_to_kernel: This instruction checks the sign flag (SF) and jumps to the label jump_to_kernel if it is set. The sign flag is set when the result of an arithmetic operation is negative.
- jmp read_loop: When the jump to jump_to_kernel is taken because bl became negative, this instruction is not executed. Otherwise, it jumps back to read_loop to read the next sector.
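The 0-based counting in bl is easy to get wrong, so it can help to model the dec bl / js pair directly. The following C sketch (the function name is hypothetical) counts how many times the read loop body runs for a given initial bl, using bit 7 of the 8-bit result as the sign flag.

```c
#include <stdint.h>

/* Number of sectors the loop reads for a given initial value of bl. */
static unsigned sectors_read_for(uint8_t bl_initial)
{
    uint8_t bl = bl_initial;
    unsigned reads = 0;

    do {
        reads++;                /* one pass of read_loop reads one sector */
        bl = (uint8_t)(bl - 1); /* dec bl                                 */
    } while (!(bl & 0x80));     /* js exits when bit 7 (the sign) is set  */

    return reads;
}
```

With bl = 0 the loop runs once, and with bl = 9 it runs ten times, so bl must be set to the sector count minus one. The value written to the sector count register (0x1F2) should agree with that number.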
In summary, direct disk access involves the following steps:
- Identifying the Disk Controller: Determine the type of disk controller (ATA or ATAPI) present in the system.
- Initializing the Controller: Configure controller registers, select appropriate I/O ports or memory-mapped registers, and set communication parameters.
- Sending ATA/ATAPI Commands: Construct and send commands (e.g., READ SECTORS) to the disk controller, specifying the sector(s) to read and the destination buffer for storing data.
- Handling Data Transfer: Monitor controller status signals, retrieve read data from controller registers or buffers, and manage error conditions and recovery mechanisms.
- Finalization: Clean up resources, perform additional processing on read data (if needed), and ensure proper error handling and validation.
Difference Between Reading Using ATA/ATAPI and Reading Using a File System
Reading using ATA/ATAPI:
- Low-Level Access: ATA/ATAPI provides low-level access to storage devices, such as hard drives and optical drives. It allows direct communication with the device's hardware components, including reading and writing raw data sectors.
- Sector-Based: Reading using ATA/ATAPI involves specifying the physical location of data on the storage device in terms of sectors. Sectors are the smallest addressable units on the disk, typically consisting of 512 bytes each.
- No File System Required: When reading using ATA/ATAPI, you bypass the file system layer altogether. You access the data directly from the disk without any interpretation or organization imposed by a file system.
- Limited Metadata: ATA/ATAPI does not provide access to file metadata such as file names, sizes, or directory structures. It only retrieves the raw data stored on the disk.
Reading Using a File System:
- High-Level Abstraction: File systems provide a higher level of abstraction over storage devices. They organize data into files, directories, and metadata, providing a structured way to store and access information.
- File-Based Access: When reading using a file system, you specify the file's name or path rather than its physical location on the disk. The file system handles translating this logical information into physical disk sectors.
- Interprets Metadata: File systems store metadata associated with files, such as file names, sizes, timestamps, and permissions. When reading a file, the file system interprets this metadata to provide additional context about the data being accessed.
- Disk Interface: The file system operates at a higher level of abstraction, using ATA/ATAPI (or other storage interfaces) to read and write the actual data on the disk.
- File System Drivers: Reading using a file system requires file system drivers or libraries to be present in the operating system or application. These drivers understand the file system's structure and implement algorithms to navigate directories, locate files, and retrieve data efficiently.
Interaction Between FS and ATA/ATAPI:
- When an application requests to read or write a file, the file system translates this high-level operation into low-level disk operations.
- The file system determines the location of the file on the disk, which involves understanding the file system's structures like the file allocation table (FAT) or inode table.
- The file system then issues ATA/ATAPI commands to the storage device to read or write the necessary sectors.
Example Workflow:
- Read Operation:
- An application requests to read a file.
- The file system locates the file's metadata (e.g., in the FAT or inode table) to determine where the file's data is stored on the disk.
- The file system calculates which disk sectors need to be read.
- The file system issues ATA/ATAPI commands to read those sectors from the disk.
- The file system collects the data from these sectors and presents it to the application as a coherent file.
- Write Operation:
- An application requests to write a file.
- The file system allocates space on the disk for the file.
- The file system updates its structures to reflect the new file's location and metadata.
- The file system calculates which disk sectors need to be written.
- The file system issues ATA/ATAPI commands to write data to those sectors on the disk.
- The file system updates its structures to reflect the new state of the disk.
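The step "the file system calculates which disk sectors need to be read" can be illustrated with a small calculation. The C sketch below assumes a file whose data is stored contiguously starting at a known LBA. Real file systems such as FAT follow cluster chains instead, so this shows only the arithmetic, and all names are hypothetical.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

struct sector_span {
    uint32_t first_lba; /* first sector to read       */
    uint32_t count;     /* number of sectors to read  */
};

/* Which sectors cover bytes [offset, offset + length) of a contiguous file? */
static struct sector_span sectors_for_range(uint32_t file_start_lba,
                                            uint32_t offset, uint32_t length)
{
    struct sector_span s;
    uint32_t first = offset / SECTOR_SIZE;

    s.first_lba = file_start_lba + first;
    s.count = 0;
    if (length > 0) {
        uint32_t last = (offset + length - 1) / SECTOR_SIZE;
        s.count = last - first + 1;
    }
    return s;
}
```

For a file starting at LBA 100, reading 20 bytes at offset 500 crosses a sector boundary, so the result is first_lba = 100 and count = 2. These two numbers are what would end up in the LBA and sector count registers of the ATA read command.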
[BITS 16]
[ORG 0x7E00]
jmp start
start:
; Initialize parameters
xor ax, ax
mov es, ax ; rep insw stores to ES:DI
mov di, 0x1000 ; Buffer address (adjust as needed)
mov bl, 0x1 ; Number of sectors to read
mov cl, 0x2 ; Starting sector number (CHS, 1-based; adjust as needed)
; Issue ATA command to read sectors
call read_sectors
; Jump to kernel (assuming loaded at 0x0000:0x1000)
jmp 0x0000:0x1000
read_sectors:
; Wait for disk to be ready
call wait_disk_ready
; Select master drive, CHS mode, head 0
mov dx, 0x1F6 ; Drive/head register
mov al, 0xA0
out dx, al
; Set sector count, starting sector, and cylinder 0
mov dx, 0x1F2 ; Sector count register
mov al, bl
out dx, al
mov dx, 0x1F3 ; Sector number register
mov al, cl
out dx, al
mov dx, 0x1F4 ; Cylinder low register
xor al, al
out dx, al
mov dx, 0x1F5 ; Cylinder high register
out dx, al
; Send the read command
mov dx, 0x1F7 ; ATA command port
mov al, 0x20 ; ATA Read Sector(s) command
out dx, al
.read_loop:
call wait_disk_ready ; Returns with DX = 0x1F7 and the status byte in AL
test al, 0x21 ; Check ERR and DF bits
jnz .error ; If either is set, there's an error
test al, 0x08 ; Check DRQ bit
jz .read_loop ; Data not ready yet, poll again
mov dx, 0x1F0 ; ATA data port
mov cx, 256 ; 256 words = 512 bytes per sector
rep insw ; Read a sector into ES:DI (DI advances by 512)
mov dx, 0x3F6 ; Alternate status port
in al, dx ; Four reads give the drive about 400 ns to update its status
in al, dx
in al, dx
in al, dx
dec bl ; Decrement sector counter
jnz .read_loop ; Continue reading if more sectors remain
ret
.error:
; Handle disk error
hlt
jmp .error
wait_disk_ready:
; Poll the status register until the BSY bit is clear
mov dx, 0x1F7 ; ATA status port
.poll:
in al, dx
test al, 0x80 ; Check BSY bit
jnz .poll ; Loop until BSY bit is clear
ret
; Initialize parameters
xor ax, ax
mov es, ax ; rep insw stores to ES:DI
mov di, 0x1000 ; Buffer address (adjust as needed)
mov bx, 0x1 ; Number of sectors to read (1-255)
mov cx, 0x0 ; Starting LBA, low 16 bits (adjust as needed)
; Issue ATA command to read sectors
call ata_pio_read_sectors
jmp endless_loop ; Do not fall through into the routine below
ata_pio_read_sectors:
pusha
; Wait for the drive to be idle before programming it
call wait_ata_not_busy
; Select drive and head
mov dx, 0x1F6 ; Drive/Head register
mov al, 0xE0 ; Select master drive, LBA mode, LBA bits 24-27 = 0
out dx, al
; Set sector count
mov dx, 0x1F2 ; Sector Count register
mov al, bl ; Number of sectors to read
out dx, al
; Set LBA low byte
mov dx, 0x1F3 ; LBA low register
mov al, cl
out dx, al
; Set LBA mid byte
mov dx, 0x1F4 ; LBA mid register
mov al, ch
out dx, al
; Set LBA high byte
mov dx, 0x1F5 ; LBA high register
mov al, 0x00 ; LBA bits 16-23 are zero for a 16-bit LBA
out dx, al
; Send read sectors command
mov dx, 0x1F7 ; Command register
mov al, 0x20 ; Read sectors command
out dx, al
; Read sectors
.read_sector:
call wait_ata_ready ; Returns with BSY clear and DRQ set
mov dx, 0x1F0 ; Data register (wait_ata_ready changed DX)
mov cx, 256 ; 512 bytes per sector / 2 (word size)
rep insw ; Read sector data into buffer at ES:DI
dec bl
jnz .read_sector ; Loop if more sectors to read
popa
ret
wait_ata_not_busy:
; Wait until the BSY bit is clear
mov dx, 0x1F7 ; Status register
.wait_loop:
in al, dx
test al, 0x80 ; Check BSY bit
jnz .wait_loop ; Loop until BSY bit is clear
ret
wait_ata_ready:
; Wait until the drive has data ready
mov dx, 0x1F7 ; Status register
in al, dx ; Four dummy reads give the drive about 400 ns
in al, dx ; to update its status bits
in al, dx
in al, dx
.wait_loop:
in al, dx
test al, 0x80 ; Check BSY bit
jnz .wait_loop ; Loop until BSY bit is clear
test al, 0x08 ; Check DRQ bit
jz .wait_loop ; Loop until DRQ bit is set
ret
endless_loop:
hlt
jmp endless_loop
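The routine above only handles an LBA that fits in 16 bits. For a full 28-bit LBA, the address is spread over four registers: three whole bytes plus the low nibble of the drive/head register. A minimal C sketch of that split follows. The struct and function names are hypothetical, and the 0xE0 base value and the 0x10 slave bit follow the drive/head values used in this article.

```c
#include <stdint.h>

struct lba28_regs {
    uint8_t low;        /* port base+3: LBA bits 0-7   */
    uint8_t mid;        /* port base+4: LBA bits 8-15  */
    uint8_t high;       /* port base+5: LBA bits 16-23 */
    uint8_t drive_head; /* port base+6: 0xE0 | slave bit | LBA bits 24-27 */
};

/* Split a 28-bit LBA into the values written to the ATA registers. */
static struct lba28_regs lba28_split(uint32_t lba, int slave)
{
    struct lba28_regs r;
    r.low  = (uint8_t)(lba & 0xFF);
    r.mid  = (uint8_t)((lba >> 8) & 0xFF);
    r.high = (uint8_t)((lba >> 16) & 0xFF);
    r.drive_head = (uint8_t)(0xE0 | (slave ? 0x10 : 0x00)
                             | ((lba >> 24) & 0x0F));
    return r;
}
```

For LBA 0x01234567 on the master drive this gives low = 0x67, mid = 0x45, high = 0x23, and drive_head = 0xE1.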
; do a singletasking PIO ATA read
; inputs: ebx = # of sectors to read, edi -> dest buffer, esi -> driverdata struct, ebp = 4b LBA
; Note: ebp is a "relative" LBA -- the offset from the beginning of the partition
; outputs: ebp, edi incremented past read; ebx = 0
; flags: zero flag set on success, carry set on failure (redundant)
read_ata_st:
push edx
push ecx
push eax
test ebx, ebx ; # of sectors < 0 is a "reset" request from software
js short .reset
cmp ebx, 0x3fffff ; read will be bigger than 2GB? (error)
stc
jg short .r_don
mov edx, [esi + dd_prtlen] ; get the total partition length (sectors)
dec edx ; (to avoid handling "equality" case)
cmp edx, ebp ; verify ebp is legal (within partition limit)
jb short .r_don ; (carry is set automatically on an error)
cmp edx, ebx ; verify ebx is legal (forget about the ebx = edx case)
jb short .r_don
sub edx, ebx ; verify ebp + ebx - 1 is legal
inc edx
cmp edx, ebp ; (the test actually checks ebp <= edx - ebx + 1)
jb short .r_don
mov dx, [esi + dd_dcr] ; dx = alt status/DCR
in al, dx ; get the current status
test al, 0x88 ; check the BSY and DRQ bits -- both must be clear
je short .stat_ok
.reset:
call srst_ata_st
test ebx, ebx ; bypass any read on a "reset" request
jns short .stat_ok
xor ebx, ebx ; force zero flag on, carry clear
jmp short .r_don
.stat_ok:
; preferentially use the 28bit routine, because it's a little faster
; if ebp > 28bit or esi.stLBA > 28bit or stLBA+ebp > 28bit or stLBA+ebp+ebx > 28bit, use 48 bit
cmp ebp, 0xfffffff
jg short .setreg
mov eax, [esi + dd_stLBA]
cmp eax, 0xfffffff
jg short .setreg
add eax, ebp
cmp eax, 0xfffffff
jg short .setreg
add eax, ebx
cmp eax, 0xfffffff
.setreg:
mov dx, [esi + dd_tf] ; dx = IO port base ("task file")
jle short .read28 ; test the flags from the eax cmp's above
.read48:
test ebx, ebx ; no more sectors to read?
je short .r_don
call pio48_read ; read up to 256 more sectors, updating registers
je short .read48 ; if successful, is there more to read?
jmp short .r_don
.read28:
test ebx, ebx ; no more sectors to read?
je short .r_don
call pio28_read ; read up to 256 more sectors, updating registers
je short .read28 ; if successful, is there more to read?
.r_don:
pop eax
pop ecx
pop edx
ret
;ATA PIO 28bit singletasking disk read function (up to 256 sectors)
; inputs: ESI -> driverdata info, EDI -> destination buffer
; BL = sectors to read, DX = base bus I/O port (0x1F0, 0x170, ...), EBP = 28bit "relative" LBA
; BSY and DRQ ATA status bits must already be known to be clear on both slave and master
; outputs: data stored in EDI; EDI and EBP advanced, EBX decremented
; flags: on success Zero flag set, Carry clear
pio28_read:
add ebp, [esi + dd_stLBA] ; convert relative LBA to absolute LBA
mov ecx, ebp ; save a working copy
mov al, bl ; set al= sector count (0 means 256 sectors)
or dl, 2 ; dx = sectorcount port -- usually port 1f2
out dx, al
mov al, cl ; ecx currently holds LBA
inc edx ; port 1f3 -- LBAlow
out dx, al
mov al, ch
inc edx ; port 1f4 -- LBAmid
out dx, al
bswap ecx
mov al, ch ; bits 16 to 23 of LBA
inc edx ; port 1f5 -- LBAhigh
out dx, al
mov al, cl ; bits 24 to 28 of LBA
or al, byte [esi + dd_sbits] ; master/slave flag | 0xe0
inc edx ; port 1f6 -- drive select
out dx, al
inc edx ; port 1f7 -- command/status
mov al, 0x20 ; send "read" command to drive
out dx, al
; ignore the error bit for the first 4 status reads -- ie. implement 400ns delay on ERR only
; wait for BSY clear and DRQ set
mov ecx, 4
.lp1:
in al, dx ; grab a status byte
test al, 0x80 ; BSY flag set?
jne short .retry
test al, 8 ; DRQ set?
jne short .data_rdy
.retry:
dec ecx
jg short .lp1
; need to wait some more -- loop until BSY clears or ERR sets (error exit if ERR sets)
.pior_l:
in al, dx ; grab a status byte
test al, 0x80 ; BSY flag set?
jne short .pior_l ; (all other flags are meaningless if BSY is set)
test al, 0x21 ; ERR or DF set?
jne short .fail
.data_rdy:
; if BSY and ERR are clear then DRQ must be set -- go and read the data
sub dl, 7 ; read from data port (ie. 0x1f0)
mov cx, 256
rep insw ; gulp one 512b sector into edi
or dl, 7 ; "point" dx back at the status register
in al, dx ; delay 400ns to allow drive to set new values of BSY and DRQ
in al, dx
in al, dx
in al, dx
; After each DRQ data block it is mandatory to either:
; receive and ack the IRQ -- or poll the status port all over again
inc ebp ; increment the current absolute LBA
dec ebx ; decrement the "sectors to read" count
test bl, bl ; check if the low byte just turned 0 (more sectors to read?)
jne short .pior_l
sub dx, 7 ; "point" dx back at the base IO port, so it's unchanged
sub ebp, [esi + dd_stLBA] ; convert absolute lba back to relative
; "test" sets the zero flag for a "success" return -- also clears the carry flag
test al, 0x21 ; test the last status ERR bits
je short .done
.fail:
stc
.done:
ret
;ATA PIO 33bit singletasking disk read function (up to 64K sectors, using 48bit mode)
; inputs: bx = sectors to read (0 means 64K sectors), edi -> destination buffer
; esi -> driverdata info, dx = base bus I/O port (0x1F0, 0x170, ...), ebp = 32bit "relative" LBA
; BSY and DRQ ATA status bits must already be known to be clear on both slave and master
; outputs: data stored in edi; edi and ebp advanced, ebx decremented
; flags: on success Zero flag set, Carry clear
pio48_read:
xor eax, eax
add ebp, [esi + dd_stLBA] ; convert relative LBA to absolute LBA
; special case: did the addition overflow 32 bits (carry set)?
adc ah, 0 ; if so, ah = LBA byte #5 = 1
mov ecx, ebp ; save a working copy of 32 bit absolute LBA
; for speed purposes, never OUT to the same port twice in a row -- avoiding it is messy but best
;outb (0x1F2, sectorcount high)
;outb (0x1F3, LBA4)
;outb (0x1F4, LBA5) -- value = 0 or 1 only
;outb (0x1F5, LBA6) -- value = 0 always
;outb (0x1F2, sectorcount low)
;outb (0x1F3, LBA1)
;outb (0x1F4, LBA2)
;outb (0x1F5, LBA3)
bswap ecx ; make LBA4 and LBA3 easy to access (cl, ch)
or dl, 2 ; dx = sectorcount port -- usually port 1f2
mov al, bh ; sectorcount -- high byte
out dx, al
mov al, cl
inc edx
out dx, al ; LBA4 = LBAlow, high byte (1f3)
inc edx
mov al, ah ; LBA5 was calculated above
out dx, al ; LBA5 = LBAmid, high byte (1f4)
inc edx
mov al, 0 ; LBA6 is always 0 in 32 bit mode
out dx, al ; LBA6 = LBAhigh, high byte (1f5)
sub dl, 3
mov al, bl ; sectorcount -- low byte (1f2)
out dx, al
mov ax, bp ; get LBA1 and LBA2 into ax
inc edx
out dx, al ; LBA1 = LBAlow, low byte (1f3)
mov al, ah ; LBA2
inc edx
out dx, al ; LBA2 = LBAmid, low byte (1f4)
mov al, ch ; LBA3
inc edx
out dx, al ; LBA3 = LBAhigh, low byte (1f5)
mov al, byte [esi + dd_sbits] ; master/slave flag | 0xe0
inc edx
and al, 0x50 ; get rid of extraneous LBA28 bits in drive selector
out dx, al ; drive select (1f6)
inc edx
mov al, 0x24 ; send "read ext" command to drive
out dx, al ; command (1f7)
; ignore the error bit for the first 4 status reads -- ie. implement 400ns delay on ERR only
; wait for BSY clear and DRQ set
mov ecx, 4
.lp1:
in al, dx ; grab a status byte
test al, 0x80 ; BSY flag set?
jne short .retry
test al, 8 ; DRQ set?
jne short .data_rdy
.retry:
dec ecx
jg short .lp1
; need to wait some more -- loop until BSY clears or ERR sets (error exit if ERR sets)
.pior_l:
in al, dx ; grab a status byte
test al, 0x80 ; BSY flag set?
jne short .pior_l ; (all other flags are meaningless if BSY is set)
test al, 0x21 ; ERR or DF set?
jne short .fail
.data_rdy:
; if BSY and ERR are clear then DRQ must be set -- go and read the data
sub dl, 7 ; read from data port (ie. 0x1f0)
mov cx, 256
rep insw ; gulp one 512b sector into edi
or dl, 7 ; "point" dx back at the status register
in al, dx ; delay 400ns to allow drive to set new values of BSY and DRQ
in al, dx
in al, dx
in al, dx
; After each DRQ data block it is mandatory to either:
; receive and ack the IRQ -- or poll the status port all over again
inc ebp ; increment the current absolute LBA (overflowing is OK!)
dec ebx ; decrement the "sectors to read" count
test bx, bx ; check if "sectorcount" just decremented to 0
jne short .pior_l
sub dx, 7 ; "point" dx back at the base IO port, so it's unchanged
sub ebp, [esi + dd_stLBA] ; convert absolute lba back to relative
; this sub handles the >32bit overflow cases correctly, too
; "test" sets the zero flag for a "success" return -- also clears the carry flag
test al, 0x21 ; test the last status ERR bits
je short .done
.fail:
stc
.done:
ret
; do a singletasking PIO ata "software reset" with DCR in dx
srst_ata_st:
push eax
mov al, 4
out dx, al ; do a "software reset" on the bus
xor eax, eax
out dx, al ; reset the bus to normal operation
in al, dx ; it might take 4 tries for status bits to reset
in al, dx ; ie. do a 400ns delay
in al, dx
in al, dx
.rdylp:
in al, dx
and al, 0xc0 ; check BSY and RDY
cmp al, 0x40 ; want BSY clear and RDY set
jne short .rdylp
pop eax
ret
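The register write order used by pio48_read (high bytes first, then low bytes, then drive select and the command) can be easier to follow as data. The C sketch below builds that sequence as a list of port/value pairs for a full 48-bit LBA. It is an illustration only: the struct and function names are hypothetical, no hardware is touched, and the drive select values 0x40 (master) and 0x50 (slave) are taken from the masking done in the listing above.

```c
#include <stddef.h>
#include <stdint.h>

struct port_write {
    uint16_t port;
    uint8_t  value;
};

static struct port_write pw(uint16_t port, uint64_t value)
{
    struct port_write w = { port, (uint8_t)(value & 0xFF) };
    return w;
}

/* Fill out[0..9] with the writes for a 48-bit PIO read; returns 10. */
static size_t lba48_read_sequence(uint16_t base, uint64_t lba, uint16_t count,
                                  int slave, struct port_write out[10])
{
    size_t n = 0;
    out[n++] = pw(base + 2, count >> 8);   /* sector count, high byte */
    out[n++] = pw(base + 3, lba >> 24);    /* LBA4 */
    out[n++] = pw(base + 4, lba >> 32);    /* LBA5 */
    out[n++] = pw(base + 5, lba >> 40);    /* LBA6 */
    out[n++] = pw(base + 2, count);        /* sector count, low byte */
    out[n++] = pw(base + 3, lba);          /* LBA1 */
    out[n++] = pw(base + 4, lba >> 8);     /* LBA2 */
    out[n++] = pw(base + 5, lba >> 16);    /* LBA3 */
    out[n++] = pw(base + 6, slave ? 0x50 : 0x40); /* drive select */
    out[n++] = pw(base + 7, 0x24);         /* READ SECTORS EXT */
    return n;
}
```

Each of the ports 0x1F2 to 0x1F5 is written twice because the drive keeps the previous value of each register as the high byte and the most recent value as the low byte.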
[BITS 16]
[ORG 0x7C00]
start:
cli
xor ax, ax
mov ds, ax ; DS = 0 so labels resolve correctly with ORG 0x7C00
mov es, ax ; ES = 0: rep insw stores to ES:DI
mov ss, ax
mov sp, 0x7C00
; Load second stage bootloader (assuming it's located at LBA 1)
mov ebx, 1 ; Number of sectors to read
mov edi, 0x9000 ; Destination buffer (0x0000:0x9000)
mov esi, drive_data ; Driver data struct
mov ebp, 1 ; LBA (logical block address) of the second stage
call read_ata_st
; Jump to the second stage bootloader
jmp 0x0000:0x9000
; Field offsets within the driver data struct, as used by read_ata_st
dd_tf equ 0
dd_dcr equ 2
dd_sbits equ 4
dd_prtlen equ 5
dd_stLBA equ 9
drive_data:
dw 0x1F0 ; dd_tf: Task file IO port base
dw 0x3F6 ; dd_dcr: Alternate status/DCR
db 0xE0 ; dd_sbits: Master drive
dd 0x10000 ; dd_prtlen: Partition length (adjust as needed)
dd 0 ; dd_stLBA: Starting LBA
; Include the read_ata_st function before the padding so it is part of the boot sector.
; The combined code must fit in 510 bytes, otherwise the TIMES count below goes negative.
%include "read_ata.inc"
times 510-($-$$) db 0
dw 0xAA55
[BITS 16]
[ORG 0x9000]
start:
xor ax, ax
mov ds, ax ; DS = 0 so hello_msg resolves correctly with ORG 0x9000
; Print a message
mov si, hello_msg
call print_string
; Further code to load the kernel or additional tasks
; Hang the system
cli
hlt
print_string:
.loop:
lodsb
or al, al
jz .done
mov ah, 0x0E
int 0x10
jmp .loop
.done:
ret
hello_msg db "Second stage bootloader loaded successfully!", 0
times 512-($-$$) db 0
[BITS 16]
[ORG 0x7C00]
start:
cli ; Clear interrupts
xor ax, ax
mov ds, ax
mov es, ax ; ES = 0, so ES:BX below is a flat address
mov ss, ax
mov sp, 0x7C00 ; Set up stack
; Load second stage bootloader from LBA 1
mov bx, 0x8000 ; Buffer address in memory (0x0000:0x8000)
call read_sectors
; Jump to the second stage bootloader
jmp 0x0000:0x8000
read_sectors:
pusha
mov si, 1 ; LBA (logical block addressing) sector to read, low 16 bits
; Wait for drive to be ready (BSY clear)
mov dx, 0x1F7 ; Status port of the primary ATA channel
.wait_bsy:
in al, dx
test al, 0x80
jnz .wait_bsy
; Send commands to read sectors
mov dx, 0x1F6
mov al, 0xE0 ; LBA mode, master drive, LBA bits 24-27 = 0
out dx, al
mov dx, 0x1F2
mov al, 1 ; Sector count
out dx, al
mov ax, si
mov dx, 0x1F3
out dx, al ; LBA low byte
mov al, ah
mov dx, 0x1F4
out dx, al ; LBA mid byte
xor al, al
mov dx, 0x1F5
out dx, al ; LBA high byte
mov dx, 0x1F7
mov al, 0x20 ; Read command
out dx, al
; Wait for data to be ready (BSY clear and DRQ set)
.wait_data:
in al, dx
test al, 0x80
jnz .wait_data
test al, 0x08
jz .wait_data
; Read data from disk
mov dx, 0x1F0 ; Data port
mov cx, 256 ; 512 bytes = 256 words
.read_loop:
in ax, dx ; Read word from disk
mov [es:bx], ax ; Store word in memory
add bx, 2
loop .read_loop
popa
ret
times 510-($-$$) db 0 ; Pad the rest of the sector with zeros
dw 0xAA55 ; Boot signature
[BITS 16]
[ORG 0x8000]
start:
xor ax, ax ; Set up the segment registers (0, to match ORG 0x8000)
mov ds, ax
mov es, ax
mov si, msg ; Load the address of the message string
call print_str ; Call the function to print the string
jmp $ ; Infinite loop
print_char:
; Print character in AL
mov ah, 0x0E ; BIOS teletype function
mov bh, 0x00 ; Page number (0 for mode 3)
mov bl, 0x07 ; Text attribute (0x07 is white on black)
int 0x10 ; Call BIOS interrupt
ret
print_str:
; Print null-terminated string at DS:SI
.next_char:
lodsb ; Load next byte of string into AL
test al, al ; Check for null terminator
jz .done ; If null terminator, we're done
call print_char
jmp .next_char
.done:
ret
msg db 'Hello, Second Stage Bootloader!', 0
times 512-($-$$) db 0 ; Pad the rest of the sector with zeros
nasm -f bin -o bootloader.bin bootloader.asm
nasm -f bin -o second_stage.bin second_stage.asm
dd if=bootloader.bin of=/dev/sdX bs=512 count=1
dd if=second_stage.bin of=/dev/sdX bs=512 seek=1
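Before writing an image with dd, it can be worth checking that the first-stage binary really is a valid boot sector. The boot sector listings in this article end with dw 0xAA55, which NASM stores little-endian as the bytes 0x55, 0xAA at offsets 510 and 511. A small C sketch of that check follows. The function name is hypothetical, and it operates on a buffer that is assumed to already hold the file contents.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the image is at least one sector long and ends its first
 * sector with the 0x55, 0xAA boot signature. */
static bool has_boot_signature(const uint8_t *image, size_t len)
{
    if (len < 512)
        return false;
    return image[510] == 0x55 && image[511] == 0xAA;
}
```

A first-stage binary that fails this check would typically not be treated as bootable by the BIOS, so it is better to catch the problem before overwriting a disk.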
;; READ KERNEL INTO MEMORY SECOND
mov bl, 9 ; Will be reading 10 sectors
mov di, 2000h ; Memory address to read sectors into (0000h:2000h)
mov dx, 1F6h ; Head & drive # port
mov al, [drive_num] ; Drive # - hard disk 1
and al, 0Fh ; Head # (low nibble)
or al, 0A0h ; Default high nibble to the master drive (0A0h); the slave drive would be 0B0h (1011b)
out dx, al ; Send head/drive #
mov dx, 1F2h ; Sector count port
mov al, 0Ah ; # of sectors to read
out dx, al
mov dx, 1F3h ; Sector # port
mov al, 2 ; Sector to start reading at (CHS sectors are 1-based; sector 1 is the boot sector)
out dx, al
mov dx, 1F4h ; Cylinder low port
xor al, al ; Cylinder low #
out dx, al
mov dx, 1F5h ; Cylinder high port
xor al, al ; Cylinder high #
out dx, al
mov dx, 1F7h ; Command port (writing port 1F7h)
mov al, 20h ; Read with retry
out dx, al
;; Poll status port after reading 1 sector
kernel_loop:
in al, dx ; Status register (reading port 1F7h)
test al, 8 ; Sector buffer requires servicing
je kernel_loop ; Keep trying until sector buffer is ready
mov cx, 256 ; # of words to read for 1 sector
mov dx, 1F0h ; Data port, reading
rep insw ; Read CX words from port DX into memory at ES:DI
;; 400ns delay - Read alternate status register
mov dx, 3F6h
in al, dx
in al, dx
in al, dx
in al, dx
cmp bl, 0
je jump_to_kernel
dec bl
mov dx, 1F7h
jmp kernel_loop
jump_to_kernel: