
The kprintf function is a kernel-level implementation of a formatted printing function, similar to the standard printf in user-space C programming.

1 Purpose

  • Debugging: Kernel developers use kprintf to print debug information directly to the console or log file. This is essential because debugging tools available in user space are not typically available in kernel space.
  • Logging: It helps in logging the status and errors of various kernel operations, which can be reviewed later to diagnose problems.
  • Communication: Provides a way for the kernel to communicate important messages to developers and administrators.

2 Key Components

To implement kprintf, we need to handle:

  • Variable arguments.
  • Format specifiers.
  • Output to a console or serial port.

How kprintf Works

The functionality of kprintf closely mirrors that of printf, but it is tailored to operate within the kernel environment. Here’s a step-by-step breakdown of its operation:

  1. Formatting Strings: kprintf formats strings based on a format specifier. It parses the format string, identifies placeholders, and substitutes them with the provided arguments.
  2. Variable Arguments Handling: It uses the va_list type and the associated macros (va_start, va_arg, and va_end), normally declared in <stdarg.h>, to manage a variable number of arguments.
  3. Output Mechanism: Depending on the OS architecture, kprintf can direct its output to various destinations:
    1. Console: Directly writes to the system console.
    2. Log Buffers: Stores messages in a circular buffer in memory, accessible to user-space tools.
    3. Serial Ports: Sends output to serial ports, particularly useful for early-stage boot debugging and embedded systems.
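
The log-buffer destination can be illustrated with a small sketch. This is only an illustration under stated assumptions: the names klog_putc, klog_puts, klog_read and KLOG_SIZE are hypothetical (they do not come from this article's kernel), the code assumes single-threaded use, and it overwrites the oldest bytes once the buffer is full.

```c
#include <stddef.h>

#define KLOG_SIZE 8  /* Deliberately tiny so wrap-around is easy to observe */

static char   klog_data[KLOG_SIZE];
static size_t klog_head;   /* Index where the next byte will be written */
static size_t klog_count;  /* Number of valid bytes currently stored */

/* Append one character, overwriting the oldest byte once the buffer is full. */
static void klog_putc(char c) {
    klog_data[klog_head] = c;
    klog_head = (klog_head + 1) % KLOG_SIZE;
    if (klog_count < KLOG_SIZE) {
        klog_count++;
    }
}

/* Append a NUL-terminated string. */
static void klog_puts(const char *s) {
    while (*s) {
        klog_putc(*s++);
    }
}

/* Copy the stored bytes, oldest first, into out and NUL-terminate it.
   Returns the number of characters copied (at most out_size - 1). */
static size_t klog_read(char *out, size_t out_size) {
    size_t start = (klog_head + KLOG_SIZE - klog_count) % KLOG_SIZE;
    size_t n = 0;
    if (out_size == 0) {
        return 0;
    }
    while (n < klog_count && n + 1 < out_size) {
        out[n] = klog_data[(start + n) % KLOG_SIZE];
        n++;
    }
    out[n] = '\0';
    return n;
}
```

With an 8-byte buffer, logging "boot ok" followed by "!!" leaves "oot ok!!" in the buffer: the oldest character has been overwritten. A real kernel would use a much larger buffer and protect it against concurrent writers.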

3 Variadic function in C/C++

Variadic functions are functions that accept a variable number of arguments. The classic example is printf, which takes as many arguments as its format string calls for. Variadic functions are particularly useful for creating flexible APIs and libraries that need to handle a varying amount of input.

For an in-depth explanation of variadic functions, see: https://thejat.in/learn/cpp-ellipsis-in-cpp

  • <stdarg.h> Library: The C standard header <stdarg.h> provides what is needed to handle variadic functions: the va_list type and the va_start, va_arg, and va_end macros.

A variadic function definition looks like this:

void kprintf(char* format, ...){
	// ...
}

Now suppose we call this function as kprintf("Integer: %d, Char: %c, Float: %f", 42, 'J', 1.23);

How do we access the passed arguments inside the kprintf function?

The answer is the va_list type together with three “special macros” in stdarg.h: va_start, va_arg, and va_end. The following example uses all of them:

#include <stdio.h>
#include <stdarg.h>

// Variadic function to calculate the sum of integers
int sum(int count, ...) {
    va_list args;
    va_start(args, count); // Initialize the argument list; args now refers to the first variadic argument (1 in this call)

    int total = 0;
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int); // Read the current argument as an int, then advance args to the next one
    }
    }

    va_end(args); // Clean up the argument list
    return total;
}

int main() {
    int result = sum(4, 1, 2, 3, 4);
    printf("Sum: %d\n", result); // Outputs: Sum: 10
    return 0;
}

1 va_list:

It is a type that holds the state needed to step through the variadic arguments; conceptually, it is a pointer to the current argument.

Before you can access the variadic arguments, you need to declare a va_list variable.

#include <stdarg.h>

void exampleFunction(const char *format, ...) {
    va_list args;  // Declare a va_list variable
    // ...
}

2 va_start:

This macro initializes a va_list variable. It must be called before you can use va_list to access the arguments.

After initialization, the va_list variable refers to the first variadic argument passed to the function.

Parameters:

  • The first parameter is the va_list variable to be initialized.
  • The second parameter is the name of the last fixed parameter before the ellipsis (...).
void exampleFunction(const char *format, ...) {
    va_list args;
    va_start(args, format);  // Initialize args to point to the first variadic argument
    // ...
    va_end(args);  // Clean up when done
}

3 va_arg:

This macro retrieves the next argument in the parameter list of a variadic function. The type given to it must match the type of the argument that was actually passed, after promotion.

Parameters:

  • The first parameter is the va_list variable.
  • The second parameter is the type of the argument to be retrieved.
void exampleFunction(const char *format, ...) {
    va_list args;
    va_start(args, format);

    int nextArg = va_arg(args, int);  // Retrieve the next argument as an int and advance args to the following one

    va_end(args);  // Clean up when done
}
  • Each call to va_arg updates args to point to the next argument in the list.
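
One detail matters when choosing the type passed to va_arg: arguments passed through the ellipsis undergo the default argument promotions, so a char or short arrives as an int and a float arrives as a double. The sketch below shows this with the standard <stdarg.h> on a hosted compiler. The function sum_mixed and its one-letter "kinds" string are hypothetical and exist only for this illustration.

```c
#include <stdarg.h>

/* Hypothetical helper: 'kinds' holds one letter per variadic argument:
   'c' = char, 'f' = float, 'i' = int. Returns the sum of all arguments. */
static double sum_mixed(const char *kinds, ...) {
    va_list args;
    double total = 0.0;

    va_start(args, kinds);
    for (; *kinds != '\0'; kinds++) {
        switch (*kinds) {
        case 'c':  /* The caller promoted the char to int, so read an int */
            total += (char)va_arg(args, int);
            break;
        case 'f':  /* The caller promoted the float to double, so read a double */
            total += va_arg(args, double);
            break;
        case 'i':
            total += va_arg(args, int);
            break;
        default:   /* Unknown letter: stop, because the argument type is unknown */
            va_end(args);
            return total;
        }
    }
    va_end(args);
    return total;
}
```

Asking va_arg for char or float directly would be undefined behavior, which is why the %c specifier later in this article reads an int and narrows it afterwards.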

4 va_end:

This macro performs cleanup for the va_list variable. It should be called before the function returns.

void exampleFunction(const char *format, ...) {
    va_list args;
    va_start(args, format);

    // Process variadic arguments using va_arg

    va_end(args);  // Clean up the va_list variable
}

3.1 Own Variadic Macros

In the absence of standard library support, we must define our own macros to manage variable arguments. Because this is bare-metal kernel development with no C library (libc) available yet, and the Makefile in this article builds with -nostdinc, we create our own va_list, va_start, va_arg, and va_end.

These macros are specific to 32-bit x86 with the cdecl calling convention.

  • Under cdecl, arguments are pushed onto the stack in reverse order (right to left), so the first variadic argument sits at the address immediately after the last named parameter.
  • We can therefore build our own variadic argument handling by walking a pointer across the arguments on the stack.

1 Define va_list:

typedef char* va_list;

2 Define va_start:

  • va_start initializes va_list to point to the first argument after the named parameter.
#define va_start(ap, last) (ap = (va_list)(&last + 1))

3 Define va_arg:

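  • va_arg retrieves the next argument from the stack. It needs the type of the argument to know how far to move the pointer. Note that this simple version advances by sizeof(type), which is only correct for types whose size is a multiple of the 4-byte stack slot; the _arg_stack_size macro in the next section rounds the size up to handle smaller types.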
#define va_arg(ap, type) (*(type*)((ap += sizeof(type)) - sizeof(type)))

4 Define va_end:

  • va_end is a no-op in many implementations; here it simply resets the pointer to null.
#define va_end(ap) (ap = (va_list)0)

Custom Macros for Handling Variable Arguments

1 va_list:

#define va_list char *
  • va_list is defined as a char*, which will be used to traverse the stack where arguments are stored.

2 _arg_stack_size(type):

#define _arg_stack_size(type) (((sizeof(type)-1)/sizeof(int)+1)*sizeof(int))
  • This macro rounds the size of the type up to the next multiple of sizeof(int), because every argument on the 32-bit x86 stack occupies a whole number of 4-byte slots.
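
To make the rounding concrete, here is the arithmetic for a few types, written as a small table that can be compiled on a hosted system. It assumes sizeof(int) == 4, sizeof(short) == 2 and sizeof(double) == 8, as on 32-bit x86; the array name slot_sizes is hypothetical.

```c
#include <stddef.h>

/* Same macro as above: round sizeof(type) up to a multiple of sizeof(int). */
#define _arg_stack_size(type) (((sizeof(type)-1)/sizeof(int)+1)*sizeof(int))

/* Worked values under the stated size assumptions:
     char    : ((1-1)/4 + 1) * 4 = 4
     short   : ((2-1)/4 + 1) * 4 = 4
     int     : ((4-1)/4 + 1) * 4 = 4
     double  : ((8-1)/4 + 1) * 4 = 8
     char[5] : ((5-1)/4 + 1) * 4 = 8 */
static const size_t slot_sizes[] = {
    _arg_stack_size(char),
    _arg_stack_size(short),
    _arg_stack_size(int),
    _arg_stack_size(double),
    _arg_stack_size(char[5]),
};
```

Anything up to four bytes takes one slot, and a five-byte object already needs two.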

3 va_start(ap, fmt):

#define va_start(ap, fmt) do { \
    ap = (char *)((unsigned int)&fmt + _arg_stack_size(&fmt));\
} while (0)
  • Initializes the ap (argument pointer) to point to the first variable argument on the stack. It calculates the starting point just after the fixed parameter fmt.

4 va_end(ap):

#define va_end(ap)
  • This macro is a no-op but included for completeness.

5 va_arg(ap, type):

#define va_arg(ap, type) (((type *)(ap+=_arg_stack_size(type)))[-1])
  • This macro retrieves the next argument of the specified type and advances the ap pointer.
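
The "advance first, then read the slot just passed" trick in va_arg can be tried out on a hosted system by simulating the stack with an ordinary array. This is only a sketch: the names sim_slot_size, sim_arg and sum_slots are hypothetical (chosen so they do not clash with <stdarg.h>), and it only handles int-sized arguments.

```c
/* Same rounding rule as _arg_stack_size above. */
#define sim_slot_size(type) (((sizeof(type)-1)/sizeof(int)+1)*sizeof(int))

/* Same shape as the custom va_arg: move the pointer past the slot,
   then index [-1] to read the slot that was just skipped. */
#define sim_arg(ap, type) (((type *)(ap += sim_slot_size(type)))[-1])

/* Walk 'count' int arguments stored in consecutive slots and add them up.
   'slots' stands in for the memory just after the last named parameter. */
static int sum_slots(int *slots, int count) {
    char *ap = (char *)slots;  /* Plays the role of the va_list after va_start */
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += sim_arg(ap, int);
    }
    return total;
}
```

For example, with the slots { 42, 'J', -7 } the walk visits each value once and returns 109 (since 'J' is 74 in ASCII).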

4 Format String Parsing

  • We need a way to iterate over the format string, handling both regular characters and format specifiers.
  • For each % format specifier, the parser uses va_arg to retrieve the corresponding argument.

Handling Different Specifiers:

  • %s: Retrieves a string argument and copies it to the buffer.
  • %c: Retrieves a character (promoted to an int) and adds it to the buffer.
  • %x: Retrieves an unsigned int and converts it to a hexadecimal string.
  • %d: Retrieves an int, handles negative values, and converts it to a decimal string.
  • %%: Adds a literal % character to the buffer.

Working Principle:

  • Prerequisites:
    • We would need a buffer (buf) to store the formatted string.
  • Initialization:
    • Initialize a va_list variable args using va_start, which points to the first variable argument after the format string fmt.
    • Initialize ptr to 0, which will be used as an index to write characters into the buffer buf.
  • Format String Parsing:
    • Iterate through each character in the format string fmt.
    • When encountering a % character, check the next character to identify the format specifier.
    • Handling Format Specifiers:
      • Depending on the format specifier, retrieve the corresponding argument from the variable argument list using va_arg.
      • Process the argument according to the specifier:
        • For %s, copy a string argument character by character into the buffer.
        • For %c, directly copy a character argument into the buffer.
        • For %x, convert an unsigned integer argument to a hexadecimal string using parse_hex.
        • For %d, convert an integer argument to a decimal string using parse_num.
        • For %%, directly add a % character to the buffer.
    • Regular Characters:
      • Copy regular characters from the format string directly into the buffer.
  • Buffer Termination:
    • Null-terminate the buffer after processing the entire format string.
  • Cleanup:
    • Clean up the va_list variable using va_end.
  • Output:
    • Output the formatted string stored in the buffer (e.g., to a console or serial port).
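
The steps above can be sketched as a small formatter that runs on a hosted system. This is a minimal sketch under stated assumptions, not the kernel code itself: it uses the standard <stdarg.h> instead of custom macros, writes into a caller-supplied buffer with bounds checking, and prints %x without zero padding. The names out_buf, put_char, put_uint and mini_format are hypothetical.

```c
#include <stdarg.h>
#include <stddef.h>

/* Output cursor: characters are silently dropped once the buffer is full. */
struct out_buf {
    char  *data;
    size_t size;  /* Total capacity, including the terminating NUL */
    size_t len;   /* Characters written so far */
};

static void put_char(struct out_buf *o, char c) {
    if (o->len + 1 < o->size) {
        o->data[o->len++] = c;
    }
}

/* Emit an unsigned value in the given base (2 to 16), most significant digit first. */
static void put_uint(struct out_buf *o, unsigned int value, unsigned int base) {
    if (value >= base) {
        put_uint(o, value / base, base);
    }
    put_char(o, "0123456789abcdef"[value % base]);
}

/* Format fmt into buf (NUL-terminated whenever size > 0); returns the length written. */
static size_t mini_format(char *buf, size_t size, const char *fmt, ...) {
    struct out_buf o = { buf, size, 0 };
    va_list args;

    if (size == 0) {
        return 0;
    }
    va_start(args, fmt);
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        if (fmt[i] != '%') {          /* Regular character: copy it */
            put_char(&o, fmt[i]);
            continue;
        }
        i++;                          /* Look at the character after '%' */
        if (fmt[i] == '\0') {
            break;                    /* Lone '%' at the end: stop before reading past the string */
        }
        switch (fmt[i]) {
        case 's': {
            const char *s = va_arg(args, char *);
            while (*s) {
                put_char(&o, *s++);
            }
            break;
        }
        case 'c':                     /* char arguments arrive promoted to int */
            put_char(&o, (char)va_arg(args, int));
            break;
        case 'x':
            put_uint(&o, va_arg(args, unsigned int), 16);
            break;
        case 'd': {
            int n = va_arg(args, int);
            unsigned int magnitude = (unsigned int)n;
            if (n < 0) {
                put_char(&o, '-');
                magnitude = 0u - magnitude;  /* Unsigned negation also works for INT_MIN */
            }
            put_uint(&o, magnitude, 10);
            break;
        }
        case '%':
            put_char(&o, '%');
            break;
        default:                      /* Unknown specifier: copy it literally */
            put_char(&o, fmt[i]);
            break;
        }
    }
    va_end(args);
    o.data[o.len] = '\0';
    return o.len;
}
```

Calling mini_format(out, sizeof out, "Hello, %s! The value is %d.\n", "world", 42) fills out with "Hello, world! The value is 42." followed by a newline. Passing the buffer and its size explicitly is a design choice of this sketch; it avoids overrunning the buffer when a %s argument is longer than expected.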

Example:

Suppose we have the following call to kprintf:

kprintf("Hello, %s! The value is %d.\n", "world", 42);

Step-by-Step Execution:

  1. Initialize args to point to the first argument after the format string "Hello, %s! The value is %d.\n".
  2. Iterate through each character in the format string.
  3. Copy regular characters until encountering a %.
  4. For %s, retrieve the string argument "world" and copy it into the buffer.
  5. For %d, retrieve the integer argument 42, convert it to a decimal string, and copy it into the buffer.
  6. Continue copying regular characters until reaching the end of the format string.
  7. Null-terminate the buffer.
  8. Clean up the args variable.
  9. Output the formatted string to the console or serial port.

Output:

Hello, world! The value is 42.

5 Output

  • After parsing and formatting the string, buf is null-terminated.
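  • puts(buf): Outputs the buffer to the console or serial port. Here puts is the kernel's own output routine from the earlier parts of this series, not the libc function.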

6 Example Explanation:


 

[Figures: kprintf1.jpg, kprintf2.jpg, kprintf3.jpg, kprintf4-1.jpg]

7 kprintf.c

#include <system.h>

// Define a type alias for the variable argument list
#define args_list char *

// Macro to calculate the stack size required for a type
#define _arg_stack_size(type) (((sizeof(type)-1)/sizeof(int)+1)*sizeof(int))

// Macro to initialize the variable argument list
#define args_start(ap, fmt) do { \
	ap = (char *)((unsigned int)&fmt + _arg_stack_size(&fmt));\
} while (0)

// Macro to end the variable argument list (no implementation provided)
#define args_end(ap)

// Macro to retrieve the next argument of a specified type
#define args_next(ap, type) (((type *)(ap+=_arg_stack_size(type)))[-1])

// Static buffer and pointer for formatted output
static char buf[1024] = {-1};
static int ptr = -1;

// Function to convert an unsigned integer to a string in the given base (2 to 16)
static void parse_num(unsigned int value, unsigned int base) {
	// Emit the higher-order digits first, then this digit
	if (value >= base) {
		parse_num(value / base, base);
	}
	buf[ptr++] = "0123456789abcdef"[value % base];
}

// Function to parse and convert integers to hexadecimal strings
static void parse_hex(unsigned int value) {
	int i = 8;
	while (i-- > 0) {
		buf[ptr++] = "0123456789abcdef"[(value>>(i*4))&0xF];
	}
}

// Kernel printf function with support for format specifiers
void kprintf(const char *fmt, ...) {
	int i = 0;
	char *s;
	args_list args;
	args_start(args, fmt); // Initialize the variable argument list
	ptr = 0; // Initialize the buffer pointer

	// Iterate over each character in the format string
	for (; fmt[i]; ++i) {
		// Copy regular characters to the buffer
		if ((fmt[i] != '%') && (fmt[i] != '\\')) {
			buf[ptr++] = fmt[i];
			continue;
		} else if (fmt[i] == '\\') { // Handle escape sequences
			switch (fmt[++i]) {
				case 'a': buf[ptr++] = '\a'; break;
				case 'b': buf[ptr++] = '\b'; break;
				case 't': buf[ptr++] = '\t'; break;
				case 'n': buf[ptr++] = '\n'; break;
				case 'r': buf[ptr++] = '\r'; break;
				case '\\':buf[ptr++] = '\\'; break;
				case '\0': --i; break; // Trailing backslash: let the loop stop at the terminator
			}
			continue;
		}
		/* fmt[i] == '%' */
		switch (fmt[++i]) {
			// Handle string format specifier (%s)
			case 's':
				s = (char *)args_next(args, char *);
				while (*s) {
					buf[ptr++] = *s++;
				}
				break;
			// Handle character format specifier (%c)
			case 'c':
				buf[ptr++] = (char)args_next(args, int);
				break;
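			// Handle hexadecimal format specifier (%x)
			case 'x':
				parse_hex(args_next(args, unsigned int));
				break;
			// Handle decimal format specifier (%d), including negative values
			case 'd': {
				int n = args_next(args, int);
				unsigned int magnitude = (unsigned int)n;
				if (n < 0) {
					buf[ptr++] = '-';
					magnitude = 0u - magnitude; // Unsigned negation is safe even for INT_MIN
				}
				parse_num(magnitude, 10);
				break;
			}
			// Handle percent sign format specifier (%%)
			case '%':
				buf[ptr++] = '%';
				break;
			// A lone '%' at the end of the string: step back so the loop stops at the terminator
			case '\0':
				--i;
				break;
			default:
				buf[ptr++] = fmt[i];
				break;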
		}
	}
	buf[ptr] = '\0'; // Null-terminate the buffer
	args_end(args); // Cleanup the variable argument list
	puts(buf); // Output the formatted string
}

1 Macros:

  • args_list: Defines a type alias for the variable argument list.
  • _arg_stack_size(type): Calculates the stack size required for a given type.
  • args_start(ap, fmt): Initializes the variable argument list ap to point to the first argument after the format string fmt.
  • args_end(ap): Placeholder macro for ending the variable argument list (no implementation provided).
  • args_next(ap, type): Retrieves the next argument of a specified type from the variable argument list.

2 Static Variables:

  • buf: Static buffer used for formatted output.
  • ptr: Static integer index marking the current write position in the buffer.

3 Parsing Functions:

  • parse_num: Converts an unsigned integer to its string representation in the given base.
  • parse_hex: Converts an unsigned integer to a fixed-width, eight-digit hexadecimal string.
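
Because parse_hex always walks eight nibbles, %x prints leading zeros. The hosted sketch below reproduces the same digit extraction into a local array so the result can be inspected. The name hex8 is hypothetical, and the code assumes a 32-bit unsigned int.

```c
/* Write exactly eight lowercase hex digits of value into out, followed by a NUL.
   out must have room for 9 bytes. */
static void hex8(unsigned int value, char out[9]) {
    int pos = 0;
    /* Start with the most significant nibble (bits 31..28) and work down. */
    for (int shift = 28; shift >= 0; shift -= 4) {
        out[pos++] = "0123456789abcdef"[(value >> shift) & 0xF];
    }
    out[pos] = '\0';
}
```

For the value 42 this yields "0000002a", and for 0xdeadbeef it yields "deadbeef".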

4 kprintf Function:

  • kprintf: Formats and prints a string according to the specified format.
    • Parameters: const char *fmt (format string) and variable arguments.
    • Initialize local variables:
      • i: Loop index for iterating through the format string.
      • s: Pointer for string arguments.
      • args: Variable argument list.
    • Start parsing the format string:
      • Iterate through each character in the format string.
      • Copy regular characters to the buffer.
      • Handle escape sequences (e.g., \n, \t).
      • Handle format specifiers:
        • %s: Copy string argument to the buffer.
        • %c: Copy character argument to the buffer.
        • %x: Convert unsigned integer argument to hexadecimal and copy to the buffer.
        • %d: Convert the integer argument to decimal and copy to the buffer.
        • %%: Copy a percent sign to the buffer.
    • Null-terminate the buffer after processing the entire format string.
    • Clean up the variable argument list.
    • Output the formatted string using the puts function (assumed to be provided elsewhere).

8 Modify system.h and call kprintf from the kernel main function

Add the following extern declaration of kprintf to the system.h header file.

/* kprintf */
extern void kprintf(const char *fmt, ...);

To test the newly built kprintf function, call it to print a formatted string. Add the statement below to the main() function in place of puts("TheJat!\n") from our previous example.

kprintf("THE JAAT From %s-%d", "HR", 8);

9 Modify the Makefile to compile kprintf.c

.PHONY: all clean run install

# Compiler and assembler flags
CC = gcc
CFLAGS = -Wall -m32 -fno-pie -O0 -fstrength-reduce -fomit-frame-pointer -finline-functions -nostdinc -fno-builtin -I./include
AS = nasm
ASFLAGS = -f elf

# Output files and directories
ISO_DIR = iso
BOOT_DIR = $(ISO_DIR)/boot
GRUB_DIR = $(BOOT_DIR)/grub
ISO = my_os.iso

# Source files and objects
C_SOURCES = main.c vga.c gdt.c idt.c isr.c irq.c timer.c keyboard.c kprintf.c
ASM_SOURCES = start.asm gdt_asm.asm idt_asm.asm isr_asm.asm irq_asm.asm
OBJ = $(C_SOURCES:.c=.o) $(ASM_SOURCES:.asm=.o)

# Linker script
LINKER_SCRIPT = link.ld

# Kernel binary
KERNEL = kernel.bin

all: $(KERNEL)

# Build the ISO and run it with QEMU
run: install
	qemu-system-x86_64 -cdrom $(ISO)

# Install the kernel and GRUB configuration into the ISO directory and create the ISO image
install: $(KERNEL)
	mkdir -p $(GRUB_DIR)
	cp $(KERNEL) $(BOOT_DIR)/kernel.bin
	cp grub.cfg $(GRUB_DIR)/grub.cfg
	grub-mkrescue -o $(ISO) $(ISO_DIR)

# Link the kernel binary
$(KERNEL): $(OBJ) $(LINKER_SCRIPT)
	ld -m elf_i386 -T $(LINKER_SCRIPT) -o $(KERNEL) $(OBJ)

# Compile C source files
%.o: %.c
	$(CC) $(CFLAGS) -c -o $@ $<

# Assemble assembly source files
%.o: %.asm
	$(AS) $(ASFLAGS) -o $@ $<

# Clean up build artifacts
clean:
	rm -f $(OBJ) $(KERNEL) $(ISO)
	rm -rf $(ISO_DIR)

# Dependencies
main.o: main.c
vga.o: vga.c
gdt.o: gdt.c
idt.o: idt.c
isr.o: isr.c
irq.o: irq.c
timer.o: timer.c
keyboard.o: keyboard.c
kprintf.o: kprintf.c
start.o: start.asm
gdt_asm.o: gdt_asm.asm
idt_asm.o: idt_asm.asm
irq_asm.o: irq_asm.asm
isr_asm.o: isr_asm.asm

10 Output

image-135.png