CLOSE
Updated on 02 Feb, 202613 mins read 100 views

What Is a Page in a Database?

A page is the smallest unit of data a database reads or writes to disk.

Think of it like this:

  • Disk -> very slow
  • RAM -> fast
  • DB does not read one row at a time from disk
  • DB reads a chunk -> that chunk is a page

Typical page sizes:

  • PostgreSQL: 8 KB
  • MySQL InnoDB: 16 KB
  • Orace: 8 KB (configurable)
  • SQL Server: 8 KB

Even if you request 1 row, the DB loads the entire page containing that row.

Why Pages Exist?

Disk Reality

Disk I/O is expensive

  • Seek time
  • Rotation time
  • Latency

Reading 1 byte vs 8KB:

Almost the same cost

So DBs optimize by:

Reading data in fixed-size pages

How Tables Are Stored Using Pages

A table is not a continuous file of rows.

Instead:

Table
 ├── Page 1 → rows 1–50
 ├── Page 2 → rows 51–100
 ├── Page 3 → rows 101–150

Each page contains:

  • Page header (metadata)
  • Row slots
  • Free space

Rows inside a page are packed, not necessarily ordered.

How a Database Finds a Row Without an Index

When a table has no index that can help with a query, the database has no shortcut to locate the desired row. In this case, it performs  a full table scan.

Step-by-Step Process

  1. Start from the first data page on disk
    1. The database identifies the first page that belongs to the table.
  2. Load the page into memory (buffer cache)
    1. Since databases operate on pages (not rows), the entire page is read from disk into memory.
  3.  

Example:

SELECT * FROM users WHERE id = 7;

Steps:

  1. DB finds which page contains row 7
  2. Loads that page into buffer cache
  3. Reads the row from memory
  4. Return result

How a Database Finds a Row Without an Index

When a table has no index that can help with a query, the database has no shortcut to locate the desired row. In this case, it performs a full table scan.

Step-by-Step Process

  1. Start from the first data page on disk
    The database identifies the first page that belongs to the table.
  2. Load the page into memory (buffer cache)
    Since databases operate on pages (not rows), the entire page is read from disk into memory.
  3. Scan all rows within that page
    Each row in the page is examined to check whether it satisfies the query condition.
  4. If the row is not found, move to the next page
    The database discards or keeps the page in cache and proceeds to the next page in the table.
  5. Repeat the process page by page
    1. Load page into memory
    2. Scan all rows in that page
    3. Check for a match
  6. Stop when one of the following happens
    1. The matching row is found
    2. All pages of the table have been scanned (row does not exist)

Key Characteristics of This Approach

  • The database may need to read every page of the table
  • Time complexity grows linearly with table size (O(N))
  • Performance degrades significantly as the table grows
  • This is the slowest but most reliable way to find data

Definition

Database indexing is a technique where the database creates and maintains an extra data structure to quicky locate rows without scanning the entire table.

Types of Indexes in Databases

Indexes are classified based on how data is organized, what they optimize, and how they are stored.

At a high level, there are 4 major ways to think about index types:

  1. By data structure
  2. By physical storage
  3. By number of columns
  4. By specialized use cases

1 B-Tree (B+Tree) Index

This is the default index type in almost every relational database

Used by:

  • MySQL
  • PostgreSQL
  • Oracle
  • SQL Server

How it Works:

  • Data is kept sorted
  • Tree stays balanced
  • Searches move top -> down -> leaf
           [50]
        /          \
    [10,20]      [60,80]
     /   \        /    \
   leaf leaf   leaf   leaf

Supports:

  • =
  • <, >, BETWEEN
  • ORDER BY
  • LIKE ‘abc%’

 

Buy Me A Coffee

Leave a comment

Your email address will not be published. Required fields are marked *

Your experience on this site will be improved by allowing cookies Cookie Policy