Virtual Memory: The OS Illusion

The most surprising thing about Linux virtual memory is that the addresses your program sees are never physical RAM addresses. Every pointer in user space is a virtual address that the hardware translates on each access.

Let’s see this in action. Imagine a simple C program:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    char *mem = malloc(4096); // Allocate 4KB
    if (mem == NULL) {
        perror("malloc failed");
        return 1;
    }
    printf("Allocated memory at virtual address: %p\n", (void *)mem);
    // Keep the process alive to inspect
    sleep(60);
    free(mem);
    return 0;
}

Save the code as vm_test.c, then compile and run it: gcc -o vm_test vm_test.c && ./vm_test. You’ll get output like: Allocated memory at virtual address: 0x55d7f1234000. Now, let’s inspect the system’s view of this process’s memory.

Open another terminal and run ps aux | grep vm_test. Find your process ID (PID), let’s say it’s 12345. Now, we can look at the kernel’s representation of its memory. The /proc/[PID]/maps file shows the memory regions for a process.

cat /proc/12345/maps will show output like this (illustrative; your addresses, device numbers, and inodes will differ):

55d7f0a31000-55d7f0a32000 r-xp 00000000 08:01 123456 /path/to/vm_test
55d7f0c31000-55d7f0c32000 r--p 00000000 08:01 123456 /path/to/vm_test
55d7f0c32000-55d7f0c33000 rw-p 00001000 08:01 123456 /path/to/vm_test
55d7f1234000-55d7f1255000 rw-p 00000000 00:00 0 [heap]
...

Each line lists an address range, its permissions, the file offset, the device, the inode, and the backing file or a label. Notice the [heap] line. The malloc’d address 0x55d7f1234000 falls inside that region’s range (55d7f1234000-55d7f1255000 in this example, though it will vary from run to run because of address space layout randomization). It is a virtual address. The kernel and the CPU’s Memory Management Unit (MMU) translate it to a physical address in RAM.

This translation is managed by page tables. The kernel maintains a hierarchical (multi-level) page table for each process. When the CPU accesses a virtual address, the MMU first checks its translation cache (the TLB) and, on a miss, walks these page tables. A page table entry (PTE) contains the physical page frame number where the virtual page is stored, along with permissions (read, write, execute) and other flags. If the PTE indicates the page isn’t in RAM (e.g., it’s on disk or hasn’t been allocated yet), a page fault occurs, and the kernel handles it.

The problem this solves is twofold:

Memory Isolation: Each process gets its own private virtual address space, preventing one process from accidentally (or maliciously) corrupting another’s memory.
Efficient Memory Usage: Not all memory needs to be in RAM at once. The system can use slower storage (like a hard drive or SSD) for less frequently accessed data.

The primary mechanism for moving data between RAM and slower storage is swapping. When RAM runs low, the kernel selects a "victim" page that hasn’t been used recently. If it is an anonymous page (heap or stack memory with no file behind it), the kernel writes its contents to a dedicated swap area on disk (the swap partition or swap file), marks its PTE as not present, and records the swap location in the PTE. Later, if the process tries to access that page, a page fault occurs, and the kernel reads the page back from swap into RAM. File-backed pages don’t need swap: clean ones can simply be dropped and re-read from their file later.

The size of your swap space is set by the swap partitions or swap files you enable (swapon --show lists them). How eagerly the kernel uses that space is controlled by swappiness, a kernel parameter. sysctl vm.swappiness shows the current value (default is often 60). A higher value makes the kernel more willing to swap out inactive anonymous pages rather than drop page cache, potentially freeing up RAM for file data but also increasing the chance of disk I/O on memory access. A lower value makes it favor keeping process pages in RAM and reclaiming page cache first, reducing swap activity. You can change it temporarily with sudo sysctl vm.swappiness=10 or permanently by adding vm.swappiness=10 to /etc/sysctl.conf.

The most counterintuitive part of all this is that even for memory your program has just allocated and written, the physical RAM location isn’t fixed. The kernel can move pages around in physical memory, or even swap them out and back in, without the program ever knowing. The virtual address remains constant, but the underlying physical address can change at any time. This flexibility enables techniques like copy-on-write, memory sharing between processes, and efficient handling of large memory allocations.

Once you’re comfortable with page tables and swap, the next thing you’ll likely encounter is thrashing: a state where the system spends more time swapping pages in and out than doing actual work.