BootInfo { memory_map: [MemoryRegion { range: FrameRange(0x0..0x1000), region_type: FrameZero }, MemoryRegion { range: FrameRange(0x1000..0x5000), region_type: PageTable }, MemoryRegion { range: FrameRange(0x5000..0x15000), region_type: Bootloader }, MemoryRegion { range: FrameRange(0x15000..0x16000), region_type: BootInfo }, MemoryRegion { range: FrameRange(0x16000..0x1d000), region_type: Kernel }, MemoryRegion { range: FrameRange(0x1d000..0x9f000), region_type: KernelStack }, MemoryRegion { range: FrameRange(0x9f000..0xa0000), region_type: Reserved }, MemoryRegion { range: FrameRange(0xf0000..0x100000), region_type: Reserved }, MemoryRegion { range: FrameRange(0x100000..0x27e000), region_type: KernelStack }, MemoryRegion { range: FrameRange(0x27e000..0x400000), region_type: Usable }, MemoryRegion { range: FrameRange(0x400000..0x431000), region_type: Kernel }, MemoryRegion { range: FrameRange(0x431000..0x83d000), region_type: PageTable }, MemoryRegion { range: FrameRange(0x83d000..0x7fe0000), region_type: Usable }, MemoryRegion { range: FrameRange(0x7fe0000..0x8000000), region_type: Reserved }, MemoryRegion { range: FrameRange(0xfffc0000..0x100000000), region_type: Reserved }, MemoryRegion { range: FrameRange(0xfd00000000..0x10000000000), region_type: Reserved }], physical_memory_offset: 1649267441664, tls_template: TlsTemplate { start_addr: 0, file_size: 0, mem_size: 0 }, _non_exhaustive: 0 }
Page Tables
OS in Rust
Announcements
- Action Items:
- Welcome back.
- Weird 2 week split on paging and page tables.
- No HW last week, no lab this week.
Citations
- There’s a lot of ways to cover paging theory.
- I would use the radix tree
- Theoretically already covered!
- So, Paging implementation
Intro
Recall
- Be thinking about the
mallocassignment.
Recall
- We motivated paging by comparing it with segmentation.
- We introduced the multi-level page table.
- We found out that the bootloader already set up a page table.
- We now make page table frames accessible to our kernel.
Crates for days
- To implement the approach, we will need support from the bootloader, so we’ll configure it first.
- Afterward, we will implement a function that traverses the page table hierarchy in order to translate virtual to physical addresses.
- Finally, we learn how to create new mappings in the page tables and how to find unused memory frames for creating new page tables.
Accessing Page Tables
- Accessing the page tables from our kernel is non-trivial.
- Examine the example 4-level page table hierarchy.
- Visually to follow.
Visually
Description
- Keys are virtual memory
- Levels are radices of virtual memory numerical addresses
- Values are physical addresses (also numerical)
- In non-leafs, they are addresses of other page tables.
Problem
- We can’t directly access physical addresses from our kernel
- Our kernel uses virtual addressing
- Say we want to access physical
4 KiB- It is located not (at probability \(\lim_{x\to} p = 0\)) virtual
4 KiB - So we have no existing way to dereference a pointer to a physical address
- It is located not (at probability \(\lim_{x\to} p = 0\)) virtual
Identity Mapping
- Wait why are virtual and physical addresses every different anyways.
- You know the answer to this, but for now we pretend we don’t.
- We can identity map all page table addresses
- We can do this because we, the OS, load the page tables.
Visually
Pros & Cons
- The physical addresses of page tables are also valid virtual addresses.
- Easily access the page tables of all levels starting from the CR3 register.
- Clutters the virtual address space.
- More difficult to find continuous memory regions of larger sizes.
Example
- Need
1000 KiBsized region. - Start at
28 KiBcollides with mapped page1004 KiB. - First option is
1008 KiB.
- Similar to fragmentation problem.
Another Con
- Restricts where page tables can be placed.
- Potential conflicts with both physicals and virtual addresses.
- Boo!
Example
- Say
1000 KiBsized region at virtual1008 KiB. - Can’t use any frame with a physical address between
1000 KiBand2008 KiB
Options
Fixed offset
- Instead, use a separate memory region for page table mappings.
- Just increment page table’s virtual address by the size of physical memory, perhaps.
- Or by one billion terabytes.
- Doesn’t matter the offset!
Visually
This works
- In x86_64 since the 48-bit address space is
256 TiB, a size sometimes referred to as “large”.- 48 bits
- So using e.g.
10 TiB- no problem. - Also, some clients may but I don’t have that much storage.
- So - no collisions!
Limitations
- We need to create a new mapping whenever we create a new page table.
- So, somehow also tracking what transform a collection of tables uses.
- Doesn’t allow accessing page tables of other address spaces
- Mostly useful when creating a new process.
Options
Full Map
- What about mapping the complete physical memory?
- That is, not only the page tables.
- Allows our kernel to access arbitrary physical memory, including page tables.
- Reserved virtual memory range has the same size as before
- But no longer contains unmapped pages
Visually
Hang on
- Doesn’t that… defeat the purpose?
- We no longer achieve memory isolation.
- We no longer can use “local” addresses.
- We “need” to store page tables somewhere.
Shout out x86_64
- x86_64 can address ENORMOUS pages, up to size 2 MiB.
- “ENORMOUS”
- This cuts down on overhead.
- Recall: Setting addressable sizes for
malloc.- There: words vs. bytes.
- Throwback time!
TLB
- We recall the translation lookaside buffer or TLB.
- From here
- Systems in Rust week 9, on the stack.
The MMU uses the TLB, a small, fast cache, to store recent translations.
- Bigger pages gives less TLB utilization gives faster lookup.
Options
Temporary Mapping
- With sufficiently little memory, say enough to fit in a non-radixed page table, an OS can utilize a temporary map.
Visually
Explanation
- The table controls the first 2 MiB of addresses.
- It is reachable from CR3 by following zero indexes at higher radix page tables.
- Then, index
8maps32 KiB(virtual) to32 KiB(physical) - This identity maps the page table itself.
Capabilities
- Given typical page table sizes, this allows up to \(2^9 - 1 = 511\) mappings.
- We showed two:
- By mapping the 0th entry of the level 1 table to the frame with address
24 KiB, it created a temporary mapping of the virtual page at0 KiBto the physical frame of the level 2 page table
- By mapping the 0th entry of the level 1 table to the frame with address
Capabilities
- Given typical page table sizes, this allows up to \(2^9 - 1 = 511\) mappings.
- We showed two:
- By mapping the 9th entry of the level 1 table to the frame with address
4 KiB, it created a temporary mapping of the virtual page at36 KiBto the physical frame of the level 4 page table
- By mapping the 9th entry of the level 1 table to the frame with address
Use
- Now the kernel can access the level 2 page table by writing to page
0 KiB, and… - Access the the level 4 page table by writing to page
36 KiB.
Process
- Search for a free entry in the identity-mapped level 1 table.
- Map that entry to the physical frame of the page table that we want to access.
- Access the target frame through the virtual page that maps to the entry.
- Set the entry back to unused, thereby removing the temporary mapping again.
Pros & Cons
- Reuses a mere 512 virtual pages.
- Require only 4 KiB of physical memory.
- However, cumbersome.
- A new mapping might require modifications to multiple table levels
Options
Recursion
- Folks, it’s back.
- I never thought I’d see recursion used in a recursive data structure like a radix tree.
- By the way, have any of you written more than 2-3 for loops this term?
Map oneself
- Allow a page table to map itself.
- Require no new tables, consume only a single “slot” (mapping)
- Roundabout way of reserving virtual memory.
- We follow an example.
Visually

First steps
- We begin with the original example for today, but…
- Add a
511to4 KiBmap.

First steps
- If followed, the CPU finds the same table again.
- But! The CPU assumes a change-of-levels.
- Tables have a level-agnostic layout on x86_64.

Trickery
- This shorten the number of levels that the CPU traverses.
- Following the recursive entry changes the perceived radix each time.
- So we can treat the level 2 table as a level 1 table and the level 1 table as the mapped frame.
- We can now read and write page tables because the CPU thinks that it is the mapped frame.
Recurse!
- Note the recursive “Step 1”

Recurse!!
- Or recurse twice
- Modify a higher radix page table

Recurse!!!
- Or recurse thrice
- Running out of radices pretty quick here.

Learn more
- This would’ve been a great lab but we have SSRD.
- Real ones will do recursive page table address calculation.
- Blog
Note
Left as an exercise to the interested student.
Pros & Cons
- Uses too much virtual memory (512 GiB).
- Irrelevant in practice.
- It only allows accessing the currently active address space easily.
- Wait is this a con?
- It heavily relies on the page table format of x86.
- To my knowledge, recursive data structures are usually implemented recursively…
Options
Bootloading
Cheating
- Each of these approaches requires write access to the page table and fine-grained control of physical memory.
- We can’t create these required mappings without an existing way to access the page tables.
Crates for eons
- We “need” the ““help”” of the bootloader.
- Which stole the privilege of writing page tables from us!
- The bootloader has access to the page tables, so it can create any mappings that we need.
- We use “cargo features” (I’m not linking that) to do so.
Two options
- The
map_physical_memoryfeature maps the complete physical memory somewhere into the virtual address space. - With the
recursive_page_tablefeature, the bootloader maps an entry of the level 4 page table recursively.
Maps
- My two greatest joys - key-value storage and recursion - cannot both be appeased at once.
- I use
map_physical_memoryand shed but a single (1) tear for the recursion lost.
Onward!
- Forthwith the bootloader maps the complete physical memory to some unused virtual address range.
- To communicate the virtual address range to our kernel, the bootloader passes a boot information structure.
Passes (as an argument)
- Probably a good time to check your work.
- Make your own (informed) decision about whether to e.g. initialize the IDT before or after printing this.
What I got
- Not super human readable, but not bad either.
One field
memory_map- How much physical memory is available
- Which memory regions are reserved for devices such as the VGA hardware.
- Queried from the BIOS or UEFI firmware during boot.
Two fields
memory_mapphysical_memory_offset- Virtual start address of the physical memory mapping.
- Configurable by adding a
[package.metadata.bootloader]table in Cargo.toml and setting the fieldphysical-memory-offset = "0x0000f00000000000" - Higher values are less likely to explode.
Implementation
Memory Access
- Let’s begin to implement our memory management.
- Easiest within
src/memory.rs
It Begins
- First, we simply access the page table (and return it).
- Read from
CR3 - Offset
- A billion casts.
- Read from
In Rust
- Like this:
src/memory.rs
pub unsafe fn active_level_4_table(offset: x86_64::VirtAddr)
-> &'static mut x86_64::structures::paging::PageTable
{
unsafe {
let (frame, _) = x86_64::registers::control::Cr3::read();
let phys = frame.start_address();
let virt = offset + phys.as_u64();
let ptr: *mut PageTable = virt.as_mut_ptr();
return &mut *ptr;
}
}Peep it
- We can now traverse the table within
src/main.rs
Look, a table
- I’m guessing this will be the same for you.
L4 Entry 00: PageTableEntry { addr: PhysAddr(0x2000), flags: PageTableFlags(PRESENT | WRITABLE | ACCESSED) }
L4 Entry 02: PageTableEntry { addr: PhysAddr(0x433000), flags: PageTableFlags(PRESENT | WRITABLE | ACCESSED | DIRTY) }
L4 Entry 03: PageTableEntry { addr: PhysAddr(0x43b000), flags: PageTableFlags(PRESENT | WRITABLE | ACCESSED | DIRTY) }
L4 Entry 04: PageTableEntry { addr: PhysAddr(0x63c000), flags: PageTableFlags(PRESENT | WRITABLE | ACCESSED | DIRTY) }
L4 Entry 05: PageTableEntry { addr: PhysAddr(0x83d000), flags: PageTableFlags(PRESENT | WRITABLE | ACCESSED | DIRTY) }
L4 Entry 1f: PageTableEntry { addr: PhysAddr(0x437000), flags: PageTableFlags(PRESENT | WRITABLE | ACCESSED | DIRTY) }What’s there?
- Various non-empty entries, which map different level 3 tables.
- Kernel code
- Kernel stack
- Physical memory mapping
- Boot information
Tree Traversal
- Presumably you know how to traverse a tree.
- Just have a billion casts.
- Left as an exercise but not a particularly highly recommended one.
- More fun if you do it with recursive tables.
Translation
VirtAddr -> PhysAddr
- Translate a virtual to a physical address.
- Traverse the four-level page table
- Return the mapped frame.
- Let’s create a function that performs this translation:
Translate
Sketch
- Read level 4
- Break addresses in radices.
- Descend 4->3->2->1
- Return
Read level 4
- We’ll be changing frames as we descend.
Radices
- Surprised this is idiomatic but okay.
Descend
- Get the relevant table
src/memory.rs
pub unsafe fn translate_addr(addr: x86_64::VirtAddr, offset: x86_64::VirtAddr)
-> Option<x86_64::PhysAddr>
{
let (l4, _) = x86_64::registers::control::Cr3::read();
let mut frame = l4;
let indices = [addr.p4_index(), addr.p3_index(), addr.p2_index(), addr.p1_index()];
for &index in &indices {
let virt = offset + frame.start_address().as_u64();
let ptr: *const x86_64::structures::paging::PageTable = virt.as_ptr();
let table = &*ptr;
}
return None;
}Continue Descend
- Get the relevant frame
src/memory.rs
pub unsafe fn translate_addr(addr: x86_64::VirtAddr, offset: x86_64::VirtAddr)
-> Option<x86_64::PhysAddr>
{
let (l4, _) = x86_64::registers::control::Cr3::read();
let mut frame = l4;
let indices = [addr.p4_index(), addr.p3_index(), addr.p2_index(), addr.p1_index()];
for &index in &indices {
let virt = offset + frame.start_address().as_u64();
let ptr: *const x86_64::structures::paging::PageTable = virt.as_ptr();
frame = match &(&*ptr)[index].frame() {
Ok(x) => *x,
Err(_) => return None,
};
}
return None;
}Return
- Compose the address from offset and start address.
src/memory.rs
pub unsafe fn translate_addr(addr: x86_64::VirtAddr, offset: x86_64::VirtAddr)
-> Option<x86_64::PhysAddr>
{
let (l4, _) = x86_64::registers::control::Cr3::read();
let mut frame = l4;
let indices = [addr.p4_index(), addr.p3_index(), addr.p2_index(), addr.p1_index()];
for &index in &indices {
let virt = offset + frame.start_address().as_u64();
let ptr: *const x86_64::structures::paging::PageTable = virt.as_ptr();
frame = match &(&*ptr)[index].frame() {
Ok(x) => *x,
Err(_) => return None,
};
}
return Some(frame.start_address() + u64::from(addr.page_offset()));
}It’s unsafe btw.
src/memory.rs
pub unsafe fn translate_addr(addr: x86_64::VirtAddr, offset: x86_64::VirtAddr)
-> Option<x86_64::PhysAddr>
{
unsafe {
let (l4, _) = x86_64::registers::control::Cr3::read();
let mut frame = l4;
let indices = [addr.p4_index(), addr.p3_index(), addr.p2_index(), addr.p1_index()];
for &index in &indices {
let virt = offset + frame.start_address().as_u64();
let ptr: *const x86_64::structures::paging::PageTable = virt.as_ptr();
frame = match &(&*ptr)[index].frame() {
Ok(x) => *x,
Err(_) => return None,
};
}
return Some(frame.start_address() + u64::from(addr.page_offset()));
}
}Test it
- Modify main slightly to look up some addresses.
src/main.rs
let offset = x86_64::VirtAddr::new(boot_info.physical_memory_offset);
let mapper = unsafe { osirs::memory::init(offset) };
for &address in &addresses {
let virt = x86_64::VirtAddr::new(address);
use x86_64::structures::paging::Translate;
let phys = mapper.translate(virt);
println!("{:?} -> {:?}", virt, phys);
}I got
VirtAddr(0xb8000) -> Some(PhysAddr(0xb8000))
VirtAddr(0x201008) -> Some(PhysAddr(0x401008))
VirtAddr(0x10000201a10) -> Some(PhysAddr(0x27da10))
VirtAddr(0x18000000000) -> None- Naturally have a problem with address 0.
More Cheating
Using OffsetPageTable
- Translating virtual to physical addresses is (1) common and (2) tedious.
x86_64crate provides an abstraction for it.- It supports huge pages (which caused our error) and several other page table functions apart from
translate_addr
Traits
- It uses (sigh) traits…
Mapperoperates on pages, generically over size.translate_pagetakes pages to framesmap_tocreates a new mapping in a table.
Traits
- It uses (sigh) traits…
Translateworks on multiple page sizes.
Interface only
- This is only an interface.
- Big Object Oriented is trying to get us to orient objects without orienting any objects itself.
- This is fine and I’m not mad.
x86_64provides 3 types that implement the traits, one of which is…OffsetPageTable- Otherwise, “mapped” or “recursive”.
Init
- At this point easier to just use a lifetime than work around it.
<'static>means “lasts as long as the program is running”.
Use it
- We can now use the
Translate::translate_addrmethod:
src/main.rs
let offset = x86_64::VirtAddr::new(boot_info.physical_memory_offset);
let mapper = unsafe { memory::init(phys_mem_offset) };
for &address in &addresses {
let virt = VirtAddr::new(address);
use x86_64::structures::paging::Translate;
let phys = mapper.translate_addr(virt);
println!("{:?} -> {:?}", virt, phys);
}A Small Change
- Note that we can now map the zero address.
- Cool. Neat, even.
- We could’ve solved that problem ourselves but now we don’t have to.
By using the translation function of the
MappedPageTabletype, we can spare ourselves the work of implementing huge page support. We also have access to other page functions, such asmap_to, which we will use in the next section.