---
title: Paging
---

## Announcements

- **Action Items**:
  - Welcome back.
  - You should be working on:
    - Keyboards
    - Timers

## Citations

- There's a lot of ways to cover paging theory.
- By convention, I would use the [bin packing problem](https://en.wikipedia.org/wiki/Bin_packing_problem)
- This class doesn't require the course where that's taught, so... blog it is.
- [Paging](https://os.phil-opp.com/paging-introduction/)

## Today

- Memory isolation
- Segmentation
- Virtual Memory
- Paging
- Multilevel page tables on x86_64

# Memory Isolation

## An OS Task

- The OS must isolate programs.
  - Imagine, say, separating the VGA buffer from a clock.
  - We currently do this with prolific `static mut`
- In practice:
  - Text editor vs. web browser vs. video player.

## How?

- Utilize *hardware* functionality.
- Ensure that memory areas of one process are not accessible by other processes.
  - Use physically different SSD areas
  - Recycle physical RAM
  - Use distinct "keys" in the virtual memory space.

## One Approach

- Consider the ARM Cortex-M
  - I've run into these in simulation.
  - Common embedded platform.

<a title="ZeptoBars, CC BY 3.0 <https://creativecommons.org/licenses/by/3.0>, via Wikimedia Commons" href="https://commons.wikimedia.org/wiki/File:STM32F100C4T6B-HD.jpg"><img width="960" alt="STM32F100C4T6B-HD" src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/67/STM32F100C4T6B-HD.jpg/960px-STM32F100C4T6B-HD.jpg"></a>

## MPU

- The Cortex-M has a "Memory Protection Unit"
  - Allows software to configure ~8 "regions"
  - Regions can have *distinct permissions*
    - Read/write/execute/etc.
- The MPU is an intermediary between the CPU and memory that vets memory accesses.
- This solves the memory isolation problem for the OS.

## Two Approaches

- x86 is not Cortex-M - it has two options:
  - [Segmentation](https://en.wikipedia.org/wiki/X86_memory_segmentation)
    - "In a computer system using segmentation, a reference to a memory location includes a value that identifies a segment and an offset (memory location) within that segment."

## Two Approaches

- x86 is not Cortex-M - it has two options:
  - [Paging](https://en.wikipedia.org/wiki/Virtual_memory#Paged_virtual_memory)
    - "Page tables are used to translate the virtual addresses seen by the application into physical addresses used by the hardware to process instructions..."

# Segmentation

## It's Old

- Segmentation was already introduced in 1978.
  - Originally to increase the amount of addressable memory.
  - The situation back then was that CPUs only used 16-bit addresses
    - This limited the amount of addressable memory to 64KiB.

## Outcomes

- To make more than these 64KiB accessible, additional segment registers were added
  - Each holds an offset.
  - The CPU automatically added this offset on each memory access
  - Up to 1MiB of memory was accessible.

## Segment Registers

- We've seen these before.
  - Recall we used the code segment register to [handle exceptions](70_except.qmd#/cs-code-segment).
- CS - Code Segment
- SS - Stack Segment
- DS - Data Segment
- ES/FS/GS - Extra, fxtra, and gxtra

## Usage

- Simpler when originally implemented (e.g. before ES/FS/GS existed).
  - Segment registers directly contained the offset
  - No access control was performed

## Protected Mode

- Enter [*protected mode*](https://en.wikipedia.org/wiki/X86_memory_segmentation#Protected_mode).
  - Rather than offsets, segment registers hold an index into a global descriptor table.
  - Recall we used the global descriptor table to [handle double faults](70_except.qmd#/gdt).
- What is protected? Memory.
- Who is the protector? The OS.

## How it feels

- A process, like a user mode program, anything we have written outside of this class, *perceives* there to be an internally consistent memory.
  - No other programs write to this space (unless mediated by a file or the network).
  - Any write persists to be read.
  - The OS runs and supports OS operations without any visible memory impact.

# Virtual Memory

## What it means

- So, whenever a program uses a numerical memory address, that numerical address is a historical artifact from the pre-segmentation era.
  - In practice, it is a "bit-packed" data structure unpacked by the OS to extract segment and local addresses.
- This is why, when processing exceptions, we drew memory addresses from multiple registers (memory and segment).

## The Big Idea

- The idea behind virtual memory is to abstract the memory *addresses* away from the underlying *physical* storage device.
  - Instead of directly accessing the storage device, a translation step is performed first.
  - This happens at the hardware-software interface, via the OS and MMU jointly.

## Segment Arithmetic

- For segmentation, the translation step is to add the offset address of the active segment.
  - Imagine a program accessing a memory address.
    - Local address of `0x1234000`
    - Segment address of `0x1111000`
  - Then, the address that is really accessed is `0x2345000`.

```{python}
#| echo: true
hex(0x1234000 + 0x1111000)
```

## Namespaces

- This can be monumentally confusing.
  - The MMU and, for example, an SSD or HDD are also using numerical memory addresses.
- To differentiate the two address types:
  - Addresses before the translation are called _virtual_
  - Addresses after the translation are called _physical_

## Physical addresses

- Only visible to the OS
- Unique.
  - Same for all users/clients/programs/privilege levels

## Virtual addresses

- Translation dependent
  - Only meaningful relative to a specific underlying OS
- Non-unique.
  - Multiple distinct *virtual* addresses may correspond to the same *physical* address.
  - The same virtual address in different programs/processes may refer to distinct physical addresses.

## Visually

<img src="https://os.phil-opp.com/paging-introduction/segmentation-same-program-twice.svg" style="filter:invert(.9)">

## Textually

- Imagine the same program runs twice or more.
  - Each time uses a local address of 150 for some data.
  - No collision - same as if two programs both have their own `x` variable.
- Segments differ, so the physical addresses differ.

## Benefit

- We recall the abstraction of the *stored-program computer*
- Now, those programs:
  - Can be placed at *arbitrary* physical memory addresses
  - Always consistently refer to internal memory locations

## Fragmentation

- The differentiation between virtual and physical addresses makes segmentation really powerful.
- However, it has the problem of *fragmentation*.
- As an example, imagine that we want to run a third copy of the program we saw above.

## Visually

<img src="https://os.phil-opp.com/paging-introduction/segmentation-fragmentation.svg" style="filter:invert(.9)">

## Problem

- There is no way to map the third instance of the program into memory without overlapping.
  - But there is more than enough free memory available!
  - The problem is that we need _contiguous_ memory and can't use the small free chunks (fragments).

## One Solution

- We can "make this work"
  - Dispatch a new program.
  - If it needs more space, "smoosh" all other programs together.
  - Fit the new program into the opened space.

## Visually

<img src="https://os.phil-opp.com/paging-introduction/segmentation-fragmentation-compacted.svg" style="filter:invert(.9)">

## "Defrag"

- Even within my lifetime, standard practice was to recommend semiregular (monthly?) **manual** usage of "defragmentation" tools provided by the OS for the hard drive.
- This is the same technique, but in RAM for running programs.
- Takes longer the longer it's been between runs.
- Wastes time every time it's run.

## Downsides

- It needs to copy large amounts of memory, which decreases performance.
- It needs to be done regularly, before memory becomes too fragmented.
- This makes performance unpredictable, since programs are paused at random times and might become unresponsive.
  - What if a timer goes off for a visible application while defrag is running?

## Sunset

- Fragmentation has led to segmentation no longer being used by modern systems.
  - Segmentation is not even supported in 64-bit mode on x86 anymore.
  - Instead, _paging_ is used, which sidesteps the fragmentation problem.

# Paging

## The Idea

- Divide both the virtual and physical memory space into small, fixed-size blocks.
  - The blocks of the virtual memory space are termed _pages_.
  - The blocks of the physical address space are termed _frames_.
- Each page can be individually mapped to *any* frame.

## Example

- We can now split larger memory regions across non-contiguous physical frames.
- The advantage becomes visible if we revisit the example of the fragmented memory space, but use paging instead of segmentation this time.
  - Easy to imagine each memory chunk mapping onto multiple pages/frames.

## Visually

<img src="https://os.phil-opp.com/paging-introduction/paging-fragmentation.svg" style="filter:invert(.9)">

## Description

- Take a page size of 50 bytes.
  - Each program needs three pages.
- Each page is mapped to a frame individually.
  - A *contiguous* virtual memory region can be mapped to *non-contiguous* physical frames.
- Now we may start the third instance of the program without defragmentation.

## Hidden Fragmentation

- Compared to segmentation, paging uses lots of small, fixed-size memory regions instead of a few large, variable-sized regions.
  - And by variable-sized, we mean correctly-sized.
- Since every frame has the same size, there are no frames that are too small to be used, so no fragmentation occurs.
  - Right?

## Internal Fragmentation

- *Internal* fragmentation occurs because not every memory region is an exact multiple of the page size.
  - Imagine a program of size 101 with pages of size 50.
  - It would still need three pages of size 50.
  - So you squander 49 bytes.
- The fragmentation seen with segmentation is termed _external fragmentation_.

## Pros and Cons

- Internal fragmentation is unfortunate, but often better than the external fragmentation that occurs with segmentation.
- It still wastes memory, but
  - Does not require defragmentation and
  - Makes the amount of fragmentation predictable
    - Half a page per program on average

# Page Tables

## Mapping

- Potentially *millions* of pages are individually mapped to a frame.
- This mapping information needs to be stored somewhere.
  - We recall the [`malloc`](22_malloc.qmd) assignment.
- Vs. a segment register for each program...
  - More pages than segment registers.

## Alternative

- As with, say, the IDT, the GDT, etc., we extend the capabilities of the OS to handle a hardware limitation.
- Paging uses a table structure called a *page table* to store the mapping information.
- We can visualize.

## Visually

<img src="https://os.phil-opp.com/paging-introduction/paging-page-tables.svg" style="filter:invert(.9)">

## Practically

- Each program instance has its own page table.
- A pointer to the currently active table is stored in a special CPU register.
- On `x86`, this register is called `CR3`
  - "Control Register 3"
  - I noted there are many CRs, only some of which are used.
- The OS loads the correct page table into this register *before* running a program.

## OS & Hardware

- On each memory access, the CPU reads the table pointer from the register and looks up the mapped frame for the accessed page in the table.
  - Just like an exception!
  - A hardware-only process.
- Depending on the architecture, page table entries can also store attributes such as access permissions in a flags field.

## Limitations

- Simple page tables have a problem in larger address spaces: waste.
  - Imagine a program that uses the four virtual pages `0`, `1_000_000`, `1_000_050`, and `1_000_100`
- It only needs 4 physical frames, but the page table has over a million entries.
  - We can't omit the empty entries or we lose constant-time access.

## Visually

<img src="https://os.phil-opp.com/paging-introduction/single-level-page-table.svg" style="filter:invert(.9)">

## A solution

- To reduce the wasted memory, we can use a **two-level page table**.
  - The idea is that we use different page tables for different address regions.
  - An additional table called the _level 2_ page table contains the mapping between address regions and (level 1) page tables.
  - One-indexed 😬

## Visually

<img src="https://os.phil-opp.com/paging-introduction/multilevel-page-table.svg" style="filter:invert(.9)">

## It's just a radix tree

- You probably know about these.

<img src="https://upload.wikimedia.org/wikipedia/commons/6/65/Radix_tree.svg" style="filter:invert(.9)">

# FIN?

- I think that is probably enough for today.
- If not, we will start [here](https://os.phil-opp.com/paging-introduction/#paging-on-x86-64).

<!--
# TODO

## Paging on x86_64

The x86_64 architecture uses a 4-level page table and a page size of 4 KiB. Each page table, independent of the level, has a fixed size of 512 entries. Each entry has a size of 8 bytes, so each table is 512 * 8 B = 4 KiB large and thus fits exactly into one page.

The page table index for each level is derived directly from the virtual address:

We see that each table index consists of 9 bits, which makes sense because each table has 2^9 = 512 entries. The lowest 12 bits are the offset in the 4 KiB page (2^12 bytes = 4 KiB).
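As a sanity check, these indices can be peeled off with a few shifts and masks. A quick Python sketch (the function name is ours):

```python
def split_virtual_address(vaddr: int):
    """Split a 48-bit virtual address into its four 9-bit table indices
    and the 12-bit page offset (x86_64 4-level paging, 4 KiB pages)."""
    offset = vaddr & 0xFFF      # bits 0-11: offset within the 4 KiB page
    l1 = (vaddr >> 12) & 0x1FF  # bits 12-20: level 1 index
    l2 = (vaddr >> 21) & 0x1FF  # bits 21-29: level 2 index
    l3 = (vaddr >> 30) & 0x1FF  # bits 30-38: level 3 index
    l4 = (vaddr >> 39) & 0x1FF  # bits 39-47: level 4 index
    return l4, l3, l2, l1, offset

# The example address used below:
print(split_virtual_address(0x803FE7F5CE))  # → (1, 0, 511, 127, 1486)
```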
Bits 48 to 63 are discarded, which means that x86_64 is not really 64-bit since it only supports 48-bit addresses.

Even though bits 48 to 63 are discarded, they can't be set to arbitrary values. Instead, all bits in this range have to be copies of bit 47 in order to keep addresses unique and allow future extensions like the 5-level page table. This is called _sign-extension_ because it's very similar to the [sign extension in two's complement]. When an address is not correctly sign-extended, the CPU throws an exception.

[sign extension in two's complement]: https://en.wikipedia.org/wiki/Two's_complement#Sign_extension

It's worth noting that the recent "Ice Lake" Intel CPUs optionally support [5-level page tables] to extend virtual addresses from 48-bit to 57-bit. Given that optimizing our kernel for a specific CPU does not make sense at this stage, we will only work with standard 4-level page tables in this post.

[5-level page tables]: https://en.wikipedia.org/wiki/Intel_5-level_paging

### Example Translation

Let's go through an example to understand how the translation process works in detail:

The physical address of the currently active level 4 page table, which is the root of the 4-level page table, is stored in the `CR3` register. Each page table entry then points to the physical frame of the next level table. The entry of the level 1 table then points to the mapped frame. Note that all addresses in the page tables are physical instead of virtual, because otherwise the CPU would need to translate those addresses too (which could cause a never-ending recursion).

The above page table hierarchy maps two pages (in blue). From the page table indices, we can deduce that the virtual addresses of these two pages are `0x803FE7F000` and `0x803FE00000`. Let's see what happens when the program tries to read from address `0x803FE7F5CE`.
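This lookup can be mimicked with a toy model before stepping through it by hand. The table addresses (level 3 table at 16 KiB, level 2 at 24 KiB, mapped frame at 12 KiB) follow the example walk; the `CR3` value and the level 1 table's address are invented for the sketch:

```python
CR3 = 0x1000  # toy: pretend the level 4 table sits at 4 KiB

# memory[table_address][index] -> physical address of next-level table (or final frame)
memory = {
    0x1000: {1: 16 * 1024},       # level 4: index 1 -> level 3 table at 16 KiB
    16 * 1024: {0: 24 * 1024},    # level 3: index 0 -> level 2 table at 24 KiB
    24 * 1024: {511: 32 * 1024},  # level 2: index 511 -> level 1 table (address invented)
    32 * 1024: {127: 12 * 1024},  # level 1: index 127 -> frame at 12 KiB
}

def translate(vaddr: int) -> int:
    table = CR3
    # walk levels 4..1, consuming 9 bits per level from the top
    for shift in (39, 30, 21, 12):
        index = (vaddr >> shift) & 0x1FF
        table = memory[table][index]  # a missing entry would be a page fault, in reality
    return table + (vaddr & 0xFFF)    # frame base + page offset

print(hex(translate(0x803FE7F5CE)))   # → 0x35ce
```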
First, we convert the address to binary and determine the page table indices and the page offset for the address:

With these indices, we can now walk the page table hierarchy to determine the mapped frame for the address:

- We start by reading the address of the level 4 table out of the `CR3` register.
- The level 4 index is 1, so we look at the entry with index 1 of that table, which tells us that the level 3 table is stored at address 16 KiB.
- We load the level 3 table from that address and look at the entry with index 0, which points us to the level 2 table at 24 KiB.
- The level 2 index is 511, so we look at the last entry of that page to find out the address of the level 1 table.
- Through the entry with index 127 of the level 1 table, we finally find out that the page is mapped to frame 12 KiB, or 0x3000 in hexadecimal.
- The final step is to add the page offset to the frame address to get the physical address 0x3000 + 0x5ce = 0x35ce.

The permissions for the page in the level 1 table are `r`, which means read-only. The hardware enforces these permissions and would throw an exception if we tried to write to that page. Permissions in higher level pages restrict the possible permissions in lower levels, so if we set the level 3 entry to read-only, no pages that use this entry can be writable, even if lower levels specify read/write permissions.

It's important to note that even though this example used only a single instance of each table, there are typically multiple instances of each level in each address space. At maximum, there are:

- one level 4 table,
- 512 level 3 tables (because the level 4 table has 512 entries),
- 512 * 512 level 2 tables (because each of the 512 level 3 tables has 512 entries), and
- 512 * 512 * 512 level 1 tables (512 entries for each level 2 table).

### Page Table Format

Page tables on the x86_64 architecture are basically an array of 512 entries. In Rust syntax:

```rust
#[repr(align(4096))]
pub struct PageTable {
    entries: [PageTableEntry; 512],
}
```

As indicated by the `repr` attribute, page tables need to be page-aligned, i.e., aligned on a 4 KiB boundary. This requirement guarantees that a page table always fills a complete page and allows an optimization that makes entries very compact.

Each entry is 8 bytes (64 bits) large and has the following format:

Bit(s) | Name | Meaning
------ | ---- | -------
0 | present | the page is currently in memory
1 | writable | it's allowed to write to this page
2 | user accessible | if not set, only kernel mode code can access this page
3 | write-through caching | writes go directly to memory
4 | disable cache | no cache is used for this page
5 | accessed | the CPU sets this bit when this page is used
6 | dirty | the CPU sets this bit when a write to this page occurs
7 | huge page/null | must be 0 in P1 and P4, creates a 1 GiB page in P3, creates a 2 MiB page in P2
8 | global | page isn't flushed from caches on address space switch (PGE bit of CR4 register must be set)
9-11 | available | can be used freely by the OS
12-51 | physical address | the page-aligned 52-bit physical address of the frame or the next page table
52-62 | available | can be used freely by the OS
63 | no execute | forbid executing code on this page (the NXE bit in the EFER register must be set)

We see that only bits 12–51 are used to store the physical frame address. The remaining bits are used as flags or can be freely used by the operating system. This is possible because we always point to a 4096-byte aligned address, either to a page-aligned page table or to the start of a mapped frame. This means that bits 0–11 are always zero, so there is no reason to store these bits because the hardware can just set them to zero before using the address. The same is true for bits 52–63, because the x86_64 architecture only supports 52-bit physical addresses (similar to how it only supports 48-bit virtual addresses).

Let's take a closer look at the available flags:

- The `present` flag differentiates mapped pages from unmapped ones. It can be used to temporarily swap out pages to disk when the main memory becomes full. When the page is accessed subsequently, a special exception called _page fault_ occurs, to which the operating system can react by reloading the missing page from disk and then continuing the program.
- The `writable` and `no execute` flags control whether the contents of the page are writable or contain executable instructions, respectively.
- The `accessed` and `dirty` flags are automatically set by the CPU when a read or write to the page occurs. This information can be leveraged by the operating system, e.g., to decide which pages to swap out or whether the page contents have been modified since the last save to disk.
- The `write-through caching` and `disable cache` flags allow the control of caches for every page individually.
- The `user accessible` flag makes a page available to userspace code, otherwise, it is only accessible when the CPU is in kernel mode. This feature can be used to make [system calls] faster by keeping the kernel mapped while a userspace program is running. However, the [Spectre] vulnerability can allow userspace programs to read these pages nonetheless.
- The `global` flag signals to the hardware that a page is available in all address spaces and thus does not need to be removed from the translation cache (see the section about the TLB below) on address space switches. This flag is commonly used together with a cleared `user accessible` flag to map the kernel code to all address spaces.
- The `huge page` flag allows the creation of pages of larger sizes by letting the entries of the level 2 or level 3 page tables directly point to a mapped frame. With this bit set, the page size increases by factor 512 to either 2 MiB = 512 * 4 KiB for level 2 entries or even 1 GiB = 512 * 2 MiB for level 3 entries. The advantage of using larger pages is that fewer lines of the translation cache and fewer page tables are needed.

[system calls]: https://en.wikipedia.org/wiki/System_call
[Spectre]: https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)

The `x86_64` crate provides types for [page tables] and their [entries], so we don't need to create these structures ourselves.

[page tables]: https://docs.rs/x86_64/0.14.2/x86_64/structures/paging/page_table/struct.PageTable.html
[entries]: https://docs.rs/x86_64/0.14.2/x86_64/structures/paging/page_table/struct.PageTableEntry.html

### The Translation Lookaside Buffer

A 4-level page table makes the translation of virtual addresses expensive because each translation requires four memory accesses. To improve performance, the x86_64 architecture caches the last few translations in the so-called _translation lookaside buffer_ (TLB). This allows skipping the translation when it is still cached.

Unlike the other CPU caches, the TLB is not fully transparent and does not update or remove translations when the contents of page tables change. This means that the kernel must manually update the TLB whenever it modifies a page table. To do this, there is a special CPU instruction called [`invlpg`] ("invalidate page") that removes the translation for the specified page from the TLB, so that it is loaded again from the page table on the next access. The TLB can also be flushed completely by reloading the `CR3` register, which simulates an address space switch. The `x86_64` crate provides Rust functions for both variants in the [`tlb` module].

[`invlpg`]: https://www.felixcloutier.com/x86/INVLPG.html
[`tlb` module]: https://docs.rs/x86_64/0.14.2/x86_64/instructions/tlb/index.html

It is important to remember to flush the TLB on each page table modification because otherwise, the CPU might keep using the old translation, which can lead to non-deterministic bugs that are very hard to debug.
-->
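
## Appendix: Decoding an Entry

The commented notes above describe the 64-bit page table entry format. As a bonus, a minimal decoder sketch assuming that standard x86_64 layout (the helper name is ours):

```python
def decode_entry(entry: int) -> dict:
    """Pull a few flags and the frame address out of a 64-bit page table entry."""
    return {
        "present":  bool(entry & (1 << 0)),       # bit 0
        "writable": bool(entry & (1 << 1)),       # bit 1
        "user":     bool(entry & (1 << 2)),       # bit 2
        "frame":    entry & 0x000F_FFFF_FFFF_F000,  # bits 12-51: frame address
        "nx":       bool(entry & (1 << 63)),      # bit 63: no execute
    }

# A present, writable, kernel-only entry pointing at frame 0x3000:
print(decode_entry(0x3000 | 0b11))
```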