Unsafe

OS in Rust

Announcements

  • Welcome to OS in Rust
  • Action Items:
    • wc ongoing.
    • Due Friday, 5 Sept. at 1440 PT.
    • Just a quick demo for HW this week, check it out whenever.

Today

  • Unsafe
  • UB
  • Dereferencing

Unsafe

The Dark Arts of Unsafe Rust

THE KNOWLEDGE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF UNLEASHING INDESCRIBABLE HORRORS THAT SHATTER YOUR PSYCHE AND SET YOUR MIND ADRIFT IN THE UNKNOWABLY INFINITE COSMOS.

Motivation

  • Often we don’t worry about low-level implementation details.
    • How many bits is an integer in Python?
    • How many bits is this array in Rust?
let mut array: [i32; 3] = [0; 3];
  • Who could possibly care how much space the empty tuple occupies?
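These questions can be answered from inside Rust. The sketch below uses `std::mem::size_of`; the helper `bits_of` is a hypothetical name introduced only for this example.

```rust
use std::mem::size_of;

/// Size of a type in bits (its size in bytes times 8).
/// `bits_of` is a made-up helper for this sketch, not a std function.
fn bits_of<T>() -> usize {
    size_of::<T>() * 8
}

fn main() {
    // Three 4-byte integers stored back to back.
    println!("[i32; 3] occupies {} bits", bits_of::<[i32; 3]>());
    // The empty tuple is a zero-sized type.
    println!("() occupies {} bytes", size_of::<()>());
}
```

The array takes 96 bits and the empty tuple takes none, which is the kind of detail that starts to matter once memory layouts are shared with hardware or C.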

Sometimes it matters…

  • The most common reason is performance, but
  • More importantly, these details can become a matter of correctness when interfacing directly with
    • hardware,
    • operating systems, or
    • other languages (by which we mean C)

Legacy

When implementation details start to matter in a safe programming language, programmers usually have three options:

  • fiddle with the code to “encourage” the compiler/runtime to perform an optimization
  • adopt a more unidiomatic or cumbersome design to get the desired implementation
  • rewrite the implementation in a language that lets you deal with those details

Rust on C

Unfortunately, C is incredibly unsafe to use (sometimes for good reason), and this unsafety is magnified when trying to interoperate with another language. Care must be taken to ensure C and the other language agree on what’s happening, and that they don’t step on each other’s toes.

Both, and

  • So what does this have to do with Rust?
    • Well, unlike C, Rust is a safe programming language.
    • But, like C, Rust is an unsafe programming language.
  • More accurately, Rust contains both a safe and unsafe programming language.

Rust can be thought of as a combination of two programming languages: Safe Rust and Unsafe Rust.

Safe Rust

  • The default case
  • What we’ve used so far, at least officially
    • You may have incidentally dabbled in unsafe.
    • Good for you.
  • Safe Rust only really makes one guarantee.

Rust’s memory safety guarantees enforced at compile time

  • That is, you can never try to read something that doesn’t exist.

Unsafe Rust

  • Sometimes, you may want to have two references to an object that are both mutable.
    • For example, if you are implementing a suffix tree, a task I didn’t assign last term because it was miserable in safe Rust.
  • Sometimes, you want to do so in a way that can’t possibly break anything.
    • For example, you may only be using provably correct algorithms (whatever that means).
    • But of course, rustc isn’t a theorem prover…
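As a rough illustration of the point above, the sketch below shows the borrow checker rejecting two simultaneous mutable references into one slice, even though they touch different elements. It then uses the standard library's `split_at_mut`, which is implemented with unsafe code internally, to get the same effect. The function name `swap_across_halves` is hypothetical.

```rust
/// Swap the first element of each half of a slice.
/// (`swap_across_halves` is a made-up name for this sketch.)
fn swap_across_halves(data: &mut [i32]) {
    let mid = data.len() / 2;

    // Rejected by the borrow checker: two mutable borrows of `data` at once,
    // even though the two elements never overlap.
    // let a = &mut data[0];
    // let b = &mut data[mid];
    // std::mem::swap(a, b);

    // `split_at_mut` hands back two non-overlapping mutable slices.
    let (left, right) = data.split_at_mut(mid);
    if let (Some(a), Some(b)) = (left.first_mut(), right.first_mut()) {
        std::mem::swap(a, b);
    }
}

fn main() {
    let mut data = [1, 2, 3, 4];
    swap_across_halves(&mut data);
    println!("{:?}", data);
}
```

The compiler cannot prove that the two halves are disjoint, so someone has to vouch for it inside an unsafe block; here the standard library has already done so.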

Warning

Unsafe Rust is, well, not (safe). In fact, Unsafe Rust lets us do some really unsafe things. Things the Rust authors will implore you not to do, but we’ll do anyway.

True Rust

  • Safe Rust is the true Rust programming language.
  • If all you do is write Safe Rust, you will never have to worry about type-safety or memory-safety.
  • You will never endure
    • a dangling pointer,
    • a use-after-free, or
    • any other kind of Undefined Behavior (a.k.a. UB).

std

  • The standard library also gives you enough utilities out of the box that you’ll be able to write high-performance applications and libraries in pure idiomatic Safe Rust.
  • Assuming, of course, you are working on a system where std is already implemented.
  • Not, you know, writing your own OS or own std.

Twinsies

  • Unsafe Rust is exactly like Safe Rust with all the same rules and semantics.
  • It just lets you do some extra things that are Definitely Not Safe.

Payoffs

  • The value of this separation is that we gain the benefits of using an unsafe language like C
    • low level control over implementation details
  • …without most (citation needed) of the problems that come with trying to integrate it with a completely different safe language.

Undefined Behavior

Safe Rust

  • Safe Rust guarantees certain things about your code.
    • Most of these won’t even make sense to you unless you write C

Safe Prevents

  • Dereferencing (using the * operator on) dangling or unaligned pointers (see below)
  • Breaking the “pointer aliasing rules” using & (borrow) and mut
  • Calling a function with the wrong call ABI
  • Causing a data race when multithreading
  • Executing code compiled with platform features that the hardware currently hosting the process does not support
  • Producing invalid values like something other than 0 or 1 for a boolean.

Safe Allows

  • Deadlocks when multithreading
  • Leaks of memory and other resources
  • Exiting without calling destructors
  • Exposing randomized base addresses through pointer leaks
  • Integer overflow (recall SHA wrapping)
  • Logic errors
// Logic error: this tests for odd, not even, yet it compiles and runs.
let val = 7;
let val_is_even = if val % 2 == 1 { true } else { false };
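A couple of the items above can be shown in entirely safe code. This is a minimal sketch using `wrapping_add`, `checked_add`, `Box::leak`, and `std::mem::forget` from the standard library; the helper `wrapping_sum` is a hypothetical name.

```rust
/// Add two bytes, wrapping around on overflow instead of panicking.
/// (`wrapping_sum` is a made-up helper for this sketch.)
fn wrapping_sum(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

fn main() {
    // Integer overflow: explicit wrapping is well defined, not UB.
    println!("255 + 1 wraps to {}", wrapping_sum(255, 1));
    // The checked variant reports the overflow instead.
    println!("checked: {:?}", 255u8.checked_add(1));

    // Leak: this allocation is never freed, and no `unsafe` is needed.
    let leaked: &'static mut Vec<i32> = Box::leak(Box::new(vec![1, 2, 3]));
    leaked.push(4);
    println!("leaked vec: {:?}", leaked);

    // Skipping a destructor is also safe.
    std::mem::forget(String::from("never dropped"));
}
```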

Unsafe allows:

  1. Dereference a raw pointer.
  2. Call an unsafe function or method.
  3. Access or modify a mutable static variable.
  4. Implement an unsafe trait.
  5. Access fields of unions.
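The list above can be sketched in one small program. The names `COUNTER`, `AllZeroesOk`, and `add_to_count` are hypothetical and chosen only for this illustration; union field access (item 5) is left out because it gets its own example.

```rust
// Item 3: a mutable static. Every access to it requires `unsafe`.
static mut COUNTER: u32 = 0;

/// Item 4: an unsafe trait. Implementors promise something the compiler
/// cannot check, here that the all-zero bit pattern is a valid value.
/// (`AllZeroesOk` is a made-up trait for this sketch.)
unsafe trait AllZeroesOk: Sized {
    fn zeroed() -> Self {
        // SAFETY: guaranteed by whoever wrote the `unsafe impl`.
        unsafe { std::mem::zeroed() }
    }
}

// SAFETY: a u32 with every bit zero is simply 0.
unsafe impl AllZeroesOk for u32 {}

/// Item 2: an unsafe function. The caller must ensure no other thread
/// touches `COUNTER` at the same time.
unsafe fn add_to_count(inc: u32) -> u32 {
    unsafe {
        COUNTER += inc;
        // Item 1: dereference a raw pointer to read the static.
        *(&raw const COUNTER)
    }
}

fn main() {
    let start = <u32 as AllZeroesOk>::zeroed();
    // SAFETY: this program has a single thread.
    let total = unsafe { add_to_count(start + 3) };
    println!("COUNTER: {total}");
}
```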

Simply…

  1. Write Rust code normally.
  2. If you are doing something unsafe, include it in an unsafe block.

Unions

union MyUnion {
    f: f32,
    u: u32,
}

fn main() {
    let mut u = MyUnion { f: 0.0 };

    unsafe {
        println!("Bits as float: {}", u.f);
        println!("Bits as integer: {:#x}", u.u);
    }

    u.u = 0x3F800000; // This is the bit pattern for 1.0 in IEEE 754

    unsafe {
        println!("After manual bit update, float is: {}", u.f);
    }
}
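For comparison, the standard library exposes the same reinterpretation for floats without any unsafe block, through `f32::from_bits` and `f32::to_bits`. The wrapper `float_from_bits` below is a hypothetical name used only in this sketch.

```rust
/// Reinterpret a u32 bit pattern as an f32 without `unsafe`.
/// (`float_from_bits` is a made-up wrapper for this sketch.)
fn float_from_bits(bits: u32) -> f32 {
    f32::from_bits(bits)
}

fn main() {
    // Same bit pattern as in the union example.
    println!("0x3F800000 as float: {}", float_from_bits(0x3F80_0000));
    // And the reverse direction.
    println!("1.0 as bits: {:#x}", 1.0f32.to_bits());
}
```

The union remains the more general tool, since it works for any pair of types and matches how C code expresses the same idea.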

Dangling

  • This worked fine for me, but it is undefined behavior: the second dereference reads x after x has gone out of scope, so ptr is dangling.
src/main.rs
fn main() {    
    let ptr: *const i32;

    {
        let x = 1234;
        ptr = &x as *const i32;
        unsafe {
            println!("Value at ptr: {}", *ptr);
        }
    }

    unsafe {
        println!("Value at ptr: {}", *ptr);
    }
}

The Reference

  • The reference is much more… verbose.
  • It’s here

Breaking the pointer aliasing rules. The exact aliasing rules are not determined yet, but here is an outline of the general principles:

Violating assumptions of the Rust runtime. Most assumptions of the Rust runtime are currently not explicitly documented.

Dangling

Previously in Rust

  • Safe Rust ensures that references are always valid
    • That is, borrows via &
    • These always have some plausible correct value in them (if the code compiles)

The Frontier

  • At the edge of Safe and Unsafe Rust are raw pointers.
    • Can be immutable *const T or mutable *mut T.
    • Unlike in C, the asterisk here is part of the type name and not the dereference operator.
      • This is annoying.
  • Creating a raw pointer is not itself unsafe, so raw pointers can be created in safe Rust; only dereferencing one requires unsafe.

Differences

  • These raw pointers:
    • Are allowed to ignore the borrowing rules by having both immutable and mutable pointers or multiple mutable pointers to the same location
    • Aren’t guaranteed to point to valid memory
    • Are allowed to be null
    • Don’t implement any automatic cleanup
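A minimal sketch of two of these differences: several mutable raw pointers to one location, and a null pointer. The function name `bump_twice` is hypothetical.

```rust
use std::ptr;

/// Increment a value twice, once through each of two mutable raw pointers.
/// (`bump_twice` is a made-up name for this sketch.)
fn bump_twice(mut value: i32) -> i32 {
    let p1 = &raw mut value;
    let p2 = &raw mut value; // a second mutable pointer: fine for raw pointers

    // SAFETY: both pointers target `value`, which is alive, aligned, and
    // initialized, and no references to it are in use while we write.
    unsafe {
        *p1 += 1;
        *p2 += 1;
        *p1
    }
}

fn main() {
    println!("bumped: {}", bump_twice(5));

    // A raw pointer may be null; creating and inspecting it is safe.
    let nothing: *const i32 = ptr::null();
    println!("is null: {}", nothing.is_null());
}
```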

Example

    let mut num = 5;

    let r1 = &raw const num;
    let r2 = &raw mut num;

This is safe!

  • There is nothing unsafe about creating a variable that refers to some value.
  • After all, what’s the worst thing that can happen?
  • These lines of code only create new ways of accessing the underlying data.
  • In this case, we know these pointers are valid and point to 5.

Casts

  • We can get a potentially invalid raw pointer using “casts” with the as keyword.
    • We take a numeric value and treat it as a pointer.
    • This numeric value represents a location in numerically addressed computer memory.
    let address = 0x012345usize;
    let r = address as *const i32;
  • This is still safe so far!
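To underline that the cast alone is harmless, this sketch converts addresses to pointers and back without ever dereferencing them. The function name `pointer_from_address` is hypothetical.

```rust
/// Treat an arbitrary address as a pointer to an i32.
/// Safe, because nothing is read or written.
/// (`pointer_from_address` is a made-up name for this sketch.)
fn pointer_from_address(address: usize) -> *const i32 {
    address as *const i32
}

fn main() {
    let r = pointer_from_address(0x012345);
    // Printing or converting the pointer never touches the memory behind it.
    println!("pointer: {:p}", r);
    println!("back to a number: {:#x}", r as usize);
    // Address zero gives the null pointer.
    println!("zero is null: {}", pointer_from_address(0).is_null());
}
```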

Dereference

  • We now can use the dereference asterisk * operator.
  • This is decidedly unsafe.
    let mut num = 5;

    let r1 = &raw const num;
    let r2 = &raw mut num;

    unsafe {
        println!("r1 is: {}", *r1);
        println!("r2 is: {}", *r2);
    }
  • Unsafe to Rust but we know it will work.
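Dereferencing also allows writes. In the sketch below, one pointer writes and the other reads the same location. As an assumption made here to stay clearly within the aliasing rules, r1 is derived from r2 rather than borrowed from num a second time. The function name `write_then_read` is hypothetical.

```rust
/// Write through a mutable raw pointer, then read the result back through
/// a const raw pointer to the same location.
/// (`write_then_read` is a made-up name for this sketch.)
fn write_then_read(mut num: i32, new_value: i32) -> i32 {
    let r2 = &raw mut num;
    // Assumption for this sketch: derive the read-only pointer from r2.
    let r1 = r2 as *const i32;

    // SAFETY: `num` is alive, aligned, and initialized for the whole block.
    unsafe {
        *r2 = new_value;
        *r1
    }
}

fn main() {
    println!("r1 now reads: {}", write_then_read(5, 42));
}
```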

Runtime Crash

  • This is both unsafe and (almost certainly) won’t work.
src/main.rs
fn main() {
    let address = 0x0usize;
    let r = address as *const i32;
    unsafe {
        dbg!(*r);
    }
}
  • Fun exercise: try to take the dereferences out of the unsafe block.
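One way to avoid this particular crash is to test for null before dereferencing. The sketch below uses the raw pointer method `as_ref`, which is itself unsafe and returns None for a null pointer; the function name `read_or_none` is hypothetical. A null check does nothing for dangling or misaligned pointers, so the function stays unsafe.

```rust
/// Read the i32 behind `ptr`, returning None for a null pointer.
/// (`read_or_none` is a made-up name for this sketch.)
///
/// # Safety
/// A non-null `ptr` must be aligned and point to a live, initialized i32.
unsafe fn read_or_none(ptr: *const i32) -> Option<i32> {
    // SAFETY: upheld by the caller; `as_ref` only handles the null case.
    unsafe { ptr.as_ref().copied() }
}

fn main() {
    let null = 0x0usize as *const i32;
    let value = 7;

    // SAFETY: one pointer is null, the other points at a live local.
    unsafe {
        println!("null pointer: {:?}", read_or_none(null));
        println!("valid pointer: {:?}", read_or_none(&raw const value));
    }
}
```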

Fin
