Unsafe
OS in Rust
Announcements
- Welcome to OS in Rust
- Action Items:
- wc ongoing.
- Due Friday, 5 Sept. at 1440 PT.
- Just a quick demo for HW this week, check it out whenever.
Today
- Unsafe
- UB
- Dereferencing
Unsafe
The Dark Arts of Unsafe Rust
THE KNOWLEDGE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF UNLEASHING INDESCRIBABLE HORRORS THAT SHATTER YOUR PSYCHE AND SET YOUR MIND ADRIFT IN THE UNKNOWABLY INFINITE COSMOS.
Motivation
- Often we don’t worry about low-level implementation details.
- How many bits is an integer in Python?
- How many bits is this array in Rust?
- Who could possibly care how much space the empty tuple occupies?
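These size questions can be answered directly in Rust. The following is a minimal sketch using std::mem::size_of; the helper name bits_of and the choice of [u32; 4] as the array are assumptions made for illustration, not the array from the slide.

```rust
use std::mem::size_of;

/// Hypothetical helper: size of a type in bits rather than bytes.
fn bits_of<T>() -> usize {
    size_of::<T>() * 8
}

fn main() {
    // Four 32-bit integers, stored back to back.
    println!("[u32; 4] occupies {} bits", bits_of::<[u32; 4]>());
    // The empty tuple is a zero-sized type.
    println!("() occupies {} bits", bits_of::<()>());
}
```

The empty tuple taking no space at all is exactly the kind of detail that only starts to matter when the layout is shared with hardware or with C.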
Sometimes it matters…
- The most common reason is performance, but
- More importantly, these details can become a matter of correctness when interfacing directly with
- hardware,
- operating systems, or
- other languages (by which we mean C)
Legacy
When implementation details start to matter in a safe programming language, programmers usually have three options:
- fiddle with the code to “encourage” the compiler/runtime to perform an optimization
- adopt a more unidiomatic or cumbersome design to get the desired implementation
- rewrite the implementation in a language that lets you deal with those details
Rust on C
Unfortunately, C is incredibly unsafe to use (sometimes for good reason), and this unsafety is magnified when trying to interoperate with another language. Care must be taken to ensure C and the other language agree on what’s happening, and that they don’t step on each other’s toes.
Both, and
- So what does this have to do with Rust?
- Well, unlike C, Rust is a safe programming language.
- But, like C, Rust is an unsafe programming language.
- More accurately, Rust contains both a safe and unsafe programming language.
Rust can be thought of as a combination of two programming languages: Safe Rust and Unsafe Rust.
Safe Rust
- The default case
- What we’ve used so far, at least officially
- You may have incidentally dabbled in unsafe.
- Good for you.
- Safe Rust only really makes one guarantee.
Rust’s memory safety guarantees enforced at compile time
- That is, you can never try to read something that doesn’t exist.
Unsafe Rust
- Sometimes, you may want to have two references to an object that are both mutable.
- For example, if you are implementing a suffix tree, a task I didn’t assign last term because it was miserable in safe Rust.
- Sometimes, you want to do so in a way that can’t possibly break anything.
- For example, you may only be using provably correct algorithms (whatever that means).
- But of course, rustc isn’t a theorem prover…
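A classic case of "two mutable references that can't possibly break anything" is splitting a slice into two halves that do not overlap. The sketch below is modeled on the standard library's split_at_mut; the function name split_in_two is hypothetical, and the code uses an unsafe block and raw pointers, both of which are introduced later in these slides.

```rust
use std::slice;

/// Hypothetical helper: hands out two mutable slices into one buffer.
/// The compiler cannot prove the halves are disjoint, so we assert it ourselves.
fn split_in_two(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();
    // Without this check the second slice could run past the buffer.
    assert!(mid <= len);
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut data = [1, 2, 3, 4, 5];
    let (left, right) = split_in_two(&mut data, 2);
    // Both halves are mutable at the same time.
    left[0] = 10;
    right[0] = 30;
    println!("{:?}", data);
}
```

The function itself is safe to call: the unsafety is contained inside it, and the assert is what makes the "provably correct" claim hold.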
Warning
Unsafe Rust is, well, not (safe). In fact, Unsafe Rust lets us do some really unsafe things. Things the Rust authors will implore you not to do, but we’ll do anyway.
True Rust
- Safe Rust is the true Rust programming language.
- If all you do is write Safe Rust, you will never have to worry about type-safety or memory-safety.
- You will never endure
- a dangling pointer,
- a use-after-free, or
- any other kind of Undefined Behavior (a.k.a. UB).
std
- The standard library also gives you enough utilities out of the box that you’ll be able to write high-performance applications and libraries in pure idiomatic Safe Rust.
- Assuming, of course, you are working on a system where std is already implemented.
- Not, you know, writing your own OS or own std.
Twinsies
- Unsafe Rust is exactly like Safe Rust with all the same rules and semantics.
- It just lets you do some extra things that are Definitely Not Safe.
Payoffs
- The value of this separation is that we gain the benefits of using an unsafe language like C
- low level control over implementation details
- …without most (citation needed) of the problems that come with trying to integrate it with a completely different safe language.
Undefined Behavior
Safe Rust
- Safe Rust guarantees certain things about your code.
- Most of these won’t even make sense to you unless you write C
Safe Prevents
- Dereferencing (using the * operator on) dangling or unaligned pointers (see below)
- Breaking the “pointer aliasing rules” using & (borrow) and mut
- Calling a function with the wrong call types
- Causing a data race when multithreading
- Executing code compiled for hardware features that the machine currently hosting the process does not support
- Producing invalid values like something other than 0 or 1 for a boolean.
Safe Allows
- Deadlocks when multithreading
- Leaks of memory and other resources
- Exiting without calling destructors
- Exposing randomized base addresses through pointer leaks
- Integer overflow (recall SHA wrapping)
- Logic errors
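A few of these can be seen without writing unsafe at all. The following is a rough sketch; the names SetsFlagOnDrop, destructor_ran_after_forget, leak_message, and wrapped_sum are all hypothetical and exist only for this illustration.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical type whose destructor records that it ran.
struct SetsFlagOnDrop<'a>(&'a Cell<bool>);

impl Drop for SetsFlagOnDrop<'_> {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

/// Skipping a destructor: mem::forget is a safe function.
fn destructor_ran_after_forget() -> bool {
    let flag = Cell::new(false);
    mem::forget(SetsFlagOnDrop(&flag));
    flag.get()
}

/// Leaking memory: Box::leak is also safe, the allocation is never freed.
fn leak_message() -> &'static str {
    Box::leak(String::from("never freed").into_boxed_str())
}

/// Integer overflow with explicit wrapping, as in the SHA exercise.
fn wrapped_sum(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

fn main() {
    println!("destructor ran: {}", destructor_ran_after_forget());
    println!("leaked: {}", leak_message());
    println!("255 + 1 wraps to {}", wrapped_sum(255, 1));
}
```

None of these is Undefined Behavior. They may well be bugs, but they cannot corrupt memory.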
Unsafe allows:
- Dereference a raw pointer.
- Call an unsafe function or method.
- Access or modify a mutable static variable.
- Implement an unsafe trait.
- Access fields of unions.
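The five operations above can be sketched in one small program. Every name here (read_through, COUNTER, bump_counter, AllZeroesIsValid, zeroed_value, IntOrFloat, float_bits) is hypothetical and chosen only for illustration; the static is touched from a single thread, which is an assumption the code relies on.

```rust
/// An unsafe function: the caller must pass a valid, aligned pointer.
unsafe fn read_through(ptr: *const i32) -> i32 {
    // Dereferencing a raw pointer.
    unsafe { *ptr }
}

/// A mutable static. Assumed to be accessed from one thread only.
static mut COUNTER: u32 = 0;

fn bump_counter() -> u32 {
    // Modifying and reading a mutable static.
    unsafe {
        COUNTER += 1;
        COUNTER
    }
}

/// An unsafe trait: implementors promise the all-zero bit pattern is a valid value.
unsafe trait AllZeroesIsValid {}
unsafe impl AllZeroesIsValid for u32 {}

fn zeroed_value<T: AllZeroesIsValid>() -> T {
    // Relies on the promise made by the unsafe impl.
    unsafe { std::mem::zeroed() }
}

/// A union: both fields share the same four bytes.
#[repr(C)]
union IntOrFloat {
    int: u32,
    float: f32,
}

fn float_bits(x: f32) -> u32 {
    let value = IntOrFloat { float: x };
    // Reading a union field.
    unsafe { value.int }
}

fn main() {
    let num = 5;
    // Calling an unsafe function.
    let read = unsafe { read_through(&raw const num) };
    println!("read {read}, counter {}", bump_counter());
    println!("zeroed {}, bits {:#x}", zeroed_value::<u32>(), float_bits(1.0));
}
```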
Simply…
- Write Rust code normally.
- If you are doing something unsafe, include it in an unsafe block.
Unions
Dangling
- This worked fine for me, but technically isn’t supported.
The Reference
- The reference is much more… verbose.
- It’s here
Breaking the pointer aliasing rules. The exact aliasing rules are not determined yet, but here is an outline of the general principles:
Violating assumptions of the Rust runtime. Most assumptions of the Rust runtime are currently not explicitly documented.
Dangling
Previously in Rust
- Safe Rust ensures that references are always valid
- That is, borrows via &
- These always have some plausible correct value in them (if the code compiles)
The Frontier
- At the edge of Safe and Unsafe Rust are raw pointers.
- Can be immutable *const T or mutable *mut T.
- Vs. C, the asterisk is part of the type name and not an operator.
- This is annoying.
- Creating a raw pointer is not itself unsafe, so raw pointers can be created in Safe Rust; only dereferencing them requires unsafe.
Differences
- These raw pointers:
- Are allowed to ignore the borrowing rules by having both immutable and mutable pointers or multiple mutable pointers to the same location
- Aren’t guaranteed to point to valid memory
- Are allowed to be null
- Don’t implement any automatic cleanup
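The first and third differences can be sketched as follows. The function names write_through_two_pointers and null_pointer_is_null are hypothetical; the example assumes a single thread and a local variable that outlives both pointers.

```rust
use std::ptr;

/// Two mutable raw pointers to the same location, both written through.
fn write_through_two_pointers() -> i32 {
    let mut value = 1;
    let first = &raw mut value;
    let second = &raw mut value;
    unsafe {
        *first += 1;
        *second += 10;
    }
    value
}

/// A null raw pointer can be created and inspected in safe code.
fn null_pointer_is_null() -> bool {
    let nothing: *const i32 = ptr::null();
    nothing.is_null()
}

fn main() {
    println!("value is now {}", write_through_two_pointers());
    println!("null check: {}", null_pointer_is_null());
}
```

Two &mut references to value would be rejected by the borrow checker; the raw pointers are accepted, and keeping them valid becomes our responsibility.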
Example
This is safe!
- There is nothing unsafe about creating a variable that refers to some value.
- After all, what’s the worst thing that can happen?
- These lines of code only facilitate novel possible ways of interfacing with the underlying data.
- In this case, we know these pointers are valid and point to 5.
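A minimal sketch of what such safe pointer creation might look like, with no unsafe block anywhere. The function name pointers_share_address is hypothetical, and the variable names follow the dereference example later in these slides.

```rust
/// Creates one immutable and one mutable raw pointer to the same local.
/// Nothing here needs unsafe, because nothing is dereferenced.
fn pointers_share_address() -> bool {
    let mut num = 5;
    let r1 = &raw const num;
    let r2 = &raw mut num;
    // Comparing raw pointers is a safe operation.
    r1 == r2 as *const i32
}

fn main() {
    let mut num = 5;
    let r1 = &raw const num;
    let r2 = &raw mut num;
    // Printing the addresses is safe too.
    println!("r1 = {:p}, r2 = {:p}", r1, r2);
    println!("same address: {}", pointers_share_address());
}
```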
Casts
- We can get a potentially invalid raw pointer using “casts” with the as keyword.
- We take a numeric value and treat it as a pointer.
- This numeric value represents an address in numerically addressed computer memory.
- This is still safe so far!
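As a sketch, the cast might look like this. The function name address_to_pointer and the address 0x012345 are arbitrary choices for illustration; nothing is known to live at that address, which is the point.

```rust
/// Treats an arbitrary number as the address of an i32.
/// No unsafe is needed, because the pointer is never dereferenced.
fn address_to_pointer(address: usize) -> *const i32 {
    address as *const i32
}

fn main() {
    // An arbitrary address; it almost certainly does not hold an i32 we own.
    let r = address_to_pointer(0x012345);
    println!("pointer value: {:p}", r);
}
```

Only the numeric value of the pointer is printed here. Reading through it would require an unsafe block and would most likely crash.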
Dereference
- We now can use the dereference asterisk * operator.
- This is decidedly unsafe.
let mut num = 5;
let r1 = &raw const num;
let r2 = &raw mut num;
unsafe {
    println!("r1 is: {}", *r1);
    println!("r2 is: {}", *r2);
}
- Unsafe to Rust but we know it will work.
Runtime Crash
- This is both unsafe and (almost certainly) won’t work.
src/main.rs
- Fun exercise: try to take the dereferences out of the unsafe block.
Fin