Linker
OS in Rust
A word before we begin
- This is continued directly from the Bare Metal lecture.
The Code
- We left off with the following code:
The Configuration
- We modified the configuration with a few new lines about panics.
The State of Play
- We encountered the following error:
$ cargo build
Compiling osirs v0.1.0 (/home/user/tmp/32)
error: using `fn main` requires the standard library
|
= help: use `#![no_main]` to bypass the Rust generated entrypoint and declare a platform specific entrypoint yourself, usually with `#[no_mangle]`
error: could not compile `osirs` (bin "osirs") due to 1 previous error
Onward
The start attribute
- One might think that the `main` function is the first function called when you run a program.
  - Usually because that is how `main` is taught.
  - It isn’t entirely wrong - `main` is the first thing that you write that is called when you run a program.
  - But there’s often some setup first!
- Most languages have a “runtime system”
- E.g. garbage collection (in Java)
- E.g. software threads (goroutines, in Go)
- E.g. for Python, Pyodide runs in WebAssembly/Emscripten within your browser engine, which itself runs on top of an OS.
- This runtime needs to be called before `main`, since it needs to initialize itself.
C, again
- In a typical Rust executable that links the standard library, execution starts in a C runtime library called `crt0`, for “C runtime zero”
  - C stands for “cool”
- This creates a stack and places the arguments in the right hardware registers.
- We recall even in our earliest mentions of C and Rust we always assume there just happens to be a stack we can push/pop fixed-size variables onto.
Getting _started
- The C runtime then invokes the entry point of the Rust runtime, which is marked by the `start` language item.
- I detect a great deal of hand-waving around the term “language item”.
- I think “language item” is how Rust people describe (some subset of) things that don’t make sense with the language implementation.
- Mostly, they are not expressions.
- Read more
Example
Some features provided by lang items:
overloadable operators via traits: the traits corresponding to the `==`, `<`, dereferencing (`*`) and `+` (etc.) operators are all marked with lang items; those specific four are `eq`, `partial_ord`, `deref`/`deref_mut`, and `add` respectively.
We recall Calvin Deutschbein Thought on both overloading and traits (they’re bad).
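To make the quoted feature concrete, here is a small sketch of what "operators via traits" looks like from the user's side. The `Vec2` type is hypothetical and exists only for this illustration; `core::ops::Add` and `PartialEq` are the traits that carry the `add` and `eq` lang items, so writing `a + b` or `a == b` ends up calling these impls.

```rust
use core::ops::Add;

// Hypothetical 2D vector type, used only for this illustration.
#[derive(Debug, Clone, Copy)]
pub struct Vec2 {
    pub x: i32,
    pub y: i32,
}

// `Add` is the trait marked with the `add` lang item:
// the compiler resolves `a + b` to `Add::add(a, b)`.
impl Add for Vec2 {
    type Output = Vec2;

    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

// `PartialEq` is the trait marked with the `eq` lang item:
// the compiler resolves `a == b` to `PartialEq::eq(&a, &b)`.
impl PartialEq for Vec2 {
    fn eq(&self, other: &Vec2) -> bool {
        self.x == other.x && self.y == other.y
    }
}
```

Because the traits live in `core`, the same impls should work unchanged in a `#![no_std]` crate.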
Example
Some features provided by lang items:
panicking: the `panic` and `panic_impl` lang items, among others.
We have already been bamboozled into using #[panic_handler]
Example
Some features provided by lang items:
stack unwinding: the lang item `eh_personality` is a function used by the failure mechanisms of the compiler.
The eh_personality item is cut content present in the reference material. We recall Calvin Deutschbein Thought on unwinding (it’s bad).
Back to Rust
- Rust only has a very minimal runtime, which takes care of some small things such as setting up stack overflow guards or printing a backtrace on panic.
- We should approach the claim of minimal with some skepticism, but it isn’t relevant to us for now.
- The runtime then finally calls the `main` function.
crt0 is cheating
- Our freestanding executable does not have access to the Rust runtime and `crt0`
- We need to define our own entry point.
  - Implementing the `start` language item wouldn’t help, since it would still require `crt0`.
- Instead, we need to overwrite the `crt0` entry point directly.
Overwriting
- To tell the Rust compiler that we don’t want to use the normal entry point chain, we add the `#![no_main]` attribute.
No main in main
- At this point we can also remove the `main` function.
  - But notably still term our file `main.rs`
  - We also pretend this is not confusing.
- Absent a compatible runtime, `main` is meaningless!
- If you `cargo build` at this point, by the way, you will get some fun errors.
Start in main
- Instead, overwrite the entry point with our own `_start` function:
- This also won’t work.
Manglin’
- By using the `#[unsafe(no_mangle)]` attribute, we disable “name mangling”
  - The function must be named `_start`.
- Otherwise, the compiler generates unique symbols like `_start_imarandomstr_1234` to avoid namespace collisions.
  - Folks… it’s key-value storage.
- The attribute is required for the linker in the next step.
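A hosted sketch of why mangling exists at all. The module and function names here are hypothetical. Both functions are called `start` in the source, yet they coexist, because the compiler emits a distinct mangled symbol for each path. If both carried `#[unsafe(no_mangle)]`, both would ask for the same raw symbol name and the build would be expected to fail.

```rust
// Two modules each define a function called `start` (hypothetical names).
pub mod bootloader {
    pub fn start() -> &'static str {
        "bootloader::start"
    }
}

pub mod kernel {
    pub fn start() -> &'static str {
        "kernel::start"
    }
}

// Both can be called from the same place: the paths differ in the source,
// and the mangled symbol names differ in the object file.
pub fn call_both() -> (&'static str, &'static str) {
    (bootloader::start(), kernel::start())
}
```

The linker only sees the symbol table, so for the entry point we want exactly one unmangled key in it: `_start`.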
Start in main
- Write your own mangle-free `_start`:
C you again
- Mark the function as `extern "C"` to tell the compiler that it should use the “C calling convention”
- The reason for naming the function `_start` is that this is the default entry point name for most systems.
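A hosted sketch, with hypothetical function names, showing that the calling convention is part of a function's type in Rust. It is not the entry point itself, only an illustration that `extern "C"` changes how the function is called without changing how it is written.

```rust
// A function that uses the C calling convention (hypothetical example).
pub extern "C" fn add_one(x: u32) -> u32 {
    x + 1
}

// The ABI appears in the function pointer type. A plain `fn(u32) -> u32`
// would not be accepted for the parameter `f`.
pub fn call_through_c_abi(f: extern "C" fn(u32) -> u32, x: u32) -> u32 {
    f(x)
}
```

For example, `call_through_c_abi(add_one, 41)` evaluates to `42`.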
A Hardware Reality
- The C calling convention is a hardware reality
- It is the implementation of a physical device that assumes C code is running on it,
- It assumes the C runs in an expected, consistent, historical way.
- This consistency leads to a usable stack and viable `return`.
- It could be possible to e.g. implement stack-less C
- But that would not have hardware support.
Start in main
- Instead, overwrite the entry point with our own `_start` function:
C you later
- We don’t have a great way to test this at this stage.
- In fact, most formulations will lead to similar errors…
- We just put it in now for forwards compatibility.
- That said, I got this working without `pub extern "C"`
  - You can try it out soon.
Read more
Divergence
- The `!` return type returns! It marks a diverging function, one that never returns.
- This is required because the entry point is not called by any function, but invoked directly by the operating system or bootloader.
  - It can’t return anywhere!
- Cowards use `loop {}`, heroes use recursion.
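A hosted sketch of the `!` type, assuming a standard-library build so that the diverging function can panic instead of spinning forever. `halt` and `checked_div` are hypothetical names. In the freestanding binary the body of a diverging function would be `loop {}` (or recursion) instead of `panic!`.

```rust
// A diverging function: the return type `!` promises it never returns.
// Here it diverges by panicking so that the example terminates.
pub fn halt(reason: &str) -> ! {
    panic!("halt: {}", reason)
}

// Because `halt` never returns, its call type-checks in a position
// where a `u32` is expected.
pub fn checked_div(a: u32, b: u32) -> u32 {
    if b == 0 {
        halt("division by zero")
    } else {
        a / b
    }
}
```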
Allowance
- As you are aware, `cargo` is counted among the cowards that expect `loop`.
  - We still haven’t shown the trick to compile this yet, but if you knew the trick you would see the following.
Compiling osirs v0.1.0 (/home/user/tmp/32)
warning: function cannot return without recursing
--> src/main.rs:5:1
|
5 | pub extern "C" fn _start() -> ! {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot return without recursing
6 | _start()
| -------- recursive call site
|
= help: a `loop` may express intention better if this is on purpose
= note: `#[warn(unconditional_recursion)]` on by default
warning: `osirs` (bin "osirs") generated 1 warning
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.16s
Hash and Hash Bang
- We applied function-level allowances to our `panic` to allow infinite recursion as follows:
  - Immediately preceding the function, octothorpe/hash-prefixed attributes
src/main.rs
- We applied file-level allowances to our `src/main.rs` as follows:
  - Free-floating, octothorpe+exclamation point (hash+bang)
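A hosted sketch of the two attribute forms, using hypothetical function names. The lint name `unconditional_recursion` comes from the warning shown earlier. To keep the sketch portable, the hash+bang form is placed at the top of a module, where it covers that module; at the very top of `src/main.rs` the same form would cover the whole crate.

```rust
// Hash form (outer attribute): applies to the one item that follows it.
#[allow(unconditional_recursion, dead_code)]
pub fn recurse_forever() -> ! {
    recurse_forever()
}

pub mod quiet {
    // Hash+bang form (inner attribute): applies to the enclosing scope,
    // here every item in this module.
    #![allow(unconditional_recursion, dead_code)]

    pub fn also_recurses() -> ! {
        also_recurses()
    }

    pub fn lints_silenced() -> bool {
        true
    }
}
```

Neither recursive function is ever called here; the point is only that both definitions compile without the warning.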
The Code
- I promoted the recursion allowance to file-level, with the following resultant code:
src/main.rs
- Put a pin in this.
- We’ll be back later.
Exit
- Before I go out to the clubs I always “X it up” (put X’s on my hands) because I’m straight edge.
- Operating systems are similar.
- So instead of returning, the entry point should e.g. invoke the `exit` system call of the operating system.
- For now, we fulfill the requirement by recursing endlessly.
Now it works!
- It doesn’t.
$ cargo build
Compiling osirs v0.1.0 (/home/user/tmp/32)
error: linking with `cc` failed: exit status: 1
|
= note: "cc" "-m64" "/tmp/rustcheyGtM/symbols.o" "<1 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "<sysroot>/lib/rustlib/x86_64-unknown-linux-gnu/lib/{librustc_std_workspace_core-*,libcore-*,libcompiler_builtins-*}.rlib" "-L" "/tmp/rustcheyGtM/raw-dylibs" "-Wl,-Bdynamic" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/home/user/tmp/32/target/debug/deps/osirs-80f8eb3240ba748e" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-nodefaultlibs"
= note: some arguments are omitted. use `--verbose` to show all linker arguments
= note: /usr/bin/ld: /home/user/tmp/32/target/debug/deps/osirs-80f8eb3240ba748e.9chdw59nqscmfe0ef1hrxy2nb.rcgu.o: in function `_start':
/home/user/tmp/32/src/main.rs:14: multiple definition of `_start'; /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x1b): undefined reference to `main'
/usr/bin/ld: (.text+0x21): undefined reference to `__libc_start_main'
collect2: error: ld returned 1 exit status
= note: some `extern` functions couldn't be found; some native libraries may need to be installed or have their path specified
= note: use the `-l` flag to specify native libraries to link
= note: use the `cargo:rustc-link-lib` directive to specify the native libraries to link with Cargo (see https://doc.rust-lang.org/cargo/reference/build-scripts.html#rustc-link-lib)
error: could not compile `osirs` (bin "osirs") due to 1 previous error
Linker Errors
- The linker is a program that combines the generated code into an executable.
- Could be the subject of an entire course.
- Extremely compiler-relevant.
- Since the executable format differs between Linux, Windows, and macOS, each system has its own linker that throws a different error.
- The fundamental cause of the errors is the same: the default configuration of the linker assumes that our program depends on the C runtime, which it does not.
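The three complaints in the log above (one duplicate `_start`, two undefined references) can be reproduced with a toy model of symbol resolution. This is a sketch of the idea only, not how `ld` is implemented; the `Object` type, the `link` function, and the object name `osirs.o` are all hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// One object file: the symbols it defines and the symbols it references.
pub struct Object {
    pub name: &'static str,
    pub defines: Vec<&'static str>,
    pub references: Vec<&'static str>,
}

/// Toy symbol resolution: returns the errors a linker might report.
pub fn link(objects: &[Object]) -> Vec<String> {
    let mut errors = Vec::new();
    // Symbol name -> name of the object that defined it first.
    let mut defined_in: HashMap<&str, &str> = HashMap::new();

    for obj in objects {
        for &sym in &obj.defines {
            if let Some(first) = defined_in.get(sym) {
                errors.push(format!(
                    "multiple definition of `{}`; first defined in {}",
                    sym, first
                ));
            } else {
                defined_in.insert(sym, obj.name);
            }
        }
    }

    // Report each missing symbol once, in order of first reference.
    let mut reported: HashSet<&str> = HashSet::new();
    for obj in objects {
        for &sym in &obj.references {
            if !defined_in.contains_key(sym) && reported.insert(sym) {
                errors.push(format!("undefined reference to `{}`", sym));
            }
        }
    }
    errors
}

/// The situation from the build log: the C runtime object and our object
/// both define `_start`, and the C runtime expects `main` and libc.
pub fn our_failed_link() -> Vec<String> {
    link(&[
        Object {
            name: "Scrt1.o",
            defines: vec!["_start"],
            references: vec!["main", "__libc_start_main"],
        },
        Object { name: "osirs.o", defines: vec!["_start"], references: vec![] },
    ])
}
```

Dropping `Scrt1.o` from the list makes the toy link succeed, which is the effect both fixes discussed next are after.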
Solution
- To solve the errors, we need to tell the linker that it should not include the C runtime.
- We can do this either by passing a certain set of arguments to the linker or by building for a bare metal target.
Building for a Bare Metal Target
- By default Rust tries to build an executable that is able to run in your current system environment.
- For example, if you’re using Linux on `x86_64`, Rust tries to build an ELF that uses `x86_64` instructions.
- This environment is called your “host” system.
Target Triple
- To describe different environments, Rust uses a string called a target triple.
- You can see the target triple for your host system by running `rustc --version --verbose`:
On Triples
- For this triple, `rustc` and the linker (for me `gcc`; often `clang` is recommended) assume an OS and C runtime.
- We turned both of those things off.
- So, to avoid the linker errors, we can compile for a different environment with no underlying operating system.
Bare Metal
- One bare metal environment is the `thumbv7em-none-eabi` target triple
  - An embedded ARM system. Used for teaching.
- The details are not important; it has no underlying operating system.
  - That is the `none` in the target triple.
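A small sketch of reading a target triple as a string. The helper names are hypothetical, and the `none` check is a heuristic for illustration, not how `rustc` itself classifies targets.

```rust
/// Returns true if any `-`-separated component of the triple is `none`,
/// i.e. the triple names no operating system.
pub fn is_bare_metal(triple: &str) -> bool {
    triple.split('-').any(|part| part == "none")
}

/// The first component of a triple names the CPU architecture.
pub fn arch(triple: &str) -> &str {
    triple.split('-').next().unwrap_or("")
}
```

With these, `is_bare_metal("thumbv7em-none-eabi")` is `true`, while `is_bare_metal("x86_64-unknown-linux-gnu")` is `false`.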
Rustup
- To be able to compile for this target, we need to add it in rustup:
$ rustup target add thumbv7em-none-eabi
This downloads a copy of the standard (and core) library for the system. Now we can build our freestanding executable for this target:
$ cargo build --target thumbv7em-none-eabi
Explanation
- By passing a `--target` argument we cross-compile
  - Bare metal compilation!
- Since the target system has no operating system, the linker does not try to link the C runtime and our build succeeds without any linker errors.
Looking Ahead
- This is the approach that we will use for building our OS kernel.
- Instead of `thumbv7em-none-eabi`, we will use a custom target that describes an `x86_64` bare metal environment.
  - I don’t know how this will work for the Apple Silicon folks.
- But I am excited to find out!
- The details will be explained next week.
Summary
A minimal freestanding Rust executable looks like this:
- Code
src/main.rs
- Configuration
- We made no changes to this during the lab.
To build
To build this binary, we need to compile for a bare metal target such as thumbv7em-none-eabi:
$ cargo build --target thumbv7em-none-eabi
Alternatively, we can compile it for Linux by passing additional linker arguments to rustc:
$ cargo rustc -- -C link-arg=-nostartfiles
To run
- If you build, you can run it!
$ cargo rustc -- -C link-arg=-nostartfiles
Compiling osirs v0.1.0 (/home/user/tmp/32)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.14s
user@cd-desk:~/tmp/32$ ./target/debug/osirs
Segmentation fault (core dumped)
- Wait a minute!
RE: cursed
- Infinite recursion under the C calling convention segfaults: each call pushes a new stack frame until the stack overflows.
- Switch to `loop` to get an infinite loop.
- These two things are equally bad in my view.
- Pop the pin
Comments
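- On an `x86_64` Linux system, the `host` triple is `x86_64-unknown-linux-gnu`, which includes:
  - the CPU architecture (`x86_64`),
  - the vendor (`unknown`) - It’s Intel #Portland
  - the operating system (`linux`),
  - the ABI (`gnu`).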