What is the course about?
This is a class on Computer Systems.
The course adopts the perspective of a programmer using computer systems, rather than a designer of operating systems.
What languages are systems applications written in?
Databases
PostgreSQL
SQLite
MariaDB
MySQL
Networking
OpenSSH
nginx
curl
Nmap
Why C and C++ in systems programming?
Primary reason: performance.
C and C++:
- Compile directly to machine code
- Allow direct control over memory allocation
- Allow direct access to hardware and memory
- Interoperate with other low-level code such as assembly language
By contrast, languages that run on a virtual machine:
- Compile to bytecode that has to be interpreted*
- Have garbage-collected memory
- Gate platform access behind the virtual machine
* In practice, the bytecode is just-in-time (JIT) compiled to native code and then executed.
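But people write web apps in JavaScript, and those applications serve thousands of concurrent users!
Much of modern programming is abstractions built upon abstractions, and the layers underneath (language runtimes, databases, operating systems) are still written in lower-level languages.
Lower-level languages are still needed!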
Why not C and C++ in systems programming?
Primary reason: memory unsafety.
Memory safety is a property of programming languages that prevents bugs related to memory access. These bugs include buffer overflows, use-after-free errors, and data races.
Memory unsafety causes:
- 70% of high/critical vulnerabilities in Google’s Chromium
- 70% of common vulnerabilities and exposures (CVEs) from Microsoft
- ~94% of high/critical bugs in Mozilla software
- 67% of zero-day vulnerabilities from Google’s Project Zero
White House Press Release
July 2024 CrowdStrike Incident
- Cybersecurity company CrowdStrike distributed a faulty update for its Falcon sensor software.
- The update caused machines to enter a boot loop or boot into recovery mode, and many required manual repair.
- Roughly 8.5 million systems crashed, with worldwide financial damages estimated at US$10 billion or more.
Cause? Memory safety error.
…Sensors that received the new version of Channel File 291 carrying the problematic content were exposed to a latent out-of-bounds read issue in the Content Interpreter. At the next IPC notification from the operating system, the new IPC Template Instances were evaluated, specifying a comparison against the 21st input value. The Content Interpreter expected only 20 values. Therefore, the attempt to access the 21st value produced an out-of-bounds memory read beyond the end of the input data array and resulted in a system crash.
- Sensor input is read into an array
- The array access is not bounds checked
- The template specifies a comparison against the 21st input value, but the Content Interpreter expected only 20 values
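The failure mode in the bullets above can be sketched in Rust. This is a minimal illustration, not CrowdStrike's code: `INPUT_COUNT` and `read_input` are hypothetical names, and the 20-element array only stands in for the interpreter's input data. It shows that `get` on a slice returns `None` for an out-of-range index instead of reading past the end of the array.

```rust
/// Hypothetical: the interpreter is handed exactly 20 input values.
const INPUT_COUNT: usize = 20;

/// Hypothetical helper: fetch the input at `index`, or `None` if it is out of range.
fn read_input(inputs: &[u32], index: usize) -> Option<u32> {
    // `get` performs the bounds check that the unchecked C-style access skips.
    inputs.get(index).copied()
}

fn main() {
    let inputs = [0u32; INPUT_COUNT];

    // The 20th value (index 19) exists.
    assert_eq!(read_input(&inputs, 19), Some(0));

    // The 21st value (index 20) does not, so the caller can reject the template.
    match read_input(&inputs, 20) {
        Some(value) => println!("comparing against {value}"),
        None => println!("template refers to a missing input, rejecting it"),
    }
}
```

Plain indexing (`inputs[20]`) with a run-time index would also be checked: it panics rather than reading out of bounds.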
Why Rust?
Let’s look at marketing from the Rust programming language:
Performance
- Compilation to native code with no runtime
  - Allows running on embedded devices
  - Allows interop with other languages
- Lack of garbage collection
  - Provides predictable performance
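One way to see what "no garbage collection" means in practice: Rust frees a value at the point where its owner goes out of scope, so the moment of cleanup is visible in the source. The sketch below is only an illustration; the `Buffer` type and its event log are hypothetical and exist to record the order in which things happen.

```rust
use std::cell::RefCell;

/// Hypothetical resource that records when it is freed.
struct Buffer<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Buffer<'_> {
    // Runs at the exact point where the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("freed {}", self.name));
    }
}

/// Returns the events in the order they occurred.
fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Buffer { name: "outer", log: &log };
        {
            let _inner = Buffer { name: "inner", log: &log };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_inner` is dropped here
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_outer` is dropped here
    log.into_inner()
}

fn main() {
    for event in run() {
        println!("{event}");
    }
}
```

Each "freed" event follows immediately after its scope ends, with no collector deciding when to run.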
Case Study: Discord
- Implementing a critical data structure (used in the Elixir backend) in Rust improved performance by 820x in the best case and 42,500x in the worst case (source).
- Rewriting the “Read States” service from Go to Rust improved performance in every metric, including latency, CPU, and memory (source).
Note: Go is purple, Rust is blue
Reliability
- Buffer overflows
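    /* C: compiles, but reading array[1000] is undefined behavior. */
    #include <stdio.h>

    int main(void) {
        int array[] = { 1, 2, 3, 4, 5 };
        printf("%d\n", array[1000]);
    }

    // Rust: rejected at compile time, because index 1000 is out of bounds
    // for an array of length 5.
    fn main() {
        let array: [u32; 5] = [1, 2, 3, 4, 5];
        println!("{}", array[1000]);
    }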
- Dangling pointer
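    /* C: compiles (usually with a warning), but the returned pointer
       refers to a local variable that no longer exists. */
    #include <stdio.h>

    int* return_pointer() {
        int x = 5;
        return &x;
    }

    int main(void) {
        int* x = return_pointer();
        printf("%d\n", *x);  /* undefined behavior: dangling pointer */
    }

    // Rust: rejected at compile time (missing lifetime specifier),
    // because the returned reference has nothing to borrow from.
    fn return_reference() -> &i32 {
        let x = 5;
        &x
    }

    fn main() {
        let x: &i32 = return_reference();
        println!("{}", *x);
    }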
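For contrast with the rejected `return_reference` above, here is a sketch of two versions that do compile. The function names `return_value` and `return_boxed` are hypothetical; the idea is that ownership of the value moves to the caller, so there is never a reference to a dead stack frame.

```rust
/// Return the value itself: the integer is copied out to the caller.
fn return_value() -> i32 {
    let x = 5;
    x
}

/// Return a heap allocation: the `Box` owns the integer, and ownership
/// of the `Box` moves to the caller, who frees it when the `Box` is dropped.
fn return_boxed() -> Box<i32> {
    Box::new(5)
}

fn main() {
    let a = return_value();
    let b = return_boxed();
    println!("{} {}", a, *b);
}
```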
Undefined Behavior in C
- Result of running a program that violates the language specification.
- There are no restrictions on the behavior of the program.
- Implementations are not required to diagnose undefined behavior.
When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose
Spot The Overflow
    #include <string.h>

    void copy_packet(char *packet_data, int packet_len) {
        char buffer[128];
        int bytes_to_copy = packet_len;
        if (bytes_to_copy < 128) {
            strncpy(buffer, packet_data, bytes_to_copy);
        }
    }
Source: Stanford CS110L
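One issue in the C above is that `packet_len` is a signed `int`: a negative length passes the `< 128` check and is then converted to a very large unsigned `size_t` when handed to `strncpy`. The Rust sketch below is a hypothetical counterpart, not a translation endorsed by the source; `BUFFER_SIZE` and the `Option<Vec<u8>>` return type are assumptions made for illustration. It makes the signed-to-unsigned conversion explicit and fallible.

```rust
use std::convert::TryFrom;

const BUFFER_SIZE: usize = 128;

/// Hypothetical Rust counterpart of `copy_packet`. Returns the copied bytes,
/// or `None` if the length is negative, too large, or longer than the packet.
fn copy_packet(packet_data: &[u8], packet_len: i32) -> Option<Vec<u8>> {
    // `try_from` rejects negative lengths instead of silently converting them.
    let len = usize::try_from(packet_len).ok()?;
    if len >= BUFFER_SIZE {
        return None;
    }
    let mut buffer = [0u8; BUFFER_SIZE];
    // `get(..len)` is `None` when the packet holds fewer than `len` bytes.
    let source = packet_data.get(..len)?;
    buffer[..len].copy_from_slice(source);
    Some(buffer[..len].to_vec())
}

fn main() {
    // An unchecked cast shows what the implicit C conversion does to -1.
    assert_eq!(-1i32 as usize, usize::MAX);

    assert_eq!(copy_packet(b"hello", 5), Some(b"hello".to_vec()));
    assert_eq!(copy_packet(b"hello", -1), None);
    println!("negative and oversized lengths are rejected");
}
```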
Spot The Problem
    for (size_t i = 0; i < container.size() - 1; i++) {
        // Access element in container at index `i`
        container[i];
    }
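The loop above misbehaves when the container is empty: `size()` is unsigned, so `size() - 1` wraps around to the largest representable value and the loop indexes far out of bounds. The sketch below reproduces that arithmetic in Rust with explicit method calls; the helper names `wrapping_bound` and `saturating_bound` are hypothetical.

```rust
/// The loop bound as unsigned wrapping arithmetic computes it.
fn wrapping_bound(len: usize) -> usize {
    len.wrapping_sub(1)
}

/// A bound that clamps at zero instead of wrapping around.
fn saturating_bound(len: usize) -> usize {
    len.saturating_sub(1)
}

fn main() {
    let container: Vec<i32> = Vec::new();

    // For an empty container the wrapped bound is enormous.
    assert_eq!(wrapping_bound(container.len()), usize::MAX);
    // `checked_sub` reports the problem instead of hiding it.
    assert_eq!(container.len().checked_sub(1), None);

    // With the clamped bound the loop body never runs for an empty container.
    let mut visited = 0;
    for i in 0..saturating_bound(container.len()) {
        let _ = container[i];
        visited += 1;
    }
    assert_eq!(visited, 0);
    println!("visited {visited} elements");
}
```

In Rust, a plain `container.len() - 1` on an empty vector panics in debug builds and wraps in release builds, which is why the explicit `checked_`, `wrapping_`, and `saturating_` methods exist.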
Tricky C
    #include <stdio.h>

    int main(void) {
        unsigned char one = 1;
        unsigned char max = 255;
        unsigned char sum = one + max;
        if (sum == one + max) {
            printf("sum = one + max and sum == one + max");
        } else {
            printf("sum = one + max but sum != one + max");
        }
    }
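The C program takes the `else` branch: integer promotion evaluates `one + max` as an `int` (256), while storing the result in an `unsigned char` truncates it to 0. Rust has no implicit promotion, so each of those behaviors has to be requested by name. The sketch below illustrates this; `truncated_sum` and `widened_sum` are hypothetical helper names.

```rust
/// What storing the sum in an 8-bit variable does: wrap around modulo 256.
fn truncated_sum(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// What C's integer promotion does: widen both operands before adding.
fn widened_sum(a: u8, b: u8) -> u32 {
    u32::from(a) + u32::from(b)
}

fn main() {
    let one: u8 = 1;
    let max: u8 = 255;

    assert_eq!(truncated_sum(one, max), 0);
    assert_eq!(widened_sum(one, max), 256);
    // `checked_add` reports that the sum does not fit in a `u8`.
    assert_eq!(one.checked_add(max), None);

    println!("truncated = {}, widened = {}", truncated_sum(one, max), widened_sum(one, max));
}
```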
Undefined Behavior in Rust
- Safe Rust cannot cause Undefined Behavior.
- Unsafe Rust can cause Undefined Behavior.
- The `unsafe` keyword separates Safe and Unsafe Rust.
Safe and Unsafe Rust
    #include <stdint.h>

    int32_t add(int32_t a, int32_t b) {
        return a + b;
    }
Safe and Unsafe Rust (2)
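    // Declare the C function. Calling it requires `unsafe`, because the
    // compiler cannot check what the foreign code does.
    extern "C" {
        fn add(a: i32, b: i32) -> i32;
    }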
Safe and Unsafe Rust (3)
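    extern "C" {
        fn add(a: i32, b: i32) -> i32;
    }

    // A safe wrapper, but not yet a sound one: signed overflow in the
    // C `add` is undefined behavior, and nothing here prevents it.
    pub fn safe_add(a: i32, b: i32) -> i32 {
        unsafe { add(a, b) }
    }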
Safe and Unsafe Rust (4)
    extern "C" {
        fn add(a: i32, b: i32) -> i32;
    }

    pub fn safe_add(a: i32, b: i32) -> i32 {
        // The sum would exceed i32::MAX
        if a >= 0 && (b > i32::MAX - a) {
            return 0;
        }
        // The sum would fall below i32::MIN
        if a < 0 && (b < i32::MIN - a) {
            return 0;
        }
        // Both range checks passed, so the C addition cannot overflow.
        unsafe { add(a, b) }
    }
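A possible variation on the wrapper above: returning 0 on overflow cannot be told apart from a legitimate sum of 0, so a caller may prefer an `Option`. The sketch below is one way to express that, under stated assumptions. The `add` defined here is a hypothetical Rust stand-in for the C function so that the example builds without a C library, and `checked_add` replaces the two hand-written range checks.

```rust
/// Hypothetical stand-in for the C `add`, so the sketch links without C code.
/// It is marked `unsafe` to mimic calling a foreign function.
unsafe fn add(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Returns `Some(sum)` when `a + b` fits in an `i32`, otherwise `None`.
pub fn safe_add(a: i32, b: i32) -> Option<i32> {
    // `checked_add` performs both range checks; `?` returns `None` on overflow.
    a.checked_add(b)?;
    // The precondition holds: the sum is representable, so the call is sound.
    Some(unsafe { add(a, b) })
}

fn main() {
    assert_eq!(safe_add(2, 3), Some(5));
    assert_eq!(safe_add(i32::MAX, 1), None);
    assert_eq!(safe_add(i32::MIN, -1), None);
    println!("{:?}", safe_add(-5, 5));
}
```

The caller now has to handle the `None` case explicitly, which keeps the overflow from being mistaken for a valid result.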