What is the course about?
This is a class on Computer Systems.
The course adopts the perspective of a programmer using computer systems, rather than a designer of operating systems.
What languages are systems applications written in?
Databases
PostgreSQL
SQLite
MariaDB
MySQL
Networking
OpenSSH
nginx
Curl
Nmap
Why C and C++ in systems programming?
Primary reason: performance.
C and C++:
- Compiles directly to machine code
- Allows direct control over memory allocation
- Allows direct access to hardware and memory
- Interoperates with other low-level code like assembly language
By contrast, languages that run on a virtual machine:
- Compile to bytecode that has to be interpreted*
- Memory is garbage collected
- Platform access is gated by the virtual machine
* In practice, the bytecode is just-in-time (JIT) compiled to native code and then executed.
But people write web apps in JavaScript, and those applications serve thousands of concurrent users!
Much of modern programming is abstractions upon abstractions:
Lower-level languages are still needed!
Why not C and C++ in systems programming?
Primary reason: memory unsafety.
Memory safety is a property of programming languages that prevents bugs related to memory access. These include buffer overflows, use after free, and data races*.
Memory unsafety causes
- 70% of high/critical vulnerabilities in Google’s Chromium
- 70% of common vulnerabilities and exposures (CVEs) from Microsoft
- ~94% of high/critical bugs in Mozilla software
- 67% of zero-day vulnerabilities from Google’s Project Zero
White House Press Release
July 2024 CrowdStrike Incident
- Cybersecurity company CrowdStrike distributed a faulty update for its Falcon sensor software.
- The update caused machines to enter a boot loop or boot into recovery mode, and many required manual fixing.
- Roughly 8.5 million systems crashed, costing at least US$10 billion in financial damages worldwide.
Cause? Memory safety error.
…Sensors that received the new version of Channel File 291 carrying the problematic content were exposed to a latent out-of-bounds read issue in the Content Interpreter. At the next IPC notification from the operating system, the new IPC Template Instances were evaluated, specifying a comparison against the 21st input value. The Content Interpreter expected only 20 values. Therefore, the attempt to access the 21st value produced an out-of-bounds memory read beyond the end of the input data array and resulted in a system crash.
- Sensor input read into array
- Array access not bounds checked
- Template specifies a comparison against the 21st input value, but the Content Interpreter expected only 20 values
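The failure mode in the bullets above can be sketched in Rust. This is an illustration only, not CrowdStrike's code: the helper name `read_input` and the 20-element array are assumptions chosen to mirror the description. With a bounds-checked lookup, asking for the 21st value yields `None` instead of a read past the end of the array.

```rust
/// Hypothetical helper: fetch the input value at `index`, if it exists.
/// `slice::get` performs the bounds check and returns `None` rather than
/// reading beyond the end of the slice.
fn read_input(inputs: &[u32], index: usize) -> Option<u32> {
    inputs.get(index).copied()
}

fn main() {
    // The interpreter supplies 20 input values (indices 0 through 19).
    let inputs = [0u32; 20];

    // The template asks for the 21st value, which is index 20.
    match read_input(&inputs, 20) {
        Some(value) => println!("value = {value}"),
        None => println!("template refers to an input that does not exist"),
    }
}
```

The caller is forced to decide what to do in the `None` case, which is the point of the design: the missing value becomes an ordinary error path rather than a crash.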
Why Rust?
Let’s look at marketing from the Rust programming language:
Performance
- Compilation to native code with no runtime
  - Allows running on embedded devices
  - Allows interop with other languages
- Lack of garbage collection
  - Provides predictable performance
Case Study: Discord
Reliability
- Buffer overflows
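#include <stdio.h>

int main(void) {
    int array[] = { 1, 2, 3, 4, 5 };
    // Out-of-bounds read: undefined behavior in C.
    printf("%d\n", array[1000]);
}

fn main() {
    let array: [u32; 5] = [1, 2, 3, 4, 5];
    // Rust checks the index, so this access is rejected instead of reading stray memory.
    println!("{}", array[1000]);
}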
- Dangling pointer
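#include <stdio.h>

int* return_pointer() {
    int x = 5;
    // Returns the address of a local variable that dies when the function returns.
    return &x;
}

int main(void) {
    int* x = return_pointer();
    // Dereferencing the dangling pointer is undefined behavior.
    printf("%d\n", *x);
}

// Rust rejects the equivalent code at compile time:
// the returned reference has no value to borrow from.
fn return_reference() -> &i32 {
    let x = 5;
    &x
}

fn main() {
    let x: &i32 = return_reference();
    println!("{}", *x);
}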
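One way to get a compiling Rust version is to return the value itself rather than a reference to a local. A minimal sketch; the function name `return_value` is hypothetical and not from the slides:

```rust
/// Returns the integer by value, so ownership moves to the caller and
/// nothing refers to the callee's stack frame after the call returns.
fn return_value() -> i32 {
    let x = 5;
    x
}

fn main() {
    let x: i32 = return_value();
    println!("{}", x);
}
```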
Undefined Behavior in C
- Result of running a program that violates the language specification.
- There are no restrictions on the behavior of the program.
- Implementations are not required to diagnose undefined behavior.
When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose
Spot The Overflow
#include <string.h>

void copy_packet(char *packet_data, int packet_len) {
    char buffer[128];
    int bytes_to_copy = packet_len;
    if (bytes_to_copy < 128) {
        strncpy(buffer, packet_data, bytes_to_copy);
    }
}
Source: Stanford CS110L
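The issue in the C snippet is the signed length: a negative `packet_len` passes the `< 128` check and is then converted to a very large `size_t` when passed to `strncpy`. A sketch of a Rust counterpart follows, under the assumption that the packet arrives as a byte slice; the names `copy_packet` and `BUFFER_LEN` mirror the C code but the signature is hypothetical. A slice carries its own unsigned length, so a negative length cannot be expressed.

```rust
const BUFFER_LEN: usize = 128;

/// Hypothetical Rust counterpart of the C `copy_packet`.
/// Returns `None` when the packet does not fit, mirroring the `< 128` check.
fn copy_packet(packet_data: &[u8]) -> Option<[u8; BUFFER_LEN]> {
    if packet_data.len() >= BUFFER_LEN {
        return None;
    }
    let mut buffer = [0u8; BUFFER_LEN];
    // `copy_from_slice` panics if the two lengths differ, so the copy
    // can never write outside `buffer`.
    buffer[..packet_data.len()].copy_from_slice(packet_data);
    Some(buffer)
}

fn main() {
    let copied = copy_packet(&[1, 2, 3]).expect("packet fits");
    println!("{:?}", &copied[..3]);
}
```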
Spot The Problem
for (size_t i = 0; i < container.size() - 1; i++) {
    // Access element in container at index `i`
    container[i];
}
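Here `container.size()` is unsigned, so for an empty container `size() - 1` wraps around to the largest `size_t` value and the loop indexes far past the end. A sketch of the same "all but the last element" loop bound in Rust, assuming a hypothetical helper `all_but_last`; `saturating_sub` clamps the bound at zero instead of wrapping:

```rust
/// Hypothetical helper: collect every element except the last one.
fn all_but_last(items: &[i32]) -> Vec<i32> {
    // For an empty slice this yields 0 rather than wrapping to usize::MAX.
    let end = items.len().saturating_sub(1);
    items[..end].to_vec()
}

fn main() {
    // The wrap-around that the C++ loop hits, made explicit:
    println!("{}", 0usize.wrapping_sub(1) == usize::MAX);
    println!("{:?}", all_but_last(&[1, 2, 3]));
    println!("{:?}", all_but_last(&[]));
}
```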
Tricky C
#include <stdio.h>

int main(void) {
    unsigned char one = 1;
    unsigned char max = 255;
    unsigned char sum = one + max;
    if (sum == one + max) {
        printf("sum = one + max and sum == one + max");
    } else {
        printf("sum = one + max but sum != one + max");
    }
}
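In the C program, `one + max` is computed as an `int` (256) because of integer promotion, while `sum` holds that result truncated to `unsigned char` (0), so the comparison fails. Rust has no implicit integer promotion, so the two interpretations have to be written out explicitly. A sketch with hypothetical helper names `narrow_sum` and `wide_sum`:

```rust
/// Adds in 8 bits and wraps on overflow, like storing into `unsigned char`.
fn narrow_sum(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Widens both operands first, like C's promotion to `int`.
fn wide_sum(a: u8, b: u8) -> u32 {
    u32::from(a) + u32::from(b)
}

fn main() {
    let (one, max) = (1u8, 255u8);
    println!("narrow: {}", narrow_sum(one, max));
    println!("wide: {}", wide_sum(one, max));
    // `checked_add` reports the overflow instead of wrapping.
    println!("checked: {:?}", one.checked_add(max));
}
```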
Undefined Behavior in Rust
- Safe Rust cannot cause Undefined Behavior.
- Unsafe Rust can cause Undefined Behavior.
- The unsafe keyword separates Safe and Unsafe Rust.
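A minimal sketch of that boundary, using a hypothetical function `read_via_raw_pointer`: creating a raw pointer is allowed in Safe Rust, but dereferencing it compiles only inside an `unsafe` block, where the programmer takes over responsibility for validity.

```rust
fn read_via_raw_pointer(value: &i32) -> i32 {
    // Creating a raw pointer from a reference is a safe operation.
    let ptr: *const i32 = value;
    // SAFETY: `ptr` was derived from a live reference, so it is valid and aligned.
    unsafe { *ptr }
}

fn main() {
    let n = 42;
    println!("{}", read_via_raw_pointer(&n));
}
```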
Safe and Unsafe Rust
#include <stdint.h>

int32_t add(int32_t a, int32_t b) {
    return a + b;
}
Safe and Unsafe Rust (2)
extern "C" {
    fn add(a: i32, b: i32) -> i32;
}
Safe and Unsafe Rust (3)
extern "C" {
    fn add(a: i32, b: i32) -> i32;
}

pub fn safe_add(a: i32, b: i32) -> i32 {
    unsafe { add(a, b) }
}
Safe and Unsafe Rust (4)
extern "C" {
    fn add(a: i32, b: i32) -> i32;
}

pub fn safe_add(a: i32, b: i32) -> i32 {
    // Will overflow
    if a >= 0 && (b > i32::MAX - a) {
        return 0;
    }
    // Will underflow
    if a < 0 && (b < i32::MIN - a) {
        return 0;
    }
    unsafe { add(a, b) }
}
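The range checks matter because signed integer overflow is undefined behavior in C, so the wrapper must rule it out before calling `add`. For comparison, the same checks can be expressed with the standard library's `i32::checked_add`. The sketch below is an assumption-laden variant, not the slide's code: it drops the FFI call so that it is self-contained, the name `safe_add_checked` is hypothetical, and the fallback of 0 simply mirrors the wrapper above.

```rust
/// Hypothetical pure-Rust variant of `safe_add`.
/// `checked_add` returns `None` when the sum does not fit in an `i32`.
pub fn safe_add_checked(a: i32, b: i32) -> i32 {
    a.checked_add(b).unwrap_or(0)
}

fn main() {
    println!("{}", safe_add_checked(1, 2));
    println!("{}", safe_add_checked(i32::MAX, 1));
    println!("{}", safe_add_checked(i32::MIN, -1));
}
```

Returning `Option<i32>` directly instead of collapsing to 0 would let callers distinguish an overflow from a genuine sum of zero.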