CS3984 Computer Systems in Rust



String and &str

  • UTF-8 encoded sequence of bytes
  • String stores a pointer to its content, a length, and a capacity; &str stores only a pointer and a length
    • No null terminator!
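  • Can be fallibly converted from byte arrays in UTF-8, UTF-16-BE, UTF-16-LE (the conversion returns a Result and fails on invalid input)
  • Can be lossily converted from byte arrays in UTF-8, UTF-16-BE, UTF-16-LE (invalid sequences are replaced with U+FFFD, the replacement character)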
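
The difference between the fallible and the lossy conversion can be sketched as follows. This is a minimal illustration limited to UTF-8 input; the helper names decode_strict and decode_lossy are hypothetical and exist only for this example.

```rust
// Hypothetical helper: strict conversion, reporting invalid UTF-8 as None.
fn decode_strict(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

// Hypothetical helper: lossy conversion, replacing invalid sequences with U+FFFD.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    let valid = vec![0x68, 0x69]; // the bytes of "hi"
    let invalid = vec![0x68, 0xFF, 0x69]; // 0xFF never occurs in valid UTF-8

    println!("{:?}", decode_strict(valid));
    println!("{:?}", decode_strict(invalid.clone()));
    println!("{}", decode_lossy(&invalid));

    // len() counts bytes, not characters, and no terminator byte is stored.
    let s = String::from("héllo");
    println!("len = {}, chars = {}", s.len(), s.chars().count());
    assert!(s.capacity() >= s.len());
}
```

Note that len() on "héllo" reports 6 rather than 5, because é occupies two bytes in UTF-8.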

Aside (Unicode):

Cow<str>

enum CowStr<'borrow> {
  Borrowed(&'borrow str),
  Owned(String)
}


  • Implements Deref<Target = str>, so the immutable methods of str can be called directly on a Cow<str>.
  • The to_mut method returns a mutable reference to the owned String, first cloning the borrowed data if the Cow is still Borrowed.

Cow<str> (2)

use std::borrow::Cow;

fn main() {
  let mut s: Cow<'static, str> = Cow::from("hello!");
  // s is a Cow::Borrowed, pointing to a string literal
  println!("{}", s.len());

  s.to_mut().make_ascii_uppercase();
  // s is now a Cow::Owned, containing an owned String
  println!("{}", s);
}
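
A common use of Cow<str> is as a return type for functions that only sometimes need to allocate. The sketch below assumes a hypothetical helper, underscore_spaces, invented for this example; it borrows the input when nothing changes and allocates a new String only when a replacement is needed.

```rust
use std::borrow::Cow;

// Hypothetical helper: replaces spaces with underscores,
// allocating only when the input actually contains a space.
fn underscore_spaces(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    let unchanged = underscore_spaces("hello");
    let changed = underscore_spaces("hello world");

    // Both values can be used as a &str through Deref.
    println!("{} borrowed: {}", unchanged, matches!(unchanged, Cow::Borrowed(_)));
    println!("{} owned: {}", changed, matches!(changed, Cow::Owned(_)));
}
```

Callers that only read the result never need to know which variant they received.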


OsString and &OsStr

  • Platform native strings
    • On Unix systems, an arbitrary sequence of non-zero bytes

Notably, an OsString is not guaranteed to be valid UTF-8.

pub fn into_string(self) -> Result<String, OsString>


  • An &OsStr is to OsString what &str is to String.

  • Rust library functions interfacing with the system usually return OsString/OsStr:

  • Example: Path::file_name

pub fn file_name(&self) -> Option<&OsStr>
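
Because an &OsStr may not be valid UTF-8, converting it to &str is a fallible step. The following sketch shows one way to handle that; the helper file_name_utf8 and the example path are hypothetical and chosen only for illustration.

```rust
use std::ffi::{OsStr, OsString};
use std::path::Path;

// Hypothetical helper: returns the final path component as UTF-8 text,
// or None if the path has no file name or the name is not valid UTF-8.
fn file_name_utf8(path: &Path) -> Option<&str> {
    path.file_name().and_then(OsStr::to_str)
}

fn main() {
    let path = Path::new("/tmp/report.txt");
    println!("{:?}", path.file_name());
    println!("{:?}", file_name_utf8(path));

    // into_string consumes the OsString and hands it back unchanged on failure.
    let os: OsString = OsString::from("hello");
    match os.into_string() {
        Ok(s) => println!("valid UTF-8: {}", s),
        Err(original) => println!("not UTF-8: {}", original.to_string_lossy()),
    }
}
```

When an approximate rendering is acceptable (for example in log messages), to_string_lossy avoids the failure case by substituting U+FFFD for invalid sequences.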


CString and &CStr

  • C-compatible strings
  • Nul-terminated
  • No nul bytes in the middle

Notably, not every String is a valid CString, since a valid UTF-8 string can contain interior nul characters (U+0000).

pub fn new<T>(t: T) -> Result<CString, NulError>
where
    T: Into<Vec<u8>>,


  • A &CStr is to CString what &str is to String.
  • Functions that interact with the C ABI usually take or return CString/&CStr.
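
The nul-byte rules above can be observed directly with CString::new. This is a minimal sketch; the helper c_len is hypothetical and only illustrates borrowing a CString as a &CStr.

```rust
use std::ffi::{CStr, CString};

// Hypothetical helper: byte length of a C string, excluding the nul terminator.
fn c_len(s: &CStr) -> usize {
    s.to_bytes().len()
}

fn main() {
    // CString::new appends the terminating nul byte for us.
    let ok = CString::new("hello").expect("no interior nul bytes");
    println!("content bytes: {}", c_len(&ok));
    println!("bytes including terminator: {}", ok.as_bytes_with_nul().len());

    // A valid Rust string with an interior nul cannot become a CString.
    let err = CString::new("he\0llo").unwrap_err();
    println!("interior nul at byte {}", err.nul_position());
}
```

Passing &ok where a &CStr is expected works through Deref, in the same way that a &String can be passed where a &str is expected.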