String and &str
- UTF-8 encoded sequence of bytes
- A String holds a length, a capacity, and a pointer to the content; a &str holds only a pointer and a length
- No null terminator!
- Can be try-converted from byte arrays in UTF-8, UTF-16-BE, UTF-16-LE
- Can be lossy-converted from byte arrays in UTF-8, UTF-16-BE, UTF-16-LE
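The checked and lossy conversions listed above could be sketched roughly as follows. The helper names decode_checked and decode_lossy are made up for illustration. The UTF-16 part of this sketch uses String::from_utf16, which takes u16 code units rather than raw bytes.

```rust
/// Hypothetical helper: checked conversion, None if the bytes are not valid UTF-8.
fn decode_checked(bytes: &[u8]) -> Option<String> {
    String::from_utf8(bytes.to_vec()).ok()
}

/// Hypothetical helper: lossy conversion, invalid sequences become U+FFFD.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    let bad = [b'h', 0xFF, b'i']; // 0xFF can never appear in valid UTF-8

    // The checked conversion accepts valid input and rejects invalid input.
    assert_eq!(decode_checked(b"hi"), Some(String::from("hi")));
    assert_eq!(decode_checked(&bad), None);

    // The lossy conversion always succeeds, substituting the replacement character.
    assert_eq!(decode_lossy(&bad), "h\u{FFFD}i");

    // UTF-16: the same checked/lossy pair, operating on u16 code units.
    let units: Vec<u16> = "hi".encode_utf16().collect();
    assert_eq!(String::from_utf16(&units).unwrap(), "hi");
    let lone_surrogate = [0xD800u16];
    assert!(String::from_utf16(&lone_surrogate).is_err());
    assert_eq!(String::from_utf16_lossy(&lone_surrogate), "\u{FFFD}");

    // No null terminator: the length counts only the content bytes.
    assert_eq!("hi".len(), 2);
}
```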
Aside (Unicode):
Cow<str>
- A clone-on-write string.
enum CowStr<'borrow> {
    Borrowed(&'borrow str),
    Owned(String),
}
- Implements Deref<Target = str>, so the immutable &str methods can be called on it directly.
- The to_mut method returns a mutable reference to an owned value, cloning if necessary.
Cow<str> (2)
use std::borrow::Cow;

fn main() {
    let mut s: Cow<'static, str> = Cow::from("hello!");
    // s is a Cow::Borrowed, pointing to a string literal
    println!("{}", s.len());
    s.to_mut().make_ascii_uppercase();
    // s is now a Cow::Owned, containing an owned String
    println!("{}", s);
}
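A common use of Cow<str> is as a return type for functions that usually hand back their input unchanged and allocate only when they must. A minimal sketch, where underscore_spaces is a hypothetical helper invented for this example:

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces spaces with underscores,
/// allocating a new String only when the input contains a space.
fn underscore_spaces(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        // A modified copy is needed, so return an owned String.
        Cow::Owned(input.replace(' ', "_"))
    } else {
        // Nothing to change: borrow the input, no allocation.
        Cow::Borrowed(input)
    }
}

fn main() {
    let unchanged = underscore_spaces("hello");
    let changed = underscore_spaces("hello world");

    assert!(matches!(unchanged, Cow::Borrowed(_)));
    assert!(matches!(changed, Cow::Owned(_)));

    // Both variants deref to str, so callers can treat them uniformly.
    println!("{} {}", unchanged, changed);
}
```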
OsString and &OsStr
- Platform native strings
- On Unix systems, an arbitrary sequence of non-zero bytes
Notably, an OsString is not guaranteed to be valid UTF-8.
pub fn into_string(self) -> Result<String, OsString>
- An &OsStr is to OsString what &str is to String.
- Rust library functions interfacing with the system usually return OsString/OsStr:
- Example: Path::file_name
pub fn file_name(&self) -> Option<&OsStr>
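Because an &OsStr may not be valid UTF-8, getting a &str or String out of it is always a checked or lossy step. A rough sketch of the options, where file_name_utf8 is a hypothetical helper and the path is an arbitrary example:

```rust
use std::ffi::OsString;
use std::path::Path;

/// Hypothetical helper: the file name as a String, or None if the path
/// has no file name or the name is not valid UTF-8.
fn file_name_utf8(path: &Path) -> Option<String> {
    path.file_name()?.to_str().map(str::to_owned)
}

fn main() {
    let path = Path::new("/tmp/report.txt");
    let name = path.file_name().expect("path has a file name");

    // Checked borrow: Some only if the name is valid UTF-8.
    assert_eq!(name.to_str(), Some("report.txt"));

    // Lossy conversion: always succeeds, returning a Cow<str>.
    println!("{}", name.to_string_lossy());

    // Checked owned conversion: on failure the original OsString is handed back.
    let owned: OsString = name.to_os_string();
    match owned.into_string() {
        Ok(s) => println!("valid UTF-8: {}", s),
        Err(raw) => println!("not UTF-8: {:?}", raw),
    }

    assert_eq!(file_name_utf8(path), Some(String::from("report.txt")));
}
```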
CString and &CStr
- C-compatible strings
- Nul-terminated
- No nul bytes in the middle
Notably, not every String is a valid CString, since a UTF-8 string can contain nul characters in the middle.
pub fn new<T>(t: T) -> Result<CString, NulError>
where
    T: Into<Vec<u8>>,
- A &CStr is to CString what &str is to String
- Functions that interact with the C ABI usually utilize CString/CStr.
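The nul rules above can be seen in a small sketch. Here to_c_string is a hypothetical helper, and no real C function is called; the raw pointer is simply read back through CStr::from_ptr to stand in for what a C callee would see.

```rust
use std::ffi::{CStr, CString};

/// Hypothetical helper: converts to a CString, or None if the input
/// contains a nul byte in the middle.
fn to_c_string(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

fn main() {
    // The terminating nul is appended automatically.
    let c = to_c_string("hello").expect("no interior nul");
    assert_eq!(c.as_bytes_with_nul(), &b"hello\0"[..]);

    // An interior nul is rejected, because C would see a truncated string.
    assert!(to_c_string("he\0llo").is_none());

    // as_ptr yields the *const c_char a C function would receive.
    // The CString must outlive every use of this pointer.
    let ptr = c.as_ptr();

    // Reading the pointer back as a borrowed &CStr.
    // Safety: ptr points to a valid nul-terminated buffer owned by c, which is still alive.
    let back: &CStr = unsafe { CStr::from_ptr(ptr) };
    assert_eq!(back.to_str().unwrap(), "hello");
}
```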