CS3984 Computer Systems in Rust



Exercise 2

Submission

Submission deadline: 11:59 pm, Sunday, September 29th 2024.

Submission method: Run the included submit.py script in your root exercise repository. You will need Python and Requests installed. If you do not have Python and/or Requests, you can either:

  • Install Python and/or Requests
  • Copy your assignment solution (the exercises folder) and the submit script to rlogin and run the submit script there

Part 1

Summary

  1. Complete the second set of rustlings exercises.
  2. Implement a small program that concatenates a combination of given files and/or its standard input stream to its standard output stream.

Details

  1. Clone the exercise repository.

  2. In the part-1-rustlings folder, complete the given rustlings exercises similar to Exercise 1.

    Note: You are not allowed to modify the exercise in any way that is not allowed by the exercise. For example, you are allowed to modify the tests only if required by the exercise. You should also follow all instructions given by the exercise. Failure to follow instructions will result in a penalty.

  3. In the part-1-concatenate folder, implement concatenate.rs according to the specification below.

Specification

You will implement a single-file Rust program.

  • When run without arguments, the program should copy the content of its standard input stream to its standard output stream.

    “Standard input” and “standard output” are standard streams that are set up by a control program that starts your program (often, the control program is a shell).

  • When run with arguments, the program should process the arguments in order.

    Each argument should be treated as the name of a file, unless the argument is a single hyphen (-). Each file should be opened and its content written to the standard output stream, in the order in which the files are listed on the command line. If the argument is a single hyphen (-), the standard input stream should be read instead. You may assume that at most one - is provided as part of your program’s arguments.

    If any of the files whose names are given on the command line do not exist, your program should exit with a failure.
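The argument rules above can be sketched as a small planning step. This is only an illustration: the `Input` enum and the `plan_inputs` function are hypothetical names that are not part of the exercise, and opening the files (and exiting with a failure when one does not exist) is deliberately left out.

```rust
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;

/// One source of input named on the command line (hypothetical helper type).
#[derive(Debug, PartialEq)]
enum Input {
    Stdin,
    File(PathBuf),
}

/// Map the arguments (program name already removed) to inputs, keeping their order.
/// An empty argument list means "copy standard input".
fn plan_inputs<I>(args: I) -> Vec<Input>
where
    I: IntoIterator<Item = OsString>,
{
    let mut plan: Vec<Input> = args
        .into_iter()
        .map(|arg| {
            if arg == "-" {
                Input::Stdin
            } else {
                Input::File(PathBuf::from(arg))
            }
        })
        .collect();
    if plan.is_empty() {
        plan.push(Input::Stdin);
    }
    plan
}

fn main() {
    // args_os() is used so that file names that are not valid UTF-8 are still accepted.
    for input in plan_inputs(env::args_os().skip(1)) {
        match input {
            Input::Stdin => println!("would read standard input"),
            Input::File(path) => println!("would open {}", path.display()),
        }
    }
}
```

Running the sketch with the arguments a.txt - b.txt reports the file, then standard input, then the second file, which is the order in which the content must reach standard output.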

Your program must meet the following guidelines:

  • You are allowed to use the Rust standard library.

  • You are not allowed to handle standard input differently from regular files by providing a separate code path.

    This means your program should define a generic function that performs the reading and writing, then call the function multiple times as needed. (Hint: See the std::io::Read trait)

  • You should buffer the reading and writing of content in order to reduce the number of system calls your program makes. Additionally, you are not allowed to assume your program can buffer the entire content of a file/standard input in memory. The autograder will run your program under a suitable timeout and memory limit that is designed to eliminate submissions that lack buffering.

  • Your program must not interpret the content of the streams in any way. Specifically, your program should treat any content read or written as raw bytes, and not assume that the input represents valid characters in any encoding.
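One way to read the guidelines above is a single function that is generic over std::io::Read and std::io::Write and moves raw bytes through a fixed-size buffer. The sketch below is one possible shape, not the required solution: the name `copy_stream` and the 32 KiB buffer size are assumptions made for illustration.

```rust
use std::io::{self, ErrorKind, Read, Write};

/// Buffer size is an assumption; any reasonably large fixed size bounds memory use.
const BUF_SIZE: usize = 32 * 1024;

/// Copy all bytes from `input` to `output` without interpreting them.
/// Returns the number of bytes copied.
fn copy_stream<R: Read, W: Write>(mut input: R, output: &mut W) -> io::Result<u64> {
    let mut buf = [0u8; BUF_SIZE];
    let mut total = 0u64;
    loop {
        let n = match input.read(&mut buf) {
            Ok(0) => return Ok(total), // end of input
            Ok(n) => n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue, // retry
            Err(e) => return Err(e),
        };
        // write_all loops until the whole slice has been written.
        output.write_all(&buf[..n])?;
        total += n as u64;
    }
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    // Two different reader types go through the same code path.
    copy_stream(&b"hello, "[..], &mut out)?;
    copy_stream(io::Cursor::new(b"world\n".to_vec()), &mut out)?;
    out.flush()
}
```

The demonstration uses in-memory readers so that it runs without any input. Because std::fs::File and the lock returned by io::stdin().lock() also implement Read, the same function can be called on them without a separate code path.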

Testing

You should compile your concatenate.rs file using the Rust compiler, rustc.

Note: The included Cargo.toml is for Rust development extension purposes and should be ignored for Exercise 2 Part 1.

You can then test your program by running the provided test-concatenate.sh script. This script can be run on any Linux-based system, but we will test your final submission on rlogin, so rlogin is the most suitable location to test your code.

You may read the contents of the script to see how your program is tested. You should be able to replicate the commands used in the script to manually test your program.

Part 2

Note: You will use the nix crate in Part 2. Specifically, you will use a fork of the nix crate with adjustments for this class. The fork is already specified in the manifests of the base code. You should visit https://rust.cs.vt.edu/docs/nix/ for documentation, and not the official nix documentation at docs.rs.

Summary

  1. Implement concatenate using the syscall abstractions that nix provides instead of the Rust standard library.
  2. Implement fastcat, an optimized concatenate program that uses the splice system call to move data between two file descriptors without a round trip to user space.
  3. Run a basic benchmark using hyperfine to compare the performance of the two binaries.

Details

  1. Pull the latest base code from the exercise repository. This should give you a new folder, part-2-concatenate, where you will implement Part 2.

  2. In the folder, you will implement a package with two binary crates:

    • slowcat, which uses the crate root main.rs
    • fastcat, which uses the crate root bin/fastcat.rs

    Carefully study the specification for both slowcat and fastcat below.

  3. Benchmark your program according to the benchmark section below.

Specification: Slowcat

You will implement a binary crate that performs identically to concatenate.rs in Part 1, but with the following additional restrictions:

Your program should also follow the guidelines for separate code paths, buffering, and reinterpreting content outlined in Part 1.

(Hint for your abstraction function: See what types nix::unistd::read and nix::unistd::write take as input)

Specification: Fastcat

You will implement a binary crate that performs identically to concatenate.rs in Part 1, but with the following additional restrictions:

Your program should also follow the guidelines for separate code paths, buffering, and reinterpreting content outlined in Part 1.

(Hint for your abstraction function: See what types nix::fcntl::splice takes as input)

Motivation

The read(2) system call reads a number of bytes from the specified file descriptor into the buffer provided by the userspace program. This means when the system call is called, the kernel copies data read from the file descriptor in kernel space into the userspace buffer. When the write(2) system call is called, the kernel then copies data from the given userspace buffer into kernel space.

Here is a subset of the output I get when running strace on my concatenate binary in Part 2:

$ strace ./target/release/concatenate Cargo.toml > /dev/null
...
read(3, "[package]\nname = \"concatenate\"\nv"..., 32768) = 199
write(1, "[package]\nname = \"concatenate\"\nv"..., 199) = 199
read(3, "", 32768)                      = 0
...


This roundtrip from kernel space -> user space -> kernel space is redundant if the program does not need to access the contents being read and written. Therefore, a potential optimization is to perform the copying of bytes entirely within kernel space.

One such system call that provides a zero-copy functionality is the splice(2) system call.

splice() moves data between two file descriptors without copying between kernel address space and user address space. It transfers up to len bytes of data from the file descriptor fd_in to the file descriptor fd_out, where one of the file descriptors must refer to a pipe.

The astute among you might notice that we cannot use splice(2) for our concatenate program. This is because the concatenate program must work regardless of the type of input and output. However, splice requires one of the file descriptors to be a pipe.

Fortunately, we can still utilize splice(2) by using a pipe as an intermediate buffer. We can request that the kernel splice the contents of our input (which may or may not be a pipe) into the write end of the pipe, then request that the kernel splice from the read end of the pipe to the output (in our case, stdout, which again may or may not be a pipe).

Here’s an illustrative diagram:

Now, splice(2) has other restrictions regarding fd_in and fd_out (see the man page), but we are only concerned with the simple case shown in the diagram above. Using this method to copy data between two file descriptors avoids the kernel -> user space -> kernel roundtrip, while also opening up optimization opportunities the kernel can perform with pipes (if you recall from the lecture).
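The two-step splice through a pipe can be sketched as follows. This is a rough illustration under several assumptions: it targets Linux only, it declares pipe and splice by hand against libc instead of using the class fork of nix (whose exact types are documented at the URL given earlier), it does not retry on EINTR, and the names `splice_once`, `splice_copy`, and `CHUNK` are hypothetical.

```rust
use std::ffi::{c_int, c_uint};
use std::fs::{self, File};
use std::io;
use std::os::fd::{AsRawFd, FromRawFd, OwnedFd, RawFd};
use std::ptr;

// Hand-written declarations of two libc functions (assumption: Linux).
unsafe extern "C" {
    fn pipe(fds: *mut c_int) -> c_int;
    fn splice(
        fd_in: c_int,
        off_in: *mut i64,
        fd_out: c_int,
        off_out: *mut i64,
        len: usize,
        flags: c_uint,
    ) -> isize;
}

/// Bytes requested per splice call; 64 KiB is the typical default pipe capacity.
const CHUNK: usize = 64 * 1024;

/// One splice(2) call that uses each descriptor's current file offset.
fn splice_once(from: RawFd, to: RawFd, len: usize) -> io::Result<usize> {
    // SAFETY: null offsets are permitted; the kernel validates the descriptors.
    let n = unsafe { splice(from, ptr::null_mut(), to, ptr::null_mut(), len, 0) };
    if n < 0 {
        Err(io::Error::last_os_error())
    } else {
        Ok(n as usize)
    }
}

/// Move everything from `input` to `output` through an intermediate pipe.
fn splice_copy<I: AsRawFd, O: AsRawFd>(input: &I, output: &O) -> io::Result<u64> {
    let mut fds: [c_int; 2] = [0; 2];
    // SAFETY: `fds` is a valid, writable array of two ints.
    if unsafe { pipe(fds.as_mut_ptr()) } < 0 {
        return Err(io::Error::last_os_error());
    }
    // SAFETY: pipe() just returned these descriptors and nothing else owns them.
    // OwnedFd closes them when they go out of scope.
    let (pipe_read, pipe_write) =
        unsafe { (OwnedFd::from_raw_fd(fds[0]), OwnedFd::from_raw_fd(fds[1])) };

    let mut total = 0u64;
    loop {
        // Step 1: input -> write end of the pipe.
        let filled = splice_once(input.as_raw_fd(), pipe_write.as_raw_fd(), CHUNK)?;
        if filled == 0 {
            return Ok(total); // end of input
        }
        // Step 2: read end of the pipe -> output; this may take several calls.
        let mut pending = filled;
        while pending > 0 {
            let moved = splice_once(pipe_read.as_raw_fd(), output.as_raw_fd(), pending)?;
            if moved == 0 {
                return Err(io::Error::new(io::ErrorKind::WriteZero, "splice moved no data"));
            }
            pending -= moved;
        }
        total += filled as u64;
    }
}

fn main() -> io::Result<()> {
    // Demonstration with two scratch files (hypothetical names) in the temp directory.
    let src = std::env::temp_dir().join("splice-demo-in");
    let dst = std::env::temp_dir().join("splice-demo-out");
    fs::write(&src, b"moved without a user-space buffer\n")?;
    let copied = splice_copy(&File::open(&src)?, &File::create(&dst)?)?;
    println!("spliced {copied} bytes");
    Ok(())
}
```

Note that no byte buffer appears anywhere in the program: the pipe is the only buffer, and it lives in the kernel. Draining the pipe completely before refilling it keeps the first splice from blocking on a full pipe.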

Testing

You should compile your package using cargo.

You can then test the compiled binaries by running the provided test-concatenate.sh script in the part-2-concatenate directory. This script can be run on any Linux-based system, but we will test your final submission on rlogin, so rlogin is the most suitable location to test your code.

You may read the contents of the script to see how your program is tested. You should be able to replicate the commands used in the script to manually test your program.

Benchmark

You will compare the performance of the slowcat and fastcat binaries created and produce a writeup called writeup.md in the part-2-concatenate directory. The performance metric you will measure is the execution time of the programs.

You will use the command-line tool hyperfine to perform the benchmark. A local copy of the tool on rlogin is available at ~cs3214/bin/hyperfine. Read the output of hyperfine --help or the README of the project to learn how to use the tool.

Your writeup should contain at least the following:

  • Statistics from the execution of slowcat and fastcat for inputs of at least 3 different sizes.
  • The relative speedup of fastcat over slowcat for the input sizes chosen.
  • An explanation of the speedup observed in each of your results.
  • Documentation on your benchmarking process:
    • What input sizes did you choose? Why?
    • What commands did you run?

Note: There is no one “right answer” for benchmarking. Benchmarking is notoriously hard to do accurately, and benchmark results are normally not indicative of real-world performance.

Here are a few things to keep in mind when performing your benchmark:

  • For large input sizes*, you should observe a speedup of fastcat over slowcat! Re-evaluate your implementation otherwise.
  • How is the performance of your programs affected by the input type? (e.g. $ generate_large_input | fastcat versus $ fastcat large_input_file)

*How large is something for you to figure out…