CS3984 Computer Systems in Rust



Project 2: Simalloc

Also see the Help Document. Note that the help document is only a supplement to this specification, and not a replacement.

Table of Contents

Submission

Submission deadline: 11:59 pm, Sunday, November 24th 2024.

Submission method: The submission method will be the same as previous exercises, but the script has not been made available. An announcement will be made when the autograder is running and submissions can be made.

Introduction

Quickstart: Base code.

You will write a dynamic memory allocator built on top of a memory simulator in Rust, ie. your own version of malloc, free and realloc. You are encouraged to explore the design space creatively and implement an allocator that is correct, efficient, and fast.

Details

The bulk of your work done will be in src/allocator/solution.rs. You will implement at least the following four methods:

  • Allocator::new(memory) -> Self

    Before calling Allocator::malloc, Allocator::free, or Allocator::realloc, the driver program calls Allocator::new to any necessary initializations, such as allocating the initial heap area.

  • Allocator::malloc(size) -> MResult<Option<Pointer>>

    The Allocator::malloc routine returns a Pointer to an allocated block payload of at least size bytes. The entire allocated block should lie within the simulated heap region and should not overlap with any other allocated chunk.

    The Rust standard library allocator allows specifying alignments of powers of 2, but for simplification we will go with the standard C library (libc), whose malloc always returns payload pointers that are aligned to 16 bytes. Your malloc implementation should do likewise and always return 16-byte aligned pointers.

  • Allocator::free(ptr) -> MResult<()>

    The Allocator::free routine frees the block pointed to by ptr. It returns nothing on success. This routine is only guaranteed to work when the passed pointer (ptr) was returned by an earlier call to Allocator::malloc or Allocator::realloc and has not yet been freed.

  • Allocator::realloc(ptr, size) -> MResult<Option<Pointer>>

    The Allocator::realloc routine returns a pointer to an allocated region of at least size bytes with the following constraints:

    • if size is equal to zero, the call is equivalent to Allocator::free(ptr)

    • otherwise, ptr must have been returned by an earlier call to Allocator::malloc or Allocator::realloc. The call to Allocator::realloc changes the size of the memory block pointed to by ptr (the old block) to size bytes and returns the address of the new block. Notice that the address of the new block might be the same as the old block, or it might be different, depending on your implementation, the amount of internal fragmentation in the old block, and the size of the realloc request.

      The contents of the new block are the same as those of the old ptr block, up to the minimum of the old and new sizes. Note that the new size may be smaller than the old size. Everything else is uninitialized. For example, if the old block is 8 bytes and the new block is 12 bytes, then the first 8 bytes of the new block are identical to the first 8 bytes of the old block and the last 4 bytes are uninitialized. Similarly, if the old block is 8 bytes and the new block is 4 bytes, then the contents of the new block are identical to the first 4 bytes of the old block.

Memory Simulator

The memory simulator interface is documented in src/memory/mod.rs. The interface is specified as a trait to allow for a debuggable version of the simulated memory, as well as a barebones version used to test the correctness and performance of your code. You will not need to modify the Memory trait or any types implementing the trait.

  • You should propagate any errors returned from methods in the Memory interface. This means you should always bubble up MResult, for example with the question mark operator.
    • Any errors returned indicate a memory (or in some cases, semantic) error that indicate a bug in your allocator, so there is no semantically valid way to handle these errors.
    • You should not unwrap the errors to allow proper displaying of errors in the [allocator debugger] instead of crashing your program.

Memory Types

Access to the simulated memory is done primarily through two types: Pointer and TypedPointer<T>. These correspond to a void* and a T* (where T is any type) in the C programming language. Crucially, arithmetic done on Pointer values correspond to bytewise arithmetic, while arithmetic done on TypedPointer<T> values correspond to arithmetic on sizeof(T) bytes.

fn test(ptr: Pointer) {
    let mut u32_ptr = TypedPointer::<u32>::from_ptr(ptr);
    let raw_u32_ptr: Pointer = (u32_ptr + 1).raw();
    assert_eq!(raw_u32_ptr, ptr + 4);

    let mut u64_ptr = TypedPointer::<u64>::from_ptr(ptr);
    let raw_u64_ptr: Pointer = (u64_ptr + 1).raw();
    assert_eq!(raw_u64_ptr, ptr + 8);
}


Other types provided are BoundaryTag and Block, which are both used in the given implicit allocator. You should define your own types as needed to represent structures like a free block. Note that any types defined should derive the same traits as the other types shown above, as well as have the C data layout and ABI.

#[derive(Clone, Copy, Default, bytemuck::Pod, bytemuck::Zeroable)]
#[repr(C)]
struct MyCustomType {}


Testing

Note: The below examples use the provided implicit allocator, you will use Solution in place of Implicit to test your custom allocator.

You can test your allocator in two ways when running the default binary crate in the package:

$ cargo runUsage: simalloc [OPTIONS] <COMMAND>Commands:  tui   Run the TUI  test  Test your allocator with one or more trace files  help  Print this message or the help of the given subcommand(s)Options:  -v, --verbose  Show verbose output  -h, --help     Print help


  1. cargo run -- tui --help

    The tui subcommand launches a Text-based User Interface that will allow you to interact with your allocator. You can use it to manually run allocation and deallocation routines, or test your allocator step by step with a trace file.

    Note: You should run the tui subcommand in debug mode to surface debug logging.

    You are free to use the tracing crate included in the project to log messages that will be captured by the TUI driver. For example, tracing::debug!(?block_size) or tracing::warn!("No free block found").

    Sample commands:

    # Test the implicit allocator
    $ cargo run -- tui --allocator Implicit
    # Test the implicit allocator, loading a trace file for execution
    $ cargo run -- tui --trace-file ./traces/amptjp-bal.rep --allocator Implicit
    # Auto-run a list of commands on startup
    $ cargo run -- tui --cmd-file samples/sample_cmd_file.txt
    
    
    
  2. cargo run --release -- test --help

    The test subcommand runs automatic tests on your allocator with one or more trace files. Each trace file contains a sequence of allocate, reallocate, and free directions that instruct the driver to call your implemented allocator routines.

    Note: You should run the test subcommand in release mode to disable any logging utilities and speed up testing.

    Sample command to test the implicit allocator with all traces:

    $ cargo run --release -- test --allocator Implicit ./traces/
    
    
    

    The test subcommand will print out

    • Any errors found while testing your allocator

    • The space utilization of your allocator. The space utilization is the ratio between the peak aggregate amount of memory used by the driver (ie. allocated but not yet freed) and the size of the heap used by your allocator. The optimal ratio equals to 1 – in that case, the heap grew exactly as much as was needed to accommodate the aggregate amount of allocated memory when at its peak. You should find good policies to minimize fragmentation in order to make this ratio as close as possible to the optimum.

    • The performance of your allocator, measured as time taken to complete all trace operations. Note the performance calculations may be revised with an accompanying announcement.

Requirements and Hints

Requirements

  • You may not change any interfaces in the Memory trait.
  • Your allocator must always return pointers that are aligned to 16-byte boundaries. The driver will enforce this requirement for you.
  • You must not implement a pure implicit list allocator, but an example of such an allocator is provided in the starter code, and you may use it as a starting point.
  • Your allocator must not allocate any memory. This means that any metadata used for managing memory blocks must be done through the given Memory, and it also means that you are not able to use standard library data structures that allocate on the heap such as a Vec (not that you will need to).

Hints

  • Study the provided implicit list allocator. It will serve as not only a good structure for your allocator, but also as example on the usage of the simulated memory interfaces. Please ask on the forum if you have trouble understanding any implementation details.
  • Consider edge conditions. Consider the case that a block that is freed may not have a left or right neighbor. A possible strategy is to initialize your heap such that it will appear that there are always allocated “fence” blocks to the left and to the right, which means that the above case never arises. These fence block represent boundary tag headers that are marked as in use.
  • Consider small requests. Depending on which strategy you choose, you will need to round up small requests. Don’t just think about what happens when allocating a block, consider also what you’ll have to do when freeing this block. Freeing the block may include inserting the block into your free list or lists (or other data structure if you implement one), and thus it must be large enough to hold all link elements plus boundary tags (if used). You will need to consider this both when requesting more memory via sbrk and when splitting a block that may be too large.
  • Do your implementation in stages. We recommend that you start by getting your malloc and free routines working correctly and efficiently. Only then should you turn your attention to the realloc implementation. For starters, build realloc on top of your existing malloc and free implementations. But to get really good performance, you will need to build a stand-alone realloc. Similarly, splitting blocks is a performance optimization you can add separately, as are segregated lists.
  • Use asserts liberally. The debug_assert! macro allows you to document assertions you make about the code, while keeping them out of performance testing in release mode.
  • Start early! It is possible to write an efficient malloc package with a few pages of code, but it is dense and, at times, tricky code. So don’t procrastinate, and good luck!

Grading

The grade breakdown is as follows:

Points Description
45 Correctness
40 Performance
10 Documentation and Style
5 Git Usage
  • Correctness. The points are evenly divided between each tested trace, and awarded regardless of performance. You need to have implemented an allocator that is not the provided implicit list allocator.

  • Performance. The points are given based on the performance index, which is 0.6×average_utilization+0.4×allocator_performance0.6 \times \text{average\_utilization} + 0.4 \times \text{allocator\_performance}.

    • average_utilization\text{average\_utilization} is the average of the space utilizations of your allocator for each trace.
    • allocator_performance\text{allocator\_performance} is calculated by comparing the performance of your allocator with an optimized implementation on rlogin.

    Note: The grading command is equivalent to cargo run --release -- test --vary-size --allocator Solution ./traces/.

  • Documentation and style. Your code should be sufficiently documented. You should also place a header comment at the start of src/allocator/solution.rs that describes the structure of your free and allocated blocks, and the design of your allocator.

  • Git usage. You should make proper use of git. This includes the use of git as source control, and descriptive commit messages.

Logistics

You will use Git for managing your source code. Git is a distributed version control system in which every working directory contains a full repository, and thus the system can be used independently of a (centralized) repository server. Developers can commit changes to their local repository. However, in order to share their code with others, they must then push those commits to a remote repository. Your remote repository will be hosted on git.cs.vt.edu, which provides a facility to share this repository among group members. For further information on git in general you may browse the official Git documentation: http://git-scm.com/documentation, but feel free to ask questions on the forum as well! The use of git (or any distributed source code control system) may be new to some students, but it is a prerequisite skill for most programming related internships or jobs.

You will use a departmental instance of Gitlab for this class. You can access the instance with your SLO credentials at https://git.cs.vt.edu/.

The provided base code for the project is available on Gitlab at https://git.cs.vt.edu/cs3984-systems-rust/project-2-simalloc,

One team member should fork this repository by viewing this page and clicking the fork link. This will create a new repository for you with a copy of the contents. From there you must view your repository settings, and set the visibility level to private. On the settings page you may also invite your other team member to the project so that they can view and contribute.

Group members may then make a local copy of the repository by issuing a git clone <repository> command. The repository reference can be found on the project page such as git@git.cs.vt.edu:teammemberwhoclonedit/project-2-simalloc.git. To clone over SSH (which you may need to do on rlogin), you will have to add an SSH public key to your profile by visiting https://git.cs.vt.edu/profile/keys. This key is separate from the key you added to your ∼/.ssh/authorized_keys file. Although you could use the same key pair you use to log into rlogin, we recommend using a separate key pair. This way you can avoid storing the private key you use to access rlogin on rlogin itself.

If updates or bug fixes to this code are required, they will be announced on the forum. You will be required to use version control for this project. When working in a team, both team member should have a roughly equal number of committed lines of code to show their respective contributions.