Questions about your whole codebase

The C++ observer pattern is hard in Rust. What to do?

The C++ observer pattern usually means that there are broadcasters sending messages to consumers:

flowchart TB
    broadcaster_a[Broadcaster A]
    broadcaster_b[Broadcaster B]
    consumer_a[Consumer A]
    consumer_b[Consumer B]
    consumer_c[Consumer C]
    broadcaster_a --> consumer_a
    broadcaster_b --> consumer_a
    broadcaster_a --> consumer_b
    broadcaster_b --> consumer_b
    broadcaster_a --> consumer_c
    broadcaster_b --> consumer_c

The broadcasters maintain lists of consumers, and the consumers act in response to messages (often mutating their own state).

This doesn't translate directly to Rust, because it requires the broadcasters to hold mutable references to the consumers, and Rust allows only one mutable reference to a value at a time.

What do you do?

Option 1: make everything runtime-checked

Each of your consumers could become an Rc<RefCell<T>> or, if you need thread-safety, an Arc<RwLock<T>>.

The Rc or Arc allows broadcasters to share ownership of a consumer. The RefCell or RwLock allows each broadcaster to acquire a mutable reference to a consumer when it needs to send a message.

This example shows how, in Rust, you may independently choose reference counting or interior mutability. In this case we need both.
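As a minimal single-threaded sketch of this option (the `Consumer` and `Broadcaster` types and their method names are hypothetical, invented for illustration), two broadcasters can share one consumer and each mutate it in turn:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A consumer that mutates its own state when it receives a message.
#[derive(Default)]
struct Consumer {
    received: Vec<String>,
}

impl Consumer {
    fn on_message(&mut self, message: &str) {
        self.received.push(message.to_owned());
    }
}

/// A broadcaster that shares ownership of its consumers.
#[derive(Default)]
struct Broadcaster {
    consumers: Vec<Rc<RefCell<Consumer>>>,
}

impl Broadcaster {
    fn subscribe(&mut self, consumer: Rc<RefCell<Consumer>>) {
        self.consumers.push(consumer);
    }

    fn broadcast(&self, message: &str) {
        for consumer in &self.consumers {
            // borrow_mut() is checked at runtime: it panics if the consumer
            // is already borrowed elsewhere.
            consumer.borrow_mut().on_message(message);
        }
    }
}

/// Two broadcasters deliver to one shared consumer; returns what it received.
fn shared_consumer_demo() -> Vec<String> {
    let consumer = Rc::new(RefCell::new(Consumer::default()));
    let mut a = Broadcaster::default();
    let mut b = Broadcaster::default();
    a.subscribe(Rc::clone(&consumer));
    b.subscribe(Rc::clone(&consumer));
    a.broadcast("from A");
    b.broadcast("from B");
    let received = consumer.borrow().received.clone();
    received
}

fn main() {
    println!("{:?}", shared_consumer_demo());
}
```

For the thread-safe variant, the same shape should carry over with `Arc<RwLock<Consumer>>` and `write()` in place of `borrow_mut()`.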

Just like typical reference counting in C++, Rc and Arc have the option to provide a weak pointer, so the lifetime of each consumer doesn't need to be extended unnecessarily. As an aside, it would be nice if Rust had an Rc-like type which enforces exactly one owner and multiple weak pointers. Rc could be wrapped quite easily to do this.

Reference counting is frowned upon in C++ because it's expensive. But, in Rust, not so much:
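A sketch of the weak-pointer variant, assuming a hypothetical `WeakBroadcaster` that stores `Weak` handles and forgets consumers once their owner has dropped them:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

#[derive(Default)]
struct Counter {
    count: u32,
}

/// Holds only weak pointers, so it never keeps a consumer alive.
#[derive(Default)]
struct WeakBroadcaster {
    consumers: Vec<Weak<RefCell<Counter>>>,
}

impl WeakBroadcaster {
    fn subscribe(&mut self, consumer: &Rc<RefCell<Counter>>) {
        self.consumers.push(Rc::downgrade(consumer));
    }

    /// Notifies every live consumer, drops dead entries, and returns the
    /// number of consumers that were notified.
    fn broadcast(&mut self) -> usize {
        let mut notified = 0;
        self.consumers.retain(|weak| match weak.upgrade() {
            Some(consumer) => {
                consumer.borrow_mut().count += 1;
                notified += 1;
                true
            }
            // The owner dropped this consumer; stop tracking it.
            None => false,
        });
        notified
    }
}

fn weak_demo() -> (usize, usize, u32) {
    let long_lived = Rc::new(RefCell::new(Counter::default()));
    let short_lived = Rc::new(RefCell::new(Counter::default()));
    let mut broadcaster = WeakBroadcaster::default();
    broadcaster.subscribe(&long_lived);
    broadcaster.subscribe(&short_lived);

    let first = broadcaster.broadcast();
    drop(short_lived); // freed immediately: the broadcaster holds no strong count
    let second = broadcaster.broadcast();

    let count = long_lived.borrow().count;
    (first, second, count)
}

fn main() {
    println!("{:?}", weak_demo());
}
```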

  • Few objects are reference counted; the majority of objects are owned statically.
  • Even when objects are reference counted, those counts are rarely incremented and decremented because you can (and do) pass around &Rc<RefCell<T>> most of the time. In C++, the "copy by default" mode means it's much more common to increment and decrement reference counts.

In fact, the compile-time guarantees might cause you to do less reference counting than C++:

In Servo there is a reference count but far fewer objects are reference counted than in the rest of Firefox, because you don’t need to be paranoid - MG

However: Rust does not prevent reference cycles, although they're only possible if you're using both reference counting and interior mutability.

Option 2: drive the objects from the code, not the other way round

In C++, it's common to have all behavior within classes. Those classes are the total behavior of the system, and so they must interact with one another. The observer pattern is common.

flowchart TB
    broadcaster_a[Broadcaster A]
    consumer_a[Consumer A]
    consumer_b[Consumer B]
    broadcaster_a -- observer --> consumer_a
    broadcaster_a -- observer --> consumer_b

In Rust, it's more common to have some external function which drives overall behavior.

flowchart TB
    main(Main)
    broadcaster_a[Broadcaster A]
    consumer_a[Consumer A]
    consumer_b[Consumer B]
    main --1--> broadcaster_a
    broadcaster_a --2--> main
    main --3--> consumer_a
    main --4--> consumer_b

With this sort of design, it's relatively straightforward to take some output from one object and pass it into another object, with no need for the objects to interact at all.
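A small sketch of that shape, using hypothetical `Thermometer`, `Screen` and `Alarm` types: the `run` function plays the role of Main in the diagram, pulling output from the producer and handing it to each consumer, so none of the objects holds a reference to another.

```rust
use std::collections::VecDeque;

/// Produces readings but knows nothing about who consumes them.
struct Thermometer {
    pending: VecDeque<i32>,
}

impl Thermometer {
    fn poll(&mut self) -> Option<i32> {
        self.pending.pop_front()
    }
}

#[derive(Default)]
struct Screen {
    shown: Vec<i32>,
}

impl Screen {
    fn show(&mut self, reading: i32) {
        self.shown.push(reading);
    }
}

#[derive(Default)]
struct Alarm {
    triggered: bool,
}

impl Alarm {
    fn check(&mut self, reading: i32) {
        if reading > 30 {
            self.triggered = true;
        }
    }
}

/// The external driver: it alone borrows each object mutably, one at a time.
fn run(thermometer: &mut Thermometer, screen: &mut Screen, alarm: &mut Alarm) {
    while let Some(reading) = thermometer.poll() {
        screen.show(reading);
        alarm.check(reading);
    }
}

fn driver_demo() -> (Vec<i32>, bool) {
    let mut thermometer = Thermometer {
        pending: VecDeque::from(vec![18, 22, 35]),
    };
    let mut screen = Screen::default();
    let mut alarm = Alarm::default();
    run(&mut thermometer, &mut screen, &mut alarm);
    (screen.shown, alarm.triggered)
}

fn main() {
    println!("{:?}", driver_demo());
}
```

No reference counting or interior mutability is needed here, because every borrow is short and visible to the compiler.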

In the most extreme case, this becomes the Entity-Component-System architecture used in game design.

Game developers seem to have completely solved this problem - we can learn from them. - MY

Option 3: use channels

The observer pattern is a way to decouple large, single-threaded C++ codebases. But if you're trying to decouple a codebase in Rust, perhaps you should assume multi-threading by default? Rust has built-in channels (std::sync::mpsc), and the crossbeam crate provides multi-producer, multi-consumer channels.

I'm a Rustacean, we assume massively parallel unless told otherwise :) - MG
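As a rough sketch using only the standard library's `std::sync::mpsc` channel (multi-producer, single-consumer), two broadcaster threads can send to one consumer without holding any reference to it. The message format is made up for the example.

```rust
use std::sync::mpsc;
use std::thread;

fn channel_demo() -> Vec<String> {
    let (sender, receiver) = mpsc::channel::<String>();

    // Each broadcaster gets its own clone of the sending half.
    let mut handles = Vec::new();
    for name in ["A", "B"] {
        let sender = sender.clone();
        handles.push(thread::spawn(move || {
            for i in 0..2 {
                sender
                    .send(format!("{name}{i}"))
                    .expect("receiver should still be alive");
            }
        }));
    }
    // Drop the original sender so the channel closes once the threads finish.
    drop(sender);

    // The consumer: iterates until every sender has been dropped.
    let mut received: Vec<String> = receiver.iter().collect();

    for handle in handles {
        handle.join().expect("broadcaster thread panicked");
    }

    // Arrival order across threads is not deterministic, so sort for display.
    received.sort();
    received
}

fn main() {
    println!("{:?}", channel_demo());
}
```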

That's all very well, but I have an existing C++ object broadcasting events. How exactly should I observe it?

If your Rust object is a consumer of events from some pre-existing C++ producer, all the above options remain possible.

  • You can make your object reference counted and have C++ own such a reference (potentially a weak reference)
  • C++ can deliver the message into a general message bucket. An external function reads messages from that bucket and invokes the Rust object that should handle it. This means the reference counting doesn't need to extend to the Rust objects outside that boundary layer.
  • You can have a shim object which converts the C++ callback into some message and injects it into a channel-based world.
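The shim approach in the last bullet might look roughly like the sketch below. Everything here is an assumption made for illustration: the `Event` type, the `CallbackShim` name, and the callback signature (an opaque context pointer plus a `u32`) stand in for whatever your C++ producer actually provides. The C++ side is simulated by calling the callback directly from Rust.

```rust
use std::ffi::c_void;
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Debug, PartialEq)]
enum Event {
    VolumeChanged(u32),
}

/// Owns the sending half of the channel at a stable heap address.
struct CallbackShim {
    sender: Box<Sender<Event>>,
}

impl CallbackShim {
    fn new() -> (Self, Receiver<Event>) {
        let (sender, receiver) = channel();
        (CallbackShim { sender: Box::new(sender) }, receiver)
    }

    /// Opaque pointer to register with C++ alongside `on_volume_changed`.
    fn context(&self) -> *mut c_void {
        let sender: &Sender<Event> = &self.sender;
        sender as *const Sender<Event> as *mut c_void
    }
}

/// Callback with a C ABI (hypothetical signature). It converts the call into
/// a message and injects it into the channel.
extern "C" fn on_volume_changed(context: *mut c_void, level: u32) {
    // SAFETY: `context` must come from `CallbackShim::context`, and that
    // shim must still be alive when the callback fires.
    let sender = unsafe { &*(context as *const Sender<Event>) };
    // Ignore the error if the receiving side has already gone away.
    let _ = sender.send(Event::VolumeChanged(level));
}

fn shim_demo() -> Vec<Event> {
    let (shim, receiver) = CallbackShim::new();
    let context = shim.context();

    // Stand-in for the C++ producer invoking the registered callback.
    on_volume_changed(context, 3);
    on_volume_changed(context, 7);

    drop(shim); // closes the channel so the iterator below terminates
    receiver.iter().collect()
}

fn main() {
    println!("{:?}", shim_demo());
}
```

In a real binding the shim would have to be unregistered from C++ before it is dropped; that step is omitted here.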

Some of my C++ objects have shared mutable state. How can I make them safe in Rust?

You're going to have to do something with interior mutability: either RefCell<T> or its multithreaded equivalent, RwLock<T>.

You have three decisions to make:

  1. Will only Rust code access this particular instance of this object, or might C++ access it too?
  2. If both C++ and Rust may access the object, how do you avoid conflicts?
  3. How should Rust code react if the object is not available, because something else is using it?

If only Rust code can use this particular instance of shared state, then simply wrap it in RefCell<T> (single-threaded) or RwLock<T> (multi-threaded). Build a wrapper type such that callers aren't able to access the object directly, but instead only via the lock type.
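One possible shape for such a wrapper, sketched with hypothetical `Stats` and `SharedStats` types: the `RwLock` is a private field, so callers can only reach the data through methods that take the lock.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

#[derive(Default)]
struct Stats {
    hits: u64,
}

/// Cloneable handle to the shared state. The lock is private, so the only
/// way to touch `Stats` is through the methods below.
#[derive(Clone, Default)]
struct SharedStats {
    inner: Arc<RwLock<Stats>>,
}

impl SharedStats {
    fn record_hit(&self) {
        self.inner.write().expect("lock poisoned").hits += 1;
    }

    fn hits(&self) -> u64 {
        self.inner.read().expect("lock poisoned").hits
    }
}

fn shared_stats_demo() -> u64 {
    let stats = SharedStats::default();

    let workers: Vec<_> = (0..4)
        .map(|_| {
            let stats = stats.clone();
            thread::spawn(move || {
                for _ in 0..100 {
                    stats.record_hit();
                }
            })
        })
        .collect();

    for worker in workers {
        worker.join().expect("worker thread panicked");
    }
    stats.hits()
}

fn main() {
    println!("{}", shared_stats_demo());
}
```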

If C++ also needs to access this particular instance of the shared state, it's more complex. There are presumably some invariants regarding use of this data in C++ - otherwise it would crash all the time. Perhaps the data can be used only from one thread, or perhaps it can only be used with a given mutex held. Your goal is to translate those invariants into an idiomatic Rust API that can be checked (ideally) at compile-time, and (failing that) at runtime.

For example, imagine:

class SharedMutableGoat {
public:
    void eat_grass(); // mutates tummy state
};

std::mutex lock;
SharedMutableGoat* billy; // only access when owning lock

Your idiomatic Rust wrapper might be:

// Stand-in for the real FFI bindings, stubbed out so the example compiles.
mod ffi {
    #[allow(non_camel_case_types)]
    pub struct lock_guard;
    pub fn claim_lock() -> lock_guard { lock_guard }
    pub fn eat_grass() {}
    pub fn release_lock(_lock: &mut lock_guard) {}
}

struct SharedMutableGoatLock {
    lock: ffi::lock_guard, // owns a std::lock_guard<std::mutex> somehow
}

// Claims the lock, returns a new SharedMutableGoatLock
fn lock_shared_mutable_goat() -> SharedMutableGoatLock {
    SharedMutableGoatLock { lock: ffi::claim_lock() }
}

impl SharedMutableGoatLock {
    fn eat_grass(&mut self) {
        ffi::eat_grass(); // Acts on the global goat
    }
}

// Releases the lock when the wrapper goes out of scope
impl Drop for SharedMutableGoatLock {
    fn drop(&mut self) {
        ffi::release_lock(&mut self.lock);
    }
}

fn main() {
    let mut billy = lock_shared_mutable_goat(); // lock claimed here
    billy.eat_grass();
    // lock released here, when billy is dropped
}

Obviously, lots of permutations are possible, but the goal is to ensure that it's simply compile-time impossible to act on the global state unless appropriate preconditions are met.

The final decision is how to react if the object is not available. This decision can apply with C++ mutexes or with Rust locks (for example RwLock<T>). As in C++, the two major options are:

  • Block until the object becomes available.
  • Try to lock, and if the object is not available, do something else.

There can be a third option if you're using async Rust. If the data isn't available, you may be able to return to your event loop using an async version of the lock, such as the async mutexes provided by Tokio and async-std.
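The first two options can be sketched with `std::sync::Mutex`; the function names and the `u32` "tummy" state are invented for the example.

```rust
use std::sync::{Mutex, TryLockError};

/// Option 1: block until the goat's tummy is available.
fn feed_blocking(tummy: &Mutex<u32>) {
    let mut grass = tummy.lock().expect("lock poisoned");
    *grass += 1;
}

/// Option 2: try to lock, and report failure instead of waiting.
fn feed_if_free(tummy: &Mutex<u32>) -> bool {
    match tummy.try_lock() {
        Ok(mut grass) => {
            *grass += 1;
            true
        }
        // Somebody else holds the lock: the caller can go and do other work.
        Err(TryLockError::WouldBlock) => false,
        Err(TryLockError::Poisoned(e)) => panic!("lock poisoned: {e}"),
    }
}

fn locking_demo() -> (bool, bool, u32) {
    let tummy = Mutex::new(0);
    feed_blocking(&tummy);

    // While a guard is alive, try_lock reports that the lock is taken.
    let while_held = {
        let _guard = tummy.lock().expect("lock poisoned");
        feed_if_free(&tummy)
    };
    // Once the guard is gone, the same call succeeds.
    let when_free = feed_if_free(&tummy);

    let total = *tummy.lock().expect("lock poisoned");
    (while_held, when_free, total)
}

fn main() {
    println!("{:?}", locking_demo());
}
```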

How do I do a singleton?

Use OnceCell.
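To keep this sketch free of external dependencies, it uses `std::sync::OnceLock`, the standard library's thread-safe counterpart to the once_cell crate's cell types, rather than `OnceCell` itself. The `Config` type and its contents are hypothetical.

```rust
use std::sync::OnceLock;

struct Config {
    name: String,
}

/// Returns the process-wide Config, creating it on first use.
fn config() -> &'static Config {
    static CONFIG: OnceLock<Config> = OnceLock::new();
    // The closure runs at most once, even if several threads race here.
    CONFIG.get_or_init(|| Config {
        name: String::from("goat-farm"),
    })
}

fn main() {
    println!("{}", config().name);
}
```

Every call to `config()` returns a reference to the same instance.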

What's the best way to retrofit Rust's parallelism benefits to an existing codebase?

When parallelizing an existing codebase, first check that all existing types are correctly Send and Sync. Generally, though, you should try to avoid implementing these yourself - instead use pre-existing wrapper types which enforce the correct contract (for example, RwLock).
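One lightweight way to perform that check is a helper whose trait bounds make the compiler do the work. This is a sketch: `assert_send_sync` and `Herd` are hypothetical names.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Compiles only if T is both Send and Sync; does nothing at runtime.
fn assert_send_sync<T: Send + Sync>() {}

struct Herd {
    names: Vec<String>,
}

fn check_thread_safety() {
    assert_send_sync::<Herd>();
    assert_send_sync::<Arc<RwLock<Herd>>>();
    // The following line would fail to compile, because Rc is neither
    // Send nor Sync:
    // assert_send_sync::<std::rc::Rc<Herd>>();
}

/// Because Arc<RwLock<Herd>> is Send, it can be moved into another thread.
fn count_on_another_thread(herd: Arc<RwLock<Herd>>) -> usize {
    thread::spawn(move || {
        let guard = herd.read().expect("lock poisoned");
        guard.names.len()
    })
    .join()
    .expect("thread panicked")
}

fn main() {
    check_thread_safety();
    let herd = Arc::new(RwLock::new(Herd {
        names: vec![String::from("billy"), String::from("nanny")],
    }));
    println!("{}", count_on_another_thread(herd));
}
```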

After that:

If you can solve your problem by throwing Rayon at it, do. It’s magic - MG

If your task is CPU-bound, Rayon solves this handily. - MY

Rayon offers parallel constructs - for example parallel iterators - which can readily be retrofitted to an existing codebase. It also allows you to create and join tasks. Using Rayon can help simplify your code and eliminate lots of manual scheduling logic.

If your tasks are IO-bound, then you may need to look into async Rust, but that's hard to pull into an existing codebase.

What's the best way to architect a new codebase for parallelism?

In brief, like in other languages, you have a choice of architectures:

  • Message-passing, using event loops which listen on a channel, receive Send data and pass it on.
  • More traditional multithreading using Sync data structures such as mutexes (and perhaps Rayon).

There's probably a bias towards message-passing, and that's probably well-informed by its extensibility. - MG
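A minimal sketch of the message-passing style, assuming a hypothetical `Command` enum: one thread owns the state outright and runs an event loop over a channel, so no lock is needed.

```rust
use std::sync::mpsc;
use std::thread;

/// Messages understood by the event loop.
enum Command {
    Add(i64),
    Reset,
    Shutdown,
}

/// The event loop: owns `total` and reacts to commands one at a time.
fn run_accumulator(commands: mpsc::Receiver<Command>, results: mpsc::Sender<i64>) {
    let mut total = 0;
    for command in commands {
        match command {
            Command::Add(n) => total += n,
            Command::Reset => total = 0,
            Command::Shutdown => break,
        }
    }
    // Report the final state; ignore the error if nobody is listening.
    let _ = results.send(total);
}

fn event_loop_demo() -> i64 {
    let (command_tx, command_rx) = mpsc::channel();
    let (result_tx, result_rx) = mpsc::channel();
    let worker = thread::spawn(move || run_accumulator(command_rx, result_tx));

    for command in [
        Command::Add(5),
        Command::Reset,
        Command::Add(2),
        Command::Add(3),
        Command::Shutdown,
    ] {
        command_tx.send(command).expect("event loop should be running");
    }

    let total = result_rx.recv().expect("event loop should report a result");
    worker.join().expect("event loop thread panicked");
    total
}

fn main() {
    println!("{}", event_loop_demo());
}
```

Adding a new behavior means adding a `Command` variant and a match arm, which is one reading of the extensibility point in the quote above.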

I need a list of nodes which can refer to one another. How?

You can't easily do self-referential data structures in Rust. The usual workaround is to use an arena and replace references from one node to another with node IDs.

An arena is typically a Vec (or similar), and the node IDs are a newtype wrapper around a simple integer index.
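A minimal sketch of this layout, with hypothetical `Graph`, `Node` and `NodeId` types. Nodes refer to one another (and even to themselves) by ID, so no Rust reference ever points from one node to another.

```rust
/// Newtype around an index into the arena.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NodeId(usize);

struct Node {
    name: String,
    neighbors: Vec<NodeId>,
}

/// The arena: a Vec that owns every node.
#[derive(Default)]
struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    fn add(&mut self, name: &str) -> NodeId {
        self.nodes.push(Node {
            name: name.to_owned(),
            neighbors: Vec::new(),
        });
        NodeId(self.nodes.len() - 1)
    }

    fn link(&mut self, from: NodeId, to: NodeId) {
        self.nodes[from.0].neighbors.push(to);
    }

    fn neighbor_names(&self, id: NodeId) -> Vec<&str> {
        self.nodes[id.0]
            .neighbors
            .iter()
            .map(|neighbor| self.nodes[neighbor.0].name.as_str())
            .collect()
    }
}

fn arena_demo() -> Vec<String> {
    let mut graph = Graph::default();
    let a = graph.add("a");
    let b = graph.add("b");
    graph.link(a, b);
    graph.link(b, a); // a cycle, which is awkward with plain references
    graph.link(a, a); // a self-reference

    let names = graph.neighbor_names(a);
    names.into_iter().map(String::from).collect()
}

fn main() {
    println!("{:?}", arena_demo());
}
```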

Obviously, Rust doesn't check that your node IDs are valid. If you don't have proper references, what stops you from having stale IDs?

Arenas are often purely additive, which means that you can add entries but not delete them. If you must have an arena which deletes things, then use generational IDs; see the generational-arena crate and this RustConf keynote for more details.
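To illustrate the idea behind generational IDs, here is a hand-rolled sketch. It is not the generational-arena crate's API; `GenArena`, `Slot` and `Id` are invented for the example. Each slot carries a generation counter that is bumped on removal, so a stale ID no longer matches once its slot is reused.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Id {
    index: usize,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

struct GenArena<T> {
    slots: Vec<Slot<T>>,
    free: Vec<usize>, // indices of empty slots available for reuse
}

impl<T> GenArena<T> {
    fn new() -> Self {
        GenArena { slots: Vec::new(), free: Vec::new() }
    }

    fn insert(&mut self, value: T) -> Id {
        if let Some(index) = self.free.pop() {
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            Id { index, generation: slot.generation }
        } else {
            self.slots.push(Slot { generation: 0, value: Some(value) });
            Id { index: self.slots.len() - 1, generation: 0 }
        }
    }

    fn remove(&mut self, id: Id) -> Option<T> {
        let slot = self.slots.get_mut(id.index)?;
        if slot.generation != id.generation {
            return None; // stale ID
        }
        let value = slot.value.take()?;
        slot.generation += 1; // invalidates every outstanding Id for this slot
        self.free.push(id.index);
        Some(value)
    }

    fn get(&self, id: Id) -> Option<&T> {
        let slot = self.slots.get(id.index)?;
        if slot.generation == id.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}

fn generational_demo() -> (bool, Option<&'static str>, bool) {
    let mut arena = GenArena::new();
    let old = arena.insert("billy");
    arena.remove(old);
    let new = arena.insert("nanny"); // reuses the slot that billy occupied

    let stale_rejected = arena.get(old).is_none();
    let current = arena.get(new).copied();
    (stale_rejected, current, old.index == new.index)
}

fn main() {
    println!("{:?}", generational_demo());
}
```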

If arenas still sound like a nasty workaround, consider that you might choose an arena anyway for other reasons:

  • All of the objects in the arena will be freed at the end of the arena's lifetime, instead of during their manipulation, which can give very low latency for some use-cases. Bumpalo formalizes this.
  • The rest of your program might have real Rust references into the arena. You can give the arena a named lifetime ('arena for example), making the provenance of those references very clear.

Should I have a few big crates or lots of small ones?

In the past, it was recommended to have small crates to get optimal build time. Incremental builds generally make this unnecessary now. You should arrange your crates optimally for your semantic needs.

What crates should everyone know about?

| Crate | Description |
| --- | --- |
| rayon | parallelizing |
| serde | serializing and deserializing |
| crossbeam | all sorts of parallelism tools |
| itertools | makes it slightly more pleasant to work with iterators. (For instance, if you want to join an iterator of strings, you can just go ahead and do that, without needing to collect the strings into a Vec first) |
| petgraph | graph data structures |
| slotmap | arena-like key-value map |
| nom | parsing |
| clap | command-line parsing |
| regex | err, regular expressions |
| ring | the leading crypto library |
| nalgebra | linear algebra |
| once_cell | complex static data |

How should I call C++ functions from Rust and vice versa?

Use cxx.

Oh, you want a justification? In that case, here's the history which brought us to this point.

From the beginning, Rust supported calling C functions using extern "C", #[repr(C)] and #[no_mangle]. Such callable C functions had to be declared manually in Rust:

sequenceDiagram
   Rust-->>extern: unsafe Rust function call
   extern-->>C: call from Rust to C
   participant extern as Rust unsafe extern "C" fn
   participant C as Existing C function

bindgen was invented to generate these declarations automatically from existing C/C++ header files. It has grown to understand an astonishingly wide variety of C++ constructs, but its generated bindings are still unsafe functions with lots of pointers involved.

sequenceDiagram
   Rust-->>extern: unsafe Rust function call
   extern-->>C: call from Rust to C++
   participant extern as Bindgen generated bindings
   participant C as Existing C++ function

Interacting with bindgen-generated bindings requires unsafe Rust; you will likely have to manually craft idiomatic safe Rust wrappers. This is time-consuming and error-prone.

cxx automates a lot of that process. Unlike bindgen it doesn't learn about functions from existing C++ headers. Instead, you specify cross-language interfaces in a Rust-like interface definition language (IDL) within your Rust file. cxx generates both C++ and Rust code from that IDL, marshaling data behind the scenes on both sides such that you can use standard language features in your code. For example, you'll find idiomatic Rust wrappers for std::string and std::unique_ptr and idiomatic C++ wrappers for a Rust slice.

sequenceDiagram
   Rust-->>rsbindings: safe idiomatic Rust function call
   rsbindings-->>cxxbindings: hidden C ABI call using marshaled data
   cxxbindings-->>cpp: call to standard idiomatic C++
   participant rsbindings as cxx-generated Rust code
   participant cxxbindings as cxx-generated C++ code
   participant cpp as C++ function using STL types

In the bindgen case even more work goes into wrapping idiomatic C++ signatures into something bindgen compatible: unique ptrs to raw ptrs, Drop impls on the Rust side, translating string types ... etc. The typical real-world binding we've converted from bindgen to cxx in my codebase has been -500 lines (mostly unsafe code) +300 lines (mostly safe code; IDL included). - DT

The greatest benefit is that cxx sufficiently understands C++ STL object ownership norms that the generated bindings can be used from safe Rust code.

At present, there is no established solution which combines the idiomatic, safe interoperability offered by cxx with the automatic generation offered by bindgen. It's not clear whether this is even possible but several projects are aiming in this direction.

I'm getting a lot of binary bloat.

In Rust you have a free choice between impl Trait and dyn Trait. impl Trait tends to be the default, and it can result in large binaries because the compiler generates a separate copy of the code for each concrete type. If you have this problem, consider using dyn Trait. Other options include the 'thin template pattern': for example, in serde_json the code to read from a string and from a slice would otherwise be duplicated entirely, but instead one delegates to the other and requests slightly different behavior.
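The three approaches can be compared in a short sketch. The function names are hypothetical, and the thin-template example is a generic illustration of the pattern rather than serde_json's actual code.

```rust
use std::fmt::Display;

/// impl Trait: the compiler emits a separate copy for every concrete type
/// this is called with.
fn describe_static(item: impl Display) -> String {
    format!("<{item}>")
}

/// dyn Trait: a single copy, with the call dispatched through a vtable.
fn describe_dyn(item: &dyn Display) -> String {
    format!("<{item}>")
}

/// Thin template: the generic outer function only converts its argument,
/// then delegates to a non-generic inner function that holds the real logic.
/// Only the tiny outer shim is duplicated per type.
fn shout<S: AsRef<str>>(text: S) -> String {
    fn inner(text: &str) -> String {
        let mut out = text.to_uppercase();
        out.push('!');
        out
    }
    inner(text.as_ref())
}

fn main() {
    println!("{}", describe_static(42));
    println!("{}", describe_dyn(&"goat"));
    println!("{}", shout("baa"));
    println!("{}", shout(String::from("baa")));
}
```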