I have the following (simplified) code that spawns a few threads that do long and complex operations to build a Transactions struct. This Transactions struct contains fields with Rc. At the end of the threads I want to return the computed Transactions struct to the calling thread through an mpsc::channel.

use std::thread;
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::rc::Rc;

#[derive(Debug)]
struct Transaction {
  id: String,
}

#[derive(Debug)]
struct Transactions {
  list: Vec<Rc<Transaction>>,
  index: HashMap<String, Rc<Transaction>>,
}

fn main() {
  
  let (tx, rx) = channel();
  
  for _ in 0..4 {
    // Each worker gets its own Sender; `let` shadows rather than reassigns the immutable `tx`.
    let tx = Sender::clone(&tx);
    thread::spawn(move || {
      // complex and long computation to build a Transactions struct
      let transactions = Transactions { list: Vec::new(), index: HashMap::new() };
      tx.send(transactions).unwrap();
    });
  }
  
  drop(tx);

  for transactions in rx {
    println!("Got: {:?}", transactions);
  }

}

The compiler complains that std::rc::Rc<Transaction> cannot be sent safely between threads because it does not implement the std::marker::Send trait.

error[E0277]: `std::rc::Rc<Transaction>` cannot be sent between threads safely
   --> src/main.rs:23:5
    |
23  |     thread::spawn(move || {
    |     ^^^^^^^^^^^^^ `std::rc::Rc<Transaction>` cannot be sent between threads safely
    |
    = help: the trait `std::marker::Send` is not implemented for `std::rc::Rc<Transaction>`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::ptr::Unique<std::rc::Rc<Transaction>>`
    = note: required because it appears within the type `alloc::raw_vec::RawVec<std::rc::Rc<Transaction>>`
    = note: required because it appears within the type `std::vec::Vec<std::rc::Rc<Transaction>>`
    = note: required because it appears within the type `Transactions`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::mpsc::Sender<Transactions>`
    = note: required because it appears within the type `[closure@src/main.rs:23:19: 27:6 tx:std::sync::mpsc::Sender<Transactions>]`

I understand that I could replace Rc with Arc, but I would like to know whether there is any other solution that avoids the performance penalty of Arc, because the Rc structs are never accessed by two threads at the same time.

2 Answers

Just DO NOT do it.

I was making this a comment, but I think it actually answers your question and warns others.

This is unsound! Rc is not about managing access; it is about keeping a value alive long enough to be shared between different “owners” by counting how many references exist. If two Rc references to the same value end up in two different threads, the non-atomic count can be modified by both threads AT THE SAME TIME. That can corrupt the count, which leads to memory leaks or, worse, to the allocation being dropped prematurely and to UB.

This is because of the classic sync problem of incrementing a shared variable:

Steps of incrementing a variable:

  1. Read variable and store it in the stack.
  2. Add 1 to the copy in the stack.
  3. Write back the result in the variable

That’s all fine with one thread, but let’s see what could happen otherwise:

Multithreaded sync incident (Threads A & B)

  1. x=0
  2. A: read x into xa (stack), xa = 0
  3. B: read x into xb, xb = 0
  4. A: increment xa, xa = 1
  5. A: write xa to x, x = 1
  6. B: increment xb, xb = 1
  7. B: write xb to x, x = 1
  8. x is now 1

You have now incremented 0 twice, with the result being 1: BAD!
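To make the interleaving concrete, here is a minimal sketch. Safe Rust will not let you write the actual data race, so the first function only replays the eight steps above by hand on a single thread; the second one shows the atomic read-modify-write that Arc relies on. The function names are made up for this example.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Replays the A/B interleaving from the list by hand, on one thread.
fn replay_lost_update() -> usize {
    let x = Cell::new(0usize);
    let mut xa = x.get(); // A: read x into xa
    let mut xb = x.get(); // B: read x into xb
    xa += 1;              // A: increment its copy
    x.set(xa);            // A: write back
    xb += 1;              // B: increment its (stale) copy
    x.set(xb);            // B: write back, overwriting A's update
    x.get()
}

/// Several threads increment one counter with an atomic fetch_add,
/// so no update can be lost.
fn atomic_increments(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("replayed interleaving: {}", replay_lost_update());
    println!("atomic counter: {}", atomic_increments(4, 1000));
}
```

The replay ends with 1 after two increments, while the atomic counter ends with exactly threads × per_thread.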

If x was the reference count of an Rc, it would think only one reference is alive. If one reference is dropped, it will think there’s no more reference alive and will drop the value, but there’s actually still a reference out there that thinks it’s ok to access the data, therefore Undefined Behaviour, therefore VERY BAD!
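For reference, this is how the count drives deallocation when everything stays on one thread. It is only an illustration using the standard Rc and Weak APIs; the function name and the "tx-1" value are made up.

```rust
use std::rc::{Rc, Weak};

/// Returns the strong count with two owners, with one owner,
/// and whether the value was freed once the last owner was dropped.
fn observe_counts() -> (usize, usize, bool) {
    let first = Rc::new(String::from("tx-1"));
    let second = Rc::clone(&first);
    // A Weak reference lets us check later whether the value still exists.
    let observer: Weak<String> = Rc::downgrade(&first);

    let with_two_owners = Rc::strong_count(&first);
    drop(second);
    let with_one_owner = Rc::strong_count(&first);
    drop(first); // count reaches zero here, so the String is dropped

    let freed = observer.upgrade().is_none();
    (with_two_owners, with_one_owner, freed)
}

fn main() {
    println!("{:?}", observe_counts());
}
```

If a lost update made the count read 1 while two owners were alive, the first drop would free the String while the other owner still pointed at it.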

The performance cost of Arc is negligible compared to the long and complex computation the threads are doing anyway; it’s absolutely worth using.
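For completeness, a sketch of the question's code with Rc swapped for Arc. The `build` and `collect` helpers and the "tx-N" ids are placeholders standing in for the real computation, which the question does not show.

```rust
use std::collections::HashMap;
use std::sync::mpsc::channel;
use std::sync::Arc;
use std::thread;

#[derive(Debug)]
struct Transaction {
    id: String,
}

#[derive(Debug)]
struct Transactions {
    list: Vec<Arc<Transaction>>,
    index: HashMap<String, Arc<Transaction>>,
}

/// Placeholder for the long computation: builds one shared Transaction.
fn build(worker: usize) -> Transactions {
    let t = Arc::new(Transaction { id: format!("tx-{}", worker) });
    let mut index = HashMap::new();
    index.insert(t.id.clone(), Arc::clone(&t));
    Transactions { list: vec![t], index }
}

/// Spawns the workers and gathers everything they send back.
fn collect(workers: usize) -> Vec<Transactions> {
    let (tx, rx) = channel();
    for worker in 0..workers {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(build(worker)).unwrap();
        });
    }
    drop(tx); // close the channel once only the workers hold senders
    rx.into_iter().collect()
}

fn main() {
    for transactions in collect(4) {
        println!("Got: {:?}", transactions);
    }
}
```

Since Arc<Transaction> is Send whenever Transaction is Send and Sync, the whole Transactions struct can go through the channel without any unsafe code.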

5 Comments

Yes, although I would like to stress the point that Rc is not Send, and if it uses something thread-local, sending it would probably be undefined behaviour. The ref count is also an important point, but I am not sure it would have any effect unless you make a clone.
@Mihir Sync problems can also happen on drop, it’s just not worth it.
I agree that it is not a good idea to send Rc as it is not Send. Just for understanding, let's assume it was Send and nothing was thread local. In that case, how would it be problematic if there were no clones and it was sent to another thread?
@Mihir If there’s a Rc, it’s because there’s more than one reference, if both dropped at the same time, it could cause memory leakage. The problem is that it does not sync properly.
OP states "the Rc structs are never accessed by two threads at the same time" This answer assumes all threading is concurrent and discusses that instead.

Unfortunately I can't delete this post as this is the accepted answer but I want to point to this link that I missed before:

Is it safe to `Send` struct containing `Rc` if strong_count is 1 and weak_count is 0?

Since Rc is not Send, its implementation can be optimized in a variety of ways. The entire memory could be allocated using a thread-local arena. The counters could be allocated using a thread-local arena, separately, so as to seamlessly convert to/from Box…. This is not the case at the moment, AFAIK, however the API allows it, so the next release could definitely take advantage of this.
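If the Rc values are only needed while building and while consuming the result, one way to avoid the question altogether is to not send any Rc at all: unwrap them into owned values before sending and rebuild the shared structure on the receiving side. The sketch below assumes that no other Rc clones are alive when the conversion runs; `from_owned`, `into_owned` and `collect` are hypothetical helpers, not something from the question.

```rust
use std::collections::HashMap;
use std::rc::Rc;
use std::sync::mpsc::channel;
use std::thread;

#[derive(Debug)]
struct Transaction {
    id: String,
}

#[derive(Debug)]
struct Transactions {
    list: Vec<Rc<Transaction>>,
    index: HashMap<String, Rc<Transaction>>,
}

impl Transactions {
    /// Hypothetical helper: wraps owned values in Rc and rebuilds the index.
    fn from_owned(owned: Vec<Transaction>) -> Self {
        let list: Vec<Rc<Transaction>> = owned.into_iter().map(Rc::new).collect();
        let index = list.iter().map(|t| (t.id.clone(), Rc::clone(t))).collect();
        Transactions { list, index }
    }

    /// Hypothetical helper: strips the Rc layer so the data becomes Send.
    fn into_owned(self) -> Vec<Transaction> {
        let Transactions { list, index } = self;
        drop(index); // release the second strong reference to each element
        list.into_iter()
            .map(|rc| Rc::try_unwrap(rc).expect("no other Rc clones may be alive"))
            .collect()
    }
}

fn collect(workers: usize) -> Vec<Transactions> {
    let (tx, rx) = channel::<Vec<Transaction>>();
    for worker in 0..workers {
        let tx = tx.clone();
        thread::spawn(move || {
            // The Rc values are created and destroyed entirely inside this thread.
            let built = Transactions::from_owned(vec![Transaction {
                id: format!("tx-{}", worker),
            }]);
            tx.send(built.into_owned()).unwrap();
        });
    }
    drop(tx);
    rx.into_iter().map(Transactions::from_owned).collect()
}

fn main() {
    for transactions in collect(4) {
        println!("Got: {:?}", transactions);
    }
}
```

The trade-off is that every element is reallocated and the index is rebuilt on the receiving thread, so whether this beats Arc would have to be measured.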


Old Answer

As you don't want to use Arc, you could use the newtype pattern and wrap Rc inside a type that implements Send and Sync. These traits are unsafe to implement, and after doing so it's entirely up to you to ensure that you don't cause undefined behaviour.


Wrapper around Rc would look like:

use std::ops::Deref;
use std::rc::Rc;

#[derive(Debug)]
struct RcWrapper<T> {
    rc: Rc<T>
}

impl<T> Deref for RcWrapper<T> {
    type Target = Rc<T>;

    fn deref(&self) -> &Self::Target {
        &self.rc
    }
}

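// The compiler no longer checks anything here: Rc's reference count is not atomic,
// so the caller must guarantee that clones of one Rc are never used from two threads at once.
unsafe impl<T: Send> Send for RcWrapper<T> {}
unsafe impl<T: Sync> Sync for RcWrapper<T> {}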

Then,

#[derive(Debug)]
struct Transactions {
    list: Vec<RcWrapper<Transaction>>,
    index: HashMap<String, RcWrapper<Transaction>>,
}

That said, the Deref impl is of limited use here, because most of Rc's functions are associated functions rather than methods. Rc is generally cloned with Rc::clone(), but you can still use the equivalent rc.clone() (probably the only case where Deref is worth having). As a workaround, the wrapper could provide its own methods that call Rc's functions, for clarity.
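A rough idea of what such forwarding methods could look like. The method names `new`, `clone_rc` and `strong_count` are invented for this sketch, and the unsafe Send/Sync impls are left out because they are not needed to show the forwarding.

```rust
use std::rc::Rc;

#[derive(Debug)]
struct RcWrapper<T> {
    rc: Rc<T>,
}

impl<T> RcWrapper<T> {
    fn new(value: T) -> Self {
        RcWrapper { rc: Rc::new(value) }
    }

    /// Forwards to the associated function Rc::clone.
    fn clone_rc(&self) -> Rc<T> {
        Rc::clone(&self.rc)
    }

    /// Forwards to the associated function Rc::strong_count.
    fn strong_count(&self) -> usize {
        Rc::strong_count(&self.rc)
    }
}

fn main() {
    let wrapper = RcWrapper::new(String::from("tx-1"));
    let extra = wrapper.clone_rc();
    println!("{:?}: {} strong references", wrapper, wrapper.strong_count());
    drop(extra);
    println!("after drop: {}", wrapper.strong_count());
}
```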

Update:

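I found the send_wrapper crate, which seems to serve that purpose. Note that, according to its documentation, SendWrapper only lets the wrapper itself travel between threads: the wrapped value may be accessed and dropped only on the thread that created it, and it panics otherwise.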

You could use it like:

use send_wrapper::SendWrapper;

#[derive(Debug)]
struct Transactions {
    list: Vec<SendWrapper<Rc<Transaction>>>,
    index: HashMap<String, SendWrapper<Rc<Transaction>>>,
}

PS: I would suggest sticking with Arc. The overhead is generally not that high unless you make a lot of clones frequently. I am not sure how Rc is implemented. Send allows a type to be sent to other threads, and if Rc relies on anything thread-local, such as thread-local locks or data, I am not sure how that would be handled.
