Lock based concurrent vector

Question

I am starting with designing thread safe data structure and looking to implement a concurrent vector that hides multi-threading complexity for the user and that offers thread safe clear function (which is not offered by Intel TBB::ConcurrentVector) . Considering a concurrent vector which is based on std::vector and locking a mutex for each call of clear or push_back functions of std::vector, how to iterate thread safely through the concurrent vector using for loop? in other terms, how to make iterate_thread in this minimal reproducible example thread safe?

#include <bits/stdc++.h> 
using namespace std; // for the sake of minimal reproducible example
template<class T> 
class ConcurrentVector {
public: 
    ConcurrentVector() : mMutex{} 
                  , mVector{} 
    {}
    auto begin(){
        scoped_lock lk(mMutex);
        return mVector.begin(); 
    }
    auto end(){
        scoped_lock lk(mMutex);
        return mVector.end(); 
    }
    T& push_back(T t) {
        scoped_lock lk(mMutex);
        mVector.push_back(move(t)); 
        return mVector.back();
    }
    void clear() {
        scoped_lock lk(mMutex);
        mVector.clear(); 
    }
private: 
    mutex mMutex; 
    vector<T> mVector; 
}; 

int main(){
    ConcurrentVector<int> vec{};
    atomic_bool terminate_flag = false;
    jthread fill_thread{[&](){
        static int i = 0;
        while(!terminate_flag){
            this_thread::sleep_for(10ms);
            vec.push_back(++i);
        }
    }};
    jthread clear_thread{[&](){
        while(!terminate_flag){
            this_thread::sleep_for(1s);
            vec.clear();
        }
    }};
    jthread iterate_thread{[&](){
        while(!terminate_flag){
            this_thread::sleep_for(400ms);
            for(auto i : vec){
                cout << i << ", ";
            }
            cout << '\n';
        }
    }};
    
    this_thread::sleep_for(3s);
    terminate_flag = true;
    return 0;
}

Calling clear function in iterate_thread will invalidate iterator and actually, I am thinking of these two options:

Add getter function in ConcurrentVector to return a copy of the std::vector and iterating through it, but the issue here is that if clear is called in between, I will be iterating through the copy not the real existing data in the vector
or add a for_each function in ConcurrentVector that takes a callable, but the issue here is lack of safety as the user may be calling a locking function which causes a deadlock with the vector mutex

Are there other alternatives or could any of the two mentioned improved?

Question here should be self-contained. The link to Godbolt is nice as a bonus, but the code (in a form of a minimal reproducible example) should also be copy-pasted as text in the question. — wohlstad
– wohlstad, Commented Jul 11, 2024 at 10:28
Regarding your actual question: I would opt for the 2nd option (for_each with a callable). You can use a templated method to support all kinds of callables (lambda, std::function etc.). — wohlstad
– wohlstad, Commented Jul 11, 2024 at 10:31
"locking a mutex for each call of std::vector public member function" To synchronize access to elements this approach is flawed to begin with. For example user can call data() and then modify as they like, that a mutex was locked during that call does not do anything for protecting the elements — 463035818_is_not_an_ai
– 463035818_is_not_an_ai, Commented Jul 11, 2024 at 10:36
imho the most important lesson is that multithreading is not just single threaded plus threads (plus mutexes plus locks) — 463035818_is_not_an_ai
– 463035818_is_not_an_ai, Commented Jul 11, 2024 at 10:37
#include <bits/stdc++.h> + using namespace std; == you just committed C++ suicide; — pptaszni
– pptaszni, Commented Jul 11, 2024 at 11:12

n. m. could be an AI · Accepted Answer · 2024-07-11 11:18:30Z

3

Do not export any vector-like functions from your class. Only export a function that returns an object acting as a scoped lock, and make that object export vector-like functions, which in turn won't need any locking. Something like:

ConcurrentVector<int> v; 
...
{
  auto vv = v.open(); // vv locks 
  for (auto& i: vv) {  // and presents vector-like functionality
    i = 2*i+3;
  }
  // vv's destructor unlocks
}

answered Jul 11, 2024 at 11:18

n. m. could be an AI

122k14 gold badges141 silver badges267 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

omarekik Over a year ago

Thanks for the proposition, it hides mutex for the user but still he needs to call v.open and the for loop inside a scope and keep it as short as possible, in other terms still the user needs to be aware about multi-threading complexity.

n. m. could be an AI Over a year ago

There is no such thing as free lunch. You can hide complexity from the user, but for a rather heavy price

user4581301 Over a year ago

@omarekik take a quick look at what happened in Java when they made a thread-safe vector that hid all the complexity. TL;DR: They wound up with a class that was too slow most of the time and didn't provide the necessary protection most of the time. A really bad combination. The user needs to be involved in the scope of the lock, and this answer gives that with minimum impact..

MSalters · Accepted Answer · 2024-07-11 10:50:51Z

1

A range-based loop still uses iterators, even if not visible in code. That iterator has to cooperate. You can't just use std::vector::iterator for this, because the lock has to be released when the iterator goes out of scope.

The naïve implementation linked to only locks the mutex in .begin() when the iterator is created, which is way too short.

Locking the mutex while an iterator exists also prevents the problem of two iterators existing, causing write clashes. However, .end() does need to return a sentinel iterator even when there's another outstanding iterator.

Where things really break down is .end()-1. This needs to lock the vector, as the last element of the vector is definitely writeable.

And don't expect this to magically solve all your problems. Code like if (v.begin()!=v.end()) auto last = *(v.end()-1) breaks when another thread calls v.clear() at the wrong moment.

answered Jul 11, 2024 at 10:50

MSalters

182k11 gold badges171 silver badges376 bronze badges

2 Comments

omarekik Over a year ago

The proposed example contains a data race, as clear function would invalidate iterators. The question is how to avoid the data race within the implementation of ConcurrentVector so it can be safe and hiding multi-threading complexity.

MSalters Over a year ago

@omarekik: "Hiding complexity" is a dark pattern. The pattern has been invented over and over, and people keep discovering that hidden complexity is more dangerous than obvious complexity.

Aykhan Hagverdili · Accepted Answer · 2024-07-11 11:12:22Z

1

Making a ConcurrentVector that wraps std::vector and locks on all public function calls is probably a bad idea. It will lead to an unnecessary number of lock/unlock calls, and it's generally undesirable to let other threads interleave between your calls. What I would recommend doing is to have one mutex for the entire vector, lock it with an RAII based lock class, perform all your actions and then let the destructor unlock the mutex at the end of the function. This generally achieves good performance and more predictable behavior.

answered Jul 11, 2024 at 11:12

Aykhan Hagverdili

30.5k6 gold badges53 silver badges108 bronze badges

2 Comments

omarekik Over a year ago

Thanks for your proposition, the intention of ConcurrentVector is to hide multi-threading implementation for the user so he should not need to deal with mutex even though this may introduce some overhead

Aykhan Hagverdili Over a year ago

@omarekik in that case, I would propose to add a lock() method to the ConcurrentVector class, which returns a proxy object which can be used to access the vector, and also holds the mutex locked as long as the proxy is alive.

omarekik · Accepted Answer · 2024-07-15 14:20:05Z

Thanks @MikeVine, this concurrent_vector is based on a monitor that protects std::vector from data race:

template<class t_container>
class monitor{
public:
    template<typename ...Args>
    monitor(Args&&... args) : m_container(std::forward<Args>(args)...){}

    struct monitor_helper{//The time of locking is the life time of monitor_helper
        monitor_helper(monitor* mon) : m_monitor(mon), m_unique_lock(mon->m_mutex) {}
        t_container* operator->(){ return &m_monitor->m_container; }
        t_container& operator*(){ return m_monitor->m_container; } // for accessing using a manually locked object
        monitor* m_monitor;
        std::unique_lock<std::mutex> m_unique_lock;
    };

    monitor_helper operator->(){ return monitor_helper(this); }
    monitor_helper manually_lock(){ return monitor_helper(this); }
private:
    t_container m_container;
    std::mutex  m_mutex;
};

template<typename t_element> 
requires copyable<t_element>   
class concurrent_vector {
public: 
    void push_back(t_element t) {
        m_monitored_vector->push_back(move(t)); 
    }
    void clear() {
        m_monitored_vector->clear(); 
    }
    vector<t_element> get_data(){
        {//scope for manually_lock
            auto vector_lock_handler = m_monitored_vector.manually_lock();
            return *vector_lock_handler;
        }
    }
private: 
    monitor<vector<t_element>> m_monitored_vector; 
};

Collectives™ on Stack Overflow

Lock based concurrent vector

4 Answers 4

3 Comments

2 Comments

2 Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

3 Comments

2 Comments

2 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related