Edit
Let's restate everything. I'm implementing Bellman-Ford on OpenMP. As I understand it, the compare step and the setting of dist must be done in a critical block, since updating dist could potentially change the result of the compare step--there is a data race here.
My problem then is that the updated_last_round variable need not be updated in a critical block. There is a data race here, but the only update-to value is true, so it doesn't matter. My worry about the current implementation is that all the threads are atomically updating the updated_last_round and slowing each other down.
bool compare(int v1, int w, int v2) {
// checks if v1 + w < v2
if (v1 == INT_MAX) return false;
if (v2 == INT_MAX) return true;
return ((v1+w) < v2);
}
vector<int> bellman_ford(Graph& g, int root) {
vector<int> dist(g.num_nodes());
# pragma omp parallel for
for (int i = 0; i < g.num_nodes(); i++)
dist[i] = INT_MAX; // set INF
dist[root] = 0;
int round = 0;
bool updated_last_round = true;
// relax procedure
while (updated_last_round && round < g.num_nodes()) {
updated_last_round = false;
#pragma omp parallel for
for (int u = 0; u < g.num_nodes(); u++) {
vector<int> neighbors = g.out_neighbors(u);
vector<int> weights = g.out_weights_neighbors(u);
#pragma omp parallel for
for (int j = 0; j < g.out_degree(u); j++) {
int v = neighbors[j];
int weight = weights[j];
#pragma omp critical
{
if (compare(dist[u], weight, dist[v])) {
dist[v] = dist[u] + weight;
updated_last_round = updated_last_round || true;
}
}
}
}
round += 1;
}
/* ... */
return dist;
}
Original
I'm trying to parallelize some code in OpenMP that requires an atomic check-and-set inside a parallel for loop, and I am computing at the end of each iteration whether at least one value has been set.
Right now I'm using a reduction(||:updated_last_round) to reduce the boolean at the end of every iteration, but I'm not sure if this actually speeds up anything, as the actual line of code that updates the boolean is inside the critical section still.
bool updated_last_round = true
while (updated_last_round) {
updated_last_round = false;
#pragma omp parallel for reduction(||:updated_last_round)
for (/* stuff */) {
// other stuff
#pragma omp critical
{
if (should_update(vars_to_decide_with)) {
// do the important critical update
// I DON'T want this to be atomically updated, as
// the data race doesn't matter at all
updated_last_round = updated_last_round || true;
}
}
}
It should make sense that there's a way to let the critical section only do the critical stuff and then proceed to set a thread-local bool value, then reduce the local values at the end of each iteration. How should I achieve that?
#pragma omp parallel forwith awhileloop.should_updatebefore theifstatement: dpaste.com/2F81ZC9 But since this definitely races, a race condition detection on your program will fail.