I have the following for loop, which contains a private function call inside of it:

for (i = 0; i < N; ++i)
    dates[i] = to_time_t(&string_dates[i][0]);

to_time_t simply converts a string (e.g.: "18/03/2007") into a timestamp with the help of mktime(), which is really slow. In fact, that for loop alone takes more time than any other part of the program. To remedy this, I am trying to apply OpenMP to the loop, like this:

#pragma omp parallel for private(i)
for (i = 0; i < N; ++i)
    dates[i] = to_time_t(&string_dates[i][0]);

My OpenMP knowledge is limited, but I'm assuming that each element of the dates array is never accessed by two threads simultaneously since i is private. The same should apply to string_dates. But when I run this code, performance is actually worse, so I must be doing something wrong that I just don't see. Any help is appreciated!

Edit: I should have included the to_time_t code from the start.

time_t to_time_t(const string * date) {
    struct std::tm tm = {0};

    istringstream ss_tm(*date);
    ss_tm >> get_time(&tm, "%m/%d/%Y");

    return mktime(&tm);
}
Comments
  • How big is N? And how do you time the execution? Commented May 28, 2020 at 18:33
  • @HighPerformanceMark N is almost 250k, sometimes 1 million. It depends on the dataset I'm running this code against. To measure execution time I call MPI_Wtime() (since I'm using MPI as well) before and after the loop. Commented May 28, 2020 at 20:09
  • The issue is likely in to_time_t(). It may be touching some global variables or calling into library functions that keep some hidden state and are not really thread-safe. I doubt anyone could guess the actual reason without some insight into to_time_t(). Commented May 29, 2020 at 7:09
  • @HristoIliev You are absolutely right. I made an edit to add the code of that function as well! Commented May 29, 2020 at 10:51

1 Answer

The problem is mktime(), which has a process-wide side effect. From the manual page:

Calling mktime() also sets the external variable tzname with information about the current timezone.

mktime() internally calls tzset(). The latter is serialised via a mutex lock, but what really slows it down in the multithreaded case is the constant cache thrashing. When a call to tzset() by a thread running on a particular CPU core writes to tzname, it invalidates the corresponding cache lines on all the other cores, forcing threads that run on those cores to go to higher cache levels or even main memory the next time they call mktime().

You need to find or write an equivalent of mktime() that doesn't modify global state. Or just stick to sequential execution for that part of the code. It is perfectly fine to call mktime() simultaneously in multiple sequential processes (e.g., in a pure MPI application).
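As a rough illustration of the first option, here is a minimal sketch of a conversion that touches no global state. It is not a drop-in replacement for mktime(): it interprets the date as midnight UTC rather than local time, so results differ from mktime() by the local UTC offset. The names days_from_civil, to_time_t_utc and convert_all are made up for this example, and the month/day/year field order is an assumption taken from the "%m/%d/%Y" format string in the question (swap the first two fields if your data is day-first). The day count uses the well-known civil-date arithmetic rather than any library call.

```cpp
#include <cstdio>
#include <ctime>
#include <string>
#include <vector>

// Days between 1970-01-01 and the given Gregorian date.
// Pure integer arithmetic: no locale, no timezone, no shared state.
long long days_from_civil(long long y, unsigned m, unsigned d) {
    y -= (m <= 2);                                     // treat Jan/Feb as part of the previous year
    const long long era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);   // year of era, 0..399
    const unsigned mp  = (m + 9) % 12;                            // March = 0 ... February = 11
    const unsigned doy = (153 * mp + 2) / 5 + d - 1;              // day of (shifted) year
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;   // day of era
    return era * 146097 + static_cast<long long>(doe) - 719468;
}

// Hypothetical replacement for to_time_t: parses "MM/DD/YYYY" (assumed order)
// and returns seconds since the epoch for midnight UTC, or -1 on bad input.
std::time_t to_time_t_utc(const std::string& date) {
    unsigned month = 0, day = 0;
    int year = 0;
    if (std::sscanf(date.c_str(), "%u/%u/%d", &month, &day, &year) != 3)
        return static_cast<std::time_t>(-1);
    if (month < 1 || month > 12 || day < 1 || day > 31)
        return static_cast<std::time_t>(-1);
    return static_cast<std::time_t>(days_from_civil(year, month, day) * 86400LL);
}

// Each iteration reads one input string and writes one output slot,
// so the loop can be shared among threads without any locking.
std::vector<std::time_t> convert_all(const std::vector<std::string>& string_dates) {
    std::vector<std::time_t> dates(string_dates.size());
    const long n = static_cast<long>(string_dates.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        dates[i] = to_time_t_utc(string_dates[i]);
    return dates;
}
```

Build with OpenMP enabled (for example -fopenmp with GCC or Clang); without it the pragma is ignored and the loop simply runs sequentially. If you need local time, the UTC offset would have to be determined once, outside the parallel region, and applied to each result; that only works cleanly if a single fixed offset is acceptable for your data.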
