I define an algorithm to be best if it minimizes the total run time on commodity hardware (e.g. ordinary desktop and server PCs).
I have sets A and B, and a function f invoked as f(a, b, n), where a is an item in A, b is an item in B, and n is a natural number. f(a, b, n) returns a positive real number.
The following properties hold:
- f(a,b,n) has a randomized component, so invoking it twice with the same arguments will likely give two different answers (close to each other, but nonetheless different). Larger values of n only reduce the estimation error of f.
- f(a,b,n) takes twice as long to compute as f(a,b,n/2).
- f(a,b,n) = (f(a,b,n/2) + f(a,b,n/2))/2, where the two calls on the right are independent invocations; that is, averaging two independent estimates at n/2 is statistically equivalent to one estimate at n (see the sketch after this list).
- Computing f(a_1,b_1,n) is independent of computing f(a_2,b_2,n), where a_1, a_2 are distinct elements of A and b_1, b_2 are distinct elements of B.
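For concreteness, here is a minimal sketch of what such an f could look like. The body is entirely hypothetical (a Monte Carlo estimator that averages n noisy samples, assuming a and b are numbers), but it exhibits all four properties above:

    import random

    # Hypothetical stand-in for f: averages n noisy samples of a
    # quantity that depends on (a, b). The output is random, the cost
    # is linear in n, averaging two calls at n/2 matches one call at n,
    # and calls for different (a, b) pairs are independent.
    def f(a, b, n):
        total = 0.0
        for _ in range(n):
            total += (a - b) ** 2 + random.random()  # one positive noisy sample
        return total / n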
My goal is to compute the value of answer as follows:
    def compute_answer(A, B, n):
        count = 0
        output = 0
        for a in A:
            for b in B:
                output += f(a, b, n)
                count += 1
        answer = output / count
        return answer
But the code above is entirely serial. What I wish to do is parallelize it so as to minimize the total number of seconds needed to obtain answer.
Just to reiterate: I am running this application on a single computer, which is why I am considering a multi-threaded approach. I do not wish to distribute the work over multiple machines; all I want is to run it as fast as possible across the many cores of that single computer.
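A minimal sketch of one way to parallelize this, assuming f is CPU-bound pure Python: CPython's global interpreter lock prevents plain threads from speeding up CPU-bound work, so the sketch uses a process pool; if f releases the GIL internally (e.g. it spends most of its time in NumPy or C code), swapping in ThreadPoolExecutor should behave the same. Each (a, b) pair becomes one independent task, which the independence property above allows. The names one_pair and compute_answer_parallel are my own.

    import itertools
    from concurrent.futures import ProcessPoolExecutor

    def one_pair(args):
        # Helper so each task is a single picklable argument.
        a, b, n = args
        return f(a, b, n)

    def compute_answer_parallel(A, B, n):
        # One independent task per (a, b) pair, spread across all cores.
        tasks = [(a, b, n) for a, b in itertools.product(A, B)]
        with ProcessPoolExecutor() as pool:  # defaults to one worker per core
            results = list(pool.map(one_pair, tasks, chunksize=8))
        return sum(results) / len(results)

On Windows and macOS the call to compute_answer_parallel must sit under an if __name__ == "__main__": guard, since worker processes re-import the module. And if |A| x |B| is smaller than the number of cores, the averaging property gives extra parallelism for free: replace each task f(a, b, n) with k independent tasks f(a, b, n/k) and average their results.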