
In Java, I have a large array of strings.

I have one thread doing something like this:

for (int i = 0; i < 10000; i++) array[i] = getSomeValue();

I have another thread doing something like this:

for (int i = 10000; i < 20000; i++) array[i] = getSomeValue();

and another thread doing:

for (int i = 20000; i < 30000; i++) array[i] = getSomeValue();

and so on.

Do I have to do anything special for this operation?

Will it work?

I am trying to populate this large array faster by splitting the task across multiple threads, but I wonder whether this is the correct approach.

I am working on a 64-bit machine with 16 CPUs and all the fancy stuff.

1
  • If you have bad luck you may hit a concurrency issue if each thread tries to resize the array at the same time. If you already know the total number of elements to be added then you should create your array of that size. Commented Feb 9, 2014 at 12:44

6 Answers

8

Your code will work fine.

Different portions of an array are independent of each other.

The spec says:

One implementation consideration for Java virtual machines is that every field and array element is considered distinct


4 Comments

Not entirely true: it depends on what he wants to do with the array when he is done (he needs to make sure the updates are visible to future consumers of the array).
That's a separate issue, jtahlborn.
@DJClayworth - How is that a separate issue? The question is whether that code is correct. To me, "correct" means that you can use the results in your code. The only way to use the results of the OP's code in his program is if he handles thread visibility correctly. It is pretty useless to init the array and then not be able to read it later.
@DJClayworth It depends on what the OP's definition of 'work' is. 'Will it work': the answer is yes if memory visibility isn't an issue after the writes, and no if it is.
1

This should work fine. However, if you want to be sure it's safe, you can populate different arrays in each thread and then System.arraycopy() them into one big array.
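A minimal sketch of that approach, assuming each task fills its own local array and the calling thread copies the pieces into place. `CopyMerge`, `fill`, and the body of `getSomeValue()` are hypothetical stand-ins, and `CompletableFuture` is used here only as one convenient way to run the tasks and collect their results:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class CopyMerge {
    // Hypothetical stand-in for the question's getSomeValue().
    static String getSomeValue() {
        return "value";
    }

    static String[] fill(int size, int parts) {
        int chunk = (size + parts - 1) / parts; // ceiling division
        List<CompletableFuture<String[]>> futures = new ArrayList<>();

        for (int p = 0; p < parts; p++) {
            int start = Math.min(size, p * chunk);
            int length = Math.min(size, start + chunk) - start;
            // Each task only touches its own private array.
            futures.add(CompletableFuture.supplyAsync(() -> {
                String[] local = new String[length];
                for (int i = 0; i < length; i++) {
                    local[i] = getSomeValue();
                }
                return local;
            }));
        }

        // Only the calling thread writes to the combined array.
        String[] result = new String[size];
        int offset = 0;
        for (CompletableFuture<String[]> future : futures) {
            String[] local = future.join(); // waits for the task to finish
            System.arraycopy(local, 0, result, offset, local.length);
            offset += local.length;
        }
        return result;
    }

    public static void main(String[] args) {
        String[] array = fill(30000, 3);
        System.out.println(array.length + " elements filled");
    }
}
```

The extra copy costs some time and memory, so this trades a little speed for the simplicity of having a single writer on the final array.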


1

You can safely init the array with this code. However, any code that needs to use the array afterwards must be correctly synchronized with the threads doing the initial updates. This can be as simple as "join"ing all the init threads before using the array.
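As a rough sketch of the "join" approach, the following splits the index range across plain threads and joins them all before returning the array. `SegmentedFill`, `fill`, and the body of `getSomeValue()` are hypothetical names introduced for illustration:

```java
class SegmentedFill {
    // Hypothetical stand-in for the question's getSomeValue().
    static String getSomeValue() {
        return "value";
    }

    static String[] fill(int size, int threadCount) {
        String[] array = new String[size];
        Thread[] workers = new Thread[threadCount];
        int chunk = (size + threadCount - 1) / threadCount; // ceiling division

        for (int t = 0; t < threadCount; t++) {
            final int start = Math.min(size, t * chunk);
            final int end = Math.min(size, start + chunk);
            // Each worker writes only to its own index range [start, end).
            workers[t] = new Thread(() -> {
                for (int i = start; i < end; i++) {
                    array[i] = getSomeValue();
                }
            });
            workers[t].start();
        }

        // join() makes each worker's writes visible to the joining thread.
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while filling", e);
            }
        }
        return array;
    }

    public static void main(String[] args) {
        String[] array = fill(30000, 3);
        System.out.println(array.length + " elements filled");
    }
}
```

Any code that reads the array should run only after `fill` has returned, or be handed the array by the thread that did the joining.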


0

It should be fine unless getSomeValue() has side-effects that change mutable state that multiple threads access. If it doesn't have any state-changes, then you are set. You won't access the same bit of memory in any of the loops.

Whether or not it will actually be faster depends on your hardware setup and threading implementation.


0

As long as each thread works on its own segment of the array, the updates should be fine. Additionally, you can definitely see a performance boost by dividing the work. You'll probably want to measure the size of the boost by testing with different numbers of threads. You should probably start at 16, since you have 16 CPUs, and see how increasing and decreasing the count affects performance.

One issue you may have is with visibility. I don't believe the elements of the array are guaranteed to be seen by all threads, because array elements cannot be declared volatile (only the array reference can). So if sections of the array need to be accessed by multiple threads, you could have an issue. One way to deal with this is to use an AtomicReferenceArray (or an AtomicIntegerArray for int values).
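To illustrate, here is a small sketch that fills an `AtomicReferenceArray<String>` from several threads. Its `set` and `get` have volatile semantics per element, so a reader that sees a non-null entry sees the fully written value. `AtomicFill`, `fill`, and the body of `getSomeValue()` are hypothetical; the join at the end is there only so the method returns a completely filled array:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class AtomicFill {
    // Hypothetical stand-in for the question's getSomeValue().
    static String getSomeValue() {
        return "value";
    }

    static AtomicReferenceArray<String> fill(int size, int threadCount) {
        AtomicReferenceArray<String> array = new AtomicReferenceArray<>(size);
        Thread[] workers = new Thread[threadCount];
        int chunk = (size + threadCount - 1) / threadCount; // ceiling division

        for (int t = 0; t < threadCount; t++) {
            final int start = Math.min(size, t * chunk);
            final int end = Math.min(size, start + chunk);
            workers[t] = new Thread(() -> {
                for (int i = start; i < end; i++) {
                    array.set(i, getSomeValue()); // volatile write per element
                }
            });
            workers[t].start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while filling", e);
            }
        }
        return array;
    }

    public static void main(String[] args) {
        AtomicReferenceArray<String> array = fill(30000, 3);
        System.out.println(array.get(0) + " / " + array.length() + " elements");
    }
}
```

If readers only ever look at the array after all writers have been joined, a plain `String[]` is enough; the atomic array matters when other threads may read elements while the fill is still in progress.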


0

With Java 8, this has become much easier:

Arrays.parallelSetAll(array, i -> getSomeValue());

This should also solve the problems of visibility mentioned in other answers and comments.
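A self-contained version of that one-liner might look like the sketch below. `ParallelSetAllDemo`, `fill`, and the body of `getSomeValue()` are hypothetical; the generator receives the element index `i`, which this example ignores just as the question's loops do:

```java
import java.util.Arrays;

class ParallelSetAllDemo {
    // Hypothetical stand-in for the question's getSomeValue().
    static String getSomeValue() {
        return "value";
    }

    static String[] fill(int size) {
        String[] array = new String[size];
        // Computes each element in parallel; returns once all are set.
        Arrays.parallelSetAll(array, i -> getSomeValue());
        return array;
    }

    public static void main(String[] args) {
        String[] array = fill(30000);
        System.out.println(array.length + " elements filled");
    }
}
```

Because the call does not return until every element has been computed, the calling thread can use the array directly afterwards. As with the hand-written loops, `getSomeValue()` still has to be safe to call from several threads at once.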

