I have a method that takes a string as input and returns data from the DB based on that string. I have a list of strings, and I currently loop over the whole list, passing each string to the method in turn:

public DataClass getData(String input) {
    // ... logic to get the data for 'input' from a third-party API.
    // The third-party API takes the 'input' string and returns the data ...
}

public void callerMethod() {
    // List is an interface, so a concrete implementation is needed here.
    List<String> myStrings = new ArrayList<String>();
    for (String inputStr : myStrings) {
        DataClass data = getData(inputStr);
    }
}

The code above is the logic I have now. I want to make the getData() calls concurrently instead of working through the list one string at a time, because the sequential approach is slow. I am not sure whether I can use threads here or whether there is a newer approach for this.

  • If you're reading from a DB I can guarantee you're IO-bound. Parallelising IO over a single channel won't help. You should consider moving the filtering logic into the DB query. Commented Sep 25, 2013 at 22:56
  • Though I have other areas like this with DB IO, in the current case I am making a call to a third-party API. Editing the question to include this info. Commented Sep 25, 2013 at 23:01
  • @millimoose Can you? There are plenty of cases where you can do a CPU-bound action based on a string, or do other IO-bound tasks. Commented Sep 25, 2013 at 23:02
  • Anyway, my first approach would be ExecutorService.submit() (or invokeAll()) and fetching the data from the returned Futures. That's assuming the third-party API can be used in a thread-safe way. Commented Sep 25, 2013 at 23:04
  • @user811433 If only you were using C#, that would have been just Parallel.ForAll, which would take care of that for you. Java 8 provides a similar interface with the new streams API. Unfortunately for you, in Java 7 you need two classes that implement Runnable: one for a worker that consumes from a ConcurrentLinkedQueue (have about 8 of these, or however many cores you have) and one for the DB producer that takes the DB and just produces for it. millimoose's suggestion with Futures is probably a better approach if you want anything more. Commented Sep 25, 2013 at 23:07

1 Answer

This can be parallelized using the Executor framework. Create a ThreadPoolExecutor. The number of threads should probably be equal to the number of concurrent connections you can have to the database (i.e. connection pool size).

Loop through your strings. For each string, create a Callable that wraps getData and submit the callable to the executor. The executor will return a Future which you can use later. Once you have submitted all of the callables, you can start retrieving the DataClasses from your Futures.
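The steps above can be sketched roughly as follows, assuming Java 7 or later. DataClass and getData() here are hypothetical stand-ins for the types in the question (the real getData() would call the third-party API), and POOL_SIZE is an arbitrary placeholder, not a recommended value. Executors.newFixedThreadPool is used as a convenient way to obtain a ThreadPoolExecutor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentFetch {

    // Hypothetical stand-in for the question's DataClass.
    public static class DataClass {
        public final String value;
        public DataClass(String value) { this.value = value; }
    }

    // Hypothetical stand-in for the question's getData();
    // the real method would call the third-party API.
    public static DataClass getData(String input) {
        return new DataClass("data-for-" + input);
    }

    // Assumed value; tune it to the number of concurrent calls the backend allows.
    static final int POOL_SIZE = 4;

    public static List<DataClass> fetchAll(List<String> inputs)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(POOL_SIZE);
        try {
            // Step 1: submit one Callable per string and keep the Futures.
            List<Future<DataClass>> futures = new ArrayList<Future<DataClass>>();
            for (final String input : inputs) {
                futures.add(executor.submit(new Callable<DataClass>() {
                    @Override
                    public DataClass call() {
                        return getData(input);
                    }
                }));
            }
            // Step 2: collect the results; get() blocks until each one is ready.
            List<DataClass> results = new ArrayList<DataClass>();
            for (Future<DataClass> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            // Let the worker threads exit once the submitted tasks are done.
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (DataClass d : fetchAll(Arrays.asList("a", "b", "c"))) {
            System.out.println(d.value);
        }
    }
}
```

Because the Futures are read in the order they were submitted, the results line up with the input list even if the calls finish out of order. This only helps if the third-party API is safe to call from several threads at once.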

2 Comments

One question: future.get() does not return any data for me, though. Is it supposed to return values?
@user811433, future.get() will return the same thing as callable.call(). The difference is that if you invoke call() yourself, it will execute immediately. However, if you submit the callable to an executor, then it may be invoked sometime in the future. The future allows you to get the value of the deferred call. If the value is not available yet, it will block until it is.
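To illustrate the point about future.get(), here is a small sketch; the task() helper and its simulated delay are made up for the example. It also shows one possible reason for getting no data back, which is an assumption about the asker's code and not something the thread confirms: if a Runnable is submitted instead of a Callable, the returned Future's get() yields null on completion.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureGetDemo {

    // Hypothetical task standing in for a Callable that wraps getData().
    public static Callable<String> task(final String input) {
        return new Callable<String>() {
            @Override
            public String call() throws Exception {
                Thread.sleep(100); // simulate a slow API call
                return "data-for-" + input;
            }
        };
    }

    // Submits the Callable and waits for its value.
    public static String viaCallable(String input) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = executor.submit(task(input));
            // get() blocks until call() has finished, then returns its result.
            return future.get();
        } finally {
            executor.shutdown();
        }
    }

    // Submits a Runnable instead; its Future carries no value.
    public static Object viaRunnable() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> future = executor.submit(new Runnable() {
                @Override
                public void run() {
                    // Work happens here, but nothing is returned.
                }
            });
            return future.get(); // null once run() completes
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String direct = task("x").call();   // runs immediately on this thread
        String deferred = viaCallable("x"); // runs on the executor's thread
        System.out.println(direct.equals(deferred));
        System.out.println(viaRunnable() == null);
    }
}
```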
