How do I configure this so that the dependent tasks run in parallel (instead of sequentially) to reduce the overall execution time?

  • I have the "master" task shown below.
  • It depends on the results of the get_latest_close_price(symbol) and option_chain(symbol) Celery tasks.
  • The master task transforms the results of these two tasks to produce the final pivot table.
  • The problem is that these two independent tasks each talk to a different API system to get their input (sometimes taking several seconds to execute).

  • What I am noticing is that the order of the Celery tasks in the group statement matters:

    • If get_latest_close_price is first, it runs before option_chain (and vice versa).
    • To me it seems like these tasks are being run sequentially instead of in parallel.
    • Is my understanding wrong?

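from celery import group
from celery.result import allow_join_result

# `celery` is the Celery app instance defined elsewhere in the project.
@celery.task(name='master_task')
def process_chain(symbol):
    # Results come back in the same order as the signatures listed here.
    # g = group(get_latest_close_price.s(symbol), option_chain.s(symbol))
    g = group(option_chain.s(symbol), get_latest_close_price.s(symbol))

    results = g()

    # Waiting on subtask results inside a task requires allow_join_result().
    with allow_join_result():
        data = results.get()
        # data[0] is the option chain, data[1] is the latest close price.
        data = util_transform_option_chain(data[1], data[0])

    return {'result': data}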

1 Answer

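A group will definitely run its tasks in parallel, but only if you have multiple workers or a worker concurrency greater than 1. If only one worker process is free, the subtasks are picked up one at a time in the order they were queued, which matches what you are seeing. As an FYI, you may also want to use a chord (a group plus a callback) to handle the results the way you want.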

2 Comments

Setting --concurrency=5 got me the result I was expecting.
Is there a good tutorial or article for understanding group, chord, etc.? I'm not sure I understand the concepts fully... I'm sort of wingin' it.
