
I'm working on a Django application that uses Celery to run some tasks asynchronously. I did some load testing and checked response times with Apache Bench. From what I can tell from the results, response time is faster without the Celery async tasks.

I'm using:

  • Django: 2.1.0
  • celery: 4.2.1
  • Redis (Broker): 2.10.5
  • django-redis: 4.9.0
  • Celery configuration in Django settings.py:

    BROKER_URL = 'redis://127.0.0.1:6379/1'
    CELERY_RESULT_BACKEND = 'django-db' # Using django_celery_results
    CELERY_ACCEPT_CONTENT = ['application/json']
    CELERY_TASK_SERIALIZER = 'json'
    CELERY_RESULT_SERIALIZER = 'json'
    CELERY_TIMEZONE = 'Asia/Kolkata'
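
    For context, old-style setting names like BROKER_URL and CELERY_RESULT_BACKEND above are usually picked up by a celery.py module that reads them straight from Django's settings. A minimal sketch of that bootstrap, assuming a project package named proj (a placeholder; the actual celery.py isn't shown in the question), could look like this:

    import os

    from celery import Celery

    # 'proj' is a placeholder for the real Django project package.
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')

    app = Celery('proj')
    # Load the old-style BROKER_URL / CELERY_* settings shown above
    # directly from Django's settings module (no CELERY_ namespace).
    app.config_from_object('django.conf:settings')
    # Discover tasks.py modules in the installed Django apps.
    app.autodiscover_tasks()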
    

    Following is my code (API exposed by my system):

    class CustomerSearch(APIView):
    
        def post(self, request):
            request_dict = {}  # Request parameters
            # Async Block
            response = celery_search_customer_task.delay(request_dict)
            response = response.get()
            # Synchronous Block (uncomment following to make synchronous call)
            # api_obj = ApiCall(request=request_dict)
            # response = api_obj.search_customer() # this makes an API call to another system
            return Response(response)
    

    And the celery task in tasks.py:

    @app.task(bind=True)
    def celery_search_customer_task(self, req_data={}):
        api_obj = ApiCall(request=req_data)
        response = api_obj.search_customer() # this makes an API call to another system
        return response
    

    Apache Bench command:

    ab -p req_data.data -T application/x-www-form-urlencoded -l -r -n 10 -c 10 -k -H "Authorization: Token <my_token>" http://<my_host_name>/<api_end_point>/
    

    Following is the result of ab:
    Without celery Async Task

    Concurrency Level:      10
    Time taken for tests:   1.264 seconds
    Complete requests:      10
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      3960 bytes
    Total body sent:        3200
    HTML transferred:       1760 bytes
    Requests per second:    7.91 [#/sec] (mean)
    Time per request:       1264.011 [ms] (mean)
    Time per request:       126.401 [ms] (mean, across all concurrent requests)
    Transfer rate:          3.06 [Kbytes/sec] received
                            2.47 kb/s sent
                            5.53 kb/s total
    
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:      259  270  10.7    266     298
    Processing:   875  928  36.9    955     967
    Waiting:      875  926  35.3    950     962
    Total:       1141 1198  43.4   1224    1263
    
    Percentage of the requests served within a certain time (ms)
      50%   1224
      66%   1225
      75%   1231
      80%   1233
      90%   1263
      95%   1263
      98%   1263
      99%   1263
     100%   1263 (longest request)
    

    With celery Async Task

    Concurrency Level:      10
    Time taken for tests:   10.776 seconds
    Complete requests:      10
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      3960 bytes
    Total body sent:        3200
    HTML transferred:       1760 bytes
    Requests per second:    0.93 [#/sec] (mean)
    Time per request:       10775.688 [ms] (mean)
    Time per request:       1077.569 [ms] (mean, across all concurrent requests)
    Transfer rate:          0.36 [Kbytes/sec] received
                            0.29 kb/s sent
                            0.65 kb/s total
    
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:      259  271   9.2    268     284
    Processing:  1132 6128 4091.9   8976   10492
    Waiting:     1132 6127 4091.3   8975   10491
    Total:       1397 6399 4099.3   9244   10775
    
    Percentage of the requests served within a certain time (ms)
      50%   9244
      66%   9252
      75%  10188
      80%  10196
      90%  10775
      95%  10775
      98%  10775
      99%  10775
     100%  10775 (longest request)
    

    Isn't a Celery async task supposed to be faster than a synchronous one? What might I be missing here?

    Any help would be appreciated. Thanks.

    2 Comments
    • What do you mean by "without" and "with"? By using .get(), all tasks sent are awaited synchronously, blocking the rest of the execution. Can you include code examples of how you do "with async" and "without async"? Commented Apr 2, 2019 at 3:52
    • I have updated my question to demonstrate how I'm making "Synchronous" and "Asynchronous" API calls (in post() method of CustomerSearch class). While testing, I just comment or uncomment both the blocks to toggle between Async and sync. Commented Apr 2, 2019 at 7:02

    2 Answers


    I think there are multiple misconceptions in your question that should be answered.

    Isn't a Celery async task supposed to be faster than a synchronous one?

    As @Yugandhar indicates in his answer, by using something like Celery you are adding extra overhead to your processing. Instead of the same process executing the code, you actually do the following:

    • The client sends a message to the broker.
    • A worker picks up the message and executes it.
    • The worker returns the response to the broker.
    • The client picks up the response and processes it.

    As you can see, there is clearly additional overhead involved in using Celery compared to executing the code synchronously. Because of this, it is not necessarily true that an "async task is faster than a synchronous task".

    The question is then, why use asynchronous tasks? If it adds additional overhead and might slow down the execution, then what is the benefit of it? The benefit is that you don't need to await the response!

    Let's take your ApiCall() as an example. Let's say that the call itself takes 10 seconds to execute. Executing it synchronously means that nothing else can be done until the call has completed. If, for example, a form submission triggers it, the user has to watch their browser load for 10 seconds before they get a response. That is a pretty poor user experience.

    By executing it asynchronously in the background, the call itself might take 10.01 seconds (slower due to the overhead), but instead of having to wait those seconds, you can (if you choose to) return a response to the user immediately and make the user experience much better.
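
    For illustration, here is a minimal fire-and-forget sketch of the view from the question, assuming the client can accept an acknowledgement (such as a task id to poll later) instead of the final search result:

    class CustomerSearch(APIView):

        def post(self, request):
            request_dict = {}  # request parameters (elided)
            # Enqueue the task and return immediately instead of
            # blocking on .get().
            async_result = celery_search_customer_task.delay(request_dict)
            # Hand back the task id so the caller can fetch the result later.
            return Response({'task_id': async_result.id}, status=202)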

    Awaiting Results vs Callbacks

    The problem with your code example is that the synchronous and the "asynchronous" code basically do the same thing. Both of them await the result in a blocking fashion, so you don't really get the benefits of executing it asynchronously.

    By using the .get() method, you tell the AsyncResult object to await the result. This means that it blocks everything (just as if you executed it synchronously) until the Celery worker returns a response.

    task.delay()        # Async, don't await any response.
    task.delay().get()  # Blocks execution until response is returned.
    

    Sometimes this is what you want, but in other cases you don't need to wait for the response: you can finish the HTTP request and instead use a callback to handle the result of the task you triggered, as in the sketch below.
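
    As a rough sketch of the callback approach, Celery lets you pass another task signature via the link argument of apply_async. Here handle_customer_result is a hypothetical callback task, not something from the question:

    # Hypothetical callback task: runs in the worker once the search
    # task has finished, receiving its return value.
    @app.task
    def handle_customer_result(response):
        ...  # e.g. persist the response or notify the caller

    # In the view: enqueue the search, attach the callback, and return
    # without blocking on the result.
    celery_search_customer_task.apply_async(
        args=[request_dict],
        link=handle_customer_result.s(),
    )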


    4 Comments

    Thanks for such a detailed explanation! I get it now. However, according to the current flow of my system, System 1 calls my API, for which I call the API of System 2. After receiving the result from System 2 and doing further processing, I send the result back to System 1. Hence, it is necessary for me to receive the result from the Celery task. Is there any workaround for this situation? And if I understood it correctly, does it mean that using .get() always makes the task run synchronously?
    Yes, .get() always blocks the rest of the execution until the response is returned. That's the point of it. Regarding handling results, you could look into chaining tasks or using the link or link_error kwargs to pass in callbacks. There is no "right" way; it depends on what you want to do. docs.celeryproject.org/en/latest/userguide/… docs.celeryproject.org/en/latest/userguide/canvas.html#chains
    Thanks for the references, I will read those. While testing this further I found that as I increase the number of users and requests (20 users - 20 requests, 30 users - 30 requests, ...), the response time keeps increasing and the difference is large. Another reason for me to use async tasks was to make sure that increasing the number of users does not increase response time by a considerable amount. Is this again because of .get()?
    It's difficult to say exactly why, but in theory it could be that your Django web server spawns more workers that can execute code than your Celery worker does. By default Celery spawns one worker process per CPU core; this can be configured if you look through the docs.
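
    For reference, the worker pool size can be raised with the --concurrency option when starting the worker (proj below is a placeholder for the actual Celery app module):

    celery -A proj worker --loglevel=info --concurrency=8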

    Running code synchronously is straightforward blocking code on the main thread; Celery, on the other hand, works as a producer/consumer mechanism. Celery forwards the task to a broker message queue such as RabbitMQ or Redis, which adds extra processing time, and depending on where your Celery worker is running you can add network latency on top if it is not running locally. Calling delay() returns a promise that can be used to monitor the status and get the result when it's ready (which is what get() waits on). So the architecture basically becomes:

    • web
    • broker
    • worker
    • result backend

    With this much extra processing, a Celery task ends up slower than running the same code on the main thread.
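
    As a brief illustration of the "promise" part, delay() returns an AsyncResult that can be inspected without blocking (using the task and request data names from the question):

    # delay() returns an AsyncResult (the "promise").
    async_result = celery_search_customer_task.delay(request_dict)

    # Non-blocking checks against the result backend.
    async_result.ready()   # True once the worker has finished
    async_result.status    # e.g. 'PENDING', 'SUCCESS', 'FAILURE'

    # Blocking: waits (here up to 30 seconds) for the worker's result.
    response = async_result.get(timeout=30)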
