I have a long-running Celery task which iterates over an array of items and performs some actions.
The task should somehow report which item it is currently processing, so the end user is aware of the task's progress.
At the moment my Django app and Celery sit together on one server, so I am able to use Django's models to report the status, but I am planning to add more workers which live away from Django, so they can't reach the DB.
Right now I see a few solutions:
- Store intermediate results manually in some storage, like Redis or MongoDB, making them available over the network. This worries me a little, because if, for example, I use Redis, then I have to keep the code on the Django side (which reads the status) in sync with the Celery task (which writes the status), so that they use the same keys.
- Report the status back to Django from Celery using REST calls, like `PUT http://django.com/api/task/123/items_processed`.
- Maybe use the Celery event system and create events like `Item processed`, on which Django updates the counter.
- Create a separate worker which runs on the same server as Django and holds a task which only increases the items-processed count, so when the main task is done with an item it issues `increase_messages_proceeded_count.delay(task_id)`.
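To make the first option concrete, here is roughly what I have in mind for the Redis variant. The key layout and field names are placeholders I made up; both sides would have to agree on them, which is exactly the sync problem I mentioned:

```python
# Shared key convention -- both the Django reader and the Celery writer
# must use this exact format (this is the coupling that worries me).
PROGRESS_KEY = "task_progress:{task_id}"

def report_progress(client, task_id, processed, total):
    # Called from the Celery task after each item; `client` is a redis.Redis.
    client.hset(PROGRESS_KEY.format(task_id=task_id),
                mapping={"processed": processed, "total": total})

def read_progress(client, task_id):
    # Called from the Django view polling for status; None if nothing reported yet.
    data = client.hgetall(PROGRESS_KEY.format(task_id=task_id))
    if not data:
        return None
    # redis-py returns bytes keys/values unless decode_responses=True is set.
    return {k.decode() if isinstance(k, bytes) else k: int(v)
            for k, v in data.items()}
```

The functions take the client as a parameter so the same code can run against any Redis connection the worker or the web process happens to have.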
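For the REST option, this is a sketch of the kind of call the worker would make after each item, using only the standard library. The endpoint shape mirrors my `PUT` example above; the JSON payload format is just an assumption:

```python
import json
import urllib.request

def build_progress_request(task_id, items_processed):
    # Endpoint shape follows the PUT example above; payload format is made up.
    url = f"http://django.com/api/task/{task_id}/items_processed"
    return urllib.request.Request(
        url,
        data=json.dumps({"items_processed": items_processed}).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

def report_item_processed(task_id, items_processed):
    # Called from the worker after each item; blocks for up to 5 seconds,
    # so the long-running task pays a network round-trip per item.
    with urllib.request.urlopen(build_progress_request(task_id, items_processed),
                                timeout=5) as resp:
        return resp.status
```

The per-item round-trip latency is one of the hidden costs I'm wondering about with this approach.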
Are there any other solutions, or hidden problems with the ones I mentioned?