
I am trying to use Celery to run a rather expensive algorithm on one of my models. Currently, in my home/tasks.py I have:

from celery import shared_task

from .models import Post  # wherever your Post model lives


@shared_task(bind=True)
def get_hot_posts(self):
    return Post.objects.get_hot()


@shared_task(bind=True)
def get_top_posts(self):
    pass

Inside my Post model manager I have:

def get_hot(self):
    qs = (
        self.get_queryset()
        .select_related("author")
    )

    # evaluate the queryset and sort in Python by each post's hot() score
    qs_list = list(qs)
    sorted_post = sorted(qs_list, key=lambda p: p.hot(), reverse=True)

    return sorted_post

This returns a list of the posts sorted by their hot() score.

I have used django_celery_beat to set up periodic tasks, which I have configured in my settings.py:

CELERY_BEAT_SCHEDULE = {
    'update-hot-posts': {
        'task':'get_hot_posts',
        'schedule': 3600.0
    },
    'update-top-posts': {
        'task':'get_top_posts',
        'schedule': 86400
    }
}

I do not know if I am allowed to call model methods from Celery tasks, but my intention is to compute the hot and top posts on that schedule and then simply use the results in one of my views. How can I achieve this? I am not able to find out how to get the output of a task and use it in my views in order to render it in my template.

Thanks in advance!

EDIT

I am now caching the results:

settings.py:

CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "IGNORE_EXCEPTIONS": True,
            
        }
    }
}

CACHE_TTL = getattr(settings, 'CACHE_TTL', DEFAULT_TIMEOUT)

from django.core.cache import cache


@shared_task(bind=True)
def get_hot_posts(self):
    hot_posts = Post.objects.get_hot()
    cache.set("hot_posts", hot_posts, timeout=CACHE_TTL)

However, when accessing the objects in my view, cache.get returns None; it seems my tasks are not working.

@login_required
def hot_posts(request):
    posts = cache.get("hot_posts")
    context = { 'posts':posts, 'hot_active':'-active'}
    return render(request, 'home/homepage/home.html', context)

How can I check whether my tasks are running properly, and whether they are actually caching the result of the queryset method?
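For reference, a manual way to exercise this from the Django shell (using the same cache key and task names as above; .delay() needs a running worker) would be:

# python manage.py shell
from django.core.cache import cache
print(cache.get("hot_posts"))   # None means the task never wrote the key, or it expired

# trigger the task by hand and then read the key back again
from home.tasks import get_hot_posts
get_hot_posts.delay()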

EDIT: Configuration in settings.py:

BROKER_URL = 'redis://localhost:6379'
BROKER_TRANSPORT = 'redis'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_BEAT_SCHEDULE = {
    'update-hot-posts': {
        'task':'get_hot_posts',
        'schedule': 3600.0
    },
    'update-top-posts': {
        'task':'get_top_posts',
        'schedule': 86400.0
    },
    'tester': {
        'task':'tester',
        'schedule': 60.0
    }
}

I do not see any results when I go to my view and cache.get returns None. I think my tasks are not running, but I cannot find the reason.

This is what happens when I run my worker:

celery -A register worker -E --loglevel=info

 -------------- [email protected] v4.4.6 (cliffs)
--- ***** ----- 
-- ******* ---- Darwin-16.7.0-x86_64-i386-64bit 2020-07-06 01:46:36
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app:         register:0x10f3da050
- ** ---------- .> transport:   redis://localhost:6379//
- ** ---------- .> results:     redis://localhost:6379/
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: ON
--- ***** ----- 
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery
                

[tasks]
  . home.tasks.get_hot_posts
  . home.tasks.get_top_posts
  . home.tasks.tester

[2020-07-06 01:46:38,449: INFO/MainProcess] Connected to redis://localhost:6379//
[2020-07-06 01:46:38,500: INFO/MainProcess] mingle: searching for neighbors
[2020-07-06 01:46:39,592: INFO/MainProcess] mingle: all alone
[2020-07-06 01:46:39,650: INFO/MainProcess] [email protected] ready.

Also for starting up beat I use:

  celery -A register beat -l INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler
  • You'd need to cache the data somehow. You could do that with Redis or your database via the data model. Commented Jul 5, 2020 at 23:52
  • I have added caching now, but I can't figure out how to access the cached objects in my views. How can I achieve this? @schillingt Commented Jul 6, 2020 at 0:46

1 Answer


My suggestion is that you alter your model and make it taggable. Perhaps this: https://django-taggit.readthedocs.io/

Once you've done that, you can modify your Celery job that calculates hot posts. Once the new hot posts are calculated, you can remove the "hot" tag from all existing posts and then tag the newly-hot posts with it.

Then your view code can simply filter for posts with the hot tag.
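A rough sketch of that approach, assuming your existing Post manager from the question (the file layout and the "hot" tag name here are illustrative, not your actual code):

# models.py -- give Post a tag field (django-taggit)
from django.db import models
from taggit.managers import TaggableManager

class Post(models.Model):
    # ... your existing fields and custom manager ...
    tags = TaggableManager()


# home/tasks.py -- recompute hot posts and re-tag them
from celery import shared_task
from home.models import Post

@shared_task
def get_hot_posts():
    # drop the "hot" tag from everything that currently has it
    for post in Post.objects.filter(tags__name__in=["hot"]):
        post.tags.remove("hot")
    # tag the freshly computed hot posts
    for post in Post.objects.get_hot():
        post.tags.add("hot")


# views.py -- the view just filters on the tag
hot_posts = Post.objects.filter(tags__name__in=["hot"])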

EDIT

If you want to be sure that your code is actually executing, there are extensions you can use to do so. For example, the django-celery-results backend will store whatever data your @shared_task returns (usually JSON, if that's your message encoding) in the database along with a timestamp and maybe even the input args. That way you can see whether your tasks are running as desired.

https://docs.celeryproject.org/en/stable/django/first-steps-with-django.html#django-celery-results-using-the-django-orm-cache-as-a-result-backend
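A minimal sketch of that setup (it replaces your current Redis result backend; settings names as in the django-celery-results docs):

# settings.py -- store task results in the Django database
INSTALLED_APPS += ['django_celery_results']

CELERY_RESULT_BACKEND = 'django-db'   # or 'django-cache' to reuse your cache backend

After running python manage.py migrate django_celery_results, each task run should show up as a TaskResult entry in the admin.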

You might also consider django-celery-beat so that you have a nice visual way to see job schedules via the Django admin:

https://docs.celeryproject.org/en/stable/django/first-steps-with-django.html#django-celery-beat-database-backed-periodic-tasks-with-admin-interface
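The setup for that is similarly small (a sketch; you already seem to have the package installed, given your --scheduler flag):

# settings.py
INSTALLED_APPS += ['django_celery_beat']
# then run:  python manage.py migrate django_celery_beat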

EDIT 2

If you're going to use the database scheduler (highly recommended!) then you'll need to log in to the admin and add your tasks on the schedule that you want.

https://pinoylearnpython.com/wp-content/uploads/2019/04/Django-Celery-Beat-on-Admin-Site-Pinoy-Learn-Python-1024x718.jpg
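If you would rather script it than click through the admin, the same entries can be created through the django_celery_beat models. A sketch, assuming the task names are the full dotted paths the worker prints under [tasks]:

# e.g. in a data migration or a one-off shell session
from django_celery_beat.models import PeriodicTask, IntervalSchedule

hourly, _ = IntervalSchedule.objects.get_or_create(
    every=1, period=IntervalSchedule.HOURS,
)
PeriodicTask.objects.get_or_create(
    name='update-hot-posts',
    defaults={'task': 'home.tasks.get_hot_posts', 'interval': hourly},
)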

EDIT 3

In your settings.py

CELERY_BEAT_SCHEDULE = {
    'update-hot-posts': {
        'task':'get_hot_posts',
        'schedule': 3600.0
    },
    'update-top-posts': {
        'task':'get_top_posts',
        'schedule': 86400.0
    },
    'tester': {
        'task':'tester',
        'schedule': 60.0
    }
}

The third task there is called tester, which is supposed to run every 60 seconds. I don't see it defined anywhere in your tasks. Because you have attempted to schedule a task that isn't defined anywhere as a @shared_task, Celery gets confused and gives you the error messages about tester.
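If you do want a tester entry on the schedule, it needs a matching task definition; a minimal placeholder might look like:

# home/tasks.py
@shared_task
def tester():
    print("tester ran")   # appears in the worker output when the task executes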


8 Comments

Thanks, this was a great idea! However, right now I am not sure whether my tasks are running properly or not; I have run stats to see if it's working and have added the code for your reference.
So I cannot add the tasks from the settings file? Like the way I have done it now.
You can! But then you shouldn't use the database scheduler and they won't show up in the admin. It's up to you. Because you're using the database scheduler (the --scheduler argument) it's looking at the database for tasks to schedule, finding none, and as such nothing is running.
Okay, I understand! So with my current configuration, I should simply run celery -A register beat -l INFO? And then fire up the worker as well?
Yup! Or login to the admin and add the tasks that way, and then use the database scheduler. I enjoy that method more as it allows me to offload responsibility for the schedule to someone besides me. But ultimately that's minor compared to getting something working first.
