I'm interested in using Celery for an app I'm working on. It all seems pretty straightforward, but I'm a little confused about what I need to do if I have multiple load-balanced application servers. All of the documentation assumes that the broker will be on the same server as the application. Currently, all of my application servers sit behind an Amazon ELB, and tasks need to be able to come from any one of them.

This is what I assume I need to do:

  • Run a broker server on a separate instance
  • Configure each application instance to connect to that broker server
  • Each application instance will also be a Celery worker (running celeryd)?

My only beef with that is: what happens if my broker instance dies? Can I run two broker instances somehow so I'm safe if one goes down?

Any tips or information on what to do in a setup like mine would be greatly appreciated. I'm sure I'm missing something or not understanding something.

3 Answers

3

For future reference, for those who do prefer to stick with RabbitMQ...

You can create a RabbitMQ cluster from 2 or more instances. Add those instances to your ELB and point your celeryd workers at the ELB. Just make sure you connect the right ports and you should be all set. Don't forget to allow your RabbitMQ machines to talk among themselves to run the cluster. This works very well for me in production.
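As a rough illustration of pointing workers at the load balancer, here is a minimal sketch of a Celery settings module. The hostname, credentials, and vhost are placeholders, and `make_broker_url` is a hypothetical helper rather than anything Celery provides. `BROKER_URL` is the uppercase setting name used by Celery 3.x, as far as I know; newer releases spell it `broker_url`.

```python
# celeryconfig.py (sketch): every worker and app server uses the same broker URL.
from urllib.parse import quote


def make_broker_url(host, user, password, vhost, port=5672):
    """Build an AMQP URL, percent-encoding parts that may contain special characters."""
    return "amqp://{user}:{password}@{host}:{port}/{vhost}".format(
        user=quote(user, safe=""),
        password=quote(password, safe=""),
        host=host,
        port=port,
        vhost=quote(vhost, safe=""),
    )


# Assumption: this DNS name is the ELB in front of the RabbitMQ cluster,
# listening on TCP 5672 and forwarding to the same port on each node.
BROKER_URL = make_broker_url(
    host="my-rabbit-elb.example.com",
    user="usr",
    password="secret",
    vhost="vhost",
)
```

Keeping the URL in one shared settings module means the application servers that publish tasks and the workers that consume them cannot drift apart.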

One exception here: if you need to schedule tasks, you need a celerybeat process. For some reason, I wasn't able to connect celerybeat to the ELB and had to connect it to one of the instances directly. I opened an issue about it and it is supposed to be resolved (I haven't tested it yet). Keep in mind that only one celerybeat process can run at a time, so that is already a single point of failure.

9 Comments

How did you configure so that the ELB doesn't kill off the connection after 60 seconds? [2013-07-17 11:03:40,395: ERROR/MainProcess] consumer: Cannot connect to amqp://usr@elburl:5672/vhost: Socket closed. Trying again in 2.00 seconds...
I'm using a TCP health check, not HTTP, against the rabbitmq port. Works well for me.
I switched to TCP health check against the rabbitmq port. My celery worker still times out at exactly 1 minute when the broker_url is set against the ELB. How did you configure celery? I tried using BROKER_HEARTBEAT but no luck there either. :/
Using BROKER_HOST that points to the ELB public DNS name. I should point out that I'm using Celery 2.4.x, haven't tested it with 3.x.
Yeah, that's what I'm doing, but with 3.0. I noticed BROKER_HEARTBEAT is active under 3.x; maybe that has something to do with it.
1

You are correct in all points.

To make the broker reliable, set up a clustered RabbitMQ installation, as described here: http://www.rabbitmq.com/clustering.html
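Once a cluster exists, the clients still need to know about more than one node. A minimal sketch, assuming your Celery version accepts several broker URLs of the same transport (as far as I know, recent versions take either a list or a semicolon-separated string and fail over between them). The node hostnames and credentials below are hypothetical.

```python
# celeryconfig.py (sketch): list every cluster node so a client can fail over.
from urllib.parse import quote

# Assumption: two clustered RabbitMQ nodes reachable under these names.
RABBIT_NODES = ["rabbit1.internal.example.com", "rabbit2.internal.example.com"]


def node_url(host, user="usr", password="secret", vhost="vhost", port=5672):
    """Return the AMQP URL for a single cluster node."""
    return "amqp://{}:{}@{}:{}/{}".format(
        quote(user, safe=""), quote(password, safe=""), host, port, quote(vhost, safe="")
    )


# List form: one URL per node.
BROKER_URL = [node_url(host) for host in RABBIT_NODES]

# Equivalent single-string form, with URLs separated by semicolons.
BROKER_URL_STRING = ";".join(BROKER_URL)
```

Check the configuration reference for your Celery version before relying on this, since failover behaviour has changed between releases.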

1 Comment

Thanks for your answer and the link to the clustering information! However, I decided to use Amazon's SQS service for my broker instead of dealing with running my own rabbitmq cluster. It turns out Celery has built-in SQS support. See this SO question: stackoverflow.com/questions/8048556/celery-with-amazon-sqs
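For anyone taking the SQS route, a minimal sketch of what the settings might look like. The access key, secret key, and region are made-up placeholders, and `sqs_broker_url` is a hypothetical helper. The credentials are percent-encoded because AWS secret keys can contain characters such as `/` and `+` that would otherwise break the URL.

```python
# celeryconfig.py (sketch): use Amazon SQS as the Celery broker.
from urllib.parse import quote


def sqs_broker_url(access_key, secret_key):
    """Build an sqs:// broker URL with percent-encoded credentials."""
    return "sqs://{}:{}@".format(quote(access_key, safe=""), quote(secret_key, safe=""))


# Placeholder credentials, for illustration only.
BROKER_URL = sqs_broker_url("AKIAEXAMPLE", "abc/def+ghi")

# Assumption: the queues live in us-east-1; adjust to your own region.
BROKER_TRANSPORT_OPTIONS = {"region": "us-east-1"}
```

With SQS there is no broker instance of your own to keep alive, which sidesteps the original single-point-of-failure concern.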
1

Celery beat also doesn't have to be a single point of failure if you run it on every worker node with:

https://github.com/ybrs/single-beat
