2

I am working on an I/O-bound project.

I have 3 dependent tasks:
1. Scrape a site and extract the main content (removing comments, ads, etc.).
2. As soon as step 1 completes, send the data to a summarizer.
3. As soon as step 2 completes, call a view and render a page.

I know Python and Django at the moment. What technologies would you recommend for this project? (I know that Python + Twisted or node.js are ideal for I/O-bound projects.)

2 Answers

6

If you're already using Python, you're probably better off sticking with a Python library, especially since there are so many powerful asynchronous Python libraries. Node.js is fine, but there is no need to switch between Python and JavaScript.

That said, your question is very vague. You can absolutely use Twisted, and it will probably do what you want just fine as long as you learn the API well enough. Other asynchronous options include gevent, a coroutine-based networking library, and Tornado, a web server and framework built on non-blocking I/O.
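To make the three dependent steps from the question concrete, here is a minimal sketch. It uses the standard library's `asyncio` rather than Twisted, gevent, or Tornado, purely so that it runs without extra installs. The functions `scrape`, `summarize`, and `render` are hypothetical placeholders, not anything from the question's actual project.

```python
import asyncio

async def scrape(url):
    # Hypothetical stand-in for a network fetch plus content extraction.
    await asyncio.sleep(0.01)
    return f"main content of {url}"

async def summarize(text):
    # Hypothetical stand-in for a call to a summarizer.
    await asyncio.sleep(0.01)
    return text.upper()

async def render(summary):
    # Hypothetical stand-in for rendering a page from the summary.
    return f"<p>{summary}</p>"

async def pipeline(url):
    # Each stage starts as soon as the previous one finishes.
    content = await scrape(url)
    summary = await summarize(content)
    return await render(summary)

async def run_all(urls):
    # Pipelines for different URLs overlap while each one waits on I/O.
    return await asyncio.gather(*(pipeline(u) for u in urls))

pages = asyncio.run(run_all(["http://example.com/a", "http://example.com/b"]))
```

The same shape should carry over to the other libraries: within one request the steps run in order, while separate requests interleave whenever one of them is waiting on the network.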

There's also Celery, a distributed task queue for running jobs asynchronously in background workers. It may or may not fit what you want.
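The idea behind queue-based processing can be sketched with only the standard library. This is not Celery's API: Celery uses a message broker and separate worker processes, whereas this stand-in uses an in-process `queue.Queue` and one thread. The `summarize` function and the job tuples are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()
results = {}

def summarize(text):
    # Hypothetical placeholder: keep only the first sentence.
    return text.split(".")[0] + "."

def worker():
    # Pull jobs off the queue until a None sentinel arrives.
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        job_id, text = job
        results[job_id] = summarize(text)
        jobs.task_done()

thread = threading.Thread(target=worker, daemon=True)
thread.start()

# A producer (for example a request handler) enqueues work and moves on.
jobs.put((1, "First point. Second point."))
jobs.put((2, "Only one sentence."))
jobs.put(None)   # tell the worker to stop
jobs.join()      # block until every queued item has been processed
thread.join()
```

The point of the pattern is that the code enqueuing a job does not wait for the result, so a slow scrape or summary does not hold up the web request that triggered it.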

I recommend you do a lot of research, read the documentation of the libraries above, and decide which fits your project best. If you have more specific questions, you can ask in each library's IRC channel or post a clearer question here.


1 Comment

+1 for suggesting that I do a lot of research :) I discovered django-socketio.
1

I ended up using django-socketio.

https://github.com/stephenmcd/django-socketio

If WebSockets are not supported, Socket.IO falls back to long polling.
