I want to build a product that can perform some Internet scans (in Python) to collect various kinds of data.
I want to design it around tasks that perform these collection jobs.
Multiple scans can run in parallel on different inputs, so the same task type may be instantiated several times, each operating on its own input.
I'm wondering which architecture would fit best, and which technologies to use.
I thought of using RabbitMQ to queue the tasks and Redis to store the inputs.
The initial inputs trigger the scan, and each task then emits output that may become the input for other tasks.
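Roughly, I'm picturing a worker like the minimal sketch below (using pika and redis-py; the queue name `scan_tasks`, the key layout, and `run_scan` are just placeholders for illustration, not an actual implementation):

```python
import json
import pika
import redis

# Redis holds the inputs/outputs that tasks operate on.
store = redis.Redis(host="localhost", port=6379, decode_responses=True)

def run_scan(task_type, input_value):
    """Placeholder for the actual scanning logic; returns derived values."""
    # e.g. a DNS task might return resolved IPs that feed a port-scan task
    return [f"{input_value}:derived"]

def handle_task(channel, method, properties, body):
    task = json.loads(body)                      # {"type": ..., "input_key": ...}
    input_value = store.get(task["input_key"])   # fetch this task's input from Redis
    for result in run_scan(task["type"], input_value):
        out_key = f"data:{task['type']}:{result}"
        store.set(out_key, result)               # store output for downstream tasks
        follow_up = {"type": "next_scan", "input_key": out_key}
        channel.basic_publish(                   # the output becomes a new task's input
            exchange="",
            routing_key="scan_tasks",
            body=json.dumps(follow_up),
        )
    channel.basic_ack(delivery_tag=method.delivery_tag)

# RabbitMQ holds the task messages; multiple workers can consume the same queue.
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="scan_tasks", durable=True)
channel.basic_consume(queue="scan_tasks", on_message_callback=handle_task)
channel.start_consuming()
```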
What do you think of this design? Can it be improved? Are there other technologies that would fit better?