8

I'm confused about how to use multiprocessing together with the map function.

I'll try to describe briefly:

First, I have a list, for instance:

INPUT_MAGIC_DATA_STRUCTURE = [
    ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
    ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
    ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
    ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
    ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
    ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
]

I also have a function that currently parses this list using some internal logic:

def parse(api_client1, api_client2):
    for row in INPUT_MAGIC_DATA_STRUCTURE:
        parsed_repo_row = ...  # some logic with row
        OUTPUT_MAGIC_DATA_STRUCTURE.append(parsed_repo_row)

Finally, I've read that there are ways to run this in parallel instead of in a for loop.

from multiprocessing import Pool

pool = Pool(10)
pool.map(<???>, INPUT_MAGIC_DATA_STRUCTURE)

??? is the part I cannot figure out: how do I turn the body of the for row in INPUT_MAGIC_DATA_STRUCTURE loop in parse() into the first argument of pool.map(), while still passing its arguments api_client1 and api_client2?

Could you help me?

Thanks in advance.

UPD:

I've already tried:

pool = Pool(10)
pool.map(parse(magic_parser, magic_staff), INPUT_MAGIC_DATA_STRUCTURE)

However, when the interpreter reaches the second line it blocks, and only one instance of parse() runs (the logging output shows the parsed rows 1, 2, 3, 4, 5 one by one).

3
  • So, you'll have many processes running on a list modifying it? Commented Mar 12, 2017 at 16:31
  • @TimGivois Suppose that these processes will only append to the list. Is that not possible? Commented Mar 12, 2017 at 16:39
  • It is possible if you are only appending to the list, because appends are thread safe, which means that you can have x processes running concurrently modifying the list without any problems: stackoverflow.com/questions/5442910/… Commented Mar 12, 2017 at 16:42
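
A caveat on the comment above: the thread safety of list.append concerns threads inside one process. Worker processes created by multiprocessing each work on their own copy of module-level objects, so appends made in a worker do not show up in the parent's list. A minimal sketch of that behaviour, where OUTPUT and append_in_worker are hypothetical names used only for illustration:

```python
from multiprocessing import Pool

OUTPUT = []  # module-level list; every worker process has its own copy


def append_in_worker(row):
    # This append changes the list of whichever process runs the function.
    OUTPUT.append(row)
    return row


if __name__ == "__main__":
    with Pool(2) as pool:
        returned = pool.map(append_in_worker, ["a", "b", "c"])
    print(OUTPUT)    # the parent's list was not touched by the workers
    print(returned)  # the return values are what travels back to the parent
```

This is why collecting the return value of pool.map() is usually simpler than appending to a shared list.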

2 Answers

9

Put (some logic with row) in a function:

def row_logic(row):
    result = ...  # some logic with row
    return result

Pass the function to Pool.map:

pool = Pool(10)
pool.map(row_logic, INPUT_MAGIC_DATA_STRUCTURE)
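
To also pass the extra arguments from the question, one option is functools.partial, which binds api_client1 and api_client2 up front so that pool.map() only has to supply the row. Note that pool.map(parse(magic_parser, magic_staff), ...) from the update calls parse() right away in the parent process and hands its return value to map, which is why the rows were processed one by one. The sketch below is only an illustration under assumptions: parse_row, its placeholder body, and the string stand-ins for the clients are hypothetical, and real client objects would need to be picklable to be sent to the workers.

```python
from functools import partial
from multiprocessing import Pool

INPUT_ROWS = [
    ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
    ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
]


def parse_row(api_client1, api_client2, row):
    # Placeholder for the real per-row logic (hypothetical):
    # join the first three fields into one string.
    return "/".join(row[:3])


if __name__ == "__main__":
    # Bind the two extra arguments; the resulting callable takes only `row`.
    worker = partial(parse_row, "client1", "client2")
    with Pool(4) as pool:
        # map() returns the results in the same order as the input rows.
        output = pool.map(worker, INPUT_ROWS)
    print(output)
```

The list returned by pool.map() plays the role of OUTPUT_MAGIC_DATA_STRUCTURE, so the workers do not need to append to anything themselves.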

-1

Well, in Python it's not that easy. You need to map your rows to a per-row parse function. Have a look at this gist: https://gist.github.com/baojie/6047780

from multiprocessing import Process

def parse_row(row):
    ...  # some logic with row

def dispatch_job(rows, parse_row):
    # Start one process per row.
    for row in rows:
        p = Process(target=parse_row, args=(row,))
        p.start()
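
The snippet above starts the processes but never waits for them or collects their results. One possible way to do both, sketched here under assumptions, is to hand each process a multiprocessing.Queue and join the processes afterwards. The extra results parameter and the placeholder parsing logic are hypothetical additions, not part of the original answer, and the results arrive in completion order rather than input order.

```python
from multiprocessing import Process, Queue


def parse_row(row, results):
    # Placeholder per-row logic (hypothetical); put the result on the queue.
    results.put("/".join(row[:3]))


def dispatch_job(rows):
    results = Queue()
    procs = [Process(target=parse_row, args=(row, results)) for row in rows]
    for p in procs:
        p.start()
    # Read one result per process before joining, so that no child
    # stays blocked while flushing its data to the queue.
    parsed = [results.get() for _ in procs]
    for p in procs:
        p.join()
    return parsed


if __name__ == "__main__":
    rows = [
        ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
        ['https://github.com', 'Owner', 'Repo', '', '', '0', '0'],
    ]
    print(dispatch_job(rows))
```

Starting one process per row gets expensive for long lists; a Pool reuses a fixed number of workers instead.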
