I have two Python generators, say:

1) txn_gen, which yields dictionaries like

{'id': 1,'ref_no': 4323453536, 'amt': 678.00, 'txn_date': '12-11-2019'}
.
.
.
{'id': 10000000, 'ref_no': 8523118426, 'amt': 98788.00, 'txn_date': '12-11-2019'}

2) acc_gen, which yields dictionaries like

{'ref_no': 4323453536, 'acc_no': 123456789, 'amt': 98789.00}
.
.
.
{'ref_no': 8523118426, 'acc_no': 123456789, 'amt': 45654567.00}

I want to loop over txn_gen and, for each record, loop over acc_gen to find the records with a matching ref_no. I am looping like this:

for gen1 in txn_gen:
    for gen2 in acc_gen:
        # The records are dictionaries, so compare them by key, not by position
        if gen1['ref_no'] == gen2['ref_no']:
            print(gen2)

But I am getting only one matching value, i.e. the first match. I am expecting millions of matches.

I also want to improve the performance, as I have millions of records.

2 Answers

A generator can only be iterated once. After the inner loop has consumed all the values in acc_gen for the first value of txn_gen, acc_gen is exhausted, so for every later value of txn_gen the inner loop has nothing left to iterate over.
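A minimal sketch of that behaviour, using a hypothetical numbers() generator that is not part of the question:

```python
def numbers():
    # Hypothetical generator, used only to illustrate exhaustion.
    yield from (1, 2, 3)

gen = numbers()
first_pass = list(gen)   # consumes every value the generator produces
second_pass = list(gen)  # the same generator object is now exhausted

print(first_pass)
print(second_pass)
```

The second pass comes back empty, which is what happens to the inner loop over acc_gen after the first outer iteration.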

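For this kind of analysis, you can iterate through txn_gen once and save each ref_no in a hash table (a set or dict in Python), and then iterate through acc_gen once, looking up each record's ref_no in that table. This needs only one pass over each generator instead of a nested loop.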
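A rough sketch of the hash-table approach, assuming the set of ref_no values from txn_gen fits in memory. The function name matching_accounts and the small sample generators are made up for illustration; the real txn_gen and acc_gen would be passed in instead.

```python
def matching_accounts(txn_gen, acc_gen):
    """Yield each record from acc_gen whose ref_no also appears in txn_gen."""
    # Single pass over txn_gen: collect the ref_no values in a set,
    # which gives average constant-time membership tests.
    txn_refs = {txn['ref_no'] for txn in txn_gen}
    # Single pass over acc_gen: keep only the records with a known ref_no.
    for acc in acc_gen:
        if acc['ref_no'] in txn_refs:
            yield acc


# Illustrative sample data (the last account record has no matching txn).
sample_txns = iter([
    {'id': 1, 'ref_no': 4323453536, 'amt': 678.00, 'txn_date': '12-11-2019'},
    {'id': 2, 'ref_no': 8523118426, 'amt': 98788.00, 'txn_date': '12-11-2019'},
])
sample_accs = iter([
    {'ref_no': 4323453536, 'acc_no': 123456789, 'amt': 98789.00},
    {'ref_no': 8523118426, 'acc_no': 123456789, 'amt': 45654567.00},
    {'ref_no': 1111111111, 'acc_no': 987654321, 'amt': 10.00},
])

matches = list(matching_accounts(sample_txns, sample_accs))
for acc in matches:
    print(acc)
```

If the txn fields are needed alongside each match, a dict keyed by ref_no could be used in place of the set, at a higher memory cost.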


Once you have consumed a generator you can't iterate it again. One way is to convert them (or at least the inner one) to a list if the memory cost is acceptable:

acc_gen = list(acc_gen)
for gen1 in txn_gen:
   ...

If the memory cost is not acceptable, you must re-create acc_gen inside the outer loop, before each pass of the inner for statement, since an exhausted generator cannot be rewound.
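One way to do that, sketched under the assumption that the generators come from functions you can call again (make_txn_gen and make_acc_gen below are hypothetical stand-ins for however the real generators are built):

```python
def make_txn_gen():
    # Hypothetical factory: each call returns a fresh generator.
    yield {'id': 1, 'ref_no': 4323453536, 'amt': 678.00, 'txn_date': '12-11-2019'}
    yield {'id': 2, 'ref_no': 8523118426, 'amt': 98788.00, 'txn_date': '12-11-2019'}


def make_acc_gen():
    # Hypothetical factory: each call returns a fresh generator.
    yield {'ref_no': 4323453536, 'acc_no': 123456789, 'amt': 98789.00}
    yield {'ref_no': 8523118426, 'acc_no': 123456789, 'amt': 45654567.00}


matches = []
for gen1 in make_txn_gen():
    # Build a new acc generator for every txn record, so the inner
    # loop always starts from the first account record.
    for gen2 in make_acc_gen():
        if gen1['ref_no'] == gen2['ref_no']:
            matches.append(gen2)

print(matches)
```

This keeps memory use low, but the inner source is re-read once per txn record, so the running time grows with the product of the two sizes. With millions of records on each side that is likely to be far slower than a lookup-based approach.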
