
In most code examples of implementing __iter__ in Python, I see them returning self instead of another object. I know this is the required behavior for an iterator, but for an object that is an iterable, shouldn't it return something other than self?

My reasoning (GIL aside, and assuming the work is not CPU-bound):

  • If multiple threads iterate over the same container and iteration progress is tracked on the iterable itself (via its __next__ method), both threads write to that shared state, so each loop only gets part of the data.
  • Once your iterator/iterable object has been iterated once, you need to reset it. This can easily be done in __iter__ but it seems to go against the grain of how I've seen other languages implement iterators. I suppose it doesn't matter if it's idiomatic Python.
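Both points can be reproduced without threads. The following is a minimal sketch, assuming a hypothetical SelfIterating class (not taken from any library) whose __iter__ returns self and does not reset; two consumers end up sharing one position, and a second pass yields nothing.

```python
class SelfIterating:
    """Hypothetical container that acts as its own iterator."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0  # iteration progress is stored on the container itself

    def __iter__(self):
        return self  # every caller receives the same object and the same _pos

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        value = self._items[self._pos]
        self._pos += 1
        return value


shared = SelfIterating([1, 2, 3, 4])
first = iter(shared)
second = iter(shared)  # the very same object as `first`

seen_by_first = [next(first), next(first)]  # consumes 1 and 2
seen_by_second = list(second)               # only sees what is left: 3 and 4
second_pass = list(shared)                  # exhausted, nothing without a reset
```

With threads the split between consumers would additionally depend on scheduling, which is why the single-threaded version is used here.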

An example I have in mind is a class called Search that uses an HTTP REST API to initialize a search. Once the search is initialized, there are several ways to expose the results:

  • Make the Search object iterable with __iter__ returning self. NOT thread safe.
  • Have Search implement an __iter__ that returns a different object, which tracks iteration and queries results from the server as it needs them. IS thread safe.
  • Have a results method return an iterator/iterable SearchResults object that is NOT thread safe, but the method to get it IS thread safe.
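The second option might look roughly like the sketch below. Everything here is an assumption made for illustration: the SearchResultsIterator name, the page-based fetch_page callable, and the in-memory fake_fetch_page that stands in for the real HTTP REST call.

```python
class SearchResultsIterator:
    """Hypothetical iterator that owns its own paging state."""

    def __init__(self, fetch_page):
        self._fetch_page = fetch_page  # assumed callable: page number -> list of results
        self._page_number = 0
        self._buffer = []
        self._done = False

    def __iter__(self):
        return self  # an iterator returns itself, as the protocol requires

    def __next__(self):
        if not self._buffer:
            if self._done:
                raise StopIteration
            page = self._fetch_page(self._page_number)
            if not page:  # an empty page is assumed to mean "no more results"
                self._done = True
                raise StopIteration
            self._buffer = list(page)
            self._page_number += 1
        return self._buffer.pop(0)


class Search:
    """Hypothetical iterable: holds the query, hands out fresh iterators."""

    def __init__(self, fetch_page):
        self._fetch_page = fetch_page

    def __iter__(self):
        return SearchResultsIterator(self._fetch_page)


# Stand-in for the server so the sketch runs without a network.
PAGES = [["a", "b"], ["c"]]


def fake_fetch_page(page_number):
    return PAGES[page_number] if page_number < len(PAGES) else []


search = Search(fake_fetch_page)
first_pass = list(search)
second_pass = list(search)  # a new iterator, so the results come back again
```

Each call to iter(search) gets independent paging state, so one iterator per thread does not interfere with another. A single SearchResultsIterator shared between threads would still need a lock.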

Overall, what's the most robust way to implement an iterator in idiomatic Python that is thread safe? Maybe I should even abstract the HTTP REST API behind a cursor object, like many database libraries do (which would be like option 3).

  • It depends what you want to do. If you want to have multiple independent iterators over the same data at the same time, then you don't want to make the iterable its own iterator. Commented Nov 6, 2015 at 18:00
  • Where are you seeing these examples? Making an object its own iterator is always a bad idea for the reasons you give. Commented Nov 6, 2015 at 18:01

1 Answer


The __iter__ method needs to return an iterator. That can be either the same object (self, in which case self needs a __next__ method) or another object (which will itself have both __iter__ and __next__).

In just about all non-trivial use cases, a different and dedicated object should be returned for the iterator as it makes the whole process much simpler, easier to reason about, more likely to be thread-safe, etc.
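One common way to get a dedicated iterator without writing a second class is to make __iter__ a generator function, since every call to a generator function creates a new generator object. The sketch below assumes a hypothetical Container class and two worker threads; it only shows that iteration state is not shared, and concurrent mutation of the underlying data would still need its own synchronization.

```python
import threading


class Container:
    """Hypothetical iterable whose __iter__ is a generator function."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        # Each call returns a fresh generator with its own position.
        yield from self._items


container = Container(range(5))
results = {}


def worker(name):
    # Every thread gets its own iterator from the shared container.
    results[name] = list(container)


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After both threads finish, each entry in results holds the full sequence, because neither loop consumed items from the other.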
