In most code examples that implement `__iter__` in Python, I see it return `self` instead of another object. I know this is the required behavior for an iterator, but for an object that is an iterable, shouldn't it return something other than `self`?
My reasoning (GIL aside, and assuming the work is not CPU-bound):
- If you have multiple threads iterating the same container, tracking your iterator progress in your iterable class (via the `__next__` method) will have values written from both threads, thus each loop will only get part of the data.
- Once your iterator/iterable object has been iterated once, you need to reset it. This can easily be done in `__iter__`, but it seems to go against the grain of how I've seen other languages implement iterators. I suppose it doesn't matter if it's idiomatic Python.
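
To make the two points above concrete, here is a minimal sketch. The class names `SelfIterating` and `FreshIterator` are made up for illustration. The two consumers are interleaved in a single thread with `zip` rather than run in real threads, so the outcome is deterministic; the assumption is that the same shared position is what would cause trouble with threads.

```python
class SelfIterating:
    """Iterable that is also its own iterator; progress is stored on the container."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def __iter__(self):
        self._pos = 0  # reset so the object can be looped over again
        return self

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        value = self._items[self._pos]
        self._pos += 1
        return value


class FreshIterator:
    """Iterable whose __iter__ returns a new iterator; progress is stored on that iterator."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


shared = SelfIterating([1, 2, 3, 4])
shared_pairs = list(zip(shared, shared))  # both consumers advance the same _pos

fresh = FreshIterator([1, 2, 3, 4])
fresh_pairs = list(zip(fresh, fresh))     # each consumer has its own position

print(shared_pairs)  # [(1, 2), (3, 4)]
print(fresh_pairs)   # [(1, 1), (2, 2), (3, 3), (4, 4)]
```

With the shared position, each consumer sees only half of the items, which is the "only get part of the data" effect described above.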
An example I have in mind is a class called `Search` that uses an HTTP REST API to initialize a search. Once it's done, there are several options:
- Make the `Search` object iterable with `__iter__` returning `self`. NOT thread safe.
- Have `Search` implement an `__iter__` that returns a different object, which tracks iteration and queries results from the server as it needs them. IS thread safe.
- Have a `results` method return an iterator/iterable `SearchResults` object that is NOT thread safe, but the method to get it IS thread safe.
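
A rough sketch of the second and third options, under some loud assumptions: `fetch_page` is a hypothetical callable standing in for the HTTP REST call (assumed signature `(offset, limit) -> list`), `fake_fetch_page` and `DATA` exist only for the demonstration, and the backend call itself is assumed to be safe to invoke from several threads. Writing `__iter__` as a generator function means every loop gets its own generator, so the paging offset is never shared.

```python
import threading


class Search:
    """Hypothetical search handle; `fetch_page` stands in for the REST API call."""

    def __init__(self, fetch_page, page_size=2):
        self._fetch_page = fetch_page  # assumed signature: (offset, limit) -> list
        self._page_size = page_size

    def __iter__(self):
        # Generator function: each call creates a new generator, and `offset`
        # is local to it, so separate loops do not share progress.
        offset = 0
        while True:
            page = self._fetch_page(offset, self._page_size)
            if not page:
                return
            yield from page
            offset += len(page)

    def results(self):
        """Third option: explicitly hand out a fresh iterator meant for one consumer."""
        return iter(self)


# Fake backend, used only for this demonstration.
DATA = ["a", "b", "c", "d", "e"]


def fake_fetch_page(offset, limit):
    return DATA[offset:offset + limit]


search = Search(fake_fetch_page)
collected = {}


def consume(name):
    collected[name] = list(search)  # each thread iterates with its own generator


threads = [threading.Thread(target=consume, args=(name,)) for name in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(collected["t1"] == DATA and collected["t2"] == DATA)  # True
```

Each individual generator here is still meant for a single thread; what is safe to share is the `Search` object that produces them.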
Overall, what's the most robust way to implement a thread-safe iterator in idiomatic Python? Maybe I should even abstract the HTTP REST API into a cursor object, as many database libraries do (which would be like the third option above).
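
A related case is when several threads are meant to share a single pass over a cursor, with each result going to exactly one thread. One pattern for that is to serialize `__next__` with a lock. This is only a sketch: `LockedIterator` and `numbers` are hypothetical names, and the generator stands in for a server-backed cursor.

```python
import threading


class LockedIterator:
    """Wraps an iterator so that only one thread at a time can advance it."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # The lock is released even when StopIteration propagates.
        with self._lock:
            return next(self._it)


def numbers():
    # Stand-in for a cursor that produces results lazily.
    yield from range(1000)


shared_cursor = LockedIterator(numbers())
seen = []
seen_lock = threading.Lock()


def worker():
    for item in shared_cursor:
        with seen_lock:
            seen.append(item)


workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(sorted(seen) == list(range(1000)))  # True
```

Here the iterator does return `self` from `__iter__`, but the shared state is protected, and the split of items between threads is intentional rather than accidental.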