
I am trying to learn Python's itertools (love it so far!), but I am stuck on a problem. I have the following two lists and a search string:

a=["http://www.xyz.com/jhuh7287", "http://www.hjuk.com/kashjh716", "http://www.psudjg.com/9279jshkoh", "http://www.xyz.com/jhuh7287",  "http://www.xyz.com/9289jhjbg"]
data=["k","some small string here", "so med string here", "some string here","l"]
tempstring="http://www.xyz.com"

Initially, what I wanted was to remove data[i] for all strings which are below a certain length, and subsequently delete the corresponding entries in a. For this, I used something along the lines of:

iselectors = [x is not len(str(x))>1 for x in data]
data=list(itertools.compress(data, iselectors))
a=list(itertools.compress(a, iselectors))

...which works well. Now I need to add another condition to iselectors: an entry should be kept only when tempstring is in a[i] and len(str(x))>1.

So, I have tried something like:

iselectors = [tempstring in a and x is not len(str(x))>1 for x in data]

...but I am not sure this is right, since I do not think I am iterating over the entire a when I use "tempstring in a"

Any guidance would be much appreciated. Thanks.

2 Answers

Easiest way is to work it through:

>>> pprint(zip(data, a))
[('k', 'http://www.xyz.com/jhuh7287'),
 ('some small string here', 'http://www.hjuk.com/kashjh716'),
 ('so med string here', 'http://www.psudjg.com/9279jshkoh'),
 ('some string here', 'http://www.xyz.com/jhuh7287'),
 ('l', 'http://www.xyz.com/9289jhjbg')]

>>> [(av, dv) for av, dv in zip(a, data) if len(dv) > 1 and tempstring in av]
[('http://www.xyz.com/jhuh7287', 'some string here')]

So with a bit of refactoring:

# izip is itertools.izip (Python 2); on Python 3 use the built-in zip
selectors = [len(dv) > 1 and tempstring in av for av, dv in izip(a, data)]
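To make the whole flow concrete, here is a minimal Python 3 sketch, assuming (as the question describes) that the length test applies to the entries of data and the substring test to the matching entries of a. The names data_kept and a_kept are only illustrative. The selectors are built as a list rather than a generator so they can be passed to compress twice.

```python
from itertools import compress

a = ["http://www.xyz.com/jhuh7287", "http://www.hjuk.com/kashjh716",
     "http://www.psudjg.com/9279jshkoh", "http://www.xyz.com/jhuh7287",
     "http://www.xyz.com/9289jhjbg"]
data = ["k", "some small string here", "so med string here",
        "some string here", "l"]
tempstring = "http://www.xyz.com"

# One boolean per position: keep the pair only if the data string is longer
# than one character AND the matching URL contains tempstring.
selectors = [len(d) > 1 and tempstring in url for url, d in zip(a, data)]

# Apply the same selectors to both lists so they stay aligned.
data_kept = list(compress(data, selectors))
a_kept = list(compress(a, selectors))

print(selectors)  # [False, False, False, True, False]
print(data_kept)  # ['some string here']
print(a_kept)     # ['http://www.xyz.com/jhuh7287']
```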

And since @mgilson deleted his answer, which made a key point that I hope the OP has taken on board, I'm going to re-post his wording here:

Also, "is" is used to compare object identities. While this check works for small integers in CPython (1 is len(str(1))), it's not guaranteed to work with other Python implementations (nor is it guaranteed to work in CPython in the future). I think you just want len(str(x))>1.
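A small sketch of the difference, using the question's data list. Python reads x is not len(str(x))>1 as a chained comparison, (x is not len(str(x))) and (len(str(x)) > 1). For string items the identity half is always true, because a string is never the same object as an integer, so only the length test does any work. The names with_is, plain, p and q are just for illustration.

```python
data = ["k", "some small string here", "so med string here",
        "some string here", "l"]

# The question's expression: a chained comparison whose identity part
# is always True for strings, so it adds nothing.
with_is = [x is not len(str(x)) > 1 for x in data]

# The plain length test gives the same result and says what is meant.
plain = [len(str(x)) > 1 for x in data]
print(with_is == plain)  # True

# Identity versus equality on two distinct but equal objects.
p = [1, 2]
q = [1, 2]
print(p == q)  # True: same value
print(p is q)  # False: different objects
```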


4 Comments

Thanks a lot for this! I got it now!
@johnj welcome - don't forget to heed mgilson's warning about the use of is - you don't want to be using it for this case
For some stupid reason, I like using the keyword "is". Why is it that it will not work on future CPython implementations? Is there a reference for this? Thanks a lot for your time.
@JohnJ I don't know of any off-hand (but I'm sure you can google it), but it's just an implementation issue that it works in CPython... Generally, the spec. of is says that it will return True if the objects are the same, it's just a fluke that sometimes that occurs when the value of the object is the same... Don't worry about it, and use it for what it's meant for - otherwise use ==

I think you just need to iterate over both at the same time

iselectors = [len(str(x))>1 and tempstring in y for x,y in zip(data,a)]
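As a rough illustration of why pairing the lists matters: tempstring in a tests whether some element of a equals tempstring exactly, not whether each URL contains it. The sketch below contrasts the two, then shows one possible alternative to compress (filtering the zipped pairs directly). The names whole_list_check, per_item and kept are hypothetical.

```python
a = ["http://www.xyz.com/jhuh7287", "http://www.hjuk.com/kashjh716",
     "http://www.psudjg.com/9279jshkoh", "http://www.xyz.com/jhuh7287",
     "http://www.xyz.com/9289jhjbg"]
data = ["k", "some small string here", "so med string here",
        "some string here", "l"]
tempstring = "http://www.xyz.com"

# Membership on the list compares whole elements, so this is False here.
whole_list_check = tempstring in a

# A substring test has to be done per URL.
per_item = [tempstring in url for url in a]

# Alternative to compress: keep only the (data, url) pairs passing both tests.
kept = [(x, y) for x, y in zip(data, a)
        if len(str(x)) > 1 and tempstring in y]

print(whole_list_check)  # False
print(per_item)          # [True, False, False, True, True]
print(kept)              # [('some string here', 'http://www.xyz.com/jhuh7287')]
```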
