python read all files in a folder except a file named "xyz"

Question

I want to read all files in a folder except a file named "xyz". When I reach this file, I want to skip it and read the next one.

Currently I have the following code:

for file in glob.glob('*.xml'):
    data = open(file).read()
    print(file)

Obviously, this will read all files in that folder. How should I skip the file "xyz.xml"?

There are a bunch of simple ways to do this. You should think about it yourself first.
– Jason Hu, Commented Aug 25, 2014 at 20:57

Andrew Johnson · Accepted Answer · 2014-08-25 20:58:22Z

5

The continue keyword is useful for skipping an iteration of a for loop:

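import glob

for file in glob.glob('*.xml'):
    if file == "xyz.xml":
        continue  # skip this file and move on to the next one
    with open(file) as f:  # the with-block closes the file after reading
        data = f.read()
    print(file)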
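
One thing to watch: if the pattern includes a directory part, glob returns paths such as data/xyz.xml, and a plain comparison against "xyz.xml" never matches. A minimal sketch that compares only the file name; the helper name read_all_except and its parameters are made up for illustration and are not part of the answer above.

```python
import glob
import os


def read_all_except(pattern, skip_name):
    """Return {path: text} for every file matching pattern, skipping skip_name.

    Hypothetical helper, for illustration only.
    """
    contents = {}
    for path in glob.glob(pattern):
        # Compare only the final path component, so 'data/xyz.xml' is skipped too.
        if os.path.basename(path) == skip_name:
            continue
        # The with-block closes each file as soon as it has been read.
        with open(path) as handle:
            contents[path] = handle.read()
    return contents


if __name__ == "__main__":
    for path in read_all_except("*.xml", "xyz.xml"):
        print(path)
```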

answered Aug 25, 2014 at 20:58

Andrew Johnson


1 Comment

heltonbiker Over a year ago

As @SylvainLeroux suggested (in a comment on my answer), you can use glob.iglob to get an iterator instead of a list, if memory is a concern. +1

heltonbiker · Accepted Answer · 2014-08-25 20:57:49Z

2

for file in [f for f in glob.glob('*.xml') if f != "xyz.xml"]:
    do_stuff()
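
The same filter can be written lazily. A minimal sketch, assuming glob.iglob (which yields matches one at a time) combined with a generator expression; xml_files_except is a hypothetical name used only for this example.

```python
import glob


def xml_files_except(skip_name, pattern="*.xml"):
    """Lazily yield matching paths, leaving out skip_name (hypothetical helper)."""
    # iglob yields matches one at a time and the generator expression filters
    # them as they arrive, so this code does not build its own list of matches.
    return (path for path in glob.iglob(pattern) if path != skip_name)


if __name__ == "__main__":
    for path in xml_files_except("xyz.xml"):
        print(path)
```

For a handful of files the difference from the list comprehension is negligible; the lazy form mainly matters for very large directories.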

answered Aug 25, 2014 at 20:57

heltonbiker


2 Comments

utdemir Over a year ago

Use a generator expression, so you don't create the whole array.

Sylvain Leroux Over a year ago

@utdemir Might even use iglob so it will never store the entire list into memory.

Sylvain Leroux · Accepted Answer · 2014-08-25 21:03:48Z

2

For the sake of completeness, as no one has posted the most obvious version:

for file in glob.glob('*.xml'):
    if file != 'xyz.xml':
        data = open(file).read()
        print(file)

answered Aug 25, 2014 at 21:03

Sylvain Leroux


4 Comments

Andrew Johnson Over a year ago

I like this option too, but in Python it unfortunately requires an extra level of indentation for the whole block.

Sylvain Leroux Over a year ago

@AndrewJohnson Yes. But I don't know if the OP cares about that, though ;)

ahri Over a year ago

I'm wondering what the difference is between this option and Andrew's answer. Are there any advantages in running time or memory allocation?

Sylvain Leroux Over a year ago

@ahri According to dis, Andrew's answer takes 3 extra bytes once compiled (an extra JUMP_FORWARD opcode). But honestly, this is sooooo marginal...
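
To check the bytecode claim on your own interpreter, something like the following sketch could be used. The two function names are made up for this example, and the byte counts and opcodes differ between Python versions, so no particular numbers are assumed here.

```python
import dis


def with_continue(names):
    """Skip 'xyz.xml' using continue."""
    kept = []
    for name in names:
        if name == "xyz.xml":
            continue
        kept.append(name)
    return kept


def with_nested_if(names):
    """Skip 'xyz.xml' by nesting the work inside an if."""
    kept = []
    for name in names:
        if name != "xyz.xml":
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Size of the compiled bytecode in bytes for each variant.
    print(len(with_continue.__code__.co_code))
    print(len(with_nested_if.__code__.co_code))
    # Full disassembly, to see which opcodes actually differ.
    dis.dis(with_continue)
    dis.dis(with_nested_if)
```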

Óscar López · Accepted Answer · 2014-08-25 21:04:48Z

2

Try this, assuming that the element to be removed is in the list returned by glob.glob() (if that's not guaranteed, put the remove() line inside a try block):

lst = glob.glob('*.xml')
lst.remove('xyz.xml') # assuming that the element is present in the list
for file in lst:
    pass

Or, if you care about memory usage, use a generator expression:

for file in (file for file in glob.glob('*.xml') if file != 'xyz.xml'):
    pass
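
The try block mentioned for the first variant could look like the sketch below. list.remove raises ValueError when the value is not in the list; the function name list_without is hypothetical.

```python
import glob


def list_without(pattern, skip_name):
    """Return the glob matches as a list, with skip_name removed if present."""
    matches = glob.glob(pattern)
    try:
        matches.remove(skip_name)
    except ValueError:
        # list.remove raises ValueError when skip_name is not in the list;
        # in that case there is nothing to skip.
        pass
    return matches


if __name__ == "__main__":
    for path in list_without("*.xml", "xyz.xml"):
        print(path)
```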

edited Aug 25, 2014 at 21:04

answered Aug 25, 2014 at 20:57

Óscar López


4 Comments

Patrick Collins Over a year ago

Might be because this throws an error if xyz.xml isn't in the list? I'm not the downvoter, though.

utdemir Over a year ago

Downvoter here. Allocating the list and then removing an element from it is unnecessary (glob probably returns a list, but one shouldn't rely on that). You're allocating O(n) memory, traversing the list to search for "xyz.xml", and then calling remove, which moves data in memory and also has linear complexity. The whole thing can be done in constant memory with a single pass over the results.

Óscar López Over a year ago

@utdemir Read the documentation of glob.glob(): "Return a possibly-empty list of path names". The list was already allocated; that function returns a list. It's even cheaper to remove the element in the first place (as I did above) than to create a new list comprehension or generator expression.

utdemir Over a year ago

@ÓscarLópez list.remove may reallocate the whole array. It probably won't, but why rely on that when we can easily skip the element with a generator or a continue statement? I think one should always plan for the worst-case complexity.

Azhar Ansari · Accepted Answer · 2019-06-27 17:51:36Z

1

You can use glob for fairly simple pattern matching, but remember that glob pattern rules are not regular expressions! The code below excludes all XML files whose names start with 'X':

files = glob.glob('[!X]*.xml')
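
To see what a pattern like this actually keeps, the matching rule can be tried on a plain list of names. A small sketch using fnmatch.fnmatchcase, which applies the same wildcard syntax but always compares case-sensitively; the file names are invented for the example. How glob.glob itself treats upper and lower case can depend on the platform.

```python
import fnmatch

# Invented file names, for illustration only.
names = ["Xavier.xml", "xyz.xml", "abc.xml", "notes.txt"]

# [!X] matches exactly one character that is not an uppercase 'X'.
# fnmatchcase compares case-sensitively on every platform.
kept = [name for name in names if fnmatch.fnmatchcase(name, "[!X]*.xml")]
print(kept)  # ['xyz.xml', 'abc.xml']
```

Note that with case-sensitive matching an uppercase 'X' in the pattern does not exclude xyz.xml. Writing [!x] would exclude it, along with every other name that starts with a lowercase 'x'.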

answered Jun 27, 2019 at 17:51

Azhar Ansari

