I am trying to match the file names of PDF files in a folder against a pattern.

I have many PDF files that follow the same pattern but have a different name at the end.

The pattern consists of a date followed by the name of the file.

The problem is that when I run the script, every file is counted under the first pattern (python_pt), and the elif branch is never reached.

Example:

  • 10-11-2021 python.pdf
  • 22-09-2021 java.pdf

Code:

import re 
import  os 
from os import path 
from tqdm import tqdm
from time import sleep 

python_pt= "^[0-3]?[0-9]-[0-3]?[0-9]-(?:[0-9]{2})?[0-9]{2}$ python.pdf"
java_pt1= "^[0-3]?[0-9]-[0-3]?[0-9]-(?:[0-9]{2})?[0-9]{2}$ java.pdf"
java_pt2= "^ java [0-3]?[0-9]-[0-3]?[0-9]-(?:[0-9]{2})?[0-9]{2}$.pdf"
str = 'c:'
a = 0
i = 0
for dirpath, dirnames, files in os.walk(src, topdown=True):         
    print(f'\nFound directory: {dirpath}\n')
    
    for  file in tqdm(files):
        sleep(.1)
        full_file_name = os.path.join(dirpath, file)
        if os.path.join(dirpath) == src:
            if file.endswith("pdf"):
                if python_pt:
                    i+=1
                elif java_pt1 or java_pt2:
                    a+=1
print("{} file 1 \n".format(i))
print("{} file 2 \n".format(a))
  • I haven't checked the regular expressions for validity, but your use of the variables python_pt, java_pt1 and java_pt2 is flawed. They are strings. Therefore, for example, if python_pt will always evaluate to True, because a non-empty string is truthy. Commented Nov 10, 2021 at 9:50
  • What is src? Also you are misusing anchors, none of those patterns really work as $ marks the end of string, and you require some more chars after that. And you never use the patterns, as to run a regex check, you need to use re.match/re.search/re.fullmatch. Please make sure you try these with updated patterns (without random use of anchors) and if you still fail, please edit the question. Commented Nov 10, 2021 at 9:50
  • @WiktorStribiżew src = path of the drive C: Commented Nov 10, 2021 at 9:52
  • @BrutusForcus I did not understand your comment. Is this not how a pattern is created? Commented Nov 10, 2021 at 10:08
  • @khaledM_dev There is nothing in your code that tries to use the RE patterns for matching against the filenames returned from os.walk(). I suggest using glob to get a simplified list of all files ending with '.pdf' then utilise the re module to see which, if any, are relevant. Commented Nov 10, 2021 at 10:16

1 Answer

The problems are with your regular expressions and the way you perform a regex check:

  • Anchors must not be placed arbitrarily inside the pattern: a $ in the middle means the pattern can never match here, because no characters can follow the end of the string. As you need to check whether file names end with your pattern, add $ at the end only, and do not forget to escape the literal . (dot).
  • To check whether there is a match, you need to call one of the re.search / re.match / re.fullmatch functions.
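As a minimal illustration of both points, the sketch below compares the question's python_pt with the corrected version on one of the example file names. The variable names (date, broken, fixed, and the result flags) are made up for this illustration and are not part of the original code.

```python
import re

# Date part shared by all patterns in the question.
date = r"[0-3]?[0-9]-[0-3]?[0-9]-(?:[0-9]{2})?[0-9]{2}"

broken = "^" + date + "$ python.pdf"  # the question's pattern: `$` sits in the middle
fixed = date + r" python\.pdf$"       # `$` only at the end, literal dot escaped

name = "10-11-2021 python.pdf"

# A non-empty string is always truthy, so `if python_pt:` never tests the file name.
string_is_truthy = bool(broken)

# The pattern has to be applied to the file name explicitly.
broken_matches = re.search(broken, name) is not None
fixed_matches = re.search(fixed, name) is not None
java_matches = re.search(fixed, "22-09-2021 java.pdf") is not None

print(string_is_truthy, broken_matches, fixed_matches, java_matches)
# prints: True False True False
```

The first flag explains why every file was counted as "python": the if statement tested the string itself, not the file name.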

Here is a fixed snippet:

import os
import re
from time import sleep

from tqdm import tqdm

python_pt= r"[0-3]?[0-9]-[0-3]?[0-9]-(?:[0-9]{2})?[0-9]{2} python\.pdf$" # FIXED
java_pt1= r"[0-3]?[0-9]-[0-3]?[0-9]-(?:[0-9]{2})?[0-9]{2} java\.pdf$"    # FIXED
java_pt2= r"java [0-3]?[0-9]-[0-3]?[0-9]-(?:[0-9]{2})?[0-9]{2}\.pdf$"    # FIXED

src = "C:"
i=0
a=0

for dirpath, dirnames, files in os.walk(src, topdown=True):         
    print(f'\nFound directory: {dirpath}\n')
    
    for  file in tqdm(files):
        sleep(.1)
        full_file_name = os.path.join(dirpath, file)
        if os.path.join(dirpath) == src:
            if file.endswith("pdf"):
                if re.search(python_pt, file):                               # FIXED
                    i+=1
                elif re.search(java_pt1, file) or re.search(java_pt2, file): # FIXED
                    a+=1
print("{} file 1 \n".format(i))
print("{} file 2 \n".format(a))

See the # FIXED lines.
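If more file types get added later, the matching step could be pulled out of the loop into a small helper. The following is only a sketch under assumptions: the names DATE, PATTERNS, classify and count_by_label are hypothetical, and it uses re.fullmatch instead of re.search with a trailing $, so the whole file name must match and names with extra leading text are rejected. That is a stricter check than the snippet above performs.

```python
import re
from collections import Counter

# Date part shared by all patterns (taken from the patterns above).
DATE = r"[0-3]?[0-9]-[0-3]?[0-9]-(?:[0-9]{2})?[0-9]{2}"

# Hypothetical mapping from a label to the compiled patterns that identify it.
PATTERNS = {
    "python": [re.compile(DATE + r" python\.pdf")],
    "java": [
        re.compile(DATE + r" java\.pdf"),
        re.compile(r"java " + DATE + r"\.pdf"),
    ],
}


def classify(filename):
    """Return the label whose pattern matches the whole file name, or None."""
    for label, patterns in PATTERNS.items():
        if any(p.fullmatch(filename) for p in patterns):
            return label
    return None


def count_by_label(filenames):
    """Count how many file names fall under each label; unmatched names are skipped."""
    labels = (classify(name) for name in filenames)
    return Counter(label for label in labels if label is not None)


counts = count_by_label([
    "10-11-2021 python.pdf",
    "22-09-2021 java.pdf",
    "java 01-02-21.pdf",
    "notes.pdf",
])
print(counts["python"], counts["java"])  # prints: 1 2
```

The list passed to count_by_label is sample data; in the script it would be the files list that os.walk yields for each directory.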

1 Comment

Thank you for your explanation. I did not know how to use it very well.
