0
import re

pattern = ['[\d]{1,2} [ADFJMNOS]\w* [\d]{2,4}','\d{1,2}/\d{1,2}/\d{2,4}']

text = "He welcomed me on 12 Jan 2014 and there by 15 OF 16 cakes for the party. Next morning on 16/05/2022 he waked up. He attended on 16 Feb 1966"

for i in pattern:
    temp_list = re.findall(i,text)
    print(temp_list)

Required output: 
['12 Jan 2014','16 Feb 1966']
['16/05/2022']

The output also includes '15 OF 16'. Is there a way to match only the dates that contain a month name?

1
  • Add an r in front of each string to make it a "raw" string, e.g. patterns = [r"[\d]...", ...]. Commented Feb 6, 2022 at 5:31
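To illustrate the comment above, here is a minimal sketch of why the r prefix matters. It uses \b rather than \d because \b is the clearest case: in a normal string literal it becomes a backspace character before the regex engine ever sees it. The variable names are only for illustration.

```python
import re

# In a normal string literal "\b" is a single backspace character;
# in a raw string r"\b" stays as backslash + b, which the regex
# engine interprets as a word boundary.
plain = "\bcat\b"
raw = r"\bcat\b"

print(len(plain), len(raw))
print(re.findall(plain, "a cat"))  # pattern contains literal backspaces
print(re.findall(raw, "a cat"))    # pattern contains word boundaries
```

Only the raw-string version finds 'cat' in the sample text.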

2 Answers

1

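The pattern [ADFJMNOS]\w* accepts any word that starts with one of those letters, which is why "15 OF 16" matches. List the month abbreviations explicitly instead, write the patterns as raw strings so the escapes are correct, and pass the pattern as the first argument to re.findall and the text as the second. (?:...) makes the group non-capturing, so findall returns the whole match rather than only the month.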

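import re

text = "He welcomed me on 12 Jan 2014 and there by 15 OF 16 cakes for the party. Next morning on 16/05/2022 he waked up. He attended on 16 Feb 1966"

patterns = [
    # day, explicit month abbreviation (non-capturing group), 2-4 digit year as in the question
    r"\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{2,4}",
    # day/month/year separated by slashes
    r"\d{1,2}/\d{1,2}/\d{2,4}",
]

for pattern in patterns:
    matches = re.findall(pattern, text)  # pattern first, then the text
    print(matches)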

Output:

['12 Jan 2014', '16 Feb 1966']
['16/05/2022']

0

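My advice is to use re.findall with the following regular expression, with the case-insensitive flag set (the second alternative uses [a-z]+ to match month names such as "Jan"):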

\b\d{2,4}(?: +|\/)\S+(?: +|\/)\d{2,4}\b|\b[a-z]+ +\d{1,2},? +\d{4}\b

Demo

The link shows that for the following string (which I've broken into two parts for readability) the indicated strings are matched.

blah 12 Jan 2014 blah 15 OF 16 blah 16/05/2022 blah 16 Feb 1966
     ^^^^^^^^^^^      ^^^^^^^^      ^^^^^^^^^^      ^^^^^^^^^^^        
blah 2017/8/21 blah Jan 12, 2014 blah January 12, 2014 blah cat 0 0000
     ^^^^^^^^^      ^^^^^^^^^^^^      ^^^^^^^^^^^^^^^^      ^^^^^^^^^^ 

Notice that I added additional date formats to those included in the example.
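The matches marked above can be reproduced with a short sketch. The names CANDIDATE and sample are placeholders, and re.IGNORECASE is assumed because the [a-z]+ alternative cannot match "Jan" or "January" without it.

```python
import re

# Candidate-date pattern from this answer. re.IGNORECASE is an assumption
# needed so that [a-z]+ matches capitalised month names.
CANDIDATE = re.compile(
    r"\b\d{2,4}(?: +|\/)\S+(?: +|\/)\d{2,4}\b|\b[a-z]+ +\d{1,2},? +\d{4}\b",
    re.IGNORECASE,
)

# The two demo lines joined into one string.
sample = (
    "blah 12 Jan 2014 blah 15 OF 16 blah 16/05/2022 blah 16 Feb 1966 "
    "blah 2017/8/21 blah Jan 12, 2014 blah January 12, 2014 blah cat 0 0000"
)

# No capturing groups are used, so findall returns the full matches.
candidates = CANDIDATE.findall(sample)
print(candidates)
```

This deliberately returns non-dates such as '15 OF 16' and 'cat 0 0000' alongside the real dates; they are candidates only.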

This matches more than you want, for example "15 OF 16" and "cat 0 0000", and that is deliberate. It is effectively impossible to match all valid date strings, and only valid date strings, with a single regular expression. It is better to first capture every string that might represent a valid date and then filter the candidates further, as described below.

There is another consideration. You may want to convert all the date strings you collect to a common format, for presentational and/or computational purposes.

For each candidate date string collected in the first step, you can then use strptime to check whether the string matches any format string in a list that you construct. For each format string, if the string being tested represents a valid date, a time tuple is returned; otherwise a ValueError is raised.

By way of illustration, the list of format strings would include "%d %b %Y", which parses "12 Jan 2014", and "%Y/%m/%d", which parses "2017/8/21".

Not only will that tell you which strings represent valid dates, but you can then pass the time tuple obtained from strptime to strftime to display the dates in a consistent format.
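The filter-and-normalise step can be sketched roughly as follows. This is one possible arrangement, not a definitive implementation: it uses datetime.strptime, which returns a datetime object rather than a time tuple, and the FORMATS list, the normalize helper and the "%Y-%m-%d" output format are all assumptions made for illustration. Month names are parsed according to the current locale, so an English (or C) locale is assumed.

```python
from datetime import datetime

# Hypothetical list of accepted layouts; extend it to cover the formats you expect.
FORMATS = ["%d %b %Y", "%d/%m/%Y", "%Y/%m/%d", "%b %d, %Y", "%B %d, %Y"]


def normalize(candidate, formats=FORMATS, out_format="%Y-%m-%d"):
    """Return the candidate reformatted with out_format, or None if no format fits."""
    for fmt in formats:
        try:
            parsed = datetime.strptime(candidate, fmt)
        except ValueError:
            continue  # this layout does not fit; try the next one
        return parsed.strftime(out_format)
    return None


# Candidates as they might come out of the regex step.
candidates = ["12 Jan 2014", "15 OF 16", "16/05/2022", "2017/8/21", "Jan 12, 2014", "cat 0 0000"]
dates = [d for d in map(normalize, candidates) if d is not None]
print(dates)
```

The order of FORMATS matters for ambiguous strings: "01/02/2022" is read as day/month here because "%d/%m/%Y" is the only slash format that starts with the day.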
