1

I am using Python 2.7.

I have an Adobe PDF form document that has a date field, and I extract the values using pdfminer. The problem I need to solve is that Adobe Acrobat Reader lets the user type strings such as april 3rd 2017, 3rd April 2017, Apr 3rd 2017, 04/04/2017 or 4 3 2017. The date field is set to the mm/dd/yyyy format, so Adobe displays the value as 04/03/2017, but the raw text the user typed is what is actually stored (you can see it when you click on the field), and that is the value pdfminer pulls. Adobe allows this and, I think, does its own conversion to display the date as mm/dd/yyyy. Adobe supports JavaScript for more control, but I can't use that: the users can only have and use the PDF form, without any accompanying JavaScript file.

So I am looking for a way, with datetime in Python, to accept a written date such as the examples above as a string and convert it to a true mm/dd/yyyy format. I saw methods for converting long and short month names, but nothing that handles ordinal days such as 1st, 2nd, 3rd, 4th.

3 Answers

2

You could just try each possible format in turn. First remove any st, nd, rd or th suffixes that follow a digit, to make the testing easier (a plain replace("st", "") would also turn "august" into "augu"):

import re
from datetime import datetime

formats = ["%B %d %Y", "%d %B %Y", "%b %d %Y", "%m/%d/%Y", "%m %d %Y"]
dates = ["april 3rd 2017", "3rd April 2017", "Apr 3rd 2017", "04/04/2017", "4 3 2017"]

for date in dates:
    # Strip ordinal suffixes only when they follow a digit, so month names stay intact
    date = re.sub(r"(\d)(st|nd|rd|th)\b", r"\1", date.lower())

    for format in formats:
        try:
            print(datetime.strptime(date, format).strftime("%m/%d/%Y"))
            break  # stop at the first format that matches
        except ValueError:
            pass

Which would display:

04/03/2017
04/03/2017
04/03/2017
04/04/2017
04/03/2017

This approach has the benefit of validating each date: for example, a month greater than 12 is rejected. You could flag any dates that fail all of the allowed formats.
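One way to flag the failures is sketched below. The names normalize_date and FORMATS are made up for this illustration, and the sketch assumes an English locale for the month names. It returns None when no format matches, so the caller can collect the rejected values.

```python
import re
from datetime import datetime

# Formats accepted by this sketch; extend the list as needed.
FORMATS = ["%B %d %Y", "%d %B %Y", "%b %d %Y", "%m/%d/%Y", "%m %d %Y"]


def normalize_date(text):
    """Return the date as mm/dd/yyyy, or None if no format matches."""
    # Drop ordinal suffixes only when they directly follow a digit.
    cleaned = re.sub(r"(\d)(st|nd|rd|th)\b", r"\1", text.strip().lower())
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue  # try the next format
    return None


values = ["august 1st 2017", "4 3 2017", "13/45/2017", "not a date"]
# Collect every value that failed all allowed formats.
failed = [v for v in values if normalize_date(v) is None]
print(failed)
```

Returning None instead of printing keeps the parsing separate from the reporting, so the rejected values can be logged or sent back to the user.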


3 Comments

Wow, thanks. I was looking at doing this with regex, so I appreciate it. I didn't know whether there was some other function for this, and I'm new to Python, which is why I asked, but I can go with regex. :)
Running your code on its own, I always get "'module' object has no attribute 'strptime'", even though I imported the datetime library. I checked the Python docs and it should work, so I'm not sure why.
Never mind :) I needed to use "from datetime import datetime".
1

Just write a regular expression to get the number out of the string.

import re

s = '30Apr' 
n = s[:re.match(r'[0-9]+', s).span()[1]]
print(n) # Will print 30

Once the day number is extracted, the rest of the string (month and year) should be easy to handle.
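As a rough sketch of how the regex idea could be carried through, the helper below (parse_written_date, a hypothetical name) pulls the day, month name and year out with separate patterns and hands the rebuilt string to strptime. It assumes the month is written as an English word, so purely numeric inputs such as 04/04/2017 return None and would need a separate format.

```python
import re
from datetime import datetime


def parse_written_date(text):
    """Parse dates such as '3rd April 2017' or 'Apr 3rd 2017'; return None otherwise."""
    # Day: one or two digits, optionally followed by an ordinal suffix.
    day = re.search(r"\b(\d{1,2})(?:st|nd|rd|th)?\b", text, re.IGNORECASE)
    # Month: the first run of at least three letters (suffixes are only two).
    month = re.search(r"[A-Za-z]{3,}", text)
    # Year: a standalone four-digit number.
    year = re.search(r"\b\d{4}\b", text)
    if not (day and month and year):
        return None
    # Rebuild a canonical "day Mon year" string and let strptime validate it.
    rebuilt = "{0} {1} {2}".format(day.group(1), month.group()[:3], year.group())
    try:
        return datetime.strptime(rebuilt, "%d %b %Y").strftime("%m/%d/%Y")
    except ValueError:
        return None


print(parse_written_date("3rd April 2017"))
print(parse_written_date("august 22nd 2017"))
```

Because the pieces are located independently, the same function accepts the day before or after the month name.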


0

Based on @MartinEvans's answer, but using the arrow library, which handles more cases than datetime, so you don't need replace() or lower():

First install arrow:

pip install arrow

Then try each possible format:

import arrow

dates = ['april 3rd 2017', '3rd April 2017', 'Apr 3rd 2017', '04/04/2017', '4 3 2017']
formats = ['MMMM Do YYYY', 'Do MMMM YYYY', 'MMM Do YYYY', 'MM/DD/YYYY', 'M D YYYY']

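def convert_datetime(date):
    for format in formats:
        try:
            print(arrow.get(date, format).format('MM/DD/YYYY'))
            break  # stop at the first format that parses
        except arrow.parser.ParserError:
            pass

# A plain loop is clearer than a list comprehension used only for its side effects
for date in dates:
    convert_datetime(date)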

Will output:

04/03/2017
04/03/2017
04/03/2017
04/04/2017
04/03/2017

If you are unsure what is wrong with a date, you can also print an error message when none of the formats match:

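def convert_datetime(date):
    error = None
    for format in formats:
        try:
            print(arrow.get(date, format).format('MM/DD/YYYY'))
            break
        except (arrow.parser.ParserError, ValueError) as e:
            # Keep the most recent failure; relying on e after the loop breaks on Python 3
            error = e
    else:
        # Runs only if no format succeeded (the loop never hit break)
        print('For date: "{0}", {1}'.format(date, error))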

convert_datetime('124 5 2017') # test invalid date

Will output the following error message:

'For date: "124 5 2017", month must be in 1..12'

1 Comment

I don't get why most people are afraid of using the arrow library ^^' Anyway, I tried... hopefully this helps other users in the community!
