My first suggestion was not a complete solution; skip to the first edit below.
If you want to adjust your existing code, here is one way to do it. First, break the string into pieces:
line = "This is text dated 01/02/2017, and there are a few more dates such as 03/07/2017 and 09/06/2000"
words = line.split() # by default it splits on whitespace
Now you can work with each piece of the input. Try to parse each word with your fix_date function and rebuild the string:
updated_line = ''
for word in words:
    try:
        updated_line += fix_date(word) + ' '
    except ValueError:  # assumes fix_date raises ValueError (as strptime does) for non-date words
        updated_line += word + ' '
updated_line = updated_line[:-1]  # gets rid of the extra trailing space
print(updated_line)
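EDIT: Upon running this, I realized it has a problem with punctuation attached to dates (for example "01/02/2017," keeps its trailing comma after the split, so it no longer parses as a date). I am making another pass.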
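To make the punctuation problem concrete, here is a small sketch. The `fix_date` below is a hypothetical stand-in for the helper from the question, assumed to take a single word and raise `ValueError` when that word is not a `dd/mm/YYYY` date; `convert_words` is likewise just an illustrative name.

```python
import datetime


def fix_date(word):
    # Hypothetical stand-in for the question's helper: parses dd/mm/YYYY
    # and raises ValueError for anything else.
    return datetime.datetime.strptime(word, "%d/%m/%Y").strftime("%d/%b/%Y")


def convert_words(line):
    """Return (converted dates, words that could not be parsed)."""
    converted, skipped = [], []
    for word in line.split():
        try:
            converted.append(fix_date(word))
        except ValueError:
            skipped.append(word)
    return converted, skipped


converted, skipped = convert_words("This is text dated 01/02/2017, and 09/06/2000")
print(converted)  # only the date without attached punctuation is converted
print(skipped)    # "01/02/2017," ends up here because of the trailing comma
```

Because splitting on whitespace leaves the comma glued to the first date, that word falls through to the `except` branch and is left unconverted.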
Here is some working code:
import datetime
import re

current_date_format = "%d/%m/%Y"
new_date_format = "%d/%b/%Y"

def main():
    line = "This is text dated 01/02/2017, and there are a few more dates such as 03/07/2017 and 09/06/2000"
    print(line)
    line = re.sub(r'\d{2}/\d{2}/\d{4}', fix_date, line)
    print(line)

def fix_date(rem):
    # rem is the match object that re.sub passes to the callback
    date_string = rem.group()
    return datetime.datetime.strptime(date_string, current_date_format).strftime(new_date_format)

main()
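One caveat worth sketching: the pattern `\d{2}/\d{2}/\d{4}` also matches digit groups that are not real dates (such as 31/02/2017), and `strptime` raises `ValueError` on those. A possible variation, assuming you would rather leave such text untouched than abort, is a callback that falls back to the original match. The names `fix_date_safe` and `fix_all_dates` are made up for this example.

```python
import datetime
import re

current_date_format = "%d/%m/%Y"
new_date_format = "%d/%b/%Y"
date_pattern = re.compile(r'\d{2}/\d{2}/\d{4}')


def fix_date_safe(match):
    """Convert a matched date, or return it unchanged if it is not a real date."""
    date_string = match.group()
    try:
        parsed = datetime.datetime.strptime(date_string, current_date_format)
    except ValueError:
        # Looks like a date but is not one (e.g. 31/02/2017): leave it as-is.
        return date_string
    return parsed.strftime(new_date_format)


def fix_all_dates(text):
    # Works for any number of dates in the text, including none.
    return date_pattern.sub(fix_date_safe, text)


print(fix_all_dates("Valid 01/02/2017, invalid 31/02/2017 and 99/99/2017"))
print(fix_all_dates("No dates in this line"))
```

Whether silently skipping invalid dates is the right behaviour depends on your data; logging or collecting them may be preferable.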
EDIT 2: The regex method works just as well on very large strings as on small ones, so if your file is small enough to load into memory all at once, you can do it in one shot:
import datetime
import re

current_date_format = "%d/%m/%Y"
new_date_format = "%d/%b/%Y"

def main():
    with open('my_file.txt', 'r') as f:
        text = f.read()
    with open('my_fixed_file.txt', 'w') as f:
        f.write(re.sub(r'\d{2}/\d{2}/\d{4}', fix_date, text))

def fix_date(rem):
    date_string = rem.group()
    return datetime.datetime.strptime(date_string, current_date_format).strftime(new_date_format)

main()
Or, even more compactly, by adjusting the file read/write portion:
...
with open('my_file.txt', 'r') as f:
    with open('my_fixed_file.txt', 'w') as f2:
        f2.write(re.sub(r'\d{2}/\d{2}/\d{4}', fix_date, f.read()))
...
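If the file is too large to load at once, a line-by-line variant is one option. This is only a sketch: it assumes a date never spans a line break, and `fix_file` is a hypothetical helper name rather than anything from the question.

```python
import datetime
import re

current_date_format = "%d/%m/%Y"
new_date_format = "%d/%b/%Y"
date_pattern = re.compile(r'\d{2}/\d{2}/\d{4}')


def fix_date(rem):
    date_string = rem.group()
    return datetime.datetime.strptime(date_string, current_date_format).strftime(new_date_format)


def fix_file(source_path, target_path):
    # Iterating over the file object yields one line at a time,
    # so only a single line is held in memory.
    with open(source_path, 'r') as src, open(target_path, 'w') as dst:
        for line in src:
            dst.write(date_pattern.sub(fix_date, line))
```

It could be called as `fix_file('my_file.txt', 'my_fixed_file.txt')`, using the same file names as above.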
dd/mm/YYYY and a dictionary to map the numerical mm values to the respective string representation. However, there's probably something in the datetime library you have imported, or maybe look at pandas. I've done some datetime manipulation with that, and the scientific libraries always have a lot of support if you find something related. Edit: see stackoverflow.com/questions/3276180/… A post by @unutbu on that thread mentions something that may be useful to you.
pandas I was tossing up between datetime and dateutil. But liked the way datetime let you build your own format.
dateutil.parser can only handle strings with one date. My strings will have 0-n dates.