I'm working with a DataFrame that has a year column in the following format:
year
2015
2015-2016
2016
I want to replace strings like '2015-2016' with just '2015' using regex. I tried something like this:
df['year']=df['year'].str.replace('[0-9]{4}\-[0-9]{4}','[0-9]{4}')
But that doesn't work. I know I could do something like:
df['year']=df['year'].str.replace(r'-[0-9]{4}','',regex=True)
But sometimes you need something more flexible. Is there a way to keep a portion of the match in the replacement, or is this the standard approach?
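To make the question concrete, here is a minimal sketch of the kind of thing I'm after. It assumes a capture group in the pattern and a backreference (\1) in the replacement, passed to str.replace with regex=True; the sample DataFrame is just the three values shown above, and I'm not sure this is the idiomatic way to do it.

```python
import pandas as pd

# Sample data matching the year column shown above
df = pd.DataFrame({'year': ['2015', '2015-2016', '2016']})

# Capture the first year in group 1 and reuse it in the replacement via \1.
# Rows that do not match the 'YYYY-YYYY' pattern are left unchanged.
df['year'] = df['year'].str.replace(r'^([0-9]{4})-[0-9]{4}$', r'\1', regex=True)

print(df['year'].tolist())  # ['2015', '2015', '2016']
```

The anchors (^ and $) are my own addition so that only values consisting entirely of a year range get rewritten.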
Thanks in advance.