I'm working with a DataFrame that has a year column in the following format:

  year
  2015
  2015-2016
  2016

I want to replace strings like '2015-2016' with just '2015' using regex. I tried something like this:

df['year']=df['year'].str.replace('[0-9]{4}\-[0-9]{4}','[0-9]{4}')

But that doesn't work. I know I could do something like:

df['year'] = df['year'].str.replace(r'-[0-9]{4}', '', regex=True)

But sometimes you need something more flexible. Is there any way to keep a portion of the match in the replacement, or is this the standard approach?

Thanks in advance.

2 Answers

If you just want to keep the first year, and all years have 4 digits, use:

df['year'] = df['year'].str.extract(r'(\d{4})', expand=False)
>>> df
   year
0  2015
1  2015
2  2016

If you want to keep the first year before any -, use:

df['year'] = df.year.str.split('-').str[0]

>>> df
   year
0  2015
1  2015
2  2016
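
The two approaches only agree when the data is clean. As a rough sketch in plain Python with the standard re module, here is how they differ on messier values. The helper names first_four_digits and before_dash and the sample values are made up for illustration, and the helpers only approximate what str.extract and str.split do to each element:

```python
import re

def first_four_digits(value):
    """Return the first run of four digits, or None if there is none."""
    match = re.search(r'(\d{4})', value)
    return match.group(1) if match else None

def before_dash(value):
    """Return everything before the first '-' (the whole string if no '-')."""
    return value.split('-')[0]

# Hypothetical sample values, not taken from the question's data.
samples = ['2015', '2015-2016', 'FY2015-2016', 'n/a']

for s in samples:
    print(s, first_four_digits(s), before_dash(s))
```

For 'FY2015-2016' the regex version returns '2015' while the split version keeps the 'FY' prefix, and for a value with no digits the regex version finds nothing (str.extract gives NaN in that case) while the split version passes the value through unchanged.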

2 Comments

That extract function was something I should have already known, quite useful! Thanks sacul
Glad to help! Happy coding!

You can capture the good year in parentheses and refer to it in your replacement with \1:

df['year'].str.replace(r'([0-9]{4})-[0-9]{4}', r'\1', regex=True)

Or you can turn the parentheses around the good year into a positive lookbehind assertion with ?<=. A lookbehind checks the text before the match without capturing or consuming it, so the replacement string is empty: only -[0-9]{4} is matched, and only when it is preceded by [0-9]{4}.

df['year'].str.replace(r'(?<=[0-9]{4})-[0-9]{4}', '', regex=True)
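
To see that both patterns give the same result, here is a small sketch using re.sub from the standard library on the three values from the question. It assumes the pandas call behaves like re.sub applied to each element; the third variant, with a function as the replacement, is an extra illustration of the "more flexible" case and is not part of the answer above:

```python
import re

years = ['2015', '2015-2016', '2016']

# Backreference: capture the first year and put it back with \1.
with_group = [re.sub(r'([0-9]{4})-[0-9]{4}', r'\1', y) for y in years]

# Lookbehind: match only '-YYYY' that follows four digits and delete it.
with_lookbehind = [re.sub(r'(?<=[0-9]{4})-[0-9]{4}', '', y) for y in years]

# Callable replacement: receives the match object and returns the new text,
# which allows arbitrary logic on the captured group.
with_callable = [
    re.sub(r'([0-9]{4})-[0-9]{4}', lambda m: m.group(1), y) for y in years
]

print(with_group)
print(with_lookbehind)
print(with_callable)
```

All three lists come out as ['2015', '2015', '2016']. Values without a dash are left untouched because the pattern never matches them.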

2 Comments

This is the first time I've heard of lookaround assertions. Do you know any site with a good, understandable explanation of how they work? The explanations I've found haven't really helped me understand the difference between negative/positive and ahead/behind.
@JuanC: Here's a short explanation and a longer one on lookaround assertions.
