The DataFrame I have prepared is as follows:

Index and Title              Index
1 aa aa aaaa                 1
1.2 bb bbbb bb bbbb bb b     1.2
1.2.3 ccc cc c ccccc cccccc  1.2.3
2 dddd d d dd ddd            2

The DataFrame I want is as follows:

Index and Title              Index  Title
1 aa aa aaaa                 1      aa aa aaaa
1.2 bb bbbb bb bbbb bb b     1.2    bb bbbb bb bbbb bb b
1.2.3 ccc cc c ccccc cccccc  1.2.3  ccc cc c ccccc cccccc
2 dddd d d dd ddd            2      dddd d d dd ddd

I tried the following code:

df['Title'] = df['Index and Title'].str.replace(df['Index'] + ' ','')

However, it raised the following error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

How should I handle this case?

3 Answers

df["Title"] = df["Index and Title"].str.split(n=0).str[1:].str.join(" ")
>>> df
               Index and Title  Index                  Title
0                 1 aa aa aaaa      1             aa aa aaaa
1     1.2 bb bbbb bb bbbb bb b    1.2   bb bbbb bb bbbb bb b
2  1.2.3 ccc cc c ccccc cccccc  1.2.3  ccc cc c ccccc cccccc
3            2 dddd d d dd ddd      2        dddd d d dd ddd
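
For reference, here is a minimal self-contained sketch of the same idea. It rebuilds the sample frame from the question and assumes every row starts with the index followed by whitespace. Note that it uses `str.split(n=1)` and takes the second piece, which is a variant of the line above rather than the answerer's exact code; it keeps the title's internal spacing untouched instead of re-joining the words.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Index and Title": [
        "1 aa aa aaaa",
        "1.2 bb bbbb bb bbbb bb b",
        "1.2.3 ccc cc c ccccc cccccc",
        "2 dddd d d dd ddd",
    ],
    "Index": ["1", "1.2", "1.2.3", "2"],
})

# Split once on the first run of whitespace:
# piece 0 is the index, piece 1 is the whole title.
df["Title"] = df["Index and Title"].str.split(n=1).str[1]
```

A row with no whitespace at all would have no second piece, so its `Title` would come out as NaN.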

2 Comments

I'm sorry that my explanation was too short. 'Title' includes spaces, as shown in the modified table above.
Don't worry about that. I fixed my answer.

With the samples shown, this can be handled by the pandas str.extract function. Please try the following:

df["Title"] = df["Index and Title"].str.extract(r'^\d+(?:(?:\.\d+){1,})?\s+(\D+)$', expand=True)

Or, if the title itself may contain digits, try the following:

df["Title"] = df["Index and Title"].str.extract(r'^\d+(?:(?:\.\d+){1,})?\s+(.*)$', expand=True)

The output of df will be as follows:

               Index and Title  Index                  Title
0                 1 aa aa aaaa      1             aa aa aaaa
1     1.2 bb bbbb bb bbbb bb b    1.2   bb bbbb bb bbbb bb b
2  1.2.3 ccc cc c ccccc cccccc  1.2.3  ccc cc c ccccc cccccc
3            2 dddd d d dd ddd      2        dddd d d dd ddd

Explanation: a detailed breakdown of the first regex above.

^\d+(?:(?:\.\d+){1,})?  ## Matches the leading digits in the "Index and Title" column, optionally followed by one or more groups of a dot and digits.
\s+                     ## Matches one or more whitespace characters.
(\D+)$                  ## First capturing group: all non-digit characters up to the end of the value.
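
To see the three pieces working together, here is a small sketch that runs the same pattern through Python's standard `re` module on a few of the sample strings. The string "1.2 chapter 3 notes" is a made-up example, not from the question; it is there only to show why the `(.*)` variant is needed when a title contains digits.

```python
import re

# Same pattern as the first extract call above.
pattern = re.compile(r'^\d+(?:(?:\.\d+){1,})?\s+(\D+)$')

samples = ["1 aa aa aaaa", "1.2 bb bbbb bb bbbb bb b", "2 dddd d d dd ddd"]

# group(1) is the capturing group, i.e. the title without the leading index.
titles = [pattern.match(s).group(1) for s in samples]

# Hypothetical title containing a digit: (\D+)$ cannot reach the end
# of the string, so the whole match fails and match() returns None.
no_match = pattern.match("1.2 chapter 3 notes")
```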

If you need to replace using both columns, use a lambda function with axis=1:

df['Title'] = df.apply(lambda x: x['Index and Title'].replace(x['Index'],''), axis=1).str.strip()
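
One thing to watch with the row-wise approach: `str.replace` on a Python string removes every occurrence of the index text, not just the leading one. A hedged variant, assuming the `Index` column holds strings, passes a count of 1 so only the first occurrence is removed. The second row below is a made-up example, not from the question, chosen to show the difference.

```python
import pandas as pd

df = pd.DataFrame({
    "Index and Title": ["1 aa aa aaaa", "2 route 2 map"],  # second row is hypothetical
    "Index": ["1", "2"],
})

# The third argument of str.replace limits it to the first occurrence,
# so a digit that reappears inside the title is left alone.
df["Title"] = df.apply(
    lambda row: row["Index and Title"].replace(row["Index"], "", 1),
    axis=1,
).str.strip()
```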

If you only need the letters and spaces (without replacing by the Index column), use Series.str.extract with Series.str.strip:

df['Title'] = df['Index and Title'].str.extract('([a-zA-Z ]+)', expand=False).str.strip()

2 Comments

I'm sorry that my explanation was too short. 'Title' includes spaces, as shown in the modified table above.
@aaaaa0a - Both solutions should work well.
