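I have the sample data frame below and am trying to get to the output shown under it: a new column holding the first collen characters of colstr, with missing values in either column passed through unchanged. I have looked through a lot of examples, but none seem to handle this specific scenario.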

I am not sure whether this can be done with apply or map, but I am not tied to either.


import pandas as pd

df = pd.DataFrame({'collen': [5, 3, 2, None, 3], 'colstr': ['turquoise', 'white', 'blue', 'red', None]})

    collen  colstr
0   5.0     turquoise
1   3.0     white
2   2.0     blue
3   NaN     red
4   3.0     None

Expected outcome:

    collen  colstr      new_col_str
0   5.0     turquoise   turqu
1   3.0     white       whi
2   2.0     blue        bl
3   NaN     red         red
4   3.0     None        None

3 Answers

If you're on a recent version of pandas that supports nullable integers (Int64), first cast collen to Int64 so it can be used directly as the slice bound.

df.collen = df.collen.astype('Int64')

Next, use the following lambda to generate the new column:

df['new_col_str'] = df.apply(
    lambda x: x.colstr if pd.isnull(x.colstr) or pd.isnull(x.collen) else x.colstr[:x.collen],
    axis=1
)
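
To illustrate why the cast matters: a column containing None is stored as float64, and a float cannot be used as a slice bound (Python raises a TypeError). A minimal sketch, assuming the collen values from the question; the variable names are only for illustration:

```python
import pandas as pd

# Same values as the question's collen column; the None forces float64.
collen = pd.Series([5, 3, 2, None, 3])

# Nullable integer dtype: whole numbers stay integers, the gap becomes <NA>.
as_int = collen.astype("Int64")

# An integer from the Int64 column works directly as a slice bound.
first = as_int.iloc[0]
prefix = "turquoise"[:first]

print(collen.dtype, as_int.dtype, prefix)
```

The missing entry still has to be checked with pd.isnull before slicing, which is what the lambda above does.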

2 Comments

On the same note, is there a way to do a containment search across Series? That is, the output would be a boolean flag indicating whether the string in col1 of a given row is contained in the string in col2 of the same row.
@Drj, I'm not sure what you mean. Please make a full post with example input, expected output, and your attempt so the SO community can help answer your question.

Try it with two null checks, and cast collen to int before slicing :-)

df['new'] = df.apply(lambda x : x['colstr'] if pd.isnull(x['collen']) or pd.isnull(x['colstr']) else x['colstr'][:int(x['collen'])],axis=1)
df
Out[98]: 
   collen     colstr    new
0     5.0  turquoise  turqu
1     3.0      white    whi
2     2.0       blue     bl
3     NaN        red    red
4     3.0       None   None

def f(row):
    if row["colstr"] is None:
        return None
    elif pd.isna(row["collen"]):
        return row["colstr"]
    else:
        return row["colstr"][:int(row["collen"])]

df.apply(f, axis=1)

Result

0    turqu
1      whi
2       bl
3      red
4     None
dtype: object
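
Since the question is not tied to apply, the same logic can also be written without it. A minimal sketch, assuming the sample data from the question; the helper name truncate is made up for illustration and is not part of pandas:

```python
import pandas as pd

df = pd.DataFrame({
    "collen": [5, 3, 2, None, 3],
    "colstr": ["turquoise", "white", "blue", "red", None],
})


def truncate(text, length):
    """Return the first `length` characters of `text`.

    Hypothetical helper: a missing text is returned as-is, and a
    missing length leaves the text untouched.
    """
    if pd.isna(text) or pd.isna(length):
        return text
    # collen is float64 here, so convert before using it as a slice bound.
    return text[:int(length)]


# Pair the two columns row by row instead of calling df.apply(axis=1).
df["new_col_str"] = [
    truncate(text, length) for text, length in zip(df["colstr"], df["collen"])
]

print(df)
```

Iterating over the two columns with zip avoids building a Series for every row, which is typically the slow part of apply with axis=1.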
