I have a dataframe of floats and other numeric values, but some rows have characters mixed in between that I'm trying to remove. I've converted the entire dataframe to strings with data = data.astype(str) and tried X = X[X.var1.isalpha()], but it raises the error 'Series' object has no attribute 'isalpha'. Thanks.

Please share source data and expected output; it will help address the problem. Commented Mar 3, 2021 at 2:15

3 Answers

1

IIUC this is what you want:

import pandas as pd

df = pd.DataFrame({'a':['1','x',2,7], 'b':[2,3,'y',8]})

#   a  b
#0  1  2
#1  x  3
#2  2  y
#3  7  8

df.apply(pd.to_numeric, errors = "coerce").dropna()
#     a    b
#0  1.0  2.0
#3  7.0  8.0
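If it helps to see which rows get dropped, the same idea can be split into an explicit boolean mask. This is only a sketch on the toy frame above; the names numeric, mask, clean and rejected are made up for illustration. Note that it also drops rows that already held a genuine NaN, since those are indistinguishable from failed conversions.

```python
import pandas as pd

df = pd.DataFrame({'a': ['1', 'x', 2, 7], 'b': [2, 3, 'y', 8]})

# Coerce every column; anything that is not numeric becomes NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# True only for rows where every column converted cleanly.
mask = numeric.notna().all(axis=1)

clean = numeric[mask]   # numeric rows, now as floats
rejected = df[~mask]    # original rows that contained non-numeric values

print(clean)
print(rejected)
```

Keeping rejected around makes it easy to check what was thrown away before committing to the cleaned frame.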

1 Comment

Yes, something like that is good. I should've included an input and output, but I'm very new to Stack Overflow and not sure how to write it.
1

isalpha is a string method, so you need to call it through the .str accessor on the Series you want to check, i.e.

X.var1.str.isalpha()
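To actually drop rows with that accessor, the resulting boolean Series has to be negated and used as a mask. A rough sketch follows, assuming a string column named var1 as in the question; the sample values and the names alpha_only, has_letter and X_clean are invented for illustration. Keep in mind that .str.isalpha() is True only when every character is a letter, so a value like 'x9' slips through; .str.contains with a regex catches letters anywhere in the value.

```python
import pandas as pd

# Hypothetical data, already converted to strings as in the question.
X = pd.DataFrame({'var1': ['1.5', 'abc', '2', 'x9'],
                  'var2': ['4', '5', '6', '7']})

# True only where the whole value consists of letters.
alpha_only = X.var1.str.isalpha()

# True where the value contains at least one letter anywhere.
has_letter = X.var1.str.contains(r'[a-zA-Z]', regex=True)

# Keep the rows WITHOUT letters by negating the mask.
X_clean = X[~has_letter]
print(X_clean)
```

One caveat: this would also reject strings such as '1e5', which are valid floats, so it may be too strict for data in scientific notation.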


1

You can use pd.Series.str.replace, which accepts regex, i.e.:

new = df.astype(str).apply(lambda x: x.str.replace('[a-zA-Z]+', '', regex=True)).astype(float)

3 Comments

I'm pretty sure this doesn't work (the astype part). Also, the OP is asking to remove rows, not characters.
What's wrong with astype? Good point on the rows though.
You'll end up with some empty strings (''), which raise an error when cast to float; it amounts to calling float('').
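The failure described in these comments can be reproduced on a small frame. The sketch below uses made-up data and illustrative names (stripped, cast_failed, result): it strips the letters, shows that the float cast fails on the leftover empty strings, and then falls back to pd.to_numeric with errors="coerce" plus dropna(), which removes the affected rows instead.

```python
import pandas as pd

# Hypothetical all-string frame with a letter in each of two rows.
df = pd.DataFrame({'a': ['1', 'x', '2'], 'b': ['3', '4', 'y']})

# Strip letters; 'x' and 'y' turn into empty strings.
stripped = df.apply(lambda col: col.str.replace('[a-zA-Z]+', '', regex=True))

# Casting the empty strings to float fails, much like float('').
try:
    stripped.astype(float)
    cast_failed = False
except (ValueError, TypeError):
    cast_failed = True

# Coercing turns the empty strings into NaN, so the rows can be dropped.
result = stripped.apply(pd.to_numeric, errors="coerce").dropna()
print(cast_failed)
print(result)
```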
