
How do I ignore last whitespace in a line when converting to Pandas DataFrame?

I have a CSV file in the following format:

Column #1   : Type
Column #2   : Total Length
Column #3   : Found
Column #4   : Grand Total

1;2;1;7.00;
2;32;2;0.76;
3;4;6;6.00;
4;1;5;4.00;

I loop through the 'Column #' lines to build my column names first (so 4 columns), then parse the remaining lines into a DataFrame using ';' as the separator. However, some of my files have a trailing ';' at the end of each line, as shown above. Pandas therefore sees a 5th, empty column and throws an error saying that not enough column names were specified.

Is there a mechanism in Pandas to remove or ignore the trailing ';' (or trailing whitespace) when creating a DataFrame? I am using read_csv to create the DataFrame.

Thanks.

1 Answer

Just pass the usecols parameter to read_csv:

In [160]:
import io
import pandas as pd
t="""1;2;1;7.00;
2;32;2;0.76;
3;4;6;6.00;
4;1;5;4.00;"""
df = pd.read_csv(io.StringIO(t), sep=';', header=None, usecols=range(4))
df

Out[160]:
   0   1  2     3
0  1   2  1  7.00
1  2  32  2  0.76
2  3   4  6  6.00
3  4   1  5  4.00

Here range(4) supplies the column indices 0, 1, 2, 3 to indicate which columns I'm interested in, so the empty 5th column created by the trailing ';' is never read.
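To tie this back to the file layout in the question, here is a minimal sketch that parses the 'Column #' lines for the names and then reads only that many columns. The helper name read_report and the inline sample text are made up for illustration, and the sketch assumes every header line starts with 'Column #' and has the name after the first ':'.

```python
import io
import pandas as pd

# Sample text copied from the question; a real file would be read from disk.
raw = """Column #1   : Type
Column #2   : Total Length
Column #3   : Found
Column #4   : Grand Total

1;2;1;7.00;
2;32;2;0.76;
3;4;6;6.00;
4;1;5;4.00;
"""

def read_report(text):
    """Hypothetical helper: build names from 'Column #' lines, then parse the rest."""
    lines = text.splitlines()
    # Column name is whatever follows the first ':' on each header line.
    names = [ln.split(":", 1)[1].strip() for ln in lines if ln.startswith("Column #")]
    # Everything else that is not blank is treated as data.
    data = "\n".join(ln for ln in lines if ln.strip() and not ln.startswith("Column #"))
    # Read only as many columns as there are names, so a trailing ';' is ignored.
    df = pd.read_csv(io.StringIO(data), sep=";", header=None, usecols=range(len(names)))
    df.columns = names
    return df

df = read_report(raw)
print(df)
```

Because usecols is derived from the number of parsed names, the same call should work for files with or without the trailing ';'. The pandas documentation also describes index_col=False for files with a delimiter at the end of each line, which may be worth trying if you prefer to pass names directly.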
