3

I have tried to delete blank rows from my csv file, but it is not working: it only writes out the first line.

Please take a look and tell me how I can get all the rows with text and skip the rows that are blank.

Here is the code: it just reads out the first line of the csv file

Thank you in advance!

3
  • 2
    where is the code? Commented Jul 27, 2017 at 9:27
  • 3
    pd.read_csv(...).dropna() Commented Jul 27, 2017 at 9:27
  • 1
    Hmmm, blank rows are omitted by default. Commented Jul 27, 2017 at 9:28

4 Answers

6

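First, read your csv file with pandas:

import pandas as pd
df = pd.read_csv('input.csv')

Then remove blank rows (note that dropna() with no arguments also drops rows that are only partly empty):

df = df.dropna()

For more details on dropna, check the documentation.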
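
To illustrate the difference between dropping partly empty and fully empty rows, here is a minimal sketch. It assumes a made-up in-memory sample in place of input.csv; the column names and values are hypothetical. Passing how='all' keeps rows that still contain some data.

```python
import io
import pandas as pd

# Hypothetical sample standing in for input.csv: one row of bare
# delimiters (every field empty) and one row with a single missing field.
sample = io.StringIO("a,b,c\n1,2,3\n,,\n4,,6\n")

df = pd.read_csv(sample)

dropped_any = df.dropna()           # drops every row that has at least one NaN
dropped_all = df.dropna(how="all")  # drops only rows where every field is NaN

print(dropped_any)
print(dropped_all)
```

If partly filled rows matter, how='all' is probably the safer choice.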


2 Comments

Is there any way to optimize the row-deleting part? I need to do this for very large csv files (70 GB).
@SamadiSalahedine - dropna is an efficient way of dropping NaN rows. If your file is too large for pandas to handle easily, I suggest using dask. For more details, see dask.pydata.org/en/latest/…
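
For files that do not fit in memory, one alternative to dask (a different technique from the one suggested in the comment above) is the chunksize option of read_csv, which returns an iterator of smaller DataFrames. A rough sketch; the function name, its parameters, and the chunk size are hypothetical:

```python
import pandas as pd

def drop_blank_rows_chunked(src, dst, sep=",", chunksize=100_000):
    """Copy src to dst, dropping fully empty rows, one chunk at a time."""
    first = True
    # dtype=str keeps values as text so numbers are not reformatted on output.
    # Caveat: pandas' default NA markers (such as "NA") are still read as
    # missing and would be written back as empty fields.
    for chunk in pd.read_csv(src, sep=sep, chunksize=chunksize, dtype=str):
        chunk = chunk.dropna(how="all")
        # Write the header with the first chunk only, then append.
        chunk.to_csv(dst, sep=sep, mode="w" if first else "a",
                     header=first, index=False)
        first = False
```

Only one chunk is held in memory at a time, so peak memory depends on chunksize rather than on the file size.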
2

The problem is that iterating over a DataFrame:

for line in df:
    print(line)

returns the column names, not the rows.
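
To walk the rows instead, itertuples() is one option. A small sketch with a made-up two-row sample (the data and variable names are hypothetical):

```python
import io
import pandas as pd

# Hypothetical two-row sample, made up for illustration.
df = pd.read_csv(io.StringIO("x;y\n1;a\n2;b\n"), sep=";")

# Iterating the DataFrame itself yields the column labels.
column_names = [name for name in df]

# itertuples() yields one tuple per row; index=False leaves out the row index.
rows = [tuple(row) for row in df.itertuples(index=False)]

print(column_names)
print(rows)
```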


2

If I have a csv file like below with blank row

B;D;K;N;M;R 

0;2017-04-27 01:35:30;C;3.5;A;01:15:00;23.0 
1;2017-04-27 01:37:30;B;3.5;B;01:13:00;24.0 


2;2017-04-27 01:39:00;K;3.5;C;00:02:00;99.0




4;2017-04-27 01:39:00;K;3.5;C;00:02:00;99.0






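df = pd.read_csv('input.csv', delimiter=';') will give the dataframe, ignoring the blank lines. Because the header has one field fewer than the data rows, pandas uses the first column as the index.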

                     B  D    K  N         M    R 
0  2017-04-27 01:35:30  C  3.5  A  01:15:00  23.0
1  2017-04-27 01:37:30  B  3.5  B  01:13:00  24.0
2  2017-04-27 01:39:00  K  3.5  C  00:02:00  99.0
4  2017-04-27 01:39:00  K  3.5  C  00:02:00  99.0

Your code works when you read the file with open. pandas' read_csv converts the csv file into a dataframe instead, so looping over the result does not give you lines of text. You may have confused the two.

# Read the file as plain text and keep only the non-blank lines
new_contents = []
with open('input.csv') as f:
    for line in f:
        if line.strip():
            new_contents.append(line)
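
Since the question is about writing the cleaned rows back out, the same loop can stream straight into a new file. A minimal sketch; the function name and both paths are hypothetical:

```python
def remove_blank_lines(src_path, dst_path):
    """Copy src_path to dst_path, leaving out whitespace-only lines."""
    kept = 0
    # newline="" passes line endings through unchanged.
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        for line in src:
            if line.strip():
                dst.write(line)
                kept += 1
    return kept  # number of lines written
```

This only drops lines that are empty or whitespace; a line made of bare delimiters such as ;;; is kept.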


0

In pandas (v1.3.0 is the latest at the time of writing), read_csv has an argument that tells it to skip blank rows. It is enabled by default, but you can still pass it explicitly (e.g. for self-documenting code). This is from the docs: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

skip_blank_lines: bool, default True
      If True, skip over blank lines rather than interpreting as NaN values.

So, in your code it is:

df = pd.read_csv(path, sep=';', skip_blank_lines=True)
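
To see what the flag changes, here is a small comparison on a made-up in-memory sample (the text and variable names are hypothetical):

```python
import io
import pandas as pd

text = "B;D\n1;x\n\n2;y\n"  # hypothetical sample with one blank line

# Default behaviour: the blank line is skipped.
skipped = pd.read_csv(io.StringIO(text), sep=";", skip_blank_lines=True)

# With the flag off, the blank line becomes a row of NaN values.
kept = pd.read_csv(io.StringIO(text), sep=";", skip_blank_lines=False)

print(len(skipped), len(kept))
```

With skip_blank_lines=False, a later dropna(how='all') removes those NaN rows again.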
