I have some Excel data that I read in with Python pandas:
import pandas as pd
data = pd.read_csv('..../file.txt', sep='\t')
The mock data looks like this:
unwantedjunkline1
unwantedjunkline2
unwantedjunkline3
ID ColumnA ColumnB ColumnC
1 A B C
2 A B C
3 A B C
...
In this case the data contains 3 junk lines (lines I don't want to read in) before the header, and sometimes it contains 4 or more such junk lines. So in this case I read in the data with:
data = pd.read_csv('..../file.txt', sep='\t', skiprows=3)
The data then looks like:
ID ColumnA ColumnB ColumnC
1 A B C
2 A B C
3 A B C
...
But the number of unwanted lines is different each time. Is there a way to read in a table file using pandas without 'skiprows=', using instead some command that matches the header so it knows to start reading from there? That way I wouldn't have to open the file, count how many unwanted lines it contains, and manually change the 'skiprows=' option every time.
Use open to get a file object, iterate through it until you reach the end of your junk (you'll have to work out how to detect this), then pass the file object into pd.read_csv(fileobject, ..) instead of your file path.
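A rough sketch of that idea, in case it helps. It assumes the header line can be recognised because it starts with "ID" followed by a tab, as in the mock data; the function name read_table_from_header and the header_prefix parameter are made up for illustration and are not part of pandas. It uses readline() and tell() rather than a for loop, so the file object can be rewound to the start of the header before pandas takes over.

```python
import io

import pandas as pd


def read_table_from_header(fileobject, header_prefix="ID\t", sep="\t"):
    """Skip leading junk lines, then let pandas parse from the header on.

    fileobject must be a seekable text-mode file object.
    header_prefix is an assumption about how the header line begins.
    """
    while True:
        pos = fileobject.tell()        # position before this line
        line = fileobject.readline()
        if not line:                   # reached end of file
            raise ValueError("header line not found")
        if line.startswith(header_prefix):
            fileobject.seek(pos)       # rewind to the start of the header
            break
    return pd.read_csv(fileobject, sep=sep)


# Small in-memory stand-in for the file, built from the mock data above.
sample = (
    "unwantedjunkline1\n"
    "unwantedjunkline2\n"
    "unwantedjunkline3\n"
    "ID\tColumnA\tColumnB\tColumnC\n"
    "1\tA\tB\tC\n"
    "2\tA\tB\tC\n"
    "3\tA\tB\tC\n"
)
data = read_table_from_header(io.StringIO(sample))
print(data)
```

With a real file you would wrap the call in "with open(path) as f:" and pass f in place of the StringIO object. If the header does not reliably start with "ID", swap the startswith check for whatever test identifies your header line.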