I'm trying to write a Pandas script to extract data from several Excel files. Each file contains between 10 and 15 columns. From these I need the 1st column, which has a different header in every file, and some other columns which always have the same header names ('TOTAL', 'CLEAR', 'NON-CLEAR' and 'SYSTEM') but sit at different column indices in different files. (For example, 'TOTAL' is the 3rd column of the table in one file but the 5th column in another.)
I know that the usecols keyword lets me specify which columns to use, but it looks like this argument takes either header names only or column indices only, never a combination of both.
Is it possible to write a statement that takes the 1st column by its index and the other columns by header name at the same time?
The statement below doesn't work:
df = pd.read_excel(file, usecols = [0,'TOTAL', 'CLEAR', 'NON-CLEAR','SYSTEM'])
With the usecols arg, you could read 0 rows and just splice your columns together. usecols does take a callable.
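A minimal sketch of that suggestion, assuming every file has its header in the first row and contains all four named columns. The names `WANTED`, `columns_to_read` and `read_selected` are made up for illustration; only `pd.read_excel` with its `usecols` and `nrows` arguments comes from pandas. The idea is to read the header alone first, look up the name of the 1st column, and then select everything by name.

```python
WANTED = ["TOTAL", "CLEAR", "NON-CLEAR", "SYSTEM"]  # headers from the question


def columns_to_read(header, wanted=WANTED):
    """Return the first header plus every wanted header, without duplicates."""
    header = list(header)
    missing = [name for name in wanted if name not in header]
    if missing:
        raise KeyError(f"missing expected columns: {missing}")
    first = header[0]
    return [first] + [name for name in wanted if name != first]


def read_selected(path, wanted=WANTED):
    """Hypothetical helper: read the 1st column and the named columns of one file."""
    import pandas as pd  # local import so columns_to_read() stays usable on its own

    # nrows=0 returns an empty frame that still carries the header names.
    header = pd.read_excel(path, nrows=0).columns
    cols = columns_to_read(header, wanted)

    # Option 1: pass the resolved names as a list.
    df = pd.read_excel(path, usecols=cols)
    # Option 2: pass a callable, which is evaluated against each column name.
    # df = pd.read_excel(path, usecols=lambda name: name in cols)

    # Index with cols so the result has the same column order for every file.
    return df[cols]
```

Usage would be something like `df = read_selected(file)` inside the loop over the files. Note that the callable receives column names, not positions, so the 1st column's name still has to be looked up from the header before it can be matched.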