13

I have a dataframe that looks like this:

         level_0              level_1 Repo Averages for 27 Jul 2018
0  Business Date           Instrument                           Ccy
1     27/07/2018  GC_AUSTRIA_SUB_10YR                           EUR
2     27/07/2018    R_RAGB_1.15_10/18                           EUR
3     27/07/2018    R_RAGB_4.35_03/19                           EUR
4     27/07/2018    R_RAGB_1.95_06/19                           EUR

I am trying to get rid of the top row and only keep

   Business Date           Instrument         Ccy
0     27/07/2018  GC_AUSTRIA_SUB_10YR         EUR
1     27/07/2018    R_RAGB_1.15_10/18         EUR
2     27/07/2018    R_RAGB_4.35_03/19         EUR
3     27/07/2018    R_RAGB_1.95_06/19         EUR

I tried df.columns.droplevel(0), but it was not successful. Any help is more than welcome.

4
  • 1
    Where are you getting the data from? It looks like an issue in reading the data. Commented Jul 31, 2018 at 10:39
  • 1
    You are likely to get answers quicker if you have runnable code in your question. Commented Jul 31, 2018 at 10:40
  • It is an automatically generated file with a weird structure. The top row is like a title, so I have to read in everything and then delete the undesirable rows. Commented Jul 31, 2018 at 10:42
  • 5
    pd.read_csv('myfile', skiprows = 1) Commented Jul 31, 2018 at 10:46

5 Answers 5

8

You can take advantage of the header parameter of pd.read_csv (see the pandas documentation for more about it).

Let's say that you have the following dataset, read without treating any line as the header

df = pd.read_csv("Prices.csv", header=None)
print(df)

That outputs

              0       1     2         3         4
0      DATA      SESSAO  HORA  PRECO_PT  PRECO_ES
1      1/1/2020  0       1     41,88     41,88   
2      1/1/2020  0       2     38,60     38,60   
3      1/1/2020  0       3     36,55     36,55 

Passing header=0 tells pandas to use the first line of the file as the column names, like this

df = pd.read_csv("Prices.csv", header=0)
print(df)

You will get what you want

           DATA  SESSAO  HORA PRECO_PT PRECO_ES
0      1/1/2009  0       1     55,01    55,01  
1      1/1/2009  0       2     56,13    56,13  
2      1/1/2009  0       3     50,59    50,59  
3      1/1/2009  0       4     45,83    45,83  
4      1/1/2009  0       5     42,07    41,90 
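
For the file in the question, the real header sits on the second line, below a title line. Here is a minimal sketch of how the header parameter could handle that. The file contents in `raw` are an assumption reconstructed from the question (the real file may be laid out differently), and `df_header` is just an illustrative name.

```python
import io
import pandas as pd

# Hypothetical file contents: a title line, then the real header, then data.
raw = (
    "Repo Averages for 27 Jul 2018,,\n"
    "Business Date,Instrument,Ccy\n"
    "27/07/2018,GC_AUSTRIA_SUB_10YR,EUR\n"
    "27/07/2018,R_RAGB_1.15_10/18,EUR\n"
)

# header=1 uses the second line (0-based index 1) as the column names;
# the line before it is not kept as data.
df_header = pd.read_csv(io.StringIO(raw), header=1)
print(df_header)
```

With header=1 the title line is dropped and the data rows start right after the header line.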

1 Comment

This gives a working solution with a clear explanation AND links to relevant documentation. Thanks!
7

You can try this:

df.columns = df.iloc[0]
df = df.reindex(df.index.drop(0)).reset_index(drop=True)
df.columns.name = None

Output:

  Business Date           Instrument  Ccy
0    27/07/2018  GC_AUSTRIA_SUB_10YR  EUR
1    27/07/2018    R_RAGB_1.15_10/18  EUR
2    27/07/2018    R_RAGB_4.35_03/19  EUR
3    27/07/2018    R_RAGB_1.95_06/19  EUR
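
To try this without the original file, here is a self-contained sketch that rebuilds a shortened version of the frame from the question and promotes its first row to the header. It uses iloc slicing instead of reindex, which should give the same result here. The names `header` and `cleaned` are only illustrative.

```python
import pandas as pd

# Shortened copy of the frame shown in the question.
df = pd.DataFrame({
    "level_0": ["Business Date", "27/07/2018", "27/07/2018"],
    "level_1": ["Instrument", "GC_AUSTRIA_SUB_10YR", "R_RAGB_1.15_10/18"],
    "Repo Averages for 27 Jul 2018": ["Ccy", "EUR", "EUR"],
})

# Take the values of the first row as the new column labels.
header = df.iloc[0].tolist()

# Drop that row and renumber the remaining rows from 0.
cleaned = df.iloc[1:].reset_index(drop=True)
cleaned.columns = header
print(cleaned)
```

Because the labels are assigned from a plain list, there is no leftover columns name to clear afterwards.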


3

You can try using slicing.

df = df[1:]

This will remove the first row of your dataframe.
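
Note that slicing on its own only drops the row. A small sketch on a cut-down version of the question's frame (the variable name `sliced` is illustrative) suggests the original column labels stay in place and the index no longer starts at 0:

```python
import pandas as pd

# Cut-down copy of the frame shown in the question.
df = pd.DataFrame({
    "level_0": ["Business Date", "27/07/2018"],
    "level_1": ["Instrument", "GC_AUSTRIA_SUB_10YR"],
})

sliced = df[1:]
print(sliced.columns.tolist())  # the old labels are unchanged
print(sliced.index.tolist())    # the remaining row keeps its old index label
```

So the column names would still need to be set separately to get the layout asked for in the question.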

2 Comments

Even if the answer is accepted, have you tested it on the given example?
I agree with @Joe, this example does not work.
1
# row_start and row_end are positional bounds; row_end is exclusive
df = df.drop(df.index[row_start:row_end])

This will help

1 Comment

don't use code snippets if the code is not executable, use code formatting instead.
0

I tested the comment by jeremycg. It works very well and is succinct. I just want more people to see it, so here it is again:

my_df = pd.read_csv(r"C:\path\to\my\file.csv", skiprows = 1)
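
The same call can be tried without a file on disk. This is a minimal sketch that assumes the file consists of a single title line followed by the header and the data. The contents of `raw` are reconstructed from the question and are hypothetical, as is the name `df_skip`.

```python
import io
import pandas as pd

# Hypothetical file contents standing in for the real CSV.
raw = (
    "Repo Averages for 27 Jul 2018\n"
    "Business Date,Instrument,Ccy\n"
    "27/07/2018,GC_AUSTRIA_SUB_10YR,EUR\n"
    "27/07/2018,R_RAGB_1.15_10/18,EUR\n"
)

# skiprows=1 discards the title line before parsing, so the next line
# becomes the header.
df_skip = pd.read_csv(io.StringIO(raw), skiprows=1)
print(df_skip)
```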
