0

I have this excel table read in jupyter notebook with pandas. I want to melt the upper row side of the table into column. The table looks like as follow:

                              ori code   cgk       cgk    clg       clg
                              ori city   jakarta   NaN    cilegon   NaN
                              ori prop   jakarta   NaN    banten    NaN
                              ori area   jawa      NaN    jawa      NaN

code    city       district   island    type a     days    type b    days
001     jakarta    jakarta    jawa      12000       2      13000      3
002     surabaya   surabaya   jawa      13000       3      14000      4

I realized that df.melt should be worked to transpose the upper rows, but the type & dayscolumns, and also the 4 rows and the NaN value on it get me confuse on how to do that correctly.

The desire clean dataframe I need is as follow:

code   city        district    island    type       price_type   days    ori_code   ori_city     ori_prop   ori_area
001    jakarta     jakarta     jawa      type a     12000         2         cgk     jakarta    jakarta      jawa
001    jakarta     jakarta     jawa      type b     13000         3         clg     cilegon    banten       jawa 
002    surabaya    surabaya    jawa      type a     13000         3         cgk     jakarta    jakarta      jawa
002    surabaya    surabaya    jawa      type b     14000         4         clg     cilegon    banten       jawa

The ori_code, ori_city, ori_prop, ori_area would become column names.

So far what I have done is set fix index name which are code, city, district and also island.

df = df.set_index(['code','city','district','island'])

can anyone help me to solve this problem? Any helps would be much appreciated. Thank you in advance.

1
  • kindly share a reproducible form : df.to_dict() Commented Dec 21, 2022 at 5:59

1 Answer 1

1

For this you can use pandas melt function like this:

import pandas as pd

# Set the index for the DataFrame
df = df.set_index(['code', 'city', 'district', 'island'])

# Use pd.melt to reshape the data
df = pd.melt(df, id_vars=['code', 'city', 'district', 'island'], var_name='type', value_name='price_type')

# Split the 'type' column into two columns: 'type' and 'days'
df[['type', 'days']] = df['type'].str.split(' ', expand=True)

# Drop the 'ori code', 'ori city', 'ori prop', and 'ori area' columns
df = df.drop(columns=['ori code', 'ori city', 'ori prop', 'ori area'])

# Reorder the columns
df = df[['code', 'city', 'district', 'island', 'type', 'price_type', 'days', 'ori_code', 'ori_city', 'ori_prop', 'ori_area']]

# Display the resulting DataFrame
print(df)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.