1

I have the following dataframe:

Date        Country  Type  Consumption
01/01/2019  Fr       IE    186
02/01/2019  Fr       IE    131
01/01/2019  Fr       SE    115
02/01/2019  Fr       SE    141
03/01/2019  Fr       SE    158
01/01/2019  Po       DK    208
01/01/2019  Po       IE    150
02/01/2019  Po       IE    136
01/01/2019  Po       SE    210
02/01/2019  Po       SE    195
03/01/2019  Po       SE    160
01/01/2019  Hk       DK    229
01/01/2019  Hk       IE    159
02/01/2019  Hk       IE    210
01/01/2019  Hk       SE    130
02/01/2019  Hk       SE    179
03/01/2019  Hk       SE    143

I want to split it into multiple dataframes by country and type. For example, I want to have df_1, df_2, df_3, df_4, and so on.

I created another dataframe

df = pd.DataFrame({
"Country": ["Fr", "Po"],
"Type": ["IE", "SE"]})

because I only want to create new dataframes for the values in "df".

I used the following code:

# create unique list of names
UniqueNames = pd.unique(df[['Country','Type']].values.ravel())
DataFrameDict = {elem : pd.DataFrame for elem in UniqueNames}

for key in DataFrameDict.keys():
    DataFrameDict[key] = df3[:][df3.Country == key]

But this does not do what I want: each resulting dataframe contains all Type values.

How can this be achieved?

I also tried the following code:

d = {}
for name, group in df3.groupby(['Country','Type']):
    d['group_' + str(name)] = group

The problem is that it creates a dataframe for every unique combination of Country and Type, while I only need a few combinations.

Also, the dataframe names look like d["group_('Fr', 'IE')"] and d["group_('Fr', 'SE')"].

Can I change these names to much simpler ones like Fr_IE and Fr_SE? I need to run many other functions on each of these dataframes.

3
  • Here's a hint to get you started: you need groupby with Country + Type, so look up pd.DataFrame.groupby. Commented Dec 18, 2019 at 14:00
  • Please paste the dataframe as code, not as images. If we want to reproduce your code, we have to type it out line by line. Commented Dec 18, 2019 at 14:06
  • Done. Commented Dec 18, 2019 at 14:39

2 Answers

1

Convert the dataframe with the desired values (df) into a list of tuples so you can loop over it and filter:

tuples = [tuple(x) for x in df.values]

Then filter the original dataframe with each item in the list. Here I print each result, but you might want to do something else:

for mytuple in tuples:
    print(original_df[(original_df['Country'] == mytuple[0]) & (original_df['Type'] == mytuple[1])])

To keep each dataframe for later use, store them in a list:

my_dfs = [original_df[(original_df['Country'] == mytuple[0]) & (original_df['Type'] == mytuple[1])] for mytuple in tuples]
for my_df in my_dfs:
    print(my_df)
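
The question also asks for simple names such as Fr_IE. One way to get them, sketched here under the assumption that a dict is acceptable in place of separate variables, is to key each filtered dataframe by "Country_Type". The sample rows are a subset of the question's data, and dfs_by_name is a name made up for this example:

```python
import pandas as pd

# A few rows copied from the question's data, enough to show the idea
original_df = pd.DataFrame({
    "Date": ["01/01/2019", "02/01/2019", "01/01/2019", "02/01/2019",
             "01/01/2019", "01/01/2019", "01/01/2019"],
    "Country": ["Fr", "Fr", "Fr", "Po", "Po", "Po", "Hk"],
    "Type": ["IE", "IE", "SE", "SE", "IE", "DK", "SE"],
    "Consumption": [186, 131, 115, 195, 150, 208, 130],
})

# Key dataframe from the question: each row is one wanted (Country, Type) pair
df = pd.DataFrame({"Country": ["Fr", "Po"], "Type": ["IE", "SE"]})

# One filtered dataframe per wanted pair, stored under a "Country_Type" key
dfs_by_name = {}
for country, type_ in df.itertuples(index=False, name=None):
    mask = (original_df["Country"] == country) & (original_df["Type"] == type_)
    dfs_by_name[f"{country}_{type_}"] = original_df[mask]

print(sorted(dfs_by_name))   # ['Fr_IE', 'Po_SE']
print(dfs_by_name["Fr_IE"])
```

Each entry can then be passed to other functions, e.g. some_function(dfs_by_name["Fr_IE"]), where some_function stands for whatever processing is needed.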

9 Comments

Thanks for sharing this, but I will need the Consumption column.
Sorry about that, I forgot to change one dataframe. Try changing the last statement to: print(df[(df['Country'] == mytuple[0]) & (df['Type'] == mytuple[1])]). Note the change from df2 to df at the start.
This does not give the desired output.
Can you please specify the difference so I can help? From what I understood, it gives each dataframe separately with all the fields.
Save every dataframe in a list first, as the edit says.
0

If I understood the question correctly, when you define the key dataframe df as you did:

df = pd.DataFrame({
"Country": ["Fr", "Po"],
"Type": ["IE", "SE"]})

you are missing the other combinations, such as ['Fr','SE'] and ['Po','IE'].
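
A small illustration of the difference, as a sketch only: reading the key dataframe row by row yields two pairs, whereas combining every country with every type yields four. The names row_pairs and all_pairs are made up for this example:

```python
import itertools

import pandas as pd

# Key dataframe exactly as defined in the question
df = pd.DataFrame({"Country": ["Fr", "Po"], "Type": ["IE", "SE"]})

# Row-wise reading: only the pairs that appear together in a row
row_pairs = list(df.itertuples(index=False, name=None))

# Cross product: every listed country combined with every listed type
all_pairs = list(itertools.product(df["Country"].unique(), df["Type"].unique()))

print(row_pairs)       # 2 pairs
print(len(all_pairs))  # 4 pairs
```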

I solved the problem as shown below. Hope this helps:

import pandas as pd

# I put your original data in a file called data.txt
# and read it into a dataframe called df_data
df_data = pd.read_csv('data.txt', sep=',')
print(df_data)

# Creating a dataframe of all selected country and type pairs
df_temp = df_data.groupby(['Country', 'Type']).size().reset_index(name='Count')
df = df_temp[df_temp['Country'].isin(['Fr', 'Po']) & df_temp['Type'].isin(['IE', 'SE'])].drop('Count', axis=1)
print(df)

# Then loop through the tuples
tuples = [tuple(x) for x in df.values]
my_dfs = [df_data[(df_data['Country'] == mytuple[0]) & (df_data['Type'] == mytuple[1])] for mytuple in tuples]

for my_df in my_dfs:
    print(my_df)
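
As a possible variant of the same idea, the selection can also be sketched with an inner merge followed by groupby, which avoids building one boolean mask per pair and gives each result a simple name. This is an alternative, not the approach used above; the sample rows are a subset of the question's data, and wanted and named_dfs are hypothetical names:

```python
import itertools

import pandas as pd

# A few rows copied from the question's data
df_data = pd.DataFrame({
    "Date": ["01/01/2019", "02/01/2019", "01/01/2019", "02/01/2019",
             "01/01/2019", "01/01/2019", "01/01/2019"],
    "Country": ["Fr", "Fr", "Fr", "Po", "Po", "Po", "Hk"],
    "Type": ["IE", "IE", "SE", "SE", "IE", "DK", "SE"],
    "Consumption": [186, 131, 115, 195, 150, 208, 130],
})

# All wanted (Country, Type) combinations as a small lookup dataframe
wanted = pd.DataFrame(
    list(itertools.product(["Fr", "Po"], ["IE", "SE"])),
    columns=["Country", "Type"],
)

# The inner merge keeps only rows whose (Country, Type) pair is in `wanted`
selected = df_data.merge(wanted, on=["Country", "Type"], how="inner")

# Split the selection into one dataframe per pair, keyed like "Fr_IE"
named_dfs = {
    f"{country}_{type_}": group
    for (country, type_), group in selected.groupby(["Country", "Type"])
}

for name, frame in named_dfs.items():
    print(name)
    print(frame)
```

The merge does the filtering in one step, so the loop only has to name the groups; pairs that never occur in the data simply produce no entry.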
