Create smaller dataframes from a large dataframe using the index values from a list

Question

I have a list

a = [15, 50 , 75]

Using the above list, I have to create smaller dataframes by selecting rows from the main dataframe by index, where the number of rows in each one is defined by the list.

Let's say my main dataframe is df. The dataframes I'd like to have are df1 (from row index 0-15), df2 (from row index 15-65), and df3 (from row index 65-125).

Since there are just three, I can easily use something like this:

limit1 = a[0]
limit2 = a[1] + limit1
limit3 = a[2] + limit2

df1 = df.loc[df.index <= limit1]
df2 = df.loc[(df.index > limit1) & (df.index <= limit2)]
df2 = df2.reset_index(drop = True)
df3 =  df.loc[(df.index > limit2) & (df.index <= limit3)]
df3 = df3.reset_index(drop = True)

But what if I want to do this with a long list on the main dataframe df? I am looking for something iterative, like the following (which doesn't work):

df1 = df.loc[df.index <= limit1]
for i in range(2,3):
 for j in range(2,3):
  for k in range(2,3):
   df[i] =  df.loc[(df.index > limit[j]) & (df.index <= limit[k])]
   df[i] = df[i].reset_index(drop=True)
   print(df[i])

According to your logic it should be from 0-15, then from 15-65, and then from 65-90; otherwise your rule is changing.
– Juan C, Commented Dec 11, 2019 at 13:38

Yonas Kassa · Accepted Answer · 2019-12-11 14:15:01Z

2

You could modify your code to build the dataframes iteratively, cutting slices off the end of the main dataframe.

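dfs = []  # this list will contain your partitioned dataframes
a = [15, 50, 75]  # used here as row positions at which to cut
for idx in a[::-1]:
    dfs.insert(0, df.iloc[idx:])  # take the tail that starts at idx
    df = df.iloc[:idx]            # keep the rest for the next iteration
dfs.insert(0, df)  # add the last remaining dataframe (rows before a[0])
print(dfs)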

Another option is to use a list comprehension, as follows:

a = [0, 15, 50 , 75]
dfs = [df.iloc[a[i]:a[i+1]] for i in range(len(a)-1)]
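
Note that both snippets above use the values in a as absolute row positions. If the values are meant as row counts instead, as the question seems to describe, the cut points are the running sum of the list. A minimal sketch of that variant, assuming a made-up 140-row frame and the standard-library itertools.accumulate (neither is part of the original answer):

```python
from itertools import accumulate

import pandas as pd

# Hypothetical stand-in for the main dataframe: 140 rows, one column.
df = pd.DataFrame({'value': range(140)})

lengths = [15, 50, 75]              # rows per chunk, read as counts
bounds = [0, *accumulate(lengths)]  # running sum: [0, 15, 65, 140]

# Pair each boundary with the next one and slice by position;
# reset_index(drop=True) renumbers every chunk from 0.
dfs = [df.iloc[start:stop].reset_index(drop=True)
       for start, stop in zip(bounds, bounds[1:])]

print([len(chunk) for chunk in dfs])  # [15, 50, 75]
```

Pairing bounds with bounds[1:] avoids indexing with i and i+1, and it works for a list of any length.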

edited Dec 11, 2019 at 14:15

answered Dec 11, 2019 at 13:35

Yonas Kassa


4 Comments

Juan C Over a year ago

on a 75 row dataframe this brings a 60 row df, then a 10 row df and then a 0 row df

Juan C Over a year ago

Now it brings a 35 row, a 25 row and a 0 row df

Andrea Over a year ago

0-to-15 is missing. I would set: a = [0, 15, 50 , 75]

Yonas Kassa Over a year ago

true. adding dfs.insert(0, df) at the end solves it.

Andrea · Accepted Answer · 2019-12-11 14:05:22Z

1

This does it. It's better to use a dictionary if you want to store multiple dataframes and look them up later. Creating variables dynamically in a loop is bad practice, so avoid it.

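import numpy as np
import pandas as pd

df = pd.DataFrame(np.linspace(1, 75, 75), columns=['a'])
a = [15, 50, 25]  # number of rows in each chunk
d = {}

b = 0  # start position of the current chunk
for n, i in enumerate(a):
    d[f'df{n}'] = df.iloc[b:b+i]  # iloc clips the slice at the end of df
    b += i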
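
Once the chunks are in a dictionary, they can be looked up by key instead of by variable name. A small sketch of how that might look, assuming the same df and a as above and adding the reset_index(drop=True) step from the question (the sizes dictionary is only there for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.linspace(1, 75, 75), columns=['a'])
a = [15, 50, 25]  # number of rows in each chunk

d = {}
b = 0  # start position of the current chunk
for n, i in enumerate(a):
    # reset_index(drop=True) renumbers each chunk from 0
    d[f'df{n}'] = df.iloc[b:b + i].reset_index(drop=True)
    b += i

# Collect the row count of every chunk, keyed by its name.
sizes = {name: len(chunk) for name, chunk in d.items()}
print(sizes)

# Access a single chunk by key.
print(d['df1'].head())
```

With this data the last chunk ends up with only 10 rows, because 15 + 50 + 25 is more than the 75 rows available and iloc slicing stops at the end of the frame without raising an error.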


edited Dec 11, 2019 at 14:05

Andrea


answered Dec 11, 2019 at 13:46

Juan C

