How do I convert a Pandas dataframe from a 'frequency table' format to a flat dataframe format and back again using idiomatic Python?

From:

        H     E     K
0       B     B    12
1       B     G     3
2       G     B    17
3       G     G    68

to:

        H     E
0       B     B
1       B     B
2       B     B
3       B     B
4       B     B
5       B     B
6       B     B
7       B     B
8       B     B
9       B     B
10      B     B
11      B     B
12      B     G
13      B     G
14      B     G
...

and back again!

        H     E     K
0       B     B    12
1       B     G     3
2       G     B    17
3       G     G    68

Please advise.

  • Scale up: `new_df = df.loc[df.index.repeat(df['K'])].reset_index(drop=True)`, like this answer. Commented Feb 25, 2022 at 20:39
  • Scale back down: `df = new_df.groupby(['H', 'E']).size().reset_index(name='K')`, like this answer. Commented Feb 25, 2022 at 20:39
  • @henry-ecker, thanks for the benefit of your expertise. Can I drop the 'K' column as part of the conversion? Commented Feb 25, 2022 at 20:46
  • Yeah. Just drop the column: `new_df = df.loc[df.index.repeat(df['K'])].drop(columns='K').reset_index(drop=True)` Commented Feb 25, 2022 at 20:49
  • @henry-ecker, when I dump `new_df` to a CSV file, there are 11 'B-B' rows, 2 'B-G' rows, and so on, instead of 12, 3, 17, and 68 respectively. Why? Commented Feb 25, 2022 at 20:59
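
Putting the one-liners from the comments together, the round trip can be sketched as below. This is a minimal illustration, not a definitive answer: the names `freq`, `flat`, and `back` are made up for the example, and the data is the four-row table from the question. The row-count check at the end is one way to investigate the CSV discrepancy from the last comment. A possible cause (an assumption, not something confirmed in the thread) is reading the zero-based index labels as counts, since labels 0 to 11 cover 12 rows.

```python
import pandas as pd

# Frequency table from the question: K is the count for each (H, E) pair.
freq = pd.DataFrame({
    "H": ["B", "B", "G", "G"],
    "E": ["B", "G", "B", "G"],
    "K": [12, 3, 17, 68],
})

# Expand: repeat each index label K times, select those rows,
# drop the count column, and renumber the rows from 0.
flat = (
    freq.loc[freq.index.repeat(freq["K"])]
    .drop(columns="K")
    .reset_index(drop=True)
)

# Collapse: count the rows in each (H, E) group to rebuild K.
back = flat.groupby(["H", "E"]).size().reset_index(name="K")

# Sanity checks: the flat frame has one row per counted observation,
# and the per-group sizes match the original K values.
print(len(flat) == freq["K"].sum())
print(flat.value_counts(sort=False))
print(back)
```

Two caveats about the collapse step: `groupby` sorts the group keys by default, so the rebuilt table may be in a different row order than the original, and any combination with a count of 0 produces no rows in the flat frame and so cannot be recovered.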
