How to transform data from a DataFrame

import pandas as pd

df = pd.DataFrame(
    [
        ['2021-12-14 12:00:00','subgroup_1','group_1','Subgroup 1'],
        ['2021-12-14 12:15:00','subgroup_1','group_1','Subgroup 1'],
        ['2021-12-14 12:15:00','subgroup_1','group_1','Subgroup 1'],
        ['2021-12-14 12:30:00','subgroup_1','group_1','Subgroup 1'],
        ['2021-12-14 12:45:00','subgroup_1','group_1','Subgroup 1'],
        ['2021-12-14 13:00:00','subgroup_1','group_1','Subgroup 1'],
        ['2021-12-14 12:30:00','subgroup_3','group_2','Subgroup 3'],
        ['2021-12-14 12:45:00','subgroup_3','group_2','Subgroup 3'],
        ['2021-12-14 13:00:00','subgroup_3','group_2','Subgroup 3'],
    ], columns=['timestamp','subgroup','group','name']
)

to JSON using pandas? Please help me.

2 Answers

My solution:

df["timestamp"] = pd.to_datetime(df.timestamp)

out = {}

for k, df_group in df.groupby("group"):
    out[k] = []
    for _, df_subgroup in df_group.groupby("subgroup"):
        name = df_subgroup["name"].values[0]  # Assuming `name` is unique in this group
        count = len(df_subgroup)
        # Count rows per timestamp (names chosen so the outer `k` is not shadowed)
        timegroup = [
            {"index": ts, "value": n}
            for ts, n in df_subgroup.groupby("timestamp")["name"]
            .count()
            .to_dict()
            .items()
        ]

        # Complete with missing timestamps
        present = {t["index"] for t in timegroup}
        for ts in df.timestamp.unique():
            if pd.Timestamp(ts) not in present:
                timegroup.append({"index": pd.Timestamp(ts), "value": 0})

        # Sort by timestamp
        timegroup = sorted(timegroup, key=lambda x: x["index"])
        
        # Convert timestamp into strings
        for t in timegroup:
            t["index"] = t["index"].strftime("%Y-%m-%d %H:%M:%S")
  
        out[k].append({"name": name, "count": count, "timegroup": timegroup})

Result in out:

{
   "group_1": [
      {
         "name": "Subgroup 1",
         "count": 6,
         "timegroup": [
            {
               "index": "2021-12-14 12:00:00",
               "value": 1
            },
            {
               "index": "2021-12-14 12:15:00",
               "value": 2
            },
            {
               "index": "2021-12-14 12:30:00",
               "value": 1
            },
            {
               "index": "2021-12-14 12:45:00",
               "value": 1
            },
            {
               "index": "2021-12-14 13:00:00",
               "value": 1
            }
         ]
      }
   ],
   "group_2": [
      {
         "name": "Subgroup 3",
         "count": 3,
         "timegroup": [
            {
               "index": "2021-12-14 12:00:00",
               "value": 0
            },
            {
               "index": "2021-12-14 12:15:00",
               "value": 0
            },
            {
               "index": "2021-12-14 12:30:00",
               "value": 1
            },
            {
               "index": "2021-12-14 12:45:00",
               "value": 1
            },
            {
               "index": "2021-12-14 13:00:00",
               "value": 1
            }
         ]
      }
   ]
}
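
A more compact sketch of the same idea, in case it helps: it counts rows per timestamp with `value_counts`, fills the missing timestamps with `reindex(..., fill_value=0)`, and serializes the dict with `json.dumps`. It assumes the timestamps stay as strings in the `YYYY-MM-DD HH:MM:SS` format (which sorts chronologically as text) and that each subgroup has a single `name`. The variable names `rows`, `all_timestamps`, `part` and `json_text` are only illustrative.

```python
import json

import pandas as pd

# Sample rows copied from the question.
rows = [
    ["2021-12-14 12:00:00", "subgroup_1", "group_1", "Subgroup 1"],
    ["2021-12-14 12:15:00", "subgroup_1", "group_1", "Subgroup 1"],
    ["2021-12-14 12:15:00", "subgroup_1", "group_1", "Subgroup 1"],
    ["2021-12-14 12:30:00", "subgroup_1", "group_1", "Subgroup 1"],
    ["2021-12-14 12:45:00", "subgroup_1", "group_1", "Subgroup 1"],
    ["2021-12-14 13:00:00", "subgroup_1", "group_1", "Subgroup 1"],
    ["2021-12-14 12:30:00", "subgroup_3", "group_2", "Subgroup 3"],
    ["2021-12-14 12:45:00", "subgroup_3", "group_2", "Subgroup 3"],
    ["2021-12-14 13:00:00", "subgroup_3", "group_2", "Subgroup 3"],
]
df = pd.DataFrame(rows, columns=["timestamp", "subgroup", "group", "name"])

# Timestamps are kept as strings here (assumption): this fixed-width format
# sorts chronologically as plain text, so no datetime conversion is needed.
all_timestamps = sorted(df["timestamp"].unique())

out = {}
for (group, subgroup), part in df.groupby(["group", "subgroup"]):
    # Count rows per timestamp, then add the missing timestamps with a 0 count.
    counts = part["timestamp"].value_counts().reindex(all_timestamps, fill_value=0)
    out.setdefault(group, []).append(
        {
            "name": part["name"].iloc[0],  # assumes one name per subgroup
            "count": len(part),
            "timegroup": [
                # int() turns the NumPy integer into a JSON-serializable int
                {"index": ts, "value": int(n)}
                for ts, n in counts.items()
            ],
        }
    )

# Serialize the nested dict to a JSON string.
json_text = json.dumps(out, indent=3)
print(json_text)
```

For the sample data this should give the same structure as the result shown above; use `json.dump(out, f)` instead if the JSON should go to a file.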

To get your desired output, I applied 4 steps:

Code:

import json

# STEP 1: group by group and name, collecting each column's values into lists
df = df.groupby(['group','name']).agg(list).reset_index('name')

# STEP 2: add a count column holding the length of each timestamp list
df['count'] = df['timestamp'].apply(len)

# STEP 3: replace each timestamp list with a list of {'index', 'value'} dicts,
# where 'value' is the number of times that timestamp occurs
for v in df.index:
    timestamps = df.at[v, 'timestamp']
    # set() is unordered, so the entries come out in arbitrary order
    df.at[v, 'timestamp'] = [
        {'index': i, 'value': timestamps.count(i)} for i in set(timestamps)
    ]

# STEP 4: convert to JSON keyed by the index (group)
[json.loads(df[['name','count','timestamp']].to_json(orient="index"))]

Output:

[{'group_1': {'name': 'Subgroup 1',
   'count': 6,
   'timestamp': [{'index': '2021-12-14 12:00:00', 'value': 1},
    {'index': '2021-12-14 12:30:00', 'value': 1},
    {'index': '2021-12-14 13:00:00', 'value': 1},
    {'index': '2021-12-14 12:15:00', 'value': 2},
    {'index': '2021-12-14 12:45:00', 'value': 1}]},
  'group_2': {'name': 'Subgroup 3',
   'count': 3,
   'timestamp': [{'index': '2021-12-14 12:30:00', 'value': 1},
    {'index': '2021-12-14 12:45:00', 'value': 1},
    {'index': '2021-12-14 13:00:00', 'value': 1}]}}]
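
The entries in the output above are not in chronological order because `set` is unordered. If the order matters, one possible sketch is to count with `value_counts()` and sort the result with `sort_index()`. It uses a reduced sample of the question's data, assumes string timestamps in the `YYYY-MM-DD HH:MM:SS` format and one `name` per group, and the names `result` and `part` are only illustrative.

```python
import pandas as pd

# Reduced sample with the rows deliberately out of order.
df = pd.DataFrame(
    [
        ["2021-12-14 12:15:00", "group_1", "Subgroup 1"],
        ["2021-12-14 12:00:00", "group_1", "Subgroup 1"],
        ["2021-12-14 12:15:00", "group_1", "Subgroup 1"],
        ["2021-12-14 12:30:00", "group_2", "Subgroup 3"],
    ],
    columns=["timestamp", "group", "name"],
)

result = {}
for (group, name), part in df.groupby(["group", "name"]):
    # Occurrences per timestamp, sorted by the timestamp string.
    counts = part["timestamp"].value_counts().sort_index()
    # Assumes one name per group; a second name would overwrite this entry.
    result[group] = {
        "name": name,
        "count": int(counts.sum()),
        "timestamp": [
            {"index": ts, "value": int(n)} for ts, n in counts.items()
        ],
    }

print([result])
```

Wrapping `result` in a list keeps the same outer shape as the output above.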
