0

I have a dataframe with population like below -

   RegionName     State  2000-01  2000-02  2000-03  2000-04 ...  2016-10  2016-11  2016-12
0  New York        NY      204      300      300      124    ...    456     566        344 
1  Mountain View   CA      204      300      300      124    ...    456     566        344 

There are nearly ~10K rows in the dataset. For this dataset, I want to add columns for average population for each quarter of years from 2000 to 2016.

I wrote a function to apply to the dataframe as below -

import numpy as np
def quarterize(row):
   quarter_to_months_map = {
      'q1' : ['01', '02', '03'],
      'q2' : ['04', '05', '06'],
      'q3' : ['07', '08', '09'],
      'q4' : ['10', '11', '12']
   }
   for year in range(2000, 2017):
      year = '{}'.format(year)
      for quarter in quarter_to_months_map.keys():
          values = []
          for month in quarter_to_months_map[quarter]:
             values.append(row['{}-{}'.format(year, month)])
          row['{}{}'.format(year, quarter)] = np.nanmean(values)
      return row

df = df.apply(quarterize, axis = 1)

This works fine but smaller datasets but the ~10K dataset, this would take ~10 min. Is there a way to make this more efficient and much faster?

1 Answer 1

1

Yes. Never operate on rows, operate on columns instead.

Something along the lines of:

import numpy as np
import pandas as pd
import random

df = pd.DataFrame([[random.randint(150, 300) for x in range(12 * 10)] for _ in range(1000)],
                 columns=['{}-{:02d}'.format(year, month) for month in range(1,13) for year in range(2000, 2010)])

quarter_to_months_map = {
      'q1' : ['01', '02', '03'],
      'q2' : ['04', '05', '06'],
      'q3' : ['07', '08', '09'],
      'q4' : ['10', '11', '12']
   }

for year in range(2000, 2010):
    for quarter, months in quarter_to_months_map.items():
        months = ['{}-{}'.format(year, month) for month in months]
        df['{}{}'.format(year, quarter)] = df[months].mean(axis=1)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.