I have a time series spanning a number of days, with a variable number of data points for each day. A sample dataframe is generated below:
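import datetime
import numpy as np
import pandas as pd

n = 10, 20
init = datetime.datetime(2016, 7, 24, 0, 0)
df = pd.DataFrame()
for i in range(n[0], n[1]):
    # day number i-10 (counted from init) gets i random points, one per minute
    s = init + datetime.timedelta(days=i - 10)
    df = pd.concat([df, pd.DataFrame(np.random.rand(i), index=pd.date_range(s, periods=i, freq='min'))])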
Given a dataframe like the one above, I want to create another dataframe (or ndarray) whose index is the dates from the dataframe above (not applicable for an ndarray) and whose rows hold the concatenated data of the previous 2 days. Since the rows will have different lengths, they can be padded with "NA" to make them equal.
I tried doing this:
g = df.groupby(pd.TimeGrouper('D'))
d = {k: v for k, v in g}
k=d.keys()
k.sort()
X=pd.DataFrame(index=k)
for i in np.arange(1,len(k)):
    X.ix[i]=pd.concat([d[k[i]],d[k[i-1]]]).ix[:,0]
But this doesn't work.
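To make the target concrete, here is a minimal sketch of the kind of result described above. It assumes that "previous 2 days" means the two dates immediately before each date in the data, and that the first two dates (which lack two predecessors) are simply left out. The helper name `previous_days_frame` and its `n_days` parameter are hypothetical, used only for illustration, and the small `sample` frame stands in for the random data generated earlier.

```python
import numpy as np
import pandas as pd


def previous_days_frame(df, n_days=2):
    """Hypothetical helper: one row per date, holding the concatenated
    values of the n_days dates that precede it, padded with NaN."""
    series = df.iloc[:, 0]
    # Group by calendar day; groupby yields the groups in sorted key order.
    chunks = [(day, grp.to_numpy())
              for day, grp in series.groupby(series.index.normalize())]

    rows = {}
    # Assumption: dates without n_days predecessors are skipped.
    for pos in range(n_days, len(chunks)):
        day = chunks[pos][0]
        previous = [values for _, values in chunks[pos - n_days:pos]]
        rows[day] = pd.Series(np.concatenate(previous))

    # A dict of Series of unequal length is aligned by position and
    # padded with NaN; transposing puts the dates on the index.
    return pd.DataFrame(rows).T


# Small deterministic sample: 3, 4, 5 and 6 points on four consecutive days.
parts = []
for offset, count in enumerate([3, 4, 5, 6]):
    start = pd.Timestamp("2016-07-24") + pd.Timedelta(days=offset)
    index = pd.date_range(start, periods=count, freq="min")
    parts.append(pd.DataFrame(np.arange(count, dtype=float), index=index))
sample = pd.concat(parts)

X = previous_days_frame(sample)
print(X)
```

Here `X` has one row for 2016-07-26 and one for 2016-07-27. The first row is shorter than the second, so its trailing columns are filled with NaN.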