
I have a DF like the following:

t1,t2,t3,...,t500, a1,a2,a3,...,a500,Label1
t1,t2,t3,...,t500, b1,b2,b3,...,b500,Label2
t1,t2,t3,...,t500, c1,c2,c3,...,c500,Label3
.
.
.
t1,t2,t3,...,t500, x1,x2,x3,...,x500,LabelX 

The time array (its values) is the same for all candidates (t1, t2, ..., tn).

How can I transform this DF into time series data? The first (easiest) step is to take the first half of the first row as the time column, since it is the same for all candidates. Then I can split the DF vertically down the middle, take the second (values) part, transpose it, and concatenate it with the transposed time column. But then how can I preserve the labels?

I expect the output to look something like:

t1,a1,b1,c1,...,x1
t2,a2,b2,c2,...,x2
.
.
.
tn,an,bn,cn,...,xn

But I am in the dark as to how to preserve the labels for each candidate's time series values when transforming them into time series data.

  • Can you add expected output from sample data, e.g. first 5 rows? Commented Mar 9, 2022 at 6:40
  • updated my question, jezreal Commented Mar 9, 2022 at 6:45

2 Answers

1

IIUC, select the value columns by position (in the sample data, from column position 4 up to but excluding the last column, which holds the label), rename them using the first row (the time values are the same in every row), and finally transpose:

print (df)
    0   1   2     3   4   5   6     7       8
0  t1  t2  t3  t500  a1  a2  a3  a500  Label1
1  t1  t2  t3  t500  b1  b2  b3  b500  Label2
2  t1  t2  t3  t500  c1  c2  c3  c500  Label3
3  t1  t2  t3  t500  x1  x2  x3  x500  LabelX



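# Move the labels into the index, then keep the value columns (position 4 onward).
# After set_index the label column is no longer among the columns, so slicing
# with 4:-1 here would drop the last value column (a500, b500, ...).
# Rename the value columns with the time values from the first row, then transpose.
df = df.set_index([8]).iloc[:, 4:].rename(columns=dict(zip(df.columns[4:-1], df.iloc[0]))).T
print (df)
8     Label1 Label2 Label3 LabelX
t1        a1     b1     c1     x1
t2        a2     b2     c2     x2
t3        a3     b3     c3     x3
t500    a500   b500   c500   x500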
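
For readers who want to try this without hard-coding the column positions, here is a minimal sketch of the same idea. It assumes the layout described in the question (first half of each row is the shared time axis, second half the values, last column the label); the names `wide`, `n_steps` and `ts` are made up for illustration, and the sample rows are a four-step miniature of the real 500-step data.

```python
import pandas as pd

# Hypothetical miniature of the question's layout: 4 time columns,
# 4 value columns, then 1 label column (the real data has 500 + 500 + 1).
rows = [
    ["t1", "t2", "t3", "t500", "a1", "a2", "a3", "a500", "Label1"],
    ["t1", "t2", "t3", "t500", "b1", "b2", "b3", "b500", "Label2"],
    ["t1", "t2", "t3", "t500", "c1", "c2", "c3", "c500", "Label3"],
    ["t1", "t2", "t3", "t500", "x1", "x2", "x3", "x500", "LabelX"],
]
wide = pd.DataFrame(rows)

# Everything except the label column splits evenly into times and values.
n_steps = (wide.shape[1] - 1) // 2

times = wide.iloc[0, :n_steps].to_numpy()     # shared time axis, from the first row
values = wide.iloc[:, n_steps:-1].to_numpy()  # one row of values per candidate
labels = wide.iloc[:, -1].to_numpy()          # one label per candidate

# Rows become time steps; columns become candidates, named by their label,
# so the label stays attached to each series.
ts = pd.DataFrame(values.T, index=pd.Index(times, name="time"), columns=labels)
print(ts)
```

Keeping the labels as column names means each series can be looked up by its class, e.g. `ts["Label2"]`, and `ts.columns` recovers the label list in the original row order.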

6 Comments

Can you please explain the operations a bit? What is the value in the first row's last column?
@AyanMitra - Is output correct in my solution?
Right, so how do we preserve the labels? I ask because the eventual goal is to use this DF for ML and then forecast the class (label) given a new time series. So I need to know the labels.
@AyanMitra - hmmm, if you need the labels, why are they not in the expected output? I guess you need the edited answer.
Right, but I did mention that I am in the dark as to where to fit in the label info. I have no idea.
0

If you are unsure where to fit the labels, notice that LabelX is effectively x501: it fits naturally at the end of the time series. So after the process you described in the question, add a time step t501 and treat the labels like any other data point. See the example:

t1,t2,t3,...,t500, a1,a2,a3,...,a500,Label1
t1,t2,t3,...,t500, b1,b2,b3,...,b500,Label2
t1,t2,t3,...,t500, c1,c2,c3,...,c500,Label3
.
.
.
t1,t2,t3,...,t500, x1,x2,x3,...,x500,LabelX 

will become

t1  ,a1,    b1,    c1,    ...,x1
t2  ,a2,    b2,    c2,    ...,x2
t3  ,a3,    b3,    c3,    ...,x3
.
.
t500,a500,  b500,  c500,  ...,x500
t501,Label1,Label2,Label3,...,LabelX 
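
A rough sketch of that last step in pandas, assuming the transposed frame already exists with one column per candidate. The column names `cand_a` and `cand_b`, the `labels` mapping, and the shortened time axis are all hypothetical stand-ins, not names from the question's data.

```python
import pandas as pd

# Hypothetical transposed frame: one row per time step, one column per candidate.
ts = pd.DataFrame(
    {
        "cand_a": ["a1", "a2", "a3", "a500"],
        "cand_b": ["b1", "b2", "b3", "b500"],
    },
    index=["t1", "t2", "t3", "t500"],
)

# Assumed mapping from candidate column to its class label.
labels = {"cand_a": "Label1", "cand_b": "Label2"}

# Append the labels as one extra "time step" row, in column order.
ts.loc["t501"] = [labels[col] for col in ts.columns]
print(ts)

# The label row can be split off again when needed.
features = ts.iloc[:-1]
targets = ts.iloc[-1]
```

One thing to keep in mind with this layout: if the real values are numeric and the labels are strings, adding the label row turns every column into `object` dtype, so the label row has to be split off (as in the last two lines) before doing arithmetic on the values.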
