
I have these values in a pandas DataFrame column:

col1

0.74
0.77
0.72
0.65
0.24
0.07
0.21
0.05
0.09

I want to create a new column in which each row holds a list of six consecutive values from col1, with the window shifted by one value per row.

This is the column that I want to get:

col2

[0.74,0.77,0.72,0.65,0.24,0.07]
[0.77,0.72,0.65,0.24,0.07,0.21]
[0.72,0.65,0.24,0.07,0.21,0.05]
[0.65,0.24,0.07,0.21,0.05,0.09]
[0.24,0.07,0.21,0.05,0.09,NaN]
[0.07,0.21,0.05,0.09,NaN,NaN]
  • When does it stop? Commented Nov 9, 2022 at 9:56
  • @Noah Until the last value of col1. Commented Nov 9, 2022 at 9:57

3 Answers

Using the data you provided, I came up with this solution:

import pandas as pd
import numpy as np

df = pd.DataFrame({'col1': [0.74,0.77,0.72,0.65,0.24,0.07,0.21,0.05,0.09]})
# Use an object column so that each cell can hold a list
df["col2"] = pd.Series([None] * len(df), dtype=object)
for i in range(len(df)):
    # Up to six values starting at row i
    lst = df["col1"].iloc[i:i+6].to_list()
    # Only if you need the list to be the same length: pad with NaN
    lst += [np.nan] * (6 - len(lst))
    df.at[i, 'col2'] = lst

I'm not sure whether a list comprehension would be a faster way of doing it.
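On the list comprehension question: a minimal sketch of that variant, assuming the same `col1` data and a window of six as in the question. The names `window`, `values` and `padded` are illustrative and not part of the original answer. Whether it is actually faster has not been measured here; it mainly avoids the per-cell assignment.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [0.74, 0.77, 0.72, 0.65, 0.24, 0.07, 0.21, 0.05, 0.09]})
window = 6  # assumed window size, taken from the question

values = df["col1"].to_list()
# Pad once at the end so that every slice has exactly `window` elements.
padded = values + [np.nan] * (window - 1)
# Assign a list of lists: one list per row of the DataFrame.
df["col2"] = [padded[i:i + window] for i in range(len(values))]
```

If only the first six rows from the question are wanted, slicing the result with `df.iloc[:6]` would be one way to get them.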


2 Comments

Thanks, but I am getting ValueError: Must have equal len keys and value when setting with an iterable
I have just checked the code; it works when copied directly into my IDE. Is everything inside the for loop?

I would use numpy's sliding_window_view:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv

window = 6
extra  = window-1

df['col2'] = swv(np.pad(df['col1'], (0, extra), constant_values=np.nan),
                 window).tolist()

output:

   col1                                  col2
0  0.74  [0.74, 0.77, 0.72, 0.65, 0.24, 0.07]
1  0.77  [0.77, 0.72, 0.65, 0.24, 0.07, 0.21]
2  0.72  [0.72, 0.65, 0.24, 0.07, 0.21, 0.05]
3  0.65  [0.65, 0.24, 0.07, 0.21, 0.05, 0.09]
4  0.24  [0.24, 0.07, 0.21, 0.05, 0.09,  nan]
5  0.07  [0.07, 0.21, 0.05, 0.09,  nan,  nan]
6  0.21  [0.21, 0.05, 0.09,  nan,  nan,  nan]
7  0.05  [0.05, 0.09,  nan,  nan,  nan,  nan]
8  0.09  [0.09,  nan,  nan,  nan,  nan,  nan]

8 Comments

Thanks, but I am getting ImportError: cannot import name 'sliding_window_view' from 'numpy.lib.stride_tricks' (/home/analytics/.local/lib/python3.8/site-packages/numpy/lib/stride_tricks.py)
@Rajan what is your numpy version?
numpy 1.21.6, Python 3.8.12
I don't think the location of this function has changed recently, I'm using numpy 1.23.3… can you try to update?
I updated NumPy, but it still throws the same error.
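Regarding the ImportError: `sliding_window_view` was added in NumPy 1.20, so one possible explanation (an assumption, not confirmed in this thread) is that the interpreter imports a different NumPy installation than the one that was updated. The sketch below prints the version and path that are actually in use, and shows a fallback that relies only on plain index arithmetic. `padded_windows` is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

# Confirm which NumPy installation the interpreter actually imports.
print(np.__version__, np.__file__)

def padded_windows(values, window):
    """Hypothetical helper: NaN-padded forward windows via integer indexing."""
    arr = np.asarray(values, dtype=float)
    padded = np.concatenate([arr, np.full(window - 1, np.nan)])
    # Row i of idx holds the positions i, i+1, ..., i+window-1.
    idx = np.arange(len(arr))[:, None] + np.arange(window)
    return padded[idx].tolist()

col1 = [0.74, 0.77, 0.72, 0.65, 0.24, 0.07, 0.21, 0.05, 0.09]
result = padded_windows(col1, 6)
```

With a DataFrame, this could be used as `df['col2'] = padded_windows(df['col1'], 6)`.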
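import pandas as pd

s = pd.Series([0.74, 0.77, 0.72, 0.65, 0.24, 0.07, 0.21, 0.05, 0.09])

arr = []
index = 6  # window length, also used here as the number of windows
for i in range(index):
    # Slices that run past the end of the data are shorter than 6
    arr.append(s.values[i:index+i])
print(arr)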

output:

[array([0.74, 0.77, 0.72, 0.65, 0.24, 0.07]),
 array([0.77, 0.72, 0.65, 0.24, 0.07, 0.21]),
 array([0.72, 0.65, 0.24, 0.07, 0.21, 0.05]),
 array([0.65, 0.24, 0.07, 0.21, 0.05, 0.09]),
 array([0.24, 0.07, 0.21, 0.05, 0.09]),
 array([0.07, 0.21, 0.05, 0.09])]
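The last two arrays in this output are shorter than six because the slices run past the end of the data. If NaN padding as in the question is needed, one possible variation is to use `Series.reindex`, which fills labels that are missing from the Series with NaN. This is a sketch that assumes the Series has the default integer index; `window` and `rows` are illustrative names.

```python
import pandas as pd

s = pd.Series([0.74, 0.77, 0.72, 0.65, 0.24, 0.07, 0.21, 0.05, 0.09])
window = 6  # assumed window size, taken from the question

# reindex() returns NaN for labels that do not exist in `s`,
# so windows that run past the end keep their full length.
rows = [s.reindex(range(i, i + window)).to_list() for i in range(len(s))]
```

Here `rows` has one list per element of `s`; `rows[:6]` would correspond to the six rows shown in the question.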
