
Here is a sample of the dataset (image not included).

Each value in the columns is a list of integers. The highlighted row is the sum of the lists in the corresponding column: the highlighted cell under 'day1' is the sum of all lists in the 'day1' column, and so on for the other columns. I tried sum() with an axis argument, but it doesn't seem to work for lists.
Once I have the summed lists, they have to be assigned to a new DataFrame with the same number of columns, as in the picture below.

(image not included)

Any hints on an algorithm, links, or other help are appreciated. Thanks.

  • Can you post a few rows of your data? Also, do your actual values have ... in them? Commented Jan 24, 2020 at 2:12
  • Oh, no. They don't have ... in them. Each actual list has length 50, so I just put ... in it. Every cell is a list, and all lists are the same size in every column. It looks like this:
      day1       day2       day3       .....  day90
      [1,1,1,0]  [1,1,1,0]  [1,1,1,0]         [1,1,1,0]
      [3,2,1,1]  [3,2,1,1]  [3,2,1,1]         [3,2,1,1]
      [2,1,1,1]  [2,1,1,1]  [2,1,1,1]         [2,1,1,1]
      .          .          .                 .
    Commented Jan 24, 2020 at 3:17

2 Answers


You can convert your DataFrame to a NumPy array, for example with df.to_numpy().

After that you will have something like the following (random data is used here for illustration):

import numpy as np
a = np.random.randint(5, size=(4, 2, 5))

Each block here is one of your columns:

array([[[2, 4, 1, 1, 1],
        [4, 0, 1, 4, 0]],

       [[1, 2, 4, 4, 3],
        [0, 1, 4, 4, 0]],

       [[0, 0, 0, 0, 2],
        [3, 0, 4, 2, 2]],

       [[2, 0, 3, 1, 0],
        [1, 1, 3, 3, 1]]])

Then sum along axis 1, which adds up the lists inside each block:

np.sum(a, axis=1)

yields:

array([[6, 4, 2, 5, 1],
       [1, 3, 8, 8, 3],
       [3, 0, 4, 2, 4],
       [3, 1, 6, 4, 1]])

Prepare the data for the new DataFrame:

dd = {f'Day{n}':np.array2string(i, separator=',')
      for n,i in enumerate(list(np.sum(a, axis=1)), start=1)}

Create the DataFrame:

import pandas as pd
df = pd.DataFrame(list(dd.values()), index=dd.keys()).T

yields:

          Day1         Day2         Day3         Day4
0  [6,4,2,5,1]  [1,3,8,8,3]  [3,0,4,2,4]  [3,1,6,4,1]
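
A minimal sketch of the same idea applied directly to a DataFrame whose cells are Python lists, assuming every list has the same length (as stated in the comments on the question). The sample frame and the names arr, totals and result are made up for illustration. Note that df.to_numpy() on list cells returns a 2-D object array, so the sketch first rebuilds a numeric 3-D array from the nested lists, and it keeps the results as lists rather than strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the rows given in the comments (lists shortened to 4 items).
df = pd.DataFrame({
    "day1": [[1, 1, 1, 0], [3, 2, 1, 1], [2, 1, 1, 1]],
    "day2": [[1, 1, 1, 0], [3, 2, 1, 1], [2, 1, 1, 1]],
    "day3": [[1, 1, 1, 0], [3, 2, 1, 1], [2, 1, 1, 1]],
})

# Nested lists -> numeric array of shape (rows, columns, list_length).
# This only works when all lists have the same length.
arr = np.array(df.to_numpy().tolist())

# Summing over axis 0 adds the lists row by row, leaving one summed list per column.
totals = arr.sum(axis=0)  # shape: (columns, list_length)

# Build a one-row DataFrame whose cells are plain Python lists again.
result = pd.DataFrame(
    {col: [totals[i].tolist()] for i, col in enumerate(df.columns)}
)
print(result)
```

Here the axis is 0 rather than 1 because the array is laid out as (rows, columns, list_length) instead of one block per column.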

1 Comment

Thanks for the solution. I used another approach to solve it, but this solution gave me the idea. Thanks.

You can get all the values with values.tolist(), convert them to int, and sum them as follows. I tried this on a sample and it produced the result shown in the image.

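import pandas as pd

# Assumes each cell in data.csv is a comma-separated string of integers.
df = pd.read_csv("data.csv")
dl = df.values.tolist()

for i, column in enumerate(df):
    # Sum the integers in cell i of every row, then total them for the column.
    ilist = [sum(int(s) for s in l[i].split(',')) for l in dl]
    print(column, " - ", sum(ilist))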

Sample code as tried (image not included).
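
The loop above prints a single total per column. If a summed list per column is wanted instead, the same parsing can be combined with zip, as in this sketch. It assumes each cell is a string such as "1,1,1,0"; the columns dict and the elementwise_sum helper are hypothetical names used only for illustration (a dict of this shape is what df.to_dict("list") returns).

```python
# Hypothetical input: column name -> list of cell strings, one per row.
columns = {
    "day1": ["1,1,1,0", "3,2,1,1", "2,1,1,1"],
    "day2": ["1,1,1,0", "3,2,1,1", "2,1,1,1"],
}


def elementwise_sum(cells):
    """Parse each cell into a list of ints and add the lists position by position."""
    rows = [[int(s) for s in cell.split(",")] for cell in cells]
    # zip(*rows) groups the first items together, then the second items, and so on.
    return [sum(position) for position in zip(*rows)]


sums = {name: elementwise_sum(cells) for name, cells in columns.items()}
print(sums)
```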
