I have a data frame...
A B C D E F
0 2018-02-01 2 3 4 5 6
1 2018-02-02 6 7 8 4 2
2 2018-02-03 3 4 5 6 7
...which I convert to a numpy array...
[['2018-02-01' 2 3 4 5 6]
['2018-02-02' 6 7 8 4 2]
['2018-02-03' 3 4 5 6 7]]
What I would like to do is the following:
- Store only columns A, B, and C in the numpy array, rather than all the columns.
- I would like to loop over the first column, then the second and the third one. How can I achieve that?
My code is as follows:
import pandas as pd

df = pd.DataFrame([
    ['2018-02-01', 1, 3, 6, 102, 8],
    ['2018-02-01', 2, 3, 4, 5, 6],
    ['2018-02-02', 6, 7, 8, 4, 2],
    ['2018-02-03', 3, 4, 5, 6, 7]
], columns=['A', 'B', 'C', 'D', 'E', 'F'])
print(df)

# --> Here I only want to save columns A, B and C
nparray = df.values  # as_matrix() was removed in pandas 1.0
print(nparray)

# --> Loop through the columns; column A should be looped over first
for i in nparray:
    print(i)

# Use the values in columns B and C for that loop (func is my calculation)
calc = [func(B, C)
        for B, C in zip(nparray[:, 1], nparray[:, 2])]
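A minimal sketch of how the two goals above might be approached. It assumes that selecting the columns before conversion is acceptable and that looping over columns means iterating over the transposed array. The body of func is a hypothetical stand-in (plain addition), since the real calculation is not shown.

```python
import pandas as pd

df = pd.DataFrame([
    ['2018-02-01', 1, 3, 6, 102, 8],
    ['2018-02-01', 2, 3, 4, 5, 6],
    ['2018-02-02', 6, 7, 8, 4, 2],
    ['2018-02-03', 3, 4, 5, 6, 7],
], columns=['A', 'B', 'C', 'D', 'E', 'F'])

# Keep only columns A, B and C before converting to a NumPy array.
# Because strings and integers are mixed, the array has dtype object.
nparray = df[['A', 'B', 'C']].to_numpy()

# Iterating over the transposed array yields one column at a time:
# first A, then B, then C.
for column in nparray.T:
    print(column)


def func(b, c):
    # Hypothetical placeholder for the real calculation.
    return b + c


# Pair up the values of columns B and C row by row.
calc = [func(b, c) for b, c in zip(nparray[:, 1], nparray[:, 2])]
print(calc)  # [4, 5, 13, 7]
```

Iterating over nparray directly walks the rows, which is why the transpose is used for a column-wise loop.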
Update: I made a numerical example.
A B C D E F
0 2018-02-01 1 3 6 102 8
1 2018-02-01 2 3 4 5 6
2 2018-02-02 6 7 8 4 2
3 2018-02-03 3 4 5 6 7
Dummy code looks like the following (it is more of a nested loop):
loop over date 2018-02-01:
    calc = func(Column B + Column C) = 1 + 3 = 4
    next row is the same date, so:
    calc += func(Column B + Column C) = 4 + 2 + 3 = 9
    for date 2018-02-01 the result is 9 and can be stored e.g. in a csv file
loop over date 2018-02-02:
    calc = func(Column B + Column C) = 6 + 7 = 13
    for date 2018-02-02 the result is 13 and can be stored e.g. in a csv file
loop over date 2018-02-03:
    calc = func(Column B + Column C) = 3 + 4 = 7
    for date 2018-02-03 the result is 7 and can be stored e.g. in a csv file
etc.
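The per-date accumulation in the dummy code can probably be expressed without an explicit nested loop. The sketch below assumes func is plain addition of columns B and C, as in the numerical example, and uses a pandas groupby on column A. The output file name calc_per_date.csv and the column name calc are hypothetical.

```python
import pandas as pd

df = pd.DataFrame([
    ['2018-02-01', 1, 3, 6, 102, 8],
    ['2018-02-01', 2, 3, 4, 5, 6],
    ['2018-02-02', 6, 7, 8, 4, 2],
    ['2018-02-03', 3, 4, 5, 6, 7],
], columns=['A', 'B', 'C', 'D', 'E', 'F'])

# Row-wise B + C, then summed over all rows that share the same date in A.
per_date = (df['B'] + df['C']).groupby(df['A']).sum()
print(per_date)

# Store one result per date in a CSV file (hypothetical file name).
per_date.rename('calc').to_csv('calc_per_date.csv', header=True)
```

For the example data this gives 9 for 2018-02-01, 13 for 2018-02-02 and 7 for 2018-02-03, matching the dummy code. A more complex func would need a different aggregation, for instance via groupby(...).apply.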
df['A'].values etc. will give you the relevant numpy array of that column.

[['2018-02-01' 2 3 4 5 6] ... will never be a proper NumPy array, or all elements will be objects: you can't mix strings and integers. You can use a structured array instead, depending on how you want to use it.

numpy is great if your data is a single dtype. You should probably use df[['B', 'C', 'D', 'E', 'F']].values to only get the numeric component. Since you are learning, also check the type of your array via x.dtype. As an example, you may wish to upcast to int64 or downcast to int8.
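The points in these comments can be checked with a short sketch. It assumes the example frame from the question. The variable names mixed, numeric, small and records are illustrative only, and to_records is just one way to get a structured (record) array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    ['2018-02-01', 1, 3, 6, 102, 8],
    ['2018-02-01', 2, 3, 4, 5, 6],
    ['2018-02-02', 6, 7, 8, 4, 2],
    ['2018-02-03', 3, 4, 5, 6, 7],
], columns=['A', 'B', 'C', 'D', 'E', 'F'])

# Mixing the date strings with integers forces dtype object.
mixed = df.to_numpy()
print(mixed.dtype)

# Selecting only the numeric columns gives a single integer dtype.
numeric = df[['B', 'C', 'D', 'E', 'F']].to_numpy()
print(numeric.dtype)

# Downcast to int8; safe here because every value fits in -128..127.
small = numeric.astype(np.int8)
print(small.dtype)

# A record array keeps a separate dtype per column.
records = df[['A', 'B', 'C']].to_records(index=False)
print(records.dtype)
print(records['B'])
```

Whether the object array, the numeric-only array or the record array is the better fit depends on how the data is used afterwards.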