Let's say I have an array such as this:
a = np.array([[1, 2, 3, 4, 5, 6, 7], [20, 25, 30, 35, 40, 45, 50], [2, 4, 6, 8, 10, 12, 14]])
and a dataframe such as this:
   num letter
0    1      a
1    2      b
2    3      c
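For anyone wanting to reproduce this, a minimal setup might look like the sketch below. The DataFrame constructor call is an assumption on my part, since only the printed form of the df is shown above; the array is copied as-is.

```python
import numpy as np
import pandas as pd

# The array from above, one sequence per row
a = np.array([[1, 2, 3, 4, 5, 6, 7],
              [20, 25, 30, 35, 40, 45, 50],
              [2, 4, 6, 8, 10, 12, 14]])

# Assumed construction of the dataframe (only its printed form is given)
df = pd.DataFrame({"num": [1, 2, 3], "letter": ["a", "b", "c"]})
print(df)
```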
What I would then like to do is calculate the difference between the first and last number in each sequence in the array, and ultimately add these differences as a new column in the df.
Currently I am able to calculate the desired difference in each sequence in this manner:
for i in a:
    print(i[-1] - i[0])
This gives me the following results:
6
30
12
What I would expect to be able to do is replace the print with df['new_col'], like so:
df['new_col'] = (i[-1] - i[0])
And for my df to then look like this:
   num letter  new_col
0    1      a        6
1    2      b       30
2    3      c       12
However, I end up getting this:
   num letter  new_col
0    1      a       12
1    2      b       12
2    3      c       12
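My best guess at what is happening: assigning a single number to df['new_col'] fills every row with that number, so each pass through the loop overwrites the whole column and only the last sequence's difference (12) survives. The sketch below reproduces that and then tries one possible alternative, column slicing on the array, which computes all the differences at once. The df construction and the name `looped` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

a = np.array([[1, 2, 3, 4, 5, 6, 7],
              [20, 25, 30, 35, 40, 45, 50],
              [2, 4, 6, 8, 10, 12, 14]])
df = pd.DataFrame({"num": [1, 2, 3], "letter": ["a", "b", "c"]})  # assumed construction

# Loop version: a scalar is assigned to the whole column on every pass,
# so the column ends up holding only the last difference.
for i in a:
    df["new_col"] = i[-1] - i[0]
looped = df["new_col"].tolist()

# Possible alternative: last column minus first column gives one
# difference per row, which lines up with the rows of the df.
df["new_col"] = a[:, -1] - a[:, 0]
print(df)
```

This relies on the array having exactly one row per row of the df, in the same order.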
I would also really appreciate it if anyone could tell me what the NumPy equivalents of .diff() and .shift() are. I tried calling them on the array the same way you would on a pandas dataframe, but just got error messages. This would be useful if I want to calculate the difference not just between the first and last numbers but between numbers somewhere in between.
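From what I can tell, np.diff is a function rather than an array method, and there does not seem to be a direct counterpart to .shift(), although slicing into a NaN-filled array appears to give the same effect. A rough sketch under those assumptions follows; the names `step_diff`, `shifted`, `start` and `stop` are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6, 7],
              [20, 25, 30, 35, 40, 45, 50],
              [2, 4, 6, 8, 10, 12, 14]])

# Consecutive differences along each row (one fewer column than a)
step_diff = np.diff(a, axis=1)

# Emulating a shift by one position to the right within each row:
# start from an all-NaN float array and copy the values over by slicing.
shifted = np.full(a.shape, np.nan)
shifted[:, 1:] = a[:, :-1]

# Difference between two arbitrary positions in each sequence
start, stop = 1, 4  # hypothetical positions
partial = a[:, stop] - a[:, start]

print(step_diff)
print(shifted)
print(partial)
```

Note that np.roll also moves values along an axis, but it wraps them around to the other end instead of filling with NaN, so it is not quite the same as .shift().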
Any help would be really appreciated, cheers.