1

I have two "vectors" of numbers and would like to subtract one from the other. My problem is that one of them is a 2-D NumPy array with a single column (shape (n, 1)):

array([[ 796.24475 ],
      [ -17.138123],
      [ 164.9989  ],
      ...,
      [-469.85388 ],
      [-762.1892  ],
      [-451.34702 ]], dtype=float32)

whereas the other one is a column of a pandas data frame:

0       831.871558
21       26.070256
25      199.351116
28      861.052529
35      113.232070
           ...    
9440   -163.200046
9448   -893.619023
9449   -439.174531
9451   -795.033901
9461   -413.469417
Name: electricity, Length: 1895, dtype: float64 

They both have the same number of data points, and the error I get when I try to subtract one from the other is the following:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Projects\test\testvenv\lib\site-packages\pandas\core\series.py", line 636, in __array_ufunc__
    self, ufunc, method, *inputs, **kwargs
  File "pandas\_libs\ops_dispatch.pyx", line 91, in pandas._libs.ops_dispatch.maybe_dispatch_ufunc_to_dunder_op
  File "C:\Projects\test\testvenv\lib\site-packages\pandas\core\ops\common.py", line 64, in new_method
    return method(self, other)
  File "C:\Projects\test\testvenv\lib\site-packages\pandas\core\ops\__init__.py", line 502, in wrapper
    return _construct_result(left, result, index=left.index, name=res_name)
  File "C:\Projects\test\testvenv\lib\site-packages\pandas\core\ops\__init__.py", line 475, in _construct_result
    out = left._constructor(result, index=index)
  File "C:\Projects\test\testvenv\lib\site-packages\pandas\core\series.py", line 305, in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
  File "C:\Projects\test\testvenv\lib\site-packages\pandas\core\construction.py", line 482, in sanitize_array
    raise Exception("Data must be 1-dimensional")
Exception: Data must be 1-dimensional

All help is appreciated, thanks in advance!

2 Answers

2

I think the simplest approach is to select the first column as a 1-D array (the DataFrame and the array must have the same length):

df['electricity'] - arr[:, 0]

Another idea (thank you, @timgeb) is to use numpy.squeeze:

df['electricity'] - arr.squeeze()
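
A minimal sketch of why the shapes matter, using a made-up three-row frame (the values and the names `df` and `arr` are illustrative, not the asker's real data). With plain NumPy, a length-n vector minus an (n, 1) array broadcasts to an (n, n) matrix, which is presumably what pandas then refuses to wrap in a Series; flattening the array first gives an element-wise result that keeps the original index.

```python
import numpy as np
import pandas as pd

# Illustrative stand-ins for the question's data (assumed values).
df = pd.DataFrame({"electricity": [831.87, 26.07, 199.35]}, index=[0, 21, 25])
arr = np.array([[796.24], [-17.14], [164.99]], dtype=np.float32)

# arr is 2-D with one column, so NumPy broadcasting produces a full matrix.
broadcast = df["electricity"].to_numpy() - arr
print(arr.shape, broadcast.shape)

# Selecting the single column yields a 1-D array; the subtraction is then
# positional and the result keeps the index of df["electricity"].
diff = df["electricity"] - arr[:, 0]

# squeeze() drops the length-1 axis and gives the same 1-D array here.
diff_alt = df["electricity"] - arr.squeeze()

print(diff)
```

Note that `arr[:, 0]` always returns a 1-D array, whereas `arr.squeeze()` removes every length-1 axis, so for a (1, 1) array it would return a 0-d value instead.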

3 Comments

Alternatively subtract arr.squeeze().
I think DataFrame.squeeze is more like numpy.ndarray.item.
@timgeb - I think it works similarly: a one-column DataFrame is converted to a Series.
1

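I would suggest converting your array to a pandas Series. If it is a NumPy array, flatten it to one dimension first and reuse the DataFrame's index so that the values line up row by row:

series = pd.Series(np_array.ravel(), index=df.index)

(Without index=df.index the Series gets a default 0..n-1 index, and the subtraction would align on index labels rather than on position.)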

Then you can subtract this series from the column in pandas dataframe as below:

df['col_name'] - series
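
One caveat with the Series route is index alignment: pandas matches Series values by label, not by position. The sketch below uses made-up data (the names `df` and `arr` and all values are assumptions for illustration) with a non-contiguous index like the one in the question, and compares a Series built with the default index against one that reuses the DataFrame's index.

```python
import numpy as np
import pandas as pd

# Illustrative data with a non-contiguous index (assumed values).
df = pd.DataFrame({"electricity": [831.87, 26.07, 199.35]}, index=[0, 21, 25])
arr = np.array([[796.24], [-17.14], [164.99]], dtype=np.float32)

# Default RangeIndex (0, 1, 2): only label 0 exists on both sides, so the
# result spans the union of both indexes and is NaN everywhere else.
misaligned = df["electricity"] - pd.Series(arr.ravel())

# Reusing df.index pairs the values up row by row.
aligned = df["electricity"] - pd.Series(arr.ravel(), index=df.index)

print(misaligned)
print(aligned)
```

If the DataFrame had a plain 0..n-1 index, both variants would agree; the difference only shows up when rows have been filtered or reindexed.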
