First of all, your design and code style are really hard to read; you should think about simplifying them.
Your problem occurs because you are mixing strings and arrays of different lengths inside nested np.where calls. The documentation says:
numpy.where(condition[, x, y])
Return elements chosen from x or y depending on condition.
Parameters:
condition : array_like, bool
Where True, yield x, otherwise yield y.
x, y : array_like
Values from which to choose. x, y and condition need to be broadcastable to some shape.
Returns:
out : ndarray
An array with elements from x where condition is True, and elements from y elsewhere.
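As a quick illustration of the signature quoted above, here is a minimal sketch. The array `values` and the labels "yes"/"no" are made-up sample data, not taken from the question.

```python
import numpy as np

# Hypothetical sample data, not from the question.
values = np.array([1, 0, 1, 0])

# The condition has shape (4,); the scalar strings broadcast against it.
labels = np.where(values == 1, "yes", "no")
print(labels)  # ['yes' 'no' 'yes' 'no']
```

A scalar for x or y is never the problem; it is stretched to whatever shape the condition has.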
As you can see, x, y and condition need to be broadcastable to a common shape. Here is what the documentation says about broadcasting:
6.4. Broadcasting
Another powerful feature of Numpy is broadcasting. Broadcasting takes
place when you perform operations between arrays of different shapes.
For instance
>>> a = np.array([
...     [0, 1],
...     [2, 3],
...     [4, 5],
... ])
>>> b = np.array([10, 100])
>>> a * b
array([[  0, 100],
       [ 20, 300],
       [ 40, 500]])
The shapes of a and b don’t match. In order to proceed, Numpy will
stretch b into a second dimension, as if it were stacked three times
upon itself. The operation then takes place element-wise.
One of the rules of broadcasting is that only dimensions of size 1 can
be stretched (if an array only has one dimension, all other dimensions
are considered for broadcasting purposes to have size 1). In the
example above b is 1D, and has shape (2,). For broadcasting with a,
which has two dimensions, Numpy adds another dimension of size 1 to b.
b now has shape (1, 2). This new dimension can now be stretched three
times so that b’s shape matches a’s shape of (3, 2).
The other rule is that dimensions are compared from the last to the
first. Any dimensions that do not match must be stretched to become
equally sized. However, according to the previous rule, only
dimensions of size 1 can stretch. This means that some shapes cannot
broadcast and Numpy will give you an error:
>>> c = np.array([
...     [0, 1, 2],
...     [3, 4, 5],
... ])
>>> b = np.array([10, 100])
>>> c * b
ValueError: operands could not be broadcast together with shapes (2,3) (2,)
What happens here is that Numpy, again, adds a dimension to b, making
it of shape (1, 2). The sizes of the last dimensions of b and c (2 and
3, respectively) are then compared and found to differ. Since none of
these dimensions is of size 1 (therefore, unstretchable) Numpy gives
up and produces an error.
The solution to multiplying c and b above is to specifically tell
Numpy that it must add that extra dimension as the second dimension of
b. This is done by using None to index that second dimension. The
shape of b then becomes (2, 1), which is compatible for broadcasting
with c:
>>> c = np.array([
...     [0, 1, 2],
...     [3, 4, 5],
... ])
>>> b = np.array([10, 100])
>>> c * b[:, None]
array([[  0,  10,  20],
       [300, 400, 500]])
A good visual description of these rules, together with some advanced
broadcasting applications can be found in this tutorial of Numpy broadcasting rules.
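If you want to check whether two shapes are compatible before running an operation, one possible sketch is to ask NumPy for the broadcast shape directly. The helper name `broadcast_shape_or_none` is hypothetical and only used for illustration; `np.broadcast` itself is a regular NumPy class.

```python
import numpy as np

c = np.array([[0, 1, 2], [3, 4, 5]])  # shape (2, 3)
b = np.array([10, 100])               # shape (2,)

def broadcast_shape_or_none(*arrays):
    """Return the common broadcast shape, or None if the shapes are incompatible."""
    try:
        return np.broadcast(*arrays).shape
    except ValueError:
        return None

print(broadcast_shape_or_none(c, b))           # None: (2, 3) vs (2,) cannot broadcast
print(broadcast_shape_or_none(c, b[:, None]))  # (2, 3): (2, 1) stretches along the last axis
```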
So the problem is that you are trying to broadcast a shape-(n,) array (the first where condition) against a scalar (the first string), a shape-(m,) array (the second where), another scalar (the second string), a shape-(k,) array (the third where), and so on. The scalars broadcast without trouble, but n, m and k can and will differ. Since none of those dimensions has size 1, nothing can be stretched to a common shape and the broadcasting fails.
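The failure can be reproduced with a small sketch. The arrays `sr`, `wr` and `pr` are hypothetical stand-ins for the three columns in the question, and their lengths differ on purpose. Slicing to a common length at the end is only there to show that equal lengths work; with real data you would more likely take all columns from one DataFrame or align them first.

```python
import numpy as np

# Hypothetical stand-ins for the three columns, with different lengths.
sr = np.array([1, 0, 0])        # shape (3,)
wr = np.array([0, 1, 0, 1])     # shape (4,)
pr = np.array([0, 0, 1, 0, 1])  # shape (5,)

try:
    np.where(sr == 1, "a", np.where(wr == 1, "b", np.where(pr == 1, "c", "d")))
    failed = False
except ValueError as exc:
    # The (4,) condition cannot be broadcast against the (5,) inner result.
    failed = True
    print("broadcast error:", exc)

# Once every condition has the same length, the nested call works.
wr3, pr3 = wr[:3], pr[:3]
result = np.where(sr == 1, "a", np.where(wr3 == 1, "b", np.where(pr3 == 1, "c", "d")))
print(result)  # ['a' 'b' 'c']
```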
df5['RST'] = np.where(new_df['SR'] == 1, 'a', np.where(df5['WR'] == 1, 'b', np.where(df['PR'] == 1, 'c', 'd'))), right? Just to make it a bit more readable.