I'm pretty new to NumPy and I'm trying to vectorize a simple for loop for performance reasons, but I can't come up with a solution. I have a NumPy array of unique words, and for each of these words I need the number of times it occurs in another NumPy array, called array_to_compare. Each count is stored in a third NumPy array, which has the same shape as the unique-words array. Here is the code containing the for loop:
import numpy as np
unique_words = np.array(['a', 'b', 'c', 'd'])
array_to_compare = np.array(['a', 'b', 'a', 'd'])
vector_array = np.zeros(len(unique_words))
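for word in np.nditer(unique_words):
    # count how often the current word occurs in array_to_compare
    counter = np.count_nonzero(array_to_compare == word)
    # store the count at the position of that word in unique_words
    vector_array[np.where(unique_words == word)] = counter
print(vector_array)  # [2. 1. 0. 1.] (the desired output)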
I tried it with np.where and np.isin, but did not get the desired result. I am thankful for any help!
While the duplicate does suggest using Counter or unique, it doesn't return the result in the desired array form. The answers provided here do that. I'm reopening it. stackoverflow.com/questions/49630204/…
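For reference, here is a minimal sketch of one way the loop in the question could be vectorized. It uses plain broadcasting rather than Counter or np.unique, and it is not necessarily what the answers in this thread propose. The function name count_occurrences is made up for illustration. The idea is to compare every unique word against every element of array_to_compare in one step, then sum the matches per row.

```python
import numpy as np

def count_occurrences(unique_words, array_to_compare):
    # Hypothetical helper name, chosen only for this sketch.
    # unique_words[:, None] has shape (n, 1); array_to_compare[None, :] has
    # shape (1, m). Broadcasting the comparison yields an (n, m) boolean
    # matrix where entry (i, j) is True when word i equals element j.
    matches = unique_words[:, None] == array_to_compare[None, :]
    # Summing each row counts the matches per unique word; cast to float
    # to mirror the np.zeros-based array used in the question.
    return matches.sum(axis=1).astype(float)

unique_words = np.array(['a', 'b', 'c', 'd'])
array_to_compare = np.array(['a', 'b', 'a', 'd'])

vector_array = count_occurrences(unique_words, array_to_compare)
print(vector_array)  # [2. 1. 0. 1.]
```

Note that the intermediate boolean matrix has len(unique_words) * len(array_to_compare) entries, so this sketch trades memory for speed and may not suit very large inputs.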