I am working on a pandas DataFrame that needs a new column showing the count of a specific value in a specific column.
I tried various combinations of groupby and pivot, but could not apply them to the whole DataFrame without errors.
import pandas as pd

df = pd.DataFrame([
    ['a', 'z'],
    ['a', 'x'],
    ['a', 'y'],
    ['b', 'v'],
    ['b', 'x'],
    ['b', 'v']],
    columns=['col1', 'col2'])
I need to add col3, which counts the 'v' values in col2 for each value in col1. There is no 'v' in col2 for 'a' in col1, so col3 is 0 in every 'a' row. For 'b' the expected count is 2, and it should appear in every 'b' row, including the row where col2 is 'x' rather than 'v'.
Expected output:
['a', 'z', 0]
['a', 'x', 0]
['a', 'y', 0]
['b', 'v', 2]
['b', 'x', 2]
['b', 'v', 2]
I'm looking for a clean, pandas-specific solution, because the original DataFrame is quite big and things like row iteration are too time-expensive.
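For reference, here is a minimal sketch of one vectorized approach that seems to produce the expected output on the sample data: build a boolean mask for col2 == 'v', then sum that mask per col1 group with groupby(...).transform('sum') so the count is broadcast back to every row. The name is_v is only illustrative, and I have not measured how this performs on the real DataFrame.

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 'z'], ['a', 'x'], ['a', 'y'], ['b', 'v'], ['b', 'x'], ['b', 'v']],
    columns=['col1', 'col2'],
)

# Boolean mask: True where col2 equals 'v' (is_v is an illustrative name)
is_v = df['col2'].eq('v')

# Sum the mask within each col1 group; transform keeps the original row
# alignment, so every row receives its group's count
df['col3'] = is_v.groupby(df['col1']).transform('sum').astype(int)

# col3 is 0 for the 'a' rows and 2 for the 'b' rows
print(df)
```

Because transform returns a result aligned to the original index, no merge or row iteration is needed.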