I have two dataframes. Each row in Dataframe A is a package of products, and Dataframe B consists of product ids and their sellers' ids.
Dataframe A:
package_name | product_1 | product_2 | product_3 | product_4
package a | 12 | 15 | NaN | NaN
package b | 17 | 16 | 14 | NaN
package c | 12 | 11 | 17 | 19
Dataframe B:
product_id | seller_id
12 | seller1
15 | seller1
12 | seller2
15 | seller2
17 | seller3
16 | seller3
14 | seller3
(Each product can have multiple sellers, and each seller has multiple products.)
I want to know which sellers carry all the products of each package (based on Dataframe A). This is the expected result:
Dataframe C:
package_name | product_1 | product_2 | product_3 | product_4 | seller_id
package a | 12 | 15 | NaN | NaN | seller1
package a | 12 | 15 | NaN | NaN | seller2
package b | 17 | 16 | 14 | NaN | seller3
Both seller1 and seller2 have "all" products of package a, and seller3 has "all" products of package b. Package c is left out because no seller has all of its products.
How can I achieve Dataframe C?
Are the products within a package unique? Or can they repeat, e.g. could the first row be package a | 12 | 15 | 12 | NaN?
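One possible way to get Dataframe C, as a rough sketch with pandas. The names df_a, df_b, df_c, long_a, needed, matched and full are my own, not from the question. It assumes the product columns are exactly product_1 to product_4 and that a seller qualifies only if it has every non-NaN product of a package. Products are counted as distinct values, so a repeated product within a row should not change the result.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (frame names are assumptions)
df_a = pd.DataFrame({
    "package_name": ["package a", "package b", "package c"],
    "product_1": [12, 17, 12],
    "product_2": [15, 16, 11],
    "product_3": [np.nan, 14, 17],
    "product_4": [np.nan, np.nan, 19],
})
df_b = pd.DataFrame({
    "product_id": [12, 15, 12, 15, 17, 16, 14],
    "seller_id": ["seller1", "seller1", "seller2", "seller2",
                  "seller3", "seller3", "seller3"],
})

product_cols = ["product_1", "product_2", "product_3", "product_4"]

# Long format: one row per (package, product); empty slots and repeats dropped
long_a = (
    df_a.melt(id_vars="package_name", value_vars=product_cols,
              value_name="product_id")
        .dropna(subset=["product_id"])
        .astype({"product_id": "int64"})
        .drop_duplicates(["package_name", "product_id"])
)

# How many distinct products each package requires
needed = long_a.groupby("package_name")["product_id"].nunique().rename("needed")

# How many of a package's products each seller actually offers
matched = (
    long_a.merge(df_b.drop_duplicates(), on="product_id")
          .groupby(["package_name", "seller_id"])["product_id"].nunique()
          .rename("have")
          .reset_index()
          .join(needed, on="package_name")
)

# Keep only (package, seller) pairs where the seller covers the whole package
full = matched.loc[matched["have"] == matched["needed"],
                   ["package_name", "seller_id"]]

# Attach the qualifying sellers back to the original wide rows
df_c = (
    df_a.merge(full, on="package_name")
        .sort_values(["package_name", "seller_id"])
        .reset_index(drop=True)
)
print(df_c)
```

The inner merge at the end drops packages that no single seller fully covers; use how="left" there if those packages should stay with a NaN seller_id.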