I have one dataframe called products that looks like this:
   order_number  sku  units  revenue
1          5000  754      1     20.0
2          5000  900      4     30.0
3          5001  754      2     40.0
4          5002  754     10    200.0
.           ...  ...     ..      ...
and another called orders that looks like this:
    date  order_number  units  revenue  country  new_customer  ...
1  1-jan          5000      5     50.0   russia           yes
2  1-jan          5001      2     40.0    china           yes
3  2-jan          5002     10    200.0   france            no
4  2-jan          5003      1     70.0   brazil           yes
.   ....           ...     ..      ...      ...
I would like to create a single dataframe, which has the rows from the products dataframe but additionally has the columns from the orders dataframe, where the order number in orders matches the order number in products.
I've tried to find a way to express this via both pandas.concat and pandas.merge, but I can't get around the problem that the key I'm joining on (order_number) is unique in the orders dataframe but not in the products dataframe.
How do I do a many-to-one join like this in pandas?
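A plain merge on the shared key handles this; the key does not need to be unique in both dataframes:
products.merge(orders, on='order_number')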
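For illustration, here is a minimal sketch of that merge on the sample rows from the question. The `how`, `validate`, and `suffixes` arguments are optional additions rather than part of the one-liner above, and the suffix names `_product` and `_order` are made up for this example. `how="left"` keeps every row of products even when no order matches (the default inner join would drop such rows). `validate="many_to_one"` makes pandas raise an error if order_number turns out not to be unique in orders. The suffixes disambiguate units and revenue, which appear in both frames.

```python
import pandas as pd

# Sample data copied from the rows shown in the question.
products = pd.DataFrame({
    "order_number": [5000, 5000, 5001, 5002],
    "sku": [754, 900, 754, 754],
    "units": [1, 4, 2, 10],
    "revenue": [20.0, 30.0, 40.0, 200.0],
})
orders = pd.DataFrame({
    "date": ["1-jan", "1-jan", "2-jan", "2-jan"],
    "order_number": [5000, 5001, 5002, 5003],
    "units": [5, 2, 10, 1],
    "revenue": [50.0, 40.0, 200.0, 70.0],
    "country": ["russia", "china", "france", "brazil"],
    "new_customer": ["yes", "yes", "no", "yes"],
})

merged = products.merge(
    orders,
    on="order_number",
    how="left",                       # keep every row of products
    validate="many_to_one",           # error out if order_number repeats in orders
    suffixes=("_product", "_order"),  # hypothetical names for the overlapping columns
)

print(merged)
```

The result has one row per product line, with the order-level columns repeated for every product that belongs to the same order. Order 5003 has no product rows in the sample, so it does not appear in the output.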