I'm stuck with this issue:

I want to replace every value in one column of a CSV file with the corresponding id.

I have vehicle names and ids in the database:

[image: database table with vehicle names and ids]

In the CSV file, this column looks like this:

[image: vehicle column in the CSV file]

I was thinking of using pandas to do the replacement:

df = pd.read_csv(file).replace('ALFA ROMEO 147 (937), 10.04 - 05.10', '0')

But writing replace 2000+ times is clearly the wrong way to do it.

So, how can I take the names from the db and replace them with the correct ids?

  • You need to read the data from the database, join the two DataFrames on vehicle, and then replace. Commented Dec 28, 2022 at 11:58
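The first step in that comment, reading the name/id table out of the database, might look roughly like the sketch below. The thread does not say which database is used, so SQLite stands in here, and the table name `vehicles`, the column names, and the second vehicle row are all made up for illustration.

```python
import sqlite3

import pandas as pd

# Stand-in database (assumption): an in-memory SQLite table.
# The table name, column names, and second row are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicles (vehicle_id INTEGER PRIMARY KEY, vehicle TEXT)")
conn.executemany(
    "INSERT INTO vehicles (vehicle_id, vehicle) VALUES (?, ?)",
    [
        (0, "ALFA ROMEO 147 (937), 10.04 - 05.10"),
        (1, "AUDI A4 (8E), 11.00 - 12.04"),
    ],
)
conn.commit()

# Load the lookup table into a DataFrame so it can be joined with the CSV data.
db_df = pd.read_sql_query("SELECT vehicle_id, vehicle FROM vehicles", conn)
conn.close()

print(db_df.dtypes)
```

For another database the connection object would differ, but the resulting DataFrame should have the same shape: one row per vehicle, with the name and its id.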

1 Answer

A possible solution is to merge the second dataset into the first one. After reading the two datasets (df1, the one from the CSV file, and df2, the one with vehicle_id), run:

df1.merge(df2, how='left', on='vehicle')

The final output will be a dataset with the columns id, vehicle, vehicle_id.

Imagine df1 as:

[image: df1]

and df2 as:

[image: df2]

The result will be:

[image: merged result]
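Here is a minimal sketch of that merge with made-up rows. The second vehicle name and all id values other than 0 are assumptions, not data from the question. It also drops the name column afterwards so that only the id remains.

```python
import pandas as pd

# Hypothetical sample data; only the ALFA ROMEO name comes from the question.
df1 = pd.DataFrame({
    "id": [1, 2, 3],
    "vehicle": [
        "ALFA ROMEO 147 (937), 10.04 - 05.10",
        "AUDI A4 (8E), 11.00 - 12.04",
        "ALFA ROMEO 147 (937), 10.04 - 05.10",
    ],
})
df2 = pd.DataFrame({
    "vehicle": ["ALFA ROMEO 147 (937), 10.04 - 05.10", "AUDI A4 (8E), 11.00 - 12.04"],
    "vehicle_id": [0, 1],
})

# Left merge keeps every row of df1; validate raises if names in df2 are not unique.
merged = df1.merge(df2, how="left", on="vehicle", validate="many_to_one")

# Drop the name column and let vehicle_id take its place.
result = merged.drop(columns="vehicle").rename(columns={"vehicle_id": "vehicle"})

print(merged)
print(result)
```

Repeated names in df1 are not a problem for a left merge: each row simply picks up the id of its matching name in df2.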

Here you can find the documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html


7 Comments

Thank you for your reply. I'm not sure I need 'merge', because the db has 2000+ vehicles, each with a unique name, while the CSV file has 210000+ vehicles with repeated names, and I need to replace each name with the id that vehicle has in the db. I was wondering whether there is a method that checks the vehicle name in the file, compares it with the vehicle name in the db, and then replaces it with the matching id from the db. I'll also read the documentation one more time; maybe I missed something.
Although the vehicle name is repeated, you want to check whether the vehicle name from the first table is present in the second one (the one with 210000+), and you want to take the id from the latter and add it to the first one, right? "Left" in merge means the two tables are merged only on data present in the "left" (first) table. I updated the answer to make it clearer.
I need to replace the vehicle names in the file (the one with 210000+ vehicles) with the correct ids from the db. I don't need to add a new column. That is my issue.
Well, do the same and then delete the column you no longer want.
Thank you for answering. At first I got the error "You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat". So I used df2['vehicle_id'] = df2['vehicle_id'].astype(object), but the column with ids is all NaN now.
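The in-place replacement asked for in these comments can also be sketched without merge, by building a name-to-id lookup and applying it with Series.map. The sample rows below are hypothetical. Converting both name columns to stripped strings is an assumption about what might cause the dtype error or the all-NaN result: pandas raises that merge error when the key columns being compared have different dtypes, so casting vehicle_id alone would not be expected to help.

```python
import pandas as pd

# Hypothetical sample data; only the ALFA ROMEO name comes from the question.
csv_df = pd.DataFrame({
    "vehicle": [
        "ALFA ROMEO 147 (937), 10.04 - 05.10",
        "AUDI A4 (8E), 11.00 - 12.04 ",   # trailing space on purpose
        "ALFA ROMEO 147 (937), 10.04 - 05.10",
        "UNKNOWN MODEL",                  # not present in the db
    ]
})
db_df = pd.DataFrame({
    "vehicle": ["ALFA ROMEO 147 (937), 10.04 - 05.10", "AUDI A4 (8E), 11.00 - 12.04"],
    "vehicle_id": [0, 1],
})

# Build a name -> id lookup; names are unique in the db, so the index is unique.
name_to_id = (
    db_df.assign(vehicle=db_df["vehicle"].astype(str).str.strip())
    .set_index("vehicle")["vehicle_id"]
)

# Normalize the CSV names the same way, then replace each name with its id.
# Names missing from the lookup become <NA> instead of raising an error.
keys = csv_df["vehicle"].astype(str).str.strip()
csv_df["vehicle"] = keys.map(name_to_id).astype("Int64")

print(csv_df)
```

Checking csv_df["vehicle"].isna().sum() afterwards shows how many names found no match; if every row is missing, the names in the two sources probably differ in some way the normalization above does not cover.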