2

Is it possible to add a value in a column when the province name of second dataframe matches with the province name of the first dataframe? I searched for answers and weren't able to find anything useful for my case.

This is first DataFrame

    date        province            confirmed   released    deceased
0   2020-03-30  Daegu               6624        3837        111
1   2020-03-30  Gyeongsangbuk-do    1298        772         38
2   2020-03-30  Gyeonggi-do         463         160         5
3   2020-03-30  Seoul               426         92          0
4   2020-03-30  Chungcheongnam-do   127         83          0
...

and this is second DataFrame

    code    province            latitude    longitude
0   12000   Daegu               35.872150   128.601783   
1   60000   Gyeongsangbuk-do    36.576032   128.505599  
2   20000   Gyeonggi-do         37.275119   127.009466
3   10000   Seoul               37.566953   126.977977  
4   41000   Chungcheongnam-do   36.658976   126.673318
...

I would like to turn the first DataFrame like this.

    date        province            confirmed   released    deceased   latitude     longitude
0   2020-03-30  Daegu               6624        3837        111        35.872150    128.601783
1   2020-03-30  Gyeongsangbuk-do    1298        772         38         36.576032    128.505599
2   2020-03-30  Gyeonggi-do         463         160         5          37.275119    127.009466
3   2020-03-30  Seoul               426         92          0          37.566953    126.977977
4   2020-03-30  Chungcheongnam-do   127         83          0          36.658976    126.673318
...

Thanks!

2 Answers 2

4

The pandas.DataFrame.merge method is what you want to use here.

Using your example DataFrames:

import pandas as pd

df1 = pd.DataFrame(dict(
    date = [
        '2020-03-30','2020-03-30','2020-03-30',
        '2020-03-30','2020-03-30',],
    province = [
        'Daegu', 'Gyeongsangbuk-do', 'Gyeonggi-do', 
        'Seoul', 'Chungcheongnam-do'],
    confirmed = [6624, 1298, 463, 426, 127],
    released = [3837, 772, 160, 92, 83],
    deceased = [111, 38, 5, 0, 0],
))

df2 = pd.DataFrame(dict(
    code = [12000, 60000, 20000, 10000, 41000],
    province = [
        'Daegu', 'Gyeongsangbuk-do', 'Gyeonggi-do', 
        'Seoul', 'Chungcheongnam-do'],
    latitude = [
        35.872150, 36.576032, 37.275119, 
        37.566953, 36.658976],
    longitude = [
        128.601783, 128.505599, 127.009466, 
        126.977977, 126.673318],
))

df3 =  df1.merge(
    df2[['province', 'latitude','longitude']],
    on = 'province',
)

pd.set_option('display.max_columns', 7)

print(df3)

Output:

         date           province  confirmed  released  deceased   latitude  \
0  2020-03-30              Daegu       6624      3837       111  35.872150   
1  2020-03-30   Gyeongsangbuk-do       1298       772        38  36.576032   
2  2020-03-30        Gyeonggi-do        463       160         5  37.275119   
3  2020-03-30              Seoul        426        92         0  37.566953   
4  2020-03-30  Chungcheongnam-do        127        83         0  36.658976   

    longitude  
0  128.601783  
1  128.505599  
2  127.009466  
3  126.977977  
4  126.673318

Example Code in python tutor

Sign up to request clarification or add additional context in comments.

1 Comment

I appreciate your help!
0

What you really want to do is merge both the DataFrames based on the province column.

Make a new DataFrame which you want.

First run a loop on first DataFrame and add all the values in it. (Leave the values for the columns which are not present)

Then run a loop on second DataFrame and add the its values by comparing the value of province to the already added value in the new DataFrame.

Here's an example

NewDataFrame

date        province            confirmed   released    deceased   latitude     longitude

After adding the first DataFrame

    date        province            confirmed   released    deceased    latitude     longitude
0   2020-03-30  Daegu               6624        3837        111
1   2020-03-30  Gyeongsangbuk-do    1298        772         38
2   2020-03-30  Gyeonggi-do         463         160         5
3   2020-03-30  Seoul               426         92          0
4   2020-03-30  Chungcheongnam-do   127         83          0

After adding second DataFrame

    date        province            confirmed   released    deceased   latitude     longitude
0   2020-03-30  Daegu               6624        3837        111        35.872150    128.601783
1   2020-03-30  Gyeongsangbuk-do    1298        772         38         36.576032    128.505599
2   2020-03-30  Gyeonggi-do         463         160         5          37.275119    127.009466
3   2020-03-30  Seoul               426         92          0          37.566953    126.977977
4   2020-03-30  Chungcheongnam-do   127         83          0          36.658976    126.673318

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.