Joining a single-index dataframe to a multi-index dataframe

Question

I have two dataframes, structured something like

# df1
                        data1   data2
id      feature_count   
12345   1               111     888
        2               222     999
        3               333     101010
45678   0               444     111111
        2               555     121212
        3               666     131313
        4               777     141414

and

# df2
        descriptor
id
12345   "foo"
45678   "bar"

Based on this solution, it seems like I should simply be able to call df1.join(df2) to get the desired result:

#joined
                        data1   data2   descriptor
id      feature_count   
12345   1               111     888     "foo"
        2               222     999     "foo"
        3               333     101010  "foo"
45678   0               444     111111  "bar"
        2               555     121212  "bar"
        3               666     131313  "bar"
        4               777     141414  "bar"

However, what I actually get in pandas 1.0.5 is: NotImplementedError: Index._join_level on non-unique index is not implemented

This seems like it shouldn't be complicated, but I'm clearly misunderstanding something. All I'm looking for is to attach df2's column of unique mappings to df1 by matching on df1's first index level (id); every id in df1 is guaranteed to have a mapping in df2.

I can't duplicate your error. I have version 1.0.5 too. Your example code works for me. Does your example code work for you or do you get the error on the larger dataset?
– Jarad, Commented Sep 26, 2020 at 0:04
It was a much larger dataset pulled in from a SQL query. After I accepted the answer (and a different error proved to be more informative) I think that the query had unexpected duplication (which I'm surprised Pandas let me assign an index to).
– Philip Kahn, Commented Sep 28, 2020 at 17:13
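
For readers hitting the same error on real data, here is a minimal sketch of how one might check an index for the kind of duplication described in the comment above. The frame df2_raw and its contents are hypothetical (the actual query result is not shown in the thread), and dropping duplicates is only one possible fix, reasonable when the repeated rows agree with each other.

```python
import pandas as pd

# Hypothetical lookup frame in which the query returned id 12345 twice.
# set_index does not reject duplicate keys by default.
df2_raw = pd.DataFrame(
    {"id": [12345, 12345, 45678], "descriptor": ["foo", "foo", "bar"]}
).set_index("id")

# A quick yes/no check before attempting the join.
print(df2_raw.index.is_unique)

# Show every row whose id occurs more than once.
dupes = df2_raw[df2_raw.index.duplicated(keep=False)]
print(dupes)

# One possible fix: keep the first row for each id.
df2_clean = df2_raw[~df2_raw.index.duplicated(keep="first")]
print(df2_clean)
```

The same is_unique check can be run on df1.index to confirm that each (id, feature_count) pair appears only once.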

Quang Hoang · Accepted Answer · 2020-09-26 01:01:53Z

Since you only need to map one column, just do:

df1['descriptor'] = df1.index.get_level_values('id').map(df2['descriptor'])
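
As a self-contained sketch of this mapping approach, the snippet below rebuilds the example frames from the question and applies the line above. The construction code is an assumption, since the question only shows the printed tables. Note that the lookup goes through df2's index, so it is expected to work only when each id appears once in df2.

```python
import pandas as pd

# Rebuild the example frames from the question (construction is illustrative).
df1 = pd.DataFrame(
    {
        "id": [12345, 12345, 12345, 45678, 45678, 45678, 45678],
        "feature_count": [1, 2, 3, 0, 2, 3, 4],
        "data1": [111, 222, 333, 444, 555, 666, 777],
        "data2": [888, 999, 101010, 111111, 121212, 131313, 141414],
    }
).set_index(["id", "feature_count"])
df2 = pd.DataFrame(
    {"id": [12345, 45678], "descriptor": ["foo", "bar"]}
).set_index("id")

# Take each row's id (the first index level) and look it up in df2.
ids = df1.index.get_level_values("id")
df1["descriptor"] = ids.map(df2["descriptor"])

print(df1)
```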

In general, you can temporarily reset the other index level, join the dataframes, and then set it back:

df1.reset_index('feature_count').join(df2).set_index('feature_count', append=True)

Output:

                     data1   data2 descriptor
id    feature_count                          
12345 1                111     888      "foo"
      2                222     999      "foo"
      3                333  101010      "foo"
45678 0                444  111111      "bar"
      2                555  121212      "bar"
      3                666  131313      "bar"
      4                777  141414      "bar"
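
For completeness, a runnable sketch of the reset/join/restore approach with each step commented. As before, the frame construction is an assumption based on the tables in the question, and the name joined is just illustrative.

```python
import pandas as pd

# Rebuild the example frames from the question (construction is illustrative).
df1 = pd.DataFrame(
    {
        "id": [12345, 12345, 12345, 45678, 45678, 45678, 45678],
        "feature_count": [1, 2, 3, 0, 2, 3, 4],
        "data1": [111, 222, 333, 444, 555, 666, 777],
        "data2": [888, 999, 101010, 111111, 121212, 131313, 141414],
    }
).set_index(["id", "feature_count"])
df2 = pd.DataFrame(
    {"id": [12345, 45678], "descriptor": ["foo", "bar"]}
).set_index("id")

# Step 1: move feature_count into a column so only "id" remains as the index.
flat = df1.reset_index("feature_count")

# Step 2: plain index-on-index left join against the lookup frame.
flat = flat.join(df2)

# Step 3: put feature_count back as the second index level.
joined = flat.set_index("feature_count", append=True)

print(joined)
```

Because the join happens on a single-level index in step 2, this route avoids the multi-level join code path named in the error message.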

Your first case still gave me an error, but your second worked like a charm. Thanks!
– Philip Kahn, Commented over a year ago
