import pandas as pd
pd.DataFrame({'genre': 'Pop',
 'country': 'CA',
 'artist_name': 'Olivia Rodrigo',
 'title_name': 'good 4 u',
 'release_date': '2021-05-13',
 'core_genre': 'Pop',
 'metrics': [],
 'week_id': 202101,
 'top_isrc': 'USUG12101245'})

This returns the column names but an otherwise empty dataframe, because metrics is an empty list. That is a problem: it would be better if this returned a 1-row dataframe with an empty list in the metrics column.


Here is an example of the data without missing metrics:

{'genre': 'Pop',
 'country': 'CA',
 'artist_name': 'Olivia Rodrigo',
 'title_name': 'drivers license',
 'release_date': '2021-01-07',
 'core_genre': 'Pop',
 'metrics': [{'name': 'Song w/SES On-Demand',
   'value': [{'name': 'tp', 'value': 1},
    {'name': 'lp', 'value': 0},
    {'name': 'ytd', 'value': 1},
    {'name': 'atd', 'value': 1}]},
  {'name': 'Song w/SES On-Demand Audio',
   'value': [{'name': 'tp', 'value': 0},
    {'name': 'lp', 'value': 0},
    {'name': 'ytd', 'value': 0},
    {'name': 'atd', 'value': 0}]},
  {'name': 'Streaming On-Demand Total',
   'value': [{'name': 'tp', 'value': 414},
    {'name': 'lp', 'value': 0},
    {'name': 'ytd', 'value': 414},
    {'name': 'atd', 'value': 414}]},
  {'name': 'Streaming On-Demand Audio',
   'value': [{'name': 'tp', 'value': 69},
    {'name': 'lp', 'value': 0},
    {'name': 'ytd', 'value': 69},
    {'name': 'atd', 'value': 69}]}],
 'week_id': 202101,
 'top_isrc': 'USUG12004749'}

This is handled quite nicely by pd.DataFrame(), which creates a row for each of the 4 dicts in the metrics list. I assume that pd.DataFrame() returns 0 rows in the first example (0 dicts in the list) for the same reason it returns 4 rows in this second example (4 dicts in the list). However, losing the row of data is a problem. How can we handle this?

1 Answer


You can get a single row with an empty list in metrics by wrapping the empty list in another list:

df = pd.DataFrame({'genre': 'Pop',
 'country': 'CA',
 'artist_name': 'Olivia Rodrigo',
 'title_name': 'good 4 u',
 'release_date': '2021-05-13',
 'core_genre': 'Pop',
 'metrics': [[]],
 'week_id': 202101,
 'top_isrc': 'USUG12101245'})

Gives

  genre country     artist_name title_name release_date core_genre metrics  week_id      top_isrc
0   Pop      CA  Olivia Rodrigo   good 4 u   2021-05-13        Pop      []   202101  USUG12101245

Alternatively, pass a list containing an empty dict, [{}], to get a single row whose metrics value is an empty dict.
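If the records arrive with metrics sometimes empty and sometimes not, the wrapping can be applied conditionally. Here is a minimal sketch, assuming a hypothetical helper called record_to_frame (it is not part of pandas) and shortened, illustrative records:

```python
import pandas as pd


def record_to_frame(record, list_key="metrics"):
    """Hypothetical helper: build a DataFrame from one record dict,
    keeping one row even when record[list_key] is an empty list."""
    fixed = dict(record)  # shallow copy so the caller's dict is untouched
    if not fixed[list_key]:
        # Wrap the empty list so pandas sees one row value, not zero rows.
        fixed[list_key] = [[]]
    return pd.DataFrame(fixed)


# Illustrative records, trimmed down from the ones in the question.
empty_record = {"title_name": "good 4 u", "metrics": [], "week_id": 202101}
full_record = {
    "title_name": "drivers license",
    "metrics": [{"name": "tp", "value": 1}, {"name": "lp", "value": 0}],
    "week_id": 202101,
}

empty_df = record_to_frame(empty_record)  # one row, metrics holds []
full_df = record_to_frame(full_record)    # one row per metrics entry
```

Records with a non-empty metrics list go through unchanged, so they still expand to one row per entry as in the question.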

Comment:

It's interesting that passing a bare empty list returns no rows at all. I suppose that, from pandas's point of view, it is hard to distinguish a vector of row values from a single row value that is itself a vector, and the default behaviour is, apparently, to throw the whole row away.
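A quick check suggests the row is not so much thrown away as never created: the scalar values appear to be repeated to match the length of the list-like column, so a list of length 0 yields 0 rows. A small sketch with made-up values:

```python
import pandas as pd

# Map the length of the list passed for "metrics" to the resulting row count.
row_counts = {}
for values in ([], ["a"], ["a", "b", "c"]):
    df = pd.DataFrame({"genre": "Pop", "metrics": values})
    row_counts[len(values)] = len(df)

# A nested empty list has length 1, so it produces a single row.
nested = pd.DataFrame({"genre": "Pop", "metrics": [[]]})

print(row_counts)
print(len(nested))
```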


3 Comments

Makes sense. I think our desired output is actually an empty dictionary, although I assume we can replace the empty list with an empty dict to get the same result.
Yep that's right - see my edited comment :)
Yes it is very interesting. It's also interesting that the default approach if the list has 2-3 options is to create 2-3 rows rather than a single row.
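If one row per record is what you want regardless of how many entries metrics has, another option is to wrap the whole record in a list, so that pandas treats each dict as a single row. A sketch with shortened, illustrative records:

```python
import pandas as pd

# Illustrative records, trimmed down from the ones in the question.
record = {
    "title_name": "drivers license",
    "metrics": [{"name": "tp", "value": 1}, {"name": "lp", "value": 0}],
    "week_id": 202101,
}
empty_record = {"title_name": "good 4 u", "metrics": [], "week_id": 202101}

# A list of dicts is read as one row per dict, so the metrics list
# stays intact inside a single cell instead of expanding into rows.
one_row = pd.DataFrame([record])
also_one_row = pd.DataFrame([empty_record])
both = pd.DataFrame([record, empty_record])
```

Note that this changes the behaviour for non-empty lists too: the record with 2 metrics becomes 1 row rather than 2.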
