I have a DataFrame returned from a Google Trends API that contains values for date, keyword and search volume. I need to return a list of lists, each in the following form: [keyword, date 1, value 1, date 2, value 2, date 3, value 3, ..., date n, value n]

I have the following function that takes a set of keywords, sends them to the API, and then converts the returned DataFrame to a list:

def list_to_api(keyword_list):
    # pytrends and insert_list are defined elsewhere in the script
    pytrends.build_payload(keyword_list, cat=0, timeframe='today 12-m', geo='', gprop='')
    df = pytrends.interest_over_time()
    # .values drops the date index and the column names, keeping only the cell values
    google_data_list = df.values.tolist()
    print(type(google_data_list))
    print("Resting 5 seconds for next API Call")
    print("Converted to  list ")
    insert_list.append(google_data_list)

Screenshot 1 (dataframe) shows what the output looks like as a DataFrame.

That gives the list output [[[1, 93, 29, 7, 0, False], [1, 95, 31, 8, 0, False], [1, 91, 31, 8, 0, False], [1, 93, 34, 7, 0, False], [1, 96, 32, 8, 0, False]

I have transposed the dataframe by updating these two lines

df = (pytrends.interest_over_time())
google_data_list = df.values.tolist()

to

df_new = df.transpose()
google_data_list = df_new.values.tolist()

Screenshot 2 (df transposed) shows what this table looks like. It produces the following list output for the first two rows:

[[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 
[92, 94, 92, 94, 98, 100, 85, 87, 88, 87, 95, 89, 89, 93, 94, 88, 86, 87, 84,
 87, 82, 80, 81, 81, 76, 78, 78, 77, 73, 77, 76, 76, 79, 73, 87, 88, 91, 92, 88, 90, 
85, 88, 95, 94, 89, 91, 91, 91, 89, 85, 86]

So for the first example my desired output would be

[0 balance transfer, date1, 1, date2, 1, date3, 1, dateN, 1...]

But I am struggling to take each date from the header and add it alongside the corresponding value in the list. Any help is much appreciated.

1 Answer

Instead of transpose() and tolist(), you could use a loop and a list comprehension, for example:

import pandas as pd

# Sample data mirroring the first rows of the question's output
df = pd.DataFrame([[1, 93, 29, 7, 0, False], [1, 95, 31, 8, 0, False], [1, 91, 31, 8, 0, False], [1, 93, 34, 7, 0, False], [1, 96, 32, 8, 0, False]])
df.columns = ['0 balance transfer', 'car insurance', 'travel insurance', 'pet insurance', 'ww travel insurance', 'isPartial']
df.index = ['2018-05-06','2018-05-13','2018-05-20','2018-05-27','2018-06-03']
out = []
for col in df:
    # Start each row with the keyword, then flatten the (date, value) pairs behind it
    pairs = zip(df[col].index, df[col])
    out.append([col] + [item for pair in pairs for item in pair])
print(out)

>> [['0 balance transfer', '2018-05-06', 1, '2018-05-13', 1, '2018-05-20', 1, '2018-05-27', 1, '2018-06-03', 1], ['car insurance', '2018-05-06', 93, '2018-05-13', 95, '2018-05-20', 91, '2018-05-27', 93, '2018-06-03', 96], ['travel insurance', '2018-05-06', 29, '2018-05-13', 31, '2018-05-20', 31, '2018-05-27', 34, '2018-06-03', 32], ['pet insurance', '2018-05-06', 7, '2018-05-13', 8, '2018-05-20', 8, '2018-05-27', 7, '2018-06-03', 8], ['ww travel insurance', '2018-05-06', 0, '2018-05-13', 0, '2018-05-20', 0, '2018-05-27', 0, '2018-06-03', 0], ['isPartial', '2018-05-06', False, '2018-05-13', False, '2018-05-20', False, '2018-05-27', False, '2018-06-03', False]]

Edit based on comment (Drop isPartial column and filter Dates):

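del df['isPartial']  # drop the isPartial column
out = []
for col in df:
    # Keep only the (date, value) pairs after the cutoff, then flatten them
    pairs = [(date, value) for date, value in zip(df[col].index, df[col]) if date > '2018-05-15']
    out.append([col] + [item for pair in pairs for item in pair])

print(out)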
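
The sample above uses plain strings as the index, whereas the comments below report Timestamp objects in the real output, which suggests the real index is a DatetimeIndex. Here is a minimal sketch under that assumption. The sample frame and the helper name to_keyword_rows are made up for illustration. It formats the dates with strftime and applies the date filter on the index before formatting:

```python
import pandas as pd

# Hypothetical sample standing in for the real API result (DatetimeIndex assumed)
df = pd.DataFrame(
    {
        "0 balance transfer": [1, 1, 1],
        "car insurance": [93, 95, 91],
        "isPartial": [False, False, False],
    },
    index=pd.to_datetime(["2018-05-06", "2018-05-13", "2018-05-20"]),
)


def to_keyword_rows(frame, start=None, date_format="%Y-%m-%d"):
    """Return one [keyword, date1, value1, date2, value2, ...] list per column."""
    frame = frame.drop(columns=["isPartial"], errors="ignore")
    if start is not None:
        # Filter on the DatetimeIndex itself, before any string formatting
        frame = frame[frame.index > pd.Timestamp(start)]
    dates = frame.index.strftime(date_format)  # plain strings, not Timestamp
    rows = []
    for col in frame.columns:
        row = [col]
        for date, value in zip(dates, frame[col].tolist()):
            row.extend((date, value))
        rows.append(row)
    return rows


rows = to_keyword_rows(df, start="2018-05-10")
print(rows)
```

Calling tolist() on each column also converts the values to built-in Python types, which tends to be easier to pass on to other code than NumPy scalars.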

7 Comments

This is great, but how can I do the same while removing the isPartial column and dropping the Timestamp that appears before each date value? I don't have experience with list comprehensions yet. Thanks!
Drop the column with del df['isPartial']. Timestamp filtering depends on the datatype of the timestamp (object); you could try if date > '2018-05-15' inside the list comprehension, see my edit.
I'm not sure if I need to post a new question, but looking at the DB structure I have to work with, I need the results to show as [['0 balance transfer', date1, value1], ['0 balance transfer', date2, value2], ['0 balance transfer', dateN, valueN]]. Is this possible through the same list comprehension?
Sure, you don't even need a loop any more: out = [[[col, date, value] for date, value in zip(df[col].index, df[col]) if date > '2018-05-15'] for col in df]
I am still getting the dates in the list as Timestamp objects, e.g. ['car insurance', Timestamp('2018-05-06 00:00:00'), 92], and it makes less sense to me the more I look at the data and the list comprehension @ilja
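
A note on the one-liner above: it builds one sub-list per keyword, so the [keyword, date, value] rows end up nested one level deeper than the layout requested for the DB. Below is a hedged sketch of a flat variant. It assumes the real frame has a DatetimeIndex (the sample data is made up) and formats the dates as strings so that no Timestamp objects appear in the result:

```python
import pandas as pd

# Hypothetical sample standing in for the real API result (DatetimeIndex assumed)
df = pd.DataFrame(
    {
        "0 balance transfer": [1, 1],
        "car insurance": [93, 95],
        "isPartial": [False, False],
    },
    index=pd.to_datetime(["2018-05-06", "2018-05-13"]),
)

df = df.drop(columns=["isPartial"])
dates = df.index.strftime("%Y-%m-%d")  # plain strings instead of Timestamp

# A single comprehension with two for clauses gives one flat list of rows
flat = [
    [col, date, value]
    for col in df.columns
    for date, value in zip(dates, df[col].tolist())
]
print(flat)
```

Each row here is a single [keyword, date, value] observation, a shape that DB-API style executemany calls can usually take as their parameter sequence.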