Inserting counter objects into Dataframe python

Question

I am not sure how to insert a Counter (from collections) into a dataframe.

My dataframe looks like,

doc_cluster_key_freq=pd.DataFrame(index=[], columns=['doc_parent_id','keyword_id','key_count_in_doc_cluster'])

sim_docs_ids=[3342,3783]  

The counters generated for each of the sim_docs_ids are given below:

id=3342
Counter({133: 9, 79749: 7})

id=3783
Counter({133: 10, 12072: 5, 79749: 1})

The counter is generated in a loop, once for each id in sim_docs_ids.

My code looks like:

for doc_ids in sim_docs_ids:
    #generate counter for doc_ids
    #insert the counter into dataframe (doc_cluster_key_freq) here

The output I am looking for is as below:

doc_cluster_key_freq=
   doc_parent_id  keyword_id  key_count_in_doc_cluster
0           3342         133                         9
1           3342       79749                         7
2           3783         133                        10
3           3783       12072                         5
4           3783       79749                         1

I tried using counter.keys() and counter.values(), but I get something like the output below, and I have no idea how to split the lists into separate rows:

   doc_parent_id           keyword_id  key_count_in_doc_cluster
0           3342         [133, 79749]                    [9, 7]
1           3783  [12072, 133, 79749]                [5, 10, 1]

Rikka · Accepted Answer · 2015-11-12 08:34:04Z

If every doc_id has the same number of keywords, you can compute the row index for each record in advance and use the code below to write one row per keyword for every doc_id:

keywords = ['key1', 'key2', 'key3', ...]
number_of_keywords = len(keywords)

for i, doc_id in enumerate(sim_docs_ids):
    # Generate keyword Counter (counter) for doc_id
    for j, key in enumerate(keywords):
        # Each doc_id owns a block of number_of_keywords consecutive rows
        doc_cluster_key_freq.loc[i * number_of_keywords + j] = [doc_id, key, counter[key]]

An example:

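import random
from collections import Counter

import pandas as pd

keywords = ['a', 'b', 'c']
N = len(keywords)
ids = range(5)
# Empty frame to fill; column names match the output below
a = pd.DataFrame(columns=['id', 'keyword', 'count'])

for i, idd in enumerate(ids):
    # Random counts stand in for the real per-document Counter
    counter = Counter({'a': random.randint(0, 10),
                       'b': random.randint(0, 10),
                       'c': random.randint(0, 10)})
    for j, key in enumerate(keywords):
        # Row i*N+j holds keyword j of document i
        a.loc[i*N+j] = [idd, key, counter[key]]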

Output:

    id  keyword count
0   0   a   10
1   0   b   9
2   0   c   9
3   1   a   1
4   1   b   10
5   1   c   10
6   2   a   9
7   2   b   0
8   2   c   5
9   3   a   6
10  3   b   0
11  3   c   8
12  4   a   0
13  4   b   3
14  4   c   8
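
The approach above relies on every doc_id having the same keywords. In the question the Counters have different keys (3342 has two keywords, 3783 has three), so a possible alternative, not part of the original answer, is to collect one row per Counter entry and build the dataframe once at the end. This is only a sketch: the `counters` dict and the `rows` list are hypothetical names introduced for illustration, and the Counters are hard-coded from the question rather than generated in the loop.

```python
from collections import Counter

import pandas as pd

# Counters copied from the question; in real code each one would be
# generated inside the loop for its document id (assumption).
counters = {
    3342: Counter({133: 9, 79749: 7}),
    3783: Counter({133: 10, 12072: 5, 79749: 1}),
}

rows = []
for doc_id, counter in counters.items():
    # One (doc_id, keyword_id, count) tuple per Counter entry,
    # so documents may have different numbers of keywords.
    for keyword_id, count in counter.items():
        rows.append((doc_id, keyword_id, count))

# Build the dataframe in a single step instead of growing it row by row.
doc_cluster_key_freq = pd.DataFrame(
    rows,
    columns=['doc_parent_id', 'keyword_id', 'key_count_in_doc_cluster'],
)
print(doc_cluster_key_freq)
```

Collecting plain tuples and constructing the dataframe once also avoids repeated .loc enlargement, which tends to get slow as the frame grows.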
