
I am writing keywords and their corresponding page numbers from LaTeX into text files, which I then process with Python. How can I create a sorted list of unique page numbers for each keyword?

The following code gives me the unique list; however, it is not sorted.

import pandas as pd

def unique(liste):
    a = liste.split(',')
    a = [int(numeric_string) for numeric_string in a]
    a = sorted(a)
    a = map(str,a)
    b = set(a)
    return ','.join(b)

df = pd.DataFrame({'keyword': ["foo","foo","foo","foo","foo","foo","foo","foo","bar","bar","bar"], "page": [1,2,3,3,4,5,6,7,7,9,10]})
df['page'] = df['page'].astype(str)
print(df)

grouped = df.groupby('keyword',as_index=False).agg(lambda col: ','.join(col))
grouped = pd.DataFrame(grouped)
grouped['unique'] = grouped['page'].apply(unique)
print(grouped)

produces

   keyword page
0      foo    1
1      foo    2
2      foo    3
3      foo    3
4      foo    4
5      foo    5
6      foo    6
7      foo    7
8      bar    7
9      bar    9
10     bar   10
  keyword             page         unique
0     bar           7,9,10         9,7,10
1     foo  1,2,3,3,4,5,6,7  3,7,6,4,5,2,1
  • What is your desired output? Commented Mar 12, 2016 at 23:54

1 Answer

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'keyword': ["foo","foo","foo","foo","foo","foo","foo","foo","bar","bar","bar"], 
     "page": [1,2,3,3,4,5,6,7,7,9,10]})

# df['page'] = df['page'].astype(int)
result = df.groupby(['keyword'])['page'].agg(lambda x: ','.join(np.unique(x).astype(str)))

print(result)

yields

keyword
bar           7,9,10
foo    1,2,3,4,5,6,7
Name: page, dtype: object

  • np.unique returns the sorted unique values of an array. We want the pages sorted as ints (not as strings), so keep the page values as ints. After calling np.unique, use astype(str) to convert them to strings, then combine them with ','.join.
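To illustrate why the page values should stay ints until after np.unique, here is a minimal sketch. The `pages` list and the variable names are made up for the illustration and are not part of the original code. The last line shows a plain-Python alternative (calling `sorted` on a `set`), which is an assumption about what would also work here rather than something the answer uses.

```python
import numpy as np

# Hypothetical sample data, deliberately unsorted and with a duplicate.
pages = [10, 9, 7, 7]

# np.unique on ints: duplicates removed, sorted numerically.
as_ints = np.unique(pages)
int_joined = ','.join(as_ints.astype(str))

# np.unique on strings: duplicates removed, but sorted lexicographically,
# so "10" ends up before "7".
as_strs = np.unique([str(p) for p in pages])
str_joined = ','.join(as_strs)

# Plain-Python alternative: de-duplicate with set first, then sort.
# (Sorting first and calling set afterwards loses the order again.)
plain_joined = ','.join(str(p) for p in sorted(set(pages)))

print(int_joined)
print(str_joined)
print(plain_joined)
```

The first and third lines print the pages in numeric order, while the second shows the lexicographic order you get if the values are converted to strings too early.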