
I have the following data structure:

ID Number   Product_Description
45452       MSSQL
45453       INFORMATICA
45454       INFORMATICA
45458       INFORMATICA
45456       MSSQL
45457       DBA

and the output should be:

MSSQL        2
INFORMATICA  3
DBA          1

I want to store the result in two lists:

v_1 = [MSSQL,INFORMATICA,DBA]
v_2 = [2,3,1]
  • Please update your question using the formatting tools. You will get much better help if you show what you've tried; StackOverflow is not a code writing forum. Please show your work. Commented Aug 21, 2018 at 5:08
  • @user8195447 can you add what type of data structures these are above? I'm pretty sure they are pandas series (pd.Series) objects where ID Number is the Index.name? Commented Aug 21, 2018 at 17:29

2 Answers


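You can use Series.value_counts, which counts how often each distinct value occurs and sorts the result by count in descending order:

# count occurrences of each product
p = df['Product_Description'].value_counts()
# distinct products, most frequent first
v_1 = p.index.tolist()
# matching counts
v_2 = p.values.tolist()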
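
For readers who want to try this end to end, here is a minimal sketch. It assumes the question's data lives in a pandas DataFrame with the column names shown in the question; the question never confirms this, so the construction of df below is an assumption.

```python
import pandas as pd

# Sample data copied from the question; the DataFrame layout is an assumption.
df = pd.DataFrame({
    "ID Number": [45452, 45453, 45454, 45458, 45456, 45457],
    "Product_Description": [
        "MSSQL", "INFORMATICA", "INFORMATICA",
        "INFORMATICA", "MSSQL", "DBA",
    ],
})

# value_counts returns a Series: index = distinct values, values = counts,
# sorted by count in descending order.
counts = df["Product_Description"].value_counts()

v_1 = counts.index.tolist()  # distinct products
v_2 = counts.tolist()        # matching counts

print(v_1)
print(v_2)
```

Note that the result is ordered by frequency, so it will not match the order in which the products first appear in the data.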

2 Comments

beat me to it. this is the best option.
Thanks a lot sir!

Use GroupBy.size with sort=False if the order of first appearance is important:

s = df.groupby('Product_Description', sort=False).size()
print (s)
Product_Description
MSSQL          2
INFORMATICA    3
DBA            1
dtype: int64

v_1 = s.index.tolist()
v_2 = s.values.tolist()

print (v_1)
['MSSQL', 'INFORMATICA', 'DBA']
print (v_2)
[2, 3, 1]
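
To see why sort=False matters here, the following sketch compares it with the default sort=True, which orders the group keys alphabetically. The DataFrame is rebuilt from the question's sample data so the snippet runs on its own; that it is a DataFrame at all is an assumption.

```python
import pandas as pd

# Sample data from the question (assumed to be a DataFrame column).
df = pd.DataFrame({
    "Product_Description": [
        "MSSQL", "INFORMATICA", "INFORMATICA",
        "INFORMATICA", "MSSQL", "DBA",
    ],
})

# Default: group keys are sorted, so the order of appearance is lost.
sorted_counts = df.groupby("Product_Description").size()

# sort=False keeps the groups in the order they first appear.
first_seen_counts = df.groupby("Product_Description", sort=False).size()

print(sorted_counts.index.tolist())
print(first_seen_counts.index.tolist())
```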

If a different order is needed, e.g. sorted by number of occurrences, use Series.value_counts:

s = df['Product_Description'].value_counts()
print (s)
INFORMATICA    3
MSSQL          2
DBA            1
Name: Product_Description, dtype: int64

v_1 = s.index.tolist()
v_2 = s.values.tolist()

print (v_1)
['INFORMATICA', 'MSSQL', 'DBA']

print (v_2)
[3, 2, 1]

Another solution is to create a dictionary of lists:

df1 = df.groupby('Product_Description', sort=False).size().reset_index()
df1.columns=['v_1','v_2']
print (df1)
           v_1  v_2
0        MSSQL    2
1  INFORMATICA    3
2          DBA    1

d = df1.to_dict(orient='list')
print (d)
{'v_1': ['MSSQL', 'INFORMATICA', 'DBA'], 'v_2': [2, 3, 1]}

print (d['v_1'])
['MSSQL', 'INFORMATICA', 'DBA']

print (d['v_2'])
[2, 3, 1]
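
If a direct product-to-count lookup is more convenient than two parallel lists, Series.to_dict may be worth considering. This is a sketch of a variation, not part of the approach above, and it again assumes the question's data is a DataFrame column.

```python
import pandas as pd

# Sample data from the question (assumed to be a DataFrame column).
df = pd.DataFrame({
    "Product_Description": [
        "MSSQL", "INFORMATICA", "INFORMATICA",
        "INFORMATICA", "MSSQL", "DBA",
    ],
})

# Map each product to its count, keeping the order of first appearance.
product_counts = (
    df.groupby("Product_Description", sort=False).size().to_dict()
)

print(product_counts)

# The two lists can still be recovered from the dictionary if needed.
v_1 = list(product_counts.keys())
v_2 = list(product_counts.values())
```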
