
I have an ASCII file like the following:

1   306.0416667
2   286.1666667
3   207.5
4   226.4166667
5   304.2083333
6   336.1666667
7   255.5416667
8   224.5833333
9   190.1666667
10  163.5
11  231.125
12  167.3333333
13  193.5416667
14  165
15  166
16  172.173913
17  158.9166667
18  196.8333333
19  154.875
20  303.4166667

I want to find the most frequent group of values. The groups are 0-90, 90-180, 180-270, 270-360.

I tried to use .value_counts(), but without success (even when I didn't group the values).

import pandas as pd
col_names = ['id', 'val']
df = pd.read_csv(i, names=col_names, header=None)
df['val'].value_counts()[:1].index.tolist()

4 Answers


You can use pd.cut with groupby() and count(), like below:

>>> df = pd.DataFrame({
    'freq': [306.0416667, 286.1666667, 207.5 , 226.4166667 , 304.2083333 , 
             336.1666667 , 255.5416667, 224.5833333 , 190.1666667, 163.5 , 
             231.125, 167.3333333 , 193.5416667 , 165 , 154.875 , 303.4166667]})

>>> ranges = [0,90,180,270, 360]
>>> df.groupby(pd.cut(df['freq'], ranges)).count()

           freq
freq    
(0, 90]     0
(90, 180]   4
(180, 270]  7
(270, 360]  5

>>> df.groupby(pd.cut(df['freq'], ranges)).count().idxmax()
freq    (180, 270]
dtype: interval
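If you prefer the winning bin as a plain label rather than an Interval, a small variation (a sketch of my own, not part of the answer above; the labels list is an assumption) is to pass labels= to pd.cut:

```python
import pandas as pd

df = pd.DataFrame({
    'freq': [306.0416667, 286.1666667, 207.5, 226.4166667, 304.2083333,
             336.1666667, 255.5416667, 224.5833333, 190.1666667, 163.5,
             231.125, 167.3333333, 193.5416667, 165, 154.875, 303.4166667]})

ranges = [0, 90, 180, 270, 360]
labels = ['0-90', '90-180', '180-270', '270-360']

# Bin into labelled groups, count, and take the index of the largest count
counts = df.groupby(pd.cut(df['freq'], ranges, labels=labels),
                    observed=False).count()
winner = counts['freq'].idxmax()
print(winner)  # prints 180-270
```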



Bin the values and calculate the mode():

 col1
1   306.041667
2   286.166667
3   207.500000
4   226.416667
5   304.208333
6   336.166667
7   255.541667
8   224.583333
9   190.166667
10  163.500000
11  231.125000
12  167.333333
13  193.541667
14  165.000000
15  166.000000
16  172.173913
17  158.916667
18  196.833333
19  154.875000
20  303.416667
pd.cut(df['col1'], bins=[0, 90, 180, 270, 360],
       labels=['0-90', '90-180', '180-270', '270-360'],
       ordered=False).mode()
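For completeness, a self-contained version of this approach (the DataFrame construction is mine, built from the question's values):

```python
import pandas as pd

vals = [306.0416667, 286.1666667, 207.5, 226.4166667, 304.2083333,
        336.1666667, 255.5416667, 224.5833333, 190.1666667, 163.5,
        231.125, 167.3333333, 193.5416667, 165, 166, 172.173913,
        158.9166667, 196.8333333, 154.875, 303.4166667]
df = pd.DataFrame({'col1': vals}, index=range(1, 21))

# Cut into labelled bins and take the most frequent label
groups = pd.cut(df['col1'], bins=[0, 90, 180, 270, 360],
                labels=['0-90', '90-180', '180-270', '270-360'],
                ordered=False)
print(groups.mode().iloc[0])  # prints 180-270
```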



Try the below (no external library is required in this solution):

from collections import defaultdict

data = defaultdict(int)
STEP = 90

with open('data.txt') as f:
    for line in f:
        _, val = line.split()
        # Walk up the bin edges until the value fits, then count it
        cnt = 1
        while True:
            if float(val) <= STEP * cnt:
                key = f'{(cnt - 1) * STEP}-{cnt * STEP}'
                data[key] += 1
                break
            cnt += 1
print(data)

max_key = max(data, key=data.get)
print(f'max: {max_key}')

Output:

defaultdict(<class 'int'>, {'270-360': 5, '180-270': 8, '90-180': 7})
max: 180-270
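As a side note, the inner while loop can be replaced by integer division, assuming all values fall in [0, 360). This is a sketch of my own, not the answer's code, and it bins an exact multiple of 90 into the higher bin (e.g. 90.0 goes to '90-180', whereas the loop above keeps it in '0-90'); the sample lines are stand-ins for the file's rows:

```python
from collections import Counter

STEP = 90

def bin_label(val, step=STEP):
    # Integer division picks the bin index directly
    idx = int(float(val) // step)
    return f'{idx * step}-{(idx + 1) * step}'

# Stand-in rows; in practice these would come from data.txt
lines = ['1 306.0416667', '2 163.5', '3 207.5', '4 226.4166667']
data = Counter(bin_label(line.split()[1]) for line in lines)
print(data.most_common(1)[0][0])  # prints 180-270
```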



Use pd.cut + value_counts, as follows:

bins = [0, 90, 180, 270, 360]
df['group'] = pd.cut(df['val'], bins)

df['group'].value_counts()

Result:

(180, 270]    8
(90, 180]     7
(270, 360]    5
(0, 90]       0
Name: group, dtype: int64

For the max entry, you can use .head(1), as follows:

df['group'].value_counts().head(1)

Result:

(180, 270]    8
Name: group, dtype: int64
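If you only need the winning interval itself rather than a one-row Series, idxmax() is another option (a sketch assuming the same df, built here from the question's values):

```python
import pandas as pd

df = pd.DataFrame({'val': [306.0416667, 286.1666667, 207.5, 226.4166667,
                           304.2083333, 336.1666667, 255.5416667, 224.5833333,
                           190.1666667, 163.5, 231.125, 167.3333333,
                           193.5416667, 165, 166, 172.173913, 158.9166667,
                           196.8333333, 154.875, 303.4166667]})

bins = [0, 90, 180, 270, 360]
df['group'] = pd.cut(df['val'], bins)

# idxmax() returns the Interval with the highest count directly
top = df['group'].value_counts().idxmax()
print(top)  # prints (180, 270]
```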

