Indexing a numpy array based on order with repetition

Question

I have a numpy array as follows,

import numpy as np
arr = np.array([0.166667, 0., 0., 0.333333, 0., 0.166667, 0.166667, np.nan])

I wish to rank the above array in descending order so that the highest value gets rank 1 and np.nan gets the last rank, without incrementing the rank when a value repeats.

Expectation:

ranks = [2, 3, 3, 1, 3, 2, 2, 4]

i.e.
>>>>
1 0.333333
2 0.166667
2 0.166667
2 0.166667
3 0.0
3 0.0
3 0.0
4 -inf

What I have accomplished so far is below,

I used np.argsort twice and replaced the np.nan value with negative infinity, but the ranks still increment even when the values are equal!

# The Logic
arr = np.nan_to_num(arr, nan=float('-inf'))
ranks = list(np.argsort(np.argsort(arr)[::-1]) + 1)

# Pretty Print
sorted_ = sorted([(r, a) for a, r in zip(arr, ranks)], key=lambda v: v[0])
for r, a in sorted_:
  print(r, a)

>>>>
1 0.333333
2 0.166667
3 0.166667
4 0.166667
5 0.0
6 0.0
7 0.0
8 -inf

Any idea on how to manage the ranks without increments?

https://repl.it/@MilindDalvi/MidnightblueUnselfishCategories

To sort arr in descending order with NaN at the end, one can use -np.sort(-arr)
– Aivar Paalberg, Commented Jan 16, 2020 at 12:24

yatu · Accepted Answer · 2020-01-16 12:02:04Z

1

Here's a pandas approach using Series.rank with method="min" and na_option='bottom'. The "min" ranks still leave gaps after ties, so the unique ranks are then mapped to consecutive integers:

s = pd.Series(arr).rank(method="min", na_option='bottom', ascending=False)
u = np.sort(s.unique())
s.map(dict(zip(u, range(len(u))))).add(1).values
# array([2, 3, 3, 1, 3, 2, 2, 4], dtype=int64)
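
As a possible shortcut (an assumption on my part, not something the answer above states), Series.rank also accepts method="dense", which should give tied values the same rank without leaving gaps and so may make the remapping step unnecessary. A minimal sketch on the array from the question:

```python
import numpy as np
import pandas as pd

arr = np.array([0.166667, 0., 0., 0.333333, 0., 0.166667, 0.166667, np.nan])

# method="dense": tied values share a rank and no ranks are skipped.
# na_option="bottom": NaN is ranked after every real value.
ranks = (
    pd.Series(arr)
    .rank(method="dense", na_option="bottom", ascending=False)
    .astype(int)
    .to_numpy()
)
print(ranks.tolist())
```

rank returns floats, hence the astype(int) at the end.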

answered Jan 16, 2020 at 12:02

yatu


Mohammed Deifallah · Accepted Answer · 2020-01-16 11:52:49Z

0

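Try something like this before the final loop. It walks the sorted list and only increments the rank when the value changes:

k = 1
dense = [(k, sorted_[0][1])]  # tuples are immutable, so build a new list
for i in range(1, len(sorted_)):
    if sorted_[i][1] != sorted_[i - 1][1]:
        k += 1  # a new distinct value moves to the next rank
    dense.append((k, sorted_[i][1]))
sorted_ = dense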

answered Jan 16, 2020 at 11:52

Mohammed Deifallah


Thijmen · Accepted Answer · 2020-01-16 12:04:04Z

0

Not necessarily a better way - just another way of approaching this issue

arr = sorted(np.array([0.166667, 0., 0., 0.333333, 0., 0.166667, 0.166667, np.nan]), reverse=True)
count = 1
mydict = {}

for a in arr:
    if a not in mydict:
        mydict[a] = count
        count += 1

for i in arr:
    print(mydict[i], i)

answered Jan 16, 2020 at 12:04

Thijmen


Vicrobot · Accepted Answer · 2020-01-16 12:21:16Z

0

Here's one approach:

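# Distinct non-NaN values, largest first (a set has no defined order, so sort explicitly)
u = sorted({x for x in arr if not np.isnan(x)}, reverse=True)
k = len(u) + 1  # NaN ranks after every real value

print([u.index(i) + 1 if not np.isnan(i) else k for i in arr])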

Output

[2, 3, 3, 1, 3, 2, 2, 4]

answered Jan 16, 2020 at 12:21

Vicrobot


tenhjo · Accepted Answer · 2020-01-16 12:56:43Z

0

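numpy.unique returns the unique values sorted in ascending order, so calling it on -arr gives the descending order you want, with NaN sorted last. With return_inverse=True it also returns, for each element of arr, the index of its value in that sorted unique array, and that index is exactly your rank (minus one).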

arr_u, inv = np.unique(-arr, return_inverse=True)
rank = inv + 1
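
For anyone who wants to check this against the array in the question, here is a small self-contained sketch. The function name dense_rank_desc is just a hypothetical wrapper I made up for illustration; the only NumPy call involved is np.unique with return_inverse=True.

```python
import numpy as np

def dense_rank_desc(values):
    """Dense descending ranks starting at 1 (hypothetical helper name)."""
    values = np.asarray(values, dtype=float)
    # np.unique sorts ascending, so negating the input yields descending order.
    # inverse[i] is the position of values[i] among the sorted unique values.
    _, inverse = np.unique(-values, return_inverse=True)
    return inverse + 1

arr = np.array([0.166667, 0., 0., 0.333333, 0., 0.166667, 0.166667, np.nan])
ranks = dense_rank_desc(arr)
print(ranks.tolist())  # [2, 3, 3, 1, 3, 2, 2, 4]
```

One caveat: this input has a single NaN. With several NaNs, whether they share one rank depends on the NumPy version, since older releases treat each NaN as a distinct unique value.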

edited Jan 16, 2020 at 12:56

answered Jan 16, 2020 at 12:08

tenhjo
