Say I have the following array:
import numpy as np
data = np.array([[51001, 121, 1, 121212],
                 [51001, 121, 1, 125451],
                 [51001, 125, 1, 127653]])
I want to remove duplicate rows, where two rows count as duplicates if their first 3 elements (the first 3 columns) are equal.
So the result I want is:
print(data)
[[51001, 121, 1, 121212],
[51001, 125, 1, 127653]]
It doesn't matter which row is kept and which is deleted, as long as the remaining rows are unique in their first 3 columns.
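For reference, one way this requirement could be sketched is with np.unique on only the first 3 columns, using its return_index option to recover which full rows to keep. This is an assumption about a workable approach, not something taken from an answer, and it needs a NumPy version whose np.unique accepts the axis argument. The names first_idx and result are made up for illustration.

```python
import numpy as np

data = np.array([[51001, 121, 1, 121212],
                 [51001, 121, 1, 125451],
                 [51001, 125, 1, 127653]])

# Index of the first occurrence of each distinct (col0, col1, col2) combination.
# Only the first 3 columns are compared; the 4th column is ignored here.
_, first_idx = np.unique(data[:, :3], axis=0, return_index=True)

# Sort the indices so the kept rows stay in their original order.
result = data[np.sort(first_idx)]
print(result)
```

Here the first row of each duplicate group is the one that survives, which is fine since any representative is acceptable.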
Edit, after the answer was posted: to compare only the first 3 columns, use sorted_idx = np.lexsort(data[:, :3].T) and row_mask = np.append([True], np.any(np.diff(sorted_data[:, :3], axis=0), 1)).
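Put together, the two expressions from the edit note might be used roughly as follows. This is a sketch under assumptions: the note does not show how sorted_data is built, so it is assumed here to be data[sorted_idx], and the name result is hypothetical.

```python
import numpy as np

data = np.array([[51001, 121, 1, 121212],
                 [51001, 121, 1, 125451],
                 [51001, 125, 1, 127653]])

# Sort rows by the first 3 columns so that duplicates end up adjacent.
sorted_idx = np.lexsort(data[:, :3].T)
sorted_data = data[sorted_idx]  # assumption: not shown in the edit note

# A row is kept if it differs from the previous row in any of the first
# 3 columns; the first row is always kept.
row_mask = np.append([True], np.any(np.diff(sorted_data[:, :3], axis=0), 1))

result = sorted_data[row_mask]
print(result)
```

Note that the rows come back in sorted order rather than their original order, and the kept row of each group is whichever one sorts first.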