I have a problem removing duplicates. My program is built around a loop that generates tuples (x, y), which are then used as nodes in a graph. The final array/matrix of nodes is:
[[ 1. 1. ]
[ 1.12273268 1.15322175]
[..........etc..........]
[ 0.94120695 0.77802849]
**[ 0.84301344 0.91660517]**
[ 0.93096269 1.21383287]
**[ 0.84301344 0.91660517]**
[ 0.75506418 1.0798641 ]]
The length of the array is 22. Now I need to remove the duplicate entries (marked with **). So I used:
def urows(array):
    df = pandas.DataFrame(array)
    df.drop_duplicates(take_last=True)
    return df.drop_duplicates(take_last=True).values
Fantastic, but I still get:
0 1
0 1.000000 1.000000
....... etc...........
17 1.039400 1.030320
18 0.941207 0.778028
**19 0.843013 0.916605**
20 0.930963 1.213833
**21 0.843013 0.916605**
So drop_duplicates is not removing anything. I tested whether the nodes were actually the same, and I get:
print urows(total_nodes)[19,:]
---> [ 0.84301344 0.91660517]
print urows(total_nodes)[21,:]
---> [ 0.84301344 0.91660517]
print urows(total_nodes)[12,:] - urows(total_nodes)[13,:]
---> [ 0. 0.]
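One caveat about the check above: NumPy prints floats with 8 significant digits by default, so two rows can look identical on screen and still differ in the trailing digits. A minimal sketch of a stricter comparison follows; the values `a` and `b` are made up for illustration and are not taken from `total_nodes`.

```python
import numpy as np

# Hypothetical values: b differs from a by far less than the printed precision.
a = np.array([0.84301344, 0.91660517])
b = a + 1e-12

print(a)  # shown with NumPy's default 8-digit precision
print(b)  # displays the same digits as a

exactly_equal = np.array_equal(a, b)  # element-wise exact comparison
nearly_equal = np.allclose(a, b)      # comparison within a tolerance

print(exactly_equal)
print(nearly_equal)
```

If the exact comparison is False while the tolerant one is True, the rows are not true duplicates, and an exact-match deduplication would be expected to keep both.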
Why is it not working? How can I remove those duplicate values?
One more question:
Say two values (x1 and x2) are "nearly" equal. Is there any way to make them exactly equal? What I want is to replace x2 with x1 if they are "nearly" equal.
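To make the "nearly equal" idea concrete, here is a rough sketch of the behaviour I have in mind. The function name `snap_close_rows` and the tolerance `atol` are hypothetical, chosen only for illustration. It compares every row against the distinct rows seen so far, which should be fine for a few dozen nodes but would be slow for large arrays.

```python
import numpy as np

def snap_close_rows(nodes, atol=1e-8):
    """Replace each row with the first earlier row it is nearly equal to.

    Returns the snapped array and the array of distinct rows.
    `atol` is an assumed absolute tolerance, not a recommended value.
    """
    nodes = np.asarray(nodes, dtype=float).copy()
    representatives = []  # distinct rows seen so far
    for i, row in enumerate(nodes):
        for rep in representatives:
            if np.allclose(row, rep, rtol=0.0, atol=atol):
                nodes[i] = rep  # x2 is replaced by x1
                break
        else:
            # No earlier row was close enough, so this row is a new node.
            representatives.append(row.copy())
    return nodes, np.array(representatives)

# Made-up sample data: the last row is almost equal to the second one.
sample = [
    [1.0, 1.0],
    [0.84301344, 0.91660517],
    [0.93096269, 1.21383287],
    [0.84301344 + 1e-10, 0.91660517],
]
snapped, unique_nodes = snap_close_rows(sample)
print(unique_nodes)
```

Note that this keeps the first occurrence of each node, whereas `take_last=True` in my function above keeps the last one.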