I have a data.frame that is the result of a nearest-neighbor search over a set of points. It has three columns: V1 holds the index of the closest point, V2 the second closest, and V3 the third:
search_result <- structure(list(V1 = c(1350L, 1390L, 1411L, 1437L, 1444L, 1895L,
1895L, 1467L, 1478L, 1500L),
V2 = c(1351L, 1391L, 1410L, 1438L,
1907L, 1456L, 1456L, 1466L, 1477L, 1499L),
V3 = c(1349L, 1389L, 1940L, 1913L, 1445L, 1894L,
1894L, 1884L, 1479L, 1501L)),
row.names = c(NA, -10L),
class = "data.frame")
Since I want the closest neighbor, I could simply take V1 as the result. However, I also need the indices to be in order, and V1 contains some indices that are out of order. So I want to create a column that holds the value of V1 when it is in order, and otherwise the value of V2 or V3 (with V2 taking priority), so that the order is preserved. For this example the result would be:
     V1   V2   V3 ordered
1  1350 1351 1349    1350
2  1390 1391 1389    1390
3  1411 1410 1940    1411
4  1437 1438 1913    1437
5  1444 1907 1445    1444
6  1895 1456 1894    1456 #take V2 instead
7  1895 1456 1894    1456 #take V2 instead
8  1467 1466 1884    1467
9  1478 1477 1479    1478
10 1500 1499 1501    1500
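One way to read the rule above is "keep V1 unless it jumps too far from the previously accepted value, then try V2, then V3". The base R sketch below implements that reading. The helper name `pick_ordered` and the threshold `max_jump = 100` are assumptions made for illustration; neither comes from the question, and the threshold would need tuning on the full dataset.

```r
# Example data from the question (first ten rows of the search result).
search_result <- data.frame(
  V1 = c(1350L, 1390L, 1411L, 1437L, 1444L, 1895L, 1895L, 1467L, 1478L, 1500L),
  V2 = c(1351L, 1391L, 1410L, 1438L, 1907L, 1456L, 1456L, 1466L, 1477L, 1499L),
  V3 = c(1349L, 1389L, 1940L, 1913L, 1445L, 1894L, 1894L, 1884L, 1479L, 1501L)
)

# Hypothetical helper: walk down the rows and keep the first candidate
# (V1, then V2, then V3) that lies within `max_jump` of the previous pick.
# `max_jump = 100` is an assumed threshold, not a value given in the question.
pick_ordered <- function(df, max_jump = 100L) {
  cand <- as.matrix(df[, c("V1", "V2", "V3")])
  out <- integer(nrow(cand))
  out[1] <- cand[1, 1]  # assumes the first V1 value is trustworthy
  for (i in seq_len(nrow(cand))[-1]) {
    ok <- abs(cand[i, ] - out[i - 1]) <= max_jump
    # The first acceptable column wins; fall back to V1 if none qualifies.
    out[i] <- if (any(ok)) cand[i, which(ok)[1]] else cand[i, 1]
  }
  out
}

search_result$ordered <- pick_ordered(search_result)
search_result
```

Because each pick depends on the previous one, the loop is sequential rather than vectorized, which should be fine for 2500 rows. The sketch relies on the first row being correct and on a run of bad V1 values never drifting by less than the threshold, so it is a starting point rather than a definitive answer.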
I tried taking the minimum across the three columns, but there are cases later in the dataset where the maximum would be the desired value (not the best option, but closer to what I expect). In the example below there are discontinuities in rows 2, 4, 5 and 6, so I would take the value of V2 (priority) or V3 so that the "order" is maintained:
# it's harder to see the "order" here, but it starts in V1 = 1881
    V1   V2   V3 ordered
1 1881 1470 1880    1881
2 1457 1893 1894    1893 #take V2 instead
3 1907 1444 1906    1907
4 1442 1443 1908    1908 #take V3 instead
5 1433 1918 1432    1918 #take V2 instead
6 1402 1949 1401    1949 #take V2 instead
7 1968 1969 1967    1968
8 1985 1986 1984    1985
9 1992 1993 1991    1992
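To make the minimum/maximum attempt concrete, the sketch below computes the row-wise minimum and maximum for the rows shown above and counts how often each agrees with the desired `ordered` column. The data frame name `second` and the helper columns `row_min` and `row_max` are illustrative names, not part of the original data.

```r
# The nine rows from the second example, including the desired result.
second <- data.frame(
  V1 = c(1881L, 1457L, 1907L, 1442L, 1433L, 1402L, 1968L, 1985L, 1992L),
  V2 = c(1470L, 1893L, 1444L, 1443L, 1918L, 1949L, 1969L, 1986L, 1993L),
  V3 = c(1880L, 1894L, 1906L, 1908L, 1432L, 1401L, 1967L, 1984L, 1991L),
  ordered = c(1881L, 1893L, 1907L, 1908L, 1918L, 1949L, 1968L, 1985L, 1992L)
)

# pmin()/pmax() work element-wise, so they give one value per row.
second$row_min <- with(second, pmin(V1, V2, V3))
second$row_max <- with(second, pmax(V1, V2, V3))

# Count the rows where each shortcut matches the desired column.
n_min <- sum(second$row_min == second$ordered)
n_max <- sum(second$row_max == second$ordered)
c(min_matches = n_min, max_matches = n_max)
```

On these rows the maximum agrees with the desired value more often than the minimum does, but neither matches every row, which is why a rule that looks at the previous row seems to be needed.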
The full dataset has 2500 points, and the "unordered" values occur in roughly 10% of them, so I can estimate what the "order" should be.
Any base R, tidyverse, or data.table help would be appreciated. Thanks!
diff) where probably there is a discontinuity, but it would also identify some "ok" values as suspicious, especially when there is a transition from a bad to an ok value.
V1:V3 are greater or equal to the ordered value from the previous row?
V2 if 1895 > 1444 (i.e., order is still increasing)?