Filter Array of Arrays to Unique based off 2 values?

Question

I have an array of arrays and need to filter them to have unique based off the first 2 items in the arrays.

[
    ["18", "-66"],
    ["26", "-80"],
    ["26", "-80"],
    ["26", "-80"],
    ["27", "-80"],
    ["27", "-80"],
    ["28", "-97"],
    ["29", "-99"],
    ["30", "-81"],
    ["30", "-87"],
    ["30", "-99"],
    ["31", "-110"],
    ["31", "-110"],
    ["31", "-111"],
    ["31", "-84"],
    ["31", "-87"],
    ["31", "-95"],
    ["31", "-95"],
    ["31", "-95"]
]

While a function such as this does a great job with a single unique, I fail to see how to modify it to search the first two items:

function filterArray(incomingArray){
  var unique = {};
  var distinct = [];
  for( var i in incomingArray ){
   if( typeof(unique[incomingArray[i][0]]) == "undefined"){
    distinct.push([incomingArray[i][0],{}]);
   }
   unique[incomingArray[i][0]] = 0;
  }
  return distinct;
}

The entire array doesn't need to be unique, just the first two items, so for example the following would match:

[
    ["26", "-80", 2],
    ["26", "-80", 3]
]

## Update ##

I tried each suggested method and found some interesting things. First let me say thank you for each of your ideas / ways of solving the problem. In performance testing, I supplied 4000 records and did a performance test on each method.

Mark's method of using Set() with 4000 records finished in 19ms, when supplied only about 50 records it finished in 2ms.

D. Seah's method with with 4000 records finished in 149ms, but when supplied only 50 records finished in under 1ms.

vol7ron's method with 4000 records finished in 30ms, but oddly took 52ms with 50 records

Ben's method I am still working through how it works and so far haven't had it return what I expected, but with it returning a 3 dimensional array, I am interested to see what possibilities there are in other applications, but it took around 38 seconds and I ended up with a very large array of undefined. It is possibly something I am doing wrong, but at least in this setting, it may be more capable than I need, and since the other solutions are fast enough then I may table it till next time.

Using Set() looks like the best way to go for a performance dependent solution where the dataset may keep growing, but in a smaller dataset vol7tron wins.

Mark · Accepted Answer · 2018-07-21 00:09:48Z

1

If you can find a character you know won't be in your data, you can use that to join your first two elements into a key. For example 18_-66. Then you can keep a Set of known values and use it to filter. Key lookups in Sets are constant time, so this should run linear time in n. For example:

let arr = [["18", "-66"],["26", "-80"],["26", "-80"],["26", "-80"],["27", "-80"],["27", "-80"],["28", "-97"],["29", "-99"],["30", "-81"],["30", "-87"],["30", "-99"],["31", "-110"],["31", "-110"],["31", "-111"],["31", "-84"],["31", "-87"],["31", "-95"],["31", "-95"],["31", "-95"]]

let s = new Set()
let f = arr.filter(item => {
    let key = item[0] + '_' +item[1]  // join elements 1 and 2 with _
    return !s.has(key) && s.add(key)  // add to set and return true if it's not already there
})
console.log(f)

edited Jul 21, 2018 at 0:09

answered Jul 20, 2018 at 23:59

Mark

92.7k8 gold badges116 silver badges156 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

vol7ron Over a year ago

I think I like this answer the best. I always forget to use Set, maybe because of its lack of backwards compatibility in IE? Probably not, since that's less of a reason these days. Probably due to lack of exercise in thinking about them :P

Mark Over a year ago

@vol7ron the same technique should work about as well using a regular object.

Alan Over a year ago

Thanks Mark, I also haven't been accustomed to using Set() but am looking forward to trying it. The data is XY coordinates, so any letter would be a great separator, I will give it a try :)

vol7ron Over a year ago

@MarkMeyer yes, I had provided a regular object solution before coming across your answer. Using Set and filter is why this should be the accepted solution.

Ben Botvinick · Accepted Answer · 2018-07-21 00:46:31Z

1

The following function will iterate through the array twice, and if the first two items match, it will push an array containing the two items to an output array.

function filterArray(arr) {
  var matching = []

  for (i in arr) {
    for (j in arr) {
      if (arr[i][0] === arr[j][0] && arr[i][1] === arr[j][1]) {
        matching.push([arr[i], arr[j]])
      }
    }
  }

  for (k in matching) {
    matching[k] = [matching[k][0][0], matching[k][0][1], matching[k].count]
  }

  return matching
}

edited Jul 21, 2018 at 0:46

answered Jul 21, 2018 at 0:03

Ben Botvinick

3,4433 gold badges22 silver badges42 bronze badges

11 Comments

Mark Over a year ago

This seems to return a 3 dimensional array.

David784 Over a year ago

Actually it will iterate through the array n-squared times. And you're also comparing items to themselves when i and j refer to the same index. If you want to use nested for loops, you'd be better off using an n-1 technique: for(var i=0;i<arr.length;i++) for(var j=i+1;j<arr.length;j++) { ... }

Ben Botvinick Over a year ago

That will only look at the item and the one directly after it.

Ben Botvinick Over a year ago

@MarkMeyer mine now returns an array of matching pairs formatted as [<first element>, <second element>, <amount of matches>]

vol7ron Over a year ago

@Alan if performance is a necessity, you need to specify that. Nothing is going to beat a for loop and a good algorithm.

|

vol7ron · Accepted Answer · 2018-07-21 02:35:53Z

The easy way to manage is to keep a cache variable of your previously processed list. If while evaluating your array it is already in the cache list, don't store it. If it is not in the cache, then keep it and make an entry in the cache.

let cache = {};
let data = getData();
let unique = data.reduce((acc,curr)=>{
  let key = [curr[0],curr[1]].join(',')
  if (!cache[key]){
    cache[key] = true;
    acc.push(curr)
  }
  return acc;
},[])

console.log('unique: ', prettyPrint(unique))
console.log('cache: ', prettyPrint(cache))

function getData(){ return [
    ["18", "-66"],
    ["26", "-80"],
    ["26", "-80"],
    ["26", "-80"],
    ["27", "-80"],
    ["27", "-80"],
    ["28", "-97"],
    ["29", "-99"],
    ["30", "-81"],
    ["30", "-87"],
    ["30", "-99"],
    ["31", "-110"],
    ["31", "-110"],
    ["31", "-111"],
    ["31", "-84"],
    ["31", "-87"],
    ["31", "-95"],
    ["31", "-95"],
    ["31", "-95"]
]}

function prettyPrint(data){
   if(Array.isArray(data)){
     return '\n' + JSON.stringify(data,null,2)
                     .replace(/,\n\s+[^[\s]/g, ',')
                     .replace(/(\[)\s+([^[\s])/g, '$1$2')
                     .replace(/"\s+/g,'"')
   }
   else if (typeof data === 'object')
     return '\n' + JSON.stringify(data,null,2)
   else 
     return data; 
   
}

let key = [curr[0],curr[1]].join(',') is part of the magic. It joins the two values by a delimiter so only one lookup is performed. This value is easier to see in the cache output

D. Seah · Accepted Answer · 2018-07-21 00:08:16Z

0

you can use a reduce function

var oDistinct = oData.reduce(function (acc, cur) {
  if (!acc.some(function(i) {
    return i[0] === cur[0] && i[1] === cur[1];
  })) {
    acc.push(cur);
  }
  return acc;
}, []);

answered Jul 21, 2018 at 0:08

D. Seah

4,7121 gold badge15 silver badges27 bronze badges

2 Comments

Alan Over a year ago

I haven't ever seen oData before, I am going to look into it to compare it with JSON for performance / readability

vol7ron Over a year ago

I think oData is his Hungarian notation for your data. In this case data is an object denoted by the prefix “o”. Perhaps it should be “aData”?

Collectives™ on Stack Overflow

Filter Array of Arrays to Unique based off 2 values?

4 Answers 4

4 Comments

11 Comments

Comments

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

4 Comments

11 Comments

Comments

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related