
I have an array of the form:

class anim {
    public $qs;
    public $dp;
    public $cg;
    public $timestamp;
}
$animArray = array();

$myAnim = new anim();
$myAnim->qs = "fred";
$myAnim->dp = "shorts";
$myAnim->cg = "dino";
$myAnim->timestamp = 1590157029399;
$animArray[] = $myAnim;

$myAnim = new anim();
$myAnim->qs = "barney";
$myAnim->dp = "tshirt";
$myAnim->cg = "bird";
$myAnim->timestamp = 1590133656330;
$animArray[] = $myAnim;

$myAnim = new anim();
$myAnim->qs = "fred";
$myAnim->dp = "tshirt";
$myAnim->cg = "bird";
$myAnim->timestamp = 1590117032286;
$animArray[] = $myAnim;

How do I create a new array containing only the non-duplicates (and the latest entry where duplicates are found) of $animArray, where a duplicate is defined as:

one where $myAnim->dp has the same value as that of another array element's $myAnim->dp AND the $myAnim->cg from the first and the $myAnim->cg from the second have the same value as each other.

In the example above, only the first element is unique by that definition.

I'm hoping there's an elegant solution. I've been through all the array functions in the PHP manual but can't see how it could be achieved.

I could loop through each array element checking if $myAnim->dp has the same value as that of another array element's $myAnim->dp, saving the matches into a new array and then looping through that new array, checking for its $myAnim->cg matching the $myAnim->cg of any other element in that new array.

A more elegant solution would allow me to change which combination of key-value pairs determines whether there's a duplicate, without having to recast much code.

Does such a solution exist?

Thanks for helping this novice :)

  • So in your example, object 0 and object 2 should be returned, right? object 0 because it's unique, and object 2 because it's the last one of the duplicates? Commented May 22, 2020 at 17:42
  • Yes, that's right, MatsLindh. Commented May 22, 2020 at 18:10
  • This is not a good way of making a class. I hope this is for the main purpose of demonstrating what you want to achieve :-) Commented May 22, 2020 at 21:13
  • Did you give up? You have 3 answers. Commented May 23, 2020 at 16:17
  • I'm reviewing and testing the suggestions on a larger data set. Commented May 24, 2020 at 2:25

3 Answers

3

While nothing built-in does this directly out of the box, it doesn't take much code to handle an arbitrary number of properties to consider for uniqueness. By keeping track of each property value in a nested lookup array, we can build an array where the leaf nodes (i.e. the ones that aren't arrays themselves) are the objects.

We do this by keeping a reference (&) to the current level in the array, then continue building our lookup array for each property.

function find_uniques($list, $properties) {
    $lookup = [];
    $unique = [];
    $last_idx = count($properties) - 1;

    // Build our lookup array - the leaf nodes will be the items themselves,
    // located on a level that matches the number of properties to look at
    // to consider a duplicate
    foreach ($list as $item) {
        $current = &$lookup;

        foreach ($properties as $idx => $property) {
            // last level, keep object for future reference
            if ($idx == $last_idx) {
                $current[$item->$property] = $item;
                break;
            } else if (!isset($current[$item->$property])) {
                // otherwise, if not already set, create empty array
                $current[$item->$property] = [];
            }

            // next iteration starts on this level as its current level
            $current = &$current[$item->$property];
        }
    }

    // array_walk_recursive only calls the callback for leaf nodes - i.e. our items.
    array_walk_recursive($lookup, function ($item) use (&$unique) {
        $unique[] = $item;
    });

    return $unique;
}

Called with your data above, with the requirement that unique items and the last element of each set of duplicates are returned, we get the following result:

var_dump(find_uniques($animArray, ['dp', 'cg']));

array(2) {
  [0] =>
  class anim#1 (4) {
    public $qs =>
    string(4) "fred"
    public $dp =>
    string(6) "shorts"
    public $cg =>
    string(4) "dino"
    public $timestamp =>
    int(1590157029399)
  }
  [1] =>
  class anim#3 (4) {
    public $qs =>
    string(4) "fred"
    public $dp =>
    string(6) "tshirt"
    public $cg =>
    string(4) "bird"
    public $timestamp =>
    int(1590117032286)
  }
}

This maps to element [0] and element [2] in your example. If you instead want to keep the first object for duplicates, add an isset check on the last level so an already stored object is not overwritten:

foreach ($properties as $idx => $property) {
    if ($idx == $last_idx) {
        // last level: only store the object if this combination is new
        if (!isset($current[$item->$property])) {
            $current[$item->$property] = $item;
        }
        break;
    } else if (!isset($current[$item->$property])) {
        // otherwise, if not already set, create empty array
        $current[$item->$property] = [];
    }

    // next iteration starts on this level as its current level
    $current = &$current[$item->$property];
}

It's important to note that this has been written with the assumption that the items you want to check for uniqueness are objects and not arrays themselves (since we're looking up properties with -> and relying on array_walk_recursive to treat anything that isn't an array as a leaf).


5 Comments

Thanks for your answer. Unfortunately, it doesn't correctly identify uniques when I extend the array data with the following: $myAnim = new anim(); $myAnim->qs = "wilma"; $myAnim->dp = "shorts"; $myAnim->cg = "bird"; $myAnim->timestamp = 1590117035383; $animArray[] = $myAnim; $myAnim = new anim(); $myAnim->qs = "pebbles"; $myAnim->dp = "tshirt"; $myAnim->cg = "bird"; $myAnim->timestamp = 1590117038461; $animArray[] = $myAnim;
What's wrong about the answer? It'd be helpful if you could at least explain :-)
I think I see what you're thinking of. Fixed.
Thanks @MatsLindh - I'll try it again :)
I made yours the accepted answer, Mats, even though @AbraCadaver's solution worked perfectly as well. I did so because yours was more readable / understandable to a novice like me.
2

This was fun:

array_multisort(array_column($animArray, 'timestamp'), SORT_DESC, $animArray);

$result = array_intersect_key($animArray,
          array_unique(array_map(function($v) { return $v->dp.'-'.$v->cg; }, $animArray)));
  • First, extract the timestamp and sort that array descending, thereby sorting the original array.
  • Then, map to create a new array using the dp and cg combinations.
  • Next, make the combination array unique which will keep the first duplicate encountered (that's why we sorted descending).
  • Finally, get the intersection of keys of the original array and the unique one.
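To make those four steps concrete, here is a sketch that runs them one at a time on the question's sample data and names each intermediate array. The variable names $timestamps, $keys and $uniqueKeys are introduced here for illustration only and are not part of the answer.

```php
<?php
class anim {
    public $qs;
    public $dp;
    public $cg;
    public $timestamp;
}

// Rebuild the three sample objects from the question.
$animArray = [];
foreach ([
    ['fred',   'shorts', 'dino', 1590157029399],
    ['barney', 'tshirt', 'bird', 1590133656330],
    ['fred',   'tshirt', 'bird', 1590117032286],
] as [$qs, $dp, $cg, $timestamp]) {
    $a = new anim();
    $a->qs = $qs;
    $a->dp = $dp;
    $a->cg = $cg;
    $a->timestamp = $timestamp;
    $animArray[] = $a;
}

// Step 1: sort the objects by timestamp, newest first.
$timestamps = array_column($animArray, 'timestamp');
array_multisort($timestamps, SORT_DESC, $animArray);

// Step 2: one "dp-cg" string per object, same keys as $animArray.
$keys = array_map(function ($v) { return $v->dp . '-' . $v->cg; }, $animArray);

// Step 3: array_unique keeps the key of the first occurrence of each string.
$uniqueKeys = array_unique($keys);

// Step 4: keep only the objects whose keys survived step 3.
$result = array_intersect_key($animArray, $uniqueKeys);
```

With this data $result holds the objects at keys 0 and 1 of the sorted array: fred/shorts/dino and barney/tshirt/bird. "Latest" here means the highest timestamp, which is not necessarily the element that comes last in the original array.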

In a function with dynamic properties:

function array_unique_custom($array, $props) {

    array_multisort(array_column($array, 'timestamp'), SORT_DESC, $array);

    $result = array_intersect_key($array,
              array_unique(array_map(function($v) use ($props) {
                  return implode('-', array_map(function($p) use($v) { return $v->$p; }, $props));
              },
              $array)));

    return $result;
}
$result = array_unique_custom($animArray, ['dp', 'cg']);

Another option would be to sort it ascending and then build an array with a dp and cg combination as the key, which will keep the last duplicate:

array_multisort(array_column($animArray, 'timestamp'), SORT_ASC, $animArray);

foreach($animArray as $v) {
    $result[$v->dp.'-'.$v->cg] = $v;
}

In a function with dynamic properties:

function array_unique_custom($array, $props) {

    array_multisort(array_column($array, 'timestamp'), SORT_ASC, $array);

    $result = [];
    foreach($array as $v) {
        $key = implode('-', array_map(function($p) use($v) { return $v->$p; }, $props));
        $result[$key] = $v;
    }
    return $result;
}
$result = array_unique_custom($animArray, ['dp', 'cg']);

4 Comments

Be aware that the usage of implode can create false duplicates; i.e. if one value's postfix matches the prefix of another value - implode(['foo', 'bar']) will give the same key as implode(['foob', 'ar']). It'll be slightly better with a separation character, but again you might hit the same issue if that character is part of the value.
@MatsLindh Good catch, added a delimiter.
@AbraCadaver: Your option works well: "Another option would be to sort it ascending and then build an array with a dp and cg combination as the key, which will keep the last duplicate"
I already accepted another answer. I would like to accept both but if I accept yours, the other becomes unaccepted.
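The key-collision point raised in the comments can be seen in a few lines. The sketch below first shows two different value pairs producing the same imploded key, with and without a delimiter, and then one possible way around it: encoding the values as a JSON list. The helper name composite_key is hypothetical and not from either answer, and it assumes the property values are strings or numbers that json_encode can handle.

```php
<?php
// Without a delimiter: ('foo', 'bar') and ('foob', 'ar') collide.
$plainA = implode(['foo', 'bar']);   // "foobar"
$plainB = implode(['foob', 'ar']);   // "foobar"

// With a delimiter that also occurs inside a value, the same thing happens.
$dashA = implode('-', ['a-b', 'c']); // "a-b-c"
$dashB = implode('-', ['a', 'b-c']); // "a-b-c"

// Hypothetical helper: JSON-encode the values as a list, so each value is
// quoted and escaped separately and the boundaries stay unambiguous.
function composite_key($item, array $props)
{
    $values = [];
    foreach ($props as $p) {
        $values[] = $item->$p;
    }
    return json_encode($values);
}

$x = (object) ['dp' => 'a-b', 'cg' => 'c'];
$y = (object) ['dp' => 'a',   'cg' => 'b-c'];

$keyX = composite_key($x, ['dp', 'cg']); // '["a-b","c"]'
$keyY = composite_key($y, ['dp', 'cg']); // '["a","b-c"]'
```

The resulting strings can be used as array keys in the same way as the imploded ones.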
0
//Create an array with dp and cg values only
$new_arr = [];
foreach($animArray as $key=>$item) {
    $new_arr[] = $item->dp.','.$item->cg;
}
$cvs = array_count_values($new_arr);
$final_array = [];
foreach($cvs as $cvs_key=>$occurences) {
    if ($occurences == 1) {
        $filter_key = array_keys($new_arr, $cvs_key)[0];         
        $final_array[$filter_key] = $animArray[$filter_key];    
    }
}

The final result would be (from your example) in $final_array:

[0] => anim Object
    (
        [qs] => fred
        [dp] => shorts
        [cg] => dino
        [timestamp] => 1590157029399
    )

Some explanation:

//Create a new array based on your array of objects with the attributes dp and cg
//with a comma  between them
$new_arr = [];
foreach($animArray as $key=>$item) {
    $new_arr[] = $item->dp.','.$item->cg;
}
/*
$new_arr now contains:

    [0] => shorts,dino
    [1] => tshirt,bird
    [2] => tshirt,bird
*/

//Use the built-in function array_count_values to get the number of
//occurrences of each value in an array
$cvs = array_count_values($new_arr);

/*
$cvs would contain:

(
    [shorts,dino] => 1
    [tshirt,bird] => 2
)
*/

//Iterate through the $cvs array.
//Where there is only one occurrence (no duplicates),
//add the item to the final array $final_array
$final_array = [];
foreach($cvs as $cvs_key=>$occurences) {
    if ($occurences == 1) {

        /*
        array_keys with $cvs_key as its second argument searches $new_arr
        for that value and returns the matching keys

        so basically search for:
        shorts,dino and retrieve the key 0 (first element)        
        */
        $filter_key = array_keys($new_arr, $cvs_key)[0];         

        /*
        Add a new item to the $final_array based on the key in
        the original array $animArray
        if you don't want the original key in the new array
        you could just do $final_array[] instead of 
        $final_array[$filter_key]
        */
        $final_array[$filter_key] = $animArray[$filter_key];    
    }
}

You said you would like some way to test different attributes. I believe that would just mean wrapping this in a function/method where you pass the attribute names as arguments, e.g. $attr1 ('dp'?) and $attr2 ('cg'?) or similar.


UPDATE

I had not grasped that you wanted the last value as well. This actually seemed like an easier task. Maybe I am missing something, but it was fun to come up with a different approach from the other answers :-)

//Create an array with dp and cg values only
$new_arr = [];
foreach($animArray as $key=>$item) {
    $new_arr[] = $item->dp.','.$item->cg;
}

//Sort keys descending order
krsort($new_arr); 

//Because of the descending order of keys above, array_unique keeps the
//last item of each set of duplicates
$new_arr2 = array_unique($new_arr); 

//Switch order of keys back to normal (ascending)
ksort($new_arr2); 

//Create a new array based on the keys set in $new_arr2
$final_arr = [];
foreach($new_arr2 as $key=>$item) {
    $final_arr[] = $animArray[$key];
}

The contents of $final_arr would be (in your example):

Array
(
    [0] => anim Object
        (
            [qs] => fred
            [dp] => shorts
            [cg] => dino
            [timestamp] => 1590157029399
        )

    [1] => anim Object
        (
            [qs] => fred
            [dp] => tshirt
            [cg] => bird
            [timestamp] => 1590117032286
        )

)
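The earlier remark about passing the attributes in as arguments could be sketched for this updated approach roughly as follows. The function name last_of_each and its signature are assumptions made for illustration, not part of the answer. It takes a list of property names rather than exactly two, and it keeps the comma-joined key from above, so it assumes the values themselves contain no commas.

```php
<?php
// Hypothetical wrapper around the krsort / array_unique / ksort steps.
// Returns one object per distinct combination of $props, keeping the one
// that appears last in $items, in original order.
function last_of_each(array $items, array $props)
{
    $items = array_values($items); // make sure keys run 0..n-1

    // One comma-joined string per item, e.g. "tshirt,bird".
    $combos = [];
    foreach ($items as $item) {
        $parts = [];
        foreach ($props as $p) {
            $parts[] = $item->$p;
        }
        $combos[] = implode(',', $parts);
    }

    krsort($combos);               // highest key (last item) first
    $kept = array_unique($combos); // keeps the first occurrence in that order
    ksort($kept);                  // back to original ordering

    $result = [];
    foreach (array_keys($kept) as $key) {
        $result[] = $items[$key];
    }
    return $result;
}

// The question's sample data, as plain objects.
$animArray = [
    (object) ['qs' => 'fred',   'dp' => 'shorts', 'cg' => 'dino', 'timestamp' => 1590157029399],
    (object) ['qs' => 'barney', 'dp' => 'tshirt', 'cg' => 'bird', 'timestamp' => 1590133656330],
    (object) ['qs' => 'fred',   'dp' => 'tshirt', 'cg' => 'bird', 'timestamp' => 1590117032286],
];

$byDpCg = last_of_each($animArray, ['dp', 'cg']); // elements 0 and 2
$byQs   = last_of_each($animArray, ['qs']);       // elements 1 and 2
```

Changing which properties define a duplicate then only means changing the second argument.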

6 Comments

Your answer missed the inclusion of the latest duplicate. I said "How do I create a new array containing only the non-duplicates (and the latest entry where duplicates are found)"
aha. Sorry I missed that totally. I will get back to you with an answer that I hope works better.
I've yet to try this, BPTW. I will, though and let you know how it goes.
Oops, I thought you'd changed it. I'll wait :)
@MarkHightonRidley - haha sorry. There have been a lot of things going on this past week. I believe I will have some time tomorrow afternoon to take a further look at it.