3

There are plenty of examples of detecting and removing duplicates in PHP arrays with array_unique() and the like, but what if you want to find the duplicates, modify them, and check again in a loop until every value is unique?

I think it might involve something like array_filter(). As a more specific example, take the result of a SQL statement like this:

SELECT id, list.comboname
FROM list
    INNER JOIN (
        SELECT comboname
        FROM list
        GROUP BY comboname
        HAVING COUNT(id) > 1
    ) dup ON list.comboname = dup.comboname

Fetched into a flat array, the duplicates in the table look like this:

Array ( 
    [0] => 49 
    [1] => big.dup  
    [2] => 233  
    [3] => another.duplicate  
    [4] => 653  
    [5] => big.dup  
    [6] => 387  
    [7] => big.dup  
    [8] => 729  
    [9] => another.duplicate  
    [10] => 1022  
    [11] => big.dup   
)

Now what I want is some PHP that deletes trailing characters, stopping before the period, until each value is unique [or appends a number to the end if needed].

So the result would be:

Array (  
    [0] => 49  
    [1] => big.dup  
    [2] => 233  
    [3] => another.duplicate  
    [4] => 653  
    [5] => big.du  
    [6] => 387  
    [7] => big.d  
    [8] => 729  
    [9] => another.duplicat  
    [10] => 1022  
    [11] => big.dup1  
)

The first occurrence of each value (i.e. big.dup and another.duplicate) should keep its original form. I've looked through just about every PHP array function trying to imagine a strategy. Any ideas?

2
  • Why don't you finish this in MySQL? I think this would help you: stackoverflow.com/questions/7416545/… Commented Nov 20, 2011 at 22:05
  • A. Because I want to set it up to remove the duplicates automatically, without changing and rerunning the SQL. B. I don't want to just flag them; I want to change them to non-duplicates and leave them. Commented Nov 22, 2011 at 0:15

2 Answers

1

For the array in your question, and for appending a number whenever a value is a duplicate, you only need to loop over the array once while temporarily building a helper array that records whether a value has already been found (and how often):

$found = array();

foreach($array as &$value)
{
    if (is_int($value)) continue; # skip integer values

    if (isset($found[$value]))
    {
        $value = sprintf('%s-%d', $value, ++$found[$value]);
    }
    else
    {
        $found[$value] = 0;
    }
}
unset($value);
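
The loop above assumes that a generated name such as big.dup-1 is not already present in the array. As a minimal sketch of the "check again until unique" part of the question, the following re-tests each candidate against every name already in use. The function name makeUniqueWithSuffix is hypothetical, and the sketch assumes PHP 7+ and that IDs can be recognised with is_numeric():

```php
<?php
// Hypothetical helper, not part of the original answer.
// Appends "-1", "-2", ... to repeated strings, skipping any candidate
// that is already taken by another value in the array.
function makeUniqueWithSuffix(array $values): array
{
    // Reserve every original name so a generated one never collides
    // with a value that appears later in the array.
    $reserved = array();
    foreach ($values as $value) {
        if (!is_numeric($value)) {
            $reserved[$value] = true;
        }
    }

    $used = array(); // names already emitted
    $next = array(); // next suffix to try, per original name

    foreach ($values as $index => $value) {
        if (is_numeric($value)) {
            continue; // assumed to be an ID, left untouched
        }
        if (!isset($used[$value])) {
            $used[$value] = true; // first occurrence keeps its name
            continue;
        }
        $n = isset($next[$value]) ? $next[$value] : 1;
        do {
            $candidate = $value . '-' . $n++;
        } while (isset($used[$candidate]) || isset($reserved[$candidate]));
        $next[$value] = $n;
        $used[$candidate] = true;
        $values[$index] = $candidate;
    }

    return $values;
}
```

For example, makeUniqueWithSuffix(array('a', 'a-1', 'a')) returns array('a', 'a-1', 'a-2'), because the candidate a-1 is already taken.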


0

First of all, I think you have a rather complicated array structure.

Why don't you change it into something like:

$arr = array(
    '49' => 'big.dup',
    '233' => 'another.duplicate',
    '653' => 'big.dup',
    '387' => 'big.dup',
    '729' => 'another.duplicate',
    '1022' => 'big.dup',
);
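
If the data already arrives as the flat id/name list shown in the question, one possible way to get this keyed form is to pair the elements up first. This is only a sketch: flatToMap is a hypothetical name, and it assumes the flat array strictly alternates id, name and that PHP 7+ is available:

```php
<?php
// Hypothetical helper: turns array(id, name, id, name, ...) into
// array(id => name, ...).
function flatToMap(array $flat): array
{
    // Split into array(id, name) pairs.
    $pairs = array_chunk($flat, 2);
    // Use element 0 of each pair as the key and element 1 as the value.
    return array_column($pairs, 1, 0);
}
```

Building the keyed array directly while fetching the rows would avoid this step altogether.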

This way, you can easily check for duplicates using something like this:

$arr = array(
    '49' => 'big.dup',
    '233' => 'another.duplicate',
    '653' => 'big.dup',
    '387' => 'big.dup',
    '729' => 'another.duplicate',
    '1022' => 'big.dup',
);
$arr_val = array();
foreach( $arr as $key => $val)
{
    if(isset($arr_val[ $val ]))
    {
        $arr_val[ $val ]++;
        $arr[ $key ] = $val . $arr_val[ $val ];
    }
    else
    {
        $arr_val[ $val ] = 0;
    }
}

Or, if you insist on using that complicated array structure, you can modify the code above like this:

$arr_val = array();
foreach( $arr as $key => $val)
{
    if(isset($arr_val[ $val ]) && !is_numeric( $val ) )
    {
        $arr_val[ $val ]++;
        $arr[ $key ] = $val . $arr_val[ $val ];
    }
    else
    {
        $arr_val[ $val ] = 0;
    }
}

You may notice that there's not much difference here. I just added && !is_numeric($val), since I assume you don't want to process the IDs, though I still think the IDs will never be duplicated.
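
The loops above only append a counter. The question also asks for trailing characters to be removed first, so here is a rough sketch of that strategy for the flat array. Several things are assumptions rather than anything stated in the question: the name uniquifyByTruncation is hypothetical, at least one character is kept after the last period, names are treated as single-byte strings, and numbers are appended to the original name once truncation is exhausted:

```php
<?php
// Hypothetical helper: shortens repeated names one character at a time,
// then falls back to appending 1, 2, ... to the original name.
function uniquifyByTruncation(array $values): array
{
    $used = array(); // names already emitted

    foreach ($values as $index => $value) {
        if (is_numeric($value)) {
            continue; // assumed to be an ID, left untouched
        }

        $dot = strrpos($value, '.');
        $length = strlen($value);
        // Never cut closer than one character after the period;
        // with no period at all, skip truncation entirely.
        $minLength = ($dot === false) ? $length : $dot + 2;

        $candidate = $value;
        $n = 0;
        while (isset($used[$candidate])) {
            if ($length > $minLength) {
                $candidate = substr($value, 0, --$length);
            } else {
                $candidate = $value . ++$n;
            }
        }

        $used[$candidate] = true;
        $values[$index] = $candidate;
    }

    return $values;
}
```

Run on the flat array from the question, this yields big.dup, another.duplicate, big.du, big.d, another.duplicat and big.dup1 in that order, with the IDs unchanged. One limitation: a name that equals an earlier truncated form (for example a genuine big.du later in the list) would itself be altered.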
