My code works, but there is a small chance that the same $categoryurl appears more than once in the output. How can I keep only the unique ones?

I have a folder called "xml" in the webroot, and I use glob() to search the /xml/ directory for XML files.

I loop over all XML files and read every item node. Item nodes can be duplicated, because some of them appear in multiple XML files, so I use $html = array_unique($html); to remove exact duplicate strings from my array. That only catches entries that are identical as a whole; two entries that share a $categoryurl but differ anywhere else both survive.
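For reference, the glob() step described above might look roughly like the sketch below. The helper name list_xml_files() and the fallback directory are assumptions for illustration; the question only states that the files live in /xml/ under the webroot.

```php
<?php
// Hypothetical helper (not from the question): collect every *.xml file in a directory.
function list_xml_files(string $dir): array
{
    // glob() returns false on error, so fall back to an empty array.
    $files = glob(rtrim($dir, '/') . '/*.xml');
    if ($files === false) {
        return [];
    }
    sort($files); // keep the order deterministic across runs
    return $files;
}

// Assumed location, based on the description above.
$URL_array = list_xml_files(($_SERVER['DOCUMENT_ROOT'] ?? '.') . '/xml');
```

Returning an empty array on failure lets the later foreach run zero times instead of raising a warning.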

Some code:

<?php
// Removed the code above this line as it's not needed in this question 
// $URL_array is defined above, it's an array() filled with XML URL's
foreach($URL_array as $XML_url){
$xml = simplexml_load_file($XML_url);
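// simplexml_load_file() returns false (not null) on failure
if ($xml === false || !is_object($xml))
    die('Kon het XML bestand niet laden, Raporteer a.u.b. deze fout.'); // "Could not load the XML file, please report this error."
if (!is_object($xml->item))
    die('Kon de items niet laden, rapporteer a.u.b. deze fout.'); // "Could not load the items, please report this error."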
$Number_Of_Nodes = $xml->item->count(); /** Count number of items **/
for($i = 0; $i < $Number_Of_Nodes; $i++){ /** Number of category here... **/
$categoryname = $xml->item[$i]->recepttitle;
$categoryurl = $xml->item[$i]->recepturl;
$receptintroduction = $xml->item[$i]->receptintroduction;
$receptimageurl = $xml->item[$i]->receptimageurl;
$receptcategoryurl = $xml->item[$i]->receptcategoryurl;
$receptcategory = $xml->item[$i]->receptcategory;
$html[] = '<div class="content_box">' . "\r\n" . '<div class="content_box_header">' . "\r\n\t" . ucfirst($categoryname) . ' &bull; <a href="'. $receptcategoryurl . '">'. $receptcategory . '</a>' . "\r\n" . '</div>' . "\r\n" . '<div class="story_box_text">' . "\r\n" . '<br />' . "\r\n" . '<p><a title="' . $categoryname . '" href="' . $categoryurl . '"><img src="' . $receptimageurl . '" alt="' . $categoryname . '" title="' . $categoryname . '" /></a><br />' . $receptintroduction . '<br /><span class="align-right"><a title="'. $categoryname . '" href="' . $categoryurl . '" class="purplesmallbutton">Lees verder</a></span><br /></p>' . "\r\n" . '</div></div>' . "\r\n" . '<div class="clear"></div>' . "\r\n";
}
}
if(empty($html)){
    // "No content available due to maintenance"
    echo '<p class="error">In verband met werkzaamheden geen inhoud beschikbaar</p>' . "\r\n";
    }else{
        $html = array_unique($html); /** Remove all exact duplicates **/
        shuffle($html);

Now I have a large shuffled array full of strings.

If a $categoryurl is duplicated, I'd like to keep the first entry found for it, plus all entries that are unique. How should I achieve this?

Last part of my code:

        echo implode("\n", array_slice($html, 0, 6)); /** output should always be at most 6 entries with no duplicated $categoryurl; any duplicates should be removed before this step **/
    }
    ?>

Edit

Contents for XML file: 1.xml

   <?xml version="1.0" encoding="UTF-8"?>
    <channel xmlns="http://www.w3.org/2005/Atom">
      <id>...</id>
      <title><![CDATA[vegetarische hoofdgerechten RSS]]></title>
  <author>
    <name>Voeding vegetarische hoofdgerechten</name>
    <email>[email protected]</email>
  </author>
  <updated></updated>
  <link rel="alternate" href="https://voeding.esthervrees.nl/vegetarische-hoofdgerechten" />
  <subtitle><![CDATA[Some subtitle here.]]></subtitle>
  <rights>Copyrights reserved. Feel free to use the embed function.</rights>

    <item>
    <recepttitle><![CDATA[Pittige rijst met bonen voor 6 tot 8 personen]]></recepttitle>
    <shortrecepttitle><![CDATA[Pittige rijst met bonen]]></shortrecepttitle>
    <receptintroduction>Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here </receptintroduction>
    <recepturl>https://google.com</recepturl>
    <receptimageurl>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptimageurl>
    <receptcategoryurl>https://www.yahoo.com</receptcategoryurl>
    <receptcategoryimage>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptcategoryimage>
    <receptcategory>Vegetarische hoofdgerechten</receptcategory>
    </item>

</channel>

Contents for XML file: 2.xml

<?xml version="1.0" encoding="UTF-8"?>
<channel xmlns="http://www.w3.org/2005/Atom">
  <id>...</id>
  <title><![CDATA[vegetarische hoofdgerechten RSS]]></title>
  <author>
    <name>Voeding vegetarische hoofdgerechten</name>
    <email>[email protected]</email>
  </author>
  <updated></updated>
  <link rel="alternate" href="https://voeding.esthervrees.nl/vegetarische-hoofdgerechten" />
  <subtitle><![CDATA[Some subtitle here.]]></subtitle>
  <rights>Copyrights reserved. Feel free to use the embed function.</rights>

    <item>
    <recepttitle><![CDATA[Pittige rijst met bonen voor 6 tot 8 personen]]></recepttitle>
    <shortrecepttitle><![CDATA[Pittige rijst met bonen]]></shortrecepttitle>
    <receptintroduction>Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here </receptintroduction>
    <recepturl>https://google.com</recepturl>
    <receptimageurl>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptimageurl>
    <receptcategoryurl>https://www.yahoo.com</receptcategoryurl>
    <receptcategoryimage>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptcategoryimage>
    <receptcategory>Vegetarische hoofdgerechten</receptcategory>
    </item>
    <item>
    <recepttitle><![CDATA[Pittige rijst met bonen voor 6 tot 8 personen]]></recepttitle>
    <shortrecepttitle><![CDATA[Pittige rijst met bonen]]></shortrecepttitle>
    <receptintroduction>Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here Some cool description here </receptintroduction>
    <recepturl>https://yahoo.com</recepturl>
    <receptimageurl>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptimageurl>
    <receptcategoryurl>https://www.yahoo.com</receptcategoryurl>
    <receptcategoryimage>https://voeding.esthervrees.nl/plaatjes/werk-aan-de-winkel-geen-afbeelding-beschikbaar-610x550px.gif</receptcategoryimage>
    <receptcategory>Vegetarische hoofdgerechten</receptcategory>
    </item>

</channel>

The item node in 1.xml and the first item node in 2.xml have the same content for $categoryurl, but a given $categoryurl should not appear more than once across all items.

If a $categoryurl value is duplicated (it also appears in one or more other item nodes), I would like to keep only one randomly chosen item node from those duplicates, plus all unique items. Exact duplicate strings in $html[] are already removed by array_unique(). What is still missing is that each $categoryurl should occur only once in the output: when it is not unique, keep one of the duplicates and skip the rest.
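One way to read the requirement above is: group the items by URL, then pick one candidate at random from each group. The sketch below assumes the loop has been changed to collect small arrays with a 'url' and an 'html' entry instead of bare strings. That structure and the helper name pick_one_per_url() are hypothetical, not part of the original code.

```php
<?php
/**
 * Hypothetical helper: $items is a list of ['url' => string, 'html' => string].
 * Returns one randomly chosen HTML string per distinct URL.
 */
function pick_one_per_url(array $items): array
{
    // Group the HTML fragments by their (string) URL.
    $groups = [];
    foreach ($items as $item) {
        $groups[$item['url']][] = $item['html'];
    }

    // From every group, keep exactly one randomly selected fragment.
    $result = [];
    foreach ($groups as $candidates) {
        $result[] = $candidates[array_rand($candidates)];
    }
    return $result;
}

// Placeholder data mirroring the two example feeds.
$items = [
    ['url' => 'https://google.com', 'html' => '<div>from 1.xml</div>'],
    ['url' => 'https://google.com', 'html' => '<div>from 2.xml</div>'],
    ['url' => 'https://yahoo.com',  'html' => '<div>only in 2.xml</div>'],
];

$html = pick_one_per_url($items);
shuffle($html);
echo implode("\n", array_slice($html, 0, 6));
```

In the real loop, the 'url' value would need to be cast first, for example trim((string) $categoryurl), because SimpleXML returns objects rather than strings.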

php example array:

    $URL_array = array($_SERVER['DOCUMENT_ROOT'] . '/xml/1.xml', $_SERVER['DOCUMENT_ROOT'] . '/xml/xml2.xml'); /** I added a lot more xml files to this array **/
  • "keep the first key found and all uniques": do you want to keep only the first one, and discard all the others? Commented Mar 2, 2018 at 18:43
  • If there are any duplicates I would like to keep 1 of the duplicates (the first one where the url is duplicated) and all uniques Commented Mar 2, 2018 at 18:48
  • php.net/manual/en/function.array-unique.php Commented Mar 2, 2018 at 19:18
  • @IncredibleHat The answers below didn't do the trick. Your answer looks good, but I wasn't able to get it to work: I receive PHP warnings such as "Illegal offset type in isset or empty" and "Illegal offset type". I'll try checking whether $categoryurl is already set and only adding the string to the $html array if it isn't, since the rest are duplicates, but it's still not working. Commented Mar 4, 2018 at 22:11
  • @IncredibleHat Yes, I noticed the line with trim(), but I was not able to get it to work. With it the errors are gone, but the output still shows items with a duplicated $categoryurl. If a duplicated $categoryurl is found, only one randomly chosen XML item using that $categoryurl may appear in the output; if multiple item nodes share a $categoryurl, only one of those duplicates, plus all uniques, should appear in the $html[] array. I updated my question with XML and PHP code examples that will allow you to test and see the output. How can I use your sample code? Commented Mar 21, 2018 at 3:36

1 Answer

Instead of:

$html[] = '<div class="content_box">...';

try:

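// Cast to string first: a SimpleXMLElement cannot be used as an array key
// ("Illegal offset type"). $categoryurl is the item URL the question wants unique.
$key = trim((string) $categoryurl);
if (array_key_exists($key, $html)) {
    // this key already exists, we proceed to
    // the next item in the loop
    continue;
}
$html[$key] = '<div class="content_box">...';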

This way you make sure that only the first occurrence is allocated and skip the next ones.
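A self-contained sketch of this idea follows, under a few assumptions: the feeds are inlined as strings (with the namespace and most fields omitted for brevity), the key is the item's recepturl, and the HTML is reduced to a placeholder. The (string) cast matters because SimpleXML property access returns SimpleXMLElement objects, and using an object as an array key triggers the "Illegal offset type" warnings mentioned in the comments.

```php
<?php
// Minimal stand-ins for the feeds in the question (assumed, heavily shortened).
$feeds = [
    '<channel><item><recepttitle>A</recepttitle>'
        . '<recepturl>https://google.com</recepturl></item></channel>',
    '<channel><item><recepttitle>B</recepttitle>'
        . '<recepturl> https://google.com </recepturl></item>'
        . '<item><recepttitle>C</recepttitle>'
        . '<recepturl>https://yahoo.com</recepturl></item></channel>',
];

$html = [];
foreach ($feeds as $feed) {
    $xml = simplexml_load_string($feed);
    if ($xml === false) {
        continue; // skip feeds that cannot be parsed
    }
    foreach ($xml->item as $item) {
        // Cast to string and trim so the URL is a valid, normalised array key.
        $key = trim((string) $item->recepturl);
        if (isset($html[$key])) {
            continue; // URL already seen: keep the first occurrence only
        }
        $html[$key] = '<div>' . htmlspecialchars((string) $item->recepttitle) . '</div>';
    }
}

// Drop the URL keys again so the array can be shuffled and sliced as before.
$html = array_values($html);
```

With this placeholder data, item B is skipped because its URL matches item A after trimming, so only A and C remain. Items without a recepturl would all share the empty-string key and collapse into one entry, which may or may not be what is wanted.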

1 Comment

I updated my question. It's about skipping a duplicated $categoryurl; please read the updated question, I added some text and code.
