
I have 2 questions:

1. I need to parse an XML file and insert the data into a MySQL database. Let's say the file is around 250 kB (it could be even larger), and it has a lot of subnodes, so I need at least 3 tables. I parsed the XML with SimpleXML and successfully inserted all the data into the DB, but for this particular file it took around 160 s, which seems like a lot to me. Is there a way to do it in less time?

2. I also need to get the XML file from a URL and save it to the server, and I'm not sure how to do this ...

Thanks for your answers.

The code for parsing the XML:

function parse_xml($file = "") {
    global $database;

    if (!file_exists($file) || empty($file)) {
        die("The file doesn't exist.");
    }

    // Third argument tells SimpleXML that $file is a path, not an XML string.
    $sport      = new SimpleXMLElement($file, 0, true);
    $count      = count($sport->OddsObject) - 1;
    $start_time = time();

    for ($i = 0; $i <= $count; $i++) {
        $countMatch = count($sport->OddsObject[$i]->Matches->Match) - 1;

        for ($k = 0; $k <= $countMatch; $k++) {
            // Collect the OddsObject-level columns (everything except <Matches>).
            $OOdata  = $sport->OddsObject[$i]->children();
            $columns = array();
            $data    = array();
            foreach ($OOdata as $key => $value) {
                if ($key != "Matches") {
                    $columns[] = $key;
                    $data[]    = ($value != "") ? "'" . $database->escape_value($value) . "'" : "NULL";
                }
            }

            // Collect the match columns: MatchId, Date, HomeTeam, AwayTeam.
            $Mdata = $sport->OddsObject[$i]->Matches->Match[$k]->children();
            foreach ($Mdata as $key => $value) {
                if ($key != "OddsData") {
                    $columns[] = $key;
                    $data[]    = ($value != "") ? "'" . $database->escape_value($value) . "'" : "NULL";
                }
            }

            // One INSERT per match.
            $cols   = strtolower(implode(",", $columns));
            $values = implode(",", $data);
            $sql    = "INSERT INTO sports($cols) VALUES($values)";

            if ($database->query($sql)) {
                $last_id   = $database->insert_id();
                $countData = count($sport->OddsObject[$i]->Matches->Match[$k]->OddsData) - 1;

                for ($t = 0; $t <= $countData; $t++) {
                    // Each OddsData child (Home, Draw, Away, ...) becomes a Bet row
                    // linked to the match that was just inserted.
                    $ODdata = $sport->OddsObject[$i]->Matches->Match[$k]->OddsData[$t]->children();
                    foreach ($ODdata as $attr) {
                        $new_bet = Bet::make($attr->getName(), $attr, $last_id);
                        $new_bet->save();
                    }
                }
            }
        }
    }

    return time() - $start_time;
}
  • Are you sure the bottleneck is the XML parser, and not the way you update your database? Are you using transactions? (see the sketch after these comments) Could you show the relevant parts of your parsing code? For the "get data from URL" part, please do some more searching yourself; that's very common. Commented Jul 8, 2011 at 4:45
  • I am not sure how you figured out you need 3 tables, but it would sure help if you had a sample XML. Commented Jul 8, 2011 at 4:45
  • @Mat - no Mat, I'm not sure the bottleneck is the XML parser ... there are a lot of records to validate if I want to be sure the inserts are OK. I added the code at the top :) Commented Jul 8, 2011 at 5:02
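
Regarding the transactions mentioned in the comment above, here is a rough sketch of wrapping all the inserts in a single transaction, reusing the $database wrapper from the question. The $all_insert_statements list is hypothetical, and this only helps if the table uses a transactional engine such as InnoDB:

// Sketch only: group all inserts into one transaction so MySQL doesn't
// flush after every single INSERT. Assumes an InnoDB table.
$database->query("START TRANSACTION");

foreach ($all_insert_statements as $sql) {   // hypothetical list of the INSERT statements built above
    $database->query($sql);
}

$database->query("COMMIT");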

2 Answers


A pretty easy way to get a file from a URL and write it to the server is file_get_contents() combined with file_put_contents().
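
A minimal sketch; the URL and target path below are placeholders:

// Download the XML and save a local copy on the server.
// Both the URL and the destination path are placeholders.
$url  = 'http://example.com/feed/odds.xml';
$dest = __DIR__ . '/odds.xml';

$xml = file_get_contents($url);
if ($xml === false) {
    die("Could not download the XML file.");
}
file_put_contents($dest, $xml);

// The local copy can then be passed to the parsing function:
// parse_xml($dest);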

SimpleXML should be pretty efficient and fast on a file that's only 250 kB. Your slowness is more likely in your database inserts. Try grouping your inserts to the database; I've found that running about 50 inserts at a time usually works best (this depends on the row size, though). That will probably speed up the whole process quite a bit.
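
For illustration, a sketch of a grouped (multi-row) INSERT. The $rows array, the column names, and the batch size of 50 are assumptions; escape_value() and query() are the wrappers from the question:

// Collect value lists and flush them in batches of 50 rows per INSERT.
$batch = array();
foreach ($rows as $row) {                          // $rows: data extracted from the XML (hypothetical)
    $batch[] = "('" . $database->escape_value($row['matchid'])  . "','"
                    . $database->escape_value($row['hometeam']) . "','"
                    . $database->escape_value($row['awayteam']) . "')";

    if (count($batch) == 50) {
        $database->query("INSERT INTO sports (matchid, hometeam, awayteam) VALUES " . implode(",", $batch));
        $batch = array();
    }
}
// Flush whatever is left over.
if (!empty($batch)) {
    $database->query("INSERT INTO sports (matchid, hometeam, awayteam) VALUES " . implode(",", $batch));
}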


1 Comment

Thanks for giving me the idea of grouping - in fact, I split the function into 2 chunks, since the IDs are the same and unique, and wrote everything into 2 tables. Amazingly, it took only 30 s instead of 160. But I'm still not sure how to deal with reading and saving the XML from a URL. Anyway, thanks.

I assume you're parsing it with

$dom = new DOMDocument();   
... 
// read and insert into db

DOM can use a significant amount of memory and CPU compared to a SAX parser. You might try commenting out the database code and running the script to see whether it uses too much CPU and RAM; if so, you may want to re-code it with a SAX parser, as shown here.
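
If memory does turn out to be the problem, here is a rough sketch of a streaming approach with PHP's XMLReader (a pull parser with a memory profile similar to SAX). The file name is a placeholder, and the OddsObject element name is taken from the question's code:

// Stream the document node by node instead of loading it all at once.
$reader = new XMLReader();
$reader->open('odds.xml');                       // placeholder path

while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'OddsObject') {
        // Expand only this <OddsObject> subtree into SimpleXML and
        // process/insert it the same way as before.
        $node = new SimpleXMLElement($reader->readOuterXml());
        // ... insert $node's data into the database ...
    }
}
$reader->close();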

