
I am new to the whole PHP/MySQL thing. I have a week's worth of server logs (about 300,000 entries) and I need to do some analysis. I am planning on reading them all into a MySQL database and then analysing them with PHP.

The thing I am not sure about is how to iterate through them. Reading a file in Java, I would do something like this:

Scanner s = new Scanner(myfile);
while (s.hasNextLine()) {
    String line = s.nextLine();
    // Do something with this record
}

How do I iterate through all records in a MySQL database using PHP? I think that something like this will take a stupid amount of memory:

    $query = "SELECT * FROM mytable";
    $result = mysql_query($query);
    $rows = mysql_num_rows($result);
    for($j = 0; $j < $rows; ++$j){
            $curIndex   = mysql_result($result,$j,"index");
            $curURL     = mysql_result($result,$j,"something");
            ~~ Do something with this record
    }

So I have added a LIMIT to the SELECT statement and I repeat until all records have been cycled through. Is there a more standard way to do this? Is there a built-in that will do it for me?

while ($startIndex < $numberOfRows) {

    // Note: LIMIT takes an offset and a row count, not an end index
    $query = "SELECT * FROM mytable ORDER BY mytable.index LIMIT $startIndex, 10";
    $result = mysql_query($query);
    $rows = mysql_num_rows($result);
    for ($j = 0; $j < $rows; ++$j) {
        $curIndex = mysql_result($result, $j, "index");
        $curURL   = mysql_result($result, $j, "something");
        // Do something with this record
    }
    $startIndex += 10;
}

5 Answers


You don't want to do a SELECT * FROM MYTABLE if your table is large; you'll end up with the whole thing in memory. A trade-off between memory overhead and database calls is to batch the requests. You can get the minimum and maximum IDs of the rows in your table:

SELECT MIN(ID) FROM MYTABLE;
SELECT MAX(ID) FROM MYTABLE;

Now loop from minId to maxId, incrementing by, say, 10,000 each time. In pseudo-code:

for (int i = minId; i <= maxId; i = i + 10000) {
   int x = i;
   int y = i + 10000;
   SELECT * FROM MYTABLE WHERE ID >= x AND ID < y;
}
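
A minimal PHP sketch of this batching approach, assuming an integer primary-key column named id and the same legacy mysql_* API the question uses (the table and column names are placeholders):

$minId = (int) mysql_result(mysql_query("SELECT MIN(id) FROM mytable"), 0);
$maxId = (int) mysql_result(mysql_query("SELECT MAX(id) FROM mytable"), 0);
$batchSize = 10000;

for ($i = $minId; $i <= $maxId; $i += $batchSize) {
    $y = $i + $batchSize;
    // Only rows with id in [$i, $y) are held in memory for this batch
    $result = mysql_query("SELECT * FROM mytable WHERE id >= $i AND id < $y");
    while ($row = mysql_fetch_assoc($result)) {
        // Do something with this record
    }
    mysql_free_result($result); // release this batch before fetching the next
}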

3 Comments

This is what I am doing in the third example using LIMIT, except my solution allows the results to be sorted by something other than ID.
My version is more efficient because you only pull out the rows between x and y. Using LIMIT with an offset, the server still generates and skips every earlier row before returning your batch (the offset in $startIndex is NOT the primary ID of your table, but the row number within the results produced by the preceding query).
OK, I buy that. This is just a MySQL optimization.

See here:

http://www.tizag.com/mysqlTutorial/

http://www.tizag.com/mysqlTutorial/mysqlfetcharray.php

<?php
// Make a MySQL Connection
$query = "SELECT * FROM example"; 

$result = mysql_query($query) or die(mysql_error());


while($row = mysql_fetch_array($result)){
    echo $row['name']. " - ". $row['age'];
    echo "<br />";
}
?>

Depending on what you need to do with the resulting rows, you can use a different loop style, whether it's 'while', 'foreach' or 'for x to x'. Most of the time, a simple 'while' iteration will be great, and it is efficient.

5 Comments

won't this use a crazy amount of memory? Does this have some underlying method to get things as they are needed?
Any filter should be applied in the underlying SQL: the statement should produce only the required records, which PHP will then iterate through for your purposes. If you have a large dataset, think about using separate 'pages'.
Re: memory, given that you need to use all the records returned (if you don't, then tweak your SQL), these are built-in PHP functions, so this is likely the best approach.
OK, so doing what I did in my third example is not necessary and is already done under the hood?
Cool. Looks like mysql_query returns a resource. "A resource is a special variable, holding a reference to an external resource. Resources are created and used by special functions. See the appendix for a listing of all these functions and the corresponding resource types." For more info see php.net/manual/en/language.types.resource.php Could you append this to your answer please?

Use mysql_fetch_*

$result = mysql_query(...);
while ($row = mysql_fetch_assoc($result)) {
    $curIndex = $row['index'];
}

Note that mysql_query() buffers the whole result set on the client, and mysql_fetch_assoc() then hands you one row at a time from that buffer; for genuine streaming of a large result set there is mysql_unbuffered_query(). I'm not sure what exactly mysql_result does.

Side note: since you're still new, I'd advise getting into good habits right away and skipping the mysql_ functions entirely: go for PDO, or at least mysqli.
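
For instance, a minimal PDO sketch of the same loop (the DSN, credentials, and table name are placeholders):

$pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'password');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); // throw on SQL errors

$stmt = $pdo->prepare('SELECT * FROM mytable');
$stmt->execute();

while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $curIndex = $row['index'];
    // Do something with this record
}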

3 Comments

Why is it better to use PDO or mysqli? Is this a standard or a holy war?
The general term to look up would be Object-Relational Mapping (ORM). There are various more-or-less standard options, but I'd not consider it a holy war. It's pretty natural that, with OO on one side and a relational formalism on the other, you need some mapping, and not doing it by hand is always a good thing.
@Frank: Neither PDO nor mysqli has anything to do with ORM, I'm afraid; they only provide an OO interface to the database connection. The data is still relational, as always.

In an ideal world, PHP would generate aggregate queries, send them to MySQL, and only get a small number of rows in return. For instance, if you're counting the number of log items of each severity between two dates:

SELECT COUNT(*), severity 
FROM logs
WHERE date < ? AND date > ?
GROUP BY severity

Doing this work on the PHP side is quite unusual. If you find that your needs are too complex for SQL queries to handle (which, given that you have control over your database structure, leaves you with a lot of freedom), a better option would be to move to a map-reduce database engine like CouchDB.
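
As an illustration, a short sketch of running that aggregate from PHP with PDO and bound parameters ($pdo and the logs schema are assumed from context; the dates are made-up examples):

$stmt = $pdo->prepare(
    'SELECT COUNT(*) AS n, severity
     FROM logs
     WHERE date < ? AND date > ?
     GROUP BY severity'
);
$stmt->execute(['2010-01-08', '2010-01-01']); // upper bound first, to match the placeholders

// One row per severity comes back, instead of 300,000 log lines
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    echo $row['severity'] . ': ' . $row['n'] . "\n";
}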



I strongly believe that batch processing with Doctrine, or any kind of iteration with MySQL (PDO or mysqli), is just an illusion.

@dimitri-k provided a nice explanation, especially about the unit of work. The problem is the misleading $query->iterate(), which doesn't really iterate over the data source: it's just a \Traversable wrapper around an already fully fetched data source.

Here is an example demonstrating that, even with the Doctrine abstraction layer removed from the picture completely, we still run into memory issues:

echo 'Starting with memory usage: ' . memory_get_usage(true) / 1024 / 1024 . " MB \n";

$pdo  = new \PDO("mysql:dbname=DBNAME;host=HOST", "USER", "PW");
$stmt = $pdo->prepare('SELECT * FROM my_big_table LIMIT 100000');
$stmt->execute();

while ($rawCampaign = $stmt->fetch()) {
    // echo $rawCampaign['id'] . "\n";
}

echo 'Ending with memory usage: ' . memory_get_usage(true) / 1024 / 1024 . " MB \n";

Output:

Starting with memory usage: 6 MB 
Ending with memory usage: 109.46875 MB
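
For what it's worth, the blow-up above comes from PDO's default buffered mode, which copies the entire result set to the client before the first fetch(). A sketch of the unbuffered alternative (same hypothetical my_big_table), which streams rows one at a time but ties up the connection until the result is fully read:

$pdo = new \PDO("mysql:dbname=DBNAME;host=HOST", "USER", "PW");
// Disable client-side buffering so rows are pulled from the server on demand
$pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

$stmt = $pdo->prepare('SELECT * FROM my_big_table LIMIT 100000');
$stmt->execute();

while ($row = $stmt->fetch()) {
    // Memory usage now stays roughly flat, one row at a time
    // Caveat: no other query can run on this connection until all rows are read
}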

Here is the disappointing getIterator() method:

namespace Doctrine\DBAL\Driver\Mysqli\MysqliStatement

/**
 * {@inheritdoc}
 */
public function getIterator()
{
    $data = $this->fetchAll();

    return new \ArrayIterator($data);
}

You can use my little library to actually stream heavy tables using PHP, with Doctrine, DQL, or just pure SQL, however you find appropriate: https://github.com/EnchanterIO/remote-collection-stream

