
Once a month I receive a CSV file of members containing ~6000 rows.

I'm (trying to) loop through the CSV file, check if the record already exists in the members table, and if so check whether it's the same data.

Then insert it into the pending table (with an exists flag where appropriate).

I'm using Laravel and League\CSV to read in the file that is saved in my storage folder:

class ImportController extends Controller
{
  public function import(Request $request) {

    $readDirectory = 'storage/csv/';
    $filename = $request->name;

    $stream = fopen($readDirectory.$filename, 'r');
    $reader = Reader::createFromStream($stream)->setHeaderOffset(0);
    $records = (new Statement())->process($reader);

    // Truncate the imported table prior to import
    Imported::truncate(); 

    foreach ($records as $record) {

        $email = $record['email'];

        $recordExists = $this->recordExists($email);

        if($recordExists) {
          // Compare the md5 of the recordArray and the memberArray and skip the record if it's the same.
          $memberArray = $this->getmemberArray($recordExists);
          $recordArray = $this->getRecordArray($record);

          if($memberArray['hash'] === $recordArray['hash']) { continue; }

          $record['exists'] = TRUE;
          $this->write($record);
        } else {
          $record['exists'] = FALSE;
          $this->write($record);
          Log::debug("missing: ".$record['URN']);
        }
      }
      // End foreach loop

    return redirect()->route('upload.show');
  }



  public function recordExists($email) {
    // The parameter was previously named $urn, but the lookup is by email.
    $member = Member::where('email', $email)->first();
    if ($member == null) { return false; }
    return $member;
  }

  public function getmemberArray($member) {
    $memberArray = [
      'email'       =>  $member->email,
      'first_name'  =>  $member->first_name,
      'last_name'   =>  $member->last_name,
      'age_years'   =>  $member->age_years,
      'gender'      =>  $member->gender,
      'address_1'   =>  $member->address_1,
      'address_2'   =>  $member->address_2,
      'address_3'   =>  $member->address_3,
      'town'        =>  $member->town,
      'county'      =>  $member->county,
      'postcode'    =>  $member->postcode,
      'sport_1'     =>  $member->sport_1,
      'sport_2'     =>  $member->sport_2,
    ];
    $memberArray['hash'] = md5(json_encode($memberArray));
    return $memberArray;
  }

  public function getRecordArray($record) {
    $recordArray = [
      'email'       =>  $record['email'], 
      'first_name'  =>  $record['first_name'], 
      'last_name'   =>  $record['last_name'], 
      'age_years'   =>  $record['age_years'], 
      'gender'      =>  $record['gender'],
      'address_1'   =>  $record['address_1'], 
      'address_2'   =>  $record['address_2'], 
      'address_3'   =>  $record['address_3'], 
      'town'        =>  $record['town'], 
      'county'      =>  $record['county'], 
      'postcode'    =>  $record['postcode'], 
      'sport_1'     =>  $record['sport_1'], 
      'sport_2'     =>  $record['sport_2'], 
    ];
    $recordArray['hash'] = md5(json_encode($recordArray));
    return $recordArray;
  }

  public function write($record) {

    $import = [];

    $import['email']      = $record['email'];
    $import['first_name'] = $record['first_name'];
    $import['last_name']  = $record['last_name'];
    $import['age_years']  = $record['age_years'];
    $import['gender']     = $record['gender'];
    $import['address_1']  = $record['address_1'];
    $import['address_2']  = $record['address_2'];
    $import['address_3']  = $record['address_3'];
    $import['town']       = $record['town'];
    $import['county']     = $record['county'];
    $import['postcode']   = $record['postcode'];
    $import['sport_1']    = $record['sport_1'];
    $import['sport_2']    = $record['sport_2'];
    $import['exists']     = $record['exists'];

    DB::table('imported')->insert(
      $import
    );

    Log::debug($record['email']);

    return TRUE;
  }
}

But I keep getting:

Symfony \ Component \ Debug \ Exception \ FatalErrorException (E_UNKNOWN) Allowed memory size of 134217728 bytes exhausted (tried to allocate 181321056 bytes)

It works if I use a lot less rows in my CSV, but that's not an option.

I was previously writing to the DB using eloquent->save(), but changed it to DB::table()->insert to improve performance.
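As an aside on the insert performance, one option (a sketch, not taken from the original code) is to buffer rows and write them in multi-row batches instead of one `insert` per record; the batch size of 500 here is an arbitrary assumption:

```php
// Sketch: buffer rows and flush them to the DB in batches.
// Column names match the question's `imported` table; only two are
// shown here - fill in the rest as in the write() method.
$batch = [];

foreach ($records as $record) {
    $batch[] = [
        'email'  => $record['email'],
        'exists' => $record['exists'],
        // ... remaining columns ...
    ];

    if (count($batch) >= 500) {
        DB::table('imported')->insert($batch); // one multi-row INSERT
        $batch = [];                           // free the buffered rows
    }
}

if ($batch !== []) {
    DB::table('imported')->insert($batch); // flush the remainder
}
```

This cuts the number of round-trips to the database from ~6000 to a dozen or so, though it doesn't by itself fix the memory problem.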

I've already added the following for testing purposes, but it's still breaking.

set_time_limit(0);
ini_set('max_execution_time', 100000);
ini_set('memory_limit','512m');

Am I missing something? Some kind of memory leak somewhere?

I'm guessing it's keeping the record in memory each time, so is there any way to make it forget after each row?
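One known source of per-row memory growth in Laravel is the query log, which keeps every executed query in memory for the lifetime of the request. It is off by default in recent Laravel versions, so this may not be your leak, but it's cheap to rule out:

```php
use Illuminate\Support\Facades\DB;

// Stop Laravel from recording every INSERT for the rest of the request.
DB::connection()->disableQueryLog();

// ... run the import loop ...

// Or discard whatever has been recorded so far:
DB::connection()->flushQueryLog();
```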

ALSO: Is there a way to clear this memory, so that I can edit the code and retry?

Even if I stop and re-run php artisan serve it still keeps the same error message.

  • Apologies for the code. I'm relatively new to PHP, and especially OOP. I'm sure there are much better ways to do this than just chucking it in the controller, but for now, I just need it to work. Then I'll look at DRY, SOLID etc after. Commented Sep 27, 2018 at 13:00
  • stackoverflow.com/questions/34864524 - see if information here helps. Up the allowed memory limit in php.ini directly and see what happens. The other option is to load data directly into your DB first and then use Laravel to process that information. Another thread: laracasts.com/discuss/channels/laravel/… Commented Sep 27, 2018 at 13:18
  • As mentioned, I've already upped the memory limit with ini_set, and it's still timing out. Uploading to the database directly isn't an option, as this is for an admin of a local club. They receive the data on their members as a CSV monthly, and want to update or add new records. They don't have technical knowledge. Commented Sep 27, 2018 at 13:21
  • 6000 rows but how large are the data? Is the memory use justified from loading the CSV or is the memory running out somewhere else? Commented Sep 27, 2018 at 13:25

2 Answers


The problem here is that League\CSV is reading the whole CSV file into memory when you do:

$records = (new Statement())->process($reader);

You should use the chunk method of the Reader like this to only read a specific amount of rows at once:

foreach($reader->chunk(50) as $row) {
    // do whatever
}

The chunk method returns a Generator that you can iterate over. You can find this mentioned here in the documentation.

EDIT: I misread the documentation and recommended the wrong method.

You basically just have to iterate over the $reader itself:

foreach ($reader as $row) {
    print_r($row);
}

Also, if you are on a Mac, or if your CSV was created on one, you need the following setting to be able to read large CSV files successfully:

if (!ini_get('auto_detect_line_endings')) {
    ini_set('auto_detect_line_endings', '1');
}

See this part of the documentation.
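Applied to the question's controller, iterating the Reader directly might look like this sketch (the exists-check and `write()` logic are unchanged and elided):

```php
$stream = fopen($readDirectory.$filename, 'r');
$reader = Reader::createFromStream($stream)->setHeaderOffset(0);

// With a header offset set, iterating the Reader yields one
// associative array per row, keyed by the header line, without
// materialising the whole file first.
foreach ($reader as $record) {
    // ... existing exists-check and $this->write($record) logic ...
}
```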


3 Comments

You should use it directly on the $reader since $records will already contain the whole CSV files content and you only want to read it in chunks.
Ok, so I'd want to do this: $stream = fopen($readDirectory.$filename, 'r'); $reader = Reader::createFromStream($stream, 'r')->setHeaderOffset(0); foreach ($reader->chunk(512) as $chunk) { $records = (new Statement())->process($chunk); } // Truncate the imported table prior to import Imported::truncate(); foreach ($records as $record) {...
@n8udd Please check out the edit I made. I misread the documentation before and recommended the wrong method. I've corrected my post with a method that works and also added a tip for OS-specific problems.

I gather that you are using php artisan serve to run your server. You could try deploying an actual web server, since you will be using one in production anyway. Apache is an easy option and ships with XAMPP for Windows and Linux.

You can look up how to install Apache HTTP Server or Nginx on your operating system. These give you better control over memory than PHP's built-in development server.

1 Comment

I am trying to avoid just throwing more memory at it, and instead trying to identify why it's haemorrhaging memory.
