1

Using PHP, is there a better way of extracting the appropriate bits of information from a text file without using the strpos and substr functions?

I need to extract the "Course xxx", No, and Ref

Example: The results for record with Subject of "Course 1...." would be:

  • Course 1
  • 8415
  • 152

Example .txt File:

Name: Dave
Age: 15
Subject: Course 1 (No: 8415, Ref: #152#)
Description:

Description 1



Name: John
Age: 28
Subject: Course 2 (No: 646544, Ref: #325#)
Description:

Description 1



Name: Steve
Age: 22
Subject: Course 3 (No: 545, Ref: #451#)
Description:

Description 1

EDIT: I noticed that I don't need all of the data extracted, but all of it will still be in the file.

4
  • I have used strpos and substr, but it's messy IMO. Commented Oct 17, 2012 at 20:38
  • 1
    What's the 'Description 1' part? The rest is easy to extract with regular expressions. The question is: is the file consistent throughout its entire structure (like the two blank lines, which I can use as separators to help you with this)? And is "Description:" always a single line? Commented Oct 17, 2012 at 20:51
  • EDIT: I have just noticed I don't need all the data extracted, but all data will still be in the file. Commented Oct 17, 2012 at 20:58
  • @Claudrian yes, the file is always in the same format, and the description is always on one line, with blank lines underneath. Commented Oct 17, 2012 at 21:01

4 Answers

3
if(preg_match_all('~'.
    'Name:\\s*(?<Name>.+?)\r?\n'. // Name
    'Age:\\s*(?<Age>[0-9]+)\r?\n'. // Age
    'Subject:\\s*(?<Subject>.+?)\\s*\\('. // Subject
        'No:\\s*(?<No>[0-9]+)\\s*,'. // No
        '\\s*Ref:\\s*#(?<Ref>[0-9]+)#'. // Ref
    '\\)\r?\n'. // /Subject
    'Description:\\s*(?<Description>.+?)(?:\r?\n|$)'. // Description (the last record may lack a trailing newline)
'~si', $AccountDump, $Matches)){
    $Names = $Matches['Name'];
    $Ages = $Matches['Age'];
    $Subjects = $Matches['Subject'];
    $Nos = $Matches['No'];
    $Refs = $Matches['Ref'];
    $Descriptions = $Matches['Description'];
    $Accounts = array();
    foreach($Names as $Key => $Name){
        $Accounts[$Key] = array_map('trim', array(
            'Name'              => $Name,
            'Age'               => $Ages[$Key],
            'Subject'           => $Subjects[$Key],
            'No'                => $Nos[$Key],
            'Ref'               => $Refs[$Key],
            'Description'       => $Descriptions[$Key],
        ));
    }
    // Got them!
    var_dump($Accounts);
}

Load the file's contents into a variable named $AccountDump first (for example with file_get_contents()).

Have fun. Tested on your sample and it works. I've split the RegExp so you can track it if you want.

Hope it works!
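Since the question's edit says only the course name, No, and Ref are actually needed, a trimmed-down pattern may be enough. The following is just a sketch under a few assumptions: the sample text is inlined with LF line endings, and the names $text, $pattern and $courses are made up for illustration. It uses PREG_SET_ORDER so each match arrives as one array per record.

```php
<?php
// Sample input copied from the question (inlined here; LF line endings assumed).
$text = "Name: Dave\nAge: 15\nSubject: Course 1 (No: 8415, Ref: #152#)\nDescription:\n\nDescription 1\n\n\n\n"
      . "Name: John\nAge: 28\nSubject: Course 2 (No: 646544, Ref: #325#)\nDescription:\n\nDescription 1\n";

// Capture only the three wanted fields from each "Subject:" line.
$pattern = '~^Subject:\s*(?<Subject>.+?)\s*\(No:\s*(?<No>\d+),\s*Ref:\s*#(?<Ref>\d+)#\)~m';

$courses = array();
// PREG_SET_ORDER yields one array per record instead of one array per group.
if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $courses[] = array(
            'Subject' => $m['Subject'],
            'No'      => $m['No'],
            'Ref'     => $m['Ref'],
        );
    }
}

print_r($courses);
```

The other fields stay in the file untouched; the pattern simply ignores every line that does not start with "Subject:".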


1

You'll probably want to use regular expressions for this. It will get a little complex, but won't be nearly as bad as strpos and substr.

As a starting point, here's a regular expression that will match name:value pairs -

$matches = array();
preg_match_all('/^([^\s:]+):\s*(.+)$/m', $data, $matches);

print_r($matches);
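To turn those name:value pairs into one array per person, something like the sketch below could work. It assumes every record begins with a "Name" line and that the input uses LF line endings; $records is a hypothetical variable name, and the sample data is inlined from the question.

```php
<?php
// Sample input from the question (inlined; LF line endings assumed).
$data = "Name: Dave\nAge: 15\nSubject: Course 1 (No: 8415, Ref: #152#)\nDescription:\n\nDescription 1\n\n\n\n"
      . "Name: John\nAge: 28\nSubject: Course 2 (No: 646544, Ref: #325#)\nDescription:\n\nDescription 1\n";

// Same pattern as above; PREG_SET_ORDER gives one [full, key, value] array per pair.
preg_match_all('/^([^\s:]+):\s*(.+)$/m', $data, $matches, PREG_SET_ORDER);

$records = array();
foreach ($matches as $m) {
    // Assumption: every record begins with a "Name" line.
    if ($m[1] === 'Name') {
        $records[] = array();
    }
    // Skip any pairs that appear before the first "Name" line.
    if ($records) {
        $records[count($records) - 1][$m[1]] = $m[2];
    }
}

print_r($records);
```

Note that the \s* after the colon also skips the blank line under "Description:", so the description text ends up as that key's value.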

Edit: I got curious and finished the regex, here it is in its entirety -

preg_match_all('/^([^\s:]+):\s*(.+?)(?:\s*\(([^\s:]+):\s*(.+),\s*([^\s:]+):\s*(.+)\))?$/m', $data, $matches);
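For reading the result of that longer pattern, here is a rough sketch. It assumes LF line endings (with CRLF input it may help to normalise the line endings first), and $pattern and $course are names invented for the example. Groups 3 to 6 are only filled for lines that carry the "(No: ..., Ref: ...)" suffix.

```php
<?php
// One record from the question's sample (inlined; LF line endings assumed).
$data = "Name: Dave\nAge: 15\nSubject: Course 1 (No: 8415, Ref: #152#)\nDescription:\n\nDescription 1\n";

$pattern = '/^([^\s:]+):\s*(.+?)(?:\s*\(([^\s:]+):\s*(.+),\s*([^\s:]+):\s*(.+)\))?$/m';
preg_match_all($pattern, $data, $matches, PREG_SET_ORDER);

$course = null;
foreach ($matches as $m) {
    // Only the "Subject" line has the bracketed suffix, so only it fills groups 3-6.
    if (!empty($m[3])) {
        $course = array(
            'Subject' => $m[2],
            $m[3]     => $m[4],            // key is the captured label "No"
            $m[5]     => trim($m[6], '#'), // key is "Ref"; strip the surrounding # marks
        );
    }
}

print_r($course);
```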


1

You can try this:

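// Read the whole file; this approach relies on Windows (CRLF) line endings.
$data = file_get_contents("log.txt");
// Split on blank lines ("\r\n\r"), trim and drop the empty pieces, then pair
// each header block with the description text that follows it.
$data = array_chunk(array_filter(array_map("trim", explode(chr(13).chr(10).chr(13), $data))), 2);
$lists = array();

foreach ( $data as $chunk ) {
    $list = array();
    // Rejoin the pair so "Description:" sits directly in front of its text.
    foreach ( explode("\n", implode("", $chunk)) as $item ) {
        // Splits on every colon, so the Subject value is cut off at "(No".
        list($key, $value) = explode(":", $item);
        $list[trim($key)] = trim($value);
    }
    $lists[] = $list;
}
var_dump($lists);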

Output

array
  0 => 
    array
      'Name' => string 'Dave' (length=4)
      'Age' => string '15' (length=2)
      'Subject' => string 'Course 1 (No' (length=12)
      'Description' => string 'Description 1' (length=13)
  1 => 
    array
      'Name' => string 'John' (length=4)
      'Age' => string '28' (length=2)
      'Subject' => string 'Course 2 (No' (length=12)
      'Description' => string 'Description 1' (length=13)
  2 => 
    array
      'Name' => string 'Steve' (length=5)
      'Age' => string '22' (length=2)
      'Subject' => string 'Course 3 (No' (length=12)
      'Description' => string 'Description 1' (length=13)
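As the output shows, 'Subject' is cut off at "(No" because explode() splits on every colon in the line. One possible fix, sketched below, is to pass a limit of 2 so only the first colon separates key from value, and then pull No and Ref out of the value with a small regex. $item here is a single hard-coded line for illustration, and $parts is a made-up name.

```php
<?php
// A single line from the sample file, hard-coded for illustration.
$item = "Subject: Course 1 (No: 8415, Ref: #152#)";

// A limit of 2 splits on the first colon only, so the value keeps its own colons.
list($key, $value) = explode(":", $item, 2);
$key   = trim($key);
$value = trim($value);

// Break the value into course name, No and Ref (assumes the format from the question).
$parts = array();
if (preg_match('/^(.+?)\s*\(No:\s*(\d+),\s*Ref:\s*#(\d+)#\)$/', $value, $m)) {
    $parts = array('Course' => $m[1], 'No' => $m[2], 'Ref' => $m[3]);
}

var_dump($key, $value, $parts);
```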


0

Take a look at these two PHP functions:

preg_replace

preg_match_all

3 Comments

it's more a comment than an answer
Does anyone know of a site that can generate regular expressions for you, where all you need to do is enter the match (the thing you are searching for)?
regexpal.com can be very helpful when building regular expressions. The site was inspired by a commercial product, RegexBuddy, which I've had a lot of success with over the years.
