1

In a text file, I have the following lines:

 ID  |      LABEL       | A |   B  | C
--------------------------------------
9999 | Oxygen Isotopes  |   | 0.15 | 1 
8733 | Enriched Uranium |   | 1    | 1 

I would like to extract the ID and LABEL fields from each line using a regular expression.

How can I achieve this?

2
  • What is the delimiter between the strings? Commented Nov 10, 2012 at 22:12
  • 1
    The delimiter is |. At the moment I am only able to extract the string with a regexp; I'll edit the original question. Commented Nov 10, 2012 at 22:16

7 Answers

2

I am not sure why you insist on a regex.

As the columns appear to be separated by the | symbol, the PHP function explode seems like an easier solution.

You would be able to loop through the lines and refer to each column using ordinary array index notation, for example $line[0] and $line[1] for ID and LABEL respectively.
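
As a minimal sketch of that loop, assuming the sample from the question is available as a string (the names $text and $rows, and the rule of skipping any row whose first column is not numeric, are illustrative choices rather than part of the answer):

```php
<?php
// Sample input copied from the question; in practice it might come from a file.
$text = <<<TXT
 ID  |      LABEL       | A |   B  | C
--------------------------------------
9999 | Oxygen Isotopes  |   | 0.15 | 1 
8733 | Enriched Uranium |   | 1    | 1 
TXT;

$rows = array();
foreach (explode("\n", $text) as $line) {
    // Skip lines without a delimiter (the dashed separator, blank lines).
    if (strpos($line, '|') === false) {
        continue;
    }
    // Split on the pipe and trim the padding around each column.
    $columns = array_map('trim', explode('|', $line));
    // Skip the header row: its first column is not made of digits.
    if (!ctype_digit($columns[0])) {
        continue;
    }
    $rows[] = array('id' => $columns[0], 'label' => $columns[1]);
}

print_r($rows);
```

Trimming each column matters here because the file pads its fields with spaces.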

1 Comment

I wanted to ask him this too, but if he wants a regex, I need the delimiter.
1

I doubt regex is the best solution here.

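Try this to split the text into an array of lines (splitting on "\n" can leave a stray "\r" at the end of each line if the file was created on Windows; the trim below removes it):

$lines = explode("\n", $text); // explode() takes the delimiter first
$final_lines = array();

foreach ($lines as $line) {
    // Split on the pipe and strip the padding (and any "\r") from each column.
    $parts = array_map('trim', explode('|', $line));
    $final_lines[] = $parts;
}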

Now you can access all of the data by line number and then column. For example,

$final_lines[3][0]

will contain 8733 (indices 0 and 1 hold the header row and the dashed separator line).

1

You could use preg_split on every line:

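$array = preg_split('/\s*\|\s*/', $inputLine, 3);

Then, as in djdy's answer, the ID will be in $array[0] and the label in $array[1]. The limit of 3 leaves the rest of the line unsplit in $array[2]; with a limit of 2, the label would not be separated from the remaining columns.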
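
A small sketch of how this could be wrapped for use on every line. The helper name splitIdAndLabel is hypothetical, and both the limit of 3 and the digits-only check on the ID are assumptions made so that the label is cut off from the remaining columns and the header and separator lines are rejected:

```php
<?php
// Hypothetical helper: returns array(id, label), or null for non-data lines.
function splitIdAndLabel($inputLine)
{
    // Limit 3 yields: ID, LABEL, and the unsplit remainder of the line.
    $array = preg_split('/\s*\|\s*/', trim($inputLine), 3);
    if (count($array) < 3 || !ctype_digit($array[0])) {
        return null; // header, dashed separator, or blank line
    }
    return array($array[0], $array[1]);
}

var_dump(splitIdAndLabel('9999 | Oxygen Isotopes  |   | 0.15 | 1 '));
var_dump(splitIdAndLabel('--------------------------------------'));
```

The first call should return the ID and label as a two-element array, and the second should return null because the separator line contains no pipe.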

1

No regex needed:

<?php
$file = file('file.txt');

$ret = array();
foreach($file as $k=>$line){
    if($k<2){continue;}

    list($ret['ID'][],
         $ret['LABEL'][],
         $ret['A'][],
         $ret['B'][],
         $ret['C'][]) = explode('|',$line);
}

print_r($ret);

//Label: Oxygen Isotopes ID:9999 
echo 'Label: '.$ret['LABEL'][0].' ID:'.$ret['ID'][0];

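/* Output of print_r() on PHP 5. PHP 7+ fills list() left to right,
   so there the keys appear in the order ID, LABEL, A, B, C.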
Array
(
    [C] => Array
        (
            [0] =>  1 

            [1] =>  1 
        )

    [B] => Array
        (
            [0] =>  0.15 
            [1] =>  1    
        )

    [A] => Array
        (
            [0] =>    
            [1] =>    
        )

    [LABEL] => Array
        (
            [0] =>  Oxygen Isotopes  
            [1] =>  Enriched Uranium 
        )

    [ID] => Array
        (
            [0] => 9999 
            [1] => 8733 
        )

)
*/
?>

0

Regular expressions might not be the best approach here. I'd read in each line as a string and use explode("|", $input) to make an array of strings. Index 0 is your ID, index 1 is your label, and so on for A, B, and C if you want them. That's a more robust solution than using a regex.

A regular expression that gets the ID might be something like

(\d{4})\s*\|

You could do something similar for the label field, but again, this isn't as robust as just using explode.
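
For illustration, one way the ID and label patterns might be combined into a single expression applied to one line. The pattern below is an assumption, not the answer's own: it accepts one or more digits for the ID, treats everything up to the second pipe as the label, and uses named groups for readability.

```php
<?php
// Illustrative pattern: ID is a run of digits at the start of the line,
// LABEL is everything between the first and second pipe, without padding.
$pattern = '/^\s*(?<id>\d+)\s*\|\s*(?<label>[^|]*?)\s*\|/';

$line = '8733 | Enriched Uranium |   | 1    | 1 ';
if (preg_match($pattern, $line, $m)) {
    echo $m['id'], ' => ', $m['label'], "\n"; // prints "8733 => Enriched Uranium"
}
```

Because the ID must consist of digits, the header and the dashed separator line do not match and can simply be skipped.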

0

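Although a regular expression is not the best approach here, one possibility is

preg_match_all("/(\d{4}.?)\|(.*?)\|/s", $data, $matches);

$matches[1] and $matches[2] (the second and third elements of $matches) will hold the IDs and labels. The captured values still include the padding around each field, so trim them before use.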

0

Try

$str = file_get_contents($filename);
preg_match_all('/^\s*(\d*)\s*\|\s*(.*?)\s*\|/m', $str, $matches);
// $matches[1] will have ids
// $matches[2] will have labels 
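
A possible follow-up, sketched with the sample from the question inlined instead of read from a file (the variable $labelsById is an illustrative name): the two result arrays can be paired into a single ID to label map with array_combine.

```php
<?php
// Inline sample standing in for file_get_contents($filename).
$str = <<<TXT
 ID  |      LABEL       | A |   B  | C
--------------------------------------
9999 | Oxygen Isotopes  |   | 0.15 | 1 
8733 | Enriched Uranium |   | 1    | 1 
TXT;

preg_match_all('/^\s*(\d*)\s*\|\s*(.*?)\s*\|/m', $str, $matches);

// Pair each ID with its label. PHP converts numeric string keys such as
// '9999' to integer keys.
$labelsById = array_combine($matches[1], $matches[2]);

foreach ($labelsById as $id => $label) {
    echo $id, ': ', $label, "\n";
}
```

The header and separator lines produce no match, so only the two data rows end up in the map.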
