0

I need to split a string on numbers and on spaces, but I'm not sure what regex to use for that. My code is:

$array = preg_split('/[0-9].\s/', $content);

The value of $content is:

Weight 229.6104534866 g
Energy 374.79170898476 kcal
Total lipid (fat) 22.163422468932 g
Carbohydrate, by difference 13.641848209743 g
Sugars, total 4.3691034101428 g
Protein 29.256342349938 g
Sodium, Na 468.99386390008 mg 

Which gives the result:

Array ( [0] => Weight 229.61045348 [1] => g
Energy 374.791708984 [2] => kcal
Total lipid (fat) 22.1634224689 [3] => g
Carbohydrate, by difference 13.6418482097 [4] => g
Sugars, total 4.36910341014 [5] => g
Protein 29.2563423499 [6] => g
Sodium, Na 468.993863900 [7] => mg
) 1

I need to split the text from the number but not sure how, so that:

[0] => Weight
[1] => 229.6104534866
[2] => g

and so on...

I also need it to ignore the commas, brackets and spaces inside the label. When using explode() I found that 'Total lipid (fat)' was split into three values instead of staying as one, and I'm not sure how to fix that with regex.

When using explode() I get:

[0] => Total
[1] => lipid
[2] => (fat)

but I need those values kept together as a single label. Is there any way to do that?

Any help is very appreciated!

2
  • Why not use the explode() function? Commented Feb 21, 2022 at 16:18
  • Please can you edit to include a minimal reproducible example - show us the input that the output you've printed comes from, and the exact output you want for that input. Commented Feb 21, 2022 at 16:19

3 Answers

2

Instead of splitting, you might very well match and capture the required parts, e.g. with the following pattern:

^(?P<category>\D+)\s+(?P<value>[\d.]+)\s+(?P<unit>.+)

See a demo on regex101.com.


In PHP this could be

<?php

$data = 'Weight 229.6104534866 g
Energy 374.79170898476 kcal
Total lipid (fat) 22.163422468932 g
Carbohydrate, by difference 13.641848209743 g
Sugars, total 4.3691034101428 g
Protein 29.256342349938 g
Sodium, Na 468.99386390008 mg ';

$pattern = '~^(?P<category>\D+)\s+(?P<value>[\d.]+)\s+(?P<unit>.+)~m';

preg_match_all($pattern, $data, $matches, PREG_SET_ORDER, 0);

// Print the entire match result
print_r($matches);
?>

See a demo on ideone.com.
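
To get the flat label / value / unit layout asked for in the question, the named groups can be read out of each match. This is only a minimal sketch: the shortened `$data` string and the `$rows` variable are illustrative assumptions, and `trim()` is added because `.+` also captures the trailing space after `mg`.

```php
<?php
// Shortened sample input (assumption: same line format as in the question).
$data = "Weight 229.6104534866 g\nTotal lipid (fat) 22.163422468932 g\nSodium, Na 468.99386390008 mg ";

$pattern = '~^(?P<category>\D+)\s+(?P<value>[\d.]+)\s+(?P<unit>.+)~m';

preg_match_all($pattern, $data, $matches, PREG_SET_ORDER);

// $rows is a hypothetical name: one [label, value, unit] triple per line.
$rows = [];
foreach ($matches as $match) {
    $rows[] = [
        $match['category'],
        $match['value'],
        trim($match['unit']), // drop whitespace captured by .+
    ];
}

print_r($rows);
```

Note that the value stays a string here; cast it with `(float)` if a number is needed.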

0

As an alternative to the preg_ functions, sscanf() allows the decimal value to be explicitly typed as a float (if that is valuable).

Unfortunately, because `%[^0-9]` consumes everything up to the first digit, the space between the label and the float value stays attached to the label string. If this is a problem, the label will need to be passed through rtrim().

Code: (Demo)

// $contentLines = file('path/to/content.txt');
$contentLines = [
    'Weight 229.6104534866 g',
    'Energy 374.79170898476 kcal',
    'Total lipid (fat) 22.163422468932 g',
    'Carbohydrate, by difference 13.641848209743 g',
    'Sugars, total 4.3691034101428 g',
    'Protein 29.256342349938 g',
    'Sodium, Na 468.99386390008 mg',
];

var_export(
    array_map(
        fn($line) => sscanf(
            $line,
            '%[^0-9]%f%s',
        ),
        $contentLines
    )
);
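
If the trailing space on the label matters, the rtrim() step mentioned above could be folded into the callback. This is a rough sketch on two of the sample lines; the `$parsed` variable is an illustrative assumption, and it presumes every line matches the format (sscanf() returns something else when parsing fails).

```php
<?php
$contentLines = [
    'Total lipid (fat) 22.163422468932 g',
    'Sodium, Na 468.99386390008 mg',
];

// $parsed is a hypothetical name for the cleaned result.
$parsed = array_map(
    function (string $line): array {
        // %[^0-9] reads up to the first digit, %f reads the float, %s reads the unit.
        [$label, $value, $unit] = sscanf($line, '%[^0-9]%f%s');

        // Remove the space that %[^0-9] leaves at the end of the label.
        return [rtrim($label), $value, $unit];
    },
    $contentLines
);

var_export($parsed);
```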


-1

Thanks to everyone for the help. I found that by putting a double space between the values and then passing the double space to explode() as the delimiter, the single spaces, commas and brackets inside the labels were left alone.
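
Roughly, that looks like the sketch below. It assumes the input has already been rewritten so that the fields are separated by exactly two spaces (the sample `$line` is made up for illustration), and it would break if a label ever contained a double space itself.

```php
<?php
// Assumption: fields are separated by two spaces, labels only contain single spaces.
$line = 'Total lipid (fat)  22.163422468932  g';

// Splitting on the double space keeps the label in one piece.
$parts = explode('  ', $line);

print_r($parts);
```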

1 Comment

Look into the answers, this is surely not the best solution, actually.
