
I have a string like

BK0001 My book (4th Edition) $49.95 (Clearance Price!)

I would like a way to split it into different parts like

[BK0001] 
[My Book (4th Edition)] 
[$49.95] 
[(Clearance Price!)]

I'm pretty new to regex and I'm using this to parse a line in a file. I managed to get the first part, BK0001, by using

$parts = preg_split('/\s+/', 'BK0001 My book (4th Edition) $49.95 (Clearance Price!)');

and then reading $parts[0], but I'm not sure how to split the string to get the other values.

  • Have you used regex101 yet? Great resource for both learning regexes and developing one for a particular need. Commented Nov 7, 2018 at 20:42
  • Try spelling out the subpatterns. Say, preg_match('~^(?<code>\S+)\s+(?<name>.*?)\s+(\$\d[\d.]*)\s*(?<details>.*)$~', $text, $matches), see demo. Commented Nov 7, 2018 at 20:43
  • @Dan Farrel I have, but I don't use PHP and regex often; I code mostly in Python and usually use string.split() for tasks such as these. This is one of those rare moments when I need regex, and investing time in learning it fully isn't really a good option right now. Commented Nov 7, 2018 at 20:48
  • @WiktorStribiżew works perfectly. Thanks Commented Nov 7, 2018 at 20:48
  • "learning it fully really a good option right now": it's always good to learn regex; most languages have some flavor of it, and it's incredibly powerful and useful. Commented Nov 7, 2018 at 21:07

2 Answers


You may match the specific parts of the input string using a single pattern with capturing groups:

preg_match('~^(?<code>\S+)\s+(?<name>.*?)\s+(?<num>\$\d[\d.]*)\s*(?<details>.*)$~', $text, $matches)

See the regex demo. Actually, the last $ is not required; it is there just to show that the whole string is matched.
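The remark about the trailing $ can be checked with a small sketch that runs the pattern with and without it on the sample string. The variable names here are illustrative only, and the comparison assumes single-line input, where the greedy .* already reaches the end of the string.

```php
<?php
// Sample line from the question.
$str = 'BK0001 My book (4th Edition) $49.95 (Clearance Price!)';

// Same pattern, with and without the trailing end-of-string anchor.
$withAnchor    = '~^(?<code>\S+)\s+(?<name>.*?)\s+(?<num>\$\d[\d.]*)\s*(?<details>.*)$~';
$withoutAnchor = '~^(?<code>\S+)\s+(?<name>.*?)\s+(?<num>\$\d[\d.]*)\s*(?<details>.*)~';

preg_match($withAnchor, $str, $a);
preg_match($withoutAnchor, $str, $b);

// Both match arrays hold the same keys and values for this input.
var_dump($a === $b); // bool(true)
```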

Details

  • ^ - start of a string
  • (?<code>\S+) - Group "code": one or more non-whitespace chars
  • \s+ - 1+ whitespaces
  • (?<name>.*?) - Group "name": any 0+ chars other than line break chars, as few as possible
  • \s+ - 1+ whitespaces
  • (?<num>\$\d[\d.]*) - Group "num": a $, then 1 digit and then 0+ digits or .
  • \s* - 0+ whitespaces
  • (?<details>.*) - Group "details": any 0+ chars other than line break chars, as many as possible
  • $ - end of string.

PHP code:

$re = '~^(?<code>\S+)\s+(?<name>.*?)\s+(?<num>\$\d[\d.]*)\s*(?<details>.*)$~';
$str = 'BK0001 My book (4th Edition) $49.95 (Clearance Price!)';
if (preg_match($re, $str, $m)) {
    echo "Code: " . $m["code"] . "\nName: " . $m["name"] . "\nPrice: " .
         $m["num"] . "\nDetails: " . $m["details"]; 
}

Output:

Code: BK0001
Name: My book (4th Edition)
Price: $49.95
Details: (Clearance Price!)
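
Since the question mentions parsing lines from a file, here is a minimal sketch of wrapping the pattern in a helper and applying it to several lines. The function name parseBookLine and the second and third sample lines are hypothetical, invented for illustration; only the first line comes from the question. Lines that do not match return null rather than raising an error.

```php
<?php
// Hypothetical helper: returns the four parts of a line, or null if it does not match.
function parseBookLine(string $line): ?array
{
    $re = '~^(?<code>\S+)\s+(?<name>.*?)\s+(?<num>\$\d[\d.]*)\s*(?<details>.*)$~';
    if (!preg_match($re, trim($line), $m)) {
        return null;
    }
    return [
        'code'    => $m['code'],
        'name'    => $m['name'],
        'price'   => $m['num'],
        'details' => $m['details'],
    ];
}

// Only the first line comes from the question; the others are invented test data.
$lines = [
    'BK0001 My book (4th Edition) $49.95 (Clearance Price!)',
    'BK0002 Another book $10.00',
    'not a product line',
];

foreach ($lines as $line) {
    $parts = parseBookLine($line);
    echo $parts === null ? "No match: $line\n" : implode(' | ', $parts) . "\n";
}
```

Returning null for non-matching lines keeps the caller in charge of whether to skip, log, or stop on bad input.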

Try using preg_match

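// Single-quoted strings avoid variable interpolation and doubled backslashes.
$book_text = 'BK0001 My book (4th Edition) $49.95 (Clearance Price!)';
// Groups: code, name, (edition), price, (special text); \w already includes digits.
if (preg_match('/(\w+)\s+(.*?)\s+\((.*?)\)\s+(\$[\d.]+)\s+\((.*?)\)$/', $book_text, $matches)) {
    // Use $matches[1] through $matches[5] here
    print_r($matches);
}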

$matches[0] holds the full matched string. The individual parts start at $matches[1]:

Array
(
    [0] => BK0001 My book (4th Edition) $49.95 (Clearance Price!)
    [1] => BK0001
    [2] => My book
    [3] => 4th Edition
    [4] => $49.95
    [5] => Clearance Price!
)

$matches[1] is "book number"
$matches[2] is "book name"
$matches[3] is "edition"
$matches[4] is "price"
$matches[5] is "special text"
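
One thing to be aware of with this pattern: it expects both parenthesised parts to be present on every line. The sketch below checks that assumption against a made-up line that has no edition; the second sample string is hypothetical and not from the question.

```php
<?php
// Pattern from this answer, written as a single-quoted string.
$re = '/(\w+)\s+(.*?)\s+\((.*?)\)\s+(\$[\d.]+)\s+\((.*?)\)$/';

$full      = 'BK0001 My book (4th Edition) $49.95 (Clearance Price!)';
$noEdition = 'BK0002 Another book $10.00 (Sale)'; // invented sample, not from the question

var_dump(preg_match($re, $full));      // int(1)
var_dump(preg_match($re, $noEdition)); // int(0)
```

If some lines may lack the edition, the parenthesised groups would need to be made optional, or the title and edition kept together in one group.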
