I have an HTML page from an old website that lists some places in the format below.

<p><b>Ado’s Kitchen &amp; Bar&nbsp; </b>1143 13th St., 720-465-9063; <a href="http://www.span-ishatthehill.com">span-ishatthehill.com.</a> Laid back restaurant with global menu. Open for breakfast and lunch daily and dinner Mon.-Sat.</p>
    

<p><strong>Blackbelly Market</strong> 1606 Conestoga St. #3, 303-247-1000; <a href="http://www.blackbelly.com">blackbelly.com</a>. Locavore dining, butchery and bar. Open daily for happy hour and dinner; see website for market hours.</p>

I am going to use this data for a listing page, so I need to get it into a structured format like this:

$arr = [
    'name' => '',        // inside the <b> tag
    'address' => '',     // after the <b> tag
    'phone' => '',       // after the address; the address ends with a comma
    'website' => '',     // after the phone number; the number ends with a semicolon and the website is in an <a> tag
    'description' => '', // after the <a> tag
];

I tried to use preg_match, but I cannot extract the content that is not inside a tag, e.g. the address or the phone number.

$htmlContent = 'content here';
preg_match('/<b>(.*?)<\/b>/s', $htmlContent, $match); /* for name in <b> */
preg_match('/<strong>(.*?)<\/strong>/s', $htmlContent, $match); /* for name in <strong> */

preg_match('/<a href="(.*?)">(.*?)<\/a>/s', $htmlContent, $match); /* for website */

Using this code I can get the website or the name (from the tags), but how do I get the phone, address and other details?

Thanks

1 Answer

You can use a single regular expression to capture the data, like this:

preg_match('#<p><b>(?<name>.*)</b>(?<address>.*),(?<phone>.*);.*<a.*href="(?<website>.*)".*>.*</a>(?<description>.*)</p>#', $htmlContent, $match);

Then you can retrieve the matches like this:

$name = $match['name'];
$address = $match['address'];
$phone = $match['phone'];
...

If you want to see in more detail how this regular expression works, here is the link: https://regex101.com/r/EYpXwi/1
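
To pull every entry out of a page rather than only the first match, one possible approach is to split the page into paragraphs first and then apply a stricter per-entry pattern. The following is only a sketch under a few assumptions: the function names parseListings and cleanField are made up for illustration, the phone number is assumed to consist of digits, spaces, dots, parentheses, plus signs and hyphens, and every entry is assumed to follow the name, address, phone, link, description order shown in the question.

```php
<?php
// Sample input copied from the question: one entry uses <b>, the other <strong>.
$htmlContent = <<<'HTML'
<p><b>Ado’s Kitchen &amp; Bar&nbsp; </b>1143 13th St., 720-465-9063; <a href="http://www.span-ishatthehill.com">span-ishatthehill.com.</a> Laid back restaurant with global menu. Open for breakfast and lunch daily and dinner Mon.-Sat.</p>

<p><strong>Blackbelly Market</strong> 1606 Conestoga St. #3, 303-247-1000; <a href="http://www.blackbelly.com">blackbelly.com</a>. Locavore dining, butchery and bar. Open daily for happy hour and dinner; see website for market hours.</p>
HTML;

// Hypothetical helper: decode HTML entities, turn non-breaking spaces into
// plain spaces, and trim surrounding whitespace.
function cleanField(string $value): string
{
    $value = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $value = str_replace("\u{00A0}", ' ', $value);
    return trim($value);
}

// Hypothetical helper: return one associative array per listing paragraph.
function parseListings(string $html): array
{
    // Applied to the inner text of a single <p>. \1 refers back to the tag
    // name captured by (b|strong), so the closing tag must match the opening one.
    $entryPattern = '#^\s*<(b|strong)>(?<name>.*?)</\1>\s*(?<address>.*?),\s*'
                  . '(?<phone>[0-9()+.\s-]+);\s*'          // assumed phone format
                  . '<a[^>]*href="(?<website>[^"]*)"[^>]*>.*?</a>\.?\s*'
                  . '(?<description>.*)$#s';

    $listings = [];
    preg_match_all('#<p>(.*?)</p>#s', $html, $paragraphs);

    foreach ($paragraphs[1] as $paragraph) {
        if (!preg_match($entryPattern, $paragraph, $m)) {
            continue; // skip paragraphs that do not follow the listing format
        }
        $listings[] = [
            'name'        => cleanField($m['name']),
            'address'     => cleanField($m['address']),
            'phone'       => cleanField($m['phone']),
            'website'     => cleanField($m['website']),
            'description' => cleanField($m['description']),
        ];
    }

    return $listings;
}

print_r(parseListings($htmlContent));
```

Matching one paragraph at a time keeps a malformed entry from swallowing the next one, which a single greedy pattern over the whole page could do. For markup that is less regular than this, an HTML parser would likely be more robust than a regular expression.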


1 Comment

That <b>(?<name>.*)</b> should be (<strong>|<b>)(?<name>.*)(</strong>|</b>) if you want to match both <b> and <strong> tags.
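
A slightly stricter variant of this idea, given here only as a sketch, captures the tag name once and reuses it with a backreference so that the closing tag has to match the opening one. The sample strings are shortened versions of the entries in the question, and the variable names are chosen for illustration.

```php
<?php
// Matches a name wrapped in either <b>...</b> or <strong>...</strong>.
// \1 refers back to the tag name captured by the first group, so a
// mismatched pair such as <b>...</strong> does not match.
$namePattern = '#<(b|strong)>(?<name>.*?)</\1>#s';

// Shortened samples based on the question's entries.
$samples = [
    '<p><b>Ado’s Kitchen &amp; Bar&nbsp; </b>1143 13th St.</p>',
    '<p><strong>Blackbelly Market</strong> 1606 Conestoga St. #3</p>',
    '<p><b>Mismatched</strong> tags are skipped</p>',
];

$names = [];
foreach ($samples as $sample) {
    if (preg_match($namePattern, $sample, $m)) {
        // Decode entities and normalise non-breaking spaces before trimming.
        $decoded = html_entity_decode($m['name'], ENT_QUOTES | ENT_HTML5, 'UTF-8');
        $names[] = trim(str_replace("\u{00A0}", ' ', $decoded));
    }
}

print_r($names);
```

Because the name is read through the named group, the extra numbered group for the tag name does not change how the result is accessed.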
