
I have a page with approximately 10-15 links. All the links are under my control and end with a word such as celebrity. I want to extract all links ending with that word, so I have written this:

    $regex='|<a.*?href="(.*_celebrity)"|';

    preg_match_all($regex,$result21,$parts);
    $links=$parts[0];
    foreach($links as $link){
        echo $link."<br>";
        mysql_query ("INSERT INTO tablea(linkssas) VALUES ('$link')");
    }

It does the job and matches all links ending with _celebrity, but the output is not stored in the database the way I expect. Since I am using foreach, each link should be inserted in a separate row, but all the links end up in a single row. The stored value is also not a plain URL; it contains anchor markup, like http://xyz.com/edje/jjeieied_celebrity">A</a>

I want only the plain links in the database.

  • You should not use a regex to get the links, but DOMDocument instead. Please read: stackoverflow.com/questions/1732348/… Commented Feb 7, 2013 at 14:49
  • Sounds like a problem with a greedy regex. Really you want href="(.*?_celebrity)", but really you are better off using a proper DOM parser like DOMDocument or SimpleXML for this. Commented Feb 7, 2013 at 14:49
  • This sounds like a job for Tony The Pony..... Or better yet, read this, it's a good explanation of why you shouldn't try to parse HTML using regex. Commented Feb 7, 2013 at 14:55

2 Answers


I felt obliged to give you the DOMDocument tour:

$d = new DOMDocument();
$d->loadHTML($result21);

$suffix = "_celebrity"; $suffix_len = strlen($suffix);

foreach ($d->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');
    if ($href && substr($href, -$suffix_len) === $suffix) {
        // do your insert here
    }
}

Or, using XPath instead of getElementsByTagName:

$xp = new DOMXPath($d);

foreach($xp->query('//a[substring(@href, string-length(@href) - 9) = "_celebrity"]') as $node) {
    echo $node->getAttribute('href');
}
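As a self-contained variation on the DOMDocument loop above, here is a minimal sketch that collects the matching hrefs into an array. The helper name `extractLinksBySuffix` and the sample HTML are hypothetical and not part of the original answer. The `libxml_use_internal_errors()` calls are an addition, on the assumption that real-world pages are often not strictly valid and would otherwise make `loadHTML()` emit warnings.

```php
<?php
// Hypothetical helper (not from the original answer): returns every href
// of an <a> element in $html that ends with $suffix.
function extractLinksBySuffix(string $html, string $suffix): array
{
    // Collect libxml parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        // Compare the tail of the href with the suffix.
        if ($href !== '' && substr($href, -strlen($suffix)) === $suffix) {
            $links[] = $href;
        }
    }
    return $links;
}

// Made-up sample input; in the question the page source is in $result21.
$sample = '<p><a href="http://xyz.com/one_celebrity">A</a>'
        . '<a href="http://xyz.com/other">B</a></p>';
$found = extractLinksBySuffix($sample, '_celebrity');
```

Because the function returns plain URL strings, the loop that writes to the database can stay separate from the parsing.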

And here's a message from our chat room:

Please, don't use mysql_* functions in new code. They are no longer maintained and are officially deprecated. See the red box? Learn about prepared statements instead, and use PDO, or MySQLi - this article will help you decide which. If you choose PDO, here is a good tutorial.
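To make the prepared-statement advice concrete, here is a rough sketch of inserting one row per link with PDO. The table `tablea` and column `linkssas` come from the question. The function name `insertLinks` is hypothetical, and the in-memory SQLite connection is only a stand-in so the snippet runs without a MySQL server (it assumes the pdo_sqlite extension is available). With MySQL you would pass your own `mysql:` DSN and credentials instead.

```php
<?php
// Hypothetical helper: inserts each link as its own row and returns the
// number of rows written. $pdo is assumed to be an open PDO connection.
function insertLinks(PDO $pdo, array $links): int
{
    // Prepare once, execute many times; the placeholder keeps quotes in
    // a URL from breaking the SQL.
    $stmt = $pdo->prepare('INSERT INTO tablea (linkssas) VALUES (:link)');
    $count = 0;
    foreach ($links as $link) {
        $stmt->execute([':link' => $link]); // one execute() = one row
        $count++;
    }
    return $count;
}

// Stand-in connection for demonstration only (assumption, see above).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE tablea (linkssas TEXT)');

$inserted = insertLinks($pdo, [
    'http://xyz.com/a_celebrity',
    'http://xyz.com/b_celebrity',
]);
```

Setting `PDO::ERRMODE_EXCEPTION` means a failed insert throws instead of failing silently, which would likely have made the original problem easier to spot.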


3 Comments

And do the insert with PDO using bindParam
@jack It is not working. I tried adding a few lines to echo the value: $op7=''.$link->getAttribute('href').''; echo $op7;

You probably want to loop through $parts[1] instead of $parts[0].

http://php.net/manual/en/function.preg-match-all.php
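A small sketch of the difference, using made-up sample HTML in place of `$result21`: with the default flags, `preg_match_all` puts the full matched text in `$parts[0]` and the first capture group in `$parts[1]`. The pattern below is a tightened variant rather than the asker's original. It swaps `.*` for `[^"]*` so a match cannot run past the closing quote of the href.

```php
<?php
// Hypothetical sample input; the real page source is in $result21.
$html = '<a href="http://xyz.com/one_celebrity">A</a> '
      . '<a href="http://xyz.com/two_celebrity">B</a> '
      . '<a href="http://xyz.com/other">C</a>';

// [^"]* cannot cross a double quote, so each match stays inside one href.
$regex = '|<a[^>]*?href="([^"]*_celebrity)"|';

preg_match_all($regex, $html, $parts);

$fullMatches = $parts[0]; // whole matched text, including <a ... href="
$links       = $parts[1]; // first capture group only: the bare URL

foreach ($links as $link) {
    echo $link, "\n";
}
```

Looping over `$links` here yields only the URLs, which is what should go into the database column.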

1 Comment

I had to modify the regex, but using a regex in this case is bad practice. Using mysql_query() is also bad practice.
