I'm reading an HTML file and trying to extract some information from it. I've tried HTML parsers, but can't figure out how to use them to get the key text out. My original script reads the HTML file from disk, but this version is a minimal working example for Stack Overflow purposes.
#!/usr/bin/env perl
use 5.036;
use warnings FATAL => 'all';
use autodie ':default';
use Devel::Confess 'color';
sub regex_test ( $string, $regex ) {
    if ($string =~ m/$regex/s) {
        say "$string matches $regex";
    } else {
        say "$string doesn't match $regex";
    }
}
# the HTML text is $s
my $s = ' rs577952184 was merged into
<a target="_blank"
href="rs59222162">rs59222162</a>
';
regex_test ( $s, 'rs\d+ was merged into.*\<a target="_blank".+href="rs(\d+)/');
However, this doesn't match.
I think the problem is that the newline after "merged into" isn't being matched.
How can I alter the above regex to match $s?
In href="rs(\d+)/ the trailing / looks like a typo for ".
regex_test(). regex_test($s, 'rs\\d+ was merged...')
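The typo suggestion can be checked with a minimal sketch, assuming the trailing / in the pattern was meant to be the closing ". Because the match already uses the /s modifier, . matches newlines, so the line break after "merged into" is not what blocks the match. The helper name merged_target is hypothetical and not part of the original post, and the non-greedy quantifiers are a swapped-in choice to keep the match short.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper (not in the original post): returns the rs number
# that the first ID was merged into, or undef when the text doesn't match.
sub merged_target {
    my ($html) = @_;
    # /s lets "." match newlines, so the line breaks are already covered.
    # The pattern ends with the closing quote instead of the trailing "/".
    if ($html =~ m/rs\d+ was merged into.*?<a target="_blank".+?href="rs(\d+)"/s) {
        return $1;
    }
    return;
}

# Same sample text as in the question.
my $s = ' rs577952184 was merged into
<a target="_blank"
href="rs59222162">rs59222162</a>
';

say merged_target($s) // 'no match';    # prints 59222162
```

This only handles the one snippet shown in the question. Attribute order, quoting style, or extra whitespace in real pages could break it, which is the usual argument for an HTML parser over a regex.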