0

I've been using this in PHP:

preg_match_all('|<a href="http://www.example.com/photoid/(.*?)"><img src="(.*?)" alt="(.*?)" /></a>|is',$xx, $matches, PREG_SET_ORDER);

where $xx holds the entire web page as a string, to find every match.

This sets $matches to a two-dimensional array, which I can then loop over with a for statement based on the length of $matches, using for example:

$matches[$i][1], which would be the first (.*?)

$matches[$i][2], which would be the second (.*?)

and so on.

My question is: how can this be replicated in Java? I've been reading tutorials and blogs on Java regex and have been using Pattern and Matcher, but I can't figure it out. The matcher never finds anything, so my while (matcher.find()) loops have been futile, and I usually get an error saying no match has been found yet.

This is my Java code for the pattern to be matched:

String pattern = new String(
    "<a href=\"http://www.example.com/photoid/(w+)\"><img src=\"(w+)\" alt=\"(w+)\" /></a>");

I've also tried:

String pattern = new String(
    "<a href=\"http://www.example.com/photoid/(.*?)\"><img src=\"(.*?)\" alt=\"(.*?)\" /></a>");

and

String pattern = new String(
    "<a href=\"http://www.example.com/photoid/(\\w+)\"><img src=\"(\\w+)\" alt=\"(\\w+)\" /></a>");

No matches are ever found.

3
  • Check this stackoverflow.com/questions/3842112/… Commented Sep 3, 2013 at 16:09
  • @FaceOfJock do you think I should be using (\w+) instead of (.*?) Commented Sep 3, 2013 at 16:14
  • @TonyCruze I realize the www.mysite.com is just an example here but in case you didn't do so already, you should update your regex to quote the text that shouldn't be interpreted as regex. www.mysite.com will match wwwemysite.com also. Commented Sep 3, 2013 at 19:48

2 Answers

1

The regex you posted worked for me, so perhaps the fault is in how you use it:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String test = "<html>\n<a href=\"http://www.mysite.com/photoid/potato.html\"><img src=\"quack-quack\" alt=\"hi\" /></a>\n</html>";
// This is exactly the pattern code you posted:
String pattern = new String(
    "<a href=\"http://www.mysite.com/photoid/(.*?)\"><img src=\"(.*?)\" alt=\"(.*?)\" /></a>");

Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(test);
m.find(); // returns true; only after this may m.group(1) etc. be read

See the Java Tutorial for how this should be used.
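
To get closer to what preg_match_all with PREG_SET_ORDER returns, one option is to call find() in a loop and copy the groups of each match into a list. The following is only a minimal sketch, not the asker's actual code: the class name PhotoLinks, the method findAll, and the sample HTML are made up for illustration, and the dots in the host name are escaped here. CASE_INSENSITIVE and DOTALL are the Java counterparts of PHP's i and s modifiers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhotoLinks {
    // Pattern from the question (host dots escaped); flags mirror PHP's "is".
    private static final Pattern LINK = Pattern.compile(
        "<a href=\"http://www\\.mysite\\.com/photoid/(.*?)\"><img src=\"(.*?)\" alt=\"(.*?)\" /></a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns one String[] per match: index 0 is the whole match,
    // indexes 1..3 are the captured groups, like $matches[$i][1] in PHP.
    static List<String[]> findAll(String html) {
        List<String[]> matches = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) { // group() is only valid after find() returned true
            String[] row = new String[m.groupCount() + 1];
            for (int g = 0; g <= m.groupCount(); g++) {
                row[g] = m.group(g);
            }
            matches.add(row);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical sample input with two links.
        String html =
            "<a href=\"http://www.mysite.com/photoid/1\"><img src=\"a.jpg\" alt=\"x\" /></a>\n"
          + "<a href=\"http://www.mysite.com/photoid/2\"><img src=\"b.jpg\" alt=\"y\" /></a>";
        for (String[] row : findAll(html)) {
            System.out.println(row[1] + " | " + row[2] + " | " + row[3]);
        }
    }
}
```

Calling m.group(...) before find() has returned true throws an IllegalStateException, which may be the "no match found" error mentioned in the question.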


1 Comment

Yep! just figured it out. that was it!
1

Not an expert on Java, but shouldn't the strings escape double quotes and backslashes?

 "<a href=\"http://www.mysite.com/photoid/(.*?)\"><img src=\"(.*?)\" alt=\"(.*?)\" /></a>"
 or
 "<a\\ href=\"http://www.mysite.com/photoid/(.*?)\"><img\\ src=\"(.*?)\"\\ alt=\"(.*?)\"\\ /></a>"

1 Comment

Yeah, I believe I'm escaping correctly. Without escaping the quotes it does not compile. I'll update the question with the actual pattern I'm using.
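
On the escaping question raised in this answer, the two levels (Java string literal and regex) can be checked with a small sketch. It assumes a made-up sample link, and the class name EscapeDemo and helper found are hypothetical. In a Java string literal, \" yields a quote and \\ yields a single backslash for the regex engine, so "(w+)" matches only literal w characters while "(\\w+)" matches word characters. Since \w does not match '.', a group written as (\\w+) cannot capture a value like potato.html. Pattern.quote is one way to keep the dots in the host name from acting as wildcards.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // True if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Hypothetical sample input and the shared start of each pattern.
        String link = "<a href=\"http://www.mysite.com/photoid/potato.html\">";
        String prefix = "<a href=\"http://www.mysite.com/photoid/";

        // "(w+)" means one or more literal 'w' characters.
        System.out.println(found(prefix + "(w+)\">", link));   // false
        // "(\\w+)" is the regex \w+; it cannot cross the '.' in potato.html.
        System.out.println(found(prefix + "(\\w+)\">", link)); // false
        // "(.*?)" matches any characters, as few as possible.
        System.out.println(found(prefix + "(.*?)\">", link));  // true

        // An unquoted '.' is a wildcard; Pattern.quote makes the text literal.
        System.out.println(found("www.mysite.com", "wwwemysite.com"));                // true
        System.out.println(found(Pattern.quote("www.mysite.com"), "wwwemysite.com")); // false
    }
}
```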
