0
public String replace(){
    String[] parts = str.split("&([A-Za-z]+|[0-9]+|x[A-Fa-f0-9]+);");
    for (int i = 0; i < parts.length; i++) {
        System.out.println(parts[i]);

    }
    return "";
}

What exactly does this line do: String[] parts = str.split("&([A-Za-z]+|[0-9]+|x[A-Fa-f0-9]+);"); ? I tried it in my code, but it didn't do anything. Could someone give an example string so I can see how it splits?

  • &lt;text1&gt;&lt;text2&gt; Commented Apr 14, 2014 at 22:05
  • There are tons of sites like myregextester.com where you can throw all kinds of strings at a regex interactively. Commented Apr 14, 2014 at 22:09
  • Looks more like XML/HTML entities, hence the leading & and trailing ; Commented Apr 14, 2014 at 22:19
  • Where is the variable str defined? Commented Apr 14, 2014 at 22:56
  • The Stack Overflow Regular Expressions FAQ has a list of online regex testers, listed by flavor (at the bottom). Debuggex and regex101 are what I use. regex101 also has a replacement tester. Offline I use RegexBuddy. Commented Apr 15, 2014 at 0:32

2 Answers

2

Here is one example of a string that will be split by the regex you provided.

public class ReverseRegex {
    public static void main(String[] args) {
        String str = "hello &fjeaifjiajwta; world";
        // split() drops every match of the regex and returns the text around it.
        String[] parts = str.split("&([A-Za-z]+|[0-9]+|x[A-Fa-f0-9]+);");
        for (int i = 0; i < parts.length; i++) {
            System.out.println(parts[i]); // prints "hello " and then " world"
        }
    }
}

Here are a few more examples.

    String str = "hello &21342352352; world"; // Two pieces
    String str = "hello &xffea424242; world"; // Two pieces
    String str = "hello &xffea424242; world &hefiajeifjae; world"; // Three pieces.
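
To see the actual pieces for each of these strings, a small sketch like the following may help. The class name EntitySplitDemo and the helper splitOnEntities are made up for illustration; the regex is the one from the question. The last sample contains no match at all, which is one possible reason the split appeared to do nothing.

```java
import java.util.Arrays;

class EntitySplitDemo {
    // Regex from the question: '&', then letters, or digits, or 'x' plus hex digits, then ';'.
    static final String ENTITY_LIKE = "&([A-Za-z]+|[0-9]+|x[A-Fa-f0-9]+);";

    // Hypothetical helper: returns the text between matches; the matches themselves are discarded.
    static String[] splitOnEntities(String input) {
        return input.split(ENTITY_LIKE);
    }

    public static void main(String[] args) {
        String[] samples = {
            "hello &fjeaifjiajwta; world",
            "hello &21342352352; world",
            "hello &xffea424242; world",
            "hello &xffea424242; world &hefiajeifjae; world",
            "no entity here"
        };
        for (String sample : samples) {
            // Arrays.toString makes the piece boundaries (including spaces) visible.
            System.out.println(Arrays.toString(splitOnEntities(sample)));
        }
    }
}
```

When the input contains no match, split() returns a one-element array holding the whole string unchanged.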

0

The regex is apparently for a named or numbered HTML entity reference, but it's incomplete. It's missing the hash sign for the numbered entities and it doesn't allow for names with digits in them, like &sup2; and &frac14;. Here's what I would use:

"&(?:[a-zA-Z]+[0-9]*|#[0-9]+|#x[0-9a-fA-F]+);"

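As a rough check of the difference between the two patterns, the sketch below tests a few real entity references against both. The class name EntityRegexCompare and the helper matchesWhole are invented for this example; only the two regex strings come from the thread.

```java
import java.util.regex.Pattern;

class EntityRegexCompare {
    // Pattern from the question.
    static final Pattern ORIGINAL =
        Pattern.compile("&([A-Za-z]+|[0-9]+|x[A-Fa-f0-9]+);");
    // Pattern suggested in this answer.
    static final Pattern CORRECTED =
        Pattern.compile("&(?:[a-zA-Z]+[0-9]*|#[0-9]+|#x[0-9a-fA-F]+);");

    // Hypothetical helper: true only if the whole candidate is one match.
    static boolean matchesWhole(Pattern p, String candidate) {
        return p.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"&amp;", "&sup2;", "&frac14;", "&#228;", "&#xE4;"};
        for (String c : candidates) {
            System.out.println(c
                + " original=" + matchesWhole(ORIGINAL, c)
                + " corrected=" + matchesWhole(CORRECTED, c));
        }
    }
}
```

Of these five candidates, the original pattern accepts only &amp;, while the corrected one accepts all of them.
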
However, I don't see why you would want to use that regex with split(), which throws away whatever it matches and returns everything else. If you want to do something with the entities themselves, you'll most likely want to use find(). Here's an example that just collects the entities in a list:

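import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// s is the input string to scan
List<String> matchList = new ArrayList<String>();
Pattern p = Pattern.compile("&(?:[a-zA-Z]+[0-9]*|#[0-9]+|#x[0-9a-fA-F]+);");
Matcher m = p.matcher(s);
while (m.find()) {
    matchList.add(m.group()); // the whole matched entity, e.g. "&amp;"
}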
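
To put the split() versus find() distinction side by side, here is a minimal self-contained sketch. The class name SplitVersusFind, both helper methods, and the sample string are assumptions made for illustration; the pattern is the one suggested above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitVersusFind {
    static final Pattern ENTITY =
        Pattern.compile("&(?:[a-zA-Z]+[0-9]*|#[0-9]+|#x[0-9a-fA-F]+);");

    // split(): keeps the text around the entities and discards the entities.
    static List<String> textBetweenEntities(String s) {
        return Arrays.asList(ENTITY.split(s));
    }

    // find(): keeps the entities and ignores the text around them.
    static List<String> entities(String s) {
        List<String> found = new ArrayList<String>();
        Matcher m = ENTITY.matcher(s);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String s = "caf&eacute; &amp; cr&egrave;me";
        System.out.println(textBetweenEntities(s)); // [caf,  ,  cr, me]
        System.out.println(entities(s));            // [&eacute;, &amp;, &egrave;]
    }
}
```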

1 Comment

I'm just trying to figure out how to replace [ä, ü, ö] in a string with HTML escape codes, but the replacement needs to be really fast; not every character in the string should have to be checked for it.
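
For that narrower goal, a regex may not be needed at all. One possible approach, sketched below under the assumption that only the three lowercase umlauts matter, is a single pass that allocates a StringBuilder only once the first umlaut is seen. The class UmlautEscaper and method escapeUmlauts are hypothetical names; the entity names &auml;, &ouml; and &uuml; are the standard HTML ones. The pass still inspects each character once, but it does no other work for strings that contain no umlaut.

```java
class UmlautEscaper {
    // Hypothetical helper: replaces the lowercase umlauts with their named HTML entities.
    // Uppercase umlauts and other characters are left untouched (an assumption of this sketch).
    static String escapeUmlauts(String input) {
        StringBuilder out = null; // created lazily, only if a replacement is needed
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String entity;
            switch (c) {
                case '\u00E4': entity = "&auml;"; break; // a with diaeresis
                case '\u00F6': entity = "&ouml;"; break; // o with diaeresis
                case '\u00FC': entity = "&uuml;"; break; // u with diaeresis
                default:       entity = null;
            }
            if (entity != null) {
                if (out == null) {
                    // First hit: copy everything seen so far in one call.
                    out = new StringBuilder(input.length() + 16);
                    out.append(input, 0, i);
                }
                out.append(entity);
            } else if (out != null) {
                out.append(c);
            }
        }
        // No umlaut found: return the original string without copying.
        return out == null ? input : out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeUmlauts("M\u00FCller h\u00F6rt B\u00E4ren"));
    }
}
```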
