8

I need to extract the first integer found in a java.lang.String and am unsure whether to use a substring approach or a regex approach:

// Want to extract the 510 into an int.
String extract = "PowerFactor510";

// Either:
int num = Integer.valueOf(extract.substring(???));

// Or a regex solution, something like:
String regex = "\\d+";
Matcher matcher = new Matcher(regex);
int num = matcher.find(extract);

So I ask:

  • Which type of solution is more appropriate here, and why?
  • If the substring approach is more appropriate, what could I use to indicate the beginning of a number?
  • Else, if the regex is the appropriate solution, what is the regex/pattern/matcher/method I should use to extract the number?

Note: The string will always begin with the word PowerFactor followed by a non-negative integer. Thanks in advance!

3
  • 1
    Regex would be more advisable due to faster processing. Commented Mar 25, 2013 at 13:25
  • 3
    Is regex really faster than substring(11)? The first part is always fixed... I don't think that parsing a regex, going through the string and extracting the appropriate group would be quicker than to just chop off the first 11 chars... Commented Mar 25, 2013 at 13:26
  • docs.oracle.com/javase/6/docs/api/java/lang/… Commented Mar 25, 2013 at 13:28

2 Answers

9

The string will always begin with the word "PowerFactor" followed by a non-negative integer

This means you know exactly at which index the number starts. I would say you are better off using substring directly; in terms of performance, it should be much faster than searching and matching with a regex.

extract.substring("PowerFactor".length());
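To get an int rather than a String, the substring still has to be parsed. A minimal sketch, assuming the input always starts with "PowerFactor" as stated in the question; the class and method names (PowerFactorParser, parse) are made up for illustration:

```java
// Sketch only: PowerFactorParser and parse are hypothetical names, not from the question.
final class PowerFactorParser {
    private static final String PREFIX = "PowerFactor";

    // Drops the fixed prefix and parses whatever follows as an int.
    static int parse(String s) {
        if (!s.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Expected prefix " + PREFIX + ": " + s);
        }
        // parseInt throws NumberFormatException if the remainder is not a valid int.
        return Integer.parseInt(s.substring(PREFIX.length()));
    }

    public static void main(String[] args) {
        System.out.println(parse("PowerFactor510")); // prints 510
    }
}
```

The startsWith check is optional given the guarantee in the question, but it makes a malformed input fail with a clearer message than a bare NumberFormatException.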

I could not find any direct comparison, but you can read up on each of the two options.

1

I was a bit curious and tried the following:

String extract = "PowerFactor510";
long l = System.currentTimeMillis();
System.out.println(extract.replaceAll("\\D", ""));
System.out.println(System.currentTimeMillis() - l);

System.out.println();

l = System.currentTimeMillis();
System.out.println(extract.substring("PowerFactor".length()));
System.out.println(System.currentTimeMillis() - l);

And it turned out that the second test was much faster, so substring wins.

3 Comments

Why in the world would you put \D in brackets there?
@tchrist Edited the answer
That is a horrible test. The replaceAll method of the String class compiles the regex inline before processing it, so it is not a suitable test against Pattern/Matcher or anything else regex related. The speed differences you are seeing are related to object creation and GC in the JVM.
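For reference, a regex version that avoids the per-call compilation mentioned in the last comment could look roughly like this. It is a sketch, not a benchmark: the names FirstIntExtractor and firstInt are hypothetical, and the Pattern is compiled once in a static field so that only the matching work is repeated.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: FirstIntExtractor and firstInt are hypothetical names.
final class FirstIntExtractor {
    // Compiled once and reused, so the compile cost is not paid on every call.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in s as an int.
    static int firstInt(String s) {
        Matcher m = DIGITS.matcher(s);
        if (!m.find()) {
            throw new IllegalArgumentException("No digits found in: " + s);
        }
        // group() is the text of the match; parseInt throws if it overflows an int.
        return Integer.parseInt(m.group());
    }

    public static void main(String[] args) {
        System.out.println(firstInt("PowerFactor510")); // prints 510
    }
}
```

Unlike substring, this does not depend on the prefix being fixed, which is the main reason to prefer it if the input format might change.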
