23

I have a string that looks like "A=1.23;B=2.345;C=3.567"

I am only interested in "C=3.567"

What I have so far is:

    Matcher m = Pattern.compile("C=\\d+.\\d+").matcher("A=1.23;B=2.345;C=3.567");

    while(m.find()){ 
        double d = Double.parseDouble(m.group());
        System.out.println(d);
    }

The problem is that it shows the 3 as separate from the 567.

output:

3.0

567.0

I am wondering how I can include the decimal point so it outputs "3.567".

EDIT: I would also like to match C if it does not have a decimal point, so I would like to capture 3567 as well as 3.567.

Since the C= is built into the pattern as well, how can I strip it out before parsing the double?

1
  • 2
    A period (".") is not a digit. Commented Sep 9, 2010 at 23:20

5 Answers 5

43

I may be mistaken on this part, but the reason it's separating the two is that group() returns only the most recently matched subsequence, which is whatever was matched by the latest call to find(). Thanks, Mark Byers.

For sure, though, you can solve this by placing the entire part you want inside a "capturing group", which is done by placing it in parentheses. This makes it so that you can group together matched parts of your regular expression into one substring. Your pattern would then look like:

Pattern.compile("C=(\\d+\\.\\d+)")

For parsing either 3567 or 3.567, your pattern would be C=(\\d+(\\.\\d+)?), with group 1 holding the whole number. Also note that since you specifically want to match a period, you need to escape the . (period) character so that it's not interpreted as the "any-character" token. For this input, though, it doesn't matter.

Then, to get your 3.567, you would call m.group(1) to grab the first (counting from 1) specified group. This would mean that your Double.parseDouble call would essentially become Double.parseDouble("3.567").
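
To make the capturing-group approach concrete, here is a minimal sketch. The class name ExtractC and the helper extractC are made up for illustration; only Pattern, Matcher and Double.parseDouble come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractC {
    // "C=" followed by digits and an optional fractional part.
    // Group 1 holds the number only, without the "C=" prefix.
    private static final Pattern C_VALUE = Pattern.compile("C=(\\d+(\\.\\d+)?)");

    // Returns the value after "C=", or null if the input has no such field.
    static Double extractC(String input) {
        Matcher m = C_VALUE.matcher(input);
        if (m.find()) {
            return Double.parseDouble(m.group(1));
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extractC("A=1.23;B=2.345;C=3.567")); // 3.567
        System.out.println(extractC("A=1.23;B=2.345;C=3567"));  // 3567.0
    }
}
```

Because the number sits in group 1, there is no need to strip "C=" by hand before parsing.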

As for taking C= out of your pattern, since I'm not that well-versed with RegExp, I might recommend that you split your input string on the semi-colons and then check to see if each of the splits contain the C; then you could apply the pattern (with the capturing groups) to get the 3.567 from your Matcher.
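
The split-first idea could look roughly like the sketch below. The names SplitThenMatch and valueOf are hypothetical, and checking with startsWith (rather than a plain "contains") is my own assumption, chosen so that a key such as ABC is not mistaken for C.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitThenMatch {
    // Digits with an optional fractional part; group 1 is the whole number.
    private static final Pattern NUMBER = Pattern.compile("(\\d+(\\.\\d+)?)");

    // Splits on ';', finds the field that starts with "<key>=", and parses its value.
    // Returns null if the key is missing or its value is not a plain number.
    static Double valueOf(String input, String key) {
        String prefix = key + "=";
        for (String field : input.split(";")) {
            if (field.startsWith(prefix)) {
                Matcher m = NUMBER.matcher(field.substring(prefix.length()));
                if (m.matches()) {
                    return Double.parseDouble(m.group(1));
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(valueOf("A=1.23;B=2.345;C=3.567", "C")); // 3.567
    }
}
```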

Edit: For the more general (and likely more useful!) cases in gawi's comment, please use the following (from http://www.regular-expressions.info/floatingpoint.html):

Pattern.compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?")

This has support for optional sign, either optional integer or optional decimal parts, and optional positive/negative exponents. Insert capturing groups where desired to pick out parts individually. The exponent as a whole is in its own group to make it, as a whole, optional.
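
As a quick check of what the general pattern accepts, here is a small sketch that applies it to whole strings with matches(). The class and method names (FloatPattern, isFloat) are made up for this example.

```java
import java.util.regex.Pattern;

class FloatPattern {
    // Optional sign, optional integer part, optional dot,
    // at least one digit, then an optional exponent.
    private static final Pattern FLOAT =
            Pattern.compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?");

    // True only if the entire string has the shape described above.
    static boolean isFloat(String s) {
        return FLOAT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"10", ".1", "1.3e10", "1.2e-12", "1.41e+12", "10."};
        for (String s : samples) {
            System.out.println(s + " -> " + isFloat(s));
        }
    }
}
```

All samples except "10." are accepted: the pattern requires at least one digit after an optional dot, so a trailing decimal point with nothing after it is rejected.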


6 Comments

NOTE: The regexp does not deal with the following floats: 10 10. .1 1.3e10 1.2e-12 1.41e+12
@gawi Thank you :) I've updated the answer with a regular expression that should do the trick. Is 10. considered a valid float, with the decimal point but no digits after?
10. is a valid float literal in Java (well... 10.f to be exact)
I don't understand why you think using group() has something to do with the problem. He doesn't have any extra groups in his regular expression.
Your answer still does not support 10. (a trailing decimal point). I've adapted yours to match all of @gawi's requirements: [-+]?([0-9]+\\.?[0-9]*|[0-9]*\\.?[0-9]+)([eE][-+]?[0-9]+)?
8

Your regular expression is only matching numeric characters. To match the decimal point as well, you will need:

Pattern.compile("\\d+\\.\\d+")

The . is escaped because this would match any character when unescaped.

Note: this will then only match numbers with a decimal point which is what you have in your example.


2

To match any sequence of digits and dots you can change the regular expression to this:

"(?<=C=)[.\\d]+"

If you want to be certain that there is only a single dot you might want to try something like this:

"(?<=C=)\\d+(?:\\.\\d+)?"

You should also be aware that this pattern can match the 1.2 in ABC=1.2.3;. You should consider if you need to improve the regular expression to correctly handle this situation.
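
The sketch below shows the lookbehind pattern in use, along with the ABC=1.2.3; pitfall. The STRICT pattern is not from the answer: it is one possible tightening, assuming fields are separated by ';' and each field is exactly key=value. All class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindDemo {
    // Pattern from the answer: digits with an optional fraction, preceded by "C=".
    private static final Pattern LOOSE = Pattern.compile("(?<=C=)\\d+(?:\\.\\d+)?");

    // Hypothetical stricter variant: "C=" must begin a field (start of input or
    // right after ';') and the number must run to the end of that field.
    private static final Pattern STRICT =
            Pattern.compile("(?:^|;)C=(\\d+(?:\\.\\d+)?)(?=;|$)");

    static String loose(String input) {
        Matcher m = LOOSE.matcher(input);
        return m.find() ? m.group() : null;
    }

    static String strict(String input) {
        Matcher m = STRICT.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(loose("ABC=1.2.3;"));              // 1.2
        System.out.println(strict("ABC=1.2.3;"));             // null
        System.out.println(strict("A=1.23;B=2.345;C=3.567")); // 3.567
    }
}
```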


2

If you need to validate decimals that use either a dot or a comma as the separator, with an optional leading sign:

Object testObject = "-1.5";
boolean isDecimal = Pattern.matches("^[\\+\\-]{0,1}[0-9]+[\\.\\,][0-9]+$", (CharSequence) testObject);

Good luck.
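
One caveat when a comma is accepted as the separator: Double.parseDouble only understands a dot, so "1,5" passes this validation but throws NumberFormatException when parsed. A minimal sketch of one way around that, assuming the comma is always a decimal separator and never a thousands separator (DecimalValidator, isDecimal and parse are made-up names, and the pattern is a simplified equivalent of the one above):

```java
import java.util.regex.Pattern;

class DecimalValidator {
    // Optional sign, digits, one '.' or ',' separator, digits.
    private static final Pattern DECIMAL = Pattern.compile("[+-]?[0-9]+[.,][0-9]+");

    static boolean isDecimal(String s) {
        return DECIMAL.matcher(s).matches();
    }

    // Validates first, then normalizes ',' to '.' so Double.parseDouble accepts it.
    static double parse(String s) {
        if (!isDecimal(s)) {
            throw new NumberFormatException("Not a decimal: " + s);
        }
        return Double.parseDouble(s.replace(',', '.'));
    }

    public static void main(String[] args) {
        System.out.println(parse("-1.5")); // -1.5
        System.out.println(parse("1,5"));  // 1.5
    }
}
```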

3 Comments

Isn't the {1} implicit?
@cutter yeah, I don't remember why I added that {1}. Maybe to be more clear or because I was so noob with regex in 2015 xD
The only answer that treats negative (and positive with leading +) numbers correctly.
0

If you want a regex for input that might be a double or just an integer with no trailing ".0", you can use this (note the doubled backslashes, which a Java string literal requires):
Pattern.compile("(-?\\d+\\.?\\d*)")
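
For reference, a small sketch of how this pattern behaves on a few inputs when applied to the whole string. The class name IntOrDouble and the helper isNumber are made up for illustration.

```java
import java.util.regex.Pattern;

class IntOrDouble {
    // Optional minus sign, at least one digit, optional dot, optional fraction digits.
    private static final Pattern NUMBER = Pattern.compile("(-?\\d+\\.?\\d*)");

    // True only if the entire string matches the pattern.
    static boolean isNumber(String s) {
        return NUMBER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"3567", "3.567", "-42", "10.", ".5"};
        for (String s : samples) {
            System.out.println(s + " -> " + isNumber(s));
        }
    }
}
```

Here ".5" is rejected because at least one digit must come before the dot, while "10." is accepted because the fraction digits are optional.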

