118

I have a String variable (basically an English sentence with an unspecified number of numbers) and I'd like to extract all the numbers into an array of integers. I was wondering whether there was a quick solution with regular expressions?


I used Sean's solution and changed it slightly:

LinkedList<String> numbers = new LinkedList<String>();

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(line); 
while (m.find()) {
   numbers.add(m.group());
}
  • Are numbers surrounded by spaces or other characters? How are the numbers formatted: hexadecimal, octal, binary, or decimal? Commented Mar 2, 2010 at 22:38
  • I thought it was clear from the question: it's an English sentence with numbers. Moreover I was talking about an integer array, so what I was looking for were integers. Commented Mar 2, 2010 at 22:56

13 Answers

187
Pattern p = Pattern.compile("-?\\d+");
Matcher m = p.matcher("There are more than -2 and less than 12 numbers here");
while (m.find()) {
  System.out.println(m.group());
}

... prints -2 and 12.


-? matches a leading negative sign -- optionally. \d matches a digit, and we need to write \ as \\ in a Java String though. So, \d+ matches 1 or more digits.
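
Since the question asks for an array of integers, the loop above can be wrapped so that each match is parsed and collected. This is a minimal sketch; the class and method names (IntExtractor, extractInts) are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
class IntExtractor {
    private static final Pattern INT = Pattern.compile("-?\\d+");

    // Returns every integer found in the text, in order of appearance.
    static int[] extractInts(String text) {
        List<Integer> found = new ArrayList<>();
        Matcher m = INT.matcher(text);
        while (m.find()) {
            // parseInt throws NumberFormatException if a match does not fit in an int
            found.add(Integer.parseInt(m.group()));
        }
        int[] result = new int[found.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = found.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] ints = extractInts("There are more than -2 and less than 12 numbers here");
        System.out.println(Arrays.toString(ints)); // prints [-2, 12]
    }
}
```

Very long digit runs overflow int, so Long.parseLong or BigInteger may be a better fit for arbitrary text.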


5 Comments

Could you complement your answer by explaining your regular expression please?
-? matches a leading negative sign -- optionally. \d matches a digit, and we need to write \ as \\ in a Java String though. So, \\d+ matches 1 or more digits.
I changed my expression to Pattern.compile("-?[\\d\\.]+") to support floats. You definitely lead me on the way, Thx!
This method detects digits but does not detect formatted numbers, e.g. 2,000. For those, use -?\\d+,?\\d+|-?\\d+
That only supports a single comma, so would miss "2,000,000". It also accepts strings like "2,00". If comma separators must be supported, then: -?\\d+(,\\d{3})* should work.
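
A sketch of the comma-separator pattern from the last comment, with the separators stripped before parsing. The class name GroupedIntExtractor and the sample sentence are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
class GroupedIntExtractor {
    // Digits, optionally followed by comma-separated groups of exactly three digits.
    private static final Pattern GROUPED = Pattern.compile("-?\\d+(,\\d{3})*");

    static List<Long> extract(String text) {
        List<Long> values = new ArrayList<>();
        Matcher m = GROUPED.matcher(text);
        while (m.find()) {
            // Remove the grouping commas so Long.parseLong accepts the match
            values.add(Long.parseLong(m.group().replace(",", "")));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(extract("It cost 2,000,000 and -1,500 for 12 items"));
        // prints [2000000, -1500, 12]
    }
}
```

The pattern does not check that the leading group has at most three digits, so input such as 12345,678 is still accepted as one number.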
55

What about using the replaceAll method of java.lang.String:

    String str = "qwerty-1qwerty-2 455 f0gfg 4";      
    str = str.replaceAll("[^-?0-9]+", " "); 
    System.out.println(Arrays.asList(str.trim().split(" ")));

Output:

[-1, -2, 455, 0, 4]

Description

[^-?0-9]+
  • [ and ] delimit a character set; the set matches exactly one character, and the order of the characters inside it does not matter
  • ^ at the start of the set negates it, so the set matches any character not listed instead of any character listed
  • + repeats the set between one and unlimited times, as many times as possible, giving back as needed
  • -? inside the set means the two literal characters “-” and “?” (not an optional minus sign)
  • 0-9 is any character in the range “0” to “9”

2 Comments

Why would you want to keep question marks? Also, this treats - by itself as a number, along with things like 9-, ---6, and 1-2-3.
A very nice alternative without importing libraries ;)
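
The pitfalls raised in the first comment can be checked directly. The sketch below runs the same replaceAll/split steps on a made-up sentence and then keeps only tokens that are well-formed integers; the class and method names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, named for illustration only.
class CharClassPitfalls {
    // Same replaceAll/split steps as the answer: keep only '-', '?' and digits.
    static List<String> tokens(String text) {
        String cleaned = text.replaceAll("[^-?0-9]+", " ").trim();
        return Arrays.asList(cleaned.split(" "));
    }

    // Keeps only tokens that are an optional minus sign followed by digits.
    static List<String> validInts(List<String> tokens) {
        List<String> valid = new ArrayList<>();
        for (String t : tokens) {
            if (t.matches("-?\\d+")) {
                valid.add(t);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        List<String> all = tokens("9- then ---6 and 1-2-3 ok? but -7 is fine");
        System.out.println(all);            // prints [9-, ---6, 1-2-3, ?, -7]
        System.out.println(validInts(all)); // prints [-7]
    }
}
```

Without the extra filter, tokens such as 9- or ? would make Integer.parseInt throw.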
20
Pattern p = Pattern.compile("[0-9]+");
Matcher m = p.matcher(myString);
while (m.find()) {
    int n = Integer.parseInt(m.group());
    // append n to list
}
// convert list to array, etc

You can actually replace [0-9] with \d, but that involves double backslash escaping, which makes it harder to read.

2 Comments

Whoops. Sean's handles negative numbers, so that's an improvement.
yours will handle negative numbers too if you use "-?[0-9]+"
10
  StringBuffer sBuffer = new StringBuffer();
  Pattern p = Pattern.compile("[0-9]+.[0-9]*|[0-9]*.[0-9]+|[0-9]+");
  Matcher m = p.matcher(str);
  while (m.find()) {
    sBuffer.append(m.group());
  }
  return sBuffer.toString();

This extracts numbers while retaining the decimal part.

2 Comments

Doesn't handle negatives
I think the dot should be escaped: "(-?[0-9]+\\.[0-9]*|-?[0-9]*\\.[0-9]+|-?[0-9]+)". To handle negative values, just add -?.
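
To see why the escaping matters, the sketch below compares the unescaped pattern from the answer with the escaped, sign-aware pattern from the comment. The class name, method names, and sample strings are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for illustration only.
class DecimalExtractor {
    // Pattern from the answer: the unescaped '.' matches any character.
    static final Pattern UNESCAPED =
            Pattern.compile("[0-9]+.[0-9]*|[0-9]*.[0-9]+|[0-9]+");
    // Pattern from the comment: literal dot, optional leading minus sign.
    static final Pattern ESCAPED =
            Pattern.compile("(-?[0-9]+\\.[0-9]*|-?[0-9]*\\.[0-9]+|-?[0-9]+)");

    // Returns the first match of the pattern, or null if there is none.
    static String firstMatch(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    // Parses every match of the escaped pattern as a double.
    static List<Double> extractDecimals(String text) {
        List<Double> values = new ArrayList<>();
        Matcher m = ESCAPED.matcher(text);
        while (m.find()) {
            values.add(Double.parseDouble(m.group()));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(UNESCAPED, "1a2")); // prints 1a2
        System.out.println(firstMatch(ESCAPED, "1a2"));   // prints 1
        System.out.println(extractDecimals("Temp fell from 3.5 to -0.25, then .5 over 10 days"));
        // prints [3.5, -0.25, 0.5, 10.0]
    }
}
```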
8

The accepted answer detects digits but does not detect formatted numbers, e.g. 2,000, nor decimals, e.g. 4.8. For those, use -?\\d+(,\\d+)*?\\.?\\d+?:

Pattern p = Pattern.compile("-?\\d+(,\\d+)*?\\.?\\d+?");
List<String> numbers = new ArrayList<String>();
Matcher m = p.matcher("Government has distributed 4.8 million textbooks to 2,000 schools");
while (m.find()) {  
    numbers.add(m.group());
}   
System.out.println(numbers);

Output: [4.8, 2,000]

3 Comments

@JulienS.: I disagree. This regex does much more than the OP asked for, and it does it incorrectly. (At the least, the decimal portion should be in an optional group, with everything in it required and greedy: (?:\.\d+)?.)
You certainly have a point there for the decimal portion. However it is very common to encounter formatted numbers.
@AlanMoore many visitors to SO are looking for any/different ways to resolve issues with varying similarity/difference, and it is helpful that suggestions are brought up. Even the OP might have oversimplified.
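
One way to apply the optional decimal group suggested in the first comment is sketched below. The revised pattern is an assumption that combines that suggestion with three-digit comma groups; it is not taken from the answer. The sketch also shows that the answer's pattern needs at least two digits, because both \d+ and \d+? must match one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for illustration only.
class FormattedNumberFinder {
    // Pattern from the answer.
    static final String ORIGINAL = "-?\\d+(,\\d+)*?\\.?\\d+?";
    // Assumed revision: three-digit comma groups, then an optional greedy decimal part.
    static final String REVISED = "-?\\d+(?:,\\d{3})*(?:\\.\\d+)?";

    // Returns every match of the regex in the text, in order.
    static List<String> find(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String s = "Government has distributed 4.8 million textbooks to 2,000 schools";
        System.out.println(find(REVISED, s));           // prints [4.8, 2,000]
        System.out.println(find(ORIGINAL, "5 apples")); // prints []
        System.out.println(find(REVISED, "5 apples"));  // prints [5]
    }
}
```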
5

For rational numbers, use this one: (([0-9]+.[0-9]*)|([0-9]*.[0-9]+)|([0-9]+))

1 Comment

The OP said integers, not real numbers. Also, you forgot to escape the dots, and none of those parentheses are necessary.
5

Using Java 8, you can do:

String str = "There 0 are 1 some -2-34 -numbers 567 here 890 .";
int[] ints = Arrays.stream(str.replaceAll("-", " -").split("[^-\\d]+"))
                 .filter(s -> !s.matches("-?"))
                 .mapToInt(Integer::parseInt).toArray();
System.out.println(Arrays.toString(ints)); // prints [0, 1, -2, -34, 567, 890]

If you don't have negative numbers, you can get rid of the replaceAll (and use !s.isEmpty() in filter), as that's only to properly split something like 2-34 (this can also be handled purely with regex in split, but it's fairly complicated).

Arrays.stream turns our String[] into a Stream<String>.

filter gets rid of the leading and trailing empty strings as well as any - that isn't part of a number.

mapToInt(Integer::parseInt).toArray() calls parseInt on each String to give us an int[].


Alternatively, Java 9 has a Matcher.results method, which should allow for something like:

Pattern p = Pattern.compile("-?\\d+");
Matcher m = p.matcher("There 0 are 1 some -2-34 -numbers 567 here 890 .");
int[] ints = m.results().map(MatchResult::group).mapToInt(Integer::parseInt).toArray();
System.out.println(Arrays.toString(ints)); // prints [0, 1, -2, -34, 567, 890]

As it stands, neither of these is a big improvement over just looping over the results with Pattern / Matcher as shown in the other answers, but it should be simpler if you want to follow this up with more complex operations which are significantly simplified with the use of streams.


1

Extract all real numbers using this.

public static ArrayList<Double> extractNumbersInOrder(String str) {
    ArrayList<Double> list = new ArrayList<Double>();
    StringBuilder singleNum = new StringBuilder();
    // Append a non-numeric sentinel so a number at the very end is flushed too
    for (char c : (str + 'a').toCharArray()) {
        if (isNumber(c)) {
            singleNum.append(c);
        } else if (singleNum.length() > 0) {  // number ended
            try {
                list.add(Double.valueOf(singleNum.toString()));
            } catch (NumberFormatException e) {
                // Runs such as "-", "." or "1-2" are not valid numbers; skip them
            }
            singleNum.setLength(0);
        }
    }
    return list;
}

// True for characters that may appear in a signed decimal number
public static boolean isNumber(char c) {
    return Character.isDigit(c) || c == '-' || c == '+' || c == '.';
}


1

Fraction and grouping characters for representing real numbers may differ between languages. The same real number could be written in very different ways depending on the language.

The number two million in English

2,000,000.00

and in German

2.000.000,00

A method to fully extract real numbers from a given string in a language agnostic way:

public List<BigDecimal> extractDecimals(final String s, final char fraction, final char grouping) {
    List<BigDecimal> decimals = new ArrayList<BigDecimal>();
    // Remove grouping characters for easier regexp extraction
    StringBuilder noGrouping = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        // A grouping character only counts as one when it sits between two digits
        boolean isValidGroupingChar = c == grouping
                && i > 0 && Character.isDigit(s.charAt(i - 1))
                && i + 1 < s.length() && Character.isDigit(s.charAt(i + 1));
        if (!isValidGroupingChar) {
            noGrouping.append(c);
        }
    }
    // Quote the fraction character, since '.' and others are special in regular expressions
    String fractionRegex = Pattern.quote(String.valueOf(fraction));
    Pattern p = Pattern.compile("-?(\\d+" + fractionRegex + "\\d+|\\d+)");
    Matcher m = p.matcher(noGrouping);
    while (m.find()) {
        // BigDecimal expects '.' as the decimal separator
        String match = m.group().replace(fraction, '.');
        decimals.add(new BigDecimal(match));
    }
    return decimals;
}
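
For comparison, the JDK's java.text.NumberFormat already knows the separators of each locale. The sketch below is a different technique from the method above: it parses a single, already isolated number and does not scan a sentence. The class name LocaleParse is an assumption for illustration.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical demo class, named for illustration only.
class LocaleParse {
    // Parses one number written with the separators of the given locale.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            // Rethrow unchecked so callers do not need a throws clause
            throw new IllegalArgumentException("Not a number in " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2.000.000,00", Locale.GERMANY)); // German separators
        System.out.println(parse("2,000,000.00", Locale.US));      // English separators
    }
}
```

Both calls should yield the same value, two million. Note that NumberFormat.parse stops at the first character it cannot interpret, so trailing text is silently ignored.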


1

If you want to exclude numbers that are contained within words, such as bar1 or aa1bb, then add word boundaries \b to any of the regex based answers. For example:

Pattern p = Pattern.compile("\\b-?\\d+\\b");
Matcher m = p.matcher("9There 9are more9 th9an -2 and less than 12 numbers here9");
while (m.find()) {
  System.out.println(m.group());
}

displays:

2
12
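
Note that the leading \b drops the minus sign of -2, because there is no word boundary between a space and "-". A possible variant, sketched below under the assumption that a lookbehind is acceptable, replaces the leading \b with (?<!\w) so the sign is kept. The class name is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for illustration only.
class BoundedIntFinder {
    // No word character directly before the optional sign, word boundary after the digits.
    private static final Pattern STANDALONE = Pattern.compile("(?<!\\w)-?\\d+\\b");

    static List<String> find(String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = STANDALONE.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(find("9There 9are more9 th9an -2 and less than 12 numbers here9"));
        // prints [-2, 12]
    }
}
```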


1

I would suggest checking the ASCII values to extract numbers from a String. Suppose the input String is myname12345 and you want to extract just the digits 12345. First convert the String to a character array, then use the following code:

    char[] a = input.toCharArray();   // input is the String to scan, e.g. "myname12345"
    for (int i = 0; i < a.length; i++)
    {
        // '0' is 48 and '9' is 57 in ASCII
        if (a[i] >= '0' && a[i] <= '9')
            System.out.print(a[i]);
    }

Once the numbers are extracted, append them to an array.

Hope this helps

1 Comment

A Java string is a counted sequence of Unicode/UTF-16 code units. By the design of UTF-16, the first 128 characters have the same value (but not the same size) as their ASCII encoding. Beyond that, assuming you are dealing with ASCII will lead to errors.
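
The range limits and the Unicode point from the comment can be checked with a few calls. In this sketch, isAsciiDigit is a made-up helper; Character.isDigit and the default behaviour of \d are standard JDK behaviour.

```java
// Hypothetical demo class, named for illustration only.
class DigitChecks {
    // True only for the ASCII digits '0' (48) through '9' (57).
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        char arabicThree = '\u0663'; // ARABIC-INDIC DIGIT THREE
        System.out.println(isAsciiDigit('7'));                       // prints true
        System.out.println(isAsciiDigit(':'));                       // prints false, ':' is code 58
        System.out.println(isAsciiDigit(arabicThree));               // prints false
        System.out.println(Character.isDigit(arabicThree));          // prints true
        System.out.println(String.valueOf(arabicThree).matches("\\d")); // prints false
    }
}
```

By default the regex class \d only matches 0-9, so the regex-based answers and Character.isDigit disagree on non-ASCII digits unless Pattern.UNICODE_CHARACTER_CLASS is set.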
0

I found this expression the simplest:

String[] extractednums = msg.split("\\D+");
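
A sketch of the split approach, assuming the intended pattern is \D+ (one or more non-digits). When the text starts with a non-digit, split returns a leading empty string, so the sketch filters it out before parsing. Minus signs count as non-digits here, so negative numbers lose their sign. The class name SplitDigits is made up for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class, named for illustration only.
class SplitDigits {
    static int[] extract(String msg) {
        return Arrays.stream(msg.split("\\D+"))
                .filter(s -> !s.isEmpty())   // drop the leading empty string, if any
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString("abc12def345".split("\\D+"))); // prints [, 12, 345]
        System.out.println(Arrays.toString(extract("abc12def345")));      // prints [12, 345]
    }
}
```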


0
public static String extractNumberFromString(String number) {
    // Drop every non-digit character; all remaining digits are concatenated
    return number.replaceAll("[^0-9]+", "");
}

Extracts only the digits from a string, joined into a single string.
