4

I have a string of numbers in a slightly odd format. The source I'm pulling from uses non-standard formatting, and I'm trying to switch from a .split call, where I have to specify the exact delimiter (2 spaces, 3 spaces, etc.), to a replaceAll with a regex.

My data looks like this:

23574     123451    81239   1234    19274  4312457     1234719

I want to end up with

23574,xxxxx,xxxxx,xxxx 

so that I can just call String.split on the comma.

1
  • Why do you want to split it, join back to a string, and split again? Commented Jan 30, 2014 at 7:54

4 Answers

13

I would use the \s regex, which matches any whitespace character.

This is how to use it in Java:

String[] numbers = myString.split("\\s+");
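A minimal, self-contained sketch of this approach, run against the sample line from the question. The class name WhitespaceSplit and the helper splitOnWhitespace are made up for illustration, and the trim() call is an addition: without it, input that starts with whitespace would yield an empty first element.

```java
import java.util.Arrays;

public class WhitespaceSplit {
    // Hypothetical helper: trims the line, then splits on runs of whitespace.
    static String[] splitOnWhitespace(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String line = "23574     123451    81239   1234    19274  4312457     1234719";
        String[] numbers = splitOnWhitespace(line);
        // prints [23574, 123451, 81239, 1234, 19274, 4312457, 1234719]
        System.out.println(Arrays.toString(numbers));
    }
}
```

Because the split already returns an array, there is no need to build a comma-separated string first and split it again.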

4 Comments

This worked out amazingly well, thank you so much for your help.
As an aside, do you know of any decent tutorials on learning regex? It seems like a ton of people know them, but never have a concrete source on where to read up about them.
docs.oracle.com/javase/tutorial/essential/regex and vogella.com/articles/JavaRegularExpressions/article.html are the ones I have bookmarked for any regex-related question.
That looks pretty interesting.
3
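// Split on single spaces; trimResults() and omitEmptyStrings() discard the empty pieces between repeated spaces
final Iterable<String> splitted = Splitter.on(' ').trimResults().omitEmptyStrings().split(input);
final String output = Joiner.on(',').join(splitted);

This uses Guava's Splitter and Joiner.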


2
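import java.util.regex.Matcher;
import java.util.regex.Pattern;

String pattern = "\\s+";
Pattern regex = Pattern.compile(pattern);
Matcher match = regex.matcher(inputString);
// replaceAll returns the new string; it does not modify the matcher or the input
String stringToSplit = match.replaceAll(",");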

I think that should do it for you. If not, looking up the Matcher and Pattern classes in the Java API documentation will be very helpful.
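For the comma-separated form the question asks for, a shorter sketch is possible with String.replaceAll, which takes a regular expression directly. The class name CommaJoin and the helper toCommaSeparated are hypothetical, and the trim() call is an assumption added so that leading or trailing whitespace does not turn into stray commas.

```java
public class CommaJoin {
    // Hypothetical helper: collapses each run of whitespace into a single comma.
    static String toCommaSeparated(String line) {
        return line.trim().replaceAll("\\s+", ",");
    }

    public static void main(String[] args) {
        String line = "23574     123451    81239   1234    19274  4312457     1234719";
        // prints 23574,123451,81239,1234,19274,4312457,1234719
        System.out.println(toCommaSeparated(line));
    }
}
```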


0

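I understand this problem as extracting integers from a string in which they are separated by blank characters (not only spaces).

The accepted solution splits on whitespace only (\s covers spaces, tabs, and newlines), so it returns an empty first element if the string starts with whitespace, and it does not handle any other separator character.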

If we define an integer as a sequence of digits, the best way to solve this is with a simple regular expression. The Java 8 Pattern API says that \D represents any non-digit character:

\D  A non-digit: [^0-9]

Since String.split() accepts a regular expression describing the separators, you can pass "\\D+" to it on a trimmed string and get the result in one shot, like this:

String source = "23574     123451    81239   1234    19274  4312457     1234719";
String trimmed = source.trim();
String[] numbers = trimmed.split("\\D+");

This reads as: split the trimmed string wherever a run of one or more non-digit characters occurs.
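A small sketch of how the \D+ split behaves on a few inputs, including two cases worth knowing about. The class name DigitSplit, the helper splitOnNonDigits, and the extra sample strings are made up for illustration; only the first line of the question's data comes from the original.

```java
import java.util.Arrays;

public class DigitSplit {
    // Hypothetical helper: trims whitespace, then splits on runs of non-digits.
    static String[] splitOnNonDigits(String source) {
        return source.trim().split("\\D+");
    }

    public static void main(String[] args) {
        // Tabs, commas, and newlines all count as separators.
        // prints [23574, 123451, 81239]
        System.out.println(Arrays.toString(splitOnNonDigits("23574\t123451, 81239\n")));

        // A decimal point is a non-digit too, so "3.14" is split in two.
        // prints [3, 14]
        System.out.println(Arrays.toString(splitOnNonDigits("3.14")));

        // trim() only removes whitespace, so a leading sign still
        // produces an empty first element.
        // prints [, 5, 7]
        System.out.println(Arrays.toString(splitOnNonDigits("-5 7")));
    }
}
```

So this pattern suits unsigned integers; for signed or decimal values, splitting on whitespace is the safer choice.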
