5

I have written a regex that removes everything from the first occurrence of a comma or hash (, or #) onward:

String number = "(123) (456) (7890)#123";
number = number.replaceAll("[,#](.*)", ""); //This is the 1st regex

Then a second regex keeps only the digits (it removes spaces and other non-numeric characters):

number = number.replaceAll("[^0-9]+", ""); //This is the 2nd regex

Output: 1234567890

How can I merge the two regexes into one, like piping the output of the first regex into the second?

4
  • number.match("\\d+") Commented Feb 8, 2016 at 9:07
  • Try [,#].*$|[^#,0-9]+ Commented Feb 8, 2016 at 9:10
  • @WiktorStribiżew This seems to work, but (1) not sure it will work in the general case, and (2) should point out that | is not a "pipe" symbol in this case, but an or (just because OP asked about "piping" the regexes) Commented Feb 8, 2016 at 9:17
  • @tobias_k: Since the question is not that clear (too few example inputs, too generic regex used) I am not posting any answer. I just suggested some possible solution, thanks for "decyphering" it, but if OP says my solution is working, I will post with all explanations. Commented Feb 8, 2016 at 9:38

2 Answers

2

You can chain the two replaceAll calls into a single statement:

String number = "(123) (456) (7890)#123";
number = number.replaceAll("[,#](.*)", "").replaceAll("[^0-9]+", "");

3 Comments

While this does not "merge the two regex into one", I think this is about as close to "piping the O/p from first regex to the second" as it gets.
Note that "piping" != "chaining" (although it might be, depending on what OP meant by "piping"). Here, two replaceAll methods are chained. Anyway, I am not sure what is necessary here; OPs are likely to misuse terminology.
@WiktorStribiżew By piping I meant that I want to run the first regex to remove the characters after # and , and then pass its output to the second regex to remove any non-numeric characters, i.e. apply them sequentially. Your solution seems to work. Just want to confirm: is tobias_k saying that | here is an "or" operation and not a pipe?
1

So you need to remove all symbols other than digits, plus the whole rest of the string after the first hash symbol or comma.

You cannot just concatenate the patterns with the | operator because one of the patterns is anchored implicitly at the end of the string.
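To illustrate, here is a minimal sketch that joins the two patterns from the question with | without changing them. The class and method names are made up for this example. Because [^0-9]+ also matches # and ,, it can consume the hash before the first alternative ever gets a chance to match, so the digits after it survive.

```java
// Illustrative only: class and method names are not from the question.
class NaiveAlternationDemo {

    // The two original patterns joined with | and otherwise unchanged.
    static String naiveMerge(String input) {
        return input.replaceAll("[,#](.*)|[^0-9]+", "");
    }

    // The original two-step approach from the question, for comparison.
    static String twoStep(String input) {
        return input.replaceAll("[,#](.*)", "").replaceAll("[^0-9]+", "");
    }

    public static void main(String[] args) {
        String number = "(123) (456) (7890)#123";
        // ")#" is swallowed by [^0-9]+, so the trailing 123 is kept.
        System.out.println(naiveMerge(number)); // 1234567890123
        System.out.println(twoStep(number));    // 1234567890
    }
}
```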

The regex engine processes the string from left to right, so the pattern that strips non-digits must also leave hashes and commas alone (i.e. [^#,0-9]+). Then you can add the alternative that matches a comma or hash together with any text after it. Use the DOTALL modifier in case you have newline symbols in your input.

Use

 (?s)[,#].*$|[^#,0-9]+
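As a rough sketch of how this pattern could be used from Java (the class name, constant, and clean method are assumptions made for this example), the string literal needs no extra escaping because the pattern contains no backslashes:

```java
import java.util.regex.Pattern;

// Illustrative only: class, constant, and method names are hypothetical.
class CombinedRegexDemo {

    // (?s) enables DOTALL so that .* also runs across newlines.
    private static final Pattern CLEANUP =
            Pattern.compile("(?s)[,#].*$|[^#,0-9]+");

    // Removes everything from the first , or # onward, plus all non-digits before it.
    static String clean(String input) {
        return CLEANUP.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(clean("(123) (456) (7890)#123")); // 1234567890
        System.out.println(clean("(12)#3\n45"));             // 12
    }
}
```

Compiling the pattern once is only a convenience; number.replaceAll("(?s)[,#].*$|[^#,0-9]+", "") should give the same result for a one-off call.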

1 Comment

Ok. Got it. Thanks!!
