4

I have this List in Scala:

List[String] = List([[aaa|bbb]], [[ccc|ddd]], [[ooo|sss]])

I want to obtain the same List, but with the | and the substring between | and ] removed from each element.

So the result would be:

List[String] = List([[aaa]], [[ccc]], [[ooo]])

I tried joining the List into a single String and using replaceAll, but I want to keep the result as a List.

Thanks.

3 Answers

5

Here is a simple solution that should be quite good in performance:

val list = List("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
list.map(str => str.takeWhile(_ != '|') + "]]" )

It assumes that the format of the strings is:

  • Two left square brackets [ at the beginning,
  • then the word we want to extract,
  • and then a pipe |.
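
These assumptions matter: if a string contains no pipe, takeWhile returns the whole string, and appending "]]" would then double the closing brackets. A minimal sketch of a guarded variant follows; the object name TakeWhileSketch and the helper stripAfterPipe are hypothetical names introduced only for illustration.

```scala
object TakeWhileSketch {
  // Hypothetical helper: keep the text before the first '|' and re-append "]]".
  // Strings without a '|' are returned unchanged instead of gaining an extra "]]".
  def stripAfterPipe(s: String): String =
    if (s.contains("|")) s.takeWhile(_ != '|') + "]]" else s

  def main(args: Array[String]): Unit = {
    val list = List("[[aaa|bbb]]", "[[ccc]]", "[[ooo|sss]]")
    println(list.map(stripAfterPipe)) // List([[aaa]], [[ccc]], [[ooo]])
  }
}
```

The guard costs one extra scan per string; if every element is known to contain a pipe, the one-liner above is enough.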

3 Comments

This answer may be too brief. In addition to providing an MCVE, can you explain your solution? From How do I write a good answer?: "…try to mention any limitations, assumptions or simplifications in your answer. Brevity is acceptable, but fuller explanations are better."
Clean and efficient, but not the result that the OP has requested.
You are right, I misread the question, sorry about that. I have edited the answer.
4

You can use a simple \|.*?]] regex to match these substrings you need to remove.

Here is a way to perform the replacement in Scala code:

val l = List[String]("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
println(l.map(x => x.replaceAll("""\|.*?(]])""", "$1"))) 

See the Scala demo

I added a capturing group around ]] and used a $1 backreference in the replacement pattern to insert the ]] back into the result.

Details:

  • \| - a literal | pipe symbol (since it is a special char outside of a character class, it must be escaped)
  • .*? - any zero or more symbols other than line break symbols, as few as possible (the ? makes the quantifier lazy)
  • (]]) - Group 1 capturing ]] substring (note that ] outside of a character class does not need escaping, it is just the opposite of the case with |).
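
The lazy quantifier only makes a difference when one string holds more than one bracketed item. The sketch below contrasts it with a greedy variant; the greedy pattern, the sample line, and the names LazyVsGreedy, stripLazy and stripGreedy are assumptions added for illustration, not part of the answer.

```scala
object LazyVsGreedy {
  // Lazy quantifier (the pattern from the answer): stops at the first "]]" after each pipe.
  def stripLazy(s: String): String = s.replaceAll("""\|.*?(]])""", "$1")

  // Greedy quantifier, shown only for contrast: runs on to the last "]]" on the line.
  def stripGreedy(s: String): String = s.replaceAll("""\|.*(]])""", "$1")

  def main(args: Array[String]): Unit = {
    val line = "[[aaa|bbb]] and [[ccc|ddd]]"
    println(stripLazy(line))   // [[aaa]] and [[ccc]]
    println(stripGreedy(line)) // [[aaa]]
  }
}
```

For the single-item strings in the question both variants give the same result.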

4 Comments

You don't need a capture group if you replaceAll("""\|[^\]]+""", ""), and it's still accurate if the level of [] nesting changes.
@jwvh: I know I could replace with ]]. It is just a way to show what regex can do.
Granted, but if the string is "[aa|bb]" or maybe "[[[x|y]]]" then your solution (with or without the capture group) won't balance each [ with a closing ].
@jwvh: There is no need balancing, see the OP: the format is fixed, [[ + some chars other than |, then |, then some chars up to the first ]]. If balanced brackets had been mentioned, I would have never suggested a Java regex solution unless the recursion depth were just 1 or 2 levels.
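
To make the discussion above concrete, here is a small sketch that runs both patterns on the inputs mentioned in the comments. The names NestingCheck, withGroup and withNegatedClass are hypothetical; the two regexes are copied from the answer and from the first comment.

```scala
object NestingCheck {
  // Pattern from the answer: needs a literal "]]" somewhere after the pipe.
  def withGroup(s: String): String = s.replaceAll("""\|.*?(]])""", "$1")

  // Pattern from the first comment: drops the pipe and everything up to the next ']'.
  def withNegatedClass(s: String): String = s.replaceAll("""\|[^\]]+""", "")

  def main(args: Array[String]): Unit = {
    val samples = List("[aa|bb]", "[[aaa|bbb]]", "[[[x|y]]]")
    samples.foreach { s =>
      println(s"$s -> ${withGroup(s)} / ${withNegatedClass(s)}")
    }
    // [aa|bb] -> [aa|bb] / [aa]
    // [[aaa|bbb]] -> [[aaa]] / [[aaa]]
    // [[[x|y]]] -> [[[x]]] / [[[x]]]
  }
}
```

On these three inputs the patterns differ only for the single-bracket string, which the capture-group pattern leaves untouched because no "]]" follows the pipe.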
0

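Replace the |, the 3 characters that follow it, and the first ] with a single ].

The regex is "\\|(.{3})\\]" (do not forget to escape |; escaping ] is optional here, and the capturing group is not needed since it is never referenced)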

scala> val list = List("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
list: List[String] = List([[aaa|bbb]], [[ccc|ddd]], [[ooo|sss]])

scala> list.map(_.replaceAll("\\|(.{3})\\]", "]"))
res16: List[String] = List([[aaa]], [[ccc]], [[ooo]])
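
This pattern relies on exactly three characters following the pipe. A rough sketch of what happens with other lengths, next to a variable-width variant, is shown below; the variant "\\|.*?\\]", the second sample string, and the names WidthCheck, fixedWidth and anyWidth are assumptions added for illustration.

```scala
object WidthCheck {
  // Fixed-width pattern from the answer: exactly three characters after the pipe.
  def fixedWidth(s: String): String = s.replaceAll("\\|(.{3})\\]", "]")

  // Variable-width variant (an assumption, not part of the answer):
  // lazily match everything from the pipe up to the first ']'.
  def anyWidth(s: String): String = s.replaceAll("\\|.*?\\]", "]")

  def main(args: Array[String]): Unit = {
    val list = List("[[aaa|bbb]]", "[[a|bbbb]]")
    println(list.map(fixedWidth)) // List([[aaa]], [[a|bbbb]])
    println(list.map(anyWidth))   // List([[aaa]], [[a]])
  }
}
```

If the data really is always three characters wide, the fixed-width pattern documents that expectation; otherwise the lazy variant is the safer choice.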
