My question is the Scala (Java) variant of this query on Python.
In particular, I have a string val myStr = "Shall we meet at, let's say, 8:45 AM?" that I would like to tokenize while retaining the delimiters (all except whitespace). If my delimiters were only single characters, e.g. ., :, ?, etc., I could do:
val strArr = myStr.split("((\\s+)|(?=[,.;:?])|(?<=\\b[,.;:?]))")
which yields
[Shall, we, meet, at, ,, let's, say, ,, 8, :, 45, AM, ?]
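For reference, here is the same attempt as a self-contained program. The wrapper object name SplitRepro is only chosen for illustration; the string and the regular expression are exactly the ones above.

```scala
object SplitRepro {
  val myStr = "Shall we meet at, let's say, 8:45 AM?"

  // Three alternatives:
  //   \s+               consumes whitespace, so it is dropped
  //   (?=[,.;:?])       zero-width split just before a delimiter
  //   (?<=\b[,.;:?])    zero-width split just after a delimiter that
  //                     is itself preceded by a word boundary
  val strArr: Array[String] =
    myStr.split("((\\s+)|(?=[,.;:?])|(?<=\\b[,.;:?]))")

  def main(args: Array[String]): Unit =
    println(strArr.mkString("[", ", ", "]"))
}
```

Because : is in the delimiter class, 8:45 is cut on both sides of the colon.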
However, I wish to make the time signature \\d+:\\d+ a delimiter, and would still like to retain it. So, what I'd like is
[Shall, we, meet, at, ,, let's, say, ,, 8:45, AM, ?]
Note:
- Adding the disjunct (?=(\\d+:\\d+)) in the expression of the split statement is not helping. Outside of the time signature, : is a delimiter in itself.
How could I make this happen?
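One possible direction, given only as a sketch and not as a confirmed solution: instead of splitting, match the tokens directly and list the time pattern first, so that it takes priority over a bare colon. The names TokenMatchSketch and tokenPattern are hypothetical, and the pattern assumes that a token is either a \d+:\d+ time, a single delimiter character, or a run of characters that are neither whitespace nor delimiters.

```scala
object TokenMatchSketch {
  val myStr = "Shall we meet at, let's say, 8:45 AM?"

  // Alternatives are tried left to right at each position, so the
  // time pattern is attempted before ':' can match as a delimiter.
  // Whitespace matches none of the alternatives and is skipped.
  val tokenPattern = """\d+:\d+|[,.;:?]|[^\s,.;:?]+""".r

  val tokens: Array[String] = tokenPattern.findAllIn(myStr).toArray

  def main(args: Array[String]): Unit =
    println(tokens.mkString("[", ", ", "]"))
}
```

On the example string this gives the desired tokens, with 8:45 kept whole. It is untested beyond that input; for instance, 10:30:15 would come out as 10:30, :, 15.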