I'm looking for a way to parse plain text into a JS array. I already have a scheme in mind for doing this, but I'm stuck.

Part of plain text:

2017-11-08 09:43:49,153 [INFO ] root: {\"methodId\":6,\"requestBody\":{},\"token\":\"XXXX\"}2017-11-08 09:53:02,293 [INFO ] root: {\"methodId\":6,\"requestBody\":{},\"token\":\"XXXX\"}2017-11-08 09:53:02,355 [INFO ] root: {\"methodId\":6,\"requestBody\":{},\"token\":\"XXXX\"}

Expected result

const arr = [
    '2017-11-08 09:43:49,153 [INFO ] root: {\"methodId\":6,\"requestBody\":{},\"token\":\"XXXX\"}',
    '2017-11-08 09:53:02,293 [INFO ] root: {\"methodId\":6,\"requestBody\":{},\"token\":\"XXXX\"}',
    '2017-11-08 09:53:02,355 [INFO ] root: {\"methodId\":6,\"requestBody\":{},\"token\":\"XXXX\"}'
]

RegEx Pattern:

/}\d{4}-\d{2}/

Each chunk ends with the closing "}" of an object, which is immediately followed by the start of the next date ("YYYY-MM").

Problem

plainText.split(/}\d{4}-\d{2}/)

If I split it this way, it always "eats" my separator. Is there a way to split the text and keep the matched separator at the start of the second element of each split pair? Then I could just append "}" to the first element and remove "}" from the second one. That's the solution I'm considering, but maybe you can suggest something better.

  • Where does this messed up plain text come from in the first place? Commented Nov 8, 2017 at 9:56
  • s.split(/\b(?=\d{4}-\d{2}-\d{2}\s+[\d:,]+\s+\[INFO ]\s+root:)/).filter(Boolean). Shorten the pattern if the requirements can be relaxed (depending on the scenario, it can even be /\b(?=\d{4}-\d{2}-\d{2}\s)/ if date strings do not appear in the JSON data). See this demo. Commented Nov 8, 2017 at 10:00
  • @melpomene It's the response from the API I'm working with. I have no control over the format it comes in. Commented Nov 8, 2017 at 10:02

1 Answer

If the JSON data does not contain datetime-like substrings, you may use

s.split(/\b(?=\d{4}-\d{2}-\d{2}\s)/).filter(Boolean)

Or, to play it safer, a more verbose pattern:

s.split(/\b(?=\d{4}-\d{2}-\d{2}\s+[\d:,]+\s+\[INFO ]\s+root:)/).filter(Boolean)

See the regex demo

The point is to match the datetime-like string without consuming it, which is why the whole pattern is wrapped in a positive lookahead, (?=...).
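To make the difference concrete, here is a minimal sketch that runs both the consuming pattern from the question and the lookahead pattern on the sample text. The variable names (plainText, consumed, entries) are only illustrative, and the sample is written with plain double quotes on the assumption that the backslashes in the question are just string escaping.

```javascript
// Three log entries glued together with no delimiter, as in the question.
const plainText =
  '2017-11-08 09:43:49,153 [INFO ] root: {"methodId":6,"requestBody":{},"token":"XXXX"}' +
  '2017-11-08 09:53:02,293 [INFO ] root: {"methodId":6,"requestBody":{},"token":"XXXX"}' +
  '2017-11-08 09:53:02,355 [INFO ] root: {"methodId":6,"requestBody":{},"token":"XXXX"}';

// Consuming separator: the matched "}YYYY-MM" is removed from the chunks.
const consumed = plainText.split(/}\d{4}-\d{2}/);

// Zero-width separator: the lookahead only asserts that a date follows,
// so the string is cut at that position and no characters are dropped.
const entries = plainText.split(/\b(?=\d{4}-\d{2}-\d{2}\s)/).filter(Boolean);

console.log(consumed[1].slice(0, 10)); // "-08 09:53:" (the "}2017-11" part is gone)
console.log(entries.length);           // 3
console.log(entries[1].slice(0, 10));  // "2017-11-08"
```

Because nothing is consumed, joining the resulting chunks gives back the original text, so there is no need to re-append "}" by hand.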

Longer pattern details

  • \b - a word boundary
  • (?= - start of the positive lookahead pattern
    • \d{4}-\d{2}-\d{2} - date-like string (4 digits-2 digits-2 digits)
    • \s+ - 1 or more whitespaces
    • [\d:,]+ - 1 or more digits, : or/and ,
    • \s+ - 1 or more whitespaces
    • \[INFO ] - an [INFO ] substring
    • \s+ - 1+ whitespaces
    • root: - root: substring
  • ) - end of the lookahead
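As a rough illustration of why the longer pattern is the safer choice, the sketch below uses a made-up input in which the JSON payload itself contains a date followed by a space. The "note" field and its value are hypothetical and not part of the original data.

```javascript
// Hypothetical input: the first entry's JSON carries a date-like value.
const text =
  '2017-11-08 09:43:49,153 [INFO ] root: {"methodId":6,"requestBody":{"note":"due 2017-12-01 at noon"},"token":"XXXX"}' +
  '2017-11-08 09:53:02,293 [INFO ] root: {"methodId":6,"requestBody":{},"token":"XXXX"}';

// Short pattern: any "YYYY-MM-DD" followed by whitespace starts a new chunk.
const shortPattern = /\b(?=\d{4}-\d{2}-\d{2}\s)/;
// Verbose pattern: the date must also be followed by the time, "[INFO ]" and "root:".
const verbosePattern = /\b(?=\d{4}-\d{2}-\d{2}\s+[\d:,]+\s+\[INFO ]\s+root:)/;

const byShort = text.split(shortPattern).filter(Boolean);
const byVerbose = text.split(verbosePattern).filter(Boolean);

console.log(byShort.length);   // 3 (the first entry is wrongly cut inside its JSON)
console.log(byVerbose.length); // 2 (one chunk per log entry)
```

The verbose pattern is still tied to the [INFO ] level and the root logger name, so it would need adjusting if other levels or loggers show up in the text.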