I have this line of text here, which will always be the same (except the message at the end):
2021-12-08T18:18:38+00:00 INFO Produktbestand erfolgreich von Collmex abgerufen | "STOCK_AVAILABLE;23;1;363;PCE;-1\r\nMESSAGE;S;204020;Daten?bertragung erfolgreich. Es wurden 1 Datens?tze verarbeitet.\r\n"
I have 3 functions which should return parts of the log entry:
public function get_log_file_entry_time( string $entry ): string {
}
public function get_log_file_entry_level( string $entry ): string {
}
public function get_log_file_entry_message( string $entry ): string {
}
I first tried explode with a space as the delimiter. That works, but it is not ideal because the log message can be very long in some cases.
I'm no regex expert, but I've found the following pattern, which matches the first two pieces: ([^\s]+) ([A-Z]+)
This gives me the timestamp and the level. Now I'm struggling to capture the message after the second group; maybe my grouping is off. Any advice would be appreciated!
Note
The message will start after the first whitespace after the logging level. For example:
Produktbestand erfolgreich von Collmex abgerufen | "STOCK_AVAILABLE;23;1;363;PCE;-1\r\nMESSAGE;S;204020;Daten?bertragung erfolgreich. Es wurden 1 Datens?tze verarbeitet.\r\n"
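Since the message starts after the second space, one option that avoids a regex is explode with a limit of 3, which leaves the spaces inside the message alone. The following is only a minimal sketch: the helper name split_log_entry is hypothetical, and the sample entry is a shortened version of the line above.

```php
<?php
// Hypothetical helper (not from the original post): split one entry into
// at most three parts so the message is never broken up, however long it is.
function split_log_entry(string $entry): array
{
    // A limit of 3 stops splitting after the second space;
    // the third element receives the rest of the string unchanged.
    $parts = explode(' ', $entry, 3);

    return [
        'time'    => $parts[0] ?? '',
        'level'   => $parts[1] ?? '',
        'message' => $parts[2] ?? '',
    ];
}

// Shortened sample entry, for illustration only.
$entry = '2021-12-08T18:18:38+00:00 INFO Produktbestand erfolgreich von Collmex abgerufen | "STOCK_AVAILABLE;23;1;363;PCE;-1"';
$parts = split_log_entry($entry);

echo $parts['time'], PHP_EOL;   // 2021-12-08T18:18:38+00:00
echo $parts['level'], PHP_EOL;  // INFO
```

This relies on the delimiters being exactly one space each; if the spacing can vary, a regex is the safer choice.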
^(\S+)\h([A-Z]+)\h([^|]+) regex101.com/r/CyMiDJ/1
(?s)^(\S+)\h+([A-Z]+)\h+(.+) regex101.com/r/WkuRgY/1 but if there are more lines that start with a date and time, it will overmatch.
^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\+\d{2}:\d{2})\h+([A-Z]+)\h+(.*(?:\R(?!(?1)).*)*) for multiple lines regex101.com/r/V8wUYy/1
[^\s] is more elegantly written as \S, but if all delimiters are single spaces, then [^ ] is also appropriate.
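For a single entry, the three functions from the question could be built on one shared preg_match call with three capture groups. This is a sketch under assumptions, not a definitive implementation: the class name LogEntryParser and the private helper group are hypothetical, and returning an empty string for a non-matching entry is an arbitrary choice.

```php
<?php
// Hypothetical class name; only the three public method signatures
// come from the question.
final class LogEntryParser
{
    // Group 1: timestamp, group 2: level, group 3: message.
    // (?s) lets "." match newlines, so a multi-line message stays in group 3.
    private const PATTERN = '/(?s)^(\S+)\h+([A-Z]+)\h+(.+)/';

    public function get_log_file_entry_time(string $entry): string
    {
        return $this->group($entry, 1);
    }

    public function get_log_file_entry_level(string $entry): string
    {
        return $this->group($entry, 2);
    }

    public function get_log_file_entry_message(string $entry): string
    {
        return $this->group($entry, 3);
    }

    // Hypothetical helper: run the pattern once and return one capture group.
    private function group(string $entry, int $index): string
    {
        if (preg_match(self::PATTERN, $entry, $matches) !== 1) {
            // Assumption: an entry that does not match yields an empty string.
            return '';
        }

        return $matches[$index];
    }
}

// Shortened sample entry, for illustration only.
$entry  = '2021-12-08T18:18:38+00:00 INFO Produktbestand erfolgreich von Collmex abgerufen | "STOCK_AVAILABLE;23;1;363;PCE;-1"';
$parser = new LogEntryParser();

echo $parser->get_log_file_entry_time($entry), PHP_EOL;   // 2021-12-08T18:18:38+00:00
echo $parser->get_log_file_entry_level($entry), PHP_EOL;  // INFO
```

As noted above, this pattern will overmatch if the string holds several entries; in that case the stricter timestamp pattern is the one to use.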