I'm trying to replace occurrences of text in documents with an ugly but normally working regex (it matches when tested on regex101 and in the editor). But when I run the code, nothing gets replaced.

Regex101: https://regex101.com/r/rGmYpP/1

My code:

(async () => {
    const rename = util.promisify(fs.rename);
    const files = await fs.readdir('./working');
    for (let i = 0; i < files.length; i++) {
        const file = files[i];
        const read = async (filePath) => {
            const str = fs.createReadStream(filePath, 'utf8')
                .pipe(new stream.Transform({
                    decodeStrings : false,
                    transform(chunk, encoding, done) {
                        let result = chunk.replace(/\(LAST.*\n.*\n.*\n.*UPDATE/gm, '--------------');
                        done(null, result);
                    }
                }));
            const tempPath = await tempWrite(str);
            await rename(tempPath, filePath+'-output');
        };
        await read('./working/'+file);
    }
})();

Text sample:

OESHNF+Arial*1 [12 0 0 -12 0 0 ]msf
321.639 19.075 mo
(LAST )
[6.672 8.00409 8.00409 7.33191 0 ]xsh
354.938 19.075 mo
(UPDATE )
[8.664 8.00409 8.664 8.00409 7.33191 8.00409 0 ]xsh
406.945 19.075 mo
(OF )

Expected output:

OESHNF+Arial*1 [12 0 0 -12 0 0 ]msf
321.639 19.075 mo
-------------- )
[8.664 8.00409 8.664 8.00409 7.33191 8.00409 0 ]xsh
406.945 19.075 mo
(OF )

Thank you for your help, hope I provided everything.

  • what is the expected output for the provided sample string? Commented Jan 21, 2020 at 7:16
  • updated the post, thank you Commented Jan 21, 2020 at 7:29
  • In the transform-function, add console.log(chunk) to see what you have in the chunk. It might be that you don't have all the text for the regex to match. Commented Jan 21, 2020 at 7:31
  • I also think you have forgotten a .* after UPDATE: /\(LAST.*\n.*\n.*\n.*UPDATE.*/gm unless you want that unmatched closing bracket. Commented Jan 21, 2020 at 7:34
  • Have you checked what line terminators you have in your file? Is it just LF or is it CR+LF? I just tested your code, and it works if it is LF but not for CR+LF. Commented Jan 21, 2020 at 8:05
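
To illustrate the chunk concern raised in the comments: a Transform runs the regex on each chunk separately, so a match that straddles two chunks would be missed. The sketch below simulates that with a hand-picked split point (`splitAt` is a hypothetical boundary, not something observed in the asker's files) and compares it with replacing on the fully buffered text.

```javascript
// Text sample from the question, joined with LF.
const text = [
  '321.639 19.075 mo',
  '(LAST )',
  '[6.672 8.00409 8.00409 7.33191 0 ]xsh',
  '354.938 19.075 mo',
  '(UPDATE )',
  '[8.664 8.00409 8.664 8.00409 7.33191 8.00409 0 ]xsh',
].join('\n');

const pattern = /\(LAST.*\n.*\n.*\n.*UPDATE/gm;
const replacement = '--------------';

// Hypothetical split point: pretend the stream cut the data between
// "(LAST" and "(UPDATE", as could happen at a chunk boundary.
const splitAt = text.indexOf('354.938');
const chunks = [text.slice(0, splitAt), text.slice(splitAt)];

// Per-chunk replacement: neither chunk contains the whole match.
const perChunk = chunks
  .map((chunk) => chunk.replace(pattern, replacement))
  .join('');

// Buffer all chunks first, then run the replacement once.
const buffered = chunks.join('').replace(pattern, replacement);

console.log('per-chunk changed anything:', perChunk !== text);
console.log('buffered changed anything:', buffered !== text);
```

If the files are small enough to fit in memory, reading each one completely before replacing avoids this case entirely; for large files the transform would need to carry over the unmatched tail of each chunk.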

2 Answers

I created an example like this:

#!/usr/bin/env node

const fs = require('fs');
const stream = require('stream');

(async () => {
  const file = "test-file.txt";
  const read = async (filePath) => {
      const str = fs.createReadStream(filePath, 'utf8')
          .pipe(new stream.Transform({
              decodeStrings : false,
              transform(chunk, encoding, done) {
                  console.log("chunk", chunk);
                  let result = chunk.replace(/\(LAST.*\n.*\n.*\n.*UPDATE/gm, '--------------');
                  console.log("result", result);
                  done(null, result);
              }
          }));
  };
  await read(file);
})()

And "test-file.txt" like this:

OESHNF+Arial*1 [12 0 0 -12 0 0 ]msf
321.639 19.075 mo
(LAST )
[6.672 8.00409 8.00409 7.33191 0 ]xsh
354.938 19.075 mo
(UPDATE )
[8.664 8.00409 8.664 8.00409 7.33191 8.00409 0 ]xsh
406.945 19.075 mo
(OF )

The regex works if the file uses LF (0x0A) as the line terminator, but not if the file uses CR+LF (0x0D + 0x0A). In JavaScript, . does not match a carriage return, so .*\n cannot consume the \r that precedes each \n.
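
The difference can be reproduced without any files. This is a minimal sketch that builds the question's sample with both kinds of line terminator and runs the original pattern against each; the `tolerant` pattern with `\r?\n` is one possible adjustment, shown here as an assumption about what the asker wants.

```javascript
// Lines taken from the question's text sample.
const lines = [
  '321.639 19.075 mo',
  '(LAST )',
  '[6.672 8.00409 8.00409 7.33191 0 ]xsh',
  '354.938 19.075 mo',
  '(UPDATE )',
];

const lfText = lines.join('\n');     // Unix line endings
const crlfText = lines.join('\r\n'); // Windows line endings

// Pattern from the question: expects a bare \n after each line.
const original = /\(LAST.*\n.*\n.*\n.*UPDATE/gm;
// Variant that accepts an optional \r before each \n.
const tolerant = /\(LAST.*\r?\n.*\r?\n.*\r?\n.*UPDATE/gm;

// Returns true if the pattern replaced something in the text.
function replaces(pattern, text) {
  return text.replace(pattern, '--------------') !== text;
}

const results = {
  originalOnLf: replaces(original, lfText),
  originalOnCrlf: replaces(original, crlfText),
  tolerantOnLf: replaces(tolerant, lfText),
  tolerantOnCrlf: replaces(tolerant, crlfText),
};

console.log(results);
```

Only `originalOnCrlf` comes out `false`, which matches the behavior described above.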


2 Comments

Replacing \n with \r\n in the regex works, thank you.
Change the regexp to /\(LAST.*(\x0D?\x0A).*(\x0D?\x0A).*(\x0D?\x0A).*UPDATE/gm and it will work with both. :)
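Use the pattern \(LAST[^]+?UPDATE instead.

Here [^]+? matches one or more of any character (including newlines and carriage returns), non-greedily, so the match stops at the first UPDATE that follows.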
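
A small sketch of why the non-greedy quantifier matters. The second `(UPDATE )` line in the sample is a made-up addition for illustration; it is not in the asker's data.

```javascript
// Sample with CR+LF line endings and a hypothetical second "(UPDATE )".
const sample = [
  '(LAST )',
  '[6.672 8.00409 8.00409 7.33191 0 ]xsh',
  '354.938 19.075 mo',
  '(UPDATE )',
  '406.945 19.075 mo',
  '(UPDATE )', // hypothetical extra occurrence
].join('\r\n');

// Non-greedy: stops at the first UPDATE after (LAST.
const lazy = sample.replace(/\(LAST[^]+?UPDATE/g, '--------------');

// Greedy: runs on to the last UPDATE in the text.
const greedy = sample.replace(/\(LAST[^]+UPDATE/g, '--------------');

console.log(JSON.stringify(lazy));
console.log(JSON.stringify(greedy));
```

Because `[^]` matches every character, this pattern does not care whether lines end in LF or CR+LF. The trade-off is that it no longer checks how many lines sit between `(LAST` and `UPDATE`, so it may match more than the original pattern was meant to.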
