4

I'm working on a *.po file and trying to capture all the text between msgid "" and msgstr "". No luck so far: my pattern never matches more than one line:

msgid ""
"%s asdfgh asdsfgf asdfg %s even if you "
"asdfgdh sentences with no sense. We are not asking  translate "
"Shakespeare's %s Hamlet %s !. %s testing regex %s "
"don't require specific industry knowledge. enjoying "
msgstr ""

What I've tried:

var myArray = fileContent.match(/msgid ([""'])(?:(?=(\\?))\2.)*?\1/g);

Thanks for your help, I'm not really good with regex :(

0

4 Answers 4

10

Here is one way to extract all of that text:

var match = text.replace(/msgid ""([\s\S]*?)msgstr ""/, "$1");

Example: http://jsfiddle.net/bqk79/

The [\s\S] is a character class that matches any character, including line breaks, so [\s\S]*? matches any number of any character (as few as possible). In other languages you could use the s or DOTALL flag to make . match line breaks. JavaScript did not support this until ES2018 added the s (dotAll) flag, so [\s\S] remains the portable choice for older engines.
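To make the difference concrete, here is a minimal sketch comparing a plain `.`, the `[\s\S]` class, and the `s` flag on a two-line block. The `sample` string is made up for illustration, and the last test assumes an engine that implements the ES2018 `s` flag.

```javascript
// Made-up sample with line breaks between the two markers.
const sample = 'msgid ""\n"line one "\n"line two "\nmsgstr ""';

// `.` does not match line breaks, so this pattern cannot span the block.
const dotOnly = /msgid ""(.*?)msgstr ""/.test(sample);

// `[\s\S]` matches any character, including line breaks.
const anyChar = /msgid ""([\s\S]*?)msgstr ""/.test(sample);

// With the `s` (dotAll) flag, `.` also matches line breaks (ES2018 and later).
const dotAll = /msgid ""(.*?)msgstr ""/s.test(sample);

console.log(dotOnly, anyChar, dotAll); // false true true
```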

Note that your regex doesn't make any mention of single quotes, but if you need to be able to match between msgid '' and msgstr '' as well, you can use the following:

var match = text.replace(/msgid (['"]{2})([\s\S]*?)msgstr \1/, "$2");
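Keep in mind that replace returns the whole input with the matched part substituted, so the lines above isolate the text only when the input is a single entry. If the file holds several entries, something along these lines may be closer to what the question is after. It is only a sketch: `extractMsgids` and the sample data are hypothetical, it assumes every entry starts with an empty msgid "" line followed by quoted continuation lines, it needs `String.prototype.matchAll` (ES2020), and it does not unescape sequences such as \" inside the strings.

```javascript
// Hypothetical helper: returns the joined msgid text of every multi-line entry.
function extractMsgids(poText) {
  // `\r?\n` tolerates both LF and CRLF line endings; the `g` flag is required by matchAll.
  const pattern = /msgid ""\r?\n([\s\S]*?)\r?\nmsgstr /g;
  const results = [];
  for (const m of poText.matchAll(pattern)) {
    // m[1] holds the quoted continuation lines; strip the outer quotes and join them.
    const joined = m[1]
      .split(/\r?\n/)
      .map((line) => line.trim().replace(/^"|"$/g, ""))
      .join("");
    results.push(joined);
  }
  return results;
}

// Made-up sample with two entries.
const po = [
  'msgid ""',
  '"first %s "',
  '"entry"',
  'msgstr ""',
  '',
  'msgid ""',
  '"second entry"',
  'msgstr ""',
].join("\n");

console.log(extractMsgids(po)); // [ 'first %s entry', 'second entry' ]
```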

1 Comment

Simply put, my two-day search ends here.
2

Try with this pattern:

/msgid (["']{2})\n([\s\S]*?)\nmsgstr \1/

The result is in the second capturing group, but you can simplify it to:

/msgid ["']{2}\n([\s\S]*?)\nmsgstr /

where the result is in the first capturing group.

1 Comment

s flag does not exist in Javascript.
2

I realize that the question specifically asks for a regular expression, but you should consider using string split instead if you can.

Here is a ready-made function:

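function extractTextBetween(subject, start, end) {
    try {
        // Take everything after the first `start`, then cut at the first `end`.
        return subject.split(start)[1].split(end)[0];
    } catch (e) {
        // Reached when `start` is not found: split(start)[1] is undefined.
        console.log("Exception when extracting text", e);
    }
}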
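Two behaviours of the function above are worth knowing: if start is missing it logs the exception and returns undefined, and if end is missing it returns everything after start. Below is a sketch of a variant that returns null in both cases, using indexOf and slice instead of split. The name `textBetween` and the sample entry are made up for illustration.

```javascript
// Hypothetical variant: returns null when either marker is missing.
function textBetween(subject, start, end) {
  const from = subject.indexOf(start);
  if (from === -1) return null;

  // Search for `end` only after the end of `start`.
  const contentStart = from + start.length;
  const to = subject.indexOf(end, contentStart);
  if (to === -1) return null;

  return subject.slice(contentStart, to);
}

// Made-up single entry.
const entry = 'msgid ""\n"some text "\n"more text"\nmsgstr ""';

console.log(textBetween(entry, 'msgid ""\n', '\nmsgstr ""'));
console.log(textBetween(entry, 'msgctxt', 'msgstr')); // null
```

Like the split version, this returns only the first occurrence, so it suits a single entry rather than a whole file.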

http://jsfiddle.net/b33hdh9b/3/


1

You could perhaps try this regex?

msgid ""((?:.|[\n\r])+)msgstr ""

((?:.|[\n\r])+) is your capturing group.

(?:.|[\n\r])+ matches either . or [\n\r] one or more times; \n and \r cover newlines and carriage returns.
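One caveat: the + here is greedy, so on input with more than one entry the group runs to the last msgstr "" rather than the nearest one. A quick sketch of the difference, using a made-up two-entry string and a lazy +? as one possible adjustment:

```javascript
// Made-up input containing two entries.
const twoEntries =
  'msgid ""\n"first"\nmsgstr ""\n\nmsgid ""\n"second"\nmsgstr ""';

// Greedy `+` extends the group to the last `msgstr ""` in the input.
const greedy = twoEntries.match(/msgid ""((?:.|[\n\r])+)msgstr ""/)[1];

// Lazy `+?` stops at the first `msgstr ""` after the opening marker.
const lazy = twoEntries.match(/msgid ""((?:.|[\n\r])+?)msgstr ""/)[1];

console.log(greedy.includes("second")); // true
console.log(lazy.includes("second"));   // false
```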

Tested
