3

I am looking for a regular expression that will reliably detect whether a string contains printf()-style placeholders.

1

2 Answers

6

Turns out this is a bit trickier than it looks, and the answer will depend on what you want to do with the placeholders, and on which language/spec you're using for both the printf and the regular expressions.

To give you an example of what it looks like in practice, here's the placeholder regexp from the sprintf.js library, which matches its own placeholder spec, which is similar but not identical to the C++ spec, which is similar but not identical to the PHP spec:

placeholder: /^\x25(?:([1-9]\d*)\$|\(([^\)]+)\))?(\+)?(0|'[^$])?(-)?(\d+)?(?:\.(\d+))?([b-fiosuxX])/,

You can get a great explanation of all the bits and pieces using something like regex101 https://regex101.com/r/bV9nT8/1 but there are two key things to consider that will be broadly relevant:

1) The '^' at the start: The JavaScript RegExp implementation is missing the \G "end of previous match" anchor, which makes it impossible for this pattern on its own to match adjacent placeholders ( '%s%s' => ['%s','%s'] ) while ignoring escaped '%' signs ( '%s%%s' => ['%s'] ). As a result, this library programmatically handles plain text and %% substrings before trying the placeholder regex.

2) The capture groups: As part of a parsing library, this regex captures a lot of things that it needs to know:

  1. the digits in %1$s ( argument index )
  2. the name in %(variableName)s
  3. the + that follows the argument index or name, if it's present
  4. either a 0, or a ' followed by any single character other than $ ( I think this is the padding character )
  5. an optional - character
  6. more digits ( minimum field width )
  7. more digits after a '.' ( floating point precision )
  8. the type specifier
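
To make the group numbering concrete, here is a minimal sketch that runs the regex quoted above against a single placeholder and labels each capture group. The function name describePlaceholder and the labels in groupNames are my own illustrative names, not part of sprintf.js, and the 'padding' label follows the guess in item 4.

```javascript
// The placeholder pattern quoted above from sprintf.js.
const placeholder = /^\x25(?:([1-9]\d*)\$|\(([^\)]+)\))?(\+)?(0|'[^$])?(-)?(\d+)?(?:\.(\d+))?([b-fiosuxX])/;

// Hypothetical labels for capture groups 1..8, in the order listed above.
const groupNames = ['argIndex', 'argName', 'plus', 'padding', 'minus', 'width', 'precision', 'type'];

// Returns an object of labelled groups, or null if the text does not
// start with a placeholder. Groups that did not take part are undefined.
function describePlaceholder(text) {
  const match = placeholder.exec(text);
  if (match === null) {
    return null;
  }
  const parts = {};
  groupNames.forEach(function (name, i) {
    parts[name] = match[i + 1];
  });
  return parts;
}

console.log(describePlaceholder('%2$+08.3f'));
// argIndex '2', plus '+', padding '0', width '8', precision '3', type 'f'
console.log(describePlaceholder('%(name)s'));
// argName 'name', type 's', everything else undefined
```

Because of the leading '^', this only works when the placeholder sits at the very start of the text passed in.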

Depending on your purpose, you will probably want a subset of those things, or perhaps none of them, with only a single capture group for the whole expression, like:

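// Matches either an escaped '%%' or a complete placeholder. Consuming '%%'
// as its own match keeps its second '%' from starting a false placeholder.
var regexForPlaceholders = /(?:\x25\x25)|(\x25(?:(?:[1-9]\d*)\$|\((?:[^\)]+)\))?(?:\+)?(?:0|'[^$])?(?:-)?(?:\d+)?(?:\.(?:\d+))?(?:[b-fiosuxX]))/g;

// Returns every placeholder in the string, with escaped '%%' filtered out.
function placeholders( string ) {
    return ( string.match( regexForPlaceholders ) || [] )
        .filter( function( v ) {
            return v !== '\x25\x25';
        } );
}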
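
For comparison, the scanning approach described in point 1) can be sketched roughly as below: consume plain text and %% first, and only then try the anchored placeholder regex on whatever remains. This is a sketch under my own assumptions, not the library's actual code; scanPlaceholders and the TEXT / ESCAPED / PLACEHOLDER names are hypothetical, and throwing on a stray '%' is just one possible design choice.

```javascript
const TEXT = /^[^\x25]+/;    // a run of characters up to the next '%'
const ESCAPED = /^\x25\x25/; // a literal '%%'
// The anchored placeholder pattern quoted earlier in this answer.
const PLACEHOLDER = /^\x25(?:([1-9]\d*)\$|\(([^\)]+)\))?(\+)?(0|'[^$])?(-)?(\d+)?(?:\.(\d+))?([b-fiosuxX])/;

// Walks the format string from left to right and collects placeholders.
function scanPlaceholders(format) {
  const found = [];
  let rest = format;
  while (rest.length > 0) {
    let match;
    if ((match = TEXT.exec(rest)) !== null) {
      // plain text: nothing to record
    } else if ((match = ESCAPED.exec(rest)) !== null) {
      // escaped percent sign: nothing to record
    } else if ((match = PLACEHOLDER.exec(rest)) !== null) {
      found.push(match[0]);
    } else {
      // a '%' that starts neither '%%' nor a valid placeholder
      throw new SyntaxError('Unexpected placeholder near: ' + rest);
    }
    rest = rest.slice(match[0].length); // drop the consumed part
  }
  return found;
}

console.log(scanPlaceholders('%s%s'));  // ['%s', '%s']
console.log(scanPlaceholders('%s%%s')); // ['%s']
```

Since each regex is anchored with '^' and applied to the remaining input, the scan always resumes exactly where the previous match ended, which is the job \G would otherwise do.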

1 Comment

Really awesome, I like it! I updated it a bit to find placeholders in unformatted strings: regex101.com/r/bV9nT8/3
2

Consider the regex:

/%(?:\d+\$)?[dfsu]/

You should take a look at this prior answer as well: Validate sprintf format from input field with regex
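
A quick sketch of how that shorter regex behaves as a yes/no check. The helper name hasPlaceholder is my own; note that the pattern only knows the d, f, s and u types and does not treat %% as an escape, so the last call is a false positive under printf rules.

```javascript
// The regex from this answer, unchanged.
const simplePlaceholder = /%(?:\d+\$)?[dfsu]/;

// Hypothetical helper: true if the text contains at least one match.
function hasPlaceholder(text) {
  return simplePlaceholder.test(text);
}

console.log(hasPlaceholder('Hello %s'));    // true
console.log(hasPlaceholder('%1$d items'));  // true
console.log(hasPlaceholder('no markers'));  // false
console.log(hasPlaceholder('%%d'));         // true, although '%%' is an escaped percent sign
```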
