I am looking for a regex that would match .js in the following URI:

 /foo/bar/file.js?cache_key=123

I'm writing a function that tries to identify what kind of file is being passed in as a parameter. In this case, the file ends with the extension .js and is a JavaScript file. I'm working with PHP and preg_match, so I assume this needs to be a PCRE-compatible regular expression. Ultimately I'll build on this expression to check for multiple file types passed in as a URI: not just js, but perhaps css, images, etc.

3 Answers

6

You can use a combination of pathinfo and a regular expression. pathinfo with PATHINFO_EXTENSION returns the extension plus the query string (js?cache_key=123), and you can then strip the ?cache_key=123 with a regex that matches the ? and everything after it:

$url = '/foo/bar/file.js?cache_key=123';

echo preg_replace("#\?.*#", "", pathinfo($url, PATHINFO_EXTENSION)) . "\n";

Output:

js

Input:

$url = 'my_style.css?cache_key=123';

Output:

css

Obviously, if you need the ., it's trivial to add it to the file extension string.
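
One variation, offered as a sketch rather than as part of the approach above: strip the query string with parse_url before calling pathinfo, so no regex is needed and a dot inside the query string (for example ?v=1.2) cannot be mistaken for the extension separator. The function name extensionFromUrl is made up for illustration, and the sketch assumes the input is something parse_url can handle.

```php
<?php
// Hypothetical helper (not from the answer above): returns the extension
// without the leading dot, or an empty string when none is found.
function extensionFromUrl(string $url): string
{
    // PHP_URL_PATH drops the query string and fragment.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        // parse_url returns null (no path) or false (unparseable URL).
        return '';
    }
    return pathinfo($path, PATHINFO_EXTENSION);
}

echo extensionFromUrl('/foo/bar/file.js?cache_key=123') . "\n"; // js
echo extensionFromUrl('my_style.css?cache_key=123') . "\n";     // css
```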

ETA: if you do want a regex solution, this will do the trick:

function parseurl($url) {
    # takes the last dot it can find and grabs the text after it
    echo preg_replace("#(.+)?\.(\w+)(\?.+)?#", "$2", $url) . "\n";
}

parseurl('my_style.css');
parseurl('my_style.css?cache=123');
parseurl('/foo/bar/file.js?cache_key=123');
parseurl('/my.dir.name/has/dots/boo.html?cache=123');

Output:

css
css
js
html
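
Two caveats about the preg_replace version: it returns the input unchanged when nothing matches, and it uses the last dot in the whole string, so a dot in the query string (for example ?v=1.2) would be treated as the extension separator. A possible preg_match variant that only looks before the first ? might look like this; the function name and the pattern are illustrative assumptions, not the code above.

```php
<?php
// Hypothetical helper: returns the extension (no dot) or null if there is none.
function extensionByRegex(string $url): ?string
{
    // ^[^?]*      everything before the first "?" (greedy, so it backtracks to the last dot)
    // \.(\w+)     the dot and the extension
    // (?:\?.*)?$  an optional query string, then end of input
    if (preg_match('#^[^?]*\.(\w+)(?:\?.*)?$#', $url, $matches) === 1) {
        return $matches[1];
    }
    return null;
}
```

Returning null instead of echoing lets the caller tell "no extension" apart from a real result.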

1

Use:

.+\.(js|css|etc)[?]?

The extension ends up in $matches[1].

Or you can just use

.+\.(js|css|etc)\?

if the trailing ?cache... is always present.
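
A minimal sketch of how such a pattern might be used with preg_match. Here etc is replaced by a concrete list (js and css only), and two changes are assumptions of this sketch rather than part of the patterns above: [^?]+ keeps the match inside the path, and (?:\?|$) requires the extension to be followed by ? or the end of the string, so file.json is not reported as js. The function name is hypothetical.

```php
<?php
// Hypothetical helper: returns the whitelisted extension, or null when
// the URL does not end in one of them.
function matchKnownExtension(string $url): ?string
{
    if (preg_match('#^[^?]+\.(js|css)(?:\?|$)#', $url, $matches) === 1) {
        // Capture group 1 holds the extension without the dot.
        return $matches[1];
    }
    return null;
}
```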

0

Code

$input_line = '/foo/bar/file.js?cache_key=123';

// Let's grab the part between the filename and the ?
preg_match("/\w+\/\w+\/\w+(.*)\?/", $input_line, $matches);

print_r($matches);

echo $matches[1];

Output

Array
(
   [0] => foo/bar/file.js?
   [1] => .js
)

.js

If you know the extensions beforehand (whitelist approach), you can switch from matching everything with (.*) to matching specific extensions, e.g. /.*\.(js|jpg|jpeg|png|gif)/:

preg_match("/.*\.(js|jpg|jpeg|png|gif)/", $input_line, $matches);
echo $matches[1]; // js
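
Building on the whitelist idea, here is one way the check for multiple file types from the question could be sketched. The map FILE_KINDS, the kind labels, and the function name fileKind are all hypothetical. The pattern is also tightened compared with the one above: it only looks before the first ? and requires the extension to end the path, which is an assumption of this sketch.

```php
<?php
// Hypothetical map from extension to a coarse file kind; extend as needed.
const FILE_KINDS = [
    'js'   => 'javascript',
    'css'  => 'stylesheet',
    'jpg'  => 'image',
    'jpeg' => 'image',
    'png'  => 'image',
    'gif'  => 'image',
];

function fileKind(string $url): ?string
{
    // Build "js|css|jpg|..." from the map keys so the list lives in one place.
    $alternation = implode('|', array_map('preg_quote', array_keys(FILE_KINDS)));
    // The "i" flag makes the extension match case-insensitively.
    $pattern = '#^[^?]*\.(' . $alternation . ')(?:\?|$)#i';

    if (preg_match($pattern, $url, $matches) === 1) {
        return FILE_KINDS[strtolower($matches[1])];
    }
    return null;
}
```

Adding a new type then only means adding an entry to the map.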

3 Comments

This solution is rather dependent upon there being a set number of directories...
True! The structure is /word/word/word(match)?. Maybe that's sufficient, who knows. There is always pathinfo().
You could make the number of directories flexible using (\/\w+)* -- matches 0 or more /\w+
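
The last suggestion could be sketched as follows. It keeps the answer's assumption that every path segment consists only of \w characters, so a directory name containing a dot would still confuse it. The function name is made up for illustration, and [^?]* replaces the original (.*) so the capture stops at the first ?.

```php
<?php
// Hypothetical helper: returns the part between the file's base name and
// the "?", e.g. ".js", or null when the URL has no query string.
function extensionAnyDepth(string $url): ?string
{
    // (?:\/\w+)*  zero or more "/segment" parts (greedy, so it also consumes the base name)
    // ([^?]*)     whatever is left before the "?"
    if (preg_match('/^(?:\/\w+)*([^?]*)\?/', $url, $matches) === 1) {
        return $matches[1];
    }
    return null;
}
```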
