My situation is very similar to the one in this question (in fact, the code is very similar). I've been trying to create a .htaccess file that allows URLs without file extensions, so that, for example, https://example.com/file finds file.html in the appropriate directory, and https://example.com/file.html redirects (using an HTTP redirect) to https://example.com/file, leaving only one canonical URL. With the following .htaccess:

Options +MultiViews
RewriteEngine On

# Redirect <...>.php, <...>.html to <...> (without file extension)
RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

I've been running into a redirect loop just as in the question mentioned above. (In my case, finding the corresponding file is achieved by MultiViews instead of a separate RewriteRule.)

However, with a solution adopted from this answer:

Options +MultiViews
RewriteEngine On

# Redirect <...>.php, <...>.html to <...> (without file extension)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s(.+)\.(php|html)
RewriteRule ^ %1 [L,R]

there is no redirect loop. I’d be interested to find out where the difference comes from. Aren’t both files functionally equivalent? How come that using a “normal” RewriteRule creates a loop, while using %{THE_REQUEST} doesn’t?

Note that I’m not looking for a way to get clean URLs (I could just use the second version of my file or the answer to the question linked above, which looks at %{ENV:REDIRECT_STATUS}), but for the reason why these two approaches work/don’t work, so this is not the same question as the one linked above.

Note: I'm seeing the same problem using only mod_rewrite (without MultiViews), so it doesn't seem to be due to the order of execution of MultiViews and mod_rewrite:

Options -MultiViews
RewriteEngine On

## Redirect <...>.php, <...>.html to <...> (without file extension)
# This works...
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s(.+)\.(php|html)
RewriteRule ^ %1 [L,R]
# But this doesn’t!
#RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

# Find file with file extension .php or .html on the filesystem for a URL
# without file extension
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^ %{REQUEST_FILENAME}.php [L]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^ %{REQUEST_FILENAME}.html [L]

Where’s the difference? I would expect both approaches to work because the internal rewrite to a file is at the very end of the .htaccess with an [L] flag, so there shouldn't be any processing or redirecting happening afterwards, right?

2 Answers 2

If you look at RewriteRule directive's documentation, you'll notice the following:

On the first RewriteRule, it is matched against the (%-decoded) URL-path of the request, or, in per-directory context (see below), the URL path relative to that per-directory context. Subsequent patterns are matched against the output of the last matching RewriteRule.

Since the pattern is matched on a per-directory basis, once you add the following rule:

RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

the REQUEST_URI variable changes, and mod_rewrite processes the URI again. MultiViews then maps the redirected URL back to the matching file, which causes a loop (the URI changes on every rewrite).

Now, when you match against the THE_REQUEST variable instead, the URI may still change on internal rewrites, but the actual request as received by the server never changes unless a redirect is performed.
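
As a rough illustration of that point, the annotated rule below lists the values these variables would plausibly hold for a request to /file, assuming an internal rewrite to file.html (as in the question's last example) is in place. The values in the comments are assumptions for illustration, not captured server output.

```apache
# Browser sends:                 GET /file HTTP/1.1
#
# First pass over the rules:
#   THE_REQUEST = "GET /file HTTP/1.1"
#   REQUEST_URI = "/file"
#
# After the internal rewrite to file.html, the rules run again:
#   THE_REQUEST = "GET /file HTTP/1.1"   (unchanged, still the original request line)
#   REQUEST_URI = "/file.html"           (updated by the internal rewrite)
#
# The condition inspects only the original request line, which contains
# no ".php" or ".html" here, so the redirect does not fire on the second pass.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s(.+)\.(php|html)
RewriteRule ^ %1 [L,R]
```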

5 Comments

Same question here as for the other answer: Shouldn't the [L] flag stop the mod_rewrite loop (in my last example)? Why does it still keep going after e.g. RewriteRule ^ %{REQUEST_FILENAME}.html [L]? Even if other rules would still match, shouldn't it stop at [L]?
@Socob The L flag documentation has: "It is therefore important, if you are using RewriteRule directives in one of these contexts, that you take explicit steps to avoid rules looping, and not count solely on the [L] flag to terminate execution of a series of rules, as shown below.".
Yes, I know that part of the documentation. What I’m trying to understand is why [L] doesn’t help me in this case. Is it because of the following part? “It is possible that as the rewritten request is handled, the .htaccess file or <Directory> section may be encountered again, and thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over.”
@Socob ah, yes. I removed the wrong paragraph when I pasted text in previous comment.
@Socob Once the L flag is encountered, mod_rewrite does whatever is asked of it. Now, if an internal redirect has been applied, the REQUEST_URI gets updated, and the .htaccess is processed from the start again. If you do not have the L flag, mod_rewrite will continue to match rules until the end of the file and then start from the beginning once more, if there were successful REQUEST_URI changes.
1
# But this doesn’t!
#RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

The reason this commented-out rule doesn't work and causes a rewrite loop is that your other rule adds the .html extension and changes the %{REQUEST_URI} variable to /file.html, which causes this rule to execute again. Removing .html in this rule then causes the other rule to fire again. This goes on until the max recursion limit is reached.

You also need to understand that, in .htaccess context, mod_rewrite re-runs the rule set until a pass completes without any rule matching. Since both rules keep firing, mod_rewrite keeps looping.

The rule based on THE_REQUEST works because the THE_REQUEST variable doesn't get overwritten after other rewrite rules execute.
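
The back-and-forth between the two rules can be sketched as a trace, written here as .htaccess comments around the rule in question. This is a hedged reconstruction based on the rule set from the question; the pass numbering is illustrative, and the exact internal steps may differ between Apache versions and configurations.

```apache
# Browser sends: GET /file.html HTTP/1.1
#
# Pass 1: pattern input is "file.html"
#   the rule below matches          -> external redirect to /file
#
# Browser follows the redirect: GET /file HTTP/1.1
#
# Pass 2: pattern input is "file"
#   the rule below does not match
#   the internal rewrite appends .html -> REQUEST_URI becomes /file.html
#
# Pass 3: rules run again on the rewritten URL, pattern input is "file.html"
#   the rule below matches again    -> external redirect to /file
#
# ...and the browser is sent around the same cycle again.
RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]
```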

5 Comments

But shouldn't the [L] flag stop the mod_rewrite loop? Why does it still keep going after e.g. RewriteRule ^ %{REQUEST_FILENAME}.html [L]?
No, the L flag doesn't stop the loop. The L flag merely ends the current rule and forces mod_rewrite to run the loop again.
By “current rule”, do you mean “current pass of the .htaccess file”? Otherwise the flag would be kind of useless...
Yes, that is the current pass. L ends processing at the RewriteRule where it is used and then acts like a continue for the mod_rewrite loop. Also note that if you use END instead (available with Apache 2.4+), that will definitely terminate the mod_rewrite loop.
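
The END flag mentioned above could be applied to the question's rule set roughly as follows. This is an untested sketch that assumes Apache 2.4 or later; it reuses the rules from the question and only swaps [L] for [END] on the internal rewrites.

```apache
Options -MultiViews
RewriteEngine On

# External redirect: strip .php/.html from the requested URL.
RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

# Internal rewrites: map the extensionless URL back to a file.
# END (Apache 2.4+) stops further rewrite processing for this request,
# so the redirect rule above is not applied to the rewritten path.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^ %{REQUEST_FILENAME}.php [END]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^ %{REQUEST_FILENAME}.html [END]
```

On servers older than 2.4, where END is not available, the THE_REQUEST condition from the question remains the option to use.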
