I would like to remove all query strings, including their parameters and values, from URLs using .htaccess rules.

Here are a few example URLs with query strings that need to be removed from the end of the URL.

https://example.com/other-category-slug/page/15/?orderby=price-desc&add_to_wishlist=342 
https://example.com/page/62/?option=com_content&view=article&id=91&Itemid=2 
https://example.com/page/30/?start=72 
https://example.com/other-category-slug/page/12/?add_to_wishlist=9486  
https://example.com/other-category-slug/page/15/?add_to_wishlist=9486 
https://example.com/other-category-slug/page/4/?orderby=price-desc&add_to_wishlist=332 
https://example.com/other-category-slug/page/15/?orderby=price-desc&add_to_wishlist=5736 
https://example.com/other-category-slug/page/7/?orderby=popularity 
https://example.com/other-category-slug/page/15/?add_to_wishlist=350 
https://example.com/category-slug/page/19/?orderby=price-desc 
https://example.com/category-slug/page/3/?orderby=date 
https://example.com/page/2/?post_type=map 
https://example.com/category-slug/page/2/?PageSpeed=noscript 
https://example.com/category/page/6/?orderby=menu_order 
https://example.com/page/50/?Itemid=wzshaxrogq 
https://example.com/category-slug/page/1/?orderby=price&add_to_wishlist=12953 
https://example.com/category-slug/this-is-product-slug/?PageSpeed=noscript 
https://example.com/category-slug/?add_to_wishlist=15153 
https://example.com/page/24/?op 
https://example.com/page/68/?iact=hc&vpx=262&vpy=212&dur=2871&hovh=259&hovw=194&tx=104&ty=131&ei=KJ05TtKZOoi8rAfM2ZmPBQ&page=1&tbnh=129&tbnw=97&start=0&ndsp=35&ved=1t%3A429%2Cr%3A9%2Cs%3A0&doing_wp_cron=1466467271.7778379917144775390625

I need clean URLs like these, without query strings and parameters. category-slug and product-slug are just examples. I believe I need 5 rules.

https://example.com/category-slug/product-slug/
https://example.com/category-slug/page/15/
https://example.com/category-slug/
https://example.com/page/62/
https://example.com/

Here are a few query strings which I want to keep.

https://example.com/?attachment_id=123
https://example.com/?p=123
https://example.com/page/12/?fbclid=PAAaaK8eCN
https://example.com/your-shopping-cart/?remove_item=22c1acb3539e1aeba2
https://example.com/category-slug/this-is-product-slug/?add-to-cart=29030
https://example.com/?s=%7Bsearch_term_string%7D

Here is my code, which is not working. In fact, I don't understand the regex in it.

RewriteEngine On
RewriteRule ^(page/[0-9]+)/.+$ /$1? [L,NC,R=301]
RewriteCond %{QUERY_STRING} ^option=.+$ [NC,OR]
RewriteCond %{QUERY_STRING} ^[^=]+$
RewriteRule ^$ /? [L,NC,R=301]

Thanks in advance

  • "query strings which need to be removed" - Are they exact match query strings? And always the entire query string? Or specific URL parameters? Any value? Or the specific values as stated? Do you need to check the URL-path at all? "I don't understand the Regex" - So, where did you get the regex from? Commented Jan 31, 2023 at 22:31
  • Yes, the query strings are exact matches and complete. I have already listed the query strings with their parameters and values. Let me edit the question, please. Commented Feb 1, 2023 at 8:30
  • remove_item= is shown in removals as well as in keeps Commented Feb 1, 2023 at 17:19
  • Why do you want to rewrite the URL like this? If you don't need those parameters, just ignore them in your server-side code. You can't prevent the client from sending whatever query parameter it wants. Commented Feb 7, 2023 at 18:11
  • @RickySixx It can potentially cause issues with duplicate content if these URLs have been erroneously indexed by search engines. Also, URL params like add_to_wishlist look as if they are potentially destructive. Redirecting would be an attempt at "fixing" this. Long term they would need to make sure they are setting the correct canonical meta tag. Commented Feb 10, 2023 at 19:05

1 Answer

Yes , Query strings are exact match

Whilst you've given examples of the URL-path, it looks like you just need to base the match on the query string part of the URL, not the URL-path? Unless the same query string could appear on another URL-path that you would want to keep?

You would only need to focus on the query strings you want to remove, not the ones you want to keep.

I believe i need 5 rules.

It looks like you would need just one rule, but with a lot of conditions (RewriteCond directives). One condition for every query string (since you say they are "exact matches").

RewriteCond %{QUERY_STRING} ^option=.+$ [NC,OR]
RewriteCond %{QUERY_STRING} ^[^=]+$

Although, rather confusingly, you are not attempting an "exact match" at all in your rule, but rather using a generic pattern. (Although you've stated you "don't understand the Regex".)

If you are wanting "exact matches" then you don't need to use regex at all. You can use the = prefix operator on the CondPattern (2nd argument to the RewriteCond directive) to make it an exact (lexicographical) match.

For example, try something like the following instead:

RewriteEngine On

RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=342 [OR]
RewriteCond %{QUERY_STRING} =option=com_content&view=article&id=91&Itemid=2 [OR]
RewriteCond %{QUERY_STRING} =start=72 [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=9486 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=332 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=5736 [OR]
RewriteCond %{QUERY_STRING} =orderby=popularity [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=350 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc [OR]
RewriteCond %{QUERY_STRING} =orderby=date [OR]
RewriteCond %{QUERY_STRING} =post_type=map [OR]
RewriteCond %{QUERY_STRING} =PageSpeed=noscript [OR]
RewriteCond %{QUERY_STRING} =orderby=menu_order [OR]
RewriteCond %{QUERY_STRING} =Itemid=wzshaxrogq [OR]
RewriteCond %{QUERY_STRING} =orderby=price&add_to_wishlist=12953 [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=15153 [OR]
RewriteCond %{QUERY_STRING} =op [OR]
RewriteCond %{QUERY_STRING} =iact=hc&vpx=262&vpy=212&dur=2871&hovh=259&hovw=194&tx=104&ty=131&ei=KJ05TtKZOoi8rAfM2ZmPBQ&page=1&tbnh=129&tbnw=97&start=0&ndsp=35&ved=1t%3A429%2Cr%3A9%2Cs%3A0&doing_wp_cron=1466467271.7778379917144775390625
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]

The above redirects to the same URL-path, but strips the original query string if it matches any of those stated in the preceding conditions.

The QSD flag (Query String Discard) strips the original query string from the request. This is the preferred method on Apache 2.4. However, if you are still on Apache 2.2 then you would need to append an empty query string instead (as you are doing in your existing rule). For example:

RewriteRule ^ %{REQUEST_URI}? [R,L]

Note there is no OR flag on the last RewriteCond directive.

NB: The query strings add_to_wishlist=9486 and PageSpeed=noscript each appear twice in your list of URLs/query strings to remove. Each needs only one condition.

Test first with a 302 (temporary) redirect and only change to a 301 (permanent), if that is the intention, once you have confirmed that it works as intended. 301s are cached persistently by the browser so can make testing problematic.

Make sure the browser cache is cleared before testing.


Combining conditions using regex

Using regex, you could combine several of the conditions. For example, the following 4 conditions could be combined into one:

RewriteCond %{QUERY_STRING} =orderby=popularity [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc [OR]
RewriteCond %{QUERY_STRING} =orderby=date [OR]
RewriteCond %{QUERY_STRING} =orderby=menu_order [OR]

Is the same as (using regex alternation):

RewriteCond %{QUERY_STRING} ^orderby=(popularity|price-desc|date|menu_order)$ [OR]
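The same idea can be applied to the other groups of query strings from the question. The following is only a sketch of how the combined conditions might look. It assumes the wishlist IDs are limited to the ones listed in the question; the groupings are illustrative, not a complete replacement for the full list above.

```apache
RewriteEngine On

# Any of the four orderby values, as the complete query string
RewriteCond %{QUERY_STRING} ^orderby=(popularity|price-desc|date|menu_order)$ [OR]
# add_to_wishlist on its own, limited to the three IDs listed in the question
RewriteCond %{QUERY_STRING} ^add_to_wishlist=(9486|350|15153)$ [OR]
# orderby=price-desc followed by one of the listed wishlist IDs
# (no OR flag here because this is the last condition)
RewriteCond %{QUERY_STRING} ^orderby=price-desc&add_to_wishlist=(342|332|5736)$
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
```

The ^ and $ anchors keep these as whole-string matches, so a request with any additional parameter would not be redirected.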

UPDATE:

Is it possible to Remove everything (query string and parameters etc) from all URLs with something like * instead of hardcoding each query string?

To remove every query string from every URL (seriously?), you can do the following (no, you don't use *):

RewriteCond %{QUERY_STRING} .
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]

This removes any query string from any URL. The single dot (.) in the CondPattern matches any single character, so the condition only succeeds when the query string is not empty.

But this obviously removes the query strings you want to "keep" as well.

The regex character * is a quantifier that repeats the preceding token 0 or more times. (It is not a "wildcard-pattern".) It is not required here. You need to check that the query string is something, not nothing.
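To illustrate why the single dot is used rather than a pattern built with *, the two variants can be compared side by side. This is only a sketch; the second variant is commented out because it is shown as a counter-example, not for use.

```apache
RewriteEngine On

# "." requires at least one character, so this condition only succeeds
# when the query string is non-empty.
RewriteCond %{QUERY_STRING} .
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]

# ".*" also matches the empty string, so the condition would succeed on
# every request, including the already-clean URL that the redirect points
# to. The result would be a redirect loop.
# RewriteCond %{QUERY_STRING} .*
# RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
```

An equivalent way to express "the query string is something, not nothing" is the negated pattern !^$.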

There are other options:

  • Reverse the logic and make exceptions for the query strings you want to "keep" and remove the rest. Whether this is simpler depends on which list is longer.
  • Don't match the query strings "exactly". Instead, match URL parameter names, with any value.
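The two options above could look something like the following. Both are sketches under stated assumptions, not tested drop-in rules. The parameter names are taken from the examples in the question. The wp-admin/wp-login.php exclusion is an assumption that the site runs WordPress (suggested by parameters such as doing_wp_cron) and should be adjusted or removed otherwise. Use one option or the other, not both.

```apache
RewriteEngine On

# Option A: reverse the logic. Strip every query string unless it contains
# one of the parameter names to keep.
# Only act when the query string is non-empty
RewriteCond %{QUERY_STRING} .
# Assumption: WordPress admin/login URLs rely on query strings, so skip them
RewriteCond %{REQUEST_URI} !^/wp-(admin/|login\.php)
# Skip the redirect if any "keep" parameter is present. (^|&) ensures the
# name starts a parameter, so "s=" does not match inside "ids=".
RewriteCond %{QUERY_STRING} !(^|&)(attachment_id|p|fbclid|remove_item|add-to-cart|s)=
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]

# Option B: match parameter names with any value. Left commented out here
# because only one option should be active at a time.
# RewriteCond %{QUERY_STRING} (^|&)(orderby|add_to_wishlist|option|start|post_type|PageSpeed|Itemid)=
# RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
```

In both cases the whole query string is either kept or discarded. With Option A, an unwanted parameter survives if it appears alongside a "keep" parameter; with Option B, a "keep" parameter is lost if it appears alongside a listed one. Option B also does not cover the bare op query string or the long iact=... one from the question, which would still need their own conditions.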

8 Comments

Is it possible to Remove everything (query string and parameters etc) from all URLs with something like * instead of hardcoding each query string?
@muhammadusman To clarify... you now want to remove every query string on every URL?! Well yes you can do that (very simply) - but this is very different to what you are asking initially. No, you don't use *. I've updated my answer.
@muhammadusman How did you get on with this?
I just need all URLs clean. no query string or parameter. I am not sure how it will be solved.
@muhammadusman "but pattern is same" - What do you mean? What is the "pattern"? You initially stated they were "exact match"? "I just need all URLs clean. no query string or parameter." - That is what the update to my answer does (but that obviously also removes the query strings you stated you wanted to keep in your question). Are you saying that you don't want to keep any query strings?