I am trying the following code to replace all Spanish special characters with something that can be used in a URL.

<?php
                $Handle = "blusa-tipo-túnica-asimétrica-sin-mangas";
                $Handle = str_replace( 'à', 'a', $Handle );
                $Handle = str_replace( 'á', 'a', $Handle );
                $Handle = str_replace( 'â', 'a', $Handle );
                $Handle = str_replace( 'ã', 'a', $Handle );
                $Handle = str_replace( 'ä', 'a', $Handle );
                $Handle = str_replace( 'å', 'a', $Handle );
                $Handle = str_replace( 'è', 'e', $Handle );
                $Handle = str_replace( 'é', 'e', $Handle );
                $Handle = str_replace( 'ê', 'e', $Handle );
                $Handle = str_replace( 'ë', 'e', $Handle );
                $Handle = str_replace( 'ì', 'i', $Handle );
                $Handle = str_replace( 'í', 'i', $Handle );
                $Handle = str_replace( 'î', 'i', $Handle );
                $Handle = str_replace( 'ï', 'i', $Handle );
                $Handle = str_replace( 'ð', 'o', $Handle );
                $Handle = str_replace( 'ñ', 'n', $Handle );
                $Handle = str_replace( 'ò', 'o', $Handle );
                $Handle = str_replace( 'ó', 'o', $Handle );
                $Handle = str_replace( 'ô', 'o', $Handle );
                $Handle = str_replace( 'õ', 'o', $Handle );
                $Handle = str_replace( 'ù', 'u', $Handle );
                $Handle = str_replace( 'ú', 'u', $Handle );
                $Handle = str_replace( 'û', 'u', $Handle );
                $Handle = str_replace( 'ü', 'u', $Handle );

                echo $Handle;
?>

But this prints exactly the input I gave it, "blusa-tipo-túnica-asimétrica-sin-mangas", unchanged. Why? What am I doing wrong?

  • I can't reproduce this: 3v4l.org/4se7E Commented Feb 24, 2022 at 0:18
  • 1. The charset you've written your matches in is probably different from the one the data uses. 2. No, you can't detect the charset. 3. Yes, I know there are functions that purport to do that; they guess. 4. You need to know the charset explicitly. 5. Don't use anything like this code; use Transliterator. Commented Feb 24, 2022 at 0:19
  • Those "spanish special characters" are just letters and also exist outside Spanish. Do you even keep in mind the actually interesting characters, like ¿ and ¡? Commented Feb 24, 2022 at 12:47
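
The first point in the comments above can be illustrated with a small sketch. It assumes, purely for the sake of the example, that the data arrives as ISO-8859-1 while the search string is UTF-8; replace_u_acute() and the variable names are made up here. The same visible letter has different bytes in each encoding, so str_replace() finds nothing to replace.

```php
<?php
// Hypothetical helper for this illustration: replace a UTF-8 encoded "ú" with "u".
function replace_u_acute(string $handle): string
{
    // "\xC3\xBA" is the two-byte UTF-8 encoding of "ú".
    return str_replace("\xC3\xBA", 'u', $handle);
}

$utf8Handle   = "t\xC3\xBAnica"; // "túnica" stored as UTF-8
$latin1Handle = "t\xFAnica";     // "túnica" stored as ISO-8859-1 (a single byte for "ú")

echo replace_u_acute($utf8Handle), "\n";            // bytes match, prints "tunica"
echo bin2hex(replace_u_acute($latin1Handle)), "\n"; // no match, bytes unchanged: 74fa6e696361
```

Both strings look identical on screen, which is why the result appears to be "exactly the same input".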

1 Answer

My little trick for replacing accented characters is to convert the string to HTML entities, then replace each entity with its base letter:

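function strip_accents($str)
{
    // Turn accented characters into named entities, e.g. "ú" becomes "&uacute;".
    $str = htmlentities($str, ENT_COMPAT, 'UTF-8');

    // Keep only the base letter of an accented-letter entity: "&uacute;" becomes "u".
    $str = preg_replace('#&([A-Za-z])(?:acute|cedil|circ|grave|ring|tilde|uml);#', '\1', $str);
    // Keep both letters of a ligature entity: "&aelig;" becomes "ae".
    $str = preg_replace('#&([A-Za-z]{2})lig;#', '\1', $str);
    // Drop any entity that is left.
    $str = preg_replace('#&[^;]+;#', '', $str);

    return $str;
}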

Note: make sure your source file is UTF-8 encoded, since htmlentities() is told here to read the string as UTF-8.
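
One way to act on that note is to check the string before converting it. This is only a sketch: is_valid_utf8() is a hypothetical helper name, and it relies on preg_match() with the u modifier failing on byte sequences that are not valid UTF-8.

```php
<?php
// Hypothetical helper: true when $str is a well-formed UTF-8 byte sequence.
function is_valid_utf8(string $str): bool
{
    // With the "u" modifier, preg_match() returns false for malformed UTF-8,
    // while the empty pattern matches any well-formed subject and returns 1.
    return preg_match('//u', $str) === 1;
}

var_dump(is_valid_utf8("asim\xC3\xA9trica")); // "é" as UTF-8: bool(true)
var_dump(is_valid_utf8("asim\xE9trica"));     // "é" as ISO-8859-1: bool(false)
```

If the check fails, the string has to be converted to UTF-8 first, which means knowing what encoding it is actually in.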

2 Comments

This will remove ð instead of turning it into o as OP's code does (regardless of how little sense that mapping makes for the letter itself or in the context of Spanish).
@AmigoJack Sure, but in that case OP could add a str_replace() turning &eth; into o.
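
A minimal sketch of that suggestion, assuming the input is valid UTF-8. The name strip_accents_with_eth() is made up for this example; apart from the extra str_replace() step and the character class written as [A-Za-z], the steps follow the answer's function.

```php
<?php
// Sketch: the answer's approach plus one extra step that maps "ð" (&eth;) to "o".
function strip_accents_with_eth(string $str): string
{
    // Accented characters become named entities, e.g. "ð" becomes "&eth;".
    $str = htmlentities($str, ENT_COMPAT, 'UTF-8');

    // Extra step from the comment: handle &eth; before the generic cleanup removes it.
    $str = str_replace('&eth;', 'o', $str);

    // Reduce accented-letter and ligature entities to their base letters.
    $str = preg_replace('#&([A-Za-z])(?:acute|cedil|circ|grave|ring|tilde|uml);#', '$1', $str);
    $str = preg_replace('#&([A-Za-z]{2})lig;#', '$1', $str);

    // Drop every entity that is still left.
    $str = preg_replace('#&[^;]+;#', '', $str);

    return $str;
}

// "\xC3\xBA" is "ú" and "\xC3\xA9" is "é" in UTF-8.
echo strip_accents_with_eth("blusa-tipo-t\xC3\xBAnica-asim\xC3\xA9trica-sin-mangas"), "\n";
```

Only the lowercase ð is handled here; the uppercase Ð (&ETH;) would still be dropped by the last step.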
