php - htmlspecialchars with unicode

Question

$string = "Główny folder grafik<p>asd nc</p>";

echo htmlspecialchars($string);

On the live site, the output is:

G&#322;ówny folder grafik<p>asd nc</p>

On my local machine, the output is:

Główny folder grafik<p>asd nc</p>

What is the problem? I want the result on the live site to look like the local one.

Why do you need to do this in the first place? It shouldn't be necessary.
– Pekka, Commented Apr 4, 2011 at 10:38

Community · Accepted Answer · 2023-11-17 19:42:58Z

1

htmlspecialchars() accepts additional parameters; the third one is the charset.

Try specifying that third parameter.

edited Nov 17, 2023 at 19:42

CommunityBot

answered Apr 4, 2011 at 10:37

Pascal MARTIN

Christophe · Accepted Answer · 2011-04-04 10:42:00Z

1

You need to add extra parameters to the htmlspecialchars() function. The following should work:

htmlspecialchars($string, ENT_QUOTES, "UTF-8");
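As a minimal sketch of the effect, assuming the source file and the page are both UTF-8 (the variable name `$escaped` is illustrative only), the call escapes the markup characters and leaves the Polish letters untouched:

```php
<?php
// Assumes this file is saved as UTF-8.
$string = "Główny folder grafik<p>asd nc</p>";

// ENT_QUOTES escapes both double and single quotes;
// 'UTF-8' tells PHP how to interpret the bytes of $string.
$escaped = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n"; // Główny folder grafik&lt;p&gt;asd nc&lt;/p&gt;
```

Only `<`, `>`, `&` and the quote characters are replaced; every other character is passed through as-is.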

answered Apr 4, 2011 at 10:42

Christophe

fabrik · Accepted Answer · 2011-04-04 10:38:11Z

0

You may want to pass the optional charset parameter to htmlspecialchars(). When it is omitted, older PHP versions assume ISO-8859-1; the default changed to UTF-8 in PHP 5.4.
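One way to stop depending on the default is to wrap the call so the charset is always explicit. The following is only a sketch; `escape_html()` is a hypothetical helper name, not a PHP built-in:

```php
<?php
/**
 * Hypothetical helper (not part of PHP): escape text for HTML output
 * with an explicit charset instead of relying on the default.
 */
function escape_html(string $text, string $charset = 'UTF-8'): string
{
    return htmlspecialchars($text, ENT_QUOTES, $charset);
}

echo escape_html('Główny <b>folder</b>'), "\n"; // Główny &lt;b&gt;folder&lt;/b&gt;

// On PHP 5.6 and later, this ini setting is what htmlspecialchars()
// falls back to when no charset argument is given.
echo ini_get('default_charset'), "\n";
```

Comparing the `default_charset` value on the live and local servers is a quick way to see whether the two environments disagree.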

answered Apr 4, 2011 at 10:38

fabrik

rubo77 · Accepted Answer · 2021-03-18 22:56:03Z

If you require all characters that have associated named entities to be translated, use htmlentities() instead. That function is identical to htmlspecialchars() in all ways, except that with htmlentities(), all characters which have HTML character entity equivalents are translated into these entities.

But even htmlentities() does not encode all Unicode characters. It encodes what it can (all of Latin-1), and the others slip through (e.g. Љ).

The function below checks each character against ASCII code ranges, so you can customize which characters are encoded and which are left alone.

(Note: it is certainly not fast.)

/**
 * Unicode-proof htmlentities.
 * Returns 'normal' chars as chars and weirdos as numeric html entities.
 * @param  string $str input string
 * @return string      encoded output
 */
function superentities( $str ){
    // get rid of existing entities else double-escape
    $str = html_entity_decode(stripslashes($str),ENT_QUOTES,'UTF-8');
    $ar = preg_split('/(?<!^)(?!$)/u', $str );  // split into an array of UTF-8 characters
    $str2 = '';  // initialize the output buffer before appending to it
    foreach ($ar as $c){
        $o = ord($c);
        if ( (strlen($c) > 1) || /* multi-byte [unicode] */
            ($o <32 || $o > 126) || /* <- control / latin weirdos -> */
            ($o >33 && $o < 40) ||/* " # $ % & ' (quotes + ampersand) */
            ($o >59 && $o < 63) /* html */
        ) {
            // convert to numeric entity
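            // map is (start, end, offset, mask); the range goes up to U+10FFFF
            // so characters outside the Basic Multilingual Plane are converted too
            $c = mb_encode_numericentity($c,array (0x0, 0x10ffff, 0, 0x1fffff), 'UTF-8');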
        }
        $str2 .= $c;
    }
    return $str2;
}
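
If decoding existing entities and fine-grained control over the ASCII ranges are not needed, a shorter variant is possible. The following is a sketch under assumptions, not a drop-in replacement for the function above: it assumes UTF-8 input and the mbstring extension, and `ascii_safe_html()` is a hypothetical name.

```php
<?php
// Assumes UTF-8 input and the mbstring extension.

/**
 * Hypothetical helper: escape markup characters, then turn every
 * non-ASCII code point into a numeric entity.
 */
function ascii_safe_html(string $str): string
{
    $escaped = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');
    // Conversion map is [start, end, offset, mask]: U+0080..U+10FFFF.
    return mb_encode_numericentity($escaped, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');
}

$text = 'é Љ <p>';

// htmlentities() uses a named entity for é but leaves Љ as-is.
echo htmlentities($text, ENT_QUOTES, 'UTF-8'), "\n"; // &eacute; Љ &lt;p&gt;

// The helper emits numeric entities for both.
echo ascii_safe_html($text), "\n"; // &#233; &#1033; &lt;p&gt;
```

Because the map starts at U+0080, the `&` characters produced by htmlspecialchars() are not encoded a second time.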
