3

I have a MySQL database with some content created with CKEditor. The text is stored like this: &lt;p&gt;&lt;strong&gt;L&amp;rsquo;automatisation des syst&amp;egrave;mes dans le monde, so with HTML entities.

With Twig, in a Symfony project, I display this field with {{ article.mytext|raw }}, but it displays <p><strong>L&rsquo;automatisation des syst&egrave;mes dans le monde, so it's not fully decoded and interpreted...

With plain PHP I have no problem: html_entity_decode($mytext); does the job perfectly.

Can you help me? What is wrong?

As requested, more of the code. In MySQL, in a utf8_general_ci column "vTexte": &lt;p&gt;&lt;strong&gt;L&amp;rsquo;automatisation des syst&amp;egrave;mes dans le monde

In my controller in Symfony:

namespace MU\CoreBundle\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\Controller;
use Symfony\Component\HttpFoundation\Response;
use MU\CoreBundle\Entity\Veille;

class CoreController extends Controller
{
    public function actufreeAction()
    {
        $repository = $this
            ->getDoctrine()
            ->getManager()
            ->getRepository('MUCoreBundle:Veille');

        $listactufree = $repository->findBy(
            array('vStatus' => '4'),            // criteria
            array('vDatePublished' => 'desc'),  // sort order
            5,                                  // limit
            0                                   // offset
        );

        $content = $this->get('templating')->render('MUCoreBundle::news.html.twig', array(
            'listactufree' => $listactufree,
        ));

        return new Response($content);
    }
}

In my Twig file news.html.twig:

{% for veille in listactufree %}

    {{ veille.vTexte | raw }}

{% endfor %}

With that, it shows: <p><strong>L&rsquo;automatisation des syst&egrave;mes dans le monde

and I want:

L’automatisation des systèmes dans le monde

5
  • you really should provide more of the source, rather than little snippets. Commented Dec 30, 2017 at 0:33
  • I'll provide a solution for you soon, but so that I can ensure it's the best one for you, could you tell me if you have control over how this HTML is stored? Are you using htmlentities somewhere before sending it to the DB? Commented Dec 30, 2017 at 6:02
  • Thank you for your help. Yes, I have control over how this HTML is stored, but I have 800 articles stored like this... that's why I would really like to find a way to display them correctly. This HTML is stored using CKEditor 4 and a PDO prepared statement in "normal" PHP/MySQL. I also used htmlspecialchars before the insertion, but I already tried without it and it changes nothing. Commented Dec 30, 2017 at 14:13
  • I don't use html_entity_decode before the insertion, but I used it to display this HTML data from my DB and it works. Without html_entity_decode it shows the same result I get in Symfony after using the raw filter: <p><strong>L&rsquo;automatisation des syst&egrave;mes dans le monde. That's why I was previously trying to use the raw filter twice... Commented Dec 30, 2017 at 14:13
  • I have also tried replacing the content in my DB with the content I get after using the raw filter: <p><strong>L&rsquo;automatisation des syst&egrave;mes dans le monde, and when I use the raw filter on this, it works. That's why I was confused and tried to use the raw filter twice. Commented Dec 30, 2017 at 14:24

4 Answers

3

{{ html_var|convert_encoding('UTF-8', 'HTML-ENTITIES')}}


1

Carefully control encodings.

The problem you are encountering is the simplest form of the nightmare that is character encoding. Rule one in handling encoding issues is to always control the encoding. In your case, this HTML should really be stored unencoded in the DB, allowing for a single use of the Twig raw output filter.

Consider the implications if you don't know whether the HTML needs to be decoded. For instance, if someone intended to show a literal < (written as &lt;) in the text and the HTML was not otherwise encoded, applying html_entity_decode turns that &lt; into a real < and breaks the HTML: browsers will think you're starting a new tag.
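To make that risk concrete, here is a minimal sketch in plain PHP. The sample string is made up for illustration; it stands for HTML that is already in its final, correct state and gets decoded one time too many.

```php
<?php
// Hypothetical article fragment: the author wants a literal "<" to appear in the text.
$html = '<p>Use a &lt; b to compare.</p>';

// The HTML is already correct, so decoding it again is one decode too many:
// the intended literal "<" becomes markup.
$decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, PHP_EOL; // <p>Use a < b to compare.</p>
```

After the extra decode, the browser sees a stray < in the middle of the paragraph and may treat what follows as the start of a tag.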

Forms submitting html

I'm going to guess that elsewhere in your app there are forms which allow people to submit HTML. Note that neither the HTML form nor PHP's $_POST handling entity-encodes posted fields on its own. In your case the entities most likely come from two places: CKEditor, which by default writes named entities such as &rsquo; and &egrave;, and the htmlspecialchars call you apply before the insert, which turns < into &lt; and & into &amp;.

Whatever parts of your application store such posted HTML, or add or change these entities, should make sure it is stored as raw HTML: drop the extra encoding step, or apply html_entity_decode before saving.

This way, you always know that the HTML is stored in your database in exactly the state it needs to be in to render on the page. If it needs to be encoded and decoded again somewhere for some reason, you're not left hoping that some of the content won't be double-decoded (or, as in your case, missing a decode step and showing the HTML source as text).

Rendering HTML in twig

One way or another, you need to get that content to Twig, and through the raw filter, in the exact state it needs to be in on the page. While this can be done in Twig, or with Twig functions, it really should be done in the controller.

Consider the path this data is traveling:

  1. Form (raw HTML)
  2. PHP POST (encoded HTML)
  3. DB (encoded HTML)
  4. SELECT Query
  5. PHP Controller (encoded HTML)
  6. Twig
  7. Rendered HTML

The earlier you can get the data into the correct state, the better. Doing so makes the data easier to work with, reduces the performance impact, and prevents code duplication in the later steps. (What happens if you want to use this HTML in other controllers or Twig templates? Code duplication.) Hence my first suggestion: get it into the database cleanly.
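The path above can be traced in a few lines of plain PHP. This is only a sketch: the submitted string is an assumed example of what CKEditor 4 sends (with closing tags added), and the htmlspecialchars call stands in for the encoding step the asker mentions applying before the insert.

```php
<?php
// Step 1: what the editor is assumed to submit (named entities for non-ASCII characters).
$submitted = '<p><strong>L&rsquo;automatisation des syst&egrave;mes dans le monde</strong></p>';

// Steps 2-3: htmlspecialchars() before the INSERT encodes the markup a second time,
// giving &lt;p&gt;&lt;strong&gt;L&amp;rsquo;automatisation ... in the database.
$stored = htmlspecialchars($submitted, ENT_QUOTES, 'UTF-8');

// Step 5: exactly one decode in the controller (or entity) restores the HTML,
// which Twig's raw filter can then output unchanged.
$forTwig = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

var_dump($forTwig === $submitted); // bool(true)
```

A single decode undoes only the htmlspecialchars layer. The remaining &rsquo; and &egrave; are ordinary HTML entities, which the browser resolves once the markup is output through raw.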

If clean data is not an option...

Clean the data in your controller, maybe even with a static function in the entity.

The only other clean-ish way to handle this is to loop through and decode the HTML before passing it to Twig. Doing it this way, however, or later in the Twig template as you're currently doing, you run the risk of the double-decoding I mentioned earlier.

Veille

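class Veille
{

    . . .

    /**
     * Decodes the HTML entities of every Veille in the list, in place.
     * Call it exactly once per list to avoid double-decoding.
     */
    public static function decodeArray(?array $veilleList): ?array
    {
        // Nothing to decode when no list was passed.
        if ($veilleList === null) {
            return null;
        }

        foreach ($veilleList as $veille) {
            if (!$veille instanceof self) {
                continue;
            }

            $veille->setVText(html_entity_decode($veille->getVText()));
        }

        return $veilleList;
    }

    // Placeholder getter returning encoded test data; use your entity's real getter.
    public function getVText(): ?string
    {
        return htmlentities("<h3>Encoded HTML Test</h3>Good data &gt; parsing bad data.");
    }

    // Placeholder setter; use your entity's real setter.
    public function setVText(?string $text): void
    {

    }
}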

Controller

$listactufree = $repository->findBy(
    array('vStatus' => '4'),            // criteria
    array('vDatePublished' => 'desc'),  // sort order
    5,                                  // limit
    0                                   // offset
);


$listactufree = Veille::decodeArray($listactufree);

Twig

{% for veille in listactufree %}

    {{ veille.vTexte|raw }}

{% endfor %}

Note, this same code could clean your data.

Plugging the contents of the decodeArray method into a command, and running that command, along with persist/flush, would store decoded HTML for all of your entities. Just make sure you only decode ONCE, and any methods which can add or edit these entities store the HTML unencoded. That's the key. Unencoded data.
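For such a one-off cleanup, a guard against decoding the same row twice can help. The sketch below is an assumption, not part of the original code: decodeOnce is a hypothetical helper, and its check (the value contains &lt; but no real <) is only a heuristic that fits data shaped like the question's.

```php
<?php
/**
 * Hypothetical helper: decode a stored value only if it still looks encoded,
 * i.e. it contains "&lt;" but no real "<". This is a heuristic, not a guarantee.
 */
function decodeOnce(string $stored): string
{
    $looksEncoded = strpos($stored, '&lt;') !== false && strpos($stored, '<') === false;

    return $looksEncoded
        ? html_entity_decode($stored, ENT_QUOTES, 'UTF-8')
        : $stored;
}

// Made-up row in the same double-encoded shape as the question's data.
$row = '&lt;p&gt;L&amp;rsquo;automatisation&lt;/p&gt;';

$first  = decodeOnce($row);   // decoded: the value had no real tags yet
$second = decodeOnce($first); // unchanged: the value already contains real tags

var_dump($first === '<p>L&rsquo;automatisation</p>'); // bool(true)
var_dump($second === $first);                          // bool(true)
```

Running the cleanup command a second time by accident would then leave already-cleaned rows alone. Rows whose text contains no tags at all would be skipped, so check a sample of the data before relying on it.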

2 Comments

Super, thank you for these great explanations! First I added an html_entity_decode before saving the content of my form in the database. Then I tried your code in Symfony, and the decodeArray function works well. I was confused by the getter and setter you wrote: do I need them? It works with the one I had: public function getVTexte() { return $this->vTexte; }. And finally, I just wrote a command to transform the data in my database into decoded HTML!
I only included the vText getter and setter for code completeness, so it was clear what the decodeArray function expected and returned. You don't need my getters and setters, just use yours. Glad I could help :)
1

Disabling automatic escaping and applying convert_encoding should decode the entities:

{% autoescape false %}
  {{ value|convert_encoding('UTF-8', 'HTML-ENTITIES') }}
{% endautoescape %}

If you don't use autoescape false, Twig escapes the output again, so the HTML entities will not appear decoded. I used this for decoding &#039; to '.

Sources: https://twig.symfony.com/doc/2.x/tags/autoescape.html https://twig.symfony.com/doc/2.x/filters/convert_encoding.html


0

You have to set autoescape to false:

See: https://twig.symfony.com/doc/2.x/tags/autoescape.html

{% autoescape false %}
    {{ html_var }}
{% endautoescape %}

1 Comment

It displays the same result as I get with {{ html_var | raw }}. I still have <p><strong>L&rsquo;automatisation des syst&egrave;mes dans le monde :-(
