I am trying to clean up an HTML string and create an HTML5 document using Tidy and PHP; however, the result is an HTML 3.2 document. As shown below, I also get a "Config: missing or malformed argument for option: doctype" error. I am running PHP 5.5.35 on CentOS 6 with Apache 2.2, and phpinfo() shows the following:

tidy

Tidy support    enabled
libTidy Release 14 June 2007
Extension Version   2.0 ($Id: e066a98a414c7f79f89f697c19c4336c61bc617b $)

Directive   Local Value Master Value
tidy.clean_output   no value    no value
tidy.default_config no value    no value

How do I create an HTML5 document? Below is my attempt:

<?php
$html = <<<EOD
<p>Hello</p>
<div>
 <p data-customattribute="will be an error">bla</p>
 <p>bla</p>
</div>
<div>
 <p>Hi there!</p>
 <div>
  <p>Opps, a mistake</px>
 </div>
</div>
EOD;
$html="<!DOCTYPE HTML><html><head><title></title></head><body>$html</body></html>";

echo($html."\n\n");

$config = array(
    'indent' => true,
    'indent-spaces' => 4,
    'doctype' => '<!DOCTYPE HTML>',
);

$tidy = new tidy;
$tidy->parseString($html, $config, 'utf8');
$tidy->cleanRepair();
print_r($tidy);

OUTPUT

<!DOCTYPE HTML><html><head><title></title></head><body><p>Hello</p>
<div>
 <p data-customattribute="will be an error">bla</p>
 <p>bla</p>
</div>
<div>
 <p>Hi there!</p>
 <div>
  <p>Opps, a mistake</px>
 </div>
</div></body></html>

tidy Object
(
    [errorBuffer] => Config: missing or malformed argument for option: doctype
line 9 column 21 - Warning: discarding unexpected </px>
line 3 column 2 - Warning: <p> proprietary attribute "data-customattribute"
    [value] => <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
    <head>
        <title></title>
    </head>
    <body>
        <p>
            Hello
        </p>
        <div>
            <p data-customattribute="will be an error">
                bla
            </p>
            <p>
                bla
            </p>
        </div>
        <div>
            <p>
                Hi there!
            </p>
            <div>
                <p>
                    Opps, a mistake
                </p>
            </div>
        </div>
    </body>
</html>
)

1 Answer

Old versions of Tidy do not support HTML5 documents

The first release of Tidy that supports HTML5 came in September 2015, when the HTML Tidy Advocacy Community Group released the first version of tidy-html5.

Note that you are using an old version of Tidy (your phpinfo() output reports libTidy release 14 June 2007), so you will not be able to validate HTML5 documents with it.
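
A quick way to see which side of that line an installation falls on is to inspect the release string the extension reports. The snippet below is only a rough sketch: tidy_release_supports_html5() is a hypothetical helper, not part of the tidy extension, and the September 2015 cutoff is an assumption taken from the date mentioned above rather than an official compatibility rule.

```php
<?php
// Hypothetical helper (not part of the tidy extension): guesses HTML5 support
// from the release string reported by tidy_get_release() or phpinfo().
// Assumption: builds dated on or after the cutoff are tidy-html5 builds.
function tidy_release_supports_html5($release, $cutoff = '2015-09-01')
{
    $released = strtotime($release);
    if ($released === false) {
        return false; // unrecognised format: assume the old library
    }
    return $released >= strtotime($cutoff);
}

// Only query the library when the tidy extension is actually loaded.
if (function_exists('tidy_get_release')) {
    $release = tidy_get_release();
    echo $release, ': ',
        tidy_release_supports_html5($release) ? 'HTML5-capable' : 'too old for HTML5',
        "\n";
}
```

With the release shown in the question (14 June 2007) the helper returns false, which matches the behaviour you are seeing.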

Current precompiled releases of PHP are not yet built against tidy-html5, so if you want to use tidy-html5 you will have to compile PHP against it yourself.

These instructions were taken from the README file in the tidy-html5 GitHub repository:

Due to API changes in the PHP source, "buffio.h" needs to be changed to "tidybuffio.h" in the file ext/tidy/tidy.c.

That is, prior to configuring PHP, run this in the PHP source directory:

   sed -i 's/buffio.h/tidybuffio.h/' ext/tidy/*.c

And then continue with (just an example here, use your own php config options):

   ./configure --with-tidy=/usr/local
   make
   make test
   make install
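
Once PHP has been rebuilt, the configuration from the question could be adjusted along the following lines. This is a sketch under stated assumptions, not a verified recipe: build_tidy_config() is a hypothetical helper, and it assumes that the doctype option takes a keyword rather than a literal DOCTYPE string, with 'html5' accepted by tidy-html5 and 'omit' accepted by older builds.

```php
<?php
// Hypothetical helper: builds a Tidy config for the given library capability.
function build_tidy_config($supportsHtml5)
{
    return array(
        'indent'        => true,
        'indent-spaces' => 4,
        // Assumption: 'html5' is only understood by tidy-html5, while 'omit'
        // asks an older build to emit no doctype at all.
        'doctype'       => $supportsHtml5 ? 'html5' : 'omit',
    );
}

// Only run the repair when the tidy extension is actually loaded.
if (extension_loaded('tidy')) {
    $html = '<!DOCTYPE html><html><head><title></title></head>'
          . '<body><p data-customattribute="x">bla</p></body></html>';

    $tidy = new tidy;
    // Passing true assumes PHP was rebuilt against tidy-html5.
    $tidy->parseString($html, build_tidy_config(true), 'utf8');
    $tidy->cleanRepair();

    // Inspect the messages to see whether the doctype complaint is gone.
    echo $tidy->errorBuffer, "\n";
    echo tidy_get_output($tidy), "\n";
}
```

Keeping the capability flag as a parameter means the same script can run against either library while the rebuild is being rolled out.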

1 Comment

Thanks Dekel, I suspected as much after seeing I had a 7-year-old version of tidy. Ah, compile my own version. Joy!
