DOMDocument
PHP Manual

DOMDocument::saveHTML

(PHP 5)

DOMDocument::saveHTML - Dumps the internal document into a string using HTML formatting

Description

string DOMDocument::saveHTML ([ DOMNode $node = NULL ] )

Creates an HTML document from the DOM representation. This function is usually called after building a new DOM document from scratch, as in the example below.

Parameters

node

Optional parameter to output a subset of the document.

Return Values

Returns the HTML, or FALSE if an error occurred.

Changelog

Version Description
5.3.6 The node parameter was added.

Examples

Example #1 Saving an HTML tree into a string

<?php

$doc = new DOMDocument('1.0');

$root = $doc->createElement('html');
$root = $doc->appendChild($root);

$head = $doc->createElement('head');
$head = $root->appendChild($head);

$title = $doc->createElement('title');
$title = $head->appendChild($title);

$text = $doc->createTextNode('This is the title');
$text = $title->appendChild($text);

echo $doc->saveHTML();

?>


User Contributed Notes:

mpeters at domblogger dot net (03-Nov-2011 08:27)

There is not a <script /> problem.

When a script node does not have a child and it is dumped as XML, a self closing script node is proper. Any browser with XML support will do the right thing IF you send your document with the right mime type -- application/xhtml+xml

When you dump it via saveHTML() - the script node will not be self closing.

There is however a <source /> problem.

With the new html5 media tags, <source src="whatever"> is not closed in html - so when sending as html, do a preg_replace on the output of saveHTML() to get rid of the </source> tags which are invalid.
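A minimal sketch of that preg_replace step, assuming the HTML string contains literal </source> closing tags. The helper name stripSourceClosingTags and the sample markup are hypothetical, not part of DOMDocument; in practice the input would be the return value of saveHTML().

```php
<?php
// Hypothetical helper: removes every closing </source> tag from an HTML string.
function stripSourceClosingTags($html)
{
    // Case-insensitive match, tolerating whitespace before the ">".
    return preg_replace('~</source\s*>~i', '', $html);
}

// Sample markup standing in for the output of $doc->saveHTML().
$html  = '<video controls><source src="movie.ogv" type="video/ogg"></source></video>';
$clean = stripSourceClosingTags($html);

echo $clean, "\n";
```

Only the closing tags are touched; the opening <source ...> tags and their attributes are left as they were.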

alvaro at demogracia dot com (29-Mar-2011 07:04)

Since PHP/5.3.6, DOMDocument->saveHTML() accepts an optional DOMNode parameter similarly to DOMDocument->saveXML():

http://bugs.php.net/bug.php?id=39771
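As a rough illustration of the node parameter (PHP 5.3.6 or later), the sketch below serializes a single element instead of the whole document. The sample markup and variable names are made up for the example.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<div><h1>Hello world!</h1></div>');

// Pick the one node to serialize.
$h1 = $doc->getElementsByTagName('h1')->item(0);

// Passing a node returns the markup for that subtree only,
// without the doctype or the html/body wrappers.
$fragment = $doc->saveHTML($h1);

echo $fragment, "\n";
```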

Yajo (24-Nov-2010 03:21)

Another way to work around the <script/> problem is to put a semicolon (;) inside the script element.
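A short sketch of the semicolon trick, assuming the script element is built with createElement(); the file name myScript.js is only a placeholder. Because the element then has a text child, saveXML() cannot collapse it into a self-closing tag.

```php
<?php
$doc = new DOMDocument('1.0');

// The second argument becomes the element's text content.
$script = $doc->createElement('script', ';');
$script->setAttribute('src', 'myScript.js');
$doc->appendChild($script);

// Serialize just the script element as XML.
$xml = $doc->saveXML($script);

echo $xml, "\n";
```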

Anonymous (09-Feb-2010 03:52)

If you want a simpler way to get around the <script> tag problem try:

<?php

$script = $doc->createElement('script');

// Creating an empty text node forces <script></script>
$script->appendChild($doc->createTextNode(''));

$head->appendChild($script);

?>

Anonymous (13-May-2009 03:35)

To prevent script tags from being output as <script />, you can use the DOMDocumentFragment class:

<?php

// $xmlstring and $source are assumed to be defined elsewhere
$doc = new DOMDocument();
$doc->loadXML($xmlstring);
$fragment = $doc->createDocumentFragment();
/* Append the script element to the fragment using raw XML strings (will be preserved in their raw form) and, if successful, proceed to insert it in the DOM tree */
if ($fragment->appendXML("<script type='text/javascript' src='$source'></script>")) {
  $xpath = new DOMXpath($doc);
  /* namespace-safe method to find all head elements which are children of the html element, should only return 1 match */
  $resultlist = $xpath->query("//*[local-name() = 'html']/*[local-name() = 'head']");
  foreach ($resultlist as $headnode) {
    // insert the script tag
    $headnode->appendChild($fragment);
  }
}
echo $doc->saveXML(); /* and our script tags will still be <script></script> */

?>

Bart Feenstra (18-Jan-2009 06:17)

I am using this solution to prevent tags and the doctype from being added to the HTML string automatically:

<?php
$html = '<h1>Hello world!</h1>';
$html = '<div>' . $html . '</div>';
$doc = new DOMDocument;
$doc->loadHTML($html);
// Serialize only the wrapper div, then cut off "<div>" (5 chars) and "</div>" (6 chars)
echo substr($doc->saveXML($doc->getElementsByTagName('div')->item(0)), 5, -6);

// Outputs: "<h1>Hello world!</h1>"
?>

m at hbblogs daught calm (18-Aug-2008 04:41)

This method, as of 5.2.6, will automatically add <html><body> and <!DOCTYPE> tags to the document if they are missing, without asking whether you want them. In my application, I needed to use the DOM methods to manipulate just a fragment of html, so these tags were rather unhelpful.

Here's a simple hack to remove them in case, like me, all you wanted to do was perform a few operations on an HTML fragment.

$html_fragment = preg_replace('/^<!DOCTYPE.+?>/', '', str_replace( array('<html>', '</html>', '<body>', '</body>'), array('', '', '', ''), $dom->saveHTML()));

Anonymous (26-Apr-2008 04:15)

<?php
// Serializes a node as XML, then rewrites explicitly closed empty HTML
// elements as self-closing tags and named entities as numeric ones.
function getDOMString($retNode) {
  if (!$retNode) return null;
  $retval = strtr($retNode->ownerDocument->saveXML($retNode),
  array(
    '></area>' => ' />',
    '></base>' => ' />',
    '></basefont>' => ' />',
    '></br>' => ' />',
    '></col>' => ' />',
    '></frame>' => ' />',
    '></hr>' => ' />',
    '></img>' => ' />',
    '></input>' => ' />',
    '></isindex>' => ' />',
    '></link>' => ' />',
    '></meta>' => ' />',
    '></param>' => ' />',
    'default:' => '',
    // sometimes, you have to decode entities too...
    '&quot;' => '&#34;',
    '&amp;' => '&#38;',
    '&apos;' => '&#39;',
    '&lt;' => '&#60;',
    '&gt;' => '&#62;',
    '&nbsp;' => '&#160;',
    '&copy;' => '&#169;',
    '&laquo;' => '&#171;',
    '&reg;' => '&#174;',
    '&raquo;' => '&#187;',
    '&trade;' => '&#8482;'
  ));
  return $retval;
}
?>

mjaque at ilkebenson dot com (19-Feb-2008 07:34)

DOMDocument->saveXML() doesn't generate a proper XHTML format either.

There is a problem with empty "script" elements. For example, saveXML generates the following code, with an empty script tag:

<html>
  <head>
    <script type="text/JavaScript" src="myScript.js"/>
  </head>
  <body>
    <p>I will not appear</p>
    <script type="text/JavaScript">
    alert("Not working");
    </script>
  </body>
</html>

I don't know if this is valid XHTML (W3C Validator doesn't complain), but both FF 2.0 and IE 6 will not render it properly. Both will use </script> as the closing tag for the first script causing js errors and ignoring in between elements.

You can post-process saveXML string in order to close empty tags with the following function:

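<?php
    // Rewrites self-closing <$tag ... /> elements in $xml as <$tag ...></$tag>
    function cerrarTag($tag, $xml){
        $indice = 0;
        while ($indice < strlen($xml)){
            $pos = strpos($xml, "<$tag ", $indice);
            // compare against false explicitly: a match at offset 0 is still a match
            if ($pos !== false){
                $posCierre = strpos($xml, ">", $pos);
                if ($xml[$posCierre-1] == "/"){
                    // replace "/>" with an explicit closing tag
                    $xml = substr_replace($xml, "></$tag>", $posCierre-1, 2);
                }
                $indice = $posCierre;
            }
            else break;
        }
        return $xml;
    }
?>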

At least script and select empty elements should be closed. This example shows how it can be used:

<?php
  define("CABECERA_XHTML", '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">');

  // $docXML is assumed to be an already populated DOMDocument
  $xhtml = $docXML->saveXML($docXML->documentElement);
  $xhtml = cerrarTag("script", $xhtml);
  $xhtml = cerrarTag("select", $xhtml);
  $xhtml = CABECERA_XHTML."\n".$xhtml;
  echo $xhtml;
?>

archanglmr at yahoo dot com (27-Nov-2007 11:28)

If you created your DOMDocument object using loadHTML() (where the source is from another site) and want to pass your changes back to the browser, make sure the HTTP Content-Type header matches the value of your meta content-type tag, because modern browsers seem to ignore the meta tag and trust only the HTTP header. For example, if you're reading an ISO-8859-1 document and your web server is claiming UTF-8, you need to correct it using the header() function.

<?php
header('Content-Type: text/html; charset=iso-8859-1');
?>

xoplqox (20-Nov-2007 07:07)

XHTML:

If the output is XHTML use the function saveXML().

Output example for saveHTML:

<select name="pet" size="3" multiple>
    <option selected>mouse</option>
    <option>bird</option>
    <option>cat</option>
</select>

XHTML-conformant output using saveXML:

<select name="pet" size="3" multiple="multiple">
    <option selected="selected">mouse</option>
    <option>bird</option>
    <option>cat</option>
</select>
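The difference can be tried out on a single element. This is only a sketch with a made-up option element; the exact saveHTML() output depends on the libxml2 version PHP was built against, so only the saveXML() form is stated in the comment.

```php
<?php
$doc = new DOMDocument('1.0');

$option = $doc->createElement('option', 'mouse');
$option->setAttribute('selected', 'selected');
$doc->appendChild($option);

// XML serialization always writes the attribute with its value:
// <option selected="selected">mouse</option>
$asXml = $doc->saveXML($option);

// HTML serialization of the same node, for comparison (PHP 5.3.6+).
$asHtml = $doc->saveHTML($option);

echo $asXml, "\n", $asHtml, "\n";
```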

tyson at clugg dot net (22-Apr-2005 01:44)

<?php
// Using DOM to fix sloppy HTML.
// An example by Tyson Clugg <tyson@clugg.net>
//
// vim: syntax=php expandtab tabstop=2

function tidyHTML($buffer)
{
  // load our document into a DOM object
  // (loadHTML() is called on an instance; @ suppresses parser warnings)
  $dom = new DOMDocument();
  @$dom->loadHTML($buffer);
  // we want nice output
  $dom->formatOutput = true;
  return $dom->saveHTML();
}

// start output buffering, using our nice
// callback function to format the output.
ob_start("tidyHTML");

?>
<html>
<p>It's like comparing apples to oranges.
</html>
<?php

// this will be called implicitly, but we'll
// call it manually to illustrate the point.
ob_end_flush();

?>

The above code takes sloppy HTML:
 <html>
 <p>It's like comparing apples to oranges.
 </html>

And cleans it up to the following:
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
 <html><body><p>It's like comparing apples to oranges.
 </p></body></html>