Closed as not planned
Description
The following code:
<?php
$html='<p>My Brand®</p>';
$dom = new \DOMDocument();
$dom->encoding = 'UTF-8';
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$result = $dom->saveHTML();
print($result);
Resulted in this output:
<p>My Brand®</p>
But I expected this output instead:
<p>My Brand®</p>
Or at least this:
<p>My Brand®</p>
The same issue occurs with other Unicode characters. More examples:
<p>My ☆ Brand</p>
>>><p>My ☆ Brand</p>
<div>€ 100</div>
>>><div>€ 100</div>
<p>À bientôt!</p>
>>><p>À bientôt!</p>
<p>Hello 😕 there!</p>
>>><p>Hello 😕 there!</p>
Could you please explain why this happens and how to fix it, or suggest a workaround?
Also, as far as I can see, it forcibly converts all Unicode characters to HTML entities. Many people prefer to keep the original formatting (HTML entities stay entities, UTF-8 characters stay characters), so it would be good to fix this too.
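For context, a possible workaround sketch. The legacy `DOMDocument::loadHTML()` is backed by libxml2's HTML parser, which assumes ISO-8859-1 when the input declares no charset, so multi-byte UTF-8 sequences get misread. Assuming PHP 8.4 or later is available (as in the version reported here), the newer `Dom\HTMLDocument` class accepts an explicit encoding and serializes non-ASCII characters as-is. The helper name `roundTripHtml` is hypothetical and only for illustration; this is not necessarily the fix the maintainers would recommend.

```php
<?php
// Minimal sketch, assuming PHP >= 8.4 (Dom\HTMLDocument does not exist before that).
// roundTripHtml() is a hypothetical helper name, not part of PHP.
function roundTripHtml(string $html): string
{
    $dom = Dom\HTMLDocument::createFromString(
        $html,
        // NOIMPLIED: do not wrap the fragment in <html>/<body>.
        // NOERROR: silence parser warnings such as the missing doctype.
        LIBXML_HTML_NOIMPLIED | LIBXML_NOERROR,
        'UTF-8' // tell the parser the input is UTF-8 instead of letting it guess
    );

    // The HTML5 serializer keeps non-ASCII characters instead of emitting entities.
    return $dom->saveHtml();
}

print(roundTripHtml('<p>My Brand®</p>'));
```

On older PHP versions this class is not available, so a different approach would be needed there, for example declaring the charset in the input before calling `loadHTML()`.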
PHP Version
PHP 8.4.5 (cli) (built: Mar 17 2025 20:35:32) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.4.5, Copyright (c) Zend Technologies
with Zend OPcache v8.4.5, Copyright (c), by Zend Technologies
Operating System
Ubuntu 25.04