
Html_entity_decode Terminate?

I'm using html_entity_decode($row['Content']) to display some JSON data that contains HTML in a PHP document. Problem is that some of the data being returned has open HTML tags suc

Solution 1:

If you ever accept raw HTML from an outside source to embed into your site, you should always parse it, filter it against a whitelist, and re-serialise it yourself. You have no idea what that third-party HTML may contain, and you have no guarantee that it's valid; yet on your site you presumably want guaranteed valid HTML with certain limits on its content (or do you really want to enable the embedding of arbitrary <script> tags...?!).

That means you want to:

  1. parse the HTML and extract whatever structural information is in it
  2. filter that structure to allow only approved elements and then
  3. produce your own HTML from that which you can guarantee is syntactically valid.

A widely recommended PHP library for this is HTML Purifier. Without a library, you would use a lenient HTML parser such as DOMDocument to inspect and filter the content, and then the built-in DOMDocument::saveHTML (or DOMDocument::saveXML, if you want XML-style serialisation) to produce the new, sanitised HTML.
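
As a rough illustration of the three steps without a library, here is a minimal sketch built on DOMDocument. The function names sanitize_fragment and filter_children and the list of allowed tags are assumptions made for this example, not part of the original answer. The policy is deliberately strict: every attribute is dropped, script and style elements are removed together with their content, and any other unapproved element is unwrapped so that only its text and approved children survive.

```php
<?php
// Illustrative sketch only: function names and the allowed-tag list are hypothetical.

function filter_children(DOMNode $parent, array $allowed): void
{
    // Snapshot the children, since the live list changes as nodes are removed.
    foreach (iterator_to_array($parent->childNodes) as $child) {
        if ($child instanceof DOMText) {
            continue; // plain text is kept; it is escaped again on output
        }
        if (!($child instanceof DOMElement)) {
            $parent->removeChild($child); // comments, processing instructions
            continue;
        }
        $name = strtolower($child->nodeName);
        if ($name === 'script' || $name === 'style') {
            $parent->removeChild($child); // drop the element and its content
            continue;
        }
        filter_children($child, $allowed); // clean descendants first
        if (in_array($name, $allowed, true)) {
            // Approved element: keep it, but strip every attribute.
            while ($child->attributes->length > 0) {
                $child->removeAttributeNode($child->attributes->item(0));
            }
        } else {
            // Unapproved element: keep its filtered children, drop the tag.
            while ($child->firstChild !== null) {
                $parent->insertBefore($child->firstChild, $child);
            }
            $parent->removeChild($child);
        }
    }
}

function sanitize_fragment(
    string $html,
    array $allowed = ['p', 'br', 'b', 'i', 'em', 'strong', 'ul', 'ol', 'li']
): string {
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // malformed markup is expected
    // Step 1: lenient parse. The leading pseudo-declaration is a common
    // trick to make libxml read the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return ''; // nothing parseable in the input
    }

    // Step 2: filter the parsed structure against the whitelist.
    filter_children($body, $allowed);

    // Step 3: serialise our own markup, so unclosed tags come out closed.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

In the question's setting this might be called as echo sanitize_fragment(html_entity_decode($row['Content']));. Because the output is re-serialised from the parsed tree, tags left open in the source come out closed. If you need to allow attributes such as href, each one should be validated separately (for example by checking the URL scheme), which is the kind of detail a dedicated library like HTML Purifier already handles.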
