Tuesday 15 February 2011

PHP - Workaround for TinyMCE UTF-8 bug


Background: I have a website that uses TinyMCE to edit HTML. TinyMCE converts the non-breaking space entity (&nbsp;) and HTML entities such as &#x22c4; to UTF-8 characters, even though I am using the entity_encoding: "named" option. I store the HTML in a MySQL table that is UTF-8. When the same HTML is retrieved for editing again, TinyMCE splits the multi-byte characters into individual characters. TinyMCE has confirmed that this is a bug.

The question is: how do I convert all multi-byte UTF-8 characters into HTML entities without breaking the HTML?

I tried the following in PHP, but the multi-byte UTF-8 characters were removed:

  $encoded_string = htmlentities($utf_string, ENT_HTML5, 'UTF-8', false);
  $html_ent_conv = htmlspecialchars_decode($encoded_string, ENT_COMPAT | ENT_HTML5);

I also tried mb_encode_numericentity, but I could not figure out what I should use for the convmap parameter.

Update: I have PHP 5.3.17, which does not support ENT_HTML5, so I removed it. Now it works for the non-breaking space, but not for other multi-byte UTF-8 characters.

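A few years ago, I came across this piece of code in the PHP documentation:

  // Replace each multi-byte UTF-8 sequence with a decimal numeric entity.
  // preg_replace_callback() is used in place of the original /e modifier,
  // which was removed in PHP 7.
  function utf8_to_html($data) {
      return preg_replace_callback("/([\\xC0-\\xF7]{1,1}[\\x80-\\xBF]+)/", '_utf8_to_html', $data);
  }

  // Decode one UTF-8 sequence into its code point.
  function _utf8_to_html($match) {
      $data = $match[1];
      $ret = 0;
      // Strip the length bits from the lead byte, then add 6 bits per byte.
      foreach (str_split(strrev(chr((ord($data[0]) % 252 % 248 % 240 % 224 % 192) + 128) . substr($data, 1))) as $k => $v) {
          $ret += (ord($v) % 128) * pow(64, $k);
      }
      return "&#$ret;";
  }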
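
The question also mentions mb_encode_numericentity and its convmap parameter. As a rough sketch, assuming the mbstring extension is available, a convmap that covers every code point above ASCII might look like the following. The helper name utf8_to_numeric_entities() is made up for illustration. Because only non-ASCII characters fall inside the range, the HTML markup itself (<, >, &, quotes) should pass through untouched.

```php
<?php
// Sketch only: encode every non-ASCII character as a decimal numeric entity.
// The convmap is a flat array of 4-integer groups:
//   start code point, end code point, offset, mask
// utf8_to_numeric_entities() is a hypothetical helper name.
function utf8_to_numeric_entities($html)
{
    // 0x80..0x10FFFF covers everything above ASCII. An offset of 0 and a
    // mask wide enough for all code points leave the values unchanged.
    $convmap = array(0x80, 0x10FFFF, 0, 0x1FFFFF);
    return mb_encode_numericentity($html, $convmap, 'UTF-8');
}

// U+00A0 (non-breaking space) and U+22C4 written as raw UTF-8 bytes.
$html = "<p>a\xC2\xA0b \xE2\x8B\x84</p>";
echo utf8_to_numeric_entities($html), "\n";
// prints: <p>a&#160;b &#8900;</p>
```

This produces numeric entities rather than named ones, and the array() syntax is used so the sketch also runs on PHP 5.3.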
