Saturday 15 March 2014

Non-ASCII characters in URL


I have a new problem that I have never seen before. My client is adding files to a project that we created, and some of the filenames contain special characters because some of the words are Spanish.

For example, the file I'm testing with has an á in its name. I am referencing it in the CSS file as a background image. It does not show in Safari, but it works in FF and Chrome.

As a test I pasted the link directly into the browser and got the same result: it works in FF and Chrome, but Safari throws an error. So something about the accented character is tripping it up.

Firefox rewrites the URL, changing the á to a%CC%81, and loads the image.

-Clássico_foto-Henrique-Peron-470x120-1371827671.jpg

You can try it and see ... but FF and Chrome convert it as follows:

You can also see it in action here:

.testbox { width: 340px; height: 100px; background: url('http://www.themediacouncil.com/test/nonascii/LA-MAR_Cebiche-Clásico_foto-Henrique-Peron-470x120-1371827671.jpg') no-repeat top left; }

So what is the right way to handle this? I am developing in PHP and WordPress, and I would rather not have to tell the client to go back and rename all the files with special characters.

Any help is appreciated. Thanks!

Strange that nobody answered this. It is probably too late for you by now, but for the record:

I believe the approach that is becoming the norm is to convert non-ASCII characters to UTF-8 byte sequences and include those sequences in the URL as %hh hex codes. In UTF-8, á, which is U+00E1 in Unicode, is encoded as the two bytes 0xC3 0xA1. Therefore, Clássico becomes Cl%C3%A1ssico.
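To make the byte arithmetic concrete, here is a minimal sketch in Python (not the PHP the question uses) built on the standard library's urllib.parse.quote. The variable names are only illustrative.

```python
from urllib.parse import quote

# "Clássico" written with the precomposed character U+00E1 (á).
name = "Cl\u00e1ssico"

# In UTF-8 the single code point U+00E1 becomes two bytes: 0xC3 0xA1.
utf8_bytes = name.encode("utf-8")

# quote() percent-encodes the UTF-8 bytes of every non-ASCII character.
encoded = quote(name, safe="")

print(utf8_bytes)  # b'Cl\xc3\xa1ssico'
print(encoded)     # Cl%C3%A1ssico
```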

The conversion you report from Firefox, Cla%CC%81ssico, is done in a slightly different way: the á was changed to a plain a followed by U+0301, the combining acute accent character. In UTF-8, U+0301 is encoded as 0xCC 0x81.
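The two spellings of á correspond to the Unicode normalization forms NFC (precomposed) and NFD (decomposed). The sketch below, again in Python with illustrative names, shows that the same visible word yields two different percent-encoded URLs. Normalizing to one form before building the URL is one possible approach, offered here as an assumption rather than a confirmed fix for the Safari issue.

```python
import unicodedata
from urllib.parse import quote

precomposed = "Cl\u00e1ssico"                             # á as one code point, U+00E1
decomposed = unicodedata.normalize("NFD", precomposed)    # a followed by U+0301

# Both strings look identical on screen but encode to different URLs.
url_nfc = quote(precomposed, safe="")
url_nfd = quote(decomposed, safe="")

print(url_nfc)  # Cl%C3%A1ssico
print(url_nfd)  # Cla%CC%81ssico

# Normalizing back to NFC recovers the precomposed spelling.
renormalized = quote(unicodedata.normalize("NFC", decomposed), safe="")
print(renormalized == url_nfc)  # True
```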

Another, older way of handling non-ASCII characters is to convert them to an 8-bit Latin charset representation (ISO-8859-1 or something similar, such as Windows-1252) and percent-encode each character as a single byte. This would turn Clássico into Cl%E1ssico. But since it only works for Latin alphabets, and is ambiguous even for some of those, this approach is hopefully, and probably, on its way out.
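For comparison, the legacy single-byte encoding can be reproduced with the same Python function by passing a different encoding. This is only a sketch with illustrative names. It also shows why the form is fragile: a receiver that assumes UTF-8 cannot recover the original name from it.

```python
from urllib.parse import quote, unquote

name = "Cl\u00e1ssico"

# In ISO-8859-1 (Latin-1), á is the single byte 0xE1.
latin1_url = quote(name, safe="", encoding="latin-1")
print(latin1_url)  # Cl%E1ssico

# Decoding works only if the receiver guesses the same charset.
as_latin1 = unquote(latin1_url, encoding="latin-1")
as_utf8 = unquote(latin1_url, encoding="utf-8")  # 0xE1 alone is not valid UTF-8

print(as_latin1 == name)  # True
print(as_utf8 == name)    # False
```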
