UTF-8 and GB2312

To me this is a real nuisance: some Chinese web pages are encoded in UTF-8 (Unicode), others in GB2312. Internet Explorer cannot determine the Chinese encoding from a web page's header and makes a wrong guess. Basically the user has to go to View => Encoding and try the two encodings to see which one works.
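
To illustrate why a wrong guess produces garbage, here is a minimal Python sketch. It assumes a page can only be UTF-8 or GB2312, and the helper name `guess_decode` is made up for this example. It tries UTF-8 first because UTF-8 is strict about byte patterns, so GB2312 text usually fails to decode as UTF-8. This is a heuristic, not how any particular browser detects encodings.

```python
def guess_decode(data):
    """Return (text, encoding) for bytes assumed to be UTF-8 or GB2312.

    Hypothetical helper: tries UTF-8 first, then falls back to GB2312.
    """
    for encoding in ("utf-8", "gb2312"):
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
    raise ValueError("data is neither valid UTF-8 nor valid GB2312")


sample = "中文"  # "Chinese (written language)"
as_utf8 = sample.encode("utf-8")      # 6 bytes: 3 per character
as_gb2312 = sample.encode("gb2312")   # 4 bytes: 2 per character

# The same two characters are stored as different byte sequences.
print(as_utf8.hex(), as_gb2312.hex())

# Each byte string is recognised by the encoding it was written in.
print(guess_decode(as_utf8))
print(guess_decode(as_gb2312))
```

The fallback order matters: trying GB2312 first would accept many UTF-8 byte sequences as valid but wrong Chinese characters, which is the kind of garbled display described above.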

This problem is not limited to the Chinese language or to web browsers. Similar problems occur with other non-Western languages and with other software. Until the software offers a better solution, I wonder if we could all agree on UTF-8, which seems to me to be the newer (and better) standard.

1 comment

  1. Hi,
    the problem has much more to do with how your data is stored. It’s not only about the browser’s detection. To ensure that a language displays correctly, ALL documents (*.php files) that contain displayable data such as text AND all data from the database have to be stored in one character set. E.g. you can open a PHP file and check its encoding with Dreamweaver. And you can check whether all database tables that contain Chinese text entries are in gb2312 or not. Only if the whole chain, from the database through all the files, including every single displayable character, is in one character set will it be displayed correctly.
    If utf-8 is used, it is possible that many readers in mainland China will not change the character encoding in their browsers, and thus won’t see the content the way you want them to see it.
    I actually have the same problem, and IMHO the DB and programming style of WP does not seem to be the best solution for multi-language websites.
    Cheers Peter
