PHP-MySQL charset basics

from the Artful MySQL Tips List


If non-ASCII characters display incorrectly in a PHP page, there is a character encoding problem. Most of the time, the best solution is to use UTF-8 encoding at every level:

1. Set your code editor to save all files as UTF-8.

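2. Set the character set of each column, table and database to utf8. Notice that the MySQL name for UTF-8 has no hyphen. In current MySQL versions, utf8 is an alias for utf8mb3, which stores at most three bytes per character; if you need four-byte characters such as emoji, use utf8mb4 instead.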
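
As a rough sketch of step 2, the statements below switch an existing database default and one table to utf8. The names mydb and mytable and the helper utf8_conversion_sql() are hypothetical; the function only builds the SQL strings, so you can review them before running them yourself.

```php
<?php
// Hypothetical helper: builds (does not run) the SQL that switches a
// database default and one table, with its text columns, to a charset.
function utf8_conversion_sql( $db, $table, $charset = 'utf8' ) {
  return array(
    // Changes the default for tables created later in this database.
    "ALTER DATABASE `$db` CHARACTER SET $charset",
    // Converts the table default and its existing character columns.
    "ALTER TABLE `$db`.`$table` CONVERT TO CHARACTER SET $charset",
  );
}

// 'mydb' and 'mytable' are placeholder names.
foreach ( utf8_conversion_sql( 'mydb', 'mytable' ) as $sql ) {
  echo $sql, ";\n";
}
```

Back up the data before running a conversion like this on a live table, since CONVERT TO rewrites the stored column values.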

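3. Set all PHP-MySQL connections to utf8 using mysqli_set_charset($conn, 'utf8'), where $conn is your mysqli connection. This is preferable to SET NAMES, which changes the charset on the server side only and does not tell PHP's client library about it, so functions such as mysqli_real_escape_string() may not handle multi-byte characters correctly.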
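
A minimal sketch of step 3 might look like this. The function name open_utf8_connection() and its parameters are hypothetical, not part of the original tip.

```php
<?php
// Hypothetical helper: opens a mysqli connection and sets its charset.
// On PHP 8.1+, mysqli reports errors by throwing mysqli_sql_exception by
// default, so the explicit checks below mainly matter on older versions.
function open_utf8_connection( $host, $user, $pass, $db ) {
  $conn = mysqli_connect( $host, $user, $pass, $db );
  if ( !$conn ) {
    die( 'Connect failed: ' . mysqli_connect_error() );
  }
  // Sets the connection charset on the server and in the client library.
  if ( !mysqli_set_charset( $conn, 'utf8' ) ) {
    die( 'Could not set charset: ' . mysqli_error( $conn ) );
  }
  return $conn;
}
```

You would call it with your own credentials, for example $conn = open_utf8_connection('localhost', 'user', 'password', 'mydb'), and can then confirm the result with mysqli_character_set_name($conn).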

4. Set the charset in the web page header (notice: no quotes round the charset name):

<?php header('Content-Type: text/html; charset=utf-8'); ?>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

5. If your webapp needs to set its character set dynamically, say from a process that puts the desired charset in $_SESSION['charset'], you'll need functionality like this:

<?php
// $_SESSION is only populated after session_start() has been called.
$charset = ( isset( $_SESSION['charset'] )) ? $_SESSION['charset'] : "utf8";
$htmlcharset = htmlcharset( $charset );
header( "Content-type: text/html; charset=$htmlcharset" );
echo "<head><meta http-equiv='Content-Type' content='text/html;charset=$htmlcharset'></head>";

// ...

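// Translate a MySQL charset name into its HTML name; returns "" if unknown.
function htmlcharset( $mysqlcharset ) {
  $a = acharsets();
  // array_column() (PHP 5.5+) stands in for the acolumn() helper, which is not shown here.
  $n = array_search( $mysqlcharset, array_column( $a, 0 ));
  return ( $n === FALSE ? "" : $a[$n][1] );
}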

function acharsets() {
  return array (
  //    MySQL NAME  HTML NAME           DESCRIPTION, MAXLEN (bytes per character)
  array('armscii8', 'ARMSCII-8',        'Armenian',1),
  array('ascii',    'US-ASCII',         'US ASCII',1),
  array('big5',     'BIG5',             'Traditional Chinese',2),
  array('cp1250',   'CP1250',           'Windows Central European',1),
  array('cp1251',   'CP1251',           'Windows Cyrillic',1),
  array('cp1256',   'CP1256',           'Windows Arabic',1),
  array('cp1257',   'CP1257',           'Windows Baltic',1),
  array('cp852',    'CP852',            'DOS Central European',1),
  array('cp866',    'CP866',            'DOS Russian',1),
  array('cp932',    'CP932',            'SJIS for Windows Japanese',2),
  array('eucjpms',  'EUC-JP',           'UJIS for Windows Japanese',3),
  array('euckr',    'EUC-KR',           'Korean',2),
  array('gb2312',   'GB2312',           'Simplified Chinese',2),
  array('gbk',      'GBK',              'Simplified Chinese',2),
  array('geostd8',  'GEOSTD8',          'Georgian',1),
  array('greek',    'ISO-8859-7',       'Greek',1),
  array('hebrew',   'ISO-8859-8',       'Hebrew',1),
  array('koi8u',    'KOI8-U',           'Ukrainian',1),
  array('latin1',   'CP1252',           'West European',1),
  array('latin2',   'ISO-8859-2',       'Central European',1),
  array('latin5',   'ISO-8859-9',       'Turkish',1),
  array('latin7',   'ISO-8859-13',      'Baltic',1),
  array('macce',    'MACCENTRALEUROPE', 'Mac Central European',1),
  array('macroman', 'MACROMAN',         'Mac West European',1),
  array('sjis',     'SHIFT-JIS',        'Japanese',2),
  array('tis620',   'TIS620',           'Thai',1),
  array('ujis',     'EUC-JP',           'Japanese',3),
  array('utf8',     'UTF-8',            'Unicode',3)
  );
}
?>


Notice that ucs2 (UCS-2) is missing from this list: MySQL does not permit it as a client character set.
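
The htmlcharset() lookup in the code above returns an empty string for a name that is not in the list, which would put an empty charset in the header. One possible variation, sketched here with hypothetical function names and a deliberately shortened map, uses an associative array and falls back to UTF-8 instead.

```php
<?php
// Shortened MySQL-name => HTML-name map; extend it from the full list as needed.
function html_charset_map() {
  return array(
    'latin1' => 'CP1252',
    'latin2' => 'ISO-8859-2',
    'sjis'   => 'SHIFT-JIS',
    'utf8'   => 'UTF-8',
  );
}

// Returns the HTML charset name for a MySQL charset name,
// or $default when the MySQL name is not in the map.
function html_charset_or_default( $mysqlcharset, $default = 'UTF-8' ) {
  $map = html_charset_map();
  return isset( $map[$mysqlcharset] ) ? $map[$mysqlcharset] : $default;
}

echo html_charset_or_default( 'latin1' ), "\n";   // CP1252
echo html_charset_or_default( 'klingon' ), "\n";  // UTF-8 (unknown name, default used)
```

Because the session value is only ever used as a lookup key, nothing taken from $_SESSION is copied directly into the header.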

Last updated 5 Mar 2025

