Connector/ODBC and Utf8mb4

from the Artful MySQL Tips List

The characters you use in your string (2-byte, or 16-bit) are commonly called wide characters, not multi-byte characters. In ODBC, wide characters are always 16 bits, whereas multi-byte characters can have other widths and can even be variable-length.

ODBC supports two main character types: ANSI characters (the SQLCHAR type, 8-bit) and wide characters (the SQLWCHAR type, 16-bit). Accordingly, the functions that work with ANSI strings have the A suffix (such as SQLConnectA()), and the functions that work with wide characters have the W suffix (such as SQLConnectW()).

When your application calls SQLConnect(), the call is mapped to the A or W variant depending on the Unicode setting in your program (typically, whether the UNICODE macro is defined at compile time).

With the UTF8 character set used in MySQL, characters vary from 8 to 24 bits (1 to 3 bytes), whilst UTF8MB4 characters can be up to 32 bits (4 bytes). Note that this is different from UTF32, where every character has a fixed 32-bit length.
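
The widths described above can be checked with a short Python sketch. It is only an illustration: the helper names (utf8_bits, utf32_bits, fits_mysql_utf8) are made up for this example, and it uses Python's own codecs rather than MySQL or ODBC code.

```python
# Illustrative sketch: measure how many bits a character needs in
# UTF-8 versus UTF-32. Helper names are hypothetical, not part of
# MySQL or Connector/ODBC.

def utf8_bits(ch: str) -> int:
    """Bits used by one character when encoded as UTF-8."""
    return len(ch.encode("utf-8")) * 8

def utf32_bits(ch: str) -> int:
    """Bits used by one character when encoded as UTF-32 (no BOM)."""
    return len(ch.encode("utf-32-le")) * 8

def fits_mysql_utf8(ch: str) -> bool:
    """True if the character needs at most 3 bytes, the limit of
    MySQL's 3-byte UTF8 character set."""
    return len(ch.encode("utf-8")) <= 3

if __name__ == "__main__":
    # U+0041, U+00E9, U+20AC, U+1F600 (the last is outside the BMP)
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print(f"U+{ord(ch):04X}: utf-8 {utf8_bits(ch)} bits, "
              f"utf-32 {utf32_bits(ch)} bits, "
              f"fits 3-byte UTF8: {fits_mysql_utf8(ch)}")
```

Characters beyond U+FFFF, such as most emoji, are the ones that need the fourth byte and therefore UTF8MB4.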

Specifying CHARSET=UTF8MB4 allows the ODBC driver to correctly convert the variable-length multi-byte characters of the UTF8MB4 character set into the 16-bit wide characters supported by ODBC.
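
To show what such a conversion involves, here is a minimal Python sketch that turns UTF-8 bytes into 16-bit units. It does not reproduce the driver's actual code; the function name utf8_to_utf16_units is hypothetical, and the sketch assumes the 16-bit wide characters are UTF-16 code units.

```python
# Illustrative sketch of converting variable-length UTF-8 bytes into
# 16-bit units, assuming the wide characters are UTF-16 code units.
# This is not the Connector/ODBC implementation.
import struct

def utf8_to_utf16_units(data: bytes) -> list:
    """Decode UTF-8 bytes and return the UTF-16 code units as integers."""
    text = data.decode("utf-8")
    raw = text.encode("utf-16-le")          # 2 bytes per code unit, no BOM
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))

if __name__ == "__main__":
    euro = "\u20ac".encode("utf-8")         # 3 bytes in UTF-8
    emoji = "\U0001F600".encode("utf-8")    # 4 bytes in UTF-8
    print([hex(u) for u in utf8_to_utf16_units(euro)])
    print([hex(u) for u in utf8_to_utf16_units(emoji)])
```

A 3-byte character fits in a single 16-bit unit, while a 4-byte character becomes two 16-bit units (a surrogate pair), so a conversion that only understands 3-byte sequences cannot handle it.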


Last updated 18 Aug 2021

