Recently we wanted to check whether our application can work with a non-Western character set, for example Greek (code page 1253).
I needed to implement some changes, one of which was to force our application to read the result of a SELECT query into a UnicodeString instead of an AnsiString, to make the app independent of the character set. The same goes for writing (UPDATE table SET column=?). So I changed some AnsiStrings to UnicodeStrings and started testing.
Easy does it. Right? No.
Let's start with the writing. That was easily fixed by changing the parameter data type from ftString to ftWideString. I constructed a UnicodeString with some special characters, et voilà, tested and confirmed: the data is stored correctly in the database.
But reading the same data back turns out to be less straightforward than I expected. For some reason I constantly get ANSI-converted characters back, regardless of how I set the field type or how I read from the TField.
I changed the field type of the TField from ftString to ftWideString. Then I read the text from field->AsString (or field->AsWideString). The text is now in a UnicodeString object, but it is not real Unicode data. In fact, it's still ANSI below the surface, or garbled in some other way.
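To make "not real Unicode data" concrete, it helps to look at the raw UTF-16 code units of the string that comes back. What follows is only a minimal diagnostic sketch in portable C++, not the code from my application: `dump_code_units`, `classify` and `Verdict` are hypothetical helper names, and the interpretation of each verdict is an assumption. For Greek text, real Unicode should show units such as 03B1 (alpha). Units such as 00E1 would suggest that code page 1253 bytes were widened without being decoded, and 003F would suggest a lossy conversion to '?'.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Render every UTF-16 code unit as four hex digits, space separated.
std::string dump_code_units(const std::u16string& text) {
    std::string out;
    char buf[8];
    for (char16_t unit : text) {
        if (!out.empty()) out += ' ';
        std::snprintf(buf, sizeof buf, "%04X", static_cast<unsigned>(unit));
        out += buf;
    }
    return out;
}

// Rough classification of what the string seems to contain (a heuristic only).
enum class Verdict { Wide, SingleByteOnly, QuestionMarks, PlainAscii };

Verdict classify(const std::u16string& text) {
    bool high_byte = false;
    bool question = false;
    for (char16_t unit : text) {
        if (unit >= 0x100) return Verdict::Wide;  // beyond Latin-1: real wide data
        if (unit >= 0x80) high_byte = true;       // fits in one byte, non-ASCII
        if (unit == u'?') question = true;        // possible replacement character
    }
    if (high_byte) return Verdict::SingleByteOnly;
    return question ? Verdict::QuestionMarks : Verdict::PlainAscii;
}

// Small self-check that documents the intended behaviour.
void self_check() {
    assert(dump_code_units(u"\u03B1\u03B2") == "03B1 03B2");
    assert(classify(u"\u03B1\u03B2") == Verdict::Wide);
    assert(classify(u"\u00E1\u00E2") == Verdict::SingleByteOnly);
    assert(classify(u"??") == Verdict::QuestionMarks);
    assert(classify(u"abc") == Verdict::PlainAscii);
}
```

In C++Builder on Windows, where wchar_t is 16 bits wide, the input could presumably be built from the field value with something like `std::u16string(reinterpret_cast<const char16_t*>(s.c_str()), s.Length())`. That conversion is an assumption on my part and untested here. Note that `SingleByteOnly` can also be legitimate Latin-1 text, so the verdict is a hint, not proof.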
This is what I do:
I have tested this with three different database brands and their supplied ODBC connectors: MSSQL, Sybase and MySQL.
The MSSQL and Sybase drivers both show the same issue. I can't find a way to read a UnicodeString properly.
MySQL, when using its supplied ANSI driver, showed the same issue - of course.
But the MySQL Unicode driver worked correctly (and, in addition, the field type is then automatically ftWideString, so there is no need to change it from ftString).
Besides ADO, I also tested all of the above with FireDAC. Though the results differ somewhat from ADO, it did not return proper Unicode for MSSQL, Sybase and MySQL ANSI. It did return correct data with the MySQL Unicode driver, just like ADO did.
What is going on here? The goal is to get it working with Sybase and TADOConnection.
It seems to me that, for some reason, the ODBC drivers for MSSQL and Sybase are set to convert data to ANSI. Obviously that is not what I want. How do I force them to return Unicode (using ftWideString) and get proper UTF-16 data in my UnicodeString?
I did some extra research: it looks like ADO is NOT initializing the ODBC drivers with ODBC version 3 or 3.8, meaning that the drivers may fall back to ODBC 2 behaviour, with ANSI conversion enabled as a result.
The MySQL Unicode driver proved that ADO and FireDAC are both capable of working in Unicode mode.
I am probably missing something obvious and I just don't see it. Or something is wrong in the way ADO 'talks' to the driver, causing the MSSQL and Sybase drivers to switch to ANSI mode.
Any thoughts on next steps to try?