invalid byte sequence for encoding "UTF8"


on 15.02.2010 17:03:59 by Luke Coldiron

I am running into a difference in behavior between Windows and Linux
that I think is related to the psqlODBC driver. On Windows x86, using
the Windows ODBC manager and the psqlODBC 8.4.2 ANSI driver, I have some
C++ code that executes the following SQL statements and inserts into a
Windows x86 PostgreSQL 8.4.2 database.

CREATE TABLE luke_test(a TEXT);
INSERT INTO luke_test (a) VALUES('ý');
INSERT INTO luke_test (a) VALUES('þ');
INSERT INTO luke_test (a) VALUES('ÿ');
SELECT * FROM luke_test;
--ý
--þ
--ÿ
DROP TABLE luke_test;

Each value being inserted is a single-byte character. This seems to work
fine on Windows. I do not want to treat bytes with their 8th bit set as
multi-byte characters.

However, if I compile and run the same C++ code on Linux x64, using the
unixODBC 2.2.15pre ODBC manager and the psqlODBC 8.4.2 ANSI driver, and
insert into the aforementioned Windows x86 PostgreSQL 8.4.2 database, I
get different behavior.

CREATE TABLE luke_test(a TEXT);
INSERT INTO luke_test (a) VALUES('ý');
Error executing "INSERT INTO luke_test (a) VALUES('ý');": ERROR:
invalid byte sequence for encoding "UTF8": 0xfd;
Error while executing the query
INSERT INTO luke_test (a) VALUES('þ');
Error executing "INSERT INTO luke_test (a) VALUES('þ');": ERROR:
invalid byte sequence for encoding "UTF8": 0xfe;
Error while executing the query
INSERT INTO luke_test (a) VALUES('ÿ');
Error executing "INSERT INTO luke_test (a) VALUES('ÿ');": ERROR:
invalid byte sequence for encoding "UTF8": 0xff;
Error while executing the query
SELECT * FROM luke_test;
DROP TABLE luke_test;

This leads me to believe that my environment on Linux is trying to treat
the character bytes as multi-byte characters. I have tried
SET client_encoding='ANSI_SQL' before I execute the inserts, and that
does not seem to help. Can anyone help explain the difference in
behavior that I am seeing and suggest a workaround that does not involve
encoding the character bytes as UTF8?

Luke
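
The error output above is consistent with the raw bytes reaching a UTF8
database unchanged: 0xfd, 0xfe and 0xff can never start a well-formed
UTF-8 sequence. As a rough illustration only (this is not the server's
actual code), a minimal C++ validity check along the lines of RFC 3629
could look like the sketch below. The name is_valid_utf8 is made up for
this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `s` is well-formed UTF-8 (RFC 3629).
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;      // continuation bytes expected after `lead`
        unsigned long cp = 0;       // decoded code point
        unsigned long min_cp = 0;   // smallest code point allowed for this length
        if (lead < 0x80) {
            cp = lead;                                         // plain ASCII
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1; cp = lead & 0x1F; min_cp = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; cp = lead & 0x0F; min_cp = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; cp = lead & 0x07; min_cp = 0x10000;
        } else {
            return false;  // stray continuation byte, or 0xF8..0xFF (includes 0xfd, 0xfe, 0xff)
        }
        if (n - i - 1 < extra) return false;                   // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;              // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp || cp > 0x10FFFF) return false;        // overlong or out of range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;        // surrogate code points
        i += extra + 1;
    }
    return true;
}
```

Under this check a lone byte such as "\xfd" is rejected, while the
two-byte UTF-8 form of the same character, "\xC3\xBD", is accepted.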

--
Sent via pgsql-odbc mailing list (pgsql-odbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-odbc

Re: invalid byte sequence for encoding "UTF8"

on 16.02.2010 02:00:03 by Hiroshi Inoue

luke wrote:
> I have tried SET client_encoding='ANSI_SQL' before I execute the
> inserts and that does not seem to help the situation.

Please try SET client_encoding='LATIN1'.

regards,
Hiroshi Inoue





Re: invalid byte sequence for encoding "UTF8"

on 16.02.2010 02:44:10 by Luke Coldiron

Hiroshi Inoue wrote:
> Please try SET client_encoding='LATIN1'.
That worked! Thank you!
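
A likely reason this works, assuming the database encoding is UTF8 as
the error message indicates: with client_encoding set to LATIN1 the
server converts each incoming byte to UTF-8 itself, so the client can
keep sending single bytes. The conversion is roughly equivalent to the
C++ sketch below. latin1_to_utf8 is a hypothetical helper written for
this illustration, not part of psqlODBC or PostgreSQL.

```cpp
#include <string>

// Hypothetical helper: re-encodes ISO-8859-1 (LATIN1) bytes as UTF-8.
// Every LATIN1 byte value equals its Unicode code point (U+0000..U+00FF),
// so bytes below 0x80 are copied and the rest become two-byte sequences.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte expands to two
    for (char ch : in) {
        const unsigned char b = static_cast<unsigned char>(ch);
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));                  // ASCII stays as-is
        } else {
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));    // lead byte: 110xxxxx
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));  // continuation: 10xxxxxx
        }
    }
    return out;
}
```

With this mapping the single byte 0xfd (ý) becomes the two bytes
0xc3 0xbd, which is the form a UTF8 database stores.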

