DBD::Oracle patch to set charset via connect attribute

DBD::Oracle patch to set charset via connect attribute

am 29.03.2007 21:04:40 von SJS

Below is a patch that allows DBD::Oracle users to specify charset during
connect as follows:

$dbh = DBI->connect("dbi:Oracle:$sid", $user, $pass, { ora_csid => 'AL32UTF8' });

The patch is the simplest thing that I could come up with to demonstrate
the idea. I'm happy to make it robust if there is interest. Basically,
I call OCIEnvNlsCreate a second time, after using the first envhp to
convert the charset given as a string to Oracle's numeric identifier.

Cheers,
Stephen

$ diff -uNr DBD-Oracle-1.19.orig DBD-Oracle-1.19
diff -uNr DBD-Oracle-1.19.orig/dbdimp.c DBD-Oracle-1.19/dbdimp.c
--- DBD-Oracle-1.19.orig/dbdimp.c 2006-11-03 10:05:46.000000000 -0400
+++ DBD-Oracle-1.19/dbdimp.c 2007-03-29 08:34:54.486164400 -0400
@@ -502,6 +502,14 @@
return 0;
}

+ svp = DBD_ATTRIB_GET_SVP(attr, "ora_csid", 8);
+ if (svp && SvOK(*svp)) {
+ charsetid = OCINlsCharSetNameToId(imp_dbh->envhp, SvPV_nolen(*svp));
+ imp_dbh->envhp = NULL;
+ OCIEnvNlsCreate_log_stat( &imp_dbh->envhp, init_mode, 0, NULL, NULL, NULL, 0, 0,
+ charsetid, ncharsetid, status );
+ }
+
/* update the hard-coded csid constants for unicode charsets */
utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp, (void*)"UTF8");
al32utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp, (void*)"AL32UTF8");
EOP

Re: DBD::Oracle patch to set charset via connect attribute

am 02.04.2007 18:04:23 von scoles

Thanks Stephen I will get that into the next release which should be comming
out around the end of April.
""Stephen J. Smith"" wrote in message
news:20070329190440.GA28763@asp.domus.khadrin.com...
> Below is a patch that allows DBD::Oracle users to specify charset during
> connect as follows:
>
> $dbh = DBI->connect("dbi:Oracle:$sid", $user, $pass, { ora_csid =>
> 'AL32UTF8' });
>
> The patch is the simplest thing that I could come up with to demonstrate
> the idea. I'm happy to make it robust if there is interest. Basically,
> I call OCIEnvNlsCreate a second time, after using the first envhp to
> convert the charset given as a string to Oracle's numeric identifier.
>
> Cheers,
> Stephen
>
> $ diff -uNr DBD-Oracle-1.19.orig DBD-Oracle-1.19
> diff -uNr DBD-Oracle-1.19.orig/dbdimp.c DBD-Oracle-1.19/dbdimp.c
> --- DBD-Oracle-1.19.orig/dbdimp.c 2006-11-03
> 10:05:46.000000000 -0400
> +++ DBD-Oracle-1.19/dbdimp.c 2007-03-29 08:34:54.486164400 -0400
> @@ -502,6 +502,14 @@
> return 0;
> }
>
> + svp = DBD_ATTRIB_GET_SVP(attr, "ora_csid", 8);
> + if (svp && SvOK(*svp)) {
> + charsetid = OCINlsCharSetNameToId(imp_dbh->envhp,
> SvPV_nolen(*svp));
> + imp_dbh->envhp = NULL;
> + OCIEnvNlsCreate_log_stat( &imp_dbh->envhp, init_mode, 0,
> NULL, NULL, NULL, 0, 0,
> + charsetid, ncharsetid, status );
> + }
> +
> /* update the hard-coded csid constants for unicode charsets
> */
> utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp,
> (void*)"UTF8");
> al32utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp,
> (void*)"AL32UTF8");
> EOP

Re: DBD::Oracle patch to set charset via connect attribute

am 02.04.2007 19:42:07 von SJS

On Mon, Apr 02, 2007 at 12:04:23PM -0400, John Scoles wrote:
> Thanks Stephen I will get that into the next release which should be comming
> out around the end of April.

No problem. I'd be willing to make another patch that handles ncsid
and has a test case and documentation if you want.

Also, I was thinking "ora_charset" is a better name than "ora_csid"
since the parameter takes the name of a charset rather than a charset id.

-Stephen

Re: DBD::Oracle patch to set charset via connect attribute

am 02.04.2007 20:51:21 von scoles

----- Original Message -----
From: "Stephen J. Smith"
To:
Sent: Monday, April 02, 2007 1:42 PM
Subject: Re: DBD::Oracle patch to set charset via connect attribute


> On Mon, Apr 02, 2007 at 12:04:23PM -0400, John Scoles wrote:
>> Thanks Stephen I will get that into the next release which should be
>> comming
>> out around the end of April.
>
> No problem. I'd be willing to make another patch that handles ncsid
> and has a test case and documentation if you want.

That would be nice Thanks

>
> Also, I was thinking "ora_charset" is a better name than "ora_csid"
> since the parameter takes the name of a charset rather than a charset id.
>

How about both?
> -Stephen
>

Re: DBD::Oracle patch to set charset via connect attribute

am 03.04.2007 00:00:38 von Tim.Bunce

I wonder if this would break any of the assumptions built into the utf8 code.
There might be places that assume the charset is what the NLS env vars specify.
Just a thought.

Tim.

On Thu, Mar 29, 2007 at 03:04:40PM -0400, Stephen J. Smith wrote:
> Below is a patch that allows DBD::Oracle users to specify charset during
> connect as follows:
>
> $dbh = DBI->connect("dbi:Oracle:$sid", $user, $pass, { ora_csid => 'AL32UTF8' });
>
> The patch is the simplest thing that I could come up with to demonstrate
> the idea. I'm happy to make it robust if there is interest. Basically,
> I call OCIEnvNlsCreate a second time, after using the first envhp to
> convert the charset given as a string to Oracle's numeric identifier.
>
> Cheers,
> Stephen
>
> $ diff -uNr DBD-Oracle-1.19.orig DBD-Oracle-1.19
> diff -uNr DBD-Oracle-1.19.orig/dbdimp.c DBD-Oracle-1.19/dbdimp.c
> --- DBD-Oracle-1.19.orig/dbdimp.c 2006-11-03 10:05:46.000000000 -0400
> +++ DBD-Oracle-1.19/dbdimp.c 2007-03-29 08:34:54.486164400 -0400
> @@ -502,6 +502,14 @@
> return 0;
> }
>
> + svp = DBD_ATTRIB_GET_SVP(attr, "ora_csid", 8);
> + if (svp && SvOK(*svp)) {
> + charsetid = OCINlsCharSetNameToId(imp_dbh->envhp, SvPV_nolen(*svp));
> + imp_dbh->envhp = NULL;
> + OCIEnvNlsCreate_log_stat( &imp_dbh->envhp, init_mode, 0, NULL, NULL, NULL, 0, 0,
> + charsetid, ncharsetid, status );
> + }
> +
> /* update the hard-coded csid constants for unicode charsets */
> utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp, (void*)"UTF8");
> al32utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp, (void*)"AL32UTF8");
> EOP

Re: DBD::Oracle patch to set charset via connect attribute

am 13.04.2007 18:12:00 von SJS

On Mon, Apr 02, 2007 at 02:51:21PM -0400, John Scoles wrote:
> From: "Stephen J. Smith"
> >No problem. I'd be willing to make another patch that handles ncsid
> >and has a test case and documentation if you want.
>
> That would be nice Thanks

A new patch against subversion is included below.

One thing I was hoping to be able to do is connect two different database
handles at the same time, but specifying different client character
sets. I think it doesn't work because charsetid and ncharsetid are
global variables. Sound right?

Something like this will show the issue:
env ORA_CHARSET_FAIL=1 perl "-Iblib/lib" "-Iblib/arch" t/80ora_charset.t

Thanks,
Stephen


Index: Oracle.pm
============================================================ =======
--- Oracle.pm (revision 9406)
+++ Oracle.pm (working copy)
@@ -1258,6 +1258,17 @@
setting ora_envhp to 0 you can create connections with different
NLS settings. This is most useful for testing.

+=item ora_charset, ora_ncharset
+
+For oracle versions >= 9.2 you can specify the client charset and
+ncharset with the ora_charset and ora_ncharset attributes. You
+still need to pass C for all but the first connect.
+
+These attributes override the settings from environment variables.
+
+ $dbh = DBI->connect ($dsn, $user, $passwd,
+ {ora_charset => 'AL32UTF8'});
+
=back

=head2 Database Handle Attributes
Index: dbdimp.c
============================================================ =======
--- dbdimp.c (revision 9406)
+++ dbdimp.c (working copy)
@@ -326,6 +326,9 @@

imp_dbh->envhp = imp_drh->envhp; /* will be NULL on first connect */

+ ub2 new_charsetid = 0;
+ ub2 new_ncharsetid = 0;
+
#if defined(USE_ITHREADS) && defined(PERL_MAGIC_shared_scalar)
shared_dbh_priv_svp = (DBD_ATTRIB_OK(attr)?hv_fetch((HV*)SvRV(attr), "ora_dbh_share", 13, 0):NULL) ;
shared_dbh_priv_sv = shared_dbh_priv_svp?*shared_dbh_priv_svp:NULL ;
@@ -502,6 +505,43 @@
return 0;
}

+ svp = DBD_ATTRIB_GET_SVP(attr, "ora_charset", 11);
+ if (svp) {
+ if (!SvPOK(*svp)) {
+ croak("ora_charset is not a string");
+ }
+
+ new_charsetid = OCINlsCharSetNameToId(imp_dbh->envhp, SvPV_nolen(*svp));
+ if (!new_charsetid) {
+ croak("ora_charset value (%s) is not valid", SvPV_nolen(*svp));
+ }
+ }
+
+ svp = DBD_ATTRIB_GET_SVP(attr, "ora_ncharset", 12);
+ if (svp) {
+ if (!SvPOK(*svp)) {
+ croak("ora_ncharset is not a string");
+ }
+
+ new_ncharsetid = OCINlsCharSetNameToId(imp_dbh->envhp, SvPV_nolen(*svp));
+ if (!new_ncharsetid) {
+ croak("ora_ncharset value (%s) is not valid", SvPV_nolen(*svp));
+ }
+ }
+
+ if (new_charsetid || new_ncharsetid) {
+ if (new_charsetid) charsetid = new_charsetid;
+ if (new_ncharsetid) ncharsetid = new_ncharsetid;
+ imp_dbh->envhp = NULL;
+ OCIEnvNlsCreate_log_stat( &imp_dbh->envhp, init_mode, 0, NULL, NULL, NULL, 0, 0,
+ charsetid, ncharsetid, status );
+ if (status != OCI_SUCCESS) {
+ oci_error(dbh, NULL, status,
+ "OCIEnvNlsCreate. Check ORACLE_HOME env var, NLS settings, permissions, etc.");
+ return 0;
+ }
+ }
+
/* update the hard-coded csid constants for unicode charsets */
utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp, (void*)"UTF8");
al32utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp, (void*)"AL32UTF8");
Index: t/80ora_charset.t
============================================================ =======
--- t/80ora_charset.t (revision 0)
+++ t/80ora_charset.t (revision 0)
@@ -0,0 +1,127 @@
+#!perl -w
+use strict;
+
+use Encode;
+use Devel::Peek;
+
+use DBI;
+use DBD::Oracle qw(ORA_OCI);
+
+use Test::More;
+
+unshift @INC ,'t';
+require 'nchar_test_lib.pl';
+
+my $tdata = {
+ cols => [
+ [ 'ch', 'varchar2(20)', ],
+ [ 'nch', 'nvarchar2(20)', ],
+ [ 'descr', 'varchar2(50)', ],
+ ],
+ 'dump' => 'DUMP(%s)',
+ rows => [
+ [
+ "\xb0",
+ "\xb0",
+ 'DEGREE SIGN',
+ ],
+ ],
+};
+
+my $table = table();
+
+my $utf8_charset = (ORA_OCI >= 9.2) ? 'AL32UTF8' : 'UTF8';
+my $eight_bit_charset = 'WE8ISO8859P1';
+
+my $dbh_utf8;
+my $dbh;
+SKIP: {
+ plan skip_all => "Oracle 9.2 or newer required" unless ORA_OCI >= 9.2;
+
+ if ($ENV{ORA_CHARSET_FAIL}) {
+ # Connecting up here breaks because of the charset and ncharset
+ # global variables defined in dbdimp.c
+ $dbh_utf8 = db_connect(1);
+ }
+ $dbh = db_connect(0);
+
+ plan skip_all => "Not connected to oracle" if not $dbh;
+
+ my $testcount = 8 + insert_test_count( $tdata );
+
+ plan tests => $testcount;
+ show_test_data( $tdata ,0 );
+
+ drop_table($dbh);
+ create_table($dbh, $tdata);
+ insert_rows( $dbh, $tdata);
+
+ my ($ch, $nch) = $dbh->selectrow_array("select ch, nch from $table");
+ check($ch, $nch, 0);
+
+ unless ($ENV{ORA_CHARSET_FAIL}) {
+ $dbh_utf8 = db_connect(1);
+ }
+ ($ch, $nch) = $dbh_utf8->selectrow_array("select ch, nch from $table");
+ check($ch, $nch, 1);
+};
+
+sub check {
+ my $ch = shift;
+ my $nch = shift;
+ my $is_utf8 = shift;
+
+ if ($is_utf8) {
+ ok(Encode::is_utf8($ch));
+ ok(Encode::is_utf8($nch));
+ }
+ else {
+ ok(!Encode::is_utf8($ch));
+ ok(!Encode::is_utf8($nch));
+ }
+
+ is($ch, "\xb0", "match char");
+ is($nch, "\xb0", "match char");
+}
+
+sub db_connect
+{
+ my $utf8 = shift;
+
+ # Make sure we really are overriding the environment settings.
+ my ($charset, $ncharset);
+ if ($utf8) {
+ set_nls_lang_charset($eight_bit_charset);
+ set_nls_nchar($eight_bit_charset);
+ $charset = $utf8_charset;
+ $ncharset = $utf8_charset;
+ }
+ else {
+ set_nls_lang_charset($utf8_charset);
+ set_nls_nchar($utf8_charset);
+ $charset = $eight_bit_charset;
+ $ncharset = $eight_bit_charset;
+ }
+
+ my $dsn = oracle_test_dsn();
+ my $dbuser = $ENV{ORACLE_USERID} || 'scott/tiger';
+
+ my $p = {
+ AutoCommit => 1,
+ PrintError => 1,
+ FetchHashKeyName => 'NAME_lc',
+ ora_envhp => 0, # force fresh environment (with current NLS env vars)
+ };
+ $p->{ora_charset} = $charset if $charset;
+ $p->{ora_ncharset} = $ncharset if $ncharset;
+
+ my $dbh = DBI->connect($dsn, $dbuser, '', $p);
+ return $dbh;
+}
+
+END {
+ eval {
+ local $dbh->{PrintError} = 0;
+ drop_table( $dbh ) if $dbh and not $ENV{'DBD_SKIP_TABLE_DROP'};
+ };
+}

Re: DBD::Oracle patch to set charset via connect attribute

am 13.04.2007 20:50:17 von scoles

Thanks Stephen That was something I have been working on as of late mite
mesh very well with some of the other little things I am toying with.

----- Original Message -----
From: "Stephen J. Smith"
To:
Sent: Friday, April 13, 2007 12:12 PM
Subject: Re: DBD::Oracle patch to set charset via connect attribute


> On Mon, Apr 02, 2007 at 02:51:21PM -0400, John Scoles wrote:
>> From: "Stephen J. Smith"
>> >No problem. I'd be willing to make another patch that handles ncsid
>> >and has a test case and documentation if you want.
>>
>> That would be nice Thanks
>
> A new patch against subversion is included below.
>
> One thing I was hoping to be able to do is connect two different database
> handles at the same time, but specifying different client character
> sets. I think it doesn't work because charsetid and ncharsetid are
> global variables. Sound right?
>
> Something like this will show the issue:
> env ORA_CHARSET_FAIL=1 perl "-Iblib/lib" "-Iblib/arch" t/80ora_charset.t
>
> Thanks,
> Stephen
>
>
> Index: Oracle.pm
> ============================================================ =======
> --- Oracle.pm (revision 9406)
> +++ Oracle.pm (working copy)
> @@ -1258,6 +1258,17 @@
> setting ora_envhp to 0 you can create connections with different
> NLS settings. This is most useful for testing.
>
> +=item ora_charset, ora_ncharset
> +
> +For oracle versions >= 9.2 you can specify the client charset and
> +ncharset with the ora_charset and ora_ncharset attributes. You
> +still need to pass C for all but the first connect.
> +
> +These attributes override the settings from environment variables.
> +
> + $dbh = DBI->connect ($dsn, $user, $passwd,
> + {ora_charset => 'AL32UTF8'});
> +
> =back
>
> =head2 Database Handle Attributes
> Index: dbdimp.c
> ============================================================ =======
> --- dbdimp.c (revision 9406)
> +++ dbdimp.c (working copy)
> @@ -326,6 +326,9 @@
>
> imp_dbh->envhp = imp_drh->envhp; /* will be NULL on first connect */
>
> + ub2 new_charsetid = 0;
> + ub2 new_ncharsetid = 0;
> +
> #if defined(USE_ITHREADS) && defined(PERL_MAGIC_shared_scalar)
> shared_dbh_priv_svp = (DBD_ATTRIB_OK(attr)?hv_fetch((HV*)SvRV(attr),
> "ora_dbh_share", 13, 0):NULL) ;
> shared_dbh_priv_sv = shared_dbh_priv_svp?*shared_dbh_priv_svp:NULL ;
> @@ -502,6 +505,43 @@
> return 0;
> }
>
> + svp = DBD_ATTRIB_GET_SVP(attr, "ora_charset", 11);
> + if (svp) {
> + if (!SvPOK(*svp)) {
> + croak("ora_charset is not a string");
> + }
> +
> + new_charsetid = OCINlsCharSetNameToId(imp_dbh->envhp,
> SvPV_nolen(*svp));
> + if (!new_charsetid) {
> + croak("ora_charset value (%s) is not valid",
> SvPV_nolen(*svp));
> + }
> + }
> +
> + svp = DBD_ATTRIB_GET_SVP(attr, "ora_ncharset", 12);
> + if (svp) {
> + if (!SvPOK(*svp)) {
> + croak("ora_ncharset is not a string");
> + }
> +
> + new_ncharsetid = OCINlsCharSetNameToId(imp_dbh->envhp,
> SvPV_nolen(*svp));
> + if (!new_ncharsetid) {
> + croak("ora_ncharset value (%s) is not valid",
> SvPV_nolen(*svp));
> + }
> + }
> +
> + if (new_charsetid || new_ncharsetid) {
> + if (new_charsetid) charsetid = new_charsetid;
> + if (new_ncharsetid) ncharsetid = new_ncharsetid;
> + imp_dbh->envhp = NULL;
> + OCIEnvNlsCreate_log_stat( &imp_dbh->envhp, init_mode, 0,
> NULL, NULL, NULL, 0, 0,
> + charsetid, ncharsetid, status );
> + if (status != OCI_SUCCESS) {
> + oci_error(dbh, NULL, status,
> + "OCIEnvNlsCreate. Check ORACLE_HOME env var, NLS
> settings, permissions, etc.");
> + return 0;
> + }
> + }
> +
> /* update the hard-coded csid constants for unicode charsets
> */
> utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp,
> (void*)"UTF8");
> al32utf8_csid = OCINlsCharSetNameToId(imp_dbh->envhp,
> (void*)"AL32UTF8");
> Index: t/80ora_charset.t
> ============================================================ =======
> --- t/80ora_charset.t (revision 0)
> +++ t/80ora_charset.t (revision 0)
> @@ -0,0 +1,127 @@
> +#!perl -w
> +use strict;
> +
> +use Encode;
> +use Devel::Peek;
> +
> +use DBI;
> +use DBD::Oracle qw(ORA_OCI);
> +
> +use Test::More;
> +
> +unshift @INC ,'t';
> +require 'nchar_test_lib.pl';
> +
> +my $tdata = {
> + cols => [
> + [ 'ch', 'varchar2(20)', ],
> + [ 'nch', 'nvarchar2(20)', ],
> + [ 'descr', 'varchar2(50)', ],
> + ],
> + 'dump' => 'DUMP(%s)',
> + rows => [
> + [
> + "\xb0",
> + "\xb0",
> + 'DEGREE SIGN',
> + ],
> + ],
> +};
> +
> +my $table = table();
> +
> +my $utf8_charset = (ORA_OCI >= 9.2) ? 'AL32UTF8' : 'UTF8';
> +my $eight_bit_charset = 'WE8ISO8859P1';
> +
> +my $dbh_utf8;
> +my $dbh;
> +SKIP: {
> + plan skip_all => "Oracle 9.2 or newer required" unless ORA_OCI >=
> 9.2;
> +
> + if ($ENV{ORA_CHARSET_FAIL}) {
> + # Connecting up here breaks because of the charset and ncharset
> + # global variables defined in dbdimp.c
> + $dbh_utf8 = db_connect(1);
> + }
> + $dbh = db_connect(0);
> +
> + plan skip_all => "Not connected to oracle" if not $dbh;
> +
> + my $testcount = 8 + insert_test_count( $tdata );
> +
> + plan tests => $testcount;
> + show_test_data( $tdata ,0 );
> +
> + drop_table($dbh);
> + create_table($dbh, $tdata);
> + insert_rows( $dbh, $tdata);
> +
> + my ($ch, $nch) = $dbh->selectrow_array("select ch, nch from $table");
> + check($ch, $nch, 0);
> +
> + unless ($ENV{ORA_CHARSET_FAIL}) {
> + $dbh_utf8 = db_connect(1);
> + }
> + ($ch, $nch) = $dbh_utf8->selectrow_array("select ch, nch from
> $table");
> + check($ch, $nch, 1);
> +};
> +
> +sub check {
> + my $ch = shift;
> + my $nch = shift;
> + my $is_utf8 = shift;
> +
> + if ($is_utf8) {
> + ok(Encode::is_utf8($ch));
> + ok(Encode::is_utf8($nch));
> + }
> + else {
> + ok(!Encode::is_utf8($ch));
> + ok(!Encode::is_utf8($nch));
> + }
> +
> + is($ch, "\xb0", "match char");
> + is($nch, "\xb0", "match char");
> +}
> +
> +sub db_connect
> +{
> + my $utf8 = shift;
> +
> + # Make sure we really are overriding the environment settings.
> + my ($charset, $ncharset);
> + if ($utf8) {
> + set_nls_lang_charset($eight_bit_charset);
> + set_nls_nchar($eight_bit_charset);
> + $charset = $utf8_charset;
> + $ncharset = $utf8_charset;
> + }
> + else {
> + set_nls_lang_charset($utf8_charset);
> + set_nls_nchar($utf8_charset);
> + $charset = $eight_bit_charset;
> + $ncharset = $eight_bit_charset;
> + }
> +
> + my $dsn = oracle_test_dsn();
> + my $dbuser = $ENV{ORACLE_USERID} || 'scott/tiger';
> +
> + my $p = {
> + AutoCommit => 1,
> + PrintError => 1,
> + FetchHashKeyName => 'NAME_lc',
> + ora_envhp => 0, # force fresh environment (with current NLS env
> vars)
> + };
> + $p->{ora_charset} = $charset if $charset;
> + $p->{ora_ncharset} = $ncharset if $ncharset;
> +
> + my $dbh = DBI->connect($dsn, $dbuser, '', $p);
> + return $dbh;
> +}
> +
> +END {
> + eval {
> + local $dbh->{PrintError} = 0;
> + drop_table( $dbh ) if $dbh and not $ENV{'DBD_SKIP_TABLE_DROP'};
> + };
> +}
>