BerkeleyDB solution
on 10.03.2009 09:55:58 by torsten.foertsch
Hi,
this is not a problem but a solution. I know some of you use the
BerkeleyDB module to store data. Recently I tried to use UTF-8 keys
and failed. When reading keys back I sometimes got character strings
and sometimes octet strings. I had used the following two filters to
ensure that the data in the database are octet strings and the data I
get back are character strings:
$db->filter_fetch_key(sub { $_=Encode::decode('utf8', $_) });
$db->filter_store_key(sub { $_=Encode::encode('utf8', $_) });
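The problem is that BerkeleyDB doesn't reset the UTF8 flag when storing
data into the variables passed via @_, as in c_get() or db_get(). If such
a variable previously held a character string, the flag stays set on the
octets read from the database. One possible solution is to clear the flag
explicitly in the fetch filter: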
$db->filter_fetch_key(sub {
    Encode::_utf8_off($_);
    $_ = Encode::decode('utf8', $_);
});
The other, better solution is to fix it in BerkeleyDB.xs. This is what
the attached patch does. I have sent it to the author, Paul Marquess.
Here is his reply:
On Tue 10 Mar 2009, Paul Marquess wrote:
> Your patch looks fine and should be ok to include in my development
> copy without any changes.
Torsten
--
Need professional mod_perl support?
Just hire me: torsten.foertsch@gmx.net
Attachment: utf8.patch
--- BerkeleyDB.xs~ 2009-02-18 21:31:46.000000000 +0100
+++ BerkeleyDB.xs 2009-03-06 14:38:04.000000000 +0100
@@ -430,7 +430,10 @@
 #define getInnerObject(x) ((SV*)SvRV(sv))
 #endif

-#define my_sv_setpvn(sv, d, s) (s ? sv_setpvn(sv, d, s) : sv_setpv(sv, "") )
+#define my_sv_setpvn(sv, d, s) do { \
+    s ? sv_setpvn(sv, d, s) : sv_setpv(sv, ""); \
+    SvUTF8_off(sv); \
+  } while(0)

 #define GetValue_iv(h,k) (((sv = readHash(h, k)) && sv != &PL_sv_undef) \
                 ? SvIV(sv) : 0)
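For readers without the Perl headers at hand, here is a minimal stand-alone sketch of what the patch changes. It is not the XS code. MockSV, mock_sv_setpvn, mock_sv_setpv, MockSvUTF8_off, old_setpvn, new_setpvn and flag_after_store are all hypothetical names made up for this illustration. The mock assumes, as described in the message above, that the plain store call copies the bytes and leaves the UTF8 flag as it was. It also shows why the patched macro is wrapped in do { } while(0): once it holds two statements, that wrapper keeps it usable as a single statement, for example inside if/else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Perl scalar: a byte buffer plus a UTF8 flag. */
typedef struct {
    char   buf[64];
    size_t len;
    int    utf8;   /* mock of the SvUTF8 flag */
} MockSV;

/* Mock of sv_setpvn(): copies n bytes and does NOT touch the flag
 * (assumption taken from the problem described in the message). */
static void mock_sv_setpvn(MockSV *sv, const char *d, size_t n)
{
    assert(n < sizeof sv->buf);
    memcpy(sv->buf, d, n);
    sv->buf[n] = '\0';
    sv->len = n;
}

/* Mock of sv_setpv(): same as above for a NUL-terminated string. */
static void mock_sv_setpv(MockSV *sv, const char *d)
{
    mock_sv_setpvn(sv, d, strlen(d));
}

/* Mock of SvUTF8_off(): clears the flag. */
#define MockSvUTF8_off(sv) ((sv)->utf8 = 0)

/* Shape of the macro before the patch: a single expression, flag untouched. */
#define old_setpvn(sv, d, s) \
    ((s) ? mock_sv_setpvn(sv, d, s) : mock_sv_setpv(sv, ""))

/* Shape of the macro after the patch: store, then clear the flag.
 * do { } while (0) makes the two statements behave as one. */
#define new_setpvn(sv, d, s) do {                               \
        (s) ? mock_sv_setpvn(sv, d, s) : mock_sv_setpv(sv, ""); \
        MockSvUTF8_off(sv);                                     \
    } while (0)

/* Stores two raw octets into a scalar whose flag was left on by an
 * earlier use, and returns the flag afterwards. */
static int flag_after_store(int use_patched)
{
    MockSV sv = { "", 0, 1 };            /* stale flag: on */
    if (use_patched)
        new_setpvn(&sv, "\xc3\xa4", 2);  /* works as one statement */
    else
        old_setpvn(&sv, "\xc3\xa4", 2);
    return sv.utf8;
}
```

With the old macro shape flag_after_store(0) leaves the stale flag set, so the octets would be taken for a character string. With the patched shape flag_after_store(1) returns 0, and the fetch filter then sees plain octets to decode.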