Apache 2 + perl UTF-8 problem
on 22.06.2008 21:24:47 by aw
Hi.
I apologise if this is not really a mod_perl problem, but this list
might be my best chance to find the competences required for some tips.
Platform : SunOS 5.8 (Solaris 8)
Apache : Apache/2.0.52
Perl : v5.8.5 built for sun4-solaris
CGI.pm : 3.37
I have a perl cgi-bin script which handles a POST from a form, using
CGI.pm to retrieve the POSTed values via param().
In this form, a Java applet picks up the values of some input fields
from the
Re: Apache 2 + perl UTF-8 problem
on 23.06.2008 14:33:24 by torsten.foertsch
On Sun 22 Jun 2008, André Warnier wrote:
> Now, the first thing I would like to understand is why this is so.
> Since this is a POST, and since the browser knows that "everything" is
> UTF-8, I would expect it to send the proper multipart POST, with each
> item marked as UTF-8. So why does my cgi-bin script not see it as such ?
Yes that is the current state. Neither CGI nor libapreq2 does that conversion
for you, afaik. You have to do it yourself.
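Doing the conversion yourself might look roughly like the following. This is a minimal sketch, assuming the form values arrive as raw UTF-8 octets; the helper name `decode_value` is made up for illustration and is not part of CGI.pm or libapreq2.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn the raw octets of one form value into a Perl
# character string, assuming the browser sent UTF-8.
sub decode_value {
    my ($octets) = @_;
    return $octets unless defined $octets;

    # decode() may modify its argument when a CHECK value is given,
    # so work on a copy.
    my $copy = $octets;

    # FB_CROAK makes decode() die on malformed input; in that case
    # fall back to the untouched octets.
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return defined $chars ? $chars : $octets;
}

my $raw  = "Andr\xC3\xA9";          # "André" as UTF-8 octets
my $name = decode_value($raw);
print length($raw), " octets, ", length($name), " characters\n";
```

This prints `6 octets, 5 characters`: the two octets of the "é" become a single character once decoded.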
> (I am trying to do that to see the content of the real POST, before
> CGI.pm grabs it.)
You could write a small mod_perl input filter. That's not complicated in your
case. With luck one of the examples on perl.apache.org does fit your needs.
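For a plain cgi-bin script without mod_perl, a cruder way to look at the real POST is to read the body from STDIN before CGI.pm does. The following is only a debugging sketch under that assumption, not the input filter mentioned above; `read_raw_body` and `printable` are hypothetical helper names, and an in-memory filehandle stands in for STDIN so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: read exactly $length bytes of request body from $fh.
sub read_raw_body {
    my ($fh, $length) = @_;
    binmode $fh;                       # no I/O layer may touch the octets
    my $body = '';
    while (length($body) < $length) {
        my $got = read($fh, $body, $length - length($body), length($body));
        last unless $got;              # EOF or error: stop with what we have
    }
    return $body;
}

# Hypothetical helper: show every octet outside printable ASCII as \xNN.
sub printable {
    my ($octets) = @_;
    (my $shown = $octets) =~ s/([^\x20-\x7E\n])/sprintf('\\x%02X', ord $1)/ge;
    return $shown;
}

# In a cgi-bin script this would be something like:
#   my $body = read_raw_body(\*STDIN, $ENV{CONTENT_LENGTH} || 0);
#   print STDERR printable($body), "\n";   # ends up in the Apache error log
my $sample = "name=Andr\xC3\xA9";
open my $fh, '<', \$sample or die "cannot open in-memory handle: $!";
print printable(read_raw_body($fh, length $sample)), "\n";
```

Reading STDIN this way consumes the body, so CGI.pm will find nothing left to parse afterwards. It is only useful for a one-off look at what the browser really sent.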
Torsten
--
Need professional mod_perl support?
Just hire me: torsten.foertsch@gmx.net
Re: Apache 2 + perl UTF-8 problem
on 23.06.2008 16:49:57 by Rhesa Rozendaal
André Warnier wrote:
> Hi.
>
> I apologise if this is not really a mod_perl problem, but this list
> might be my best chance to find the competences required for some tips.
>
> Platform : SunOS 5.8 (Solaris 8)
> Apache : Apache/2.0.52
> Perl : v5.8.5 built for sun4-solaris
> CGI.pm : 3.37
That version of CGI.pm has support for what you need:
    use CGI qw( -utf8 );
Although the documentation warns it will interfere with file uploads.
As an alternative, below is a customization I've been using that tries to keep
file uploads intact. It's been running live for almost 3 years now. The code
looks pretty similar to what's in CGI 3.37, so maybe that warning is just FUD.
I suggest you test either solution before believing me ;-)
Usage: Add it to your startup.pl, or add a "use CGI::as_utf;". It assumes you
always use the object interface.
Rhesa
package CGI::as_utf;

BEGIN
{
    use strict;
    use warnings;
    use CGI;
    use Encode;

    {
        no warnings 'redefine';
        my $param_org = \&CGI::param;    # keep a reference to the original param()

        # Decode a value as UTF-8 unless it is empty or an upload filehandle.
        my $might_decode = sub {
            my $p = shift;
            return ( !$p || ( ref $p && fileno($p) ) )
                ? $p
                : eval { decode_utf8($p) } || $p;
        };

        *CGI::param = sub {
            my $q = $_[0];    # assume object calls always
            my $p = $_[1];

            # Anything other than $q->param($name) goes to the original untouched.
            goto &$param_org if scalar @_ != 2;

            return wantarray
                ? map { $might_decode->($_) } $q->$param_org($p)
                : $might_decode->( $q->$param_org($p) );
        };
    }
}

1;
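The wrapping technique used above, shown in isolation so that it runs without CGI.pm installed. This is only an illustrative sketch: `Toy::Query` is a hypothetical stand-in class whose `param()` returns raw octets, and the wrapper decodes every value unconditionally (no special case for upload filehandles).

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical stand-in for a CGI-like class; param() returns raw octets.
package Toy::Query;

sub new {
    my ($class, %args) = @_;
    my $self = {%args};
    return bless $self, $class;
}

sub param {
    my ($self, $name) = @_;
    return @{ $self->{$name} || [] };
}

package main;

{
    no warnings 'redefine';
    my $orig = \&Toy::Query::param;    # keep the original implementation

    # Replace param() with a wrapper that decodes whatever the original returns.
    *Toy::Query::param = sub {
        my ($self, @args) = @_;
        my @values = map { my $octets = $_; decode_utf8($octets) }
                     $self->$orig(@args);
        return wantarray ? @values : $values[0];
    };
}

my $q = Toy::Query->new(city => ["Z\xC3\xBCrich", "K\xC3\xB6ln"]);
my @cities = $q->param('city');
print length($cities[0]), "\n";    # counts characters, not octets
```

Callers keep calling `param()` exactly as before; only the returned values change from octets to character strings.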
Re: Apache 2 + perl UTF-8 problem
on 23.06.2008 17:22:03 by aw
Torsten Foertsch wrote:
> On Sun 22 Jun 2008, André Warnier wrote:
>> Now, the first thing I would like to understand is why this is so.
>> Since this is a POST, and since the browser knows that "everything" is
>> UTF-8, I would expect it to send the proper multipart POST, with each
>> item marked as UTF-8. So why does my cgi-bin script not see it as such ?
>
> Yes that is the current state. Neither CGI nor libapreq2 does that conversion
> for you, afaik. You have to do it yourself.
>
Thanks.
For the moment, I am dealing with CGI.pm, without mod_perl or libapreq2.
I'll deal with those afterward.
I see a problem though : as far as I can tell, CGI.pm does not offer any
way to find out the "charset" header with which each POST parameter was
sent. Or am I missing something ?
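For reference, when a client does declare a charset for a single multipart field, it appears in that part's own Content-Type header. The following is a minimal sketch of extracting it from one part's raw header block, assuming you already have the raw body in hand; `part_charset` is a hypothetical helper, not a CGI.pm function, and it ignores folded (continued) header lines. Many browsers send no Content-Type at all for plain text fields, in which case there is nothing to find.

```perl
use strict;
use warnings;

# Hypothetical helper: return the charset named in the Content-Type line of
# one multipart part's header block, or nothing if the part declares none.
sub part_charset {
    my ($header_block) = @_;
    for my $line (split /\r?\n/, $header_block) {
        next unless $line =~ /^Content-Type\s*:/i;
        return lc $1 if $line =~ /;\s*charset\s*=\s*"?([^";\s]+)"?/i;
    }
    return;
}

my $with    = qq{Content-Disposition: form-data; name="title"\r\n}
            . qq{Content-Type: text/plain; charset=UTF-8\r\n};
my $without = qq{Content-Disposition: form-data; name="title"\r\n};

print part_charset($with), "\n";
print defined( part_charset($without) ) ? "declared" : "not declared", "\n";
```

With the two sample header blocks this prints `utf-8` and then `not declared`.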
André