Apache2::Filter Intermittently Missing Injected String

on 30.03.2011 12:17:25 by Chris Datfung

I have a script that uses Apache2::Filter to filter the server response
output and inject a string into the HTML body. The script normally works
fine, except that intermittently the output is missing the injected string.
This happens around 10% of the time. I verified that there is enough memory
and CPU available and tried adjusting the buffer size, but to no avail.

The server is running:

Apache 2.2.17-2
Modperl 2.0.4-7

Any explanation for why the script fails 10% of the time?

Thanks
Chris

Re: Apache2::Filter Intermittently Missing Injected String

on 30.03.2011 12:36:09 by Hendrik Schumacher

I had a similar problem with an HTTP proxy that injected a string into the
HTML body. If the response is passed to the filter in multiple parts, there
is a certain probability that the response is split in the middle of the
string you are looking for (for example, part 2 ends with "</bo" and part 3
starts with "dy>"). I had to buffer the last bytes of each response part and
take them into account when looking for the search string in the next part.
I don't know, though, whether this is possible in Apache2::Filter or whether
this is your problem at all.
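
The split described above can be reproduced outside Apache. The following is
a minimal plain-Perl sketch, not filter code: inject_naive is a hypothetical
helper that applies the substitution to each chunk on its own, the way a
filter does when it ignores chunk boundaries, and the chunk contents are made
up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: substitute in each chunk independently, ignoring
# the possibility that "</body>" straddles two chunks.
sub inject_naive {
    my ($injection, @chunks) = @_;
    my $out = '';
    for my $chunk (@chunks) {
        (my $copy = $chunk) =~ s{</body>}{$injection</body>};
        $out .= $copy;
    }
    return $out;
}

my $injection = '<!-- injected -->';

# The tag arrives whole: the substitution matches.
my $whole = inject_naive($injection, '<html><body>hi', '</body></html>');

# The tag is split as "</bo" + "dy>": neither chunk matches on its own,
# so the injection is silently skipped.
my $split = inject_naive($injection, '<html><body>hi</bo', 'dy></html>');

print "whole: $whole\n";
print "split: $split\n";
```

Whether a given response hits this case depends only on where the chunk
boundary happens to fall, which would fit a failure that shows up in a
fraction of requests.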

Hendrik

Re: Apache2::Filter Intermittently Missing Injected String

on 31.03.2011 06:30:35 by Chris Datfung

On Wed, Mar 30, 2011 at 12:36 PM, Hendrik Schumacher wrote:

> I had a similar problem with an HTTP proxy that injected a string into the
> HTML body. If the response is passed to the filter in multiple parts, there
> is a certain probability that the response is split in the middle of the
> string you are looking for (for example, part 2 ends with "</bo" and part 3
> starts with "dy>"). I had to buffer the last bytes of each response part
> and take them into account


Hi Hendrik,

That is exactly the problem. How did you buffer the last bytes of each
response? Don't you just set BUFF_LEN, and that's the number of characters
you get?

Chris

Re: Apache2::Filter Intermittently Missing Injected String

on 31.03.2011 10:07:43 by Hendrik Schumacher

You have to handle the "last bytes" buffer yourself. If you use the
$f->read approach of Apache2::Filter, you could use the following (untested
and probably not very efficient):

    # $wanted (read size) and $injection (string to insert) are set elsewhere.
    my $lastbytes = undef;
    my $done = undef;
    while ($filter->read(my $buffer, $wanted)) {
        if (defined $lastbytes) {
            # Prepend the bytes held back from the previous read.
            $buffer = $lastbytes . $buffer;
            $lastbytes = undef;
        }
        if (not $done) {
            if ($buffer =~ s/<\/body>/$injection<\/body>/) {
                $done = 1;
            }
            else {
                # Hold back the last 6 bytes (length of search string - 1);
                # they may be the start of a split "</body>".
                $lastbytes = substr($buffer, -6);
                $buffer = substr($buffer, 0, -6);
            }
        }
        $filter->print($buffer);
    }
    if ($filter->seen_eos && defined $lastbytes) {
        $filter->print($lastbytes);
    }

If you are using the callback approach, you would have to store $lastbytes
somewhere (e.g. in $filter->ctx) and make sure to flush $lastbytes on eos.
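
A minimal sketch of the $filter->ctx variant, assuming the handler is invoked
once per incoming chunk of the response, so that lexical variables do not
survive between calls. MockFilter, its feed and output methods, and the extra
$injection argument are hypothetical stand-ins so the example runs without
Apache; only read, print, ctx and seen_eos are meant to mirror the
Apache2::Filter methods of the same names.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for Apache2::Filter, for demonstration only.
    package MockFilter;
    sub new {
        my $class = shift;
        return bless({ in => [], out => '', ctx => undef, eos => 0 }, $class);
    }
    sub feed { my ($self, $eos, @chunks) = @_; $self->{in} = [@chunks]; $self->{eos} = $eos; return }
    sub read {
        my $self = shift;
        return 0 unless @{ $self->{in} };
        $_[0] = shift @{ $self->{in} };     # fill the caller's buffer
        return length $_[0];
    }
    sub print    { my ($self, @data) = @_; $self->{out} .= join('', @data); return 1 }
    sub ctx      { my $self = shift; $self->{ctx} = shift if @_; return $self->{ctx} }
    sub seen_eos { my $self = shift; return $self->{eos} && !@{ $self->{in} } }
    sub output   { my $self = shift; return $self->{out} }
}

my $TAG      = '</body>';
my $BUFF_LEN = 1024;

sub handler {
    my ($f, $injection) = @_;   # a real handler receives only $f
    # Restore state from the previous invocation, if there was one.
    my $ctx  = $f->ctx || { carry => '', done => 0 };
    my $keep = length($TAG) - 1;

    while ($f->read(my $buffer, $BUFF_LEN)) {
        $buffer = $ctx->{carry} . $buffer;
        $ctx->{carry} = '';
        if (!$ctx->{done}) {
            if ($buffer =~ s/\Q$TAG\E/$injection$TAG/) {
                $ctx->{done} = 1;
            }
            elsif (length($buffer) > $keep) {
                # Cut the last $keep bytes off $buffer and hold them back.
                $ctx->{carry} = substr($buffer, -$keep, $keep, '');
            }
            else {
                $ctx->{carry} = $buffer;
                $buffer = '';
            }
        }
        $f->print($buffer) if length $buffer;
    }

    if ($f->seen_eos) {
        $f->print($ctx->{carry}) if length $ctx->{carry};   # flush on eos
    }
    else {
        $f->ctx($ctx);                                      # keep state for the next call
    }
    return 0;   # a real handler would return Apache2::Const::OK
}

# Two invocations; the tag is split across them.
my $f = MockFilter->new;
$f->feed(0, '<html><body>hi</bo');
handler($f, '<!-- injected -->');
$f->feed(1, 'dy></html>');
handler($f, '<!-- injected -->');
my $output = $f->output;
print "$output\n";
```

The point of the design is that both the held-back bytes and the "already
replaced" flag live in the context, so the second invocation can complete a
tag that the first one only saw the beginning of.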

Hendrik

Re: Apache2::Filter Intermittently Missing Injected String

on 31.03.2011 12:08:03 by Chris Datfung

Hi Hendrik,

That seems like a good workaround, assuming the string gets cut off at the
same place each time. Thanks for that. In my case, I'm not certain that it
does. I thought the BUFF_LEN constant defined how many bytes are read. My
string is always within the first 5000 bytes, but setting BUFF_LEN to 8000
did not help, as the buffer still sometimes gets cut after ~2500 bytes or
so. Do you know of any way to force the bucket to be a certain length?

Thanks
Chris

Re: Apache2::Filter Intermittently Missing Injected String

on 31.03.2011 12:18:35 by Hendrik Schumacher

Hi Chris,

my example implementation doesn't assume that the string is cut off at a
certain place. If your search string has a length of 7 bytes, the "worst
case" is that one buffer contains the first 6 bytes and the next buffer the
last one. If the string is cut at another place, you just carry over a few
bytes more than necessary (but that doesn't hurt as long as you make sure
that the replacement takes place only once).

I don't think that you can force the bucket to be a certain length. I don't
know exactly how it is handled, but I would assume that you get passed any
content that is flushed by the previous handler/filter. The only thing you
could possibly control is the threshold at which Apache automatically
flushes the output buffer.
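
The worst-case argument can be checked mechanically. This is a plain-Perl
sketch under stated assumptions: inject_with_carry is a hypothetical helper
that mirrors the carry-over idea on a list of strings, and the sample page is
invented. It splits the page in two at every possible position and counts how
often the result differs from the expected output.

```perl
use strict;
use warnings;

my $TAG = '</body>';

# Hypothetical helper: run the carry-over substitution over a list of chunks.
sub inject_with_carry {
    my ($injection, @chunks) = @_;
    my $keep  = length($TAG) - 1;   # 6 bytes for "</body>"
    my $carry = '';
    my $done  = 0;
    my $out   = '';
    for my $chunk (@chunks) {
        my $buffer = $carry . $chunk;
        $carry = '';
        if (!$done) {
            if ($buffer =~ s/\Q$TAG\E/$injection$TAG/) {
                $done = 1;          # replace only once
            }
            else {
                # Hold back up to $keep trailing bytes; they may start the tag.
                my $cut = length($buffer) > $keep ? length($buffer) - $keep : 0;
                $carry  = substr($buffer, $cut);
                $buffer = substr($buffer, 0, $cut);
            }
        }
        $out .= $buffer;
    }
    return $out . $carry;           # flush what is still held back at the end
}

my $page      = '<html><body>hello</body></html>';
my $injection = '<!-- injected -->';
my $expected  = '<html><body>hello<!-- injected --></body></html>';

# Split the page into two chunks at every possible position.
my $failures = 0;
for my $pos (1 .. length($page) - 1) {
    my $got = inject_with_carry($injection,
                                substr($page, 0, $pos), substr($page, $pos));
    $failures++ if $got ne $expected;
}
print "failures: $failures\n";
```

Holding back one byte fewer than the length of the search string is enough
because a chunk that did not match can contain at most that many leading
bytes of the tag at its end.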

Hendrik

Re: Apache2::Filter Intermittently Missing Injected String

on 31.03.2011 13:13:48 by torsten.foertsch

On Thursday, March 31, 2011 12:08:03 Chris Datfung wrote:
> My string is always within the first 5000 bytes, but setting BUFF_LEN to
> 8000 did not help as the buffer still sometimes gets cut after ~2500 bytes
> or so. Do you know of any way to force the bucket to be a certain length?

To my knowledge there is no such device.

But you can accumulate the content of the buckets in $f->ctx until there is
enough of it. If the current brigade does not have enough data, simply do not
pass it on to the next filter.

You have to watch out for flush and eos buckets, though.

If the string you are looking for is always within the first 5000 bytes, that
should not cause problems. Avoid accumulating the whole response, however.

Something like:

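    # Modules the snippet relies on (assumed; not listed in the original post,
    # and possibly not complete):
    use Apache2::Filter ();
    use APR::Brigade ();
    use APR::Bucket ();
    use Apache2::Const -compile => 'OK';
    use APR::Const -compile => 'SUCCESS';

    sub filter {
        my ($f, $bb) = @_;
        # Keep our own brigade in the filter context across invocations.
        my $mybb = $f->ctx;
        $f->ctx($mybb = APR::Brigade->new($f->r->pool, $f->c->bucket_alloc))
            unless $mybb;
        $mybb->concat($bb);   # move the incoming buckets into our brigade
        if ($mybb->length >= 5000) {
            $mybb->flatten(my $buf);
            $buf =~ s/.../.../;
            $mybb->cleanup;
            $mybb->insert_tail(APR::Bucket->new($mybb->bucket_alloc, $buf));
            my $rc = $f->next->pass_brigade($mybb);
            $mybb->destroy;
            $f->remove;       # done; take this filter out of the chain
            $rc == APR::Const::SUCCESS or return $rc;
        }
        return Apache2::Const::OK;
    }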

You still have to add code to check for flush and eos buckets.
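
The control flow, including the end-of-stream case, can be modelled in plain
Perl. This is only a sketch of the idea and not APR code: make_filter, its
parameters and the string chunks are hypothetical, and a real filter would
keep the state in $f->ctx and inspect the buckets of each brigade rather than
receive an $is_eos flag. Flush handling is left out.

```perl
use strict;
use warnings;

# Hypothetical plain-Perl model: hold data until $threshold bytes have
# arrived (or the stream ends), rewrite once, then pass everything through.
sub make_filter {
    my ($threshold, $search, $replace) = @_;
    my $held     = '';
    my $released = 0;
    return sub {
        my ($chunk, $is_eos) = @_;
        return $chunk if $released;            # pass-through after the rewrite
        $held .= $chunk;
        # Not enough data yet and more to come: pass nothing on.
        return '' if length($held) < $threshold && !$is_eos;
        $held =~ s/\Q$search\E/$replace/;      # rewrite once on the held data
        $released = 1;
        my $out = $held;
        $held = '';
        return $out;
    };
}

my $filter = make_filter(25, '</body>', '<!-- injected --></body>');
my @chunks = ('<html><body>te', 'xt</bo', 'dy></h', 'tml>');
my $result = '';
for my $i (0 .. $#chunks) {
    $result .= $filter->($chunks[$i], $i == $#chunks);
}
print "$result\n";
```

The model relies on the same assumption as the post: the search string must
lie entirely within the first $threshold bytes. Releasing on end of stream
covers responses that are shorter than the threshold, which would otherwise
never be sent.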

Torsten Förtsch

Re: Apache2::Filter Intermittently Missing Injected String

on 01.04.2011 03:45:25 by Adam Prime

I wrote a module based on a talk Geoff Young gave a bazillion years ago
to abstract this problem away (sort of). You can check it out here:

http://search.cpan.org/~aprime/Apache2-Filter-TagAware-0.02/lib/Apache2/Filter/TagAware.pm

Adam
