Statistics in mech?

on 14.01.2005 11:41:56 by peter.stevens

Hi everyone,

Happy New Year!

I am using mech to scrape data from various websites. I wanted to
collect data about the bytes sent and received by my scraper (I need
this for sizing purposes). I looked through Mech and LWP, but did not
see any methods which give me that information. Is there a way to do this?

Thanks in advance,

Peter

--
----------------------------------------------------------------------
Peter Stevens Phone: +41 43 535 8517
www.MinuteWatcher.com Fax: +41 44 544 8392

Re: Statistics in mech?

on 14.01.2005 14:12:48 by gisle

Peter Stevens writes:

> I am using mech to scrape data from various websites. I wanted to
> collect data about the bytes sent and received by my scraper (I need
> this for sizing purposes). I looked through Mech and LWP, but did not
> see any methods which give me that information. Is there a way to do
> this?

Not directly, but you can replace the protocol handler with your own
that counts the bytes passing through. This example counts the bytes
sent and received over http:

#!/usr/bin/perl -w

use strict;
use LWP::UserAgent;
use LWP::Protocol;

# Route all 'http' requests through the counting handler defined below
LWP::Protocol::implementor('http', 'MyHTTP');

# File-scoped counters, updated by the socket methods further down
my $bytes_in  = 0;
my $bytes_out = 0;

my $ua = LWP::UserAgent->new(keep_alive => 1);

for (1..3) {
    my $res = $ua->get("http://www.example.com");
    print "$_: ", $res->status_line, "\n";
}

print "received $bytes_in bytes, sent $bytes_out bytes\n";


# Overridden protocol handler that counts the bytes transferred
package MyHTTP;
use base 'LWP::Protocol::http';

# Socket class for MyHTTP; wraps reads and writes to count bytes
package MyHTTP::Socket;
use base 'LWP::Protocol::http::Socket';

# Add the number of bytes actually read to the incoming total
sub sysread {
    my $self = shift;
    my $n = $self->SUPER::sysread(@_);
    $bytes_in += $n if defined($n) && $n > 0;
    return $n;
}

# Add the number of bytes actually written to the outgoing total
sub syswrite {
    my $self = shift;
    my $n = $self->SUPER::syswrite(@_);
    $bytes_out += $n if defined($n) && $n > 0;
    return $n;
}

__END__

Regards,
Gisle
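
Since the original question was about mech: WWW::Mechanize is a subclass of LWP::UserAgent, so the same handler override should carry over unchanged. The following is only a minimal sketch under some assumptions: WWW::Mechanize is installed, the counters are moved into package variables of MyHTTP::Socket (a choice made here for illustration, not part of the example above), and autocheck is switched off so a failed request does not abort the script. Only the 'http' scheme is overridden, so https traffic is not counted by this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same counting handler as above. MyHTTP must come first so that
# LWP::Protocol::http (and its ::Socket package) is loaded.
package MyHTTP;
use base 'LWP::Protocol::http';

package MyHTTP::Socket;
use base 'LWP::Protocol::http::Socket';

# Assumption for this sketch: totals kept in package variables
our ($bytes_in, $bytes_out) = (0, 0);

sub sysread {
    my $self = shift;
    my $n = $self->SUPER::sysread(@_);
    $bytes_in += $n if defined($n) && $n > 0;
    return $n;
}

sub syswrite {
    my $self = shift;
    my $n = $self->SUPER::syswrite(@_);
    $bytes_out += $n if defined($n) && $n > 0;
    return $n;
}

package main;
use LWP::Protocol;
use WWW::Mechanize;

# Register the counting handler before any request is made
LWP::Protocol::implementor('http', 'MyHTTP');

# autocheck => 0: inspect the response instead of dying on errors
my $mech = WWW::Mechanize->new(keep_alive => 1, autocheck => 0);

my $res = $mech->get("http://www.example.com");
print $res->status_line, "\n";

print "received $MyHTTP::Socket::bytes_in bytes, ",
      "sent $MyHTTP::Socket::bytes_out bytes\n";
```

The totals are cumulative for the whole process, so for per-site sizing one could record the counters before and after each batch of requests and take the difference.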

Re: Statistics in mech?

on 14.01.2005 16:02:30 by peter.stevens

Perfect! Thank you very much!

Peter

Gisle Aas wrote:

>Not directly, but you can replace the protocol handler with your own
>that counts bytes passed by. [...]