Unable to connect a website using LWP

on 23.02.2006 10:31:48 by prasanna_ssb

Hi,

I am trying to connect to a website,
ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/, to download files
from there. I am using LWP for this and I am getting the following
error:

Cant get ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/:
HTTP::Response=HASH(0x98ceb34)->status_line at try.pl line 18.

Where can I find out what this error code stands for?

Here is the portion of the code referring to this problem

#! /usr/bin/perl

use warnings;
use strict;
use lib '/home/ms2/PERL/LWP/libwww-perl-5.800/lib/';
use LWP;

#Create a UserAgent object

use LWP::UserAgent;
my $ua = LWP::UserAgent->new;

#need to make a get request

my $url = 'ftp://ftp.rcsb.prg/pub/pdb/data/structures/all/pdb/';
my $response = $ua->get($url);

if ($response->is_success == 0 ) { die "Cant get $url:\n
$response->status_line"; }

else { # do something }
......other sections of code not added here ......

One more very basic question: what is the difference between CGI and
LWP, and when do we use each? That is, is LWP for client-side
scripting and CGI for server-side scripting?

Any help appreciated
Thanks

Re: Unable to connect a website using LWP

on 23.02.2006 14:42:01 by Paul Lalli

prasanna_ssb@rediffmail.com wrote:
> Hi,
>
> Iam trying to connect to a website
> ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/

That's not a website. The world wide web uses the HTTP protocol. This
URL uses the FTP protocol.

Does LWP handle FTP addresses? I have no idea. But it seems like it
shouldn't have any reason to.

Perhaps you should be using the Net::FTP module instead.

> to download files
> from there. I am using LWP for the same and I am getting the following
> Error
>
> Cant get ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/:
> HTTP::Response=HASH(0x98ceb34)->status_line at try.pl line 18.
>
> Where can I know what this error code stands for?

You didn't print out the error code. That's at least half your
problem.

>
> Here is the portion of the code referring to this problem
>
> #! /usr/bin/perl
>
> use warnings;
> use strict;
> use lib '/home/ms2/PERL/LWP/libwww-perl-5.800/lib/';
> use LWP;
>
> #Create a UserAgent object
>
> use LWP::UserAgent;
> my $ua = LWP::UserAgent->new;
>
> #need to make a get request
>
> my $url = 'ftp://ftp.rcsb.prg/pub/pdb/data/structures/all/pdb/';
> my $response = $ua->get($url);
>
> if ($response->is_success == 0 ) { die "Cant get $url:\n
> $response->status_line"; }

status_line is a method of the HTTP::Response class. You cannot
interpolate method calls in a double-quoted string, any more than you
can interpolate function calls.

die "Can't get $url:\n", $response->status_line;

Make that change and see what the error status actually is.
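
To see the effect without touching the network, here is a minimal sketch. Demo::Response is a made-up stand-in for HTTP::Response (an assumption for illustration, not LWP code), and the status text is invented; only the way the string is built matters.

```perl
use strict;
use warnings;

# Hypothetical stand-in for HTTP::Response, for illustration only.
package Demo::Response;
sub new         { my ($class) = @_; return bless { line => '500 Bad hostname' }, $class }
sub status_line { my ($self) = @_; return $self->{line} }

package main;

my $response = Demo::Response->new;

# Inside the quotes only $response is interpolated (as a stringified
# reference); "->status_line" is left as literal text.
my $wrong = "Cant get it: $response->status_line";

# Outside the quotes the method is really called.
my $right = "Can't get it: " . $response->status_line;

print "$wrong\n$right\n";
```

The first line printed has the same shape as the message in the original post, a HASH(0x...) reference followed by the literal text "->status_line".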

> One more very basic question - what is the difference between CGI and
> LWP and when do we use each-- that is- is LWP for client side
> scripting and CGI for server side scripting?

Kinda-sorta-almost.

CGI programs reside on a webserver. A web browser contacts the
webserver and asks it to run that program and return that program's
output. The web browser then displays the output.

LWP is a series of Perl modules for writing a web browser. You write
programs to contact webservers and display whatever content those
webservers returned to you.
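
As a rough sketch of the server side of that distinction: a CGI program is an ordinary script whose standard output becomes the HTTP response. The sub name and the message below are invented for illustration; a real script would more likely use the CGI module.

```perl
use strict;
use warnings;

# Build the complete CGI response: a header line, a blank line, then the body.
sub cgi_response {
    my ($query) = @_;
    $query = '' unless defined $query;
    return "Content-Type: text/plain\r\n\r\n"
         . "You asked for: $query\n";
}

# The webserver passes the part of the URL after '?' in QUERY_STRING
# and sends whatever the script prints back to the browser.
print cgi_response($ENV{QUERY_STRING});
```

An LWP program sits at the other end of the same conversation: it sends the request and receives this output.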

Paul Lalli

Re: Unable to connect a website using LWP

on 23.02.2006 16:37:42 by unknown

Paul Lalli wrote:
> prasanna_ssb@rediffmail.com wrote:
>
>>Hi,
>>
>>Iam trying to connect to a website
>>
>
>
> That's not a website. The world wide web uses the HTTP protocol. This
> URL uses the FTP protocol.
>
> Does LWP handle FTP addresses? I have no idea. But it seems like it
> shouldn't have any reason to.

Actually, LWP handles FTP as well as HTTP, as

GET ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/

will demonstrate.

>
> Perhaps you should be using the Net::FTP module instead.
>

Net::FTP will work also. There's More Than One Way To Do It.
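
For comparison, a minimal Net::FTP sketch that lists a directory over anonymous FTP. The sub name, the timeout and the placeholder password are assumptions for illustration; the host and directory come from the command line, and the server from this thread may not accept such connections.

```perl
use strict;
use warnings;
use Net::FTP;

# List a remote directory over anonymous FTP and return the entry names.
sub list_ftp_dir {
    my ($host, $dir) = @_;

    my $ftp = Net::FTP->new($host, Timeout => 30)
        or die "Can't connect to $host: $@\n";
    $ftp->login('anonymous', 'anonymous@example.com')
        or die "Can't log in: ", $ftp->message;
    $ftp->cwd($dir)
        or die "Can't cd to $dir: ", $ftp->message;

    my @names = $ftp->ls;
    $ftp->quit;
    return @names;
}

# Only touch the network when a host and a directory are given, e.g.
#   perl list.pl ftp.rcsb.org /pub/pdb/data/structures/all/pdb
if (@ARGV == 2) {
    print "$_\n" for list_ftp_dir(@ARGV);
}
```

Note that every failure branch reports the server's own message, which is the same habit the LWP version needs.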

>
>>to download files
>>from there. I am using LWP for the same and I am getting the following
>>Error
>>
>>Cant get ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/:
>> HTTP::Response=HASH(0x98ceb34)->status_line at try.pl line 18.
>>
>>Where can I know what this error code stands for?
>
>
> You didn't print out the error code. That's at least half your
> problem.
>
>
>>Here is the portion of the code referring to this problem
>>
>>#! /usr/bin/perl
>>
>>use warnings;
>>use strict;
>>use lib '/home/ms2/PERL/LWP/libwww-perl-5.800/lib/';
>>use LWP;
>>
>>#Create a UserAgent object
>>
>>use LWP::UserAgent;
>>my $ua = LWP::UserAgent->new;
>>
>>#need to make a get request
>>
>>my $url = 'ftp://ftp.rcsb.prg/pub/pdb/data/structures/all/pdb/';

Earlier in your post you said you were going for 'ftp.rcsb.org'. If you
get your error reporting sorted out, you will find that domain
'ftp.rcsb.prg' does not exist.
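
A minimal sketch of the script with the error reporting sorted out, assuming LWP is installed. The sub name and the timeout are illustrative choices, and the URL is taken from the command line so nothing is fetched by accident.

```perl
use strict;
use warnings;

# Fetch a URL and return its content, dying with the real status line on failure.
sub fetch_url {
    my ($url) = @_;
    require LWP::UserAgent;    # loaded at run time, only when a fetch is attempted
    my $ua       = LWP::UserAgent->new(timeout => 30);
    my $response = $ua->get($url);
    $response->is_success
        or die "Can't get $url:\n", $response->status_line, "\n";
    return $response->content;
}

# Runs only when a URL is passed on the command line.
print fetch_url($ARGV[0]) if @ARGV;
```

Run against the mistyped hostname, this should die with a status line that names the connection problem instead of printing a HASH reference.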

>>my $response = $ua->get($url);
>>
>>if ($response->is_success == 0 ) { die "Cant get $url:\n
>>$response->status_line"; }
>
>
> status_line is a method of the HTTP::Response class. You can not
> interpolate method calls in a double quoted string, any more than you
> can interpolate function calls.
>
> die "Can't get $url:\n", $response->status_line;
>
> Make that change and see what the error status actually is.
>

I would like to recommend, in lieu of your two lines, something like

$response->is_success or
die "Can't get $url:\n", $response->status_line, "\n";

The explicit '== 0' probably does what you think it does though, so
maybe it's just a matter of style. The previous poster is correct,
though, you can't interpolate a subroutine or method call into a string.
At least, not directly. You _can_ do something like

$response->is_success or die <<eod;
Can't get $url:
@{[$response->status_line]}
eod

if you want, using the @{[ ]} idiom to interpolate a list which you
build on-the-fly.
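
A self-contained sketch of the idiom, with a plain sub standing in for $response->status_line (the sub and its return value are invented so the example needs no network):

```perl
use strict;
use warnings;

# Stand-in for $response->status_line.
sub status_line { return '500 Bad hostname' }

# A plain function call inside quotes is just text.
my $plain = "Status: status_line()";

# @{[ ... ]} builds an anonymous array from the call and dereferences it,
# and arrays do interpolate.
my $idiom = "Status: @{[ status_line() ]}";

# The same trick inside a here-document.
my $heredoc = <<"EOD";
Can't get it:
@{[ status_line() ]}
EOD

print "$plain\n$idiom\n$heredoc";
```

Whether this reads better than passing a list to die is a matter of taste; the list form is usually easier on the eyes.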

Tom Wyant

Re: Unable to connect a website using LWP

on 23.02.2006 17:20:10 by Paul Lalli

harryfmudd [AT] comcast [DOT] net wrote:
> Paul Lalli wrote:
> > prasanna_ssb@rediffmail.com wrote:

> > Does LWP handle FTP addresses? I have no idea. But it seems like it
> > shouldn't have any reason to.
>
> Actually, LWP handles FTP as well as HTTP, as
>
> GET ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/
>
> will demonstrate.

Hmm. Good to know. Thanks for the info.

> >>if ($response->is_success == 0 ) { die "Cant get $url:\n
> >>$response->status_line"; }
>
> I would like to recommend, in lieu of your two lines, something like
>
> $response->is_success or
> die "Can't get $url:\n", $response->status_line, "\n";
>
> The explicit '== 0' probably does what you think it does though, so
> maybe it's just a matter of style.

It does do what the OP wants, but perhaps not for the reasons the OP
might suspect... If you trace the is_success method definitions
through to the actual code (you wind up in the HTTP::Status module),
you will see that it comes out to be the logical and'ing of two
inequality operations (testing if the status code returned is >= 200
and < 300). This is where Perl magic comes into play. Normally, when
you compare an empty string to a number using the == operator, you'll
get a warning message. The results of boolean checks like inequality,
however, are not "normal":

$ perl -MData::Dumper -wle'
$x = q{};
print Dumper(\$x);
print "True" if $x == 0
'
$VAR1 = \'';
Argument "" isn't numeric in numeric eq (==) at -e line 1.
True

$ perl -MData::Dumper -wle'
$x = (5<3);
print Dumper(\$x);
print "True" if $x == 0
'
$VAR1 = \'';
True

Even though Data::Dumper seems to think the result of 5<3 is an empty
string, Perl treats it as a generic false value - empty string when
used as a string, but 0 when used as a number.
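
The same comparison as a script, as a sketch: the warnings are collected through a __WARN__ handler instead of being printed, so the difference can be counted. The variable names are illustrative.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be counted.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $empty = q{};       # an ordinary empty string
my $false = (5 < 3);   # Perl's built-in false value

my $empty_is_zero = ($empty == 0);   # true, but warns "isn't numeric"
my $false_is_zero = ($false == 0);   # true, and silent

printf "empty == 0: %s, false == 0: %s, warnings: %d\n",
    ($empty_is_zero ? 'yes' : 'no'),
    ($false_is_zero ? 'yes' : 'no'),
    scalar @warnings;
```

Both comparisons succeed; only the one on the ordinary empty string leaves a warning behind.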

Paul Lalli

Re: Unable to connect a website using LWP

on 23.02.2006 22:47:31 by Bill Segraves

wrote in message
news:1140687108.651391.276040@g44g2000cwa.googlegroups.com...
> Hi,
>
> Iam trying to connect to a website
> ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/ to download files
> from there. I am using LWP for the same and I am getting the following
> Error

Why are you using LWP to try to download the files?

If you point your browser to one level up in the directory hierarchy, e.g.,

ftp://ftp.rcsb.org/pub/pdb/data/structures/all/

you should be able to see the folders in said directory, among which is the
folder "pdb". While the permissions of the folders appear to allow you to
see the contents, the permissions of the files contained "in" the folders
may not be set to allow you to see the files.

Personally, I'd recommend that you enter through the "front door", e.g.,

http://www.rcsb.org

click on "Download Files" + "FTP Services", i.e.,

http://www.rcsb.org/pdb/static.do?p=download/ftp/index.html

and browse around a bit until you find the data you seek, *before* you
attempt to write a script to automate what you could simply do with your
browser.

Also, if you click on "FTP Services Help", you may find that the data you
seek is not where you thought it was, e.g.,

ftp://ftp.rcsb.org/pub/pdb/data/structures/divided/pdb/

where I can see all the folders and the files within them.

Good luck!
--
Bill Segraves

Re: Unable to connect a website using LWP

on 24.02.2006 04:20:00 by unknown

Paul Lalli wrote:
>
> It does do what the OP wants, but perhaps not for the reasons the OP
> might suspect... If you trace the is_success method definitions
> through to the actual code (you wind up in the HTTP::Status module),
> you will see that it comes out to be the logical and'ing of two
> inequality operations (testing if the status code returned is >= 200
> and < 300). This is where perl magic comes into place. Normally, when
> you compare an empty string to a number using the == operator, you'll
> get a warning message. The result of boolean checks like inequality,
> however, are not "normal":
>
> $ perl -MData::Dumper -wle'
> $x = q{};
> print Dumper(\$x);
> print "True" if $x == 0
> '
> $VAR1 = \'';
> Argument "" isn't numeric in numeric eq (==) at -e line 1.
> True
>
> $ perl -MData::Dumper -wle'
> $x = (5<3);
> print Dumper(\$x);
> print "True" if $x == 0
> '
> $VAR1 = \'';
> True
>
> Even though Data::Dumper seems to think the result of 5<3 is an empty
> string, Perl treats it as a generic false value - empty string when
> used as a string, but 0 when used as a number.
>
> Paul Lalli
>

That's the thing about Perl - it's such a big, sprawling language
there's always something to learn. Thank you very much.

Tom Wyant

Re: Unable to connect a website using LWP

on 25.02.2006 01:15:13 by Sisyphus

"Paul Lalli" wrote in message

> This is where perl magic comes into place. Normally, when
> you compare an empty string to a number using the == operator, you'll
> get a warning message. The result of boolean checks like inequality,
> however, are not "normal":
>
> $ perl -MData::Dumper -wle'
> $x = q{};
> print Dumper(\$x);
> print "True" if $x == 0
> '
> $VAR1 = \'';
> Argument "" isn't numeric in numeric eq (==) at -e line 1.
> True
>
> $ perl -MData::Dumper -wle'
> $x = (5<3);
> print Dumper(\$x);
> print "True" if $x == 0
> '
> $VAR1 = \'';
> True
>
> Even though Data::Dumper seems to think the result of 5<3 is an empty
> string, Perl treats it as a generic false value - empty string when
> used as a string, but 0 when used as a number.
>

Not only that - but even Devel::Peek doesn't detect any difference between
the return value of 'q{}' and the return value of '5<3'. I ran this:

use warnings;
use Devel::Peek;

$x = q{};
if($x == 0) {print "OK1\n"}

$y = (5 < 3);
if($y == 0) {print "OK2\n"}

Dump($x);
print "##################\n";
Dump($y);
__END__

Do you know what accounts for the difference in behaviour ?

Cheers,
Rob

Re: Unable to connect a website using LWP

on 25.02.2006 01:24:12 by Sisyphus

"Sisyphus" wrote in message

>
> Not only that - but even Devel::Peek doesn't detect any difference between
> the return value of 'q{}' and the return value of '5<3'. I ran this:
>
> use warnings;
> use Devel::Peek;
>
> $x = q{};
> if($x == 0) {print "OK1\n"}
>
> $y = (5 < 3);
> if($y == 0) {print "OK2\n"}
>
> Dump($x);
> print "##################\n";
> Dump($y);
> __END__
>
> Do you know what accounts for the difference in behaviour ?
>

Doh ... silly me. As soon as the equivalence check is carried out on $x, its
flags are altered. The correct way to see the difference between $x and $y
is:

use warnings;
use Devel::Peek;

$x = q{};
$y = (5 < 3);

Dump($x);
print "##################\n";
Dump($y);
__END__

Which outputs:

SV = PV(0x3f5f74) at 0x3f5d94
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x3fd87c ""\0
CUR = 0
LEN = 4
##################
SV = PVNV(0x3f876c) at 0x3f5db8
REFCNT = 1
FLAGS = (IOK,NOK,POK,pIOK,pNOK,pPOK)
IV = 0
NV = 0
PV = 0x8bfe84 ""\0
CUR = 0
LEN = 4

And therein lies the answer - $y already has its IOK flag set, but $x does
not.

Cheers,
Rob

Re: Unable to connect a website using LWP

on 25.02.2006 11:12:14 by Prasanna

Paul Lalli wrote:
> prasanna_ssb@rediffmail.com wrote:
> > Hi,
> >
> > Iam trying to connect to a website
> > ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/
>
> That's not a website. The world wide web uses the HTTP protocol. This
> URL uses the FTP protocol.
>
> Does LWP handle FTP addresses? I have no idea.

Yes, it does. I took this code from an article by Sean Burke
(author of Perl and LWP) on www.perl.com.

> But it seems like it shouldn't have any reason to.

Why do you say that?

>
> Perhaps you should be using the Net::FTP module instead.
>
> > to download files
> > from there. I am using LWP for the same and I am getting the following
> > Error
> >
> > Cant get ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/:
> > HTTP::Response=HASH(0x98ceb34)->status_line at try.pl line 18.
> >
> > Where can I know what this error code stands for?
>
> You didn't print out the error code. That's at least half your
> problem.

Precisely. I was surprised that the error code wasn't thrown out at me,
and frankly I didn't know how to get the error code either.




> > if ($response->is_success == 0 ) { die "Cant get $url:\n
> > $response->status_line"; }
>
> status_line is a method of the HTTP::Response class. You can not
> interpolate method calls in a double quoted string, any more than you
> can interpolate function calls.

Oh, oh... I thought status_line was a string or something (a scalar)
that contained the error or success message. I am asking this
from a C++ programmer's perspective: shouldn't method and function names be
followed by () for easier differentiation? Otherwise, how does one know? Or
does Perl class design also follow the C++ paradigm that variables
are generally supposed to be private and are accessed only through member
functions or friend functions?


>
> die "Can't get $url:\n", $response->status_line;
>
> Make that change and see what the error status actually is.

I'll do that. Thanks for the tip.

>
> > One more very basic question - what is the difference between CGI and
> > LWP and when do we use each-- that is- is LWP for client side
> > scripting and CGI for server side scripting?
>
> Kinda-sorta-almost.
>
> CGI programs reside on a webserver. A web browser contacts the
> webserver and asks it to run that program and return that program's
> output. The web browser then displays the output.

Got this. PHP is similar to CGI, but different in that a request
(for the same program) on the webserver doesn't fork a new process,
unlike CGI (though I heard that newer versions of CGI take care of
this), right? This was one of the advantages touted by ASP
(yeah, right!) over CGI.

> LWP is a series of Perl modules for writing a web browser. You write
> programs to contact webservers and display whatever content those
> webservers returned to you.

Did you mean writing to a web browser? Sorry if it's a repeat question.

Thanks for all the help, Paul.

Re: Unable to connect a website using LWP

on 25.02.2006 11:24:00 by Prasanna

Bill Segraves wrote:
> wrote in message
> news:1140687108.651391.276040@g44g2000cwa.googlegroups.com.. .
> > Hi,
> >
> > Iam trying to connect to a website
> > ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/ to download files
> > from there. I am using LWP for the same and I am getting the following
> > Error
>
> Why are you using LWP to try to download the files?

Because I need to download the entire pdb. Of course I could have
written a shell script to fetch the files automatically by FTP, but
since I was learning Perl, I thought that LWP would be my best shot.

>
> If you point your browser to one level up in the directory hierarchy, e.g.,
>
> ftp://ftp.rcsb.org/pub/pdb/data/structures/all/
>
> you should be able to see the folders in said directory, among which is the
> folder "pdb". While the permissions of the folders appear to allow you to
> see the contents, the permissions of the files contained "in" the folders
> may not be set to allow you to see the files.
>
> Personally, I'd recommend that you enter through the "front door", e.g.,
>
> http://www.rcsb.org
>
> click on "Download Files" + "FTP Services", i.e.,
>
> http://www.rcsb.org/pdb/static.do?p=download/ftp/index.html
>
> and browse around a bit until you find the data you seek, *before* you
> attempt to write a script to automate what you could simply do with your
> browser.

I'll do this. There was another reason for the script, by the way. My
prof. wants to get this data at the beginning of every month or at some such
interval (essentially periodically) and obviously wants something that won't
bother him (he might put this in the crontab file, for all I know).

>
> Also, if you click on "FTP Services Help", you may find that the data you
> seek is not where you thought it was, e.g.,
>
> ftp://ftp.rcsb.org/pub/pdb/data/structures/divided/pdb/

Sorry! The path I used in the posted program was just a trial.
In my actual code I use the path you mention,
but I didn't post that part of the code since it was not the problem.

>
> where I can see all the folders and the files within them.
>
> Good luck!

Thank you.
> --
> Bill Segraves

Re: Unable to connect a website using LWP

on 25.02.2006 11:32:14 by Prasanna

harryfmudd [AT] comcast [DOT] net wrote:
> Paul Lalli wrote:
> > prasanna_ssb@rediffmail.com wrote:
> >
> >>Hi,
> >>
> >>Iam trying to connect to a website
> >>
> >
> >
> > That's not a website. The world wide web uses the HTTP protocol. This
> > URL uses the FTP protocol.
> >
> > Does LWP handle FTP addresses? I have no idea. But it seems like it
> > shouldn't have any reason to.
>
> Actually, LWP handles FTP as well as HTTP, as
>
> GET ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/
>
> will demonstrate.
>
> >
> > Perhaps you should be using the Net::FTP module instead.
> >
>
> Net::FTP will work also. There's More Than One Way To Do It.
>
> >
> >>to download files
> >>from there. I am using LWP for the same and I am getting the following
> >>Error
> >>
> >>Cant get ftp://ftp.rcsb.org/pub/pdb/data/structures/all/pdb/:
> >> HTTP::Response=HASH(0x98ceb34)->status_line at try.pl line 18.
> >>
> >>Where can I know what this error code stands for?
> >
> >
> > You didn't print out the error code. That's at least half your
> > problem.
> >
> >
> >>Here is the portion of the code referring to this problem
> >>
> >>#! /usr/bin/perl
> >>
> >>use warnings;
> >>use strict;
> >>use lib '/home/ms2/PERL/LWP/libwww-perl-5.800/lib/';
> >>use LWP;
> >>
> >>#Create a UserAgent object
> >>
> >>use LWP::UserAgent;
> >>my $ua = LWP::UserAgent->new;
> >>
> >>#need to make a get request
> >>
> >>my $url = 'ftp://ftp.rcsb.prg/pub/pdb/data/structures/all/pdb/';
>
> Earlier in your post you said you were going for 'ftp.rcsb.org'. If you
> get your error reporting sorted out, you will find that domain
> 'ftp.rcsb.prg' does not exist.

So sorry about that. I'd realized this after I'd pasted my code here.
I've made the changes though.


> >>my $response = $ua->get($url);
> >>
> >>if ($response->is_success == 0 ) { die "Cant get $url:\n
> >>$response->status_line"; }
> >
> >
> > status_line is a method of the HTTP::Response class. You can not
> > interpolate method calls in a double quoted string, any more than you
> > can interpolate function calls.
> >
> > die "Can't get $url:\n", $response->status_line;
> >
> > Make that change and see what the error status actually is.
> >
>
> I would like to recommend, in lieu of your two lines, something like
>
> $response->is_success or
> die "Can't get $url:\n", $response->status_line, "\n";
>
> The explicit '== 0' probably does what you think it does though, so
> maybe it's just a matter of style. The previous poster is correct,
> though, you can't interpolate a subroutine or method call into a string.
> At least, not directly. You _can_ do something like
>
> $response->is_success or die <<eod;
> Can't get $url:
> @{[$response->status_line]}
> eod
>
> if you want, using the @{[ ]} idiom to interpolate a list which you
> build on-the-fly.

I don't know anything about the interpolation syntax you've used.
I'll go through perldoc.
Thanks for the pointer.

>
> Tom Wyant

Re: Unable to connect a website using LWP

on 25.02.2006 14:17:44 by Paul Lalli

Prasanna wrote:
> Paul Lalli wrote:
> > prasanna_ssb@rediffmail.com wrote:

> > Does LWP handle FTP addresses? I have no idea.
>
> Yes it does.. I've taken this code from an article by Sean Burke
> (author of Perl and LWP) from www.perl.com
>
> > But it seems like it shouldn't have any reason to.
>
> Why do you say that?

Because there are other modules specifically designed for FTP transfers.
Like I said, I honestly had no idea whether LWP handled FTP. It just
wouldn't have occurred to me to try.

> > > if ($response->is_success == 0 ) { die "Cant get $url:\n
> > > $response->status_line"; }
> >
> > status_line is a method of the HTTP::Response class. You can not
> > interpolate method calls in a double quoted string, any more than you
> > can interpolate function calls.
>
> Oh, OH.... I thought status_line was a string or something (scalar)
> which contained the error message/ success message. Iam asking this
> from a C++ prog. perspective -shouldnt the method/ function names be
> followed by a () for easier differentiation, else how does one know?

I'm not sure what you mean. There is nothing else that syntax could
represent. Are you asking how one knows that we're referring to a
method rather than a member element?

In Perl, objects are (usually) implemented as hash references. To access
an element of a hash, you use the {} notation:
my %hash = (foo => 'bar');
print "Foo's value: $hash{foo}\n";

Correspondingly, to access an element of a hash when you only have a
reference to that hash, you add in the arrow notation:

my $hash_ref = { foo => 'bar' };
print "Foo's value: $hash_ref->{foo}\n";

If we're accessing an element of the hash that underlies the object
we're using, we'd have the { } around the element name. There is no
ambiguity here.
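
A small sketch of the two notations side by side. Demo::Response is a hypothetical class written for this example, not part of LWP, and its field names are invented.

```perl
use strict;
use warnings;

# Hypothetical class for illustration; real LWP classes have more to them.
package Demo::Response;

sub new {
    my ($class, %args) = @_;
    # The object is just a blessed reference to a hash.
    return bless { code => $args{code}, message => $args{message} }, $class;
}

# A method: called with the arrow and a bare name, no braces.
sub status_line {
    my ($self) = @_;
    return "$self->{code} $self->{message}";
}

package main;

my $res = Demo::Response->new(code => 404, message => 'Not Found');

my $via_method = $res->status_line;   # method call
my $via_hash   = $res->{code};        # direct element access (braces)

print "$via_method / $via_hash\n";
```

The braces are what mark element access; an arrow followed by a bare name is always a method call, with or without trailing parentheses.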

You may wish to read the following:
perldoc perlreftut
perldoc perllol
perldoc perldsc
perldoc perlobj

> Or
> Perl classes design also follow the paradigm of C++ that generally
> variables are supposed to be private and are to be accessed by member
> functions/ friend functions only?

If and only if you want to implement that restriction. In "normal"
objects, no, any piece of code that has access to the object can muck
about with the object's member elements. For alternative solutions
that do implement a very strict form of encapsulation, check out Damian
Conway's Class::Std module, available on CPAN -
http://search.cpan.org/~dconway/Class-Std-v0.0.8/lib/Class/Std.pm

(In fact, Damian Conway's book, _Perl Best Practices_, is quickly
becoming required reading for Perl programmers....)

> > LWP is a series of Perl modules for writing a web browser. You write
> > programs to contact webservers and display whatever content those
> > webservers returned to you.
>
> did you mean writing to a web browser? sorry if its a repeat question.

No. I meant writing a web browser. When you use the LWP:: family of
modules (including such favorites as WWW::Mechanize), you are
effectively creating a web browser (more officially, an HTTP client),
albeit significantly less advanced than browsers
such as Firefox or Internet Explorer.

> Thanks for all the help, Paul.

You're welcome.

Paul Lalli

Re: Unable to connect a website using LWP

on 25.02.2006 21:43:21 by Bill Segraves

"Prasanna" wrote in message
news:1140863040.402425.158780@i39g2000cwa.googlegroups.com...
> Bill Segraves wrote:
> > wrote in message
> > news:1140687108.651391.276040@g44g2000cwa.googlegroups.com.. .

> Because I need to download the entire pdb. Of course I could have
> written a shell script for automatic getting the files by ftp, but
> since I was learning Perl, I thought that LWP would be my best shot
>

From what you've described, it doesn't appear that you need to download the
entire pdb; but rather, it seems you need to download the folders/files that
have changed. See below for some suggestions on how to mirror the RCSB PDB
on the site you need.



> > Personally, I'd recommend that you enter through the "front door", e.g.,
> >
> > http://www.rcsb.org
> >
> > click on "Download Files" + "FTP Services", i.e.,
> >
> > http://www.rcsb.org/pdb/static.do?p=download/ftp/index.html
> >
> > and browse around a bit until you find the data you seek, *before* you
> > attempt to write a script to automate what you could simply do with your
> > browser.
>
> I'll do this. There was another reason for the script by the way. My
> prof. wants to get this data every month beginning or some such period
> ( essentially periodically) and obviously wants something which'd not
> bother him (he might put this in the crontab file, for all I know).
>

I understand. It's a terrible bother to click on a shortcut/link
ftp://ftp.rcsb.org/pub/pdb/data/structures/divided
on his desktop and copy the target folder "pdb" to a local directory. ;-)

You may benefit from the work of others who have already shown us how to
solve similar problems, e.g., Randal L. Schwartz articles:

"Special Purpose Mirrors"
http://www.stonehenge.com/merlyn/WebTechniques/col31.html

"Mirroring your own mini-CPAN"
http://www.stonehenge.com/merlyn/LinuxMag/col42.html

both of which use LWP::Simple, and

"Calculating download time"
http://www.stonehenge.com/merlyn/WebTechniques/col63.html


> >
> > Also, if you click on "FTP Services Help", you may find that the data you
> > seek is not where you thought it was, e.g.,
> >
> > ftp://ftp.rcsb.org/pub/pdb/data/structures/divided/pdb/
>

One level up from the above link is where you may want to go to see the
"pdb" folder.

> Sorry!! the above path which I'd used in my program is just a trial
> kind of thing. I've used this path which you've mentioned, in the code,
> but I didnt post that part of the code since that was not the problem.
>

O.K.

I hope Randal's articles are helpful to you.

Cheers.
--
Bill Segraves

Re: Unable to connect a website using LWP

on 26.02.2006 17:31:55 by prasanna_ssb

Paul Lalli wrote:
> Prasanna wrote:
> > Paul Lalli wrote:
> > > prasanna_ssb@rediffmail.com wrote:

> > > > if ($response->is_success == 0 ) { die "Cant get $url:\n
> > > > $response->status_line"; }
> > >
> > > status_line is a method of the HTTP::Response class. You can not
> > > interpolate method calls in a double quoted string, any more than you
> > > can interpolate function calls.
> >
> > Oh, OH.... I thought status_line was a string or something (scalar)
> > which contained the error message/ success message. Iam asking this
> > from a C++ prog. perspective -shouldnt the method/ function names be
> > followed by a () for easier differentiation, else how does one know?
>
> I'm not sure why at you mean. There is nothing else that syntax could
> represent. Are you asking how one knows that we're referring to a
> method rather than a member element?

Actually, yes. I wanted to ask: do we know that from perldoc or from
the syntax? But I think you've already answered that below.

>
> In Perl, objects are (usually) implemented as hash references. To access
> an element of a hash, you use the {} notation:
> my %hash = (foo => 'bar');
> print "Foo's value: $hash{foo}\n";
>
> Correspondingly, to access an element of a hash when you only have a
> reference to that hash, you add in the arrow notation:
>
> my $hash_ref = { foo => 'bar' };
> print "Foo's value: $hash_ref->{foo}\n";
>
> If we're accessing an element of the hash that underlies the object
> we're using, we'd have the { } arround the element name. There is no
> ambiguity here.
>
> You may wish to read the following:
> perldoc perlreftut
> perldoc perllol
> perldoc perldsc
> perldoc perlobj

Thanks for pointing the way.


> > > LWP is a series of Perl modules for writing a web browser. You write
> > > programs to contact webservers and display whatever content those
> > > webservers returned to you.
> >
> > did you mean writing to a web browser? sorry if its a repeat question.
>
> No. I meant writing a web browser. When you use the LWP:: family of
> modules (including such favorites as WWW::Mechanize), you are
> effectively creating a web browser (more officially, an HTTP client),
> albeit significantly less advanced than browsers
> such as Firefox or Internet Explorer.

Wow! Writing a web browser. Cool, I like that. But before doing
this I'll first read the perldoc tutorials you have pointed me to.

Thanks once more.

Prasanna

>
> > Thanks for all the help, Paul.
>
> You're welcome.
>
> Paul Lalli

Re: Unable to connect a website using LWP

on 26.02.2006 17:39:31 by prasanna_ssb

Bill Segraves wrote:
> "Prasanna" wrote in message
> news:1140863040.402425.158780@i39g2000cwa.googlegroups.com.. .
> > Bill Segraves wrote:
> > > wrote in message
> > > news:1140687108.651391.276040@g44g2000cwa.googlegroups.com.. .
>
> > Because I need to download the entire pdb. Of course I could have
> > written a shell script for automatic getting the files by ftp, but
> > since I was learning Perl, I thought that LWP would be my best shot
> >
>
> From what you've described, it doesn't appear that you need to download the
> entire pdb; but rather, it seems you need to download the folders/files that
> have changed.

The first time it'd be the entire pdb and later on it'd be the updates
only.

> See below for some suggestions on how to mirror the RCSB PDB
> on the site you need.

OK..


> >
> > I'll do this. There was another reason for the script by the way. My
> > prof. wants to get this data every month beginning or some such period
> > ( essentially periodically) and obviously wants something which'd not
> > bother him (he might put this in the crontab file, for all I know).
> >
>
> I understand. It's a terrible bother to click on a shortcut/link
> ftp://ftp.rcsb.org/pub/pdb/data/structures/divided
> on his desktop and copy the target folder "pdb" to a local directory. ;-)

He's a crystallographer, not a programmer... so it'd be a bother :-)

>
> You may benefit from the work of others who have already shown us how to
> solve similar problems, e.g., Randal L. Schwartz articles:
>
> "Special Purpose Mirrors"
> http://www.stonehenge.com/merlyn/WebTechniques/col31.html
>
> "Mirroring your own mini-CPAN"
> http://www.stonehenge.com/merlyn/LinuxMag/col42.html
>
> both of which use LWP::Simple, and
>
> "Calculating download time"
> http://www.stonehenge.com/merlyn/WebTechniques/col63.html

Thanks for the tip
I'll go through the code(s)... in his articles...




> I hope Randal's articles are helpful to you.

Me too..

Thanks once more...
Prasanna

>
> Cheers.
> --
> Bill Segraves

Re: Unable to connect a website using LWP

on 26.02.2006 21:28:54 by Bill Segraves

wrote in message
news:1140971971.288139.106720@j33g2000cwa.googlegroups.com...
>

> > > Because I need to download the entire pdb. Of course I could have
> > > written a shell script for automatic getting the files by ftp, but
> > > since I was learning Perl, I thought that LWP would be my best shot
> > >
> >
> > From what you've described, it doesn't appear that you need to download the
> > entire pdb; but rather, it seems you need to download the folders/files that
> > have changed.
>
> THe first time it'd be the entire pdb and later on it'd be the updates
> only.
>

You are no doubt aware there are over 1000 directories to download to get
the entire pdb. You may wish to consider downloading the directories in
batches, say, of 36 per day for 36 days, to get the entire set. See names of
the folders, all of which appear to be two characters with the pattern
[0-9a-z]{1}[0-9a-z]{1} .

While downloading the directories, you may want to build a local index of
the entire pdb at the RCSB PDB site.



Once you've built the index, the mirroring task could be done using
LWP::Simple's mirror function.
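
A rough sketch of the batching idea, assuming the two-character folder names really do follow the pattern above. The sub names are invented for illustration, and since the per-file URL layout is not spelled out in this thread, mirror_file simply takes whatever URL and local path the caller supplies.

```perl
use strict;
use warnings;

# All 36 * 36 two-character names matching [0-9a-z]{2}.
sub all_dir_names {
    my @chars = ('0' .. '9', 'a' .. 'z');
    return map { my $first = $_; map { $first . $_ } @chars } @chars;
}

# Split a list into batches of at most $size names each.
sub batches {
    my ($size, @names) = @_;
    my @out;
    push @out, [ splice @names, 0, $size ] while @names;
    return @out;
}

# Mirror one file; LWP::Simple::mirror returns an HTTP-style status code.
sub mirror_file {
    my ($url, $local_path) = @_;
    require LWP::Simple;    # not a core module; loaded only when mirroring
    my $rc = LWP::Simple::mirror($url, $local_path);
    return $rc == 304 || LWP::Simple::is_success($rc);   # 304 = not modified
}

my @plan = batches(36, all_dir_names());
printf "%d batches, first batch starts with %s\n", scalar @plan, $plan[0][0];
```

Each day's cron run could then take one batch from the plan and call mirror_file for every file listed in the local index for those folders.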

--
Bill Segraves