XML RSS reader with BBC Website..

on 15.08.2007 16:57:21 by junkmate

I have made an RSS reader and am testing it on the BBC website. I use
this code to grab the contents of the XML file; however, when I compare
the contents grabbed by my function with the source of the BBC feed as
shown in the browser, they are different... how is that even possible?

Does anyone have an XML parser they could test this on, please? Here's a
sample link and my code:
http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/football/rss.xml

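// Assumed: $feed holds the sample link above; it was not defined in the
// posted snippet.
$feed = "http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/football/rss.xml";
$rss_name = "filename.xml";

$ch = curl_init($feed);
$fp = fopen($rss_name, "w");

// Write the response body straight to the file, without the HTTP headers.
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);

curl_exec($ch);
curl_close($ch);
fclose($fp);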

Re: XML RSS reader with BBC Website..

on 15.08.2007 17:13:09 by luiheidsgoeroe

On Wed, 15 Aug 2007 16:57:21 +0200, junkmate wrote:

> I have made an RSS reader and am testing on the BBC website, and I use
> this code to grab the contents of the XML file, however when I look at
> the contents grabbed by my function, and the HTML source of the bbc
> website XML, they are different... how is that even possible?
>
> Anyone have an XML parser that they could test this on please? Heres a
> sample link and my code:
> http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/football/rss.xml
>
> $rss_name = "filename.xml";
>
> $ch = curl_init($feed);
> $fp = fopen($rss_name, "w");
>
> curl_setopt($ch, CURLOPT_FILE, $fp);
> curl_setopt($ch, CURLOPT_HEADER, 0);
>
> curl_exec($ch);
> curl_close($ch);
> fclose($fp);

Viewing the feed source & the file from CURL, the only difference I see is
(understandably) . What do you see and what do you expect?

--
Rik Wasmus

Re: XML RSS reader with BBC Website..

on 15.08.2007 17:23:05 by junkmate

I get an old set of items... the latest items are not included...
Now I am thinking my cURL function may be grabbing cached versions of
the XML file. Is that possible, and if so, can it be switched off?
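
One way to rule out a cached response on the fetching side is to ask cURL for a new connection and to send request headers that ask intermediate HTTP caches to revalidate. The sketch below assumes the PHP cURL extension is loaded; the function name `fetch_feed_fresh()` and its parameters are made up for illustration, and whether a remote server or proxy actually honors these headers is outside the client's control.

```php
<?php
// Hypothetical helper: download $url into $destPath while asking caches
// along the way not to serve a stored copy.
function fetch_feed_fresh(string $url, string $destPath): bool
{
    $fp = fopen($destPath, 'w');
    if ($fp === false) {
        return false; // could not open the target file
    }

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_FILE          => $fp,    // write the body straight to the file
        CURLOPT_HEADER        => false,  // keep response headers out of the file
        CURLOPT_FRESH_CONNECT => true,   // do not reuse a cached connection
        CURLOPT_FORBID_REUSE  => true,   // close the connection afterwards
        CURLOPT_HTTPHEADER    => [
            'Cache-Control: no-cache',
            'Pragma: no-cache',
        ],
    ]);

    // With CURLOPT_FILE set, curl_exec() returns true on success, false on failure.
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);

    return $ok !== false;
}
```

It would be called as `fetch_feed_fresh($feed, $rss_name)` in place of the original snippet.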


Re: XML RSS reader with BBC Website..

on 15.08.2007 17:44:31 by luiheidsgoeroe

On Wed, 15 Aug 2007 17:23:05 +0200, junkmate wrote:

(topposting fixed)

> I get an old set of items... the latest items are not included...
> Now I am thinking my cUrl function maybe grabbing cached versions of
> the xml file? is that possible and if so, can it be switched off?

No such problem here, though it might depend on server setup. Are you sure
that what CURL gets is cached data, and that it is not your own output on
the web which is cached? (i.e. your file gets updated, but the browser still
shows the old file)

--
Rik Wasmus
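
If the stale copy turns out to be the generated page rather than the fetched feed, and that page is served through PHP, the serving script can ask the browser not to reuse an old response. A minimal sketch; `no_cache_headers()` is a hypothetical helper, and this header set is one common choice rather than the only one.

```php
<?php
// Hypothetical helper: headers that ask browsers and proxies not to reuse
// a stored copy of the response.
function no_cache_headers(): array
{
    return [
        'Cache-Control: no-store, no-cache, must-revalidate',
        'Pragma: no-cache', // for old HTTP/1.0 caches
        'Expires: 0',
    ];
}

// Headers must be sent before any output is written.
if (!headers_sent()) {
    foreach (no_cache_headers() as $line) {
        header($line);
    }
}
```

For a static file fetched via AJAX, the same effect would have to come from the web server configuration or from a changing query string on the request URL.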

Re: XML RSS reader with BBC Website..

on 15.08.2007 17:55:39 by junkmate

No, I have a button which grabs a fresh XML file and writes a fresh
htm file to be included every time via AJAX.

I did find this:
curl_setopt($ch, CURLOPT_DNS_CACHE_TIMEOUT, 0);

Since adding that, I get the latest results... which means one of two
things:
1) The cache finally ran out and it refreshed anyway!
2) It's fixed...
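
Note that `CURLOPT_DNS_CACHE_TIMEOUT` controls how long cURL keeps resolved host names in memory; it is not a setting for cached HTTP content, so the first explanation cannot be ruled out from this change alone. To see what the server actually sent, it can help to look at the response headers. A rough sketch, assuming the cURL extension; `parse_response_headers()` and `fetch_headers()` are hypothetical helpers, and which headers a given server sends (`Date`, `Last-Modified`, `Age`, ...) will vary.

```php
<?php
// Hypothetical helper: turn a raw HTTP header block into a name => value map.
// Names are lower-cased because HTTP header names are case-insensitive.
function parse_response_headers(string $raw): array
{
    $headers = [];
    foreach (preg_split('/\r\n|\n/', trim($raw)) as $line) {
        // Split on the first colon only, so values such as dates stay intact.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return $headers;
}

// Hypothetical helper: request only the headers of $url and return them
// parsed, or null if the request failed.
function fetch_headers(string $url): ?array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true, // HEAD-style request, no body
        CURLOPT_HEADER         => true, // include headers in the output
        CURLOPT_RETURNTRANSFER => true, // return the output instead of printing it
    ]);
    $raw = curl_exec($ch);
    curl_close($ch);

    return $raw === false ? null : parse_response_headers($raw);
}
```

Calling `fetch_headers($feed)` a few times in a row and comparing the values is a cheap way to tell whether different requests are being answered with different copies of the feed.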


Re: XML RSS reader with BBC Website..

on 15.08.2007 18:01:01 by junkmate

No, I just tried it on a brand new feed:
http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml

The second item is different...


Re: XML RSS reader with BBC Website..

on 15.08.2007 18:17:25 by junkmate

OK, something's erratic... I added a date at the top of my parser's output
which shows the lastBuildDate of the XML file being parsed. It changes as
you click on refresh... and is always different from the one found in
the actual XML source reached by clicking the rss button.

Is it my browser? Is it my page being cached? I don't know. Any ideas?

Here: http://dev.oldsushi.com/joe
The top one, labeled BBC News
(the actual RSS feed can be accessed by clicking the rss button in the
top right)
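
To compare what the script fetched with what the browser shows, the channel's `lastBuildDate` can be read directly from the saved file. A minimal sketch, assuming the SimpleXML extension and a standard RSS 2.0 layout (`rss/channel/lastBuildDate`); the function name and the sample XML are made up for illustration.

```php
<?php
// Hypothetical helper: return the channel's lastBuildDate from an RSS 2.0
// document, or null if the XML is invalid or the element is missing.
function rss_last_build_date(string $xml): ?string
{
    $previous = libxml_use_internal_errors(true); // collect parse errors quietly
    $rss = simplexml_load_string($xml);
    libxml_use_internal_errors($previous);

    if ($rss === false || !isset($rss->channel->lastBuildDate)) {
        return null;
    }
    return trim((string) $rss->channel->lastBuildDate);
}

// Made-up sample feed, only to show the call.
$sample = <<<XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <lastBuildDate>Wed, 15 Aug 2007 16:10:00 GMT</lastBuildDate>
  </channel>
</rss>
XML;

echo rss_last_build_date($sample), "\n";
```

Running it on `file_get_contents("filename.xml")` right after the download, and logging the result next to the fetch time, would show whether the saved file itself is old or only the page built from it.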
