Data mining?

on 25.04.2007 10:31:53 by tom

Hi all,

I wonder if anyone can give me some help here.

I have permission from a colleague to use some data from his website,
now I need to take the data and intermingle some of the information
with data from my database. I only have access to the HTML of the
external site, there's no XML feed or anything simple I can get it
from either. So, I was wondering if there was an easyish way of
parsing out the info I need from the HTML and putting it into an
array or something? This is how the HTML is formed:


Tel: .

xxNeed this Titlexx

Address line 1
address line 2
Postcode

Tel: .

xxNeed this Titlexx

Address line 1
address line 2
Postcode

Tel: .

xxNeed this Titlexx

Address line 1
address line 2
Postcode

And so on....

Does anyone have any bright ideas? I've got as far as putting the
whole page into a string and ripping out the and other
stuff. It's looping through those DIVs and turning them into something
I can manipulate where I'm struggling.

Thanks in advance,

Tom

Re: Data mining?

on 26.04.2007 18:02:40 by Anthony Jones

"Tom" wrote in message
news:1177489913.166536.19160@r30g2000prh.googlegroups.com...
> Hi all,
>
> I wonder if anyone can give me some help here.
>
> I have permission from a colleague to use some data from his website,
> now I need to take the data and intermingle some of the information
> with data from my database. I only have access to the HTML of the
> external site, there's no XML feed or anything simple I can get it
> from either. So, I was wondering if there was an easyish way of
> parsing out the info I need from the HTML and putting it into an
> array or something? This is how the HTML is formed:
>
> Tel: .
>
> xxNeed this Titlexx
>
> Address line 1
> address line 2
> Postcode
>
> Tel: .
>
> xxNeed this Titlexx
>
> Address line 1
> address line 2
> Postcode
>
> Tel: .
>
> xxNeed this Titlexx
>
> Address line 1
> address line 2
> Postcode
>
> And so on....
>
> Does anyone have any bright ideas? I've got as far as putting the
> whole page into a string and ripping out the and other
> stuff. It's looping through those DIVs and turning them into something
> I can manipulate where I'm struggling.
>
> Thanks in advance,

It looks like the HTML is XML compliant (e.g., it closes its empty
elements rather than simply leaving them open), so you might be able
to get away with loading it into an XML DOM.
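
To make the idea concrete, here is a rough sketch in Python of loading
such a page into an XML tree and looping over the repeating DIVs. The
real markup is not visible in this thread, so the sample page, the tag
names (h3, p) and the class names (entry, tel, address) are all made up
for illustration; adjust them to whatever the site actually uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the fetched page. The real tags and class
# names are not shown in the thread, so everything below is assumed.
PAGE = """<html><body>
<div class="entry">
  <p class="tel">Tel: 01234 567890</p>
  <h3>First Title</h3>
  <p class="address">Address line 1<br />address line 2<br />Postcode</p>
</div>
<div class="entry">
  <p class="tel">Tel: 09876 543210</p>
  <h3>Second Title</h3>
  <p class="address">1 High Street<br />Anytown<br />AB1 2CD</p>
</div>
</body></html>"""


def text_of(element):
    """Return the stripped text of an element, or '' if it is missing."""
    if element is None or element.text is None:
        return ""
    return element.text.strip()


def parse_entries(xhtml):
    """Parse well-formed XHTML and return one dict per 'entry' div."""
    root = ET.fromstring(xhtml)  # raises ParseError if not well-formed
    entries = []
    for div in root.iter("div"):
        if div.get("class") != "entry":
            continue
        address = div.find("p[@class='address']")
        # itertext() yields the text before, between and after the <br />
        # elements, which gives one item per address line.
        lines = []
        if address is not None:
            lines = [t.strip() for t in address.itertext() if t.strip()]
        entries.append({
            "title": text_of(div.find("h3")),
            "tel": text_of(div.find("p[@class='tel']")),
            "address": lines,
        })
    return entries


if __name__ == "__main__":
    for entry in parse_entries(PAGE):
        print(entry["title"], "|", ", ".join(entry["address"]))
```

Two caveats with this approach: if the page declares the XHTML
namespace on its html element, the tag names have to be written with
that namespace (for example {http://www.w3.org/1999/xhtml}div), and a
single unclosed tag or an undeclared entity such as &nbsp; will make
the parser reject the whole page.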



