HTML Table > database entry

on 25.04.2007 10:19:15 by Blagovist

Hi.
Is there an easy way to "lift" data from HTML tables and enter that into
my database? I'm a total novice and so far my searches have yielded
little. I see Navicat has an import option, but that appears to be for
well structured data like Word, Excel or PDF...

Thanks,

Blago

Re: HTML Table > database entry

on 25.04.2007 19:40:55 by Virginner

"Blagovist" wrote in message
news:462f0f0f_3@x-privat.org...
> Hi.
> Is there an easy way to "lift" data from HTML tables and enter that into
> my database? I'm a total novice and so far my searches have yielded
> little. I see Navicat has an import option, but that appears to be for
> well structured data like Word, Excel or PDF...
>
> Thanks,
>
> Blago

If you've got Excel, then you can "bounce" a table via that (copy / paste)
then use that to import via Navicat....

D.
--
googlegroups > /dev/nul

Re: HTML Table > database entry

on 25.04.2007 22:58:25 by unknown

Post removed (X-No-Archive: yes)

Re: HTML Table > database entry

on 26.04.2007 14:14:55 by Blagovist

Virginner wrote:
> If you've got Excel, then you can "bounce" a table via that (copy / paste)
> then use that to import via Navicat....
>
> D.

I found something called easywebsave (an IE add-on) that looks
promising. But still a long way from being automated.

Blago

Re: HTML Table > database entry

on 26.04.2007 22:50:11 by Christoph Burschka

Blagovist wrote:
> I found something called easywebsave (an IE add-on) that looks
> promising. But still a long way from being automated.
>
> Blago

The following code relies heavily on your input HTML table being well-formed
XHTML, with rows made up of td cells only:

$text = "
[your table here]
";

/* First, strip the first <tr><td> and the last </td></tr>.
   The s modifier lets . match across line breaks. */
preg_match('/<tr[^>]*>\s*<td[^>]*>(.+)<\/td>\s*<\/tr>/s',$text,$match);
$to_split=$match[1];

/* Now split wherever a row is closed, then opened. */
$rows = preg_split('/<\/td>\s*<\/tr>\s*<tr[^>]*>\s*<td[^>]*>/',$to_split);

$cells = array();
foreach ($rows as $row)
{
    // Now split the rows into cells.
    $cells[]=preg_split('/<\/td>\s*<td[^>]*>/',$row);
}


Your data is now split into a two-dimensional array. Putting it into a database is
pretty trivial after that.
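
As a rough sketch of that last step, the rows in $cells could be inserted with a
prepared statement. Everything database-related here is an assumption made for
illustration and not part of the code above: PDO with the SQLite driver, an
in-memory database, and a hypothetical two-column table called items.

```php
<?php
// Stand-in for the $cells array built above: one inner array per table row.
$cells = array(
    array('Widget', '4.50'),
    array('Gadget', '12.00'),
);

// Assumption: an in-memory SQLite database. Swap the DSN for your own server,
// e.g. 'mysql:host=localhost;dbname=test' plus a user name and password.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table with one column per table cell.
$db->exec('CREATE TABLE items (name TEXT, price TEXT)');

// A prepared statement keeps the cell contents out of the SQL text itself.
$insert = $db->prepare('INSERT INTO items (name, price) VALUES (?, ?)');

$db->beginTransaction();
foreach ($cells as $row) {
    // Skip rows that do not have exactly the expected number of cells.
    if (count($row) !== 2) {
        continue;
    }
    $insert->execute($row);
}
$db->commit();

$inserted = (int) $db->query('SELECT COUNT(*) FROM items')->fetchColumn();
echo "Inserted $inserted rows\n";
```

The row-length check matters because a scraped table often contains header or
spacer rows with a different number of cells.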

--
cb

Re: HTML Table > database entry

on 27.04.2007 13:10:50 by Captain Paralytic

On 26 Apr, 21:50, Christoph Burschka wrote:
> Your data is now split in a two-dimensional array. Putting it into a database is
> pretty trivial after that.

But what if that data had individual formatting? The data in one cell
could have a superscript or be in bold. All those tags would be
included.

Re: HTML Table > database entry

on 27.04.2007 19:34:04 by unknown

Post removed (X-No-Archive: yes)

Re: HTML Table > database entry

on 27.04.2007 19:44:56 by Virginner

"Blagovist" wrote in message
news:463097cb_1@x-privat.org...
> I found something called easywebsave (an IE add-on) that looks promising.
> But still a long way from being automated.

Ah! You didn't state "automated" in your OP, hence my suggestion about
Excel -> Navicat.

If you want it automated, then read the URL into a string with
file_get_contents(), run strip_tags() with the table-related tags allowed,
then use a few explode() or preg_split() calls to rip the remaining data into
array(s).
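
A minimal sketch of that approach, under a few assumptions: the sample markup,
the commented-out URL and the variable names are made up for illustration, and
the page is taken to contain a single, non-nested table.

```php
<?php
// Assumption: in real use the markup would come from something like
//   $html = file_get_contents('http://example.com/page.html'); // hypothetical URL
// A literal string is used here so the sketch runs on its own.
$html = '<html><body><h1>Prices</h1><table border="1">
<tr><td><b>Widget</b></td><td>4.50</td></tr>
<tr><td><i>Gadget</i></td><td>12.00</td></tr>
</table></body></html>';

// Keep only the table-related tags; b, i, h1 and the rest are dropped,
// although the text inside them stays.
$html = strip_tags($html, '<table><tr><td><th>');

$rows = array();
// explode() on the row terminator, then preg_split() on the cell terminator.
foreach (explode('</tr>', $html) as $chunk) {
    if (stripos($chunk, '<td') === false && stripos($chunk, '<th') === false) {
        continue; // the tail after the last </tr>
    }
    $cells = array();
    foreach (preg_split('/<\/t[dh]>/i', $chunk) as $piece) {
        if (stripos($piece, '<td') === false && stripos($piece, '<th') === false) {
            continue; // whitespace after the last cell of the row
        }
        // Drop everything up to and including the opening <td ...> or <th ...>,
        // which also discards any text that preceded the table.
        $cells[] = trim(preg_replace('/^.*<t[dh][^>]*>/is', '', $piece));
    }
    $rows[] = $cells;
}

print_r($rows);
```

With the sample markup above, $rows ends up holding two rows of two plain-text
cells each. Real pages with nested tables or unclosed cells would need more care.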

D.
--
googlegroups > /dev/nul

Re: HTML Table > database entry

on 28.04.2007 14:14:44 by Christoph Burschka

Captain Paralytic wrote:
> But what if that data had individual formatting. The data in one cell
> could have a superscript or be in bold. All those tags would be
> included.
>

Hopefully, that information is in the style attribute of the cell tag (and will
get split away, since <td[^>]*> matches a complete tag with all attributes). But
if there's markup inside the cell, strip_tags() will remove it.
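
For illustration, a small sketch of that clean-up step. The sample data and the
clean_cell() helper are hypothetical, and decoding entities is an extra step
that was not mentioned above.

```php
<?php
// Stand-in for the $cells array from the earlier post, with markup left
// inside some of the cells.
$cells = array(
    array('<b>Widget</b>', ' 4.50 '),
    array('H<sub>2</sub>O &amp; salt', '12.00'),
);

// Hypothetical helper: strip tags, decode entities and trim a single cell.
function clean_cell($cell)
{
    $text = strip_tags($cell);
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    return trim($text);
}

// Apply the helper to every cell of every row.
$clean = array();
foreach ($cells as $row) {
    $clean[] = array_map('clean_cell', $row);
}

print_r($clean);
```

Note that strip_tags() only removes the tags, not the text between them, so a
subscripted "2" survives as a plain "2" and the formatting itself is lost.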

--
cb