Database table load utility

on 08.11.2010 01:16:24 by Chap Harrison

Hi folks,

I'm in the thinking stages of creating a table-load utility, which reads a
tab-separated CSV file and inserts or updates rows in a relational table
(probably Oracle). I don't think that will be too hard, having used Perl
DBI/DBD modules in the past. What's different is that customers will
transmit their files to a directory on a Linux server, using an FTP/SFTP
client of their choosing, after which my utility needs to "notice" the
arrival of the file and initiate the table updating.

Are there any Perl facilities, or modules, I should be considering for
this? Or is this sort of problem typically solved with something as
primitive as a daemon that periodically polls for changes to the
directories?

And - is there perhaps a name for this kind of design? (I mean, other
than "somewhat retarded" ;-) Kind of like "store-and-forward", but
different? That would help my googling.

I do appreciate any follow-up questions or suggestions....

Chap


--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: Database table load utility

on 08.11.2010 01:47:06 by Shawn Wilson


It's Unix, so I'd use cron, as it's probably already running, and do
something like: script.pl $(find . -type f) every few seconds. Have the
script die if $ARGV[0] is undefined. (find's '-print0' separates file
names with a NUL character, "\0", but that's meant for piping into
xargs -0 rather than for shell command substitution, so plain find output
is simpler here.) Loop through the files, use a CSV module such as
Text::CSV to separate your input (or split /\t/, ...) and use DBI to
insert your stuff.
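A minimal sketch of that cron-driven loader. The directory paths, connect string, table, and column names are all placeholders I've invented for illustration, and the database step only runs when explicitly enabled, so the parsing can be exercised on its own:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);

# Split one tab-separated record into fields; the -1 limit keeps
# trailing empty fields instead of silently dropping them.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    return split /\t/, $line, -1;
}

# The cron-driven part: walk whatever files are in the incoming
# directory and insert each record (two fields per record assumed).
# Guarded so the sketch only talks to a database when asked to.
if ($ENV{RUN_LOADER} && eval { require DBI; 1 }) {
    my $dbh = DBI->connect('dbi:Oracle:mydb', 'user', 'pass',
                           { RaiseError => 1, AutoCommit => 0 });
    my $sth = $dbh->prepare(
        'INSERT INTO load_table (col1, col2) VALUES (?, ?)');
    for my $file (glob '/data/incoming/*') {
        open my $fh, '<', $file or next;
        while (my $line = <$fh>) {
            $sth->execute(parse_line($line));
        }
        close $fh;
        $dbh->commit;   # commit per file
        # Move the file aside so the next cron run doesn't reload it.
        rename $file, '/data/processed/' . basename($file);
    }
    $dbh->disconnect;
}
```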

Re: Database table load utility

on 08.11.2010 05:46:20 by Chap Harrison

On Nov 7, 2010, at 6:47 PM, shawn wilson wrote:

> its Unix, I'd use cron as its probably already running. And do something
> like: script.pl $(find -type f -print0) Every few seconds.

That looks plenty straightforward enough, except how does one prevent a
race condition wherein cron launches another instance of script.pl
before the first instance has exclusively grabbed responsibility for
loading the newly-arrived files?

Some googling seems to indicate that cron itself does not provide any
mechanism to guard against spawning multiple concurrent instances of a
cron job. Sorry, I realize that this is not specifically a Perl problem
at this point, so I guess I'll need to research some Unix-y ways of
handling this.


Re: Database table load utility

on 08.11.2010 06:03:16 by Ishwor Gurung

Hi.

On 8 November 2010 15:46, Chap Harrison wrote:
>
> On Nov 7, 2010, at 6:47 PM, shawn wilson wrote:
>
>> its Unix, I'd use cron as its probably already running. And do something
>> like: script.pl $(find -type f -print0) Every few seconds.
>
> That looks plenty straightforward enough, except how does one prevent a
> race condition wherein cron launches another instance of script.pl before
> the first instance has exclusively grabbed responsibility for loading the
> newly-arrived files?

Using a temporary lock file might be one approach: when your script
starts processing, it creates the file, and any subsequent instance
exits if the file already exists. That way no two instances can gobble
up each other's resources. Polling isn't the coolest thing on earth,
but it does the job.
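That lock-file idea can be sketched in a few lines of core Perl. sysopen with O_EXCL makes the create-if-absent check atomic, so two instances can't both think they won; the lock path here is a placeholder:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_WRONLY);

# Try to create the lock file atomically; O_EXCL makes sysopen fail if
# the file already exists, so only one instance can succeed.
sub try_lock {
    my ($lockfile) = @_;
    sysopen(my $fh, $lockfile, O_CREAT | O_EXCL | O_WRONLY) or return 0;
    print {$fh} "$$\n";   # record our PID, handy for debugging
    close $fh;
    return 1;
}

my $lockfile = '/tmp/loader.lock';   # placeholder path
if (try_lock($lockfile)) {
    # ... process the newly arrived files here ...
    unlink $lockfile;                # release the lock when done
}
# else: another instance is already running, so just fall through.
```

One caveat: if the script crashes between creating and unlinking the file, the stale lock blocks all future runs until someone removes it by hand, whereas a kernel-level flock is dropped automatically when the process exits.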

If you want to tackle it another way, check out dnotify/inotify syscalls.

HTH.

[...]
--

Regards
Ishwor Gurung
Key id: 0xa98db35e
Key fingerprint: FBEF 0D69 6DE1 C72B A5A8 35FE 5A9B F3BB 4E5E 17B5


Re: Database table load utility

on 08.11.2010 06:04:21 by Shawn Wilson


Use Perl's flock, then unlink (which, IIRC, is how you delete a file in
Perl), and move to the next one. Or, if you think you'll get too busy,
check for a 'lock' file: if it doesn't exist, create it, run, then
delete your lock file when done.
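A sketch of the flock variant (the lock path is again a placeholder). Taking one exclusive, non-blocking lock for the whole run is the simplest way to keep cron from stacking up instances, and the kernel releases the lock automatically if the process dies, so no stale lock file is left behind:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Open (or create) the lock file and try to take an exclusive lock
# without blocking; returns the handle on success, undef if another
# process already holds the lock.
sub acquire_flock {
    my ($path) = @_;
    open my $fh, '>>', $path or die "can't open $path: $!";
    return flock($fh, LOCK_EX | LOCK_NB) ? $fh : undef;
}

my $lock = acquire_flock('/tmp/loader.flock')   # placeholder path
    or exit 0;   # another instance is running; bow out quietly

# ... process the newly arrived files here ...

close $lock;   # releases the lock; also released automatically on exit
```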

Re: Database table load utility

on 08.11.2010 06:15:48 by Shawn Wilson


Ooh, I didn't know about inotify. It would seem that you can either use it
within Perl (just keep the script running like a service) or from the
command line. I think it really comes down to how often you're expecting
new files. And I don't know how the Perl module works - it's possible you
might have to fork in case another file comes in while you are working.

However, it would seem that that's the tool for the job:
http://www.ibm.com/developerworks/linux/library/l-ubuntu-inotify/index.html

Re: Database table load utility

on 08.11.2010 07:13:05 by Chap Harrison

On Nov 7, 2010, at 11:15 PM, shawn wilson wrote:

> Ooh, I didn't know about inotify.

Yeah, this is much more appealing. Looks like Linux::Inotify2 is
better-supported than Linux::Inotify.

> It would seem that you can either use it
> within perl (just keep the script running like a service) or from a command
> line. I think it really comes down to how often you're expecting new files.
> And I don't know how the perl module works - its possible you might have to
> spawn a fork in case another file comes in while you are working.

It looks like the Inotify2 "streaming interface" allows you to block
until "read" notification(s) come in (my @events = $inotify->read). I
haven't absorbed the details yet, but it sounds like I can read a
notification, handle it, and then loop back and read others that may
have occurred in the meantime -- in a strictly serialized fashion,
without any worries.
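That serialized loop might be sketched as follows. Linux::Inotify2 is a non-core, Linux-only module, and the watched directory is a placeholder:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Linux::Inotify2;

my $inotify = Linux::Inotify2->new
    or die "unable to create inotify object: $!";

# Watch for IN_CLOSE_WRITE rather than IN_CREATE: it fires when the
# uploader finishes writing and closes the file, so we never try to
# load a half-transferred file.
$inotify->watch('/data/incoming', IN_CLOSE_WRITE)
    or die "watch creation failed: $!";

while (1) {
    # read() blocks until at least one event arrives, then returns all
    # pending events, so files that land while we're busy are simply
    # picked up on the next pass -- strictly serialized.
    my @events = $inotify->read;
    for my $event (@events) {
        my $file = $event->fullname;
        # ... load $file into the database here ...
    }
}
```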

Pretty neat. Cheers and thanks!

Chap




Re: Database table load utility

on 08.11.2010 20:39:15 by Parag Kalra


>> I'm in the thinking stages of creating a table-load utility, which reads a
>> tab-separated CSV file and inserts or updates rows in a relational table
>> (probably Oracle). I don't think that will be too hard, having used Perl
>> DBI/DBD modules in the past.

Since you are planning to design your own table-load utility, probably
for Oracle, I would like to share my own experience and add something on
other aspects of your requirement.

1. First preference should always be SQL*Loader (sqlldr), probably in
direct path mode, provided you have the Oracle client installed on the
*nix box.
2. If you can't use sqlldr in direct mode, you can definitely use Perl to
load the data. The challenge here is how optimally you can design your
utility, and the best way to upload the data using Perl DBI/DBD is
DBD::Oracle in array interface mode. Refer to the following link for the
benchmarking results:

http://www.nntp.perl.org/group/perl.dbi.users/2010/05/msg34905.html
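To make the array-interface idea concrete, here is a hedged sketch: the connect string, table, and column names are placeholders, the database step is guarded behind an environment variable so nothing tries to reach Oracle by default, and the TSV-to-columns step stands alone:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn tab-separated records into parallel column arrays, which is the
# shape bind_param_array expects (two columns assumed here).
sub columns_from_tsv {
    my (@lines) = @_;
    my (@col1, @col2);
    for my $line (@lines) {
        chomp $line;
        my ($a, $b) = split /\t/, $line, -1;
        push @col1, $a;
        push @col2, $b;
    }
    return (\@col1, \@col2);
}

# Guarded bulk insert via DBI's array interface: one round trip per
# batch instead of one per row.
if ($ENV{RUN_LOADER} && eval { require DBI; 1 }) {
    my $dbh = DBI->connect('dbi:Oracle:mydb', 'user', 'pass',
                           { RaiseError => 1, AutoCommit => 0 });
    my $sth = $dbh->prepare(
        'INSERT INTO load_table (col1, col2) VALUES (?, ?)');
    my ($c1, $c2) = columns_from_tsv(<STDIN>);
    $sth->bind_param_array(1, $c1);
    $sth->bind_param_array(2, $c2);
    $sth->execute_array({ ArrayTupleStatus => \my @status });
    $dbh->commit;
    $dbh->disconnect;
}
```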


Cheers,
Parag

Re: Database table load utility

on 10.11.2010 05:55:26 by Chandrashekar Bhat


How about DBI? I am not sure about Oracle, but I have used DBI for MySQL
(a similar kind of project); only the connector will be different, I
believe.

Thanks,
Chandrashekar


