To update one file with the another file's data..

on 17.01.2008 21:35:53 by clearguy02

Hi,

I have two text files: each one has four fields (delimited by a
space) on each line: id, group, email and manager_id. First file is a
small file with 50 entries and the second one is a huge file with
5,000 entries. The "id" field is same in both files, but the
manager_id's may be different. By comparing all the entries in the
second file (that has the correct manager id), I need to update the
manager_id field in the first file.

Here is the code I am thinking of:

-----------------------------------------
open (INPUT1,"smallFile.txt") or die "Cannot open the file: $!";
open (INPUT2,"bigFile.txt") or die "Cannot open the file: $!";


while $line1 (<INPUT1>)
{
    @small_arr = split /\s+/, $line1;
}

while $line2 (<INPUT2>)
{
    @big_arr = split /\s+/, $line2;
}


foreach (@small_arr)
{
    if ($small_arr[0] == $big_arr[0] ) # first ID is same in both files
    {
        $small_arr[3] = $big_arr[3];
        print "$_\n";
    }
}

-------------------------------------------------

I know something is certainly wrong here. Can someone tell me, please?

Thanks,
J

Re: To update one file with the another file's data..

on 17.01.2008 22:29:34 by davidfilmer

On Jan 17, 12:35 pm, cleargu...@yahoo.com wrote:
> while $line1 (<INPUT1>)
> {
> @small_arr = split /\s+/, $line1;
> }

You are reading the entire file and assigning the fields of each line
to an array. But the array only holds the fields of a single line.
By the time the while loop is done, the array holds only the fields
for the last line of the file. You have thrown away all the rest of
the file. Same thing for your second loop. That's the main reason
why your program doesn't work.
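
To make the overwriting visible, here is a minimal sketch. The sample lines and the %record_for hash are made up for illustration; they are not from the original files. The first loop repeats the mistake (each pass replaces the array), the second keeps every record by keying it on its id.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines standing in for smallFile.txt.
my @lines = (
    "id1 group1 email1 mgr1",
    "id2 group2 email2 mgr2",
    "id3 group3 email3 mgr3",
);

# Overwriting: each pass replaces the previous contents of @fields.
my @fields;
for my $line (@lines) {
    @fields = split ' ', $line;
}
print "after the loop: @fields\n";    # only the last line's fields remain

# Accumulating: key each record by its id so nothing is lost.
my %record_for;
for my $line (@lines) {
    my ($id, @rest) = split ' ', $line;
    $record_for{$id} = [@rest];
}
print scalar(keys %record_for), " records kept\n";
```

Run as is, this prints "after the loop: id3 group3 email3 mgr3" followed by "3 records kept".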

I would approach it by building a hash of the id/manager values from
the big file and then running through the small file to make the
corrections. Here's an example suitable for a newsgroup posting which
uses a DATA block for the big file and a hardcoded array for the small
file. In reality these would both be files, and you would read the
small file one line at a time with a while loop.

#!/usr/bin/perl
use strict; use warnings;

my @small = (
    "id2 group1 email1 WRONG1",
    "id3 group3 email3 mgr3",
    "id5 group5 email5 WRONG2",
);

# Build an id => manager lookup from the big file (the DATA block here).
my %big_file_mgr;
while (<DATA>) {
    my ($id, $mgr) = (split(/\s+/, $_))[0,3];
    $big_file_mgr{$id} = $mgr;
}

# With a real file this would be a while loop over its filehandle.
foreach (@small) {
    my ($id, $group, $email, $mgr) = (split(/\s+/, $_));
    # Take the manager from the big file whenever the id is known there.
    $mgr = $big_file_mgr{$id} if exists $big_file_mgr{$id};
    print "$id $group $email $mgr\n";
}

__DATA__
id1 group1 email1 mgr1
id2 group2 email2 mgr2
id3 group3 email3 mgr3
id4 group4 email4 mgr4
id5 group5 email5 mgr5
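
For the case where both inputs really are files, the same idea might look like the sketch below. The sub names load_managers and corrected_lines are hypothetical helpers introduced here; the file names are the ones from the original question. It prints the corrected lines rather than rewriting smallFile.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an id => manager_id lookup from the big file.
sub load_managers {
    my ($path) = @_;
    my %mgr_for;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        my ($id, $mgr) = (split ' ', $line)[0, 3];
        next unless defined $mgr;    # skip blank or short lines
        $mgr_for{$id} = $mgr;
    }
    close $fh;
    return \%mgr_for;
}

# Return the small file's lines with manager_id corrected where known.
sub corrected_lines {
    my ($path, $mgr_for) = @_;
    my @out;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        my ($id, $group, $email, $mgr) = split ' ', $line;
        next unless defined $mgr;    # skip blank or short lines
        $mgr = $mgr_for->{$id} if exists $mgr_for->{$id};
        push @out, "$id $group $email $mgr";
    }
    close $fh;
    return @out;
}

# Only runs when both files from the question are present.
if (-e 'smallFile.txt' && -e 'bigFile.txt') {
    print "$_\n" for corrected_lines('smallFile.txt', load_managers('bigFile.txt'));
}
```

Only the big file is held in memory (as a hash); the small file is read one line at a time.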

--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)

Re: To update one file with the another file's data..

on 17.01.2008 23:48:33 by Gunnar Hjalmarsson

clearguy02@yahoo.com wrote:
> I have two text files: each one has four fields (delimited by a
> space) on each line: id, group, email and manager_id. First file is a
> small file with 50 entries and the second one is a huge file with
> 5,000 entries. The "id" field is same in both files, but the
> manager_id's may be different. By comparing all the entries in the
> second file (that has the correct manager id), I need to update the
> manager_id field in the first file.

This approach is similar to the one posted by David, both making use of
a hash for 'bigFile.txt'. I chose to use 'Tie::File' for updating
'smallFile.txt'.

use Tie::File;
my %bighash;

# Put the data of 'bigFile.txt' into %bighash;
open my $BIG, '<', 'bigFile.txt' or die $!;
while ( <$BIG> ) {
    chomp;
    my ( $key, $value ) = split ' ', $_, 2;
    $bighash{ $key } = $value;
}

# Update manager_id of 'smallFile.txt' if needed
tie my @small, 'Tie::File', 'smallFile.txt' or die $!;
foreach my $line ( @small ) {
    my @fields = split ' ', $line;
    # Leave lines alone whose id is missing from 'bigFile.txt'
    next unless @fields and exists $bighash{ $fields[0] };
    my $man_id_big = ( split ' ', $bighash{ $fields[0] } )[2];
    if ( $fields[3] ne $man_id_big ) {
        $line = join ' ', @fields[ 0..2 ], $man_id_big;
    }
}
untie @small or die $!;

__END__

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: To update one file with the another file's data..

on 18.01.2008 01:37:38 by David Filmer

Gunnar Hjalmarsson wrote:

> my ( $key, $value ) = split ' ', $_, 2;

FWIW, if the order of the fields in the input file is the same order
that the OP stipulated then this will not grab the correct fields (the
hash needs the first and fourth fields, not the first two fields).

Re: To update one file with the another file's data..

on 18.01.2008 01:53:22 by Gunnar Hjalmarsson

David Filmer wrote:
> Gunnar Hjalmarsson wrote:
>> my ( $key, $value ) = split ' ', $_, 2;
----------------------------------------^
Please note the LIMIT argument.

> FWIW, if the order of the fields in the input file is the same order
> that the OP stipulated then this will not grab the correct fields (the
> hash needs the first and fourth fields, not the first two fields).

The above does not grab only the first two fields; it assigns the first
field to $key and all the other fields to $value.
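
A small sketch of that behaviour, using a made-up sample record (the variable names other than $key and $value are illustrative only):

```perl
use strict;
use warnings;

my $line = "id1 group1 email1 mgr1";    # made-up sample record

# With LIMIT 2, split stops after the first separator.
my ($key, $value) = split ' ', $line, 2;
print "key:   $key\n";      # id1
print "value: $value\n";    # group1 email1 mgr1

# Without LIMIT, the same two-variable assignment drops the extra fields.
my ($first, $second) = split ' ', $line;
print "second: $second\n";  # group1

# The manager id is then the third field of the remainder.
my $mgr = (split ' ', $value)[2];
print "manager: $mgr\n";    # mgr1
```

The chomp in the reading loop matters with LIMIT: without it, the trailing newline would stay attached to the end of $value.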

In the foreach loop I have:

my $man_id_big = ( split ' ', $bighash{ $fields[0] } )[2];

Please feel free to claim that it's a clumsy solution, but it does
address the OP's problem. :) Furthermore, my solution makes it easier
to adapt the code in case the OP also wants to update the 'group'
and/or 'email' fields.
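
As a rough illustration of such an adaptation: the helper sync_line below is hypothetical (it is not part of the posted code) and assumes the hash values have the form "group email manager_id", as produced by the split with LIMIT 2. The caller chooses which of the trailing fields to copy from the big file.

```perl
use strict;
use warnings;

# %$big maps id => "group email manager_id".
# @$fields_to_sync lists which trailing fields to copy from the big file
# (0 = group, 1 = email, 2 = manager_id).
sub sync_line {
    my ($line, $big, $fields_to_sync) = @_;
    my ($id, @small) = split ' ', $line;
    # Unknown or missing ids are returned unchanged.
    return $line unless defined $id && exists $big->{$id};
    my @correct = split ' ', $big->{$id};
    $small[$_] = $correct[$_] for @$fields_to_sync;
    return join ' ', $id, @small;
}

my %big = ( id2 => 'group2 email2 mgr2' );
print sync_line('id2 group1 email1 WRONG1', \%big, [2]), "\n";
print sync_line('id2 group1 email1 WRONG1', \%big, [0, 2]), "\n";
```

The first call fixes only the manager id (id2 group1 email1 mgr2); the second also takes the group from the big file (id2 group2 email1 mgr2).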

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: To update one file with the another file's data..

on 18.01.2008 02:53:32 by David Filmer

Gunnar Hjalmarsson wrote:
>>> my ( $key, $value ) = split ' ', $_, 2;
> ---------------------------------------^
> Please note the LIMIT argument.

Ah. I had noted it, but had forgotten exactly how it behaved. I was
thinking it split everything and then returned only the first two
splits. Thanks for refreshing my memory on how this limit (which I
seldom use) actually behaves.

Re: To update one file with the another file's data..

on 18.01.2008 07:10:18 by clearguy02

On Jan 17, 5:53 pm, David Filmer wrote:
> Gunnar Hjalmarsson wrote:
> >>> my ( $key, $value ) = split ' ', $_, 2;
> > ---------------------------------------^
> > Please note the LIMIT argument.
>
> Ah. I had noted it, but had forgotten exactly how it behaved. I was
> thinking it split everything and then returned only the first two
> splits. Thanks for refreshing my memory on how this limit (which I
> seldom use) actually behaves.

Thanks to everyone.

J