Comparing file contents, the perl way

on 22.05.2008 04:40:52 by Beast

Good morning,

I have 2 files which contain some IDs. Basically I want to find the IDs
in file A which are missing from file B.

This program is ugly, but it works :-)
------------
use strict;

my $target_file = "B.txt";
while (<>) {
    chomp;
    my $res = `grep $_ $target_file`;
    print "$_ is missing\n" if ! $res;
}
------------

I'm trying to find another solution which is more perlish and efficient.
So far I'm relying on the IDs having the same number of digits in both files,
because "123" will match "1234" :-(

The files contain around 20k - 30k lines, so I must consider
performance and memory usage as well.

A.txt
-----
12345
56789
32134
62134
42134
52134
12234

B.txt
-----
12234
42134
82134
32134

Thanks.

--budhi

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: Comparing file contents, the perl way

on 22.05.2008 06:51:08 by jialin

In your method, you can use grep with an anchored pattern, and let grep stop
after the first match:
grep -e "^$_\$" -m 1 $target_file
This will prevent partial matching.
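
As a rough sketch of the same idea without passing the ID through the shell: the helper below (id_in_file is a hypothetical name, not from the original post) calls grep in list form and asks for a whole-line, fixed-string match. It assumes a grep that supports the standard -q, -x, -F and -e options.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $id occurs as a complete line in $file.
# -F treats the ID as a fixed string, -x requires a whole-line match,
# -q suppresses output (GNU grep also stops at the first match).
sub id_in_file {
    my ($id, $file) = @_;
    # List-form system() bypasses the shell, so the ID is never
    # interpreted as shell syntax or as a regular expression.
    my $status = system('grep', '-q', '-x', '-F', '-e', $id, $file);
    return $status == 0;
}
```

This still starts one grep process per ID, so it only addresses the partial-match problem, not the speed.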

I would prefer to read everything in $target_file into a hash, then compare
$input_file against the hash:

use strict;
use warnings;

my ($input_file, $target_file) = @ARGV;
my %hash;

# Load every ID from the target file (B.txt) into a hash for fast lookup.
open my $fh, "<", $target_file or die "Cannot open $target_file: $!\n";
while (<$fh>) {
    chomp;
    $hash{$_}++;
}
close $fh;

# Report every ID in the input file (A.txt) that the target file lacks.
open $fh, "<", $input_file or die "Cannot open $input_file: $!\n";
while (<$fh>) {
    chomp;
    print "$_ is missing\n" unless exists $hash{$_};
}
close $fh;


Re: Comparing file contents, the perl way

on 22.05.2008 09:52:19 by krahnj

beast wrote:
> Good morning,

Hello,


$ cat A.txt
12345
56789
32134
62134
42134
52134
12234
$ cat B.txt
12234
42134
82134
32134
$ grep -f A.txt -wv B.txt
82134
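
Note that grep -f A.txt -wv B.txt prints the lines of B.txt that match no ID from A.txt; swapping the two file names gives the IDs of A.txt that are missing from B.txt, which is what the original question asked for. A minimal Perl sketch of the same comparison in both directions, using the sample IDs from this thread (missing_from is a hypothetical helper name, not part of any module):

```perl
use strict;
use warnings;

# Hypothetical helper: return the items of @$wanted that do not occur
# in @$have, compared as whole strings and kept in their original order.
sub missing_from {
    my ($wanted, $have) = @_;
    my %seen = map { $_ => 1 } @$have;
    return grep { !$seen{$_} } @$wanted;
}

# Sample IDs copied from A.txt and B.txt in this thread.
my @a = qw(12345 56789 32134 62134 42134 52134 12234);
my @b = qw(12234 42134 82134 32134);

print "In B but not in A: @{[ missing_from(\@b, \@a) ]}\n";   # 82134
print "In A but not in B: @{[ missing_from(\@a, \@b) ]}\n";   # 12345 56789 62134 52134
```

Because the lookup compares whole strings, "123" never matches "1234" here.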

John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall


Re: Comparing file contents, the perl way

on 22.05.2008 12:16:26 by Beast

On Thu, May 22, 2008 at 2:52 PM, John W. Krahn wrote:
> 32134
> $ grep -f A.txt -wv B.txt
> 82134
>

Ouch, I never thought it could be so easy!!
Thanks.

--budhi
