Re: parsing script duplication of lines issue, please advise

on 21.07.2011 13:54:25 by Rob Dixon

On 21/07/2011 12:00, Nathalie Conte wrote:
> HI,
> I want to create a simple script where I am parsing a file and writing
> only the lines where I can find a certain value in a new output file
> this is my Infile format: workable example attached
> I want to keep only the lines where there is a 1 not the ones with -1,
> there are 10 in this example and when I produce my outfile it is 20
> lines long! They are duplicated and I am not sure why, I would
> appreciate any advise. the example infile attached contain 50 and
> produce a outfile of 100...
> 18 3016088 3016288 -1
> 18 3035364 3035564 -1
> 18 3163934 3164134 -1
> 18 3167351 3167551 1
> 18 3176373 3176573 1
> 18 3198845 3199045 -1
> 18 3215936 3216136 1
> 18 3275482 3275682 -1
> 18 3281089 3281289 -1
> 18 3388675 3388875 -1
> 18 3517500 3517700 -1
> 18 3588447 3588647 1
> 18 3667294 3667494 -1
> 18 3746503 3746703 -1
> 18 3771167 3771367 -1
> 18 3779418 3779618 -1
> 18 3916005 3916205 -1
> 18 3933642 3933842 1
> 18 3975635 3975835 1
> 18 3992344 3992544 -1
> 18 4084642 4084842 1
> 18 4127586 4127786 -1
> 18 4149689 4149889 -1
> 18 4158287 4158487 -1
> 18 4189973 4190173 1
> 18 4402882 4403082 -1
> 18 4441582 4441782 1
> 18 4454914 4455114 -1
> 18 4549176 4549376 1
> 18 4557665 4557865 -1
> 18 4557697 4557897 -1
> 18 4600101 4600301 -1
>
>
> ####this is my script
> #!/software/bin/perl
> use warnings;
> use strict;
>
>
>
>
> my $file="./infile.txt";
>
> open( IN , '<' , $file ) or die( $! );
> open(OUT, ">>outfile.txt");
>
>
> while (<IN>){
> my @line=split(/\t/);
> if($line[3]==-1) {
> print OUT $line[0],"\t",$line[1],"\t",$line[2],"\t",$line[3],"\n";
> }
>
> } close OUT; close IN;

As Brian said, the final field is still terminated by "\n" after the
split, and you are adding another one in the print statement, which is
where the extra lines come from. You are also lucky that the string
"-1\n" compares numerically equal to -1, so it would be better to chomp
the record before splitting it.
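
To see both effects in isolation, here is a minimal sketch. The sample
record is taken from the data above; the variable names are only
illustrative and do not come from the original script.

```perl
use strict;
use warnings;

# One record as it is read from the file, newline still attached.
my $record = "18\t3016088\t3016288\t-1\n";

# Without chomp, the last field keeps the newline.
my @raw      = split /\t/, $record;
my $raw_last = $raw[3];                       # "-1\n"

# The numeric comparison still succeeds, because the string
# "-1\n" converts to the number -1.
my $raw_matches = ( $raw_last == -1 ) ? 1 : 0;

# With chomp, the last field is clean.
chomp( my $clean = $record );
my @fields     = split /\t/, $clean;
my $clean_last = $fields[3];                  # "-1"

printf "raw last field has %d characters, chomped has %d\n",
    length $raw_last, length $clean_last;
```

Printing $raw_last followed by "\n" therefore writes two newlines, which
shows up as a blank line after every record in the output file.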

And since all you want is a copy of the input line, why not just write

print OUT "$_\n";

Finally, your code is copying records that have a final field of -1,
which is the opposite of what you describe. Which is it that you mean?

After these changes, the loop looks like this:

while (<IN>){
  chomp;
  my @line=split(/\t/);
  if ($line[3] == -1) {
    print OUT "$_\n";
  }
}
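
For completeness, here is how the whole thing might look as a
self-contained script. This is only a sketch, and several choices in it
are my assumptions rather than anything from the original post: the sub
name filter_records, the $wanted parameter (pass 1 or -1 depending on
which lines you actually want), lexical filehandles, and opening the
output with '>' rather than '>>' so that rerunning the script does not
append to the previous run's output. The demo reads from an in-memory
string so it runs without any input file.

```perl
use strict;
use warnings;

# Copy every record whose 4th tab-separated field equals $wanted.
# filter_records and $wanted are illustrative names, not from the thread.
sub filter_records {
    my ( $in, $out, $wanted ) = @_;
    my $kept = 0;
    while ( my $record = <$in> ) {
        chomp $record;
        my @fields = split /\t/, $record;
        next unless @fields >= 4;             # skip blank or short lines
        if ( $fields[3] == $wanted ) {
            print {$out} "$record\n";
            $kept++;
        }
    }
    return $kept;
}

# Demo on in-memory data. For real files, open them like this instead:
#   open( my $in,  '<', 'infile.txt' )  or die "infile.txt: $!";
#   open( my $out, '>', 'outfile.txt' ) or die "outfile.txt: $!";
my $data = join '',
    "18\t3163934\t3164134\t-1\n",
    "18\t3167351\t3167551\t1\n",
    "18\t3176373\t3176573\t1\n";
my $result = '';
open( my $in,  '<', \$data )   or die "cannot read sample data: $!";
open( my $out, '>', \$result ) or die "cannot write to string: $!";
my $kept = filter_records( $in, $out, 1 );
close $out;
close $in;
print $result;
```

Passing the value to keep as an argument means the 1 versus -1 question
only has to be answered in one place.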

HTH,

Rob

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: parsing script duplication of lines issue, please advise

on 22.07.2011 03:29:55 by jwkrahn

Rob Dixon wrote:
>
> After these changes, the loop looks like this
>
> while (<IN>){
>   chomp;
>   my @line=split(/\t/);
>   if ($line[3] == -1) {
>     print OUT "$_\n";
>   }
> }

You can make it much simpler than that:

while ( <IN> ) {
  print OUT if /\t-1$/;
}
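
This works without chomp because $ matches either at the very end of the
string or just before a final newline, and print with no arguments
writes $_ with its original newline intact. A small sketch of both
variants follows; the array names are made up for the example, and the
two records are from the posted data.

```perl
use strict;
use warnings;

my @records = (
    "18\t3163934\t3164134\t-1\n",
    "18\t3167351\t3167551\t1\n",
);

# $ matches at the end of the string or just before a final newline,
# so the records can be tested without chomp.
my @minus_one = grep { /\t-1$/ } @records;

# Variant for the lines ending in 1: the \t directly before the 1
# means a last field of -1 cannot match.
my @plus_one = grep { /\t1$/ } @records;

print "minus one: $_" for @minus_one;
print "plus one:  $_" for @plus_one;
```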




John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction. -- Albert Einstein
