need some help here
on 08.02.2006 01:34:28 by Sliver
Hi,
Im looking to see if someone can help me with this problem. i'm new to
perl so its been a bit of a bother figuring this out.
I have the following data in a log file:
barcode count1 count2 count3
Now i want to go through this log file line by line and add all the
corresponding counts when a barcode repeats.
so for example:
9999 0 0 1
9999 1 0 0
9999 0 1 0
should give me : 9999 1 1 1 in a new file.
I tried doing this by taking a line from the file and comparing with
every other line of the same file (done using a nested split function:
basically a foreach, then a split, then a foreach and then a split
again) .if i find a match, i would add the counts and write it to a new
file.
the barcode is a string btw...
and it seems to be writing SCALAR[someaddress] to the new file instead of
the barcode... and all the counts are showing up as 0 instead of being
added.
Is this a problem with scope? all the variables were declared in the
beginning using my.
I think i've pretty much included all the info that i can. If someone
can figure this out for me plz let me know. Example code will be much
appreciated.. and i will bless ur kids.. and their kids... and their
kids... and......
Thanks a lot
-Varun
Re: need some help here
on 08.02.2006 03:40:30 by Matt Garrish
"Sliver" wrote in message
news:1139358868.556278.286780@g47g2000cwa.googlegroups.com...
> Hi,
> Im looking to see if someone can help me with this problem. i'm new to
> perl so its been a bit of a bother figuring this out.
> I have the following data in a log file:
>
> barcode count1 count2 count3
>
> Now i want to go through this log file line by line and add all the
> corresponding counts when a barcode repeats.
>
> so for example:
>
> 9999 0 0 1
> 9999 1 0 0
> 9999 0 1 0
>
> should give me : 9999 1 1 1 in a new file.
>
> I tried doing this by taking a line from the file and comparing with
> every other line of the same file (done using a nested split function:
> basically a foreach, then a split, then a foreach and then a split
> again) .if i find a match, i would add the counts and write it to a new
> file.
> the barcode is a string btw...
> and it seems to be writing SCALAR[someaddress] to the new file instead of
> the barcode... and all the counts are showing up as 0 instead of being
> added.
> Is this a problem with scope? all the variables were declared in the
> beginning using my.
> I think i've pretty much included all the info that i can.
And yet you've included nothing particularly useful. Where is the code that
demonstrates this problem? If you don't provide a short and complete script
that demonstrates your problem, you're probably not going to get much in the
way of help.
If you're really getting what you say, though, then you are trying to print
a reference instead of a value. Try dereferencing the variable before
printing it.
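A minimal sketch of what that symptom looks like, assuming the barcode somehow ended up stored as a reference rather than a plain value. The variable names are made up for illustration, since the original code was never posted:

```perl
use strict;
use warnings;

my $barcode = '9999';
my $ref     = \$barcode;    # a reference to the scalar, not the value itself

# Stringifying a reference yields something like SCALAR(0x55d0c8a1b2f0)
my $as_printed = "$ref";

# Dereferencing with $$ gets the original value back
my $value = $$ref;

print "reference prints as: $as_printed\n";
print "dereferenced value:  $value\n";
```

If output like the first line shows up in the new file, a backslash or a reference-returning expression is probably being assigned somewhere upstream.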
Matt
Re: need some help here
on 08.02.2006 04:06:35 by Sliver
hey sorry about not including the code, its just there are some
security issues and the code is not accessible to me outside... i
thought i described the problem well enough though .. let me know what
is missing..... will check the dereferencing of variables in the
meantime... thanks
Re: need some help here
on 08.02.2006 21:40:12 by Sliver
ok .... can someone tell me why this code doesnt work.. how do i
preserve scope and value in perl, do i use local, my or our???
suppose i had this code
my $barcode;
my $cnt;
my $b;
my $c;
foreach $text ()
{
($bar, $cnt) = split(/\t/, $text);
foreach $new_text()
{
($b, $c) = split(/\t/, $new_text);
if($bar eq $b)
{
$c += $cnt;
}
}
}
Re: need some help here
on 08.02.2006 23:18:45 by Matt Garrish
"Sliver" wrote in message
news:1139431212.089819.25810@f14g2000cwb.googlegroups.com...
> ok .... can someone tell me why this code doesnt work.. how do i
> preserve scope and value in perl, do i use local, my or our???
> suppose i had this code
>
> my $barcode;
> my $cnt;
> my $b;
> my $c;
>
> foreach $text ()
If you're going to scope your variables, turn on strictures. The above
should be:
foreach my $text ()
This also shows that you aren't using the three-argument open. Standard
practice these days is to write:
open(my $infile, '<', 'somefile.txt') or die "Could not open somefile.txt: $!";
so that you can avoid the barewords (and for other benefits that you can
read up on in perlopentut under indirect filehandles).
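As a rough, self-contained sketch of that idiom: the temporary file below is only an assumption so the snippet can run anywhere (it stands in for the real log file), and the tab-separated sample lines are invented.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway input file so the sketch is self-contained
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "9999\t0\t0\t1\n9999\t1\t0\t0\n";
close($tmp_fh) or die "Could not close $tmp_name: $!";

# Three-argument open with a lexical filehandle instead of a bareword
open(my $infile, '<', $tmp_name) or die "Could not open $tmp_name: $!";

my @lines;
while (my $line = <$infile>) {
    chomp $line;            # drop the trailing newline
    push @lines, $line;
}
close($infile);

print scalar(@lines), " lines read\n";
```

The lexical `$infile` goes out of scope like any other `my` variable, which is one of the benefits perlopentut describes.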
> {
> ($bar, $cnt) = split(/\t/, $text);
> foreach $new_text()
> {
> ($b, $c) = split(/\t/, $new_text);
Using one-letter variable names is bad to begin with (not to mention that $a
and $b are special), but if $c is your count, why are you assigning to it
here? The most that $c could ever be is whatever the last matching value is
plus whatever $cnt happens to be when you assign to it again in the next
statement. Every time through the loop you clobber whatever value $c had the
previous time.
> if($bar eq $b)
> {
> $c += $cnt;
> }
> }
> }
Try using one unique variable for the count and lexically scope all the
variables for each iteration through the loop. For example:
[untested]
foreach my $text (<$oldfile>) {
    my $cnt = 0;
    my ($oldbar, $oldcnt) = split(/\t/, $text);
    foreach my $new_text (<$newfile>) {
        my ($newbar, $newcnt) = split(/\t/, $new_text);
        $cnt += $newcnt if $newbar eq $oldbar;
    }
    print "The count for $oldbar was: $cnt\n";
}
But you may need to make some modifications, because I'm just guessing what
you really intended to do with all the $cnt and $c clobbering that was going
on in your original code. I also don't know how you expect this to work
based on your original post. Are there spaces between the numbers in the
field you're splitting on? If so, you can't just add the value without first
compacting that space (i.e., perl will treat them as strings, so you'll
probably always wind up adding 1).
Matt
Re: need some help here
on 09.02.2006 00:34:04 by Sliver
Hi matt,
Thanks for the help. I know my posts have been a bit confusing but what
you have told me has been a great help.
a few questions.......
In the above code, the only catch is that my new_file is initially
empty. When i find a barcode, I put it in new_file from old_file. When
i find a barcode that has been previously seen, i just need to add the
counts without creating a new entry in new_file. Does that make things
clearer?
Could i use getline?
>Are there spaces between the numbers in the
>field you're splitting on? If so, you can't just add the value without first
>compacting that space (i.e., perl will treat them as strings, so you'll
>probably always wind up adding 1).
all values are separated by tabs.... so just splitting by tab is not
enough??? are u telling me that cnt and c actually hold string
values...? is there a way to convert them to integers?
I'm gonna look through my code again and do what you suggested. Once i
get it a bit neat and tidy , ill post it again.... have a look ....
thanks a lot matt.....
Re: need some help here
on 09.02.2006 02:13:50 by Matt Garrish
"Sliver" wrote in message
news:1139441644.098140.88220@z14g2000cwz.googlegroups.com...
> Hi matt,
> Thanks for the help. I know my posts have been a bit confusing but what
> you have told me has been a great help.
>
> a few questions.......
>
> In the above code, the only catch is that my new_file is initially
> empty. When i find a barcode, I put it in new_file from old_file. When
> i find a barcode that has been previously seen, i just need to add the
> counts without creating a new entry in new_file. Does that make things
> clearer?
>
Yes, but it's wasteful. You should be storing your values in a hash until
you finish processing the file. That would save you having to reread the
data from file.
>
>>Are there spaces between the numbers in the
>>field you're splitting on? If so, you can't just add the value without
>>first
>>compacting that space (i.e., perl will treat them as strings, so you'll
>>probably always wind up adding 1).
>
> all values are separated by tabs.... so just splitting by tab is not
> enough??? are u telling me that cnt and c actually hold string
> values...? is there a way to convert them to integers?
>
Perl does type conversion for you.
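A small sketch of what that conversion means in practice. The sample values are invented; the last case shows why a field that still contains spaces would only ever contribute its leading number:

```perl
use strict;
use warnings;

# Strings that look like numbers are converted when used in arithmetic
my $sum = "1" + "2";          # 3

# split returns strings, but += still adds them numerically
my ($barcode, @counts) = split(/\t/, "9999\t0\t1\t0");
my $total = 0;
$total += $_ for @counts;     # 1

# A string like '1 0 0' is only numeric up to the first space, so it
# counts as 1 (and triggers an "isn't numeric" warning under warnings)
my $partial;
{
    no warnings 'numeric';
    $partial = '1 0 0' + 0;   # 1
}

print "$sum $total $partial\n";
```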
You're splitting on a tab, but only taking the first two values. I don't
know that much about barcodes, but from your original example you said you
had
barcode 1 0 0
barcode 0 1 0
barcode 0 0 1
And you wanted "barcode 1 1 1". To me, the easiest way to get this value
would be to add each of the three fields (i.e., 100 + 10 + 1). If all of
those values are separated by tabs, however, when you write for example:
my ($barcode, $oldcnt) = split(/\t/, $text);
You would wind up with $barcode set to 'barcode' and $oldcnt set to 1 for
the first line (the two 0s would be discarded because you don't keep their
values). I wasn't sure if the counts were separated by spaces, so that you'd
now have '1 0 0' in $oldcnt (which is what I was getting at when I said I
suspected you would wind up with a string). If they're all tab-delimited,
then you can make your life really simple by using an array to get all the
values:
my @chunk = split(/\t/, $text);
You now know $chunk[0] is the barcode and 1..$#chunk contains the remaining
parts. Using a hash as I mentioned above, you could then add them like so:
for my $i (1..$#chunk) {
    $barcodelist{$chunk[0]}[$i] += $chunk[$i];
}
You'd just have to remember that your array starts at 1 when you dump the
data back out. See below for an example which splits on spaces instead of
tabs.
Matt
----
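use strict;
use warnings;

my %barcodelist;

# Read the sample lines from the __DATA__ section below
foreach my $text (<DATA>) {
    chomp $text;    # drop the trailing newline before splitting
    my @chunk = split(/ /, $text);
    # $chunk[0] is the barcode; add each count to its running total
    for my $i (1..$#chunk) {
        $barcodelist{$chunk[0]}[$i] += $chunk[$i];
    }
}

foreach my $barcode (sort keys %barcodelist) {
    print $barcode;
    # the totals start at index 1 because index 0 was the barcode
    for my $i (1..$#{$barcodelist{$barcode}}) {
        print " $barcodelist{$barcode}[$i]";
    }
    print "\n";
}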
__DATA__
9999 1 0 0
9999 0 1 0
9999 0 0 1
Re: need some help here
on 11.02.2006 10:29:48 by Joe Smith
Sliver wrote:
> foreach $text () {
> ($bar, $cnt) = split(/\t/, $text);
> foreach $new_text() {
That won't work at all.
The first line reads in all the lines of the file all at once
then processes them one at a time. It is better to read the
lines one at a time instead:
while (my $text = ) {
...
}
The third line quoted above reads in the entire contents
of the new file, leaving it in an end-of-file condition,
on the first time through. When you get around to the second
line of the old file, there is nothing to read into $new_text.
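The exhaustion effect can be sketched with an in-memory filehandle. The sample data and variable names below are assumptions for illustration, not the poster's actual files:

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for the real file
my $data = "9999\t1\n8888\t2\n";
open(my $fh, '<', \$data) or die "Could not open in-memory file: $!";

my @first  = <$fh>;    # list context reads every remaining line
my @second = <$fh>;    # nothing left: the handle is at end-of-file

# Rewinding with seek makes the lines available again
seek($fh, 0, 0) or die "Could not rewind: $!";
my @third = <$fh>;

printf "first: %d, second: %d, third: %d\n",
    scalar(@first), scalar(@second), scalar(@third);
```

Rewinding on every outer iteration would make the nested loops run, but it rereads the whole file once per line, which is the cost a hash avoids.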
> I tried doing this by taking a line from the file and comparing with
> every other line of the same file
The code you posted does not match that description, but no matter.
You should not be "comparing with every line" - that's what a hash
(also known as an associative array) is for.
Have you even tried using a hash yet? That's the way to go.
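A rough sketch of the hash approach, combining the line-at-a-time loop with per-barcode totals. It assumes tab-separated fields as described earlier in the thread; `sum_counts` is a hypothetical helper name, and the in-memory sample data stands in for the real log file:

```perl
use strict;
use warnings;

# Hypothetical helper: sums the count columns per barcode from an open
# filehandle of tab-separated lines, returning a hash of array refs.
sub sum_counts {
    my ($in) = @_;
    my %totals;
    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;    # skip blank lines
        my ($barcode, @counts) = split(/\t/, $line);
        $totals{$barcode}[$_] += $counts[$_] for 0 .. $#counts;
    }
    return \%totals;
}

# In-memory sample data in place of the real log file
my $log = "9999\t0\t0\t1\n9999\t1\t0\t0\n9999\t0\t1\t0\n";
open(my $in, '<', \$log) or die "Could not open sample data: $!";
my $totals = sum_counts($in);
close($in);

# One output line per barcode, however often it appeared in the input
for my $barcode (sort keys %$totals) {
    print join("\t", $barcode, @{ $totals->{$barcode} }), "\n";
}
```

Each input line is read exactly once, and a repeated barcode simply lands on the same hash key, so no line-by-line comparison is needed.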
-Joe