putting file columns into arrays
on 21.05.2011 06:32:44 by Eric Mooshagian
Dear All,
I would like a subroutine that will allow me to easily put columns of a
tab-delimited file into their own arrays.
I've been calling the following repeatedly for each column:
my @array1 = getcolvals($filehandle, 0);
my @array2 = getcolvals($filehandle, 1); ...etc.
sub getcolvals {
    @_ and not @_ % 2 or die "Incorrect number of arguments to getcolvals!\n";
    my $myfile = shift;
    my $mycol = shift;

    my @column = ();

    while (<$myfile>) {
        my ($field) = (split /\s/, $_)[$mycol];
        push @column, $field;
    }
    return @column;
}
This accomplishes exactly what I want, but it requires going through the
whole file for each column extraction, which seems inefficient. Also, I
want to know if I can modify the subroutine to return all the (arbitrary
number of) columns at once into arrays. Any suggestions?
Many thanks,
Eric
--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/
Re: putting file columns into arrays
on 21.05.2011 07:10:25 by Uri Guttman
>>>>> "EM" == Eric Mooshagian writes:
EM> I would like a subroutine that will allow me to easily put columns
EM> of a tab delimited file into their own arrays.
EM> I've been calling the following repeatedly for each column:
EM> my @array1 = getcolvals($filehandle, 0);
EM> my @array2 = getcolvals($filehandle, 1); ...etc.
whenever you think you need to name things with numeric parts, you
usually need an array. since you want arrays, then you really want an
array of arrays.
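as a rough sketch of that idea (the sample rows and the names @rows and @columns are made up for illustration, they are not from the original post), one array of arrays can replace @array1, @array2 and so on:

```perl
use strict;
use warnings;

# Hypothetical sample data: three rows of a two-column table.
my @rows = ( [ 'a', 1 ], [ 'b', 2 ], [ 'c', 3 ] );

# One array of arrays instead of @array1, @array2, ...
# $columns[$i] holds a reference to the array of values for column $i.
my @columns;
for my $row (@rows) {
    for my $i ( 0 .. $#{$row} ) {
        push @{ $columns[$i] }, $row->[$i];
    }
}

print "column 0: @{ $columns[0] }\n";    # column 0: a b c
print "column 1: @{ $columns[1] }\n";    # column 1: 1 2 3
```

the column number becomes an index, so the code does not change when the file gains more columns.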
EM> sub getcolvals {
EM> @_ and not @_ % 2 or die "Incorrect number of arguments to getcolvals!\n";
that is sort of clunky. why not just check @_ == 2?
@_ == 2 or die ...
EM> my $myfile = shift;
EM> my $mycol = shift;
it is usually better to assign from @_. i posted not too long ago several
reasons why. check the archives for it.
my( $myfile, $mycol ) = @_ ;
and in this case you won't need a $mycol since the code will load all
the columns into arrays.
EM> my @column = ();
you don't need to initialize my arrays to () as my does that for you.
EM> while (<$myfile>) {
this will fail unless you reopen the file each time you call the sub or
you seek to the beginning of the file.
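a minimal sketch of the seek approach, assuming the handle is seekable (the in-memory sample data and the names $data, @first and @second are hypothetical, used only so the example runs on its own):

```perl
use strict;
use warnings;

# Hypothetical in-memory "file" so the sketch runs without a real file on disk.
my $data = "a\t1\nb\t2\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my ( @first, @second );

# First pass: reads to end of file.
while ( my $line = <$fh> ) {
    chomp $line;
    push @first, ( split /\t/, $line )[0];
}

# Rewind. Without this, the second loop would see no lines at all,
# because the handle is still positioned at end of file.
seek $fh, 0, 0 or die "seek failed: $!";

# Second pass: now starts from the top again.
while ( my $line = <$fh> ) {
    chomp $line;
    push @second, ( split /\t/, $line )[1];
}

print "first: @first\n";      # first: a b
print "second: @second\n";    # second: 1 2
```

this still reads the file once per column, so it only fixes the empty-result problem, not the inefficiency.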
EM> my ($field) = (split /\s/, $_)[$mycol];
since you are slicing the split and getting one value, you don't need
the () around $field.
EM> push @column, $field;
and you can combine both of those lines into one:
push @column, (split /\s/, $_)[$mycol] ;
EM> }
EM> return @column;
EM> }
this is untested:
# this is a faster and easier way to get lines from a file
use File::Slurp ;
sub load_columns {
    my( $file_name ) = @_ ;
    $file_name or die 'load_columns: missing file name' ;
    my @lines = read_file $file_name ;
    # $matrix will hold a reference to an array of arrays, one inner array per column
    my $matrix ;
    foreach my $line ( @lines ) {
        my @fields = split ' ', $line ;
        for my $i ( 0 .. $#fields ) {
            # build up the array of arrays here. each array gets the next field value
            push( @{ $matrix->[$i] }, $fields[$i] ) ;
        }
    }
    return $matrix ;
}
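for comparison, here is a sketch of the same idea using only core perl. it is a variant, not the code above: the name load_columns_fh is hypothetical, it takes an already open filehandle (as the original getcolvals did) instead of a file name, and it assumes strictly tab-delimited input, so it splits on tabs rather than on any whitespace. the sample data is made up.

```perl
use strict;
use warnings;

# Hypothetical variant of load_columns: takes an open filehandle and
# returns a reference to an array of arrays, one inner array per column.
sub load_columns_fh {
    my ($fh) = @_;
    $fh or die 'load_columns_fh: missing filehandle';

    my @matrix;
    while ( my $line = <$fh> ) {
        chomp $line;

        # A limit of -1 keeps trailing empty fields instead of dropping them.
        my @fields = split /\t/, $line, -1;
        push @{ $matrix[$_] }, $fields[$_] for 0 .. $#fields;
    }
    return \@matrix;
}

# Usage with a hypothetical in-memory sample instead of a real file.
my $sample = "x\t1\tred\ny\t2\tblue\n";
open my $in, '<', \$sample or die "cannot open sample: $!";

my $cols = load_columns_fh($in);
my ( $names, $numbers, $colours ) = @$cols;

print "second column: @$numbers\n";    # second column: 1 2
```

the file is read once, and every column is available afterwards as $cols->[0], $cols->[1] and so on.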
for more on references and perl data structures read:
perlreftut
perllol
perldsc
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------