Adding file contents into hashes
on 08.06.2011 03:17:13 by Aravind Venkatesan
Hi,
This is a snippet of the data:
ENTRY       K00001                      KO
NAME        E1.1.1.1, adh
DEFINITION  alcohol dehydrogenase [EC:1.1.1.1]
PATHWAY     ko00010  Glycolysis / Gluconeogenesis
            ko00071  Fatty acid metabolism
            ko00350  Tyrosine metabolism
            ko00625  Chloroalkane and chloroalkene degradation
            ko00626  Naphthalene degradation
            ko00830  Retinol metabolism
            ko00980  Metabolism of xenobiotics by cytochrome P450
            ko00982  Drug metabolism - cytochrome P450
///
ENTRY       K14865                      KO
NAME        U14snoRNA, snR128
DEFINITION  U14 small nucleolar RNA
CLASS       Genetic Information Processing; Translation; Ribosome
            Biogenesis [BR:ko03009]
///
I am trying to store this in the following data structure by splitting
the file on "///", so that each record becomes a hash keyed by its ENTRY
number, with all the other fields stored under that key:
$VAR1 = {
    'K00001' => {
        'NAME' => [
            'E1.1.1.1',
            'adh'
        ],
        'DEFINITION' => 'alcohol dehydrogenase [EC:1.1.1.1]',
        'PATHWAY' => {
            'ko00010' => 'Glycolysis / Gluconeogenesis',
            'ko00071' => 'Fatty acid metabolism'
        }
    }
};
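For illustration, a structure of this shape could be built by hand and read back as in the sketch below. The variable names ($kegg, $entry and so on) are made up for this example; only the keys and values come from the data above.

```perl
use strict;
use warnings;

# Hand-built copy of the target structure (values taken from the snippet above).
my $kegg = {
    K00001 => {
        NAME       => [ 'E1.1.1.1', 'adh' ],
        DEFINITION => 'alcohol dehydrogenase [EC:1.1.1.1]',
        PATHWAY    => {
            ko00010 => 'Glycolysis / Gluconeogenesis',
            ko00071 => 'Fatty acid metabolism',
        },
    },
};

my $entry       = $kegg->{K00001};                    # one record (hash ref)
my $second_name = $entry->{NAME}[1];                  # element of the NAME array ref
my $pathway     = $entry->{PATHWAY}{ko00010};         # lookup in the nested hash
my @pathway_ids = sort keys %{ $entry->{PATHWAY} };   # all pathway IDs for the entry

print "$second_name\n";
print "$pathway\n";
print "@pathway_ids\n";
```

NAME is an array reference because an entry can carry several names, while PATHWAY is a hash reference so that each pathway ID maps to its description.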
I have started off with the following code:
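use strict;
use warnings;
use Carp qw(croak);
use Data::Dumper;

sub parse {
    my $kegg_file_path = shift;
    my %keggData;
    # 'or' binds looser than '||', so croak fires when open itself fails
    open my $fh, '<', $kegg_file_path
        or croak("Cannot open file '$kegg_file_path': $!");
    # Slurp the whole file by locally undefining the input record separator
    my $contents = do { local $/; <$fh> };
    close $fh;
    # Each record ends with a line that starts with "///"
    my @records = split m{^///\s*}m, $contents;
    foreach my $record (@records) {
        # /m lets ^ match at the start of any line within the record
        if ($record =~ /^ENTRY\s+(\S+)/m) {
            # Key the hash by the entry ID instead of overwriting one 'ENTRY' slot
            $keggData{$1} = {};
        }
    }
    # Pass a reference so Dumper prints a single nested structure
    print Dumper(\%keggData);
}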
But I am not sure how to proceed further and get to the data structure
described above. I am new to Perl and trying to learn ways of parsing
files, so any help would be much appreciated.
Thanks,
Aravind
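One possible way to get from the split records to the structure described above, as a minimal sketch rather than a definitive parser. It assumes that field names start in the first column, that continuation lines are indented (as in the original KEGG flat file), that NAME is a comma-separated list, and that wrapped lines of other fields can be joined with a single space. The function name parse_kegg_text is hypothetical.

```perl
use strict;
use warnings;

# Turn KEGG-style flat-file text into a hash ref keyed by ENTRY id.
# Assumption: continuation lines begin with whitespace.
sub parse_kegg_text {
    my ($contents) = @_;
    my %kegg;

    # Records are terminated by a line that starts with "///".
    for my $record (split m{^///[ \t]*\n?}m, $contents) {
        next unless $record =~ /\S/;    # skip blank leftovers

        # First pass: collect [field, value] pairs; an indented line
        # belongs to the most recently seen field.
        my ($field, @pairs);
        for my $line (split /\n/, $record) {
            if ($line =~ /^([A-Z_]+)\s+(.*\S)/) {
                $field = $1;
                push @pairs, [ $field, $2 ];
            }
            elsif (defined $field && $line =~ /^\s+(.*\S)/) {
                push @pairs, [ $field, $1 ];
            }
        }

        # Second pass: store each value in the shape wanted for its field.
        my ($id, %entry);
        for my $pair (@pairs) {
            my ($name, $value) = @$pair;
            if ($name eq 'ENTRY') {
                ($id) = split ' ', $value;            # first token is the id
            }
            elsif ($name eq 'NAME') {
                push @{ $entry{NAME} }, split /\s*,\s*/, $value;
            }
            elsif ($name eq 'PATHWAY') {
                my ($ko, $desc) = split ' ', $value, 2;
                $entry{PATHWAY}{$ko} = $desc;
            }
            else {
                # Any other field: plain string, wrapped lines joined by a space.
                $entry{$name} = defined $entry{$name}
                    ? "$entry{$name} $value"
                    : $value;
            }
        }
        $kegg{$id} = \%entry if defined $id;
    }
    return \%kegg;
}
```

Called on the slurped file contents, this returns a hash reference that can be passed straight to Dumper. Collecting the field/value pairs first keeps the handling of wrapped lines separate from the decision about how each field is stored.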