How can I read and parse chunks of data into a Perl hash of arrays?

Posted by neversaint on Stack Overflow
Published on 2010-04-16T06:19:54Z

I have data that looks like this:

#info
#info2

1:SRX004541
Submitter: UT-MGS, UT-MGS
Study: Glossina morsitans transcript sequencing project(SRP000741)
Sample: Glossina morsitans(SRS002835)
Instrument: Illumina Genome Analyzer
Total: 1 run, 8.3M spots, 299.9M bases
Run #1: SRR016086, 8330172 spots, 299886192 bases

2:SRX004540
Submitter: UT-MGS
Study: Anopheles stephensi transcript sequencing project(SRP000747)
Sample: Anopheles stephensi(SRS002864)
Instrument: Solexa 1G Genome Analyzer
Total: 1 run, 8.4M spots, 401M bases
Run #1: SRR017875, 8354743 spots, 401027664 bases

3:SRX002521
Submitter: UT-MGS
Study: Massive transcriptional start site mapping of human cells under hypoxic conditions.(SRP000403)
Sample: Human DLD-1  tissue culture cell line(SRS001843)
Instrument: Solexa 1G Genome Analyzer
Total: 6 runs, 27.1M spots, 977M bases
Run #1: SRR013356, 4801519 spots, 172854684 bases
Run #2: SRR013357, 3603355 spots, 129720780 bases
Run #3: SRR013358, 3459692 spots, 124548912 bases
Run #4: SRR013360, 5219342 spots, 187896312 bases
Run #5: SRR013361, 5140152 spots, 185045472 bases
Run #6: SRR013370, 4916054 spots, 176977944 bases

What I want to do is create a hash of arrays, with the ID from the first line of each chunk (the part after the colon) as the key and the SRR accession from each line matching "^Run" as the array's members:

$VAR = {
     'SRX004541' => ['SRR016086'], 
     # etc
}

But my construct below doesn't work, and I don't see why. There must also be a better way to do it.

use Data::Dumper;
my %bighash;
my $head = "";
my @temp = ();

while ( <> ) {
    chomp;
    next if (/^\#/);

    if ( /^\d{1,2}:(\w+)/ ) {
        print "$1\n";
        $head = $1;
    }
    elsif (/^Run \#\d+: (\w+),.*/) {
        print "\t$1\n";
        push @temp, $1;
    }
    elsif (/^$/) {
        push @{$bighash{$head}}, [@temp];
        @temp = ();
    }
}

print Dumper \%bighash;
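
A possible explanation, offered as a sketch rather than a definitive answer: `push @{$bighash{$head}}, [@temp]` pushes a reference to a copy of `@temp`, so each value ends up as an array of arrays instead of a flat list. The blank line after the `#info` header also fires the `/^$/` branch while `$head` is still empty, which creates a spurious `''` key, and the last chunk is lost if the input does not end with a blank line. One way around all three is to push each run straight onto the current key and drop the temporary array. The names `parse_runs`, `%runs_for`, and `$sample` are hypothetical, and the sample is abbreviated from the data above.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: reads chunked records from an open filehandle and
# returns a reference to a hash of arrays (chunk ID => list of run IDs).
sub parse_runs {
    my ($fh) = @_;
    my %runs_for;
    my $head;    # ID of the chunk currently being read

    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^\#/;    # skip header comments

        if ( $line =~ /^\d+:(\w+)/ ) {
            # First line of a chunk: remember the ID and start an empty list
            $head = $1;
            $runs_for{$head} = [];
        }
        elsif ( defined $head && $line =~ /^Run \#\d+: (\w+),/ ) {
            # Push the run ID directly; no temporary array, no blank-line flush
            push @{ $runs_for{$head} }, $1;
        }
    }
    return \%runs_for;
}

# Abbreviated sample, read through an in-memory filehandle
my $sample = <<'END';
#info
#info2

1:SRX004541
Submitter: UT-MGS, UT-MGS
Total: 1 run, 8.3M spots, 299.9M bases
Run #1: SRR016086, 8330172 spots, 299886192 bases

3:SRX002521
Submitter: UT-MGS
Total: 6 runs, 27.1M spots, 977M bases
Run #1: SRR013356, 4801519 spots, 172854684 bases
Run #2: SRR013357, 3603355 spots, 129720780 bases
END

open my $fh, '<', \$sample or die "Cannot open in-memory filehandle: $!";
print Dumper( parse_runs($fh) );
close $fh;
```

To read from files named on the command line, as the original loop does, passing `\*ARGV` as the filehandle should work in place of the in-memory handle. Because nothing depends on a blank separator line any more, the final chunk is kept even when the file ends without one.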

© Stack Overflow or respective owner
