Perl: Deleting multiple recurring lines where a certain criterion is met

Posted by george-lule on Stack Overflow
Published on 2010-04-21T18:13:39Z Indexed on 2010/04/21 19:43 UTC

Dear all, I have data that looks like the sample below. The actual file is thousands of lines long.

 Event_time                 Cease_time                
 Object_of_reference                                                                                                                                                                                                                                             
 -------------------------- --------------------------
 ----------------------------------------------------------------------------------                       

    Apr  5 2010  5:54PM                       NULL
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSJN1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=LUGALAMBO_900                                                                                                                                             
    Apr  5 2010  5:55PM        Apr  5 2010  6:43PM
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSJN1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=LUGALAMBO_900                                                                                                                                             
    Apr  5 2010  5:58PM                       NULL
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSCC1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=BULAGA                                                                                                                                                    
    Apr  5 2010  5:58PM        Apr  5 2010  6:01PM
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSCC1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=BULAGA                                                                                                                                                    
    Apr  5 2010  6:01PM                       NULL
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSCC1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=BULAGA                                                                                                                                                    
    Apr  5 2010  6:03PM                       NULL
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSJN1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=KAPKWAI_900                                                                                                                                               
    Apr  5 2010  6:03PM        Apr  5 2010  6:04PM
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSJN1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=KAPKWAI_900                                                                                                                                               
    Apr  5 2010  6:04PM                       NULL
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSJN1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=KAPKWAI_900                                                                                                                                               
    Apr  5 2010  6:03PM        Apr  5 2010  6:03PM
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSCC1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=BULAGA                                                                                                                                                    
    Apr  5 2010  6:03PM                       NULL
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSCC1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=BULAGA                                                                                                                                                    
    Apr  5 2010  6:03PM        Apr  5 2010  7:01PM
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSCC1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=BULAGA             

As you can see, each file has a header that describes what the various fields stand for (event start time, event cease time, affected element). The header is followed by a number of dashes. My issue is that the data contains a number of entries where the cease time is NULL, i.e. the event is still active. All such entries must go: for each element where the alarm cease time is NULL, the start time, the cease time (in this case NULL) and the element itself must be deleted from the file. In the remaining data, all the text from the word SubNetwork up to and including BtsSiteMgr= must also go, along with the headers and the dashes.
The final output should look like this:

    Apr  5 2010  5:55PM        Apr  5 2010  6:43PM
 LUGALAMBO_900                                                                                                                                                                                                                                                                                        
    Apr  5 2010  5:58PM        Apr  5 2010  6:01PM
 BULAGA                                                                                                                                                                                                                                                                                                                                                                            
    Apr  5 2010  6:03PM        Apr  5 2010  6:04PM
 KAPKWAI_900                                                                                                                                                                                                                                                                                       
    Apr  5 2010  6:03PM        Apr  5 2010  6:03PM
 BULAGA                                                                                                                                                                                                               
    Apr  5 2010  6:03PM        Apr  5 2010  7:01PM
 BULAGA                                                                                       
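Read as a whole-file transformation, the requirement above amounts to keeping only the records whose first line carries two timestamps, and from each of those keeping only the name after BtsSiteMgr=. The following is a minimal sketch of that idea, not a tested solution for the real file. The helper name closed_alarms is hypothetical, and the timestamp pattern is an assumption based on the sample data shown here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): given the whole file as
# one string, return only the closed alarms, each as a time line plus the
# site name that follows "BtsSiteMgr=".
sub closed_alarms {
    my ($text) = @_;

    # One timestamp such as "Apr  5 2010  5:55PM"; \h is horizontal whitespace
    # (Perl 5.10+). The exact date layout is assumed from the sample data.
    my $ts = qr/[A-Z][a-z]{2}\h+\d{1,2}\h+\d{4}\h+\d{1,2}:\d{2}[AP]M/;

    my $out = '';
    # A record is kept only if its first line carries two timestamps, so lines
    # whose cease time is NULL never match. The lazy .*? then runs (across the
    # line wrap, thanks to /s) to the first "BtsSiteMgr=" after that line.
    while ($text =~ /^(\h*$ts\h+$ts)\h*\n.*?BtsSiteMgr=(\S+)/msg) {
        $out .= "$1\n $2\n";
    }
    return $out;
}

# Small demo using two records from the sample data.
my $sample = <<'END';
 Event_time                 Cease_time
 -------------------------- --------------------------
    Apr  5 2010  5:54PM                       NULL
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSJN1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=LUGALAMBO_900
    Apr  5 2010  5:55PM        Apr  5 2010  6:43PM
 SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSJN1,BssFunction=
 BSS_ManagedFunction,BtsSiteMgr=LUGALAMBO_900
END

print closed_alarms($sample);
```

For the demo input this prints the 5:55PM/6:43PM line followed by " LUGALAMBO_900". Because headers, dashes and NULL records simply never match the pattern, they need no separate clean-up step. The cost is that the whole file is held in memory, which should be acceptable for a few thousand lines.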

Below is a Perl script that I have written. It takes care of the headers, the dashes and the NULL lines themselves, but I have failed to delete the element lines that follow each NULL entry, so it does not yet produce the above output.

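#!/usr/bin/perl
use strict;
use warnings;

# Intended to back up the file before messing it up. Note that $^I only takes
# effect when files are processed through <> (ARGV), so it does nothing here.
$^I = ".bak";

open(my $datain,  '<', 'george_perl.txt') or die "can't open input file: $!";   # Read in the data
open(my $dataout, '>', 'gen_results.txt') or die "can't open output file: $!";  # Prepare for the writing

while (my $theline = <$datain>) {
    # The next 4 statements clean out the headers and the dashes
    $theline =~ s/Event_time//g;
    $theline =~ s/Cease_time//g;
    $theline =~ s/Object_of_reference//g;
    $theline =~ s/-//g;

    if ($theline =~ /NULL/) {
        next;                          # skips the NULL line itself
        next if $theline =~ /SubN/;    # never reached: the next above has already left this iteration
    }
    else {
        print {$dataout} $theline;
    }
}

close $datain;
close $dataout;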

Kindly point out the modifications I need to make to the script so that it produces the required output. I would be very glad for your help. Kind regards, George.
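One possible modification, sketched here under assumptions rather than offered as the definitive fix: since each pass through the loop sees only one line, the decision made on a NULL line has to be remembered in a flag so that the element lines that follow can be skipped too. The helper name filter_alarms and the date pattern are illustrative and assumed from the sample data. The sketch also works if the element is not wrapped over two lines, as long as BtsSiteMgr= and the site name sit on the same line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative): takes the raw lines and
# returns the lines to write out.
sub filter_alarms {
    my @lines = @_;
    my @kept;
    my $skip = 0;    # true while we are inside a record whose cease time is NULL

    for my $line (@lines) {
        chomp $line;
        next if $line =~ /Event_time|Cease_time|Object_of_reference/;   # header text
        next if $line =~ /^[\s-]*$/;                                     # dashes or blank line

        # A line that starts with a date opens a new record; what follows the
        # first timestamp is either NULL or the cease time.
        if ($line =~ /^\s*[A-Z][a-z]{2}\s+\d{1,2}\s+\d{4}\s+\d{1,2}:\d{2}[AP]M\s+(\S+)/) {
            $skip = ($1 eq 'NULL') ? 1 : 0;
            push @kept, $line unless $skip;
        }
        # Any other line belongs to the current record's element name.
        elsif (!$skip && $line =~ /BtsSiteMgr=(\S+)/) {
            push @kept, " $1";
        }
    }
    return @kept;
}

# Small demo using lines taken from the sample data.
my @sample = (
    " Event_time                 Cease_time",
    " -------------------------- --------------------------",
    "    Apr  5 2010  6:03PM                       NULL",
    " SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSJN1,BssFunction=",
    " BSS_ManagedFunction,BtsSiteMgr=KAPKWAI_900",
    "    Apr  5 2010  6:03PM        Apr  5 2010  6:04PM",
    " SubNetwork=ONRM_RootMo,SubNetwork=AXE,ManagedElement=BSJN1,BssFunction=",
    " BSS_ManagedFunction,BtsSiteMgr=KAPKWAI_900",
);
print "$_\n" for filter_alarms(@sample);
```

To use it on the real data, the lines read from george_perl.txt would be passed to filter_alarms and the returned lines printed to gen_results.txt, each followed by a newline. Unlike a substitution on every line, the flag approach never writes the emptied header lines to the output.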

© Stack Overflow or respective owner